I love it when people actually provide data rather than merely working from "gut feelings"! A few questions jump out at me as potentially affecting the reliability of the analysis and conclusions, though:<p>1. Since the mapping from lines to gender goes through the actor/actress involved, it seems that "trouser roles" (particularly in animated features) may skew the statistics. I don't know if the effect is large enough to matter, though.<p>2. The analysis seems to be conducted on the basis of "lines" rather than "words". Does this skew the results? I wouldn't be surprised if predominantly-male "action" scenes had fewer words per line (or, put another way, more lines per word) than other scenes.<p>3. The analysis of actor/actress ages aggregates screenplays over all years of publication. This makes it impossible to distinguish between a bias towards <i>young</i> actresses and a bias towards <i>actresses born after a particular date</i>. This is a very important distinction in terms of policy response, since there is little gap between the genders up to age 31: if the problem is "older actresses don't get many roles" then it needs a response, but if the problem is "actresses born before 1985 don't get many roles" then the problem will self-correct as the older generations are replaced by more egalitarian ones.
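<p>To make point 3 concrete, here is a rough sketch of the kind of split I have in mind: compute the female share of dialogue among older performers separately for each release decade, instead of pooling all years. Everything here is an assumption on my part: the record layout, the function name `female_share_by_decade`, and the numbers are invented for illustration and are not taken from the original dataset. It also weights by words rather than lines, which touches on point 2.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration only:
# (gender, birth_year, release_year, word_count)
ROLES = [
    ("F", 1950, 1995, 900),
    ("M", 1948, 1995, 2100),
    ("F", 1970, 2015, 1500),
    ("M", 1969, 2015, 1500),
    ("F", 1990, 2015, 1200),
    ("M", 1988, 2015, 1300),
]


def female_share_by_decade(roles, min_age):
    """Female share of words among performers aged >= min_age, per release decade.

    Assumes gender is recorded as "F" or "M", as a binary
    actor/actress mapping would produce.
    """
    totals = defaultdict(lambda: {"F": 0, "M": 0})
    for gender, birth_year, release_year, words in roles:
        age = release_year - birth_year  # approximate age at release
        if age >= min_age:
            decade = release_year // 10 * 10
            totals[decade][gender] += words
    return {
        decade: counts["F"] / (counts["F"] + counts["M"])
        for decade, counts in sorted(totals.items())
        if counts["F"] + counts["M"] > 0
    }


shares = female_share_by_decade(ROLES, min_age=40)
for decade, share in shares.items():
    print(decade, round(share, 2))
```

If the share for the 40+ group stays roughly flat from decade to decade, that points to an age effect; if it climbs as later-born performers reach that age, that points to a birth-cohort effect.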