> I think this paints a somewhat dark picture of gender roles within typical story plots. Women are more likely to be in the role of victims- “she screams”, “she cries”, or “she pleads.” Men tend to be the aggressor: “he kidnaps” or “he beats”. Not all male-oriented terms are negative- many, like “he saves”/“he rescues” are distinctly positive- but almost all are active rather than receptive.<p>Good mathematical analysis, but terrible semantic interpretation.<p>There are only 3 really negative verbs for men ("murdered", "kills", "kidnaps"), and 3 distinctly positive ("saves", "defeats", "rescues"). "Beat" is ambiguous; personally, I would more likely interpret it as "defeats" rather than "hit"/"punch". This is in no way a "dark" picture, and the only relevant conclusion is the one he briefly mentions in passing: women are described as more passive, men as more active.
People who have a poor understanding of human nature will consider this evidence of sexism.<p>Think for a minute about why "Fifty Shades of Grey" is popular among female readers.<p>If many stories are filled with males competing over resources and women (in the form of violent criminals kidnapping women and brave heroes rescuing them), then might it be that these kinds of stories are what people are looking for?<p>Try to imagine a story about a weak man who cannot defend himself against a gang of three women who kidnap him, only to get saved by his brave girlfriend, who upon rescuing him promises to stay by his side forever.<p>Just try it. Does it sound like an interesting story?
The reference at the end talking about comparing changes over time reminded me of a problem I've been kicking around. For this particular system that's not too hard, since you're putting each word on a one-dimensional femininity/masculinity scale, so you could plot a word on a line graph or something. But what do you do if you want to evaluate the changes of more complex relationships over time? Not just mapping words to a constant vector space, but modeling the relationships between words, such as clusters or word2vec representations. With something like word2vec you can take a bunch of words and project the vector space onto a plane so you can see the relative distances, but how do you express changes over time? You could show a bunch of planar projections for different instants in time, but it's hard to look at that and capture the changes.<p>So how do you visualize changes to these more complex interactions between data points, and also how do you mathematically quantify some of these changes? I'd really appreciate any advice on this. And sorry that this is kind of off topic for the article :)
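One possible way to put a number on that kind of drift, sketched under assumptions: suppose you have two embedding matrices for the same vocabulary, trained separately on an earlier and a later corpus (rows are words, in the same order). Separately trained spaces are only comparable up to a rotation, so the sketch first aligns them with orthogonal Procrustes and then reports a per-word cosine distance. The function names `align` and `cosine_shift` are made up for illustration, and this is just one option, not something the article does.

```python
import numpy as np

def align(source, target):
    """Rotate `source` onto `target` (orthogonal Procrustes).

    Finds the orthogonal matrix R minimizing ||source @ R - target||_F,
    using the SVD of source.T @ target.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)

def cosine_shift(earlier, later):
    """Per-word cosine distance between aligned earlier and later vectors.

    Both inputs are (n_words, n_dims) arrays with rows in the same word order.
    Values near 0 mean the word kept its position relative to the rest of
    the vocabulary; larger values mean it moved.
    """
    aligned = align(earlier, later)
    num = np.sum(aligned * later, axis=1)
    den = np.linalg.norm(aligned, axis=1) * np.linalg.norm(later, axis=1)
    return 1.0 - num / den
```

Sorting words by `cosine_shift` gives a ranked list of what moved most between the two periods. For the visual side, projecting the aligned matrices with a single shared projection (rather than fitting one per time slice) should at least keep the axes comparable from frame to frame.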
Gender is strongly correlated with biological sex. Biological sex is, in turn, strongly correlated with physical strength, aggressiveness and other traits.<p>I enjoyed the technical description, but the results didn't exactly shock me!
> what verbs are used after “he” and “she”, and therefore what roles male and female characters tend to have within stories.<p>It might've been better had the author (and the Jane Austen article's author) used some NLP parsing to see whether the pronoun was actually the subject of those verbs. But I'll grant that it's usually the case.<p>Also interesting: gender and the object of those verbs.<p>EDIT: after some research (that I should've done before posting), it's a remarkably effective technique and it seems only the most contorted sentences might get tripped up. English is nearly always Subject-Verb-Object.
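For anyone who wants to poke at the "word right after the pronoun" idea, here is a rough sketch, assuming plain-text input, a naive regex tokenizer, and add-one smoothing. The helper names `pronoun_verb_counts` and `log_ratio` are hypothetical, and this is not necessarily the article's exact pipeline. It does no parsing at all, so it relies entirely on the Subject-Verb-Object tendency mentioned above.

```python
import math
import re
from collections import Counter

def pronoun_verb_counts(text):
    """Count the word that immediately follows 'he' or 'she'.

    No part-of-speech check is done: whatever token comes next is counted,
    so adverbs and auxiliaries ("he never", "she was") show up too.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = {"he": Counter(), "she": Counter()}
    for first, second in zip(words, words[1:]):
        if first in counts:
            counts[first][second] += 1
    return counts

def log_ratio(counts, word):
    """log2 of the smoothed 'she' rate over the smoothed 'he' rate.

    Positive means the word follows 'she' relatively more often,
    negative means it follows 'he' relatively more often.
    """
    he_total = sum(counts["he"].values())
    she_total = sum(counts["she"].values())
    she_rate = (counts["she"][word] + 1) / (she_total + 1)
    he_rate = (counts["he"][word] + 1) / (he_total + 1)
    return math.log2(she_rate / he_rate)
```

Swapping the tokenizer for a dependency parser and keeping only pairs where the pronoun is the grammatical subject of the verb would address the concern about contorted sentences, at the cost of a much slower pass over the corpus.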
It's interesting how thoroughly ingrained sexist concepts are in the language. Even a fairly active verb like 'resist' assumes a power relation in which the subject is in the weaker position.<p>I'd like to see this done by country or year or language or genre.
The author failed to mention the context of the data chosen, specifically the dates. For example, the choice of "rob" vs. "steal" would change depending on the century in which the work was created.
Direct link to a scatter plot (x axis: quantity, y axis: gender):<p><a href="http://varianceexplained.org/figs/2017-04-27-tidytext-gender-plots/total_log_ratio_scatter-1.png" rel="nofollow">http://varianceexplained.org/figs/2017-04-27-tidytext-gender...</a>
We did a similar analysis of gender stereotypes in the Wattpad online writing community last year: <a href="https://www.aaai.org/ocs/index.php/ICWSM/ICWSM16/paper/view/13112" rel="nofollow">https://www.aaai.org/ocs/index.php/ICWSM/ICWSM16/paper/view/...</a><p>The conformity with existing stereotypes was pretty depressing, and they're perpetuated more or less equally by male and female authors.
><i>I think this paints a somewhat dark picture of gender roles within typical story plots. Women are more likely to be in the role of victims- “she screams”, “she cries”, or “she pleads.” Men tend to be the aggressor: “he kidnaps” or “he beats”.</i><p>How's that different from actual real life beatings, murders and kidnappings?<p>Aren't men usually the aggressors?
I'd wager that much of this could be explained by the "masculine" verbs being used in relation to a protagonist more often, whereas the "feminine" verbs describe reactions to the actions of a protagonist.
Related post from a week or two ago:<p><a href="https://news.ycombinator.com/item?id=14156079" rel="nofollow">https://news.ycombinator.com/item?id=14156079</a>
If you're finding some dissonance between 100K stories and your own interpretation of reality, you've really got to ask yourself where the problem might lie.