To speak of evolution when your time frame is 2008-2012 is somewhat far-fetched. But I believe I see a reassuring trend here:
<a href="http://porngram.sexualitics.org/?q=erlang%2C+clojure%2C+Ada%2C+scala%2C+rust" rel="nofollow">http://porngram.sexualitics.org/?q=erlang%2C+clojure%2C+Ada%...</a>
1. It doesn't count word frequencies but substring frequencies. Moreover, if a substring appears more than once in a title, it is counted more than once. I drew this conclusion from submitting "a,b,c" and from their paper [1]:<p><pre><code> our algorithm strips out dashes and catches any
occurrence of the query in the title, for example,
'blow' catches 'blowing', 'blowjobs'
</code></pre>
This explains the results of these queries: "ada,erlang", "tea,beer". As an alternative they could have used a stemmer [2].<p>2. The "slow,fast" and "love,hardcore" queries show an interesting trend, perhaps a shift towards women or mainstream viewers.<p>[1] <a href="http://sexualitics.org/wp-content/uploads/2014/01/PORNSTUDIES_preprint.pdf" rel="nofollow">http://sexualitics.org/wp-content/uploads/2014/01/PORNSTUDIE...</a><p>[2] <a href="http://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html" rel="nofollow">http://nlp.stanford.edu/IR-book/html/htmledition/stemming-an...</a>
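To make point 1 concrete, here is a minimal sketch of the difference between substring counting and whole-word counting. It only assumes what the quoted passage says (dashes stripped, any occurrence caught); the function names and the example titles are made up for illustration and are not taken from their code or dataset.

```python
import re

def substring_count(term, title):
    # Assumed behaviour, inferred from the quoted passage: strip dashes,
    # ignore case, then count every non-overlapping occurrence of the term.
    cleaned = title.lower().replace("-", "")
    return cleaned.count(term.lower())

def word_count(term, title):
    # Whole-word alternative: split the title into alphabetic tokens
    # and count only exact matches.
    tokens = re.findall(r"[a-z]+", title.lower())
    return tokens.count(term.lower())

# Hypothetical titles, chosen only to show the effect.
for term, title in [("ada", "Adam visits Canada"), ("tea", "Steamy team meeting")]:
    print(term, substring_count(term, title), word_count(term, title))
```

With substring matching, "ada" is found inside both "Adam" and "Canada" even though neither title mentions Ada, which is the kind of inflation the "ada,erlang" query suffers from. A stemmer, as in [2], would sit between the two approaches: it would still group "blowing" with "blow" without matching arbitrary substrings.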
In my first 2 weeks of working at an adult company (as a dev, yes, it's sad), one of my tasks was to watch/scan 200+ videos and describe them. It's true, you run out of inspiration fast.
Also, a hint: the "love" in the titles is probably explained by "love(s) to <insert profanity>".
I don't think I ever used hardcore in a title.
Traditional professions are still on top:<p><a href="http://porngram.sexualitics.org/?q=pizza%2Cdelivery%2Cplumber%2C+programmer" rel="nofollow">http://porngram.sexualitics.org/?q=pizza%2Cdelivery%2Cplumbe...</a><p>I sense a business opportunity there.
Quite fun!<p>Next: provide the porn industry with a simple Markov chain script to generate probabilistic porn movie titles, and save them all those incredibly tiresome brainstorm sessions they must have to create new titles :)
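For what it's worth, such a script could be as small as the sketch below: a first-order, word-level Markov chain. Everything in it is an assumption for illustration (the `build_chain` and `generate_title` names, the start/end markers, and the tame sample titles borrowed from the professions query); a real run would train on the published title dataset.

```python
import random
from collections import defaultdict

# Hypothetical sentinel tokens marking the start and end of a title.
START, END = "__START__", "__END__"

def build_chain(titles):
    # Map each word to the list of words seen right after it.
    # Duplicates are kept, so frequent transitions are picked more often.
    chain = defaultdict(list)
    for title in titles:
        words = [START] + title.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate_title(chain, rng, max_words=12):
    # Walk the chain from START until END is drawn or max_words is reached.
    word, out = START, []
    while len(out) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        out.append(word)
    return " ".join(out)

# Made-up training titles; replace with real data.
sample_titles = [
    "pizza delivery gone wrong",
    "plumber fixes more than the sink",
    "programmer works late again",
]

chain = build_chain(sample_titles)
print(generate_title(chain, random.Random(0)))
```

With only three training titles the output will mostly reproduce its input; the recombination only gets interesting once many titles share words. The `max_words` cap is there because shared words can create loops in the chain.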
Very interesting to see the dataset being made available. Whenever I want to do this kind of analysis, I always stumble at 'how to get the data?'. In their paper, it is mentioned that "We created a dedicated computer program to carry out the navigation and data collection tasks required to gather the metadata for all available videos...". I would love to see this program. More broadly, can anyone point me to the best resources (preferably Python) for learning to crawl/scrape this type of information?
So I did gay vs lesbian and I was confused why there was a big spike in 2010 for gay that has since dropped off. Is this an anomaly in their sampling?<p>Also Obama's numbers have really dropped compared to Bush:
<a href="http://porngram.sexualitics.org/?q=bush%2Cobama" rel="nofollow">http://porngram.sexualitics.org/?q=bush%2Cobama</a>
I suppose we may as well do the obvious ones<p><a href="http://porngram.sexualitics.org/?q=BDSM%2Ctorture%2Cpain%2Crape" rel="nofollow">http://porngram.sexualitics.org/?q=BDSM%2Ctorture%2Cpain%2Cr...</a><p>:/