One of the major problems with sentiment analysis is that it doesn't pick up on irony or subtext very well. Relying on word choice alone, one such analysis pinned "Fitter Happier" as one of the happiest Radiohead songs.<p><a href="http://rcharlie.com/2017-02-16-fitteR-happieR/" rel="nofollow">http://rcharlie.com/2017-02-16-fitteR-happieR/</a><p>Edit: sorry, I misremembered - it actually handled "True Love Waits" pretty well. It did pick up "Fitter Happier" as one of the happiest, that's what struck me as strange.
Interesting project, horrible presentation.<p>The colour gradient on the graph is confusing - does green mean happy, blue unhappy, and the movie changes tone over time? The axes of the graph are not labeled, so what are we looking at? But those concerns are secondary.<p>A bar graph seems a poor choice here, given the nature of the data. Given that there doesn't seem to be any correlation with time, the order of phrases doesn't seem important -- you could forgo the linear presentation and display the distribution of the data instead.<p>For example, you could bucket the sentiment values (-3 to -2, -2 to -1, etc) and use a histogram to show the counts in each bucket. This would enable you to compare different movies (one histogram per movie).
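To make the bucketing suggestion concrete, here is a minimal sketch in Python. The function names, the -3 to 3 range, and the sample scores are all made up for illustration; I don't know what score range the site actually produces.

```python
import math
from collections import Counter

def bucket_sentiment(scores, low=-3, high=3):
    """Count scores per unit-wide bucket [k, k+1); outliers are clamped to the edge buckets."""
    counts = Counter({k: 0 for k in range(low, high)})
    for s in scores:
        k = math.floor(s)
        k = max(low, min(high - 1, k))  # clamp so a score of exactly `high` lands in the top bucket
        counts[k] += 1
    return dict(sorted(counts.items()))

def render(counts):
    """Draw a crude text histogram, one row per bucket."""
    return "\n".join(f"{k:+d} to {k + 1:+d} | " + "#" * n for k, n in counts.items())

# Invented per-phrase scores for one movie.
scores = [-2.5, -0.4, 0.1, 0.7, 1.2, 2.9, 3.0]
hist = bucket_sentiment(scores)
print(render(hist))
```

Running the same function over each movie's scores would give one histogram per movie, which you could then put side by side.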
I'm always a little wary of the robustness of sentiment analysis; in my experience, if you take the time to check sentiment analysis results sentence by sentence, you will find a high error rate.<p>I haven't confirmed this by looking at the source, but my suspicion is that most sentiment analysis implementations are either rule-based or not well tuned.<p>My go-to example is IBM Watson's sentiment analysis service rating "I hope you die" as very positive because the sentence is categorized as "hopeful". Which I suppose it is, technically. Perhaps I'm expecting too much here, since recognizing this sentence as negative requires a good deal of human abstract reasoning and inference. But the example still matters, because real-world language that isn't dry and technical is rife with this sort of usage.
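As a rough illustration of why word-level scoring struggles here, this is a toy Python sketch. It is not how Watson or any real service works (I haven't seen their source), and the lexicon and its weights are invented. It only shows that summing per-word scores cannot tell apart two sentences with opposite intent.

```python
import re

# Hypothetical mini-lexicon with invented weights; real lexicons have thousands of entries.
LEXICON = {"hope": 2, "love": 3, "happy": 3, "die": -3, "hate": -3}

def word_score(sentence):
    """Sum the lexicon weight of every word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(LEXICON.get(w, 0) for w in words)

threat = word_score("I hope you die")
wish = word_score("I hope you never die")
print(threat, wish)
```

Both sentences get the same score, because "never" is not in the lexicon and word order is ignored. Catching the difference needs at least negation handling, and the original example needs more than that.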
OP here! I was watching the Stanford NLP classes a while back (<a href="https://www.youtube.com/playlist?list=PLiNErZ5Bus8qNxNsFZFkh-9_CzZRW9iH9" rel="nofollow">https://www.youtube.com/playlist?list=PLiNErZ5Bus8qNxNsFZFkh...</a>) and ended up trying the part about sentiment analysis on srt files.
The way it works right now is quite primitive, but you can still see some "trends" on most movies. If there are any suggestions on how I could make this smarter, I would love to hear them!
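For anyone curious what a primitive version of this might look like, here is a minimal Python sketch under stated assumptions. It is not the site's actual code: the lexicon, the function names, the 60-second window, and the sample subtitles are all invented. It parses SRT cues, scores each one by word lookup, and sums the scores per time window so a trend is easier to see than with per-line values.

```python
import re

# Hypothetical mini-lexicon with invented weights.
LEXICON = {"love": 3, "great": 2, "happy": 3, "hate": -3, "kill": -3, "sad": -2}

# Matches the start time of an SRT timing line such as "00:01:10,000 --> 00:01:12,500".
TIMING = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)\s*-->")

def parse_srt(text):
    """Yield (start_seconds, dialogue) for each subtitle cue in SRT text."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = TIMING.match(line.strip())
            if m:
                h, mins, secs, ms = map(int, m.groups())
                yield h * 3600 + mins * 60 + secs + ms / 1000, " ".join(lines[i + 1:])
                break

def score(text):
    """Sum lexicon weights over the words in one cue."""
    return sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z']+", text.lower()))

def trend(srt_text, window=60):
    """Sum cue scores per time window (in seconds) to smooth out per-line noise."""
    totals = {}
    for start, dialogue in parse_srt(srt_text):
        key = int(start // window)
        totals[key] = totals.get(key, 0) + score(dialogue)
    return dict(sorted(totals.items()))

SAMPLE = """1
00:00:05,000 --> 00:00:07,000
I love this place.

2
00:00:40,500 --> 00:00:42,000
What a great day!

3
00:01:10,000 --> 00:01:12,500
I hate you.
"""

result = trend(SAMPLE)
print(result)
```

The keys of `result` are window indexes (minute 0, minute 1, ...). A larger window smooths more aggressively at the cost of hiding short scenes.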
You should let people see what other people submitted, although it's already pretty easy to do by fiddling with the URL. I like seeing the word patterns, but I really wonder how well it can detect true sentiment, which is not an easy thing to do. <a href="https://www.crealdo.com/story/movie-sentiment/movies/13" rel="nofollow">https://www.crealdo.com/story/movie-sentiment/movies/13</a>
Also interesting: Character-to-Character Sentiment Analysis in Shakespeare’s Plays (PDF)<p><a href="http://www.aclweb.org/anthology/P13-2085" rel="nofollow">http://www.aclweb.org/anthology/P13-2085</a>