This is great...I'm teaching a couple of classes next year on data analysis, and it's incredibly helpful to have real-world data that is <i>fun</i>. Things like Census/NOAA data are great, but too abstract (initially) for the average novice to ask really interesting questions of.<p>But everyone knows what it's like to eat at a crappy/great restaurant, where such places might be located, or how people's reviews might cluster...and so everyone comes in with testable assumptions and hypotheses that are fun to explore.<p>This item brought to mind yesterday's front-page post on "Seven habits of highly fraudulent users" (<a href="https://news.ycombinator.com/item?id=8116047" rel="nofollow">https://news.ycombinator.com/item?id=8116047</a>)...does Yelp do any <i>extra</i> processing of this academic set (beyond whatever regular cleaning they do of spam accounts)? It'd be interesting to test hypotheses on signals of spammy/fake accounts (OTOH, I imagine Yelp would probably prefer that such trends not be so apparent in bulk data).