The hardest part of my practical coursework in statistics was picking good, free data sets for final projects. Pick something awesome, and your final presentation will be awesome; pick something lame, and your final project gets you an A in Dejected Foot-Shuffling 101. If anything, the best data sets <i>weren't</i> from the sexy, unexplored fields. Remember how everyone tells you to pick classes by the professor, not the subject? It's a similar counter-intuitive thing for data sets. Find a data set that's rich and complete, and even if it's not a topic you're interested in now, you'll secretly love it by the end of the term.<p>Enough sermonizing. Here's a list of data set ideas that served me well in my youth:<p>1. R comes with a lot of built-in data sets. Open up R and run the command "data()" to see the list. Many R packages come with additional data sets (I like the diamonds one from ggplot2). All these built-in data sets are sort of small and not really project-worthy, but they're nice if you're just playing around with new techniques.<p>2. Government agencies release large, interesting data sets. Weather, census reports, travel statistics, public health data... The only problem is that they're usually a pain to query. Think outside your own country. And get ready for spatial stuff.<p>3. Academic institutions release pretty neat data, too. Natural science stuff, geology stuff... Again, here comes spatial data analysis.<p>4. Data journalists sometimes publish their data along with the story, and usually, they haven't found <i>nearly</i> all the cool stuff in there yet. This, for instance, looks insanely fun: <a href="http://project.wnyc.org/dogs-of-nyc/" rel="nofollow">http://project.wnyc.org/dogs-of-nyc/</a><p>5. Sports data is free like tap water, terrifyingly detailed, and deeply cool indeed.<p>6. Natural language processing. Check out Project Gutenberg! I like these analysis projects...
<a href="http://lotrproject.com/statistics/books/" rel="nofollow">http://lotrproject.com/statistics/books/</a>, <a href="http://bost.ocks.org/mike/miserables/" rel="nofollow">http://bost.ocks.org/mike/miserables/</a><p>7. Make your own data! Do you have a pedometer? Records of your house's temperature? Some bloggers in the "Quantified Self" movement seem awfully cavalier about their own privacy, but they have undeniably boffo data.<p>8. And finally: commercial real estate?! There has got to be <i>so</i> much interesting data to work with there. I know you don't think you can predict the markets, but at the very least you could make pretty maps and pictures. Maybe your company will let you play with some data, provided you show them your insights? Don't know if they'd let you blog it all over town, though...<p>Congrats, my friend, you are one of us now. The people who drool over CSV files.