I completed the ml-class offered by Prof. Andrew Ng last fall.<p>I started trying one of the problems on Kaggle.<p>When I fit a logistic regression, the algorithm gave me poor results.<p>I saw that people do a lot of data analysis before applying any machine learning algorithm.<p>I have finished a book on statistics and want to learn about data analysis.<p>Please help me identify resources/courses/tools that I can use to learn data analysis.<p>Thank you
You're fluent with regressions and stats. Sounds like you're competent with the mechanics. Where you might have gaps, there are some great tools that can do your heavy lifting. But that's only half the battle.<p>When you're trying to do meaningful data analysis, you really have to understand your dataset. Fancy math can't substitute for domain expertise. Think long and hard about what's in the set, what the causal connections might be, and how an "expert" in the field might approach the problem.<p>The guys over at OKCupid are awesome at this. Check out this post to see what I'm talking about.<p><a href="http://blog.okcupid.com/index.php/dont-be-ugly-by-accident/" rel="nofollow">http://blog.okcupid.com/index.php/dont-be-ugly-by-accident/</a><p>Their advice on taking a good picture is just about exactly what my professional photographer mother recommends. Good stuff. But the way they analyzed and presented the data shows (a) exactly how powerful putting numbers on something subjective can be and (b) that they know their domain.<p>If you read through the other posts (do that), you'll see that they have a solid understanding of their dataset. They know what to look for, namely photo attractiveness. They know how to get good data on that (the dependent variable), and they know which independent variables probably matter the most.<p>Throwing math at a complex dataset can be useful (e.g. Bayesian spam classifiers), but if you really want to do something that will work well or "speak" to a client, invest a bit of time in understanding the field.
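For what it's worth, here is a minimal sketch of the kind of first look at a dataset that can come before fitting a logistic regression: class balance, missing values, ranges, and how each column moves with the label. The column names (`age`, `income`, `clicked`), the toy rows, and the `explore` helper are all made up for illustration; they are not from any Kaggle problem or from the OKCupid posts.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (nan if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def explore(rows, target):
    """Summarize rows (list of dicts, numeric values or None) against a 0/1 target column."""
    labels = [r[target] for r in rows]
    report = {
        "n_rows": len(rows),
        "positive_rate": sum(labels) / len(labels),  # class balance
        "columns": {},
    }
    for col in rows[0]:
        if col == target:
            continue
        # Keep only rows where this column is present.
        pairs = [(r[col], r[target]) for r in rows if r[col] is not None]
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        report["columns"][col] = {
            "missing": len(rows) - len(pairs),
            "min": min(xs),
            "max": max(xs),
            "corr_with_target": pearson(xs, ys),
        }
    return report


# Hypothetical toy data, purely illustrative.
rows = [
    {"age": 20, "income": 30.0, "clicked": 0},
    {"age": 30, "income": None, "clicked": 0},
    {"age": 40, "income": 50.0, "clicked": 1},
    {"age": 50, "income": 70.0, "clicked": 1},
]

report = explore(rows, "clicked")
for name, stats in report["columns"].items():
    print(name, stats)
```

None of this replaces knowing the domain. A summary like this mostly tells you where to start asking questions, such as why a column has missing values or whether a strong correlation makes causal sense.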