Ask HN: How to learn Data Analysis?

11 points by hhimanshu about 13 years ago
I completed the ml-class offered by Prof Andrew Ng last fall.

I started trying one of the problems on Kaggle. When I fit a logistic regression, the algorithm gave me poor results.

I saw that people do a lot of data analysis before applying any machine learning algorithm.

I completed a book on statistics and want to learn about data analysis.

Please help me identify resources, courses, and tools that I can use to learn data analysis.

Thank you

2 comments

johnhess about 13 years ago
You're fluent with regressions and stats. Sounds like you're competent with the mechanics. Where you might have gaps, there are some great tools that can do your heavy lifting. But that's only half of the battle.

When you're trying to do meaningful data analysis, you really have to understand your dataset. Fancy math can't substitute for domain expertise. Think long and hard about what's in the set, what the causal connections might be, and how an "expert" in the field might approach the problem.

The guys over at OKCupid are awesome at this. Check out this post to see what I'm talking about.

http://blog.okcupid.com/index.php/dont-be-ugly-by-accident/

Their advice on taking a good picture is just about exactly what my professional photographer mother recommends. Good stuff. But the way they analyzed and presented the data shows (a) exactly how powerful putting numbers on something subjective can be and (b) that they know their domain.

If you read through the other posts (do that), you'll see that they have a solid understanding of their dataset. They know what to look for, namely photo attractiveness. They know how to get good data on that (the dependent variable) and they know which independent variables probably matter the most.

Throwing math at a complex dataset can be useful (e.g. Bayes spam classifiers), but if you really want to do something that will work well or "speak" to a client, invest a bit of time in understanding the field.
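As a rough illustration of looking at the data before reaching for a model, here is a minimal sketch that compares per-class feature means. The rows, the feature layout, and the `per_class_means` helper are all hypothetical and made up for this example; a real dataset would need far more careful inspection than this.

```python
from statistics import mean

# Hypothetical toy rows: (feature_a, feature_b, label).
# These values are invented for illustration, not taken from any real dataset.
rows = [
    (1.0, 10.0, 0),
    (2.0, 12.0, 0),
    (3.0, 11.0, 0),
    (7.0, 10.5, 1),
    (8.0, 11.5, 1),
    (9.0, 12.5, 1),
]


def per_class_means(rows):
    """Return {label: [mean of each feature]} so the classes can be compared."""
    by_label = {}
    for *features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*feature_rows)]
        for label, feature_rows in by_label.items()
    }


summary = per_class_means(rows)
for label, means in sorted(summary.items()):
    print(label, means)
```

In this toy data, the first feature separates the two classes clearly while the second barely moves, which is the kind of thing worth knowing before fitting a logistic regression.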
skadamat about 13 years ago
http://www.amazon.com/Data-Analysis-Open-Source-Tools/dp/0596802358/ref=sr_1_sc_3?ie=UTF8&qid=1330285039&sr=8-3-spell