Are ML and Statistics Complementary? [pdf]

68 points by snippyhollow over 9 years ago

7 comments

ktamura over 9 years ago
They definitely are, as far as their roles at (most) startups are concerned.

Unless your startup's core strategy involves machine learning, statistics tends to come in handier than machine learning in the early days. Most likely, what moves your company is not a data product built atop machine learning models but the ability to draw less-wrong conclusions from your data, which is the very definition of statistics. Also, in the early days of a startup, you run into small/missing data problems: you have very few customers and very incomplete datasets with a lot of gotchas. Interpreting such bad data is no small feat, but it's definitely different from training your Random Forest model against millions of observations.
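To make the small-data point above concrete, here is a minimal sketch of one classical tool for that situation: a Wilson score interval for a proportion estimated from very few observations. The function name `wilson_interval` and the numbers (3 conversions out of 20 customers) are hypothetical and chosen only for illustration; they do not come from the comment.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence. Unlike the naive
    normal approximation, this interval stays inside [0, 1] and
    behaves sensibly when n is small.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical early-stage data: 3 of 20 trial customers converted.
low, high = wilson_interval(3, 20)
print(f"point estimate: {3 / 20:.2f}, interval: ({low:.3f}, {high:.3f})")
```

The width of the interval, rather than the point estimate, is what the comment is getting at: with 20 customers the data supports a wide range of conclusions.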
tristanz over 9 years ago
LeCun has a comment on this paper here: https://www.facebook.com/yann.lecun/posts/10153293764562143
washedup over 9 years ago
Here is a link to the paper referenced in the beginning: http://courses.csail.mit.edu/18.337/2015/docs/50YearsDataScience.pdf

Great read for anyone interested in the debate.
nextos over 9 years ago
I think they will eventually converge.

Probabilistic programming is already a hint of this. The most general class of probability distributions is that of non-deterministic programs. ML is just a quick and dirty way to write these programs.
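As a rough illustration of the idea that a non-deterministic program defines a probability distribution, the sketch below writes a tiny randomized program in plain Python and recovers its output distribution by running it many times. The names `coin_program` and `estimate_distribution` and the specific coin biases are assumptions made up for this example; real probabilistic programming systems also support conditioning on observations, which this sketch does not attempt.

```python
import random
from collections import Counter

def coin_program(rng):
    """A tiny non-deterministic program; its return value is a random variable."""
    # Choose a coin: fair with probability 0.5, otherwise biased toward heads.
    bias = 0.5 if rng.random() < 0.5 else 0.9
    # Flip the chosen coin three times and return the number of heads.
    return sum(rng.random() < bias for _ in range(3))

def estimate_distribution(program, n_samples, seed=0):
    """Approximate the distribution a program defines by forward sampling."""
    rng = random.Random(seed)
    counts = Counter(program(rng) for _ in range(n_samples))
    return {value: counts[value] / n_samples for value in sorted(counts)}

dist = estimate_distribution(coin_program, 100_000)
for heads, prob in dist.items():
    print(f"P(heads = {heads}) ~ {prob:.3f}")
```

The program is the model here: changing the code changes the distribution, with no separate formula to write down.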
p4wnc6 over 9 years ago
What is commonly understood as 'statistics' is just a specialized subset of machine learning. Machine learning generalizes statistics.

The correct complement to machine learning is cryptography -- trying to intentionally build things that are provably intractable to reverse engineer.
sjg007 over 9 years ago
This is a great summary of the field.
_0w8t over 9 years ago
I think expecting an explanation for the results of modern machine learning is wishful thinking. I personally cannot explain my gut feelings. So why should we expect an explanation when a machine deals with the same class of problems?

Besides, it is easy to get a wrong explanation and, as Vladimir Vapnik observed in his 3 metaphors for complex world, http://www.lancaster.ac.uk/users/esqn/windsor04/handouts/vapnik.pdf , "actions based on your understanding of God’s thoughts can bring you to catastrophe".