TechEcho

Nate Silver: What I need from statisticians

37 points by carlosgg over 11 years ago

9 comments

hharrison over 11 years ago
To respond to a bunch of other posters here:

There's a fundamental difference between data scientist and statistician, I think. I see statistics as an academic discipline and data science as an applied discipline.

More concretely, the statistics approach is: formulate question --> formulate hypothesis --> collect data in a controlled environment under a specific set of assumptions (i.e., perform an experiment) --> determine probability of the data given the hypothesis (and assumptions).

While the data science approach is: hey look, we already have all this data --> generate predictions --> collect more data --> refine predictions.

Of course, that's an over-generalization. But I think the different emphasis on hypothesis testing vs. machine learning/data mining is fundamental.
chimeracoder over 11 years ago
> "I think data scientist is a sexed-up term for a statistician."

As a statistician-and-engineer who is currently on the job market (my graduate program finishes this spring), I feel this pain.

I've been referred to as a "data scientist" multiple times (that's even been my official title at work before), though I do still cringe sometimes when I hear the word, for this exact reason.

That said, I don't usually present myself as a statistician, even though my degree is a statistics degree. Most people who hold statistics degrees are fairly lousy engineers[0], and I don't know of any other term that (concisely) expresses that I'm equally competent as a statistician and a (backend) engineer[1].

Of course, this is because many of these programs haven't yet caught up to the fact that computers exist and are still teaching statistics as if we're in a pre-computation era. The perfect solution is to fix this, and thereby fix the connotation of the word "statistician".

It's the same reason I dislike the term "growth hacker" - really, that's just the way marketing *should* be done (ie, based on numbers and verifiable statistics). In a perfect world, all (competent) marketers would be "growth hackers". But many marketers aren't, and so we have to make up another cringe-worthy term for it.

Unfortunately, that's a problem that's beyond my means to solve. So I bite my tongue and add the word "data scientist" to my resume anyway.

[0] Usually self-proclaimed, too
[1] ie, "I could work as a backend engineer if I wanted to/needed to, but I'm looking for work involving both skillsets"
mturmon over 11 years ago
"I think data scientist is a sexed-up term for a statistician."

This statement, given by Silver to the annual Joint Statistical Meetings (the main cross-organization stats conference), was guaranteed to be a crowd-pleaser for that audience.

Unfortunately for them, it's not really true.

The problem is that much of conventional academic statistics consists of proving theorems about model classes. This requires a lot of sophisticated analysis, but has turned rather vacuous. And much conventional applied statistics consists of computing diagnostics based on dubious modeling assumptions. Under pressure in the last 20 or so years from computer science, machine learning, computer vision, Moore's law, and the data avalanche, the discipline has changed, but not fast enough.

As a result, a lot of what *should* be taught and researched in statistics departments has been co-opted by these other disciplines. And many people with a real problem would rather work with a "machine learning" person than a "statistics" person.

The best summary of this state of affairs is Leo Breiman's essay (http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?handle=euclid.ss/1009213726&view=body&content-type=pdf_1). The abstract of this essay is brutal:

"There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large, complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools."

Breiman was mathematically sophisticated, so it's not that he wasn't able to follow the theory he critiques, it's that he wasn't snowed by detail and could see its lack of relevance to real problems.
chockablock over 11 years ago
If you're a trained scientist, 'Data Science' sounds distinctly odd. What other kind of science is there? A friend likened it to going to a restaurant to do some 'Food Eating'.
nfoz over 11 years ago
Data-scientist is a sexed-up term for statistician without any expectation of mathematical expertise.
srean over 11 years ago
The way I have made personal peace with this is that I consider myself a better programmer than the median statistician and better in statistics and machine learning than a median programmer. Whether this is a useful spot to be in I have to find out. I can see that depending on the times this can either be an asset or a liability.
casca over 11 years ago
We need a new term for statistician and data scientist is as good as any. For many years, the terms "statistics" and "statistician" have had negative undertones within the general public and renaming is a great way to overcome that.
rohunati over 11 years ago
It could be argued that we shouldn't abandon the term "statistics." Data science is to statistics what mathematics is to physics, but we don't (nor do physicists) call it "number science."
mrcactu5 over 11 years ago
> I think data scientist is a sexed up term for a statistician.

fuck-yeah