"I think data scientist is a sexed-up term for a statistician."<p>This statement, given by Silver to the annual meeting of the Joint Statistics Meetings (the main cross-organization stats conference), was guaranteed to be a crowd-pleaser for that audience.<p>Unfortunately for them, it's not really true.<p>The problem is that much of conventional academic statistics consists of proving theorems about model classes. This requires a lot of sophisticated analysis, but has turned rather vacuous. And much conventional applied statistics consists of computing diagnostics based on dubious modeling assumptions. Under pressure in the last 20 or so years from computer science, machine learning, computer vision, Moore's law, and the data avalanche, the discipline has changed, but not fast enough.<p>As a result, a lot of what <i>should</i> be taught and researched in statistics departments has been co-opted by these other disciplines. And many people with a real problem would rather work with a "machine learning" person than a "statistics" person.<p>The best summary of this state of affairs is Leo Breiman's essay (<a href="http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?handle=euclid.ss/1009213726&view=body&content-type=pdf_1" rel="nofollow">http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?ha...</a>). The abstract of this essay is brutal:<p>"There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. 
It can be used both on large, complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools."<p>Breiman was mathematically sophisticated, so it's not that he couldn't follow the theory he critiques. It's that he wasn't snowed by the detail and could see its lack of relevance to real problems.