科技回声

Data Scientist (noun): A statistician who lives in San Francisco.

(only half joking)

"Data scientist" is a mess of a job title. It seems to be as much a reaction against the commoditization of software engineering (which leaves the smartest 10% of programmers, who by correlation are usually the most mathematically literate, ill-suited for the average software job) as it is a real distinction.

There are plenty of "data scientists" who use canned tools and play around with parameters because that's all "the business" thinks it needs.

You want to trim complexity for a reason that any data scientist worth his salt should already know (and there are plenty of celebrity engineers in SF making $500k who aren't worth their salt and don't know this): the bias-variance tradeoff (see also: underfitting and overfitting). If your model is too flexible or complex, it will begin absorbing noise. That leads to a model that performs extremely well on training data but fails miserably on unseen data. There are well-studied techniques for preventing this, but I'd guess that fewer than 20% of self-described or titled "data scientists" are familiar with them.
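To make the overfitting point concrete, here is a minimal sketch, not anything from the linked article. Everything in it is an assumption chosen for illustration: the synthetic sine-plus-noise data, the sample sizes, the polynomial degrees, and the helper name `fit_and_score`. It fits polynomials of increasing degree to the same small training set and reports mean squared error on both the training data and a separate held-out set.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (train MSE, test MSE)."""
    coeffs = P.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((P.polyval(x_train, coeffs) - y_train) ** 2))
    test_mse = float(np.mean((P.polyval(x_test, coeffs) - y_test) ** 2))
    return train_mse, test_mse


# Synthetic data (an assumption for illustration): a sine curve plus
# Gaussian noise. The training set is deliberately small.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 20)
y_train = np.sin(np.pi * x_train) + rng.normal(0.0, 0.3, 20)
x_test = rng.uniform(-1.0, 1.0, 200)
y_test = np.sin(np.pi * x_test) + rng.normal(0.0, 0.3, 200)

# Degree 1 is too rigid for a sine curve, degree 12 has enough freedom
# to chase the noise in 20 points, and degree 3 sits in between.
results = {
    degree: fit_and_score(degree, x_train, y_train, x_test, y_test)
    for degree in (1, 3, 12)
}

for degree, (train_mse, test_mse) in results.items():
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Training error can only go down as the degree goes up, because each higher-degree model contains the lower-degree ones. The held-out error is the number to watch: on a setup like this, the most flexible fit typically shows a wide gap between its training and test error, which is the "absorbing noise" failure described above. Scoring on data the model never saw is one common way to catch it.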

The Forgotten Job of a Data Scientist: Editing

2 comments
