Machine Learning Crash Course: The Bias-Variance Dilemma

540 points, by Yossi_Frenkel, almost 8 years ago

11 comments

taeric, almost 8 years ago
This seems to ultimately come down to an idea that folks have a hard time shaking. It is entirely possible that you cannot recover the original signal using machine learning. This is, fundamentally, what separates this field from digital sampling.

And this is not unique to machine learning, per se. https://fivethirtyeight.com/features/trump-noncitizen-voters/ has a great widget that shows that as you get more data, you do not necessarily decrease inherent noise. In fact, it stays very constant. (Granted, this is in large part because machine learning has most of its roots in statistics.)

More explicitly, with ML you are building probabilistic models, in contrast to the analytic models most folks are used to. That is, with an analytic model you run the calculations for an object moving across the field and get something within the measurement bounds you expected. With a probabilistic model, you get something that is within the bounds of being in line with the data you have previously collected.

(None of this is to say this is a bad article. Just a bias to keep in mind as you are reading it. Hopefully, it helps you challenge it.)
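A quick sketch of that point (illustrative numbers only, assuming a simple linear signal plus Gaussian noise): fitting the correct model form on ever more data pins down the parameters, but the residual variance stays near the true noise level rather than shrinking.

```python
import numpy as np

# Hedged illustration with made-up numbers: more data sharpens the parameter
# estimates, but the irreducible noise floor does not go away.
rng = np.random.default_rng(0)
sigma = 0.5  # true noise standard deviation

for n in (100, 1_000, 10_000, 100_000):
    x = rng.uniform(0, 1, n)
    y = 2.0 * x + 1.0 + rng.normal(0, sigma, n)   # true signal plus noise
    slope, intercept = np.polyfit(x, y, 1)        # fit the correct model form
    residual_var = np.mean((y - (slope * x + intercept)) ** 2)
    print(f"n={n:>7}  estimated slope={slope:.3f}  residual variance={residual_var:.3f}")

# The slope estimate converges to 2.0, but the residual variance hovers
# around sigma^2 = 0.25 no matter how much data we collect.
```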
rdudekul, almost 8 years ago
Here are parts 1, 2 & 3:

Introduction, Regression/Classification, Cost Functions, and Gradient Descent: https://ml.berkeley.edu/blog/2016/11/06/tutorial-1/

Perceptrons, Logistic Regression, and SVMs: https://ml.berkeley.edu/blog/2016/12/24/tutorial-2/

Neural networks & Backpropagation: https://ml.berkeley.edu/blog/2017/02/04/tutorial-3/
amelius, almost 8 years ago
The whole problem of overfitting or underfitting exists because you're not trying to understand the underlying model, but you're trying to "cheat" by inventing some formula that happens to work in most cases.
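One way to picture that "cheating" (a hedged sketch with made-up data; the degrees and noise level are arbitrary assumptions, not from the comment): fit polynomials of increasing degree to noisy samples of a simple curve and score each invented formula on held-out data.

```python
import numpy as np

# Illustrative only: none of these polynomials "understands" the underlying
# model; they are formulas that happen to fit the training samples to
# varying degrees.
rng = np.random.default_rng(1)

def true_signal(x):
    return np.sin(2 * np.pi * x)

x_train = np.sort(rng.uniform(0, 1, 12))
y_train = true_signal(x_train) + rng.normal(0, 0.2, x_train.size)
x_test = np.sort(rng.uniform(0, 1, 200))
y_test = true_signal(x_test) + rng.normal(0, 0.2, x_test.size)

for degree in (1, 3, 9):
    # Polynomial.fit rescales the domain internally, so high degrees stay stable.
    poly = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((poly(x_train) - y_train) ** 2)
    test_mse = np.mean((poly(x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# Typically: degree 1 underfits (high bias), degree 9 chases the noise
# (high variance), and the middle degree generalizes best, even though it
# knows nothing about the true sine-wave model.
```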
therajiv, almost 8 years ago
Wow, the discussion on the Fukushima civil engineering decision was pretty interesting. However, I find it surprising that the engineers simply overlooked the linearity of the law and used a nonlinear model. I wonder if there were any economic or other incentives at play, and whether the model shown was just used to justify the decision?

Regardless, that post was a great read.
eggie5, almost 8 years ago
I've always liked this visualization of the Bias-Variance tradeoff: http://www.eggie5.com/110-bias-variance-tradeoff
plg, almost 8 years ago
Like many things in science and engineering (and life in general), it comes down to this: what is signal, what is noise?

Most of the time there is no a priori way of determining this.

You come to the problem with your own assumptions (or you inherit them), and that guides you (or misguides you).
CuriouslyC, almost 8 years ago
One good way to solve the bias-variance problem is to use Gaussian processes (GPs). With GPs you build a probabilistic model of the covariance structure of your data. Locally complex, high-variance models produce poor objective scores, so hyperparameter optimization favors "simpler" models.

Even better, you can put priors on the parameters of your model and give it the full Bayesian treatment via MCMC. This avoids overfitting, and gives you information about how strongly your data specifies the model.
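A minimal sketch of that idea using scikit-learn's GaussianProcessRegressor (the kernel choice, data, and numbers below are illustrative assumptions, not from the comment): fitting maximizes the log marginal likelihood, which trades data fit against complexity, so the optimized length-scale and noise level settle on a comparatively simple explanation, and predictions come with uncertainty estimates.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy data: a smooth signal plus observation noise.
rng = np.random.default_rng(2)
X = rng.uniform(0, 5, 40).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, X.shape[0])

# The RBF length-scale controls smoothness; WhiteKernel absorbs the noise.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5).fit(X, y)

# Hyperparameters chosen by maximizing the marginal likelihood, which
# penalizes needlessly wiggly (high-variance) explanations of the data.
print("optimized kernel:", gp.kernel_)
print("log marginal likelihood:", gp.log_marginal_likelihood_value_)

# The prediction carries its own uncertainty, the other half of the argument.
mean, std = gp.predict(np.array([[2.5]]), return_std=True)
print(f"f(2.5) is roughly {mean[0]:.2f} +/- {2 * std[0]:.2f}")
```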
ehsquared, almost 8 years ago
Welch Labs has a great 15-part series, where they gradually build up a decision tree model that counts the number of fingers in an image. Part 9 in the series explains the bias-variance spectrum really well: https://youtu.be/yLwZEuybaqE?list=PLiaHhY2iBX9ihLasvE8BKnS2Xg8AhY6iV
gpawl, almost 8 years ago
Statistics is the science of making decisions under uncertainty.

It is far too frequently misunderstood as the science of making certainty from uncertainty.
known, almost 8 years ago
Brilliant post; thank you.
Pogba666, almost 8 years ago
Wow, nice. Now I have things to do on my flight.