Machine Learning Crash Course: The Bias-Variance Dilemma

540 points by Yossi_Frenkel, almost 8 years ago

11 comments

taeric, almost 8 years ago
This seems to ultimately come down to an idea that folks have a hard time shaking: it is entirely possible that you cannot recover the original signal using machine learning. This is, fundamentally, what separates this field from digital sampling.

And this is not unique to machine learning, per se. https://fivethirtyeight.com/features/trump-noncitizen-voters/ has a great widget showing that as you get more data, you do not necessarily decrease the inherent noise. In fact, it stays very constant. (Granted, this is in large part because machine learning has most of its roots in statistics.)

More explicitly, with ML you are building probabilistic models. This contrasts with the analytic models most folks are used to: you run the calculations for an object moving across the field, and you get something within the measurement bounds that you expected. With a probabilistic model, you get something that is within the bounds of being in line with the previous data you have collected.

(None of this is to say this is a bad article. Just a bias to keep in mind as you are reading it. Hopefully, it helps you challenge it.)
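
A minimal sketch of that point (an illustration, not taeric's code; the sine function and noise level are arbitrary): even when you predict with the *true* underlying function, the residual error never drops below the noise floor, no matter how much data you collect.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5  # irreducible noise level (assumed for illustration)

def true_f(x):
    return np.sin(x)

for n in [100, 10_000, 1_000_000]:
    x = rng.uniform(0, 2 * np.pi, n)
    y = true_f(x) + rng.normal(0, sigma, n)
    # Even predicting with the true function, the residual spread
    # stays pinned at sigma -- more data does not remove it.
    rmse = np.sqrt(np.mean((y - true_f(x)) ** 2))
    print(f"n={n:>9,}  RMSE={rmse:.3f}  (noise floor={sigma})")
```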

rdudekul, almost 8 years ago
Here are parts 1, 2 & 3:

Introduction, Regression/Classification, Cost Functions, and Gradient Descent:
https://ml.berkeley.edu/blog/2016/11/06/tutorial-1/

Perceptrons, Logistic Regression, and SVMs:
https://ml.berkeley.edu/blog/2016/12/24/tutorial-2/

Neural Networks & Backpropagation:
https://ml.berkeley.edu/blog/2017/02/04/tutorial-3/

amelius, almost 8 years ago
The whole problem of overfitting or underfitting exists because you're not trying to understand the underlying model, but you're trying to "cheat" by inventing some formula that happens to work in most cases.
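
A minimal sketch of that "cheating" (an illustration, not from the thread; the sine signal and sample sizes are arbitrary): fitting polynomials of increasing degree to noisy samples. The high-degree fit tracks the training points almost perfectly yet generalizes worse than a model closer to the underlying signal.

```python
import numpy as np

rng = np.random.default_rng(1)

# Underlying signal: a sine wave; observations carry noise.
x_train = np.sort(rng.uniform(0, 2 * np.pi, 15))
y_train = np.sin(x_train) + rng.normal(0, 0.3, x_train.size)
x_test = np.linspace(0, 2 * np.pi, 200)
y_test = np.sin(x_test)

for degree in [1, 3, 12]:
    # A degree-12 "formula that happens to work" chases the noise.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:>2}: train MSE={train_err:.3f}, test MSE={test_err:.3f}")
```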

therajiv, almost 8 years ago
Wow, the discussion of the Fukushima civil engineering decision was pretty interesting. However, I find it surprising that the engineers simply overlooked the linearity of the law and used a nonlinear model. I wonder if there were economic or other incentives at play, and the model shown was just used to justify the decision?

Regardless, that post was a great read.
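
For readers without the article at hand: the law in question is the Gutenberg-Richter relation, under which the log of earthquake frequency falls off linearly with magnitude. A rough sketch with entirely made-up numbers (not the article's data) of how a kinked, piecewise fit can extrapolate fewer extreme events than the single straight line the law calls for:

```python
import numpy as np

# Hypothetical Gutenberg-Richter-style data (made-up numbers):
# log10 of annual earthquake frequency vs. magnitude. The law says
# this relationship is one straight line.
mags = np.array([5.0, 5.5, 6.0, 6.5, 7.0, 7.5])
log_freq = np.array([0.8, 0.3, -0.2, -0.75, -1.2, -1.9])

# Fit 1: a single line over all the data, as the law prescribes.
line = np.polyfit(mags, log_freq, 1)

# Fit 2: the upper segment of a "kinked" model that treats the data
# above M6.5 as its own regime, chasing local structure in the sample.
upper = np.polyfit(mags[mags >= 6.5], log_freq[mags >= 6.5], 1)

# Extrapolate both to a magnitude-9 event.
m9 = 9.0
f_line, f_kink = np.polyval(line, m9), np.polyval(upper, m9)
print(f"single line : log10 freq at M9 = {f_line:.2f}"
      f"  (~1 per {10 ** -f_line:,.0f} years)")
print(f"kinked model: log10 freq at M9 = {f_kink:.2f}"
      f"  (~1 per {10 ** -f_kink:,.0f} years)")
```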

eggie5, almost 8 years ago
I've always liked this visualization of the Bias-Variance tradeoff: http://www.eggie5.com/110-bias-variance-tradeoff

plg, almost 8 years ago
Like many things in science and engineering (and life in general), it comes down to this: what is signal, and what is noise?

Most of the time there is no a priori way of determining this.

You come to the problem with your own assumptions (or you inherit them), and they guide you (or misguide you).

CuriouslyC, almost 8 years ago
One good way to solve the bias-variance problem is to use Gaussian processes (GPs). With GPs you build a probabilistic model of the covariance structure of your data. Locally complex, high-variance models produce poor objective scores, so hyperparameter optimization favors "simpler" models.

Even better, you can put priors on the parameters of your model and give it the full Bayesian treatment via MCMC. This avoids overfitting, and gives you information about how strongly your data specifies the model.
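
A minimal sketch of the first paragraph's idea using scikit-learn (an illustration, not the commenter's code; the kernel, data, and noise level are assumptions, and the full MCMC treatment would need a library like PyMC and is omitted). Fitting maximizes the log marginal likelihood, which trades data fit against model complexity, and predictions carry uncertainty that widens away from the data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
X = np.sort(rng.uniform(0, 10, 30)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.2, 30)

# RBF covariance plus an explicit noise term; fitting tunes the
# hyperparameters by maximizing the log marginal likelihood.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X, y)

print("optimized kernel:", gp.kernel_)
print("log marginal likelihood:", gp.log_marginal_likelihood_value_)

# Predictions come with a standard deviation: a direct readout of how
# strongly the data specifies the model at each point.
X_new = np.array([[5.0], [12.0]])  # one point inside, one outside the data
mean, std = gp.predict(X_new, return_std=True)
for x, m, s in zip(X_new.ravel(), mean, std):
    print(f"x={x:4.1f}: mean={m:+.2f}, std={s:.2f}")
```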

ehsquared, almost 8 years ago
Welch Labs has a great 15-part series, where they gradually build up a decision tree model that counts the number of fingers in an image. Part 9 in the series explains the bias-variance spectrum really well: https://youtu.be/yLwZEuybaqE?list=PLiaHhY2iBX9ihLasvE8BKnS2Xg8AhY6iV

gpawl, almost 8 years ago
Statistics is the science of making decisions under uncertainty.

It is far too frequently misunderstood as the science of making certainty from uncertainty.

known, almost 8 years ago
Brilliant post; thank you.

Pogba666, almost 8 years ago
Wow, nice. Now I have something to do on my flight.