Machine Learning Crash Course: The Bias-Variance Dilemma

540 points, by Yossi_Frenkel, almost 8 years ago

11 comments

taeric, almost 8 years ago
This seems to ultimately come down to an idea that folks have a hard time shaking. It is entirely possible that you cannot recover the original signal using machine learning. This is, fundamentally, what separates this field from digital sampling.

And this is not unique to machine learning, per se. https://fivethirtyeight.com/features/trump-noncitizen-voters/ has a great widget that shows that as you get more data, you do not necessarily decrease inherent noise. In fact, it stays very constant. (Granted, this is in large part because machine learning has most of its roots in statistics.)

More explicitly, with ML you are building probabilistic models, in contrast to the analytic models most folks are used to. That is, with an analytic model you run the calculations for an object moving across the field and get something within the measurement bounds you expected. With a probabilistic model, you get something that is within the bounds of being in line with the data you have previously collected.

(None of this is to say this is a bad article. Just a bias to keep in mind as you are reading it. Hopefully, it helps you challenge it.)
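A quick sketch of that point (illustrative numbers only, assuming a simple linear signal plus Gaussian noise): fitting the correct model form on ever more data pins down the parameters, but the residual variance stays near the true noise level rather than shrinking.

```python
import numpy as np

# Hedged illustration with made-up numbers: more data sharpens the parameter
# estimates, but the irreducible noise floor does not go away.
rng = np.random.default_rng(0)
sigma = 0.5  # true noise standard deviation

for n in (100, 1_000, 10_000, 100_000):
    x = rng.uniform(0, 1, n)
    y = 2.0 * x + 1.0 + rng.normal(0, sigma, n)   # true signal plus noise
    slope, intercept = np.polyfit(x, y, 1)        # fit the correct model form
    residual_var = np.mean((y - (slope * x + intercept)) ** 2)
    print(f"n={n:>7}  estimated slope={slope:.3f}  residual variance={residual_var:.3f}")

# The slope estimate converges to 2.0, but the residual variance hovers
# around sigma^2 = 0.25 no matter how much data we collect.
```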
rdudekul, almost 8 years ago
Here are parts 1, 2 & 3:

Introduction, Regression/Classification, Cost Functions, and Gradient Descent: https://ml.berkeley.edu/blog/2016/11/06/tutorial-1/

Perceptrons, Logistic Regression, and SVMs: https://ml.berkeley.edu/blog/2016/12/24/tutorial-2/

Neural networks & Backpropagation: https://ml.berkeley.edu/blog/2017/02/04/tutorial-3/
amelius, almost 8 years ago
The whole problem of overfitting or underfitting exists because you're not trying to understand the underlying model, but you're trying to "cheat" by inventing some formula that happens to work in most cases.
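One way to picture that "cheating" (a hedged sketch with made-up data; the degrees and noise level are arbitrary assumptions, not from the comment): fit polynomials of increasing degree to noisy samples of a simple curve and score each invented formula on held-out data.

```python
import numpy as np

# Illustrative only: none of these polynomials "understands" the underlying
# model; they are formulas that happen to fit the training samples to
# varying degrees.
rng = np.random.default_rng(1)

def true_signal(x):
    return np.sin(2 * np.pi * x)

x_train = np.sort(rng.uniform(0, 1, 12))
y_train = true_signal(x_train) + rng.normal(0, 0.2, x_train.size)
x_test = np.sort(rng.uniform(0, 1, 200))
y_test = true_signal(x_test) + rng.normal(0, 0.2, x_test.size)

for degree in (1, 3, 9):
    # Polynomial.fit rescales the domain internally, so high degrees stay stable.
    poly = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((poly(x_train) - y_train) ** 2)
    test_mse = np.mean((poly(x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# Typically: degree 1 underfits (high bias), degree 9 chases the noise
# (high variance), and the middle degree generalizes best, even though it
# knows nothing about the true sine-wave model.
```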
therajiv, almost 8 years ago
Wow, the discussion on the Fukushima civil engineering decision was pretty interesting. However, I find it surprising that the engineers simply overlooked the linearity of the law and used a nonlinear model. I wonder if there were any economic or other incentives at play, and whether the model shown was just used to justify the decision?

Regardless, that post was a great read.
eggie5, almost 8 years ago
I've always liked this visualization of the Bias-Variance tradeoff: http://www.eggie5.com/110-bias-variance-tradeoff
plg, almost 8 years ago
Like many things in science and engineering (and life in general), it comes down to this: what is signal, what is noise?

Most of the time there is no a priori way of determining this.

You come to the problem with your own assumptions (or you inherit them), and that guides you (or misguides you).
CuriouslyC, almost 8 years ago
One good way to solve the bias-variance problem is to use Gaussian processes (GPs). With GPs you build a probabilistic model of the covariance structure of your data. Locally complex, high-variance models produce poor objective scores, so hyperparameter optimization favors "simpler" models.

Even better, you can put priors on the parameters of your model and give it the full Bayesian treatment via MCMC. This avoids overfitting, and gives you information about how strongly your data specifies the model.
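A minimal sketch of that idea using scikit-learn's GaussianProcessRegressor (the kernel choice, data, and numbers below are illustrative assumptions, not from the comment): fitting maximizes the log marginal likelihood, which trades data fit against complexity, so the optimized length-scale and noise level settle on a comparatively simple explanation, and predictions come with uncertainty estimates.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy data: a smooth signal plus observation noise.
rng = np.random.default_rng(2)
X = rng.uniform(0, 5, 40).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, X.shape[0])

# The RBF length-scale controls smoothness; WhiteKernel absorbs the noise.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5).fit(X, y)

# Hyperparameters chosen by maximizing the marginal likelihood, which
# penalizes needlessly wiggly (high-variance) explanations of the data.
print("optimized kernel:", gp.kernel_)
print("log marginal likelihood:", gp.log_marginal_likelihood_value_)

# The prediction carries its own uncertainty, the other half of the argument.
mean, std = gp.predict(np.array([[2.5]]), return_std=True)
print(f"f(2.5) is roughly {mean[0]:.2f} +/- {2 * std[0]:.2f}")
```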
ehsquared, almost 8 years ago
Welch Labs has a great 15-part series, where they gradually build up a decision tree model that counts the number of fingers in an image. Part 9 in the series explains the bias-variance spectrum really well: https://youtu.be/yLwZEuybaqE?list=PLiaHhY2iBX9ihLasvE8BKnS2Xg8AhY6iV
gpawl, almost 8 years ago
Statistics is the science of making decisions under uncertainty.

It is far too frequently misunderstood as the science of making certainty from uncertainty.
known, almost 8 years ago
Brilliant post; thank you.
Pogba666, almost 8 years ago
Wow, nice. Now I have things to do on my flight.