
Forecasts need to have error bars

334 points by apwheele over 1 year ago

25 comments

bo1024 over 1 year ago

Two things I think are interesting here, one discussed by the author and one not. (1) As mentioned at the bottom, forecasting usually should lead to decision-making, and when it gets disconnected, it can be unclear what the value is. It sounds like Rosenfield is trying to use forecasting to give added weight to his statistical conclusions about past data, which I agree sounds suspect.

(2) It's not clear what the "error bars" should mean. One is a confidence interval[1] (e.g. the model gives a 95% chance the output will be within these bounds). Another is a standard deviation (i.e. you are pretty much predicting the squared difference between your own point forecast and the outcome).

[1] acknowledged: not the correct term
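The two readings of "error bars" that bo1024 distinguishes can be sketched numerically. A minimal illustration on a hypothetical skewed forecast distribution (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical predictive distribution for next year's count (right-skewed).
samples = rng.lognormal(mean=3.0, sigma=0.5, size=100_000)

point = samples.mean()
# Reading 1: a 95% interval -- bounds that should contain the outcome.
lo, hi = np.quantile(samples, [0.025, 0.975])
# Reading 2: a standard deviation -- the expected size of your own error.
sd = samples.std()

print(f"point {point:.1f}, 95% interval [{lo:.1f}, {hi:.1f}], sd {sd:.1f}")
# For a skewed distribution the two summaries disagree: the quantile
# interval is asymmetric around the point forecast, while point +/- 2*sd
# is symmetric by construction.
```

For symmetric distributions the two roughly coincide, which is exactly why the ambiguity goes unnoticed until the distribution is skewed.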
rented_mule over 1 year ago

Yes, please! I was part of an org that ran thousands of online experiments over the course of several years. Having some sort of error bars when comparing the benefit of a new treatment gave a much better understanding.

Some thought it clouded the issue. For example, when a new treatment caused a 1% "improvement", but the confidence interval extended from -10% to 10%, it was clear that the experiment didn't tell us how that metric was affected. This makes the decision feel more arbitrary. But that is exactly the point - the decision *is* arbitrary in that case, and the confidence interval tells us that, allowing us to focus on the other trade-offs involved. If the confidence interval is 0.9% to 1.1%, we know that we can be much more confident in the effect.

A big problem with this is that meaningful error bars can be extremely difficult to come by in some cases. For example, imagine having something like that for every prediction made by an ML model. I would *love* to have that, but I'm not aware of any reasonable way to achieve it for most types of models. The same goes for online experiments where a complicated experiment design is required because there isn't a way to do random allocation that results in sufficiently independent cohorts.

On a similar note, regularly look at histograms (i.e., statistical distributions) for all important metrics. In one case, we were having speed issues in calls to a large web service. Many calls were completing in < 50 ms, but too many were tripping our 500 ms timeout. At the same time, we had noticed the emergence of two clear peaks in the speed histogram (i.e., it was a multimodal distribution). That caused us to dig a bit deeper and see that the two peaks represented logged-out and logged-in users. That knowledge allowed us to ignore wide swaths of code and spot the speed issues in some recently pushed personalization code that we might not have suspected otherwise.
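The kind of interval rented_mule describes for an A/B comparison can be sketched with a normal-approximation CI on the difference of two conversion rates (all counts hypothetical):

```python
import math

def lift_ci(conv_a, n_a, conv_b, n_b, z=1.96):
    """95% normal-approximation CI for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# Hypothetical experiment: the treatment looks ~0.1 points better...
lo, hi = lift_ci(conv_a=500, n_a=5000, conv_b=505, n_b=5000)
print(f"lift CI: [{lo:+.3f}, {hi:+.3f}]")
# ...but the interval straddles zero, so the data alone can't decide --
# which is exactly the information a bare point estimate hides.
```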
mightybyte over 1 year ago

Completely agree with this idea. And I would add a corollary... date estimates (i.e. deadlines) should also have error bars. After all, a date is a forecast. If a stakeholder asks for a date, they should also specify what kind of error bars they're looking for. A raw date with no estimate of uncertainty is meaningless. And correspondingly, if an engineer is giving a date to some other stakeholder, they should include some kind of uncertainty estimate with it. There's a huge difference between saying that something will be done by X date with 90% confidence versus three nines confidence.
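One way to put numbers on mightybyte's 90%-vs-three-nines point: if task duration is modeled as a right-skewed distribution (a hypothetical lognormal here, purely illustrative), the two confidence levels produce very different dates:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical model: task duration in days, right-skewed as real tasks are.
durations = rng.lognormal(mean=np.log(10), sigma=0.4, size=200_000)

d90, d999 = np.quantile(durations, [0.90, 0.999])
print(f"90% confident: done within {d90:.0f} days")
print(f"99.9% confident: done within {d999:.0f} days")
# The same task, honestly quoted at two confidence levels, yields dates
# roughly 2x apart -- which is why "a date" alone is meaningless.
```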
esafak over 1 year ago

Uncertainty quantification is a neglected aspect of data science and especially machine learning. Practitioners do not always have the statistical background, and the ML crowd generally has a "predict first and ask questions later" mindset that precludes such niceties.

I always demand error bars.
CalChris over 1 year ago

I'm reminded of Walter Lewin's analogous point about measurements from his 8.01 lectures:

    any measurement that you make without any knowledge of the uncertainty is meaningless

https://youtu.be/6htJHmPq0Os

You could say that forecasts are measurements you make about the future.
doubled112 over 1 year ago

I really thought that this was going to be about the weather.
datadrivenangel over 1 year ago

The interesting example in this article is nowcasting! The art of forecasting the present or past while you're waiting for data to come in.

It's sloppy science / statistics to not have error ranges.
clircle over 1 year ago

Every estimate/prediction/forecast/interpolation/extrapolation should have a confidence, prediction, or tolerance interval (application dependent) that incorporates the assumptions that the team is putting into the problem.
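The distinction clircle draws matters even in the simplest case. For a normal sample, the prediction interval for the next observation is much wider than the confidence interval for the mean (a sketch with illustrative numbers, using the normal approximation in place of the exact t quantile):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=100, scale=15, size=50)  # hypothetical sample
n, m, s = len(x), x.mean(), x.std(ddof=1)
z = 1.96  # normal approximation; a t quantile would be slightly wider

# Confidence interval: where the *mean* plausibly lies.
ci = (m - z * s / np.sqrt(n), m + z * s / np.sqrt(n))
# Prediction interval: where the *next observation* plausibly lies.
pi = (m - z * s * np.sqrt(1 + 1 / n), m + z * s * np.sqrt(1 + 1 / n))

print(f"confidence interval ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"prediction interval ({pi[0]:.1f}, {pi[1]:.1f})")
```

Quoting one when the application calls for the other is a common way error bars mislead.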
borg16 over 1 year ago

Reminds me of this paper[1]:

> An illusion of predictability in scientific results: Even experts confuse inferential uncertainty and outcome variability

> Traditionally, scientists have placed more emphasis on communicating inferential uncertainty (i.e., the precision of statistical estimates) compared to outcome variability (i.e., the predictability of individual outcomes). Here, we show that this can lead to sizable misperceptions about the implications of scientific results. Specifically, we present three preregistered, randomized experiments where participants saw the same scientific findings visualized as showing only inferential uncertainty, only outcome variability, or both, and answered questions about the size and importance of the findings they were shown. Our results, composed of responses from medical professionals, professional data scientists, and tenure-track faculty, show that the prevalent form of visualizing only inferential uncertainty can lead to significant overestimates of treatment effects, even among highly trained experts. In contrast, we find that depicting both inferential uncertainty and outcome variability leads to more accurate perceptions of results while appearing to leave other subjective impressions of the results unchanged, on average.

[1] https://www.microsoft.com/en-us/research/publication/an-illusion-of-predictability-in-scientific-results-even-experts-confuse-inferential-uncertainty-and-outcome-variability/
_hyttioaoa_ over 1 year ago

Forecasts can also be useful without error bars. Sometimes all one needs is a point prediction to inform actions. But sometimes full knowledge of the predictive distribution is helpful or needed to make good decisions.

"Point forecasts will always be wrong" - true for continuous data, but if you can predict that some stock will go to 2.01x its value instead of 2x, that's still helpful.
mrguyorama over 1 year ago

If you are forecasting both "Crime" and "Economy", it's VERY likely you have domain expertise in neither.
lagrange77 over 1 year ago

This is a great advantage of Gaussian Process Regression, a.k.a. Kriging.

https://en.wikipedia.org/wiki/Gaussian_process#Gaussian_process_prediction,_or_Kriging
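The advantage lagrange77 refers to is that a GP returns a predictive standard deviation alongside each point prediction. A minimal NumPy sketch (zero-mean prior, RBF kernel, noise-free toy data, all values illustrative):

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def gp_predict(x_train, y_train, x_test, noise=1e-6):
    """GP posterior mean and std at x_test (zero-mean prior, RBF kernel)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_test)
    K_ss = rbf(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha
    v = np.linalg.solve(K, K_s)
    var = np.diag(K_ss) - np.sum(K_s * v, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

x = np.array([0.0, 1.0, 3.0])
y = np.sin(x)
mean, std = gp_predict(x, y, np.array([1.0, 2.0]))
# Posterior std is near zero at an observed point (x=1) and larger in
# the gap between observations (x=2) -- the error bars come for free.
print(mean, std)
```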
Animats over 1 year ago

Looking at the graph, changes in this decade are noise. But what happened back in 1990?
_rm over 1 year ago

Doesn't work.

For instance in a business setting, if I say "it'll be done in 10 days +/- 4 days", they'll immediately say "ok, so you're saying it'll be done in 14 days tops then".

More effective to sound as unsure as possible, disclaim everything in slippery language, and promise to give updates to your predictions as soon as you realise they've changed (granted, this wouldn't work as well for an anonymous-reader situation like in this article).
ur-whale over 1 year ago

Not just forecasts.

Accounting should do it too in their reporting.

I would love to see a balance sheet with a proper 'certainty range' around the values in there.
KingOfCoders over 1 year ago

Take the error bars for Scrum estimations, 3, 5, 8 - people treat them as real things although they have huge errors and are very coarse.
y1zhou over 1 year ago

For prediction intervals with guaranteed coverage, check out conformal prediction [1]. It works great, especially for time series data.

[1] https://github.com/valeman/awesome-conformal-prediction
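The core of split conformal prediction is small enough to sketch in a few lines. A toy regression setup (any fitted point model would work in place of the hypothetical one here):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: y = 2x + noise, and a deliberately crude point model.
x = rng.uniform(0, 10, size=2000)
y = 2 * x + rng.normal(0, 1, size=2000)

def predict(x):
    return 2 * x  # stand-in for any fitted model

# Split conformal: score a held-out calibration set by absolute residual...
x_cal, y_cal = x[:1000], y[:1000]
scores = np.abs(y_cal - predict(x_cal))
alpha = 0.1
q = np.quantile(scores, np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))

# ...then every new prediction gets an interval with ~90% marginal coverage,
# regardless of how bad the underlying model is.
x_new, y_new = x[1000:], y[1000:]
lo, hi = predict(x_new) - q, predict(x_new) + q
coverage = np.mean((y_new >= lo) & (y_new <= hi))
print(f"interval half-width {q:.2f}, empirical coverage {coverage:.2f}")
```

The coverage guarantee is marginal and assumes exchangeability, which is why time series applications need the adapted variants collected in the linked list.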
amai over 1 year ago

Not only forecasts need error bars. Every statistic needs error bars. But even then most people interpret error bars wrongly; see e.g. https://errorbars.streamlit.app/
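For an arbitrary statistic, the bootstrap is one generic way to get the error bars amai asks for. A sketch on hypothetical data, using the median (where no simple closed-form standard error exists):

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.exponential(scale=5.0, size=500)  # hypothetical skewed metric

# Bootstrap: resample with replacement, recompute the statistic each time.
medians = np.array([
    np.median(rng.choice(data, size=len(data), replace=True))
    for _ in range(5000)
])
lo, hi = np.quantile(medians, [0.025, 0.975])
print(f"median {np.median(data):.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```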
Arrath over 1 year ago

I'm just imagining adding error bars to my schedule forecasting (with schedules that are typically on the optimistic side thanks to management), with bars pointing in the bad direction, and seeing management still insist it'll take too long.
predict_addict over 1 year ago

Let me suggest a solution: https://github.com/valeman/awesome-conformal-prediction
xchip over 1 year ago

And also claims that say "x improves y" should include the std and avg in the title.
aurelien over 1 year ago

Two inches deeper is too much money! So Alphabet prefers to lose the idea, the contract, the business plan rather than make things solid that work for a long time. A dollar is a dollar; a project is just a piece of fun. Alphabet LOL XD
BOOSTERHIDROGEN over 1 year ago

What is the best explanation of error bars?
nothrowaways over 1 year ago

Linear error rate
amichal over 1 year ago

I have, in my life as a web developer, had multiple "academics" urgently demand that I remove error bands, bars, notes about outliers, confidence intervals, etc. from graphics at the last minute so people are not "confused".

It's depressing.