Stupid Data Miner Tricks: Overfitting the S&P 500

70 points, posted by herrherr about 14 years ago

7 comments

rgbrgb, about 14 years ago
Does interpolation ever work in forecasting?

My gut instinct would be that markets and human systems are chaotic in nature. Even in the most chaotic systems, if you look at a suitably small sample, you can see some correlations and patterns between different factors which really don't exist. These are mirage correlations.

Take the Lorenz attractor as an example. At some points, it will cycle on the same "wing" of the butterfly many times. But betting that it will do it again is a really lousy bet.

Polynomial approximation and curve fitting in general work when we're trying to explicate relationships between variables in a problem space in which we understand causal linkages very well (and they're constant) - it can be really useful in engineering.
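A minimal sketch of the Lorenz point above, assuming the standard parameters (sigma = 10, rho = 28, beta = 8/3) and a hand-rolled RK4 integrator. The starting point, step size, and the 1e-8 nudge are arbitrary choices for illustration and do not come from the article. Two trajectories that start almost identically stay together for a while and then end up unrelated, which is why extrapolating a recent pattern is a poor bet.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (standard parameters assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def add(a, b, scale):
        return tuple(ai + scale * bi for ai, bi in zip(a, b))

    k1 = lorenz(state)
    k2 = lorenz(add(state, k1, dt / 2))
    k3 = lorenz(add(state, k2, dt / 2))
    k4 = lorenz(add(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_over_time(steps=4000, dt=0.01, nudge=1e-8):
    """Distance between two trajectories whose starts differ by `nudge` in x."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + nudge, 1.0, 1.0)
    gaps = []
    for _ in range(steps):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        gaps.append(sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5)
    return gaps


gaps = separation_over_time()
early_gap = max(gaps[:100])    # largest gap during the first time unit
late_gap = max(gaps[-1000:])   # largest gap during the last ten time units
print(f"largest gap in the first time unit: {early_gap:.2e}")
print(f"largest gap in the last ten time units: {late_gap:.2e}")
```

The early gap stays tiny while the late gap is comparable to the size of the attractor itself, so any forecast built on the early agreement fails.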
ck2, about 14 years ago
Via Google PDF viewer: https://docs.google.com/gview?url=http://nerdsonwallstreet.typepad.com/my_weblog/files/dataminejune_2000.pdf&pli=1
imurray, about 14 years ago
Terrible generalization of polynomials is useful for demonstrating overfitting (I've done it myself in tutorials). However, responsible tutorials should mention that the other obvious lesson is that the polynomials (1, x, x², x³, etc.) are a *terrible* set of basis functions for regression. Don't just watch for overfitting, but use a sensible regression model! For complicated fits some methods to consider are: local regression, splines, various artificial neural nets, or Gaussian processes.
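A rough sketch of that point, under stated assumptions: Runge's function 1/(1 + 25x²) stands in for the "data" (it is not from the article), the polynomial is the exact degree-10 interpolant through 11 equally spaced points, and a piecewise-linear interpolant serves as a crude stand-in for the local methods mentioned above. This is interpolation of a smooth function rather than regression on noisy data, but it shows the same wiggle: both fits are exact at the 11 points, yet they behave very differently in between.

```python
def runge(x):
    """Runge's function, a smooth curve that high-degree polynomials fit badly."""
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange_eval(xs, ys, x):
    """Evaluate the unique polynomial through (xs, ys) at x (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def piecewise_linear_eval(xs, ys, x):
    """Evaluate the straight-line interpolant between neighbouring points at x."""
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x is outside the fitted range")


# 11 equally spaced sample points on [-1, 1].
xs = [-1.0 + 2.0 * i / 10 for i in range(11)]
ys = [runge(x) for x in xs]

# Dense grid used to measure how far each fit strays between the samples.
grid = [-1.0 + 2.0 * k / 2000 for k in range(2001)]
poly_error = max(abs(lagrange_eval(xs, ys, x) - runge(x)) for x in grid)
linear_error = max(abs(piecewise_linear_eval(xs, ys, x) - runge(x)) for x in grid)

print(f"max error, degree-10 polynomial: {poly_error:.3f}")
print(f"max error, piecewise linear:     {linear_error:.3f}")
```

The polynomial's worst error sits near the ends of the interval, where it oscillates far outside the range of the sampled values, while the local fit stays close everywhere.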
tropin, about 14 years ago
What's with the [scribd] tag when direct linking to a .pdf file? It's becoming common, but I can't understand it.
zipstudio, about 14 years ago
"If the NFL wins, the market goes up, otherwise, it takes a dive. What’s happened over the last thirty years? Well, most of the time, the NFL wins the Superbowl"

Standards of editing have really gone down over the years. The "NFL" always wins the Superbowl...
streptomycin, about 14 years ago
TLDR: Correlation != causation; if you have high-dimensional data, you can always find a correlation, but it's probably meaningless; polynomial wiggle is a bitch, so don't fit high-degree polynomials to your data.
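A small sketch of the "you can always find a correlation" point, assuming everything is pure noise: the target is random numbers standing in for 20 yearly returns, and the 1000 candidate predictors are equally random. The sizes and the seed are arbitrary choices for illustration. Picking the candidate with the best in-sample correlation looks impressive until it is checked on data it was not selected on.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_fit, n_holdout, n_candidates = 20, 500, 1000
length = n_fit + n_holdout

# Everything is independent noise: no candidate has any real link to the target.
target = [rng.gauss(0, 1) for _ in range(length)]
candidates = [[rng.gauss(0, 1) for _ in range(length)] for _ in range(n_candidates)]

# "Data mining": keep whichever candidate correlates best on the first 20 points.
best = max(candidates, key=lambda s: abs(pearson(s[:n_fit], target[:n_fit])))

in_sample = abs(pearson(best[:n_fit], target[:n_fit]))
out_of_sample = abs(pearson(best[n_fit:], target[n_fit:]))

print(f"best in-sample |r| over {n_candidates} candidates: {in_sample:.2f}")
print(f"same candidate on held-out data, |r|: {out_of_sample:.2f}")
```

The in-sample figure is a product of searching many candidates, not of any relationship, which is why it disappears on the held-out points.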
waqf, about 14 years ago
Related question: if, despite the warnings, I fancy my chances at *this sort of thing*, what sort of historical data can I get? Is free [machine-readable] stock market data easy to come by, or impossible?