TechEcho
Avoiding a Common Mistake with Time Series

161 points by davidkellis over 10 years ago

12 comments

foobarian over 10 years ago
I'm not sure. The author says that adding a common component to two random time series doesn't make them correlated. But that's not true, by construction, at least using any of the simple correlation tests. It's a complicated subject explained in a confusing way.
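The point about a shared component can be checked with a quick simulation. This is only a minimal sketch, not the article's code: the series are assumed to be independent Gaussian noise, and the names `pearson`, `a`, `b`, and `common` are made up for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Two independent noise series plus one component shared by both (assumed setup).
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]
common = [rng.gauss(0, 1) for _ in range(n)]

r_independent = pearson(a, b)
r_shared = pearson(
    [x + c for x, c in zip(a, common)],
    [y + c for y, c in zip(b, common)],
)

print(f"without common component: r = {r_independent:.3f}")
print(f"with common component:    r = {r_shared:.3f}")
```

With unit variances throughout, the shared component accounts for half of each series' variance, so the second correlation should land near 0.5 while the first stays near zero.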
photon137 over 10 years ago
What the author is trying to explain are the concepts of cointegration and stationarity. A useful introduction here: http://www.uta.edu/faculty/crowder/papers/drunk%20and%20dog.pdf
omalleyt over 10 years ago
Never use first differencing, that's crazy. It magnifies measurement error. Pass the signal through a high pass filter to remove trend.
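The claim about magnified measurement error can be illustrated with a rough sketch. The assumptions here are mine, not the commenter's: a slow linear trend observed with independent Gaussian measurement noise, and hypothetical names such as `observed` and `differenced`. For independent noise, the difference of two consecutive errors has twice the variance of a single error.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000

# Assumed setup: a slow linear trend plus independent measurement noise.
noise = [rng.gauss(0, 1) for _ in range(n)]
trend = [0.01 * t for t in range(n)]
observed = [s + e for s, e in zip(trend, noise)]

# First differencing: the trend collapses to a constant step of 0.01,
# but each difference now carries two noise terms instead of one.
differenced = [observed[t] - observed[t - 1] for t in range(1, n)]

var_noise = statistics.pvariance(noise)
var_differenced = statistics.pvariance(differenced)
ratio = var_differenced / var_noise

print(f"noise variance before differencing: {var_noise:.3f}")
print(f"variance of differenced series:     {var_differenced:.3f}")
print(f"ratio: {ratio:.2f}")
```

The ratio should come out close to 2. Whether that cost is acceptable depends on how large the measurement error is relative to the changes you care about.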
bbrazil over 10 years ago
Something I've repeatedly found useful is that when debugging and you have a conjecture, not only look for evidence that a correlation/causation is present; but also look for evidence that it isn't.

Doing a very quick A/B test helps too.
esfandia over 10 years ago
Isn't the author exaggerating in the other direction? There is obviously correlation between the two time series. Sure, who's saying there is causation (as mentioned in the article there can be a third random variable that the first two depended on)? But also, who's to say *there's no causation*? Is it ok to always remove the correlated part of the two time series? What if that's the interesting part and the explanation you're looking for?
n00b101 over 10 years ago
This is called spurious correlation. It's well known in financial / economic time-series analysis. The lesson is that you never measure the correlation between the PRICE LEVELS of products, instead you measure the correlation between the daily/weekly/etc CHANGE IN PRICE LEVELS.

A famous example of this:

The tale of David Leinweber, which is related in the excellent new book "Quantitative Value," illustrates this point about "stupid data miner tricks." Leinweber sifted through a United Nations CD covering the economic data of 140 countries. He found that butter production in Bangladesh explained 75 percent of the variation of the S&P 500 Index. Not satisfied, he found that if he added a broader category of global dairy products, the correlation would rise to 95 percent. Then he added a third variable, the population of sheep, and found that he had now explained 99 percent of the variation in the S&P 500 for the period 1983-'99.

(http://www.cbsnews.com/news/what-butter-production-means-for-your-portfolio/)
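The levels-versus-changes lesson can be sketched with simulated data. This is a toy illustration under my own assumptions, not Leinweber's data: each "price" is an independent Gaussian random walk, and the helper names (`random_walk`, `changes`, `pearson`) are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def random_walk(rng, n):
    """Cumulative sum of independent Gaussian steps (a stand-in for a price level)."""
    level = 0.0
    walk = []
    for _ in range(n):
        level += rng.gauss(0, 1)
        walk.append(level)
    return walk

def changes(series):
    """Period-over-period changes of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

rng = random.Random(42)  # fixed seed so the run is repeatable
n_pairs, length = 200, 500

abs_r_levels = []
abs_r_changes = []
for _ in range(n_pairs):
    # The two walks are independent by construction.
    x = random_walk(rng, length)
    y = random_walk(rng, length)
    abs_r_levels.append(abs(pearson(x, y)))
    abs_r_changes.append(abs(pearson(changes(x), changes(y))))

mean_abs_r_levels = sum(abs_r_levels) / n_pairs
mean_abs_r_changes = sum(abs_r_changes) / n_pairs

print(f"mean |r| between levels:  {mean_abs_r_levels:.3f}")
print(f"mean |r| between changes: {mean_abs_r_changes:.3f}")
```

Even though every pair is unrelated, the levels tend to show sizeable correlations in one direction or the other, while the correlations between changes stay close to zero.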
SixSigma over 10 years ago
I ordered the book, Quantitative Forecasting Methods by Farnum and Stanton (PWS-KENT, 1989). It was only £2.81, sounds like money well spent.
nemo44x over 10 years ago
Does this mean that if I apply this algorithm and two or more time series data sets are still similar, they are in fact correlated? I find this test fascinating.
eouw0o83hf over 10 years ago
Wow, and the graphs on http://www.tylervigen.com/ are just incredible.
plg over 10 years ago
Also: statistical tests on correlation coefficients don't test whether the correlation is "significant" or not; they only test whether the correlation is reliably different from 0.00.

So a small correlation (e.g. r=0.10) can still be "statistically significant" at p<0.001, but all this means is that r is reliably different from 0.00; it doesn't mean r is big.
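A back-of-the-envelope sketch of that point, with assumptions of my own: a sample size of 2000 (not from the comment), the usual t statistic for a correlation coefficient, and a normal approximation to its distribution, which is reasonable only for large samples. The function name `approx_two_sided_p` is hypothetical.

```python
import math

def approx_two_sided_p(r, n):
    """Approximate two-sided p-value for H0: correlation = 0.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard normal,
    which is an approximation that assumes n is large.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))

r = 0.10
n = 2000  # assumed sample size, chosen for illustration

p_value = approx_two_sided_p(r, n)
variance_explained = r ** 2

print(f"approximate p-value: {p_value:.2e}")
print(f"share of variance explained (r squared): {variance_explained:.2%}")
```

Under these assumptions the p-value falls well below 0.001, yet r squared says only about 1 percent of the variance is shared.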
princeb over 10 years ago
Perform a Dickey-Fuller test if you are unsure whether a time series is nonstationary, perhaps.
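For readers who have not seen the test, here is a rough hand-rolled sketch of the basic (non-augmented) Dickey-Fuller regression with a constant. It is a simplification for illustration, not a replacement for a statistics library: the function name `dickey_fuller_t` is made up, and -2.86 is the commonly tabulated asymptotic 5% critical value for the constant, no-trend case.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on gamma in: diff[t] = alpha + gamma * series[t-1] + error.

    A strongly negative value is evidence against a unit root (nonstationarity).
    """
    lagged = series[:-1]
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    m = len(diffs)
    mean_x = sum(lagged) / m
    mean_z = sum(diffs) / m
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(lagged, diffs))
    gamma = sxz / sxx
    alpha = mean_z - gamma * mean_x
    sse = sum((z - alpha - gamma * x) ** 2 for x, z in zip(lagged, diffs))
    se_gamma = math.sqrt(sse / (m - 2) / sxx)
    return gamma / se_gamma

CRITICAL_5_PERCENT = -2.86  # asymptotic value, constant and no trend

rng = random.Random(7)  # fixed seed so the run is repeatable
n = 1000

# Stationary example: plain white noise.
white_noise = [rng.gauss(0, 1) for _ in range(n)]

# Nonstationary example: a random walk built from independent steps.
walk = []
level = 0.0
for _ in range(n):
    level += rng.gauss(0, 1)
    walk.append(level)

t_noise = dickey_fuller_t(white_noise)
t_walk = dickey_fuller_t(walk)

print(f"white noise: t = {t_noise:.2f}, rejects unit root: {t_noise < CRITICAL_5_PERCENT}")
print(f"random walk: t = {t_walk:.2f}, rejects unit root: {t_walk < CRITICAL_5_PERCENT}")
```

The white-noise statistic should be far below the critical value, while a random walk usually is not. In practice the augmented version, available as `adfuller` in `statsmodels.tsa.stattools`, is the more common choice because it also handles autocorrelated errors.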
gengkev over 10 years ago
where are the scrollbars??