Ask HN: Regression to the mean math question

4 points by noaharc over 15 years ago
This might be too off-topic, but just kill it if you think it is. Otherwise, here goes:

I have a question about regression to the mean.

Suppose you have a set of pairs (a,b) corresponding to students in a class. a = the student's score on the first midterm, b = score on second midterm.

If you plot the pairs with a on x-axis, b on y-axis, then get the least-squares line, you have an upward sloping line.

The line slope should be less than 1, indicating regression to the mean.

If you plot b on x-axis, a on y-axis, the slope is necessarily now greater than 1. But I fail to see what has changed in the analysis -- a and b are both just supposed to be samples from the same distribution, right?

This has been driving me crazy, so I'd love some help.

Thank you!

2 comments

roundsquare over 15 years ago
Don't do a least squares line. That doesn't help. In the first plot, you'll see that in general:

x < mean => y > x

x > mean => y < x

This assumes the scores are normalized. Regression to the mean is that most people move towards the mean in subsequent games/attempts/whatever.

*But I fail to see what has changed in the analysis -- a and b are both just supposed to be samples from the same distribution, right?*

Not at all. b is not independent of a; that's the whole point of regression to the mean. If you take ordered pairs where there is no connection between a and b, then you won't get any regression to the mean, you'll get points essentially randomly placed on the plane.
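
To make the pattern above concrete, here is a minimal simulation sketch. It assumes a hypothetical model that is not from the thread: each exam score is a shared "ability" plus independent exam-day noise, both standard normal, so the scores are already centered on a mean of 0. The names `simulate_scores` and `group_means` are made up for illustration. Note the pattern only holds on average for each group, not for every individual student.

```python
import random
from statistics import mean


def simulate_scores(n=20000, seed=0):
    """Hypothetical model: score = shared ability + independent exam noise."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        a = ability + rng.gauss(0.0, 1.0)  # first midterm
        b = ability + rng.gauss(0.0, 1.0)  # second midterm
        pairs.append((a, b))
    return pairs


def group_means(pairs):
    """Split students by whether a is below the class mean of a,
    then return (mean a, mean b) for each group."""
    a_mean = mean(a for a, _ in pairs)
    low = [(a, b) for a, b in pairs if a < a_mean]
    high = [(a, b) for a, b in pairs if a >= a_mean]

    def summarize(group):
        return mean(a for a, _ in group), mean(b for _, b in group)

    return summarize(low), summarize(high)


(low_a, low_b), (high_a, high_b) = group_means(simulate_scores())
print(f"below average on exam 1: mean a = {low_a:.2f}, mean b = {low_b:.2f}")
print(f"above average on exam 1: mean a = {high_a:.2f}, mean b = {high_b:.2f}")
```

Under these assumptions the below-average group's second score should sit closer to the mean than its first score, and likewise for the above-average group. Setting the noise to zero would make a and b identical and remove the effect.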
mbrubeck over 15 years ago
Regression to the mean does not imply that the first slope should be less than one.

If for some reason only the above-average students regressed, then the slope would be < 1. But regression to the mean also affects the scores of students who started below average; as a group we should expect them to regress *upward* toward the mean. Combine the two groups, and the effects exactly cancel out, leaving a slope of 1.

(Since you say the slope "should be" one, I assume the scores are normalized somehow so that the mean score for exam A is the same as the mean for exam B.)
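
The slope claims in this thread can be checked numerically. The sketch below is only one way to do it, and it assumes a hypothetical model that nobody in the thread specified: each score is a shared ability plus independent noise, both standard normal, so both exams have the same mean and variance. `ols_slope` is an illustrative helper using the textbook least-squares formula, slope = cov(x, y) / var(x).

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs: cov(x, y) / var(x)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Assumed model (not from the thread): score = ability + exam noise.
rng = random.Random(1)
a_scores, b_scores = [], []
for _ in range(20000):
    ability = rng.gauss(0.0, 1.0)
    a_scores.append(ability + rng.gauss(0.0, 1.0))
    b_scores.append(ability + rng.gauss(0.0, 1.0))

slope_b_on_a = ols_slope(a_scores, b_scores)  # a on x-axis, b on y-axis
slope_a_on_b = ols_slope(b_scores, a_scores)  # b on x-axis, a on y-axis
print(f"slope of b on a: {slope_b_on_a:.3f}")
print(f"slope of a on b: {slope_a_on_b:.3f}")
```

In this model the population value of both slopes is 0.5: the covariance is 1 and each exam's variance is 2. Swapping the axes does not invert the fitted line, because regressing a on b minimizes residuals in a rather than in b, so it is a different line. In general the product of the two slopes equals the squared correlation, which cannot exceed 1, so the two slopes cannot be reciprocals of each other unless the points are perfectly collinear.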