Paradoxes of probability and other statistical strangeness

132 points by seycombi about 8 years ago

8 comments

naftaliharris about 8 years ago
A less popular but perhaps more influential phenomenon is Stein's Paradox [1]. Here's a provocative example often given to illustrate it: Say you have a baseball player, a soccer player, and a football player, and you wish to estimate the true mean number of home runs, goals, and touchdowns each scores per year. If you have the last ten seasons' worth of data for each, then the obvious thing to do is to estimate each player's true yearly mean score by their average yearly score over those ten years. (E.g., the baseball player hits an average of 20 home runs each year, so let's estimate their true mean yearly home runs by 20.) Stein's Paradox says that you can actually do a lot better than this.

Even crazier, the James-Stein estimator, which does this, actually uses data about the football player and soccer player to make predictions about the baseball player (and vice versa). This is deeply unintuitive to most people since the players aren't related to each other at all. The phenomenon only holds with at least three players; it doesn't work for two.

(More generally, Stein's Paradox is the fact that if you have p >= 3 independent Gaussians with a known variance, you can do better, in total squared error, at estimating their p-dimensional mean than just using their sample means.)

I've spent a bunch of time trying to understand why this actually works [2]; to be honest I still don't deeply understand. But nonetheless the consensus is that the same shrinkage phenomenon is what causes improved performance for a variety of high-dimensional estimators (lasso or ridge regression, e.g.), making the paradox very influential.

[1] https://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator
[2] https://www.naftaliharris.com/blog/steinviz/
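The claim in the comment above can be checked with a small simulation. What follows is only a sketch under simplifying assumptions that are not in the comment: each of the p = 3 true means is observed once with known unit variance, the estimator shrinks toward the origin (the plain James-Stein form), and the true means in `theta` are made-up values. The function names are hypothetical.

```python
import random


def james_stein(x, sigma2=1.0):
    """Plain James-Stein estimate: shrink the observation vector toward the origin."""
    p = len(x)
    norm2 = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) * sigma2 / norm2
    return [shrink * v for v in x]


def total_squared_error(estimate, truth):
    """Sum of squared errors across all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


def compare_risk(theta, trials=20000, seed=0):
    """Average total squared error of the raw observations vs. James-Stein."""
    rng = random.Random(seed)
    raw_err = js_err = 0.0
    for _ in range(trials):
        # One noisy observation per coordinate, unit variance (an assumption).
        x = [rng.gauss(t, 1.0) for t in theta]
        raw_err += total_squared_error(x, theta)
        js_err += total_squared_error(james_stein(x), theta)
    return raw_err / trials, js_err / trials


if __name__ == "__main__":
    raw, js = compare_risk([1.0, -0.5, 2.0])  # hypothetical true means
    print(f"raw observations: {raw:.3f}  James-Stein: {js:.3f}")
```

The improvement is in the error summed over all three coordinates; the estimate for any single coordinate can be worse than its raw observation.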
pmoriarty about 8 years ago
My favorite probability paradox has always been the Monty Hall problem [1]:

Suppose you're on a game show, and you're given the choice of three doors:

Behind one door is a car; behind the others, goats.

You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat.

He then says to you, "Do you want to pick door No. 2?"

Is it to your advantage to switch your choice?

[1] https://en.wikipedia.org/wiki/Monty_Hall_problem
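For readers who want to test their answer empirically, here is a minimal simulation sketch. It assumes the standard reading of the puzzle: the host always opens a goat door other than the contestant's pick, choosing at random when two such doors are available. The function names are hypothetical.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the car is and opens a goat door that is not the pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won under a fixed stay-or-switch strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumptions, switching wins about two thirds of the time and staying about one third.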
danbruc about 8 years ago
No need to look at fancy paradoxes, just think about the following.

What does it mean that tossing a fair coin has a 50% probability of showing heads?

If you think you know the answer, you are probably wrong.

EDIT: Instead of just voting this down, try to give an answer. If you think it is easy, you have not thought about it carefully enough.
Houshalter about 8 years ago
By far the most unintuitive paradox for me personally is the one presented here: https://youtu.be/go3xtDdsNQM?t=3m27s

"Mr. Jones has 2 children. What is the probability he has a girl if he has a boy born on Tuesday?" Somehow knowing the day of the week the boy was born changes the result. It's completely bizarre.
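The effect can be reproduced by brute-force enumeration. This sketch assumes the usual textbook reading, which the comment does not spell out: each child is independently a boy or a girl with equal probability, each weekday is equally likely, and "he has a boy born on Tuesday" means "at least one of the two children is a boy born on a Tuesday". The names below are hypothetical.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = range(7)
TUESDAY = 1  # arbitrary index for Tuesday


def prob_girl(condition):
    """P(at least one girl | condition) over all equally likely two-child families."""
    children = list(product(SEXES, DAYS))         # 14 (sex, weekday) combinations
    families = list(product(children, repeat=2))  # 196 ordered pairs of children
    kept = [f for f in families if condition(f)]
    with_girl = [f for f in kept if any(sex == "girl" for sex, _ in f)]
    return Fraction(len(with_girl), len(kept))


def has_boy(family):
    return any(sex == "boy" for sex, _ in family)


def has_tuesday_boy(family):
    return any(child == ("boy", TUESDAY) for child in family)


if __name__ == "__main__":
    print("given at least one boy:           ", prob_girl(has_boy))          # 2/3
    print("given at least one boy on Tuesday:", prob_girl(has_tuesday_boy))  # 14/27
```

The answer shifts from 2/3 to 14/27 because the extra detail keeps 27 of the 196 families, and 14 of those contain a girl. A different reading of how the information was obtained gives a different answer.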
haddr about 8 years ago
I think there is a whole class of statistical "strangeness" in using p-values for hypothesis testing. For instance, a result with p = 0.05 can still have a roughly 30% chance of being a false positive, depending on how plausible the hypothesis was beforehand [1], which is far from what intuition tells us.

[1] http://www.nature.com/news/scientific-method-statistical-errors-1.14700
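One way to arrive at a figure near 30% is sketched below. It is an assumption that this matches the calculation behind the linked article: the sketch uses the Sellke-Bayarri-Berger calibration, in which the Bayes factor in favour of the null hypothesis is bounded below by -e * p * ln(p) for p < 1/e, combined with a prior probability that the effect is real. The function name and the 50:50 default prior are illustrative choices.

```python
import math


def min_false_positive_prob(p_value, prior_prob_real=0.5):
    """Lower bound on P(no real effect | observed p-value), for p_value < 1/e."""
    if not 0 < p_value < 1 / math.e:
        raise ValueError("the bound only applies for 0 < p_value < 1/e")
    if not 0 < prior_prob_real < 1:
        raise ValueError("prior_prob_real must be strictly between 0 and 1")
    # Smallest Bayes factor in favour of the null that is compatible with p_value.
    bayes_factor_bound = -math.e * p_value * math.log(p_value)
    prior_odds_null = (1 - prior_prob_real) / prior_prob_real
    posterior_odds_null = bayes_factor_bound * prior_odds_null
    return posterior_odds_null / (1 + posterior_odds_null)


if __name__ == "__main__":
    for p in (0.05, 0.01):
        print(p, round(min_false_positive_prob(p), 3))
```

With a 50:50 prior this gives about 0.29 for p = 0.05 and about 0.11 for p = 0.01. A less plausible hypothesis (a lower `prior_prob_real`) pushes the bound higher.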
z3t4 about 8 years ago
If you throw a six-sided die twice, there's a 1/6 * 1/6 ≈ 3% chance of rolling a six both times. But once you have thrown one six, there's a ~17% chance of rolling a six again ...
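The two numbers in this comment are a joint probability and a conditional probability. A short enumeration sketch, assuming a fair die and independent throws, makes the difference explicit (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two throws of a fair six-sided die.
rolls = list(product(range(1, 7), repeat=2))

both_sixes = [r for r in rolls if r == (6, 6)]
first_is_six = [r for r in rolls if r[0] == 6]

# Joint probability: six on both throws, evaluated before any throw is made.
p_both = Fraction(len(both_sixes), len(rolls))

# Conditional probability: six on the second throw, given the first was a six.
p_second_given_first = Fraction(len(both_sixes), len(first_is_six))

print(p_both, float(p_both))                              # 1/36, roughly 2.8%
print(p_second_given_first, float(p_second_given_first))  # 1/6, roughly 16.7%
```

Nothing about the die changes between the two cases; the second figure simply conditions on the first six having already happened.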
prmph about 8 years ago
Here is another seemingly basic question that leads down the rabbit hole: What does it mean to say two things are the same?
mrcactu5 about 8 years ago
my favorite data science paradox is the "curse of dimensionality"

https://en.wikipedia.org/wiki/Curse_of_dimensionality