
P values are not as reliable as many scientists assume (2014)

83 points by e0m, over 9 years ago

9 comments

kazinator, over 9 years ago

Previous post with discussion, 563 days ago: https://news.ycombinator.com/item?id=7225739

PDF via same nature.com: https://news.ycombinator.com/item?id=8404620

Related, dupes of each other:

https://news.ycombinator.com/item?id=9463806

https://news.ycombinator.com/item?id=9486059

Related:

https://news.ycombinator.com/item?id=9119228
BorisVSchmid, over 9 years ago

From my experience, scientists (at least in biology, where, as in sociology, you may have a lot of noise to deal with) have an internal intuition that a single paper with a significant result does not mean we have found the truth. The recent study which reported a reproducibility rate in sociology of about 36% strikes me as pretty accurate.

I think the scientific system can work with that. It means that if you build follow-up experiments on a single paper, there is a good chance the experiment fails. In some ways, the scientific system of publishing is self-correcting in this regard, because you can then cast doubt on the previous paper, which is easier to publish than if you only have a fresh negative result (p-value > threshold).
haddr, over 9 years ago

It is not that p-values are now bad by definition; it's that they are often wrongly interpreted. Putting too much confidence in p-values alone can lead to wrong conclusions, and this is what some meta-analyses discover. Many scientists try hard just to reach the "golden" <0.05 in order to claim a discovery and publish it. This is why so many papers mysteriously cluster around 0.05...
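One mechanism behind that clustering near 0.05 is "optional stopping": re-running the test as data comes in and stopping as soon as it crosses the threshold. A hypothetical simulation (not from the article; it uses a z-approximation for the t-test, which is fine for illustration) shows how badly this inflates the nominal 5% false-positive rate:

```python
import math
import random

random.seed(0)

def p_value(xs):
    """Two-sided p-value from a z-test that the mean of xs is zero
    (normal approximation; adequate for this illustration)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    z = mean / math.sqrt(var / n)
    return math.erfc(abs(z) / math.sqrt(2))

def peeking_experiment(max_n=100, peek_every=10, alpha=0.05):
    """Null data (no real effect), but the analyst re-tests after every
    batch and stops as soon as p < alpha."""
    data = [random.gauss(0, 1) for _ in range(max_n)]
    for n in range(peek_every, max_n + 1, peek_every):
        if p_value(data[:n]) < alpha:
            return True   # a "discovery" gets written up
    return False

rate = sum(peeking_experiment() for _ in range(2000)) / 2000
print(f"false-positive rate with peeking: {rate:.3f}")  # far above the nominal 0.05
```

Each individual look has a 5% false-positive rate, but taking the first "hit" across ten correlated looks pushes the overall rate several times higher, and the resulting p-values land just under the threshold.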
danharaj, over 9 years ago

Scientists have to do their work in a system that incentivizes bad science. How many people actually get to do their work in an environment that isn't hostile to them?
rndn, over 9 years ago

Isn't a main problem with p-values that you don't know whether significance (a low p-value) is the result of a big effect and a small sample, or a big sample and a small effect? This is why you also need a measure of effect size, for example the distance between the two measurements in terms of standard deviations.
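The point about significance vs. effect size can be made concrete with a toy one-sample z-test (my own sketch, not from the article; the specific numbers are arbitrary). A standardized effect of d gives z = d·√n, so a negligible effect becomes "highly significant" once n is large enough, while a large effect with few observations may not reach 0.05 at all:

```python
import math

def z_test_p(d, n):
    """Two-sided p-value for a one-sample z-test of a standardized
    mean difference d (i.e. Cohen's d) observed with sample size n."""
    z = d * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))

# Tiny effect, huge sample: "significant" but practically negligible.
p_small_effect = z_test_p(d=0.02, n=50_000)   # z ≈ 4.47

# Large effect, small sample: does not even reach 0.05.
p_big_effect = z_test_p(d=0.8, n=4)           # z ≈ 1.6

print(p_small_effect, p_big_effect)
```

This is why reporting the effect size (d) alongside the p-value matters: the p-value alone cannot distinguish the two situations.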
marvy, over 9 years ago

I'm probably commenting too late to get my question answered, but here goes: the article has a pretty picture showing how likely your p-values are to mislead you depending on how likely the null hypothesis is. For instance, they say that if you think the null hypothesis has a 50% probability of being right and you get p=5%, then there's still a 29% chance the null hypothesis is true. But according to my calculations, the right number should be 1/21 = 4.8%. What am I missing here? Or are they wrong? My calculations are below:

Curious George has 200 fascinating phenomena he wishes to investigate. In reality, 100 of those are real, and the other hundred are mere coincidences. The experiments for the 100 real phenomena all show that "yes, this is for real". (I'm assuming no false negatives.) Most of the 100 experiments that test bogus phenomena show that "this is bogus", but 5 of them achieve a significance of p=5%, as expected. George then runs off to tell the Man in the Yellow Hat about his 105 amazing discoveries. If Yellow Hat Man knows that half of the phenomena that capture George's attention are bogus, he knows that 5/105 = 1/21 = 4.8% of George's discoveries are likely bogus, even though he doesn't know which ones.
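One plausible reconstruction of the discrepancy (my reading, not stated in the thread): the Curious George count conditions on p ≤ 0.05 with perfect power, whereas the article's figure appears to treat the observed p as *equal to* 0.05 and apply a minimum-Bayes-factor bound such as Berger and Sellke's BF ≥ −e·p·ln(p) before updating the 50-50 prior. The two calculations answer different questions:

```python
import math

# The 5/105 count: 50-50 prior, perfect power, everything with p <= 0.05
# counted as a "discovery".
true_pos = 100
false_pos = 100 * 0.05
fdr = false_pos / (true_pos + false_pos)     # 5/105 ≈ 0.048

# The bound-based calculation: observed p taken to be exactly 0.05,
# Berger–Sellke minimum Bayes factor in favour of the null,
# applied to even (1:1) prior odds.
p = 0.05
bf = -math.e * p * math.log(p)               # ≈ 0.41
post_null = bf / (1 + bf)                    # ≈ 0.29

print(f"count-based: {fdr:.3f}, bound-based: {post_null:.3f}")
```

Under this reading, both numbers are internally consistent: 4.8% is a false-discovery rate over everything crossing the threshold, while ~29% is a (worst-case-for-the-null) posterior probability given a p-value right at 0.05.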
RA_Fisher, over 9 years ago

Great article. I'm not sure that replication itself will solve the problem, since the Type 1 error rate requires asymptotics. We'd have to run many replications and then show convergence. That'll be broadly cost-prohibitive for all but the most important conclusions. Lower thresholds probably won't do it either. Right now, the only solutions I see are:

a) Bayesian methods

b) Fisher's single-hypothesis method

c) Tukey's Exploratory Data Analysis method

d) All of the above.
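For option (a), the simplest illustration of what a Bayesian analysis reports instead of a p-value is a conjugate Beta-Binomial update (a generic textbook sketch, not anything specific from the article or comment; the 60/100 data are made up):

```python
import math

# Uniform Beta(1, 1) prior on a success probability theta;
# observe 60 successes in 100 trials.
a0, b0 = 1, 1
successes, trials = 60, 100
a, b = a0 + successes, b0 + (trials - successes)   # posterior is Beta(61, 41)

post_mean = a / (a + b)                            # ≈ 0.598

# Approximate P(theta > 0.5) with a normal approximation to the Beta
# posterior (good here since a and b are both large).
var = a * b / ((a + b) ** 2 * (a + b + 1))
sd = math.sqrt(var)
p_gt_half = 0.5 * math.erfc((0.5 - post_mean) / (sd * math.sqrt(2)))

print(f"posterior mean: {post_mean:.3f}, P(theta > 0.5) ≈ {p_gt_half:.3f}")
```

The output is a full posterior distribution over the parameter, so statements like "theta probably exceeds 0.5" are direct probability claims rather than tail areas under a null.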
eruditely, over 9 years ago

Relevant, from Deborah Mayo.

http://errorstatistics.com/2015/03/16/stephen-senn-the-pathetic-p-value-guest-post/
themodelplumber, over 9 years ago

"Essentially, all models are wrong, but some are useful." --George E. P. Box