This defense of P-values is akin to how a certain type of computer programmer tends to defend bad design decisions, like "if you accidentally typed `rm -rf /` and it wiped out your entire system, then you're an idiot, learn to use your tools".<p>Thing is: even statistics professors often don't manage to interpret P-values and confidence intervals the way they should. (See e.g. "Students’ misconceptions of statistical inference: A review of the empirical evidence from research on statistics education") You're almost automatically forced into a sort of double talk where you write about P(D|H0) but deep down you'd really like your readers to think it's about 1-P(HA|D).<p>When a tool can be used correctly (if handled very delicately and by a true expert) but practically encourages you to abuse it, at a certain point "it's not the tool, it's you" stops being a convincing excuse, and you've just got to say: fuck it, we need better tools.
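The gap between P(D|H0) and P(HA|D) that this comment gestures at can be shown with a toy simulation (pure stdlib; the 10% base rate, effect size, and sample size are all made-up numbers for illustration): even when every single test is run correctly at the 5% level, the share of "significant" results that are actually null can sit far above 5% when true effects are rare.

```python
import math
import random

random.seed(0)

def z_test_p(sample_mean, n, sigma=1.0):
    """Two-sided p-value for H0: mu = 0, known sigma (z-test)."""
    z = sample_mean * math.sqrt(n) / sigma
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

n, experiments, prior_alt = 25, 10000, 0.10   # only 10% of hypotheses are real
false_pos = true_pos = 0
for _ in range(experiments):
    is_alt = random.random() < prior_alt
    mu = 0.5 if is_alt else 0.0               # hypothetical effect size
    xbar = sum(random.gauss(mu, 1) for _ in range(n)) / n
    if z_test_p(xbar, n) < 0.05:
        if is_alt:
            true_pos += 1
        else:
            false_pos += 1

# Fraction of "significant" findings where H0 was in fact true --
# this is about P(H0 | p < 0.05), and it is nowhere near 0.05.
fdp = false_pos / (false_pos + true_pos)
print(f"share of significant results that are null: {fdp:.2f}")
```

The point of the sketch: the p-value threshold (0.05) and the probability that a significant finding is a false alarm are different quantities, and the second depends on the base rate of true effects, which the p-value never sees.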
This is about more than mere p-values. This is a fundamental problem with tools themselves. Using a programming language, hammer, or stats tool incorrectly courts disaster.<p>But some tools are better than others at teaching people how to use them correctly. Math tools like the p-value are really hard, because most scientists are not math people - they want a quick solution, whereas statistics requires deep understanding. P-values are just so appealing as a magic number. An apt analogy is the late-90's PHP crowd, who just wanted a solution now rather than designing a robust application for the future. The question then becomes:<p>1) How can we build tools that lead people toward the correct decision, or at least away from the unambiguously wrong ones?<p>2) If we can't do 1, we need to spend more time educating users, and figuring out why they make the wrong decisions with a given tool.<p>Unfortunately, I can't say these are easy, or even solvable, problems. We just do the best we can.
The most important thing your P-value "doesn't do" is tell you whether your underlying model structure is correct.<p>In particular, the actual value of your P-value depends on the assumed underlying distribution, which rather begs the question.<p>This is the biggest issue with the P-value in the context of social science or commerce.<p>For physical systems, where God[1] actually makes sure that your distributions, errors, etc. are normal, it's pretty awesome.<p>[1] <a href="http://en.wikipedia.org/wiki/Central_limit_theorem" rel="nofollow">http://en.wikipedia.org/wiki/Central_limit_theorem</a>
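The dependence on distributional assumptions is easy to demonstrate (a stdlib-only sketch; the critical value 2.262 is t at the 97.5th percentile with 9 degrees of freedom, and the skewed distribution is a shifted lognormal chosen purely for illustration): a nominal-5% one-sample t-test holds its false-positive rate on normal data, but drifts away from 5% on small samples from a heavily skewed distribution, even though the null hypothesis is true in both cases.

```python
import math
import random
import statistics

random.seed(1)

def t_stat(xs):
    """One-sample t statistic for H0: mean = 0."""
    n = len(xs)
    return statistics.mean(xs) * math.sqrt(n) / statistics.stdev(xs)

def rejection_rate(draw, n=10, reps=20000, t_crit=2.262):
    """Fraction of null samples a nominal-5% two-sided t-test rejects."""
    hits = 0
    for _ in range(reps):
        if abs(t_stat([draw() for _ in range(n)])) > t_crit:
            hits += 1
    return hits / reps

# Normal data: the test's assumptions hold, size is ~5% as advertised.
normal_rate = rejection_rate(lambda: random.gauss(0, 1))

# Skewed data: lognormal shifted to have true mean 0, so H0 is still true,
# but the normality assumption behind the t distribution is violated.
skewed_rate = rejection_rate(lambda: math.exp(random.gauss(0, 1)) - math.exp(0.5))

print(f"normal data: {normal_rate:.3f}, skewed data: {skewed_rate:.3f}")
```

Same test, same nominal 5%, different actual error rates: the number the p-value reports is only as good as the distributional story behind it.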
I'm going to go for the embarrassing question: I don't understand p-values or how to design a study. Any pointers gratefully received.<p>If I understand correctly, let's say I am concerned that Earth's gravity is increasing underneath Kent school buildings, thus stunting kids' growth.<p>I decide to test this by running a statistical survey with a p-value in it.<p>We shall sample the heights of 10,000 randomly chosen children from 10 counties, one of which is Kent (so 1,000 kids from each county).<p>My null hypothesis is that there is no gravitational anomaly, and so the height distributions should be equal, or the mean heights should be equal, or ...<p>So I get a bit lost here.<p>What is 95% significance ...
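One way the study above could go (a stdlib-only sketch, not authoritative advice; the mean height of 150 cm and SD of 8 cm are invented numbers, and both groups are drawn from the same distribution so the null hypothesis is true by construction): compare Kent's mean height against the other counties with a two-sample test. "Significant at the 5% level" then means p < 0.05, i.e. a difference this large would show up in fewer than 5% of surveys if there were really no height difference at all.

```python
import math
import random
import statistics

random.seed(2)

# Hypothetical data: same height distribution everywhere (no gravity anomaly).
kent   = [random.gauss(150, 8) for _ in range(1000)]
others = [random.gauss(150, 8) for _ in range(9000)]

def two_sample_p(a, b):
    """Two-sided p-value for H0: equal means (normal approximation,
    which is reasonable at sample sizes this large)."""
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

p = two_sample_p(kent, others)
print(f"p = {p:.3f}")
```

Note what the p-value does and doesn't say here: a small p would mean the observed Kent difference is unlikely *if the null is true*; it would not, by itself, tell you the probability that the gravitational-anomaly hypothesis is true.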