Hey everyone, I'm the author of this guide. It's come full circle -- I posted it a week ago in a "what are you working on?" Ask HN post, someone posted it to Metafilter and reddit, and it made its way to Boing Boing and Daily Kos before coming back here.

I'm currently working on expanding the guide to book length, and considering options for publication (self-publishing, commercial publishers, etc.). It seems like a broad spectrum of people find it useful. I'd appreciate any suggestions from the HN crowd.

(A few folks have already emailed me with tips and suggestions. Thanks!)

(Also, I'm sure glad I added that email signup a couple of weeks ago.)
If I were a billionaire, I would set up some sort of screening lab for scientific/academic/research papers. There would be a statistics division for evaluating how statistical methods are applied; a replication division for checking that experiments actually replicate; and a corruption division for investigating suspicious influences on the research. It would be tempting to then generate some sort of credibility rating for each institution based on the papers it publishes, but that would probably invite too much trouble, so best just to publish the results and leave it at that.

Arguably this would be a greater benefit to humanity than all the millions poured charitably into cancer research, etc.
As a graduate student in the life sciences, I was required to take a course on the ethical conduct of science. This gave me the tools to find ethical solutions to complex issues like advisor relations, plagiarism, authorship, etc. We were also taught to keep good notes and use ethical data management practices - don't throw out data, use the proper tests, etc. Unfortunately, we weren't really taught how to do statistics "the right way." This seems just as important to the ethical conduct of science. Ignorance is no excuse for using bad statistical practices - it's still unethical. By the way, this is at (what is considered to be) one of the best academic institutions in the world.
One of the many challenges in science is that there is no publication outlet for experiments that just didn't pan out. If you do an experiment and don't find statistical significance, there aren't many journals that want to publish your work. That alone contributes to a bias toward publishing results that may simply have arisen by chance. If 20 independent researchers test the same hypothesis and there is no real effect, on average one of them will still find statistical significance at the usual p < 0.05 threshold. That one researcher gets published. The other 19 just move on.
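To make that arithmetic concrete, here is a minimal Python sketch (my own illustration, not from the guide): simulate 20 labs each testing a true null effect with a two-sample t-test and count how many cross p < 0.05 by chance alone.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_labs, n_per_group, alpha = 20, 30, 0.05

significant = 0
for _ in range(n_labs):
    a = rng.normal(0, 1, n_per_group)   # "treatment" group: no real effect
    b = rng.normal(0, 1, n_per_group)   # "control" group
    _, p = stats.ttest_ind(a, b)
    significant += p < alpha

print(f"{significant} of {n_labs} null studies reached p < {alpha}")
```

Run it a few times with different seeds; the count hovers around one on average, which is exactly the one study that ends up in a journal.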
Norvig's "Warning Signs in Experimental Design and Interpretation" is also worth reading and covers the higher level problem of bad research and results. Including mentioning bad statistics.<p><a href="http://norvig.com/experiment-design.html" rel="nofollow">http://norvig.com/experiment-design.html</a>
Quite a few years ago, while sitting through another braindead thesis presentation (psychology), I devised an ambitious method for achieving significance:

If you are interested in the difference of a metric-scaled quantity between two groups, do the following:

1.) Add 4-5 plausible control variables that you do not document in advance (questionnaire, sex, age...).

2.) Write an R script that helps you do the following:
Whenever you have tested a person, add that person's result to your dataset and run a t-test, a U-test, and an ordinal logistic regression over some possible bucket combinations.

3.) Do this over all permutations of the control variables.
Have the script ring a loud bell when significance is achieved, so that data collection stops immediately.
An added bonus is that you will likely get a significant result with a small n, which enables you to do a reverse power analysis.

Now you can report that your theoretical research implied a strong effect size, so you chose an appropriately small n which, as expected, yielded a significant result ;)
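As a rough check on how badly this procedure misbehaves, here is a hedged Python sketch (my own simulation, not the commenter's actual R script; the ordinal-regression sweep over bucket combinations is omitted for brevity): collect pure-noise data one pair of subjects at a time, run a t-test and a Mann-Whitney U-test after every addition, and stop as soon as either test rings the bell.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments, max_n, alpha = 1000, 60, 0.05

false_positives = 0
for _ in range(n_experiments):
    a, b = [], []
    for i in range(max_n):
        a.append(rng.normal())          # group 1: pure noise
        b.append(rng.normal())          # group 2: pure noise
        if i < 5:                       # wait for a handful of subjects
            continue
        _, p_t = stats.ttest_ind(a, b)
        _, p_u = stats.mannwhitneyu(a, b, alternative="two-sided")
        if min(p_t, p_u) < alpha:       # the bell rings: stop collecting
            false_positives += 1
            break

print(f"Observed false positive rate: {false_positives / n_experiments:.0%}")
```

Even with only two tests and no control-variable permutations, peeking after every subject pushes the false positive rate well above the nominal 5%.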
One thing that constantly saddens me about statistics is that a large amount of energy is expended using it almost correctly to "prove" something that was already the gut feeling. Even unbiased practitioners can be led astray [1], but standards on how not to intentionally lie with statistics are very useful.

[1] http://euri.ca/2012/youre-probably-polluting-your-statistics-more-than-you-think/index.html
I see the author of this interesting site is active in this thread. You may already know about this, but for onlookers I will mention that Uri Simonsohn and his colleagues

http://opim.wharton.upenn.edu/~uws/

have published a lot of interesting papers advising psychology researchers how to avoid statistical errors (and also how to detect statistical errors, up to and including fraud, by using statistical techniques on published data).
One way to do statistics less wrong is to move from statistical testing to statistical modelling. This is what we are trying to support with BayesHive (https://bayeshive.com). Other ways of doing this include JAGS (http://mcmc-jags.sourceforge.net/) and Stan (http://mc-stan.org/).

The advantage of statistical modelling is that it makes your assumptions very explicit, and there is more of an emphasis on effect size estimation and less on reaching arbitrary significance thresholds.
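As a flavour of that modelling mindset, here is a hand-rolled Python sketch (not BayesHive/JAGS/Stan syntax, and it assumes the noise standard deviation is known, purely for simplicity): write down a model for the group difference and report a posterior over the effect size rather than a yes/no significance verdict.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
control = rng.normal(0.0, 1.0, 40)
treatment = rng.normal(0.3, 1.0, 40)    # a true effect of 0.3 for the demo

sigma = 1.0                             # assumed-known noise sd (simplification)
deltas = np.linspace(-2, 2, 2001)       # grid of candidate effect sizes
d = deltas[1] - deltas[0]

# Flat prior; likelihood of the observed mean difference under each delta.
observed_diff = treatment.mean() - control.mean()
se = sigma * np.sqrt(1 / len(treatment) + 1 / len(control))
posterior = stats.norm.pdf(observed_diff, loc=deltas, scale=se)
posterior /= posterior.sum() * d        # normalise to a proper density

cdf = np.cumsum(posterior) * d
lo = deltas[np.searchsorted(cdf, 0.025)]
hi = deltas[np.searchsorted(cdf, 0.975)]
mean_effect = (deltas * posterior).sum() * d
print(f"Posterior mean effect: {mean_effect:.2f}")
print(f"95% credible interval: [{lo:.2f}, {hi:.2f}]")
```

The grid is overkill for a model this simple, but the point is the output: an estimate and an interval for the effect, with every assumption written down where a reader can argue with it.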
I like that he references Huff's "How to Lie with Statistics" in the first sentence of the intro. That was the book that came to mind when I saw the subject. It also reminds me of the Twain quote: "There are three kinds of lies: lies, damned lies, and statistics."

But despite all this, statistics done well are very powerful.
What is puzzling to me is that many of the statistical errors showing up in the scientific literature are well understood. The problem is not just that junk science is being generated but that the current tools and culture do not readily name and shame these awful studies. Just as we have basic standards in other fields, such as GAAP in finance, why can't we have an agreed-upon standard for the collection and analysis of scientific data?
If you want to see truly egregious uses of statistics, take a look at any paper on diet or nutrition. Be prepared to be angry.

At this point, if someone published a study stating that we need to eat in order not to die, I'd be skeptical of it.
The greatest problem in statistical analysis is throwing out observations that do not fit the bill. Every analysis should be thoroughly documented, including postmortems on any data that was excluded.
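A hedged Python sketch of why that matters (my own illustration): on data with no real effect, quietly discarding the handful of observations that most oppose the hoped-for result is often enough to push a non-significant comparison under the 0.05 threshold.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_datasets, flipped = 1000, 0
for _ in range(n_datasets):
    a = rng.normal(0, 1, 30)            # "treatment": no true effect
    b = rng.normal(0, 1, 30)            # "control"
    _, p_before = stats.ttest_ind(a, b)
    # Quietly drop the 3 lowest treatment and 3 highest control values.
    a_trim = np.sort(a)[3:]
    b_trim = np.sort(b)[:-3]
    _, p_after = stats.ttest_ind(a_trim, b_trim)
    if p_before >= 0.05 and p_after < 0.05:
        flipped += 1

print(f"{flipped / n_datasets:.0%} of null datasets flipped to 'significant'")
```

The defensible version of trimming is to state the exclusion rule up front and report results both with and without the excluded points.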
Whenever there is discussion about the role of statistics in science (sometimes even going as far as crossing into how science is statistics), I always remember this:

http://en.wikipedia.org/wiki/Oil_drop_experiment#Fraud_allegations
That was an excellent read. Thank you. I'll admit I'm often reluctant to read too much into the data I deal with daily (web analytics), as I'm unsure of how to measure its significance accurately. I'm going to dive in and learn more about this.
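If it helps, here is a hedged starting point for that kind of data (the visitor and conversion counts below are made up): a standard two-proportion z-test comparing the conversion rates of two page variants.

```python
import math
from scipy import stats

visitors_a, conversions_a = 5400, 180   # hypothetical variant A
visitors_b, conversions_b = 5250, 211   # hypothetical variant B

p_a = conversions_a / visitors_a
p_b = conversions_b / visitors_b
pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
z = (p_b - p_a) / se
p_value = 2 * stats.norm.sf(abs(z))     # two-sided

print(f"A: {p_a:.2%}  B: {p_b:.2%}  z = {z:.2f}  p = {p_value:.3f}")
```

For very small counts a chi-squared or Fisher's exact test is the safer choice, but the idea is the same: quantify how surprising the observed difference would be if the two variants actually converted at the same rate.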