This is excellent. I have a few questions:<p><i>You need to provide the background of your study, the types of experiments undertaken, the materials and methods, and initial results of your study.</i><p>Do the technicians reproducing the results get to see the initial results? It seems like it might be more accurate if they didn't. Lots of parameters can be fudged and adjusted, à la Millikan's oil-drop experiment, when the results don't quite match. I imagine this might be exacerbated by the necessity of researcher-validator communication.<p>How are conflicts resolved? If my results are not validated, someone made a mistake: me or the validator. If both parties stand by their mutually incompatible results, where does it go from there? I can imagine a lot of researchers I know feeling annoyed that someone whose expertise they cannot verify (due to anonymity) won't "do my experiment correctly".<p>I imagine that in time there might be specific requirements or explicit funding allocations for such reproduction on grant applications, which would really allow it to take off. As it stands, I imagine a lot of PIs would just ask, "hmm, do I spend money that might put my already high-impact paper at risk, or do I keep the money and avoid ever being shown to be wrong?"<p>Still, this is a great first step toward facilitating a central tenet of the scientific method. Congratulations.
It's very expensive (prohibitively so) to reproduce any substantial study. It also feels like there's a lot of potential downside and very little reward, since there's a presumption of correctness in published papers today. Further, wet-lab protocols are subjective enough that I'd imagine most labs would dismiss a negative result as having been performed incorrectly.<p>Where does the money for this come from? Are you expecting that labs will write this cost into their grants? Have you seen interest from grant agencies in actually paying for this?
There are very few journals that do this -- I can't imagine this ever taking off.<p>Reproducing experiments seems like a costly (and mostly thankless) effort that few PIs would ever take up.<p>The only journal I know of that completely reproduces results is Organic Syntheses (<a href="http://www.orgsyn.org/" rel="nofollow">http://www.orgsyn.org/</a>), which reproduces every reaction before publication and has a Procedure Checklist for authors: <a href="http://www.orgsyn.org/AuthorChecklist.pdf" rel="nofollow">http://www.orgsyn.org/AuthorChecklist.pdf</a>
The part I like about the execution is that you have CROs and Core Facilities, rather than the scientists themselves, validate the results. Apparently Millikan's measurement of the electron's charge was off (smaller than the actual value), but the researchers who checked it, worried perhaps about their academic reputation (speculating here), slowly adjusted the number upwards over several years until it asymptotically approached the truth. Having a third party that is less affected by academic politics might be a good thing.<p>From Feynman's Caltech commencement speech [1] on this:<p><i>We have learned a lot from experience about how to handle some of the ways we fool ourselves. One example: Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It's a little bit off, because he had the incorrect value for the viscosity of air. It's interesting to look at the history of measurements of the charge of the electron, after Millikan. If you plot them as a function of time, you find that one is a little bigger than Millikan's, and the next one's a little bit bigger than that, and the next one's a little bit bigger than that, until finally they settle down to a number which is higher.</i><p><i>Why didn't they discover that the new number was higher right away? It's a thing that scientists are ashamed of--this history--because it's apparent that people did things like this: When they got a number that was too high above Millikan's, they thought something must be wrong--and they would look for and find a reason why something might be wrong. When they got a number closer to Millikan's value they didn't look so hard. And so they eliminated the numbers that were too far off, and did other things like that. We've learned those tricks nowadays, and now we don't have that kind of a disease.</i><p>[1] <a href="http://www.lhup.edu/~dsimanek/cargocul.htm" rel="nofollow">http://www.lhup.edu/~dsimanek/cargocul.htm</a>
I was very surprised when I first learned that results could be published <i>without</i> first having been reproduced. I thought that, at the least, they should be published with a separate label.<p>This is a worthy initiative.
I don't see how this approach can really solve the underlying problems that it references from<p><a href="http://www.nytimes.com/2012/04/17/science/rise-in-scientific-journal-retractions-prompts-calls-for-reform.html?_r=1" rel="nofollow">http://www.nytimes.com/2012/04/17/science/rise-in-scientific...</a> and other articles.<p>"Each year, every laboratory produces a new crop of Ph.D.’s, who must compete for a small number of jobs, and the competition is getting fiercer. In 1973, more than half of biologists had a tenure-track job within six years of getting a Ph.D. By 2006 the figure was down to 15 percent."<p>I would claim that science requires some basic integrity in its practitioners, and if an institution treats its members as so many throw-away resources, it is hard to expect those members to move ahead with great idealism. The model of every desperate competitor watching every other competitor seems to be the replacement for the model of science as a high ideal. I don't see it working out well.
The NSF CISE guidelines (which apply to much of the funded CS research that folks on HN care about) currently require you to retain all data needed to reproduce your experiments and to make them available upon any reasonable request:
<a href="http://www.nsf.gov/cise/cise_dmp.jsp" rel="nofollow">http://www.nsf.gov/cise/cise_dmp.jsp</a><p>My advisor is on appointment there right now, and they're working on guidelines that require you to provide full reproducibility of your program results (modulo system differences). At least for software, things are looking up. Of course, reproducibility is both infinitely easier (I can do it on any system!) and harder (what do you mean the kernel patchlevel or CPU/GPU sub-model-number matters?).
Plasmyd does this via crowdsourcing: it lets users comment on papers, so researchers can point out anomalies or report that they were unable to reproduce the results. It also gives the author a chance to explain their work.
This is great! Too often, scientific papers are published without ever being checked by anyone else, even though reproducibility is a key part of the scientific method.
The title irks me. Instead of "prove it", how about "let us reproduce your results" or "show us the data"? Proof in this context is a mistaken concept.