>> [John Crabbe] performed a series of experiments on mouse behavior in three different science labs: in Albany, New York; Edmonton, Alberta; and Portland, Oregon. Before he conducted the experiments, he tried to standardize every variable he could think of. The same strains of mice were used in each lab, shipped on the same day from the same supplier. The animals were raised in the same kind of enclosure, with the same brand of sawdust bedding. They had been exposed to the same amount of incandescent light, were living with the same number of littermates, and were fed the exact same type of chow pellets. When the mice were handled, it was with the same kind of surgical glove, and when they were tested it was on the same equipment, at the same time in the morning.<p>>> The premise of this test of replicability, of course, is that each of the labs should have generated the same pattern of results. “If any set of experiments should have passed the test, it should have been ours,” Crabbe says. “But that’s not the way it turned out.” In one experiment, Crabbe injected a particular strain of mouse with cocaine. In Portland the mice given the drug moved, on average, six hundred centimetres more than they normally did; in Albany they moved seven hundred and one additional centimetres. But in the Edmonton lab they moved more than five thousand additional centimetres. Similar deviations were observed in a test of anxiety. Furthermore, these inconsistencies didn’t follow any detectable pattern. In Portland one strain of mouse proved most anxious, while in Albany another strain won that distinction.<p>>> The disturbing implication of the Crabbe study is that a lot of extraordinary scientific data are nothing but noise.<p>This wasn't established when the post was written, but mice are sensitive and can align themselves to magnetic fields so if the output is movement the result is not thaaaat surprising. 
There are a lot of things that can affect mouse behavior, possibly including the pheromones/smell of the experimenter. I am guessing that behavior patterns such as anxiety behavior can be socially reinforced as well, which could affect results. I could come up with another dozen factors if I had to. Were mice tested one at a time? How many mice were tested? Time of day? Gut microbiota? If the effect isn't reproducible without the sun and moon lining up, then it could just be a 'weak' effect that can be masked or enhanced by other factors. That doesn't mean it's not real, but that the underlying mechanism is unclear. Their experiment reminds me of the Rat Park experiment, which apparently did not always reproduce, but that doesn't mean the effect isn't real in some conditions: <a href="https://en.wikipedia.org/wiki/Rat_Park" rel="nofollow">https://en.wikipedia.org/wiki/Rat_Park</a>.<p>I think the idea of publishing negative results is a great one. There are already "journals of negative results". However, for each negative result you could also make the case that some small but important experimental detail is the reason why the result is negative. So negative results have to be repeatable too. Otherwise, no one would have time to read all of the negative results that are being generated. And it would probably be a bad idea to not try an experiment just because someone else tried it before and got a negative result once.<p>Either way, researchers aren't incentivized to do that. You don't get more points on your grant submission for publishing negative results, unless you also found some neat positive results in the process.