Machines are better referees than humans but we’ll be sued if we use them

300 points by inglesp over 11 years ago

15 comments

Blahah over 11 years ago

Peter Murray Rust (author of this blog post) is a really great man. He's been a tireless advocate for dismantling privilege and setting knowledge free for several decades. I'm proud to say he's becoming a sort of mentor to me. Last week I spent a couple of days with his research group and saw this software in action - it's really impressive.

They can take an ancient paper with very low quality diagrams of complex chemical structures, parse the image into an open markup language and reconstruct the chemical formula and the correct image. Chemical symbols are just one of many plugins for their core software, which interprets unstructured, information-rich data like raster diagrams. They also have plugins for phylogenetic trees, plots, species names, gene names and reagents. You can develop plugins easily for whatever you want, and they're recruiting open source contributors (see https://solvers.io/projects/QADhJNcCkcKXfiCQ6, https://solvers.io/projects/4K3cvLEoHQqhhzBan).

As a side effect of how their software works, it can detect tiny suggestive imperfections in images that reveal scientific fraud. I was shown a demo where a trace from a mass spec (like this http://en.wikipedia.org/wiki/File:ObwiedniaPeptydu.gif) was analysed. As well as reading the data from the plot, it revealed a peak that had been covered up with a square - the author had deliberately obscured a peak in their data that was inconvenient. Scientific fraud. It's terrifying that they find this in most chemistry papers they analyse.

Peter's group can analyse thousands or hundreds of thousands of papers an hour, automatically detecting errors and fraud and simultaneously making the data, which are facts and therefore not copyrightable, free.

This is one of the best things that has happened to science in many years, except that publishers deliberately prevent it. Their work also made me realise it would be possible to continue Aaron Swartz's work on a much bigger scale (http://blahah.net/2014/02/11/knowledge-sets-us-free/).

Academic publishers who are suppressing this are literally the enemies of humanity.

yoha over 11 years ago

Google cache: https://webcache.googleusercontent.com/search?q=cache:b2trH5OA3P8J:http://blogs.ch.cam.ac.uk/pmr/2014/02/18/machines-are-better-referees-than-humans-but-well-be-sued-if-we-use-them/

atmosx about 11 years ago

When I asked my journalist friend why referees in football (soccer) games don't use high-tech, he thought about it for 5 minutes and then told me: "If they use technology it will be really hard to set up games. If you take from a league the ability to set up games and promote specific teams/individuals, then I don't know how the game will be shaped".

Of course this is universal. It's not like everything is a set-up, but it happens more often than most would likely imagine, especially since betting came into play.

So there you have it.

JackFr about 11 years ago

This should be supported (both financially and ideologically) by the National Library of Medicine at the National Institutes of Health. The NIH doles out about $30 billion in research grants every year. If they could spend a tiny fraction of a percent to dramatically improve the quality of the rest and make such automatic checking a standard practice, that would be tremendous bang for the buck.

Oh yeah -- and they're big enough to fight academic publishers.

tomp over 11 years ago

Can they release the software to the world? Maybe, if we all make an effort to analyse whatever papers we can access, we will together make enough noise that it will be impossible to ignore, and also impossible to silence (cf. The Pirate Bay). This could be one of the most important advancements of science in the past few years.

Shivetya over 11 years ago

At first I thought the article would be about sports, which in itself would make for an interesting discussion about using machines to judge rules adherence, not that I would want to take that human element out of sports.

However, this is more along the lines of validating what is published. Of any group, you would hope that scientists and their like would jump on technology like this so as to provide the most accurate representation of their work possible. The same goes for publishers: why wouldn't they want to brag that they use the most advanced interrogation methods for the papers they publish?

I guess they are people too, hypersensitive that fault will be found.

_greim_ about 11 years ago

So as a non-scientist, let me see if I understand.

There are lots of uncaught errors floating around out there in scientific papers, and many of them can now be found with this software. But exposing the errors so that they can be corrected is tricky because: A) you have to have legal access to a paper in order to scan it, and B) even if you do have access, under the current rules only the publishers have the right to expose the errors, and they're not interested because they want to avoid the embarrassment.

Am I understanding it?

Udo over 11 years ago

I see a very exciting possibility for the future of academic papers in certain disciplines where we could have a machine validation step performed automatically, not only on submission but as a tool for the author to check their work. Like a git commit hook that forces a test suite to run. Of course, this would require some formalism to tag data, diagrams, and formulae but it's probably in our best interest in the long run to make the body of our research more machine-accessible anyway.
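
To make the commit-hook analogy concrete, here is a minimal sketch of one check such a validation step might run: recomputing a molecular mass from a tagged formula and comparing it with the value reported in the paper. This is not the software discussed in the post; the function names, the tiny element table, and the tolerance are all assumptions made for illustration, and the parser only handles flat formulas without brackets.

```python
import re

# Rounded standard atomic masses in g/mol. Only a handful of elements are
# listed here for illustration; a real checker would need the full table.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")
_TERM = re.compile(r"([A-Z][a-z]?)(\d*)")


def formula_mass(formula: str) -> float:
    """Return the molecular mass of a flat formula such as 'C6H12O6'."""
    if not _FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula syntax: {formula!r}")
    total = 0.0
    for element, count in _TERM.findall(formula):
        if element not in ATOMIC_MASS:
            raise ValueError(f"unknown element: {element!r}")
        # A missing count means a single atom, as in the 'O' of 'H2O'.
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


def check_reported_mass(formula: str, reported: float, tolerance: float = 0.01):
    """Return (ok, computed): ok is True when reported matches within tolerance."""
    computed = formula_mass(formula)
    return abs(computed - reported) <= tolerance, computed


if __name__ == "__main__":
    # Hypothetical tagged values, as an author-side tool might extract them.
    ok, computed = check_reported_mass("C6H12O6", 180.16)
    print("consistent" if ok else "MISMATCH", round(computed, 3))
```

A hook built this way would reject a submission whenever a check returns a mismatch, the same way a failing test blocks a commit. The hard part is still the tagging formalism, since the check can only run on values that have been marked up in the first place.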

sov about 11 years ago

For those curious, the 5-membered ring in cyclopiazonic acid should have an NH group rather than a CH2.

bloaf about 11 years ago

When people talk about the future, they always seem to think that it will be the scientific jobs that get roboticized last. I think it will be the opposite: it won't be long before systems like this one will be able to analyze the scientific literature, identify shortcomings, and tell us what experiments to do next. Science will become less about creativity and problem solving, and more about following directions, eventually becoming completely automated.

http://www.aejournal.net/content/2/1/1

nder about 11 years ago

Any chance you could farm out the software to a lab in a country with MUCH MUCH looser copyright laws, and a court system that would be problematic for outside lawsuits?

dflock about 11 years ago

This blog post is down, try here: http://blogs.ch.cam.ac.uk.nyud.net/pmr/2014/02/18/machines-are-better-referees-than-humans-but-well-be-sued-if-we-use-them/

ylem about 11 years ago

I suppose one way around this would be for the NSF to require any grant awardees to deposit their structures in a publicly accessible database...

But I'm a bit surprised: is there nothing like arxiv.org for chemistry? Why not?

nl about 11 years ago

There is of course a way around the problems cited in the article.

If the referees ran the software on the preprint it would find the same problem.

I agree this isn't as good, but it would be a step forward.

bloaf about 11 years ago

I think the dream would be to couple a literature-analyzer like this with a specialized search engine like Wolfram Alpha.
