I saw this headline and my first thought was that someone was claiming that a mind-impacting virus that evolved in the ocean was causing scientists to do less ambitious research. Which is of course ridiculous, lol. But a bug in a visualization library impacting science is also ridiculous.
Science communication must be at an all-time low. I initially thought the paper was about a sea-borne pathogen being responsible for a decline in disruptiveness in science, which would be a crazy claim.

Then I thought it was a paper claiming that a bug in the seaborn plotting library in Python was responsible for the decline in disruptiveness in science, which is absurd!

Finally I understood that this is a paper debunking another meta paper, which claimed that disruptiveness in science had declined. And this new arxiv paper shows that a bug in the seaborn plotting library is responsible for the mistake in the analysis that led to that widely publicized conclusion about declining disruptiveness in science. Oh boy, so many levels...
The seaborn issue linked in the paper, “Treat binwidth as approximate to avoid dropping outermost datapoints” (https://github.com/mwaskom/seaborn/pull/3489), summarizes the problem as follows:

> floating point errors could cause the largest datapoint(s) to be silently dropped

However, the paper does not contain the string “float”, instead saying only:

> A bug in the seaborn 0.11.2 plotting software [3], used by Park et al. [1], silently drops the largest data points in the histograms.

So at the very least, the paper is silent on a key aspect of the bug.
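For the mechanics: np.histogram with explicit bin edges silently excludes any value above the last edge, so if floating point error in the binwidth arithmetic lands the final edge a hair below the data maximum, the largest points vanish from the counts. A minimal sketch of that failure mode (illustrative values of my own, not seaborn's actual internals):

    import numpy as np

    # Deterministic sketch of the failure mode, not seaborn's real code:
    # np.histogram with explicit edges excludes values above the last edge.
    data = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

    # Pretend rounding error in the edge computation produced a final
    # edge just below the true maximum (hypothetical value):
    edges = np.array([0.0, 0.25, 0.5, 0.75, 0.9999999999999999])

    counts, _ = np.histogram(data, bins=edges)
    print(counts.sum(), "of", len(data))  # 4 of 5: the 1.0 point is gone

And if I understand the CD index correctly, the values at the top of the range are the most disruptive papers, so this is exactly the wrong tail to lose.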
Seaborn is a visualization library; no statistical tests should have been done with seaborn as an intermediate processing step. I guess they used some of its convenience functions as part of the data analysis. Seaborn is a final-step tool, not a data-analysis tool. That's an embarrassing lesson to learn post-publication.
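If histogram counts feed any downstream analysis, it seems safer to compute them directly and sanity-check that nothing was dropped, using the plotting layer only to draw. A rough sketch of that pattern (example values of my own, not the paper's actual pipeline):

    import numpy as np

    # Compute histogram counts directly for analysis; draw them later
    # with whatever plotting library. Data stands in for CD-index-like
    # values in [-1, 1].
    rng = np.random.default_rng(0)
    data = rng.uniform(-1, 1, size=1000)

    counts, edges = np.histogram(data, bins=np.linspace(-1, 1, 21))

    # A cheap invariant that would have caught this bug immediately:
    assert counts.sum() == len(data), "datapoints were silently dropped"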
I hope that all the publications that celebrated the original work, like the Economist (https://www.economist.com/science-and-technology/2023/01/04/papers-and-patents-are-becoming-less-disruptive), Nature's news service (https://www.nature.com/articles/d41586-022-04577-5), the FT (https://www.ft.com/content/c8bfd3da-bf9d-4f9b-ab98-e9677f109e6d), and others spend as much time on correcting the record as they did on promoting the idea that science is broken.

And I hope the original authors tell Nature to retract their paper. Unfortunately, it's already highly influential.
This image is the best illustration of the flaw: https://arxiv.org/html/2402.14583v1/x1.png

I'm on mobile and can't read the rest of the paper, but the impact could be massive.
The submission was flagged, and I'm not sure I understand why, since the only (negatively) critical discussion I see concerns the ambiguity of the title in the HN submission. Flagging takes a submission off the HN homepage, and a title ambiguity, given the significance of the submission itself, doesn't seem like a strong reason to remove it. :)

There are (at the time of posting this comment) no comments raising any substantive issue with the arxiv submission itself (which of course still has to go through the peer review process of publication, and hopefully the original authors will respond to or rebut this new article), so I'm curious why it's been flagged. It's not dead, so I cannot vouch for it.

If folks in the HN community have flagged it because there are serious issues with what the paper is asserting, please comment or critique instead of just flagging it. If it's because of the ambiguity in the title, I hope @dang and the moderators editorialize; there are some valuable comments in this thread that helped me understand what the issue is and what the bug is!
Bizarre. How do people publish such big, splashy findings, ones that can mess with people's sense of optimism about science and innovation, without doing the simplest checks on their data and methodology?
Of course, it has nothing to do with rampant fraud, unreproducible results, incentive structures that reward the number of papers over the quality of papers, or having researchers spend their prime scientific years writing grant proposals instead of doing actual research...

...nor does it have anything to do with tech companies hoarding cash by the trillions of dollars overseas instead of spending it on R&D. And even what R&D they produce internally, they have no incentive to publish or productize, because virtually no new business will be more profitable than the monopoly business they already have...