An alarming number of scientific papers contain Excel errors

91 points by pns, almost 9 years ago

16 comments

denzil_correa, almost 9 years ago
Previous discussion - https://news.ycombinator.com/item?id=12349391
ramblenode, almost 9 years ago
MS Excel is *absolutely* unfit for most scientific and engineering problems.

The spreadsheet GUI, the lack of good version tracking/history, and the eagerness to coerce data types and "correct" values make it easy to introduce errors that go unrecognized and propagate through calculations. Unfortunately this story just keeps repeating itself.

But all of this is secondary to Excel's real trouble: its history of incorrectly implementing numerical and statistical procedures. One could plumb the depths of this topic for hours, but here are a few highlights: the regression formula accepts illegal/nonsensical inputs (e.g. collinear predictors) and gives illegal/nonsensical outputs [0], variance/standard deviation change incorrectly with sample size [0], the output of a paired t-test changes when missing values are included [0], formulas are mislabeled [0], v. 2007 gives very wrong answers to 11 of 27 tests in the NIST test suite used for statistical software benchmarks [1], the random number generator was broken as late as v. 2007 [1], and calculations relying on any of 12 particular floats display an incorrect result [2]. There are plenty of other issues mentioned in the links and elsewhere; if you're interested you'll have no trouble finding them.

Remember, friends don't let friends use Excel for science. :)

[0] http://people.stern.nyu.edu/jsimonof/classes/1305/pdf/excelreg.pdf

[1] http://www.pages.drexel.edu/~bdm25/excel2007.pdf

[2] https://blogs.office.com/2007/09/25/calculation-issue-update/

Edit: clarify and add a new issue I became aware of while researching further.
dagaci, almost 9 years ago
At first I thought this story was about math and numerical errors. But it's actually about auto-formatting and auto-correction.

"Excel automatically converting gene names to things like calendar dates or random numbers"

In this case, I think what is needed is some rudimentary knowledge of data types. Or, more simply, a scientific template that is plain text by default.

But how are people not noticing auto-correction and auto-formatting taking place?

The only perfect solution is to hire a developer to build you a data entry system. The developer can build it without having any reason to fully understand the science behind it, and then there is a human to take the blame for errors instead of Excel.
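The "plain text by default" idea can be illustrated outside a spreadsheet. The following is only a minimal sketch, assuming a hypothetical two-column CSV of gene names and counts (the sample data and the `read_genes` helper are made up for illustration). Python's standard `csv` module returns every field as a string, so a value changes type only where the code converts it explicitly.

```python
import csv
import io

# Hypothetical sample data. MARCH1 and SEPT2 are gene names that a
# spreadsheet with automatic date detection may turn into calendar dates.
SAMPLE = "gene,count\nMARCH1,12\nSEPT2,7\nTP53,30\n"

def read_genes(text):
    """Parse CSV text, keeping gene names as strings and converting only the count column."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # The gene name is left untouched; the count is converted on purpose.
        rows.append((record["gene"], int(record["count"])))
    return rows

rows = read_genes(SAMPLE)
print(rows)
```

The design point is that type conversion is opt-in per column rather than guessed per cell.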
keithpeter, almost 9 years ago
https://help.libreoffice.org/Calc/Deactivating_Automatic_Changes

Type an apostrophe at the beginning of the gene name ('MARCH1) or format the column for gene names as text (click the column letter, then Format | Cell and select text).

If people want to use a spreadsheet application for this kind of data collection (and that is a big if, I think), then they perhaps need some agreed lab protocols for setting up and checking the spreadsheets. This is a known issue in financial circles...

http://www.eusprig.org/basic-research.htm
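One checking step in such a protocol could be automated. The sketch below is an assumption about what a check might look like, not an established lab procedure: it flags entries in a gene-name column that look like dates (for example `1-Mar`, a common result of `MARCH1` being auto-converted). The pattern list and the `find_suspect_names` helper are hypothetical and not exhaustive.

```python
import re

# Shapes such as "1-Mar", "Sep-2" or "2016-03-01" suggest that a gene name
# was turned into a date. This pattern list is an assumption, not exhaustive.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(
    rf"^(\d{{1,2}}-({MONTHS})|({MONTHS})-\d{{1,2}}|\d{{4}}-\d{{2}}-\d{{2}})$",
    re.IGNORECASE,
)

def find_suspect_names(names):
    """Return (index, value) pairs for entries that look like dates rather than gene names."""
    return [(i, n) for i, n in enumerate(names) if DATE_LIKE.match(n.strip())]

# Hypothetical column exported from a spreadsheet.
column = ["TP53", "1-Mar", "BRCA1", "2-Sep", "SEPT2"]
suspects = find_suspect_names(column)
print(suspects)
```

A check like this only detects damage after the fact; the original name cannot always be recovered from the date, so it complements the text-format settings above rather than replacing them.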
omginternets, almost 9 years ago
When my Ph.D. is finally done (~3 months), I'll post some of the code I've had to work with daily for the past three years.

"Spaghetti" doesn't even *begin* to describe it. "Ball of yarn under a cat-lady's sofa" comes readily to mind, as does gouging my eyes out and amputating my fingers.

The problem isn't Excel. The problem is scientists.
SNvD7vEJ, almost 9 years ago
Why are the auto-convert 'features' in Excel not opt-in?

When Excel encounters the first cell in a new sheet that it thinks should be auto-converted, why does it not ask whether that is desirable for that sheet?

Like: "Do you want Excel to interpret and auto-convert all strings with format <X> into the type <Y> in this sheet?"

At least for conversions where the original data is lost.
vanderZwan, almost 9 years ago
Highly relevant: Felienne[0] Hermans' compsci research on spreadsheets out there in the wild, and how to develop software engineering tools to make them better:

https://www.youtube.com/watch?v=2Cdgew5zvI4

http://www.felienne.com/archives/tag/spreadsheets

[0] pronounced Fay-lee-nuh
IndianAstronaut, almost 9 years ago
The same thing happened in economics, involving a major figure in the field.

http://www.bloomberg.com/news/articles/2013-04-18/faq-reinhart-rogoff-and-the-excel-error-that-changed-history
hirenj, almost 9 years ago
It's funny to consider that these errors slipped past the peer review stage. It really highlights the major issue with reviewing source code published as part of an analysis.

If there aren't enough resources / skilled eyes to catch these simple errors, what are the chances they would catch errors in source code too?
triplesec, almost 9 years ago
Anyone doing serious statistics uses SPSS or R, or similar stats programs. If not, you deserve all the bad data you get. Using Excel for that is akin to using a point and shoot camera for a fashion photoshoot, or a crossover car offroad in Death Valley.
Gatsky, almost 9 years ago
Consider that given the poorly conducted statistical analyses, p-hacking etc that goes on in the life sciences, Excel garbling gene names might actually improve the net accuracy of the results by removing false positives.
Steeeve, almost 9 years ago
Is everybody a Washington Post subscriber? Or have I missed the route around the paywall somehow?
pjmorris, almost 9 years ago
Should decision makers' spreadsheets in business, policy, and government be peer-reviewed in the same way as scientific papers?

Disclaimers: 1) Yes, scientific peer review needs improvement. 2) Yes, spreadsheets are not ideal for science... what makes business less important?
cm2187, almost 9 years ago
I wonder if that also applies to DNA tests used in criminal investigations...
gregn610, almost 9 years ago
An alarming number of business spreadsheets contain Excel errors. But it's the lingua franca of businesses, departments & teams everywhere.
nol13, almost 9 years ago
*Excel mutations