Reporting a bug on a fragile analysis

284 points by lordgilman over 14 years ago

13 comments

lordgilman over 14 years ago

I know that my submission's title is not the same as the blog post's title and that I will get some hate for it. However, the two diffs linked in the post give pretty convincing evidence that IE is picking up on the exact SunSpider test. Furthermore, if you read the last sentence of the blog post the author is more or less beating around the "You're cheating, we've caught you red-handed" bush.

runjake over 14 years ago

This submission demonstrates why you should just stick to the original article title you're linking, instead of coming up with your own flamebait/trolly title.

The issue appears to be a SunSpider bug, not an IE9 bug or "cheat". See http://news.ycombinator.com/item?id=1913368 for more information.

lordgilman, I hope you now realize it would've been wise to wait before passing judgment (especially in a public forum).

Edit: I don't know what's with the downvotes. I'm just going by the HN Guidelines, posted at http://ycombinator.com/newsguidelines.html. If you have a problem, don't downvote me, take it up with pg.

paulirish over 14 years ago

Pretty much all browser vendors agree SunSpider is a bad benchmark, yet it keeps getting used and abused. All vendors have tweaked their JS engines for SunSpider itself.

Dromaeo is a much better benchmark suite in that it tests actual DOM things rather than pure language stuff. Kraken (also by Moz) also attempts to focus on webapp use cases rather than doing billions of regexes per second.

julian37 over 14 years ago

In case you don't have IE9 installed, the benchmark results (quoted in the previous blog post) are:

    cordic: 1ms +/- 0.0%
    cordic-with-return: 22.6ms +/- 2.7%
    cordic-with-true: 22.5ms +/- 2.7%

(Taken from http://blog.mozilla.com/rob-sayre/2010/09/09/js-benchmarks-closing-in/ )

kenjackson over 14 years ago

A better test to see if IE9 is cheating is to remove/rearrange code and rename variables. I'd avoid changing operators. Adding a 'true;' or 'return;' may seem harmless, but if their analysis is fragile it may just flag those statements as "may have side-effects", or (in the case of the 'return;') it may not do liveness analysis on the other side of the block.

This code (taken from this thread) seems like a good test:

    function numNumNum() {
      var i;
      var num = 10;
      for (i = 0; i < 10; i++) {
        num = num * num * num * num * num % num;
      }
    }

Except it uses two new operators: '*' and '%'. Test the same code using '+' and '-'.

This will give a much better idea of whether the analysis is just fragile or whether this code was being targeted.
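A minimal sketch of the comparison suggested above, assuming a plain wall-clock harness. The names `numNumNumAddSub` and `timeIt` and the call count are made up for illustration; none of this comes from SunSpider or the blog post.

```javascript
// Shape of the function quoted above: `num` is never read afterwards,
// so the whole loop is dead code.
function numNumNum() {
  var i;
  var num = 10;
  for (i = 0; i < 10; i++) {
    num = num * num * num * num * num % num;
  }
}

// Hypothetical variant: same structure, but only '+' and '-'.
function numNumNumAddSub() {
  var i;
  var num = 10;
  for (i = 0; i < 10; i++) {
    num = num + num + num + num + num - num;
  }
}

// Hypothetical helper: call `fn` `calls` times and return elapsed milliseconds.
function timeIt(fn, calls) {
  var start = Date.now();
  for (var n = 0; n < calls; n++) {
    fn();
  }
  return Date.now() - start;
}

var calls = 100000; // arbitrary; raise it if both timings round to 0
console.log("mul/mod: " + timeIt(numNumNum, calls) + "ms");
console.log("add/sub: " + timeIt(numNumNumAddSub, calls) + "ms");
```

Timings depend on the machine and engine, so no expected output is given. The interesting signal would be a large gap between the two variants, which would suggest the analysis is sensitive to which operators appear.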

nkurz over 14 years ago

It certainly seems like Microsoft is 'cheating', but it also seems like an excellent but warped example of Test Driven Development: they solved the failing test by the simplest and most direct means available. If time and budget hold out they will refactor later to generalize.

How do the TDD proponents feel about Microsoft's approach? How is it different than the supposedly correct behaviour demonstrated here: http://thecleancoder.blogspot.com/2010/10/craftsman-62-dark-path.html

chollida1 over 14 years ago

The actual blog post title is:

> Reporting a bug on a fragile analysis

niyazpk over 14 years ago

IIRC there was this Microsoft website which listed a few HTML demos in which IE9 was way faster than even Google Chrome. I wonder whether they used the same 'technique' there too.

itissid over 14 years ago

Machine learning has a name for this: overfitting, i.e. doing well on a test dataset by cheating and seeing it first. I think the benchmark should choose tests randomly from a large set of tests and calculate the expected performance over a number of such random runs, so that no one can cheat.
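The random-sampling idea can be sketched roughly as follows. Everything here is hypothetical: `sampleTests`, `timeTest`, `expectedScore`, and the toy `pool` stand in for a real harness and a real test collection.

```javascript
// Hypothetical helper: pick `k` distinct items from `pool` with a partial
// Fisher-Yates shuffle. `rng` returns a float in [0, 1); defaults to Math.random.
function sampleTests(pool, k, rng) {
  rng = rng || Math.random;
  var copy = pool.slice(); // leave the caller's array untouched
  var count = Math.min(k, copy.length);
  for (var i = 0; i < count; i++) {
    var j = i + Math.floor(rng() * (copy.length - i));
    var tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy.slice(0, count);
}

// Time a single test function in milliseconds.
function timeTest(test) {
  var start = Date.now();
  test();
  return Date.now() - start;
}

// Perform `runs` random draws of `k` tests each and return the mean total
// time per draw, an estimate of the expected score.
function expectedScore(pool, k, runs, rng) {
  var total = 0;
  for (var r = 0; r < runs; r++) {
    var chosen = sampleTests(pool, k, rng);
    for (var t = 0; t < chosen.length; t++) {
      total += timeTest(chosen[t]);
    }
  }
  return total / runs;
}

// Toy pool standing in for a large set of real benchmark tests.
var pool = [];
for (var n = 1; n <= 20; n++) {
  pool.push((function (size) {
    return function () {
      var acc = 0;
      for (var i = 0; i < size * 1000; i++) { acc += i % 7; }
      return acc;
    };
  })(n));
}

console.log("mean ms per draw: " + expectedScore(pool, 5, 10));
```

Because the subset changes on every draw, tuning an engine for any single test buys little; the cost is that scores become noisy and need several runs to stabilize.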

pohl over 14 years ago

This was revealed 68 days ago, but nobody seemed to be interested in it at the time: http://news.ycombinator.com/item?id=1676827

scottdw2 over 14 years ago

That's a pretty big conclusion to jump to (they are cheating the test) based on a small amount of evidence. If they were "precompiling" the JavaScript for the test, and had functionality to "precompile" JavaScript code in the cache, would the fact that they precompiled the benchmark mean they were cheating? No. It wouldn't.

Keep in mind that there is a lot of code, such as jQuery, that is identical but distributed from many sources. It could benefit from similar matching and pre-compilation.

If dead code analysis (and other optimizations) was part of an "offline" compilation step (that's not efficient enough to do online), then changing the code would result in a slower execution path. Once the method body changes, the compiler wouldn't know it was dead without re-running the analysis (the changes could introduce side effects).

Now, this doesn't mean they are not cheating, because there is no evidence either way. But, what you are observing in this case doesn't imply cheating either.
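To make the hypothetical concrete, here is a toy sketch of an exact-match cache for precomputed analysis. It is purely illustrative: nothing in this thread shows that IE9 works this way, and `analysisCache`, `expensiveAnalysis`, `precompile`, and `lookup` are invented names.

```javascript
// Hypothetical cache: maps exact source text to a precomputed analysis result.
var analysisCache = new Map();

// Stand-in for an expensive offline analysis; it only returns a dummy record.
function expensiveAnalysis(source) {
  return { deadCodeRemoved: true, sourceLength: source.length };
}

// "Offline" step: analyze a known script ahead of time.
function precompile(source) {
  analysisCache.set(source, expensiveAnalysis(source));
}

// "Online" step: reuse the cached result only on an exact match,
// otherwise fall back to the slow path with no analysis.
function lookup(source) {
  if (analysisCache.has(source)) {
    return { path: "fast", analysis: analysisCache.get(source) };
  }
  return { path: "slow", analysis: null };
}

var original = "function f() { var x = 0; for (var i = 0; i < 10; i++) { x += i; } }";
var modified = "function f() { var x = 0; for (var i = 0; i < 10; i++) { x += i; } true; }";

precompile(original);
console.log(lookup(original).path); // fast
console.log(lookup(modified).path); // slow
```

Under this assumption, appending a harmless `true;` changes the cache key, so the modified script misses the cache and takes the slow path even though no rule ever singled out the benchmark.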

olalonde over 14 years ago

Could anyone explain what "dead code analysis" is?

Update: I still don't get why "the SunSpider math-cordic benchmark is very fast, presumably due to some sort of dead code analysis". Didn't the author prove exactly the opposite by showing SunSpider is slower when adding dead code to the benchmark? Sorry for the noob question.
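For readers with the same question, a small illustration of what "dead code" means. This is not the SunSpider source, and `deadLoop` and `liveLoop` are made-up names. The loop in `deadLoop` computes a value that nobody reads and has no side effects, so an optimizer that can prove this is free to skip it.

```javascript
// Dead: `sum` is computed but never returned, stored, or printed.
function deadLoop() {
  var sum = 0;
  for (var i = 0; i < 1000; i++) {
    sum += i;
  }
}

// Live: the result escapes through `return`, so the loop has to run.
function liveLoop() {
  var sum = 0;
  for (var i = 0; i < 1000; i++) {
    sum += i;
  }
  return sum;
}

console.log(deadLoop()); // undefined
console.log(liveLoop()); // 499500
```

One possible reading of the blog post's result: the original benchmark body is dead in this sense and gets skipped, while the added `true;` or `return;` statements stop the analysis from proving that, so the loop actually executes and the test runs slower.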

pers3us over 14 years ago

What about this reply on the IE blog? http://blogs.msdn.com/b/ie/archive/2010/11/17/html5-and-real-world-site-performance-seventh-ie9-platform-preview-available-for-developers.aspx?PageIndex=4#comments