TechEcho

7 comments

alexgartrell, almost 15 years ago

To put this in context, this particular bug makes your binary fatter and slower in the same way that eating a Tic Tac makes you fatter and slower. A single save to and restore from memory through a slightly (and only slightly) slower path is going to be dwarfed by every other operation in all but the most trivial programs (imagine LOTS of recursion through very simple functions with no looping). Do a single I/O operation and the difference is effectively nothing.

phsr, almost 15 years ago

It's nice to see someone tell you that their benchmark is flawed and why. Most try to pass off their benchmark as the end-all-be-all measurement of whatever they are testing.

froydnj, almost 15 years ago

A couple of points:

- It's hard to reproduce the benchmarking results, as source code for the benchmarks is not provided.
- The original bug report was for 32-bit code, not 64-bit code, as the post assumed throughout.
- If you compile the code given in the original bug report as 64-bit code with and without -fomit-frame-pointer, there's no difference in the generated code.
- It's not clear to me that the "potential pieces of code" in the article are actually generatable with real-world C code. Again, not having actual source code available hurts.
- You shouldn't be using -fomit-frame-pointer on 64-bit code anyway, as you don't need the frame pointer for debugging/unwinding purposes on x86-64 like you do on x86. If the poster had read the x86-64 ABI, this would have been apparent.
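
The with/without comparison in the third point can be tried on any small file. The sketch below is not the code from the bug report (which is not reproduced in this thread); `sum_to` is a hypothetical stand-in, and the GCC invocations in the comment assume an x86 toolchain with 32-bit support installed.

```c
/* frame_demo.c: hypothetical stand-in, NOT the code from the bug report.
 *
 * Compare the generated assembly with and without the frame pointer:
 *   gcc -O2 -m32 -fno-omit-frame-pointer -S frame_demo.c -o with_fp.s
 *   gcc -O2 -m32 -fomit-frame-pointer    -S frame_demo.c -o without_fp.s
 *   diff with_fp.s without_fp.s
 * Replace -m32 with -m64 to repeat the comparison for x86-64.
 */

/* noinline (a GCC extension) keeps the function as a separate symbol,
 * so its prologue and epilogue are easy to find in the assembly output. */
__attribute__((noinline))
unsigned long sum_to(unsigned long n)
{
    /* Simple recursion: the kind of tiny function where prologue and
     * epilogue cost is most visible. The optimizer may still turn
     * this into a loop. */
    return n == 0 ? 0 : n + sum_to(n - 1);
}
```

An empty diff means the flag made no difference for that target, which is the result reported above for the 64-bit case.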

dfj225, almost 15 years ago

Anyone doing timings in Linux at the cycle level might be interested in the PAPI library: http://icl.cs.utk.edu/papi/

stcredzero, almost 15 years ago

It sometimes pays to run your tests at different levels of optimization. There are optimizer and code generation bugs in compilers. Running your tests at different levels of optimization can detect some of these.

oliveoil, almost 15 years ago

Uhh, what is that disgusting thing in the picture near the top?

aliguori, almost 15 years ago

The analysis is flawed because it merely takes into account the fact that ebp is callee-saved versus using a different register that is caller-saved. But ultimately, the register still needs to be spilled somewhere, so you're not changing the overall amount of work done.

Furthermore, it doesn't take into account the fact that by using ebp as a general-purpose register, you've got an additional register to work with, which eliminates additional spilling.

Basically, there's nothing to see here.

GCC optimization flag makes your 64-bit binary fatter and slower
