To put this in context, this particular bug makes your binary fatter and slower in the same way that eating a Tic Tac makes you fatter and slower. A single save/restore through memory via a slightly (and only slightly) slower path is going to be swamped by every other operation in all but the most trivial programs (imagine LOTS of recursion with <i>very</i> simple functions and no looping). Do a single IO operation and you're really talking about nothing.
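For a feel of what that "most trivial" worst case might look like, here is a minimal sketch, assuming a naive Fibonacci as the stand-in: call-dominated recursion, no loops, almost no work per call, so prologue/epilogue cost is about as visible as it will ever get. The names `fib` and `time_fib` are made up for illustration and are not from the article or the bug report.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical worst case: lots of recursion, a trivial body, no loops.
   Nearly all of the time goes into call/return overhead. */
uint64_t fib(unsigned n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Times one call to fib(n) using CPU time from clock().
   Stores the result through *result and returns elapsed seconds.
   This is a coarse timer; it is only meant to show the shape of the test. */
double time_fib(unsigned n, uint64_t *result) {
    clock_t start = clock();
    *result = fib(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Build it with and without -fomit-frame-pointer and compare the timings for something like `time_fib(35, &r)`; even here I'd expect the difference to be small, and any real program with IO in it will bury it.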
It's nice to see someone tell you that their benchmark is flawed and why. Most try to pass off their benchmark as the end-all-be-all measurement of whatever they are testing.
A couple of points:<p>- It's hard to reproduce the benchmarking results, as source code for the benchmarks is not provided.<p>- The original bug report was for 32-bit code, not 64-bit code, as the post assumed throughout.<p>- If you compile the code given in the original bug report as 64-bit code with and without -fomit-frame-pointer, there's no difference in the generated code.<p>- It's not clear to me that a compiler would actually generate the "potential pieces of code" in the article from real-world C code. Again, not having actual source code available hurts.<p>- You shouldn't be using -fomit-frame-pointer on 64-bit code anyway, as you don't need the frame pointer for debugging/unwinding purposes on x86-64 like you do on x86. If the poster had read the x86-64 ABI, this would have been apparent.
Anyone doing timings in Linux at the cycle level might be interested in the PAPI library: <a href="http://icl.cs.utk.edu/papi/" rel="nofollow">http://icl.cs.utk.edu/papi/</a>
It sometimes pays to run your tests at different levels of optimization. Compilers have optimizer and code generation bugs, and building and running the same tests at several levels can detect some of these.
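A rough sketch of the kind of test that works well for this, assuming a bit-twiddling routine checked against a dumb reference version. The function names and the choice of popcount are just for illustration; the point is that the check is deterministic, so a mismatch that appears only at one optimization level points at the compiler (or at undefined behavior in your own code).

```c
#include <stdint.h>

/* Straightforward reference: count set bits one at a time. */
unsigned popcount_ref(uint32_t x) {
    unsigned n = 0;
    while (x) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* Branch-free parallel bit count, the sort of code optimizers
   transform aggressively. */
unsigned popcount_fast(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Compares both versions over a fixed pseudo-random input sequence
   and returns the number of mismatches (0 means the test passed).
   Build and run the same file at each level, e.g. with -O0, -O2, -O3. */
unsigned check_popcount(void) {
    unsigned failures = 0;
    uint32_t x = 1u;
    for (int i = 0; i < 100000; i++) {
        if (popcount_ref(x) != popcount_fast(x))
            failures++;
        x = x * 1664525u + 1013904223u; /* simple LCG, wraps mod 2^32 */
    }
    return failures;
}
```

If `check_popcount()` returns 0 at -O0 but not at -O2, you have something worth reducing into a bug report.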
The analysis is flawed because it merely takes into account the fact that ebp is callee-saved versus using a different register that is caller-saved. But ultimately, the register still needs to be spilled somewhere, so you're not changing the overall amount of work done.<p>Furthermore, it doesn't take into account the fact that by using ebp as a GP register, you've got an additional register to work with, which eliminates additional spilling.<p>Basically, there's nothing to see here.