GCC optimization flag makes your 64-bit binary fatter and slower

61 points by mudgemeister almost 15 years ago

7 comments

alexgartrell almost 15 years ago
To put this in context, this particular bug makes your binary fatter and slower in the same way that eating a Tic Tac makes you fatter and slower. A single load/restore from memory through a slightly (and only slightly) slower path is going to be blown away by every other operation in all but the most trivial program (imagine LOTS of recursion with very simple functions and no looping). Do a single IO operation and you're really talking about nothing.
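The worst case described above (lots of recursion through very simple functions, no loops, no IO) can be sketched as a tiny workload where call and return overhead dominates. This is an illustration only, not the benchmark from the article: `fib` and `time_fib` are hypothetical names, and timing uses the standard `clock()` call, which reports CPU time at a much coarser grain than single cycles.

```c
#include <stdint.h>
#include <time.h>

/* Deliberately trivial recursive function: almost all of its cost is
 * function prologue/epilogue work, which is where frame pointer handling
 * would show up if it shows up anywhere. */
uint64_t fib(unsigned n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Runs fib(n), stores the result in *out, and returns the CPU time
 * consumed in seconds as measured by clock(). */
double time_fib(unsigned n, uint64_t *out)
{
    clock_t start = clock();
    *out = fib(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_fib` with a fairly large `n` from a small `main`, once in a build with `-fomit-frame-pointer` and once with `-fno-omit-frame-pointer`, gives a rough comparison. The optimizer may rewrite part of the recursion, so the numbers are only indicative, and any real program that loops or does IO should show a far smaller relative difference.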
phsr almost 15 years ago
It's nice to see someone tell you that their benchmark is flawed and why. Most try to pass off their benchmark as the end-all-be-all measurement of whatever they are testing.
froydnj almost 15 years ago
A couple of points:

- It's hard to reproduce the benchmarking results, as source code for the benchmarks is not provided.
- The original bug report was for 32-bit code, not the 64-bit code the post assumes throughout.
- If you compile the code given in the original bug report as 64-bit code with and without -fomit-frame-pointer, there's no difference in the generated code.
- It's not clear to me that the "potential pieces of code" in the article are actually generatable with real-world C code. Again, not having actual source code available hurts.
- You shouldn't be using -fomit-frame-pointer on 64-bit code anyway, as you don't need the frame pointer for debugging/unwinding purposes on x86-64 like you do on x86. If the poster had read the x86-64 ABI, this would have been apparent.
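The with/without comparison in the third point is easy to repeat on any small translation unit. The code from the original bug report is not reproduced in this thread, so the functions below are hypothetical stand-ins; the commands in the comment use standard GCC options, and whether the two listings differ depends on the GCC version, target, and defaults.

```c
#include <stddef.h>

/*
 * Compare the generated assembly (assuming this file is saved as fp.c):
 *
 *   gcc -m64 -O2 -S -fomit-frame-pointer    fp.c -o with.s
 *   gcc -m64 -O2 -S -fno-omit-frame-pointer fp.c -o without.s
 *   diff with.s without.s
 *
 * Replace -m64 with -m32 to look at 32-bit x86 code instead.
 */

/* Hypothetical helper, kept out of line so that sum_squares() makes a
 * real call and needs a proper prologue and epilogue. */
__attribute__((noinline)) long square(long x)
{
    return x * x;
}

/* Non-leaf function whose prologue/epilogue is what the diff exposes. */
long sum_squares(const long *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += square(v[i]);
    return total;
}
```

An empty diff would match the "no difference in the generated code" observation for that compiler and target; a non-empty one shows exactly which instructions the flag adds or removes.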
dfj225 almost 15 years ago
Anyone doing timings in Linux at the cycle level might be interested in the PAPI library: http://icl.cs.utk.edu/papi/
stcredzero almost 15 years ago
It sometimes pays to run your tests at different levels of optimization. There are optimizer and code generation bugs in compilers. Running your tests at different levels of optimization can detect some of these.
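One way to act on this is to keep tests self-checking, so the same source can be built at several optimization levels and each binary reports its own failures. A minimal sketch, assuming a hypothetical unit under test (`sum_of_squares`) and a hypothetical driver (`run_self_test`):

```c
#include <stdint.h>

/* Hypothetical unit under test: sum of i*i for i in [0, n). */
uint64_t sum_of_squares(uint32_t n)
{
    uint64_t total = 0;
    for (uint32_t i = 0; i < n; i++)
        total += (uint64_t)i * i;
    return total;
}

/* Returns the number of failed checks. The expected values are fixed
 * constants, so they do not depend on how the code was compiled. */
int run_self_test(void)
{
    int failures = 0;
    if (sum_of_squares(0) != 0)
        failures++;
    if (sum_of_squares(4) != 14)        /* 0 + 1 + 4 + 9 */
        failures++;
    if (sum_of_squares(100) != 328350)  /* (n-1)n(2n-1)/6 with n = 100 */
        failures++;
    return failures;
}
```

Building a small `main` that returns `run_self_test()` with `-O0`, `-O1`, `-O2`, and `-O3` in turn, then running each binary, should give the same result every time. A failure that appears at only one level points at either a code generation problem or at latent undefined behavior in the code itself.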
oliveoil almost 15 years ago
Uhh, what is that disgusting thing in the picture near the top?
aliguori almost 15 years ago
The analysis is flawed because it only takes into account the fact that ebp is callee-saved, versus using a different register that is caller-saved. But ultimately, the register still needs to be spilled somewhere, so you're not changing the overall amount of work done.

Furthermore, it doesn't take into account the fact that by using ebp as a GP register, you've got an additional register to work with, which eliminates additional spilling.

Basically, there's nothing to see here.