Fast and slow if-statements: branch prediction in modern processors

37 points by adg001, about 15 years ago

8 comments

InclinedPlane, about 15 years ago
This particular bit of information is worse than useless for the vast majority of developers. For the parts-per-million development efforts where this sort of optimization is necessary (a tiny subset even of kernel developers), there will be many orders of magnitude more developers for whom this information is actively harmful. Devs will spend their time trying to second-guess the compiler's optimizations of their if statements in order to eke out microseconds of performance improvement. Meanwhile, by focusing their efforts on the wrong thing, they will take attention away from quality of design and execution as well as macro-optimizations. Their code will be lower quality and, ironically, slower.

Micro-optimization is a silly diversion the vast majority of the time. Wait to optimize at that level until you have the tooling and the measurements that indicate you truly need it (more often than not you won't).
codesink, about 15 years ago
The Linux kernel exploits that by using two macros (likely() and unlikely()) that give hints to the compiler using GCC's __builtin_expect:

if (likely(condition)) dosomething();
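A minimal sketch of how such hint macros can be defined and used, assuming GCC or Clang (`__builtin_expect` is a compiler extension, not standard C). The macro bodies follow the common kernel-style form; the function `sum_valid` and its "negative values are rare" assumption are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Kernel-style hint macros. !!(x) normalizes any truthy value to 1
 * so it can be compared against the expected value. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sum the non-negative entries of an array,
 * telling the compiler that negative entries are the rare case. */
long sum_valid(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (unlikely(values[i] < 0))
            continue;           /* rare path: skip the invalid entry */
        total += values[i];     /* common path */
    }
    return total;
}
```

The hint mainly influences how the compiler lays out the generated code (which path falls through). It does not change the result, and whether it helps at all is something only a measurement on the target machine can show.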
jwegan, about 15 years ago
What the article fails to mention is that branch predictors in modern CPUs are generally correct >95% of the time for real-world benchmarks (usually SPEC).

Furthermore, each CPU uses a different approach to branch prediction, so a pattern that is bad for one will not necessarily be bad for another. Spending time optimizing your code in this way will only optimize it for a particular processor, which won't work if you're trying to make a distributable binary. (I.e., different x86 processors use different branch prediction algorithms. You will basically be optimizing specifically for the Intel i7 or specifically for the AMD Athlon 3.)
Zot95, about 15 years ago
Interesting article. It has been a long time since I have worked this "close to the metal." Back when I coded at the machine level, there were no branch predictions performed by the processor to help maintain the pipeline. What you had were different opcodes for your conditional jumps: one for when the jump was likely, one for when it was not. It had amazed me (a bit) that this never got reflected in higher-level languages. Now that I see that the processor makes (educated?) guesses, I suppose I am no longer surprised.
warfangle, about 15 years ago
Would systems that use trace trees (e.g., TraceMonkey) have the same sort of issues?

I have a feeling that they would notice repeated calls to the same if-statement and optimize accordingly. They may still have the same sorts of issues, but would the benchmark used (looping and querying the if-statement n times) have completely different results in a JIT system?
TrevorBurnham, about 15 years ago
I'm astonished at the magnitude of the difference correct prediction can make in a scenario where the effect of a misprediction is to either increment a sum or not. How is that possible? How many clock cycles does an accurate prediction save in this context, and how many are used in moving through the loop and evaluating the condition?
malkia, about 15 years ago
Results from LuaJIT - http://gist.github.com/413482

D:\p\badif>..\luajit\src\luajit badif.lua
(i & 0x80000000) == 0 time: 1.057
(i & 0xFFFFFFFF) == 0 time: 0.696
(i & 0x00000001) == 0 time: 3.682
(i & 0x00000003) == 0 time: 5.339
(i & 0x00000002) == 0 time: 3.686
(i & 0x00000004) == 0 time: 3.683
(i & 0x00000008) == 0 time: 3.686
(i & 0x00000010) == 0 time: 3.856
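For readers who want to see what conditions like these look like in a loop, here is a minimal C sketch. It is not the code from the article or from the gist; the function name `count_masked_zero` and the loop bound `n` are assumptions made for illustration.

```c
#include <stdint.h>

/* Count how many i in [0, n) satisfy (i & mask) == 0.
 * The if-statement inside the loop is the branch whose
 * predictability depends on the mask. */
uint64_t count_masked_zero(uint32_t mask, uint32_t n)
{
    uint64_t sum = 0;
    for (uint32_t i = 0; i < n; i++) {
        if ((i & mask) == 0)
            sum++;
    }
    return sum;
}
```

With mask 0x80000000 the condition is true for every i below 2^31, and with 0xFFFFFFFF it is true only for i = 0, so both branches follow a nearly constant pattern. With 0x00000001 the outcome alternates on every iteration, and with 0x00000003 it is true once in every four iterations. Note that an optimizing compiler may turn this loop into branch-free code, in which case timing it says little about the branch predictor.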
j_baker, about 15 years ago
Just out of curiosity, how applicable is this to optimizing a higher-level language like Python or Ruby?