TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Speedup Rust's heapsort by 1.5x by making it branchless

4 pointsby pykelloover 2 years ago

1 comment

nequoover 2 years ago
This is the commit in the Rust repository that achieves the speedup by reducing branch misprediction:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;rust-lang&#x2F;rust&#x2F;commit&#x2F;b7089e0dd3e988270f34f182d3749ea5fff5a18f">https:&#x2F;&#x2F;github.com&#x2F;rust-lang&#x2F;rust&#x2F;commit&#x2F;b7089e0dd3e988270f3...</a><p>Do I understand correctly why this works?<p>In the `if` condition on line 201, `child + i &lt; v.len()` is easy to predict but `is_less(...)` is hard. Because of this, `child += 1` on line 202 needs to be walked back with a non-negligible probability. This wastes time.<p><pre><code> 200 &#x2F;&#x2F; Choose the greater child. 201 if child + 1 &lt; v.len() &amp;&amp; is_less(&amp;v[child], &amp;v[child + 1]) { 202 child += 1; 203 } 204 205 &#x2F;&#x2F; Stop if the invariant holds at `node`. 206 if !is_less(&amp;v[node], &amp;v[child]) { </code></pre> After moving the `is_less(...)` call from the `if` condition to the consequent on line 205 below, we still need to fetch both `child += ...` and `is_less(...)` but we are much less likely to need to walk these instructions back. We waste less time and we reach the `if` condition on line 209 sooner.<p><pre><code> 200 &#x2F;&#x2F; Choose the greater child. 201 if child + 1 &lt; v.len() { 202 &#x2F;&#x2F; We need a branch to be sure not to out-of-bounds index, 203 &#x2F;&#x2F; but it&#x27;s highly predictable. The comparison, however, 204 &#x2F;&#x2F; is better done branchless, especially for primitives. 205 child += is_less(&amp;v[child], &amp;v[child + 1]) as usize; 206 } 207 208 &#x2F;&#x2F; Stop if the invariant holds at `node`. 209 if !is_less(&amp;v[node], &amp;v[child]) {</code></pre>