TechEcho
Cheap tricks for high-performance Rust

2 points by ranadeep about 5 years ago

1 comment

ciprian_craciun about 5 years ago
Although the article lists all the usual performance "knobs", I think it should have put more emphasis on the "hidden performance hogs" that (as the author mentions in passing) can only be discovered through profiling.

Moreover, it fails to mention that enabling all of these has its own disadvantages:

* the build time goes through the roof (and if you also disable incremental builds, the larger the code base, the longer it takes to compile even a single-line edit);

* simply because more code gets inlined, performance might drop, since less of the executable code fits in the CPU's low-level caches;

----

As for profiling, I would say it's a much better "bet" for reducing execution time than these build "knobs".

For example, a few months ago I rewrote a small Go tool in Rust (https://github.com/volution/volution-md5-tools) which had the simplest of jobs: read two MD5 (or similar) files, and report which files are missing, which have changed, etc. Basically, populate two maps and compare them.

Initially the code was almost a 1-to-1 rewrite using simple hash-maps, and to my surprise the Go version was twice as fast as even the Rust release version (with all of the mentioned performance tricks).

So, digging through the code and profiling it, here are a few surprising findings on what plagued my execution time (in order of surprise):

* deallocation -- after the Rust program had finished, it proceeded to deallocate the two large hash-maps, which by itself took (if I remember correctly) at least 25% of the time; (Go didn't have this issue because, being garbage collected, the collector didn't kick in...) (the solution: use an `exit(0)` to make sure the deallocation doesn't happen;)

* `PathBuf` equality comparison -- I used `PathBuf` as keys in my hash-map (because I wanted to canonicalize the paths); however the key comparison took another large percentage, which was solved by switching to `OsString`; (for some reason, comparing two `PathBuf`s implies splitting each of them into components every time, and comparing those one by one;)

* regular expression matching with groups -- apparently it's far more expensive to use regular expressions with groups for parsing than to use them only to "verify" the validity of the syntax and then switch to another technique to actually tokenize;

So, as the saying goes, "caveat emptor"... :)
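To illustrate the deallocation point, here is a minimal sketch, not the code from the linked tool. It uses `std::mem::forget` as a stand-in for `exit(0)`: both skip the maps' destructors, but `forget` can be exercised inside a function. The names `count_missing_then_leak` and `sample_map`, and the `file-N` keys, are made up for the example.

```rust
use std::collections::HashMap;
use std::mem;

/// Counts the keys of `left` that are absent from `right`, then deliberately
/// leaks both maps so that their destructors never run.
fn count_missing_then_leak(
    left: HashMap<String, String>,
    right: HashMap<String, String>,
) -> usize {
    let missing = left.keys().filter(|k| !right.contains_key(*k)).count();
    // Dropping a large map frees every key and value individually.
    // Forgetting the maps skips that work; the OS reclaims the memory
    // when the process ends. Only sensible right before exiting.
    mem::forget(left);
    mem::forget(right);
    missing
}

/// Builds a map of hypothetical `path -> checksum` entries (sample data only).
fn sample_map(range: std::ops::Range<u32>) -> HashMap<String, String> {
    range
        .map(|i| (format!("file-{i}"), format!("{i:032x}")))
        .collect()
}
```

In a short-lived command-line tool, ending `main` with `std::process::exit(0)` has the same effect for everything still alive at that point. Since no destructors run, any buffered writer the program owns should be flushed explicitly first.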
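The `PathBuf` finding can be sketched roughly as follows, assuming a purely lexical normalization is enough (this is not `std::fs::canonicalize`, and it may differ from what the tool actually does). The path is normalized once when the key is built, and the map then compares plain `OsString` values. `normalized_key` and `index_checksums` are hypothetical helper names.

```rust
use std::collections::HashMap;
use std::ffi::OsString;
use std::path::PathBuf;

/// Normalizes a path once, up front, and returns it as an `OsString` key.
/// Collecting `components()` drops repeated separators, interior `.` and a
/// trailing separator, which is the view `PathBuf` equality compares.
fn normalized_key(raw: &str) -> OsString {
    let normalized: PathBuf = PathBuf::from(raw).components().collect();
    normalized.into_os_string()
}

/// Indexes hypothetical `(path, checksum)` pairs by their normalized key.
fn index_checksums(entries: &[(&str, &str)]) -> HashMap<OsString, String> {
    entries
        .iter()
        .map(|&(path, sum)| (normalized_key(path), sum.to_string()))
        .collect()
}
```

The trade-off is that `OsString` equality is a plain comparison of the underlying bytes, so `a//b/` and `a/b` only match if both went through the same normalization before being inserted or looked up.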
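For the last finding, the "another technique to actually tokenize" step might look something like this. It assumes `md5sum`-style lines (32 hex digits, a space, then a space or `*`, then the path); the comment does not spell out the input format. To stay dependency-free, the validation here is done by hand rather than with a group-less regular expression, and `parse_md5_line` is a made-up name.

```rust
/// Splits one `md5sum`-style line into `(hash, path)` using plain slicing.
/// Returns `None` when the line does not have the assumed shape.
fn parse_md5_line(line: &str) -> Option<(&str, &str)> {
    const HASH_LEN: usize = 32;
    // `get` returns None if the line is too short or the cut is not on a
    // character boundary, so this never panics.
    let hash = line.get(..HASH_LEN)?;
    if !hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let rest = line.get(HASH_LEN..)?;
    // Separator: a space followed by ' ' (text mode) or '*' (binary mode).
    let path = rest
        .strip_prefix("  ")
        .or_else(|| rest.strip_prefix(" *"))?;
    if path.is_empty() {
        None
    } else {
        Some((hash, path))
    }
}
```

Both returned slices borrow from the input line, so tokenizing allocates nothing; whether that beats capture groups on a given input is something only a profile can tell.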