Although the article lists all the usual performance "knobs", I think it should have put more emphasis on the "hidden" performance hogs that (as the author mentions in passing) can only be discovered through profiling.<p>Moreover, it fails to mention that enabling all of these knobs has its own disadvantages:<p>* the build time goes through the roof; (and if you also disable incremental builds, then the larger the code base, the longer it takes to compile, even for a single-line edit;)
* more inlining can by itself reduce performance, because the larger executable code may no longer fit in the CPU's low-level caches;<p>----<p>As for profiling, I would say it's a much better "bet" for reducing execution time than these build "knobs".<p>For example, a few months ago I rewrote a small Go tool in Rust (<a href="https://github.com/volution/volution-md5-tools" rel="nofollow">https://github.com/volution/volution-md5-tools</a>) which had the simplest of jobs: read two MD5 (or similar) checksum files, and report which files are missing, which have changed, etc. Basically, populate two maps and compare them.<p>Initially the Rust code was almost a 1-to-1 rewrite using simple hash-maps, and to my surprise the Go version was twice as fast as the Rust release build (even with all of the mentioned performance tricks).<p>So after digging through the code and profiling it, here are a few findings on what plagued my execution time (in order of surprise):<p>* deallocation -- after the Rust program finished its work, it proceeded to deallocate the two large hash-maps, which by itself took (if I remember correctly) at least 25% of the time; (Go didn't have this issue: being garbage collected, the collector simply didn't kick in...)
(the solution: call `exit(0)` once the work is done, so that the deallocation never happens;)<p>* `PathBuf` equality comparison -- I used `PathBuf` as keys in my hash-map (because I wanted to canonicalize the paths); however the key comparison took another large percentage of the time, which was solved by switching to `OsString`; (for some reason, comparing two `PathBuf`s means splitting each of them into components every time, and comparing those one by one;)<p>* regular expression matching with groups -- apparently it's far more expensive to use regular expressions with capture groups for parsing than to use them just to "verify" the validity of the syntax and then switch to another technique to actually tokenize;<p>So, as the saying goes, "caveat emptor"... :)
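<p>To make the first two findings a bit more concrete, here is a minimal sketch using only the standard library. It is not the actual code from the tool: `parse_sums` and `compare` are hypothetical names, and the input is assumed to be in the plain `md5sum` text format (hash, two spaces, path). It shows `OsString` keys instead of `PathBuf`, and exiting without dropping the maps.

```rust
use std::collections::HashMap;
use std::ffi::OsString;
use std::path::PathBuf;

/// Parse `md5sum`-style lines ("<hash>  <path>") into a map keyed by the raw path.
/// Hypothetical helper for illustration, not taken from the actual tool.
fn parse_sums(text: &str) -> HashMap<OsString, String> {
    let mut sums = HashMap::new();
    for line in text.lines() {
        // Assumption: hash and path are separated by exactly two spaces.
        if let Some((hash, path)) = line.split_once("  ") {
            // OsString keys are hashed and compared as opaque bytes.
            sums.insert(OsString::from(path), hash.to_string());
        }
    }
    sums
}

/// Count the paths in `old` that are missing from `new`, and those whose hash changed.
fn compare(
    old: &HashMap<OsString, String>,
    new: &HashMap<OsString, String>,
) -> (usize, usize) {
    let mut missing = 0;
    let mut changed = 0;
    for (path, hash) in old {
        match new.get(path) {
            None => missing += 1,
            Some(other) if other != hash => changed += 1,
            Some(_) => {}
        }
    }
    (missing, changed)
}

fn main() {
    // PathBuf equality walks the path components, so these compare equal...
    assert_eq!(PathBuf::from("a//b"), PathBuf::from("a/b"));
    // ...while OsString equality is a plain byte comparison.
    assert_ne!(OsString::from("a//b"), OsString::from("a/b"));

    let old = parse_sums("aaa  ./x\nbbb  ./y\nccc  ./z\n");
    let new = parse_sums("aaa  ./x\nbbX  ./y\n");
    let (missing, changed) = compare(&old, &new);
    // prints "missing: 1, changed: 1"
    println!("missing: {missing}, changed: {changed}");

    // Terminate without running destructors, so the two maps are never
    // deallocated entry by entry; the OS reclaims the memory in one go.
    std::process::exit(0);
}
```

<p>Note the trade-off with `OsString` keys: `a//b` and `a/b` become two different keys, so any normalization has to happen once, up front, instead of on every comparison. Likewise, `exit(0)` skips all destructors, so it is only safe when nothing on the stack still needs to flush or clean up.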