I like how he demonstrates the thought process. It seems like quite a mechanical recipe that can be applied to a lot of similar problems:<p>Step 1: A slow but easy implementation. It lets you make sure the algorithm is correct, and later lets you validate the faster variants.<p>Step 2: Algorithm and data structure optimization, guided by profiling.<p>Step 3: Micro-optimization, again guided by profiling. A standard bag of tricks is applied. I didn't see cache friendliness/tiling in here, which surprised me.<p>Step 4: Parallelism, first by SIMD, then by threads.<p>I am very much not putting the author down: this is great teaching of the process, it's a lot of hard brainy work, and we didn't see everything he tried but which failed. So thank you, author.<p>My point is: it is not magic. It is engineering. It is a skill that can be acquired, taught, even planned and measured.
I think the intent is fine, it's an interesting problem to want to find the question or set of questions that is the best predictor of the other questions.<p>But I'd have thought there must be heuristics to help narrow the search.<p>For example, you could calculate how each question correlates with all the other questions[1]. If you picked the set of k questions with the best correlation, then you have a good sense of which will be in the actual "k-corr set".<p>You could widen your search a little bit, but by chopping out any questions which correlate poorly to the overall score, you narrow your search greatly: going from 200 choose 5 to 10 choose 5 is a speedup of roughly a factor of ten million, and you've probably considered all the sets that are likely to be in the final bucket.<p>It's been a while since I've worked in the domain, but OP might also want to check out <a href="https://en.wikipedia.org/wiki/Item_response_theory" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Item_response_theory</a> and <a href="https://en.wikipedia.org/wiki/Classical_test_theory" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Classical_test_theory</a>.<p>[1] <a href="https://en.wikipedia.org/wiki/Point-biserial_correlation_coefficient" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Point-biserial_correlation_coe...</a>
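To illustrate the pre-filtering idea: here's a minimal, self-contained Rust sketch (the function names and the toy data layout are my own, not from the post). It computes the Pearson correlation of each question's 0/1 column against the total score, which for a binary column is exactly the point-biserial correlation linked above, then keeps the k most correlated questions as the narrowed candidate pool:

```rust
/// Pearson correlation between two equal-length samples. For a 0/1
/// question column vs. the total score this is the point-biserial
/// correlation. Returns NaN if either sample is constant.
fn correlation(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    let (mut cov, mut vx, mut vy) = (0.0, 0.0, 0.0);
    for (&a, &b) in x.iter().zip(y) {
        cov += (a - mx) * (b - my);
        vx += (a - mx) * (a - mx);
        vy += (b - my) * (b - my);
    }
    cov / (vx.sqrt() * vy.sqrt())
}

/// Indices of the k questions most correlated with the total score.
/// `questions[i][s]` is 1.0 if student s answered question i correctly.
/// Assumes no constant columns (those would produce NaN correlations).
fn top_k_by_correlation(questions: &[Vec<f64>], totals: &[f64], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f64)> = questions
        .iter()
        .enumerate()
        .map(|(i, q)| (i, correlation(q, totals)))
        .collect();
    // Sort descending by correlation.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

Only the surviving indices would then go into the exhaustive k-subset search, which is where the combinatorial savings come from.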
It's crazy to me how the author spends most of his time finding out what hides in his abstractions, and then changing that. Especially the macros, and a few of the random crates used, make this code quite hard to reason about.
Moore's law is dead, which means developers who won't waste CPU cycles will be in demand for the next generation of compute-intensive development. Big Data needs big compute, so obviously these kinds of articles are not only pertinent but are the way ahead.<p>I would not be surprised if we rewrite the past 20 years of code into Rust or Go or any other performant language.<p>Let's agree on this: you are smart and the author is right.<p>I'm sorry, but the people who are complaining are not seeing where the ball is going.
> the point of this post isn’t to compare highly-optimized Python to highly-optimized Rust. The point is to compare “standard-Jupyter-notebook” Python to highly-optimized Rust.<p>I guess the title gets clicks, but I'm curious how good Python can get. I'm under the impression pandas is pretty fast, despite being Python.
There’s an optimization technique I’ve used that I think could be useful for this kind of problem. It’s similar in spirit to the post.<p>The post deals a lot with both strings and maps with strings as keys. One idea is to intern these strings using an interner like [1], which returns string keys as monotonically increasing integers starting at 0. Then instead of a map you can just have a regular vector with the interned key as the index into the vector. This gives you the map functionality with very good performance.<p>[1] <a href="https://docs.rs/string-interner/latest/string_interner/" rel="nofollow noreferrer">https://docs.rs/string-interner/latest/string_interner/</a>
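For anyone unfamiliar with the pattern, here's a minimal std-only sketch of what an interner does (the linked crate provides the same interface with far more optimization, e.g. avoiding one allocation per string):

```rust
use std::collections::HashMap;

/// Minimal string interner: maps each distinct string to a dense
/// integer id (0, 1, 2, ...). Downstream maps keyed by string can
/// then become plain Vecs indexed by id.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, usize>,
    strings: Vec<String>,
}

impl Interner {
    /// Return the id for `s`, allocating a new one on first sight.
    fn intern(&mut self, s: &str) -> usize {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.strings.len();
        self.ids.insert(s.to_owned(), id);
        self.strings.push(s.to_owned());
        id
    }

    /// Look the string back up from its id.
    fn resolve(&self, id: usize) -> Option<&str> {
        self.strings.get(id).map(|s| s.as_str())
    }
}
```

The win is that after one interning pass over the input, every later "map lookup" on the hot path becomes a bounds-checked vector index instead of hashing a string.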
The more I use Rust, the more I fall in love with writing code again.<p>This is after many years of corporate Java/C#/Python/Ruby shops. The number of "developers" and "architects" who don't understand low-level concepts is draining and disappointing. I've worked with too many corporate "code monkeys" who masquerade as "engineers" or "architects". It has left me jaded at times.<p>The author of this post has given me hope that there are people out there willing to push the boundaries of their code with their years of experience and understanding of low-level computer concepts.<p>Definitely bookmarked and will use as a reference!
I love demo apps like this. You can see some idiomatic code and then watch it descend into chaos but you can see the pathway. Otherwise you sometimes see the end result and you don't know why it is what it is, but the pathway makes it comprehensible. Thanks for writing it.
Great engineering. Since the article asks about analytical tips for performance improvement: it looks like you should be able to get the complexity down substantially using something like least angle regression (i.e., what's used by efficient implementations of Lasso shrinkage estimators), or basically any approach where you successively add the most highly correlated variable to your set instead of checking all size-k subsets.
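A crude sketch of the "successively add the most correlated variable" idea, in Rust since that's what the post uses. This is plain greedy forward selection, not least angle regression proper (LARS moves coefficients continuously toward the residual), and as a heuristic it isn't guaranteed to find the best subset; the function names and data layout are my own:

```rust
/// Pearson correlation between two equal-length samples.
/// Returns NaN if either sample is constant.
fn correlation(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    let (mut cov, mut vx, mut vy) = (0.0, 0.0, 0.0);
    for (&a, &b) in x.iter().zip(y) {
        cov += (a - mx) * (b - my);
        vx += (a - mx) * (a - mx);
        vy += (b - my) * (b - my);
    }
    cov / (vx.sqrt() * vy.sqrt())
}

/// Greedy forward selection: repeatedly add the question whose
/// inclusion best correlates the subset's summed score with the
/// full-test total. Evaluates O(n * k) candidates instead of all
/// C(n, k) subsets. Assumes k <= questions.len() and no candidate
/// subset-sum is constant.
fn greedy_select(questions: &[Vec<f64>], totals: &[f64], k: usize) -> Vec<usize> {
    let mut chosen: Vec<usize> = Vec::new();
    let mut subset_sum = vec![0.0; totals.len()];
    for _ in 0..k {
        let mut best: Option<(usize, f64)> = None;
        for (i, q) in questions.iter().enumerate() {
            if chosen.contains(&i) {
                continue;
            }
            // Score the subset as it would look with question i added.
            let candidate: Vec<f64> =
                subset_sum.iter().zip(q).map(|(s, v)| s + v).collect();
            let c = correlation(&candidate, totals);
            if best.map_or(true, |(_, bc)| c > bc) {
                best = Some((i, c));
            }
        }
        let (i, _) = best.expect("k <= number of questions");
        for (s, v) in subset_sum.iter_mut().zip(&questions[i]) {
            *s += *v;
        }
        chosen.push(i);
    }
    chosen
}
```

Each round costs one pass over the remaining questions, so the whole selection is polynomial rather than combinatorial, at the price of possibly missing the true optimum.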
Obviously not the point of the post, but this doesn't seem like a very reasonable thing to calculate, statistically speaking.<p>Also, I wonder if there is some way to use branch-and-bound to look at fewer combinations.
Comparing highly optimized code (including total algorithm rewrite and relying on unsafe and SIMD operations) without doing the same on the other side is a pointless exercise.<p>It's like showing how much faster you can get your handcrafted assembly code to run vs a bash script.
The author found that HashMap::get was taking most of the time and didn't swap the hashing function??<p>It's well known that Rust's default HashMap hashing function is slow because it's designed to be safe against DoS [1]. If you want your HashMap to be faster, just use a faster hashing function like ahash[2], which can be up to 40% faster.<p>[1] <a href="https://doc.rust-lang.org/book/ch08-03-hash-maps.html#hashing-functions" rel="nofollow noreferrer">https://doc.rust-lang.org/book/ch08-03-hash-maps.html#hashin...</a><p>[2] <a href="https://docs.rs/ahash/0.7.4/ahash/" rel="nofollow noreferrer">https://docs.rs/ahash/0.7.4/ahash/</a>
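The swap is a one-liner because `HashMap` is generic over its hasher. A std-only sketch of the mechanism, using a hand-rolled FNV-1a as a stand-in (in practice you'd just plug in ahash's `RandomState` the same way; FNV-1a here is only to keep the example dependency-free, and like ahash without random seeding it gives up the DoS resistance of the default SipHash):

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// FNV-1a: a tiny, fast, non-DoS-resistant hash. Shown only to
/// illustrate swapping HashMap's hasher; prefer a vetted crate.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn finish(&self) -> u64 {
        self.0
    }
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= b as u64;
            self.0 = self.0.wrapping_mul(0x100_0000_01b3); // FNV prime
        }
    }
}

/// Drop-in HashMap alias with the custom hasher.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;
```

Everything else in the code stays the same, since `FastMap` has the full `HashMap` API; only the type annotation (or a `HashMap::with_hasher` call) changes.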