It feels crazy to me that Intel spent years dedicating die space on consumer SKUs to "make fetch happen" with AVX-512, and now that more and more libraries are finally using it, now that Intel's goal has been achieved, they have removed AVX-512 from their consumer SKUs.

It isn't that AMD has better AVX-512 support, which would be an impressive upset on its own. It is simply that AMD has AVX-512 on consumer CPUs at all, because Intel walked away from their own investment.
Instead of doing 4 comparisons, one against each of the characters `\n`, `\r`, `;` and `"`, followed by 3 OR operations, a common trick is to do 1 shuffle, 1 comparison and 0 OR operations. I blogged about this trick: https://stoppels.ch/2022/11/30/io-is-no-longer-the-bottleneck-part-2.html (Trick 2)

Edit: they do make use of ternary logic to avoid one OR operation, which is nice. Basically (a | b | c) | d is computed using one `vpternlogd` and one `vpor`, respectively.
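For readers who haven't seen it, here is a minimal sketch of the trick in C with SSE intrinsics (Sep itself is C# and works on wider registers, so this shows the idea, not its code). It relies on the four target bytes having distinct low nibbles:

    #include <immintrin.h>
    #include <stdint.h>

    /* Set bit i of the result when in[i] is one of '\n', '\r', ';', '"'.
       One pshufb + one pcmpeqb instead of four pcmpeqb and three por.
       Works because the targets have distinct low nibbles:
       '"' = 0x22, '\n' = 0x0A, ';' = 0x3B, '\r' = 0x0D. */
    static inline uint32_t classify16(__m128i in) {
        /* table[n] = the target byte whose low nibble is n; every other
           entry holds a byte whose own low nibble != n, so it can never
           compare equal to an input byte that selected it. */
        const __m128i table = _mm_setr_epi8(
            1, 0, '"', 0, 0, 0, 0, 0, 0, 0, '\n', ';', 0, '\r', 0, 0);
        /* pshufb indexes the table by each input byte's low nibble, and
           zeroes lanes where the input byte is >= 0x80, which can't
           produce a false match either. */
        __m128i shuffled = _mm_shuffle_epi8(table, in);
        return (uint32_t)_mm_movemask_epi8(_mm_cmpeq_epi8(shuffled, in));
    }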
Take that, Intel, and your "let's remove AVX-512 from every consumer CPU because we want to put slow cores on every single one of them, and let's not even consider multi-pumping it".
This is a staggering ~3x improvement in just under 2 years since Sep was introduced in June 2023.

You can't claim this when you've also made a huge hardware jump.
If we are lucky, we will see Arthur Whitney get triggered and post either a one-liner beating this, or a Shakti engine update and a one-liner beating this. Progress!
The article doesn't clearly define what this 21 GB/s code is doing.

- What format exactly is it parsing? (E.g., does the dialect of CSV support quoted commas, or is the parser merely looking for commas and newlines?)

- What is the parser doing with the result (i.e., populating a data structure, etc.)?
Considering the non-standard nature of CSV, quoting throughput numbers in bytes per second is meaningless. It makes sense for JSON, since you know what the output is going to be (e.g., floats, integers, strings, hashmaps, etc.).
With CSV you only get strings for each column, so 21 GB/s of comma splitting would be the pinnacle of meaninglessness. Like, okay, but I still have to parse the stringy data, so what gives? Yes, the blog post does reference float parsing, but a single float per line would count as "CSV".

Now someone might counter and say that I should just read the README.md, but there that suspicion simply turns out to be true: they don't actually do any escaping or quoting by default, which makes the quoted numbers heavily misleading advertising.
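To make the quoting point concrete, here is a hedged sketch in plain C (example data invented) of the state tracking that RFC 4180-style quoting forces on a splitter; a parser that only hunts for commas and newlines would cut the first field in half:

    #include <stdbool.h>
    #include <stdio.h>

    /* A comma inside a quoted field is data, not a delimiter, so a
       conforming splitter must carry quote state across every byte
       (this sketch ignores escaped quotes, i.e. "" inside a field). */
    int main(void) {
        const char *row = "\"Smith, John\",42";  /* naive split: 3 fields */
        bool in_quotes = false;
        for (const char *p = row; *p; p++) {
            if (*p == '"')
                in_quotes = !in_quotes;
            else if (*p == ',' && !in_quotes)
                printf(" | ");        /* a real field boundary */
            else
                putchar(*p);
        }
        putchar('\n');  /* prints: Smith, John | 42 */
        return 0;
    }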
There are very good alternatives to CSV for storing and exchanging floating-point and other data.

The HDF5 format is very good and allows far more structure in your files, as well as metadata and different types of lossless and lossy compression.
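For anyone who hasn't used it, a minimal sketch with the HDF5 C API (the file and dataset names here are invented for illustration; real use would add attributes for metadata and a creation property list for chunking and compression):

    #include <hdf5.h>

    int main(void) {
        double values[4] = {1.0, 2.5, 3.25, 4.125};
        hsize_t dims[1] = {4};

        /* Create a file, a 1-D dataspace, and a double dataset, then write. */
        hid_t file  = H5Fcreate("data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
        hid_t space = H5Screate_simple(1, dims, NULL);
        hid_t dset  = H5Dcreate2(file, "/values", H5T_NATIVE_DOUBLE, space,
                                 H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, values);

        H5Dclose(dset);
        H5Sclose(space);
        H5Fclose(file);
        return 0;
    }

Compile with the h5cc wrapper that ships with the library.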
In my experience it's been difficult to get substantial gains from custom SIMD code compared to modern compiler auto-vectorization, but to be fair that was with more vector-friendly code than JSON parsing.
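For context, a hedged example of the kind of loop I mean: straight-line arithmetic over arrays, which GCC and Clang already turn into packed SIMD at -O3 without any intrinsics. The table-driven byte classification in parsers like this one is exactly what auto-vectorizers don't discover on their own:

    /* The auto-vectorizer's home turf: with -O3 (plus -mavx2 or similar),
       GCC and Clang emit packed SIMD for this loop with no hand-holding. */
    void saxpy(float *restrict y, const float *restrict x, float a, int n) {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }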