It feels crazy to me that Intel spent years dedicating die space on consumer SKUs to "make fetch happen" with AVX-512, and now that more and more libraries are finally using it, now that Intel's goal has been achieved, they have removed AVX-512 from their consumer SKUs.

It isn't that AMD has better AVX-512 support, which would be an impressive upset on its own. It is simply that AMD has AVX-512 on consumer CPUs at all, because Intel walked away from their own investment.
Instead of doing 4 comparisons, one against each of the characters `\n`, `\r`, `;` and `"`, followed by 3 OR operations, a common trick is to do 1 shuffle, 1 comparison and 0 OR operations. I blogged about this trick: https://stoppels.ch/2022/11/30/io-is-no-longer-the-bottleneck-part-2.html (Trick 2)

Edit: they do make use of ternary logic to avoid one OR operation, which is nice. Basically (a | b | c) | d is computed using one `vpternlogd` and one `vpor`, respectively.
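For readers who haven't seen it, here is a minimal sketch of the trick in C with SSE intrinsics (Sep itself is C# and works on wider registers, so this shows the idea, not its code). It relies on the four target bytes having distinct low nibbles:

    #include <immintrin.h>
    #include <stdint.h>

    /* Set bit i of the result when in[i] is one of '\n', '\r', ';', '"'.
       One pshufb + one pcmpeqb instead of four pcmpeqb and three por.
       Works because the targets have distinct low nibbles:
       '"' = 0x22, '\n' = 0x0A, ';' = 0x3B, '\r' = 0x0D. */
    static inline uint32_t classify16(__m128i in) {
        /* table[n] = the target byte whose low nibble is n; every other
           entry holds a byte whose own low nibble != n, so it can never
           compare equal to an input byte that selected it. */
        const __m128i table = _mm_setr_epi8(
            1, 0, '"', 0, 0, 0, 0, 0, 0, 0, '\n', ';', 0, '\r', 0, 0);
        /* pshufb indexes the table by each input byte's low nibble, and
           zeroes lanes where the input byte is >= 0x80, which can't
           produce a false match either. */
        __m128i shuffled = _mm_shuffle_epi8(table, in);
        return (uint32_t)_mm_movemask_epi8(_mm_cmpeq_epi8(shuffled, in));
    }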
Take that, Intel, and your "let's remove AVX-512 from every consumer CPU because we want to put slow cores on every single one of them, and let's not even consider multi-pumping it".
This is a staggering ~3x improvement in just under 2 years since Sep was introduced in June 2023.

You can't claim this when you've also made a huge hardware jump.
If we are lucky, we will see Arthur Whitney get triggered and post either a one-liner beating this, or a Shakti engine update and a one-liner beating this. Progress!
The article doesn't clearly define what this 21 GB/s code is doing.

- What format exactly is it parsing? (E.g., does the dialect of CSV support quoted commas, or is the parser merely looking for commas and newlines?)

- What is the parser doing with the result (i.e., populating a data structure, etc.)?
Considering the non-standard nature of CSV, quoting throughput numbers in bytes per second is meaningless. It makes sense for JSON, since you know what the output is going to be (e.g., floats, integers, strings, hashmaps, etc.).
With CSV you only get strings for each column, so 21 GB/s of comma splitting would be the pinnacle of meaninglessness. Like, okay, but I still have to parse the stringy data, so what gives? Yes, the blog post does reference float parsing, but a single float per line would count as "CSV".

Now someone might counter and say that I should just read the README.md, but there that suspicion simply turns out to be true: they don't actually do any escaping or quoting by default, which makes the quoted numbers heavily misleading advertising.
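To make the quoting point concrete, here is a hedged sketch in plain C (example data invented) of the state tracking that RFC 4180-style quoting forces on a splitter; a parser that only hunts for commas and newlines would cut the first field in half:

    #include <stdbool.h>
    #include <stdio.h>

    /* A comma inside a quoted field is data, not a delimiter, so a
       conforming splitter must carry quote state across every byte
       (this sketch ignores escaped quotes, i.e. "" inside a field). */
    int main(void) {
        const char *row = "\"Smith, John\",42";  /* naive split: 3 fields */
        bool in_quotes = false;
        for (const char *p = row; *p; p++) {
            if (*p == '"')
                in_quotes = !in_quotes;
            else if (*p == ',' && !in_quotes)
                printf(" | ");        /* a real field boundary */
            else
                putchar(*p);
        }
        putchar('\n');  /* prints: Smith, John | 42 */
        return 0;
    }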
There are very good alternatives to CSV for storing and exchanging floating-point and other data.

The HDF5 format is very good and allows far more structure in your files, as well as metadata and different types of lossless and lossy compression.
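For anyone who hasn't used it, a minimal sketch with the HDF5 C API (the file and dataset names here are invented for illustration; real use would add attributes for metadata and a creation property list for chunking and compression):

    #include <hdf5.h>

    int main(void) {
        double values[4] = {1.0, 2.5, 3.25, 4.125};
        hsize_t dims[1] = {4};

        /* Create a file, a 1-D dataspace, and a double dataset, then write. */
        hid_t file  = H5Fcreate("data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
        hid_t space = H5Screate_simple(1, dims, NULL);
        hid_t dset  = H5Dcreate2(file, "/values", H5T_NATIVE_DOUBLE, space,
                                 H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, values);

        H5Dclose(dset);
        H5Sclose(space);
        H5Fclose(file);
        return 0;
    }

Compile with the h5cc wrapper that ships with the library.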
In my experience it's been difficult to get substantial gains from custom SIMD code compared to modern compiler auto-vectorization, but to be fair that was with more vector-friendly code than JSON parsing.
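For context, a hedged example of the kind of loop I mean: straight-line arithmetic over arrays, which GCC and Clang already turn into packed SIMD at -O3 without any intrinsics. The table-driven byte classification in parsers like this one is exactly what auto-vectorizers don't discover on their own:

    /* The auto-vectorizer's home turf: with -O3 (plus -mavx2 or similar),
       GCC and Clang emit packed SIMD for this loop with no hand-holding. */
    void saxpy(float *restrict y, const float *restrict x, float a, int n) {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }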