Computing Adler32 Checksums at 41 GB/s

2 points by wooosh almost 3 years ago

1 comment

Nyan almost 3 years ago

Nicely done.

> There is still a lot of room to micro-optimize both the avx and avx64 implementation

I personally couldn't see much - perhaps aligning loads and deferring `_mm256_madd_epi16` are the only ideas that come to mind. What did you have in mind?
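For context, here is a minimal scalar sketch of Adler-32, the baseline that the SIMD versions under discussion speed up. It is not the article's code; the names `adler32_scalar`, `ADLER_MOD`, and `ADLER_NMAX` are made up for illustration. It shows the scalar analogue of the "defer the expensive step" idea: the modulo reduction runs once per 5552-byte chunk (the bound zlib uses) rather than once per byte.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD  65521u  /* largest prime below 2^16 */
#define ADLER_NMAX 5552u   /* zlib's bound: bytes that can be summed before
                              the 32-bit accumulators could overflow */

/* Hypothetical reference implementation, for illustration only. */
uint32_t adler32_scalar(const uint8_t *data, size_t len)
{
    uint32_t a = 1; /* running sum of bytes, seeded with 1 */
    uint32_t b = 0; /* running sum of the successive values of a */

    while (len > 0) {
        size_t chunk = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= chunk;

        /* Accumulate without reducing; safe for up to ADLER_NMAX bytes. */
        for (size_t i = 0; i < chunk; i++) {
            a += data[i];
            b += a;
        }
        data += chunk;

        /* Deferred reduction: once per chunk, not once per byte. */
        a %= ADLER_MOD;
        b %= ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Calling `adler32_scalar((const uint8_t *)"Wikipedia", 9)` returns `0x11E60398`, the commonly cited test value for that string, and an empty input returns 1.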
