Summing ASCII encoded integers on Haswell at almost the speed of memcpy

132 points by iliekcomputers 10 months ago

4 comments

ashleyn 10 months ago
Knew it'd be SIMD. Such an underrated feature of modern CPUs. Hopefully with cross-platform SIMD in Rust and Golang, it'll be more commonly used.

Thinking parallel gets you enormous speed benefits for any number of arbitrary algorithms: https://mcyoung.xyz/2023/11/27/simd-base64/
dist1ll 10 months ago
First time I've heard of HighLoad. It seems really interesting at first glance. I personally find SIMD and ISA/μarch-specific optimizations more rewarding than pure algorithmic challenges (Codeforces and such).

Though Haswell seems like a pretty obsolete platform to optimize for at this point. Even Skylake will be a decade old next year.
wolf550e 10 months ago
I think the trick with dereferencing unmapped memory is cool, but I only really care about techniques that work reliably and that I can use in production.
raldi 10 months ago
Is there an explanation of why it sometimes gives the wrong answer?