On Summing Integers

26 points by electricshampo1, about 4 years ago

6 comments

orf, about 4 years ago
numpy, for comparison:

    In [8]: vec = np.random.randint(-200, 200, (100_000_000,))
    In [9]: %timeit vec.sum()
    63 ms ± 4.81 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

The branchless C++ version took 125 ms, and the AVX-512 version took ~9 ms.
tom_mellior, about 4 years ago
After the updates to the article, the takeaway seems to be "you can use AVX-512 dot product instructions to sum an array of bytes to int and get a 15% speedup over more straightforward vector code". That's an interesting point, but it's now well-hidden among irrelevant things like the compressed representation that was only relevant to the article's original point.

It might make sense to resubmit a completely rewritten and pared-down version of the article. The dot product trick is neat.
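To illustrate the dot product trick mentioned above, here is a scalar Python model of the idea, not the intrinsic code from the article. It assumes the instruction in question behaves like the AVX-512 VNNI byte dot product (`vpdpbusd`), which multiplies groups of four byte pairs and adds the four products into a 32-bit lane; that choice of instruction is an assumption, and `dot_accumulate` and `sum_bytes` are hypothetical names. Multiplying the data by a vector of ones turns the dot product into a widening sum.

```python
LANES = 16           # a 512-bit register holds sixteen 32-bit lanes
BYTES_PER_LANE = 4   # each lane consumes four byte pairs per step
BLOCK = LANES * BYTES_PER_LANE  # 64 bytes per register


def dot_accumulate(acc, ones, data):
    """Model one dot-product step: acc[i] += sum of four byte products."""
    return [
        acc[i] + sum(ones[4 * i + k] * data[4 * i + k] for k in range(BYTES_PER_LANE))
        for i in range(LANES)
    ]


def sum_bytes(data):
    """Sum signed byte values by dotting each 64-byte block with all ones."""
    ones = [1] * BLOCK
    acc = [0] * LANES
    # Pad with zeros so the last block is full; zeros do not change the sum.
    padded = list(data) + [0] * (-len(data) % BLOCK)
    for off in range(0, len(padded), BLOCK):
        acc = dot_accumulate(acc, ones, padded[off:off + BLOCK])
    # Final horizontal reduction across the sixteen lanes.
    return sum(acc)
```

This model uses Python's unbounded integers, so it ignores the 32-bit lane overflow that real hardware would have to account for on very long inputs.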
Const-me, about 4 years ago
I don't believe in a bright future for AVX512, and I don't have the hardware either: my desktop PC has an AMD Zen 2 CPU.

Here's how I would do that in AVX2: https://gist.github.com/Const-me/eed10bfe690b5804d2fc8266e0218981#file-simintegersavx2-cpp-L35-L105

I wonder how the performance compares to your version.
schmide, about 4 years ago
Ehh. I like playing with vectors and have a weird coding style.

My AVX version:

https://github.com/schmide/sumint/blob/main/sumint.cpp
plesner, about 4 years ago
Could you not just do

sum(byteVals) + sum(intVals) + 128 * len(intVals)?
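The formula above can be checked with a small sketch. It assumes a hypothetical encoding, not confirmed by the thread: each value that fits in -127..127 is stored directly as a byte, and every other value is stored in `int_vals` with a sentinel byte of -128 left in its slot. Under that assumption, summing all bytes counts -128 once per exception, which the `128 * len(int_vals)` term cancels. The names `encode`, `branchy_sum`, and `branchless_sum` are illustrative.

```python
SENTINEL = -128  # assumed marker byte meaning "real value lives in int_vals"


def encode(values):
    """Split values into a byte stream plus a list of out-of-range exceptions."""
    byte_vals, int_vals = [], []
    for v in values:
        if -127 <= v <= 127:
            byte_vals.append(v)
        else:
            byte_vals.append(SENTINEL)
            int_vals.append(v)
    return byte_vals, int_vals


def branchy_sum(byte_vals, int_vals):
    """Reference decoder: test every byte for the sentinel."""
    total = 0
    exceptions = iter(int_vals)
    for b in byte_vals:
        total += next(exceptions) if b == SENTINEL else b
    return total


def branchless_sum(byte_vals, int_vals):
    """The suggested formula: three independent reductions, no per-element test."""
    return sum(byte_vals) + sum(int_vals) + 128 * len(int_vals)
```

For example, `encode([5, -3, 200, -128, 127, -1000])` yields three exceptions, and both sum functions return -799, the same as summing the original list.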
electricshampo1, about 4 years ago
NOTE: The article has been updated and expanded since the initial post.