numpy, for comparison:

    In [8]: vec = np.random.randint(-200, 200, (100_000_000,))
    In [9]: %timeit vec.sum()
    63 ms ± 4.81 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
The branchless C++ version took 125 ms, and the AVX-512 version took ~9 ms.
After the updates to the article, the takeaway seems to be "you can use AVX-512 dot product instructions to sum an array of bytes to int and get a 15% speedup over more straightforward vector code". That's an interesting point, but it's now buried among irrelevant details like the compressed representation, which mattered only to the article's original point.

It might make sense to resubmit a completely rewritten and pared-down version of the article. The dot product trick is neat; a sketch of it is below.
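For the curious, here's a minimal sketch of what that dot product trick might look like, assuming AVX-512 VNNI is available (the function name and structure are mine, not the article's):

    #include <immintrin.h>
    #include <cstdint>
    #include <cstddef>

    // Sum signed bytes into a 64-bit total with the AVX-512 VNNI
    // dot-product instruction (vpdpbusd). The trick: use a vector of
    // all-ones as the *unsigned* operand and the data as the *signed*
    // operand, so each instruction accumulates sums of 4 adjacent bytes
    // into the 16 int32 lanes. Build with e.g. -mavx512f -mavx512vnni.
    int64_t sum_bytes_vnni(const int8_t* data, size_t n) {
        const __m512i ones = _mm512_set1_epi8(1);
        __m512i acc = _mm512_setzero_si512();
        size_t i = 0;
        // Each int32 lane gains at most ±512 per iteration, so this is
        // safe for roughly 4M iterations (~256 MB of input); beyond that
        // you'd periodically flush acc into a 64-bit accumulator.
        for (; i + 64 <= n; i += 64) {
            __m512i v = _mm512_loadu_si512(data + i);
            acc = _mm512_dpbusd_epi32(acc, ones, v);
        }
        int64_t total = _mm512_reduce_add_epi32(acc); // fold 16 lanes
        for (; i < n; ++i) total += data[i];          // scalar tail
        return total;
    }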
I don’t believe in the bright future of AVX-512, and I don’t have the hardware either; my desktop PC has an AMD Zen 2 CPU.

Here’s how I would do that in AVX2: https://gist.github.com/Const-me/eed10bfe690b5804d2fc8266e0218981#file-simintegersavx2-cpp-L35-L105

I wonder how the performance compares to your version.
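For readers who don't want to click through, the usual AVX2 equivalent of the trick pairs vpmaddubsw with vpmaddwd. A condensed sketch of that general approach (simplified; not necessarily the gist's exact code):

    #include <immintrin.h>
    #include <cstdint>
    #include <cstddef>

    // Simplified sketch, not the gist's exact code: vpmaddubsw
    // multiplies unsigned bytes by signed bytes and sums adjacent pairs
    // into int16; vpmaddwd then widens pairs of int16 into int32 lanes.
    int64_t sum_bytes_avx2(const int8_t* data, size_t n) {
        const __m256i ones8 = _mm256_set1_epi8(1);
        const __m256i ones16 = _mm256_set1_epi16(1);
        __m256i acc = _mm256_setzero_si256();
        size_t i = 0;
        for (; i + 32 <= n; i += 32) {
            __m256i v = _mm256_loadu_si256((const __m256i*)(data + i));
            // All-ones as the unsigned operand lets the signed data
            // bytes pass through: byte pairs -> int16 partial sums.
            __m256i s16 = _mm256_maddubs_epi16(ones8, v);
            // int16 pairs -> int32 lanes, accumulated across iterations.
            acc = _mm256_add_epi32(acc, _mm256_madd_epi16(s16, ones16));
        }
        // Horizontal reduction of the 8 int32 lanes.
        __m128i s = _mm_add_epi32(_mm256_castsi256_si128(acc),
                                  _mm256_extracti128_si256(acc, 1));
        s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(1, 0, 3, 2)));
        s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(2, 3, 0, 1)));
        int64_t total = _mm_cvtsi128_si32(s);
        for (; i < n; ++i) total += data[i];  // scalar tail
        return total;
    }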
Ehh. I like playing with vectors and have a weird coding style.

My AVX version:

https://github.com/schmide/sumint/blob/main/sumint.cpp