Show HN: Accelerate SHA256 Computations in Go Using AVX512 instructions

74 points by y4m4 over 7 years ago

6 comments

wolf550e over 7 years ago
A recent blog post by Vlad Krasnov, author of a bunch of the crypto assembly code in openssl and in golang, about frequency scaling when using AVX-512 making it not worth it: https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/

He doesn't like the title of the OP and provided links:

> Very misleading title. Could just as well name it "accelerate sha256 up to 134x". You need to compare apples to apples. If AVX2 was used in the same way AVX512 is used, the speedup would be 2X at most. Reminds me of two of my papers https://eprint.iacr.org/2012/371.pdf https://eprint.iacr.org/2012/067.pdf

(from https://twitter.com/thecomp1ler/status/940724783804645376)

EDIT: Thanks 'delhanty !
eloff over 7 years ago
This is assembly, not pure Go, but it doesn't use CGO, which is probably what they mean.

Intel Cannon Lake processors will support the SHA instruction extensions (currently available only on Goldmont). It will be interesting to see how that compares with this approach of running 16 SHA computations in parallel. You would be able to get rid of the scheduling overhead of having to first queue up 16 SHA calculations from other threads.
foobarbazetc over 7 years ago
One thing to note is that the benchmark is running on a Skylake Platinum chip, which has two AVX512 FMA units.

You need a Gold 6000 series and above to see any benefit from AVX512. In most other cases the CPU throttles down some insane amount and there's little to no benefit.
ComputerGuru over 7 years ago
I blogged about the SHA instruction support in the x86_64 ISA a few months back; it'll be nice to see it actually happen: https://neosmart.net/blog/2017/will-amds-ryzen-finally-bring-sha-extensions-to-intels-cpus/
dragonfax over 7 years ago
Isn't this the kind of thing that was missing from the "go on different platforms" benchmark a little while back? Go has crazy optimizations for encryption algorithms on Intel, while ARM was severely lacking.
mikebenfield over 7 years ago
Possibly I'm confused, but in what sense is this "in Pure Go"?