TechEcho
GPU Puzzles

356 points by cgadski 8 months ago

12 comments

srush 8 months ago
I made these a couple of years ago as a teaching exercise for https://minitorch.github.io/. At the time the resources for doing anything on GPUs were pretty sparse and the NVIDIA docs were quite challenging.

These days there are great resources for going deep on this topic. The CUDA-mode org is particularly great, both their video series and PMPP reading groups.
aleinin 8 months ago
I recently ported this to Metal for Apple Silicon computers. If you're interested in learning GPU programming on an M-series Mac, I think this is a very accessible option. Thanks to Sasha for making this!

https://github.com/abeleinin/Metal-Puzzles
fifilura 8 months ago
I think this course is also relevant for some deeper context.

https://gfxcourses.stanford.edu/cs149/fall23/lecture/dataparallel/
saagarjha 8 months ago
When working on GPU code there are really two parts to it, I feel. One is "how do I even write code for the GPU", which this tutorial seems to cover. The second is "how do I write *good* code for the GPU", which seems like it would need another resource or an expansion of this one.
ismailmaj 8 months ago
It would be nice if the puzzles natively supported C++ CUDA.
czhu12 8 months ago
I loved the tensor puzzles you made. I spent the morning revisiting and liking all the videos you've made on YouTube. Hope for many more in the future!
throwaway314155 8 months ago
Either puzzle 4 has a bug in it or I'm losing my mind. (Possible solution below, so don't read on if you want to go in fresh.)

    # FILL ME IN (roughly 2 lines)
    if local_i < size and local_j < size:
        out[local_i][local_j] = a[local_i][local_j] + 10

Results in a failed assertion:

    AssertionError: Wrong number of indices

But the test cell beneath it will still pass?
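
For anyone reading without the notebook open, here is a rough plain-Python sketch of what the snippet above is meant to compute: one simulated thread per (local_i, local_j) pair, with a guard so that threads falling outside the size x size array do nothing. The name map_2d_guarded, the threads_per_block parameter, and the nested-list arrays are assumptions made for illustration. It does not use the puzzle library, so it says nothing about how that library's arrays expect to be indexed or why the assertion fires.

```python
def map_2d_guarded(a, threads_per_block):
    """Simulate a 2D block of threads that each add 10 to one element of `a`.

    `a` is a square nested list. `threads_per_block` may be larger than the
    array side, which is why the bounds guard is needed.
    """
    size = len(a)
    out = [[0] * size for _ in range(size)]
    # Each (local_i, local_j) pair stands in for one GPU thread.
    for local_i in range(threads_per_block):
        for local_j in range(threads_per_block):
            # Guard: threads outside the array bounds must not touch memory.
            if local_i < size and local_j < size:
                out[local_i][local_j] = a[local_i][local_j] + 10
    return out


a = [[0, 1], [2, 3]]
# Launch more simulated threads (3 x 3) than there are elements (2 x 2).
result = map_2d_guarded(a, threads_per_block=3)
```

Dropping the guard in this sketch makes the extra threads raise an IndexError, which is the plain-Python analogue of an out-of-bounds access on the device.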
wmil 8 months ago
So I'm used to working with lists and maps, which doesn't really track well with tackling problems on thousands of cores.

Is the usual strategy to worry less about repeating calculations and just use brute force to tackle the problem?

Is there a good resource to read about how to tackle problems in an extremely parallel way?
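
One standard pattern that may make the shift from lists and maps more concrete is a tree reduction: instead of folding a list left to right, pairs of elements are combined in rounds, and every combination within a round is independent of the others. Below is a minimal plain-Python sketch of that idea. The name tree_reduce_sum is hypothetical, the inner loop only stands in for threads that would run at the same time, and nothing here runs on a GPU.

```python
def tree_reduce_sum(values):
    """Sum `values` in rounds of independent pairwise additions.

    Returns (total, rounds). The additions inside one round touch disjoint
    elements, so each could be done by a separate thread.
    """
    data = list(values)
    n = len(data)
    stride = 1
    rounds = 0
    while stride < n:
        # One round: element i absorbs its partner at i + stride.
        # No two iterations of this loop read or write the same slot.
        for i in range(0, n - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
        rounds += 1
    total = data[0] if data else 0
    return total, rounds


total, rounds = tree_reduce_sum(range(1, 9))
```

For eight inputs this does the same seven additions as a sequential loop, but arranges them into three rounds, so the point of the sketch is restructuring the work rather than repeating it.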
dejanig 8 months ago
Wow, it looks really interesting. I will definitely look into it.
az226 8 months ago
Can I hire you to make Flash Attention a reality for V100?
xandrius 8 months ago
Looks nice and fun but the "see-through" font for the titles in the screenshots gives me some deep and primordial unease, not sure why.
867-5309 8 months ago
seems like an opportune moment to gift a plug for bitcoin puzzles, namely BTC32 / 1000 BTC Challenge[1]

pools are in dire need of CUDA developers

[1] https://bitcointalk.org/index.php?topic=1306983.0