TechEcho

SIMD, SIMT, SMT - parallelism in NVIDIA GPUs

47 points by PopaL almost 13 years ago

2 comments

zheng almost 13 years ago
A very good and reasonably approachable discussion of the pros and cons of how CUDA programming is actually realized in hardware. The explanation of how GPUs handle context switching is particularly thoughtful and enlightening. It took me a long time to figure this out a couple of months ago; a guide like this would have saved me a few nights.

I was surprised that the author didn't once use the term CUDA, though: they even discuss actual syntax from it, but never mention the language (extension) by name.
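(For context, the CUDA syntax the comment alludes to looks roughly like the sketch below: a kernel run by many threads in the SIMT model, one element per thread. The kernel name and launch sizes are illustrative, not taken from the article.)

```cuda
// Minimal CUDA sketch: each thread handles one array element (SIMT).
// Names (vecAdd, n, d_a/d_b/d_c) are hypothetical, for illustration only.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the tail block
        c[i] = a[i] + b[i];
}

// Launch: a grid of blocks; each block's threads are scheduled in warps.
// vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
```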
[Comment #4089592 not loaded]
radarsat1 almost 13 years ago
Very nice article. In my limited experience with OpenCL programming, the most difficult thing is understanding how memory access patterns affect performance. It's not made any easier by the fact that the best pattern may differ from one platform to another.

I wonder if what's needed is a higher-level representation that can compile down to the best access pattern for the given hardware (and something that can try several access patterns for your problem and pick the most efficient one). GPU programming is still quite new, so I guess such a tool is bound to show up eventually.

Even if it couldn't handle *all* possible situations, such a tool would still be useful, even if you end up having to drop down to the CUDA/OpenCL level for certain problems that are too difficult to express declaratively.
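(The access-pattern issue the comment describes is usually called coalescing; a rough sketch of the contrast, with hypothetical kernel names, might look like this. Neighbouring threads in a warp reading adjacent words can be served by one wide memory transaction; strided reads scatter into many transactions.)

```cuda
// Coalesced vs. strided global-memory access (illustrative sketch).
// copyCoalesced: thread i reads word i -- adjacent threads, adjacent words,
// so a warp's loads combine into few wide transactions.
__global__ void copyCoalesced(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];
}

// copyStrided: adjacent threads read words `stride` apart, so a warp's
// loads scatter across memory. (Index overflow is ignored for brevity.)
__global__ void copyStrided(const float *in, float *out, int n, int stride) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[(i * stride) % n];
}
```

On most GPUs the two kernels move the same data but can differ in throughput by a large factor, which is exactly why the "best pattern per platform" problem the comment raises is hard.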