If I understand correctly:

CPUs do minimize latency by:

- Register renaming
- Out-of-order execution
- Branch prediction
- Speculative execution

They should not be oversubscribed, because a context switch means storing and loading registers, and the cache coherence protocols scale badly with more threads.

GPUs, on the other hand, maximize throughput by:

- A lot more memory bandwidth
- Smaller and slower cores, but many more of them
- Ultra-threading (the massively oversubscribed hyper-threading the video mentions)
- Context switching between wavefronts (basically the equivalent of a CPU thread), which just shifts the offset into the huge register file (no store and load)

The one area where CPUs are getting closer to GPUs is SIMD / SIMT. CPUs used to only be able to apply one instruction to a whole vector of elements, without masking (SIMD). With ARM SVE and x86 AVX-512 they can now (like GPUs) mask out individual lanes (SIMT-style) for both ALU operations and memory operations (gather loads / scatter stores).
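
To make the per-lane masking point concrete, here's a minimal sketch using standard AVX-512F intrinsics (assuming a machine and compiler that support them, e.g. gcc/clang with -mavx512f). The data, threshold, and increment are just illustrative values; the point is that the __mmask16 predicate lets individual lanes sit out of an operation, which classic fixed-width SIMD couldn't do.

```c
// Sketch: GPU-style per-lane predication on a CPU via AVX-512 masking.
// Adds 1.0f only to the elements of `data` greater than 0; the other
// lanes pass through unchanged instead of being computed and discarded.
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    float data[16];
    for (int i = 0; i < 16; ++i) data[i] = (float)(i - 8);  // -8 .. 7

    __m512 v         = _mm512_loadu_ps(data);
    __m512 threshold = _mm512_set1_ps(0.0f);

    // One predicate bit per lane: set where data[i] > 0.
    __mmask16 k = _mm512_cmp_ps_mask(v, threshold, _CMP_GT_OQ);

    // Masked add: active lanes get v + 1.0f, inactive lanes keep v.
    __m512 result = _mm512_mask_add_ps(v, k, v, _mm512_set1_ps(1.0f));

    _mm512_storeu_ps(data, result);
    for (int i = 0; i < 16; ++i) printf("%.1f ", data[i]);
    printf("\n");
    return 0;
}
```

The same __mmask16 type also feeds the masked gather/scatter intrinsics (e.g. _mm512_mask_i32gather_ps), which is what lets masked-off lanes skip memory accesses entirely rather than just having their results thrown away.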