科技回声

11 comments

bigcheesegs, about 5 years ago

> Opening the binary with Binary Ninja revealed that clang had already managed to leverage the SSE registers.

x86-64 uses SSE registers for all floating point operations. I'm not sure that the author realized that they were looking at an -O0 binary. -O0 does not do vectorization (or anything else for that matter).
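To see this for yourself, here is a minimal sketch. The function name `scale_add` is made up for illustration and is not from the article. Compiling it for x86-64 with something like `clang++ -O0 -S` should give scalar instructions such as `mulss` and `addss` working on `%xmm` registers. Those are SSE registers, but the code is not vectorized; packed forms such as `mulps` are what would suggest vectorization.

```cpp
// Hypothetical helper for illustration; not part of the original card.cpp.
// Plain scalar float arithmetic: one multiply, one add.
float scale_add(float a, float b, float c) {
    return a * b + c;
}
```

Seeing `xmm` in a disassembly therefore says nothing about the optimization level on its own.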

rwmj, about 5 years ago

Is there a mistake in the original code on the right hand side? I get:

    card.cpp:16:2: error: ‘g’ was not declared in this scope
       16 | <g;p)t=p,n=v(0,0,1),m=1;for(i k=19;k--;)
          |  ^

Edit: Yes there is. The ‘<g;’ seems like it should have been the single character ‘<’, perhaps a corrupted HTML escape.

danielscrubs, about 5 years ago

Well there goes my weekend. :) I also tried to optimise the code, and got great speed increases just by making the vector methods constexpr, and could quickly see that rand was problematic. Then Fabien releases this post with nvcc, which is on another level. Really great blog post!
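For anyone wanting to try the same two tweaks, a rough sketch follows. Everything in it is an assumption for illustration: `Vec3` and `XorShift32` are made-up names, not the card's identifiers, and xorshift32 is just one well-known cheap generator swapped in for `rand`, not necessarily what the commenter or Fabien used.

```cpp
#include <cstdint>

// Hypothetical 3-component vector; names are illustrative, not the card's.
struct Vec3 {
    float x, y, z;
    constexpr Vec3(float a, float b, float c) : x(a), y(b), z(c) {}
    constexpr Vec3 operator+(Vec3 r) const { return Vec3(x + r.x, y + r.y, z + r.z); }
    constexpr Vec3 operator*(float s) const { return Vec3(x * s, y * s, z * s); }
    constexpr float dot(Vec3 r) const { return x * r.x + y * r.y + z * r.z; }
};

// Checked by the compiler, so this line adds no work at run time.
static_assert(Vec3(1, 2, 3).dot(Vec3(4, 5, 6)) == 32.0f, "constexpr dot product");

// Small xorshift32 generator with caller-owned state (state must be nonzero).
struct XorShift32 {
    std::uint32_t state;
    // Returns a float in [0, 1).
    float next() {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return (state >> 8) * (1.0f / 16777216.0f);  // top 24 bits / 2^24
    }
};
```

Keeping the generator state in a plain struct means each thread can own its own copy, which is one plausible reason a replacement for `rand` helps.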

ectoplasmaboiii, about 5 years ago

A Ray-Tracer in 7 Lines of K: http://nsl.com/k/ray/ray.k

blondin, about 5 years ago

i love what fabien is doing with his website! also been experimenting with pure html with an itsy-bitsy amount of css. for months now i wondered how to display code without involving javascript. that textarea is so perfect! and i bet you when you copy and paste into word or your todo list application they won't even try to be "smart" about knowing what "rich text" is... that's very cool.

rrss, about 5 years ago

This was really fun to read, thanks fsanglard.

> This is correlated with the warning nvcc issued. Because the raytracer uses recursion, it uses a lot of stacks. So much actually that the SM cannot keep more than a few alive.

Stack frame size / "local memory" size doesn't actually directly limit occupancy. There's a list of the limiters here: https://docs.nvidia.com/gameworks/content/developertools/desktop/analysis/report/cudaexperiments/kernellevel/achievedoccupancy.htm

I'm not sure why the achieved occupancy went up after removing the recursion, but I'd guess it was something like the compiler was able to reduce register usage.
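For readers wondering what "removing the recursion" looks like in general, here is a toy sketch in plain C++. It is not the article's kernel: the constants and the `local_light` stand-in are invented for illustration. The idea is that a bounce loop can carry a running attenuation factor instead of relying on a chain of stack frames.

```cpp
// Toy model, not the article's kernel: every bounce adds some local light and
// reflects a fixed fraction of whatever the next bounce returns.
constexpr int   kMaxBounces   = 4;     // hypothetical depth limit
constexpr float kReflectivity = 0.5f;  // hypothetical mirror strength

// Stand-in for "shade the surface hit at this bounce".
inline float local_light(int bounce) { return 1.0f / (1 + bounce); }

// Recursive form: each level needs its own stack frame.
float shade_recursive(int bounce) {
    if (bounce >= kMaxBounces) return 0.0f;
    return local_light(bounce) + kReflectivity * shade_recursive(bounce + 1);
}

// Iterative form: carry the running attenuation in a local variable instead.
float shade_iterative() {
    float color = 0.0f;
    float attenuation = 1.0f;
    for (int bounce = 0; bounce < kMaxBounces; ++bounce) {
        color += attenuation * local_light(bounce);
        attenuation *= kReflectivity;
    }
    return color;
}
```

Both forms compute the same sum up to floating point rounding; the loop just makes the amount of live state explicit, which is the kind of thing that can change what a compiler does with registers.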

fegu, about 5 years ago

Almost into passable frame rate territory. Next version could be business card VR :)

ntry, about 5 years ago

101,000ms to 150ms is a phenomenal speedup. Props

mianos, about 5 years ago

I wonder if the tool-chain would be better under Linux? It is kind of funny the way Windows development has always been a hassle. Mscvars.bat and such have been there for at least 20 years.

tomsmeding, about 5 years ago

Nice work on the GPU programming, and the multicore before that, but I'm mystified why going from -O0 to -O3 is named an "optimisation". All respect for Fabien, but running code that's supposed to run faster than a snail implies -O2 or -O3, unless you're debugging and require -O0 for reasonable output. (In practice, -O3 often doesn't give much performance over -O2, despite increasing compile times.)

The initial time is not 101.8 seconds, it's 11.6 seconds.
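To put numbers on that, a quick sketch using only the figures quoted in this thread (101.8 s, 11.6 s, and the 150 ms final time). It assumes 11.6 s is the -O3 time, which is how this comment reads; the constant and function names are made up for illustration.

```cpp
// Speedup = old time / new time. Figures are the ones quoted in this thread.
constexpr double speedup(double before_ms, double after_ms) {
    return before_ms / after_ms;
}

constexpr double kUnoptimizedMs = 101800.0;  // -O0 build, 101.8 s
constexpr double kOptimizedMs   = 11600.0;   // assumed -O3 build, 11.6 s
constexpr double kGpuMs         = 150.0;     // final time quoted in the thread

constexpr double kVersusO0 = speedup(kUnoptimizedMs, kGpuMs);  // about 679x
constexpr double kVersusO3 = speedup(kOptimizedMs, kGpuMs);    // about 77x
```

So the choice of baseline changes the headline figure by roughly a factor of nine.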

评论 #23123307 未加载

评论 #23123697 未加载

lonk, about 5 years ago

If you increase the resolution you can put more code on the business card :P

Revisiting the Business Card Raytracer
