I suspect the author is seeing very shallow trees on Nvidia because the lower levels may be handled entirely behind the scenes:<p><a href="https://forums.developer.nvidia.com/t/extracting-bvh-from-optix-to-manually-traverse/235250" rel="nofollow">https://forums.developer.nvidia.com/t/extracting-bvh-from-op...</a><p>As someone who deals with BVHs a lot for ray intersection, I find it pretty difficult to believe that leaf nodes with that number of primitives will be anywhere near performant, even with fast dedicated hardware like the RT cores.<p>It's true that the Nvidia cards are faster at ray/primitive intersection than at ray/box tests, but I don't believe the ratio is anywhere near the 100x that I suspect would be needed if the BVHs were that shallow and the leaf nodes that large.
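The doubt above about very large leaf nodes can be made concrete with a toy cost model. This is a rough sketch and not a description of how any real hardware behaves. It assumes a balanced binary BVH, a ray that follows a single root-to-leaf path with two box tests per level, and a ray that tests every primitive in the leaf it reaches. The function names, the primitive count, and the leaf sizes are all hypothetical.

```python
import math


def tree_depth(num_prims, leaf_size):
    """Depth of a balanced binary BVH with `leaf_size` primitives per leaf."""
    num_leaves = math.ceil(num_prims / leaf_size)
    return math.ceil(math.log2(num_leaves))


def ray_cost(num_prims, leaf_size, box_cost, prim_cost):
    """Toy per-ray cost: two box tests per level plus every primitive in one leaf.

    Assumption: the ray visits exactly one root-to-leaf path. Real rays
    usually visit several, so this understates traversal cost.
    """
    depth = tree_depth(num_prims, leaf_size)
    return 2 * depth * box_cost + leaf_size * prim_cost


def break_even_ratio(num_prims, small_leaf, large_leaf):
    """box_cost / prim_cost at which both leaf sizes cost the same per ray.

    Below this ratio the deeper tree with small leaves is cheaper in this
    model. Requires the two trees to have different depths.
    """
    d_small = tree_depth(num_prims, small_leaf)
    d_large = tree_depth(num_prims, large_leaf)
    if d_small == d_large:
        raise ValueError("leaf sizes give the same depth; no break-even point")
    return (large_leaf - small_leaf) / (2 * (d_small - d_large))
```

With a hypothetical scene of 2**20 primitives, `break_even_ratio(2**20, 4, 1024)` gives 63.75: under these assumptions a box test would have to cost about 64 times as much as a primitive test before 1024-primitive leaves match 4-primitive leaves.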
I've often wondered why Nvidia cards are generally so much better at rendering scenes in Blender's Cycles renderer (a raytracing engine). The benchmarks on Blender's website are really telling (<a href="https://opendata.blender.org/benchmarks/query/?group_by=device_name&blender_version=3.3.0" rel="nofollow">https://opendata.blender.org/benchmarks/query/?group_by=devi...</a>): the only non-Nvidia entry on the first page is the AMD 2X EPYC 9654 96-Core.<p>This really lays out the decisions that Nvidia made compared to AMD and how their approach tends to hide some of the shortcomings of GPUs (latency and utilization).
Would love to see a more in-depth article on BVH construction itself! I'm decently familiar with the main concepts but have no clue what the current SOTA looks like (is that even public info?).<p>BVH construction is my favorite question to ask in interviews because there's no single best solution and it mostly relies on mathy heuristics to get a decent tree. You can also always spend more time building a better tree, but there's a tradeoff: past some point the extra build time exceeds what it saves in raytracing.
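As one illustration of the "mathy heuristics" mentioned above, here is a minimal sketch of the surface area heuristic (SAH), a widely used way to score candidate splits. The comment does not name a specific heuristic, so picking SAH here is an assumption. The helper names, the box representation as `((min_x, min_y, min_z), (max_x, max_y, max_z))`, and the default cost constants are hypothetical too.

```python
def surface_area(box):
    """Surface area of an axis-aligned box given as (min_corner, max_corner)."""
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def bounds(boxes):
    """Smallest axis-aligned box enclosing all of `boxes`."""
    lo = tuple(min(b[0][i] for b in boxes) for i in range(3))
    hi = tuple(max(b[1][i] for b in boxes) for i in range(3))
    return (lo, hi)


def sah_cost(left, right, traversal_cost=1.0, intersect_cost=1.0):
    """SAH estimate for splitting a node into `left` and `right` primitive sets.

    Each child is weighted by the chance a ray hitting the parent also hits
    the child, approximated by the ratio of surface areas.
    Assumes the parent box has non-zero surface area.
    """
    parent_area = surface_area(bounds(left + right))
    weighted = (surface_area(bounds(left)) * len(left)
                + surface_area(bounds(right)) * len(right))
    return traversal_cost + intersect_cost * weighted / parent_area


def best_split(boxes, axis):
    """Try every split of the centroid-sorted list; return (cost, split_index)."""
    ordered = sorted(boxes, key=lambda b: b[0][axis] + b[1][axis])
    candidates = [(sah_cost(ordered[:i], ordered[i:]), i)
                  for i in range(1, len(ordered))]
    return min(candidates)
```

This exhaustive sweep is the simple end of the tradeoff described above: it evaluates every split position and recomputes bounds each time, which gives a good score but is slow. Faster builders typically approximate it, trading tree quality for build time.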
Ray tracing on Linux for CP2077 with a 7900 XTX is still barely usable, but it's getting better.<p>I'd say RDNA 3 doesn't really deliver useful ray tracing at, for example, 2560x1440 unless you use upscaling to speed it up. Maybe in a few GPU generations ray tracing will become usable at native resolution.
Interesting that cards/drivers customize so much of ray tracing, like rasterization in the pre-Vulkan/Metal/D3D12 or even fixed-function GPU days.
I haven't dug into the real details yet, but Mesa RADV pulls in that horrible glslang because of some shaders related to acceleration structures.<p>Personally, as a dev, I patch the build to compile all of that out (and all the tracers at the same time), since ray tracing currently has a ridiculous benefit-to-technical-cost ratio.<p>This defeats the very purpose of Vulkan SPIR-V: getting rid of those horrible high-level shader compilers in the driver stack and keeping them contained at the application level.<p>It seems beyond clumsy, but as I said, I need to get into the details of why those shaders exist in the first place, and then why they are not written directly in RDNA assembly or SPIR-V assembly (which would require an "assembler" coded in simple, plain C).