The 2xFP32 solution is also dramatically faster than FP64 on nearly all GPUs.<p>While most GPUs support FP64, unless you pay for the really high-end scientific computing models you're typically getting 1/32nd of the FP32 rate. Even your shiny new RTX 4090 runs FP64 at 1/64th rate.<p>2xFP32 can run most basic operations at around 1/4 the rate of FP32, which quite often makes it the superior choice over the FP64 support provided in GPU languages.
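For the curious, here is a minimal sketch of the 2xFP32 building block (Knuth's TwoSum, the same error-free transformation as in the Dekker paper linked elsewhere in this thread). One caveat: fast-math style reassociation, or a shader compiler without precise qualifiers, will happily optimize the error terms away.

    /* An unevaluated sum: value = hi + lo, with |lo| <= half an ulp of hi. */
    typedef struct { float hi, lo; } df32;

    /* Knuth's TwoSum: s + err == a + b exactly, in 6 FP32 adds. */
    static df32 two_sum(float a, float b) {
        float s   = a + b;
        float bv  = s - a;                      /* part of b absorbed into s */
        float err = (a - (s - bv)) + (b - bv);  /* rounding error of s */
        return (df32){ s, err };
    }

    /* 2xFP32 addition: a dozen-ish FP32 adds instead of one FP64 add,
       still a big win when FP64 runs at 1/32 or 1/64 rate. */
    static df32 df_add(df32 a, df32 b) {
        df32  s  = two_sum(a.hi, b.hi);
        float lo = s.lo + (a.lo + b.lo);
        return two_sum(s.hi, lo);               /* renormalize */
    }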
Since to-scale solar systems are mentioned in the article, it may be worth talking briefly about Outer Wilds. Outer Wilds is a wonderful game built in Unity and comes with its own solar system. Things are quite a bit smaller than in the real world, but I suppose everything is still large enough for floating-point precision to be a potential issue. The developers solved this by making the player the origin instead: everything else is constantly shifted around to accommodate the fact that the player is at the center. This works perfectly in normal gameplay and is only noticeable when flying a great distance away from the game's solar system (nothing's stopping you), at which point you will see the planets and other celestial bodies jiggling around on the map.
This is similar to the solution we used in Vega Strike <a href="https://www.vega-strike.org/" rel="nofollow">https://www.vega-strike.org/</a> detailed here <a href="https://graphics.stanford.edu/~danielh//classes/VegaStrikeBlack.ppt" rel="nofollow">https://graphics.stanford.edu/~danielh//classes/VegaStrikeBl...</a>
I wonder if there isn’t another solution here. It seems like the issue is due to large translations? Presumably your view frustum is small enough that single-precision floats are sufficient for the entire range, so couldn’t you just subtract some offset when calculating the translation matrix for both your view and the model translation? I suppose this may result in cases where you need to recalculate the translation matrix for some visible meshes, but that seems less complicated than trying to increase the on-GPU precision.
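Something like this, assuming positions are kept as doubles on the CPU (illustrative names, a sketch of the idea rather than any engine's API):

    typedef struct { double x, y, z; } dvec3;
    typedef struct { float  x, y, z; } fvec3;

    /* Subtract a shared offset (the camera position, or a grid point near
       it) in double, so only a small residual reaches the float matrices. */
    static fvec3 rebase(dvec3 p, dvec3 offset) {
        return (fvec3){ (float)(p.x - offset.x),
                        (float)(p.y - offset.y),
                        (float)(p.z - offset.z) };
    }

    /* model translation column: rebase(object_pos, offset)
       view  translation column: rebase(camera_pos, offset), which is zero
       when offset == camera_pos. When the offset is moved to follow the
       camera, the translation of every visible mesh is rebuilt, which is
       the recalculation cost mentioned above. */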
Also see the twofloat crate in Rust [2f], which uses a pair of f64's to give twice the significant digits of a standard f64. The linked docs point to a number of academic papers on the subject.<p>[2f]: <a href="https://docs.rs/twofloat/latest/twofloat/" rel="nofollow">https://docs.rs/twofloat/latest/twofloat/</a>
Feels kinda weird to be using a data type that gets less precise the further you move from the center. Unless the world is infinite (which it sometimes is), isn't it a bit of a waste of precision? I kinda doubt you need nanometer precision, but only within 1 meter of the origin. I get that GPUs have existing floating-point hardware to accelerate stuff, but with open worlds increasingly being a thing, wouldn't it make sense to include some new, big floating-point data type in hardware, or emulate it in software?
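To put numbers on the waste: a 32-bit float has the same relative precision everywhere, so the absolute spacing between representable values grows with distance from the origin. A quick check (plain C, nothing GPU-specific; nextafterf returns the adjacent representable float):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* distance from origin, in meters, vs the smallest representable step */
        float d[] = { 1.0f, 1000.0f, 1e6f, 1e8f };
        for (int i = 0; i < 4; i++) {
            float step = nextafterf(d[i], INFINITY) - d[i];
            printf("at %12g m, smallest step = %g m\n", d[i], step);
        }
        /* prints ~1.2e-7 m at 1 m, but 8 m at 1e8 m: sub-micron precision
           where you rarely need it, meter-scale error where you might. */
        return 0;
    }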
I’ve always wanted to see a game that used nanometer-scale int64 positions. That’d give 11.5 million miles of nanometer-scale precision. I imagine there are terrible problems with this, but I’ve never tried it so I don’t know what they are.<p>Back to Godot, I thought the answer would be to precompute the ModelView matrix on the CPU. Object -> World -> Camera is a “large” transformation, but the final Object -> Camera transform is “small”. I’m sure there’s a reason this doesn’t work, but I forget it.<p>Unreal 5 switches to doubles everywhere for large world coordinates. I wonder what fun issues they had to solve?
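For what it's worth, here's how I imagine the int64 nanometer idea would look (illustrative, not from any engine): keep world positions in integer nanometers and only convert the camera-relative difference to float, after the big numbers have cancelled exactly.

    #include <stdint.h>

    /* World positions in integer nanometers: uniform precision everywhere,
       and addition/subtraction is exact and deterministic. */
    typedef struct { int64_t x, y, z; } ipos;
    typedef struct { float x, y, z; } fvec3;

    static fvec3 to_camera_space(ipos p, ipos cam) {
        const float NM_TO_M = 1e-9f;   /* render in float meters */
        return (fvec3){ (float)(p.x - cam.x) * NM_TO_M,
                        (float)(p.y - cam.y) * NM_TO_M,
                        (float)(p.z - cam.z) * NM_TO_M };
    }

The terrible problems presumably live elsewhere: physics and gameplay code have to be written against integers, intermediate products can overflow 64 bits, and every float-expecting library needs a conversion at the boundary.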
The article cites a 2007 paper, but this technique is quite old; see, for example, Dekker (1971): <a href="https://csclub.uwaterloo.ca/~pbarfuss/dekker1971.pdf" rel="nofollow">https://csclub.uwaterloo.ca/~pbarfuss/dekker1971.pdf</a>
<i>"Then, when doing the model to camera space transformation instead of calculating the MODELVIEW_MATRIX, we separate the transformation into individual components and do the rotation/scale separately from the translation."</i><p>That's the core idea here. A bit more detail would help. Is that done on the GPU? Is it extra work for every vertex? Does it slow down rendering because the GPU's 4x4 matrix multiplication hardware can't do it?<p>I actually have to implement this soon in something I'm doing, so I really want to know.
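My current understanding, which someone can hopefully correct: yes, it happens per vertex on the GPU, but as far as I know modern GPUs have no dedicated 4x4 matrix hardware anyway; a mat4 multiply is just a pile of fused multiply-adds, so the split costs a few extra ALU ops rather than losing some fixed-function path. An illustrative C sketch of the split, not Godot's actual shader code:

    typedef struct { float m[3][3]; } mat3;   /* rotation/scale block only */
    typedef struct { float x, y, z; } fvec3;

    static fvec3 mat3_mul(mat3 a, fvec3 v) {
        return (fvec3){ a.m[0][0]*v.x + a.m[0][1]*v.y + a.m[0][2]*v.z,
                        a.m[1][0]*v.x + a.m[1][1]*v.y + a.m[1][2]*v.z,
                        a.m[2][0]*v.x + a.m[2][1]*v.y + a.m[2][2]*v.z };
    }

    /* Rotation/scale goes through the float 3x3 blocks as usual; only the
       combined translation, where the huge model and camera offsets nearly
       cancel, is computed at higher precision (emulated doubles in the
       article) and arrives here as an already-small float vector. */
    static fvec3 vertex_to_view(fvec3 local, mat3 model_rs, mat3 view_r,
                                fvec3 small_translation) {
        fvec3 p = mat3_mul(view_r, mat3_mul(model_rs, local));
        return (fvec3){ p.x + small_translation.x,
                        p.y + small_translation.y,
                        p.z + small_translation.z };
    }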
This is neat, but it's often a lot simpler to just render each object relative to the camera instead, which can be as simple as subtracting the camera's position from the world position right before rendering.
Why not describe the world space in integers? Where 1 is the Planck length of the simulation?<p>Is there a “lossy compression” benefit to describing space with floats?
>The MODELVIEW_MATRIX is assembled in the vertex shader by combining the object’s MODEL_MATRIX and the camera’s VIEW_MATRIX<p>I was taught that MV/MVP should be calculated CPU-side per-model, and that doing it in the vertex shader is wasteful. Is that advice out of date?
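CPU-side per model is still the usual advice as far as I know; the article seems to move it into the shader specifically so the translation can be done with emulated doubles. If you have real doubles on the CPU, you can get the same cancellation there and keep the per-vertex work at a single mat4 multiply. A sketch (illustrative, assumes column-major storage):

    /* Once per model per frame on the CPU: multiply VIEW * MODEL in double,
       so the two large world-space translations cancel before anything is
       rounded to float. The GPU then sees one small camera-relative mat4. */
    static void model_view_f32(const double V[16], const double M[16],
                               float out[16]) {
        for (int col = 0; col < 4; col++)
            for (int row = 0; row < 4; row++) {
                double acc = 0.0;
                for (int k = 0; k < 4; k++)
                    acc += V[k * 4 + row] * M[col * 4 + k];  /* column-major */
                out[col * 4 + row] = (float)acc;
            }
    }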
The book "3D Engine Design for Virtual Globes" contains a bunch of these shader tricks with different levels of precision (e.g. 1cm across the whole solar system etc.)
A better way to solve this problem is to move the world around the origin instead, just like you had to with OpenGL 1!<p>Really, half-floats are more interesting, saving 50% of GPU memory for mesh data. You could imagine using half-floats for animations too!<p>Then we could have the debate about fixed point vs. floating point. Why we choose a precision that deteriorates with distance is descriptive of our short-sightedness in other domains, like the economy, for example (let's just print money now, close to the origin, and we'll deal with the precision problems later, when time moves away from the origin).<p>What you want is fixed point, preferably with integer math, so you get deterministic behaviour even across hardware. Just as plain float/int arrays give you CPU-cache friendliness and atomic parallelism at the same time, simplicity is often the solution!<p>In general, 64-bit is not interesting at all, so the idea Acorn had with ARM, that jumping to 32-bit would be enough more or less forever, is pretty much proven by now, even if addressing only jumped from 26-bit to 32-bit with the ARM6.<p>Which leads me to the next interesting tidbit: back when everything was 8-bit, the C64 had 16-bit addressing.
This is similar to the Far Lands[0] in old versions of Minecraft. Woof![1]<p>[0] <a href="https://minecraft.fandom.com/wiki/Far_Lands" rel="nofollow">https://minecraft.fandom.com/wiki/Far_Lands</a><p>[1] <a href="https://farlandsorbust.com/" rel="nofollow">https://farlandsorbust.com/</a>