This looks great! The main potential use for this must be VR video with six degrees of freedom. What they have now does an incredible job of conveying space, but feels a bit limiting when your view doesn't translate with you.
This is bad news for me. I am working on a similar project (Gaussian splatting + dynamic scenes). Our method is different from the 4D Gaussian splatting mentioned here, but I am unsure whether I should continue or not.
Does anyone know if the pixel overdraw of the GS scene is consistent across view angles? I'm asking because I would assume the GS density is inconsistent, but the paper doesn't give a range of FPS measurements, a 99th percentile, or anything like that.
This gives me hope that one day we'll have a holodeck. Holy crap! The applications for this are pretty broad, from safety (scene reconstruction from video sources) to real estate, to Hollywood and video games. I'm just blown away. Will we eventually see 4D GS AR/XR scenes we can walk around in? I feel like that would make the perfect VR Sherlock Holmes game.
After reconstruction, is there any way to scan for a particular condition in the model, and map it onto the 3D structure? For instance, find the broken cookie, or find a surface matching some input image.
Hard to believe the original Gaussian Splatting paper is still less than three months old, given the explosion of new techniques in recent weeks. It's wild to see the state of the art in environment and object capture suddenly advancing this quickly – beyond the obvious applications like real estate, I wonder what else GS will end up transforming.
Does anyone have a video or post that explains the optimization part of the original paper? I understand most of it except that part, which I can't seem to wrap my head around.
With tech like this I'm starting to wonder if realistic games are going to become normalized, and what will happen as a result.

Also, has anyone been working on solving the "blurry" look these splats have up close?
I'd love to see a machine learning model trained on the data this produces. It'd be crazy to see whether it could effectively learn from it and generate realistic-looking video as output.
Can someone explain to me how it is possible, using Gaussians, to get different reflections depending on the viewing angle, like in the demos? I'm finding it hard to grasp.
Interesting that the original publication this is based on (which won the SIGGRAPH 2023 best paper award) didn't get a lot of attention on HN at the time:

https://news.ycombinator.com/item?id=36285374
Great video I saw a while ago on this: https://www.youtube.com/watch?v=HVv_IQKlafQ (albeit for 3D, not 4D).

His editing is hilarious too.
I've been slowly building my own rendering and training pipeline on a non-CUDA stack (trying Vulkan/SPIR-V). I'm curious how many cameras they used here, though.