Kinda disappointing that the spherical harmonic data is just totally thrown out. It's a pretty big factor in making Gaussian splats look so good, and I'm sure there are interesting ways of compressing the data!
This blog goes into a lot more depth on different techniques for compressing them (although with the aim of using less VRAM):<p><a href="https://aras-p.info/blog/2023/09/13/Making-Gaussian-Splats-smaller/" rel="nofollow noreferrer">https://aras-p.info/blog/2023/09/13/Making-Gaussian-Splats-s...</a><p><a href="https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/" rel="nofollow noreferrer">https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-m...</a><p>Including quantising spherical harmonics, grouping splats into chunks and compressing their position and rotation in groups, etc.
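To make the "compress positions per chunk" idea concrete, here's a minimal sketch. It is not the blog's code: the function names, the 16-bit depth, and the plain min/max bounding box are my own assumptions for illustration. Each chunk stores its bounds once, and every splat position becomes a small integer relative to those bounds.

```python
def quantize_chunk(positions, bits=16):
    """Quantise one chunk of (x, y, z) positions relative to the chunk's bounds.

    Hypothetical helper: returns the chunk minimum, the per-axis span,
    and one integer triple per position.
    """
    levels = (1 << bits) - 1
    lo = [min(p[i] for p in positions) for i in range(3)]
    hi = [max(p[i] for p in positions) for i in range(3)]
    # Use a span of 1.0 when the chunk is flat along an axis, to avoid dividing by zero.
    span = [(hi[i] - lo[i]) or 1.0 for i in range(3)]
    codes = [
        tuple(round((p[i] - lo[i]) / span[i] * levels) for i in range(3))
        for p in positions
    ]
    return lo, span, codes


def dequantize_chunk(lo, span, codes, bits=16):
    """Reconstruct approximate float positions from the chunk bounds and integer codes."""
    levels = (1 << bits) - 1
    return [tuple(lo[i] + c[i] / levels * span[i] for i in range(3)) for c in codes]


chunk = [(0.0, 1.0, 2.0), (1.0, 3.0, 2.0), (0.25, 2.0, 2.0)]
lo, span, codes = quantize_chunk(chunk)
restored = dequantize_chunk(lo, span, codes)
```

The error per axis is bounded by half a quantisation step, i.e. the chunk's extent on that axis divided by 2 * (2^bits - 1), so smaller chunks give finer precision for the same bit budget.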
The moment I saw the use of motion vectors with Gaussian splats I started wondering if it was a viable way to achieve high-performance, high-quality spatial video. Instead of capturing from all angles like most current approaches (which seem more geared towards creating content that is used in place of a 3D model), why not capture from a fixed perspective, using an array of cameras covering about 1 m square to allow for slop in head position, providing parallax and perspective-correct rendering? I'd presume that'd make it even more compressible too, since you could get rid of any data that isn't visible from outside that frustum.<p>It would be a true hologram, completely supplanting 180/360 stereo video. Imagine Avatar on such a format. Or, let's be real, porn.
Can anyone point me to a good high-level introduction to Gaussian splats?<p>E.g. what's the benefit of using splats vs traditional polygons? Is it somehow easier for the neural network to create these from the 2D photos? Or what's the magic behind this?
More so than just compressing the data, I feel like the biggest gains for splats will come from a chunked LOD system, and streaming data into memory. Regardless of how efficient you make the representation, you are still going to run into fundamental limitations without this.<p>On the low-hanging fruit side of things, tools should really start integrating a spherical harmonics skybox into the training steps to better handle large-scale distant details.
<a href="https://lightgaussian.github.io/" rel="nofollow noreferrer">https://lightgaussian.github.io/</a> in a similar vein
I see various people online calling the primitive "splats" (e.g. "the scene has millions of splats", "248 bytes for each splat", etc.). But the primitives are the 3D Gaussians; "splat" just refers to how they are rendered. I wonder why people call them "splats". Because it's catchier?
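For anyone wondering where the 248 bytes figure comes from, here's a quick back-of-the-envelope tally. It assumes the per-Gaussian layout of the PLY files written by the reference 3D Gaussian splatting training code (all float32, spherical harmonics up to degree 3); other tools may store things differently, and the dictionary keys are just my own labels.

```python
# Assumed layout: float32 attributes per Gaussian, following the PLY files
# written by the reference 3D Gaussian splatting training code.
FLOATS_PER_FIELD = {
    "position": 3,   # x, y, z
    "normal": 3,     # nx, ny, nz (present in the file layout)
    "sh_dc": 3,      # degree-0 spherical harmonics, one per colour channel
    "sh_rest": 45,   # degrees 1..3: 15 coefficients x 3 colour channels
    "opacity": 1,
    "scale": 3,
    "rotation": 4,   # quaternion
}
BYTES_PER_FLOAT = 4

bytes_per_gaussian = sum(FLOATS_PER_FIELD.values()) * BYTES_PER_FLOAT
sh_bytes = (FLOATS_PER_FIELD["sh_dc"] + FLOATS_PER_FIELD["sh_rest"]) * BYTES_PER_FLOAT

print(bytes_per_gaussian)  # 248
print(sh_bytes)            # 192
```

Under that assumption, spherical harmonics account for 192 of the 248 bytes, roughly three quarters of each Gaussian, which is why they are such an obvious target for both compression and outright removal.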