This is a stellar and interesting idea: train a transformer to turn a scene description, a set of triangles, into a 2D array of pixels that happens to look like what a global illumination renderer would output for the same scene.

That this works at all shouldn't be shocking after the last five years of research, but I still find it pretty profound. That transformer architecture sure is versatile.

Anyway: crazy fast, close to Blender's rendering output, and what looks like a ~1B-parameter model? I'm not sure whether the weights are fp16 or fp32, but it's a 2GB file, so at fp16 (2 bytes per parameter) that's about a billion parameters; at fp32 it'd be closer to 500M. What's not to like? I'd like to see some more 'realistic' scenes demoed, but hey, I can download this and run it on my Mac to try it whenever I like.
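In case it helps make the triangles-in, pixels-out idea concrete, here's a minimal PyTorch sketch of what that shape of model could look like. To be clear, everything here (the per-triangle feature layout, the dimensions, the patch-based pixel decoding) is my own guess at a plausible architecture, not the actual model:

    import torch
    import torch.nn as nn

    class TriangleRenderer(nn.Module):
        """Hypothetical sketch: encode triangle tokens, decode image patches."""

        def __init__(self, d_model=512, n_heads=8, n_layers=6, img=64, patch=8):
            super().__init__()
            self.img, self.patch = img, patch
            # One token per triangle: 3 vertices * xyz plus a few
            # made-up material scalars (13 features total, my assumption).
            self.embed_tri = nn.Linear(9 + 4, d_model)
            enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
            # Learned queries, one per output image patch.
            n_patches = (img // patch) ** 2
            self.queries = nn.Parameter(torch.randn(n_patches, d_model))
            dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
            self.decoder = nn.TransformerDecoder(dec_layer, n_layers)
            self.to_rgb = nn.Linear(d_model, patch * patch * 3)

        def forward(self, tris):  # tris: (batch, n_triangles, 13)
            scene = self.encoder(self.embed_tri(tris))
            # Patch queries cross-attend over the encoded scene.
            q = self.queries.expand(tris.shape[0], -1, -1)
            patches = self.to_rgb(self.decoder(q, scene))
            # Reassemble patches into a (batch, 3, img, img) image.
            b, side = tris.shape[0], self.img // self.patch
            x = patches.view(b, side, side, self.patch, self.patch, 3)
            return x.permute(0, 5, 1, 3, 2, 4).reshape(b, 3, self.img, self.img)

The striking part is that nothing in a model like this knows anything about ray tracing; whatever light transport shows up in the output is learned entirely from the training renders.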