On a somewhat related topic, I think we could use Stable Diffusion to convert single photos into 3D NeRFs:<p>1. Find the prompt that best reproduces the image.<p>2. Generate a (crude) NeRF from your starting image and render views from other angles.<p>3. Use Stable Diffusion with those novel-angle views as seed images, refining them with the prompt from step 1 plus angle descriptors ("view from back", "view from top", etc.).<p>4. Feed the refined views back into the NeRF generator, keeping the initial photo's view fixed.<p>5. Render new views from the NeRF, which should now be much more realistic.<p>Run steps 2-5 in a loop. Eventually you should converge on a highly accurate, realistic NeRF that is fully 3D from any angle, all from a single photo.<p>Similar techniques could be used to extend the scene in all directions.
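The loop above might be sketched in Python roughly as follows. Everything here is hypothetical: `train_nerf`, `render_view`, and `sd_img2img` are stand-ins for a real NeRF trainer and a Stable Diffusion img2img pipeline, not actual APIs, and the view handling is simplified to a few named angles.

```python
# Sketch of the single-photo-to-NeRF refinement loop described above.
# All three helper callables are hypothetical stand-ins; a real version
# would plug in a NeRF trainer and a Stable Diffusion img2img pipeline.

VIEW_ANGLES = ["front", "back", "top", "left", "right"]

def refine_loop(photo, prompt, train_nerf, render_view, sd_img2img, iterations=5):
    """Iteratively refine a NeRF from one photo plus diffusion-refined views.

    photo       -- the single input image (its view is kept fixed, step 4)
    prompt      -- the prompt that best reproduces the photo (step 1)
    train_nerf  -- callable: {angle: image} dict -> NeRF model
    render_view -- callable: (nerf, angle) -> rendered image
    sd_img2img  -- callable: (seed_image, prompt) -> refined image
    """
    views = {"front": photo}            # start with only the real photo
    for _ in range(iterations):
        nerf = train_nerf(views)        # steps 2/4: fit NeRF to current views
        for angle in VIEW_ANGLES:
            if angle == "front":
                continue                # keep the initial photo view constant
            crude = render_view(nerf, angle)
            # step 3: refine the crude render with an angle-qualified prompt
            views[angle] = sd_img2img(crude, f"{prompt}, view from {angle}")
    return train_nerf(views)            # step 5: final, more realistic NeRF
```

With trivial stub callables in place of the real models, the loop runs end to end and shows the data flow: the "front" entry never changes, while every other view is a diffusion-refined render.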