Probably a dumb question, but is this trained on lots of examples of similar objects, or is it 'just' estimating from the look of the input image?<p>Like, if you have an image of a car viewed at an angle, you can gauge the shape of the 3D object from the image itself. You could then assume that the hidden side of the car is similar to the side you can see, and when you generate a 360-degree rotation animation of it, it will look pretty good (cars being roughly symmetrical). But if you gave it a flat image of a playing card, showing just the face-up side, how would it reconstruct the reverse side? Would it infer it from the front, or would it 'know' from training data that playing cards have a very differently patterned back?