The differently styled images of "astronaut riding a horse" are great, but that has been a go-to example for image generation models for a while now. The introduction says they train on 37 million real <i>and synthetic</i> images. Are astronauts riding horses now represented in the training data in a way that wouldn't have been possible 5 years ago?<p>If it's possible to get good, generalizable results from such (relatively) small datasets, I'd like to see what this approach can do if trained exclusively on non-synthetic, permissively licensed inputs. It might be possible to build a good "free of any possible future legal challenges" image generator from public domain content alone.