Is there a way to have current AI tools maintain consistency when generating multiple images of a specific creature or object? For example, if there are images of 'Dr. Venom' they need to look similar, or if there are images of the same space ship.
This is a great example of AI image generation being unable to generate "art" and instead just replicating a naive approximation of what it was trained on. There's no coherence or consistency between the images, and while they all look "shiny", they also look incredibly dull and generic.

It's a cool exercise, but using this for a real-world project would eliminate any attempt at producing an artistic "voice". AI image generation excels only at generating stock art and placeholder content.
I was trying this recently with the Sierra Christmas Card from 1986! [0] The images I generated are at [1]; I was tweaking the model parameters with different denoising strengths and CFG scales. When you get the parameters just right, you can preserve the composition of the input image very well while still adding a lot of detail. It isn't a completely automatic process though: with Stable Diffusion you have to provide the right prompt, otherwise the generation isn't guided correctly, so this approach works better for aesthetics and style transfer than for regular image super-resolution such as ESRGAN.

[0] https://archive.org/details/sierra-christmas-card-1986

[1] https://i.imgur.com/WxD05gX.jpeg
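For anyone who wants to try the same thing, here is a rough sketch of that kind of img2img run using the Hugging Face diffusers library. The model ID, input filename, prompt, and the specific strength/CFG values below are placeholders for illustration, not the exact settings used for the linked images:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load a Stable Diffusion img2img pipeline (placeholder model ID).
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Upscale the low-res source first so the latent space has room for new detail.
init_image = Image.open("sierra_card_1986.png").convert("RGB").resize((768, 512))

result = pipe(
    prompt="snowy cottage at night, warm window light, detailed digital painting",
    image=init_image,
    strength=0.45,        # denoising strength: lower values preserve more of the input composition
    guidance_scale=7.5,   # CFG scale: how strongly the prompt steers the denoising
    num_inference_steps=50,
).images[0]

result.save("reimagined_card.png")
```

Sweeping `strength` and `guidance_scale` is where the "just right" tuning described above happens: too low and nothing changes, too high and the composition of the original is lost.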
The title excited me - maybe someone had succeeded in making new art that looks like the old pre-renders, maybe a convincing imitation of the scanline render look. Instead it was a vapid article about tossing pixel art into img2img and getting some tenuously related junk.
I have a question. Stable Diffusion is based on gradually processing noise into a coherent image, by training a denoiser. Would it be possible to feed a low-fidelity image (such as pixel art, or a pixelated image) directly into the denoising steps and get back a higher-fidelity image that matches the original?
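Concretely, I'm imagining something like the sketch below, written against diffusers-style UNet, VAE, and scheduler objects; the function name and `start_frac` parameter are illustrative, and classifier-free guidance is left out to keep it short:

```python
import torch

def denoise_from_low_fi(unet, scheduler, vae, low_fi_image, prompt_embeds, start_frac=0.5):
    # Encode the low-fidelity image into latent space instead of starting from pure noise.
    latents = vae.encode(low_fi_image).latent_dist.sample() * vae.config.scaling_factor

    # Pick an intermediate timestep: the later we start, the more of the input survives.
    scheduler.set_timesteps(50)
    start_step = int(len(scheduler.timesteps) * start_frac)
    t_start = scheduler.timesteps[start_step : start_step + 1]

    # Noise the latents up to that timestep, then run the usual denoising loop from there.
    latents = scheduler.add_noise(latents, torch.randn_like(latents), t_start)
    for t in scheduler.timesteps[start_step:]:
        noise_pred = unet(latents, t, encoder_hidden_states=prompt_embeds).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample

    # Decode back to pixel space; the result should keep the input's composition.
    return vae.decode(latents / vae.config.scaling_factor).sample
```

Is that roughly what the denoising-strength knob mentioned in the other comment is controlling?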
You'd have to scale the resolution on these waayyy down to not see the usual janky, smudgy, sometimes nightmare-inducing details.

I seriously have never understood why what gets published in these blog posts isn't just lower res, especially since this is precisely about old video game graphics.
It’s amazing to think that someday we’ll be able to pass low-fidelity pixel art sprite sheets to an AI and get back high-definition, hand-drawn 2D graphics for use in games.
Honestly, really disappointing, especially since the author forces you to watch the video to see the final image - which looks nothing like the shoulder-spiked, triple-forehead-eyed villain of the game. Spoiler: the generated image is just a threatening-looking green dude with two differently coloured eyes.

It's a decent writeup on the process of trying to generate specific images using text prompts, I guess, with the conclusion that it's really hard, and in some cases basically impossible (hence the lack of the three forehead eyes).