Good post. I always thought diffusion originated from score matching; today I realized diffusion came before score-matching theory. So when OpenAI trained on 250 million images, they didn't even have a great theory explaining why they were modeling the underlying distribution. Gutsy move.
How is doing classifier-free guidance, where you

"Train a single diffusion model on every training sample x_0 twice: once paired with its class label y, and once paired with a null class label."

not doing exactly the same thing, with the same problem that was deemed bad in the first paragraph of the same section:

"However, the label can sometimes lead to samples that are not realistic or lack diversity if the model has not seen enough samples from p(x|y) for a particular y. So we often want to tune how much the model "follows" the label during generation."

Awesome post btw.
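What the dual training buys you is a sampling-time knob the plain conditional model lacks: you get both a conditional and an unconditional noise prediction, and you can interpolate (or extrapolate) between them with a guidance scale. A minimal sketch of the standard classifier-free guidance combination, with stand-in arrays in place of real model outputs (function and variable names are illustrative, not from the post):

```python
import numpy as np

def cfg_noise_estimate(eps_uncond, eps_cond, w):
    """Classifier-free guidance: blend the unconditional and conditional
    noise predictions. w = 1 recovers the plain conditional model;
    w > 1 pushes samples harder toward the label (more fidelity, less
    diversity); w < 1 relaxes it. This knob is what training with both
    the real label and the null label makes possible."""
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy stand-in predictions; in practice these come from two forward
# passes of the same model, with y and with the null label.
eps_u = np.array([0.1, -0.2])
eps_c = np.array([0.3, 0.0])

print(cfg_noise_estimate(eps_u, eps_c, 1.0))  # w=1: exactly the conditional prediction
print(cfg_noise_estimate(eps_u, eps_c, 2.0))  # w=2: extrapolated past the conditional
```

So it is the same training data, but unlike a purely conditional model, the follow-the-label strength is now tunable at generation time rather than baked in.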
> I spent 2022 learning to draw and was blindsided by the rise of AI art models like Stable Diffusion. Suddenly, the computer was a better artist than I could ever hope to be.

I hope the author stuck with it anyway. The more AI encroaches on creative work, the more I want to tear it all down.
Thanks for sharing. This has given me much more insight into how and why diffusion models work. Randomness is oddly powerful. Time to try and code one up in some suitably unsuitable language.

Not much to TL;DR for the comment lurkers. This post *is* the TL;DR of stable diffusion.