TechEcho

10 comments

ttul over 1 year ago

I’ve been playing with diffusion a ton for the past few months, writing a new sampler that implements an iterative blending technique described in a recent paper. The latent space is rich in semantic information, so it can be a great place to apply various transformations rather than operating on the image directly. Yet it still has a significant spatial component, so things you do in one spatial area will affect that same area of the image.

Stable Diffusion 1.5 may be quite old now, but it is an incredibly rich model that still yields shockingly good results. SDXL is newer and more high tech, but it’s not a revolutionary improvement. It can be less malleable than the older model and harder to work with to achieve a given desired result.
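A rough illustration of the spatial correspondence mentioned above, assuming the usual 8x downscaling between pixels and latents (a 512x512 image maps to a 64x64 latent). The helper name `latent_box_to_pixel_box` is made up for this sketch, and the mapping is only approximate, since the decoder also mixes in some information from neighbouring latent cells.

```python
VAE_SCALE = 8  # assumed pixel-to-latent downscale factor (512 px -> 64 latent cells)


def latent_box_to_pixel_box(top, left, bottom, right, scale=VAE_SCALE):
    """Map a half-open box in latent cells to the matching half-open box in pixels."""
    return (top * scale, left * scale, bottom * scale, right * scale)


# Top-left quadrant of a 64x64 latent, i.e. of a 512x512 image
pixel_box = latent_box_to_pixel_box(0, 0, 32, 32)
print(pixel_box)  # (0, 0, 256, 256)
```

So an edit confined to the top-left quadrant of the latent should mostly show up in the top-left quadrant of the decoded image.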

l33tman over 1 year ago

Just a terminology comment here. "Latent space" means a lot of different things in different models. For a GAN, for example, it actually means the "top concept" space, where you can change the entire concept of the image by moving around in the latent space, which is notoriously difficult. For SD/SDXL it refers to the bottommost layer just above pixel space, which expands the generated image from 64x64 to 512x512 pixels in the case of SD1.5.

This allows the rest of the network to be smaller while still generating a usable output resolution, so it's a performance "hack".

It's a really good idea to explore it and hack into it like in the article, to "remaster" the image so to speak!
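To put a number on the performance "hack": a back-of-the-envelope sketch, assuming the standard SD 1.5 layout of 4 latent channels and an 8x spatial downscale (only the 64x64 and 512x512 figures come from the comment above; the function names are made up for illustration).

```python
def latent_shape(height, width, channels=4, scale=8):
    """Shape (C, H, W) of the latent for an image of the given pixel size."""
    if height % scale or width % scale:
        raise ValueError("image size must be a multiple of the downscale factor")
    return (channels, height // scale, width // scale)


def compression_ratio(height, width, rgb_channels=3, channels=4, scale=8):
    """How many times fewer values the latent holds than the RGB image."""
    c, h, w = latent_shape(height, width, channels, scale)
    return (height * width * rgb_channels) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under those assumptions the denoising network works on 48 times fewer values than it would in pixel space.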

Der_Einzige over 1 year ago

Anyone know if the work shown here has been implemented in Automatic1111 or ComfyUI as an extension? If not, then that might be my first project to add, since these are quite simple (relatively speaking) to implement in code.

nomel over 1 year ago

What's the reason for using RGB rather than, say, HSV? RGB seems like it would be fairly discontinuous. Or, do I have that backwards?
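One quick way to check the continuity question, using only Python's standard `colorsys` module: two reds that are nearly identical in RGB can land at opposite ends of the hue axis in HSV, because hue is an angle that wraps around. The two sample colors are arbitrary picks for illustration, not values from the article.

```python
import colorsys
import math


def hue_degrees(rgb):
    """Hue of an RGB triple (components in 0..1), in degrees."""
    h, _s, _v = colorsys.rgb_to_hsv(*rgb)
    return h * 360.0


red_toward_orange = (1.0, 0.02, 0.0)
red_toward_magenta = (1.0, 0.0, 0.02)

rgb_distance = math.dist(red_toward_orange, red_toward_magenta)
hue_a = hue_degrees(red_toward_orange)
hue_b = hue_degrees(red_toward_magenta)

print(f"RGB distance: {rgb_distance:.3f}")
print(f"hues: {hue_a:.1f} and {hue_b:.1f} degrees")
```

The RGB distance is tiny, while the raw hue values sit near 0 and near 360 degrees. Any arithmetic that treats hue as a plain number (averaging, interpolation, a linear map) sees a large jump there, whereas RGB has no such seam.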

Sabinus over 1 year ago

That's very cool. I had no idea the latent space was that accessible and so obviously manipulable. Also interesting is how the way SDXL structures latents affects how it thinks about images.

Lerc over 1 year ago

I don't think it's as simple as this naive approach suggests, but it's a good preliminary analysis. It's a good lesson that while being absolutely correct might be quite difficult, diving in and having a go might get you further than you think.

01HNNWZ0MV43FF over 1 year ago

All the patterns and textures are expressed by only one dimension? Bizarre.

SV_BubbleTime over 1 year ago

That was an excellently written article. I for sure thought a discussion about latent spaces would instantly be over my head. It was, but it took a few paragraphs.

HanClinto over 1 year ago

Thank you for the excellent article! Top notch work!

rgmmm over 1 year ago

Enhance.

Explaining the SDXL Latent Space
