I’ve noticed a lot of the time you can tell an image is AI generated because it has a shiny/glossy lighting look to it.<p>Has anyone figured out why this is the case?
It’s just the typical aesthetic model used and isn’t inherent to the tech itself. It’s very easy to make AI images in specific art styles, with the result that you can’t tell they’re not real.<p>This is actually something of a pet peeve of mine - people sharing AI images never use styles other than the generic shiny one, and so places like Reddit.com/r/midjourney are filled with the same exact style of images.<p>Edit: if you’re looking for other style inspiration ideas, this website is a great resource for Midjourney keywords: <a href="https://midlibrary.io/styles" rel="nofollow">https://midlibrary.io/styles</a>
Many AI-generated images you encounter are low-effort creations without much prompt tuning, made with something like DALL-E or Meta AI. For whatever reason, the default styles of DALL-E, Meta AI, and base Stable Diffusion lean toward a glossy "photorealism" that people can instantly tell isn't real. By contrast, Midjourney's default style is a bit more painted, like the cover of a fantasy novel.<p>All that being said, it's very possible to prompt these generators to create images in a particular style. I usually include "flat vector art" in image generation prompts to get something less photorealistic; I've found it's much closer to the style I want.<p>If you really want to go down the rabbit hole, click through the styles on this Stable Diffusion model to see the range that's possible with finetuning (the tags like "Watercolor Anime" above the images): <a href="https://civitai.com/models/264290/styles-for-pony-diffusion-v6-xl-not-artists-styles" rel="nofollow">https://civitai.com/models/264290/styles-for-pony-diffusion-...</a>
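To make the "include style keywords" tip concrete, here is a tiny sketch of a prompt helper that prepends style phrases like "flat vector art" before the text goes to a generator. The preset names and strings are just illustrative, not anything these models require:

```python
# Hypothetical style presets for steering an image model away from its
# default glossy look. The keyword strings are illustrative examples.
STYLE_PRESETS = {
    "flat": "flat vector art, minimal shading",
    "paint": "gouache painting, visible brush strokes",
    "photo": "candid photo, natural lighting, film grain",
}

def build_prompt(subject: str, style: str = "flat") -> str:
    """Append a style preset to a subject; unknown styles fall back to the bare subject."""
    preset = STYLE_PRESETS.get(style)
    return f"{subject}, {preset}" if preset else subject

print(build_prompt("a lighthouse at dusk", "photo"))
# a lighthouse at dusk, candid photo, natural lighting, film grain
```

The point is only that a few extra tokens of style vocabulary dominate the output look far more than people expect.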
A lot of (non-AI) photos of humans tend to be airbrushed by (human) photo editors -- this removes natural imperfections like patchy skin, acne, discolouration, etc.<p>I think the images an AI model generates are biased toward a form of "airbrushing" too, except the model pushes the reflectivity of surfaces up -- simply to hide the fact that there _aren't_ any imperfections that would make the photo more realistic.<p>In other words, gloss is just a form of airbrushing -- AI does it to hide the fact that there are no more details available.<p>I would guess that AI models could make their airbrushing more like what human photo editors do by changing some hyper-parameters.
DALL-E, at least, seems to have adopted the cartoonish style just to avoid lawsuits.<p>You can get realistic images with Midjourney and Flux with minimal prompt tuning. Adding "posted on snapchat" or "security camera footage" to the prompt will often produce mostly realistic-looking images.
there is an "aesthetics" model<p><a href="https://github.com/LAION-AI/laion-datasets/blob/main/laion-aesthetic.md">https://github.com/LAION-AI/laion-datasets/blob/main/laion-a...</a><p>obviously, it reflects the mass preference for glosslop<p>secondarily it is likely due to a desire to ensure that ai images have a distinct look
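Roughly, that aesthetics filtering amounts to: score every candidate training image with a learned predictor, keep only the images above a threshold. A minimal sketch of the idea -- the scores below are dummy values, and the real LAION predictor is a small regression head on top of CLIP embeddings, not this function:

```python
# Sketch of aesthetic-score dataset filtering: a predictor assigns each
# image a score (LAION's is roughly on a 1-10 scale) and only images
# above a cutoff enter the training set. Scores here are made up.
def filter_by_aesthetics(images, scores, threshold=6.5):
    return [img for img, score in zip(images, scores) if score >= threshold]

dataset = ["cat.jpg", "chart.png", "sunset.jpg"]
scores = [6.8, 3.1, 7.4]  # dummy predictor outputs
print(filter_by_aesthetics(dataset, scores))  # ['cat.jpg', 'sunset.jpg']
```

If the predictor itself was fit to crowd ratings that favor glossy, saturated images, everything downstream inherits that bias.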
I don't know, but I've noticed another pattern: they don't like leaving any empty space. Every area has to be busy, filled with objects -- never any empty grass, bare walls, or anything like that.
This is an interesting question, though I think it needs to be qualified a bit, since there are many AI images and AI image generators that don't match this pattern.<p>First, AI images != OpenAI/ChatGPT images. OpenAI has done a great job making a product that is accessible, so their product decisions get a lot more exposure than other options. A few people have commented that there are several Stable Diffusion fine-tunes that produce very different styles.<p>Second, AI images in general and AI images of people are different. I think the high-gloss style is most pronounced in people, partly because that's where it is most notable and out of place.<p>If you take the previous two points as true, the question becomes: why does ChatGPT's image model skew toward generating shiny people? I would venture that it's a conscious product decision that has something to do with what someone thought looked the most reliably good given the model's capabilities.<p>Some wild speculation as to why this would be the case:<p>* Might have to do with fashion photos having unusually bright lights and various cosmetics that give a sheen.<p>* Might have something to do with training the model on synthetic data (i.e. 3D renders), which will have trouble reproducing the complicated subsurface scattering of human skin.<p>* Might have something to do with image statistics and glossy finishes creeping in where they don't belong.<p>* Might have to do with the efficiency of representing white spots.
I suppose that because a large part of these models is recognition probability, the shine is sort of an approximation of the most likely lighting. It isn't just the lighting you expect, but the culmination of thousands of similar yet slightly different examples. If you were to take a thousand photos of someone under all manner of light angles and average them, maybe it would look like this. Just a wild guess though.
People have started training LoRAs for Flux that look pretty real. This was a good recent example: <a href="https://www.reddit.com/r/StableDiffusion/comments/1ero4ts/fine_tuned_flux/" rel="nofollow">https://www.reddit.com/r/StableDiffusion/comments/1ero4ts/fi...</a>
One of the most interesting things about Midjourney is that it always returns multiple images, and asks the user to select which of those they would like to view at full resolution.<p>This is pretty clearly training for a preference model - so they now have MILLIONS of votes showing which images are more "pleasing" to their users.
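A rough sketch of how those selections could feed a preference model: treat each pick as a win over the three images the user skipped and nudge scores with a toy Bradley-Terry update. The styles, votes, and learning rate below are entirely made up -- this is not Midjourney's actual training setup, just the shape of the idea:

```python
import math

# Toy Bradley-Terry update: a user's pick from four candidates counts as
# a pairwise win over each non-picked candidate, and the picked option's
# score is nudged up by the gradient of the log-likelihood.
def bt_update(scores, picked, others, lr=0.1):
    for other in others:
        # probability the current scores assign to the observed pick
        p = 1 / (1 + math.exp(scores[other] - scores[picked]))
        grad = 1 - p  # log-likelihood gradient for this pairwise win
        scores[picked] += lr * grad
        scores[other] -= lr * grad
    return scores

scores = {"glossy": 0.0, "flat": 0.0, "painterly": 0.0, "noisy": 0.0}
for _ in range(100):  # simulated users keep picking the glossy candidate
    bt_update(scores, "glossy", ["flat", "painterly", "noisy"])
print(max(scores, key=scores.get))  # glossy
```

If the mass preference leans glossy, a model tuned against these scores will too -- which loops back to the thread's original question.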
I naively assumed the “airbrushed” effect AI photos have was just a way of blending components of the training data to make the result look seamless -- the opposite of how a collage of magazine clippings would appear.
Intentional choices during data set collation (to some degree 'emergent intention' due to aggregate preference). Search for 'boring realism' to find people working in other regions of latent space, e.g. this LORA: <a href="https://civitai.com/models/310571/boring-reality" rel="nofollow">https://civitai.com/models/310571/boring-reality</a> . Most of the example pictures there don't have the shiny/glossy look you're talking about.
ML-generated pseudo-photos look 3D-rendered because noise is information, and more information is both more expensive (a noisy photo can be 3x the size at the same resolution) and creates more opportunities for self-inconsistencies (e.g., with real camera sensor noise) that make fakes easier to identify automatically.
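The "noise is information" claim is easy to demonstrate with nothing but the standard library: a smooth gradient compresses far better than the same gradient with per-pixel noise added. The image dimensions and noise range below are arbitrary, and exact byte counts will vary with the zlib version:

```python
import random
import zlib

random.seed(0)
W = H = 64

# A smooth horizontal gradient, 64x64 "pixels", one byte each.
smooth_img = bytes((x * 255) // W for _ in range(H) for x in range(W))
# The same gradient with +/-20 of per-pixel noise, clamped to byte range.
noisy_img = bytes(min(255, max(0, b + random.randint(-20, 20))) for b in smooth_img)

# The noisy version is the same resolution but takes several times more
# bytes to represent losslessly.
print(len(zlib.compress(smooth_img)), len(zlib.compress(noisy_img)))
```

A generator that omits sensor noise and fine texture is therefore producing exactly the kind of low-information surface that reads as "rendered".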
Because that's the kind of image that AI trainers like the most? Would they rather train them on old newspapers?<p>That would be the "oiled bodybuilder" look applied to image training. Maybe similar and clearly defined lighting also lets AIs match features much better, especially volumes.
I tend to agree. However, I tried continuing to prompt ChatGPT to make the picture less "AI-like", and it actually did a really good job after 5 or 6 attempts. I'm not sure why it took so much prompting. Further prompting beyond that just made it worse.
Because models are trained on images that are usually edited in post-production with this aesthetic.<p>Mostly highlights down, shadows and clarity up. I often need to edit it back to get realistic-looking light on the scene.<p>Also, "--s 0" helps with generating more realistic images.
AI is compression and compression, of any lossy kind, usually works by removing the high-frequency information first. That applies to both audio and imagery. It's obviously not the only factor, but I bet it's an important one.
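A quick sketch of that intuition: a 3-tap moving average is about the crudest lossy filter there is, and it passes a low-frequency wave almost untouched while nearly erasing a high-frequency one. The signal length and frequencies below are arbitrary choices for illustration:

```python
import math

# A 3-tap moving average: a crude low-pass filter standing in for the
# "throw away high frequencies first" behavior of lossy compression.
def smooth(signal):
    return [(signal[i - 1] + signal[i] + signal[i + 1]) / 3
            for i in range(1, len(signal) - 1)]

def amplitude(signal):
    return max(abs(s) for s in signal)

n = 256
low = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]    # 2 cycles
high = [math.sin(2 * math.pi * 85 * i / n) for i in range(n)]  # 85 cycles

# The low-frequency wave survives almost at full amplitude; the
# high-frequency one is nearly wiped out.
print(amplitude(smooth(low)), amplitude(smooth(high)))
```

Skin pores, film grain, and sensor noise all live in those high frequencies, which is why the survivors of aggressive compression look waxy.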
The noisy texture of skin, combined with noise from light hitting the camera sensor, plus the noise introduced by image compression, creates an effect that is too subtle and random to “learn”.
Models have Goodharted themselves into oblivion. That's the result of endless cycles of aesthetic preference optimization, training on synthetic data, repeat ad nauseam.
I believe it's due to AI's limited ability to generate localized texture details effectively, often resorting to the use of highlights as a concealment strategy.
There’s an added objective in some of these models to make outputs more aesthetically pleasing, based on subjective crowdsourced data, and that very likely contributes to this.
I don't know what you mean by a "shiny/glossy lighting look." Could you give some examples?<p>I have noticed that <i>a lot</i> of AI images are generated in a "realistic cartoon" style, and I assume that's to smooth over some uncanniness.