How to generate realistic people in Stable Diffusion

120 points by m0wer 12 months ago

22 comments

zevv 11 months ago
I might be moving in the wrong social circles, but none of the people I know look anything like the realistic people in these images. Are these models even able to generate pictures of *actual* normal everyday people instead of glossy photo models and celebrity lookalikes?
moritzwarhier 11 months ago
Generating fake portrait photos seems kind of boring.

Wouldn't these kinds of negative prompts and tweaking break down if I wanted to plug in more varied descriptions of people?

I find it interesting to plug in colorful descriptions of a person's traits from a novel, for example, or of people actually *doing* something.

Using "ugly", "disfigured" as a negative prompt probably wouldn't work then...

For the pictures in the article, my first association is someone generating romance scam profile pictures, not art.
Animats 11 months ago
Try using the "I Can't Believe It's Not Photography" model. Instead of trying to micro-manage the details, use strong emotional terms. I've had good results with prompts along the lines of "Aroused angry feral Asian woman wearing a crop top riding a motorcycle fast in monsoon rain."[1]

[1] https://i.ibb.co/3zHGyrR/feral34.png
ginko 11 months ago
Why does everything generated by SD seem to have this weird plasticky sheen to it? Is that a preference of people generating these or innate to the model?
supriyo-biswas 11 months ago
It seems to be a requirement to have a model trained on a large number of explicit images to generate correct anatomy.

While people have tried going from a base model to a fine-tuned model based on explicit images, I wonder if there are people attempting to go the other way round (train a base model on explicit photographs and other images not involving humans, then fine-tune away the explicit parts), which might lead to better results?
xg15 11 months ago
> *Caution - Nearly all of [the special-purpose models] are prone to generating explicit images. Use clothing terms like dress in the prompt and nude in the negative prompt to suppress them.*

I like how even with all the "please don't make it porn" terms in the prompt, you can easily see (by choice of dresses, cleavage, pose, facial expressions etc.) which models "want" to generate porn and are barely held back by the prompt.
BobbyTables2 11 months ago
I find that with Stable Diffusion, images for a given type of prompt all seem to show the same person. Add something like “mature” to the prompt and you get a different (but again the same) person for all those images, regardless of seed.

When you give it prompts for things it hasn’t seen in training data, the results start to look less realistic.

I have even seen adult video logos in generated images.

I strongly suspect AI is not what we think.
nuz 11 months ago
Use SDXL instead of SD 1.5. Also, don't do negative prompting on "distorted, ugly", because everyone follows these guides blindly and the result is the same type of mode-collapsed faces that people have learned to recognize as AI. Use LoRAs instead (to make it look slightly amateurish, or just to move away from the cookie-cutter AI style which everyone notices).
roenxi 11 months ago
This is Stable Diffusion 1.5. https://civitai.com/ and https://huggingface.co/ suggest the popular options are SDXL based - it is a much better model (effectively SD 2.5). Still imperfect, but much better.
throwaway1571 11 months ago
I find all those photos generated by Stable Diffusion to be kind of repetitious and boring.

Eking out something "interesting" is difficult, especially with limited time *and* low-end hardware. Interesting is highly subjective, of course. I tend towards the more artistic / surrealist style, usually NSFW. Only nudes, no pornography.

I've been experimenting these last few months with generating interesting images, trying to make them "artistic" rather than photo-realistic, or the usual bland anime tributes.

I usually pick a "classical" artist who already has nudes in their repertoire, and try to blend their style with some photos I take myself, and with the style of other artists.

Most fall flat; some come close to what I consider acceptable, but still have major flaws. However, given my time and hardware constraints, they're good enough to post. I use fooocus, which is kind of limiting, but after trying and failing to produce satisfactory results with Automatic, fooocus is just what I needed.

I can't really understand why more people don't do the same. Stable Diffusion was trained on a long and diverse list of artists, but most people seem to disregard that and focus only on anime or realistic photographs. The internet is inundated with those. I'm following some people on Mastodon who post more interesting stuff, but they usually tend to be all same-ish. I try to produce more diverse stuff, but most of the time it feels like going against the grain.

The women still tend to look like unrealistic supermodels. Sometimes this is what I want. Sometimes not, and it takes many tweaks to make them normal women, and usually I can't spare the time. Which is unfortunate.

If anyone's interested, I post the somewhat better experiments at:

https://mastodon.social/@TheNudeSurrealist

Warning: Most are NSFW. But they are NSFW in the way Titian's Venus, say, is NSFW.
coreyh14444 11 months ago
Impressive breakdown, but this is six months old.
9dev 11 months ago
If your goal is to generate content for a fully fictional celebrity magazine, this article will help you.

How come this technology appears to be exclusively used to generate fake pictures of unrealistically good-looking women? And to what end..?
Havoc 11 months ago
Step 1: Don’t use Stability’s latest model
cubefox 11 months ago
Here it may be more reasonable to actually pay some money for commercial models that are far ahead of Stable Diffusion in terms of image quality and prompt understanding. Like Dall-E 3, Imagen 2 (Imagen 3 comes out soon), or Midjourney. The gap between free and commercial diffusion models seems to be larger than the gap between free and commercial LLMs.
aranelsurion 11 months ago
Wanted to give it a try just for fun, using the same prompts, base model and parameters (as far as I can tell), and the first 5 images that were created... will probably haunt me in my dreams tonight.

I don't know if it was me misconfiguring it, or if the images in the post were really cherry-picked.
antihero 11 months ago
Scrolling through the article, the pictures look no more realistic as it goes on.

You need to simulate poor lighting, dirt, soul, realistic beauty etc. Perhaps even situations that give a reason for a photo to be taken other than "I’m a basic heteronormative woman who is attractive".
efilife 11 months ago
The images generated by this guy are nowhere close to realistic. The resolution he's using is terrible for getting realistic faces. Most people with better GPUs get way better results whilst using 10% of the tricks from the article.
virtualritz 11 months ago
It's kinda telling when the author says (e.g. about the "Realistic Vision v2" model) that "the anatomy is excellent [...]" when this is obviously not the case.

Actually, it is not the case in a single image in that blog post.

If you have a trained eye, that is.
nubinetwork 11 months ago
I have a hard time believing that the huge prompt they used at the end (before img2img) will fit in a diffusers prompt. I noticed that after 75 tokens or so, it just chops off the prompt and runs with whatever didn't get cut.
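
The truncation described in this comment can be illustrated with a small sketch. It assumes the usual CLIP text-encoder limit of 77 positions, two of which go to the start and end tokens, leaving 75 for the prompt. The whitespace split is only a stand-in for CLIP's real BPE tokenizer, so actual token counts will differ, and the helper names `split_at_limit` and `chunk_prompt` are hypothetical, not part of diffusers. The second helper shows one possible workaround (encoding consecutive 75-token windows separately), not necessarily what the article's author did.

```python
# Illustration only: a whitespace split stands in for CLIP's BPE tokenizer,
# so real token counts will differ from the counts produced here.
MAX_PROMPT_TOKENS = 75  # assumed: 77-token CLIP context minus start/end tokens


def split_at_limit(prompt: str, limit: int = MAX_PROMPT_TOKENS) -> tuple[str, str]:
    """Return the (kept, dropped) parts of a prompt under a hard token limit."""
    tokens = prompt.split()
    kept = " ".join(tokens[:limit])
    dropped = " ".join(tokens[limit:])
    return kept, dropped


def chunk_prompt(prompt: str, limit: int = MAX_PROMPT_TOKENS) -> list[str]:
    """Split a long prompt into consecutive windows of at most `limit` tokens."""
    tokens = prompt.split()
    return [" ".join(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]


if __name__ == "__main__":
    long_prompt = " ".join(f"word{i}" for i in range(100))
    kept, dropped = split_at_limit(long_prompt)
    print(len(kept.split()), len(dropped.split()))         # 75 25
    print([len(c.split()) for c in chunk_prompt(long_prompt)])  # [75, 25]
```

With a hard limit, everything in `dropped` never reaches the model, which matches the behaviour described above; with chunking, each window would be encoded on its own and the embeddings combined afterwards.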
jredwards 11 months ago
I've spent a ton of time playing with Stable Diffusion, just for amusement. I've rarely found it interesting to generate realistic people.
siilats 11 months ago
Missing reference to DreamBooth
pandemic_region 11 months ago
Scary, that's all I can say.