Comparing Adobe Firefly, Dalle-2, and OpenJourney

231 points by muhammadusman almost 2 years ago

24 comments

kouteiheika almost 2 years ago

For reference, here's what you can get with a properly tweaked Stable Diffusion, all running locally on my PC. It can be set up on almost any PC with a mid-range GPU in a few minutes if you know what you're doing. I didn't do any cherry-picking; this is the first thing it generated, 4 images per prompt.

1st prompt: https://i.postimg.cc/T3nZ9bQy/1st.png
2nd prompt: https://i.postimg.cc/XNFm3dSs/2nd.png
3rd prompt: https://i.postimg.cc/c1bCyqWR/3rd.png

Skywalker13 almost 2 years ago

And here with BlueWillow (https://www.bluewillow.ai/):

1: https://media.discordapp.net/attachments/1060989219432054835/1120800535306575902/18e3a6c1-5d5d-4947-b677-b220a7cc856d.jpg?width=683&height=683
2: https://media.discordapp.net/attachments/1060989219432054835/1120800584912605184/29cf5372-1f95-4a9a-9d23-9fdea491bcc0.jpg?width=683&height=683
3: https://media.discordapp.net/attachments/1060989219432054835/1120800645616771222/50318f57-de17-49e1-9bd0-a9a226aa9189.jpg?width=683&height=683

mdorazio almost 2 years ago

Since the author didn't have access to Midjourney, here are the first two prompts in MJ with default settings (not upscaled):

https://imgur.com/a/siQG06O
https://imgur.com/a/vp2oOHu

cainxinth almost 2 years ago

Amazing how quickly Dalle-2 went from among the best image transformers to among the worst.

poniko almost 2 years ago

Midjourney is still so far ahead it's no competition. Did a lot of testing today and Firefly generated so many errors with fingers and stuff, not seen that since the original Stability release. Anyone know if the web Firefly and the Photoshop version are the same model?

FanaHOVA almost 2 years ago

I had done a similar comparison a couple of months back but used Lexica instead of DALL-E.

Seems clear to me that Midjourney has by far the best "vibes" understanding. Most models get the items right but not the lighting. Firefly seems focused on realism, which makes sense for a photography audience.

https://twitter.com/fanahova/status/1639325389955952640?s=46&t=IVF1sX_TGndxvax1l-hJ0Q

mdorazio almost 2 years ago

Kind of strange to me that they didn't test any prompts with people in them. In my experience that tends to show the limitations of various models pretty quickly.

dvt almost 2 years ago

Adobe Firefly is actually extremely competent, especially since it doesn't use copyrighted images in its training set. Using MidJourney (which is fantastic) commercially will be a quagmire for the unlucky company that draws a lawsuit.

theobromananda almost 2 years ago

All three of these are horrible, and running Stable Diffusion locally produces incredibly better results as seen in this comment section.

MediumD almost 2 years ago

*Shameless Plug*

If you want to play around with OpenJourney (or any other fine-tuned Stable Diffusion model), I made my own UI with a free tier at https://happyaccidents.ai/.

It supports all open-source fine-tuned models and LoRAs, and I recently added ControlNet.

og_kalu almost 2 years ago

Should be compared using Bing Image Creator (a better version of DALL-E) rather than the Dalle-2 site.

abeppu almost 2 years ago

Is it intentional that each of the prompts is given twice in that blockquote? It's done without a space, so e.g. in the 2nd example, the word "centeredvalley" appears because of the way the last/first words of the first/second repetition were mashed together. Does that indicate what was actually given to the engines, or was that a copy-paste issue made only while putting together the article? I could imagine that non-words like "cornera" in the last example could throw things off?
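To illustrate the doubling issue raised above, here is a minimal Python sketch that checks whether a prompt is the same text repeated twice with no separator, and returns the single copy if so. The function name `undouble` and the sample prompt are made up for illustration; the actual prompts from the article are not reproduced here.

```python
def undouble(prompt: str) -> str:
    """Return one copy if `prompt` is the same text repeated twice back to back.

    Hypothetical helper: it only handles an exact, separator-free repetition,
    which is the pattern that produces fused words like "centeredvalley".
    """
    half, remainder = divmod(len(prompt), 2)
    if remainder == 0 and half > 0 and prompt[:half] == prompt[half:]:
        return prompt[:half]
    return prompt


# Made-up example prompt, pasted twice without a space in between.
doubled = "valley at dusk, centered" * 2
print(doubled)            # the middle reads "...centeredvalley..."
print(undouble(doubled))  # the single copy of the prompt
```

Running a check like this before sending prompts to each generator would at least rule out the copy-paste explanation.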

throwaway742 almost 2 years ago

My result for prompt 2 using the Dreamshaper Stable Diffusion model:

https://i.imgur.com/ipnf3f5.png

rgbrgb almost 2 years ago

For those curious, I tried the same prompts with Kandinsky 2.1 [0]. In my experience it kind of blends the conceptual understanding of DALL-E with the higher quality image generation of Stable Diffusion. Like Midjourney though, it kind of injects its own style and allows you to get "satisfying" results from short prompts.

The flaw with these comparisons is that you really shouldn't use the same prompt with different generators. If you want to get the best results you do have to play with the prompts and do a bunch of iteration to kind of explore the latent space and find what you're looking for. The first super long prompt looks like it's tuned for Stable Diffusion, for instance. Different generators also have different syntax (e.g. with Stable Diffusion you can surround a phrase with parens to give it extra emphasis).

[0]: https://iterate.world/s/clj4n19u20000jv08iqygiaqw
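As a rough illustration of the paren-emphasis syntax mentioned above, the sketch below splits a prompt into chunks and assigns each a weight based on how many parentheses enclose it. It assumes the convention used by some Stable Diffusion front ends (AUTOMATIC1111's web UI, as far as I know), where each level of parentheses multiplies attention by 1.1; the function name `emphasis_weights` and the sample prompt are hypothetical, and real front ends support more syntax (explicit weights, brackets, escapes) than this handles.

```python
def emphasis_weights(prompt: str, factor: float = 1.1) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) chunks by parenthesis nesting depth.

    Assumption: each enclosing pair of parentheses multiplies the weight by
    `factor` (1.1 by default). This is an illustrative parser only.
    """
    chunks: list[tuple[str, float]] = []
    buffer: list[str] = []
    depth = 0

    def flush() -> None:
        # Emit the buffered text at the current depth, dropping stray commas.
        text = "".join(buffer).strip(" ,")
        if text:
            chunks.append((text, round(factor ** depth, 4)))
        buffer.clear()

    for ch in prompt:
        if ch == "(":
            flush()
            depth += 1
        elif ch == ")":
            flush()
            depth = max(depth - 1, 0)  # tolerate unbalanced closing parens
        else:
            buffer.append(ch)
    flush()
    return chunks


# Made-up prompt: one phrase with single emphasis, one with double emphasis.
for text, weight in emphasis_weights("a haunted house, (eerie shadows), ((fog))"):
    print(f"{weight:.2f}  {text}")
```

A prompt tuned this way carries meaning only for generators that parse the syntax; elsewhere the parentheses are just extra characters in the text.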

cubefox almost 2 years ago

Here is what the haunted house looks like with Dall-E ~3 (Bing Image Creator): https://www.bing.com/images/create/a-haunted-house-with-ghostly-apparitions2c-eerie-sh/649242295c2c43659807371ae17d875a

Generally, this model is much better than Dall-E 2, and it beats Firefly in some areas (I didn't try Midjourney or Stable Diffusion). Firefly usually produces photos with significantly fewer visual mistakes (like the wrong number of fingers or messed up faces) than the Bing Dall-E. But the latter usually understands prompts much better and more often produces something that matches them well. Firefly also doesn't "know" a lot of pop culture or history things, e.g. Marilyn Monroe, or what Coca-Cola is.

pdntspa almost 2 years ago

Why didn't this person include Stable Diffusion?

SoKamil almost 2 years ago

Can we appreciate how well that lightbox works on this site in a mobile browser, especially Safari? Also, the gestures are smooth and do not cause any quirks like an unintended refresh gesture.

personjerry almost 2 years ago

The analysis at the end seems to be lacking. From my perspective, Photoshop and Midjourney come out on top in terms of aesthetic and accuracy, with kouteiheika's Stable Diffusion results [0] a close second. Dall-E falls far behind, which makes sense considering all the work that's gone into the other systems to fine-tune and build ecosystems around them.

[0]: https://news.ycombinator.com/item?id=36408744

senko almost 2 years ago

For comparison, these were generated using the Stability.ai API: https://postimg.cc/gallery/MQfkgP7/ce388adf

I used the stable-diffusion-xl-beta-v2-2-2 model, copy-pasted prompts from the blog post, one-shot for each prompt. I chose style presets that closely matched the prompt (added as suffixes in image filenames).

whatscooking almost 2 years ago

I like how simple Firefly’s images are, like something you’d want to work with in Photoshop. Dalle-2 looks terrible. Midjourney is still my favorite.

Aeolun almost 2 years ago

> small windows opening onto the garden

Literally all of the examples have floor-to-ceiling windows across the entire length of the wall…

dahwolf almost 2 years ago

I'm glad it's not just me getting unusable garbage out of Dall-E and glorious results from MidJourney.

snowe2010 almost 2 years ago

Not sure this is a good comparison. Midjourney likes much shorter prompts, and honestly they're all absolutely terrible for anything that isn't 'photo' based. E.g. ask it to generate a word bubble of the most common programming languages and it will fail every time, no matter what you try. I love it for photo stuff, but for Photoshop you'd expect it to be able to do other things as well.

muhammadusman almost 2 years ago

Author here: I updated the post to include the generated results from Stable Diffusion and Midjourney (thanks to kouteiheika and mdorazio).
