
Stable Diffusion animation

616 points by gcollard, over 2 years ago

24 comments

fagerhult, over 2 years ago
Andreas, author of the Replicate model here -- though "author" feels wrong since I basically just stitched two amazing models together.

The thing that really strikes me is that open source ML is starting to behave like open source software. I was able to take a pretrained text-to-image model and combine it with a pretrained video frame interpolation model, and the two actually fit together! I didn't have to re-train or fine-tune or map between incompatible embedding spaces, because these models can generalize to basically any image. I could treat these models as modular building blocks.

It just makes your creative mind spin. What if you generate some speech with https://replicate.com/afiaka87/tortoise-tts, generate an image of an alien with Stable Diffusion, and then feed those two into https://replicate.com/wyhsirius/lia. Talking alien! Machine learning is starting to become really fun, even if you don't know anything about partial derivatives.
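To make the "modular building blocks" point concrete, here is a minimal sketch of chaining two hosted models with Replicate's Python client. The model IDs and input field names are illustrative assumptions, not the exact pipeline behind this demo:

```python
# Minimal sketch: chain two pretrained models like building blocks.
# Assumes the `replicate` Python client and a REPLICATE_API_TOKEN env var;
# model IDs and input field names are illustrative assumptions.
import replicate

prompts = [
    "a portrait of a friendly alien, studio lighting",
    "a portrait of a friendly alien smiling, studio lighting",
]

# 1. Text-to-image: generate keyframes with Stable Diffusion.
keyframes = [
    replicate.run("stability-ai/stable-diffusion", input={"prompt": p})[0]
    for p in prompts
]

# 2. Video frame interpolation: synthesize the in-between frames (FILM).
video_url = replicate.run(
    "google-research/frame-interpolation",
    input={"frame1": keyframes[0], "frame2": keyframes[1],
           "times_to_interpolate": 4},
)
print(video_url)
```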
desindol, over 2 years ago
Maybe I can shine some light on the debate from the standpoint of a concept artist who works in VFX and advertising. I've worked on feature films (3 of them in the IMDb top 100), TV shows (like Game of Thrones), and hundreds of ad campaigns.

In the last 10 years the work of a concept artist has changed dramatically: we have gone from purely painted concept art to mostly "photobashed" work. Photobashing basically means that you rip apart other images and stitch them together to get the desired image. Some start with a rough sketch for the composition, or make a really rough grey-shade 3D model and "overpaint" it. When it comes to photobashing, the disregard for copyright was always there; it's worst in smaller studios and a bit better in the leading ones. Still, most of the time everyone argues that if you only use really small parts of the images it is covered by fair use. There are some examples where studios got sued, but mostly without bigger financial impact.

A few months ago I started working with DiscoDiffusion to generate the images I use to photobash. DiscoDiffusion can produce great "painterly" images but struggles with photorealism, and is slower and not as coherent as Stable Diffusion. Still, the adoption rate in the concept art community was insanely fast. This all got topped by Stable Diffusion in the last week. Of course there are still people who want to do it the "right" way and not use AI, but we had the same discussion years ago when photobashing came along and some artists still wanted to paint the whole image. As a concept artist you are mostly paid for your design thinking; it is less about the process and more about the finished product. The turnaround time for styleframes went from 3-4 hours when painting to 45 min - 1 hour when photobashing; with Stable Diffusion, my peers in the studio and I are now at 20-45 min per styleframe. When photobashing, most people are constrained by their image library and resources like photobashing kits. Not only does Stable Diffusion cut the time in half, it also gives greater freedom in composition and design, especially if you are using img2img (a minimal sketch of that workflow follows below).

So where does this leave us? For work in fast-paced art environments like VFX, games, concept art, or advertising, Stable Diffusion is a welcome game changer. Traditionalists and artists outside the industry might feel threatened, but for us in these industries it's a godsend.
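For readers unfamiliar with img2img: you feed in a rough composition (a photobash, sketch, or grey-shade render) plus a text prompt, and the model repaints it. A minimal sketch with Hugging Face's diffusers library; the base model and parameter values are assumptions for illustration, not the commenter's actual setup:

```python
# Sketch of an img2img styleframe pass, assuming the diffusers library.
# Model ID and parameter values are illustrative choices.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Start from a rough photobash / grey-shade render instead of pure noise.
init = Image.open("rough_composition.png").convert("RGB").resize((768, 512))

result = pipe(
    prompt="matte painting of a ruined cathedral, volumetric light",
    image=init,
    strength=0.6,        # how far the model may stray from the input
    guidance_scale=7.5,  # prompt adherence
).images[0]
result.save("styleframe.png")
```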
r3trohack3r, over 2 years ago
I just came across this on Twitter: every frame appears to be an evolution of the previous frame using img2img, paired with a tilt/zoom to create a psychedelic animation.

The author claims to have made this with Stable Diffusion, Disco, and Wiggle: https://www.youtube.com/watch?v=Nz_n0qxqoPg

I believe Wiggle is used to automate the tilt/zoom between frames.
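That feedback loop is simple to sketch: apply a small zoom to the last frame, then run img2img on the result. A hedged outline under the same diffusers assumptions as above; the transform parameters and frame count are illustrative:

```python
# Sketch of the zoom-and-feed-back animation loop described above.
# Each frame is the previous frame, slightly zoomed, re-dreamed by img2img.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def zoom(img: Image.Image, factor: float = 1.04) -> Image.Image:
    """Crop the center by `factor` and scale back up: a cheap zoom-in."""
    w, h = img.size
    cw, ch = int(w / factor), int(h / factor)
    left, top = (w - cw) // 2, (h - ch) // 2
    return img.crop((left, top, left + cw, top + ch)).resize((w, h))

frame = Image.open("seed_frame.png").convert("RGB").resize((512, 512))
for i in range(120):  # illustrative frame count
    frame = pipe(
        prompt="psychedelic fractal tunnel, vivid colors",
        image=zoom(frame),
        strength=0.45,  # low strength keeps consecutive frames coherent
    ).images[0]
    frame.save(f"frame_{i:04d}.png")
```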
mrpf1ster, over 2 years ago
I feel like I'm watching an explosion of progress in AI image generation in real time.

Every day there's a new application of Stable Diffusion. It's incredible to watch unfold.
NoMoreBro, over 2 years ago
Very cool! I generated this with 1000 images, https://twitter.com/UnshushProject/status/1563158214577094657, using Deforum's Colab [1]; it's really easy and now has interpolation too. It was my very first video. I could have made something great, but, you know, awesome guys keep releasing AI tech and I'm like a child at Luna Park right now, not able to concentrate.

If you are interested in my project (I doubt it, you are too busy playing like me), I'm posting a lot of things on https://unshush.com and on the Instagram account https://www.instagram.com/unshushproject/ (sorry for posting my stuff, but I'm not very social so no one will ever see it otherwise).

If you want to generate videos, I can share some links I bookmarked of software/code to make them smoother.

[1] Deforum's Colab (based on Stable Diffusion): https://colab.research.google.com/github/deforum/stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb
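For context, Deforum's notebook drives a virtual camera with keyframed schedule strings of the form "frame:(value)". A hedged excerpt of what those animation settings look like (key names as best recalled from the notebook; values are illustrative, not the settings behind the video above):

```python
# Hedged excerpt of Deforum-style animation settings. Schedules are
# "frame:(value)" strings that the notebook interpolates per frame.
animation_mode = "2D"        # per-frame img2img with a moving virtual camera
max_frames = 1000            # e.g. the 1000 images mentioned above
zoom = "0:(1.04)"            # constant slow zoom-in
angle = "0:(0), 500:(0.5)"   # start rotating at frame 500
translation_x = "0:(0)"
translation_y = "0:(-1)"     # drift upward
diffusion_cadence = 2        # diffuse every 2nd frame, warp the rest
```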
autoexec, over 2 years ago
AI in animation has been interesting to me for a while now. It leaves me a little conflicted, though. If we get to the point where we can throw key drawings at AI and let it handle all the in-betweens without a bunch of tweaking and cleanup afterwards, it's going to really suck for places like Korea! I guess all those in-betweeners will just be another victim of automation.

I've always loved animation, but I'll admit part of that comes from the hubris involved. It's pure insanity that people ever drew, by hand, mountains of individual drawings, each slightly changed, and assembled them into compelling illusions to tell stories. The amount of work that goes into animation is just staggering, and anyone sensible would have rejected the entire concept as absurd. I wonder if animation will start losing part of its magic for me when it's done primarily by AI.

On the other hand, another thing I've always loved about animation as a storytelling medium is that it isn't as limited by practical concerns like physics or reality. If something can be imagined, it can be drawn and animated, provided somebody has the skill and the resources to fund the massive amount of work. It's time/money that forces animators to take shortcuts and make compromises. Creative decisions are made and rejected all the time due to those constraints. If AI-driven animation gets advanced enough that that's no longer such a barrier, it could create output more in line with the vision of creators, and that's exciting too!

I hope that traditional hand-drawn animation never dies, but I look forward to seeing how AI continues to change the industry and the output.
Dwedit, over 2 years ago
In this example, 25 frames are generated using Stable Diffusion, then in-between frames are interpolated using FILM-Net. I hadn't seen FILM-Net before; it looks really neat.
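A note on the arithmetic: FILM synthesizes a midpoint frame between two inputs, and applying it recursively roughly doubles the frame count per pass, so n passes over 25 keyframes yield 24 * 2^n + 1 frames (3 passes gives 193). A minimal sketch of that recursion, with `midpoint` as a hypothetical stand-in for the actual model call:

```python
# Sketch of recursive midpoint interpolation in the style of FILM.
# `midpoint` is a hypothetical stand-in for the real model call.
from typing import Callable, List, TypeVar

Frame = TypeVar("Frame")

def interpolate_recursive(
    frames: List[Frame],
    midpoint: Callable[[Frame, Frame], Frame],
    passes: int,
) -> List[Frame]:
    """Each pass inserts one synthesized frame between every adjacent
    pair, roughly doubling the temporal resolution."""
    for _ in range(passes):
        out = [frames[0]]
        for a, b in zip(frames, frames[1:]):
            out.append(midpoint(a, b))  # synthesized in-between frame
            out.append(b)
        frames = out
    return frames

# 25 keyframes with 3 passes -> 24 * 2**3 + 1 == 193 frames
```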
Lwepz, over 2 years ago
It's clear that the next frontier is to have 3D-space instead of image-space transitions. Language itself is very static, and action verbs are not enough to specify scene dynamics. I suppose we would need: A. an enriched version of natural language that refines the dynamic processes that occur in a scene, and B. a dataset of isolated processes labeled in the language described in A.

I've had a hard time finding ongoing work on A and B; perhaps it isn't much of a priority for research groups.
fragmede, over 2 years ago
There's also https://gist.github.com/karpathy/00103b0037c5aaea32fe1da1af553355
justinlloyd, over 2 years ago
Last year, when 3090 GPUs were astronomically priced, I thought "screw it, I'll just buy an RTX A5000 for a couple of hundred bucks more." Which begat a second A5000, for "reasons." It was almost prescient. Now all these models are coming out requiring slightly more VRAM than a 3090 has, i.e. more in the range of the A5000, and I get to run them. I have been a kid in a candy store these past couple of weeks.
zone411, over 2 years ago
I played a bit with smoothly interpolating between Stable Diffusion prompts, and the effect can be pretty cool, but it's hard to avoid discontinuities (like the object changing its orientation), even when using some additional tricks like reusing the previous frame as the initial image, or generating several new frames and choosing the one that's closest to the previous frame (sketched below). You basically have to get lucky with the seed. It probably makes the most sense to just wait for video models that take temporal consistency into account explicitly, or that generate 3D models. There is a lot of promising research out there already, so it's just a matter of time.
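A hedged sketch of that best-of-k trick: sample several candidates per frame and keep the one closest to the previous frame. Plain per-pixel MSE stands in for the distance here; a perceptual metric like LPIPS would likely behave better. The `pipe` argument is assumed to be a diffusers img2img pipeline like the ones sketched earlier:

```python
# Sketch: reduce frame-to-frame flicker by sampling k candidates and
# keeping the one closest to the previous frame. `pipe` is assumed to be
# a diffusers img2img pipeline; MSE is a simple stand-in for LPIPS.
import numpy as np

def mse(a, b) -> float:
    return float(np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2))

def next_frame(pipe, prompt: str, prev_frame, k: int = 4):
    candidates = pipe(
        prompt=prompt,
        image=prev_frame,         # reuse previous frame as the init image
        strength=0.5,
        num_images_per_prompt=k,  # several samples in one call
    ).images
    return min(candidates, key=lambda c: mse(c, prev_frame))
```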
cerol, over 2 years ago
My predictions for 10-15 years out:

- Mandela effect for famous art pieces: "The Mona Lisa was AI generated." "No it wasn't." "Yes it was."

- Art critics will get the last laugh, as people start giving them truckloads of money to say whether a piece of art is human- or AI-generated.
gdubs, over 2 years ago
I've been playing around w/ Stable Diffusion animation using the "deforum" notebook. It's taken a bit to really understand how to get results I like with it, but I'm super happy with how this one came out:

https://twitter.com/dreamwieber/status/1565008078466326528?s=21&t=93XvRkPe07uzwpikT4GXaw

It's a pretty magical time with this tech. Things are moving very rapidly, and I feel excited the way I did when I first rendered 3D animation on my 286 from RadioShack.
synu, over 2 years ago
Wow, this is incredible. AI tech has been so interesting to follow along with lately.

Is there something like an index of cool new AI projects that is easy to follow? HN works for this to an extent, but I'd love to track more closely.
monkeydust, over 2 years ago
It is incredible how fast this thing is progressing. Amazing what you can do with some very smart people and a 4,000-GPU A100 cluster!

What is getting very clear, though, and this link proves it out, is that "prompt engineering" is really a thing. I tried this out, and it took a while to get something I would consider half decent.

I feel like there is a space here for tools/technologies to "suggest" prompts based on understanding user intentions. If anyone is actually working on this, reach out to me. Email in profile.
hackerlight, over 2 years ago
Will models like Stable Diffusion be useful for self-driving car research? You've got this large NN with weights that are useful for this vision-adjacent task; it should have learned concepts such as edge detection, which could serve as pretrained weights for a self-driving NN.
imhoguy, over 2 years ago
Amazing! So now we're counting the years until the first AI feature film hits cinema screens. Source code: the screenplay text. I imagine it may look a bit like "A Scanner Darkly" (2006).
simultsop, over 2 years ago
Brace yourselves, Cartoon AI Network, coming soon xD
amelius, over 2 years ago
I guess it is missing the physics simulation between frames. Perhaps that is the next big step for ML to get right.
shlip, over 2 years ago
Great, Miyazaki and Lasseter can finally retire.
jdamon96, over 2 years ago
incredible!
t2hv33, over 2 years ago
Ok
pupe151139, over 2 years ago
Com.Samsung.Android.Game.gos:2290:9908:313b42360002