The most incredible thing here is that this demonstrates a level of 3D understanding that I didn't believe existed in 2D image models yet. All of the 3D information in the output was inferred from the training set, which consists exclusively of uncurated, unsorted 2D still images. No 3D models, no camera parameters, no depth maps. No information about picture content other than a text label (scraped from the web and often incorrect!).<p>From a pile of random, undifferentiated images the model has learned the detailed 3D structure and plausible poses and variants of thousands (millions?) of everyday objects. And all we needed to get that 3D information out of the model was the right sampling procedure.
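For anyone wondering what "the right sampling procedure" amounts to in practice: the paper's Score Distillation Sampling loop renders a NeRF from random cameras and lets a frozen 2D diffusion model score each rendering. Below is a minimal, unofficial sketch of one such step in PyTorch; `render_view` and `predict_noise` are hypothetical stand-ins for a differentiable renderer and the pretrained text-to-image model, and the paper's timestep weighting w(t) is omitted for brevity.

```python
# Minimal sketch of a Score Distillation Sampling (SDS) step, assuming a
# differentiable NeRF renderer `render_view(params)` and a frozen diffusion
# model `predict_noise(noisy_image, t, text_embedding)` -- both hypothetical
# stand-ins, not the paper's actual code.
import math
import torch

def sds_step(params, optimizer, render_view, predict_noise, text_embedding,
             num_timesteps=1000):
    # Render the current 3D scene from a randomly sampled camera pose.
    image = render_view(params)  # (1, 3, H, W), differentiable w.r.t. params

    # Corrupt the rendering with noise at a random diffusion timestep.
    t = torch.randint(20, num_timesteps, (1,))
    alpha = torch.cos(0.5 * math.pi * t.float() / num_timesteps)
    sigma = torch.sin(0.5 * math.pi * t.float() / num_timesteps)
    noise = torch.randn_like(image)
    noisy = alpha * image + sigma * noise

    # Ask the frozen 2D model what noise it thinks was added, given the prompt.
    with torch.no_grad():
        pred_noise = predict_noise(noisy, t, text_embedding)

    # SDS trick: treat (pred_noise - noise) as the gradient w.r.t. the rendered
    # pixels and backpropagate it into the NeRF parameters only.
    image.backward(gradient=pred_noise - noise)
    optimizer.step()
    optimizer.zero_grad()
```

The key point is that the diffusion model is never fine-tuned: it only supplies a gradient on rendered pixels, and that gradient, applied from many viewpoints, is what forces the 3D representation into shape.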
Did we hit some sort of technical inflection point in the last couple of weeks, or is it just a coincidence that all of these ML papers around high-quality procedural generation are dropping every other day?
In the Make-A-Video thread I said that things are getting more and more impressive by the day. I was wrong, because that was a couple of hours ago. They're getting more and more impressive by the HOUR.<p>I'm curious where this will end up in a year. Will it plateau? If so, when?
Huh, it's a pretty similar technique to what I outlined a couple days ago: <a href="https://news.ycombinator.com/item?id=32965139" rel="nofollow">https://news.ycombinator.com/item?id=32965139</a><p>Although they start with random initialization and a text prompt. It seems to work well. I now see no reason we can't start with image initialization!
Can someone explain what's going on in this example from the gallery? The prompt is "a humanoid robot using a rolling pin to roll out dough":<p><a href="https://dreamfusion-cdn.ajayj.com/gallery_sept28/crf20/a_DSLR_photo_of_a_humanoid_robot_using_a_rolling_pin_to_roll_out_dough.mp4" rel="nofollow">https://dreamfusion-cdn.ajayj.com/gallery_sept28/crf20/a_DSL...</a><p>I'd have expected a static 3D scene, but if you look closely, the pin looks like it's actually rolling across the dough as the camera orbits.
Correct link with full demo: <a href="https://dreamfusion3d.github.io/" rel="nofollow">https://dreamfusion3d.github.io/</a>
As someone who went to college for 3D animation in <i>1997</i> AND DESIGNED the datacenter for Lucas' Presidio complex...<p>whereby I learned that Pixar was developed by Steve Jobs when Lucas didn't think there was a future for computer animation... and so Steve bought the Death Star from Lucas...<p>That became Pixar...<p>AI is going to fucking kill it - what will happen in the next decade is ANYONE uploading a script to an AI to make a full-length movie...<p>AND there will be editing tools as well that are AI driven...<p>As William Gibson put it:<p>*The future is here, it's just not evenly distributed yet*
This is crazy good - most prior text-to-3d models produced weird amorphous blobs that would kind of look like the prompt from some angles, but had no actual spatial consistency.<p>Blown away by how quickly this stuff is advancing, even as someone who's relatively cynical about AI art.
Coincidentally came out the same day as Meta's text-to-video. I wonder if Google deliberately held back the release to make a bigger impact somehow?
What does this mean for our understanding of intelligence?<p>It trivializes it, in my opinion.<p>Whenever someone asks whether LaMDA/GPT-3, or DreamFusion and its derivatives, are an aspect of sentience, there's always a bunch of people repeating the same cliché negative line: "no, it's only attempting to statistically mimic sentience." I agree with the reasoning.<p>But have we considered the other side of the story? That yes, the mimicry is all sentience actually is. Nothing more.
The thing that frightens me is that we are rapidly reaching broadly humanity-disrupting ML technologies without any of the social or societal frameworks to cope with them.
As someone who dabbles in 3D modeling, this is going to be an incredible resource for creating static 3D objects. Someone ought to come up with a better way to convert to a mesh than the Marching Cubes algorithm I've seen applied to most NeRFs. The models still lack coherent topology and would probably be janky if fully rigged.
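For what it's worth, the standard Marching Cubes route is short to wire up. Here is a minimal sketch, assuming you already have some queryable `density_fn(points) -> sigma` from a trained NeRF (the function name is hypothetical), using scikit-image and trimesh:

```python
# Sketch: extract a mesh from a NeRF-style density field via Marching Cubes.
# `density_fn` is an assumed callable returning volume density for (N, 3) points.
import numpy as np
from skimage import measure
import trimesh

def nerf_to_mesh(density_fn, resolution=256, bound=1.0, threshold=25.0):
    # Query the volume density on a regular grid covering [-bound, bound]^3.
    xs = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)
    sigma = density_fn(grid.reshape(-1, 3)).reshape(resolution, resolution, resolution)

    # Marching Cubes turns the density field into a triangle mesh at the chosen iso-level.
    verts, faces, normals, _ = measure.marching_cubes(sigma, level=threshold)

    # Rescale vertices from grid indices back to world coordinates and export.
    verts = verts / (resolution - 1) * 2 * bound - bound
    mesh = trimesh.Trimesh(vertices=verts, faces=faces, vertex_normals=normals)
    mesh.export("dreamfusion_object.obj")
    return mesh
```

As you say, the result is a triangle soup with no sensible edge loops, so it's more a starting point for retopology than something you'd rig directly.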
Fun that they had an octopus playing a piano.<p>I made the same thing the old fashioned way. Mine can actually play though. <a href="https://twitter.com/LeapJosh/status/1423052486760411136" rel="nofollow">https://twitter.com/LeapJosh/status/1423052486760411136</a> :P
Cool.<p>The samples are lacking definition, but they're otherwise spatially stable across perspectives.<p>That's something people have struggled with for years.
I'm not even going to pretend that I have a clue how this is done. But I'm wondering whether the output can be turned into 3D objects that can be used in any 3D modeling software? It would be a game changer for real-world product development, in terms of both speed and ease.
Futurists have been predicting when we'll have stable fusion for decades, but now we suddenly got stable diffusion working. That's good too, not what we wanted, but good. We're gonna need stable fusion or other renewables to run stable diffusion though. /s
Pretty neat, wish I could try it out (maybe I missed a link). Obviously has interesting / novel uses, but kind of reminds me of the previous discussion of upscaling audio recordings to the “soundstage” format. I doubt most 2D images want to be 3D ;)
Unclear to me what is going on, but there's another URL that lists the authors' names. Given it's possible this change was made for a reason, I'm not linking to it, but it strikes me as odd that it's still up. Does anyone know what's going on, without causing problems for the authors?
btw, Stable Diffusion img2img consistently applied frame-by-frame will get us some insane CGI for movies<p>"transform this into this realistically"<p>ILM's holy grail
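Naively, that could look like the sketch below using Hugging Face diffusers. This is a rough sketch rather than anything ILM-grade: the model ID, prompt, and strength are placeholders, argument names (e.g. `image` vs. the older `init_image`) vary between diffusers versions, and per-frame img2img flickers badly; pinning the seed per frame only helps a little.

```python
# Rough sketch: Stable Diffusion img2img applied independently to each frame
# of a rendered/filmed sequence. Model ID, prompt, and strength are placeholders.
import glob
import os
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = "a photorealistic android chef, cinematic lighting"
os.makedirs("stylized", exist_ok=True)

for i, path in enumerate(sorted(glob.glob("frames/*.png"))):
    frame = Image.open(path).convert("RGB").resize((512, 512))
    # Reuse the same seed on every frame so the model makes similar choices,
    # which reduces (but does not eliminate) frame-to-frame flicker.
    generator = torch.Generator("cuda").manual_seed(42)
    out = pipe(prompt=prompt, image=frame, strength=0.45,
               guidance_scale=7.5, generator=generator).images[0]
    out.save(f"stylized/{i:05d}.png")
```

Real temporal consistency needs more than a fixed seed, but even this naive loop makes the "transform this into this" idea concrete.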
Url changed from <a href="https://dreamfusionpaper.github.io/" rel="nofollow">https://dreamfusionpaper.github.io/</a> to the page that names the authors.