
Toon3D: Seeing cartoons from a new perspective

414 points · by lnyan · about 1 year ago

36 comments

monitron · about 1 year ago
It's interesting that they used the Planet Express building from Futurama as one of their examples of 3D-inconsistency, because I'm pretty sure the exteriors are in fact computer-generated from a 3D model. Watch the show and you can see the establishing shots usually involve a smooth complex camera move around the building.
jsheard · about 1 year ago
It's... neat? But I'm struggling to think of what the applications of this would actually be. 2D artwork usually doesn't have a consistent 3D space, which they acknowledge, but they don't seem to have overcome that problem in any useful sense. The scenes are barely coherent once they move from one of the originally drawn camera positions.
JonathanFly · about 1 year ago
Creating 3D spaces from inconsistent source images! Super fun idea.

I tried a crude and terrible version of something like this a few years ago, but not just *inconsistent* spaces without a clear ground truth - purely *abstract non-space* images which aren't supposed to represent a 3D space at all. Transform an abstract art painting (Kandinsky or Pollock, for example) into an explorable virtual reality space. Obviously there is no 'ground truth' for whatever 'walking around inside a Pollock painting' means - the goal was just to see what happens if you try to do it anyway. The workflow was:

1. Start from a single abstract art source image
2. SinGAN to create alternative 'viewpoints' of the 'scene'
3. 3d-photo-inpainting (or Ken Burns, a similar project) on the original and SinGAN'd images (monocular depth mapping; outputs a zoom/rotate/pan video)
4. Throw the 3d-photo-inpainting frames into a photogrammetry app (NeRF didn't exist yet) and dial up all the knobs to allow for the maximum amount of error and inconsistency
5. Pray the photogrammetry process doesn't explode (9 times out of 10 it crashed after 24 hours, brutal)

I must have posted an example on Twitter but I can't find the right search term to find it. But for example, even 2019-tier depth mapping produced pretty fun videos from abstract art: https://x.com/jonathanfly/status/1174033265524690949 The closest thing I can find is photogrammetry of an NVIDIA GauGAN video (not consistent frame to frame): https://x.com/jonathanfly/status/1258127899401609217

I'm curious if this project can do a better job at the same idea. Maybe I can try this weekend.
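The five steps above can be sketched as a small orchestration script. This is a hedged illustration, not the commenter's actual code: the wrapper commands (`singan_generate.py`, `run_3d_photo_inpainting.py`, `photogrammetry_cli`) are hypothetical placeholders standing in for the real SinGAN, 3d-photo-inpainting, and photogrammetry tools.

```python
import subprocess

# Hypothetical wrapper commands standing in for the real tools
# (SinGAN, 3d-photo-inpainting, and a photogrammetry package).
PIPELINE = [
    ("singan_views",   ["python", "singan_generate.py", "--input", "{src}", "--out", "{work}/views"]),
    ("depth_videos",   ["python", "run_3d_photo_inpainting.py", "--images", "{work}/views", "--out", "{work}/frames"]),
    ("photogrammetry", ["photogrammetry_cli", "--frames", "{work}/frames", "--tolerance", "max", "--out", "{work}/scene"]),
]

def run_pipeline(src: str, work: str, dry_run: bool = True) -> list:
    """Expand the step templates and (optionally) execute them in order."""
    commands = []
    for name, template in PIPELINE:
        cmd = [part.format(src=src, work=work) for part in template]
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # step 5: pray it doesn't explode
    return commands

cmds = run_pipeline("pollock.jpg", "/tmp/toon", dry_run=True)
for c in cmds:
    print(" ".join(c))
```

With `dry_run=True` this only prints the expanded command lines, which makes it easy to sanity-check paths before committing to a 24-hour photogrammetry run.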
mattfrommars · about 1 year ago
A while back, after I got a Quest 2, I started to dive into the world of photogrammetry. I went through the entire pipeline of building a 3D *model* from photos of an object taken from different angles. The pipeline involved using Meshroom and a few other tools to clean up the mesh and port it into Unity.

In the end, from my (superficial) understanding, the problem with porting anything into VR (say, into Unity, where you can walk around an object) is the importance of creating a clean mesh. The 3D output from tools such as the OP's (I haven't dived deep into it yet) is a point cloud in 3D space. They do not generate a 3D mesh.

Going from memory of tools I came across during my research, there are tools like this: https://developer.nvidia.com/blog/getting-started-with-nvidia-instant-nerfs/ - again, this does not generate a mesh. I think the output is just a video, not something you can simply walk around in VR.

My low-key motivation was to make a clone of something like Matterport and sell it to real estate companies. The major gap in my understanding - the reason I lost steam - is that I was not sure how they are able to automate the step of generating a clean mesh from a bunch of photos from a camera. To me, this is the most labor-intensive part. Later, I heard there are ML models that can do this very step, but I have no idea about that.
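The point-cloud-to-mesh step the commenter describes can be illustrated with a deliberately crude stand-in: taking the convex hull of the points. This is only a sketch - real pipelines like Meshroom or Open3D use Poisson or ball-pivoting surface reconstruction, which handle concave geometry; a convex hull flattens every concavity but does show the basic points-in, triangles-out shape of the problem. It assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Toy "point cloud": random points on a unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# Convex hull is the crudest possible cloud-to-mesh step: it yields a
# watertight triangle mesh but erases all concavities. Real tools
# (Meshroom, Open3D) use Poisson or ball-pivoting reconstruction instead.
hull = ConvexHull(pts)
faces = hull.simplices  # (n_faces, 3) array of vertex indices
print(f"{len(pts)} points -> {len(faces)} triangles")
```

The resulting vertex/face arrays are exactly the representation a game engine like Unity consumes, which is why the "clean mesh" step matters so much for VR.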
foota · about 1 year ago
It's kind of amazing that they're able to take drawings of a scene someone imagined and then create (bad) 3D models. Imagine if, in the future, an artist could sketch a couple of images from a scene and then get an accurate 3D model?

Or if a 2D artist could sketch a couple of poses and automatically get a well-structured 3D model and textures?

I think there's been a lot of concern in the industry about the impact AI and similar tools will have on artists, but it seems possible to imagine a future where machine-learning-based systems work more directly with an artist rather than rendering based on language, etc.

I don't know how I feel about all the moral arguments about AI training. To me, the more concerning thing is how it could impact people rather than how it was trained. Even if a perfectly "ethically" trained model learned to produce perfect art and artists became a niche field, I think it could still be a bad outcome for civilization as a whole, because I think there's value in humans producing art, and in having a society where it's (at least somewhat) a sustainable field.

On the other hand, I think it's amazing the kinds of images people can produce using image models, so I'm not sure. Ideally we'd be able to support people in what they want without needing there to be a market for it, but the world's not ready for that.
pcrh · about 1 year ago
I'm not a graphic artist, and I appreciate how the illustrator's art involves many creative tricks of representation to convey complex meanings.

However, the "messy" reconstructions of 3D space seen in these videos did make me think of the recent hype over LLMs.

That is, the representations have a clear link to the "truth" or "facts" of the underlying material, but are in no way accurate enough to be considered useful as source material for further use.
robertclaus · about 1 year ago
I was surprised by how poorly it reproduces the look from the perspective of specific images. For example, see the Magic School Bus further down. It feels like their algorithm could probably be tuned more in the direction of "trust the images".
MarcScott · about 1 year ago
Why would you have a site with a whole load of videos on it, with all of them set to autoplay and constantly loop? I was watching a video on my second screen, and it stutters each time I try to visit the site.
throw4847285 · about 1 year ago
If you showed the Spirited Away one to Miyazaki, he would probably call it an insult to life itself.
nicklecompte · about 1 year ago
I am amazed they didn't seem to talk to any 3D animators before writing this. Because this is just plain wrong:

> The hand-drawn images are usually faithful representations of the world, but only in a qualitative sense, since it is difficult for humans to draw multiple perspectives of an object or scene 3D consistently. Nevertheless, people can easily perceive 3D scenes from inconsistent inputs!

It is difficult for human artists to maintain perfect geometrical consistency. But that is NOT why 2D animation of 3D scenes is geometrically inconsistent! The reason is that artists stylize 3D scenes to emphasize things for specific artistic reasons. This is especially true for something surreal like SpongeBob. But even King of the Hill has stylized "living room perspectives," "kitchen perspectives," etc. The artists are trying to make things look good, not realistic. And they aren't trying to make humans reconstruct a perfect 3D image - they are trying to evoke our 3D *imaginations*. It's a very different thing.

Pixar and other high-quality 3D animation studios intentionally distort the real geometry of their scenes for cinematic effect: a small child viewed from an adult's perspective might be rendered with a freakishly long neck and stubby little torso, because the animators are intentionally exaggerating visual foreshortening to emphasize the emotional effect of a wee little child. A realistic perspective would be simply boring. These techniques are all over the place in Pixar movies - it's why their films look so good compared to cheaper studios, who really are just moving a virtual camera around a Euclidean 3D space.

I don't want to comment on the technical details. But it really seems like the authors missed the artistic mark.
solardev · about 1 year ago
It kinda looks like a cartoon version of Microsoft Photosynth? https://en.wikipedia.org/wiki/Photosynth
iainmerrick · about 1 year ago
I don't like to bring unrealistic expectations to this sort of thing, but even so, all the examples look pretty bad. Am I missing something?

In addition to all the noise and haze -- so the intermediate frames wouldn't be usable alongside the originals -- the start and end points of each element hardly ever connect up. Each wall, door, etc. flies vaguely towards its destination, but fades out just as the "same" element fades in at its final position a few feet away.

It's a lovely idea, though, and it would be great to see an actually working version.
James_K · about 1 year ago
This web page uses over 1.6 gigabytes of RAM.
chungus · about 1 year ago
I imagine SpongeBob episodes converted to this 3D format, and watching them with VR goggles, like you're there.
chrisjj · about 1 year ago
> The hand-drawn images are usually faithful representations of the world, but only in a qualitative sense, since it is difficult for humans to draw multiple perspectives of an object or scene 3D consistently.

I find this premise unsound. The reason is less that it is difficult and more that it is undesirable - in this medium.
bhouston · about 1 year ago
It is a good idea, but the results are quite bad. It barely works in their demos, tons of artifacts everywhere.
ambyra · about 1 year ago
I can't think of a great application either. Maybe if you want to map camera movements when converting an animated scene from 2D to 3D. It'd probably be easier just to start from scratch, though. Simple polygons with a toon shader would work for The Simpsons and Family Guy, I'm sure.
westedcrean · about 1 year ago
I'm not a historian, but I remember a tour guide in the Forum Romanum mentioning that the current state of knowledge about how buildings and parts of cities looked stems from their depictions on coins of that period. Perhaps it could be used for that?
localfirst · about 1 year ago
Trying to use this, but I'm stuck after exporting from the labeler (guessing that is closed source). Lots of questions:

What do I do with this data exactly? Not really following the instructions from the README.

Do I need a hefty GPU to run this? It doesn't say anything about hardware.

What am I going to get as a result? Will it generate a 3D model or "point clouds"?

Do I need multiple inputs (from different angles) through the labeler?

What is the depth estimator being used here? (This is what I'm most interested in, especially whether it's able to detect the ground from multiple angles.)

Guess I'm just really lost here, but super eager to use this. We do have a real-world application for it.
SiempreViernes · about 1 year ago
The ability to reconstruct a coherent 3D view from a sparse set of photos seems much more useful than doing it for a set of 2D drawings of an entirely imagined space; I don't think 2D artists are cheaper than 3D artists.
djl0 · about 1 year ago
A little off topic, but related: are there any tools you can feed a few photos of a room from various angles, and it will generate a floorplan or 3D model like this?
nemomarx · about 1 year ago
This is very interesting, but I feel like the name suggests it's an animation or graphics program more directly? That might be a branding loss.
nico · about 1 year ago
It's fascinating that the generated Gaussian splats look kind of like a dream. Almost as if that were the way we generate 3D scenes in our minds.
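The dreamlike softness follows from what a Gaussian splat is: an anisotropic Gaussian blob alpha-composited into the image, with no hard edges anywhere. Below is a toy NumPy sketch of 2D splat compositing, purely for intuition - it is not Toon3D's renderer, and the splat positions, covariances, and colors are made up.

```python
import numpy as np

def splat_alpha(center, cov, opacity, h=64, w=64):
    """Per-pixel alpha of one 2D Gaussian splat: opacity * exp(-0.5 d^T inv(cov) d)."""
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.stack([xs - center[0], ys - center[1]], axis=-1).astype(float)
    inv = np.linalg.inv(cov)
    m = np.einsum("...i,ij,...j->...", d, inv, d)  # Mahalanobis distance squared
    return opacity * np.exp(-0.5 * m)

# Front-to-back "over" compositing of two splats onto a black background.
img = np.zeros((64, 64, 3))
transmittance = np.ones((64, 64, 1))
splats = [((20.0, 30.0), [[40.0, 10.0], [10.0, 20.0]], (1.0, 0.0, 0.0)),   # red blob
          ((40.0, 30.0), [[20.0, 0.0], [0.0, 60.0]], (0.0, 0.0, 1.0))]    # blue blob
for center, cov, color in splats:
    a = splat_alpha(center, np.array(cov), opacity=0.8)[..., None]
    img += transmittance * a * np.asarray(color)
    transmittance *= 1.0 - a  # later splats show through less

print(img.shape, float(img.max()))
```

Every pixel gets some contribution from every splat's tail, which is exactly why splat renders of under-constrained scenes come out hazy rather than crisply wrong.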
orthoxerox · about 1 year ago
I see they didn't even try Peppa Pig.
1-6 · about 1 year ago
It's hallucinating a bit. There are new things put in that weren't there in the previous frame.
surfingdino · about 1 year ago
It's cool. It might be useful as a 3D camera-movement visualisation tool in pre-production. As a tool for recreating old cartoons in 3D, it'll produce results as desirable as those ghastly colourised versions of old B&W movies.
eMerzh · about 1 year ago
Not sure how related they are, but it looks like it could be used to do https://www.wakatoon.com
thebeardisred · about 1 year ago
Thank you, HN, for showing me enough papers on "Gaussian Splatting" that I was able to pick it out visually as the method from the examples.
selimnairb · about 1 year ago
Cool, but why? Structure from motion has applications in the real world, but this use case doesn't seem that compelling to me.
deadbabe · about 1 year ago
Will be awesome when we can watch old cartoon shows in VR and look all around the world.
JL-Akrasia · about 1 year ago
This is so cool!
binary132 · about 1 year ago
Amazingly weird
RA2lover · about 1 year ago
xkcd's yearly April Fools had automatic 3D comic conversion done back in 2011: https://web.archive.org/web/20110813115522/http://chatter.recreclabs.com/
seattle_spring · about 1 year ago
Kiiiiind of disappointed to not see the alley from King of the Hill, I tell you h'what.
JL-Akrasia · about 1 year ago
Holy crap, can you imagine rewatching your favorite shows from different perspectives?
7bit · about 1 year ago
This website is crap on mobile. No image can be enlarged...