科技回声

15 comments

mk_stjames, over 1 year ago

Oof, the dependency tree on this.

It uses diff-gaussian-rasterization from the original gaussian splatting implementation (which is a linked submodule in the git repo, so if you are trying to git clone that dependency remember to use --recursive to actually download it). But that is written in mostly pure CUDA.

That part is just used to display the resulting gaussian splatt'd model, and there have been other cross-platform implementations to render splats (there was even that web demo a few weeks ago that was using WebGL [0]), and if that was used as a display output in place of the original implementation there is no reason people couldn't use this on non-Nvidia hardware, I think.

edit: also device=cuda is hardcoded in the torch portions of the training code (sigh!). This doesn't have to be the case. PyTorch could probably push this onto MPS (Metal) just fine.

[0] https://github.com/antimatter15/splat?tab=readme-ov-file
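
For illustration, the device fallback suggested in the edit above might look roughly like the sketch below. The names `pick_device` and `detect_device` are hypothetical and not taken from the repository; the sketch only assumes the training code accepts a device string. Whether every operation in that code actually runs on MPS is untested here, and the CUDA rasterizer dependency is a separate problem.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a torch device string, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are available (torch is imported lazily)."""
    import torch  # assumed installed; needs a version that ships the MPS backend

    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

The result of `detect_device()` could then replace a hardcoded `"cuda"`, for example `model.to(detect_device())`.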

catapart, over 1 year ago

So if I'm tracking the progress correctly, now we should be able to do: Single Image -> Gaussian Splats -> Object Identification -> [Nearest Known Object | Algo-based shell] Mesh Generation -> Use-Case-Based Retopology -> Style-Trained Mesh Transformation

Which would produce a new mesh in the style of your other meshes, based on a single photograph of a real-world object.

...and, at this speed, you could do that as a real-time(ish) import into a running application/game.

Gotta say, I'm looking forward to someone putting these puzzle pieces together! But it really does feel like if we wait another month, there might be some new AI that shrinks that pipeline by another one or two steps! It's an exhausting time to be excited!

joosters, over 1 year ago

Probably a dumb question, but is this trained by the use of lots of inputs of similar objects, or is it 'just' estimating by the look of the input image?

Like, if you have an image of a car, viewed at an angle, you can gauge the shape of the 3d object from the image itself. You could then assume that the hidden side of the car is similar to the side that you can see, and when you generate a 360 rotation animation of it, it will look pretty good (cars being roughly symmetrical). But if you gave it a flat image of a playing card, just showing the face up side, how would it reconstruct the reverse side? Would it infer it based on the front, or would it 'know' from training data that playing cards have a very different patterned back to them?

roflmaostc, over 1 year ago

Since it's based on 3D Gaussians in space, is there a way to obtain sharp images? Inherently, Gaussian functions extend infinitely, so images always look blurry. Don't they? Of course, \sigma can be optimized to be small, but then it converges to some point representation, doesn't it?

Maybe some CV/ML people can help me understand.
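
On the infinite-extent point, a rough numerical sketch may help. It assumes an unnormalized falloff exp(-d^2 / (2 \sigma^2)) and uses one 8-bit intensity step (1/255) as an illustrative visibility threshold; both the threshold and the function names are assumptions for this example, not something taken from the paper or its code.

```python
import math


def gaussian_weight(distance: float, sigma: float) -> float:
    """Unnormalized Gaussian falloff exp(-d^2 / (2 sigma^2))."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def cutoff_radius(sigma: float, threshold: float = 1.0 / 255.0) -> float:
    """Distance at which the falloff has dropped to `threshold`."""
    return sigma * math.sqrt(2.0 * math.log(1.0 / threshold))


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        r = cutoff_radius(sigma)
        print(f"sigma={sigma}: weight falls below 1/255 beyond d={r:.2f}")
```

Under these assumptions the weight drops below one 8-bit step at roughly 3.3 \sigma, so the visible footprint of each Gaussian is finite in practice even though the function itself never reaches zero. How sharp a rendered image looks would then depend on how small and how densely packed the optimized Gaussians are.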

XorNot, over 1 year ago

I guess this is how you'd implement that thing in Enemy Of The State where they pan around a single-perspective camera view (which I think doesn't come across as absurd in the movie anyway since the tech guys point out it's basically a clever extrapolation).

rijx, over 1 year ago

Now we can finally turn Street View into a game world!

eurekin, over 1 year ago

For anybody wanting to take a look at the code, this time the GitHub link does include it. It's not empty, which is typical for those "too good to be true" publications.

lawlessone, over 1 year ago

Am I imagining this, or is somebody making a newer and faster one of these every day? I'm expecting Overwhelming Fast Splatter by January.

teunispeters, over 1 year ago

For a change, [code] works, but [arXiv] link is not present. Have to say this looks really interesting!

billconan, over 1 year ago

The paper link doesn't work for me. The correct link: https://arxiv.org/pdf/2312.13150.pdf

alkonaut, over 1 year ago

Wouldn't it be more useful to generate a vector model than a "3d image" voxel/radiance field/splats/whatever it's called? Apart from the use case "I want to spin the thing or walk around in it" they feel like they are of limited use?

Unlike, say, a crude model of a fire hydrant which you could throw into a game or whatever. What if the model is fed some more constraints/assumptions? I think I saw some recent paper that did generate meshes now instead of pixels.

StreetChief, over 1 year ago

All I have to say is "ENHANCE!"

amelius, over 1 year ago

This would be more powerful if you could feed it more input images for a better result, if desired.

anigbrowl, over 1 year ago

This could prove useful for autonomous navigation systems as well.

tantalor, over 1 year ago

That "GT" method seems even better, we should just use that. /s


Splatter Image: Ultra-Fast Single-View 3D Reconstruction
