DeepMind program finds diamonds in Minecraft without being taught

248 points by Bender about 1 month ago

24 comments

An important caveat from the paper:

> Moreover, we follow previous work in accelerating block breaking because learning to hold a button for hundreds of consecutive steps would be infeasible for stochastic policies, allowing us to focus on the essential challenges inherent in Minecraft.
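
A rough illustration of why that caveat matters, under a simplifying assumption that is not from the paper: suppose the policy samples its action independently at every step and picks "attack" with probability `p`. The chance of holding the button for `n` consecutive steps is then `p ** n`. The function name and the choice of `n = 200` (standing in for "hundreds of steps") are hypothetical.

```python
def hold_probability(p: float, n: int) -> float:
    """Chance that an independent per-step policy picks the same action n times in a row.

    Assumes each step is an independent draw with probability p (a simplification;
    a real policy conditions on its observations and internal state).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return p ** n


# Illustrative values only: 200 steps stands in for "hundreds of consecutive steps".
for p in (0.5, 0.9, 0.99):
    print(f"p={p}: chance of holding for 200 steps = {hold_probability(p, 200):.3g}")
```

Even a policy that presses the button 99% of the time finishes a 200-step hold only a small fraction of the time, which is consistent with the authors' decision to accelerate block breaking.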

Animats about 1 month ago

Key to Dreamer’s success, says Hafner, is that it builds a model of its surroundings and uses this ‘world model’ to ‘imagine’ future scenarios and guide decision-making.

Can you look at the world model, like you can look at Waymo's world model? Or is it hidden inside weights?

Machine learning with world models is very interesting, and the people doing it don't seem to say much about what the models look like. The Google manipulation work talks endlessly about the natural language user interface, but when they get to motion planning, they don't say much.

reportgunner about 1 month ago

Article makes it seem like finding diamonds is some kind of super complicated logical puzzle. In reality the hardest part is knowing where to look for them and what tool you need to mine them without losing them once you find them. This was given to the AI by having it watch a video that explains it.

If you watch a guide on how to find diamonds it's really just a matter of getting an iron pickaxe, digging to the right depth and strip mining until you find some.

DeborahEmeni_ about 1 month ago

The “holding a button” thing actually resonated. It feels like the real work here is engineering the reward structure to make exploration even remotely viable. Dreamer’s world model might be cool, but most of the heavy lifting still seems to come from how forgiving the Minecraft environment is for training.

I do wonder though: if you swapped Minecraft for a cloud-based synthetic world with similar physics but messier signals, like object permanence or social reasoning, would Dreamer still hold up? Or is it just really good at the kind of clean reward hierarchies that games offer?

lupusreal about 1 month ago

Characterizing finding diamonds as "mastering" Minecraft is extremely silly. Tantamount to saying "AI masters Chess: Captures a pawn." Getting diamonds is not even close to the hardest challenge in the game, but most readers of Nature probably don't have much experience playing Minecraft so the title is actually misleading, not harmless exaggeration.

CodeCompost about 1 month ago

I didn't know that Nature did movie promotions.

YeGoblynQueenne about 1 month ago

Reinforcement learning is very good with games.

>> In Minecraft, the team used a protocol that gave Dreamer a ‘plus one’ reward every time it completed one of 12 progressive steps involved in diamond collection, including creating planks and a furnace, mining iron and forging an iron pickaxe.

And that is why it is never going to work in the real world: games have clear objectives with obvious rewards. The real world, not so much.
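
The reward protocol in that quote can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the class name, the inventory-based check, and the rule that each milestone pays out only once per episode are all assumptions, and the example list contains only the steps named in the quote, not the full set of 12.

```python
from typing import Iterable, Set, Tuple


class MilestoneReward:
    """Sparse reward: +1 the first time each milestone item appears in an episode.

    Hypothetical sketch; paying each milestone once per episode is an assumption.
    """

    def __init__(self, milestones: Iterable[str]) -> None:
        self.milestones: Tuple[str, ...] = tuple(milestones)
        self.reached: Set[str] = set()

    def reset(self) -> None:
        """Forget reached milestones at the start of a new episode."""
        self.reached.clear()

    def step(self, inventory: Iterable[str]) -> float:
        """Return the reward for this step given the items currently held."""
        reward = 0.0
        for item in inventory:
            if item in self.milestones and item not in self.reached:
                self.reached.add(item)
                reward += 1.0
        return reward


# Illustrative subset only: the steps named in the quote, with made-up item names.
reward_fn = MilestoneReward(["planks", "furnace", "iron_ore", "iron_pickaxe"])
print(reward_fn.step(["planks"]))                     # first time: pays out
print(reward_fn.step(["planks", "dirt"]))             # repeat and non-milestone: nothing
print(reward_fn.step(["planks", "furnace", "dirt"]))  # one new milestone
```

The point of the sketch is how little the reward says: the agent is told that a milestone happened, not how to reach it.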

colechristensen about 1 month ago

Who would have thought you could get your TAS run published in Nature if you used enough hot buzzwords. (They have been using various old-school-definition "artificial intelligence" algorithms for a long time.)

https://tasvideos.org/

FrustratedMonky about 1 month ago

Minecraft is ubiquitous now.

But I remember the alpha version, and NOBODY knew how to make a pickaxe. Humans were also very bad at figuring out these steps.

People were decompiling the Java and posting help guides on the internet.

How to break a tree, get sticks, make a wood pick. In Alpha, that was a big deal for humans also.

N-Krause about 1 month ago

https://archive.is/XutGu

ljdtt about 1 month ago

Slightly off-topic from the article itself, but… does anyone else feel like Nature’s cookie banner just never goes away? I have vivid memories of trying to reject cookies multiple times, eventually giving up and accepting them just to get to the article, only for the banner to show up again the next time I visit. I swear it’s giving me déjà vu every single visit. Am I the only one experiencing this, or is this just how their site works?

textlapse about 1 month ago

Could this perform better by having the internal representation of Minecraft instead of raw pixels?

It seems rather tenuous to keep pounding on 'training via pixels' when really a game's 2D/3D output is an optical trick at best.

I understand Sergey Brin/et al had a grandiose goal for DeepMind via their Atari games challenge - but why not try alternate methods - say build/tweak games to be RL-friendly? (like MuJoCo but for games)

I don't see the pixel-based approach being as applicable to the practical real world as say when software divulges its direct, internal state to the agent instead of having to fake-render to a significantly larger buffer.

I understand Dreamer-like work is a great research area and one that will garner lots of citations for sure.

protocolture about 1 month ago

Finally a use case for AI

Xelynega about 1 month ago

Isn't this DeepMind achievement from 2023?

successful23 about 1 month ago

Pretty impressive. Minecraft’s a complex environment, so for an AI to figure out how to find diamonds on its own shows real progress in learning through exploration — not just pattern recognition.

fine_tune about 1 month ago

Attempting to train this on a real workload I converted over the weekend. It is at "step" ~8M so far and rarely scores above 5%; most scores are 0%, but it scored 60% once, ~7M steps ago.

Adding more than 1 GPU didn't improve speed, but that's pretty standard as we don't have a fancy interconnect. Bit annoying they didn't use TensorBoard for logging, but overall it seems like a pretty cool lib. Will leave it a few days and see if it can learn (no other algo has, so I don't have much hope).

theOGognf about 1 month ago

This looks like an article about the recent Nature publication. Was confused at first because DreamerV3 is a couple of years old now

sbuttgereit about 1 month ago

There's a YouTube channel that does a lot of videos focused on LLMs in Minecraft:

https://www.youtube.com/@EmergentGarden

I very much like the comparative approach this guy takes looking at how different LLMs fare... including how they interact together. Worth a look.

ninetyninenine about 1 month ago

It’s still being a stochastic parrot. Now it’s just parroting human creativity and imagination, so I’m still not impressed.

If all you’re going to do is parrot things like human consciousness or human ingenuity, then I will never be impressed so long as it’s just parroting.

jonathanyc about 1 month ago

They write: "Below, we show uncut videos of runs during which Dreamer collected diamonds."... but the first video only shows the player character digging downwards without using any tools and eventually dying in lava. What?

camel-cdr about 1 month ago

How robust is this?

Isn't something like finding diamonds in Minecraft something that old-school AI could already do decently?

nottorp about 1 month ago

Isn't "masters" when you build a working copy of Minas Tirith or something like that?

_vere about 1 month ago

So can I, and no one needed to teach me either, but you don't see Nature writing articles on it...

fxtentacle about 1 month ago

I guess we can look forward
to a bright future
where we focus 100% on work
and AI will play our games

/s
