Sometimes it's hard to separate signal from noise when you're not part of a field and just hearing about projects/papers, so I wanted to quickly pitch in to say that this is a legitimately ground-breaking approach and line of work that you can expect to hear much more about in the future. It's probably the most exciting robotics/manipulation project I'm currently aware of.<p>What's exciting here is that the entire system is trained end-to-end (including the vision component). In other words, it's heading towards agents/robots that consist entirely of a single neural net and that's it; there is no software stack at all - it's just a GPU running a neural net "code base", from perception to actuators. In this respect this work is similar to the Atari game-playing agent that has to learn to see while also learning to play the game. Except this setting is quite a lot more difficult in some respects; in particular, the actions in the DeepMind Atari paper are few and discrete, while here the robot is an actual physical system with a very high-dimensional and continuous action space (joint torques). Also, if you're new to the field you might think "why is the robot so slow?", while someone in the field is thinking "holy crap how can it be so fast?"
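To make the "single neural net, no software stack" point concrete, here is a minimal sketch of what a pixels-to-torques policy looks like as one function. The shapes and layer sizes below are illustrative assumptions, not the actual architecture from the paper:
<pre><code>
# Minimal sketch (not the paper's code): one network mapping raw pixels
# straight to joint torques, with no hand-written perception or control stack.
import numpy as np

def conv2d(x, w):
    """Naive valid convolution of a single-channel image x with filters w, plus ReLU."""
    n_f, kh, kw = w.shape
    H, W = x.shape
    out = np.zeros((n_f, H - kh + 1, W - kw + 1))
    for f in range(n_f):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[f, i, j] = np.sum(x[i:i+kh, j:j+kw] * w[f])
    return np.maximum(out, 0.0)

def policy(image, params):
    """image (HxW grayscale) -> 7 joint torques, one differentiable pipeline."""
    h = conv2d(image, params["conv_w"])               # learned "vision"
    h = h.reshape(-1)
    h = np.maximum(h @ params["fc1_w"] + params["fc1_b"], 0.0)
    return h @ params["out_w"] + params["out_b"]      # continuous actions

# Toy usage with random weights, just to show the data flow.
rng = np.random.default_rng(0)
params = {
    "conv_w": rng.standard_normal((4, 5, 5)) * 0.1,
    "fc1_w": rng.standard_normal((4 * 28 * 28, 64)) * 0.01,
    "fc1_b": np.zeros(64),
    "out_w": rng.standard_normal((64, 7)) * 0.01,
    "out_b": np.zeros(7),
}
image = rng.random((32, 32))
print(policy(image, params))   # 7 torque values
</code></pre>
The point of the work is that everything in there, including the convolution filters that constitute "perception", gets trained jointly against the task.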
Learning motor torques directly from vision is a very important result.<p><a href="https://youtu.be/EtMyH_--vnU" rel="nofollow">https://youtu.be/EtMyH_--vnU</a><p>This talk by Sergey Levine, Pieter Abbeel's postdoc, outlines Berkeley's end-to-end training of deep visuomotor policies in detail.<p>Here is the paper:<p>End-to-End Training of Deep Visuomotor Policies,
Sergey Levine*, Chelsea Finn*, Trevor Darrell, Pieter Abbeel.<p><a href="http://arxiv.org/abs/1504.00702" rel="nofollow">http://arxiv.org/abs/1504.00702</a>
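For anyone who doesn't want to watch the whole talk: the training structure, as I understand it (this is a paraphrase with toy code, not the authors' implementation), is that the network is not trained by trial and error directly in torque space. Instead, a controller with access to the full state produces good behaviour, and the vision-based policy is trained with supervised learning to reproduce it from raw observations. A runnable caricature:
<pre><code>
# Toy caricature of "privileged controller supervises the observation-based
# policy". The real method uses trajectory optimization and a deep conv net.
import numpy as np

rng = np.random.default_rng(0)
goal = np.array([0.7, -0.2])

def expert_action(state):
    """Privileged 'trajectory optimizer' stand-in: sees the full state."""
    return 2.0 * (goal - state)

def observe(state):
    """The policy never sees the state directly, only a noisy observation."""
    return state + 0.05 * rng.standard_normal(2)

# Collect (observation, expert action) pairs, i.e. guiding samples.
obs, acts = [], []
for _ in range(2000):
    state = rng.uniform(-1, 1, size=2)
    obs.append(observe(state))
    acts.append(expert_action(state))
obs, acts = np.array(obs), np.array(acts)

# Supervised learning step: fit the observation-based policy (linear here,
# solved by least squares; in the paper it's a conv net trained with SGD).
obs_aug = np.hstack([obs, np.ones((len(obs), 1))])          # add bias term
weights, *_ = np.linalg.lstsq(obs_aug, acts, rcond=None)

test_state = np.array([0.0, 0.0])
print(expert_action(test_state))                      # what the expert does
print(np.append(observe(test_state), 1.0) @ weights)  # what the policy learned
</code></pre>
The real method (guided policy search) alternates this loop several times, re-optimizing the local controllers and constraining them to stay close to what the network can actually represent.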
I probably made a career direction error in the early 1990s. I had been on DARPA's neural network tools advisory panel and written the SAIC Ansim product, but moved on because of a stronger interest in natural language processing. Now, I think deep learning is getting very interesting for NLP.<p>This UCB project looks awesome!<p>BTW, I took Hinton's Coursera neural network class a few years ago, and it was excellent. Take it if that course is still online.
Could someone explain in simple terms how the target is set for the robot so that it can learn to accomplish the task? For example, what inputs are provided in order for it to understand that it needs to put the cap on the bottle?
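Not one of the authors, but from the paper my understanding is that the goal is given as a cost function over poses rather than as any symbolic instruction: roughly, the distance between points on the held object (the cap) and where those points should end up (on the bottle), plus a small penalty on torques. A toy sketch with made-up numbers:
<pre><code>
# Toy sketch of a pose-based task cost (made-up numbers, not the paper's exact cost).
import numpy as np

target_points = np.array([[0.40, 0.10, 0.30],   # desired cap positions (m)
                          [0.40, 0.12, 0.30]])

def cost(cap_points, torques, w_dist=1.0, w_effort=1e-3):
    """Lower is better: be at the target pose, without flailing."""
    dist = np.linalg.norm(cap_points - target_points, axis=1).sum()
    effort = np.sum(np.square(torques))
    return w_dist * dist + w_effort * effort

cap_now = np.array([[0.38, 0.09, 0.35],
                    [0.38, 0.11, 0.35]])
print(cost(cap_now, torques=np.zeros(7)))   # learning pushes this toward 0
</code></pre>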
It seems most of the code behind this effort is open source as well!
<a href="http://lfd.readthedocs.org/en/latest/" rel="nofollow">http://lfd.readthedocs.org/en/latest/</a>
<a href="https://github.com/cbfinn/caffe" rel="nofollow">https://github.com/cbfinn/caffe</a>
Wouldn't this benefit from simulation of the task (from the robot's perspective)? Doing something physical over and over again on ONE single robot must be very slow and inefficient compared to simulating it. Even if the simulated training isn't spot on, the physical robot could start off with network weights from millions of attempts in a simulated environment.
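A toy version of that warm-start idea (everything below is a made-up stand-in, not the project's code): train against a cheap approximate model first, then use the resulting weights as the initialization for the small number of expensive real trials:
<pre><code>
# Toy warm-start illustration: "sim" and "real" are just two slightly
# different linear maps standing in for the two environments.
import numpy as np

rng = np.random.default_rng(0)
true_sim_map = rng.standard_normal((4, 2))                         # pretend simulator dynamics
true_real_map = true_sim_map + 0.1 * rng.standard_normal((4, 2))   # slightly off

def train(weights, true_map, steps, lr=0.1):
    """Toy stand-in for policy training against an environment."""
    for _ in range(steps):
        state = rng.standard_normal(4)
        target_action = state @ true_map          # what the env "wants"
        error = state @ weights - target_action
        weights -= lr * np.outer(state, error)    # gradient step
    return weights

# Many cheap simulated attempts...
weights = train(np.zeros((4, 2)), true_sim_map, steps=5000)
# ...then only a handful of expensive real-world trials to adapt.
weights = train(weights, true_real_map, steps=50)
print(np.abs(weights - true_real_map).max())      # small residual after few real trials
</code></pre>
The catch is the usual one: the better the policy exploits the simulator, the more the mismatch with the real dynamics matters.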
I'm impressed it (apparently) learned to align screw caps with a short backward turn at the start.<p>Then again, why do we make so many containers with these ungainly screw caps? Ever use those caps (popular in Japan) with the locking track that only take a quarter-turn to close? Examples:<p><a href="http://www.amazon.com/Yu-Be-Moisturizing-Skin-Cream-Skin-1/dp/B0001UWRCI/" rel="nofollow">http://www.amazon.com/Yu-Be-Moisturizing-Skin-Cream-Skin-1/d...</a><p><a href="http://www.amazon.com/Biotene-PBF-Toothpaste-Ounce-Pack/dp/B00JX73B2A" rel="nofollow">http://www.amazon.com/Biotene-PBF-Toothpaste-Ounce-Pack/dp/B...</a>
While it is how humans learn, there's more to human learning than that. Babies are pre-wired to learn language, recognize shapes, determine "intent", etc.<p>This means that the neural nets used by babies are pre-wired to be good at specific tasks. Then, babies use those neural nets to do "deep learning" for the final part of the process.<p>Starting from <i>nothing</i> and learning how to do a job is a big step. But having <i>something</i> would be a better start position. What that something is, though, is hard to define.
If you're interested in this, I'm putting together a meetup/workshop/lab at the Palace of Fine Arts in SF every weekend. Come out and share, learn, and build with other people interested in this field.<p>Think of it as the Home Brew Computer Club for Robotics/AI :)<p><a href="https://www.facebook.com/groups/762335743881364/" rel="nofollow">https://www.facebook.com/groups/762335743881364/</a>
It acts very organically. But I have to wonder if the organic motion is a good thing. Wouldn't it be more efficient to control the arm using IK (inverse kinematics), but let the robot "think" about where the arm should be? I mean, I could easily imagine a straight line, but I can't draw it.<p>This would also speed it up, imo, since some things can easily be solved with regular algorithms. Our brains also come with some pre-wired functions.
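A tiny sketch of that hybrid, purely hypothetical (this is not how the linked work does it): the learned part only decides where the gripper should go, and closed-form IK for a 2-link planar arm turns that into joint angles for a conventional controller:
<pre><code>
# Hypothetical hybrid: learned policy picks *where*, classical IK picks *how*.
import numpy as np

LINK1, LINK2 = 0.5, 0.4   # link lengths in metres

def ik_2link(x, y):
    """Closed-form IK for a planar 2-link arm (one of the two solutions)."""
    cos_q2 = (x * x + y * y - LINK1**2 - LINK2**2) / (2 * LINK1 * LINK2)
    cos_q2 = np.clip(cos_q2, -1.0, 1.0)      # guard against unreachable targets
    q2 = np.arccos(cos_q2)
    q1 = np.arctan2(y, x) - np.arctan2(LINK2 * np.sin(q2),
                                       LINK1 + LINK2 * np.cos(q2))
    return q1, q2

def learned_part(observation):
    """Stand-in for the network: map an observation to a goal position."""
    return 0.6, 0.3   # pretend it decided the bottle cap is here

q1, q2 = ik_2link(*learned_part(None))
print(np.degrees([q1, q2]))   # joint angles for a conventional position controller
</code></pre>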
There was recently a Talking Machines episode that included some information (not apparent in the title) about the difficulties of modeling the world with robots.<p>"We learn about the Markov decision process (and what happens when you use it in the real world and it becomes a partially observable Markov decision process)"<p><a href="http://www.thetalkingmachines.com/blog/2015/5/21/how-we-think-about-privacy-and-finding-features-in-black-boxes" rel="nofollow">http://www.thetalkingmachines.com/blog/2015/5/21/how-we-thin...</a>
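To make the MDP-to-POMDP jump concrete: once the true state can't be observed directly, the agent has to carry a probability distribution over states and update it from noisy observations. A toy belief update with made-up numbers:
<pre><code>
# Toy Bayes filter: two hidden states, cap is "aligned" (0) or "misaligned" (1).
import numpy as np

transition = np.array([[0.9, 0.1],    # P(next_state | state) after a "twist"
                       [0.3, 0.7]])
observation = np.array([[0.8, 0.3],   # row 0: P(see "looks ok"  | state)
                        [0.2, 0.7]])  # row 1: P(see "looks off" | state)

def belief_update(belief, obs_index):
    """Predict through the dynamics, then reweight by the observation likelihood."""
    predicted = belief @ transition
    weighted = predicted * observation[obs_index]
    return weighted / weighted.sum()

belief = np.array([0.5, 0.5])          # start maximally uncertain
for obs in [1, 1, 0]:                  # a few noisy camera readings
    belief = belief_update(belief, obs)
    print(belief)                      # the robot's best guess, never the truth
</code></pre>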