Control policies learned via RL are starting to work in the real world!

Policies learned in simulation typically transfer poorly to the real world (the so-called sim2real gap), so I'm curious to dig into this work and see how they overcame that limitation.

Just from watching the video and guessing, it would make sense if noising the belief state (roughly b = rnn(h, concat(proprio, extero)) + ε, with ε drawn from some noise distribution) and learning to condition proprioceptive attention on the belief uncertainty were enough. Very cool work, and exciting to see robotics groups exploiting ML more and more (gated attention + learned belief states here).
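To make my guess concrete, here's a minimal sketch (not the authors' code; all names, dimensions, and the noise model are assumptions) of that idea: a recurrent belief encoder over proprio + extero, exteroceptive noise injected during training, and a gate conditioned on the belief that decides how much to trust exteroception, effectively shifting attention back to proprioception when the belief is unreliable:

    # Hypothetical sketch of a gated belief-state encoder. Dimensions,
    # module names, and Gaussian noise are illustrative assumptions.
    import torch
    import torch.nn as nn

    class GatedBeliefEncoder(nn.Module):
        def __init__(self, proprio_dim=48, extero_dim=208,
                     belief_dim=128, noise_std=0.1):
            super().__init__()
            self.noise_std = noise_std
            self.rnn = nn.GRUCell(proprio_dim + extero_dim, belief_dim)
            # Gate conditioned on the current belief: per-feature values
            # in (0, 1) that down-weight exteroceptive observations when
            # the belief suggests they are unreliable.
            self.gate = nn.Sequential(
                nn.Linear(belief_dim, extero_dim),
                nn.Sigmoid(),
            )

        def forward(self, proprio, extero, h):
            # Train-time corruption of exteroception stands in for the
            # sim2real gap (real sensors are noisier than the simulator).
            if self.training:
                extero = extero + self.noise_std * torch.randn_like(extero)
            # Attenuate exteroceptive features using the previous belief.
            extero = self.gate(h) * extero
            # Update the recurrent belief state from both streams.
            h = self.rnn(torch.cat([proprio, extero], dim=-1), h)
            return h  # belief state fed to the policy head

    # Usage: one rollout step.
    enc = GatedBeliefEncoder()
    h = torch.zeros(1, 128)
    proprio, extero = torch.randn(1, 48), torch.randn(1, 208)
    h = enc(proprio, extero, h)

If the training noise is aggressive enough, the gate should learn to close on exteroception exactly in the situations where the real sensors would mislead the policy, which is one plausible mechanism for the robust transfer shown in the video.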