As someone working on a reinforcement learning/neuroevolution problem right now, I find this extremely exciting. Fewer parameters, *ceteris paribus*, are always better, and the fact that the experiments in this paper were run on one workstation, rather than on a massive farm of TPUs à la AlphaGo, implies quicker development iteration and greater accessibility for the average researcher.

The staging of components in this paper (compressor/controller), where neuroevolution is only applied to a low-dimensional controller, reminds me of Ha and Schmidhuber's recent paper on world models (which is briefly cited) [1]. They employ a variational autoencoder with ~4.4M parameters, an RNN with ~1.7M parameters, and a final controller with just 1,088 parameters! Though it has recently been shown that neuroevolution can scale to millions of parameters [2], the technique of applying evolution to as few parameters as possible and supplementing with either autoencoders or vector quantization seems to be gaining traction. I hope to apply some of the ideas in this paper to multiple co-evolving agents...

[1] https://worldmodels.github.io

[2] https://arxiv.org/abs/1712.06567
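To make that division of labor concrete, here's a minimal sketch of the staging: a frozen random projection stands in for the pretrained compressor, a toy environment stands in for the real task, and a simple truncation-selection evolution strategy touches only the ~100 controller parameters. All names, shapes, and the environment itself are my own assumptions, not anything from the paper:

    import numpy as np

    # Hypothetical shapes: a frozen "compressor" maps raw observations to a
    # small latent code; only the tiny linear controller is evolved.
    OBS_DIM, LATENT_DIM, ACTION_DIM = 64 * 64, 32, 3
    N_PARAMS = (LATENT_DIM + 1) * ACTION_DIM  # ~100 evolved parameters

    rng = np.random.default_rng(0)
    W_enc = rng.normal(size=(LATENT_DIM, OBS_DIM)) / np.sqrt(OBS_DIM)

    def compress(obs):
        # Stand-in for a pretrained autoencoder / VQ encoder (frozen).
        return np.tanh(W_enc @ obs)

    def act(params, z):
        W = params[:-ACTION_DIM].reshape(ACTION_DIM, LATENT_DIM)
        b = params[-ACTION_DIM:]
        return np.tanh(W @ z + b)

    def episode_return(params):
        # Placeholder environment: toy dynamics, reward favors small actions.
        obs, total = rng.normal(size=OBS_DIM), 0.0
        for _ in range(100):
            a = act(params, compress(obs))
            total += -np.sum(a ** 2)          # toy reward
            obs = rng.normal(size=OBS_DIM)    # toy dynamics
        return total

    # Simple truncation-selection ES over the controller parameters only.
    theta, sigma = np.zeros(N_PARAMS), 0.1
    for gen in range(50):
        eps = rng.normal(size=(64, N_PARAMS))
        scores = np.array([episode_return(theta + sigma * e) for e in eps])
        elite = eps[np.argsort(scores)[-8:]]  # keep the top 8 perturbations
        theta += sigma * elite.mean(axis=0)

The point being: however big the compressor is, the search space the evolution strategy actually has to explore stays tiny.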
Cool article, lots to digest. One thing caught my eye:

"To the best of our knowledge, the only prior work using unsupervised learning as a pre-processor for neuroevolution is (cite)."

Just amazing how much low-hanging fruit there still is in the space.
I have been wolfing down RL articles, videos, and publications for some time now, after an intro to deep learning via Manning's Deep Learning book, and while the overall concept of RL is easy to grasp (agents, actions, states, etc.), some of the finer details and processes are quite confusing.

I am tempted to blame inconsistent terminology and implementations for this lack of understanding, but I suspect it has more to do with approaching the field through the lens of a developer rather than a researcher or academic: trying to understand the code without fully grasping the "science" of the mechanisms.

Either way, if you feel you're in a similar spot, check out this resource (there's also a small worked sketch after the links below):
<a href="https://reinforce.io" rel="nofollow">https://reinforce.io</a> and their respective Github repo:
<a href="https://github.com/reinforceio/tensorforce" rel="nofollow">https://github.com/reinforceio/tensorforce</a>.<p>Just reading through their code, and documentation has made a lot of the concepts clearer.<p>And a few more resources I found really helpful:
<a href="http://karpathy.github.io/2016/05/31/rl/" rel="nofollow">http://karpathy.github.io/2016/05/31/rl/</a>
<a href="https://www.analyticsvidhya.com/blog/2017/01/introduction-to-reinforcement-learning-implementation/" rel="nofollow">https://www.analyticsvidhya.com/blog/2017/01/introduction-to...</a>
<a href="https://www.oreilly.com/ideas/reinforcement-learning-with-tensorflow" rel="nofollow">https://www.oreilly.com/ideas/reinforcement-learning-with-te...</a><p>Edit: My point that I forgot to mention was that I always feel like I am playing catch-up to understand what is going on half the time as the amount of new content being released exceeds what I can absorb.
And the GitHub library:

https://github.com/giuse/DNE/tree/nips2018