Playing Atari with Six Neurons

169 points by togelius almost 7 years ago

6 comments

pjrule almost 7 years ago
As someone working on a reinforcement learning/neuroevolution problem right now, I find this to be extremely exciting. Fewer parameters, ceteris paribus, is always better: the fact that the experiments in this paper were run on one workstation, rather than on a massive farm of TPUs à la AlphaGo, implies quicker development iteration time and more accessibility to the average researcher.

The staging of components in this paper (compressor/controller), where neuroevolution is only applied to a low-dimensional controller, reminds me of Ha and Schmidhuber's recent paper on world models (which is briefly cited) [1]. They employ a variational autoencoder with ~4.4M parameters, an RNN with ~1.7M parameters, and a final controller with just 1,088 parameters! Though it's recently been shown that neuroevolution can scale to millions of parameters [2], the technique of applying evolution to as few parameters as possible and supplementing with either autoencoders or vector quantization seems to be gaining traction. I hope to apply some of the ideas in this paper to multiple co-evolving agents...

[1]. https://worldmodels.github.io

[2]. https://arxiv.org/abs/1712.06567
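
To make the compressor/controller split described above concrete, here is a minimal toy sketch. It is not the paper's method: the fixed centroid dictionary, the nearest-centroid encoding, the (1+1) evolution strategy, and the "is x positive?" task are all assumptions chosen for illustration, and every name in it (`CENTROIDS`, `encode`, `act`, `evolve`) is hypothetical. The point it tries to show is that once a compressor reduces observations to a short code, evolution only has to search a handful of controller weights (8 here).

```python
import random

# Hypothetical fixed dictionary of centroids (a stand-in compressor,
# not the one used in the paper).
CENTROIDS = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
N_ACTIONS = 2


def encode(obs):
    """Vector quantization: one-hot code of the nearest centroid."""
    dists = [(obs[0] - cx) ** 2 + (obs[1] - cy) ** 2 for cx, cy in CENTROIDS]
    nearest = dists.index(min(dists))
    return [1.0 if i == nearest else 0.0 for i in range(len(CENTROIDS))]


def act(weights, code):
    """Linear controller: one weight row per action, pick the highest score."""
    scores = [sum(w * c for w, c in zip(row, code)) for row in weights]
    return scores.index(max(scores))


def fitness(weights, observations):
    """Toy reward (assumed task): action 1 is correct when x > 0, else action 0."""
    correct = 0
    for obs in observations:
        target = 1 if obs[0] > 0 else 0
        correct += act(weights, encode(obs)) == target
    return correct / len(observations)


def evolve(observations, generations=200, sigma=0.5, seed=0):
    """(1+1) evolution strategy: mutate the parent, keep the child if no worse."""
    rng = random.Random(seed)
    parent = [[rng.gauss(0, 1) for _ in CENTROIDS] for _ in range(N_ACTIONS)]
    start = best = fitness(parent, observations)
    for _ in range(generations):
        child = [[w + rng.gauss(0, sigma) for w in row] for row in parent]
        score = fitness(child, observations)
        if score >= best:
            parent, best = child, score
    return parent, start, best


rng = random.Random(1)
observations = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(100)]
weights, start_fitness, final_fitness = evolve(observations)
n_params = sum(len(row) for row in weights)
print(n_params, start_fitness, final_fitness)
```

Only the controller weights are mutated; the compressor is held fixed in this sketch, whereas a real system would also have to learn or grow its dictionary from the observations it encounters.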
kthejoker2 almost 7 years ago
Cool article, lots to digest, one thing caught my eye:

"To the best of our knowledge, the only prior work using unsupervised learning as a pre-processor for neuroevolution is (cite)."

Just amazing how much low-hanging fruit there still is in the space.
markatkinson almost 7 years ago
I have been wolfing down RL articles, videos and publications for some time now, after an intro to deep learning via Manning's Deep Learning, and while the overall concept of RL is easy to grasp (agents, actions, state, etc.), some of the finer details and processes are quite confusing.

I am tempted to blame inconsistency across terminology and implementations for this lack of understanding, but I suspect it has more to do with approaching this field through the lens of a developer and not a researcher or academic: trying to understand the code without completely grasping the "science" of the mechanisms.

Either way, if you feel you are in a similar spot, check out this resource: https://reinforce.io and its GitHub repo: https://github.com/reinforceio/tensorforce.

Just reading through their code and documentation has made a lot of the concepts clearer.

And a few more resources I found really helpful:
http://karpathy.github.io/2016/05/31/rl/
https://www.analyticsvidhya.com/blog/2017/01/introduction-to-reinforcement-learning-implementation/
https://www.oreilly.com/ideas/reinforcement-learning-with-tensorflow

Edit: The point I forgot to mention is that I always feel like I am playing catch-up to understand what is going on, as the amount of new content being released exceeds what I can absorb.
kthejoker2 almost 7 years ago
And the GitHub library:

https://github.com/giuse/DNE/tree/nips2018
kabdib almost 7 years ago
... and that's three more than the average Atari marketing exec had back then. No wonder they had trouble understanding the game industry :-)
coldseattle almost 7 years ago
I can post on Hacker News with only 4.