Reinforcement Learning from scratch

221 points by e_ameisen almost 7 years ago

7 comments

crapflare almost 7 years ago
https://www.alexirpan.com/2018/02/14/rl-hard.html Reinforcement learning for the average person is a big waste of time. Probably for anyone atm.
curiousgal almost 7 years ago
This is mostly just a preview to a codecamp.
yonkshi almost 7 years ago
I think "learn" is a bit misleading here, but I do have to say it's a nice and intuitive overview of RL. RL is quite hard and math heavy; I don't know if one can take a shortcut in learning RL without a solid graduate-level math foundation.
master_yoda_1 almost 7 years ago
Is this a clickbait article? I wish I had AD BLOCKER plus plus to block this kind of &*## $#!^ :(
ninjamayo almost 7 years ago
Just get Sutton and Barto's book.
setzer22 almost 7 years ago
A small tangential criticism, but using "deep" every other sentence and especially expressions like "classical deep learning" made me take this article less seriously.

This is not unique to this author, sadly. I'm tired of seeing the d word thrown in research papers just for the sake of adding more buzzwords per buzzword.

Once you've made clear you are using neural networks with a lot of layers you can start using some variation in the discourse. Maybe just call them neural networks...
ogennadi almost 7 years ago
There were so many technical terms, I'm surprised you could get through even an overview, and then practicals, in just 4 hours.

Do you know of any resources that list most of the common alternatives? E.g., what are the alternatives to A3C for parallelizing, or the alternatives to A2C for getting policy and value estimates?