Reinforcement Learning from scratch

221 points by e_ameisen, almost 7 years ago

7 comments

crapflare, almost 7 years ago
https://www.alexirpan.com/2018/02/14/rl-hard.html Reinforcement learning for the average person is a big waste of time. Probably for anyone atm.
curiousgal, almost 7 years ago
This is mostly just a preview to a codecamp.
yonkshi, almost 7 years ago
I think "learn" is a bit misleading here, but I do have to say it's a nice and intuitive overview of RL. RL is quite hard and math heavy; I don't know if one can take a shortcut in learning RL without a solid graduate-level math foundation.
master_yoda_1, almost 7 years ago
Is this a clickbait article? I wish I had AD BLOCKER plus plus to block this kind of &*## $#!^ :(
ninjamayo, almost 7 years ago
Just get Sutton and Barto’s book.
setzer22, almost 7 years ago
A small tangential criticism, but using "deep" every other sentence and especially expressions like "classical deep learning" made me take this article less seriously.

This is not unique to this author, sadly. I'm tired of seeing the d word thrown in research papers just for the sake of adding more buzzwords per buzzword.

Once you've made clear you are using neural networks with a lot of layers you can start using some variation in the discourse. Maybe just call them neural networks...
ogennadi, almost 7 years ago
There were so many technical terms, I'm surprised you could get through even an overview, and then practicals, in just 4 hours.

Do you know of any resources which list most of the common alternatives? E.g., what are the alternatives to A3C for parallelizing, or the alternatives to A2C for getting policy and value estimates?