
Multi-Armed Bandits, Conjugate Models and Bayesian Reinforcement Learning

125 points by _eigenfoo almost 7 years ago

7 comments

heydenberk almost 7 years ago

I spent two years[0] designing, building and maintaining a system which used contextual multi-armed bandits at large scale. A couple pieces of advice relating to this post and this subject:

1. Thompson sampling is great. It's intuitive and computationally tractable. The literature is full of other strategies, specifically semi-uniform strategies, but I strongly recommend using Thompson sampling if it works for your problem.

2. This is broadly true of ML, but for contextual bandits, most of the engineering work will probably be the feature engineering, not the algorithm implementation. Plan accordingly. Choosing the right inputs in the first place makes a big difference, and the hashing trick (a la sklearn's dictvectorizer) can make a huge difference.

3. It can be difficult to obtain organizational alignment on the intention of using reinforcement learning. Tell stakeholders early and often that you're using bandit algos to produce some kind of outcome, say, clicks or conversions, and not to do science which will uncover deep truths.

[0] along with an excellent data scientist and a team of excellent engineers, of course :)
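Point 1 can be illustrated with a minimal Thompson sampling sketch for Bernoulli rewards, using the Beta conjugate prior the post discusses. The arm count and click-through rates below are invented for the simulation; this is a sketch, not the commenter's production system:

```python
import random

class ThompsonSampler:
    def __init__(self, n_arms):
        # Beta(1, 1) prior (uniform) over each arm's success probability
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms

    def select_arm(self):
        # Draw one sample from each arm's posterior; play the best draw.
        # This naturally balances exploration and exploitation.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Conjugate update: a success (1) bumps alpha, a failure (0) bumps beta
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward

# Simulate three arms with hidden (made-up) click-through rates
random.seed(0)
true_rates = [0.05, 0.12, 0.08]
bandit = ThompsonSampler(len(true_rates))
for _ in range(5000):
    arm = bandit.select_arm()
    reward = 1 if random.random() < true_rates[arm] else 0
    bandit.update(arm, reward)

pulls = [a + b - 2 for a, b in zip(bandit.alpha, bandit.beta)]
best = max(range(len(pulls)), key=pulls.__getitem__)
print(best)  # the highest-rate arm (index 1) should dominate the pulls
```

After enough rounds, the posterior for the weaker arms widens little and the sampler concentrates its pulls on the best arm, without any explicit epsilon or decay schedule.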
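The hashing trick from point 2 can be sketched in pure Python. This is a minimal unsigned variant with made-up feature names; sklearn's FeatureHasher additionally uses a signed hash to reduce collision bias:

```python
import hashlib

def hash_features(feature_dict, n_buckets=32):
    # Map arbitrary "name=value" string features into a fixed-length
    # vector by hashing each pair to a bucket. Collisions are possible
    # but the dimensionality stays bounded regardless of vocabulary size.
    vec = [0.0] * n_buckets
    for name, value in feature_dict.items():
        h = int(hashlib.md5(f"{name}={value}".encode()).hexdigest(), 16)
        vec[h % n_buckets] += 1.0
    return vec

# Hypothetical context for a contextual bandit
context = {"device": "mobile", "country": "US", "hour": 14}
v = hash_features(context)
print(len(v), sum(v))  # fixed 32-dim vector; three features hashed in
```

The appeal for contextual bandits is that new feature values (a new country, a new device string) need no vocabulary update: they simply hash into the same fixed-width vector.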
atrudeau almost 7 years ago

Though it's mentioned in the article, I'll add it here for posterity: https://web.stanford.edu/~bvr/pubs/TS_Tutorial.pdf

Great tutorial on Thompson sampling.
mopierotti almost 7 years ago

I would strongly recommend the post he cited. It is the same style but features interactive visualizations: https://dataorigami.net/blogs/napkin-folding/79031811-multi-armed-bandits

I implemented something like this for my company and found the latter article quite helpful in explaining the concept to people who understood the basics of probability but not programming.
pengstrom almost 7 years ago

I want to recommend http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/, a nice little book about probabilistic programming in Python.
melling almost 7 years ago

There was a free bandit algorithms book discussed on HN about a month ago: https://news.ycombinator.com/item?id=17642564
atrudeau almost 7 years ago

The math isn't rendering for me...
tegansnyder almost 7 years ago

This is an excellent, well-written post. Thanks for sharing.