
Horizon: Facebook’s Open Source Applied Reinforcement Learning Platform

174 points, by inputcoffee, over 6 years ago

8 comments

AndrewKemendo, over 6 years ago
This is not a research paper. In fact, most ML papers aren't research papers. Compare the FB paper to the first result on bioRxiv under the genetics heading [1]. There are basically no similarities other than being done in LaTeX. I never expect a research paper to talk about how the research affected business processes, but again, this isn't research in any traditional sense.

What this is, is documentation of how Facebook implemented a technology stack that uses reinforcement learning techniques to do something. Namely: "Notifications at Facebook."

So what can other developers and business owners take from this? I don't see anything about the downstream product impacts. Does it impact conversion to paid rate for users? Does it reduce human labor? How does it improve benefits to users? All I see them write are two things:

"We observed a significant improvement in activity and meaningful interactions by deploying an RL based policy for certain types of notifications, replacing the previous system based on supervised learning."

I'm sorry, but there is absolutely nothing rigorous in that statement. How are "meaningful interactions" defined? Hopefully they aren't still arguing the formula (more interaction = makes users better off).

"After deploying the DQN model, we were able to improve daily, weekly, and monthly metrics without sacrificing notification quality."

Improve for whom? Obviously for Facebook, and for how much activity people have. Not necessarily for the user, who isn't necessarily getting more value from it.

What's the return on investment for this system?

Listen, I'm a huge fan of being open with business practices, research, etc. I'm also obsessive about RL and making progress in the field.

What I can't stand, however, is the lack of rigorous and tangible proof that we're making things better for users or society broadly with RL yet, or even, in most cases, getting positive ROI for the effort we're putting into ML/DL.

I've built these tools at scale, so it hurts to say this, but the economics just aren't yet lining up across the entire ML/DL industry, and that has me worried that another AI winter is coming.

[1] https://www.biorxiv.org/content/early/2018/11/01/422345
inputcoffee, over 6 years ago
In case you want to see the actual paper: https://research.fb.com/wp-content/uploads/2018/10/Horizon-Facebooks-Open-Source-Applied-Reinforcement-Learning-Platform.pdf
dheera, over 6 years ago
If they want to increase adoption, they really need to make this stuff easier to install. I mean *zero* friction: either pip, or an Ubuntu PPA that "just works".

Caffe2 install page: "We only support Anaconda packages at the moment. If you do not wish to use Anaconda, then you must build Caffe2 from source." => We are a company with a 400B+ market cap but are too lazy to support more than one installation configuration. Good luck dealing with dependency hell, poor ML grad student researcher.

MXNet install page: "You can either upgrade your CUDA install or install the MXNet package that supports your CUDA version." => We welcome you with open arms regardless of your configuration! No matter your setup, we have a pre-built package for you!
rjammala, over 6 years ago
GitHub repo: https://github.com/facebookresearch/Horizon
sytelus, over 6 years ago
One of the most interesting parts of the paper is how RL is used, especially for Horizon, where one of the goals seems to be handling problems where a simulator isn't available. One such problem is push notifications:

"Historically, we have used supervised learning models for predicting click through rate (CTR) and likelihood that the notification leads to meaningful interactions."

"We introduced a new policy that uses Horizon to train a Discrete-Action DQN model for sending push notifications to address the problems above. The Markov Decision Process (MDP) is based on a sequence of notification candidates for a particular person. The actions here are sending and dropping the notification, and the state describes a set of features about the person and the notification candidate. There are rewards for interactions and activity on Facebook, with a penalty for sending the notification to control the volume of notifications sent. The policy optimizes for the long term value and is able to capture incremental effects of sending the notification by comparing the Q-values of the send and don't send action."
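A rough sketch of that send/drop decision rule. Everything here is invented for illustration: the state features, the weights, the penalty value, and the linear stand-in for the Q-function (Horizon's actual model is a trained Discrete-Action DQN, not a hand-weighted scorer). The point is just the shape of the policy: estimate the long-term value of each discrete action and send only when the incremental value of sending clears a volume-control penalty.

```python
# Hypothetical state: features about the person and the notification
# candidate (e.g. historical CTR, recency of the last notification, ...).
state = [0.35, 0.8, 0.1]

# Stand-in Q-function: one weight vector per discrete action.
# In Horizon this would be the output head of the trained DQN.
q_weights = {
    "send": [1.2, 0.5, -0.3],
    "drop": [0.2, 0.6, 0.1],
}

def q_value(state, action):
    """Estimated long-term value of taking `action` in `state`."""
    return sum(w * x for w, x in zip(q_weights[action], state))

def should_send(state, penalty=0.1):
    """Send only when the incremental value of sending
    (Q_send - Q_drop) exceeds the per-notification penalty
    that controls overall notification volume."""
    return q_value(state, "send") - q_value(state, "drop") > penalty

print(should_send(state))
```

Raising `penalty` trades reach for volume: the same candidate that clears a 0.1 threshold may be dropped at 0.5, which is how a single scalar can throttle how many notifications go out.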
Geee, over 6 years ago
This kind of AI applied at Facebook's scale scares me, mainly because humans are on the other side of the feedback loop: not only is the AI adapting, but people are adapting too. Over time, this kind of runaway feedback loop could lead anywhere.
pesenti, over 6 years ago
Blog post: https://code.fb.com/ml-applications/horizon/ (would be a better link if that can be changed)
amrrs, over 6 years ago
Discussion on Google's Dopamine, its reinforcement learning framework: https://news.ycombinator.com/item?id=15648746