I'm not super familiar with AI/ML/RL at all, so I'm sure this is a naive question, but isn't the obvious answer to just build costs into the utility function for the behaviors you want to avoid (what the article seems to call constrained RL)? That seems like both the simplest way to handle it and the most elegant in terms of mapping to the real-world domain. Are there alternative solutions that are even remotely competitive with this? I'm sure I'm oversimplifying and that there's some nuance I'm missing. E.g. is this more about how you design those constraints to minimize the overall loss in learning efficiency, or something like that?
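One nuance worth noting: folding a cost into the reward with a hand-picked penalty coefficient is "reward shaping," whereas constrained RL proper keeps the cost as a separate signal with a threshold and adapts the trade-off automatically. Here is a minimal toy sketch of both, assuming made-up per-episode reward/cost numbers (this is not the Safety Gym API, and a real algorithm would update the policy inside the loop so costs actually fall):

    import numpy as np

    # Hypothetical per-episode rollout statistics (made-up numbers).
    rewards = np.array([1.0, 0.8, 1.2, 0.9])   # task reward per episode
    costs   = np.array([0.0, 2.0, 3.0, 0.5])   # safety cost per episode
    cost_limit = 1.0                           # allowed average cost (threshold d)

    # (1) Fixed penalty baked into the reward: the coefficient is hand-tuned.
    # Too low and the agent ignores safety; too high and it never learns the task.
    penalty = 0.5
    shaped_returns = rewards - penalty * costs

    # (2) Lagrangian constrained RL: treat the coefficient as a learned
    # multiplier that rises while the constraint is violated and falls otherwise
    # (dual ascent). In a full algorithm, policy updates in this loop would
    # drive costs down as lambda grows, so the two converge together.
    lam, lr = 0.0, 0.05
    for _ in range(100):
        violation = costs.mean() - cost_limit
        lam = max(0.0, lam + lr * violation)

    objective = rewards - lam * costs  # what the policy gradient would optimize
    print(f"lambda after dual ascent: {lam:.2f}")

The practical difference is that the hand-tuned penalty fixes the safety/return trade-off in advance, while the multiplier adapts until the cost constraint is (approximately) met, which is roughly why the question "penalty in the reward vs. constrained RL" isn't just terminology.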
On this topic, if anyone wants to understand the behind-the-scenes of working on and maintaining projects like this, I did an interview with a maintainer of OpenAI Gym here: https://www.sourcesort.com/interview/peter-zhokhov-open-ai-gym
If you like this, you may also enjoy "AI Safety Gridworlds" from DeepMind: https://arxiv.org/abs/1711.09883
Everything about the "OpenAI" institute seems designed to appeal to frightened, paranoid billionaire donors who think they need to be kept safe from near relatives of logistic regression and the remote control for their television, because muh singularity.

Can't you just call it "constrained reinforcement learning" without sexing it up for Elon? I guess not.