I'm not super familiar with AI/ML/RL at all, so I'm sure this is a naive question, but isn't the obvious answer to just build costs into the utility function for the behaviors you want to avoid (what the article seems to call constrained RL)? That seems like both the simplest way to handle it and the most elegant in terms of mapping to the real-world domain. Are there alternative solutions that are even remotely competitive with this? I'm sure I'm oversimplifying and that there's some nuance I'm missing. E.g. is this more about how you design those constraints to minimize the overall loss in learning efficiency, or something like that?
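One nuance worth noting: folding a cost into the reward with a hand-picked penalty coefficient is "reward shaping," whereas constrained RL proper keeps the cost as a separate signal with a threshold and adapts the trade-off automatically. Here is a minimal toy sketch of both, assuming made-up per-episode reward/cost numbers (this is not the Safety Gym API, and a real algorithm would update the policy inside the loop so costs actually fall):

    import numpy as np

    # Hypothetical per-episode rollout statistics (made-up numbers).
    rewards = np.array([1.0, 0.8, 1.2, 0.9])   # task reward per episode
    costs   = np.array([0.0, 2.0, 3.0, 0.5])   # safety cost per episode
    cost_limit = 1.0                           # allowed average cost (threshold d)

    # (1) Fixed penalty baked into the reward: the coefficient is hand-tuned.
    # Too low and the agent ignores safety; too high and it never learns the task.
    penalty = 0.5
    shaped_returns = rewards - penalty * costs

    # (2) Lagrangian constrained RL: treat the coefficient as a learned
    # multiplier that rises while the constraint is violated and falls otherwise
    # (dual ascent). In a full algorithm, policy updates in this loop would
    # drive costs down as lambda grows, so the two converge together.
    lam, lr = 0.0, 0.05
    for _ in range(100):
        violation = costs.mean() - cost_limit
        lam = max(0.0, lam + lr * violation)

    objective = rewards - lam * costs  # what the policy gradient would optimize
    print(f"lambda after dual ascent: {lam:.2f}")

The practical difference is that the hand-tuned penalty fixes the safety/return trade-off in advance, while the multiplier adapts until the cost constraint is (approximately) met, which is roughly why the question "penalty in the reward vs. constrained RL" isn't just terminology.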
On this topic, if anyone wants to understand the behind-the-scenes of working on and maintaining projects like this, I did an interview with a maintainer of OpenAI Gym here: https://www.sourcesort.com/interview/peter-zhokhov-open-ai-gym
If you like this, you may also enjoy "AI Safety Gridworlds" from DeepMind: https://arxiv.org/abs/1711.09883
Everything about the "OpenAI" institute seems designed to appeal to frightened, paranoid billionaire donors who think they need to be kept safe from near relatives of logistic regression and the remote control for their television, because muh singularity.

Can't you just call it "constrained reinforcement learning" without sexing it up for Elon? I guess not.