This is great. I've always thought intelligence can only be defined as an emergent property of self-replicating systems operating under stress, and this provides a good framework for that "stress".
Pieter Abbeel (author of the OP link, with Igor Mordatch) explains his group's work [1] as a guest lecturer for Berkeley's CS294-112 Deep Reinforcement Learning course.

[1] https://www.youtube.com/watch?v=f4gKhK8Q6mY&list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX&index=25

This great talk starts with his work at OpenAI on neural-net safety and adversarial images, moves on to the OP research paper, Emergence of Grounded Compositional Language in Multi-Agent Populations [2], and concludes with his earlier work (with Andrew Ng) on reinforcement learning of helicopter flight and stunt controllers from human pilots.

The OP's multi-agents divide labour, and apparently collaborative plans emerge. The goal-seeking agents split up and appear to dance to and fro around their goal, distracting the predators from their kin; to all appearances this is coordinated and clever.

Dawkins's The Selfish Gene explains altruism as an inevitability of genetic relatedness: the individual sacrifices itself, but its genes persist in siblings.

In this work, altruism emerges purely memetically.

The Nash equilibrium of cooperation jumps out of the local minimum of selfishness in this prisoner's dilemma (a toy illustration follows below).

Multi-agent environments are difficult to learn in, with many false minima for the learners.

This work hints that the loose coupling of language, rather than the direct sharing of memories or genes, is noisy enough to find more global solutions that appear complex or 'plan-like'.

Maybe these AIs should be considered planners, yet in a bottom-up, immediate-heuristic, emergent way.

[2] https://arxiv.org/abs/1703.04908
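As the promised toy illustration of that equilibrium point, here is a minimal sketch in Python (the payoff numbers are the standard textbook values, not anything from the paper): in the one-shot game, defection is the best response to everything, so selfish play is the only equilibrium, but repeated play lets reciprocal cooperation out-earn it.

    # One-shot vs. iterated prisoner's dilemma with textbook payoffs.
    PAYOFF = {  # (my move, their move) -> my payoff; C = cooperate, D = defect
        ("C", "C"): 3, ("C", "D"): 0,
        ("D", "C"): 5, ("D", "D"): 1,
    }

    def best_response(their_move):
        """My payoff-maximising reply to a fixed opponent move."""
        return max("CD", key=lambda me: PAYOFF[(me, their_move)])

    # One-shot: defection is the best response to either move, so (D, D)
    # is the unique Nash equilibrium even though (C, C) pays more jointly.
    assert best_response("C") == "D" and best_response("D") == "D"

    def iterated_score(strategy_a, strategy_b, rounds=200):
        """Total payoffs when two memory-one strategies play repeatedly."""
        score_a = score_b = 0
        last_a, last_b = "C", "C"  # tit-for-tat opens cooperatively
        for _ in range(rounds):
            move_a, move_b = strategy_a(last_b), strategy_b(last_a)
            score_a += PAYOFF[(move_a, move_b)]
            score_b += PAYOFF[(move_b, move_a)]
            last_a, last_b = move_a, move_b
        return score_a, score_b

    tit_for_tat = lambda their_last: their_last  # copy opponent's last move
    always_defect = lambda their_last: "D"

    print(iterated_score(tit_for_tat, tit_for_tat))      # (600, 600): cooperation
    print(iterated_score(always_defect, always_defect))  # (200, 200): selfish rut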
Am I the only one who finds this scary?

As I have repeatedly said, the intelligence we have been producing so far is NOT the kind that applies abstract logic rules to figure out meaning.

It is the kind that takes full advantage of computers' strengths: perfect copying and speed.

So good ideas are copied and propagate. Neural networks are just this on steroids: extracting signals from noise, searching a space to maximize something, and storing the results.

Humans were able to transmit knowledge, then produce books, and so on. Now bits can be perfectly copied with checksums.

This isn't general intelligence in the human sense, but that's what makes it scary. It can solve these problems with brute force. Resistance soon may really be futile; not just in running or avoiding capture and death, but in ANY human system we rely on, including voting, due process of law, trust, reputation, family, sex, humor, etc.
> a multiagent environment has no stable equilibrium

It does: https://en.wikipedia.org/wiki/Evolutionarily_stable_strategy
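For concreteness, a minimal sketch (illustrative payoff numbers, not the OP's environment) of replicator dynamics on the classic hawk-dove game: the population settles at the evolutionarily stable mix, playing hawk with probability V/C.

    # Replicator dynamics on hawk-dove: V = resource value, C = fight cost.
    V, C = 2.0, 4.0  # hypothetical parameters, with C > V

    def fitness(p):
        """Expected payoffs to hawks and doves when a fraction p plays hawk."""
        f_hawk = p * (V - C) / 2 + (1 - p) * V
        f_dove = (1 - p) * V / 2
        return f_hawk, f_dove

    p, dt = 0.9, 0.1  # start hawk-heavy; small integration step
    for _ in range(2000):
        f_hawk, f_dove = fitness(p)
        # Strategies grow in proportion to their fitness advantage
        # over the population average; p = 0 and p = 1 are fixed points,
        # but the interior ESS at p = V/C is the stable attractor.
        p += dt * p * (1 - p) * (f_hawk - f_dove)

    print(f"hawk fraction: {p:.3f}  (ESS predicts V/C = {V / C:.3f})")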