>> But as the AI race heated up, DeepMind was drawn more tightly into Google proper. A bid by the lab’s leaders in 2021 to secure more autonomy failed, and in 2023 it merged with Google’s other AI team—Google Brain—bringing it closer to the heart of the tech giant.<p>The way I understand it, what happened was that Reinforcement Learning (RL) went out of fashion at the same time that LLMs became wildly popular. DeepMind was all about RL, so their needs and wants were basically sidelined in favour of the New Big Thing in AI™.<p>The reason, of course, that RL "fell out of fashion", as I say, is the continuing failure of RL approaches to work convincingly and reliably in the real world. RL (basically Deep RL, since that's all anyone's doing these days) works great in simulation, but there are two big problems with it.<p>The first one is generalisation, or the lack thereof. RL doesn't generalise. You can train an RL agent in one environment and it will learn to solve that environment perfectly, if sometimes awkwardly, but if you take the same agent and put it in a different environment, even one from the same domain, it will basically die [1,2].<p>The second problem is that RL agents rely on a model of the dynamics of their environment, and such models are not easy to come by: only humans are able to create robust, useful models of real-world environments.
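To make the first problem concrete, here is a toy sketch of my own (not from the linked survey; all names and parameters are illustrative): tabular Q-learning solves a small gridworld, but the greedy policy it learns gets nowhere when the grid is made slightly larger and the goal is moved, i.e. in a new environment from the same domain.

```python
import random

class GridWorld:
    """Deterministic grid: start at (0,0), -1 reward per step, 0 on reaching the goal."""
    def __init__(self, size=4, goal=(3, 3)):
        self.size, self.goal = size, goal

    def reset(self):
        self.pos = (0, 0)
        return self.pos

    def step(self, a):
        # Actions: 0=up, 1=down, 2=left, 3=right; moves are clipped at the walls.
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][a]
        r, c = self.pos
        self.pos = (min(max(r + dr, 0), self.size - 1),
                    min(max(c + dc, 0), self.size - 1))
        done = self.pos == self.goal
        return self.pos, (0.0 if done else -1.0), done

def train(env, episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng, Q = random.Random(seed), {}
    for _ in range(episodes):
        s = env.reset()
        for _ in range(100):
            a = (rng.randrange(4) if rng.random() < eps
                 else max(range(4), key=lambda b: Q.get((s, b), 0.0)))
            s2, r, done = env.step(a)
            target = r + gamma * max(Q.get((s2, b), 0.0) for b in range(4))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
            s = s2
            if done:
                break
    return Q

def evaluate(Q, env, max_steps=50):
    """Follow the greedy policy; return steps to the goal, or None if it never arrives."""
    s = env.reset()
    for t in range(max_steps):
        s, _, done = env.step(max(range(4), key=lambda b: Q.get((s, b), 0.0)))
        if done:
            return t + 1
    return None

Q = train(GridWorld(size=4, goal=(3, 3)))
print(evaluate(Q, GridWorld(size=4, goal=(3, 3))))  # steps to goal on the training grid
print(evaluate(Q, GridWorld(size=5, goal=(4, 4))))  # goal moved: the same policy gets stuck
```

The policy that solves the 4x4 grid just oscillates around where the old goal used to be once the grid grows to 5x5 and the goal moves: a tabular miniature of the Breakout-paddle story in [2].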
There are of course model-free RL approaches that learn directly by interacting with an environment, without an explicit model, but those only work in virtual environments, for the simple reason that you can't learn by trial and error in the physical world without dying many thousands of times.<p>So it looks like it's RL out, LLMs in, at Google as everywhere else, and I guess we'll see what the Next Big Thing in AI™ is going to be after LLMs, and who is going to make their fortune with it.<p>________________<p>[1] <a href="https://robertkirk.github.io/2022/01/17/generalisation-in-reinforcement-learning-survey.html" rel="nofollow">https://robertkirk.github.io/2022/01/17/generalisation-in-re...</a><p>[2] I can't find the paper, if it was a paper, but there was a story about moving the paddle in Breakout a few pixels and thereby causing a trained RL agent to fail.