As an ML researcher, I can tell you that simple back-of-the-envelope math shows the scaling hypothesis is totally wrong. We will never reach human-level AI by simply scaling up what we have today.

GPT-3 has 10^11 parameters and needs 10^14 bytes of training data. Averaged performance on a bunch of benchmarks is 40-50%, depending on what kind of prompts you provide: https://res.cloudinary.com/dyd911kmh/image/upload/f_auto,q_auto:best/v1598020447/gpt3-3_krvb14.png. Cutting parameters by 10x drops performance by about 10 percentage points.

If you just linearly extrapolate that graph (and ML doesn't generally scale linearly; models tend to peter out eventually), you're talking about models that are 10^6 times larger or more, with a similar increase in training data. That is starting to be impractical.

That's 10^17 or more parameters and 10^20 or more bytes of data (see the rough arithmetic sketched at the end of this comment). And that's assuming the models actually continue to learn.

This is also extrapolating from an average. Datasets in machine learning are not difficulty-calibrated at all; we have no idea how to measure difficulty. So the extrapolation is being driven by the easier datasets, and it won't saturate the hard ones. For example, GPT-3 makes a lot of systematic errors, and there are plenty of benchmarks where it just isn't very good no matter how many parameters it has.

The biggest hurdle here is our understanding of what intelligence is in the first place. That's why we can't benchmark these systems properly: we can't come up with a benchmark where performance on it means we're x% of the way toward an intelligent system. As systems get better, our benchmarks and datasets get better to keep up with them. So saying we're going to saturate performance on today's benchmarks with some model that has 10^17 parameters just doesn't mean much at all.

We have no guarantee, and no reason to expect, that doing well on today's benchmarks, even if we invested trillions of dollars, would matter in the grand scheme of things.

That doesn't mean these models can't be useful. But there's plenty more to do before we can just say "take what we have, invest $1T to scale it up, and we'll be good to go".
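
To make that extrapolation concrete, here's a minimal Python sketch of the back-of-the-envelope math above. Every constant is a rough assumption pulled from the numbers in this comment (10^11 parameters, ~40% average score, ~10 points gained per 10x of scale), not a measured scaling-law coefficient, and it generously assumes the linear trend holds all the way to saturation:

    import math

    # Rough assumptions from the comment above -- not measured values.
    PARAM_EXPONENT_GPT3 = 11   # GPT-3: ~10^11 parameters
    DATA_EXPONENT_GPT3 = 14    # ~10^14 bytes of training data
    CURRENT_SCORE = 0.40       # ~40-50% averaged benchmark accuracy
    TARGET_SCORE = 1.00        # hypothetical benchmark saturation
    GAIN_PER_10X = 0.10        # ~10 points per 10x parameters, assumed to hold forever

    # How many 10x jumps in scale would the linear trend require?
    jumps = math.ceil((TARGET_SCORE - CURRENT_SCORE) / GAIN_PER_10X)   # 6

    print(f"scale-up factor: 10^{jumps}")                              # 10^6
    print(f"parameters:      ~10^{PARAM_EXPONENT_GPT3 + jumps}")       # ~10^17
    print(f"training data:   ~10^{DATA_EXPONENT_GPT3 + jumps} bytes")  # ~10^20

Even this optimistic version lands at ~10^17 parameters and ~10^20 bytes of data; if the curve flattens out, as it usually does, the real numbers only get worse.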