My main question is: why would scaling lead to intelligent behavior in an AI, and how would that intelligent behavior actually get generated?
For the sake of this question I'm assuming that scaling can lead to AGI and superintelligent behavior, as some AI engineers claim.
Using OpenAI's scaling-laws paper as the basis: https://arxiv.org/pdf/2001.08361.pdf

You basically have three things: compute, dataset size, and parameter count. The parameters, as far as I understand, act as universal function approximators and can in principle represent any function. The data provides the "geometry" that the parameters try to model, and compute lets you train faster and scale up both the parameter count and the dataset size.

Now, as far as I can tell, the parameters will only ever model whatever is actually in the data. It stands to reason that with more parameters and more data you can model the data better, and with more data you can model more "stuff" (whether real physical stuff, digital data, etc.). That's all fine, but here's the problem: what about all the stuff that is NOT in the data?
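For reference, the paper's headline result is a set of power laws relating test loss L to parameter count N, dataset size D, and compute C. I'm quoting the fitted exponents from memory, so treat them as approximate:

    L(N)     ≈ (N_c / N)^α_N,          α_N ≈ 0.076   (params, data not limiting)
    L(D)     ≈ (D_c / D)^α_D,          α_D ≈ 0.095   (data, with early stopping)
    L(C_min) ≈ (C_c / C_min)^α_C,      α_C ≈ 0.050   (compute, optimally allocated)

And to make the "only models what's in the data" point concrete, here's a minimal toy sketch (my own, not from the paper), where a degree-9 polynomial stands in for a neural net:

    import numpy as np

    # Fit a flexible function approximator to sin(x) sampled only
    # on [-pi, pi], then query it inside and outside that range.
    rng = np.random.default_rng(0)
    x_train = rng.uniform(-np.pi, np.pi, 200)
    y_train = np.sin(x_train)

    model = np.poly1d(np.polyfit(x_train, y_train, deg=9))

    x_in = np.linspace(-np.pi, np.pi, 50)          # inside the training range
    x_out = np.linspace(2 * np.pi, 3 * np.pi, 50)  # outside it

    print("max error inside :", np.max(np.abs(model(x_in) - np.sin(x_in))))    # tiny
    print("max error outside:", np.max(np.abs(model(x_out) - np.sin(x_out))))  # enormous

Inside the training range the fit is excellent; outside it the model is useless, which is exactly my worry about everything that is not in the data.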
Primarily, most intelligent behavior in humans is not in the data. The most important part of intelligence is not the ability to know things; it's the ability to synthesize disparate pieces of information from wildly different places and then generate a coherent sequence of steps/actions to reach some desired outcome.

Humans do this, I think, primarily because as organisms we evolved to detect disturbances in the body and the extended environment, and then generate actions/behavior to correct those disturbances. Things like needing to eat and sleep, but also higher social things like maintaining ego, even maintaining moral order :P
But the point here is, aside from the actions we learn socially from others, a lot of those behaviors are not in the "data". Many of the sequences of steps we generate we learn on our own, grounded in our hands, feet, and the other output actuators we have.
So what happens if, say, we have an AI with trillions and trillions of parameters, trained on an incredible amount of data, maybe all of Earth's data, even raw sensor input from all surveillance equipment or something like that.

What happens if I ask that AI to generate nanobots that can go into a human host, precisely target every cancer cell, and eliminate it?
And let's say, for argument's sake, that the schematic and shape of the nanobot don't exist in the training data (since humans haven't invented it), and neither do the factory needed to produce it, the materials science, or the computational ideas needed to power the nanobot's CPU.

How would that AI generate a long sequence of steps (not in the data) to produce, in the real world, an army of nanobots? Its neural net might actually contain, scattered across disparate places, all the raw information needed, but how it would synthesize that into a novel sequence of behavior in the physical world to actually produce them, I don't know...
But there's also another problem: why would it generate _ANY_ such novel sequence of actions/steps AT ALL? What would be the impetus for it to, for example, come up with this idea for nanobots on its own?
The only way to get this, I think, is by modeling humans or other organisms, but even then there's a question: human and animal brains and behaviors are so tuned to our biological needs that I'm not sure such actions would, in aggregate, even make sense for an AGI.

Curious about this! Sorry if I'm incredibly wrong; I just want to know more.