My background: I'm familiar with pre-GPT machine learning and DNNs.

I read the relevant papers and went through many explanations of how transformers work.

Often those explanations spend thousands of words explaining attention at the word level, and then just say a few words about "oh, and with multiple attention heads it focuses on different aspects, and then multiple layers, and then, magic!".

What's happening in those other aspects, what are they? Are there papers that examine what kind of concepts the model is actually building/learning in those heads and layers?

There are large teams who spend months tuning those models. Do those teams have access to those internal concepts that the model built up and organized? Is any of this work public?

In computer vision and CNNs, I recall seeing a paper once that showed that each layer of the network was learning a higher-level feature than the layer before it (as an inaccurate example: first layer learns edges, second layer learns shapes, third layer textures, fourth layer objects, etc., and they show you the eigenvectors of each as representatives).

E.g. I asked ChatGPT to tell me a joke about a table in a sundress in the voice of a famous stoic person. Judging by its response, it adequately "understands" what that person's style sounds like, basic humor, the concept of clothing, and how to map that onto an inanimate object (punchline: "I figured if a chair can wear a seat cushion, why can't I wear a sundress?").

(Obviously this is a tame example, but it serves its purpose for the discussion.)
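[To make the "multiple heads focus on different aspects" part concrete: here is a minimal NumPy sketch of multi-head self-attention, not from any of the comments above. All names, dimensions, and the random weights are made up for illustration. The point is structural: each head gets its own slice of the Q/K/V projections, so each head computes its own attention pattern over the same tokens, and those per-head patterns are exactly what interpretability work pokes at.]

```python
# Minimal sketch of multi-head self-attention (illustrative only; weights are random).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (seq_len, d_model); Wq/Wk/Wv/Wo: (d_model, d_model)."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads

    # Project once, then split the feature dimension into per-head subspaces.
    Q = (X @ Wq).reshape(seq_len, n_heads, d_head)
    K = (X @ Wk).reshape(seq_len, n_heads, d_head)
    V = (X @ Wv).reshape(seq_len, n_heads, d_head)

    outputs, patterns = [], []
    for h in range(n_heads):
        # Each head has its own (seq_len, seq_len) attention pattern.
        scores = Q[:, h] @ K[:, h].T / np.sqrt(d_head)
        attn = softmax(scores, axis=-1)
        outputs.append(attn @ V[:, h])
        patterns.append(attn)

    # Concatenate the heads' outputs and mix them back into d_model.
    out = np.concatenate(outputs, axis=-1) @ Wo
    return out, patterns  # 'patterns' holds one attention map per head

# Toy usage: 5 tokens, 16-dim model, 4 heads with random weights.
rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 16, 4, 5
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
out, patterns = multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads)
print(out.shape, len(patterns), patterns[0].shape)  # (5, 16) 4 (5, 5)
```

[In a trained model, individual heads end up with very different `patterns` (e.g. attending to the previous token, to matching names, to syntactic dependencies); stacking layers lets later heads read features that earlier layers wrote into the residual stream.]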
> Are there papers that peruse what kind of concepts the model is actually building/learning in those heads and layers?<p>> There are large teams who spend months tuning those models. Do those teams have access to those internal concepts that the model built up and organized? Is any of this work public?<p>See: <a href="https://openai.com/research/language-models-can-explain-neurons-in-language-models" rel="nofollow">https://openai.com/research/language-models-can-explain-neur...</a><p>My understanding: Generally, the models are compressing their understanding of all text, and in doing so, they're learning high order concepts that allow their compression of all the text they were fed during pre-training to be a better compression - more compressed, and less loss.
It has been argued for a long time that text completion is what is called "AI-complete": if you have full AGI, it can do human-level text completion, and if you have human-level text completion, you have full AGI. So they found a way, using an obscene number of model parameters, obscene compute power, and an obscene dataset size, to get really, really good at text completion. Now they have systems that, looking back, they are going to call just AGI. In simpler words, it works because the computer brains got so big that they are now conscious like you and me.