There are some pretty complex and fancy prompts in GPT land (here's a good list: https://arxiv.org/pdf/2302.11382.pdf). But if the model is simply predicting the next most likely word or phrase, one after another, why do these prompts work at all? What's going on inside the machine?
My hunch is that following an instruction is a subset of text completion.

> How are you?

The most likely next word might be computed to be "I".

> How are you? I

Then, what's the word most likely to follow "How are you? I"? That might be "am".

> How are you? I am

...and what's the word most likely to follow "How are you? I am"? And so on.

BTW, I think it's helpful to put "Ask HN" before question posts. Makes it easier to find.
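To make that loop concrete, here's a minimal sketch of greedy next-token generation, assuming the Hugging Face transformers library and the small gpt2 checkpoint (not whatever OpenAI actually serves):

    # Greedy decoding: repeatedly append the single most likely token.
    # Assumes the `torch` and `transformers` packages are installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("How are you?", return_tensors="pt").input_ids
    for _ in range(5):
        logits = model(ids).logits          # scores for every vocabulary token
        next_id = logits[0, -1].argmax()    # pick the most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(ids[0]))         # e.g. "How are you? I am ..."

Chat models sample from the distribution rather than taking the argmax every time, but the shape of the loop is the same.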
GPT is trained on lots of data so it can learn patterns + relationships between words. So when a prompt is given, the model looks at the words in the prompt and uses its knowledge from the data it has seen to guess what the next words or phrases might be. The key is patterns + relationships.
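You can actually peek at what those learned patterns and relationships cash out to: a conditional probability distribution over the next token. A hedged sketch, again assuming transformers and the gpt2 checkpoint:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
    probs = torch.softmax(model(ids).logits[0, -1], dim=-1)

    # Show the five tokens the model considers most likely to come next.
    top = probs.topk(5)
    for p, i in zip(top.values, top.indices):
        print(f"{tokenizer.decode(int(i))!r}: {p.item():.3f}")  # " Paris" should rank near the top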
Part of it is attention, right? If it sees the word "pirate" then it's more likely to say "Argh" and "Matey"...

And because it's transformers on top of transformers, it can produce phrases based on other words and phrases.

The smaller models like to repeat themselves; I'm not sure if reinforcement learning is what fixed that, or just larger models.
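You can see the pirate point directly: the same model assigns a word like "matey" very different probability depending on whether "pirate" is in the context. A sketch, assuming transformers and gpt2 (note it only scores the first BPE piece of the word):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def next_token_prob(context: str, word: str) -> float:
        # Probability of the first BPE piece of `word` as the next token.
        ids = tokenizer(context, return_tensors="pt").input_ids
        probs = torch.softmax(model(ids).logits[0, -1], dim=-1)
        return probs[tokenizer.encode(word)[0]].item()

    print(next_token_prob("Hello there,", " matey"))                       # expect: tiny
    print(next_token_prob("The old pirate growled: Hello there,", " matey"))  # expect: much larger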
Part of it is that, somewhere in the training data, something like the prompt appears near the relevant text.

For instance, somebody posted an example the other day where they asked GPT-4 who wrote a snippet from the Stratechery blog, and it replied (after a lot of boilerplate about the difficulty of the problem) "Ben Thompson".

Somewhere in the training data it saw something like "Author: Ben Thompson" close to a lot of text that uses words in a certain way, and it learned the conditional probability distribution for that.

That underlying "knowledge" is captured by the pre-training phase, where it learns the statistical regularities of text.

The ability to access that knowledge through prompts, and to have a personality, add boilerplate text, be helpful, and stay agreeable even when it refuses to write a Hitler speech, is trained in a second stage using this technique:

https://huggingface.co/blog/rlhf

That is, by seeing examples where somebody asks it to identify the author of a text, it learns not only to access the knowledge from pre-training the right way, but also to sound equivocal about it, even though that problem is a baseball pitched right down the middle of the plate as far as it's concerned.
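For the shape of that second stage, here's a toy, heavily simplified sketch. Real RLHF trains a reward model from human preference data and optimizes with PPO (see the linked post); this is just a REINFORCE-style update on gpt2, and reward() below is a hypothetical stand-in:

    # Toy second-stage tuning: a reward signal nudges the model toward
    # preferred completions. Not the actual RLHF pipeline.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

    def reward(text: str) -> float:
        # Hypothetical stand-in for a learned reward model: prefer answers
        # that actually name an author rather than equivocate.
        return 1.0 if "Ben Thompson" in text else -1.0

    ids = tokenizer("Who wrote this Stratechery snippet? Answer:",
                    return_tensors="pt").input_ids

    # Sample a completion from the current policy.
    out = model.generate(ids, do_sample=True, max_new_tokens=8,
                         pad_token_id=tokenizer.eos_token_id)
    completion = out[0, ids.shape[1]:]

    # Log-probability the model assigned to its own sampled tokens
    # (logits at position t predict the token at position t + 1).
    logits = model(out).logits[0, ids.shape[1] - 1:-1]
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp[torch.arange(len(completion)), completion].sum()

    # REINFORCE: scale the log-prob by the reward and take a gradient step.
    optimizer.zero_grad()
    loss = -reward(tokenizer.decode(completion)) * token_logp
    loss.backward()
    optimizer.step()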