There are some pretty complex and fancy prompts in GPT land (here's a good list: https://arxiv.org/pdf/2302.11382.pdf). But if the model is simply predicting the next most likely word or phrase, one after another, why do these prompts work at all? What's going on inside the machine?
My hunch is that following an instruction is a subset of text completion.

> How are you?

The most likely next word might be computed to be "I".

> How are you? I

Then, what's the word most likely to follow "How are you? I"? That might be "am".

> How are you? I am

...and what's the word most likely to follow "How are you? I am"? And so on.

BTW, I think it's helpful to put "Ask HN" before question posts. Makes it easier to find.
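To make that loop concrete, here's a minimal sketch of greedy next-token generation, assuming the Hugging Face transformers library and the small gpt2 checkpoint (not whatever OpenAI actually serves):

    # Greedy decoding: repeatedly append the single most likely token.
    # Assumes the `torch` and `transformers` packages are installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("How are you?", return_tensors="pt").input_ids
    for _ in range(5):
        logits = model(ids).logits          # scores for every vocabulary token
        next_id = logits[0, -1].argmax()    # pick the most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(ids[0]))         # e.g. "How are you? I am ..."

Chat models sample from the distribution rather than taking the argmax every time, but the shape of the loop is the same.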
GPT is trained on lots of data so it can learn patterns + relationships between words. So when a prompt is given, the model looks at the words in the prompt and uses its knowledge from the data it has seen to guess what the next words or phrases might be. The key is patterns + relationships.
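You can actually peek at what those learned patterns and relationships cash out to: a conditional probability distribution over the next token. A hedged sketch, again assuming transformers and the gpt2 checkpoint:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
    probs = torch.softmax(model(ids).logits[0, -1], dim=-1)

    # Show the five tokens the model considers most likely to come next.
    top = probs.topk(5)
    for p, i in zip(top.values, top.indices):
        print(f"{tokenizer.decode(int(i))!r}: {p.item():.3f}")  # " Paris" should rank near the top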
Part of it is attention, right? If it sees the word "pirate" then it's more likely to say "Argh" and "Matey"...

And because it's transformers on top of transformers, it can produce phrases based on other words and phrases.

The smaller models like to repeat themselves; I'm not sure if reinforcement learning is what fixed that, or just larger models.
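You can see the pirate point directly: the same model assigns a word like "matey" very different probability depending on whether "pirate" is in the context. A sketch, assuming transformers and gpt2 (note it only scores the first BPE piece of the word):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def next_token_prob(context: str, word: str) -> float:
        # Probability of the first BPE piece of `word` as the next token.
        ids = tokenizer(context, return_tensors="pt").input_ids
        probs = torch.softmax(model(ids).logits[0, -1], dim=-1)
        return probs[tokenizer.encode(word)[0]].item()

    print(next_token_prob("Hello there,", " matey"))                       # expect: tiny
    print(next_token_prob("The old pirate growled: Hello there,", " matey"))  # expect: much larger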
Part of it is that, somewhere in the training data, something like the prompt appears near the relevant text.

For instance, somebody posted an example the other day where they asked GPT-4 who wrote a snippet from the Stratechery blog, and it replied (after a lot of boilerplate about the difficulty of the problem) "Ben Thompson".

Somewhere in the training data it saw something like "Author: Ben Thompson" close to a lot of text that uses words in a certain way, and it learned the conditional probability distribution for that.

That underlying "knowledge" is captured by the pre-training phase, where it learns the statistical regularities of text.

The ability to access that knowledge through prompts, and to have a personality, add boilerplate text, be helpful, and stay agreeable even when it refuses to write a Hitler speech, is trained in a second stage using this technique:

https://huggingface.co/blog/rlhf

That is, by seeing examples where somebody asks it to identify the author of a text, it learns not only to access the knowledge from pre-training the right way, but also to sound equivocal about it, even though that problem is a baseball pitched right down the middle of the plate as far as it's concerned.
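For the shape of that second stage, here's a toy, heavily simplified sketch. Real RLHF trains a reward model from human preference data and optimizes with PPO (see the linked post); this is just a REINFORCE-style update on gpt2, and reward() below is a hypothetical stand-in:

    # Toy second-stage tuning: a reward signal nudges the model toward
    # preferred completions. Not the actual RLHF pipeline.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

    def reward(text: str) -> float:
        # Hypothetical stand-in for a learned reward model: prefer answers
        # that actually name an author rather than equivocate.
        return 1.0 if "Ben Thompson" in text else -1.0

    ids = tokenizer("Who wrote this Stratechery snippet? Answer:",
                    return_tensors="pt").input_ids

    # Sample a completion from the current policy.
    out = model.generate(ids, do_sample=True, max_new_tokens=8,
                         pad_token_id=tokenizer.eos_token_id)
    completion = out[0, ids.shape[1]:]

    # Log-probability the model assigned to its own sampled tokens
    # (logits at position t predict the token at position t + 1).
    logits = model(out).logits[0, ids.shape[1] - 1:-1]
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp[torch.arange(len(completion)), completion].sum()

    # REINFORCE: scale the log-prob by the reward and take a gradient step.
    optimizer.zero_grad()
    loss = -reward(tokenizer.decode(completion)) * token_logp
    loss.backward()
    optimizer.step()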