I'd be interested in hearing from anyone who takes the Chinese Room scenario seriously, or at least can see how it applies to any of this.

I cannot see that it matters whether a computer understands something. If it quacks like a duck and walks like a duck, and your only need is for it to quack and walk like a duck, then for all intents and purposes it doesn't matter whether it's actually a duck.

It only matters if you probe beyond the realm at which you previously decided it matters (e.g. roasting and eating it), at which point you are also insisting that it walk, quack and TASTE like a duck. So then you quantify that, move the goalposts, and assess every prospective duck against that.

And if one comes along that matches all of those but doesn't have wings, then if you deny it to be a duck FOR ALL INTENTS AND PURPOSES, it simply means you didn't specify your requirements.

I'm no philosopher, but if your argument hinges on moving goalposts until purity is reached, and your basic assumption is that the requirements for purity are infinite, then it's not a very useful argument.

It seems to me to posit that understanding requires that the one doing the understanding be human. If that's the case, we just pick another word for it and move on with our lives.
"On the other hand, many people who are not ready to change, who do not have the skills or who cannot afford to reeducate are threatened."

That's me. After programming since the '80s, I'm just so tired. So much work, so much progress, so many dreams lived or shattered. Only to end up here at this strange local maximum, with so much potential, destined to forever run in place by the powers that be. The fundamental formula for intelligence and even consciousness materializing before us as the world burns. No help coming from above, no support coming from below, surrounded by everyone who doesn't get it, who will never get it. Not utopia, not dystopia, just anhedonia as the running in place grows faster, more frantic. UBI forever on the horizon, countless elites working tirelessly to raise the retirement age, a status quo that never ceases to divide us. AI just another tool in their arsenal to other and subjugate and profit from. I wonder if a day will ever come when tech helps the people in between in a tangible way to put money in their pocket, food in their belly, time in their day - independent of their volition - for dignity and love and because it's the right thing to do. Or is it already too late? I don't even know anymore. I don't know anything anymore.
This is confusing. Using the semantic vector arithmetic of embeddings is not very relevant to transformers, and it's completely missing the word 'attention'. I don't think transformers are that difficult to explain to people, but it is hard to explain "why" they work. But I think it's important for everyone to look under the hood and know that there are no demons underneath.
> It is able to link ideas logically, defend them, adapt to the context, roleplay, and (especially the latest GPT-4) avoid contradicting itself.

Isn't this just responding to the context provided?

Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

It's really cool that it can generate coherent responses, but it feels icky when people start interrogating it about things it got wrong. Aren't you just providing more context tokens for it?

Certainly that model seems to fit both the things it gets right, and the things it gets wrong. It's effectively "hallucinating" everything but sometimes that hallucination corresponds with what we consider appropriate and sometimes it doesn't.
A good article and well articulated!

I would change the introduction to be more impartial and not anthropomorphize GPT. It is not smart, and it is not skilled at any task other than the one for which it was designed.

I have the same reservations about the conclusion. The whole middle of the article is good. But to then compare the richness of our human experience to an algorithm that was plainly explained? Speculating on whether an algorithm can "think" and whether it will "destroy society" weakens the whole article.

I really would like to see more technical writing of this sort geared towards a general audience, without the speculation and science-fiction pontificating.

Good effort!
I've been using GPT-4 to code, and these explanations are somewhat unsatisfactory. I have seen it seemingly come up with novel solutions in a way that I can't describe as anything other than thinking. It's really difficult for me to imagine how such a seemingly simple predictive algorithm could lead to such complex solutions. I'm not sure even the people building these models really grasp it either.
Is it possible that we don’t truly know how it works? That there is some emergent behavior inside these models that we’ve created but not yet properly described? I’ve read a few of these articles but I’m still not completely satisfied.
It predicts the next word/token based on the previous pile of words/tokens. Given a large enough model (as in GPT-3 and up), it can actually output some rather useful text, because the probabilities it has learned for what the next token should be are rather accurate.
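To make that loop concrete, here is a minimal sketch of autoregressive generation. The `next_token_distribution` function is a hypothetical stand-in for the trained network (the probabilities below are invented for illustration); the point is only the predict-append-repeat structure, where the whole context so far is fed back in at every step.

    import random

    def next_token_distribution(context_tokens):
        """Stand-in for the model: maps the context seen so far to
        probabilities for the next token. A real model would run the
        transformer here; these numbers are made up for illustration."""
        return {"duck": 0.5, "goose": 0.3, ".": 0.2}

    def generate(prompt_tokens, max_new_tokens=20):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            probs = next_token_distribution(tokens)   # condition on everything so far
            choices, weights = zip(*probs.items())
            tokens.append(random.choices(choices, weights=weights)[0])
            if tokens[-1] == ".":                     # crude stop condition
                break
        return " ".join(tokens)

    print(generate(["the"]))   # e.g. "the duck goose duck ."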
Does anyone have a good recommendation for a book that would cover the underlying ideas behind LLMs? Google ends up giving me a lot of ads, and ChatGPT is vague about specifics as per usual.
What I wonder most is how it encodes knowledge/state *other* than in the sequence of queries/responses. Does it not have a "mind"?

If I play a number guessing game, can I tell it to "think of a number between 0 and 100" and then tell me if the secret number is higher/lower than my guess (for a sequence of N guesses where it can consistently remember its original number)? If not, why? Because it doesn't have context? If it can: why? Where is that context?

To a layman it would seem you always have *two* parts of the context for a conversation: what you have said, and what you haven't said but maybe only thought of. The "think of a number" game is the simplest example, but there are many others. Shouldn't this be pretty easy to tack on to a chat bot if it's not there? It's basically just a contextual output that the chat bot logs ("tells itself") and then refers to just like the rest of the conversation?
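One way to picture that "tack it on" idea: the model itself keeps no state between calls, so any secret it is supposed to hold has to live in text that the application stores and feeds back in on every request. A rough sketch, where `call_model` is a hypothetical placeholder for an LLM API call:

    history = []      # messages shown to the user
    scratchpad = []   # "unsaid" notes the app keeps and re-sends, hidden from the user

    def call_model(prompt: str) -> str:
        """Hypothetical stand-in for a real LLM API call; it only ever sees `prompt`."""
        return "(model reply)"

    def ask(user_msg: str) -> str:
        history.append(f"User: {user_msg}")
        # The model's entire "mind" is this one string: hidden notes plus the visible chat.
        prompt = "\n".join(scratchpad + history) + "\nAssistant:"
        reply = call_model(prompt)
        history.append(f"Assistant: {reply}")
        return reply

    # For the number-guessing game, the app could have the model pick a number once,
    # append it only to `scratchpad`, and then include it on every later call.
    scratchpad.append("Secret number chosen earlier: 42")
    print(ask("Is it higher than 50?"))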
I'd be interested in hearing people's takes on the simplest mathematical reason that transformers are better than/different from fully connected layers. My take is:

    Q = W_Q X
    K = W_K X
    A = Q^T K = (X^T W_Q^T) (W_K X) = X^T (...) X

Where A is the matrix that contains the pre-softmax, unmasked attention weights. Therefore, transformers effectively give you autocorrelation across the column vectors (tokens) in the input matrix X. Of course, this doesn't really say *why* autocorrelation would be so much better than anything else.
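A small numpy check of that identity, with tokens as the columns of X (the dimensions here are made up for illustration): the pre-softmax scores form a bilinear expression X^T M X with a single learned matrix M = W_Q^T W_K, i.e. every token is compared against every other token through M.

    import numpy as np

    d_model, d_head, n_tokens = 8, 4, 5
    rng = np.random.default_rng(0)

    X   = rng.normal(size=(d_model, n_tokens))   # each column is one token's embedding
    W_Q = rng.normal(size=(d_head, d_model))
    W_K = rng.normal(size=(d_head, d_model))

    Q = W_Q @ X            # (d_head, n_tokens)
    K = W_K @ X            # (d_head, n_tokens)
    A = Q.T @ K            # (n_tokens, n_tokens) pre-softmax, unmasked scores

    M = W_Q.T @ W_K        # the "(...)" above: a single learned comparison matrix
    assert np.allclose(A, X.T @ M @ X)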
This article seems credible and actually made me feel as if I understood it, i.e. at some depth, but not deeper than a relative layperson can grasp.

What I can't understand is how the Bing chatbot can give me accurate links to sources, but ChatGPT-4 on request gives me nonsensical URLs in 4 cases out of 5. It doesn't matter in the cases where I ask it to write a program: the verification is in the running of it. But to have real utility in general knowledge situations, verification through accurate links to sources is a must.
I commend the author for one of the clearest explanations I've seen so far, written to explain rather than impress. Even an idiot like myself understood what is explained.

Two things that I felt were glossed over a bit too fast were the concept of embeddings and that equation-and-parameters thing. Consider elaborating a bit more or giving an example.
If you prefer to see it in code, there's a succinct GPT implementation here: https://github.com/LaurentMazare/tch-rs/blob/main/examples/min-gpt/main.rs
Not that much to explain, really. Just read chapter 5 of https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf
At least part of this article is contradicted by ChatGPT itself. From the article:

"...Ongoing learning: The brain keeps learning, including during a conversation, whereas GPT has finished its training long before the start of the conversation."

From ChatGPT 4.x:

"As an AI language model, I don't have a fixed training schedule. Instead, I'm constantly learning and updating myself based on the text data that I'm exposed to. My training data is sourced from the internet, books, and other written material, and my creators at OpenAI periodically update and fine-tune my algorithms to improve my performance. So, in short, I am always in the process of learning and refining my abilities based on the data available to me."
I asked it which was better, Lisp or Almonds.

It said that was an impossible comparison, like Apples and Oranges.

Then I asked it which were more similar, Apples & Oranges or Lisp & Almonds.

It said it is impossible to classify either of those two pairs as more similar because they are too fundamentally different. It couldn't come up with anything like Lisp is not edible, or that Apples and Oranges are both sweet while Lisp and Almonds don't share any common traits.

It seems like it has far more trouble with weird questions like this, which even a small child will instantly figure out, than it does with anything that seems like a lookup of information.
I am not convinced that ChatGPT could "think" if it had as many neurons or parameters as a human brain, and got as much training.

I would still be interested to see what it could do if it did, but I don't think it would really help science understand what intelligence really is.

Being able to grow a plant and understand some conditions that favor it is one thing, but it's poor science.

Maybe there will be some progress when scientists are able to properly simulate the brain of an ant or even a mouse, but science is not even there yet.
Would it be a stretch to call GPT "glorified Markov chains"? (I used a tweaked Markov chain once to make a music-composer bot. I actually got a few decent tunes out of it, kind of in a Bach style.)
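For comparison, a bare-bones Markov chain text generator like the parent describes might look something like the sketch below (the toy corpus is invented for illustration). The table only ever conditions on the previous k tokens, whereas a transformer mixes the entire context window through learned attention weights, which is where the "glorified" part does a lot of work.

    import random
    from collections import defaultdict

    def train_markov(tokens, k=1):
        """Count which token follows each k-token context."""
        table = defaultdict(list)
        for i in range(len(tokens) - k):
            table[tuple(tokens[i:i + k])].append(tokens[i + k])
        return table

    def sample(table, seed, length=10, k=1):
        out = list(seed)
        for _ in range(length):
            followers = table.get(tuple(out[-k:]))
            if not followers:
                break
            out.append(random.choice(followers))   # picks in proportion to observed counts
        return " ".join(out)

    corpus = "the cat sat on the mat and the cat ran to the mat".split()
    table = train_markov(corpus)
    print(sample(table, ["the"]))   # e.g. "the cat sat on the mat and the cat ran"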
Where is IBM's Watson in all this? It's as if it never existed. That is just one example of how companies keep making these grand presentations and under-delivering on results...

Plain and simple, the over-hyped GPT editions are NOT truly AI; it is scripting to assemble coherent-looking sentences, backed by scripts that parse content off of stored data and the open web into presented responses... There is no "artificial" or non-human intelligence backing the process, and if there weren't human intervention, it wouldn't run on its own... In a way, it could better replace search engines at this point, even with text-to-speech, if the tech were geared towards a more basic (and less mystified) reliability and demeanor... It's kind of like the Wizard of Oz, with many humans behind the curtains.

Marketers and companies behind the promotion of these infantile technology solutions are being irresponsible in proclaiming that these things represent AI, and in going as far as to claim that they will cost jobs at this point; it will prove costly to repair overzealous moves based on the lie. This is what we do as a planet: we buy hype, and it costs us a lot. We need a lot more practicality in discussions concerning AI, because over-assertive and under-accountable marketing is destructive. -- Just look at how much hype and chaos the promises of self-driving cars cost many (not me, though, thanks). It completely derails tech progress to over-promise and under-deliver on tech solutions. It creates monopolies that totally destroy other valid research and development efforts. It makes liars profitable, and makes many less flashy but actually honest tech and innovation efforts conducted by responsible people close up shop.

We are far from autonomous and self-reliant tech; even power grids across most of the planet aren't reliable enough to support tech being everywhere and replacing jobs.

Just try to hold a conversation with Siri or Google Assistant, which have probably been developed and tested a lot more than GPT, and have been around for much longer too, and you'll realize why kiosks at the supermarket and CVS are usually out of order, and why articles written by GPT and posted to sites like CNN.com and BuzzFeed are poorly written and full of filler... We're just not there yet, and there are too many shortcuts, patchwork, human intervention, and failed promises to really say we're even close.

Let's stop making the wrong people rich and popular.