Ask HN: How come LLMs/GPT have analysis skills if they are mere text predictors?

2 points by jsdeveloper about 2 years ago
The more I use LLMs/GPT, the more I doubt that the secret sauce is just training on a large corpus of text and code. Different LLMs have different kinds of analytical skills, and these skills can't be explained away with "oh, they are just text predictors, nothing more than that."

So I wanted to ask HN: how are these analytical skills possible if the model is merely doing text prediction?

I asked an LLM/GPT model:

> "I want a word similar to accomplice, but with a positive meaning."

It replied:

> "A word that is similar to accomplice but has a positive connotation is "ally". An ally is someone who supports and helps another person or group in achieving a common goal. This word is often used in the context of social justice movements or political campaigns, where individuals or groups work together to bring about positive change. Other words that you might consider include "partner", "collaborator", or "co-conspirator" (although this last one may still carry some negative connotations depending on the context)."

Now, I don't believe that the large language corpus it was trained on contained this definition somewhere for it to simply regurgitate. It understood exactly what I wanted and gave it to me.

Moreover, I have given it simple C++ functions and asked what the output would be for particular inputs, and it gave me correct answers.

This "general analysis" ability for anything at all seems suspicious to me; based on my programming knowledge, I don't think mere curve-fitting models can produce such analytical skills.

To repeat what I am asking HN: how are these analytical skills possible if the model is merely doing text prediction?
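For concreteness, here is a sketch of the kind of "predict the output" test described above; the original functions were not posted, so this digitSum function and its input are hypothetical:

    // Hypothetical example of the kind of C++ test described above
    // (the original functions were not posted).
    #include <iostream>

    // Returns the sum of the decimal digits of n (assumes n >= 0).
    int digitSum(int n) {
        int sum = 0;
        while (n > 0) {
            sum += n % 10;  // take the last digit
            n /= 10;        // drop it
        }
        return sum;
    }

    int main() {
        // The question to the model: "what does this program print?"
        std::cout << digitSum(1234) << '\n';  // prints 10
        return 0;
    }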

2 comments

aniijbod about 2 years ago
Yeah, it turns out to be totally misleading to dismiss something that is based on text prediction as being only capable of things that require text prediction. To predict the 'correct text', you, whether as a human or as a 'device', are not just looking for instances where 'b follows a' in your training data. That would amount to nothing more than running a query on a database, finding the data there, and outputting it.

Sure, there is a bit of this going on in an LLM system.

But both the finding and the outputting in a neural net are so different from what happens in a database-driven solution that the comparison breaks down. The information in an answer might well be derivable from a 'conventional' database, yet it would still be unrealistic to claim that the database has any 'understanding' of the query it is executing; with a neural net-based solution like ChatGPT, that is no longer the case.

To 'predict the next word', the neural net can't just look it up, because the answer is unlikely to be there. The net has to have used its training process to build up what we might call 'intuitions'. It isn't 'just' consulting its training data; it is mostly consulting its 'intuition base'. To all intents and purposes, it is consulting its 'understanding' of the training data rather than the data itself. This intuition base is so much more optimised than any conventional database that lookup is near-instantaneous. That's why, when we 'try to figure something out', we may need a little time to 'think things through', but when we react to something, it's an instant, knee-jerk response. When we react instantly, we are using the brain's neural net in the same way ChatGPT does; when we figure things out on an 'if this, then do that' basis, our minds work more like a computer running lines of code. Most of ChatGPT is based on NOT 'just' running lines of code, though there is a tad of code in there alongside the neural net. But to cut a long story short: yes, the intuition-base side gives ChatGPT its ability to analyse and understand, and 'word prediction' is just a deceptively misleading characterisation of what's really going on.
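To make the contrast concrete, here is a minimal sketch, invented purely for illustration (no real LLM works this way), of the database-style exact-match lookup the comment argues a neural net cannot be doing:

    // A "next word" predictor as a plain exact-match table. Any prompt
    // not seen verbatim in the "training data" falls straight through,
    // which is the comment's point: a net must generalise, not look up.
    #include <iostream>
    #include <map>
    #include <string>

    int main() {
        // Toy training data: exact prefix -> next word.
        std::map<std::string, std::string> table = {
            {"the cat sat on the", "mat"},
            {"once upon a",        "time"},
        };

        std::string prompt = "the dog sat on the";  // never seen verbatim
        auto it = table.find(prompt);
        if (it != table.end())
            std::cout << it->second << '\n';
        else
            std::cout << "<no match>" << '\n';  // exact lookup has nothing to say
        return 0;
    }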
hackernoteng about 2 years ago
The base LLM is a token predictor: given some stream of tokens, guess the next token, and the next, and so on. But ChatGPT is trained on top of the base LLM with a whole bunch of example questions and answers, and most likely they also augment your prompt with an additional prefix/suffix to nudge the model into producing a chat-like answer to your input... that's my guess, and you can read more in the papers on how they trained it.
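A minimal sketch of the prompt augmentation the comment guesses at; the actual template used by ChatGPT is not public, so the prefix and suffix here are invented for illustration:

    // Wrap the raw user input in a chat-style template before it is fed
    // to the base token predictor. The template text is an assumption,
    // not the real one.
    #include <iostream>
    #include <string>

    std::string buildPrompt(const std::string& userInput) {
        const std::string prefix =
            "The following is a conversation with a helpful assistant.\nUser: ";
        const std::string suffix = "\nAssistant:";
        return prefix + userInput + suffix;
    }

    int main() {
        std::cout << buildPrompt("I want a word similar to accomplice, "
                                 "but with a positive meaning.") << '\n';
        return 0;
    }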