I learned about ~AI algorithms in the 90s: backpropagation & clustering networks, and a little bit of genetic algos.<p>I then focused on, programmed, and played with the "backpropagation" network model until the early 2000s => it was fun, but not usable in my context. I then stopped fiddling with it and became inactive in this area.<p>An important property of a backpropagation network was (as far as I know) that it had to be fully re-trained whenever its inputs changed (values of existing ones changed, or inputs/outputs were added/removed).<p>Question:<p>Is it still like that for the currently fancy algos (the ones developed by Google/Facebook/OpenAI/Xsomething/...), or are they now better, so that they can adapt without having to be fully retrained on the full set of (new/up-to-date) training data?<p>Asking because I lost track of progress in this area during the last 20 years, and especially recently I understand nothing of all the new names (e.g. "llama", etc.).<p>Thanks :)
I think what you are referring to is the concept of “finetuning”. You take a pretrained network and continue training it on a (relatively) small set of new input-output pairs to steer it in a new direction.<p>It's widely used; you can look it up.<p>A more challenging idea is whether it is possible to reuse the pretrained weights when training a network with a <i>different architecture</i> (maybe a bigger transformer with more heads, or something).<p>AFAIK this is not common practice: if you change the architecture, you have to retrain from scratch. But given the cost of these training runs, I wouldn't be surprised if OpenAI & co. had developed some technique to do this, e.g. across GPT versions.
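To make the finetuning idea concrete, here is a minimal, hypothetical PyTorch sketch. The model, the frozen layers, and the "new input-output pairs" are made-up placeholders, not anyone's actual setup: the point is just that you start from existing weights and only train briefly on the new data, instead of retraining from scratch.

    # Minimal fine-tuning sketch (PyTorch). All names and data are illustrative.
    import torch
    import torch.nn as nn

    # Stand-in for a pretrained network; in practice you would load real
    # pretrained weights, e.g. model.load_state_dict(torch.load("pretrained.pt")).
    model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

    # Freeze the early layers so only the last layer adapts to the new data.
    for param in model[0].parameters():
        param.requires_grad = False

    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    # A (relatively) small set of new input-output pairs.
    new_inputs = torch.randn(32, 64)
    new_targets = torch.randint(0, 10, (32,))

    # Short additional training run instead of full retraining.
    for step in range(100):
        optimizer.zero_grad()
        loss = loss_fn(model(new_inputs), new_targets)
        loss.backward()
        optimizer.step()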
Large language models are pre-trained by their creators on huge datasets.<p>In many cases you do not need to do anything to the LLM and can just use it.<p>If it was not trained on data containing the information you are interested in, you can use a technique called RAG (Retrieval-Augmented Generation), which retrieves relevant documents at query time and feeds them to the model alongside your question.<p>You can also do fine-tuning, which is a kind of training, but on a small amount of data.
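A rough sketch of the RAG idea, assuming a toy word-overlap similarity and a placeholder generate() function; real systems use embeddings, a vector index, and an actual LLM call instead:

    # RAG-flavored sketch: retrieve the most relevant snippets and prepend
    # them to the prompt. Documents and functions are illustrative placeholders.
    documents = [
        "Backpropagation trains a network by propagating error gradients backwards.",
        "RAG retrieves external documents and adds them to the model's prompt.",
        "Fine-tuning continues training a pretrained model on a small dataset.",
    ]

    def similarity(query: str, doc: str) -> int:
        # Toy relevance score: count shared words.
        return len(set(query.lower().split()) & set(doc.lower().split()))

    def retrieve(query: str, k: int = 2) -> list:
        # Return the k documents most similar to the query.
        return sorted(documents, key=lambda d: similarity(query, d), reverse=True)[:k]

    def generate(prompt: str) -> str:
        # Placeholder standing in for a call to a real LLM.
        return f"[model answer based on a prompt of {len(prompt)} characters]"

    question = "Do modern models need full retraining when new data arrives?"
    context = "\n".join(retrieve(question))
    answer = generate(f"Use this context:\n{context}\n\nQuestion: {question}")
    print(answer)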