Ask HN: Do modern AI engines still need to do full re-trainings?

11 points by zepearl | 10 months ago
I learned about AI algorithms in the '90s: backpropagation and clustering networks, and a little bit of genetic algorithms.

I then focused on, programmed, and played around with the "backpropagation" network model until the early 2000s => it was fun, but not usable in my context. I then stopped fiddling with it and became inactive in this area.

An important property of a backpropagation network was (as far as I know) that it had to be fully re-trained whenever the inputs changed (values of existing ones changed, or inputs/outputs were removed/added).

Question:

Is it still like that for the currently fancy algorithms (the ones developed by Google/Facebook/OpenAI/Xsomething/...), or are they now better, so that they can adapt without having to be fully retrained on the full set of (new/up-to-date) training data?

I'm asking because I lost track of the progress in this area over the last 20 years, and especially recently I understand nothing involving all the new names (e.g. "llama", etc.).

Thanks :)
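A minimal sketch of the property described above, assuming a plain one-hidden-layer network in NumPy (all names here are illustrative): the first weight matrix is shaped by the number of inputs, so adding an input feature invalidates the old weights, and classically the network would be rebuilt and retrained on the full updated dataset.

```python
import numpy as np

# Illustrative sketch: a one-hidden-layer MLP whose first weight matrix has
# shape (n_inputs, n_hidden). Adding an input feature changes that shape,
# so the old weights cannot simply be reused.
rng = np.random.default_rng(0)

n_inputs, n_hidden, n_outputs = 3, 8, 1
W1 = rng.normal(size=(n_inputs, n_hidden))   # tied to the 3 original inputs
W2 = rng.normal(size=(n_hidden, n_outputs))

def forward(x):
    h = np.tanh(x @ W1)        # hidden activations
    return h @ W2              # network output

x_old = rng.normal(size=(1, n_inputs))
print(forward(x_old).shape)    # (1, 1) -- works with the original inputs

x_new = rng.normal(size=(1, n_inputs + 1))   # a fourth input feature appears
try:
    forward(x_new)
except ValueError as e:
    # Shape mismatch: W1 no longer fits, so classically you would rebuild
    # the network and retrain it on the full, updated dataset.
    print("shape mismatch:", e)
```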

2 comments

Micoloth | 10 months ago
I think what you are referring to is the concept of "fine-tuning": you take a pretrained network and add a (relatively) small set of new input-output pairs to steer it in a new direction.

It's widely used; you can look it up.

A more challenging idea is whether it is possible to reuse the pretrained weights when training a network with a *different architecture* (maybe a bigger transformer with more heads, or something).

AFAIK this is not common practice; if you change the architecture you have to retrain from scratch. But given the cost of these training runs, I wouldn't be surprised if OpenAI & co. had developed some technique to do this, e.g. across GPT versions.
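A minimal sketch of the fine-tuning idea in PyTorch, under stated assumptions: the "pretrained" backbone here is just a stand-in (in real work you would load actual pretrained weights, e.g. from torchvision or Hugging Face), its parameters are frozen, and only a small new head is trained on the new input-output pairs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pretrained backbone (illustrative only; real fine-tuning
# would start from genuinely pretrained weights).
backbone = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))
for p in backbone.parameters():
    p.requires_grad = False        # keep the "pretrained" knowledge frozen

head = nn.Linear(32, 2)            # small new head for the new task
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A (relatively) small set of new input-output pairs.
x_new = torch.randn(64, 16)
y_new = torch.randint(0, 2, (64,))

for step in range(200):
    opt.zero_grad()
    with torch.no_grad():
        feats = backbone(x_new)    # frozen features from the pretrained part
    loss = loss_fn(head(feats), y_new)
    loss.backward()                # gradients flow only into the new head
    opt.step()

print(f"final loss: {loss.item():.3f}")
```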
vasili111 | 10 months ago
Large language models are pre-trained by their creators on huge amounts of data.

In many cases you do not need to do anything with the LLM and you can just use it.

If it was not trained on data containing the information you are interested in, you can use a technique called RAG (Retrieval-Augmented Generation).

You can also do fine-tuning, which is a kind of training, but on a small amount of data.
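A minimal sketch of the retrieval step in RAG, under stated assumptions: the toy embed() helper and the document list are illustrative placeholders (a real system would use a proper embedding model and a vector database), and the resulting augmented prompt is what would be sent to the LLM.

```python
import hashlib
import numpy as np

# Tiny "document store" to retrieve from (illustrative content).
docs = [
    "Backpropagation trains networks by gradient descent on a loss.",
    "Fine-tuning adapts a pretrained model with a small new dataset.",
    "RAG retrieves relevant documents and adds them to the prompt.",
]

def embed(text, dim=64):
    # Toy stand-in for an embedding model: hash each word into a slot of a
    # fixed-size vector, then normalize to unit length.
    vec = np.zeros(dim)
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

doc_vecs = np.stack([embed(d) for d in docs])

question = "How can I adapt a model without retraining it from scratch?"
scores = doc_vecs @ embed(question)       # cosine similarity (unit vectors)
top = np.argsort(scores)[::-1][:2]        # the two best-matching passages

prompt = "Context:\n" + "\n".join(docs[i] for i in top) + f"\n\nQuestion: {question}"
print(prompt)   # this augmented prompt is what would be sent to the LLM
```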