
Ask HN: Fine tuning a 65B LLM vs. fine tuning task specific million size models

1 point by OthmaneHamzaoui over 1 year ago
Hi HN,

*Ex-ML Engineer here, so feel free to dive deep in the answers.*

Everywhere I look today (Medium, Reddit, Twitter), everyone is talking about fine-tuning LLMs: how the future is taking billion-parameter models, fine-tuning them, and then distilling them into specialised LLMs that perform specific tasks (i.e. sentiment analysis, Q&A, summarisation).

1/ When people speak about fine-tuning, are they actually fully re-training the LLM (i.e. updating weights), or mostly using techniques like few-shot prompting and Retrieval Augmented Generation (RAG)?

2/ If you want to actually fully re-train the LLM (i.e. retraining the weights), why not just use "small" (million- vs. billion-parameter) models like BERT that are specifically suited to the end-goal tasks (NER, classification) and fine-tune those instead?

3/ If you're fine-tuning LLMs (no matter what your definition of fine-tuning is), mind sharing what for and how you're doing it?

P.S.: I asked a similar question on Reddit [1], but I'm reframing it a bit and asking here in the hope of getting answers that focus on the re-training aspect, and also to get a diverse set of views :)

[1] https://www.reddit.com/r/MachineLearning/comments/15xfesk/d_why_fine_tune_a_65b_llm_instead_of_using/
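
For concreteness on question 2, here is a minimal sketch of what "fully re-training the weights" of a million-parameter-scale model looks like. It assumes Hugging Face Transformers and Datasets are installed; the IMDB dataset, the bert-base-uncased checkpoint, and all hyperparameters are illustrative placeholders, not a recommendation.

    # Full fine-tune of a ~110M-parameter BERT classifier (all weights updated).
    # Assumes: pip install transformers datasets
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    dataset = load_dataset("imdb")  # example sentiment-analysis dataset
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def tokenize(batch):
        # Convert raw text into input_ids/attention_mask for the model
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=256)

    encoded = dataset.map(tokenize, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    args = TrainingArguments(
        output_dir="bert-sentiment",      # where checkpoints are written
        per_device_train_batch_size=16,
        num_train_epochs=3,
        learning_rate=2e-5,
    )

    trainer = Trainer(
        model=model,
        args=args,
        # small subsets so the sketch runs quickly; use the full splits in practice
        train_dataset=encoded["train"].shuffle(seed=42).select(range(2000)),
        eval_dataset=encoded["test"].select(range(500)),
    )
    trainer.train()  # gradient updates touch every weight in the model

This is the "small, task-specific model" route the question contrasts with parameter-efficient fine-tuning (e.g. LoRA) or prompting a frozen 65B model.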

no comments