Hi HN,<p><i>Ex-ML engineer here, so feel free to dive deep in the answers.</i><p>Everywhere I look today (Medium, Reddit, Twitter) everyone is talking about fine-tuning LLMs: how the future is taking billion-parameter models, fine-tuning them, and then distilling them into specialised LLMs that perform specific tasks (e.g. sentiment analysis, Q&A, summarisation).<p>1/ When people speak about fine-tuning, are they actually re-training the LLM (i.e. updating its weights), or mostly using techniques like few-shot prompting and Retrieval Augmented Generation (RAG)?<p>2/ If you do want to re-train the weights, why not just use “small” (millions of parameters vs billions) models like BERT that are specifically designed to be fine-tuned for the end-goal tasks (NER, classification), and fine-tune those instead?<p>3/ If you're fine-tuning LLMs (whatever your definition of fine-tuning is), mind sharing what for and how you're doing it?<p>P.S.: I asked a similar question on Reddit [1], but I'm reframing it a bit and asking here in the hope of getting answers that focus on the re-training aspect, and also to get a diverse set of views :)<p>[1] https://www.reddit.com/r/MachineLearning/comments/15xfesk/d_why_fine_tune_a_65b_llm_instead_of_using/
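To make the distinction in question 1 concrete: few-shot prompting and RAG never touch the model's weights; they only change what goes into the prompt. Here is a minimal, stdlib-only sketch of the retrieval half of RAG, using a simple bag-of-words cosine similarity to pick a context document and splice it into the prompt. All the names (`retrieve`, `build_prompt`, the sample docs) are illustrative, not from any particular library; a real system would use dense embeddings and a vector store.

```python
# Sketch of the "R" in RAG: pick the most relevant document for a query
# and prepend it to the prompt -- no weight updates anywhere.
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, documents: list[str]) -> str:
    """Return the document most similar to the query."""
    q = Counter(query.lower().split())
    return max(documents, key=lambda d: cosine(q, Counter(d.lower().split())))


def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the prompt with retrieved context before calling the LLM."""
    context = retrieve(query, documents)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"


docs = [
    "The refund policy allows returns within 30 days.",
    "Shipping takes 3-5 business days within the EU.",
]
print(build_prompt("What is the refund policy?", docs))
```

Fine-tuning in the strict sense (questions 2 and 3) would instead run gradient updates on the model's parameters, full or parameter-efficient, which is a different operation entirely from assembling a prompt like this.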