Hi HN,<p><i>Ex-ML engineer here, so feel free to dive deep in the answers.</i><p>Everywhere I look today (Medium, Reddit, Twitter) everyone is talking about fine-tuning LLMs: how the future is taking billion-parameter models, fine-tuning them, and then distilling them into specialised LLMs that perform specific tasks (e.g. sentiment analysis, Q&A, summarisation).<p>1/ When people speak about fine-tuning, are they actually re-training the LLM (i.e. updating its weights), or mostly using techniques like few-shot prompting and Retrieval Augmented Generation (RAG)?<p>2/ If you do want to re-train the weights, why not just use “small” (millions of parameters vs billions) models like BERT that are specifically designed to be fine-tuned for the end-goal tasks (NER, classification), and fine-tune those instead?<p>3/ If you're fine-tuning LLMs (whatever your definition of fine-tuning is), mind sharing what for and how you're doing it?<p>P.S.: I asked a similar question on Reddit [1], but I'm reframing it a bit and asking here in the hope of getting answers that focus on the re-training aspect, and also to get a diverse set of views :)<p>[1] https://www.reddit.com/r/MachineLearning/comments/15xfesk/d_why_fine_tune_a_65b_llm_instead_of_using/
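To make the distinction in question 1 concrete: few-shot prompting and RAG never touch the model's weights; they only change what goes into the prompt. Here is a minimal, stdlib-only sketch of the retrieval half of RAG, using a simple bag-of-words cosine similarity to pick a context document and splice it into the prompt. All the names (`retrieve`, `build_prompt`, the sample docs) are illustrative, not from any particular library; a real system would use dense embeddings and a vector store.

```python
# Sketch of the "R" in RAG: pick the most relevant document for a query
# and prepend it to the prompt -- no weight updates anywhere.
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, documents: list[str]) -> str:
    """Return the document most similar to the query."""
    q = Counter(query.lower().split())
    return max(documents, key=lambda d: cosine(q, Counter(d.lower().split())))


def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the prompt with retrieved context before calling the LLM."""
    context = retrieve(query, documents)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"


docs = [
    "The refund policy allows returns within 30 days.",
    "Shipping takes 3-5 business days within the EU.",
]
print(build_prompt("What is the refund policy?", docs))
```

Fine-tuning in the strict sense (questions 2 and 3) would instead run gradient updates on the model's parameters, full or parameter-efficient, which is a different operation entirely from assembling a prompt like this.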