TechEcho

How we got fine-tuning Mistral-7B to not suck

215 points by lewq over 1 year ago

12 comments

isaacfrond over 1 year ago

If you look at the source [1] you can see how they solved their "what are the doctors going to do" problem. It is literally included in one of the prompts now :-)

*Users tend to ask broad, vague questions of the document in order to test that the system is working. We want those queries to work well. For example, a user would ask "what are the doctors going to do?" of a document that is about a junior doctors' strike. Take this into account when generating the questions - in particular, refer to noun phrases by less specific descriptions, so for example instead of "junior doctors", say "doctors" in your questions.*

[1]: https://github.com/helixml/helix/blob/main/api/pkg/dataprep/qapairs/qapair_config.yaml
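A minimal sketch of what that prompt config amounts to. The template text paraphrases the instruction quoted above; the function name and template structure are invented for illustration, not taken from the helix repo.

```python
# Build the prompt sent to a teacher LLM to generate Q/A pairs from a document,
# with the "broaden noun phrases" instruction baked in so vague test queries
# like "what are the doctors going to do?" match the generated questions.

QA_PROMPT_TEMPLATE = """Generate {n} question/answer pairs about the document below.
Users tend to ask broad, vague questions to test that the system is working,
so refer to noun phrases by less specific descriptions - for example,
instead of "junior doctors", say "doctors" in your questions.

Document:
{document}
"""

def build_qa_prompt(document: str, n: int = 5) -> str:
    """Return the prompt that would be sent to the teacher LLM."""
    return QA_PROMPT_TEMPLATE.format(n=n, document=document)

prompt = build_qa_prompt("Junior doctors announced a strike over pay...", n=3)
```

The point is that the broadening instruction lives in the data-prep prompt, not in the fine-tuning code itself.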
Comment #39302016 not loaded
bugglebeetle over 1 year ago

Unsloth's Colab notebooks for fine-tuning Mistral-7B are super easy to use and run fine in just about any Colab instance:

https://github.com/unslothai/unsloth

It's my default now for experimenting and basic training. If I want to get into the weeds, I use axolotl, but 9/10 it's not really necessary.
Comment #39304198 not loaded
Comment #39322927 not loaded
Comment #39299685 not loaded
nl over 1 year ago

I've done fine-tuning too, but the reasons they mention in "Why not just use RAG?" aren't very good.

People way underestimate what RAG can do, even if in general people don't talk about the right things. For example, LlamaIndex spends a lot of time talking about various extractors, which is the easy part. The hard part is deciding what you are actually searching for given a chat context.

RAG is a horrible hack (and the more you understand about it, the more it seems so!) but it does work.

I (and I'm sure everyone else) am experimenting with surgery on an LLM so it takes a vector representation of the docs directly alongside a text input, so you don't have to do the lossy doc vector -> text -> LLM context -> vector thing. Not sure why no one has shipped this yet, though!
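The "hard part" nl describes - folding chat context into the search query - can be sketched in a few lines. This is a toy illustration: the bag-of-words "embedding", the last-two-turns query heuristic, and all function names are stand-ins for whatever encoder and query-rewriting step a real RAG pipeline would use.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a sentence encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_query(chat_history: list[str], current_message: str) -> str:
    # The hard part: fold prior turns into the search query so a follow-up
    # like "what happened next?" still retrieves the right document.
    return " ".join(chat_history[-2:] + [current_message])

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = ["doctors strike over pay", "train timetable changes"]
query = build_query(["tell me about the doctors strike"], "what happened next")
top = retrieve(query, docs)  # the strike document outscores the unrelated one
```

Without `build_query`, the bare follow-up message shares no terms with either document and retrieval degrades - which is exactly why "what to search for" is harder than the extractors.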
Comment #39301390 not loaded
Comment #39301573 not loaded
gdiamos over 1 year ago

Glad to see that more people outside the big AI labs are figuring out how to do fine-tuning. Some open-source LLM authors also seem to have figured it out.

I think many users get put off because just pushing a button doesn't work, and the whole thing seems like a black box that you don't know how to fix when it breaks.

It turns out that fine-tuning can be debugged, but the methods aren't well documented (yet), e.g. by generating q/a pairs, oversampling them, etc.

When you get it to work it's powerful - new abilities emerge beyond memorization.

Just like how llama2/claude2/gpt4 learned reasoning by memorizing sentences from Reddit posts :P

Also, I don't get the comparison of RAG vs. fine-tuning in articles like this - why not do both? RAG is easy to set up - it's push-button. Just do it on all models (including fine-tuned models).
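The oversampling step gdiamos mentions is simple to sketch: repeat the generated q/a pairs so each fact is seen several times per epoch. The multiplier of 3 is an illustrative default, not a value from the article, and the function name is invented.

```python
import random

def oversample(qa_pairs: list, factor: int = 3, seed: int = 0) -> list:
    # Duplicate the generated q/a pairs 'factor' times and shuffle, so the
    # model revisits each fact repeatedly during fine-tuning instead of
    # seeing it once and forgetting it.
    rng = random.Random(seed)
    out = qa_pairs * factor
    rng.shuffle(out)
    return out

pairs = [("What are the doctors going to do?", "Strike over pay."),
         ("When does the strike start?", "Next Monday.")]
training_set = oversample(pairs, factor=4)  # 2 pairs -> 8 training examples
```

This is one of the debuggable knobs: if the model fails to memorize a fact, you can check whether that fact's pairs were generated and how often they were sampled.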
Comment #39304077 not loaded
Comment #39301503 not loaded
joshka over 1 year ago

For helix, I notice that GitHub is listed as a data source, but there's nothing in the docs about this. I'd really love to see what a model trained on my commonly used git repos (which generally are newer than The Stack etc.), and in particular their commit history, could do. Ideally these would make it easier for code completion to have the historical context as well as the current code to play with in determining what to write next.

I often wonder how you'd go about organizing training data for a full historic GitHub repo in a way that makes sense for training (or RAG). The vast majority of the data is previous changes to the repo. I think this would generally mean that it would outweigh the current information and cause problems (i.e. old method names before refactoring, etc.)

Also, perhaps being able to expand that out to doing the same thing for a bunch of consumers of the library that I'm maintaining would be neat.

Sprinkle in the PR and issue history, docs website, API docs, and Discord history, and I think you'd have a helluva model.
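One plausible mitigation for the "old diffs swamp current code" problem joshka raises is to decay each commit's sampling weight by its distance from HEAD. This is a hypothetical sketch - the half-life knob and the exponential-decay choice are assumptions, not anything from the post.

```python
def recency_weights(commits: list, half_life: float = 50.0) -> list[float]:
    # commits are ordered oldest -> newest. The newest commit gets weight 1.0;
    # a commit 'half_life' positions older gets weight 0.5, and so on.
    # Sampling training examples proportionally to these weights keeps
    # pre-refactor method names from dominating the dataset.
    n = len(commits)
    return [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]

weights = recency_weights([f"commit-{i}" for i in range(200)], half_life=50)
# weights[-1] == 1.0 (HEAD); weights[0] is heavily down-weighted
```

The same weighting could apply to PR and issue history, so closed discussions about long-deleted code fade out of the training mix.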
Comment #39302690 not loaded
cuuupid over 1 year ago

Not in love with axolotl but appreciate the advantages. This is an interesting approach, but you can also fine-tune easily on providers who wrap axolotl, like Replicate [1] or Modal [2], or if you want to run the infra, LLM Engine [3].

My only gripe with Helix would be that it's smaller than the above and my org would be peeved about data security. The ability to self-host is cool, but too much can go wrong too quickly with plain Docker ML. Would love to see, for example, a `cog` version of the images that we can deploy distributed with more confidence/bravado.

[1] https://replicate.com/mistralai/mistral-7b-instruct-v0.2
[2] https://modal.com
[3] https://llm-engine.scale.com/
Comment #39300996 not loaded
AznHisoka over 1 year ago

Does fine-tuning it on a set of docs in your "knowledge base" help generalize it, so it can answer questions pertaining to new documents that come in (with a "similar" style/structure but different content/facts)?
Comment #39302647 not loaded
Comment #39301470 not loaded
_pdp_ over 1 year ago

Interesting article but, IMHO, completely impractical. Teaching the model about specific content is exactly what you should not do. What you should do is teach the model how to effectively retrieve the information, even if it is unsuccessful on the first try.
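The retrieve-and-retry behaviour _pdp_ advocates can be sketched as a simple loop. Everything here is hypothetical: `search` and `judge` are caller-supplied callables standing in for a retriever and a relevance check, and the query-rephrasing step is a placeholder for asking the model to reformulate.

```python
def retrieve_with_retry(question: str, search, judge, max_tries: int = 3) -> list:
    # Rather than baking facts into the model weights, loop: retrieve,
    # check whether the results actually answer the question, and if not,
    # reformulate the query and try again.
    query = question
    for attempt in range(max_tries):
        docs = search(query)
        if judge(question, docs):
            return docs
        # Placeholder rephrasing; a real system would ask the LLM to rewrite.
        query = f"{question} (rephrased, attempt {attempt + 2})"
    return []
```

The model being taught here is the *retrieval strategy* (when to judge results insufficient, how to rewrite the query), which generalizes to new documents in a way that memorized content does not.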
Comment #39304121 not loaded
Comment #39300676 not loaded
nicolezhu over 1 year ago

What are some OS / hardware-specific challenges you guys faced?
Comment #39304177 not loaded
ipsum2 over 1 year ago

The tl;dr seems to be: tell an LLM to create pairs of questions and answers based on a document, and fine-tune on that data. Does the model answer questions from the article that weren't generated in advance?
HanClinto over 1 year ago

Fantastic writeup -- thank you so much for sharing your lessons learned along the way! Very valuable resource, and I'll be checking out your product!
deforciant over 1 year ago

I always thought that fine-tuning was more about getting a style rather than memorizing information word for word, or at least the facts. What are the next steps to ensure that it doesn't start pulling info from the base knowledge and references the docs instead? How long does it usually take to train? 10-15 minutes, on what doc size?
Comment #39277909 not loaded
Comment #39272558 not loaded