Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Custom Models

308 points by robertnishihara almost 2 years ago

11 comments

jawerty almost 2 years ago
Just to add to this, I ran through a lot of these topics around fine-tuning Llama 2 on your own dataset (for me it's my own code :P) in a coding live stream a couple of weeks ago. All on Colab, single GPU.

Fine-tuning Llama stream: https://www.youtube.com/watch?v=TYgtG2Th6fI&t=2282s

I have a couple more where I do a QLoRA fine-tuning session and explain the concepts as a self-taught engineer (software engineer of 8 years moving into ML recently).

QLoRA fine-tuning stream: https://www.youtube.com/watch?v=LitybCiLhSc&t=4584s

Overall I'm trying to break down how I'm approaching a lot of my personal projects and my current AI-driven startup. I want to make this information as accessible as possible. I also have a series where I'm fine-tuning a model to be the smallest webdev LLM possible, which people seem to be enjoying. I've only been streaming for about a month, and there's plenty more to come.

Ask me any question about the streams and fine-tuning Llama!
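For readers who want a concrete starting point, a minimal single-GPU QLoRA run of the kind described above might look roughly like the sketch below, using Hugging Face transformers, peft, trl, and bitsandbytes. The model ID, dataset path, and hyperparameters are placeholders, and the exact trainer arguments vary between library versions.

```python
# Minimal single-GPU QLoRA fine-tuning sketch; dataset path and
# hyperparameters are placeholders, not values from the post or streams.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

model_id = "meta-llama/Llama-2-7b-hf"  # gated model; requires access approval

# Load the base model in 4-bit NF4 so it fits on a single Colab-class GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapters on the attention projections; only these small matrices train.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")

# Placeholder dataset: one JSON record per example with a "text" field.
dataset = load_dataset("json", data_files="my_dataset.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora_config,
    dataset_text_field="text",
    max_seq_length=1024,
    tokenizer=tokenizer,
    args=TrainingArguments(output_dir="llama2-qlora",
                           per_device_train_batch_size=2,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1,
                           learning_rate=2e-4,
                           logging_steps=10),
)
trainer.train()
```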
behnamoh almost 2 years ago
> Additionally, while this wasn't an issue for GPT, the Llama chat models would often output hundreds of miscellaneous tokens that were unnecessary for the task, further slowing down their inference time (e.g. "Sure! Happy to help…").

That's the problem I've been facing with Llama 2 as well. It's almost impossible to have it just output the desired text. It will always add something before and after its response. Does anyone know if there's any prompt technique to fix this problem?
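One approach that often helps is pairing a strict system prompt in the standard Llama-2 chat template with a post-processing step that strips known filler. A rough sketch follows; the task, model choice, and filler list are illustrative, and this reduces but does not guarantee removal of the preamble.

```python
# Sketch: constrain Llama-2-chat output with a strict system prompt using
# Meta's documented [INST]/<<SYS>> template, then trim leftover filler.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto")

system = ("You are a data extraction engine. Output ONLY the requested value, "
          "with no preamble, no explanation, and no closing remarks.")
user = "Extract the company name from: 'Acme Corp reported record earnings today.'"
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
text = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                        skip_special_tokens=True).strip()

# The chat model may still prepend filler like "Sure! ..." -- strip known prefixes.
for filler in ("Sure!", "Sure,", "Happy to help"):
    if text.startswith(filler):
        text = text[len(filler):].strip()
print(text)
```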
richardliaw almost 2 years ago
I'm really glad to see a post like this come out. I've seen so many discussions online about customizing models -- this post really does cut through the noise.

I really like the evaluation methodology, and it seems well written too.
yousif_123123 almost 2 years ago
It's weird that LoRA and training with quantization aren't being taken more seriously. It's way cheaper, takes less time, and a lot of evidence shows it's pretty good.

I don't think it should be something brushed aside to be tried out later.
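To illustrate why it's so much cheaper: with LoRA adapters on a 4-bit-quantized base model, only a tiny fraction of the parameters is ever trained. A rough sketch using peft and bitsandbytes is below; the rank and target modules are arbitrary choices, and the quoted percentage is an order-of-magnitude figure, not a measured result.

```python
# Sketch: load Llama-2-7B in 4-bit and attach LoRA adapters, then report
# how few parameters are actually trainable.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # cast norms, enable input grads

peft_model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)
# Typically reports on the order of ~0.06% trainable parameters for a 7B model.
peft_model.print_trainable_parameters()
```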
bugglebeetle almost 2 years ago
Glad to see the NER-like task performed the best, as I was just about to test something like this for comparison with a fine-tuned BERT model. Any idea about the training costs for this task?
ilaksh almost 2 years ago
One challenge is that to get large enough custom datasets you either need a small army or a very strong existing model, which means you probably have to use OpenAI. And using OpenAI to generate training material for another model violates their terms.

Has anyone taken them to court about this? Do we all just decide it's not fair and ignore it?
spdustin almost 2 years ago
Seeing NER examples pop up more frequently now, and wondering why folks don't use spaCy for those sorts of tasks.
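For comparison, the spaCy route is only a few lines. A minimal sketch, assuming the small English pipeline has been installed with `python -m spacy download en_core_web_sm`:

```python
# Minimal spaCy NER example; requires `pip install spacy` and the
# en_core_web_sm pipeline to be downloaded beforehand.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

for ent in doc.ents:
    print(ent.text, ent.label_)
# e.g. -> Apple ORG, U.K. GPE, $1 billion MONEY
```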
zhz_ray almost 2 years ago
Disclaimer: I work for Anyscale.

This blog seems to have gotten good attention :) So we definitely plan to add it to Ray Summit: https://raysummit.anyscale.com/agenda

Please comment on this thread if you have ideas about what kind of content you want to see more of at Ray Summit.
rising-sky almost 2 years ago
> ~14 min. for 7B for 1 epoch on 3.5M tokens. ~26 min for 13B for 1 epoch.

> At least 1x g5.16xlarge for the head node and 15x g5.4xlarge for worker nodes, for both 7B and 13B.

For the uninitiated, does anyone have an idea how much this would cost on AWS?
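A rough back-of-the-envelope, assuming on-demand us-east-1 rates of roughly $4.10/hr for a g5.16xlarge and $1.62/hr for a g5.4xlarge at the time; actual costs vary by region, reservation, and spot pricing, and cluster startup and idle time are ignored.

```python
# Back-of-the-envelope cluster cost; hourly rates are assumed on-demand
# us-east-1 prices and may differ by region, reservation, or spot pricing.
head_hr = 4.10                          # 1x g5.16xlarge, $/hour (assumed)
worker_hr = 1.62                        # g5.4xlarge, $/hour (assumed)
cluster_hr = head_hr + 15 * worker_hr   # ~$28/hour for the whole cluster

for model_size, minutes in [("7B", 14), ("13B", 26)]:
    print(f"{model_size}: ~${cluster_hr * minutes / 60:.0f} per epoch")
# Roughly $7 per epoch for 7B and $12 for 13B, excluding startup/idle time.
```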
praveenhm almost 2 years ago
Is it possible to fine-tune Llama-2 locally on an M1 Ultra with 64GB? I would like to know; any pointers would be good. Most of the guides are for the cloud or use Nvidia CUDA on Linux.
0xDEF almost 2 years ago
Has anyone had luck with fine-tuning Llama-v2-7b using the paid (€11.00) Colab Pro?