Just to add to this, I ran through a lot of these topics around fine-tuning Llama 2 on your own dataset (for me it's my own code :P) in a coding live stream a couple of weeks ago. All on a single Colab GPU.

Fine-tuning Llama stream: https://www.youtube.com/watch?v=TYgtG2Th6fI&t=2282s

I have a couple more, including one where I do a QLoRA fine-tuning session and explain the concepts as a self-taught engineer (a software engineer of 8 years moving into ML recently).

QLoRA fine-tuning stream: https://www.youtube.com/watch?v=LitybCiLhSc&t=4584s

Overall I'm trying to break down how I approach a lot of my personal projects and my current AI-driven startup, and to make this information as accessible as possible. I also have a series where I'm fine-tuning a model to be the smallest webdev LLM possible, which people seem to be enjoying. I've only been streaming for about a month, and there's plenty more to come.

Ask me anything about the streams and fine-tuning Llama!
> Additionally, while this wasn’t an issue for GPT, the Llama chat models would often output hundreds of miscellaneous tokens that were unnecessary for the task, further slowing down their inference time (e.g. “Sure! Happy to help…”).

That's the problem I've been facing with Llama 2 as well. It's almost impossible to get it to output just the desired text; it will always add something before and after its response. Does anyone know of a prompting technique that fixes this?
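One workaround that has helped me, as a hedged sketch (the system prompt wording and example task here are purely illustrative): use Llama 2's [INST]/<<SYS>> chat format with a strict system prompt, and pre-seed the start of the assistant turn so there's no room left for a preamble.

    # Hedged sketch: suppressing the chat preamble with a strict system
    # prompt plus a pre-seeded assistant turn. All wording is illustrative.
    system = (
        "You are a text-processing tool. Reply with ONLY the requested "
        "output. No greetings, apologies, or explanations."
    )
    user = "Extract the city from: 'Flights to Paris are delayed.'"

    # Llama 2 chat template; ending the prompt with 'Output:' starts the
    # assistant turn for the model, so generated text can be used verbatim.
    prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST] Output:"

Few-shot examples inside the user turn and a stop sequence (e.g. a newline) help further, though none of this is bulletproof with the chat models.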
I'm really glad to see a post like this come out. I've seen so many discussions online about customizing models, and this post really does cut through the noise.

I really like the evaluation methodology, and it's well written too.
It's weird that LoRA and training with quantization aren't being taken more seriously. They're way cheaper, take less time, and a lot of evidence shows they work well.

I don't think this should be brushed aside as something to try out later.
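For anyone who hasn't tried it, here's roughly what the setup looks like with Hugging Face transformers + peft + bitsandbytes. This is a minimal sketch, and the model name and hyperparameters are illustrative rather than anything from the post:

    # Minimal QLoRA-style setup: 4-bit quantized base model + LoRA adapters.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                   # quantize base weights to 4-bit
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",          # assumes access to the gated repo
        quantization_config=bnb_config,
        device_map="auto",
    )
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"], # attention projections only
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()       # typically well under 1% trainable

That tiny trainable fraction is exactly where the cost and time savings come from.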
Glad to see the NER-like task performed the best, as I was just about to test something like this for comparison with a fine-tuned BERT model. Any idea about the training costs for this task?
One challenge is that to build a large enough custom dataset, you either need a small army of annotators or a very strong existing model, which in practice means using OpenAI. And using OpenAI to generate training material for another model violates their terms of service.

Has anyone taken them to court over this? Or do we all just decide it's not fair and ignore it?
Disclosure: I work for Anyscale.

This blog post seems to have gotten good attention :) so we definitely plan to add it to Ray Summit: https://raysummit.anyscale.com/agenda

Please comment on this thread if you have ideas about what kind of content you'd like to see more of at Ray Summit.
> ~14 min. for 7B for 1 epoch on 3.5M tokens. ~26 min for 13B for 1 epoch.

> At least 1x g5.16xlarge for the head node and 15x g5.4xlarge for worker nodes, for both 7B and 13B.

For the uninitiated, does anyone have an idea how much this would cost on AWS?
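A rough back-of-the-envelope, assuming us-east-1 on-demand rates (around $4.10/hr for a g5.16xlarge and $1.62/hr for a g5.4xlarge; check current pricing, and spot instances would be considerably cheaper):

    # Rough cluster cost; instance prices are assumed on-demand rates.
    head = 1 * 4.096           # g5.16xlarge head node, $/hr
    workers = 15 * 1.624       # g5.4xlarge worker nodes, $/hr
    rate = head + workers      # ~$28.5/hr for the whole cluster
    print(rate * 14 / 60)      # 7B,  ~14 min -> roughly $6.6
    print(rate * 26 / 60)      # 13B, ~26 min -> roughly $12.3

So on the order of ten dollars per epoch, not counting cluster spin-up time.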
Is it possible to fine-tune Llama 2 locally on an M1 Ultra with 64GB? I'd like to know, or any pointers would be good. Most guides are for the cloud or use Nvidia CUDA on Linux.
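It should be possible in principle via PyTorch's MPS backend, with the caveat that bitsandbytes (the 4-bit part of QLoRA) is CUDA-only, so you'd be limited to plain fp16 LoRA and it will be slow. A hedged sketch, untested on that exact machine and with illustrative hyperparameters:

    # Hedged sketch: fp16 LoRA on Apple Silicon via the MPS backend.
    # bitsandbytes/QLoRA will not work here (CUDA-only).
    import torch
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    device = "mps" if torch.backends.mps.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
    ).to(device)
    model = get_peft_model(
        model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
    )
    # ...then run a standard training loop / Trainer on your dataset.

64GB of unified memory should hold the 7B weights in fp16 plus the LoRA adapters and their optimizer state, but I'd treat this as an experiment rather than a workflow.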