I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.

RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.

How much effort is required to turn code into something one can use for fine-tuning?
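In my experience the data-prep step can start very simply. Here is a minimal sketch of turning a repo into a chat-style JSONL dataset; the repo path, file extensions, and the one-record-per-file prompt template are all placeholder assumptions you would adapt to your own codebase and training stack:

```python
import json
from pathlib import Path

REPO_ROOT = Path("./my-internal-repo")   # hypothetical repo location
OUTPUT = Path("finetune_data.jsonl")
EXTENSIONS = {".py", ".md"}              # source + docs to include

with OUTPUT.open("w", encoding="utf-8") as out:
    for path in REPO_ROOT.rglob("*"):
        if not path.is_file() or path.suffix not in EXTENSIONS:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        # One record per file: ask the model about the file, answer with its contents.
        record = {
            "messages": [
                {"role": "user", "content": f"Explain what {path.name} in our codebase does."},
                {"role": "assistant", "content": text},
            ]
        }
        out.write(json.dumps(record) + "\n")
```

The real effort tends to go into chunking long files, deduplicating, and writing better prompt/response pairs than this naive file dump.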
Are people fine-tuning LLMs on their local machines with a single GPU? What are people using to scale their training to multiple nodes / GPUs? I've been playing around with Hugging Face Estimators in sagemaker.huggingface, but I'm not sure if there are better options for this.
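For what it's worth, the SageMaker route has worked for me for multi-GPU runs. A rough sketch of what that looks like with the HuggingFace Estimator; the script name, S3 path, instance type, and framework versions here are placeholders rather than a tested configuration:

```python
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()

estimator = HuggingFace(
    entry_point="train.py",            # your existing training script
    source_dir="./scripts",
    role=role,
    instance_type="ml.p4d.24xlarge",   # 8 GPUs per node
    instance_count=2,                  # two nodes
    transformers_version="4.28",       # check the supported DLC versions
    pytorch_version="2.0",
    py_version="py310",
    # SageMaker's data-parallel library shards batches across all GPUs/nodes.
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
    hyperparameters={"epochs": 3, "per_device_train_batch_size": 4},
)

estimator.fit({"train": "s3://my-bucket/finetune_data/"})
```

The main appeal is that the same training script runs unchanged whether you pick one instance or several.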
Is anyone outside of the research labs fine-tuning models for production use cases? I've been seeing more people just use foundation models off the shelf, especially since a new advancement seems to come along every few months.
Instead of versions, these things should be labeled by their release date, since this kind of training starts from a dataset snapshot in time, colloquially called the knowledge-cutoff date (which isn't really accurate).

We're optimizing these along different dimensions at once, with multiple branches of evolution from each model, so a successor version name doesn't really convey that.
Great article, but I didn't see anything about the costs.

I'm particularly interested in this aspect because we're considering fine-tuning Gemma 3, but our budget is tight. We're looking into (real-world) cost estimates for this approach.
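On a tight budget, a parameter-efficient setup (QLoRA-style) is usually the cheapest starting point, since it can run on a single consumer or low-end cloud GPU. A minimal sketch with transformers + peft; the checkpoint name, target modules, and LoRA hyperparameters are assumptions to adapt, not a benchmarked recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-3-1b-it"  # assumed checkpoint name; verify on the Hub

# 4-bit quantized base weights keep GPU memory (and instance cost) low.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Train only small adapter matrices instead of the full model.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

With adapters this small, cost is dominated by GPU hours for however many epochs your dataset needs, which is much easier to estimate than a full fine-tune.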
It likely makes sense to use more expensive frontier models as teachers or architects for smaller fine-tuned ones that generate the majority of tokens (though possibly against the ToS).
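A hedged sketch of that teacher/student idea: have a frontier model answer a set of in-domain prompts, then fine-tune a smaller model on the resulting pairs. The teacher model name and the prompt list are placeholders, and as noted, check the provider's ToS before training on its outputs:

```python
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

prompts = [
    "How do we configure retries in our internal HTTP client?",
    "Summarize the deployment steps for service X.",
]

with open("teacher_pairs.jsonl", "w", encoding="utf-8") as out:
    for prompt in prompts:
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumed teacher model
            messages=[{"role": "user", "content": prompt}],
        )
        answer = resp.choices[0].message.content
        # Each line becomes one supervised example for the smaller student model.
        out.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```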