
Fine-tune Google's Gemma 3

226 points · by tomdekan · about 2 months ago

9 comments

smokel · about 2 months ago
I'm interested to know whether anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.

RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.

How much effort is required to turn code into something one can use for fine-tuning?
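
One common starting point for that last question is to flatten a repository into instruction-style JSONL pairs. Below is a minimal sketch that pairs each documented function's docstring with its implementation; the pairing heuristic, the ./my_repo path, and the output format are illustrative assumptions, not a method from the article:

    import ast
    import json
    from pathlib import Path

    def extract_pairs(repo_root: str):
        """Walk a Python repo, pairing each documented function's
        docstring with its source as a prompt/completion example."""
        for path in Path(repo_root).rglob("*.py"):
            source = path.read_text(encoding="utf-8", errors="ignore")
            try:
                tree = ast.parse(source)
            except SyntaxError:
                continue  # skip files that do not parse
            for node in ast.walk(tree):
                if isinstance(node, ast.FunctionDef) and ast.get_docstring(node):
                    yield {
                        "prompt": "Implement the following:\n" + ast.get_docstring(node),
                        "completion": ast.get_source_segment(source, node),
                    }

    # Write one JSON object per line, the format most trainers accept.
    with open("train.jsonl", "w") as f:
        for pair in extract_pairs("./my_repo"):  # hypothetical repo path
            f.write(json.dumps(pair) + "\n")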
zk · about 2 months ago
Is there a version of Gemma 3 that has tool calling? Google's blog claimed it supports tools, but it doesn't seem like it actually does.
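
When a model lacks a native tool-calling API, a common workaround is prompt-based function calling: describe the tools in the prompt, ask for a JSON-only reply, and parse it yourself. A minimal sketch of the parsing side; the prompt wording and the get_weather tool are made-up examples:

    import json
    import re

    # Hypothetical tool description injected ahead of the user's question.
    TOOL_PROMPT = """You have access to this tool:
    get_weather(city: str) -> str
    If you need the tool, reply ONLY with JSON like:
    {"tool": "get_weather", "arguments": {"city": "..."}}
    Otherwise answer the question directly.
    """

    def parse_tool_call(model_output: str):
        """Extract a JSON tool call from the model's reply, if any."""
        match = re.search(r"\{.*\}", model_output, re.DOTALL)
        if not match:
            return None  # the model answered in plain text
        try:
            call = json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
        return call if call.get("tool") == "get_weather" else None

    # Example with a canned reply standing in for real model output:
    print(parse_tool_call('{"tool": "get_weather", "arguments": {"city": "Oslo"}}'))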
bryan0 · about 2 months ago
Are people fine-tuning LLMs on their local machines with a single GPU? What are people using to scale their training to multiple nodes / GPUs? I've been playing around with Hugging Face Estimators in sagemaker.huggingface, but I'm not sure whether there are better options for this.
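
For reference, a multi-node launch with the Hugging Face Estimator looks roughly like the sketch below. The role ARN, S3 path, instance choice, and hyperparameters are placeholders, and the version strings need to be checked against the DLC images actually available:

    from sagemaker.huggingface import HuggingFace

    estimator = HuggingFace(
        entry_point="train.py",           # your training script
        source_dir="./scripts",
        instance_type="ml.p4d.24xlarge",  # 8x A100 per node
        instance_count=2,                 # two nodes -> 16 GPUs total
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
        transformers_version="4.36",      # check available DLC versions
        pytorch_version="2.1",
        py_version="py310",
        # Launches the script under torchrun across all nodes and GPUs.
        distribution={"torch_distributed": {"enabled": True}},
        hyperparameters={"model_id": "google/gemma-3-4b-it", "epochs": 1},
    )
    estimator.fit({"train": "s3://my-bucket/train"})  # placeholder S3 path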
rockwotj · about 2 months ago
Is anyone outside of the research labs fine-tuning models for production use cases? I've been seeing more people just using foundation models off the shelf, especially in light of a new advancement that seems to come every few months.
yieldcrv · about 2 months ago
Instead of versions, these things should be labeled by their release date, since this kind of training starts from a dataset snapshot in time, colloquially called the knowledge-cutoff date, which isn't really accurate.

We are optimizing these on different dimensions at once, with multiple branches of evolution from each model, so a successor version name doesn't really convey that.
huqedato · about 2 months ago
Great article, but I didn't see anything about the costs.

I'm particularly interested in this aspect because we're considering fine-tuning Gemma 3, but our budget is tight. We're looking into (real-world) cost estimates for this approach.
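
A rough way to bound the cost is GPU-hours times hourly price, with GPU-hours estimated from dataset size and throughput. Every number in the sketch below is an assumption to replace with your own price quotes and measured benchmarks:

    # Back-of-envelope fine-tuning cost. All figures are assumptions.
    gpu_hourly_usd = 2.50            # assumed on-demand price, one A100 80GB
    num_gpus = 1                     # a LoRA run on a small Gemma often fits on one
    train_tokens = 20_000_000        # assumed dataset size in training tokens
    tokens_per_gpu_hour = 5_000_000  # assumed throughput; benchmark your setup

    gpu_hours = train_tokens / tokens_per_gpu_hour * num_gpus
    print(f"~{gpu_hours:.1f} GPU-hours, ~${gpu_hours * gpu_hourly_usd:.0f}")
    # -> ~4.0 GPU-hours, ~$10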
siliconc0w · about 2 months ago
It likely makes sense to use more expensive frontier models as teachers or architects for smaller fine-tuned ones that generate the majority of tokens (though possibly against the ToS).
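
The usual shape of that pipeline is: collect prompts from the target task, have the frontier model write the answers, then fine-tune the small model on the resulting pairs. A minimal sketch of the data-generation half, where call_teacher is a hypothetical stand-in for whatever frontier-model client you use:

    import json

    def call_teacher(prompt: str) -> str:
        """Hypothetical wrapper around your frontier-model API of choice."""
        raise NotImplementedError("plug in your API client here")

    # Prompts sampled from the task the small model will serve.
    prompts = [
        "Summarize this error log: ...",
        "Write a SQL query that ...",
    ]

    # The expensive model writes the targets; the cheap model is later
    # fine-tuned to imitate them.
    with open("distill.jsonl", "w") as f:
        for p in prompts:
            f.write(json.dumps({"prompt": p, "completion": call_teacher(p)}) + "\n")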
admiralrohan · about 2 months ago
Has anyone used those small models in any production environment?

If yes, what are they good and bad at?
dhooper · about 2 months ago
Please try to enjoy each Gemma tuning equally, and not show preference for any over the others