GPT 3.5 vs. Llama 2 fine-tuning: A Comprehensive Comparison

47 points by samlhuillier, over 1 year ago

7 comments

todd3834, over 1 year ago
What has me excited about Llama: I've built some tools that I think would make sense to offer for an affordable "lifetime price", but they currently rely on the OpenAI API / GPT-4. I cannot get myself to offer lifetime memberships to something with an ongoing cost. Lately I've been considering building Electron apps with Llama embedded, targeted toward Apple Silicon devices. I think with this stack I wouldn't incur any ongoing costs, and these utilities could exist for a one-time fee.
Comment #37560810 not loaded
Comment #37560968 not loaded
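
The underlying idea here is running a quantized Llama 2 model locally so inference carries no per-request API cost. As a minimal sketch of that idea (using llama-cpp-python rather than an Electron packaging; the model file and generation settings are placeholders, not anything from the article):

```python
# Minimal sketch of local Llama 2 inference with llama-cpp-python.
# The GGUF model path and generation settings are placeholders; the weights
# would be downloaded separately.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-7b-chat.Q4_K_M.gguf",  # quantized local weights
    n_ctx=2048,       # context window
    n_gpu_layers=-1,  # offload all layers to the GPU (Metal on Apple Silicon, if built with it)
)

out = llm(
    "Summarize the trade-offs of local vs. hosted LLM inference in one sentence.",
    max_tokens=128,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```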
svapnil, over 1 year ago
This is really cool, nice work!

Quick question - what would the cost of inference be, at scale, between a fine-tuned GPT-3.5 and a fine-tuned Llama 2? Surely that's another factor that should be considered in this case, right?
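
The cost question largely reduces to back-of-envelope arithmetic: per-token API pricing on one side, GPU hours amortized over throughput on the other. A sketch of that arithmetic, where every number is a hypothetical placeholder rather than a quoted price:

```python
# Back-of-envelope cost comparison; every figure here is a hypothetical
# placeholder, not a price from the article or either vendor.
tokens_per_month = 500_000_000           # assumed workload

# Hosted fine-tuned GPT-3.5: pay per token.
api_price_per_1k_tokens = 0.012          # hypothetical blended input/output price (USD)
api_cost = tokens_per_month / 1_000 * api_price_per_1k_tokens

# Self-hosted fine-tuned Llama 2: pay for GPU time, amortized over throughput.
gpu_hourly_rate = 1.10                   # hypothetical GPU rental price (USD/hour)
tokens_per_second = 1_500                # hypothetical batched throughput
gpu_hours = tokens_per_month / tokens_per_second / 3600
gpu_cost = gpu_hours * gpu_hourly_rate

print(f"API:         ${api_cost:,.0f}/month")
print(f"Self-hosted: ${gpu_cost:,.0f}/month (plus ops overhead)")
```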
lukev, over 1 year ago
I'm curious about the terminology for the "functional representation" dataset.

Is this a well-defined term? I've been thinking about similar approaches for getting more structured propositional knowledge into and out of LLMs, and the examples in the Viggo dataset are the closest thing so far to someone thinking the same way I am.

However, Google doesn't turn up many results that use the term in this way. I'd love any more resources or information on the topic.
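
For context, the ViGGO dataset pairs short structured meaning representations (a dialogue act with bracketed slot/value pairs) with natural-language sentences. A small illustration of what such a "functional representation" looks like and how it can be parsed; the example record is invented and the slot names only follow the general pattern of the dataset:

```python
# Illustrative ViGGO-style pair: a dialogue act with bracketed slots,
# alongside the sentence it describes. The record itself is made up.
import re

mr = "inform(name[Super Mario Odyssey], release_year[2017], genres[platformer], rating[excellent])"
text = "Super Mario Odyssey is an excellent platformer that came out in 2017."

def parse_mr(mr: str) -> tuple[str, dict[str, str]]:
    """Split a meaning representation into its dialogue act and slot/value pairs."""
    act, _, body = mr.partition("(")
    slots = dict(re.findall(r"(\w+)\[(.*?)\]", body))
    return act, slots

act, slots = parse_mr(mr)
print(act)    # inform
print(slots)  # {'name': 'Super Mario Odyssey', 'release_year': '2017', ...}
```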
thewataccount, over 1 year ago
I've been struggling with figuring out a good dataset for fine-tuning. Most of the ones that exist were purpose-made for finetuning/training a model already.

Does anyone have any tips for creating sufficient datasets for finetuning specific workloads?
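
One common starting point is to collect a few hundred (input, output) pairs for the specific workload and render them into the chat-format JSONL that OpenAI fine-tuning accepts; the same pairs can be fed to a Llama 2 trainer through its own prompt template. A minimal sketch, with placeholder examples and a placeholder system prompt:

```python
# Sketch: convert raw (input, output) task examples into chat-format JSONL
# for fine-tuning. The examples and system prompt are placeholders; for
# Llama 2 the same pairs would be rendered into the model's prompt template
# by the training script instead.
import json

raw_examples = [
    {"input": "Given a restaurant: name=Aromi, food=Chinese, area=city centre",
     "output": "inform(name[Aromi], food[Chinese], area[city centre])"},
    # ... ideally a few hundred more for a narrow task
]

with open("train.jsonl", "w") as f:
    for ex in raw_examples:
        record = {
            "messages": [
                {"role": "system", "content": "Convert the description into a functional representation."},
                {"role": "user", "content": ex["input"]},
                {"role": "assistant", "content": ex["output"]},
            ]
        }
        f.write(json.dumps(record) + "\n")
```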
BoorishBears, over 1 year ago
Is there a notebook that shows a reproducible evaluation procedure? You linked to some general eval, not your actual evaluation.
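
The article's actual evaluation isn't reproduced here, but a reproducible procedure usually has this shape: a fixed held-out set, deterministic decoding, and a simple metric such as exact-match accuracy. A sketch, where `generate` is a stand-in for whichever model (fine-tuned GPT-3.5 or Llama 2) is being scored:

```python
# Sketch of a reproducible evaluation loop: fixed test set, deterministic
# decoding, exact-match accuracy. `generate` is a stand-in for the model
# under test; this is not the article's evaluation code.
import json

def generate(prompt: str) -> str:
    raise NotImplementedError("call the model under test with temperature=0")

def evaluate(test_path: str) -> float:
    correct = total = 0
    with open(test_path) as f:
        for line in f:
            ex = json.loads(line)                  # {"prompt": ..., "target": ...}
            pred = generate(ex["prompt"]).strip()
            correct += int(pred == ex["target"].strip())
            total += 1
    return correct / total

# print(f"exact match: {evaluate('test.jsonl'):.1%}")
```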
Tepix, over 1 year ago
Would using two RTX 3090s instead of the A40 have been an option?
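
Two RTX 3090s give 2 × 24 GB versus the A40's single 48 GB, so the practical question is whether the training setup can shard the model across both cards. With Hugging Face transformers, `device_map="auto"` plus 4-bit quantization typically lets a 7B or 13B QLoRA-style run fit. A hedged sketch (model name and quantization settings are illustrative, not the article's configuration):

```python
# Sketch: load Llama 2 sharded across all visible GPUs (e.g. two RTX 3090s)
# with 4-bit quantization, the usual QLoRA-style setup. Settings are
# illustrative, not taken from the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",  # shards layers across both 24 GB cards automatically
)
```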
jerpint, over 1 year ago
I'd be very interested in a similar comparison for RAG-style tasks.
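
For a RAG-style comparison, the fine-tuning question mostly shifts to how well each model answers from retrieved context rather than from its weights. A minimal retrieval-then-prompt sketch, with the embedding model chosen as an assumption and the downstream generation left as a stub for whichever model is being compared:

```python
# Minimal RAG sketch: embed documents, retrieve the closest by cosine
# similarity, and build a context-grounded prompt. Embedding model choice
# is an assumption; passing the prompt to a model is left as a stub.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Llama 2 was released by Meta in July 2023.",
    "GPT-3.5 Turbo fine-tuning became available in August 2023.",
    "QLoRA fine-tunes quantized models with low-rank adapters.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q                      # cosine similarity (vectors are normalized)
    return [docs[i] for i in np.argsort(-scores)[:k]]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("When did Llama 2 come out?"))  # pass to the model under test
```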