Free Dolly: First truly open instruction-tuned LLM

181 points by rxin · about 2 years ago

17 comments

nidnogg · about 2 years ago
There's some blatant astroturfing from new accounts going on in this thread - I gotta say it's not the best impression.
pauldix · about 2 years ago
Shame that this is flagged, I think this is a really exciting development and was hoping to see the discussion around it. Open sourcing the fine tuning training set is a great building block. Will be exciting to see if others continue to build on this. More open source datasets, models, and evaluation frameworks will accelerate the development and adoption of LLMs. It adds more hackers to the mix building the core, rather than just the stuff at the edges (i.e. apps).
Mizza · about 2 years ago
Shills, begone!

I brought up the issue of the "dirty" model in their last announcement thread, very cool to see them take that to heart and quickly address the issue. Impressive marketing and engineering.
yawnxyz · about 2 years ago
Interesting to see that it's trained on data completely generated by Databricks employees. I wonder how "biased" that makes the data, and how much they spent in terms of lost man hours?
satvikpendem · about 2 years ago
I'm looking forward to using this for my startup. Lots of people are using LLaMA-derived models but I'm not sure if they're reading the license, since it's still non-commercial only, even though many people are treating it like true open source.

The only other one I've seen that's actually open source is OpenAssistant, also based on the Pythia models I believe.
covi · about 2 years ago
Kudos to Databricks! Does anyone have insights into benchmark & real-world quality?

From https://huggingface.co/databricks/dolly-v2-12b#benchmark-metrics, it seems like dolly-v2-12b's benchmark results are actually slightly worse than dolly-v1-6b's.

A commercially viable instruction-tuned LLM is still a huge deal.
QuadrupleA · about 2 years ago
Any info on Pythia base model performance versus GPT-3 or 3.5? Couldn't find any benchmarks in the paper. I imagine LLaMA is ahead there.
kenniy · about 2 years ago
As an academic researcher with a significant interest and time-investment in Transformer-based models, this restores my faith/hope in the trajectory of DL research. Considering it is difficult for academics to catch up to the industry regarding LLMs, seeing a continuation of the OPENness of these research works by a major industry player is a move in the right direction.
ibejoeb · about 2 years ago
I deal with this stuff daily, so I think it's probably irrationally grinding my gears, but:

>> How do I build a campfire?

> Safety should always come first when starting a campfire.

Hold up: should I touch the fire? It doesn't say.

OK, there's perfectly legitimate advice in the output, like "have water nearby," but give me a break already. They're finetuning for commercial application. If I'm building business tools, I'm not putting kid gloves on it. I don't have time for a lecture every time I need an answer.

You can put a safety model in front of an unencumbered model if you want. We don't need to conflate the two.
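To make the "safety model in front of an unencumbered model" idea above concrete, here is a minimal sketch. The screening step is a trivial keyword placeholder standing in for a real moderation model, and none of the names come from the Dolly release; in practice you would swap in a dedicated safety classifier.

```python
# Minimal sketch of layering a separate safety check in front of an unfiltered model.
# `generate_unfiltered` and the blocked-topic list are illustrative placeholders only.
from typing import Callable

BLOCKED_TOPICS = {"make a weapon", "steal credentials"}  # assumed policy, purely illustrative

def guarded_answer(prompt: str, generate_unfiltered: Callable[[str], str]) -> str:
    """Screen the request first; only call the unencumbered model if it passes."""
    lowered = prompt.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return "This request was declined by the safety layer."
    return generate_unfiltered(prompt)

# Example usage with a dummy generator standing in for the base LLM.
print(guarded_answer("How do I build a campfire?", lambda p: f"Answer to: {p}"))
```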
losvedir · about 2 years ago
Does it work...? The examples given at the bottom of the post were pretty great, but could easily have been cherry picked. I'd be curious to see how it performs against standard benchmarks.

But I love the thought here. I didn't realize the instruction tuning for GPT was from only 40 people. It really does bring into perspective how easily a motivated large organization could bring their employees to bear to do something like this, and I'm grateful that DataBricks has done it and is sharing it here.

I wish I understood how LLMs work a little better. This is a neat piece of the puzzle I wasn't fully aware of. But now my mental model is that LLMs work with kind of "three layers" of inputs:

* The base many-billion or even trillion parameter model, trained on a huge corpus of text, which basically is how it learns to use language as I/O.

* The instruction tuning, on just tens of thousands of inputs, to give the raw model some further guidance. This is a sort of transfer learning, maybe? Doing further training on top of a big model?

* The prompt itself can provide further inputs and context to tweak how the response should look.

I had been thinking of LLMs in terms of the first layer, the base model, and the bottom layer the prompt, and was thinking that you could get progressively more sophisticated in the prompt "context" to have LLMs tailor made for your particular use case.

But actually, there's a decent chunk of space to explore on the instruction tuning? Like, say you wanted an LLM to help lawyers with case law or something, to keep it from hallucinating quite as much and being more detailed and useful. Is that something that would fit in the middle layer? Could a "legal AI startup" tackle that problem by starting with a big open source base model, proprietarily tuning it with 10s of thousands of legal questions and answers, and then sharing that model with law firms, with maybe a customer support rep at the firm able to do the final tweaking with the prompt context? Is that how this all fits together?

The examples here of digesting DataBricks info and customer support tickets I found really interesting. How exactly would large companies like DB tailor LLMs to their particular use cases and data?
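For readers wondering what the "middle layer" in the comment above looks like in practice, here is a minimal sketch of supervised instruction tuning: take a pretrained base model and continue next-token training on formatted instruction/response pairs. The model name, prompt template, example data, and hyperparameters are assumptions for illustration, not Databricks' actual recipe.

```python
# Minimal sketch of instruction tuning: further training of a pretrained causal LM on a
# handful of instruction/response pairs. All names and hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "EleutherAI/pythia-70m"  # tiny stand-in; Dolly v2 fine-tunes much larger Pythia checkpoints
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical domain-specific pairs (a real set, like databricks-dolly-15k, has thousands).
pairs = [
    {"instruction": "Summarize the holding of Marbury v. Madison.",
     "response": "It established judicial review: courts may invalidate laws that conflict with the Constitution."},
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for example in pairs:
    # Concatenate instruction and response into one sequence and train on next-token prediction.
    text = f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['response']}"
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In the commenter's "legal AI startup" scenario, the middle layer would amount to replacing `pairs` with tens of thousands of curated legal questions and answers and training for several epochs, leaving the prompt context as the final, per-customer layer.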
6gvONxR4sf7o · about 2 years ago
I’m so tired of models announced without any benchmarking of quality.
marban · about 2 years ago
Why is this post full of n00b user 'comments'?
nicpottier · about 2 years ago
Looks like this is 12b parameters. Will this fit on a 32gb M1?
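A rough, back-of-the-envelope answer to the memory question above (my arithmetic, not from the post): the weights alone for 12B parameters take about 22 GiB at 16-bit precision, so 32 GB is tight once activations, the KV cache, and the rest of the system are counted; 8-bit or 4-bit quantization shrinks that considerably.

```python
# Rough lower bound for holding 12B model weights at various precisions.
# Ignores activations, KV cache, and framework overhead, so real usage is higher.
PARAMS = 12e9

for bits in (32, 16, 8, 4):
    gib = PARAMS * bits / 8 / 2**30
    print(f"{bits:>2}-bit weights: ~{gib:.1f} GiB")
```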
michaelhartm · about 2 years ago
I guess Databricks is now going after OpenAI?
pbharrin · about 2 years ago
On Monday Yann LeCun and Andrew Ng said instruction following LLMs would be commoditized. Apparently they were right.
prhrb · about 2 years ago
Why is this submission flagged?
dontknowyet · about 2 years ago
LLMs for everyone - it is awesome to see how easy it is to train your own LLM without much effort! And open source, too.