
DeepSpeed Chat: Easy, fast and affordable RLHF training of ChatGPT-like models

240 points | by quantisan | about 2 years ago

9 comments

tinco · about 2 years ago
Microsoft: invests 10 billion in a company. Also Microsoft: here are the tools you need to DIY, for free, one of the premium features of the company we just invested 10 billion in.

Not that reproducing GPT-4 is going to be easy with this, but it'll definitely get rid of some major hurdles. I read a report about the difficulties HuggingFace had with producing their Bloom model, and a lot of it was the sort of straightforward systems engineering that goes into tooling like this.

Is the Bloom model considered a failure by the community? If you read the introduction it was supposed to include improvements over GPT-3, but it performs much worse, I guess because of lower-quality training data? I wonder what sort of company would have high enough quality data that they could use this project to fine-tune a public model to the point where it would be better in some scenario than plain old GPT-4 would be. Especially when you can just inject extra info into the GPT-4 prompt, like phind does for example. What even is the use of fine-tuning given GPT-4 exists?
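For what it's worth, the "inject extra info into the prompt" approach is simple enough to sketch. The snippet below assumes a hypothetical retrieve_documents() lookup (any search or vector-store backend would do) and uses the OpenAI Python client purely as an illustration:

    # Sketch: retrieval + prompt injection instead of fine-tuning.
    # retrieve_documents() is a hypothetical helper; any search backend works.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def answer_with_context(question: str, retrieve_documents) -> str:
        docs = retrieve_documents(question, top_k=3)   # e.g. vector-store lookup
        context = "\n\n".join(docs)
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "Answer using only the provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return resp.choices[0].message.content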
summarity · about 2 years ago
Also see the example repo README: https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat

> With just one click, you can train, generate and serve a 1.3 billion parameter ChatGPT model within 1.36 hours on a single consumer-grade NVIDIA A6000 GPU with 48GB memory. On a single DGX node with 8 NVIDIA A100-40G GPUs, DeepSpeed-Chat enables training for a 13 billion parameter ChatGPT model in 13.6 hours. On multi-GPU multi-node systems (cloud scenarios), i.e., 8 DGX nodes with 8 NVIDIA A100 GPUs/node, DeepSpeed-Chat can train a 66 billion parameter ChatGPT model under 9 hours. Finally, it enables 15X faster training over the existing RLHF systems.

> The following are some of the open-source examples that are powered by DeepSpeed: Databricks Dolly, LMFlow, CarperAI-TRLX, Huggingface-PEFT

(disclaimer: MSFT/GH employee, not affiliated with this project)
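For context, the "one click" refers to a single launcher script in that repo that chains all three RLHF stages (supervised fine-tuning, reward model training, PPO). As best I recall from the README, the single-GPU invocation looks roughly like this; flag names may have changed, so verify against the repo:

    # Single-GPU example as recalled from the DeepSpeed-Chat README (check upstream for exact flags)
    python train.py --actor-model facebook/opt-1.3b \
                    --reward-model facebook/opt-350m \
                    --deployment-type single_gpu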
brofallon · about 2 years ago
To use RLHF you need a dataset that includes instructions with good & bad answers - do many of those exist? I know there are a few datasets of just plain instructions-with-responses, but I'm not aware of any that have both good and bad (or ranked) responses. Is that trivial, or an important missing element here?
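(For reference, the kind of data this needs is a prompt paired with a preferred and a dispreferred response - the format Anthropic's hh-rlhf dataset uses - and the reward model is trained on such pairs with a pairwise ranking loss, roughly like the sketch below; reward_model here is a hypothetical module returning a scalar score.)

    # Sketch: pairwise (Bradley-Terry style) loss for training a reward model
    # on ranked responses. reward_model is a hypothetical callable mapping a
    # (prompt, response) pair to a scalar score tensor.
    import torch
    import torch.nn.functional as F

    def preference_loss(reward_model, prompt, chosen, rejected):
        score_chosen = reward_model(prompt, chosen)      # score for the preferred answer
        score_rejected = reward_model(prompt, rejected)  # score for the dispreferred answer
        # Push the chosen score above the rejected one.
        return -F.logsigmoid(score_chosen - score_rejected).mean()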
lxe · about 2 years ago
It's a little funny how Microsoft DeepSpeed doesn't fully work on Windows.
burtonator · about 2 years ago
I've gotten so used to ChatGPT I just copied the text of this and told it to summarize the entire thing down to 5 paragraphs.

I know there was a summary, but the point is that ChatGPT really accelerates a LOT of bulk work we were used to having to do manually.

It's an amazing time to be alive!
teruakohatu · about 2 years ago
Does RLHF help with training an LLM to produce better (more accurate) results for a particular problem domain (e.g., customer support for a specific company), or is it only helpful for training the LLM to be a chat agent in general, or a chat agent with guardrails?
sebzim4500 · about 2 years ago
What's the difference between the critic model and the reward model? In the diagram they show both.

EDIT: Is the idea that the critic model learns via the PPO process and gives a value estimate to prefixes of the responses?
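(That is the usual setup, roughly sketched below with hypothetical reward_model and critic callables: the reward model scores the finished response once, while the critic is a value function trained alongside the policy during PPO to estimate expected return at each token prefix, which the advantage computation uses as a baseline. Simplified here - no GAE, no per-token KL penalty.)

    # Sketch of how the two models are used in PPO-based RLHF.
    # reward_model and critic are hypothetical callables; real pipelines add
    # GAE and a per-token KL penalty against the reference model.
    import torch

    def compute_advantages(prompt_ids, response_ids, reward_model, critic):
        # Reward model: a single scalar for the complete response.
        final_reward = reward_model(prompt_ids, response_ids)   # shape: ()
        # Critic: a value estimate for every response prefix, learned during PPO.
        values = critic(prompt_ids, response_ids)                # shape: (T,)
        # Place the scalar reward at the last token, form undiscounted returns,
        # and subtract the critic's baseline to get advantages.
        rewards = torch.zeros_like(values)
        rewards[-1] = final_reward
        returns = torch.flip(torch.cumsum(torch.flip(rewards, dims=[0]), dim=0), dims=[0])
        advantages = returns - values
        return advantages, returns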
hahnchen · about 2 years ago
Microsoft being more open than OpenAI, haha.
scottydog51834 · about 2 years ago
This is a really cool step, but as someone without the suggested GPU, it isn't easy or one-click for me yet.

I'm hoping someone makes a very simple Jupyter notebook where I can enter my RLHF file, select a few other settings, and just run it (on AWS or Azure; willing to pay per fine-tuned model, say $100-$500 for cloud credits + notebook access).