
DeepSpeed Chat: Easy, fast and affordable RLHF training of ChatGPT-like models

240 points | by quantisan | about 2 years ago

9 comments

tinco · about 2 years ago
Microsoft: invests 10 billion in a company. Also Microsoft: here are the tools you need to DIY, for free, one of the premium features of the company we just invested 10 billion in.

Not that reproducing GPT-4 is going to be easy with this, but it'll definitely get rid of some major hurdles. I read a report about the difficulties HuggingFace had with producing their Bloom model, and a lot of it was the sort of straightforward systems engineering that goes into tooling like this.

Is the Bloom model considered a failure by the community? If you read the introduction it was supposed to include improvements over GPT-3, but it performs much worse, I guess because of lower-quality training data? I wonder what sort of company would have high enough quality data that they could use this project to fine-tune a public model to the point where it would be better in some scenario than plain old GPT-4 would be. Especially when you can just inject extra info into the GPT-4 prompt, like phind does for example. What even is the use of fine-tuning given GPT-4 exists?
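For what it's worth, the "inject extra info into the prompt" approach is simple enough to sketch. The snippet below assumes a hypothetical retrieve_documents() lookup (any search or vector-store backend would do) and uses the OpenAI Python client purely as an illustration:

    # Sketch: retrieval + prompt injection instead of fine-tuning.
    # retrieve_documents() is a hypothetical helper; any search backend works.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def answer_with_context(question: str, retrieve_documents) -> str:
        docs = retrieve_documents(question, top_k=3)   # e.g. vector-store lookup
        context = "\n\n".join(docs)
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "Answer using only the provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return resp.choices[0].message.content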
summarity · about 2 years ago
Also see the example repo README: https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat

> With just one click, you can train, generate and serve a 1.3 billion parameter ChatGPT model within 1.36 hours on a single consumer-grade NVIDIA A6000 GPU with 48GB memory. On a single DGX node with 8 NVIDIA A100-40G GPUs, DeepSpeed-Chat enables training for a 13 billion parameter ChatGPT model in 13.6 hours. On multi-GPU multi-node systems (cloud scenarios), i.e., 8 DGX nodes with 8 NVIDIA A100 GPUs/node, DeepSpeed-Chat can train a 66 billion parameter ChatGPT model under 9 hours. Finally, it enables 15X faster training over the existing RLHF systems.

> The following are some of the open-source examples that are powered by DeepSpeed: Databricks Dolly, LMFlow, CarperAI-TRLX, Huggingface-PEFT

(disclaimer: MSFT/GH employee, not affiliated with this project)
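For context, the "one click" refers to a single launcher script in that repo that chains all three RLHF stages (supervised fine-tuning, reward model training, PPO). As best I recall from the README, the single-GPU invocation looks roughly like this; flag names may have changed, so verify against the repo:

    # Single-GPU example as recalled from the DeepSpeed-Chat README (check upstream for exact flags)
    python train.py --actor-model facebook/opt-1.3b \
                    --reward-model facebook/opt-350m \
                    --deployment-type single_gpu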
brofallon · about 2 years ago
To use RLHF you need a dataset that includes instructions with good & bad answers - do many of those exist? I know there are a few datasets of just plain instructions-with-responses, but I'm not aware of any that have both good and bad (or ranked) responses. Is that trivial, or an important missing element here?
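(For reference, the kind of data this needs is a prompt paired with a preferred and a dispreferred response - the format Anthropic's hh-rlhf dataset uses - and the reward model is trained on such pairs with a pairwise ranking loss, roughly like the sketch below; reward_model here is a hypothetical module returning a scalar score.)

    # Sketch: pairwise (Bradley-Terry style) loss for training a reward model
    # on ranked responses. reward_model is a hypothetical callable mapping a
    # (prompt, response) pair to a scalar score tensor.
    import torch
    import torch.nn.functional as F

    def preference_loss(reward_model, prompt, chosen, rejected):
        score_chosen = reward_model(prompt, chosen)      # score for the preferred answer
        score_rejected = reward_model(prompt, rejected)  # score for the dispreferred answer
        # Push the chosen score above the rejected one.
        return -F.logsigmoid(score_chosen - score_rejected).mean()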
lxe · about 2 years ago
It's a little funny how Microsoft DeepSpeed doesn't fully work on Windows.
burtonator · about 2 years ago
I've gotten so used to ChatGPT I just copied the text of this and told it to summarize the entire thing down to 5 paragraphs.

I know there was a summary, but the point is that ChatGPT really accelerates a LOT of bulk work we were used to having to do manually.

It's an amazing time to be alive!
teruakohatu · about 2 years ago
Does RLHF help with training an LLM to produce better (more accurate) results for a particular problem domain (e.g., customer support for a specific company), or is it only helpful for training the LLM to be a chat agent in general, or a chat agent with guardrails?
sebzim4500 · about 2 years ago
What's the difference between the critic model and the reward model? In the diagram they show both.

EDIT: Is the idea that the critic model learns via the PPO process and gives a value estimate to prefixes of the responses?
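(That is the usual setup, roughly sketched below with hypothetical reward_model and critic callables: the reward model scores the finished response once, while the critic is a value function trained alongside the policy during PPO to estimate expected return at each token prefix, which the advantage computation uses as a baseline. Simplified here - no GAE, no per-token KL penalty.)

    # Sketch of how the two models are used in PPO-based RLHF.
    # reward_model and critic are hypothetical callables; real pipelines add
    # GAE and a per-token KL penalty against the reference model.
    import torch

    def compute_advantages(prompt_ids, response_ids, reward_model, critic):
        # Reward model: a single scalar for the complete response.
        final_reward = reward_model(prompt_ids, response_ids)   # shape: ()
        # Critic: a value estimate for every response prefix, learned during PPO.
        values = critic(prompt_ids, response_ids)                # shape: (T,)
        # Place the scalar reward at the last token, form undiscounted returns,
        # and subtract the critic's baseline to get advantages.
        rewards = torch.zeros_like(values)
        rewards[-1] = final_reward
        returns = torch.flip(torch.cumsum(torch.flip(rewards, dims=[0]), dim=0), dims=[0])
        advantages = returns - values
        return advantages, returns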
hahnchen · about 2 years ago
Microsoft being more open than OpenAI, haha.
scottydog51834 · about 2 years ago
This is a really cool step, but as someone without the suggested GPU, it isn't easy or one-click for me yet.

I'm hoping someone makes a very simple Jupyter notebook where I can enter my RLHF file, select a few other settings, and just run it (on AWS or Azure; willing to pay per fine-tuned model, say $100-$500 for cloud credits + notebook access).