Slightly off-topic, but as the parent of a toddler, I got a bit of a chuckle out of the name. It's based on the children's book series "Llama Llama Red Pajama".
There was a lot of detail and data in here, but it's not very useful to me because all of the comparisons are to things I have no experience with.<p>There's really only one thing I care about: How does this compare to GPT-4?<p>I have no use for models that aren't at that level. Even though this almost definitely isn't at that level, it's hard to know how close or far it is from the data presented.
This is beyond exciting. Welcome to the new reality!<p>On one hand, the resources required to run these models continue falling dramatically, thanks to the techniques discovered by researchers: GPTQ quantizing down to 4, 3, 2, even 1 bit! model pruning! hybrid VRAM offloading! better, more efficient architectures! 1-click finetuning on consumer hardware! Of course, the free lunches won't last forever, and this will level off, but it's still incredible.<p>And on the other side of the coin, the power of <i>all</i> computing devices continues its ever-upward exponential growth.<p>So you have a continuous <i>lowering</i> of requirements, combined with a continuous <i>increase</i> in available power... surely these two trends will collide, and I can only imagine what this stuff will be like at that intersection.
I have been really impressed with the uncensored WizardLM I was playing with. Having a truly open, uncensored model to work with is a really important research tool. Censoring the training data and results in such a heavy-handed way is not really possible without lowering the quality of all output.<p>As the resources required to train and fine-tune these models become consumer-hardware friendly, I think we'll see a shift towards a bunch of smaller models. Open models like these also mean the results of security and capability research are publicly available. Models like this one and the Replit code model will become the new base all open source models are built on. I am really looking forward to the gptj 4-bit, CUDA-optimized 7B models; the others I have tested run fast on a 2070 Max-Q with 16GB RAM, where I was getting ~7 tokens/second. LoRA can work directly with 4-bit quantized models. While ggml CPU models are very strong, I don't believe we'll move away from GPU-accelerated training and fine-tuning anytime soon.
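In case it's useful to anyone else, here's a minimal sketch of what LoRA on top of a 4-bit quantized checkpoint can look like with transformers + bitsandbytes + peft. The repo name and target_modules are assumptions on my part, and the exact flags depend on your library versions, so treat it as a starting point rather than a recipe.

    # Hedged sketch: LoRA adapter on a 4-bit quantized causal LM (names are assumptions).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model_id = "togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1"  # assumed repo name

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # quantize weights to 4-bit at load time
        bnb_4bit_compute_dtype=torch.float16,  # run the matmuls in fp16
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",                     # spread layers across GPU/CPU as needed
    )

    model = prepare_model_for_kbit_training(model)
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["query_key_value"],    # assumed module name for GPT-NeoX-style blocks
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()         # only the small LoRA matrices get gradients

The quantized base weights stay frozen and only the low-rank adapter matrices are trained, which is what makes this feasible on a single consumer GPU.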
That's very interesting for performing basic tasks at reasonable speeds or for running on smaller systems. Unfortunately it's one of the many based on Python and transformers, so all the resources gained from the compact model are wasted on the heavy engine and ecosystem, and even a 4GB machine with 4GB of swap goes OOM because the loaded data gets duplicated in memory via read() and malloc() :-(<p>Let's wait for someone to port it to a cheaper and more powerful C-based engine like llama.cpp.
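To make the duplication complaint concrete, here's a rough Python illustration of read()-into-a-buffer versus mmap; the file path is just a placeholder and the exact behaviour depends on your OS.

    import mmap

    path = "model-weights.bin"  # placeholder, not a real checkpoint

    # read(): the file contents end up both in the kernel page cache and in a
    # private buffer owned by the process, so you briefly pay for two copies.
    with open(path, "rb") as f:
        blob = f.read()
    print(f"private copy: {len(blob)} bytes held in the process heap")

    # mmap: pages stay backed by the file and are shared with the page cache;
    # only pages you actually touch become resident, and they can be evicted.
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    header = mm[:64]  # touching only the first page faults in only that page
    print(f"mapped view: {len(mm)} bytes addressable without a private copy")

This is roughly the trick llama.cpp uses for its weights, which is why it can start up on machines where a read()/malloc() loader would hit OOM.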
idea: linked parameters / models tree<p>build a model that can change the number of parameters in the vicinity of some meaning, effectively increasing the local resolution around that meaning<p>so parameter space becomes linked-parameter space, between models<p>links could be pruned based on activation frequency<p>another way of seeing the concept is a tree of models/llms<p>and one additional model/llm that all it does is manage the tree (ie. build it as it goes, use it to infer, prune it, etc)<p>Or is it too dumb what I’m saying?
So I tried RedPajama-INCITE-Instruct-7B-v0.1 and the AutoModelForCausalLM.from_pretrained(...) call takes two minutes every time. My GPU is big enough. I don't know why it's so slow. I feel like it's somehow precomputing stuff that can be used across queries, and I had hoped that this stuff would have already been precomputed on the disk and I could just load it up.
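If the bottleneck is rebuilding the full-precision weights on every run, a hedged workaround (exact flags depend on your transformers version) is to load once in fp16 with the low-memory path and re-save locally, so later from_pretrained calls read the converted copy straight from disk:

    # Hypothetical speed-up for repeated loads; the local path is a placeholder.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "togethercomputer/RedPajama-INCITE-Instruct-7B-v0.1"
    local_dir = "./redpajama-7b-instruct-fp16"

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,   # keep weights in half precision
        low_cpu_mem_usage=True,      # stream weights instead of materializing fp32 first
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    model.save_pretrained(local_dir)      # later runs: from_pretrained(local_dir)
    tokenizer.save_pretrained(local_dir)

    model = model.to("cuda")

No guarantee this gets it down to seconds, but it at least avoids redoing the dtype conversion on every launch.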
I also wonder how powerful the 3B model will be. Could it act as a prompt router, making an API call to ChatGPT or another specified model for the actual processing? It's probably possible to do this with LangChain, but I have not tried it yet.
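For what it's worth, here's a rough sketch of the routing idea without LangChain: let the 3B model classify the request and only escalate to a bigger model over the API. The repo name, routing prompt, and call_remote_model stub are all assumptions, not a tested recipe.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "togethercomputer/RedPajama-INCITE-Instruct-3B-v1"  # assumed 3B instruct repo
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

    def local_generate(prompt: str, max_new_tokens: int = 128) -> str:
        inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
        out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    def call_remote_model(prompt: str) -> str:
        # Placeholder: call ChatGPT (or any larger model) through its API here.
        raise NotImplementedError

    ROUTER_TEMPLATE = (
        "Classify the following request as SIMPLE (short factual or formatting task) "
        "or COMPLEX (long reasoning, code, or open-ended writing). Answer with one word.\n"
        "Request: {prompt}\nAnswer:"
    )

    def answer(prompt: str) -> str:
        verdict = local_generate(ROUTER_TEMPLATE.format(prompt=prompt), max_new_tokens=4)
        if "COMPLEX" in verdict.upper():
            return call_remote_model(prompt)  # escalate to the big model
        return local_generate(prompt)         # handle locally with the 3B model

Whether a 3B model classifies reliably enough for this is exactly the open question, but the plumbing itself is simple.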
I am really interested in knowing what people are using these smaller models for. I have seen a lot of projects on top of GPT-3.5 / GPT-4, but I have yet to see any using these smaller models.
I've been following the RedPajama project closely and I must say, it's quite an impressive undertaking. The fact that it's all open-source, and the collaboration between various institutions, is nothing short of amazing. This shows the power of the open-source community in action, with a bunch of smart people coming together to build something truly remarkable.<p>The 3B model, being super fast and accessible, is a game changer for a lot of us who may not have the latest hardware. I mean, running on an RTX 2070 that was released 5 years ago? That's pretty cool.<p>As for the 7B model, it's great to see that it's already outperforming the Pythia 7B. The bigger dataset definitely seems to be making a difference here. I'm eager to see how far this project goes, and what kinda improvements we can expect in the coming weeks with the new RedPajama dataset they're working on.<p>One thing I found interesting is the mention of differences between the LLaMA 7B and their replication. I'd love to learn more about those differences, as it could shed light on what's working well and what could be improved further.