Could you train a ChatGPT-beating model for $85k and run it in a browser?

430 points by sirteno about 2 years ago

34 comments

whalesalad about 2 years ago
Are there any training/ownership models like Folding@home? People could donate idle GPU resources in exchange for access to the data, and perhaps ownership. Then, instead of someone needing to pony up $85k to train a model, a thousand people could each train a fraction of the model on their consumer GPUs, pool the results, and reap the collective rewards.
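(A minimal sketch of what the pooling step might look like, assuming each volunteer computes gradients on a local data shard; all the names here are illustrative, not an existing framework.)

```python
import torch
import torch.nn as nn

model = nn.Linear(512, 512)  # stand-in for one block of a larger model

def average_gradients(model, volunteer_grads):
    """Average per-parameter gradients collected from volunteer machines."""
    for name, param in model.named_parameters():
        param.grad = torch.stack([g[name] for g in volunteer_grads]).mean(dim=0)

# Simulate three volunteers returning gradients with matching shapes.
volunteer_grads = [
    {name: torch.randn_like(p) for name, p in model.named_parameters()}
    for _ in range(3)
]
average_gradients(model, volunteer_grads)
torch.optim.SGD(model.parameters(), lr=1e-3).step()  # apply the pooled update
```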
ftxbro about 2 years ago
His estimate is that you could train a LLaMA-7B-scale model for around $82,432 and then fine-tune it for a total of less than $85K. But the fine-tuned LLaMA-like models I've seen were, in my opinion, worse even than GPT-3; they were more like a GPT-2.5. Not nearly as good as ChatGPT 3.5, and certainly not ChatGPT-beating. Of course, far enough in the future you could certainly run one in the browser for $85K or much less, maybe even $1.
captainmuon about 2 years ago
I guess companies like OpenAI and Google have no incentive to make models use fewer resources. The compute required, and of course their training data, is their moat.
If you accept that your model knows less about the world - it doesn't have to know about every restaurant in Mexico City or the biography of every soccer player around the world - then you can get away with far fewer parameters and much less training data. You can't query it like an oracle about random things anymore, but you shouldn't do that anyway. It should still be able to do tasks like reformulating texts, judging similarity (by embedding distance), and so on.
And as TFA also mentions, you could hook up your simple language model with something like ReAct to get really good results. I don't see it running in the browser, but if you had a license-wise clean model that you could run on premises on one or two GPUs, that would be huge for a lot of people!
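(A minimal sketch of the "judging similarity by embedding distance" idea; the embed() function below is a toy stand-in for whatever small local model you would actually use.)

```python
import numpy as np

rng = np.random.default_rng(0)
projection = rng.normal(size=(4096, 256))  # toy "model": a fixed random projection

def embed(text: str) -> np.ndarray:
    # Hash words into a bag-of-words vector, then project it down and normalize.
    bow = np.zeros(4096)
    for word in text.lower().split():
        bow[hash(word) % 4096] += 1.0
    vec = bow @ projection
    return vec / (np.linalg.norm(vec) + 1e-9)

def cosine_similarity(a: str, b: str) -> float:
    return float(embed(a) @ embed(b))

print(cosine_similarity("reformulate this paragraph", "rewrite this paragraph"))
print(cosine_similarity("reformulate this paragraph", "best restaurants in mexico city"))
```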
lxe about 2 years ago
Keep in mind that image models like Stable Diffusion are generally smaller than language models, so they are easier to fit in WASM space.
Also, you can fine-tune llama-7b on a 3090 for about $3 using LoRA.
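(A rough sketch of that LoRA setup using the Hugging Face peft library, in the spirit of alpaca-lora; the model path is a placeholder for whatever LLaMA-7B checkpoint you have locally, 8-bit loading assumes bitsandbytes is installed, and this is a sketch rather than a full training script.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "path/to/llama-7b-hf"  # placeholder for your local LLaMA-7B weights
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, load_in_8bit=True, device_map="auto")

# LoRA trains only small low-rank adapters on the attention projections,
# which is what makes a single 24GB 3090 enough for a 7B model.
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B weights
# ...then train with the usual transformers Trainer on the Alpaca data.
```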
JasonZ2 about 2 years ago
Does anyone know how the results from a 7B-parameter model with bloomz.cpp (https://github.com/NouamaneTazi/bloomz.cpp) compare to the 7B-parameter Alpaca model with llama.cpp (https://github.com/ggerganov/llama.cpp)?
I have the latter working on an M1 MacBook Air with very good results for what it is. Curious if bloomz.cpp is significantly better or just about the same.
captaincrowbar about 2 years ago
The big problem with AI R&D is that nobody can keep up with the big-budget companies. It makes this kind of project a bit pointless. Even if you can run a GPT-3 equivalent in a web browser, how many people are going to bother (except as a stunt) when GPT-4 is available?
version_five about 2 years ago
If you have ~$100k to spend, aren't there options to buy a GPU rather than just blow it all on cloud? How much is an 8xA100 machine?
4xA100 is $75k, 8x is $140k: https://shop.lambdalabs.com/deep-learning/servers/hyperplane/customize
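(A back-of-the-envelope break-even calculation using the numbers in this thread; the $1/GPU-hour figure is the rule of thumb quoted in the article, so treat the result as rough.)

```python
# Break-even between buying an 8xA100 box and renting at the quoted cloud rate.
a100_cloud_per_gpu_hour = 1.0   # rule-of-thumb cloud price, $/GPU-hour
machine_8x_price = 140_000      # 8xA100 machine from Lambda
gpus = 8

hours_to_break_even = machine_8x_price / (gpus * a100_cloud_per_gpu_hour)
print(f"{hours_to_break_even:,.0f} GPU-box hours "
      f"≈ {hours_to_break_even / 24 / 365:.1f} years of 24/7 use to break even")
```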
munk-a about 2 years ago
A wonderful thing about software development is that there is so much reserved space for creativity that we have huge gaps between costs and value. Whether the average person could do this for $85k I'm uncertain of, but there is a very significant slice of people who could do it for well under $85k now that the groundwork has been done. This leads to the hilarious paradox where a software-based business worth millions could be built on top of code valued at around $60k to write.
thih9 about 2 years ago
> as opposed to OpenAI's continuing practice of not revealing the sources of their training data.
Looks like that choice makes the new tech more difficult to adopt, trust, or collaborate on.
What are the benefits? Is there more to it than competitive advantage? If not, ClosedAI sounds more accurate.
Tryk about 2 years ago
Why doesn't someone just start a GoFundMe/Kickstarter with the goal of funding the training of an open-source, ChatGPT-capable model?
GartzenDeHaes about 2 years ago
It's interesting to me that the LLaMA-nB models still produce reasonable results after 4-bit quantization of the 32-bit weights. Does this indicate some possibility of reducing the compute required for training?
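(A toy sketch of symmetric 4-bit quantization, loosely in the spirit of what llama.cpp does; real schemes quantize in small blocks with per-block scales, so this is only illustrative.)

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Map float32 weights to 4-bit signed integers plus one scale factor."""
    scale = np.abs(weights).max() / 7.0                     # symmetric range [-7, 7]
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096).astype(np.float32)
q, scale = quantize_4bit(w)
print("mean abs reconstruction error:", np.abs(w - dequantize(q, scale)).mean())
```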
lmeyerov about 2 years ago
It seems the quality goes up and the cost goes down significantly with Colossal AI's recent push: https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b
Their writeup makes it sound like a net 2X+ improvement over Alpaca, and that's an early run.
The browser side is interesting too. Browser JS VMs have a memory cap of 1GB, so that may ultimately be the bottleneck here...
make3 about 2 years ago
Alpaca uses knowledge distillation (it's trained on outputs from OpenAI models). It's something to keep in mind: you're teaching your model to copy another model's outputs.
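(A minimal sketch of what "training on another model's outputs" amounts to in the Alpaca setup: ordinary next-token cross-entropy on teacher-generated text. The tensors below are random stand-ins for real model outputs and tokenized teacher responses.)

```python
import torch
import torch.nn.functional as F

vocab, seq_len, batch = 32_000, 128, 4
student_logits = torch.randn(batch, seq_len, vocab, requires_grad=True)  # stand-in for student output
teacher_tokens = torch.randint(0, vocab, (batch, seq_len))               # tokens generated by the teacher

loss = F.cross_entropy(student_logits.view(-1, vocab), teacher_tokens.view(-1))
loss.backward()  # the student is pushed to imitate whatever the teacher wrote
```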
brrrrrmabout 2 years ago
The WebGPU demo mentioned in this post is insane. Blows any WASM approach out of the water. Unfortunately that performance is not supported anywhere but chrome canary (behind a flag)
fzliu about 2 years ago
I was a bit skeptical about loading a _4GB_ model at first. Then I double-checked: Firefox is using about 5GB of memory for me. My current open tabs are mail, calendar, a couple of Google Docs, two arXiv papers, two blog posts, two YouTube videos, the milvus.io documentation, and chat.openai.com.
A lot of applications and developers these days take memory management for granted, so embedding a 4GB model to significantly enhance coding and writing capabilities doesn't seem too far-fetched.
astlouis44 about 2 years ago
WebGPU is going to be a major component in this. The modern GPUs prevalent in mobile devices, desktops, and laptops are more than enough to do all of this client-side.
agnokapathetic about 2 years ago
> My friends at Replicate told me that a simple rule of thumb for A100 cloud costs is $1/hour.
AWS charges $32/hr for 8xA100 (p4d.24xlarge), which comes out to $4/hour/GPU. Yes, you can get lower pricing with a 3-year reservation, but that's not what this question is asking.
You also need 256 nodes to be colocated on the same fabric -- which AWS will do for you, but only if you reserve for years.
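(A quick sketch of how the headline estimate moves with the per-GPU-hour price, using the $82,432 training figure from the article and the on-demand rate quoted above; rough assumptions, not a quote.)

```python
# The article's $82,432 training estimate assumes ~$1 per A100-hour.
training_estimate = 82_432            # dollars, at $1/GPU-hour
aws_on_demand_per_gpu_hour = 32 / 8   # p4d.24xlarge: $32/hr for 8 A100s

gpu_hours = training_estimate / 1.0
print("GPU-hours implied by the estimate:", gpu_hours)
print("same run at AWS on-demand pricing: $%.0f" % (gpu_hours * aws_on_demand_per_gpu_hour))
```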
d4rkp4ttern about 2 years ago
Everyone seems to assume that all the "tricks" behind training ChatGPT are known. The only clues are in papers from ClosedAI like the InstructGPT paper, so we assume there is supervised fine-tuning, then reward modeling, and finally RLHF.
But there are most likely other tricks that ClosedAI has not published. These probably took years of R&D to come up with, and others trying to replicate ChatGPT would need to come up with them on their own.
Also, curiously, the app was released in late 2022 while the knowledge cutoff is 2021. One hypothesis I had for why is that they wanted to keep the training data fixed while they iterated on methods, hyperparameter tuning, etc. All of this is, unfortunately, a defensive moat that ClosedAI has.
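(A minimal sketch of the one piece that is publicly documented: the pairwise reward-model loss from the InstructGPT paper. The linear reward_model and random embeddings below are toy stand-ins, not anyone's real pipeline.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

reward_model = nn.Linear(768, 1)   # maps a response embedding to a scalar reward

chosen = torch.randn(8, 768)       # embeddings of human-preferred responses
rejected = torch.randn(8, 768)     # embeddings of dispreferred responses

r_chosen = reward_model(chosen)
r_rejected = reward_model(rejected)
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()                    # reward model learns to rank preferred responses higher
```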
pavelstoev about 2 years ago
Training a ChatGPT-beating model for much less than $85,000 is entirely feasible. At CentML, we're actively working on model training and inference optimization without affecting accuracy, which can help reduce costs and make such ambitious projects realistic. By maximizing (>90%) GPU and platform hardware utilization, we aim to bring down the expenses associated with large-scale models, making them more accessible for various applications. Additionally, our solutions have a positive environmental impact, addressing excess CO2 concerns. If you're interested in learning more about how we are doing it, please reach out via our website: https://centml.ai
nwoli about 2 years ago
What we need is a RETRO-style model where, after the input, you go through a small net that just fetches the desired set of weights from a server (serving data without compute is dirt cheap), and those weights are then executed locally. We'll get there eventually.
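(A very loose sketch of that idea: a small local router picks which weight shard to fetch, and only that shard runs locally. The in-memory shard_server dict stands in for a remote store; everything here is hypothetical.)

```python
import torch
import torch.nn as nn

router = nn.Linear(256, 4)  # small local routing net
shard_server = {i: nn.Linear(256, 256).state_dict() for i in range(4)}  # stand-in for a remote weight store

def answer(x: torch.Tensor) -> torch.Tensor:
    shard_id = int(router(x).argmax())               # decide which weight shard is needed
    expert = nn.Linear(256, 256)
    expert.load_state_dict(shard_server[shard_id])   # "download" just that shard
    return expert(x)                                 # execute locally

print(answer(torch.randn(1, 256)).shape)
```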
breck about 2 years ago
Just want to say SimonW has become one of my favorite writers covering the AI revolution. Always fun thought experiments with linked code and very constructive for people thinking about how to make this stuff more accessible to the masses.
skybrian about 2 years ago
I wonder why anyone would want to run it in a browser, other than to show it could be done? It's not like the extra latency would matter, since these things are slow.
Running it on a server you control makes more sense. You can pick appropriate hardware for running the AI, then access it from any browser you like, including from your phone, and switch devices whenever you like. It won't use up all the CPU/GPU on a portable device and run down your battery.
If you want to run the server at home, maybe use something like Tailscale?
jedberg about 2 years ago
With the explosion of LLMs and people figuring out ways to train/use them relatively cheaply, unique data sets will become that much more valuable, and will be the key differentiator between LLMs.
Interestingly, it seems like companies that run chat programs where they can read the chats are best suited to building "human conversation" LLMs, but anyone who manages large text datasets for others is in the perfect place to "win" the LLM battle.
fswd about 2 years ago
There is somebody fine-tuning a 160M RWKV-4 model on Alpaca on the RWKV Discord. I'm out of the office and can't link it, but the person posted in the prompt-showcase channel.
nope96 about 2 years ago
I remember watching one of the final episodes of Connections 3: With James Burke, and he casually said we&#x27;d have personal assistants that we could talk to (in our PDAs). That was 1997 and I knew enough about computers to think he was being overly optimistic about the speed of progress. Not in our lifetimes. Guess I was wrong!
alecco about 2 years ago
Interesting blog, but the extrapolations are way overblown. I tried one of the 30B models and it's not even remotely close to GPT-3.
Don't get me wrong, this is very interesting and I hope more is done in the open models. But let's not over-hype by 10x.
gessha about 2 years ago
We need a DAWNBench* benchmark for training ChatGPT the fastest and cheapest.
* https://dawn.cs.stanford.edu/benchmark/
ushakov about 2 years ago
Now imagine loading 3.9 GB each time you want to interact with a webpage
cavisne about 2 years ago
There is a minimum cluster size needed to get good utilization of the GPUs. $1 an hour per chip might get you one A100, but it won't get you hundreds clustered together.
ChumpGPT about 2 years ago
I'm not so smart and I don't understand a lot about ChatGPT, etc., but could there be a client-side app like Folding@home that would allow millions of people to donate processing power to train an LLM?
v4dok about 2 years ago
Can someone at the EU, the only player in this thing with no strategy yet, just pool together enough resources so the open-source people can train models? We don't ask for much, just compute power.
TMWNN about 2 years ago
Hey, that means it can be turned into an Electron app!
ultrablack about 2 years ago
If you could, you should have done it 6 months ago.
rspoerri about 2 years ago
So cool it runs in a browser /sarcasm/. I might not even need a computer, or internet while we're at it.
It either runs locally or it runs in the cloud. Data could come from both locations as well. So it's mostly technically irrelevant whether it's displayed in a browser or not.
Except when it comes to usability. I don't get why people love software running in a browser. I often close important tools I haven't saved when they're in a browser. I can't have offline tools that work when I'm in a tunnel (living in Switzerland, this is an issue). Or it's incompatible because I'm running LibreWolf.
/sorry to be nitpicking on this topic ;-)