Ask HN: Which GPU is good for AI LLMs?

4 points | by roschdal | about 1 year ago
Which GPU is good for running AI LLMs?

3 comments

ActorNightly | about 1 year ago
The general go-to is a pair of Nvidia cards with 24 GB of VRAM apiece. That should be enough to run something like Mixtral 8x7B in 8-bit precision, which is good enough. That said, a single 24 GB card is fine for 4-bit precision models if you're using it for basic coding assistance.

If you're interested in inference only, not training, it's not really worth it to invest in cards; use the online inference tools. And for training, even a pair of 4090s isn't going to get you far without a good CPU and lots of RAM to keep the cards fed as much as possible.
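A quick back-of-envelope check of the sizing logic above: the weights alone take roughly (parameter count × bits per weight ÷ 8) bytes, before any KV cache or activation overhead. A minimal sketch in Python, assuming ~47B total parameters for Mixtral 8x7B (all experts stay resident even though only two are active per token); the figures are illustrative, not measured:

```python
def weight_vram_gib(params_billion: float, bits_per_weight: int) -> float:
    """Rough GiB needed to hold the weights alone; KV cache and activations add more."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

# Mixtral 8x7B total parameter count (~47B) is an assumed figure for illustration.
print(f"Mixtral 8x7B, 8-bit: ~{weight_vram_gib(47, 8):.0f} GiB")  # ~44 GiB -> roughly a pair of 24 GB cards
print(f"Mixtral 8x7B, 4-bit: ~{weight_vram_gib(47, 4):.0f} GiB")  # ~22 GiB -> a single 24 GB card, just
print(f"13B model,    4-bit: ~{weight_vram_gib(13, 4):.0f} GiB")  # ~6 GiB  -> fits easily for coding assistance
```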
ClassyJacket | about 1 year ago
You typically want as much VRAM as possible for this type of application.

For example, Llama has versions that take 32 GB of VRAM, even after quantization (compression):

https://old.reddit.com/r/LocalLLaMA/comments/1806ksz/information_on_vram_usage_of_llm_model/ka72kgc/

There are smaller versions too, however, if you're VRAM-constrained.
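If VRAM is the constraint, the usual way to apply that kind of quantization in practice is to load the model in 4-bit at load time. A minimal sketch, assuming the Hugging Face transformers + bitsandbytes stack and an illustrative model ID (not one taken from the thread):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-13b-chat-hf"  # illustrative; pick whatever fits your card

# Request 4-bit (NF4) quantized weights so a 13B model fits comfortably in 24 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs if one card isn't enough
)

prompt = "Which GPU is good for running AI LLMs?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```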
pizza | about 1 year ago
A 4090 or 3090 is a reasonable choice; you want the GPUs with a lot of VRAM. I've seen a lot of people running 2-3 used 3090s for reasonable-ish prices (i.e. under $2k USD). If you want higher speed, go for 4090s, provided you can afford it and your computer can handle the wattage (though you can always limit the wattage on your GPUs to reduce power draw for a fairly minimal speed hit).
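The wattage-limiting aside at the end maps onto nvidia-smi's power-limit setting. A minimal sketch via Python's subprocess; the 300 W target is an arbitrary example, and the commands need administrator privileges:

```python
import subprocess

# Cap each GPU's board power; inference throughput usually drops far less than the wattage does.
subprocess.run(["nvidia-smi", "-pm", "1"], check=True)    # enable persistence mode so the limit sticks
subprocess.run(["nvidia-smi", "-pl", "300"], check=True)  # set the power limit to 300 W (example value)
```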