
Ask HN: What local machines are people using to train LLMs?

56 points | by Exorust | over 1 year ago
How are people building local rigs to train LLMs?

4 comments

malux85 | over 1 year ago
I don’t train LLMs from scratch, but I have:

- 3x 4090s
- 1x Tesla A100

Lots of fine tuning, attention visualisation, evaluation of embeddings and different embedding generation methods. Not just LLMs, though I use them a lot for deep nets of many kinds.

Both for my day job (hedge fund) and my hobby project https://atomictessellator.com

It’s summer here in NZ and I have these in servers mounted in a freestanding server rack beside my desk, and it is very hot in here XD
rgbrgb | over 1 year ago
Some people have been fine-tuning Mistral 7B and phi-2 on their high-end Macs. Unified memory is a hell of a thing. The resulting model here is not spectacular, but as a proof of concept it's pretty exciting what you get in 3.5 hours on a consumer machine.

- Apple M2 Max, 64GB shared RAM
- Apple Metal (GPU), 8 threads
- 1152 iterations (3 epochs), batch size 6, trained over 3 hours 24 minutes

https://www.reddit.com/r/LocalLLaMA/comments/18ujt0n/using_gpus_on_a_mac_m2_max_via_mlx_update_on/
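[Editor's note: for readers wanting to try something similar, here is a minimal, hypothetical sketch of the same consumer-scale recipe, using Hugging Face transformers + peft (LoRA) as a stand-in for the MLX toolchain the linked post used. The model name and hyperparameters are illustrative, not taken from the thread.]

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"  # assumption: any 7B-class model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
)

# LoRA trains small low-rank adapter matrices instead of the full 7B
# weights, which is what makes a 64GB consumer machine viable at all.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of total params
```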
buildbot | over 1 year ago
A self-built machine with dual 4090s, soon to be 3x. Watercooled for quieter operation.

Did the math on how much using runpod per day would cost, and bought this setup instead.

Using Fully Sharded Data Parallel (FSDP) and bfloat16, I can train a 7B-param model very slowly. But that's fine for only going 2000 steps!
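[Editor's note: a minimal sketch of what that FSDP + bfloat16 setup might look like in PyTorch, assuming a multi-GPU host launched with torchrun. The model-building helper is a placeholder, not code from the thread.]

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision

# Launch with: torchrun --nproc_per_node=2 train.py
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank())  # single-node: rank == local rank

bf16 = MixedPrecision(
    param_dtype=torch.bfloat16,
    reduce_dtype=torch.bfloat16,
    buffer_dtype=torch.bfloat16,
)

model = build_7b_model()  # placeholder: any ~7B-parameter transformer

# FSDP shards parameters, gradients, and optimizer state across GPUs,
# so each card holds only a fraction of the full training state.
model = FSDP(model.cuda(), mixed_precision=bf16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```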
bearjaws | over 1 year ago
I doubt many people are using local setups for serious work.

Even fine-tuning Mixtral takes 4x H100 for 4 days, which is a ~$200k server currently.

To fully train, not just fine-tune, even a small model like Llama 2 7B, you need over 128 GiB of VRAM, so it is still multiple-GPU territory, likely A100s or H100s.

This is all dependent on the settings you use; increase the batch size and you will see even more memory utilization.

I believe a lot of people see these models running locally and assume training is similar, but it isn't.
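[Editor's note: the 128 GiB figure is consistent with the usual rule of thumb for full training with Adam in mixed precision, roughly 16 bytes of state per parameter before activations. A quick back-of-the-envelope check; these are estimates, not measurements.]

```python
params = 7e9  # Llama 2 7B

# bf16 weights + bf16 grads + fp32 master weights + Adam m + Adam v
bytes_per_param = 2 + 2 + 4 + 4 + 4  # = 16 bytes

state_gib = params * bytes_per_param / 2**30
print(f"weight/optimizer state alone: ~{state_gib:.0f} GiB")  # ~104 GiB

# Activations scale with batch size and sequence length and easily add
# tens of GiB more, which is how a full 7B train clears 128 GiB of VRAM.
```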