
INTELLECT-1: Launching the First Decentralized Training of a 10B Parameter Model

111 points by jasondavies 7 months ago | 16 comments

PoignardAzur 7 months ago

A lot of comments are sneering at various aspects of this press release, and yeah, there's some cringeworthy stuff. But the technical aspects are pretty cool:

- Fault-tolerant training where nodes can be added and removed mid-run without interrupting the other nodes.

- Sending quantized gradients during the synchronization phase.

- (In the OpenDiLoCo article) Async synchronization.

They also mention potential trustless systems where everyone can contribute compute, which would make this a truly decentralized, open platform. Overall it'll be pretty interesting to see where this goes!

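As a concrete picture of that synchronization phase, here is a minimal single-process sketch of a DiLoCo-style round, assuming PyTorch; the function name, the value of H, and the MSE objective are illustrative stand-ins, not Prime Intellect's actual code:

```python
# Minimal sketch of a DiLoCo-style round (illustrative names and objective;
# not Prime Intellect's actual code). Backprop stays entirely local: only a
# "pseudo-gradient", the drift of the local weights over the round, is
# exchanged, once per H inner steps, and it can be quantized before sending.
import copy

import torch
import torch.nn.functional as F

def diloco_round(model, inner_opt, data_iter, H=100):
    anchor = copy.deepcopy(model.state_dict())  # weights at the start of the round
    for _ in range(H):                          # H purely local optimizer steps
        x, y = next(data_iter)
        loss = F.mse_loss(model(x), y)
        inner_opt.zero_grad()
        loss.backward()                         # gradients never cross the network
        inner_opt.step()
    # The pseudo-gradient is the only thing synchronized across nodes.
    return {k: anchor[k] - v for k, v in model.state_dict().items()}
```

In a real run, each node would all-reduce (average) these pseudo-gradients and apply the result to the shared anchor weights with an outer optimizer (the DiLoCo paper uses SGD with Nesterov momentum), which is roughly why nodes can join or leave between rounds without disturbing the others.
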
oefrha 7 months ago

Well, I don't have 8x H100s, but if I did, I'm probably not gonna donate them to a VC-funded company. Remember "Open"AI?

https://pitchbook.com/profiles/company/588977-92

ukuina 7 months ago

> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.

So, your garden-variety $0.5M desktop PC, then.

Cool, cool.

[1] https://viperatech.com/shop/nvidia-dgx-h100-p4387-system-640gb/

ikeashark 7 months ago

me: Oh cool, a project like Folding@Home but for AI compute, maybe I'll contribute as we-

> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.

me: and for that reason, I'm out

Also, they state that they will later add the ability for you to contribute your own compute, but how will they solve the problem of having to back-propagate to all of the remote nodes contributing to the project without egregiously slow training times?

macrolime 7 months ago

Not exactly what I would call decentralized training. More like distributed across multiple data centers.

Decentralized training would be when you can use consumer GPUs, but that's not likely to work with backpropagation directly; maybe with one of the backpropagation-approximating algorithms.

m3kw9 7 months ago

But I can already train with 30 different vendors distributed across the US, so why do I need to use a "decentralized" training system? Decentralized inferencing makes more sense, as that is where things can be censored.

dmitrygr 7 months ago

> solve decentralized training step-by-step to ensure AGI will be open-source, transparent, and accessible

One hell of an uncited leap from "we're multiplying a lot of numbers" to "AGI", as if it is a given.

mountainriver 7 months ago

This is cool work. I've been watching the slow evolution of this space for a couple of years, and it feels like a good way to ensure AI is owned by, and accessible to, everyone.

openrisk 7 months ago

For some purposes a decentrally trained, open-source LLM could be just fine? E.g. you want a stochastic parrot that is trained on a large, general-purpose corpus of genuine public domain / creative commons content. Having such a tool widely available is still a quantum leap versus Lorem Ipsum. Up to a point you can take your time; there is no manic race to capitalize on any hype. "Slow open AI" instead of "fast closed AGI". Helpfully, the nature of the target corpus does not change every day. You can imagine, e.g., annual revisions, trained and rolled out leisurely. Both costs and benefits get widely distributed.

James_K 7 months ago

My initial reaction was quite negative, but having thought it through, I can see the logic in this. Having open models is better than closed models. That said, this page seems like a joke. Someone drank a little too much AI Kool-Aid, methinks.

not_a_dane 7 months ago

Decentralised, but with a very high entry barrier.

nickpsecurity 7 months ago

The main benefit of this type of decentralization seems to be minimizing node cost. One can rent the cheapest nodes to use in the system, and even temporary instances can be replaced with others. It's also easy for system owners to donate time.

So, mostly cost reduction mixed with some cloud and vendor diversity.

pizza 7 months ago

So, just spitballing here, but this is likely a souped-up, reverse-engineered DisTrO [0] under the hood, right? Or could it be something else?

[0] https://www.youtube.com/watch?v=eLMJoCSjFbs

mt_ 7 months ago

> We quantize the pseudo-gradients to int8, reducing communication requirements by 400x.

Can someone explain whether this reduces overall model quality?

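For intuition, here is a sketch (under assumptions: per-tensor symmetric quantization, not necessarily their exact scheme) that round-trips a stand-in pseudo-gradient through int8 and measures the error:

```python
# Round-trip a stand-in pseudo-gradient through symmetric per-tensor int8
# quantization and measure the error (illustrative sketch only).
import torch

def quantize_int8(t: torch.Tensor):
    scale = t.abs().max().clamp(min=1e-12) / 127.0  # per-tensor symmetric scale
    q = torch.round(t / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def dequantize_int8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

g = torch.randn(1_000_000)            # stand-in for a real pseudo-gradient
q, scale = quantize_int8(g)           # 1 byte per element on the wire, not 4
err = dequantize_int8(q, scale) - g
print((err.abs().max() / g.abs().max()).item())  # ~ 1/254, about 0.4%
```

Worst-case rounding error is half a quantization step, roughly 0.4% of the tensor's largest value. Also note that int8 alone only buys 4x over fp32; the 400x figure presumably also counts synchronizing once every few hundred inner steps rather than every step, per the DiLoCo recipe. Whether the added noise measurably hurts final loss is an empirical question their ablations would have to answer.
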
monkeydust 7 months ago

Yea, come back when you can do this on BOINC.

saulrh 7 months ago

> Prime Intellect

Ah, yes, Prime Intellect, the AGI that went foom and genocided the universe because it was commanded to preserve human civilization without regard for human values. A strong contender for the least evil hostile superintelligence in fiction. What a wonderful thing to name your AI startup after. What's next, creating the Torment Nexus?

(my position on the book *as a whole* is more complex, but... really? *Really?*)