INTELLECT–1: Launching the First Decentralized Training of a 10B Parameter Model

111 points by jasondavies 7 months ago

16 comments

PoignardAzur 7 months ago
A lot of comments are sneering at various aspects of this press release, and yeah, there's some cringeworthy stuff.

But the technical aspects are pretty cool:

- Fault-tolerant training where nodes can be added and removed mid-run without interrupting the other nodes.

- Sending quantized gradients during the synchronization phase.

- (In the OpenDiLoCo article) Async synchronization.

They're also mentioning potential trustless systems where everyone can contribute compute, which would make this a truly decentralized open platform. Overall it'll be pretty interesting to see where this goes!
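One way to picture the synchronization phase in the list above is the local-update scheme used in DiLoCo-style training: each node takes several optimizer steps on its own, then the nodes exchange only the difference between their starting and ending parameters (the pseudo-gradient). Below is a minimal toy sketch, assuming plain SGD for both the inner and outer steps and a one-parameter quadratic loss per node. The function names, targets, and hyperparameters are hypothetical and are not taken from the INTELLECT-1 code.

```python
def local_steps(theta, target, steps, lr):
    """Run plain gradient descent on the toy loss 0.5 * (theta - target)**2."""
    for _ in range(steps):
        theta -= lr * (theta - target)
    return theta


def outer_round(theta_global, node_targets, inner_steps=10, inner_lr=0.1, outer_lr=1.0):
    """One synchronization round over whichever nodes are currently present."""
    pseudo_grads = []
    for target in node_targets:
        theta_local = local_steps(theta_global, target, inner_steps, inner_lr)
        # Pseudo-gradient: how far this node moved away from the shared weights.
        pseudo_grads.append(theta_global - theta_local)
    avg = sum(pseudo_grads) / len(pseudo_grads)
    # Outer step: plain SGD on the averaged pseudo-gradient (a simplification).
    return theta_global - outer_lr * avg


theta = 0.0
for _ in range(10):
    theta = outer_round(theta, [1.0, 2.0, 6.0])  # three nodes participating
theta_three_nodes = theta
for _ in range(10):
    theta = outer_round(theta, [1.0, 2.0])       # one node has left mid-run
theta_two_nodes = theta
```

In this toy setup the shared parameter settles near the mean of the targets of whichever nodes are present, so a node leaving between rounds changes the result without any restart. Communication happens once per round rather than once per gradient step.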
oefrha 7 months ago
Well, I don’t have 8xH100s, but if I did, I’m probably not gonna donate them to a VC-funded company. Remember “Open”AI?

https://pitchbook.com/profiles/company/588977-92
ukuina 7 months ago
> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.

So, your garden-variety $0.5M desktop PC, then.

Cool, cool.

[1] https://viperatech.com/shop/nvidia-dgx-h100-p4387-system-640gb/
ikeashark 7 months ago
me: Oh cool, a project like Folding@Home but for AI compute, maybe I'll contribute as we-

> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.

me: and for that reason, I'm out

Also, they state that they will later add the ability for you to contribute your own compute, but how will they solve the problem of having to back-propagate across all of the remote nodes contributing to the project without egregiously slow training times?
macrolime 7 months ago
Not exactly what I would call decentralized training. More like training distributed across multiple data centers.

Decentralized training would be when you can use consumer GPUs. That's not likely to work with backpropagation directly, but it might with one of the backpropagation-approximating algorithms.
m3kw9 7 months ago
But I can already train with 30 different vendors distributed across the US, so why do I need to use a “decentralized” training system? Decentralized inference makes more sense, as that is where things can be censored.
dmitrygr 7 months ago
> solve decentralized training step-by-step to ensure AGI will be open-source, transparent, and accessible

One hell of an uncited leap from "we're multiplying a lot of numbers" to "AGI", as if it is a given
mountainriver 7 months ago
This is cool work. I’ve been watching the slow evolution of this space for a couple of years, and it feels like a good way to ensure AI is owned by and accessible to everyone.
openrisk 7 months ago
For some purposes a decentrally trained, open source LLM could be just fine? E.g. you want a stochastic parrot that is trained on a large, general purpose corpus of genuine public domain / creative commons content. Having such a tool widely available is still a quantum leap versus Lorem Ipsum. Up to a point you can take your time. There is no manic race to capitalize on any hype. "slow open AI" instead of "fast closed AGI". Helpfully, the nature of the target corpus does not change every day. You can imagine, e.g., annual revisions, trained and rolled out leisurely. Both costs and benefits get widely distributed.
James_K 7 months ago
My initial reaction was quite negative, but having thought it through, I can see the logic in this. Having open models is better than closed models. That said, this page seems like a joke. Someone drank a little too much AI-koolaid methinks.
not_a_dane 7 months ago
Decentralised but very high entry barrier.
nickpsecurity 7 months ago
The main benefit of this type of decentralization seems to be minimizing the node cost. One can rent the cheapest nodes to use in the system. Even the temporary instances can be replaced with others. It’s also easy for system owners to donate time.

So, mostly cost reduction mixed with some cloud, vendor diversity.
pizza 7 months ago
So just spitballing here but this is likely a souped-up reverse engineered DisTrO [0] under the hood, right? Or could it be something else?

[0] https://www.youtube.com/watch?v=eLMJoCSjFbs
mt_ 7 months ago
> We quantize the pseudo-gradients to int8, reducing communication requirements by 400x.

Can someone explain whether this reduces model quality overall?
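To make the quoted step concrete, here is a minimal sketch of int8 quantization applied to a vector of pseudo-gradient values, assuming symmetric per-tensor scaling. That scheme, the helper names, and the synthetic values are assumptions for illustration; the thread does not say which quantization scheme the project uses. The sketch only shows the per-element rounding error and the byte savings; it does not measure any effect on model quality.

```python
import random


def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


random.seed(0)
# Synthetic stand-in for a pseudo-gradient tensor (hypothetical values).
pseudo_grad = [random.gauss(0.0, 0.01) for _ in range(10_000)]

quantized, scale = quantize_int8(pseudo_grad)
restored = dequantize_int8(quantized, scale)

# Each element is off by at most half a quantization step.
max_err = max(abs(a - b) for a, b in zip(pseudo_grad, restored))

bytes_fp32 = 4 * len(pseudo_grad)      # 32-bit floats
bytes_int8 = 1 * len(quantized) + 4    # int8 values plus one fp32 scale
```

Going from 32-bit floats to int8 accounts for roughly a 4x reduction in bytes by itself, so a larger overall figure would have to come from something else as well, such as synchronizing less often.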
monkeydust 7 months ago
Yea, come back when you can do this on BOINC.
saulrh 7 months ago
> Prime Intellect

Ah, yes, Prime Intellect, the AGI that went foom and genocided the universe because it was commanded to preserve human civilization without regard for human values. A strong contender for the least evil hostile superintelligence in fiction. What a wonderful thing to name your AI startup after. What's next, creating the Torment Nexus?

(my position on the book *as a whole* is more complex, but... really? *Really?*)