TechEcho

16 comments

PoignardAzur 7 months ago

A lot of comments are sneering at various aspects of this press release, and yeah, there's some cringeworthy stuff.

But the technical aspects are pretty cool:

- Fault-tolerant training where nodes can be added and removed mid-run without interrupting the other nodes.
- Sending quantized gradients during the synchronization phase.
- (In the OpenDiLoCo article) Async synchronization.

They're also mentioning potential trustless systems where everyone can contribute compute, which would make this a truly decentralized open platform. Overall it'll be pretty interesting to see where this goes!
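To make the fault-tolerance point concrete, here is a toy sketch of a DiLoCo-style outer loop in which the set of workers changes between synchronization rounds. Everything in it is an assumption for illustration: the scalar "model", the quadratic loss, plain SGD for both the inner and outer steps, and the names `local_steps` and `outer_round`. It is not the project's actual code.

```python
def local_steps(theta, target, lr=0.1, steps=20):
    """Inner loop: plain gradient descent on the toy loss (theta - target)**2."""
    for _ in range(steps):
        theta -= lr * 2 * (theta - target)
    return theta


def outer_round(theta_global, worker_targets, outer_lr=1.0):
    """One synchronization round over whichever workers are present."""
    # Each worker trains locally, then reports only its pseudo-gradient:
    # the difference between the shared start point and its local result.
    pseudo_grads = [
        theta_global - local_steps(theta_global, target)
        for target in worker_targets
    ]
    # Averaging over the workers that showed up is what lets membership
    # change between rounds without stopping anyone else.
    avg = sum(pseudo_grads) / len(pseudo_grads)
    return theta_global - outer_lr * avg


theta = 0.0
# Hypothetical membership per round: a worker joins in round 2,
# another drops out in round 3, and it rejoins in round 4.
membership = [[1.0, 3.0], [1.0, 3.0, 5.0], [3.0, 5.0], [1.0, 3.0, 5.0]]
for targets in membership:
    theta = outer_round(theta, targets)

print(theta)  # settles near the mean of the last round's targets
```

The point of the sketch is only that workers communicate once per round rather than once per gradient step, and that the average is taken over whoever is present.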

oefrha 7 months ago

Well, I don’t have 8xH100s, but if I did, I’m probably not gonna donate them to a VC-funded company. Remember “Open”AI?

https://pitchbook.com/profiles/company/588977-92

ukuina 7 months ago

> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.

So, your garden-variety $0.5M desktop PC, then.

Cool, cool.

[1] https://viperatech.com/shop/nvidia-dgx-h100-p4387-system-640gb/

ikeashark 7 months ago

me: Oh cool, a project like Folding@Home but for AI compute, maybe I'll contribute as we-

> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.

me: and for that reason, I'm out

Also, they state that later they will add the ability for you to contribute your own compute, but how will they solve the problem of having to back-propagate to all of the remote nodes contributing to the project without egregiously slow training times?

macrolime 7 months ago

Not exactly what I would call decentralized training. More like distributed across multiple data centers.

Decentralized training would be when you can use consumer GPUs. That's not likely to work with backpropagation directly, but maybe with one of the backpropagation-approximating algorithms.

m3kw9 7 months ago

But I can already train with 30 different vendors distributed across the US, so why do I need to use a “decentralized” training system? Decentralized inferencing makes more sense, as that is where things can be censored.

dmitrygr 7 months ago

> solve decentralized training step-by-step to ensure AGI will be open-source, transparent, and accessible

One hell of an uncited leap from "we're multiplying a lot of numbers" to "AGI", as if it is a given

mountainriver 7 months ago

This is cool work. I’ve been watching the slow evolution of this space for a couple of years, and it feels like a good way to ensure AI is owned by and accessible to everyone.

openrisk 7 months ago

For some purposes a decentrally trained, open source LLM could be just fine? E.g. you want a stochastic parrot that is trained on a large, general purpose corpus of genuine public domain / creative commons content. Having such a tool widely available is still a quantum leap versus Lorem Ipsum. Up to a point you can take your time. There is no manic race to capitalize on any hype. "Slow open AI" instead of "fast closed AGI". Helpfully, the nature of the target corpus does not change every day. You can imagine, e.g., annual revisions, trained and rolled out leisurely. Both costs and benefits get widely distributed.

James_K 7 months ago

My initial reaction was quite negative, but having thought it through, I can see the logic in this. Having open models is better than closed models. That said, this page seems like a joke. Someone drank a little too much AI-koolaid methinks.

not_a_dane 7 months ago

Decentralised but very high entry barrier.

nickpsecurity 7 months ago

The main benefit of this type of decentralization seems to be minimizing the node cost. One can rent the cheapest nodes to use in the system. Even the temporary instances can be replaced with others. It’s also easy for system owners to donate time.

So, mostly cost reduction mixed with some cloud and vendor diversity.

pizza 7 months ago

So just spitballing here, but this is likely a souped-up reverse-engineered DisTrO [0] under the hood, right? Or could it be something else?

[0] https://www.youtube.com/watch?v=eLMJoCSjFbs

mt_ 7 months ago

> We quantize the pseudo-gradients to int8, reducing communication requirements by 400x.

Can someone explain whether this reduces the model quality overall?
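As a rough illustration of what int8 quantization does to a pseudo-gradient, here is a minimal sketch using symmetric per-tensor scaling. The scheme, the helper names `quantize_int8` and `dequantize_int8`, the random test data, and the assumption that the 400x figure combines a 4x saving (fp32 to int8) with syncing only once every 100 steps are all illustrative guesses, not details confirmed by the quoted announcement.

```python
import random


def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


random.seed(0)
# Stand-in for a pseudo-gradient tensor (made-up data).
pseudo_grad = [random.gauss(0.0, 0.01) for _ in range(10_000)]

q, scale = quantize_int8(pseudo_grad)
restored = dequantize_int8(q, scale)

# Rounding error per element is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(pseudo_grad, restored))
print(max_error <= scale / 2 + 1e-12)

# Assumed bandwidth arithmetic: 4 bytes -> 1 byte, synced every 100 steps.
bytes_fp32, bytes_int8 = 4, 1
sync_every = 100  # assumption, not stated in the quoted text
reduction = (bytes_fp32 / bytes_int8) * sync_every
print(reduction)
```

The sketch only bounds the per-element rounding error. Whether that error affects final model quality is an empirical question it cannot answer.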

monkeydust 7 months ago

Yea, come back when you can do this on BOINC.

saulrh 7 months ago

> Prime Intellect

Ah, yes, Prime Intellect, the AGI that went foom and genocided the universe because it was commanded to preserve human civilization without regard for human values. A strong contender for the least evil hostile superintelligence in fiction. What a wonderful thing to name your AI startup after. What's next, creating the Torment Nexus?

(my position on the book as a whole is more complex, but... really? Really?)

INTELLECT–1: Launching the First Decentralized Training of a 10B Parameter Model