The article doesn't note this, but wafer-scale integration is a very old idea. We discussed it at Inmos back in the day, since the systems we built often consisted of many CPU die sliced out of the wafer, bonded into packages, then tiled onto a PCB [1]. But there are... issues: cooling, for one. Iann Barron joked that you could make a toaster from two WSI wafers running full-tilt.

[1] https://twitter.com/tnmoc/status/429638751904878592
That sounds great... until you get a defect in your networking block.

Then what? You've got a heterogeneous network with tons of "this core to this core is not like the others" exceptions (latency, bandwidth, etc.).

I know chip-to-chip/memory interconnects burn a ton of power, but fabbing discrete "biggest chip we can get with decent yield" parts still seems a solid tradeoff given the reality of <100% yields.

Does anyone have a link or search phrases for how this is currently handled at high chiplet counts? E.g. interconnect routing architectures that remain reasonable with random manufacturing-time link failures.
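Not a literature pointer, but search phrases that turn up this area include "fault-tolerant NoC routing" and "defect-tolerant network-on-chip". The basic trick can be sketched in a few lines: treat the wafer-test defect map as dead edges in the mesh graph and route around them. This is a minimal illustration, not anyone's actual architecture; all names and the topology (2D mesh, known dead links) are assumptions for the sketch:

```python
from collections import deque

def route(width, height, src, dst, dead_links):
    """Illustrative sketch: BFS shortest path on a 2D-mesh NoC,
    skipping links known (from manufacturing test) to be dead.
    Nodes are (x, y) tuples; dead_links is a set of frozensets
    holding node pairs."""
    def neighbors(node):
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and frozenset({node, nxt}) not in dead_links):
                yield nxt

    prev, frontier = {src: None}, deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk predecessors back to src
                path.append(node)
                node = prev[node]
            return path[::-1]
        for n in neighbors(node):
            if n not in prev:
                prev[n] = node
                frontier.append(n)
    return None  # dst unreachable: that core gets fused off entirely

# Hypothetical defect map from wafer test: two dead links around (1, 1).
dead = {frozenset({(1, 1), (2, 1)}), frozenset({(1, 1), (1, 2)})}
print(route(4, 4, (0, 1), (3, 1), dead))
# [(0,1), (1,1), (1,0), (2,0), (3,0), (3,1)] -- a longer but valid detour
```

Real NoCs do this with table-based or turn-model routing in hardware rather than BFS in software, but the consequence the parent worries about shows up either way: the detoured pairs see extra hops, so latency/bandwidth become non-uniform across "identical" cores.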
Now that Cerebras has proven it works, I would love to see x86 / ARM / Nvidia players do this. And for best results, bring one of the memory makers onboard as well. Cerebras seems to have underestimated the memory requirements of LLMs. So imagine 16 H200-class GPUs along with single-digit TB of HBM stitched together on a single substrate wafer. It seems doable with the right technology.

Go for it, China. You are on the right track here.
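For scale, a back-of-envelope check that the single-digit-TB figure holds (141 GB of HBM3e per H200 is the published spec; the 16-GPU wafer is the hypothetical above, not a real product):

```python
# Back-of-envelope only: aggregate HBM for the hypothetical wafer above.
hbm_per_gpu_gb = 141   # H200 HBM3e capacity (published spec)
gpus_on_wafer = 16     # hypothetical configuration from this comment
total_tb = hbm_per_gpu_gb * gpus_on_wafer / 1000
print(f"{total_tb:.2f} TB aggregate HBM")  # 2.26 TB -> single-digit TB
```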
>"The latter has only been managed by Cerebras so far, but it looks like Chinese developers are looking towards them as well."<p>Cerebra's wafer has 850,000 cores which totally dwarves 1600 cores on Chinese wafer. I did read though that Cerebra cores optimized for tensor ops. Does Chinese version have more universal cores or it just way smaller clone of Cerebra?
There were experiments with wafer-scale FPGAs in the 1990s. The idea was that, being programmable, the final chip could be programmed to route around defects. Lasers were also used to eliminate defective cells.
The article talks about chiplets. I presume the wafer will still be cut into distinct chips? I thought there were thermal (and yield) reasons not to make chips that are too large.
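The yield half of that argument is usually made with a simple defect-density model. A minimal sketch using the classic Poisson yield formula Y = exp(-A·D), with an illustrative defect density I've picked rather than a real foundry figure:

```python
import math

def poisson_yield(die_area_cm2, defects_per_cm2):
    """Poisson yield model: probability a die of area A
    collects zero killer defects at defect density D."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

D = 0.1  # defects/cm^2 -- illustrative, not a foundry number
for area in (1.0, 8.0, 700.0):  # small die, ~reticle-limit die, full 300mm wafer
    print(f"{area:7.1f} cm^2 -> {poisson_yield(area, D):.2%} yield")
# ~90% for a 1 cm^2 die, ~45% near the reticle limit,
# effectively 0% for a monolithic wafer
```

Which is exactly why wafer-scale parts don't try to yield perfectly: they build in redundant cores/links and route around the defects instead of discarding the wafer.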
> China planning 1600-core chips that use an entire wafer – 'wafer-scale' designs<p>... and they will cool it by pouring water on it. /s