The general approach is called "wafer scale", and it's not new: https://www.extremetech.com/extreme/286073-building-gpus-out-of-entire-wafers-could-turbocharge-performance-efficiency

However, one of the longstanding problems is yield. A whole wafer will always have a number of defects on it; that's simply unavoidable. A wafer-scale system therefore has to be able to disable or disconnect its faulty subsystems.

Using this for AI raises the interesting possibility of "learning around" some kinds of defects, although it will still be necessary to disconnect anything with a short circuit in it.

It's also quite expensive simply to buy all that area, at least $10k per wafer. You save a bit on packaging and on building a carrier PCB for it, but not a great deal.
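For a sense of scale, here's a quick sketch using the textbook Poisson yield model Y = exp(-D0 * A). The defect density and areas are assumed round numbers for illustration, not figures from the article:

    import math

    D0 = 0.1            # assumed defect density, defects per cm^2
    die_area = 8.0      # cm^2, roughly a large conventional GPU die (assumed)
    wafer_area = 700.0  # cm^2, usable area of a 300 mm wafer (approx.)

    print(f"per-die yield:           {math.exp(-D0 * die_area):.1%}")    # ~45%
    print(f"defect-free wafer yield: {math.exp(-D0 * wafer_area):.1e}")  # ~4e-31
    print(f"expected defects/wafer:  {D0 * wafer_area:.0f}")             # ~70

A defect-free full wafer essentially never happens, so redundancy and the ability to fuse off bad regions aren't optional extras.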
> It also eats up as much electricity as all the servers contained in one and a half racks

Seems like such a large chip is going to pose thermal issues? 1.5 racks: let's generously say they mean lower-power racks and are perhaps talking about 7.5 kW, in a single chip? Seems like it would require some kind of water block with sub-zero cooling...
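Rough numbers, taking the assumption above of ~7.5 kW dissipated by a single 300 mm wafer:

    import math

    power_w = 7.5e3                               # assumed dissipation, W
    wafer_area_cm2 = math.pi * (30.0 / 2.0) ** 2  # 300 mm wafer, ~707 cm^2

    print(f"average power density: {power_w / wafer_area_cm2:.1f} W/cm^2")  # ~10.6

The average areal density is actually modest compared with a small, hot CPU die; the hard part is getting several kilowatts out of one package, which suggests a large liquid-cooled cold plate, though whether it needs to run sub-zero is less clear.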
Does anyone remember when "wafer-scale integration" was big back in the 1980s? https://en.wikipedia.org/wiki/Wafer-scale_integration