Right now, we're emulating neural networks (which in nature are largely asynchronous) on clock-driven von Neumann computers doing matrix math. I'm not a machine learning expert, but I wonder whether the power consumption of this approach will become a barrier to faster, more efficient large language models. After all, the human brain seems to run over 80 billion neurons on roughly 20 watts. Are there organisations, companies or labs working on neuromorphic CPU architectures that are better suited to running LLMs than general computing loads?
I've been pursuing a path that is decidedly edgy... and might work out great, or might be a miserable failure... the BitGrid [1]. It's dead nuts simple: a Cartesian grid of 4-bit-input, 4-bit-output LUTs, latched and clocked in two phases (like the colors on a checkerboard) to prevent race conditions. It's a Turing-complete architecture that doesn't have the routing issues of an FPGA, because there's no routing hardware in the way. But it is also nuts, because there's no routing fabric to get data rapidly across the chip.

If you can unlearn the aversion to latency that we've all had since the days of TTL and the IMSAI, you realize that you could clock an array of 16 billion cells *slowly* at 1 MHz to save power, and still get a million tokens/sec.

It's all a question of programming. (Which is where I'm stuck right now: analysis paralysis.)

[1] https://github.com/mikewarot/Bitgrid
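To make the two-phase checkerboard idea concrete, here is a minimal simulation sketch. It is not the code from the linked repo; the grid size, the random LUT contents, the zero-driven edges, and all names (gather_inputs, half_step, clock_cycle) are illustrative assumptions. It only shows the core mechanism: each cell maps a 4-bit input (one bit from each neighbor) to a 4-bit output (one bit toward each neighbor), and cells of one checkerboard color latch while the other color holds, so no cell ever reads a value that is changing in the same half-step.

```python
# Hypothetical BitGrid-style simulator sketch (not the repo's actual code).
import random

W, H = 8, 8  # toy grid size; a real chip would be millions or billions of cells

# One 16-entry LUT per cell: luts[y][x][input_nibble] -> output_nibble.
# Random contents here, just to have something to evaluate.
luts = [[[random.randrange(16) for _ in range(16)] for _ in range(W)]
        for _ in range(H)]

# Latched outputs per cell, one bit toward each neighbor: [N, E, S, W].
out = [[[0, 0, 0, 0] for _ in range(W)] for _ in range(H)]

def gather_inputs(x, y):
    """Read the bits the four neighbors drive toward cell (x, y).

    Grid edges read 0 in this sketch; a real design would wire them to I/O.
    """
    n = out[y - 1][x][2] if y > 0 else 0      # north neighbor's south-facing bit
    e = out[y][x + 1][3] if x < W - 1 else 0  # east neighbor's west-facing bit
    s = out[y + 1][x][0] if y < H - 1 else 0  # south neighbor's north-facing bit
    w = out[y][x - 1][1] if x > 0 else 0      # west neighbor's east-facing bit
    return (n << 3) | (e << 2) | (s << 1) | w

def half_step(phase):
    """Latch new outputs for every cell whose checkerboard color == phase.

    A cell's four neighbors always have the opposite parity, so the cells
    being updated only read values latched on the other phase -- no races.
    """
    for y in range(H):
        for x in range(W):
            if (x + y) % 2 == phase:
                o = luts[y][x][gather_inputs(x, y)]
                out[y][x] = [(o >> 3) & 1, (o >> 2) & 1, (o >> 1) & 1, o & 1]

def clock_cycle():
    """One full clock: update the 'black' cells, then the 'white' cells."""
    half_step(0)
    half_step(1)

for _ in range(4):
    clock_cycle()
```

The trade-off the comment describes falls out of this structure: a value crosses the chip one cell per half-step (high latency), but every cell produces a fresh result every cycle, so throughput scales with cell count rather than clock speed.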