We are in the 640KB-of-RAM stage of AI accelerators. Today's top models are measured in hundreds of billions of parameters; future models will have many trillions. We've only had GPU/AI processors large enough to run GPT-4o or Llama 3.1 405B for a few years, so it is silly to think we have already maxed out this approach.

Look at what Cerebras is doing with wafer-scale AI chips: 900,000 cores and 125 FP16 petaFLOPS on a single wafer, versus about 2.25 petaFLOPS for the most powerful Nvidia chip, a gap of roughly 55x. At worst, we've hit a local maximum.