Every year they drop a "new GPU!" announcement with absolutely no details. All their numbers are "in simulation only".

Last year it was their "Thunder" architecture being the world's fastest GPU; now it's Zeus. Neither one actually exists: https://bolt.graphics/bolt-graphics-unveils-thunder-the-worlds-fastest-graphics-processor/

All their blog posts and marketing material are just generic hype about the concept of raytracing. I don't think these guys have an actual product.
I don't get why, more than two years after the ChatGPT release moment, there isn't a plethora of high-memory matrix-matrix and matrix-vector hardware available for the high end, with both high-bandwidth and commodity DRAM options. Both operations are very well understood for the dense and sparse cases. There were FPGAs and ASICs early on, but nothing really caught on relative to the GPU behemoth, which carries tons of silicon on the die that isn't needed for basic matmul. Hence it's unbelievable how Nvidia, essentially by charging for memory, continues to be one of the highest-valued companies in the world.
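A rough back-of-envelope sketch of why the memory point matters (my numbers, purely illustrative): matrix-vector products, which dominate single-stream LLM decoding, have such low arithmetic intensity that memory bandwidth, not FLOPs, is the bottleneck, while matrix-matrix products get more compute-bound as they grow.

    # Arithmetic intensity (FLOPs per byte moved), fp16 weights assumed.
    # Illustrative only; no vendor numbers implied.

    def intensity_matvec(n, bytes_per_elem=2):
        """y = A @ x with an n x n matrix: traffic is dominated by the matrix."""
        flops = 2 * n * n                      # one multiply + one add per element
        bytes_moved = n * n * bytes_per_elem
        return flops / bytes_moved             # ~1 FLOP/byte, independent of n

    def intensity_matmul(n, bytes_per_elem=2):
        """C = A @ B with n x n matrices: reuse grows with n."""
        flops = 2 * n ** 3
        bytes_moved = 3 * n * n * bytes_per_elem
        return flops / bytes_moved             # ~n/3 FLOPs/byte -> compute-bound

    n = 8192
    print(f"matvec: {intensity_matvec(n):.2f} FLOPs/byte")
    print(f"matmul: {intensity_matmul(n):.1f} FLOPs/byte")

At ~1 FLOP/byte, a matvec engine is fed (or starved) entirely by its memory system, which is why "lots of fast memory attached to a modest matmul unit" keeps coming up as the obvious alternative design point.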
Certainly neat, especially the SFP port and Ethernet port next to the display ports.

- How much will it cost?

- How compatible will it be with existing AI assets (eg models, code for training and inference)?

- Will there have to be a translation layer between RISC and CISC (eg my CPU)? What's the performance penalty?

- Will I actually be able to get one, or is this for "enterprise" customers only, who must buy a minimum of 100 at a time?
Hard to know who this is aimed at until we see the price. I guess they are going for the “slow memory, but lots of it” market that is less sensitive to how fast their very large LLMs run as long as they run at all. Hobbyists will rejoice if they can afford it, but is there a commercial use case that can tolerate low tokens per second?
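To put "low tokens per second" in rough numbers (bandwidth and model sizes below are made-up placeholders, not Zeus specs): dense single-stream decoding streams essentially all the weights once per token, so tokens/s is roughly memory bandwidth divided by model size in bytes.

    # tokens/s ~= memory bandwidth / weight bytes, assuming dense single-stream
    # decode with 8-bit weights. All figures are illustrative assumptions.

    def tokens_per_second(bandwidth_gb_s, params_billion, bytes_per_param=1):
        weight_bytes = params_billion * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / weight_bytes

    for bw in (100, 500, 1000):          # GB/s: commodity DRAM vs HBM-class
        for params in (70, 400):         # model size in billions of parameters
            print(f"{bw:5d} GB/s, {params:3d}B params: "
                  f"~{tokens_per_second(bw, params):.1f} tok/s")

On commodity-DRAM bandwidth a 400B-parameter model lands well under a token per second, which is fine for overnight batch jobs but hard to sell for anything interactive.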