This was debunked: the agent was actually fooling the verification harness. See <a href="https://x.com/SakanaAILabs/status/1892992938013270019" rel="nofollow">https://x.com/SakanaAILabs/status/1892992938013270019</a>. One particular test that showed a 150x speedup is actually 3x <i>slower</i>.
Nvidia is doing work like this internally: <a href="https://developer.nvidia.com/blog/automating-gpu-kernel-generation-with-deepseek-r1-and-inference-time-scaling/" rel="nofollow">https://developer.nvidia.com/blog/automating-gpu-kernel-gene...</a>