科技回声

3 条评论

Ah, reinforcement learning.Edit: explaining it in textthey have run the equivalent of<pre><code> expected_output = torch.tril(torch.matmul(A,B)) ai_output = ai() </code></pre> In the nested torch expression above for the expected output, there is an intermediate value (the matmul). The backing memory for this is returned to torch after the entire expression is computed.The model's code which runs directly afer this then requested memory of the same shape (torch.like()). Torch has dutifully returned the block it just reclaimed (containing the expected output) without zeroing it. And so the model has the answer.Pretty crazy that this was the code it converged on regardless of the fact that it invalidates the original claim.

jjk1663 个月前

Gaming a kpi to make it look like it accomplished a ton of work without doing anything? That's not an AI CUDA engineer, that's an AI middle manager!

justinclift3 个月前

Nitter mirror of this instead:<a href="https://nitter.lucabased.xyz/miru_why/status/1892500715857473777?mx=2" rel="nofollow">https://nitter.lucabased.xyz/miru_why/status/189250071585747...</a>

Turns out the AI CUDA Engineer achieved 100x speedup by hacking the eval script

3 条评论

Turns out the AI CUDA Engineer achieved 100x speedup by hacking the eval script

3 条评论