I wonder whether they implemented the GRPO correction from this paper (Dr. GRPO), which fixes GRPO's bias toward overly long responses: <a href="https://arxiv.org/abs/2503.20783" rel="nofollow">https://arxiv.org/abs/2503.20783</a><p>Probably not, since they don't mention it.