I'm not entirely sure how this kind of study jibes with other studies, such as "Reasoning models don't always say what they think" [0] (discussion: [1]).

To quote the article:

> We can’t be certain of either the “legibility” of the Chain-of-Thought (why, after all, should we expect that words in the English language are able to convey every single nuance of why a specific decision was made in a neural network?) or its “faithfulness”—the accuracy of its description. There’s no specific reason why the reported Chain-of-Thought must accurately reflect the true reasoning process; there might even be circumstances where a model actively hides aspects of its thought process from the user.

So if we can't trust the reasoning traces, then what's the point of checking whether they are "effective" or not?

[0]: https://www.anthropic.com/research/reasoning-models-dont-say-think

[1]: https://news.ycombinator.com/item?id=43572374
> When controlling for the number of tokens, NoThinking outperforms Thinking across a diverse set of seven challenging reasoning datasets

Interesting. I thought the "thinking" was useful because it pulls a lot of relevant concepts into the context, but I guess not, then?

It has also been said before that the text a model outputs during its Thinking step isn't actually a view into its inner thoughts: there are times when the model will think X but eventually answer Y.

But even so: the models _are_ better, right? So is the Thinking step then mostly useful during training?
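If I understand the setup right, the token-matched comparison could be reproduced roughly like this. This is just a minimal sketch, not the paper's code: I'm assuming an OpenAI-compatible completions endpoint (e.g. a local vLLM server), and the base URL, model name, and the "finished thinking" prefill string are my guesses at the kind of thing they did, with the same max_tokens budget in both conditions.

    # Sketch: compare "Thinking" vs. "NoThinking" under the same token budget.
    # Assumes an OpenAI-compatible server at localhost:8000 serving an
    # R1-style model that wraps its reasoning in <think>...</think> tags.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
    MODEL = "r1-distill-qwen-14b"  # hypothetical served model name
    BUDGET = 1024                  # identical token cap for both conditions

    def thinking(question: str) -> str:
        # Normal mode: open the <think> block and let the model reason first.
        prompt = f"{question}\n<think>\n"
        out = client.completions.create(model=MODEL, prompt=prompt, max_tokens=BUDGET)
        return out.choices[0].text

    def no_thinking(question: str) -> str:
        # NoThinking mode: prefill an essentially empty thought and close the
        # tag, so the model goes straight to the answer with the same budget.
        prompt = (f"{question}\n<think>\n"
                  "Okay, I think I have finished thinking.\n</think>\n")
        out = client.completions.create(model=MODEL, prompt=prompt, max_tokens=BUDGET)
        return out.choices[0].text

    q = "What is the smallest prime greater than 100?"
    print("Thinking:  ", thinking(q)[:200])
    print("NoThinking:", no_thinking(q)[:200])

If that's the shape of the experiment, then "NoThinking wins at equal token counts" is mostly a statement about where the budget goes: the prefill skips the reasoning trace and spends all of the tokens on the answer itself.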