context = "you are a useful model. you evaluate your own output for N steps after the initial user input"
prompt = "here comes a prompt by the user"
context += prompt
for _ in range(N):
    context += evaluate_llm(context)
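The loop above can be made runnable as a toy sketch. Here `fake_llm` is a hypothetical stand-in for a real model call (in practice this would be an API request); it just appends a numbered "thought" so the self-evaluation loop can actually execute:

```python
def fake_llm(context: str) -> str:
    # Placeholder for a real LLM call: emits one new "thought" per pass,
    # numbered by counting the thoughts already present in the context.
    step = context.count("[thought")
    return f"\n[thought {step + 1}] re-reading {len(context)} chars of context"

def self_refine(prompt: str, n_steps: int) -> str:
    context = "you are a useful model. you evaluate your own output "
    context += f"for {n_steps} steps after the initial user input\n"
    context += prompt
    for _ in range(n_steps):
        # each pass feeds the whole accumulated context back into the model
        context += fake_llm(context)
    return context

result = self_refine("here comes a prompt by the user", 3)
print(result)
```

The key property the snippet illustrates is that each iteration conditions on everything generated so far, not just the original prompt.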
Are you asking in terms of how the current crop of "reasoning models" are implemented, or are you asking more philosophically about the nature of true reasoning?

Calling it "refinement" is dismissive. It's generating new information, which in many cases goes well beyond the scope of the original prompt.

Reasoning models today are just a marketing spin on chain-of-thought techniques that benefit from reinforcement learning.