> Although you can prompt such large language models to construct a different answer, those programs do not (and cannot) on their own look backward and evaluate what they’ve written for errors.

Given that the next token is always predicted based on everything that both the user *and the model* have produced so far, this seems like a false statement.

Practically, I've more than once seen an LLM go "actually, it seems like there's a contradiction in what I just said, let me try again". And has the author even heard of chain-of-thought reasoning?

It doesn't seem hard to believe that quite interesting results can come out of a simple loop: writing down statements, evaluating their logical soundness in some way (formal derivation rules, statistical approaches, etc.), and repeating that several times.
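
A minimal sketch of what I mean by that loop, assuming hypothetical `generate()` and `critique()` stand-ins (an LLM call and some soundness check; none of these names come from the article):

```python
def generate(prompt: str) -> str:
    """Hypothetical placeholder for an LLM call that drafts an answer."""
    raise NotImplementedError

def critique(answer: str) -> tuple[bool, str]:
    """Hypothetical placeholder for a soundness check: a formal derivation
    checker, a statistical scorer, or simply a second model pass.
    Returns (is_sound, feedback)."""
    raise NotImplementedError

def answer_with_self_check(question: str, max_rounds: int = 5) -> str:
    """Generate an answer, evaluate it, and retry with the critique fed back in."""
    prompt = question
    answer = generate(prompt)
    for _ in range(max_rounds):
        sound, feedback = critique(answer)
        if sound:
            break
        # The "actually, there's a contradiction, let me try again" pattern:
        # show the model its own output plus the objection, and ask for a revision.
        prompt = (
            f"{question}\n\nPrevious attempt:\n{answer}\n\n"
            f"Problem found:\n{feedback}\nPlease revise."
        )
        answer = generate(prompt)
    return answer
```

Nothing about this requires the model to have some special introspective faculty; the "looking backward" is just the previous output landing back in the context window.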