Somehow I don't think the 'hallucination' (or confabulation) problem, as most people define it, should be theoretically hard to solve. It may require a lot more computation, though.<p>We start with a quasi-random initialization of the network weights. The weights change under training based on actual data (assumed truthful), but some level of that initial randomness is bound to remain in the network. Some weights settle with low error margins, while others keep wide error margins. Wider margins indicate that more of the initial randomness remains and that the training data had less impact on those weights.<p>Now, when a query arrives that has little backing in the training data, the network keeps producing tokens based on weights with wide margins. And, as others have noted, once it picks a wrong token it will likely stay on that erroneous path. In theory, though, we could maintain metadata about how each weight changed during training and use it to foresee how likely the network is to confabulate.
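<p>To make that concrete, here's a rough sketch (assuming a PyTorch model) of what such metadata could look like: track how far each weight drifts from its random init during training, as a crude stand-in for the "error margin" idea, then score a forward pass by how much of its signal passes through weakly-trained weights. The names (UpdateTracker, confabulation_risk) are hypothetical, and this is illustrative rather than a tested method:

    import torch
    import torch.nn as nn

    class UpdateTracker:
        """Accumulates, per parameter, how far each weight has drifted from init."""
        def __init__(self, model):
            self.prev = {n: p.detach().clone() for n, p in model.named_parameters()}
            self.total_change = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

        def step(self, model):
            # Call after each optimizer.step(): record |delta| for every weight.
            for n, p in model.named_parameters():
                self.total_change[n] += (p.detach() - self.prev[n]).abs()
                self.prev[n] = p.detach().clone()

        def stability(self, name):
            # 0 = training barely touched this weight (more randomness left),
            # 1 = heavily updated by the data.
            change = self.total_change[name]
            return change / (change.max() + 1e-8)

    def confabulation_risk(model, tracker, x):
        # Rough score: fraction of activation magnitude flowing through
        # weakly-trained Linear layers. Assumes a simple sequential model.
        risk, total = 0.0, 0.0
        h = x
        for name, module in model.named_children():
            h = module(h)
            if isinstance(module, nn.Linear):
                s = tracker.stability(f"{name}.weight").mean().item()
                mag = h.abs().mean().item()
                risk += (1.0 - s) * mag
                total += mag
        return risk / (total + 1e-8)

You'd call tracker.step(model) after each optimizer step during training; at inference time, a high confabulation_risk could be surfaced as a low-confidence flag on the answer, or used to trigger a fallback like retrieval.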