I don't get where the author is coming from with the idea that a diffusion-based LLM would hallucinate less.<p>> dLLMs can generate certain important portions first, validate it, and then continue the rest of the generation.<p>If you pause the animation in the linked tweet (not the one on the page), you can see that the intermediate versions are full of, well, baloney.<p>(And anyone who has messed around with diffusion-based image generation knows the models are perfectly happy to hallucinate.)