The researchers found that certain artifacts associated with LLM generations can indicate whether a model is hallucinating. Their results showed that the distributions of these artifacts differ between hallucinated and non-hallucinated generations. Using these artifacts as features, they trained binary classifiers to separate hallucinated generations from non-hallucinated ones. They also found that the tokens preceding a hallucination can predict it before it occurs.
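The section above does not spell out which artifacts were used or which classifier was trained, so the sketch below is only illustrative: it assumes each generation is summarized as a fixed-length feature vector (for example, statistics of token-level probabilities) and uses a logistic-regression model as a stand-in for the binary classifier, to show the general shape of the pipeline rather than the authors' exact method.

```python
# Minimal sketch of an artifact-based hallucination classifier.
# ASSUMPTIONS (not from the source): each generation is represented by a
# fixed-length feature vector of "artifacts", and logistic regression is
# used as the binary classifier. The data here is synthetic placeholder data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Placeholder features for generations labelled hallucinated (1) or
# non-hallucinated (0). In practice these would be artifacts extracted
# from the model's generations.
n_samples, n_features = 1000, 16
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 2, size=n_samples)

# Shift the feature distribution of the hallucinated class slightly so the
# two classes are separable, mimicking the reported distributional gap.
X[y == 1] += 0.5

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Train the binary classifier on the artifact features.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]
print(f"ROC-AUC on held-out generations: {roc_auc_score(y_test, scores):.3f}")
```

In a real setting the synthetic feature matrix would be replaced by artifact features computed from actual model outputs, with labels obtained by annotating which generations are hallucinated.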