Hi HN!

At Vectara (https://vectara.com) we're hyper-focused on providing best-in-class retrieval-augmented generation (RAG). We've just released a new open-source hallucination detection model (available on HuggingFace and Kaggle) and an associated leaderboard showing which LLMs are best at producing accurate summaries. It's far more accurate than our previous model, which has been referenced by a number of HN users here before.

The reason we developed, and continue developing, these open-source hallucination detection models is that we've heard from enterprises that hallucinations are one of the top items preventing them from deploying RAG applications in production. We believe that by making the models open source, we can further engage the community in solving this together.

One question that comes up frequently when we talk about this model is "how can we detect the 'truthiness' of an LLM output?" The answer is that our model is hyper-focused on detecting hallucinations in summarization tasks in a RAG context, so it is trained specifically to check whether a generated summary is supported by the retrieved source documents, as opposed to detecting "arbitrary untruths" in the output.

We do have an even more powerful model deployed in our platform, but even so, this one is far better than anything else in the OSS realm today.
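
For concreteness, here's a rough sketch of how a cross-encoder-style consistency classifier like this could be called through HuggingFace transformers to score (source, summary) pairs. The model ID, the single-logit output, and the score interpretation below are illustrative assumptions, not the documented API, so check the model card on HuggingFace for the exact usage:

    # Sketch: scoring (source, summary) pairs with a sequence-classification
    # cross-encoder. Model ID and output format are assumptions; see the
    # model card for the real interface.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_ID = "vectara/hallucination_evaluation_model"  # assumed model ID

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    pairs = [
        # (source passage retrieved by RAG, candidate summary from the LLM)
        ("The capital of France is Paris.", "Paris is the capital of France."),
        ("The capital of France is Paris.", "The capital of France is Lyon."),
    ]

    inputs = tokenizer(
        [src for src, _ in pairs],
        [summ for _, summ in pairs],
        padding=True,
        truncation=True,
        return_tensors="pt",
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Assumed: one logit per pair, squashed to a 0-1 factual-consistency score
    # (higher = more consistent with the source).
    scores = torch.sigmoid(logits.squeeze(-1))
    for (src, summ), score in zip(pairs, scores):
        print(f"{score.item():.3f}  {summ!r}")

In a RAG pipeline, a score like this would typically be computed between the retrieved context and the generated answer, with low-scoring responses flagged or regenerated.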