TechEcho

Show HN: Adaptive RAG – How we cut LLM costs without sacrificing accuracy

8 points | by dxtrous | about 1 year ago

1 comment

janchorowski | about 1 year ago
Hey, I'm Jan, a Machine Learning researcher and CTO of Pathway. I've been working on RAG for the last year, and I'm excited to share a new RAG optimization strategy that adapts the number of supporting documents to the LLM's behavior on a given question.

The approach builds on the ability of LLMs to know when they don't know how to answer. With proper LLM confidence calibration, the adaptive RAG is as accurate as a large-context RAG while being much cheaper to run.

What was really interesting for us here is that the basic idea is "geometric doubling", but it needs to be put into place with great care because of the counter-intuitive correlation effects of the mistakes LLMs produce for different prompts.

We provide runnable code examples; you will also find a reference implementation of the strategy in the Pathway LLM expansion pack:

https://github.com/pathwaycom/pathway/blob/main/python/pathway/xpacks/llm/question_answering.py
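The control flow described above can be sketched in a few lines. This is a minimal illustration of the geometric-doubling idea, not Pathway's actual implementation; the `retrieve` and `ask_llm` callables (and the convention that a calibrated model returns `None` when it doesn't know) are assumptions made for the example.

```python
# Sketch of adaptive RAG with geometric doubling: start with a small
# number of supporting documents and double it whenever the (confidence-
# calibrated) LLM signals it cannot answer from the context it was given.

def adaptive_answer(question, retrieve, ask_llm, start_k=2, max_k=16):
    """retrieve(question, k) -> list of k docs;
    ask_llm(question, docs) -> answer string, or None for "I don't know"."""
    k = start_k
    while k <= max_k:
        docs = retrieve(question, k)
        answer = ask_llm(question, docs)
        if answer is not None:      # model is confident enough to answer
            return answer, k        # report how many docs were needed
        k *= 2                      # geometric doubling of the context
    return None, max_k              # escalation exhausted: no answer

# Toy stand-ins to show the escalation: this fake model can only answer
# once at least 8 documents are in context.
corpus = [f"doc-{i}" for i in range(32)]

def toy_retrieve(question, k):
    return corpus[:k]

def toy_ask(question, docs):
    return "42" if len(docs) >= 8 else None

answer, used_k = adaptive_answer("what is the answer?", toy_retrieve, toy_ask)
print(answer, used_k)  # -> 42 8  (escalated 2 -> 4 -> 8 documents)
```

Easy questions stop at the small, cheap prompt; only hard questions pay for a large context, which is where the cost savings come from. The correlation caveat in the comment matters here: if the model's wrong-but-confident answers at small k were independent of k, early stopping would be safe by construction, but in practice they correlate across prompt sizes, so the confidence calibration has to account for that.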