TechEcho

Show HN: Adaptive RAG – How we cut LLM costs without sacrificing accuracy

8 points | by dxtrous | about 1 year ago

1 comment

janchorowski | about 1 year ago
Hey, I'm Jan, a Machine Learning researcher and CTO of Pathway. I've been working on RAG for the last year, and I'm excited to share a new RAG optimization strategy that adapts the number of supporting documents to the LLM's behavior on a given question.

The approach builds on the ability of LLMs to know when they don't know how to answer. With proper LLM confidence calibration, the adaptive RAG is as accurate as a large-context RAG while being much cheaper to run.

What was really interesting for us here is that the basic idea is "geometric doubling", but it needs to be put into place with great care because of the counter-intuitive correlation effects of the mistakes LLMs produce for different prompts.

We provide runnable code examples; you will also find a reference implementation of the strategy in the Pathway LLM expansion pack:

https://github.com/pathwaycom/pathway/blob/main/python/pathway/xpacks/llm/question_answering.py
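The control flow described above can be sketched in a few lines. This is a minimal illustration of the geometric-doubling idea, not Pathway's actual implementation; the `retrieve` and `ask_llm` callables (and the convention that a calibrated model returns `None` when it doesn't know) are assumptions made for the example.

```python
# Sketch of adaptive RAG with geometric doubling: start with a small
# number of supporting documents and double it whenever the (confidence-
# calibrated) LLM signals it cannot answer from the context it was given.

def adaptive_answer(question, retrieve, ask_llm, start_k=2, max_k=16):
    """retrieve(question, k) -> list of k docs;
    ask_llm(question, docs) -> answer string, or None for "I don't know"."""
    k = start_k
    while k <= max_k:
        docs = retrieve(question, k)
        answer = ask_llm(question, docs)
        if answer is not None:      # model is confident enough to answer
            return answer, k        # report how many docs were needed
        k *= 2                      # geometric doubling of the context
    return None, max_k              # escalation exhausted: no answer

# Toy stand-ins to show the escalation: this fake model can only answer
# once at least 8 documents are in context.
corpus = [f"doc-{i}" for i in range(32)]

def toy_retrieve(question, k):
    return corpus[:k]

def toy_ask(question, docs):
    return "42" if len(docs) >= 8 else None

answer, used_k = adaptive_answer("what is the answer?", toy_retrieve, toy_ask)
print(answer, used_k)  # -> 42 8  (escalated 2 -> 4 -> 8 documents)
```

Easy questions stop at the small, cheap prompt; only hard questions pay for a large context, which is where the cost savings come from. The correlation caveat in the comment matters here: if the model's wrong-but-confident answers at small k were independent of k, early stopping would be safe by construction, but in practice they correlate across prompt sizes, so the confidence calibration has to account for that.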