Hello HN, I'm Owen from SciPhi (<a href="https://www.sciphi.ai/" rel="nofollow">https://www.sciphi.ai/</a>), a startup working on simplifying Retrieval-Augmented Generation (RAG). Today we’re excited to share R2R (<a href="https://github.com/SciPhi-AI/R2R">https://github.com/SciPhi-AI/R2R</a>), an open-source framework that makes it simpler to develop and deploy production-grade RAG systems.<p>A quick refresher: RAG lets Large Language Models (LLMs) draw on current information and domain-specific knowledge. For example, it allows a programming assistant to answer questions using your latest documents. The idea is to gather the relevant information ("retrieval") and present it to the LLM alongside a question ("augmentation"), so that the LLM can provide answers (“generation”) as though it were trained directly on your data.<p>R2R addresses key challenges in deploying RAG systems while avoiding the complex abstractions common in other projects. Through conversations with numerous developers, we discovered that many were independently building similar solutions. R2R distinguishes itself with a straightforward approach to setting up, monitoring, and upgrading RAG systems, focusing on reducing unnecessary complexity and improving visibility into system performance.<p>The key parts of R2R include: an Ingestion Pipeline that transforms different data types (like json, txt, pdf, html) into 'Documents' ready for embedding. Next, the Embedding Pipeline takes text and turns it into vector embeddings through a series of steps (text extraction, transformation, chunking, and embedding).
Finally, the RAG Pipeline follows the steps of the embedding pipeline but adds an LLM provider to create text completions.<p>R2R is currently in use at several companies building applications ranging from B2B lead generation to educational tools for consumers.<p>Our GitHub repo (<a href="https://github.com/SciPhi-AI/R2R">https://github.com/SciPhi-AI/R2R</a>) includes basic examples for application deployment and standalone use, demonstrating the framework's adaptability in a simple way.<p>We’d love for you to give R2R a try, and welcome your feedback and comments as we refine and develop it further!
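To make the three stages concrete, here is a minimal, framework-agnostic sketch of the ingestion → embedding → RAG flow. Everything in it is illustrative rather than R2R's actual API: the hash-based `fake_embed` stands in for a real embedding model, and `answer_prompt` stops at prompt assembly where a real system would call an LLM provider.

```python
import hashlib
import math

def fake_embed(text, dim=64):
    """Toy deterministic 'embedding': hash character trigrams into buckets.
    A stand-in for a real embedding model."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def chunk(document, size=200):
    """Ingestion: naive fixed-size chunking of raw text."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

class ToyRAG:
    def __init__(self):
        self.index = []  # (chunk_text, vector) pairs

    def ingest(self, document):
        # Ingestion + embedding: chunk the document, embed, and index.
        for c in chunk(document):
            self.index.append((c, fake_embed(c)))

    def retrieve(self, query, k=2):
        # Rank indexed chunks by cosine similarity to the query embedding.
        qv = fake_embed(query)
        ranked = sorted(self.index, key=lambda p: cosine(qv, p[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def answer_prompt(self, query):
        # Augmentation: a real system would now send this prompt to an
        # LLM provider for generation.
        context = "\n".join(self.retrieve(query))
        return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The real pipelines swap in a production embedding model, a vector database, and an LLM completion call, but the data flow is the same.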
Is there a roadmap of planned features? I wouldn't call this a "powerful tool for addressing key challenges in deploying RAG systems" right now. It seems to do the most basic version of RAG that any introductory RAG tutorial teaches, with a pretty UI over it.<p>The key challenges I've faced around RAG are things like:<p>- It only works on text-based modalities (how can I use this with all types of source documents, including images?)<p>- Chunking "well" for the type of document (by paragraph, CSVs including the header on every chunk, tables in PDFs, diagrams, etc.). The rudimentary chunk-by-character-with-overlap approach is demonstrably not very good at retrieval<p>- The R in RAG is really just "how can you do the best possible search for the given query?". The approach here is so simple that it definitely won't produce the best possible search results. It's missing many known techniques right now, like:<p><pre><code> - Generate example queries that the chunk can answer and embed those to search against.
- Parent document retrieval
- Many newer RAG techniques have been discussed and used that improve on basic chunk-based retrieval
- How do you differentiate "needs all source" vs "find in source" questions? Think: "Summarize the entire pdf" vs. a specific question like "How long does it take for light to travel to the moon and back?"
</code></pre>
- Also other search approaches like fuzzy search/lexical-based approaches, and ranking them based on criteria like "if the user query is one word, use fuzzy search instead of semantic search". Things like that<p>So far this platform seems to lock you into a really simple embedding pipeline that only supports the most basic chunk-based retrieval. I wouldn't use this unless there was some promise of it actually solving some challenges in RAG.
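For what it's worth, the routing idea in that last point can be sketched in a few lines. This is a toy illustration, not a production recipe: `difflib` fuzzy matching and bare token overlap stand in for a real lexical engine and real embedding similarity.

```python
import difflib

def lexical_score(query, text):
    """Fuzzy/lexical match: best similarity between the query and any
    single word in the text (difflib stands in for a real fuzzy matcher)."""
    words = text.lower().split()
    return max(
        (difflib.SequenceMatcher(None, query.lower(), w).ratio() for w in words),
        default=0.0,
    )

def semantic_score(query, text):
    """Toy 'semantic' score via token overlap (a real system would compare
    embedding vectors here)."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / (len(q) or 1)

def route_and_search(query, corpus):
    """Route by query shape: a single-word query goes to fuzzy lexical
    search, a multi-word query goes to the 'semantic' scorer."""
    scorer = lexical_score if len(query.split()) == 1 else semantic_score
    return max(corpus, key=lambda doc: scorer(query, doc))
```

The point is that the router is cheap and explicit; the hard part is choosing good routing criteria and good scorers behind each route.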
With the "production-grade" part of the title, I was hoping to see a bit more about scalability, fault tolerance, updating continually-changing data sources, A/B testing new versions of models, slow rollout, logging/analytics, work prioritization/QoS, etc. The lack of these kinds of features seems to be where a lot of the toy/demo stacks aren't really prepared for production. Any thoughts on those topics?
Do you have any insights to share around chunking and labeling strategies for ingestion and embeddings? I remember Qdrant had some interesting abilities to tag vectors with extra information. To be more specific, the issues I see are context-aware paragraph chunking and keyword or entity extraction. How do you see this general issue?
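To illustrate what I mean by tagging: attach a metadata payload to each chunk at ingestion time and pre-filter on it before any vector search (Qdrant calls these payloads). The regex "entity extraction" here is a deliberately crude stand-in for a real NER model, and the function names are mine, not any library's API.

```python
import re

def extract_entities(text):
    """Crude entity extraction: capitalized words of 3+ letters
    (a stand-in for a real NER model)."""
    return sorted(set(re.findall(r"\b[A-Z][a-z]{2,}\b", text)))

def ingest(paragraphs, source):
    """Chunk by paragraph, attaching a metadata payload to each chunk."""
    chunks = []
    for i, para in enumerate(paragraphs):
        chunks.append({
            "text": para,
            "payload": {
                "source": source,
                "position": i,
                "entities": extract_entities(para),
            },
        })
    return chunks

def filter_by_entity(chunks, entity):
    """Pre-filter the candidate set by payload before any vector search."""
    return [c for c in chunks if entity in c["payload"]["entities"]]
```

Even this much narrows the search space and keeps provenance (source, position) attached to every retrieved chunk.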
I find that ingesting and chunking PDF textbooks automatically creates more of a fuzzy keyword index than a high level conceptual knowledge base. Manually curating the text into chunks and annotating high level context is an improvement, but it seems like chunks should be stored as a dependency tree so that, regardless of delineation, on retrieval the full context is recovered.
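A rough sketch of the dependency idea: index small searchable chunks as children of a larger parent section, and resolve any hit back to the full parent on retrieval (word overlap here stands in for vector search over the small chunks; the class is hypothetical, not an existing library):

```python
import re

def _words(text):
    """Lowercase word set, stripping punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

class ChunkTree:
    """Small chunks stored as children of a parent section, so a hit on
    any chunk recovers the full surrounding context."""

    def __init__(self):
        self.parents = {}   # parent_id -> full section text
        self.children = []  # (parent_id, chunk_text) pairs

    def add_section(self, parent_id, section_text, chunk_size=50):
        self.parents[parent_id] = section_text
        for i in range(0, len(section_text), chunk_size):
            self.children.append((parent_id, section_text[i:i + chunk_size]))

    def retrieve(self, query):
        """Match against the small chunks, then return the whole parent."""
        qwords = _words(query)
        best = max(
            self.children,
            key=lambda c: len(qwords & _words(c[1])),
        )
        return self.parents[best[0]]
```

The same shape generalizes to deeper trees (chunk → subsection → chapter), which is roughly what "parent document retrieval" does.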
How is it different from <a href="https://github.com/pinecone-io/canopy">https://github.com/pinecone-io/canopy</a>?
R2R uses deepeval for their evaluation :) <a href="https://github.com/confident-ai/deepeval">https://github.com/confident-ai/deepeval</a>
Is there an API that could be used? I have a use case that I'm integrating into a larger software package, but wouldn't be using a cli/web app for that.
From what I've seen and experienced in projects, most of the problems being solved with RAG are better solved with a good search engine alone.
Tangential to the framework itself, I've been thinking about the following in the past few days:<p>How will the concept of RAG fare in the era of ultra large context windows and sub-quadratic alternatives to attention in transformers?<p>Another 12 months and we might have million+ token context windows at GPT-3.5 pricing.<p>For most use cases, does it even make sense to invest in RAG anymore?