Hi HN! We're building R2R [<a href="https://github.com/SciPhi-AI/R2R">https://github.com/SciPhi-AI/R2R</a>], an open-source RAG answer engine built on Postgres and Neo4j. The best way to get started is with the docs: <a href="https://r2r-docs.sciphi.ai/introduction">https://r2r-docs.sciphi.ai/introduction</a>.<p>This is a major update to our V1, which we've spent the last three months intensely building after getting a ton of great feedback from our first Show HN (<a href="https://news.ycombinator.com/item?id=39510874">https://news.ycombinator.com/item?id=39510874</a>). We shifted our focus from building a framework to building a RAG engine, because that's what developers asked for the most. To us, this distinction meant working on an opinionated system instead of layers of abstraction over providers. We built features for multimodal data ingestion, hybrid search with reranking, advanced RAG techniques (e.g. HyDE), and automatic knowledge graph construction, alongside the original goal of an observable RAG system built on a RESTful API that we shared back in February.<p>What's the problem? Developers are struggling to build accurate, reliable RAG solutions. Popular tools like LangChain are complex, overly abstracted, and lack crucial production features such as user/document management, observability, and a default API. There was a big thread about this a few days ago: <i>Why we no longer use LangChain for building our AI agents</i> (<a href="https://news.ycombinator.com/item?id=40739982">https://news.ycombinator.com/item?id=40739982</a>)<p>We experienced these challenges firsthand while building a large-scale semantic search engine, where users reported numerous hallucinations and inaccuracies. This highlighted that search+RAG is a difficult problem.
We're convinced that these missing features, and more, are essential to effectively monitor and improve such systems over time.<p>Teams have been using R2R to develop custom AI agents with their own data, with applications ranging from B2B lead generation to research assistants. Best of all, the developer experience is much improved. For example, we have recently seen multiple teams use R2R to deploy a user-facing RAG engine for their application within a day. By day 2, some of these same teams were using their generated logs to tune the system with advanced features like hybrid search and HyDE.<p>Here are a few examples of how R2R can outperform classic RAG with semantic search only:<p>1. "What were the UK's top exports in 2023?" R2R with hybrid search can identify documents mentioning "UK exports" and "2023", whereas semantic search alone finds related concepts like trade balances and economic reports.<p>2. "List all YC founders that worked at Google and now have an AI startup." Our knowledge graph feature allows R2R to understand relationships between employees and projects, answering a query that would be challenging for simple vector search.<p>The built-in observability and customizability of R2R help you tune and improve your system long after launch. Our plan is to keep the API ~fixed while we iterate on the internal system logic, making it easier for developers to trust R2R for production from day 1.<p>We are currently working on: (1) improving semantic chunking through third-party providers or our own custom LLMs; (2) training a custom model for knowledge graph triple extraction that will make KG construction 10x more efficient (this is in private beta, please reach out if interested!); (3) handling permissions at a more granular level than a single user; (4) LLM-powered online evaluation of system performance, plus enhanced analytics and metrics.<p>Getting started is easy.
R2R is a lightweight repository that you can install locally with `pip install r2r`, or run with Docker. Check out our quickstart guide: <a href="https://r2r-docs.sciphi.ai/quickstart">https://r2r-docs.sciphi.ai/quickstart</a>. Lastly, if it interests you, we are also working on a cloud solution at <a href="https://sciphi.ai">https://sciphi.ai</a>.<p>Thanks a lot for taking the time to read! The feedback from the first Show HN was invaluable and set our direction for the last three months, so we'd love to hear any more comments you have!
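To give a feel for hybrid search, the fusion step can be sketched as reciprocal rank fusion (RRF) over a keyword result list and a semantic result list. This is a minimal, generic illustration of the technique, not necessarily R2R's exact implementation; the doc ids are made up:

```python
# Minimal reciprocal-rank-fusion (RRF) sketch of hybrid search.
# The two ranked lists stand in for keyword (full-text) and
# semantic (vector) search results; k=60 is the conventional RRF constant.

def rrf_fuse(keyword_ranked, semantic_ranked, k=60):
    """Merge two ranked lists of doc ids into one hybrid ranking."""
    scores = {}
    for ranked in (keyword_ranked, semantic_ranked):
        for rank, doc_id in enumerate(ranked):
            # A document scores higher the earlier it appears in either list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["uk_exports_2023", "trade_report", "budget_2023"]
semantic_hits = ["trade_report", "economic_outlook", "uk_exports_2023"]
print(rrf_fuse(keyword_hits, semantic_hits))
```

Documents that both searches agree on float to the top, which is why the "UK exports in 2023" query above benefits: the keyword side pins the exact terms while the semantic side surfaces related material.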
Do you also see the ingestion process as the key challenge for many RAG systems to avoid "garbage in, garbage out"? How does R2R handle accurate data extraction for complex and diverse document types?<p>We have a customer who has hundreds of thousands of unstructured and diverse PDFs (containing tables, forms, checkmarks, images, etc.), and they need to accurately convert these PDFs into markdown for RAG usage.<p>Traditional OCR approaches fall short in many of these cases, so we've started using a combined multimodal LLM + OCR approach that has led to promising accuracy and consistency at scale (ping me if you want to give this a try). The RAG system itself is not a big pain point for them, but the accurate and efficient extraction and structuring of the data is.
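The combined approach is roughly: OCR the page, then have a multimodal LLM correct the raw text against the page image and emit markdown. A toy sketch of that reconciliation step (the `ocr_page` and `llm` callables are stand-in stubs, not any real library's API):

```python
# Sketch of an OCR + multimodal-LLM reconciliation step (illustrative only;
# `ocr_page` and `llm` are hypothetical stand-ins, not a real library API).

def to_markdown(page_image, ocr_page, llm):
    """Run OCR, then ask a multimodal LLM to fix OCR errors and
    emit markdown, using the page image itself as grounding."""
    raw_text = ocr_page(page_image)  # e.g. Tesseract output
    prompt = (
        "Convert this OCR output to clean markdown, fixing errors "
        "against the attached page image:\n" + raw_text
    )
    return llm(prompt, image=page_image)

# Demo with trivial stubs: the "LLM" just corrects a known OCR typo.
fake_ocr = lambda img: "Totai revenue: $1,000"
fake_llm = lambda prompt, image=None: prompt.rsplit("\n", 1)[-1].replace("Totai", "Total")
print(to_markdown(b"<page bytes>", fake_ocr, fake_llm))
```

The OCR pass keeps the LLM honest on exact strings (numbers, checkmarks, table cells), while the image lets the LLM recover layout that OCR flattens.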
This is excellent. I have been running a very similar stack for 2 years, and you've got all the tricks of the trade: pgvector, HyDE, web search + document search, and a good dashboard with logs and analytics.<p>I am leaving my position, and I recommended this to basically replace me with a junior dev who can just hit the API endpoints.
The quick start is definitely not quick. You really should provide a batteries-included docker compose with a Postgres image (docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0).<p>If I want to use the dashboard I have to clone another repo? 'git clone git@github.com:SciPhi-AI/R2R-Dashboard.git'? Why not make it available as a docker container, so that if I'm only interested in RAG I can just plug into the dashboard container?<p>This project feels like a collection of a lot of things that's not really providing any extra ease of development. It feels more like joining a new company and trying to find all the repos and set everything up.<p>This really looks cool, but I'm struggling to figure out if it's an SDK or a suite of apps or both; in the latter case the suite of apps is really confusing if I still have to write all the Python, and then it feels more like an SDK.<p>Perhaps provide a better "1 click" install experience to preview/showcase all the features, and then let devs leverage R2R later...
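For the record, the batteries-included compose I mean would look something like this (untested sketch; the R2R and dashboard image names, ports, and env vars are guesses on my part, only the Postgres image is the real one):

```yaml
version: "3.8"
services:
  postgres:
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0
    environment:
      POSTGRES_PASSWORD: postgres
    ports: ["5432:5432"]
  r2r:
    image: sciphi/r2r:latest            # hypothetical image name
    depends_on: [postgres]
    environment:
      POSTGRES_HOST: postgres
    ports: ["8000:8000"]
  dashboard:
    image: sciphi/r2r-dashboard:latest  # hypothetical image name
    depends_on: [r2r]
    ports: ["3000:3000"]
```

One `docker compose up` and a dev can click around everything before deciding whether to write any Python.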
This looks great, will be giving it a shot today. Not to throw cold water on the release, but I have been looking at different RAG platforms. Does anyone have insight into which is the leading one?<p>It really seems like document chunking is not a problem that can be solved well generically. And RAG really hinges on which documents get retrieved and on correct metadata.<p>The current approach to this seems to be a reranker: fetch a ton of information and prune it down. But document splitting is still tough, especially when you start adding transcripts of videos that can be a few hours long.
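For context, the naive baseline that "smarter" chunking has to beat is a fixed-size sliding window with overlap, so sentences cut at a boundary still appear whole in at least one chunk. A minimal sketch (parameters are arbitrary):

```python
# Minimal sliding-window chunker with overlap: the naive baseline
# that generic "semantic chunking" tries to improve on.

def chunk(tokens, size=200, overlap=50):
    """Split a token list into fixed-size chunks; consecutive chunks
    share `overlap` tokens so boundary sentences survive intact."""
    step = size - overlap
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

transcript = [f"tok{i}" for i in range(450)]  # stand-in for a long transcript
pieces = chunk(transcript)
print(len(pieces), len(pieces[0]))  # number of chunks, tokens in the first
```

On an hours-long video transcript this window knows nothing about topic shifts, which is exactly why retrieval quality suffers without something smarter.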
I've been interested in building a RAG for my documents, but as an academic project I do not have the funds to spend on the costly APIs that a lot of RAG projects depend on: not just the LLM part, but reranking, chunking, etc., like those from Cohere.<p>Can R2R be built with all processing steps using local "open" models?
I’ve checked out quite a few RAG projects now and what I haven’t seen really solved is ingestion, it’s usually like “this is an endpoint or some connectors, have fun!”.<p>How do I do a bulk/batch ingest of say, 10k html documents into this system?
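What I'd want is something like: walk the folder, batch the files, and push each batch to the ingestion endpoint with per-batch retry/logging. A sketch of the batching half (the `client.ingest_files` call is my hypothetical stand-in for whatever the system exposes):

```python
# Sketch of bulk-ingesting a folder of HTML files in fixed-size batches
# (the commented `client.ingest_files` call is hypothetical; the batching
# and per-batch error handling are the point).
from pathlib import Path

def batches(items, size):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

paths = sorted(Path("corpus/").glob("**/*.html"))  # ~10k files
for batch in batches(paths, 100):
    pass  # e.g. client.ingest_files(batch)  -- retry and log failures per batch
```

Batching matters at 10k documents because one bad file shouldn't kill the whole run, and you want resumability when it dies at file 7,000.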
"What were the UK's top exports in 2023?"<p>"List all YC founders that worked at Google and now have an AI startup."<p>How do you check the accuracy of the answers? Is there some kind of detailed trace of how an answer was generated?
Could you provide more details on the multimodal data ingestion process? What types of data can R2R currently handle, and how are non-text data types embedded?
Can the ingestion be streaming from logs?
Interesting. Can you talk a bit about how the process is faster/better optimized for the dev teams? Sounds like there's a big potential to accelerate time to MVP.
Is there a way to work with source code? I've been looking for a RAG solution that can understand the graph of code. For example: "what analytics events get called when I click submit?"
&gt; R2R is a lightweight repository that you can install locally with `pip install r2r`, or run with Docker<p>Lightweight is good, and running it without having to deal with Docker is excellent.<p>But your quickstart guide is still huge! It feels very much not "quick". How do you:<p>* Install via Python<p>* Throw a folder of documents at it<p>* Have it sit there providing a REST API to get results?<p>E.g. suppose I have an AI service already, so I throw up a private Railway instance of this as a Python app. There's a DB somewhere. As simple as possible. I can mimic it at home just running a local Python server. How do I do that? _That's_ the real quickstart.
On a side note, is there an open source RAG library that's not bound to a rising AI startup? I couldn't find one and I have a simple in-house implementation that I'd like to replace with something more people use.