
Pg_vectorize: Vector search and RAG on Postgres

295 points by samaysharma, about 1 year ago

20 comments

adamcharnock, about 1 year ago

I did a hobby RAG project a little while back, and I'll just share my experience here.

1. First ask the LLM to answer your questions without RAG. It is easy to do and you may be surprised (I was, but my data was semi-public). This also gives you a baseline to beat.

2. Chunking of your data needs to be smart. Just chunking every N characters wasn't especially fruitful. My data was a book, so it was hierarchical (by heading level). I would chunk by book section and hand it to the LLM.

3. Use the context window effectively. There is a weighted knapsack problem here: there are chunks of various sizes (chars/tokens) with various weightings (quality of match). If your data supports it, then the problem is also hierarchical. For example, if I have 4 excellent matches in this chapter, should I include each match, or should I include the whole chapter?

4. Quality of input data counts. I spent 30 minutes copy-pasting the entire book into markdown format.

This was only a small project. I'd be interested to hear any other thoughts/tips.
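The context-window packing in point 3 can be treated as a greedy approximation of the weighted knapsack; a minimal sketch (the `Chunk` type and the score values are hypothetical, not from any library):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    tokens: int   # size cost of the chunk
    score: float  # retrieval match quality (higher is better)

def pack_context(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Greedily pick chunks by score-per-token until the window budget is spent."""
    picked, used = [], 0
    for c in sorted(chunks, key=lambda c: c.score / c.tokens, reverse=True):
        if used + c.tokens <= budget:
            picked.append(c)
            used += c.tokens
    return picked
```

Greedy is not optimal for knapsack in general, but for context packing it is usually close enough and runs in O(n log n). The hierarchical variant (swap several matches for their whole chapter) would add candidate "parent" chunks to the same pool.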
softwaredoug, about 1 year ago

I think people assume RAG will just be a vector search problem. You take the user's input and somehow get relevant context.

It's really hard to coordinate between LLMs, a vector store, chunking the embeddings, turning users' chats into query embeddings (and other queries), etc. It's a complicated search relevance problem that's extremely multifaceted, use-case and domain specific. Just doing search well from a search bar, without all this complexity, is hard enough. And vector search is just one data store you might use alongside others (keyword search, SQL, whatever else).

I say this to get across that Postgres is actually uniquely situated to bring a multifaceted approach that's not just about vector storage / retrieval.
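One common way to merge a vector ranking with the keyword/SQL rankings mentioned above is reciprocal rank fusion; a minimal sketch (the doc ids and the conventional constant k=60 are illustrative):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several ranked lists of doc ids into one.

    Each list contributes 1/(k + rank) per document, so items that appear
    near the top of multiple rankings float upward.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score calibration between the underlying systems, which is exactly why it is popular for hybrid keyword-plus-vector setups.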
mattkevan, about 1 year ago

Funny this has come up – I've literally just finished building a RAG search with Postgres this morning.

I run a directory site for UX articles and tools, which I've recently rebuilt in Django. It has links to over 5,000 articles, making it hard to browse, so I thought it'd be fun to use RAG with citations to create a knowledge search tool.

The site fetches new articles via RSS, which are chunked, embedded and added to the vector store. On conducting a search, the site returns a summary as well as links to the original articles. I'm using LlamaIndex, OpenAI and Supabase.

It's taken a little while to figure out as I really didn't know what I was doing and there's loads of improvements to make, but you can try it out here: https://www.uxlift.org/search/

I'd love to hear what you think.
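The retrieve-then-cite flow described above can be sketched without the real stack (LlamaIndex, OpenAI, Supabase); here a bag-of-words `Counter` stands in for an actual embedding model, purely for illustration:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedder: bag-of-words term counts instead of a real model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search_with_citations(query: str, docs: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the top-k source URLs by similarity; a real site would also
    hand the matched text to an LLM to produce the summary."""
    q = embed(query)
    ranked = sorted(docs, key=lambda url: cosine(q, embed(docs[url])), reverse=True)
    return ranked[:top_k]
```

Returning the source URLs alongside the generated summary is what makes the citations possible: the LLM only ever sees retrieved chunks whose provenance is already known.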
tosh, about 1 year ago

Are there any examples of when RAG powered by vector search works really well?

I tried best practices like having the LLM formulate an answer and using the answer for the search (instead of the question), and trying different chunk sizes and so on, but never got it to work in a way that I would consider the result "good".

Maybe it was because of the type of data or the capabilities of the model at the time (GPT-3.5 and GPT-4)?

By now context windows with some models are large enough to fit lots of context directly into the prompt, which is easier to do and yields better results. It is way more costly, but cost is going down fast, so I wonder what this means for RAG + vector search going forward.

Where does it shine?
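The "formulate an answer, then search with it" practice mentioned above is often called HyDE (hypothetical document embeddings). A sketch of the control flow, where `llm`, `embed`, and `index` are placeholder callables rather than any real API:

```python
def hyde_search(question: str, llm, embed, index, top_k: int = 3):
    """Search with an embedding of a hypothetical *answer* instead of the question.

    The intuition: a guessed answer lives in the same embedding neighborhood
    as real answer passages, while the question often does not.
    """
    hypothetical_answer = llm(f"Answer briefly: {question}")
    return index.nearest(embed(hypothetical_answer), k=top_k)
```

Whether this beats plain question embedding is domain dependent, which matches tosh's mixed experience.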
samaysharma, about 1 year ago

A few relevant blogs on using pg_vectorize:

* Doing vector search with just 2 commands: https://tembo.io/blog/introducing-pg_vectorize

* Connecting Postgres to any Hugging Face sentence transformer: https://tembo.io/blog/sentence-transformers

* Building a question-answer chatbot natively on Postgres: https://tembo.io/blog/tembo-rag-stack
politelemon, about 1 year ago

I don't know how to articulate the uncomfortable feeling I'd be having about something 'inside' the database doing the download and making requests to other systems outside a boundary. It might be a security threat, or just my inexperience. How common is it for Postgres extensions to do this?
pdabbadabba, about 1 year ago

There's a fair amount of skepticism towards the efficacy of RAG in these comments, often in contrast to simply using a model with a huge context window to analyze the corpus in one giant chunk. But that will not be a viable alternative in all use cases.

For example, one might need to analyze/search a very large corpus composed of many documents which, as a whole, is very unlikely to fit within any realistic context window. Or one might be constrained to only use local models and may not have access to models with these huge windows. Or both!

In cases like these, can anyone recommend a more promising approach than RAG?
patresh, about 1 year ago

The high-level API seems very smooth for quickly iterating on and testing RAGs. It seems great for prototyping; however, I have doubts about whether it's a good idea to hide the LLM-calling logic in a DB extension.

Error handling when you get rate limited, the token has expired, or the input is too long would be problematic, and from a security point of view it requires your DB to directly call OpenAI, which can also be risky.

Personally I haven't used that many Postgres extensions, so perhaps these risks are mitigated in some way I don't know about?
swalsh, about 1 year ago

Sorry if I'm completely missing it, but I noticed there is something around chat in the code:

https://github.com/tembo-io/pg_vectorize/blob/main/src/chat.rs

This would lead me to believe there is some way to actually use SQL for not just embeddings, but also prompting/querying the LLM... which would be crazy powerful. Are there any examples of how to do this?
throwaway77384, about 1 year ago

What is RAG in this context? I only know it as red, amber, green...
nico, about 1 year ago

Has anyone used SQLite for storing embeddings? Are there any extensions or tips for making it easier?

I have a small command-line Python app that uses SQLite for a db. Postgres would be huge overkill for the app.

PS: Is sqlite-vss good? https://github.com/asg017/sqlite-vss
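Even without an extension like sqlite-vss, embeddings can be kept in plain SQLite as float32 BLOBs and scanned brute-force, which is often fine at the scale of a small CLI app; a stdlib-only sketch (table and column names are made up for the example):

```python
import math
import sqlite3
import struct

def pack(vec: list[float]) -> bytes:
    """Serialize a vector as little-endian float32 for BLOB storage."""
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob: bytes) -> list[float]:
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def nearest(conn: sqlite3.Connection, query_vec: list[float], k: int = 3) -> list[str]:
    """Brute-force scan: fine for thousands of rows, not millions."""
    rows = conn.execute("SELECT id, embedding FROM docs").fetchall()
    ranked = sorted(rows, key=lambda r: cosine(query_vec, unpack(r[1])), reverse=True)
    return [r[0] for r in ranked[:k]]
```

Past tens of thousands of vectors, an ANN index (sqlite-vss, or an external library) starts to pay off; below that, this keeps the app dependency-free.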
rgrieselhuber, about 1 year ago
Tembo has been doing very interesting work.
jonplackett, about 1 year ago

I'm using pg_vector in Supabase and it seems great in prototype form.

Has anyone tried using it at scale? How does it do vs Pinecone / Cloudflare for search?
falling_myshkin, about 1 year ago

There have been a lot of these RAG abstractions posted recently. As someone working on this problem, it's unclear to me whether the calculation and ingestion of embeddings from source data should be abstracted into the same software package as their search and retrieval. I guess it probably depends on the complexity of the problem. This does seem interesting, in that it makes intuitive sense to have a built-in DB extension if the source data itself is coming from the same place the embeddings are going. But so far I have preferred a separation of concerns in this respect, as it seems that in some cases the models will be used to compute embeddings outside the DB context (for example, the user's search query needs to get vectorized: why not have the frontend and the backend query the same embedding service?). Anyone else have thoughts on this?
gijoegg, about 1 year ago

How should developers think about using this extension versus PostgresML's pgml extension?
ravenstine, about 1 year ago

Is RAG just a fancy term for sticking an LLM in front of a search engine?
valstu, about 1 year ago

I assume you need to split the data into suitably sized database rows matching your model's max length? Or does it do some chunking magic automatically?
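If the extension embeds one row (or column value) at a time, pre-chunking into rows is up to you. A minimal character-window chunker with overlap, purely illustrative (the sizes are arbitrary; real pipelines usually count tokens, not characters):

```python
def chunk(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into windows of `size` characters, with `overlap`
    characters shared between consecutive windows so that sentences
    straddling a boundary appear in at least one chunk intact."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk would then go into its own row before the embedding job runs over the table.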
cybereporter, about 1 year ago

How does this compare to other vector search solutions (LanceDB, Chroma, etc.)? Curious to know which one I should choose.
chaps, about 1 year ago

Neat! Any notable gotchas we should know about?
nextaccountic, about 1 year ago

Is this only for LLMs?