This is cool! A few questions:

- Given that Neon's architecture decouples compute from storage using the safekeepers and pageservers, will this still just work with Neon? (Wondering because you mention the index is in-memory, so unless your stateless compute nodes can somehow hot-swap the indexes, I wasn't sure how that'd work.)

- Do you plan to offer vector search as a plug-and-play offering? If so, does Neon as a product plan to introduce more "out-of-the-box" functionality like vector search? (Similar to Xata's offerings like search and vector search.)

- An unrelated question: I believe Neon has very small cold starts for free/scale-to-zero configurations, but are there also inherent small latencies for infrequently accessed data? In other words, if there were a large table with records that are "old"/"archival" but also a sort of ~last-30-days set of records that are "fresh"/accessed more frequently, would there likely be slight latency introduced when accessing the older records?

Neon looks awesome, and thanks for Neon's open source contributions!
I was wondering how this compared to Qdrant. I found this:

"Qdrant currently only uses HNSW as a vector index."

- https://qdrant.tech/documentation/concepts/indexing

So it would be interesting to see benchmarks between pg_embedding and Qdrant. I would expect them to perform similarly, but perhaps there are other factors?
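For what it's worth, a rough head-to-head is straightforward to sketch. Below is a minimal latency comparison, assuming a local Qdrant instance and a Postgres database where a `documents` collection/table already holds the same vectors, and using pg_embedding's `<->` operator as shown in the announcement; all names here are illustrative.

```python
import time

import psycopg2
from qdrant_client import QdrantClient

DIMS = 1536
query = [0.1] * DIMS  # stand-in query embedding

# pg_embedding: k-NN via the <-> distance operator over an HNSW index.
conn = psycopg2.connect("dbname=app")
cur = conn.cursor()
t0 = time.perf_counter()
cur.execute(
    "SELECT id FROM documents ORDER BY embedding <-> %s::real[] LIMIT 10",
    (query,),
)
pg_hits = cur.fetchall()
print(f"pg_embedding: {time.perf_counter() - t0:.4f}s")

# Qdrant: the same k-NN search against its HNSW index.
qdrant = QdrantClient(host="localhost", port=6333)
t0 = time.perf_counter()
q_hits = qdrant.search(collection_name="documents", query_vector=query, limit=10)
print(f"qdrant: {time.perf_counter() - t0:.4f}s")
```

A fair benchmark would also pin both HNSW indexes to comparable build parameters (m, ef) and measure recall alongside latency, since either system can look "faster" by trading accuracy away.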
What's the plan/timeline for offering cosine similarity support, given that most OSS embedding models are fine-tuned on a contrastive cosine-distance objective?
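Worth noting that even with only a Euclidean index you can recover cosine ranking by unit-normalizing vectors before insert: for unit vectors, squared L2 distance is a monotone function of cosine similarity, so the nearest-neighbor order is identical. A quick numerical check of that identity (my own sketch, not from the pg_embedding docs):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=768), rng.normal(size=768)

# Unit-normalize, as you would before inserting into the table.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)

cos_sim = float(a_n @ b_n)
l2_sq = float(np.sum((a_n - b_n) ** 2))

# ||a - b||^2 = 2 - 2*cos(a, b) for unit vectors, so sorting by L2
# distance gives exactly the cosine-similarity ranking.
assert np.isclose(l2_sq, 2.0 - 2.0 * cos_sim)
```

So normalizing at write time is a workable stopgap until native cosine support lands.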
CEO of Neon here.

This was a relatively quick project for us, and the index is currently in-memory. However, it is fast! We would love your feedback and are excited to invest further.
This blog post makes me uneasy. The pg_embedding code on GitHub gives the impression of a PoC, while the blog post creates the impression that pg_embedding is ready for use.

If we consider pg_embedding ready for use, why don't you compare with pgvector on:

1) the need to rebuild the index on every instance restart,

2) replication support?
With people talking about pgvector's current scaling issues, one thing I'm not sure about is whether the problem is the Postgres table simply containing a lot of vectors (e.g. 500k), or the search itself running over 500k vectors.

E.g. if the table had 500k vectors, but you were pre-filtering with WHERE client_id = X (returning only 200 rows) and then an AND <embedding search> (returning only 6 rows), would this still have the same performance issue?

Or is it literally only when the embedding search is over 500k rows?
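For context, the usual behavior with HNSW-style indexes is post-filtering: the ANN scan walks the whole graph (all 500k vectors) to find the top-k candidates, and the WHERE clause is applied to those candidates afterwards, so a selective filter can leave you with fewer than k results rather than making the search cheaper. A sketch of the query shape, with a hypothetical `documents(id, client_id, embedding)` table; I haven't verified how pg_embedding's planner handles this case specifically:

```python
import psycopg2

conn = psycopg2.connect("dbname=app")
cur = conn.cursor()
q = [0.1] * 1536  # query embedding

# With post-filtering, the index returns its top candidates over ALL
# rows first; client_id is only checked after the fact. Over-fetching
# (LIMIT 50 here instead of 6) is a common workaround so the filter
# still leaves enough surviving matches.
cur.execute(
    """
    SELECT id
    FROM documents
    WHERE client_id = %s
    ORDER BY embedding <-> %s::real[]
    LIMIT 50
    """,
    (42, q),
)
print(cur.fetchall()[:6])
```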
Very cool! It would be nice to see a working end-to-end integration with an LLM, using this to generate relevant context, for example. I see multiple folks mention cosine similarity, which this HNSW implementation doesn't support; how does the lack of it limit what you can do with this library?

Also, since this is in-memory, I assume it significantly affects startup time in order to rebuild the index? Would be nice to see how bad that is for larger vector datasets.
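On the end-to-end question, the wiring is short. Here's a minimal retrieval-augmented sketch, assuming OpenAI's Python client, a hypothetical `documents(body, embedding)` table whose embeddings came from the same model, and pg_embedding's `<->` operator; this is illustrative, not an official Neon example.

```python
import psycopg2
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
conn = psycopg2.connect("dbname=app")

question = "How do I scale a compute endpoint to zero?"

# 1. Embed the question with the same model used at indexing time.
q_emb = client.embeddings.create(
    model="text-embedding-ada-002", input=question
).data[0].embedding

# 2. Pull the nearest documents to use as context.
cur = conn.cursor()
cur.execute(
    "SELECT body FROM documents ORDER BY embedding <-> %s::real[] LIMIT 3",
    (q_emb,),
)
context = "\n\n".join(row[0] for row in cur.fetchall())

# 3. Answer with the retrieved context stuffed into the prompt.
answer = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Answer using only this context:\n\n" + context},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```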
Any reason you didn't contribute to pgvector?

It would have been nice to get the support of Neon in progressing pgvector, since it's already so widely adopted by the community.

(disclosure: Supabase CEO)
Is anyone having a really good experience using embedding-based semantic retrieval in combination with a downstream LLM?

I am working quite a bit with normal and chained LLMs, but so far haven't explored the retrieval route.