VectorChord: Store 400k Vectors for $1 in PostgreSQL

156 points by gaocegege, 6 months ago

12 comments

dmezzetti, 6 months ago
This looks like an interesting project.

Though it's worth noting that the license is AGPL. So if the idea is for this to take over for pgvecto.rs, it's an important data point for those building SaaS products.

It will make pgvector the only permissively licensed option, given it has the same license as Postgres.
whakim, 6 months ago
Could you talk about how updates are handled? My understanding is that IVF can struggle if you're doing a lot of inserts/updates after index creation, as the data needs to be incrementally re-clustered (or the entire index needs to be rebuilt) in order to ensure the clusters continue to reflect the shape of your data?
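
To make the concern above concrete, here is a minimal sketch of a toy IVF-style index in plain Python. It is not VectorChord's or pgvector's implementation; the two fixed centroids, the 2-D points, and the helper names `nearest` and `build_lists` are all assumptions chosen for illustration. It shows only the general effect: when centroids are trained once and later inserts come from a different region, one inverted list absorbs all the new data.

```python
import math
import random


def nearest(centroids, v):
    """Return the index of the centroid closest to v (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], v))


def build_lists(centroids, vectors):
    """Assign each vector to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for v in vectors:
        lists[nearest(centroids, v)].append(v)
    return lists


rng = random.Random(0)

# Hypothetical centroids, "trained" on data clustered near (0, 0) and (10, 10).
centroids = [(0.0, 0.0), (10.0, 10.0)]
initial = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
initial += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(100)]
lists = build_lists(centroids, initial)

# Later inserts come from a new region near (20, 20).
# The centroids are NOT retrained, so every insert lands in the second list.
drifted = [(rng.gauss(20, 1), rng.gauss(20, 1)) for _ in range(300)]
for v in drifted:
    lists[nearest(centroids, v)].append(v)

sizes = [len(bucket) for bucket in lists]
print(sizes)
```

In this toy setup the second list grows to four times the size of the first, so a query probing it scans far more vectors than the index was tuned for. Whether and how a real index rebalances is exactly what the question asks.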
marcyb5st, 6 months ago
Awesome work! But aren't the comparisons missing ScaNN [1, 2]? I think it's the overall SOTA [3] at the moment regarding vector indexing.

[1] https://github.com/google-research/google-research/tree/master/scann

[2] Also available on something like AlloyDB on GCP: https://cloud.google.com/alloydb/docs/ai/store-index-query-vectors?resource=scann

[3] https://ann-benchmarks.com/glove-100-angular_10_angular.html

Disclaimer: Working for Google, but nowhere close to Databases.
7qW24A, 6 months ago
The “external index build” idea seems pretty interesting. How does it work with updates to the underlying data (e.g., new embeddings being added)? For that matter, I guess, how do incremental updates to pgvector’s HNSW indexes work?
estebarb, 6 months ago
It would have been nice to see a comparison with pgvectorscale, which uses binary quantization and StreamingDiskANN.
_mmarshall, 6 months ago
The cost to store a static set of 400k 768-dimension vectors is also $1 a month on DataStax's AstraDB. However, for that $1, AstraDB replicates the data 3x instead of storing it on a single machine.

Here is a link to the cost calculator. Note that the calculator includes cost of ingestion, but the article only mentions storage costs, not ingestion costs: https://www.datastax.com/pricing/vector-search?cloudProvider=aws&cloudRegion=us-east-2&nonVectorReadSizeBytes=1024&nonVectorWriteSizeBytes=4096&vectorDimensions=768&vectorReadsPerMonth=0&vectorWritesPerMonth=400000

Disclaimer: I work on vectorsearch/AstraDB at DataStax.
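
For a sense of scale, the raw size of the data set mentioned above can be worked out directly. This is a back-of-the-envelope sketch that assumes 4-byte float32 components (the thread does not state the element type) and ignores index structures, row overhead, and any quantization, so it does not reproduce either vendor's pricing. The function name `raw_vector_bytes` is made up for this example.

```python
def raw_vector_bytes(count, dims, bytes_per_value=4):
    """Raw payload size of `count` dense vectors with `dims` components each.

    bytes_per_value=4 assumes float32 components; this is an assumption,
    not something stated in the thread.
    """
    return count * dims * bytes_per_value


single_copy = raw_vector_bytes(400_000, 768)  # one copy of the data
three_copies = 3 * single_copy                # with 3x replication

print(f"one copy:     {single_copy / 1e9:.4f} GB")
print(f"three copies: {three_copies / 1e9:.4f} GB")
```

Under these assumptions one copy is about 1.23 GB of raw vector data, and three replicas are about 3.69 GB.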
tarasglek, 6 months ago
I am still waiting for a good pattern for using multivector embeddings like ColBERT and ColPali in Postgres. I get that it's fun to optimize single-vector stuff, but multivector is that happy middle ground between single vector and reranker that seems to be validated only in specialized, exotic search DBs like Vespa.
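
For readers unfamiliar with multivector scoring, the late-interaction score used by ColBERT-style models (often called MaxSim) can be sketched in a few lines. Each query token vector is matched with its best-scoring document token vector, and those maxima are summed. The tiny 2-D vectors and the names `dot` and `maxsim` below are illustrative assumptions; this shows the scoring rule only, not a Postgres storage or indexing pattern.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take the best
    dot product against any document token vector, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example: two query token vectors, two candidate documents.
query = [(1.0, 0.0), (0.0, 1.0)]
doc_a = [(1.0, 0.0), (0.0, 1.0)]  # matches both query tokens exactly
doc_b = [(1.0, 0.0), (0.6, 0.8)]  # matches the second token only partially

score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)
print(score_a, score_b)
```

Because every document contributes many vectors rather than one, a single-vector index cannot compute this score by itself, which is part of why the pattern is harder to support.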
gaocegege, 6 months ago
Hey everyone! We’ve developed a new PostgreSQL extension that supports 400k vectors for just $1. Check it out!
rkuzsma, 6 months ago
Would you be willing to speculate on how VectorChord's ingestion and query performance might compare to Elasticsearch/OpenSearch for dense vector and sparse vector search use cases, particularly when dealing with larger full text data sets (>5M records)?
curl-up, 6 months ago
Does this mean you won't support pgvecto.rs anymore?
nextworddev, 6 months ago
What dimension vectors are we talking about here?
jasonkester, 6 months ago
In five pages of text, we never get to learn what a Vector is (in this context), why we’d want to store one in pgsql, or why it costs so much to store them compared to anything else you’d store there.

For an example of how you can communicate with domain experts, while still giving everyone else some form of clue as to what the hell you’re talking about, check out the link to the product that this thing claims to be a successor to:

https://pgvecto.rs/

That starts off by telling us what it is and what it does.