On-disk HNSW index for Postgres with pg_embedding

63 points by nikita almost 2 years ago

7 comments

fzliu almost 2 years ago
Thanks for sharing. How does this compare with DiskANN (https://zilliz.com/blog/diskann-a-disk-based-anns-solution-with-high-recall-and-high-qps-on-billion-scale-dataset) or HNSW-IF (https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/)?
nh2 almost 2 years ago
I would appreciate a rough comparison with usearch:

https://unum-cloud.github.io/usearch/

Which was also recently on HN: https://news.ycombinator.com/item?id=36942993
nikita almost 2 years ago
CEO of Neon here. We first built an in-memory HNSW index for Postgres, which let us establish a performance baseline and prove that it's the right approach to supporting vector search. We have now built it "the right way", so it supports Postgres restarts, replication, and the rest of the Postgres machinery.
random_moonwalk almost 2 years ago
This looks very cool.

How many vectors are indexed, and how large is the index, for the latency chart? If we have an in-memory HNSW index of 10M vectors at ~20GB (512 dim), say, what are the RAM requirements when using the disk-based version?
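For reference, the ~20GB figure in the question is consistent with raw vector storage alone. A quick sanity check, assuming float32 components (4 bytes each) and ignoring HNSW graph links and any Postgres page overhead; both are assumptions, not figures from the post, and the function name is only illustrative:

```python
# Back-of-the-envelope size of the raw vectors in the example above.
# Assumption: float32 components (4 bytes each); HNSW graph links and
# storage overhead are NOT included, so a real index would be larger.
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    return num_vectors * dim * bytes_per_component


size = raw_vector_bytes(10_000_000, 512)
print(f"{size / 1e9:.2f} GB")  # 20.48 GB
```

This says nothing about how much of that has to stay resident in RAM with the on-disk index, which is the open question here.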
thewataccount almost 2 years ago
Forgive me, I'm not super familiar with vector indexes outside of the basic tsvector for text search.

What's the difference between pg_embedding, pg_vector, and tsvector? Are they comparable/interchangeable? And how do you know which one to pick?

My understanding is that pg_vector has poorer performance than some dedicated vector databases. Does pg_embedding perform better?

Sorry if these are silly questions.
readyplayeremma almost 2 years ago
The GitHub project has no license file that I can see. Does anyone know if this is going to be released under an OSS license of some kind?
jasfi almost 2 years ago
Are there any plans to release binaries for this extension? E.g. TimescaleDB is really easy to install.