TechEcho

6 comments

binarymaxabout 2 years ago

Hey there - interesting solution!A note on benchmarks with HNSW is that you need to optimize your ‘m’ and ‘ef_construction’ params at index time and your ef_search param at query time.So you may not be getting an optimized result from DBs like Qdrant and weaviate.

评论 #35983518 未加载

评论 #35977567 未加载

LukeAIabout 2 years ago

Looks interesting! a high-performance vector database service that has full SQL support, built on top of ClickHouse (famous for its speed as well), and is also cost effective to use.

siyudabout 2 years ago

It's really exciting to see a SQL based vector database with such high performance. I can't wait to try it.

kacperlukawskiabout 2 years ago

Did you run the clients in the same regions as the servers? That may impact the results.

评论 #35983494 未加载

评论 #35977606 未加载

lqhlabout 2 years ago

Over the past few months, we have been working on an exciting project to bridge the gap between high performance vector search and OLAP database. Today, we are thrilled to announce the release of our new end-to-end benchmark of MyScale, which includes a comparison with some of the state-of-the-art vector databases for your reference. Here are some key takeaways that might pique your interest:1. [Low Cost] in this case, we measures the ratio of the monthly cost to the QPS (Queries Per Second) of the service per one hundred units. It quantifies the monthly cost required to achieve 100 QPS on 5 million vector data points. Our analysis highlights the superior cost-performance ratio of MyScale, which is over 3.6 times cheaper than other vector databases. <a href="https://blog.myscale.com/2023/05/17/myscale-outperform-special-vectordb/#cost-performance-ratio" rel="nofollow">https://blog.myscale.com/2023/05/17/myscale-outperform-speci...</a>2. [High Throughput] MyScale outperforms other vector databases in terms of QPS on the LAION 5M dataset with a 98.5% recall rate, achieving over 150 QPS. In comparison, Pinecone s1 has a QPS of approximately 10, which is significantly lower than MyScale. Weaviate and Zilliz Cloud both achieve around 65 QPS, while Qdrant achieves 81 QPS. <a href="https://blog.myscale.com/2023/05/17/myscale-outperform-special-vectordb/#queries-per-second-qps" rel="nofollow">https://blog.myscale.com/2023/05/17/myscale-outperform-speci...</a>3. [Quick Responce] Query latency is an important performance metric that is measured from the time the client sends the request until it receives the response. MyScale achieves 150 QPS while maintaining an average latency as low as 25.8 ms. Pinecone s1 has a relatively high latency of over 400 ms. Weaviate and Zilliz Cloud both have latencies of around 60 ms, while Qdrant has a slightly higher latency of around 100 ms. <a href="https://blog.myscale.com/2023/05/17/myscale-outperform-special-vectordb/#average-query-latency" rel="nofollow">https://blog.myscale.com/2023/05/17/myscale-outperform-speci...</a>4. [Fast Data Ingestion] The time it takes from data upload to the vector index being built and ready to serve is referred to as data ingestion time. Index creation can take a long time, especially for graph-based algorithms such as HNSW. Among all the services tested, MyScale had the fastest ingestion time for 5 million data points, completing the task in about 30 minutes. Pinecone s1 takes approximately 53 minutes, while Weaviate takes 72 minutes. Zilliz Cloud requires a longer duration of approximately 113 minutes, while Qdrant has the longest ingestion time, taking 145 minutes to process 5 million data points. <a href="https://blog.myscale.com/2023/05/17/myscale-outperform-special-vectordb/#data-ingestion-time" rel="nofollow">https://blog.myscale.com/2023/05/17/myscale-outperform-speci...</a>Also here are some other nice features myscale can offer:1. Simple data import and backup: We support common format like Parquet, tar, csv from or to S3 buckets or other object storage systems.2. There are more options from FAISS and HNSW other than MSTG, the algorithmn we proposed and tested in this benchmark. You may choose one you familiar with.3. Built on Clickhouse and being part of the community. Boost your vector search with Clickhouse advanced features. MyScale is currently in beta, with a free developer tier and a commercial plan on the way. To the best of our knowledge,MyScale offers the first free plan that supports 5 million 768-dimensional vector data points with high performance search.

评论 #35983154 未加载

Windfall_about 2 years ago

Great job! I can't wait to try it.

6 comments

binarymaxabout 2 years ago

评论 #35983518 未加载

评论 #35977567 未加载

LukeAIabout 2 years ago

Looks interesting! a high-performance vector database service that has full SQL support, built on top of ClickHouse (famous for its speed as well), and is also cost effective to use.

siyudabout 2 years ago

It's really exciting to see a SQL based vector database with such high performance. I can't wait to try it.

kacperlukawskiabout 2 years ago

Did you run the clients in the same regions as the servers? That may impact the results.

The most cost-effective Vector Databases

6 comments

The most cost-effective Vector Databases

6 comments