Weaviate 1.23 is a massive step forward for managing multi-tenancy in vector databases. Most RAG and vector DB applications have an uneven distribution in the number of vectors per user. Some users have 10k docs, others 10M+! Weaviate now offers a flat index with binary quantization, so you can efficiently serve both cases: an HNSW graph for the 10M-doc users, and brute force for the 10k-doc users where that is all you need!<p>Even better, this brute-force index lives directly in the LSM store without needing any main memory! This works through a continuous file read that Etienne explains in the podcast far better than I can in this post haha, really amazing engineering work!<p>Weaviate 1.23 also comes with some other "self-driving database" features, like lazy shard loading for faster startup times with multi-tenancy, automatic resource limiting with GOMEMLIMIT, and other details Etienne shares in the podcast!<p>I am also beyond excited to present our new integration with Anyscale (@anyscalecompute)! Anyscale has amazing pricing for serving and fine-tuning popular open-source LLMs. With this release we are integrating Llama 70B/13B/7B, Mistral 7B, and Code Llama 34B into Weaviate -- but we expect much further development: support for fine-tuned models, the super cool new function-calling models Anyscale announced yesterday, and other models such as diffusion and multimodal models!<p>Here is a full list of new features:<p>- Lazy Shard Loading
- Flat Index + Binary Quantization
- Default Segments for PQ
- AutoPQ
- Auto Resource Limiting
- Node Endpoint Update
- Generative Anyscale<p>Check it out here!<p>https://www.youtube.com/watch?v=e88O18_2wyo
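If you're curious what binary quantization boils down to, here is a toy sketch (not Weaviate's actual implementation): each float dimension collapses to its sign bit, and distance comparisons become cheap Hamming distances over the bit strings.

```python
# Toy binary quantization: keep only the sign of each dimension.
def binarize(vec):
    return [1 if x > 0 else 0 for x in vec]

# Distance between quantized vectors = number of differing bits.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

q = binarize([0.12, -0.80, 0.33, -0.05])  # -> [1, 0, 1, 0]
d = binarize([0.40, -0.10, -0.20, 0.90])  # -> [1, 0, 0, 1]
print(hamming(q, d))  # -> 2
```

This is why a flat (brute-force) index stays cheap: a 768-dim float32 vector shrinks to 96 bytes, and scanning all of a small tenant's vectors with Hamming distance is fast enough that no graph structure is needed.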
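To try the flat index with BQ, a collection schema along these lines should work (a sketch in the v3-style schema JSON; the class name "TenantDocs" is illustrative, and with a running instance you would apply it via the Python client's `client.schema.create_class`):

```python
# Illustrative Weaviate class definition: flat (brute-force) vector index
# with binary quantization enabled, plus multi-tenancy.
flat_bq_schema = {
    "class": "TenantDocs",                # hypothetical class name
    "vectorIndexType": "flat",            # brute force instead of HNSW
    "vectorIndexConfig": {
        "bq": {"enabled": True},          # binary quantization: 1 bit/dim
    },
    "multiTenancyConfig": {"enabled": True},
}

# With a live instance, roughly:
#   import weaviate
#   client = weaviate.Client("http://localhost:8080")
#   client.schema.create_class(flat_bq_schema)
```

The nice part is that this choice is per collection, so the heavy 10M-doc workloads can stay on HNSW while small tenants get the memory-free flat index.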
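On the resource-limiting side: GOMEMLIMIT is the standard Go runtime soft memory limit, and Weaviate can now manage it for you; if you do want to pin it yourself, it is just an environment variable (the 4GiB value and docker-compose keys below are example values, not recommendations).

```shell
# Illustrative: cap the Go heap so the runtime backs off before the OOM killer.
# In docker-compose / k8s this is an env var on the Weaviate container, e.g.:
#   environment:
#     GOMEMLIMIT: 4GiB
GOMEMLIMIT=4GiB ./weaviate --host 0.0.0.0 --port 8080 --scheme http
```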