I do not understand how you can pack 1536d into 192 bytes ???
That some entirely impossible?<p>> Vector search, on the other hand, requires reading a lot of small vectors, which might create a large overhead. It is especially noticeable if we use binary quantization, where the size of even large OpenAI 1536d vectors is compressed down to 192 bytes.
Dataset size: 2M 768d vectors (~6Gb Raw data), binary quantization, 650Mb of RAM limit. All benchmarks are made with minimal RAM allocation to demonstrate disk cache efficiency.