Product quantization (PQ) is a vector compression technique that splits high-dimensional vectors into lower-dimensional subspaces and quantizes each subspace independently, achieving high compression ratios for large-scale multimodal search systems.
Product quantization splits each D-dimensional vector into M subvectors of D/M dimensions each. Each subspace has its own codebook of k centroids, trained with k-means. A vector is encoded as M centroid indices, which reduces storage from D floats to M bytes when k=256 (one byte per index). At query time, distances are computed from lookup tables that hold the precomputed distance between each query subvector and every centroid in the corresponding codebook.
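The training and encoding steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function names (`train_codebooks`, `encode`, `decode`), the plain Lloyd's k-means loop, the iteration count, and the random test data are all assumptions made for the example.

```python
import numpy as np

def train_codebooks(X, M, k=256, iters=10, seed=0):
    """Train one k-means codebook per subspace (hypothetical helper)."""
    n, D = X.shape
    assert D % M == 0, "D must be divisible by M"
    assert k <= 256, "indices are stored as uint8 in this sketch"
    d = D // M
    rng = np.random.default_rng(seed)
    codebooks = np.empty((M, k, d), dtype=X.dtype)
    for m in range(M):
        sub = X[:, m * d:(m + 1) * d]
        # Initialise centroids from k distinct training subvectors.
        centroids = sub[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(iters):
            # Squared L2 distance from every subvector to every centroid.
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(axis=1)
            for j in range(k):
                members = sub[assign == j]
                if len(members):  # keep the old centroid if a cluster empties
                    centroids[j] = members.mean(axis=0)
        codebooks[m] = centroids
    return codebooks

def encode(X, codebooks):
    """Replace each subvector with the index of its nearest centroid."""
    M, k, d = codebooks.shape
    codes = np.empty((X.shape[0], M), dtype=np.uint8)
    for m in range(M):
        sub = X[:, m * d:(m + 1) * d]
        dists = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(-1)
        codes[:, m] = dists.argmin(axis=1)
    return codes

def decode(codes, codebooks):
    """Approximate the original vectors by concatenating centroids."""
    M = codebooks.shape[0]
    return np.concatenate(
        [codebooks[m][codes[:, m]] for m in range(M)], axis=1
    )

# Toy data: 2,000 random 32-dimensional vectors, M=4 subspaces of 8 dims.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 32)).astype(np.float32)
codebooks = train_codebooks(X, M=4, k=256)
codes = encode(X, codebooks)      # shape (2000, 4), one byte per subspace
X_hat = decode(codes, codebooks)  # lossy reconstruction
```

Each row of `codes` occupies M bytes, compared with D float32 values in `X`. The codebooks are a fixed overhead shared by the whole dataset.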
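Typical PQ configurations use M=8-64 subspaces with k=256 centroids each, compressing a 768-dimensional float32 vector from 3 KB (3,072 bytes) to 8-64 bytes, a 48x to 384x reduction before counting the shared codebooks. Asymmetric Distance Computation (ADC) keeps the query uncompressed and computes distances against the quantized database vectors, which is more accurate than quantizing the query as well. Inverted File with PQ (IVF-PQ) combines coarse partitioning of the dataset with PQ, so a query scans only the partitions closest to it, which keeps search scalable.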
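As a rough illustration of ADC, the sketch below builds the per-query lookup table and scores every database code with M table lookups. The function name `adc_distances` is hypothetical, and the codebooks and codes are random placeholders rather than trained values; only the shapes (D=768, M=8, k=256) follow the figures above.

```python
import numpy as np

def adc_distances(query, codes, codebooks):
    """Squared L2 distance from an uncompressed query to PQ-coded vectors."""
    M, k, d = codebooks.shape
    q_sub = query.reshape(M, d)
    # table[m, j] = squared distance between query subvector m and centroid j.
    table = ((q_sub[:, None, :] - codebooks) ** 2).sum(axis=-1)  # (M, k)
    # For each database vector, add up one table entry per subspace.
    return table[np.arange(M), codes].sum(axis=1)

rng = np.random.default_rng(0)
M, k, d = 8, 256, 96  # D = M * d = 768
# Placeholder codebooks and codes (assumed, not trained).
codebooks = rng.standard_normal((M, k, d)).astype(np.float32)
codes = rng.integers(0, k, size=(1000, M)).astype(np.uint8)
query = rng.standard_normal(M * d).astype(np.float32)

dists = adc_distances(query, codes, codebooks)
top5 = np.argsort(dists)[:5]  # indices of the 5 closest coded vectors

raw_bytes = M * d * 4                           # float32 storage per vector
code_bytes = codes.shape[1] * codes.itemsize    # PQ storage per vector
```

The table costs M × k subvector distance computations per query, after which each database vector is scored with M lookups and additions, independent of D.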