A vector compression technique that decomposes high-dimensional vectors into lower-dimensional subspaces and quantizes each independently, achieving high compression ratios for large-scale multimodal search systems.
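Product quantization (PQ) splits each D-dimensional vector into M subvectors of D/M dimensions each, with D assumed divisible by M. Each subspace has its own codebook of k centroids, trained via k-means on the corresponding subvectors of the training set. A vector is encoded as M centroid indices, reducing storage from D floats to M bytes when k=256, since each index then fits in a single byte. Distances are computed with lookup tables: for each subspace, the distance from the query subvector to every centroid in that subspace's codebook is precomputed once per query, and the approximate distance to any encoded vector is the sum of M table entries selected by its indices.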
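The training, encoding, and table-lookup steps described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (kmeans, train_pq, encode, adc_distances), the plain Lloyd's k-means loop, the squared Euclidean metric, and the toy sizes (32 dimensions, M=4, k=16) are all choices made for this example.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative); returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid by squared L2 distance.
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def train_pq(x, M, k):
    """Train one codebook per subspace; returns an (M, k, D/M) array."""
    n, D = x.shape
    assert D % M == 0, "D must be divisible by M"
    subs = x.reshape(n, M, D // M)
    return np.stack([kmeans(subs[:, m], k, seed=m) for m in range(M)])

def encode(x, codebooks):
    """Encode each row of x as M centroid indices (uint8, so k <= 256)."""
    M, k, ds = codebooks.shape
    subs = x.reshape(len(x), M, ds)
    codes = np.empty((len(x), M), dtype=np.uint8)
    for m in range(M):
        d2 = ((subs[:, m, None, :] - codebooks[m][None, :, :]) ** 2).sum(axis=2)
        codes[:, m] = d2.argmin(axis=1)
    return codes

def adc_distances(query, codes, codebooks):
    """Approximate squared L2 distances from an uncompressed query to all codes."""
    M, k, ds = codebooks.shape
    q = query.reshape(M, ds)
    # table[m, j] = squared distance from query subvector m to centroid j
    table = ((q[:, None, :] - codebooks) ** 2).sum(axis=2)
    # Sum one table entry per subspace, selected by each stored index.
    return table[np.arange(M), codes].sum(axis=1)

# Toy usage with random data (sizes chosen only to keep the example fast).
rng = np.random.default_rng(0)
data = rng.standard_normal((2000, 32)).astype(np.float32)
codebooks = train_pq(data, M=4, k=16)
codes = encode(data, codebooks)
dists = adc_distances(data[0], codes, codebooks)
```

Because the subspaces are disjoint, the summed table entries equal the exact squared distance between the query and the reconstructed (quantized) vector, so the only error comes from quantizing the database side.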
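Standard PQ uses M = 8 to 64 subspaces with k=256 centroids each, compressing 768-dimensional float32 vectors from 3 KB (3,072 bytes) to 8 to 64 bytes, a 48x to 384x reduction before counting the shared codebooks. Asymmetric Distance Computation (ADC) keeps the query uncompressed and computes distances against quantized database vectors, which avoids the additional error that quantizing the query would introduce. Inverted File with PQ (IVF-PQ) combines coarse partitioning with PQ for scalable search: vectors are assigned to coarse clusters, and only the clusters nearest the query are scanned.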
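The storage figures above follow from simple arithmetic, worked through in the short sketch below. The helper name pq_code_bytes is hypothetical, and the codebook size assumes centroids are stored as float32, which the text does not specify.

```python
import math

def pq_code_bytes(M, k=256):
    """Bytes per encoded vector: M indices of ceil(log2 k) bits each."""
    bits = M * math.ceil(math.log2(k))
    return math.ceil(bits / 8)

D = 768
raw_bytes = D * 4  # one float32 (4 bytes) per dimension

for M in (8, 16, 32, 64):
    code_bytes = pq_code_bytes(M)
    print(f"M={M}: {code_bytes} bytes per vector, ratio {raw_bytes / code_bytes:.0f}x")

# Shared codebooks: M codebooks x k centroids x (D/M) dims x 4 bytes = k * D * 4,
# which does not depend on M (assuming float32 centroids).
k = 256
codebook_bytes = k * D * 4
```

For D = 768 and k = 256 the codebooks take 786,432 bytes (768 KiB) in total, a fixed cost that becomes negligible once it is shared across millions of encoded vectors.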