Algorithms that find vectors close to a query vector in sub-linear time by accepting a small accuracy trade-off. ANN is the backbone of scalable multimodal retrieval systems handling millions or billions of embeddings.
ANN algorithms build specialized index structures during an offline phase that organize vectors for fast lookup. At query time, these structures allow the search to examine only a fraction of the database while still finding most of the true nearest neighbors. Common approaches include tree-based partitioning, hashing, and graph-based navigation.
Major ANN families include locality-sensitive hashing (LSH), tree-based methods (Annoy), graph-based methods (HNSW, NSG), and quantization-based methods (IVF-PQ). Performance is measured by recall at N (fraction of true nearest neighbors found), queries per second, and memory footprint. The ANN Benchmarks project provides standardized comparisons across algorithms and datasets.