A high-performance graph-based ANN algorithm that builds a multi-layered navigable small world graph for fast similarity search. HNSW is the default index type in most modern vector databases used for multimodal retrieval.
HNSW constructs a hierarchical graph where each layer is a navigable small world network with decreasing density. The top layers contain few nodes with long-range connections for fast coarse navigation, while the bottom layer contains all nodes with short-range connections for precise search. Query traversal starts at the top layer and greedily descends, narrowing the search region at each level.
Key parameters include M (number of connections per node, typically 16-64), ef_construction (search width during build, typically 100-500), and ef_search (search width during query, typically 50-500). Higher M and ef values improve recall but increase memory and latency. The algorithm achieves logarithmic search complexity with respect to dataset size, making it efficient even at billion-scale.
Connect a bucket and Mixpeek runs the whole multimodal search pipeline for you: extraction, indexing, and search over your own objects. No models to wire up, nothing to host.
Start with ManagedKeep your embeddings on your own cloud and run dense, sparse, and BM25 search directly on object storage. First 1M vectors free.
Start with MVS