NEWVectors or files. Pick a path.Start →

    What is HNSW (Hierarchical Navigable Small World)

    HNSW (Hierarchical Navigable Small World) - Graph-based approximate nearest neighbor search algorithm

    A high-performance graph-based ANN algorithm that builds a multi-layered navigable small world graph for fast similarity search. HNSW is the default index type in most modern vector databases used for multimodal retrieval.

    How It Works

    HNSW constructs a hierarchical graph where each layer is a navigable small world network with decreasing density. The top layers contain few nodes with long-range connections for fast coarse navigation, while the bottom layer contains all nodes with short-range connections for precise search. Query traversal starts at the top layer and greedily descends, narrowing the search region at each level.

    Technical Details

    Key parameters include M (number of connections per node, typically 16-64), ef_construction (search width during build, typically 100-500), and ef_search (search width during query, typically 50-500). Higher M and ef values improve recall but increase memory and latency. The algorithm achieves logarithmic search complexity with respect to dataset size, making it efficient even at billion-scale.

    Best Practices

    • Start with M=16 and ef_construction=200 as reasonable defaults, then tune based on benchmarks
    • Set ef_search dynamically based on the required recall level for each query
    • Use separate read and write threads since HNSW supports concurrent reads but serializes writes
    • Monitor memory usage carefully as HNSW indices can be 1.5-2x the raw vector data size

    Common Pitfalls

    • Setting M too low, resulting in disconnected graph regions and poor recall
    • Using ef_search lower than the number of requested results k
    • Not reserving enough memory for the graph structure on top of vector storage
    • Expecting real-time updates at high throughput without understanding write serialization

    Advanced Tips

    • Combine HNSW with product quantization to reduce memory while maintaining graph navigation quality
    • Use HNSW with scalar quantization (SQ8) for a good balance of memory savings and recall
    • Implement filtered HNSW search by pruning graph traversal based on metadata predicates
    • Consider segment-based HNSW implementations that allow efficient deletions and updates
    Managed Mixpeek

    Put multimodal search to work

    Connect a bucket and Mixpeek runs the whole multimodal search pipeline for you: extraction, indexing, and search over your own objects. No models to wire up, nothing to host.

    Start with Managed
    MVS · bring your own

    Already have vectors?

    Keep your embeddings on your own cloud and run dense, sparse, and BM25 search directly on object storage. First 1M vectors free.

    Start with MVS