Mixpeek Logo

    What is HNSW (Hierarchical Navigable Small World)

    HNSW (Hierarchical Navigable Small World) - Graph-based approximate nearest neighbor search algorithm

    A high-performance graph-based ANN algorithm that builds a multi-layered navigable small world graph for fast similarity search. HNSW is the default index type in most modern vector databases used for multimodal retrieval.

    How It Works

    HNSW constructs a hierarchical graph where each layer is a navigable small world network with decreasing density. The top layers contain few nodes with long-range connections for fast coarse navigation, while the bottom layer contains all nodes with short-range connections for precise search. Query traversal starts at the top layer and greedily descends, narrowing the search region at each level.

    Technical Details

    Key parameters include M (number of connections per node, typically 16-64), ef_construction (search width during build, typically 100-500), and ef_search (search width during query, typically 50-500). Higher M and ef values improve recall but increase memory and latency. The algorithm achieves logarithmic search complexity with respect to dataset size, making it efficient even at billion-scale.

    Best Practices

    • Start with M=16 and ef_construction=200 as reasonable defaults, then tune based on benchmarks
    • Set ef_search dynamically based on the required recall level for each query
    • Use separate read and write threads since HNSW supports concurrent reads but serializes writes
    • Monitor memory usage carefully as HNSW indices can be 1.5-2x the raw vector data size

    Common Pitfalls

    • Setting M too low, resulting in disconnected graph regions and poor recall
    • Using ef_search lower than the number of requested results k
    • Not reserving enough memory for the graph structure on top of vector storage
    • Expecting real-time updates at high throughput without understanding write serialization

    Advanced Tips

    • Combine HNSW with product quantization to reduce memory while maintaining graph navigation quality
    • Use HNSW with scalar quantization (SQ8) for a good balance of memory savings and recall
    • Implement filtered HNSW search by pruning graph traversal based on metadata predicates
    • Consider segment-based HNSW implementations that allow efficient deletions and updates