
    What is Multi-Stage Retrieval

    Multi-Stage Retrieval - A pipeline that chains discrete search operations to express complex information needs.

    Multi-stage retrieval is a composable search architecture that chains discrete stages: filter candidates, sort by relevance, reduce duplicates, enrich with context, and apply business logic. Unlike single-query search, it lets you express a complex information need as a sequence of operations, much as SQL chains WHERE, ORDER BY, GROUP BY, and JOIN.

    How It Works

    Each stage receives the result set from the previous stage and applies one transformation. A feature search stage performs a vector similarity lookup, a filter stage applies metadata predicates, a rerank stage re-scores results with a cross-encoder, and an enrichment stage appends additional fields. Stages are configured declaratively and executed in sequence.
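
    The stage-by-stage flow can be sketched in plain Python. This is a minimal illustration, not Mixpeek's API: the `Doc` type, the stage constructors (`feature_search`, `metadata_filter`, `rerank`, `enrich`), and the toy corpus are all hypothetical names chosen for the example. The rerank stage uses a simple scoring function as a stand-in for a cross-encoder, and the stage list stands in for a declarative pipeline config.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Doc:
    doc_id: str
    vector: tuple[float, ...]
    metadata: dict
    score: float = 0.0


# A stage takes a result set and returns a transformed result set.
Stage = Callable[[list[Doc]], list[Doc]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def feature_search(query_vec, top_k) -> Stage:
    """Score every doc by dot-product similarity and keep the top_k."""
    def run(docs):
        scored = [replace(d, score=dot(query_vec, d.vector)) for d in docs]
        return sorted(scored, key=lambda d: d.score, reverse=True)[:top_k]
    return run


def metadata_filter(predicate) -> Stage:
    """Keep only docs whose metadata satisfies the predicate."""
    def run(docs):
        return [d for d in docs if predicate(d.metadata)]
    return run


def rerank(score_fn) -> Stage:
    """Re-score and re-sort. score_fn is a stand-in for a cross-encoder."""
    def run(docs):
        rescored = [replace(d, score=score_fn(d)) for d in docs]
        return sorted(rescored, key=lambda d: d.score, reverse=True)
    return run


def enrich(extra_fields) -> Stage:
    """Append additional metadata fields to each doc."""
    def run(docs):
        return [replace(d, metadata={**d.metadata, **extra_fields(d)}) for d in docs]
    return run


def run_pipeline(docs, stages):
    """Execute the stages in sequence, feeding each one the prior output."""
    for stage in stages:
        docs = stage(docs)
    return docs


# Toy corpus (assumption: 2-dimensional vectors, made-up metadata).
CORPUS = [
    Doc("a", (1.0, 0.0), {"lang": "en", "views": 10}),
    Doc("b", (0.9, 0.1), {"lang": "de", "views": 50}),
    Doc("c", (0.8, 0.2), {"lang": "en", "views": 90}),
    Doc("d", (0.0, 1.0), {"lang": "en", "views": 99}),
]

PIPELINE = [
    feature_search(query_vec=(1.0, 0.0), top_k=3),
    metadata_filter(lambda m: m["lang"] == "en"),
    rerank(lambda d: d.score + d.metadata["views"] / 100),
    enrich(lambda d: {"url": f"https://example.com/{d.doc_id}"}),
]

results = run_pipeline(CORPUS, PIPELINE)
```

    Because every stage shares one signature, reordering, removing, or swapping a stage only changes the `PIPELINE` list, not the stages themselves.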

    Key Benefits

    • Separates retrieval concerns so each stage can be tuned independently
    • Supports hybrid search by combining dense vector search, sparse keyword search, and metadata filters
    • Enables business logic (recency boosts, license filters, diversity constraints) without changing the underlying index
    • Composable and version-controlled, making pipelines auditable and reproducible
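
    One common way to combine dense and sparse results is reciprocal rank fusion (RRF), shown here as an illustration; the source does not say which fusion method a given pipeline uses. The rankings, the `licensed` lookup, and the constant `k=60` (a conventional default for RRF) are assumptions made for this sketch.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids.

    Each doc earns 1 / (k + rank) from every list it appears in, so docs
    ranked highly by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a dense (vector) and a sparse (keyword) retriever.
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["c", "a", "d"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])

# A metadata filter applied after fusion (made-up license flags).
licensed = {"a": True, "b": True, "c": False, "d": True}
filtered = [doc_id for doc_id in fused if licensed[doc_id]]
```

    RRF uses only rank positions, so dense and sparse scores never need to be normalized onto a common scale before they are combined.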

    When to Use It

    • When a single nearest-neighbor query cannot express your ranking requirements
    • When you need to combine results from multiple embedding spaces or modalities
    • When business rules must be applied after initial retrieval (e.g., filter by rights, boost by recency)
    • When you want to A/B test retrieval strategies without reindexing data
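
    The last two cases can be sketched together: business rules run as stages after initial retrieval, and two pipeline variants run over the same candidates, so comparing them requires no reindexing. The candidate records, the `rights` field, and the half-life and weight values are hypothetical and chosen only for illustration.

```python
from datetime import date

# Output of an initial retrieval step (made-up scores and metadata).
CANDIDATES = [
    {"id": "c", "score": 0.95, "published": date(2024, 1, 1), "rights": "restricted"},
    {"id": "a", "score": 0.90, "published": date(2020, 1, 1), "rights": "licensed"},
    {"id": "b", "score": 0.85, "published": date(2024, 6, 1), "rights": "licensed"},
]


def filter_by_rights(results):
    """Business rule: drop anything we are not licensed to show."""
    return [r for r in results if r["rights"] == "licensed"]


def make_recency_boost(today, half_life_days, weight):
    """Build a stage that adds a bonus which halves every half_life_days."""
    def boost(results):
        boosted = []
        for r in results:
            age_days = (today - r["published"]).days
            decay = 0.5 ** (age_days / half_life_days)
            boosted.append({**r, "score": r["score"] + weight * decay})
        return sorted(boosted, key=lambda r: r["score"], reverse=True)
    return boost


def run(results, stages):
    for stage in stages:
        results = stage(results)
    return results


# Two retrieval strategies over the same candidates and the same index.
VARIANTS = {
    "control": [filter_by_rights],
    "recency": [
        filter_by_rights,
        make_recency_boost(date(2025, 1, 1), half_life_days=365, weight=0.2),
    ],
}

ranked = {
    name: [r["id"] for r in run(CANDIDATES, stages)]
    for name, stages in VARIANTS.items()
}
```

    Here the control variant ranks `a` ahead of `b`, while the recency variant promotes the newer `b`. Assigning users to variants and measuring outcomes is left out of the sketch.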