    Best of Both Worlds

    Hybrid Search: Combine Keyword and Vector Search

    Neither keyword search nor vector search alone is enough for production applications. Hybrid search fuses BM25 keyword precision with semantic vector understanding, delivering the best retrieval quality with configurable score fusion and cross-encoder reranking.

    Why Hybrid Search

    Keyword search catches exact terms. Vector search catches meaning. Production applications need both.

    Keyword Search Alone Misses Intent

    Keyword Only (BM25)

    BM25 excels at exact term matching but fails when users choose different words from those that appear in the documents. A search for 'car repair' misses results about 'automobile maintenance'. Synonym expansion helps, but the synonym lists are costly to build and keep current.

    Hybrid Search

    Hybrid search adds vector similarity to capture semantic meaning. The query 'car repair' matches documents about 'automobile maintenance' through embedding space proximity, while BM25 still catches exact product IDs and technical terms.

    Vector Search Alone Misses Precision

    Vector Only

    Pure vector search understands meaning but can miss exact matches that matter. Searching for error code 'ERR-4021' might return results about generic error handling instead of the specific error. Proper nouns, IDs, and code snippets need exact matching.

    Hybrid Search

    Hybrid search preserves BM25 keyword precision for exact terms while using vectors for semantic understanding. Error code 'ERR-4021' matches exactly, while the surrounding context is understood semantically.

    Production Search Needs Both

    Single Method

    Real-world queries contain both semantic intent and specific terms. A query like 'Python implementation of quicksort with O(n log n) complexity' needs semantic understanding of the concept and exact matching on technical terms.

    Hybrid Search

    Mixpeek's hybrid search fuses BM25 keyword scores with vector similarity scores using configurable weights. Add metadata filtering and cross-encoder reranking as additional stages for production-grade retrieval quality.

    How Hybrid Search Works

    BM25 keyword matching and vector semantic search run in parallel, with score fusion and optional reranking for production-grade retrieval.

    1. BM25 Keyword Index

    Content is tokenized and indexed using BM25 (Best Matching 25), the industry-standard probabilistic text retrieval algorithm. BM25 scores documents by term frequency and inverse document frequency, normalized for document length, and excels at exact term matching and rare term boosting.
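
    As a rough illustration of the scoring described above, here is a minimal Okapi BM25 sketch in plain Python. The whitespace tokenizer, the sample documents, and the parameter values k1=1.5 and b=0.75 are assumptions chosen for the example, not Mixpeek's actual configuration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer; real systems use proper analyzers."""
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a larger IDF, so they contribute more.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b normalizes for document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "fix for error err-4021 in the login service",
    "generic error handling guide for services",
    "automobile maintenance schedule",
]
scores = bm25_scores("error err-4021", docs)
```

    The first document scores highest because it contains the rare term 'err-4021'; the third scores zero because it shares no terms with the query.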

    2. Vector Embedding Index

    The same content is processed through embedding models (CLIP, sentence-transformers, SigLIP) to generate dense vector representations. These embeddings capture semantic meaning and are indexed in Qdrant HNSW indexes for approximate nearest neighbor search.
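
    To make the idea of nearest-neighbor search over embeddings concrete, the sketch below ranks documents by cosine similarity using exact brute-force search. An HNSW index approximates this same ranking at much larger scale. The three-dimensional vectors are hand-written toy values, not output from CLIP, sentence-transformers, or SigLIP.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, index, top_k=2):
    """Exact search over every vector; an ANN index approximates this ranking."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hand-written toy vectors (assumption): related topics point in similar directions.
index = {
    "automobile maintenance": [0.9, 0.1, 0.0],
    "engine oil change": [0.8, 0.2, 0.1],
    "pasta recipes": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.15, 0.05]  # stands in for the embedding of 'car repair'
hits = nearest_neighbors(query_vec, index)
```

    Both automotive documents are returned even though neither shares a word with 'car repair', which is the behavior keyword search alone cannot provide.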

    3. Score Fusion

    At query time, BM25 and vector search execute in parallel. Their results are combined either by reciprocal rank fusion (RRF), which uses only each document's rank in the two lists, or by a weighted linear combination of normalized scores. You control the balance -- more weight on keywords for precise queries, more on vectors for exploratory searches.
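
    The weighted linear combination can be sketched as follows. Min-max normalization is one common way to bring BM25 scores and cosine similarities onto the same scale; it is an assumption here, as are the function names and sample scores.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} map to [0, 1] so different scales are comparable."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}

def weighted_fusion(keyword_scores, vector_scores, keyword_weight=0.4, vector_weight=0.6):
    """Combine normalized scores; returns (doc_id, fused_score) pairs, best first."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    fused = {}
    for doc_id in set(kw) | set(vec):
        # A document missing from one list contributes 0 from that method.
        fused[doc_id] = keyword_weight * kw.get(doc_id, 0.0) + vector_weight * vec.get(doc_id, 0.0)
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)

keyword_scores = {"doc_a": 12.4, "doc_b": 7.1, "doc_c": 3.0}   # raw BM25 scores
vector_scores = {"doc_b": 0.91, "doc_c": 0.88, "doc_d": 0.52}  # cosine similarities
ranking = weighted_fusion(keyword_scores, vector_scores)
```

    Here doc_b ranks first because it scores well in both lists, while doc_a, the top keyword hit, drops behind doc_c once the vector weight of 0.6 is applied.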

    4. Reranking

    Optionally, a cross-encoder reranker re-scores the fused results by processing each query-document pair through a more expensive but more accurate model. This refines ranking quality, especially for ambiguous queries where initial fusion alone may not capture the optimal ordering.
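
    The shape of a reranking stage can be sketched with a pluggable scoring function. The token-overlap scorer below is only a placeholder so the example runs without a model; in practice score_fn would wrap a cross-encoder that scores each query-document pair jointly. All names here are hypothetical.

```python
def rerank(query, candidates, score_fn, top_n=3):
    """Re-score the first top_n candidates against the query and reorder them.

    score_fn(query, text) -> float stands in for a cross-encoder model.
    Candidates beyond top_n keep their original fused order.
    """
    head, tail = candidates[:top_n], candidates[top_n:]
    rescored = sorted(head, key=lambda text: score_fn(query, text), reverse=True)
    return rescored + tail

def toy_overlap_scorer(query, text):
    """Placeholder scorer: fraction of query tokens present in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens)

fused_results = [
    "overview of single sign-on options",
    "how to configure sso with saml 2.0",
    "saml 2.0 attribute mapping reference",
    "release notes",
]
reranked = rerank("configure sso with saml 2.0", fused_results, toy_overlap_scorer)
```

    Only the top_n candidates are re-scored, which is what keeps the cost of the more expensive model bounded.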

    Why Hybrid Beats Pure Vector Search

    Vector search revolutionized retrieval, but production applications consistently show that hybrid outperforms pure vector approaches.

    Exact Term Matching

    Product SKUs, error codes, version numbers, and proper nouns need exact matching that BM25 provides. Vector search may return semantically related but wrong specific results for these queries.

    Rare Term Boosting

    BM25's inverse document frequency (IDF) component naturally boosts rare, informative terms. Vector embeddings can under-weight rare terms because they are underrepresented in training data. Hybrid search preserves this signal.

    Robustness to Embedding Failures

    Embedding models can have blind spots -- domain-specific jargon, newly coined terms, or niche topics where the model was not well-trained. BM25 provides a fallback that catches results the vector model misses.

    Benchmark Performance

    On standard information retrieval benchmarks (BEIR, MTEB), hybrid search typically achieves higher nDCG and recall scores than either BM25 or vector search alone, often by 5-15% depending on the dataset and domain.

    Hybrid Search Capabilities

    Configurable fusion, metadata filtering, reranking, and multimodal support -- everything you need for production hybrid retrieval.

    Configurable Score Fusion

    Control how BM25 and vector scores combine. Use reciprocal rank fusion for robust defaults, or weighted linear combination for fine-grained control. Adjust weights per query type or let the system optimize automatically.

    • Reciprocal rank fusion (RRF)
    • Weighted linear combination with configurable weights
    • Per-query weight adjustment via API parameters

    Metadata Filtering

    Add pre-retrieval and post-retrieval metadata filters to narrow results. Filter by date, category, author, language, or any custom metadata field stored alongside your documents in Qdrant namespaces.

    • Boolean filter expressions ($and, $or, $not)
    • Range filters for numeric and date fields
    • Nested metadata field filtering
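
    A filter evaluator for expressions like these might look like the sketch below. The operator names $and, $or, $not, $in, and $gte appear elsewhere on this page; $lte and the exact evaluation semantics are assumptions for illustration, not Mixpeek's implementation.

```python
# Comparison operators supported by this sketch; unknown operators raise KeyError.
OPERATORS = {
    "$in": lambda value, target: value in target,
    "$gte": lambda value, target: value >= target,
    "$lte": lambda value, target: value <= target,
}

def get_field(doc, dotted_path):
    """Resolve a dotted path such as 'metadata.doc_type' in nested dicts."""
    value = doc
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def match_field(value, cond):
    """Check one field value against a bare value or an operator dict."""
    if not isinstance(cond, dict):
        return value == cond  # a bare value means equality
    if value is None:
        return False  # missing fields never satisfy a comparison
    return all(OPERATORS[op](value, target) for op, target in cond.items())

def matches(doc, expr):
    """Return True if doc satisfies every clause in the filter expression."""
    for key, cond in expr.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in cond)
        elif key == "$not":
            ok = not matches(doc, cond)
        else:
            ok = match_field(get_field(doc, key), cond)
        if not ok:
            return False
    return True

docs = [
    {"id": 1, "metadata": {"doc_type": "guide", "year": 2024}},
    {"id": 2, "metadata": {"doc_type": "blog", "year": 2024}},
    {"id": 3, "metadata": {"doc_type": "tutorial", "year": 2021}},
]
expr = {"$and": [
    {"metadata.doc_type": {"$in": ["guide", "tutorial"]}},
    {"$not": {"metadata.year": {"$lte": 2022}}},
]}
kept = [d["id"] for d in docs if matches(d, expr)]
```

    Only document 1 survives: document 2 has the wrong type and document 3 falls inside the excluded year range.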

    Cross-Encoder Reranking

    Refine hybrid search results with cross-encoder models that score query-document pairs jointly. Reranking improves precision for the top-N results; the added latency grows with rerank depth, so it stays small when N is kept modest.

    • Multiple cross-encoder model options
    • Configurable rerank depth (top-N)
    • Stage-based pipeline integration

    Multimodal Hybrid Search

    Hybrid search is not limited to text. Combine keyword matching over transcripts and metadata with vector similarity over images, video frames, and audio embeddings in a single retriever pipeline.

    • Text + image + video + audio in one query
    • Cross-modal score fusion
    • Modality-specific weight configuration

    Hybrid Search Architecture

    A four-layer architecture that runs keyword and vector retrieval in parallel, fuses scores, and refines with reranking.

    1. Query Processing

    The incoming query is processed in parallel: tokenized for BM25 keyword matching and embedded for vector search. For multimodal queries, different modality-specific embeddings are generated simultaneously.

    2. Parallel Retrieval

    BM25 retrieval runs against the keyword index while vector ANN search runs against Qdrant HNSW indexes. Both return scored candidate lists independently. Metadata pre-filters are applied at this stage to narrow the search space.
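
    Running the two retrievers concurrently can be sketched with Python's standard thread pool. Both retriever functions below are hypothetical stand-ins that return fixed results; in a real system they would call the keyword index and the vector index over the network, which is where threads help.

```python
from concurrent.futures import ThreadPoolExecutor

def keyword_retrieve(query):
    """Stand-in for a BM25 index lookup; returns (doc_id, score) pairs."""
    return [("doc_a", 11.2), ("doc_b", 6.3)]

def vector_retrieve(query):
    """Stand-in for an ANN lookup; returns (doc_id, similarity) pairs."""
    return [("doc_b", 0.93), ("doc_c", 0.81)]

def parallel_retrieve(query):
    """Run both retrievers at once and collect their candidate lists."""
    # Submitting both calls up front means total latency tracks the slower
    # retriever rather than the sum of the two.
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_future = pool.submit(keyword_retrieve, query)
        vector_future = pool.submit(vector_retrieve, query)
        return {"keyword": keyword_future.result(), "vector": vector_future.result()}

candidates = parallel_retrieve("how to configure SSO with SAML 2.0")
```

    Each list keeps its own scores at this point; they are only combined in the fusion layer.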

    3. Score Fusion Layer

    Candidate lists from keyword and vector retrieval are merged. Reciprocal rank fusion (RRF) combines them by rank position, while weighted combination normalizes the raw scores before summing them. Documents appearing in both lists receive boosted scores.

    4. Reranking Layer

    The top-N fused results pass through a cross-encoder reranker for refined scoring. The reranker processes query-document pairs jointly, capturing fine-grained relevance signals that bi-encoders and BM25 miss.

    Mixpeek Hybrid Search vs. Alternatives

    See how Mixpeek compares to search engines and vector databases for hybrid retrieval.

    | Feature | Mixpeek | Weaviate | Elasticsearch | Vespa |
    | --- | --- | --- | --- | --- |
    | Hybrid Search | Built-in (BM25 + vector + metadata fusion) | Built-in (BM25 + vector) | Manual (BM25 + kNN, custom fusion) | Built-in (multiple retrieval + blending) |
    | Multimodal Support | Native (text, image, video, audio, PDF) | Limited (text + CLIP images) | Text only (BYO vectors) | Text + vectors (BYO embeddings) |
    | Embedding Generation | Built-in (50+ extractors on Ray GPUs) | Built-in modules (limited models) | BYO models (external generation) | BYO models (external generation) |
    | Retriever Pipelines | Composable multi-stage (search, filter, rerank) | Single-stage with modules | Query DSL (complex configuration) | Ranking profiles (YAML configuration) |
    | Infrastructure Management | Fully managed (zero-ops) | Self-managed or Weaviate Cloud | Self-managed or Elastic Cloud | Self-managed or Vespa Cloud |
    | Deployment Options | Managed, Dedicated, BYO Cloud | Self-managed or SaaS | Self-managed or Elastic Cloud | Self-managed or Vespa Cloud |

    Build Hybrid Search in Minutes

    A simple Python API to combine keyword and vector search with configurable fusion weights and cross-encoder reranking.

    hybrid_search.py
    from mixpeek import Mixpeek
    
    client = Mixpeek(api_key="YOUR_API_KEY")
    
    # Create a collection with text and image extractors
    collection = client.collections.create(
        name="product-docs",
        namespace="documentation",
        extractors=[
            {
                "type": "text_embedding",
                "model": "sentence-transformers/all-MiniLM-L6-v2",
                "config": {
                    "chunk_size": 512,
                    "chunk_overlap": 50,
                    "generate_bm25_index": True
                }
            },
            {
                "type": "image_embedding",
                "model": "clip-vit-large",
                "config": {
                    "extract_from_documents": True
                }
            }
        ]
    )
    
    # Hybrid search with configurable fusion weights
    results = client.retrievers.execute(
        namespace="documentation",
        stages=[
            {
                "type": "feature_search",
                "method": "hybrid",
                "query": {
                    "text": "how to configure SSO with SAML 2.0",
                    "modalities": ["text"]
                },
                "weights": {
                    "vector": 0.6,
                    "keyword": 0.4
                },
                "limit": 30
            },
            {
                "type": "filter",
                "conditions": {
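                    "metadata.doc_type": {"$in": ["guide", "tutorial"]},
                    # Note: "3.0" is a string, so this comparison may be lexicographic
                    # (for example "10.0" < "3.0"); store versions numerically if that matters.
                    "metadata.product_version": {"$gte": "3.0"}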
                }
            },
            {
                "type": "rerank",
                "model": "cross-encoder/ms-marco-MiniLM-L-12-v2",
                "limit": 10
            }
        ]
    )
    
    # Results combine keyword precision with semantic understanding
    for result in results:
        print(f"Score: {result.score}")
        print(f"BM25 Score: {result.scores.get('keyword', 0)}")
        print(f"Vector Score: {result.scores.get('vector', 0)}")
        print(f"Source: {result.metadata['filename']}")
        print(f"Section: {result.metadata['section']}")
        print(f"Content: {result.content[:200]}")

    Frequently Asked Questions

    What is hybrid search?

    Hybrid search combines multiple retrieval methods -- typically keyword-based BM25 search and vector-based semantic search -- and fuses their results into a single ranked list. This approach captures both the precision of exact keyword matching and the recall of semantic understanding. When a user searches for 'Python quicksort implementation', BM25 catches the exact term 'quicksort' while vector search understands the broader concept of sorting algorithms.

    How does hybrid search improve over pure vector search?

    Pure vector search excels at understanding meaning but struggles with exact terms -- product IDs, error codes, proper nouns, and technical terminology. Hybrid search adds BM25 keyword matching that catches these exact terms with high precision. In benchmarks, hybrid search consistently outperforms either method alone, particularly for queries that mix natural language with specific identifiers.

    What is reciprocal rank fusion (RRF)?

    Reciprocal rank fusion (RRF) is a score combination method that merges ranked lists from different retrieval methods. Instead of normalizing raw scores (which have different scales across methods), RRF uses the reciprocal of each document's rank position: score = 1 / (k + rank). Documents appearing in multiple lists get their RRF scores summed. This is robust and requires no score calibration, making it a reliable default for hybrid search.
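
    The formula above translates into a few lines of Python. The sketch below assumes 1-based ranks and k=60, a commonly used default for RRF; the function name and sample rankings are illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of doc ids (best first) into one ranking.

    Each document earns 1 / (k + rank) per list it appears in, with
    1-based ranks; contributions are summed across lists.
    """
    fused = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)

keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

    doc_b and doc_a lead the fused list because each appears in both rankings, while doc_d and doc_c, which appear in only one, trail behind. No raw scores are needed, only positions.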

    How do I choose weights for keyword vs. vector search?

    Start with a balanced split (0.5 keyword, 0.5 vector) and adjust based on your data and query patterns. For technical documentation with specific terms, increase keyword weight (0.6-0.7). For conversational or exploratory queries, increase vector weight (0.6-0.7). Mixpeek lets you set weights per query through the API, so you can dynamically adjust based on query characteristics or A/B test different configurations.
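
    One way to adjust weights dynamically is a simple heuristic on the query text, sketched below. The rules (digits suggest an identifier, six or more tokens suggest a natural-language question) and the 0.2 shift are assumptions for illustration; they should be validated against your own queries, for example through A/B testing.

```python
def looks_like_identifier(token):
    """Heuristic: tokens containing digits, such as ERR-4021 or v3.2."""
    return any(ch.isdigit() for ch in token)

def choose_weights(query, base_keyword=0.5, shift=0.2):
    """Pick keyword/vector weights for a query, starting from a balanced split."""
    tokens = query.split()
    if any(looks_like_identifier(t) for t in tokens):
        keyword = base_keyword + shift   # exact identifiers: favor keyword matching
    elif len(tokens) >= 6:
        keyword = base_keyword - shift   # long natural-language query: favor vectors
    else:
        keyword = base_keyword           # short generic query: stay balanced
    return {"keyword": round(keyword, 2), "vector": round(1.0 - keyword, 2)}

precise = choose_weights("ERR-4021 login failure")
exploratory = choose_weights("what is the best way to keep my car running well")
balanced = choose_weights("car repair")
```

    The resulting dictionaries have the same shape as the weights shown in the sample request, so they could be passed per query.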

    Does hybrid search work with multimodal data?

    Yes. In Mixpeek, hybrid search extends beyond text. You can combine BM25 keyword matching over transcripts, OCR text, and metadata with vector similarity over image embeddings, audio embeddings, and video frame embeddings. All modalities are indexed in the same Qdrant namespace, and the retriever pipeline fuses scores across both methods and modalities in a single query.

    What is the role of reranking in hybrid search?

    Reranking is an optional but powerful stage that refines hybrid search results. After BM25 + vector score fusion produces an initial ranked list, a cross-encoder model re-scores the top-N results by processing each query-document pair jointly. Cross-encoders capture fine-grained relevance signals that the initial retrieval misses, significantly improving precision for the top results at a small latency cost.

    How does Mixpeek hybrid search compare to Elasticsearch kNN?

    Elasticsearch added kNN vector search alongside its BM25 capabilities, but combining them requires manual query construction, custom scripting for score fusion, and external embedding generation. Mixpeek provides hybrid search as a built-in feature with automatic score fusion, built-in embedding generation via 50+ feature extractors, and composable retriever pipelines that chain search, filter, and rerank stages declaratively.

    Can I add metadata filters to hybrid search?

    Yes. Mixpeek retriever pipelines support metadata filtering as a dedicated stage. You can filter before or after the hybrid search stage using boolean expressions ($and, $or, $not), range conditions on numeric and date fields, and exact match or contains conditions on string and array fields. Filters narrow the search space without affecting the relevance scoring of hybrid retrieval.

    Build Production Hybrid Search

    Stop choosing between keyword precision and semantic understanding. Build hybrid search with managed infrastructure, configurable fusion, and composable retriever pipelines.