Best of Both Worlds

Hybrid Search: Combine Keyword and Vector Search

Hybrid search is one stage type in a multi-stage retrieval pipeline. In the multimodal data warehouse, it fuses BM25 keyword precision with vector semantic understanding. Additional filter, rerank, and enrich stages compose on top for complete production retrieval.

Why Hybrid Search

Keyword search catches exact terms. Vector search catches meaning. Production applications need both.

Keyword Search Alone Misses Intent

Keyword Only (BM25)

BM25 excels at exact term matching but fails when the query uses different words from those in the documents. Searching for 'car repair' misses results about 'automobile maintenance'. Synonym expansion helps, but the synonym lists must be curated and kept up to date by hand.

Hybrid Search

Hybrid search adds vector similarity to capture semantic meaning. The query 'car repair' matches documents about 'automobile maintenance' through embedding space proximity, while BM25 still catches exact product IDs and technical terms.

Vector Search Alone Misses Precision

Vector Only

Pure vector search understands meaning but can miss exact matches that matter. Searching for error code 'ERR-4021' might return results about generic error handling instead of the specific error. Proper nouns, IDs, and code snippets need exact matching.

Hybrid Search

Hybrid search preserves BM25 keyword precision for exact terms while using vectors for semantic understanding. Error code 'ERR-4021' matches exactly, while the surrounding context is understood semantically.

Production Search Needs Both

Single Method

Real-world queries contain both semantic intent and specific terms. A query like 'Python implementation of quicksort with O(n log n) complexity' needs semantic understanding of the concept and exact matching on technical terms.

Hybrid Search

Mixpeek's hybrid search fuses BM25 keyword scores with vector similarity scores using configurable weights. Add metadata filtering and cross-encoder reranking as additional stages for production-grade retrieval quality.

How Hybrid Search Works

BM25 keyword matching and vector semantic search run in parallel, with score fusion and optional reranking for production-grade retrieval.

BM25 Keyword Index

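Content is tokenized and indexed using BM25 (Best Matching 25), the industry-standard probabilistic text retrieval algorithm. BM25 scores documents by term frequency and inverse document frequency, with document-length normalization, so it excels at exact term matching and rare term boosting. Because BM25 treats text as a bag of words, exact phrase queries depend on positional matching in the keyword index rather than on BM25 scoring itself.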
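To make the scoring concrete, here is a minimal sketch of Okapi BM25 over a toy corpus. It assumes a naive whitespace tokenizer, the commonly used parameters k1 = 1.5 and b = 0.75, and the non-negative IDF variant popularized by Lucene. The function names and sample documents are illustrative only and are not Mixpeek's implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer; real analyzers also handle
    # punctuation, stemming, and stop words.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in docs against query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a larger IDF, so they dominate the score.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "error ERR-4021 raised during token refresh",
    "generic error handling guide",
    "automobile maintenance schedule",
]
scores = bm25_scores("ERR-4021 error", docs)
mismatch = bm25_scores("car repair", docs)
```

In this toy corpus the first document scores highest because it contains the rare token ERR-4021, while the 'car repair' query scores zero everywhere, including on the 'automobile maintenance' document. That is the vocabulary gap vector search is meant to close.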

Vector Embedding Index

The same content is processed through embedding models (CLIP, sentence-transformers, SigLIP) to generate dense vector representations. These embeddings capture semantic meaning and are indexed in Qdrant HNSW indexes for approximate nearest neighbor search.
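As an illustration of what the vector index returns, the sketch below runs an exact brute-force cosine search. It is a stand-in for approximate nearest neighbor search, not an HNSW implementation, and the three-dimensional vectors are hand-made toy values rather than output from a real embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, index, k=2):
    """Return the k (doc_id, similarity) pairs closest to query_vec."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hand-made toy vectors (assumption), not real model embeddings.
index = {
    "automobile maintenance": [0.9, 0.1, 0.0],
    "error handling": [0.0, 0.2, 0.9],
    "bike repair": [0.6, 0.6, 0.1],
}
car_repair_query = [0.8, 0.3, 0.0]
top = nearest(car_repair_query, index)
```

An ANN index such as HNSW trades a small amount of recall for speed by visiting only part of the collection, but the ranking criterion is the same similarity measure.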

Score Fusion

At query time, both BM25 and vector search execute in parallel. Their results are combined using either reciprocal rank fusion (RRF), which merges the two lists by rank position and needs no score normalization, or a weighted linear combination of normalized scores. You control the balance -- more weight on keywords for precise queries, more on vectors for exploratory searches.
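One way to picture the weighted linear combination is the sketch below. It assumes min-max normalization (one of several possible normalization schemes) and treats a document missing from one list as scoring zero there. The function names and sample scores are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fusion(keyword_scores, vector_scores,
                    keyword_weight=0.4, vector_weight=0.6):
    """Combine normalized keyword and vector scores into one ranked list."""
    kw = min_max(keyword_scores) if keyword_scores else {}
    vec = min_max(vector_scores) if vector_scores else {}
    fused = {
        doc: keyword_weight * kw.get(doc, 0.0) + vector_weight * vec.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    return sorted(fused.items(), key=lambda pair: pair[1], reverse=True)


# Raw BM25 scores and cosine similarities live on different scales.
keyword_scores = {"doc_a": 12.0, "doc_b": 3.0, "doc_c": 7.5}
vector_scores = {"doc_b": 0.91, "doc_c": 0.88, "doc_d": 0.42}

ranking = weighted_fusion(keyword_scores, vector_scores)
keyword_heavy = weighted_fusion(keyword_scores, vector_scores,
                                keyword_weight=0.8, vector_weight=0.2)
```

With the vector-leaning weights, doc_c ranks first because it scores well in both lists. Shifting the weights toward keywords moves doc_a, the best BM25 match, to the top.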

Reranking

Optionally, a cross-encoder reranker re-scores the fused results by processing each query-document pair through a more expensive but more accurate model. This refines ranking quality, especially for ambiguous queries where initial fusion alone may not capture the optimal ordering.
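The reranking step can be sketched as a function that re-scores only the head of the fused list. The scorer here is a deliberately simple token-overlap placeholder so the example runs without a model; in practice a cross-encoder would fill that role. All names are illustrative.

```python
def rerank(query, candidates, score_pair, top_n=10):
    """Re-score the first top_n candidates with score_pair and reorder them.

    Candidates beyond top_n keep their fused order and follow the reranked head.
    """
    head, tail = candidates[:top_n], candidates[top_n:]
    rescored = sorted(head, key=lambda doc: score_pair(query, doc), reverse=True)
    return rescored + tail


def overlap_scorer(query, doc):
    # Placeholder for a cross-encoder (assumption): the fraction of query
    # tokens that also appear in the document.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q)


fused = [
    "single sign-on overview",
    "how to configure sso with saml 2.0",
    "saml 2.0 attribute mapping",
    "password reset guide",
]
reranked = rerank("configure sso with saml 2.0", fused, overlap_scorer, top_n=3)
```

Because the scorer runs once per candidate, the cost of this stage grows with top_n, which is why reranking is applied to a short list rather than to the whole collection.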

Why Hybrid Beats Pure Vector Search

Vector search revolutionized retrieval, but production applications consistently show that hybrid outperforms pure vector approaches.

Exact Term Matching

Product SKUs, error codes, version numbers, and proper nouns need exact matching that BM25 provides. Vector search may return semantically related but wrong specific results for these queries.

Rare Term Boosting

BM25's inverse document frequency (IDF) component naturally boosts rare, informative terms. Vector embeddings can under-weight rare terms because they are underrepresented in training data. Hybrid search preserves this signal.

Robustness to Embedding Failures

Embedding models can have blind spots -- domain-specific jargon, newly coined terms, or niche topics where the model was not well-trained. BM25 provides a fallback that catches results the vector model misses.

Benchmark Performance

On standard information retrieval benchmarks (BEIR, MTEB), hybrid search typically achieves higher nDCG and recall scores than either BM25 or vector search alone. Gains are often in the 5-15% range, but they vary with the dataset, domain, and embedding model, so measure on your own queries.
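To compare retrieval configurations on your own data, the two metrics named above can be computed as in this sketch. It uses the linear-gain form of DCG (some evaluations use 2^rel - 1 instead), and the relevance judgments and rankings are made-up toy values, not benchmark results.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg_at_k(ranked_ids, relevance, k):
    """relevance maps doc_id -> graded relevance (0 means not relevant)."""
    gains = [relevance.get(doc, 0) for doc in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked_ids, relevance, k):
    """Fraction of relevant documents that appear in the top k."""
    relevant = {doc for doc, rel in relevance.items() if rel > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_ids[:k])) / len(relevant)


# Toy judgments and rankings (assumption), for illustration only.
relevance = {"a": 3, "b": 2, "c": 1}
perfect = ["a", "b", "c"]
partial = ["x", "a", "y"]

perfect_ndcg = ndcg_at_k(perfect, relevance, k=3)
partial_ndcg = ndcg_at_k(partial, relevance, k=3)
partial_recall = recall_at_k(partial, relevance, k=3)
```

Running the same query set through keyword-only, vector-only, and hybrid configurations and comparing these numbers is a more reliable guide than published averages.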

Hybrid Search Capabilities

Configurable fusion, metadata filtering, reranking, and multimodal support -- everything you need for production hybrid retrieval.

Configurable Score Fusion

Control how BM25 and vector scores combine. Use reciprocal rank fusion for robust defaults, or weighted linear combination for fine-grained control. Adjust weights per query type or let the system optimize automatically.

Reciprocal rank fusion (RRF)
Weighted linear combination with configurable weights
Per-query weight adjustment via API parameters

Metadata Filtering

Add pre-retrieval and post-retrieval metadata filters to narrow results. Filter by date, category, author, language, or any custom metadata field stored alongside your documents in Qdrant namespaces.

Boolean filter expressions ($and, $or, $not)
Range filters for numeric and date fields
Nested metadata field filtering
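The semantics of these filter expressions can be sketched with a small evaluator. This is an illustration of how boolean, range, and nested-field conditions might be interpreted over plain dictionaries; it is not Mixpeek's or Qdrant's implementation, and the set of operators shown is an assumption.

```python
def get_path(doc, dotted):
    """Resolve a dotted path such as 'metadata.author.name' in nested dicts."""
    value = doc
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


OPERATORS = {
    "$eq": lambda v, arg: v == arg,
    "$in": lambda v, arg: v in arg,
    "$gte": lambda v, arg: v is not None and v >= arg,
    "$lte": lambda v, arg: v is not None and v <= arg,
}


def matches(doc, expr):
    """Return True when doc satisfies every clause of the filter expression."""
    for key, cond in expr.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in cond)
        elif key == "$not":
            ok = not matches(doc, cond)
        else:
            value = get_path(doc, key)
            if not isinstance(cond, dict):
                cond = {"$eq": cond}  # A bare value means equality.
            ok = all(OPERATORS[op](value, arg) for op, arg in cond.items())
        if not ok:
            return False
    return True


doc = {"metadata": {"doc_type": "guide", "year": 2024, "author": {"name": "kim"}}}
expr = {
    "$and": [
        {"metadata.doc_type": {"$in": ["guide", "tutorial"]}},
        {"metadata.year": {"$gte": 2023, "$lte": 2025}},
        {"$not": {"metadata.author.name": "lee"}},
    ]
}
keep = matches(doc, expr)
```

Applied before retrieval, a predicate like this shrinks the candidate set; applied afterward, it removes non-matching hits from the ranked list.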

Cross-Encoder Reranking

Refine hybrid search results with cross-encoder models that score query-document pairs jointly. Reranking improves precision for the top-N results; because only those N candidates are re-scored, the added latency stays bounded and grows with the rerank depth you configure.

Multiple cross-encoder model options
Configurable rerank depth (top-N)
Stage-based pipeline integration

Multimodal Hybrid Search

Hybrid search is not limited to text. Combine keyword matching over transcripts and metadata with vector similarity over images, video frames, and audio embeddings in a single retriever pipeline.

Text + image + video + audio in one query
Cross-modal score fusion
Modality-specific weight configuration

Hybrid Search Architecture

A four-layer architecture that runs keyword and vector retrieval in parallel, fuses scores, and refines with reranking.

Query Processing

The incoming query is processed in parallel: tokenized for BM25 keyword matching and embedded for vector search. For multimodal queries, different modality-specific embeddings are generated simultaneously.

Parallel Retrieval

BM25 retrieval runs against the keyword index while vector ANN search runs against Qdrant HNSW indexes. Both return scored candidate lists independently. Metadata pre-filters are applied at this stage to narrow the search space.
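The parallel step can be pictured with a thread pool that launches both retrievers and waits for both candidate lists. The two retriever functions below are hypothetical stand-ins that return fixed results; a real system would query the keyword index and the vector index here.

```python
from concurrent.futures import ThreadPoolExecutor


def keyword_retrieve(query):
    # Hypothetical stand-in for a BM25 index lookup.
    return [("doc_a", 11.2), ("doc_c", 6.4)]


def vector_retrieve(query):
    # Hypothetical stand-in for an approximate nearest neighbor lookup.
    return [("doc_c", 0.93), ("doc_b", 0.87)]


def parallel_retrieve(query):
    """Run both retrievers concurrently and return their candidate lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_future = pool.submit(keyword_retrieve, query)
        vector_future = pool.submit(vector_retrieve, query)
        return {
            "keyword": keyword_future.result(),
            "vector": vector_future.result(),
        }


candidates = parallel_retrieve("configure SSO with SAML 2.0")
```

Since each retriever spends most of its time waiting on its index, total latency is close to the slower of the two rather than their sum.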

Score Fusion Layer

Candidate lists from keyword and vector retrieval are merged. Reciprocal rank fusion (RRF) combines them by rank position, while weighted combination normalizes the raw scores before summing them. In both cases, documents appearing in both lists accumulate a contribution from each and rank higher.

Reranking Layer

The top-N fused results pass through a cross-encoder reranker for refined scoring. The reranker processes query-document pairs jointly, capturing fine-grained relevance signals that bi-encoders and BM25 miss.

Mixpeek Hybrid Search vs. Alternatives

See how Mixpeek compares to search engines and vector databases for hybrid retrieval.

Feature	Mixpeek	Weaviate	Elasticsearch	Vespa
Hybrid Search	Built-in (BM25 + vector + metadata fusion)	Built-in (BM25 + vector)	Manual (BM25 + kNN, custom fusion)	Built-in (multiple retrieval + blending)
Multimodal Support	Native (text, image, video, audio, PDF)	Limited (text + CLIP images)	Text only (BYO vectors)	Text + vectors (BYO embeddings)
Embedding Generation	Built-in (50+ extractors on Ray GPUs)	Built-in modules (limited models)	BYO models (external generation)	BYO models (external generation)
Retriever Pipelines	Composable multi-stage (search, filter, rerank)	Single-stage with modules	Query DSL (complex configuration)	Ranking profiles (YAML configuration)
Infrastructure Management	Fully managed (zero-ops)	Self-managed or Weaviate Cloud	Self-managed or Elastic Cloud	Self-managed or Vespa Cloud
Deployment Options	Managed, Dedicated, BYO Cloud	Self-managed or SaaS	Self-managed or Elastic Cloud	Self-managed or Vespa Cloud

Build Hybrid Search in Minutes

A simple Python API to combine keyword and vector search with configurable fusion weights and cross-encoder reranking.

hybrid_search.py

from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

# Create a collection with text and image extractors
collection = client.collections.create(
    name="product-docs",
    namespace="documentation",
    extractors=[
        {
            "type": "text_embedding",
            "model": "sentence-transformers/all-MiniLM-L6-v2",
            "config": {
                "chunk_size": 512,
                "chunk_overlap": 50,
                "generate_bm25_index": True
            }
        },
        {
            "type": "image_embedding",
            "model": "clip-vit-large",
            "config": {
                "extract_from_documents": True
            }
        }
    ]
)

# Hybrid search with configurable fusion weights
results = client.retrievers.execute(
    namespace="documentation",
    stages=[
        # Stage 1: hybrid retrieval, fusing keyword and vector scores with the weights below
        {
            "type": "feature_search",
            "method": "hybrid",
            "query": {
                "text": "how to configure SSO with SAML 2.0",
                "modalities": ["text"]
            },
            "weights": {
                "vector": 0.6,
                "keyword": 0.4
            },
            "limit": 30
        },
        # Stage 2: keep only guides and tutorials for product version 3.0 or later
        {
            "type": "filter",
            "conditions": {
                "metadata.doc_type": {"$in": ["guide", "tutorial"]},
                "metadata.product_version": {"$gte": "3.0"}
            }
        },
        # Stage 3: rerank the surviving candidates and return the top 10
        {
            "type": "rerank",
            "model": "cross-encoder/ms-marco-MiniLM-L-12-v2",
            "limit": 10
        }
    ]
)

# Results combine keyword precision with semantic understanding
for result in results:
    print(f"Score: {result.score}")
    print(f"BM25 Score: {result.scores.get('keyword', 0)}")
    print(f"Vector Score: {result.scores.get('vector', 0)}")
    print(f"Source: {result.metadata['filename']}")
    print(f"Section: {result.metadata['section']}")
    print(f"Content: {result.content[:200]}")

Frequently Asked Questions

What is hybrid search?

Hybrid search combines multiple retrieval methods -- typically keyword-based BM25 search and vector-based semantic search -- and fuses their results into a single ranked list. This approach captures both the precision of exact keyword matching and the recall of semantic understanding. When a user searches for 'Python quicksort implementation', BM25 catches the exact term 'quicksort' while vector search understands the broader concept of sorting algorithms.

How does hybrid search improve over pure vector search?

Pure vector search excels at understanding meaning but struggles with exact terms -- product IDs, error codes, proper nouns, and technical terminology. Hybrid search adds BM25 keyword matching that catches these exact terms with high precision. In benchmarks, hybrid search consistently outperforms either method alone, particularly for queries that mix natural language with specific identifiers.

What is reciprocal rank fusion (RRF)?

Reciprocal rank fusion (RRF) is a score combination method that merges ranked lists from different retrieval methods. Instead of normalizing raw scores (which have different scales across methods), RRF uses the reciprocal of each document's rank position: score = 1 / (k + rank), where rank is the document's 1-based position in a list and k is a smoothing constant (60 is a common default). Documents appearing in multiple lists get their RRF scores summed. This is robust and requires no score calibration, making it a reliable default for hybrid search.
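The formula translates into a few lines of code. This sketch assumes 1-based ranks and k = 60, a commonly used default; the document ids and rankings are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of doc ids; each list adds 1 / (k + rank) per doc."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


keyword_ranking = ["doc_a", "doc_c", "doc_b"]
vector_ranking = ["doc_c", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Here doc_c (ranks 2 and 1) edges out doc_a (ranks 1 and 3), and both outrank doc_d and doc_b, which each appear in only one list. Raw BM25 and similarity scores never enter the calculation.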

How do I choose weights for keyword vs. vector search?

Start with a balanced split (0.5 keyword, 0.5 vector) and adjust based on your data and query patterns. For technical documentation with specific terms, increase keyword weight (0.6-0.7). For conversational or exploratory queries, increase vector weight (0.6-0.7). Mixpeek lets you set weights per query through the API, so you can dynamically adjust based on query characteristics or A/B test different configurations.
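As one possible way to adjust weights per query, the sketch below uses a simple heuristic: queries containing identifier-like tokens lean toward keywords, long natural-language queries lean toward vectors, and everything else stays balanced. The regular expression, the six-word threshold, and the function name are assumptions for illustration, not part of the Mixpeek API, and should be tuned against your own traffic.

```python
import re

# Identifier-like tokens (assumption): codes such as ERR-4021 and
# version numbers such as 2.0 or v3.1.4.
IDENTIFIER = re.compile(r"\b(?:[A-Z]{2,}-\d+|v?\d+\.\d+(?:\.\d+)?)\b")


def choose_weights(query):
    """Pick keyword and vector weights from surface features of the query."""
    if IDENTIFIER.search(query):
        # Exact identifiers present: favor keyword matching.
        return {"keyword": 0.7, "vector": 0.3}
    if len(query.split()) >= 6:
        # Long conversational query: favor semantic matching.
        return {"keyword": 0.4, "vector": 0.6}
    # Short query with no identifiers: balanced split.
    return {"keyword": 0.5, "vector": 0.5}


precise = choose_weights("ERR-4021 token refresh")
exploratory = choose_weights("what is the best way to store user sessions")
balanced = choose_weights("car repair")
```

The returned dictionary has the same shape as the weights field in the retriever example, so it could be passed through per request and compared against a fixed split in an A/B test.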

Does hybrid search work with multimodal data?

Yes. In Mixpeek, hybrid search extends beyond text. You can combine BM25 keyword matching over transcripts, OCR text, and metadata with vector similarity over image embeddings, audio embeddings, and video frame embeddings. All modalities are indexed in the same Qdrant namespace, and the retriever pipeline fuses scores across both methods and modalities in a single query.

What is the role of reranking in hybrid search?

Reranking is an optional but powerful stage that refines hybrid search results. After BM25 + vector score fusion produces an initial ranked list, a cross-encoder model re-scores the top-N results by processing each query-document pair jointly. Cross-encoders capture fine-grained relevance signals that the initial retrieval misses, significantly improving precision for the top results at a small latency cost.

How does Mixpeek hybrid search compare to Elasticsearch kNN?

Elasticsearch added kNN vector search alongside its BM25 capabilities, but combining them requires manual query construction, custom scripting for score fusion, and external embedding generation. Mixpeek provides hybrid search as a built-in feature with automatic score fusion, built-in embedding generation via 50+ feature extractors, and composable retriever pipelines that chain search, filter, and rerank stages declaratively.

Can I add metadata filters to hybrid search?

Yes. Mixpeek retriever pipelines support metadata filtering as a dedicated stage. You can filter before or after the hybrid search stage using boolean expressions ($and, $or, $not), range conditions on numeric and date fields, and exact match or contains conditions on string and array fields. Filters narrow the search space without affecting the relevance scoring of hybrid retrieval.

Build Production Hybrid Search

Stop choosing between keyword precision and semantic understanding. Build hybrid search with managed infrastructure, configurable fusion, and composable retriever pipelines.