Hybrid BM25 + Dense Vector Search
Use MVS hybrid search to combine BM25 keyword matching with dense vector similarity. Get the precision of exact keyword matches and the recall of semantic understanding in a single query.
Example query: "FastAPI Pydantic v2 validation patterns"
Why This Matters
Pure vector search can miss exact keyword matches such as version strings, identifiers, and product names. Pure keyword search misses semantic meaning, so paraphrases and synonyms go unmatched. Hybrid search gives you both, which is critical for technical content, product catalogs, and any domain with specific terminology.
from openai import OpenAI
from mixpeek import Mixpeek

openai = OpenAI(api_key="your-openai-key")
mvs = Mixpeek(api_key="your-mvs-key")
NAMESPACE = "my-namespace"


def embed(text: str) -> list[float]:
    resp = openai.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding


# Upsert documents with BOTH dense embeddings and text content
documents = [
    {"text": "FastAPI uses Pydantic v2 for data validation and serialization", "topic": "python"},
    {"text": "Express.js middleware handles request/response transformations", "topic": "node"},
    {"text": "FastAPI supports async/await natively with Starlette ASGI", "topic": "python"},
    {"text": "Django ORM provides database abstraction with QuerySet API", "topic": "python"},
]

for doc in documents:
    mvs.namespaces.documents.upsert(
        namespace=NAMESPACE,
        documents=[{
            "dense_embedding": embed(doc["text"]),
            "content": doc["text"],  # BM25 indexes this field
            "metadata": {"topic": doc["topic"]}
        }]
    )

# Hybrid search: BM25 keyword matching + dense vector similarity
query_text = "FastAPI Pydantic validation"
results = mvs.namespaces.documents.search(
    namespace=NAMESPACE,
    query={
        "dense_embedding": embed(query_text),
        "text": query_text  # BM25 component
    },
    hybrid={
        "enabled": True,
        "alpha": 0.6  # 0.0 = pure BM25, 1.0 = pure dense, 0.6 = balanced
    },
    top_k=5
)

for doc in results:
    print(f"{doc['score']:.3f} | {doc['metadata'].get('topic', '')} | {doc['content'][:80]}")
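To build intuition for the alpha setting, here is a minimal sketch of one common way to blend the two signals: min-max normalize each score set, then take a convex combination. This is an assumption made for illustration, not a description of how MVS fuses scores internally. The helper names min_max and blend and the toy scores are hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale raw scores to [0, 1] so BM25 and cosine values are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical; treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def blend(
    bm25: dict[str, float],
    dense: dict[str, float],
    alpha: float,
) -> list[tuple[str, float]]:
    """Assumed fusion rule: alpha weights dense, (1 - alpha) weights BM25."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    bm25_n, dense_n = min_max(bm25), min_max(dense)
    doc_ids = set(bm25_n) | set(dense_n)
    fused = {
        d: alpha * dense_n.get(d, 0.0) + (1.0 - alpha) * bm25_n.get(d, 0.0)
        for d in doc_ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Toy scores (made up): "a" wins on keywords, "b" wins on semantics.
bm25_scores = {"a": 4.0, "b": 1.0, "c": 0.0}
dense_scores = {"a": 0.5, "b": 0.9, "c": 0.7}

for alpha in (0.0, 0.6, 1.0):
    ranking = [doc_id for doc_id, _ in blend(bm25_scores, dense_scores, alpha)]
    print(alpha, ranking)
```

With these toy numbers, alpha = 0.0 ranks purely by keyword score and alpha = 1.0 purely by vector similarity. Intermediate values trade one off against the other, so tune alpha against real queries from your own domain.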
Retriever Stages
limit: Truncate results to a maximum count, with an optional offset for pagination.
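The truncation described above can be sketched locally as a slice over an already-ranked list. This is an illustration only, not the MVS stage implementation. The function name apply_limit and its limit and offset parameters are hypothetical.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def apply_limit(results: Sequence[T], limit: int, offset: int = 0) -> list[T]:
    """Return at most `limit` items, skipping the first `offset` items."""
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    return list(results[offset:offset + limit])


# Ranked document ids (made up), best match first.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7"]

page_1 = apply_limit(ranked, limit=3)            # first three results
page_2 = apply_limit(ranked, limit=3, offset=3)  # next three results
page_3 = apply_limit(ranked, limit=3, offset=6)  # final, partial page
print(page_1, page_2, page_3)
```

Offset pagination like this assumes the ranking is stable between requests. If documents are upserted between page fetches, items can shift across page boundaries.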