Semantic Search API: AI-Powered Search Beyond Keywords
Semantic search is a core retrieval stage in the multimodal data warehouse. Search by meaning, not just by matching terms -- with multimodal embeddings, hybrid retrieval, and composable pipelines that understand what your users are really looking for.
Semantic Search vs. Keyword Search
Keyword search matches terms. Semantic search understands intent. The difference is transformative for search quality.
Query Understanding
Keyword search: Exact term matching. 'car repair' only finds documents containing those exact words.
Semantic search: Meaning-based matching. 'car repair' also finds 'automobile maintenance', 'vehicle fix', and 'auto mechanic services'.
Handling Synonyms
Keyword search: Misses synonyms entirely. You need to manually expand queries with OR clauses for every variation.
Semantic search: Understands synonyms natively. Embedding models encode meaning, so related terms cluster together in vector space.
Context Awareness
Keyword search: No context. 'apple' returns results about fruit and technology equally, with no way to disambiguate.
Semantic search: Context-aware. Given surrounding query context, semantic search disambiguates 'apple pie recipe' from 'apple stock price'.
Multi-Language
Keyword search: Language-specific. Searching in English does not find relevant French or Spanish documents.
Semantic search: Cross-lingual. Multilingual embedding models map all languages into one vector space, enabling cross-language retrieval.
How Semantic Search Works
From content to embeddings to results -- the pipeline that powers meaning-based retrieval.
Embedding Generation
Content is processed through embedding models (CLIP, SigLIP, sentence transformers) that convert text, images, and other modalities into dense vector representations. Semantically similar content maps to nearby points in vector space.
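The geometric idea behind "similar content maps to nearby points" can be sketched with cosine similarity over toy vectors. The 4-dimensional vectors below are hand-picked for illustration, not real model outputs -- in practice they would come from a model such as CLIP or a sentence transformer and have hundreds of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two dense vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, chosen to illustrate clustering)
emb_car_repair = [0.9, 0.1, 0.8, 0.2]
emb_auto_maintenance = [0.85, 0.15, 0.75, 0.25]  # semantically close topic
emb_apple_pie = [0.1, 0.9, 0.2, 0.8]             # unrelated topic

sim_related = cosine_similarity(emb_car_repair, emb_auto_maintenance)
sim_unrelated = cosine_similarity(emb_car_repair, emb_apple_pie)
# Related phrasings score much higher than unrelated content
print(sim_related, sim_unrelated)
```

Because similarity is measured geometrically, 'car repair' and 'automobile maintenance' match even though they share no words.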
Vector Indexing
Embeddings are indexed into Qdrant namespaces with optimized HNSW indexes for approximate nearest neighbor search. Metadata payloads are stored alongside vectors for filtering. Mixpeek handles index tuning, sharding, and replication automatically.
Semantic Retrieval
At query time, the search query is embedded using the same model. Qdrant finds the nearest vectors by cosine similarity. Results are ranked by semantic relevance -- not keyword frequency. Retriever pipelines add filtering and reranking stages.
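The query-time flow above can be sketched with a brute-force nearest-neighbor search over a tiny in-memory index. The document IDs and vectors are hypothetical; a production system would use an ANN index such as HNSW, but the ranking logic is the same:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical tiny index: document id -> embedding
index = {
    "doc_sso_guide":   [0.9, 0.2, 0.1],
    "doc_billing_faq": [0.1, 0.9, 0.3],
    "doc_api_keys":    [0.7, 0.3, 0.2],
}

def semantic_search(query_embedding, k=2):
    """Score every document against the query vector, return the top k."""
    scored = [(doc_id, cosine(query_embedding, emb)) for doc_id, emb in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Stands in for embedding the query text with the same model used at index time
query_embedding = [0.85, 0.25, 0.15]
top = semantic_search(query_embedding)
```

Note that results are ordered purely by vector similarity -- keyword frequency never enters the computation.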
Reranking and Fusion
Optional cross-encoder reranking refines initial results by scoring query-document pairs with more expensive but more accurate models. Score fusion combines semantic similarity with keyword BM25 scores and metadata relevance for optimal ranking.
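A minimal sketch of weighted score fusion, assuming min-max normalization before combining (the normalization scheme and the 0.7/0.3 weights are illustrative choices, not a description of Mixpeek internals):

```python
def fuse_scores(vector_scores, keyword_scores, w_vector=0.7, w_keyword=0.3):
    """Weighted linear fusion of normalized semantic and BM25 scores."""
    def normalize(scores):
        # Min-max normalize so cosine scores and raw BM25 are comparable
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    v, k = normalize(vector_scores), normalize(keyword_scores)
    docs = set(v) | set(k)
    fused = {d: w_vector * v.get(d, 0.0) + w_keyword * k.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores: cosine similarities and raw BM25 values
vector_scores = {"doc_a": 0.92, "doc_b": 0.55, "doc_c": 0.80}
keyword_scores = {"doc_b": 12.0, "doc_c": 3.0}  # doc_b has an exact keyword hit

ranking = fuse_scores(vector_scores, keyword_scores)
```

Tuning the weights shifts the balance: raising `w_keyword` favors exact matches such as product IDs and error codes, raising `w_vector` favors semantic matches.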
Semantic Search Capabilities
Everything you need to build, deploy, and scale production semantic search across every data modality.
Multimodal Embeddings
Generate embeddings from text, images, video, audio, and documents using 50+ feature extractors. All modalities map into a shared vector space, enabling cross-modal semantic search.
- Unified embedding space across modalities
- CLIP, SigLIP, sentence-transformers, and custom models
- Batch and real-time embedding generation on Ray GPUs
Composable Retriever Pipelines
Build multi-stage retrieval pipelines that chain semantic search, keyword matching, metadata filtering, and reranking into a single query execution.
- Chain search, filter, and rerank stages
- Weighted score fusion across methods
- Configurable per-stage parameters
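The stage-chaining idea can be sketched as functions composed over a candidate list. The stages, fields, and the token-overlap stand-in for a reranker below are all hypothetical simplifications of the pipeline concept:

```python
# Each stage: list of candidate dicts in, list of candidate dicts out.

def search_stage(candidates):
    # Pretend these scores came back from a vector search; sort best-first
    return sorted(candidates, key=lambda c: c["score"], reverse=True)

def filter_stage(candidates, doc_types=("guide", "faq")):
    # Metadata filter, analogous to a filter stage's conditions
    return [c for c in candidates if c["doc_type"] in doc_types]

def rerank_stage(candidates, limit=2):
    # Stand-in for a cross-encoder: re-sort and keep only the top `limit`
    reranked = sorted(candidates, key=lambda c: -c["score"])
    return reranked[:limit]

def run_pipeline(candidates, stages):
    """Thread the candidate list through each stage in order."""
    for stage in stages:
        candidates = stage(candidates)
    return candidates

docs = [
    {"title": "SSO setup", "doc_type": "guide", "score": 0.91},
    {"title": "Release notes", "doc_type": "changelog", "score": 0.88},
    {"title": "Billing FAQ", "doc_type": "faq", "score": 0.74},
]
results = run_pipeline(docs, [search_stage, filter_stage, rerank_stage])
```

Because each stage has the same shape, stages can be reordered, dropped, or given different parameters without touching the others -- the core appeal of composable retrieval.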
Custom Embedding Models
Bring your own fine-tuned embedding models via the Docker plugin system. Deploy domain-specific models that understand your vocabulary and data distribution better than general-purpose models.
- Docker-based custom extractor plugins
- GPU-accelerated inference on Ray clusters
- A/B testing across embedding models
Sub-100ms Query Latency
Qdrant's optimized HNSW indexes deliver semantic search results in under 100 milliseconds, even across millions of documents. Production-ready for real-time search applications.
- HNSW approximate nearest neighbor search
- Automatic index optimization and tuning
- Horizontal scaling for high-throughput workloads
Semantic Search Use Cases
From enterprise knowledge bases to e-commerce product discovery -- semantic search transforms how users find what they need.
Enterprise Knowledge Search
Replace brittle keyword search across internal documents, wikis, and knowledge bases. Semantic search understands what employees are looking for, even when they use different terminology than the source documents.
E-Commerce Product Discovery
Let customers describe what they want in natural language and find matching products. Semantic search bridges the gap between how customers describe products and how catalogs are structured.
Customer Support Retrieval
Surface relevant knowledge base articles and past resolutions for support tickets. Semantic search matches the intent of customer queries to existing solutions, reducing response times and improving resolution rates.
Research and Discovery
Search across scientific papers, patents, legal documents, and technical archives by concept rather than keyword. Find relevant prior art, related research, and supporting evidence across massive document collections.
Mixpeek Semantic Search vs. Alternatives
See how Mixpeek compares to search platforms and vector databases for semantic search.
| Feature | Mixpeek | Algolia | Elasticsearch | Pinecone |
|---|---|---|---|---|
| Embedding Generation | Built-in (50+ extractors, custom models) | NeuralSearch (text only, limited models) | BYO models (manual integration) | BYO models (external embedding) |
| Multimodal Search | Native (text, image, video, audio, PDF) | Text only | Text + limited vector (BYO embeddings) | Vector only (BYO embeddings, any modality) |
| Hybrid Search | Built-in (vector + BM25 + metadata fusion) | NeuralSearch + keyword | BM25 + kNN (manual fusion) | Sparse + dense vectors |
| Retriever Pipelines | Composable multi-stage (filter, search, rerank) | Rules-based ranking | Query DSL (single-stage) | Single-stage vector search |
| Data Processing | Built-in feature extraction on Ray GPUs | No processing (push pre-processed data) | Ingest pipelines (text processing only) | No processing (push pre-computed vectors) |
| Deployment Options | Managed, Dedicated, BYO Cloud | Managed SaaS only | Self-managed or Elastic Cloud | Managed SaaS only |
Build Semantic Search in Minutes
A simple Python API to generate embeddings, index content, and search semantically with hybrid retrieval.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

# Create a collection with semantic embedding extractors
collection = client.collections.create(
    name="knowledge-base",
    namespace="docs",
    extractors=[
        {
            "type": "text_embedding",
            "model": "sentence-transformers/all-MiniLM-L6-v2",
            "config": {
                "chunk_size": 512,
                "chunk_overlap": 50
            }
        },
        {
            "type": "image_embedding",
            "model": "clip-vit-large",
            "config": {
                "extract_from_documents": True
            }
        }
    ]
)

# Upload documents to trigger embedding generation
client.buckets.upload(
    bucket="my-bucket",
    files=["handbook.pdf", "product_guide.pdf", "faq.md"],
    collection=collection.id
)

# Semantic search with retriever pipeline
results = client.retrievers.execute(
    namespace="docs",
    stages=[
        {
            "type": "feature_search",
            "method": "hybrid",
            "query": {
                "text": "How do I configure single sign-on for my team?",
                "modalities": ["text", "image"]
            },
            "weights": {
                "vector": 0.7,
                "keyword": 0.3
            },
            "limit": 20
        },
        {
            "type": "filter",
            "conditions": {
                "metadata.doc_type": {"$in": ["guide", "faq"]},
                "metadata.updated_after": "2026-01-01"
            }
        },
        {
            "type": "rerank",
            "model": "cross-encoder",
            "limit": 5
        }
    ]
)

for result in results:
    print(f"Score: {result.score}")
    print(f"Source: {result.metadata['filename']}")
    print(f"Section: {result.metadata['section']}")
    print(f"Content: {result.content[:200]}")

Frequently Asked Questions
What is semantic search?
Semantic search is a search technique that understands the meaning and intent behind a query, rather than relying on exact keyword matches. It uses embedding models to convert text (and other data types) into dense vector representations, then finds results by measuring similarity in vector space. This means a query for 'how to fix a broken pipe' can find results about 'plumbing repair' even if those exact words do not appear.
How is semantic search different from keyword search?
Keyword search (like BM25 or TF-IDF) matches documents based on the exact terms in the query. It excels at precision when users know the right terminology but fails when there is a vocabulary mismatch. Semantic search uses vector embeddings to match by meaning, handling synonyms, paraphrases, and conceptual similarity. Mixpeek supports both and combines them in hybrid search for the best of both approaches.
What are embeddings and how do they enable semantic search?
Embeddings are dense numerical vectors that represent the meaning of content. Embedding models (like CLIP, sentence-transformers, or SigLIP) are trained to map semantically similar content to nearby points in a high-dimensional vector space. When you search, your query is embedded using the same model, and the system finds documents whose vectors are closest to the query vector by cosine similarity or dot product.
Does Mixpeek support semantic search across images and video?
Yes. Mixpeek uses multimodal embedding models like CLIP and SigLIP that map text, images, and video frames into a shared vector space. This enables cross-modal semantic search -- you can search with a text query and retrieve relevant images, or search with an image and find semantically similar video frames. All modalities are indexed in the same Qdrant namespace.
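A shared vector space is what makes cross-modal retrieval mechanically simple: text and image items sit in one index, and a text query ranks them all by the same similarity measure. The item IDs and vectors below are hand-picked toys, not CLIP outputs:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy shared space: image and text items indexed together
items = [
    {"id": "photo_golden_retriever", "modality": "image", "emb": [0.9, 0.1, 0.2]},
    {"id": "article_dog_training",   "modality": "text",  "emb": [0.7, 0.4, 0.3]},
    {"id": "photo_skyline",          "modality": "image", "emb": [0.1, 0.9, 0.4]},
]

# Stands in for embedding the text query "a dog playing" with a multimodal model
text_query_emb = [0.85, 0.15, 0.25]
ranked = sorted(items, key=lambda it: cosine(text_query_emb, it["emb"]), reverse=True)
```

The top hit is an image even though the query was text -- no per-modality index or translation layer is involved.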
What is hybrid search and why is it better than pure semantic search?
Hybrid search combines semantic vector search with keyword-based BM25 search and fuses their scores. Pure semantic search excels at understanding intent and handling synonyms, but can miss exact matches that keyword search catches easily (like product IDs, error codes, or proper nouns). Hybrid search gives you the best of both -- semantic understanding plus keyword precision -- with configurable weights for each method.
How does Mixpeek compare to building semantic search with Pinecone or Elasticsearch?
Pinecone and Elasticsearch require you to generate embeddings externally and push pre-computed vectors. Mixpeek handles the entire pipeline: feature extraction (embedding generation) on managed Ray GPU clusters, vector indexing in Qdrant, and composable retriever pipelines for search. You also get built-in multimodal support, so images, video, and audio are searchable alongside text without separate infrastructure.
Can I use my own fine-tuned embedding models?
Yes. Mixpeek supports custom embedding models through its Docker-based plugin system. Package your fine-tuned model in a container, register it as a custom feature extractor, and it runs on Mixpeek's Ray GPU clusters alongside built-in extractors. This is useful for domain-specific applications where fine-tuned models significantly outperform general-purpose embeddings.
How does reranking improve semantic search results?
Reranking uses a cross-encoder model to re-score the top results from an initial retrieval stage. Unlike bi-encoders used for embedding generation, cross-encoders process the query and document together, enabling more accurate relevance scoring at the cost of higher latency. Mixpeek supports reranking as a stage in its composable retriever pipelines, letting you balance speed and accuracy by reranking only the top-N results.
