Semantic Search API: AI-Powered Search Beyond Keywords
Semantic search is a core retrieval stage in the multimodal data warehouse. Search by meaning, not just by matching terms -- with multimodal embeddings, hybrid retrieval, and composable pipelines that understand what your users are really looking for.
Semantic Search vs. Keyword Search
Keyword search matches terms. Semantic search understands intent. The difference is transformative for search quality.
Query Understanding
Keyword search: Exact term matching. 'car repair' only finds documents containing those exact words.
Semantic search: Meaning-based matching. 'car repair' also finds 'automobile maintenance', 'vehicle fix', and 'auto mechanic services'.
Handling Synonyms
Keyword search: Misses synonyms entirely. You need to manually expand queries with OR clauses for every variation.
Semantic search: Understands synonyms natively. Embedding models encode meaning, so related terms cluster together in vector space.
Context Awareness
Keyword search: No context. 'apple' returns results about fruit and technology equally, with no way to disambiguate.
Semantic search: Context-aware. Given surrounding query context, semantic search disambiguates 'apple pie recipe' from 'apple stock price'.
Multi-Language
Keyword search: Language-specific. Searching in English does not find relevant French or Spanish documents.
Semantic search: Cross-lingual. Multilingual embedding models map all languages into one vector space, enabling cross-language retrieval.
How Semantic Search Works
From content to embeddings to results -- the pipeline that powers meaning-based retrieval.
Embedding Generation
Content is processed through embedding models (CLIP, SigLIP, sentence transformers) that convert text, images, and other modalities into dense vector representations. Semantically similar content maps to nearby points in vector space.
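The geometric idea behind "similar content maps to nearby points" can be sketched with cosine similarity over toy vectors. The 4-dimensional vectors below are hand-picked for illustration, not real model outputs -- in practice they would come from a model such as CLIP or a sentence transformer and have hundreds of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two dense vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, chosen to illustrate clustering)
emb_car_repair = [0.9, 0.1, 0.8, 0.2]
emb_auto_maintenance = [0.85, 0.15, 0.75, 0.25]  # semantically close topic
emb_apple_pie = [0.1, 0.9, 0.2, 0.8]             # unrelated topic

sim_related = cosine_similarity(emb_car_repair, emb_auto_maintenance)
sim_unrelated = cosine_similarity(emb_car_repair, emb_apple_pie)
# Related phrasings score much higher than unrelated content
print(sim_related, sim_unrelated)
```

Because similarity is measured geometrically, 'car repair' and 'automobile maintenance' match even though they share no words.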
Vector Indexing
Embeddings are indexed into Qdrant namespaces with optimized HNSW indexes for approximate nearest neighbor search. Metadata payloads are stored alongside vectors for filtering. Mixpeek handles index tuning, sharding, and replication automatically.
Semantic Retrieval
At query time, the search query is embedded using the same model. Qdrant finds the nearest vectors by cosine similarity. Results are ranked by semantic relevance -- not keyword frequency. Retriever pipelines add filtering and reranking stages.
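The query-time flow above can be sketched with a brute-force nearest-neighbor search over a tiny in-memory index. The document IDs and vectors are hypothetical; a production system would use an ANN index such as HNSW, but the ranking logic is the same:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical tiny index: document id -> embedding
index = {
    "doc_sso_guide":   [0.9, 0.2, 0.1],
    "doc_billing_faq": [0.1, 0.9, 0.3],
    "doc_api_keys":    [0.7, 0.3, 0.2],
}

def semantic_search(query_embedding, k=2):
    """Score every document against the query vector, return the top k."""
    scored = [(doc_id, cosine(query_embedding, emb)) for doc_id, emb in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Stands in for embedding the query text with the same model used at index time
query_embedding = [0.85, 0.25, 0.15]
top = semantic_search(query_embedding)
```

Note that results are ordered purely by vector similarity -- keyword frequency never enters the computation.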
Reranking and Fusion
Optional cross-encoder reranking refines initial results by scoring query-document pairs with more expensive but more accurate models. Score fusion combines semantic similarity with keyword BM25 scores and metadata relevance for optimal ranking.
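A minimal sketch of weighted score fusion, assuming min-max normalization before combining (the normalization scheme and the 0.7/0.3 weights are illustrative choices, not a description of Mixpeek internals):

```python
def fuse_scores(vector_scores, keyword_scores, w_vector=0.7, w_keyword=0.3):
    """Weighted linear fusion of normalized semantic and BM25 scores."""
    def normalize(scores):
        # Min-max normalize so cosine scores and raw BM25 are comparable
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    v, k = normalize(vector_scores), normalize(keyword_scores)
    docs = set(v) | set(k)
    fused = {d: w_vector * v.get(d, 0.0) + w_keyword * k.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores: cosine similarities and raw BM25 values
vector_scores = {"doc_a": 0.92, "doc_b": 0.55, "doc_c": 0.80}
keyword_scores = {"doc_b": 12.0, "doc_c": 3.0}  # doc_b has an exact keyword hit

ranking = fuse_scores(vector_scores, keyword_scores)
```

Tuning the weights shifts the balance: raising `w_keyword` favors exact matches such as product IDs and error codes, raising `w_vector` favors semantic matches.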
Semantic Search Capabilities
Everything you need to build, deploy, and scale production semantic search across every data modality.
Multimodal Embeddings
Generate embeddings from text, images, video, audio, and documents using 50+ feature extractors. All modalities map into a shared vector space, enabling cross-modal semantic search.
- Unified embedding space across modalities
- CLIP, SigLIP, sentence-transformers, and custom models
- Batch and real-time embedding generation on Ray GPUs
Composable Retriever Pipelines
Build multi-stage retrieval pipelines that chain semantic search, keyword matching, metadata filtering, and reranking into a single query execution.
- Chain search, filter, and rerank stages
- Weighted score fusion across methods
- Configurable per-stage parameters
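The stage-chaining idea can be sketched as functions composed over a candidate list. The stages, fields, and the token-overlap stand-in for a reranker below are all hypothetical simplifications of the pipeline concept:

```python
# Each stage: list of candidate dicts in, list of candidate dicts out.

def search_stage(candidates):
    # Pretend these scores came back from a vector search; sort best-first
    return sorted(candidates, key=lambda c: c["score"], reverse=True)

def filter_stage(candidates, doc_types=("guide", "faq")):
    # Metadata filter, analogous to a filter stage's conditions
    return [c for c in candidates if c["doc_type"] in doc_types]

def rerank_stage(candidates, limit=2):
    # Stand-in for a cross-encoder: re-sort and keep only the top `limit`
    reranked = sorted(candidates, key=lambda c: -c["score"])
    return reranked[:limit]

def run_pipeline(candidates, stages):
    """Thread the candidate list through each stage in order."""
    for stage in stages:
        candidates = stage(candidates)
    return candidates

docs = [
    {"title": "SSO setup", "doc_type": "guide", "score": 0.91},
    {"title": "Release notes", "doc_type": "changelog", "score": 0.88},
    {"title": "Billing FAQ", "doc_type": "faq", "score": 0.74},
]
results = run_pipeline(docs, [search_stage, filter_stage, rerank_stage])
```

Because each stage has the same shape, stages can be reordered, dropped, or given different parameters without touching the others -- the core appeal of composable retrieval.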
Custom Embedding Models
Bring your own fine-tuned embedding models via the Docker plugin system. Deploy domain-specific models that understand your vocabulary and data distribution better than general-purpose models.
- Docker-based custom extractor plugins
- GPU-accelerated inference on Ray clusters
- A/B testing across embedding models
Sub-100ms Query Latency
Qdrant's optimized HNSW indexes deliver semantic search results in under 100 milliseconds, even across millions of documents. Production-ready for real-time search applications.
- HNSW approximate nearest neighbor search
- Automatic index optimization and tuning
- Horizontal scaling for high-throughput workloads
Semantic Search Use Cases
From enterprise knowledge bases to e-commerce product discovery -- semantic search transforms how users find what they need.
Enterprise Knowledge Search
Replace brittle keyword search across internal documents, wikis, and knowledge bases. Semantic search understands what employees are looking for, even when they use different terminology than the source documents.
E-Commerce Product Discovery
Let customers describe what they want in natural language and find matching products. Semantic search bridges the gap between how customers describe products and how catalogs are structured.
Customer Support Retrieval
Surface relevant knowledge base articles and past resolutions for support tickets. Semantic search matches the intent of customer queries to existing solutions, reducing response times and improving resolution rates.
Research and Discovery
Search across scientific papers, patents, legal documents, and technical archives by concept rather than keyword. Find relevant prior art, related research, and supporting evidence across massive document collections.
Mixpeek Semantic Search vs. Alternatives
See how Mixpeek compares to search platforms and vector databases for semantic search.
| Feature | Mixpeek | Algolia | Elasticsearch | Pinecone |
|---|---|---|---|---|
| Embedding Generation | Built-in (50+ extractors, custom models) | NeuralSearch (text only, limited models) | BYO models (manual integration) | BYO models (external embedding) |
| Multimodal Search | Native (text, image, video, audio, PDF) | Text only | Text + limited vector (BYO embeddings) | Vector only (BYO embeddings, any modality) |
| Hybrid Search | Built-in (vector + BM25 + metadata fusion) | NeuralSearch + keyword | BM25 + kNN (manual fusion) | Sparse + dense vectors |
| Retriever Pipelines | Composable multi-stage (filter, search, rerank) | Rules-based ranking | Query DSL (single-stage) | Single-stage vector search |
| Data Processing | Built-in feature extraction on Ray GPUs | No processing (push pre-processed data) | Ingest pipelines (text processing only) | No processing (push pre-computed vectors) |
| Deployment Options | Managed, Dedicated, BYO Cloud | Managed SaaS only | Self-managed or Elastic Cloud | Managed SaaS only |
Build Semantic Search in Minutes
A simple Python API to generate embeddings, index content, and search semantically with hybrid retrieval.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

# Create a collection with semantic embedding extractors
collection = client.collections.create(
    name="knowledge-base",
    namespace="docs",
    extractors=[
        {
            "type": "text_embedding",
            "model": "sentence-transformers/all-MiniLM-L6-v2",
            "config": {
                "chunk_size": 512,
                "chunk_overlap": 50
            }
        },
        {
            "type": "image_embedding",
            "model": "clip-vit-large",
            "config": {
                "extract_from_documents": True
            }
        }
    ]
)

# Upload documents to trigger embedding generation
client.buckets.upload(
    bucket="my-bucket",
    files=["handbook.pdf", "product_guide.pdf", "faq.md"],
    collection=collection.id
)

# Semantic search with retriever pipeline
results = client.retrievers.execute(
    namespace="docs",
    stages=[
        {
            "type": "feature_search",
            "method": "hybrid",
            "query": {
                "text": "How do I configure single sign-on for my team?",
                "modalities": ["text", "image"]
            },
            "weights": {
                "vector": 0.7,
                "keyword": 0.3
            },
            "limit": 20
        },
        {
            "type": "filter",
            "conditions": {
                "metadata.doc_type": {"$in": ["guide", "faq"]},
                "metadata.updated_after": "2026-01-01"
            }
        },
        {
            "type": "rerank",
            "model": "cross-encoder",
            "limit": 5
        }
    ]
)

for result in results:
    print(f"Score: {result.score}")
    print(f"Source: {result.metadata['filename']}")
    print(f"Section: {result.metadata['section']}")
    print(f"Content: {result.content[:200]}")

Frequently Asked Questions
What is semantic search?
Semantic search is a search technique that understands the meaning and intent behind a query, rather than relying on exact keyword matches. It uses embedding models to convert text (and other data types) into dense vector representations, then finds results by measuring similarity in vector space. This means a query for 'how to fix a broken pipe' can find results about 'plumbing repair' even if those exact words do not appear.
How is semantic search different from keyword search?
Keyword search (like BM25 or TF-IDF) matches documents based on the exact terms in the query. It excels at precision when users know the right terminology but fails when there is a vocabulary mismatch. Semantic search uses vector embeddings to match by meaning, handling synonyms, paraphrases, and conceptual similarity. Mixpeek supports both and combines them in hybrid search for the best of both approaches.
What are embeddings and how do they enable semantic search?
Embeddings are dense numerical vectors that represent the meaning of content. Embedding models (like CLIP, sentence-transformers, or SigLIP) are trained to map semantically similar content to nearby points in a high-dimensional vector space. When you search, your query is embedded using the same model, and the system finds documents whose vectors are closest to the query vector by cosine similarity or dot product.
Does Mixpeek support semantic search across images and video?
Yes. Mixpeek uses multimodal embedding models like CLIP and SigLIP that map text, images, and video frames into a shared vector space. This enables cross-modal semantic search -- you can search with a text query and retrieve relevant images, or search with an image and find semantically similar video frames. All modalities are indexed in the same Qdrant namespace.
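A shared vector space is what makes cross-modal retrieval mechanically simple: text and image items sit in one index, and a text query ranks them all by the same similarity measure. The item IDs and vectors below are hand-picked toys, not CLIP outputs:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy shared space: image and text items indexed together
items = [
    {"id": "photo_golden_retriever", "modality": "image", "emb": [0.9, 0.1, 0.2]},
    {"id": "article_dog_training",   "modality": "text",  "emb": [0.7, 0.4, 0.3]},
    {"id": "photo_skyline",          "modality": "image", "emb": [0.1, 0.9, 0.4]},
]

# Stands in for embedding the text query "a dog playing" with a multimodal model
text_query_emb = [0.85, 0.15, 0.25]
ranked = sorted(items, key=lambda it: cosine(text_query_emb, it["emb"]), reverse=True)
```

The top hit is an image even though the query was text -- no per-modality index or translation layer is involved.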
What is hybrid search and why is it better than pure semantic search?
Hybrid search combines semantic vector search with keyword-based BM25 search and fuses their scores. Pure semantic search excels at understanding intent and handling synonyms, but can miss exact matches that keyword search catches easily (like product IDs, error codes, or proper nouns). Hybrid search gives you the best of both -- semantic understanding plus keyword precision -- with configurable weights for each method.
How does Mixpeek compare to building semantic search with Pinecone or Elasticsearch?
Pinecone and Elasticsearch require you to generate embeddings externally and push pre-computed vectors. Mixpeek handles the entire pipeline: feature extraction (embedding generation) on managed Ray GPU clusters, vector indexing in Qdrant, and composable retriever pipelines for search. You also get built-in multimodal support, so images, video, and audio are searchable alongside text without separate infrastructure.
Can I use my own fine-tuned embedding models?
Yes. Mixpeek supports custom embedding models through its Docker-based plugin system. Package your fine-tuned model in a container, register it as a custom feature extractor, and it runs on Mixpeek's Ray GPU clusters alongside built-in extractors. This is useful for domain-specific applications where fine-tuned models significantly outperform general-purpose embeddings.
How does reranking improve semantic search results?
Reranking uses a cross-encoder model to re-score the top results from an initial retrieval stage. Unlike bi-encoders used for embedding generation, cross-encoders process the query and document together, enabling more accurate relevance scoring at the cost of higher latency. Mixpeek supports reranking as a stage in its composable retriever pipelines, letting you balance speed and accuracy by reranking only the top-N results.
