
    Semantic Search API: AI-Powered Search Beyond Keywords

    Semantic search is one retrieval stage in the multimodal data warehouse. Search by meaning, not just matching terms -- with multimodal embeddings, hybrid retrieval, and composable pipelines that understand what your users are really looking for.

    Semantic Search vs. Keyword Search

    Keyword search matches terms. Semantic search understands intent. The difference is transformative for search quality.

    Query Understanding

    • Keyword search: Exact term matching. 'car repair' only finds documents containing those exact words.
    • Semantic search: Meaning-based matching. 'car repair' also finds 'automobile maintenance', 'vehicle fix', and 'auto mechanic services'.

    Handling Synonyms

    • Keyword search: Misses synonyms entirely. You need to manually expand queries with OR clauses for every variation.
    • Semantic search: Understands synonyms natively. Embedding models encode meaning, so related terms cluster together in vector space.

    Context Awareness

    • Keyword search: No context. 'apple' returns results about fruit and technology equally, with no way to disambiguate.
    • Semantic search: Context-aware. Given surrounding query context, semantic search disambiguates 'apple pie recipe' from 'apple stock price'.

    Multi-Language

    • Keyword search: Language-specific. Searching in English does not find relevant French or Spanish documents.
    • Semantic search: Cross-lingual. Multilingual embedding models map all languages into one vector space, enabling cross-language retrieval.
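    The contrast above can be made concrete with a toy cosine-similarity check. This is an illustrative sketch, not the Mixpeek API: the 3-dimensional vectors are hand-crafted stand-ins for real embeddings, which typically have hundreds of dimensions.

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hand-crafted toy vectors standing in for real embeddings. The synonyms
# share no surface terms, yet land near each other in vector space.
embeddings = {
    "car repair":             [0.9, 0.1, 0.0],
    "automobile maintenance": [0.8, 0.2, 0.1],
    "apple pie recipe":       [0.1, 0.9, 0.2],
}

q = embeddings["car repair"]
print(cosine(q, embeddings["automobile maintenance"]))  # high (~0.98)
print(cosine(q, embeddings["apple pie recipe"]))        # low  (~0.21)
```

    A keyword matcher would score 'automobile maintenance' at zero for the query 'car repair'; in vector space it is the nearest neighbor.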

    How Semantic Search Works

    From content to embeddings to results -- the pipeline that powers meaning-based retrieval.

    Embedding Generation

    Content is processed through embedding models (CLIP, SigLIP, sentence transformers) that convert text, images, and other modalities into dense vector representations. Semantically similar content maps to nearby points in vector space.

    Vector Indexing

    Embeddings are indexed into Qdrant namespaces with optimized HNSW indexes for approximate nearest neighbor search. Metadata payloads are stored alongside vectors for filtering. Mixpeek handles index tuning, sharding, and replication automatically.

    Semantic Retrieval

    At query time, the search query is embedded using the same model. Qdrant finds the nearest vectors by cosine similarity. Results are ranked by semantic relevance -- not keyword frequency. Retriever pipelines add filtering and reranking stages.
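    A minimal sketch of that query path in plain Python -- brute-force nearest neighbor over a toy in-memory index with metadata payloads. This is a conceptual illustration only, not the Mixpeek SDK or Qdrant itself (which uses HNSW rather than exhaustive scan); the documents and payload fields are invented.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Toy index: each entry pairs a vector with a metadata payload, mirroring
# how payloads are stored alongside vectors to support filtering.
index = [
    {"vector": [0.9, 0.1], "payload": {"doc_type": "guide", "title": "SSO setup"}},
    {"vector": [0.2, 0.8], "payload": {"doc_type": "blog",  "title": "Release notes"}},
    {"vector": [0.7, 0.3], "payload": {"doc_type": "faq",   "title": "Login issues"}},
]

def search(query_vector, allowed_types, limit=2):
    # Filter on metadata first, then rank the survivors by cosine similarity.
    candidates = [e for e in index if e["payload"]["doc_type"] in allowed_types]
    candidates.sort(key=lambda e: cosine(query_vector, e["vector"]), reverse=True)
    return [e["payload"]["title"] for e in candidates[:limit]]

print(search([0.8, 0.2], {"guide", "faq"}))  # ['SSO setup', 'Login issues']
```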

    Reranking and Fusion

    Optional cross-encoder reranking refines initial results by scoring query-document pairs with more expensive but more accurate models. Score fusion combines semantic similarity with keyword BM25 scores and metadata relevance for optimal ranking.
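    The fusion step can be sketched as a weighted sum over normalized scores. The document IDs, raw scores, and 0.7/0.3 weights below are invented for illustration; min-max normalization is one common way to put cosine similarities and BM25 scores on a comparable scale before mixing them.

```python
def minmax(scores):
    # Rescale scores to [0, 1] so vector and BM25 ranges are comparable.
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def fuse(vector_scores, bm25_scores, w_vector=0.7, w_keyword=0.3):
    v, k = minmax(vector_scores), minmax(bm25_scores)
    docs = set(v) | set(k)
    # A document missing from one method contributes 0 for that method.
    fused = {d: w_vector * v.get(d, 0.0) + w_keyword * k.get(d, 0.0) for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

vector_scores = {"doc_a": 0.92, "doc_b": 0.85, "doc_c": 0.40}
bm25_scores   = {"doc_b": 12.3, "doc_c": 9.1, "doc_a": 1.2}
print(fuse(vector_scores, bm25_scores))  # ['doc_b', 'doc_a', 'doc_c']
```

    Note how doc_b wins the fused ranking despite doc_a having the best raw vector score: its strong keyword score pulls it ahead once both signals are weighted.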

    Semantic Search Capabilities

    Everything you need to build, deploy, and scale production semantic search across every data modality.

    Multimodal Embeddings

    Generate embeddings from text, images, video, audio, and documents using 50+ feature extractors. All modalities map into a shared vector space, enabling cross-modal semantic search.

    • Unified embedding space across modalities
    • CLIP, SigLIP, sentence-transformers, and custom models
    • Batch and real-time embedding generation on Ray GPUs

    Composable Retriever Pipelines

    Build multi-stage retrieval pipelines that chain semantic search, keyword matching, metadata filtering, and reranking into a single query execution.

    • Chain search, filter, and rerank stages
    • Weighted score fusion across methods
    • Configurable per-stage parameters
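    One way to picture composability (a toy sketch, not the Mixpeek retriever API): each stage is a function from a candidate list to a candidate list, and a pipeline is just the chain of those functions. The candidate records and stage parameters here are invented.

```python
# Each stage maps a result list to a result list; a pipeline is their
# composition, mirroring a search -> filter -> limit chain.
def make_filter(predicate):
    return lambda results: [r for r in results if predicate(r)]

def make_limit(n):
    return lambda results: results[:n]

def run_pipeline(results, stages):
    for stage in stages:
        results = stage(results)
    return results

# Hypothetical candidates from a first-stage semantic search, best first.
candidates = [
    {"id": 1, "score": 0.91, "doc_type": "guide"},
    {"id": 2, "score": 0.88, "doc_type": "blog"},
    {"id": 3, "score": 0.73, "doc_type": "faq"},
]

pipeline = [
    make_filter(lambda r: r["doc_type"] in {"guide", "faq"}),
    make_limit(1),
]
print(run_pipeline(candidates, pipeline))
# [{'id': 1, 'score': 0.91, 'doc_type': 'guide'}]
```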

    Custom Embedding Models

    Bring your own fine-tuned embedding models via the Docker plugin system. Deploy domain-specific models that understand your vocabulary and data distribution better than general-purpose models.

    • Docker-based custom extractor plugins
    • GPU-accelerated inference on Ray clusters
    • A/B testing across embedding models

    Sub-100ms Query Latency

    Qdrant's optimized HNSW indexes deliver semantic search results in under 100 milliseconds, even across millions of documents. Production-ready for real-time search applications.

    • HNSW approximate nearest neighbor search
    • Automatic index optimization and tuning
    • Horizontal scaling for high-throughput workloads

    Semantic Search Use Cases

    From enterprise knowledge bases to e-commerce product discovery -- semantic search transforms how users find what they need.

    Enterprise Knowledge Search

    Replace brittle keyword search across internal documents, wikis, and knowledge bases. Semantic search understands what employees are looking for, even when they use different terminology than the source documents.

    E-Commerce Product Discovery

    Let customers describe what they want in natural language and find matching products. Semantic search bridges the gap between how customers describe products and how catalogs are structured.

    Customer Support Retrieval

    Surface relevant knowledge base articles and past resolutions for support tickets. Semantic search matches the intent of customer queries to existing solutions, reducing response times and improving resolution rates.

    Research and Discovery

    Search across scientific papers, patents, legal documents, and technical archives by concept rather than keyword. Find relevant prior art, related research, and supporting evidence across massive document collections.

    Mixpeek Semantic Search vs. Alternatives

    See how Mixpeek compares to search platforms and vector databases for semantic search.

    | Feature | Mixpeek | Algolia | Elasticsearch | Pinecone |
    | --- | --- | --- | --- | --- |
    | Embedding Generation | Built-in (50+ extractors, custom models) | NeuralSearch (text only, limited models) | BYO models (manual integration) | BYO models (external embedding) |
    | Multimodal Search | Native (text, image, video, audio, PDF) | Text only | Text + limited vector (BYO embeddings) | Vector only (BYO embeddings, any modality) |
    | Hybrid Search | Built-in (vector + BM25 + metadata fusion) | NeuralSearch + keyword | BM25 + kNN (manual fusion) | Sparse + dense vectors |
    | Retriever Pipelines | Composable multi-stage (filter, search, rerank) | Rules-based ranking | Query DSL (single-stage) | Single-stage vector search |
    | Data Processing | Built-in feature extraction on Ray GPUs | No processing (push pre-processed data) | Ingest pipelines (text processing only) | No processing (push pre-computed vectors) |
    | Deployment Options | Managed, Dedicated, BYO Cloud | Managed SaaS only | Self-managed or Elastic Cloud | Managed SaaS only |

    Build Semantic Search in Minutes

    A simple Python API to generate embeddings, index content, and search semantically with hybrid retrieval.

    semantic_search.py
    from mixpeek import Mixpeek
    
    client = Mixpeek(api_key="YOUR_API_KEY")
    
    # Create a collection with semantic embedding extractors
    collection = client.collections.create(
        name="knowledge-base",
        namespace="docs",
        extractors=[
            {
                "type": "text_embedding",
                "model": "sentence-transformers/all-MiniLM-L6-v2",
                "config": {
                    "chunk_size": 512,
                    "chunk_overlap": 50
                }
            },
            {
                "type": "image_embedding",
                "model": "clip-vit-large",
                "config": {
                    "extract_from_documents": True
                }
            }
        ]
    )
    
    # Upload documents to trigger embedding generation
    client.buckets.upload(
        bucket="my-bucket",
        files=["handbook.pdf", "product_guide.pdf", "faq.md"],
        collection=collection.id
    )
    
    # Semantic search with retriever pipeline
    results = client.retrievers.execute(
        namespace="docs",
        stages=[
            {
                "type": "feature_search",
                "method": "hybrid",
                "query": {
                    "text": "How do I configure single sign-on for my team?",
                    "modalities": ["text", "image"]
                },
                "weights": {
                    "vector": 0.7,
                    "keyword": 0.3
                },
                "limit": 20
            },
            {
                "type": "filter",
                "conditions": {
                    "metadata.doc_type": {"$in": ["guide", "faq"]},
                    "metadata.updated_after": "2026-01-01"
                }
            },
            {
                "type": "rerank",
                "model": "cross-encoder",
                "limit": 5
            }
        ]
    )
    
    for result in results:
        print(f"Score: {result.score}")
        print(f"Source: {result.metadata['filename']}")
        print(f"Section: {result.metadata['section']}")
        print(f"Content: {result.content[:200]}")

    Frequently Asked Questions

    What is semantic search?

    Semantic search is a search technique that understands the meaning and intent behind a query, rather than relying on exact keyword matches. It uses embedding models to convert text (and other data types) into dense vector representations, then finds results by measuring similarity in vector space. This means a query for 'how to fix a broken pipe' can find results about 'plumbing repair' even if those exact words do not appear.

    How is semantic search different from keyword search?

    Keyword search (like BM25 or TF-IDF) matches documents based on the exact terms in the query. It excels at precision when users know the right terminology but fails when there is a vocabulary mismatch. Semantic search uses vector embeddings to match by meaning, handling synonyms, paraphrases, and conceptual similarity. Mixpeek supports both and combines them in hybrid search for the best of both approaches.

    What are embeddings and how do they enable semantic search?

    Embeddings are dense numerical vectors that represent the meaning of content. Embedding models (like CLIP, sentence-transformers, or SigLIP) are trained to map semantically similar content to nearby points in a high-dimensional vector space. When you search, your query is embedded using the same model, and the system finds documents whose vectors are closest to the query vector by cosine similarity or dot product distance.

    Does Mixpeek support semantic search across images and video?

    Yes. Mixpeek uses multimodal embedding models like CLIP and SigLIP that map text, images, and video frames into a shared vector space. This enables cross-modal semantic search -- you can search with a text query and retrieve relevant images, or search with an image and find semantically similar video frames. All modalities are indexed in the same Qdrant namespace.

    What is hybrid search and why is it better than pure semantic search?

    Hybrid search combines semantic vector search with keyword-based BM25 search and fuses their scores. Pure semantic search excels at understanding intent and handling synonyms, but can miss exact matches that keyword search catches easily (like product IDs, error codes, or proper nouns). Hybrid search gives you the best of both -- semantic understanding plus keyword precision -- with configurable weights for each method.

    How does Mixpeek compare to building semantic search with Pinecone or Elasticsearch?

    Pinecone and Elasticsearch require you to generate embeddings externally and push pre-computed vectors. Mixpeek handles the entire pipeline: feature extraction (embedding generation) on managed Ray GPU clusters, vector indexing in Qdrant, and composable retriever pipelines for search. You also get built-in multimodal support, so images, video, and audio are searchable alongside text without separate infrastructure.

    Can I use my own fine-tuned embedding models?

    Yes. Mixpeek supports custom embedding models through its Docker-based plugin system. Package your fine-tuned model in a container, register it as a custom feature extractor, and it runs on Mixpeek's Ray GPU clusters alongside built-in extractors. This is useful for domain-specific applications where fine-tuned models significantly outperform general-purpose embeddings.

    How does reranking improve semantic search results?

    Reranking uses a cross-encoder model to re-score the top results from an initial retrieval stage. Unlike bi-encoders used for embedding generation, cross-encoders process the query and document together, enabling more accurate relevance scoring at the cost of higher latency. Mixpeek supports reranking as a stage in its composable retriever pipelines, letting you balance speed and accuracy by reranking only the top-N results.

    Build Semantic Search That Understands Meaning

    Stop losing users to irrelevant keyword results. Build production semantic search with managed embeddings, hybrid retrieval, and composable pipelines.