
    Document Intelligence Search

    Extract and search through PDFs, presentations, and documents. Combines OCR, layout analysis, and semantic search for comprehensive document retrieval.

    text
    image
    Multi-Tier
    3.2K runs
    from mixpeek import Mixpeek

    client = Mixpeek(api_key="YOUR_API_KEY")

    # Create a namespace and a collection that runs the document extractors
    namespace = client.namespaces.create(name="doc-search")
    collection = client.collections.create(
        namespace_id=namespace.id,
        name="contracts",
        extractors=["pdf-extraction", "text-embedding-v2", "ocr"],
        chunk_strategy="page-based"
    )

    # Upload documents
    client.buckets.upload(
        collection_id=collection.id,
        url="s3://your-bucket/contracts/"
    )

    # Search for exact legal terms with a high BM25 weight.
    # NOTE: `retriever` is not defined in this snippet; it must refer to a
    # retriever already created for this collection. The BM25 weight is not
    # passed here, so it is presumably configured on that retriever.
    results = client.retrievers.execute(
        retriever_id=retriever.id,
        query="indemnification clause with liability cap"
    )
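
The snippet above mentions a high BM25 weight for exact legal terms. As a rough illustration of what such a weight does, here is a minimal, library-free sketch of weighted score fusion between a keyword (BM25) score and a semantic similarity score. This is not Mixpeek's implementation; the function names, the min-max normalization, and the `bm25_weight` parameter are assumptions made for this example.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two scorers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse_scores(
    bm25: Dict[str, float],
    semantic: Dict[str, float],
    bm25_weight: float = 0.7,  # hypothetical knob; higher favors exact terms
) -> Dict[str, float]:
    """Blend normalized keyword and semantic scores with a linear weight."""
    if not 0.0 <= bm25_weight <= 1.0:
        raise ValueError("bm25_weight must be between 0 and 1")
    bm25_n = min_max_normalize(bm25)
    sem_n = min_max_normalize(semantic)
    doc_ids = set(bm25_n) | set(sem_n)
    return {
        doc_id: bm25_weight * bm25_n.get(doc_id, 0.0)
        + (1.0 - bm25_weight) * sem_n.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def rank(fused: Dict[str, float]) -> List[str]:
    """Return document ids, best first (ties broken by id for stability)."""
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


# Made-up scores for three contract pages.
bm25_scores = {"a": 10.0, "b": 2.0, "c": 6.0}
semantic_scores = {"a": 0.2, "b": 0.9, "c": 0.55}

keyword_heavy = rank(fuse_scores(bm25_scores, semantic_scores, bm25_weight=0.7))
semantic_heavy = rank(fuse_scores(bm25_scores, semantic_scores, bm25_weight=0.2))
```

With a high BM25 weight, page "a" (the strongest exact-term match) ranks first; lowering the weight lets the semantically closest page "b" overtake it. That trade-off is why a keyword-heavy setting can suit queries built from precise legal phrases.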

    Feature Extractors

    PDF Text Extraction

    Extract structured text and layout information from PDFs

    645K runs

    Retriever Stages

    rerank

    Rerank documents using cross-encoder models for accurate relevance

    sort
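
The two stages listed above can be pictured as functions applied in sequence to a candidate list. The sketch below is illustrative only: a real rerank stage would score each (query, document) pair with a cross-encoder model, whereas `toy_pair_score` is a stand-in based on token overlap so the example runs without any model. The behavior shown for `sort` (ordering by a metadata field) is likewise an assumption, since the page gives no description for that stage.

```python
from typing import Callable, Dict, List, Optional

Document = Dict[str, object]


def toy_pair_score(query: str, text: str) -> float:
    """Stand-in for a cross-encoder: fraction of query tokens found in text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank_stage(
    query: str,
    docs: List[Document],
    score_fn: Callable[[str, str], float] = toy_pair_score,
    top_k: Optional[int] = None,
) -> List[Document]:
    """Score every (query, document) pair jointly and order best first."""
    scored = [
        {**doc, "rerank_score": score_fn(query, str(doc["text"]))}
        for doc in docs
    ]
    scored.sort(key=lambda d: d["rerank_score"], reverse=True)
    return scored if top_k is None else scored[:top_k]


def sort_stage(
    docs: List[Document], field: str, descending: bool = False
) -> List[Document]:
    """Order documents by a metadata field (assumed behavior of `sort`)."""
    return sorted(docs, key=lambda d: d[field], reverse=descending)


# Made-up candidate pages.
candidates = [
    {"id": "p1", "text": "payment terms and invoicing", "page": 3},
    {"id": "p2", "text": "indemnification clause with liability cap of one million", "page": 7},
    {"id": "p3", "text": "liability insurance requirements", "page": 5},
]

query = "indemnification clause with liability cap"
top_two = rerank_stage(query, candidates, top_k=2)
by_page = sort_stage(top_two, field="page")
```

Passing `score_fn` as a parameter keeps the stage independent of any particular model, so a real cross-encoder scorer could be swapped in without changing the pipeline shape.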

    Use Cases Using This Recipe

    Intermediate

    Insurance Claims Document Processing

    Extract structured data from claims documents, photos, and correspondence automatically

    70% reduction in manual document handling

    Adjuster data entry time

    Who It's For

    Insurance carriers, claims adjusters, and third-party administrators processing 1,000+ claims monthly across property, casualty, auto, and health lines

    Beginner

    Semantic Search for Knowledge Bases

    Find answers by meaning, not keywords, across your entire knowledge repository

    85% of queries answered on first search vs. 40% baseline

    First-search success rate

    Who It's For

    Knowledge management teams, internal documentation owners, customer support organizations, and EdTech platforms maintaining 10K+ articles, documents, and multimedia resources

    Intermediate

    Enterprise RAG Search

    Ask questions across all your enterprise data and get sourced, verifiable answers

    80% faster from question to answer

    Information retrieval time

    Who It's For

    Financial services firms, consulting organizations, legal teams, and enterprise knowledge workers who need to synthesize information across thousands of internal documents, reports, and presentations