
    Retrieval for agents, on your object storage.

    Bring your own vectors with MVS, or let Managed extract and index your files. One retrieval API, on the object storage you already use.

    MVS

    Bring vectors

    Agent-native vector store on object storage. Dense, sparse, and BM25 search. First 1M vectors free forever.

    Managed

    Connect files

    Managed indexing extracts scenes, faces, OCR, transcripts, and embeddings from video, images, audio, PDFs, and docs.

    Built for production retrieval
    1M vectors free, no expiration
    10× lower cost vs. in-memory vector DBs
    <100 ms p95 hybrid retrieval
    Billions of vectors, scale on object storage

    Live demo

    See retrieval in action.

    Search inside a video by what's on screen, said, or written, using the same hybrid retrieval your agents call through the API.

    Quickstart

    One install. Two paths.

    Most retrieval stacks mean gluing together a vector DB, a file pipeline, and an agent layer. Mixpeek is one install with two ways in.

    Install

    pip install mixpeek

    MVS quickstart

    Bring embeddings

    # pip install mixpeek
    from mixpeek import Mixpeek

    mp = Mixpeek(api_key="YOUR_KEY")

    # `embedding` and `query_embedding` are vectors you compute with your own
    # embedding model before calling the API.
    mp.namespaces.documents.upsert(
        namespace="agent_memory",
        documents=[{
            "id": "clip_001",
            "dense_embedding": embedding,
            "metadata": {"source": "s3://media/clip_001.mp4"},
        }],
    )

    # Retrieve the 10 nearest documents to the query vector.
    results = mp.namespaces.documents.search(
        namespace="agent_memory",
        queries=[{"vector": query_embedding, "top_k": 10}],
    )

    01 BYO embeddings
    02 Object storage
    03 Agent retrieval

    Why object storage

    Retrieval that lives where your data does.

    Vectors and extracted features persist on the object storage you already own, so there's no in-memory index to keep hot and no extra copies to manage.

    Object storage first

    S3, GCS, B2, Azure, R2, MinIO, and S3-compatible stores stay the system of record, so no data leaves your cloud.

    Agent-ready retrieval

    Tools get searchable context with metadata, filters, traces, and deterministic retrieval plans.

    Production controls

    Usage limits, audit trails, namespaces, and self-hosted deployment for real workloads.

    For agents

    Query Mixpeek from wherever your agent runs, with the same retrieval API everywhere.

    Under the hood

    From object to retrieval.

    Watch a file get decomposed into features, indexed, and made searchable across talent, IP, taste, and compliance workflows.

    Live retriever · Talent search across 10k video ads
    Decompose · Sources: 4

    Connect any object store. Every file becomes a hierarchy of typed, versioned features.

    Super Bowl ads: s3://mxp-ads/2026/*.mp4
    Creator headshots: 42k reference faces
    Casting database: conflicts + rates
    Outtake reels: agency archive

    Store + Enrich · 47 ms
    Feature extractors
    Face (arcface-v2): face_box, face_embedding, identity
    Scene (clip-vit-l): scene_embedding, scene_id
    Transcript (whisper-v3): transcript, language
    detect · embed · match · filter · rank
    10,482 ads indexed · 14 features/file

    Reassemble · Retrievers: 4

    Multi-stage pipelines: search, filter, join, rerank. Deterministic, auditable traces.

    Face search: find talent across ads
    Conflict detection: brand competitors
    Utilization report: by creator / quarter
    Scene lookup: find the exact moment
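
    The search, filter, join, rerank flow with an auditable trace can be sketched in plain Python. This is a minimal illustration, not Mixpeek's retriever implementation: the `Trace` class, `run_pipeline` function, sample clips, and casting table are all hypothetical, and the "search" stage is a simple score threshold standing in for a real index query.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Ordered record of what each stage returned."""
    steps: list = field(default_factory=list)

    def record(self, stage, items):
        self.steps.append({"stage": stage, "ids": [item["id"] for item in items]})


def run_pipeline(candidates, stages):
    """Apply each (name, function) stage in order and log the surviving ids."""
    trace = Trace()
    items = list(candidates)
    for name, stage in stages:
        items = stage(items)
        trace.record(name, items)
    return items, trace


# Hypothetical data: scored ad clips and a casting table keyed by clip id.
clips = [
    {"id": "ad_1", "score": 0.91, "brand": "acme"},
    {"id": "ad_2", "score": 0.42, "brand": "globex"},
    {"id": "ad_3", "score": 0.77, "brand": "acme"},
    {"id": "ad_4", "score": 0.88, "brand": "globex"},
]
casting = {"ad_1": "creator_a", "ad_3": "creator_b", "ad_4": "creator_c"}

stages = [
    # Stand-in for a vector search: keep candidates above a score threshold.
    ("search", lambda xs: [x for x in xs if x["score"] >= 0.5]),
    ("filter", lambda xs: [x for x in xs if x["brand"] == "acme"]),
    ("join", lambda xs: [{**x, "creator": casting.get(x["id"])} for x in xs]),
    # Sort by score, breaking ties by id so the output order is deterministic.
    ("rerank", lambda xs: sorted(xs, key=lambda x: (-x["score"], x["id"]))),
]

results, trace = run_pipeline(clips, stages)
for step in trace.steps:
    print(step["stage"], step["ids"])
```

    Because every stage is a pure function and ties are broken by id, the same inputs always produce the same results and the same trace.
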
    Integrations

    Plugs into your existing stack.

    Connect your storage, point Mixpeek at it, and every file becomes searchable by what's inside it. No migration, no code changes.

    Real workflows

    In production right now.

    query.jpg · 3 visual matches · 45,210 artworks · 38 ms
    National Gallery · live demo

    Visual search across 45k artworks

    Upload any image and find visually similar paintings across 45,000+ artworks, or just describe what you're looking for. Hybrid image and text retrieval, ranked with RRF.
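
    Reciprocal rank fusion (RRF) merges several ranked lists by summing 1 / (k + rank) for each item across lists. The sketch below is a generic version of the technique, not the demo's code: the function name, the artwork ids, and the conventional constant k = 60 are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from an image query and a text query.
image_hits = ["art_17", "art_03", "art_42"]
text_hits = ["art_03", "art_99", "art_17"]

fused = reciprocal_rank_fusion([image_hits, text_hits])
print(fused)  # ['art_03', 'art_17', 'art_99', 'art_42']
```

    Items that rank well in both lists rise to the top, and no score normalization between the image and text retrievers is needed because only ranks are used.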

    Try gallery search →

    Mood: warm · dreamy · analog
    Movie taste · live demo

    Posters that learn your taste

    Like or dislike movie posters and watch the grid adapt to your taste in real time. Interaction signals feed learned fusion so recommendations improve from usage.

    Try movie personalization →

    query-face.jpg · 4 matches · 12 frames · 47 ms
    Super Bowl corpus · live demo

    Face search across video

    Drop in a headshot and find every clip a person appears in across 63 video ads and 2,600+ faces. Full trace for takedown evidence.
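
    Conceptually, face search compares the headshot's embedding against the embedding of every detected face and keeps the close matches. A minimal sketch using cosine similarity follows; the three-dimensional vectors, clip names, and the 0.8 threshold are made up for illustration, and a production system would use an index rather than a linear scan.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_faces(query, faces, threshold=0.8):
    """Return faces whose similarity to the query meets the threshold, best first."""
    hits = []
    for face in faces:
        score = cosine(query, face["embedding"])
        if score >= threshold:
            hits.append({"clip": face["clip"], "frame": face["frame"], "score": score})
    return sorted(hits, key=lambda hit: -hit["score"])


# Hypothetical face records: one embedding per detected face, with its clip and frame.
faces = [
    {"clip": "ad_07.mp4", "frame": 120, "embedding": [0.9, 0.1, 0.0]},
    {"clip": "ad_07.mp4", "frame": 480, "embedding": [0.0, 1.0, 0.0]},
    {"clip": "ad_31.mp4", "frame": 15, "embedding": [1.0, 0.0, 0.1]},
]

hits = match_faces([1.0, 0.0, 0.0], faces)
for hit in hits:
    print(hit["clip"], hit["frame"], round(hit["score"], 3))
```

    Keeping the clip and frame alongside each score is what lets a match be traced back to the exact moment it came from.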

    Try face search →

    Enterprise-ready
    SOC 2 Type II · HIPAA-ready · BYO-Cloud · Self-hosted option · Audit trails · SSO / SAML

    Pricing

    Free vectors. Usage-based indexing.

    MVS starts with free vectors. Managed starts with credits for object extraction and indexing.

    MVS
    1M vectors free

    Bring your own embeddings. Store and search vectors on object storage with no expiration on the free tier.

    Start with MVS

    Managed
    $0.001 per credit

    Credits cover extraction, embedding, indexing, enrichment, and retriever execution for raw objects.

    Start with Managed

    Enterprise
    Custom

    Dedicated infrastructure, self-hosted options, SSO, SLA, security reviews, and hands-on architecture support.

    Talk to us

    FAQ

    Common questions.

    Do I have to move my data?

    No. Mixpeek reads from your existing S3, GCS, R2, Azure, or S3-compatible bucket. Your storage stays the system of record, and nothing leaves your cloud.

    How fast is retrieval?

    Hybrid queries (dense, sparse, and BM25) return in under 100 ms at p95, even with vectors persisted on object storage rather than held in RAM.
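
    To check a p95 figure against your own workload, record per-query latencies on the client and take the 95th percentile. The helper below is a generic nearest-rank percentile, not part of the Mixpeek SDK, and the latency samples are made up.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = -(-pct * len(ordered) // 100)  # ceil(pct * n / 100) using integer math
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds, one per query.
latencies_ms = list(range(1, 101))

p95 = percentile(latencies_ms, 95)
print(f"p95 = {p95} ms")  # p95 = 95 ms
```

    In practice the samples would come from timing real search calls, for example with `time.perf_counter()` around each request.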

    Do I need embeddings to start?

    No. Bring your own vectors with MVS, or point Managed at raw files and it generates embeddings and features for you.

    What can Managed extract?

    Faces, scenes, transcripts, OCR, labels, and embeddings from video, images, audio, PDFs, and documents, all indexed at the object level.

    Can I self-host?

    Yes. Deploy in your own cloud (BYO-Cloud) with SOC 2 and HIPAA-ready controls, SSO, audit trails, and namespaces.

    How does pricing work?

    MVS starts free with 1M vectors and no expiration. Managed is billed in usage-based credits ($0.001 per credit) that cover extraction, embedding, indexing, enrichment, and retriever execution.
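
    As a rough budgeting aid, Managed spend is the number of credits consumed multiplied by the $0.001 per-credit price. The sketch below only does that arithmetic; how many credits a given file or query consumes is not stated here, so the credit count in the example is a made-up input.

```python
from decimal import Decimal

PRICE_PER_CREDIT = Decimal("0.001")  # USD per credit, from the Managed pricing tier


def managed_cost(credits):
    """Return the USD cost of a given number of Managed credits."""
    if credits < 0:
        raise ValueError("credits must be non-negative")
    return PRICE_PER_CREDIT * credits


# Hypothetical monthly usage of 250,000 credits.
print(managed_cost(250_000))  # 250.000
```

    `Decimal` is used instead of floats so small per-credit amounts add up without rounding drift.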