How It Works
Index
Upload video, images, audio, and documents to Buckets. Mixpeek then runs feature extraction automatically and indexes faces, objects, transcripts, embeddings, and structured metadata into searchable Collections.
Search
Build retrieval pipelines that your agent calls. Chain semantic, face, object, and transcript search into multi-stage Retrievers, and expose each Retriever as a single endpoint.
Integrate
Wire Mixpeek into your agent as a LangChain tool, an MCP server, or a direct REST call. Your agent sends a query, gets structured results back, and acts on them.
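As a rough illustration of the direct REST path, the sketch below builds the kind of request an agent tool might send to a retriever. The base URL, the `/retrievers/{id}/execute` path, the `X-Namespace` header name, and the payload shape are all placeholders assumed for this example, not confirmed Mixpeek API details; check the API reference for the real values.

```python
import json
import urllib.request

# Hypothetical values: replace with the base URL, path, and header names
# from the Mixpeek API reference.
BASE_URL = "https://api.example.com"
NAMESPACE_HEADER = "X-Namespace"


def build_retriever_request(retriever_id, query, api_key, namespace, limit=5):
    """Build (but do not send) a POST request that runs a retriever."""
    # Assumed payload shape: a free-text query plus a result limit.
    payload = {"inputs": {"query": query}, "limit": limit}
    return urllib.request.Request(
        url=f"{BASE_URL}/retrievers/{retriever_id}/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            NAMESPACE_HEADER: namespace,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search_media(query, send=urllib.request.urlopen):
    """Tool function an agent could call: query in, structured results out."""
    request = build_retriever_request("ret_demo", query, "YOUR_API_KEY", "dev")
    with send(request) as response:
        return json.loads(response.read())
```

Keeping request construction separate from sending makes the tool easy to unit test, and the same `search_media` function could be registered as a LangChain tool or exposed through an MCP server.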
Quickstarts
- Agent That Can See: Give your agent multimodal perception in 30 minutes
- MCP Tool: Add Mixpeek as a tool in any MCP-compatible agent
- Direct REST: Call the API directly from any language or framework
What Gets Extracted
| File Type | Extracted Features |
|---|---|
| Video | Face embeddings (ArcFace 512D), scene descriptions (Gemini), visual embeddings (Vertex AI 1408D), transcripts (Whisper), transcript embeddings (E5-Large 1024D), keyframes |
| Images | Visual embeddings (SigLIP 768D or Vertex AI 1408D), face embeddings (ArcFace 512D), OCR text, descriptions, structured extraction |
| Audio | Transcripts (Whisper), transcript embeddings (E5-Large 1024D), multimodal audio embeddings (Vertex AI 1408D) |
| Documents | Text chunks, text embeddings (E5-Large 1024D), OCR for scanned PDFs, structured extraction |
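The embedding sizes in the table matter at query time: a query vector can only be compared against indexed vectors of the same length. A minimal sketch of that check follows, using the dimensions listed above; the dictionary keys and the `check_query_vector` helper are hypothetical names chosen for this example.

```python
# Embedding sizes copied from the table above; the dictionary keys are
# hypothetical labels chosen for this sketch, not Mixpeek identifiers.
EMBEDDING_DIMS = {
    "arcface_face": 512,
    "siglip_visual": 768,
    "e5_large_text": 1024,
    "vertex_multimodal": 1408,
}


def check_query_vector(feature, vector):
    """Return the vector unchanged, or raise ValueError on a size mismatch."""
    expected = EMBEDDING_DIMS[feature]
    if len(vector) != expected:
        raise ValueError(
            f"{feature} expects {expected} dimensions, got {len(vector)}"
        )
    return vector
```

For example, `check_query_vector("arcface_face", [0.0] * 768)` raises `ValueError`, because a 768-dimensional vector cannot be matched against 512-dimensional face embeddings.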
Key Concepts
- Namespaces isolate data between tenants, environments, or projects. Every API request includes a namespace header.
- Buckets hold your raw files. Upload once, process many ways.
- Collections define what gets extracted. Each collection runs a feature extractor (CLIP, Whisper, LayoutLM, etc.) against objects in a bucket.
- Retrievers are search pipelines you configure in JSON. Chain stages together (vector search, face matching, filters, re-ranking) and expose the result as one endpoint your agent calls.
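To make the idea of chained retriever stages concrete, here is a small in-memory sketch: a JSON-style config lists the stages, and a toy runner applies them in order. The stage names, config fields, and sample documents are invented for illustration and do not reflect Mixpeek's actual retriever schema.

```python
import math

# Hypothetical retriever config: stage names and fields are illustrative only.
RETRIEVER_CONFIG = {
    "stages": [
        {"type": "vector_search", "top_k": 3},
        {"type": "filter", "field": "file_type", "equals": "video"},
        {"type": "rerank", "field": "recency"},
    ]
}

# Toy documents with 2-D vectors so the arithmetic is easy to follow.
DOCS = [
    {"id": "a", "file_type": "video", "recency": 0.2, "vector": [1.0, 0.0]},
    {"id": "b", "file_type": "image", "recency": 0.9, "vector": [0.9, 0.1]},
    {"id": "c", "file_type": "video", "recency": 0.8, "vector": [0.7, 0.7]},
    {"id": "d", "file_type": "video", "recency": 0.5, "vector": [0.0, 1.0]},
]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def run_retriever(config, docs, query_vector):
    """Apply each configured stage to the output of the previous one."""
    results = list(docs)
    for stage in config["stages"]:
        if stage["type"] == "vector_search":
            # Keep the top_k documents most similar to the query vector.
            results = sorted(
                results,
                key=lambda d: cosine(d["vector"], query_vector),
                reverse=True,
            )[: stage["top_k"]]
        elif stage["type"] == "filter":
            # Keep only documents whose field equals the configured value.
            results = [d for d in results if d[stage["field"]] == stage["equals"]]
        elif stage["type"] == "rerank":
            # Reorder the survivors by a numeric field, highest first.
            results = sorted(results, key=lambda d: d[stage["field"]], reverse=True)
        else:
            raise ValueError(f"unknown stage type: {stage['type']}")
    return [d["id"] for d in results]
```

With the query vector `[1.0, 0.0]`, vector search keeps `a`, `b`, and `c`, the filter drops the image `b`, and re-ranking by recency yields `["c", "a"]`. Because each stage consumes the previous stage's output, stage order changes the result.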
Next Steps
- Core Concepts: Understand namespaces, buckets, collections, and retrievers in depth
- Architecture: See how the pieces fit together end to end
- Feature Extractors: Learn what each extractor does and how to configure it
- Tutorials: Step-by-step guides for common use cases

