What You’ll Build
A LangChain agent with a video_search tool backed by Mixpeek. The agent accepts a plain-English question, searches indexed video by transcript and visual similarity, and returns timestamped results.
Prerequisites
- A Mixpeek API key — get one at mixpeek.com/start
- An OpenAI API key (for the LangChain agent LLM)
Create a namespace
A namespace isolates all storage and compute for a project. Create one with the multimodal_extractor feature extractor enabled. Save the returned namespace_id; every subsequent call requires it.
Create a bucket and upload a video
Buckets define the schema for incoming objects. Create one that accepts a video URL, then register an object pointing to a sample video.
Create a collection and process the video
A collection binds a bucket to a feature extractor. When you submit a batch, the engine decomposes the video into scene embeddings, keyframes, and transcripts.
Wait for processing
Video processing takes 1-5 minutes depending on length. Poll the task endpoint until status is COMPLETED.
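The polling step can be sketched as a small helper. This is a minimal illustration, not Mixpeek's client code: `fetch_status` is a hypothetical callable you would implement to call the task endpoint and return its status string, and `"FAILED"` is an assumed name for a terminal failure status. Only `COMPLETED` comes from the text above.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() repeatedly until it reports a terminal status.

    fetch_status is a hypothetical callable that queries the task endpoint
    and returns the current status string.
    """
    # "COMPLETED" is the status named in this guide; "FAILED" is an assumption.
    terminal = {"COMPLETED", "FAILED"}
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s} s")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. The default timeout of 10 minutes is an arbitrary choice that leaves headroom over the 1-5 minute estimate.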
For production use, register a webhook instead of polling. Mixpeek sends a batch.completed event when processing finishes.
Create a retriever
A retriever defines how search queries map to indexed features. This one performs semantic search over scene embeddings extracted by multimodal_extractor.
What Just Happened
Here is the pipeline you built:
- Namespace created an isolated environment with multimodal_extractor enabled
- Bucket + Object registered a video URL with a defined schema
- Collection + Batch triggered the engine to decompose the video into scene segments, each with embeddings, keyframes, and timestamps
- Retriever defined a search interface over those scene embeddings
- LangChain Tool wrapped the retriever so your agent can query video content in plain English
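The last step, turning retriever hits into text an agent can read, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the guide's actual tool: `search` is a hypothetical callable standing in for the retriever request, and the hit fields `start_time`, `end_time`, `score`, and `description` are assumed names, not a confirmed Mixpeek response schema.

```python
from typing import Callable, Iterable


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def video_search(
    query: str,
    search: Callable[[str], Iterable[dict]],
    limit: int = 5,
) -> str:
    """Run a search and return one timestamped line per matching scene.

    search is a hypothetical callable wrapping the retriever call; the
    dictionary keys read below are assumed field names.
    """
    hits = list(search(query))[:limit]
    if not hits:
        return "No matching scenes found."
    lines = []
    for hit in hits:
        start = format_timestamp(hit["start_time"])
        end = format_timestamp(hit["end_time"])
        text = hit.get("description", "")
        lines.append(f"[{start}-{end}] score={hit['score']:.2f} {text}".rstrip())
    return "\n".join(lines)
```

Returning a plain string keeps the output easy for an LLM to quote back with timestamps. Binding `search` to a real retriever request and registering the resulting one-argument function with the agent is left out here, since both depend on the actual API responses.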
Next Steps
Add transcript search
Add transcript embeddings to your retriever for hybrid visual + spoken-content search.
Retriever stages
Add reranking, filtering, and enrichment stages to your retriever pipeline.
Webhooks
Replace polling with event-driven processing notifications.

