What You’ll Build
An agentic casting assistant that:
- Indexes talent headshots and profile videos with face embeddings (ArcFace 512D)
- Indexes ad creatives with scene embeddings, transcripts, and thumbnails
- Cross-references faces against a competitor ad namespace to detect casting conflicts
- Answers natural language queries like “Find talent who appeared in outdoor ads but never in a competitor campaign”
Prerequisites
- A Mixpeek API key — get one at mixpeek.com/start
- An OpenAI API key (for the LangChain agent LLM)
Index talent profiles with face_identity_extractor
Create a namespace, a bucket for talent headshots and profile videos, and a collection that extracts face embeddings from every upload. Then upload talent profiles and submit a batch to process them.
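The setup above could be sketched as follows. This is a minimal sketch: the endpoint paths, payload fields, and names (`talent`, `talent-media`, `talent-faces`) are illustrative assumptions, not the documented Mixpeek API shapes — check the API reference for the real request bodies.

```python
# Hedged sketch of Step 1. Endpoint paths and payload fields below are
# illustrative assumptions, not the documented Mixpeek API.
import json

API_BASE = "https://api.mixpeek.com"  # assumed base URL
HEADERS = {"Authorization": "Bearer YOUR_MIXPEEK_API_KEY"}

def step1_payloads() -> dict:
    """Request bodies for the talent namespace, its upload bucket, and a
    collection that runs face_identity_extractor on every uploaded object."""
    return {
        "namespace": {"name": "talent"},
        "bucket": {"name": "talent-media", "namespace": "talent"},
        "collection": {
            "name": "talent-faces",
            "namespace": "talent",
            "source_bucket": "talent-media",
            "extractors": [{
                "name": "face_identity_extractor",
                # One document per detected face (512D ArcFace embedding);
                # video_deduplication stores a recurring face once, not per frame.
                "params": {"video_deduplication": True},
            }],
        },
        # After uploading headshots/profile videos to the bucket,
        # submit a batch to process them through the collection:
        "batch": {"bucket": "talent-media", "collections": ["talent-faces"]},
    }

payloads = step1_payloads()
# Each payload would be POSTed to the API, e.g.:
#   requests.post(f"{API_BASE}/v1/namespaces",
#                 json=payloads["namespace"], headers=HEADERS)
print(json.dumps(payloads["collection"], indent=2))
```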
Each face detected in a video becomes its own document with a 512-dimensional ArcFace embedding.
video_deduplication ensures the same face appearing across multiple frames is stored once, not hundreds of times.
Index ad archive with multimodal_extractor
Create a second collection on the same namespace. This one uses multimodal_extractor to split ad creatives into scenes, transcribe audio, and generate visual embeddings. Upload your ad archive and submit a batch the same way as Step 1. Each ad gets decomposed into individual scenes with:
- Multimodal embeddings (1408D, Vertex AI) — search by visual content
- Transcription — full spoken-word transcript per scene
- Scene descriptions — Gemini-generated natural language summaries
- Thumbnails — keyframes for each scene segment
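The ad-archive collection described above could be configured along these lines. The field and parameter names (`split`, `transcribe`, `describe`, `thumbnails`, and the bucket/collection names) are assumptions for illustration, not the extractor's documented parameters.

```python
# Hedged sketch of the ad-archive collection. Parameter names are
# illustrative assumptions, not the documented multimodal_extractor config.
import json

def ad_archive_collection() -> dict:
    """Collection config that runs multimodal_extractor on every ad
    creative, splitting it into scenes with embeddings, transcripts,
    descriptions, and thumbnails."""
    return {
        "name": "ad-scenes",
        "namespace": "talent",           # same namespace as Step 1
        "source_bucket": "ad-archive",   # assumed bucket for ad creatives
        "extractors": [{
            "name": "multimodal_extractor",
            "params": {
                "split": "scene",        # one document per detected scene
                "embed": "multimodal",   # 1408D Vertex AI embeddings
                "transcribe": True,      # per-scene spoken-word transcript
                "describe": True,        # Gemini scene descriptions
                "thumbnails": True,      # keyframe per scene segment
            },
        }],
    }

config = ad_archive_collection()
print(json.dumps(config, indent=2))
```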
Build a competitor namespace and cross-reference faces
Create a separate namespace for competitor ads. Index them with face_identity_extractor using the same configuration as your talent namespace; this gives you two independent face indexes you can query across. Upload competitor ad videos and submit a batch, then create a retriever for cross-referencing faces against this competitor namespace.
Wire it as a LangChain agent
Create three retrievers — one for each search dimension — and wrap them as LangChain tools. First, create the face search and scene search retrievers on your talent namespace; the competitor cross-reference retriever comes from the previous step. Wrap all three as tools, build the agent, and run a query. The agent reasons through this multi-step query:
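The retriever-and-agent wiring above could be sketched like this. The Mixpeek retriever IDs, endpoint paths, and payload fields are illustrative assumptions (the retriever calls are stubbed so the sketch runs without network access), and the LangChain portion uses the classic `initialize_agent`/`Tool` interface (langchain < 1.0), which only executes when an OpenAI key is present.

```python
# Hedged sketch of the agent wiring. Mixpeek retriever IDs, endpoint paths,
# and payload fields are illustrative assumptions, not the documented API.
import json
import os

API_BASE = "https://api.mixpeek.com"  # assumed base URL

def _retrieve(retriever_id: str, query: dict) -> list:
    """Build a retriever search request. A real app would POST it, e.g.
    requests.post(f"{API_BASE}/v1/retrievers/{retriever_id}/execute", ...);
    here we return the payload so the sketch runs without network access."""
    payload = {"retriever_id": retriever_id, "query": query, "limit": 5}
    return [payload]  # stand-in for resp.json()["results"]

def face_search(face_image_url: str) -> str:
    """Match a face against the talent namespace (ArcFace embeddings)."""
    return json.dumps(_retrieve("talent-face-search", {"image_url": face_image_url}))

def scene_search(text: str) -> str:
    """Semantic search over ad scenes in the ad-archive collection."""
    return json.dumps(_retrieve("ad-scene-search", {"text": text}))

def competitor_check(face_image_url: str) -> str:
    """Cross-reference a face against the competitor namespace."""
    return json.dumps(_retrieve("competitor-face-check", {"image_url": face_image_url}))

# The agent build requires an OpenAI key and langchain < 1.0, so it is
# guarded; the tool functions above work standalone.
if os.environ.get("OPENAI_API_KEY"):
    from langchain_openai import ChatOpenAI
    from langchain_core.tools import Tool
    from langchain.agents import AgentType, initialize_agent

    tools = [
        Tool(name="face_search", func=face_search,
             description="Find talent whose face matches an image URL."),
        Tool(name="scene_search", func=scene_search,
             description="Find ad scenes matching a text description."),
        Tool(name="competitor_check", func=competitor_check,
             description="Check whether a face appears in competitor ads."),
    ]
    agent = initialize_agent(
        tools,
        ChatOpenAI(model="gpt-4o-mini", temperature=0),
        agent=AgentType.OPENAI_FUNCTIONS,
        verbose=True,
    )
    print(agent.run("Find talent who appeared in outdoor ads "
                    "but never in a competitor campaign"))
```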
- Calls scene_search with “outdoor advertisement” to find matching ad scenes
- Extracts face thumbnails from the top results
- Calls competitor_check for each face to filter out talent with conflicts
- Returns a shortlist of conflict-free talent with links to their original ad appearances
What Just Happened
Here is the pipeline you built:
- Talent namespace indexed face embeddings from talent headshots and profile videos using face_identity_extractor (SCRFD detection + ArcFace 512D embeddings)
- Ad archive collection decomposed ad creatives into scenes with visual embeddings (Vertex AI 1408D), transcripts, and thumbnails using multimodal_extractor
- Competitor namespace indexed faces from competitor ads in an isolated namespace, enabling cross-namespace face matching
- Three retrievers exposed face search, scene search, and competitor cross-reference as queryable endpoints
- LangChain agent wrapped all three retrievers as tools, enabling natural language queries that span multiple search dimensions
Next Steps
Face Identity Extractor
Full parameter reference for face detection and recognition.
Retriever stages
Add reranking, filtering, and enrichment stages to your retriever pipeline.
Webhooks
Replace batch polling with event-driven processing notifications.

