Talent records live in spreadsheets. Nobody can answer “has this creator appeared in a competitor’s ad?” without manually watching hundreds of videos. A casting agent fixes this — it indexes every face across your ad archive, cross-references against competitor content, and answers natural language queries about talent history. In this guide you build a LangChain agent with three tools: face search, scene search, and competitor cross-reference — all backed by Mixpeek retrievers.

What You’ll Build

An agentic casting assistant that:
  1. Indexes talent headshots and profile videos with face embeddings (ArcFace 512D)
  2. Indexes ad creatives with scene embeddings, transcripts, and thumbnails
  3. Cross-references faces against a competitor ad namespace to detect casting conflicts
  4. Answers natural language queries like “Find talent who appeared in outdoor ads but never in a competitor campaign”

Prerequisites

pip install mixpeek langchain langchain-openai
You also need:
  • A Mixpeek API key — get one at mixpeek.com/start
  • An OpenAI API key (for the LangChain agent LLM)
export MIXPEEK_API_KEY="sk_live_replace_me"
export OPENAI_API_KEY="sk-replace_me"

Step 1: Index talent profiles with face_identity_extractor

Create a namespace, a bucket for talent headshots and profile videos, and a collection that extracts face embeddings from every upload.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_MIXPEEK_API_KEY")

# Create namespace
ns = client.namespaces.create(
    namespace_name="casting-talent",
    description="Talent face index for casting agent",
    feature_extractors=[
        {"feature_extractor_name": "face_identity_extractor", "version": "v1"},
        {"feature_extractor_name": "multimodal_extractor", "version": "v1"}
    ]
)

namespace_id = ns.namespace_id

# Create bucket for talent profiles
talent_bucket = client.buckets.create(
    bucket_name="talent-profiles",
    namespace_id=namespace_id,
    schema={
        "properties": {
            "video_url": {"type": "url", "required": True}
        }
    }
)

talent_bucket_id = talent_bucket.bucket_id

# Create collection with face_identity_extractor
talent_collection = client.collections.create(
    collection_name="talent-faces",
    namespace_id=namespace_id,
    source={"type": "bucket", "bucket_id": talent_bucket_id},
    feature_extractor={
        "feature_extractor_name": "face_identity_extractor",
        "version": "v1",
        "input_mappings": {"video": "payload.video_url"},
        "parameters": {
            "detection_model": "scrfd_2.5g",
            "embedding_model": "arcface_r100",
            "video_sampling_fps": 1.0,
            "video_deduplication": True,
            "video_deduplication_threshold": 0.8
        }
    }
)

talent_collection_id = talent_collection.collection_id
print(f"Namespace: {namespace_id}")
print(f"Talent collection: {talent_collection_id}")
Upload talent profiles and submit a batch to process them:
# Upload talent headshots / profile videos
talent_objects = []
talent_videos = [
    "https://storage.googleapis.com/your-bucket/talent/creator-anna.mp4",
    "https://storage.googleapis.com/your-bucket/talent/creator-james.mp4",
    "https://storage.googleapis.com/your-bucket/talent/creator-maria.mp4",
]

for url in talent_videos:
    obj = client.objects.create(
        bucket_id=talent_bucket_id,
        namespace_id=namespace_id,
        key_prefix="/talent",
        blobs=[
            {
                "property": "video_url",
                "type": "video",
                "url": url
            }
        ]
    )
    talent_objects.append(obj.object_id)

# Submit batch
batch = client.batches.create(
    bucket_id=talent_bucket_id,
    namespace_id=namespace_id,
    object_ids=talent_objects
)

result = client.batches.submit(
    bucket_id=talent_bucket_id,
    batch_id=batch.batch_id,
    namespace_id=namespace_id
)

print(f"Talent batch submitted: {result.task_id}")
Each face detected in a video becomes its own document with a 512-dimensional ArcFace embedding. video_deduplication ensures the same face appearing across multiple frames is stored once, not hundreds of times.
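The deduplication step can be pictured as a similarity gate: a new frame's face embedding is kept only if it is sufficiently different from every face already stored. A minimal sketch of that logic in plain Python, using cosine similarity and the 0.8 threshold from the config above (the `dedupe_embeddings` helper is illustrative, not part of the SDK):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def dedupe_embeddings(embeddings, threshold=0.8):
    """Keep an embedding only if it stays below `threshold` similarity
    to every embedding already kept -- one document per distinct face."""
    kept = []
    for emb in embeddings:
        if all(cosine_similarity(emb, k) < threshold for k in kept):
            kept.append(emb)
    return kept

# Three sampled frames: two near-identical faces plus one different face.
frames = [[1.0, 0.0], [0.99, 0.14], [0.0, 1.0]]
print(len(dedupe_embeddings(frames)))  # 2 distinct faces kept
```

Raising the threshold keeps more near-duplicates; lowering it collapses more frames into one document.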

Step 2: Index the ad archive with multimodal_extractor

Create a second collection on the same namespace. This one uses multimodal_extractor to split ad creatives into scenes, transcribe audio, and generate visual embeddings.
# Create bucket for ad creatives
ads_bucket = client.buckets.create(
    bucket_name="ad-creatives",
    namespace_id=namespace_id,
    schema={
        "properties": {
            "video_url": {"type": "url", "required": True}
        }
    }
)

ads_bucket_id = ads_bucket.bucket_id

# Create collection with multimodal_extractor
ads_collection = client.collections.create(
    collection_name="ad-scenes",
    namespace_id=namespace_id,
    source={"type": "bucket", "bucket_id": ads_bucket_id},
    feature_extractor={
        "feature_extractor_name": "multimodal_extractor",
        "version": "v1",
        "input_mappings": {"video": "payload.video_url"},
        "parameters": {
            "split_method": "scene",
            "scene_detection_threshold": 0.5,
            "run_transcription": True,
            "run_multimodal_embedding": True,
            "run_video_description": True,
            "enable_thumbnails": True
        }
    }
)

ads_collection_id = ads_collection.collection_id
print(f"Ads collection: {ads_collection_id}")
Upload your ad archive and submit a batch the same way as Step 1. Each ad gets decomposed into individual scenes with:
  • Multimodal embeddings (1408D, Vertex AI) — search by visual content
  • Transcription — full spoken-word transcript per scene
  • Scene descriptions — Gemini-generated natural language summaries
  • Thumbnails — keyframes for each scene segment
Scene splitting with scene_detection_threshold: 0.5 works well for ad creatives, which typically have fast cuts. Lower the threshold for longer-form content with fewer transitions.
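Conceptually, scene splitting compares consecutive frames and opens a new scene whenever the visual difference crosses the threshold. A minimal sketch with made-up per-frame difference scores (the real extractor works on decoded video, not precomputed scores):

```python
def split_scenes(frame_diffs, threshold=0.5):
    """Return scene start indices given per-frame difference scores
    in [0, 1]. A diff above `threshold` starts a new scene there."""
    boundaries = [0]  # the first scene always starts at frame 0
    for i, diff in enumerate(frame_diffs, start=1):
        if diff > threshold:
            boundaries.append(i)
    return boundaries

# A fast-cut ad: two hard cuts produce three scenes.
diffs = [0.1, 0.7, 0.05, 0.2, 0.9, 0.1]
print(split_scenes(diffs))        # [0, 2, 5]
print(split_scenes(diffs, 0.8))   # [0, 5] -- a higher threshold merges scenes
```

This is why a lower threshold suits longer-form content: gentler transitions produce smaller diffs, and only a more sensitive cutoff will catch them.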

Step 3: Build a competitor namespace and cross-reference faces

Create a separate namespace for competitor ads. Index them with face_identity_extractor using the same configuration as your talent namespace. This gives you two independent face indexes you can query across.
# Create competitor namespace
competitor_ns = client.namespaces.create(
    namespace_name="competitor-ads",
    description="Competitor ad face index for casting conflict detection",
    feature_extractors=[
        {"feature_extractor_name": "face_identity_extractor", "version": "v1"}
    ]
)

competitor_namespace_id = competitor_ns.namespace_id

# Create bucket and collection (same pattern as Step 1)
competitor_bucket = client.buckets.create(
    bucket_name="competitor-videos",
    namespace_id=competitor_namespace_id,
    schema={
        "properties": {
            "video_url": {"type": "url", "required": True}
        }
    }
)

competitor_bucket_id = competitor_bucket.bucket_id

competitor_collection = client.collections.create(
    collection_name="competitor-faces",
    namespace_id=competitor_namespace_id,
    source={"type": "bucket", "bucket_id": competitor_bucket_id},
    feature_extractor={
        "feature_extractor_name": "face_identity_extractor",
        "version": "v1",
        "input_mappings": {"video": "payload.video_url"},
        "parameters": {
            "detection_model": "scrfd_2.5g",
            "embedding_model": "arcface_r100",
            "video_sampling_fps": 1.0,
            "video_deduplication": True,
            "video_deduplication_threshold": 0.8
        }
    }
)

competitor_collection_id = competitor_collection.collection_id
print(f"Competitor namespace: {competitor_namespace_id}")
Upload competitor ad videos and submit a batch. Then create a retriever for cross-referencing faces against this competitor namespace:
# Retriever for competitor face cross-reference
competitor_face_retriever = client.retrievers.create(
    retriever_name="competitor-face-check",
    namespace_id=competitor_namespace_id,
    description="Search competitor ads by face similarity",
    input_schema={
        "properties": {
            "query_url": {"type": "url", "required": True}
        }
    },
    collection_ids=[competitor_collection_id],
    stages=[
        {
            "stage_type": "filter",
            "stage_id": "feature_search",
            "parameters": {
                "feature_uri": "mixpeek://face_identity_extractor@v1/face_embedding",
                "input": {"image": "{{INPUT.query_url}}"},
                "limit": 10
            }
        }
    ],
    cache_config={"enabled": True, "ttl_seconds": 300}
)

competitor_retriever_id = competitor_face_retriever.retriever_id
print(f"Competitor retriever: {competitor_retriever_id}")
The cross-reference pattern queries your talent face embeddings against the competitor namespace. A high similarity score (above 0.85) indicates the same person appears in both your ads and a competitor’s. This is how you detect casting conflicts at scale.
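The decision rule reduces to an all-pairs similarity check between the two face indexes. A minimal sketch of the 0.85 rule over toy 2-D embeddings (the real vectors are 512-D ArcFace; names and values here are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def find_conflicts(talent, competitor, threshold=0.85):
    """Return (talent_name, competitor_ad) pairs whose face embeddings
    exceed the similarity threshold -- i.e. likely the same person."""
    return [
        (t_name, c_name)
        for t_name, t_emb in talent.items()
        for c_name, c_emb in competitor.items()
        if cosine(t_emb, c_emb) > threshold
    ]

talent = {"anna": [1.0, 0.0], "james": [0.0, 1.0]}
competitor = {"rival-ad-7": [0.98, 0.2]}  # close to anna's embedding
print(find_conflicts(talent, competitor))  # [('anna', 'rival-ad-7')]
```

In production you never run this loop yourself: the competitor retriever performs the nearest-neighbor search server-side and returns the scores.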

Step 4: Wire it as a LangChain agent

Create three retrievers — one for each search dimension — and wrap them as LangChain tools. First, create the face search and scene search retrievers on your talent namespace:
# Face search retriever (find talent by face similarity)
face_retriever = client.retrievers.create(
    retriever_name="talent-face-search",
    namespace_id=namespace_id,
    description="Find talent by face similarity",
    input_schema={
        "properties": {
            "query_url": {"type": "url", "required": True}
        }
    },
    collection_ids=[talent_collection_id],
    stages=[
        {
            "stage_type": "filter",
            "stage_id": "feature_search",
            "parameters": {
                "feature_uri": "mixpeek://face_identity_extractor@v1/face_embedding",
                "input": {"image": "{{INPUT.query_url}}"},
                "limit": 20
            }
        }
    ],
    cache_config={"enabled": True, "ttl_seconds": 300}
)

face_retriever_id = face_retriever.retriever_id

# Scene search retriever (find ads by visual/transcript content)
scene_retriever = client.retrievers.create(
    retriever_name="ad-scene-search",
    namespace_id=namespace_id,
    description="Search ad scenes by visual content or transcript",
    input_schema={
        "properties": {
            "query_text": {"type": "text", "required": True}
        }
    },
    collection_ids=[ads_collection_id],
    stages=[
        {
            "stage_type": "filter",
            "stage_id": "feature_search",
            "parameters": {
                "feature_uri": "mixpeek://multimodal_extractor@v1/multimodal_embedding",
                "input": {"text": "{{INPUT.query_text}}"},
                "limit": 20
            }
        }
    ],
    cache_config={"enabled": True, "ttl_seconds": 300}
)

scene_retriever_id = scene_retriever.retriever_id
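The `cache_config` on each retriever means identical queries within the TTL are served from cache instead of re-running the pipeline. A toy model of that behavior, to make the semantics concrete (the `TTLCache` class is illustrative, not part of the SDK):

```python
import time

class TTLCache:
    """Toy model of cache_config={"enabled": True, "ttl_seconds": 300}:
    identical inputs within the TTL are answered from cache instead of
    re-executing the retrieval pipeline."""
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}

    def get_or_compute(self, key, compute):
        now = time.time()
        hit = self.store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]              # cache hit: skip the expensive call
        value = compute()              # cache miss: execute and remember
        self.store[key] = (now, value)
        return value

cache = TTLCache(ttl_seconds=300)
calls = []
for _ in range(3):
    cache.get_or_compute("query_text=outdoor ads", lambda: calls.append(1))
print(len(calls))  # 1 -- the pipeline ran once; the other requests hit cache
```

A 300-second TTL is a reasonable default for agent workloads, where the LLM often retries the same tool call within a single conversation.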
Now wrap all three retrievers as LangChain tools and build the agent:
from langchain.tools import Tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder


def search_faces(query_url: str) -> str:
    """Search talent database by face image URL."""
    results = client.retrievers.execute(
        retriever_id=face_retriever_id,
        namespace_id=namespace_id,
        inputs={"query_url": query_url},
        limit=10
    )
    formatted = []
    for r in results.results:
        score = r.score
        source = r.metadata.get("source_video_url", "unknown")
        formatted.append(f"(score: {score:.3f}) source: {source}")
    return "\n".join(formatted) if formatted else "No matching faces found."


def search_scenes(query: str) -> str:
    """Search ad archive by visual content or transcript."""
    results = client.retrievers.execute(
        retriever_id=scene_retriever_id,
        namespace_id=namespace_id,
        inputs={"query_text": query},
        limit=10
    )
    formatted = []
    for r in results.results:
        start = r.metadata.get("start_time", "?")
        end = r.metadata.get("end_time", "?")
        desc = r.metadata.get("description", "")
        transcript = r.metadata.get("transcription", "")[:100]
        formatted.append(
            f"[{start}s - {end}s] (score: {r.score:.3f}) {desc}"
            + (f" | transcript: {transcript}..." if transcript else "")
        )
    return "\n".join(formatted) if formatted else "No matching scenes found."


def check_competitor(query_url: str) -> str:
    """Check if a face appears in competitor ad campaigns."""
    results = client.retrievers.execute(
        retriever_id=competitor_retriever_id,
        namespace_id=competitor_namespace_id,
        inputs={"query_url": query_url},
        limit=10
    )
    formatted = []
    for r in results.results:
        score = r.score
        source = r.metadata.get("source_video_url", "unknown")
        conflict = "CONFLICT" if score > 0.85 else "low similarity"
        formatted.append(f"({conflict}, score: {score:.3f}) source: {source}")
    return "\n".join(formatted) if formatted else "No matches in competitor ads."


# Define tools
face_search_tool = Tool(
    name="face_search",
    description=(
        "Search the talent database by face. Input is an image URL of a face. "
        "Returns matching talent profiles with similarity scores. "
        "Use this to find which talent appeared in specific content."
    ),
    func=search_faces
)

scene_search_tool = Tool(
    name="scene_search",
    description=(
        "Search ad archive by visual content or spoken words. Input is a text query. "
        "Returns matching ad scenes with timestamps, descriptions, and transcripts. "
        "Use this to find ads matching a visual or thematic description."
    ),
    func=search_scenes
)

competitor_check_tool = Tool(
    name="competitor_check",
    description=(
        "Check if a talent's face appears in competitor ad campaigns. "
        "Input is an image URL of the talent's face. "
        "Returns matches with conflict flags (score > 0.85 = same person). "
        "Use this to detect casting conflicts before booking talent."
    ),
    func=check_competitor
)

# Build the agent
llm = ChatOpenAI(model="gpt-4o", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a casting research assistant for a performance marketing agency. "
     "You help find talent, analyze ad creative history, and detect casting conflicts. "
     "You have three tools:\n"
     "- face_search: find talent by face image\n"
     "- scene_search: find ads by visual/transcript content\n"
     "- competitor_check: check if talent appears in competitor campaigns\n\n"
     "When asked about casting conflicts, always run competitor_check. "
     "When asked to find talent for a specific type of ad, use scene_search first "
     "to find matching ads, then cross-reference faces."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_tools_agent(llm, [face_search_tool, scene_search_tool, competitor_check_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[face_search_tool, scene_search_tool, competitor_check_tool], verbose=True)
Run a query:
response = executor.invoke({
    "input": "Find me talent who appeared in high-performing outdoor ads but never in a competitor campaign"
})

print(response["output"])
The agent reasons through this multi-step query:
  1. Calls scene_search with “outdoor advertisement” to find matching ad scenes
  2. Extracts face thumbnails from the top results
  3. Calls competitor_check for each face to filter out talent with conflicts
  4. Returns a shortlist of conflict-free talent with links to their original ad appearances
Add more retrievers to expand the agent’s capabilities. A transcript-only retriever using mixpeek://multimodal_extractor@v1/transcription_embedding lets the agent search by spoken dialogue. A metadata filter stage can narrow results by date range, campaign name, or performance metrics stored in object payload.
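A transcript-search tool would follow the same shape as search_scenes above. A hedged sketch, assuming you have created a retriever on the transcription embedding feature and stored its id in a `transcript_retriever_id` variable (the function name and output formatting are illustrative):

```python
def search_transcripts(client, retriever_id, namespace_id, query: str) -> str:
    """Search ad scenes by spoken dialogue only. Mirrors search_scenes;
    assumes a retriever built on the transcription embedding feature."""
    results = client.retrievers.execute(
        retriever_id=retriever_id,
        namespace_id=namespace_id,
        inputs={"query_text": query},
        limit=10,
    )
    lines = [
        f'(score: {r.score:.3f}) "{r.metadata.get("transcription", "")[:120]}"'
        for r in results.results
    ]
    return "\n".join(lines) if lines else "No matching dialogue found."
```

Wrap it in a `Tool` exactly like the other three and add it to both the agent and the executor tool lists.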

What Just Happened

Here is the pipeline you built:
  1. Talent namespace indexed face embeddings from talent headshots and profile videos using face_identity_extractor (SCRFD detection + ArcFace 512D embeddings)
  2. Ad archive collection decomposed ad creatives into scenes with visual embeddings (Vertex AI 1408D), transcripts, and thumbnails using multimodal_extractor
  3. Competitor namespace indexed faces from competitor ads in an isolated namespace, enabling cross-namespace face matching
  4. Three retrievers exposed face search, scene search, and competitor cross-reference as queryable endpoints
  5. LangChain agent wrapped all three retrievers as tools, enabling natural language queries that span multiple search dimensions
The agent does not process video itself. Each retriever returns pre-indexed results in milliseconds. As you add more talent profiles, ads, and competitor content, the agent’s knowledge grows automatically.

Next Steps

Face Identity Extractor

Full parameter reference for face detection and recognition.

Retriever stages

Add reranking, filtering, and enrichment stages to your retriever pipeline.

Webhooks

Replace batch polling with event-driven processing notifications.