
    Video Scene Search

    Find specific scenes within videos using natural language descriptions. The pipeline detects scene boundaries, generates embeddings for each scene, and enables precise timestamp-level search across an entire video library. Query for visual content, actions, or spoken dialogue.
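To make the idea of scene-boundary detection concrete, here is a minimal sketch. It is not Mixpeek's implementation: it assumes each frame has already been reduced to a small feature vector (for example a color histogram) and starts a new scene whenever consecutive frames differ by more than a threshold. The names `detect_scenes`, `l1_distance`, and the `threshold` parameter are hypothetical, chosen for illustration only.

```python
from typing import List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_scenes(
    frame_features: List[Sequence[float]],
    fps: float,
    threshold: float,
) -> List[Tuple[float, float]]:
    """Split a video into (start_time, end_time) scenes, in seconds.

    A new scene starts at frame i when the distance between frame i-1
    and frame i exceeds `threshold` (an assumed, hand-tuned value).
    """
    if not frame_features:
        return []
    boundaries = [0]
    for i in range(1, len(frame_features)):
        if l1_distance(frame_features[i - 1], frame_features[i]) > threshold:
            boundaries.append(i)
    boundaries.append(len(frame_features))
    # Convert frame indices to timestamps.
    return [(start / fps, end / fps) for start, end in zip(boundaries, boundaries[1:])]


# Four toy frames at 2 frames per second, with an abrupt change after frame 1.
frames = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
scenes = detect_scenes(frames, fps=2.0, threshold=0.5)
print(scenes)
```

Each resulting interval is the unit that would then be embedded and indexed, which is what makes timestamp-level search possible.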

    video
    image
    text
    Multi-Tier
    3.2K runs
    from mixpeek import Mixpeek

    client = Mixpeek(api_key="YOUR_API_KEY")

    # Create video collection with scene decomposition
    collection = client.collections.create(
        namespace_id="ns_your_namespace",
        name="video_scenes",
        extractors=["multimodal-extractor", "text-extractor"],
        params={"video_chunking": "scene-based"}
    )

    # Upload videos
    client.buckets.upload(bucket_id="bkt_videos", url="s3://your-bucket/videos/")

    # Search for a specific scene
    results = client.retrievers.execute(
        retriever_id="ret_scene_search",
        query={"text": "person presenting a chart on a whiteboard"}
    )

    for doc in results["results"]:
        print(f"Video: {doc['root_object_id']}")
        print(f" Scene: {doc['start_time']:.1f}s - {doc['end_time']:.1f}s")
        print(f" Score: {doc['score']:.3f}")
        if doc.get("text"):
            print(f" Transcript: {doc['text'][:80]}...")
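Search results arrive as a flat list of scenes, so a small post-processing step can make them easier to browse. The sketch below groups scenes by source video and formats start times as `HH:MM:SS`. It works on plain dictionaries and assumes only the fields used in the recipe (`root_object_id`, `start_time`, `end_time`, `score`); the helper names and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def group_scenes_by_video(docs: List[dict]) -> Dict[str, List[dict]]:
    """Group scene documents by video, ordered by start time within each video."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc["root_object_id"]].append(doc)
    for scenes in grouped.values():
        scenes.sort(key=lambda d: d["start_time"])
    return dict(grouped)


# Made-up sample data with the same shape as the result documents above.
sample = [
    {"root_object_id": "vid_a", "start_time": 125.4, "end_time": 140.0, "score": 0.91},
    {"root_object_id": "vid_b", "start_time": 3.0, "end_time": 9.5, "score": 0.88},
    {"root_object_id": "vid_a", "start_time": 12.0, "end_time": 20.2, "score": 0.80},
]

grouped = group_scenes_by_video(sample)
for video_id, scenes in grouped.items():
    print(video_id)
    for scene in scenes:
        print(f"  {format_timestamp(scene['start_time'])}  score={scene['score']:.2f}")
```

With real results, `sample` would be replaced by the list under `results["results"]`.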

    Feature Extractors

    Retriever Stages

    rerank

    Rerank documents using cross-encoder models for accurate relevance
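As an illustration of what a rerank stage does, the sketch below rescores each candidate against the query and reorders the list. It is a local stand-in, not the Mixpeek stage itself: the `rerank` function and its `score_pair` parameter are hypothetical, and the default scorer is a simple token-overlap measure used here only so the example runs without a model. In practice `score_pair` would be replaced by a cross-encoder that scores each (query, text) pair jointly.

```python
from typing import Callable, List, Optional


def overlap_score(query: str, text: str) -> float:
    """Stand-in scorer: fraction of query tokens that appear in the text.

    This is NOT a cross-encoder; it only mimics the (query, text) -> score shape.
    """
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(
    query: str,
    docs: List[dict],
    score_pair: Callable[[str, str], float] = overlap_score,
    top_k: Optional[int] = None,
) -> List[dict]:
    """Return copies of docs with a 'rerank_score', sorted best first."""
    scored = [
        {**doc, "rerank_score": score_pair(query, doc.get("text") or "")}
        for doc in docs
    ]
    scored.sort(key=lambda d: d["rerank_score"], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Made-up candidates: the first-stage score favors s1, the reranker favors s2.
candidates = [
    {"id": "s1", "text": "a dog runs on the beach", "score": 0.9},
    {"id": "s2", "text": "person presenting a chart on a whiteboard", "score": 0.7},
]
reranked = rerank("presenting a chart", candidates)
print([doc["id"] for doc in reranked])
```

Keeping the scorer pluggable means the ordering logic stays the same whether the scores come from a toy function or a real model.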

    sort
