
    Video Semantic Search Pipeline

    Build a production-ready video search engine that lets users find specific moments across thousands of hours of video using natural language queries.

    video
    text
    Multi-Tier
    3.8K runs
    Deploy Recipe
    from mixpeek import Mixpeek

    client = Mixpeek(api_key="YOUR_API_KEY")

    # 1. Create namespace & collection
    namespace = client.namespaces.create(name="video-search")
    collection = client.collections.create(
        namespace_id=namespace.id,
        name="videos",
        extractors=["video-embedding-v2", "transcript-extraction"],
        chunk_strategy="scene-based"
    )

    # 2. Upload videos
    client.buckets.upload(
        collection_id=collection.id,
        url="s3://your-bucket/videos/"
    )

    # 3. Create retriever
    retriever = client.retrievers.create(
        namespace_id=namespace.id,
        name="video-search",
        stages=[
            {"type": "vector_search", "model": "multilingual-e5-large", "top_k": 50},
            {"type": "rerank", "model": "colbert-v2", "top_k": 10}
        ]
    )

    # 4. Search
    results = client.retrievers.execute(
        retriever_id=retriever.id,
        query="person explaining machine learning concepts"
    )
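
The retriever above chains two stages: a broad vector search (top_k 50) followed by a rerank that keeps the best 10. The sketch below illustrates that funnel in plain Python with no Mixpeek calls. Everything in it is an assumption made for illustration: the function names, the toy two-dimensional vectors, and the term-overlap scorer standing in for a real reranking model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vector_search(query_vec, docs, top_k):
    """Stage 1 (hypothetical): cheap, broad recall by embedding similarity."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return ranked[:top_k]

def rerank(query_terms, candidates, top_k):
    """Stage 2 (hypothetical): rescore only the stage-1 candidates.

    A real reranker would score each (query, document) pair with a model;
    here the score is simply the fraction of query terms found in the text.
    """
    def score(doc):
        words = set(doc["text"].lower().split())
        return sum(term in words for term in query_terms) / len(query_terms)
    return sorted(candidates, key=score, reverse=True)[:top_k]

def run_pipeline(query_vec, query_terms, docs, recall_k=50, final_k=10):
    """Run the two stages in order; defaults mirror the top_k values above."""
    candidates = vector_search(query_vec, docs, recall_k)
    return rerank(query_terms, candidates, final_k)

# Toy corpus: vectors and transcripts are made up for the example.
DOCS = [
    {"id": "a", "vector": [1.0, 0.0], "text": "cooking pasta at home"},
    {"id": "b", "vector": [0.9, 0.1], "text": "a person explaining machine learning"},
    {"id": "c", "vector": [0.0, 1.0], "text": "machine learning lecture"},
    {"id": "d", "vector": [0.8, 0.2], "text": "intro to learning guitar"},
]

top = run_pipeline([1.0, 0.0], ["machine", "learning"], DOCS, recall_k=3, final_k=2)
```

In this toy run, document "c" matches the query terms but never reaches the reranker because stage 1 drops it. That is the usual trade-off of a funnel: the first stage's top_k bounds what the second stage can recover.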

    Feature Extractors

    Scene Detection

    Detect and classify scenes in video content
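
How Mixpeek detects scenes is not described on this page. As a rough illustration of what scene-based chunking can mean, the sketch below splits a video wherever consecutive frames differ sharply. The per-frame "signature" (for example, a coarse color histogram), the function name, and the threshold are all assumptions for the example.

```python
def split_scenes(frame_signatures, threshold):
    """Split a frame sequence into scenes at large frame-to-frame changes.

    frame_signatures: list of equal-length numeric lists, one per frame
                      (hypothetical feature, e.g. a normalized histogram).
    threshold: L1 distance above which a cut is declared (assumed value).
    Returns a list of (start, end) frame index ranges, end exclusive.
    """
    if not frame_signatures:
        return []
    scenes = []
    start = 0
    for i in range(1, len(frame_signatures)):
        prev, cur = frame_signatures[i - 1], frame_signatures[i]
        # L1 distance between consecutive frame signatures
        diff = sum(abs(p - c) for p, c in zip(prev, cur))
        if diff > threshold:
            scenes.append((start, i))  # close the current scene at the cut
            start = i
    scenes.append((start, len(frame_signatures)))  # final scene
    return scenes

# Five toy frames: two dark-ish frames, then three bright-ish frames.
frames = [[0.1, 0.9], [0.12, 0.88], [0.8, 0.2], [0.79, 0.21], [0.78, 0.22]]
scenes = split_scenes(frames, threshold=0.5)
```

Each resulting range could then be embedded and indexed as its own chunk, so a search hit points to a specific moment instead of a whole file.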

    450K runs

    Retriever Stages

    rerank

    Rerank documents using cross-encoder models for more accurate relevance ranking

    sort