Video Semantic Search Pipeline
Build a production-ready video search engine that lets users find specific moments across thousands of hours of video using natural language queries.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

# 1. Create namespace & collection
namespace = client.namespaces.create(name="video-search")
collection = client.collections.create(
    namespace_id=namespace.id,
    name="videos",
    extractors=["video-embedding-v2", "transcript-extraction"],
    chunk_strategy="scene-based",
)

# 2. Upload videos
client.buckets.upload(
    collection_id=collection.id,
    url="s3://your-bucket/videos/",
)

# 3. Create retriever
retriever = client.retrievers.create(
    namespace_id=namespace.id,
    name="video-search",
    stages=[
        {"type": "vector_search", "model": "multilingual-e5-large", "top_k": 50},
        {"type": "rerank", "model": "colbert-v2", "top_k": 10},
    ],
)

# 4. Search
results = client.retrievers.execute(
    retriever_id=retriever.id,
    query="person explaining machine learning concepts",
)
Feature Extractors
Scene Detection
Detect and classify scenes in video content
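To give a rough intuition for what scene-based chunking involves, here is a minimal sketch of threshold-based cut detection. It is not Mixpeek's implementation: the function names, the per-frame feature vectors (stand-ins for something like color histograms), and the threshold value are all assumptions made for illustration, and it covers only detection, not classification.

```python
from typing import List, Sequence, Tuple


def frame_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two per-frame feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_scenes(
    frames: Sequence[Sequence[float]], threshold: float
) -> List[Tuple[int, int]]:
    """Split frames into scenes wherever consecutive frames differ sharply.

    Returns (start, end) frame-index ranges with `end` exclusive.
    Hypothetical helper for illustration only.
    """
    if not frames:
        return []
    scenes: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(frames)):
        # A large jump between neighbouring frames is treated as a cut.
        if frame_distance(frames[i - 1], frames[i]) > threshold:
            scenes.append((start, i))
            start = i
    scenes.append((start, len(frames)))
    return scenes


# Toy feature vectors (assumed, e.g. 3-bin color histograms per frame).
frames = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
    [0.0, 0.9, 0.1],
]
print(detect_scenes(frames, threshold=0.5))  # [(0, 2), (2, 4), (4, 5)]
```

Each returned range would then become one chunk to embed and index, so a search hit can point at a specific moment rather than a whole file.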
Retriever Stages
rerank
Rerank documents using cross-encoder models for accurate relevance
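The two-stage pattern behind a vector_search stage followed by a rerank stage can be sketched in plain Python. This is an illustrative toy, not the Mixpeek retriever: the embeddings are hand-written, all function names are hypothetical, and `token_overlap` is only a placeholder for a real cross-encoder, which would score each query and document pair jointly with a trained model.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def vector_search(
    query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]], top_k: int
) -> List[str]:
    """Stage 1: cheap recall. Rank every document by embedding similarity."""
    ranked = sorted(
        doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True
    )
    return ranked[:top_k]


def rerank(
    query: str,
    candidates: List[str],
    texts: Dict[str, str],
    score_pair: Callable[[str, str], float],
    top_k: int,
) -> List[str]:
    """Stage 2: precise ordering. Score each (query, document) pair jointly."""
    ranked = sorted(
        candidates, key=lambda d: score_pair(query, texts[d]), reverse=True
    )
    return ranked[:top_k]


def token_overlap(query: str, text: str) -> float:
    """Placeholder pair scorer (Jaccard overlap), standing in for a cross-encoder."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


# Toy corpus: transcripts plus made-up 2-d embeddings (assumptions).
texts = {
    "clip_a": "person explaining machine learning concepts on a whiteboard",
    "clip_b": "cooking pasta in a kitchen",
    "clip_c": "lecture on machine learning",
}
doc_vecs = {"clip_a": [0.7, 0.3], "clip_b": [0.0, 1.0], "clip_c": [0.9, 0.1]}

query = "person explaining machine learning concepts"
query_vec = [1.0, 0.0]  # assumed embedding of the query

candidates = vector_search(query_vec, doc_vecs, top_k=2)
print(candidates)  # ['clip_c', 'clip_a']
print(rerank(query, candidates, texts, token_overlap, top_k=1))  # ['clip_a']
```

The first stage keeps a generous candidate set because it is cheap per document; the second stage is more expensive per pair, so it runs only on those candidates and can reorder them, as it does here by promoting clip_a over clip_c.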
