Video Transcription & Indexing Pipeline
Automatically transcribe video content with speaker identification, timestamps, and full-text indexing for downstream search and analytics.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

namespace = client.namespaces.create(name="transcripts")

collection = client.collections.create(
    namespace_id=namespace.id,
    name="meetings",
    extractors=["audio-transcription", "speaker-diarization"]
)

# Upload videos - transcription happens automatically
client.buckets.upload(
    collection_id=collection.id,
    url="s3://your-bucket/meeting-recordings/"
)

# Retrieve transcript
docs = client.documents.search(
    namespace_id=namespace.id,
    collection_ids=[collection.id],
    query="action items from last week"
)
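To make the idea of full-text indexing over timestamped transcripts concrete, here is a minimal local sketch. It does not describe how Mixpeek implements search. The `(start_seconds, text)` segment shape and the `build_index` and `search` helpers are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z0-9']+")


def build_index(segments):
    """Map each lowercase token to the start times of segments containing it.

    `segments` is an iterable of (start_seconds, text) pairs (assumed shape).
    """
    index = defaultdict(list)
    for start, text in segments:
        # Use a set so a token repeated in one segment is recorded once.
        for token in set(TOKEN_RE.findall(text.lower())):
            index[token].append(start)
    return index


def search(index, query):
    """Return sorted start times of segments that contain every query token."""
    tokens = TOKEN_RE.findall(query.lower())
    if not tokens:
        return []
    hits = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        hits &= set(index.get(token, []))
    return sorted(hits)


# Hypothetical transcript segments: (start time in seconds, spoken text).
segments = [
    (0.0, "Welcome to the meeting"),
    (12.5, "Action items: ship the report"),
    (30.0, "More action next week"),
]
index = build_index(segments)
matches = search(index, "action items")
```

Because each hit is a segment start time, a result can link straight to the matching moment in the video rather than to the recording as a whole.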
Feature Extractors
Audio Transcription
Transcribe audio content to text
Speaker Diarization
Identify and separate different speakers in audio content
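When transcription and diarization results arrive as separate timelines, one common way to combine them is to give each transcript segment the speaker whose turn overlaps it the most. The sketch below assumes that setup. The `Segment` and `Turn` shapes and the `label_segments` helper are hypothetical and are not Mixpeek's output format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of audio (assumed shape)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarization turn: one speaker talking over a time range (assumed shape)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """Pair each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "unknown", 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


# Hypothetical inputs for illustration.
segments = [
    Segment(0.0, 4.0, "Let's review the roadmap."),
    Segment(4.0, 9.0, "I can take the first action item."),
    Segment(20.0, 21.0, "Thanks, everyone."),
]
turns = [Turn(0.0, 4.5, "speaker_1"), Turn(4.5, 10.0, "speaker_2")]
labeled = label_segments(segments, turns)
```

Segments that overlap no turn are labeled "unknown" instead of being assigned to the nearest speaker, which keeps gaps in the diarization visible downstream.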
