    Training

    Video Transcription & Indexing Pipeline

    Automatically transcribe video content with speaker identification, timestamps, and full-text indexing for downstream search and analytics.

    video
    audio
    text
    Single Tier
    2.5K runs
    Deploy Recipe
    from mixpeek import Mixpeek

    # Authenticate with your API key
    client = Mixpeek(api_key="YOUR_API_KEY")

    # Create a namespace and a collection that runs both extractors
    namespace = client.namespaces.create(name="transcripts")
    collection = client.collections.create(
        namespace_id=namespace.id,
        name="meetings",
        extractors=["audio-transcription", "speaker-diarization"],
    )

    # Upload videos - transcription happens automatically
    client.buckets.upload(
        collection_id=collection.id,
        url="s3://your-bucket/meeting-recordings/",
    )

    # Search the indexed transcripts
    docs = client.documents.search(
        namespace_id=namespace.id,
        collection_ids=[collection.id],
        query="action items from last week",
    )
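
The recipe above does not show what the returned documents look like. The following is a minimal sketch of turning timestamped, speaker-labeled segments into a readable transcript. It assumes each segment is a plain dict with the hypothetical fields `speaker`, `start`, `end` (seconds), and `text`; these names are illustrative and are not taken from the Mixpeek API.

```python
def format_timestamp(seconds: float) -> str:
    """Render a number of seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments: list[dict]) -> list[str]:
    """Sort segments by start time and render one line per segment."""
    ordered = sorted(segments, key=lambda s: s["start"])
    return [
        f"[{format_timestamp(s['start'])}] {s['speaker']}: {s['text']}"
        for s in ordered
    ]


# Hypothetical sample data; real field names and values may differ.
sample_segments = [
    {"speaker": "SPEAKER_01", "start": 75.4, "end": 80.0,
     "text": "I'll send the notes by Friday."},
    {"speaker": "SPEAKER_00", "start": 3.0, "end": 9.5,
     "text": "Let's review last week's action items."},
]

for line in render_transcript(sample_segments):
    print(line)
```

If the actual response uses different field names, only the dictionary keys in `render_transcript` need to change.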

    Feature Extractors

    Audio Transcription

    Transcribe audio content to text

    450K runs

    Speaker Diarization

    Identify and separate different speakers in audio content
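
To illustrate how diarization output can be combined with a transcript, here is a generic sketch that labels each transcript segment with the speaker whose turn overlaps it the most. This is a common post-processing approach, not a description of how the Mixpeek extractor works internally; the `turns` and `transcript` structures and their field names are assumptions made for the example.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(transcript: list[dict], turns: list[dict]) -> list[dict]:
    """Label each transcript segment with the speaker of the most-overlapping turn.

    Segments that overlap no turn get speaker=None.
    """
    labeled = []
    for seg in transcript:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_speaker, best_overlap = turn["speaker"], shared
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


# Hypothetical diarization turns and transcript segments (times in seconds).
turns = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 10.0},
    {"speaker": "SPEAKER_01", "start": 10.0, "end": 20.0},
]
transcript = [
    {"start": 1.0, "end": 8.0, "text": "Welcome everyone."},
    {"start": 9.0, "end": 14.0, "text": "Thanks, happy to be here."},
    {"start": 25.0, "end": 27.0, "text": "Segment outside any turn."},
]

for seg in assign_speakers(transcript, turns):
    print(seg["speaker"], seg["text"])
```

The nested loop is quadratic in the worst case, which is fine for a single meeting; for long recordings, sorting both lists by start time and sweeping through them once would scale better.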

    320K runs

    Retriever Stages