    HF | Speaker Diarization | cc-by-4.0

    speaker-diarization-community-1

    by pyannote

    Community speaker diarization pipeline for who-spoke-when audio metadata

    Downloads: 2.7M/month
    Likes: 526
    Parameters: Pipeline
    Identifiers
    Model ID
    pyannote/speaker-diarization-community-1
    Feature URI
    mixpeek://transcription@v1/pyannote_diarization_community_1

    Overview

    pyannote Community-1 is a speaker diarization pipeline that segments audio into speaker turns and detects speech activity, speaker changes, and overlapped speech. The model files are publicly available once you accept the access conditions on HuggingFace, and it has become one of the highest-traffic diarization models there.

    On Mixpeek, diarization turns raw audio and video transcripts into searchable conversational structure. Agents can ask not only what was said, but who said it and when it happened.
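To make "who said it and when" concrete, here is a minimal sketch of querying diarization output. The `SpeakerSegment` shape, its field names, and the `SPEAKER_00`-style labels are assumptions for illustration, not the documented Mixpeek or pyannote output schema.

```typescript
// Hypothetical shape for one diarization segment (field names are illustrative).
interface SpeakerSegment {
  speaker: string; // assumed label format, e.g. "SPEAKER_00"
  start: number;   // seconds from the beginning of the recording
  end: number;     // seconds from the beginning of the recording
}

// Return the speakers active at time t (more than one when speech overlaps).
function speakersAt(segments: SpeakerSegment[], t: number): string[] {
  const active = segments
    .filter((s) => s.start <= t && t < s.end)
    .map((s) => s.speaker);
  return Array.from(new Set(active));
}

// Sum the speaking time of each speaker, in seconds.
function talkTime(segments: SpeakerSegment[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of segments) {
    totals.set(s.speaker, (totals.get(s.speaker) ?? 0) + (s.end - s.start));
  }
  return totals;
}

// Made-up example data: two speakers with a short overlap around 3.5 to 4.0 s.
const demoSegments: SpeakerSegment[] = [
  { speaker: "SPEAKER_00", start: 0.0, end: 4.0 },
  { speaker: "SPEAKER_01", start: 3.5, end: 9.0 },
  { speaker: "SPEAKER_00", start: 9.0, end: 12.0 },
];
```

With segments in this form, "who was speaking at 3.75 s" becomes `speakersAt(demoSegments, 3.75)`, and per-speaker totals come from `talkTime(demoSegments)`.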

    Architecture

    The model is a pyannote.audio pipeline composed of voice activity detection, speaker change detection, overlapped speech detection, embedding, and clustering components. It accepts whole files or waveform excerpts.
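As an illustration of what overlapped speech means at the output level, the sketch below derives overlap regions from a list of speaker turns with a simple sweep over start and end events. This is post-processing on hypothetical output, not how the pipeline's internal overlap detector works; the `Turn` and `Interval` types are assumptions, and the code assumes a single speaker's own turns never overlap each other.

```typescript
// Hypothetical turn and interval shapes (illustrative, not an official schema).
interface Turn {
  speaker: string;
  start: number; // seconds
  end: number;   // seconds
}

interface Interval {
  start: number;
  end: number;
}

// Find the time regions where two or more turns are active at once.
function overlappedRegions(turns: Turn[]): Interval[] {
  // Each turn contributes a +1 event at its start and a -1 event at its end.
  const events: Array<{ t: number; delta: number }> = [];
  for (const turn of turns) {
    events.push({ t: turn.start, delta: 1 });
    events.push({ t: turn.end, delta: -1 });
  }
  // Sort by time; at equal times process ends first so that turns which
  // merely touch (one ends exactly when another starts) are not an overlap.
  events.sort((a, b) => a.t - b.t || a.delta - b.delta);

  const regions: Interval[] = [];
  let active = 0;
  let overlapStart: number | null = null;
  for (const e of events) {
    const before = active;
    active += e.delta;
    if (before < 2 && active >= 2) {
      overlapStart = e.t; // a second speaker just became active
    } else if (before >= 2 && active < 2 && overlapStart !== null) {
      if (e.t > overlapStart) regions.push({ start: overlapStart, end: e.t });
      overlapStart = null;
    }
  }
  return regions;
}
```

For turns `0-4 s`, `3.5-9 s`, and `9-12 s`, the only region with two active speakers is `3.5-4 s`; the turns that meet at exactly 9 s do not count as overlapping.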

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "meetings",
      source: { url: "https://example.com/meeting.wav" },
      feature_extractors: [
        {
          feature: "speaker_diarization",
          model: "pyannote/speaker-diarization-community-1",
        },
      ],
    });

    Capabilities

    • Speaker turn segmentation
    • Voice activity and speaker change detection
    • Overlapped speech handling
    • Runs through pyannote.audio

    Use Cases on Mixpeek

    • Search meeting recordings by speaker and topic
    • Build agent memory over multi-speaker calls
    • Filter podcast or interview clips by host, guest, or caller
    • Attach who-spoke-when metadata to transcripts for audit trails
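Attaching who-spoke-when metadata to a transcript can be sketched as assigning each timed word to the speaker whose segment overlaps it the most. The `TimedWord`, `DiarizedSpan`, and `AttributedWord` types below are hypothetical, and maximum-overlap assignment is one common heuristic rather than a description of what Mixpeek does internally.

```typescript
// Hypothetical shapes for illustration only.
interface DiarizedSpan {
  speaker: string;
  start: number; // seconds
  end: number;   // seconds
}

interface TimedWord {
  text: string;
  start: number; // seconds
  end: number;   // seconds
}

interface AttributedWord extends TimedWord {
  speaker: string | null; // null when no span overlaps the word
}

// Label each word with the speaker whose span overlaps it the longest.
function attributeWords(words: TimedWord[], spans: DiarizedSpan[]): AttributedWord[] {
  return words.map((w) => {
    let best: string | null = null;
    let bestOverlap = 0;
    for (const s of spans) {
      // Length of the intersection; zero or negative means no overlap.
      const overlap = Math.min(w.end, s.end) - Math.max(w.start, s.start);
      if (overlap > bestOverlap) {
        bestOverlap = overlap;
        best = s.speaker;
      }
    }
    return { ...w, speaker: best };
  });
}
```

Words that fall in silence or outside every span keep `speaker: null`, which leaves the gap visible in an audit trail instead of guessing a speaker.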

    Performance

    Input Size: Audio file or waveform excerpt
    GPU Latency: Audio duration dependent
    GPU Throughput: Batch by file for offline archives
    GPU Memory: ~2 GB

    Model files require accepting HuggingFace access conditions

    Specification

    Framework: HF
    Organization: pyannote
    Feature: Speaker Diarization
    Output: speaker segments
    Modalities: video, audio
    Retriever: Speaker Filter
    Parameters: Pipeline
    License: cc-by-4.0
    Downloads/mo: 2.7M
    Likes: 526

    Research Paper

    pyannote.audio speaker diarization community-1

    arxiv.org

    Build a pipeline with speaker-diarization-community-1

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
