speaker-diarization-community-1
by pyannote
Community speaker diarization pipeline for who-spoke-when audio metadata
pyannote/speaker-diarization-community-1
mixpeek://transcription@v1/pyannote_diarization_community_1
Overview
pyannote Community-1 is a speaker diarization pipeline that segments audio into speaker turns and detects speech activity, speaker changes, and overlapped speech. It is publicly available once you accept its access conditions, and it has become one of the highest-traffic diarization models on Hugging Face.
On Mixpeek, diarization turns raw audio and video transcripts into searchable conversational structure. Agents can ask not only what was said, but who said it and when it happened.
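To illustrate how "who said it" can be attached to "what was said", here is a minimal sketch that labels each transcript segment with the speaker whose diarization turns overlap it the most. The `SpeakerTurn` and `TranscriptSegment` shapes and the `attributeSpeakers` helper are hypothetical, chosen for illustration; they are not the Mixpeek or pyannote output formats.

```typescript
// Hypothetical shapes for illustration; not the Mixpeek or pyannote output format.
interface SpeakerTurn {
  speaker: string; // e.g. "SPEAKER_00"
  start: number;   // seconds
  end: number;     // seconds
}

interface TranscriptSegment {
  text: string;
  start: number; // seconds
  end: number;   // seconds
}

interface AttributedSegment extends TranscriptSegment {
  speaker: string | null; // null when no turn overlaps the segment
}

// Length in seconds of the intersection of two intervals (0 if they are disjoint).
function overlapSeconds(aStart: number, aEnd: number, bStart: number, bEnd: number): number {
  return Math.max(0, Math.min(aEnd, bEnd) - Math.max(aStart, bStart));
}

// Label each transcript segment with the speaker whose turns overlap it the most.
function attributeSpeakers(
  segments: TranscriptSegment[],
  turns: SpeakerTurn[]
): AttributedSegment[] {
  return segments.map((segment) => {
    // Total overlap per speaker, summed across all of that speaker's turns.
    const totals: Record<string, number> = {};
    for (const turn of turns) {
      const shared = overlapSeconds(segment.start, segment.end, turn.start, turn.end);
      if (shared > 0) {
        totals[turn.speaker] = (totals[turn.speaker] ?? 0) + shared;
      }
    }
    // Pick the speaker with the largest total overlap, if any.
    let best: string | null = null;
    let bestSeconds = 0;
    for (const speaker of Object.keys(totals)) {
      const seconds = totals[speaker] ?? 0;
      if (seconds > bestSeconds) {
        best = speaker;
        bestSeconds = seconds;
      }
    }
    return { ...segment, speaker: best };
  });
}
```

Picking the speaker with the largest overlap is a simple heuristic. A segment that straddles a speaker change is assigned entirely to one speaker, so word-level timestamps would give finer attribution where they are available.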
Architecture
A pyannote.audio pipeline composed of voice activity detection, speaker change detection, overlapped speech detection, embedding, and clustering components. It accepts whole files or waveform excerpts.
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "meetings",
  source: { url: "https://example.com/meeting.wav" },
  feature_extractors: [
    {
      feature: "speaker_diarization",
      model: "pyannote/speaker-diarization-community-1"
    }
  ]
});
Capabilities
- Speaker turn segmentation
- Voice activity and speaker change detection
- Overlapped speech handling
- Runs through pyannote.audio
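As a rough illustration of what overlapped speech means downstream, the sketch below derives the time regions where two or more speaker turns are active at once. It works on an already-diarized list of turns, not on audio, and is not how the pipeline itself detects overlap. The `SpeakerTurn` shape and the `overlappedRegions` helper are assumptions made for this example.

```typescript
// Hypothetical shape for illustration; not the Mixpeek or pyannote output format.
interface SpeakerTurn {
  speaker: string;
  start: number; // seconds
  end: number;   // seconds
}

interface Region {
  start: number; // seconds
  end: number;   // seconds
}

// Return the time regions where two or more turns are active at once.
// Note: this counts turns, so two overlapping turns from the same speaker
// label would also be reported.
function overlappedRegions(turns: SpeakerTurn[]): Region[] {
  // One +1 event at each turn start and one -1 event at each turn end.
  const events: { time: number; delta: number }[] = [];
  for (const turn of turns) {
    if (turn.end <= turn.start) continue; // skip empty or inverted turns
    events.push({ time: turn.start, delta: 1 });
    events.push({ time: turn.end, delta: -1 });
  }
  // Sort by time; at equal times, ends come first so that turns which
  // merely touch are not counted as overlapping.
  events.sort((a, b) => a.time - b.time || a.delta - b.delta);

  const regions: Region[] = [];
  let active = 0; // number of turns active after the current event
  let regionStart: number | null = null;
  for (const event of events) {
    const before = active;
    active += event.delta;
    if (before < 2 && active >= 2) {
      regionStart = event.time; // overlap begins
    } else if (before >= 2 && active < 2 && regionStart !== null) {
      if (event.time > regionStart) {
        regions.push({ start: regionStart, end: event.time }); // overlap ends
      }
      regionStart = null;
    }
  }
  return regions;
}
```

The sweep over sorted start and end events runs in O(n log n) time for n turns, which keeps it practical for long recordings with many short turns.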
Use Cases on Mixpeek
Performance
Downloading the model files requires accepting the access conditions on Hugging Face.
Specification
Research Paper
pyannote.audio speaker diarization community-1
arxiv.org
Build a pipeline with speaker-diarization-community-1
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.