    Speaker Diarization

    Identify and separate different speakers in audio

    Note: This playground provides simulated output to showcase functionality. No input data is processed or stored on our servers. Use this demo to explore the feature extractor's capabilities before integrating it into your application.

    Input

    A video file, supplied either as a URL or as an uploaded file.

    The diarization model to use. Default: pyannote-v2

    Minimum number of speakers to detect. Default: 1

    Maximum number of speakers to detect. Default: 10

    Minimum duration for speaker segments in seconds. Default: 0.5
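
    The four settings above could be collected into a small validated structure, as in the sketch below. The field names (model, min_speakers, max_speakers, min_segment_duration) are hypothetical, since this page does not show the actual parameter names; only the default values are taken from the text above. The drop_short_segments helper is likewise an assumption about how a minimum segment duration might be applied, not a description of the actual extractor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiarizationSettings:
    """Hypothetical container for the settings listed above.

    Field names are illustrative; only the defaults come from the page.
    """

    model: str = "pyannote-v2"
    min_speakers: int = 1
    max_speakers: int = 10
    min_segment_duration: float = 0.5  # seconds

    def __post_init__(self) -> None:
        # Reject combinations that cannot describe a valid request.
        if self.min_speakers < 1:
            raise ValueError("min_speakers must be at least 1")
        if self.max_speakers < self.min_speakers:
            raise ValueError("max_speakers must be >= min_speakers")
        if self.min_segment_duration < 0:
            raise ValueError("min_segment_duration must be non-negative")


def drop_short_segments(segments, min_duration):
    """Keep only segments that last at least min_duration seconds.

    Assumed behaviour: one plausible reading of the minimum-duration
    setting, shown for illustration only.
    """
    return [s for s in segments if s["end"] - s["start"] >= min_duration]


settings = DiarizationSettings()
segments = [{"start": 0, "end": 12.5}, {"start": 13.0, "end": 13.25}]
print(drop_short_segments(segments, settings.min_segment_duration))
```

    With the default of 0.5 seconds, the 0.25-second segment is filtered out and only the first segment remains.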

    Output

    {
      "speakers": [
        {
          "id": "speaker_1",
          "segments": [
            {
              "start": 0,
              "end": 12.5
            },
            {
              "start": 35.2,
              "end": 48.1
            }
          ],
          "total_time": 25.4
        },
        {
          "id": "speaker_2",
          "segments": [
            {
              "start": 12.8,
              "end": 34.9
            },
            {
              "start": 48.5,
              "end": 62.3
            }
          ],
          "total_time": 35.9
        }
      ],
      "total_speakers": 2,
      "speaker_characteristics": [
        {
          "id": "speaker_1",
          "gender": "female",
          "estimated_age": "adult"
        },
        {
          "id": "speaker_2",
          "gender": "male",
          "estimated_age": "adult"
        }
      ]
    }
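
    To consume a result shaped like the sample output, one might flatten the per-speaker segments into a single chronological timeline and recompute each speaker's talk time from the segments. The following is a minimal sketch that works on the sample JSON only; the helper names build_timeline and talk_time are illustrative, and nothing here calls a Mixpeek API.

```python
import json

# Sample output from this page, trimmed to the fields used below.
SAMPLE = """
{
  "speakers": [
    {"id": "speaker_1",
     "segments": [{"start": 0, "end": 12.5}, {"start": 35.2, "end": 48.1}],
     "total_time": 25.4},
    {"id": "speaker_2",
     "segments": [{"start": 12.8, "end": 34.9}, {"start": 48.5, "end": 62.3}],
     "total_time": 35.9}
  ],
  "total_speakers": 2
}
"""


def build_timeline(result):
    """Return (start, end, speaker_id) tuples sorted by start time."""
    turns = [
        (seg["start"], seg["end"], speaker["id"])
        for speaker in result["speakers"]
        for seg in speaker["segments"]
    ]
    return sorted(turns)


def talk_time(speaker):
    """Sum segment durations, rounded to one decimal place like the sample."""
    return round(sum(s["end"] - s["start"] for s in speaker["segments"]), 1)


result = json.loads(SAMPLE)
for start, end, speaker_id in build_timeline(result):
    print(f"{start:6.1f} - {end:6.1f}  {speaker_id}")
for speaker in result["speakers"]:
    print(speaker["id"], talk_time(speaker), speaker["total_time"])
```

    For the sample, the timeline alternates between speaker_1 and speaker_2, and the recomputed talk times (25.4 and 35.9 seconds) agree with the reported total_time values.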