    Audio

    Seamless Expressive Translation

    Translate speech across languages while preserving emotional tone, pauses, and vocal style

    Note: This playground provides simulated output to showcase functionality. No input data is processed or stored on our servers. Use this demo to explore the feature extractor's capabilities before integrating it into your application.

    Input

    Enter a URL to an audio file

    Drag and drop an audio file here, or click to browse

    Type of translation to perform. Default: S2ST

    Source language (auto-detect if not specified). Default: auto

    The language to translate into. Required; no default.

    SeamlessM4T model variant to use. Default: v2-large

    Whether to preserve emotional tone, pauses, and vocal style. Default: true

    Controls the predicted duration and speech rate. Higher values result in slower speech. Default: 1

    The vocoder to use for speech synthesis. Default: vocoder_pretssel

    Audio sampling rate in Hz. Default: 16000

    Length of audio chunks for processing (in seconds). Default: 30

    Stride length for overlapping chunks (in seconds). Default: 5

    Whether to normalize input audio levels. Default: true

    Whether to return word-level timestamps. Default: true

    Whether to generate speech output (for S2ST and T2ST modes). Default: true
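
    The inputs above can be collected into a single parameter object before a request is made. The following is a minimal Python sketch, not the extractor's actual API: the field names are hypothetical (this page lists the descriptions and defaults but not the parameter names; a few are inferred from keys in the sample output, such as translation_mode and model_variant), and the validation rules are assumptions added for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class TranslationParams:
    """Hypothetical container for the inputs listed above.

    Field names are assumptions; only the defaults come from the page.
    """
    target_language: str                  # required, no default
    translation_mode: str = "S2ST"
    source_language: str = "auto"         # auto-detect when not specified
    model_variant: str = "v2-large"
    preserve_expressivity: bool = True    # tone, pauses, vocal style
    duration_factor: float = 1.0          # higher values give slower speech
    vocoder: str = "vocoder_pretssel"
    sample_rate: int = 16000              # Hz
    chunk_length_s: float = 30.0
    stride_length_s: float = 5.0
    normalize_audio: bool = True
    return_timestamps: bool = True
    generate_speech: bool = True          # relevant to S2ST and T2ST modes

    def __post_init__(self) -> None:
        # Assumed sanity checks; the real service may validate differently.
        if not self.target_language:
            raise ValueError("target_language is required")
        if self.duration_factor <= 0:
            raise ValueError("duration_factor must be positive")
        if self.stride_length_s >= self.chunk_length_s:
            raise ValueError("stride_length_s must be shorter than chunk_length_s")


# Only the required field needs to be supplied; everything else uses defaults.
params = asdict(TranslationParams(target_language="spa"))
```

    Keeping the defaults in one place makes it easy to see which settings a given request overrides.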

    Output

    {
      "translated_text": "Hola, ¿cómo estás hoy?",
      "source_language": "eng",
      "target_language": "spa",
      "translation_mode": "S2ST",
      "audio_output": {
        "duration": 3.2,
        "sample_rate": 16000,
        "format": "wav"
      },
      "expressivity": {
        "pitch": "preserved",
        "pauses": "preserved",
        "tempo": "preserved",
        "emotion": "preserved",
        "prosody_confidence": 0.94,
        "expressivity_score": 0.89
      },
      "timestamps": [
        {
          "word": "Hola",
          "start": 0,
          "end": 0.5,
          "confidence": 0.98
        },
        {
          "word": "¿cómo",
          "start": 0.6,
          "end": 1.1,
          "confidence": 0.96
        },
        {
          "word": "estás",
          "start": 1.2,
          "end": 1.7,
          "confidence": 0.97
        },
        {
          "word": "hoy?",
          "start": 1.8,
          "end": 2.3,
          "confidence": 0.95
        }
      ],
      "inference_metrics": {
        "duration_factor_used": 1,
        "vocoder_used": "vocoder_pretssel",
        "model_variant": "v2-large",
        "processing_time_ms": 1250,
        "audio_quality_score": 0.92
      },
      "language_detection": {
        "detected_language": "eng",
        "confidence": 0.99,
        "alternative_languages": [
          {
            "language": "eng-US",
            "confidence": 0.85
          },
          {
            "language": "eng-GB",
            "confidence": 0.14
          }
        ]
      },
      "segments": [
        {
          "source_text": "How are you today?",
          "translated_text": "¿Cómo estás hoy?",
          "start_time": 0,
          "end_time": 2.3,
          "speaker_id": "speaker_1"
        }
      ]
    }
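
    A response shaped like the simulated output above can be inspected with standard JSON tooling. The sketch below is illustrative only: it embeds a trimmed copy of the sample output, and the summarize helper and the keys of its return value are hypothetical names, not part of any Mixpeek client. It assumes that audio_output and timestamps may be absent when speech generation or word-level timestamps are turned off.

```python
import json

# Trimmed copy of the simulated output shown above.
SAMPLE = """
{
  "translated_text": "Hola, ¿cómo estás hoy?",
  "target_language": "spa",
  "audio_output": {"duration": 3.2, "sample_rate": 16000, "format": "wav"},
  "timestamps": [
    {"word": "Hola",  "start": 0,   "end": 0.5, "confidence": 0.98},
    {"word": "¿cómo", "start": 0.6, "end": 1.1, "confidence": 0.96},
    {"word": "estás", "start": 1.2, "end": 1.7, "confidence": 0.97},
    {"word": "hoy?",  "start": 1.8, "end": 2.3, "confidence": 0.95}
  ]
}
"""


def summarize(response: dict) -> dict:
    """Collect a few quick checks from a response (hypothetical helper)."""
    words = response.get("timestamps", [])
    # Total time covered by words, excluding the gaps between them.
    spoken = sum(w["end"] - w["start"] for w in words)
    # Word the model was least sure about, if any words were returned.
    weakest = min(words, key=lambda w: w["confidence"]) if words else None
    # Each word should end before the next one starts.
    ordered = all(a["end"] <= b["start"] for a, b in zip(words, words[1:]))
    return {
        "text": response.get("translated_text", ""),
        "word_count": len(words),
        "spoken_seconds": round(spoken, 3),
        "weakest_word": weakest["word"] if weakest else None,
        "timestamps_ordered": ordered,
        "has_audio": "audio_output" in response,
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

    Checks like timestamps_ordered are cheap to run and catch malformed responses before the timestamps are used for subtitles or alignment.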