Omnilingual ASR
High-quality automatic speech recognition for 1600+ languages using Meta's multilingual ASR system
Input
Enter a URL to an audio file
Drag and drop an audio file here, or click to browse
Model architecture to use for transcription. Default: omniASR_LLM_7B
ISO 639-3 language code (auto-detect if not specified, applies to LLM models). Default: auto
Number of audio samples to process in parallel. Default: 1
Compute device to use for inference. Default: cuda
Precision for inference. Default: bfloat16
Whether to normalize audio levels before processing. Default: true
Target sample rate for audio processing. Default: 16000
Whether to return word-level timestamps. Default: false
Whether to return confidence scores for transcription. Default: false
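The input options above can be collected into a single request object. The following is a minimal sketch, not the service's actual API: the page lists only descriptions and defaults, so every field name here (audio_url, model, language, batch_size, device, dtype, normalize_audio, sample_rate, return_timestamps, return_confidence) is an assumption made for illustration. Only the default values are taken from the list above.

```python
from dataclasses import dataclass, asdict


@dataclass
class TranscriptionRequest:
    # NOTE: all field names are hypothetical; only the defaults come from the page.
    audio_url: str
    model: str = "omniASR_LLM_7B"
    language: str = "auto"          # ISO 639-3 code, or "auto" to auto-detect
    batch_size: int = 1             # audio samples processed in parallel
    device: str = "cuda"
    dtype: str = "bfloat16"
    normalize_audio: bool = True
    sample_rate: int = 16000        # target sample rate in Hz
    return_timestamps: bool = False
    return_confidence: bool = False

    def validate(self) -> None:
        """Reject values that cannot be valid for the documented options."""
        lang = self.language
        is_iso_639_3 = (
            len(lang) == 3 and lang.isascii() and lang.isalpha() and lang.islower()
        )
        if lang != "auto" and not is_iso_639_3:
            raise ValueError(f"expected a 3-letter ISO 639-3 code or 'auto', got {lang!r}")
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be positive")


def build_payload(request: TranscriptionRequest) -> dict:
    """Validate the request and return it as a plain dict, ready to serialize as JSON."""
    request.validate()
    return asdict(request)


payload = build_payload(
    TranscriptionRequest(audio_url="https://example.com/sample.wav", language="eng")
)
```

Checking the language code before sending avoids a round trip for an obviously malformed value such as "english"; whether the real service validates the same way is not stated on this page.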
Output
{
  "transcription": "This is the transcribed text from the audio file.",
  "language": "eng",
  "language_confidence": 0.98,
  "model_used": "omniASR_LLM_7B",
  "audio_metadata": {
    "duration": 30.5,
    "sample_rate": 16000,
    "channels": 1,
    "format": "wav"
  },
  "timestamps": [
    {
      "word": "This",
      "start": 0,
      "end": 0.24,
      "confidence": 0.99
    },
    {
      "word": "is",
      "start": 0.24,
      "end": 0.36,
      "confidence": 0.98
    },
    {
      "word": "the",
      "start": 0.36,
      "end": 0.52,
      "confidence": 0.97
    }
  ],
  "inference_metrics": {
    "processing_time_ms": 1250,
    "real_time_factor": 0.041,
    "model_size": "7.8B",
    "memory_used_mb": 17500
  },
  "segments": [
    {
      "text": "This is the transcribed text from the audio file.",
      "start": 0,
      "end": 30.5,
      "confidence": 0.96
    }
  ]
}
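A response shaped like the sample output can be post-processed with a few lines of Python. The sketch below is illustrative only: the summarize helper and the 0.98 threshold are assumptions, not part of the service. It recomputes the real-time factor as processing time divided by audio duration, which is the usual definition and matches the sample values (1.25 s / 30.5 s ≈ 0.041), and lists the words whose confidence falls below the threshold.

```python
def summarize(result: dict, min_confidence: float = 0.98) -> dict:
    """Derive a few checks from a response shaped like the sample output."""
    duration_s = result["audio_metadata"]["duration"]
    processing_s = result["inference_metrics"]["processing_time_ms"] / 1000.0
    # "timestamps" may be absent when word-level timestamps are not requested.
    words = result.get("timestamps", [])
    return {
        # Real-time factor: values below 1 mean faster than real time.
        "real_time_factor": round(processing_s / duration_s, 3),
        "low_confidence_words": [
            w["word"] for w in words if w.get("confidence", 1.0) < min_confidence
        ],
    }


# Trimmed copy of the sample output, keeping only the fields used here.
sample = {
    "audio_metadata": {"duration": 30.5},
    "inference_metrics": {"processing_time_ms": 1250},
    "timestamps": [
        {"word": "This", "start": 0, "end": 0.24, "confidence": 0.99},
        {"word": "is", "start": 0.24, "end": 0.36, "confidence": 0.98},
        {"word": "the", "start": 0.36, "end": 0.52, "confidence": 0.97},
    ],
}

summary = summarize(sample)
print(summary)
```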