Video
Audio Transcription
Transcribe audio content to text
Note: This playground provides simulated output to showcase functionality. No input data is processed or stored on our servers. Use this demo to explore the feature extractor's capabilities before integrating it into your application.
Input
Enter a URL to a video file, or upload a video file by dragging and dropping it or browsing for it.
The transcription model to use. Default: whisper-large
Language of the audio content. Default: auto
Whether to include word timestamps. Default: true
Minimum confidence threshold for transcription. Default: 0.6
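As a rough sketch, the four options above could be collected into one settings object before a request is sent. The page lists only descriptions and defaults, so the field names below (`model`, `language`, `word_timestamps`, `min_confidence`) are hypothetical, and the 0 to 1 range check on the threshold is an assumption based on the confidence values shown in the sample output.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TranscriptionParams:
    # Field names are hypothetical; only the default values come from the page.
    model: str = "whisper-large"    # transcription model to use
    language: str = "auto"          # language of the audio content
    word_timestamps: bool = True    # whether to include word timestamps
    min_confidence: float = 0.6     # minimum confidence threshold

    def __post_init__(self):
        # Assumption: confidence is a score between 0 and 1.
        if not 0.0 <= self.min_confidence <= 1.0:
            raise ValueError("min_confidence must be between 0 and 1")


# Defaults only; override individual fields as needed.
params = TranscriptionParams()
payload = asdict(params)
```

Overriding a single option, for example `TranscriptionParams(language="en")`, leaves the other defaults in place.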
Output
{
  "transcript": "Welcome to the conference call. Today we'll discuss the quarterly results and future projections.",
  "confidence": 0.95,
  "word_timestamps": [
    {"word": "Welcome", "start": 0.5, "end": 0.9, "confidence": 0.98},
    {"word": "to", "start": 0.9, "end": 1, "confidence": 0.99},
    {"word": "the", "start": 1, "end": 1.1, "confidence": 0.99}
  ],
  "language": "en-US"
}
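A minimal sketch of how a client might consume a response shaped like the simulated output above, using only Python's standard `json` module. The helper `words_above` is a hypothetical name, and filtering words on the client side is an illustration only; the page does not say whether the service applies its confidence threshold per word or to the whole transcript.

```python
import json

# Sample copied from the simulated output shown on this page.
SAMPLE = '''
{"transcript": "Welcome to the conference call. Today we'll discuss the quarterly results and future projections.",
 "confidence": 0.95,
 "word_timestamps": [
   {"word": "Welcome", "start": 0.5, "end": 0.9, "confidence": 0.98},
   {"word": "to", "start": 0.9, "end": 1, "confidence": 0.99},
   {"word": "the", "start": 1, "end": 1.1, "confidence": 0.99}],
 "language": "en-US"}
'''


def words_above(result, threshold=0.6):
    """Return the words whose per-word confidence meets the threshold.

    Hypothetical helper: works on any dict with the shape shown above and
    tolerates a missing "word_timestamps" key.
    """
    return [
        w["word"]
        for w in result.get("word_timestamps", [])
        if w["confidence"] >= threshold
    ]


result = json.loads(SAMPLE)

# A deliberately strict threshold so that one word is filtered out.
kept = words_above(result, threshold=0.985)

# Duration of each word in seconds, rounded to avoid float noise.
durations = {
    w["word"]: round(w["end"] - w["start"], 3)
    for w in result["word_timestamps"]
}
```

With the sample data, `kept` contains "to" and "the" but not "Welcome", whose confidence of 0.98 falls below the 0.985 threshold.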