
    Automatic Speech Recognition Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    337 models available

    Showing 124 of 337 models

    Automatic Speech Recognition

    argmaxinc/whisperkit-coreml

    11.1M
    177
    whisperkit
    Automatic Speech Recognition

    pyannote/speaker-diarization-3.1

    10.5M
    1,882
    pyannote-audio
    Automatic Speech Recognition

    openai/whisper-large-v3-turbo

    7.3M
    3,013
    transformers
    Automatic Speech Recognition

    openai/whisper-large-v3

    4.9M
    5,698
    transformers
    Automatic Speech Recognition

    jonatasgrosman/wav2vec2-large-xlsr-53-russian

    3.8M
    75
    transformers
    Automatic Speech Recognition

    jonatasgrosman/wav2vec2-large-xlsr-53-portuguese

    3.5M
    54
    transformers
    Automatic Speech Recognition

    MahmoudAshraf/mms-300m-1130-forced-aligner

    3.2M
    88
    transformers
    Automatic Speech Recognition

    pyannote/speaker-diarization-community-1

    2.9M
    375
    pyannote-audio
    Automatic Speech Recognition

    pyannote/voice-activity-detection

    2.8M
    233
    pyannote-audio
    Automatic Speech Recognition

    openai/whisper-small

    2.3M
    560
    transformers
    Automatic Speech Recognition

    Qwen/Qwen3-ASR-1.7B

    2.0M
    809
    Automatic Speech Recognition

    jonatasgrosman/wav2vec2-large-xlsr-53-polish

    1.6M
    12
    transformers
    Automatic Speech Recognition

    openai/whisper-base

    1.6M
    270
    transformers
    Automatic Speech Recognition

    mistralai/Voxtral-Mini-4B-Realtime-2602

    1.4M
    848
    vllm
    Automatic Speech Recognition

    distil-whisper/distil-large-v3

    1.4M
    376
    transformers
    Automatic Speech Recognition

    jonatasgrosman/wav2vec2-large-xlsr-53-japanese

    1.2M
    57
    transformers
    Automatic Speech Recognition

    facebook/wav2vec2-base-960h

    1.2M
    397
    transformers
    Automatic Speech Recognition

    Systran/faster-whisper-tiny.en

    1.2M
    9
    ctranslate2
    Automatic Speech Recognition

    jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn

    1.1M
    133
    transformers
    Automatic Speech Recognition

    Systran/faster-whisper-base

    988K
    27
    ctranslate2
    Automatic Speech Recognition

    mlx-community/parakeet-tdt-0.6b-v3

    936K
    41
    mlx
    Automatic Speech Recognition

    Systran/faster-whisper-large-v3

    871K
    574
    ctranslate2
    Automatic Speech Recognition

    pyannote/speaker-diarization

    858K
    1,270
    pyannote-audio
    Automatic Speech Recognition

    openai/whisper-medium

    758K
    284
    transformers