
    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,002 models available

    Showing 4,393–4,416 of 9,002 models

    Automatic Speech Recognition

    onnx-community/whisper-large-v3-turbo_timestamped

    3K
    11
    transformers.js

    Text To Audio

    Marvis-AI/marvis-tts-250m-v0.1

    3K
    73
    transformers

    Feature Extraction

    Xenova/msmarco-distilbert-base-v4

    3K
    transformers.js

    Feature Extraction

    tanganke/clip-vit-base-patch32_cifar100

    3K
    transformers

    Audio Classification

    HowMannyMore/wav2vec2-lg-xlsr-ur-speech-emotion-recognition

    3K
    transformers

    Image Classification

    timm/mobilenetv2_140.ra_in1k

    3K
    timm

    Translation

    google/madlad400-10b-mt

    3K
    131
    transformers

    Summarization

    deutsche-telekom/mt5-small-sum-de-mit-v1

    3K
    13
    transformers

    Audio Classification

    mo-thecreator/Deepfake-audio-detection

    3K
    16
    transformers

    Image Classification

    timm/tf_efficientnetv2_s.in1k

    3K
    timm

    Video Classification

    MCG-NJU/videomae-large-finetuned-kinetics

    3K
    14
    transformers

    Translation

    Helsinki-NLP/opus-mt-en-sw

    3K
    7
    transformers

    Image Classification

    timm/volo_d1_224.sail_in1k

    3K
    2
    timm

    Sentence Similarity

    Qdrant/clip-ViT-B-32-text

    3K
    2
    transformers

    Object Detection

    foduucom/table-detection-and-extraction

    3K
    106
    ultralytics

    Image To Image

    fal/flux-klein-9b-virtual-tryon-lora

    3K
    107
    diffusers

    Question Answering

    twmkn9/albert-base-v2-squad2

    3K
    4
    transformers

    Image Classification

    cafeai/cafe_aesthetic

    3K
    55
    transformers

    Image To Text

    naver-clova-ix/donut-base-finetuned-rvlcdip

    3K
    20
    transformers

    Sentence Similarity

    mlx-community/embeddinggemma-300m-4bit

    3K
    5
    sentence-transformers

    Image To Text

    xtuner/llava-llama-3-8b-v1_1-gguf

    3K
    226

    Text To Video

    BestWishYsh/Helios-Distilled

    3K
    43
    diffusers

    Translation

    Helsinki-NLP/opus-mt-af-en

    3K
    transformers

    Video Classification

    OpenGVLab/VideoMAEv2-Large

    3K
    1