    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,002 models available

    Showing 2737–2760 of 9,002 models

    Image Classification

    timm/convnextv2_atto.fcmae_ft_in1k

    23K
    timm
    Text To Image

    alimama-creative/FLUX.1-Turbo-Alpha

    23K
    641
    diffusers
    Image Segmentation

    facebook/detr-resnet-50-panoptic

    23K
    138
    transformers
    Image Classification

    facebook/dinov2-small-imagenet1k-1-layer

    23K
    3
    transformers
    Image Classification

    timm/vit_large_patch16_384.augreg_in21k_ft_in1k

    23K
    timm
    Image Classification

    timm/inception_v3.tv_in1k

    23K
    1
    timm
    Image Classification

    timm/visformer_small.in1k

    23K
    1
    timm
    Translation

    utrobinmv/t5_translate_en_ru_zh_small_1024

    23K
    40
    transformers
    Image Classification

    timm/efficientnet_b5.sw_in12k_ft_in1k

    23K
    2
    timm
    Feature Extraction

    RishuD7/finetune_base_bge_pretrained_v4

    23K
    transformers
    Automatic Speech Recognition

    guillaumekln/faster-whisper-tiny

    22K
    9
    ctranslate2
    Image Classification

    timm/resnext50_32x4d.a1h_in1k

    22K
    timm
    Zero Shot Image Classification

    timm/ViT-SO400M-16-SigLIP2-384

    22K
    5
    open_clip
    Tabular Regression

    autogluon/mitra-regressor

    22K
    29
    Feature Extraction

    laion/larger_clap_music

    22K
    45
    transformers
    Image Classification

    timm/repvit_m1.dist_in1k

    22K
    1
    timm
    Text To Video

    bullerwins/Wan2.2-T2V-A14B-GGUF

    22K
    69
    Zero Shot Image Classification

    OysterQAQ/DanbooruCLIP

    22K
    15
    transformers
    Text To Speech

    OpenMOSS-Team/MOSS-TTS-Realtime

    22K
    80
    transformers
    Automatic Speech Recognition

    xkeyC/whisper-large-v3-turbo-gguf

    22K
    33
    transformers
    Image Classification

    dima806/deepfake_vs_real_image_detection

    22K
    46
    transformers
    Fill Mask

    jhu-clsp/ettin-encoder-150m

    22K
    10
    transformers
    Text Classification

    GenTelLab/gentelshield-v1

    22K
    7
    transformers
    Audio To Audio

    speechbrain/metricgan-plus-voicebank

    22K
    71
    speechbrain
    Page 115 of 376