    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 6601-6624 of 9,588 models

    Audio Classification

    anton-l/xtreme_s_xlsr_300m_minds14

    333
    3
    transformers
    Image To Text

    tifa-benchmark/promptcap-coco-vqa

    333
    13
    transformers
    Text To Video

    SulphurAI/Sulphur-2-base

    332
    127
    diffusers
    Text To Audio

    facebook/musicgen-stereo-melody-large

    331
    70
    transformers
    Audio Classification

    jihedjabnoun/wavlm-base-emotion

    331
    6
    transformers
    Audio Classification

    aufklarer/Qwen3-ForcedAligner-0.6B-8bit

    331
    mlx
    Question Answering

    jiapingW/Qwen3.5-35B-A3B-Eagle3-Specforge

    331
    5
    Robotics

    allenai/MolmoAct-7B-D-Pretrain-RT-1-0812

    331
    6
    transformers
    Reinforcement Learning

    mradermacher/Agent-STAR-RL-1.5B-GGUF

    330
    transformers
    Image Segmentation

    pirocheto/schp-atr-18

    329
    2
    Audio To Audio

    LiquidAI/LFM2-Audio-1.5B

    328
    346
    liquid-audio
    Summarization

    tsmatz/mt5_summarize_japanese

    327
    20
    transformers
    Image To Text

    mradermacher/Qwen2-VL-2B-Abliterated-Caption-it-i1-GGUF

    327
    transformers
    Image Feature Extraction

    LiheYoung/pixio-vith16

    327
    1
    transformers
    Text To Video

    vrgamedevgirl84/LTX_2.3_Fantasy_Realism_Style_LoRa

    327
    1
    diffusers
    Question Answering

    mishmashly/Neo-Dolphin-Mistral-7B-GGUF

    327
    4
    Summarization

    pszemraj/long-t5-tglobal-base-16384-book-summary

    326
    136
    transformers
    Image Segmentation

    facebook/maskformer-swin-large-coco

    326
    27
    transformers
    Question Answering

    FlagAlpha/Llama2-Chinese-7b-Chat

    325
    220
    transformers
    Depth Estimation

    jingheya/lotus-depth-g-v2-0-disparity

    325
    7
    diffusers
    Depth Estimation

    Intel/zoedepth-kitti

    325
    2
    transformers
    Text To Audio

    tencent/HunyuanVideo-Foley

    325
    163
    hunyuanvideo-foley
    Object Detection

    Elisaa44/yolo_finetuned_fruits

    325
    transformers
    Image Segmentation

    BritishWerewolf/U-2-Netp

    325
    4
    transformers