    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 4681-4704 of 9,588 models

    Image Segmentation

    camenduru/RMBG-2.0

    3K
    1
    transformers
    Image To Text

    cyberagent/llava-calm2-siglip

    3K
    26
    transformers
    Summarization

    google/bigbird-pegasus-large-arxiv

    3K
    64
    transformers
    Image Classification

    harrytechiz/vit-base-patch16-224-blur_vs_clean

    3K
    2
    transformers
    Feature Extraction

    marksverdhei/Qwen3-Voice-Embedding-12Hz-0.6B

    3K
    22
    transformers
    Any To Any

    stepfun-ai/Step-Audio-2-mini

    3K
    254
    transformers
    Image Classification

    microsoft/resnet-152

    3K
    18
    transformers
    Text To Speech

    Jahaz/Qwen3-tts-0.6b-gguf-for-koboldcpp

    3K
    2
    Feature Extraction

    principled-intelligence/gemma-4-E2B-it-text-only

    3K
    6
    transformers
    Image Classification

    pamixsun/swinv2_tiny_for_glaucoma_classification

    3K
    3
    transformers
    Image To Image

    egorchistov/optical-flow-MEMFOF-Tartan-T-TSKH

    3K
    4
    pytorch
    Any To Any

    OmniGen2/OmniGen2

    3K
    435
    diffusers
    Sentence Similarity

    uer/sbert-base-chinese-nli

    3K
    137
    sentence-transformers
    Feature Extraction

    tanganke/clip-vit-base-patch32_oxford-iiit-pet

    3K
    transformers
    Sentence Similarity

    intfloat/e5-large-unsupervised

    3K
    6
    sentence-transformers
    Feature Extraction

    tanganke/clip-vit-base-patch32_cifar100

    3K
    transformers
    Image Classification

    timm/seresnet50.a1_in1k

    3K
    timm
    Video Classification

    OpenGVLab/VideoMAEv2-Large

    3K
    1
    Feature Extraction

    tanganke/clip-vit-base-patch32_stl10

    3K
    transformers
    Feature Extraction

    mradermacher/Qwen3-Embedding-4B-i1-GGUF

    3K
    transformers
    Image Classification

    timm/resnetv2_50x1_bit.goog_in21k_ft_in1k

    3K
    timm
    Audio Classification

    Speech-Arena-2025/DF_Arena_1B_V_1

    3K
    7
    Image Classification

    timm/resnet50d.ra4_e3600_r224_in1k

    3K
    timm
    Text To Speech

    neuphonic/neutts-nano-q4-gguf

    3K
    11
    Page 196 of 400