    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 8449–8472 of 9,588 models

    Video Classification

    Naveengo/videomae-base-finetuned-kinetics-finetuned-ucf101-subset

    44
    transformers
    Video Classification

    DianaKylymnyk/videomae-base-finetuned-ucf101-subset

    44
    transformers
    Video Classification

    Shwifty/videomae-base-finetuned-ucf101-subset

    44
    transformers
    Video Classification

    LSDddd/my_awesome_video_cls_model

    44
    transformers
    Video Classification

    zuvi/videomae-base-finetuned-kinetics-finetuned-k400-subset

    44
    transformers
    Video Classification

    MANMEET75/videomae-base-finetuned-HumanActivityRecognition

    44
    transformers
    Text To Audio

    Chithekitale/chichewa_tts_v2

    44
    transformers
    Zero Shot Classification

    navteca/nli-deberta-v3-large

    44
    3
    transformers
    Zero Shot Classification

    NDugar/ZSD-microsoft-v2xxlmnli

    44
    3
    transformers
    Visual Question Answering

    Navyabhat/Llava-Phi2

    44
    1
    transformers
    Image Feature Extraction

    timm/vit_so400m_patch16_siglip_gap_384.v2_webli

    44
    timm
    Visual Question Answering

    Outlier-Ai/Outlier-Vision

    44
    mlx
    Image Feature Extraction

    timm/fastvit_mci0.apple_mclip2_dfndr2b

    43
    timm
    Image Feature Extraction

    timm/vit_huge_patch14_clip_224.dfn5b

    43
    timm
    Image Feature Extraction

    timm/vit_base_mci_224.apple_mclip2_dfndr2b

    43
    timm
    Image Feature Extraction

    refiners/dinov2.small.patch_14

    43
    refiners
    Depth Estimation

    MackinationsAi/depth-anything-v2-small-hf

    43
    transformers
    Video Classification

    Naveengo/nonviolence-subset

    43
    transformers
    Video Classification

    younggi/videomae-base-finetuned-ucf101-subset

    43
    transformers
    Text To Audio

    alakxender/csm-1b-dhivehi-5-spk-gd

    43
    transformers
    Zero Shot Classification

    schift-io/schift-nli

    43
    1
    transformers.js
    Visual Question Answering

    google/pix2struct-docvqa-large

    43
    32
    transformers
    Image Feature Extraction

    timm/vit_large_patch16_siglip_gap_256.v2_webli

    43
    timm
    Image Feature Extraction

    timm/vit_so400m_patch14_siglip_gap_224.v2_webli

    43
    timm
    Page 353 of 400