    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    10,221 models available

    Showing 9193–9216 of 10,221 models

    Video Classification

    anirudhmu/videomae-base-finetuned-soccer-action-recognition3

    32
    transformers
    Video Classification

    keras-io/video-classification-cnn-rnn

    32
    15
    tf-keras
    Text To Audio

    Somali-tts/somali_tts_model

    32
    transformers
    Zero Shot Classification

    AntoineBlanot/roberta-nli

    32
    transformers
    Visual Question Answering

    google/pix2struct-infographics-vqa-base

    32
    9
    transformers
    Video Classification

    Parallax-labs-1/parallax_TEMPORAL-ValidPhone

    32
    pytorch
    Depth Estimation

    lc700x/dpt-dinov2-large-nyu-hf

    32
    Unconditional Image Generation

    ljc-1222/sd-class-butterflies-64

    32
    diffusers
    Voice Activity Detection

    collinbarnwell/pyannote-segmentation-30

    31
    pyannote-audio
    Document Question Answering

    Nefertury/test_test_42

    31
    transformers
    Unconditional Image Generation

    rxliang/sd-class-butterflies-32

    31
    diffusers
    Depth Estimation

    onnx-community/metric3d-vit-large

    31
    3
    transformers.js
    Depth Estimation

    Onegafer/glpn-nyu-finetuned-diode-230530-204740

    31
    transformers
    Video Classification

    dd00697/videomae-base-finetuned-ucf101-subset

    31
    transformers
    Video Classification

    irenetrecu/videomae-base-finetuned-conflab

    31
    transformers
    Video Classification

    adenhaus/videomae-small-finetuned-ssv2-finetuned-judo

    31
    transformers
    Video Classification

    sirishgam001/videomae-finetuned-engagenet-full

    31
    transformers
    Video Classification

    codircodir/videomae-base-finetuned-kinetics-finetuned-ucf101-subset-finetuned-N

    31
    transformers
    Video Classification

    facebook/vjepa2-vitg-fpc32-384-diving48

    31
    7
    transformers
    Visual Question Answering

    DAMO-NLP-SG/VideoRefer-7B

    31
    5
    transformers
    Visual Question Answering

    Coobiw/InternLM-XComposer2_Enhanced

    31
    Visual Question Answering

    BranZhu/Qwen3-VL-2B-HotpotQA-SFT

    31
    Unconditional Image Generation

    ym999ai/ddpm-celebahq-finetuned-butterflies-2epochs

    31
    diffusers
    Video Classification

    Temo27Anas/vvt-gs-notf-f195-4.4-h768-t32.16.16

    31
    transformers
    Page 384 of 426