    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 8641–8664 of 9,588 models

    Visual Question Answering

    google/pix2struct-screen2words-large

    35
    22
    transformers
    Visual Question Answering

    edgeun/blip-medical-vqa-rad

    35
    transformers
    Unconditional Image Generation

    cgnannan19821128/sd-class-butterflies-64

    35
    diffusers
    Video Classification

    archit11/videomae-base-finetuned-ucfcrime-full

    34
    transformers
    Tabular Regression

    MasumBhuiyan/linear-regression

    34
    keras
    Video Classification

    adenhaus/videomae-base-finetuned-judo

    34
    transformers
    Video Classification

    qubvel-hf/vjepa2-vitg-fpc64-384-ssv2

    34
    transformers
    Text To Audio

    alakxender/mms-tts-div-finetuned-md-f02

    34
    transformers
    Zero Shot Classification

    Xenova/DeBERTa-v3-xsmall-mnli-fever-anli-ling-binary

    34
    transformers.js
    Visual Question Answering

    SimulaMet/MedGemma-KvasirVQA-x1-ft

    34
    peft
    Video Classification

    muneeb1812/videomae-base-fake-video-classification

    34
    transformers
    Document Question Answering

    QmQasim/layoutlmv2-base-uncased_finetuned_docvqa

    33
    transformers
    Unconditional Image Generation

    blurbnation/sd-class-pureland-v1

    33
    diffusers
    Unconditional Image Generation

    Victordmss/sd-class-butterflies-32

    33
    diffusers
    Unconditional Image Generation

    Nachammai41/sd-class-butterflies-32-copy-5

    33
    diffusers
    Table Question Answering

    Mazenz/DeepAnalyze-8B-Q8_0-GGUF

    33
    2
    Video Classification

    d2o2ji/videomae-base-finetuned-kinetics-0407_final_5sec_org

    33
    transformers
    Text To Audio

    Ademola265/HeartCodec-oss

    33
    2
    Text To Audio

    tharushaudana/mms-tts-sinhala-custom-vocab

    33
    transformers
    Zero Shot Classification

    Xenova/deberta-v3-base-tasksource-nli

    33
    1
    transformers.js
    Visual Question Answering

    google/matcha-plotqa-v1

    33
    3
    transformers
    Visual Question Answering

    google/pix2struct-infographics-vqa-large

    33
    12
    transformers
    Visual Question Answering

    gaoqie/Glm-Edge-V-5B-fire

    33
    1
    Unconditional Image Generation

    Skeen04/sd-class-butterflies-32

    33
    diffusers
    Page 361 of 400