
    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 8425-8448 of 9,588 models

    Unconditional Image Generation

    Tanayyk2811/Vizuara-ddpm-celebahq-finetuned-butterflies-2epochs

    46
    diffusers
    Image Feature Extraction

    czczup/textnet-tiny

    46
    3
    transformers
    Image Feature Extraction

    r3gm/controlnet-tile-sdxl-1.0-fp16

    46
    1
    diffusers
    Depth Estimation

    coreml-projects/DepthPro-coreml-pruned-10-quantized-linear

    46
    coreml
    Video Classification

    LSummer/my_awesome_video_cls_model

    46
    transformers
    Text To Audio

    LeeAeron/acestep-v15-xl-sft

    46
    transformers
    Zero Shot Classification

    harshag11/zero-shot-wikidata-classifier

    46
    1
    Zero Shot Classification

    NDugar/v3large-2epoch

    46
    transformers
    Text To Audio

    olawale-ahmed/pidgin_speecht5_tts_google_waxal

    46
    transformers
    Video Classification

    facebook/timesformer-hr-finetuned-k600

    45
    6
    transformers
    Image Feature Extraction

    fullstuck/transformers_resnet18_cifar100

    45
    transformers
    Video Classification

    SaurabhShashank/timesformer-base-k400-finetuned-cricketV5.0

    45
    transformers
    Video Classification

    WasuratS/vivit-b-16x2-finetuned-cctv-surveillance

    45
    1
    transformers
    Text To Audio

    thomassiew/frieren_english_tenganai2

    45
    transformers
    Zero Shot Classification

    NDugar/2epochv3mlni

    45
    transformers
    Zero Shot Classification

    NDugar/debertav3-mnli-snli-anli

    45
    3
    transformers
    Zero Shot Classification

    Recognai/zeroshot_selectra_small

    45
    6
    transformers
    Visual Question Answering

    INSAIT-Institute/spear1-franka

    45
    6
    transformers
    Video Classification

    qualcomm/Video-MAE

    44
    2
    pytorch
    Visual Question Answering

    TIGER-Lab/VL-Rethinker-72B

    44
    5
    transformers
    Unconditional Image Generation

    huggan/crypto-gan

    44
    17
    tf-keras
    Image Feature Extraction

    UCSC-VLAA/openvision-vit-so400m-patch14-384

    44
    1
    open_clip
    Image Feature Extraction

    waticlems/Prost40M

    44
    timm
    Depth Estimation

    jingheya/lotus-depth-d-v1-1

    44
    6
    diffusers