    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 8,473–8,496 of 9,588 models

    Image Feature Extraction

    timm/gemma4_vit_167m.gemma4_e4b_it

    43
    2
    timm
    Image Feature Extraction

    timm/gemma4_vit_167m_enc.gemma4_e4b_it

    43
    1
    timm
    Text To Audio

    LeeAeron/acestep-v15-xl-turbo

    43
    transformers
    Zero Shot Classification

    claritylab/zero-shot-implicit-binary-bert

    43
    4
    zeroshot_classifier
    Unconditional Image Generation

    KiranKSaravana/Vizuara-ddpm-celebahq-finetuned-butterflies-2epochs

    42
    diffusers
    Unconditional Image Generation

    d4darius/sd-class-butterflies-32

    42
    diffusers
    Unconditional Image Generation

    HanxiangLi666/sd-class-butterflies-32

    42
    diffusers
    Unconditional Image Generation

    sergio-sanz-rodriguez/ddpm-celebahq-finetuned-butterflies-2epochs

    42
    diffusers
    Tabular Classification

    AWeirdDev/human-disease-prediction

    42
    2
    sklearn
    Tabular Regression

    jc-builds/stockprediction-ai

    42
    1
    lightgbm
    Image Feature Extraction

    birder-project/rope_i_vit_reg1_b16_pn_npn_avg_c1_pe-spatial

    42
    birder
    Image Feature Extraction

    facebook/PE-Spatial-S16-512

    42
    perception-encoder
    Depth Estimation

    MackinationsAi/depth-anything-v2-large-hf

    42
    transformers
    Video Classification

    d2o2ji/videomae-base-finetuned-ucf101-subset

    42
    transformers
    Text To Audio

    DesertMindAI/ASR-hassaniya-whisper-medium-v1

    42
    2
    Text To Audio

    idajikuu/SpeechT5_TTS_Haitian

    42
    5
    transformers
    Text To Audio

    Marvis-AI/marvis-tts-250m-v0.2-MLX-4bit

    42
    2
    transformers
    Zero Shot Classification

    microsoft/LLM2CLIP-Openai-L-14-224

    42
    6
    Visual Question Answering

    LZXzju/Qwen2.5-VL-3B-UI-R1

    42
    8
    Image Feature Extraction

    onnx-community/dinov3-vitl16-pretrain-lvd1689m-ONNX

    42
    2
    transformers.js
    Image Feature Extraction

    timm/vit_base_patch32_siglip_gap_256.v2_webli

    42
    2
    timm
    Image Feature Extraction

    Unspoiled-Egg/ijepa-vit-huge-target-encoder

    42
    pytorch
    Depth Estimation

    phanerozoic/deep-plantain

    42
    diffusers
    Text To Audio

    naklitechie/indic-parler-tts-ONNX

    42
    transformers.js
    Page 354 of 400