    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 8497–8520 of 9,588 models

    Unconditional Image Generation

    DaveHugging1000/sd-class-butterflies-32

    41
    diffusers
    Unconditional Image Generation

    ibrahim-erekmen/ddpm-celebahq-finetuned-butterflies-2epochs

    41
    diffusers
    Unconditional Image Generation

    AnuragSingh24/sd-class-butterflies-32

    41
    diffusers
    Unconditional Image Generation

    xueord/sd-class-butterflies-32

    41
    diffusers
    Unconditional Image Generation

    LucasjsBatista/sd-class-butterflies-32

    41
    diffusers
    Table Question Answering

    navteca/tapas-large-finetuned-wtq

    41
    1
    transformers
    Image Feature Extraction

    OzzyGT/SDXL_Controlnet_Tile_Realistic

    41
    3
    diffusers
    Text To Audio

    SHENMU007/speecht5_tts_voxpopuli_nl

    41
    transformers
    Text To Audio

    Marvis-AI/marvis-tts-100m-v0.2-MLX-8bit

    41
    5
    transformers
    Image Feature Extraction

    facebook/webssl-mae300m-full2b-224

    41
    transformers
    Image Feature Extraction

    mlx-vision/vit_base_patch16_224.dinov3-mlxim

    41
    mlx-image
    Text To Audio

    Matthijs/mms-tts-deu

    41
    transformers
    Text To Audio

    MuzaffarSharofitdinov/mms-tts-uzbek

    41
    transformers
    Unconditional Image Generation

    adarksky/pokemon-DDPM

    41
    diffusers
    Voice Activity Detection

    aitytech/Pyannote-Segmentation-MLX

    40
    mlx
    Document Question Answering

    PrimWong/layoutlm_qa

    40
    transformers
    Image Feature Extraction

    FM4CS/THOR-1.0-small

    40
    terratorch
    Image Feature Extraction

    timm/vit_so400m_patch14_siglip_gap_378.v2_webli

    40
    timm
    Image Feature Extraction

    facebook/PE-Spatial-T16-512

    40
    1
    perception-encoder
    Image Feature Extraction

    tomg-group-umd/CSD-ViT-L

    40
    5
    transformers
    Video Classification

    d2o2ji/videomae-base-finetuned-kinetics-allkisa-crop-background-0307-clip_duration-abnormal38

    40
    transformers
    Video Classification

    d2o2ji/videomae-base-finetuned-kinetics-0325_final_lr

    40
    transformers
    Video Classification

    d2o2ji/videomae-base-finetuned-kinetics-0402_final_253259_roi

    40
    transformers
    Video Classification

    d2o2ji/videomae-base-finetuned-kinetics-0408_final_5sec_org_ab7_val_as123

    40
    transformers