    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 8,737–8,760 of 9,588 models

    Voice Activity Detection

    Cactus-Compute/silero-vad

    28
    Visual Question Answering

    nhattan9999t/blip-kvasir-vqa

    28
    1
    transformers
    Visual Question Answering

    RhapsodyAI/qwen_vl_guidance

    28
    4
    transformers
    Document Question Answering

    Nefertury/test_test_42

    28
    transformers
    Unconditional Image Generation

    Summer1111111/sd-class-butterflies-32

    28
    diffusers
    Unconditional Image Generation

    IKEJAY/sd-class-butterflies-32

    28
    diffusers
    Table Question Answering

    ethanbradley/fintabqa

    28
    transformers
    Table Question Answering

    MichiganNLP/TAMA-QWen3

    28
    transformers
    Table Question Answering

    AIAT/Kiddee-qatable1

    28
    1
    transformers
    Depth Estimation

    gfaccipo/Depth-Anything-V2-GGUF

    28
    Video Classification

    CAIR-HKISI/SurgMotion-vitg-xformer

    28
    3
    pytorch
    Video Classification

    ihsanahakiim/videomae-base-finetuned-ucf101-subset

    28
    transformers
    Video Classification

    Redgerd/XceptionNet-Keras

    28
    keras
    Video Classification

    TanAlexanderlz/RALL_RGBCROP_ori32F-8B32F

    28
    transformers
    Zero Shot Classification

    KheireddineDaouadi/ZeroAraElectra

    28
    transformers
    Zero Shot Classification

    deliciouscat/kf-deberta-base-cross-nli

    28
    2
    transformers
    Visual Question Answering

    meituan/MemOCR-7B

    28
    7
    Visual Question Answering

    SwordElucidator/MiniCPM-Llama3-V-2_5-int4

    28
    1
    transformers
    Visual Question Answering

    Datadog/Toto-1.0-QA-Experimental

    28
    1
    Video Classification

    Krithiik/videomae-base-videomae-asl

    28
    transformers
    Unconditional Image Generation

    shellypeng/atomixl_realistic

    28
    diffusers
    Unconditional Image Generation

    spdas/sd-class-butterflies-32

    28
    diffusers
    Voice Activity Detection

    aufklarer/Silero-VAD-v5-ONNX

    27
    Document Question Answering

    jinhybr/OCR-DocVQA-Donut

    27
    13
    transformers
    365 / 400