
    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 7273-7296 of 9,588 models

    Audio Classification

    padmalcom/wav2vec2-large-emotion-detection-german

    168
    7
    transformers
    Audio Classification

    ThreeBlessings/distilhubert-finetuned-gtzan-merged

    168
    transformers
    Audio Classification

    0bi0n3/distilhubert-finetuned-gtzan

    168
    transformers
    Audio To Audio

    chenmozhijin/BSRoformer-GGUF

    168
    Document Question Answering

    DmitrySpartak/layoutlm-invoices

    167
    Object Detection

    Antoine101/detr-resnet-50-dc5-fashionpedia-finetuned

    167
    transformers
    Object Detection

    0llheaven/CON-DETR-Dental-V1

    167
    transformers
    Image Segmentation

    Dnq2025/mask2former-finetuned-ER-Mito-LD7

    167
    transformers
    Image To Text

    Natthaphon/thaicapgen-clip-gpt2

    167
    Zero Shot Image Classification

    StanfordAIMI/XrayCLIP__vit-l-14__laion2b-s32b-b82k

    167
    2
    transformers
    Image To Text

    enalis/scold

    167
    8
    transformers
    Question Answering

    usami/electra-base-discriminator-finetuned-squad

    167
    transformers
    Question Answering

    jira877832/cuad-longformer-squadv2-finetuned

    167
    transformers
    Depth Estimation

    jingheya/lotus-depth-d-v2-0-disparity

    166
    7
    diffusers
    Visual Question Answering

    microsoft/git-large-vqav2

    166
    19
    transformers
    Object Detection

    0llheaven/detr-finetuned-V2

    166
    transformers
    Object Detection

    0llheaven/Conditional-detr-finetuned-V9

    166
    transformers
    Object Detection

    0llheaven/CXRMed-AI-V1

    166
    transformers
    Image Segmentation

    Domeandreimno/yolov8-segmentation

    166
    transformers
    Zero Shot Image Classification

    AoiNoGeso/japanese-clip-stair-v3

    166
    transformers
    Image To Text

    EasyDeL/Qwen3.5-27B

    166
    easydel
    Audio Classification

    greenarcade/wav2vec2-vd-bird-sound-classification

    166
    1
    transformers
    Audio Classification

    012shin/KAIROS-ast-fake-audio-detection_unsupervised

    166
    3
    transformers
    Audio To Audio

    AXERA-TECH/Speech-Translation.axera

    165
    1