
    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 7177-7200 of 9,588 models

    Zero Shot Classification

    Xenova/DeBERTa-v3-base-mnli-fever-anli

    179
    transformers.js
    Object Detection

    kittendev/YOLOv8m-smoke-detection

    179
    19
    ultralytics
    Image Segmentation

    onnx-community/ISNet-ONNX

    179
    2
    transformers.js
    Image To Text

    Mozilla/distilvit

    179
    28
    transformers.js
    Image Feature Extraction

    timm/sam2_hiera_base_plus.fb_r896

    179
    timm
    Question Answering

    Andron00e/YetAnother_Open-Llama-3B-LoRA-OpenOrca

    179
    1
    transformers
    Question Answering

    ModelTC/bart-base-squad

    179
    transformers
    Object Detection

    Alesteba/detr-resnet-50_finetuned_cppe5

    178
    transformers
    Image Feature Extraction

    timm/resnet50_clip_gap.cc12m

    178
    timm
    Text To Video

    squirrelae/Wan2.2-TI2V-5B-GGUF

    178
    1
    gguf
    Audio Classification

    ThomasR/facebook_wav2vec2-large_October_03_2023_05h34PM

    178
    transformers
    Image To Text

    NAMAA-Space/Qari-OCR-0.4.0-VL-4B-Instruct

    178
    3
    peft
    Document Question Answering

    xhyi/layoutlmv3_docvqa_t11c5000

    177
    5
    transformers
    Object Detection

    Davidsv/CourtSide-Computer-Vision-v1

    177
    6
    ultralytics
    Image Segmentation

    Thalirajesh/Aerial-Drone-Image-Segmentation

    177
    12
    transformers
    Image To Text

    Graf-J/captcha-crnn-base

    177
    transformers
    Image To Text

    TIGER-Lab/EditReward-MiMo-VL-7B-SFT-2508

    176
    1
    transformers
    Image Feature Extraction

    timm/resnet50x64_clip_gap.openai

    176
    timm
    Image Feature Extraction

    ibm-granite/granite-geospatial-land-surface-temperature

    176
    20
    terratorch
    Audio Classification

    DL-Project/hatespeech_ast

    176
    transformers
    Image To Text

    PaddlePaddle/PicoDet-L_layout_17cls

    176
    1
    PaddleOCR
    Object Detection

    Alesteba/deep_model_09_detr-resnet-50_finetuned_cppe5

    175
    transformers
    Image Segmentation

    chribark/segformer-b3-finetuned-UAVid

    175
    1
    transformers
    Image Segmentation

    Dnq2025/mask2former-finetuned-ER-Mito-LD4

    175
    transformers
    Page 300 of 400