
    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 7153-7176 of 9,588 models

    Image To Text

    mradermacher/Qari-OCR-0.3-SNAPSHOT-VL-2B-Instruct-merged-GGUF

    184
    1
    transformers
    Image Feature Extraction

    timm/vit_base_patch16_siglip_gap_512.webli

    184
    timm
    Audio Classification

    gaunernst/vit_base_patch16_1024_128.audiomae_as2m_ft_as20k

    184
    2
    timm
    Question Answering

    mewaeltsegay/desta_1b_QA_v4552_Rosa

    184
    transformers
    Depth Estimation

    coreml-projects/DepthPro-coreml

    183
    5
    coreml
    Audio To Audio

    lucadellalib/focalcodec_50hz_4k_causal

    183
    1
    torch
    Object Detection

    Xenova/yolov9-c_all

    183
    3
    transformers.js
    Zero Shot Image Classification

    mkaichristensen/echo-clip-r

    183
    4
    open_clip
    Zero Shot Image Classification

    patentclip/PatentCLIP_Vit_B

    183
    2
    open_clip
    Image To Text

    alecccdd/qwen2-VL-7B-Captioner-Relaxed-Q4_K_M-GGUF

    183
    2
    transformers
    Object Detection

    nsugianto/detr-resnet50_finetuned_mstabletrnsdet_lsdocelementdetv1type6_plusb5_4246s_adjparam6_lr1e-7

    182
    transformers
    Object Detection

    uisikdag/yolo-v8-football-players-detection

    182
    4
    ultralytics
    Zero Shot Image Classification

    mountainsma/ViT-B-16-SigLIP2

    182
    open_clip
    Image To Text

    mradermacher/Qwen2.5-VL-3B-Abliterated-Caption-it-i1-GGUF

    182
    3
    transformers
    Document Question Answering

    AntonioTH/Layout-finetuned-fr-model-50instances20-100epochs-5e-05lr

    181
    transformers
    Object Detection

    0llheaven/CON-DETR-V5

    181
    transformers
    Image To Text

    SamMikaelson/deepseek-ocr-qvlm-4bit

    181
    transformers
    Image To Text

    Xenova/donut-base-finetuned-cord-v2

    181
    1
    transformers.js
    Image Feature Extraction

    onnx-community/dinov3-vitb16-pretrain-lvd1689m-ONNX

    181
    3
    transformers.js
    Audio Classification

    HTill/flexEAT-base_epoch30_pretrain

    181
    1
    transformers
    Object Detection

    RyanJames/yolo12l-person-seg

    180
    13
    ultralytics
    Image To Text

    qantev/trocr-large-spanish

    180
    12
    transformers
    Image Feature Extraction

    j-morano/MIRAGE-Large

    180
    2
    pytorch
    Text To Video

    Rob1221rib/wan22-qx-encoders-gguf

    180
    2
    diffusers
    Page 299 of 400