    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 9073–9096 of 9,588 models

    Document Question Answering

    Sharka/CIVQA_DVQA_LayoutLMv3

    15
    1
    transformers
    Document Question Answering

    Sharka/CIVQA_LayoutXLM

    15
    2
    transformers
    Document Question Answering

    gozdenergiz/layoutlmv2-base-uncased_finetuned_docvqa

    15
    transformers
    Table Question Answering

    google/tapas-medium-finetuned-wtq

    15
    2
    transformers
    Tabular Regression

    ryukkt62/Suncast

    15
    17
    Tabular Regression

    arviszeile/autotrain-golf-winner-2-87274143425

    15
    transformers
    Tabular Regression

    pcoloc/autotrain-600-dragino-1839063122

    15
    transformers
    Tabular Regression

    al02783013/autotrain-faseiii_diciembre-2311773112

    15
    transformers
    Depth Estimation

    Onegafer/glpn-nyu-finetuned-diode-230530-193901

    15
    transformers
    Depth Estimation

    Onegafer/glpn-nyu-finetuned-diode-230530-195824

    15
    transformers
    Depth Estimation

    Onegafer/glpn-nyu-finetuned-diode-230603-102021

    15
    transformers
    Depth Estimation

    ZEDXULTRA/lotus-depth-d-v1-1

    15
    diffusers
    Visual Question Answering

    bgyoo/vilt_finetuned_200

    15
    transformers
    Visual Question Answering

    UBC-NLP/dallah

    15
    3
    Visual Question Answering

    google/pix2struct-ocrvqa-large

    15
    34
    transformers
    Visual Question Answering

    r-g2-2024/Llama-3.1-70B-Instruct-multimodal-JP-Graph-v0.1

    15
    19
    Visual Question Answering

    IDEA-CCNL/Ziya-BLIP2-14B-Visual-v1

    15
    58
    transformers
    Visual Question Answering

    OpenMed/Ministral-3B-MedVL

    15
    2
    Visual Question Answering

    DAMO-NLP-SG/VideoRefer-7B-stage2.5

    15
    2
    transformers
    Visual Question Answering

    Luxuriant16/Med-RwR

    15
    1
    Visual Question Answering

    OpenDataArena/MMFineReason-4B

    15
    14
    Visual Question Answering

    DAMO-NLP-SG/VideoLLaMA2-7B-Base

    15
    6
    transformers
    Visual Question Answering

    Keetawan/BLIP2SeaLLMs-1.5B

    15
    transformers
    Table Question Answering

    liuddf/tapex-base

    15
    Page 379 of 400