    Visual Question Answering Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    202 models available

    Showing 25–48 of 202 models

    Visual Question Answering

    google/matcha-chartqa

    369
    47
    transformers
    Visual Question Answering

    second-state/MiniCPM-Llama3-V-2_5-GGUF

    353
    1
    Visual Question Answering

    DAMO-NLP-SG/VideoLLaMA2-7B

    348
    42
    transformers
    Visual Question Answering

    second-state/MiniCPM-V-4-GGUF

    344
    1
    Visual Question Answering

    OpenMed/Qwen3.5-2B-MedVL

    321
    6
    Visual Question Answering

    prapaa/eastrus-vl-qwen3-8b-gguf

    306
    llama.cpp
    Visual Question Answering

    UII-AI/uAI-NEXUS-MedVLM-1.0a-7B-RL

    296
    7
    Visual Question Answering

    microsoft/git-base-vqav2

    290
    21
    transformers
    Visual Question Answering

    openbmb/MiniCPM-Llama3-V-2_5-int4

    273
    79
    transformers
    Visual Question Answering

    SimulaMet/Qwen2.5-VL-KvasirVQA-x1-ft

    272
    peft
    Visual Question Answering

    soorism/Qwen3-VL-2B-instruct-SFT-FakeClues

    254
    transformers
    Visual Question Answering

    mradermacher/TreeVGR-7B-CI-i1-GGUF

    239
    1
    transformers
    Visual Question Answering

    mradermacher/TreeVGR-7B-CI-GGUF

    233
    1
    transformers
    Visual Question Answering

    internlm/internlm-xcomposer2d5-7b-4bit

    207
    13
    transformers
    Visual Question Answering

    Jesteban247/brats_medgemma-GGUF

    205
    transformers
    Visual Question Answering

    erax-ai/EraX-VL-2B-V1.5

    198
    10
    transformers
    Visual Question Answering

    ybelkada/blip2-opt-2.7b-fp16-sharded

    197
    3
    transformers
    Visual Question Answering

    google/matcha-plotqa-v2

    181
    13
    transformers
    Visual Question Answering

    RussRobin/SpatialBot-3B

    174
    19
    transformers
    Visual Question Answering

    DAMO-NLP-SG/VideoLLaMA3-2B-Image

    168
    8
    transformers
    Visual Question Answering

    google/pix2struct-chartqa-base

    165
    10
    transformers
    Visual Question Answering

    DAMO-NLP-SG/VideoLLaMA2-7B-16F

    165
    14
    transformers
    Visual Question Answering

    gaianet/MiniCPM-Llama3-V-2_5-GGUF

    154
    3
    Visual Question Answering

    AI-Safeguard/Ivy-VL-llava

    153
    72
    transformers
    Page 2 of 9