
    Visual Question Answering Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    202 models available

    Showing 97–120 of 202 models

    Visual Question Answering

    flyingfishinwater/Qwen3.5-2B-MedVL-MLX-4bit

    56
    Visual Question Answering

    mPLUG/mPLUG-Owl3-7B-241101

    55
    10
    Visual Question Answering

    nectec/Pathumma-llm-vision-2.0.0-preview

    54
    Visual Question Answering

    Jaguar7788/vilt_finetuned_200

    53
    transformers
    Visual Question Answering

    INSAIT-Institute/spear1-franka

    53
    6
    transformers
    Visual Question Answering

    BUAADreamer/Yi-VL-6B-hf

    52
    2
    transformers
    Visual Question Answering

    Joe99/visionlanguageTransformer

    51
    transformers
    Visual Question Answering

    KFrimps/vilt_finetuned_200

    51
    transformers
    Visual Question Answering

    Swicked86/phi4-mm-gptq

    51
    transformers
    Visual Question Answering

    google/pix2struct-docvqa-large

    50
    32
    transformers
    Visual Question Answering

    Navyabhat/Llava-Phi2

    50
    1
    transformers
    Visual Question Answering

    Jayanth9533/YOUR-REPO

    50
    transformers
    Visual Question Answering

    Jeney/vilt-b32-finetuned-vqa

    49
    1
    transformers
    Visual Question Answering

    GeorgyGUF/INFRL-Qwen2.5-VL-72B-Preview-ggufs-fully-quantized

    49
    transformers
    Visual Question Answering

    JHhan/vilt_finetuned_200

    46
    transformers
    Visual Question Answering

    SimulaMet/MedGemma-KvasirVQA-x1-ft

    45
    peft
    Visual Question Answering

    MariaK/vilt_finetuned_200

    44
    transformers
    Visual Question Answering

    BAAI/Aquila-VL-2B-llava-qwen

    43
    62
    transformers
    Visual Question Answering

    mradermacher/MemOCR-7B-GGUF

    43
    1
    transformers
    Visual Question Answering

    DAMO-NLP-SG/VideoLLaMA3-7B-Image

    42
    10
    transformers
    Visual Question Answering

    Kevin0217/vilt_finetuned_200

    42
    transformers
    Visual Question Answering

    LZXzju/Qwen2.5-VL-3B-UI-R1

    42
    8
    Visual Question Answering

    BUAADreamer/Chinese-LLaVA-Med-7B

    40
    4
    transformers
    Visual Question Answering

    ayushk4/smol-gpt4

    40
    1
    transformers
    Page 5 of 9