    Image Text To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    400 models available

    Showing 124 of 400 models

    Image Text To Text

    Qwen/Qwen3-VL-2B-Instruct

    126.9M
    372
    transformers
    Image Text To Text

    Qwen/Qwen2.5-VL-7B-Instruct

    8.9M
    1,509
    transformers
    Image Text To Text

    Qwen/Qwen3.5-9B

    7.1M
    1,344
    transformers
    Image Text To Text

    google/gemma-4-31B-it

    5.8M
    2,351
    transformers
    Image Text To Text

    moonshotai/Kimi-K2.5

    4.7M
    2,765
    transformers
    Image Text To Text

    Qwen/Qwen2.5-VL-3B-Instruct

    4.3M
    640
    transformers
    Image Text To Text

    google/gemma-4-26B-A4B-it

    4.2M
    806
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-8B-Instruct

    4.0M
    884
    transformers
    Image Text To Text

    Qwen/Qwen3.5-35B-A3B

    3.9M
    1,405
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-2B-Instruct

    3.8M
    499
    transformers
    Image Text To Text

    Qwen/Qwen3.5-4B

    3.6M
    491
    transformers
    Image Text To Text

    Qwen/Qwen3.5-27B

    3.4M
    961
    transformers
    Image Text To Text

    unsloth/gemma-4-26B-A4B-it-GGUF

    2.9M
    599
    Image Text To Text

    Qwen/Qwen3.5-0.8B

    2.9M
    511
    transformers
    Image Text To Text

    llava-hf/llava-1.5-7b-hf

    2.6M
    356
    transformers
    Image Text To Text

    google/gemma-3-12b-it

    2.6M
    710
    transformers
    Image Text To Text

    vikhyatk/moondream2

    2.5M
    1,407
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-4B-Instruct

    2.2M
    377
    transformers
    Image Text To Text

    deepseek-ai/DeepSeek-OCR

    2.2M
    3,221
    transformers
    Image Text To Text

    google/gemma-3-4b-it

    2.0M
    1,312
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-7B-Instruct

    1.9M
    1,274
    transformers
    Image Text To Text

    unsloth/gemma-4-31B-it-GGUF

    1.9M
    365
    Image Text To Text

    Qwen/Qwen3.5-2B

    1.7M
    263
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-7B-Instruct-AWQ

    1.7M
    49
    transformers