
    Image Text To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    550 models available

    Showing 1–24 of 550 models

    Image Text To Text

    google/gemma-4-26B-A4B-it

    12.2M
    1,099
    transformers
    Image Text To Text

    google/gemma-4-31B-it

    11.2M
    2,926
    transformers
    Image Text To Text

    Qwen/Qwen3.5-4B

    9.9M
    614
    transformers
    Image Text To Text

    Qwen/Qwen3.5-9B

    9.3M
    1,536
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-8B-Instruct

    8.2M
    942
    transformers
    Image Text To Text

    Qwen/Qwen3.6-27B-FP8

    7.3M
    254
    transformers
    Image Text To Text

    Qwen/Qwen2.5-VL-7B-Instruct

    6.5M
    1,562
    transformers
    Image Text To Text

    Qwen/Qwen3.6-35B-A3B

    5.9M
    2,038
    transformers
    Image Text To Text

    Qwen/Qwen3.6-35B-A3B-FP8

    5.6M
    251
    transformers
    Image Text To Text

    Qwen/Qwen3.6-27B

    5.5M
    1,638
    transformers
    Image Text To Text

    Qwen/Qwen2.5-VL-3B-Instruct

    5.3M
    653
    transformers
    Image Text To Text

    cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

    5.1M
    78
    transformers
    Image Text To Text

    zai-org/GLM-OCR

    4.2M
    1,815
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-4B-Instruct

    3.9M
    394
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-2B-Instruct

    3.9M
    506
    transformers
    Image Text To Text

    llava-hf/llava-1.5-7b-hf

    3.5M
    366
    transformers
    Image Text To Text

    moonshotai/Kimi-K2.6

    3.1M
    1,416
    transformers
    Image Text To Text

    google/gemma-3-12b-it

    3.0M
    735
    transformers
    Image Text To Text

    HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive

    2.9M
    1,525
    Image Text To Text

    Qwen/Qwen3.5-27B

    2.9M
    981
    transformers
    Image Text To Text

    microsoft/Florence-2-base

    2.8M
    378
    transformers
    Image Text To Text

    Qwen/Qwen3.5-35B-A3B

    2.8M
    1,441
    transformers
    Image Text To Text

    deepseek-ai/DeepSeek-OCR

    2.7M
    3,274
    transformers
    Image Text To Text

    Qwen/Qwen3.5-0.8B

    2.7M
    563
    transformers