    Image Text To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    400 models available

    Showing 337-360 of 400 models

    Image Text To Text

    DavidAU/gemma-4-31B-it-Mystery-Fine-Tune-HERETIC-UNCENSORED-Thinking-Instruct-GGUF

    52K
    46
    Image Text To Text

    google/gemma-3-12b-it-qat-q4_0-unquantized

    51K
    91
    transformers
    Image Text To Text

    LGAI-EXAONE/EXAONE-4.5-33B

    51K
    147
    transformers
    Image Text To Text

    lmstudio-community/Qwen3-VL-30B-A3B-Instruct-MLX-8bit

    51K
    1
    mlx
    Image Text To Text

    google/t5gemma-2-1b-1b

    50K
    76
    transformers
    Image Text To Text

    lmstudio-community/Qwen3-VL-30B-A3B-Instruct-MLX-6bit

    50K
    mlx
    Image Text To Text

    lmstudio-community/Qwen3-VL-30B-A3B-Instruct-MLX-5bit

    50K
    mlx
    Image Text To Text

    BAAI/Emu3-Chat-hf

    50K
    1
    Image Text To Text

    microsoft/Florence-2-base-ft

    49K
    138
    transformers
    Image Text To Text

    LiquidAI/LFM2-VL-450M

    49K
    146
    transformers
    Image Text To Text

    OpenGVLab/InternVL2-40B-AWQ

    49K
    18
    transformers
    Image Text To Text

    OpenGVLab/InternVL3_5-8B-HF

    48K
    9
    transformers
    Image Text To Text

    zai-org/GLM-4.6V-Flash

    48K
    599
    transformers
    Image Text To Text

    winninghealth/olmOCR-2-7B-1025-INT4

    47K
    transformers
    Image Text To Text

    lmstudio-community/gemma-3-12b-it-GGUF

    47K
    44
    Image Text To Text

    unsloth/DeepSeek-OCR-2

    46K
    36
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-7B

    46K
    65
    transformers
    Image Text To Text

    Qwen/Qwen3-VL-2B-Thinking-FP8

    46K
    27
    transformers
    Image Text To Text

    huihui-ai/Huihui-GLM-4.6V-Flash-abliterated-GGUF

    46K
    16
    transformers
    Image Text To Text

    huihui-ai/Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated

    45K
    49
    transformers
    Image Text To Text

    unsloth/Qwen3.5-27B

    45K
    14
    Image Text To Text

    mlx-community/Qwen3.6-35B-A3B-nvfp4

    45K
    13
    mlx
    Image Text To Text

    Salesforce/blip2-opt-6.7b

    44K
    80
    transformers
    Image Text To Text

    Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4

    44K
    38
    transformers
    Page 15 of 17