    Image Text To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    400 models available

    Showing 145-168 of 400 models

    Image Text To Text

    nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16

    198K
    82
    transformers
    Image Text To Text

    huihui-ai/Huihui-Qwen3.5-27B-abliterated

    198K
    120
    transformers
    Image Text To Text

    datalab-to/chandra-ocr-2

    192K
    282
    transformers
    Image Text To Text

    unsloth/gemma-4-E4B-it

    190K
    14
    Image Text To Text

    bartowski/Qwen_Qwen3.5-4B-GGUF

    186K
    27
    Image Text To Text

    HuggingFaceM4/Idefics3-8B-Llama3

    185K
    303
    transformers
    Image Text To Text

    trl-internal-testing/tiny-Gemma3ForConditionalGeneration

    185K
    transformers
    Image Text To Text

    tencent/HunyuanOCR

    183K
    746
    transformers
    Image Text To Text

    google/paligemma-3b-mix-224

    182K
    96
    transformers
    Image Text To Text

    shieldstar/Qwen3.5-122B-A10B-int4-AutoRound-EC

    182K
    2
    transformers
    Image Text To Text

    unsloth/Qwen3.5-4B

    182K
    17
    transformers
    Image Text To Text

    rednote-hilab/dots.ocr

    180K
    1,296
    dots_ocr
    Image Text To Text

    dealignai/Gemma-4-31B-JANG_4M-CRACK

    180K
    1,354
    mlx
    Image Text To Text

    lmstudio-community/Qwen3-VL-4B-Instruct-MLX-4bit

    179K
    7
    mlx
    Image Text To Text

    Jackrong/Qwopus3.5-27B-v3-GGUF

    177K
    349
    Image Text To Text

    cyankiwi/Qwen3-VL-4B-Instruct-AWQ-4bit

    177K
    7
    Image Text To Text

    google/gemma-4-26B-A4B

    176K
    239
    transformers
    Image Text To Text

    nvidia/Cosmos-Reason2-8B

    172K
    169
    cosmos
    Image Text To Text

    vikp/texify

    172K
    15
    transformers
    Image Text To Text

    LiquidAI/LFM2.5-VL-1.6B-GGUF

    171K
    78
    Image Text To Text

    lmstudio-community/Qwen3-VL-4B-Instruct-MLX-8bit

    170K
    1
    mlx
    Image Text To Text

    lmstudio-community/Qwen3-VL-4B-Instruct-MLX-6bit

    170K
    mlx
    Image Text To Text

    lmstudio-community/Qwen3-VL-4B-Instruct-MLX-5bit

    170K
    mlx
    Image Text To Text

    liuhaotian/llava-v1.5-7b

    169K
    550
    transformers