
    Image Text To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    400 models available

    Showing 193–216 of 400 models

    Image Text To Text

    OpenGVLab/InternVL3-1B

    132K
    83
    transformers
    Image Text To Text

    apolo13x/Qwen3.5-27B-NVFP4

    131K
    39
    transformers
    Image Text To Text

    LiquidAI/LFM2.5-VL-1.6B

    131K
    276
    transformers
    Image Text To Text

    HauhauCS/Qwen3.5-122B-A10B-Uncensored-HauhauCS-Aggressive

    131K
    111
    Image Text To Text

    openbmb/MiniCPM-V-4

    130K
    463
    transformers
    Image Text To Text

    openbmb/MiniCPM-V-4_5

    129K
    1,087
    transformers
    Image Text To Text

    lmstudio-community/gemma-4-26B-A4B-it-MLX-8bit

    128K
    1
    transformers
    Image Text To Text

    google/paligemma-3b-pt-224

    128K
    435
    transformers
    Image Text To Text

    abhishekchohan/gemma-3-12b-it-quantized-W4A16

    126K
    7
    transformers
    Image Text To Text

    unsloth/Qwen3.5-2B

    125K
    11
    transformers
    Image Text To Text

    Sehyo/Qwen3.5-122B-A10B-NVFP4

    124K
    62
    transformers
    Image Text To Text

    OpenGVLab/InternVL3-38B

    124K
    43
    transformers
    Image Text To Text

    lmstudio-community/gemma-4-26B-A4B-it-MLX-4bit

    124K
    1
    transformers
    Image Text To Text

    Qwen/Qwen3.5-4B-Base

    124K
    62
    transformers
    Image Text To Text

    mistral-experimental/pixtral-12b

    123K
    104
    transformers
    Image Text To Text

    HuggingFaceTB/SmolVLM2-256M-Video-Instruct

    122K
    102
    transformers
    Image Text To Text

    Qwen/Qwen3.5-9B-Base

    122K
    71
    transformers
    Image Text To Text

    trl-internal-testing/tiny-LlavaForConditionalGeneration

    121K
    transformers
    Image Text To Text

    OpenGVLab/InternVL3-8B

    120K
    104
    transformers
    Image Text To Text

    ibm-granite/granite-4.0-3b-vision

    118K
    104
    transformers
    Image Text To Text

    llmfan46/gemma-4-26B-A4B-it-uncensored-heretic-GGUF

    117K
    21
    transformers
    Image Text To Text

    google/translategemma-4b-it

    116K
    756
    transformers
    Image Text To Text

    unsloth/Qwen3-VL-4B-Instruct-GGUF

    116K
    46
    transformers
    Image Text To Text

    bartowski/Qwen_Qwen3.6-35B-A3B-GGUF

    116K
    50