
    Image To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    300 models available

    Showing 124 of 300 models

    Image To Text

    zai-org/GLM-OCR                          7.9M  1,656  transformers
    Salesforce/blip-image-captioning-base    2.1M  849    transformers
    Salesforce/blip-image-captioning-large   1.3M  1,474  transformers
    microsoft/trocr-base-printed             734K  206    transformers
    breezedeus/pix2text-mfr                  633K  54     transformers
    Salesforce/blip2-opt-2.7b-coco           605K  11     transformers
    PaddlePaddle/PP-OCRv5_server_det         583K  59     PaddleOCR
    PaddlePaddle/UVDoc                       404K  8      PaddleOCR
    PaddlePaddle/PP-LCNet_x1_0_doc_ori       366K  11     PaddleOCR
    PaddlePaddle/en_PP-OCRv5_mobile_rec      330K  1      PaddleOCR
    facebook/nougat-base                     326K  189    transformers
    kha-white/manga-ocr-base                 296K  170    transformers
    nlpconnect/vit-gpt2-image-captioning     253K  927    transformers
    microsoft/trocr-large-handwritten        208K  160    transformers
    PaddlePaddle/PP-LCNet_x1_0_textline_ori  194K  2      PaddleOCR
    microsoft/trocr-large-printed            190K  179    transformers
    microsoft/kosmos-2-patch14-224           162K  184    transformers
    naver-clova-ix/donut-base                158K  252    transformers
    microsoft/trocr-base-handwritten         157K  492    transformers
    lightonai/LightOnOCR-1B-1025             147K  248    transformers
    numind/NuMarkdown-8B-Thinking            133K  452    transformers
    ibm-granite/granite-vision-3.3-2b        124K  83
    alibaba-damo/mgp-str-base                94K   65     transformers
    PaddlePaddle/PP-OCRv5_server_rec         76K   25     PaddleOCR