    Image To Text Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    402 models available

    Showing 1-24 of 402 models

    Salesforce/blip-image-captioning-base

    2.4M
    860
    transformers
    Image To Text

    Salesforce/blip-image-captioning-large

    763K
    1,475
    transformers
    Image To Text

    PaddlePaddle/PP-OCRv5_server_det

    669K
    68
    PaddleOCR
    Image To Text

    PaddlePaddle/UVDoc

    509K
    10
    PaddleOCR
    Image To Text

    PaddlePaddle/PP-LCNet_x1_0_doc_ori

    444K
    15
    PaddleOCR
    Image To Text

    kha-white/manga-ocr-base

    383K
    173
    transformers
    Image To Text

    microsoft/trocr-base-printed

    373K
    208
    transformers
    Image To Text

    PaddlePaddle/en_PP-OCRv5_mobile_rec

    373K
    2
    PaddleOCR
    Image To Text

    facebook/nougat-base

    274K
    189
    transformers
    Image To Text

    microsoft/trocr-large-handwritten

    262K
    161
    transformers
    Image To Text

    PaddlePaddle/PP-LCNet_x1_0_textline_ori

    257K
    4
    PaddleOCR
    Image To Text

    PaddlePaddle/PP-OCRv5_server_rec

    213K
    27
    PaddleOCR
    Image To Text

    microsoft/trocr-small-handwritten

    213K
    63
    transformers
    Image To Text

    lightonai/LightOnOCR-1B-1025

    208K
    251
    transformers
    Image To Text

    microsoft/kosmos-2-patch14-224

    163K
    185
    transformers
    Image To Text

    naver-clova-ix/donut-base

    156K
    253
    transformers
    Image To Text

    nlpconnect/vit-gpt2-image-captioning

    146K
    931
    transformers
    Image To Text

    ibm-granite/granite-vision-3.3-2b

    143K
    84
    Image To Text

    microsoft/trocr-base-handwritten

    142K
    495
    transformers
    Image To Text

    microsoft/trocr-large-printed

    135K
    180
    transformers
    Image To Text

    ADSKAILab/Zero-To-CAD-Qwen3-VL-2B

    122K
    52
    transformers
    Image To Text

    alibaba-damo/mgp-str-base

    122K
    65
    transformers
    Image To Text

    PaddlePaddle/PP-OCRv5_mobile_det

    121K
    27
    PaddleOCR
    Image To Text

    optimum-intel-internal-testing/pix2struct-tiny-random

    95K