    HF · OCR · Apache 2.0

    dots.ocr-1.5

    by rednote-hilab

    Multilingual document parsing — 100+ languages, unified layout + recognition

    320K downloads/month
    1.7B parameters
    Identifiers
    Model ID
    rednote-hilab/dots.ocr-1.5
    Feature URI
    mixpeek://image_extractor@v1/rednote_dots_ocr_15_v1

    Overview

    dots.ocr-1.5 is a unified document parsing model from Xiaohongshu (RedNote) that combines layout detection and content recognition in a single model. It supports 100+ languages and handles academic papers, financial reports, tables, and multilingual content.

    Tasks are switched by prompt alone, so no pipeline reconfiguration is needed: the same model handles layout analysis, text extraction, or table parsing depending on the instruction.
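As a rough illustration of prompt-based task switching, the sketch below keeps the model identifier fixed and changes only the instruction text. The task names, the prompt strings, and the `build_request` helper are hypothetical placeholders, not the model's documented prompts or a Mixpeek API.

```python
MODEL_ID = "rednote-hilab/dots.ocr-1.5"  # Model ID from the Identifiers section

# Hypothetical instruction strings; the model's real prompts may differ.
TASK_PROMPTS = {
    "layout": "Detect the layout regions on this page.",
    "text": "Extract all text from this page.",
    "table": "Parse the tables on this page.",
}


def build_request(task: str, page_image: str) -> dict:
    """Build a request payload; only the prompt changes between tasks."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    return {
        "model": MODEL_ID,
        "image": page_image,
        "prompt": TASK_PROMPTS[task],
    }


if __name__ == "__main__":
    # Same model and same page for every task; only the prompt differs.
    for task in TASK_PROMPTS:
        print(build_request(task, "page-001.png"))
```

Keeping the prompts in one table means adding a new mode is a one-line change, with no separate pipeline stage per task.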

    Architecture

    A 1.7B-parameter model with a unified architecture that performs layout detection and OCR in a single forward pass. Different document understanding modes are selected through prompt-based task switching.

    Mixpeek SDK Integration

    mixpeek.ingest.from_url(
        url="s3://documents/report.pdf",  # source document to ingest
        collection="documents",           # destination collection
        feature_extractors=[{
            "type": "ocr",
            # Feature URI from the Identifiers section
            "model": "mixpeek://image_extractor@v1/rednote_dots_ocr_15_v1",
        }],
    )

    Capabilities

    • Multilingual OCR (100+ languages)
    • Layout detection
    • Table extraction
    • Academic paper parsing
    • Financial document processing

    Use Cases on Mixpeek

    • Multilingual document search
    • International content indexing
    • Financial document extraction
    • Academic paper processing

    Performance

    Input Size: Variable
    GPU Latency: ~80 ms per page (A100)
    GPU Throughput: ~12 pages/sec
    GPU Memory: Model dependent
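The latency and throughput figures are roughly consistent with each other (1 / 0.080 s = 12.5 pages/sec) and can be turned into a back-of-envelope batch estimate. This is a sketch that assumes the listed throughput is sustained with no I/O or queueing overhead and that it scales linearly across GPUs; the function name and the 10,000-page workload are illustrative only.

```python
LATENCY_S_PER_PAGE = 0.080     # ~80 ms per page (A100), as listed
THROUGHPUT_PAGES_PER_S = 12.0  # ~12 pages/sec, as listed


def estimate_batch_seconds(num_pages: int, gpus: int = 1) -> float:
    """Estimate wall-clock seconds to process num_pages.

    Assumes sustained throughput and linear scaling across GPUs.
    """
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return num_pages / (THROUGHPUT_PAGES_PER_S * gpus)


if __name__ == "__main__":
    implied = 1.0 / LATENCY_S_PER_PAGE
    print(f"throughput implied by latency: {implied:.1f} pages/sec")
    seconds = estimate_batch_seconds(10_000)
    print(f"10,000 pages on one GPU: ~{seconds / 60:.1f} min")
```

Under these assumptions a 10,000-page corpus takes about 14 minutes on one GPU; real runs will vary with page complexity and data loading.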

    Specification

    Framework: HF
    Organization: rednote-hilab
    Feature: OCR
    Output: text + bbox
    Modalities: video, image, document
    Retriever: Text-in-Image
    Parameters: 1.7B
    License: Apache 2.0
    Downloads/mo: 320K
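Because the output is listed as text + bbox, downstream code usually has to order the recognized regions before joining them into searchable text. The sketch below assumes a hypothetical record shape (`text` plus an `(x0, y0, x1, y1)` pixel box with the origin at the top-left); the actual output schema is not specified on this page and may differ. The row-bucketing heuristic is a simplification that only suits single-column pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRegion:
    """Hypothetical OCR result: recognized text plus its bounding box."""
    text: str
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), top-left origin


def reading_order(regions: list[TextRegion], line_tol: float = 10.0) -> list[TextRegion]:
    """Sort regions top-to-bottom, then left-to-right within a row.

    Regions whose top edges fall in the same line_tol-high band are
    treated as one row.
    """
    return sorted(
        regions,
        key=lambda r: (int(r.bbox[1] // line_tol), r.bbox[0]),
    )


def join_text(regions: list[TextRegion]) -> str:
    """Concatenate region text in reading order."""
    return " ".join(r.text for r in reading_order(regions))


if __name__ == "__main__":
    page = [
        TextRegion("world", (120, 102, 200, 120)),
        TextRegion("Next line", (10, 140, 150, 160)),
        TextRegion("Hello", (10, 100, 100, 120)),
    ]
    print(join_text(page))
```

Keeping the boxes alongside the text also lets a search result point back to the region of the page it came from.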

    Build a pipeline with dots.ocr-1.5

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
