    HF · Document Structure · MIT

    donut-base

    by naver-clova-ix

    Document understanding transformer — OCR-free document parsing

    468K downloads/month
    251 likes
    210M parameters

    Identifiers

    Model ID: naver-clova-ix/donut-base
    Feature URI: mixpeek://document_extractor@v1/naver_donut_base_v1
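
The Feature URI above appears to pack an extractor name, a version, and a feature name into one string. The sketch below splits it into those parts. The layout `mixpeek://<extractor>@<version>/<feature>` is an assumption inferred from this single example, and `parseFeatureUri` is a hypothetical helper, not part of the Mixpeek SDK.

```typescript
// Hypothetical helper: the URI layout is inferred from the one example on this page.
interface FeatureUri {
  extractor: string; // e.g. "document_extractor"
  version: string;   // e.g. "v1"
  feature: string;   // e.g. "naver_donut_base_v1"
}

function parseFeatureUri(uri: string): FeatureUri {
  // Assumed shape: mixpeek://<extractor>@<version>/<feature>
  const match = /^mixpeek:\/\/([^@\/]+)@([^\/]+)\/(.+)$/.exec(uri);
  if (match === null) {
    throw new Error(`Unrecognized feature URI: ${uri}`);
  }
  const [, extractor, version, feature] = match;
  return { extractor, version, feature };
}

const featureUri = parseFeatureUri(
  "mixpeek://document_extractor@v1/naver_donut_base_v1"
);
console.log(featureUri.extractor, featureUri.version, featureUri.feature);
```

Throwing on an unrecognized string keeps a malformed URI from silently flowing into a pipeline configuration.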

    Overview

    Donut (Document Understanding Transformer) is an end-to-end model for document understanding that directly maps document images to structured outputs without relying on a separate OCR engine. This simplifies the pipeline and avoids OCR error propagation.

    On Mixpeek, Donut offers an OCR-free alternative for document structure extraction, particularly useful for visually rich documents like receipts, forms, and infographics.

    Architecture

    Donut pairs a Swin Transformer encoder, which extracts image features, with a BART decoder, which generates text. The model is trained end-to-end on document images paired with their JSON annotations, so it has no OCR dependency.
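
To train a text decoder on JSON annotations, each annotation has to be flattened into a token sequence. The sketch below shows one way to do that, modeled on the `<s_key>…</s_key>` and `<sep/>` convention used by Donut's reference implementation. It is a simplified illustration: `jsonToTokens` is a hypothetical name, the receipt data is made up, and details such as key ordering and special-token registration are omitted.

```typescript
// A JSON annotation: leaf values, lists, or nested objects.
type Annotation = string | number | Annotation[] | { [key: string]: Annotation };

// Flatten an annotation into a decoder target sequence (simplified sketch).
function jsonToTokens(value: Annotation): string {
  if (Array.isArray(value)) {
    // List items are joined with a separator token.
    return value.map((item) => jsonToTokens(item)).join("<sep/>");
  }
  if (typeof value === "object") {
    // Each field is wrapped in an opening and closing token named after its key.
    return Object.entries(value)
      .map(([key, child]) => `<s_${key}>${jsonToTokens(child)}</s_${key}>`)
      .join("");
  }
  // Leaf values are emitted as plain text.
  return String(value);
}

// Made-up receipt annotation, for illustration only.
const receiptAnnotation: Annotation = {
  menu: [
    { nm: "Latte", price: "4.50" },
    { nm: "Bagel", price: "3.00" },
  ],
  total: "7.50",
};

console.log(jsonToTokens(receiptAnnotation));
```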

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";
    
    const mx = new Mixpeek({ apiKey: "API_KEY" });
    
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/receipt.jpg" },
      feature_extractors: [{
        name: "document_structure",
        version: "v1",
        params: {
          model_id: "naver-clova-ix/donut-base"
        }
      }]
    });

    Capabilities

    • OCR-free document understanding
    • Structured JSON output from document images
    • Document classification
    • Key-value extraction from forms
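
The structured JSON output listed above is decoded from a sequence of structure tokens. The sketch below shows how such a sequence could be turned back into JSON, assuming the `<s_key>…</s_key>` and `<sep/>` token convention from Donut's reference implementation. `tokensToJson` is a hypothetical helper and the sample sequence is made up; the exact output Mixpeek returns may differ.

```typescript
type Parsed = string | Parsed[] | { [key: string]: Parsed };

// Convert a Donut-style token sequence into nested JSON (simplified sketch).
function tokensToJson(sequence: string): Parsed {
  // Split into opening tags, closing tags, separators, and text runs.
  const tokens: string[] = sequence.match(/<\/?s_[^>]+>|<sep\/>|[^<]+/g) ?? [];
  let pos = 0;

  // Parse items until a closing tag or the end of input.
  function parseBody(): Parsed {
    const items: Parsed[] = [];
    let fields: { [key: string]: Parsed } = {};
    let text = "";
    const flush = (): void => {
      // An item is an object if it has fields, otherwise its trimmed text.
      items.push(Object.keys(fields).length > 0 ? fields : text.trim());
      fields = {};
      text = "";
    };
    while (pos < tokens.length && !tokens[pos].startsWith("</s_")) {
      const tok = tokens[pos++];
      if (tok === "<sep/>") {
        flush(); // separator ends the current list item
      } else if (tok.startsWith("<s_")) {
        const key = tok.slice(3, -1); // "<s_total>" -> "total"
        fields[key] = parseBody();
        pos++; // skip the closing tag (not checked against the key here)
      } else {
        text += tok;
      }
    }
    flush();
    // A single item is returned unwrapped, so one-element lists are not preserved.
    return items.length === 1 ? items[0] : items;
  }

  return parseBody();
}

// Made-up sequence, for illustration only.
const sampleSequence =
  "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price><sep/>" +
  "<s_nm>Bagel</s_nm><s_price>3.00</s_price></s_menu><s_total>7.50</s_total>";

console.log(JSON.stringify(tokensToJson(sampleSequence), null, 2));
```

This parser trusts the sequence to be well formed. A production decoder would also need to handle unclosed or mismatched tags, which generative models can emit.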

    Use Cases on Mixpeek

    • Receipt and invoice parsing without OCR
    • Form data extraction for automated workflows
    • Document classification and routing

    Specification

    Framework: HF
    Organization: naver-clova-ix
    Feature: Document Structure
    Output: structure tokens
    Modalities: document
    Retriever: Section Filter
    Parameters: 210M
    License: MIT
    Downloads/month: 468K
    Likes: 251

    Research Paper

    OCR-free Document Understanding Transformer (arxiv.org)

    Build a pipeline with donut-base

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
