LightOnOCR-2-1B
by lightonai
State-of-the-art 1B-parameter end-to-end multilingual OCR with bounding box localization
lightonai/LightOnOCR-2-1B
mixpeek://image_extractor@v1/lighton_ocr2_1b_v1
Overview
LightOnOCR-2-1B is a 1B-parameter vision-language model that achieves the top score on OlmOCR-Bench (83.2) while remaining compact enough for efficient deployment. It is built on a native-resolution ViT initialized from Mistral-Small-3.1 and handles page images up to 1540px on the longest edge. It performs particularly well on ArXiv papers, scanned documents with math, and complex tables.
On Mixpeek, LightOnOCR-2-1B extracts text from documents, scanned pages, and images with high accuracy, powering full-text search across document collections. An image-localization variant adds bounding box predictions without degrading OCR quality.
Architecture
Three-component VLM: a native-resolution Vision Transformer encoder (initialized from Mistral-Small-3.1), a multimodal projector, and a language model decoder. The model accepts page images up to 1540px on the longest edge. Optional bounding box localization is provided through coordinate tokens that are introduced during pretraining and refined with reinforcement learning with verifiable rewards (RLVR).
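To make the 1540px longest-edge limit concrete, here is a minimal sketch of how a client might compute a target size before sending a page image. The helper name `fit_longest_edge` is hypothetical, and this page does not say whether Mixpeek or the model resizes inputs automatically, so treat it as an illustration of the arithmetic only.

```python
MAX_EDGE = 1540  # longest-edge limit stated in the architecture notes


def fit_longest_edge(width: int, height: int, max_edge: int = MAX_EDGE) -> tuple[int, int]:
    """Return (width, height) scaled down so the longest edge is at most max_edge.

    Hypothetical helper: images already within the limit are returned unchanged
    (no upscaling), and the aspect ratio is preserved up to rounding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_edge:
        return width, height
    # Scale both sides by max_edge / longest, rounding to whole pixels.
    new_width = max(1, round(width * max_edge / longest))
    new_height = max(1, round(height * max_edge / longest))
    return new_width, new_height


# An A4 page scanned at 300 dpi is 2480 x 3508 pixels.
print(fit_longest_edge(2480, 3508))  # -> (1089, 1540)
```

The resulting size could then be passed to an image library's resize call before upload.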
Mixpeek SDK Integration
    from mixpeek import Mixpeek

    mx = Mixpeek(api_key="YOUR_KEY")

    mx.ingest(
        collection_id="scanned-documents",
        source="s3://scans/",
        extractors=[
            {
                "type": "ocr",
                "model": "lightonai/LightOnOCR-2-1B",
                "output_feature": "extracted_text",
            }
        ],
    )
Capabilities
- Top score on OlmOCR-Bench (83.2) among 1B-class models
- Native-resolution processing up to 1540px
- Multilingual OCR with strong French and scientific document support
- Optional bounding box localization variant
- Apache 2.0 open-source
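For the bounding box variant, downstream code usually needs boxes in pixel coordinates of the original page. This page does not describe the output format, so the sketch below assumes (hypothetically) that each box arrives as `(x1, y1, x2, y2)` normalized to a 0-1000 grid. Check the model card for the actual convention before relying on it. The function name `to_pixel_box` is illustrative.

```python
def to_pixel_box(box, page_width, page_height, grid=1000):
    """Map a normalized (x1, y1, x2, y2) box onto a page of the given pixel size.

    Assumption: coordinates are expressed on a [0, grid] scale along each axis.
    """
    x1, y1, x2, y2 = box
    return (
        round(x1 * page_width / grid),
        round(y1 * page_height / grid),
        round(x2 * page_width / grid),
        round(y2 * page_height / grid),
    )


# A box covering the middle band of a 1000 x 2000 pixel page.
print(to_pixel_box((100, 250, 900, 750), 1000, 2000))  # -> (100, 500, 900, 1500)
```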
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| OlmOCR-Bench | Score | 83.2 | LightOn AI, 2025 — LightOnOCR paper |
| ArXiv papers subset | Score | Best in class | LightOn AI, 2025 — LightOnOCR paper |
Research Paper
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR
arxiv.org
Build a pipeline with LightOnOCR-2-1B
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.