LightOnOCR-2-1B
by lightonai
State-of-the-art 1B-parameter end-to-end multilingual OCR with bounding box localization
lightonai/LightOnOCR-2-1B
mixpeek://image_extractor@v1/lighton_ocr2_1b_v1
Overview
LightOnOCR-2-1B is a 1B-parameter vision-language model that reports the top score on OlmOCR-Bench (83.2) while remaining compact enough for efficient deployment. Built on a native-resolution ViT initialized from Mistral-Small-3.1, it handles page images up to 1540px on the longest edge and performs particularly well on ArXiv papers, scanned documents with math, and complex tables.
On Mixpeek, LightOnOCR-2-1B extracts text from documents, scanned pages, and images with high accuracy, powering full-text search across document collections. An image-localization variant adds bounding box predictions without degrading OCR quality.
Architecture
Three-component VLM: native-resolution Vision Transformer (initialized from Mistral-Small-3.1) as encoder, multimodal projector, and language model decoder. Accepts page images up to 1540px longest edge. Optional bounding box localization via coordinate tokens introduced during pretraining and refined with RLVR.
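The 1540px longest-edge limit can be illustrated with a small client-side helper that computes target dimensions before a page image is uploaded. This is only a sketch: the helper name `fitLongestEdge` and the `Size` type are hypothetical, and whether the extractor already downsizes larger inputs on its own is not stated here.

```typescript
// Assumption: 1540 px is the longest-edge limit quoted in the Architecture text.
// fitLongestEdge and Size are hypothetical names, not part of any SDK.
const MAX_LONGEST_EDGE = 1540;

interface Size {
  width: number;
  height: number;
}

// Returns dimensions scaled down (never up) so the longest edge fits maxEdge,
// preserving the aspect ratio up to rounding.
function fitLongestEdge(size: Size, maxEdge: number = MAX_LONGEST_EDGE): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxEdge) {
    return { ...size }; // already within the limit; leave untouched
  }
  const scale = maxEdge / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

// Example: an A4 page scanned at 300 dpi (2480 x 3508 px).
const target = fitLongestEdge({ width: 2480, height: 3508 });
console.log(target); // { width: 1089, height: 1540 }
```

The helper only computes dimensions; the actual resampling would be done by whatever image library the pipeline already uses.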
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

// Managed: create a collection over a bucket; Mixpeek runs this model's extractor
const collection = await mx.collections.create({
  namespace_id: "my-namespace",
  collection_name: "my-collection",
  source: { type: "bucket", bucket_ids: ["bkt_your_bucket"] },
  feature_extractor: {
    feature_extractor_name: "ocr",
    version: "v1",
    parameters: { model_id: "lightonai/LightOnOCR-2-1B" },
  },
});
Capabilities
- Top score on OlmOCR-Bench (83.2) among 1B-class models
- Native-resolution processing up to 1540px
- Multilingual OCR with strong French and scientific document support
- Optional bounding box localization variant
- Apache 2.0 open-source
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| OlmOCR-Bench | Score | 83.2 | LightOn AI, 2025 — LightOnOCR paper |
| ArXiv papers subset | Score | Best in class | LightOn AI, 2025 — LightOnOCR paper |
Research Paper
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR
arxiv.org
Build a pipeline with LightOnOCR-2-1B
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.