Falcon-OCR
by tiiuae
300M early-fusion OCR model — plain text, LaTeX, and HTML table output from document images
tiiuae/Falcon-OCR
mixpeek://image_extractor@v1/tiiuae_falcon_ocr_v1
Overview
Falcon-OCR is an ultra-compact 300M-parameter early-fusion vision-language model for document OCR, developed by the Technology Innovation Institute (TII). Unlike traditional OCR pipelines that chain detection, recognition, and layout analysis, Falcon-OCR processes image patches and text tokens in a shared parameter space from the very first transformer layer, using a hybrid attention mask where image tokens attend bidirectionally while text tokens decode causally conditioned on the image.
At just 300M parameters, Falcon-OCR is roughly 3x smaller than competing VLM-based OCR models yet achieves 80.3% on the olmOCR benchmark and 88.64 overall on OmniDocBench. On Mixpeek, it provides fast, lightweight OCR extraction from scanned documents, receipts, and printed materials, producing plain text, LaTeX for formulas, or HTML for tables depending on the requested output format.
Architecture
Falcon-OCR is an early-fusion, dense, autoregressive Transformer. A single transformer processes image patches and text tokens in a shared parameter space from layer 1. Its hybrid attention mask lets image tokens attend to each other bidirectionally, while text tokens decode causally, conditioned on the image. The model relies on FlexAttention, so it requires PyTorch 2.5 or later.
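The hybrid mask described above can be illustrated with a small boolean matrix. This is a minimal sketch, not Falcon-OCR's actual implementation: it assumes the sequence is laid out as image tokens followed by text tokens and that image tokens never attend to text tokens. The function names `buildHybridMask` and `renderMask` are hypothetical.

```typescript
// Build a hybrid attention mask for a sequence laid out as
// [image tokens ..., text tokens ...]. mask[q][k] === true means
// query position q may attend to key position k.
// Assumption: image tokens come first and never attend to text tokens.
function buildHybridMask(numImage: number, numText: number): boolean[][] {
  const total = numImage + numText;
  const mask: boolean[][] = [];
  for (let q = 0; q < total; q++) {
    const row: boolean[] = [];
    for (let k = 0; k < total; k++) {
      const qIsImage = q < numImage;
      const kIsImage = k < numImage;
      if (qIsImage) {
        // Image queries: bidirectional attention over image keys only.
        row.push(kIsImage);
      } else {
        // Text queries: every image key, plus text keys at or before q.
        row.push(kIsImage || k <= q);
      }
    }
    mask.push(row);
  }
  return mask;
}

// Render the mask as rows of 1/0 for quick inspection.
function renderMask(mask: boolean[][]): string {
  return mask
    .map((row) => row.map((allowed) => (allowed ? "1" : "0")).join(" "))
    .join("\n");
}

console.log(renderMask(buildHybridMask(3, 2)));
```

With three image tokens and two text tokens, the first three rows print `1 1 1 0 0` (a full bidirectional block over the image), and the last two rows print `1 1 1 1 0` and `1 1 1 1 1` (causal decoding over text, with the whole image visible).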
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "documents",
  source: { url: "https://example.com/scanned-report.pdf" },
  feature_extractors: [
    {
      feature: "ocr",
      model: "tiiuae/Falcon-OCR"
    }
  ]
});
Capabilities
- Plain text, LaTeX formula, and HTML table output modes
- Early-fusion architecture — no separate vision encoder
- 88.64 overall on OmniDocBench at just 300M params
- ~2.9 images/sec on a single A100-80GB
- Roughly 3x smaller than competing VLM-based OCR models
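The throughput figure above can be turned into a rough capacity estimate. The sketch below assumes one page corresponds to one image and that throughput scales linearly with GPU count, which is an idealization; `estimateSeconds` is a hypothetical helper, not part of any SDK.

```typescript
// Reported figure from the capabilities list: ~2.9 images/sec on one A100-80GB.
const IMAGES_PER_SECOND = 2.9;

// Estimate wall-clock seconds to OCR `pageCount` pages on `gpuCount` GPUs.
// Assumption: one page is one image, and throughput scales linearly with GPUs.
function estimateSeconds(pageCount: number, gpuCount: number = 1): number {
  if (pageCount < 0 || gpuCount <= 0) {
    throw new RangeError("pageCount must be >= 0 and gpuCount must be > 0");
  }
  return pageCount / (IMAGES_PER_SECOND * gpuCount);
}

const seconds = estimateSeconds(10_000);
console.log(`${(seconds / 60).toFixed(1)} minutes`); // prints "57.5 minutes"
```

Under these assumptions, a 10,000-page batch takes a little under an hour on a single GPU. Real throughput will vary with image resolution and output length.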
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| olmOCR | Accuracy | 80.3% | TII, 2026 — Falcon Perception Paper |
| OmniDocBench | Overall | 88.64 | TII, 2026 — Falcon Perception Paper |
Performance
Specification
Research Paper
Falcon Perception
arxiv.org
Build a pipeline with Falcon-OCR
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.