PaddleOCR-VL-1.6
by PaddlePaddle
Compact document VLM for OCR, tables, formulas, charts, seals, and layout parsing
PaddlePaddle/PaddleOCR-VL-1.6
mixpeek://image_extractor@v1/paddle_ocr_vl_16_v1
Overview
PaddleOCR-VL 1.6 is the newest compact document parsing model from PaddlePaddle. It upgrades PaddleOCR-VL 1.5 with region-aware data optimization and progressive post-training, improving weak regions such as tables, rare characters, seals, text spotting, and charts.
On Mixpeek, PaddleOCR-VL 1.6 is a strong OCR and document decomposition candidate when agents need to search scans, forms, charts, invoices, and multilingual documents as structured evidence.
Architecture
0.9B to 1.0B parameter document vision-language model built on the PaddleOCR-VL architecture. Supports task prompts for OCR, table recognition, formula recognition, chart recognition, spotting, and seal recognition. Compatible with the PaddleOCR doc parser pipeline and Transformers custom code.
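The task prompts listed above can be pictured as a small lookup from task name to prompt string. The sketch below is illustrative only: the prompt strings, the `DocTask` type, and the `promptFor` and `taskForRegion` helpers are assumptions, not values confirmed by this page, so check the model card for the exact prompts before relying on them.

```typescript
// Task names mirror the list in the Architecture description.
type DocTask = "ocr" | "table" | "formula" | "chart" | "spotting" | "seal";

// ASSUMPTION: these prompt strings are placeholders for illustration;
// the exact strings expected by the model should be taken from the model card.
const TASK_PROMPTS: Record<DocTask, string> = {
  ocr: "OCR:",
  table: "Table Recognition:",
  formula: "Formula Recognition:",
  chart: "Chart Recognition:",
  spotting: "Spotting:",
  seal: "Seal Recognition:",
};

// Hypothetical helper: look up the prompt for a given task.
function promptFor(task: DocTask): string {
  return TASK_PROMPTS[task];
}

// Hypothetical helper: route a layout region label to a task,
// falling back to plain OCR for any label without a dedicated task.
function taskForRegion(label: string): DocTask {
  switch (label.toLowerCase()) {
    case "table":
      return "table";
    case "formula":
      return "formula";
    case "chart":
      return "chart";
    case "seal":
      return "seal";
    default:
      return "ocr";
  }
}
```

Keeping the mapping in one typed record means an unsupported task name fails at compile time rather than producing an unexpected prompt at run time.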
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "documents",
  source: { url: "https://example.com/invoice.pdf" },
  feature_extractors: [
    {
      feature: "ocr",
      model: "PaddlePaddle/PaddleOCR-VL-1.6"
    }
  ]
});
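When several documents need the same extractor, it can help to build the request payloads separately from the SDK call. The following is a minimal sketch that reuses only the field names shown in the snippet above; the `IngestRequest` and `FeatureExtractor` types and the `buildOcrIngestRequests` helper are hypothetical and not part of the Mixpeek SDK.

```typescript
// Hypothetical types that mirror the payload shape in the SDK snippet.
interface FeatureExtractor {
  feature: string;
  model: string;
}

interface IngestRequest {
  collection_id: string;
  source: { url: string };
  feature_extractors: FeatureExtractor[];
}

// Model identifier as given on this page.
const OCR_MODEL = "PaddlePaddle/PaddleOCR-VL-1.6";

// Hypothetical helper: build one ingest payload per document URL,
// each requesting the "ocr" feature with the same model.
function buildOcrIngestRequests(
  collectionId: string,
  urls: string[]
): IngestRequest[] {
  return urls.map((url) => ({
    collection_id: collectionId,
    source: { url },
    feature_extractors: [{ feature: "ocr", model: OCR_MODEL }],
  }));
}
```

Each payload could then be passed to `mx.collections.ingest(...)` in turn, assuming the call accepts the same shape as in the snippet above.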
Capabilities
- Document parsing across text, tables, formulas, charts, seals, and layout
- English, Chinese, and multilingual document support
- OmniDocBench v1.6 overall score of 96.33%, as reported on the model card
- Compatible migration path from PaddleOCR-VL 1.5
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| OmniDocBench v1.6 | Overall score | 96.33% | PaddleOCR-VL 1.6 model card |
Performance
Use the PaddleOCR doc parser path for page-level parsing
Specification
Research Paper
PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
arxiv.org
Build a pipeline with PaddleOCR-VL-1.6
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.