granite-vision-4.1-4b
by ibm-granite
Specialized VLM for extracting structured data from charts, tables, and forms
ibm-granite/granite-vision-4.1-4b
mixpeek://image_extractor@v1/ibm_granite_vision_41_4b_v1
Overview
Granite Vision 4.1 is IBM's purpose-built document extraction model that converts visual content — charts, tables, forms, key-value pairs — into structured machine-readable formats (CSV, JSON, HTML). Unlike general-purpose VLMs that describe what they see, Granite Vision extracts precise data values with high accuracy, making it suitable for automated document processing pipelines.
On Mixpeek, Granite Vision powers structured extraction from document pages: converting chart images to CSV data, table images to JSON records, and form images to key-value pairs. This structured output is directly indexable and filterable, unlike free-text captions.
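To illustrate why structured output is easier to filter than a caption, here is a minimal sketch that parses CSV text into records and filters on a numeric column. The CSV content, the column names (`quarter`, `revenue_musd`), and the helper `parseSimpleCsv` are all hypothetical. They stand in for whatever a chart-to-CSV extraction returns and are not part of the Mixpeek SDK or the model's documented output.

```typescript
// Hypothetical CSV text, standing in for what a chart-to-CSV extraction might return.
const extractedCsv = [
  "quarter,revenue_musd",
  "Q1,120.5",
  "Q2,134.0",
  "Q3,128.7",
  "Q4,151.2",
].join("\n");

type Row = Record<string, string>;

// Parse simple comma-separated text (no quoted fields) into one record per data row.
function parseSimpleCsv(text: string): Row[] {
  const [headerLine = "", ...body] = text.trim().split("\n");
  const header = headerLine.split(",").map((h) => h.trim());
  return body.map((line) => {
    const cells = line.split(",").map((c) => c.trim());
    const row: Row = {};
    header.forEach((name, i) => {
      row[name] = cells[i] ?? "";
    });
    return row;
  });
}

const rows = parseSimpleCsv(extractedCsv);

// Filter on a numeric field, something a free-text caption does not support directly.
const strongQuarters = rows
  .filter((r) => Number(r["revenue_musd"]) > 130)
  .map((r) => r["quarter"]);

console.log(strongQuarters); // [ 'Q2', 'Q4' ]
```

A real pipeline would likely use a full CSV parser that handles quoted fields. The naive split here is only meant to keep the sketch self-contained.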
Architecture
Granite Vision 4.1 is a LoRA adapter on the Granite-4.1-3B vision-language model. It has 4B total parameters (3.4B in the LLM plus 0.6B in the vision encoder and projectors) and is trained specifically on document extraction tasks: chart-to-CSV, table-to-JSON/HTML, and key-value pair extraction. It integrates with IBM Docling for production pipelines.
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "financial-docs",
  source: { url: "https://example.com/annual-report.pdf" },
  feature_extractors: [
    {
      feature: "document_extraction",
      model: "ibm-granite/granite-vision-4.1-4b"
    }
  ]
});
Capabilities
- Chart to CSV extraction with high precision
- Table to JSON/HTML structured output
- Key-value pair extraction (94.2% exact-match on VAREX)
- Apache 2.0 license for unrestricted commercial use
- LoRA adapter — lightweight deployment on top of Granite-4.1-3B
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| VAREX (key-value extraction) | Exact-match accuracy (zero-shot) | 94.2% | IBM Research, 2026 — Model Card |
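For readers unfamiliar with the metric, the following is a minimal sketch of one way to compute exact-match accuracy for key-value extraction. The source does not state how VAREX scores a match, so this assumes document-level scoring: a document counts as correct only if every predicted key and value equals the reference. The function `exactMatchAccuracy` and the sample records are hypothetical, and a per-field variant would give different numbers.

```typescript
type KeyValues = Record<string, string>;

// Fraction of documents whose predicted key-value pairs exactly equal the reference.
// Assumption: document-level scoring; the benchmark's own definition may differ.
function exactMatchAccuracy(predictions: KeyValues[], references: KeyValues[]): number {
  if (predictions.length !== references.length) {
    throw new Error("predictions and references must have the same length");
  }
  if (references.length === 0) return 0;

  let matches = 0;
  references.forEach((ref, i) => {
    const pred = predictions[i] ?? {};
    const keys = Object.keys(ref);
    const sameKeyCount = Object.keys(pred).length === keys.length;
    const sameValues = keys.every((k) => pred[k] === ref[k]);
    if (sameKeyCount && sameValues) matches += 1;
  });
  return matches / references.length;
}

// Hypothetical form extractions: the second prediction has one wrong value.
const references: KeyValues[] = [
  { invoice_no: "A-1001", total: "250.00" },
  { invoice_no: "A-1002", total: "99.90" },
  { invoice_no: "A-1003", total: "12.50" },
];
const predictions: KeyValues[] = [
  { invoice_no: "A-1001", total: "250.00" },
  { invoice_no: "A-1002", total: "99.00" },
  { invoice_no: "A-1003", total: "12.50" },
];

const accuracy = exactMatchAccuracy(predictions, references);
console.log(accuracy.toFixed(3)); // "0.667"
```

Exact match is strict: a value that differs by a single character counts as a miss, so any normalization (whitespace, number formatting) applied before comparison should be reported alongside the score.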
Performance
Specification
Research Paper
Granite Vision 4.1 for Document Extraction
arxiv.org
Build a pipeline with granite-vision-4.1-4b
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.