granite-4.0-1b-speech
by ibm-granite
#1 Open ASR Leaderboard at 1B — edge-deployable multilingual transcription
ibm-granite/granite-4.0-1b-speech
mixpeek://transcription@v1/ibm_granite_40_1b_speech_v1
Overview
Granite 4.0 1B Speech is the smallest model to reach #1 on the HuggingFace Open ASR Leaderboard. At just 1B parameters, it achieves 1.42% WER on LibriSpeech Clean and a 5.52% average WER across the leaderboard's benchmarks, while running at a 280x realtime factor on GPU.
It supports English and Japanese with keyword list biasing for domain-specific vocabulary. The compact size makes it ideal for edge deployment, serverless functions, and cost-sensitive pipelines where Whisper Large v3 (1.5B) is too heavy. On Mixpeek, it serves as the default transcription model for latency-sensitive and high-volume audio processing.
Architecture
The model is a compact encoder-decoder with 1B parameters, optimized for throughput. It supports keyword biasing via attention-based shallow fusion and covers English and Japanese.
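To make the keyword-biasing idea concrete, here is a toy sketch of n-best rescoring with a keyword bonus, a simplified stand-in for shallow fusion applied during beam search. The `Hypothesis` type, the `biasWeight` value, and the example scores are hypothetical; this is not the model's actual decoder, which this page only describes as attention-based.

```typescript
// Toy keyword-biased rescoring over an n-best list.
// All names and numbers are illustrative assumptions, not the model's real API.
interface Hypothesis {
  text: string;
  logProb: number; // log-probability assigned by the ASR model
}

// Count how many bias keywords appear in the text.
// Case-insensitive, and limited to single-word keywords for simplicity.
function countKeywordHits(text: string, keywords: string[]): number {
  const words = new Set(text.toLowerCase().split(/\s+/).filter(Boolean));
  return keywords.filter((k) => words.has(k.toLowerCase())).length;
}

// Fused score = model log-prob + biasWeight * number of matched keywords.
// Returns a new list sorted from best to worst fused score.
function rescore(
  hyps: Hypothesis[],
  keywords: string[],
  biasWeight: number
): Hypothesis[] {
  return hyps
    .map((h) => ({
      text: h.text,
      logProb: h.logProb + biasWeight * countKeywordHits(h.text, keywords),
    }))
    .sort((a, b) => b.logProb - a.logProb);
}

const nBest: Hypothesis[] = [
  { text: "the patient takes a set of minifin", logProb: -4.1 },
  { text: "the patient takes acetaminophen", logProb: -4.6 },
];

const rescored = rescore(nBest, ["acetaminophen"], 1.0);
console.log(rescored[0].text); // "the patient takes acetaminophen"
```

With a bias weight of 1.0, the second hypothesis moves from -4.6 to -3.6 and overtakes the acoustically preferred but wrong first hypothesis. A larger weight biases harder toward the keyword list, at the risk of hallucinating keywords that were never spoken.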
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "my-collection",
  source: { url: "https://example.com/podcast.mp3" },
  feature_extractors: [
    {
      name: "transcription",
      version: "v1",
      params: {
        model_id: "ibm-granite/granite-4.0-1b-speech"
      }
    }
  ]
});
Capabilities
- #1 on HuggingFace Open ASR Leaderboard at release
- LibriSpeech Clean WER: 1.42%
- 280x realtime factor on GPU
- Keyword list biasing for domain vocabulary
- Apache 2.0 license, only 1B parameters
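As a rough illustration of what the 280x figure implies, the sketch below converts an audio duration into an estimated processing time. It assumes the realtime factor is defined as audio seconds divided by processing seconds and that throughput scales linearly. The function name is hypothetical, and real latency depends on hardware, batching, and audio length.

```typescript
// Estimate wall-clock processing time from a realtime factor (RTFx).
// Assumes RTFx = audio duration / processing time, with linear scaling.
function estimateProcessingSeconds(audioSeconds: number, rtfx: number): number {
  if (rtfx <= 0) {
    throw new RangeError("rtfx must be positive");
  }
  return audioSeconds / rtfx;
}

const oneHour = 60 * 60; // 3600 seconds of audio
const seconds = estimateProcessingSeconds(oneHour, 280);
console.log(seconds.toFixed(1)); // "12.9", i.e. about 13 seconds per hour of audio
```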
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| LibriSpeech Clean | WER | 1.42% | IBM, 2026 — Model Card |
| Open ASR Leaderboard (avg) | WER | 5.52% | IBM, 2026 — Model Card |
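For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal implementation under simple assumptions (lowercasing and whitespace tokenization). Published benchmark scores apply their own text normalization, so this will not reproduce the numbers in the table.

```typescript
// Word error rate = (substitutions + deletions + insertions) / reference word count.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) {
    throw new RangeError("reference must contain at least one word");
  }
  // prev[j] is the edit distance between the reference words seen so far
  // (before the current row) and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i]; // deleting all i reference words
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion
          curr[j - 1] + 1, // insertion
          prev[j - 1] + cost // match or substitution
        )
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}

const wer = wordErrorRate("the quick brown fox jumps", "the quick brown box jumps");
console.log(wer); // 0.2 (one substitution out of five reference words)
```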
Performance
Specification
Research Paper
Granite 4.0 Speech
arxiv.org
Build a pipeline with granite-4.0-1b-speech
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.