Qwen3-Embedding-4B
by Qwen
Top-ranked multilingual text embedding with 100+ languages and 32K context
Qwen/Qwen3-Embedding-4B
mixpeek://text_extractor@v1/qwen3_embedding_4b_v1
Overview
Qwen3-Embedding-4B is the mid-size model in the Qwen3 Embedding family. It scores 69.45 on the MTEB multilingual leaderboard, placing it among the top performers, and does well across text retrieval, code retrieval, classification, clustering, and bitext mining. It balances strong embedding quality with reasonable compute requirements.
On Mixpeek, Qwen3-Embedding-4B is the recommended text embedding model for production pipelines that need best-in-class multilingual retrieval quality. It powers semantic search over transcripts, documents, and extracted text across 100+ languages.
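Semantic search over embeddings usually comes down to ranking stored vectors by their similarity to a query vector. The sketch below illustrates that idea with cosine similarity in plain TypeScript. The names `cosineSimilarity`, `rankBySimilarity`, and the `{ id, vector }` document shape are hypothetical and chosen for illustration; they are not part of the Mixpeek SDK, which handles retrieval for you.

```typescript
// Illustrative only: these helpers are assumptions, not Mixpeek APIs.
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

interface ScoredDoc {
  id: string;
  score: number;
}

// Cosine similarity between two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the topK highest scores.
function rankBySimilarity(
  query: number[],
  docs: EmbeddedDoc[],
  topK: number
): ScoredDoc[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 2-dimensional vectors stand in for real embeddings.
const results = rankBySimilarity(
  [1, 0],
  [
    { id: "a", vector: [0, 1] },
    { id: "b", vector: [2, 0] },
    { id: "c", vector: [1, 1] },
  ],
  2
);
console.log(results.map((r) => r.id));
```

A brute-force scan like this is fine for small collections; production systems typically rely on an approximate nearest-neighbor index instead.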
Architecture
Qwen3-Embedding-4B is a dense transformer built on the Qwen3 4B foundation model. It uses the same three-stage training pipeline as the 0.6B variant: weakly supervised pre-training, supervised fine-tuning, and model merging. Output dimensions are configurable from 32 to 2560 via Matryoshka training, and the model is instruction-aware: a task instruction can be prepended to the input to tailor the embedding to a specific task.
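To make the two features above concrete, here is a minimal sketch, assuming you already have a full-size embedding as a plain number array. Matryoshka-trained embeddings can be shortened by keeping the leading components and rescaling to unit length. The helper names `truncateEmbedding` and `withInstruction` are hypothetical, and the `Instruct:` / `Query:` template is an assumption based on the model's published usage examples; check the model card for the exact format before relying on it.

```typescript
// Hypothetical helper: keep the first `dim` components of a Matryoshka-trained
// embedding and rescale the result to unit length.
function truncateEmbedding(embedding: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim < 1 || dim > embedding.length) {
    throw new RangeError(`dim must be an integer in [1, ${embedding.length}]`);
  }
  const head = embedding.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return head; // all-zero prefix: nothing to normalize
  return head.map((x) => x / norm);
}

// Hypothetical helper: prefix a query with a one-sentence task description.
// The template below is an assumption, not a confirmed specification.
function withInstruction(task: string, query: string): string {
  return `Instruct: ${task}\nQuery:${query}`;
}

// Toy 4-dimensional vector standing in for a real embedding.
const shortened = truncateEmbedding([3, 4, 12, 0], 2);
const prompt = withInstruction(
  "Given a web search query, retrieve relevant passages that answer the query",
  "What is Matryoshka representation learning?"
);
console.log(shortened.length, prompt);
```

Smaller dimensions reduce storage and speed up similarity search at some cost in retrieval quality, so the right size is worth measuring on your own data.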
Mixpeek SDK Integration
    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/report.pdf" },
      feature_extractors: [
        {
          name: "text_embedding",
          version: "v1",
          params: {
            model_id: "Qwen/Qwen3-Embedding-4B",
          },
        },
      ],
    });
Capabilities
- Top-ranked on MTEB multilingual leaderboard (69.45)
- 100+ language support with state-of-the-art multilingual transfer
- Flexible embedding dimensions from 32 to 2560
- 32K token context window for long documents
- Strong performance on code retrieval and classification tasks
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| MTEB Multilingual | Avg Score | 69.45 | Qwen3-Embedding paper, June 2025 |
| MTEB Retrieval (en) | nDCG@10 | Top-tier among open models | Qwen3-Embedding paper, June 2025 |
| Code Retrieval | MRR | Best among 4B-class models | Qwen3-Embedding paper, June 2025 |
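For readers unfamiliar with the two ranking metrics in the table, the following sketch shows one common way to compute them. It is an illustration only, not the evaluation code used in the paper. The function names are hypothetical, nDCG is computed with linear gain (some definitions use 2^rel - 1 instead), and the ideal ordering is derived from the returned list itself, which assumes every relevant document appears in that list.

```typescript
// Discounted cumulative gain over the top k results.
// `relevances` holds the graded relevance of each result, in ranked order.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG divided by the DCG of the best possible ordering.
// Simplification: the ideal ordering is built from the same list.
function ndcgAtK(relevances: number[], k: number): number {
  const ideal = [...relevances].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(relevances, k) / idealDcg;
}

// MRR: mean over queries of 1 / (rank of the first relevant result).
// Each inner array marks, in ranked order, whether a result is relevant.
function meanReciprocalRank(hitsPerQuery: boolean[][]): number {
  if (hitsPerQuery.length === 0) return 0;
  const total = hitsPerQuery.reduce((sum, hits) => {
    const first = hits.indexOf(true);
    return sum + (first === -1 ? 0 : 1 / (first + 1));
  }, 0);
  return total / hitsPerQuery.length;
}

// A perfectly ordered list scores 1; a swapped list scores lower.
console.log(ndcgAtK([3, 2, 1], 10), ndcgAtK([1, 3], 10));
// First relevant hit at rank 1, rank 2, and never.
console.log(meanReciprocalRank([[true], [false, true], [false, false]]));
```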
Research Paper
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
arxiv.org
Build a pipeline with Qwen3-Embedding-4B
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.