Qwen3-Embedding-4B
by Qwen
Top-ranked multilingual text embedding with 100+ languages and 32K context
Qwen/Qwen3-Embedding-4B
mixpeek://text_extractor@v1/qwen3_embedding_4b_v1
Overview
Qwen3-Embedding-4B is the mid-size model in the Qwen3 Embedding family. It scores 69.45 on the MTEB multilingual leaderboard and performs strongly across text retrieval, code retrieval, classification, clustering, and bitext mining, while keeping compute requirements moderate compared with larger models in the family.
On Mixpeek, Qwen3-Embedding-4B is the recommended text embedding model for production pipelines that need best-in-class multilingual retrieval quality. It powers semantic search over transcripts, documents, and extracted text across 100+ languages.
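Semantic search over embeddings usually comes down to comparing a query vector with document vectors. The sketch below shows one common way to do that with cosine similarity. The names `cosineSimilarity` and `rankBySimilarity` are illustrative helpers, not part of the Mixpeek SDK, and the vectors are assumed to have been produced by the same model.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedDoc {
  id: string;
  embedding: number[];
}

interface ScoredDoc {
  id: string;
  score: number;
}

// Hypothetical helper: score every document against the query and
// return them sorted from most to least similar.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[]): ScoredDoc[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

In a managed pipeline the vector store performs this comparison; the helper is only meant to make the ranking step concrete.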
Architecture
Qwen3-Embedding-4B is a dense transformer built on the Qwen3 4B foundation model. It uses the same three-stage training pipeline as the 0.6B variant: unsupervised pre-training, supervised fine-tuning, and model merging. Matryoshka training lets callers choose an output dimension from 32 to 2560, and the model is instruction-aware, so a task instruction can be supplied with the query to steer the embedding.
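The two features above can be sketched client-side. This is a minimal illustration, not Mixpeek's implementation: `formatInstructedQuery` and `truncateEmbedding` are hypothetical helper names, and the `Instruct: ...` / `Query:` template is assumed from the public Qwen3-Embedding model card, so it is worth checking against the version you deploy.

```typescript
// Assumption: prompt template taken from the public model card.
// Instructions are typically applied to queries, not to documents.
function formatInstructedQuery(task: string, query: string): string {
  return `Instruct: ${task}\nQuery:${query}`;
}

// Matryoshka-style truncation: keep the first `dims` components,
// then L2-normalize so cosine and dot-product scores stay comparable.
function truncateEmbedding(embedding: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > embedding.length) {
    throw new RangeError("dims must be an integer between 1 and embedding.length");
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, v) => sum + v * v, 0));
  // Leave an all-zero prefix untouched to avoid dividing by zero.
  return norm === 0 ? head : head.map((v) => v / norm);
}
```

Smaller dimensions reduce storage and search cost at some loss in retrieval quality, so the right size depends on the workload.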
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

// Managed: create a collection over a bucket; Mixpeek runs this model's extractor
const collection = await mx.collections.create({
  namespace_id: "my-namespace",
  collection_name: "my-collection",
  source: { type: "bucket", bucket_ids: ["bkt_your_bucket"] },
  feature_extractor: {
    feature_extractor_name: "text_embedding",
    version: "v1",
    parameters: { model_id: "Qwen/Qwen3-Embedding-4B" },
  },
});
Capabilities
- Top-ranked on MTEB multilingual leaderboard (69.45)
- 100+ language support with state-of-the-art multilingual transfer
- Flexible embedding dimensions from 32 to 2560
- 32K token context window for long documents
- Strong performance on code retrieval and classification tasks
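Documents that exceed the context window still need to be split before embedding. The sketch below shows one simple approach, fixed-size windows with overlap over an already tokenized input. `chunkTokens` is a hypothetical helper; tokenization itself is assumed to happen elsewhere, and the window and overlap sizes are the caller's choice.

```typescript
// Split a token sequence into windows of at most `maxLen` tokens,
// with `overlap` tokens shared between consecutive windows.
function chunkTokens<T>(tokens: T[], maxLen: number, overlap: number): T[][] {
  if (!Number.isInteger(maxLen) || maxLen < 1) {
    throw new RangeError("maxLen must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= maxLen) {
    throw new RangeError("overlap must be an integer in [0, maxLen)");
  }
  const step = maxLen - overlap;
  const chunks: T[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxLen));
    // Stop once the current window already reaches the end of the input.
    if (start + maxLen >= tokens.length) break;
  }
  return chunks;
}
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of embedding some tokens twice.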
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| MTEB Multilingual | Avg Score | 69.45 | Qwen3-Embedding paper, June 2025 |
| MTEB Retrieval (en) | nDCG@10 | Top-tier among open models | Qwen3-Embedding paper, June 2025 |
| Code Retrieval | MRR | Best among 4B-class models | Qwen3-Embedding paper, June 2025 |
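For readers unfamiliar with the metrics in the table, the sketch below shows how nDCG@k and MRR are commonly computed from ranked relevance judgments. It is an illustration under stated assumptions, not the evaluation code behind the reported scores: it uses linear gain, and it assumes the `relevances` array covers every judged document for the query so the ideal ordering can be derived from it.

```typescript
// DCG@k with linear gain: sum of rel_i / log2(i + 1) for 1-based rank i.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
function ndcgAtK(relevances: number[], k: number): number {
  const ideal = [...relevances].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(relevances, k) / idcg;
}

// Reciprocal rank: 1 / rank of the first relevant result, or 0 if none.
function reciprocalRank(relevant: boolean[]): number {
  const idx = relevant.indexOf(true);
  return idx === -1 ? 0 : 1 / (idx + 1);
}

// MRR: mean of the reciprocal ranks across all queries.
function meanReciprocalRank(queries: boolean[][]): number {
  if (queries.length === 0) return 0;
  const total = queries.reduce((sum, q) => sum + reciprocalRank(q), 0);
  return total / queries.length;
}
```

Both metrics reward placing relevant results near the top; nDCG also accounts for graded relevance, while MRR only looks at the first relevant hit.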
Research Paper
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
arxiv.org
Build a pipeline with Qwen3-Embedding-4B
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.