    HF | Text Embeddings | Apache 2.0

    Qwen3-Embedding-4B

    by Qwen

    Top-ranked multilingual text embedding with 100+ languages and 32K context

    2.0M downloads/month
    4B params
    Identifiers

    Model ID: Qwen/Qwen3-Embedding-4B
    Feature URI: mixpeek://text_extractor@v1/qwen3_embedding_4b_v1

    Overview

    Qwen3-Embedding-4B is the mid-size model in the Qwen3 Embedding family. It ranks among the top models on the MTEB multilingual leaderboard with a score of 69.45 and performs strongly across text retrieval, code retrieval, classification, clustering, and bitext mining. It balances embedding quality against moderate compute requirements.

    On Mixpeek, Qwen3-Embedding-4B is the recommended text embedding model for production pipelines that need best-in-class multilingual retrieval quality. It powers semantic search over transcripts, documents, and extracted text across 100+ languages.

    Architecture

    Qwen3-Embedding-4B is a dense transformer built on the Qwen3 4B foundation model. It uses the same three-stage training pipeline as the 0.6B variant: weakly supervised pre-training, supervised fine-tuning, and model merging. Matryoshka training allows embedding dimensions from 32 to 2560, and instruction-aware embedding lets a short task description be prepended to each query.
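Instruction-aware embedding means the query text is prefixed with a one-line task description before encoding, while documents are typically embedded as-is. The sketch below follows the prompt format documented on the upstream Qwen3-Embedding model card. The helper name `formatQuery` is hypothetical and not part of the Mixpeek SDK, and whether Mixpeek applies such a prefix automatically is not stated here.

```typescript
// Hypothetical helper for illustration; not a Mixpeek SDK function.
// Prompt format from the upstream model card: "Instruct: {task}\nQuery:{query}".
function formatQuery(task: string, query: string): string {
  return `Instruct: ${task}\nQuery:${query}`;
}

// Example task description (the default retrieval instruction on the model card).
const task =
  "Given a web search query, retrieve relevant passages that answer the query";

const formatted = formatQuery(task, "What is the capital of China?");
console.log(formatted);
```

Only the query side carries the instruction in this sketch; a task description written for your own domain may work better than the generic one above.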

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your Mixpeek API key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest a PDF and run the text embedding extractor with Qwen3-Embedding-4B.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/report.pdf" },
      feature_extractors: [{
        name: "text_embedding",
        version: "v1",
        params: {
          model_id: "Qwen/Qwen3-Embedding-4B"
        }
      }]
    });

    Capabilities

    • Top-ranked on MTEB multilingual leaderboard (69.45)
    • 100+ language support with state-of-the-art multilingual transfer
    • Flexible embedding dimensions from 32 to 2560 via Matryoshka training
    • 32K token context window for long documents
    • Strong performance on code retrieval and classification tasks
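Flexible dimensions via Matryoshka training mean a shorter embedding can be obtained by keeping only the leading components of the full vector and rescaling to unit length. A minimal sketch of that step, assuming you already hold the full embedding as a plain number array; `truncateEmbedding` is an illustrative name, not a Mixpeek or Qwen API.

```typescript
// Keep the first `dim` components of an embedding and rescale to unit length.
// Illustrative helper (hypothetical name); the input is a toy vector below.
function truncateEmbedding(vec: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim < 1 || dim > vec.length) {
    throw new RangeError(`dim must be an integer in [1, ${vec.length}]`);
  }
  const head = vec.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return head; // all-zero prefix: nothing to rescale
  return head.map((x) => x / norm);
}

// Toy 4-dim vector standing in for a real embedding.
const small = truncateEmbedding([3, 4, 12, 0], 2);
console.log(small); // approximately [0.6, 0.8]
```

Shorter vectors reduce index size and search cost at some loss in retrieval quality, so the target dimension is worth validating on your own data.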

    Use Cases on Mixpeek

    • Production-grade multilingual semantic search across document collections
    • RAG pipeline embedding backend for enterprise knowledge bases
    • Cross-lingual document matching and deduplication at scale
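Semantic search and cross-lingual matching both reduce to comparing embedding vectors, most commonly by cosine similarity. The sketch below ranks a few documents against a query vector in memory. The `Doc` type, the `topK` helper, and the toy 3-dim vectors are assumptions for illustration; a production pipeline would use the retrieval stages of the platform rather than a linear scan.

```typescript
type Doc = { id: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query and keep the k best.
function topK(query: number[], docs: Doc[], k: number): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dim vectors standing in for real embeddings.
const docs: Doc[] = [
  { id: "en-report", embedding: [0.9, 0.1, 0.0] },
  { id: "de-report", embedding: [0.8, 0.2, 0.1] },
  { id: "unrelated", embedding: [0.0, 0.1, 0.9] },
];
const hits = topK([1, 0, 0], docs, 2);
console.log(hits.map((h) => h.id)); // ["en-report", "de-report"]
```

The same similarity function can flag near-duplicates by treating pairs above a chosen threshold as matches; the threshold is something to tune per corpus.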

    Benchmarks

    Dataset | Metric | Score | Source
    MTEB Multilingual | Avg Score | 69.45 | Qwen3-Embedding paper, June 2025
    MTEB Retrieval (en) | nDCG@10 | Top-tier among open models | Qwen3-Embedding paper, June 2025
    Code Retrieval | MRR | Best among 4B-class models | Qwen3-Embedding paper, June 2025
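For readers unfamiliar with the two retrieval metrics named above, here is a minimal sketch of how MRR and nDCG@k are commonly computed. It uses the linear-gain form of DCG and, as a simplification, derives the ideal ordering from the gains passed in rather than from every judged document in a corpus. Function names are illustrative, and this is not the evaluation code used by the paper.

```typescript
// Reciprocal rank for one query: 1 / (rank of the first relevant result), else 0.
function reciprocalRank(ranked: string[], relevant: Set<string>): number {
  const idx = ranked.findIndex((id) => relevant.has(id));
  return idx === -1 ? 0 : 1 / (idx + 1);
}

// MRR: mean of the reciprocal ranks over all queries.
function meanReciprocalRank(runs: { ranked: string[]; relevant: Set<string> }[]): number {
  if (runs.length === 0) return 0;
  const total = runs.reduce((s, r) => s + reciprocalRank(r.ranked, r.relevant), 0);
  return total / runs.length;
}

// DCG@k with linear gains: sum of gain / log2(rank + 1), ranks starting at 1.
function dcgAtK(gains: number[], k: number): number {
  return gains.slice(0, k).reduce((s, g, i) => s + g / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the returned order divided by DCG of the best possible order.
// Simplification: the ideal order is built from the same gains array.
function ndcgAtK(gains: number[], k: number): number {
  const ideal = [...gains].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(gains, k) / idealDcg;
}

const mrr = meanReciprocalRank([
  { ranked: ["a", "b", "c"], relevant: new Set(["b"]) }, // first hit at rank 2
  { ranked: ["x", "y"], relevant: new Set(["x"]) },      // first hit at rank 1
]);
const ndcg = ndcgAtK([1, 0, 1], 10);
console.log(mrr);  // 0.75
console.log(ndcg); // about 0.92
```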

    Performance

    Input Size: 32K tokens max
    Embedding Dim: 2560 (Matryoshka: 32-2560)
    GPU Latency: ~4ms / passage (A100)
    CPU Latency: ~35ms / passage
    GPU Throughput: ~250 passages/sec (A100)
    GPU Memory: ~8.2 GB
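These figures can be turned into rough capacity estimates. The sketch below is back-of-the-envelope arithmetic only: it assumes the ~250 passages/sec throughput holds steadily on a single GPU, that vectors are stored as float32 (4 bytes per value), and that the 1024-dim output listed in the specification is what gets indexed. Function names are illustrative.

```typescript
// Hours needed to embed a corpus at a steady throughput (single GPU assumed).
function embeddingHours(passages: number, passagesPerSec: number): number {
  return passages / passagesPerSec / 3600;
}

// Raw vector storage in GiB, ignoring index overhead and metadata.
function indexSizeGiB(vectors: number, dim: number, bytesPerValue = 4): number {
  return (vectors * dim * bytesPerValue) / 1024 ** 3;
}

const hours = embeddingHours(1_000_000, 250); // 1M passages at ~250 passages/sec
const gib = indexSizeGiB(1_000_000, 1024);    // 1M float32 vectors, 1024 dims

console.log(hours.toFixed(2)); // "1.11"
console.log(gib.toFixed(2));   // "3.81"
```

Real numbers depend on passage length, batch size, and hardware, so treat these as order-of-magnitude estimates.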

    Specification

    Framework: HF
    Organization: Qwen
    Feature: Text Embeddings
    Output: 1024-dim vector
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 4B
    License: Apache 2.0
    Downloads/mo: 2.0M

    Research Paper

    Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

    arxiv.org

    Build a pipeline with Qwen3-Embedding-4B

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
