    HF · Text Embeddings · Apache 2.0

    Qwen3-Embedding-8B

    by Qwen

    #1 multilingual text embedding — 100+ languages, 32K context, instruction-tuned

    1.8M downloads/month
    8B parameters
    Identifiers
    Model ID: Qwen/Qwen3-Embedding-8B
    Feature URI: mixpeek://text_extractor@v1/qwen3_embedding_8b_v1

    Overview

    Qwen3-Embedding-8B is the flagship text embedding model from the Qwen3 family, achieving state-of-the-art results on the MTEB Multilingual leaderboard (70.58). It supports 100+ languages with instruction-tuned task conditioning, meaning you can prefix queries with task descriptions to optimize retrieval for specific use cases.
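As a rough illustration of the task conditioning described above, the sketch below builds an instruction-prefixed query string. The `Instruct: ...\nQuery:...` template follows the upstream Qwen3-Embedding model card as best understood here; `formatInstructedQuery` is a hypothetical helper, not part of the Mixpeek SDK, and the exact template should be checked against the model card before relying on it.

```typescript
// Hypothetical helper (not a Mixpeek SDK function): prefixes a query with a
// one-sentence task description. Template assumed from the upstream model card.
function formatInstructedQuery(taskDescription: string, query: string): string {
  return `Instruct: ${taskDescription}\nQuery:${query}`;
}

// Example task description; any one-sentence description of the retrieval goal works.
const retrievalTask =
  "Given a web search query, retrieve relevant passages that answer the query";

// Only the query side carries the instruction prefix.
const instructedQuery = formatInstructedQuery(
  retrievalTask,
  "What is the capital of China?"
);

// Documents are embedded as plain text, without a prefix.
const documentText = "The capital of China is Beijing.";

console.log(instructedQuery);
console.log(documentText);
```

The asymmetry is deliberate: the instruction steers how the query is embedded, while documents stay instruction-free so one index can serve several tasks.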

    The 32K context window handles full documents, long articles, and code files without truncation. On Mixpeek, it serves as the backbone for text-only retrieval pipelines — embedding transcripts, document text, metadata, and code for semantic search.

    Architecture

    Decoder-only transformer (Qwen3 backbone, 8B parameters) fine-tuned for embedding via instruction-tuned contrastive learning. Produces dense vectors up to 4096 dimensions with Matryoshka support. 32K context window with RoPE position encoding.
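Matryoshka support means a shorter vector can be taken from the leading dimensions of the full embedding. A minimal sketch of that step, assuming the usual truncate-then-renormalize recipe: keep the first `dim` values and rescale to unit length so cosine and dot-product scores stay comparable. `truncateEmbedding` is a hypothetical helper, and the 8-value vector is a toy stand-in for a real 4096-dimension output.

```typescript
// Hypothetical helper: keeps the first `dim` values of a Matryoshka-style
// embedding and L2-normalizes the result.
function truncateEmbedding(vector: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim < 1 || dim > vector.length) {
    throw new RangeError(`dim must be an integer in [1, ${vector.length}]`);
  }
  const head = vector.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero prefix cannot be normalized; return it unchanged.
  if (norm === 0) return head;
  return head.map((x) => x / norm);
}

// Toy vector standing in for a full-size embedding.
const fullVector = [3, 4, 0, 0, 1, 2, 2, 0];
const shortVector = truncateEmbedding(fullVector, 2);

console.log(shortVector.length); // 2
```

Smaller dimensions reduce index size and search cost at some loss in retrieval quality; the right trade-off depends on the corpus and should be measured.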

    Mixpeek SDK Integration

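    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your own key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest a document and embed its text with Qwen3-Embedding-8B.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/report.pdf" },
      feature_extractors: [{
        name: "text_embedding",
        version: "v1",
        params: {
          model_id: "Qwen/Qwen3-Embedding-8B",
          // Requested output size; the model supports 64–4096 (Matryoshka).
          embedding_dim: 1024
        }
      }]
    });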

    Capabilities

    • 100+ language support with native multilingual training
    • 32K context window for full-document embedding
    • Instruction-tuned task conditioning for query optimization
    • Matryoshka flexible dimensionality (64–4096)
    • #1 on MTEB Multilingual benchmark

    Use Cases on Mixpeek

    • Multilingual document search across enterprise knowledge bases
    • Long-document semantic retrieval without chunking
    • Code search and technical documentation retrieval
    • Cross-language search: query in English, find results in any language
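Each of these use cases comes down to ranking stored vectors against a query vector. The sketch below shows that ranking step with cosine similarity on toy 3-dimension vectors; the document IDs and values are made up for illustration, and in practice the vectors would come from the embedding model and the search would run inside a retriever rather than in application code.

```typescript
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

interface ScoredDoc {
  id: string;
  score: number;
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every document against the query and sorts best-first.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[]): ScoredDoc[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy data: an English query vector and documents in several languages.
const queryVector = [1, 0, 0];
const corpus: EmbeddedDoc[] = [
  { id: "doc-unrelated", vector: [0, 0, 1] },
  { id: "doc-de", vector: [0.8, 0.2, 0.1] },
  { id: "doc-en", vector: [0.9, 0.1, 0] },
];

const ranked = rankBySimilarity(queryVector, corpus);
console.log(ranked.map((d) => d.id));
```

Because a multilingual model places texts from different languages in one vector space, the same ranking code serves cross-language search without any translation step.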

    Benchmarks

    Dataset            | Metric | Score | Source
    MTEB Multilingual  | Score  | 70.58 | Qwen, 2026 (MTEB Leaderboard)
    MTEB English       | Score  | 72.3  | Qwen, 2026 (Model Card)

    Performance

    Input Size: Up to 32K tokens
    Embedding Dim: 64–4096 (Matryoshka)
    GPU Latency: ~35 ms / passage (A100, 512 tokens)
    GPU Throughput: ~29 passages/sec (A100)
    GPU Memory: ~16 GB
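The figures above are consistent with some back-of-envelope arithmetic, sketched below. Two assumptions are not stated on this page: that the ~16 GB figure reflects weights held at 2 bytes per parameter (fp16/bf16), and that throughput is measured one passage at a time. The index-size helper is hypothetical and assumes uncompressed 4-byte floats per stored value.

```typescript
// Weight memory: parameters x bytes per parameter (assumed 2 bytes, fp16/bf16).
const PARAMETER_COUNT = 8e9;
const BYTES_PER_PARAMETER = 2; // assumption, not stated in the source
const weightMemoryGB = (PARAMETER_COUNT * BYTES_PER_PARAMETER) / 1e9;

// Sequential throughput implied by a ~35 ms per-passage latency.
const latencyMs = 35;
const passagesPerSecond = 1000 / latencyMs;

// Hypothetical helper: raw vector storage, assuming float32 values and no
// compression or index overhead.
function indexSizeGB(
  vectorCount: number,
  dim: number,
  bytesPerValue: number = 4
): number {
  return (vectorCount * dim * bytesPerValue) / 1e9;
}

console.log(weightMemoryGB);                // 16
console.log(passagesPerSecond.toFixed(1));  // "28.6"
console.log(indexSizeGB(1e6, 1024));        // about 4.1 GB per million vectors
console.log(indexSizeGB(1e6, 4096));        // about 16.4 GB per million vectors
```

Activations and long inputs add to the weight footprint, so real GPU memory use will be higher than the weight estimate alone. The storage comparison shows why a reduced Matryoshka dimension such as 1024 is attractive for large collections.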

    Specification

    Framework: HF
    Organization: Qwen
    Feature: Text Embeddings
    Output: 1024-dim vector
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 8B
    License: Apache 2.0
    Downloads/mo: 1.8M

    Research Paper

    Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

    arxiv.org

    Build a pipeline with Qwen3-Embedding-8B

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
