    HF | Text Embeddings | MIT

    bge-m3

    by BAAI

    Hybrid retrieval in one model -- dense, sparse, and ColBERT embeddings from a single forward pass

    8.2M downloads/month
    568M parameters
    Identifiers
    Model ID: BAAI/bge-m3
    Feature URI: mixpeek://text_extractor@v1/baai_bge_m3_v1

    Overview

    BGE-M3 is BAAI's multi-functionality embedding model that produces three types of embeddings simultaneously: dense vectors for semantic search, sparse vectors for lexical matching, and ColBERT-style multi-vector representations for fine-grained late interaction. This eliminates the need to run separate models for different retrieval strategies.

    The model supports 100+ languages and handles up to 8192 tokens of input, making it suitable for long documents. On Mixpeek, BGE-M3 powers hybrid retrieval pipelines where a single ingest pass produces all three representation types, and the retriever fuses them at query time for higher recall than any single strategy alone.
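As a rough illustration of query-time fusion, the sketch below combines per-mode scores with a weighted sum, the scheme described in the BGE-M3 paper. The `ModeScores` and `Candidate` shapes, the `fuseScores` function, and the weight values are all hypothetical. This is not Mixpeek's retriever, whose fusion method is not specified on this page.

```typescript
// Per-mode relevance scores for one candidate document (hypothetical shape).
interface ModeScores {
  dense: number;
  sparse: number;
  colbert: number;
}

interface Candidate {
  id: string;
  scores: ModeScores;
}

// Weighted-sum fusion:
//   fused = w.dense * dense + w.sparse * sparse + w.colbert * colbert
// Returns candidates sorted by fused score, highest first.
function fuseScores(
  candidates: Candidate[],
  weights: ModeScores
): { id: string; score: number }[] {
  return candidates
    .map((c) => ({
      id: c.id,
      score:
        weights.dense * c.scores.dense +
        weights.sparse * c.scores.sparse +
        weights.colbert * c.scores.colbert,
    }))
    .sort((a, b) => b.score - a.score);
}

// Illustrative weights only (an assumption); tune them on your own data.
const exampleWeights: ModeScores = { dense: 0.4, sparse: 0.2, colbert: 0.4 };

const fused = fuseScores(
  [
    { id: "doc-a", scores: { dense: 0.82, sparse: 0.1, colbert: 0.75 } },
    { id: "doc-b", scores: { dense: 0.78, sparse: 0.35, colbert: 0.8 } },
  ],
  exampleWeights
);
console.log(fused.map((r) => r.id));
```

Sparse scores are unbounded sums of term weights while dense scores are cosine similarities, so in practice the three scores may need normalizing before they are mixed.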

    Architecture

    XLM-RoBERTa backbone, 568M parameters. Produces dense embeddings (1024d), sparse term-weight vectors, and ColBERT multi-vector representations from one forward pass. Trained with self-knowledge distillation across 100+ languages.
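The three output types are scored differently at retrieval time. The sketch below shows the usual scoring function for each, following the definitions in the BGE-M3 paper: cosine similarity for dense vectors, a sum of weight products over shared tokens for sparse vectors, and mean MaxSim for ColBERT multi-vectors. The function names and the `SparseVector` type are illustrative assumptions, not part of any SDK.

```typescript
// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Dense: cosine similarity between one query vector and one document vector.
function denseScore(q: number[], d: number[]): number {
  return dot(q, d) / (Math.sqrt(dot(q, q)) * Math.sqrt(dot(d, d)));
}

// Sparse: token -> weight maps. Only tokens present in both sides contribute.
type SparseVector = Record<string, number>;

function sparseScore(q: SparseVector, d: SparseVector): number {
  let score = 0;
  for (const token of Object.keys(q)) {
    const wd = d[token];
    if (wd !== undefined) score += q[token] * wd;
  }
  return score;
}

// ColBERT late interaction: for each query token vector, take its best match
// among the document token vectors (MaxSim), then average over query tokens.
// Assumes token vectors are already L2-normalized.
function colbertScore(q: number[][], d: number[][]): number {
  let total = 0;
  for (const qv of q) {
    let best = -Infinity;
    for (const dv of d) best = Math.max(best, dot(qv, dv));
    total += best;
  }
  return total / q.length;
}
```

A real dense vector from this model would have 1024 dimensions; the functions above work for any length.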

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your Mixpeek API key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest a document and embed its text with BGE-M3.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/report.pdf" },
      feature_extractors: [{
        name: "text_embedding",
        version: "v1",
        params: {
          model_id: "BAAI/bge-m3"
        }
      }]
    });

    Capabilities

    • Dense, sparse, and ColBERT embeddings in one pass
    • 100+ language support
    • 8192 token context window
    • Hybrid retrieval without multiple models
    • Matryoshka dimension reduction

    Use Cases on Mixpeek

    Hybrid search combining semantic + lexical matching
    Cross-lingual document retrieval
    Long-document embedding for RAG pipelines
    Multi-strategy retrieval with single model overhead

    Benchmarks

    Dataset | Metric | Score | Source
    MIRACL (avg, 18 languages) | nDCG@10 | 71.9% | BAAI, 2024 -- Paper Table 3
    MTEB Retrieval (en) | nDCG@10 | 67.2% | MTEB Leaderboard
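Both rows report nDCG@10. For readers unfamiliar with the metric, here is a minimal sketch of how it can be computed for a single query, using linear gain and a log2 rank discount. The function names are illustrative, and benchmark suites may use a different gain function (for example exponential gain); with binary relevance labels the two variants coincide.

```typescript
// Discounted cumulative gain over the top-k results, with linear gain:
//   DCG@k = sum over ranks i = 1..k of rel_i / log2(i + 1)
function dcgAtK(relevances: number[], k: number): number {
  let dcg = 0;
  const limit = Math.min(k, relevances.length);
  for (let i = 0; i < limit; i++) {
    // i is 0-based, so rank = i + 1 and the discount is log2(rank + 1).
    dcg += relevances[i] / (Math.log(i + 2) / Math.LN2);
  }
  return dcg;
}

// nDCG@k = DCG@k of the system ranking / DCG@k of the ideal ranking.
// `ranked` holds the relevance labels of the returned documents in rank order;
// `judged` holds the labels of all judged documents for the query.
function ndcgAtK(ranked: number[], judged: number[], k: number = 10): number {
  const ideal = judged.slice().sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(ranked, k) / idcg;
}

// The only relevant document is returned at rank 2 instead of rank 1.
console.log(ndcgAtK([0, 1, 0], [1]).toFixed(4));
```

The reported benchmark figure is this value averaged over all queries in a dataset, and for MIRACL also averaged across languages.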

    Performance

    Input Size: Up to 8192 tokens
    GPU Latency: ~12ms / passage (A100)
    GPU Throughput: ~800 passages/sec (A100, batch 64)
    GPU Memory: ~2.2 GB
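For capacity planning, the throughput figure converts directly into a rough ingest-time estimate. The helper below is a back-of-envelope sketch. `estimateEmbedSeconds` is a hypothetical name, and the estimate assumes the quoted ~800 passages/sec holds for your passage lengths and hardware, which it may not.

```typescript
// Rough wall-clock time to embed a corpus at a steady throughput.
// The default of 800 passages/sec is the A100, batch-64 figure quoted above.
function estimateEmbedSeconds(
  numPassages: number,
  passagesPerSec: number = 800
): number {
  if (passagesPerSec <= 0) {
    throw new Error("passagesPerSec must be positive");
  }
  return numPassages / passagesPerSec;
}

const seconds = estimateEmbedSeconds(1000000);
console.log(`${seconds} s (~${(seconds / 60).toFixed(1)} min)`);
```

The quoted latency of ~12ms per passage corresponds to roughly 83 passages/sec when passages are processed one at a time, so the batched figure is about a tenfold improvement.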

    Specification

    Framework: HF
    Organization: BAAI
    Feature: Text Embeddings
    Output: 1024-dim vector
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 568M
    License: MIT
    Downloads/mo: 8.2M

    Research Paper

    BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity

    arxiv.org

    Build a pipeline with bge-m3

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
