    HF · Text Embeddings · Apache 2.0

    granite-embedding-311m-multilingual-r2

    by ibm-granite

    200+ language embedding model with 32K context and ModernBERT architecture

    156K downloads/month
    311M params
    Identifiers
    Model ID
    ibm-granite/granite-embedding-311m-multilingual-r2
    Feature URI
    mixpeek://text_extractor@v1/ibm_granite_embed_311m_multi_r2

    Overview

    Granite Embedding 311M Multilingual R2 is IBM's second-generation multilingual text embedding model, built on ModernBERT with alternating attention mechanisms and GeGLU activations. It scores 65.2 on MTEB Multilingual Retrieval, a 13-point improvement over R1, while supporting 200+ languages, 9 programming languages, and a 32K token context window.

    On Mixpeek, this model is well suited to cross-lingual retrieval across global document collections. Its 32K token context lets documents that fit within the window, such as full-length legal contracts, research papers, and technical documentation, be embedded in a single pass without chunking. The Apache 2.0 license and broad deployment options (ONNX, OpenVINO INT8, vLLM, GGUF) make it suitable for production at scale.
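Whether a document avoids chunking depends on its token count relative to the context window. The sketch below is a minimal illustration of that decision, not part of the Mixpeek SDK: `plan_windows` is a hypothetical helper, the 32,768-token limit is an assumed reading of "32K", and the 256-token overlap is an arbitrary choice. Token counts are expected to come from the model's own tokenizer.

```python
def plan_windows(n_tokens, max_tokens=32_768, overlap=256):
    """Return (start, end) token spans covering a document.

    Hypothetical helper: a document that fits the assumed context
    window gets a single span; longer ones get overlapping windows.
    """
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    if n_tokens <= max_tokens:
        return [(0, n_tokens)]  # fits in one pass, no chunking needed

    step = max_tokens - overlap
    spans = []
    start = 0
    while start < n_tokens:
        end = min(start + max_tokens, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        start += step
    return spans


# A 20,000-token contract fits in one window; a 70,000-token one does not.
print(plan_windows(20_000))
print(plan_windows(70_000))
```

Documents that exceed the window still need a splitting strategy; the overlap here only reduces the chance of cutting a clause in half.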

    Architecture

    ModernBERT backbone with 22 layers, 12 attention heads, alternating attention patterns, and GeGLU activations. 311M parameters. Rotary position embeddings (RoPE) supporting 32K context. Trained via knowledge distillation from multiple teachers with contrastive fine-tuning and model merging. Matryoshka representation learning for flexible output dimensions.

    Mixpeek SDK Integration

    from mixpeek import Mixpeek

    mixpeek = Mixpeek(api_key="YOUR_API_KEY")

    mixpeek.ingest.documents(
        collection="global_contracts",
        source={"type": "s3", "bucket": "legal-docs"},
        pipeline={
            "embedding": {
                "model": "mixpeek://text_extractor@v1/ibm_granite_embed_311m_multi_r2"
            }
        }
    )

    Capabilities

    • 200+ language support with 52 enhanced languages
    • 32K token context length via RoPE
    • 768-dimensional embeddings with Matryoshka truncation to 128-dim
    • Code retrieval across Python, Go, Java, JavaScript, PHP, Ruby, SQL, C, C++
    • 1828 docs/sec throughput on single H100
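Matryoshka truncation, as listed above, usually means keeping the leading components of the full vector and re-normalizing before computing cosine similarity. The following is a minimal sketch of that step under those assumptions; `truncate_embedding` is an illustrative name, and the toy vector stands in for a real 768-dimensional model output.

```python
import math


def truncate_embedding(vec, dim=128):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim > len(vec):
        raise ValueError("dim exceeds embedding length")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero vector: nothing to normalize
    return [x / norm for x in head]


# Toy 768-dim vector standing in for a real embedding (not model output).
full = [math.sin(i + 1) for i in range(768)]
small = truncate_embedding(full, dim=128)

print(len(small))
print(round(sum(x * x for x in small), 6))  # squared norm after re-normalizing
```

Shorter vectors cut index size and search cost at some loss in retrieval quality, so the target dimension is best chosen by measuring on your own data.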

    Use Cases on Mixpeek

    • Cross-lingual enterprise search across global document repositories
    • Long-document embedding for legal contracts and research papers without chunking
    • Multilingual code search across polyglot codebases
    • Edge-optimized deployment via OpenVINO INT8 quantization

    Benchmarks

    Dataset                                | Metric  | Score | Source
    MTEB Multilingual Retrieval (18 tasks) | nDCG@10 | 65.2  | Model card
    MTEB Code Retrieval (12 tasks)         | nDCG@10 | 63.8  | Model card
    LongEmbed (6 tasks)                    | nDCG@10 | 71.7  | Model card
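For readers unfamiliar with the metric, nDCG@10 rewards rankings that place relevant documents near the top of the first ten results. The sketch below computes it for a single query using the linear-gain form; that form is an assumption, since benchmark harnesses differ in the gain function and in how the ideal ranking is built. Here the ideal ordering is taken from the same result list, which is a simplification. The scores above are presumably this quantity averaged over queries and scaled by 100.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the given ranking divided by the DCG of its ideal reordering."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant results at all
    return dcg_at_k(ranked_relevances, k) / ideal


# Relevance labels of the results in ranked order (1 = relevant, 0 = not).
print(ndcg_at_k([1, 0, 0]))  # relevant document ranked first
print(ndcg_at_k([0, 1, 0]))  # relevant document ranked second
```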

    Performance

    Input Size: Variable
    GPU Latency: ~3ms per passage (H100)
    GPU Throughput: 1828 docs/sec (H100, batch 1024, 512 tokens)
    GPU Memory: ~0.8 GB
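These figures allow a rough capacity estimate. The sketch below is back-of-the-envelope arithmetic only: it assumes the quoted 1828 docs/sec holds for the whole job (the figure is for 512-token inputs on one H100), that vectors are stored as float32, and that the output is the 768-dimensional embedding listed under Capabilities. The function names are illustrative.

```python
DOCS_PER_SEC = 1828      # throughput quoted above (H100, batch 1024, 512 tokens)
BYTES_PER_FLOAT32 = 4    # assumed storage precision


def embed_hours(n_docs, docs_per_sec=DOCS_PER_SEC):
    """Hours of single-GPU time to embed n_docs at a constant rate."""
    return n_docs / docs_per_sec / 3600


def index_gib(n_docs, dims=768):
    """Raw vector storage in GiB, excluding any index overhead."""
    return n_docs * dims * BYTES_PER_FLOAT32 / 2**30


n = 10_000_000
print(f"embedding time: {embed_hours(n):.2f} h")
print(f"768-dim storage: {index_gib(n, 768):.1f} GiB")
print(f"128-dim storage: {index_gib(n, 128):.1f} GiB")
```

Longer inputs will lower throughput, so treat the time estimate as a lower bound for long-document workloads.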

    Specification

    Framework: HF
    Organization: ibm-granite
    Feature: Text Embeddings
    Output: 768-dim vector
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 311M
    License: Apache 2.0
    Downloads/mo: 156K

    Research Paper

    Model paper or technical report

    arxiv.org

    Build a pipeline with granite-embedding-311m-multilingual-r2

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
