    HF · Text Embeddings · Apache 2.0

    granite-embedding-97m-multilingual-r2

    by ibm-granite

    Highest-quality multilingual embedding under 100M parameters for edge and mobile deployment

    15.6K downloads/month
    97M params
    Identifiers

    Model ID: ibm-granite/granite-embedding-97m-multilingual-r2
    Feature URI: mixpeek://text_extractor@v1/ibm_granite_embed_97m_multi_r2

    Overview

    Granite Embedding 97M is IBM's ultra-lightweight multilingual text embedding model. It achieves the best retrieval quality of any open model under 100M parameters (59.6 nDCG@10 on MMTEB retrieval). Built on the ModernBERT architecture with model pruning and vocabulary selection, it produces 384-dimensional embeddings across 200+ languages, and its quantized weights fit in under 100MB.

    On Mixpeek, this model fills the edge deployment gap: it generates embeddings on devices, in browser workers, or on minimal hardware where the 300M+ parameter models in the catalog are too large. Its 32K-token context window makes it suitable for embedding longer documents without chunking.

    Architecture

    ModernBERT with model pruning and vocabulary selection. 97M parameters. 384-dim output embeddings. 32K max context. Trained on 52 languages + code with enhanced multilingual data. ONNX quantized weights are 98MB.
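
The size figures above can be sanity-checked with simple arithmetic. The sketch below assumes 1 byte per parameter for the quantized weights (int8) and 4 bytes per value for unquantized weights and stored embeddings (float32). The source states neither width, so treat both as assumptions; the function names are illustrative only.

```typescript
// Rough size estimates. The byte widths (int8 = 1, float32 = 4) are
// assumptions; the source only says "quantized" and "384-dim".
const PARAMS = 97_000_000;
const EMBEDDING_DIM = 384;

/** Approximate weight size in MB (10^6 bytes) for a given bytes-per-parameter width. */
function weightSizeMB(params: number, bytesPerParam: number): number {
  return (params * bytesPerParam) / 1_000_000;
}

/** Bytes needed to store `count` embeddings of `dim` values at `bytesPerValue` each. */
function embeddingStorageBytes(count: number, dim: number, bytesPerValue: number): number {
  return count * dim * bytesPerValue;
}

const int8WeightsMB = weightSizeMB(PARAMS, 1); // 97 MB, close to the listed 98MB
const fp32WeightsMB = weightSizeMB(PARAMS, 4); // 388 MB
const bytesPerVector = embeddingStorageBytes(1, EMBEDDING_DIM, 4); // 1536 bytes
const millionVectorsMB = embeddingStorageBytes(1_000_000, EMBEDDING_DIM, 4) / 1_000_000; // 1536 MB
```

Under these assumptions, the int8 estimate lands within about 1MB of the listed ONNX size, and one million float32 embeddings take roughly 1.5GB before any index overhead.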

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your Mixpeek API key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest documents into a collection and embed their text with this model.
    await mx.collections.ingest({
      collection_id: "multilingual-docs",
      source: { url: "https://example.com/documents.json" },
      feature_extractors: [{
        feature: "text_embeddings",
        model: "ibm-granite/granite-embedding-97m-multilingual-r2"
      }]
    });

    Capabilities

    • Best retrieval quality under 100M params (59.6 nDCG@10 on MMTEB)
    • 200+ language support with 52-language enhanced training
    • 384-dimensional embeddings
    • 32K token context window
    • 98MB quantized — runs on edge devices and in-browser

    Use Cases on Mixpeek

    • Edge deployment: run embeddings on mobile devices or IoT hardware
    • Browser-based search: embed documents client-side in web workers
    • High-throughput batch embedding: minimize GPU cost for large corpora
    • Multilingual search: embed documents in 200+ languages with minimal resources
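
For the client-side search use case, ranking typically comes down to comparing a query embedding against document embeddings. The following is a minimal sketch that assumes cosine similarity as the comparison. The page lists the retriever only as "Text Similarity", so the metric is an assumption. The `EmbeddedDoc` type and both function names are hypothetical and not part of the Mixpeek SDK; real vectors would have 384 dimensions.

```typescript
// Hypothetical shape for a document that has already been embedded.
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

/** Cosine similarity of two equal-length vectors; returns 0 if either has zero norm. */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}

/** Brute-force top-k search: score every document, sort descending, keep the first k. */
function topK(query: number[], docs: EmbeddedDoc[], k: number): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A brute-force scan like this is usually adequate for the small corpora that fit in a browser worker. Larger collections would normally use an approximate nearest-neighbor index instead.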

    Benchmarks

    Dataset: MMTEB Retrieval (18 tasks)
    Metric: nDCG@10
    Score: 59.6%
    Source: IBM Research, 2026 (arxiv:2605.13521)
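
For readers unfamiliar with the metric, nDCG@10 scores how well the top 10 results for a query are ordered, normalized against the best possible ordering. The sketch below uses one common formulation (linear gain with a log2 position discount). The exact variant used in the benchmark is not stated on this page, so this is illustrative only, and the function names are hypothetical.

```typescript
/** Discounted cumulative gain over the first k relevance grades (linear gain, log2 discount). */
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

/**
 * nDCG@k for one query.
 * `ranked` holds the relevance grade of each returned result, in rank order.
 * `judged` holds the grades of all documents judged relevant for the query,
 * which defines the ideal ordering.
 */
function ndcgAtK(ranked: number[], judged: number[], k: number): number {
  const ideal = [...judged].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(ranked, k) / idealDcg;
}

// One relevant document returned at rank 2 instead of rank 1:
const exampleScore = ndcgAtK([0, 1, 0], [1], 10); // 1 / log2(3), about 0.63
```

A benchmark score is then the mean of this per-query value across all queries and tasks, reported here as a percentage.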

    Performance

    Input Size: Text (up to 32K tokens)
    Embedding Dim: 384
    GPU Latency: ~2ms / doc (A100)
    CPU Latency: ~15ms / doc
    GPU Throughput: ~500 docs/sec (A100)
    GPU Memory: ~0.4 GB
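
These figures can be turned into a rough capacity estimate. The sketch below assumes documents are processed one after another at the listed per-document latency. That is a simplification: real throughput depends on document length, batching, and hardware. The function names are illustrative.

```typescript
/** Documents per second implied by a per-document latency in milliseconds. */
function docsPerSecond(latencyMs: number): number {
  return 1000 / latencyMs;
}

/** Seconds needed to embed `docCount` documents sequentially at `latencyMs` each. */
function corpusSeconds(docCount: number, latencyMs: number): number {
  return (docCount * latencyMs) / 1000;
}

const gpuRate = docsPerSecond(2); // 500 docs/sec, matching the listed GPU throughput
const gpuSecondsPerMillion = corpusSeconds(1_000_000, 2); // 2000 s, about 33 minutes
const cpuSecondsPerMillion = corpusSeconds(1_000_000, 15); // 15000 s, a little over 4 hours
```

Under this assumption the ~2ms latency and ~500 docs/sec throughput figures agree with each other, and a one-million-document corpus takes roughly half an hour on an A100 versus about four hours on a single CPU worker.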

    Specification

    Framework: HF
    Organization: ibm-granite
    Feature: Text Embeddings
    Output: 384-dim vector
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 97M
    License: Apache 2.0
    Downloads/mo: 15.6K

    Research Paper

    Granite Embedding Multilingual R2 Models

    arxiv.org

    Build a pipeline with granite-embedding-97m-multilingual-r2

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.