    HF · Text Embeddings · Apache-2.0

    modernbert-embed-base

    by nomic-ai

    ModernBERT-powered text embeddings -- 8192 tokens, Matryoshka dimensions, fast inference

    2.8M downloads/month
    149M parameters
    Identifiers
    Model ID
    nomic-ai/modernbert-embed-base
    Feature URI
    mixpeek://text_extractor@v1/nomic_modernbert_embed_base_v1

    Overview

    ModernBERT Embed Base is Nomic AI's text embedding model built on the ModernBERT architecture, which modernizes the BERT encoder with rotary position embeddings, Flash Attention, and unpadded variable-length batching. The result is an embedding model that handles 8192-token inputs with faster inference than comparably sized alternatives.

    At 149M parameters, it outperforms Nomic's previous embedding models on MTEB while being significantly cheaper to run. It supports Matryoshka representations at 768 and 256 dimensions. On Mixpeek, it provides a strong baseline text embedding model that balances quality, speed, and context length for document-heavy retrieval pipelines.
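To make the Matryoshka option concrete, here is a minimal client-side sketch of shortening a 768-dimensional vector to 256 dimensions. It assumes the generic Matryoshka recipe: keep the leading components, then re-normalize to unit length. The function name `truncateEmbedding` and the synthetic input vector are illustrative only; this page does not state how Mixpeek itself exposes the 256-dimension variant.

```typescript
// Truncate a Matryoshka embedding to its first `dim` components and
// re-normalize to unit length so cosine similarity remains well defined.
// Hypothetical helper, not part of any SDK.
function truncateEmbedding(vec: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim <= 0 || dim > vec.length) {
    throw new RangeError(`dim must be an integer in 1..${vec.length}, got ${dim}`);
  }
  const head = vec.slice(0, dim);
  const norm = Math.sqrt(head.reduce((acc, x) => acc + x * x, 0));
  if (norm === 0) return head; // all-zero prefix: nothing to normalize
  return head.map((x) => x / norm);
}

// L2 norm, used here only to check the result.
function l2Norm(vec: number[]): number {
  return Math.sqrt(vec.reduce((acc, x) => acc + x * x, 0));
}

// Synthetic 768-dim vector standing in for a real model output (assumption).
const fullEmbedding: number[] = Array.from({ length: 768 }, (_, i) => Math.sin(i + 1));
const compactEmbedding = truncateEmbedding(fullEmbedding, 256);

console.log(compactEmbedding.length);            // 256
console.log(l2Norm(compactEmbedding).toFixed(6)); // 1.000000
```

Truncated vectors are only comparable with other vectors truncated to the same length, so queries and documents need to use the same dimension.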

    Architecture

    ModernBERT encoder, 149M parameters. Rotary position embeddings for 8192-token context. Flash Attention 2 for efficient long-sequence processing. Matryoshka dimensions: 768 (full) and 256 (compressed).

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your Mixpeek API key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest a document and embed its text with modernbert-embed-base.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/documentation.pdf" },
      feature_extractors: [{
        name: "text_embedding",
        version: "v1",
        params: {
          model_id: "nomic-ai/modernbert-embed-base"
        }
      }]
    });

    Capabilities

    • 8192 token context window
    • Matryoshka dimension reduction (768/256)
    • Flash Attention 2 for fast inference
    • Variable-length batching (no padding waste)
    • Strong MTEB performance at 149M params
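The "no padding waste" point can be illustrated with simple arithmetic. The sketch below compares how many token slots a batch occupies when every sequence is padded to the longest one versus when sequences are packed at their true lengths. The helper names and the example token counts are hypothetical, and the sketch only counts slots; it does not model attention cost or actual kernel behavior.

```typescript
// Token slots processed when every sequence is padded to the batch maximum.
function paddedTokenSlots(lengths: number[]): number {
  if (lengths.length === 0) return 0;
  return Math.max(...lengths) * lengths.length;
}

// Token slots processed when sequences are packed without padding.
function unpaddedTokenSlots(lengths: number[]): number {
  return lengths.reduce((acc, n) => acc + n, 0);
}

// Fraction of padded slots that hold padding rather than real tokens.
function paddingWaste(lengths: number[]): number {
  const padded = paddedTokenSlots(lengths);
  return padded === 0 ? 0 : 1 - unpaddedTokenSlots(lengths) / padded;
}

// Hypothetical batch: one long document next to three short passages.
const batchLengths: number[] = [8192, 512, 300, 120];

console.log(paddedTokenSlots(batchLengths));        // 32768
console.log(unpaddedTokenSlots(batchLengths));      // 9124
console.log(paddingWaste(batchLengths).toFixed(3)); // 0.722
```

In this example roughly 72% of the padded batch would be padding, which is why mixing long and short inputs benefits most from unpadded batching.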

    Use Cases on Mixpeek

    Long-document embedding for RAG
    High-throughput text indexing
    Baseline embedding model for new collections
    Cost-efficient retrieval at scale

    Benchmarks

    Dataset              Metric    Score   Source
    MTEB Retrieval (en)  nDCG@10   54.7    Nomic AI, 2025 -- Model Card
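For readers unfamiliar with the metric, here is a minimal sketch of how nDCG@10 is computed for a single query. It assumes linear gain with a log2 rank discount, which is one common variant, and it derives the ideal ordering from the supplied list, so the list is assumed to contain judgments for every relevant document. The function names are illustrative. The reported 54.7 is presumably this value averaged over queries and datasets and scaled to 0-100.

```typescript
// DCG@k with linear gain and log2 discount: sum of rel_i / log2(i + 1)
// over ranks i = 1..k (idx is zero-based, hence idx + 2).
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((acc, rel, idx) => acc + rel / Math.log2(idx + 2), 0);
}

// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
function ndcgAtK(relevances: number[], k: number): number {
  const ideal = [...relevances].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(relevances, k) / idcg;
}

// Hypothetical ranked results for one query: 1 = relevant, 0 = not relevant.
const rankedRelevance: number[] = [1, 0, 1, 0, 0];

console.log(ndcgAtK(rankedRelevance, 10).toFixed(4)); // 0.9197
```

The second relevant result sits at rank 3 instead of rank 2, which is what pulls the score below 1.0.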

    Performance

    Input Size: Up to 8192 tokens
    GPU Latency: ~4 ms / passage (A100)
    GPU Throughput: ~2500 passages/sec (A100, batch 128)
    GPU Memory: ~0.6 GB
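As a rough planning aid, the throughput figure above can be turned into back-of-the-envelope estimates. The sketch below assumes the approximate 2500 passages/sec rate holds for the whole job, that vectors are stored as 32-bit floats, and that index overhead is ignored. The corpus size and helper names are hypothetical.

```typescript
// Approximate A100 throughput quoted in the Performance figures (batch 128).
const PASSAGES_PER_SEC = 2500;

// Seconds needed to embed `numPassages` at a steady throughput.
function indexingSeconds(numPassages: number, passagesPerSec: number = PASSAGES_PER_SEC): number {
  return numPassages / passagesPerSec;
}

// Raw vector storage in bytes, assuming float32 (4 bytes per value)
// and no index or metadata overhead.
function vectorStorageBytes(numVectors: number, dim: number, bytesPerValue: number = 4): number {
  return numVectors * dim * bytesPerValue;
}

const corpusSize = 10_000_000; // hypothetical number of passages

console.log(indexingSeconds(corpusSize));               // 4000 (about 67 minutes)
console.log(vectorStorageBytes(corpusSize, 768) / 1e9); // 30.72 (GB at 768 dims)
console.log(vectorStorageBytes(corpusSize, 256) / 1e9); // 10.24 (GB at 256 dims)
```

Real pipelines also spend time on fetching, parsing, and chunking, so treat these numbers as lower bounds rather than predictions.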

    Specification

    Framework: HF
    Organization: nomic-ai
    Feature: Text Embeddings
    Output: 768-dim vector (256-dim with Matryoshka truncation)
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 149M
    License: Apache-2.0
    Downloads/mo: 2.8M

    Research Paper

    ModernBERT

    arxiv.org

    Build a pipeline with modernbert-embed-base

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
