    HF · Text Embeddings · Apache 2.0

    LateOn-Code

    by lightonai

    ColBERT-style late interaction model purpose-built for code retrieval

    Downloads/month: N/A
    Parameters: 149M
    Identifiers
    Model ID: lightonai/LateOn-Code
    Feature URI: mixpeek://text_extractor@v1/lighton_lateon_code_v1

    Overview

    LateOn-Code is a late-interaction (ColBERT-style) embedding model specifically designed for code retrieval. Unlike dense single-vector code embedders that compress an entire function into one vector, LateOn-Code preserves token-level information through multi-vector representations — enabling precise matching of variable names, API calls, and code patterns.

    On Mixpeek, LateOn-Code enables semantic code search across codebases — find functions by describing what they do, locate similar implementations across repositories, or search for specific API usage patterns. The 149M parameter size makes it deployable alongside heavier models without significant overhead.

    Architecture

    LateOn-Code uses a ModernBERT backbone (149M parameters) fine-tuned for ColBERT-style multi-vector output. It produces one embedding per token, and these per-token embeddings are compared at query time through late-interaction scoring. A 17M-parameter edge variant (LateOn-Code-edge) is also available and is reported to outperform models 3x its size.
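To make late-interaction scoring concrete, here is a minimal sketch of the MaxSim rule commonly used by ColBERT-style models: each query token is matched to its most similar document token, and the per-token maxima are summed. The function names and the toy 2-dimensional vectors are illustrative assumptions, not part of the LateOn-Code or Mixpeek APIs, and the model's actual scoring pipeline may differ in details such as normalization.

```typescript
// Late-interaction (MaxSim) scoring sketch.
// All names are hypothetical; this is not the LateOn-Code or Mixpeek API.
type TokenEmbeddings = number[][]; // one vector per token

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// L2-normalize so that the dot product equals cosine similarity.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(dot(v, v));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// For each query token, keep its best-matching document token, then sum.
function maxSimScore(query: TokenEmbeddings, doc: TokenEmbeddings): number {
  if (doc.length === 0) return 0; // assumption: an empty document scores 0
  const q = query.map(normalize);
  const d = doc.map(normalize);
  let score = 0;
  for (const qv of q) {
    let best = -Infinity;
    for (const dv of d) best = Math.max(best, dot(qv, dv));
    score += best;
  }
  return score;
}

// Toy 2-dimensional vectors (the model itself emits 128 dims per token).
const query: TokenEmbeddings = [[1, 0], [0, 1]];
const docA: TokenEmbeddings = [[1, 0], [0, 2], [1, 1]]; // covers both query tokens
const docB: TokenEmbeddings = [[1, 0], [-1, 0]]; // covers only the first

const scoreA = maxSimScore(query, docA);
const scoreB = maxSimScore(query, docB);
console.log({ scoreA, scoreB });
```

Because every query token contributes its own best match, a document that contains the exact identifier or API call named in the query can outscore one that is only topically similar. A single pooled vector cannot make that distinction as directly.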

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "codebase",
      source: { url: "https://example.com/repo.tar.gz" },
      feature_extractors: [{
        feature: "code_embedding",
        model: "lightonai/LateOn-Code"
      }]
    });

    Capabilities

    • SOTA code retrieval on MTEB Code v1 (74.12 nDCG avg)
    • ColBERT-style token-level matching for precise code search
    • Understands variable names, API calls, and code structure
    • 149M params — lightweight enough for real-time search
    • Apache 2.0 license

    Use Cases on Mixpeek

    • Semantic code search: find functions by natural language description
    • Code duplication detection: locate similar implementations across repos
    • API usage discovery: search for specific library or framework patterns
    • Code review assistance: find related code for context during reviews

    Benchmarks

    Dataset | Metric | Score | Source
    MTEB Code v1 | nDCG avg | 74.12 | LightOn, 2026
    BEIR | nDCG@10 | >57.0 | LightOn, 2026
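For readers unfamiliar with the metric, the sketch below shows one common way to compute nDCG@k for a single query. It assumes linear gain (`rel / log2(rank + 1)`) and assumes the input list contains every relevant item, so the ideal ranking can be derived by sorting it. Benchmark suites may use a different gain function, and they typically report the value scaled by 100 and averaged over queries and tasks. The function names are illustrative.

```typescript
// nDCG@k sketch with linear gain. `relevances[i]` is the graded relevance
// of the result at 0-based rank i. Names are hypothetical.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

function ndcgAtK(relevances: number[], k: number): number {
  // Ideal ordering: the same judgments sorted from most to least relevant.
  const ideal = [...relevances].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(relevances, k) / idcg;
}

const perfect = ndcgAtK([1, 1, 0], 10); // relevant items ranked first
const imperfect = ndcgAtK([1, 0, 1], 10); // one relevant item pushed to rank 3
console.log({ perfect, imperfect });
```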

    Performance

    Input Size: Code (up to 8192 tokens)
    Embedding Dim: 128 per token (multi-vector)
    GPU Latency: ~8 ms / function (A100)
    GPU Throughput: ~125 functions/sec (A100)
    GPU Memory: ~0.6 GB
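The figures above can be related with some back-of-the-envelope arithmetic. The sketch below derives throughput from the per-function latency and estimates the raw storage needed for a multi-vector index. The 2-byte (float16) storage width, the corpus size, and the average token count are assumptions chosen for illustration, and the throughput estimate assumes sequential processing without batching.

```typescript
// Capacity-planning sketch. Values marked "from table" come from the
// Performance list; everything else is a labeled assumption.
const EMBEDDING_DIM = 128; // from table: 128 dims per token
const LATENCY_MS_PER_FUNCTION = 8; // from table: ~8 ms / function (A100)
const BYTES_PER_VALUE = 2; // assumption: float16 storage

// Sequential throughput implied by a fixed per-item latency.
function throughputPerSecond(latencyMs: number): number {
  return 1000 / latencyMs;
}

// Raw index size: one EMBEDDING_DIM-wide vector per token per function.
function indexBytes(numFunctions: number, avgTokensPerFunction: number): number {
  return numFunctions * avgTokensPerFunction * EMBEDDING_DIM * BYTES_PER_VALUE;
}

const throughput = throughputPerSecond(LATENCY_MS_PER_FUNCTION);

// Hypothetical corpus: 100,000 functions averaging 200 tokens each.
const bytes = indexBytes(100000, 200);
const gib = bytes / Math.pow(1024, 3);
console.log({ throughput, bytes, gib });
```

Under these assumptions the latency and throughput rows agree (1000 / 8 = 125 functions per second), and the hypothetical corpus needs about 5.12 GB of raw vector storage before any compression or quantization. Multi-vector indexes grow with token count rather than document count, which is worth factoring into sizing.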

    Specification

    Framework: HF
    Organization: lightonai
    Feature: Text Embeddings
    Output: 1024-dim vector
    Modalities: document, audio
    Retriever: Text Similarity
    Parameters: 149M
    License: Apache 2.0
    Downloads/mo: N/A

    Research Paper

    LateOn-Code: ColBERT for Code Retrieval

    arxiv.org

    Build a pipeline with LateOn-Code

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
