    HF | Visual Embeddings | Apache 2.0

    ReMatch-3B

    by FireRedTeam

    Multimodal retriever trained with generative matching for stronger query-item alignment

    35 downloads/month
    3B params
    Identifiers
    Model ID: FireRedTeam/ReMatch-3B
    Feature URI: mixpeek://image_extractor@v1/fireredteam_rematch_3b_v1

    Overview

    ReMatch turns a multimodal LLM into a retrieval model by adding a chat-style generative matching objective. Instead of relying only on contrastive pairs, it teaches the model to reason about whether a query and candidate match, then distills that signal into retrieval embeddings.

    On Mixpeek, ReMatch is relevant for agent retrieval when queries are specific, compositional, or visual-textual, such as finding a frame where a person is doing one action while an object appears in a certain place.

    Architecture

    A 3B-parameter multimodal retriever that uses learnable representation tokens and is trained with a generative matching objective. According to the model card, it supports English and Chinese.

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "evidence-library",
      source: { url: "https://example.com/camera-stills/" },
      feature_extractors: [{
        feature: "multimodal_embedding",
        model: "FireRedTeam/ReMatch-3B"
      }]
    });

    Capabilities

    • Multimodal retrieval from image and text inputs
    • Generative matching objective for hard query-candidate pairs
    • Single-vector retrieval path with richer alignment than plain contrastive training
    • Apache 2.0 license
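
    The single-vector retrieval path listed above can be illustrated with a small sketch: each item is one embedding, and a query is ranked against items by cosine similarity. This is a minimal illustration, not Mixpeek's or ReMatch's actual implementation; the names `cosineSimilarity` and `topK` are hypothetical, and short arrays stand in for real embeddings.

```typescript
// Hypothetical helpers for illustration only; not part of the Mixpeek SDK.
type Scored = { id: string; score: number };

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item against the query and keep the k best, highest first.
function topK(
  query: number[],
  items: { id: string; vector: number[] }[],
  k: number
): Scored[] {
  return items
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

    In practice a vector index would replace the exhaustive scan in `topK`; the brute-force version only shows what "single-vector" ranking means.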

    Use Cases on Mixpeek

    • Agent search for visually specific evidence
    • Image and page retrieval where query wording is compositional
    • Second-stage retrieval after a broader dense model
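
    The second-stage use case can be sketched as a two-pass ranking: a broader dense model shortlists candidates, then ReMatch embeddings re-rank only that shortlist. The sketch below is an assumption-laden illustration, not a Mixpeek pipeline definition. The `Candidate` shape, `twoStageSearch`, and the idea that both embeddings are precomputed and unit-normalized (so a dot product acts as cosine similarity) are all hypothetical.

```typescript
// Hypothetical data shape: each item carries two precomputed embeddings.
interface Candidate {
  id: string;
  broadVector: number[];   // from a broader, cheaper dense model (assumed)
  rematchVector: number[]; // from ReMatch-3B (assumed precomputed)
}

// Dot product; equals cosine similarity when inputs are unit-normalized.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Sort a copy of the items by descending score and keep the first `keep`.
function rankBy(
  items: Candidate[],
  score: (c: Candidate) => number,
  keep: number
): Candidate[] {
  return [...items].sort((x, y) => score(y) - score(x)).slice(0, keep);
}

// Stage 1 shortlists with the broad model; stage 2 re-ranks with ReMatch.
function twoStageSearch(
  broadQuery: number[],
  rematchQuery: number[],
  corpus: Candidate[],
  shortlistSize: number,
  k: number
): string[] {
  const shortlist = rankBy(corpus, (c) => dot(broadQuery, c.broadVector), shortlistSize);
  return rankBy(shortlist, (c) => dot(rematchQuery, c.rematchVector), k).map((c) => c.id);
}
```

    Note the trade-off this design implies: an item the first stage drops can never be recovered by the second stage, so the shortlist size bounds recall.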

    Benchmarks

    Dataset              | Metric | Score    | Source
    CVPR 2026 model card | Status | Accepted | Hugging Face model card

    Specification

    Framework: HF
    Organization: FireRedTeam
    Feature: Visual Embeddings
    Output: 768-dim vector
    Modalities: video, image
    Retriever: Vector Search
    Parameters: 3B
    License: Apache 2.0
    Downloads/mo: 35
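
    As a rough sizing aid for the 768-dim output listed above, the raw vector footprint can be estimated with simple arithmetic. This sketch assumes uncompressed float32 storage (4 bytes per dimension), which is an assumption and not something the specification states; the function names are hypothetical, and real indexes add overhead or apply quantization.

```typescript
// Assumption: vectors are stored as uncompressed float32 values.
const REMATCH_DIMS = 768;
const BYTES_PER_FLOAT32 = 4;

// Raw bytes needed to hold `vectorCount` embeddings, excluding index overhead.
function rawIndexBytes(vectorCount: number, dims: number = REMATCH_DIMS): number {
  return vectorCount * dims * BYTES_PER_FLOAT32;
}

// Convert a byte count to gibibytes (1 GiB = 1024^3 bytes).
function toGiB(bytes: number): number {
  return bytes / 1024 ** 3;
}
```

    Under that assumption, one vector takes 3,072 bytes, so one million vectors need about 2.86 GiB before any index structures are added.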

    Research Paper

    ReMatch: Boosting Representation through Matching for Multimodal Retrieval

    arxiv.org

    Build a pipeline with ReMatch-3B

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
