    HF · Segmentation · MIT

    BiRefNet

    by ZhengPeng7

    High-resolution foreground segmentation for object masks and visual evidence cleanup

    824K downloads/month
    0.2B-class parameters
    Identifiers
    Model ID
    ZhengPeng7/BiRefNet
    Feature URI
    mixpeek://image_extractor@v1/zhengpeng7_birefnet_v1

    Overview

    BiRefNet is the official checkpoint for Bilateral Reference for High-Resolution Dichotomous Image Segmentation. It targets foreground/background masks, salient object segmentation, and related cases where the useful evidence is an object region rather than the whole image.

    On Mixpeek, BiRefNet can turn images or sampled video frames into mask metadata. Agents can use those masks to filter frames with clear foreground objects, crop objects before embedding, or remove distracting background before downstream OCR, detection, captioning, or similarity search.
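To illustrate the "crop objects before embedding" step, here is a minimal sketch that derives a crop box from a mask. It assumes the mask is available as a row-major array of per-pixel foreground values in [0, 1]; the `Mask` and `Box` shapes, the `foregroundBox` name, and the 0.5 threshold are hypothetical choices for illustration, not part of the Mixpeek SDK or the BiRefNet checkpoint.

```typescript
// Hypothetical mask layout (an assumption): one value per pixel in [0, 1], row-major.
interface Mask {
  width: number;
  height: number;
  data: Float32Array;
}

interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns the tightest box around pixels at or above `threshold`,
// or null when no pixel qualifies.
function foregroundBox(mask: Mask, threshold = 0.5): Box | null {
  let minX = mask.width;
  let minY = mask.height;
  let maxX = -1;
  let maxY = -1;
  for (let y = 0; y < mask.height; y++) {
    for (let x = 0; x < mask.width; x++) {
      const value = mask.data[y * mask.width + x] ?? 0;
      if (value >= threshold) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (maxX < 0) return null; // no foreground pixel found
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}
```

The returned box can be passed to whatever image library performs the crop; a `null` result is a natural signal to skip the image or embed it uncropped.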

    Architecture

    BiRefNet is an image-segmentation model for high-resolution dichotomous (foreground versus background) segmentation. The Hugging Face card lists Transformers support through AutoModelForImageSegmentation, MIT licensing, and tags for background removal, mask generation, camouflaged object detection, and salient object detection.

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "foreground-index",
      source: { url: "s3://product-media/images/" },
      feature_extractors: [{
        feature: "segmentation",
        model: "ZhengPeng7/BiRefNet",
        params: {
          output_masks: true,
          store_crops: true
        }
      }]
    });

    Capabilities

    • Foreground/background mask generation
    • High-resolution dichotomous image segmentation
    • Background removal and object isolation
    • Useful pre-processing for embeddings, OCR, and VLM captioning
    • MIT license
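Background removal from a mask usually amounts to writing the mask into the image's alpha channel. The sketch below shows that step under stated assumptions: pixels are RGBA bytes in row-major order, the mask holds one foreground value in [0, 1] per pixel, and the `applyMaskAsAlpha` helper is a hypothetical name, not an API of Mixpeek or BiRefNet.

```typescript
// Scales each pixel's alpha by its mask value so background pixels become transparent.
// Assumes `rgba` holds 4 bytes per pixel and `mask` holds one value in [0, 1] per pixel.
function applyMaskAsAlpha(rgba: Uint8ClampedArray, mask: Float32Array): Uint8ClampedArray {
  if (rgba.length !== mask.length * 4) {
    throw new Error("mask and image sizes differ");
  }
  const out = new Uint8ClampedArray(rgba); // copy, so the input stays untouched
  for (let i = 0; i < mask.length; i++) {
    const m = Math.min(1, Math.max(0, mask[i] ?? 0)); // clamp to [0, 1]
    out[i * 4 + 3] = Math.round((rgba[i * 4 + 3] ?? 0) * m);
  }
  return out;
}
```

Keeping the soft mask values, rather than thresholding them first, preserves partial transparency along object edges.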

    Use Cases on Mixpeek

    • Index object masks for visual search and filtering
    • Crop foreground products before embedding or captioning
    • Clean screenshots and image evidence before OCR
    • Find frames where the foreground object occupies enough of the scene
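The last use case reduces to a coverage ratio: the share of pixels the mask marks as foreground. A minimal sketch follows, assuming each frame's mask is a flat array of per-pixel values in [0, 1]; the `FrameMask` shape, the function names, and the thresholds are hypothetical and chosen only for illustration.

```typescript
// Hypothetical per-frame record (an assumption, not a Mixpeek type).
interface FrameMask {
  frameId: string;
  data: Float32Array; // per-pixel foreground value in [0, 1]
}

// Fraction of pixels at or above `threshold`; 0 for an empty mask.
function foregroundCoverage(data: Float32Array, threshold = 0.5): number {
  if (data.length === 0) return 0;
  let count = 0;
  for (let i = 0; i < data.length; i++) {
    if ((data[i] ?? 0) >= threshold) count++;
  }
  return count / data.length;
}

// Keeps the IDs of frames whose foreground covers at least `minCoverage` of the image.
function framesWithCoverage(frames: FrameMask[], minCoverage: number): string[] {
  return frames
    .filter((frame) => foregroundCoverage(frame.data) >= minCoverage)
    .map((frame) => frame.frameId);
}
```

For example, `framesWithCoverage(frames, 0.2)` would keep only frames where the object fills at least a fifth of the scene; the right cutoff depends on the content and is best tuned on sample data.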

    Benchmarks

    Dataset | Metric | Score | Source
    Hugging Face | Monthly downloads | 824K | HF model metadata, June 2026
    BiRefNet task coverage | Segmentation tags | DIS, camouflaged, salient object | BiRefNet model card

    Performance

    Input Size: Images or sampled video frames
    GPU Latency: Input dependent
    GPU Throughput: Batch dependent
    GPU Memory: Model dependent

    Run BiRefNet before visual embedding when foreground isolation improves retrieval quality.

    Specification

    Framework: HF
    Organization: ZhengPeng7
    Feature: Segmentation
    Output: mask + label
    Modalities: video, image
    Retriever: Mask Filter
    Parameters: 0.2B class
    License: MIT
    Downloads/mo: 824K

    Research Paper

    Bilateral Reference for High-Resolution Dichotomous Image Segmentation

    arxiv.org

    Build a pipeline with BiRefNet

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
