    HF · Object Detection · apache-2.0

    detr-resnet-50

    by facebook

    End-to-end object detection with Transformers — no anchor boxes needed

    385K downloads/month
    936 likes
    42M params
    Identifiers
    Model ID: facebook/detr-resnet-50
    Feature URI: mixpeek://image_extractor@v1/facebook_detr_r50_v1

    Overview

    DETR (DEtection TRansformer) reimagines object detection as a set prediction problem, using a transformer encoder-decoder architecture to directly output a set of bounding boxes and class labels without the need for hand-designed components like anchor boxes or non-maximum suppression.

    On Mixpeek, DETR extracts structured object annotations from video frames and images, producing bounding boxes with class labels that power attribute-based filtering in retrieval pipelines.
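To make the idea of attribute-based filtering concrete, here is a minimal sketch that keeps only the frames containing a given object label. The `ObjectAnnotation` and `FrameAnnotations` shapes, the `framesWithLabel` helper, and the `minScore` default are all hypothetical and chosen for illustration; they are not the Mixpeek response schema.

```typescript
// Hypothetical annotation shapes for illustration only;
// these are NOT taken from the Mixpeek API.
interface ObjectAnnotation {
  label: string;                           // class label, e.g. "person"
  score: number;                           // confidence in [0, 1]
  bbox: [number, number, number, number];  // [xMin, yMin, xMax, yMax]
}

interface FrameAnnotations {
  frameId: string;
  objects: ObjectAnnotation[];
}

// Keep frames that contain at least one object with the requested label
// whose confidence is at or above minScore (0.5 is an arbitrary default).
function framesWithLabel(
  frames: FrameAnnotations[],
  label: string,
  minScore = 0.5,
): FrameAnnotations[] {
  return frames.filter((frame) =>
    frame.objects.some((obj) => obj.label === label && obj.score >= minScore),
  );
}
```

A retrieval stage could apply a predicate like this before or after ranking, depending on how selective the label is.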

    Architecture

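    ResNet-50 CNN backbone followed by a transformer with 6 encoder and 6 decoder layers. During training, a bipartite matching loss (computed with the Hungarian algorithm) assigns each ground-truth object to exactly one prediction. The decoder processes 100 object queries in parallel, each yielding either a box with a class label or a "no object" prediction.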
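The one-to-one assignment between predictions and ground truth can be illustrated with a small sketch. It makes two simplifying assumptions. First, it uses a brute-force search over assignments instead of the Hungarian algorithm, which is only practical for a handful of objects. Second, its cost combines just the negative class probability and an unweighted L1 box distance, omitting the generalized IoU term and the term weights used in the paper. All type and function names are hypothetical.

```typescript
type Box = [number, number, number, number]; // [cx, cy, w, h], normalized to [0, 1]

interface Prediction {
  classProbs: number[]; // probability per class
  box: Box;
}

interface GroundTruth {
  classId: number;
  box: Box;
}

// Simplified matching cost: lower is better.
// Assumption: no GIoU term and no per-term weights.
function pairCost(pred: Prediction, gt: GroundTruth): number {
  const l1 = pred.box.reduce((sum, v, i) => sum + Math.abs(v - gt.box[i]), 0);
  return -pred.classProbs[gt.classId] + l1;
}

// Brute-force stand-in for the Hungarian algorithm.
// Returns assignment[g] = index of the prediction matched to ground truth g.
// Assumes preds.length >= gts.length; otherwise returns an empty array.
function matchBruteForce(preds: Prediction[], gts: GroundTruth[]): number[] {
  let best: number[] = [];
  let bestCost = Infinity;
  const used = new Array<boolean>(preds.length).fill(false);
  const current: number[] = [];

  const search = (g: number, cost: number): void => {
    if (g === gts.length) {
      // Complete assignment: keep it if it is the cheapest so far.
      if (cost < bestCost) {
        bestCost = cost;
        best = [...current];
      }
      return;
    }
    for (let p = 0; p < preds.length; p++) {
      if (used[p]) continue; // each prediction is matched at most once
      used[p] = true;
      current.push(p);
      search(g + 1, cost + pairCost(preds[p], gts[g]));
      current.pop();
      used[p] = false;
    }
  };

  search(0, 0);
  return best;
}
```

Predictions left unmatched by the assignment are the ones trained toward the "no object" class, which is why duplicate boxes are discouraged without a separate suppression step.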

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // "API_KEY" is a placeholder; substitute your own key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest a video and run DETR object detection as a feature extractor.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/video.mp4" },
      feature_extractors: [{
        name: "object_detection",
        version: "v1",
        params: {
          model_id: "facebook/detr-resnet-50"
        }
      }]
    });

    Capabilities

    • COCO label set out of the box (91 class IDs covering 80 object categories)
    • Bounding box + class label predictions
    • Extensible to panoptic segmentation by adding a mask head (a separate extension, not part of this detection checkpoint)
    • No hand-designed post-processing (NMS-free)
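Because each query already represents at most one object, turning raw model output into detections needs only a softmax, a confidence threshold, and a box conversion. The sketch below assumes the usual DETR output convention: per-query class logits with a trailing "no object" entry, and boxes as normalized `[cx, cy, w, h]`. The `RawQuery` and `Detection` types, the `decodeQueries` name, and the 0.5 default threshold are illustrative assumptions, not part of any SDK.

```typescript
// Assumed raw output for one object query (DETR convention):
// logits = one value per class plus a trailing "no object" class,
// box = [cx, cy, w, h] normalized to [0, 1].
interface RawQuery {
  logits: number[];
  box: [number, number, number, number];
}

interface Detection {
  classId: number;
  score: number;
  box: [number, number, number, number]; // [xMin, yMin, xMax, yMax] in pixels
}

// Numerically stable softmax.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

function decodeQueries(
  queries: RawQuery[],
  width: number,
  height: number,
  threshold = 0.5, // arbitrary default; tune per use case
): Detection[] {
  const detections: Detection[] = [];
  for (const q of queries) {
    // Drop the trailing "no object" probability before picking a class.
    const objectProbs = softmax(q.logits).slice(0, -1);
    let classId = 0;
    for (let i = 1; i < objectProbs.length; i++) {
      if (objectProbs[i] > objectProbs[classId]) classId = i;
    }
    const score = objectProbs[classId];
    if (score < threshold) continue; // low-confidence or "no object" query

    // Convert normalized center format to pixel corner format.
    const [cx, cy, w, h] = q.box;
    detections.push({
      classId,
      score,
      box: [
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
      ],
    });
  }
  return detections;
}
```

Note that no overlap-based suppression appears anywhere in this step; the threshold alone decides which queries become detections.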

    Use Cases on Mixpeek

    • Video surveillance: detect people, vehicles, and other objects in security footage
    • Retail analytics: count and classify products on shelves
    • Content moderation: identify objects for compliance filtering
    • Autonomous driving data: annotate frames with detected objects

    Specification

    Framework: HF
    Organization: facebook
    Feature: Object Detection
    Output: bbox + label
    Modalities: video, image
    Retriever: Object Filter
    Parameters: 42M
    License: apache-2.0
    Downloads/mo: 385K
    Likes: 936

    Research Paper

    End-to-End Object Detection with Transformers

    arxiv.org

    Build a pipeline with detr-resnet-50

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
