    HF · Object Detection · apache-2.0

    detr-resnet-50

    by facebook

    End-to-end object detection with Transformers, no anchor boxes needed

    385K downloads/month
    936 likes
    42M params
    Identifiers
    Model ID
    facebook/detr-resnet-50
    Feature URI
    mixpeek://image_extractor@v1/facebook_detr_r50_v1

    Overview

    DETR (DEtection TRansformer) reimagines object detection as a set prediction problem, using a transformer encoder-decoder architecture to directly output a set of bounding boxes and class labels without the need for hand-designed components like anchor boxes or non-maximum suppression.

    On Mixpeek, DETR extracts structured object annotations from video frames and images, producing bounding boxes with class labels that power attribute-based filtering in retrieval pipelines.
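    As a rough illustration of attribute-based filtering over detection output, the sketch below keeps only frames that contain a given class above a confidence cutoff. The `Detection` and `FrameAnnotations` shapes, the field names, and the default cutoff are all assumptions made for this example; they are not Mixpeek's actual response schema.

```typescript
// Hypothetical annotation shapes; field names are assumptions, not Mixpeek's schema.
interface Detection {
  label: string; // class name, e.g. "person"
  score: number; // confidence in [0, 1]
  box: [number, number, number, number]; // [xMin, yMin, xMax, yMax] in pixels
}

interface FrameAnnotations {
  frameId: string;
  detections: Detection[];
}

// Keep frames containing at least `minCount` objects of `label`
// whose confidence is at least `minScore`.
function filterFrames(
  frames: FrameAnnotations[],
  label: string,
  minScore = 0.7,
  minCount = 1,
): FrameAnnotations[] {
  return frames.filter(
    (frame) =>
      frame.detections.filter((d) => d.label === label && d.score >= minScore)
        .length >= minCount,
  );
}

// Made-up sample data for illustration only.
const sampleFrames: FrameAnnotations[] = [
  {
    frameId: "f1",
    detections: [
      { label: "person", score: 0.95, box: [10, 20, 110, 220] },
      { label: "car", score: 0.88, box: [200, 50, 400, 180] },
    ],
  },
  {
    frameId: "f2",
    detections: [{ label: "person", score: 0.4, box: [5, 5, 50, 90] }],
  },
  { frameId: "f3", detections: [] },
];

const framesWithPeople = filterFrames(sampleFrames, "person");
console.log(framesWithPeople.map((f) => f.frameId));
```

    In a real pipeline this kind of predicate would more likely run inside the retrieval stage than in client code; the sketch only shows the logic.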

    Architecture

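    ResNet-50 CNN backbone followed by a transformer with 6 encoder layers and 6 decoder layers. During training, a bipartite matching loss (solved with the Hungarian algorithm) assigns each ground-truth object to exactly one prediction. The decoder processes 100 object queries in parallel, and each query yields either a box with a class label or a "no object" prediction.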
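    The one-to-one assignment used during training can be sketched on a toy example. The code below is not the Hungarian algorithm: it finds the same minimum-cost assignment by exhaustive search, which is only practical for a handful of boxes. The cost is also simplified to class probability plus L1 box distance (the generalized IoU term of the DETR matching cost is omitted), and `boxWeight` and all sample values are illustrative assumptions.

```typescript
type Box = [number, number, number, number]; // normalized [cx, cy, w, h]

interface Prediction {
  classProbs: number[]; // probability per class
  box: Box;
}

interface GroundTruth {
  classId: number;
  box: Box;
}

// Simplified matching cost: -p(class) + boxWeight * L1 distance (no GIoU term).
function matchCost(pred: Prediction, gt: GroundTruth, boxWeight = 5): number {
  const l1 = pred.box.reduce((sum, v, i) => sum + Math.abs(v - gt.box[i]), 0);
  return -pred.classProbs[gt.classId] + boxWeight * l1;
}

// Exhaustive minimum-cost one-to-one assignment (a stand-in for the Hungarian
// algorithm). Returns result[j] = index of the prediction matched to targets[j].
// Assumes targets.length <= preds.length; otherwise returns [].
function matchBruteForce(preds: Prediction[], targets: GroundTruth[]): number[] {
  let best: number[] = [];
  let bestCost = Infinity;
  const used = new Array<boolean>(preds.length).fill(false);
  const current: number[] = [];

  const search = (j: number, cost: number): void => {
    if (j === targets.length) {
      if (cost < bestCost) {
        bestCost = cost;
        best = [...current];
      }
      return;
    }
    for (let i = 0; i < preds.length; i++) {
      if (used[i]) continue;
      used[i] = true;
      current.push(i);
      search(j + 1, cost + matchCost(preds[i], targets[j]));
      current.pop();
      used[i] = false;
    }
  };

  search(0, 0);
  return best;
}

// Toy data: three predictions, two ground-truth objects.
const preds: Prediction[] = [
  { classProbs: [0.1, 0.9], box: [0.5, 0.5, 0.2, 0.2] },
  { classProbs: [0.8, 0.2], box: [0.2, 0.2, 0.1, 0.1] },
  { classProbs: [0.5, 0.5], box: [0.9, 0.9, 0.1, 0.1] },
];
const targets: GroundTruth[] = [
  { classId: 0, box: [0.2, 0.2, 0.1, 0.1] },
  { classId: 1, box: [0.5, 0.5, 0.2, 0.2] },
];

const assignment = matchBruteForce(preds, targets);
console.log(assignment); // prediction index chosen for each target
```

    Predictions left unmatched (index 2 in this toy data) are the ones trained toward the "no object" class.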

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";
    
    // Create the client; replace API_KEY with your own key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });
    
    // Ingest a video and run DETR object detection on it.
    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/video.mp4" },
      feature_extractors: [{
        name: "object_detection",
        version: "v1",
        params: {
          model_id: "facebook/detr-resnet-50"
        }
      }]
    });

    Capabilities

    • COCO label set out of the box (91 class IDs, of which 80 are object categories in use)
    • Bounding box + class label predictions
    • Panoptic segmentation when extended with a mask head
    • No hand-designed post-processing (NMS-free)
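    To make the NMS-free claim concrete, here is a minimal sketch of how raw per-query outputs could be turned into labeled boxes: softmax over the class logits, drop the trailing "no object" entry, threshold the best score, and convert the normalized center-format box to pixel corners. The `QueryOutput` shape, the function names, and the 0.9 threshold are assumptions for illustration, not Mixpeek's or any library's API.

```typescript
// Hypothetical raw output for one object query.
interface QueryOutput {
  logits: number[]; // length = numClasses + 1; last entry is "no object"
  box: [number, number, number, number]; // normalized [cx, cy, w, h]
}

interface DecodedBox {
  classId: number;
  score: number;
  box: [number, number, number, number]; // [xMin, yMin, xMax, yMax] in pixels
}

// Numerically stable softmax.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

// Decode every query independently; no non-maximum suppression step is applied.
function decodeQueries(
  queries: QueryOutput[],
  imageWidth: number,
  imageHeight: number,
  threshold = 0.9, // illustrative cutoff
): DecodedBox[] {
  const results: DecodedBox[] = [];
  for (const q of queries) {
    const probs = softmax(q.logits).slice(0, -1); // drop the "no object" class
    const score = Math.max(...probs);
    if (score < threshold) continue;
    const classId = probs.indexOf(score);
    const [cx, cy, w, h] = q.box;
    results.push({
      classId,
      score,
      box: [
        (cx - w / 2) * imageWidth,
        (cy - h / 2) * imageHeight,
        (cx + w / 2) * imageWidth,
        (cy + h / 2) * imageHeight,
      ],
    });
  }
  return results;
}

// Toy input with 2 classes plus "no object": the second query is background.
const sampleQueries: QueryOutput[] = [
  { logits: [5, 0, 0], box: [0.5, 0.5, 0.5, 0.5] },
  { logits: [0, 0, 5], box: [0.1, 0.1, 0.1, 0.1] },
];

console.log(decodeQueries(sampleQueries, 200, 100));
```

    Because training matches each object to a single query, duplicate boxes are discouraged by the loss itself, which is why a score threshold is the only filtering in this sketch.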

    Use Cases on Mixpeek

    Video surveillance: detect people, vehicles, and objects in security footage
    Retail analytics: count and classify products on shelves
    Content moderation: identify objects for compliance filtering
    Autonomous driving data: annotate frames with detected objects

    Specification

    Framework: HF
    Organization: facebook
    Feature: Object Detection
    Output: bbox + label
    Modalities: video, image
    Retriever: Object Filter
    Parameters: 42M
    License: apache-2.0
    Downloads/mo: 385K
    Likes: 936

    Research Paper

    End-to-End Object Detection with Transformers

    arxiv.org

    Build a pipeline with detr-resnet-50

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
