rf-detr-large
by Roboflow
Real-time detection transformer with DINOv2-style visual features
Roboflow/rf-detr-large
mixpeek://image_extractor@v1/roboflow_rf_detr_large_v1
Overview
RF-DETR Large is a real-time detection transformer from Roboflow. It combines a ViT backbone, multi-scale feature fusion, and a deformable DETR-style decoder to produce object boxes without anchor heuristics.
On Mixpeek, RF-DETR Large provides an open-source object detector for pipelines that need high-quality bounding boxes before retrieval, filtering, or agent inspection.
Architecture
End-to-end detection transformer with a DINOv2-with-registers style ViT backbone, RF-DETR windowed attention, a multi-scale projector, deformable cross-attention decoder, and DETR-style object queries trained on COCO 2017.
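DETR-family decoders commonly emit each box as normalized center coordinates (cx, cy, w, h) rather than pixel corners. The sketch below assumes that convention, which this page does not confirm for the Mixpeek extractor's output. The `CenterBox` and `PixelBox` types and the `toPixelBox` helper are hypothetical names introduced for illustration only.

```typescript
// Hypothetical shape: a box in normalized center format, each value in [0, 1].
interface CenterBox {
  cx: number; // box center, x
  cy: number; // box center, y
  w: number;  // box width
  h: number;  // box height
}

// Pixel-space corners: top-left (x1, y1) and bottom-right (x2, y2).
interface PixelBox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Convert a normalized center box to pixel corners, clamped to the image bounds.
function toPixelBox(box: CenterBox, imageWidth: number, imageHeight: number): PixelBox {
  const clamp = (value: number, max: number): number => Math.min(Math.max(value, 0), max);
  return {
    x1: clamp((box.cx - box.w / 2) * imageWidth, imageWidth),
    y1: clamp((box.cy - box.h / 2) * imageHeight, imageHeight),
    x2: clamp((box.cx + box.w / 2) * imageWidth, imageWidth),
    y2: clamp((box.cy + box.h / 2) * imageHeight, imageHeight),
  };
}
```

Under these assumptions, a centered box covering half the width and height of a 640x480 frame maps to corners (160, 120) and (480, 360). Clamping keeps boxes that spill past the frame edge usable for cropping.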
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "camera-frames",
  source: { url: "https://example.com/frame.jpg" },
  feature_extractors: [
    {
      feature: "object_detection",
      model: "Roboflow/rf-detr-large"
    }
  ]
});
Capabilities
- Object detection over the COCO 2017 label space
- Transformer-based box prediction without hand-designed anchors
- Multi-scale feature fusion for small and large objects
- Apache 2.0 license
Use Cases on Mixpeek
Performance
Use detection output as structured metadata for filters and joins
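As a rough illustration of treating detections as structured metadata, the sketch below filters a list of detections by label and confidence, then collapses the result into per-label counts that could back a filter or join. The `Detection` record and both helper functions are hypothetical; the actual output schema of the extractor is not described on this page and may differ.

```typescript
// Hypothetical detection record; the real output schema may differ.
interface Detection {
  label: string;                          // COCO class name, e.g. "person"
  score: number;                          // confidence in [0, 1]
  box: [number, number, number, number];  // x1, y1, x2, y2 in pixels
}

// Keep detections whose label is in `labels` and whose score is at least `minScore`.
function filterDetections(
  detections: Detection[],
  labels: string[],
  minScore: number
): Detection[] {
  return detections.filter(
    (d) => labels.indexOf(d.label) !== -1 && d.score >= minScore
  );
}

// Collapse detections into per-label counts, a compact form for metadata filters.
function countByLabel(detections: Detection[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const d of detections) {
    counts[d.label] = (counts[d.label] ?? 0) + 1;
  }
  return counts;
}
```

Storing counts such as `{ person: 2, car: 1 }` alongside each frame would allow queries like "frames with at least one car" without re-running the detector. The score threshold is a tuning choice that depends on the footage.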
Specification
Research Paper
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
arxiv.org
Build a pipeline with rf-detr-large
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.