rf-detr-base
by roboflow
First real-time detection transformer to break 60 AP on COCO, built on DINOv2
roboflow/rf-detr-base
mixpeek://image_extractor@v1/roboflow_rf_detr_base_v1
Overview
RF-DETR is a real-time object detection architecture developed by Roboflow that combines a DINOv2 vision transformer backbone with deformable DETR decoding. It eliminates traditional detection components such as anchor boxes and NMS. Neural architecture search is used to find encoder-decoder configurations that balance speed and accuracy across model sizes, from Nano (2.3 ms latency) to 2XL (60.1 AP on COCO).
On Mixpeek, RF-DETR Base provides the best speed-accuracy tradeoff for real-time object detection pipelines, processing video frames at over 150 FPS on a GPU while maintaining 53.3 AP on COCO. Its strong fine-tuning transfer makes it well suited to domain-specific detection tasks on both large and small custom datasets.
Architecture
DINOv2 ViT backbone with a deformable attention decoder. 29M parameters. Uses a bipartite matching loss for set prediction. Designed via neural architecture search to optimize the latency-accuracy Pareto frontier. Supports TensorRT FP16 export for production deployment.
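The one-to-one assignment behind a bipartite matching loss can be illustrated with a small sketch. This is not RF-DETR's training code: `Box`, `l1Cost`, and `matchBipartite` are hypothetical names, the cost is a plain L1 box distance (DETR-style matching costs typically also include classification and GIoU terms), and the search is brute force rather than the Hungarian algorithm usually used in practice.

```typescript
// Hypothetical box layout for this sketch: center x/y, width, height, normalized to [0, 1].
interface Box {
  cx: number;
  cy: number;
  w: number;
  h: number;
}

// L1 distance between two boxes; a simplified stand-in for a full matching cost.
function l1Cost(a: Box, b: Box): number {
  return (
    Math.abs(a.cx - b.cx) +
    Math.abs(a.cy - b.cy) +
    Math.abs(a.w - b.w) +
    Math.abs(a.h - b.h)
  );
}

// Brute-force minimum-cost one-to-one assignment of targets to predictions.
// Returns, for each target index, the index of the prediction matched to it.
// Predictions left unmatched would be trained toward the "no object" class.
function matchBipartite(preds: Box[], targets: Box[]): number[] {
  if (targets.length > preds.length) {
    throw new Error("need at least as many predictions as targets");
  }
  let best: number[] = [];
  let bestCost = Infinity;
  const used: boolean[] = preds.map(() => false);
  const current: number[] = [];

  const search = (t: number, cost: number): void => {
    if (cost >= bestCost) return; // prune: cannot beat the best assignment found so far
    if (t === targets.length) {
      bestCost = cost;
      best = [...current];
      return;
    }
    for (let p = 0; p < preds.length; p++) {
      if (used[p]) continue;
      used[p] = true;
      current.push(p);
      search(t + 1, cost + l1Cost(preds[p], targets[t]));
      current.pop();
      used[p] = false;
    }
  };

  search(0, 0);
  return best;
}
```

Because each target is matched to exactly one prediction, the model is never rewarded for emitting duplicates of the same object, which is why this family of detectors does not need NMS at inference time.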
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

await mx.collections.ingest({
  collection_id: "my-collection",
  source: { url: "https://example.com/video.mp4" },
  feature_extractors: [
    {
      name: "object_detection",
      version: "v1",
      params: {
        model_id: "roboflow/rf-detr-base"
      }
    }
  ]
});
Capabilities
- 53.3 AP on COCO val2017 at base size
- Real-time inference at ~6 ms per image (T4, TensorRT FP16)
- DINOv2 backbone enables strong domain transfer
- NMS-free end-to-end detection pipeline
- Scales from Nano (2.3 ms) to 2XL (60.1 AP)
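The per-image latencies above translate into an upper bound on throughput. The sketch below shows the arithmetic; `framesPerSecond` is a hypothetical helper, and it assumes batch size 1 and counts model latency only, ignoring video decoding, preprocessing, and data transfer.

```typescript
// Convert a per-image latency in milliseconds into a best-case frames-per-second figure.
// Assumes one image per forward pass and no other pipeline overhead.
function framesPerSecond(latencyMs: number): number {
  if (latencyMs <= 0) {
    throw new RangeError("latency must be positive");
  }
  return 1000 / latencyMs;
}

console.log(framesPerSecond(6).toFixed(1));   // "166.7" for the ~6 ms Base figure
console.log(framesPerSecond(2.3).toFixed(1)); // "434.8" for the 2.3 ms Nano figure
```

Real pipelines will usually land below these numbers once decoding and I/O are included.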
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| COCO val2017 | AP50:95 | 53.3 | Roboflow, 2025 — RF-DETR Benchmarks |
| COCO val2017 (Large variant) | AP50:95 | 56.5 | Roboflow, 2025 — RF-DETR Benchmarks |
| COCO val2017 (2XL variant) | AP50:95 | 60.1 | Roboflow, 2025 — RF-DETR Benchmarks |
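AP50:95 is the COCO convention of averaging AP over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only the IoU criterion that decides whether a prediction counts as a match at each threshold; it does not compute AP itself, which also requires ranking predictions by confidence and integrating a precision-recall curve. `CornerBox`, `iou`, and `thresholdsPassed` are hypothetical names for this example.

```typescript
// Hypothetical corner-format box for this sketch: top-left (x1, y1), bottom-right (x2, y2).
interface CornerBox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Intersection over union of two axis-aligned boxes.
function iou(a: CornerBox, b: CornerBox): number {
  const iw = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
  const ih = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
  const inter = iw * ih;
  const union =
    (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
  return union > 0 ? inter / union : 0;
}

// The ten IoU thresholds used by AP50:95: 0.50, 0.55, ..., 0.95.
const IOU_THRESHOLDS: number[] = Array.from(
  { length: 10 },
  (_, i) => (50 + 5 * i) / 100
);

// Number of thresholds at which a prediction would count as matching the ground truth.
function thresholdsPassed(pred: CornerBox, truth: CornerBox): number {
  const overlap = iou(pred, truth);
  return IOU_THRESHOLDS.filter((t) => overlap >= t).length;
}
```

For example, a prediction shifted by 20% of the box width against a 10x10 ground-truth box has an IoU of about 0.67 and passes only the four loosest thresholds, which is why AP50:95 rewards tight localization more than AP50 alone.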
Performance
Specification
Research Paper
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
arxiv.org
Build a pipeline with rf-detr-base
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.