DepthPro
by apple
Zero-shot metric monocular depth estimation with sharp boundaries in under a second
apple/DepthPro
mixpeek://image_extractor@v1/apple_depthpro_v1
Overview
DepthPro is Apple's foundation model for zero-shot metric monocular depth estimation, producing 2.25-megapixel depth maps (1536x1536) in 0.3 seconds on a V100 GPU. Unlike relative depth models, DepthPro predicts absolute metric depth without requiring camera intrinsics, and includes a built-in focal length estimator. Its multi-scale ViT architecture with a shared DINOv2 encoder and DPT-like fusion stage preserves sharp object boundaries.
On Mixpeek, DepthPro enables metric-accurate spatial understanding of images and video frames, powering use cases like 3D scene reconstruction, spatial filtering in retrieval, and depth-aware content organization.
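As a minimal sketch of how a metric depth map and an estimated focal length could feed 3D reconstruction or depth-based filtering, the following back-projects each pixel through a pinhole camera model. The function names, the nested-list depth format, and the centred principal point are illustrative assumptions, not part of DepthPro or the Mixpeek SDK.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject(depth: List[List[float]], focal_px: float) -> List[Point]:
    """Convert a metric depth map (metres) into camera-space 3D points.

    Assumes a pinhole camera with square pixels and the principal point
    at the image centre (both are assumptions made for this sketch).
    """
    height = len(depth)
    width = len(depth[0])
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            # Pinhole model: X = (u - cx) * Z / f, Y = (v - cy) * Z / f
            x = (u - cx) * z / focal_px
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points


def within_range(points: List[Point], near_m: float, far_m: float) -> List[Point]:
    """Keep only points whose depth lies in [near_m, far_m] metres."""
    return [p for p in points if near_m <= p[2] <= far_m]
```

Because the depth values carry absolute scale, the resulting coordinates are in metres, so a filter such as `within_range(points, 0.0, 3.0)` expresses a real-world distance rather than a relative ordering.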
Architecture
DepthPro uses a multi-scale Vision Transformer: a shared DINOv2 encoder processes image patches at multiple resolutions, and a DPT-like fusion stage merges and upsamples the resulting features for dense prediction. A built-in head estimates focal length. The model outputs 1536x1536 metric depth maps with absolute scale.
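To make the multi-resolution patching concrete, here is a small sketch that computes where overlapping square patches would sit at each scale. The patch size of 384 pixels, the three scales, and the overlap ratios are assumptions chosen for illustration; this page does not state them, and the helper names are hypothetical.

```python
import math
from typing import List, Tuple


def patch_origins(image_size: int, patch_size: int, overlap_ratio: float) -> List[int]:
    """Top-left offsets of sliding-window patches along one image axis."""
    stride = int(patch_size * (1 - overlap_ratio))
    steps = math.ceil((image_size - patch_size) / stride) + 1
    return [i * stride for i in range(steps)]


def count_patches(scales: List[Tuple[int, float]], patch_size: int) -> int:
    """Total number of square patches across all scales (same grid on both axes)."""
    total = 0
    for image_size, overlap_ratio in scales:
        per_axis = len(patch_origins(image_size, patch_size, overlap_ratio))
        total += per_axis * per_axis
    return total


# Assumed pyramid: (image side in pixels, overlap ratio between neighbouring patches)
ASSUMED_SCALES = [(1536, 0.25), (768, 0.5), (384, 0.0)]
ASSUMED_PATCH_SIZE = 384

if __name__ == "__main__":
    for size, overlap in ASSUMED_SCALES:
        print(size, patch_origins(size, ASSUMED_PATCH_SIZE, overlap))
    print("total patches:", count_patches(ASSUMED_SCALES, ASSUMED_PATCH_SIZE))
```

Under these assumed settings the grids are 5x5, 3x3, and 1x1, so one encoder with shared weights would process 35 patches per image. Overlap between neighbouring patches is a common way to avoid visible seams when the per-patch features are merged.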
Mixpeek SDK Integration
    from mixpeek import Mixpeek

    mx = Mixpeek(api_key="YOUR_KEY")

    mx.ingest(
        collection_id="real-estate-photos",
        source="s3://listings/",
        extractors=[
            {
                "type": "depth_estimation",
                "model": "apple/DepthPro",
                "output_feature": "metric_depth",
            }
        ],
    )
Capabilities
- Zero-shot metric depth (absolute scale, no camera intrinsics needed)
- 2.25-megapixel output (1536x1536) in 0.3 seconds on a V100 GPU
- Sharp boundary preservation via multi-scale architecture
- Built-in focal length estimation from a single image
- State-of-the-art boundary accuracy, as reported in the Depth Pro paper
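An estimated focal length is usually expressed in pixels, which relates to the horizontal field of view through the standard pinhole relation f = 0.5 * W / tan(0.5 * FOV). The sketch below shows that conversion in both directions; the function names are hypothetical, and how DepthPro represents its focal length estimate internally is not specified on this page.

```python
import math


def focal_px_from_fov(width_px: int, fov_deg: float) -> float:
    """Focal length in pixels for a given image width and horizontal FOV."""
    return 0.5 * width_px / math.tan(0.5 * math.radians(fov_deg))


def fov_from_focal_px(width_px: int, focal_px: float) -> float:
    """Horizontal field of view in degrees for a given focal length in pixels."""
    return math.degrees(2.0 * math.atan(0.5 * width_px / focal_px))
```

For example, a 1536-pixel-wide image with a 90 degree horizontal field of view corresponds to a focal length of about 768 pixels, since tan(45 degrees) is 1.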
Use Cases on Mixpeek
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| NYUv2 | AbsRel | 0.036 | Bochkovskii et al., 2024 — Depth Pro paper |
| KITTI | AbsRel | 0.039 | Bochkovskii et al., 2024 — Depth Pro paper |
| Boundary F1 | F1 (depth edges) | State-of-the-art | Bochkovskii et al., 2024 — Depth Pro paper |
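For readers unfamiliar with the AbsRel column, it is the mean absolute relative error between predicted and ground-truth depth, where lower is better. The following is a minimal sketch of the usual definition; the flat-list input format and the rule of skipping pixels without ground truth are assumptions, and the exact evaluation protocol behind the numbers above (valid-pixel masks, depth caps) may differ.

```python
from typing import Sequence


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean of |pred - gt| / gt over pixels with valid ground truth (gt > 0)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    # Pixels with gt <= 0 are treated as missing ground truth and skipped.
    errors = [abs(p - g) / g for p, g in zip(pred, gt) if g > 0]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    return sum(errors) / len(errors)
```

Read this way, an AbsRel of 0.036 means predictions deviate from the true depth by 3.6% on average.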
Performance
Specification
Research Paper
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
arxiv.org
Build a pipeline with DepthPro
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.