4D-RGPT-8B
by nvidia
8B video model for region-grounded 3D and 4D reasoning
nvidia/4D-RGPT-8B
mixpeek://video_extractor@v1/nvidia_4d_rgpt_8b_v1
Overview
4D-RGPT-8B is an NVIDIA video-text model focused on region grounding, 3D reasoning, and 4D reasoning. These capabilities matter when an agent needs more than a clip-level summary: it needs to know which region changed, where the object moved, and how the event evolved over time.
On Mixpeek, 4D-RGPT can enrich video indexes with region-grounded temporal evidence. It is a fit for robotics footage, surveillance review, sports clips, and operational video where the retrieval result must preserve spatial and temporal context.
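To make "spatial and temporal context" concrete, here is a minimal sketch of how region-grounded evidence could be represented and filtered by time window. The `RegionEvidence` type, its field names, and the `evidenceInWindow` helper are hypothetical and chosen for illustration only; they are not the model's output format or a Mixpeek schema.

```typescript
// Hypothetical shape for one piece of region-grounded evidence.
// Field names are illustrative, not a Mixpeek or 4D-RGPT schema.
interface RegionEvidence {
  label: string; // what happens in the region
  startSec: number; // start of the time span, in seconds
  endSec: number; // end of the time span, in seconds
  box: [number, number, number, number]; // normalized [x, y, width, height] in 0..1
}

// Keep only evidence whose time span overlaps the queried window.
function evidenceInWindow(
  items: RegionEvidence[],
  fromSec: number,
  toSec: number,
): RegionEvidence[] {
  return items.filter((e) => e.startSec < toSec && e.endSec > fromSec);
}

// Made-up sample data for the sketch.
const evidence: RegionEvidence[] = [
  { label: "forklift enters aisle", startSec: 2, endSec: 6, box: [0.1, 0.4, 0.2, 0.3] },
  { label: "pallet dropped", startSec: 12, endSec: 14, box: [0.5, 0.6, 0.15, 0.2] },
];

const hits = evidenceInWindow(evidence, 5, 10);
console.log(hits.map((e) => e.label));
```

Keeping the bounding box and time span on each record is what lets a retrieval result point back to where and when something happened, rather than only to the clip as a whole.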
Architecture
4D-RGPT-8B is a video-text-to-text model based on NVILA-Lite-8B. The Hugging Face metadata tags it for video understanding, region grounding, 3D reasoning, 4D reasoning, and perceptual distillation.
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

// Managed: create a collection over a bucket; Mixpeek runs this model's extractor
const collection = await mx.collections.create({
  namespace_id: "my-namespace",
  collection_name: "my-collection",
  source: { type: "bucket", bucket_ids: ["bkt_your_bucket"] },
  feature_extractor: {
    feature_extractor_name: "s3",
    version: "v1",
    parameters: { model_id: "mixpeek://video_extractor@v1/nvidia_4d_rgpt_8b_v1" },
  },
});
Capabilities
- Region-grounded video understanding
- 3D and 4D reasoning over spatial-temporal evidence
- Video-text-to-text analysis for agent perception loops
- Designed for grounding objects and events through time
Use Cases on Mixpeek
Performance
Region-grounded video reasoning cost depends heavily on clip length and frame sampling.
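As a rough illustration of why clip length and frame sampling dominate, the sketch below estimates how many frames a clip contributes at a given sampling rate. It assumes uniform sampling and that cost grows roughly with the number of sampled frames; the source states neither, and `sampledFrameCount` is a hypothetical helper, not part of any SDK.

```typescript
// Rough count of frames a clip contributes under uniform sampling.
// Assumption: frames are sampled at a fixed rate across the whole clip.
function sampledFrameCount(clipSeconds: number, samplesPerSecond: number): number {
  if (clipSeconds < 0 || samplesPerSecond <= 0) {
    throw new RangeError("clipSeconds must be >= 0 and samplesPerSecond must be > 0");
  }
  return Math.ceil(clipSeconds * samplesPerSecond);
}

const shortClip = sampledFrameCount(30, 1); // 30-second clip at 1 frame per second
const longClip = sampledFrameCount(600, 2); // 10-minute clip at 2 frames per second
console.log({ shortClip, longClip });
```

Under these assumptions the 10-minute clip yields 40 times as many frames as the 30-second one, so trimming clips or lowering the sampling rate are the obvious levers to try first.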
Common Pipeline Companions
Specification
Research Paper
4D-RGPT
arxiv.org
Build a pipeline with 4D-RGPT-8B
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.