sam2.1-hiera-large
by facebook
Unified promptable segmentation for images and video with streaming memory
facebook/sam2.1-hiera-large
mixpeek://image_extractor@v1/facebook_sam2_large_v1
Overview
SAM 2 extends SAM to video with a streaming memory architecture built for real-time processing. On images it is 6x faster than SAM with better accuracy, and it is the first foundation model to segment and track objects across video frames from prompts.
On Mixpeek, SAM 2 enables video-native segmentation: track objects across frames, segment specific items at any point in a video, and extract per-object features over time.
Architecture
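SAM 2.1 Large pairs a Hiera image encoder with a streaming memory that supplies temporal context. The model has 224.4M parameters and runs at 39.5 FPS on an A100. Each frame passes through the image encoder once; memory attention modules then condition that frame's features on memories of earlier frames, so masks propagate across the video without re-running the image encoder on past frames.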
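The streaming idea can be illustrated with a toy sketch. It is not the SAM 2 implementation: frames are reduced to 1-D intensity arrays, the learned memory features are replaced by a single number per frame, and memory attention is replaced by a simple average. The names `MemoryBank`, `propagateMask`, `capacity`, and `tolerance` are hypothetical. What it does show is the control flow: each frame is visited once, the mask is predicted from a bounded memory of earlier frames, and the oldest memory is evicted when the bank is full.

```typescript
// Toy illustration of streaming memory. This is NOT the SAM 2 implementation:
// frames are 1-D intensity arrays and each memory is a single number.

type Frame = number[];
type Mask = boolean[];

interface MemoryEntry {
  frameIndex: number;
  objectIntensity: number; // hypothetical stand-in for a learned memory feature
}

class MemoryBank {
  private entries: MemoryEntry[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Store a new memory and evict the oldest once the bank is full (FIFO).
  push(entry: MemoryEntry): void {
    this.entries.push(entry);
    if (this.entries.length > this.capacity) {
      this.entries.shift();
    }
  }

  // Average the stored memories; a crude stand-in for memory attention.
  reference(): number {
    const sum = this.entries.reduce((acc, e) => acc + e.objectIntensity, 0);
    return sum / this.entries.length;
  }
}

// Mean intensity of the pixels selected by the mask (0 if the mask is empty).
function meanInside(frame: Frame, mask: Mask): number {
  let sum = 0;
  let count = 0;
  frame.forEach((value, i) => {
    if (mask[i]) {
      sum += value;
      count++;
    }
  });
  return count === 0 ? 0 : sum / count;
}

// Propagate a prompt mask given on frame 0 through the remaining frames.
function propagateMask(
  frames: Frame[],
  promptMask: Mask,
  capacity = 2,
  tolerance = 0.2,
): Mask[] {
  const memory = new MemoryBank(capacity);
  const masks: Mask[] = [promptMask];
  memory.push({ frameIndex: 0, objectIntensity: meanInside(frames[0], promptMask) });

  for (let t = 1; t < frames.length; t++) {
    // Each frame is read once; only the bounded memory carries past context.
    const ref = memory.reference();
    const mask = frames[t].map((value) => Math.abs(value - ref) <= tolerance);
    masks.push(mask);
    memory.push({ frameIndex: t, objectIntensity: meanInside(frames[t], mask) });
  }
  return masks;
}

// A bright two-pixel object sliding right across three frames.
const frames: Frame[] = [
  [0.9, 0.9, 0.1, 0.1],
  [0.1, 0.9, 0.9, 0.1],
  [0.1, 0.1, 0.9, 0.9],
];
const masks = propagateMask(frames, [true, true, false, false]);
console.log(masks);
```

Because the bank has a fixed capacity, the memory cost in this sketch stays constant no matter how long the video is, which is the property a streaming design is after.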
Mixpeek SDK Integration
```typescript
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

// Managed: create a collection over a bucket; Mixpeek runs this model's extractor
const collection = await mx.collections.create({
  namespace_id: "my-namespace",
  collection_name: "my-collection",
  source: { type: "bucket", bucket_ids: ["bkt_your_bucket"] },
  feature_extractor: {
    feature_extractor_name: "segmentation",
    version: "v1",
    parameters: { model_id: "facebook/sam2.1-hiera-large" },
  },
});
```
Capabilities
- Video object segmentation and tracking
- 6x faster than SAM on images
- Streaming memory architecture for real-time video
- Multi-object tracking with mask propagation
- Image segmentation with improved accuracy
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| SA-V (video seg.) | J&F | 79.5 | Ravi et al., 2024 — Table 1 |
| DAVIS 2017 (val) | J&F | 82.0 | Ravi et al., 2024 — Table 2 |
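For readers unfamiliar with the metric, J&F is the mean of region similarity J (the Jaccard index, or intersection over union, between predicted and ground-truth masks) and contour accuracy F (a boundary F-measure). The sketch below computes J on tiny binary masks and averages it with an F value. The F value here is a made-up placeholder, since computing boundary accuracy is beyond a short example, and the helper names `regionSimilarity` and `jAndF` are hypothetical. Returning 1 when both masks are empty is a convention assumed for this sketch.

```typescript
type BinaryMask = boolean[][];

// Region similarity J: intersection over union of two same-sized binary masks.
function regionSimilarity(pred: BinaryMask, truth: BinaryMask): number {
  let intersection = 0;
  let union = 0;
  for (let y = 0; y < truth.length; y++) {
    for (let x = 0; x < truth[y].length; x++) {
      const p = pred[y][x];
      const t = truth[y][x];
      if (p && t) intersection++;
      if (p || t) union++;
    }
  }
  // Assumed convention: two empty masks count as a perfect match.
  return union === 0 ? 1 : intersection / union;
}

// J&F is the arithmetic mean of the two scores.
function jAndF(j: number, f: number): number {
  return (j + f) / 2;
}

const pred: BinaryMask = [
  [true, true, false],
  [false, true, false],
];
const truth: BinaryMask = [
  [true, false, false],
  [false, true, true],
];

const j = regionSimilarity(pred, truth); // 2 shared pixels / 4 in the union = 0.5
const f = 0.7; // hypothetical contour accuracy, not computed here
console.log(`J = ${j}, J&F = ${jAndF(j, f).toFixed(2)}`);
```

Benchmark tables usually report this value averaged over objects and sequences and scaled to a percentage, which is how the scores above should be read.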
Performance
Streaming architecture: video frames are processed sequentially, with a memory of earlier frames carried forward.
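As a rough back-of-the-envelope sketch, the 39.5 FPS figure quoted in the Architecture section can be turned into a per-frame budget and a processing-time estimate. This assumes the figure is model-only throughput on an A100 and ignores video decoding, I/O, and any pipeline overhead, so real end-to-end times will differ. The function names and the 60-second, 30 fps clip are hypothetical.

```typescript
// Throughput quoted on this page for SAM 2.1 Large on an A100.
const MODEL_FPS = 39.5;

// Time budget the model spends on one frame, in milliseconds.
function msPerFrame(modelFps: number): number {
  if (modelFps <= 0) throw new RangeError("modelFps must be positive");
  return 1000 / modelFps;
}

// Estimated model time for a clip processed frame by frame, in seconds.
function estimateSeconds(durationSeconds: number, videoFps: number, modelFps: number): number {
  if (modelFps <= 0) throw new RangeError("modelFps must be positive");
  const frameCount = Math.round(durationSeconds * videoFps);
  return frameCount / modelFps;
}

// Values above 1 mean the model keeps up with the video's native frame rate.
function realtimeFactor(videoFps: number, modelFps: number): number {
  if (videoFps <= 0) throw new RangeError("videoFps must be positive");
  return modelFps / videoFps;
}

// Hypothetical clip: 60 seconds at 30 fps, i.e. 1800 frames.
const perFrame = msPerFrame(MODEL_FPS);
const total = estimateSeconds(60, 30, MODEL_FPS);
const factor = realtimeFactor(30, MODEL_FPS);

console.log(`${perFrame.toFixed(1)} ms per frame`); // about 25.3 ms
console.log(`${total.toFixed(1)} s for the clip`); // about 45.6 s
console.log(`${factor.toFixed(2)}x real time`); // about 1.32x
```

Under these assumptions the model runs somewhat faster than a 30 fps source plays back, which is what makes sequential, frame-by-frame processing viable for video.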
Specification
Research Paper
SAM 2: Segment Anything in Images and Videos
arxiv.org
Build a pipeline with sam2.1-hiera-large
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.