sam2.1-hiera-large

by facebook

Unified promptable segmentation for images and video with streaming memory

70K downloads/month

139 likes

224M params


Identifiers

Model ID

facebook/sam2.1-hiera-large

Feature URI

mixpeek://image_extractor@v1/facebook_sam2_large_v1
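
The Feature URI above appears to pack an extractor name, a version, and a feature name into one string. The sketch below splits it into those parts. The layout mixpeek://<extractor>@<version>/<feature> is an assumption inferred from this single example, and parseFeatureUri is a hypothetical helper, not part of the Mixpeek SDK.

```typescript
// Assumed layout, inferred from the one URI shown on this page:
//   mixpeek://<extractor>@<version>/<feature>
interface FeatureUri {
  extractor: string;
  version: string;
  feature: string;
}

// Hypothetical helper: split a feature URI into its three assumed parts.
function parseFeatureUri(uri: string): FeatureUri {
  const match = /^mixpeek:\/\/([^@\/]+)@([^\/]+)\/(.+)$/.exec(uri);
  if (match === null) {
    throw new Error(`Unrecognized feature URI: ${uri}`);
  }
  const [, extractor, version, feature] = match;
  return { extractor, version, feature };
}

const parsed = parseFeatureUri(
  "mixpeek://image_extractor@v1/facebook_sam2_large_v1"
);
console.log(parsed.extractor, parsed.version, parsed.feature);
```

If the real URI scheme allows other shapes (for example extra path segments), the regular expression would need to be adjusted accordingly.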

Overview

SAM 2 extends the Segment Anything Model (SAM) from images to video, using a streaming memory architecture that processes frames as they arrive for real-time use. On images it is reported to be 6x faster than SAM with better accuracy, and it is the first foundation model that segments and tracks objects across video frames from prompts.

On Mixpeek, SAM 2 enables video-native segmentation: tracking objects across frames, segmenting specific items at any point in a video, and extracting per-object features over time.

Architecture

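The model pairs a Hiera image encoder with a streaming memory that supplies temporal context. SAM 2.1 Large has 224.4M parameters and runs at 39.5 FPS on an A100 GPU. Memory attention modules condition each frame on stored memories to propagate masks across frames, so the image encoder runs once per frame rather than being re-run on earlier frames.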
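
To make the streaming idea concrete, here is a toy sketch of a bounded memory bank: each frame is encoded once, combined with context from a few recent frames, and then stored, evicting the oldest entry. Everything in it is an illustrative assumption. The encoder is a placeholder function, "attention" is a plain average, and the names MemoryBank, encodeFrame, and segmentStream are hypothetical rather than taken from the SAM 2 code.

```typescript
// Toy illustration of a streaming memory bank; not the SAM 2 implementation.
type Embedding = number[];

class MemoryBank {
  private readonly entries: Embedding[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Store a frame's memory, evicting the oldest entry when full (FIFO).
  push(memory: Embedding): void {
    this.entries.push(memory);
    if (this.entries.length > this.capacity) {
      this.entries.shift();
    }
  }

  get size(): number {
    return this.entries.length;
  }

  // Stand-in for memory attention: element-wise mean of stored memories.
  context(dim: number): Embedding {
    let sum = new Array<number>(dim).fill(0);
    if (this.entries.length === 0) return sum;
    for (const memory of this.entries) {
      sum = sum.map((v, i) => v + (memory[i] ?? 0));
    }
    return sum.map((v) => v / this.entries.length);
  }
}

let encoderCalls = 0;

// Placeholder for the image encoder; counts how often it is invoked.
function encodeFrame(frame: number[]): Embedding {
  encoderCalls += 1;
  return frame.map((v) => v * 2);
}

function segmentStream(
  frames: number[][],
  capacity: number
): { conditioned: Embedding[]; bankSize: number } {
  const bank = new MemoryBank(capacity);
  const conditioned: Embedding[] = [];
  for (const frame of frames) {
    const features = encodeFrame(frame); // one encoder pass per frame
    const ctx = bank.context(features.length); // temporal context from memory
    const fused = features.map((v, i) => v + (ctx[i] ?? 0));
    conditioned.push(fused);
    bank.push(fused); // earlier frames are reused from memory, never re-encoded
  }
  return { conditioned, bankSize: bank.size };
}

const demo = segmentStream([[1, 0], [0, 1], [1, 1], [2, 2]], 2);
console.log(encoderCalls, demo.bankSize);
```

The point of the design is that cost per frame stays bounded: the memory never grows past its capacity, and each frame triggers exactly one encoder pass. At the quoted 39.5 FPS, that corresponds to a budget of roughly 1000 / 39.5 ≈ 25.3 ms per frame.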

Mixpeek SDK Integration

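import { Mixpeek } from "mixpeek";

// Replace "API_KEY" with your Mixpeek API key.
const mx = new Mixpeek({ apiKey: "API_KEY" });

// Ingest a video into a collection and run SAM 2.1 segmentation on it.
await mx.collections.ingest({
  collection_id: "my-collection",
  source: { url: "https://example.com/video.mp4" },
  // Select the segmentation extractor and pin it to this model.
  feature_extractors: [{
    name: "segmentation",
    version: "v1",
    params: { model_id: "facebook/sam2.1-hiera-large" }
  }]
});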
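
When several videos need the same extractor configuration, it can help to build the request objects in one place. The sketch below does that without importing the SDK. The interfaces mirror only the fields used in the snippet above, so the real SDK types may differ, and buildSegmentationRequests is a hypothetical helper.

```typescript
// Shapes mirror only the fields used in the ingest call above;
// the actual Mixpeek SDK types may differ.
interface FeatureExtractor {
  name: string;
  version: string;
  params: { model_id: string };
}

interface IngestRequest {
  collection_id: string;
  source: { url: string };
  feature_extractors: FeatureExtractor[];
}

// Hypothetical helper: build one ingest request per video URL.
function buildSegmentationRequests(
  collectionId: string,
  urls: string[]
): IngestRequest[] {
  return urls.map((url) => ({
    collection_id: collectionId,
    source: { url },
    feature_extractors: [
      {
        name: "segmentation",
        version: "v1",
        params: { model_id: "facebook/sam2.1-hiera-large" },
      },
    ],
  }));
}

const requests = buildSegmentationRequests("my-collection", [
  "https://example.com/video.mp4",
  "https://example.com/clip.mp4",
]);
console.log(requests.length);
```

Each element could then be passed to mx.collections.ingest as in the snippet above, assuming the method accepts this shape.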