

Decomposition

Raw objects are split into documents, each with extracted features like embeddings, transcripts, and metadata.

Decomposition is the core transformation in Mixpeek. A raw file (video, PDF, image, audio) goes in as one object. It comes out as many documents, each with its own features. This is what makes sub-file search possible — you search within a video at the segment level, not for the video as a whole.

Three Primitives

| Primitive | What It Is | Role |
| --- | --- | --- |
| Object | Raw file or record in a bucket (video, PDF, JSON row, image). | The input boundary. You upload objects. |
| Document | One row of output in a collection, produced by decomposition. | The query boundary. You search documents. |
| Feature | A named output attached to a document (embedding, transcript, OCR text, label, score). | The composition boundary. Retrievers reference features by URI. |
The pipeline is always:
Object (bucket) → Decomposition → Document (collection) → Features (MVS + MongoDB)
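To make the pipeline concrete, here is a minimal sketch of the shapes involved. The field names are illustrative (borrowed from examples later on this page), not a verbatim API response:

```python
# Illustrative shapes only: field names mirror the examples on this page
# (root_object_id, feature URIs) and are not a verbatim API response.

# One object in...
video_object = {
    "object_id": "obj_video_123",
    "bucket_id": "bkt_videos",
    "payload": {"video_url": "s3://bucket/keynote.mp4"},
}

# ...many documents out, each carrying its own features keyed by URI.
documents = [
    {
        "document_id": f"doc_{i}",
        "root_object_id": video_object["object_id"],
        "start_sec": i * 10,
        "end_sec": (i + 1) * 10,
        "features": {
            "mixpeek://multimodal_extractor@v1/multimodal_embedding": [0.12, -0.07],  # truncated vector
            "mixpeek://multimodal_extractor@v1/transcript": "segment transcript text",
        },
    }
    for i in range(3)
]
```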

What Decomposition Decides

The feature extractor controls how an object is decomposed into documents. The strategy depends on the content type:
| Content Type | Decomposition Strategy | Result |
| --- | --- | --- |
| Video | Time intervals, scene boundaries, or silence gaps | Each segment = 1 document with visual embedding + transcript + scene description |
| Audio | Silence boundaries or fixed intervals | Each segment = 1 document with transcript + transcript embedding |
| PDF / Document | Page, paragraph, or sentence boundaries | Each chunk = 1 document with text content + text embedding |
| Image | No split (1:1) | 1 image = 1 document with visual embedding + OCR + description |
| Structured data | Row-level (1:1) | 1 row = 1 document with field-level features |
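Each strategy reduces to boundary logic over the raw content. As a rough sketch (not Mixpeek's implementation), fixed-interval video segmentation looks like this:

```python
def time_segments(duration_sec: float, interval_sec: float = 10.0):
    """Yield (start, end) boundaries for fixed-interval segmentation.

    Each boundary pair becomes one document after feature extraction.
    Sketch only; the real extractor also supports scene and silence boundaries.
    """
    start = 0.0
    while start < duration_sec:
        end = min(start + interval_sec, duration_sec)
        yield (start, end)
        start = end

# A 30-minute video at 10-second intervals decomposes into 180 documents.
assert len(list(time_segments(30 * 60))) == 180
```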

Why It Matters

Without decomposition, a 30-minute video is one record. Searching for “the moment the CEO mentions revenue” means scanning the entire video. There’s no way to return a specific timestamp. With decomposition, that video becomes ~180 ten-second segments, each with its own transcript embedding, visual embedding, and scene description. A search returns the exact segment at 14:30 where the CEO says “revenue grew 22%.” The same applies to documents: a 200-page PDF becomes 200 searchable chunks instead of one monolithic record.
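The timestamp falls out of the segmentation arithmetic: at 10-second intervals, 14:30 is 870 seconds in, which is segment 87. A minimal sketch, assuming fixed intervals:

```python
def segment_start_timestamp(segment_index: int, interval_sec: int = 10) -> str:
    """Map a segment index back to its start time, assuming fixed intervals."""
    start = segment_index * interval_sec
    return f"{start // 60}:{start % 60:02d}"

# With 10-second segments, the hit at 14:30 is segment 87 (870 seconds in).
assert segment_start_timestamp(87) == "14:30"
```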

Feature URIs

Every feature produced by decomposition gets a URI that uniquely identifies it:
mixpeek://multimodal_extractor@v1/multimodal_embedding
mixpeek://face_identity_extractor@v1/face_embedding
mixpeek://my_custom_plugin@1.0.0/domain_embedding
Retrievers, taxonomies, and clusters reference features by URI. This is the composition boundary — you can build a retriever that searches multimodal_embedding in one stage and face_embedding in another, even though they were produced by different extractors.
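For example, a two-stage retriever might reference both URIs. The stage schema below is an assumption for illustration only (consult the retriever docs for the real shape); the URIs are the ones shown above:

```python
# Hypothetical retriever definition: the "stages" shape and field names are
# assumptions for illustration, not the documented retriever schema.
retriever = {
    "retriever_name": "people-in-scenes",
    "stages": [
        {
            # Stage 1: semantic search over the general multimodal embedding.
            "feature_uri": "mixpeek://multimodal_extractor@v1/multimodal_embedding",
            "query": "CEO discussing revenue on stage",
        },
        {
            # Stage 2: narrow the candidates by face identity.
            "feature_uri": "mixpeek://face_identity_extractor@v1/face_embedding",
            "query": "person:ceo_jane_doe",
        },
    ],
}
```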

Configuring Decomposition

Decomposition is configured via the feature_extractor field on a collection:
{
  "collection_name": "video-library",
  "source": { "type": "bucket", "bucket_id": "bkt_videos" },
  "feature_extractor": {
    "feature_extractor_name": "multimodal_extractor",
    "version": "v1",
    "input_mappings": {
      "video": "payload.video_url"
    },
    "settings": {
      "video_segmentation": {
        "type": "time",
        "interval_sec": 10
      },
      "run_transcription": true,
      "run_scene_description": true
    }
  }
}
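Creating the collection is then one API call. A minimal sketch in Python; the endpoint path and auth header are assumptions, so verify them against the API reference:

```python
import requests

# Same payload as the JSON config above.
collection_config = {
    "collection_name": "video-library",
    "source": {"type": "bucket", "bucket_id": "bkt_videos"},
    "feature_extractor": {
        "feature_extractor_name": "multimodal_extractor",
        "version": "v1",
        "input_mappings": {"video": "payload.video_url"},
        "settings": {
            "video_segmentation": {"type": "time", "interval_sec": 10},
            "run_transcription": True,
            "run_scene_description": True,
        },
    },
}

resp = requests.post(
    "https://api.mixpeek.com/v1/collections",  # assumed route; check the API reference
    headers={"Authorization": "Bearer <API_KEY>"},  # assumed auth scheme
    json=collection_config,
)
resp.raise_for_status()
print(resp.json())
```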
The settings object controls the decomposition strategy. Each extractor has its own settings — see the extractor-specific pages for details:

- From Video: time, scene, and silence segmentation strategies
- From Images: visual embeddings, OCR, and structured extraction
- From Audio: silence-boundary segmentation and transcription
- From Documents: page, paragraph, and sentence chunking

Multi-Tier Decomposition

When a single extraction pass isn't enough (for example: transcribe the audio, then embed the transcript, then classify each chunk), you chain collections. Each tier reads the output of the previous one, forming a DAG:
Tier 1: raw video → segments with transcripts
Tier 2: tier-1 docs → text chunks with embeddings
Tier 3: tier-2 docs → classifications per chunk
The engine resolves tiers automatically and executes them in dependency order. See Multi-Tier Feature Extraction for the full guide.
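Concretely, a tier is a collection whose source points at another collection instead of a bucket. A hedged sketch of a tier-2 config, where the source type, extractor name, and input path are assumptions that mirror the bucket-sourced example above:

```python
# Tier 2 reads tier-1 documents (video segments with transcripts) and chunks
# the transcript text. The "collection" source type, extractor name, and
# input path are assumptions for illustration; verify against the
# Multi-Tier Feature Extraction guide.
tier2_config = {
    "collection_name": "transcript-chunks",
    "source": {"type": "collection", "collection_id": "col_segments"},
    "feature_extractor": {
        "feature_extractor_name": "text_extractor",  # hypothetical extractor name
        "version": "v1",
        "input_mappings": {"text": "features.transcript"},  # assumed path to the tier-1 transcript
        "settings": {"chunking": {"type": "paragraph"}},
    },
}
```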

Lineage

Every document tracks its lineage back to the original object:
{
  "root_object_id": "obj_video_123",
  "root_bucket_id": "bkt_marketing",
  "source_collection_id": "col_segments",
  "lineage_path": "bkt_marketing/col_segments/col_chunks"
}
This lets you trace any search result back through tiers to the original file. Use the Lineage API to visualize the decomposition tree.
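Since lineage_path is slash-delimited, you can also walk it client-side. A minimal sketch, assuming the path format shown above:

```python
def lineage_chain(doc: dict) -> list[str]:
    """Return the decomposition chain for a document, root bucket first.

    Assumes lineage_path is slash-delimited, as in the example above.
    """
    return doc["lineage_path"].split("/")

doc = {
    "root_object_id": "obj_video_123",
    "root_bucket_id": "bkt_marketing",
    "source_collection_id": "col_segments",
    "lineage_path": "bkt_marketing/col_segments/col_chunks",
}
assert lineage_chain(doc) == ["bkt_marketing", "col_segments", "col_chunks"]
```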