# Decomposition

> How Mixpeek turns raw objects into searchable documents and queryable features

<Frame>
  <img src="https://mintcdn.com/mixpeek/TwtTrae3Fi3EFJ72/assets/mixpeek-decomposition.svg?fit=max&auto=format&n=TwtTrae3Fi3EFJ72&q=85&s=4bfa40801fe7d78a68db0bd3f48b49c4" alt="Decomposition: raw objects are split into documents, each with extracted features like embeddings, transcripts, and metadata" width="1200" height="550" data-path="assets/mixpeek-decomposition.svg" />
</Frame>

Decomposition is the core transformation in Mixpeek. A raw file (video, PDF, image, audio) goes in as one **object**. It comes out as many **documents**, each with its own **features**. This is what makes sub-file search possible — you search *within* a video at the segment level, not *for* the video as a whole.

## Three Primitives

| Primitive    | What It Is                                                                             | Role                                                            |
| ------------ | -------------------------------------------------------------------------------------- | --------------------------------------------------------------- |
| **Object**   | Raw file or record in a bucket (video, PDF, JSON row, image).                          | The input boundary. You upload objects.                         |
| **Document** | One row of output in a collection, produced by decomposition.                          | The query boundary. You search documents.                       |
| **Feature**  | A named output attached to a document (embedding, transcript, OCR text, label, score). | The composition boundary. Retrievers reference features by URI. |

The pipeline is always:

```
Object (bucket) → Decomposition → Document (collection) → Features (MVS + MongoDB)
```
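
As a rough illustration of how the three primitives relate, the sketch below models them as plain Python dataclasses. The class names, field names, and the `decompose` helper are hypothetical and chosen for this example only; they are not Mixpeek's SDK types.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A named output attached to a document (hypothetical model)."""
    name: str
    value: object


@dataclass
class Document:
    """One row of output in a collection (hypothetical model)."""
    document_id: str
    root_object_id: str
    features: dict[str, Feature] = field(default_factory=dict)


@dataclass
class BucketObject:
    """A raw file or record stored in a bucket (hypothetical model)."""
    object_id: str
    bucket_id: str


def decompose(obj: BucketObject, n_parts: int) -> list[Document]:
    """Return one document per part, each pointing back at its source object."""
    return [
        Document(document_id=f"{obj.object_id}_doc_{i}", root_object_id=obj.object_id)
        for i in range(n_parts)
    ]


video = BucketObject(object_id="obj_video_123", bucket_id="bkt_videos")
documents = decompose(video, n_parts=3)
documents[0].features["transcript"] = Feature("transcript", "revenue grew 22%")
```

One object fans out into many documents, and features hang off documents rather than off the object, which is why search operates at the document level.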

## What Decomposition Decides

The feature extractor controls *how* an object is decomposed into documents. The strategy depends on the content type:

| Content Type        | Decomposition Strategy                            | Result                                                                           |
| ------------------- | ------------------------------------------------- | -------------------------------------------------------------------------------- |
| **Video**           | Time intervals, scene boundaries, or silence gaps | Each segment = 1 document with visual embedding + transcript + scene description |
| **Audio**           | Silence boundaries or fixed intervals             | Each segment = 1 document with transcript + transcript embedding                 |
| **PDF / Document**  | Page, paragraph, or sentence boundaries           | Each chunk = 1 document with text content + text embedding                       |
| **Image**           | No split (1:1)                                    | 1 image = 1 document with visual embedding + OCR + description                   |
| **Structured data** | Row-level (1:1)                                   | 1 row = 1 document with field-level features                                     |
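
The fixed-interval strategy in the table can be sketched in a few lines of Python. This is a minimal illustration of time-based splitting, not the extractor's actual implementation; the function name `time_segments` and the choice to truncate the final segment are assumptions.

```python
def time_segments(duration_sec: float, interval_sec: float) -> list[tuple[float, float]]:
    """Split [0, duration_sec) into consecutive (start, end) windows.

    The last window is truncated if the duration is not a multiple
    of the interval (an assumption made for this sketch).
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    duration = float(duration_sec)
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + interval_sec, duration)
        segments.append((start, end))
        start = end
    return segments


# A 25-second clip at 10-second intervals yields three windows,
# the last one covering only the remaining 5 seconds.
windows = time_segments(25, 10)
```

Each window would become one document under this strategy. Scene-boundary and silence-gap strategies differ only in how the cut points are chosen.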

## Why It Matters

**Without decomposition**, a 30-minute video is one record. Searching for "the moment the CEO mentions revenue" means scanning the entire video. There's no way to return a specific timestamp.

**With decomposition**, that video becomes \~180 ten-second segments, each with its own transcript embedding, visual embedding, and scene description. A search returns the exact segment at 14:30 where the CEO says "revenue grew 22%."

The same applies to documents: a 200-page PDF chunked by page becomes 200 searchable documents instead of one monolithic record.
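
For the video example, the arithmetic that maps a timestamp to a segment is simple integer division. The sketch below assumes fixed 10-second segments numbered from zero; the helper names are hypothetical.

```python
def parse_timestamp(ts: str) -> int:
    """Convert an 'MM:SS' timestamp into whole seconds."""
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + int(seconds)


def segment_index(ts: str, interval_sec: int = 10) -> int:
    """Zero-based index of the fixed-length segment containing the timestamp."""
    return parse_timestamp(ts) // interval_sec


def segment_bounds(index: int, interval_sec: int = 10) -> tuple[int, int]:
    """Start and end (in seconds) of the segment with the given index."""
    start = index * interval_sec
    return start, start + interval_sec


total_segments = (30 * 60) // 10      # 180 segments in a 30-minute video
idx = segment_index("14:30")          # 870 seconds // 10 -> segment 87
bounds = segment_bounds(idx)          # (870, 880)
```

A search hit on segment 87 therefore points at the ten seconds starting at 14:30, which is the granularity a single-record video cannot provide.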

## Feature URIs

Every feature produced by decomposition gets a URI that uniquely identifies it:

```
mixpeek://multimodal_extractor@v1/vertex_multimodal_embedding
mixpeek://face_identity_extractor@v1/insightface__arcface
mixpeek://my_custom_plugin@1.0.0/domain_embedding
```

Retrievers, taxonomies, and clusters reference features by URI. This is the composition boundary: a single retriever can search `vertex_multimodal_embedding` in one stage and `insightface__arcface` in another, even though they were produced by different extractors.
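
The three example URIs share the shape `mixpeek://<extractor>@<version>/<feature>`. The parser below is a sketch based only on that observed pattern; the grammar is inferred from the examples rather than taken from a published specification, and `FeatureURI` and `parse_feature_uri` are hypothetical names.

```python
import re
from typing import NamedTuple


class FeatureURI(NamedTuple):
    extractor: str
    version: str
    feature: str


# Pattern inferred from the example URIs above (an assumption, not a spec).
_URI_RE = re.compile(
    r"^mixpeek://(?P<extractor>[^@/]+)@(?P<version>[^/]+)/(?P<feature>.+)$"
)


def parse_feature_uri(uri: str) -> FeatureURI:
    """Split a feature URI into extractor name, version, and feature name."""
    match = _URI_RE.match(uri)
    if match is None:
        raise ValueError(f"not a feature URI: {uri!r}")
    return FeatureURI(match["extractor"], match["version"], match["feature"])


parsed = parse_feature_uri("mixpeek://my_custom_plugin@1.0.0/domain_embedding")
# parsed.extractor == "my_custom_plugin", parsed.version == "1.0.0"
```

Keeping extractor, version, and feature name separate makes it straightforward to check that two retriever stages really do point at different extractors.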

## Configuring Decomposition

Decomposition is configured via the `feature_extractor` field on a collection:

```json
{
  "collection_name": "video-library",
  "source": { "type": "bucket", "bucket_ids": ["bkt_videos"] },
  "feature_extractor": {
    "feature_extractor_name": "multimodal_extractor",
    "version": "v1",
    "input_mappings": {
      "video": "video_url"
    },
    "settings": {
      "video_segmentation": {
        "type": "time",
        "interval_sec": 10
      },
      "run_transcription": true,
      "run_scene_description": true
    }
  }
}
```
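
If you generate collection configs programmatically, a small builder keeps the segmentation interval in one place. The sketch below reuses only the field names from the JSON above; the function name `video_collection_config` is hypothetical, and how the payload is submitted to Mixpeek is deliberately left out.

```python
import json


def video_collection_config(
    collection_name: str, bucket_ids: list[str], interval_sec: int = 10
) -> dict:
    """Build a collection config that segments video at fixed time intervals."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return {
        "collection_name": collection_name,
        "source": {"type": "bucket", "bucket_ids": list(bucket_ids)},
        "feature_extractor": {
            "feature_extractor_name": "multimodal_extractor",
            "version": "v1",
            "input_mappings": {"video": "video_url"},
            "settings": {
                "video_segmentation": {"type": "time", "interval_sec": interval_sec},
                "run_transcription": True,
                "run_scene_description": True,
            },
        },
    }


# Serialize to the same JSON shape shown above.
payload = json.dumps(video_collection_config("video-library", ["bkt_videos"]), indent=2)
```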

The `settings` object controls the decomposition strategy. Each extractor has its own settings — see the extractor-specific pages for details:

<CardGroup cols={2}>
  <Card title="From Video" icon="video" href="/processing/extractors/multimodal">
    Time, scene, and silence segmentation strategies
  </Card>

  <Card title="From Images" icon="image" href="/processing/extractors/image">
    Visual embeddings, OCR, and structured extraction
  </Card>

  <Card title="From Audio" icon="headphones" href="/processing/extractors/multimodal">
    Silence-boundary segmentation and transcription
  </Card>

  <Card title="From Documents" icon="file-lines" href="/processing/extractors/document">
    Page, paragraph, and sentence chunking
  </Card>
</CardGroup>

## Multi-Tier Decomposition

When a single extraction pass isn't enough — e.g., you need to transcribe audio *then* embed the transcription *then* classify each chunk — you chain collections. Each tier reads the output of the previous one, forming a DAG:

```
Tier 1: raw video → segments with transcripts
Tier 2: tier-1 docs → text chunks with embeddings
Tier 3: tier-2 docs → classifications per chunk
```

The engine resolves tiers automatically and executes them in dependency order. See [Multi-Tier Feature Extraction](/processing/multi-tier-extractors) for the full guide.
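
Dependency-order execution is a topological sort over the collection graph. The sketch below illustrates the idea with Python's standard `graphlib` module (Python 3.9+). It is not the engine's implementation, and the collection name `col_labels` is made up for the example.

```python
from graphlib import TopologicalSorter


def tier_order(sources: dict[str, list[str]]) -> list[str]:
    """Return collections in an order where every source precedes its consumers.

    `sources` maps each collection to the collections it reads from.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(sources).static_order())


order = tier_order(
    {
        "col_segments": [],               # tier 1: reads raw video from a bucket
        "col_chunks": ["col_segments"],   # tier 2: reads tier-1 documents
        "col_labels": ["col_chunks"],     # tier 3: reads tier-2 documents
    }
)
# order == ["col_segments", "col_chunks", "col_labels"]
```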

## Lineage

Every document tracks its lineage back to the original object:

```json
{
  "root_object_id": "obj_video_123",
  "root_bucket_id": "bkt_marketing",
  "source_collection_id": "col_segments",
  "lineage_path": "bkt_marketing/col_segments/col_chunks"
}
```

This lets you trace any search result back through tiers to the original file. Use the [Lineage API](/api-reference/document-lineage/get-document-lineage) to visualize the decomposition tree.
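
When you already have a document's lineage fields in hand, the path can be unpacked locally. The sketch below assumes `lineage_path` is slash-separated with the bucket first and collections in tier order, as in the single example above; `lineage_chain` is a hypothetical helper, not part of the Lineage API.

```python
def lineage_chain(doc: dict) -> dict:
    """Split a document's lineage_path into its bucket and collection tiers."""
    bucket, *collections = doc["lineage_path"].split("/")
    return {
        "root_object_id": doc["root_object_id"],
        "bucket": bucket,
        "collections": collections,   # ordered from first tier to last
        "depth": len(collections),    # number of tiers the document passed through
    }


doc = {
    "root_object_id": "obj_video_123",
    "root_bucket_id": "bkt_marketing",
    "source_collection_id": "col_segments",
    "lineage_path": "bkt_marketing/col_segments/col_chunks",
}
chain = lineage_chain(doc)
# chain["collections"] == ["col_segments", "col_chunks"]
```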
