Every Mixpeek document carries the full lineage chain that produced it, from the original bucket object through every transformation. This guide shows the four patterns for navigating that chain efficiently from the API.
Background on the lineage data model: see Documents → Lineage. The TL;DR is that each document has _internal.lineage with root_object_id, root_bucket_id, source_document_id, and a chain array recording every processing step.

When to use what

| Pattern | Use case | Round-trips |
| --- | --- | --- |
| ?expand=parent | “Show me this scene and its source frame on one page” | 1 |
| ?expand=root_object | “Show me this document with the original video metadata” | 1 |
| ?expand=ancestors | “Show me the full pipeline that produced this document” | 1 |
| ?expand=children | “Show me all the segments derived from this scene” | 1 |
| GET /documents/{id}/ancestors | Same as expand=ancestors but returns only the chain | 1 |
| GET /documents/{id}/descendants | Same as expand=children but returns only the children | 1 |
| from_object filter | “Search across everything derived from this video” | 1 (no GET first) |
The shared rule: never use a list response to grab IDs and then issue per-document GETs. The expand parameter takes a comma-separated list, so a single request fetches the document plus everything you need from its lineage tree.

$expand keywords

Lineage-aware $expand keywords resolve relative to a document’s own _internal.lineage block. They land under _expanded.<keyword> in the response, matching the existing user-field expand shape.
The parent keyword resolves to the single document referenced by _internal.lineage.source_document_id.
cURL
curl "$API/v1/collections/$COL/documents/$DOC?expand=parent" \
  -H "Authorization: Bearer $API_KEY" \
  -H "X-Namespace: $NS"
Python
client.documents.get(
    collection_identifier="col_scenes",
    document_id="doc_scene_42",
    expand="parent",
)
# Response includes:
# response._expanded.parent — the upstream frame document
For a tier-0 document (created directly from a bucket object), parent is absent — there’s no upstream document. Use root_object instead.
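The tier-0 fallback can be handled in client code once the document has been fetched with expand=parent,root_object. The sketch below is illustrative only: it assumes the response has been parsed into a plain dict with an _expanded block, and upstream_of is a hypothetical helper, not part of the Mixpeek SDK. The field names inside the sample documents are placeholders.

```python
from typing import Any, Optional


def upstream_of(doc: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the expanded parent, or the root object for a tier-0 document.

    Assumes `doc` is the parsed JSON of a document fetched with
    expand=parent,root_object (hypothetical helper, not an SDK call).
    """
    expanded = doc.get("_expanded") or {}
    parent = expanded.get("parent")
    if parent is not None:
        return parent
    # Tier-0 documents have no upstream document, so fall back to the object.
    return expanded.get("root_object")


# Placeholder payloads shaped like the `_expanded` block described above.
scene = {
    "_expanded": {
        "parent": {"document_id": "doc_frame_7"},
        "root_object": {"object_id": "obj_video_123"},
    }
}
tier0 = {"_expanded": {"root_object": {"object_id": "obj_video_123"}}}

print(upstream_of(scene))
print(upstream_of(tier0))
```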
You can request multiple keywords in one call by comma-separating them:
curl "$API/v1/collections/$COL/documents/$DOC?expand=parent,root_object,children"
The same expand is accepted by POST /documents/list (in the request body) and by retriever response shaping — the document GET endpoint is just the simplest demonstration.
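If you build request URLs by hand, a small helper keeps the comma-separated expand value consistent. This is a minimal sketch using only the Python standard library; document_url is a hypothetical name and https://api.example.com is a placeholder base URL, while the path layout follows the cURL example above.

```python
from urllib.parse import quote, urlencode


def document_url(api_base, collection, document_id, expand=()):
    """Build a document GET URL with an optional comma-separated expand list."""
    path = (
        f"{api_base.rstrip('/')}/v1/collections/{quote(collection, safe='')}"
        f"/documents/{quote(document_id, safe='')}"
    )
    if not expand:
        return path
    # Drop duplicate keywords while keeping the caller's order.
    unique = list(dict.fromkeys(expand))
    # Keep commas literal so the value stays readable: expand=a,b,c
    return f"{path}?{urlencode({'expand': ','.join(unique)}, safe=',')}"


print(
    document_url(
        "https://api.example.com",  # placeholder base URL
        "col_scenes",
        "doc_scene_42",
        expand=["parent", "root_object", "children"],
    )
)
```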

Convenience endpoints

For SDKs and UIs that want only the lineage walk without fetching the document itself, use the dedicated endpoints:
# Returns the chain root → parent (excludes the document itself)
GET /v1/collections/{collection_identifier}/documents/{document_id}/ancestors

# Returns direct depth=1 children (max 100)
GET /v1/collections/{collection_identifier}/documents/{document_id}/descendants
Both endpoints return a List<DocumentResponse> with the same shape as GET /documents/{id} per element.
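Because /ancestors returns the chain ordered root → parent and leaves out the document itself, a UI breadcrumb is a matter of appending the current ID. The sketch below assumes each element exposes a document_id key once parsed into a dict; lineage_path is a hypothetical helper.

```python
def lineage_path(ancestors, document_id, separator=" > "):
    """Join ancestor IDs (root first) and the current document into a breadcrumb.

    `ancestors` is assumed to be the parsed /ancestors response: a list of
    dicts ordered root -> parent, each with a `document_id` key.
    """
    ids = [ancestor["document_id"] for ancestor in ancestors]
    return separator.join(ids + [document_id])


# Placeholder IDs for illustration.
chain = [{"document_id": "doc_video_1"}, {"document_id": "doc_frame_7"}]
print(lineage_path(chain, "doc_scene_42"))
```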

Filter aliases

When you want to search the lineage tree (find every document derived from one root), use the filter aliases instead of expand. They work in document list endpoints, retriever filter stages, and aggregations.
| Alias | Resolves to | Use for |
| --- | --- | --- |
| from_object | _internal.lineage.root_object_id | Everything derived from this bucket object |
| from_bucket | _internal.lineage.root_bucket_id | Everything derived from this bucket |
| from_document | _internal.lineage.source_document_id | Direct children of one upstream document |
| from_collection | _internal.lineage.source_collection_id | Documents whose immediate parent was in this collection |
// "Show me all scene documents in col_scenes that came from this video"
{
  "AND": [
    { "field": "from_object", "operator": "eq", "value": "obj_video_123" }
  ]
}
// "Direct children of one specific frame document"
{
  "AND": [
    { "field": "from_document", "operator": "eq", "value": "doc_frame_42" }
  ]
}
These aliases are equivalent to the underscore-prefixed paths (_internal.lineage.*) — they exist purely so you don’t have to learn the internal schema. Mix them freely with normal user fields:
{
  "AND": [
    { "field": "from_object", "operator": "eq", "value": "obj_video_123" },
    { "field": "metadata.scene_score", "operator": "gte", "value": 0.8 }
  ]
}
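Filters like the ones above can be assembled programmatically. The following sketch is one possible way to do it: lineage_filter is a hypothetical helper (not an SDK function) that checks the alias against the four documented names and produces the same AND structure shown in the JSON examples.

```python
# The four documented aliases and the internal paths they resolve to.
LINEAGE_ALIASES = {
    "from_object": "_internal.lineage.root_object_id",
    "from_bucket": "_internal.lineage.root_bucket_id",
    "from_document": "_internal.lineage.source_document_id",
    "from_collection": "_internal.lineage.source_collection_id",
}


def lineage_filter(alias, value, *extra_conditions):
    """Build an AND filter on one lineage alias plus optional extra conditions."""
    if alias not in LINEAGE_ALIASES:
        raise ValueError(f"unknown lineage alias: {alias!r}")
    return {
        "AND": [
            {"field": alias, "operator": "eq", "value": value},
            *extra_conditions,
        ]
    }


# Same filter as the mixed example above.
high_scoring_scenes = lineage_filter(
    "from_object",
    "obj_video_123",
    {"field": "metadata.scene_score", "operator": "gte", "value": 0.8},
)
print(high_scoring_scenes)
```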

End-to-end example: decomposition tree

To render a decomposition tree for one bucket object — every document at every tier that descended from it — make one filtered list call per collection in the namespace using from_object. The result is already structured by collection, and each document’s _internal.lineage.chain tells you where to draw the edges.
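def decomposition_tree(client, namespace, root_object_id):
    """Map collection_id -> documents derived from root_object_id."""
    docs_by_collection = {}
    # One filtered list call per collection in the namespace.
    for collection in client.collections.list(namespace=namespace):
        docs = client.documents.list(
            collection.collection_id,
            filters={
                "AND": [
                    {"field": "from_object", "operator": "eq", "value": root_object_id}
                ]
            },
        )
        # Skip collections that hold nothing derived from this object.
        if docs:
            docs_by_collection[collection.collection_id] = docs
    return docs_by_collection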
For a deeper materialized view (the chain edges with parent/child resolved inline), use the dedicated decomposition tree endpoint:
GET /v1/buckets/{bucket_id}/objects/{object_id}/decomposition-tree
That endpoint pre-joins everything in one call and is what the Studio namespace detail page uses to draw lineage diagrams.
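If you stay with the per-collection listing instead, the edges can be derived client-side. The sketch below is an assumption-laden illustration: it treats each document as a dict with a document_id key, and it draws edges from _internal.lineage.source_document_id (falling back to root_object_id for tier-0 documents) rather than parsing the chain array, whose exact entry shape is not described here. lineage_edges is a hypothetical helper.

```python
def lineage_edges(docs_by_collection):
    """Return (parent_id, child_id) pairs for every listed document.

    `docs_by_collection` maps collection_id -> list of document dicts, as
    returned by a per-collection from_object listing (shape assumed).
    """
    edges = []
    for docs in docs_by_collection.values():
        for doc in docs:
            lineage = doc.get("_internal", {}).get("lineage", {})
            # Tier-0 documents have no upstream document; link them to the object.
            parent_id = lineage.get("source_document_id") or lineage.get("root_object_id")
            if parent_id:
                edges.append((parent_id, doc["document_id"]))
    return edges


# Placeholder documents for illustration.
sample = {
    "col_frames": [
        {"document_id": "doc_frame_7",
         "_internal": {"lineage": {"root_object_id": "obj_video_123"}}},
    ],
    "col_scenes": [
        {"document_id": "doc_scene_42",
         "_internal": {"lineage": {"root_object_id": "obj_video_123",
                                   "source_document_id": "doc_frame_7"}}},
    ],
}
print(lineage_edges(sample))
```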

Limits & caveats

  • Maximum 50 unique user-field references per expand request. This limit doesn’t apply to lineage keywords, which are bounded by chain length for ancestors and by the children cap for children.
  • expand=children is capped at 100 children per parent. For deeper traversal or wider fan-out, fall back to a from_document filter.
  • Recursive expansion is not supported: expand=parent resolves one level. To walk further, use expand=ancestors (full chain) or call /ancestors then re-expand from there.
  • Lineage is immutable provenance. If an ancestor is deleted, its document_id reference in the chain remains. The ancestors expand silently skips unresolved references — never returns null slots — but client code should still be ready for shorter-than-expected chains.
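The children cap above suggests a simple client-side guard: when an expanded children list comes back at the cap, treat it as possibly truncated and switch to a from_document filter. This is a sketch under the assumption that the parsed response is a dict with document_id and _expanded.children; children_fallback_filter is a hypothetical helper, and a list of exactly 100 may of course be complete.

```python
CHILDREN_CAP = 100  # documented cap for expand=children


def children_fallback_filter(doc):
    """Return a from_document filter if the expanded children may be truncated.

    Returns None when the children list is below the cap, meaning the
    expansion already contains every direct child.
    """
    children = (doc.get("_expanded") or {}).get("children") or []
    if len(children) < CHILDREN_CAP:
        return None
    return {
        "AND": [
            {"field": "from_document", "operator": "eq", "value": doc["document_id"]}
        ]
    }


# Placeholder document with a full page of children.
busy_doc = {
    "document_id": "doc_frame_42",
    "_expanded": {"children": [{"document_id": f"doc_seg_{i}"} for i in range(100)]},
}
print(children_fallback_filter(busy_doc))
```

The returned filter can then be passed to a document list call and paged through as usual.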