POST /v1/collections/{collection_identifier}/trigger
Trigger Collection Processing
curl --request POST \
  --url https://api.mixpeek.com/v1/collections/{collection_identifier}/trigger \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "include_buckets": [
    "<string>"
  ],
  "include_collections": [
    "<string>"
  ],
  "object_ids": [
    "<string>"
  ],
  "source_filters": {
    "AND": [
      {
        "field": "status",
        "operator": "eq",
        "value": "pending"
      }
    ]
  }
}
'

{
  "batch_id": "<string>",
  "task_id": "<string>",
  "collection_id": "<string>",
  "total_tiers": 123,
  "message": "<string>",
  "source_bucket_ids": [
    "<string>"
  ],
  "source_collection_ids": [
    "<string>"
  ],
  "object_count": 123,
  "document_count": 123
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

collection_identifier
string
required

The ID or name of the collection to trigger
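
As a rough sketch, the path parameter and the Authorization header could be assembled like this in Python. The helper name build_trigger_request is hypothetical, and percent-encoding the identifier is an assumption made for collection names that contain spaces or slashes. The host and header names come from the curl example on this page. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.mixpeek.com"  # host taken from the curl example


def build_trigger_request(collection_identifier: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the trigger endpoint."""
    # Assumption: a collection name may need percent-encoding to be a safe path segment.
    segment = urllib.parse.quote(collection_identifier, safe="")
    url = f"{API_BASE}/v1/collections/{segment}/trigger"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_trigger_request("my collection", "TOKEN", {"include_buckets": ["bkt_1"]})
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out so the sketch runs without network access or a real token.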

Body

application/json

Request to trigger (re)processing through a collection.

For bucket-sourced collections (tier 0): Discovers objects from source bucket(s) and creates a batch for processing. Use include_buckets to limit which source buckets to process from.

For collection-sourced collections (tier N): Processes existing documents from upstream collection(s). Use include_collections to limit which source collections to process from.

Use source_filters for field-level filtering on objects or documents.

Document Overwrite Behavior:

  • If source bucket has unique_key configured: Documents are UPSERTED (overwrites existing)
  • If source bucket has NO unique_key: New documents are CREATED (may cause duplicates)

To enable idempotent re-processing, configure unique_key on the source bucket.
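
The difference between the two overwrite behaviors can be illustrated with a toy in-memory model. This is not the service's implementation. The materialize function, the "sku" field, and the ID formats are all hypothetical and exist only to show why a unique_key makes repeated triggers idempotent.

```python
import itertools
from typing import Dict, Optional

_auto_ids = itertools.count(1)  # stand-in for server-generated document IDs


def materialize(store: Dict[str, dict], obj: dict, unique_key: Optional[str]) -> None:
    """Toy model: upsert when a unique_key is configured, otherwise create."""
    if unique_key is not None:
        # Same key value maps to the same slot, so a re-run overwrites.
        doc_id = f"uk:{obj[unique_key]}"
    else:
        # A fresh ID every time, so a re-run adds another copy.
        doc_id = f"auto:{next(_auto_ids)}"
    store[doc_id] = obj


objects = [{"sku": "A1", "title": "first"}, {"sku": "B2", "title": "second"}]

with_key: Dict[str, dict] = {}
without_key: Dict[str, dict] = {}
for _ in range(2):  # trigger processing twice over the same two objects
    for obj in objects:
        materialize(with_key, obj, unique_key="sku")
        materialize(without_key, obj, unique_key=None)

print(len(with_key), len(without_key))  # prints "2 4"
```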

include_buckets
string[] | null

Limit processing to objects from these specific buckets (IDs or names). Only applies to bucket-sourced collections. If not provided, all configured source buckets are used.

include_collections
string[] | null

Limit processing to documents from these specific collections (IDs or names). Only applies to collection-sourced collections. If not provided, all configured source collections are used.

object_ids
string[] | null

Limit processing to these specific object IDs. Only applies to bucket-sourced collections. This is a convenience shorthand, equivalent to using source_filters with {"AND": [{"field": "object_id", "operator": "in", "value": [...]}]}.
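
The equivalence stated above can be written out as a small helper. The function name object_ids_to_source_filters and the sample IDs are hypothetical. The filter shape is copied from the description of object_ids.

```python
from typing import Iterable, List


def object_ids_to_source_filters(object_ids: Iterable[str]) -> dict:
    """Expand the object_ids shorthand into the equivalent source_filters value."""
    ids: List[str] = list(object_ids)
    return {"AND": [{"field": "object_id", "operator": "in", "value": ids}]}


# Two request bodies that, per the description above, select the same objects.
shorthand_body = {"object_ids": ["obj_1", "obj_2"]}
explicit_body = {"source_filters": object_ids_to_source_filters(shorthand_body["object_ids"])}
print(explicit_body)
```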

source_filters
LogicalOperator · object

Field-level filters for objects (bucket-sourced) or documents (collection-sourced). Uses LogicalOperator format (AND/OR/NOT). Use this to filter by metadata fields, status, or any other object/document properties.

Example:
{
  "AND": [
    {
      "field": "status",
      "operator": "eq",
      "value": "pending"
    }
  ]
}
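
A minimal sketch of building the same filter programmatically, assuming only the AND shape shown in the example. The helpers condition and all_of are hypothetical names. The OR and NOT shapes are not shown on this page, so they are left out rather than guessed.

```python
import json


def condition(field: str, operator: str, value) -> dict:
    """One field-level condition, in the shape used by the example above."""
    return {"field": field, "operator": operator, "value": value}


def all_of(*conditions: dict) -> dict:
    """Combine conditions under the AND logical operator."""
    return {"AND": list(conditions)}


filters = all_of(condition("status", "eq", "pending"))
body = {"source_filters": filters}
print(json.dumps(body, indent=2))
```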

dedup_strategy
enum<string> | null

How to handle sources already processed in prior batches.

  • skip (default): Skip sources already materialized in this collection.
  • replace: Delete existing documents for the re-processed sources and re-materialize them. This also clears the processed-objects resume ledger, so use it to recover a collection stuck with ledger entries but 0 materialized documents (the orphan/divergence state).
  • force: Process regardless, allowing duplicates.

Available options: skip, replace, force
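
One way to guard against typos in dedup_strategy before sending a request is to validate it client-side. This is a sketch only. The DedupStrategy enum and with_dedup_strategy helper are hypothetical, and the three values are the options listed above.

```python
from enum import Enum


class DedupStrategy(str, Enum):
    SKIP = "skip"        # default: skip sources already materialized
    REPLACE = "replace"  # delete and re-materialize, clearing the resume ledger
    FORCE = "force"      # process regardless, allowing duplicates


def with_dedup_strategy(body: dict, strategy: str) -> dict:
    """Return a copy of the request body with a validated dedup_strategy."""
    # DedupStrategy(...) raises ValueError for anything outside the three options.
    value = DedupStrategy(strategy).value
    return {**body, "dedup_strategy": value}


recovery_body = with_dedup_strategy({"include_buckets": ["bkt_1"]}, "replace")
print(recovery_body)
```

Here the replace strategy is used as in the recovery case described above, and the bucket ID is a placeholder.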

Response

Successful Response

Response after triggering collection processing.

Use batch_id or task_id to monitor progress via GET /v1/batches/{batch_id} or GET /v1/tasks/{task_id}.
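
As an illustration, the two monitoring URLs can be derived from a trigger response as below. The helper name monitoring_urls and the sample IDs are hypothetical. It is also an assumption that the monitoring endpoints live on the same host as the trigger endpoint.

```python
API_BASE = "https://api.mixpeek.com"  # assumption: same host as the trigger endpoint


def monitoring_urls(trigger_response: dict) -> dict:
    """Map a trigger response to the batch and task URLs to poll."""
    return {
        "batch": f"{API_BASE}/v1/batches/{trigger_response['batch_id']}",
        "task": f"{API_BASE}/v1/tasks/{trigger_response['task_id']}",
    }


# Made-up IDs standing in for a real response.
urls = monitoring_urls({"batch_id": "batch_123", "task_id": "task_456"})
print(urls["batch"])
print(urls["task"])
```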

batch_id
string
required

ID of the created batch for tracking progress.

task_id
string
required

Task ID for monitoring via GET /v1/tasks/{task_id}.

collection_id
string
required

ID of the collection being processed.

total_tiers
integer
required

Number of processing tiers in the DAG.

message
string
required

Human-readable status message.

source_bucket_ids
string[] | null

Bucket IDs that objects were discovered from (bucket-sourced collections).

source_collection_ids
string[] | null

Collection IDs that documents were read from (collection-sourced collections).

object_count
integer | null

Total number of objects included in the batch (bucket-sourced collections).

document_count
integer | null

Total number of documents to process (collection-sourced collections).
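
The response fields above could be modeled client-side roughly as follows. The TriggerResponse class and its from_json method are hypothetical. Ignoring unknown keys is a design assumption, intended to tolerate fields the server may add later.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class TriggerResponse:
    # Required fields
    batch_id: str
    task_id: str
    collection_id: str
    total_tiers: int
    message: str
    # Nullable fields; which ones are set depends on the source type
    source_bucket_ids: Optional[List[str]] = None
    source_collection_ids: Optional[List[str]] = None
    object_count: Optional[int] = None
    document_count: Optional[int] = None

    @classmethod
    def from_json(cls, payload: dict) -> "TriggerResponse":
        required = ("batch_id", "task_id", "collection_id", "total_tiers", "message")
        missing = [name for name in required if name not in payload]
        if missing:
            raise KeyError(f"missing required fields: {missing}")
        known = {f.name for f in fields(cls)}
        # Drop keys this model does not know about instead of failing on them.
        return cls(**{k: v for k, v in payload.items() if k in known})


# Made-up payload resembling a bucket-sourced trigger.
sample = TriggerResponse.from_json({
    "batch_id": "batch_123",
    "task_id": "task_456",
    "collection_id": "col_789",
    "total_tiers": 2,
    "message": "ok",
    "source_bucket_ids": ["bkt_1"],
    "object_count": 5,
})
print(sample.object_count, sample.document_count)  # prints "5 None"
```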