Set Up a Namespace
Every project starts with a namespace — the isolation boundary for all your resources. Use one per environment (dev, staging, prod) or per tenant.
curl -X POST "https://api.mixpeek.com/v1/namespaces" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "namespace_name": "production",
    "feature_extractors": [
      { "feature_extractor_name": "multimodal_extractor", "version": "v1" }
    ]
  }'
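Every subsequent namespace-scoped request needs two headers: Authorization: Bearer sk_live_... and X-Namespace: ns_.... The organization-level connections endpoint shown later is called with the Authorization header only.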
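To avoid repeating both headers on every call, a small shell wrapper can add them for you. This is only a sketch: `mx` is a hypothetical helper name, not a Mixpeek tool, and it assumes `MIXPEEK_API_KEY` and `NAMESPACE_ID` are already set in your shell.

```shell
# mx: hypothetical convenience wrapper around curl (not part of Mixpeek).
# Usage: mx METHOD PATH [extra curl arguments...]
# Example: mx POST "/v1/buckets/$BUCKET_ID/batches/$BATCH_ID/submit"
mx() {
  if [ "$#" -lt 2 ]; then
    echo "usage: mx METHOD PATH [curl args...]" >&2
    return 2
  fi
  # Refuse to send a request that is missing either credential.
  if [ -z "${MIXPEEK_API_KEY:-}" ] || [ -z "${NAMESPACE_ID:-}" ]; then
    echo "mx: set MIXPEEK_API_KEY and NAMESPACE_ID first" >&2
    return 1
  fi
  mx_method=$1
  mx_path=$2
  shift 2
  # -sS hides the progress meter but still prints errors.
  curl -sS -X "$mx_method" "https://api.mixpeek.com${mx_path}" \
    -H "Authorization: Bearer $MIXPEEK_API_KEY" \
    -H "X-Namespace: $NAMESPACE_ID" \
    -H "Content-Type: application/json" \
    "$@"
}
```

Any extra arguments, such as `-d '{...}'`, are passed straight through to curl. The wrapper always sends X-Namespace, so it fits the namespace-scoped calls in this guide rather than the organization-level connection request.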
Namespace API →
Create a Bucket
Buckets are schema-validated containers for raw files. Define what blob types you accept (text, image, audio, video, json, binary).
curl -X POST "https://api.mixpeek.com/v1/buckets" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket_name": "product-catalog",
    "bucket_schema": {
      "properties": {
        "product_text": { "type": "text", "required": true },
        "hero_image": { "type": "image" }
      }
    }
  }'
Bucket API →
Storage class
Pass an optional storage_class on create/update to pick a cost tier for a bucket’s objects. It’s provider-agnostic — mapped to your object store:
| storage_class | GCS | S3 / MinIO | Best for |
|---|---|---|---|
| standard (default) | STANDARD | STANDARD | Hot, frequently read buckets |
| nearline | NEARLINE | STANDARD_IA | Warm / occasional access |
| coldline | COLDLINE | GLACIER_IR | Cold / rare access |
| archive | ARCHIVE | GLACIER | Long-term retention |
Applied on write for sync-based ingestion; broader rollout in progress. For buckets fed by a storage sync (S3, GCS, Drive, RSS, and other sources; this is the primary media path), the tier is set on each object at write time. Tiering for direct uploads (POST /objects) and presigned client uploads, and retroactive re-tiering of existing objects, are not yet available; both are in progress as separate backend work. Keep hot, retriever-source buckets on standard and reserve the cheaper tiers for large write-once, read-occasionally media.
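As an illustration, the sketch below builds a create-bucket body that includes a storage tier, rejecting any value that is not one of the four listed above. It rests on assumptions: `bucket_payload` is a hypothetical helper, `storage_class` is assumed to be a top-level field of the request body, and the single `video` property is a placeholder schema.

```shell
# bucket_payload NAME [STORAGE_CLASS]
# Prints a create-bucket JSON body; STORAGE_CLASS defaults to "standard".
# Assumption: storage_class sits at the top level of the body.
# NAME is inserted as-is, so keep it free of quotes and backslashes.
bucket_payload() {
  sc_name=$1
  sc_class=${2:-standard}
  if [ -z "$sc_name" ]; then
    echo "bucket_payload: bucket name is required" >&2
    return 1
  fi
  case "$sc_class" in
    standard|nearline|coldline|archive) ;;
    *)
      echo "bucket_payload: unknown storage_class '$sc_class'" >&2
      return 1
      ;;
  esac
  printf '{\n  "bucket_name": "%s",\n  "storage_class": "%s",\n' "$sc_name" "$sc_class"
  # Placeholder schema with one video property.
  printf '  "bucket_schema": { "properties": { "video": { "type": "video" } } }\n}\n'
}

# Usage sketch:
#   curl -X POST "https://api.mixpeek.com/v1/buckets" \
#     -H "Authorization: Bearer $MIXPEEK_API_KEY" \
#     -H "X-Namespace: $NAMESPACE_ID" \
#     -H "Content-Type: application/json" \
#     -d "$(bucket_payload raw-footage coldline)"
```

Validating the tier locally catches typos such as `glacier` before the request is sent; the API's own validation remains the source of truth.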
Connect External Storage
Sync files directly from your existing cloud storage instead of uploading manually. Mixpeek reads from your provider — no migration needed. This is a two-step flow: create a reusable connection (holds the credentials, lives at the organization level), then attach a sync to a bucket that references it.
Step 1 — Create the connection (once per provider account):
curl -X POST "https://api.mixpeek.com/v1/organizations/connections" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production S3",
    "provider_type": "s3",
    "provider_config": {
      "provider_type": "s3",
      "bucket_name": "my-source-bucket",
      "region": "us-east-1",
      "credentials": {
        "access_key_id": "AKIA...",
        "secret_access_key": "..."
      }
    }
  }'
The response includes a connection_id (conn_...). Credentials are encrypted at rest and reusable across buckets.
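Rather than copying the ID by hand, you can capture it in a shell variable. The sketch below is one way to do that with standard tools. `json_string_field` is a hypothetical helper, and it assumes `connection_id` appears as a plain string field in the response; for nested or escaped JSON, a real parser such as jq is the sturdier choice.

```shell
# json_string_field NAME
# Reads JSON on stdin and prints the first string value stored under "NAME".
# Good enough for a flat field; not a general JSON parser.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage sketch (send the same request body as in step 1):
#   CONNECTION_ID=$(curl -sS -X POST \
#     "https://api.mixpeek.com/v1/organizations/connections" \
#     -H "Authorization: Bearer $MIXPEEK_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @connection.json | json_string_field connection_id)
```

Here `connection.json` is a placeholder file holding the step 1 body. `$CONNECTION_ID` can then stand in for the literal `conn_...` value when you attach a sync.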
Step 2 — Attach a sync to your bucket (flat body — no wrapper objects):
curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/syncs" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "connection_id": "conn_abc123",
    "source_path": "/videos/",
    "sync_mode": "continuous",
    "polling_interval_seconds": 3600
  }'
Then trigger the first sync:
curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/syncs/$SYNC_ID/trigger" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID"
After the initial sync, new files are picked up automatically at the configured polling interval (polling_interval_seconds, 3600 in the example above). Use continuous mode rather than initial_only to keep picking up new and changed files. Each run processes only files added or modified since the last sync, so existing files aren’t reprocessed.
| Provider | Auth Method | S3-Compatible |
|---|---|---|
| AWS S3 | IAM User / Role | Native |
| Google Cloud Storage | Service Account Key | No |
| Azure Blob Storage | Access Key / Managed Identity | No |
| Cloudflare R2 | R2 API Token | Yes |
| Backblaze B2 | Application Key | Yes |
| Wasabi | Access Key | Yes |
| Tigris | Access Key | Yes |
| Box | OAuth | No |
| Mux | API Token | No |
| Supabase | Service Key | Yes |
See Object Storage providers for provider-specific setup guides.
Sync API →
Register Objects
Objects are raw multimodal assets within a bucket. Two paths:
URL references — point to files in your existing storage:
curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/objects" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "key_prefix": "/products",
    "blobs": [
      { "property": "hero_image", "type": "image", "data": "https://example.com/photo.jpg" },
      { "property": "product_text", "type": "text", "data": "Wireless headphones" }
    ]
  }'
Direct uploads — upload to Mixpeek-managed storage via presigned URLs:
curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/uploads" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{ "filename": "photo.jpg", "content_type": "image/jpeg" }'
Then PUT the file to the returned presigned_url and confirm with POST /uploads/{id}/confirm.
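The PUT step can be sketched with curl's `--upload-file` option. `put_presigned` is a hypothetical helper name. The sketch assumes that the presigned URL carries its own authorization (so no Bearer header is sent) and that the Content-Type should match the one declared when the upload was requested, which is common for presigned URLs but not confirmed here.

```shell
# put_presigned FILE URL CONTENT_TYPE
# Uploads FILE to a presigned URL with an HTTP PUT.
put_presigned() {
  if [ ! -f "$1" ]; then
    echo "put_presigned: no such file: $1" >&2
    return 1
  fi
  # --upload-file sends the file as the request body using PUT.
  # --fail turns HTTP 4xx/5xx responses into a non-zero exit status.
  curl -sS --fail --upload-file "$1" -H "Content-Type: $3" "$2"
}

# Usage sketch, with PRESIGNED_URL taken from the upload response:
#   put_presigned photo.jpg "$PRESIGNED_URL" image/jpeg
```

Because `--fail` makes curl exit non-zero on an HTTP error, a script can skip the confirm call when the PUT did not succeed.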
For bulk imports, use batch uploads or connect your object storage via sync configurations.
Object API → · Upload API →
Process with Batches
Batches group objects for extraction. Create a batch, then submit it:
# Create batch
curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/batches" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{ "object_ids": ["obj_abc", "obj_def"] }'

# Submit for processing
curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/batches/$BATCH_ID/submit" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: $NAMESPACE_ID"
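When the object IDs come from a script rather than being typed by hand, the create-batch body can be assembled from a list. The sketch below shows one way; `batch_payload` is a hypothetical helper, and the check for at least one ID is a local guard, not a documented API rule.

```shell
# batch_payload ID...
# Prints a create-batch body of the form { "object_ids": ["id1", "id2"] }.
# IDs are inserted as-is, so they must not contain quotes or backslashes.
batch_payload() {
  if [ "$#" -eq 0 ]; then
    echo "batch_payload: need at least one object ID" >&2
    return 1
  fi
  ids=""
  for id in "$@"; do
    # Add a comma separator before every ID except the first.
    ids="${ids:+$ids, }\"$id\""
  done
  printf '{ "object_ids": [%s] }\n' "$ids"
}

# Usage sketch:
#   curl -X POST "https://api.mixpeek.com/v1/buckets/$BUCKET_ID/batches" \
#     -H "Authorization: Bearer $MIXPEEK_API_KEY" \
#     -H "X-Namespace: $NAMESPACE_ID" \
#     -H "Content-Type: application/json" \
#     -d "$(batch_payload obj_abc obj_def)"
```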
Batch Lifecycle
DRAFT → QUEUED → PROCESSING → COMPLETED
↘ ↘
FAILED COMPLETED_WITH_ERRORS
Poll GET /v1/buckets/{bucket_id}/batches/{batch_id} until the status is terminal: COMPLETED, COMPLETED_WITH_ERRORS, FAILED, or CANCELED. A poller that waits only for COMPLETED hangs on partial success, failure, or cancellation. Alternatively, use webhooks to get notified on batch.completed.
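A polling loop that stops on any of the four terminal states might look like the sketch below. `is_terminal` and `wait_for_batch` are hypothetical helper names, and the code assumes the batch response exposes its state as a top-level string field named `status`; adjust the field name to match the actual response.

```shell
# is_terminal STATUS: succeeds only for the four terminal batch states.
is_terminal() {
  case "$1" in
    COMPLETED|COMPLETED_WITH_ERRORS|FAILED|CANCELED) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_for_batch BUCKET_ID BATCH_ID [INTERVAL_SECONDS]
# Polls until the batch is terminal, prints the final status, and returns
# 0 only for a clean COMPLETED.
# Assumption: the response has a top-level string field called "status".
wait_for_batch() {
  wb_interval=${3:-10}
  while :; do
    wb_status=$(curl -sS "https://api.mixpeek.com/v1/buckets/$1/batches/$2" \
      -H "Authorization: Bearer $MIXPEEK_API_KEY" \
      -H "X-Namespace: $NAMESPACE_ID" |
      sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)
    if [ -z "$wb_status" ]; then
      # Stop instead of looping forever on an unreadable response.
      echo "wait_for_batch: no status in response" >&2
      return 2
    fi
    if is_terminal "$wb_status"; then
      printf '%s\n' "$wb_status"
      if [ "$wb_status" = "COMPLETED" ]; then return 0; else return 1; fi
    fi
    sleep "$wb_interval"
  done
}

# Usage sketch:
#   final=$(wait_for_batch "$BUCKET_ID" "$BATCH_ID" 15) || echo "batch ended as $final"
```

The loop has no overall timeout; for unattended jobs, consider capping the number of attempts.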
Batch API →