How Syncs Work
- Poll: At each interval, Mixpeek lists files from your storage provider using the configured source_path and filters.
- Filter: Files are matched against glob patterns, size limits, MIME types, and provider-specific metadata filters.
- Create objects: Matching files are registered in the target bucket. Duplicates are skipped by default via source tracking.
- Submit batches: Objects are grouped into batches and submitted to the bucket's collection pipeline for processing.
- Checkpoint: A resume cursor is saved so the next poll picks up where the last one left off.
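The five steps above can be sketched as a single poll cycle. This is a minimal in-memory illustration, not Mixpeek's implementation: `SyncState`, `run_poll`, the file dictionaries, and the integer cursor are all hypothetical stand-ins chosen to show how the steps fit together.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class SyncState:
    """Hypothetical stand-in for a bucket plus sync bookkeeping."""
    seen_sources: set = field(default_factory=set)  # source tracking for duplicates
    batches: list = field(default_factory=list)     # batches "submitted" for processing
    cursor: int = 0                                  # resume cursor: index of next file


def run_poll(files, state, pattern="*", max_bytes=None, batch_size=50):
    """Run one poll cycle: list, filter, create objects, batch, checkpoint."""
    # 1. Poll: list only the files past the saved cursor.
    listed = files[state.cursor:]

    # 2. Filter: glob pattern and size limit, combined with AND.
    matched = [
        f for f in listed
        if fnmatchcase(f["path"], pattern)
        and (max_bytes is None or f["size"] <= max_bytes)
    ]

    # 3. Create objects: skip anything already tracked by its source path.
    created = []
    for f in matched:
        if f["path"] not in state.seen_sources:
            state.seen_sources.add(f["path"])
            created.append(f["path"])

    # 4. Submit batches of at most batch_size objects.
    for i in range(0, len(created), batch_size):
        state.batches.append(created[i:i + batch_size])

    # 5. Checkpoint: the next poll starts after the last listed file.
    state.cursor = len(files)
    return created
```

Calling `run_poll` a second time with the same `state` only examines files appended since the previous call, which is the behavior the checkpoint step describes.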
Sync Modes
| Mode | Behavior | Use Case |
|---|---|---|
| continuous | Polls repeatedly at polling_interval_seconds | Ongoing ingestion; new files are picked up automatically |
| initial_only | Runs once, then stops | One-time backfill or migration |
Create a Sync
Connect your storage
Create a storage connection with credentials for your provider.
Configuration Reference
Core Settings
| Field | Type | Default | Description |
|---|---|---|---|
| connection_id | string | required | Storage connection to pull from |
| source_path | string | required | Path in external storage (format varies by provider) |
| sync_mode | string | continuous | continuous or initial_only |
| polling_interval_seconds | int | 300 | Poll frequency (30–900 seconds) |
| batch_size | int | 50 | Files per batch (1–100) |
| skip_duplicates | bool | true | Skip files already in the bucket |
| reconcile_on_sync | bool | false | Delete previously-synced objects that no longer match filters |
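As a rough illustration of the defaults and ranges in the table, the settings could be modeled and checked client-side before a sync is created. The field names, defaults, and ranges come from the table; the `SyncConfig` class and its `validate` method are hypothetical and are not part of any Mixpeek SDK.

```python
from dataclasses import dataclass


@dataclass
class SyncConfig:
    """Hypothetical client-side model of the core sync settings."""
    connection_id: str
    source_path: str
    sync_mode: str = "continuous"
    polling_interval_seconds: int = 300
    batch_size: int = 50
    skip_duplicates: bool = True
    reconcile_on_sync: bool = False

    def validate(self):
        """Return a list of problems; an empty list means every field is in range."""
        errors = []
        if not self.connection_id:
            errors.append("connection_id is required")
        if not self.source_path:
            errors.append("source_path is required")
        if self.sync_mode not in ("continuous", "initial_only"):
            errors.append("sync_mode must be continuous or initial_only")
        if not 30 <= self.polling_interval_seconds <= 900:
            errors.append("polling_interval_seconds must be between 30 and 900")
        if not 1 <= self.batch_size <= 100:
            errors.append("batch_size must be between 1 and 100")
        return errors
```

For example, `SyncConfig("conn_1", "bucket/path", batch_size=500).validate()` reports the out-of-range batch size, while a config that only sets the two required fields passes with the documented defaults. The identifiers `conn_1` and `bucket/path` are placeholders.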
File Filters
Narrow which files get synced. All filters combine with AND logic.
Metadata Filters
Filter on provider-specific metadata fields. Useful for syncing only assets that match certain tags, statuses, or custom fields in your storage system.
Supported operators: equals, not_equals, contains, not_contains, gt, lt, gte, lte, exists.
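To make the operator list concrete, here is a minimal sketch of how such filters might be evaluated against a file's metadata. The operator names come from the list above. Everything else is an assumption made for illustration: the filter shape (`field`, `operator`, `value`), the `matches` helper, the choice to treat `exists` as a presence check, and the rule that a missing field fails every comparison.

```python
import operator

# Operator names come from the documentation; the semantics below are assumed.
OPERATORS = {
    "equals": operator.eq,
    "not_equals": operator.ne,
    "contains": lambda actual, expected: expected in actual,
    "not_contains": lambda actual, expected: expected not in actual,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def matches(metadata, filters):
    """Return True only if every filter passes (AND logic)."""
    for f in filters:
        name, op = f["field"], f["operator"]
        if name not in metadata:
            # Assumption: a missing field fails "exists" and every comparison.
            return False
        if op == "exists":
            continue  # presence was already confirmed above
        if not OPERATORS[op](metadata[name], f["value"]):
            return False
    return True
```

With this sketch, a file tagged `{"status": "approved", "rating": 4}` passes the pair of filters `status equals approved` and `rating gte 4`, and fails as soon as either condition is not met.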
Schema Mapping
Map provider metadata to bucket schema fields during sync, so structured data arrives alongside your files.
Lifecycle Management
Pause and Resume
Temporarily stop a sync without losing progress.
Manual Trigger
Force a sync to run immediately, outside the polling schedule.
Monitoring
Check sync status and metrics:
| Metric | Description |
|---|---|
| total_files_discovered | Cumulative files found in source |
| total_files_synced | Successfully synced files |
| total_files_failed | Files that failed after retries (sent to DLQ) |
| total_bytes_synced | Total data transferred |
| last_sync_at | When the last sync completed |
| next_sync_at | When the next poll is scheduled |
| consecutive_failures | Sequential failure count (auto-suspends after threshold) |
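The counters above lend themselves to a couple of derived health figures. The following is a small sketch, assuming the metrics arrive as a dictionary keyed by the names in the table. The `summarize_metrics` helper and the derived figures are illustrative only and are not values Mixpeek reports.

```python
def summarize_metrics(metrics):
    """Derive a failure rate and an average file size from raw sync counters."""
    synced = metrics["total_files_synced"]
    failed = metrics["total_files_failed"]
    attempted = synced + failed
    return {
        # Share of attempted files that ended up failed.
        "failure_rate": failed / attempted if attempted else 0.0,
        # Mean size of the files that were synced successfully.
        "avg_bytes_per_file": metrics["total_bytes_synced"] / synced if synced else 0.0,
    }
```

A rising `failure_rate` alongside a growing `consecutive_failures` count is a reasonable signal to inspect the sync before it auto-suspends.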
Robustness
Syncs are designed for unattended, long-running operation:
- Distributed locking prevents concurrent runs of the same sync
- Resume cursors checkpoint progress so interrupted syncs pick up where they left off
- Dead letter queue retries failed files up to 3 times before marking them as failed
- Auto-suspend pauses syncs after consecutive failures to prevent runaway errors
- Idempotent ingestion uses source tracking to never duplicate objects on retries
- Reconciliation optionally deletes objects that no longer match your filters (enable with reconcile_on_sync)
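The retry, dead letter queue, and auto-suspend behaviors in the list above can be sketched together. This is an illustration under stated assumptions, not the actual implementation: it reads "up to 3 times" as three attempts in total, counts a run as failed if any file lands in the dead letter queue, and uses a made-up `suspend_threshold` of 5, since the text does not state the real threshold. The function names and the `state` dictionary are hypothetical.

```python
MAX_ATTEMPTS = 3  # assumed reading of "retries failed files up to 3 times"


def process_with_retries(path, ingest, max_attempts=MAX_ATTEMPTS):
    """Try to ingest one file; return True on success, False once attempts run out."""
    for _ in range(max_attempts):
        try:
            ingest(path)
            return True
        except Exception:
            continue  # try again until the attempts are used up
    return False


def run_sync(paths, ingest, state, suspend_threshold=5):
    """Process files, record failures, and suspend after repeated failed runs."""
    if state.get("suspended"):
        return state  # a suspended sync does nothing until it is resumed

    failed = [p for p in paths if not process_with_retries(p, ingest)]
    state.setdefault("dead_letter", []).extend(failed)

    # Assumption: a run with any failed file counts as a failed run.
    if failed:
        state["consecutive_failures"] = state.get("consecutive_failures", 0) + 1
    else:
        state["consecutive_failures"] = 0

    if state["consecutive_failures"] >= suspend_threshold:
        state["suspended"] = True
    return state
```

Resetting the counter on a clean run means only an unbroken streak of failures triggers suspension, which matches the "consecutive" wording in the metrics table.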

