Resume Sync Configuration
Resume a paused sync configuration.
Reactivates a paused sync, allowing new sync jobs to be scheduled. For continuous syncs, polling will resume at the configured interval. The next sync will be incremental (only files modified since last sync).
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
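As a rough illustration of the header format above, the following sketch builds (but does not send) a request carrying a Bearer token, using only the Python standard library. The URL, the use of POST, and the `build_resume_request` helper are hypothetical placeholders, since this section does not state the endpoint path or method.

```python
import urllib.request


def build_resume_request(url: str, token: str) -> urllib.request.Request:
    """Build a request with a Bearer token header. Nothing is sent here."""
    return urllib.request.Request(
        url,
        method="POST",  # assumption: the HTTP method is not given in this section
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical URL: substitute the real resume endpoint for your bucket and sync.
req = build_resume_request("https://api.example.com/resume-sync-placeholder", "my-token")
```

Passing `req` to `urllib.request.urlopen` would send it; the response body would then be the sync configuration described under Response.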
Response
Successful Response
Bucket-scoped configuration for automated storage synchronization.
Defines how files are synced from external storage providers to a Mixpeek bucket. Includes configuration, status, metrics, and robustness control fields.
Supported Providers: google_drive, s3, snowflake, sharepoint, tigris
Built-in Robustness:
- Distributed locking (locked_by_worker_id, lock_expires_at)
- Pause/resume control (paused, pause_reason, paused_at)
- Safety limits (max_objects_per_run, batch_chunk_size)
- Resume checkpointing (resume_cursor, resume_objects_processed)
- Batch tracking (batch_ids, task_ids, batches_created)
Metrics Fields:
- total_files_discovered: Files found in source
- total_files_synced: Successfully synced files
- total_files_failed: Files that failed (check DLQ)
- total_bytes_synced: Total data transferred
- consecutive_failures: Failure count for auto-suspend
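The metrics fields above can be combined into a simple health indicator. The sketch below is one possible approach, assuming the response has been parsed into a plain dictionary; the `sync_success_rate` helper is hypothetical and not part of the API.

```python
from typing import Optional


def sync_success_rate(metrics: dict) -> Optional[float]:
    """Fraction of attempted files that synced, or None if nothing was attempted."""
    synced = metrics.get("total_files_synced", 0)
    failed = metrics.get("total_files_failed", 0)
    attempted = synced + failed
    if attempted == 0:
        return None  # avoid dividing by zero before the first run
    return synced / attempted


rate = sync_success_rate({"total_files_synced": 3, "total_files_failed": 1})
```

A low rate alongside a rising consecutive_failures count would be a reasonable cue to inspect the DLQ.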
Target bucket identifier (e.g. 'bkt_marketing_assets').
Storage connection identifier (e.g. 'conn_abc123').
Organization internal identifier (multi-tenancy scope).
Namespace identifier owning the bucket.
Source path in the external storage provider. Format varies by provider: s3/tigris='bucket/prefix', google_drive='folder_id', sharepoint='/sites/Name/Documents', snowflake='DB.SCHEMA.TABLE'.
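To make the per-provider formats concrete, here is a minimal sketch that splits a source path into its parts. The `split_source_path` helper and the returned key names are hypothetical; only the format strings themselves come from the description above.

```python
def split_source_path(provider: str, source_path: str) -> dict:
    """Split a source path according to the provider's documented format."""
    if provider in ("s3", "tigris"):
        # 'bucket/prefix': everything after the first slash is the prefix
        bucket, _, prefix = source_path.partition("/")
        return {"bucket": bucket, "prefix": prefix}
    if provider == "snowflake":
        # 'DB.SCHEMA.TABLE': exactly three dot-separated parts
        parts = source_path.split(".")
        if len(parts) != 3:
            raise ValueError("expected DB.SCHEMA.TABLE")
        return {"database": parts[0], "schema": parts[1], "table": parts[2]}
    if provider == "google_drive":
        return {"folder_id": source_path}
    if provider == "sharepoint":
        if not source_path.startswith("/"):
            raise ValueError("expected a path such as /sites/Name/Documents")
        return {"path": source_path}
    raise ValueError(f"unsupported provider: {provider}")


parts = split_source_path("s3", "my-bucket/images/2024")
```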
User identifier that created the sync configuration.
Whether a worker currently holds this sync's run lock.
Unique identifier for the sync configuration.
Optional filter rules limiting which files are synced.
Schema mapping defining how source data maps to bucket schema fields. Maps external storage attributes (tags, metadata, columns, filenames) to bucket schema fields and blob properties. When provided, enables structured extraction of metadata from the sync source. See SchemaMapping for detailed configuration options.
Sync mode controlling lifecycle (initial_only or continuous).
Allowed values: initial_only, continuous
Polling interval in seconds (continuous mode).
Range: 30 <= x <= 900
Number of files processed per sync batch.
Range: 1 <= x <= 100
Whether objects should be created immediately after confirmation.
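The two ranges listed here (30 to 900 seconds for the polling interval, 1 to 100 files per batch) can be checked client-side before a configuration is submitted. This is a sketch only; the parameter names `polling_interval_seconds` and `batch_size` are assumptions, as the field names are not shown in this section.

```python
from typing import List

POLLING_INTERVAL_RANGE = (30, 900)  # seconds, from the documented range
BATCH_SIZE_RANGE = (1, 100)         # files per sync batch, from the documented range


def validate_sync_settings(polling_interval_seconds: int, batch_size: int) -> List[str]:
    """Return a list of human-readable range violations (empty if valid)."""
    errors = []
    lo, hi = POLLING_INTERVAL_RANGE
    if not lo <= polling_interval_seconds <= hi:
        errors.append(f"polling interval must be between {lo} and {hi} seconds")
    lo, hi = BATCH_SIZE_RANGE
    if not lo <= batch_size <= hi:
        errors.append(f"batch size must be between {lo} and {hi}")
    return errors


problems = validate_sync_settings(10, 500)
```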
Skip files whose hashes already exist in the bucket.
Sync-only mode: download and store files in the bucket without running them through the collection processing pipeline. Set to True during initial bulk ingestion, then flip to False to trigger processing once all files are synced.
Controls how Mixpeek reconciles objects when the source changes. on_delete: cascade-delete when source asset is removed. on_update: propagate metadata changes and re-process. on_filter_drift: remove objects that no longer match filters. All default to True for full consistency.
Current lifecycle status for the sync configuration.
- PENDING: Not yet started
- ACTIVE: Currently running/polling
- SUSPENDED: Temporarily paused
- COMPLETED: Initial sync completed (for initial_only mode)
- FAILED: Sync encountered errors
Allowed values: PENDING, QUEUED, IN_PROGRESS, PROCESSING, COMPLETED, COMPLETED_WITH_ERRORS, FAILED, CANCELED, INTERRUPTED, UNKNOWN, SKIPPED, DRAFT, ACTIVE, ARCHIVED, SUSPENDED
Convenience flag used for filtering active syncs.
Cumulative count of files found in source across all runs.
Range: x >= 0
Cumulative count of successfully synced files.
Range: x >= 0
Cumulative count of failed files (sent to DLQ after 3 retries).
Range: x >= 0
Cumulative bytes transferred across all runs.
Range: x >= 0
When sync configuration was created.
Last modification timestamp.
When last successful sync completed. Used for incremental syncs.
Per-shard last-sync timestamps keyed by shard value (e.g. collection_id). When a new shard is added, its absence here forces a full scan even if the global last_sync_at is set.
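The per-shard rule above can be sketched as a lookup in which a missing shard yields no incremental filter. The function and parameter names below are hypothetical; only the behavior (an absent shard forces a full scan even when the global last_sync_at is set) comes from the description.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def modified_since_for_shard(
    shard_key: str,
    per_shard_last_sync: Dict[str, datetime],
    global_last_sync_at: Optional[datetime],
) -> Optional[datetime]:
    """Return the incremental cutoff for a shard, or None to force a full scan."""
    # global_last_sync_at is deliberately ignored: a shard absent from the map
    # has never been synced, so it gets a full scan regardless of the global value.
    return per_shard_last_sync.get(shard_key)


last = datetime(2024, 1, 1, tzinfo=timezone.utc)
known = modified_since_for_shard("collection_a", {"collection_a": last}, last)
new_shard = modified_since_for_shard("collection_b", {"collection_a": last}, last)
```

Here `known` carries the stored timestamp, while `new_shard` is `None`, which a caller would treat as "scan everything".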
Scheduled time for next sync (continuous/scheduled modes).
Most recent error message if sync attempts failed.
1000
Range: x >= 0
Provider-specific pre-filters pushed down to the API call. The sync engine passes these to iter_objects() without interpretation. Each provider defines its own schema. Applied BEFORE file_filters. Examples: Iconik {'collection_ids': [...]}, Google Drive {'shared_drive_id': '...'}
Storage provider type for API progress views (for example: s3, google_drive, iconik).
Arbitrary metadata supplied by the user.
Worker ID that currently holds the lock for this sync
Timestamp when lock was acquired
Timestamp when lock expires (for stale lock recovery)
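The lock fields above suggest a simple takeover rule for stale locks. The sketch below assumes that a lock past its expiry may be claimed by another worker; that policy and the `can_acquire_lock` helper are illustrative assumptions, not the service's documented algorithm.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def can_acquire_lock(
    locked_by_worker_id: Optional[str],
    lock_expires_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True if no worker holds the lock or the existing lock has expired."""
    if locked_by_worker_id is None:
        return True  # nobody holds the lock
    if lock_expires_at is None:
        return False  # held with no expiry recorded: be conservative
    return now >= lock_expires_at  # stale lock: eligible for recovery


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
held = can_acquire_lock("worker-1", now + timedelta(minutes=5), now)
stale = can_acquire_lock("worker-1", now - timedelta(seconds=1), now)
```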
A full sweep was requested (trigger?full_sync=true) while a run held the lock. The finishing run dispatches it automatically on lock release.
Whether sync is currently paused (user-controlled)
Reason for pause
Timestamp when paused
User who paused the sync
Hard cap on objects per sync run (prevents runaway syncs)
Range: x >= 1
Maximum objects per batch chunk
Range: 1 <= x <= 1000
Number of objects per batch chunk (for concurrent processing)
Range: 1 <= x <= 1000
UUID for current/last sync run
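The interaction between the per-run cap and the chunk size can be sketched as follows. The parameter names max_objects_per_run and batch_chunk_size come from the safety limits listed earlier; the `plan_chunks` helper and the cap-then-chunk ordering are assumptions made for illustration.

```python
from typing import List, Sequence


def plan_chunks(
    object_ids: Sequence[str],
    max_objects_per_run: int,
    batch_chunk_size: int,
) -> List[List[str]]:
    """Cap the run at max_objects_per_run, then split into fixed-size chunks."""
    capped = list(object_ids[:max_objects_per_run])
    return [
        capped[i : i + batch_chunk_size]
        for i in range(0, len(capped), batch_chunk_size)
    ]


chunks = plan_chunks([f"obj_{n}" for n in range(10)], max_objects_per_run=7, batch_chunk_size=3)
```

With ten discovered objects, a cap of 7, and a chunk size of 3, this yields three chunks of sizes 3, 3, and 1; the remaining three objects would wait for a later run.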
Increments on each sync execution
Range: x >= 0
List of batch IDs created by this sync
List of task IDs for batches
Total number of batches created
Range: x >= 0
Whether resuming partial runs is enabled
Last page/cursor processed (for paginated APIs like Google Drive)
Last primary key processed (for database syncs with stable ordering)
Count of objects processed in current/last run
Range: x >= 0
How often to checkpoint (in objects). Default: every 1000 objects
Range: 100 <= x <= 10000
Convenience mirror of the current resume cursor for API progress views.
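Checkpointing every N objects, with resumption from the last processed count, might look like the sketch below. The `process_with_checkpoints` helper is hypothetical, and a real sync would persist resume_cursor and resume_objects_processed rather than collect counts in a list.

```python
from typing import List, Sequence, Tuple


def process_with_checkpoints(
    objects: Sequence[str],
    checkpoint_interval: int = 1000,
    start_at: int = 0,
) -> Tuple[int, List[int]]:
    """Process objects from start_at, recording a checkpoint every N objects."""
    checkpoints: List[int] = []
    processed = start_at
    for _obj in objects[start_at:]:
        # ... sync the object here ...
        processed += 1
        if processed % checkpoint_interval == 0:
            checkpoints.append(processed)  # stand-in for persisting progress
    return processed, checkpoints


items = [f"obj_{n}" for n in range(2500)]
total, marks = process_with_checkpoints(items, checkpoint_interval=1000)
resumed_total, resumed_marks = process_with_checkpoints(items, 1000, start_at=2000)
```

A smaller interval loses less work after an interruption at the cost of more frequent writes, which is presumably why the interval is bounded on both sides.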
Per-(config, shard) high-water checkpoints keyed by shard key (e.g. collection_id for parallel fan-outs, 'pages_N_M' for page-range shards, 'default' for unsharded runs). Each entry holds:
- pass_id: Lexicographically ordered pass marker
- cursor: Provider cursor (e.g. JSON-encoded Iconik search_after)
- objects_processed: Forward-only progress guard
- modified_since: Incremental filter frozen at pass start
- completed_at: Set when the shard drained its source; the next cycle wraps around to a fresh full pass only after the polling cadence elapses
- updated_at
A new job resumes each shard from its checkpoint instead of re-walking from page 1 (2026-06-11 re-scan treadmill).
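One way to read the forward-only guard is that a checkpoint write is accepted only if it does not move a shard backwards. The sketch below encodes that reading; the `ShardCheckpoint` class, the `apply_checkpoint` helper, and the exact comparison rule are assumptions, and only the field names pass_id, cursor, and objects_processed come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ShardCheckpoint:
    pass_id: str              # lexicographically ordered pass marker
    cursor: Optional[str]     # provider cursor, opaque to this sketch
    objects_processed: int    # progress within the pass


def apply_checkpoint(
    store: Dict[str, ShardCheckpoint],
    shard_key: str,
    update: ShardCheckpoint,
) -> bool:
    """Write the update unless it would move the shard backwards."""
    current = store.get(shard_key)
    if current is not None:
        if update.pass_id < current.pass_id:
            return False  # update belongs to an older pass
        if (
            update.pass_id == current.pass_id
            and update.objects_processed <= current.objects_processed
        ):
            return False  # same pass, no forward progress
    store[shard_key] = update
    return True


store: Dict[str, ShardCheckpoint] = {}
first = apply_checkpoint(store, "default", ShardCheckpoint("0001", "c1", 100))
stale = apply_checkpoint(store, "default", ShardCheckpoint("0001", "c0", 50))
next_pass = apply_checkpoint(store, "default", ShardCheckpoint("0002", None, 0))
```

The stale write is rejected, so a late or duplicate worker cannot rewind the cursor, while a later pass_id is accepted even though its objects_processed restarts at zero.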
Derived scheduling summary: mode, interval, next run, and last successful run.
Derived progress summary for API observability.

