Create a new storage provider connection.
Establishes a connection to an external storage provider (Google Drive, S3, etc.) for use in sync operations. Credentials are validated before saving unless test_before_save is False.
Example:
curl -X POST "http://localhost:8000/v1/organizations/connections" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Marketing Drive",
"provider_type": "google_drive",
"provider_config": {
"credentials": {...},
"shared_drive_id": "0AH-Xabc123"
}
}'
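For readers scripting this call, the curl example above can be mirrored in Python. This is a minimal sketch, not an official client: it only builds the request object (nothing is sent), the helper name `build_create_connection_request` is hypothetical, and the empty `credentials` object is a placeholder for your real provider credentials.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
API_URL = "http://localhost:8000/v1/organizations/connections"


def build_create_connection_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a connection.

    Hypothetical helper for illustration; not part of any Mixpeek SDK.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


payload = {
    "name": "Marketing Drive",
    "provider_type": "google_drive",
    "provider_config": {
        # Placeholder: supply your real service account credentials here.
        "credentials": {},
        "shared_drive_id": "0AH-Xabc123",
    },
}

request = build_create_connection_request("YOUR_API_KEY", payload)
# To actually send it (requires a running API server):
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

Building the request separately from sending it makes the headers and body easy to inspect before any credentials leave your machine.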
REQUIRED. Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer YOUR_API_KEY"
Request payload for creating a new storage connection.
Use this to connect Mixpeek to external storage providers like Google Drive or S3. The connection will be tested before being saved (unless test_before_save is False).
Examples:
# Google Drive connection
{
"name": "Marketing Drive",
"provider_type": "google_drive",
"provider_config": {
"credentials": {...},
"shared_drive_id": "0AH-Xabc123"
},
"description": "Team drive for marketing assets"
}
REQUIRED. Human-readable name for the storage connection. Must be unique within the organization. Displayed in dashboards and sync logs. Format: 1-100 characters, descriptive of the connection's purpose.
Length: 1 - 100 characters
"Marketing Google Drive"
"Production S3 Bucket"
"Customer Assets Archive"
REQUIRED. Storage provider to connect to. Supported providers include google_drive, s3, snowflake, sharepoint, and tigris. Determines which authentication and sync logic is used.
Accepted values: google_drive, s3, snowflake, sharepoint, tigris, postgresql, instagram, tiktok, rss, http_api, box, brightdata, backblaze
"google_drive"
"s3"
"snowflake"
"sharepoint"
"tigris"
REQUIRED. Provider-specific configuration including credentials. Structure varies by provider_type. SECURITY: Sensitive credential fields (private_key, secret_access_key, client_secret, refresh_token, session_token) are automatically encrypted at rest and never appear in responses or logs.
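If you log or print request payloads on your own side, the same sensitive fields can be masked before they are written anywhere. The sketch below is a client-side illustration only and says nothing about how the service encrypts or filters credentials internally; the function name `redact` is hypothetical, and the field list is taken from the description above.

```python
# Sensitive field names as listed in the provider_config description.
SENSITIVE_FIELDS = {
    "private_key",
    "secret_access_key",
    "client_secret",
    "refresh_token",
    "session_token",
}
PLACEHOLDER = "***REDACTED***"


def redact(value):
    """Return a copy of `value` with sensitive fields masked at any nesting depth.

    Hypothetical helper for local logging; the input is not modified.
    """
    if isinstance(value, dict):
        return {
            key: PLACEHOLDER if key in SENSITIVE_FIELDS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


config = {
    "bucket": "my-bucket",
    "credentials": {
        "access_key_id": "AKIA...",
        "secret_access_key": "example-secret",  # made-up value for the demo
    },
    "region": "us-east-1",
}
safe_to_log = redact(config)
```

Because `redact` builds new containers rather than editing in place, the original configuration can still be sent to the API after the masked copy is logged.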
{
"credentials": {
"client_email": "sync@project.iam.gserviceaccount.com",
"type": "service_account"
},
"description": "Google Drive configuration",
"shared_drive_id": "0AH-Xabc123"
}
{
"bucket": "my-bucket",
"credentials": {
"access_key_id": "AKIA...",
"secret_access_key": "***REDACTED***"
},
"description": "S3 configuration",
"region": "us-east-1"
}
OPTIONAL. Description explaining the connection's purpose and scope. Helpful for team collaboration and documentation. Format: Up to 500 characters.
Maximum length: 500 characters
"Shared drive for marketing team assets and campaign materials"
OPTIONAL. Arbitrary key-value metadata for tagging and categorization. Common uses: team tags, cost center codes, project identifiers.
{
"cost_center": "CC-1234",
"team": "marketing"
}
OPTIONAL. Whether to validate credentials before saving the connection. Defaults to True. If True, the connection is tested against the provider before creation. If False, the connection is saved without validation (use with caution).
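The field constraints above can be checked locally before a request is sent. The following is a rough sketch under stated assumptions: the function name `validate_create_payload` is hypothetical, the provider set is copied from the accepted-values list in this reference (which is longer than the five providers named in the field description), and the server remains the authority on what is actually valid.

```python
# Accepted provider_type values as enumerated in this reference.
ACCEPTED_PROVIDERS = {
    "google_drive", "s3", "snowflake", "sharepoint", "tigris", "postgresql",
    "instagram", "tiktok", "rss", "http_api", "box", "brightdata", "backblaze",
}


def validate_create_payload(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found.

    Hypothetical client-side check; it does not replace server validation.
    """
    errors = []

    name = payload.get("name")
    if not isinstance(name, str) or not 1 <= len(name) <= 100:
        errors.append("name must be a string of 1-100 characters")

    if payload.get("provider_type") not in ACCEPTED_PROVIDERS:
        errors.append("provider_type is not an accepted value")

    if not isinstance(payload.get("provider_config"), dict):
        errors.append("provider_config must be an object")

    description = payload.get("description")
    if description is not None and (
        not isinstance(description, str) or len(description) > 500
    ):
        errors.append("description must be a string of at most 500 characters")

    # test_before_save defaults to True when omitted.
    if not isinstance(payload.get("test_before_save", True), bool):
        errors.append("test_before_save must be a boolean")

    return errors


good = {
    "name": "Marketing Drive",
    "provider_type": "google_drive",
    "provider_config": {"shared_drive_id": "0AH-Xabc123"},
    "description": "Team drive for marketing assets",
}
bad = {"name": "", "provider_type": "dropbox", "description": "x" * 501}
```

Returning all problems at once, rather than raising on the first, lets a caller show every issue in a single pass.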
Successful Response
Canonical representation of an external storage provider connection.
Storage connections enable Mixpeek to access external cloud storage providers (Google Drive, S3, etc.) for automated file ingestion and synchronization. Each connection represents a configured integration with credentials, health monitoring, and usage tracking.
Lifecycle States:
- ACTIVE: Connection is healthy and ready for sync operations
- SUSPENDED: Temporarily disabled by user (credentials preserved)
- FAILED: Health checks failing (may need credential refresh)
- ARCHIVED: Permanently retired (cannot be reactivated)
Security:
- Sensitive credential fields are encrypted at rest using MongoDB client-side field level encryption (CSFLE)
- Credentials never appear in API responses or logs
- Failed authentication attempts are logged in last_error
- Consecutive failures trigger automatic suspension
Use Cases:
- Connect to team Google Drive for document ingestion
- Sync files from customer S3 buckets
- Monitor and process uploaded media files
- Schedule periodic sync operations
Health Monitoring:
- Automatic health checks validate connectivity and credentials
- consecutive_failures tracks authentication/network issues
- Auto-disable after 5 consecutive failures to prevent lockout
- last_error stores diagnostic information for debugging
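The health-monitoring rules above amount to a small counter with a threshold. The sketch below models that behavior for illustration only; it is not the service's implementation. The class name `ConnectionHealth` is hypothetical, and two behaviors are assumptions: that the status becomes SUSPENDED when the threshold is reached (the reference also describes a FAILED state and does not say which one is set), and that a later success resets the counter without reactivating the connection.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold stated in the health-monitoring notes.
MAX_CONSECUTIVE_FAILURES = 5


@dataclass
class ConnectionHealth:
    """Hypothetical model of the failure counter described above."""

    status: str = "ACTIVE"
    consecutive_failures: int = 0
    last_error: Optional[str] = None

    def record_success(self) -> None:
        # A successful check resets the counter and clears the last error.
        # Assumption: it does not change the status by itself.
        self.consecutive_failures = 0
        self.last_error = None

    def record_failure(self, message: str) -> None:
        # Each failure increments the counter and stores the diagnostic message.
        self.consecutive_failures += 1
        self.last_error = message
        if self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            # Assumption: SUSPENDED is the state used for automatic suspension.
            self.status = "SUSPENDED"


health = ConnectionHealth()
for _ in range(4):
    health.record_failure("Authentication failed: Invalid credentials")
status_after_four = health.status            # still below the threshold
health.record_success()                      # counter returns to 0
for _ in range(5):
    health.record_failure("Authentication failed: Invalid credentials")
status_after_five = health.status            # threshold reached
```

Because the counter tracks consecutive failures, an intermittent error followed by a success never accumulates toward the threshold.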
REQUIRED. Organization internal identifier for multi-tenancy scoping. All connection operations are scoped to this organization. Format: int_{24-character secure token}.
"int_org123abc456def789xyz012"
REQUIRED. Storage provider implementation to use. Determines which client adapter is loaded for sync operations. Supported providers include google_drive, s3, snowflake, sharepoint, and tigris.
Accepted values: google_drive, s3, snowflake, sharepoint, tigris, postgresql, instagram, tiktok, rss, http_api, box, brightdata, backblaze
Google Drive and Google Workspace shared drive configuration.
This configuration enables Mixpeek to connect to Google Drive for automated file ingestion and synchronization. Supports both personal Drive and Google Workspace shared drives (formerly Team Drives).
Authentication Options: - Service Account: Recommended for production. No user interaction required. - OAuth2: Suitable for personal Drive access or development.
Requirements: - Google Drive API enabled in Google Cloud Console - Appropriate authentication credentials configured - Files/folders shared with the service account or OAuth user - Network connectivity to drive.googleapis.com
Use Cases: - Sync marketing materials from shared drives - Ingest documents from team collaboration folders - Monitor and process uploaded media files - Archive and search historical documents
{
"credentials": {
"client_email": "sync@mixpeek-prod-456.iam.gserviceaccount.com",
"client_id": "123456789012345678901",
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
"private_key_id": "a1b2c3d4e5f6...",
"project_id": "mixpeek-prod-456",
"type": "service_account"
},
"description": "Service account accessing shared drive",
"provider_type": "google_drive",
"shared_drive_id": "0AH-Xabc123def456"
}
REQUIRED. Human-readable connection name for identification. Displayed in dashboards, sync logs, and API responses. Must be unique within the organization for clarity. Format: 1-100 characters, descriptive of the connection's purpose.
Length: 1 - 100 characters
"Marketing Google Drive"
"Production S3 Bucket"
"Customer Assets Archive"
REQUIRED. Identifier of the user who created this connection. Used for audit trails and permission checks. Format: usr_{15-character alphanumeric}. Immutable after creation.
"usr_34bbf6ded1b749"
Unique identifier for the storage connection. Auto-generated with 'conn_' prefix followed by secure random token. Format: conn_{15-character alphanumeric}. Used for API operations and audit trails.
"conn_abc123def456ghi"
"conn_xyz789uvw012qrs"
NOT REQUIRED. Description explaining the connection's purpose and scope. Helpful for team collaboration and documentation. Format: Up to 500 characters.
Maximum length: 500 characters
"Shared drive for marketing team assets and campaign materials"
Operational status of the connection. ACTIVE: Connection is healthy and ready for use in sync operations. SUSPENDED: Temporarily disabled by user, credentials preserved but sync paused. FAILED: Health checks failing, credentials may be invalid or expired. ARCHIVED: Permanently retired, cannot be reactivated. Status transitions automatically based on health checks and user actions.
Accepted values: PENDING, QUEUED, IN_PROGRESS, PROCESSING, COMPLETED, COMPLETED_WITH_ERRORS, FAILED, CANCELED, UNKNOWN, SKIPPED, DRAFT, ACTIVE, ARCHIVED, SUSPENDED
Quick boolean flag for filtering active connections in queries. True when status is ACTIVE, False for SUSPENDED/FAILED/ARCHIVED. Maintained automatically when status changes. Use for efficient filtering: db.connections.find({'is_active': True})
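The relationship between the status value and the boolean flag can be expressed in a couple of lines. This is an illustrative sketch only: the helper name `derive_is_active` and the sample records are made up, and only the rule itself (true exactly when the status is ACTIVE) comes from the description above.

```python
def derive_is_active(status: str) -> bool:
    """Hypothetical helper: the flag is True only for the ACTIVE status."""
    return status == "ACTIVE"


# Made-up sample records for the demo.
connections = [
    {"name": "Marketing Google Drive", "status": "ACTIVE"},
    {"name": "Production S3 Bucket", "status": "SUSPENDED"},
    {"name": "Customer Assets Archive", "status": "ARCHIVED"},
]

active_names = [c["name"] for c in connections if derive_is_active(c["status"])]
```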
NOT REQUIRED. UTC timestamp of the most recent successful sync operation. Updated automatically after each successful file sync/list operation. None if connection has never been used. Useful for identifying stale connections and usage analytics.
NOT REQUIRED. Most recent error message from a failed health check or sync. Populated when authentication fails, network errors occur, or permissions are denied. None when the connection is healthy. Format: Error message truncated to 1000 characters. Used for diagnostics and troubleshooting.
Maximum length: 1000 characters
"Authentication failed: Invalid credentials"
Counter tracking consecutive failed health checks or sync attempts. Incremented on each failure, reset to 0 on success. Used to implement automatic connection suspension. Auto-suspend after 5 consecutive failures to prevent account lockout. Range: 0 to infinity (typically 0-10).
Required range: x >= 0
UTC timestamp when the connection was created. Auto-generated using shared.utilities.helpers.current_time(). Immutable after creation. Format: ISO 8601 datetime.
UTC timestamp of the most recent update to the connection. Updated automatically on any field modification. Tracks configuration changes, status updates, and credential refreshes. Format: ISO 8601 datetime.
Arbitrary key-value metadata provided by the user. Useful for tagging, categorization, and custom annotations. NOT REQUIRED - defaults to empty dictionary. Common uses: team tags, cost center codes, project identifiers.
{
"cost_center": "CC-1234",
"team": "marketing"
}
{
"environment": "production",
"project": "q4-campaign"
}
{
"tags": ["customer-facing", "high-priority"]
}