Create Object

curl --request POST \
  --url https://api.mixpeek.com/v1/buckets/{bucket_identifier}/objects \
  --header 'Content-Type: application/json' \
  --data '
{
  "blobs": [
    {
      "data": {
        "num_pages": 5,
        "title": "Service Agreement 2024"
      },
      "key_prefix": "/contract-2024/content.pdf",
      "metadata": {
        "author": "John Doe",
        "department": "Legal"
      },
      "property": "content",
      "type": "json"
    },
    {
      "data": {
        "filename": "https://example.com/images/smartphone-x1.jpg",
        "mime_type": "image/jpeg"
      },
      "key_prefix": "/contract-2024/thumbnail.jpg",
      "metadata": {
        "height": 300,
        "width": 200
      },
      "property": "thumbnail",
      "type": "image"
    }
  ],
  "key_prefix": "/documents",
  "metadata": {
    "category": "contracts",
    "status": "draft",
    "year": 2024
  }
}
'

{
  "bucket_id": "<string>",
  "object_id": "<string>",
  "key_prefix": "<string>",
  "blobs": [
    {
      "property": "<string>",
      "blob_id": "<string>",
      "key_prefix": "/videos/video.mp4",
      "properties": {},
      "presigned_url": "<string>",
      "details": {
        "filename": "<string>",
        "size_bytes": 123,
        "mime_type": "<string>",
        "hash": "<string>"
      }
    }
  ],
  "source_details": [
    {
      "source_id": "<string>"
    }
  ],
  "status": "DRAFT",
  "error": "Failed to process object: Object not found",
  "created_at": "2023-11-07T05:31:56Z",
  "updated_at": "2023-11-07T05:31:56Z",
  "document_count": 123
}

Headers

Authorization

string

Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.

Example:

"Bearer YOUR_MIXPEEK_API_KEY"

X-Namespace

string

Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'. Falls back to ?namespace= query parameter if the header is omitted.

Examples:

"ns_abc123def456"

"production"

"my-namespace"
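
As an illustration of the two headers described above, the following is a minimal Python sketch that assembles them into a dictionary. The function name `build_headers` is hypothetical and not part of any Mixpeek SDK; the header names and the `Bearer` format are taken from this page.

```python
from typing import Dict, Optional


def build_headers(api_key: str, namespace: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers for the Create Object call (hypothetical helper)."""
    headers = {
        # Format documented above: 'Bearer sk_xxxxxxxxxxxxx'
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # X-Namespace accepts either a namespace ID (ns_...) or a custom name.
    # When it is omitted, the API falls back to the ?namespace= query parameter.
    if namespace is not None:
        headers["X-Namespace"] = namespace
    return headers
```

The returned dictionary can be passed to whichever HTTP client you use.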

Path Parameters

bucket_identifier

string

required

The unique identifier of the bucket.

Query Parameters

policy

string | null

Insertion policy for unique key enforcement. Valid values: 'insert', 'update', 'upsert'. Only applies if the bucket has unique_key configured. If provided, overrides the bucket's default_policy.

Example:

"insert"

auto_process

boolean

default:false

Automatically create a batch and submit it for processing. When true, the object will be immediately queued for processing without requiring separate batch creation and submission calls. Ideal for onboarding and single-object workflows.
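
The path and query parameters above can be combined into a request URL as in the sketch below. The helper name `create_object_url` is hypothetical, the base URL is taken from the curl example on this page, and serializing the boolean as the string `true` is an assumption about what the API accepts.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.mixpeek.com/v1"  # from the curl example on this page
VALID_POLICIES = {"insert", "update", "upsert"}


def create_object_url(
    bucket_identifier: str,
    policy: Optional[str] = None,
    auto_process: bool = False,
) -> str:
    """Build the Create Object URL with optional query parameters (hypothetical helper)."""
    url = f"{BASE_URL}/buckets/{quote(bucket_identifier, safe='')}/objects"
    params = {}
    if policy is not None:
        if policy not in VALID_POLICIES:
            raise ValueError(f"policy must be one of {sorted(VALID_POLICIES)}")
        params["policy"] = policy
    if auto_process:
        # Assumption: the API accepts the lowercase string "true" for booleans.
        params["auto_process"] = "true"
    return f"{url}?{urlencode(params)}" if params else url
```

Leaving both parameters at their defaults yields the bare endpoint URL, which matches the documented default of `auto_process` being false.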

Body

application/json

Request model for creating a bucket object.

Objects can be created with blobs from two sources:

- Direct data (URLs, base64): use the CreateBlobRequest.data field
- Upload references: use the CreateBlobRequest.upload_id field (from POST /buckets/{id}/uploads)

Upload Reference Workflow: For large files or client-side uploads, use the presigned URL workflow:

1. POST /buckets/{id}/uploads → Returns {upload_id, presigned_url}
2. User uploads file to presigned_url (client-side)
3. POST /uploads/{upload_id}/confirm → Validates upload
4. POST /buckets/{id}/objects with upload_id in blobs (this endpoint)

Use Cases:

- Single blob with direct data (simple)
- Multiple blobs from presigned uploads (recommended for large files)
- Mix of direct data and upload references
- Combine multiple uploads into one object

See Also:

- CreateBlobRequest for blob field documentation
- POST /buckets/{id}/uploads for presigned URL generation
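
To illustrate the last step of the upload reference workflow, here is a rough Python sketch that turns already confirmed upload IDs into a request body for this endpoint. The helper `object_body_from_uploads` is hypothetical, and the sketch assumes that `property` and `upload_id` are sufficient for each blob; CreateBlobRequest may require further fields (for example `type`), so check its documentation before relying on this shape.

```python
from typing import Any, Dict, Optional


def object_body_from_uploads(
    uploads: Dict[str, str],
    key_prefix: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a Create Object body from confirmed uploads (hypothetical helper).

    `uploads` maps a blob property name to the upload_id returned by
    POST /buckets/{id}/uploads and confirmed via POST /uploads/{upload_id}/confirm.
    """
    if not uploads:
        raise ValueError("at least one upload is required")
    body: Dict[str, Any] = {
        # Assumption: property + upload_id is a valid minimal blob entry.
        "blobs": [
            {"property": prop, "upload_id": upload_id}
            for prop, upload_id in uploads.items()
        ]
    }
    if key_prefix is not None:
        body["key_prefix"] = key_prefix
    return body
```

Passing several entries in `uploads` corresponds to the "combine multiple uploads into one object" use case listed above.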

key_prefix

string | null

Storage key/path prefix of the object, used to retrieve the object from storage. It sits at the root of the object.

Example:

"/contract-2024"

blobs

CreateBlobRequest · object[]

List of blobs to be created in this object

Example:

[
  {
    "data": {
      "num_pages": 5,
      "title": "Service Agreement 2024"
    },
    "key_prefix": "/content.pdf",
    "metadata": {
      "author": "John Doe",
      "department": "Legal"
    },
    "property": "content",
    "type": "PDF"
  }
]

idempotency_key

string | null

Client-generated idempotency key for safe retries. If an object with the same idempotency_key already exists in this bucket, the existing object is returned instead of creating a duplicate. Use a UUID or deterministic hash per object.

Maximum string length: 255
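
One way to derive a deterministic key per object, as suggested above, is to hash a canonical form of the request body. The sketch below is only an illustration: the helper name `make_idempotency_key` and the choice of SHA-256 over canonical JSON are assumptions, not something the API prescribes.

```python
import hashlib
import json
from typing import Any, Dict

MAX_KEY_LENGTH = 255  # documented maximum string length


def make_idempotency_key(bucket_identifier: str, body: Dict[str, Any]) -> str:
    """Derive a stable idempotency key from the bucket and request body (hypothetical helper)."""
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(
        f"{bucket_identifier}:{canonical}".encode("utf-8")
    ).hexdigest()
    # A SHA-256 hex digest is 64 characters, well under the limit.
    assert len(digest) <= MAX_KEY_LENGTH
    return digest
```

Because the key depends only on the bucket and the body, retrying the same request after a timeout produces the same key, so the existing object is returned rather than a duplicate being created.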

skip_duplicates

boolean

default:false

Skip duplicate blobs: if a blob with the same hash already exists, it is skipped.

canonicalize_source

boolean

default:true

Mirror non-S3 sources into internal S3 and reference canonically.

force_remirror

boolean

default:false

Force re-upload to S3 even if a blob with identical content already exists.

Response

Successful Response

Response model for bucket objects.

bucket_id

string

required

ID of the bucket this object belongs to

object_id

string

Unique identifier for the object

key_prefix

string | null

Storage key/path of the object, used to retrieve the object from storage. It is similar to a file path. If not provided, the object is placed in the root of the bucket.

blobs

BlobModel · object[]

List of blobs contained in this object

source_details

SourceDetails · object[]

Lineage/source details for this object; used for downstream references.

status

enum<string>

default:DRAFT

The current status of the object.

Available options: PENDING, QUEUED, IN_PROGRESS, PROCESSING, COMPLETED, COMPLETED_WITH_ERRORS, FAILED, CANCELED, INTERRUPTED, UNKNOWN, SKIPPED, DRAFT, ACTIVE, ARCHIVED, SUSPENDED

error

string | null

The error message if the object failed to process.

Example:

"Failed to process object: Object not found"

created_at

string<date-time> | null

Timestamp when the object was created. Automatically populated by the system.

updated_at

string<date-time> | null

Timestamp when the object was last updated. Automatically populated by the system.

document_count

integer | null

Number of documents produced from this object across all collections. Populated on GET requests. Null on list responses (expensive query). Use this to check if an object has already been processed.
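
The nullable response fields above need some care on the client side. The following Python sketch shows one possible way to read them from a decoded response dictionary; the helper names `parse_timestamp` and `has_documents` are hypothetical, and treating a count greater than zero as "already processed" is an interpretation of the description above.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse created_at / updated_at, which may be null (hypothetical helper)."""
    if value is None:
        return None
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def has_documents(obj: Dict[str, Any]) -> Optional[bool]:
    """Report whether the object has produced documents.

    Returns None when document_count is null (for example on list
    responses), because the answer is then unknown rather than False.
    """
    count = obj.get("document_count")
    if count is None:
        return None
    return count > 0
```

Returning `None` instead of `False` for a missing count keeps callers from mistaking a list response for an unprocessed object.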
