Export collection documents to JSON, CSV, or Parquet format.
Export Formats: json (line-delimited), csv, or parquet (default).
Vector Export:
Vectors are large and exported separately. When include_vectors=True,
a separate file is created for vectors with document_id mapping.
Field Selection:
Use select_fields to export only specific fields, reducing file size.
Filtering: Apply filters to export a subset of documents.
Response: Returns presigned download URLs valid for 1 hour.
Limits:
Use sample_size to cap the number of exported documents when testing.
REQUIRED: Bearer token authentication using your API key. Format: 'Bearer sk_xxxxxxxxxxxxx'. You can create API keys in the Mixpeek dashboard under Organization Settings.
"Bearer YOUR_API_KEY"
"Bearer YOUR_STRIPE_API_KEY"
Namespace identifier for scoping this request. All resources (collections, buckets, taxonomies, etc.) are scoped to a namespace. You can provide either the namespace name or namespace ID. Format: ns_xxxxxxxxxxxxx (ID) or a custom name like 'my-namespace'. Falls back to ?namespace= query parameter if the header is omitted.
"ns_abc123def456"
"production"
"my-namespace"
The ID or name of the collection to export
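As a minimal sketch of the authentication and namespace parameters described above, the request headers could be assembled as follows. The header name `X-Namespace` and the helper name `build_headers` are assumptions for illustration; the text above does not name the namespace header, so check it against the API reference before relying on it.

```python
from typing import Dict, Optional


def build_headers(api_key: str, namespace: Optional[str] = None) -> Dict[str, str]:
    """Build headers for an export request (illustrative helper, not part of any SDK)."""
    if not api_key:
        raise ValueError("api_key is required")
    headers = {
        # Documented format: 'Bearer sk_xxxxxxxxxxxxx'
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if namespace:
        # ASSUMPTION: "X-Namespace" is a guessed header name.
        # The namespace may be an ID (ns_...) or a custom name.
        headers["X-Namespace"] = namespace
    return headers
```

If the header is omitted, the documented fallback is the `?namespace=` query parameter.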
Request model for exporting collection data.
Export Formats: json (line-delimited), csv, or parquet (default).
Vector Export:
Vectors are stored separately from document metadata due to their large size.
When include_vectors=True, vectors are exported to a separate file with the naming convention:
{collection_name}_vectors.{format}
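The naming convention above can be expressed as a small helper. This is only a sketch: the function name `vectors_filename` is hypothetical, and it assumes the `{format}` placeholder is filled with the literal format value (`json`, `csv`, or `parquet`).

```python
ALLOWED_FORMATS = ("json", "csv", "parquet")


def vectors_filename(collection_name: str, export_format: str) -> str:
    """Return the expected vectors file name: {collection_name}_vectors.{format}."""
    if export_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {export_format!r}")
    return f"{collection_name}_vectors.{export_format}"
```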
Field Selection:
Use select_fields to export only specific fields, reducing file size for large collections.
Supports dot notation for nested fields (e.g., "metadata.title").
Filtering: Apply filters to export a subset of documents. Uses the same LogicalOperator format as the documents list endpoint.
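To make the request model concrete, here is a hedged sketch of a payload builder. The field names `include_vectors`, `select_fields`, and `sample_size` come from this page; the keys `format` and `filters`, the `False` default for `include_vectors`, and the helper name `build_export_request` are assumptions. The filter object is passed through untouched because its exact LogicalOperator shape is not specified here.

```python
from typing import Any, Dict, List, Optional

ALLOWED_FORMATS = ("json", "csv", "parquet")


def build_export_request(
    export_format: str = "parquet",  # parquet is the documented default
    include_vectors: bool = False,   # ASSUMPTION: default not stated on this page
    select_fields: Optional[List[str]] = None,
    filters: Optional[Dict[str, Any]] = None,
    sample_size: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable export request body (illustrative only)."""
    if export_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {export_format!r}")
    # Documented range for sample_size: 1 <= x <= 1000000
    if sample_size is not None and not 1 <= sample_size <= 1_000_000:
        raise ValueError("sample_size must be between 1 and 1000000")

    body: Dict[str, Any] = {
        "format": export_format,  # ASSUMPTION: key name
        "include_vectors": include_vectors,
    }
    if select_fields:
        body["select_fields"] = list(select_fields)
    if filters is not None:
        body["filters"] = filters  # ASSUMPTION: key name; shape passed through as-is
    if sample_size is not None:
        body["sample_size"] = sample_size
    return body
```

Validating `sample_size` on the client side catches out-of-range values before a request is sent.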
Export format: json (line-delimited), csv, or parquet (default).
Allowed values: json, csv, parquet
Whether to include vectors in the export. Vectors are exported to a separate file due to their large size. This significantly increases export time and file size.
Specific fields to include in the export. If not provided, all fields are exported. Supports dot notation for nested fields (e.g., 'metadata.title', 'metadata.author').
[
"document_id",
"metadata.title",
"metadata.category"
]
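The dot notation above can be illustrated with a client-side sketch that resolves dotted paths against a nested document. This is not the server's implementation; the helper names `get_path` and `project` are hypothetical, and returning `None` for a missing path is a choice made for this example only.

```python
from typing import Any, Dict, List


def get_path(doc: Dict[str, Any], dotted: str) -> Any:
    """Resolve a dotted path such as 'metadata.title'; return None if any part is missing."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def project(doc: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    """Keep only the requested fields, keyed by their dotted path."""
    return {field: get_path(doc, field) for field in fields}
```

For example, projecting a document onto `["document_id", "metadata.title", "metadata.category"]` drops every other field, such as `metadata.author`.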
Filter conditions to export only matching documents. Uses LogicalOperator format (AND/OR/NOT) same as document listing.
Maximum number of documents to export. If not provided, exports all documents. Useful for testing exports or creating sample datasets.
Required range: 1 <= x <= 1000000
Successful Response
Response model for collection export.
Contains the presigned URL for downloading the exported file. The URL is valid for a limited time (typically 1 hour).
Presigned URL for downloading the exported file. Valid for 1 hour.
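Because the presigned URL expires, it makes sense to download the file soon after the export completes. The sketch below streams a URL to disk using only the standard library; the function name `download_export` and the 60-second timeout are assumptions, and the URL is whatever value the response provides.

```python
import shutil
import urllib.request
from pathlib import Path


def download_export(url: str, dest: str, timeout: float = 60.0) -> int:
    """Stream the file at `url` to `dest` and return the number of bytes written."""
    dest_path = Path(dest)
    # Copy in chunks so large exports are not held in memory.
    with urllib.request.urlopen(url, timeout=timeout) as resp, dest_path.open("wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path.stat().st_size
```

The returned byte count can be compared with the file size reported in the response as a basic integrity check.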
Full S3 path where the export is stored (for internal reference).
The format of the exported file.
Allowed values: json, csv, parquet
Number of documents included in the export.
Required range: x >= 0
Size of the exported file in bytes.
Required range: x >= 0
Timestamp when the export was completed.
Presigned URL for downloading the vectors file (if include_vectors=True). Vectors are exported separately due to their large size.
Full S3 path for the vectors file (if include_vectors=True).
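Once both files are downloaded, a line-delimited JSON export can be read one record per line, and the vectors can be matched back to documents through `document_id`. The following is a sketch under assumptions: it presumes the vectors file is also line-delimited JSON when `format` is `json`, and that every record in both files carries a `document_id` key. The helper names are hypothetical, and the layout of the vector records is left opaque.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_json_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line of a line-delimited JSON source."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def join_vectors(
    documents: Iterable[Dict[str, Any]],
    vector_records: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Pair each document with its vector record (or None) via document_id."""
    # ASSUMPTION: both files expose a "document_id" key on every record.
    by_id = {rec["document_id"]: rec for rec in vector_records}
    return [
        {"document": doc, "vectors": by_id.get(doc["document_id"])}
        for doc in documents
    ]
```

An open file object can be passed directly to `iter_json_lines`, since iterating a text file yields its lines.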