Export Collection
Export collection documents to JSON, CSV, or Parquet format.
Export Formats:
- JSON: Line-delimited JSON (JSONL) format. Good for streaming.
- CSV: Comma-separated values. Best for spreadsheets.
- PARQUET: Columnar format (default). Best for data pipelines.
Vector Export:
Vectors are large and exported separately. When include_vectors=True,
a separate file is created for vectors with document_id mapping.
Field Selection:
Use select_fields to export only specific fields, reducing file size.
Filtering: Apply filters to export a subset of documents.
Response: Returns presigned download URLs valid for 1 hour.
Limits:
- Large exports may take time. Consider using sample_size for testing.
- Vector exports significantly increase processing time.
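As a rough illustration of how the options above combine, here is a minimal sketch that assembles a request body in Python. The parameter names include_vectors, select_fields, and sample_size come from this page; the key names "format" and "filters", and the helper build_export_body itself, are assumptions made for the example and should be checked against the actual request schema.

```python
from typing import Any, Optional

ALLOWED_FORMATS = {"json", "csv", "parquet"}
MAX_SAMPLE_SIZE = 1_000_000  # upper bound documented for sample_size


def build_export_body(
    export_format: str = "parquet",
    include_vectors: bool = False,
    select_fields: Optional[list[str]] = None,
    filters: Optional[dict[str, Any]] = None,
    sample_size: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble a request body, omitting optional parameters that are unset."""
    if export_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {export_format!r}")
    if sample_size is not None and not 1 <= sample_size <= MAX_SAMPLE_SIZE:
        raise ValueError("sample_size must be between 1 and 1000000")

    # "format" and "filters" are assumed key names; the other keys appear in the text.
    body: dict[str, Any] = {"format": export_format, "include_vectors": include_vectors}
    if select_fields:
        body["select_fields"] = list(select_fields)
    if filters:
        body["filters"] = filters
    if sample_size is not None:
        body["sample_size"] = sample_size
    return body


# A small test export: two fields, first 100 documents, default Parquet output.
body = build_export_body(select_fields=["document_id", "metadata.title"], sample_size=100)
```

Validating the format and sample_size range on the client side simply surfaces mistakes before a potentially slow export request is sent.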
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Path Parameters
The ID or name of the collection to export
Body
Request model for exporting collection data.
Export Formats:
- JSON: Line-delimited JSON (JSONL) format, one document per line. Good for streaming and large files.
- CSV: Comma-separated values. Best for tabular data analysis in spreadsheets.
- PARQUET: Columnar format optimized for analytics. Best for large datasets and data pipelines.
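Because a JSON export holds one document per line, it can be consumed incrementally rather than loaded whole. The sketch below uses only the Python standard library; the helper name iter_jsonl and the sample documents are illustrative, not part of the API.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed document per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


# Stand-in for a downloaded export file; real code would pass an open file handle.
sample = io.StringIO('{"document_id": "a1"}\n{"document_id": "b2"}\n')
docs = list(iter_jsonl(sample))
```

Iterating line by line keeps memory use flat regardless of export size, which is the main reason to choose JSON for streaming.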
Vector Export:
Vectors are stored separately from document metadata due to their large size.
When include_vectors=True, vectors are exported to a separate file with the naming convention:
{collection_name}_vectors.{format}
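The naming convention can be expressed as a one-line helper, which may be handy when pairing a downloaded vectors file with its document export. This is only a sketch: the function name and the collection name "articles" are hypothetical, and the format value is passed through unchanged because this page does not say whether JSON exports use a different file extension.

```python
def vectors_filename(collection_name: str, export_format: str) -> str:
    """Apply the documented pattern {collection_name}_vectors.{format}."""
    return f"{collection_name}_vectors.{export_format}"


name = vectors_filename("articles", "parquet")
```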
Field Selection:
Use select_fields to export only specific fields, reducing file size for large collections.
Supports dot notation for nested fields (e.g., "metadata.title").
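To make the dot notation concrete, the following sketch resolves dotted paths against a nested document on the client side. It illustrates the path semantics only; it is not the service's implementation, and how the export shapes nested fields in its output (flattened keys versus nested objects) is an assumption here.

```python
from typing import Any

_MISSING = object()  # sentinel so that None remains a legitimate field value


def resolve_path(document: dict[str, Any], path: str) -> Any:
    """Walk a dotted path such as "metadata.title" through nested dicts."""
    current: Any = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def project(document: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Keep only the requested fields, keyed by their dotted path."""
    selected: dict[str, Any] = {}
    for path in fields:
        value = resolve_path(document, path)
        if value is not _MISSING:
            selected[path] = value
    return selected


doc = {"document_id": "a1", "metadata": {"title": "Intro", "author": "Lee"}, "body": "..."}
row = project(doc, ["document_id", "metadata.title", "metadata.year"])
```

In this sketch a path that does not exist in the document (metadata.year) is silently skipped; the service may treat missing fields differently.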
Filtering: Apply filters to export a subset of documents. Uses the same LogicalOperator format as the documents list endpoint.
Export format: json (line-delimited), csv, or parquet (default).
Allowed values: json, csv, parquet
Whether to include vectors in the export. Vectors are exported to a separate file due to their large size. This significantly increases export time and file size.
Specific fields to include in the export. If not provided, all fields are exported. Supports dot notation for nested fields (e.g., 'metadata.title', 'metadata.author').
[
"document_id",
"metadata.title",
"metadata.category"
]
Filter conditions to export only matching documents. Uses the same LogicalOperator format (AND/OR/NOT) as document listing.
Maximum number of documents to export. If not provided, exports all documents. Useful for testing exports or creating sample datasets.
Range: 1 <= x <= 1000000
Response
Successful Response
Response model for collection export.
Contains the presigned URL for downloading the exported file. The URL is valid for a limited time (typically 1 hour).
Presigned URL for downloading the exported file. Valid for 1 hour.
Full S3 path where the export is stored (for internal reference).
The format of the exported file.
Allowed values: json, csv, parquet
Number of documents included in the export.
Range: x >= 0
Size of the exported file in bytes.
Range: x >= 0
Timestamp when the export was completed.
Presigned URL for downloading the vectors file (if include_vectors=True). Vectors are exported separately due to their large size.
Full S3 path for the vectors file (if include_vectors=True).
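Since the download URLs expire, a client that queues downloads may want to check whether a URL is likely still usable before fetching. The sketch below assumes the 1-hour window starts at the export's completion timestamp, which this page does not state explicitly; the helper name url_expired and the sample timestamps are illustrative.

```python
from datetime import datetime, timedelta, timezone

URL_LIFETIME = timedelta(hours=1)  # validity period stated for presigned URLs


def url_expired(issued_at: datetime, now: datetime, lifetime: timedelta = URL_LIFETIME) -> bool:
    """Return True once the assumed validity window has elapsed."""
    return now >= issued_at + lifetime


# Assumption: the completion timestamp marks the start of the validity window.
issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
still_valid = not url_expired(issued, issued + timedelta(minutes=59))
expired = url_expired(issued, issued + timedelta(minutes=61))
```

If a URL has expired, requesting a fresh export is the only recovery path described on this page.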

