
# Get Object

> This endpoint retrieves an object by its ID from the specified bucket.

**Presigned URLs**: Set the `return_presigned_urls=true` query parameter to generate fresh presigned download URLs
for all blobs with S3 storage (default: `false`). Each URL is returned on the blob as `presigned_url`, mirrored at
`properties.presigned_url` for backward compatibility, and expires after 1 hour.
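
As a rough illustration of reading these URLs from a response, the sketch below prefers the top-level `presigned_url` on each blob and falls back to the `properties.presigned_url` mirror. The helper name `blob_download_urls` and the `example.com` URLs are made up for this example; only the field names come from the schema on this page.

```python
def blob_download_urls(obj):
    """Map each blob's property name to its presigned URL, if one was returned."""
    urls = {}
    for blob in obj.get("blobs", []):
        # Prefer the canonical top-level field; fall back to the legacy mirror.
        legacy = (blob.get("properties") or {}).get("presigned_url")
        url = blob.get("presigned_url") or legacy
        if url:
            urls[blob["property"]] = url
    return urls


# Hypothetical response body, trimmed to the fields used above.
sample = {
    "bucket_id": "bkt_9xy8z7",
    "blobs": [
        {"property": "content", "type": "pdf",
         "presigned_url": "https://example.com/a"},
        {"property": "thumbnail", "type": "image",
         "properties": {"presigned_url": "https://example.com/b"}},
        {"property": "video", "type": "video"},  # no URL returned for this blob
    ],
}
print(blob_download_urls(sample))
```

Because the URLs expire after 1 hour, it is probably safer to request the object again when a download is needed than to cache the URLs for long.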

**Document count**: `document_count` (the number of documents this object produced, determined via vector-store
lineage) is computed by default. Computing it fans out to the vector partition, which can be slow on a
cold or serverless partition. Pass `include_document_count=false` to skip it in latency-sensitive,
interactive views (e.g. an object-detail modal) that don't render the count. Even when requested,
the count is best-effort and bounded by a deadline: it returns `null` rather than blocking the
response if the partition is cold.
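
A minimal sketch of a request that uses both query parameters, written with only the Python standard library. The helper names `build_object_url` and `get_object` are illustrative, and the `Authorization: Bearer` header is an assumption: this page does not document the authentication scheme, so adjust the headers to whatever your account requires.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.mixpeek.com"  # production server listed in the spec


def build_object_url(bucket_identifier, object_identifier,
                     return_presigned_urls=False, include_document_count=True):
    """Build the GET URL, sending only parameters that differ from their defaults."""
    path = "/v1/buckets/{}/objects/{}".format(
        urllib.parse.quote(bucket_identifier, safe=""),
        urllib.parse.quote(object_identifier, safe=""),
    )
    params = {}
    if return_presigned_urls:
        params["return_presigned_urls"] = "true"
    if not include_document_count:
        params["include_document_count"] = "false"
    query = urllib.parse.urlencode(params)
    return BASE_URL + path + ("?" + query if query else "")


def get_object(url, api_key, timeout=30.0):
    """Fetch the object and decode the JSON body.

    ASSUMPTION: a Bearer token in the Authorization header; this page does
    not specify how requests are authenticated.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": "Bearer " + api_key}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Fast path for a detail view: presigned URLs, no document-count fan-out.
url = build_object_url("bkt_9xy8z7", "obj_123abc456def",
                       return_presigned_urls=True,
                       include_document_count=False)
print(url)
```

With `urllib`, non-2xx responses raise `urllib.error.HTTPError`, so the 4xx and 5xx bodies listed in the spec would be read from the exception rather than from the return value. When the count is requested, callers should still treat `document_count` as possibly `None`.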


## OpenAPI

````yaml get /v1/buckets/{bucket_identifier}/objects/{object_identifier}
openapi: 3.1.0
info:
  title: Mixpeek API
  description: >-
    This is the Mixpeek API, providing access to various endpoints for data
    processing and retrieval.
  termsOfService: https://mixpeek.com/terms
  contact:
    name: Mixpeek Support
    url: https://mixpeek.com/contact
    email: info@mixpeek.com
  version: '0.82'
servers:
  - url: https://api.mixpeek.com
    description: Production
security: []
paths:
  /v1/buckets/{bucket_identifier}/objects/{object_identifier}:
    get:
      tags:
        - Bucket Objects
      summary: Get Object
      description: |-
        This endpoint retrieves an object by its ID from the specified bucket.

        **Presigned URLs**: Set the `return_presigned_urls=true` query parameter to generate fresh presigned download URLs
        for all blobs with S3 storage (default: `false`). Each URL is returned on the blob as `presigned_url`, mirrored at
        `properties.presigned_url` for backward compatibility, and expires after 1 hour.

        **Document count**: `document_count` (the number of documents this object produced, determined via vector-store
        lineage) is computed by default. Computing it fans out to the vector partition, which can be slow on a
        cold or serverless partition. Pass `include_document_count=false` to skip it in latency-sensitive,
        interactive views (e.g. an object-detail modal) that don't render the count. Even when requested,
        the count is best-effort and bounded by a deadline: it returns `null` rather than blocking the
        response if the partition is cold.
      operationId: >-
        get_object_v1_buckets__bucket_identifier__objects__object_identifier__get
      parameters:
        - name: bucket_identifier
          in: path
          required: true
          schema:
            type: string
            description: The unique identifier of the bucket.
            title: Bucket Identifier
          description: The unique identifier of the bucket.
        - name: object_identifier
          in: path
          required: true
          schema:
            type: string
            description: The unique identifier of the object.
            title: Object Identifier
          description: The unique identifier of the object.
        - name: return_presigned_urls
          in: query
          required: false
          schema:
            type: boolean
            description: >-
              Generate fresh presigned download URLs for all blobs with S3
              storage
            default: false
            title: Return Presigned Urls
          description: Generate fresh presigned download URLs for all blobs with S3 storage
        - name: include_document_count
          in: query
          required: false
          schema:
            type: boolean
            description: >-
              Compute document_count via vector-store lineage (default true).
              Pass false to skip the vector fan-out for latency-sensitive views
              that don't render the count.
            default: true
            title: Include Document Count
          description: >-
            Compute document_count via vector-store lineage (default true). Pass
            false to skip the vector fan-out for latency-sensitive views that
            don't render the count.
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ObjectResponse'
        '400':
          description: Bad Request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '401':
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '403':
          description: Forbidden
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '404':
          description: Not Found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
        '500':
          description: Internal Server Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
components:
  schemas:
    ObjectResponse:
      properties:
        object_id:
          type: string
          title: Object Id
          description: Unique identifier for the object
        bucket_id:
          type: string
          title: Bucket Id
          description: ID of the bucket this object belongs to
        key_prefix:
          anyOf:
            - type: string
            - type: 'null'
          title: Key Prefix
          description: >-
            Storage key/path of the object, used to retrieve the object from
            storage. It is similar to a file path. If not provided, the object
            is placed in the root of the bucket.
        blobs:
          items:
            $ref: '#/components/schemas/BlobModel'
          type: array
          title: Blobs
          description: List of blobs contained in this object
        source_details:
          items:
            $ref: '#/components/schemas/SourceDetails'
          type: array
          title: Source Details
          description: >-
            Lineage/source details for this object; used for downstream
            references.
        status:
          $ref: '#/components/schemas/TaskStatusEnum'
          description: The current status of the object.
          default: DRAFT
        error:
          anyOf:
            - type: string
            - type: 'null'
          title: Error
          description: The error message if the object failed to process.
          examples:
            - 'Failed to process object: Object not found'
        created_at:
          anyOf:
            - type: string
              format: date-time
            - type: 'null'
          title: Created At
          description: >-
            Timestamp when the object was created. Automatically populated by
            the system.
        updated_at:
          anyOf:
            - type: string
              format: date-time
            - type: 'null'
          title: Updated At
          description: >-
            Timestamp when the object was last updated. Automatically populated
            by the system.
        document_count:
          anyOf:
            - type: integer
            - type: 'null'
          title: Document Count
          description: >-
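            Number of documents produced from this object across all
            collections. Populated on GET requests; null when
            `include_document_count=false` is passed or when the best-effort
            count misses its deadline. Null on list responses (expensive
            query). Use this to check whether an object has already been
            processed.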
        consistency:
          anyOf:
            - $ref: '#/components/schemas/WriteConsistency'
            - type: 'null'
          description: >-
            When and how this object becomes a searchable document. Set on write
            (create) responses: managed ingestion is visible only after a
            collection batch processes the object — poll until document_count >
            0.
      additionalProperties: true
      type: object
      required:
        - bucket_id
      title: ObjectResponse
      description: Response model for bucket objects.
      examples:
        - blobs:
            - blob_id: blob_1
              data:
                num_pages: 5
                title: Service Agreement 2024
              key_prefix: /contract-2024/content.pdf
              metadata:
                author: John Doe
                department: Legal
              property: content
              type: pdf
          bucket_id: bkt_9xy8z7
          content_hash: 28a9f5e8...
          created_at: '2024-10-21T10:30:00Z'
          key_prefix: /contract-2024
          metadata:
            category: contracts
            year: 2024
          object_id: obj_123abc456def
          skip_duplicates: false
          status: DRAFT
          updated_at: '2024-10-21T10:30:00Z'
    ErrorResponse:
      properties:
        success:
          type: boolean
          title: Success
          description: Always false for error responses
          default: false
        status:
          type: integer
          title: Status
          description: HTTP status code for this error
        error:
          $ref: '#/components/schemas/ErrorDetail'
          description: Error details payload
      type: object
      required:
        - status
        - error
      title: ErrorResponse
      description: Error response model.
      examples:
        - error:
            details:
              id: ns_123
              resource: namespace
            message: Namespace not found
            type: NotFoundError
          status: 404
          success: false
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    BlobModel:
      properties:
        blob_id:
          type: string
          title: Blob Id
          description: Unique identifier for the blob
          examples:
            - blob_a1b2c3d4e5f6
        property:
          type: string
          title: Property
          description: Property name of the blob
          examples:
            - video
            - thumbnail
            - content
        key_prefix:
          anyOf:
            - type: string
            - type: 'null'
          title: Key Prefix
          description: >-
            Storage key/path of the blob, used to retrieve the blob from
            storage. It is similar to a file path. If not provided, the blob
            is placed in the root of the bucket.
          examples:
            - /videos/video.mp4
            - /thumbnails/thumb.jpg
        type:
          $ref: '#/components/schemas/BucketSchemaFieldType'
          description: >-
            The schema field type this blob corresponds to (e.g., image, pdf,
            video)
          examples:
            - video
            - image
            - pdf
            - text
        properties:
          additionalProperties: true
          type: object
          title: Properties
          description: >-
            All blob data and metadata unified (formerly separate 'data' and
            'metadata' fields). Contains URLs, dimensions, metadata, and any
            other blob-specific information.
          examples:
            - author: John Doe
              duration: 120
              resolution: 1920x1080
              tags:
                - product
                - demo
              url: s3://bucket/video.mp4
        presigned_url:
          anyOf:
            - type: string
            - type: 'null'
          title: Presigned Url
          description: >-
            Canonical top-level presigned URL for this blob. Matches the shape
            used by `document_blobs[].presigned_url` on document responses.
            Populated by the API when `?return_presigned_urls=true`. Also
            mirrored at `properties.presigned_url` for backward compatibility —
            prefer this top-level field; the nested path will be removed in a
            future release.
        details:
          $ref: '#/components/schemas/BlobDetails'
          description: System-generated file details (filename, size, hash, etc.)
      type: object
      required:
        - property
        - type
      title: BlobModel
      description: >-
        Model for a blob within a bucket object.


        Blobs store file references with a flat properties structure.

        All blob-specific data (formerly in separate 'data' and 'metadata'
        fields)

        is now unified in a single 'properties' dictionary.


        Example:
            {
                "blob_id": "blob_xyz123",
                "property": "video",
                "type": "video",
                "properties": {
                    "url": "s3://bucket/video.mp4",
                    "duration": 120,
                    "resolution": "1920x1080",
                    "author": "John Doe"  # All data unified here
                }
            }
    SourceDetails:
      properties:
        type:
          $ref: '#/components/schemas/SourceType'
          description: Immediate origin type from which this entity was derived.
        source_id:
          type: string
          title: Source Id
          description: >-
            Identifier of the immediate source entity (e.g., bucket_id,
            collection_id, taxonomy_id).
      type: object
      required:
        - type
        - source_id
      title: SourceDetails
      description: >-
        Source details for any document/point.


        Keep this intentionally minimal so specialized models (e.g.,
        DocumentSourceDetails)

        can extend it with domain-specific fields.
    TaskStatusEnum:
      type: string
      enum:
        - PENDING
        - QUEUED
        - IN_PROGRESS
        - PROCESSING
        - COMPLETED
        - COMPLETED_WITH_ERRORS
        - FAILED
        - CANCELED
        - INTERRUPTED
        - UNKNOWN
        - SKIPPED
        - DRAFT
        - ACTIVE
        - ARCHIVED
        - SUSPENDED
      title: TaskStatusEnum
      description: |-
        Enumeration of task statuses for tracking asynchronous operations.

        Task statuses indicate the current state of asynchronous operations like
        batch processing, object ingestion, clustering, and taxonomy execution.

        Status Categories:
            Operation Statuses: Track progress of async operations
            Lifecycle Statuses: Track entity state (buckets, collections, namespaces)

        Values:
            PENDING: Task is queued but has not started processing yet
            IN_PROGRESS: Task is currently being executed
            PROCESSING: Task is actively processing data (similar to IN_PROGRESS)
            COMPLETED: Task finished successfully with no errors
            COMPLETED_WITH_ERRORS: Task finished but some items failed (partial success)
            FAILED: Task encountered an error and could not complete
            CANCELED: Task was manually canceled by a user or system
            UNKNOWN: Task status could not be determined
            SKIPPED: Task was intentionally skipped
            DRAFT: Task is in draft state and not yet submitted

            ACTIVE: Entity is active and operational (for buckets, collections, etc.)
            ARCHIVED: Entity has been archived
            SUSPENDED: Entity has been temporarily suspended

        Terminal Statuses:
            COMPLETED, COMPLETED_WITH_ERRORS, FAILED, CANCELED are terminal statuses.
            Once a task reaches these states, it will not transition to another state.

        Partial Success Handling:
            COMPLETED_WITH_ERRORS indicates that the operation completed but some
            documents/items failed. The task result includes:
            - List of successful items
            - List of failed items with error details
            - Success rate percentage
            This allows clients to handle partial success scenarios appropriately.

        Polling Guidance:
            - Poll tasks in PENDING, QUEUED, IN_PROGRESS, or PROCESSING states
            - Stop polling when task reaches COMPLETED, COMPLETED_WITH_ERRORS, FAILED, or CANCELED
            - Use exponential backoff (1s → 30s) when polling
    WriteConsistency:
      properties:
        retriever_visible:
          type: string
          title: Retriever Visible
          description: >-
            Visibility model: 'eventual' (BYOV direct upsert — indexed within
            seconds) or 'after_processing' (managed ingestion — visible after a
            collection batch processes the object).
        recommended_header:
          anyOf:
            - type: string
            - type: 'null'
          title: Recommended Header
          description: Header to send on retriever execute for read-your-writes (BYOV).
        write_token_available:
          type: boolean
          title: Write Token Available
          description: Whether a write_token was issued for read-your-writes.
          default: false
        expected_visible_within_ms:
          anyOf:
            - type: integer
            - type: 'null'
          title: Expected Visible Within Ms
          description: >-
            Typical upper bound for visibility (BYOV indexing). Null when
            visibility depends on asynchronous processing (managed ingestion).
        poll:
          anyOf:
            - additionalProperties: true
              type: object
            - type: 'null'
          title: Poll
          description: >-
            How to poll for visibility when it depends on async processing:
            {endpoint, field, ready_when}.
        next_actions:
          items:
            additionalProperties: true
            type: object
          type: array
          title: Next Actions
          description: Actionable next steps to reach retriever visibility.
      type: object
      required:
        - retriever_visible
      title: WriteConsistency
      description: How and when a write becomes visible to retriever reads.
    ErrorDetail:
      properties:
        message:
          type: string
          title: Message
          description: Human-readable error message
        type:
          type: string
          title: Type
          description: Stable error type identifier (machine-readable)
        code:
          anyOf:
            - type: string
            - type: 'null'
          title: Code
          description: >-
            Fine-grained error code for programmatic handling (e.g.,
            namespace_name_taken, feature_extractor_not_found). Present only
            when consumers may need to branch on a specific error condition.
        details:
          anyOf:
            - additionalProperties: true
              type: object
            - type: 'null'
          title: Details
          description: >-
            Optional structured details to help debugging (validation errors,
            IDs, etc.)
      type: object
      required:
        - message
        - type
      title: ErrorDetail
      description: Error detail model.
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
    BucketSchemaFieldType:
      type: string
      enum:
        - string
        - number
        - integer
        - float
        - boolean
        - object
        - array
        - date
        - datetime
        - text
        - image
        - audio
        - video
        - pdf
        - excel
      title: BucketSchemaFieldType
      description: >-
        Supported data types for bucket schema fields.


        Types fall into two categories:


        1. **Metadata Types** (JSON types):
           - Stored as object metadata
           - Standard JSON-compatible types
           - Not processed by extractors (unless explicitly mapped)
           - Examples: string, number, boolean, date

        2. **File Types** (blobs):
           - Stored as files/blobs
           - Processed by extractors
           - Require file content (URL or base64)
           - Examples: text, image, video, pdf

        **GIF Special Handling**:
            GIF files can be declared as either IMAGE or VIDEO type:

            - As IMAGE: GIF is embedded as a single static image (first frame)
            - As VIDEO: GIF is decomposed frame-by-frame with embeddings per frame

            The multimodal extractor detects GIFs via MIME type (image/gif) and routes
            them based on your schema declaration. Use VIDEO for animated GIFs where
            frame-level search is needed, IMAGE for static/thumbnail use cases.

        NOTE: For retriever input schemas that need to accept document
        references

        (e.g., "find similar documents"), use RetrieverInputSchemaFieldType
        instead,

        which includes all bucket types plus document_reference.
    BlobDetails:
      properties:
        filename:
          anyOf:
            - type: string
            - type: 'null'
          title: Filename
        size_bytes:
          anyOf:
            - type: integer
            - type: 'null'
          title: Size Bytes
        mime_type:
          anyOf:
            - type: string
            - type: 'null'
          title: Mime Type
        hash:
          anyOf:
            - type: string
            - type: 'null'
          title: Hash
      type: object
      title: BlobDetails
      description: >-
        File details for a bucket object. These are generated automatically
        by the system.
    SourceType:
      type: string
      enum:
        - bucket
        - collection
        - taxonomy
        - cluster
        - direct_upsert
        - none
      title: SourceType
      description: Source types for any document/point.

````
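
The `consistency` description above suggests polling until `document_count > 0` for managed ingestion. One way to do that is sketched below, under stated assumptions: `wait_for_documents` is a hypothetical helper, `fetch` stands for any callable that performs the GET request and returns the decoded JSON body, and the 1s to 30s exponential backoff is borrowed from the task polling guidance in `TaskStatusEnum` rather than prescribed for this endpoint. A `null` count is treated as "not known yet", and polling stops early if the object reports `FAILED`.

```python
import time


def wait_for_documents(fetch, max_wait_s=300.0, sleep=time.sleep):
    """Poll fetch() until document_count > 0, the object fails, or time runs out.

    Returns the last response body; the caller inspects document_count/status.
    """
    delay, waited = 1.0, 0.0
    while True:
        obj = fetch()
        count = obj.get("document_count")  # None: count skipped or timed out
        if count is not None and count > 0:
            return obj
        if obj.get("status") == "FAILED" or waited >= max_wait_s:
            return obj
        sleep(delay)
        waited += delay
        delay = min(delay * 2, 30.0)  # exponential backoff, capped at 30s


# Demo with canned responses and a fake sleep that records the delays.
responses = iter([
    {"document_count": None},  # partition cold, count not available
    {"document_count": 0},     # not processed yet
    {"document_count": 3},     # documents are now visible
])
delays = []
result = wait_for_documents(lambda: next(responses), sleep=delays.append)
print(result["document_count"], delays)
```

Injecting `sleep` keeps the helper testable without real waiting; in production the default `time.sleep` applies.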