# Mixpeek + Snowflake

> Use Mixpeek for unstructured multimodal data, Snowflake for structured analytics

## Overview

Mixpeek and Snowflake serve complementary roles in the modern data stack. Mixpeek decomposes unstructured files (images, video, audio, PDFs) into structured features and searchable documents. Snowflake stores, governs, and analyzes structured data at scale. Together, they close the gap between raw multimodal content and business-ready analytics.

<CardGroup cols={2}>
  <Card title="Mixpeek" icon="wand-magic-sparkles">
    Ingests unstructured files, extracts features (embeddings, transcripts, classifications, metadata), and powers multimodal retrieval.
  </Card>

  <Card title="Snowflake" icon="snowflake">
    Stores structured outputs, enforces governance, and drives dashboards, ML pipelines, and cross-functional analytics.
  </Card>
</CardGroup>

## Architecture

```
                        Mixpeek                              Snowflake
               +-----------------------+            +------------------------+
               |                       |            |                        |
  Files -----> |  Buckets & Collections|            |   Structured Tables    |
  (images,     |                       |            |                        |
   video,      |  Decompose files into |  export    |  - classifications     |
   audio,      |  features:            | ---------> |  - extracted metadata  |
   PDFs)       |   - embeddings        |            |  - taxonomy labels     |
               |   - transcripts       |            |  - document payloads   |
               |   - classifications   |            |                        |
               |   - metadata          |  enrich    |  Dashboards, BI, ML    |
               |                       | <--------- |  (feed back into       |
               |  Retrieval & Search   |            |   Mixpeek retrievers)  |
               +-----------------------+            +------------------------+
```

## Use Cases

### Export taxonomy classifications to Snowflake tables

After Mixpeek classifies your content with [taxonomies](/enrichment/taxonomies), export the labels into Snowflake for reporting and governance.
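
The flattening step might look like the sketch below. It assumes each document carries its taxonomy results under a hypothetical `taxonomy_labels` list of `{taxonomy, label, score}` dicts. The real payload shape depends on your taxonomy configuration, so treat the field names as placeholders.

```python
from typing import Any, Iterable


def taxonomy_rows(documents: Iterable[dict[str, Any]]) -> list[tuple[str, str, str, float]]:
    """Flatten documents into (document_id, taxonomy, label, score) rows.

    ASSUMPTION: `taxonomy_labels` and its keys are hypothetical field names,
    not a documented Mixpeek schema.
    """
    rows = []
    for doc in documents:
        # Unclassified documents may omit the key or set it to None.
        for hit in doc.get("taxonomy_labels") or []:
            rows.append((
                doc["document_id"],
                hit["taxonomy"],
                hit["label"],
                float(hit.get("score", 0.0)),
            ))
    return rows


sample = [
    {"document_id": "doc_1", "taxonomy_labels": [
        {"taxonomy": "product", "label": "footwear", "score": 0.92},
        {"taxonomy": "brand", "label": "acme", "score": 0.81},
    ]},
    {"document_id": "doc_2"},  # yields no rows
]
rows = taxonomy_rows(sample)
```

One row per label keeps the Snowflake table easy to group and filter. The resulting tuples can be passed to the connector's `cursor.executemany` with an `INSERT ... VALUES (%s, %s, %s, %s)` statement against a table of your choosing.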

### Feed extracted metadata into Snowflake dashboards

Mixpeek extracts rich metadata from the files it processes, such as transcripts, detected objects, face identities, brand logos, and audio fingerprints. Load these structured outputs into Snowflake and build dashboards in Tableau, Sigma, or Snowsight.
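
BI tools generally work best with typed columns rather than raw `VARIANT` data. The sketch below creates a view over the `mixpeek_documents` table from the Quick Start. The view name and the `transcript` and `detected_objects` keys are assumptions for illustration; substitute the fields your extractors actually produce.

```python
# ASSUMPTION: `transcript` and `detected_objects` are hypothetical metadata
# keys, and `mixpeek_documents_flat` is a hypothetical view name.
DASHBOARD_VIEW_SQL = """
CREATE OR REPLACE VIEW mixpeek_documents_flat AS
SELECT
    document_id,
    content_type,
    metadata:transcript::STRING           AS transcript,
    ARRAY_SIZE(metadata:detected_objects) AS detected_object_count,
    created_at
FROM mixpeek_documents
"""


def create_dashboard_view(cursor) -> None:
    """Create the flattened view using a cursor connected to Snowflake."""
    cursor.execute(DASHBOARD_VIEW_SQL)
```

Point the dashboard at the view instead of the base table so that metadata keys can change without breaking existing charts.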

### Use Snowflake data to enrich Mixpeek retrievers

Pull structured attributes from Snowflake (pricing, inventory, customer segments) and attach them to Mixpeek documents via the [sql-lookup](/retrieval/stages/sql-lookup) or [api-call](/retrieval/stages/api-call) retriever stages. This lets your multimodal search results carry business context.
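
The stage configuration itself is covered on the linked pages. To illustrate the join such a lookup performs, here is a client-side sketch. It assumes a hypothetical `product_attributes` table keyed by `sku`, and search results that expose a `sku` field; neither is part of a documented schema.

```python
from typing import Any


def fetch_attributes(cursor, skus: list[str]) -> dict[str, dict[str, Any]]:
    """Return {sku: {"price": ..., "inventory": ...}} for the given SKUs.

    ASSUMPTION: `product_attributes` and its columns are hypothetical.
    """
    if not skus:
        return {}
    # One bound placeholder per SKU keeps the query parameterized.
    placeholders = ", ".join(["%s"] * len(skus))
    cursor.execute(
        "SELECT sku, price, inventory FROM product_attributes "
        f"WHERE sku IN ({placeholders})",
        tuple(skus),
    )
    return {
        sku: {"price": price, "inventory": inventory}
        for sku, price, inventory in cursor.fetchall()
    }


def enrich(results: list[dict[str, Any]], attributes: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Attach Snowflake attributes to each result with a matching SKU."""
    return [{**r, **attributes.get(r.get("sku"), {})} for r in results]
```

Results without a matching SKU pass through unchanged, so a gap in the Snowflake table does not drop search hits.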

## Quick Start

Export Mixpeek document metadata to a Snowflake table using the Mixpeek Python SDK and the Snowflake Connector for Python.

<Steps>
  <Step title="Install dependencies">
    ```bash
    pip install mixpeek snowflake-connector-python
    ```
  </Step>

  <Step title="List documents from Mixpeek">
    ```python
    from mixpeek import Mixpeek

    client = Mixpeek(api_key="your-api-key")

    # List documents from a collection
    documents = client.collections.documents.list(
        collection_id="your-collection-id",
        page_size=100
    )
    ```
  </Step>

  <Step title="Write to Snowflake">
    ```python
    import snowflake.connector
    import json

    conn = snowflake.connector.connect(
        user="YOUR_USER",
        password="YOUR_PASSWORD",
        account="YOUR_ACCOUNT",
        warehouse="YOUR_WAREHOUSE",
        database="MIXPEEK_DATA",
        schema="PUBLIC"
    )

    cursor = conn.cursor()

    # Create table if it does not exist
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS mixpeek_documents (
            document_id VARCHAR,
            source_url VARCHAR,
            content_type VARCHAR,
            metadata VARIANT,
            created_at TIMESTAMP_NTZ
        )
    """)

    # Insert each document. Snowflake rejects PARSE_JSON inside a VALUES
    # clause, so use INSERT ... SELECT to build the VARIANT column.
    for doc in documents:
        cursor.execute(
            """
            INSERT INTO mixpeek_documents
                (document_id, source_url, content_type, metadata, created_at)
            SELECT %s, %s, %s, PARSE_JSON(%s), %s
            """,
            (
                doc.get("document_id"),
                # Guard against a missing or null "source" field
                (doc.get("source") or {}).get("url"),
                doc.get("content_type"),
                json.dumps(doc.get("metadata") or {}),
                doc.get("created_at"),
            )
        )

    conn.commit()
    cursor.close()
    conn.close()
    ```
  </Step>
</Steps>

<Tip>
  For production workloads, use Snowflake's `COPY INTO` with staged files or Snowpipe for continuous loading instead of row-by-row inserts.
</Tip>
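
A minimal sketch of the staged-file approach: write the documents to a newline-delimited JSON file, upload it to the table stage with `PUT`, and load it with a single `COPY INTO`. The document fields mirror the Quick Start above; the helper names and the temporary file path are illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Iterable


def write_ndjson(documents: Iterable[dict[str, Any]], path: Path) -> int:
    """Write one JSON object per line, keyed by the target column names."""
    count = 0
    with path.open("w", encoding="utf-8") as f:
        for doc in documents:
            row = {
                "document_id": doc.get("document_id"),
                "source_url": (doc.get("source") or {}).get("url"),
                "content_type": doc.get("content_type"),
                "metadata": doc.get("metadata") or {},
                "created_at": doc.get("created_at"),
            }
            f.write(json.dumps(row) + "\n")
            count += 1
    return count


def bulk_load(cursor, path: Path) -> None:
    """Upload the file to the table stage, then load it in one COPY INTO."""
    cursor.execute(
        f"PUT file://{path.resolve().as_posix()} @%mixpeek_documents OVERWRITE = TRUE"
    )
    cursor.execute("""
        COPY INTO mixpeek_documents
        FROM @%mixpeek_documents
        FILE_FORMAT = (TYPE = JSON)
        MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        PURGE = TRUE
    """)


sample = [{
    "document_id": "doc_1",
    "source": {"url": "s3://bucket/a.mp4"},
    "content_type": "video/mp4",
    "metadata": {"duration": 12.5},
    "created_at": "2024-01-01T00:00:00",
}]
out = Path(tempfile.gettempdir()) / "mixpeek_documents.ndjson"
written = write_ndjson(sample, out)
```

Call `bulk_load(cursor, out)` with a cursor from the connection shown in the Quick Start. `MATCH_BY_COLUMN_NAME` maps JSON keys to table columns by name, so the nested `metadata` object loads directly into the `VARIANT` column.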

## When to Use Each

| Capability                                                  | Mixpeek            | Snowflake                      |
| ----------------------------------------------------------- | ------------------ | ------------------------------ |
| Ingest unstructured files (video, images, audio, PDFs)      | Yes                | No                             |
| Extract features (embeddings, transcripts, classifications) | Yes                | No                             |
| Multimodal semantic search                                  | Yes                | No                             |
| Structured SQL analytics                                    | No                 | Yes                            |
| Data governance and access control                          | Document-level ACL | Role-based, column-level       |
| Dashboard and BI integration                                | No                 | Yes (Snowsight, Tableau, etc.) |
| ML feature store                                            | Embedding vectors  | Tabular features               |

<Info>
  Mixpeek turns raw files into structured features and searchable documents. Snowflake governs and analyzes that data once it is structured. Use both to get a complete pipeline from raw files to business insights.
</Info>

## Related

* [Taxonomies](/enrichment/taxonomies) -- classify content and export labels
* [SQL Lookup Stage](/retrieval/stages/sql-lookup) -- query external databases from retriever pipelines
* [API Call Stage](/retrieval/stages/api-call) -- call external APIs during retrieval
* [Webhooks](/operations/webhooks) -- trigger Snowflake loads when Mixpeek processing completes
