# Document Enrich

> Join documents across collections for cross-reference enrichment

<Frame>
  <img src="https://mintcdn.com/mixpeek/TwtTrae3Fi3EFJ72/assets/retrievers/document-enrich.svg?fit=max&auto=format&n=TwtTrae3Fi3EFJ72&q=85&s=16b0bb36191f390695226d70e4622221" alt="Document Enrich stage showing collection joins and cross-reference lookups" width="1000" height="400" data-path="assets/retrievers/document-enrich.svg" />
</Frame>

The Document Enrich stage performs collection joins by looking up related documents from other Mixpeek collections. This enables cross-reference enrichment without external database calls.

<Note>
  **Stage Category**: ENRICH (Enriches documents)

  **Transformation**: N documents → N documents (with joined data added)
</Note>
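
The N → N behavior can be pictured as a left join: every input document comes back, carrying either the matched record or `null` under the output field. The sketch below is an illustrative in-memory model, not Mixpeek's implementation. `enrich_left_join` and its argument names are hypothetical, and it assumes top-level keys and a first-match-wins rule.

```python
from typing import Any


def enrich_left_join(
    documents: list[dict[str, Any]],
    target_docs: list[dict[str, Any]],
    source_key: str,
    target_key: str,
    output_field: str = "enriched_data",
) -> list[dict[str, Any]]:
    """Attach the matching target record to each document (left join)."""
    # Index the target collection once. Assumption: first record per key wins.
    index: dict[Any, dict[str, Any]] = {}
    for record in target_docs:
        key = record.get(target_key)
        if key is not None:
            index.setdefault(key, record)

    enriched = []
    for doc in documents:
        key = doc.get(source_key)
        match = index.get(key) if key is not None else None
        # Copy the document so the caller's input is left unmodified.
        enriched.append({**doc, output_field: match})
    return enriched


# Hypothetical sample data for illustration only.
posts = [
    {"document_id": "doc_1", "author_id": "u1"},
    {"document_id": "doc_2", "author_id": "u9"},  # no such user
]
users = [{"user_id": "u1", "name": "Ada"}]

result = enrich_left_join(posts, users, "author_id", "user_id", "author")
```

Both posts are returned: the first with the user record under `author`, the second with `author` set to `None`.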

## When to Use

| Use Case                   | Description                                 |
| -------------------------- | ------------------------------------------- |
| **Cross-collection joins** | Link products to reviews, users to profiles |
| **Reference resolution**   | Expand foreign keys to full documents       |
| **Denormalization**        | Flatten related data for display            |
| **Multi-index search**     | Combine results from different collections  |

## When NOT to Use

| Scenario                 | Recommended Alternative |
| ------------------------ | ----------------------- |
| External database joins  | `sql_lookup`            |
| Single collection search | No enrichment needed    |
| Real-time external data  | `api_call`              |

## Parameters

| Parameter              | Type    | Default         | Description                            |
| ---------------------- | ------- | --------------- | -------------------------------------- |
| `target_collection_id` | string  | *Required*      | Target collection to join from         |
| `target_field`         | string  | *Required*      | Field in target to match against       |
| `source_field`         | string  | *Required*      | Field in source document to use as key |
| `output_field`         | string  | `enriched_data` | Field to store joined data             |
| `fields_to_merge`      | array   | `null`          | Specific fields to return (null = all) |
| `multiple`             | boolean | `false`         | Return multiple matching documents     |
| `limit`                | integer | `10`            | Max documents when `multiple: true`    |
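
How the field-selection, `multiple`, and `limit` parameters interact is easiest to see on a small example. The following sketch models only the result-shaping step. `shape_matches` is a hypothetical helper, and returning an empty list when `multiple` is true and nothing matches is an assumption the table does not state.

```python
from typing import Any, Optional, Union

Record = dict[str, Any]


def shape_matches(
    matches: list[Record],
    fields: Optional[list[str]] = None,
    multiple: bool = False,
    limit: int = 10,
) -> Union[Record, list[Record], None]:
    """Turn raw matches into the value stored under the output field."""

    def project(record: Record) -> Record:
        # fields=None mirrors the table's "null = all" default.
        if fields is None:
            return dict(record)
        return {name: record[name] for name in fields if name in record}

    if multiple:
        # Assumption: no matches yields an empty list rather than None.
        return [project(record) for record in matches[:limit]]
    # Single mode: first match, or None when nothing matched.
    return project(matches[0]) if matches else None


# Hypothetical matches, reusing the review records from this page.
reviews = [
    {"document_id": "rev_1", "rating": 5, "text": "Great product!"},
    {"document_id": "rev_2", "rating": 4, "text": "Good value"},
]

single = shape_matches(reviews)
trimmed = shape_matches(reviews, fields=["rating", "text"], multiple=True, limit=1)
```

Here `single` is the full first review, while `trimmed` is a one-element list containing only `rating` and `text`.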

## Configuration Examples

<CodeGroup>
  ```json Basic Collection Join theme={null}
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "user_profiles",
        "target_field": "user_id",
        "source_field": "metadata.author_id",
        "output_field": "author"
      }
    }
  }
  ```

  ```json Select Specific Fields theme={null}
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "products",
        "target_field": "product_id",
        "source_field": "metadata.product_id",
        "output_field": "product_details",
        "fields_to_merge": ["name", "price", "category", "image_url"]
      }
    }
  }
  ```

  ```json Reviews by Product theme={null}
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "reviews",
        "target_field": "product_id",
        "source_field": "document_id",
        "output_field": "reviews"
      }
    }
  }
  ```

  ```json Nested Field Lookup theme={null}
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "categories",
        "target_field": "category_code",
        "source_field": "metadata.taxonomy.primary_category",
        "output_field": "category_info"
      }
    }
  }
  ```
</CodeGroup>
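
The `source_field` values in these examples use dot notation to reach nested keys. A minimal sketch of that kind of path resolution follows. `resolve_path` is a hypothetical helper written for illustration, and returning `None` for a missing segment is an assumption rather than documented stage behavior.

```python
from typing import Any


def resolve_path(document: dict[str, Any], path: str) -> Any:
    """Walk a dotted path such as 'metadata.taxonomy.primary_category'."""
    current: Any = document
    for segment in path.split("."):
        if not isinstance(current, dict) or segment not in current:
            return None  # assumption: a missing segment yields no join key
        current = current[segment]
    return current


# Hypothetical document shaped like the Nested Field Lookup example.
doc = {
    "document_id": "doc_123",
    "metadata": {"taxonomy": {"primary_category": "ELEC"}},
}

print(resolve_path(doc, "metadata.taxonomy.primary_category"))  # ELEC
```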

## Output Schema

### Single Document Join

```json theme={null}
{
  "document_id": "doc_123",
  "content": "Product review content...",
  "metadata": {
    "product_id": "prod_456"
  },
  "product_details": {
    "name": "Wireless Headphones",
    "price": 199.99,
    "category": "Electronics",
    "image_url": "https://..."
  }
}
```

### Multiple Documents Join

```json theme={null}
{
  "document_id": "prod_456",
  "content": "Product description...",
  "reviews": [
    {
      "document_id": "rev_1",
      "rating": 5,
      "text": "Great product!"
    },
    {
      "document_id": "rev_2",
      "rating": 4,
      "text": "Good value"
    }
  ]
}
```

### No Match Found

```json theme={null}
{
  "document_id": "doc_123",
  "content": "...",
  "enriched_data": null
}
```
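
Because unmatched documents still come back with a `null` field, downstream code usually has to branch on it. The sketch below shows one way a client might split retriever results before rendering. `split_by_enrichment` is a hypothetical helper, and treating a missing key the same as `null` is an assumption.

```python
from typing import Any

Document = dict[str, Any]


def split_by_enrichment(
    documents: list[Document], field: str = "enriched_data"
) -> tuple[list[Document], list[Document]]:
    """Separate documents whose join matched from those that did not."""
    matched: list[Document] = []
    unmatched: list[Document] = []
    for doc in documents:
        # doc.get() returns None for both an explicit null and a missing key.
        if doc.get(field) is not None:
            matched.append(doc)
        else:
            unmatched.append(doc)
    return matched, unmatched


# Hypothetical results, shaped like the schemas in this section.
results = [
    {"document_id": "doc_123", "enriched_data": None},
    {"document_id": "doc_124", "enriched_data": {"name": "Wireless Headphones"}},
]

matched, unmatched = split_by_enrichment(results)
```

A client could then render `matched` normally and show a fallback for `unmatched`, or drop those documents entirely.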

## Performance

| Metric                 | Value                       |
| ---------------------- | --------------------------- |
| **Latency**            | 5-20ms per document         |
| **Batch processing**   | Automatic batching          |
| **Index usage**        | Uses collection indexes     |
| **Parallel execution** | Up to 10 concurrent lookups |

<Tip>
  Ensure `target_field` is indexed in the target collection for optimal performance. Use `fields_to_merge` to reduce payload size.
</Tip>
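
To make the batching and concurrency figures concrete, here is one way a lookup layer could deduplicate join keys, group them into batches, and run up to 10 batches at once. This is an illustrative sketch, not a description of Mixpeek internals: `batched_lookup`, `fetch_batch`, and the `batch_size` value are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable


def batched_lookup(
    keys: Iterable[Any],
    fetch_batch: Callable[[list], dict],
    batch_size: int = 50,
    max_workers: int = 10,
) -> dict:
    """Resolve join keys in deduplicated, concurrent batches."""
    # Drop missing keys and duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(k for k in keys if k is not None))
    batches = [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]

    found: dict = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each batch is one call to fetch_batch; results merge into one map.
        for partial in pool.map(fetch_batch, batches):
            found.update(partial)
    return found


# In-memory stand-in for a real collection query (hypothetical data).
PRODUCTS = {"prod_456": {"name": "Wireless Headphones"}}


def fetch_from_memory(batch: list) -> dict:
    return {key: PRODUCTS[key] for key in batch if key in PRODUCTS}


lookup = batched_lookup(["prod_456", "prod_456", "prod_000", None], fetch_from_memory)
```

Deduplicating first means twenty search results that share one author cost a single lookup rather than twenty.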

## Common Pipeline Patterns

### Search + Author Enrichment

```json theme={null}
[
  {
    "stage_name": "semantic_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          { "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": { "input_mode": "text", "value": "{{INPUT.query}}" }, "top_k": 20 }
        ],
        "final_top_k": 20
      }
    }
  },
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "users",
        "target_field": "user_id",
        "source_field": "metadata.author_id",
        "output_field": "author",
        "fields_to_merge": ["name", "avatar", "bio"]
      }
    }
  }
]
```

### Product Search with Reviews

```json theme={null}
[
  {
    "stage_name": "hybrid_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          { "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": { "input_mode": "text", "value": "{{INPUT.query}}" }, "top_k": 10 }
        ],
        "final_top_k": 10
      }
    }
  },
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "reviews",
        "target_field": "product_id",
        "source_field": "document_id",
        "output_field": "recent_reviews",
        "fields_to_merge": ["rating", "text", "author_name", "created_at"]
      }
    }
  }
]
```

### Hierarchical Category Enrichment

```json theme={null}
[
  {
    "stage_name": "semantic_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          { "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": { "input_mode": "text", "value": "{{INPUT.query}}" }, "top_k": 50 }
        ],
        "final_top_k": 50
      }
    }
  },
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "categories",
        "target_field": "category_id",
        "source_field": "metadata.category_id",
        "output_field": "category"
      }
    }
  },
  {
    "stage_name": "document_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "document_enrich",
      "parameters": {
        "target_collection_id": "categories",
        "target_field": "category_id",
        "source_field": "category.parent_id",
        "output_field": "parent_category"
      }
    }
  }
]
```
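
In the hierarchical pipeline above, the second enrichment reads `category.parent_id`, a field that exists only after the first enrichment has run, so stage order matters. The sketch below replays that chaining in memory. It is a simplified model: `run_enrich_stages`, `get_path`, and the `collections` dictionary are hypothetical, and the stage dictionaries reuse the parameter names from the JSON above.

```python
from functools import reduce
from typing import Any


def get_path(doc: dict[str, Any], path: str) -> Any:
    """Resolve a dotted path, returning None if any segment is missing."""
    return reduce(
        lambda cur, key: cur.get(key) if isinstance(cur, dict) else None,
        path.split("."),
        doc,
    )


def run_enrich_stages(documents, stages, collections):
    """Apply each enrich stage in order; later stages see earlier outputs."""
    for params in stages:
        target = collections[params["target_collection_id"]]
        index = {rec[params["target_field"]]: rec for rec in target}
        documents = [
            {**doc, params["output_field"]: index.get(get_path(doc, params["source_field"]))}
            for doc in documents
        ]
    return documents


# Hypothetical category data for illustration.
collections = {
    "categories": [
        {"category_id": "c1", "name": "Electronics", "parent_id": None},
        {"category_id": "c2", "name": "Headphones", "parent_id": "c1"},
    ]
}
stages = [
    {"target_collection_id": "categories", "target_field": "category_id",
     "source_field": "metadata.category_id", "output_field": "category"},
    {"target_collection_id": "categories", "target_field": "category_id",
     "source_field": "category.parent_id", "output_field": "parent_category"},
]
docs = [{"document_id": "doc_1", "metadata": {"category_id": "c2"}}]

enriched = run_enrich_stages(docs, stages, collections)
```

Swapping the two stages leaves `parent_category` as `None` in this model, because `category.parent_id` has not been populated when the parent lookup runs.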

## Error Handling

| Error                | Behavior                     |
| -------------------- | ---------------------------- |
| Collection not found | Stage fails                  |
| No matching document | `output_field` set to `null` |
| Invalid field path   | Stage fails with error       |
| Timeout              | Continues with null result   |

## Comparison with Other Enrichment Stages

| Feature     | document\_enrich    | sql\_lookup         | api\_call       |
| ----------- | ------------------- | ------------------- | --------------- |
| Data source | Mixpeek collections | SQL database        | External API    |
| Latency     | 5-20ms              | 10-100ms            | 50-500ms        |
| Best for    | Cross-collection    | External relational | REST APIs       |
| Setup       | None                | Connection config   | Endpoint config |

## Related

* [SQL Lookup](/retrieval/stages/sql-lookup) - External database joins
* [API Call](/retrieval/stages/api-call) - REST API enrichment
* [Taxonomy Enrich](/retrieval/stages/taxonomy-enrich) - Classification enrichment
