
# Filters

> Compose filter conditions with logical operators

<Frame>
  <img src="https://mintcdn.com/mixpeek/pDBzbsnRaRIThJZv/assets/mixpeek-filters.svg?fit=max&auto=format&n=pDBzbsnRaRIThJZv&q=85&s=1ab3a765413b2e1644fa88ce899e201d" alt="Filter composition: AND, OR, and NOT operators nest to build complex filter logic on document payloads" width="900" height="280" data-path="assets/mixpeek-filters.svg" />
</Frame>

Filters narrow results by combining conditions with logical operators. They operate on document payloads (metadata, enrichments, and passthrough fields) and can be applied during retriever execution or as dedicated `filter@v1` stages.

## Payload Indexes

<Warning>
  Filters require **payload indexes** on the fields you filter by. Without an index, the vector store performs a full scan — which is slow on large collections and may return incomplete results.
</Warning>

Create indexes on your namespace before using filters:

```bash theme={null}
curl -X PATCH https://api.mixpeek.com/v1/namespaces/{namespace_id} \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "payload_indexes": [
      {"field_name": "metadata.category", "type": "keyword"},
      {"field_name": "metadata.price", "type": "integer"},
      {"field_name": "metadata.status", "type": "keyword"}
    ]
  }'
```

Supported index types:

| Type       | Use for                                     |
| ---------- | ------------------------------------------- |
| `keyword`  | Exact-match strings (categories, IDs, tags) |
| `integer`  | Whole numbers                               |
| `float`    | Decimal numbers                             |
| `bool`     | Boolean fields                              |
| `datetime` | Timestamps                                  |
| `text`     | Full-text search                            |
| `geo`      | Geospatial queries                          |
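
When a namespace has many fields, the `payload_indexes` list can be derived from a sample document instead of written by hand. The sketch below is one possible way to do that in Python. The helper names `suggest_index_type` and `payload_indexes` are hypothetical, and mapping every string to `keyword` is an assumption: switch a field to `text` yourself if you need full-text search on it.

```python
from datetime import datetime


def suggest_index_type(value) -> str:
    """Map a Python value to one of the index types in the table above."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, datetime):
        return "datetime"
    if isinstance(value, str):
        return "keyword"  # assumption: exact match; use "text" for full-text fields
    raise TypeError(f"no index type suggested for {type(value).__name__}")


def payload_indexes(sample: dict, prefix: str = "metadata") -> list:
    """Build a payload_indexes list from one sample metadata dict."""
    return [
        {"field_name": f"{prefix}.{name}", "type": suggest_index_type(value)}
        for name, value in sample.items()
    ]


sample_metadata = {"category": "video", "price": 100, "restricted": True}
indexes = payload_indexes(sample_metadata)
for entry in indexes:
    print(entry)
```

The resulting list can be sent as the `payload_indexes` value in the `PATCH` request shown earlier.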

If you filter on an unindexed field, the response includes a `warnings` array telling you which fields need indexes:

```json theme={null}
{
  "warnings": [
    "Filter field 'brand' has no payload index — filtering may be slow or unreliable on large collections. Create an index via PATCH /v1/namespaces/{namespace_id} with payload_indexes: [{\"field_name\": \"brand\", \"type\": \"keyword\"}]"
  ]
}
```

<Note>
  System fields (`collection_id`, `bucket_id`, `object_id`, `batch_id`) and `_internal.*` fields are indexed automatically — you only need to create indexes for your own fields.
</Note>
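
To catch unindexed fields before a request is sent, you can walk the filter tree locally and compare the fields it uses against the indexes you created. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the lineage aliases are treated as indexed because they resolve to `_internal.*` paths.

```python
# Fields that need no user-created index: the system fields listed above,
# plus the lineage aliases, which resolve to _internal.* paths.
AUTO_INDEXED = {
    "collection_id", "bucket_id", "object_id", "batch_id",
    "from_object", "from_bucket", "from_document", "from_collection",
}


def filter_fields(node) -> set:
    """Collect every `field` name referenced anywhere in a filter tree."""
    if isinstance(node, list):
        return set().union(*(filter_fields(child) for child in node))
    if not isinstance(node, dict):
        return set()
    if "field" in node:
        return {node["field"]}
    fields = set()
    for op in ("AND", "OR", "NOT"):
        if op in node:
            fields |= filter_fields(node[op])
    return fields


def missing_indexes(filter_tree: dict, payload_indexes: list) -> set:
    """Return the filter fields that have no payload index yet."""
    indexed = {ix["field_name"] for ix in payload_indexes}
    return {
        name for name in filter_fields(filter_tree)
        if name not in indexed
        and name not in AUTO_INDEXED
        and not name.startswith("_internal.")
    }


indexes = [
    {"field_name": "metadata.category", "type": "keyword"},
    {"field_name": "metadata.price", "type": "integer"},
]
tree = {
    "AND": [
        {"field": "metadata.category", "operator": "eq", "value": "audio"},
        {"OR": [
            {"field": "brand", "operator": "eq", "value": "acme"},
            {"field": "collection_id", "operator": "eq", "value": "col_1"},
        ]},
    ]
}
print(missing_indexes(tree, indexes))  # {'brand'}
```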

## Logical Operators

Mixpeek filters support three logical operators for composing conditions:

| Operator | Description                         | Usage                                 |
| -------- | ----------------------------------- | ------------------------------------- |
| `AND`    | All conditions must be true         | Combine multiple required constraints |
| `OR`     | At least one condition must be true | Match any of several alternatives     |
| `NOT`    | Inverts the condition               | Exclude matching documents            |

### AND Operator

Requires all nested conditions to match:

```json theme={null}
{
  "AND": [
    { "field": "metadata.status", "operator": "eq", "value": "published" },
    { "field": "metadata.price", "operator": "lte", "value": 100 }
  ]
}
```

### OR Operator

Matches if any nested condition is true:

```json theme={null}
{
  "OR": [
    { "field": "metadata.category", "operator": "eq", "value": "video" },
    { "field": "metadata.category", "operator": "eq", "value": "audio" }
  ]
}
```

### NOT Operator

Excludes documents matching the condition:

```json theme={null}
{
  "NOT": {
    "field": "metadata.status", "operator": "eq", "value": "draft"
  }
}
```

## Nesting Operators

Logical operators can be nested to create complex filter logic:

```json theme={null}
{
  "AND": [
    { "field": "metadata.status", "operator": "eq", "value": "published" },
    {
      "OR": [
        { "field": "metadata.category", "operator": "eq", "value": "video" },
        { "field": "metadata.category", "operator": "eq", "value": "audio" }
      ]
    },
    {
      "NOT": {
        "field": "metadata.restricted", "operator": "eq", "value": true
      }
    }
  ]
}
```

This filter matches documents that are:

* Published **AND**
* Either video or audio **AND**
* Not restricted
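
To make the nesting semantics concrete, the sketch below evaluates the same filter against a few in-memory documents. It is a local illustration only, not how Mixpeek executes filters: `get_path` and `matches` are hypothetical helpers, and only the `eq` and `lte` comparison operators are covered.

```python
def get_path(doc: dict, path: str):
    """Follow a dotted path such as 'metadata.status'; return None if absent."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(node: dict, doc: dict) -> bool:
    """Evaluate a filter tree against one document."""
    if "AND" in node:
        return all(matches(child, doc) for child in node["AND"])
    if "OR" in node:
        return any(matches(child, doc) for child in node["OR"])
    if "NOT" in node:
        return not matches(node["NOT"], doc)
    actual = get_path(doc, node["field"])
    if node["operator"] == "eq":
        return actual == node["value"]
    if node["operator"] == "lte":
        return actual is not None and actual <= node["value"]
    raise ValueError(f"operator not covered by this sketch: {node['operator']}")


nested = {
    "AND": [
        {"field": "metadata.status", "operator": "eq", "value": "published"},
        {"OR": [
            {"field": "metadata.category", "operator": "eq", "value": "video"},
            {"field": "metadata.category", "operator": "eq", "value": "audio"},
        ]},
        {"NOT": {"field": "metadata.restricted", "operator": "eq", "value": True}},
    ]
}

docs = [
    {"metadata": {"status": "published", "category": "video", "restricted": False}},
    {"metadata": {"status": "published", "category": "audio", "restricted": True}},
    {"metadata": {"status": "draft", "category": "video", "restricted": False}},
    {"metadata": {"status": "published", "category": "image", "restricted": False}},
]
print([matches(nested, doc) for doc in docs])  # [True, False, False, False]
```

Only the first document passes: the second is restricted, the third is a draft, and the fourth is neither video nor audio.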

## Comparison Operators

Use these operators within conditions:

| Operator      | Description                                                       |
| ------------- | ----------------------------------------------------------------- |
| `eq`          | Equals                                                            |
| `ne`          | Not equals                                                        |
| `gt`          | Greater than                                                      |
| `gte`         | Greater than or equal                                             |
| `lt`          | Less than                                                         |
| `lte`         | Less than or equal                                                |
| `in`          | Value in list                                                     |
| `nin`         | Value not in list                                                 |
| `exists`      | Field exists                                                      |
| `is_null`     | Field is null                                                     |
| `contains`    | String contains substring                                         |
| `starts_with` | String starts with prefix                                         |
| `ends_with`   | String ends with suffix                                           |
| `regex`       | Matches regular expression                                        |
| `text`        | Full-text search (token-based; word order **not** preserved)      |
| `phrase`      | Exact phrase — matches the words **in order**, word-boundary safe |

<Note>
  Use `text` to match any of the query tokens (BM25). Use `phrase` when word
  order matters, for example to find a transcript where someone says an exact quote:
  `{ "field": "transcription", "operator": "phrase", "value": "make america great again" }`
  matches "...make america great again..." but not "america will be great again".
</Note>
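
The difference between `text` and `phrase` can be illustrated with a small local sketch. It is deliberately simplified: there is no BM25 scoring, and the tokenizer (lowercase, split on non-alphanumeric characters) is an assumption rather than Mixpeek's actual tokenizer.

```python
import re


def tokens(s: str) -> list:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", s.lower())


def text_match(haystack: str, query: str) -> bool:
    """True if any query token appears, regardless of order."""
    have = set(tokens(haystack))
    return any(token in have for token in tokens(query))


def phrase_match(haystack: str, query: str) -> bool:
    """True if the query tokens appear contiguously and in order."""
    h, q = tokens(haystack), tokens(query)
    return bool(q) and any(
        h[i:i + len(q)] == q for i in range(len(h) - len(q) + 1)
    )


quote = "make america great again"
print(phrase_match("they promised to make america great again today", quote))  # True
print(phrase_match("america will be great again", quote))                      # False
print(text_match("america will be great again", quote))                        # True
print(phrase_match("remake america great again", quote))                       # False
```

The last call shows the word-boundary behavior: `remake` is a different token from `make`, so the phrase does not match.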

## Lineage Shortcuts

Every Mixpeek document carries a `_internal.lineage` block recording where it
came from. To filter by lineage you don't have to use the underscore-prefixed
paths — use the friendly aliases below in any `field` position.

| Alias             | Resolves to                              | Use for                                                 |
| ----------------- | ---------------------------------------- | ------------------------------------------------------- |
| `from_object`     | `_internal.lineage.root_object_id`       | "Everything derived from this bucket object"            |
| `from_bucket`     | `_internal.lineage.root_bucket_id`       | "Everything derived from this bucket"                   |
| `from_document`   | `_internal.lineage.source_document_id`   | Direct children of one upstream document                |
| `from_collection` | `_internal.lineage.source_collection_id` | Documents whose immediate parent was in this collection |

```json theme={null}
{
  "AND": [
    { "field": "from_object", "operator": "eq", "value": "obj_video_123" }
  ]
}
```

You can mix lineage aliases with regular fields and templates:

```json theme={null}
{
  "AND": [
    { "field": "from_object", "operator": "eq", "value": "{{INPUT.object_id}}" },
    { "field": "metadata.scene_score", "operator": "gte", "value": 0.8 }
  ]
}
```

The aliases are also accepted by document list endpoints and retriever filter
stages — the same vocabulary works everywhere `field` is used.
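
Mixpeek resolves these aliases for you, so no client-side work is required. The sketch below only shows what the expansion amounts to, using the mapping from the table above; `resolve_aliases` is a hypothetical helper, useful for example when comparing a filter against stored payloads.

```python
LINEAGE_ALIASES = {
    "from_object": "_internal.lineage.root_object_id",
    "from_bucket": "_internal.lineage.root_bucket_id",
    "from_document": "_internal.lineage.source_document_id",
    "from_collection": "_internal.lineage.source_collection_id",
}


def resolve_aliases(node):
    """Return a copy of the filter tree with lineage aliases expanded."""
    if isinstance(node, list):
        return [resolve_aliases(child) for child in node]
    if not isinstance(node, dict):
        return node
    if "field" in node:
        # A condition: swap the alias for its full path, keep everything else.
        return {**node, "field": LINEAGE_ALIASES.get(node["field"], node["field"])}
    # A logical group (AND / OR / NOT): recurse into its children.
    return {key: resolve_aliases(value) for key, value in node.items()}


aliased = {
    "AND": [
        {"field": "from_object", "operator": "eq", "value": "obj_video_123"},
        {"field": "metadata.scene_score", "operator": "gte", "value": 0.8},
    ]
}
expanded = resolve_aliases(aliased)
print(expanded["AND"][0]["field"])  # _internal.lineage.root_object_id
print(expanded["AND"][1]["field"])  # metadata.scene_score
```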

## Using Templates

Reference request inputs or stage outputs in filter values:

```json theme={null}
{
  "AND": [
    { "field": "metadata.category", "operator": "eq", "value": "{{INPUT.category}}" },
    { "field": "metadata.price", "operator": "lte", "value": "{{INPUT.max_price}}" }
  ]
}
```
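
Template resolution happens inside Mixpeek, and this page does not spell out its exact rules. As a rough mental model, the sketch below substitutes `{{INPUT.*}}` placeholders locally. The `render` helper is hypothetical, and it assumes that a value consisting of a single placeholder takes the input's own type, so a numeric `max_price` stays a number for the `lte` comparison.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*INPUT\.(\w+)\s*\}\}")


def render(node, inputs: dict):
    """Return a copy of the filter with {{INPUT.name}} placeholders filled in."""
    if isinstance(node, dict):
        return {key: render(value, inputs) for key, value in node.items()}
    if isinstance(node, list):
        return [render(value, inputs) for value in node]
    if isinstance(node, str):
        whole = PLACEHOLDER.fullmatch(node)
        if whole:
            # Assumption: a lone placeholder keeps the input's native type.
            return inputs[whole.group(1)]
        # Placeholders embedded in longer strings are interpolated as text.
        return PLACEHOLDER.sub(lambda m: str(inputs[m.group(1)]), node)
    return node


template = {
    "AND": [
        {"field": "metadata.category", "operator": "eq", "value": "{{INPUT.category}}"},
        {"field": "metadata.price", "operator": "lte", "value": "{{INPUT.max_price}}"},
    ]
}
rendered = render(template, {"category": "audio", "max_price": 50})
print(rendered["AND"][0]["value"])  # audio
print(rendered["AND"][1]["value"])  # 50
```

A missing input raises `KeyError` in this sketch, which makes an unfilled placeholder visible instead of sending it through as a literal string.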

## Filter Stage Example

The following stage applies a structured filter and takes its price ceiling from the request input:

```json theme={null}
{
  "stage_name": "filter",
  "stage_type": "filter",
  "config": {
    "stage_id": "attribute_filter",
    "parameters": {
      "strategy": "structured",
      "structured_filter": {
        "AND": [
          { "field": "metadata.category", "operator": "eq", "value": "audio" },
          { "field": "metadata.price", "operator": "lte", "value": "{{INPUT.max_price}}" }
        ]
      }
    }
  }
}
```

## Stage Pre-Filters and Post-Filters

Every stage in a retriever pipeline accepts optional `pre_filters` and
`post_filters`. `pre_filters` narrow the candidate set **before** the stage
runs (pushed down into the vector store as native filters); `post_filters`
apply to the stage's results before they pass downstream. Both take the same
logical-operator shape as any other filter.

**Canonical shape** — wrap conditions in an explicit logical operator:

```json theme={null}
{
  "pre_filters": {
    "AND": [
      { "field": "archive_status", "operator": "ne", "value": "ARCHIVED" },
      { "field": "Keywords", "operator": "contains", "value": "skincare" }
    ]
  }
}
```

Always prefer the explicit `{ "AND": [ ... ] }` form — it is unambiguous and
nests cleanly with `OR`/`NOT`.

<Note>
  For convenience, two shorthand forms are coerced to an `AND` group:

  * a **single bare condition** — `{ "field": "...", "operator": "...", "value": "..." }` becomes `{ "AND": [ <condition> ] }`
  * a **list of conditions** — `[ { ... }, { ... } ]` becomes `{ "AND": [ ... ] }`

  Each condition must carry all three of `field`, `operator`, and `value`. An
  **incomplete** condition (for example, a missing `operator`) is rejected with a
  clear error rather than silently ignored — so a typo can never quietly degrade
  into an unfiltered result.
</Note>
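
The coercion rules above can be mirrored on the client to validate filters before sending them. This is a minimal sketch of the described behavior, not Mixpeek's implementation: `normalize` and `check_condition` are hypothetical names, and list elements are assumed to be plain conditions.

```python
REQUIRED_KEYS = ("field", "operator", "value")
LOGICAL_KEYS = ("AND", "OR", "NOT")


def check_condition(cond: dict) -> dict:
    """Reject a condition that lacks field, operator, or value."""
    missing = [key for key in REQUIRED_KEYS if key not in cond]
    if missing:
        raise ValueError(f"incomplete filter condition {cond!r}: missing {missing}")
    return cond


def normalize(filters):
    """Coerce the two shorthand forms into an explicit AND group."""
    if isinstance(filters, list):
        # Shorthand 2: a list of conditions.
        return {"AND": [check_condition(cond) for cond in filters]}
    if isinstance(filters, dict) and any(key in filters for key in LOGICAL_KEYS):
        return filters  # already in the canonical shape
    if isinstance(filters, dict):
        # Shorthand 1: a single bare condition.
        return {"AND": [check_condition(filters)]}
    raise TypeError(f"unsupported filter shape: {type(filters).__name__}")


status = {"field": "archive_status", "operator": "ne", "value": "ARCHIVED"}
keyword = {"field": "Keywords", "operator": "contains", "value": "skincare"}

print(normalize(status) == {"AND": [status]})                      # True
print(normalize([status, keyword]) == {"AND": [status, keyword]})  # True

try:
    normalize({"field": "Keywords", "value": "skincare"})  # operator is missing
except ValueError as err:
    print("rejected:", err)
```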

## Options

| Option           | Default | Description                              |
| ---------------- | ------- | ---------------------------------------- |
| `case_sensitive` | `false` | Enable case-sensitive string comparisons |

```json theme={null}
{
  "field": "metadata.title",
  "operator": "contains",
  "value": "AI",
  "case_sensitive": true
}
```
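
The effect of `case_sensitive` is easiest to see on a short value such as `"AI"`. The sketch below imitates a `contains` check locally. The `contains` function is hypothetical, and plain lowercasing is an assumption about how case-insensitive matching behaves.

```python
def contains(actual: str, needle: str, case_sensitive: bool = False) -> bool:
    """Substring check that ignores case unless case_sensitive is set."""
    if not case_sensitive:
        actual, needle = actual.lower(), needle.lower()
    return needle in actual


titles = ["AI for video search", "Fair use explained"]

# Default (case-insensitive): "Fair" contains "ai", so both titles match.
print([t for t in titles if contains(t, "AI")])

# Case-sensitive: only the title with an uppercase "AI" matches.
print([t for t in titles if contains(t, "AI", case_sensitive=True)])
```

With the default setting, short uppercase terms can match inside unrelated words, which is the main reason to enable `case_sensitive` for acronyms.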
