Filters narrow results using logical operators to combine conditions. They operate on document payloads (metadata, enrichments, passthrough fields) and can be applied in retriever execution or as dedicated filter@v1 stages.
Payload Indexes
Filters require payload indexes on the fields you filter by. Without an index, the vector store performs a full scan — which is slow on large collections and may return incomplete results.
Create indexes on your namespace before using filters:
curl -X PATCH https://api.mixpeek.com/v1/namespaces/{namespace_id} \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"payload_indexes": [
{"field_name": "metadata.category", "type": "keyword"},
{"field_name": "metadata.price", "type": "integer"},
{"field_name": "metadata.status", "type": "keyword"}
]
}'
Supported index types:
| Type | Use for |
|---|---|
| keyword | Exact-match strings (categories, IDs, tags) |
| integer | Whole numbers |
| float | Decimal numbers |
| bool | Boolean fields |
| datetime | Timestamps |
| text | Full-text search |
| geo | Geospatial queries |
If you filter on an unindexed field, the response includes a warnings array telling you which fields need indexes:
{
"warnings": [
"Filter field 'brand' has no payload index — filtering may be slow or unreliable on large collections. Create an index via PATCH /v1/namespaces/{namespace_id} with payload_indexes: [{\"field_name\": \"brand\", \"type\": \"keyword\"}]"
]
}
System fields (collection_id, bucket_id, object_id, batch_id) and _internal.* fields are indexed automatically — you only need to create indexes for your own fields.
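Before sending a filter, it can help to check which of its fields still need a payload index. The following is a minimal client-side sketch, not part of the Mixpeek API: the helper names `filter_fields` and `fields_needing_index` are hypothetical, and the set of already-indexed fields is assumed to be something you track yourself.

```python
# Fields the text above describes as indexed automatically.
SYSTEM_FIELDS = {"collection_id", "bucket_id", "object_id", "batch_id"}


def filter_fields(node):
    """Yield every field name referenced anywhere in a filter tree."""
    if isinstance(node, list):
        for child in node:
            yield from filter_fields(child)
    elif isinstance(node, dict):
        if "field" in node:
            yield node["field"]
        for key in ("AND", "OR", "NOT"):
            if key in node:
                yield from filter_fields(node[key])


def fields_needing_index(filter_tree, indexed):
    """Return the user fields in the filter that lack a payload index, sorted."""
    missing = set()
    for name in filter_fields(filter_tree):
        auto_indexed = name in SYSTEM_FIELDS or name.startswith("_internal.")
        if not auto_indexed and name not in indexed:
            missing.add(name)
    return sorted(missing)


example = {
    "AND": [
        {"field": "metadata.status", "operator": "eq", "value": "published"},
        {"field": "brand", "operator": "eq", "value": "acme"},
        {"field": "collection_id", "operator": "eq", "value": "col_1"},
        {"NOT": {"field": "_internal.lineage.root_object_id",
                 "operator": "eq", "value": "obj_1"}},
    ]
}

# Only metadata.status is assumed to be indexed already.
print(fields_needing_index(example, indexed={"metadata.status"}))  # ['brand']
```

Each name in the returned list would then go into a payload_indexes entry with a type chosen from the table above.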
Logical Operators
Mixpeek filters support three logical operators for composing conditions:
| Operator | Description | Usage |
|---|---|---|
| AND | All conditions must be true | Combine multiple required constraints |
| OR | At least one condition must be true | Match any of several alternatives |
| NOT | Inverts the condition | Exclude matching documents |
AND Operator
Requires all nested conditions to match:
{
"AND": [
{ "field": "metadata.status", "operator": "eq", "value": "published" },
{ "field": "metadata.price", "operator": "lte", "value": 100 }
]
}
OR Operator
Matches if any nested condition is true:
{
"OR": [
{ "field": "metadata.category", "operator": "eq", "value": "video" },
{ "field": "metadata.category", "operator": "eq", "value": "audio" }
]
}
NOT Operator
Excludes documents matching the condition:
{
"NOT": {
"field": "metadata.status", "operator": "eq", "value": "draft"
}
}
Nesting Operators
Logical operators can be nested to create complex filter logic:
{
"AND": [
{ "field": "metadata.status", "operator": "eq", "value": "published" },
{
"OR": [
{ "field": "metadata.category", "operator": "eq", "value": "video" },
{ "field": "metadata.category", "operator": "eq", "value": "audio" }
]
},
{
"NOT": {
"field": "metadata.restricted", "operator": "eq", "value": true
}
}
]
}
This filter matches documents that are:
- Published AND
- Either video or audio AND
- Not restricted
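To make the nesting semantics concrete, here is a small local evaluator that applies a filter tree to an in-memory document. It is an illustrative sketch only and does not reflect how Mixpeek evaluates filters server-side: `matches` and `get_path` are hypothetical helpers, only the six basic comparison operators are handled, and a condition on a missing field is assumed to evaluate to false.

```python
import operator

# Comparison operators handled by this sketch.
COMPARATORS = {
    "eq": operator.eq, "ne": operator.ne,
    "gt": operator.gt, "gte": operator.ge,
    "lt": operator.lt, "lte": operator.le,
}


def get_path(doc, path):
    """Resolve a dotted path such as 'metadata.status'; return None if absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(node, doc):
    """Return True if the document satisfies the filter tree."""
    if "AND" in node:
        return all(matches(child, doc) for child in node["AND"])
    if "OR" in node:
        return any(matches(child, doc) for child in node["OR"])
    if "NOT" in node:
        return not matches(node["NOT"], doc)
    actual = get_path(doc, node["field"])
    if actual is None:
        return False  # assumption: a missing field never satisfies a condition
    return COMPARATORS[node["operator"]](actual, node["value"])


nested = {
    "AND": [
        {"field": "metadata.status", "operator": "eq", "value": "published"},
        {"OR": [
            {"field": "metadata.category", "operator": "eq", "value": "video"},
            {"field": "metadata.category", "operator": "eq", "value": "audio"},
        ]},
        {"NOT": {"field": "metadata.restricted", "operator": "eq", "value": True}},
    ]
}

video = {"metadata": {"status": "published", "category": "video", "restricted": False}}
image = {"metadata": {"status": "published", "category": "image", "restricted": False}}

print(matches(nested, video))  # True
print(matches(nested, image))  # False
```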
Comparison Operators
Use these operators within conditions:
| Operator | Description |
|---|---|
| eq | Equals |
| ne | Not equals |
| gt | Greater than |
| gte | Greater than or equal |
| lt | Less than |
| lte | Less than or equal |
| in | Value in list |
| nin | Value not in list |
| exists | Field exists |
| is_null | Field is null |
| contains | String contains substring |
| starts_with | String starts with prefix |
| ends_with | String ends with suffix |
| regex | Matches regular expression |
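As a rough guide to what the list and string operators mean, the sketch below maps six of them to plain Python predicates. This is an interpretation of the descriptions in the table, not Mixpeek's implementation: the `PREDICATES` table and `check` helper are hypothetical, comparisons here are case-sensitive, and regex is assumed to match anywhere in the string. The exists and is_null operators are left out because the table does not say what value they expect.

```python
import re

# Each predicate receives the document's field value and the filter value.
PREDICATES = {
    "in": lambda actual, value: actual in value,
    "nin": lambda actual, value: actual not in value,
    "contains": lambda actual, value: value in actual,
    "starts_with": lambda actual, value: actual.startswith(value),
    "ends_with": lambda actual, value: actual.endswith(value),
    # assumption: unanchored match, as with re.search
    "regex": lambda actual, value: re.search(value, actual) is not None,
}


def check(actual, operator_name, value):
    """Apply one comparison operator to a field value."""
    return PREDICATES[operator_name](actual, value)


print(check("video", "in", ["video", "audio"]))       # True
print(check("draft", "nin", ["published", "draft"]))  # False
print(check("intro-to-ml.mp4", "ends_with", ".mp4"))  # True
print(check("sku-00123", "regex", r"^sku-\d+$"))      # True
```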
Lineage Shortcuts
Every Mixpeek document carries a _internal.lineage block recording where it
came from. To filter by lineage you don’t have to use the underscore-prefixed
paths — use the friendly aliases below in any field position.
| Alias | Resolves to | Use for |
|---|---|---|
| from_object | _internal.lineage.root_object_id | "Everything derived from this bucket object" |
| from_bucket | _internal.lineage.root_bucket_id | "Everything derived from this bucket" |
| from_document | _internal.lineage.source_document_id | Direct children of one upstream document |
| from_collection | _internal.lineage.source_collection_id | Documents whose immediate parent was in this collection |
{
"AND": [
{ "field": "from_object", "operator": "eq", "value": "obj_video_123" }
]
}
You can mix lineage aliases with regular fields and templates:
{
"AND": [
{ "field": "from_object", "operator": "eq", "value": "{{INPUT.object_id}}" },
{ "field": "metadata.scene_score", "operator": "gte", "value": 0.8 }
]
}
The aliases are also accepted by document list endpoints and retriever filter
stages — the same vocabulary works everywhere field is used.
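The alias table can be read as a simple lookup applied to every field position. The sketch below shows that expansion on the client side, purely to illustrate the mapping; Mixpeek resolves the aliases for you, and `resolve_aliases` is a hypothetical helper name.

```python
# Alias-to-path mapping taken from the table above.
LINEAGE_ALIASES = {
    "from_object": "_internal.lineage.root_object_id",
    "from_bucket": "_internal.lineage.root_bucket_id",
    "from_document": "_internal.lineage.source_document_id",
    "from_collection": "_internal.lineage.source_collection_id",
}


def resolve_aliases(node):
    """Return a copy of the filter tree with lineage aliases expanded."""
    if isinstance(node, list):
        return [resolve_aliases(child) for child in node]
    if isinstance(node, dict):
        resolved = {}
        for key, value in node.items():
            if key == "field":
                # Non-alias field names pass through unchanged.
                resolved[key] = LINEAGE_ALIASES.get(value, value)
            elif key in ("AND", "OR", "NOT"):
                resolved[key] = resolve_aliases(value)
            else:
                resolved[key] = value
        return resolved
    return node


aliased = {
    "AND": [
        {"field": "from_object", "operator": "eq", "value": "obj_video_123"},
        {"field": "metadata.scene_score", "operator": "gte", "value": 0.8},
    ]
}

print(resolve_aliases(aliased)["AND"][0]["field"])
# _internal.lineage.root_object_id
```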
Using Templates
Reference request inputs or stage outputs in filter values:
{
"AND": [
{ "field": "metadata.category", "operator": "eq", "value": "{{INPUT.category}}" },
{ "field": "metadata.price", "operator": "lte", "value": "{{INPUT.max_price}}" }
]
}
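One way to picture template resolution is as a substitution pass over the filter before it runs. The sketch below is an assumption about the behavior rather than a description of Mixpeek's renderer: it handles only values that consist entirely of a single `{{INPUT.name}}` placeholder, ignores stage outputs, and the `render` helper is hypothetical. Replacing the whole value keeps the input's type, so a numeric max_price stays numeric.

```python
import re

# Matches a value that is exactly one INPUT placeholder, e.g. "{{INPUT.max_price}}".
PLACEHOLDER = re.compile(r"^\{\{\s*INPUT\.(\w+)\s*\}\}$")


def render(node, inputs):
    """Return a copy of the filter with INPUT placeholders replaced by request inputs."""
    if isinstance(node, list):
        return [render(child, inputs) for child in node]
    if isinstance(node, dict):
        return {key: render(value, inputs) for key, value in node.items()}
    if isinstance(node, str):
        found = PLACEHOLDER.match(node)
        if found:
            return inputs[found.group(1)]  # raises KeyError if the input is missing
    return node


template = {
    "AND": [
        {"field": "metadata.category", "operator": "eq", "value": "{{INPUT.category}}"},
        {"field": "metadata.price", "operator": "lte", "value": "{{INPUT.max_price}}"},
    ]
}

rendered = render(template, {"category": "audio", "max_price": 100})
print(rendered["AND"][1]["value"])  # 100
```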
Filter Stage Example
{
"stage_name": "filter",
"version": "v1",
"parameters": {
"strategy": "structured",
"structured_filter": {
"AND": [
{ "field": "metadata.category", "operator": "eq", "value": "audio" },
{ "field": "metadata.price", "operator": "lte", "value": "{{INPUT.max_price}}" }
]
}
}
}
Stage Pre-Filters and Post-Filters
Every stage in a retriever pipeline accepts optional pre_filters and
post_filters. pre_filters narrow the candidate set before the stage
runs (pushed down into the vector store as native filters); post_filters
apply to the stage’s results before they pass downstream. Both take the same
logical-operator shape as any other filter.
Canonical shape — wrap conditions in an explicit logical operator:
{
"pre_filters": {
"AND": [
{ "field": "archive_status", "operator": "ne", "value": "ARCHIVED" },
{ "field": "Keywords", "operator": "contains", "value": "skincare" }
]
}
}
Always prefer the explicit { "AND": [ ... ] } form — it is unambiguous and
nests cleanly with OR/NOT.
For convenience, two shorthand forms are coerced to an AND group:
- a single bare condition —
{ "field": "...", "operator": "...", "value": "..." } becomes { "AND": [ <condition> ] }
- a list of conditions —
[ { ... }, { ... } ] becomes { "AND": [ ... ] }
Each condition must carry all three of field, operator, and value. An
incomplete condition (for example, a missing operator) is rejected with a
clear error rather than silently ignored — so a typo can never quietly degrade
into an unfiltered result.
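The coercion and validation rules above can be sketched as a small normalization function. This is a client-side illustration under stated assumptions, not the server's code: `normalize` and `validate_condition` are hypothetical names, the error type is a guess, and explicit logical groups are passed through without recursive validation.

```python
REQUIRED_KEYS = ("field", "operator", "value")
LOGICAL_KEYS = ("AND", "OR", "NOT")


def is_group(item):
    """True if the item is an explicit logical group such as {"AND": [...]}."""
    return isinstance(item, dict) and any(key in item for key in LOGICAL_KEYS)


def validate_condition(condition):
    """Reject a condition that lacks field, operator, or value."""
    missing = [key for key in REQUIRED_KEYS if key not in condition]
    if missing:
        raise ValueError(f"incomplete condition {condition!r}: missing {', '.join(missing)}")
    return condition


def normalize(filters):
    """Coerce the two shorthand forms into an explicit AND group."""
    if isinstance(filters, list):
        # A list of conditions becomes {"AND": [...]}.
        return {"AND": [item if is_group(item) else validate_condition(item)
                        for item in filters]}
    if is_group(filters):
        return filters  # already in canonical shape
    if isinstance(filters, dict):
        # A single bare condition becomes {"AND": [condition]}.
        return {"AND": [validate_condition(filters)]}
    raise TypeError("filters must be a condition, a list of conditions, or a logical group")


bare = {"field": "archive_status", "operator": "ne", "value": "ARCHIVED"}
print(normalize(bare) == {"AND": [bare]})  # True
```

Calling normalize on a condition without an operator raises ValueError in this sketch, mirroring the "rejected with a clear error" behavior described above.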
Options
| Option | Default | Description |
|---|---|---|
| case_sensitive | false | Enable case-sensitive string comparisons |
{
"field": "metadata.title",
"operator": "contains",
"value": "AI",
"case_sensitive": true
}
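The effect of case_sensitive on a contains condition can be illustrated locally. The function below is a hypothetical sketch that assumes case-insensitive matching works by folding both sides to a common case; the actual comparison rules on the server may differ.

```python
def contains(actual, value, case_sensitive=False):
    """Substring test that ignores case unless case_sensitive is set."""
    if not case_sensitive:
        actual, value = actual.casefold(), value.casefold()
    return value in actual


print(contains("Intro to ai agents", "AI"))                       # True
print(contains("Intro to ai agents", "AI", case_sensitive=True))  # False
```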