Query Optimization & Explain

Mixpeek optimizes every retriever before it runs — reordering, fusing, and pushing work down into the vector store — then lets you inspect exactly what it did with the explain endpoint. You write the pipeline you find readable; the optimizer makes it fast.

Automatic optimizations

When you execute a retriever, the planner rewrites your stage list before execution. These transformations are automatic — you don’t configure them:

Optimization	What it does	Why it helps
Filter push-down	Moves attribute filters ahead of vector search	Shrinks the candidate set before the expensive embedding search runs
Stage fusion	Merges adjacent compatible stages into one	Fewer passes over the result set
Grouping optimization	Rewrites group/reduce stages to run database-side	Avoids materializing intermediate results
Computation push-down	Runs data-plane stages (`feature_search`, `attribute_filter`, `sort_attribute`, `aggregate`) inside MVS	Eliminates a network round-trip and lets the vector store filter/sort where the data lives
Parallel sub-queries	Runs independent operations (search + count, search + facet) concurrently	Lower wall-clock latency
Over-fetch hints	Fetches extra candidates when a later stage will filter them out	Preserves recall after post-filtering

Because the optimizer pushes filters down for you, write filters wherever they read most clearly — you don’t need to hand-order stages for performance. Use explain to confirm what was pushed.
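To make the reordering concrete, here is a toy model of filter push-down on a plain list of stage names. It is only an illustration of the idea, not Mixpeek's planner: the function `push_down_filters` and its single rule (move any `attribute_filter` that follows the first `feature_search` to just before it) are assumptions made for this sketch.

```python
# Toy model of filter push-down. This is NOT Mixpeek's planner; it only
# illustrates the reordering idea on a plain list of stage names.

def push_down_filters(stages):
    """Move attribute_filter stages that appear at or after the first
    feature_search so they run immediately before it. The relative order
    of all other stages is preserved."""
    if "feature_search" not in stages:
        return list(stages)
    first_search = stages.index("feature_search")
    before = stages[:first_search]
    after = stages[first_search:]
    moved = [s for s in after if s == "attribute_filter"]
    kept = [s for s in after if s != "attribute_filter"]
    return before + moved + kept


# The pipeline as written: the filter comes last because it reads best there.
written = ["feature_search", "sort_attribute", "attribute_filter"]
planned = push_down_filters(written)
print(planned)
```

In this sketch the filter ends up ahead of the search, which mirrors the "Pushed down from stage 2" note that explain reports for a real retriever.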

The retriever is also fetched and optimized once per request, and the resulting plan is reused for every query in a batch, so POST /v1/retrievers/{id}/execute/batch amortizes planning cost across all queries in that batch.

Inspect the plan with explain

POST /v1/retrievers/{retriever_id}/explain returns the optimized execution plan without running the query — per-stage cost and latency estimates, bottlenecks, and exactly which optimizations were applied. Pass hypothetical inputs to see how the plan changes with different parameters.

curl -sS -X POST "$MP_API_URL/v1/retrievers/{retriever_id}/explain" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{ "inputs": { "query": "people discussing electric vehicles" } }'
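The same request can be sketched in Python with only the standard library. The URL path, headers, and body mirror the curl call above; the helper names `build_explain_request` and `explain` are hypothetical, and the environment variable names are taken from the curl example.

```python
import json
import urllib.request


def build_explain_request(api_url, api_key, namespace, retriever_id, inputs):
    """Build (but do not send) the POST request for the explain endpoint."""
    url = f"{api_url}/v1/retrievers/{retriever_id}/explain"
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Namespace": namespace,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def explain(request, timeout=30):
    """Send the request and return the decoded JSON plan."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires network access and real credentials):
#   import os
#   req = build_explain_request(
#       os.environ["MP_API_URL"], os.environ["MP_API_KEY"],
#       os.environ["MP_NAMESPACE"], "ret_abc123",
#       {"query": "people discussing electric vehicles"},
#   )
#   plan = explain(req)
```

Building the request separately from sending it keeps the hypothetical inputs easy to vary, which is useful when comparing plans for different parameters.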

Example response

{
  "retriever_id": "ret_abc123",
  "execution_plan": [
    {
      "stage_index": 0,
      "stage_name": "attribute_filter",
      "stage_type": "filter",
      "estimated_input": 10000,
      "estimated_output": 5000,
      "estimated_efficiency": 0.5,
      "estimated_cost_credits": 0.01,
      "estimated_duration_ms": 20,
      "cache_likely": true,
      "optimization_notes": ["Pushed down from stage 2"],
      "warnings": []
    },
    {
      "stage_index": 1,
      "stage_name": "feature_search",
      "stage_type": "filter",
      "estimated_input": 5000,
      "estimated_output": 100,
      "estimated_efficiency": 0.02,
      "estimated_cost_credits": 0.5,
      "estimated_duration_ms": 200,
      "cache_likely": false,
      "optimization_notes": [],
      "warnings": ["High cost stage - consider reducing top_k"]
    }
  ],
  "estimated_cost": { "total_credits": 0.51, "total_duration_ms": 220 },
  "bottleneck_stages": ["feature_search"],
  "optimization_applied": true,
  "optimization_details": {
    "original_stage_count": 3,
    "optimized_stage_count": 2,
    "stage_reduction_pct": 33.3,
    "decisions": [
      {
        "rule_type": "push_down_filters",
        "applied": true,
        "reason": "Moved attribute_filter before feature_search to reduce search scope"
      }
    ]
  },
  "optimization_suggestions": [
    { "type": "reduce_limit", "stage": "feature_search", "message": "Consider reducing top_k to improve latency" }
  ]
}

estimated_cost_credits and total_credits are legacy fields expressed in the internal ledger unit (1 credit = $0.001). Customer-facing pricing is in dollars — see Billing.
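As a worked example of the conversion, the snippet below turns the `total_credits` value from the example response into dollars using the stated rate of 1 credit = $0.001. The helper name `credits_to_usd` is hypothetical; only the rate and the sample numbers come from this page.

```python
USD_PER_CREDIT = 0.001  # from the note above: 1 credit = $0.001


def credits_to_usd(credits):
    """Convert legacy ledger credits to US dollars."""
    return credits * USD_PER_CREDIT


# Values copied from the example response.
estimated_cost = {"total_credits": 0.51, "total_duration_ms": 220}

usd = credits_to_usd(estimated_cost["total_credits"])
print(f"${usd:.5f} per query")
print(f"${usd * 1_000_000:,.2f} per million queries")
```

For the example plan this works out to about $0.00051 per query, or roughly $510 per million queries.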

How to read it

Field	Use it to…
`execution_plan[].estimated_input/output`	See how each stage narrows the set — a stage that barely reduces the set may be unnecessary
`estimated_efficiency`	Spot low-selectivity stages (close to 1.0 = passes almost everything through)
`estimated_cost_credits` / `estimated_duration_ms`	Budget in dollars before running (1 credit = $0.001); find the expensive stage
`bottleneck_stages`	Find the stages dominating latency; optimize these first
`cache_likely`	Check whether a stage will likely hit the cache
`optimization_details.decisions`	See exactly which automatic rewrites fired (and why)
`optimization_suggestions`	Get concrete, actionable tuning hints
`warnings`	Catch per-stage red flags (e.g. high-cost stage, overly broad `top_k`)

The execution_plan reflects the optimized pipeline, not your original stage list. Compare optimization_details.original_stage_count vs optimized_stage_count to see how much the planner collapsed.
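The reading guide above can be sketched as a small helper that prints how each stage narrows the set and how much the planner collapsed the pipeline. The function name `summarize_plan` and the 0.9 pass-through threshold are assumptions for illustration; the field names and sample values come from the example response. The sketch computes the output/input ratio itself, which matches `estimated_efficiency` in the example but is not confirmed as the official definition.

```python
def summarize_plan(explain_response, pass_through_threshold=0.9):
    """Return one summary line per stage, plus a stage-count line."""
    lines = []
    for stage in explain_response["execution_plan"]:
        n_in = stage["estimated_input"]
        n_out = stage["estimated_output"]
        ratio = n_out / n_in if n_in else 0.0
        # A ratio near 1.0 means the stage passes almost everything through.
        flag = " (low selectivity)" if ratio >= pass_through_threshold else ""
        lines.append(
            f'{stage["stage_index"]}: {stage["stage_name"]} '
            f"{n_in} -> {n_out} ({ratio:.2f}){flag}"
        )
    details = explain_response.get("optimization_details", {})
    before = details.get("original_stage_count")
    after = details.get("optimized_stage_count")
    if before and after is not None:
        pct = (before - after) / before * 100
        lines.append(f"stages: {before} -> {after} ({pct:.1f}% fewer)")
    return lines


# Trimmed copy of the example response.
sample = {
    "execution_plan": [
        {"stage_index": 0, "stage_name": "attribute_filter",
         "estimated_input": 10000, "estimated_output": 5000},
        {"stage_index": 1, "stage_name": "feature_search",
         "estimated_input": 5000, "estimated_output": 100},
    ],
    "optimization_details": {"original_stage_count": 3, "optimized_stage_count": 2},
}

summary = summarize_plan(sample)
for line in summary:
    print(line)
```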

Execution-plan variant

POST /v1/retrievers/{retriever_id}/execute/explain returns the same plan in a MongoDB-explain-style shape if you prefer that format. Both endpoints are read-only and never execute the query.

Typical workflow

1. Explain before you ship

Run explain with representative inputs to see estimated cost, bottlenecks, and applied optimizations.

2. Act on bottlenecks & suggestions

Reduce top_k on high-cost searches, add a selective attribute_filter (the optimizer pushes it down), or drop low-selectivity stages.

3. Verify in production

Use retriever analytics to confirm real latency and cache-hit rates match the estimate.
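One way to make the explain step routine is a pre-ship check that compares the plan's estimates against a budget. This is a minimal sketch: the function `check_plan` and the budget values are hypothetical, and only the response fields come from the example response on this page.

```python
def check_plan(explain_response, max_duration_ms, max_credits):
    """Return a list of human-readable problems; empty means within budget."""
    problems = []
    cost = explain_response["estimated_cost"]
    if cost["total_duration_ms"] > max_duration_ms:
        problems.append(
            f'estimated latency {cost["total_duration_ms"]} ms '
            f"exceeds budget {max_duration_ms} ms"
        )
    if cost["total_credits"] > max_credits:
        problems.append(
            f'estimated cost {cost["total_credits"]} credits '
            f"exceeds budget {max_credits} credits"
        )
    # Surface per-stage warnings alongside the budget checks.
    for stage in explain_response["execution_plan"]:
        for warning in stage.get("warnings", []):
            problems.append(f'{stage["stage_name"]}: {warning}')
    return problems


# Trimmed copy of the example response.
sample = {
    "execution_plan": [
        {"stage_name": "attribute_filter", "warnings": []},
        {"stage_name": "feature_search",
         "warnings": ["High cost stage - consider reducing top_k"]},
    ],
    "estimated_cost": {"total_credits": 0.51, "total_duration_ms": 220},
}

problems = check_plan(sample, max_duration_ms=250, max_credits=1.0)
for problem in problems:
    print(problem)
```

With these illustrative budgets the example plan is within limits on latency and cost, so the only item reported is the `feature_search` warning.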

Related

Multi-Stage Retrieval: how stages compose
Feature Search: the most common bottleneck stage
Evaluations: measure quality alongside cost
Best Practices: caching and cost optimization

