[Figure: Agent Search stage showing an LLM reasoning loop with retriever stages as callable tools]
The Agent Search stage uses an LLM reasoning loop to orchestrate other retriever stages as callable tools. Instead of executing a fixed sequence of stages, the LLM dynamically decides which stages to invoke, with what arguments, and how many iterations to perform based on the query and intermediate results.
**Stage Category:** FILTER (Adaptive retrieval)
**Transformation:** Query → LLM reasoning (1-N iterations) → Refined documents

When to Use

| Use Case | Description |
| --- | --- |
| Multi-hop queries | Questions requiring following references across documents |
| Iterative refinement | Broad search followed by intelligent narrowing |
| Complex filtering logic | When the right filter conditions depend on query semantics |
| Tree/hierarchy navigation | Navigating hierarchical document indexes top-down |
| Exploratory search | When the best search strategy isn’t known upfront |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Simple keyword or vector search | `feature_search` |
| Known metadata filter conditions | `attribute_filter` |
| Fixed pipeline with known stages | Chain stages directly |
| Latency-critical applications (< 1s) | Direct stage execution |
| Cost-sensitive high-volume queries | Pre-configured pipelines |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `strategy` | string | `iterative_refinement` | Reasoning strategy (see Strategies below) |
| `stages` | list[string] | from strategy | Which retriever stages the agent can invoke as tools |
| `system_prompt` | string | from strategy | Custom system prompt for the LLM. Supports `{{INPUT.*}}` variables |
| `max_iterations` | integer | 5 | Maximum reasoning iterations (1-20) |
| `timeout_seconds` | float | 60.0 | Total timeout for the reasoning loop (5-300s) |
| `provider` | string | auto | LLM provider (`openai`, `google`, `anthropic`) |
| `model_name` | string | auto | Specific model to use |
| `feedback` | string | null | Feedback from a prior execution to inject into the agent’s prompt. Use to correct or refine behavior based on previous results |
| `min_confidence` | float | 0.0 | Minimum confidence (0.0–1.0) required to accept results. Below this threshold, the agent keeps searching even if it signals done |
| `auto_strategy` | boolean | false | When true, a lightweight LLM call selects the best strategy automatically before the main loop. Adds ~0.5–1s latency |
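An illustrative configuration combining several of these parameters (the specific values here are examples, not recommendations):

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "iterative_refinement",
    "max_iterations": 8,
    "timeout_seconds": 90.0,
    "provider": "openai",
    "min_confidence": 0.7,
    "auto_strategy": false
  }
}
```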

Strategies

| Strategy | Default Tools | Best For |
| --- | --- | --- |
| `iterative_refinement` | `feature_search`, `attribute_filter` | Progressive narrowing of results |
| `multi_hop` | `feature_search`, `attribute_filter`, `llm_filter` | Following cross-document references |
| `tree_navigation` | `feature_search`, `attribute_filter` | Hierarchical index traversal |
| `custom` | user-specified | Full control over tools and prompt |
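For example, selecting the `multi_hop` strategy with its default tools (iteration and timeout values here are illustrative):

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "multi_hop",
    "max_iterations": 8,
    "timeout_seconds": 60.0
  }
}
```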

Available Tools (Stages)

The agent can invoke any registered retriever stage as a tool. Each stage is presented to the LLM with a simplified parameter schema:
| Tool | What the LLM Sees | Typical Use |
| --- | --- | --- |
| `feature_search` | Semantic similarity search with `query` and `top_k` | Vector search |
| `attribute_filter` | Filter by field, operator, and value | Metadata filtering |
| `llm_filter` | Filter by natural language criteria | Semantic filtering |
| `rerank` | Re-score documents by relevance to query | Result refinement |
| `llm_enrich` | Generate new fields using LLM analysis | Information extraction |
| `taxonomy_enrich` | Classify documents against a taxonomy | Categorization |

Configuration Examples

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "iterative_refinement",
    "max_iterations": 5,
    "timeout_seconds": 30.0
  }
}
```
Use the custom strategy when you need precise control over which tools the agent can access and how it should reason. The built-in strategies provide sensible defaults for common patterns.
Use feedback to create a human-in-the-loop refinement cycle: execute a retriever, review the results, then re-execute with feedback describing what was wrong. The agent adjusts its strategy based on your corrections. Pair with min_confidence to ensure the agent keeps searching until results meet your quality bar.
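For instance, a re-execution that injects feedback and raises the quality bar might look like this (the feedback text and threshold are illustrative):

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "iterative_refinement",
    "feedback": "The previous results included marketing pages; restrict to technical documentation only.",
    "min_confidence": 0.8,
    "max_iterations": 5
  }
}
```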

How It Works

  1. Strategy selection: If auto_strategy is enabled, a lightweight LLM call picks the best strategy for the query
  2. Prompt assembly: The system prompt is built from the strategy defaults, with feedback prepended if provided
  3. Reasoning loop: Each iteration, the LLM receives the query, available tools, and a budget note showing remaining iterations and seconds
  4. Tool execution: The LLM calls retriever stages as tools. Results are summarized and fed back
  5. Confidence gating: The LLM calls finish_search to declare done with a confidence score (0.0–1.0). If confidence is below min_confidence, the loop continues
  6. Context compression: When the conversation grows long, older messages are replaced with a compact working-memory summary to stay within context limits
  7. Return: Accumulated results and metadata (confidence, summary, reasoning trace) are returned
Each tool call creates a sub-state execution of the target stage, inheriting namespace and collection context from the parent. Non-empty results from each iteration replace the previous working set. If a refinement query returns zero results, the previous results are preserved.
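The core of the loop above can be sketched in Python. This is a simplified illustration of the control flow only, not the actual implementation; `call_llm` and `execute_stage` are hypothetical stand-ins for the LLM call and sub-stage execution:

```python
def agent_search(query, tools, call_llm, execute_stage,
                 max_iterations=5, min_confidence=0.0):
    """Simplified sketch of the agent reasoning loop (illustrative only)."""
    results = []
    for iteration in range(max_iterations):
        # The LLM sees the query, the tool schemas, and a budget note.
        action = call_llm(query, tools, remaining=max_iterations - iteration)
        if action["tool"] == "finish_search":
            # Confidence gating: accept only if the threshold is met.
            if action["confidence"] >= min_confidence:
                return results, action["confidence"]
            continue  # below threshold: keep searching
        new_results = execute_stage(action["tool"], action["args"])
        # Non-empty results replace the working set; empty ones are ignored,
        # preserving the previous results.
        if new_results:
            results = new_results
    return results, None  # budget exhausted
```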

Performance

| Metric | Value |
| --- | --- |
| Latency | 2–30s (depends on iterations and sub-stages) |
| Memory | O(N) per iteration result set |
| Cost | LLM API calls + sub-stage costs per iteration |
| Complexity | O(iterations × sub-stage complexity) |

Common Pipeline Patterns

Agent as First Stage

```json
[
  {
    "stage_type": "filter",
    "stage_id": "agent_search",
    "parameters": {
      "strategy": "iterative_refinement",
      "stages": ["feature_search", "attribute_filter"],
      "max_iterations": 3
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "baai_bge_reranker_v2_m3",
      "query": "{{INPUT.query}}",
      "document_field": "content"
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 10
    }
  }
]
```

Complex Query Decomposition

```json
[
  {
    "stage_type": "filter",
    "stage_id": "agent_search",
    "parameters": {
      "strategy": "custom",
      "stages": ["feature_search", "attribute_filter", "llm_filter"],
      "system_prompt": "Break down the user's complex query into sub-queries. Use feature_search for each sub-query, attribute_filter to narrow by metadata, and llm_filter to validate relevance. Combine the best results.",
      "max_iterations": 5,
      "timeout_seconds": 45.0
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "deduplicate",
    "parameters": {
      "strategy": "field",
      "fields": ["document_id"]
    }
  }
]
```

Response Metadata

The stage returns execution metadata in stage_statistics:
| Field | Description |
| --- | --- |
| `iterations_used` | Number of reasoning iterations completed |
| `max_iterations` | Maximum iterations that were configured |
| `timeout_hit` | Whether the timeout was reached |
| `strategy` | Strategy that was used |
| `auto_strategy_used` | Whether auto-strategy routing was active |
| `stages_invoked` | List of tool calls made (name + arguments + result count) |
| `total_llm_cost` | Total LLM API cost for the reasoning loop |
| `model_used` | LLM model that was used |
| `final_confidence` | Agent’s self-reported confidence in results (0.0–1.0) |
| `final_summary` | Agent’s explanation of what was found and why it stopped |
| `reasoning_trace` | Per-iteration trace: agent reasoning, tool calls, durations |
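A caller can inspect these fields to decide whether to re-execute with `feedback`. A minimal sketch, assuming the statistics are available as a plain dict (`needs_retry` is a hypothetical helper, not part of the API):

```python
def needs_retry(stage_statistics, threshold=0.7):
    """Illustrative helper: flag runs that look unreliable."""
    if stage_statistics.get("timeout_hit"):
        return True  # the reasoning loop was cut short
    confidence = stage_statistics.get("final_confidence")
    return confidence is not None and confidence < threshold
```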

Error Handling

| Error | Behavior |
| --- | --- |
| Sub-stage execution fails | Error message returned to LLM; it can retry or try a different approach |
| Timeout reached | Returns results accumulated so far (graceful degradation) |
| Max iterations reached | Runs a recovery call to capture confidence, then returns accumulated results |
| Unknown stage in tools list | Stage is skipped (logged as warning) |
| LLM returns no tool calls | Runs a recovery call to capture confidence, then returns current results |
| Empty collection | Returns empty result set (no error) |