[Figure: Agent Search stage showing an LLM reasoning loop with retriever stages as callable tools]
The Agent Search stage uses an LLM reasoning loop to orchestrate other retriever stages as callable tools. Instead of executing a fixed sequence of stages, the LLM dynamically decides which stages to invoke, with what arguments, and how many iterations to perform based on the query and intermediate results.
**Stage Category:** FILTER (Adaptive retrieval)
**Transformation:** Query → LLM reasoning (1-N iterations) → Refined documents

When to Use

| Use Case | Description |
| --- | --- |
| Multi-hop queries | Questions requiring following references across documents |
| Iterative refinement | Broad search followed by intelligent narrowing |
| Complex filtering logic | When the right filter conditions depend on query semantics |
| Tree/hierarchy navigation | Navigating hierarchical document indexes top-down |
| Exploratory search | When the best search strategy isn’t known upfront |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Simple keyword or vector search | `feature_search` |
| Known metadata filter conditions | `attribute_filter` |
| Fixed pipeline with known stages | Chain stages directly |
| Latency-critical applications (< 1s) | Direct stage execution |
| Cost-sensitive high-volume queries | Pre-configured pipelines |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `strategy` | string | `iterative_refinement` | Reasoning strategy (see Strategies below) |
| `stages` | list[string] | from strategy | Which retriever stages the agent can invoke as tools |
| `system_prompt` | string | from strategy | Custom system prompt for the LLM. Supports `{{INPUT.*}}` variables |
| `max_iterations` | integer | 5 | Maximum reasoning iterations (1-20) |
| `timeout_seconds` | float | 60.0 | Total timeout for the reasoning loop (5-300s) |
| `provider` | string | auto | LLM provider (`openai`, `google`, `anthropic`) |
| `model_name` | string | auto | Specific model to use |
| `feedback` | string | null | Feedback from a prior execution to inject into the agent’s prompt. Use to correct or refine behavior based on previous results |
| `min_confidence` | float | 0.0 | Minimum confidence (0.0–1.0) required to accept results. Below this threshold, the agent keeps searching even if it signals done |
| `auto_strategy` | boolean | false | When true, a lightweight LLM call selects the best strategy automatically before the main loop. Adds ~0.5–1s latency |
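An illustrative configuration combining several of these parameters (the specific values here are examples, not recommendations):

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "iterative_refinement",
    "max_iterations": 8,
    "timeout_seconds": 90.0,
    "provider": "openai",
    "min_confidence": 0.7,
    "auto_strategy": false
  }
}
```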

Strategies

| Strategy | Default Tools | Best For |
| --- | --- | --- |
| `iterative_refinement` | `feature_search`, `attribute_filter` | Progressive narrowing of results |
| `multi_hop` | `feature_search`, `attribute_filter`, `llm_filter` | Following cross-document references |
| `tree_navigation` | `feature_search`, `attribute_filter` | Hierarchical index traversal |
| `custom` | user-specified | Full control over tools and prompt |
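For example, selecting the `multi_hop` strategy with its default tools (iteration and timeout values here are illustrative):

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "multi_hop",
    "max_iterations": 8,
    "timeout_seconds": 60.0
  }
}
```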

Available Tools (Stages)

The agent can invoke any registered retriever stage as a tool. Each stage is presented to the LLM with a simplified parameter schema:
| Tool | What the LLM Sees | Typical Use |
| --- | --- | --- |
| `feature_search` | Semantic similarity search with `query` and `top_k` | Vector search |
| `attribute_filter` | Filter by field, operator, and value | Metadata filtering |
| `llm_filter` | Filter by natural language criteria | Semantic filtering |
| `rerank` | Re-score documents by relevance to query | Result refinement |
| `llm_enrich` | Generate new fields using LLM analysis | Information extraction |
| `taxonomy_enrich` | Classify documents against a taxonomy | Categorization |

Configuration Examples

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "iterative_refinement",
    "max_iterations": 5,
    "timeout_seconds": 30.0
  }
}
```
Use the custom strategy when you need precise control over which tools the agent can access and how it should reason. The built-in strategies provide sensible defaults for common patterns.
Use feedback to create a human-in-the-loop refinement cycle: execute a retriever, review the results, then re-execute with feedback describing what was wrong. The agent adjusts its strategy based on your corrections. Pair with min_confidence to ensure the agent keeps searching until results meet your quality bar.
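For instance, a re-execution that injects feedback and raises the quality bar might look like this (the feedback text and threshold are illustrative):

```json
{
  "stage_type": "filter",
  "stage_id": "agent_search",
  "parameters": {
    "strategy": "iterative_refinement",
    "feedback": "The previous results included marketing pages; restrict to technical documentation only.",
    "min_confidence": 0.8,
    "max_iterations": 5
  }
}
```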

How It Works

  1. Strategy selection: If auto_strategy is enabled, a lightweight LLM call picks the best strategy for the query
  2. Prompt assembly: The system prompt is built from the strategy defaults, with feedback prepended if provided
  3. Reasoning loop: Each iteration, the LLM receives the query, available tools, and a budget note showing remaining iterations and seconds
  4. Tool execution: The LLM calls retriever stages as tools. Results are summarized and fed back
  5. Confidence gating: The LLM calls finish_search to declare done with a confidence score (0.0–1.0). If confidence is below min_confidence, the loop continues
  6. Context compression: When the conversation grows long, older messages are replaced with a compact working-memory summary to stay within context limits
  7. Return: Accumulated results and metadata (confidence, summary, reasoning trace) are returned
Each tool call creates a sub-state execution of the target stage, inheriting namespace and collection context from the parent. Non-empty results from each iteration replace the previous working set. If a refinement query returns zero results, the previous results are preserved.
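The core of the loop above can be sketched in Python. This is a simplified illustration of the control flow only, not the actual implementation; `call_llm` and `execute_stage` are hypothetical stand-ins for the LLM call and sub-stage execution:

```python
def agent_search(query, tools, call_llm, execute_stage,
                 max_iterations=5, min_confidence=0.0):
    """Simplified sketch of the agent reasoning loop (illustrative only)."""
    results = []
    for iteration in range(max_iterations):
        # The LLM sees the query, the tool schemas, and a budget note.
        action = call_llm(query, tools, remaining=max_iterations - iteration)
        if action["tool"] == "finish_search":
            # Confidence gating: accept only if the threshold is met.
            if action["confidence"] >= min_confidence:
                return results, action["confidence"]
            continue  # below threshold: keep searching
        new_results = execute_stage(action["tool"], action["args"])
        # Non-empty results replace the working set; empty ones are ignored,
        # preserving the previous results.
        if new_results:
            results = new_results
    return results, None  # budget exhausted
```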

Performance

| Metric | Value |
| --- | --- |
| Latency | 2–30s (depends on iterations and sub-stages) |
| Memory | O(N) per iteration result set |
| Cost | LLM API calls + sub-stage costs per iteration |
| Complexity | O(iterations × sub-stage complexity) |

Common Pipeline Patterns

Agent as First Stage

```json
[
  {
    "stage_type": "filter",
    "stage_id": "agent_search",
    "parameters": {
      "strategy": "iterative_refinement",
      "stages": ["feature_search", "attribute_filter"],
      "max_iterations": 3
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "baai_bge_reranker_v2_m3",
      "query": "{{INPUT.query}}",
      "document_field": "content"
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 10
    }
  }
]
```

Complex Query Decomposition

```json
[
  {
    "stage_type": "filter",
    "stage_id": "agent_search",
    "parameters": {
      "strategy": "custom",
      "stages": ["feature_search", "attribute_filter", "llm_filter"],
      "system_prompt": "Break down the user's complex query into sub-queries. Use feature_search for each sub-query, attribute_filter to narrow by metadata, and llm_filter to validate relevance. Combine the best results.",
      "max_iterations": 5,
      "timeout_seconds": 45.0
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "deduplicate",
    "parameters": {
      "strategy": "field",
      "fields": ["document_id"]
    }
  }
]
```

Response Metadata

The stage returns execution metadata in stage_statistics:
| Field | Description |
| --- | --- |
| `iterations_used` | Number of reasoning iterations completed |
| `max_iterations` | Maximum iterations that were configured |
| `timeout_hit` | Whether the timeout was reached |
| `strategy` | Strategy that was used |
| `auto_strategy_used` | Whether auto-strategy routing was active |
| `stages_invoked` | List of tool calls made (name + arguments + result count) |
| `total_llm_cost` | Total LLM API cost for the reasoning loop |
| `model_used` | LLM model that was used |
| `final_confidence` | Agent’s self-reported confidence in results (0.0–1.0) |
| `final_summary` | Agent’s explanation of what was found and why it stopped |
| `reasoning_trace` | Per-iteration trace: agent reasoning, tool calls, durations |
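A caller can inspect these fields to decide whether to re-execute with `feedback`. A minimal sketch, assuming the statistics are available as a plain dict (`needs_retry` is a hypothetical helper, not part of the API):

```python
def needs_retry(stage_statistics, threshold=0.7):
    """Illustrative helper: flag runs that look unreliable."""
    if stage_statistics.get("timeout_hit"):
        return True  # the reasoning loop was cut short
    confidence = stage_statistics.get("final_confidence")
    return confidence is not None and confidence < threshold
```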

Error Handling

| Error | Behavior |
| --- | --- |
| Sub-stage execution fails | Error message returned to LLM; it can retry or try a different approach |
| Timeout reached | Returns results accumulated so far (graceful degradation) |
| Max iterations reached | Runs a recovery call to capture confidence, then returns accumulated results |
| Unknown stage in tools list | Stage is skipped (logged as warning) |
| LLM returns no tool calls | Runs a recovery call to capture confidence, then returns current results |
| Empty collection | Returns empty result set (no error) |