Agentic retrieval is a retrieval paradigm in which an AI agent dynamically decides how to search, which tools to use, and when to refine its queries based on intermediate results. Unlike static retrieval pipelines with fixed steps, agentic retrieval adapts its strategy at run time: it breaks complex information needs into sub-queries, evaluates partial results, and iterates until it judges the results sufficient.
An agentic retrieval system gives an LLM access to retrieval tools (vector search, keyword search, filters, aggregations) and lets it plan a multi-step search strategy. The agent formulates an initial query, evaluates the results, decides whether they are sufficient, and if not, reformulates the query or tries a different approach. This loop continues until the agent determines it has found the relevant information.
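The search, evaluate, reformulate loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the corpus, `keyword_search`, `is_sufficient`, `reformulate`, and `agentic_search` are all hypothetical names, and the two decision functions are simple heuristics standing in for judgments an LLM would make in a real system.

```python
# Toy corpus standing in for a real search index (hypothetical data).
CORPUS = {
    "doc1": "Agentic retrieval lets an agent refine its queries.",
    "doc2": "Vector search ranks documents by embedding similarity.",
    "doc3": "Keyword search matches exact terms in documents.",
}


def keyword_search(query: str) -> list[str]:
    """Return ids of documents that contain every query term (case-insensitive)."""
    terms = query.lower().split()
    if not terms:
        return []
    return [
        doc_id
        for doc_id, text in CORPUS.items()
        if all(term in text.lower() for term in terms)
    ]


def is_sufficient(results: list[str]) -> bool:
    """Stand-in for the agent's judgment: any hit counts as sufficient."""
    return len(results) > 0


def reformulate(query: str) -> str:
    """Stand-in for LLM query rewriting: relax the query by dropping its last term."""
    return " ".join(query.split()[:-1])


def agentic_search(query: str, max_steps: int = 5) -> tuple[list[str], list[tuple[str, list[str]]]]:
    """Search, evaluate, and reformulate until results suffice or the step budget runs out."""
    trace = []  # (query, results) pairs, one per attempt
    for _ in range(max_steps):
        results = keyword_search(query)
        trace.append((query, results))
        if is_sufficient(results):
            return results, trace
        query = reformulate(query)
        if not query:  # nothing left to relax
            break
    return [], trace


results, trace = agentic_search("keyword search embeddings")
print(results)     # ids found by the final attempt
print(len(trace))  # number of attempts made
```

The `max_steps` budget is a design choice added here so the loop always terminates; the text itself only says the loop runs until the agent is satisfied.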
Agentic retrieval combines tool-calling LLMs (like GPT-4 or Claude) with retrieval APIs exposed as callable functions. The agent receives a schema describing available search tools, their parameters, and return types. ReAct-style prompting enables the agent to reason about which tools to call and in what order. Results are accumulated in a working memory that informs subsequent tool calls.
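To make the tool schema and working memory concrete, here is a small sketch of how search tools might be described to the agent and how each tool call could be recorded. Everything in it is an assumption for illustration: the schema layout loosely follows the JSON Schema style that tool-calling APIs commonly accept, and `SEARCH_TOOLS`, `REGISTRY`, `execute_tool_call`, and the document data are hypothetical. No LLM is called; the tool calls are written by hand in the shape a model might emit, with a ReAct-style thought attached.

```python
from typing import Any, Callable

# Hypothetical document store.
DOCS = [
    {"id": 1, "title": "Intro to vector search", "year": 2021},
    {"id": 2, "title": "Keyword search basics", "year": 2019},
    {"id": 3, "title": "Hybrid search in practice", "year": 2023},
]


def keyword_search(query: str) -> list[int]:
    """Return ids of documents whose title contains the query string."""
    return [d["id"] for d in DOCS if query.lower() in d["title"].lower()]


def filter_by_year(min_year: int) -> list[int]:
    """Return ids of documents published in or after min_year."""
    return [d["id"] for d in DOCS if d["year"] >= min_year]


# Schema the agent would receive: tool names, descriptions, and parameters (assumed layout).
SEARCH_TOOLS = [
    {
        "name": "keyword_search",
        "description": "Substring search over document titles.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "filter_by_year",
        "description": "Keep documents published in or after a given year.",
        "parameters": {
            "type": "object",
            "properties": {"min_year": {"type": "integer"}},
            "required": ["min_year"],
        },
    },
]

SCHEMA_BY_NAME = {tool["name"]: tool for tool in SEARCH_TOOLS}
REGISTRY: dict[str, Callable[..., list[int]]] = {
    "keyword_search": keyword_search,
    "filter_by_year": filter_by_year,
}


def execute_tool_call(call: dict[str, Any], memory: list[dict[str, Any]]) -> list[int]:
    """Validate a tool call against its schema, run it, and record it in working memory."""
    name, args = call["name"], call["arguments"]
    if name not in SCHEMA_BY_NAME:
        raise ValueError(f"unknown tool: {name}")
    missing = [p for p in SCHEMA_BY_NAME[name]["parameters"]["required"] if p not in args]
    if missing:
        raise ValueError(f"missing arguments for {name}: {missing}")
    observation = REGISTRY[name](**args)
    # ReAct-style record: the reasoning, the action taken, and what was observed.
    memory.append({
        "thought": call.get("thought", ""),
        "action": name,
        "arguments": args,
        "observation": observation,
    })
    return observation


memory: list[dict[str, Any]] = []
execute_tool_call(
    {"thought": "Start broad.", "name": "keyword_search", "arguments": {"query": "search"}},
    memory,
)
execute_tool_call(
    {"thought": "Too many hits; keep recent ones.", "name": "filter_by_year", "arguments": {"min_year": 2021}},
    memory,
)
for step in memory:
    print(step["action"], step["observation"])
```

In a full system the accumulated `memory` would be serialized back into the prompt so that the model's next tool call can take earlier observations into account.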