# Agentic RAG: When Retrieval Needs an Agent in the Loop

> What agentic RAG is — retrieval as a tool an agent uses iteratively, with query planning, self-correction, and multi-source routing — and when the upgrade pays.

Classic RAG is a fixed pipeline: retrieve once, generate once. Agentic RAG hands retrieval to an agent as a tool: it decomposes the question, searches iteratively, evaluates what came back, reformulates, routes across sources, and stops when it has enough. The upgrade pays on complex questions over messy corpora — at the price of latency, cost, and a new need for evals.

Classic [RAG](/glossary/rag) is a pipeline with the intelligence at the end: embed the user's query, fetch top-k, hand it to the model, hope. Its defining weakness is that **the retrieval happens before any thinking does** — one shot, on the user's raw phrasing, with no recourse if the shot misses. Agentic RAG moves the intelligence forward: retrieval becomes a *tool* an [agent](/glossary/ai-agent) wields — repeatedly, judgmentally — rather than a fixed pre-step.

## What the agent actually does differently

- **Decomposes.** "Compare our churn in EU vs US since the pricing change" becomes three searchable sub-questions (for example: when the pricing changed, EU churn since then, US churn since then); a single embedding of the original query resembles none of them.
- **Evaluates what came back.** After each retrieval, the agent asks the question pipelines never ask: *is this sufficient and relevant?* Thin or off-target results trigger the next move instead of a hallucinated answer.
- **Reformulates.** Failed searches get rephrased with different vocabulary, a narrower scope, or expanded acronyms. This is the loop that fixes the "right doc, wrong words" miss.
- **Routes.** Multiple sources stop being a merge problem: per sub-question, the agent picks the vector index, the [knowledge graph](/guides/concepts/graph-rag), the SQL database, or web search. Tool choice *is* retrieval strategy.
- **Stops deliberately.** Enough evidence → answer with citations; exhausted strategies → say so. An honest "couldn't find it" is itself an upgrade over confident fabrication.

Under the hood this is ordinary [tool-calling agent machinery](/guides/concepts/production-tool-calling) — search tools with good descriptions, results fed back as observations, an iteration cap — pointed at retrieval.
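
The loop described above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not a reference implementation: `agentic_retrieve`, `Step`, `Outcome`, and the callables `pick_tool`, `is_sufficient`, and `reformulate` are hypothetical names, and the keyword search and acronym expander are toy stand-ins for what would normally be a real index and model calls.

```python
from dataclasses import dataclass, field
from typing import Callable

SearchTool = Callable[[str], list[str]]  # query -> retrieved passages


@dataclass
class Step:
    tool: str
    query: str
    results: list[str]
    sufficient: bool


@dataclass
class Outcome:
    found: bool
    evidence: list[str]
    trace: list[Step] = field(default_factory=list)


def agentic_retrieve(
    question: str,
    tools: dict[str, SearchTool],
    pick_tool: Callable[[str], str],
    is_sufficient: Callable[[str, list[str]], bool],
    reformulate: Callable[[str, int], str],
    max_iterations: int = 3,
) -> Outcome:
    trace: list[Step] = []
    query = question
    for attempt in range(max_iterations):         # iteration cap
        tool_name = pick_tool(query)              # route
        results = tools[tool_name](query)         # retrieve
        ok = is_sufficient(question, results)     # evaluate what came back
        trace.append(Step(tool_name, query, results, ok))
        if ok:
            return Outcome(True, results, trace)  # stop: enough evidence
        query = reformulate(question, attempt)    # rephrase and try again
    return Outcome(False, [], trace)              # stop: report "not found"


# Toy stand-ins (assumptions) so the sketch runs without a model or an index.
DOCS = ["service level agreement: uptime target is 99.9%"]


def keyword_search(query: str) -> list[str]:
    terms = {t.strip("?.,:").lower() for t in query.split() if len(t) > 3}
    return [d for d in DOCS if terms & {w.strip("?.,:") for w in d.split()}]


def expand_acronyms(question: str, attempt: int) -> str:
    return question.replace("SLA", "service level agreement")


outcome = agentic_retrieve(
    "What does the SLA guarantee?",
    tools={"docs": keyword_search},
    pick_tool=lambda q: "docs",
    is_sufficient=lambda q, results: len(results) > 0,
    reformulate=expand_acronyms,
)
```

In this toy run the first search misses because the document never says "SLA"; the reformulated query hits on the second iteration. The returned `trace` keeps each query, its results, and the sufficiency judgment, which is the raw material for debugging and evals.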

## When the upgrade pays

The pattern earns its cost where single-shot structurally fails: **multi-part questions**, **messy or multi-source corpora**, **vocabulary mismatch between askers and documents**, and **high-stakes answers** where "search again" beats "guess." It's overkill for FAQ-shaped lookups — which is why production systems route: a difficulty classifier (or simple heuristics) sends easy queries down the cheap one-shot path and escalates the rest to the loop. Typical agentic queries cost 3–10× a pipeline query in latency and tokens; spent on the right 20% of traffic, that's a bargain.
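
As a rough illustration of the "simple heuristics" option, the sketch below routes on surface cues only. The function names `route` and `blended_cost`, the cue list, and the word-count threshold are all assumptions chosen for the example; a production router would more likely be a trained classifier tuned on real traffic.

```python
import re

# Hypothetical cue list: phrases that often signal a multi-part question.
COMPARISON_CUES = ("compare", " vs ", "versus", "difference between", "since", "trend")


def route(query: str, max_easy_words: int = 12) -> str:
    """Return "one_shot" or "agentic" using crude surface heuristics."""
    text = f" {query.lower()} "
    multi_part = any(cue in text for cue in COMPARISON_CUES)
    several_questions = text.count("?") > 1
    long_query = len(re.findall(r"\w+", text)) > max_easy_words
    if multi_part or several_questions or long_query:
        return "agentic"
    return "one_shot"


def blended_cost(agentic_share: float, agentic_multiplier: float) -> float:
    """Average cost per query, measured in units of one pipeline query."""
    return (1 - agentic_share) * 1.0 + agentic_share * agentic_multiplier
```

Taking the figures above at face value, escalating 20% of traffic at 3× to 10× the one-shot cost gives `blended_cost(0.2, 3)` ≈ 1.4 and `blended_cost(0.2, 10)` ≈ 2.8, so the average query costs roughly 1.4 to 2.8 times a pure pipeline query rather than 3 to 10 times.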

> [!WARNING]
> Agentic RAG inherits agent failure modes RAG never had: retrieval loops, premature confident stops, tool-choice errors. Cap iterations, trace every search (query → results → agent's judgment), and eval **end-to-end answer quality** on a set that includes the hard multi-hop cases — retrieval metrics alone no longer describe the system. The discipline is the same as any [LLM eval suite](/guides/evaluation/write-llm-evals).
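
One way to make "trace every search" concrete is a small per-search record plus a cheap loop check. This is a sketch under assumptions: `SearchTrace`, `to_jsonl`, `looks_like_loop`, and the judgment labels are hypothetical, not part of any particular tracing library.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SearchTrace:
    iteration: int
    tool: str
    query: str
    result_ids: list[str]
    judgment: str  # hypothetical labels, e.g. "sufficient", "thin", "off_target"


def to_jsonl(traces: list[SearchTrace]) -> str:
    """Serialize one JSON object per search, ready for a log file."""
    return "\n".join(json.dumps(asdict(t)) for t in traces)


def looks_like_loop(traces: list[SearchTrace]) -> bool:
    """Flag a retrieval loop: the same tool asked the same normalized query twice."""
    seen: set[tuple[str, str]] = set()
    for t in traces:
        key = (t.tool, " ".join(t.query.lower().split()))
        if key in seen:
            return True
        seen.add(key)
    return False
```

A check like `looks_like_loop` only catches exact repeats after whitespace and case normalization; near-duplicate rephrasings would need a similarity threshold, which is left out here.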

## Building it incrementally

Start from a working pipeline (see [the anatomy](/guides/concepts/how-rag-works)) and keep its hybrid search and [reranking](/glossary/reranking): the agent's individual searches should be your *best* searches. Then add, in order of payoff: (1) self-evaluation plus one reformulation retry; (2) query decomposition for multi-part questions; (3) multi-source routing; (4) the difficulty router in front. Each step is measurable against your failure set, and the first one alone (*retry on judged-bad retrieval*) routinely closes a surprising share of failures.
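
Step (1) can be prototyped as a thin wrapper around the existing retriever, then scored against the failure set. The sketch below assumes hypothetical names throughout (`with_one_retry`, `fraction_fixed`, and the `judge` and `rewrite` callables, which would be model calls in practice), and it measures a retrieval-level proxy only.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]  # query -> retrieved document ids


def with_one_retry(
    retrieve: Retriever,
    judge: Callable[[str, list[str]], bool],
    rewrite: Callable[[str], str],
) -> Retriever:
    """Wrap a retriever: if the first results are judged bad, retry once."""
    def wrapped(query: str) -> list[str]:
        results = retrieve(query)
        if judge(query, results):
            return results
        return retrieve(rewrite(query))  # exactly one reformulated retry
    return wrapped


def fraction_fixed(failure_set: list[tuple[str, str]], retriever: Retriever) -> float:
    """Share of (query, expected_doc_id) pairs where the expected doc is retrieved."""
    if not failure_set:
        return 0.0
    hits = sum(expected in retriever(query) for query, expected in failure_set)
    return hits / len(failure_set)


# Toy stand-ins (assumptions): an exact-match index and an acronym expander.
INDEX = {"service level agreement credits": ["doc-7"]}


def baseline(query: str) -> list[str]:
    return INDEX.get(query.lower(), [])


retrying = with_one_retry(
    baseline,
    judge=lambda query, results: bool(results),
    rewrite=lambda query: query.lower().replace("sla", "service level agreement"),
)

FAILURES = [("SLA credits", "doc-7"), ("refund policy", "doc-9")]
```

Comparing `fraction_fixed(FAILURES, baseline)` with `fraction_fixed(FAILURES, retrying)` shows how much of the failure set the retry recovers in this toy setup. Because this checks retrieval hits only, the end-to-end answer evals still apply before the change is adopted.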

Agentic RAG is where the two big 2026 threads — better retrieval and better agents — braid together; the [rag-pipeline-engineer](/agents/data-ai/rag-pipeline-engineer) agent builds exactly this evolution. And for the question that usually precedes the whole topic — "do million-token contexts make RAG obsolete?" — the answer is its own guide: [RAG vs Long Context](/guides/concepts/rag-vs-long-context).

---

_Source: https://agentscamp.com/guides/concepts/agentic-rag — Guide on AgentsCamp._
