Adaptive RAG: Advanced RAG Tutorial

Dynamic routing between different retrieval strategies

Naive RAG retrieves the same way for every query — wasteful for simple questions, insufficient for hard ones. Adaptive RAG routes each query to the right strategy: answer directly when no retrieval is needed, do a single retrieval for simple lookups, and run multi-step / iterative retrieval for complex questions. It's RAG that matches effort to difficulty.

The core idea

Add a routing step before retrieval. A classifier (often a cheap LLM) inspects the query and picks a path:

  • No retrieval: the model already knows it (e.g. "what's 2+2", general knowledge) → answer directly.
  • Single-shot retrieval: a factual lookup → retrieve top-k once, then answer.
  • Iterative / multi-hop: a complex question needing several pieces → retrieve, reason, retrieve again (query decomposition).
Sketch: route then retrieve (Python)

    route = classify(query)  # 'none' | 'single' | 'multi' (cheap LLM call)
    if route == 'none':
        answer = llm(query)
    elif route == 'single':
        answer = llm(query, context=retrieve(query, k=5))
    else:
        answer = multi_hop_rag(query)  # decompose → retrieve per sub-question → synthesize
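
The sketch above leaves classify, llm, retrieve, and multi_hop_rag undefined. A slightly fuller version might look like the following, under some assumptions: the regex and keyword rules in rule_route are hypothetical heuristics for "obvious" cases, and llm_classify, decompose, retrieve, and llm are placeholder callables you would supply yourself, not functions from any particular library. Rules run first, the cheap LLM classifier is consulted only when the rules are unsure, and unexpected classifier output falls back to single-shot retrieval.

```python
import re
from typing import Callable, List, Optional

ROUTES = ("none", "single", "multi")

# Hypothetical heuristics for "obvious" queries; tune or replace for real traffic.
ARITHMETIC = re.compile(r"^(what'?s|what is)?\s*[\d\s+\-*/().]+\??$")
MULTI_HOP_MARKERS = ("compare", " versus ", " vs ", "difference between")


def rule_route(query: str) -> Optional[str]:
    """Return a route for obvious cases, or None when the rules are unsure."""
    q = query.lower().strip()
    if ARITHMETIC.match(q):
        return "none"
    if any(marker in q for marker in MULTI_HOP_MARKERS):
        return "multi"
    return None


def route_query(query: str, llm_classify: Callable[[str], str]) -> str:
    """Try the rules first, then the (assumed) cheap LLM classifier."""
    route = rule_route(query)
    if route is None:
        route = llm_classify(query).strip().lower()
    # Unexpected classifier output falls back to single-shot retrieval.
    return route if route in ROUTES else "single"


def multi_hop_rag(query: str,
                  decompose: Callable[[str], List[str]],
                  retrieve: Callable[[str, int], List[str]],
                  llm: Callable[[str, List[str]], str],
                  k: int = 3) -> str:
    """Decompose, retrieve per sub-question, then synthesize one answer."""
    context: List[str] = []
    for sub_question in decompose(query):
        context.extend(retrieve(sub_question, k))
    return llm(query, context)


def adaptive_answer(query: str,
                    llm_classify: Callable[[str], str],
                    decompose: Callable[[str], List[str]],
                    retrieve: Callable[[str, int], List[str]],
                    llm: Callable[[str, List[str]], str]) -> str:
    """Dispatch the query to one of the three paths."""
    route = route_query(query, llm_classify)
    if route == "none":
        return llm(query, [])                 # answer directly, no context
    if route == "single":
        return llm(query, retrieve(query, 5))  # one top-k retrieval
    return multi_hop_rag(query, decompose, retrieve, llm)
```

Passing the model and retriever in as callables keeps the router testable with stubs and independent of any one vector store or LLM client.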

Why it helps

  • Cost & latency: skip retrieval when it's unnecessary; don't over-retrieve simple queries.
  • Accuracy: give hard, multi-hop questions the iterative retrieval they need instead of one shallow pass.
  • Self-correction: advanced variants grade retrieved chunks for relevance and re-query if they're weak (CRAG-style).

Adaptive RAG is naturally a stateful graph, which is why LangGraph is a common home for it. See the LangGraph Stateful Agent Guide, and LangChain vs LlamaIndex for the framework choice.
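
To make the "stateful graph" and self-correction ideas concrete, here is a minimal sketch of a retrieve, grade, re-query loop written in plain Python. It deliberately does not use LangGraph's API; LangGraph models the same idea with nodes and conditional edges, but the names below (build_graph, run_graph, the state keys, and the retrieve, grade, rewrite, and generate callables) are assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
END = "END"


def build_graph(retrieve: Callable[[str], List[str]],
                grade: Callable[[str, str], bool],
                rewrite: Callable[[str], str],
                generate: Callable[[str, List[str]], str],
                max_retries: int = 2):
    """Return (nodes, edges) for a retrieve -> grade -> (rewrite | generate) loop."""

    def retrieve_node(state: State) -> State:
        return {**state, "docs": retrieve(state["query"])}

    def grade_node(state: State) -> State:
        # Keep only chunks the grader judges relevant to the original question.
        kept = [d for d in state["docs"] if grade(state["question"], d)]
        return {**state, "docs": kept}

    def rewrite_node(state: State) -> State:
        # Reformulate the search query and spend one unit of retry budget.
        return {**state,
                "query": rewrite(state["query"]),
                "retries": state["retries"] + 1}

    def generate_node(state: State) -> State:
        # Answer the original question with whatever context survived grading.
        return {**state, "answer": generate(state["question"], state["docs"])}

    def after_grade(state: State) -> str:
        # Conditional edge: re-query only while context is empty and budget remains.
        if state["docs"] or state["retries"] >= max_retries:
            return "generate"
        return "rewrite"

    nodes = {"retrieve": retrieve_node, "grade": grade_node,
             "rewrite": rewrite_node, "generate": generate_node}
    edges = {"retrieve": lambda s: "grade", "grade": after_grade,
             "rewrite": lambda s: "retrieve", "generate": lambda s: END}
    return nodes, edges


def run_graph(question: str, nodes, edges,
              start: str = "retrieve", max_steps: int = 20) -> State:
    """Walk the graph from `start` until END or the step limit is reached."""
    state: State = {"question": question, "query": question, "docs": [],
                    "retries": 0, "answer": None, "trace": []}
    node = start
    for _ in range(max_steps):
        if node == END:
            break
        state = nodes[node](state)
        state["trace"] = state["trace"] + [node]  # record the path taken
        node = edges[node](state)
    return state
```

The retry cap and step limit are the important design choice here: without them, a grader that rejects everything would loop forever. When the budget runs out, this sketch still generates an answer from an empty context rather than failing, which may or may not be what you want.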

Building blocks you'll reuse

Adaptive RAG sits on top of solid basics: good semantic search, the right vector store (Chroma vs Qdrant), and reranking for precision. Adaptivity is the routing layer above them.
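
As a rough illustration of how those basics fit together, the sketch below runs a two-stage "retrieve, then rerank" pass. Everything in it is a toy stand-in: embed is a bag-of-words counter rather than a real embedding model, the corpus is an in-memory list rather than Chroma or Qdrant, and the score argument to rerank is a hypothetical relevance function where a real system would typically use a dedicated reranking model.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a stand-in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: List[str], k: int = 5) -> List[str]:
    """Stage 1: cheap similarity search over the whole corpus (recall)."""
    query_vec = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, embed(doc)),
                    reverse=True)
    return ranked[:k]


def rerank(query: str, candidates: List[str],
           score: Callable[[str, str], float], top_n: int = 2) -> List[str]:
    """Stage 2: a more careful scorer reorders the short list (precision)."""
    return sorted(candidates, key=lambda doc: score(query, doc),
                  reverse=True)[:top_n]
```

The router decides whether and how often this pipeline runs; the pipeline itself stays the same across the single-shot and multi-hop paths.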

FAQ

How is this different from normal RAG?
Normal RAG always retrieves the same way; adaptive RAG routes by query difficulty.

What does the routing?
Usually a cheap LLM classifier; rules can handle obvious cases.

Does it lower cost?
Yes. It skips retrieval when unneeded and avoids over-retrieving.

What's CRAG?
Corrective RAG: grade the retrieved docs and re-retrieve if they're irrelevant. It is a self-correcting variant.

Summary

Adaptive RAG matches retrieval effort to query difficulty: answer directly, retrieve once, or retrieve iteratively, chosen per query by a router. It cuts cost on easy questions and boosts accuracy on hard ones. Build it as a stateful graph on top of solid semantic search and reranking.

*Last updated: June 2026. Verify patterns against current RAG literature and LangGraph docs.*
