Advanced RAG Patterns

Hybrid search, parent-child retrieval, query decomposition, and agentic loops

Advanced RAG combines lexical and semantic search, uses parent-child retrieval to pair granular indexing with coherent generation context, decomposes multi-hop questions into sub-queries, and optionally adds agentic retrieval loops, all with careful attention to latency and failure paths.

Hybrid search is the most reliable single improvement for production RAG. Dense embedding search excels at semantic similarity but misses exact strings: product codes, version numbers, names, acronyms. Sparse search with BM25 or TF-IDF is the opposite: strong on exact matches but poor at semantic generalisation. Combining the two with reciprocal rank fusion (RRF) or learned score weighting captures both strengths. Most production systems above a certain query volume will benefit from hybrid search.
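A minimal sketch of reciprocal rank fusion may make the merging step concrete. It assumes each retriever returns an ordered list of document IDs; the function name, the sample IDs, and the constant k=60 (a commonly used default, not a tuned value) are illustrative assumptions rather than part of any particular library.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Only ranks are used, never raw scores, so the
    dense and sparse retrievers do not need comparable score scales.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a sparse (BM25) retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear near the top of both lists (doc_a, doc_c) rise above documents that only one retriever found, which is the behaviour hybrid search relies on.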

Parent-child retrieval solves a fundamental tension in RAG: you want small chunks for precise retrieval but large, coherent sections for generation. The solution is to index small child chunks for retrieval, then at generation time fetch the larger parent section that contains the matched child. This gives you granular indexing without the incoherence of passing fragmented sentences to the generator. Query decomposition addresses multi-hop questions by breaking "Who founded the company that acquired X?" into sequential sub-queries, each of which can be answered with a single retrieval step.
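The parent-child pattern described above can be sketched in a few lines. This is a toy illustration under stated assumptions: sentences stand in for child chunks, a word-overlap count stands in for a real vector search, and the names `Child`, `build_child_index`, and `retrieve_parents` as well as the sample sections are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Child:
    parent_id: str  # ID of the parent section this chunk was cut from
    text: str       # the small chunk that is actually indexed


def build_child_index(parents):
    """Split each parent section into sentence-level child chunks."""
    children = []
    for parent_id, section in parents.items():
        for sentence in re.split(r"(?<=[.!?])\s+", section.strip()):
            if sentence:
                children.append(Child(parent_id, sentence))
    return children


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_parents(query, children, parents, top_k=2):
    """Rank children by a toy word-overlap score, then return their parents."""
    q = tokens(query)
    ranked = sorted(children, key=lambda c: len(q & tokens(c.text)), reverse=True)
    parent_ids = []
    for child in ranked[:top_k]:
        if child.parent_id not in parent_ids:  # deduplicate, keep rank order
            parent_ids.append(child.parent_id)
    return [parents[pid] for pid in parent_ids]


# Made-up sections used only for illustration.
parents = {
    "billing": "Invoices are issued monthly. Refunds are processed within five days.",
    "security": "Passwords are hashed. API keys rotate every ninety days.",
}
children = build_child_index(parents)
context = retrieve_parents("how long do refunds take", children, parents, top_k=1)
print(context)  # the full "billing" section, not just the matched sentence
```

The match happens on the single sentence about refunds, but the generator receives the whole billing section, so it sees the surrounding context as well.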

Agentic retrieval, where an LLM decides whether the retrieved context is sufficient and iteratively refines its retrieval strategy, is powerful but carries real costs. Each iteration adds latency and an additional LLM call. Failure modes compound: a misguided first retrieval can produce a misguided refinement query. Agentic patterns make sense for complex research tasks where quality matters more than latency. For most production RAG use cases, a well-designed non-agentic pipeline with query decomposition and hybrid search will outperform a fragile agentic loop.
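One way to keep the latency and failure costs of such a loop bounded is a hard cap on retrieval rounds. The sketch below assumes three caller-supplied callables (`retrieve`, `is_sufficient`, `refine`); in a real system the last two would typically be LLM calls. All names, the toy corpus, and the default cap of three rounds are hypothetical.

```python
from typing import Callable, List, Tuple


def agentic_retrieve(
    question: str,
    retrieve: Callable[[str], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[List[str], int]:
    """Retrieve, check sufficiency, refine the query; stop at a hard cap.

    Returns the accumulated context and the number of retrieval rounds used.
    """
    context: List[str] = []
    query = question
    for round_number in range(1, max_rounds + 1):
        for passage in retrieve(query):
            if passage not in context:  # avoid duplicate passages
                context.append(passage)
        if is_sufficient(question, context):
            return context, round_number
        if round_number < max_rounds:  # skip a refinement that would go unused
            query = refine(question, context)
    return context, max_rounds


# Toy stand-ins so the loop can run without a model or a vector store.
corpus = {
    "who acquired x": ["Acme acquired X in 2019."],
    "who founded acme": ["Acme was founded by Jo Lee."],
}


def retrieve(query):
    return corpus.get(query.lower(), [])


def is_sufficient(question, context):
    return any("founded by" in passage for passage in context)


def refine(question, context):
    return "who founded acme"


context, rounds = agentic_retrieve("Who acquired X", retrieve, is_sufficient, refine)
print(rounds, context)
```

The cap makes the worst-case cost explicit: at most `max_rounds` retrievals and sufficiency checks, and one fewer refinement call. When the cap is hit, the caller still receives whatever context was gathered and can fall back to answering from it or declining to answer.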

Key Concepts

  • Hybrid search (dense + sparse) is the most reliable single improvement for production RAG
  • Reciprocal rank fusion (RRF) is a simple, effective way to merge semantic and keyword rankings
  • Parent-child retrieval: index small chunks, return large coherent sections at generation time
  • Query decomposition breaks multi-hop questions into independently retrievable sub-queries
  • Agentic retrieval loops improve quality for complex tasks but add latency and failure surface
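
Query decomposition, as listed above, can be sketched as a short sequential pipeline in which each sub-query's answer fills a placeholder in the next one. The plan here is hard-coded; in practice an LLM would usually produce it. The `{prev}` placeholder convention, the function name, and the toy facts are assumptions made for illustration.

```python
def answer_multi_hop(sub_queries, lookup):
    """Run sub-queries in order, substituting the previous answer for '{prev}'.

    `lookup` stands in for a retrieve-then-answer step over a single query.
    Returns the final answer and a trace of (query, answer) pairs.
    """
    previous = ""
    trace = []
    for template in sub_queries:
        query = template.replace("{prev}", previous)
        previous = lookup(query)
        trace.append((query, previous))
    return previous, trace


# Made-up facts standing in for a retrieval index.
facts = {
    "Which company acquired X?": "Acme",
    "Who founded Acme?": "Jo Lee",
}

# Decomposition of "Who founded the company that acquired X?"
plan = ["Which company acquired X?", "Who founded {prev}?"]

answer, trace = answer_multi_hop(plan, lambda query: facts[query])
print(answer)  # Jo Lee
```

Keeping the trace makes each hop inspectable, which helps when a wrong final answer needs to be traced back to the sub-query that went astray.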