Agentic AI Patterns
A practical guide to building AI agents with LangGraph and LangChain. Each lesson includes an architecture diagram, annotated Python code, key concepts, and research references.
ReAct
Reason + Act
The foundational agent loop — the LLM alternates between reasoning about what to do and calling a tool until it has enough information to answer.
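A minimal sketch of this loop in plain Python, without LangGraph. The `fake_llm` function, the `Action: name[arg]` text format, and the `add` tool are hypothetical stand-ins chosen for illustration; a real agent would call a model and parse structured tool calls instead.

```python
from typing import Callable

# Hypothetical tool registry; the tool name and behaviour are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

def fake_llm(history: list[str]) -> str:
    """Stand-in for a model call: acts once, then answers from the observation."""
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return "Action: add[2+3]"
    return "Final: " + observations[-1].removeprefix("Observation: ")

def react(question: str, llm=fake_llm, max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = llm(history)                      # reason: decide what to do next
        history.append(step)
        if step.startswith("Final: "):
            return step.removeprefix("Final: ")
        # act: parse "Action: name[argument]" and run the named tool
        name, sep, arg = step.removeprefix("Action: ").partition("[")
        history.append(f"Observation: {TOOLS[name](arg.rstrip(']'))}")
    return "no answer within step budget"

answer = react("What is 2 + 3?")
```

The step budget is the only guard against a model that never emits a final answer, so it is worth keeping even in a toy version.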
Plan-and-Execute
Explicit upfront planning
A planner LLM breaks the task into steps before execution begins. An executor then runs each step in sequence, and a synthesizer merges all outputs into a final answer.
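The three roles can be sketched as plain functions. This is an illustration under stated assumptions, not the lesson's code: `plan`, `execute`, and `synthesize` are hypothetical stubs, and the planner here simply splits on the word "then" where a real one would ask an LLM for the step list.

```python
def plan(task: str) -> list[str]:
    """Stub planner: a real one would ask an LLM to produce the steps."""
    return [part.strip() for part in task.split(" then ")]

def execute(step: str) -> str:
    """Stub executor: a real one would call tools or a model for each step."""
    return f"done({step})"

def synthesize(results: list[str]) -> str:
    """Stub synthesizer: merges the per-step outputs into one answer."""
    return "; ".join(results)

def plan_and_execute(task: str) -> str:
    steps = plan(task)                        # the full plan exists before any step runs
    results = [execute(s) for s in steps]     # steps run strictly in sequence
    return synthesize(results)

report = plan_and_execute("fetch data then clean it then plot it")
```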
Prompt Chaining
Linear sequential pipeline
The simplest agentic pattern: the output of one LLM call becomes the input of the next. No loops, no branching — a clean assembly line for structured text transformations.
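As a rough sketch, a chain is just function composition. Each function below is a hypothetical stand-in for one LLM call with its own prompt; the names and the text transformations are assumptions made for the example.

```python
from functools import reduce
from typing import Callable

Step = Callable[[str], str]

def normalize(text: str) -> str:
    """Collapse runs of whitespace (stand-in for a cleanup prompt)."""
    return " ".join(text.split())

def first_sentence(text: str) -> str:
    """Keep only the first sentence (stand-in for a summarization prompt)."""
    return text.split(". ")[0].rstrip(".") + "."

def headline(text: str) -> str:
    """Title-case the text without the final period (stand-in for a styling prompt)."""
    return text.rstrip(".").title()

def run_chain(steps: list[Step], text: str) -> str:
    # The output of each step is fed directly into the next one.
    return reduce(lambda acc, step: step(acc), steps, text)

result = run_chain(
    [normalize, first_sentence, headline],
    "  agents   call tools. They loop until done. ",
)
```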
Multi-Agent Supervisor
Supervisor routes to specialists
A supervisor LLM reads the incoming question and routes it to the most appropriate specialist agent. Each specialist has its own focused toolset and system prompt.
Reflection
Generate → Critique → Revise
The agent generates an answer, a separate critic evaluates it, and the agent revises until the critic is satisfied or the maximum iteration count is reached.
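A small sketch of the generate, critique, revise loop, assuming the critic signals satisfaction by returning `None`. Both `generate` and `critique` are hypothetical stubs (the critic only checks for a final period); in practice each would be a separate LLM call.

```python
from typing import Optional

def generate(task: str, draft: Optional[str], feedback: Optional[str]) -> str:
    """Stub generator: writes a first draft, or revises it using the feedback."""
    if draft is None:
        return task.capitalize()
    return draft + "." if feedback == "add final period" else draft

def critique(draft: str) -> Optional[str]:
    """Stub critic: returns feedback, or None when it is satisfied."""
    return None if draft.endswith(".") else "add final period"

def reflect(task: str, max_iters: int = 3) -> tuple[str, int]:
    draft = generate(task, None, None)
    for revisions in range(max_iters):
        feedback = critique(draft)
        if feedback is None:
            return draft, revisions          # critic satisfied
        draft = generate(task, draft, feedback)
    return draft, max_iters                  # iteration cap reached

final, revisions = reflect("agents revise their drafts")
```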
Subgraph
Reusable hierarchical graphs
A parent graph calls compiled child graphs as ordinary nodes, enabling modular composition and reuse of agent logic across multiple workflows.
Map-Reduce
Parallel fan-out and aggregation
Work is dynamically split into independent chunks (map), processed in parallel via Send primitives, then all results are merged by a reducer node.
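The fan-out and aggregation shape can be sketched with a thread pool. Note the substitution: this uses `concurrent.futures` from the standard library rather than LangGraph's Send, and `map_chunk` is a hypothetical stand-in for a per-chunk LLM call (here it only counts words).

```python
from concurrent.futures import ThreadPoolExecutor

def split(text: str, size: int) -> list[str]:
    """Split the input into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def map_chunk(chunk: str) -> int:
    """Stand-in for a per-chunk LLM call; here it just counts words."""
    return len(chunk.split())

def reduce_counts(counts: list[int]) -> int:
    """Reducer: merge all per-chunk results into one value."""
    return sum(counts)

def map_reduce(text: str, size: int = 3) -> int:
    chunks = split(text, size)                        # chunk count is known only at runtime
    with ThreadPoolExecutor() as pool:
        counts = list(pool.map(map_chunk, chunks))    # fan-out; result order is preserved
    return reduce_counts(counts)

total = map_reduce("one two three four five six seven")
```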
Human-in-the-Loop
Interrupt for human approval
The graph pauses mid-execution and surfaces the agent's proposed action to a human. Execution only continues after the human approves, modifies, or rejects the action.
Corrective RAG
Quality-gated retrieval
Standard RAG augmented with a relevance grader. Documents below the quality threshold trigger a fallback retrieval strategy before generation proceeds.
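A sketch of the quality gate, under two assumptions that are not from the lesson: relevance is a toy word-overlap score, and the fallback fires only when no document passes the threshold. The `vector_store` and `web_search` retrievers are hypothetical stubs returning canned documents.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]

def grade(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that appear in the document."""
    q_words = set(query.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & set(doc.lower().split())) / len(q_words)

def corrective_retrieve(
    query: str, primary: Retriever, fallback: Retriever, threshold: float = 0.5
) -> tuple[list[str], str]:
    kept = [d for d in primary(query) if grade(query, d) >= threshold]
    if kept:
        return kept, "primary"
    return fallback(query), "fallback"   # nothing passed the gate: try the other strategy

def vector_store(query: str) -> list[str]:   # hypothetical primary retriever
    return ["bananas are yellow"]

def web_search(query: str) -> list[str]:     # hypothetical fallback retriever
    return ["langgraph send fans out work"]

docs, source = corrective_retrieve("langgraph send", vector_store, web_search)
```

Generation would then run on `docs` regardless of which branch produced them.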
LLM-as-Judge
Rubric-based evaluation and rewrite
A dedicated judge LLM scores the generator's output on explicit criteria. If the score falls below a threshold, the generator rewrites the answer using the judge's feedback.
Parallelization
Static and dynamic parallel branches
Multiple independent analyst nodes run simultaneously on the same input. Both static (build-time wired) and dynamic (runtime Send) variants are demonstrated.
Debate
Adversarial argumentation with judgment
Two LLM agents argue opposing positions on a question. After multiple rounds of opening statements and rebuttals, a judge evaluates the full transcript and delivers a verdict.
Skeleton of Thought
Parallel structured content generation
Generate a structured outline first, then fill in each section in parallel. Because the sections are written concurrently rather than one after another, this can cut latency substantially for long structured outputs such as reports or documentation.
Self-RAG
Fully autonomous retrieval loop
The LLM controls every decision in the RAG pipeline: whether to retrieve, whether chunks are relevant, and whether its own generated answer is faithful and useful.
Orchestrator-Subagent
Iterative multi-agent orchestration
An orchestrator LLM dynamically decides which subagent to invoke, what input to pass it, and whether to call more agents — building context across multiple rounds until the task is complete.
Tree of Thoughts
Multi-branch search with pruning
The LLM explores multiple reasoning branches simultaneously, scores each branch, and expands only the top-K most promising paths — implementing a BFS tree search over thought space.
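The search itself can be sketched on a toy problem: picking numbers that sum to a target. Everything here is an assumption for illustration; `expand` stands in for the LLM proposing next thoughts and `score` for the LLM rating a branch. Levels are expanded one at a time and only the top-K branches survive each level.

```python
def expand(path: list[int]) -> list[list[int]]:
    """Stand-in for proposing next thoughts: extend the path by 1, 2, or 3."""
    return [path + [n] for n in (1, 2, 3)]

def score(path: list[int], target: int) -> int:
    """Stand-in for an LLM evaluator: closer to the target sum is better."""
    return -abs(target - sum(path))

def tree_of_thoughts(target: int, depth: int, k: int) -> list[int]:
    frontier: list[list[int]] = [[]]
    for _ in range(depth):
        # Breadth-first: expand every surviving branch by one level.
        candidates = [child for path in frontier for child in expand(path)]
        candidates.sort(key=lambda p: score(p, target), reverse=True)
        frontier = candidates[:k]             # prune to the top-K branches
    return frontier[0]

best = tree_of_thoughts(target=7, depth=3, k=2)
```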
Chain of Verification
Systematic claim-by-claim fact-checking
The LLM generates a draft, then systematically extracts verification questions, answers each independently from a ground-truth source, and revises only the claims that were wrong.
Code Generation + Self-Repair
Iterative synthesis with error feedback
The LLM generates Python code, executes it in a subprocess, reads the traceback if it fails, and rewrites the code, looping until the code runs successfully or the retry limit is reached.
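A sketch of the repair loop using the standard `subprocess` module. The `fake_llm` function and its canned `ATTEMPTS` are hypothetical: a real generator would receive the traceback in its prompt and produce new code from it.

```python
import subprocess
import sys
from typing import Optional

# Canned "LLM" outputs for the sketch: a buggy attempt, then a fixed one.
ATTEMPTS = ["print(1 / 0)", "print(1 / 1)"]

def fake_llm(task: str, error: Optional[str], attempt: int) -> str:
    """Stand-in for the generator; a real one would condition on `error`."""
    return ATTEMPTS[min(attempt, len(ATTEMPTS) - 1)]

def run(code: str) -> tuple[bool, str, str]:
    """Execute the code in a fresh interpreter and capture its output."""
    proc = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, timeout=10
    )
    return proc.returncode == 0, proc.stdout.strip(), proc.stderr

def generate_and_repair(task: str, max_retries: int = 3) -> tuple[str, int]:
    error = None
    for attempt in range(max_retries):
        code = fake_llm(task, error, attempt)
        ok, out, error = run(code)            # on failure, `error` holds the traceback
        if ok:
            return out, attempt + 1
    raise RuntimeError(f"still failing after {max_retries} attempts:\n{error}")

output, tries = generate_and_repair("print the result of a division")
```

Running generated code in a separate process keeps a crash from taking down the agent, but it is not a sandbox; untrusted code needs stronger isolation.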
Memory-Augmented Agent
Persistent long-term memory across sessions
The agent loads memories from a JSON store at the start of each session, retrieves only the relevant ones, incorporates them into its response, then consolidates new learnings before exiting.
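The load, retrieve, consolidate cycle can be sketched with a JSON file. The file layout (a flat list of strings) and the word-overlap relevance test are assumptions for this example; the lesson's store may be structured differently.

```python
import json
import tempfile
from pathlib import Path

def load(path: Path) -> list[str]:
    """Load all memories, or start empty on the first session."""
    return json.loads(path.read_text()) if path.exists() else []

def relevant(memories: list[str], query: str) -> list[str]:
    """Toy retrieval: keep memories that share at least one word with the query."""
    words = set(query.lower().split())
    return [m for m in memories if words & set(m.lower().split())]

def consolidate(path: Path, memories: list[str], learned: str) -> None:
    """Persist a new learning, skipping exact duplicates."""
    if learned not in memories:
        memories = memories + [learned]
    path.write_text(json.dumps(memories))

store = Path(tempfile.mkdtemp()) / "memory.json"   # hypothetical store location

# Session 1: nothing stored yet; save one learning before exiting.
consolidate(store, load(store), "user prefers python")

# Session 2: reload from disk and pull only what matches the new query.
recalled = relevant(load(store), "which python version should I use")
```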
GraphRAG
Knowledge graph-based multi-hop retrieval
Instead of retrieving text chunks, the agent builds a knowledge graph of entities and relationships, then traverses it to answer multi-hop questions that span multiple entities.
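A sketch of multi-hop traversal over a tiny hand-written graph. The triples are made-up example data, and breadth-first search is one simple traversal choice assumed here; in the full pattern an LLM would extract the entities and relationships from documents.

```python
from collections import deque
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical (head, relation, tail) triples for the example.
TRIPLES: list[Triple] = [
    ("Alice", "manages", "Bob"),
    ("Bob", "works_on", "Search"),
    ("Carol", "works_on", "Ads"),
]

def build_graph(triples: list[Triple]) -> dict[str, list[tuple[str, str]]]:
    """Index the triples as an adjacency list keyed by head entity."""
    graph: dict[str, list[tuple[str, str]]] = {}
    for head, rel, tail in triples:
        graph.setdefault(head, []).append((rel, tail))
    return graph

def find_path(graph, start: str, goal: str) -> Optional[list[Triple]]:
    """Breadth-first search returning the chain of facts linking start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

# Two-hop question: which project does Alice's report work on?
hops = find_path(build_graph(TRIPLES), "Alice", "Search")
```

The returned chain of facts is what would be handed to the LLM as context for the final answer.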
Mixture of Experts
Scored confidence-based expert routing
A router scores every available expert against the incoming query. The highest-scoring expert wins the task. Each expert has a domain-specific system prompt optimized for its specialty.
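A sketch of scored routing, assuming a keyword-overlap score in place of an LLM-produced confidence. The expert names, keywords, and prompts are hypothetical.

```python
# Hypothetical experts: each has routing keywords and its own system prompt.
EXPERTS = {
    "sql": {"keywords": {"query", "table", "join"}, "prompt": "You are a SQL expert."},
    "python": {"keywords": {"function", "class", "loop"}, "prompt": "You are a Python expert."},
}

def score(query: str, keywords: set[str]) -> float:
    """Toy confidence: fraction of the expert's keywords present in the query."""
    words = set(query.lower().split())
    return len(words & keywords) / len(keywords)

def route(query: str) -> tuple[str, dict[str, float]]:
    # Every expert is scored; the highest score wins the task.
    scores = {name: score(query, spec["keywords"]) for name, spec in EXPERTS.items()}
    winner = max(scores, key=scores.get)
    return winner, scores

winner, scores = route("join the orders table")
system_prompt = EXPERTS[winner]["prompt"]
```

Returning the full score table alongside the winner makes routing decisions easy to log and inspect.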
Speculative Execution
Multi-strategy parallel ranking
Generate N candidate answers in parallel using different strategies (temperature, framing, style), then have a ranker LLM select the best one. This spends extra tokens in exchange for better answer quality.
Custom MCP Server
Build and integrate your own tool servers
Build a domain-specific MCP (Model Context Protocol) server that exposes tools as a standardised service. A LangGraph ReAct agent connects to it via stdio transport and calls its tools exactly like local @tool functions.