A practical guide to building AI agents with LangGraph and LangChain. Each lesson includes an architecture diagram, annotated Python code, key concepts, and research references.
Reason + Act
The foundational agent loop — the LLM alternates between reasoning about what to do and calling a tool until it has enough information to answer.
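A minimal sketch of this loop using the prebuilt ReAct helper; the model name and the stubbed search tool are illustrative assumptions, not the lesson's exact code.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def search(query: str) -> str:
    """Look up a fact (stubbed here for illustration)."""
    return "LangGraph models agents as graphs of nodes and edges."

# The prebuilt helper wires the reason -> act -> observe loop for us.
agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[search])
result = agent.invoke({"messages": [("user", "What is LangGraph?")]})
print(result["messages"][-1].content)
```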
Explicit upfront planning
A planner LLM breaks the task into steps before execution begins. An executor then runs each step in sequence, and a synthesizer merges all outputs into a final answer.
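A hedged sketch of the plan → execute → synthesize flow; the state keys, prompts, and one-step-per-line plan format are assumptions.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")

class State(TypedDict):
    task: str
    steps: list[str]
    results: list[str]
    answer: str

def planner(state: State) -> dict:
    plan = llm.invoke(f"List the steps, one per line, to solve: {state['task']}")
    return {"steps": [s for s in plan.content.splitlines() if s.strip()],
            "results": []}

def executor(state: State) -> dict:
    step = state["steps"][len(state["results"])]  # next unexecuted step
    return {"results": state["results"] + [llm.invoke(f"Do this step: {step}").content]}

def synthesizer(state: State) -> dict:
    return {"answer": llm.invoke(f"Merge into one answer: {state['results']}").content}

def next_node(state: State) -> str:
    return "executor" if len(state["results"]) < len(state["steps"]) else "synthesizer"

g = StateGraph(State)
g.add_node("planner", planner)
g.add_node("executor", executor)
g.add_node("synthesizer", synthesizer)
g.add_edge(START, "planner")
g.add_edge("planner", "executor")
g.add_conditional_edges("executor", next_node, ["executor", "synthesizer"])
g.add_edge("synthesizer", END)
app = g.compile()
```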
Linear sequential pipeline
The simplest agentic pattern: the output of one LLM call becomes the input of the next. No loops, no branching — a clean assembly line for structured text transformations.
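A minimal sketch of the wiring, with stubbed string transformations standing in for LLM calls so it runs as-is.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    text: str

# Each node reads the previous node's output and writes its own.
def summarize(state: State) -> dict:
    return {"text": "summary of " + state["text"]}

def translate(state: State) -> dict:
    return {"text": "translation of " + state["text"]}

g = StateGraph(State)
g.add_node("summarize", summarize)
g.add_node("translate", translate)
g.add_edge(START, "summarize")
g.add_edge("summarize", "translate")
g.add_edge("translate", END)
print(g.compile().invoke({"text": "raw input"}))
```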
Supervisor routes to specialists
A supervisor LLM reads the incoming question and routes it to the most appropriate specialist agent. Each specialist has its own focused toolset and system prompt.
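One way to sketch the routing, assuming a structured-output route decision; the specialist names and prompts are illustrative.

```python
from typing import Literal, TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

class Route(TypedDict):
    target: Literal["math_expert", "code_expert"]

llm = ChatOpenAI(model="gpt-4o-mini")
router = llm.with_structured_output(Route)

class State(TypedDict):
    question: str
    answer: str

def supervisor(state: State) -> dict:
    return {}  # routing happens on the conditional edge below

def route(state: State) -> str:
    return router.invoke(f"Pick the right expert for: {state['question']}")["target"]

def math_expert(state: State) -> dict:
    return {"answer": llm.invoke(f"As a math tutor: {state['question']}").content}

def code_expert(state: State) -> dict:
    return {"answer": llm.invoke(f"As a code reviewer: {state['question']}").content}

g = StateGraph(State)
g.add_node("supervisor", supervisor)
g.add_node("math_expert", math_expert)
g.add_node("code_expert", code_expert)
g.add_edge(START, "supervisor")
g.add_conditional_edges("supervisor", route, ["math_expert", "code_expert"])
g.add_edge("math_expert", END)
g.add_edge("code_expert", END)
app = g.compile()
```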
Generate → Critique → Revise
The agent generates an answer, a separate critic evaluates it, and the agent revises until the critic is satisfied or the maximum iteration count is reached.
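A sketch of the loop; the APPROVED sentinel and the three-round cap are assumptions, not the lesson's exact convention.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")
MAX_ROUNDS = 3

class State(TypedDict):
    task: str
    draft: str
    feedback: str
    rounds: int

def generate(state: State) -> dict:
    prompt = f"Task: {state['task']}\nFeedback so far: {state.get('feedback', '')}"
    return {"draft": llm.invoke(prompt).content,
            "rounds": state.get("rounds", 0) + 1}

def critique(state: State) -> dict:
    verdict = llm.invoke(
        f"Critique this draft; reply APPROVED if it is good:\n{state['draft']}")
    return {"feedback": verdict.content}

def gate(state: State) -> str:
    # Stop when the critic approves or the iteration budget is spent.
    if "APPROVED" in state["feedback"] or state["rounds"] >= MAX_ROUNDS:
        return END
    return "generate"

g = StateGraph(State)
g.add_node("generate", generate)
g.add_node("critique", critique)
g.add_edge(START, "generate")
g.add_edge("generate", "critique")
g.add_conditional_edges("critique", gate, ["generate", END])
app = g.compile()
```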
Reusable hierarchical graphs
A parent graph calls compiled child graphs as ordinary nodes, enabling modular composition and reuse of agent logic across multiple workflows.
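A minimal sketch of the composition: the compiled child graph is passed to `add_node` like any other node, and parent and child communicate through shared state keys.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    text: str

def clean(state: State) -> dict:
    return {"text": state["text"].strip().lower()}

child = StateGraph(State)
child.add_node("clean", clean)
child.add_edge(START, "clean")
child.add_edge("clean", END)
child_app = child.compile()

def shout(state: State) -> dict:
    return {"text": state["text"].upper()}

parent = StateGraph(State)
parent.add_node("preprocess", child_app)  # compiled child graph used as a node
parent.add_node("shout", shout)
parent.add_edge(START, "preprocess")
parent.add_edge("preprocess", "shout")
parent.add_edge("shout", END)
print(parent.compile().invoke({"text": "  Hello  "}))
```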
Parallel fan-out and aggregation
Work is dynamically split into independent chunks (map), processed in parallel via Send primitives, then all results are merged by a reducer node.
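A runnable sketch of the fan-out, with an uppercase transform standing in for the per-chunk LLM call; `operator.add` is the reducer that merges parallel results back into one list.

```python
import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

class State(TypedDict):
    chunks: list[str]
    results: Annotated[list[str], operator.add]  # merged across workers

class WorkerState(TypedDict):
    chunk: str

def fan_out(state: State):
    # One Send per chunk; each payload becomes a worker's input.
    return [Send("worker", {"chunk": c}) for c in state["chunks"]]

def worker(state: WorkerState) -> dict:
    return {"results": [state["chunk"].upper()]}  # stand-in for an LLM call

def reducer(state: State) -> dict:
    return {}  # join point: runs once after all workers finish

g = StateGraph(State)
g.add_node("worker", worker)
g.add_node("reducer", reducer)
g.add_conditional_edges(START, fan_out, ["worker"])
g.add_edge("worker", "reducer")
g.add_edge("reducer", END)
print(g.compile().invoke({"chunks": ["a", "b"], "results": []}))
```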
Interrupt for human approval
The graph pauses mid-execution and surfaces the agent's proposed action to a human. Execution only continues after the human approves, modifies, or rejects the action.
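A minimal sketch with `interrupt()` and `Command(resume=...)`; a checkpointer is required so the paused run can be resumed on the same thread.

```python
from typing import TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt

class State(TypedDict):
    action: str
    approved: bool

def propose(state: State) -> dict:
    # Pauses here; on resume, interrupt() returns the human's decision.
    decision = interrupt({"proposed_action": state["action"]})
    return {"approved": decision == "approve"}

g = StateGraph(State)
g.add_node("propose", propose)
g.add_edge(START, "propose")
g.add_edge("propose", END)
app = g.compile(checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "1"}}
app.invoke({"action": "delete 42 rows"}, config)  # pauses at interrupt()
app.invoke(Command(resume="approve"), config)     # human approves, run resumes
```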
Quality-gated retrieval
Standard RAG augmented with a relevance grader. Documents below the quality threshold trigger a fallback retrieval strategy before generation proceeds.
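A sketch of the grade-then-fallback gate; both retrievers are stand-in stubs, and the yes/no grading prompt is an assumption.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")

def vector_search(q: str) -> list[str]:  # stand-in for a real vector store
    return [f"indexed chunk about {q}"]

def web_search(q: str) -> list[str]:     # stand-in for a web search tool
    return [f"web result about {q}"]

class State(TypedDict):
    question: str
    docs: list[str]
    answer: str

def retrieve(state: State) -> dict:
    return {"docs": vector_search(state["question"])}

def grade(state: State) -> str:
    verdict = llm.invoke(
        f"Do these docs answer '{state['question']}'? Reply yes or no:\n{state['docs']}")
    return "generate" if "yes" in verdict.content.lower() else "fallback"

def fallback(state: State) -> dict:
    return {"docs": web_search(state["question"])}

def generate(state: State) -> dict:
    return {"answer": llm.invoke(
        f"Answer '{state['question']}' using: {state['docs']}").content}

g = StateGraph(State)
g.add_node("retrieve", retrieve)
g.add_node("fallback", fallback)
g.add_node("generate", generate)
g.add_edge(START, "retrieve")
g.add_conditional_edges("retrieve", grade, ["generate", "fallback"])
g.add_edge("fallback", "generate")
g.add_edge("generate", END)
app = g.compile()
```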
Rubric-based evaluation and rewrite
A dedicated judge LLM scores the generator's output on explicit criteria. If the score falls below a threshold, the generator rewrites the answer using the judge's feedback.
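A sketch of the judge loop, assuming structured output for the score; the 1-10 scale and the cutoff of 7 are illustrative choices.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")

class Score(TypedDict):
    score: int
    feedback: str

judge = llm.with_structured_output(Score)

class State(TypedDict):
    task: str
    answer: str
    feedback: str
    score: int

def generate(state: State) -> dict:
    prompt = f"{state['task']}\nRevise using this feedback: {state.get('feedback', '')}"
    return {"answer": llm.invoke(prompt).content}

def evaluate(state: State) -> dict:
    s = judge.invoke(f"Score 1-10 for accuracy and clarity:\n{state['answer']}")
    return {"score": s["score"], "feedback": s["feedback"]}

def gate(state: State) -> str:
    return END if state["score"] >= 7 else "generate"  # rewrite below threshold

g = StateGraph(State)
g.add_node("generate", generate)
g.add_node("evaluate", evaluate)
g.add_edge(START, "generate")
g.add_edge("generate", "evaluate")
g.add_conditional_edges("evaluate", gate, ["generate", END])
app = g.compile()
```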
Static and dynamic parallel branches
Multiple independent analyst nodes run simultaneously on the same input. Both static (build-time wired) and dynamic (runtime Send) variants are demonstrated.
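A runnable sketch of the static variant: three analysts wired at build time all fire in the same step, then the fan-in node runs once with the merged notes. The dynamic variant uses the same `Send` mechanism shown under the map-reduce lesson above.

```python
import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    topic: str
    notes: Annotated[list[str], operator.add]  # reducer merges parallel writes

def make_analyst(angle: str):
    def analyst(state: State) -> dict:
        return {"notes": [f"{angle} view of {state['topic']}"]}  # stand-in for an LLM
    return analyst

g = StateGraph(State)
g.add_node("merge", lambda state: {})  # fan-in point
for angle in ("technical", "financial", "legal"):
    g.add_node(angle, make_analyst(angle))
    g.add_edge(START, angle)  # all three branches start simultaneously
    g.add_edge(angle, "merge")
g.add_edge("merge", END)
print(g.compile().invoke({"topic": "open-source LLMs", "notes": []}))
```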
Adversarial argumentation with judgment
Two LLM agents argue opposing positions on a question. After multiple rounds of opening statements and rebuttals, a judge evaluates the full transcript and delivers a verdict.
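A sketch of the debate loop; the round count, personas, and prompts are illustrative assumptions.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")
ROUNDS = 2

class State(TypedDict):
    question: str
    transcript: str
    round: int
    verdict: str

def debater(side: str):
    def speak(state: State) -> dict:
        turn = llm.invoke(
            f"Argue {side} on '{state['question']}'. Transcript so far:\n"
            f"{state['transcript']}")
        return {"transcript": state["transcript"] + f"\n[{side}] {turn.content}"}
    return speak

def count(state: State) -> dict:
    return {"round": state["round"] + 1}

def next_step(state: State) -> str:
    return "pro" if state["round"] < ROUNDS else "judge"

def judge(state: State) -> dict:
    v = llm.invoke(f"Read the debate and deliver a verdict:\n{state['transcript']}")
    return {"verdict": v.content}

g = StateGraph(State)
g.add_node("pro", debater("FOR"))
g.add_node("con", debater("AGAINST"))
g.add_node("count", count)
g.add_node("judge", judge)
g.add_edge(START, "pro")
g.add_edge("pro", "con")
g.add_edge("con", "count")
g.add_conditional_edges("count", next_step, ["pro", "judge"])
g.add_edge("judge", END)
app = g.compile()
# invoke with {"question": ..., "transcript": "", "round": 0, "verdict": ""}
```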
Parallel structured content generation
Generate a structured outline first, then fill each section in parallel. Dramatically faster than sequential writing for long structured outputs like reports or documentation.
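A sketch of the outline-then-fan-out shape, reusing the `Send` mechanism from the map-reduce lesson; the outline prompt and section format are assumptions.

```python
import operator
from typing import Annotated, TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

llm = ChatOpenAI(model="gpt-4o-mini")

class State(TypedDict):
    topic: str
    sections: list[str]
    drafts: Annotated[list[str], operator.add]

class SectionState(TypedDict):
    topic: str
    section: str

def outline(state: State) -> dict:
    plan = llm.invoke(f"List 4 section titles, one per line, for a report on "
                      f"{state['topic']}")
    return {"sections": [s for s in plan.content.splitlines() if s.strip()]}

def fan_out(state: State):
    # One writer per section, all dispatched in the same step.
    return [Send("write", {"topic": state["topic"], "section": s})
            for s in state["sections"]]

def write(state: SectionState) -> dict:
    text = llm.invoke(f"Write the '{state['section']}' section on {state['topic']}")
    return {"drafts": [text.content]}

def assemble(state: State) -> dict:
    return {}  # join point; drafts are already merged by the reducer

g = StateGraph(State)
g.add_node("outline", outline)
g.add_node("write", write)
g.add_node("assemble", assemble)
g.add_edge(START, "outline")
g.add_conditional_edges("outline", fan_out, ["write"])
g.add_edge("write", "assemble")
g.add_edge("assemble", END)
app = g.compile()
```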
Fully autonomous retrieval loop
The LLM controls every decision in the RAG pipeline: whether to retrieve, whether chunks are relevant, and whether its own generated answer is faithful and useful.
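A condensed sketch of the three self-checks as conditional edges; the yes/no grading prompts and the stubbed retriever are illustrative.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")

def ask(prompt: str) -> bool:
    return "yes" in llm.invoke(prompt + " Answer yes or no.").content.lower()

def search_index(q: str) -> list[str]:  # stand-in for a vector store
    return [f"chunk about {q}"]

class State(TypedDict):
    question: str
    docs: list[str]
    answer: str

def decide_retrieve(state: State) -> str:
    need = ask(f"Does answering '{state['question']}' need external documents?")
    return "retrieve" if need else "generate"

def retrieve(state: State) -> dict:
    # Keep only chunks the LLM itself judges relevant.
    return {"docs": [d for d in search_index(state["question"])
                     if ask(f"Is '{d}' relevant to '{state['question']}'?")]}

def generate(state: State) -> dict:
    return {"answer": llm.invoke(
        f"Answer '{state['question']}' using: {state.get('docs', [])}").content}

def check_answer(state: State) -> str:
    ok = ask(f"Is this answer faithful and useful?\n{state['answer']}")
    return END if ok else "generate"

g = StateGraph(State)
g.add_node("retrieve", retrieve)
g.add_node("generate", generate)
g.add_conditional_edges(START, decide_retrieve, ["retrieve", "generate"])
g.add_edge("retrieve", "generate")
g.add_conditional_edges("generate", check_answer, ["generate", END])
app = g.compile()
```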
Iterative multi-agent orchestration
An orchestrator LLM dynamically decides which subagent to invoke, what input to pass it, and whether to call more agents — building context across multiple rounds until the task is complete.
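A sketch of the orchestrator loop, assuming structured output for the dispatch decision; the subagent names and the "done" sentinel are assumptions.

```python
from typing import Literal, TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")

class Decision(TypedDict):
    next_agent: Literal["researcher", "writer", "done"]
    subtask: str

decide = llm.with_structured_output(Decision)

class State(TypedDict):
    task: str
    context: str
    next_agent: str
    subtask: str

def orchestrator(state: State) -> dict:
    d = decide.invoke(
        f"Task: {state['task']}\nContext so far:\n{state['context']}\n"
        "Pick the next agent and its subtask, or 'done' if finished.")
    return {"next_agent": d["next_agent"], "subtask": d["subtask"]}

def route(state: State) -> str:
    return END if state["next_agent"] == "done" else state["next_agent"]

def subagent(role: str):
    def run(state: State) -> dict:
        out = llm.invoke(f"You are the {role}. Subtask: {state['subtask']}")
        return {"context": state["context"] + f"\n[{role}] {out.content}"}
    return run

g = StateGraph(State)
g.add_node("orchestrator", orchestrator)
g.add_node("researcher", subagent("researcher"))
g.add_node("writer", subagent("writer"))
g.add_edge(START, "orchestrator")
g.add_conditional_edges("orchestrator", route, ["researcher", "writer", END])
g.add_edge("researcher", "orchestrator")  # every subagent reports back
g.add_edge("writer", "orchestrator")
app = g.compile()
```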
Multi-branch search with pruning
The LLM explores multiple reasoning branches simultaneously, scores each branch, and expands only the top-K most promising paths — implementing a BFS tree search over thought space.
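The expand-score-prune loop is compact enough to sketch in plain Python; the beam width, depth, branching factor, and scoring prompt are all illustrative assumptions.

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
K, DEPTH, BRANCH = 3, 2, 3  # beam width, search depth, branches per node

def expand(thought: str) -> list[str]:
    out = llm.invoke(f"Give {BRANCH} distinct next reasoning steps, one per line, "
                     f"after:\n{thought}")
    return [f"{thought}\n{line}" for line in out.content.splitlines() if line.strip()]

def score(thought: str) -> int:
    out = llm.invoke(f"Rate 0-10 how promising this reasoning path is. "
                     f"Reply with a number:\n{thought}")
    digits = [int(t) for t in out.content.split() if t.isdigit()]
    return digits[0] if digits else 0

def tree_search(question: str) -> str:
    frontier = [question]
    for _ in range(DEPTH):
        candidates = [c for t in frontier for c in expand(t)]
        frontier = sorted(candidates, key=score, reverse=True)[:K]  # prune to top-K
    return frontier[0]
```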
Systematic claim-by-claim fact-checking
The LLM generates a draft, then systematically extracts verification questions, answers each independently from a ground-truth source, and revises only the claims that were wrong.
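A sketch of the draft → verify → revise pass; the ground-truth lookup is a stubbed placeholder for whatever trusted source the pipeline checks against.

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

def lookup_ground_truth(q: str) -> str:  # stand-in for a trusted source
    return f"evidence for: {q}"

def verified_answer(question: str) -> str:
    draft = llm.invoke(f"Answer: {question}").content
    qs = llm.invoke(
        f"List verification questions, one per line, for the factual claims in:\n{draft}")
    findings = []
    for q in qs.content.splitlines():
        if not q.strip():
            continue
        # Each question is answered independently against the evidence.
        check = llm.invoke(
            f"Question: {q}\nEvidence: {lookup_ground_truth(q)}\n"
            "Does the evidence contradict the draft claim? If so, state the fix.")
        findings.append(check.content)
    return llm.invoke(
        f"Revise the draft, changing only the claims flagged below.\n"
        f"Draft:\n{draft}\nFindings:\n{findings}").content
```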
Iterative synthesis with error feedback
The LLM generates Python code, executes it in a subprocess, reads the traceback if it fails, and rewrites the code — looping until success or max retries are exhausted.
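A sketch of the write → run → read-traceback loop; the retry cap and prompts are assumptions, and a real lesson would also strip markdown fences from the model's output.

```python
import subprocess
import sys

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

def write_working_code(task: str, max_retries: int = 3) -> str:
    error = ""
    for _ in range(max_retries):
        code = llm.invoke(
            f"Write a Python script for: {task}\n"
            f"Previous error, if any:\n{error}\nReturn only code.").content
        # Run the candidate in an isolated interpreter process.
        proc = subprocess.run([sys.executable, "-c", code],
                              capture_output=True, text=True, timeout=30)
        if proc.returncode == 0:
            return code
        error = proc.stderr  # feed the traceback into the next attempt
    raise RuntimeError(f"Still failing after retries:\n{error}")
```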
Persistent long-term memory across sessions
The agent loads memories from a JSON store at the start of each session, retrieves only the relevant ones, incorporates them into its response, then consolidates new learnings before exiting.
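A sketch of the load → retrieve → respond → consolidate cycle around a JSON file; the naive keyword-overlap relevance filter is an illustrative stand-in for real retrieval.

```python
import json
from pathlib import Path

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
STORE = Path("memories.json")

def load_memories() -> list[str]:
    return json.loads(STORE.read_text()) if STORE.exists() else []

def relevant(memories: list[str], query: str) -> list[str]:
    words = set(query.lower().split())
    return [m for m in memories if words & set(m.lower().split())]

def chat(user_input: str) -> str:
    # Retrieve only the memories that plausibly relate to this turn.
    memories = relevant(load_memories(), user_input)
    reply = llm.invoke(
        f"Known about this user: {memories}\nUser: {user_input}").content
    # Consolidate: distill a durable fact and persist it for next session.
    new = llm.invoke(
        f"Extract one durable fact worth remembering from: {user_input}").content
    STORE.write_text(json.dumps(load_memories() + [new]))
    return reply
```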
Knowledge graph-based multi-hop retrieval
Instead of retrieving text chunks, the agent builds a knowledge graph of entities and relationships, then traverses it to answer multi-hop questions that span multiple entities.
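A sketch of the two halves: extract (subject, relation, object) triples into an adjacency map, then walk outward from an entity to gather multi-hop facts. The extraction schema and prompt are assumptions.

```python
from collections import defaultdict

from langchain_openai import ChatOpenAI
from pydantic import BaseModel

llm = ChatOpenAI(model="gpt-4o-mini")

class Triple(BaseModel):
    subject: str
    relation: str
    object: str

class Triples(BaseModel):
    triples: list[Triple]

extract = llm.with_structured_output(Triples)
graph: dict[str, list[tuple[str, str]]] = defaultdict(list)

def ingest(text: str) -> None:
    for t in extract.invoke(f"Extract entity triples from:\n{text}").triples:
        graph[t.subject].append((t.relation, t.object))

def neighborhood(entity: str, hops: int = 2) -> list[str]:
    # Breadth-first walk: each hop follows edges from the last frontier.
    facts, frontier = [], [entity]
    for _ in range(hops):
        nxt = []
        for node in frontier:
            for rel, obj in graph.get(node, []):
                facts.append(f"{node} {rel} {obj}")
                nxt.append(obj)
        frontier = nxt
    return facts  # passed to the LLM as context for the multi-hop question
```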
Scored confidence-based expert routing
A router scores every available expert against the incoming query. The highest-scoring expert wins the task. Each expert has a domain-specific system prompt optimized for its specialty.
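A sketch of confidence-scored routing: the router returns one score per expert and the argmax wins. The expert names, prompts, and 0-1 scale are illustrative.

```python
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

llm = ChatOpenAI(model="gpt-4o-mini")

class Scores(BaseModel):
    legal: float
    medical: float
    finance: float

router = llm.with_structured_output(Scores)

EXPERT_PROMPTS = {
    "legal": "You are a careful legal analyst.",
    "medical": "You are a board-certified physician.",
    "finance": "You are a CFA-level financial analyst.",
}

def answer(query: str) -> str:
    scores = router.invoke(
        f"Score 0-1 how well each expert fits this query: {query}")
    d = scores.model_dump()
    best = max(d, key=d.get)  # highest-scoring expert wins the task
    return llm.invoke(f"{EXPERT_PROMPTS[best]}\n\n{query}").content
```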
Multi-strategy parallel ranking
Generate N candidate answers in parallel using different strategies (temperature, framing, style), then a ranker LLM selects the best one. Trades token cost for answer quality.
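A sketch of the candidates-plus-ranker shape; the three persona/temperature strategies and the index-only ranking prompt are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

from langchain_openai import ChatOpenAI

STRATEGIES = [
    ("concise expert", 0.2),
    ("step-by-step teacher", 0.7),
    ("creative contrarian", 1.0),
]

def candidate(question: str, persona: str, temp: float) -> str:
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=temp)
    return llm.invoke(f"As a {persona}, answer: {question}").content

def best_of_n(question: str) -> str:
    # Fan out the N strategies concurrently; each is an independent API call.
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(lambda s: candidate(question, *s), STRATEGIES))
    ranker = ChatOpenAI(model="gpt-4o-mini")
    numbered = "\n\n".join(f"[{i}] {a}" for i, a in enumerate(answers))
    pick = ranker.invoke(
        f"Question: {question}\nCandidates:\n{numbered}\n"
        "Reply with only the index of the best answer.")
    return answers[int(pick.content.strip().strip("[]"))]
```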
Build and integrate your own tool servers
Build a domain-specific MCP (Model Context Protocol) server that exposes tools as a standardised service. A LangGraph ReAct agent connects to it via stdio transport and calls its tools exactly like local @tool functions.
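A compact sketch of both halves, assuming the official `mcp` Python SDK and the `langchain-mcp-adapters` package; the adapter client API has changed across versions, so treat this as indicative rather than exact.

```python
# --- server.py: a minimal MCP server with one stubbed tool ---
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
def get_forecast(city: str) -> str:
    """Return a (stubbed) forecast for a city."""
    return f"Sunny in {city}"

if __name__ == "__main__":
    mcp.run(transport="stdio")

# --- client.py: a ReAct agent calling the server's tools over stdio ---
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

async def main():
    client = MultiServerMCPClient({
        "weather": {"command": "python", "args": ["server.py"],
                    "transport": "stdio"},
    })
    tools = await client.get_tools()  # MCP tools adapted to LangChain tools
    agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools)
    result = await agent.ainvoke(
        {"messages": [("user", "Forecast for Paris?")]})
    print(result["messages"][-1].content)

asyncio.run(main())
```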