Prompt Engineering
From zero-shot instructions to production evaluation suites. Each lesson covers a core technique with a flow diagram and a runnable Python script using real API calls.
Foundations
2 lessonsZero-Shot Prompting
Instruction design without examples
Zero-shot prompting relies on a well-crafted instruction alone — no examples, no demonstrations. The quality of the output depends entirely on how clearly you specify the task, constraints, output format, and edge-case behaviour.
Few-Shot Prompting
Learning from in-context examples
Few-shot prompting provides examples of input-output pairs before the actual query. The model infers the task pattern from the demonstrations rather than from an explicit instruction. Example selection, ordering, and format have a measurable impact on accuracy.
Reasoning
2 lessonsChain-of-Thought Reasoning
Step-by-step reasoning traces
Chain-of-thought (CoT) prompting asks the model to show its reasoning before producing a final answer. This dramatically improves accuracy on multi-step tasks — arithmetic, logic, planning, and causal reasoning — by forcing the model to decompose the problem rather than pattern-match to a superficial answer.
Self-Consistency & Verification
Multiple reasoning paths and majority vote
Self-consistency samples multiple chain-of-thought reasoning paths for the same question and selects the most common final answer by majority vote. This reduces variance from any single reasoning chain and catches errors that a lone CoT path might produce.
Format Control
2 lessonsStructured Output & Schema Enforcement
JSON, XML, and constrained generation
Production systems need predictable, parseable output — not free-form text. Structured output techniques range from simple prompt instructions ("respond in JSON") to API-level schema enforcement with guaranteed valid output. Choosing the right constraint level depends on your reliability requirements.
System Prompt Architecture
Persona, constraints, and multi-section design
System prompts define who the model is and how it behaves across all user interactions. A well-structured system prompt has distinct sections — identity, capabilities, constraints, output format, and examples — that compose cleanly and can be versioned independently.
Composition
2 lessonsPrompt Decomposition
Breaking complex tasks into chained subtasks
Complex tasks that exceed a single prompt's reliable capability should be decomposed into a sequence of simpler prompts. Each step produces an intermediate output that feeds the next. Decomposition trades latency for reliability — each subtask is easier for the model and easier to debug when something goes wrong.
Tool Use & Function Calling
Defining tools, schemas, and multi-turn tool loops
Tool use lets the model call external functions — search APIs, databases, calculators, code interpreters — and incorporate the results into its response. Effective tool definitions require clear names, precise parameter schemas, and unambiguous descriptions that tell the model when and how to use each tool.
Production
2 lessonsAdversarial Robustness & Guardrails
Prompt injection, input validation, and output filtering
Any user-facing LLM application is a target for prompt injection — attempts to override the system prompt via user input. Defence requires layered security: input validation, instruction hierarchy, output filtering, and monitoring. No single technique is sufficient; defence in depth is the only viable strategy.
Evaluation & Systematic Iteration
Measuring prompt quality and preventing regressions
Prompt engineering without evaluation is guesswork. A rigorous eval process uses a curated test suite of inputs with expected outputs, scores each prompt version against quality metrics, and prevents regressions when prompts are updated. LLM-as-judge scales evaluation beyond what manual review can handle.