Learning Hub/Prompt Engineering

10 Lessons · Python · Anthropic & OpenAI APIs

Prompt Engineering

From zero-shot instructions to production evaluation suites. Each lesson covers a core technique with a flow diagram and a runnable Python script using real API calls.

10Lessons

5Modules

10Python Scripts

Foundations

2 lessons

01Foundations

Zero-Shot Prompting

Instruction design without examples

Zero-shot prompting relies on a well-crafted instruction alone — no examples, no demonstrations. The quality of the output depends entirely on how clearly you specify the task, constraints, output format, and edge-case behaviour.

02Foundations

Few-Shot Prompting

Learning from in-context examples

Few-shot prompting provides examples of input-output pairs before the actual query. The model infers the task pattern from the demonstrations rather than from an explicit instruction. Example selection, ordering, and format have a measurable impact on accuracy.

Reasoning

2 lessons

03Reasoning

Chain-of-Thought Reasoning

Step-by-step reasoning traces

Chain-of-thought (CoT) prompting asks the model to show its reasoning before producing a final answer. This dramatically improves accuracy on multi-step tasks — arithmetic, logic, planning, and causal reasoning — by forcing the model to decompose the problem rather than pattern-match to a superficial answer.

04Reasoning

Self-Consistency & Verification

Multiple reasoning paths and majority vote

Self-consistency samples multiple chain-of-thought reasoning paths for the same question and selects the most common final answer by majority vote. This reduces variance from any single reasoning chain and catches errors that a lone CoT path might produce.

Format Control

2 lessons

05Format Control

Structured Output & Schema Enforcement

JSON, XML, and constrained generation

Production systems need predictable, parseable output — not free-form text. Structured output techniques range from simple prompt instructions ("respond in JSON") to API-level schema enforcement with guaranteed valid output. Choosing the right constraint level depends on your reliability requirements.

06Format Control

System Prompt Architecture

Persona, constraints, and multi-section design

System prompts define who the model is and how it behaves across all user interactions. A well-structured system prompt has distinct sections — identity, capabilities, constraints, output format, and examples — that compose cleanly and can be versioned independently.

Composition

2 lessons

07Composition

Prompt Decomposition

Breaking complex tasks into chained subtasks

Complex tasks that exceed a single prompt's reliable capability should be decomposed into a sequence of simpler prompts. Each step produces an intermediate output that feeds the next. Decomposition trades latency for reliability — each subtask is easier for the model and easier to debug when something goes wrong.

08Composition

Tool Use & Function Calling

Defining tools, schemas, and multi-turn tool loops

Tool use lets the model call external functions — search APIs, databases, calculators, code interpreters — and incorporate the results into its response. Effective tool definitions require clear names, precise parameter schemas, and unambiguous descriptions that tell the model when and how to use each tool.

Production

2 lessons

09Production

Adversarial Robustness & Guardrails

Prompt injection, input validation, and output filtering

Any user-facing LLM application is a target for prompt injection — attempts to override the system prompt via user input. Defence requires layered security: input validation, instruction hierarchy, output filtering, and monitoring. No single technique is sufficient; defence in depth is the only viable strategy.

10Production

Evaluation & Systematic Iteration

Measuring prompt quality and preventing regressions

Prompt engineering without evaluation is guesswork. A rigorous eval process uses a curated test suite of inputs with expected outputs, scores each prompt version against quality metrics, and prevents regressions when prompts are updated. LLM-as-judge scales evaluation beyond what manual review can handle.