Structured Output & Schema Enforcement

JSON, XML, and constrained generation

Production systems need predictable, parseable output — not free-form text. Structured output techniques range from simple prompt instructions ("respond in JSON") to API-level schema enforcement with guaranteed valid output. Choosing the right constraint level depends on your reliability requirements.

Language models generate free-form text by default. When you need to feed the output into downstream code — an API response, a database write, a UI component — you need structured output that is guaranteed to parse. There are three levels of enforcement. The weakest is prompt-level: you describe the desired format in the instruction ("respond with a JSON object with keys name, age, and city"). This works most of the time but can fail with extra text, missing keys, or malformed syntax, especially with smaller models.
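The failure modes of prompt-level formatting are easy to reproduce with a plain parser. The sketch below assumes the name/age/city example from the paragraph above; the reply strings are hypothetical stand-ins for model output, not captured responses, and `parse_prompt_level` is an illustrative helper rather than part of any library.

```python
import json

REQUIRED_KEYS = {"name", "age", "city"}  # keys from the example instruction above


def parse_prompt_level(raw: str) -> dict:
    """Parse a reply that was only asked to be JSON; raise ValueError if unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Hypothetical replies, one per failure mode named in the text.
replies = {
    "clean": '{"name": "Ada", "age": 36, "city": "London"}',
    "extra text": 'Sure! Here is the JSON: {"name": "Ada", "age": 36, "city": "London"}',
    "missing key": '{"name": "Ada", "age": 36}',
    "malformed": '{"name": "Ada", "age": 36, "city": "London",}',
}


def classify(raw: str) -> str:
    """Return 'ok' if the reply parses and has every key, else 'rejected'."""
    try:
        parse_prompt_level(raw)
        return "ok"
    except ValueError:
        return "rejected"


for label, raw in replies.items():
    print(label, "->", classify(raw))
```

Only the clean reply survives. The other three are the kind of output that downstream code has to be prepared for when the format is requested but not enforced.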

The middle tier uses format markers and extraction. You instruct the model to wrap structured content in delimiters (```json ... ``` or <output>...</output>), then parse only the delimited block. This is more robust because you can extract the structured portion even if the model adds surrounding commentary. XML-style tags tend to work particularly well with Claude: Anthropic's prompting guidance recommends them, and tags naturally express nested structures with named fields. The delimited content can still be malformed, so the extracted block needs the same parsing checks as any other model output.
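A minimal sketch of the extraction step, using only the standard library. The function names and the sample replies are illustrative assumptions; a real pipeline would also decide what to do when several blocks or none are found.

```python
import json
import re

# Built programmatically so the marker does not collide with this example's own fence.
FENCE = "`" * 3


def extract_tagged(text: str, tag: str = "output"):
    """Return the content of the first <tag>...</tag> block, or None if absent."""
    name = re.escape(tag)
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    return match.group(1).strip() if match else None


def extract_json_fence(text: str):
    """Return the body of the first json-labelled fenced block, or None if absent."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical replies with commentary around the structured part.
tagged_reply = (
    "Here you go.\n"
    '<output>{"name": "Ada", "age": 36, "city": "London"}</output>\n'
    "Let me know if you need anything else."
)
fenced_reply = "Sure:\n" + FENCE + 'json\n{"name": "Ada"}\n' + FENCE + "\nDone."

person = json.loads(extract_tagged(tagged_reply))
partial = json.loads(extract_json_fence(fenced_reply))
print(person["city"], partial["name"])
```

Returning `None` when no block is found keeps the "model ignored the format" case separate from the "block found but malformed" case, which surfaces later as a `json.JSONDecodeError`.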

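The strongest tier is API-level enforcement. OpenAI's structured output mode accepts a JSON Schema and, in strict mode, constrains generation so that a completed response conforms to it; refusals and responses cut off by the token limit are the cases that still need handling. Anthropic's tool-use feature returns structured tool inputs generated against a schema you define, although conformance may not be strictly guaranteed depending on the options enabled, so the result is worth checking. Instructor wraps these APIs with Pydantic model validation, automatic retries on schema violations, and type-safe output objects. Outlines takes a different route: it constrains token sampling to a schema, regular expression, or grammar, typically on models you run yourself. For any production system, use API-level enforcement: the marginal complexity is far less than the cost of handling parse failures at scale.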
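To make the API-level tier concrete, the sketch below builds the request fragments that carry a schema, without making any network call. The field names (`response_format`, `json_schema`, `strict`, `input_schema`) follow the providers' request shapes as understood at the time of writing and should be checked against current documentation; the helper functions and the `record_person` tool name are hypothetical.

```python
import json

# JSON Schema for the name/age/city example. Listing every property as required
# and disallowing extra properties is what strict modes generally expect.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "city": {"type": "string"},
    },
    "required": ["name", "age", "city"],
    "additionalProperties": False,
}


def openai_response_format(name: str, schema: dict) -> dict:
    """Assumed shape of the response_format argument for OpenAI structured output."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def anthropic_tool(name: str, description: str, schema: dict) -> dict:
    """Assumed shape of one entry in the tools list for Anthropic tool use."""
    return {"name": name, "description": description, "input_schema": schema}


response_format = openai_response_format("person", PERSON_SCHEMA)
tool = anthropic_tool("record_person", "Record a person's details.", PERSON_SCHEMA)

# Both fragments are plain JSON-serializable dictionaries.
print(json.dumps(response_format, indent=2))
print(json.dumps(tool, indent=2))
```

Defining the schema once and passing the same dictionary to either provider keeps the contract in one place, so application code validates against a single definition whichever API produced the output.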

Key Concepts

  • Three enforcement levels: prompt-level instructions, format markers + extraction, API-level schemas
  • Prompt-level ("respond in JSON") works for demos but fails unpredictably at scale
  • XML tags work reliably with Claude; JSON Schema works reliably with OpenAI structured output
  • Pydantic + Instructor provide type-safe output objects with automatic retry; Outlines constrains decoding to a schema or grammar
  • Always validate and retry: even API-level enforcement benefits from a validation layer, because schema-valid output can still carry implausible or wrong values
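The validate-and-retry point can be sketched as a small loop that feeds the validation error back into the next attempt. Everything here is illustrative: `call_model` stands in for whatever client function sends a prompt and returns text, the fake model replays canned replies, and the age range is an arbitrary example of a check that a type-only schema would not catch.

```python
import json
from typing import Callable


def validate_person(raw: str) -> dict:
    """Check syntax, types, and a plausibility rule; raise ValueError on failure."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    age = data.get("age")
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        raise ValueError("age must be an integer between 0 and 150")
    return data


def generate_validated(call_model: Callable[[str], str], prompt: str,
                       max_attempts: int = 3) -> dict:
    """Call the model until its reply validates, appending the last error as feedback."""
    feedback = ""
    for _ in range(max_attempts):
        raw = call_model(prompt + feedback)
        try:
            return validate_person(raw)
        except ValueError as exc:
            feedback = f"\n\nYour previous reply was rejected: {exc}. Reply with corrected JSON only."
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def make_fake_model(canned_replies):
    """Hypothetical stand-in for a real client: replays replies and records prompts."""
    remaining = iter(canned_replies)
    prompts = []

    def call(prompt: str) -> str:
        prompts.append(prompt)
        return next(remaining)

    return call, prompts


# The first reply is well-typed but implausible, so only the second is accepted.
fake_model, seen_prompts = make_fake_model(
    ['{"name": "Ada", "age": 500}', '{"name": "Ada", "age": 36}']
)
result = generate_validated(fake_model, "Return name and age as JSON.")
print(result, len(seen_prompts))
```

Capping the number of attempts and raising afterwards keeps a persistently failing prompt from looping forever, and leaves the caller to decide whether to fall back or surface the error.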