Structured Output

Structured output is getting typed, machine-consumable data from an LLM — the model's response constrained to match a schema you define, instead of prose your code has to parse and pray over.

It's the feature that turns models into software components. Extraction, classification, routing, agent decisions — all of it wants {"category": "billing", "priority": 2}, not three paragraphs containing that information somewhere. Providers offer escalating guarantees: prompt-and-hope, JSON mode (valid JSON, arbitrary shape), and schema-constrained generation (decoding restricted so output must match your schema) — with function calling as the closely related mechanism where the "output" is a tool invocation.

The engineering around it comes down to three things: design schemas the model can fill well (described fields, enums over free strings; the llm-output-schema-generator skill infers one from an example), validate semantics even when syntax is guaranteed, and wrap the call in a validate-and-retry loop, the pattern that libraries like Instructor and BAML productize. Which guarantee to use when, per provider, is covered in the Structured Output vs JSON Mode vs Function Calling decision guide.
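As a rough illustration of "described fields, enums over free strings", here is a minimal sketch of a ticket-triage schema written as plain JSON Schema in a Python dict. The field names, enum values, and descriptions are assumptions made up for this example; how a schema like this is passed to a model differs by provider and is not shown.

```python
import json

# Hypothetical ticket-triage schema, expressed as plain JSON Schema.
# Every field carries a description, and both fields use enums so the
# model picks from a closed set instead of inventing free-form strings.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "technical", "account", "other"],
            "description": "Which team should handle the ticket.",
        },
        "priority": {
            "type": "integer",
            "enum": [1, 2, 3],
            "description": "1 = urgent, 2 = normal, 3 = low.",
        },
    },
    "required": ["category", "priority"],
    "additionalProperties": False,
}

# Serialize once; this string is what would be sent along with a request.
schema_json = json.dumps(TICKET_SCHEMA, indent=2)
print(schema_json)
```

The closed enums and `additionalProperties: False` are the design choice worth noting: they shrink the space of outputs that are well-formed but useless to the consuming code.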

Frequently asked questions

What's the difference between JSON mode and structured outputs?

JSON mode guarantees syntactically valid JSON — but any JSON: fields can be missing, renamed, or mistyped. Structured outputs (schema-constrained generation) guarantee conformance to your specific schema, enforced during decoding. If code consumes the result, schema enforcement is the one you want.
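The gap between the two guarantees can be sketched with the standard library alone. In the example below, both replies parse as JSON, but only one has the shape the consuming code expects. The `EXPECTED` mapping, the `shape_errors` helper, and the two sample replies are hypothetical, and the hand-rolled check stands in for a real schema validator.

```python
import json

# Hypothetical target shape: field name -> required Python type.
EXPECTED = {"category": str, "priority": int}


def shape_errors(raw: str) -> list[str]:
    """Return a list of shape problems; an empty list means the payload matches."""
    data = json.loads(raw)  # JSON mode only guarantees that this step succeeds
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    errors = []
    for key, typ in EXPECTED.items():
        if key not in data:
            errors.append(f"missing field: {key}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif not isinstance(data[key], typ) or isinstance(data[key], bool):
            errors.append(f"wrong type for {key}: {type(data[key]).__name__}")
    for key in data:
        if key not in EXPECTED:
            errors.append(f"unexpected field: {key}")
    return errors


# Valid JSON, wrong shape: a renamed field and a mistyped value.
json_mode_reply = '{"type": "billing", "priority": "2"}'
# Valid JSON, right shape: what schema-constrained generation aims to guarantee.
schema_reply = '{"category": "billing", "priority": 2}'

print(shape_errors(json_mode_reply))
print(shape_errors(schema_reply))
```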

Do I still need validation with structured outputs?

Yes — schema conformance isn't semantic correctness. The shape can be right while the values are wrong (a plausible-but-invented ID, a date outside your range). Validate semantics in code, and keep a retry path that feeds validation errors back to the model; libraries like Instructor package that loop.
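A minimal sketch of that validate-and-retry loop follows, assuming the model call is a plain function from prompt text to a JSON string. The names `call_model`, `fake_model`, `KNOWN_CUSTOMER_IDS`, and the ticket fields are all hypothetical; this is not how Instructor or any particular library implements it, only the general pattern.

```python
import json
from typing import Callable

# Hypothetical lookup table standing in for a real database check.
KNOWN_CUSTOMER_IDS = {"C-1001", "C-1002"}


def semantic_errors(data: dict) -> list[str]:
    """Check values, not shape: the payload is assumed to already match the schema."""
    errors = []
    if data.get("customer_id") not in KNOWN_CUSTOMER_IDS:
        errors.append(f"unknown customer_id: {data.get('customer_id')!r}")
    priority = data.get("priority")
    if not isinstance(priority, int) or not 1 <= priority <= 3:
        errors.append("priority must be an integer between 1 and 3")
    return errors


def extract_with_retry(
    prompt: str, call_model: Callable[[str], str], max_attempts: int = 3
) -> dict:
    """Call the model, validate the result, and feed any errors back on retry."""
    message = prompt
    errors: list[str] = []
    for _ in range(max_attempts):
        data = json.loads(call_model(message))
        errors = semantic_errors(data)
        if not errors:
            return data
        # Append the validation errors so the next attempt can correct them.
        message = f"{prompt}\n\nYour previous answer was rejected:\n- " + "\n- ".join(errors)
    raise ValueError(f"no valid output after {max_attempts} attempts: {errors}")


def fake_model(message: str) -> str:
    """Stub model: invents an ID first, then corrects it once it sees feedback."""
    if "rejected" in message:
        return '{"customer_id": "C-1001", "priority": 2}'
    return '{"customer_id": "C-9999", "priority": 2}'


result = extract_with_retry("Extract the ticket fields.", fake_model)
print(result)
```

Capping the attempts and raising on exhaustion keeps a persistently wrong model from looping forever; the caller decides whether to fall back, escalate, or fail.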

Related