Structured Output vs JSON Mode vs Function Calling: Which to Use in 2026
The reliable ways to get typed data out of an LLM — what JSON mode, function calling, and native structured outputs each guarantee, and when to use which.
For data (not prose) from an LLM, don't prompt-and-parse. Use native structured outputs (a JSON Schema the model is constrained to follow) for data you consume, and function/tool calling when the model should invoke an action. JSON mode only guarantees valid JSON syntax, not your shape. Libraries like Instructor, BAML, and the AI SDK wrap these with validation and auto-retry.
Key takeaways
- Prompt-and-parse ('return JSON') is the unreliable baseline — it breaks on the edge cases that matter.
- JSON mode guarantees valid JSON syntax, not that it matches your schema.
- Native structured outputs constrain the model to a JSON Schema — valid AND conformant; the default for data.
- Function/tool calling is for when the model should DO something (call a tool), not just return data.
- Libraries (Instructor, BAML, Vercel AI SDK) add validation + auto-retry on top of these mechanisms.
When you need data from an LLM — extracted fields, a classification, a filled form — prose is the enemy. You want a typed object your code can rely on. In 2026 there are several mechanisms for that, with genuinely different guarantees, and choosing the wrong one is why so many LLM features break in production on inputs nobody tested.
The four approaches, weakest to strongest
1. Prompt-and-parse (avoid)
You add "respond with JSON" to the prompt and JSON.parse the result. It works in the demo and fails in production: the model occasionally wraps the JSON in prose, adds a comment, mistypes a field, or omits one — usually on the edge case that matters. There's no structural guarantee and no validation. This is the baseline everything else exists to replace.
2. JSON mode
The provider guarantees the output is syntactically valid JSON. That removes the "wrapped in prose / trailing comma" class of bugs — but it does not guarantee your shape. Fields can be missing, mistyped, or extra. JSON mode is a real improvement over prompt-and-parse and a weaker, older guarantee than what comes next.
3. Native structured outputs (the default for data)
The model is constrained to a JSON Schema you provide (via constrained decoding), so the output is valid JSON and conforms to your schema — the right fields, types, and enums. This is the strongest native guarantee and, in 2026, the default mechanism when you want typed data. You define the schema; the provider enforces it.
4. Function / tool calling
Function (tool) calling has the model return a call — a function name plus arguments matching a schema. It was the original structured mechanism and is still the right tool when the model should do something (invoke a tool, take an action) rather than just hand back data. You can coerce it into pure extraction, but for "just give me the object," native structured outputs are more direct. See Production Tool & Function Calling for the action case.
Where the libraries fit
Instructor, BAML, and the Vercel AI SDK sit on top of these provider mechanisms and add the ergonomics you'd otherwise hand-roll:
- Schema from your types — define the shape as Pydantic/Zod/a DSL instead of raw JSON Schema.
- Validation + auto-retry — if output doesn't validate, re-ask with the errors, so you get conforming data or a clean failure.
- Streaming partials and provider-agnostic calls.
Design the schema itself with the llm-output-schema-generator skill.
How to choose
- You want typed data your code consumes → native structured outputs, ideally via a library (Instructor/BAML/AI SDK) for validation + retry.
- You want the model to take an action → function/tool calling.
- You only need valid JSON, not a strict shape → JSON mode (rare; usually you want structured outputs).
- Never → prompt-and-parse for anything that matters.
TIP
Whatever the mechanism, keep schemas tight and flat: clear field names, descriptions, enums for closed sets, honest required/optional flags. The schema is doing prompt engineering — a good one prevents more errors than a long instruction.
For wiring all this into a real app — with streaming, fallback, and cost control — see the llm-integration-engineer and the model-access layer in Calling Any Model.
Frequently asked questions
- What's the difference between JSON mode and structured outputs?
- JSON mode guarantees the model returns syntactically valid JSON — but not that it matches your schema; fields can be missing, mistyped, or extra. Structured outputs (a.k.a. constrained decoding against a JSON Schema) guarantee the output conforms to the exact schema you specify — correct fields, types, and enums. For data your code depends on, use structured outputs; JSON mode is a weaker, legacy guarantee.
- Should I use function calling or structured outputs for extraction?
- Use structured outputs. Function/tool calling is designed for the model to invoke an action — call a tool with arguments — and you can coerce it into returning data, but native structured outputs are the more direct, reliable mechanism when you just want typed data back. Reserve function calling for when the model should actually do something.
- Do I need a library like Instructor or can I use the provider's API directly?
- You can use the provider's structured-output API directly. Libraries like Instructor, BAML, and the Vercel AI SDK add value on top: define the schema with your language's types, automatic validation, retry-on-failure with the errors fed back, streaming of partial objects, and provider-agnostic code. For anything beyond a one-off, the library ergonomics are worth it.
- Why does asking the model to 'return JSON' in the prompt keep breaking?
- Because it's a request, not a guarantee. The model will usually comply, then occasionally wrap the JSON in prose, add a trailing comment, use the wrong type, or omit a field — exactly on the inputs you didn't test. Without a structural guarantee and validation, those rare failures become production incidents. Use structured outputs (or a library that validates and retries) instead.
Related
- Calling Any Model: Unified LLM Gateways & SDKs in 2026Why teams put a unified layer in front of LLM providers — and how LiteLLM, OpenRouter, and the Vercel AI SDK compare for fallback and cost control.
- LLM Integration EngineerUse this agent to add an LLM feature to an application and make it production-grade — typed/structured output, streaming, provider fallback and retries, caching, and cost/latency controls. Examples — "add an AI summary endpoint to our app", "our LLM calls return unparseable JSON and break, make them reliable", "add streaming and a fallback provider to our chat feature".
- LLM Output Schema GeneratorTurn an example of the data you want from an LLM into a precise, validated output schema (Pydantic / Zod / JSON Schema) and wire it into structured-output calls. Use when adding typed LLM output, replacing brittle JSON parsing, or designing an extraction shape.
- InstructorGet structured, validated output from LLMs using plain type definitions, with automatic retries on validation failure.
- BAMLA domain-specific language for type-safe LLM functions, with generated clients and schema-aligned parsing.
- Production Tool & Function Calling: Feed Errors Back as ObservationsHow agents use tools — the call/observe/retry loop, why errors must return to the model, and the schemas, idempotency, and limits that keep it reliable.
- Multimodal Document ExtractorExtract structured data from documents and images with a vision-language model — define the target schema, prompt the VLM to fill it from the page (invoices, forms, receipts, statements, IDs), and verify critical fields against the source. Use when you need reliable structured output from messy, varied, or scanned documents that defeat template-based OCR.
- Programmatic Prompt Optimization with DSPy: Stop Hand-Tuning PromptsHand-tuning prompts doesn't scale. DSPy treats prompting as programming — declare tasks as typed signatures and let an optimizer compile the prompts for you.
- Few-Shot vs Chain-of-Thought vs Structured Prompting: What to Use When (2026)When to reach for few-shot examples, chain-of-thought reasoning, or structured/output-constrained prompting — a 2026 decision guide to the core techniques.
- Using Vision-Language Models for OCR, Documents, and Video UnderstandingHow to use vision-language models for OCR, documents, and video: how they differ from traditional OCR, their failure modes, and getting reliable output.
- Vercel AI SDKAn open-source TypeScript toolkit for building AI apps — unified model API, streaming, structured output, tool calling, and UI hooks.