Instructor
Get structured, validated output from LLMs using plain type definitions, with automatic retries on validation failure.
Instructor turns an LLM into a typed function: define a Pydantic model (or a Zod/equivalent schema in its ports), and Instructor coerces the model's output into that shape, validating it and automatically re-asking on failure. The simplest way to get reliable structured data out of an LLM.
Instructor makes structured output from LLMs feel like calling a typed function. You define the shape you want as a Pydantic model (Python — with ports for TypeScript, Go, and others), pass it in, and Instructor handles the rest: it instructs the model, parses the response into your type, validates it, and automatically retries with the validation errors fed back if the output doesn't conform.
It is aimed at developers who want data, not prose, from an LLM — extraction, classification, form-filling — without writing brittle JSON parsing and retry loops by hand. Because it builds on the providers' native function-calling/structured-output capabilities, it's thin and reliable rather than a heavy framework.
Highlights
- Types as the schema — define output with Pydantic (or the language port's equivalent); no hand-written JSON Schema.
- Validation + auto-retry — invalid output is re-requested with the errors, so you get conforming data or a clear failure.
- Provider-agnostic — works across OpenAI, Anthropic, and many other models.
- Streaming partials — stream structured objects as they're produced.
- Minimal footprint — a focused library, not a framework you build your app around.
In an AI-assisted workflow
import instructor
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
client = instructor.from_provider("anthropic/claude")
user = client.chat.completions.create(response_model=User, messages=[...]) # -> validated UserTIP
Let your types do the prompting: a well-named model with field descriptions and constraints (enums, ranges) often beats paragraphs of instructions for getting the exact structure you want.
Good to know
Instructor is free and open source (MIT); you pay your model provider for tokens. For a cross-language, schema-first approach with its own DSL, compare BAML; for a TypeScript-native app toolkit that also does structured output, see the Vercel AI SDK. Background on the techniques: Structured Output vs JSON Mode vs Function Calling.
Related
- BAMLA domain-specific language for type-safe LLM functions, with generated clients and schema-aligned parsing.
- Structured Output vs JSON Mode vs Function Calling: Which to Use in 2026The reliable ways to get typed data out of an LLM — what JSON mode, function calling, and native structured outputs each guarantee, and when to use which.
- LLM Output Schema GeneratorTurn an example of the data you want from an LLM into a precise, validated output schema (Pydantic / Zod / JSON Schema) and wire it into structured-output calls. Use when adding typed LLM output, replacing brittle JSON parsing, or designing an extraction shape.
- Vercel AI SDKAn open-source TypeScript toolkit for building AI apps — unified model API, streaming, structured output, tool calling, and UI hooks.