# Temperature

> Temperature controls how random an LLM's token choices are: low values make output focused and repeatable, high values make it varied and creative.

**Temperature is the sampling parameter that scales how confidently an LLM commits to its top token choices: near 0, it almost always picks the most probable next token; at higher values, it spreads probability across alternatives and output gets more varied.**

Mechanically, the model produces a score (a logit) for every [token](/glossary/llm-token) in its vocabulary; temperature divides those logits before softmax turns them into the probability distribution that gets sampled. Low temperature sharpens the distribution (focused, repeatable, sometimes repetitive), while high temperature flattens it (diverse, surprising, occasionally off the rails). It pairs with [top-p](/glossary/top-p), which truncates the candidate pool rather than reshaping it. Common guidance is to tune one, not both.
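To make the reshaping-versus-truncating distinction concrete, here is a minimal sketch in plain Python. The function names (`softmax_with_temperature`, `top_p_filter`) and the three logits are made up for illustration and are not any provider's API; real inference stacks do the same thing over tens of thousands of tokens and typically treat a temperature of exactly 0 as greedy selection.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax."""
    if temperature <= 0:
        # Assumption: callers handle temperature 0 separately as greedy argmax.
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_token(probs, rng=random):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

# Top-p removes the tail instead of rescaling every token.
print(top_p_filter(softmax_with_temperature(logits, 1.0), top_p=0.9))
```

Running it shows the top token's share growing as temperature falls and shrinking as it rises, while `top_p_filter` leaves the relative odds of the surviving tokens unchanged and zeroes out the rest.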

The practical defaults: deterministic-leaning for anything machine-consumed ([structured output](/glossary/structured-output), code, extraction), moderate for chat, higher only when variety is the point. One caveat for current models: [reasoning models](/glossary/reasoning-model) often fix or constrain sampling parameters during thinking, so check your provider's docs before assuming the dial does what it did in 2023.

---

_Source: https://agentscamp.com/glossary/temperature — Term on AgentsCamp._