# Context Window

> The context window is the maximum text — measured in tokens — an LLM can consider at once: prompt, conversation, documents, and its own output combined.

**The context window is the maximum number of [tokens](/glossary/llm-token) a language model can process in one request — everything counts against it: the system prompt, conversation history, retrieved documents, tool results, and the response being generated.**
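As a rough illustration of "everything counts against it", the sketch below adds up the parts of a request and reports what is left of the window. The ~4 characters per token ratio, the part names, and the `remaining_budget` helper are assumptions for illustration only; real counts come from the model's own tokenizer.

```python
# Assumption: ~4 characters per token is a rough rule of thumb for English text.
# Real counts come from the model's own tokenizer and will differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Crude token estimate; at least 1 for any non-empty text."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


def remaining_budget(context_window: int, reserved_output: int, parts: dict[str, str]) -> int:
    """Tokens left once every input part and the reserved output are counted."""
    used = sum(estimate_tokens(text) for text in parts.values())
    return context_window - reserved_output - used


# Hypothetical request: every part draws from the same window.
parts = {
    "system_prompt": "You are a careful assistant. " * 10,
    "conversation_history": "user: hello\nassistant: hi\n" * 50,
    "retrieved_documents": "Some retrieved passage. " * 500,
    "tool_results": '{"status": "ok"}' * 20,
}

left = remaining_budget(context_window=200_000, reserved_output=4_096, parts=parts)
print(f"tokens left: {left}")
if left < 0:
    print("over budget: trim history or retrieved documents")
```

Reserving the output up front matters because the response being generated draws from the same window as the input.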

It's the defining resource constraint of LLM applications. Frontier models grew from 4K tokens (2023) to 200K as standard, with million-token windows on recent Claude models. Yet the window stays a *budget*, for three durable reasons: cost scales with tokens processed; latency grows with input length; and attention dilutes, meaning models recall the start and end of long contexts better than the middle, so the right answer buried under noise often goes unused.
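One common response to the weaker recall in the middle is to place the most relevant material at the edges of the prompt. The sketch below is one way to do that, assuming the documents are already ranked best-first; `order_for_edges` is a hypothetical helper, not part of any library.

```python
def order_for_edges(ranked: list[str]) -> list[str]:
    """Reorder best-first documents so the strongest sit at the start and end.

    Even-ranked items fill the front in order; odd-ranked items fill the back
    in reverse, which pushes the weakest items toward the middle.
    """
    front: list[str] = []
    back: list[str] = []
    for index, doc in enumerate(ranked):
        if index % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    return front + back[::-1]


# Hypothetical documents, ranked best-first by some retriever.
ranked_docs = ["best", "second", "third", "fourth", "weakest"]
print(order_for_edges(ranked_docs))
```

Reordering only helps with placement; it does not reduce cost or latency, so trimming the input still comes first.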

That's why the craft of [context engineering](/guides/prompting/context-engineering) — load the relevant slice, not the repo — outlives every window-size increase, why [RAG](/glossary/rag) retrieves rather than stuffs, and why agents like Claude Code ship [compaction and memory machinery](/guides/configuration/claude-code-memory-context) to keep long sessions sharp.
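"Load the relevant slice" can be sketched as greedy packing: take the highest-scoring chunks that still fit a token budget and leave the rest out. The relevance scores, the whitespace-based `count_tokens` stand-in, and `select_chunks` itself are assumptions for illustration; a real pipeline would use a retriever's scores and the model's tokenizer.

```python
def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def select_chunks(scored_chunks: list[tuple[float, str]], budget_tokens: int) -> tuple[list[str], int]:
    """Greedily keep the highest-scoring chunks that fit within the budget.

    Returns the chosen chunks (best-first) and the tokens they consume.
    """
    chosen: list[str] = []
    used = 0
    for _score, text in sorted(scored_chunks, key=lambda pair: pair[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen, used


# Hypothetical retrieval results: (relevance score, chunk text).
candidates = [
    (0.9, "def parse_config(path): ..."),
    (0.5, "long changelog entry that is only loosely related to the question"),
    (0.8, "README section on config files"),
]
chunks, used = select_chunks(candidates, budget_tokens=10)
print(used, chunks)
```

Greedy selection is not optimal packing, but it is simple and keeps the most relevant material when the budget is tight.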

---

_Source: https://agentscamp.com/glossary/context-window — Term on AgentsCamp._
