# Computer Use

> Computer use is an AI agent operating software through its real interface — reading the screen, moving the cursor, clicking, and typing like a person would.

**Computer use is the agent capability of operating a computer the way a person does — perceiving the screen visually and acting through mouse and keyboard, with no API required.**

It's the generalization of tool use to interfaces never designed for machines: a [VLM](/glossary/vision-language-model) reads the screenshot, the [agent](/glossary/ai-agent) loop issues primitive actions (click, type, scroll), and each new frame is the observation that drives the next step. Anthropic shipped the first frontier version of the capability in late 2024; by 2026 it powers browser-using agents in products from coding tools to Google's Antigravity, with dedicated frameworks (Browser Use, Stagehand, Skyvern) industrializing the browser case.
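The observe-act loop described above can be sketched in a few lines. This is a toy illustration rather than any vendor's API: `FakeScreen`, `scripted_policy`, and `run_agent` are hypothetical names, a string stands in for the screenshot, and a hard-coded function stands in for the VLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeScreen:
    """Hypothetical toy environment: one text field and one submit button."""

    FIELD_POS = (100, 50)
    BUTTON_POS = (100, 120)

    def __init__(self) -> None:
        self.field = ""
        self.focused = False
        self.submitted = False

    def screenshot(self) -> str:
        # A real system returns pixels; a string stands in for the frame here.
        return f"field={self.field!r} focused={self.focused} submitted={self.submitted}"

    def apply(self, action: Action) -> None:
        if action.kind == "click" and (action.x, action.y) == self.FIELD_POS:
            self.focused = True
        elif action.kind == "click" and (action.x, action.y) == self.BUTTON_POS:
            self.submitted = True
        elif action.kind == "type" and self.focused:
            self.field += action.text


def scripted_policy(goal: str, frame: str) -> Action:
    """Stand-in for the VLM: maps (goal, current frame) to the next primitive action."""
    if "submitted=True" in frame:
        return Action("done")
    if "focused=False" in frame:
        return Action("click", *FakeScreen.FIELD_POS)
    if "field=''" in frame:
        return Action("type", text=goal)
    return Action("click", *FakeScreen.BUTTON_POS)


def run_agent(goal: str, screen: FakeScreen,
              policy: Callable[[str, str], Action],
              max_steps: int = 10) -> List[Action]:
    """Observe, decide, act; each new frame drives the next step."""
    history: List[Action] = []
    for _ in range(max_steps):
        frame = screen.screenshot()      # observation
        action = policy(goal, frame)     # one model call per step
        history.append(action)
        if action.kind == "done":
            break
        screen.apply(action)             # primitive action changes the screen
    return history


if __name__ == "__main__":
    for step in run_agent("alice", FakeScreen(), scripted_policy):
        print(step)
```

The `max_steps` cap is a design choice worth keeping in any real loop: because each iteration costs a model call, an agent that never reaches its goal should stop rather than click indefinitely.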

The engineering reality is sobering: computer use is slower, costlier, and less reliable than structured automation, because every step is a model call over an image. So the hierarchy holds: use an API when one exists; use structured browser tools like [Playwright MCP](/tools/playwright-mcp) or [Chrome DevTools MCP](/tools/chrome-devtools-mcp) when the DOM is reachable; and fall back to pixel-level computer use for everything else. Put [human gates](/glossary/human-in-the-loop) on anything that spends money or sends email, since a mis-grounded click (the model acting on the wrong on-screen element) is this modality's signature failure.
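One way to encode that hierarchy is a small routing function that prefers the cheapest reliable modality and checks for a human gate before anything runs. The sketch below is illustrative only: the `Task` fields, `choose_modality`, and the `approve` callback are hypothetical names, and gating on every modality (not only the pixel-level one) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Modality(Enum):
    API = "api"
    STRUCTURED_BROWSER = "structured_browser"
    PIXEL = "pixel_computer_use"


@dataclass(frozen=True)
class Task:
    description: str
    has_api: bool = False        # a usable API exists for this task
    dom_reachable: bool = False  # the page DOM can be driven by structured tools
    spends_money: bool = False
    sends_email: bool = False


def choose_modality(task: Task) -> Modality:
    """Prefer an API, then structured browser tools, then pixel-level control."""
    if task.has_api:
        return Modality.API
    if task.dom_reachable:
        return Modality.STRUCTURED_BROWSER
    return Modality.PIXEL


def needs_human_gate(task: Task) -> bool:
    """Irreversible or costly side effects require approval first."""
    return task.spends_money or task.sends_email


def plan(task: Task, approve: Callable[[Task], bool]) -> str:
    # Assumption: the gate applies whichever modality is chosen.
    if needs_human_gate(task) and not approve(task):
        return "blocked: human approval required"
    return f"run via {choose_modality(task).value}"


if __name__ == "__main__":
    deny_all = lambda task: False
    print(plan(Task("fetch invoice list", has_api=True), deny_all))
    print(plan(Task("fill legacy desktop form"), deny_all))
    print(plan(Task("pay invoice in legacy app", spends_money=True), deny_all))
```

In practice `approve` would pause the agent and surface the pending action to a person; here it is a plain callback so the routing logic can be exercised on its own.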

---

_Source: https://agentscamp.com/glossary/computer-use — Term on AgentsCamp._
