Browser Use
The most-adopted open-source browser-agent framework — point an LLM at a task and it drives a real browser: navigating, clicking, typing, extracting.
Browser Use (MIT, ~98k stars) is the breakout browser-agent framework: hand it a task string and an LLM and it autonomously navigates, clicks, types, and extracts — driving Chromium over the DevTools Protocol. Model-agnostic (their hosted models, OpenAI, Anthropic, Gemini, local), with domain guardrails, and a 2026 Rust-core beta agent for persistence and recovery.
Browser Use is the project that made "give an AI a browser" a one-liner. At ~98k GitHub stars it's the most-adopted framework in the browser-agent category: a Python library where Agent(task="find the three cheapest flights and extract prices", llm=...) produces an autonomous session that navigates, clicks, types, and reports back.
Highlights
- Task-in, result-out: the agent plans and executes multi-step web tasks autonomously, with the perception-action loop handled for you.
- CDP-native: drives Chromium over the Chrome DevTools Protocol directly (not via Playwright), with structure+vision grounding.
- Model-agnostic: OpenAI, Anthropic, Gemini, local models, or Browser Use's own hosted agent models.
- Guardrails built in: browser profiles with allowed_domains, headless control, and scoped credentials keep the agent inside the fence.
- 2026 Rust core (beta): a new harness with persistent tools and recovery loops (browser_use.beta), the project's bet on production reliability.
- Optional cloud: stealth/anti-detect browsers, CAPTCHA solving, residential proxies, scheduling, and webhooks, i.e. the operational layer self-hosting makes you build.
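The perception-action loop mentioned in the first highlight can be sketched in plain Python. This is a conceptual illustration only, not Browser Use's implementation: FakePage, scripted_policy, and run_agent are hypothetical names, the page is a stub rather than a real Chromium session, and the policy is a hard-coded function standing in for the LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for a browser page (no real browser involved)."""
    url: str = "about:blank"
    typed: list = field(default_factory=list)

    def observe(self) -> dict:
        # Perception: a real agent would gather page structure and a screenshot.
        return {"url": self.url, "typed": list(self.typed)}

    def execute(self, action: dict) -> None:
        # Action: a real agent would send this to the browser.
        if action["type"] == "navigate":
            self.url = action["url"]
        elif action["type"] == "type":
            self.typed.append(action["text"])


def scripted_policy(task: str, observation: dict) -> dict:
    """Hypothetical stand-in for the LLM that chooses the next action."""
    if observation["url"] == "about:blank":
        return {"type": "navigate", "url": "https://example.com/search"}
    if not observation["typed"]:
        return {"type": "type", "text": task}
    return {"type": "done", "result": f"searched for {task!r} on {observation['url']}"}


def run_agent(task: str, page: FakePage, policy, max_steps: int = 10) -> str:
    """Observe, decide, act, and repeat until the policy reports completion."""
    for _ in range(max_steps):
        observation = page.observe()
        action = policy(task, observation)
        if action["type"] == "done":
            return action["result"]
        page.execute(action)
    raise RuntimeError("step budget exhausted before the task finished")


result = run_agent("cheap flights", FakePage(), scripted_policy)
```

The step budget (max_steps) is an assumption of this sketch; it is there so a confused policy fails loudly instead of looping forever.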
In an AI-assisted workflow
pip install browser-use && uvx browser-use install # library + Chromium
# then, in Python:
# agent = Agent(task="Log into the vendor portal and download this month's invoices", llm=llm)
# await agent.run()
It's the general-purpose answer to the web's no-API long tail: the workflows covered in How Computer-Use Agents Work.
WARNING
A browser agent reads hostile pages with your session attached — that's prompt injection surface by construction. Use allowed_domains, isolated profiles without your real logins, and human gates on anything that pays or sends.
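To make the two checks in this warning concrete, here is a minimal sketch of a domain allowlist plus a human gate, written in plain Python. It does not use Browser Use's own allowed_domains setting, and how the library matches domain patterns is not described here; ALLOWED_DOMAINS, GATED_ACTIONS, is_allowed, and check_action are hypothetical names, and the glob-style matching is an assumption of this example.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hypothetical policy: the only hosts the agent may touch, and the action
# kinds that need a human decision before they run.
ALLOWED_DOMAINS = ["vendor.example.com", "*.vendor.example.com"]
GATED_ACTIONS = {"pay", "send"}


def is_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Match the parsed hostname, not the raw URL string, against the allowlist."""
    host = (urlparse(url).hostname or "").lower()
    return any(fnmatchcase(host, pattern) for pattern in allowed)


def check_action(action: dict, approve) -> bool:
    """Return True only if the action is on an allowed host and, when gated, approved."""
    if not is_allowed(action["url"]):
        return False
    if action["kind"] in GATED_ACTIONS:
        # approve is a callback that asks a person; it must return True to proceed.
        return bool(approve(action))
    return True
```

Checking the parsed hostname rather than searching the URL string is what rejects lookalikes such as https://vendor.example.com.evil.io/ or https://vendor.example.com@evil.io/, which is the kind of link a hostile page might plant.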
Good to know
MIT, Python 3.11+, backed by a $17M Felicis-led seed (March 2025, YC W25). The 0.13-era API is mid-transition (the classic Agent import still works; the Rust-core agent lives under browser_use.beta), so pin versions in production. For where it sits against Stagehand's code-first primitives and Skyvern's workflow platform, see Browser Agents in 2026.
Frequently asked questions
- What does Browser Use actually do?
- It turns 'go to this site, find X, do Y' into an autonomous browser session: an agent loop perceives the page (structure plus vision), decides actions, executes them via the Chrome DevTools Protocol, and iterates to task completion. Agent(task=..., llm=...) is the whole API surface to start.
- Is Browser Use free?
- The framework is MIT-licensed and free — bring your own LLM key and run locally (pip install browser-use, then uvx browser-use install for Chromium). Browser Use Cloud is the optional paid layer: hosted stealth browsers, CAPTCHA handling, residential proxies, and scheduling, with a free tier to start.
- Browser Use vs Playwright — aren't they the same thing?
- Different layers. Playwright executes scripted automation you write; Browser Use decides the steps itself from a natural-language task (and notably drives the browser via CDP directly rather than through Playwright). Use Playwright-style tools when you know the steps; Browser Use when you want the agent to figure them out.
Related
- Browser Agents in 2026: Browser Use vs Stagehand vs Skyvern vs Playwright MCP
  The four ways to give AI a browser (autonomous framework, code-first SDK, workflow platform, or MCP server), compared honestly by control, cost, and reliability.
- How Computer-Use Agents Work
  Inside the perception-action loop that lets AI operate real software (screenshots in, clicks out), plus grounding, reliability, and when to use APIs instead.
- Stagehand
  Browserbase's open-source SDK for browser agents: act, extract, observe, and agent primitives that mix natural language with code-level control.
- Skyvern
  Open-source vision + LLM browser automation aimed at replacing brittle RPA: workflow builder, CAPTCHA/2FA handling, and self-host or cloud.
- Browser Agent Engineer
  Use this agent to build, harden, or debug browser-automation agents (web tasks via Browser Use, Stagehand, Skyvern, or Playwright-based stacks). Examples: automate a portal workflow, make a flaky browser agent reliable, add verification and guardrails to web automation, choose between vision and DOM grounding.
- Computer Use
  Computer use is an AI agent operating software through its real interface: reading the screen, moving the cursor, clicking, and typing like a person would.