
Anyone building AI agents eventually runs into the same wall: letting an agent do work for you in the browser. Booking flights, posting, scraping backend data, running a regression pass: as long as the task involves web pages, browser automation is almost unavoidable.
Then you fall into the familiar trap.
Why Traditional Browser Automation Is So Painful
Playwright, Puppeteer, browser-use, and any tool that starts a fresh browser with --launch all do the same thing: open a brand-new browser with a blank profile. That means:
- You have to log in again every time. ChatGPT, X, GitHub, and the company admin dashboards that stay logged in year-round in your Chrome simply do not exist in that empty browser: its cookies and sessions start out blank. You can feed it a profile manually, but then you have to maintain that login state yourself.
- CAPTCHAs stop the agent cold. An empty browser, unfamiliar fingerprint, and data-center IP make it obvious to websites that this is a script, so they throw up a CAPTCHA or Cloudflare challenge page. The agent gets stuck on the spot, with nobody there to click through.
- The fingerprint looks fake at a glance. navigator.webdriver, headless traces, CDP leaks… these are all automation giveaways that can be spotted instantly. Bot-detection vendors like Akamai, DataDome, and PerimeterX also score traffic across dozens of dimensions, and a default Playwright/Puppeteer setup is easy to expose.
- You can’t see what it’s doing. It runs in another window in another process. You can’t step in, and when something goes wrong, all you can do is stare at logs and guess.
The root of the problem is simple: it opens “some other” browser, not yours.
chrome-use Takes a Different Approach: Open Your Real Browser
chrome-use points any agent (Claude Code, Cursor, Codex, your own scripts) at the Chrome on your machine where you’re already logged into every site.
It clicks inside your window, so you can watch it work and take over the moment it hits 2FA or a CAPTCHA. And because it is your real browser (via a one-click store extension + native messaging channel, using chrome.debugger to drive tabs, with no exposed TCP remote debugging port), sites see a real human browser at the fingerprinting layer, with no automation traces to catch (0% stealth score on CreepJS; see the “Anti-Detection” section below). Of course, IP reputation, account risk, and behavioral pacing are separate axes you still need to manage yourself.
In one sentence: no new Chrome, no logging in again, no more slamming into the “are you a robot?” wall.
| | Traditional automation (Playwright / --launch) | chrome-use (connects to your Chrome) |
|---|---|---|
| Browser | Launches a new empty browser | Connects to the Chrome you’re already using |
| Login state | Empty; you have to log in again | All your existing sessions |
| Fingerprint | Carries automation traces | Your real fingerprint |
| Collaboration | Opens a separate window | Same window, take over anytime |
| CAPTCHA | Agent gets stuck | You click once, agent continues |
What About Claude’s Built-In Chrome Extension, web-access, and CDP Ports?
- Playwright / Puppeteer / browser-use? They launch an empty browser, so you have to redo every login and fight through every CAPTCHA again, while still getting flagged as automation. chrome-use uses the session you already have.
- Claude’s built-in Chrome extension? Great, but it only drives Claude itself. chrome-use drives any agent or CLI.
- Raw --remote-debugging-port tools (like web-access)? Chrome 136+ shows an interstitial prompt when connecting: “Allow remote debugging?” (effective for that Chrome session). chrome-use never does, because it uses a one-click store extension + native messaging.
How It Works: Entirely Local, No Network

Your chrome-use CLI talks to a tiny browser extension through Chrome’s native messaging, a local inter-process channel with no network sockets, no tokens, and no remote servers. The extension uses chrome.debugger to drive the tab you specify (inside your already-logged-in Chrome), then hands the results back to the CLI. Everything stays on your machine.
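To make the "no sockets" point concrete, here is a minimal sketch of the framing Chrome's native messaging uses on a host's stdin/stdout: each message is UTF-8 JSON preceded by a 32-bit length in native byte order. The message fields (`cmd`, `url`) are hypothetical and are not chrome-use's actual wire format.

```python
import io
import json
import struct
from typing import Optional


def encode_message(payload: dict) -> bytes:
    """Frame a message for Chrome native messaging:
    4-byte length (native byte order) followed by the UTF-8 JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("=I", len(body)) + body


def read_message(stream) -> Optional[dict]:
    """Read one framed message from a binary stream; return None at EOF."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    return json.loads(stream.read(length).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical command shape, for illustration only.
    framed = encode_message({"cmd": "open", "url": "https://example.com"})
    print(read_message(io.BytesIO(framed)))
```

A real host would read from `sys.stdin.buffer` and write to `sys.stdout.buffer`; the in-memory stream here only keeps the sketch self-contained.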
Why an Extension, Not a Raw Debug Port
Other local tools use a raw --remote-debugging-port (CDP). Starting with Chrome 136, the first connection triggers a blocking “Allow remote debugging?” authorization prompt (effective for that Chrome session), and the port must be opened in advance. chrome-use’s extension uses native messaging: install once, then zero confirmations every time after.
| | chrome-use (extension) | web-access (raw CDP port) | Claude built-in plugin |
|---|---|---|---|
| Connection method | Native messaging, no port, no token | --remote-debugging-port | chrome.debugger |
| “Allow remote debugging?” prompt | Never ✅ | Appears (per session) 🔴 | None |
| Uses your real login | Yes | Yes | Yes |
| Runtime.enable CDP leak | Off by default → clean ✅ | Domain enabled | N/A |
| CreepJS stealth score | 0% stealth · 0% headless ✅ | Real Chrome | Real Chrome |
| Separate tab group per session / concurrent agents | Supported ✅ | No | No |
The prompt is a real cost, not fearmongering: raw-port tools block on an authorization prompt the first time they attach in each Chrome session, while the extension path needs zero confirmations after the one-time install.
Agent-Facing Interface: A Page Glance Costs Only 200–400 Tokens
So far, this has all been about “whose browser to connect to.” But for AI agents, there is another equally critical question: how many tokens does each page glance burn?
Many browser agents are screenshot-driven: they feed a full-page screenshot into a vision model and ask it to find buttons in the pixels. A single screenshot can easily cost thousands of tokens, and every click or page turn needs another one; run a slightly longer workflow and the tokens disappear fast. Others stuff raw HTML/DOM into the context, which is just as long and messy.
chrome-use takes a structured-first approach: snapshot -i gives the agent an accessibility tree snapshot, keeping only interactive elements, with each element assigned a compact @eN reference. A full page usually costs only ~200–400 tokens, instead of parsing raw HTML, let alone stuffing in a screenshot. The agent operates directly by reference:
chrome-use open <url>
chrome-use snapshot -i # Only inspect interactive elements, each with an @eN reference
chrome-use click @e3 # Operate by reference, not by coordinates or screenshots
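The idea behind an interactive-only snapshot can be sketched as a tree walk that keeps actionable roles and numbers them. This is an illustration under assumptions: the `Node` type, the role list, and the output line format are hypothetical, not chrome-use's real snapshot format.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed set of roles an agent can act on; the real list may differ.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


@dataclass
class Node:
    role: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def interactive_snapshot(root: Node) -> List[str]:
    """Walk the tree depth-first and emit one compact line per
    interactive element, each tagged with a sequential @eN reference."""
    lines: List[str] = []

    def walk(node: Node) -> None:
        if node.role in INTERACTIVE_ROLES:
            lines.append(f'@e{len(lines) + 1} {node.role} "{node.name}"')
        for child in node.children:
            walk(child)

    walk(root)
    return lines


if __name__ == "__main__":
    page = Node("document", children=[
        Node("heading", "Welcome"),
        Node("navigation", children=[Node("link", "Docs")]),
        Node("button", "Sign in"),
    ])
    print("\n".join(interactive_snapshot(page)))
```

Headings, containers, and static text drop out entirely, which is why the result stays small even on a busy page.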
Screenshots in chrome-use are an output, not an input: you only take them when you need evidence or something for a human to inspect. The agent does not need them at all for locating, reading, or clicking. And site adapters (covered below) go one step further: they return clean JSON directly, skipping even the snapshot, making them the cheapest path for “reading structured data.”
The savings are real: for the same workflow, a structured interface is often an order of magnitude cheaper than a screenshot-driven one, and the longer the task, the more obvious the difference becomes. That is also why chrome-use positions itself as a CLI for agents, not a browser panel for humans.
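As a rough back-of-the-envelope check, the arithmetic looks like this. The numbers are assumptions: 300 tokens is the midpoint of the 200–400 range quoted above, while the 3,000-token screenshot cost and the 20-step workflow are illustrative guesses, not measurements.

```python
def workflow_tokens(steps: int, tokens_per_look: int) -> int:
    """Total tokens spent looking at the page across a workflow."""
    return steps * tokens_per_look


STEPS = 20                    # assumed workflow length
STRUCTURED_PER_LOOK = 300     # midpoint of the ~200-400 range
SCREENSHOT_PER_LOOK = 3000    # assumed stand-in for "thousands of tokens"

structured = workflow_tokens(STEPS, STRUCTURED_PER_LOOK)
screenshot = workflow_tokens(STEPS, SCREENSHOT_PER_LOOK)

if __name__ == "__main__":
    print(structured, screenshot, screenshot / structured)
```

Under these assumptions the gap is a constant factor per look, so the absolute difference grows linearly with the number of steps.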
Anti-Detection: No Patches, No Detectable Lie-Detector Traces

When connected to your real Chrome, chrome-use does not inject a single JavaScript patch. Your browser fingerprint is completely real. The guiding principle is to use native CDP/Chrome-layer overrides instead of JS spoofing: redefined getters can be detected, while native-layer overrides do not leave that kind of trace.
- navigator.webdriver = false goes through Emulation.setAutomationOverride (a native override, unlike a redefined getter that would be caught on the spot by “lie detectors” like CreepJS).
- Runtime.enable is off by default. A live Runtime domain is itself a detectable CDP signal (the “runtime leak” described by patchright/rebrowser), even when you are connected to a real Chrome. We only enable it when you explicitly turn on console/error capture. click, fill, and eval still work normally without it.
Real-World Test Results (Connected to Real Chrome):
| Detection Site | Result |
|---|---|
| CreepJS | 0% stealth · 0% headless (no automation override traces) |
| bot.incolumitas.com | All OK: overflowTest, overrideTest, puppeteerExtraStealthUsed, worker consistency |
| bot.sannysoft.com | All green |
| BrowserScan | Webdriver · User-Agent · CDP all clean |
| Cloudflare Turnstile (nowsecure.nl passive challenge) | Passed |
The 0% stealth on CreepJS is the key number: because the connection path patches nothing, there are no overrides for a “lie detector” to catch in the first place. We also intentionally do not build our own bot detector. The most credible benchmark is to run the strictest public detection sites (CreepJS, incolumitas) against your real browser yourself. Note that they test fingerprints/traces; commercial behavior + IP reputation stacks are another, harder layer. You do not have to take our word for it: verify it yourself.
Behavior-Level Stealth: Making Clicks Look Human
Fingerprints are only half the story. Bot detection vendors like Akamai, DataDome, and PerimeterX also score behavior. A click that teleports the cursor to the exact center of an element, with no approach path and zero press delay, is itself a giveaway, even if our CDP events are isTrusted.

With humanize enabled, cursor movement behaves like a real person operating the browser: clicks follow a curved, decelerating Bézier path and land on a randomly jittered position inside the element (never the exact center); typing uses variable keystroke intervals; scrolling is segmented and eased; dragging follows curves. It is also adaptive: on every navigation, it detects known anti-bot vendors (cookies / scripts / globals), and pages under scrutiny automatically upgrade to human-like trajectories, eliminating low-level giveaways like “teleport clicks.” This can remove obvious machine traces, but behavior-level risk systems still look at dwell time, rhythm, and entropy across the whole session. Humanize is not a silver bullet; ordinary sites keep the original instant clicks (zero extra overhead).
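A curved, decelerating approach with a jittered landing point can be sketched as follows. This is a generic illustration of the technique, not chrome-use's implementation: the control-point placement, the cubic ease-out, and the 20–80% landing zone are all assumptions.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # x, y, width, height


def ease_out(t: float) -> float:
    """Cubic ease-out: large steps first, tiny steps near the target."""
    return 1 - (1 - t) ** 3


def bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return (x, y)


def click_target(box: Box, rng: random.Random) -> Point:
    """Pick a random point inside the element, never the exact center."""
    x, y, w, h = box
    while True:
        fx, fy = rng.uniform(0.2, 0.8), rng.uniform(0.2, 0.8)
        if (fx, fy) != (0.5, 0.5):
            return (x + fx * w, y + fy * h)


def human_path(start: Point, box: Box, steps: int = 25,
               rng: Optional[random.Random] = None) -> List[Point]:
    """Return cursor positions from start to a jittered point in box."""
    rng = rng or random.Random()
    end = click_target(box, rng)
    dx, dy = end[0] - start[0], end[1] - start[1]
    bend = rng.uniform(-0.3, 0.3)  # sideways bulge of the curve
    c1 = (start[0] + dx * 0.3 - dy * bend, start[1] + dy * 0.3 + dx * bend)
    c2 = (start[0] + dx * 0.7 - dy * bend, start[1] + dy * 0.7 + dx * bend)
    return [bezier(start, c1, c2, end, ease_out(i / steps))
            for i in range(steps + 1)]


if __name__ == "__main__":
    path = human_path((100.0, 400.0), (600.0, 200.0, 80.0, 30.0))
    print(len(path), path[0], path[-1])
```

Sampling the curve at eased parameter values is what produces the deceleration: consecutive points are far apart at the start and nearly coincide as the cursor settles on the element.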
Control it with --humanize off|fast|human or an environment variable. The default is off; the adaptive detector upgrades individual pages automatically when it spots a known anti-bot vendor.
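The per-page upgrade could work roughly like the sketch below, which only looks at cookie names. The cookie markers are commonly observed ones for these vendors and are listed for illustration; the real detector also inspects scripts and globals, and its signature list and upgrade rule are not documented here, so treat both as assumptions.

```python
from typing import Iterable, List

# Illustrative cookie markers; not chrome-use's actual signature list.
VENDOR_COOKIES = {
    "Akamai": {"_abck", "bm_sz"},
    "DataDome": {"datadome"},
    "PerimeterX": {"_px3", "_pxvid"},
}

MODES = ("off", "fast", "human")


def detect_vendors(cookie_names: Iterable[str]) -> List[str]:
    """Return the vendors whose marker cookies appear on the page."""
    names = set(cookie_names)
    return sorted(v for v, markers in VENDOR_COOKIES.items() if names & markers)


def effective_mode(configured: str, cookie_names: Iterable[str]) -> str:
    """Assumed rule: any detected vendor upgrades the page to 'human';
    otherwise the configured mode is used unchanged."""
    if configured not in MODES:
        raise ValueError(f"unknown humanize mode: {configured}")
    return "human" if detect_vendors(cookie_names) else configured


if __name__ == "__main__":
    print(effective_mode("off", ["session_id", "_abck"]))
    print(effective_mode("off", ["session_id"]))
```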
Silent Operation: Never Steals Your Foreground
Since it drives your own real Chrome, it should never interrupt what you are doing. The agent operates entirely in the background: new tabs open without stealing focus (inside its own colored tab group), the agent never forces tabs to the foreground, and Emulation.setFocusEmulationEnabled keeps every agent tab rendering continuously while making document.hasFocus() return true and visibilityState report visible. So screenshots still work, pages do not get render-throttled, and the suspicious signal of “the session tab was invisible the whole time” is not triggered. You keep working in your active tab, while the agent quietly does its job beside you.
Multiple Agents Share One Chrome Without Clashing
Each --session gets its own set of color-coded Chrome tabs, so multiple agents can concurrently share the same real browser without interfering with each other or touching your own tabs. A session only owns the tabs it created itself; it never takes over your tabs, nor the tabs of other agents. Command dispatch is also isolated by session. This means you can run several agents at the same time in one real Chrome, each doing its own work.
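The ownership rule is simple enough to sketch: a session may only act on tabs it registered itself, so the user's tabs (never registered) and other agents' tabs are always refused. The `TabRegistry` class and its method names are hypothetical, not part of chrome-use.

```python
from typing import Dict, List


class TabRegistry:
    """Tracks which session created which tab."""

    def __init__(self) -> None:
        self._owner: Dict[int, str] = {}  # tab id -> owning session

    def register(self, session: str, tab_id: int) -> None:
        """Record a tab the session just created."""
        if tab_id in self._owner:
            raise ValueError(f"tab {tab_id} already owned by {self._owner[tab_id]}")
        self._owner[tab_id] = session

    def authorize(self, session: str, tab_id: int) -> bool:
        """A command is allowed only on tabs the session created itself."""
        return self._owner.get(tab_id) == session

    def tabs_for(self, session: str) -> List[int]:
        return sorted(t for t, s in self._owner.items() if s == session)


if __name__ == "__main__":
    registry = TabRegistry()
    registry.register("agent-a", 101)
    registry.register("agent-b", 202)
    print(registry.authorize("agent-a", 101))  # own tab
    print(registry.authorize("agent-a", 202))  # another agent's tab
    print(registry.authorize("agent-a", 7))    # the user's tab, never registered
```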
Site Adapters: Turn a Website into a Structured Data CLI
Many tasks like “read GitHub issues,” “search Reddit,” or “fetch my Bilibili feed” don’t need clicks or screenshots at all. Behind the site, there’s usually already a JSON API; it just requires a logged-in state to call. A site adapter is a small piece of JS that calls that API inside your already logged-in tab using your cookies, same-origin fetch, and the site’s own modules, then returns clean JSON. From the website’s perspective, the request is almost indistinguishable from something you did manually.
chrome-use does not bundle any adapters. site update pulls the community bb-sites package at runtime, like a package manager fetching dependencies, then runs them through chrome-use’s incognito transport:
chrome-use site update # fetch adapter package (~145 commands)
chrome-use site list # github/issues, reddit/search, bilibili/feed…
chrome-use site github/issues epiral/bb-browser --json
chrome-use site bilibili/feed --json # works because it uses your logged-in session
It also auto-syncs and auto-prompts: when you open/snapshot a domain that has adapters, chrome-use prints a line like 💡 site adapters for <domain> directly in the output, guiding the agent to use the structured data adapter first instead of scraping the DOM.
Turn “Click Around” Into a Rerunnable Test Suite (chrome-use test)
That repetitive work of “open it, click around, check whether things look right” can become a set of rerunnable tests—basically adding a smoke/regression layer to the frontend. Write cases in YAML, reuse chrome-use’s own commands for steps, and compile assertions into a single check:
# smoke.yaml
suite: chatgpt smoke
cases:
  - name: home loads logged in
    steps:
      - open: https://chatgpt.com/
      - wait: { load: networkidle }
    assert:
      - url: { contains: chatgpt.com }
      - visible: "#prompt-textarea"
chrome-use test smoke.yaml # launch an isolated browser to run the cases
chrome-use test smoke.yaml --session default # …or run against your connected Chrome
If any case fails, the exit code is non-zero (drop it straight into CI), and the failed case also saves a screenshot. Assertions support url/visible/hidden/text/count/eval, and steps support open/click/fill/type/press/wait/scroll/eval. Found a regression? Just add a case. The more you use it, the more valuable this test suite becomes.
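How a case's assertions might collapse into a single pass/fail and an exit code can be sketched like this. It covers only url/visible/hidden against a hypothetical in-memory page state; the `page` dictionary shape and the choice of exit code 1 are assumptions, since the text only says the code is non-zero.

```python
from typing import Dict, List


def check(assertion: Dict, page: Dict) -> bool:
    """Evaluate one assertion of the form {kind: argument}."""
    ((kind, arg),) = assertion.items()
    if kind == "url":
        return arg["contains"] in page["url"]
    if kind == "visible":
        return arg in page["visible_selectors"]
    if kind == "hidden":
        return arg not in page["visible_selectors"]
    raise ValueError(f"unsupported assertion: {kind}")


def run_case(case: Dict, page: Dict) -> bool:
    """A case passes only if every assertion holds."""
    return all(check(a, page) for a in case["assert"])


def exit_code(results: List[bool]) -> int:
    """0 when all cases pass, 1 otherwise (assumed non-zero value)."""
    return 0 if all(results) else 1


if __name__ == "__main__":
    case = {
        "name": "home loads logged in",
        "assert": [
            {"url": {"contains": "chatgpt.com"}},
            {"visible": "#prompt-textarea"},
        ],
    }
    page = {"url": "https://chatgpt.com/", "visible_selectors": {"#prompt-textarea"}}
    print(exit_code([run_case(case, page)]))
```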
Getting Started: One-Line Install, Connect to Any Agent
curl -fsSL https://raw.githubusercontent.com/leeguooooo/chrome-use/main/install.sh | sh
The script downloads the right prebuilt binary for your platform from the latest GitHub Release and installs chrome-use plus the short alias abs. No npm, no token required.
Connect to your Chrome. The extension path is recommended: one click, zero popups. Install the chrome-use extension from the Chrome Web Store, then register the local bridge once:
chrome-use extension install # register native messaging host (one-time)
chrome-use open https://x.com/home
After that, everything runs through native messaging, driving your real, logged-in Chrome: no debug port, no token, and no “Allow remote debugging?” popup ever.
Install the companion skill for your AI agent, such as Claude Code or Cursor:
npx skills add leeguooooo/chrome-use
This drops skills/chrome-use into your project along with the specialized skills, so your agent gets correct usage examples and pre-authorized bash permissions.
Day to day, it looks like this. Any agent can call it:
chrome-use open https://example.com
chrome-use click "Post"
chrome-use fill "Title" "Hello World"
chrome-use screenshot ./page.png
The agent operates inside your Chrome, and you can watch tabs open, pages load, and clicks happen in real time. You can take over anytime, such as to solve a CAPTCHA, then let the agent continue.
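For agents written as scripts, the same commands can be driven through a thin subprocess wrapper. This is a sketch under assumptions: the helper names are made up, and appending --session after the subcommand arguments simply mirrors the `chrome-use test smoke.yaml --session default` form shown earlier.

```python
import subprocess
from typing import List, Optional


def build_command(action: str, *args: str, session: Optional[str] = None,
                  binary: str = "chrome-use") -> List[str]:
    """Build the argv for one chrome-use invocation."""
    argv = [binary, action, *args]
    if session:
        # Placement after the subcommand is an assumption.
        argv += ["--session", session]
    return argv


def run(action: str, *args: str, session: Optional[str] = None) -> str:
    """Run one command and return its stdout; raises if it exits non-zero."""
    result = subprocess.run(build_command(action, *args, session=session),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Printing the argv only, so the sketch runs without chrome-use installed.
    print(build_command("open", "https://example.com"))
    print(build_command("click", "@e3", session="agent-a"))
```

Passing an argv list instead of a shell string avoids quoting problems when a label such as "Hello World" contains spaces.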
Don’t want to touch your real Chrome? Use chrome-use --launch open <url> to start a fresh isolated incognito browser with the full anti-detection patch set applied. CI automatically uses this path.
What Makes It Different
- Connects to your existing Chrome by default: chrome-use open <url> drives the browser you’re already using instead of launching a new one.
- Token-efficient structured interface: agents receive an accessibility-tree snapshot plus @eN references, around ~200–400 tokens per page, without relying on screenshots or stuffing in raw HTML; screenshots are output, not input.
- Extension relay transport: one-click store extension + native messaging, with no debug port and no “Allow remote debugging?” popup.
- CDP-native stealth: anti-detection uses Chrome/CDP overrides rather than JS patches; zero patches when connected to real Chrome, with the full patch set applied only under --launch.
- Humanize: human-like cursor trajectories + adaptive anti-bot handling.
- Multi-agent isolation: concurrent agents share one real Chrome through per-session tab groups without interfering with each other.
- Silent operation: runs in the background and never steals your foreground tab.
chrome-use is part of the *-use family: iphone-use drives your real iPhone, while chrome-use drives your real Chrome. The project is open source under Apache-2.0.
People building agent automation deserve this. Drop by GitHub and give it a star so more fellow Agent builders can discover it: github.com/leeguooooo/chrome-use.
Built by leeguooooo. Field notes on AI agents, reverse engineering, and Cloudflare Workers are at blog.misonote.com, and you can follow @leeguooooo on X.
