@ulpi/browse 1.0.6 → 1.3.0

package/package.json CHANGED
@@ -1,12 +1,13 @@
  {
  "name": "@ulpi/browse",
- "version": "1.0.6",
+ "version": "1.3.0",
  "repository": {
  "type": "git",
  "url": "https://github.com/ulpi-io/browse"
  },
  "dependencies": {
  "@lightpanda/browser": "^1.2.0",
+ "@modelcontextprotocol/sdk": "^1.27.1",
  "@ulpi/browse": "^0.3.0",
  "better-sqlite3": "^11.0.0",
  "diff": "^7.0.0",
@@ -25,8 +26,7 @@
  "dist/",
  "skill/",
  "LICENSE",
- "README.md",
- "BENCHMARKS.md"
+ "README.md"
  ],
  "keywords": [
  "browser",
@@ -42,14 +42,14 @@
  "access": "public"
  },
  "scripts": {
- "build": "esbuild src/cli.ts --bundle --format=cjs --platform=node --target=node18 --outfile=dist/browse.cjs --external:playwright --external:playwright-core --external:better-sqlite3 --external:electron --external:chromium-bidi --banner:js='#!/usr/bin/env node\nconst __import_meta_url = require(\"url\").pathToFileURL(__filename).href;' --define:import.meta.url=__import_meta_url",
+ "build": "esbuild src/cli.ts --bundle --format=cjs --platform=node --target=node18 --outfile=dist/browse.cjs --external:playwright --external:playwright-core --external:better-sqlite3 --external:electron --external:chromium-bidi --external:@modelcontextprotocol/sdk --banner:js='#!/usr/bin/env node\nconst __import_meta_url = require(\"url\").pathToFileURL(__filename).href;' --define:import.meta.url=__import_meta_url",
  "build:all": "bash scripts/build-all.sh",
  "dev": "tsx src/cli.ts",
  "server": "tsx src/server.ts",
  "test": "vitest run",
  "start": "tsx src/server.ts",
  "postinstall": "npx playwright install chromium",
- "benchmark": "tsx benchmark.ts"
+ "benchmark": "echo 'Benchmarks moved to separate repo'"
  },
  "type": "module",
  "devDependencies": {
package/skill/SKILL.md CHANGED
@@ -1,10 +1,11 @@
  ---
  name: browse
- version: 2.8.0
+ version: 3.5.0
  description: |
- Fast web browsing for AI coding agents via persistent headless Chromium daemon. Navigate to any URL,
- read page content, click elements, fill forms, run JavaScript, take screenshots,
- inspect CSS/DOM, capture console/network logs, and more. ~100ms per command after
+ Fast web browsing and web app testing for AI coding agents via persistent headless Chromium daemon.
+ Browse any URL, read page content, click elements, fill forms, run JavaScript, take screenshots,
+ inspect CSS/DOM, capture console/network logs, and more. Ideal for verifying local dev servers,
+ testing UI changes, and validating web app behavior end-to-end. ~100ms per command after
  first call. Works with Claude Code, Cursor, Cline, Windsurf, and any agent that can run Bash.
  No MCP, no Chrome extension — just fast CLI.
  allowed-tools:
@@ -18,78 +19,29 @@ allowed-tools:
  Persistent headless Chromium daemon. First call auto-starts the server (~3s).
  Every subsequent call: ~100-200ms. Auto-shuts down after 30 min idle.
  
- ## SETUP (run this check BEFORE any browse command)
+ ## SETUP
+
+ Before using browse, confirm the CLI is installed:
  
  ```bash
- # Check if browse is available
- if command -v browse &>/dev/null; then
- echo "READY"
- else
- echo "NEEDS_INSTALL"
- fi
+ browse --version
  ```
  
- If `NEEDS_INSTALL`:
- 1. Tell the user: "browse needs a one-time install via npm. OK to proceed?"
- 2. If they approve: `npm install -g @ulpi/browse`
+ If not installed, tell the user:
  
- ### Permissions check
+ > `browse` CLI is not installed. Install it with:
+ >
+ > ```bash
+ > npm install -g @ulpi/browse
+ > ```
  
- After confirming browse is available, check if browse commands are pre-allowed:
+ **Do NOT install anything automatically.** Wait for the user to confirm they have installed it before proceeding.
  
- ```bash
- cat .claude/settings.json 2>/dev/null
- ```
-
- If the file is missing or does not contain browse permission rules in `permissions.allow`:
- 1. Tell the user: "browse works best when its commands are pre-allowed so you don't get prompted on every call. Add browse permissions to `.claude/settings.json`?"
- 2. If they approve, read the existing `.claude/settings.json` (or create it), and add ALL of these rules to `permissions.allow` (merge with existing rules — do not overwrite):
-
- ```json
- "Bash(browse:*)",
- "Bash(browse goto:*)", "Bash(browse back:*)", "Bash(browse forward:*)",
- "Bash(browse reload:*)", "Bash(browse url:*)", "Bash(browse text:*)",
- "Bash(browse html:*)", "Bash(browse links:*)", "Bash(browse forms:*)",
- "Bash(browse accessibility:*)", "Bash(browse snapshot:*)",
- "Bash(browse snapshot-diff:*)", "Bash(browse click:*)",
- "Bash(browse dblclick:*)", "Bash(browse fill:*)", "Bash(browse select:*)",
- "Bash(browse hover:*)", "Bash(browse focus:*)",
- "Bash(browse check:*)", "Bash(browse uncheck:*)",
- "Bash(browse type:*)", "Bash(browse press:*)",
- "Bash(browse keydown:*)", "Bash(browse keyup:*)",
- "Bash(browse scroll:*)", "Bash(browse wait:*)",
- "Bash(browse viewport:*)", "Bash(browse upload:*)",
- "Bash(browse drag:*)", "Bash(browse highlight:*)", "Bash(browse download:*)",
- "Bash(browse dialog-accept:*)", "Bash(browse dialog-dismiss:*)",
- "Bash(browse js:*)", "Bash(browse eval:*)", "Bash(browse css:*)",
- "Bash(browse attrs:*)", "Bash(browse element-state:*)", "Bash(browse dialog:*)",
- "Bash(browse console:*)", "Bash(browse network:*)",
- "Bash(browse cookies:*)", "Bash(browse storage:*)", "Bash(browse perf:*)",
- "Bash(browse value:*)", "Bash(browse count:*)",
- "Bash(browse devices:*)", "Bash(browse emulate:*)",
- "Bash(browse screenshot:*)", "Bash(browse pdf:*)",
- "Bash(browse responsive:*)", "Bash(browse diff:*)",
- "Bash(browse chain:*)", "Bash(browse tabs:*)", "Bash(browse tab:*)",
- "Bash(browse newtab:*)", "Bash(browse closetab:*)",
- "Bash(browse frame:*)",
- "Bash(browse sessions:*)", "Bash(browse session-close:*)",
- "Bash(browse state:*)", "Bash(browse auth:*)", "Bash(browse har:*)", "Bash(browse video:*)",
- "Bash(browse record:*)",
- "Bash(browse route:*)", "Bash(browse offline:*)",
- "Bash(browse status:*)", "Bash(browse stop:*)", "Bash(browse restart:*)",
- "Bash(browse cookie:*)", "Bash(browse header:*)",
- "Bash(browse useragent:*)",
- "Bash(browse clipboard:*)", "Bash(browse screenshot-diff:*)",
- "Bash(browse find:*)", "Bash(browse inspect:*)",
- "Bash(browse instances:*)", "Bash(browse --headed:*)",
- "Bash(browse rightclick:*)", "Bash(browse tap:*)",
- "Bash(browse swipe:*)", "Bash(browse mouse:*)",
- "Bash(browse keyboard:*)", "Bash(browse scrollinto:*)",
- "Bash(browse scrollintoview:*)", "Bash(browse set:*)",
- "Bash(browse box:*)", "Bash(browse errors:*)",
- "Bash(browse doctor:*)", "Bash(browse upgrade:*)",
- "Bash(browse --max-output:*)"
- ```
+ ### Permissions (optional)
+
+ To avoid being prompted on every browse command, tell the user they can pre-allow browse commands. Read [references/permissions.md](references/permissions.md) for the full permission list to add to `.claude/settings.json`.
+
+ **Do NOT modify settings files automatically.** Show the user the permissions and let them decide whether to add them.
  
  ## IMPORTANT
  
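Reviewer sketch for the SETUP change in the hunk above: the new text replaces the `command -v` probe with `browse --version` and forbids automatic installs. A minimal TypeScript sketch of that flow follows, written in TypeScript because the package builds from `src/cli.ts`. The helper names `isCliInstalled` and `installHint` are hypothetical and are not part of `@ulpi/browse`; the sketch assumes only that `browse --version` exits with status 0 when the CLI is present.

```typescript
import { spawnSync } from "node:child_process";

// Returns true only when `<command> --version` can be spawned and exits with status 0.
// A missing binary surfaces as `result.error` (ENOENT) with a null exit status.
function isCliInstalled(command: string): boolean {
  const result = spawnSync(command, ["--version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Hypothetical helper: builds the message to show the user.
// It never runs the install itself, matching "Do NOT install anything automatically."
function installHint(command: string, pkg: string): string | null {
  if (isCliInstalled(command)) {
    return null;
  }
  return `\`${command}\` CLI is not installed. Install it with: npm install -g ${pkg}`;
}
```

Calling `installHint("browse", "@ulpi/browse")` would yield either `null` or the text to relay; the decision to install stays with the user.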
@@ -102,194 +54,78 @@ If the file is missing or does not contain browse permission rules in `permissio
  - The server auto-starts on first command. No manual setup needed.
  - Use `--session <id>` for parallel agent isolation. Each session gets its own tabs, refs, cookies.
  - Use `--json` for structured output (`{success, data, command}`).
- - Use `--content-boundaries` for prompt injection defense.
- - Use `--allowed-domains domain1,domain2` to restrict navigation.
+ - Use `--content-boundaries` for prompt injection defense when reading untrusted pages.
+ - Use `--allowed-domains domain1,domain2` to restrict navigation to trusted sites.
+ - If you hit CAPTCHA/MFA after 2-3 failures, read [references/guides.md](references/guides.md) for the mandatory handoff protocol.
  
  ## Quick Reference
  
  ```bash
- # Navigate to a page
+ # Navigate and wait
  browse goto https://example.com
+ browse wait --network-idle
  
- # Read cleaned page text
- browse text
-
- # Take a screenshot (then Read the image — saved to .browse/sessions/default/screenshot.png)
- browse screenshot
-
- # Snapshot: accessibility tree with refs
- browse snapshot -i
-
- # Click by ref (after snapshot)
- browse click @e3
-
- # Fill by ref
- browse fill @e4 "test@test.com"
-
- # Double-click, focus, check/uncheck
- browse dblclick @e3
- browse focus @e5
- browse check @e7
- browse uncheck @e7
-
- # Drag and drop
- browse drag @e1 @e2
-
- # Run JavaScript
- browse js "document.title"
+ # Read content
+ browse text # Cleaned page text
+ browse html "[id=main]" # innerHTML of element
+ browse links # All links as "text href"
+ browse js "document.title" # Run JavaScript
  
- # Get all links
- browse links
+ # Screenshot (then Read the image to view it)
+ browse screenshot .browse/sessions/default/homepage.png
  
- # Get input value / count elements
- browse value "[id=email]"
- browse count ".search-result"
+ # Interact via snapshot refs (preferred)
+ browse snapshot -i # Get interactive element refs
+ browse click @e3 # Click by ref
+ browse fill @e4 "test@test.com" # Fill by ref
+ browse check @e7 # Check checkbox
+ browse select @e5 "option-value" # Select dropdown
  
- # Click by CSS selector
+ # Interact via CSS selectors (use [id=...] instead of #)
  browse click "button.submit"
-
- # Fill a form by CSS selector (use [id=...] instead of # to avoid shell issues)
  browse fill "[id=email]" "test@test.com"
  browse fill "[id=password]" "abc123"
  browse click "button[type=submit]"
  
+ # Wait variants
+ browse wait ".loaded" # Wait for element
+ browse wait --url "**/dashboard" # Wait for URL
+ browse wait --network-idle # Wait for network idle
+ browse wait 2000 # Wait milliseconds
+
+ # Element queries
+ browse count ".search-result" # Count elements
+ browse value "[id=email]" # Get input value
+ browse css @e3 "color" # Get computed CSS
+ browse attrs @e3 # Get attributes
+ browse console # View console messages
+ browse errors # View page errors
+
  # Scroll
- browse scroll up
  browse scroll down
  browse scroll "[id=target]"
  
- # Wait for navigation or network
- browse wait ".loaded"
- browse wait --url "**/dashboard"
- browse wait --network-idle
-
- # iframe targeting
- browse frame "[id=my-iframe]"
- browse text # reads from inside the iframe
- browse click @e3 # clicks inside the iframe
- browse frame main # back to main page
-
- # Highlight an element (visual debugging)
- browse highlight @e5
-
- # Download a file
- browse download @e3 ./file.pdf
+ # iframes
+ browse frame "[id=my-iframe]" # Target iframe
+ browse text # Read inside iframe
+ browse frame main # Back to main page
  
  # Network mocking
- browse route "**/*.png" block
  browse route "**/api/data" fulfill 200 '{"mock":true}'
+ browse route "**/*.png" block
  browse route clear
  
- # Offline mode
- browse offline on
- browse offline off
-
- # JSON output mode
- browse --json goto https://example.com
-
- # Security: content boundaries
- browse --content-boundaries text
-
- # Security: domain restriction
- browse --allowed-domains example.com,*.cdn.example.com goto https://example.com
-
- # State persistence
- browse state save mysite
- browse state load mysite
- browse state clean # delete states older than 7 days
- browse state clean --older-than 30 # custom threshold
-
- # Cookie management
- browse cookie clear # clear all cookies
- browse cookie set auth token --domain .example.com # set with options
- browse cookie export ./cookies.json # export to file
- browse cookie import ./cookies.json # import from file
-
- # Cookie import from real browsers (macOS — Chrome, Arc, Brave, Edge)
- browse cookie-import --list # show installed browsers
- browse cookie-import chrome --domain .example.com # import cookies for a domain
- browse cookie-import arc --domain .github.com # import from Arc
- browse cookie-import chrome --profile "Profile 1" --domain .site.com # specific Chrome profile
-
- # Session auto-persistence (named sessions survive restarts)
- browse --session myapp goto https://app.com/login # login...
- browse session-close myapp # state auto-saved (encrypted if BROWSE_ENCRYPTION_KEY set)
- browse --session myapp goto https://app.com/dashboard # cookies auto-restored
-
- # Persistent profiles (full browser state, own Chromium)
- browse --profile mysite goto https://app.com # all state persists automatically
- browse --profile mysite snapshot -i # still logged in next time
- browse profile list # list all profiles with size
- browse profile delete old-site # remove a profile
-
- # Load state at launch
- browse --state auth.json goto https://app.com # load cookies before first command
-
- # Auth vault (credentials never visible to LLM)
- browse auth save github https://github.com/login user pass123
- browse auth login github
-
- # HAR recording
- browse har start
- browse goto https://example.com
- browse har stop ./recording.har
-
- # Video recording (watch a .webm of the session)
- browse video start ./videos
- browse goto https://example.com
- browse click @e3
- browse video stop
-
- # Command recording (export replayable scripts)
- browse record start
- browse goto https://example.com
- browse click "a"
- browse fill "[id=search]" "test query"
- browse record stop
- browse record export replay ./recording.json # replay with: npx @puppeteer/replay ./recording.json
- browse record export browse ./steps.json # replay with: cat steps.json | browse chain
-
- # Both together (video + replayable script)
- browse video start ./videos
- browse record start
- browse goto https://example.com
- browse snapshot -i
- browse click @e3
- browse fill "[id=email]" "user@test.com"
- browse record stop
- browse video stop
- browse record export replay ./recording.json
-
- # Device emulation
- browse emulate iphone
- browse emulate reset
-
- # Parallel sessions
- browse --session agent-a goto https://site1.com
- browse --session agent-b goto https://site2.com
-
- # Clipboard
- browse clipboard
- browse clipboard write "copied text"
-
- # Find elements semantically
- browse find role button
- browse find text "Submit"
- browse find testid "login-btn"
+ # Cookie import from real browsers (macOS)
+ browse cookie-import chrome --domain .site.com
  
- # Screenshot diff (visual regression)
- browse screenshot-diff baseline.png current.png
+ # Persistent profiles
+ browse --profile mysite goto https://app.com
  
- # Headed mode (visible browser)
- browse --headed goto https://example.com
-
- # Stealth mode (bypasses bot detection)
- # Requires: bun add rebrowser-playwright && npx rebrowser-playwright install chromium
- browse --runtime rebrowser goto https://example.com
-
- # State list / show
- browse state list
- browse state show mysite
+ # Cloud providers (encrypted API keys, never visible to agents)
+ browse provider save browserbase <api-key>
+ browse --provider browserbase goto https://example.com
+ browse provider list
+ browse provider delete browserbase
  ```
  
  ## Command Reference
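Reviewer sketch for the `--json` bullet kept as context in the hunk above: SKILL.md names the envelope fields as `{success, data, command}` but does not give their types. The TypeScript sketch below shows one way a caller might validate that envelope before trusting it. The field types (`success` as a boolean, `command` as a string, `data` as anything) and the function name `parseEnvelope` are assumptions made for illustration, not the package's documented contract.

```typescript
// Envelope field names come from SKILL.md; the field types here are assumptions.
interface BrowseEnvelope {
  success: boolean;
  data: unknown;
  command: string;
}

// Parses stdout captured from a `browse --json ...` call and checks the envelope shape.
// Throws when the text is not JSON or a field is missing or has an unexpected type.
function parseEnvelope(stdout: string): BrowseEnvelope {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  const success = record["success"];
  const command = record["command"];
  if (typeof success !== "boolean") {
    throw new Error("expected boolean `success`");
  }
  if (typeof command !== "string") {
    throw new Error("expected string `command`");
  }
  if (!("data" in record)) {
    throw new Error("expected `data` field");
  }
  return { success, data: record["data"], command };
}
```

A caller would typically branch on `success` first and only then interpret `data`, whose shape depends on the command that was run.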
@@ -358,6 +194,7 @@ browse keyup <key> Release key
  browse keyboard inserttext <t> Insert text without key events
  browse scroll [sel|up|down] Scroll element/viewport/bottom
  browse scrollinto <sel> Scroll element into view (explicit)
+ browse scrollintoview <sel> Alias for scrollinto
  browse swipe <dir> [px] Swipe up/down/left/right (touch events)
  browse mouse move <x> <y> Move mouse to coordinates
  browse mouse down [button] Press mouse button (left/right/middle)
@@ -371,8 +208,14 @@ browse wait --fn "expr" Wait for JavaScript condition
  browse wait --load <state> Wait for load state
  browse wait --url <pattern> Wait for URL match
  browse wait --network-idle Wait for network idle
+ browse wait --download Wait for download, return temp path
+ browse wait --download ./report.pdf Wait and save to path
+ browse wait --download 60000 Custom timeout (ms)
+ browse wait --download ./file.pdf 60000 Both path and timeout
  browse set geo <lat> <lng> Set geolocation
  browse set media <scheme> Set color scheme (dark/light/no-preference)
+ browse header <name>:<value> Set request header
+ browse useragent <string> Set user agent string
  browse viewport <WxH> Set viewport size (e.g. 375x812)
  browse upload <sel> <files> Upload file(s) to a file input
  browse highlight <selector> Highlight element (visual debugging)
@@ -384,6 +227,15 @@ browse emulate reset Reset to desktop (1920x1080)
  browse offline [on|off] Toggle offline mode
  ```
  
+ ### Cookies
+ ```
+ browse cookie <n>=<v> Set cookie (shorthand)
+ browse cookie set <n> <v> [--domain d --secure] Set cookie with options
+ browse cookie clear Clear all cookies
+ browse cookie export <file> Export cookies to JSON file
+ browse cookie import <file> Import cookies from JSON file
+ ```
+
  ### Network
  ```
  browse route <pattern> block Block matching requests
@@ -497,7 +349,7 @@ browse cookie-import <browser> --profile <p> --domain <d> Specific Chrome prof
  
  ### Auth vault
  ```
- browse auth save <name> <url> <user> <pass> Save credentials (encrypted)
+ browse auth save <name> <url> <user> <pass|--password-stdin> Save credentials (encrypted)
  browse auth login <name> Auto-login using saved credentials
  browse auth list List saved credentials
  browse auth delete <name> Delete credentials
@@ -523,6 +375,36 @@ browse record stop Stop recording, keep steps for export
  browse record status Recording state and step count
  browse record export browse [path] Export as chain-compatible JSON (replay with browse chain)
  browse record export replay [path] Export as Chrome DevTools Recorder (Playwright/Puppeteer)
+ browse record export replay --selectors css,aria [path] Filter selector types in export
+ ```
+
+ ### React DevTools
+ ```
+ browse react-devtools enable Enable React DevTools (downloads hook, injects, reloads)
+ browse react-devtools disable Disable React DevTools
+ browse react-devtools tree Component tree with indentation
+ browse react-devtools props <sel> Props/state/hooks of component at element
+ browse react-devtools suspense Suspense boundaries + status
+ browse react-devtools errors Error boundaries + caught errors
+ browse react-devtools profiler Render timing per component
+ browse react-devtools hydration Hydration timing (Next.js)
+ browse react-devtools renders What re-rendered since last commit
+ browse react-devtools owners <sel> Parent component chain
+ browse react-devtools context <sel> Context values consumed by component
+ ```
+
+ ### Cloud Providers
+ ```
+ browse provider save <name> <key> Save provider API key (encrypted)
+ browse provider list List saved providers
+ browse provider delete <name> Delete provider key
+ ```
+
+ ### Handoff (human takeover)
+ ```
+ browse handoff [reason] Swap to Chrome for user to solve CAPTCHA/MFA (bypasses bot detection)
+ browse handoff --chromium Force Playwright Chromium instead of Chrome
+ browse resume Swap back to headless, returns fresh snapshot
  ```
  
  ### Server management
@@ -530,7 +412,7 @@ browse record export replay [path] Export as Chrome DevTools Recorder (Playwr
  browse status Server health, uptime, session count
  browse instances List all running browse servers (instance, PID, port, status)
  browse version Print CLI version
- browse doctor System check (Bun, Playwright, Chromium)
+ browse doctor System check (Node, Playwright, Chromium)
  browse upgrade Self-update via npm
  browse stop Shutdown server
  browse restart Kill + restart server
@@ -549,87 +431,19 @@ browse inspect Open DevTools (requires BROWSE_DEBUG_PORT)
  | `--allowed-domains <d,d>` | Block navigation/resources outside allowlist |
  | `--max-output <n>` | Truncate output to N characters |
  | `--headed` | Run browser in headed (visible) mode |
- | `--runtime <name>` | Browser engine: playwright (default), rebrowser (stealth) |
-
- ## Speed Rules
-
- 1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `css`, `screenshot` all run against the loaded page instantly.
- 2. **Use `snapshot -i` for interaction.** Get refs for all interactive elements, then click/fill by ref. No need to guess CSS selectors.
- 3. **Use `snapshot -C` for SPAs.** Catches cursor:pointer divs and onclick handlers that ARIA misses.
- 4. **Use `js` for precision.** `js "document.querySelector('.price').textContent"` is faster than parsing full page text.
- 5. **Use `links` to survey.** Faster than `text` when you just need navigation structure.
- 6. **Use `chain` for multi-step flows.** Avoids CLI overhead per step.
- 7. **Use `responsive` for layout checks.** One command = 3 viewport screenshots.
- 8. **Use `--session` for parallel work.** Multiple agents can browse simultaneously without interference.
- 9. **Use `value`/`count` instead of `js`.** Purpose-built commands are cleaner than `js "document.querySelector(...).value"`.
- 10. **Use `frame` for iframes.** Don't try to reach into iframes with CSS — use `frame [id=x]` first.
-
- ## When to Use What
-
- | Task | Commands |
- |------|----------|
- | Read a page | `goto <url>` then `text` |
- | Interact with elements | `snapshot -i` then `click @e3` |
- | Find hidden clickables | `snapshot -i -C` then `click @e15` |
- | Check if element exists | `count ".thing"` |
- | Get input value | `value "[id=email]"` |
- | Extract specific data | `js "document.querySelector('.price').textContent"` |
- | Visual check | `screenshot` then `Read .browse/sessions/default/screenshot.png` |
- | Fill and submit form | `snapshot -i` → `fill @e4 "val"` → `click @e5` |
- | Check/uncheck boxes | `check @e7` / `uncheck @e7` |
- | Check CSS | `css "selector" "property"` or `css @e3 "property"` |
- | Inspect DOM | `html "selector"` or `attrs @e3` |
- | Debug console errors | `console` |
- | Check network requests | `network` |
- | Mock API responses | `route "**/api/*" fulfill 200 '{"data":[]}'` |
- | Block ads/trackers | `route "**/*.doubleclick.net/*" block` |
- | Test offline behavior | `offline on` → test → `offline off` |
- | Interact in iframe | `frame "[id=payment]"` → `fill @e2 "4242..."` → `frame main` |
- | Check local dev | `goto http://127.0.0.1:3000` |
- | Compare two pages | `diff <url1> <url2>` |
- | Mobile layout check | `responsive .browse/sessions/default/resp` |
- | Test on mobile device | `emulate iphone` → `goto <url>` → `screenshot` |
- | Save/restore session | `state save mysite` / `state load mysite` |
- | Auto-login | `auth save gh https://github.com/login user pass` → `auth login gh` |
- | Record network | `har start` → browse around → `har stop ./out.har` |
- | Record video | `video start ./vids` → browse around → `video stop` |
- | Export automation script | `record start` → browse around → `record export replay ./recording.json` |
- | Parallel agents | `--session agent-a <cmd>` / `--session agent-b <cmd>` |
- | Multi-step flow | `echo '[...]' \| browse chain` |
- | Secure browsing | `--allowed-domains example.com goto https://example.com` |
- | Scroll through results | `scroll down` → `text` → `scroll down` → `text` |
- | Drag and drop | `drag @e1 @e2` |
- | Read/write clipboard | `clipboard` / `clipboard write "text"` |
- | Find by accessibility | `find role button` / `find text "Submit"` |
- | Visual regression | `screenshot-diff baseline.png` |
- | Debug with DevTools | `inspect` (set BROWSE_DEBUG_PORT first) |
- | Get element position | `box @e3` |
- | Check page errors | `errors` |
- | Right-click context menu | `rightclick @e3` |
- | Test mobile gestures | `emulate iphone` → `tap @e1` / `swipe down` |
- | Set dark mode | `set media dark` |
- | Test geolocation | `set geo 37.7 -122.4` → verify in page |
- | Export/import cookies | `cookie export ./cookies.json` / `cookie import ./cookies.json` |
- | Limit output size | `--max-output 5000 text` |
- | See the browser | `browse --headed goto <url>` |
- | Bypass bot detection | `--runtime rebrowser goto <url>` |
- | Persistent login state | `--profile mysite` → browse around → close → reopen (still logged in) |
-
- ## Architecture
-
- - Persistent Chromium daemon on localhost (port 9400-10400)
- - Bearer token auth per session
- - One server per project directory — `--session` handles agent isolation
- - Session multiplexing: multiple agents share one Chromium via isolated BrowserContexts
- - For separate servers: set `BROWSE_INSTANCE` env var (e.g., fault isolation between teams)
- - `browse instances` — discover all running servers (PID, port, status, session count)
- - Project-local state: `.browse/` directory at project root (auto-created, self-gitignored)
- - `sessions/{id}/` — per-session screenshots, logs, PDFs
- - `states/{name}.json` — saved browser state (cookies + localStorage)
- - `browse-server.json` — server PID, port, auth token
- - Auto-shutdown when all sessions idle past 30 min
- - Chromium crash → server exits → auto-restarts on next command
- - AI-friendly error messages: Playwright errors rewritten to actionable hints
- - CDP remote connection: `BROWSE_CDP_URL` to connect to existing Chrome
- - Policy enforcement: `browse-policy.json` for allow/deny/confirm rules
- - Two browser engines: playwright (default) and rebrowser (stealth, bypasses bot detection)
+ | `--chrome` | Launch system Chrome (real browser, bypasses bot detection) |
+ | `--cdp <port>` | Connect to Chrome on a specific debugging port |
+ | `--connect` | Auto-discover and connect to a running Chrome instance |
+ | `--provider <name>` | Cloud browser provider (browserless, browserbase) |
+ | `--runtime <name>` | Browser engine: playwright (default), rebrowser (stealth), lightpanda, chrome |
+ | `--mcp` | Run as MCP server (for Cursor, Windsurf, Cline) |
+
+ ## Reference Files
+
+ For extended examples, operational guides, or first-time setup:
+
+ | File | Read when... |
+ |------|-------------|
+ | [references/permissions.md](references/permissions.md) | First-time setup user wants to pre-allow browse commands in `.claude/settings.json` |
+ | [references/guides.md](references/guides.md) | You hit a CAPTCHA/MFA blocker (handoff protocol), want optimization tips (speed rules), or need help choosing which command to use (decision table) |
+ | [references/commands.md](references/commands.md) | You want extended usage examples beyond the Quick Reference above |
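
Reviewer sketch for the four `browse wait --download` forms added in the Command Reference hunk (no argument, a path, a timeout in ms, or both). The diff shows the accepted shapes but not how the CLI tells a path from a timeout. The TypeScript sketch below assumes one plausible rule, that an all-digit argument is the timeout and anything else is the save path. The function `parseDownloadArgs` and the `DownloadWaitOptions` type are hypothetical and do not describe the package's actual parser.

```typescript
// Hypothetical result type: both fields stay undefined for a bare `--download`.
interface DownloadWaitOptions {
  savePath?: string;
  timeoutMs?: number;
}

// Assumed rule (not confirmed by the diff): an all-digit argument is a timeout
// in milliseconds; any other argument is treated as the path to save to.
function parseDownloadArgs(args: string[]): DownloadWaitOptions {
  const options: DownloadWaitOptions = {};
  for (const arg of args) {
    if (/^\d+$/.test(arg)) {
      options.timeoutMs = Number(arg);
    } else {
      options.savePath = arg;
    }
  }
  return options;
}
```

Under this rule a file whose name consists only of digits would be read as a timeout, so a real parser may need a stricter convention, such as requiring a `./` prefix on paths.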