@ulpi/browse 1.0.6 → 1.1.1

@@ -0,0 +1,482 @@
1
+ # browse — Full Command Reference
2
+
3
+ Read this file when you need command syntax not covered in the SKILL.md Quick Reference, or need exact flags for a specific command category.
4
+
5
+ ## Quick Reference (all examples)
6
+
7
+ ```bash
8
+ # Navigate to a page
9
+ browse goto https://example.com
10
+
11
+ # Read cleaned page text
12
+ browse text
13
+
14
+ # Take a screenshot (saved to .browse/sessions/default/screenshot.png)
15
+ browse screenshot
16
+
17
+ # Snapshot: accessibility tree with refs
18
+ browse snapshot -i
19
+
20
+ # Click by ref (after snapshot)
21
+ browse click @e3
22
+
23
+ # Fill by ref
24
+ browse fill @e4 "test@test.com"
25
+
26
+ # Double-click, focus, check/uncheck
27
+ browse dblclick @e3
28
+ browse focus @e5
29
+ browse check @e7
30
+ browse uncheck @e7
31
+
32
+ # Drag and drop
33
+ browse drag @e1 @e2
34
+
35
+ # Run JavaScript
36
+ browse js "document.title"
37
+
38
+ # Get all links
39
+ browse links
40
+
41
+ # Get input value / count elements
42
+ browse value "[id=email]"
43
+ browse count ".search-result"
44
+
45
+ # Click by CSS selector
46
+ browse click "button.submit"
47
+
48
+ # Fill a form by CSS selector (use [id=...] instead of # to avoid shell issues)
49
+ browse fill "[id=email]" "test@test.com"
50
+ browse fill "[id=password]" "abc123"
51
+ browse click "button[type=submit]"
52
+
53
+ # Scroll
54
+ browse scroll up
55
+ browse scroll down
56
+ browse scroll "[id=target]"
57
+
58
+ # Wait for navigation or network
59
+ browse wait ".loaded"
60
+ browse wait --url "**/dashboard"
61
+ browse wait --network-idle
62
+
63
+ # iframe targeting
64
+ browse frame "[id=my-iframe]"
65
+ browse text # reads from inside the iframe
66
+ browse click @e3 # clicks inside the iframe
67
+ browse frame main # back to main page
68
+
69
+ # Highlight an element (visual debugging)
70
+ browse highlight @e5
71
+
72
+ # Download a file
73
+ browse download @e3 ./file.pdf
74
+
75
+ # Network mocking
76
+ browse route "**/*.png" block
77
+ browse route "**/api/data" fulfill 200 '{"mock":true}'
78
+ browse route clear
79
+
80
+ # Offline mode
81
+ browse offline on
82
+ browse offline off
83
+
84
+ # JSON output mode
85
+ browse --json goto https://example.com
86
+
87
+ # Security: content boundaries
88
+ browse --content-boundaries text
89
+
90
+ # Security: domain restriction
91
+ browse --allowed-domains example.com,*.cdn.example.com goto https://example.com
92
+
93
+ # State persistence
94
+ browse state save mysite
95
+ browse state load mysite
96
+ browse state clean # delete states older than 7 days
97
+ browse state clean --older-than 30 # custom threshold
98
+
99
+ # Cookie management
100
+ browse cookie clear # clear all cookies
101
+ browse cookie set auth token --domain .example.com # set with options
102
+ browse cookie export ./cookies.json # export to file
103
+ browse cookie import ./cookies.json # import from file
104
+
105
+ # Cookie import from real browsers (macOS -- Chrome, Arc, Brave, Edge)
106
+ browse cookie-import --list # show installed browsers
107
+ browse cookie-import chrome --domain .example.com # import cookies for a domain
108
+ browse cookie-import arc --domain .github.com # import from Arc
109
+ browse cookie-import chrome --profile "Profile 1" --domain .site.com # specific Chrome profile
110
+
111
+ # Session auto-persistence (named sessions survive restarts)
112
+ browse --session myapp goto https://app.com/login # login...
113
+ browse session-close myapp # state auto-saved (encrypted if BROWSE_ENCRYPTION_KEY set)
114
+ browse --session myapp goto https://app.com/dashboard # cookies auto-restored
115
+
116
+ # Persistent profiles (full browser state, own Chromium)
117
+ browse --profile mysite goto https://app.com # all state persists automatically
118
+ browse --profile mysite snapshot -i # still logged in next time
119
+ browse profile list # list all profiles with size
120
+ browse profile delete old-site # remove a profile
121
+
122
+ # Load state at launch
123
+ browse --state auth.json goto https://app.com # load cookies before first command
124
+
125
+ # Auth vault (credentials never visible to LLM)
126
+ browse auth save github https://github.com/login user pass123
127
+ browse auth login github
128
+
129
+ # HAR recording
130
+ browse har start
131
+ browse goto https://example.com
132
+ browse har stop ./recording.har
133
+
134
+ # Video recording (watch a .webm of the session)
135
+ browse video start ./videos
136
+ browse goto https://example.com
137
+ browse click @e3
138
+ browse video stop
139
+
140
+ # Command recording (export replayable scripts)
141
+ browse record start
142
+ browse goto https://example.com
143
+ browse click "a"
144
+ browse fill "[id=search]" "test query"
145
+ browse record stop
146
+ browse record export replay ./recording.json # replay with: npx @puppeteer/replay ./recording.json
147
+ browse record export browse ./steps.json # replay with: cat steps.json | browse chain
148
+
149
+ # Both together (video + replayable script)
150
+ browse video start ./videos
151
+ browse record start
152
+ browse goto https://example.com
153
+ browse snapshot -i
154
+ browse click @e3
155
+ browse fill "[id=email]" "user@test.com"
156
+ browse record stop
157
+ browse video stop
158
+ browse record export replay ./recording.json
159
+
160
+ # Device emulation
161
+ browse emulate iphone
162
+ browse emulate reset
163
+
164
+ # Parallel sessions
165
+ browse --session agent-a goto https://site1.com
166
+ browse --session agent-b goto https://site2.com
167
+
168
+ # Clipboard
169
+ browse clipboard
170
+ browse clipboard write "copied text"
171
+
172
+ # Find elements semantically
173
+ browse find role button
174
+ browse find text "Submit"
175
+ browse find testid "login-btn"
176
+
177
+ # Screenshot diff (visual regression)
178
+ browse screenshot-diff baseline.png current.png
179
+
180
+ # Headed mode (visible browser)
181
+ browse --headed goto https://example.com
182
+
183
+ # Handoff (human takeover for CAPTCHA/MFA -- see guides.md for protocol)
184
+ browse handoff "stuck on CAPTCHA"
185
+ browse resume
186
+
187
+ # React debugging
188
+ browse react-devtools enable
189
+ browse react-devtools tree
190
+ browse react-devtools props @e3
191
+ browse react-devtools suspense
192
+ browse react-devtools disable
193
+
194
+ # Stealth mode (bypasses bot detection)
195
+ browse --runtime rebrowser goto https://example.com
196
+
197
+ # State list / show
198
+ browse state list
199
+ browse state show mysite
200
+ ```
201
+
202
+ ## Navigation
203
+ ```
204
+ browse goto <url> Navigate current tab
205
+ browse back Go back
206
+ browse forward Go forward
207
+ browse reload Reload page
208
+ browse url Print current URL
209
+ ```
210
+
211
+ ## Content extraction
212
+ ```
213
+ browse text Cleaned page text (no scripts/styles)
214
+ browse html [selector] innerHTML of element, or full page HTML
215
+ browse links All links as "text -> href"
216
+ browse forms All forms + fields as JSON
217
+ browse accessibility Accessibility tree snapshot (ARIA)
218
+ ```
219
+
220
+ ## Snapshot (ref-based element selection)
221
+ ```
222
+ browse snapshot Full accessibility tree with @refs
223
+ browse snapshot -i Interactive elements only -- terse flat list (minimal tokens)
224
+ browse snapshot -i -f Interactive elements -- full indented tree with props
225
+ browse snapshot -i -V Interactive elements -- viewport only (skip below-fold)
226
+ browse snapshot -c Compact (no empty structural elements)
227
+ browse snapshot -C Cursor-interactive (detect divs with cursor:pointer/onclick/tabindex)
228
+ browse snapshot -d <N> Limit depth to N levels
229
+ browse snapshot -s <sel> Scope to CSS selector
230
+ browse snapshot-diff Compare current vs previous snapshot
231
+ ```
232
+
233
+ After snapshot, use @refs as selectors in any command:
234
+ ```
235
+ browse click @e3 Click the element assigned ref @e3
236
+ browse fill @e4 "value" Fill the input assigned ref @e4
237
+ browse hover @e1 Hover the element assigned ref @e1
238
+ browse html @e2 Get innerHTML of ref @e2
239
+ browse css @e5 "color" Get computed CSS of ref @e5
240
+ browse attrs @e6 Get attributes of ref @e6
241
+ ```
242
+
243
+ Refs are invalidated on navigation -- run `snapshot` again after `goto`.
244
+
245
+ ## Interaction
246
+ ```
247
+ browse click <selector> Click element (CSS selector or @ref)
248
+ browse click <x>,<y> Click at page coordinates (e.g. 590,461)
249
+ browse rightclick <selector> Right-click element (context menu)
250
+ browse dblclick <selector> Double-click element
251
+ browse fill <selector> <value> Fill input field
252
+ browse select <selector> <val> Select dropdown value
253
+ browse hover <selector> Hover over element
254
+ browse focus <selector> Focus element
255
+ browse tap <selector> Tap element (requires touch context via emulate)
256
+ browse check <selector> Check checkbox
257
+ browse uncheck <selector> Uncheck checkbox
258
+ browse drag <src> <tgt> Drag source to target
259
+ browse type <text> Type into focused element
260
+ browse press <key> Press key (Enter, Tab, Escape, etc.)
261
+ browse keydown <key> Hold key down
262
+ browse keyup <key> Release key
263
+ browse keyboard inserttext <t> Insert text without key events
264
+ browse scroll [sel|up|down] Scroll element/viewport/bottom
265
+ browse scrollinto <sel> Scroll element into view (explicit)
266
+ browse scrollintoview <sel> Alias for scrollinto
267
+ browse swipe <dir> [px] Swipe up/down/left/right (touch events)
268
+ browse mouse move <x> <y> Move mouse to coordinates
269
+ browse mouse down [button] Press mouse button (left/right/middle)
270
+ browse mouse up [button] Release mouse button
271
+ browse mouse wheel <dy> [dx] Scroll wheel
272
+ browse wait <sel> Wait for element to appear
273
+ browse wait <sel> --state hidden Wait for element to disappear
274
+ browse wait <ms> Wait for milliseconds
275
+ browse wait --text "..." Wait for text to appear in page
276
+ browse wait --fn "expr" Wait for JavaScript condition
277
+ browse wait --load <state> Wait for load state
278
+ browse wait --url <pattern> Wait for URL match
279
+ browse wait --network-idle Wait for network idle
280
+ browse wait --download Wait for download, return temp path
281
+ browse wait --download ./report.pdf Wait and save to path
282
+ browse wait --download 60000 Custom timeout (ms)
283
+ browse wait --download ./file.pdf 60000 Both path and timeout
284
+ browse set geo <lat> <lng> Set geolocation
285
+ browse set media <scheme> Set color scheme (dark/light/no-preference)
286
+ browse header <name>:<value> Set request header
287
+ browse useragent <string> Set user agent string
288
+ browse viewport <WxH> Set viewport size (e.g. 375x812)
289
+ browse upload <sel> <files> Upload file(s) to a file input
290
+ browse highlight <selector> Highlight element (visual debugging)
291
+ browse download <sel> [path] Download file triggered by click
292
+ browse dialog-accept [value] Set dialogs to auto-accept
293
+ browse dialog-dismiss Set dialogs to auto-dismiss (default)
294
+ browse emulate <device> Emulate device (iphone, pixel, etc.)
295
+ browse emulate reset Reset to desktop (1920x1080)
296
+ browse offline [on|off] Toggle offline mode
297
+ ```
298
+
299
+ ## Cookies
300
+ ```
301
+ browse cookie <n>=<v> Set cookie (shorthand)
302
+ browse cookie set <n> <v> [--domain d --secure] Set cookie with options
303
+ browse cookie clear Clear all cookies
304
+ browse cookie export <file> Export cookies to JSON file
305
+ browse cookie import <file> Import cookies from JSON file
306
+ ```
307
+
308
+ ## Network
309
+ ```
310
+ browse route <pattern> block Block matching requests
311
+ browse route <pattern> fulfill <s> [b] Mock with status + body
312
+ browse route clear Remove all routes
313
+ ```
314
+
315
+ ## Inspection
316
+ ```
317
+ browse js <expression> Run JS, print result
318
+ browse eval <js-file> Run JS file against page
319
+ browse css <selector> <prop> Get computed CSS property
320
+ browse attrs <selector> Get element attributes as JSON
321
+ browse element-state <selector> Element state (visible/enabled/checked/focused)
322
+ browse value <selector> Get input field value
323
+ browse count <selector> Count matching elements
324
+ browse box <selector> Get bounding box as JSON {x, y, width, height}
325
+ browse dialog Last dialog info or "(no dialog detected)"
326
+ browse console [--clear] View/clear console messages
327
+ browse errors [--clear] View/clear page errors (filtered from console)
328
+ browse network [--clear] View/clear network requests
329
+ browse cookies Dump all cookies as JSON
330
+ browse storage [set <k> <v>] View/set localStorage
331
+ browse perf Page load performance timings
332
+ browse devices [filter] List available device names
333
+ browse clipboard Read system clipboard text
334
+ browse clipboard write <text> Write text to system clipboard
335
+ ```
336
+
337
+ ## Visual
338
+ ```
339
+ browse screenshot [path] Viewport screenshot (default: .browse/sessions/{id}/screenshot.png)
340
+ browse screenshot --full [path] Full-page screenshot (entire scrollable page)
341
+ browse screenshot <sel|@ref> [path] Screenshot specific element
342
+ browse screenshot --clip x,y,w,h [path] Screenshot clipped region
343
+ browse screenshot --annotate [path] Screenshot with numbered badges + legend
344
+ browse pdf [path] Save as PDF
345
+ browse responsive [prefix] Screenshots at mobile/tablet/desktop
346
+ ```
347
+
348
+ ## Frames (iframe targeting)
349
+ ```
350
+ browse frame <selector> Target an iframe (subsequent commands run inside it)
351
+ browse frame main Return to main page
352
+ ```
353
+
354
+ ## Find (semantic element locators)
355
+ ```
356
+ browse find role <query> Find elements by ARIA role
357
+ browse find text <query> Find elements by text content
358
+ browse find label <query> Find elements by label
359
+ browse find placeholder <query> Find elements by placeholder
360
+ browse find testid <query> Find elements by test ID
361
+ browse find alt <query> Find elements by alt text
362
+ browse find title <query> Find elements by title attribute
363
+ browse find first <sel> First matching element
364
+ browse find last <sel> Last matching element
365
+ browse find nth <n> <sel> Nth matching element (0-indexed)
366
+ ```
367
+
368
+ ## Compare
369
+ ```
370
+ browse diff <url1> <url2> Text diff between two pages
371
+ browse screenshot-diff <base> [curr] Pixel-diff two PNG screenshots
372
+ ```
373
+
374
+ ## Multi-step (chain)
375
+ ```
376
+ echo '[["goto","https://example.com"],["snapshot","-i"],["click","@e1"]]' | browse chain
377
+ ```
378
+
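For longer flows, the step list is easier to read when a small helper builds it before piping it to `browse chain`. The sketch below assumes `chain` accepts any valid JSON array on stdin, including one spread over several lines. `login_chain_steps` and `run_login_chain` are hypothetical helper names, and the refs `@e4` and `@e3` are placeholders for whatever the snapshot actually reports.

```shell
# Hypothetical helper: prints the steps as a JSON array of [command, args...].
login_chain_steps() {
  cat <<'JSON'
[
  ["goto", "https://example.com"],
  ["snapshot", "-i"],
  ["fill", "@e4", "test@test.com"],
  ["click", "@e3"]
]
JSON
}

# Pipe the whole list to a single `browse chain` invocation.
run_login_chain() {
  login_chain_steps | browse chain
}
```

Calling `run_login_chain` sends all four steps in one CLI invocation, which is the same pattern as the one-line `echo ... | browse chain` form.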
379
+ ## Tabs
380
+ ```
381
+ browse tabs List tabs (id, url, title)
382
+ browse tab <id> Switch to tab
383
+ browse newtab [url] Open new tab
384
+ browse closetab [id] Close tab
385
+ ```
386
+
387
+ ## Sessions (parallel agents)
388
+ ```
389
+ browse --session <id> <cmd> Run command in named session
390
+ browse sessions List active sessions
391
+ browse session-close <id> Close a session
392
+ ```
393
+
394
+ ## Profiles
395
+ ```
396
+ browse --profile <name> <cmd> Use persistent browser profile
397
+ browse profile list List profiles with disk size
398
+ browse profile delete <name> Delete a profile
399
+ browse profile clean [--older-than <d>] Remove old profiles (default: 7 days)
400
+ ```
401
+
402
+ ## State persistence
403
+ ```
404
+ browse state save [name] Save cookies + localStorage (all origins)
405
+ browse state load [name] Restore saved state
406
+ browse state list List saved states
407
+ browse state show [name] Show contents of saved state
408
+ browse state clean Delete states older than 7 days
409
+ browse state clean --older-than N Custom age threshold (days)
410
+ ```
411
+
412
+ ## Cookie import (macOS -- borrow auth from real browsers)
413
+ ```
414
+ browse cookie-import --list List installed browsers
415
+ browse cookie-import <browser> --domain <d> Import cookies for a domain
416
+ browse cookie-import <browser> --profile <p> --domain <d> Specific Chrome profile
417
+ ```
418
+
419
+ ## Auth vault
420
+ ```
421
+ browse auth save <name> <url> <user> <pass|--password-stdin> Save credentials (encrypted)
422
+ browse auth login <name> Auto-login using saved credentials
423
+ browse auth list List saved credentials
424
+ browse auth delete <name> Delete credentials
425
+ ```
426
+
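The `--password-stdin` form keeps the password out of the command line. A minimal sketch, assuming the flag reads the password from standard input as its name suggests: `auth_save_from_env` is a hypothetical helper, and `BROWSE_PASSWORD` is an illustrative variable name, not one the CLI defines.

```shell
# Hypothetical helper: takes the password from $BROWSE_PASSWORD (an assumed
# variable name) so it never appears in the process argument list.
auth_save_from_env() {
  name=$1
  url=$2
  user=$3
  if [ -z "${BROWSE_PASSWORD:-}" ]; then
    echo "BROWSE_PASSWORD is not set" >&2
    return 1
  fi
  # The password travels over stdin; only name, URL and user are arguments.
  printf '%s' "$BROWSE_PASSWORD" |
    browse auth save "$name" "$url" "$user" --password-stdin
}
```

Usage would look like `auth_save_from_env github https://github.com/login user`, followed by `browse auth login github`.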
427
+ ## HAR recording
428
+ ```
429
+ browse har start Start recording network traffic
430
+ browse har stop [path] Stop and save HAR file
431
+ ```
432
+
433
+ ## Video recording
434
+ ```
435
+ browse video start [dir] Start recording video (WebM, compositor-level)
436
+ browse video stop Stop recording and save video files
437
+ browse video status Check if recording is active
438
+ ```
439
+
440
+ ## Command recording & export
441
+ ```
442
+ browse record start Start recording commands
443
+ browse record stop Stop recording, keep steps for export
444
+ browse record status Recording state and step count
445
+ browse record export browse [path] Export as chain-compatible JSON (replay with browse chain)
446
+ browse record export replay [path] Export as Chrome DevTools Recorder JSON (for Playwright/Puppeteer)
447
+ ```
448
+
449
+ ## React DevTools
450
+ ```
451
+ browse react-devtools enable Enable React DevTools (downloads hook, injects, reloads)
452
+ browse react-devtools disable Disable React DevTools
453
+ browse react-devtools tree Component tree with indentation
454
+ browse react-devtools props <sel> Props/state/hooks of component at element
455
+ browse react-devtools suspense Suspense boundaries + status
456
+ browse react-devtools errors Error boundaries + caught errors
457
+ browse react-devtools profiler Render timing per component
458
+ browse react-devtools hydration Hydration timing (Next.js)
459
+ browse react-devtools renders What re-rendered since last commit
460
+ browse react-devtools owners <sel> Parent component chain
461
+ browse react-devtools context <sel> Context values consumed by component
462
+ ```
463
+
464
+ ## Server management
465
+ ```
466
+ browse status Server health, uptime, session count
467
+ browse instances List all running browse servers (instance, PID, port, status)
468
+ browse version Print CLI version
469
+ browse doctor System check (Node, Playwright, Chromium)
470
+ browse upgrade Self-update via npm
471
+ browse stop Shutdown server
472
+ browse restart Kill + restart server
473
+ browse inspect Open DevTools (requires BROWSE_DEBUG_PORT)
474
+ ```
475
+
476
+ ## Handoff (human takeover)
477
+ ```
478
+ browse handoff [reason] Swap to visible browser for user to solve CAPTCHA/MFA
479
+ browse resume Swap back to headless, returns fresh snapshot
480
+ ```
481
+
482
+ See [guides.md](guides.md) for the mandatory handoff protocol.
@@ -0,0 +1,177 @@
1
+ # browse — Operational Guides
2
+
3
+ Read this file when you need the handoff protocol, optimization tips, or help choosing which command to use.
4
+
5
+ ## Handoff Protocol (MANDATORY)
6
+
7
+ Read this section when you hit CAPTCHA, MFA, OAuth, or any blocker after 2-3 failed attempts. The server auto-suggests handoff after 3 consecutive failures (look for HINT in error messages).
8
+
9
+ When the browser hits a blocker you can't solve, you MUST follow this exact 3-step protocol. Do NOT skip any step.
10
+
11
+ ### Step 1 — Ask permission (REQUIRED before handoff)
12
+
13
+ Use `AskUserQuestion` (or your platform's equivalent interactive prompt tool) to ask
14
+ the user before opening the browser. Do NOT just print text and proceed — you MUST
15
+ wait for an explicit response.
16
+
17
+ ```
18
+ AskUserQuestion:
19
+   question: "I'm stuck on a CAPTCHA at [URL]. Can I open a visible browser so you can solve it?"
20
+   options:
21
+     - label: "Yes, open browser"
22
+       description: "Opens a visible Chrome window with your current session"
23
+     - label: "No, try something else"
24
+       description: "I'll try cookie-import, auth login, or a different approach"
25
+ ```
26
+
27
+ If your platform does not have `AskUserQuestion`, ask the user via text and wait for
28
+ their response before proceeding. Do NOT run handoff without explicit user confirmation.
29
+
30
+ If the user says no, try `cookie-import`, `auth login`, or a different approach.
31
+
32
+ ### Step 2 — Handoff + wait for user (REQUIRED)
33
+
34
+ Run the handoff command, then use `AskUserQuestion` (or equivalent) to wait for the
35
+ user to finish:
36
+
37
+ ```bash
38
+ browse handoff "Stuck on CAPTCHA at login page"
39
+ ```
40
+
41
+ Then immediately prompt the user:
42
+
43
+ ```
44
+ AskUserQuestion:
45
+   question: "Browser is open. Please solve the CAPTCHA, then click Done."
46
+   options:
47
+     - label: "Done"
48
+       description: "I've solved it, return to headless mode"
49
+     - label: "Cancel"
50
+       description: "Close the browser, try something else"
51
+ ```
52
+
53
+ Do NOT proceed or do any other work while waiting. The user is interacting with the
54
+ visible browser — wait for their response.
55
+
56
+ ### Step 3 — Resume
57
+
58
+ After the user responds:
59
+
60
+ ```bash
61
+ browse resume
62
+ # Returns fresh snapshot — continue working with it
63
+ ```
64
+
65
+ - If "Done": continue with the fresh snapshot that `resume` returns.
66
+ - If "Cancel": run `resume` anyway (it closes the headed browser), then try an alternative approach.
67
+
68
+ ### When to Handoff
69
+ - CAPTCHA or bot detection blocking progress
70
+ - Multi-factor authentication requiring a physical device
71
+ - OAuth popup that redirects to a third-party login
72
+ - Any blocker after 2-3 failed attempts at the same step
73
+ - The server auto-suggests handoff after 3 consecutive failures (look for HINT in error messages)
74
+
75
+ ### When NOT to Handoff
76
+ - Normal navigation/interaction failures — retry or try a different selector
77
+ - Pages that just need more time to load — use `wait` commands
78
+ - Cookie/auth issues — try `cookie-import` or `auth login` first
79
+
80
+ ### Handoff Rules
81
+ - NEVER run `browse handoff` without asking the user first (Step 1)
82
+ - NEVER proceed without waiting for the user to finish (Step 2)
83
+ - ALWAYS tell the user what they need to do in the visible browser
84
+ - ALWAYS run `browse resume` after the user is done
85
+
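For a plain terminal wrapper with no `AskUserQuestion` tool, the three steps above could be sketched as a shell function. This is an illustration under stated assumptions, not the protocol's official implementation: `handoff_with_consent` is a hypothetical name, the prompts and the `cancel` keyword are invented for the example, and the return codes (1 declined, 2 command failed, 3 cancelled) are arbitrary.

```shell
# Hypothetical helper: ask first, hand off, wait, then always resume.
handoff_with_consent() {
  reason=$1

  # Step 1: ask permission and wait for an explicit answer.
  printf 'Blocked: %s. Open a visible browser? [y/N] ' "$reason" >&2
  read -r answer || answer=""
  case $answer in
    y|Y|yes|Yes) ;;
    *) return 1 ;;  # declined: caller should try cookie-import or auth login
  esac

  # Step 2: hand off, then block until the user reports back.
  browse handoff "$reason" || return 2
  printf 'Solve it in the browser, then press Enter (or type cancel): ' >&2
  read -r outcome || outcome=""

  # Step 3: resume in both cases; this closes the headed browser.
  browse resume || return 2
  if [ "$outcome" = "cancel" ]; then
    return 3
  fi
  return 0
}
```

The function never calls `browse handoff` before an explicit yes, and it runs `browse resume` whether the user finished or cancelled, which mirrors the Handoff Rules.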
86
+ ## Speed Rules
87
+
88
+ Read this section to optimize your command usage and minimize token consumption.
89
+
90
+ 1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `css`, `screenshot` all run against the loaded page instantly.
91
+ 2. **Use `snapshot -i` for interaction.** Get refs for all interactive elements, then click/fill by ref. No need to guess CSS selectors.
92
+ 3. **Use `snapshot -C` for SPAs.** Catches cursor:pointer divs and onclick handlers that ARIA misses.
93
+ 4. **Use `js` for precision.** `js "document.querySelector('.price').textContent"` is faster than parsing full page text.
94
+ 5. **Use `links` to survey.** Faster than `text` when you just need navigation structure.
95
+ 6. **Use `chain` for multi-step flows.** Avoids CLI overhead per step.
96
+ 7. **Use `responsive` for layout checks.** One command = 3 viewport screenshots.
97
+ 8. **Use `--session` for parallel work.** Multiple agents can browse simultaneously without interference.
98
+ 9. **Use `value`/`count` instead of `js`.** Purpose-built commands are cleaner than `js "document.querySelector(...).value"`.
99
+ 10. **Use `frame` for iframes.** Don't try to reach into iframes with CSS; run `frame "[id=x]"` first.
100
+
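Rule 8 can be illustrated with background jobs: one named session per URL, all running at once. This is a rough sketch that assumes `goto` and `text` behave the same under `--session` as in the default session. `text_in_parallel`, the `agent-N` session names, and the `page-N.txt` output files are hypothetical.

```shell
# Hypothetical helper: text_in_parallel <out_dir> <url>...
# Fetches each URL in its own session and saves the page text.
text_in_parallel() {
  out_dir=$1
  shift
  n=0
  for url in "$@"; do
    n=$((n + 1))
    (
      # Each subshell talks to its own isolated session.
      browse --session "agent-$n" goto "$url" >/dev/null &&
        browse --session "agent-$n" text > "$out_dir/page-$n.txt"
    ) &
  done
  wait  # block until every background job has finished
}
```

For example, `text_in_parallel ./out https://site1.com https://site2.com` would leave `./out/page-1.txt` and `./out/page-2.txt`, provided `./out` already exists.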
101
+ ## When to Use What
102
+
103
+ Read this section when you're unsure which command to use for a task.
104
+
105
+ | Task | Commands |
106
+ |------|----------|
107
+ | Read a page | `goto <url>` then `text` |
108
+ | Interact with elements | `snapshot -i` then `click @e3` |
109
+ | Find hidden clickables | `snapshot -i -C` then `click @e15` |
110
+ | Check if element exists | `count ".thing"` |
111
+ | Get input value | `value "[id=email]"` |
112
+ | Extract specific data | `js "document.querySelector('.price').textContent"` |
113
+ | Visual check | `screenshot` then `Read .browse/sessions/default/screenshot.png` |
114
+ | Fill and submit form | `snapshot -i` then `fill @e4 "val"` then `click @e5` |
115
+ | Check/uncheck boxes | `check @e7` / `uncheck @e7` |
116
+ | Check CSS | `css "selector" "property"` or `css @e3 "property"` |
117
+ | Inspect DOM | `html "selector"` or `attrs @e3` |
118
+ | Debug console errors | `console` |
119
+ | Check network requests | `network` |
120
+ | Mock API responses | `route "**/api/*" fulfill 200 '{"data":[]}'` |
121
+ | Block ads/trackers | `route "**/*.doubleclick.net/*" block` |
122
+ | Test offline behavior | `offline on` then test then `offline off` |
123
+ | Interact in iframe | `frame "[id=payment]"` then `fill @e2 "4242..."` then `frame main` |
124
+ | Check local dev | `goto http://127.0.0.1:3000` |
125
+ | Compare two pages | `diff <url1> <url2>` |
126
+ | Mobile layout check | `responsive .browse/sessions/default/resp` |
127
+ | Test on mobile device | `emulate iphone` then `goto <url>` then `screenshot` |
128
+ | Save/restore session | `state save mysite` / `state load mysite` |
129
+ | Auto-login | `auth save gh https://github.com/login user pass` then `auth login gh` |
130
+ | Record network | `har start` then browse then `har stop ./out.har` |
131
+ | Record video | `video start ./vids` then browse then `video stop` |
132
+ | Export automation script | `record start` then browse then `record export replay ./recording.json` |
133
+ | Parallel agents | `--session agent-a <cmd>` / `--session agent-b <cmd>` |
134
+ | Multi-step flow | `echo '[...]' \| browse chain` |
135
+ | Secure browsing | `--allowed-domains example.com goto https://example.com` |
136
+ | Scroll through results | `scroll down` then `text` then `scroll down` then `text` |
137
+ | Drag and drop | `drag @e1 @e2` |
138
+ | Read/write clipboard | `clipboard` / `clipboard write "text"` |
139
+ | Find by accessibility | `find role button` / `find text "Submit"` |
140
+ | Visual regression | `screenshot-diff baseline.png` |
141
+ | Debug with DevTools | `inspect` (set BROWSE_DEBUG_PORT first) |
142
+ | Get element position | `box @e3` |
143
+ | Check page errors | `errors` |
144
+ | Right-click context menu | `rightclick @e3` |
145
+ | Test mobile gestures | `emulate iphone` then `tap @e1` / `swipe down` |
146
+ | Set dark mode | `set media dark` |
147
+ | Test geolocation | `set geo 37.7 -122.4` then verify in page |
148
+ | Export/import cookies | `cookie export ./cookies.json` / `cookie import ./cookies.json` |
149
+ | Limit output size | `--max-output 5000 text` |
150
+ | See the browser | `browse --headed goto <url>` |
151
+ | CAPTCHA / MFA blocker | `handoff "reason"` then user solves then `resume` (see Handoff Protocol above) |
152
+ | Debug React components | `react-devtools enable` then `tree` then `props @e3` |
153
+ | Debug hydration issues | `react-devtools enable` then `hydration` |
154
+ | Find suspense blockers | `react-devtools enable` then `suspense` |
155
+ | Bypass bot detection | `--runtime rebrowser goto <url>` |
156
+ | Persistent login state | `--profile mysite` then browse then close then reopen (still logged in) |
157
+
158
+ ## Architecture
159
+
160
+ Read this section to understand the system design.
161
+
162
+ - Persistent Chromium daemon on localhost (port 9400-10400)
163
+ - Bearer token auth per session
164
+ - One server per project directory — `--session` handles agent isolation
165
+ - Session multiplexing: multiple agents share one Chromium via isolated BrowserContexts
166
+ - For separate servers: set `BROWSE_INSTANCE` env var (e.g., fault isolation between teams)
167
+ - `browse instances` — discover all running servers (PID, port, status, session count)
168
+ - Project-local state: `.browse/` directory at project root (auto-created, self-gitignored)
169
+   - `sessions/{id}/`: per-session screenshots, logs, PDFs
170
+   - `states/{name}.json`: saved browser state (cookies + localStorage)
171
+   - `browse-server.json`: server PID, port, auth token
172
+ - Auto-shutdown when all sessions idle past 30 min
173
+ - If Chromium crashes, the server exits and auto-restarts on the next command
174
+ - AI-friendly error messages: Playwright errors rewritten to actionable hints
175
+ - CDP remote connection: `BROWSE_CDP_URL` to connect to existing Chrome
176
+ - Policy enforcement: `browse-policy.json` for allow/deny/confirm rules
177
+ - Two browser engines: playwright (default) and rebrowser (stealth, bypasses bot detection)