@ulpi/browse 0.7.4 → 0.10.0
- package/LICENSE +1 -1
- package/README.md +449 -283
- package/package.json +1 -1
- package/skill/SKILL.md +113 -5
- package/src/auth-vault.ts +4 -52
- package/src/browser-manager.ts +20 -5
- package/src/bun.d.ts +15 -20
- package/src/chrome-discover.ts +73 -0
- package/src/cli.ts +110 -10
- package/src/commands/meta.ts +247 -9
- package/src/commands/read.ts +28 -0
- package/src/commands/write.ts +236 -16
- package/src/config.ts +0 -1
- package/src/cookie-import.ts +410 -0
- package/src/encryption.ts +48 -0
- package/src/record-export.ts +98 -0
- package/src/server.ts +43 -2
- package/src/session-manager.ts +48 -0
- package/src/session-persist.ts +192 -0
package/README.md
CHANGED
@@ -1,327 +1,499 @@
 # @ulpi/browse
 
-
+Headless browser CLI for AI coding agents. Persistent Chromium daemon via Playwright, ~100ms per command after startup.
 
-
+## Installation
 
-
+### Global Installation (recommended)
 
-
+```bash
+npm install -g @ulpi/browse
+```
 
-
+Requires [Bun](https://bun.sh) runtime. Chromium is installed automatically via Playwright on first `npm install`.
 
-
+### Project Installation (local dependency)
 
-
+```bash
+npm install @ulpi/browse
+```
 
-
-|------|------|-------------------------:|-------------------:|----------:|
-| mumzworld.com | Homepage | ~51,151 | ~15,072 | **3x** |
-| mumzworld.com | Search | ~13,860 | ~3,614 | **4x** |
-| mumzworld.com | PDP | ~10,071 | ~3,084 | **3x** |
-| amazon.com | Homepage | ~10,431 | ~2,150 | **5x** |
-| amazon.com | Search | ~19,458 | ~3,644 | **5x** |
-| ebay.com | Homepage | ~4,641 | ~1,557 | **3x** |
-| ebay.com | Search | ~35,929 | ~7,088 | **5x** |
-| ebay.com | PDP | ~1,294 | ~678 | **2x** |
-| nike.com | Homepage | ~2,495 | ~816 | **3x** |
-| nike.com | Search | ~7,998 | ~2,678 | **3x** |
-| nike.com | PDP | ~3,034 | ~989 | **3x** |
-| **TOTAL** | **11 pages** | **~160,362** | **~41,370** | **4x** |
+Then use via `package.json` scripts or by invoking `browse` directly.
 
-
+### From Source
 
-
-
-
-
-
-
+```bash
+git clone https://github.com/ulpi-io/browse
+cd browse
+bun install
+bun run src/cli.ts goto https://example.com # Dev mode
+bun run build # Build standalone binary
+```
 
-
+## Quick Start
 
-
+```bash
+browse goto https://example.com
+browse snapshot -i # Get interactive elements with refs
+browse click @e2 # Click by ref from snapshot
+browse fill @e3 "test@example.com" # Fill input by ref
+browse text # Get visible page text
+browse screenshot page.png
+browse stop
+```
+
+### The Ref Workflow
 
-
+Every `snapshot` assigns refs (`@e1`, `@e2`, ...) to elements. Use refs as selectors in any command — no CSS selector construction needed:
+
+```bash
+$ browse snapshot -i
+@e1 [button] "Submit"
+@e2 [link] "Home"
+@e3 [textbox] "Email"
+
+$ browse click @e1 # Click the Submit button
+Clicked @e1
+
+$ browse fill @e3 "user@example.com" # Fill the Email field
+Filled @e3
+```
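For an agent framework that consumes the ref listing programmatically, a minimal sketch of parsing it might look like the following. It assumes the terse output has exactly the `@eN [role] "label"` shape shown in the README example. `SnapshotRef`, `parseSnapshot`, and `findRef` are hypothetical names for illustration, not part of the package.

```typescript
// Hypothetical helper, not part of @ulpi/browse: parses the terse
// `snapshot -i` listing from the README example into structured records.
interface SnapshotRef {
  ref: string;   // e.g. "@e1"
  role: string;  // e.g. "button"
  label: string; // e.g. "Submit"
}

// Assumed line shape, taken from the README example: @e1 [button] "Submit"
const REF_LINE = /^(@e\d+)\s+\[([^\]]+)\]\s+"(.*)"$/;

function parseSnapshot(output: string): SnapshotRef[] {
  const found: SnapshotRef[] = [];
  for (const line of output.split("\n")) {
    const match = REF_LINE.exec(line.trim());
    if (match === null) continue; // skip blank or unrecognized lines
    const [, ref = "", role = "", label = ""] = match;
    found.push({ ref, role, label });
  }
  return found;
}

// Look up the ref to pass to a later command such as `browse fill <ref> ...`.
function findRef(refs: SnapshotRef[], role: string, label: string): string | undefined {
  return refs.find((r) => r.role === role && r.label === label)?.ref;
}

const snapshotText = '@e1 [button] "Submit"\n@e2 [link] "Home"\n@e3 [textbox] "Email"';
const emailRef = findRef(parseSnapshot(snapshotText), "textbox", "Email");
console.log(emailRef); // the ref assigned to the Email textbox in this sample
```

Refs are assigned per snapshot, so a lookup like this is only meaningful against the most recent listing; after the page changes, take a new snapshot and parse again.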
 
-###
+### Traditional Selectors (also supported)
 
+```bash
+browse click "#submit"
+browse fill ".email-input" "test@example.com"
+browse click "text=Submit"
 ```
-@playwright/mcp browser_navigate → 51,150 tokens (full snapshot, every time)
 
-
-
-
+## Commands
+
+### Navigation
+
+```bash
+browse goto <url> # Navigate to URL
+browse back # Go back
+browse forward # Go forward
+browse reload # Reload page
+browse url # Get current URL
 ```
 
-
+### Content Extraction
 
-
+```bash
+browse text # Visible text (clean, no DOM mutation)
+browse html [sel] # Full HTML or element innerHTML
+browse links # All links as "text -> href"
+browse forms # Form structure as JSON
+browse accessibility # Raw ARIA snapshot tree
+```
 
-
+### Interaction
 
 ```bash
-
-
-
-
-
-
+browse click <sel> # Click element
+browse rightclick <sel> # Right-click element (context menu)
+browse dblclick <sel> # Double-click element
+browse fill <sel> <val> # Clear and fill input
+browse select <sel> <val> # Select dropdown option
+browse hover <sel> # Hover element
+browse focus <sel> # Focus element
+browse tap <sel> # Tap element (requires touch context via emulate)
+browse check <sel> # Check checkbox
+browse uncheck <sel> # Uncheck checkbox
+browse type <text> # Type text via keyboard (current focus)
+browse press <key> # Press key (Enter, Tab, etc.)
+browse keydown <key> # Key down event
+browse keyup <key> # Key up event
+browse keyboard inserttext <text> # Insert text without key events
+browse scroll [sel|up|down] # Scroll element into view or direction
+browse scrollinto <sel> # Scroll element into view (explicit)
+browse swipe <dir> [px] # Swipe up/down/left/right (touch events)
+browse drag <src> <tgt> # Drag and drop
+browse highlight <sel> # Highlight element with visual overlay
+browse download <sel> [path] # Download file triggered by click
+browse upload <sel> <files...> # Upload files to input
+```
 
-
-Filled @e3
+### Mouse Control
 
-
-
+```bash
+browse mouse move <x> <y> # Move mouse to coordinates
+browse mouse down [button] # Press mouse button (left/right/middle)
+browse mouse up [button] # Release mouse button
+browse mouse wheel <dy> [dx] # Scroll wheel
 ```
 
-###
+### Settings
+
+```bash
+browse set geo <lat> <lng> # Set geolocation
+browse set media <scheme> # Set color scheme (dark/light/no-preference)
+```
 
-
+### Wait
 
 ```bash
-
-
-
+browse wait <selector> # Wait for element
+browse wait <selector> --state hidden # Wait for element to disappear
+browse wait <ms> # Wait for milliseconds
+browse wait --url <pattern> # Wait for URL
+browse wait --text "Welcome" # Wait for text to appear in page
+browse wait --fn "js expr" # Wait for JavaScript condition
+browse wait --load <state> # Wait for load state (load/domcontentloaded/networkidle)
+browse wait --network-idle # Wait for network idle
+```
+
+### Snapshot
 
-
-
-
-
+```bash
+browse snapshot # Full accessibility tree
+browse snapshot -i # Interactive elements only (terse flat list)
+browse snapshot -i -f # Interactive elements, full indented tree
+browse snapshot -i -C # Include cursor-interactive elements (onclick, cursor:pointer)
+browse snapshot -V # Viewport only — elements visible on screen
+browse snapshot -c # Compact — remove empty structural elements
+browse snapshot -d 3 # Limit depth to 3 levels
+browse snapshot -s "#main" # Scope to CSS selector
+browse snapshot -i -c -d 5 # Combine options
 ```
 
-
+| Flag | Description |
+|------|-------------|
+| `-i` | Interactive elements only (buttons, links, inputs) — terse flat list |
+| `-f` | Full — indented tree with props and children (use with `-i`) |
+| `-V` | Viewport — only elements visible in current viewport |
+| `-c` | Compact — remove empty structural elements |
+| `-C` | Cursor-interactive — detect divs with `cursor:pointer`, `onclick`, `tabindex` |
+| `-d N` | Limit tree depth |
+| `-s <sel>` | Scope to CSS selector |
 
-
+The `-C` flag catches modern SPA patterns that ARIA trees miss — `<div onclick>`, `cursor: pointer`, `tabindex`, and `data-action` elements.
 
-
+### Find Elements
 
-
-
-
-
-
-
-
-
-
-
-
-
-| Input value | `browser_evaluate` + custom JS | `value <sel>` |
-| Element count | `browser_evaluate` + custom JS | `count <sel>` |
-| iframe targeting | Not available | `frame <sel>` / `frame main` |
-| Network mocking | Not available | `route <pattern> block\|fulfill` |
-| Offline mode | Not available | `offline on\|off` |
-| State persistence | Not available | `state save\|load` |
-| Credential vault | Not available | `auth save\|login\|list` |
-| HAR recording | Not available | `har start\|stop` |
-| Video recording | Not available | `video start [dir]\|stop\|status` |
-| Clipboard access | Not available | `clipboard [write <text>]` |
-| Element finding | Not available | `find role\|text\|label\|placeholder\|testid` |
-| DevTools inspect | Not available | `inspect` |
-| Domain restriction | Not available | `--allowed-domains` |
-| Prompt injection defense | Not available | `--content-boundaries` |
-| JSON output mode | Not available | `--json` |
+```bash
+browse find role <role> [name] # By ARIA role
+browse find text <text> # By text content
+browse find label <label> # By label
+browse find placeholder <placeholder> # By placeholder
+browse find testid <id> # By data-testid
+browse find alt <text> # By alt text
+browse find title <text> # By title attribute
+browse find first <sel> # First matching element
+browse find last <sel> # Last matching element
+browse find nth <n> <sel> # Nth matching element (0-indexed)
+```
 
-###
+### Inspection
 
+```bash
+browse js <expr> # Evaluate JavaScript expression
+browse eval <file> # Evaluate JavaScript file
+browse css <sel> <prop> # Get computed CSS property
+browse attrs <sel> # Get element attributes as JSON
+browse element-state <sel> # Element state (visible, enabled, checked, etc.)
+browse value <sel> # Get input/select value
+browse count <sel> # Count elements matching selector
+browse box <sel> # Get bounding box as JSON {x, y, width, height}
+browse clipboard [write <text>] # Read or write clipboard
+browse console [--clear] # Console log buffer
+browse errors [--clear] # Page errors only (filtered from console)
+browse network [--clear] # Network request buffer
+browse cookies # Browser cookies as JSON
+browse storage [set <k> <v>] # localStorage/sessionStorage
+browse perf # Navigation timing (dns, ttfb, load)
+browse devices [filter] # List available device names
 ```
-
-
+
+### Visual
+
+```bash
+browse screenshot [path] # Take screenshot (viewport)
+browse screenshot --full [path] # Full-page screenshot
+browse screenshot <sel|@ref> [path] # Screenshot specific element
+browse screenshot --clip x,y,w,h [path] # Screenshot clipped region
+browse screenshot --annotate [path] # Annotated screenshot with numbered labels
+browse pdf [path] # Save page as PDF
+browse responsive [prefix] # Mobile/tablet/desktop screenshots
 ```
 
-
+### Compare
 
-
+```bash
+browse diff <url1> <url2> # Text diff between two pages
+browse snapshot-diff # Diff current vs last snapshot
+browse screenshot-diff <baseline> [current] # Pixel-level visual diff
+```
 
-
+### Tabs
 
 ```bash
-#
-browse
-browse
-browse
-
+browse tabs # List all tabs
+browse tab <id> # Switch to tab
+browse newtab [url] # Open new tab
+browse closetab [id] # Close tab
+```
 
-
-browse --session agent-b goto https://www.amazon.com
-browse --session agent-b snapshot -i
-browse --session agent-b fill @e6 "baby stroller"
-browse --session agent-b press Enter
+### Frames
 
-
-
-browse
+```bash
+browse frame <sel> # Switch to iframe
+browse frame main # Back to main frame
 ```
 
-
+### Device Emulation
 
+```bash
+browse emulate "iPhone 14" # Emulate device
+browse emulate reset # Reset to desktop (1920x1080)
+browse devices # List all available devices
+browse devices iphone # Filter device list
+browse viewport 1280x720 # Set viewport size
 ```
-
-
-
-
-
-
-
-
+
+100+ devices: iPhone 12–17, Pixel 5–7, iPad, Galaxy, and all Playwright built-ins.
+
+### Cookies
+
+```bash
+browse cookie <name>=<value> # Set cookie (simple)
+browse cookie set <n> <v> [--domain --secure ...] # Set cookie with options
+browse cookie clear # Clear all cookies
+browse cookie export <file> # Export cookies to JSON
+browse cookie import <file> # Import cookies from JSON
+browse cookies # Read all cookies
 ```
 
-
+### Network
+
 ```bash
-browse
-browse
-browse
+browse route <pattern> block # Block matching requests
+browse route <pattern> fulfill <status> [body] # Mock response
+browse route clear # Remove all routes
+browse offline [on|off] # Toggle offline mode
+browse header <name>:<value> # Set extra HTTP header
+browse useragent <string> # Set user agent
 ```
 
-
+### Dialogs
 
-
+```bash
+browse dialog # Last dialog info
+browse dialog-accept [text] # Accept next dialog (optional prompt text)
+browse dialog-dismiss # Dismiss next dialog
+```
 
-
+### Recording
 
 ```bash
-
+browse har start # Start HAR recording
+browse har stop [path] # Stop and save HAR file
+
+browse video start [dir] # Start video recording (WebM)
+browse video stop # Stop recording
+browse video status # Check recording status
+
+browse record start # Record browsing commands as you go
+browse record stop # Stop recording
+browse record status # Check recording status
+browse record export browse [path] # Export as chain-compatible JSON (replay with browse chain)
+browse record export replay [path] # Export as Chrome DevTools Recorder (Playwright/Puppeteer)
 ```
 
-
+### State & Auth
 
-
+```bash
+browse state save [name] # Save cookies + localStorage
+browse state load [name] # Restore saved state
+browse state list # List saved states
+browse state show [name] # Show state details
+
+browse auth save <name> <url> <user> <pass> # Save encrypted credential
+browse auth save <name> <url> <user> --password-stdin # Password from stdin
+browse auth login <name> # Auto-login with saved credential
+browse auth list # List saved credentials
+browse auth delete <name> # Delete credential
+```
 
-
+### Multi-Step (Chaining)
+
+Execute a sequence of commands in one call:
 
 ```bash
-
+echo '[["goto","https://example.com"],["snapshot","-i"],["text"]]' | browse chain
 ```
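The chain payload in the example is a JSON array of steps, each step being an array whose first element is the command and whose remaining elements are its arguments. A small sketch of building that payload from code, rather than hand-writing the JSON, could look like this. `ChainStep`, `buildChainPayload`, and `toShellPipeline` are hypothetical names, and only the array-of-arrays shape is taken from the README.

```typescript
// A chain step is a command name followed by its string arguments.
// This mirrors the array-of-arrays shape in the README example.
type ChainStep = [command: string, ...args: string[]];

// Serialize the steps into the JSON text that `browse chain` reads from stdin.
function buildChainPayload(steps: ChainStep[]): string {
  return JSON.stringify(steps);
}

// Wrap the payload in single quotes for a POSIX shell, escaping any
// embedded single quotes so the JSON survives the shell unchanged.
function toShellPipeline(payload: string): string {
  const quoted = "'" + payload.replace(/'/g, "'\\''") + "'";
  return `echo ${quoted} | browse chain`;
}

const chainPayload = buildChainPayload([
  ["goto", "https://example.com"],
  ["snapshot", "-i"],
  ["text"],
]);
console.log(toShellPipeline(chainPayload));
```

Generating the JSON with `JSON.stringify` avoids quoting mistakes when an argument contains quotes or spaces, which is easy to get wrong in a hand-written one-liner.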
 
-
+### Server Control
 
 ```bash
-browse
+browse status # Server health report
+browse instances # List all running browse servers
+browse doctor # System check (Bun, Playwright, Chromium)
+browse upgrade # Self-update via npm
+browse stop # Stop server
+browse restart # Restart server
+browse inspect # Open DevTools (requires BROWSE_DEBUG_PORT)
 ```
 
-
+### Setup
 
-
+```bash
+browse install-skill [path] # Install Claude Code skill
+```
+
+## Sessions
 
-
+Run multiple AI agents in parallel, each with isolated browser state, sharing one Chromium process:
 
 ```bash
-
-browse
-browse
-browse
+# Agent A
+browse --session agent-a goto https://site-a.com
+browse --session agent-a snapshot -i
+browse --session agent-a click @e3
 
-
-browse goto
+# Agent B (simultaneously)
+browse --session agent-b goto https://site-b.com
+browse --session agent-b snapshot -i
+browse --session agent-b fill @e2 "query"
 
-
-
+# Or set once via env var
+export BROWSE_SESSION=agent-a
+browse text
+```
 
-
-
+Each session has its own:
+- Browser context (cookies, storage, cache)
+- Tabs and navigation history
+- Refs from snapshots
+- Console and network buffers
 
-
-browse
+```bash
+browse sessions # List active sessions
+browse session-close agent-a # Close a session
+browse status # Shows total session count
 ```
 
-
+Sessions auto-close after the idle timeout (default 30 min). Without `--session`, everything runs in a `"default"` session.
 
-
+For full process isolation (separate Chromium instances), use `BROWSE_PORT` to run independent servers.
 
-
-`goto <url>` | `back` | `forward` | `reload` | `url`
+## Security
 
-
-`text` | `html [sel]` | `links` | `forms` | `accessibility`
+All security features are opt-in — existing workflows are unaffected until you explicitly enable a feature.
 
-###
-
+### Domain Allowlist
+
+Restrict navigation and sub-resource requests to trusted domains:
 
-
+```bash
+browse --allowed-domains "example.com,*.example.com" goto https://example.com
+# Or via env var
+BROWSE_ALLOWED_DOMAINS="example.com,*.api.io" browse goto https://example.com
 ```
-
-
-
-
-
-
+
+Blocks HTTP requests, WebSocket, EventSource, and `sendBeacon` to non-allowed domains. Wildcards like `*.example.com` match the bare domain and all subdomains.
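To make the wildcard rule concrete, here is a minimal sketch of the matching logic as the README describes it. It is not the package's implementation. `matchesPattern` and `isAllowedHost` are hypothetical names, and treating a non-wildcard entry as an exact hostname match is an assumption.

```typescript
// Check one hostname against one allowlist entry.
// Per the README, "*.example.com" matches the bare domain and every subdomain.
// Assumption: an entry without a wildcard matches that exact hostname only.
function matchesPattern(host: string, pattern: string): boolean {
  const h = host.toLowerCase();
  const p = pattern.trim().toLowerCase();
  if (p.startsWith("*.")) {
    const base = p.slice(2);
    // Require a dot boundary so "evilexample.com" does not match "*.example.com".
    return h === base || h.endsWith("." + base);
  }
  return h === p;
}

// The allowlist uses the comma-separated form accepted by --allowed-domains.
function isAllowedHost(host: string, allowlist: string): boolean {
  return allowlist.split(",").some((entry) => matchesPattern(host, entry));
}

const sampleAllowlist = "example.com,*.api.io";
console.log(isAllowedHost("v2.api.io", sampleAllowlist));       // subdomain of a wildcard entry
console.log(isAllowedHost("tracker.example", sampleAllowlist)); // not listed
```

The dot-boundary check is the detail that matters most: a plain suffix comparison would let a look-alike domain through.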
+
+### Action Policy
+
+Gate commands with a `browse-policy.json` file:
+
+```json
+{ "default": "allow", "deny": ["js", "eval"], "confirm": ["goto"] }
 ```
-After snapshot, use `@e1`, `@e2`... as selectors in any command.
 
-
-`snapshot-diff` — compare current page against last snapshot.
+Precedence: deny > confirm > allow > default. Hot-reloads on file change — no server restart needed.
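The precedence rule can be sketched as a simple ordered lookup. This is an illustration, not the package's code. `ActionPolicy`, `Verdict`, and `decide` are hypothetical names, and the `allow` key is assumed from the stated precedence because the README's example file shows only `default`, `deny`, and `confirm`.

```typescript
type Verdict = "allow" | "deny" | "confirm";

// Shape modeled on the README's browse-policy.json example.
// Assumption: an "allow" list exists alongside "deny" and "confirm".
interface ActionPolicy {
  default: Verdict;
  deny?: string[];
  confirm?: string[];
  allow?: string[];
}

// Apply the documented precedence: deny > confirm > allow > default.
function decide(policy: ActionPolicy, command: string): Verdict {
  if (policy.deny?.includes(command)) return "deny";
  if (policy.confirm?.includes(command)) return "confirm";
  if (policy.allow?.includes(command)) return "allow";
  return policy.default;
}

const readmePolicy: ActionPolicy = {
  default: "allow",
  deny: ["js", "eval"],
  confirm: ["goto"],
};

console.log(decide(readmePolicy, "js"));   // listed under deny
console.log(decide(readmePolicy, "goto")); // listed under confirm
console.log(decide(readmePolicy, "text")); // falls through to default
```

Because deny is checked first, a command listed in several lists is still blocked, which makes a deny-by-default policy with a short `allow` list the safer starting point.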
 
-###
-`emulate <device>` | `emulate reset` | `devices [filter]`
+### Credential Vault
 
-
+Encrypted credential storage (AES-256-GCM). The LLM never sees passwords:
 
-
-
+```bash
+echo "mypassword" | browse auth save github https://github.com/login myuser --password-stdin
+browse auth login github # Auto-navigates, detects form, fills + submits
+browse auth list # List saved credentials (no passwords shown)
+```
 
-
-`screenshot [path]` | `screenshot --annotate` | `pdf [path]` | `responsive [prefix]`
+Key is auto-generated at `.browse/.encryption-key` or set via `BROWSE_ENCRYPTION_KEY`.
 
-###
-`diff <url1> <url2>` — text diff between two pages.
-`screenshot-diff <baseline> [current]` — pixel-level visual regression testing.
+### Content Boundaries
 
-
-`find role|text|label|placeholder|testid <query> [name]` — semantic element locators.
+Wrap page output in CSPRNG nonce-delimited markers so LLMs can distinguish tool output from untrusted page content:
 
-### Multi-Step
 ```bash
-
+browse --content-boundaries text
 ```
 
-###
-`tabs` | `tab <id>` | `newtab [url]` | `closetab [id]`
+### JSON Output
 
-
-`frame <sel>` | `frame main`
+Machine-readable output for agent frameworks:
 
-
-
+```bash
+browse --json snapshot -i
+# Returns: {"success": true, "data": "...", "command": "snapshot"}
+```
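An agent framework would typically validate this envelope before trusting it. The sketch below assumes only the three fields shown in the README (`success`, `data`, `command`) and assumes `data` is a string. The shape of a failure response is not documented here, so the sketch falls back to an empty string when `data` is not a string. `BrowseResult` and `parseBrowseJson` are hypothetical names.

```typescript
// Envelope fields as shown in the README's --json example.
interface BrowseResult {
  success: boolean;
  data: string;
  command: string;
}

// Parse and validate the stdout of a `browse --json ...` call.
// Throws if the text is not JSON or lacks the documented fields.
function parseBrowseJson(stdout: string): BrowseResult {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  const success = record["success"];
  const command = record["command"];
  const data = record["data"];
  if (typeof success !== "boolean" || typeof command !== "string") {
    throw new Error("unexpected envelope shape");
  }
  // Assumption: data is a string; anything else is treated as empty.
  return { success, command, data: typeof data === "string" ? data : "" };
}

const sampleEnvelope = '{"success": true, "data": "...", "command": "snapshot"}';
const parsedEnvelope = parseBrowseJson(sampleEnvelope);
console.log(parsedEnvelope.command, parsedEnvelope.success);
```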
 
-
-
+## Configuration
+
+Create a `browse.json` file at your project root to set persistent defaults:
+
+```json
+{
+  "session": "my-agent",
+  "json": true,
+  "contentBoundaries": true,
+  "allowedDomains": ["example.com", "*.api.io"],
+  "idleTimeout": 3600000,
+  "viewport": "1280x720",
+  "device": "iPhone 14",
+  "runtime": "playwright"
+}
+```
 
-
-`state save [name]` | `state load [name]` | `state list` | `state show [name]` | `auth save <name> <url> <user> <pass>` | `auth login <name>` | `auth list` | `auth delete <name>`
+CLI flags and environment variables override config file values.
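The override rule can be pictured as layering settings on top of each other. The sketch below covers three of the documented keys and is not the package's resolver. `BrowseSettings`, `overlay`, `fromEnv`, and `resolveSettings` are hypothetical names. The README says flags and environment variables both override the file but does not say which of the two wins, so placing CLI flags above environment variables is an assumption.

```typescript
// A subset of the keys shown in the browse.json example.
interface BrowseSettings {
  session?: string;
  json?: boolean;
  viewport?: string;
}

// Copy only the keys that `override` actually sets, so an absent
// value never erases one from a lower-priority layer.
function overlay(base: BrowseSettings, override: BrowseSettings): BrowseSettings {
  const merged: BrowseSettings = { ...base };
  if (override.session !== undefined) merged.session = override.session;
  if (override.json !== undefined) merged.json = override.json;
  if (override.viewport !== undefined) merged.viewport = override.viewport;
  return merged;
}

// Map two documented environment variables onto settings:
// BROWSE_SESSION (session ID) and BROWSE_JSON ("1" enables JSON output).
function fromEnv(env: Record<string, string | undefined>): BrowseSettings {
  const settings: BrowseSettings = {};
  const session = env["BROWSE_SESSION"];
  if (session !== undefined && session !== "") settings.session = session;
  if (env["BROWSE_JSON"] === "1") settings.json = true;
  return settings;
}

// Assumed order, lowest to highest priority: config file, environment, CLI flags.
function resolveSettings(
  file: BrowseSettings,
  env: Record<string, string | undefined>,
  cli: BrowseSettings,
): BrowseSettings {
  return overlay(overlay(file, fromEnv(env)), cli);
}

const resolved = resolveSettings(
  { session: "my-agent", json: true, viewport: "1280x720" },
  { BROWSE_SESSION: "agent-a" },
  { viewport: "800x600" },
);
console.log(resolved);
```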
 
-
-`har start` | `har stop [path]` | `video start [dir]` | `video stop` | `video status`
+## Usage with AI Agents
 
-###
-`inspect` — open DevTools debugger (requires `BROWSE_DEBUG_PORT`).
+### Claude Code (recommended)
 
-
-`status` | `instances` | `cookie <n>=<v>` | `header <n>:<v>` | `useragent <str>` | `stop` | `restart`
+Install as a Claude Code skill via [skills.sh](https://skills.sh):
 
-
+```bash
+npx skills add https://github.com/ulpi-io/skills --skill browse
+```
+
+Or install directly:
 
+```bash
+browse install-skill
 ```
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+Both copy the skill definition to `.claude/skills/browse/SKILL.md` and add all browse commands to permissions — no more approval prompts.
+
+### CLAUDE.md / AGENTS.md
+
+Add to your project instructions:
+
+```markdown
+## Browser Automation
+
+Use `browse` for web automation. Run `browse --help` for all commands.
+
+Core workflow:
+1. `browse goto <url>` — Navigate to page
+2. `browse snapshot -i` — Get interactive elements with refs (@e1, @e2)
+3. `browse click @e1` / `fill @e2 "text"` — Interact using refs
+4. Re-snapshot after page changes
+```
+
+### Just ask the agent
+
+```
+Use browse to test the login flow. Run browse --help to see available commands.
 ```
 
-##
+## Options
 
 | Flag | Description |
 |------|-------------|
@@ -329,99 +501,93 @@ browse [--session <id>] <command>
 | `--json` | Wrap output as `{success, data, command}` |
 | `--content-boundaries` | Wrap page content in nonce-delimited markers |
 | `--allowed-domains <d,d>` | Block navigation/resources outside allowlist |
-| `--
+| `--max-output <n>` | Truncate output to N characters |
+| `--headed` | Show browser window (not headless) |
 
 ## Environment Variables
 
 | Variable | Default | Description |
 |----------|---------|-------------|
-| `BROWSE_PORT` | auto 9400
+| `BROWSE_PORT` | auto (9400–10400) | Fixed server port |
 | `BROWSE_PORT_START` | 9400 | Start of port scan range |
 | `BROWSE_SESSION` | (none) | Default session ID for all commands |
-| `BROWSE_INSTANCE` | auto (PPID) | Instance ID for multi-
-| `BROWSE_IDLE_TIMEOUT` | 1800000 (30m) | Idle shutdown in ms |
+| `BROWSE_INSTANCE` | auto (PPID) | Instance ID for multi-agent isolation |
+| `BROWSE_IDLE_TIMEOUT` | 1800000 (30m) | Idle auto-shutdown in ms |
 | `BROWSE_TIMEOUT` | (none) | Override all command timeouts (ms) |
-| `BROWSE_LOCAL_DIR` | `.browse/` or `/tmp` | State/log directory |
+| `BROWSE_LOCAL_DIR` | `.browse/` or `/tmp` | State/log/screenshot directory |
 | `BROWSE_JSON` | (none) | Set to `1` for JSON output mode |
 | `BROWSE_CONTENT_BOUNDARIES` | (none) | Set to `1` for nonce-delimited output |
 | `BROWSE_ALLOWED_DOMAINS` | (none) | Comma-separated domain allowlist |
-| `
+| `BROWSE_MAX_OUTPUT` | (none) | Truncate output to N characters |
+| `BROWSE_HEADED` | (none) | Set to `1` for headed browser mode |
+| `BROWSE_CDP_URL` | (none) | Connect to remote Chrome via CDP |
 | `BROWSE_PROXY` | (none) | Proxy server URL |
 | `BROWSE_PROXY_BYPASS` | (none) | Proxy bypass list |
-| `BROWSE_CDP_URL` | (none) | Connect to remote Chrome via CDP |
 | `BROWSE_SERVER_SCRIPT` | auto-detected | Override path to server.ts |
-| `BROWSE_DEBUG_PORT` | (none) | Port for DevTools debugging
+| `BROWSE_DEBUG_PORT` | (none) | Port for DevTools debugging |
 | `BROWSE_POLICY` | browse-policy.json | Path to action policy file |
-| `BROWSE_CONFIRM_ACTIONS` | (none) |
+| `BROWSE_CONFIRM_ACTIONS` | (none) | Commands requiring confirmation |
 | `BROWSE_ENCRYPTION_KEY` | auto-generated | 64-char hex AES key for credential vault |
-| `BROWSE_AUTH_PASSWORD` | (none) | Password for auth save (alt to `--password-stdin`) |
+| `BROWSE_AUTH_PASSWORD` | (none) | Password for `auth save` (alt to `--password-stdin`) |
+| `BROWSE_RUNTIME` | playwright | Browser runtime (playwright, rebrowser, lightpanda) |
 
-##
+## Architecture
+
+```
+browse [--session <id>] <command>
+|
+CLI (thin HTTP client)
+|
+Persistent server (localhost, auto-started)
+|
+SessionManager
+├── "default" → BrowserContext → tabs, refs, cookies, buffers
+├── "agent-a" → BrowserContext → tabs, refs, cookies, buffers
+└── "agent-b" → BrowserContext → tabs, refs, cookies, buffers
+|
+Chromium (Playwright, headless, shared)
+```
+
+- **First command:** ~2s (server + Chromium startup, once)
+- **Every command after:** ~100–200ms (HTTP to localhost)
+- Server auto-starts on first command, auto-shuts down after 30 min idle
+- Crash recovery: CLI detects dead server and restarts transparently
+- State file: `.browse/browse-server.json` (pid, port, token)
+
+## Benchmarks
+
+### vs Agent Browser & Browser-Use (Token Cost)
+
+Tested on 3 sites across multi-step browsing flows — navigate, snapshot, scroll, search, extract text:
+
+| Tool | Total Tokens | Total Time | Context Used (200K) |
+|------|-------------:|-----------:|--------------------:|
+| **browse** | **14,134** | **28.5s** | **7.1%** |
+| agent-browser | 39,414 | 36.2s | 19.7% |
+| browser-use | 34,281 | 72.7s | 17.1% |
 
-
+browse uses **2.4x fewer tokens** than browser-use, **2.8x fewer** than agent-browser, and completes **2.5x faster** than browser-use.
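The headline ratios can be recomputed from the table. The sketch below uses only the figures printed above. `BenchRow`, `contextShare`, and `compareToBaseline` are hypothetical names. With these inputs the token ratios come out at about 2.43 and 2.79, and the time ratio against browser-use at about 2.55.

```typescript
// Rows copied from the benchmark table above.
interface BenchRow {
  tool: string;
  tokens: number;
  seconds: number;
}

const benchRows: BenchRow[] = [
  { tool: "browse", tokens: 14134, seconds: 28.5 },
  { tool: "agent-browser", tokens: 39414, seconds: 36.2 },
  { tool: "browser-use", tokens: 34281, seconds: 72.7 },
];

// The table's last column is tokens as a percentage of a 200K context window.
const CONTEXT_WINDOW = 200000;

function contextShare(tokens: number): number {
  return (tokens / CONTEXT_WINDOW) * 100;
}

interface Comparison {
  tool: string;
  tokenRatio: number; // how many times more tokens than the baseline
  timeRatio: number;  // how many times slower than the baseline
}

// Express every other tool as a multiple of the baseline tool.
function compareToBaseline(rows: BenchRow[], baselineTool: string): Comparison[] {
  const base = rows.find((r) => r.tool === baselineTool);
  if (base === undefined) throw new Error(`unknown tool: ${baselineTool}`);
  return rows
    .filter((r) => r.tool !== baselineTool)
    .map((r) => ({
      tool: r.tool,
      tokenRatio: r.tokens / base.tokens,
      timeRatio: r.seconds / base.seconds,
    }));
}

for (const c of compareToBaseline(benchRows, "browse")) {
  console.log(c.tool, c.tokenRatio.toFixed(2), c.timeRatio.toFixed(2));
}
```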
+
+### vs @playwright/mcp (Architecture)
+
+@playwright/mcp dumps the full accessibility snapshot on every action. browse returns ~15 tokens per action — the agent requests a snapshot only when needed:
+
+| | @playwright/mcp | browse |
+|---|---:|---:|
+| Tokens on `navigate` | ~14,578 (auto-dumped) | **~11** |
+| Tokens on `click` | ~14,578 (auto-dumped) | **~15** |
+| 10-action session | ~145,780 | **~11,388** |
+| Context consumed (200K) | **73%** | **6%** |
+
+Rerun: `bun run benchmark`
 
 ## Changelog
 
-
-
-
-
-
-- `screenshot --annotate` — pixel-annotated PNG with numbered badges
-- `instances` command — list all running browse servers
-- `BROWSE_DEBUG_PORT` env var for DevTools debugging
-
-### v0.2.0 — Security, Interactions, DX
-
-**Commands:**
-- `dblclick`, `focus`, `check`, `uncheck`, `drag`, `keydown`, `keyup` — interaction commands
-- `frame <sel>` / `frame main` — iframe targeting
-- `value <sel>`, `count <sel>` — element inspection
-- `scroll up/down` — viewport-relative scrolling
-- `wait --url`, `wait --network-idle` — navigation/network wait variants
-- `highlight <sel>` — visual element debugging
-- `download <sel> [path]` — file download
-- `route <pattern> block/fulfill` — network request interception and mocking
-- `offline on/off` — offline mode toggle
-- `state save/load` — persist and restore cookies + localStorage (all origins)
-- `har start/stop` — HAR recording and export
-- `video start/stop/status` — video recording (WebM, compositor-level, works with remote CDP)
-- `screenshot-diff` — pixel-level visual regression testing
-- `find role/text/label/placeholder/testid` — semantic element locators
-
-**Security:**
-- `--allowed-domains` — domain allowlist (HTTP + WebSocket/EventSource/sendBeacon)
-- `browse-policy.json` — action policy gate (allow/deny/confirm per command)
-- `auth save/login/list/delete` — AES-256-GCM encrypted credential vault
-- `--content-boundaries` — CSPRNG nonce wrapping for prompt injection defense
-
-**DX:**
-- `--json` — structured output mode for agent frameworks
-- `browse.json` config file support
-- AI-friendly error messages — Playwright errors rewritten to actionable hints
-- Per-session output folders (`.browse/sessions/{id}/`)
-
-**Infrastructure:**
-- Auto-instance servers via PPID — multi-Claude isolation
-- CDP remote connection (`BROWSE_CDP_URL`)
-- Proxy support (`BROWSE_PROXY`)
-- Compiled binary self-spawn mode
-- Orphaned server cleanup
-
-### v0.1.0 — Foundation
-
-**Commands:**
-- `emulate` / `devices` — device emulation (100+ devices)
-- `snapshot -C` — cursor-interactive detection
-- `snapshot-diff` — before/after comparison with ref-number stripping
-- `dialog` / `dialog-accept` / `dialog-dismiss` — dialog handling
-- `upload` — file upload
-- `screenshot --annotate` — numbered badge overlay with legend
-
-**Infrastructure:**
-- Session multiplexing — multiple agents share one Chromium
-- Safe retry classification — read vs write commands
-- TreeWalker text extraction — no MutationObserver triggers
+See [CHANGELOG.md](CHANGELOG.md) for full release history.
+
+## Acknowledgments
+
+Inspired by and originally derived from the `/browse` skill in [gstack](https://github.com/garrytan/gstack) by Garry Tan.
 
 ## License
 