@ulpi/browse 0.1.0

package/skill/SKILL.md ADDED
@@ -0,0 +1,301 @@
1
+ ---
2
+ name: browse
3
+ version: 1.0.0
4
+ description: |
5
+ Fast web browsing for Claude Code via a persistent headless Chromium daemon. Navigate to any URL,
6
+ read page content, click elements, fill forms, run JavaScript, take screenshots,
7
+ inspect CSS/DOM, capture console/network logs, and more. ~100ms per command after
8
+ first call. Use when you need to check a website, verify a deployment, read docs,
9
+ or interact with any web page. No MCP, no Chrome extension — just fast CLI.
10
+ allowed-tools:
11
+ - Bash
12
+ - Read
13
+
14
+ ---
15
+
16
+ # browse: Persistent Browser for Claude Code
17
+
18
+ Persistent headless Chromium daemon. First call auto-starts the server (~3s).
19
+ Every subsequent call: ~100-200ms. Auto-shuts down after 30 min idle.
20
+
21
+ ## SETUP (run this check BEFORE any browse command)
22
+
23
+ ```bash
24
+ # Check if browse is available
25
+ if command -v browse &>/dev/null; then
26
+ echo "READY"
27
+ else
28
+ echo "NEEDS_INSTALL"
29
+ fi
30
+ ```
31
+
32
+ If `NEEDS_INSTALL`:
33
+ 1. Tell the user: "browse needs a one-time global install from the npm registry (via bun). OK to proceed?"
34
+ 2. If they approve: `bun install -g @ulpi/browse`
35
+ 3. If `bun` is not installed, install it first with `curl -fsSL https://bun.sh/install | bash`, then rerun step 2.
36
+
37
+ ### Permissions check
38
+
39
+ After confirming browse is available, check if browse commands are pre-allowed:
40
+
41
+ ```bash
42
+ cat .claude/settings.json 2>/dev/null
43
+ ```
44
+
45
+ If the file is missing or does not contain browse permission rules in `permissions.allow`:
46
+ 1. Tell the user: "browse works best when its commands are pre-allowed so you don't get prompted on every call. Add browse permissions to `.claude/settings.json`?"
47
+ 2. If they approve, read the existing `.claude/settings.json` (or create it), and add ALL of these rules to `permissions.allow` (merge with existing rules — do not overwrite):
48
+
49
+ ```json
50
+ "Bash(browse:*)",
51
+ "Bash(browse goto:*)", "Bash(browse back:*)", "Bash(browse forward:*)",
52
+ "Bash(browse reload:*)", "Bash(browse url:*)", "Bash(browse text:*)",
53
+ "Bash(browse html:*)", "Bash(browse links:*)", "Bash(browse forms:*)",
54
+ "Bash(browse accessibility:*)", "Bash(browse snapshot:*)",
55
+ "Bash(browse snapshot-diff:*)", "Bash(browse click:*)",
56
+ "Bash(browse fill:*)", "Bash(browse select:*)", "Bash(browse hover:*)",
57
+ "Bash(browse type:*)", "Bash(browse press:*)", "Bash(browse scroll:*)",
58
+ "Bash(browse wait:*)", "Bash(browse viewport:*)", "Bash(browse upload:*)",
59
+ "Bash(browse dialog-accept:*)", "Bash(browse dialog-dismiss:*)",
60
+ "Bash(browse js:*)", "Bash(browse eval:*)", "Bash(browse css:*)",
61
+ "Bash(browse attrs:*)", "Bash(browse state:*)", "Bash(browse dialog:*)",
62
+ "Bash(browse console:*)", "Bash(browse network:*)",
63
+ "Bash(browse cookies:*)", "Bash(browse storage:*)", "Bash(browse perf:*)",
64
+ "Bash(browse devices:*)", "Bash(browse emulate:*)",
65
+ "Bash(browse screenshot:*)", "Bash(browse pdf:*)",
66
+ "Bash(browse responsive:*)", "Bash(browse diff:*)",
67
+ "Bash(browse chain:*)", "Bash(browse tabs:*)", "Bash(browse tab:*)",
68
+ "Bash(browse newtab:*)", "Bash(browse closetab:*)",
69
+ "Bash(browse sessions:*)", "Bash(browse session-close:*)",
70
+ "Bash(browse status:*)", "Bash(browse stop:*)", "Bash(browse restart:*)",
71
+ "Bash(browse cookie:*)", "Bash(browse header:*)",
72
+ "Bash(browse useragent:*)"
73
+ ```
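
The permissions check described above can be sketched as a small shell function, in the same style as the `READY` / `NEEDS_INSTALL` check. The function name `browse_perms_status` and the three status strings are hypothetical, not part of browse. The check is a plain text search for `Bash(browse`, so it only approximates "contains browse rules in `permissions.allow`".

```bash
# Hypothetical helper: report whether a settings file already mentions browse rules.
# This is a plain text search, not a JSON-aware check of permissions.allow.
browse_perms_status() {
  local file="${1:-.claude/settings.json}"
  if [ ! -f "$file" ]; then
    echo "MISSING_FILE"      # no settings file yet
  elif grep -qF 'Bash(browse' "$file"; then
    echo "ALLOWED"           # at least one browse rule is present
  else
    echo "NEEDS_RULES"       # file exists but has no browse rules
  fi
}
```

Run with no argument it inspects `.claude/settings.json` in the current directory. Either `MISSING_FILE` or `NEEDS_RULES` would lead to the prompt in step 1.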
74
+
75
+ ## IMPORTANT
76
+
77
+ - Always call `browse` as a bare command (it's on PATH via global install).
78
+ - Do NOT use shell variables like `B=...` or full paths — they break Claude Code's permission matching.
79
+ - NEVER use `#` in CSS selectors — use `[id=foo]` instead of `#foo`. The `#` character breaks Claude Code's permission matching and triggers approval prompts.
80
+ - The browser persists between calls — cookies, tabs, and state carry over.
81
+ - The server auto-starts on first command. No manual setup needed.
82
+ - Use `--session <id>` for parallel agent isolation. Each session gets its own tabs, refs, cookies.
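
As an illustration of the `#` rule above, the following sketch rewrites id selectors into the attribute form. The helper name `to_attr_selector` is hypothetical and not part of browse. It assumes simple ids made of letters, digits, `_` and `-`, and it is meant for converting a selector ahead of time (for example one copied from devtools), so that the final `browse` command is still typed as a bare command.

```bash
# Hypothetical helper: rewrite every "#name" id selector as "[id=name]".
# Only simple identifiers (letters, digits, "_" and "-") are handled.
to_attr_selector() {
  printf '%s\n' "$1" | sed -E 's/#([A-Za-z_][A-Za-z0-9_-]*)/[id=\1]/g'
}
```

For instance, `to_attr_selector 'form#login input.name'` prints `form[id=login] input.name`, which can then be pasted into `browse click` or `browse fill`.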
83
+
84
+ ## Quick Reference
85
+
86
+ ```bash
87
+ # Navigate to a page
88
+ browse goto https://example.com
89
+
90
+ # Read cleaned page text
91
+ browse text
92
+
93
+ # Take a screenshot (then Read the image)
94
+ browse screenshot .browse/page.png
95
+
96
+ # Snapshot: accessibility tree with refs
97
+ browse snapshot -i
98
+
99
+ # Click by ref (after snapshot)
100
+ browse click @e3
101
+
102
+ # Fill by ref
103
+ browse fill @e4 "test@test.com"
104
+
105
+ # Run JavaScript
106
+ browse js "document.title"
107
+
108
+ # Get all links
109
+ browse links
110
+
111
+ # Click by CSS selector
112
+ browse click "button.submit"
113
+
114
+ # Fill a form by CSS selector (use [id=...] instead of #, which triggers approval prompts)
115
+ browse fill "[id=email]" "test@test.com"
116
+ browse fill "[id=password]" "abc123"
117
+ browse click "button[type=submit]"
118
+
119
+ # Get HTML of an element
120
+ browse html "main"
121
+
122
+ # Get computed CSS
123
+ browse css "body" "font-family"
124
+
125
+ # Get element attributes
126
+ browse attrs "nav"
127
+
128
+ # Wait for element to appear
129
+ browse wait ".loaded"
130
+
131
+ # Device emulation
132
+ browse emulate iphone
133
+ browse emulate reset
134
+
135
+ # Parallel sessions
136
+ browse --session agent-a goto https://site1.com
137
+ browse --session agent-b goto https://site2.com
138
+ ```
139
+
140
+ ## Command Reference
141
+
142
+ ### Navigation
143
+ ```
144
+ browse goto <url> Navigate current tab
145
+ browse back Go back
146
+ browse forward Go forward
147
+ browse reload Reload page
148
+ browse url Print current URL
149
+ ```
150
+
151
+ ### Content extraction
152
+ ```
153
+ browse text Cleaned page text (no scripts/styles)
154
+ browse html [selector] innerHTML of element, or full page HTML
155
+ browse links All links as "text → href"
156
+ browse forms All forms + fields as JSON
157
+ browse accessibility Accessibility tree snapshot (ARIA)
158
+ ```
159
+
160
+ ### Snapshot (ref-based element selection)
161
+ ```
162
+ browse snapshot Full accessibility tree with @refs
163
+ browse snapshot -i Interactive elements only (buttons, links, inputs)
164
+ browse snapshot -c Compact (no empty structural elements)
165
+ browse snapshot -C Cursor-interactive (detect divs with cursor:pointer/onclick/tabindex)
166
+ browse snapshot -d <N> Limit depth to N levels
167
+ browse snapshot -s <sel> Scope to CSS selector
168
+ browse snapshot-diff Compare current vs previous snapshot
169
+ ```
170
+
171
+ After snapshot, use @refs as selectors in any command:
172
+ ```
173
+ browse click @e3 Click the element assigned ref @e3
174
+ browse fill @e4 "value" Fill the input assigned ref @e4
175
+ browse hover @e1 Hover the element assigned ref @e1
176
+ browse html @e2 Get innerHTML of ref @e2
177
+ browse css @e5 "color" Get computed CSS of ref @e5
178
+ browse attrs @e6 Get attributes of ref @e6
179
+ ```
180
+
181
+ Refs are invalidated on navigation — run `snapshot` again after `goto`.
182
+
183
+ ### Interaction
184
+ ```
185
+ browse click <selector> Click element (CSS selector or @ref)
186
+ browse fill <selector> <value> Fill input field
187
+ browse select <selector> <val> Select dropdown value
188
+ browse hover <selector> Hover over element
189
+ browse type <text> Type into focused element
190
+ browse press <key> Press key (Enter, Tab, Escape, etc.)
191
+ browse scroll [selector] Scroll element into view, or page bottom
192
+ browse wait <selector> Wait for element to appear (max 15s)
193
+ browse viewport <WxH> Set viewport size (e.g. 375x812)
194
+ browse upload <sel> <files> Upload file(s) to a file input
195
+ browse dialog-accept [value] Set dialogs to auto-accept
196
+ browse dialog-dismiss Set dialogs to auto-dismiss (default)
197
+ browse emulate <device> Emulate device (iphone, pixel, etc.)
198
+ browse emulate reset Reset to desktop (1920x1080)
199
+ ```
200
+
201
+ ### Inspection
202
+ ```
203
+ browse js <expression> Run JS, print result
204
+ browse eval <js-file> Run JS file against page
205
+ browse css <selector> <prop> Get computed CSS property
206
+ browse attrs <selector> Get element attributes as JSON
207
+ browse state <selector> Element state (visible/enabled/checked/focused)
208
+ browse dialog Last dialog info or "(no dialog detected)"
209
+ browse console [--clear] View/clear console messages
210
+ browse network [--clear] View/clear network requests
211
+ browse cookies Dump all cookies as JSON
212
+ browse storage [set <k> <v>] View/set localStorage
213
+ browse perf Page load performance timings
214
+ browse devices [filter] List available device names
215
+ ```
216
+
217
+ ### Visual
218
+ ```
219
+ browse screenshot [path] Screenshot (default: .browse/browse-screenshot.png)
220
+ browse screenshot --annotate [path] Screenshot with numbered badges + legend
221
+ browse pdf [path] Save as PDF
222
+ browse responsive [prefix] Screenshots at mobile/tablet/desktop
223
+ ```
224
+
225
+ ### Compare
226
+ ```
227
+ browse diff <url1> <url2> Text diff between two pages
228
+ ```
229
+
230
+ ### Multi-step (chain)
231
+ ```
232
+ echo '[["goto","https://example.com"],["snapshot","-i"],["click","@e1"]]' | browse chain
233
+ ```
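
Writing the chain payload by hand gets error-prone as flows grow. The sketch below builds the same array-of-arrays JSON from plain `"command arg"` strings. The helper name `chain_json` is hypothetical and not part of browse, and it assumes each argument contains no spaces, since every step is split on whitespace.

```bash
# Hypothetical helper: turn "cmd arg ..." strings into the JSON array read by `browse chain`.
# Each function argument is one step; steps are split on whitespace, so arguments
# that themselves contain spaces are not supported in this sketch.
chain_json() {
  local out="[" step word first_step=1 first_word
  local bs='\' dq='"'
  local -a words
  for step in "$@"; do
    [ "$first_step" -eq 1 ] || out+=","
    first_step=0
    out+="["
    first_word=1
    read -r -a words <<< "$step"        # split without glob expansion
    for word in "${words[@]}"; do
      [ "$first_word" -eq 1 ] || out+=","
      first_word=0
      word=${word//"$bs"/"$bs$bs"}      # escape backslashes for JSON
      word=${word//"$dq"/"$bs$dq"}      # escape double quotes for JSON
      out+="\"$word\""
    done
    out+="]"
  done
  out+="]"
  printf '%s\n' "$out"
}
```

Calling `chain_json 'goto https://example.com' 'snapshot -i' 'click @e1'` prints the same payload as the `echo` example above, so the output can be saved and piped into `browse chain`.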
234
+
235
+ ### Tabs
236
+ ```
237
+ browse tabs List tabs (id, url, title)
238
+ browse tab <id> Switch to tab
239
+ browse newtab [url] Open new tab
240
+ browse closetab [id] Close tab
241
+ ```
242
+
243
+ ### Sessions (parallel agents)
244
+ ```
245
+ browse --session <id> <cmd> Run command in named session
246
+ browse sessions List active sessions
247
+ browse session-close <id> Close a session
248
+ ```
249
+
250
+ ### Server management
251
+ ```
252
+ browse status Server health, uptime, session count
253
+ browse stop Shutdown server
254
+ browse restart Kill + restart server
255
+ ```
256
+
257
+ ## Speed Rules
258
+
259
+ 1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `css`, `screenshot` all run against the loaded page instantly.
260
+ 2. **Use `snapshot -i` for interaction.** Get refs for all interactive elements, then click/fill by ref. No need to guess CSS selectors.
261
+ 3. **Use `snapshot -C` for SPAs.** Catches cursor:pointer divs and onclick handlers that ARIA misses.
262
+ 4. **Use `js` for precision.** `js "document.querySelector('.price').textContent"` is faster than parsing full page text.
263
+ 5. **Use `links` to survey.** Faster than `text` when you just need navigation structure.
264
+ 6. **Use `chain` for multi-step flows.** Avoids CLI overhead per step.
265
+ 7. **Use `responsive` for layout checks.** One command = 3 viewport screenshots.
266
+ 8. **Use `--session` for parallel work.** Multiple agents can browse simultaneously without interference.
267
+
268
+ ## When to Use What
269
+
270
+ | Task | Commands |
271
+ |------|----------|
272
+ | Read a page | `goto <url>` then `text` |
273
+ | Interact with elements | `snapshot -i` then `click @e3` |
274
+ | Find hidden clickables | `snapshot -i -C` then `click @e15` |
275
+ | Check if element exists | `js "!!document.querySelector('.thing')"` |
276
+ | Extract specific data | `js "document.querySelector('.price').textContent"` |
277
+ | Visual check | `screenshot .browse/x.png` then Read the image |
278
+ | Fill and submit form | `snapshot -i` → `fill @e4 "val"` → `click @e5` |
279
+ | Check CSS | `css "selector" "property"` or `css @e3 "property"` |
280
+ | Inspect DOM | `html "selector"` or `attrs @e3` |
281
+ | Debug console errors | `console` |
282
+ | Check network requests | `network` |
283
+ | Check local dev | `goto http://127.0.0.1:3000` |
284
+ | Compare two pages | `diff <url1> <url2>` |
285
+ | Mobile layout check | `responsive .browse/prefix` |
286
+ | Test on mobile device | `emulate iphone` → `goto <url>` → `screenshot` |
287
+ | Parallel agents | `--session agent-a <cmd>` / `--session agent-b <cmd>` |
288
+ | Multi-step flow | `echo '[...]' \| browse chain` |
289
+
290
+ ## Architecture
291
+
292
+ - Persistent Chromium daemon on localhost (port 9400-9410)
293
+ - Bearer token auth per session
294
+ - Session multiplexing: multiple agents share one Chromium via isolated BrowserContexts
295
+ - Project-local state: `.browse/` directory at project root (auto-created, self-gitignored)
296
+ - `browse-server.json` — server PID, port, auth token
297
+ - `browse-console.log` — captured console messages
298
+ - `browse-network.log` — captured network requests
299
+ - `browse-screenshot.png` — default screenshot location
300
+ - Auto-shutdown when all sessions idle past 30 min
301
+ - Chromium crash → server exits → auto-restarts on next command