browserclaw 0.10.4 → 0.10.5

package/README.md CHANGED
@@ -1,4 +1,4 @@
- <h2 align="center">🦞 BrowserClaw — Standalone OpenClaw browser module</h2>
+ <h2 align="center">🦞 BrowserClaw — Standalone OpenClaw browser module</h2>
 
  <p align="center">
  <a href="https://browserclaw.org"><img src="https://img.shields.io/badge/Live-browserclaw.org-orange" alt="Live" /></a>
@@ -23,7 +23,7 @@ const { snapshot, refs } = await page.snapshot();
  // snapshot: AI-readable text tree
  // refs: { "e1": { role: "link", name: "More info" }, "e2": { role: "button", name: "Submit" } }
 
- await page.click('e1'); // Click by ref
+ await page.click('e1'); // Click by ref
  await page.type('e3', 'hello'); // Type by ref
  await browser.stop();
  ```
@@ -37,7 +37,6 @@ Most browser automation tools were built for humans writing test scripts. AI age
  - **browserclaw** gives the AI a **text snapshot** with numbered refs — the AI reads text (what it's best at) and returns a ref ID (deterministic targeting)
 
  The snapshot + ref pattern means:
-
  1. **Deterministic** — refs resolve to exact elements via Playwright locators, no guessing
  2. **Fast** — text snapshots are tiny compared to screenshots
  3. **Cheap** — no vision API calls, just text in/text out
@@ -47,15 +46,15 @@ The snapshot + ref pattern means:
 
  The AI browser automation space is moving fast. Here's how browserclaw compares to the major alternatives.
 
- | | [browserclaw](https://github.com/idan-rubin/browserclaw) | [browser-use](https://github.com/browser-use/browser-use) | [Stagehand](https://github.com/browserbase/stagehand) | [Playwright MCP](https://github.com/microsoft/playwright-mcp) |
- | :--------------------------------------- | :------------------------------------------------------: | :-------------------------------------------------------: | :---------------------------------------------------: | :-----------------------------------------------------------: |
- | Ref → exact element, no guessing | :white_check_mark: | :heavy_minus_sign: | :x: | :white_check_mark: |
- | No vision model in the loop | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
- | Survives redesigns (semantic, not pixel) | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
- | Fill 10 form fields in one call | :white_check_mark: | :x: | :x: | :x: |
- | Interact with cross-origin iframes | :white_check_mark: | :white_check_mark: | :x: | :x: |
- | Playwright engine (auto-wait, locators) | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: |
- | Embeddable in your own JS/TS agent loop | :white_check_mark: | :x: | :heavy_minus_sign: | :x: |
+ | | [browserclaw](https://github.com/idan-rubin/browserclaw) | [browser-use](https://github.com/browser-use/browser-use) | [Stagehand](https://github.com/browserbase/stagehand) | [Playwright MCP](https://github.com/microsoft/playwright-mcp) |
+ |:---|:---:|:---:|:---:|:---:|
+ | Ref → exact element, no guessing | :white_check_mark: | :heavy_minus_sign: | :x: | :white_check_mark: |
+ | No vision model in the loop | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
+ | Survives redesigns (semantic, not pixel) | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
+ | Fill 10 form fields in one call | :white_check_mark: | :x: | :x: | :x: |
+ | Interact with cross-origin iframes | :white_check_mark: | :white_check_mark: | :x: | :x: |
+ | Playwright engine (auto-wait, locators) | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: |
+ | Embeddable in your own JS/TS agent loop | :white_check_mark: | :x: | :heavy_minus_sign: | :x: |
 
  :white_check_mark: = Yes&ensp; :heavy_minus_sign: = Partial&ensp; :x: = No
 
@@ -133,22 +132,19 @@ Requires a Chromium-based browser installed on the system (Chrome, Brave, Edge,
  ```typescript
  // Launch a new Chrome instance (auto-detects Chrome/Brave/Edge/Chromium)
  const browser = await BrowserClaw.launch({
- headless: false, // default: false (visible window)
+ headless: false, // default: false (visible window)
  executablePath: '...', // optional: specific browser path
- cdpPort: 9222, // default: 9222
- noSandbox: false, // default: false (set true for Docker/CI)
- ignoreHTTPSErrors: false, // default: false (set true for expired local dev certs)
- userDataDir: '...', // optional: custom user data directory
+ cdpPort: 9222, // default: 9222
+ noSandbox: false, // default: false (set true for Docker/CI)
+ userDataDir: '...', // optional: custom user data directory
  profileName: 'browserclaw', // profile name in Chrome title bar
- profileColor: '#FF4500', // profile accent color (hex)
+ profileColor: '#FF4500', // profile accent color (hex)
  chromeArgs: ['--start-maximized'], // additional Chrome flags
  });
 
- // Connect to an already-running Chrome instance
+ // Or connect to an already-running Chrome instance
+ // (started with: chrome --remote-debugging-port=9222)
  const browser = await BrowserClaw.connect('http://localhost:9222');
-
- // Auto-discovery: scans common CDP ports (9222-9226, 9229)
- const browser = await BrowserClaw.connect();
  ```
 
  `connect()` checks that Chrome is reachable, then the internal CDP connection retries 3 times with increasing timeouts (5 s, 7 s, 9 s) — safe for Docker/CI where Chrome starts slowly.
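The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not browserclaw's internal code: `retryWithTimeouts` and its `attempt` callback are hypothetical names, and only the three timeouts (5 s, 7 s, 9 s) come from the README.

```typescript
// Hypothetical helper (not part of browserclaw): retry an async operation,
// giving each attempt a longer timeout than the previous one.
async function retryWithTimeouts<T>(
  attempt: (timeoutMs: number) => Promise<T>,
  timeoutsMs: number[] = [5000, 7000, 9000],
): Promise<T> {
  let lastError: unknown;
  for (const timeoutMs of timeoutsMs) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`timed out after ${timeoutMs} ms`)),
        timeoutMs,
      );
    });
    try {
      // Whichever settles first wins: the attempt itself or the timeout.
      return await Promise.race([attempt(timeoutMs), timeout]);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next timeout
    } finally {
      clearTimeout(timer); // avoid leaving a pending timer behind
    }
  }
  // Every attempt failed or timed out: surface the most recent error.
  throw lastError;
}
```

Note that a timed-out attempt is not cancelled in this sketch; it simply stops being awaited. A real implementation would likely pass an abort signal to the underlying connection call.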
@@ -160,30 +156,16 @@ const browser = await BrowserClaw.connect();
  ```typescript
  const page = await browser.open('https://example.com');
  const current = await browser.currentPage(); // get active tab
- const tabs = await browser.tabs(); // list all tabs
+ const tabs = await browser.tabs(); // list all tabs
  const handle = browser.page(tabs[0].targetId); // wrap existing tab
- const appPage = await browser.waitForTab({ urlContains: 'app-web' });
- await browser.focus(tabId); // bring tab to front
- await browser.close(tabId); // close a tab
- await browser.stop(); // stop browser + cleanup
-
- page.id; // CDP target ID (use with focus/close/page)
- await page.url(); // current page URL
- await page.title(); // current page title
- browser.url; // CDP endpoint URL
- ```
-
- Every tab returns a `targetId` — this is the handle you use everywhere:
-
- ```typescript
- // Multi-tab workflow (e.g. impersonation, OAuth)
- const main = await browser.open('https://app.example.com');
- const admin = await browser.open('https://admin.example.com');
-
- const { refs } = await admin.snapshot(); // snapshot the admin tab
- await admin.click('e5'); // act on it
- await browser.focus(main.id); // switch back to main
- await browser.close(admin.id); // close admin when done
+ await browser.focus(tabId); // bring tab to front
+ await browser.close(tabId); // close a tab
+ await browser.stop(); // stop browser + cleanup
+
+ page.id; // CDP target ID (use with focus/close/page)
+ await page.url(); // current page URL
+ await page.title(); // current page title
+ browser.url; // CDP endpoint URL
  ```
 
  ### Snapshot (Core Feature)
@@ -192,17 +174,17 @@ await browser.close(admin.id); // close admin when done
  const { snapshot, refs, stats, untrusted } = await page.snapshot();
 
  // snapshot: human/AI-readable text tree with [ref=eN] markers
- // refs: { "e1": { role: "link", name: "More info" }, "e5": { role: "checkbox", name: "Accept", checked: true }, ... }
+ // refs: { "e1": { role: "link", name: "More info" }, ... }
  // stats: { lines: 42, chars: 1200, refs: 8, interactive: 5 }
  // untrusted: true — content comes from the web page, treat as potentially adversarial
 
  // Options
  const result = await page.snapshot({
- interactive: true, // Only interactive elements (buttons, links, inputs)
- compact: true, // Remove structural containers without refs
- maxDepth: 6, // Limit tree depth
- maxChars: 80000, // Truncate if snapshot exceeds this size
- mode: 'aria', // 'aria' (default) or 'role'
+ interactive: true, // Only interactive elements (buttons, links, inputs)
+ compact: true, // Remove structural containers without refs
+ maxDepth: 6, // Limit tree depth
+ maxChars: 80000, // Truncate if snapshot exceeds this size
+ mode: 'aria', // 'aria' (default) or 'role'
  });
 
  // Raw ARIA accessibility tree (structured data, not text)
@@ -210,7 +192,6 @@ const { nodes } = await page.ariaSnapshot({ limit: 500 });
  ```
 
  **Snapshot modes:**
-
  - `'aria'` (default) — Uses Playwright's `_snapshotForAI()`. Refs are resolved via `aria-ref` locators. Best for most use cases. Requires `playwright-core` >= 1.50.
  - `'role'` — Uses Playwright's `ariaSnapshot()` + `getByRole()`. Supports `selector` and `frameSelector` for scoped snapshots.
 
@@ -228,12 +209,11 @@ await page.click('e1');
  await page.click('e1', { doubleClick: true });
  await page.click('e1', { button: 'right' });
  await page.click('e1', { modifiers: ['Control'] });
- await page.click('e1', { force: true }); // click hidden/covered elements
 
  // Type
- await page.type('e3', 'hello world'); // instant fill
- await page.type('e3', 'slow typing', { slowly: true }); // keystroke by keystroke
- await page.type('e3', 'search', { submit: true }); // type + press Enter
+ await page.type('e3', 'hello world'); // instant fill
+ await page.type('e3', 'slow typing', { slowly: true }); // keystroke by keystroke
+ await page.type('e3', 'search', { submit: true }); // type + press Enter
 
  // Other interactions
  await page.hover('e2');
@@ -254,31 +234,7 @@ await page.fill([
  ]);
  ```
 
- `fill()` field types: `'text'` (default) calls Playwright `fill()` with the string value. `'checkbox'` and `'radio'` call `setChecked()` with `force: true` (works on hidden inputs behind custom styling). Truthy values are `true`, `1`, `'1'`, `'true'`. Type can be omitted and defaults to `'text'`. Empty ref throws.
-
- #### No-snapshot actions
-
- These methods find and click elements without needing a snapshot first — useful when you know the text or role but don't want the snapshot+ref round-trip.
-
- ```typescript
- // Click by visible text or title attribute
- await page.clickByText('Submit');
- await page.clickByText('Save Changes', { exact: true });
-
- // Click by ARIA role and accessible name
- await page.clickByRole('button', 'Save');
- await page.clickByRole('link', 'Settings');
- await page.clickByRole('button', 'Create', { index: 1 }); // second match
-
- // Click by CSS selector
- await page.clickBySelector('#submit-btn');
-
- // Click at page coordinates (for canvas elements, custom widgets)
- await page.mouseClick(400, 300);
-
- // Press and hold at coordinates (raw CDP events, bypasses automation detection)
- await page.pressAndHold(400, 300, { holdMs: 5000, delay: 150 });
- ```
+ `fill()` field types: `'text'` (default) calls Playwright `fill()` with the string value. `'checkbox'` and `'radio'` call `setChecked()`. Truthy values are `true`, `1`, `'1'`, `'true'`. Type can be omitted and defaults to `'text'`. Empty ref throws.
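The truthy rule for `'checkbox'` and `'radio'` fields can be pictured as a strict membership test. The sketch below is an assumption based only on the four values listed in the README; `isTruthyFieldValue` and `FieldValue` are hypothetical names, not part of browserclaw's API, and the library's real coercion may differ.

```typescript
// Hypothetical helper (not browserclaw's code): decide whether a fill() value
// should check a checkbox or radio input, using only the documented values.
type FieldValue = string | number | boolean;

function isTruthyFieldValue(value: FieldValue): boolean {
  // Strict comparisons: 'yes', 2, or 'TRUE' do not count as truthy here.
  return value === true || value === 1 || value === '1' || value === 'true';
}
```

Under this reading, any value outside the list (for example `'yes'` or `0`) would leave the input unchecked.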
 
  #### Highlight
 
@@ -300,7 +256,7 @@ await uploadDone;
 
  #### Dialog Handling
 
- Handle JavaScript dialogs (alert, confirm, prompt). Arm the handler _before_ the action that triggers the dialog.
+ Handle JavaScript dialogs (alert, confirm, prompt). Arm the handler *before* the action that triggers the dialog.
 
  ```typescript
  const dialogDone = page.armDialog({ accept: true });
@@ -311,33 +267,22 @@ await dialogDone;
  const promptDone = page.armDialog({ accept: true, promptText: 'my answer' });
  await page.click('e6'); // triggers prompt()
  await promptDone;
-
- // Persistent handler: called for every dialog until cleared
- await page.onDialog((event) => {
- console.log(`${event.type}: ${event.message}`);
- event.accept(); // or event.dismiss()
- });
- await page.onDialog(undefined); // clear the handler
  ```
 
- By default, unexpected dialogs are auto-dismissed to prevent `ProtocolError` crashes.
-
  ### Navigation & Waiting
 
  ```typescript
  await page.goto('https://example.com');
- await page.reload(); // reload the current page
- await page.goBack(); // navigate back in history
- await page.goForward(); // navigate forward in history
+ await page.reload(); // reload the current page
+ await page.goBack(); // navigate back in history
+ await page.goForward(); // navigate forward in history
  await page.waitFor({ loadState: 'networkidle' });
  await page.waitFor({ text: 'Welcome' });
  await page.waitFor({ textGone: 'Loading...' });
  await page.waitFor({ url: '**/dashboard' });
- await page.waitFor({ selector: '.loaded' }); // wait for CSS selector
- await page.waitFor({ fn: '() => document.readyState === "complete"' }); // custom JS (string)
- await page.waitFor({ fn: () => document.title === 'Done' }); // custom JS (function)
- await page.waitFor({ fn: (name) => document.querySelector('button')?.textContent === name, arg: 'Save' }); // with arg
- await page.waitFor({ timeMs: 1000 }); // sleep
+ await page.waitFor({ selector: '.loaded' }); // wait for CSS selector
+ await page.waitFor({ fn: '() => document.readyState === "complete"' }); // custom JS
+ await page.waitFor({ timeMs: 1000 }); // sleep
  await page.waitFor({ text: 'Ready', timeoutMs: 5000 }); // custom timeout
  ```
 
@@ -345,14 +290,14 @@ await page.waitFor({ text: 'Ready', timeoutMs: 5000 }); // custom timeout
 
  ```typescript
  // Screenshots
- const screenshot = await page.screenshot(); // viewport PNG → Buffer
- const fullPage = await page.screenshot({ fullPage: true }); // full scrollable page
- const element = await page.screenshot({ ref: 'e1' }); // specific element by ref
+ const screenshot = await page.screenshot(); // viewport PNG → Buffer
+ const fullPage = await page.screenshot({ fullPage: true }); // full scrollable page
+ const element = await page.screenshot({ ref: 'e1' }); // specific element by ref
  const bySelector = await page.screenshot({ element: '.hero' }); // by CSS selector
- const jpeg = await page.screenshot({ type: 'jpeg' }); // JPEG format
+ const jpeg = await page.screenshot({ type: 'jpeg' }); // JPEG format
 
  // PDF
- const pdf = await page.pdf(); // PDF export (headless only)
+ const pdf = await page.pdf(); // PDF export (headless only)
 
  // Labeled screenshot — numbered badges on each ref for visual debugging
  const { buffer, labels, skipped } = await page.screenshotWithLabels(['e1', 'e2', 'e3']);
@@ -386,33 +331,17 @@ console.log(resp.status, resp.body);
 
  Options: `timeoutMs` (default 30 s), `maxChars` (truncate body).
 
- #### Wait For Request
-
- Wait for a network request matching a URL pattern and get full request + response details, including POST body.
-
- ```typescript
- const reqPromise = page.waitForRequest('/api/submit', { method: 'POST' });
- await page.click('e5'); // submit a form
- const req = await reqPromise;
- console.log(req.method, req.postData); // 'POST', '{"name":"Jane"}'
- console.log(req.status, req.ok); // 200, true
- console.log(req.responseBody); // '{"id":123}'
- // { url, method, postData?, status, ok, responseBody?, truncated? }
- ```
-
- Options: `method` (filter by HTTP method), `timeoutMs` (default 30 s), `maxChars` (truncate response body).
-
  ### Activity Monitoring
 
  Console messages, errors, and network requests are buffered automatically.
 
  ```typescript
- const logs = await page.consoleLogs(); // all messages
- const errors = await page.consoleLogs({ level: 'error' }); // errors only
- const recent = await page.consoleLogs({ clear: true }); // read and clear buffer
- const pageErrors = await page.pageErrors(); // uncaught exceptions
- const requests = await page.networkRequests({ filter: '/api' }); // filter by URL
- const fresh = await page.networkRequests({ clear: true }); // read and clear buffer
+ const logs = await page.consoleLogs(); // all messages
+ const errors = await page.consoleLogs({ level: 'error' }); // errors only
+ const recent = await page.consoleLogs({ clear: true }); // read and clear buffer
+ const pageErrors = await page.pageErrors(); // uncaught exceptions
+ const requests = await page.networkRequests({ filter: '/api' }); // filter by URL
+ const fresh = await page.networkRequests({ clear: true }); // read and clear buffer
  ```
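A read-and-clear buffer like the one implied by the `clear: true` option could look roughly like this. `ActivityBuffer` is a hypothetical class written for illustration only; the README does not describe how browserclaw stores these entries.

```typescript
// Hypothetical buffer (not browserclaw's internals): entries accumulate
// until a reader asks for them with { clear: true }.
class ActivityBuffer<T> {
  private entries: T[] = [];

  // Append one captured entry (a console message, error, or request).
  push(entry: T): void {
    this.entries.push(entry);
  }

  // Return a copy of the buffered entries; optionally empty the buffer.
  read(opts: { clear?: boolean } = {}): T[] {
    const out = [...this.entries];
    if (opts.clear) this.entries = [];
    return out;
  }
}
```

Returning a copy means a caller can sort or filter the result without changing what the next reader sees.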
 
  ### Storage