browser-pilot 0.0.14 → 0.0.15
- package/README.md +81 -672
- package/dist/actions.cjs +1058 -41
- package/dist/actions.d.cts +11 -3
- package/dist/actions.d.ts +11 -3
- package/dist/actions.mjs +1 -1
- package/dist/browser-MEWT75IB.mjs +11 -0
- package/dist/browser.cjs +1021 -40
- package/dist/browser.d.cts +3 -3
- package/dist/browser.d.ts +3 -3
- package/dist/browser.mjs +3 -3
- package/dist/cdp.cjs +5 -1
- package/dist/cdp.d.cts +1 -1
- package/dist/cdp.d.ts +1 -1
- package/dist/cdp.mjs +1 -1
- package/dist/{chunk-XMJABKCF.mjs → chunk-7YVCOL2W.mjs} +1058 -41
- package/dist/{chunk-KIFB526Y.mjs → chunk-BVZALQT4.mjs} +5 -1
- package/dist/chunk-DTVRFXKI.mjs +35 -0
- package/dist/{chunk-SPSZZH22.mjs → chunk-LCNFBXB5.mjs} +9 -33
- package/dist/{chunk-7NDR6V7S.mjs → chunk-USYSHCI3.mjs} +1369 -517
- package/dist/chunk-WPNW23CE.mjs +466 -0
- package/dist/{chunk-IN5HPAPB.mjs → chunk-ZAXQ5OTV.mjs} +6 -2
- package/dist/cli.mjs +2522 -1048
- package/dist/{client-Ck2nQksT.d.cts → client-B5QBRgIy.d.cts} +2 -0
- package/dist/{client-Ck2nQksT.d.ts → client-B5QBRgIy.d.ts} +2 -0
- package/dist/{client-3AFV2IAF.mjs → client-JWWZWO6L.mjs} +4 -2
- package/dist/index.cjs +1067 -42
- package/dist/index.d.cts +4 -4
- package/dist/index.d.ts +4 -4
- package/dist/index.mjs +3 -3
- package/dist/page-XPS6IC6V.mjs +7 -0
- package/dist/{types-CjT0vClo.d.ts → types-C9ySEdOX.d.cts} +17 -3
- package/dist/{types-BSoh5v1Y.d.cts → types-Cvvf0oGu.d.ts} +17 -3
- package/package.json +1 -1
- package/dist/browser-LZTEHUDI.mjs +0 -9
package/README.md CHANGED
@@ -6,38 +6,19 @@
 [](https://www.typescriptlang.org/)
 [](https://github.com/svilupp/browser-pilot/blob/main/LICENSE)
 
-
+Automation-first CDP browser control for AI agents.
 
-
-import { connect } from 'browser-pilot';
-
-const browser = await connect({ provider: 'browserbase', apiKey: process.env.BROWSERBASE_API_KEY });
-const page = await browser.page();
+Browser Pilot now teaches one workflow model:
 
-
-
-
-
+- inspect the page
+- act in the browser
+- record a manual workflow
+- trace behavior over time
+- exercise voice/media and browser conditions
 
-
-console.log(snapshot.text); // Accessibility tree as text
-
-await browser.close();
-```
+`record` and `trace` are two interfaces over the same capture system. `record` writes the canonical artifact. `trace` explains either a live session or a saved artifact.
 
-##
-
-| Problem with Playwright/Puppeteer | browser-pilot Solution |
-|-----------------------------------|------------------------|
-| Won't run in Cloudflare Workers | Pure Web Standard APIs, zero Node.js dependencies |
-| Bun CDP connection bugs | Custom CDP client that works everywhere |
-| Single-selector API (fragile) | Multi-selector by default: `['#primary', '.fallback']` |
-| No action batching (high latency) | Batch DSL: one call for entire sequences |
-| No inline assertions (extra API calls to verify) | Built-in assertions: verify state within the same batch |
-| No AI-optimized snapshots | Built-in accessibility tree extraction |
-| No audio I/O for voice agents | Mic injection + output capture + Whisper transcription |
-
-## Installation
+## Install
 
 ```bash
 bun add browser-pilot
@@ -45,690 +26,118 @@ bun add browser-pilot
 npm install browser-pilot
 ```
 
-
-
-### BrowserBase (Recommended for production)
-
-```typescript
-const browser = await connect({
-  provider: 'browserbase',
-  apiKey: process.env.BROWSERBASE_API_KEY,
-  projectId: process.env.BROWSERBASE_PROJECT_ID, // optional
-});
-```
-
-### Browserless
-
-```typescript
-const browser = await connect({
-  provider: 'browserless',
-  apiKey: process.env.BROWSERLESS_API_KEY,
-});
-```
-
-### Generic (Local Chrome)
+For local Chrome:
 
 ```bash
-
-
-
-
-```typescript
-const browser = await connect({
-  provider: 'generic',
-  wsUrl: 'ws://localhost:9222/devtools/browser/...', // optional, auto-discovers
-});
-```
-
-## Core Concepts
-
-### Multi-Selector (Robust Automation)
-
-Every action accepts `string | string[]`. When given an array, tries each selector in order until one works:
-
-```typescript
-// Tries #submit first, falls back to alternatives
-await page.click(['#submit', 'button[type=submit]', '.submit-btn']);
-
-// Cookie consent - try multiple common patterns
-await page.click([
-  '#accept-cookies',
-  '.cookie-accept',
-  'button:has-text("Accept")',
-  '[data-testid="cookie-accept"]'
-], { optional: true, timeout: 3000 });
-```
-
-### Built-in Waiting
-
-Every action automatically waits for the element to be visible before interacting:
-
-```typescript
-// No separate waitFor needed - this waits automatically
-await page.click('.dynamic-button', { timeout: 5000 });
-
-// Explicit waiting when needed
-await page.waitFor('.loading', { state: 'hidden' });
-await page.waitForNavigation();
-await page.waitForNetworkIdle();
-```
-
-### Batch Actions
-
-Execute multiple actions in a single call with full result tracking:
-
-```typescript
-const result = await page.batch([
-  { action: 'goto', url: 'https://example.com/login' },
-  { action: 'fill', selector: '#email', value: 'user@example.com' },
-  { action: 'fill', selector: '#password', value: 'secret' },
-  { action: 'submit', selector: '#login-btn' },
-  { action: 'wait', waitFor: 'navigation' },
-  { action: 'snapshot' },
-]);
-
-console.log(result.success); // true if all steps succeeded
-console.log(result.totalDurationMs); // total execution time
-console.log(result.steps[5].result); // snapshot from step 5
-```
-
-Assertion steps verify expected state within the same batch — no extra round trips. Available: `assertVisible`, `assertExists`, `assertText`, `assertUrl`, `assertValue`.
-
-```typescript
-const result = await page.batch([
-  { action: 'goto', url: 'https://example.com/login' },
-  { action: 'fill', selector: '#email', value: 'user@example.com' },
-  { action: 'fill', selector: '#password', value: 'secret' },
-  { action: 'submit', selector: '#login-btn' },
-  { action: 'assertUrl', expect: '/dashboard' },
-  { action: 'assertVisible', selector: '.welcome-message' },
-]);
-```
-
-Any step supports `retry` and `retryDelay` for flaky or async content:
-
-```typescript
-{ action: 'assertVisible', selector: '.async-content', retry: 3, retryDelay: 1000 }
+/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
+  --remote-debugging-port=9222 \
+  --user-data-dir=/tmp/browser-pilot-profile
 ```
 
-
+## Choose the command by job
 
-
-
-
-
-
-
-
-
-// Interactive elements with refs
-console.log(snapshot.interactiveElements);
-// [{ ref: 'e1', role: 'button', name: 'Submit', selector: '...' }, ...]
-
-// Text representation for LLMs
-console.log(snapshot.text);
-// - main ref:e1
-// - heading "Welcome" ref:e2
-// - button "Get Started" ref:e3
-// - textbox ref:e4 placeholder="Email"
-```
-
-### Ref-Based Selectors
-
-After taking a snapshot, use element refs directly as selectors:
-
-```typescript
-const snapshot = await page.snapshot();
-// Output shows: button "Submit" ref:e4
-
-// Click using the ref - no fragile CSS needed
-await page.click('ref:e4');
-
-// Fill input by ref
-await page.fill('ref:e23', 'hello@example.com');
-
-// Combine ref with CSS fallbacks
-await page.click(['ref:e4', '#submit', 'button[type=submit]']);
-```
-
-Refs are stable until page navigation. Always take a fresh snapshot after navigating.
-CLI note: refs are cached per session+URL after a snapshot, so you can reuse them across CLI calls
-until navigation changes the URL.
-
-## Page API
-
-### Navigation
-
-```typescript
-await page.goto(url, options?)
-await page.reload(options?)
-await page.goBack(options?)
-await page.goForward(options?)
-
-const url = await page.url()
-const title = await page.title()
-```
-
-### Actions
-
-All actions accept `string | string[]` for selectors:
-
-```typescript
-await page.click(selector, options?)
-await page.fill(selector, value, options?) // clears first by default
-await page.type(selector, text, options?) // types character by character
-await page.select(selector, value, options?) // native <select>
-await page.select({ trigger, option, value, match }, options?) // custom dropdown
-await page.check(selector, options?)
-await page.uncheck(selector, options?)
-await page.submit(selector, options?) // tries Enter, then click
-await page.press(key)
-await page.focus(selector, options?)
-await page.hover(selector, options?)
-await page.scroll(selector, options?)
-```
-
-### Waiting
-
-```typescript
-await page.waitFor(selector, { state: 'visible' | 'hidden' | 'attached' | 'detached' })
-await page.waitForNavigation(options?)
-await page.waitForNetworkIdle({ idleTime: 500 })
-```
-
-### Content
-
-```typescript
-const snapshot = await page.snapshot()
-const text = await page.text(selector?)
-const screenshot = await page.screenshot({ format: 'png', fullPage: true })
-const result = await page.evaluate(() => document.title)
-```
-
-### Files
-
-```typescript
-await page.setInputFiles(selector, [{ name: 'file.pdf', mimeType: 'application/pdf', buffer: data }])
-const download = await page.waitForDownload(() => page.click('#download-btn'))
-```
-
-### Emulation
-
-```typescript
-import { devices } from 'browser-pilot';
-
-await page.emulate(devices['iPhone 14']); // Full device emulation
-await page.setViewport({ width: 1280, height: 720, deviceScaleFactor: 2 });
-await page.setUserAgent('Custom UA');
-await page.setGeolocation({ latitude: 37.7749, longitude: -122.4194 });
-await page.setTimezone('America/New_York');
-await page.setLocale('fr-FR');
-```
-
-Devices: `iPhone 14`, `iPhone 14 Pro Max`, `Pixel 7`, `iPad Pro 11`, `Desktop Chrome`, `Desktop Firefox`
-
-### Audio I/O
-
-```typescript
-// Set up audio input/output interception
-await page.setupAudio();
-
-// Play audio into the page's fake microphone
-await page.audioInput.play(wavBytes, { waitForEnd: true });
-
-// Capture audio output until silence
-const capture = await page.audioOutput.captureUntilSilence({ silenceTimeout: 5000 });
-
-// Full round-trip: play input → capture response
-const result = await page.audioRoundTrip({ input: wavBytes, silenceTimeout: 5000 });
-
-// Transcribe captured audio (requires OPENAI_API_KEY)
-import { transcribe } from 'browser-pilot';
-const { text } = await transcribe(capture);
-```
+| Job | Primary commands |
+| --- | --- |
+| Inspect page state | `snapshot`, `page`, `forms`, `text`, `targets`, `diagnose` |
+| Act in the browser | `exec`, `run` |
+| Capture a human demo | `record` |
+| Investigate behavior over time | `trace` |
+| Exercise voice/media | `audio` |
+| Change browser conditions | `env` |
 
-
-
-```typescript
-// Block images and fonts
-await page.blockResources(['Image', 'Font']);
-
-// Mock API responses
-await page.route('**/api/users', { status: 200, body: { users: [] } });
-
-// Full control
-await page.intercept('*api*', async (request, actions) => {
-  if (request.url.includes('blocked')) await actions.fail();
-  else await actions.continue({ headers: { ...request.headers, 'X-Custom': 'value' } });
-});
-```
-
-### Cookies & Storage
-
-```typescript
-// Cookies
-const cookies = await page.cookies();
-await page.setCookie({ name: 'session', value: 'abc', domain: '.example.com' });
-await page.clearCookies();
-
-// localStorage / sessionStorage
-await page.setLocalStorage('key', 'value');
-const value = await page.getLocalStorage('key');
-await page.clearLocalStorage();
-```
-
-### Console & Dialogs
-
-```typescript
-// Capture console messages
-await page.onConsole((msg) => console.log(`[${msg.type}] ${msg.text}`));
-
-// Handle dialogs (alert, confirm, prompt)
-await page.onDialog(async (dialog) => {
-  if (dialog.type === 'confirm') await dialog.accept();
-  else await dialog.dismiss();
-});
-
-// Collect messages during an action
-const { result, messages } = await page.collectConsole(async () => {
-  return await page.click('#button');
-});
-```
-
-**Important:** Native browser dialogs (`alert()`, `confirm()`, `prompt()`) block all CDP commands until handled. Always set up a dialog handler before triggering actions that may show dialogs.
-
-### Iframes
-
-Switch context to interact with iframe content:
-
-```typescript
-// Switch to iframe
-await page.switchToFrame('iframe#payment');
-
-// Now actions target the iframe
-await page.fill('#card-number', '4242424242424242');
-await page.fill('#expiry', '12/25');
-
-// Switch back to main document
-await page.switchToMain();
-await page.click('#submit-order');
-```
-
-Note: Cross-origin iframes cannot be accessed due to browser security.
-
-### Options
-
-```typescript
-interface ActionOptions {
-  timeout?: number; // default: 30000ms
-  optional?: boolean; // return false instead of throwing on failure
-}
-```
-
-## CLI
-
-The CLI provides session persistence for interactive workflows:
+## Golden path 1: automate a page
 
 ```bash
-
-bp
-bp
-
-
-
-# Execute actions
-bp exec -s my-session '{"action":"goto","url":"https://example.com"}'
-bp exec -s my-session '[
-  {"action":"fill","selector":"#search","value":"browser automation"},
-  {"action":"submit","selector":"#search-form"}
-]'
-
-# Get page state (note the refs in output)
-bp snapshot -s my-session --format text
-# Output: button "Submit" ref:e4, textbox "Email" ref:e5, ...
-
-# Use refs from snapshot for reliable targeting
-# Refs are cached per session+URL after snapshot
-bp exec -s my-session '{"action":"click","selector":"ref:e4"}'
-bp exec -s my-session '{"action":"fill","selector":"ref:e5","value":"test@example.com"}'
-
-# Quick discovery commands
-bp page -s my-session # URL, title, headings, forms, interactive controls
-bp forms -s my-session # Structured form metadata only
-bp targets -s my-session # Browser tabs with targetIds
-bp connect --new-tab --url https://example.com --name fresh
-
-# Handle native dialogs (alert/confirm/prompt)
-bp exec --dialog accept '{"action":"click","selector":"#delete-btn"}'
-bp exec --record '[{"action":"click","selector":"#checkout"},{"action":"assertText","expect":"Thanks"}]'
-
-# Other commands
-bp text -s my-session --selector ".main-content"
-bp screenshot -s my-session --output page.png
-bp listen ws -m "*voice*" # monitor WebSocket traffic
-bp list # list all sessions
-bp clean --max-size 500MB # trim old sessions by disk usage
-bp close -s my-session # close session
-bp actions # show complete action reference
-bp run workflow.json # run a workflow file
-
-# Daemon management
-bp daemon status # check daemon health
-bp daemon stop # stop daemon for default session
-bp daemon logs # view daemon log
-
-# Actions with inline assertions (no extra bp eval needed)
-bp exec '[
-  {"action":"goto","url":"https://example.com/login"},
-  {"action":"fill","selector":"#email","value":"user@example.com"},
-  {"action":"submit","selector":"form"},
-  {"action":"assertUrl","expect":"/dashboard"},
+bp connect --provider generic --name dev
+bp snapshot -i -s dev
+bp exec -s dev '[
+  {"action":"fill","selector":"ref:e5","value":"user@example.com"},
+  {"action":"click","selector":"ref:e7"},
   {"action":"assertText","expect":"Welcome"}
 ]'
 ```
 
-
-
-The CLI is designed for AI agent tool calls. The recommended workflow:
+Use `bp snapshot -i` first. Refs are the default targeting strategy.
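Note on the `bp exec` payload in golden path 1: the argument is a JSON array of step objects wrapped in single quotes for the shell. A minimal sketch of generating that command line from typed steps follows, assuming only the fields visible in the README examples (`action`, `selector`, `value`, `expect`). The names `ExecStep`, `shellQuote`, and `toExecCommand` are hypothetical helpers for illustration and are not part of browser-pilot.

```typescript
// Hypothetical step shapes, modeled only on the fields shown in the README examples.
type ExecStep =
  | { action: 'fill'; selector: string | string[]; value: string }
  | { action: 'click'; selector: string | string[] }
  | { action: 'assertText'; expect: string };

// Wrap a string in single quotes for a POSIX shell, escaping embedded single quotes.
function shellQuote(text: string): string {
  return "'" + text.replace(/'/g, "'\\''") + "'";
}

// Build a `bp exec` command line for a named session from typed steps.
function toExecCommand(session: string, steps: ExecStep[]): string {
  return `bp exec -s ${shellQuote(session)} ${shellQuote(JSON.stringify(steps))}`;
}

// The same three steps as the golden-path example, using refs from a prior snapshot.
const command = toExecCommand('dev', [
  { action: 'fill', selector: 'ref:e5', value: 'user@example.com' },
  { action: 'click', selector: 'ref:e7' },
  { action: 'assertText', expect: 'Welcome' },
]);
```

Typing the steps catches a missing `value` or `expect` at compile time, and quoting the serialized JSON keeps a value that contains a single quote from breaking the shell command.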
 
-
-2. **Use refs** (`ref:e4`) for reliable element targeting
-3. **Batch actions** to reduce round trips
+## Golden path 2: capture a manual workflow and derive automation
 
 ```bash
-
-
-
-
-
-bp exec '[
-  {"action":"fill","selector":"ref:e5","value":"laptop"},
-  {"action":"click","selector":"ref:e12"},
-  {"action":"snapshot"}
-]' --format json
-```
-
-Multi-selector fallbacks for robustness:
-```bash
-bp exec '[
-  {"action":"click","selector":["ref:e4","#submit","button[type=submit]"]}
-]'
-```
-
-Output:
-```json
-{
-  "success": true,
-  "steps": [
-    {"action": "fill", "success": true, "durationMs": 30},
-    {"action": "click", "success": true, "durationMs": 50, "selectorUsed": "ref:e12"},
-    {"action": "snapshot", "success": true, "durationMs": 100, "result": "..."}
-  ],
-  "totalDurationMs": 180
-}
-```
-
-Run `bp actions` for complete action reference.
-
-### Voice Agent Testing
-
-Test audio-based AI apps (voice assistants, phone agents) by injecting microphone input and capturing spoken responses.
-
-> **Full guide:** [Voice Agent Testing Guide](./docs/guides/voice-agent-testing.md)
-
-```bash
-export OPENAI_API_KEY=sk-... # Required for --transcribe
-
-# Validate audio pipeline
-bp audio check -s my-session
-# Output: "READY for roundtrip" with agent AudioContext detected
-
-# Full round-trip: send audio prompt → wait for response → transcribe
-bp audio roundtrip -i prompt.wav --transcribe --silence-timeout 1500
-# Output: { "transcript": "Welcome! I'd be happy to help...", "latencyMs": 5200, ... }
-
-# Save response audio for manual review
-bp audio roundtrip -i prompt.wav -o response.wav --transcribe
-```
-
-**Important:** Audio overrides must be injected before the voice agent initializes. Use `bp audio check` to validate the pipeline. See the [full guide](./docs/guides/voice-agent-testing.md) for setup order and troubleshooting.
-
-Programmatic API:
-
-```typescript
-await page.setupAudio();
-
-const result = await page.audioRoundTrip({
-  input: audioBytes,
-  silenceTimeout: 1500,
-});
-
-import { transcribe } from 'browser-pilot';
-const { text } = await transcribe(result.audio);
-console.log(text); // "Welcome! I'd be happy to help..."
+bp record -s demo --profile automation -f ./artifacts/demo.recording.json
+# perform the flow manually, then stop with Ctrl+C
+bp record summary ./artifacts/demo.recording.json
+bp record derive ./artifacts/demo.recording.json -o workflow.json
+bp run workflow.json
 ```
 
-
+Do not start by opening the raw artifact. Use `record summary`, `record inspect`, or `trace summary --view ...` first.
 
-
+## Golden path 3: debug a realtime or voice session
 
 ```bash
-
-bp
-
-
-bp
-
-# Use specific session with custom output file
-bp record -s my-session -f login-flow.json
-
-# Review and edit the recording
-cat recording.json
-
-# Replay the recording
-bp exec -s my-session --file recording.json
-```
-
-The output format is compatible with `page.batch()`:
-```json
-{
-  "recordedAt": "2026-01-06T10:00:00.000Z",
-  "startUrl": "https://example.com",
-  "duration": 15000,
-  "steps": [
-    { "action": "fill", "selector": ["[data-testid=\"email\"]", "#email"], "value": "user@example.com" },
-    { "action": "click", "selector": ["[data-testid=\"submit\"]", "#login-btn"] }
-  ]
-}
+bp connect --provider generic --name realtime
+bp trace start -s realtime --timeout 20000
+# reproduce the issue in the app
+bp trace summary -s realtime --view ws
+bp trace summary -s realtime --view console
 ```
 
-
-- Sensitive fields are automatically redacted as `[REDACTED]` based on input settings such as `type="password"`, `type="hidden"`, and secret/autofill hints like `autocomplete="one-time-code"` or `cc-number`
-- Selectors are multi-selector arrays ordered by reliability (data attributes > IDs > CSS paths)
-- Edit the JSON to adjust selectors or add `optional: true` flags
-
-### Screenshot Trail During Replay
-
-Capture a lightweight visual trail while replaying steps. Enable recording at the session level so all `bp exec` calls are captured automatically:
+Voice workflow:
 
 ```bash
-
-bp
-
-
-bp
-  {"action":"goto","url":"https://example.com/login"},
-  {"action":"fill","selector":"#email","value":"user@example.com"},
-  {"action":"submit","selector":"form"}
-]'
-bp exec -s my-session '{"action":"assertUrl","expect":"/dashboard"}'
-
-# Or enable recording on a single exec call
-bp exec --record '[{"action":"click","selector":"#checkout"}]'
-```
-
-This writes `recording.json` plus a `screenshots/` directory in the session directory. Sensitive field values are redacted in both the manifest and the screenshot overlays. See the [Action Recording Guide](./docs/guides/action-recording.md) for options like `--record-format`, `--record-quality`, and `--no-highlights`.
-
-## Examples
-
-### Login Flow with Error Handling
-
-```typescript
-const result = await page.batch([
-  { action: 'goto', url: 'https://app.example.com/login' },
-  { action: 'fill', selector: ['#email', 'input[name=email]'], value: email },
-  { action: 'fill', selector: ['#password', 'input[name=password]'], value: password },
-  { action: 'click', selector: '.remember-me', optional: true },
-  { action: 'submit', selector: ['#login', 'button[type=submit]'] },
-], { onFail: 'stop' });
-
-if (!result.success) {
-  console.error(`Failed at step ${result.stoppedAtIndex}: ${result.steps[result.stoppedAtIndex!].error}`);
-}
+bp audio setup -s realtime
+bp exec -s realtime '{"action":"goto","url":"https://my-voice-app.com"}'
+bp audio check -s realtime
+bp audio roundtrip -s realtime -i prompt.wav --transcribe -o response.wav
+bp trace summary -s realtime --view voice
 ```
 
-
-
-```typescript
-// Using the custom select config
-await page.select({
-  trigger: '.country-dropdown',
-  option: '.dropdown-option',
-  value: 'United States',
-  match: 'text', // or 'contains' or 'value'
-});
-
-// Or compose from primitives
-await page.click('.country-dropdown');
-await page.fill('.dropdown-search', 'United');
-await page.click('.dropdown-option:has-text("United States")');
-```
-
-### WebSocket Daemon
-
-By default, `bp connect` spawns a lightweight background daemon that holds the CDP WebSocket open. Subsequent CLI commands connect via Unix socket (~5-15ms) instead of re-establishing WebSocket (~280-1030ms per command).
+## Golden path 4: exercise failure modes
 
 ```bash
-
-bp
-
-
-bp exec -s dev '{"action":"snapshot"}' # ~5-15ms overhead instead of ~280ms
-
-# Manage the daemon
-bp daemon status # check health, PID, uptime
-bp daemon stop # stop daemon
-bp daemon logs # view daemon log
-
-# Disable daemon for direct WebSocket
-bp connect --no-daemon
+bp env permissions grant -s realtime microphone
+bp env network offline -s realtime --duration 5000
+bp trace watch -s realtime --view ws --assert profile:reconnect --timeout 15000
+bp env visibility hidden -s realtime
 ```
 
-
-
-### Cloudflare Workers
+## What is new in the model
 
-
-
-
-
-
-const browser = await connect({
-  provider: 'browserbase',
-  apiKey: env.BROWSERBASE_API_KEY,
-});
-
-const page = await browser.page();
-await page.goto('https://example.com');
-const snapshot = await page.snapshot();
-
-await browser.close();
-
-return Response.json({ title: snapshot.title, elements: snapshot.interactiveElements });
-},
-};
-```
+- One canonical artifact model with `version: 2`
+- One canonical trace event stream for recording, live trace, and session logs
+- Trace-backed waits and assertions in `exec` / `run`
+- `listen` preserved as a compatibility alias to `trace tail`
+- `audio` for active control, `trace` for explanation, `env` for browser-state controls
 
-
+## Programmatic example
 
 ```typescript
-
-  name: 'browser_action',
-  description: 'Execute browser actions and get page state',
-  parameters: {
-    type: 'object',
-    properties: {
-      actions: {
-        type: 'array',
-        items: {
-          type: 'object',
-          properties: {
-            action: { enum: ['goto', 'click', 'fill', 'submit', 'snapshot', 'assertVisible', 'assertExists', 'assertText', 'assertUrl', 'assertValue'] },
-            selector: { type: ['string', 'array'] },
-            value: { type: 'string' },
-            url: { type: 'string' },
-          },
-        },
-      },
-    },
-  },
-  execute: async ({ actions }) => {
-    const page = await getOrCreatePage();
-    return page.batch(actions);
-  },
-};
-```
-
-## Advanced
-
-### Direct CDP Access
+import { connect } from 'browser-pilot';
 
-```typescript
 const browser = await connect({ provider: 'generic' });
-const
-
-// Send any CDP command
-await cdp.send('Emulation.setDeviceMetricsOverride', {
-  width: 375,
-  height: 812,
-  deviceScaleFactor: 3,
-  mobile: true,
-});
-```
-
-### Tracing
+const page = await browser.page();
 
-
-
+await page.batch([
+  { action: 'goto', url: 'https://example.com/login' },
+  { action: 'fill', selector: ['#email', 'input[type=email]'], value: 'user@example.com' },
+  { action: 'submit', selector: 'form' },
+  { action: 'assertUrl', expect: '/dashboard' },
+]);
 
-
-// [info] goto https://example.com ✓ (1200ms)
-// [info] click #submit ✓ (50ms)
+await browser.close();
 ```
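Note on the programmatic example: the new README awaits `page.batch(...)` without inspecting what it returns. The sketch below shows one way a caller might summarize a batch result. It assumes the result shape shown in the removed 0.0.14 README (`success`, `steps`, `totalDurationMs`, `stoppedAtIndex`, per-step `error`); whether 0.0.15 keeps that shape is not confirmed by this diff. `StepResult`, `BatchResult`, and `summarizeBatch` are local, hypothetical names rather than exports of browser-pilot.

```typescript
// Local stand-ins for the result shape shown in the 0.0.14 README; assumptions, not imported types.
interface StepResult {
  action: string;
  success: boolean;
  durationMs: number;
  error?: string;
}

interface BatchResult {
  success: boolean;
  steps: StepResult[];
  totalDurationMs: number;
  stoppedAtIndex?: number;
}

// Turn a batch result into a one-line summary suitable for logs.
function summarizeBatch(result: BatchResult): string {
  if (result.success) {
    return `ok: ${result.steps.length} steps in ${result.totalDurationMs}ms`;
  }
  // Prefer the reported stop index; otherwise fall back to the first failed step.
  const index = result.stoppedAtIndex ?? result.steps.findIndex((step) => !step.success);
  const failed = result.steps[index];
  const reason = failed?.error ?? 'unknown error';
  return `failed at step ${index} (${failed?.action ?? 'unknown'}): ${reason}`;
}

// Hand-written sample data; in real use this would be the value returned by `page.batch(...)`.
const sample: BatchResult = {
  success: false,
  steps: [
    { action: 'goto', success: true, durationMs: 1200 },
    { action: 'assertUrl', success: false, durationMs: 40, error: 'URL mismatch' },
  ],
  totalDurationMs: 1240,
  stoppedAtIndex: 1,
};

const summary = summarizeBatch(sample);
```

Keeping the types local means the sketch compiles without the package installed; a real integration would take the types from the package's own declarations instead.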
 
-##
-
-browser-pilot is designed for AI agents. Two resources for agent setup:
-
-- **[llms.txt](./docs/llms.txt)** - Abbreviated reference for LLM context windows
-- **[Claude Code Skill](./docs/automating-browsers/SKILL.md)** - Full skill for Claude Code agents
-
-To use with Claude Code, copy `docs/automating-browsers/` to your project or reference it in your agent's context.
-
-## Documentation
-
-See the [docs](./docs) folder for detailed documentation:
+## Guides
 
-- [
-- [
-- [Action
-- [
-- [
-- [
-- [
-- [
-- [API Reference](./docs/api/page.md)
+- [CLI guide](./docs/cli.md)
+- [Automation workflows](./docs/guides/automation-workflows.md)
+- [Action recording](./docs/guides/action-recording.md)
+- [Trace workflows](./docs/guides/trace-workflows.md)
+- [Realtime debugging](./docs/guides/realtime-debugging.md)
+- [Voice agent testing](./docs/guides/voice-agent-testing.md)
+- [Artifact analysis](./docs/guides/artifact-analysis.md)
+- [LLM contract](./docs/llms.txt)
 
-##
+## Compatibility notes
 
-
+- Prefer `--debug` for transport logging. `--trace` still works as a legacy alias.
+- Prefer `bp trace tail ...`. `bp listen ...` still works as a compatibility alias.
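Note on the compatibility notes: scripts written against 0.0.14 may still use the legacy spellings. A small sketch of rewriting an argument list to the preferred forms follows. `modernizeArgs` is a hypothetical helper, not something the package ships, and it assumes that the arguments after `listen` carry over unchanged to `trace tail`, which this diff does not confirm.

```typescript
// Rewrite the legacy spellings named in the compatibility notes to the preferred ones.
// `args` is the argument list that follows the `bp` executable name.
// Limitation: only a subcommand in the first position is recognized.
function modernizeArgs(args: string[]): string[] {
  // `--trace` is the legacy alias for `--debug`; the `trace` subcommand is left alone.
  const flagsFixed = args.map((arg) => (arg === '--trace' ? '--debug' : arg));
  // `bp listen ...` is the compatibility alias for `bp trace tail ...`.
  if (flagsFixed[0] === 'listen') {
    return ['trace', 'tail', ...flagsFixed.slice(1)];
  }
  return flagsFixed;
}

// The `listen` invocation from the 0.0.14 README, with the legacy logging flag appended.
const modern = modernizeArgs(['listen', 'ws', '-m', '*voice*', '--trace']);
```

Matching on exact tokens keeps the `trace` subcommand and the `--trace` flag from being confused with each other.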