playwriter 0.0.62 → 0.0.63

package/src/skill.md CHANGED
@@ -119,6 +119,35 @@ If you find a bug, you can create a gh issue using `gh issue create -R remorses/
119
119
 
120
120
  Control user's Chrome browser via playwright code snippets. Prefer single-line code with semicolons between statements. Use playwriter immediately without waiting for user actions; only if you get "extension is not connected" or "no browser tabs have Playwriter enabled" should you ask the user to click the playwriter extension icon on the target tab.
121
121
 
122
+ **If Chrome is not running**, the extension can't connect. Start Chrome from the command line before retrying:
123
+
124
+ ```bash
125
+ # macOS
126
+ open -a "Google Chrome"
127
+
128
+ # Linux
129
+ google-chrome &
130
+
131
+ # Windows (cmd)
132
+ start chrome.exe
133
+
134
+ # Windows (PowerShell)
135
+ Start-Process chrome.exe
136
+ ```
137
+
138
+ To also enable automatic tab capture for screen recording (no manual extension click needed), add the `--allowlisted-extension-id` and `--auto-accept-this-tab-capture` flags:
139
+
140
+ ```bash
141
+ # macOS
142
+ open -a "Google Chrome" --args --allowlisted-extension-id=jfeammnjpkecdekppnclgkkffahnhfhe --auto-accept-this-tab-capture
143
+
144
+ # Linux
145
+ google-chrome --allowlisted-extension-id=jfeammnjpkecdekppnclgkkffahnhfhe --auto-accept-this-tab-capture &
146
+
147
+ # Windows (cmd)
148
+ start chrome.exe --allowlisted-extension-id=jfeammnjpkecdekppnclgkkffahnhfhe --auto-accept-this-tab-capture
149
+ ```
150
+
122
151
  You can collaborate with the user - they can help with captchas, difficult elements, or reproducing bugs.
123
152
 
124
153
  ## context variables
@@ -143,14 +172,97 @@ You can collaborate with the user - they can help with captchas, difficult eleme
143
172
  - **Wait for load**: use `page.waitForLoadState('domcontentloaded')` not `page.waitForEvent('load')` - waitForEvent times out if already loaded
144
173
  - **Avoid timeouts**: prefer proper waits over `page.waitForTimeout()` - there are better ways to wait for elements
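
One way to avoid a fixed sleep is to poll for the condition you actually care about. The helper below is a minimal sketch, not part of playwriter or Playwright: `pollUntil` and its `timeoutMs` and `intervalMs` options are hypothetical names chosen for illustration. Where Playwright already offers a matching wait (for example `page.waitForSelector`), that is usually the better choice.

```javascript
// Hypothetical helper (an assumption, not a playwriter or Playwright API):
// poll an async condition until it returns a truthy value or the timeout elapses.
async function pollUntil(check, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result; // condition met: return whatever it produced
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    // Short pause between checks instead of one long blind sleep
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}

// Usage sketch inside an execute call (assumes `page` is available):
// await pollUntil(async () => (await page.locator('article').count()) > 0);
```

The loop returns as soon as the condition holds, so the wait is only as long as the page needs, and a timeout produces an explicit error rather than a silent continue.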
145
174
 
175
+ ## interaction feedback loop
176
+
177
+ Every browser interaction should follow an **observe → act → observe** loop. After every action, check its result before proceeding. Never chain multiple actions blindly: the page may not have responded as expected.
178
+
179
+ **Core loop:**
180
+
181
+ 1. **Open page** — get or create your page and navigate to the target URL
182
+ 2. **Observe** — take an accessibility snapshot to understand the current state
183
+ 3. **Update priors** — read the snapshot, identify the element to interact with
184
+ 4. **Act** — perform one action (click, type, submit)
185
+ 5. **Observe again** — take another snapshot to verify the action's effect
186
+ 6. **Repeat** — continue from step 3 until the task is complete
187
+
188
+ ```
+ ┌─────────────────────────────────────────────┐
+ │            open page + goto URL             │
+ └──────────────────┬──────────────────────────┘
+                    ▼
+            ┌────────────────┐
+            │    observe     │◄─────────────────┐
+            │   (snapshot)   │                  │
+            └───────┬────────┘                  │
+                    ▼                           │
+            ┌────────────────┐                  │
+            │ update priors  │                  │
+            │ (read result)  │                  │
+            └───────┬────────┘                  │
+                    ▼                           │
+            ┌────────────────┐                  │
+            │      act       │                  │
+            │  (click/type)  │──────────────────┘
+            └────────────────┘
+ ```
208
+
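
The core loop can be sketched as a small driver with injected callbacks. This is an illustration under stated assumptions, not playwriter code: `runFeedbackLoop`, `observe`, and the `actions` entries are hypothetical names. In real use, `observe` would wrap `accessibilitySnapshot`, and each action would typically run in its own execute call.

```javascript
// Sketch of observe → act → observe. All names here are hypothetical stand-ins:
// `observe` returns a string describing page state, and each action performs
// exactly one interaction (one click, one key press, one typed string).
async function runFeedbackLoop({ observe, actions }) {
  const history = [];
  let before = await observe(); // initial observation
  for (const { name, run } of actions) {
    await run(); // act: one action only
    const after = await observe(); // observe again before doing anything else
    const changed = after !== before;
    history.push({ name, changed });
    if (!changed) break; // nothing happened: stop instead of chaining blindly
    before = after;
  }
  return history;
}
```

Stopping at the first action that leaves the observed state unchanged is a deliberate choice in this sketch: it forces a re-read of the page before any further action.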
209
+ **Example: opening a Framer plugin via the command palette**
210
+
211
+ Each step is a separate execute call. Notice how every action is followed by a snapshot to verify what happened:
212
+
213
+ ```js
214
+ // 1. Open page and observe
215
+ state.myPage = context.pages().find(p => p.url() === 'about:blank') ?? await context.newPage();
216
+ await state.myPage.goto('https://framer.com/projects/my-project', { waitUntil: 'domcontentloaded' });
217
+ await accessibilitySnapshot({ page: state.myPage }).then(console.log)
218
+ ```
219
+
220
+ ```js
221
+ // 2. Act: open command palette → observe result
222
+ await state.myPage.keyboard.press('Meta+k');
223
+ await accessibilitySnapshot({ page: state.myPage, search: /dialog|Search/ }).then(console.log)
224
+ ```
225
+
226
+ ```js
227
+ // 3. Act: type search query → observe result
228
+ await state.myPage.keyboard.type('MCP');
229
+ await accessibilitySnapshot({ page: state.myPage, search: /MCP/ }).then(console.log)
230
+ ```
231
+
232
+ ```js
233
+ // 4. Act: press Enter → observe plugin loaded
234
+ await state.myPage.keyboard.press('Enter');
235
+ await state.myPage.waitForTimeout(1000);
236
+ const frame = state.myPage.frames().find(f => f.url().includes('plugins.framercdn.com'));
237
+ await accessibilitySnapshot({ page: state.myPage, frame: frame || undefined }).then(console.log)
238
+ ```
239
+
240
+ **Other ways to observe action results:**
241
+
242
+ Snapshots are the primary feedback mechanism, but some actions have side effects that are better observed through other channels:
243
+
244
+ - **Console logs** — check for errors or app state after an action:
245
+ ```js
246
+ await getLatestLogs({ page, search: /error|fail/i, count: 20 })
247
+ ```
248
+ - **Network requests** — verify API calls were made after a form submit or button click:
249
+ ```js
250
+ // Register the listener before triggering the action, otherwise the response can be missed
+ page.on('response', res => { if (res.url().includes('/api/')) { console.log(res.status(), res.url()); } });
251
+ ```
252
+ - **URL changes** — confirm navigation happened:
253
+ ```js
254
+ console.log(page.url())
255
+ ```
256
+ - **Screenshots** — only when you need to verify visual layout (CSS, spatial positioning, colors). Snapshots are always preferred for content verification.
257
+
146
258
  ## common mistakes to avoid
147
259
 
148
260
  **1. Not verifying actions succeeded**
149
- Always screenshot and READ the image after important actions (form submissions, uploads, typing). Your mental model can diverge from actual browser state:
261
+ Always check page state after important actions (form submissions, uploads, typing). Your mental model can diverge from actual browser state:
150
262
  ```js
151
263
  await page.keyboard.type('my text');
152
- await page.screenshotWithAccessibilityLabels({ page });
153
- // Then READ the screenshot file to verify text appeared correctly
264
+ await accessibilitySnapshot({ page, search: /my text/ })
265
+ // If verifying visual layout specifically, use screenshotWithAccessibilityLabels instead
154
266
  ```
155
267
 
156
268
  **2. Assuming paste/upload worked**
@@ -195,7 +307,36 @@ await page.keyboard.press('Enter');
195
307
  await page.keyboard.type('Line 2');
196
308
  ```
197
309
 
198
- **6. Assuming page content loaded**
310
+ **6. Quote escaping in $'...' syntax**
311
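+ When using `$'...'` for multiline code, nested quotes break parsing because bash processes the backslash escapes first, so `\"` reaches JavaScript as a bare `"`. Use a different quote style inside the string, or a heredoc: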
312
+ ```bash
313
+ # BAD: nested double quotes break $'...'
314
+ playwriter -s 1 -e $'await page.locator("[id=\"_r_a_\"]").click()'
315
+
316
+ # GOOD: use single quotes inside, or template strings
317
+ playwriter -s 1 -e $'await page.locator(\'[id="_r_a_"]\').click()'
318
+
319
+ # GOOD: use heredoc for complex quoting
320
+ playwriter -s 1 -e "$(cat <<'EOF'
321
+ await page.locator('[id="_r_a_"]').click()
322
+ EOF
323
+ )"
324
+ ```
325
+
326
+ **7. Using screenshots when snapshots suffice**
327
+ Screenshots plus image analysis are expensive and slow. Only use screenshots for visual/CSS issues:
328
+ ```js
329
+ // BAD: screenshot to check if text appeared (wastes tokens on image analysis)
330
+ await page.screenshot({ path: 'check.png', scale: 'css' });
331
+
332
+ // GOOD: snapshot is text — fast, cheap, searchable
333
+ await accessibilitySnapshot({ page, search: /expected text/i })
334
+
335
+ // GOOD: evaluate DOM directly for content checks
336
+ const text = await page.evaluate(() => document.querySelector('.message')?.textContent);
337
+ ```
338
+
339
+ **8. Assuming page content loaded**
199
340
  Even after `goto()`, dynamic content may not be ready:
200
341
  ```js
201
342
  await page.goto('https://example.com');
@@ -205,7 +346,7 @@ await page.waitForSelector('article', { timeout: 10000 });
205
346
  await waitForPageLoad({ page, timeout: 5000 });
206
347
  ```
207
348
 
208
- **7. Login buttons that open popups**
349
+ **9. Login buttons that open popups**
209
350
  Playwriter extension cannot control popup windows. If a login button opens a popup (common with OAuth/SSO), use cmd+click to open in a new tab instead:
210
351
  ```js
211
352
  // BAD: popup window is not controllable by playwriter
@@ -230,13 +371,17 @@ await loginPage.waitForURL('**/callback**');
230
371
 
231
372
  ## checking page state
232
373
 
233
- After any action (click, submit, navigate), verify what happened:
374
+ After any action (click, submit, navigate), verify what happened. **Always prefer accessibility snapshots over screenshots** — snapshots are text (cheap, fast, searchable), screenshots require image analysis (expensive, slow).
234
375
 
235
376
  ```js
377
+ // Default: use snapshot with optional filtering
236
378
  page.url() + '\n' + await accessibilitySnapshot({ page })
379
+
380
+ // Filter for specific content when snapshot is large
381
+ await accessibilitySnapshot({ page, search: /dialog|button|error/i })
237
382
  ```
238
383
 
239
- For visually complex pages (grids, galleries, dashboards), use `screenshotWithAccessibilityLabels({ page })` instead to understand spatial layout. Label refs are short `eN` strings (e.g. `e3`).
384
+ Only use `screenshotWithAccessibilityLabels({ page })` for **visual layout issues** (CSS bugs, spatial positioning, colors). For verifying text content, button states, or form values, snapshots are always sufficient.
240
385
 
241
386
  If nothing changed, try `await waitForPageLoad({ page, timeout: 3000 })` or you may have clicked the wrong element.
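
To decide whether anything changed, one option is to compare the snapshot taken before the action with the one taken after it. The helper below is a rough sketch: `diffSnapshots` is a hypothetical name, and it assumes the snapshot is a newline-separated string. A built-in option such as `showDiffSinceLastCall` may already cover this case.

```javascript
// Hypothetical helper (an assumption, not a playwriter API): report which
// snapshot lines appeared or disappeared between two observations.
function diffSnapshots(before, after) {
  const beforeLines = new Set(before.split('\n'));
  const afterLines = new Set(after.split('\n'));
  return {
    // Lines present only in the newer snapshot
    added: [...afterLines].filter(line => !beforeLines.has(line)),
    // Lines present only in the older snapshot
    removed: [...beforeLines].filter(line => !afterLines.has(line)),
  };
}

// Usage sketch inside execute calls (assumes `page` and `state` are available):
// state.before = await accessibilitySnapshot({ page });
// ...perform one action...
// console.log(diffSnapshots(state.before, await accessibilitySnapshot({ page })));
```

If both `added` and `removed` come back empty, the action most likely had no visible effect, which points to a wrong target element or a page that is still loading.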
242
387
 
@@ -285,6 +430,18 @@ Search for specific elements:
285
430
  const snapshot = await accessibilitySnapshot({ page, search: /button|submit/i })
286
431
  ```
287
432
 
433
+ **Filtering large snapshots in JS** — when the built-in `search` isn't enough (e.g., you need multiple patterns or custom logic), filter the snapshot string directly:
434
+
435
+ ```js
436
+ const snap = await accessibilitySnapshot({ page, showDiffSinceLastCall: false });
437
+ const relevant = snap.split('\n').filter(l =>
438
+ l.includes('dialog') || l.includes('error') || l.includes('button')
439
+ ).join('\n');
440
+ console.log(relevant);
441
+ ```
442
+
443
+ This is much cheaper than taking a screenshot — use it as your primary debugging tool for verifying text content, checking if elements exist, or confirming state changes.
444
+
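
When several patterns are needed, the same filtering idea can be wrapped in a small reusable function. This is a sketch only: `filterSnapshot` is a hypothetical helper, and it assumes the snapshot is a newline-separated string, as in the snippet above.

```javascript
// Hypothetical helper (an assumption, not a playwriter API): keep only the
// snapshot lines that match at least one of the given regular expressions.
// Avoid the `g` flag on the patterns: RegExp.prototype.test is stateful with it.
function filterSnapshot(snapshot, patterns) {
  return snapshot
    .split('\n')
    .filter(line => patterns.some(pattern => pattern.test(line)))
    .join('\n');
}

// Usage sketch inside an execute call (assumes `page` is available):
// const snap = await accessibilitySnapshot({ page, showDiffSinceLastCall: false });
// console.log(filterSnapshot(snap, [/dialog/, /error/i, /button/]));
```

Regular expressions allow case-insensitive or anchored matches that `includes` cannot express.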
288
445
  ## choosing between snapshot methods
289
446
 
290
447
  Both `accessibilitySnapshot` and `screenshotWithAccessibilityLabels` use the same ref system, so you can combine them effectively.
@@ -333,14 +490,17 @@ await page.locator('li').nth(3).click() // 4th item (0-indexed)
333
490
 
334
491
  ## working with pages
335
492
 
336
- **Pages are shared, state is not.** `context.pages()` returns all browser tabs with playwriter enabled — shared across all sessions. Multiple agents see the same tabs. If another agent navigates or closes a page you're using, you'll be affected. To avoid interference, **always create your own page**.
493
+ **Pages are shared, state is not.** `context.pages()` returns all browser tabs with playwriter enabled — shared across all sessions. Multiple agents see the same tabs. If another agent navigates or closes a page you're using, you'll be affected. To avoid interference, **get your own page**.
337
494
 
338
- **Always create your own page (first call):**
495
+ **Get or create your page (first call):**
339
496
 
340
- On your very first execute call, create a dedicated page and store it in `state`. Use `state.myPage` for all subsequent operations never the default `page` variable:
497
+ On your very first execute call, reuse an existing empty tab or create a new one, and navigate it **in the same execute call**. Store it in `state` and use `state.myPage` for all subsequent operations instead of the default `page` variable:
341
498
 
342
499
  ```js
343
- state.myPage = await context.newPage();
500
+ // Reuse an empty about:blank tab if available, otherwise create a new one.
501
+ // IMPORTANT: always navigate immediately in the same call to avoid another
502
+ // agent grabbing the same about:blank tab between execute calls.
503
+ state.myPage = context.pages().find(p => p.url() === 'about:blank') ?? await context.newPage();
344
504
  await state.myPage.goto('https://example.com');
345
505
  // Use state.myPage for ALL subsequent operations
346
506
  ```
@@ -351,7 +511,7 @@ The user may close your page by accident (e.g., closing a tab in Chrome). Always
351
511
 
352
512
  ```js
353
513
  if (!state.myPage || state.myPage.isClosed()) {
354
- state.myPage = await context.newPage();
514
+ state.myPage = context.pages().find(p => p.url() === 'about:blank') ?? await context.newPage();
355
515
  }
356
516
  await state.myPage.goto('https://example.com');
357
517
  ```
@@ -714,6 +874,42 @@ console.log(data);
714
874
 
715
875
  Clean up listeners when done: `page.removeAllListeners('request'); page.removeAllListeners('response');`
716
876
 
877
+ ## debugging web apps
878
+
879
+ When debugging why a web app isn't working (e.g., content not rendering, API errors, state issues), use these techniques **before** resorting to screenshots:
880
+
881
+ **1. Console logs** — use `getLatestLogs` to check for errors:
882
+
883
+ ```js
884
+ const errors = await getLatestLogs({ page, search: /error|fail/i, count: 20 });
885
+ const appLogs = await getLatestLogs({ page, search: /myComponent|state/i });
886
+ ```
887
+
888
+ **2. DOM inspection via evaluate** — check content directly without screenshots:
889
+
890
+ ```js
891
+ const info = await page.evaluate(() => {
892
+ const msgs = document.querySelectorAll('.message');
893
+ return Array.from(msgs).map(m => ({
894
+ text: m.textContent?.slice(0, 200),
895
+ visible: m.offsetHeight > 0,
896
+ }));
897
+ });
898
+ console.log(JSON.stringify(info, null, 2));
899
+ ```
900
+
901
+ **3. Combine snapshot + logs for full picture:**
902
+
903
+ ```js
904
+ await page.keyboard.press('Enter');
905
+ await page.waitForTimeout(2000);
906
+
907
+ const snap = await accessibilitySnapshot({ page, search: /dialog|error|message/ });
908
+ const logs = await getLatestLogs({ page, search: /error/i, count: 10 });
909
+ console.log('UI:', snap);
910
+ console.log('Logs:', logs);
911
+ ```
912
+
717
913
  ## capabilities
718
914
 
719
915
  Examples of what playwriter can do: