playwriter 0.0.2 → 0.0.4

Files changed (46)
  1. package/bin.js +1 -1
  2. package/dist/browser-config.js +1 -3
  3. package/dist/browser-config.js.map +1 -1
  4. package/dist/cdp-types.d.ts +25 -0
  5. package/dist/cdp-types.d.ts.map +1 -0
  6. package/dist/cdp-types.js +91 -0
  7. package/dist/cdp-types.js.map +1 -0
  8. package/dist/extension/cdp-relay.d.ts +12 -0
  9. package/dist/extension/cdp-relay.d.ts.map +1 -0
  10. package/dist/extension/cdp-relay.js +378 -0
  11. package/dist/extension/cdp-relay.js.map +1 -0
  12. package/dist/extension/protocol.d.ts +29 -0
  13. package/dist/extension/protocol.d.ts.map +1 -0
  14. package/dist/extension/protocol.js +2 -0
  15. package/dist/extension/protocol.js.map +1 -0
  16. package/dist/index.d.ts +2 -0
  17. package/dist/index.d.ts.map +1 -0
  18. package/dist/index.js +2 -0
  19. package/dist/index.js.map +1 -0
  20. package/dist/mcp-client.d.ts.map +1 -1
  21. package/dist/mcp-client.js +1 -1
  22. package/dist/mcp-client.js.map +1 -1
  23. package/dist/mcp.js +74 -464
  24. package/dist/mcp.js.map +1 -1
  25. package/dist/mcp.test.js +101 -142
  26. package/dist/mcp.test.js.map +1 -1
  27. package/dist/prompt.md +41 -487
  28. package/dist/resource.md +436 -0
  29. package/dist/start-relay-server.d.ts +8 -0
  30. package/dist/start-relay-server.d.ts.map +1 -0
  31. package/dist/start-relay-server.js +33 -0
  32. package/dist/start-relay-server.js.map +1 -0
  33. package/package.json +42 -36
  34. package/src/browser-config.ts +48 -50
  35. package/src/cdp-types.ts +124 -0
  36. package/src/extension/cdp-relay.ts +480 -0
  37. package/src/extension/protocol.ts +34 -0
  38. package/src/index.ts +1 -0
  39. package/src/mcp-client.ts +46 -46
  40. package/src/mcp.test.ts +109 -165
  41. package/src/mcp.ts +202 -694
  42. package/src/prompt.md +41 -487
  43. package/src/resource.md +436 -0
  44. package/src/snapshots/hacker-news-initial-accessibility.md +243 -127
  45. package/src/snapshots/shadcn-ui-accessibility.md +300 -510
  46. package/src/start-relay-server.ts +43 -0
package/src/prompt.md CHANGED
@@ -1,66 +1,26 @@
1
- Executes code in the server to control playwright.
1
+ execute tool let you run playwright code to control user Chrome window
2
2
 
3
- You have access to a `page` object where you can call playwright methods on it to accomplish actions on the page.
3
+ it will control an existing user Chrome window. The execute command will be executed in a sandbox with some variables in context:
4
4
 
5
- You can also use `console.log` to examine the results of your actions.
5
+ - context: the playwright browser context. you can do things like `await context.pages()`
6
+ - page, the first page the user opened and made it accessible to this MCP. do things like `page.url()` to see current url. assume the user wants you to use this page for your playwright code
6
7
 
7
- You only have access to `page`, `context` and node.js globals. Do not try to import anything or setup handlers.
8
+ the window can have more than one page. you can see other pages with `context.pages().find((p) => p.url().includes('localhost'))`
8
9
 
9
- Your code should be stateless and do not depend on any state.
10
+ you can control the browser in collaboration with the user. for example the user can help you get unstuck for things like captchas or difficult to find elements or reproducing a bug
10
11
 
11
- If you really want to attach listeners you should also detach them using a try finally block, to prevent memory leaks.
12
+ ## rules
12
13
 
13
- You can also create a new page via `context.newPage()` if you need to start fresh. You can then find that page by iteration over `context.pages()`:
14
+ - only call `page.close()` if the user asks you so or if you previously created this page yourself with `newPage`. do not close user created pages unless asked
15
+ -
14
16
 
15
- ```javascript
16
- const page = context.pages().find((p) => p.url().includes('/some/path'))
17
- ```
18
-
19
- ## important rules
20
-
21
- - NEVER call `page.waitForTimeout`, instead use `page.waitForSelector` or use a while loop that waits for a condition to be true.
22
- - when a timeout error happen for example during navigation don't worry too much. try to get the snapshot of the page to see the current state, then continue without retrying if the state is what you expect. If the state is not what you expect, then you can retry the action.
23
- - only call `page.close()` if the user asks you so or if you are in a test feedback loop and you know the user is not dependently interacting with the page (for example for debugging).
24
- - always call `new_page` at the start of a conversation. later this page will be passed to the `execute` tool.
25
- - In some rare cases you can also skip `new_page` tool, if the user asks you to instead use an existing page in the browser. You can set a page as default using `state.page = page`, `execute` calls will be passed this page in the scope later on.
26
- - if running in localhost and some elements are difficult to target with locators you can update the source code to add `data-testid` attributes to elements you want to target. This will make running tests much easier later on. Also update the source markdown documents your are following if you do so.
27
- - after every action call the tool `accessibility_snapshot` to get the page structure and understand what elements are available on the page
28
- - after form submissions use `page.waitForLoadState('networkidle')` to ensure the page is fully loaded before proceeding
29
- - sometimes when in localhost and using Vite you can encounter issues in the first page load, where a module is not found, because of updated optimization of the node_modules. In these cases you can try reloading the page 2 times and see if the issue resolves itself.
30
- - for Google and GitHub login always use the Google account you have access to, already signed in
31
- - if you are following a markdown document describing the steps to follow to test the website, update this document if you encounter unexpected behavior or if you can add information that would make the test faster, for example telling how to wait for actions that trigger loading states or to use a different timeout for specific actions.
32
-
33
- ## getting outputs of code execution
34
-
35
- You can use `console.log` to print values you want to see in the tool call result
36
-
37
- ## using page.evaluate
38
-
39
- you can execute client side JavaScript code using `page.evaluate()`
40
-
41
- When executing code with `page.evaluate()`, return values directly from the evaluate function. Use `console.log()` outside of evaluate to display results:
42
-
43
- ```javascript
44
- // Get data from the page by returning it
45
- const title = await page.evaluate(() => document.title)
46
- console.log('Page title:', title)
47
-
48
- // Return multiple values as an object
49
- const pageInfo = await page.evaluate(() => ({
50
- url: window.location.href,
51
- buttonCount: document.querySelectorAll('button').length,
52
- readyState: document.readyState,
53
- }))
54
- console.log('Page URL:', pageInfo.url)
55
- console.log('Number of buttons:', pageInfo.buttonCount)
56
- console.log('Page ready state:', pageInfo.readyState)
57
- ```
17
+ ## utility functions
58
18
 
59
- ## Finding Elements on the Page
19
+ you have access to some functions in addition to playwright methods:
60
20
 
61
- you can use the tool accessibility_snapshot to get the page accessibility snapshot tree, which provides a structured view of the page's elements, including their roles and names. This is useful for understanding the page structure and finding elements to interact with.
21
+ - `async accessibilitySnapshot(page)`: gets a human readable snapshot of clickable elements on the page. useful to see the overall structure of the page and what elements you can interact with
62
22
 
63
- Example accessibility snapshot result:
23
+ example:
64
24
 
65
25
  ```md
66
26
  - generic [active] [ref=e1]:
@@ -86,454 +46,48 @@ Example accessibility snapshot result:
86
46
  - /url: /colors
87
47
  ```
88
48
 
89
- Then you can use `page.locator(`aria-ref=${ref}`).describe(element);` to get an element with a specific `ref` and interact with it.
90
-
91
- For example:
92
-
93
- ```javascript
94
- const componentsLink = page
95
- // Exact target element reference from the page snapshot
96
- .locator('aria-ref=e14')
97
- // Human-readable element description used to obtain permission to interact with the element
98
- .describe('Components link')
99
-
100
- componentsLink.click()
101
- console.log('Clicked on Components link')
102
- ```
103
-
104
- This approach is the preferred way to find elements on the page, as it allows you to use the structured information from the accessibility snapshot to interact with elements reliably.
105
-
106
- You can also find `getByRole` to get elements on the page.
107
-
108
- ```javascript
109
- // Then use the information from the snapshot to click elements
110
- // For example, if snapshot shows: { "role": "button", "name": "Sign In" }
111
- await page.getByRole('button', { name: 'Sign In' }).click()
112
-
113
- // For a link with { "role": "link", "name": "About" }
114
- await page.getByRole('link', { name: 'About' }).click()
115
-
116
- // For a textbox with { "role": "textbox", "name": "Email" }
117
- await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com')
118
-
119
- // For a heading with { "role": "heading", "name": "Welcome to Example.com" }
120
- const headingText = await page
121
- .getByRole('heading', { name: 'Welcome to Example.com' })
122
- .textContent()
123
- console.log('Heading text:', headingText)
124
- ```
125
-
126
- ### Complete Example: Find and Click Elements
127
-
128
- ```javascript
129
- await page.getByRole('button', { name: 'Submit Form' }).click()
130
- console.log('Clicked submit button')
131
-
132
- await page.waitForLoadState('networkidle')
133
- console.log('Form submitted successfully')
134
- ```
135
-
136
- ## Core Concepts
137
-
138
- ### Page and Context
139
-
140
- In Playwright, automation happens through a `page` object (representing a browser tab) and `context` (representing a browser session with cookies, storage, etc.).
141
-
142
- ```javascript
143
- // Assuming you have page and context already available
144
- const page = await context.newPage()
145
- ```
146
-
147
- ### Element Selection
148
-
149
- Playwright uses locators to find elements. The examples below show various selection methods:
150
-
151
- ```javascript
152
- // By role (recommended)
153
- await page.getByRole('button', { name: 'Submit' })
154
-
155
- // By text
156
- await page.getByText('Welcome')
157
-
158
- // By placeholder
159
- await page.getByPlaceholder('Enter email')
160
-
161
- // By label
162
- await page.getByLabel('Username')
163
-
164
- // By test id
165
- await page.getByTestId('submit-button')
166
-
167
- // By CSS selector
168
- await page.locator('.my-class')
169
-
170
- // By XPath
171
- await page.locator('//div[@class="content"]')
172
- ```
173
-
174
- ## Navigation
175
-
176
- ### Navigate to URL
177
-
178
- ```javascript
179
- await page.goto('https://example.com')
180
- // Wait for network idle (no requests for 500ms)
181
- await page.goto('https://example.com', { waitUntil: 'networkidle' })
182
- ```
183
-
184
- ### Navigate Back/Forward
185
-
186
- ```javascript
187
- // Go back to previous page
188
- await page.goBack()
189
-
190
- // Go forward to next page
191
- await page.goForward()
192
- ```
193
-
194
- ## Screenshots
195
-
196
- ### Take Screenshot
197
-
198
- ```javascript
199
- // Screenshot of viewport
200
- await page.screenshot({ path: 'screenshot.png' })
201
-
202
- // Full page screenshot
203
- await page.screenshot({ path: 'fullpage.png', fullPage: true })
204
-
205
- // Screenshot of specific element
206
- const element = await page.getByRole('button', { name: 'Submit' })
207
- await element.screenshot({ path: 'button.png' })
208
-
209
- // Screenshot with custom dimensions
210
- await page.setViewportSize({ width: 1280, height: 720 })
211
- await page.screenshot({ path: 'custom-size.png' })
212
- ```
213
-
214
- ## Mouse Interactions
215
-
216
- ### Click Elements
217
-
218
- ```javascript
219
- // Click by role
220
- await page.getByRole('button', { name: 'Submit' }).click()
221
-
222
- // Click at coordinates
223
- await page.mouse.click(100, 200)
224
-
225
- // Double click
226
- await page.getByText('Double click me').dblclick()
227
-
228
- // Right click
229
- await page.getByText('Right click me').click({ button: 'right' })
230
-
231
- // Click with modifiers
232
- await page.getByText('Ctrl click me').click({ modifiers: ['Control'] })
233
- ```
234
-
235
- ### Hover
236
-
237
- ```javascript
238
- // Hover over element
239
- await page.getByText('Hover me').hover()
240
-
241
- // Hover at coordinates
242
- await page.mouse.move(100, 200)
243
- ```
244
-
245
- ## Keyboard Input
246
-
247
- ### Type Text
49
+ Then you can use `page.locator(`aria-ref=${ref}`)` to get an element with a specific `ref` and interact with it.
248
50
 
249
- ```javascript
250
- // Type into input field
251
- await page.getByLabel('Email').fill('user@example.com')
51
+ `const componentsLink = page.locator('aria-ref=e14').click()`
252
52
 
253
- // Type character by character (simulates real typing)
254
- await page.getByLabel('Email').type('user@example.com', { delay: 100 })
255
-
256
- // Clear and type
257
- await page.getByLabel('Email').clear()
258
- await page.getByLabel('Email').fill('new@example.com')
259
- ```
260
-
261
- ### Press Keys
262
-
263
- ```javascript
264
- // Press single key
265
- await page.keyboard.press('Enter')
266
-
267
- // Press key combination
268
- await page.keyboard.press('Control+A')
269
-
270
- // Press sequence of keys
271
- await page.keyboard.press('Tab')
272
- await page.keyboard.press('Tab')
273
- await page.keyboard.press('Space')
274
-
275
- // Common key shortcuts
276
- await page.keyboard.press('Control+C') // Copy
277
- await page.keyboard.press('Control+V') // Paste
278
- await page.keyboard.press('Control+Z') // Undo
279
- ```
280
-
281
- ## Form Interactions
282
-
283
- ### Select Dropdown Options
284
-
285
- ```javascript
286
- // Select by value
287
- await page.selectOption('select#country', 'us')
288
-
289
- // Select by label
290
- await page.selectOption('select#country', { label: 'United States' })
291
-
292
- // Select multiple options
293
- await page.selectOption('select#colors', ['red', 'blue', 'green'])
294
-
295
- // Get selected option
296
- const selectedValue = await page.$eval('select#country', (el) => el.value)
297
- ```
298
-
299
- ### Checkboxes and Radio Buttons
300
-
301
- ```javascript
302
- // Check checkbox
303
- await page.getByLabel('I agree').check()
304
-
305
- // Uncheck checkbox
306
- await page.getByLabel('Subscribe').uncheck()
307
-
308
- // Check if checked
309
- const isChecked = await page.getByLabel('I agree').isChecked()
310
-
311
- // Select radio button
312
- await page.getByLabel('Option A').check()
313
- ```
314
-
315
- ## JavaScript Evaluation
316
-
317
- ### Execute JavaScript in Page Context
318
-
319
- ```javascript
320
- // Evaluate simple expression
321
- const result = await page.evaluate(() => 2 + 2)
322
-
323
- // Access page variables
324
- const pageTitle = await page.evaluate(() => document.title)
325
-
326
- // Modify page
327
- await page.evaluate(() => {
328
- document.body.style.backgroundColor = 'red'
329
- })
330
-
331
- // Pass arguments to page context
332
- const sum = await page.evaluate(([a, b]) => a + b, [5, 3])
333
-
334
- // Work with elements
335
- const elementText = await page.evaluate(
336
- (el) => el.textContent,
337
- await page.getByRole('heading'),
338
- )
339
- ```
340
-
341
- ### Execute JavaScript on Element
342
-
343
- ```javascript
344
- // Get element property
345
- const href = await page.getByRole('link').evaluate((el) => el.href)
346
-
347
- // Modify element
348
- await page.getByRole('button').evaluate((el) => {
349
- el.style.backgroundColor = 'green'
350
- el.disabled = true
351
- })
352
-
353
- // Scroll element into view
354
- await page.getByText('Section').evaluate((el) => el.scrollIntoView())
355
- ```
356
-
357
- ## File Handling
358
-
359
- ### File Upload
360
-
361
- ```javascript
362
- // Upload single file
363
- await page.getByLabel('Upload file').setInputFiles('/path/to/file.pdf')
364
-
365
- // Upload multiple files
366
- await page
367
- .getByLabel('Upload files')
368
- .setInputFiles(['/path/to/file1.pdf', '/path/to/file2.pdf'])
369
-
370
- // Clear file input
371
- await page.getByLabel('Upload file').setInputFiles([])
372
-
373
- // For file inputs, use setInputFiles directly on the input element
374
- // Find the file input element (often hidden)
375
- await page.locator('input[type="file"]').setInputFiles('/path/to/file.pdf')
376
- ```
377
-
378
- ## Network Monitoring
379
-
380
- ### Check Network Activity
381
-
382
- ```javascript
383
- // Wait for a specific request to complete and get its response
384
- const response = await page.waitForResponse(
385
- (response) =>
386
- response.url().includes('/api/user') && response.status() === 200,
387
- )
388
-
389
- // Get response data
390
- const responseBody = await response.json()
391
- console.log('API response:', responseBody)
392
-
393
- // Wait for specific request
394
- const request = await page.waitForRequest('**/api/data')
395
- console.log('Request URL:', request.url())
396
- console.log('Request method:', request.method())
397
-
398
- // Get all resources loaded by the page
399
- const resources = await page.evaluate(() =>
400
- performance.getEntriesByType('resource').map((r) => ({
401
- name: r.name,
402
- duration: r.duration,
403
- size: r.transferSize,
404
- })),
405
- )
406
- console.log('Page resources:', resources)
407
- ```
408
-
409
- ## Console Messages
410
-
411
- ### Capture Console Output
412
-
413
- ```javascript
414
- // Console messages are automatically captured by the MCP implementation
415
- // Use the console_logs tool to retrieve them
416
-
417
- // To trigger console messages from the page:
418
- await page.evaluate(() => {
419
- console.log('This message will be captured')
420
- console.error('This error will be captured')
421
- console.warn('This warning will be captured')
422
- })
423
-
424
- // Then use the console_logs MCP tool to retrieve all captured messages
425
- // The tool provides filtering by type and pagination
426
- ```
427
-
428
- ## Waiting
429
-
430
- ### Wait for Conditions
431
-
432
- ```javascript
433
- // Wait for element to appear
434
- await page.waitForSelector('.success-message')
435
-
436
- // Wait for element to disappear
437
- await page.waitForSelector('.loading', { state: 'hidden' })
438
-
439
- await page.waitForURL(/github\.com.*\/pull/)
440
- await page.waitForURL(/\/new-org/)
441
-
442
- // Wait for text to appear
443
- await page.waitForFunction(
444
- (text) => document.body.textContent.includes(text),
445
- 'Success!',
446
- )
447
-
448
- // Wait for navigation
449
- await page.waitForURL('**/success')
450
-
451
- // Wait for page load
452
- await page.waitForLoadState('networkidle')
453
-
454
- // Wait for specific condition
455
- await page.waitForFunction(
456
- (text) => document.querySelector('.status')?.textContent === text,
457
- 'Ready',
458
- )
459
- ```
460
-
461
- ### Wait for Text to Appear or Disappear
53
+ ## getting outputs of code execution
462
54
 
463
- ```javascript
464
- // Wait for specific text to appear on the page
465
- await page.getByText('Loading complete').first().waitFor({ state: 'visible' })
466
- console.log('Loading complete text is now visible')
55
+ You can use `console.log` to print values you want to see in the tool call result
467
56
 
468
- // Wait for text to disappear from the page
469
- await page.getByText('Loading...').first().waitFor({ state: 'hidden' })
470
- console.log('Loading text has disappeared')
57
+ ## using page.evaluate
471
58
 
472
- // Wait for multiple conditions sequentially
473
- // First wait for loading to disappear, then wait for success message
474
- await page.getByText('Processing...').first().waitFor({ state: 'hidden' })
475
- await page.getByText('Success!').first().waitFor({ state: 'visible' })
476
- console.log('Processing finished and success message appeared')
59
+ you can execute client side JavaScript code using `page.evaluate()`
477
60
 
478
- // Example: Wait for error message to disappear before proceeding
479
- await page
480
- .getByText('Error: Please try again')
481
- .first()
482
- .waitFor({ state: 'hidden' })
483
- await page.getByRole('button', { name: 'Submit' }).click()
61
+ When executing code with `page.evaluate()`, return values directly from the evaluate function. Use `console.log()` outside of evaluate to display results:
484
62
 
485
- // Example: Wait for confirmation text after form submission
486
- await page.getByRole('button', { name: 'Save' }).click()
487
- await page
488
- .getByText('Your changes have been saved')
489
- .first()
490
- .waitFor({ state: 'visible' })
491
- console.log('Save confirmed')
63
+ ```js
64
+ // Get data from the page by returning it
65
+ const title = await page.evaluate(() => document.title)
66
+ console.log('Page title:', title)
492
67
 
493
- // Example: Wait for dynamic content to load
494
- await page.getByRole('button', { name: 'Load More' }).click()
495
- await page
496
- .getByText('Loading more items...')
497
- .first()
498
- .waitFor({ state: 'visible' })
499
- await page
500
- .getByText('Loading more items...')
501
- .first()
502
- .waitFor({ state: 'hidden' })
503
- console.log('Additional items loaded')
68
+ // Return multiple values as an object
69
+ const pageInfo = await page.evaluate(() => ({
70
+ url: window.location.href,
71
+ buttonCount: document.querySelectorAll('button').length,
72
+ readyState: document.readyState,
73
+ }))
74
+ console.log('Page URL:', pageInfo.url)
75
+ console.log('Number of buttons:', pageInfo.buttonCount)
76
+ console.log('Page ready state:', pageInfo.readyState)
504
77
  ```
505
78
 
506
- ### Work with Frames
507
-
508
- ```javascript
509
- // Get frame by name
510
- const frame = page.frame('frameName')
79
+ ## read for logs during interactions
511
80
 
512
- // Get frame by URL
513
- const frame = page.frame({ url: /frame\.html/ })
81
+ you can see logs during interactions with `page.on('console', msg => console.log(`Browser log: [${msg.type()}] ${msg.text()}`))`
514
82
 
515
- // Interact with frame content
516
- await frame.getByText('In Frame').click()
83
+ then remember to call `context.removeAllListeners()` or `page.removeAllListeners('console')` to not see logs in next execute calls.
517
84
 
518
- // Get all frames
519
- const frames = page.frames()
520
- ```
85
+ ## reading past logs
521
86
 
522
- ## Best Practices
87
+ you can keep track of logs using `globalThis.logs = []; page.on('console', msg => globalThis.logs.push({ type: msg.type(), text: msg.text() }))`
523
88
 
524
- ### Reliable Selectors
89
+ later, you can read logs that you care about. For example, to get the last 100 logs that contain the word "error":
525
90
 
526
- ```javascript
527
- // Prefer user-facing attributes
528
- await page.getByRole('button', { name: 'Submit' })
529
- await page.getByLabel('Email')
530
- await page.getByPlaceholder('Search...')
531
- await page.getByText('Welcome')
91
+ `console.log('errors:'); globalThis.logs.filter(log => log.type === 'error').slice(-100).forEach(x => console.log(x))`
532
92
 
533
- // Use test IDs for complex cases
534
- await page.getByTestId('complex-component')
535
-
536
- // Avoid brittle selectors
537
- // Bad: await page.locator('.btn-3842');
538
- // Good: await page.getByRole('button', { name: 'Submit' });
539
- ```
93
+ then to reset logs: `globalThis.logs = []` and to stop listening: `page.removeAllListeners('console')`
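
Note on the new "reading past logs" section: the prose asks for logs that contain the word "error", while the one-liner filters on `log.type === 'error'`. These can return different results. The sketch below shows both filters side by side. It is a minimal illustration, not the package's implementation: `createFakePage` and `emitConsole` are hypothetical stand-ins so the snippet runs without a browser, and only `on('console', ...)`, `removeAllListeners('console')`, `msg.type()` and `msg.text()` mirror calls that appear in the prompt.

```javascript
// Hypothetical stand-in for a Playwright `page`, so the sketch runs without a browser.
// Only the two methods used in the prompt are modeled: on() and removeAllListeners().
function createFakePage() {
  const listeners = new Map()
  return {
    on(event, handler) {
      if (!listeners.has(event)) listeners.set(event, [])
      listeners.get(event).push(handler)
      return this
    },
    removeAllListeners(event) {
      if (event === undefined) listeners.clear()
      else listeners.delete(event)
      return this
    },
    // Helper for this sketch only (not a Playwright API):
    // simulates the browser emitting a console message.
    emitConsole(type, text) {
      const msg = { type: () => type, text: () => text }
      for (const handler of listeners.get('console') ?? []) handler(msg)
    },
  }
}

const page = createFakePage()

// Start collecting logs, as in the "reading past logs" section.
globalThis.logs = []
page.on('console', (msg) => {
  globalThis.logs.push({ type: msg.type(), text: msg.text() })
})

// Simulated browser output.
page.emitConsole('log', 'page loaded')
page.emitConsole('error', 'TypeError: x is undefined')
page.emitConsole('warning', 'retrying after network error')

// Filter by message type (what the one-liner in the prompt does).
const byType = globalThis.logs.filter((log) => log.type === 'error').slice(-100)

// Filter by text content (what the prose in the prompt describes).
const byText = globalThis.logs
  .filter((log) => log.text.toLowerCase().includes('error'))
  .slice(-100)

console.log('errors by type:', byType)
console.log('logs mentioning "error":', byText)

// Stop listening and reset, so later execute calls start clean.
page.removeAllListeners('console')
globalThis.logs = []
page.emitConsole('error', 'ignored after detach')
console.log('logs after reset:', globalThis.logs.length)
```

In a real `execute` call the same filter lines should apply unchanged to the `page` provided in the sandbox, assuming `globalThis` persists between calls as the prompt implies.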