npm - playwriter - Versions diffs - 0.0.2 → 0.0.4 - Mend

playwriter 0.0.2 → 0.0.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (46)

package/bin.js +1 -1
package/dist/browser-config.js +1 -3
package/dist/browser-config.js.map +1 -1
package/dist/cdp-types.d.ts +25 -0
package/dist/cdp-types.d.ts.map +1 -0
package/dist/cdp-types.js +91 -0
package/dist/cdp-types.js.map +1 -0
package/dist/extension/cdp-relay.d.ts +12 -0
package/dist/extension/cdp-relay.d.ts.map +1 -0
package/dist/extension/cdp-relay.js +378 -0
package/dist/extension/cdp-relay.js.map +1 -0
package/dist/extension/protocol.d.ts +29 -0
package/dist/extension/protocol.d.ts.map +1 -0
package/dist/extension/protocol.js +2 -0
package/dist/extension/protocol.js.map +1 -0
package/dist/index.d.ts +2 -0
package/dist/index.d.ts.map +1 -0
package/dist/index.js +2 -0
package/dist/index.js.map +1 -0
package/dist/mcp-client.d.ts.map +1 -1
package/dist/mcp-client.js +1 -1
package/dist/mcp-client.js.map +1 -1
package/dist/mcp.js +74 -464
package/dist/mcp.js.map +1 -1
package/dist/mcp.test.js +101 -142
package/dist/mcp.test.js.map +1 -1
package/dist/prompt.md +41 -487
package/dist/resource.md +436 -0
package/dist/start-relay-server.d.ts +8 -0
package/dist/start-relay-server.d.ts.map +1 -0
package/dist/start-relay-server.js +33 -0
package/dist/start-relay-server.js.map +1 -0
package/package.json +42 -36
package/src/browser-config.ts +48 -50
package/src/cdp-types.ts +124 -0
package/src/extension/cdp-relay.ts +480 -0
package/src/extension/protocol.ts +34 -0
package/src/index.ts +1 -0
package/src/mcp-client.ts +46 -46
package/src/mcp.test.ts +109 -165
package/src/mcp.ts +202 -694
package/src/prompt.md +41 -487
package/src/resource.md +436 -0
package/src/snapshots/hacker-news-initial-accessibility.md +243 -127
package/src/snapshots/shadcn-ui-accessibility.md +300 -510
package/src/start-relay-server.ts +43 -0

package/src/prompt.md CHANGED

@@ -1,66 +1,26 @@
-Executes code in the server to control playwright.
+execute tool let you run playwright code to control user Chrome window
-You have access to a `page` object where you can call playwright methods on it to accomplish actions on the page.
+it will control an existing user Chrome window. The execute command will be executed in a sandbox with some variables in context:
-You can also use `console.log` to examine the results of your actions.
+- context: the playwright browser context. you can do things like `await context.pages()`
+- page, the first page the user opened and made it accessible to this MCP. do things like `page.url()` to see current url. assume the user wants you to use this page for your playwright code
-You only have access to `page`, `context` and node.js globals. Do not try to import anything or setup handlers.
+the window can have more than one page. you can see other pages with `context.pages().find((p) => p.url().includes('localhost'))`
-Your code should be stateless and do not depend on any state.
+you can control the browser in collaboration with the user. for example the user can help you get unstuck for things like captchas or difficult to find elements or reproducing a bug
-If you really want to attach listeners you should also detach them using a try finally block, to prevent memory leaks.
+## rules
-You can also create a new page via `context.newPage()` if you need to start fresh. You can then find that page by iteration over `context.pages()`:
+- only call `page.close()` if the user asks you so or if you previously created this page yourself with `newPage`. do not close user created pages unless asked
+-
-```javascript
-const page = context.pages().find((p) => p.url().includes('/some/path'))
-```
-## important rules
-- NEVER call `page.waitForTimeout`, instead use `page.waitForSelector` or use a while loop that waits for a condition to be true.
-- when a timeout error happen for example during navigation don't worry too much. try to get the snapshot of the page to see the current state, then continue without retrying if the state is what you expect. If the state is not what you expect, then you can retry the action.
-- only call `page.close()` if the user asks you so or if you are in a test feedback loop and you know the user is not dependently interacting with the page (for example for debugging).
-- always call `new_page` at the start of a conversation. later this page will be passed to the `execute` tool.
-- In some rare cases you can also skip `new_page` tool, if the user asks you to instead use an existing page in the browser. You can set a page as default using `state.page = page`, `execute` calls will be passed this page in the scope later on.
-- if running in localhost and some elements are difficult to target with locators you can update the source code to add `data-testid` attributes to elements you want to target. This will make running tests much easier later on. Also update the source markdown documents your are following if you do so.
-- after every action call the tool `accessibility_snapshot` to get the page structure and understand what elements are available on the page
-- after form submissions use `page.waitForLoadState('networkidle')` to ensure the page is fully loaded before proceeding
-- sometimes when in localhost and using Vite you can encounter issues in the first page load, where a module is not found, because of updated optimization of the node_modules. In these cases you can try reloading the page 2 times and see if the issue resolves itself.
-- for Google and GitHub login always use the Google account you have access to, already signed in
-- if you are following a markdown document describing the steps to follow to test the website, update this document if you encounter unexpected behavior or if you can add information that would make the test faster, for example telling how to wait for actions that trigger loading states or to use a different timeout for specific actions.
-## getting outputs of code execution
-You can use `console.log` to print values you want to see in the tool call result
-## using page.evaluate
-you can execute client side JavaScript code using `page.evaluate()`
-When executing code with `page.evaluate()`, return values directly from the evaluate function. Use `console.log()` outside of evaluate to display results:
-```javascript
-// Get data from the page by returning it
-const title = await page.evaluate(() => document.title)
-console.log('Page title:', title)
-// Return multiple values as an object
-const pageInfo = await page.evaluate(() => ({
-    url: window.location.href,
-    buttonCount: document.querySelectorAll('button').length,
-    readyState: document.readyState,
-}))
-console.log('Page URL:', pageInfo.url)
-console.log('Number of buttons:', pageInfo.buttonCount)
-console.log('Page ready state:', pageInfo.readyState)
-```
+## utility functions
-## Finding Elements on the Page
+you have access to some functions in addition to playwright methods:
-you can use the tool accessibility_snapshot to get the page accessibility snapshot tree, which provides a structured view of the page's elements, including their roles and names. This is useful for understanding the page structure and finding elements to interact with.
+- `async accessibilitySnapshot(page)`: gets a human readable snapshot of clickable elements on the page. useful to see the overall structure of the page and what elements you can interact with
-Example accessibility snapshot result:
+example:
 ```md
 - generic [active] [ref=e1]:
@@ -86,454 +46,48 @@ Example accessibility snapshot result:
                         - /url: /colors
 ```
-Then you can use `page.locator(`aria-ref=${ref}`).describe(element);` to get an element with a specific `ref` and interact with it.
-For example:
-```javascript
-const componentsLink = page
-    // Exact target element reference from the page snapshot
-    .locator('aria-ref=e14')
-    // Human-readable element description used to obtain permission to interact with the element
-    .describe('Components link')
-componentsLink.click()
-console.log('Clicked on Components link')
-```
-This approach is the preferred way to find elements on the page, as it allows you to use the structured information from the accessibility snapshot to interact with elements reliably.
-You can also find `getByRole` to get elements on the page.
-```javascript
-// Then use the information from the snapshot to click elements
-// For example, if snapshot shows: { "role": "button", "name": "Sign In" }
-await page.getByRole('button', { name: 'Sign In' }).click()
-// For a link with { "role": "link", "name": "About" }
-await page.getByRole('link', { name: 'About' }).click()
-// For a textbox with { "role": "textbox", "name": "Email" }
-await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com')
-// For a heading with { "role": "heading", "name": "Welcome to Example.com" }
-const headingText = await page
-    .getByRole('heading', { name: 'Welcome to Example.com' })
-    .textContent()
-console.log('Heading text:', headingText)
-```
-### Complete Example: Find and Click Elements
-```javascript
-await page.getByRole('button', { name: 'Submit Form' }).click()
-console.log('Clicked submit button')
-await page.waitForLoadState('networkidle')
-console.log('Form submitted successfully')
-```
-## Core Concepts
-### Page and Context
-In Playwright, automation happens through a `page` object (representing a browser tab) and `context` (representing a browser session with cookies, storage, etc.).
-```javascript
-// Assuming you have page and context already available
-const page = await context.newPage()
-```
-### Element Selection
-Playwright uses locators to find elements. The examples below show various selection methods:
-```javascript
-// By role (recommended)
-await page.getByRole('button', { name: 'Submit' })
-// By text
-await page.getByText('Welcome')
-// By placeholder
-await page.getByPlaceholder('Enter email')
-// By label
-await page.getByLabel('Username')
-// By test id
-await page.getByTestId('submit-button')
-// By CSS selector
-await page.locator('.my-class')
-// By XPath
-await page.locator('//div[@class="content"]')
-```
-## Navigation
-### Navigate to URL
-```javascript
-await page.goto('https://example.com')
-// Wait for network idle (no requests for 500ms)
-await page.goto('https://example.com', { waitUntil: 'networkidle' })
-```
-### Navigate Back/Forward
-```javascript
-// Go back to previous page
-await page.goBack()
-// Go forward to next page
-await page.goForward()
-```
-## Screenshots
-### Take Screenshot
-```javascript
-// Screenshot of viewport
-await page.screenshot({ path: 'screenshot.png' })
-// Full page screenshot
-await page.screenshot({ path: 'fullpage.png', fullPage: true })
-// Screenshot of specific element
-const element = await page.getByRole('button', { name: 'Submit' })
-await element.screenshot({ path: 'button.png' })
-// Screenshot with custom dimensions
-await page.setViewportSize({ width: 1280, height: 720 })
-await page.screenshot({ path: 'custom-size.png' })
-```
-## Mouse Interactions
-### Click Elements
-```javascript
-// Click by role
-await page.getByRole('button', { name: 'Submit' }).click()
-// Click at coordinates
-await page.mouse.click(100, 200)
-// Double click
-await page.getByText('Double click me').dblclick()
-// Right click
-await page.getByText('Right click me').click({ button: 'right' })
-// Click with modifiers
-await page.getByText('Ctrl click me').click({ modifiers: ['Control'] })
-```
-### Hover
-```javascript
-// Hover over element
-await page.getByText('Hover me').hover()
-// Hover at coordinates
-await page.mouse.move(100, 200)
-```
-## Keyboard Input
-### Type Text
+Then you can use `page.locator(`aria-ref=${ref}`)` to get an element with a specific `ref` and interact with it.
-```javascript
-// Type into input field
-await page.getByLabel('Email').fill('user@example.com')
+`const componentsLink = page.locator('aria-ref=e14').click()`
-// Type character by character (simulates real typing)
-await page.getByLabel('Email').type('user@example.com', { delay: 100 })
-// Clear and type
-await page.getByLabel('Email').clear()
-await page.getByLabel('Email').fill('new@example.com')
-```
-### Press Keys
-```javascript
-// Press single key
-await page.keyboard.press('Enter')
-// Press key combination
-await page.keyboard.press('Control+A')
-// Press sequence of keys
-await page.keyboard.press('Tab')
-await page.keyboard.press('Tab')
-await page.keyboard.press('Space')
-// Common key shortcuts
-await page.keyboard.press('Control+C') // Copy
-await page.keyboard.press('Control+V') // Paste
-await page.keyboard.press('Control+Z') // Undo
-```
-## Form Interactions
-### Select Dropdown Options
-```javascript
-// Select by value
-await page.selectOption('select#country', 'us')
-// Select by label
-await page.selectOption('select#country', { label: 'United States' })
-// Select multiple options
-await page.selectOption('select#colors', ['red', 'blue', 'green'])
-// Get selected option
-const selectedValue = await page.$eval('select#country', (el) => el.value)
-```
-### Checkboxes and Radio Buttons
-```javascript
-// Check checkbox
-await page.getByLabel('I agree').check()
-// Uncheck checkbox
-await page.getByLabel('Subscribe').uncheck()
-// Check if checked
-const isChecked = await page.getByLabel('I agree').isChecked()
-// Select radio button
-await page.getByLabel('Option A').check()
-```
-## JavaScript Evaluation
-### Execute JavaScript in Page Context
-```javascript
-// Evaluate simple expression
-const result = await page.evaluate(() => 2 + 2)
-// Access page variables
-const pageTitle = await page.evaluate(() => document.title)
-// Modify page
-await page.evaluate(() => {
-    document.body.style.backgroundColor = 'red'
-})
-// Pass arguments to page context
-const sum = await page.evaluate(([a, b]) => a + b, [5, 3])
-// Work with elements
-const elementText = await page.evaluate(
-    (el) => el.textContent,
-    await page.getByRole('heading'),
-)
-```
-### Execute JavaScript on Element
-```javascript
-// Get element property
-const href = await page.getByRole('link').evaluate((el) => el.href)
-// Modify element
-await page.getByRole('button').evaluate((el) => {
-    el.style.backgroundColor = 'green'
-    el.disabled = true
-})
-// Scroll element into view
-await page.getByText('Section').evaluate((el) => el.scrollIntoView())
-```
-## File Handling
-### File Upload
-```javascript
-// Upload single file
-await page.getByLabel('Upload file').setInputFiles('/path/to/file.pdf')
-// Upload multiple files
-await page
-    .getByLabel('Upload files')
-    .setInputFiles(['/path/to/file1.pdf', '/path/to/file2.pdf'])
-// Clear file input
-await page.getByLabel('Upload file').setInputFiles([])
-// For file inputs, use setInputFiles directly on the input element
-// Find the file input element (often hidden)
-await page.locator('input[type="file"]').setInputFiles('/path/to/file.pdf')
-```
-## Network Monitoring
-### Check Network Activity
-```javascript
-// Wait for a specific request to complete and get its response
-const response = await page.waitForResponse(
-    (response) =>
-        response.url().includes('/api/user') && response.status() === 200,
-)
-// Get response data
-const responseBody = await response.json()
-console.log('API response:', responseBody)
-// Wait for specific request
-const request = await page.waitForRequest('**/api/data')
-console.log('Request URL:', request.url())
-console.log('Request method:', request.method())
-// Get all resources loaded by the page
-const resources = await page.evaluate(() =>
-    performance.getEntriesByType('resource').map((r) => ({
-        name: r.name,
-        duration: r.duration,
-        size: r.transferSize,
-    })),
-)
-console.log('Page resources:', resources)
-```
-## Console Messages
-### Capture Console Output
-```javascript
-// Console messages are automatically captured by the MCP implementation
-// Use the console_logs tool to retrieve them
-// To trigger console messages from the page:
-await page.evaluate(() => {
-    console.log('This message will be captured')
-    console.error('This error will be captured')
-    console.warn('This warning will be captured')
-})
-// Then use the console_logs MCP tool to retrieve all captured messages
-// The tool provides filtering by type and pagination
-```
-## Waiting
-### Wait for Conditions
-```javascript
-// Wait for element to appear
-await page.waitForSelector('.success-message')
-// Wait for element to disappear
-await page.waitForSelector('.loading', { state: 'hidden' })
-await page.waitForURL(/github\.com.*\/pull/)
-await page.waitForURL(/\/new-org/)
-// Wait for text to appear
-await page.waitForFunction(
-    (text) => document.body.textContent.includes(text),
-    'Success!',
-)
-// Wait for navigation
-await page.waitForURL('**/success')
-// Wait for page load
-await page.waitForLoadState('networkidle')
-// Wait for specific condition
-await page.waitForFunction(
-    (text) => document.querySelector('.status')?.textContent === text,
-    'Ready',
-)
-```
-### Wait for Text to Appear or Disappear
+## getting outputs of code execution
-```javascript
-// Wait for specific text to appear on the page
-await page.getByText('Loading complete').first().waitFor({ state: 'visible' })
-console.log('Loading complete text is now visible')
+You can use `console.log` to print values you want to see in the tool call result
-// Wait for text to disappear from the page
-await page.getByText('Loading...').first().waitFor({ state: 'hidden' })
-console.log('Loading text has disappeared')
+## using page.evaluate
-// Wait for multiple conditions sequentially
-// First wait for loading to disappear, then wait for success message
-await page.getByText('Processing...').first().waitFor({ state: 'hidden' })
-await page.getByText('Success!').first().waitFor({ state: 'visible' })
-console.log('Processing finished and success message appeared')
+you can execute client side JavaScript code using `page.evaluate()`
-// Example: Wait for error message to disappear before proceeding
-await page
-    .getByText('Error: Please try again')
-    .first()
-    .waitFor({ state: 'hidden' })
-await page.getByRole('button', { name: 'Submit' }).click()
+When executing code with `page.evaluate()`, return values directly from the evaluate function. Use `console.log()` outside of evaluate to display results:
-// Example: Wait for confirmation text after form submission
-await page.getByRole('button', { name: 'Save' }).click()
-await page
-    .getByText('Your changes have been saved')
-    .first()
-    .waitFor({ state: 'visible' })
-console.log('Save confirmed')
+```js
+// Get data from the page by returning it
+const title = await page.evaluate(() => document.title)
+console.log('Page title:', title)
-// Example: Wait for dynamic content to load
-await page.getByRole('button', { name: 'Load More' }).click()
-await page
-    .getByText('Loading more items...')
-    .first()
-    .waitFor({ state: 'visible' })
-await page
-    .getByText('Loading more items...')
-    .first()
-    .waitFor({ state: 'hidden' })
-console.log('Additional items loaded')
+// Return multiple values as an object
+const pageInfo = await page.evaluate(() => ({
+    url: window.location.href,
+    buttonCount: document.querySelectorAll('button').length,
+    readyState: document.readyState,
+}))
+console.log('Page URL:', pageInfo.url)
+console.log('Number of buttons:', pageInfo.buttonCount)
+console.log('Page ready state:', pageInfo.readyState)
 ```
-### Work with Frames
-```javascript
-// Get frame by name
-const frame = page.frame('frameName')
+## read for logs during interactions
-// Get frame by URL
-const frame = page.frame({ url: /frame\.html/ })
+you can see logs during interactions with `page.on('console', msg => console.log(`Browser log: [${msg.type()}] ${msg.text()}`))`
-// Interact with frame content
-await frame.getByText('In Frame').click()
+then remember to call `context.removeAllListeners()` or `page.removeAllListeners('console')` to not see logs in next execute calls.
-// Get all frames
-const frames = page.frames()
-```
+## reading past logs
-## Best Practices
+you can keep track of logs using `globalThis.logs = []; page.on('console', msg => globalThis.logs.push({ type: msg.type(), text: msg.text() }))`
-### Reliable Selectors
+later, you can read logs that you care about. For example, to get the last 100 logs that contain the word "error":
-```javascript
-// Prefer user-facing attributes
-await page.getByRole('button', { name: 'Submit' })
-await page.getByLabel('Email')
-await page.getByPlaceholder('Search...')
-await page.getByText('Welcome')
+`console.log('errors:'); globalThis.logs.filter(log => log.type === 'error').slice(-100).forEach(x => console.log(x))`
-// Use test IDs for complex cases
-await page.getByTestId('complex-component')
-// Avoid brittle selectors
-// Bad: await page.locator('.btn-3842');
-// Good: await page.getByRole('button', { name: 'Submit' });
-```
+then to reset logs: `globalThis.logs = []` and to stop listening: `page.removeAllListeners('console')`
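
Note on the "reading past logs" lines added above: the pattern is to attach a `console` listener once, buffer `{ type, text }` entries, filter them later, and detach the listener when done. The following is a minimal sketch of that pattern, not code from the package. It assumes a hypothetical `createFakePage` stand-in so it runs without a browser, and it keeps the buffer in a returned array and detaches only its own listener with `page.off`, rather than using `globalThis.logs` and `page.removeAllListeners('console')` as the prompt does. The helper names `startLogCapture` and `lastLogsOfType` are illustrative as well.

```javascript
// Hypothetical stand-in for a Playwright `page`: it supports only
// on/off for the 'console' event, plus a helper to emit fake messages.
function createFakePage() {
    const listeners = new Set()
    return {
        on: (event, fn) => {
            if (event === 'console') listeners.add(fn)
        },
        off: (event, fn) => {
            if (event === 'console') listeners.delete(fn)
        },
        emitConsole: (type, text) => {
            // Mimic the ConsoleMessage shape used in the prompt: type() and text()
            for (const fn of listeners) fn({ type: () => type, text: () => text })
        },
    }
}

// Attach a listener that buffers console messages; return the buffer
// together with a function that detaches only this listener.
function startLogCapture(page) {
    const logs = []
    const listener = (msg) => logs.push({ type: msg.type(), text: msg.text() })
    page.on('console', listener)
    return { logs, stop: () => page.off('console', listener) }
}

// Return at most `limit` of the most recent entries with the given type.
function lastLogsOfType(logs, type, limit = 100) {
    return logs.filter((log) => log.type === type).slice(-limit)
}

const fakePage = createFakePage()
const capture = startLogCapture(fakePage)
fakePage.emitConsole('log', 'page loaded')
fakePage.emitConsole('error', 'request failed')
capture.stop()
fakePage.emitConsole('error', 'ignored after stop')
console.log(lastLogsOfType(capture.logs, 'error'))
```

With a real Playwright `page`, the same `startLogCapture(page)` call should work, since `page.on` and `page.off` accept a `'console'` listener. Detaching a single listener leaves any other handlers on the page untouched.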