@ejazullah/browser-mcp 0.0.56

Files changed (66)
  1. package/LICENSE +202 -0
  2. package/README.md +860 -0
  3. package/cli.js +19 -0
  4. package/index.d.ts +23 -0
  5. package/index.js +1061 -0
  6. package/lib/auth.js +82 -0
  7. package/lib/browserContextFactory.js +205 -0
  8. package/lib/browserServerBackend.js +125 -0
  9. package/lib/config.js +266 -0
  10. package/lib/context.js +232 -0
  11. package/lib/databaseLogger.js +264 -0
  12. package/lib/extension/cdpRelay.js +346 -0
  13. package/lib/extension/extensionContextFactory.js +56 -0
  14. package/lib/extension/main.js +26 -0
  15. package/lib/fileUtils.js +32 -0
  16. package/lib/httpServer.js +39 -0
  17. package/lib/index.js +39 -0
  18. package/lib/javascript.js +49 -0
  19. package/lib/log.js +21 -0
  20. package/lib/loop/loop.js +69 -0
  21. package/lib/loop/loopClaude.js +152 -0
  22. package/lib/loop/loopOpenAI.js +143 -0
  23. package/lib/loop/main.js +60 -0
  24. package/lib/loopTools/context.js +66 -0
  25. package/lib/loopTools/main.js +49 -0
  26. package/lib/loopTools/perform.js +32 -0
  27. package/lib/loopTools/snapshot.js +29 -0
  28. package/lib/loopTools/tool.js +18 -0
  29. package/lib/manualPromise.js +111 -0
  30. package/lib/mcp/inProcessTransport.js +72 -0
  31. package/lib/mcp/server.js +93 -0
  32. package/lib/mcp/transport.js +217 -0
  33. package/lib/mongoDBLogger.js +252 -0
  34. package/lib/package.js +20 -0
  35. package/lib/program.js +113 -0
  36. package/lib/response.js +172 -0
  37. package/lib/sessionLog.js +156 -0
  38. package/lib/tab.js +266 -0
  39. package/lib/tools/cdp.js +169 -0
  40. package/lib/tools/common.js +55 -0
  41. package/lib/tools/console.js +33 -0
  42. package/lib/tools/dialogs.js +47 -0
  43. package/lib/tools/evaluate.js +53 -0
  44. package/lib/tools/extraction.js +217 -0
  45. package/lib/tools/files.js +44 -0
  46. package/lib/tools/forms.js +180 -0
  47. package/lib/tools/getext.js +99 -0
  48. package/lib/tools/install.js +53 -0
  49. package/lib/tools/interactions.js +191 -0
  50. package/lib/tools/keyboard.js +86 -0
  51. package/lib/tools/mouse.js +99 -0
  52. package/lib/tools/navigate.js +70 -0
  53. package/lib/tools/network.js +41 -0
  54. package/lib/tools/pdf.js +40 -0
  55. package/lib/tools/screenshot.js +75 -0
  56. package/lib/tools/selectors.js +233 -0
  57. package/lib/tools/snapshot.js +169 -0
  58. package/lib/tools/states.js +147 -0
  59. package/lib/tools/tabs.js +87 -0
  60. package/lib/tools/tool.js +33 -0
  61. package/lib/tools/utils.js +74 -0
  62. package/lib/tools/wait.js +56 -0
  63. package/lib/tools.js +64 -0
  64. package/lib/utils.js +26 -0
  65. package/openapi.json +683 -0
  66. package/package.json +92 -0
package/README.md ADDED
@@ -0,0 +1,860 @@
1
+ ## @ejazullah/browser-mcp
2
+
3
+ An Enhanced Model Context Protocol (MCP) server that provides comprehensive browser automation capabilities using [Playwright](https://playwright.dev). This server enables LLMs to interact with web pages through structured accessibility snapshots and includes advanced features like CDP (Chrome DevTools Protocol) support for connecting to existing browser instances.
4
+
5
+ ### Enhanced Features
6
+
7
+ - **Fast and lightweight**. Uses Playwright's accessibility tree, not pixel-based input.
8
+ - **LLM-friendly**. No vision models needed, operates purely on structured data.
9
+ - **Deterministic tool application**. Avoids ambiguity common with screenshot-based approaches.
10
+ - **🆕 CDP Support**. Connect to existing Chrome/Chromium instances via DevTools Protocol.
11
+ - **🆕 Dynamic Selectors**. CSS, XPath, text-based, and role-based element selection.
12
+ - **🆕 Form Automation**. Comprehensive form interaction tools (checkboxes, radio buttons, inputs).
13
+ - **🆕 Element States**. Check visibility, enabled status, and wait for elements.
14
+ - **🆕 Advanced Interactions**. Scroll, focus, double-click, right-click capabilities.
15
+ - **🆕 Data Extraction**. Bulk extraction of links, images, tables, and CSS styles.
16
+
17
+ ### Requirements
18
+ - Node.js 18 or newer
19
+ - VS Code, Cursor, Windsurf, Claude Desktop, Goose or any other MCP client
20
+
21
+ <!--
22
+ // Generate using:
23
+ node utils/generate-links.js
24
+ -->
25
+
26
+ ### Getting started
27
+
28
+ First, install the Enhanced Playwright MCP server with your client.
29
+
30
+ **Standard config** works in most of the tools:
31
+
32
+ ```js
33
+ {
34
+ "mcpServers": {
35
+ "ejaz-playwright": {
36
+ "command": "npx",
37
+ "args": [
38
+ "@ejazullah/browser-mcp@latest"
39
+ ]
40
+ }
41
+ }
42
+ }
43
+ ```
44
+
45
+ [<img src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF" alt="Install in VS Code">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522ejaz-playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540ejazullah%252Fbrowser-mcp%2540latest%2522%255D%257D) [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522ejaz-playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540ejazullah%252Fbrowser-mcp%2540latest%2522%255D%257D)
46
+
47
+
48
+ <details>
49
+ <summary>Claude Code</summary>
50
+
51
+ Use the Claude Code CLI to add the Enhanced Playwright MCP server:
52
+
53
+ ```bash
54
+ claude mcp add ejaz-playwright npx @ejazullah/browser-mcp@latest
55
+ ```
56
+ </details>
57
+
58
+ <details>
59
+ <summary>Claude Desktop</summary>
60
+
61
+ Follow the MCP install [guide](https://modelcontextprotocol.io/quickstart/user), use the standard config above.
62
+
63
+ </details>
64
+
65
+ <details>
66
+ <summary>Cursor</summary>
67
+
68
+ #### Click the button to install:
69
+
70
+ [![Install MCP Server](https://cursor.com/deeplink/mcp-install-dark.svg)](cursor://anysphere.cursor-deeplink/mcp/install?name=EjazPlaywright&config=eyJjb21tYW5kIjoibnB4IEBlamF6dWxsYWgvbWNwLXBsYXl3cmlnaHRAIiwgImFyZ3MiOiBbImxhdGVzdCJdfQ%3D%3D)
71
+
72
+ #### Or install manually:
73
+
74
+ Go to `Cursor Settings` -> `MCP` -> `Add new MCP Server`. Name it to your liking and use the `command` type with the command `npx @ejazullah/browser-mcp@latest`. You can also verify the config or add command-line arguments by clicking `Edit`.
75
+
76
+ </details>
77
+
78
+ <details>
79
+ <summary>Gemini CLI</summary>
80
+
81
+ Follow the MCP install [guide](https://github.com/google-gemini/gemini-cli/blob/main/docs/tools/mcp-server.md#configure-the-mcp-server-in-settingsjson), use the standard config above.
82
+
83
+ </details>
84
+
85
+ <details>
86
+ <summary>Goose</summary>
87
+
88
+ #### Click the button to install:
89
+
90
+ [![Install in Goose](https://block.github.io/goose/img/extension-install-dark.svg)](https://block.github.io/goose/extension?cmd=npx&arg=%40ejazullah%2Fbrowser-mcp%40latest&id=ejaz-playwright&name=EjazPlaywright&description=Enhanced%20Playwright%20MCP%20with%20CDP%20support%20for%20browser%20automation)
91
+
92
+ #### Or install manually:
93
+
94
+ Go to `Advanced settings` -> `Extensions` -> `Add custom extension`. Name to your liking, use type `STDIO`, and set the `command` to `npx @ejazullah/browser-mcp`. Click "Add Extension".
95
+ </details>
96
+
97
+ <details>
98
+ <summary>LM Studio</summary>
99
+
100
+ #### Click the button to install:
101
+
102
+ [![Add MCP Server playwright to LM Studio](https://files.lmstudio.ai/deeplink/mcp-install-light.svg)](https://lmstudio.ai/install-mcp?name=playwright&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyJAcGxheXdyaWdodC9tY3BAbGF0ZXN0Il19)
103
+
104
+ #### Or install manually:
105
+
106
+ Go to `Program` in the right sidebar -> `Install` -> `Edit mcp.json`. Use the standard config above.
107
+ </details>
108
+
109
+ <details>
110
+ <summary>Qodo Gen</summary>
111
+
112
+ Open [Qodo Gen](https://docs.qodo.ai/qodo-documentation/qodo-gen) chat panel in VSCode or IntelliJ → Connect more tools → + Add new MCP → Paste the standard config above.
113
+
114
+ Click <code>Save</code>.
115
+ </details>
116
+
117
+ <details>
118
+ <summary>VS Code</summary>
119
+
120
+ #### Click the button to install:
121
+
122
+ [<img src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF" alt="Install in VS Code">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522ejaz-playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540ejazullah%252Fbrowser-mcp%2540latest%2522%255D%257D) [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522ejaz-playwright%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522%2540ejazullah%252Fbrowser-mcp%2540latest%2522%255D%257D)
123
+
124
+ #### Or install manually:
125
+
126
+ Follow the MCP install [guide](https://code.visualstudio.com/docs/copilot/chat/mcp-servers#_add-an-mcp-server), use the standard config above. You can also install the Enhanced Playwright MCP server using the VS Code CLI:
127
+
128
+ ```bash
129
+ # For VS Code
130
+ code --add-mcp '{"name":"ejaz-playwright","command":"npx","args":["@ejazullah/browser-mcp@latest"]}'
131
+ ```
132
+
133
+ After installation, the Enhanced Playwright MCP server will be available for use with your GitHub Copilot agent in VS Code.
134
+ </details>
135
+
136
+ <details>
137
+ <summary>Windsurf</summary>
138
+
139
+ Follow Windsurf MCP [documentation](https://docs.windsurf.com/windsurf/cascade/mcp). Use the standard config above.
140
+
141
+ </details>
142
+
143
+ ### ✨ Enhanced Features
144
+
145
+ #### 🔗 CDP (Chrome DevTools Protocol) Support
146
+
147
+ Connect to existing Chrome/Chromium browser instances for debugging and automation:
148
+
149
+ ```javascript
150
+ // Discover available CDP endpoints
151
+ await client.callTool({
152
+ name: 'browser_get_cdp_endpoints',
153
+ arguments: { port: 9222 }
154
+ });
155
+
156
+ // Connect to remote browser
157
+ await client.callTool({
158
+ name: 'browser_connect_cdp',
159
+ arguments: { endpoint: 'wss://your-browser-endpoint/devtools/session-id' }
160
+ });
161
+
162
+ // Use any browser tool normally!
163
+ await client.callTool({
164
+ name: 'browser_navigate',
165
+ arguments: { url: 'https://example.com' }
166
+ });
167
+ ```
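The snippets in this README assume a `client` that is already connected to the server and exposes `callTool`. As a minimal sketch, a thin wrapper can turn tool-level failures into exceptions by checking the `isError` flag that MCP tool results may carry. The `callToolOrThrow` helper and the stub client are hypothetical and exist only so the example runs on its own; the result payloads this server actually returns may look different.

```javascript
// Hypothetical helper: call a tool and throw if the result is flagged as an error.
async function callToolOrThrow(client, name, args = {}) {
  const result = await client.callTool({ name, arguments: args });
  if (result && result.isError) {
    // Collect any text parts of the result to build a readable error message.
    const text = (result.content || [])
      .filter((part) => part.type === 'text')
      .map((part) => part.text)
      .join('\n');
    throw new Error(`Tool ${name} failed: ${text || 'no details'}`);
  }
  return result;
}

// Stub standing in for a connected MCP client (assumption, for illustration only).
const stubClient = {
  calls: [],
  async callTool(request) {
    this.calls.push(request);
    if (request.name === 'browser_connect_cdp' && !request.arguments.endpoint) {
      return { isError: true, content: [{ type: 'text', text: 'endpoint is required' }] };
    }
    return { content: [{ type: 'text', text: 'ok' }] };
  },
};
```

With a real client, the same wrapper would be called as `await callToolOrThrow(client, 'browser_navigate', { url: 'https://example.com' })`.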
168
+
169
+ #### 🎯 Dynamic Selectors
170
+
171
+ Multiple flexible ways to target elements:
172
+
173
+ ```javascript
174
+ // CSS Selectors
175
+ await client.callTool({
176
+ name: 'browser_get_text_by_css',
177
+ arguments: { selector: '#main-heading' }
178
+ });
179
+
180
+ // XPath Expressions
181
+ await client.callTool({
182
+ name: 'browser_get_text_by_xpath',
183
+ arguments: { xpath: '//button[contains(text(), "Submit")]' }
184
+ });
185
+
186
+ // Text-based Selection
187
+ await client.callTool({
188
+ name: 'browser_get_text_by_text',
189
+ arguments: { text: 'Click here', elementType: 'button' }
190
+ });
191
+
192
+ // ARIA Role-based
193
+ await client.callTool({
194
+ name: 'browser_get_text_by_role',
195
+ arguments: { role: 'button', name: 'Submit' }
196
+ });
197
+ ```
198
+
199
+ #### 📋 Form Automation
200
+
201
+ Comprehensive form interaction tools:
202
+
203
+ ```javascript
204
+ // Checkbox operations
205
+ await client.callTool({
206
+ name: 'browser_check_checkbox',
207
+ arguments: { element: 'I agree', ref: 'e3', checked: true }
208
+ });
209
+
210
+ // Radio button selection
211
+ await client.callTool({
212
+ name: 'browser_select_radio',
213
+ arguments: { element: 'Payment method', ref: 'e4' }
214
+ });
215
+
216
+ // Input field operations
217
+ await client.callTool({
218
+ name: 'browser_get_input_value',
219
+ arguments: { element: 'Email field', ref: 'e2' }
220
+ });
221
+ ```
222
+
223
+ #### 🔍 Element States & Interactions
224
+
225
+ Advanced element state checking and interactions:
226
+
227
+ ```javascript
228
+ // State checking
229
+ await client.callTool({
230
+ name: 'browser_is_visible',
231
+ arguments: { selector: '#loading-spinner' }
232
+ });
233
+
234
+ // Advanced interactions
235
+ await client.callTool({
236
+ name: 'browser_double_click',
237
+ arguments: { element: 'File icon', ref: 'e5' }
238
+ });
239
+
240
+ // Scrolling
241
+ await client.callTool({
242
+ name: 'browser_scroll_to',
243
+ arguments: { selector: '#footer' }
244
+ });
245
+ ```
246
+
247
+ #### 📊 Data Extraction
248
+
249
+ Bulk data extraction capabilities:
250
+
251
+ ```javascript
252
+ // Extract all links
253
+ await client.callTool({
254
+ name: 'browser_get_all_links',
255
+ arguments: { includeHref: true }
256
+ });
257
+
258
+ // Extract table data
259
+ await client.callTool({
260
+ name: 'browser_get_table_data',
261
+ arguments: { selector: 'table.data-table' }
262
+ });
263
+
264
+ // Get CSS styles
265
+ await client.callTool({
266
+ name: 'browser_get_css_styles',
267
+ arguments: { selector: '.button', properties: ['color', 'background'] }
268
+ });
269
+ ```
270
+
271
+ ### Configuration
272
+
273
+ The Enhanced Playwright MCP server supports the following arguments. They can be provided in the JSON configuration above, as part of the `"args"` list:
274
+
275
+ <!--- Options generated by update-readme.js -->
276
+
277
+ ```
278
+ > npx @ejazullah/browser-mcp@latest --help
279
+ --allowed-origins <origins> semicolon-separated list of origins to allow the
280
+ browser to request. Default is to allow all.
281
+ --blocked-origins <origins> semicolon-separated list of origins to block the
282
+ browser from requesting. Blocklist is evaluated
283
+ before allowlist. If used without the allowlist,
284
+ requests not matching the blocklist are still
285
+ allowed.
286
+ --block-service-workers block service workers
287
+ --browser <browser> browser or chrome channel to use, possible
288
+ values: chrome, firefox, webkit, msedge.
289
+ --caps <caps> comma-separated list of additional capabilities
290
+ to enable, possible values: vision, pdf.
291
+ --cdp-endpoint <endpoint> CDP endpoint to connect to.
292
+ --config <path> path to the configuration file.
293
+ --device <device> device to emulate, for example: "iPhone 15"
294
+ --executable-path <path> path to the browser executable.
295
+ --headless run browser in headless mode, headed by default
296
+ --host <host> host to bind server to. Default is localhost. Use
297
+ 0.0.0.0 to bind to all interfaces.
298
+ --ignore-https-errors ignore https errors
299
+ --isolated keep the browser profile in memory, do not save
300
+ it to disk.
301
+ --image-responses <mode> whether to send image responses to the client.
302
+ Can be "allow" or "omit", Defaults to "allow".
303
+ --no-sandbox disable the sandbox for all process types that
304
+ are normally sandboxed.
305
+ --output-dir <path> path to the directory for output files.
306
+ --port <port> port to listen on for SSE transport.
307
+ --proxy-bypass <bypass> comma-separated domains to bypass proxy, for
308
+ example ".com,chromium.org,.domain.com"
309
+ --proxy-server <proxy> specify proxy server, for example
310
+ "http://myproxy:3128" or "socks5://myproxy:8080"
311
+ --save-session Whether to save the Playwright MCP session into
312
+ the output directory.
313
+ --save-trace Whether to save the Playwright Trace of the
314
+ session into the output directory.
315
+ --storage-state <path> path to the storage state file for isolated
316
+ sessions.
317
+ --user-agent <ua string> specify user agent string
318
+ --user-data-dir <path> path to the user data directory. If not
319
+ specified, a temporary directory will be created.
320
+ --viewport-size <size> specify browser viewport size in pixels, for
321
+ example "1280, 720"
322
+ ```
323
+
324
+ <!--- End of options generated section -->
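The help text for `--allowed-origins` and `--blocked-origins` describes a precedence: the blocklist is checked first, and without an allowlist everything else is still allowed. The sketch below restates that rule as code. It is not the package's implementation; `parseOrigins` and `isRequestAllowed` are hypothetical names, and comparing origins by exact string match is an assumption.

```javascript
// Hypothetical helper: split a semicolon-separated origin list into an array.
function parseOrigins(value) {
  return (value || '')
    .split(';')
    .map((origin) => origin.trim())
    .filter(Boolean);
}

// Sketch of the documented precedence; exact-match comparison is an assumption.
function isRequestAllowed(url, { allowedOrigins = [], blockedOrigins = [] } = {}) {
  const origin = new URL(url).origin;
  if (blockedOrigins.includes(origin)) return false; // blocklist is evaluated first
  if (allowedOrigins.length === 0) return true;      // no allowlist: everything else passes
  return allowedOrigins.includes(origin);            // allowlist present: must match it
}
```

An origin that appears in both lists is rejected, which matches the note in the configuration schema that such origins are blocked.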
325
+
326
+ ### User profile
327
+
328
+ You can run Playwright MCP with a persistent profile, like a regular browser (the default), or in isolated contexts for testing sessions.
329
+
330
+ **Persistent profile**
331
+
332
+ All login state is stored in the persistent profile; you can delete the profile between sessions if you'd like to clear the offline state.
333
+ The persistent profile is stored in one of the following locations, depending on your platform; you can override the location with the `--user-data-dir` argument.
334
+
335
+ ```bash
336
+ # Windows
337
+ %USERPROFILE%\AppData\Local\ms-playwright\mcp-{channel}-profile
338
+
339
+ # macOS
340
+ ~/Library/Caches/ms-playwright/mcp-{channel}-profile
341
+
342
+ # Linux
343
+ ~/.cache/ms-playwright/mcp-{channel}-profile
344
+ ```
345
+
346
+ **Isolated**
347
+
348
+ In isolated mode, each session starts in its own isolated profile. Every time you ask MCP to close the browser,
349
+ the session is closed and all the storage state for this session is lost. You can provide initial storage state
350
+ to the browser via the config's `contextOptions` or via the `--storage-state` argument. Learn more about the storage
351
+ state [here](https://playwright.dev/docs/auth).
352
+
353
+ ```js
354
+ {
355
+ "mcpServers": {
356
+ "playwright": {
357
+ "command": "npx",
358
+ "args": [
359
+ "@ejazullah/browser-mcp",
360
+ "--isolated",
361
+ "--storage-state={path/to/storage.json}"
362
+ ]
363
+ }
364
+ }
365
+ }
366
+ ```
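An `args` list like the one above can also be assembled from an options object. The following is a minimal sketch, assuming the `--flag=value` form used elsewhere in this README; `buildArgs` is a hypothetical helper, not something the package exports, and the option values are examples only.

```javascript
// Hypothetical helper: turn an options object into CLI arguments for the server.
function buildArgs(options = {}) {
  const args = ['@ejazullah/browser-mcp@latest'];
  for (const [flag, value] of Object.entries(options)) {
    if (value === false || value === undefined || value === null) continue; // option not set
    if (value === true) {
      args.push(`--${flag}`); // boolean switch, e.g. --isolated
    } else if (Array.isArray(value)) {
      // Comma-joined list, e.g. --caps. Origin lists use semicolons,
      // so pass those as a pre-joined string instead of an array.
      args.push(`--${flag}=${value.join(',')}`);
    } else {
      args.push(`--${flag}=${value}`); // single value
    }
  }
  return args;
}

const mcpConfig = {
  mcpServers: {
    'ejaz-playwright': {
      command: 'npx',
      args: buildArgs({
        isolated: true,
        headless: true,
        caps: ['vision', 'pdf'],
        'storage-state': 'path/to/storage.json',
      }),
    },
  },
};

console.log(JSON.stringify(mcpConfig, null, 2));
```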
367
+
368
+ ### Configuration file
369
+
370
+ The Playwright MCP server can be configured using a JSON configuration file. You can specify the configuration file
371
+ using the `--config` command line option:
372
+
373
+ ```bash
374
+ npx @ejazullah/browser-mcp --config path/to/config.json
375
+ ```
376
+
377
+ <details>
378
+ <summary>Configuration file schema</summary>
379
+
380
+ ```typescript
381
+ {
382
+ // Browser configuration
383
+ browser?: {
384
+ // Browser type to use (chromium, firefox, or webkit)
385
+ browserName?: 'chromium' | 'firefox' | 'webkit';
386
+
387
+ // Keep the browser profile in memory, do not save it to disk.
388
+ isolated?: boolean;
389
+
390
+ // Path to user data directory for browser profile persistence
391
+ userDataDir?: string;
392
+
393
+ // Browser launch options (see Playwright docs)
394
+ // @see https://playwright.dev/docs/api/class-browsertype#browser-type-launch
395
+ launchOptions?: {
396
+ channel?: string; // Browser channel (e.g. 'chrome')
397
+ headless?: boolean; // Run in headless mode
398
+ executablePath?: string; // Path to browser executable
399
+ // ... other Playwright launch options
400
+ };
401
+
402
+ // Browser context options
403
+ // @see https://playwright.dev/docs/api/class-browser#browser-new-context
404
+ contextOptions?: {
405
+ viewport?: { width: number, height: number };
406
+ // ... other Playwright context options
407
+ };
408
+
409
+ // CDP endpoint for connecting to existing browser
410
+ cdpEndpoint?: string;
411
+
412
+ // Remote Playwright server endpoint
413
+ remoteEndpoint?: string;
414
+ },
415
+
416
+ // Server configuration
417
+ server?: {
418
+ port?: number; // Port to listen on
419
+ host?: string; // Host to bind to (default: localhost)
420
+ },
421
+
422
+ // List of additional capabilities
423
+ capabilities?: Array<
424
+ 'tabs' | // Tab management
425
+ 'install' | // Browser installation
426
+ 'pdf' | // PDF generation
427
+ 'vision' // Coordinate-based interactions
428
+ >;
429
+
430
+ // Directory for output files
431
+ outputDir?: string;
432
+
433
+ // Network configuration
434
+ network?: {
435
+ // List of origins to allow the browser to request. Default is to allow all. Origins matching both `allowedOrigins` and `blockedOrigins` will be blocked.
436
+ allowedOrigins?: string[];
437
+
438
+ // List of origins to block the browser from requesting. Origins matching both `allowedOrigins` and `blockedOrigins` will be blocked.
439
+ blockedOrigins?: string[];
440
+ };
441
+
442
+ /**
443
+ * Whether to send image responses to the client. Can be "allow" or "omit".
444
+ * Defaults to "allow".
445
+ */
446
+ imageResponses?: 'allow' | 'omit';
447
+ }
448
+ ```
449
+ </details>
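As an illustration of the schema, the sketch below builds a configuration object and serializes it to JSON. Every key comes from the schema; the values (port, paths, origins) are example assumptions rather than defaults of the package.

```javascript
// Example values only; every key is taken from the configuration file schema.
const config = {
  browser: {
    browserName: 'chromium',
    isolated: true,
    launchOptions: { headless: true },
    contextOptions: { viewport: { width: 1280, height: 720 } },
  },
  server: { port: 8931, host: 'localhost' },
  capabilities: ['tabs', 'pdf'],
  outputDir: './output',
  network: {
    allowedOrigins: ['https://example.com'],
    blockedOrigins: ['https://ads.example.com'],
  },
  imageResponses: 'omit',
};

// Serialize it; the resulting text can be saved as the file passed to --config.
const configJson = JSON.stringify(config, null, 2);
console.log(configJson);
```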
450
+
451
+ ### Standalone MCP server
452
+
453
+ When running a headed browser on a system without a display, or from an IDE worker process,
454
+ run the MCP server from an environment that has `DISPLAY` set and pass the `--port` flag to enable HTTP transport.
455
+
456
+ ```bash
457
+ npx @ejazullah/browser-mcp --port 8931
458
+ ```
459
+
460
+ And then in MCP client config, set the `url` to the HTTP endpoint:
461
+
462
+ ```js
463
+ {
464
+ "mcpServers": {
465
+ "playwright": {
466
+ "url": "http://localhost:8931/mcp"
467
+ }
468
+ }
469
+ }
470
+ ```
471
+
472
+ <details>
473
+ <summary><b>Docker</b></summary>
474
+
475
+ **NOTE:** The Docker implementation only supports headless Chromium at the moment.
476
+
477
+ ```js
478
+ {
479
+ "mcpServers": {
480
+ "playwright": {
481
+ "command": "docker",
482
+ "args": ["run", "-i", "--rm", "--init", "--pull=always", "mcr.microsoft.com/playwright/mcp"]
483
+ }
484
+ }
485
+ }
486
+ ```
487
+
488
+ You can build the Docker image yourself.
489
+
490
+ ```
491
+ docker build -t mcr.microsoft.com/playwright/mcp .
492
+ ```
493
+ </details>
494
+
495
+ <details>
496
+ <summary><b>Programmatic usage</b></summary>
497
+
498
+ ```js
499
+ import http from 'http';
500
+
501
+ import { createConnection } from '@ejazullah/browser-mcp';
502
+ import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';
503
+
504
+ http.createServer(async (req, res) => {
505
+ // ...
506
+
507
+ // Creates a headless Playwright MCP server with SSE transport
508
+ const connection = await createConnection({ browser: { launchOptions: { headless: true } } });
509
+ const transport = new SSEServerTransport('/messages', res);
510
+ await connection.server.connect(transport);
511
+
512
+ // ...
513
+ });
514
+ ```
515
+ </details>
516
+
517
+ ### Tools
518
+
519
+ <!--- Tools generated by update-readme.js -->
520
+
521
+ <details>
522
+ <summary><b>Core automation</b></summary>
523
+
524
+ <!-- NOTE: This has been generated via update-readme.js -->
525
+
526
+ - **browser_click**
527
+ - Title: Click
528
+ - Description: Perform click on a web page
529
+ - Parameters:
530
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
531
+ - `ref` (string): Exact target element reference from the page snapshot
532
+ - `doubleClick` (boolean, optional): Whether to perform a double click instead of a single click
533
+ - `button` (string, optional): Button to click, defaults to left
534
+ - Read-only: **false**
535
+
536
+ <!-- NOTE: This has been generated via update-readme.js -->
537
+
538
+ - **browser_close**
539
+ - Title: Close browser
540
+ - Description: Close the page
541
+ - Parameters: None
542
+ - Read-only: **true**
543
+
544
+ <!-- NOTE: This has been generated via update-readme.js -->
545
+
546
+ - **browser_console_messages**
547
+ - Title: Get console messages
548
+ - Description: Returns all console messages
549
+ - Parameters: None
550
+ - Read-only: **true**
551
+
552
+ <!-- NOTE: This has been generated via update-readme.js -->
553
+
554
+ - **browser_drag**
555
+ - Title: Drag mouse
556
+ - Description: Perform drag and drop between two elements
557
+ - Parameters:
558
+ - `startElement` (string): Human-readable source element description used to obtain the permission to interact with the element
559
+ - `startRef` (string): Exact source element reference from the page snapshot
560
+ - `endElement` (string): Human-readable target element description used to obtain the permission to interact with the element
561
+ - `endRef` (string): Exact target element reference from the page snapshot
562
+ - Read-only: **false**
563
+
564
+ <!-- NOTE: This has been generated via update-readme.js -->
565
+
566
+ - **browser_evaluate**
567
+ - Title: Evaluate JavaScript
568
+ - Description: Evaluate JavaScript expression on page or element
569
+ - Parameters:
570
+ - `function` (string): () => { /* code */ } or (element) => { /* code */ } when element is provided
571
+ - `element` (string, optional): Human-readable element description used to obtain permission to interact with the element
572
+ - `ref` (string, optional): Exact target element reference from the page snapshot
573
+ - Read-only: **false**
574
+
575
+ <!-- NOTE: This has been generated via update-readme.js -->
576
+
577
+ - **browser_file_upload**
578
+ - Title: Upload files
579
+ - Description: Upload one or multiple files
580
+ - Parameters:
581
+ - `paths` (array): The absolute paths to the files to upload. Can be a single file or multiple files.
582
+ - Read-only: **false**
583
+
584
+ <!-- NOTE: This has been generated via update-readme.js -->
585
+
586
+ - **browser_handle_dialog**
587
+ - Title: Handle a dialog
588
+ - Description: Handle a dialog
589
+ - Parameters:
590
+ - `accept` (boolean): Whether to accept the dialog.
591
+ - `promptText` (string, optional): The text of the prompt in case of a prompt dialog.
592
+ - Read-only: **false**
593
+
594
+ <!-- NOTE: This has been generated via update-readme.js -->
595
+
596
+ - **browser_hover**
597
+ - Title: Hover mouse
598
+ - Description: Hover over element on page
599
+ - Parameters:
600
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
601
+ - `ref` (string): Exact target element reference from the page snapshot
602
+ - Read-only: **true**
603
+
604
+ <!-- NOTE: This has been generated via update-readme.js -->
605
+
606
+ - **browser_navigate**
607
+ - Title: Navigate to a URL
608
+ - Description: Navigate to a URL
609
+ - Parameters:
610
+ - `url` (string): The URL to navigate to
611
+ - Read-only: **false**
612
+
613
+ <!-- NOTE: This has been generated via update-readme.js -->
614
+
615
+ - **browser_navigate_back**
616
+ - Title: Go back
617
+ - Description: Go back to the previous page
618
+ - Parameters: None
619
+ - Read-only: **true**
620
+
621
+ <!-- NOTE: This has been generated via update-readme.js -->
622
+
623
+ - **browser_navigate_forward**
624
+ - Title: Go forward
625
+ - Description: Go forward to the next page
626
+ - Parameters: None
627
+ - Read-only: **true**
628
+
629
+ <!-- NOTE: This has been generated via update-readme.js -->
630
+
631
+ - **browser_network_requests**
632
+ - Title: List network requests
633
+ - Description: Returns all network requests since loading the page
634
+ - Parameters: None
635
+ - Read-only: **true**
636
+
637
+ <!-- NOTE: This has been generated via update-readme.js -->
638
+
639
+ - **browser_press_key**
640
+ - Title: Press a key
641
+ - Description: Press a key on the keyboard
642
+ - Parameters:
643
+ - `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
644
+ - Read-only: **false**
645
+
646
+ <!-- NOTE: This has been generated via update-readme.js -->
647
+
648
+ - **browser_resize**
649
+ - Title: Resize browser window
650
+ - Description: Resize the browser window
651
+ - Parameters:
652
+ - `width` (number): Width of the browser window
653
+ - `height` (number): Height of the browser window
654
+ - Read-only: **true**
655
+
656
+ <!-- NOTE: This has been generated via update-readme.js -->
657
+
658
+ - **browser_select_option**
659
+ - Title: Select option
660
+ - Description: Select an option in a dropdown
661
+ - Parameters:
662
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
663
+ - `ref` (string): Exact target element reference from the page snapshot
664
+ - `values` (array): Array of values to select in the dropdown. This can be a single value or multiple values.
665
+ - Read-only: **false**
666
+
667
+ <!-- NOTE: This has been generated via update-readme.js -->
668
+
669
+ - **browser_snapshot**
670
+ - Title: Page snapshot
671
+ - Description: Capture accessibility snapshot of the current page, this is better than screenshot
672
+ - Parameters: None
673
+ - Read-only: **true**
674
+
675
+ <!-- NOTE: This has been generated via update-readme.js -->
676
+
677
+ - **browser_take_screenshot**
678
+ - Title: Take a screenshot
679
+ - Description: Take a screenshot of the current page. You can't perform actions based on the screenshot, use browser_snapshot for actions.
680
+ - Parameters:
681
+ - `type` (string, optional): Image format for the screenshot. Default is png.
682
+ - `filename` (string, optional): File name to save the screenshot to. Defaults to `page-{timestamp}.{png|jpeg}` if not specified.
683
+ - `element` (string, optional): Human-readable element description used to obtain permission to screenshot the element. If not provided, the screenshot will be taken of viewport. If element is provided, ref must be provided too.
684
+ - `ref` (string, optional): Exact target element reference from the page snapshot. If not provided, the screenshot will be taken of viewport. If ref is provided, element must be provided too.
685
+ - `fullPage` (boolean, optional): When true, takes a screenshot of the full scrollable page, instead of the currently visible viewport. Cannot be used with element screenshots.
686
+ - Read-only: **true**
687
+
688
+ <!-- NOTE: This has been generated via update-readme.js -->
689
+
690
+ - **browser_type**
691
+ - Title: Type text
692
+ - Description: Type text into editable element
693
+ - Parameters:
694
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
695
+ - `ref` (string): Exact target element reference from the page snapshot
696
+ - `text` (string): Text to type into the element
697
+ - `submit` (boolean, optional): Whether to submit entered text (press Enter after)
698
+ - `slowly` (boolean, optional): Whether to type one character at a time. Useful for triggering key handlers in the page. By default entire text is filled in at once.
699
+ - Read-only: **false**
700
+
701
+ <!-- NOTE: This has been generated via update-readme.js -->
702
+
703
+ - **browser_wait_for**
704
+ - Title: Wait for
705
+ - Description: Wait for text to appear or disappear or a specified time to pass
706
+ - Parameters:
707
+ - `time` (number, optional): The time to wait in seconds
708
+ - `text` (string, optional): The text to wait for
709
+ - `textGone` (string, optional): The text to wait for to disappear
710
+ - Read-only: **true**
711
+
712
+ </details>
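Some of the parameter descriptions above carry constraints, for example `browser_take_screenshot` requires `element` and `ref` together and does not allow `fullPage` with an element screenshot. A client could check these before sending the call. The sketch below is one way to do that; `checkScreenshotArgs` is a hypothetical helper and is not part of the package.

```javascript
// Hypothetical pre-flight check for browser_take_screenshot arguments.
// Returns a list of problems; an empty list means the arguments look consistent.
function checkScreenshotArgs(args = {}) {
  const problems = [];
  const hasElement = args.element !== undefined;
  const hasRef = args.ref !== undefined;
  if (hasElement !== hasRef) {
    problems.push('element and ref must be provided together');
  }
  if (args.fullPage && (hasElement || hasRef)) {
    problems.push('fullPage cannot be combined with an element screenshot');
  }
  return problems;
}

console.log(checkScreenshotArgs({ element: 'Header logo', fullPage: true }));
```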
713
+
714
+ <details>
715
+ <summary><b>Tab management</b></summary>
716
+
717
+ <!-- NOTE: This has been generated via update-readme.js -->
718
+
719
+ - **browser_tab_close**
720
+ - Title: Close a tab
721
+ - Description: Close a tab
722
+ - Parameters:
723
+ - `index` (number, optional): The index of the tab to close. Closes current tab if not provided.
724
+ - Read-only: **false**
725
+
726
+ <!-- NOTE: This has been generated via update-readme.js -->
727
+
728
+ - **browser_tab_list**
729
+ - Title: List tabs
730
+ - Description: List browser tabs
731
+ - Parameters: None
732
+ - Read-only: **true**
733
+
734
+ <!-- NOTE: This has been generated via update-readme.js -->
735
+
736
+ - **browser_tab_new**
737
+ - Title: Open a new tab
738
+ - Description: Open a new tab
739
+ - Parameters:
740
+ - `url` (string, optional): The URL to navigate to in the new tab. If not provided, the new tab will be blank.
741
+ - Read-only: **true**
742
+
743
+ <!-- NOTE: This has been generated via update-readme.js -->
744
+
745
+ - **browser_tab_select**
746
+ - Title: Select a tab
747
+ - Description: Select a tab by index
748
+ - Parameters:
749
+ - `index` (number): The index of the tab to select
750
+ - Read-only: **true**
751
+
752
+ </details>
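
The read-only flags in the listing above can be used by a client to decide which calls need extra confirmation. The following is a minimal sketch of that idea, assuming a client-side plan represented as an array of `{ name, arguments }` objects; `mutatingTools`, the URL, and the tab index are hypothetical and only serve as an illustration.

```javascript
// Read-only flags copied from the tab management listing above.
const TAB_TOOLS_READ_ONLY = {
  browser_tab_close: false,
  browser_tab_list: true,
  browser_tab_new: true,
  browser_tab_select: true,
};

// Return the names of planned tools that are documented as not read-only.
// `mutatingTools` is a hypothetical helper, not part of this package.
function mutatingTools(plan) {
  return plan
    .map((step) => step.name)
    .filter((name) => TAB_TOOLS_READ_ONLY[name] === false);
}

// A planned sequence: open a tab, list tabs, select one, close it.
// The URL and the index are placeholder values.
const plan = [
  { name: "browser_tab_new", arguments: { url: "https://example.com" } },
  { name: "browser_tab_list", arguments: {} },
  { name: "browser_tab_select", arguments: { index: 1 } },
  { name: "browser_tab_close", arguments: { index: 1 } },
];

console.log(mutatingTools(plan)); // prints [ 'browser_tab_close' ]
```

In practice, the index passed to `browser_tab_select` or `browser_tab_close` would come from the output of `browser_tab_list` rather than a hard-coded value.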
753
+
754
+ <details>
755
+ <summary><b>Browser installation</b></summary>
756
+
757
+ <!-- NOTE: This has been generated via update-readme.js -->
758
+
759
+ - **browser_install**
760
+ - Title: Install the browser specified in the config
761
+ - Description: Install the browser specified in the config. Call this if you get an error about the browser not being installed.
762
+ - Parameters: None
763
+ - Read-only: **false**
764
+
765
+ </details>
766
+
767
+ <details>
768
+ <summary><b>Coordinate-based (opt-in via --caps=vision)</b></summary>
769
+
770
+ <!-- NOTE: This has been generated via update-readme.js -->
771
+
772
+ - **browser_mouse_click_xy**
773
+ - Title: Click
774
+ - Description: Click left mouse button at a given position
775
+ - Parameters:
776
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
777
+ - `x` (number): X coordinate
778
+ - `y` (number): Y coordinate
779
+ - Read-only: **false**
780
+
781
+ <!-- NOTE: This has been generated via update-readme.js -->
782
+
783
+ - **browser_mouse_drag_xy**
784
+ - Title: Drag mouse
785
+ - Description: Drag left mouse button to a given position
786
+ - Parameters:
787
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
788
+ - `startX` (number): Start X coordinate
789
+ - `startY` (number): Start Y coordinate
790
+ - `endX` (number): End X coordinate
791
+ - `endY` (number): End Y coordinate
792
+ - Read-only: **false**
793
+
794
+ <!-- NOTE: This has been generated via update-readme.js -->
795
+
796
+ - **browser_mouse_move_xy**
797
+ - Title: Move mouse
798
+ - Description: Move mouse to a given position
799
+ - Parameters:
800
+ - `element` (string): Human-readable element description used to obtain permission to interact with the element
801
+ - `x` (number): X coordinate
802
+ - `y` (number): Y coordinate
803
+ - Read-only: **true**
804
+
805
+ </details>
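
The coordinate-based tools take raw `x`/`y` values, so a client typically has to derive them from some known geometry first. The sketch below computes the center of a rectangle and uses it to fill in arguments for `browser_mouse_click_xy` and `browser_mouse_drag_xy`. The bounding boxes, element descriptions, and the `centerOf` helper are hypothetical, and the coordinate space the server expects is not stated in the listing above, so treat the numbers as placeholders.

```javascript
// Center point of a rectangle given as { x, y, width, height }.
// `centerOf` is a hypothetical helper, not part of this package.
function centerOf(box) {
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Hypothetical bounding boxes, e.g. estimated from a screenshot.
const handle = { x: 100, y: 200, width: 40, height: 20 };
const target = { x: 400, y: 300, width: 100, height: 100 };

// Arguments for browser_mouse_click_xy: click the middle of the handle.
const clickArgs = { element: "Slider handle", ...centerOf(handle) };

// Arguments for browser_mouse_drag_xy: drag from the handle to the target.
const from = centerOf(handle);
const to = centerOf(target);
const dragArgs = {
  element: "Slider handle to drop zone",
  startX: from.x,
  startY: from.y,
  endX: to.x,
  endY: to.y,
};

console.log(clickArgs, dragArgs);
```

These tools are only available when the server is started with `--caps=vision`, as noted in the section heading.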
806
+
807
+ <details>
808
+ <summary><b>PDF generation (opt-in via --caps=pdf)</b></summary>
809
+
810
+ <!-- NOTE: This has been generated via update-readme.js -->
811
+
812
+ - **browser_pdf_save**
813
+ - Title: Save as PDF
814
+ - Description: Save page as PDF
815
+ - Parameters:
816
+ - `filename` (string, optional): File name to save the PDF to. Defaults to `page-{timestamp}.pdf` if not specified.
817
+ - Read-only: **true**
818
+
819
+ </details>
820
+
821
+
822
+ <!--- End of tools generated section -->
823
+
824
+ ---
825
+
826
+ ## 🚀 **About @ejazullah/browser-mcp**
827
+
828
+ This enhanced version of the Playwright MCP server includes **25+ additional tools** and advanced features:
829
+
830
+ ### ✨ **New Tool Categories:**
831
+ - **🔗 CDP Tools** (3): Connect to existing browsers via Chrome DevTools Protocol
832
+ - **🎯 Selector Tools** (5): CSS, XPath, text-based, and role-based element targeting
833
+ - **📋 Form Tools** (5): Checkbox, radio, input field automation
834
+ - **🔍 State Tools** (5): Element visibility, enabled status, waiting
835
+ - **🤝 Interaction Tools** (4): Advanced scrolling, focusing, double/right-click
836
+ - **📊 Extraction Tools** (4): Bulk data extraction (links, images, tables, styles)
837
+
838
+ ### 🎯 **Key Enhancements:**
839
+ - **Remote Browser Support**: Connect to Chrome instances anywhere via WebSocket
840
+ - **Dynamic Element Selection**: Target elements without relying solely on snapshot element references
841
+ - **Comprehensive Form Handling**: Full form automation capabilities
842
+ - **Enhanced Data Extraction**: Bulk operations for web scraping
843
+ - **Mobile & Remote Testing**: Debug mobile browsers via CDP
844
+ - **Advanced Interactions**: Beyond basic click/type operations
845
+
846
+ ### 📈 **Total Tools: 52+**
847
+ - **Original Playwright MCP**: ~25 tools
848
+ - **Enhanced Features**: 25+ new tools
849
+ - **All original functionality preserved** ✅
850
+
851
+ ### 🔧 **Perfect for:**
852
+ - **Web Testing & QA**: Comprehensive browser automation
853
+ - **Web Scraping**: Advanced data extraction capabilities
854
+ - **Remote Debugging**: Connect to browsers on different machines
855
+ - **Mobile Testing**: Debug mobile browsers via USB/network
856
+ - **Integration Testing**: Connect to existing browser sessions
857
+
858
+ ---
859
+
860
+ **Built by [Ejaz Ullah](https://github.com/ejazullah)** | Enhanced version of Microsoft's Playwright MCP