chrometools-mcp 2.5.0 → 3.1.2

Files changed (48)
  1. package/CHANGELOG.md +420 -0
  2. package/COMPONENT_MAPPING_SPEC.md +1217 -0
  3. package/README.md +406 -38
  4. package/bridge/bridge-client.js +472 -0
  5. package/bridge/bridge-service.js +399 -0
  6. package/bridge/install.js +241 -0
  7. package/browser/browser-manager.js +107 -2
  8. package/browser/page-manager.js +226 -69
  9. package/docs/CHROME_EXTENSION.md +219 -0
  10. package/docs/PAGE_OBJECT_MODEL_CONCEPT.md +1756 -0
  11. package/extension/background.js +643 -0
  12. package/extension/content.js +715 -0
  13. package/extension/icons/create-icons.js +164 -0
  14. package/extension/icons/icon128.png +0 -0
  15. package/extension/icons/icon16.png +0 -0
  16. package/extension/icons/icon48.png +0 -0
  17. package/extension/manifest.json +58 -0
  18. package/extension/popup/popup.css +437 -0
  19. package/extension/popup/popup.html +102 -0
  20. package/extension/popup/popup.js +415 -0
  21. package/extension/recorder-overlay.css +93 -0
  22. package/index.js +3347 -2901
  23. package/models/BaseInputModel.js +93 -0
  24. package/models/CheckboxGroupModel.js +199 -0
  25. package/models/CheckboxModel.js +103 -0
  26. package/models/ColorInputModel.js +53 -0
  27. package/models/DateInputModel.js +67 -0
  28. package/models/RadioGroupModel.js +126 -0
  29. package/models/RangeInputModel.js +60 -0
  30. package/models/SelectModel.js +97 -0
  31. package/models/TextInputModel.js +34 -0
  32. package/models/TextareaModel.js +59 -0
  33. package/models/TimeInputModel.js +49 -0
  34. package/models/index.js +122 -0
  35. package/package.json +3 -2
  36. package/pom/apom-converter.js +267 -0
  37. package/pom/apom-tree-converter.js +515 -0
  38. package/pom/element-id-generator.js +175 -0
  39. package/recorder/page-object-generator.js +16 -0
  40. package/recorder/scenario-executor.js +80 -2
  41. package/server/tool-definitions.js +839 -713
  42. package/server/tool-groups.js +1 -1
  43. package/server/tool-schemas.js +367 -326
  44. package/server/websocket-bridge.js +447 -0
  45. package/utils/selector-resolver.js +186 -0
  46. package/utils/ui-framework-detector.js +392 -0
  47. package/RELEASE_NOTES_v2.5.0.md +0 -109
  48. package/npm_publish_output.txt +0 -0
package/README.md CHANGED
@@ -8,16 +8,18 @@ MCP server for Chrome automation using Puppeteer with persistent browser session
8
8
  - [Usage](#usage)
9
9
  - [AI Optimization Features](#ai-optimization-features) ⭐ **NEW**
10
10
  - [Scenario Recorder](#scenario-recorder) ⭐ **NEW** - Visual UI-based recording with smart optimization
11
- - [Available Tools](#available-tools) - **44+ Tools Total**
12
- - [AI-Powered Tools](#ai-powered-tools) ⭐ **NEW** - smartFindElement, analyzePage, getAllInteractiveElements, findElementsByText
11
+ - [Available Tools](#available-tools) - **46+ Tools Total**
12
+ - [AI-Powered Tools](#ai-powered-tools) ⭐ **NEW** - smartFindElement, analyzePage, getElementByApomId, getAllInteractiveElements, findElementsByText
13
13
  - [Core Tools](#1-core-tools) - ping, openBrowser
14
- - [Interaction Tools](#2-interaction-tools) - click, type, scrollTo, selectOption, dragScroll, scrollHorizontal
14
+ - [Interaction Tools](#2-interaction-tools) - click, type, scrollTo, selectOption, selectFromGroup, drag, scrollHorizontal
15
15
  - [Inspection Tools](#3-inspection-tools) - getElement, getComputedCss, getBoxModel, screenshot
16
16
  - [Advanced Tools](#4-advanced-tools) - executeScript, getConsoleLogs, listNetworkRequests, getNetworkRequest, filterNetworkRequests, hover, setStyles, setViewport, getViewport, navigateTo
17
- - [Recorder Tools](#6-recorder-tools) ⭐ **NEW** - enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario, exportScenarioAsCode, appendScenarioToFile, generatePageObject
17
+ - [Tab Management Tools](#5-tab-management-tools) ⭐ **NEW** - listTabs, switchTab
18
+ - [Recorder Tools](#7-recorder-tools) ⭐ **NEW** - enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario, exportScenarioAsCode, appendScenarioToFile, generatePageObject
18
19
  - [Typical Workflow Example](#typical-workflow-example)
19
20
  - [Tool Usage Tips](#tool-usage-tips)
20
21
  - [Configuration](#configuration)
22
+ - [Multi-Instance Support](#multi-instance-support) ⭐ **NEW** - Run multiple MCP servers simultaneously
21
23
  - [WSL Setup Guide](#wsl-setup-guide) → [Full WSL Guide](WSL_SETUP.md)
22
24
  - [Development](#development)
23
25
  - [Features](#features)
@@ -185,24 +187,81 @@ Get current page state and structure. Returns complete map of forms (with values
185
187
  - **Parameters**:
186
188
  - `refresh` (optional): Force refresh cache to get CURRENT state after changes (default: false)
187
189
  - `includeAll` (optional): Include ALL page elements, not just interactive ones (default: false). Useful for layout work - find any element, get its selector, then use `getComputedCss` or `setStyles` on it.
190
+ - `useLegacyFormat` (optional): Return legacy format instead of APOM (default: false - **APOM is now the default**) 🔄 **BREAKING CHANGE**
191
+ - `registerElements` (optional): Auto-register elements for ID-based usage (default: true) ⭐ **APOM**
192
+ - `groupBy` (optional): 'type' or 'flat' - how to group elements (default: 'type') ⭐ **APOM**
188
193
  - **Why better than screenshot**:
189
194
  - Shows actual data (form values, validation errors) not just visual
190
195
  - Uses 2-5k tokens vs screenshot 15-25k tokens
191
- - Returns structured data with selectors
196
+ - Returns structured data with **unique element IDs** for easy interaction
197
+ - **Detects UI frameworks** (MUI, Ant Design, Chakra, Bootstrap, Vuetify, Semantic UI) ⭐
198
+ - **Extracts dropdown options** from both native `<select>` and custom UI components ⭐
192
199
  - **Returns**:
193
- - By default: Complete map of forms (with current values), inputs, buttons, links, navigation with selectors
194
- - With `includeAll: true`: Also includes `allElements` array with ALL visible page elements (divs, spans, headings, etc.) - each with selector, tag, text, classes, id
200
+ - **APOM format** (default): Tree-structured Page Object Model with unique IDs **NOW DEFAULT**
201
+ - `tree` - Hierarchical tree of page elements (optimized: ~82% smaller than flat format)
202
+ - Each node: `{ tag, id?, type?, sel, ch?, bounds?, meta? }`
203
+ - Interactive elements have `bounds` and full metadata
204
+ - Parent containers have minimal info (position only)
205
+ - `groups` - Radio/checkbox groups with options (name, value, label, checked state)
206
+ - `meta` - Page metadata (url, title, timestamp, element counts)
207
+ - Elements automatically registered - use IDs with `click({ id: "..." })`, `type({ id: "..." })`, etc.
208
+ - **Token-optimized**: Minified JSON, simplified parents, no redundant data
209
+ - Example: `analyzePage()` returns APOM, then use `click({ id: "button_45" })` or `type({ id: "input_20", text: "..." })`
210
+ - **Use `getElementByApomId({ id: "input_20" })`** to get full details for any element
211
+ - **Legacy format** (`useLegacyFormat: true`): Classic format for backward compatibility
212
+ - Complete map of forms (with current values), inputs, buttons, links, navigation with selectors
213
+ - **Each element includes `uiFramework` info** (name, version, component type) ⭐
214
+ - **Select elements include `options` array** with value, text, index, selected, disabled, group ⭐
215
+ - With `includeAll: true`: Also includes `allElements` array with ALL visible page elements (divs, spans, headings, etc.) - each with selector, tag, text, classes, id
195
216
  - **Example workflow**:
196
217
  1. `openBrowser({ url: "..." })`
197
- 2. `analyzePage()` ← Initial analysis
198
- 3. `click({ selector: "submit-btn" })`
199
- 4. **`analyzePage({ refresh: true })`** ← See what changed after click!
218
+ 2. `analyzePage()` ← Initial analysis, returns elements with IDs
219
+ 3. `type({ id: "input_20", text: "user@example.com" })` ← Use APOM ID
220
+ 4. `click({ id: "button_45" })` ← Use APOM ID
221
+ 5. **`analyzePage({ refresh: true })`** ← See what changed after click!
200
222
  - **Layout work example**:
201
223
  1. `analyzePage({ includeAll: true })` ← Get all elements
202
224
  2. Find element you want to style (e.g., `div.header`)
203
225
  3. `getComputedCss({ selector: "div.header" })` ← Get current styles
204
226
  4. `setStyles({ selector: "div.header", styles: [...] })` ← Apply new styles
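The APOM `tree` described for `analyzePage` can be explored with a plain depth-first walk. The following is a minimal sketch, not the package's own code: the sample tree is hypothetical, `collectApomIds` is an illustrative helper name, and treating `ch` as an array of child nodes is an assumption based on the documented node shape `{ tag, id?, type?, sel, ch?, bounds?, meta? }`.

```javascript
// Hypothetical APOM tree shaped like the documented node format.
// Assumption: `ch` holds an array of child nodes.
const sampleTree = {
  tag: "form",
  sel: "form#login",
  ch: [
    { tag: "input", id: "input_20", type: "email", sel: "input[name='email']" },
    { tag: "input", id: "input_21", type: "password", sel: "input[name='password']" },
    {
      tag: "div",
      sel: "div.actions",
      ch: [
        { tag: "button", id: "button_45", type: "submit", sel: "button[type='submit']" },
      ],
    },
  ],
};

// Depth-first walk that collects every node carrying an `id`.
function collectApomIds(node, found = []) {
  if (node.id) found.push({ id: node.id, tag: node.tag, sel: node.sel });
  for (const child of node.ch ?? []) collectApomIds(child, found);
  return found;
}

const apomIds = collectApomIds(sampleTree).map((n) => n.id);
console.log(apomIds); // ["input_20", "input_21", "button_45"]
```

Parent containers without an `id` are skipped, which matches the description that only interactive elements carry full metadata.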
205
227
 
228
+ #### getElementByApomId ⭐ **NEW**
229
+ Get detailed information about a specific element by its APOM ID from `analyzePage`. Use this to inspect elements without re-analyzing the entire page.
230
+ - **Parameters**:
231
+ - `id` (required): APOM element ID (e.g., `"input_20"`, `"button_45"`)
232
+ - **Use case**: Get full details for a specific element (bounds, attributes, computed styles)
233
+ - **Returns**: Element details including:
234
+ - `id`: Element APOM ID
235
+ - `selector`: CSS selector
236
+ - `tag`: HTML tag name
237
+ - `type`: Input type (for inputs)
238
+ - `text`: Visible text content
239
+ - `bounds`: `{ x, y, width, height }` position and size
240
+ - `attributes`: All HTML attributes
241
+ - `computedStyles`: Key CSS properties (display, visibility, color, background, etc.)
242
+ - `isVisible`: Whether element is visible
243
+ - `isEnabled`: Whether element is enabled (not disabled)
244
+ - **Example**:
245
+ ```javascript
246
+ // Get details for specific input field
247
+ getElementByApomId({ id: "input_20" })
248
+
249
+ // Returns:
250
+ {
251
+ "success": true,
252
+ "id": "input_20",
253
+ "selector": "input[name='email']",
254
+ "tag": "input",
255
+ "type": "email",
256
+ "text": "",
257
+ "bounds": { "x": 100, "y": 200, "width": 300, "height": 40 },
258
+ "attributes": { "name": "email", "placeholder": "Enter email" },
259
+ "computedStyles": { "display": "block", "visibility": "visible" },
260
+ "isVisible": true,
261
+ "isEnabled": true
262
+ }
263
+ ```
264
+
206
265
  #### getAllInteractiveElements
207
266
  Get all clickable/fillable elements with their selectors.
208
267
  - **Parameters**:
@@ -234,25 +293,45 @@ Opens browser and navigates to URL. Browser stays open for further interactions.
234
293
  ### 2. Interaction Tools
235
294
 
236
295
  #### click
237
- Click an element with optional result screenshot.
296
+ Click an element with optional result screenshot. **PREFERRED**: Use APOM ID from `analyzePage` for reliable targeting.
238
297
  - **Parameters**:
239
- - `selector` (required): CSS selector
298
+ - `id` (optional): APOM element ID from analyzePage (e.g., `"button_45"`, `"link_7"`). **Preferred over selector.**
299
+ - `selector` (optional): CSS selector. Use when APOM ID is not available.
300
+ - ⚠️ Either `id` OR `selector` required (mutually exclusive)
240
301
  - `waitAfter` (optional): Wait time in ms (default: 1500)
241
302
  - `screenshot` (optional): Capture screenshot (default: false for performance) ⚡
242
303
  - `timeout` (optional): Max operation time in ms (default: 30000)
243
304
  - **Use case**: Buttons, links, form submissions
244
305
  - **Returns**: Confirmation text + optional screenshot
245
306
  - **Performance**: 2-10x faster without screenshot
307
+ - **Example**:
308
+ ```javascript
309
+ // PREFERRED: Using APOM ID
310
+ click({ id: "button_45" })
311
+
312
+ // Alternative: Using CSS selector
313
+ click({ selector: "button[type='submit']" })
314
+ ```
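The "either `id` OR `selector`" rule can be pictured as a small guard that runs before a tool call. This is a sketch under assumptions: `resolveTarget` and its return shape are hypothetical and are not part of the chrometools-mcp API.

```javascript
// Hypothetical guard: exactly one of `id` or `selector` must be provided.
function resolveTarget({ id, selector } = {}) {
  const hasId = typeof id === "string" && id.length > 0;
  const hasSelector = typeof selector === "string" && selector.length > 0;
  // Both present or both missing violates the mutually exclusive rule.
  if (hasId === hasSelector) {
    throw new Error("Provide exactly one of `id` or `selector`");
  }
  return hasId
    ? { kind: "apomId", value: id }
    : { kind: "selector", value: selector };
}

console.log(resolveTarget({ id: "button_45" }).kind); // "apomId"
console.log(resolveTarget({ selector: "button[type='submit']" }).kind); // "selector"
```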
246
315
 
247
316
  #### type
248
- Type text into input fields with optional clearing and typing delay.
317
+ Type text into input fields with optional clearing and typing delay. **PREFERRED**: Use APOM ID from `analyzePage` for reliable targeting.
249
318
  - **Parameters**:
250
- - `selector` (required): CSS selector
319
+ - `id` (optional): APOM element ID from analyzePage (e.g., `"input_20"`). **Preferred over selector.**
320
+ - `selector` (optional): CSS selector. Use when APOM ID is not available.
321
+ - ⚠️ Either `id` OR `selector` required (mutually exclusive)
251
322
  - `text` (required): Text to type
252
323
  - `delay` (optional): Delay between keystrokes in ms
253
324
  - `clearFirst` (optional): Clear field first (default: true)
254
325
  - **Use case**: Filling forms, search boxes, text inputs
255
326
  - **Returns**: Confirmation text
327
+ - **Example**:
328
+ ```javascript
329
+ // PREFERRED: Using APOM ID
330
+ type({ id: "input_20", text: "user@example.com" })
331
+
332
+ // Alternative: Using CSS selector
333
+ type({ selector: "input[name='email']", text: "user@example.com" })
334
+ ```
256
335
 
257
336
  #### scrollTo
258
337
  Scroll page to bring element into view.
@@ -263,9 +342,11 @@ Scroll page to bring element into view.
263
342
  - **Returns**: Final scroll position
264
343
 
265
344
  #### selectOption
266
- Select option in dropdown (HTML select elements). Automatically detected by analyzePage with all available options.
345
+ Select option in dropdown (HTML select elements). **PREFERRED**: Use APOM ID from `analyzePage` for reliable targeting.
267
346
  - **Parameters**:
268
- - `selector` (required): CSS selector for select element
347
+ - `id` (optional): APOM element ID from analyzePage (e.g., `"select_5"`). **Preferred over selector.**
348
+ - `selector` (optional): CSS selector. Use when APOM ID is not available.
349
+ - ⚠️ Either `id` OR `selector` required (mutually exclusive)
269
350
  - `value` (optional): Option value attribute (priority 1)
270
351
  - `text` (optional): Option text content (priority 2)
271
352
  - `index` (optional): Option index, 0-based (priority 3)
@@ -273,6 +354,46 @@ Select option in dropdown (HTML select elements). Automatically detected by anal
273
354
  - **Returns**: Selected option details (value, text, index)
274
355
  - **Selection priority**: If multiple parameters specified, tries value → text → index
275
356
  - **AI Integration**: Use `analyzePage` to see all available options with their values, text, and indices
357
+ - **Example**:
358
+ ```javascript
359
+ // PREFERRED: Using APOM ID
360
+ selectOption({ id: "select_5", value: "US" })
361
+
362
+ // Alternative: Using CSS selector
363
+ selectOption({ selector: "select[name='country']", text: "United States" })
364
+ ```
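The value → text → index priority of `selectOption` can be sketched as a lookup over the `options` array that `analyzePage` reports. `pickOption` is a hypothetical helper, and falling through to the next criterion when one does not match is an assumption about what "tries" means here.

```javascript
// Hypothetical option list, using the documented fields value, text, index.
const countryOptions = [
  { value: "US", text: "United States", index: 0 },
  { value: "CA", text: "Canada", index: 1 },
];

// Resolve an option by priority: value first, then text, then index.
// Assumption: a criterion that matches nothing falls through to the next one.
function pickOption(options, { value, text, index } = {}) {
  if (value !== undefined) {
    const byValue = options.find((o) => o.value === value);
    if (byValue) return byValue;
  }
  if (text !== undefined) {
    const byText = options.find((o) => o.text === text);
    if (byText) return byText;
  }
  if (index !== undefined) return options[index] ?? null;
  return null;
}

// `value` wins even though `text` points at a different option.
console.log(pickOption(countryOptions, { value: "CA", text: "United States" }).text); // "Canada"
```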
365
+
366
+ #### selectFromGroup ⭐ **NEW**
367
+ Select option(s) from radio or checkbox group by name attribute. Works at abstract group level instead of individual clicks.
368
+ - **Parameters**:
369
+ - `name` (required): Name attribute of the radio/checkbox group (e.g., 'size', 'toppings')
370
+ - `value` (optional): Single value to select (for radio or single checkbox)
371
+ - `values` (optional): Array of values to select (for checkbox group)
372
+ - `text` (optional): Label text to match (alternative to value)
373
+ - `texts` (optional): Array of label texts to match (for checkbox group)
374
+ - `by` (optional): Match by 'value', 'text', or 'auto' (default: 'auto')
375
+ - `mode` (optional): For checkboxes - 'set' (replace all), 'add', 'remove', 'toggle' (default: 'set')
376
+ - **Use case**: Radio buttons, checkbox groups, form options
377
+ - **Returns**: Result with changes made and current selection state
378
+ - **AI Integration**: Use `analyzePage` to see available groups in `groups` section with all options and labels
379
+ - **Examples**:
380
+ ```javascript
381
+ // Radio group - select single option
382
+ selectFromGroup({ name: "size", value: "large" })
383
+ selectFromGroup({ name: "size", text: "Extra Large" })
384
+
385
+ // Checkbox group - set specific values (uncheck others)
386
+ selectFromGroup({ name: "toppings", values: ["cheese", "bacon"] })
387
+
388
+ // Checkbox group - add to existing selection
389
+ selectFromGroup({ name: "toppings", values: ["mushrooms"], mode: "add" })
390
+
391
+ // Checkbox group - remove specific values
392
+ selectFromGroup({ name: "toppings", values: ["onions"], mode: "remove" })
393
+
394
+ // Checkbox group - toggle values
395
+ selectFromGroup({ name: "toppings", texts: ["Extra Cheese"], mode: "toggle" })
396
+ ```
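The four `mode` values can be read as set operations on the currently checked values. The sketch below models only that arithmetic. `applyGroupMode` is a hypothetical name, and the real tool operates on DOM checkboxes rather than arrays.

```javascript
// Hypothetical model of the checkbox-group modes as set operations.
function applyGroupMode(current, values, mode = "set") {
  const selected = new Set(current);
  switch (mode) {
    case "set": // replace the whole selection
      return [...new Set(values)];
    case "add": // check the given values, keep the rest
      values.forEach((v) => selected.add(v));
      break;
    case "remove": // uncheck the given values, keep the rest
      values.forEach((v) => selected.delete(v));
      break;
    case "toggle": // flip each given value
      values.forEach((v) => (selected.has(v) ? selected.delete(v) : selected.add(v)));
      break;
    default:
      throw new Error(`Unknown mode: ${mode}`);
  }
  return [...selected];
}

const checked = ["cheese", "onions"];
console.log(applyGroupMode(checked, ["mushrooms"], "add")); // ["cheese", "onions", "mushrooms"]
console.log(applyGroupMode(checked, ["onions"], "remove")); // ["cheese"]
console.log(applyGroupMode(checked, ["cheese", "bacon"], "toggle")); // ["onions", "bacon"]
```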
276
397
 
277
398
  #### drag
278
399
  Drag element by mouse (click-hold-move-release). Simulates real mouse drag, not scrollbar scrolling.
@@ -419,10 +540,21 @@ Filter requests by URL pattern with full details.
419
540
  3. `filterNetworkRequests({ urlPattern: "api/..." })` - get all matching requests with details
420
541
 
421
542
  #### hover
422
- Simulate mouse hover over element.
423
- - **Parameters**: `selector` (required)
543
+ Simulate mouse hover over element. **PREFERRED**: Use APOM ID from `analyzePage` for reliable targeting.
544
+ - **Parameters**:
545
+ - `id` (optional): APOM element ID from analyzePage (e.g., `"button_10"`). **Preferred over selector.**
546
+ - `selector` (optional): CSS selector. Use when APOM ID is not available.
547
+ - ⚠️ Either `id` OR `selector` required (mutually exclusive)
424
548
  - **Use case**: Testing hover effects, tooltips, dropdown menus
425
549
  - **Returns**: Confirmation text
550
+ - **Example**:
551
+ ```javascript
552
+ // PREFERRED: Using APOM ID
553
+ hover({ id: "button_10" })
554
+
555
+ // Alternative: Using CSS selector
556
+ hover({ selector: ".dropdown-trigger" })
557
+ ```
426
558
 
427
559
  #### setStyles
428
560
  Apply inline CSS styles to element for live editing.
@@ -455,7 +587,49 @@ Navigate to different URL while keeping browser instance.
455
587
  - **Use case**: Moving between pages in workflow
456
588
  - **Returns**: New page title
457
589
 
458
- ### 5. Figma Tools ⭐ ENHANCED
590
+ ### 5. Tab Management Tools ⭐ NEW
591
+
592
+ Tools for managing multiple browser tabs. New tabs opened via `window.open()`, `target="_blank"`, or user actions are automatically detected and tracked.
593
+
594
+ #### listTabs
595
+ List all open browser tabs with their URLs, titles, and active status.
596
+ - **Parameters**: None
597
+ - **Returns**:
598
+ - `tabs`: Array of `{ index, url, title, isActive }`
599
+ - `totalCount`: Number of open tabs
600
+ - `newTabsDetected` (optional): Array of tabs opened since last check
601
+ - **Use case**: See all open tabs, check for newly opened tabs
602
+
603
+ ```javascript
604
+ // Example response
605
+ {
606
+ "tabs": [
607
+ { "index": 0, "url": "https://example.com", "title": "Example", "isActive": false },
608
+ { "index": 1, "url": "https://google.com", "title": "Google", "isActive": true }
609
+ ],
610
+ "totalCount": 2,
611
+ "newTabsDetected": [
612
+ { "timestamp": "2026-01-25T...", "url": "https://google.com", "openerUrl": "https://example.com" }
613
+ ]
614
+ }
615
+ ```
616
+
617
+ #### switchTab
618
+ Switch to a different browser tab by index or URL pattern.
619
+ - **Parameters**:
620
+ - `tab` (required): Tab index (number, 0-based) or URL pattern (string, partial match)
621
+ - **Use case**: Switch between tabs for multi-tab workflows
622
+ - **Returns**: `{ success, switchedTo: { url, title } }`
623
+
624
+ ```javascript
625
+ // Switch by index
626
+ switchTab({ tab: 0 })
627
+
628
+ // Switch by URL pattern
629
+ switchTab({ tab: "google.com" })
630
+ ```
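How a `tab` argument might map onto the `listTabs` result can be sketched as follows. `findTab` is a hypothetical helper. Treating "partial match" as a substring test, and returning the first matching tab, are both assumptions.

```javascript
// Tab list shaped like the documented `listTabs` response.
const openTabs = [
  { index: 0, url: "https://example.com", title: "Example", isActive: false },
  { index: 1, url: "https://google.com", title: "Google", isActive: true },
];

// Hypothetical resolver: a number selects by index, a string by URL substring.
function findTab(tabs, tab) {
  if (typeof tab === "number") {
    return tabs.find((t) => t.index === tab) ?? null;
  }
  if (typeof tab === "string") {
    return tabs.find((t) => t.url.includes(tab)) ?? null;
  }
  throw new TypeError("`tab` must be a number or a string");
}

console.log(findTab(openTabs, 0).title); // "Example"
console.log(findTab(openTabs, "google.com").title); // "Google"
```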
631
+
632
+ ### 6. Figma Tools ⭐ ENHANCED
459
633
 
460
634
  Design-to-code validation, file browsing, design system extraction, and comparison tools with automatic 3 MB compression.
461
635
 
@@ -601,7 +775,7 @@ Extract detailed design specifications from Figma including text content, colors
601
775
  - **Dimensions**: Width, height, x, y coordinates
602
776
  - **Children**: Recursive tree with text extraction from all child elements
603
777
 
604
- ### 6. Recorder Tools ⭐ NEW
778
+ ### 7. Recorder Tools ⭐ NEW
605
779
 
606
780
  **URL-Based Storage (v2.1+)**: Scenarios are automatically organized by website domain in `~/.config/chrometools-mcp/projects/{domain}/scenarios/`.
607
781
 
@@ -970,18 +1144,29 @@ Generate Page Object Model (POM) class from current page structure. Analyzes pag
970
1144
  // 1. Open page
971
1145
  openBrowser({ url: "https://example.com/form" })
972
1146
 
973
- // 2. Fill form
974
- type({ selector: "input[name='email']", text: "user@example.com" })
975
- type({ selector: "input[name='password']", text: "secret123" })
1147
+ // 2. Analyze page to get element IDs
1148
+ analyzePage()
1149
+ // Returns: { tree: {...}, groups: {...}, meta: {...} }
1150
+ // Elements: input_20 (email), input_21 (password), button_45 (submit)
976
1151
 
977
- // 3. Submit
978
- click({ selector: "button[type='submit']" })
1152
+ // 3. Fill form using APOM IDs (preferred)
1153
+ type({ id: "input_20", text: "user@example.com" })
1154
+ type({ id: "input_21", text: "secret123" })
1155
+
1156
+ // 4. Submit using APOM ID
1157
+ click({ id: "button_45" })
979
1158
 
980
- // 4. Verify
981
- getElement({ selector: ".success-message" })
1159
+ // 5. Verify
1160
+ analyzePage({ refresh: true }) // See updated state
982
1161
  screenshot({ selector: ".dashboard", padding: 20 })
983
1162
  ```
984
1163
 
1164
+ **Alternative: Using CSS selectors (still supported)**
1165
+ ```javascript
1166
+ type({ selector: "input[name='email']", text: "user@example.com" })
1167
+ click({ selector: "button[type='submit']" })
1168
+ ```
1169
+
985
1170
  ---
986
1171
 
987
1172
  ## Tool Usage Tips
@@ -1250,17 +1435,29 @@ npx @modelcontextprotocol/inspector node index.js
1250
1435
 
1251
1436
  ## Features
1252
1437
 
1253
- - **27+ Powerful Tools**: Complete toolkit for browser automation
1438
+ - **44+ Powerful Tools**: Complete toolkit for browser automation
1254
1439
  - Core: ping, openBrowser
1255
- - Interaction: click, type, scrollTo
1256
- - Inspection: getElement, getComputedCss, getBoxModel, screenshot
1257
- - Advanced: executeScript, getConsoleLogs, getNetworkRequests, hover, setStyles, setViewport, getViewport, navigateTo
1258
- - AI-Powered: smartFindElement, analyzePage, getAllInteractiveElements, findElementsByText
1259
- - Recorder: enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario
1260
- - Figma: getFigmaFrame, compareFigmaToElement, getFigmaSpecs
1440
+ - Interaction: click, type, scrollTo, selectOption, selectFromGroup, drag, scrollHorizontal
1441
+ - Inspection: getElement, getComputedCss, getBoxModel, screenshot, saveScreenshot
1442
+ - Advanced: executeScript, getConsoleLogs, listNetworkRequests, getNetworkRequest, filterNetworkRequests, hover, setStyles, setViewport, getViewport, navigateTo, waitForElement
1443
+ - AI-Powered: smartFindElement, analyzePage, getElementByApomId, getAllInteractiveElements, findElementsByText ⭐ **NEW**
1444
+ - Recorder: enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario, exportScenarioAsCode, appendScenarioToFile, generatePageObject
1445
+ - Figma: getFigmaFrame, compareFigmaToElement, getFigmaSpecs, parseFigmaUrl, listFigmaPages, searchFigmaFrames, getFigmaComponents, getFigmaStyles, getFigmaColorPalette, convertFigmaToCode
1446
+ - **UI Framework Detection**: Automatic detection of MUI, Ant Design, Chakra UI, Bootstrap, Vuetify, Semantic UI ⭐ **NEW**
1447
+ - **Smart Dropdown Handling**: Extracts options from both native `<select>` and custom UI framework components ⭐ **NEW**
1448
+ - **APOM (Agent Page Object Model)**: Automatic element ID assignment for reliable interaction ⭐ **NEW**
1449
+ - `analyzePage()` returns elements with unique IDs (e.g., `input_20`, `button_45`)
1450
+ - Use `id` parameter in click/type/hover/selectOption for stable targeting
1451
+ - Use `getElementByApomId()` to get detailed element info
1261
1452
  - **Console Log Capture**: Automatic JavaScript console monitoring
1262
1453
  - **Network Request Monitoring**: Track all HTTP/API requests (XHR, Fetch, etc.)
1263
1454
  - **Persistent Browser Sessions**: Browser tabs remain open between requests
1455
+ - **Multi-Instance Support**: Run multiple MCP servers simultaneously with automatic discovery ⭐ **NEW**
1456
+ - Dynamic port allocation (9223-9227)
1457
+ - Chrome Extension port scanning every 20s
1458
+ - Broadcast pattern for parallel AI clients
1459
+ - Graceful handling of ungraceful shutdowns
1460
+ - **Auto-Sync Active Tab**: MCP server automatically syncs to user's currently active tab ⭐ **NEW**
1264
1461
  - **Visual Browser (GUI Mode)**: See automation in real-time
1265
1462
  - **Cross-platform**: Works on Windows/WSL, Linux, macOS
1266
1463
  - **Simple Installation**: One command with npx
@@ -1268,9 +1465,180 @@ npx @modelcontextprotocol/inspector node index.js
1268
1465
  - **AI-Friendly**: Detailed descriptions optimized for AI agents
1269
1466
  - **Responsive Testing**: Built-in viewport control for mobile/tablet/desktop
1270
1467
 
1468
+ ## Multi-Instance Support
1469
+
1470
+ ⭐ **NEW**: Run up to 8 MCP servers simultaneously, connecting/disconnecting at any time without coordination.
1471
+
1472
+ ### Overview
1473
+
1474
+ ChromeTools MCP uses a **Bridge Architecture** for reliable multi-instance support:
1475
+
1476
+ - **Multiple AI clients** (0-8) can connect/disconnect at any time
1477
+ - **No scanning delays** — instant connection to persistent Bridge Service
1478
+ - **Resilient** — Bridge survives MCP process crashes, maintains state
1479
+ - **Chrome lifecycle** — Bridge starts/stops with Chrome Extension
1480
+
1481
+ ### How It Works
1482
+
1483
+ ```
1484
+ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
1485
+ │ Claude Desktop │ │ Telegram Bot │ │ Custom Script │
1486
+ │ MCP Client │ │ MCP Client │ │ MCP Client │
1487
+ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘
1488
+ │ │ │
1489
+ │ WebSocket │ WebSocket │ WebSocket
1490
+ │ (client) │ (client) │ (client)
1491
+ │ │ │
1492
+ └────────────────────┼────────────────────┘
1493
+
1494
+
1495
+ ┌───────────────────────────────┐
1496
+ │ Bridge Service (:9223) │
1497
+ │ (Native Messaging Host) │
1498
+ │ │
1499
+ │ • Stores tabs state │
1500
+ │ • Stores recordings │
1501
+ │ • Broadcasts events │
1502
+ │ • Accepts 0-8 clients │
1503
+ └───────────────┬───────────────┘
1504
+
1505
+ │ Native Messaging (stdio)
1506
+
1507
+ ┌───────────────┴───────────────┐
1508
+ │ Chrome Extension │
1509
+ │ (Event Producer) │
1510
+ │ │
1511
+ │ • Tracks all tabs │
1512
+ │ • Records user actions │
1513
+ │ • Sends events to Bridge │
1514
+ └───────────────┬───────────────┘
1515
+
1516
+
1517
+ ┌───────────────────────────────┐
1518
+ │ Chrome Browser │
1519
+ └───────────────────────────────┘
1520
+ ```
1521
+
1522
+ ### Installation
1523
+
1524
+ **One-time setup** (installs Native Messaging Bridge):
1525
+
1526
+ ```bash
1527
+ npx chrometools-mcp --install-bridge
1528
+ ```
1529
+
1530
+ This:
1531
+ 1. Creates Bridge Service files in `~/.chrometools/`
1532
+ 2. Registers Native Messaging Host in system (Windows Registry / Chrome config)
1533
+ 3. Bridge will auto-start when Chrome Extension loads
1534
+
1535
+ **Verify installation:**
1536
+ ```bash
1537
+ npx chrometools-mcp --check-bridge
1538
+ ```
1539
+
1540
+ ### Architecture
1541
+
1542
+ **1. Bridge Service (Persistent Intermediary)**
1543
+ - Launched by Chrome via Native Messaging when Extension starts
1544
+ - Runs WebSocket server on port 9223
1545
+ - Stores state: tabs, recordings, recorder state
1546
+ - Lives as long as Chrome is running
1547
+ - Accepts 0-8 simultaneous MCP clients
1548
+
1549
+ **2. Chrome Extension (Event Producer)**
1550
+ - Tracks all browser tabs (created, updated, closed, activated)
1551
+ - Records user actions (clicks, typing, navigation)
1552
+ - Sends ALL events to Bridge via Native Messaging
1553
+ - Doesn't care about MCP clients — just produces events
1554
+
1555
+ **3. MCP Server (Event Consumer)**
1556
+ - Connects to Bridge as WebSocket client
1557
+ - Receives full state immediately on connect
1558
+ - Gets real-time event updates
1559
+ - Can disconnect/reconnect at any time without losing state
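The three roles above can be modeled in memory to show why a reconnecting client does not lose state. This is a toy sketch with no networking. `BridgeModel`, the `tabCreated` event name, and the message shapes are hypothetical and do not reflect the package's actual protocol.

```javascript
// Toy in-memory model of the Bridge: holds state, caps clients, broadcasts events.
class BridgeModel {
  constructor(maxClients = 8) {
    this.maxClients = maxClients;
    this.clients = new Set();
    this.state = { tabs: [], recordings: [] };
  }

  // New clients receive a snapshot of the full state immediately.
  connect(client) {
    if (this.clients.size >= this.maxClients) return false;
    this.clients.add(client);
    client.receive({ type: "state", state: JSON.parse(JSON.stringify(this.state)) });
    return true;
  }

  disconnect(client) {
    this.clients.delete(client);
  }

  // Events from the extension update stored state and go to every client.
  publish(event) {
    if (event.type === "tabCreated") this.state.tabs.push(event.tab);
    for (const client of this.clients) client.receive(event);
  }
}

const makeClient = () => {
  const inbox = [];
  return { inbox, receive: (message) => inbox.push(message) };
};

const bridge = new BridgeModel();
const firstClient = makeClient();
bridge.connect(firstClient);
bridge.publish({ type: "tabCreated", tab: { index: 0, url: "https://example.com" } });
bridge.disconnect(firstClient);

// A client that connects later still sees the tab created earlier.
const lateClient = makeClient();
bridge.connect(lateClient);
console.log(lateClient.inbox[0].state.tabs.length); // 1
```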
1560
+
1561
+ ### Use Cases
1562
+
1563
+ **Ephemeral AI Sessions**
1564
+ ```bash
1565
+ # User sends message to Telegram bot
1566
+ # → Claude Code starts, connects to Bridge
1567
+ # → Gets current tabs state instantly
1568
+ # → Performs automation
1569
+ # → Claude Code exits, disconnects
1570
+ # → Bridge keeps running, state preserved
1571
+
1572
+ # Next message: same flow, instant state access
1573
+ ```
1574
+
1575
+ **Parallel Workflows**
1576
+ ```bash
1577
+ # Claude Desktop: form automation
1578
+ # Telegram Bot: monitoring & debugging
1579
+ # Custom script: data extraction
1580
+
1581
+ # All connected to same Bridge
1582
+ # All see same browser state
1583
+ # All can control Chrome
1584
+ ```
1585
+
1586
+ ### Configuration
1587
+
1588
+ No configuration needed after installation. Just use:
1589
+
1590
+ ```bash
1591
+ npx chrometools-mcp
1592
+ ```
1593
+
1594
+ MCP automatically connects to Bridge on startup.
1595
+
1596
+ ### CLI Options
1597
+
1598
+ ```bash
1599
+ npx chrometools-mcp --install-bridge # Install Native Messaging Bridge
1600
+ npx chrometools-mcp --uninstall-bridge # Uninstall Bridge
1601
+ npx chrometools-mcp --check-bridge # Check if Bridge is installed
1602
+ npx chrometools-mcp --help # Show help
1603
+ ```
1604
+
1605
+ ### Technical Details
1606
+
1607
+ | Component | Technology | Port |
1608
+ |-----------|------------|------|
1609
+ | Bridge Service | Node.js + WebSocket Server | 9223 |
1610
+ | Extension ↔ Bridge | Native Messaging (stdio) | — |
1611
+ | MCP ↔ Bridge | WebSocket (client) | 9223 |
1612
+
1613
+ **Max Clients:** 8 simultaneous MCP connections
1614
+
1615
+ **State on Connect:** Full state (tabs, recordings, recorder state) sent immediately
1616
+
1617
+ **Extension ID:** `dmehkibmncgphijnigkahhlekgajhpbl` (stable, generated from key)
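For the "Native Messaging (stdio)" row, Chrome's native messaging protocol frames each message as a 32-bit length in native byte order followed by UTF-8 JSON. The sketch below shows that framing in Node.js. It assumes a little-endian host, the function names are hypothetical, and the `tabCreated` payload is made up for illustration. It is not taken from `bridge-service.js`.

```javascript
// Frame a message: 4-byte length header + UTF-8 JSON body.
// Assumption: little-endian host (the protocol uses native byte order).
function encodeNativeMessage(message) {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  const header = Buffer.alloc(4);
  header.writeUInt32LE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Try to read one framed message; returns null until a full frame is buffered.
function decodeNativeMessage(buffer) {
  if (buffer.length < 4) return null;
  const length = buffer.readUInt32LE(0);
  if (buffer.length < 4 + length) return null;
  const message = JSON.parse(buffer.subarray(4, 4 + length).toString("utf8"));
  return { message, rest: buffer.subarray(4 + length) };
}

const frame = encodeNativeMessage({ type: "tabCreated", url: "https://example.com" });
const decoded = decodeNativeMessage(frame);
console.log(decoded.message.type); // "tabCreated"
```

Returning `rest` lets a caller keep decoding when several frames arrive in one stdin chunk.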
1618
+
1619
+ ### Troubleshooting
1620
+
1621
+ **Bridge not connecting:**
1622
+ ```bash
1623
+ # Check if Bridge is installed
1624
+ npx chrometools-mcp --check-bridge
1625
+
1626
+ # Reinstall if needed
1627
+ npx chrometools-mcp --install-bridge
1628
+
1629
+ # Reload extension in chrome://extensions
1630
+ ```
1631
+
1632
+ **Extension shows "Disconnected":**
1633
+ - Bridge only runs when Chrome Extension is active
1634
+ - Close and reopen Chrome
1635
+ - Check Extension Service Worker console for errors
1636
+
1271
1637
  ## Architecture
1272
1638
 
1273
- - Uses Puppeteer for Chrome automation
1274
- - MCP Server SDK for protocol implementation
1275
- - Zod for schema validation
1276
- - Stdio transport for communication
1639
+ - **Puppeteer** for Chrome automation
1640
+ - **MCP Server SDK** for protocol implementation
1641
+ - **Native Messaging Bridge** for persistent Extension ↔ MCP communication
1642
+ - **WebSocket** for multi-client support (Bridge as server, MCP as clients)
1643
+ - **Zod** for schema validation
1644
+ - **Stdio transport** for MCP communication