chrometools-mcp 1.0.1 → 1.3.5

package/CHANGELOG.md ADDED
@@ -0,0 +1,193 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ ## [1.3.5] - 2025-01-26
6
+
7
+ ### Added
8
+ - **Request/Response payload and headers now included in getNetworkRequests**
9
+ - `postData` - POST request body (e.g., form data, JSON payload)
10
+ - `requestHeaders` - Request headers
11
+ - `responseHeaders` - Response headers
12
+
13
+ ### Changed
14
+ - `getNetworkRequests` now returns complete request/response details
15
+ - Essential for debugging API calls with payloads
16
+
17
+ ### Example
18
+ ```javascript
19
+ getNetworkRequests({ urlPattern: 'send_otp' })
20
+
21
+ // Now returns:
22
+ {
23
+ "url": "http://localhost:4200/api/auth/send_otp/",
24
+ "method": "POST",
25
+ "postData": "{\"phone\":\"+79001234567\"}", // ← NEW!
26
+ "requestHeaders": { // ← NEW!
27
+ "content-type": "application/json",
28
+ "authorization": "Bearer ..."
29
+ },
30
+ "responseHeaders": { // ← NEW!
31
+ "content-type": "application/json"
32
+ },
33
+ "status": 200,
34
+ ...
35
+ }
36
+ ```
37
+
38
+ ## [1.3.4] - 2025-01-26
39
+
40
+ ### Fixed
41
+ - **Network monitoring now persists across page navigations** - auto-reinitializes on navigation
42
+ - Network requests are now captured correctly after form submissions, link clicks, and redirects
43
+ - Added WeakSet tracking to prevent duplicate CDP session setup
44
+ - Added 100ms debounce on navigation to ensure stability
45
+
46
+ ### Changed
47
+ - Refactored network monitoring into `setupNetworkMonitoring()` helper function
48
+ - Network monitoring automatically re-enables on framenavigated events
49
+ - Global `networkRequests[]` array preserves history across all navigations
50
+
51
+ ### Technical Details
52
+ - CDP (Chrome DevTools Protocol) session is recreated on each navigation
53
+ - Network.enable is automatically re-sent after navigation completes
54
+ - Request history accumulates across multiple pages in the same session
55
+ - Use `getNetworkRequests({ clear: true })` to reset history when needed
56
+
57
+ ### Example Use Case
58
+ ```javascript
59
+ // 1. Open login page
60
+ openBrowser({ url: 'https://app.com/login' })
61
+ // Network monitoring: ✅ active
62
+
63
+ // 2. Fill form and submit (navigates to /dashboard)
64
+ click({ selector: 'button[type="submit"]' })
65
+ // Network monitoring: ✅ auto-reinitialized
66
+ // Captures POST /api/login, GET /dashboard, etc.
67
+
68
+ // 3. Check all requests from both pages
69
+ getNetworkRequests({ types: ['XHR', 'Fetch'] })
70
+ // Returns requests from /login AND /dashboard
71
+ ```
72
+
73
+ ## [1.3.3] - 2025-01-26
74
+
75
+ ### Added
76
+ - `getNetworkRequests` tool - monitor all network requests (XHR, Fetch, API calls, resources)
77
+ - Network monitoring via Chrome DevTools Protocol (CDP)
78
+ - Automatic capture of all HTTP/HTTPS requests from page load
79
+ - Filter requests by type (XHR, Fetch, Script, Document, etc.)
80
+ - Filter by status (pending, completed, failed)
81
+ - Filter by URL pattern (regex support)
82
+ - Request details include: URL, method, status, headers, timing, cache info, errors
83
+
84
+ ### Changed
85
+ - Network.enable added to CDP session setup in getOrCreatePage
86
+ - Global networkRequests array for request storage
87
+
88
+ ### Examples
89
+ ```javascript
90
+ // Get all network requests
91
+ getNetworkRequests()
92
+
93
+ // Get only XHR and Fetch requests (API calls)
94
+ getNetworkRequests({
95
+ types: ['XHR', 'Fetch']
96
+ })
97
+
98
+ // Get failed requests
99
+ getNetworkRequests({
100
+ status: 'failed'
101
+ })
102
+
103
+ // Get requests to specific API
104
+ getNetworkRequests({
105
+ urlPattern: 'api\\.example\\.com'
106
+ })
107
+
108
+ // Get requests and clear history
109
+ getNetworkRequests({
110
+ types: ['XHR', 'Fetch'],
111
+ clear: true
112
+ })
113
+ ```
114
+
115
+ ## [1.3.2] - 2025-01-26
116
+
117
+ ### Added
118
+ - `action` parameter for `smartFindElement` - perform actions (click, type, scrollTo, screenshot, hover, setStyles) on the best match immediately
119
+ - `action` parameter for `findElementsByText` - perform actions on the first matching element immediately
120
+ - New helper function `executeElementAction` for unified action execution
121
+
122
+ ### Changed
123
+ - `smartFindElement` can now execute actions on found elements in a single call
124
+ - `findElementsByText` can now execute actions on found elements in a single call
125
+ - Reduces need for separate find + action calls, improving performance
126
+
127
+ ### Examples
128
+ ```javascript
129
+ // Find and click in one call
130
+ smartFindElement({
131
+ description: 'login button',
132
+ action: { type: 'click' }
133
+ })
134
+
135
+ // Find and type in one call
136
+ findElementsByText({
137
+ text: 'Email',
138
+ action: { type: 'type', text: 'user@example.com' }
139
+ })
140
+
141
+ // Find, style and screenshot
142
+ smartFindElement({
143
+ description: 'submit button',
144
+ action: {
145
+ type: 'setStyles',
146
+ styles: [{ name: 'background', value: 'red' }],
147
+ screenshot: true
148
+ }
149
+ })
150
+ ```
151
+
152
+ ## [1.3.1] - 2025-01-26
153
+
154
+ ### Performance Improvements
155
+ - **BREAKING BEHAVIOR CHANGE**: `click` and `executeScript` commands no longer capture screenshots by default
156
+ - Screenshots were causing significant performance overhead (2-10x slowdown)
157
+ - Add `screenshot: true` parameter to explicitly request screenshots when needed
158
+ - Existing calls continue to work without code changes; only the default output changes, since no screenshot is returned unless `screenshot: true` is passed
159
+
160
+ ### Added
161
+ - `screenshot` parameter for `click` command (boolean, default: `false`)
162
+ - `screenshot` parameter for `executeScript` command (boolean, default: `false`)
163
+ - `timeout` parameter for `click` command (number, in ms; default: `30000`)
163
+ - `timeout` parameter for `executeScript` command (number, in ms; default: `30000`)
165
+
166
+ ### Changed
167
+ - `click` command now executes 2-10x faster without screenshots
168
+ - `executeScript` command now executes 2-10x faster without screenshots
169
+ - Both commands now have a 30-second timeout by default to prevent hanging
170
+
171
+ ### Fixed
172
+ - Commands no longer hang indefinitely if operations fail
173
+ - Reduced memory usage by not capturing unnecessary screenshots
174
+
175
+ ### Migration
176
+ If you relied on automatic screenshots, add `screenshot: true` to your calls:
177
+ ```javascript
178
+ // Before (v1.3.0 and earlier)
179
+ await click({ selector: 'button' }) // Always included screenshot
180
+
181
+ // After (v1.3.1+)
182
+ await click({ selector: 'button', screenshot: true }) // Explicitly request screenshot
183
+ await click({ selector: 'button' }) // Fast mode (no screenshot)
184
+ ```
185
+
186
+ ## [1.3.0] - Previous version
187
+ - Scenario recorder with auto-reinjection
188
+ - Smart element finder
189
+ - Page analysis tools
190
+ - Figma integration
191
+
192
+ ## Earlier versions
193
+ See git history for details.
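The 1.3.4 entry above mentions WeakSet tracking to prevent duplicate CDP session setup. A minimal sketch of that idea follows, assuming the guard is keyed on the page object. `monitoredPages` and `setupOnce` are hypothetical names, not taken from the package source, and the 100ms navigation debounce is not shown.

```javascript
// Hypothetical sketch: these names do not come from chrometools-mcp itself.
// A WeakSet lets closed pages be garbage-collected without manual cleanup.
const monitoredPages = new WeakSet();

// Runs `setup(page)` at most once per page object; returns true if it ran.
function setupOnce(page, setup) {
  if (monitoredPages.has(page)) return false; // already wired up, skip
  monitoredPages.add(page);
  setup(page);
  return true;
}

// Plain objects stand in for Puppeteer pages in this illustration.
const pageA = { name: 'A' };
let setupCalls = 0;
const firstRun = setupOnce(pageA, () => { setupCalls += 1; });
const secondRun = setupOnce(pageA, () => { setupCalls += 1; });
```

Here the second call is a no-op, so listeners would be attached once per page however many times navigation triggers re-initialization.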
package/README.md CHANGED
@@ -6,11 +6,15 @@ MCP server for Chrome automation using Puppeteer with persistent browser session
6
6
 
7
7
  - [Installation](#installation)
8
8
  - [Usage](#usage)
9
- - [Available Tools](#available-tools) - **16 Tools Total**
9
+ - [AI Optimization Features](#ai-optimization-features) **NEW**
10
+ - [Scenario Recorder](#scenario-recorder) ⭐ **NEW** - Visual UI-based recording with smart optimization
11
+ - [Available Tools](#available-tools) - **26+ Tools Total**
12
+ - [AI-Powered Tools](#ai-powered-tools) ⭐ **NEW** - smartFindElement, analyzePage, getAllInteractiveElements, findElementsByText
10
13
  - [Core Tools](#1-core-tools) - ping, openBrowser
11
14
  - [Interaction Tools](#2-interaction-tools) - click, type, scrollTo
12
15
  - [Inspection Tools](#3-inspection-tools) - getElement, getComputedCss, getBoxModel, screenshot
13
- - [Advanced Tools](#4-advanced-tools) - executeScript, getConsoleLogs, hover, setStyles, setViewport, getViewport, navigateTo
16
+ - [Advanced Tools](#4-advanced-tools) - executeScript, getConsoleLogs, getNetworkRequests, hover, setStyles, setViewport, getViewport, navigateTo
17
+ - [Recorder Tools](#5-recorder-tools) ⭐ **NEW** - enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario
14
18
  - [Typical Workflow Example](#typical-workflow-example)
15
19
  - [Tool Usage Tips](#tool-usage-tips)
16
20
  - [Configuration](#configuration)
@@ -40,8 +44,119 @@ Add to your MCP client configuration (e.g., Claude Desktop):
40
44
  }
41
45
  ```
42
46
 
47
+ ## AI Optimization Features
48
+
49
+ ⭐ **NEW**: Dramatically reduce AI agent request cycles with intelligent element finding and page analysis.
50
+
51
+ ### Why This Matters
52
+
53
+ Traditional browser automation with AI requires many trial-and-error cycles:
54
+ ```
55
+ AI: "Find login button"
56
+ → Try selector #1: Not found
57
+ → Try selector #2: Not found
58
+ → Try selector #3: Found! (3 requests, 15-30 seconds)
59
+ ```
60
+
61
+ **With AI optimization:**
62
+ ```
63
+ AI: smartFindElement("login button")
64
+ → Returns ranked candidates with confidence scores (1 request, 2 seconds)
65
+ ```
66
+
67
+ ### Key Features
68
+
69
+ 1. **`smartFindElement`** - Natural language element search with multilingual support
70
+ 2. **`analyzePage`** - Complete page structure in one request (cached)
71
+ 3. **AI Hints** - Automatic context in all tools (page type, available actions, suggestions)
72
+ 4. **Batch helpers** - `getAllInteractiveElements`, `findElementsByText`
73
+
74
+ **Performance:** 3-5x faster with 5-10x fewer requests than the trial-and-error selector flow shown above
75
+
76
+ 📚 [Full AI Optimization Guide](AI_OPTIMIZATION.md)
77
+
78
+ ## Scenario Recorder
79
+
80
+ ⭐ **NEW**: Visual UI-based recorder for creating reusable test scenarios with automatic secret detection.
81
+
82
+ ### Features
83
+
84
+ - **Visual Widget** - Floating recorder UI with compact mode (50x50px minimize button)
85
+ - **Auto-Reinjection** - Recorder persists across page reloads/navigation automatically with duplicate prevention ⭐ **IMPROVED**
86
+ - **Smart Click Detection** - Finds actual clickable parent elements with event listeners ⭐ **NEW**
87
+ - **Smart Waiters** - 2s minimum + animation/network/DOM change detection after clicks ⭐ **NEW**
88
+ - **Detailed Error Reports** - Comprehensive failure analysis with context and suggestions ⭐ **NEW**
89
+ - **Smart Recording** - Captures clicks, typing, navigation with intelligent optimization
90
+ - **Secret Detection** - Auto-detects passwords/emails and stores them securely
91
+ - **Action Optimization** - Combines sequential actions, removes duplicates
92
+ - **Scenario Management** - Save, load, execute, search, and delete scenarios
93
+ - **Dependencies** - Chain scenarios together with dependency resolution
94
+ - **Multi-Instance Protection** - Prevents multiple recorder instances from interfering ⭐ **NEW**
95
+
96
+ ### Quick Start
97
+
98
+ ```javascript
99
+ // 1. Enable recorder UI
100
+ enableRecorder()
101
+
102
+ // 2. Click "Start" in widget, perform actions, click "Stop & Save"
103
+ // 3. Execute saved scenario
104
+ executeScenario({ name: "login_flow", parameters: { email: "user@test.com" } })
105
+ ```
106
+
107
+ 📚 [Full Recorder Guide](RECORDER_QUICKSTART.md) | [Recorder Spec](RECORDER_SPEC.md)
108
+
43
109
  ## Available Tools
44
110
 
111
+ ### AI-Powered Tools
112
+
113
+ #### smartFindElement ⭐
114
+ Find elements using natural language descriptions instead of CSS selectors.
115
+ - **Parameters**:
116
+ - `description` (required): Natural language (e.g., "login button", "email field")
117
+ - `maxResults` (optional): Max candidates to return (default: 5)
118
+ - **Use case**: When you don't know the exact selector
119
+ - **Returns**: Ranked candidates with confidence scores, selectors, and reasoning
120
+ - **Example**:
121
+ ```json
122
+ {
123
+ "description": "submit button",
124
+ "maxResults": 3
125
+ }
126
+ ```
127
+ Returns:
128
+ ```json
129
+ {
130
+ "candidates": [
131
+ { "selector": "button.login-btn", "confidence": 0.95, "text": "Login", "reason": "type=submit, in form, matching keyword" },
132
+ { "selector": "#submit", "confidence": 0.7, "text": "Send", "reason": "submit class" }
133
+ ],
134
+ "hints": { "suggestion": "Use selector: button.login-btn" }
135
+ }
136
+ ```
137
+
138
+ #### analyzePage ⭐
139
+ Get complete page structure in one request. Results are cached per URL.
140
+ - **Parameters**:
141
+ - `refresh` (optional): Force refresh cache (default: false)
142
+ - **Use case**: Understanding page structure before planning actions
143
+ - **Returns**: Complete map of forms, inputs, buttons, links, navigation with selectors
144
+ - **Example**: Returns structured data for all interactive elements on the page
145
+
146
+ #### getAllInteractiveElements
147
+ Get all clickable/fillable elements with their selectors.
148
+ - **Parameters**:
149
+ - `includeHidden` (optional): Include hidden elements (default: false)
150
+ - **Returns**: Array of all interactive elements with selectors and metadata
151
+
152
+ #### findElementsByText
153
+ Find elements by their visible text content.
154
+ - **Parameters**:
155
+ - `text` (required): Text to search for
156
+ - `exact` (optional): Exact match only (default: false)
157
+ - `caseSensitive` (optional): Case sensitive search (default: false)
158
+ - **Returns**: Elements containing the text with their selectors
159
+
45
160
  ### 1. Core Tools
46
161
 
47
162
  #### ping
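The `smartFindElement` response documented in this hunk returns ranked candidates with confidence scores. One way a caller might consume it is sketched below. `pickSelector` and the 0.8 threshold are assumptions made for illustration, and the sample object only mirrors the fields shown in the README example.

```javascript
// Hypothetical helper: not part of chrometools-mcp.
// Returns the highest-confidence selector, or null if none is confident enough.
function pickSelector(result, minConfidence = 0.8) {
  const best = [...result.candidates]
    .sort((a, b) => b.confidence - a.confidence)[0];
  return best && best.confidence >= minConfidence ? best.selector : null;
}

// Sample shaped like the README's example response.
const response = {
  candidates: [
    { selector: 'button.login-btn', confidence: 0.95, text: 'Login' },
    { selector: '#submit', confidence: 0.7, text: 'Send' },
  ],
};

const chosen = pickSelector(response);       // uses the default threshold
const strict = pickSelector(response, 0.99); // no candidate is this confident
```

Returning `null` instead of a weak match leaves the caller free to fall back to `analyzePage` or `findElementsByText`.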
@@ -59,12 +174,15 @@ Opens browser and navigates to URL. Browser stays open for further interactions.
59
174
  ### 2. Interaction Tools
60
175
 
61
176
  #### click
62
- Click an element and capture result screenshot.
177
+ Click an element with optional result screenshot.
63
178
  - **Parameters**:
64
179
  - `selector` (required): CSS selector
65
180
  - `waitAfter` (optional): Wait time in ms (default: 1500)
181
+ - `screenshot` (optional): Capture screenshot (default: false for performance) ⚡
182
+ - `timeout` (optional): Max operation time in ms (default: 30000)
66
183
  - **Use case**: Buttons, links, form submissions
67
- - **Returns**: Confirmation text + screenshot
184
+ - **Returns**: Confirmation text + optional screenshot
185
+ - **Performance**: 2-10x faster without screenshot
68
186
 
69
187
  #### type
70
188
  Type text into input fields with optional clearing and typing delay.
@@ -105,22 +223,45 @@ Get precise dimensions, positioning, margins, padding, and borders.
105
223
  - **Returns**: Box model data + metrics
106
224
 
107
225
  #### screenshot
108
- Capture PNG screenshot of specific element.
226
+ Capture optimized screenshot of specific element with smart compression.
109
227
  - **Parameters**:
110
228
  - `selector` (required)
111
- - `padding` (optional): Padding in pixels
229
+ - `padding` (optional): Padding in pixels (default: 0)
230
+ - `maxWidth` (optional): Max width for auto-scaling (default: 1024, null for original size)
231
+ - `maxHeight` (optional): Max height for auto-scaling (default: 8000, null for original size)
232
+ - `quality` (optional): JPEG quality 1-100 (default: 80)
233
+ - `format` (optional): 'png', 'jpeg', or 'auto' (default: 'auto')
112
234
  - **Use case**: Visual documentation, bug reports
113
- - **Returns**: Base64 PNG image
235
+ - **Returns**: Optimized image with metadata
236
+ - **Default behavior**: Auto-scales to 1024px width and 8000px height (API limit) and uses smart compression to reduce AI token usage
237
+ - **For original quality**: Set `maxWidth: null`, `maxHeight: null` and `format: 'png'`
238
+
239
+ #### saveScreenshot
240
+ Save an optimized screenshot to the filesystem without returning the image data in context.
241
+ - **Parameters**:
242
+ - `selector` (required)
243
+ - `filePath` (required): Absolute path to save file
244
+ - `padding` (optional): Padding in pixels (default: 0)
245
+ - `maxWidth` (optional): Max width for auto-scaling (default: 1024, null for original)
246
+ - `maxHeight` (optional): Max height for auto-scaling (default: 8000, null for original)
247
+ - `quality` (optional): JPEG quality 1-100 (default: 80)
248
+ - `format` (optional): 'png', 'jpeg', or 'auto' (default: 'auto')
249
+ - **Use case**: Baseline screenshots, file storage
250
+ - **Returns**: File path and metadata (not image data)
251
+ - **Default behavior**: Auto-scales and compresses to save disk space
114
252
 
115
253
  ### 4. Advanced Tools
116
254
 
117
255
  #### executeScript
118
- Execute arbitrary JavaScript in page context.
256
+ Execute arbitrary JavaScript in page context with optional screenshot.
119
257
  - **Parameters**:
120
258
  - `script` (required): JavaScript code
121
259
  - `waitAfter` (optional): Wait time in ms (default: 500)
260
+ - `screenshot` (optional): Capture screenshot (default: false for performance) ⚡
261
+ - `timeout` (optional): Max operation time in ms (default: 30000)
122
262
  - **Use case**: Complex interactions, custom manipulations
123
- - **Returns**: Execution result + screenshot
263
+ - **Returns**: Execution result + optional screenshot
264
+ - **Performance**: 2-10x faster without screenshot
124
265
 
125
266
  #### getConsoleLogs
126
267
  Retrieve browser console logs (log, warn, error, etc.).
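The `maxWidth` / `maxHeight` defaults described for `screenshot` and `saveScreenshot` imply a downscaling step. The sketch below shows one plausible way to compute the target size. It assumes the aspect ratio is preserved and that images are never upscaled, neither of which the README confirms. `fitWithin` is a hypothetical name.

```javascript
// Hypothetical sketch of the auto-scaling arithmetic; the real tool may differ.
// Passing null for a limit disables it, matching the README's "null for original size".
function fitWithin(width, height, maxWidth = 1024, maxHeight = 8000) {
  const scaleW = maxWidth == null ? 1 : maxWidth / width;
  const scaleH = maxHeight == null ? 1 : maxHeight / height;
  const scale = Math.min(1, scaleW, scaleH); // assumption: never upscale
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
    scale,
  };
}

const wide = fitWithin(2048, 1000);                 // limited by maxWidth
const small = fitWithin(800, 600);                  // already within limits
const original = fitWithin(2048, 1000, null, null); // limits disabled
```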
@@ -130,6 +271,22 @@ Retrieve browser console logs (log, warn, error, etc.).
130
271
  - **Use case**: Debugging JavaScript errors, tracking behavior
131
272
  - **Returns**: Array of log entries with timestamps
132
273
 
274
+ #### getNetworkRequests
275
+ Retrieve all network requests (XHR, Fetch, API calls, resources). **Auto-captures across page navigations**.
276
+ - **Parameters**:
277
+ - `types` (optional): Array of request types (XHR, Fetch, Script, Document, Image, etc.)
278
+ - `status` (optional): Filter by status (pending, completed, failed, all)
279
+ - `urlPattern` (optional): Filter by URL using regex
280
+ - `clear` (optional): Clear requests after reading (default: false)
281
+ - **Use case**: Debugging API calls, monitoring backend requests, tracking failed requests
282
+ - **Returns**: Array of requests with URL, method, status, headers, timing, errors
283
+ - **Auto-reinitialization**: Monitoring continues automatically after form submissions, redirects, and navigation
284
+ - **Examples**:
285
+ - `getNetworkRequests({ types: ['XHR', 'Fetch'] })` - API calls only
286
+ - `getNetworkRequests({ status: 'failed' })` - failed requests
287
+ - `getNetworkRequests({ urlPattern: 'api\\.' })` - requests to API endpoints
288
+ - `getNetworkRequests({ clear: true })` - get requests and clear history
289
+
133
290
  #### hover
134
291
  Simulate mouse hover over element.
135
292
  - **Parameters**: `selector` (required)
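The `types`, `status`, and `urlPattern` filters documented for `getNetworkRequests` can be pictured as a single pass over the stored requests. This is a sketch only. The record fields `type` and `state` and the helper `filterRequests` are assumptions, since the diff does not show the internal request shape.

```javascript
// Hypothetical record shape and helper; the package's internals may differ.
function filterRequests(requests, { types, status = 'all', urlPattern } = {}) {
  const pattern = urlPattern ? new RegExp(urlPattern) : null;
  return requests.filter((req) => {
    if (types && !types.includes(req.type)) return false;      // resource type filter
    if (status !== 'all' && req.state !== status) return false; // pending/completed/failed
    if (pattern && !pattern.test(req.url)) return false;        // regex on the URL
    return true;
  });
}

const sample = [
  { url: 'https://api.example.com/users', type: 'XHR', state: 'completed' },
  { url: 'https://cdn.example.com/app.js', type: 'Script', state: 'completed' },
  { url: 'https://api.example.com/login', type: 'Fetch', state: 'failed' },
];

const apiCalls = filterRequests(sample, { types: ['XHR', 'Fetch'] });
const failed = filterRequests(sample, { status: 'failed' });
const byUrl = filterRequests(sample, { urlPattern: 'api\\.example\\.com' });
```

The filters combine with AND semantics in this sketch, so `{ types: ['Fetch'], status: 'failed' }` would return only failed Fetch requests.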
@@ -167,6 +324,69 @@ Navigate to different URL while keeping browser instance.
167
324
  - **Use case**: Moving between pages in workflow
168
325
  - **Returns**: New page title
169
326
 
327
+ ### 5. Recorder Tools ⭐ NEW
328
+
329
+ #### enableRecorder
330
+ Inject visual recorder UI widget into the current page.
331
+ - **Parameters**: None
332
+ - **Use case**: Start recording user interactions visually
333
+ - **Returns**: Success status
334
+ - **Features**:
335
+ - Floating widget with compact mode (minimize to 50x50px)
336
+ - Visual recording indicator (red pulsing border)
337
+ - Start/Pause/Stop/Stop & Save/Clear controls
338
+ - Real-time action list display
339
+ - Metadata fields (name, description, tags)
340
+
341
+ #### executeScenario
342
+ Execute a previously recorded scenario by name.
343
+ - **Parameters**:
344
+ - `name` (required): Scenario name
345
+ - `parameters` (optional): Runtime parameters (e.g., { email: "user@test.com" })
346
+ - `executeDependencies` (optional): Execute dependencies before running scenario (default: true)
347
+ - **Use case**: Run automated test scenarios
348
+ - **Returns**: Execution result with success/failure status
349
+ - **Features**:
350
+ - Automatic dependency resolution (enabled by default)
351
+ - Secret parameter injection
352
+ - Fallback selector retry logic
353
+ - **Example**:
354
+ ```javascript
355
+ // Execute with dependencies (default)
356
+ executeScenario({ name: "create_post" })
357
+
358
+ // Execute without dependencies
359
+ executeScenario({ name: "create_post", executeDependencies: false })
360
+ ```
361
+
362
+ #### listScenarios
363
+ Get all available scenarios with metadata.
364
+ - **Parameters**: None
365
+ - **Use case**: Browse recorded scenarios
366
+ - **Returns**: Array of scenarios with names, descriptions, tags, timestamps
367
+
368
+ #### searchScenarios
369
+ Search scenarios by text or tags.
370
+ - **Parameters**:
371
+ - `text` (optional): Search in name/description
372
+ - `tags` (optional): Array of tags to filter
373
+ - **Use case**: Find specific scenarios
374
+ - **Returns**: Matching scenarios
375
+
376
+ #### getScenarioInfo
377
+ Get detailed information about a scenario.
378
+ - **Parameters**:
379
+ - `name` (required): Scenario name
380
+ - `includeSecrets` (optional): Include secret values (default: false)
381
+ - **Use case**: Inspect scenario actions and dependencies
382
+ - **Returns**: Full scenario details (actions, metadata, dependencies)
383
+
384
+ #### deleteScenario
385
+ Delete a scenario and its associated secrets.
386
+ - **Parameters**: `name` (required)
387
+ - **Use case**: Clean up unused scenarios
388
+ - **Returns**: Success confirmation
389
+
170
390
  ---
171
391
 
172
392
  ## Typical Workflow Example
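The recorder hunk above states that `executeScenario` resolves dependencies before running a scenario. A depth-first ordering is one common way to do that, and it is sketched here under the assumption that each scenario lists its dependencies by name. The `dependencies` field and `resolveOrder` are hypothetical and are not documented in this diff.

```javascript
// Hypothetical scenario shape: { name, dependencies: [names] }.
// Returns scenario names in an order where dependencies come first.
function resolveOrder(scenarios, target) {
  const byName = new Map(scenarios.map((s) => [s.name, s]));
  const order = [];
  const done = new Set();
  const inProgress = new Set();

  function visit(name) {
    if (done.has(name)) return;
    if (inProgress.has(name)) throw new Error(`Circular dependency at "${name}"`);
    const scenario = byName.get(name);
    if (!scenario) throw new Error(`Unknown scenario "${name}"`);
    inProgress.add(name);
    for (const dep of scenario.dependencies ?? []) visit(dep);
    inProgress.delete(name);
    done.add(name);
    order.push(name); // pushed only after all of its dependencies
  }

  visit(target);
  return order;
}

const scenarios = [
  { name: 'login_flow', dependencies: [] },
  { name: 'open_editor', dependencies: ['login_flow'] },
  { name: 'create_post', dependencies: ['open_editor', 'login_flow'] },
];
const order = resolveOrder(scenarios, 'create_post');
```

Each scenario appears once in the result even when several others depend on it, so a shared `login_flow` would run a single time.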
@@ -307,12 +527,16 @@ npx @modelcontextprotocol/inspector node index.js
307
527
 
308
528
  ## Features
309
529
 
310
- - **16 Powerful Tools**: Complete toolkit for browser automation
530
+ - **27+ Powerful Tools**: Complete toolkit for browser automation
311
531
  - Core: ping, openBrowser
312
532
  - Interaction: click, type, scrollTo
313
533
  - Inspection: getElement, getComputedCss, getBoxModel, screenshot
314
- - Advanced: executeScript, getConsoleLogs, hover, setStyles, setViewport, getViewport, navigateTo
534
+ - Advanced: executeScript, getConsoleLogs, getNetworkRequests, hover, setStyles, setViewport, getViewport, navigateTo
535
+ - AI-Powered: smartFindElement, analyzePage, getAllInteractiveElements, findElementsByText
536
+ - Recorder: enableRecorder, executeScenario, listScenarios, searchScenarios, getScenarioInfo, deleteScenario
537
+ - Figma: getFigmaFrame, compareFigmaToElement, getFigmaSpecs
315
538
  - **Console Log Capture**: Automatic JavaScript console monitoring
539
+ - **Network Request Monitoring**: Track all HTTP/API requests (XHR, Fetch, etc.)
316
540
  - **Persistent Browser Sessions**: Browser tabs remain open between requests
317
541
  - **Visual Browser (GUI Mode)**: See automation in real-time
318
542
  - **Cross-platform**: Works on Windows/WSL, Linux, macOS