terminator-mcp-agent 0.15.5 → 0.15.6
- package/README.md +1044 -1044
- package/config.js +280 -280
- package/index.js +194 -194
- package/package.json +1 -1
package/README.md CHANGED
@@ -1,1044 +1,1044 @@

## Terminator MCP Agent

<!-- BADGES:START -->

[<img alt="Install in VS Code" src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
[<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)

<!-- BADGES:END -->

A Model Context Protocol (MCP) server that provides desktop GUI automation capabilities using the [Terminator](https://github.com/mediar-ai/terminator) library. This server enables LLMs and agentic clients to interact with Windows, macOS, and Linux applications through structured accessibility APIs, with no vision models or screenshots required.

## Quick Install

### Claude Code

Install with a single command:

```bash
claude mcp add terminator "npx -y terminator-mcp-agent" -s user
```

### Cursor

Copy and paste this URL into your browser's address bar:

```
cursor://anysphere.cursor-deeplink/mcp/install?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50Il19
```

Or install manually:

1. Open Cursor Settings (`Cmd/Ctrl + ,`)
2. Go to the MCP tab
3. Add server with command: `npx -y terminator-mcp-agent`

### HTTP Endpoints (when running with `-t http`)

- `GET /health`: Always returns 200 while the process is alive.
- `GET /status`: Busy-aware probe for load balancers. Returns JSON and an appropriate status code:
  - 200 when idle: `{ "busy": false, "activeRequests": 0, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
  - 503 when busy: `{ "busy": true, "activeRequests": 1, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
  - Content-Type is `application/json`.
- `POST /mcp`: MCP execution endpoint. Enforces single-request concurrency per machine by default.

Concurrency is controlled by the `MCP_MAX_CONCURRENT` environment variable (default `1`). Only accepted `POST /mcp` requests are counted toward `activeRequests`. If the server is at capacity, new `POST /mcp` requests return 503 immediately. This 503 behavior is intentional so an Azure Load Balancer probing `GET /status` can take a busy VM out of rotation and route traffic elsewhere.

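
A client or orchestrator could use the `/status` contract described above to decide where to send work. The following is a minimal sketch, assuming Node.js 18+ for the built-in `fetch`; `interpretStatus`, `findIdleAgent`, and the base URLs passed in are hypothetical names for illustration and are not part of the agent.

```javascript
// Interpret a GET /status response as described above:
// 200 means idle, 503 means busy; the JSON body carries the counters.
function interpretStatus(httpStatus, body) {
  const active = Number(body?.activeRequests ?? 0);
  const max = Number(body?.maxConcurrent ?? 1);
  return {
    available: httpStatus === 200 && body?.busy === false,
    freeSlots: Math.max(0, max - active),
  };
}

// Probe each base URL in order and return the first idle one (or null).
// `fetchImpl` is injectable so the function can be exercised without a network.
async function findIdleAgent(baseUrls, fetchImpl = fetch) {
  for (const baseUrl of baseUrls) {
    try {
      const res = await fetchImpl(`${baseUrl}/status`);
      const body = await res.json();
      if (interpretStatus(res.status, body).available) return baseUrl;
    } catch {
      // Unreachable or malformed response: treat this agent as unavailable.
    }
  }
  return null;
}
```

Because a busy server also rejects `POST /mcp` with 503, a caller would still need to handle that response even after a successful probe.
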
### Getting Started

The easiest way to get started is to use the one-click install buttons above for your specific editor (VS Code, Cursor, etc.).

Alternatively, you can install and configure the agent from your command line.

**1. Install & Configure Automatically**
Run the following command and select your MCP client from the list:

```sh
npx -y terminator-mcp-agent@latest --add-to-app
```

**2. Manual Configuration**
If you prefer, you can add the following to your MCP client's settings file:

```json
{
  "mcpServers": {
    "terminator-mcp-agent": {
      "command": "npx",
      "args": ["-y", "terminator-mcp-agent@latest"]
    }
  }
}
```

### Command Line Interface (CLI) Execution

For automation workflows and CI/CD pipelines, you can execute workflows directly from the command line using the [Terminator CLI](../terminator-cli/README.md):

**Quick Start:**

```bash
# Execute a workflow file
terminator mcp run workflow.yml

# With verbose logging
terminator mcp run workflow.yml --verbose

# Dry run (validate without executing)
terminator mcp run workflow.yml --dry-run

# Use specific MCP server version
terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"

# Run specific steps (requires step IDs in workflow)
terminator mcp run workflow.yml --start-from "step_12" --end-at "step_13"

# Run single step
terminator mcp run workflow.yml --start-from "read_json" --end-at "read_json"
```

**Workflow File Formats:**

Direct workflow format (`workflow.yml`):

```yaml
steps:
  - tool_name: navigate_browser
    arguments:
      url: "https://example.com"
  - tool_name: click_element
    arguments:
      selector: "role:Button|name:Submit"
stop_on_error: true
include_detailed_results: true
```

Tool call wrapper format (`workflow.json`):

```json
{
  "tool_name": "execute_sequence",
  "arguments": {
    "steps": [
      {
        "tool_name": "navigate_browser",
        "arguments": {
          "url": "https://example.com"
        }
      }
    ]
  }
}
```

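
Judging from the two examples above, the wrapper format appears to be the direct format placed under `arguments` of an `execute_sequence` tool call. The sketch below illustrates that relationship; `toToolCall` and `toDirect` are hypothetical helper names, and the exact set of fields the CLI accepts is an assumption.

```javascript
// Wrap a direct workflow object ({ steps, stop_on_error, ... }) into the
// tool call wrapper format shown above.
function toToolCall(workflow) {
  if (!Array.isArray(workflow?.steps)) {
    throw new TypeError("workflow.steps must be an array");
  }
  return { tool_name: "execute_sequence", arguments: { ...workflow } };
}

// Unwrap a tool call back into the direct format; direct workflows pass through.
function toDirect(doc) {
  return doc?.tool_name === "execute_sequence" ? { ...doc.arguments } : doc;
}
```
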
**Code Execution in Workflows (engine mode):**

Execute custom JavaScript or Python with access to desktop automation APIs via `run_command`.

**Passing Data Between Workflow Steps:**

When using `engine` mode, data automatically flows between steps:

```yaml
steps:
  # Step 1: Return data directly (NEW - simplified!)
  - tool_name: run_command
    arguments:
      engine: "javascript"
      run: |
        // Get file info (example)
        const filePath = 'C:\\data\\report.pdf';
        const fileSize = 1024;

        console.log(`Found file: ${filePath}`);

        // Just return fields directly - they auto-merge into env
        return {
          status: 'success',
          file_path: filePath, // Becomes env.file_path
          file_size: fileSize // Becomes env.file_size
        };

  # Step 2: Access data automatically
  - tool_name: run_command
    arguments:
      engine: "javascript"
      run: |
        // env is automatically available - no setup needed!
        console.log(`Processing: ${env.file_path} (${env.file_size} bytes)`);

        // Workflow variables also auto-available
        console.log(`Config: ${variables.max_retries}`);

        // NEW: Direct variable access also works!
        console.log(`Processing: ${file_path} (${file_size} bytes)`);
        console.log(`Config: ${max_retries}`);

        // Continue with desktop automation
        const elements = await desktop.locator('role:button').all();

        // Return more data (auto-merges to env)
        return {
          status: 'success',
          file_processed: env.file_path,
          buttons_found: elements.length
        };
```

**Important Notes on Data Passing:**

- **NEW:** `env` and `variables` are automatically injected into all scripts
- **NEW:** Non-reserved fields in return values auto-merge into env (no `set_env` wrapper needed)
- **NEW:** Valid env fields are also available as individual variables (e.g., `file_path` instead of `env.file_path`)
- Reserved fields that don't auto-merge: `status`, `error`, `logs`, `duration_ms`, `set_env`
- Data passing only works with `engine` mode (JavaScript/Python), NOT with shell commands
- Backward compatible: explicit `set_env` still works if needed
- Individual variable names must be valid JavaScript identifiers (no spaces, special chars, or reserved keywords)
- Watch for backslash escaping issues in Windows paths (may need double escaping)
- Consider combining related operations in a single step if data passing becomes complex

For complete CLI documentation, see [Terminator CLI README](../terminator-cli/README.md).

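
The merge rules in the data-passing notes above can be modelled in a few lines. This is only an illustrative sketch of the described behavior, not the agent's actual implementation: `mergeStepResult` is a hypothetical name, the precedence between `set_env` and directly returned fields is an assumption, and the identifier check is deliberately conservative (it does not exclude reserved keywords).

```javascript
// Reserved result fields that, per the notes above, do not auto-merge.
const RESERVED = new Set(["status", "error", "logs", "duration_ms", "set_env"]);

// Conservative identifier check: ASCII letters, digits, _ and $,
// not starting with a digit. Real JavaScript identifiers are broader.
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function mergeStepResult(previousEnv, result) {
  // Explicit set_env is applied first (assumed order), then direct fields.
  const merged = { ...previousEnv, ...(result.set_env ?? {}) };
  for (const [key, value] of Object.entries(result)) {
    if (!RESERVED.has(key)) merged[key] = value;
  }
  // Names that could be exposed as bare variables (e.g. file_path).
  const directNames = Object.keys(merged).filter((k) => IDENTIFIER.test(k));
  return { env: merged, directNames };
}
```

Under this model a key such as `"bad name"` would still live in `env` but would only be reachable as `env["bad name"]`, which matches the note that individual variable names must be valid identifiers.
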
### Core Workflows: From Interaction to Structured Data

The Terminator MCP agent offers two primary workflows for automating desktop tasks. Both paths lead to the same goal: an automation that is more than 95% accurate and 10,000x faster than a human.

#### 1. Iterative Development with `execute_sequence`

This is the most powerful and flexible method. You build a workflow step-by-step, using MCP tools to inspect the UI and refine your actions.

1. **Inspect the UI**: Start by using `get_focused_window_tree` to understand the structure of your target application. This gives you the roles, names, and IDs of all elements.
2. **Build a Sequence**: Create an `execute_sequence` tool call with a series of actions (`click_element`, `type_into_element`, etc.). Use robust selectors (like `role|name` or stable `properties:AutomationId:value` selectors) whenever possible.
3. **Capture the Final State**: Ensure the last step in your sequence is an action that returns a UI tree. The `wait_for_element` tool with `include_tree: true` is perfect for this, as it captures the application's state after your automation has run.
4. **Extract Structured Data with `output_parser`**: Add the `output_parser` argument to your `execute_sequence` call. Write JavaScript code to parse the final UI tree and extract structured data. If successful, the tool result will contain a `parsed_output` field with your clean JSON data.

Here is an example of an `output_parser` that extracts insurance quote data from a web page:

```yaml
output_parser:
  ui_tree_source_step_id: capture_quotes_tree
  javascript_code: |
    // Find all quote groups with Image and Text children
    const results = [];

    function findElementsRecursively(element) {
      if (element.attributes && element.attributes.role === 'Group') {
        const children = element.children || [];
        const hasImage = children.some(child =>
          child.attributes && child.attributes.role === 'Image'
        );
        const hasText = children.some(child =>
          child.attributes && child.attributes.role === 'Text'
        );

        if (hasImage && hasText) {
          const textElements = children.filter(child =>
            child.attributes && child.attributes.role === 'Text' && child.attributes.name
          );

          let carrierProduct = '';
          let monthlyPrice = '';

          for (const textEl of textElements) {
            const text = textEl.attributes.name;
            if (text.includes(':')) {
              carrierProduct = text;
            }
            if (text.startsWith('$')) {
              monthlyPrice = text;
            }
          }

          if (carrierProduct && monthlyPrice) {
            results.push({
              carrierProduct: carrierProduct,
              monthlyPrice: monthlyPrice
            });
          }
        }
      }

      if (element.children) {
        for (const child of element.children) {
          findElementsRecursively(child);
        }
      }
    }

    findElementsRecursively(tree);
    return results;
```

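
Before attaching a parser to a real workflow, it can help to exercise it against a hand-written tree. The harness below is a sketch for local testing only: `runOutputParser`, `sampleTree`, and `collectText` are hypothetical names, and it assumes (as the example parser above does) that the UI tree is bound to a variable called `tree` and that nodes have the shape `{ attributes: { role, name }, children }`.

```javascript
// Run an output_parser body against a UI tree, binding the tree to `tree`.
function runOutputParser(javascriptCode, tree) {
  const parser = new Function("tree", javascriptCode);
  return parser(tree);
}

// A hand-written tree in the shape the example parser reads.
const sampleTree = {
  attributes: { role: "Window", name: "Quotes" },
  children: [
    {
      attributes: { role: "Group" },
      children: [
        { attributes: { role: "Image" } },
        { attributes: { role: "Text", name: "Acme: Gold Plan" } },
        { attributes: { role: "Text", name: "$120.00" } },
      ],
    },
  ],
};

// A shortened parser in the same style: collect the names of all Text nodes.
const collectText = `
  const names = [];
  (function visit(el) {
    if (el.attributes && el.attributes.role === 'Text' && el.attributes.name) {
      names.push(el.attributes.name);
    }
    (el.children || []).forEach(visit);
  })(tree);
  return names;
`;

console.log(runOutputParser(collectText, sampleTree));
```

The same harness can be pointed at the full `javascript_code` body from the YAML above by pasting it into a string in place of `collectText`.
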
#### 2. Recording Human Actions with `record_workflow`

For simpler tasks, you can record your own actions to generate a baseline workflow.

1. **Start Recording**: Call `record_workflow` with `action: "start"`.
2. **Perform the Task**: Manually perform the clicks, typing, and other interactions in the target application.
3. **Stop and Save**: Call `record_workflow` with `action: "stop"`. This returns a complete workflow JSON file containing all your recorded actions.
4. **Refine and Parse**: The recorded workflow is a great starting point. You can then refine the selectors for robustness, add a final step to capture the UI tree, and attach an `output_parser` to extract structured data, just as you would in the iterative workflow.

### Browser DOM Inspection

The `execute_browser_script` tool enables direct JavaScript execution in browser contexts, providing access to the full HTML DOM. This is particularly useful when you need information not available in the accessibility tree.

#### When to Use DOM vs Accessibility Tree

**Use Accessibility Tree (default) when:**

- Navigating and interacting with UI elements
- Working with semantic page structure
- Building reliable automation workflows
- Performance is critical (faster, cleaner data)

**Use DOM Inspection when:**

- Extracting data attributes, meta tags, or hidden inputs
- Debugging why elements aren't appearing in accessibility tree
- Scraping structured data from specific HTML patterns
- Validating complete page structure or SEO elements

#### Basic DOM Retrieval Patterns

```javascript
// Get full HTML DOM (be mindful of size limits)
execute_browser_script({
  selector: "role:Window|name:Google Chrome",
  script: "document.documentElement.outerHTML",
});

// Get structured page information
execute_browser_script({
  selector: "role:Window|name:Google Chrome",
  script: `({
    url: window.location.href,
    title: document.title,
    html: document.documentElement.outerHTML,
    bodyText: document.body.innerText.substring(0, 1000)
  })`,
});

// Extract specific data (forms, hidden inputs, meta tags)
execute_browser_script({
  selector: "role:Window|name:Google Chrome",
  script: `({
    forms: Array.from(document.forms).map(f => ({
      id: f.id,
      action: f.action,
      method: f.method,
      inputs: Array.from(f.elements).map(e => ({
        name: e.name,
        type: e.type,
        value: e.type === 'password' ? '[REDACTED]' : e.value
      }))
    })),
    hiddenInputs: Array.from(document.querySelectorAll('input[type="hidden"]')).map(e => ({
      name: e.name,
      value: e.value
    })),
    metaTags: Array.from(document.querySelectorAll('meta')).map(m => ({
      name: m.name || m.property,
      content: m.content
    }))
  })`,
});
```

#### Handling Large DOMs

The MCP protocol has response size limits (~30KB). For large DOMs, use truncation strategies:

```javascript
execute_browser_script({
  selector: "role:Window|name:Google Chrome",
  script: `
    const html = document.documentElement.outerHTML;
    const maxLength = 30000;

    ({
      url: window.location.href,
      title: document.title,
      html: html.length > maxLength
        ? html.substring(0, maxLength) + '... [truncated at ' + maxLength + ' chars]'
        : html,
      totalLength: html.length,
      truncated: html.length > maxLength
    })
  `,
});
```

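
The truncation example above counts characters, while a size limit expressed in KB is more naturally measured in bytes; pages with many non-ASCII characters can exceed the byte budget well before the character budget. The helper below is a sketch of byte-aware truncation, assuming the limit applies to the UTF-8 encoded payload (the source does not say how the ~30KB figure is measured). `truncateUtf8` is a hypothetical name; `TextEncoder` and `TextDecoder` are standard in browsers and Node.js.

```javascript
// Truncate a string so that its UTF-8 encoding stays within maxBytes,
// without cutting a multi-byte character in half.
function truncateUtf8(text, maxBytes = 30000) {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) {
    return { text, totalBytes: bytes.length, truncated: false };
  }
  // bytes[end] is the first byte that would be dropped. If it is a UTF-8
  // continuation byte (bit pattern 10xxxxxx), the cut falls inside a
  // character, so back up to the start of that character.
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  const head = new TextDecoder().decode(bytes.subarray(0, end));
  return { text: head, totalBytes: bytes.length, truncated: true };
}
```

To use it inside `execute_browser_script`, the function body would need to be inlined into the `script` string, since the script runs in the page context.
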
#### Advanced DOM Analysis

```javascript
// Analyze page structure and extract semantic content
execute_browser_script({
  selector: "role:Window|name:Google Chrome",
  script: `
    // Remove scripts and styles for cleaner analysis
    const clonedDoc = document.documentElement.cloneNode(true);
    clonedDoc.querySelectorAll('script, style, noscript').forEach(el => el.remove());

    ({
      // Page metrics
      domElementCount: document.querySelectorAll('*').length,
      formCount: document.forms.length,
      linkCount: document.links.length,
      imageCount: document.images.length,

      // Semantic structure
      headings: Array.from(document.querySelectorAll('h1,h2,h3')).map(h => ({
        level: h.tagName,
        text: h.innerText.substring(0, 100)
      })),

      // Clean HTML without scripts/styles
      cleanHtml: clonedDoc.outerHTML.substring(0, 20000),

      // Data extraction
      jsonLd: Array.from(document.querySelectorAll('script[type="application/ld+json"]'))
        .map(s => { try { return JSON.parse(s.textContent); } catch { return null; } })
        .filter(Boolean)
    })
  `,
});
```

#### Passing Data with Environment Variables

The `execute_browser_script` tool now supports passing data through `env` and `outputs` parameters:

```javascript
// Step 1: Set environment variables in JavaScript
run_command({
  engine: "javascript",
  run: `
    return {
      set_env: {
        userName: 'John Doe',
        userId: '12345',
        apiKey: 'secret-key'
      }
    };
  `,
});

// Step 2: Use environment variables in browser script
execute_browser_script({
  selector: "role:Window",
  env: {
    userName: "{{env.userName}}",
    userId: "{{env.userId}}",
  },
  script: `
    // Parse env if it's a JSON string (for backward compatibility)
    const parsedEnv = typeof env === 'string' ? JSON.parse(env) : env;

    // Use the data - traditional way
    console.log('Processing user:', parsedEnv.userName);

    // NEW: Direct variable access also works!
    console.log('Processing user:', userName); // Direct access
    console.log('User ID:', userId); // No env prefix needed

    // Fill form with data
    document.querySelector('#username').value = userName;
    document.querySelector('#userid').value = userId;

    // Return result and set new variables
    JSON.stringify({
      status: 'form_filled',
      set_env: {
        form_submitted: 'true',
        timestamp: new Date().toISOString()
      }
    });
  `,
});
```

#### Loading Scripts from Files

You can load JavaScript from external files using the `script_file` parameter:

```javascript
// browser_scripts/extract_data.js
const parsedEnv = typeof env === "string" ? JSON.parse(env) : env;
const parsedOutputs =
  typeof outputs === "string" ? JSON.parse(outputs) : outputs;

console.log("Script loaded from file");
console.log("User:", parsedEnv?.userName);
console.log("Previous result:", parsedOutputs?.previousStep);

// Extract and return data
JSON.stringify({
  extractedData: {
    url: window.location.href,
    title: document.title,
    forms: document.forms.length,
  },
  set_env: {
    extraction_complete: "true",
  },
});

// In your workflow:
execute_browser_script({
  selector: "role:Window",
  script_file: "browser_scripts/extract_data.js",
  env: {
    userName: "{{env.userName}}",
    previousStep: "{{env.previousStep}}",
  },
});
```

#### Important Notes

1. **Chrome Extension Required**: The `execute_browser_script` tool requires the Terminator browser extension to be installed. See the installation workflow examples for automated setup.

2. **Security Considerations**: Be cautious when extracting sensitive data. The examples above redact password fields and you should follow similar practices.

3. **Performance**: DOM operations are synchronous and can be slow on large pages. Consider using specific selectors rather than traversing the entire DOM.

4. **Error Handling**: Always wrap complex DOM operations in try-catch blocks and return meaningful error messages.

5. **Data Injection**: When using `env` or `outputs` parameters, they are injected as JavaScript variables at the beginning of your script. Always parse them if they might be JSON strings.

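
Notes 4 and 5 above can be combined into two small helpers that a browser script might define at its top. This is a sketch under stated assumptions: `parseInjected` and `safeResult` are hypothetical names, and the `{ status, data, error }` result shape is only one possible convention, not something the tool requires.

```javascript
// Note 5: accept env/outputs either as an object or as a JSON string.
function parseInjected(value) {
  if (typeof value !== "string") return value ?? {};
  try {
    return JSON.parse(value);
  } catch {
    return {}; // Not valid JSON: fall back to an empty object.
  }
}

// Note 4: run a DOM operation inside try/catch and always return a JSON
// string, so a failure yields a readable message instead of a thrown error.
function safeResult(operation) {
  try {
    return JSON.stringify({ status: "success", data: operation() });
  } catch (err) {
    const message = err && err.message ? err.message : String(err);
    return JSON.stringify({ status: "error", error: message });
  }
}
```

A script could then end with something like `safeResult(() => document.forms.length)` so that the final expression is always a well-formed JSON string.
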
## Local Development

To build and test the agent from the source code:

```sh
# 1. Clone the entire Terminator repository
git clone https://github.com/mediar-ai/terminator

# 2. Navigate to the agent's directory
cd terminator/terminator-mcp-agent

# 3. Install Node.js dependencies
npm install

# 4. Build the Rust binary and Node.js wrapper
npm run build

# 5. To use your local build in your MCP client, link it globally
npm install --global .
```

Now, when your MCP client runs `terminator-mcp-agent`, it will use your local build instead of the published `npm` version.

---

## Troubleshooting

- Make sure you have Node.js installed (v16+ recommended).
- For VS Code/Insiders, ensure the CLI (`code` or `code-insiders`) is available in your PATH.
- If you encounter issues, try running with elevated permissions.

### Version Compatibility Issues

**Problem**: "missing field `items`" or schema mismatch errors

**Solution**: Ensure you're using the latest MCP server version:

```bash
# Force latest version in CLI
terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"

# Update MCP client configuration to use @latest
{
  "mcpServers": {
    "terminator-mcp-agent": {
      "command": "npx",
      "args": ["-y", "terminator-mcp-agent@latest"]
    }
  }
}

# Clear npm cache if needed
npm cache clean --force
```

### CLI Integration Issues

**Problem**: CLI commands not working or connection errors

**Solution**: Test MCP connectivity step by step:

```bash
# Test basic connectivity
terminator mcp exec get_applications

# Test with verbose logging
terminator mcp run workflow.yml --verbose

# Test with dry run first
terminator mcp run workflow.yml --dry-run

# Use HTTP connection for debugging
terminator mcp run workflow.yml --url http://localhost:3000/mcp
```

### JavaScript Execution Issues

**Problem**: JavaScript code fails or can't access desktop APIs

**Solution**: Verify JavaScript execution and API access:

```bash
# Test basic JavaScript execution via run_command engine mode
terminator mcp exec run_command '{"engine": "javascript", "run": "return {test: true};"}'

# Test desktop API access with node engine
terminator mcp exec run_command '{"engine": "node", "run": "const elements = await desktop.locator(\\\"role:button\\\").all(); return {count: elements.length};"}'

# Test Python engine
terminator mcp exec run_command '{"engine": "python", "run": "return {\\\"py\\\": True}"}'

# Debug with verbose logging
terminator mcp run workflow.yml --verbose
```

### Workflow File Issues

**Problem**: Workflow parsing errors or unexpected behavior

**Solution**: Validate workflow structure:

```bash
# Validate workflow syntax
terminator mcp run workflow.yml --dry-run

# Test with minimal workflow first
echo 'steps: [{tool_name: get_applications}]' > test.yml
terminator mcp run test.yml

# Check both YAML and JSON formats work
terminator mcp run workflow.yml # YAML
terminator mcp run workflow.json # JSON
```

### Platform-Specific Issues

**Windows**:

- Ensure Windows UI Automation APIs are available
- Run with administrator privileges if accessibility features are restricted
- Check Windows Defender/antivirus isn't blocking automation

**macOS**:

- Grant accessibility permissions in System Preferences > Security & Privacy
- Ensure the terminal/IDE has accessibility access
- Check macOS version compatibility (10.14+ recommended)

**Linux**:

- Ensure AT-SPI (assistive technology) is enabled
- Install required packages: `sudo apt-get install at-spi2-core`
- Check desktop environment compatibility (GNOME, KDE, XFCE supported)

### Virtual Display Support (Headless VMs)

Terminator MCP Agent includes virtual display support for running on headless VMs without requiring RDP connections. This enables scalable automation on cloud platforms like Azure, AWS, and GCP.

**How It Works**:

The agent automatically detects headless environments and initializes a virtual display context that Windows UI Automation APIs can interact with. This allows full UI automation capabilities even when no physical display or RDP session is active.

**Activation**:

Virtual display activates automatically when:

- Environment variable `TERMINATOR_HEADLESS=true` is set
- No console window is available (common in VM/container scenarios)
- Running as a Windows service or scheduled task

**Configuration**:

```bash
# Enable virtual display mode
export TERMINATOR_HEADLESS=true

# Run the MCP agent
npx -y terminator-mcp-agent
```

**Use Cases**:

- Running multiple automation agents on VMs without RDP overhead
- CI/CD pipelines in cloud environments
- Scalable automation farms on Azure/AWS/GCP
- Containerized automation workloads

**Requirements**:

- Windows Server 2016+ or Windows 10/11
- .NET Framework 4.7.2+
- UI Automation APIs available (included in Windows)

The virtual display manager creates a memory-based display context that satisfies Windows UI Automation requirements, enabling terminator to enumerate and interact with UI elements as if a physical display were present.

### Performance Optimization

**Large UI Trees**:

- Use specific selectors instead of broad element searches
- Implement delays between rapid operations
- Consider using `include_tree: false` for intermediate steps

**JavaScript Performance**:

- Use `quickjs` engine for lightweight operations
- Use `nodejs` engine only when full APIs are needed
- Implement `sleep()` delays in loops to prevent overwhelming the UI

For additional help, see the [Terminator CLI documentation](../terminator-cli/README.md) or open an issue on GitHub.

---

## 📚 Full `execute_sequence` Reference & Sample Workflow

> **Why another example?** The quick start above shows the concept, but many users asked for a fully-annotated workflow schema. The example below automates the Windows **Calculator** app, so it is 100% safe to share and does **not** reveal any private customer data. Feel free to copy-paste and adapt it to your own application.

### 1. Anatomy of an `execute_sequence` Call

```jsonc
{
  "tool_name": "execute_sequence",
  "arguments": {
    "variables": {
      // 1️⃣ Re-usable inputs with type metadata
      "app_path": {
        "type": "string",
        "label": "Calculator EXE Path",
        "default": "calc.exe"
      },
      "first_number": {
        "type": "string",
        "label": "First Number",
        "default": "42"
      },
      "second_number": {
        "type": "string",
        "label": "Second Number",
        "default": "8"
      }
    },
    "inputs": {
      // 2️⃣ Concrete values for *this run*
      "app_path": "calc.exe",
      "first_number": "42",
      "second_number": "8"
    },
    "selectors": {
      // 3️⃣ Human-readable element shortcuts
      "calc_window": "role:Window|name:Calculator",
      "btn_clear": "role:Button|name:Clear",
      "btn_plus": "role:Button|name:Plus",
      "btn_equals": "role:Button|name:Equals"
    },
    "steps": [
      // 4️⃣ Ordered actions & control flow
      {
        "tool_name": "open_application",
        "arguments": { "path": "${{app_path}}" }
      },
      {
        "tool_name": "click_element", // 4a. Make sure the UI is reset
        "arguments": { "selector": "${{selectors.btn_clear}}" },
        "continue_on_error": true
      },
      {
        "group_name": "Enter First Number", // 4b. Groups improve logs
        "steps": [
          {
            "tool_name": "type_into_element",
            "arguments": {
              "selector": "${{selectors.calc_window}}",
              "text_to_type": "${{first_number}}"
            }
          }
        ]
      },
      {
        "tool_name": "click_element",
        "arguments": { "selector": "${{selectors.btn_plus}}" }
      },
      {
        "group_name": "Enter Second Number",
        "steps": [
          {
            "tool_name": "type_into_element",
            "arguments": {
              "selector": "${{selectors.calc_window}}",
              "text_to_type": "${{second_number}}"
            }
          }
        ]
      },
      {
        "tool_name": "click_element",
        "arguments": { "selector": "${{selectors.btn_equals}}" }
      },
      {
        "tool_name": "wait_for_element", // 4c. Capture final UI tree
        "arguments": {
|
|
789
|
-
"selector": "${{selectors.calc_window}}",
|
|
790
|
-
"condition": "exists",
|
|
791
|
-
"include_tree": true,
|
|
792
|
-
"timeout_ms": 2000
|
|
793
|
-
}
|
|
794
|
-
}
|
|
795
|
-
],
|
|
796
|
-
"output_parser": {
|
|
797
|
-
// 5️⃣ Turn the tree into clean JSON
|
|
798
|
-
"javascript_code": "// Extract calculator display value\nconst results = [];\n\nfunction findElementsRecursively(element) {\n if (element.attributes && element.attributes.role === 'Text') {\n const item = {\n displayValue: element.attributes.name || ''\n };\n results.push(item);\n }\n \n if (element.children) {\n for (const child of element.children) {\n findElementsRecursively(child);\n }\n }\n}\n\nfindElementsRecursively(tree);\nreturn results;"
|
|
799
|
-
}
|
|
800
|
-
}
|
|
801
|
-
}
|
|
802
|
-
```
|
|
803
|
-
|
|
804
|
-
### 2. Key Concepts at a Glance
|
|
805
|
-
|
|
806
|
-
1. **Variables vs. Inputs** – Declare once, override per-run. This is perfect for parameterizing CI pipelines or A/B test data.
|
|
807
|
-
2. **Selectors** – Give every important UI element a _nickname_. It makes long workflows readable and easy to maintain.
|
|
808
|
-
3. **Templating** – `${{ ... }}` (GitHub Actions-style) _or_ legacy `{{ ... }}` lets you reference **any** key inside `variables`, `inputs`, or `selectors`. Both syntaxes are supported; the engine uses Mustache-style rendering.
|
|
809
|
-
4. **Groups & Control Flow** – Add `group_name`, `skippable`, `if`, or `continue_on_error` to any step for advanced branching.
|
|
810
|
-
5. **Output Parsing** – Always end with a step that includes the UI tree, then use the `output_parser` JavaScript (`javascript_code`) to mine the data you need.
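
To make the templating and "declare once, override per-run" ideas above concrete, here is a minimal sketch of how `${{ ... }}` and legacy `{{ ... }}` placeholders could be resolved. It is not the agent's actual renderer: `buildContext` and `renderTemplate` are hypothetical helpers, and the rule that `inputs` override `variables` defaults is an assumption drawn from the description above.

```javascript
// Hypothetical helpers for illustration only; not part of terminator-mcp-agent.

// Build a lookup context: variable defaults first, then per-run inputs on top.
function buildContext(variables, inputs, selectors) {
  const context = { selectors: { ...selectors } };
  for (const [name, spec] of Object.entries(variables)) {
    context[name] = spec.default; // declared default
  }
  return Object.assign(context, inputs); // per-run values win (assumption)
}

// Replace ${{ path }} and legacy {{ path }} with values from the context.
function renderTemplate(text, context) {
  return text.replace(/\$?\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    const value = path
      .split(".")
      .reduce((node, key) => (node == null ? undefined : node[key]), context);
    // Leave unknown placeholders untouched so they are easy to spot.
    return value === undefined ? match : String(value);
  });
}

const context = buildContext(
  { first_number: { type: "string", default: "42" } },
  { first_number: "7" },
  { btn_plus: "role:Button|name:Plus" }
);

console.log(renderTemplate("${{selectors.btn_plus}}", context));
console.log(renderTemplate("{{first_number}}", context));
```

Leaving unresolved placeholders in place, rather than substituting an empty string, is a design choice of this sketch; the real engine may behave differently.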
|
|
811
|
-
|
|
812
|
-
### 3. State Persistence & Partial Execution
|
|
813
|
-
|
|
814
|
-
The `execute_sequence` tool supports powerful features for workflow debugging and resumption:
|
|
815
|
-
|
|
816
|
-
#### Partial Execution with Step Ranges
|
|
817
|
-
|
|
818
|
-
You can run specific portions of a workflow using `start_from_step` and `end_at_step` parameters:
|
|
819
|
-
|
|
820
|
-
```jsonc
|
|
821
|
-
{
|
|
822
|
-
"tool_name": "execute_sequence",
|
|
823
|
-
"arguments": {
|
|
824
|
-
"url": "file://path/to/workflow.yml",
|
|
825
|
-
"start_from_step": "read_json_file", // Start from this step ID
|
|
826
|
-
"end_at_step": "fill_journal_entries", // Stop after this step (inclusive)
|
|
827
|
-
"follow_fallback": false // Don't follow fallback_id beyond end_at_step (default: false)
|
|
828
|
-
}
|
|
829
|
-
}
|
|
830
|
-
```
|
|
831
|
-
|
|
832
|
-
**Examples:**
|
|
833
|
-
- Run single step: Set both `start_from_step` and `end_at_step` to the same ID
|
|
834
|
-
- Run step range: Set different IDs for start and end
|
|
835
|
-
- Run from step to end: Only set `start_from_step`
|
|
836
|
-
- Run from beginning to step: Only set `end_at_step`
|
|
837
|
-
- Debug without fallback: Use `follow_fallback: false` to prevent jumping to troubleshooting steps when a bounded step fails
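
The range rules listed above can be sketched as a small function. This is an illustration under assumptions, not the agent's implementation: `selectStepRange` is a hypothetical name, and raising an error for an unknown step ID is a choice made for this sketch.

```javascript
// Hypothetical helper for illustration only; not part of terminator-mcp-agent.
// Returns the steps from startFromStep to endAtStep, both inclusive.
function selectStepRange(steps, startFromStep, endAtStep) {
  const indexOfId = (id, fallbackIndex) => {
    if (id === undefined) return fallbackIndex; // bound omitted: use the edge
    const index = steps.findIndex((step) => step.id === id);
    if (index === -1) throw new Error(`Unknown step id: ${id}`);
    return index;
  };

  const start = indexOfId(startFromStep, 0);
  const end = indexOfId(endAtStep, steps.length - 1);
  if (start > end) {
    throw new Error("start_from_step comes after end_at_step");
  }
  return steps.slice(start, end + 1); // end_at_step is inclusive
}

const steps = [
  { id: "open_app" },
  { id: "read_json_file" },
  { id: "fill_journal_entries" },
  { id: "submit" },
];

const selected = selectStepRange(steps, "read_json_file", "fill_journal_entries");
console.log(selected.map((step) => step.id));
```

Passing the same ID for both bounds yields a single step, and omitting either bound extends the range to the start or end of the workflow, matching the list above.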
|
|
838
|
-
|
|
839
|
-
#### Automatic State Persistence
|
|
840
|
-
|
|
841
|
-
When using `file://` URLs, the workflow state (environment variables) is automatically saved to a `.workflow_state` folder:
|
|
842
|
-
|
|
843
|
-
1. **State is saved** after each step that modifies environment variables via `set_env` or has a tool result with an ID
|
|
844
|
-
2. **State is loaded** when starting from a specific step
|
|
845
|
-
3. **Location**: `.workflow_state/<workflow_hash>.json` in the workflow's directory
|
|
846
|
-
4. **Tool results** from all tools (not just scripts) are automatically stored as `{step_id}_result` and `{step_id}_status`
|
|
847
|
-
|
|
848
|
-
This enables:
|
|
849
|
-
- **Debugging**: Run steps individually to inspect state between executions
|
|
850
|
-
- **Recovery**: Resume failed workflows from the last successful step
|
|
851
|
-
- **Testing**: Test specific steps without re-running the entire workflow
|
|
852
|
-
|
|
853
|
-
#### Data Passing Between Steps
|
|
854
|
-
|
|
855
|
-
Steps can pass data using multiple methods:
|
|
856
|
-
|
|
857
|
-
##### 1. Tool Result Storage (NEW)
|
|
858
|
-
|
|
859
|
-
ALL tools with an `id` field automatically store their results in the environment:
|
|
860
|
-
|
|
861
|
-
```yaml
|
|
862
|
-
steps:
|
|
863
|
-
# Any tool with an ID stores its result
|
|
864
|
-
- id: check_apps
|
|
865
|
-
tool_name: get_applications
|
|
866
|
-
arguments:
|
|
867
|
-
include_tree: false
|
|
868
|
-
|
|
869
|
-
# Access the result in JavaScript
|
|
870
|
-
- tool_name: run_command
|
|
871
|
-
arguments:
|
|
872
|
-
engine: javascript
|
|
873
|
-
run: |
|
|
874
|
-
// Direct variable access - auto-injected!
|
|
875
|
-
const apps = check_apps_result || [];
|
|
876
|
-
const status = check_apps_status; // "success" or "error"
|
|
877
|
-
console.log(`Found ${apps[0]?.applications?.length} apps`);
|
|
878
|
-
```
|
|
879
|
-
|
|
880
|
-
##### 2. Script Return Values
|
|
881
|
-
|
|
882
|
-
Steps can pass data using the `set_env` mechanism in `run_command` with engine mode:
|
|
883
|
-
|
|
884
|
-
```javascript
|
|
885
|
-
// Step 12: Read and process data
|
|
886
|
-
return {
|
|
887
|
-
set_env: {
|
|
888
|
-
file_path: "C:/data/input.json",
|
|
889
|
-
journal_entries: JSON.stringify(entries),
|
|
890
|
-
total_debit: "100.50"
|
|
891
|
-
}
|
|
892
|
-
};
|
|
893
|
-
|
|
894
|
-
// Step 13: Use the data (NEW - simplified access!)
|
|
895
|
-
const filePath = file_path; // Direct access, no {{env.}} needed!
|
|
896
|
-
const entries = JSON.parse(journal_entries);
|
|
897
|
-
const debit = total_debit;
|
|
898
|
-
```
|
|
899
|
-
|
|
900
|
-
### 4. Running the Workflow
|
|
901
|
-
|
|
902
|
-
1. Ensure the Terminator MCP agent is running (it will auto-start in supported editors).
|
|
903
|
-
2. Send the JSON above as the body of an `execute_sequence` tool call from your LLM or test harness.
|
|
904
|
-
3. Inspect the response: if parsing succeeds, you'll see a `parsed_output` field like the one shown after the SSE example below.
|
|
905
|
-
|
|
906
|
-
### Realtime events (SSE)
|
|
907
|
-
|
|
908
|
-
When running with the HTTP transport, you can subscribe to realtime workflow events at a separate endpoint outside `/mcp`:
|
|
909
|
-
|
|
910
|
-
- SSE endpoint: `/events`
|
|
911
|
-
- Emits JSON payloads for: `sequence` (start/end), `sequence_progress`, and `sequence_step` (begin/end)
|
|
912
|
-
|
|
913
|
-
Example in Node.js:
|
|
914
|
-
|
|
915
|
-
```js
|
|
916
|
-
import EventSource from "eventsource";
|
|
917
|
-
const es = new EventSource("http://127.0.0.1:3000/events");
|
|
918
|
-
es.onmessage = (e) => console.log("event", e.data);
|
|
919
|
-
```
|
|
920
|
-
|
|
921
|
-
```jsonc
|
|
922
|
-
{
|
|
923
|
-
"parsed_output": {
|
|
924
|
-
"displayValue": "50" // 42 + 8
|
|
925
|
-
}
|
|
926
|
-
}
|
|
927
|
-
```
|
|
928
|
-
|
|
929
|
-
### 5. Working with Tool Results
|
|
930
|
-
|
|
931
|
-
Every tool that has an `id` field automatically stores its result for use in later steps:
|
|
932
|
-
|
|
933
|
-
```yaml
|
|
934
|
-
steps:
|
|
935
|
-
# Capture browser DOM
|
|
936
|
-
- id: capture_dom
|
|
937
|
-
tool_name: execute_browser_script
|
|
938
|
-
arguments:
|
|
939
|
-
selector: "role:Window"
|
|
940
|
-
script: "return document.documentElement.innerHTML;"
|
|
941
|
-
|
|
942
|
-
# Validate an element exists
|
|
943
|
-
- id: check_button
|
|
944
|
-
tool_name: validate_element
|
|
945
|
-
arguments:
|
|
946
|
-
selector: "role:Button|name:Submit"
|
|
947
|
-
|
|
948
|
-
# Use both results in script
|
|
949
|
-
- tool_name: run_command
|
|
950
|
-
arguments:
|
|
951
|
-
engine: javascript
|
|
952
|
-
run: |
|
|
953
|
-
// All tool results are auto-injected as variables
|
|
954
|
-
const dom = capture_dom_result?.content || '';
|
|
955
|
-
const buttonExists = check_button_status === 'success';
|
|
956
|
-
|
|
957
|
-
if (buttonExists) {
|
|
958
|
-
const button = check_button_result[0]?.element;
|
|
959
|
-
console.log(`Submit button at: ${button?.bounds?.x}, ${button?.bounds?.y}`);
|
|
960
|
-
}
|
|
961
|
-
|
|
962
|
-
return { dom_length: dom.length, has_button: buttonExists };
|
|
963
|
-
```
|
|
964
|
-
|
|
965
|
-
Tool results are accessible as:
|
|
966
|
-
- `{step_id}_result`: The tool's return value (content, element info, etc.)
|
|
967
|
-
- `{step_id}_status`: Either "success" or "error"
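
Because every step with an `id` leaves a `{step_id}_status` entry behind, a later script step can scan for failures in one pass. The sketch below assumes the results are available as a plain object keyed by those names; `failedSteps` is a hypothetical helper, not an agent API.

```javascript
// Hypothetical helper for illustration only; not part of terminator-mcp-agent.
// Collects the IDs of steps whose `{step_id}_status` entry is "error".
function failedSteps(env) {
  const suffix = "_status";
  return Object.keys(env)
    .filter((key) => key.endsWith(suffix) && env[key] === "error")
    .map((key) => key.slice(0, -suffix.length));
}

// Example data shaped like the variables described above.
const exampleEnv = {
  capture_dom_status: "success",
  capture_dom_result: { content: "<html></html>" },
  check_button_status: "error",
  check_button_result: null,
};

console.log(failedSteps(exampleEnv));
```

A check like this pairs well with `continue_on_error`: the workflow keeps running, and a final step reports which IDs failed.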
|
|
968
|
-
|
|
969
|
-
### 6. Tips for Production Workflows
|
|
970
|
-
|
|
971
|
-
- **Never hard-code credentials** – use environment variables or your secret manager.
|
|
972
|
-
- **Keep workflows short** – fewer than 100 steps is ideal. Break large tasks into multiple sequences.
|
|
973
|
-
- **Capture errors** – `continue_on_error` is useful, but also check `{step_id}_status` for tool failures.
|
|
974
|
-
- **Version control** – Store workflow JSON in a repo and use PR reviews just like regular code.
|
|
975
|
-
- **Use step IDs** – Give meaningful IDs to steps whose results you'll need later.
|
|
976
|
-
|
|
977
|
-
## 🔍 Troubleshooting & Debugging
|
|
978
|
-
|
|
979
|
-
### Finding MCP Server Logs
|
|
980
|
-
|
|
981
|
-
MCP logs are saved to:
|
|
982
|
-
- **Windows:** `%LOCALAPPDATA%\claude-cli-nodejs\Cache\<encoded-project-path>\mcp-logs-terminator-mcp-agent\`
|
|
983
|
-
- **macOS/Linux:** `~/.local/share/claude-cli-nodejs/Cache/<encoded-project-path>/mcp-logs-terminator-mcp-agent/`
|
|
984
|
-
|
|
985
|
-
Where `<encoded-project-path>` is your project path with special chars replaced (e.g., `C--Users-username-project`).
|
|
986
|
-
Note: Logs are saved as `.txt` files, not `.log` files.
|
|
987
|
-
|
|
988
|
-
**Read logs:**
|
|
989
|
-
```powershell
|
|
990
|
-
# Windows - Find and read latest logs (run in PowerShell)
|
|
991
|
-
Get-ChildItem (Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'claude-cli-nodejs\Cache\*\mcp-logs-terminator-mcp-agent\*.txt') | Sort-Object LastWriteTime -Descending | Select-Object -First 1 | Get-Content -Tail 50
|
|
992
|
-
```
|
|
993
|
-
|
|
994
|
-
### Enable Debug Logging
|
|
995
|
-
|
|
996
|
-
In your Claude MCP configuration (`claude_desktop_config.json`):
|
|
997
|
-
```jsonc
|
|
998
|
-
{
|
|
999
|
-
"mcpServers": {
|
|
1000
|
-
"terminator-mcp-agent": {
|
|
1001
|
-
"command": "path/to/terminator-mcp-agent",
|
|
1002
|
-
"env": {
|
|
1003
|
-
"LOG_LEVEL": "debug", // or "info", "warn", "error"
|
|
1004
|
-
"RUST_BACKTRACE": "1" // for stack traces on errors
|
|
1005
|
-
}
|
|
1006
|
-
}
|
|
1007
|
-
}
|
|
1008
|
-
}
|
|
1009
|
-
```
|
|
1010
|
-
|
|
1011
|
-
### Common Debug Scenarios
|
|
1012
|
-
|
|
1013
|
-
| Issue | What to Look For in Logs |
|
|
1014
|
-
|-------|--------------------------|
|
|
1015
|
-
| Workflow failures | Search for `fallback_id` triggers and `critical_error_occurred` |
|
|
1016
|
-
| Element not found | Look for selector resolution attempts, `find_element` timeouts |
|
|
1017
|
-
| Browser script errors | Check for `EVAL_ERROR`, Promise rejections, JavaScript exceptions |
|
|
1018
|
-
| Binary version issues | Startup logs show binary path and build timestamp |
|
|
1019
|
-
| MCP connection lost | Check for panic messages, ensure binary path is correct |
|
|
1020
|
-
|
|
1021
|
-
### Fallback Mechanism
|
|
1022
|
-
|
|
1023
|
-
Workflows support `fallback_id` to handle errors gracefully:
|
|
1024
|
-
- If a step fails and has a `fallback_id`, execution jumps to that step instead of stopping
|
|
1025
|
-
- Without `fallback_id`, errors may set `critical_error_occurred` and skip remaining steps
|
|
1026
|
-
- Use the `troubleshooting:` section for recovery steps that are reached only via fallback
|
|
1027
|
-
|
|
1028
|
-
> Need more help? Browse the examples under `examples/` in this repo or open a discussion on GitHub.
|
|
1029
|
-
|
|
1030
|
-
## Documentation
|
|
1031
|
-
|
|
1032
|
-
### Workflow Development
|
|
1033
|
-
|
|
1034
|
-
- **[Workflow Output Structure](docs/WORKFLOW_OUTPUT_STRUCTURE.md)**: Detailed documentation on the expected output structure for workflows, including:
|
|
1035
|
-
- How to structure `parsed_output` for proper CLI rendering
|
|
1036
|
-
- Success/failure indicators and business logic validation
|
|
1037
|
-
- Data extraction patterns and error handling
|
|
1038
|
-
- Integration with CLI and backend systems
|
|
1039
|
-
|
|
1040
|
-
### Additional Resources
|
|
1041
|
-
|
|
1042
|
-
- **[CLI Documentation](../terminator-cli/README.md)**: Command-line interface for executing workflows
|
|
1043
|
-
- **[Examples](examples/)**: Sample workflows and use cases
|
|
1044
|
-
- **[API Reference](https://github.com/mediar-ai/terminator#api)**: Core Terminator library documentation
|
|
1
|
+
## Terminator MCP Agent
|
|
2
|
+
|
|
3
|
+
<!-- BADGES:START -->
|
|
4
|
+
|
|
5
|
+
[<img alt="Install in VS Code" src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
|
|
6
|
+
[<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522terminator-mcp-agent%2522%253A%257B%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522-y%2522%252C%2522terminator-mcp-agent%2522%255D%257D%257D)
|
|
7
|
+
|
|
8
|
+
<!-- BADGES:END -->
|
|
9
|
+
|
|
10
|
+
A Model Context Protocol (MCP) server that provides desktop GUI automation capabilities using the [Terminator](https://github.com/mediar-ai/terminator) library. This server enables LLMs and agentic clients to interact with Windows, macOS, and Linux applications through structured accessibility APIs—no vision models or screenshots required.
|
|
11
|
+
|
|
12
|
+
## Quick Install
|
|
13
|
+
|
|
14
|
+
### Claude Code
|
|
15
|
+
|
|
16
|
+
Install with a single command:
|
|
17
|
+
|
|
18
|
+
```bash
|
|
19
|
+
claude mcp add terminator "npx -y terminator-mcp-agent" -s user
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
### Cursor
|
|
23
|
+
|
|
24
|
+
Copy and paste this URL into your browser's address bar:
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
cursor://anysphere.cursor-deeplink/mcp/install?name=terminator-mcp-agent&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsInRlcm1pbmF0b3ItbWNwLWFnZW50Il19
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
Or install manually:
|
|
31
|
+
|
|
32
|
+
1. Open Cursor Settings (`Cmd/Ctrl + ,`)
|
|
33
|
+
2. Go to the MCP tab
|
|
34
|
+
3. Add server with command: `npx -y terminator-mcp-agent`
|
|
35
|
+
|
|
36
|
+
### HTTP Endpoints (when running with `-t http`)
|
|
37
|
+
|
|
38
|
+
- `GET /health`: Always returns 200 while the process is alive.
|
|
39
|
+
- `GET /status`: Busy-aware probe for load balancers. Returns JSON and appropriate status:
|
|
40
|
+
- 200 when idle: `{ "busy": false, "activeRequests": 0, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
|
|
41
|
+
- 503 when busy: `{ "busy": true, "activeRequests": 1, "maxConcurrent": 1, "lastActivity": "<ISO-8601>" }`
|
|
42
|
+
- Content-Type is `application/json`.
|
|
43
|
+
- `POST /mcp`: MCP execution endpoint. Enforces single-request concurrency per machine by default.
|
|
44
|
+
|
|
45
|
+
Concurrency is controlled by the `MCP_MAX_CONCURRENT` environment variable (default `1`). Only accepted `POST /mcp` requests are counted toward `activeRequests`. If the server is at capacity, new `POST /mcp` requests return 503 immediately. This 503 behavior is intentional so an Azure Load Balancer probing `GET /status` can take a busy VM out of rotation and route traffic elsewhere.
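
As a rough sketch of how a client or custom health checker might interpret `GET /status`, the code below separates the decision logic from the network call. `describeStatus` and `probe` are hypothetical names, the base URL is a placeholder you would supply, and `fetch` is assumed to be available globally (Node.js 18 or later).

```javascript
// Hypothetical helpers for illustration only; not part of terminator-mcp-agent.

// Interpret the HTTP status code and JSON body returned by GET /status.
function describeStatus(httpStatus, body) {
  const freeSlots = body.maxConcurrent - body.activeRequests;
  return {
    available: httpStatus === 200 && !body.busy,
    freeSlots: Math.max(0, freeSlots),
  };
}

// Probe a running server. Assumes a global fetch (Node.js 18+).
async function probe(baseUrl) {
  const response = await fetch(`${baseUrl}/status`);
  return describeStatus(response.status, await response.json());
}

// Offline example using the payload shapes documented above.
console.log(
  describeStatus(503, { busy: true, activeRequests: 1, maxConcurrent: 1 })
);
```

Keeping `describeStatus` pure makes the routing decision easy to unit test without a running server.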
|
|
46
|
+
|
|
47
|
+
### Getting Started
|
|
48
|
+
|
|
49
|
+
The easiest way to get started is to use the one-click install buttons above for your specific editor (VS Code, Cursor, etc.).
|
|
50
|
+
|
|
51
|
+
Alternatively, you can install and configure the agent from your command line.
|
|
52
|
+
|
|
53
|
+
**1. Install & Configure Automatically**
|
|
54
|
+
Run the following command and select your MCP client from the list:
|
|
55
|
+
|
|
56
|
+
```sh
|
|
57
|
+
npx -y terminator-mcp-agent@latest --add-to-app
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
**2. Manual Configuration**
|
|
61
|
+
If you prefer, you can add the following to your MCP client's settings file:
|
|
62
|
+
|
|
63
|
+
```json
|
|
64
|
+
{
|
|
65
|
+
"mcpServers": {
|
|
66
|
+
"terminator-mcp-agent": {
|
|
67
|
+
"command": "npx",
|
|
68
|
+
"args": ["-y", "terminator-mcp-agent@latest"]
|
|
69
|
+
}
|
|
70
|
+
}
|
|
71
|
+
}
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
### Command Line Interface (CLI) Execution
|
|
75
|
+
|
|
76
|
+
For automation workflows and CI/CD pipelines, you can execute workflows directly from the command line using the [Terminator CLI](../terminator-cli/README.md):
|
|
77
|
+
|
|
78
|
+
**Quick Start:**
|
|
79
|
+
|
|
80
|
+
```bash
|
|
81
|
+
# Execute a workflow file
|
|
82
|
+
terminator mcp run workflow.yml
|
|
83
|
+
|
|
84
|
+
# With verbose logging
|
|
85
|
+
terminator mcp run workflow.yml --verbose
|
|
86
|
+
|
|
87
|
+
# Dry run (validate without executing)
|
|
88
|
+
terminator mcp run workflow.yml --dry-run
|
|
89
|
+
|
|
90
|
+
# Use specific MCP server version
|
|
91
|
+
terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"
|
|
92
|
+
|
|
93
|
+
# Run specific steps (requires step IDs in workflow)
|
|
94
|
+
terminator mcp run workflow.yml --start-from "step_12" --end-at "step_13"
|
|
95
|
+
|
|
96
|
+
# Run single step
|
|
97
|
+
terminator mcp run workflow.yml --start-from "read_json" --end-at "read_json"
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
**Workflow File Formats:**
|
|
101
|
+
|
|
102
|
+
Direct workflow format (`workflow.yml`):
|
|
103
|
+
|
|
104
|
+
```yaml
|
|
105
|
+
steps:
|
|
106
|
+
- tool_name: navigate_browser
|
|
107
|
+
arguments:
|
|
108
|
+
url: "https://example.com"
|
|
109
|
+
- tool_name: click_element
|
|
110
|
+
arguments:
|
|
111
|
+
selector: "role:Button|name:Submit"
|
|
112
|
+
stop_on_error: true
|
|
113
|
+
include_detailed_results: true
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
Tool call wrapper format (`workflow.json`):
|
|
117
|
+
|
|
118
|
+
```json
|
|
119
|
+
{
|
|
120
|
+
"tool_name": "execute_sequence",
|
|
121
|
+
"arguments": {
|
|
122
|
+
"steps": [
|
|
123
|
+
{
|
|
124
|
+
"tool_name": "navigate_browser",
|
|
125
|
+
"arguments": {
|
|
126
|
+
"url": "https://example.com"
|
|
127
|
+
}
|
|
128
|
+
}
|
|
129
|
+
]
|
|
130
|
+
}
|
|
131
|
+
}
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
**Code Execution in Workflows (engine mode):**
|
|
135
|
+
|
|
136
|
+
Execute custom JavaScript or Python with access to desktop automation APIs via `run_command`.
|
|
137
|
+
|
|
138
|
+
**Passing Data Between Workflow Steps:**
|
|
139
|
+
|
|
140
|
+
When using `engine` mode, data automatically flows between steps:
|
|
141
|
+
|
|
142
|
+
```yaml
|
|
143
|
+
steps:
|
|
144
|
+
# Step 1: Return data directly (NEW - simplified!)
|
|
145
|
+
- tool_name: run_command
|
|
146
|
+
arguments:
|
|
147
|
+
engine: "javascript"
|
|
148
|
+
run: |
|
|
149
|
+
// Get file info (example)
|
|
150
|
+
const filePath = 'C:\\data\\report.pdf';
|
|
151
|
+
const fileSize = 1024;
|
|
152
|
+
|
|
153
|
+
console.log(`Found file: ${filePath}`);
|
|
154
|
+
|
|
155
|
+
// Just return fields directly - they auto-merge into env
|
|
156
|
+
return {
|
|
157
|
+
status: 'success',
|
|
158
|
+
file_path: filePath, // Becomes env.file_path
|
|
159
|
+
file_size: fileSize // Becomes env.file_size
|
|
160
|
+
};
|
|
161
|
+
|
|
162
|
+
# Step 2: Access data automatically
|
|
163
|
+
- tool_name: run_command
|
|
164
|
+
arguments:
|
|
165
|
+
engine: "javascript"
|
|
166
|
+
run: |
|
|
167
|
+
// env is automatically available - no setup needed!
|
|
168
|
+
console.log(`Processing: ${env.file_path} (${env.file_size} bytes)`);
|
|
169
|
+
|
|
170
|
+
// Workflow variables also auto-available
|
|
171
|
+
console.log(`Config: ${variables.max_retries}`);
|
|
172
|
+
|
|
173
|
+
// NEW: Direct variable access also works!
|
|
174
|
+
console.log(`Processing: ${file_path} (${file_size} bytes)`);
|
|
175
|
+
console.log(`Config: ${max_retries}`);
|
|
176
|
+
|
|
177
|
+
// Continue with desktop automation
|
|
178
|
+
const elements = await desktop.locator('role:button').all();
|
|
179
|
+
|
|
180
|
+
// Return more data (auto-merges to env)
|
|
181
|
+
return {
|
|
182
|
+
status: 'success',
|
|
183
|
+
file_processed: env.file_path,
|
|
184
|
+
buttons_found: elements.length
|
|
185
|
+
};
|
|
186
|
+
```
|
|
187
|
+
|
|
188
|
+
**Important Notes on Data Passing:**
|
|
189
|
+
|
|
190
|
+
- **NEW:** `env` and `variables` are automatically injected into all scripts
|
|
191
|
+
- **NEW:** Non-reserved fields in return values auto-merge into env (no `set_env` wrapper needed)
|
|
192
|
+
- **NEW:** Valid env fields are also available as individual variables (e.g., `file_path` instead of `env.file_path`)
|
|
193
|
+
- Reserved fields that don't auto-merge: `status`, `error`, `logs`, `duration_ms`, `set_env`
|
|
194
|
+
- Data passing only works with `engine` mode (JavaScript/Python), NOT with shell commands
|
|
195
|
+
- Backward compatible: explicit `set_env` still works if needed
|
|
196
|
+
- Individual variable names must be valid JavaScript identifiers (no spaces, special chars, or reserved keywords)
|
|
197
|
+
- Watch for backslash escaping issues in Windows paths (may need double escaping)
|
|
198
|
+
- Consider combining related operations in a single step if data passing becomes complex
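
The merging rules in the notes above can be approximated as follows. This is a simplified sketch, not the agent's code: `mergeIntoEnv` and `injectableNames` are hypothetical helpers, the precedence of direct fields over an explicit `set_env` is an assumption, and the identifier check ignores JavaScript reserved keywords for brevity.

```javascript
// Hypothetical helpers for illustration only; not part of terminator-mcp-agent.

// Reserved fields listed in the notes above; they do not auto-merge into env.
const RESERVED_FIELDS = new Set(["status", "error", "logs", "duration_ms", "set_env"]);

// Simplified identifier test (does not reject reserved keywords).
const IDENTIFIER_PATTERN = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Merge a script's return value into the environment.
function mergeIntoEnv(env, returned) {
  // An explicit set_env still works (backward compatible).
  const next = { ...env, ...(returned.set_env || {}) };
  for (const [key, value] of Object.entries(returned)) {
    if (!RESERVED_FIELDS.has(key)) next[key] = value; // auto-merge
  }
  return next;
}

// Names that could be exposed as individual variables in later scripts.
function injectableNames(env) {
  return Object.keys(env).filter((name) => IDENTIFIER_PATTERN.test(name));
}

const merged = mergeIntoEnv(
  {},
  { status: "success", file_path: "C:\\data\\report.pdf", file_size: 1024 }
);
console.log(merged);
console.log(injectableNames({ ...merged, "not valid": true }));
```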
|
|
199
|
+
|
|
200
|
+
For complete CLI documentation, see [Terminator CLI README](../terminator-cli/README.md).
|
|
201
|
+
|
|
202
|
+
### Core Workflows: From Interaction to Structured Data
|
|
203
|
+
|
|
204
|
+
The Terminator MCP agent offers two primary workflows for automating desktop tasks. Both paths lead to the same goal: an automation with >95% accuracy that runs 10000x faster than a human.
|
|
205
|
+
|
|
206
|
+
#### 1. Iterative Development with `execute_sequence`
|
|
207
|
+
|
|
208
|
+
This is the most powerful and flexible method. You build a workflow step-by-step, using MCP tools to inspect the UI and refine your actions.
|
|
209
|
+
|
|
210
|
+
1. **Inspect the UI**: Start by using `get_focused_window_tree` to understand the structure of your target application. This gives you the roles, names, and IDs of all elements.
|
|
211
|
+
2. **Build a Sequence**: Create an `execute_sequence` tool call with a series of actions (`click_element`, `type_into_element`, etc.). Use robust selectors (like `role|name` or stable `properties:AutomationId:value` selectors) whenever possible.
|
|
212
|
+
3. **Capture the Final State**: Ensure the last step in your sequence is an action that returns a UI tree. The `wait_for_element` tool with `include_tree: true` is perfect for this, as it captures the application's state after your automation has run.
|
|
213
|
+
4. **Extract Structured Data with `output_parser`**: Add the `output_parser` argument to your `execute_sequence` call. Write JavaScript code to parse the final UI tree and extract structured data. If successful, the tool result will contain a `parsed_output` field with your clean JSON data.
|
|
214
|
+
|
|
215
|
+
Here is an example of an `output_parser` that extracts insurance quote data from a web page:
|
|
216
|
+
|
|
217
|
+
```yaml
|
|
218
|
+
output_parser:
|
|
219
|
+
ui_tree_source_step_id: capture_quotes_tree
|
|
220
|
+
javascript_code: |
|
|
221
|
+
// Find all quote groups with Image and Text children
|
|
222
|
+
const results = [];
|
|
223
|
+
|
|
224
|
+
function findElementsRecursively(element) {
|
|
225
|
+
if (element.attributes && element.attributes.role === 'Group') {
|
|
226
|
+
const children = element.children || [];
|
|
227
|
+
const hasImage = children.some(child =>
|
|
228
|
+
child.attributes && child.attributes.role === 'Image'
|
|
229
|
+
);
|
|
230
|
+
const hasText = children.some(child =>
|
|
231
|
+
child.attributes && child.attributes.role === 'Text'
|
|
232
|
+
);
|
|
233
|
+
|
|
234
|
+
if (hasImage && hasText) {
|
|
235
|
+
const textElements = children.filter(child =>
|
|
236
|
+
child.attributes && child.attributes.role === 'Text' && child.attributes.name
|
|
237
|
+
);
|
|
238
|
+
|
|
239
|
+
let carrierProduct = '';
|
|
240
|
+
let monthlyPrice = '';
|
|
241
|
+
|
|
242
|
+
for (const textEl of textElements) {
|
|
243
|
+
const text = textEl.attributes.name;
|
|
244
|
+
if (text.includes(':')) {
|
|
245
|
+
carrierProduct = text;
|
|
246
|
+
}
|
|
247
|
+
if (text.startsWith('$')) {
|
|
248
|
+
monthlyPrice = text;
|
|
249
|
+
}
|
|
250
|
+
}
|
|
251
|
+
|
|
252
|
+
if (carrierProduct && monthlyPrice) {
|
|
253
|
+
results.push({
|
|
254
|
+
carrierProduct: carrierProduct,
|
|
255
|
+
monthlyPrice: monthlyPrice
|
|
256
|
+
});
|
|
257
|
+
}
|
|
258
|
+
}
|
|
259
|
+
}
|
|
260
|
+
|
|
261
|
+
if (element.children) {
|
|
262
|
+
for (const child of element.children) {
|
|
263
|
+
findElementsRecursively(child);
|
|
264
|
+
}
|
|
265
|
+
}
|
|
266
|
+
}
|
|
267
|
+
|
|
268
|
+
findElementsRecursively(tree);
|
|
269
|
+
return results;
|
|
270
|
+
```
|
|
271
|
+
|
|
272
|
+
#### 2. Recording Human Actions with `record_workflow`
|
|
273
|
+
|
|
274
|
+
For simpler tasks, you can record your own actions to generate a baseline workflow.
|
|
275
|
+
|
|
276
|
+
1. **Start Recording**: Call `record_workflow` with `action: "start"`.
|
|
277
|
+
2. **Perform the Task**: Manually perform the clicks, typing, and other interactions in the target application.
|
|
278
|
+
3. **Stop and Save**: Call `record_workflow` with `action: "stop"`. This returns a complete workflow JSON file containing all your recorded actions.
|
|
279
|
+
4. **Refine and Parse**: The recorded workflow is a great starting point. You can then refine the selectors for robustness, add a final step to capture the UI tree, and attach an `output_parser` to extract structured data, just as you would in the iterative workflow.
|
|
280
|
+
|
|
281
|
+
### Browser DOM Inspection
|
|
282
|
+
|
|
283
|
+
The `execute_browser_script` tool enables direct JavaScript execution in browser contexts, providing access to the full HTML DOM. This is particularly useful when you need information not available in the accessibility tree.
|
|
284
|
+
|
|
285
|
+
#### When to Use DOM vs Accessibility Tree
|
|
286
|
+
|
|
287
|
+
**Use Accessibility Tree (default) when:**
|
|
288
|
+
|
|
289
|
+
- Navigating and interacting with UI elements
|
|
290
|
+
- Working with semantic page structure
|
|
291
|
+
- Building reliable automation workflows
|
|
292
|
+
- Performance is critical (faster, cleaner data)
|
|
293
|
+
|
|
294
|
+
**Use DOM Inspection when:**
|
|
295
|
+
|
|
296
|
+
- Extracting data attributes, meta tags, or hidden inputs
|
|
297
|
+
- Debugging why elements aren't appearing in accessibility tree
|
|
298
|
+
- Scraping structured data from specific HTML patterns
|
|
299
|
+
- Validating complete page structure or SEO elements
|
|
300
|
+
|
|
301
|
+
#### Basic DOM Retrieval Patterns
|
|
302
|
+
|
|
303
|
+
```javascript
|
|
304
|
+
// Get full HTML DOM (be mindful of size limits)
|
|
305
|
+
execute_browser_script({
|
|
306
|
+
selector: "role:Window|name:Google Chrome",
|
|
307
|
+
script: "document.documentElement.outerHTML",
|
|
308
|
+
});
|
|
309
|
+
|
|
310
|
+
// Get structured page information
|
|
311
|
+
execute_browser_script({
|
|
312
|
+
selector: "role:Window|name:Google Chrome",
|
|
313
|
+
script: `({
|
|
314
|
+
url: window.location.href,
|
|
315
|
+
title: document.title,
|
|
316
|
+
html: document.documentElement.outerHTML,
|
|
317
|
+
bodyText: document.body.innerText.substring(0, 1000)
|
|
318
|
+
})`,
|
|
319
|
+
});
|
|
320
|
+
|
|
321
|
+
// Extract specific data (forms, hidden inputs, meta tags)
|
|
322
|
+
execute_browser_script({
|
|
323
|
+
selector: "role:Window|name:Google Chrome",
|
|
324
|
+
script: `({
|
|
325
|
+
forms: Array.from(document.forms).map(f => ({
|
|
326
|
+
id: f.id,
|
|
327
|
+
action: f.action,
|
|
328
|
+
method: f.method,
|
|
329
|
+
inputs: Array.from(f.elements).map(e => ({
|
|
330
|
+
name: e.name,
|
|
331
|
+
type: e.type,
|
|
332
|
+
value: e.type === 'password' ? '[REDACTED]' : e.value
|
|
333
|
+
}))
|
|
334
|
+
})),
|
|
335
|
+
hiddenInputs: Array.from(document.querySelectorAll('input[type="hidden"]')).map(e => ({
|
|
336
|
+
name: e.name,
|
|
337
|
+
value: e.value
|
|
338
|
+
})),
|
|
339
|
+
metaTags: Array.from(document.querySelectorAll('meta')).map(m => ({
|
|
340
|
+
name: m.name || m.property,
|
|
341
|
+
content: m.content
|
|
342
|
+
}))
|
|
343
|
+
})`,
|
|
344
|
+
});
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
#### Handling Large DOMs
|
|
348
|
+
|
|
349
|
+
MCP responses are subject to size limits (~30KB). For large DOMs, use truncation strategies:
|
|
350
|
+
|
|
351
|
+
```javascript
|
|
352
|
+
execute_browser_script({
|
|
353
|
+
selector: "role:Window|name:Google Chrome",
|
|
354
|
+
script: `
|
|
355
|
+
const html = document.documentElement.outerHTML;
|
|
356
|
+
const maxLength = 30000;
|
|
357
|
+
|
|
358
|
+
({
|
|
359
|
+
url: window.location.href,
|
|
360
|
+
title: document.title,
|
|
361
|
+
html: html.length > maxLength
|
|
362
|
+
? html.substring(0, maxLength) + '... [truncated at ' + maxLength + ' chars]'
|
|
363
|
+
: html,
|
|
364
|
+
totalLength: html.length,
|
|
365
|
+
truncated: html.length > maxLength
|
|
366
|
+
})
|
|
367
|
+
`,
|
|
368
|
+
});
|
|
369
|
+
```
|
|
370
|
+
|
|
371
|
+
#### Advanced DOM Analysis
|
|
372
|
+
|
|
373
|
+
```javascript
|
|
374
|
+
// Analyze page structure and extract semantic content
|
|
375
|
+
execute_browser_script({
|
|
376
|
+
selector: "role:Window|name:Google Chrome",
|
|
377
|
+
script: `
|
|
378
|
+
// Remove scripts and styles for cleaner analysis
|
|
379
|
+
const clonedDoc = document.documentElement.cloneNode(true);
|
|
380
|
+
clonedDoc.querySelectorAll('script, style, noscript').forEach(el => el.remove());
|
|
381
|
+
|
|
382
|
+
({
|
|
383
|
+
// Page metrics
|
|
384
|
+
domElementCount: document.querySelectorAll('*').length,
|
|
385
|
+
formCount: document.forms.length,
|
|
386
|
+
linkCount: document.links.length,
|
|
387
|
+
imageCount: document.images.length,
|
|
388
|
+
|
|
389
|
+
// Semantic structure
|
|
390
|
+
headings: Array.from(document.querySelectorAll('h1,h2,h3')).map(h => ({
|
|
391
|
+
level: h.tagName,
|
|
392
|
+
text: h.innerText.substring(0, 100)
|
|
393
|
+
})),
|
|
394
|
+
|
|
395
|
+
// Clean HTML without scripts/styles
|
|
396
|
+
cleanHtml: clonedDoc.outerHTML.substring(0, 20000),
|
|
397
|
+
|
|
398
|
+
// Data extraction
|
|
399
|
+
jsonLd: Array.from(document.querySelectorAll('script[type="application/ld+json"]'))
|
|
400
|
+
.map(s => { try { return JSON.parse(s.textContent); } catch { return null; } })
|
|
401
|
+
.filter(Boolean)
|
|
402
|
+
})
|
|
403
|
+
`,
|
|
404
|
+
});
|
|
405
|
+
```
|
|
406
|
+
|
|
407
|
+
#### Passing Data with Environment Variables
|
|
408
|
+
|
|
409
|
+
The `execute_browser_script` tool now supports passing data through `env` and `outputs` parameters:
|
|
410
|
+
|
|
411
|
+
```javascript
|
|
412
|
+
// Step 1: Set environment variables in JavaScript
|
|
413
|
+
run_command({
|
|
414
|
+
engine: "javascript",
|
|
415
|
+
run: `
|
|
416
|
+
return {
|
|
417
|
+
set_env: {
|
|
418
|
+
userName: 'John Doe',
|
|
419
|
+
userId: '12345',
|
|
420
|
+
apiKey: 'secret-key'
|
|
421
|
+
}
|
|
422
|
+
};
|
|
423
|
+
`,
|
|
424
|
+
});
|
|
425
|
+
|
|
426
|
+
// Step 2: Use environment variables in browser script
|
|
427
|
+
execute_browser_script({
|
|
428
|
+
selector: "role:Window",
|
|
429
|
+
env: {
|
|
430
|
+
userName: "{{env.userName}}",
|
|
431
|
+
userId: "{{env.userId}}",
|
|
432
|
+
},
|
|
433
|
+
script: `
|
|
434
|
+
// Parse env if it's a JSON string (for backward compatibility)
|
|
435
|
+
const parsedEnv = typeof env === 'string' ? JSON.parse(env) : env;
|
|
436
|
+
|
|
437
|
+
// Use the data - traditional way
|
|
438
|
+
console.log('Processing user:', parsedEnv.userName);
|
|
439
|
+
|
|
440
|
+
// NEW: Direct variable access also works!
|
|
441
|
+
console.log('Processing user:', userName); // Direct access
|
|
442
|
+
console.log('User ID:', userId); // No env prefix needed
|
|
443
|
+
|
|
444
|
+
// Fill form with data
|
|
445
|
+
document.querySelector('#username').value = userName;
|
|
446
|
+
document.querySelector('#userid').value = userId;
|
|
447
|
+
|
|
448
|
+
// Return result and set new variables
|
|
449
|
+
JSON.stringify({
|
|
450
|
+
status: 'form_filled',
|
|
451
|
+
set_env: {
|
|
452
|
+
form_submitted: 'true',
|
|
453
|
+
timestamp: new Date().toISOString()
|
|
454
|
+
}
|
|
455
|
+
});
|
|
456
|
+
`,
|
|
457
|
+
});
|
|
458
|
+
```
|
|
459
|
+
|
|
460
|
+
#### Loading Scripts from Files
|
|
461
|
+
|
|
462
|
+
You can load JavaScript from external files using the `script_file` parameter:
|
|
463
|
+
|
|
464
|
+
```javascript
|
|
465
|
+
// browser_scripts/extract_data.js
|
|
466
|
+
const parsedEnv = typeof env === "string" ? JSON.parse(env) : env;
|
|
467
|
+
const parsedOutputs =
|
|
468
|
+
typeof outputs === "string" ? JSON.parse(outputs) : outputs;
|
|
469
|
+
|
|
470
|
+
console.log("Script loaded from file");
|
|
471
|
+
console.log("User:", parsedEnv?.userName);
|
|
472
|
+
console.log("Previous result:", parsedOutputs?.previousStep);
|
|
473
|
+
|
|
474
|
+
// Extract and return data
|
|
475
|
+
JSON.stringify({
|
|
476
|
+
extractedData: {
|
|
477
|
+
url: window.location.href,
|
|
478
|
+
title: document.title,
|
|
479
|
+
forms: document.forms.length,
|
|
480
|
+
},
|
|
481
|
+
set_env: {
|
|
482
|
+
extraction_complete: "true",
|
|
483
|
+
},
|
|
484
|
+
});
|
|
485
|
+
|
|
486
|
+
// In your workflow:
|
|
487
|
+
execute_browser_script({
|
|
488
|
+
selector: "role:Window",
|
|
489
|
+
script_file: "browser_scripts/extract_data.js",
|
|
490
|
+
env: {
|
|
491
|
+
userName: "{{env.userName}}",
|
|
492
|
+
previousStep: "{{env.previousStep}}",
|
|
493
|
+
},
|
|
494
|
+
});
|
|
495
|
+
```
|
|
496
|
+
|
|
497
|
+
#### Important Notes
|
|
498
|
+
|
|
499
|
+
1. **Chrome Extension Required**: The `execute_browser_script` tool requires the Terminator browser extension to be installed. See the installation workflow examples for automated setup.
|
|
500
|
+
|
|
501
|
+
2. **Security Considerations**: Be cautious when extracting sensitive data. The examples above redact password fields; apply the same practice to any other sensitive values you extract.
|
|
502
|
+
|
|
503
|
+
3. **Performance**: DOM operations are synchronous and can be slow on large pages. Consider using specific selectors rather than traversing the entire DOM.
|
|
504
|
+
|
|
505
|
+
4. **Error Handling**: Always wrap complex DOM operations in try-catch blocks and return meaningful error messages.
|
|
506
|
+
|
|
507
|
+
5. **Data Injection**: When using `env` or `outputs` parameters, they are injected as JavaScript variables at the beginning of your script. Always parse them if they might be JSON strings.
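
Notes 4 and 5 can be combined into a small pattern. The sketch below is only an illustration: `parseInjected` and `runSafely` are hypothetical helper names, not part of Terminator, and the `status`/`message` result shape is an assumption chosen for the example.

```javascript
// Hypothetical helpers for use at the top of a browser script.

// env/outputs may arrive as an object or as a JSON string (note 5).
function parseInjected(value) {
  if (typeof value !== "string") {
    return value ?? {};
  }
  try {
    return JSON.parse(value);
  } catch {
    return {}; // not valid JSON: fall back to an empty object
  }
}

// Wrap a DOM operation so failures come back as a readable message (note 4).
function runSafely(extract) {
  try {
    return JSON.stringify({ status: "ok", data: extract() });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return JSON.stringify({ status: "error", message });
  }
}
```

A script could then end with `runSafely(() => document.querySelector('#username').value)` so that a missing element produces an `error` result instead of an unhandled exception.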
|
|
508
|
+
|
|
509
|
+
## Local Development
|
|
510
|
+
|
|
511
|
+
To build and test the agent from the source code:
|
|
512
|
+
|
|
513
|
+
```sh
|
|
514
|
+
# 1. Clone the entire Terminator repository
|
|
515
|
+
git clone https://github.com/mediar-ai/terminator
|
|
516
|
+
|
|
517
|
+
# 2. Navigate to the agent's directory
|
|
518
|
+
cd terminator/terminator-mcp-agent
|
|
519
|
+
|
|
520
|
+
# 3. Install Node.js dependencies
|
|
521
|
+
npm install
|
|
522
|
+
|
|
523
|
+
# 4. Build the Rust binary and Node.js wrapper
|
|
524
|
+
npm run build
|
|
525
|
+
|
|
526
|
+
# 5. To use your local build in your MCP client, install it globally
|
|
527
|
+
npm install --global .
|
|
528
|
+
```
|
|
529
|
+
|
|
530
|
+
Now, when your MCP client runs `terminator-mcp-agent`, it will use your local build instead of the published `npm` version.
|
|
531
|
+
|
|
532
|
+
---
|
|
533
|
+
|
|
534
|
+
## Troubleshooting
|
|
535
|
+
|
|
536
|
+
- Make sure you have Node.js installed (v16+ recommended).
|
|
537
|
+
- For VS Code/Insiders, ensure the CLI (`code` or `code-insiders`) is available in your PATH.
|
|
538
|
+
- If you encounter issues, try running with elevated permissions.
|
|
539
|
+
|
|
540
|
+
### Version Compatibility Issues
|
|
541
|
+
|
|
542
|
+
**Problem**: "missing field `items`" or schema mismatch errors
|
|
543
|
+
|
|
544
|
+
**Solution**: Ensure you're using the latest MCP server version:
|
|
545
|
+
|
|
546
|
+
```bash
|
|
547
|
+
# Force latest version in CLI
|
|
548
|
+
terminator mcp run workflow.yml --command "npx -y terminator-mcp-agent@latest"
|
|
549
|
+
|
|
550
|
+
# Update MCP client configuration to use @latest
|
|
551
|
+
{
|
|
552
|
+
"mcpServers": {
|
|
553
|
+
"terminator-mcp-agent": {
|
|
554
|
+
"command": "npx",
|
|
555
|
+
"args": ["-y", "terminator-mcp-agent@latest"]
|
|
556
|
+
}
|
|
557
|
+
}
|
|
558
|
+
}
|
|
559
|
+
|
|
560
|
+
# Clear npm cache if needed
|
|
561
|
+
npm cache clean --force
|
|
562
|
+
```
|
|
563
|
+
|
|
564
|
+
### CLI Integration Issues
|
|
565
|
+
|
|
566
|
+
**Problem**: CLI commands not working or connection errors
|
|
567
|
+
|
|
568
|
+
**Solution**: Test MCP connectivity step by step:
|
|
569
|
+
|
|
570
|
+
```bash
|
|
571
|
+
# Test basic connectivity
|
|
572
|
+
terminator mcp exec get_applications
|
|
573
|
+
|
|
574
|
+
# Test with verbose logging
|
|
575
|
+
terminator mcp run workflow.yml --verbose
|
|
576
|
+
|
|
577
|
+
# Test with dry run first
|
|
578
|
+
terminator mcp run workflow.yml --dry-run
|
|
579
|
+
|
|
580
|
+
# Use HTTP connection for debugging
|
|
581
|
+
terminator mcp run workflow.yml --url http://localhost:3000/mcp
|
|
582
|
+
```
|
|
583
|
+
|
|
584
|
+
### JavaScript Execution Issues
|
|
585
|
+
|
|
586
|
+
**Problem**: JavaScript code fails or can't access desktop APIs
|
|
587
|
+
|
|
588
|
+
**Solution**: Verify JavaScript execution and API access:
|
|
589
|
+
|
|
590
|
+
```bash
|
|
591
|
+
# Test basic JavaScript execution via run_command engine mode
|
|
592
|
+
terminator mcp exec run_command '{"engine": "javascript", "run": "return {test: true};"}'
|
|
593
|
+
|
|
594
|
+
# Test desktop API access with node engine
|
|
595
|
+
terminator mcp exec run_command '{"engine": "node", "run": "const elements = await desktop.locator(\"role:button\").all(); return {count: elements.length};"}'
|
|
596
|
+
|
|
597
|
+
# Test Python engine
|
|
598
|
+
terminator mcp exec run_command '{"engine": "python", "run": "return {\"py\": True}"}'
|
|
599
|
+
|
|
600
|
+
# Debug with verbose logging
|
|
601
|
+
terminator mcp run workflow.yml --verbose
|
|
602
|
+
```
|
|
603
|
+
|
|
604
|
+
### Workflow File Issues
|
|
605
|
+
|
|
606
|
+
**Problem**: Workflow parsing errors or unexpected behavior
|
|
607
|
+
|
|
608
|
+
**Solution**: Validate workflow structure:
|
|
609
|
+
|
|
610
|
+
```bash
|
|
611
|
+
# Validate workflow syntax
|
|
612
|
+
terminator mcp run workflow.yml --dry-run
|
|
613
|
+
|
|
614
|
+
# Test with minimal workflow first
|
|
615
|
+
echo 'steps: [{tool_name: get_applications}]' > test.yml
|
|
616
|
+
terminator mcp run test.yml
|
|
617
|
+
|
|
618
|
+
# Check both YAML and JSON formats work
|
|
619
|
+
terminator mcp run workflow.yml # YAML
|
|
620
|
+
terminator mcp run workflow.json # JSON
|
|
621
|
+
```
|
|
622
|
+
|
|
623
|
+
### Platform-Specific Issues
|
|
624
|
+
|
|
625
|
+
**Windows**:
|
|
626
|
+
|
|
627
|
+
- Ensure Windows UI Automation APIs are available
|
|
628
|
+
- Run with administrator privileges if accessibility features are restricted
|
|
629
|
+
- Check Windows Defender/antivirus isn't blocking automation
|
|
630
|
+
|
|
631
|
+
**macOS**:
|
|
632
|
+
|
|
633
|
+
- Grant accessibility permissions in System Preferences > Security & Privacy (System Settings > Privacy & Security on macOS 13 and later)
|
|
634
|
+
- Ensure the terminal/IDE has accessibility access
|
|
635
|
+
- Check macOS version compatibility (10.14+ recommended)
|
|
636
|
+
|
|
637
|
+
**Linux**:
|
|
638
|
+
|
|
639
|
+
- Ensure AT-SPI (assistive technology) is enabled
|
|
640
|
+
- Install required packages: `sudo apt-get install at-spi2-core`
|
|
641
|
+
- Check desktop environment compatibility (GNOME, KDE, XFCE supported)
|
|
642
|
+
|
|
643
|
+
### Virtual Display Support (Headless VMs)
|
|
644
|
+
|
|
645
|
+
Terminator MCP Agent includes virtual display support for running on headless VMs without requiring RDP connections. This enables scalable automation on cloud platforms like Azure, AWS, and GCP.
|
|
646
|
+
|
|
647
|
+
**How It Works**:
|
|
648
|
+
|
|
649
|
+
The agent automatically detects headless environments and initializes a virtual display context that Windows UI Automation APIs can interact with. This allows full UI automation capabilities even when no physical display or RDP session is active.
|
|
650
|
+
|
|
651
|
+
**Activation**:
|
|
652
|
+
|
|
653
|
+
Virtual display activates automatically when:
|
|
654
|
+
|
|
655
|
+
- Environment variable `TERMINATOR_HEADLESS=true` is set
|
|
656
|
+
- No console window is available (common in VM/container scenarios)
|
|
657
|
+
- Running as a Windows service or scheduled task
|
|
658
|
+
|
|
659
|
+
**Configuration**:
|
|
660
|
+
|
|
661
|
+
```bash
|
|
662
|
+
# Enable virtual display mode
|
|
663
|
+
export TERMINATOR_HEADLESS=true
|
|
664
|
+
|
|
665
|
+
# Run the MCP agent
|
|
666
|
+
npx -y terminator-mcp-agent
|
|
667
|
+
```
|
|
668
|
+
|
|
669
|
+
**Use Cases**:
|
|
670
|
+
|
|
671
|
+
- Running multiple automation agents on VMs without RDP overhead
|
|
672
|
+
- CI/CD pipelines in cloud environments
|
|
673
|
+
- Scalable automation farms on Azure/AWS/GCP
|
|
674
|
+
- Containerized automation workloads
|
|
675
|
+
|
|
676
|
+
**Requirements**:
|
|
677
|
+
|
|
678
|
+
- Windows Server 2016+ or Windows 10/11
|
|
679
|
+
- .NET Framework 4.7.2+
|
|
680
|
+
- UI Automation APIs available (included in Windows)
|
|
681
|
+
|
|
682
|
+
The virtual display manager creates a memory-based display context that satisfies Windows UI Automation requirements, enabling terminator to enumerate and interact with UI elements as if a physical display were present.
|
|
683
|
+
|
|
684
|
+
### Performance Optimization
|
|
685
|
+
|
|
686
|
+
**Large UI Trees**:
|
|
687
|
+
|
|
688
|
+
- Use specific selectors instead of broad element searches
|
|
689
|
+
- Implement delays between rapid operations
|
|
690
|
+
- Consider using `include_tree: false` for intermediate steps
|
|
691
|
+
|
|
692
|
+
**JavaScript Performance**:
|
|
693
|
+
|
|
694
|
+
- Use `quickjs` engine for lightweight operations
|
|
695
|
+
- Use `nodejs` engine only when full APIs are needed
|
|
696
|
+
- Implement `sleep()` delays in loops to prevent overwhelming the UI
|
|
697
|
+
|
|
698
|
+
For additional help, see the [Terminator CLI documentation](../terminator-cli/README.md) or open an issue on GitHub.
|
|
699
|
+
|
|
700
|
+
---
|
|
701
|
+
|
|
702
|
+
## 📚 Full `execute_sequence` Reference & Sample Workflow
|
|
703
|
+
|
|
704
|
+
> **Why another example?** The quick start above shows the concept, but many users asked for a fully-annotated workflow schema. The example below automates the Windows **Calculator** app—so it is 100% safe to share and does **not** reveal any private customer data. Feel free to copy-paste and adapt it to your own application.
|
|
705
|
+
|
|
706
|
+
### 1. Anatomy of an `execute_sequence` Call
|
|
707
|
+
|
|
708
|
+
```jsonc
|
|
709
|
+
{
|
|
710
|
+
"tool_name": "execute_sequence",
|
|
711
|
+
"arguments": {
|
|
712
|
+
"variables": {
|
|
713
|
+
// 1️⃣ Re-usable inputs with type metadata
|
|
714
|
+
"app_path": {
|
|
715
|
+
"type": "string",
|
|
716
|
+
"label": "Calculator EXE Path",
|
|
717
|
+
"default": "calc.exe"
|
|
718
|
+
},
|
|
719
|
+
"first_number": {
|
|
720
|
+
"type": "string",
|
|
721
|
+
"label": "First Number",
|
|
722
|
+
"default": "42"
|
|
723
|
+
},
|
|
724
|
+
"second_number": {
|
|
725
|
+
"type": "string",
|
|
726
|
+
"label": "Second Number",
|
|
727
|
+
"default": "8"
|
|
728
|
+
}
|
|
729
|
+
},
|
|
730
|
+
"inputs": {
|
|
731
|
+
// 2️⃣ Concrete values for *this run*
|
|
732
|
+
"app_path": "calc.exe",
|
|
733
|
+
"first_number": "42",
|
|
734
|
+
"second_number": "8"
|
|
735
|
+
},
|
|
736
|
+
"selectors": {
|
|
737
|
+
// 3️⃣ Human-readable element shortcuts
|
|
738
|
+
"calc_window": "role:Window|name:Calculator",
|
|
739
|
+
"btn_clear": "role:Button|name:Clear",
|
|
740
|
+
"btn_plus": "role:Button|name:Plus",
|
|
741
|
+
"btn_equals": "role:Button|name:Equals"
|
|
742
|
+
},
|
|
743
|
+
"steps": [
|
|
744
|
+
// 4️⃣ Ordered actions & control flow
|
|
745
|
+
{
|
|
746
|
+
"tool_name": "open_application",
|
|
747
|
+
"arguments": { "path": "${{app_path}}" }
|
|
748
|
+
},
|
|
749
|
+
{
|
|
750
|
+
"tool_name": "click_element", // 4a. Make sure the UI is reset
|
|
751
|
+
"arguments": { "selector": "${{selectors.btn_clear}}" },
|
|
752
|
+
"continue_on_error": true
|
|
753
|
+
},
|
|
754
|
+
{
|
|
755
|
+
"group_name": "Enter First Number", // 4b. Groups improve logs
|
|
756
|
+
"steps": [
|
|
757
|
+
{
|
|
758
|
+
"tool_name": "type_into_element",
|
|
759
|
+
"arguments": {
|
|
760
|
+
"selector": "${{selectors.calc_window}}",
|
|
761
|
+
"text_to_type": "${{first_number}}"
|
|
762
|
+
}
|
|
763
|
+
}
|
|
764
|
+
]
|
|
765
|
+
},
|
|
766
|
+
{
|
|
767
|
+
"tool_name": "click_element",
|
|
768
|
+
"arguments": { "selector": "${{selectors.btn_plus}}" }
|
|
769
|
+
},
|
|
770
|
+
{
|
|
771
|
+
"group_name": "Enter Second Number",
|
|
772
|
+
"steps": [
|
|
773
|
+
{
|
|
774
|
+
"tool_name": "type_into_element",
|
|
775
|
+
"arguments": {
|
|
776
|
+
"selector": "${{selectors.calc_window}}",
|
|
777
|
+
"text_to_type": "${{second_number}}"
|
|
778
|
+
}
|
|
779
|
+
}
|
|
780
|
+
]
|
|
781
|
+
},
|
|
782
|
+
{
|
|
783
|
+
"tool_name": "click_element",
|
|
784
|
+
"arguments": { "selector": "${{selectors.btn_equals}}" }
|
|
785
|
+
},
|
|
786
|
+
{
|
|
787
|
+
"tool_name": "wait_for_element", // 4c. Capture final UI tree
|
|
788
|
+
"arguments": {
|
|
789
|
+
"selector": "${{selectors.calc_window}}",
|
|
790
|
+
"condition": "exists",
|
|
791
|
+
"include_tree": true,
|
|
792
|
+
"timeout_ms": 2000
|
|
793
|
+
}
|
|
794
|
+
}
|
|
795
|
+
],
|
|
796
|
+
"output_parser": {
|
|
797
|
+
// 5️⃣ Turn the tree into clean JSON
|
|
798
|
+
"javascript_code": "// Extract calculator display value\nconst results = [];\n\nfunction findElementsRecursively(element) {\n if (element.attributes && element.attributes.role === 'Text') {\n const item = {\n displayValue: element.attributes.name || ''\n };\n results.push(item);\n }\n \n if (element.children) {\n for (const child of element.children) {\n findElementsRecursively(child);\n }\n }\n}\n\nfindElementsRecursively(tree);\nreturn results;"
|
|
799
|
+
}
|
|
800
|
+
}
|
|
801
|
+
}
|
|
802
|
+
```
|
|
803
|
+
|
|
804
|
+
### 2. Key Concepts at a Glance
|
|
805
|
+
|
|
806
|
+
1. **Variables vs. Inputs** – Declare once, override per-run. This is perfect for parameterizing CI pipelines or A/B test data.
|
|
807
|
+
2. **Selectors** – Give every important UI element a _nickname_. It makes long workflows readable and easy to maintain.
|
|
808
|
+
3. **Templating** – `${{ ... }}` (GitHub Actions-style) _or_ legacy `{{ ... }}` lets you reference **any** key inside `variables`, `inputs`, or `selectors`. Both syntaxes are supported; the engine uses Mustache-style rendering.
|
|
809
|
+
4. **Groups & Control Flow** – Add `group_name`, `skippable`, `if`, or `continue_on_error` to any step for advanced branching.
|
|
810
|
+
5. **Output Parsing** – Always end with a step that includes the UI tree, then use `output_parser` (the `javascript_code` shown above) to extract the data you need.
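
To make concepts 1 to 3 concrete, here is a minimal sketch of how placeholders could be resolved. It is illustrative only: the real engine uses Mustache-style rendering, and `buildContext`, `lookup`, and `renderTemplate` are hypothetical names. The lookup order (variable defaults, then inputs, with selectors under `selectors.`) is an assumption based on the description above.

```javascript
// Illustrative only; not the agent's implementation.

// Build the lookup context: variable defaults first, overridden by the
// inputs for this run, with selectors available under `selectors.`.
function buildContext(variables, inputs, selectors) {
  const context = { selectors: { ...selectors } };
  for (const [name, spec] of Object.entries(variables)) {
    context[name] = spec.default;
  }
  return Object.assign(context, inputs);
}

// Resolve a dotted path such as "selectors.btn_plus" against the context.
function lookup(context, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), context);
}

// Replace `${{ key }}` and legacy `{{ key }}` placeholders.
// Unknown keys are left untouched so the problem stays visible.
function renderTemplate(template, context) {
  return template.replace(/\$?\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    const value = lookup(context, path);
    return value === undefined ? match : String(value);
  });
}
```

With the Calculator example, a context built from `variables`, `inputs`, and `selectors` would render `"${{selectors.btn_plus}}"` to `role:Button|name:Plus`.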
|
|
811
|
+
|
|
812
|
+
### 3. State Persistence & Partial Execution
|
|
813
|
+
|
|
814
|
+
The `execute_sequence` tool supports powerful features for workflow debugging and resumption:
|
|
815
|
+
|
|
816
|
+
#### Partial Execution with Step Ranges
|
|
817
|
+
|
|
818
|
+
You can run specific portions of a workflow using `start_from_step` and `end_at_step` parameters:
|
|
819
|
+
|
|
820
|
+
```jsonc
|
|
821
|
+
{
|
|
822
|
+
"tool_name": "execute_sequence",
|
|
823
|
+
"arguments": {
|
|
824
|
+
"url": "file://path/to/workflow.yml",
|
|
825
|
+
"start_from_step": "read_json_file", // Start from this step ID
|
|
826
|
+
"end_at_step": "fill_journal_entries", // Stop after this step (inclusive)
|
|
827
|
+
"follow_fallback": false // Don't follow fallback_id beyond end_at_step (default: false)
|
|
828
|
+
}
|
|
829
|
+
}
|
|
830
|
+
```
|
|
831
|
+
|
|
832
|
+
**Examples:**
|
|
833
|
+
- Run single step: Set both `start_from_step` and `end_at_step` to the same ID
|
|
834
|
+
- Run step range: Set different IDs for start and end
|
|
835
|
+
- Run from step to end: Only set `start_from_step`
|
|
836
|
+
- Run from beginning to step: Only set `end_at_step`
|
|
837
|
+
- Debug without fallback: Use `follow_fallback: false` to prevent jumping to troubleshooting steps when a bounded step fails
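
The range rules above can be modeled in a few lines. This is a sketch of the semantics as described, not the agent's code; `selectStepRange` is a hypothetical name and it ignores fallback handling.

```javascript
// Hypothetical illustration of start_from_step / end_at_step semantics.
function selectStepRange(steps, startFromStep, endAtStep) {
  const indexOfId = (id) => {
    const index = steps.findIndex((step) => step.id === id);
    if (index === -1) {
      throw new Error(`Unknown step ID: ${id}`);
    }
    return index;
  };

  // Missing start means "from the beginning"; missing end means "to the end".
  const start = startFromStep ? indexOfId(startFromStep) : 0;
  const end = endAtStep ? indexOfId(endAtStep) : steps.length - 1;

  return steps.slice(start, end + 1); // end_at_step is inclusive
}
```

Passing the same ID for both arguments returns exactly one step, which matches the "run single step" case.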
|
|
838
|
+
|
|
839
|
+
#### Automatic State Persistence
|
|
840
|
+
|
|
841
|
+
When using `file://` URLs, the workflow state (environment variables) is automatically saved to a `.workflow_state` folder:
|
|
842
|
+
|
|
843
|
+
1. **State is saved** after each step that modifies environment variables via `set_env` or has a tool result with an ID
|
|
844
|
+
2. **State is loaded** when starting from a specific step
|
|
845
|
+
3. **Location**: `.workflow_state/<workflow_hash>.json` in the workflow's directory
|
|
846
|
+
4. **Tool results** from all tools (not just scripts) are automatically stored as `{step_id}_result` and `{step_id}_status`
|
|
847
|
+
|
|
848
|
+
This enables:
|
|
849
|
+
- **Debugging**: Run steps individually to inspect state between executions
|
|
850
|
+
- **Recovery**: Resume failed workflows from the last successful step
|
|
851
|
+
- **Testing**: Test specific steps without re-running the entire workflow
|
|
852
|
+
|
|
853
|
+
#### Data Passing Between Steps
|
|
854
|
+
|
|
855
|
+
Steps can pass data using multiple methods:
|
|
856
|
+
|
|
857
|
+
##### 1. Tool Result Storage (NEW)
|
|
858
|
+
|
|
859
|
+
ALL tools with an `id` field automatically store their results in the environment:
|
|
860
|
+
|
|
861
|
+
```yaml
steps:
  # Any tool with an ID stores its result
  - id: check_apps
    tool_name: get_applications
    arguments:
      include_tree: false

  # Access the result in JavaScript
  - tool_name: run_command
    arguments:
      engine: javascript
      run: |
        // Direct variable access - auto-injected!
        const apps = check_apps_result || [];
        const status = check_apps_status; // "success" or "error"
        console.log(`Found ${apps[0]?.applications?.length} apps`);
```
|
|
879
|
+
|
|
880
|
+
##### 2. Script Return Values
|
|
881
|
+
|
|
882
|
+
Steps can pass data using the `set_env` mechanism in `run_command` with engine mode:
|
|
883
|
+
|
|
884
|
+
```javascript
|
|
885
|
+
// Step 12: Read and process data
|
|
886
|
+
return {
|
|
887
|
+
set_env: {
|
|
888
|
+
file_path: "C:/data/input.json",
|
|
889
|
+
journal_entries: JSON.stringify(entries),
|
|
890
|
+
total_debit: "100.50"
|
|
891
|
+
}
|
|
892
|
+
};
|
|
893
|
+
|
|
894
|
+
// Step 13: Use the data (NEW - simplified access!)
|
|
895
|
+
const filePath = file_path; // Direct access, no {{env.}} needed!
|
|
896
|
+
const entries = JSON.parse(journal_entries);
|
|
897
|
+
const debit = total_debit;
|
|
898
|
+
```
|
|
899
|
+
|
|
900
|
+
### 4. Running the Workflow
|
|
901
|
+
|
|
902
|
+
1. Ensure the Terminator MCP agent is running (it will auto-start in supported editors).
|
|
903
|
+
2. Send the JSON above as the body of an `execute_sequence` tool call from your LLM or test harness.
|
|
904
|
+
3. Inspect the response: if parsing succeeds you'll see something like

```jsonc
{
  "parsed_output": {
    "displayValue": "50" // 42 + 8
  }
}
```

### Realtime events (SSE)

When running with the HTTP transport, you can subscribe to realtime workflow events at a separate endpoint outside `/mcp`:

- SSE endpoint: `/events`
- Emits JSON payloads for: `sequence` (start/end), `sequence_progress`, and `sequence_step` (begin/end)

Example in Node.js:

```js
import EventSource from "eventsource";
const es = new EventSource("http://127.0.0.1:3000/events");
es.onmessage = (e) => console.log("event", e.data);
```
|
|
928
|
+
|
|
929
|
+
### 5. Working with Tool Results
|
|
930
|
+
|
|
931
|
+
Every tool that has an `id` field automatically stores its result for use in later steps:
|
|
932
|
+
|
|
933
|
+
```yaml
steps:
  # Capture browser DOM
  - id: capture_dom
    tool_name: execute_browser_script
    arguments:
      selector: "role:Window"
      script: "return document.documentElement.innerHTML;"

  # Validate an element exists
  - id: check_button
    tool_name: validate_element
    arguments:
      selector: "role:Button|name:Submit"

  # Use both results in script
  - tool_name: run_command
    arguments:
      engine: javascript
      run: |
        // All tool results are auto-injected as variables
        const dom = capture_dom_result?.content || '';
        const buttonExists = check_button_status === 'success';

        if (buttonExists) {
          const button = check_button_result[0]?.element;
          console.log(`Submit button at: ${button?.bounds?.x}, ${button?.bounds?.y}`);
        }

        return { dom_length: dom.length, has_button: buttonExists };
```
|
|
964
|
+
|
|
965
|
+
Tool results are accessible as:
|
|
966
|
+
- `{step_id}_result`: The tool's return value (content, element info, etc.)
|
|
967
|
+
- `{step_id}_status`: Either "success" or "error"
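
Because every status follows the `{step_id}_status` naming pattern, a later script step can scan for failures in one pass. The sketch below assumes the stored values are available as a plain object. `failedSteps` is a hypothetical helper and the `env` argument is an assumption about how you collect those variables.

```javascript
// Hypothetical helper: list the step IDs whose `{step_id}_status` is "error".
function failedSteps(env) {
  const suffix = "_status";
  return Object.keys(env)
    .filter((key) => key.endsWith(suffix) && env[key] === "error")
    .map((key) => key.slice(0, -suffix.length));
}
```

For example, an object containing `capture_dom_status: "success"` and `check_button_status: "error"` yields a list with the single entry `check_button`.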
|
|
968
|
+
|
|
969
|
+
### 6. Tips for Production Workflows
|
|
970
|
+
|
|
971
|
+
- **Never hard-code credentials** – use environment variables or your secret manager.
|
|
972
|
+
- **Keep workflows short** – fewer than 100 steps is ideal. Break large tasks into multiple sequences.
|
|
973
|
+
- **Capture errors** – `continue_on_error` is useful, but also check `{step_id}_status` for tool failures.
|
|
974
|
+
- **Version control** – Store workflow JSON in a repo and use PR reviews just like regular code.
|
|
975
|
+
- **Use step IDs** – Give meaningful IDs to steps whose results you'll need later.
|
|
976
|
+
|
|
977
|
+
## 🔍 Troubleshooting & Debugging
|
|
978
|
+
|
|
979
|
+
### Finding MCP Server Logs
|
|
980
|
+
|
|
981
|
+
MCP logs are saved to:
|
|
982
|
+
- **Windows:** `%LOCALAPPDATA%\claude-cli-nodejs\Cache\<encoded-project-path>\mcp-logs-terminator-mcp-agent\`
|
|
983
|
+
- **macOS/Linux:** `~/.local/share/claude-cli-nodejs/Cache/<encoded-project-path>/mcp-logs-terminator-mcp-agent/`
|
|
984
|
+
|
|
985
|
+
Where `<encoded-project-path>` is your project path with special chars replaced (e.g., `C--Users-username-project`).
|
|
986
|
+
Note: Logs are saved as `.txt` files, not `.log` files.
|
|
987
|
+
|
|
988
|
+
**Read logs:**
|
|
989
|
+
```powershell
|
|
990
|
+
# Windows - Find and read latest logs (run in PowerShell)
|
|
991
|
+
Get-ChildItem (Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'claude-cli-nodejs\Cache\*\mcp-logs-terminator-mcp-agent\*.txt') | Sort-Object LastWriteTime -Descending | Select-Object -First 1 | Get-Content -Tail 50
|
|
992
|
+
```
|
|
993
|
+
|
|
994
|
+
### Enable Debug Logging
|
|
995
|
+
|
|
996
|
+
In your Claude MCP configuration (`claude_desktop_config.json`):
|
|
997
|
+
```jsonc
|
|
998
|
+
{
|
|
999
|
+
"mcpServers": {
|
|
1000
|
+
"terminator-mcp-agent": {
|
|
1001
|
+
"command": "path/to/terminator-mcp-agent",
|
|
1002
|
+
"env": {
|
|
1003
|
+
"LOG_LEVEL": "debug", // or "info", "warn", "error"
|
|
1004
|
+
"RUST_BACKTRACE": "1" // for stack traces on errors
|
|
1005
|
+
}
|
|
1006
|
+
}
|
|
1007
|
+
}
|
|
1008
|
+
}
|
|
1009
|
+
```
|
|
1010
|
+
|
|
1011
|
+
### Common Debug Scenarios
|
|
1012
|
+
|
|
1013
|
+
| Issue | What to Look For in Logs |
|-------|--------------------------|
| Workflow failures | Search for `fallback_id` triggers and `critical_error_occurred` |
| Element not found | Look for selector resolution attempts, `find_element` timeouts |
| Browser script errors | Check for `EVAL_ERROR`, Promise rejections, JavaScript exceptions |
| Binary version issues | Startup logs show binary path and build timestamp |
| MCP connection lost | Check for panic messages, ensure binary path is correct |
|
|
1020
|
+
|
|
1021
|
+
### Fallback Mechanism
|
|
1022
|
+
|
|
1023
|
+
Workflows support `fallback_id` to handle errors gracefully:
|
|
1024
|
+
- If a step fails and has `fallback_id`, it jumps to that step instead of stopping
|
|
1025
|
+
- Without `fallback_id`, errors may set `critical_error_occurred` and skip remaining steps
|
|
1026
|
+
- Use `troubleshooting:` section for recovery steps only accessed via fallback
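
The jump rule can be pictured with a small model. This is an illustration under stated assumptions, not the agent's implementation: `nextStepIndex` is a hypothetical function, it treats all steps as one flat list, and it assumes an unhandled failure simply ends the run.

```javascript
// Illustrative model of the fallback rule; returns the index of the next
// step to run, or -1 to stop the sequence.
function nextStepIndex(steps, currentIndex, failed) {
  const step = steps[currentIndex];

  if (failed && step.fallback_id) {
    // Jump to the step named by fallback_id instead of stopping.
    return steps.findIndex((s) => s.id === step.fallback_id);
  }
  if (failed && !step.continue_on_error) {
    return -1; // assumed: an unhandled failure skips the remaining steps
  }

  const next = currentIndex + 1;
  return next < steps.length ? next : -1;
}
```

In this model a failing step with `fallback_id: "recover"` continues at the step whose `id` is `recover`, while a failing step without one stops the run unless `continue_on_error` is set.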
|
|
1027
|
+
|
|
1028
|
+
> Need more help? Browse the examples under `examples/` in this repo or open a discussion on GitHub.
|
|
1029
|
+
|
|
1030
|
+
## Documentation
|
|
1031
|
+
|
|
1032
|
+
### Workflow Development
|
|
1033
|
+
|
|
1034
|
+
- **[Workflow Output Structure](docs/WORKFLOW_OUTPUT_STRUCTURE.md)**: Detailed documentation on the expected output structure for workflows, including:
  - How to structure `parsed_output` for proper CLI rendering
  - Success/failure indicators and business logic validation
  - Data extraction patterns and error handling
  - Integration with CLI and backend systems
|
|
1039
|
+
|
|
1040
|
+
### Additional Resources
|
|
1041
|
+
|
|
1042
|
+
- **[CLI Documentation](../terminator-cli/README.md)**: Command-line interface for executing workflows
|
|
1043
|
+
- **[Examples](examples/)**: Sample workflows and use cases
|
|
1044
|
+
- **[API Reference](https://github.com/mediar-ai/terminator#api)**: Core Terminator library documentation
|