terminator-mcp-agent 0.6.14 → 0.6.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +67 -43
package/package.json +5 -5

package/README.md CHANGED

@@ -10,77 +10,102 @@
 A Model Context Protocol (MCP) server that provides desktop GUI automation capabilities using the [Terminator](https://github.com/mediar-ai/terminator) library. This server enables LLMs and agentic clients to interact with Windows, macOS, and Linux applications through structured accessibility APIs—no vision models or screenshots required.
-### Key Features
+### Getting Started
-- **Fast and lightweight**. Uses OS-level accessibility APIs, not pixel-based input.
-- **LLM/agent-friendly**. No vision models needed, operates purely on structured data.
-- **Deterministic automation**. Avoids ambiguity common with screenshot-based approaches.
-- **Multi-platform**. Supports Windows (full), macOS (partial), Linux (partial).
+The easiest way to get started is to use the one-click install buttons above for your specific editor (VS Code, Cursor, etc.).
-### Requirements
+Alternatively, you can install and configure the agent from your command line.
-- Node.js 16 or newer
-- VS Code, Cursor, Windsurf, Claude Desktop, or any other MCP client
+**1. Install & Configure Automatically**
+Run the following command and select your MCP client from the list:
-### Getting started
+```sh
+npx -y terminator-mcp-agent --add-to-app
+```
-First, install the Terminator MCP server with your client. A typical configuration looks like this:
+**2. Manual Configuration**
+If you prefer, you can add the following to your MCP client's settings file:
 ```json
 {
-  "mcpServers": {
-    "terminator-mcp-agent": {
-      "command": "npx",
-      "args": ["-y", "terminator-mcp-agent"]
-    }
-  }
+	"mcpServers": {
+		"terminator-mcp-agent": {
+			"command": "npx",
+			"args": ["-y", "terminator-mcp-agent"]
+		}
+	}
 }
 ```
-You can also use the CLI to configure your app automatically:
+### Core Workflows: From Interaction to Structured Data
-```sh
-npx -y terminator-mcp-agent --add-to-app [app]
-```
+The Terminator MCP agent offers two primary workflows for automating desktop tasks. Both paths lead to the same goal: an automation that is more than 95% accurate and 10,000x faster than a human.
-Replace `[app]` with one of:
+#### 1. Iterative Development with `execute_sequence`
-- cursor
-- claude
-- vscode
-- insiders
-- windsurf
-- cline
-- roocode
-- witsy
-- enconvo
-- boltai
-- amazon-bedrock
-- amazonq
+This is the most powerful and flexible method. You build a workflow step-by-step, using MCP tools to inspect the UI and refine your actions.
-If you omit `[app]`, the CLI will prompt you to select from all available options.
+1.  **Inspect the UI**: Start by using `get_focused_window_tree` to understand the structure of your target application. This gives you the roles, names, and IDs of all elements.
+2.  **Build a Sequence**: Create an `execute_sequence` tool call with a series of actions (`click_element`, `type_into_element`, etc.). Use robust selectors (like `role|name` or stable `properties:AutomationId:value` selectors) whenever possible.
+3.  **Capture the Final State**: Ensure the last step in your sequence is an action that returns a UI tree. The `wait_for_element` tool with `include_tree: true` is perfect for this, as it captures the application's state after your automation has run.
+4.  **Extract Structured Data with `output_parser`**: Add the `output_parser` argument to your `execute_sequence` call. Define a set of rules using our JSON-based DSL to parse the final UI tree. If successful, the tool result will contain a `parsed_output` field with your clean JSON data.
----
+Here is an example of an `output_parser` that extracts insurance quote data from a web page:
+```json
+"output_parser": {
+    "uiTreeJsonPath": "$.results[-1].results[-1].result.content[0].Json.ui_tree",
+    "itemContainerDefinition": {
+        "nodeConditions": [{ "property": "role", "op": "equals", "value": "Group" }],
+        "childConditions": {
+            "logic": "and",
+            "conditions": [
+                { "existsChild": { "conditions": [{ "property": "name", "op": "startsWith", "value": "$" }] } },
+                { "existsChild": { "conditions": [{ "property": "name", "op": "equals", "value": "Monthly Price" }] } }
+            ]
+        }
+    },
+    "fieldsToExtract": {
+        "monthlyPrice": {
+            "fromChild": {
+                "conditions": [{ "property": "name", "op": "startsWith", "value": "$" }],
+                "extractProperty": "name"
+            }
+        }
+    }
+}
+```
-<img width="1512" alt="Screenshot 2025-04-16 at 9 29 42 AM" src="https://github.com/user-attachments/assets/457ebaf2-640c-4f21-a236-fcb2b92748ab" />
+#### 2. Recording Human Actions with `record_workflow`
-MCP is useful to test out the `terminator` lib and see what you can do. You can use any model.
+For simpler tasks, you can record your own actions to generate a baseline workflow.
-<br>
+1.  **Start Recording**: Call `record_workflow` with `action: "start"`.
+2.  **Perform the Task**: Manually perform the clicks, typing, and other interactions in the target application.
+3.  **Stop and Save**: Call `record_workflow` with `action: "stop"`. This returns a complete workflow JSON file containing all your recorded actions.
+4.  **Refine and Parse**: The recorded workflow is a great starting point. You can then refine the selectors for robustness, add a final step to capture the UI tree, and attach an `output_parser` to extract structured data, just as you would in the iterative workflow.
-## Development
+## Local Development
-If you want to build and test the agent locally, clone the repo and run:
+To build and test the agent from the source code:
 ```sh
+# 1. Clone the entire Terminator repository
 git clone https://github.com/mediar-ai/terminator
+# 2. Navigate to the agent's directory
 cd terminator/terminator-mcp-agent
+# 3. Install Node.js dependencies
 npm install
+# 4. Build the Rust binary and Node.js wrapper
 npm run build
+# 5. To use your local build in your MCP client, link it globally
 npm install --global .
 ```
-You can then use the CLI as above.
+Now, when your MCP client runs `terminator-mcp-agent`, it will use your local build instead of the published `npm` version.
 ---
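
The `output_parser` rules added in this hunk can be read as "find every container node that matches, then pull fields from its children". The Python sketch below is one plausible interpretation, not the agent's actual parser. It assumes UI tree nodes are dicts with `role`, `name`, and `children` keys, that `existsChild` and `fromChild` inspect direct children only, and that multiple conditions are ANDed. The function names and the sample tree are hypothetical, only the `equals` and `startsWith` operators from the example are handled, and `uiTreeJsonPath` resolution is skipped.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]

def matches(node: Node, conditions: List[Dict[str, str]]) -> bool:
    """True when the node satisfies every condition (assumed AND semantics)."""
    for cond in conditions:
        actual = str(node.get(cond["property"], ""))
        if cond["op"] == "equals":
            ok = actual == cond["value"]
        elif cond["op"] == "startsWith":
            ok = actual.startswith(cond["value"])
        else:
            raise ValueError(f"unsupported op: {cond['op']}")
        if not ok:
            return False
    return True

def has_child(node: Node, conditions: List[Dict[str, str]]) -> bool:
    """True when at least one direct child satisfies the conditions."""
    return any(matches(child, conditions) for child in node.get("children", []))

def is_container(node: Node, definition: Dict[str, Any]) -> bool:
    """Apply nodeConditions to the node and childConditions to its children."""
    if not matches(node, definition["nodeConditions"]):
        return False
    rules = definition["childConditions"]
    hits = [has_child(node, r["existsChild"]["conditions"]) for r in rules["conditions"]]
    return all(hits) if rules["logic"] == "and" else any(hits)

def extract_fields(node: Node, fields: Dict[str, Any]) -> Dict[str, Any]:
    """Take each field from the first direct child that matches its rule."""
    out: Dict[str, Any] = {}
    for field_name, rule in fields.items():
        spec = rule["fromChild"]
        for child in node.get("children", []):
            if matches(child, spec["conditions"]):
                out[field_name] = child.get(spec["extractProperty"])
                break
    return out

def parse_tree(root: Node, parser: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Walk the tree in document order and emit one record per container."""
    items, stack = [], [root]
    while stack:
        node = stack.pop()
        if is_container(node, parser["itemContainerDefinition"]):
            items.append(extract_fields(node, parser["fieldsToExtract"]))
        stack.extend(reversed(node.get("children", [])))
    return items

# Parser rules copied from the README example (uiTreeJsonPath omitted).
PARSER = {
    "itemContainerDefinition": {
        "nodeConditions": [{"property": "role", "op": "equals", "value": "Group"}],
        "childConditions": {
            "logic": "and",
            "conditions": [
                {"existsChild": {"conditions": [{"property": "name", "op": "startsWith", "value": "$"}]}},
                {"existsChild": {"conditions": [{"property": "name", "op": "equals", "value": "Monthly Price"}]}},
            ],
        },
    },
    "fieldsToExtract": {
        "monthlyPrice": {
            "fromChild": {
                "conditions": [{"property": "name", "op": "startsWith", "value": "$"}],
                "extractProperty": "name",
            }
        }
    },
}

# Hypothetical UI tree, invented purely for illustration.
SAMPLE_TREE = {
    "role": "Window", "name": "Quotes", "children": [
        {"role": "Group", "name": "", "children": [
            {"role": "Text", "name": "Monthly Price"},
            {"role": "Text", "name": "$42.10"},
        ]},
        {"role": "Group", "name": "", "children": [
            {"role": "Text", "name": "Disclaimer"},
        ]},
    ],
}

print(parse_tree(SAMPLE_TREE, PARSER))  # [{'monthlyPrice': '$42.10'}]
```

Only the first group qualifies as a container because it has both a child whose name starts with `$` and a child named `Monthly Price`; the second group is skipped.
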
@@ -88,6 +113,5 @@ You can then use the CLI as above.
 - Make sure you have Node.js installed (v16+ recommended).
 - For VS Code/Insiders, ensure the CLI (`code` or `code-insiders`) is available in your PATH.
-- If you encounter issues, try running with elevated permissions or check the config file paths above.
+- If you encounter issues, try running with elevated permissions.
----
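
For readers wiring the `record_workflow` steps from the README diff into a client by hand, the sketch below builds the two requests as MCP `tools/call` messages. The JSON-RPC envelope follows the MCP specification; the `action` argument is taken from the README text. The helper name `tool_call` is hypothetical, and the tool may accept or require further arguments that this diff does not show.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # JSON-RPC request ids must be unique per session

def tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Step 1 in the README: start recording.
start = tool_call("record_workflow", {"action": "start"})
# Step 3 in the README: stop recording; the response carries the workflow JSON.
stop = tool_call("record_workflow", {"action": "stop"})

print(json.dumps(start, indent=2))
print(json.dumps(stop, indent=2))
```

In practice an MCP client library or the editor integration sends these messages for you; the sketch only shows what goes over the wire.
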

package/package.json CHANGED Viewed

@@ -12,10 +12,10 @@
   ],
   "name": "terminator-mcp-agent",
   "optionalDependencies": {
-    "terminator-mcp-darwin-arm64": "0.6.14",
-    "terminator-mcp-darwin-x64": "0.6.14",
-    "terminator-mcp-linux-x64-gnu": "0.6.14",
-    "terminator-mcp-win32-x64-msvc": "0.6.14"
+    "terminator-mcp-darwin-arm64": "0.6.16",
+    "terminator-mcp-darwin-x64": "0.6.16",
+    "terminator-mcp-linux-x64-gnu": "0.6.16",
+    "terminator-mcp-win32-x64-msvc": "0.6.16"
   },
   "repository": {
     "type": "git",
@@ -27,5 +27,5 @@
     "sync-version": "node ./utils/sync-version.js",
     "update-badges": "node ./utils/update-badges.js"
   },
-  "version": "0.6.14"
+  "version": "0.6.16"
 }
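
The package.json hunks bump all four platform binaries in lockstep with the package version. A minimal Python sketch of a check for that invariant follows; the function name `mismatched_pins` is hypothetical, and this is not the repository's `sync-version.js`, whose contents are not part of this diff.

```python
import json
from typing import Any, Dict

def mismatched_pins(manifest: Dict[str, Any]) -> Dict[str, str]:
    """Return optionalDependencies whose pinned version differs from `version`."""
    version = manifest["version"]
    pins = manifest.get("optionalDependencies", {})
    return {name: pin for name, pin in pins.items() if pin != version}

# Values taken from the 0.6.16 side of the diff above.
manifest = json.loads("""
{
  "name": "terminator-mcp-agent",
  "optionalDependencies": {
    "terminator-mcp-darwin-arm64": "0.6.16",
    "terminator-mcp-darwin-x64": "0.6.16",
    "terminator-mcp-linux-x64-gnu": "0.6.16",
    "terminator-mcp-win32-x64-msvc": "0.6.16"
  },
  "version": "0.6.16"
}
""")

print(mismatched_pins(manifest))  # {}
```

An empty result means every platform package is pinned to the same release as the wrapper; a stale pin would show up as a `name: version` entry.
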