cua-mcp-server 0.1.7__py3-none-any.whl → 0.1.9__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: cua-mcp-server
- Version: 0.1.7
+ Version: 0.1.9
  Summary: MCP Server for Computer-Use Agent (CUA)
  Author-Email: TryCua <gh@trycua.com>
  Requires-Python: >=3.10
@@ -29,6 +29,17 @@ Description-Content-Type: text/markdown
  **cua-mcp-server** is a MCP server for the Computer-Use Agent (CUA), allowing you to run CUA through Claude Desktop or other MCP clients.
  ### Get started with Agent
 
+ ## Prerequisites
+
+ Before installing the MCP server, you'll need to set up the full Computer-Use Agent capabilities as described in [Option 2 of the main README](../../README.md#option-2-full-computer-use-agent-capabilities). This includes:
+
+ 1. Installing the Lume CLI
+ 2. Pulling the latest macOS CUA image
+ 3. Starting the Lume daemon service
+ 4. Installing the required Python libraries (Optional: only needed if you want to verify the agent is working before installing MCP server)
+
+ Make sure these steps are completed and working before proceeding with the MCP server installation.
+
  ## Installation
 
  Install the package from PyPI:
@@ -68,13 +79,51 @@ You can then use the script in your MCP configuration like this:
          "CUA_AGENT_LOOP": "OMNI",
          "CUA_MODEL_PROVIDER": "ANTHROPIC",
          "CUA_MODEL_NAME": "claude-3-7-sonnet-20250219",
-         "ANTHROPIC_API_KEY": "your-api-key"
+         "CUA_PROVIDER_API_KEY": "your-api-key"
+       }
+     }
+   }
+ }
+ ```
+
+ ## Development Guide
+
+ If you want to develop with the cua-mcp-server directly without installation, you can use this configuration:
+
+ ```json
+ {
+   "mcpServers": {
+     "cua-agent": {
+       "command": "/bin/bash",
+       "args": ["~/cua/libs/mcp-server/scripts/start_mcp_server.sh"],
+       "env": {
+         "CUA_AGENT_LOOP": "UITARS",
+         "CUA_MODEL_PROVIDER": "OAICOMPAT",
+         "CUA_MODEL_NAME": "ByteDance-Seed/UI-TARS-1.5-7B",
+         "CUA_PROVIDER_BASE_URL": "https://****************.us-east-1.aws.endpoints.huggingface.cloud/v1",
+         "CUA_PROVIDER_API_KEY": "your-api-key"
        }
      }
    }
  }
  ```
 
+ This configuration:
+ - Uses the start_mcp_server.sh script which automatically sets up the Python path and runs the server module
+ - Works with Claude Desktop, Cursor, or any other MCP client
+ - Automatically uses your development code without requiring installation
+
+ Just add this to your MCP client's configuration and it will use your local development version of the server.
+
+ ### Troubleshooting
+
+ If you get a `/bin/bash: ~/cua/libs/mcp-server/scripts/start_mcp_server.sh: No such file or directory` error, try changing the path to the script to be absolute instead of relative.
+
+ To see the logs:
+ ```
+ tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
+ ```
+
  ## Claude Desktop Integration
 
  To use with Claude Desktop, add an entry to your Claude Desktop configuration (`claude_desktop_config.json`, typically found in `~/.config/claude-desktop/`):
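
Note on the Troubleshooting entry added in the hunk above: a likely cause of the `No such file or directory` error is that `~` is normally expanded by a shell, so a client that passes `args` to `/bin/bash` verbatim hands it a literal `~/...` path. The following is a minimal sketch for computing the absolute path to paste into the configuration. The helper name `absolute_script_path` is hypothetical and not part of the package.

```python
import os


def absolute_script_path(path: str) -> str:
    """Expand a leading ~ and return an absolute, normalized path."""
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    # Path taken from the Development Guide example in the README.
    print(absolute_script_path("~/cua/libs/mcp-server/scripts/start_mcp_server.sh"))
```

The printed value can replace the `~/...` entry in `args`.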
@@ -104,7 +153,7 @@ The server is configured using environment variables (can be set in the Claude D
 
  | Variable | Description | Default |
  |----------|-------------|---------|
- | `CUA_AGENT_LOOP` | Agent loop to use (OPENAI, ANTHROPIC, OMNI) | OMNI |
+ | `CUA_AGENT_LOOP` | Agent loop to use (OPENAI, ANTHROPIC, UITARS, OMNI) | OMNI |
  | `CUA_MODEL_PROVIDER` | Model provider (ANTHROPIC, OPENAI, OLLAMA, OAICOMPAT) | ANTHROPIC |
  | `CUA_MODEL_NAME` | Model name to use | None (provider default) |
  | `CUA_PROVIDER_BASE_URL` | Base URL for provider API | None |
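
As an illustration of how the four variables in the table above resolve to their defaults, here is a minimal sketch. The `ServerConfig` dataclass and `load_config` function are hypothetical names introduced for this example; the package itself reads the variables inline with `os.getenv`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container for the settings listed in the table."""

    agent_loop: str
    model_provider: str
    model_name: Optional[str]
    provider_base_url: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the CUA_* variables, applying the defaults from the table."""
    return ServerConfig(
        agent_loop=env.get("CUA_AGENT_LOOP", "OMNI"),
        model_provider=env.get("CUA_MODEL_PROVIDER", "ANTHROPIC"),
        model_name=env.get("CUA_MODEL_NAME"),  # None means provider default
        provider_base_url=env.get("CUA_PROVIDER_BASE_URL"),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check without touching the real environment.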
@@ -0,0 +1,7 @@
+ cua_mcp_server-0.1.9.dist-info/METADATA,sha256=Bw2ET7kbetLRmVqVkyWwZc91NoDSXjn9qNic6pS7T7I,6668
+ cua_mcp_server-0.1.9.dist-info/WHEEL,sha256=tSfRZzRHthuv7vxpI4aehrdN9scLjk-dCJkPLzkHxGg,90
+ cua_mcp_server-0.1.9.dist-info/entry_points.txt,sha256=Y3uEunDRfoc-RUDS3HnD942RCxYKquiyk-2HRSqphoc,74
+ mcp_server/__init__.py,sha256=G5Bps3KxzYfH79B1TDVQI9vbzjamC_mdgi7GJMgbVcA,575
+ mcp_server/__main__.py,sha256=BE2ManEiNpz56nqc7Z_asNjQ6TPtvyu5AbWbyJFePnM,132
+ mcp_server/server.py,sha256=nV0aNGymSUB1BjwVzS1snUH2phfbVrP3Bl_P_Y4HWII,7907
+ cua_mcp_server-0.1.9.dist-info/RECORD,,
mcp_server/server.py CHANGED
@@ -1,9 +1,10 @@
  import asyncio
+ import base64
  import logging
  import os
  import sys
  import traceback
- from typing import Any, Dict, List, Optional, Union
+ from typing import Any, Dict, List, Optional, Union, Tuple
 
  # Configure logging to output to stderr for debug visibility
  logging.basicConfig(
@@ -17,7 +18,7 @@ logger = logging.getLogger("mcp-server")
  logger.debug("MCP Server module loading...")
 
  try:
-     from mcp.server.fastmcp import Context, FastMCP
+     from mcp.server.fastmcp import Context, FastMCP, Image
 
      logger.debug("Successfully imported FastMCP")
  except ImportError as e:
@@ -49,16 +50,37 @@ def serve() -> FastMCP:
      server = FastMCP("cua-agent")
 
      @server.tool()
-     async def run_cua_task(ctx: Context, task: str) -> str:
+     async def screenshot_cua(ctx: Context) -> Image:
          """
-         Run a Computer-Use Agent (CUA) task and return the results.
+         Take a screenshot of the current MacOS VM screen and return the image. Use this before running a CUA task to get a snapshot of the current state.
+
+         Args:
+             ctx: The MCP context
+
+         Returns:
+             An image resource containing the screenshot
+         """
+         global global_computer
+         if global_computer is None:
+             global_computer = Computer(verbosity=logging.INFO)
+             await global_computer.run()
+         screenshot = await global_computer.interface.screenshot()
+         return Image(
+             format="png",
+             data=screenshot
+         )
+
+     @server.tool()
+     async def run_cua_task(ctx: Context, task: str) -> Tuple[str, Image]:
+         """
+         Run a Computer-Use Agent (CUA) task in a MacOS VM and return the results.
 
          Args:
              ctx: The MCP context
              task: The instruction or task for the agent to perform
 
          Returns:
-             A string containing the agent's response
+             A tuple containing the agent's response and the final screenshot
          """
          global global_computer
 
@@ -72,12 +94,7 @@ def serve() -> FastMCP:
 
              # Determine which loop to use
              loop_str = os.getenv("CUA_AGENT_LOOP", "OMNI")
-             if loop_str == "OPENAI":
-                 loop = AgentLoop.OPENAI
-             elif loop_str == "ANTHROPIC":
-                 loop = AgentLoop.ANTHROPIC
-             else:
-                 loop = AgentLoop.OMNI
+             loop = getattr(AgentLoop, loop_str)
 
              # Determine provider
              provider_str = os.getenv("CUA_MODEL_PROVIDER", "ANTHROPIC")
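
Note on the hunk above: the removed `if`/`elif` chain fell back to `OMNI` for any unrecognised value, whereas `getattr(AgentLoop, loop_str)` raises `AttributeError` when the name is not a member. The sketch below shows one way to keep a fallback with a name-based lookup. It uses a stand-in `AgentLoop` enum whose members are assumed from the README table, since the real class lives in the agent library, and `resolve_loop` is a hypothetical helper.

```python
import enum
import logging

logger = logging.getLogger("mcp-server")


class AgentLoop(enum.Enum):
    """Stand-in for the real AgentLoop; members assumed from the README table."""

    OPENAI = enum.auto()
    ANTHROPIC = enum.auto()
    UITARS = enum.auto()
    OMNI = enum.auto()


def resolve_loop(name: str, default: AgentLoop = AgentLoop.OMNI) -> AgentLoop:
    """Look up a loop by member name, falling back to `default` if unknown."""
    try:
        return AgentLoop[name.strip().upper()]
    except KeyError:
        logger.warning("Unknown CUA_AGENT_LOOP %r, using %s", name, default.name)
        return default


if __name__ == "__main__":
    print(resolve_loop("UITARS"))
    print(resolve_loop("not-a-loop"))
```

Whether a silent fallback or a hard failure is preferable depends on how strictly misconfiguration should be surfaced; the diff itself opts for the hard failure.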
@@ -89,6 +106,9 @@ def serve() -> FastMCP:
              # Get base URL for provider (if needed)
              provider_base_url = os.getenv("CUA_PROVIDER_BASE_URL", None)
 
+             # Get api key for provider (if needed)
+             api_key = os.getenv("CUA_PROVIDER_API_KEY", None)
+
              # Create agent with the specified configuration
              agent = ComputerAgent(
                  computer=global_computer,
@@ -98,6 +118,7 @@ def serve() -> FastMCP:
                      name=model_name,
                      provider_base_url=provider_base_url,
                  ),
+                 api_key=api_key,
                  save_trajectory=False,
                  only_n_most_recent_images=int(os.getenv("CUA_MAX_IMAGES", "3")),
                  verbosity=logging.INFO,
@@ -107,33 +128,34 @@ def serve() -> FastMCP:
              full_result = ""
              async for result in agent.run(task):
                  logger.info(f"Agent step complete: {result.get('id', 'unknown')}")
+                 ctx.info(f"Agent step complete: {result.get('id', 'unknown')}")
 
                  # Add response ID to output
                  full_result += f"\n[Response ID: {result.get('id', 'unknown')}]\n"
-
-                 # Extract and concatenate text responses
-                 if "text" in result:
-                     # Handle both string and dict responses
-                     text_response = result.get("text", "")
-                     if isinstance(text_response, str):
-                         full_result += f"Response: {text_response}\n"
-                     else:
-                         # If it's a dict or other structure, convert to string representation
-                         full_result += f"Response: {str(text_response)}\n"
-
-                 # Log detailed information
-                 if "tools" in result:
-                     tools_info = result.get("tools")
-                     logger.debug(f"Tools used: {tools_info}")
-                     full_result += f"\nTools used: {tools_info}\n"
+
+                 if "content" in result:
+                     full_result += f"Response: {result.get('content', '')}\n"
 
                  # Process output if available
                  outputs = result.get("output", [])
                  for output in outputs:
                      output_type = output.get("type")
-                     if output_type == "reasoning":
+                     if output_type == "message":
+                         logger.debug(f"Message: {output}")
+                         content = output.get("content", [])
+                         for content_part in content:
+                             if content_part.get("text"):
+                                 full_result += f"\nMessage: {content_part.get('text', '')}\n"
+                     elif output_type == "reasoning":
                          logger.debug(f"Reasoning: {output}")
-                         full_result += f"\nReasoning: {output.get('content', '')}\n"
+
+                         summary_content = output.get("summary", [])
+                         if summary_content:
+                             for summary_part in summary_content:
+                                 if summary_part.get("text"):
+                                     full_result += f"\nReasoning: {summary_part.get('text', '')}\n"
+                         else:
+                             full_result += f"\nReasoning: {output.get('text', output.get('content', ''))}\n"
                      elif output_type == "computer_call":
                          logger.debug(f"Computer call: {output}")
                          action = output.get("action", "")
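
Note on the hunk above: the new `ctx.info(...)` call is made without `await`. Assuming the context's logging helpers are coroutines, as they are in releases of the MCP Python SDK at the time of writing, an un-awaited call only creates a coroutine object and the message is never sent. The sketch below shows the difference with a hypothetical stand-in, `FakeContext`, rather than the real `Context` class.

```python
import asyncio
from typing import List


class FakeContext:
    """Hypothetical stand-in for the MCP Context with a coroutine `info`."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    async def info(self, message: str) -> None:
        self.messages.append(message)


async def step(ctx: FakeContext, step_id: str) -> None:
    # Awaiting is what actually runs the coroutine and records the message.
    await ctx.info(f"Agent step complete: {step_id}")


ctx = FakeContext()
asyncio.run(step(ctx, "abc"))

pending = ctx.info("dropped")  # not awaited: only a coroutine object is created
pending.close()                # closed here to avoid a "never awaited" warning

print(ctx.messages)
```

Only the awaited message ends up in `ctx.messages`; the same reasoning would apply to `ctx.error` and `ctx.report_progress` if they are coroutines as well.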
@@ -144,17 +166,25 @@ def serve() -> FastMCP:
                  full_result += "\n" + "-" * 40 + "\n"
 
              logger.info(f"CUA task completed successfully")
-             return full_result or "Task completed with no text output."
+             ctx.info(f"CUA task completed successfully")
+             return (
+                 full_result or "Task completed with no text output.",
+                 Image(
+                     format="png",
+                     data=await global_computer.interface.screenshot()
+                 )
+             )
 
          except Exception as e:
              error_msg = f"Error running CUA task: {str(e)}\n{traceback.format_exc()}"
              logger.error(error_msg)
+             ctx.error(error_msg)
              return f"Error during task execution: {str(e)}"
 
      @server.tool()
-     async def run_multi_cua_tasks(ctx: Context, tasks: List[str]) -> str:
+     async def run_multi_cua_tasks(ctx: Context, tasks: List[str]) -> List:
          """
-         Run multiple CUA tasks in sequence and return the combined results.
+         Run multiple CUA tasks in a MacOS VM in sequence and return the combined results.
 
          Args:
              ctx: The MCP context
@@ -164,13 +194,15 @@ def serve() -> FastMCP:
              Combined results from all tasks
          """
          results = []
-
          for i, task in enumerate(tasks):
              logger.info(f"Running task {i+1}/{len(tasks)}: {task}")
-             result = await run_cua_task(ctx, task)
-             results.append(f"Task {i+1}: {task}\nResult: {result}\n")
-
-         return "\n".join(results)
+             ctx.info(f"Running task {i+1}/{len(tasks)}: {task}")
+
+             ctx.report_progress(i / len(tasks))
+             results.extend(await run_cua_task(ctx, task))
+             ctx.report_progress((i + 1) / len(tasks))
+
+         return results
 
      return server
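
Note on the hunk above: `run_multi_cua_tasks` now extends `results` with whatever `run_cua_task` returns. On success that is a `(text, image)` tuple, but the `except` branch of `run_cua_task` still returns a plain string, and `list.extend` on a string appends one character at a time. The sketch below illustrates the effect and one possible normalisation; `add_result` is a hypothetical helper, not part of the package, and the bytes value stands in for an image.

```python
from typing import Any, List, Tuple, Union


def add_result(results: List[Any], outcome: Union[str, Tuple[Any, ...]]) -> None:
    """Append a string whole; unpack a tuple into its items."""
    if isinstance(outcome, str):
        results.append(outcome)
    else:
        results.extend(outcome)


naive: List[Any] = []
naive.extend("Error")  # extend iterates the string character by character
print(naive)           # ['E', 'r', 'r', 'o', 'r']

fixed: List[Any] = []
add_result(fixed, ("done", b"png-bytes"))
add_result(fixed, "Error during task execution: boom")
print(fixed)
```

An alternative would be for the error path to return a tuple as well, so that both branches match the declared `Tuple[str, Image]` return type.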
@@ -1,7 +0,0 @@
- cua_mcp_server-0.1.7.dist-info/METADATA,sha256=tYO69KlAhGJJBgR3hs3qIggKUnqF-fhebU1Aghgnh-Q,4811
- cua_mcp_server-0.1.7.dist-info/WHEEL,sha256=tSfRZzRHthuv7vxpI4aehrdN9scLjk-dCJkPLzkHxGg,90
- cua_mcp_server-0.1.7.dist-info/entry_points.txt,sha256=Y3uEunDRfoc-RUDS3HnD942RCxYKquiyk-2HRSqphoc,74
- mcp_server/__init__.py,sha256=G5Bps3KxzYfH79B1TDVQI9vbzjamC_mdgi7GJMgbVcA,575
- mcp_server/__main__.py,sha256=BE2ManEiNpz56nqc7Z_asNjQ6TPtvyu5AbWbyJFePnM,132
- mcp_server/server.py,sha256=RdM0kytzt8uF-vbqPXQ3oay-jtGhum4k_Z0jTDZmfoc,6547
- cua_mcp_server-0.1.7.dist-info/RECORD,,