wcgw 1.5.1__tar.gz → 1.5.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

Files changed (39)
  1. {wcgw-1.5.1 → wcgw-1.5.2}/PKG-INFO +5 -2
  2. {wcgw-1.5.1 → wcgw-1.5.2}/README.md +4 -1
  3. {wcgw-1.5.1 → wcgw-1.5.2}/pyproject.toml +1 -1
  4. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/anthropic_client.py +7 -10
  5. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/computer_use.py +8 -1
  6. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/mcp_server/Readme.md +6 -13
  7. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/mcp_server/server.py +8 -10
  8. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/tools.py +1 -0
  9. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/types_.py +1 -1
  10. {wcgw-1.5.1 → wcgw-1.5.2}/.github/workflows/python-publish.yml +0 -0
  11. {wcgw-1.5.1 → wcgw-1.5.2}/.github/workflows/python-tests.yml +0 -0
  12. {wcgw-1.5.1 → wcgw-1.5.2}/.gitignore +0 -0
  13. {wcgw-1.5.1 → wcgw-1.5.2}/.python-version +0 -0
  14. {wcgw-1.5.1 → wcgw-1.5.2}/.vscode/settings.json +0 -0
  15. {wcgw-1.5.1 → wcgw-1.5.2}/add.py +0 -0
  16. {wcgw-1.5.1 → wcgw-1.5.2}/claude_desktop_config.json +0 -0
  17. {wcgw-1.5.1 → wcgw-1.5.2}/gpt_action_json_schema.json +0 -0
  18. {wcgw-1.5.1 → wcgw-1.5.2}/gpt_instructions.txt +0 -0
  19. {wcgw-1.5.1 → wcgw-1.5.2}/src/__init__.py +0 -0
  20. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/__init__.py +0 -0
  21. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/__init__.py +0 -0
  22. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/__main__.py +0 -0
  23. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/cli.py +0 -0
  24. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/common.py +0 -0
  25. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/diff-instructions.txt +0 -0
  26. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/mcp_server/__init__.py +0 -0
  27. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/openai_client.py +0 -0
  28. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/openai_utils.py +0 -0
  29. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/sys_utils.py +0 -0
  30. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/relay/serve.py +0 -0
  31. {wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/relay/static/privacy.txt +0 -0
  32. {wcgw-1.5.1 → wcgw-1.5.2}/static/claude-ss.jpg +0 -0
  33. {wcgw-1.5.1 → wcgw-1.5.2}/static/computer-use.jpg +0 -0
  34. {wcgw-1.5.1 → wcgw-1.5.2}/static/example.jpg +0 -0
  35. {wcgw-1.5.1 → wcgw-1.5.2}/static/rocket-icon.png +0 -0
  36. {wcgw-1.5.1 → wcgw-1.5.2}/static/ss1.png +0 -0
  37. {wcgw-1.5.1 → wcgw-1.5.2}/tests/test_basic.py +0 -0
  38. {wcgw-1.5.1 → wcgw-1.5.2}/tests/test_tools.py +0 -0
  39. {wcgw-1.5.1 → wcgw-1.5.2}/uv.lock +0 -0
{wcgw-1.5.1 → wcgw-1.5.2}/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: wcgw
- Version: 1.5.1
+ Version: 1.5.2
  Summary: What could go wrong giving full shell access to chatgpt?
  Project-URL: Homepage, https://github.com/rusiaaman/wcgw
  Author-email: Aman Rusia <gapypi@arcfu.com>
@@ -29,7 +29,9 @@ Description-Content-Type: text/markdown
 
  # Shell and Coding agent on Chatgpt and Claude desktop apps
 
- A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit files, run code, etc.
+ - An MCP server on claude desktop for autonomous shell, coding and desktop control agent.
+ - A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit files, run code, etc.
+
 
  [![Tests](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml)
  [![Build](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml/badge.svg)](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml)
@@ -40,6 +42,7 @@ A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit
  ### 🚀 Highlights
 
  - ⚡ **Full Shell Access**: No restrictions, complete control.
+ - ⚡ **Desktop control on Claude**: Screen capture, mouse control, keyboard control on claude desktop (on mac with docker linux)
  - ⚡ **Create, Execute, Iterate**: Ask the gpt to keep running compiler checks till all errors are fixed, or ask it to keep checking for the status of a long running command till it's done.
  - ⚡ **Interactive Command Handling**: Supports interactive commands using arrow keys, interrupt, and ansi escape sequences.
  - ⚡ **REPL support**: [beta] Supports python/node and other REPL execution.
{wcgw-1.5.1 → wcgw-1.5.2}/README.md
@@ -1,6 +1,8 @@
  # Shell and Coding agent on Chatgpt and Claude desktop apps
 
- A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit files, run code, etc.
+ - An MCP server on claude desktop for autonomous shell, coding and desktop control agent.
+ - A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit files, run code, etc.
+
 
  [![Tests](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml)
  [![Build](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml/badge.svg)](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml)
@@ -11,6 +13,7 @@ A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit
  ### 🚀 Highlights
 
  - ⚡ **Full Shell Access**: No restrictions, complete control.
+ - ⚡ **Desktop control on Claude**: Screen capture, mouse control, keyboard control on claude desktop (on mac with docker linux)
  - ⚡ **Create, Execute, Iterate**: Ask the gpt to keep running compiler checks till all errors are fixed, or ask it to keep checking for the status of a long running command till it's done.
  - ⚡ **Interactive Command Handling**: Supports interactive commands using arrow keys, interrupt, and ansi escape sequences.
  - ⚡ **REPL support**: [beta] Supports python/node and other REPL execution.
{wcgw-1.5.1 → wcgw-1.5.2}/pyproject.toml
@@ -1,7 +1,7 @@
  [project]
  authors = [{ name = "Aman Rusia", email = "gapypi@arcfu.com" }]
  name = "wcgw"
- version = "1.5.1"
+ version = "1.5.2"
  description = "What could go wrong giving full shell access to chatgpt?"
  readme = "README.md"
  requires-python = ">=3.11, <3.13"
{wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/anthropic_client.py
@@ -223,9 +223,10 @@ def loop(
  input_schema=GetScreenInfo.model_json_schema(),
  name="GetScreenInfo",
  description="""
- - Get display information of an OS running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
  - Important: call this first in the conversation before ScreenShot, Mouse, and Keyboard tools.
+ - Get display information of a linux os running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
+ - If user hasn't provided docker image id, check using `docker ps` and provide the id.
+ - If the docker is not running, run using `docker run -d -p 6080:6080 ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest`
  - Connects shell to the docker environment.
  - Note: once this is called, the shell enters the docker environment. All bash commands will run over there.
  """,
@@ -234,26 +235,22 @@ def loop(
  input_schema=ScreenShot.model_json_schema(),
  name="ScreenShot",
  description="""
- - Capture screenshot of an OS running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
- - Capture ScreenShot of the current screen for automation.
+ - Capture screenshot of the linux os on docker.
  """,
  ),
  ToolParam(
  input_schema=Mouse.model_json_schema(),
  name="Mouse",
  description="""
- - Interact with docker container running image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
- - Interact with the screen using mouse
+ - Interact with the linux os on docker using mouse.
+ - Uses xdotool
  """,
  ),
  ToolParam(
  input_schema=Keyboard.model_json_schema(),
  name="Keyboard",
  description="""
- - Interact with docker container running image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
+ - Interact with the linux os on docker using keyboard.
  - Emulate keyboard input to the screen
  - Uses xdootool to send keyboard input, keys like Return, BackSpace, Escape, Page_Up, etc. can be used.
  - Do not use it to interact with Bash tool.
{wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/computer_use.py
@@ -186,6 +186,7 @@ class ComputerTool:
  docker_image_id: Optional[str] = None,
  text: str | None = None,
  coordinate: tuple[int, int] | None = None,
+ do_left_click_on_move: bool | None = None,
  **kwargs: Any,
  ) -> ToolResult:
  if action == "get_screen_info":
@@ -217,7 +218,12 @@ class ComputerTool:
  )
 
  if action == "mouse_move":
- return self.shell(f"{self.xdotool} mousemove {x} {y}")
+ if not do_left_click_on_move:
+ return self.shell(f"{self.xdotool} mousemove {x} {y}")
+ else:
+ return self.shell(
+ f"{self.xdotool} mousemove {x} {y} click 1",
+ )
  elif action == "left_click_drag":
  return self.shell(
  f"{self.xdotool} mousedown 1 mousemove {x} {y} mouseup 1",
@@ -401,6 +407,7 @@ def run_computer_tool(
  result = Computer(
  action="mouse_move",
  coordinate=(action.action.x, action.action.y),
+ do_left_click_on_move=action.action.do_left_click_on_move,
  )
  elif isinstance(action.action, LeftClickDrag):
  result = Computer(
{wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/mcp_server/Readme.md
@@ -1,5 +1,9 @@
  # Claude desktop support
 
+ `wcgw` enables Claude desktop app on Mac to access shell and file system in order to automate tasks, run code, etc.
+
+ It also has a computer use feature to connect to linux running on docker. Claude can fully control it including mouse and keyboard.
+
  ## Setup
 
  Update `claude_desktop_config.json` (~/Library/Application Support/Claude/claude_desktop_config.json)
@@ -32,21 +36,10 @@ Computer use is enabled by default. Claude will be able to connect to any docker
  First run a sample docker image with desktop and optionally VNC connection:
 
  ```sh
- docker run \
- --entrypoint "" \
- -p 6080:6080 \
- -e WIDTH=1024 \
- -e HEIGHT=768 \
- -d \
- ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest \
- bash -c "\
- ./start_all.sh && \
- ./novnc_startup.sh && \
- python http_server.py > /tmp/server_logs.txt 2>&1 & \
- tail -f /dev/null"
+ docker run -p 6080:6080 ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
  ```
 
- Connect to `http://localhost:6080/vnc.html` for desktop view (VNC) of the system running in the docker. Then ask claude to control the docker os.
+ Connect to `http://localhost:6080/vnc.html` for desktop view (VNC) of the system running in the docker. Then ask claude desktop app to control the docker os.
 
  ## Usage
 
{wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/mcp_server/server.py
@@ -151,9 +151,10 @@ async def handle_list_tools() -> list[types.Tool]:
  inputSchema=GetScreenInfo.model_json_schema(),
  name="GetScreenInfo",
  description="""
- - Get display information of an OS running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
  - Important: call this first in the conversation before ScreenShot, Mouse, and Keyboard tools.
+ - Get display information of a linux os running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
+ - If user hasn't provided docker image id, check using `docker ps` and provide the id.
+ - If the docker is not running, run using `docker run -d -p 6080:6080 ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest`
  - Connects shell to the docker environment.
  - Note: once this is called, the shell enters the docker environment. All bash commands will run over there.
  """,
@@ -162,29 +163,26 @@ async def handle_list_tools() -> list[types.Tool]:
  inputSchema=ScreenShot.model_json_schema(),
  name="ScreenShot",
  description="""
- - Capture screenshot of an OS running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
- - Capture ScreenShot of the current screen for automation.
+ - Capture screenshot of the linux os on docker.
  """,
  ),
  ToolParam(
  inputSchema=Mouse.model_json_schema(),
  name="Mouse",
  description="""
- - Interact with docker container running image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
- - Interact with the screen using mouse
+ - Interact with the linux os on docker using mouse.
+ - Uses xdotool
  """,
  ),
  ToolParam(
  inputSchema=Keyboard.model_json_schema(),
  name="Keyboard",
  description="""
- - Interact with docker container running image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
- - If user hasn't provided docker image id, check using `docker ps` and provide the id.
+ - Interact with the linux os on docker using keyboard.
  - Emulate keyboard input to the screen
  - Uses xdootool to send keyboard input, keys like Return, BackSpace, Escape, Page_Up, etc. can be used.
  - Do not use it to interact with Bash tool.
+ - Make sure you've selected a text area or an editable element before sending text.
  """,
  ),
  ]
{wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/client/tools.py
@@ -170,6 +170,7 @@ def initial_info() -> str:
  System: {uname_sysname}
  Machine: {uname_machine}
  Current working directory: {CWD}
+ wcgw version: {importlib.metadata.version("wcgw")}
  """
 
 
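The line added to `initial_info()` calls `importlib.metadata.version("wcgw")`, which raises `PackageNotFoundError` when no installed distribution metadata is found (for example, when running from a source checkout that was never installed). A hedged sketch of a defensive lookup follows; `package_version` and its `default` parameter are hypothetical names for illustration, not wcgw code:

```python
import importlib.metadata


def package_version(name: str, default: str = "unknown") -> str:
    """Return the installed version of distribution `name`, or `default` if it is not installed."""
    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # No installed distribution metadata for this name.
        return default


if __name__ == "__main__":
    print(f"wcgw version: {package_version('wcgw')}")
```

Whether the fallback is preferable to letting the exception propagate depends on how the banner text is used; the diff itself performs the lookup directly.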
{wcgw-1.5.1 → wcgw-1.5.2}/src/wcgw/types_.py
@@ -60,12 +60,12 @@ class GetScreenInfo(BaseModel):
 
  class ScreenShot(BaseModel):
  type: Literal["ScreenShot"]
- docker_image_id: str
 
 
  class MouseMove(BaseModel):
  x: int
  y: int
+ do_left_click_on_move: bool
  type: Literal["MouseMove"]
 
 
File without changes