PyPI - wcgw - Versions diffs - 1.5.3__tar.gz → 2.0.0__tar.gz - Mend

wcgw 1.5.3 (tar.gz) → 2.0.0 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of wcgw might be problematic.

Files changed (42)

wcgw-2.0.0/.github/workflows/python-types.yml ADDED

@@ -0,0 +1,29 @@
+name: Mypy strict
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+jobs:
+  typecheck:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.11", "3.12"]
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v3
+        with:
+          python-version: "${{ matrix.python-version }}"
+      - name: Install dependencies
+        run: |
+          pip install uv
+          uv venv --python "${{ matrix.python-version }}"
+      - name: Run type checks
+        run: |
+          uv run mypy --strict src

wcgw-2.0.0/PKG-INFO ADDED

@@ -0,0 +1,156 @@
+Metadata-Version: 2.3
+Name: wcgw
+Version: 2.0.0
+Summary: What could go wrong giving full shell access to chatgpt?
+Project-URL: Homepage, https://github.com/rusiaaman/wcgw
+Author-email: Aman Rusia <gapypi@arcfu.com>
+Requires-Python: <3.13,>=3.11
+Requires-Dist: anthropic>=0.39.0
+Requires-Dist: fastapi>=0.115.0
+Requires-Dist: mcp
+Requires-Dist: mypy>=1.11.2
+Requires-Dist: nltk>=3.9.1
+Requires-Dist: openai>=1.46.0
+Requires-Dist: petname>=2.6
+Requires-Dist: pexpect>=4.9.0
+Requires-Dist: pydantic>=2.9.2
+Requires-Dist: pyte>=0.8.2
+Requires-Dist: python-dotenv>=1.0.1
+Requires-Dist: rich>=13.8.1
+Requires-Dist: semantic-version>=2.10.0
+Requires-Dist: shell>=1.0.1
+Requires-Dist: tiktoken==0.7.0
+Requires-Dist: toml>=0.10.2
+Requires-Dist: typer>=0.12.5
+Requires-Dist: types-pexpect>=4.9.0.20240806
+Requires-Dist: uvicorn>=0.31.0
+Requires-Dist: websockets>=13.1
+Description-Content-Type: text/markdown
+# Shell and Coding agent on Claude desktop app
+- An MCP server on claude desktop for autonomous shell, coding and desktop control agent.
+[![Tests](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml)
+[![Mypy strict](https://github.com/rusiaaman/wcgw/actions/workflows/python-types.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-types.yml)
+[![Build](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml/badge.svg)](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml)
+## Updates
+- [01 Dec 2024] Deprecated chatgpt app support
+- [26 Nov 2024] Introduced claude desktop support through mcp
+## 🚀 Highlights
+- ⚡ **Full Shell Access**: No restrictions, complete control.
+- ⚡ **Desktop control on Claude**: Screen capture, mouse control, keyboard control on claude desktop (on mac with docker linux)
+- ⚡ **Create, Execute, Iterate**: Ask claude to keep running compiler checks till all errors are fixed, or ask it to keep checking for the status of a long running command till it's done.
+- ⚡ **Interactive Command Handling**: Supports interactive commands using arrow keys, interrupt, and ansi escape sequences.
+- ⚡ **REPL support**: [beta] Supports python/node and other REPL execution.
+## Setup
+Update `claude_desktop_config.json` (~/Library/Application Support/Claude/claude_desktop_config.json)
+```json
+{
+  "mcpServers": {
+    "wcgw": {
+      "command": "uv",
+      "args": [
+        "tool",
+        "run",
+        "--from",
+        "wcgw@latest",
+        "--python",
+        "3.12",
+        "wcgw_mcp"
+      ]
+    }
+  }
+}
+```
+Then restart claude app.
+## [Optional] Computer use support using desktop on docker
+Computer use is disabled by default. Add `--computer-use` to enable it. This will add necessary tools to Claude including ScreenShot, Mouse and Keyboard control.
+```json
+{
+  "mcpServers": {
+    "wcgw": {
+      "command": "uv",
+      "args": [
+        "tool",
+        "run",
+        "--from",
+        "wcgw@latest",
+        "--python",
+        "3.12",
+        "wcgw_mcp",
+        "--computer-use"
+      ]
+    }
+  }
+}
+```
+Claude will be able to connect to any docker container with linux environment. Native system control isn't supported outside docker.
+You'll need to run a docker image with desktop and optional VNC connection. Here's a demo image:
+```sh
+docker run -p 6080:6080 ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
+```
+Then ask claude desktop app to control the docker os. It'll connect to the docker container and control it.
+Connect to `http://localhost:6080/vnc.html` for desktop view (VNC) of the system running in the docker.
+## Usage
+Wait for a few seconds. You should be able to see this icon if everything goes right.
+![mcp icon](https://github.com/rusiaaman/wcgw/blob/main/static/rocket-icon.png?raw=true)
+over here
+![mcp icon](https://github.com/rusiaaman/wcgw/blob/main/static/claude-ss.jpg?raw=true)
+Then ask claude to execute shell commands, read files, edit files, run your code, etc.
+If you've run the docker for LLM to access, you can ask it to control the "docker os". If you don't provide the docker container id to it, it'll try to search for available docker using `docker ps` command.
+## Example
+### Computer use example
+![computer-use](https://github.com/rusiaaman/wcgw/blob/main/static/computer-use.jpg?raw=true)
+### Shell example
+![example](https://github.com/rusiaaman/wcgw/blob/main/static/example.jpg?raw=true)
+## [Optional] Local shell access with openai API key or anthropic API key
+### Openai
+Add `OPENAI_API_KEY` and `OPENAI_ORG_ID` env variables.
+Then run
+`uvx --from wcgw@latest wcgw_local  --limit 0.1` # Cost limit $0.1
+You can now directly write messages or press enter key to open vim for multiline message and text pasting.
+### Anthropic
+Add `ANTHROPIC_API_KEY` env variable.
+Then run
+`uvx --from wcgw@latest wcgw_local --claude`
+You can now directly write messages or press enter key to open vim for multiline message and text pasting.

wcgw-2.0.0/README.md ADDED

@@ -0,0 +1,127 @@
+# Shell and Coding agent on Claude desktop app
+- An MCP server on claude desktop for autonomous shell, coding and desktop control agent.
+[![Tests](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml)
+[![Mypy strict](https://github.com/rusiaaman/wcgw/actions/workflows/python-types.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-types.yml)
+[![Build](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml/badge.svg)](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml)
+## Updates
+- [01 Dec 2024] Deprecated chatgpt app support
+- [26 Nov 2024] Introduced claude desktop support through mcp
+## 🚀 Highlights
+- ⚡ **Full Shell Access**: No restrictions, complete control.
+- ⚡ **Desktop control on Claude**: Screen capture, mouse control, keyboard control on claude desktop (on mac with docker linux)
+- ⚡ **Create, Execute, Iterate**: Ask claude to keep running compiler checks till all errors are fixed, or ask it to keep checking for the status of a long running command till it's done.
+- ⚡ **Interactive Command Handling**: Supports interactive commands using arrow keys, interrupt, and ansi escape sequences.
+- ⚡ **REPL support**: [beta] Supports python/node and other REPL execution.
+## Setup
+Update `claude_desktop_config.json` (~/Library/Application Support/Claude/claude_desktop_config.json)
+```json
+{
+  "mcpServers": {
+    "wcgw": {
+      "command": "uv",
+      "args": [
+        "tool",
+        "run",
+        "--from",
+        "wcgw@latest",
+        "--python",
+        "3.12",
+        "wcgw_mcp"
+      ]
+    }
+  }
+}
+```
+Then restart claude app.
+## [Optional] Computer use support using desktop on docker
+Computer use is disabled by default. Add `--computer-use` to enable it. This will add necessary tools to Claude including ScreenShot, Mouse and Keyboard control.
+```json
+{
+  "mcpServers": {
+    "wcgw": {
+      "command": "uv",
+      "args": [
+        "tool",
+        "run",
+        "--from",
+        "wcgw@latest",
+        "--python",
+        "3.12",
+        "wcgw_mcp",
+        "--computer-use"
+      ]
+    }
+  }
+}
+```
+Claude will be able to connect to any docker container with linux environment. Native system control isn't supported outside docker.
+You'll need to run a docker image with desktop and optional VNC connection. Here's a demo image:
+```sh
+docker run -p 6080:6080 ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
+```
+Then ask claude desktop app to control the docker os. It'll connect to the docker container and control it.
+Connect to `http://localhost:6080/vnc.html` for desktop view (VNC) of the system running in the docker.
+## Usage
+Wait for a few seconds. You should be able to see this icon if everything goes right.
+![mcp icon](https://github.com/rusiaaman/wcgw/blob/main/static/rocket-icon.png?raw=true)
+over here
+![mcp icon](https://github.com/rusiaaman/wcgw/blob/main/static/claude-ss.jpg?raw=true)
+Then ask claude to execute shell commands, read files, edit files, run your code, etc.
+If you've run the docker for LLM to access, you can ask it to control the "docker os". If you don't provide the docker container id to it, it'll try to search for available docker using `docker ps` command.
+## Example
+### Computer use example
+![computer-use](https://github.com/rusiaaman/wcgw/blob/main/static/computer-use.jpg?raw=true)
+### Shell example
+![example](https://github.com/rusiaaman/wcgw/blob/main/static/example.jpg?raw=true)
+## [Optional] Local shell access with openai API key or anthropic API key
+### Openai
+Add `OPENAI_API_KEY` and `OPENAI_ORG_ID` env variables.
+Then run
+`uvx --from wcgw@latest wcgw_local  --limit 0.1` # Cost limit $0.1
+You can now directly write messages or press enter key to open vim for multiline message and text pasting.
+### Anthropic
+Add `ANTHROPIC_API_KEY` env variable.
+Then run
+`uvx --from wcgw@latest wcgw_local --claude`
+You can now directly write messages or press enter key to open vim for multiline message and text pasting.

wcgw-2.0.0/openai.md ADDED

@@ -0,0 +1,71 @@
+# ChatGPT Integration Guide
+## 🪜 Steps:
+1. Run a relay server with a domain name and https support (or use ngrok) use the instructions in next section.
+2. Create a custom gpt that connects to the relay server, instructions in next sections.
+3. Run the [cli client](https://github.com/rusiaaman/wcgw?tab=readme-ov-file#client) in any directory of choice.
+4. The custom GPT can now run any command on your cli
+## Creating the relay server
+### If you've a domain name and ssl certificate
+Run the server
+`gunicorn --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:443 src.wcgw.relay.serve:app  --certfile fullchain.pem  --keyfile  privkey.pem`
+If you don't have public ip and domain name, you can use `ngrok` or similar services to get a https address to the api.
+Then specify the server url in the `wcgw` command like so:
+`uv tool run --python 3.12 wcgw@latest --server-url wss://your-url/v1/register`
+### Using ngrok
+Run the server
+`uv tool run --python 3.12 --from wcgw@latest wcgw_relay`
+This will start an uvicorn server on port 8000. You can use ngrok to get a public address to the server.
+`ngrok http 8000`
+Then specify the ngrok address in the `wcgw` command like so:
+`uv tool run --python 3.12 wcgw@latest --server-url wss://4900-1c2c-6542-b922-a596-f8f8.ngrok-free.app/v1/register`
+## Creating the custom gpt
+I've used the following instructions and action json schema to create the custom GPT. (Replace wcgw.arcfu.com with the address to your server)
+https://github.com/rusiaaman/wcgw/blob/main/gpt_instructions.txt
+https://github.com/rusiaaman/wcgw/blob/main/gpt_action_json_schema.json
+### Chat
+Let the chatgpt know your user id in any format. E.g., "user_id=<your uuid>" followed by rest of your instructions.
+### How it works on chatgpt app?
+Your commands are relayed through a server to the terminal client.
+Chatgpt sends a request to the relay server using the user id that you share with it. The relay server holds a websocket with the terminal client against the user id and acts as a proxy to pass the request.
+It's secure in both the directions. Either a malicious actor or a malicious Chatgpt has to correctly guess your UUID for any security breach.
+## Showcase
+### Unit tests and github actions
+[The first version of unit tests and github workflow to test on multiple python versions were written by the custom chatgpt](https://chatgpt.com/share/6717f922-8998-8005-b825-45d4b348b4dd)
+### Create a todo app using react + typescript + vite
+![Screenshot](https://github.com/rusiaaman/wcgw/blob/main/static/ss1.png?raw=true)
+## Local shell access with OpenAI API key
+Add `OPENAI_API_KEY` and `OPENAI_ORG_ID` env variables.
+Then run:
+`uvx --from wcgw@latest wcgw_local  --limit 0.1` # Cost limit $0.1
+You can now directly write messages or press enter key to open vim for multiline message and text pasting.
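
Note on the relay design described in openai.md: the relay keeps one open connection per user id and forwards each incoming request to the matching terminal client. A minimal in-memory sketch of that routing idea follows. It is an illustration under stated assumptions, not the package's `wcgw.relay.serve` code: the `Relay` class and its `register` and `forward` methods are hypothetical names, and an `asyncio.Queue` stands in for the websocket.

```python
import asyncio
import uuid


class Relay:
    """Hypothetical relay: maps a user id to the channel of its terminal client."""

    def __init__(self) -> None:
        # user id -> queue standing in for that client's websocket
        self._clients = {}

    def register(self, user_id: str) -> asyncio.Queue:
        """Called when a terminal client connects with its user id."""
        channel = asyncio.Queue()
        self._clients[user_id] = channel
        return channel

    async def forward(self, user_id: str, command: str) -> bool:
        """Pass a command to the client registered under user_id, if any."""
        channel = self._clients.get(user_id)
        if channel is None:
            return False  # unknown id: nothing is delivered
        await channel.put(command)
        return True


async def demo() -> None:
    relay = Relay()
    user_id = str(uuid.uuid4())
    channel = relay.register(user_id)
    delivered = await relay.forward(user_id, "ls")
    print(delivered, channel.get_nowait())
    # A request carrying a different id reaches no client.
    print(await relay.forward(str(uuid.uuid4()), "ls"))


if __name__ == "__main__":
    asyncio.run(demo())
```

Because the id is the only routing key in this sketch, anyone who learns it can send commands to the connected terminal, which is why the guide treats the UUID as a secret.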

{wcgw-1.5.3 → wcgw-2.0.0}/pyproject.toml RENAMED

@@ -1,7 +1,7 @@
 [project]
 authors = [{ name = "Aman Rusia", email = "gapypi@arcfu.com" }]
 name = "wcgw"
-version = "1.5.3"
+version = "2.0.0"
 description = "What could go wrong giving full shell access to chatgpt?"
 readme = "README.md"
 requires-python = ">=3.11, <3.13"

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/anthropic_client.py RENAMED

@@ -4,7 +4,7 @@ import mimetypes
 from pathlib import Path
 import sys
 import traceback
-from typing import Callable, DefaultDict, Optional, cast
+from typing import Callable, DefaultDict, Optional, cast, Literal
 import anthropic
 from anthropic import Anthropic
 from anthropic.types import (
@@ -110,7 +110,10 @@ def parse_user_message_special(msg: str) -> MessageParam:
                     "type": "image",
                     "source": {
                         "type": "base64",
-                        "media_type": image_type,
+                        "media_type": cast(
+                            'Literal["image/jpeg", "image/png", "image/gif", "image/webp"]',
+                            image_type or "image/png",
+                        ),
                         "data": image_b64,
                     },
                 }
@@ -131,6 +134,7 @@ def loop(
     first_message: Optional[str] = None,
     limit: Optional[float] = None,
     resume: Optional[str] = None,
+    computer_use: bool = False,
 ) -> tuple[str, float]:
     load_dotenv()
@@ -222,10 +226,14 @@ def loop(
 - Use SEARCH/REPLACE blocks to edit the file.
 """,
         ),
-        ToolParam(
-            input_schema=GetScreenInfo.model_json_schema(),
-            name="GetScreenInfo",
-            description="""
+    ]
+    if computer_use:
+        tools += [
+            ToolParam(
+                input_schema=GetScreenInfo.model_json_schema(),
+                name="GetScreenInfo",
+                description="""
 - Important: call this first in the conversation before ScreenShot, Mouse, and Keyboard tools.
 - Get display information of a linux os running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
 - If user hasn't provided docker image id, check using `docker ps` and provide the id.
@@ -233,11 +241,11 @@ def loop(
 - Connects shell to the docker environment.
 - Note: once this is called, the shell enters the docker environment. All bash commands will run over there.
 """,
-        ),
-        ToolParam(
-            input_schema=ScreenShot.model_json_schema(),
-            name="ScreenShot",
-            description="""
+            ),
+            ToolParam(
+                input_schema=ScreenShot.model_json_schema(),
+                name="ScreenShot",
+                description="""
 - Capture screenshot of the linux os on docker.
 - All actions on UI using mouse and keyboard return within 0.5 seconds.
     * So if you're doing something that takes longer for UI to update like heavy page loading, keep checking UI for update usign ScreenShot upto 10 turns.
@@ -246,27 +254,27 @@ def loop(
     * If you don't notice even slightest of the change, it's likely you clicked on the wrong place.
 """,
-        ),
-        ToolParam(
-            input_schema=Mouse.model_json_schema(),
-            name="Mouse",
-            description="""
+            ),
+            ToolParam(
+                input_schema=Mouse.model_json_schema(),
+                name="Mouse",
+                description="""
 - Interact with the linux os on docker using mouse.
 - Uses xdotool
 - About left_click_drag: the current mouse position will be used as the starting point, click and drag to the given x, y coordinates. Useful in things like sliders, moving things around, etc.
 """,
-        ),
-        ToolParam(
-            input_schema=Keyboard.model_json_schema(),
-            name="Keyboard",
-            description="""
+            ),
+            ToolParam(
+                input_schema=Keyboard.model_json_schema(),
+                name="Keyboard",
+                description="""
 - Interact with the linux os on docker using keyboard.
 - Emulate keyboard input to the screen
 - Uses xdootool to send keyboard input, keys like Return, BackSpace, Escape, Page_Up, etc. can be used.
 - Do not use it to interact with Bash tool.
 """,
-        ),
-    ]
+            ),
+        ]
     uname_sysname = os.uname().sysname
     uname_machine = os.uname().machine
@@ -355,53 +363,79 @@ System information:
                     type_ = chunk.type
                     if type_ in {"message_start", "message_stop"}:
                         continue
-                    elif type_ == "content_block_start":
+                    elif type_ == "content_block_start" and hasattr(
+                        chunk, "content_block"
+                    ):
                         content_block = chunk.content_block
-                        if content_block.type == "text":
+                        if (
+                            hasattr(content_block, "type")
+                            and content_block.type == "text"
+                            and hasattr(content_block, "text")
+                        ):
                             chunk_str = content_block.text
                             assistant_console.print(chunk_str, end="")
                             full_response += chunk_str
                         elif content_block.type == "tool_use":
-                            assert content_block.input == {}
-                            tool_calls.append(
-                                {
-                                    "name": content_block.name,
-                                    "input": "",
-                                    "done": False,
-                                    "id": content_block.id,
-                                }
-                            )
+                            if (
+                                hasattr(content_block, "input")
+                                and hasattr(content_block, "name")
+                                and hasattr(content_block, "id")
+                            ):
+                                assert content_block.input == {}
+                                tool_calls.append(
+                                    {
+                                        "name": str(content_block.name),
+                                        "input": str(""),
+                                        "done": False,
+                                        "id": str(content_block.id),
+                                    }
+                                )
                         else:
                             error_console.log(
                                 f"Ignoring unknown content block type {content_block.type}"
                             )
-                    elif type_ == "content_block_delta":
-                        if chunk.delta.type == "text_delta":
-                            chunk_str = chunk.delta.text
-                            assistant_console.print(chunk_str, end="")
-                            full_response += chunk_str
-                        elif chunk.delta.type == "input_json_delta":
-                            tool_calls[-1]["input"] += chunk.delta.partial_json
+                    elif type_ == "content_block_delta" and hasattr(chunk, "delta"):
+                        delta = chunk.delta
+                        if hasattr(delta, "type"):
+                            delta_type = str(delta.type)
+                            if delta_type == "text_delta" and hasattr(delta, "text"):
+                                chunk_str = delta.text
+                                assistant_console.print(chunk_str, end="")
+                                full_response += chunk_str
+                            elif delta_type == "input_json_delta" and hasattr(
+                                delta, "partial_json"
+                            ):
+                                partial_json = delta.partial_json
+                                if isinstance(tool_calls[-1]["input"], str):
+                                    tool_calls[-1]["input"] += partial_json
+                            else:
+                                error_console.log(
+                                    f"Ignoring unknown content block delta type {delta_type}"
+                                )
                         else:
-                            error_console.log(
-                                f"Ignoring unknown content block delta type {chunk.delta.type}"
-                            )
+                            raise ValueError("Content block delta has no type")
                     elif type_ == "content_block_stop":
                         if tool_calls and not tool_calls[-1]["done"]:
                             tc = tool_calls[-1]
+                            tool_name = str(tc["name"])
+                            tool_input = str(tc["input"])
+                            tool_id = str(tc["id"])
                             tool_parsed = which_tool_name(
-                                tc["name"]
-                            ).model_validate_json(tc["input"])
+                                tool_name
+                            ).model_validate_json(tool_input)
                             system_console.print(
                                 f"\n---------------------------------------\n# Assistant invoked tool: {tool_parsed}"
                             )
                             _histories.append(
                                 {
                                     "role": "assistant",
                                     "content": [
                                         ToolUseBlockParam(
-                                            id=tc["id"],
-                                            name=tc["name"],
+                                            id=tool_id,
+                                            name=tool_name,
                                             input=tool_parsed.model_dump(),
                                             type="tool_use",
                                         )
@@ -453,7 +487,7 @@ System information:
                             tool_results.append(
                                 ToolResultBlockParam(
                                     type="tool_result",
-                                    tool_use_id=tc["id"],
+                                    tool_use_id=str(tc["id"]),
                                     content=tool_results_content,
                                 )
                             )
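
Note on the streaming changes above: the loop opens a tool-call entry on `content_block_start`, appends `partial_json` fragments from `input_json_delta` events, and parses the joined string on `content_block_stop`. The sketch below reproduces that accumulation with plain `SimpleNamespace` objects in place of Anthropic SDK events. `collect_tool_calls`, the tool name `BashCommand`, and the id `toolu_1` are illustrative assumptions, and `json.loads` stands in for the pydantic `model_validate_json` call.

```python
import json
from types import SimpleNamespace as NS


def collect_tool_calls(chunks):
    """Accumulate streamed tool-call input with the same hasattr-style guards."""
    tool_calls = []
    for chunk in chunks:
        if chunk.type == "content_block_start" and hasattr(chunk, "content_block"):
            block = chunk.content_block
            if getattr(block, "type", None) == "tool_use":
                tool_calls.append(
                    {"name": str(block.name), "input": "", "done": False, "id": str(block.id)}
                )
        elif chunk.type == "content_block_delta" and hasattr(chunk, "delta"):
            delta = chunk.delta
            if getattr(delta, "type", None) == "input_json_delta" and tool_calls:
                # Fragments are not valid JSON on their own; only concatenate here.
                tool_calls[-1]["input"] += delta.partial_json
        elif chunk.type == "content_block_stop":
            if tool_calls and not tool_calls[-1]["done"]:
                tool_calls[-1]["done"] = True
    # Parse only after the block has closed; an empty input means "no arguments".
    return [(tc["name"], json.loads(tc["input"] or "{}")) for tc in tool_calls if tc["done"]]


events = [
    NS(type="message_start"),
    NS(type="content_block_start",
       content_block=NS(type="tool_use", name="BashCommand", id="toolu_1", input={})),
    NS(type="content_block_delta", delta=NS(type="input_json_delta", partial_json='{"command": ')),
    NS(type="content_block_delta", delta=NS(type="input_json_delta", partial_json='"ls"}')),
    NS(type="content_block_stop"),
    NS(type="message_stop"),
]
print(collect_tool_calls(events))
```

Parsing is deferred to `content_block_stop` because an individual fragment such as `{"command": ` cannot be decoded until the rest of the input has arrived.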

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/cli.py RENAMED

@@ -16,6 +16,7 @@ def loop(
     first_message: Optional[str] = None,
     limit: Optional[float] = None,
     resume: Optional[str] = None,
+    computer_use: bool = False,
     version: bool = typer.Option(False, "--version", "-v"),
 ) -> tuple[str, float]:
     if version:
@@ -27,6 +28,7 @@ def loop(
             first_message=first_message,
             limit=limit,
             resume=resume,
+            computer_use=computer_use,
         )
     else:
         return openai_loop(

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/computer_use.py RENAMED

@@ -161,7 +161,7 @@ class ComputerTool:
         assert not result.error, result.error
         assert result.output, "Could not get screen info"
         width, height, display_num = map(
-            lambda x: None if not x else int(x), result.output.split(",")
+            lambda x: None if not x else int(x), result.output.strip().split(",")
         )
         if width is None:
             width = 1080
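
Note on the `.strip()` change above: the diff does not say which command output motivated it. One plausible case, assumed here, is output with an empty last field followed by a newline, such as `"1024,768,\n"`. Without stripping, the last field is `"\n"`, which is truthy and so reaches `int()`, which rejects it. With stripping, it becomes `""` and maps to `None`. The helper name `parse_screen_info` and the sample values are hypothetical.

```python
from typing import List, Optional


def parse_screen_info(output: str) -> List[Optional[int]]:
    """Parse 'width,height,display_num', mapping empty fields to None."""
    return [int(x) if x else None for x in output.strip().split(",")]


raw = "1024,768,\n"  # assumed sample output with an empty display number

try:
    # Behaviour without strip(): the trailing "\n" field is passed to int().
    [int(x) if x else None for x in raw.split(",")]
    unstripped_failed = False
except ValueError:
    unstripped_failed = True

print(unstripped_failed)
print(parse_screen_info(raw))
```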

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/mcp_server/Readme.md RENAMED

@@ -31,15 +31,39 @@ Then restart claude app.
 ### [Optional] Computer use support using desktop on docker
-Computer use is enabled by default. Claude will be able to connect to any docker container with linux environment. Native system control isn't supported outside docker.
+Computer use is disabled by default. Add `--computer-use` to enable it. This will add necessary tools to Claude including ScreenShot, Mouse and Keyboard control.
-First run a sample docker image with desktop and optionally VNC connection:
+```json
+{
+  "mcpServers": {
+    "wcgw": {
+      "command": "uv",
+      "args": [
+        "tool",
+        "run",
+        "--from",
+        "wcgw@latest",
+        "--python",
+        "3.12",
+        "wcgw_mcp",
+        "--computer-use"
+      ]
+    }
+  }
+}
+```
+Claude will be able to connect to any docker container with linux environment. Native system control isn't supported outside docker.
+You'll need to run a docker image with desktop and optional VNC connection. Here's a demo image:
 ```sh
 docker run -p 6080:6080 ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
 ```
-Connect to `http://localhost:6080/vnc.html` for desktop view (VNC) of the system running in the docker. Then ask claude desktop app to control the docker os.
+Then ask claude desktop app to control the docker os. It'll connect to the docker container and control it.
+Connect to `http://localhost:6080/vnc.html` for desktop view (VNC) of the system running in the docker.
 ## Usage

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/mcp_server/__init__.py RENAMED

@@ -1,10 +1,15 @@
+# mypy: disable-error-code="import-untyped"
 from wcgw.client.mcp_server import server
 import asyncio
+from typer import Typer
+main = Typer()
-def main():
+@main.command()
+def app(computer_use: bool = False) -> None:
     """Main entry point for the package."""
-    asyncio.run(server.main())
+    asyncio.run(server.main(computer_use))
 # Optionally expose other important items at package level

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/mcp_server/server.py RENAMED

@@ -32,6 +32,8 @@ from ..computer_use import SLEEP_TIME_MAX_S
 tools.TIMEOUT = SLEEP_TIME_MAX_S
+COMPUTER_USE_ON_DOCKER_ENABLED = False
 server = Server("wcgw")
@@ -71,7 +73,7 @@ async def handle_list_tools() -> list[types.Tool]:
     ) as f:
         diffinstructions = f.read()
-    return [
+    tools = [
         ToolParam(
             inputSchema=Initialize.model_json_schema(),
             name="Initialize",
@@ -142,17 +144,13 @@ async def handle_list_tools() -> list[types.Tool]:
 """
             + diffinstructions,
         ),
-        ToolParam(
-            inputSchema=ReadImage.model_json_schema(),
-            name="ReadImage",
-            description="""
-- Read an image from the shell.
-""",
-        ),
-        ToolParam(
-            inputSchema=GetScreenInfo.model_json_schema(),
-            name="GetScreenInfo",
-            description="""
+    ]
+    if COMPUTER_USE_ON_DOCKER_ENABLED:
+        tools += [
+            ToolParam(
+                inputSchema=GetScreenInfo.model_json_schema(),
+                name="GetScreenInfo",
+                description="""
 - Important: call this first in the conversation before ScreenShot, Mouse, and Keyboard tools.
 - Get display information of a linux os running on docker using image "ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest"
 - If user hasn't provided docker image id, check using `docker ps` and provide the id.
@@ -160,11 +158,11 @@ async def handle_list_tools() -> list[types.Tool]:
 - Connects shell to the docker environment.
 - Note: once this is called, the shell enters the docker environment. All bash commands will run over there.
 """,
-        ),
-        ToolParam(
-            inputSchema=ScreenShot.model_json_schema(),
-            name="ScreenShot",
-            description="""
+            ),
+            ToolParam(
+                inputSchema=ScreenShot.model_json_schema(),
+                name="ScreenShot",
+                description="""
 - Capture screenshot of the linux os on docker.
 - All actions on UI using mouse and keyboard return within 0.5 seconds.
     * So if you're doing something that takes longer for UI to update like heavy page loading, keep checking UI for update usign ScreenShot upto 10 turns.
@@ -172,28 +170,30 @@ async def handle_list_tools() -> list[types.Tool]:
     * After 10 turns of no change, ask user for permission to keep checking.
     * If you don't notice even slightest of the change, it's likely you clicked on the wrong place.
 """,
-        ),
-        ToolParam(
-            inputSchema=Mouse.model_json_schema(),
-            name="Mouse",
-            description="""
+            ),
+            ToolParam(
+                inputSchema=Mouse.model_json_schema(),
+                name="Mouse",
+                description="""
 - Interact with the linux os on docker using mouse.
 - Uses xdotool
 - About left_click_drag: the current mouse position will be used as the starting point, click and drag to the given x, y coordinates. Useful in things like sliders, moving things around, etc.
 """,
-        ),
-        ToolParam(
-            inputSchema=Keyboard.model_json_schema(),
-            name="Keyboard",
-            description="""
+            ),
+            ToolParam(
+                inputSchema=Keyboard.model_json_schema(),
+                name="Keyboard",
+                description="""
 - Interact with the linux os on docker using keyboard.
 - Emulate keyboard input to the screen
 - Uses xdootool to send keyboard input, keys like Return, BackSpace, Escape, Page_Up, etc. can be used.
 - Do not use it to interact with Bash tool.
 - Make sure you've selected a text area or an editable element before sending text.
 """,
-        ),
-    ]
+            ),
+        ]
+    return tools
 @server.call_tool()  # type: ignore
@@ -263,7 +263,11 @@ async def handle_call_tool(
     return content
-async def main() -> None:
+async def main(computer_use: bool) -> None:
+    global COMPUTER_USE_ON_DOCKER_ENABLED
+    if computer_use:
+        COMPUTER_USE_ON_DOCKER_ENABLED = True
     version = importlib.metadata.version("wcgw")
     # Run the server using stdin/stdout streams
     async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
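
Note on the opt-in pattern above: `main` sets a module-level flag once at startup, and the tool-listing handler reads it on every call to decide whether the computer-use tools are advertised. The sketch below shows that gating with plain strings in place of `ToolParam` objects. The function name `list_tools` is a hypothetical stand-in for `handle_list_tools`, `main` is reduced to a synchronous function, and only tool names visible in this diff are used.

```python
from typing import List

# Off by default, matching the new default in the diff.
COMPUTER_USE_ON_DOCKER_ENABLED = False


def list_tools() -> List[str]:
    """Return the advertised tool names, adding computer-use tools only when enabled."""
    tools = ["Initialize"]
    if COMPUTER_USE_ON_DOCKER_ENABLED:
        tools += ["GetScreenInfo", "ScreenShot", "Mouse", "Keyboard"]
    return tools


def main(computer_use: bool) -> None:
    """Simplified startup hook: flips the flag when --computer-use was passed."""
    global COMPUTER_USE_ON_DOCKER_ENABLED
    if computer_use:
        COMPUTER_USE_ON_DOCKER_ENABLED = True
```

Calling `main(True)` before serving makes `list_tools()` include the four computer-use tools; calling `main(False)` leaves the list at its base contents.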

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/openai_client.py RENAMED

@@ -123,6 +123,7 @@ def loop(
     first_message: Optional[str] = None,
     limit: Optional[float] = None,
     resume: Optional[str] = None,
+    computer_use: bool = False,
 ) -> tuple[str, float]:
     load_dotenv()

{wcgw-1.5.3 → wcgw-2.0.0}/src/wcgw/client/tools.py RENAMED

@@ -962,7 +962,7 @@ run = Typer(pretty_exceptions_show_locals=False, no_args_is_help=True)
 @run.command()
 def app(
-    server_url: str = "wss://wcgw.arcfu.com/v1/register",
+    server_url: str = "",
     client_uuid: Optional[str] = None,
     version: bool = typer.Option(False, "--version", "-v"),
 ) -> None:
@@ -970,7 +970,18 @@ def app(
         version_ = importlib.metadata.version("wcgw")
         print(f"wcgw version: {version_}")
         exit()
+    if not server_url:
+        server_url = os.environ.get("WCGW_RELAY_SERVER", "")
+        if not server_url:
+            print(
+                "Error: Please provide relay server url using --server_url or WCGW_RELAY_SERVER environment variable"
+            )
+            print(
+                "\tNOTE: you need to run a relay server first, author doesn't host a relay server anymore."
+            )
+            print("\thttps://github.com/rusiaaman/wcgw/blob/main/openai.md")
+            print("\tExample `--server-url=ws://localhost:8000/v1/register`")
+            raise typer.Exit(1)
     register_client(server_url, client_uuid or "")
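
The hunk above changes how the relay URL is resolved: an explicit option wins, then the `WCGW_RELAY_SERVER` environment variable, and otherwise the command exits with an error. The lookup order can be sketched in isolation as below. `resolve_server_url` is a hypothetical helper and not part of the package; returning `None` stands in for the `typer.Exit(1)` path.

```python
import os
from collections.abc import Mapping
from typing import Optional


def resolve_server_url(
    cli_value: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Hypothetical helper: pick the relay URL using the order in the diff.

    1. A non-empty command-line value.
    2. The WCGW_RELAY_SERVER environment variable.
    3. None, where the real command prints an error and exits.
    """
    if env is None:
        env = os.environ
    if cli_value:
        return cli_value
    return env.get("WCGW_RELAY_SERVER", "") or None
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.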

{wcgw-1.5.3 → wcgw-2.0.0}/uv.lock RENAMED

@@ -860,7 +860,7 @@ wheels = [
 [[package]]
 name = "wcgw"
-version = "1.5.2"
+version = "1.5.4"
 source = { editable = "." }
 dependencies = [
     { name = "anthropic" },

wcgw-1.5.3/PKG-INFO DELETED

@@ -1,178 +0,0 @@
-Metadata-Version: 2.3
-Name: wcgw
-Version: 1.5.3
-Summary: What could go wrong giving full shell access to chatgpt?
-Project-URL: Homepage, https://github.com/rusiaaman/wcgw
-Author-email: Aman Rusia <gapypi@arcfu.com>
-Requires-Python: <3.13,>=3.11
-Requires-Dist: anthropic>=0.39.0
-Requires-Dist: fastapi>=0.115.0
-Requires-Dist: mcp
-Requires-Dist: mypy>=1.11.2
-Requires-Dist: nltk>=3.9.1
-Requires-Dist: openai>=1.46.0
-Requires-Dist: petname>=2.6
-Requires-Dist: pexpect>=4.9.0
-Requires-Dist: pydantic>=2.9.2
-Requires-Dist: pyte>=0.8.2
-Requires-Dist: python-dotenv>=1.0.1
-Requires-Dist: rich>=13.8.1
-Requires-Dist: semantic-version>=2.10.0
-Requires-Dist: shell>=1.0.1
-Requires-Dist: tiktoken==0.7.0
-Requires-Dist: toml>=0.10.2
-Requires-Dist: typer>=0.12.5
-Requires-Dist: types-pexpect>=4.9.0.20240806
-Requires-Dist: uvicorn>=0.31.0
-Requires-Dist: websockets>=13.1
-Description-Content-Type: text/markdown
-# Shell and Coding agent on Chatgpt and Claude desktop apps
-- An MCP server on claude desktop for autonomous shell, coding and desktop control agent.
-- A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit files, run code, etc.
-[![Tests](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml)
-[![Build](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml/badge.svg)](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml)
-[New feature] [26-Nov-2024] Claude desktop support for shell, computer-control, coding agent.
-[src/wcgw/client/mcp_server/Readme.md](src/wcgw/client/mcp_server/Readme.md)
-### 🚀 Highlights
-- ⚡ **Full Shell Access**: No restrictions, complete control.
-- ⚡ **Desktop control on Claude**: Screen capture, mouse control, keyboard control on claude desktop (on mac with docker linux)
-- ⚡ **Create, Execute, Iterate**: Ask the gpt to keep running compiler checks till all errors are fixed, or ask it to keep checking for the status of a long running command till it's done.
-- ⚡ **Interactive Command Handling**: Supports interactive commands using arrow keys, interrupt, and ansi escape sequences.
-- ⚡ **REPL support**: [beta] Supports python/node and other REPL execution.
-## Claude
-Full readme [src/wcgw/client/mcp_server/Readme.md](src/wcgw/client/mcp_server/Readme.md)
-### Setup
-Update `claude_desktop_config.json`
-```json
-{
-  "mcpServers": {
-    "wcgw": {
-      "command": "uvx",
-      "args": ["--from", "wcgw@latest", "wcgw_mcp"]
-    }
-  }
-}
-```
-Then restart claude app.
-You can then ask claude to execute shell commands, read files, edit files, run your code, etc.
-## ChatGPT
-### 🪜 Steps:
-1. Run the [cli client](https://github.com/rusiaaman/wcgw?tab=readme-ov-file#client) in any directory of choice.
-2. Share the generated id with this GPT: `https://chatgpt.com/g/g-Us0AAXkRh-wcgw-giving-shell-access`
-3. The custom GPT can now run any command on your cli
-### Client
-You need to keep running this client for GPT to access your shell. Run it in a version controlled project's root.
-#### Option 1: using uv [Recommended]
-```sh
-$ curl -LsSf https://astral.sh/uv/install.sh | sh
-$ uvx wcgw@latest
-```
-#### Option 2: using pip
-Supports python >=3.10 and <3.13
-```sh
-$ pip3 install wcgw
-$ wcgw
-```
-This will print a UUID that you need to share with the gpt.
-### Chat
-Open the following link or search the "wcgw" custom gpt using "Explore GPTs" on chatgpt.com
-https://chatgpt.com/g/g-Us0AAXkRh-wcgw-giving-shell-access
-Finally, let the chatgpt know your user id in any format. E.g., "user_id=<your uuid>" followed by rest of your instructions.
-NOTE: you can resume a broken connection
-`wcgw --client-uuid $previous_uuid`
-### How it works on chatgpt app?
-Your commands are relayed through a server to the terminal client. [You could host the server on your own](https://github.com/rusiaaman/wcgw?tab=readme-ov-file#creating-your-own-custom-gpt-and-the-relay-server). For public convenience I've hosted one at https://wcgw.arcfu.com thanks to the gcloud free tier plan.
-Chatgpt sends a request to the relay server using the user id that you share with it. The relay server holds a websocket with the terminal client against the user id and acts as a proxy to pass the request.
-It's secure in both the directions. Either a malicious actor or a malicious Chatgpt has to correctly guess your UUID for any security breach.
-# Showcase
-## Claude desktop
-### Resize image and move it to a new dir
-![example](https://github.com/rusiaaman/wcgw/blob/main/static/example.jpg?raw=true)
-## Chatgpt app
-### Unit tests and github actions
-[The first version of unit tests and github workflow to test on multiple python versions were written by the custom chatgpt](https://chatgpt.com/share/6717f922-8998-8005-b825-45d4b348b4dd)
-### Create a todo app using react + typescript + vite
-![Screenshot](https://github.com/rusiaaman/wcgw/blob/main/static/ss1.png?raw=true)
-# Privacy
-The relay server doesn't store any data. I can't access any information passing through it and only secure channels are used to communicate.
-You may host the server on your own and create a custom gpt using the following section.
-# Creating your own custom gpt and the relay server.
-I've used the following instructions and action json schema to create the custom GPT. (Replace wcgw.arcfu.com with the address to your server)
-https://github.com/rusiaaman/wcgw/blob/main/gpt_instructions.txt
-https://github.com/rusiaaman/wcgw/blob/main/gpt_action_json_schema.json
-Run the server
-`gunicorn --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:443 src.wcgw.relay.serve:app  --certfile fullchain.pem  --keyfile  privkey.pem`
-If you don't have public ip and domain name, you can use `ngrok` or similar services to get a https address to the api.
-The specify the server url in the `wcgw` command like so
-`wcgw --server-url https://your-url/v1/register`
-# [Optional] Local shell access with openai API key or anthropic API key
-## Openai
-Add `OPENAI_API_KEY` and `OPENAI_ORG_ID` env variables.
-Then run
-`uvx --from wcgw@latest wcgw_local  --limit 0.1` # Cost limit $0.1
-You can now directly write messages or press enter key to open vim for multiline message and text pasting.
-## Anthropic
-Add `ANTHROPIC_API_KEY` env variable.
-Then run
-`uvx --from wcgw@latest wcgw_local --claude`
-You can now directly write messages or press enter key to open vim for multiline message and text pasting.

wcgw-1.5.3/README.md DELETED

@@ -1,149 +0,0 @@
-# Shell and Coding agent on Chatgpt and Claude desktop apps
-- An MCP server on claude desktop for autonomous shell, coding and desktop control agent.
-- A custom gpt on chatgpt web/desktop apps to interact with your local shell, edit files, run code, etc.
-[![Tests](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml/badge.svg?branch=main)](https://github.com/rusiaaman/wcgw/actions/workflows/python-tests.yml)
-[![Build](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml/badge.svg)](https://github.com/rusiaaman/wcgw/actions/workflows/python-publish.yml)
-[New feature] [26-Nov-2024] Claude desktop support for shell, computer-control, coding agent.
-[src/wcgw/client/mcp_server/Readme.md](src/wcgw/client/mcp_server/Readme.md)
-### 🚀 Highlights
-- ⚡ **Full Shell Access**: No restrictions, complete control.
-- ⚡ **Desktop control on Claude**: Screen capture, mouse control, keyboard control on claude desktop (on mac with docker linux)
-- ⚡ **Create, Execute, Iterate**: Ask the gpt to keep running compiler checks till all errors are fixed, or ask it to keep checking for the status of a long running command till it's done.
-- ⚡ **Interactive Command Handling**: Supports interactive commands using arrow keys, interrupt, and ansi escape sequences.
-- ⚡ **REPL support**: [beta] Supports python/node and other REPL execution.
-## Claude
-Full readme [src/wcgw/client/mcp_server/Readme.md](src/wcgw/client/mcp_server/Readme.md)
-### Setup
-Update `claude_desktop_config.json`
-```json
-{
-  "mcpServers": {
-    "wcgw": {
-      "command": "uvx",
-      "args": ["--from", "wcgw@latest", "wcgw_mcp"]
-    }
-  }
-}
-```
-Then restart claude app.
-You can then ask claude to execute shell commands, read files, edit files, run your code, etc.
-## ChatGPT
-### 🪜 Steps:
-1. Run the [cli client](https://github.com/rusiaaman/wcgw?tab=readme-ov-file#client) in any directory of choice.
-2. Share the generated id with this GPT: `https://chatgpt.com/g/g-Us0AAXkRh-wcgw-giving-shell-access`
-3. The custom GPT can now run any command on your cli
-### Client
-You need to keep running this client for GPT to access your shell. Run it in a version controlled project's root.
-#### Option 1: using uv [Recommended]
-```sh
-$ curl -LsSf https://astral.sh/uv/install.sh | sh
-$ uvx wcgw@latest
-```
-#### Option 2: using pip
-Supports python >=3.10 and <3.13
-```sh
-$ pip3 install wcgw
-$ wcgw
-```
-This will print a UUID that you need to share with the gpt.
-### Chat
-Open the following link or search the "wcgw" custom gpt using "Explore GPTs" on chatgpt.com
-https://chatgpt.com/g/g-Us0AAXkRh-wcgw-giving-shell-access
-Finally, let the chatgpt know your user id in any format. E.g., "user_id=<your uuid>" followed by rest of your instructions.
-NOTE: you can resume a broken connection
-`wcgw --client-uuid $previous_uuid`
-### How it works on chatgpt app?
-Your commands are relayed through a server to the terminal client. [You could host the server on your own](https://github.com/rusiaaman/wcgw?tab=readme-ov-file#creating-your-own-custom-gpt-and-the-relay-server). For public convenience I've hosted one at https://wcgw.arcfu.com thanks to the gcloud free tier plan.
-Chatgpt sends a request to the relay server using the user id that you share with it. The relay server holds a websocket with the terminal client against the user id and acts as a proxy to pass the request.
-It's secure in both the directions. Either a malicious actor or a malicious Chatgpt has to correctly guess your UUID for any security breach.
-# Showcase
-## Claude desktop
-### Resize image and move it to a new dir
-![example](https://github.com/rusiaaman/wcgw/blob/main/static/example.jpg?raw=true)
-## Chatgpt app
-### Unit tests and github actions
-[The first version of unit tests and github workflow to test on multiple python versions were written by the custom chatgpt](https://chatgpt.com/share/6717f922-8998-8005-b825-45d4b348b4dd)
-### Create a todo app using react + typescript + vite
-![Screenshot](https://github.com/rusiaaman/wcgw/blob/main/static/ss1.png?raw=true)
-# Privacy
-The relay server doesn't store any data. I can't access any information passing through it and only secure channels are used to communicate.
-You may host the server on your own and create a custom gpt using the following section.
-# Creating your own custom gpt and the relay server.
-I've used the following instructions and action json schema to create the custom GPT. (Replace wcgw.arcfu.com with the address to your server)
-https://github.com/rusiaaman/wcgw/blob/main/gpt_instructions.txt
-https://github.com/rusiaaman/wcgw/blob/main/gpt_action_json_schema.json
-Run the server
-`gunicorn --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:443 src.wcgw.relay.serve:app  --certfile fullchain.pem  --keyfile  privkey.pem`
-If you don't have public ip and domain name, you can use `ngrok` or similar services to get a https address to the api.
-The specify the server url in the `wcgw` command like so
-`wcgw --server-url https://your-url/v1/register`
-# [Optional] Local shell access with openai API key or anthropic API key
-## Openai
-Add `OPENAI_API_KEY` and `OPENAI_ORG_ID` env variables.
-Then run
-`uvx --from wcgw@latest wcgw_local  --limit 0.1` # Cost limit $0.1
-You can now directly write messages or press enter key to open vim for multiline message and text pasting.
-## Anthropic
-Add `ANTHROPIC_API_KEY` env variable.
-Then run
-`uvx --from wcgw@latest wcgw_local --claude`
-You can now directly write messages or press enter key to open vim for multiline message and text pasting.

wcgw-1.5.3/add.py DELETED

@@ -1,6 +0,0 @@
-def add_numbers(a: int, b: int) -> int:
-    return a + b
-# Test the function
-result = add_numbers(5, 3)
-print(f"5 + 3 = {result}")