droidrun-agent 0.1.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,221 @@
+ Metadata-Version: 2.3
+ Name: droidrun-agent
+ Version: 0.1.1
+ Summary: Add your description here
+ Author: 涵曦
+ Author-email: 涵曦 <im.hanxi@gmail.com>
+ Requires-Dist: httpx>=0.28.1
+ Requires-Dist: websockets>=12.0
+ Requires-Python: >=3.11
+ Description-Content-Type: text/markdown
+
+ # [droidrun-agent](https://github.com/hanxi/droidrun-agent)
+
+ [![PyPI version](https://img.shields.io/pypi/v/droidrun-agent)](https://pypi.org/project/droidrun-agent/)
+ [![Python](https://img.shields.io/pypi/pyversions/droidrun-agent)](https://pypi.org/project/droidrun-agent/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
+ Async Python client for the [DroidRun Portal](https://github.com/droidrun/droidrun-portal) local API. Provides both HTTP and WebSocket clients to control Android devices through Portal's accessibility service.
+
+ ## Features
+
+ - **HTTP Client** (`PortalHTTPClient`) - Communicates with Portal's HTTP server (default port 8080)
+ - **WebSocket Client** (`PortalWSClient`) - Communicates with Portal's WebSocket server (default port 8081) using JSON-RPC style messages
+ - Bearer token authentication
+ - Async context manager support (`async with`)
+ - Automatic reconnection for the WebSocket client
+ - Full type hints
+
+ ## Installation
+
+ ```bash
+ pip install droidrun-agent
+ ```
+
+ Or install from source using [uv](https://docs.astral.sh/uv/):
+
+ ```bash
+ git clone https://github.com/hanxi/droidrun-agent.git
+ cd droidrun-agent
+ uv sync
+ ```
+
+ ## Quick Start
+
+ ### HTTP Client
+
+ ```python
+ import asyncio
+ from droidrun_agent import PortalHTTPClient
+
+ async def main():
+     async with PortalHTTPClient("http://192.168.1.100:8080", token="YOUR_TOKEN") as client:
+         # Health check (no auth required)
+         await client.ping()
+
+         # Get device UI state
+         state = await client.get_state_full()
+         print(state)
+
+         # Tap on screen coordinates
+         await client.tap(200, 400)
+
+         # Take a screenshot (returns PNG bytes)
+         png_data = await client.take_screenshot()
+         with open("screenshot.png", "wb") as f:
+             f.write(png_data)
+
+ asyncio.run(main())
+ ```
+
+ ### WebSocket Client
+
+ ```python
+ import asyncio
+ from droidrun_agent import PortalWSClient
+
+ async def main():
+     async with PortalWSClient("ws://192.168.1.100:8081", token="YOUR_TOKEN") as ws:
+         # Tap on screen coordinates
+         await ws.tap(200, 400)
+
+         # Get device state
+         state = await ws.get_state()
+         print(state)
+
+         # Take a screenshot (returns PNG bytes)
+         png_data = await ws.take_screenshot()
+         with open("screenshot.png", "wb") as f:
+             f.write(png_data)
+
+         # Install APK from URL (WebSocket only)
+         await ws.install(["https://example.com/app.apk"])
+
+ asyncio.run(main())
+ ```
+
+ ## Documentation
+
+ For detailed API documentation of the DroidRun Portal local API, see:
+ - [Local API Documentation](https://github.com/droidrun/droidrun-portal/blob/main/docs/local-api.md)
+
+ ## API Reference
+
+ ### PortalHTTPClient
+
+ | Method | Description |
+ |---|---|
+ | `ping()` | Health check (no auth required) |
+ | `get_a11y_tree()` | Get simplified accessibility tree |
+ | `get_a11y_tree_full(filter=True)` | Get full accessibility tree |
+ | `get_state()` | Get simplified UI state |
+ | `get_state_full(filter=True)` | Get full UI state (a11y tree + phone state) |
+ | `get_phone_state()` | Get phone state info |
+ | `get_version()` | Get Portal app version string |
+ | `get_packages()` | Get list of launchable packages |
+ | `take_screenshot(hide_overlay=True)` | Take device screenshot, returns PNG bytes |
+ | `tap(x, y)` | Tap screen coordinates |
+ | `swipe(start_x, start_y, end_x, end_y, duration=None)` | Swipe gesture (optional duration in ms) |
+ | `global_action(action)` | Execute accessibility global action |
+ | `start_app(package, activity=None, stop_before_launch=False)` | Launch an app |
+ | `stop_app(package)` | Stop an app |
+ | `input_text(text, clear=True)` | Input text via Portal keyboard |
+ | `clear_input()` | Clear focused input field |
+ | `press_key(key_code)` | Send an Android key code |
+ | `set_overlay_offset(offset)` | Set overlay vertical offset in pixels |
+ | `set_socket_port(port)` | Update the HTTP server port |
+
+ ### PortalWSClient
+
+ Supports all methods of the HTTP client, plus:
+
+ | Method | Description |
+ |---|---|
+ | `get_time()` | Get device Unix timestamp in milliseconds |
+ | `install(urls, hide_overlay=True)` | Install APK(s) from URL(s), supports split APKs |
+
+ ### Exceptions
+
+ | Exception | Description |
+ |---|---|
+ | `PortalError` | Base exception for all Portal client errors |
+ | `PortalConnectionError` | Failed to connect to Portal server |
+ | `PortalAuthError` | Authentication failed (invalid or missing token) |
+ | `PortalTimeoutError` | Request timed out |
+ | `PortalResponseError` | Server returned an unexpected or error response |
+
+ ## Requirements
+
+ - Python >= 3.11
+ - [httpx](https://www.python-httpx.org/) >= 0.28.1
+ - [websockets](https://websockets.readthedocs.io/) >= 12.0
+
+ ## Development Workflow
+
+ This project uses [uv](https://docs.astral.sh/uv/) for development. Here are the common development commands:
+
+ ### Code Formatting
+
+ ```bash
+ # Format all Python code
+ uv format
+
+ # Check formatting without making changes
+ uv format --check
+ ```
+
+ ### Code Quality
+
+ ```bash
+ # Run code quality checks
+ uv run ruff check .
+
+ # Automatically fix fixable issues
+ uv run ruff check --fix .
+ ```
+
+ ### Testing
+
+ ```bash
+ # Run all tests
+ uv run pytest
+
+ # Run tests with verbose output
+ uv run pytest -v
+
+ # Run tests and show coverage (needs the pytest-cov plugin, which is not in the dev dependency group)
+ uv run pytest --cov=src
+ ```
+
+ ### Dependency Management
+
+ ```bash
+ # Install development dependencies
+ uv sync --group dev
+
+ # Add a new dependency
+ uv add package_name
+
+ # Add a development dependency
+ uv add --group dev package_name
+
+ # Remove a dependency
+ uv remove package_name
+ ```
+
+ ### Complete Development Flow
+
+ ```bash
+ # 1. Format code
+ uv format
+
+ # 2. Check code quality
+ uv run ruff check .
+
+ # 3. Run tests
+ uv run pytest
+ ```
+
+ ## License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -0,0 +1,210 @@
+ # [droidrun-agent](https://github.com/hanxi/droidrun-agent)
+
+ [![PyPI version](https://img.shields.io/pypi/v/droidrun-agent)](https://pypi.org/project/droidrun-agent/)
+ [![Python](https://img.shields.io/pypi/pyversions/droidrun-agent)](https://pypi.org/project/droidrun-agent/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
+ Async Python client for the [DroidRun Portal](https://github.com/droidrun/droidrun-portal) local API. Provides both HTTP and WebSocket clients to control Android devices through Portal's accessibility service.
+
+ ## Features
+
+ - **HTTP Client** (`PortalHTTPClient`) - Communicates with Portal's HTTP server (default port 8080)
+ - **WebSocket Client** (`PortalWSClient`) - Communicates with Portal's WebSocket server (default port 8081) using JSON-RPC style messages
+ - Bearer token authentication
+ - Async context manager support (`async with`)
+ - Automatic reconnection for the WebSocket client
+ - Full type hints
+
+ ## Installation
+
+ ```bash
+ pip install droidrun-agent
+ ```
+
+ Or install from source using [uv](https://docs.astral.sh/uv/):
+
+ ```bash
+ git clone https://github.com/hanxi/droidrun-agent.git
+ cd droidrun-agent
+ uv sync
+ ```
+
+ ## Quick Start
+
+ ### HTTP Client
+
+ ```python
+ import asyncio
+ from droidrun_agent import PortalHTTPClient
+
+ async def main():
+     async with PortalHTTPClient("http://192.168.1.100:8080", token="YOUR_TOKEN") as client:
+         # Health check (no auth required)
+         await client.ping()
+
+         # Get device UI state
+         state = await client.get_state_full()
+         print(state)
+
+         # Tap on screen coordinates
+         await client.tap(200, 400)
+
+         # Take a screenshot (returns PNG bytes)
+         png_data = await client.take_screenshot()
+         with open("screenshot.png", "wb") as f:
+             f.write(png_data)
+
+ asyncio.run(main())
+ ```
+
+ ### WebSocket Client
+
+ ```python
+ import asyncio
+ from droidrun_agent import PortalWSClient
+
+ async def main():
+     async with PortalWSClient("ws://192.168.1.100:8081", token="YOUR_TOKEN") as ws:
+         # Tap on screen coordinates
+         await ws.tap(200, 400)
+
+         # Get device state
+         state = await ws.get_state()
+         print(state)
+
+         # Take a screenshot (returns PNG bytes)
+         png_data = await ws.take_screenshot()
+         with open("screenshot.png", "wb") as f:
+             f.write(png_data)
+
+         # Install APK from URL (WebSocket only)
+         await ws.install(["https://example.com/app.apk"])
+
+ asyncio.run(main())
+ ```
+
+ ## Documentation
+
+ For detailed API documentation of the DroidRun Portal local API, see:
+ - [Local API Documentation](https://github.com/droidrun/droidrun-portal/blob/main/docs/local-api.md)
+
+ ## API Reference
+
+ ### PortalHTTPClient
+
+ | Method | Description |
+ |---|---|
+ | `ping()` | Health check (no auth required) |
+ | `get_a11y_tree()` | Get simplified accessibility tree |
+ | `get_a11y_tree_full(filter=True)` | Get full accessibility tree |
+ | `get_state()` | Get simplified UI state |
+ | `get_state_full(filter=True)` | Get full UI state (a11y tree + phone state) |
+ | `get_phone_state()` | Get phone state info |
+ | `get_version()` | Get Portal app version string |
+ | `get_packages()` | Get list of launchable packages |
+ | `take_screenshot(hide_overlay=True)` | Take device screenshot, returns PNG bytes |
+ | `tap(x, y)` | Tap screen coordinates |
+ | `swipe(start_x, start_y, end_x, end_y, duration=None)` | Swipe gesture (optional duration in ms) |
+ | `global_action(action)` | Execute accessibility global action |
+ | `start_app(package, activity=None, stop_before_launch=False)` | Launch an app |
+ | `stop_app(package)` | Stop an app |
+ | `input_text(text, clear=True)` | Input text via Portal keyboard |
+ | `clear_input()` | Clear focused input field |
+ | `press_key(key_code)` | Send an Android key code |
+ | `set_overlay_offset(offset)` | Set overlay vertical offset in pixels |
+ | `set_socket_port(port)` | Update the HTTP server port |
+
+ ### PortalWSClient
+
+ Supports all methods of the HTTP client, plus:
+
+ | Method | Description |
+ |---|---|
+ | `get_time()` | Get device Unix timestamp in milliseconds |
+ | `install(urls, hide_overlay=True)` | Install APK(s) from URL(s), supports split APKs |
+
+ ### Exceptions
+
+ | Exception | Description |
+ |---|---|
+ | `PortalError` | Base exception for all Portal client errors |
+ | `PortalConnectionError` | Failed to connect to Portal server |
+ | `PortalAuthError` | Authentication failed (invalid or missing token) |
+ | `PortalTimeoutError` | Request timed out |
+ | `PortalResponseError` | Server returned an unexpected or error response |
+
+ ## Requirements
+
+ - Python >= 3.11
+ - [httpx](https://www.python-httpx.org/) >= 0.28.1
+ - [websockets](https://websockets.readthedocs.io/) >= 12.0
+
+ ## Development Workflow
+
+ This project uses [uv](https://docs.astral.sh/uv/) for development. Here are the common development commands:
+
+ ### Code Formatting
+
+ ```bash
+ # Format all Python code
+ uv format
+
+ # Check formatting without making changes
+ uv format --check
+ ```
+
+ ### Code Quality
+
+ ```bash
+ # Run code quality checks
+ uv run ruff check .
+
+ # Automatically fix fixable issues
+ uv run ruff check --fix .
+ ```
+
+ ### Testing
+
+ ```bash
+ # Run all tests
+ uv run pytest
+
+ # Run tests with verbose output
+ uv run pytest -v
+
+ # Run tests and show coverage (needs the pytest-cov plugin, which is not in the dev dependency group)
+ uv run pytest --cov=src
+ ```
+
+ ### Dependency Management
+
+ ```bash
+ # Install development dependencies
+ uv sync --group dev
+
+ # Add a new dependency
+ uv add package_name
+
+ # Add a development dependency
+ uv add --group dev package_name
+
+ # Remove a dependency
+ uv remove package_name
+ ```
+
+ ### Complete Development Flow
+
+ ```bash
+ # 1. Format code
+ uv format
+
+ # 2. Check code quality
+ uv run ruff check .
+
+ # 3. Run tests
+ uv run pytest
+ ```
+
+ ## License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
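The README describes the WebSocket client as using JSON-RPC style messages, and the `PortalWSClient` source later in this diff resolves each response by its `id` field. The following is a minimal, self-contained sketch of that request/response matching pattern, assuming a response shape of `{"id", "status", "result"}` as read by `_handle_text`. The module-level `pending` dict, the `handle_text` and `demo` functions, and the use of `RuntimeError` are illustrative stand-ins, not the package's API, and the request format itself is not shown in this diff.

```python
import asyncio
import json
import uuid
from typing import Any

# Maps a request id to the future awaiting its response (illustrative stand-in).
pending: dict[str, asyncio.Future[Any]] = {}


def handle_text(raw: str) -> None:
    """Resolve the pending future whose id matches the incoming message."""
    data = json.loads(raw)
    fut = pending.pop(str(data.get("id")), None)
    if fut is None or fut.done():
        return  # unknown or already-settled request: ignore
    if data.get("status") == "success":
        fut.set_result(data.get("result"))
    else:
        fut.set_exception(RuntimeError(f"status={data.get('status')}"))


async def demo() -> Any:
    request_id = str(uuid.uuid4())
    fut: asyncio.Future[Any] = asyncio.get_running_loop().create_future()
    pending[request_id] = fut
    # Simulate the server's reply instead of opening a real socket.
    reply = {"id": request_id, "status": "success", "result": {"ok": True}}
    handle_text(json.dumps(reply))
    return await asyncio.wait_for(fut, timeout=1.0)


result = asyncio.run(demo())
```

Keying futures by a per-request UUID lets several calls be in flight on one connection, since replies can be matched regardless of arrival order.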
@@ -0,0 +1,62 @@
+ [project]
+ name = "droidrun-agent"
+ version = "0.1.1"
+ description = "Add your description here"
+ readme = "README.md"
+ authors = [
+     { name = "涵曦", email = "im.hanxi@gmail.com" }
+ ]
+ requires-python = ">=3.11"
+ dependencies = [
+     "httpx>=0.28.1",
+     "websockets>=12.0",
+ ]
+
+ [build-system]
+ requires = ["uv_build>=0.9.26,<0.10.0"]
+ build-backend = "uv_build"
+
+ [dependency-groups]
+ dev = [
+     "pytest>=9.0.2",
+     "pytest-asyncio>=1.3.0",
+     "black>=24.0.0",
+     "ruff>=0.8.0",
+ ]
+
+ [tool.ruff]
+ line-length = 120
+ target-version = "py311"
+
+ [tool.ruff.lint]
+ select = [
+     "E",   # pycodestyle errors
+     "W",   # pycodestyle warnings
+     "F",   # pyflakes
+     "I",   # isort
+     "C",   # flake8-comprehensions (C4) and mccabe (C90)
+     "B",   # flake8-bugbear
+     "UP",  # pyupgrade
+ ]
+
+ [tool.black]
+ line-length = 120
+ target-version = ['py311']
+ include = '\.pyi?$'
+ extend-exclude = '''
+ /(
+     \.eggs
+   | \.git
+   | \.hg
+   | \.mypy_cache
+   | \.tox
+   | \.venv
+   | _build
+   | buck-out
+   | build
+   | dist
+ )/
+ '''
+
+ [tool.pytest.ini_options]
+ asyncio_mode = "auto"
@@ -0,0 +1,21 @@
+ """DroidRun Agent - HTTP and WebSocket clients for Portal local API."""
+
+ from .exceptions import (
+     PortalAuthError,
+     PortalConnectionError,
+     PortalError,
+     PortalResponseError,
+     PortalTimeoutError,
+ )
+ from .http_client import PortalHTTPClient
+ from .ws_client import PortalWSClient
+
+ __all__ = [
+     "PortalHTTPClient",
+     "PortalWSClient",
+     "PortalError",
+     "PortalConnectionError",
+     "PortalAuthError",
+     "PortalTimeoutError",
+     "PortalResponseError",
+ ]
@@ -0,0 +1,23 @@
+ """Portal client exceptions."""
+
+ from __future__ import annotations
+
+
+ class PortalError(Exception):
+     """Base exception for Portal client errors."""
+
+
+ class PortalConnectionError(PortalError):
+     """Failed to connect to Portal server."""
+
+
+ class PortalAuthError(PortalError):
+     """Authentication failed (invalid or missing token)."""
+
+
+ class PortalTimeoutError(PortalError):
+     """Request timed out."""
+
+
+ class PortalResponseError(PortalError):
+     """Server returned an unexpected or error response."""
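Because every exception in this module derives from `PortalError`, a caller can handle specific failures first and fall back to the base class. The sketch below illustrates that ordering; the classes are redeclared locally so it runs without the package installed, and `classify` and its return labels are hypothetical names used only for illustration.

```python
class PortalError(Exception):
    """Base exception for Portal client errors."""


class PortalConnectionError(PortalError):
    """Failed to connect to Portal server."""


class PortalAuthError(PortalError):
    """Authentication failed (invalid or missing token)."""


class PortalTimeoutError(PortalError):
    """Request timed out."""


class PortalResponseError(PortalError):
    """Server returned an unexpected or error response."""


def classify(exc: Exception) -> str:
    """Map an exception to a coarse handling strategy (hypothetical helper)."""
    try:
        raise exc
    except (PortalConnectionError, PortalTimeoutError):
        return "retry"        # transient: the device may be reachable again later
    except PortalAuthError:
        return "check-token"  # retrying with the same token will not help
    except PortalError:
        return "report"       # any other Portal failure, e.g. PortalResponseError
    except Exception:
        return "unknown"      # not raised by the Portal clients
```

The `except PortalError` clause has to come after the more specific subclasses; placed first, it would catch all of them.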
@@ -0,0 +1,349 @@
+ """
+ PortalHTTPClient - Complete HTTP client for DroidRun Portal.
+
+ Communicates with Portal's HTTP server (default port 8080) using Bearer token auth.
+ Supports all GET and POST endpoints defined in the Portal local API.
+ """
+
+ from __future__ import annotations
+
+ import asyncio
+ import base64
+ import json
+ import logging
+ from typing import Any
+
+ import httpx
+
+ from .exceptions import (
+     PortalAuthError,
+     PortalConnectionError,
+     PortalResponseError,
+     PortalTimeoutError,
+ )
+
+ logger = logging.getLogger("droidrun_agent")
+
+
+ class PortalHTTPClient:
+     """
+     Async HTTP client for DroidRun Portal.
+
+     Usage::
+
+         async with PortalHTTPClient(
+             "http://192.168.1.100:8080", token="TOKEN"
+         ) as client:
+             await client.ping()
+             state = await client.get_state_full()
+             await client.tap(200, 400)
+     """
+
+     def __init__(self, base_url: str, token: str, timeout: float = 10.0) -> None:
+         self.base_url = base_url.rstrip("/")
+         self.token = token
+         self.timeout = timeout
+         self._headers = {"Authorization": f"Bearer {token}"}
+         self._client: httpx.AsyncClient | None = None
+
+     async def connect(self) -> None:
+         """Create the underlying HTTP client."""
+         if self._client is None:
+             self._client = httpx.AsyncClient(
+                 base_url=self.base_url,
+                 headers=self._headers,
+                 timeout=self.timeout,
+             )
+
+     async def close(self) -> None:
+         """Close the underlying HTTP client."""
+         if self._client is not None:
+             await self._client.aclose()
+             self._client = None
+
+     async def __aenter__(self) -> PortalHTTPClient:
+         await self.connect()
+         return self
+
+     async def __aexit__(self, *exc: object) -> None:
+         await self.close()
+
+     # ------------------------------------------------------------------
+     # Internal helpers
+     # ------------------------------------------------------------------
+
+     async def _ensure_client(self) -> httpx.AsyncClient:
+         if self._client is None:
+             await self.connect()
+         assert self._client is not None
+         return self._client
+
+     def _unwrap(self, data: dict[str, Any]) -> Any:
+         """Extract value from Portal response envelope
+         (``{result: ...}`` or ``{data: ...}``)."""
+         key = "result" if "result" in data else "data" if "data" in data else None
+         if key is None:
+             return data
+         value = data[key]
+         if isinstance(value, str):
+             try:
+                 return json.loads(value)
+             except (json.JSONDecodeError, ValueError):
+                 return value
+         return value
+
+     async def _get(
+         self,
+         path: str,
+         *,
+         params: dict[str, Any] | None = None,
+         auth: bool = True,
+         raw: bool = False,
+     ) -> Any:
+         """Send GET request and return parsed response."""
+         client = await self._ensure_client()
+         # NOTE: httpx merges these per-request headers into the client-level
+         # defaults set in connect(), so auth=False does not strip the
+         # Authorization header; the token is still sent.
+         headers = self._headers if auth else {}
+         try:
+             resp = await client.get(path, params=params, headers=headers)
+         except httpx.ConnectError as exc:
+             raise PortalConnectionError(f"Cannot connect to {self.base_url}{path}: {exc}") from exc
+         except httpx.TimeoutException as exc:
+             raise PortalTimeoutError(f"GET {path} timed out: {exc}") from exc
+
+         if resp.status_code in (401, 403):
+             raise PortalAuthError(f"Auth failed for GET {path}: HTTP {resp.status_code}")
+         if resp.status_code != 200:
+             raise PortalResponseError(f"GET {path} returned HTTP {resp.status_code}: {resp.text}")
+
+         if raw:
+             return resp.content
+
+         data = resp.json()
+         if isinstance(data, dict):
+             return self._unwrap(data)
+         return data
+
+     async def _post(self, path: str, form: dict[str, Any]) -> Any:
+         """Send POST request with form-encoded body and return parsed response."""
+         client = await self._ensure_client()
+         # Remove None values from form data
+         form_data = {k: v for k, v in form.items() if v is not None}
+         try:
+             resp = await client.post(path, data=form_data)
+         except httpx.ConnectError as exc:
+             raise PortalConnectionError(f"Cannot connect to {self.base_url}{path}: {exc}") from exc
+         except httpx.TimeoutException as exc:
+             raise PortalTimeoutError(f"POST {path} timed out: {exc}") from exc
+
+         if resp.status_code in (401, 403):
+             raise PortalAuthError(f"Auth failed for POST {path}: HTTP {resp.status_code}")
+         if resp.status_code != 200:
+             raise PortalResponseError(f"POST {path} returned HTTP {resp.status_code}: {resp.text}")
+
+         data = resp.json()
+         if isinstance(data, dict):
+             return self._unwrap(data)
+         return data
+
+     # ------------------------------------------------------------------
+     # GET endpoints
+     # ------------------------------------------------------------------
+
+     async def ping(self) -> dict[str, Any]:
+         """Health check (no auth required)."""
+         return await self._get("/ping", auth=False)
+
+     async def get_a11y_tree(self) -> dict[str, Any]:
+         """Get simplified accessibility tree."""
+         return await self._get("/a11y_tree")
+
+     async def get_a11y_tree_full(self, *, filter: bool = True) -> dict[str, Any]:
+         """Get full accessibility tree. Set filter=False to keep small elements."""
+         params = {"filter": str(filter).lower()}
+         return await self._get("/a11y_tree_full", params=params)
+
+     async def get_state(self) -> dict[str, Any]:
+         """Get simplified UI state."""
+         return await self._get("/state")
+
+     async def get_state_full(self, *, filter: bool = True) -> dict[str, Any]:
+         """Get full UI state (a11y tree + phone state).
+         Set filter=False to keep small elements."""
+         params = {"filter": str(filter).lower()}
+         return await self._get("/state_full", params=params)
+
+     async def get_phone_state(self) -> dict[str, Any]:
+         """Get phone state info."""
+         return await self._get("/phone_state")
+
+     async def get_version(self) -> str:
+         """Get Portal app version string."""
+         result = await self._get("/version")
+         if isinstance(result, str):
+             return result
+         if isinstance(result, dict):
+             return result.get("version", str(result))
+         return str(result)
+
+     async def get_packages(self) -> list[dict[str, Any]]:
+         """Get list of launchable packages."""
+         result = await self._get("/packages")
+         if isinstance(result, list):
+             return result
+         if isinstance(result, dict) and "packages" in result:
+             return result["packages"]
+         return []
+
+     async def take_screenshot(self, *, hide_overlay: bool = True) -> bytes:
+         """
+         Take device screenshot. Returns PNG bytes.
+
+         The HTTP endpoint returns binary PNG directly.
+         Falls back to JSON envelope parsing for compatibility.
+         """
+         params: dict[str, Any] = {}
+         if not hide_overlay:
+             params["hideOverlay"] = "false"
+
+         max_retries = 3
+         for attempt in range(max_retries + 1):
+             client = await self._ensure_client()
+             try:
+                 resp = await client.get("/screenshot", params=params)
+             except httpx.ConnectError as exc:
+                 raise PortalConnectionError(f"Screenshot connect error: {exc}") from exc
+             except httpx.TimeoutException as exc:
+                 raise PortalTimeoutError(f"Screenshot timed out: {exc}") from exc
+
+             if resp.status_code in (401, 403):
+                 raise PortalAuthError(f"Screenshot auth failed: HTTP {resp.status_code}")
+             if resp.status_code != 200:
+                 error_text = resp.text
+                 if "interval too short" in error_text.lower() and attempt < max_retries:
+                     await asyncio.sleep(0.5)
+                     continue
+                 raise PortalResponseError(f"Screenshot failed: HTTP {resp.status_code}")
+
+             break
+
+         content_type = resp.headers.get("content-type", "")
+
+         # Binary PNG response (check content-type and magic bytes)
+         if "image/" in content_type or "octet-stream" in content_type:
+             return resp.content
+         if resp.content[:4] == b"\x89PNG":
+             return resp.content
+
+         # JSON envelope with base64 PNG.
+         # NOTE: We intentionally avoid _unwrap() here because it applies
+         # json.loads() to string values, which can over-parse the base64 data.
+         data = resp.json()
+
+         # Check for server-side error envelope (HTTP 200 but JSON error body)
+         if isinstance(data, dict) and "error" in data:
+             raise PortalResponseError(
+                 f"Screenshot failed: {data.get('error')} (status={data.get('status')})"
+             )
+
+         # Direct base64 string
+         if isinstance(data, str):
+             return base64.b64decode(data)
+
+         if isinstance(data, dict):
+             # Extract raw value from envelope without json.loads
+             raw = data.get("result") or data.get("data")
+             if isinstance(raw, str):
+                 return base64.b64decode(raw)
+             # Nested dict – look for a base64-encoded PNG in values
+             target = raw if isinstance(raw, dict) else data
+             for val in target.values():
+                 if isinstance(val, str) and len(val) > 100:
+                     try:
+                         decoded = base64.b64decode(val)
+                         if decoded[:4] == b"\x89PNG":
+                             return decoded
+                     except Exception:
+                         continue
+
+         detail = f"keys={list(data.keys())}" if isinstance(data, dict) else f"type={type(data)}"
+         raise PortalResponseError(f"Unexpected screenshot response format: {detail}")
+
+     # ------------------------------------------------------------------
+     # POST endpoints
+     # ------------------------------------------------------------------
+
+     async def tap(self, x: int, y: int) -> dict[str, Any]:
+         """Tap screen coordinates."""
+         return await self._post("/tap", {"x": x, "y": y})
+
+     async def swipe(
+         self,
+         start_x: int,
+         start_y: int,
+         end_x: int,
+         end_y: int,
+         duration: int | None = None,
+     ) -> dict[str, Any]:
+         """Swipe from (start_x, start_y) to (end_x, end_y).
+         Duration in ms (optional)."""
+         return await self._post(
+             "/swipe",
+             {
+                 "startX": start_x,
+                 "startY": start_y,
+                 "endX": end_x,
+                 "endY": end_y,
+                 "duration": duration,
+             },
+         )
+
+     async def global_action(self, action: int) -> dict[str, Any]:
+         """Execute accessibility global action by Android action ID."""
+         return await self._post("/global", {"action": action})
+
+     async def start_app(
+         self,
+         package: str,
+         activity: str | None = None,
+         stop_before_launch: bool = False,
+     ) -> dict[str, Any]:
+         """Launch an app by package name."""
+         return await self._post(
+             "/app",
+             {
+                 "package": package,
+                 "activity": activity,
+                 "stopBeforeLaunch": str(stop_before_launch).lower(),
+             },
+         )
+
+     async def stop_app(self, package: str) -> dict[str, Any]:
+         """Best-effort stop an app."""
+         return await self._post("/app/stop", {"package": package})
+
+     async def input_text(self, text: str, clear: bool = True) -> dict[str, Any]:
+         """Input text via Portal keyboard. Text is base64-encoded automatically."""
+         encoded = base64.b64encode(text.encode()).decode()
+         return await self._post(
+             "/keyboard/input",
+             {
+                 "base64_text": encoded,
+                 "clear": str(clear).lower(),
+             },
+         )
+
+     async def clear_input(self) -> dict[str, Any]:
+         """Clear focused input field."""
+         return await self._post("/keyboard/clear", {})
+
+     async def press_key(self, key_code: int) -> dict[str, Any]:
+         """Send an Android key code."""
+         return await self._post("/keyboard/key", {"key_code": key_code})
+
+     async def set_overlay_offset(self, offset: int) -> dict[str, Any]:
+         """Set overlay vertical offset in pixels."""
+         return await self._post("/overlay_offset", {"offset": offset})
+
+     async def set_socket_port(self, port: int) -> dict[str, Any]:
+         """Update the HTTP server port."""
+         return await self._post("/socket_port", {"port": port})
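The envelope handling in `_unwrap` above, and the reason `take_screenshot` bypasses it, can be seen with a few inputs. The following is a minimal sketch that copies the logic into a standalone `unwrap` function so it runs without the package; the sample payloads are invented for illustration and are not taken from the Portal API documentation.

```python
import json
from typing import Any


def unwrap(data: dict[str, Any]) -> Any:
    """Standalone copy of the envelope logic in PortalHTTPClient._unwrap."""
    key = "result" if "result" in data else "data" if "data" in data else None
    if key is None:
        return data  # no envelope: hand back the dict unchanged
    value = data[key]
    if isinstance(value, str):
        try:
            return json.loads(value)  # string payloads get a second JSON decode
        except (json.JSONDecodeError, ValueError):
            return value
    return value


# A JSON document nested as a string is decoded a second time.
nested = unwrap({"result": '{"packages": ["com.example.app"]}'})

# Text that is not valid JSON comes back unchanged.
plain = unwrap({"data": "pong"})

# A digits-only string is valid JSON, so it silently becomes an int.
digits = unwrap({"result": "12345"})

# A dict without "result" or "data" is returned as-is.
bare = unwrap({"status": "ok"})
```

The `digits` case shows the kind of over-parsing the comment in `take_screenshot` refers to: any string payload that happens to be valid JSON changes type, which is why the screenshot path reads the envelope value directly before base64-decoding it.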
@@ -0,0 +1,348 @@
+ """
+ PortalWSClient - Complete WebSocket client for DroidRun Portal.
+
+ Communicates with Portal's WebSocket server (default port 8081)
+ using JSON-RPC style messages.
+ Supports all methods defined in the Portal local API,
+ including binary screenshot frames.
+ """
+
+ from __future__ import annotations
+
+ import asyncio
+ import base64
+ import json
+ import logging
+ import uuid
+ from typing import Any
+
+ import websockets
+ from websockets.asyncio.client import ClientConnection
+
+ from .exceptions import (
+     PortalConnectionError,
+     PortalResponseError,
+     PortalTimeoutError,
+ )
+
+ logger = logging.getLogger("droidrun_agent")
+
+
+ class PortalWSClient:
+     """
+     Async WebSocket client for DroidRun Portal.
+
+     Uses JSON-RPC style request/response with UUID-based matching.
+     Automatically reconnects when a method is called on a broken connection.
+
+     Usage::
+
+         async with PortalWSClient("ws://192.168.1.100:8081", token="TOKEN") as ws:
+             await ws.tap(200, 400)
+             state = await ws.get_state()
+             png = await ws.take_screenshot()
+     """
+
+     def __init__(
+         self,
+         base_url: str = "ws://localhost:8081",
+         token: str = "",
+         timeout: float = 10.0,
+     ) -> None:
+         self.base_url = base_url.rstrip("/")
+         self.token = token
+         self.timeout = timeout
+         self._ws: ClientConnection | None = None
+         self._listener_task: asyncio.Task[None] | None = None
+         self._pending: dict[str, asyncio.Future[Any]] = {}
+         self._closed = False
+
+     @property
+     def _url(self) -> str:
+         return f"{self.base_url}/?token={self.token}"
+
+     # ------------------------------------------------------------------
+     # Connection management
+     # ------------------------------------------------------------------
+
+     async def connect(self) -> None:
+         """Establish WebSocket connection and start listener."""
+         if self._ws is not None:
+             return
+         try:
+             self._ws = await websockets.connect(self._url)
+         except Exception as exc:
+             raise PortalConnectionError(f"Cannot connect to {self._url}: {exc}") from exc
+         self._closed = False
+         self._listener_task = asyncio.create_task(self._listen())
+         logger.debug("WebSocket connected to %s", self.base_url)
+
+     async def close(self) -> None:
+         """Gracefully close the WebSocket connection."""
+         self._closed = True
+         if self._listener_task is not None:
+             self._listener_task.cancel()
+             try:
+                 await self._listener_task
+             except asyncio.CancelledError:
+                 pass
+             self._listener_task = None
+         if self._ws is not None:
+             await self._ws.close()
+             self._ws = None
+         # Fail all pending futures
+         for fut in self._pending.values():
+             if not fut.done():
+                 fut.set_exception(PortalConnectionError("Connection closed"))
+         self._pending.clear()
+
+     async def __aenter__(self) -> PortalWSClient:
+         await self.connect()
+         return self
+
+     async def __aexit__(self, *exc: object) -> None:
+         await self.close()
+
+     async def _ensure_connected(self) -> None:
+         """Reconnect if the connection is broken."""
+         if self._ws is None or self._closed:
+             self._ws = None
+             self._closed = False
+             if self._listener_task is not None:
+                 self._listener_task.cancel()
+                 try:
+                     await self._listener_task
+                 except asyncio.CancelledError:
+                     pass
+                 self._listener_task = None
+             await self.connect()
+
120
+ # ------------------------------------------------------------------
121
+ # Listener
122
+ # ------------------------------------------------------------------
123
+
124
+ async def _listen(self) -> None:
125
+ """Background task: receive messages and dispatch to pending futures."""
126
+ assert self._ws is not None
127
+ try:
128
+ async for message in self._ws:
129
+ if isinstance(message, bytes):
130
+ self._handle_binary(message)
131
+ else:
132
+ self._handle_text(message)
133
+ except websockets.ConnectionClosed:
134
+ logger.debug("WebSocket connection closed")
135
+ except asyncio.CancelledError:
136
+ raise
137
+ except Exception as exc:
138
+ logger.debug("WebSocket listener error: %s", exc)
139
+ finally:
140
+ self._ws = None
141
+ # Fail remaining pending futures
142
+ for fut in self._pending.values():
143
+ if not fut.done():
144
+ fut.set_exception(PortalConnectionError("Connection lost"))
145
+ self._pending.clear()
146
+
147
+ def _handle_text(self, raw: str) -> None:
148
+ """Parse JSON response and resolve the matching future."""
149
+ try:
150
+ data = json.loads(raw)
151
+ except json.JSONDecodeError:
152
+ logger.warning("Non-JSON message received: %s", raw[:200])
153
+ return
154
+
155
+ msg_id = data.get("id") if isinstance(data, dict) else None
156
+ if msg_id is None:
157
+ logger.debug("Message without id: %s", raw[:200])
158
+ return
159
+
160
+ fut = self._pending.pop(str(msg_id), None)
161
+ if fut is None or fut.done():
162
+ return
163
+
164
+ status = data.get("status", "")
165
+ if status == "success":
166
+ fut.set_result(data.get("result"))
167
+ else:
168
+ fut.set_exception(PortalResponseError(f"Method returned status={status}: {data.get('result', data)}"))
169
+
170
+ def _handle_binary(self, data: bytes) -> None:
171
+ """Parse binary screenshot frame: first 36 bytes = UUID, rest = PNG."""
172
+ if len(data) < 36:
173
+ logger.warning("Binary frame too short (%d bytes)", len(data))
174
+ return
175
+
176
+ msg_id = data[:36].decode("ascii", errors="replace")
177
+ png_data = data[36:]
178
+
179
+ fut = self._pending.pop(msg_id, None)
180
+ if fut is None or fut.done():
181
+ logger.debug("No pending future for binary frame id=%s", msg_id)
182
+ return
183
+
184
+ fut.set_result(png_data)
185
+
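The binary framing handled by `_handle_binary` (a 36-character request UUID followed by the PNG bytes) can be sketched in isolation. This is a minimal sketch, not code from the package: `split_frame` and `UUID_LEN` are illustrative names, and the frame is fabricated for the demonstration.

```python
import uuid

UUID_LEN = 36  # length of a textual UUID such as "123e4567-e89b-12d3-a456-426614174000"


def split_frame(frame: bytes) -> tuple[str, bytes]:
    """Split a binary frame into (request id, payload). Hypothetical helper."""
    if len(frame) < UUID_LEN:
        raise ValueError(f"frame too short ({len(frame)} bytes)")
    # The id prefix is plain ASCII; undecodable bytes are replaced, not raised.
    request_id = frame[:UUID_LEN].decode("ascii", errors="replace")
    return request_id, frame[UUID_LEN:]


# Build a fake frame the way the docstring describes it: id first, image after.
rid = str(uuid.uuid4())
png = b"\x89PNG\r\n\x1a\n" + b"fake-image-bytes"
frame = rid.encode("ascii") + png

parsed_id, payload = split_frame(frame)
```

Because the id has a fixed width, the receiver needs no delimiter or length field to find where the image starts.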
186
+ # ------------------------------------------------------------------
187
+ # RPC call
188
+ # ------------------------------------------------------------------
189
+
190
+ async def _call(self, method: str, params: dict[str, Any] | None = None) -> Any:
191
+ """Send a JSON-RPC request and wait for the response."""
192
+ await self._ensure_connected()
193
+ assert self._ws is not None
194
+
195
+ request_id = str(uuid.uuid4())
196
+ loop = asyncio.get_running_loop()
197
+ fut: asyncio.Future[Any] = loop.create_future()
198
+ self._pending[request_id] = fut
199
+
200
+ msg: dict[str, Any] = {"id": request_id, "method": method}
201
+ if params:
202
+ msg["params"] = params
203
+
204
+ try:
205
+ await self._ws.send(json.dumps(msg))
206
+ except Exception as exc:
207
+ self._pending.pop(request_id, None)
208
+ raise PortalConnectionError(f"Send failed for {method}: {exc}") from exc
209
+
210
+ try:
211
+ return await asyncio.wait_for(fut, timeout=self.timeout)
212
+ except TimeoutError:
213
+ self._pending.pop(request_id, None)
214
+ raise PortalTimeoutError(f"Timeout waiting for response to {method}") from None
215
+
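The request/response correlation used by `_call` and the listener (register a future under the request id, let the receiver resolve it, wait with a timeout) can be illustrated without a network connection. The sketch below is an assumption-laden stand-in: `demo` and `fake_listener_delivers` are hypothetical names, and the reply is scheduled with `loop.call_later` instead of arriving over a WebSocket.

```python
import asyncio
import uuid
from typing import Any


async def demo() -> Any:
    pending: dict[str, asyncio.Future[Any]] = {}
    loop = asyncio.get_running_loop()

    # Caller side: create a future and register it under a fresh request id.
    request_id = str(uuid.uuid4())
    fut: asyncio.Future[Any] = loop.create_future()
    pending[request_id] = fut

    # Receiver side (stand-in for the listener task): look up the future by
    # the id carried in the reply and resolve it.
    def fake_listener_delivers(reply: dict[str, Any]) -> None:
        waiter = pending.pop(str(reply["id"]), None)
        if waiter is not None and not waiter.done():
            waiter.set_result(reply["result"])

    reply = {"id": request_id, "status": "success", "result": "pong"}
    loop.call_later(0.01, fake_listener_delivers, reply)

    try:
        return await asyncio.wait_for(fut, timeout=1.0)
    finally:
        # Always drop the entry so a timed-out request does not leak.
        pending.pop(request_id, None)


result = asyncio.run(demo())
```

Keying the table by request id lets several calls be in flight on one connection, since replies may arrive in any order.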
216
+ # ------------------------------------------------------------------
217
+ # Action methods
218
+ # ------------------------------------------------------------------
219
+
220
+ async def tap(self, x: int, y: int) -> Any:
221
+ """Tap screen coordinates."""
222
+ return await self._call("tap", {"x": x, "y": y})
223
+
224
+ async def swipe(
225
+ self,
226
+ start_x: int,
227
+ start_y: int,
228
+ end_x: int,
229
+ end_y: int,
230
+ duration: int | None = None,
231
+ ) -> Any:
232
+ """Swipe from (start_x, start_y) to (end_x, end_y).
233
+ `duration` is optional and given in milliseconds."""
234
+ params: dict[str, Any] = {
235
+ "startX": start_x,
236
+ "startY": start_y,
237
+ "endX": end_x,
238
+ "endY": end_y,
239
+ }
240
+ if duration is not None:
241
+ params["duration"] = duration
242
+ return await self._call("swipe", params)
243
+
244
+ async def global_action(self, action: int) -> Any:
245
+ """Execute accessibility global action by Android action ID."""
246
+ return await self._call("global", {"action": action})
247
+
248
+ async def start_app(
249
+ self,
250
+ package: str,
251
+ activity: str | None = None,
252
+ stop_before_launch: bool = False,
253
+ ) -> Any:
254
+ """Launch an app by package name."""
255
+ params: dict[str, Any] = {"package": package}
256
+ if activity is not None:
257
+ params["activity"] = activity
258
+ if stop_before_launch:
259
+ params["stopBeforeLaunch"] = True
260
+ return await self._call("app", params)
261
+
262
+ async def stop_app(self, package: str) -> Any:
263
+ """Best-effort stop an app."""
264
+ return await self._call("app/stop", {"package": package})
265
+
266
+ async def input_text(self, text: str, clear: bool = True) -> Any:
267
+ """Input text via Portal keyboard. Text is base64-encoded automatically."""
268
+ encoded = base64.b64encode(text.encode()).decode()
269
+ return await self._call(
270
+ "keyboard/input",
271
+ {
272
+ "base64_text": encoded,
273
+ "clear": clear,
274
+ },
275
+ )
276
+
277
+ async def clear_input(self) -> Any:
278
+ """Clear focused input field."""
279
+ return await self._call("keyboard/clear")
280
+
281
+ async def press_key(self, key_code: int) -> Any:
282
+ """Send an Android key code."""
283
+ return await self._call("keyboard/key", {"key_code": key_code})
284
+
285
+ async def set_overlay_offset(self, offset: int) -> Any:
286
+ """Set overlay vertical offset in pixels."""
287
+ return await self._call("overlay_offset", {"offset": offset})
288
+
289
+ async def set_socket_port(self, port: int) -> Any:
290
+ """Update the HTTP server port."""
291
+ return await self._call("socket_port", {"port": port})
292
+
293
+ async def take_screenshot(self, *, hide_overlay: bool = True) -> bytes:
294
+ """Take device screenshot. Returns PNG bytes.
295
+
296
+ WebSocket returns a binary frame: first 36 bytes = request UUID,
297
+ rest = PNG data.
298
+ """
299
+ max_retries = 3
300
+ for attempt in range(max_retries + 1):
301
+ try:
302
+ result = await self._call("screenshot", {"hideOverlay": hide_overlay})
303
+ break
304
+ except PortalResponseError as exc:
305
+ if "interval too short" in str(exc).lower() and attempt < max_retries:
306
+ await asyncio.sleep(0.5)
307
+ continue
308
+ raise
309
+ if isinstance(result, bytes):
310
+ return result
311
+ # Fallback: base64-encoded string
312
+ if isinstance(result, str):
313
+ return base64.b64decode(result)
314
+ raise PortalResponseError(f"Unexpected screenshot result type: {type(result)}")
315
+
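The retry loop in `take_screenshot` (retry only when the error text contains "interval too short", give up after a fixed number of attempts) can be sketched as a generic helper. Everything here is illustrative: `RateLimited` stands in for `PortalResponseError`, and `retry_on_rate_limit` and `flaky_screenshot` are hypothetical names, not package APIs.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any


class RateLimited(Exception):
    """Hypothetical stand-in for the package's PortalResponseError."""


async def retry_on_rate_limit(
    call: Callable[[], Awaitable[Any]],
    *,
    max_retries: int = 3,
    delay: float = 0.5,
) -> Any:
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except RateLimited as exc:
            # Retry only the rate-limit error, and only while attempts remain.
            if "interval too short" in str(exc).lower() and attempt < max_retries:
                await asyncio.sleep(delay)
                continue
            raise


calls = {"n": 0}


async def flaky_screenshot() -> bytes:
    # Fails twice with the rate-limit message, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited("Interval too short")
    return b"png-bytes"


result = asyncio.run(retry_on_rate_limit(flaky_screenshot, delay=0.01))
```

Matching on the message text is brittle if the server wording changes; a dedicated error code, if the server exposes one, would be a sturdier trigger.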
316
+ # ------------------------------------------------------------------
317
+ # Query methods
318
+ # ------------------------------------------------------------------
319
+
320
+ async def get_packages(self) -> Any:
321
+ """List launchable packages."""
322
+ return await self._call("packages")
323
+
324
+ async def get_state(self, *, filter: bool = True) -> Any:
325
+ """Get full state. Set filter=False to keep small elements."""
326
+ return await self._call("state", {"filter": filter})
327
+
328
+ async def get_version(self) -> Any:
329
+ """Get Portal app version."""
330
+ return await self._call("version")
331
+
332
+ async def get_time(self) -> Any:
333
+ """Get device Unix timestamp in milliseconds."""
334
+ return await self._call("time")
335
+
336
+ async def install(
337
+ self,
338
+ urls: list[str],
339
+ hide_overlay: bool = True,
340
+ ) -> Any:
341
+ """Install APK(s) from URL(s). WebSocket only. Supports split APKs."""
342
+ return await self._call(
343
+ "install",
344
+ {
345
+ "urls": urls,
346
+ "hideOverlay": hide_overlay,
347
+ },
348
+ )