hud-python 0.1.0b3__tar.gz → 0.1.2a0__tar.gz
Potentially problematic release.
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/.gitignore +9 -1
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/PKG-INFO +6 -18
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/README.md +5 -17
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/agent/claude.py +23 -12
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/api-reference/adapters.mdx +36 -19
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/api-reference/client.mdx +50 -26
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/concepts/adapter.mdx +8 -13
- hud_python-0.1.2a0/docs/concepts/client.mdx +32 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/concepts/environment.mdx +14 -12
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/concepts/gym.mdx +5 -6
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/examples/custom-agent.mdx +12 -43
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/installation.mdx +9 -23
- hud_python-0.1.2a0/docs/introduction.mdx +24 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/mint.json +1 -1
- hud_python-0.1.2a0/docs/quickstart.mdx +150 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/examples/claude_osworld.ipynb +10 -33
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/__init__.py +1 -1
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/adapters/claude/adapter.py +35 -8
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/adapters/common/types.py +5 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/environment.py +18 -3
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/pyproject.toml +1 -1
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/tests/test_import.py +1 -1
- hud_python-0.1.0b3/docs/concepts/client.mdx +0 -25
- hud_python-0.1.0b3/docs/examples/basic.mdx +0 -156
- hud_python-0.1.0b3/docs/examples/claude-agent.mdx +0 -306
- hud_python-0.1.0b3/docs/introduction.mdx +0 -54
- hud_python-0.1.0b3/docs/quickstart.mdx +0 -81
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/.env.example +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/.github/workflows/ci.yml +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/.github/workflows/release.yml +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/LICENSE +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/MANIFEST.in +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/agent/base.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/agent/response_agent.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/api-reference/env.mdx +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/logo/HUD.svg +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/examples/README.md +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/adapters/__init__.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/adapters/claude/__init__.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/adapters/common/__init__.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/adapters/common/adapter.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/client.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/gym.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/py.typed +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/run.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/server/__init__.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/server/requests.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/settings.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/utils/__init__.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/hud/utils/config.py +0 -0
- {hud_python-0.1.0b3 → hud_python-0.1.2a0}/tests/__init__.py +0 -0
--- hud_python-0.1.0b3/PKG-INFO
+++ hud_python-0.1.2a0/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hud-python
-Version: 0.1.
+Version: 0.1.2a0
 Summary: SDK for the HUD evaluation platform.
 Project-URL: Homepage, https://github.com/Human-Data/hud-sdk
 Project-URL: Bug Tracker, https://github.com/Human-Data/hud-sdk/issues
@@ -57,9 +57,9 @@ Description-Content-Type: text/markdown
 
 # HUD
 
-A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models.
+A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models.
 
-> **Alpha Release Notice**: This SDK is currently in
+> **Alpha Release Notice**: This SDK is currently in early release status. The API is evolving and may change in future releases as we gather feedback and improve functionality.
 
 [](https://pypi.org/project/hud-python/)
 
@@ -70,13 +70,12 @@ A Python SDK for interacting with HUD environments and evaluation benchmarks for
 
 [RECOMMENDED] To set get started with an agent, see the [Claude Computer use example](https://github.com/Human-Data/hud-sdk/tree/main/examples).
 
-
-Otherwise, install the package with Python>=3.9:
+Install the package with Python>=3.9:
 ```bash
 pip install hud-python
 ```
 
-Make sure to setup your account
+Make sure to setup your account with us (email founders@hud.so) and add your API key to the environment variables:
 ```bash
 HUD_API_KEY=<your-api-key>
 ```
@@ -117,20 +116,9 @@ if __name__ == "__main__":
 asyncio.run(main())
 ```
 
-## Features
-
-- Connect to HUD evaluation environments
-- Run benchmarks across various tasks
-- Support for different agent adapters
-- Asynchronous API
-
 ## Documentation
 
-For comprehensive guides, examples, and API reference, visit
-- [Getting Started](https://docs.hud.so/introduction)
-- [Installation](https://docs.hud.so/installation)
-- [API Reference](https://docs.hud.so/api-reference)
-- [Examples](https://docs.hud.so/examples)
+For comprehensive guides, examples, and API reference, visit [our docs](https://docs.hud.so/introduction)
 
 ## License
 
--- hud_python-0.1.0b3/README.md
+++ hud_python-0.1.2a0/README.md
@@ -1,8 +1,8 @@
 # HUD
 
-A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models.
+A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models.
 
-> **Alpha Release Notice**: This SDK is currently in
+> **Alpha Release Notice**: This SDK is currently in early release status. The API is evolving and may change in future releases as we gather feedback and improve functionality.
 
 [](https://pypi.org/project/hud-python/)
 
@@ -13,13 +13,12 @@ A Python SDK for interacting with HUD environments and evaluation benchmarks for
 
 [RECOMMENDED] To set get started with an agent, see the [Claude Computer use example](https://github.com/Human-Data/hud-sdk/tree/main/examples).
 
-
-Otherwise, install the package with Python>=3.9:
+Install the package with Python>=3.9:
 ```bash
 pip install hud-python
 ```
 
-Make sure to setup your account
+Make sure to setup your account with us (email founders@hud.so) and add your API key to the environment variables:
 ```bash
 HUD_API_KEY=<your-api-key>
 ```
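The README hunk above asks for the API key to be placed in the `HUD_API_KEY` environment variable. A minimal sketch of reading that variable at startup and failing early with a clear message follows. `require_api_key` is a hypothetical helper written for illustration; how the SDK itself loads the variable is not shown in this diff.

```python
import os


def require_api_key(env=None) -> str:
    """Return the HUD API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("HUD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("HUD_API_KEY is not set; export it before creating a client")
    return key


if __name__ == "__main__":
    try:
        print("API key found, length:", len(require_api_key()))
    except RuntimeError as exc:
        print(exc)
```

Checking the variable once, before any network call, keeps a configuration mistake from surfacing later as an authentication error.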
@@ -60,20 +59,9 @@ if __name__ == "__main__":
 asyncio.run(main())
 ```
 
-## Features
-
-- Connect to HUD evaluation environments
-- Run benchmarks across various tasks
-- Support for different agent adapters
-- Asynchronous API
-
 ## Documentation
 
-For comprehensive guides, examples, and API reference, visit
-- [Getting Started](https://docs.hud.so/introduction)
-- [Installation](https://docs.hud.so/installation)
-- [API Reference](https://docs.hud.so/api-reference)
-- [Examples](https://docs.hud.so/examples)
+For comprehensive guides, examples, and API reference, visit [our docs](https://docs.hud.so/introduction)
 
 ## License
 
--- hud_python-0.1.0b3/agent/claude.py
+++ hud_python-0.1.2a0/agent/claude.py
@@ -4,6 +4,7 @@ from agent.base import Agent
 from anthropic import Anthropic
 from anthropic.types import Message
 
+
 class ClaudeAgent(Agent):
 def __init__(self, client: Anthropic):
 super().__init__(client)
@@ -11,10 +12,14 @@ class ClaudeAgent(Agent):
 self.max_tokens = 4096
 self.tool_version = "20250124"
 self.thinking_budget = 1024
-self.conversation =
+self.conversation = (
+[]
+) # Store the full conversation history including Claude's responses
 
-async def predict(
-
+async def predict(
+self, screenshot: str | None = None, text: str | None = None
+) -> tuple[bool, str | object | None]:
+message = self._create_message(screenshot, text)
 
 # Only append the message if it's not empty
 if message:
@@ -37,7 +42,7 @@ class ClaudeAgent(Agent):
 
 return done, processed
 
-def _create_message(self,
+def _create_message(self, screenshot: str | None = None, text: str | None = None):
 """Create appropriate message based on context and inputs"""
 
 # Check if the previous response was from assistant and had tool_use
@@ -47,7 +52,11 @@ class ClaudeAgent(Agent):
 # Look for tool_use blocks in the assistant's message
 for block in last_assistant_message["content"]:
 if hasattr(block, "type") and block.type == "tool_use":
-if
+if (
+hasattr(block, "name")
+and block.name == "computer"
+and screenshot
+):
 # Found the tool_use to respond to
 return {
 "role": "user",
@@ -61,7 +70,7 @@ class ClaudeAgent(Agent):
 "source": {
 "type": "base64",
 "media_type": "image/png",
-"data":
+"data": screenshot,
 },
 }
 ],
@@ -70,18 +79,18 @@ class ClaudeAgent(Agent):
 }
 
 # Regular user message
-if
+if text or screenshot:
 content = []
-if
-content.append({"type": "text", "text":
-if
+if text:
+content.append({"type": "text", "text": text})
+if screenshot:
 content.append(
 {
 "type": "image",
 "source": {
 "type": "base64",
 "media_type": "image/png",
-"data":
+"data": screenshot,
 },
 }
 )
@@ -122,7 +131,9 @@ class ClaudeAgent(Agent):
 except Exception as e:
 raise
 
-async def process_response(
+async def process_response(
+self, response: Message
+) -> tuple[bool, str | object | None]:
 # Check if response contains a computer tool use
 computer_action = None
 for block in response.content:
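In the `agent/claude.py` hunks above, the regular-user-message branch of `_create_message` now builds its content from the `text` and `screenshot` arguments. The sketch below restates that branch as a standalone function so the resulting structure is easy to inspect. `build_user_message` is a hypothetical name, the tool-result branch is left out, and both the `None` return for empty input and the `{"role": "user", ...}` wrapper are assumptions inferred from the surrounding context lines rather than shown in full by the diff.

```python
from __future__ import annotations


def build_user_message(screenshot: str | None = None, text: str | None = None) -> dict | None:
    """Hypothetical standalone version of the regular-user-message branch."""
    if not (text or screenshot):
        # Assumption: an empty result tells the caller to append nothing.
        return None

    content: list[dict] = []
    if text:
        content.append({"type": "text", "text": text})
    if screenshot:
        # The screenshot is passed through as a base64-encoded PNG string.
        content.append(
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": screenshot,
                },
            }
        )
    # Assumption: the message is wrapped as a user turn, as in the tool-result branch.
    return {"role": "user", "content": content}


if __name__ == "__main__":
    print(build_user_message(screenshot="QUJD", text="Open the browser"))
```

Text is appended before the image, so a message carrying both always lists the instruction first.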
--- hud_python-0.1.0b3/docs/api-reference/adapters.mdx
+++ hud_python-0.1.2a0/docs/api-reference/adapters.mdx
@@ -28,12 +28,14 @@ convert(data: Any) -> Any
 Converts an action from the agent's format to the CLA format.
 
 **Parameters:**
-
+
+* `data` (Any): The action data to convert
 
 **Returns:**
-- `Any`: The converted action in CLA format
 
-
+* `Any`: The converted action in CLA format
+
+#### adapt\_list
 
 ```python
 adapt_list(actions: list[Any]) -> list[Any]
@@ -42,10 +44,12 @@ adapt_list(actions: list[Any]) -> list[Any]
 Adapts a list of actions.
 
 **Parameters:**
-
+
+* `actions` (list\[Any]): The list of actions to adapt
 
 **Returns:**
-
+
+* `list[Any]`: The adapted list of actions
 
 ## Common Action Types
 
@@ -58,8 +62,10 @@ ClickAction(point: Point, button: str = "left") -> ClickAction
 ```
 
 **Parameters:**
-
-
+
+* `point` (Point): The point to click
+
+* `button` (str, optional): The mouse button to use ("left", "right", "wheel")
 
 ### TypeAction
 
@@ -70,8 +76,10 @@ TypeAction(text: str, enter_after: bool = False) -> TypeAction
 ```
 
 **Parameters:**
-
-
+
+* `text` (str): The text to type
+
+* `enter_after` (bool, optional): Whether to press Enter after typing
 
 ### ScrollAction
 
@@ -82,8 +90,10 @@ ScrollAction(delta_x: int = 0, delta_y: int = 0) -> ScrollAction
 ```
 
 **Parameters:**
-
-
+
+* `delta_x` (int, optional): The horizontal scroll amount
+
+* `delta_y` (int, optional): The vertical scroll amount
 
 ### DragAction
 
@@ -94,9 +104,12 @@ DragAction(start: Point, end: Point, button: str = "left") -> DragAction
 ```
 
 **Parameters:**
-
-
-
+
+* `start` (Point): The starting point
+
+* `end` (Point): The ending point
+
+* `button` (str, optional): The mouse button to use
 
 ### Point
 
@@ -107,8 +120,10 @@ Point(x: int, y: int) -> Point
 ```
 
 **Parameters:**
-
-
+
+* `x` (int): The x-coordinate
+
+* `y` (int): The y-coordinate
 
 ## Claude Adapter
 
@@ -133,10 +148,12 @@ convert(data: Any) -> Any
 Converts a Claude action to the CLA format.
 
 **Parameters:**
-
+
+* `data` (Any): The Claude action data
 
 **Returns:**
-
+
+* `Any`: The converted action in CLA format
 
 ## Usage Example
 
@@ -158,4 +175,4 @@ class MyAdapter(Adapter):
 # Use the adapter
 adapter = MyAdapter()
 env = await run.make(adapter=adapter, metadata={"agent_id": "my-agent"})
-```
+```
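The parameter lists documented in the `adapters.mdx` hunks above map naturally onto small record types. The sketch below models them with plain dataclasses, using only the signatures that appear in the diff. These classes are stand-ins for illustration and are not the SDK's own definitions. `convert_example` and its input dict format are invented for the example, and `adapt_list` here takes the converter as an explicit argument instead of being an adapter method.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Point:
    x: int
    y: int


@dataclass(frozen=True)
class ClickAction:
    point: Point
    button: str = "left"


@dataclass(frozen=True)
class TypeAction:
    text: str
    enter_after: bool = False


@dataclass(frozen=True)
class ScrollAction:
    delta_x: int = 0
    delta_y: int = 0


@dataclass(frozen=True)
class DragAction:
    start: Point
    end: Point
    button: str = "left"


def convert_example(data: dict) -> Any:
    """Hypothetical converter from a made-up agent dict format to the stand-in types."""
    if data["kind"] == "click":
        return ClickAction(point=Point(data["x"], data["y"]))
    if data["kind"] == "type":
        return TypeAction(text=data["text"], enter_after=data.get("enter", False))
    raise ValueError(f"unsupported action kind: {data['kind']!r}")


def adapt_list(actions: list[Any], convert: Callable[[Any], Any]) -> list[Any]:
    """Apply `convert` to every action, preserving order."""
    return [convert(action) for action in actions]


if __name__ == "__main__":
    raw = [{"kind": "click", "x": 10, "y": 20}, {"kind": "type", "text": "hello"}]
    print(adapt_list(raw, convert_example))
```

Frozen dataclasses give value equality for free, which makes converted actions straightforward to compare in tests.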
--- hud_python-0.1.0b3/docs/api-reference/client.mdx
+++ hud_python-0.1.2a0/docs/api-reference/client.mdx
@@ -1,6 +1,6 @@
 ---
-title:
-description:
+title: "HUDClient API"
+description: "API reference for the HUDClient class"
 ---
 
 # HUDClient API Reference
@@ -16,9 +16,11 @@ HUDClient(api_key: str) -> HUDClient
 Creates a new HUD client with the specified API key.
 
 **Parameters:**
-
+
+* `api_key` (str): Your HUD API key
 
 **Example:**
+
 ```python
 from hud import HUDClient
 
@@ -27,7 +29,7 @@ client = HUDClient(api_key="your-api-key")
 
 ## Methods
 
-###
+### load\_gym
 
 ```python
 async load_gym(id: str) -> Gym
@@ -36,17 +38,20 @@ async load_gym(id: str) -> Gym
 Loads a gym by ID from the HUD API.
 
 **Parameters:**
-
+
+* `id` (str): The ID of the gym to load
 
 **Returns:**
-
+
+* `Gym`: The loaded gym
 
 **Example:**
+
 ```python
 gym = await client.load_gym(id="OSWorld-Ubuntu")
 ```
 
-###
+### load\_evalset
 
 ```python
 async load_evalset(id: str) -> EvalSet
@@ -55,17 +60,20 @@ async load_evalset(id: str) -> EvalSet
 Loads an evaluation set by ID from the HUD API.
 
 **Parameters:**
-
+
+* `id` (str): The ID of the evaluation set to load
 
 **Returns:**
-
+
+* `EvalSet`: The loaded evaluation set
 
 **Example:**
+
 ```python
 evalset = await client.load_evalset(id="OSWorld-Ubuntu")
 ```
 
-###
+### list\_gyms
 
 ```python
 async list_gyms() -> list[str]
@@ -74,15 +82,17 @@ async list_gyms() -> list[str]
 Lists all available gyms.
 
 **Returns:**
-
+
+* `list[str]`: A list of gym IDs
 
 **Example:**
+
 ```python
 gyms = await client.list_gyms()
 print(gyms) # ["OSWorld-Ubuntu", "OSWorld-Windows", ...]
 ```
 
-###
+### get\_runs
 
 ```python
 async get_runs() -> list[Run]
@@ -91,16 +101,18 @@ async get_runs() -> list[Run]
 Gets all runs associated with the API key.
 
 **Returns:**
-
+
+* `list[Run]`: A list of runs
 
 **Example:**
+
 ```python
 runs = await client.get_runs()
 for run in runs:
 print(f"Run: {run.name} (ID: {run.id})")
 ```
 
-###
+### load\_run
 
 ```python
 async load_run(id: str, adapter: Adapter | None = None) -> Run | None
@@ -109,20 +121,24 @@ async load_run(id: str, adapter: Adapter | None = None) -> Run | None
 Loads a run by ID from the HUD API.
 
 **Parameters:**
-
-
+
+* `id` (str): The ID of the run to load
+
+* `adapter` (Adapter, optional): An adapter to use with the run
 
 **Returns:**
-
+
+* `Run | None`: The loaded run, or None if not found
 
 **Example:**
+
 ```python
 run = await client.load_run(id="run-123")
 if run:
 print(f"Loaded run: {run.name}")
 ```
 
-###
+### create\_run
 
 ```python
 async create_run(
@@ -138,17 +154,25 @@ async create_run(
 Creates a new run.
 
 **Parameters:**
-
-
-
-
-
-
+
+* `name` (str): The name of the run
+
+* `gym` (Gym): The gym to use for the run
+
+* `evalset` (EvalSet): The evaluation set to use for the run
+
+* `config` (dict, optional): Configuration parameters for the run
+
+* `metadata` (dict, optional): Metadata for the run
+
+* `adapter` (Adapter, optional): An adapter to use with the run
 
 **Returns:**
-
+
+* `Run`: The created run
 
 **Example:**
+
 ```python
 run = await client.create_run(
 name="example-run",
@@ -156,4 +180,4 @@ run = await client.create_run(
 evalset=evalset,
 metadata={"agent_id": "example"}
 )
-```
+```
--- hud_python-0.1.0b3/docs/concepts/adapter.mdx
+++ hud_python-0.1.2a0/docs/concepts/adapter.mdx
@@ -10,8 +10,10 @@ An `Adapter` in the HUD SDK is responsible for translating between your agent's
 ## Purpose
 
 Adapters serve as a bridge between:
-
-
+
+* Your agent's custom action format
+
+* The standardized CLA format expected by HUD environments
 
 This allows you to use different agent implementations without changing how they interact with the environment.
 
@@ -19,9 +21,9 @@ This allows you to use different agent implementations without changing how they
 
 The HUD SDK includes several built-in adapters:
 
-
-
-
+* **Claude Adapter**: For integrating with Anthropic's Claude models
+
+* **Common Adapter**: A base adapter that can be extended for custom implementations
 
 ## Creating a Custom Adapter
 
@@ -64,11 +66,4 @@ adapter = SimpleAdapter()
 env = await run.make(adapter=adapter, metadata={"agent_id": "simple-agent"})
 ```
 
-
-
-The CLA format supports several action types:
-
-- **ClickAction**: For mouse clicks (left, right, double)
-- **TypeAction**: For keyboard input
-- **ScrollAction**: For scrolling the screen
-- **DragAction**: For drag-and-drop operations
+See [Common Action Types](/api-reference/adapters)
--- /dev/null
+++ hud_python-0.1.2a0/docs/concepts/client.mdx
@@ -0,0 +1,32 @@
+---
+title: 'Client'
+description: 'Understanding the HUDClient'
+---
+
+# HUDClient
+
+The `HUDClient` is the main entry point for interacting with the HUD API. It provides methods to load gyms, evalsets, and create runs.
+
+## Initialization
+
+```python
+from hud import HUDClient
+
+client = HUDClient(api_key="your-api-key")
+```
+
+## Key Methods
+
+* `load_gym(id)`: Load a gym by ID from the HUD API
+
+* `load_evalset(id)`: Load an evalset by ID from the HUD API
+
+* `list_gyms()`: List all available gyms
+
+* `get_runs()`: Get all runs associated with the API key
+
+* `load_run(id)`: Load a run by ID from the HUD API
+
+* `create_run(name, gym, evalset)`: Create a new run
+
+* `display_stream(url)`: View an inline livestream of the environment VNC
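The methods listed in the new `concepts/client.mdx` are usually chained: load a gym and an evalset, then create a run from them. The sketch below shows that call order against an in-memory stand-in so it runs without network access or an API key. `StubClient`, the `Gym`, `EvalSet` and `Run` records, and every return value are hypothetical. Only the method names and keyword arguments are taken from the documentation in this diff.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass
class Gym:
    id: str


@dataclass
class EvalSet:
    id: str


@dataclass
class Run:
    name: str
    gym: Gym
    evalset: EvalSet
    metadata: dict = field(default_factory=dict)


class StubClient:
    """In-memory stand-in that mimics the documented method names only."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key
        self._runs: list[Run] = []

    async def load_gym(self, id: str) -> Gym:
        return Gym(id=id)

    async def load_evalset(self, id: str) -> EvalSet:
        return EvalSet(id=id)

    async def create_run(self, name: str, gym: Gym, evalset: EvalSet, metadata: dict | None = None) -> Run:
        run = Run(name=name, gym=gym, evalset=evalset, metadata=metadata or {})
        self._runs.append(run)
        return run

    async def get_runs(self) -> list[Run]:
        return list(self._runs)


async def main() -> tuple[Run, list[Run]]:
    client = StubClient(api_key="your-api-key")
    gym = await client.load_gym(id="OSWorld-Ubuntu")
    evalset = await client.load_evalset(id="OSWorld-Ubuntu")
    run = await client.create_run(
        name="example-run", gym=gym, evalset=evalset, metadata={"agent_id": "example"}
    )
    return run, await client.get_runs()


if __name__ == "__main__":
    created, all_runs = asyncio.run(main())
    print(created.name, len(all_runs))
```

A stub like this can also double as a test fixture for agent code, under the assumption that the real client keeps the same method names.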
--- hud_python-0.1.0b3/docs/concepts/environment.mdx
+++ hud_python-0.1.2a0/docs/concepts/environment.mdx
@@ -3,8 +3,6 @@ title: 'Environment'
 description: 'Understanding HUD Environments'
 ---
 
-# Environment
-
 An `Env` in the HUD SDK represents a running instance of a gym where an agent can interact with tasks. It provides methods for observation, action, and evaluation.
 
 ## Initialization
@@ -69,23 +67,27 @@ await env.close()
 
 Observations from the environment include:
 
-
-
+* `screenshot`: A base64-encoded PNG image of the current screen
+
+* `text`: Text observation, if available
 
 ## Environment States
 
 An environment can be in one of several states:
 
-
-
-
-
+* `creating`: The environment is being created
+
+* `running`: The environment is running and ready for interaction
+
+* `error`: An error occurred during environment creation or execution
+
+* `closed`: The environment has been closed
 
 ## VNC Access
 
-For debugging purposes, you can
+For debugging purposes, you can view the environment directly via VNC:
 
 ```python
-
-
-```
+live_url = await env.get_vnc_url()
+client.display_stream(live_url)
+```
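The observation fields and state names documented in the `environment.mdx` hunk above lend themselves to a little client-side validation. The sketch below assumes observations arrive as a dict with `screenshot` and `text` keys and that the state is reported as one of the four strings listed. `EnvState`, `decode_screenshot`, `can_step` and the dict shape are hypothetical names chosen for illustration, not SDK identifiers.

```python
from __future__ import annotations

import base64
from enum import Enum

# Every PNG file begins with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


class EnvState(str, Enum):
    CREATING = "creating"
    RUNNING = "running"
    ERROR = "error"
    CLOSED = "closed"


def decode_screenshot(observation: dict) -> bytes | None:
    """Decode the base64 screenshot, or return None when the observation has none."""
    encoded = observation.get("screenshot")
    if not encoded:
        return None
    raw = base64.b64decode(encoded, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot does not look like a PNG image")
    return raw


def can_step(state: str) -> bool:
    """Only a running environment is ready for interaction."""
    return EnvState(state) is EnvState.RUNNING


if __name__ == "__main__":
    fake_png = base64.b64encode(PNG_SIGNATURE + b"payload").decode("ascii")
    print(len(decode_screenshot({"screenshot": fake_png, "text": None})))
    print(can_step("running"), can_step("closed"))
```

Validating the PNG signature catches a truncated or mislabeled payload before it is forwarded to a model.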
--- hud_python-0.1.0b3/docs/concepts/gym.mdx
+++ hud_python-0.1.2a0/docs/concepts/gym.mdx
@@ -3,8 +3,6 @@ title: 'Gym'
 description: 'Understanding HUD Gyms'
 ---
 
-# Gym
-
 A `Gym` in the HUD SDK represents a specific environment where tasks can be executed. It defines the operating system, available tools, and constraints for the agent.
 
 ## Initialization
@@ -22,7 +20,7 @@ gym = await client.load_gym(id="OSWorld-Ubuntu")
 
 The HUD platform offers several gyms, including:
 
-
+* **OSWorld-Ubuntu**: A Linux Ubuntu environment for general OS tasks
 
 You can list all available gyms using:
 
@@ -35,8 +33,9 @@ print(gyms) # List of available gym IDs
 
 Each gym has the following properties:
 
-
-
+* `id`: Unique identifier for the gym
+
+* `name`: Human-readable name of the gym
 
 ## Using Gyms
 
@@ -46,4 +45,4 @@ Gyms are used when creating a run:
 run = await client.create_run(name="my-run", gym=gym, evalset=evalset)
 ```
 
-This associates the run with the specific environment defined by the gym.
+This associates the run with the specific environment defined by the gym.