PyPI - hud-python - Versions diffs - 0.1.0b3__tar.gz → 0.1.2a0__tar.gz - Mend

hud-python 0.1.0b3__tar.gz → 0.1.2a0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of hud-python might be problematic.

Files changed (51)

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/.gitignore RENAMED

@@ -16,4 +16,12 @@ uv.lock
 *.gif
 *.bmp
 *.tiff
-*.ico
+*.ico
+# DS-Store
+.DS_Store
+# Test files
+/*.ipynb
+test.json
+TODO.md

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hud-python
-Version: 0.1.0b3
+Version: 0.1.2a0
 Summary: SDK for the HUD evaluation platform.
 Project-URL: Homepage, https://github.com/Human-Data/hud-sdk
 Project-URL: Bug Tracker, https://github.com/Human-Data/hud-sdk/issues
@@ -57,9 +57,9 @@ Description-Content-Type: text/markdown
 # HUD
-A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models. Visit [hud.so](https://hud.so).
+A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models.
-> **Alpha Release Notice**: This SDK is currently in alpha status (v0.1.0-alpha). The API is evolving and may change in future releases as we gather feedback and improve functionality.
+> **Alpha Release Notice**: This SDK is currently in early release status. The API is evolving and may change in future releases as we gather feedback and improve functionality.
 [![PyPI version](https://img.shields.io/pypi/v/hud-python)](https://pypi.org/project/hud-python/)
@@ -70,13 +70,12 @@ A Python SDK for interacting with HUD environments and evaluation benchmarks for
 [RECOMMENDED] To set get started with an agent, see the [Claude Computer use example](https://github.com/Human-Data/hud-sdk/tree/main/examples).
-Otherwise, install the package with Python>=3.9:
+Install the package with Python>=3.9:
 ```bash
 pip install hud-python
 ```
-Make sure to setup your account [here](https://hud.so/settings) and add your API key to the environment variables:
+Make sure to setup your account with us (email founders@hud.so) and add your API key to the environment variables:
 ```bash
 HUD_API_KEY=<your-api-key>
 ```
@@ -117,20 +116,9 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
-## Features
-- Connect to HUD evaluation environments
-- Run benchmarks across various tasks
-- Support for different agent adapters
-- Asynchronous API
 ## Documentation
-For comprehensive guides, examples, and API reference, visit:
-- [Getting Started](https://docs.hud.so/introduction)
-- [Installation](https://docs.hud.so/installation)
-- [API Reference](https://docs.hud.so/api-reference)
-- [Examples](https://docs.hud.so/examples)
+For comprehensive guides, examples, and API reference, visit [our docs](https://docs.hud.so/introduction)
 ## License

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/README.md RENAMED

@@ -1,8 +1,8 @@
 # HUD
-A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models. Visit [hud.so](https://hud.so).
+A Python SDK for interacting with HUD environments and evaluation benchmarks for browser use and computer use models.
-> **Alpha Release Notice**: This SDK is currently in alpha status (v0.1.0-alpha). The API is evolving and may change in future releases as we gather feedback and improve functionality.
+> **Alpha Release Notice**: This SDK is currently in early release status. The API is evolving and may change in future releases as we gather feedback and improve functionality.
 [![PyPI version](https://img.shields.io/pypi/v/hud-python)](https://pypi.org/project/hud-python/)
@@ -13,13 +13,12 @@ A Python SDK for interacting with HUD environments and evaluation benchmarks for
 [RECOMMENDED] To set get started with an agent, see the [Claude Computer use example](https://github.com/Human-Data/hud-sdk/tree/main/examples).
-Otherwise, install the package with Python>=3.9:
+Install the package with Python>=3.9:
 ```bash
 pip install hud-python
 ```
-Make sure to setup your account [here](https://hud.so/settings) and add your API key to the environment variables:
+Make sure to setup your account with us (email founders@hud.so) and add your API key to the environment variables:
 ```bash
 HUD_API_KEY=<your-api-key>
 ```
@@ -60,20 +59,9 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
-## Features
-- Connect to HUD evaluation environments
-- Run benchmarks across various tasks
-- Support for different agent adapters
-- Asynchronous API
 ## Documentation
-For comprehensive guides, examples, and API reference, visit:
-- [Getting Started](https://docs.hud.so/introduction)
-- [Installation](https://docs.hud.so/installation)
-- [API Reference](https://docs.hud.so/api-reference)
-- [Examples](https://docs.hud.so/examples)
+For comprehensive guides, examples, and API reference, visit [our docs](https://docs.hud.so/introduction)
 ## License

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/agent/claude.py RENAMED

@@ -4,6 +4,7 @@ from agent.base import Agent
 from anthropic import Anthropic
 from anthropic.types import Message
 class ClaudeAgent(Agent):
     def __init__(self, client: Anthropic):
         super().__init__(client)
@@ -11,10 +12,14 @@ class ClaudeAgent(Agent):
         self.max_tokens = 4096
         self.tool_version = "20250124"
         self.thinking_budget = 1024
-        self.conversation = []  # Store the full conversation history including Claude's responses
+        self.conversation = (
+            []
+        )  # Store the full conversation history including Claude's responses
-    async def predict(self, base64_image: str | None = None, input_text: str | None = None) -> tuple[bool, str | object | None]:
-        message = self._create_message(base64_image, input_text)
+    async def predict(
+        self, screenshot: str | None = None, text: str | None = None
+    ) -> tuple[bool, str | object | None]:
+        message = self._create_message(screenshot, text)
         # Only append the message if it's not empty
         if message:
@@ -37,7 +42,7 @@ class ClaudeAgent(Agent):
         return done, processed
-    def _create_message(self, base64_image: str | None = None, input_text: str | None = None):
+    def _create_message(self, screenshot: str | None = None, text: str | None = None):
         """Create appropriate message based on context and inputs"""
         # Check if the previous response was from assistant and had tool_use
@@ -47,7 +52,11 @@ class ClaudeAgent(Agent):
             # Look for tool_use blocks in the assistant's message
             for block in last_assistant_message["content"]:
                 if hasattr(block, "type") and block.type == "tool_use":
-                    if hasattr(block, "name") and block.name == "computer" and base64_image:
+                    if (
+                        hasattr(block, "name")
+                        and block.name == "computer"
+                        and screenshot
+                    ):
                         # Found the tool_use to respond to
                         return {
                             "role": "user",
@@ -61,7 +70,7 @@ class ClaudeAgent(Agent):
                                             "source": {
                                                 "type": "base64",
                                                 "media_type": "image/png",
-                                                "data": base64_image,
+                                                "data": screenshot,
                                             },
                                         }
                                     ],
@@ -70,18 +79,18 @@ class ClaudeAgent(Agent):
                         }
         # Regular user message
-        if input_text or base64_image:
+        if text or screenshot:
             content = []
-            if input_text:
-                content.append({"type": "text", "text": input_text})
-            if base64_image:
+            if text:
+                content.append({"type": "text", "text": text})
+            if screenshot:
                 content.append(
                     {
                         "type": "image",
                         "source": {
                             "type": "base64",
                             "media_type": "image/png",
-                            "data": base64_image,
+                            "data": screenshot,
                         },
                     }
                 )
@@ -122,7 +131,9 @@ class ClaudeAgent(Agent):
         except Exception as e:
             raise
-    async def process_response(self, response: Message) -> tuple[bool, str | object | None]:
+    async def process_response(
+        self, response: Message
+    ) -> tuple[bool, str | object | None]:
         # Check if response contains a computer tool use
         computer_action = None
         for block in response.content:
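
Note on the hunks above: the `base64_image` and `input_text` parameters are renamed to `screenshot` and `text`, while the message payload keeps the same shape. The following is a minimal standalone sketch of the "regular user message" branch after the rename. The function name `build_user_message` is hypothetical, and the final `{"role": "user", ...}` wrapper is an assumption, since the diff does not show that return statement.

```python
from __future__ import annotations


def build_user_message(
    screenshot: str | None = None, text: str | None = None
) -> dict | None:
    """Standalone sketch of the regular-message branch (not the SDK's code)."""
    # Nothing to send if both inputs are empty or missing
    if not (text or screenshot):
        return None

    content: list[dict] = []
    if text:
        content.append({"type": "text", "text": text})
    if screenshot:
        # `screenshot` is expected to be a base64-encoded PNG string
        content.append(
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": screenshot,
                },
            }
        )
    # Assumption: the message is wrapped as a user turn; the diff elides this line
    return {"role": "user", "content": content}
```

Callers that passed `base64_image=` or `input_text=` as keyword arguments would need to switch to `screenshot=` and `text=`; positional callers are unaffected.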

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/api-reference/adapters.mdx RENAMED

@@ -28,12 +28,14 @@ convert(data: Any) -> Any
 Converts an action from the agent's format to the CLA format.
 **Parameters:**
-- `data` (Any): The action data to convert
+* `data` (Any): The action data to convert
 **Returns:**
-- `Any`: The converted action in CLA format
-#### adapt_list
+* `Any`: The converted action in CLA format
+#### adapt\_list
 ```python
 adapt_list(actions: list[Any]) -> list[Any]
@@ -42,10 +44,12 @@ adapt_list(actions: list[Any]) -> list[Any]
 Adapts a list of actions.
 **Parameters:**
-- `actions` (list[Any]): The list of actions to adapt
+* `actions` (list\[Any]): The list of actions to adapt
 **Returns:**
-- `list[Any]`: The adapted list of actions
+* `list[Any]`: The adapted list of actions
 ## Common Action Types
@@ -58,8 +62,10 @@ ClickAction(point: Point, button: str = "left") -> ClickAction
 ```
 **Parameters:**
-- `point` (Point): The point to click
-- `button` (str, optional): The mouse button to use ("left", "right", "double")
+* `point` (Point): The point to click
+* `button` (str, optional): The mouse button to use ("left", "right", "wheel")
 ### TypeAction
@@ -70,8 +76,10 @@ TypeAction(text: str, enter_after: bool = False) -> TypeAction
 ```
 **Parameters:**
-- `text` (str): The text to type
-- `enter_after` (bool, optional): Whether to press Enter after typing
+* `text` (str): The text to type
+* `enter_after` (bool, optional): Whether to press Enter after typing
 ### ScrollAction
@@ -82,8 +90,10 @@ ScrollAction(delta_x: int = 0, delta_y: int = 0) -> ScrollAction
 ```
 **Parameters:**
-- `delta_x` (int, optional): The horizontal scroll amount
-- `delta_y` (int, optional): The vertical scroll amount
+* `delta_x` (int, optional): The horizontal scroll amount
+* `delta_y` (int, optional): The vertical scroll amount
 ### DragAction
@@ -94,9 +104,12 @@ DragAction(start: Point, end: Point, button: str = "left") -> DragAction
 ```
 **Parameters:**
-- `start` (Point): The starting point
-- `end` (Point): The ending point
-- `button` (str, optional): The mouse button to use
+* `start` (Point): The starting point
+* `end` (Point): The ending point
+* `button` (str, optional): The mouse button to use
 ### Point
@@ -107,8 +120,10 @@ Point(x: int, y: int) -> Point
 ```
 **Parameters:**
-- `x` (int): The x-coordinate
-- `y` (int): The y-coordinate
+* `x` (int): The x-coordinate
+* `y` (int): The y-coordinate
 ## Claude Adapter
@@ -133,10 +148,12 @@ convert(data: Any) -> Any
 Converts a Claude action to the CLA format.
 **Parameters:**
-- `data` (Any): The Claude action data
+* `data` (Any): The Claude action data
 **Returns:**
-- `Any`: The converted action in CLA format
+* `Any`: The converted action in CLA format
 ## Usage Example
@@ -158,4 +175,4 @@ class MyAdapter(Adapter):
 # Use the adapter
 adapter = MyAdapter()
 env = await run.make(adapter=adapter, metadata={"agent_id": "my-agent"})
-```
+```
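
Note on the adapter reference above: the documented `convert` and `adapt_list` signatures, together with `Point` and `ClickAction`, can be illustrated with a small self-contained sketch. Everything here is a stand-in. The dataclasses only mirror the documented signatures, the `SketchAdapter` class and the dict-shaped input action are hypothetical, and treating `adapt_list` as a plain map over `convert` is an assumption rather than the SDK's confirmed behavior.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Point:
    # Stand-in mirroring the documented Point(x: int, y: int)
    x: int
    y: int


@dataclass
class ClickAction:
    # Stand-in mirroring ClickAction(point: Point, button: str = "left");
    # the updated docs list "left", "right", "wheel" as button values
    point: Point
    button: str = "left"


class SketchAdapter:
    """Hypothetical adapter; the dict-based agent action format is an assumption."""

    def convert(self, data: Any) -> Any:
        # Translate one agent-format action into a CLA-style action object
        if isinstance(data, dict) and data.get("type") == "click":
            return ClickAction(
                point=Point(data["x"], data["y"]),
                button=data.get("button", "left"),
            )
        raise ValueError(f"Unsupported action: {data!r}")

    def adapt_list(self, actions: list[Any]) -> list[Any]:
        # Assumption: adapting a list means converting each action in order
        return [self.convert(action) for action in actions]
```

With the real SDK, an adapter instance is passed to `run.make(adapter=...)` as in the usage example in the diff; the sketch above only shows the conversion step in isolation.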

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/api-reference/client.mdx RENAMED

@@ -1,6 +1,6 @@
 ---
-title: 'HUDClient API'
-description: 'API reference for the HUDClient class'
+title: "HUDClient API"
+description: "API reference for the HUDClient class"
 ---
 # HUDClient API Reference
@@ -16,9 +16,11 @@ HUDClient(api_key: str) -> HUDClient
 Creates a new HUD client with the specified API key.
 **Parameters:**
-- `api_key` (str): Your HUD API key
+* `api_key` (str): Your HUD API key
 **Example:**
 ```python
 from hud import HUDClient
@@ -27,7 +29,7 @@ client = HUDClient(api_key="your-api-key")
 ## Methods
-### load_gym
+### load\_gym
 ```python
 async load_gym(id: str) -> Gym
@@ -36,17 +38,20 @@ async load_gym(id: str) -> Gym
 Loads a gym by ID from the HUD API.
 **Parameters:**
-- `id` (str): The ID of the gym to load
+* `id` (str): The ID of the gym to load
 **Returns:**
-- `Gym`: The loaded gym
+* `Gym`: The loaded gym
 **Example:**
 ```python
 gym = await client.load_gym(id="OSWorld-Ubuntu")
 ```
-### load_evalset
+### load\_evalset
 ```python
 async load_evalset(id: str) -> EvalSet
@@ -55,17 +60,20 @@ async load_evalset(id: str) -> EvalSet
 Loads an evaluation set by ID from the HUD API.
 **Parameters:**
-- `id` (str): The ID of the evaluation set to load
+* `id` (str): The ID of the evaluation set to load
 **Returns:**
-- `EvalSet`: The loaded evaluation set
+* `EvalSet`: The loaded evaluation set
 **Example:**
 ```python
 evalset = await client.load_evalset(id="OSWorld-Ubuntu")
 ```
-### list_gyms
+### list\_gyms
 ```python
 async list_gyms() -> list[str]
@@ -74,15 +82,17 @@ async list_gyms() -> list[str]
 Lists all available gyms.
 **Returns:**
-- `list[str]`: A list of gym IDs
+* `list[str]`: A list of gym IDs
 **Example:**
 ```python
 gyms = await client.list_gyms()
 print(gyms)  # ["OSWorld-Ubuntu", "OSWorld-Windows", ...]
 ```
-### get_runs
+### get\_runs
 ```python
 async get_runs() -> list[Run]
@@ -91,16 +101,18 @@ async get_runs() -> list[Run]
 Gets all runs associated with the API key.
 **Returns:**
-- `list[Run]`: A list of runs
+* `list[Run]`: A list of runs
 **Example:**
 ```python
 runs = await client.get_runs()
 for run in runs:
     print(f"Run: {run.name} (ID: {run.id})")
 ```
-### load_run
+### load\_run
 ```python
 async load_run(id: str, adapter: Adapter | None = None) -> Run | None
@@ -109,20 +121,24 @@ async load_run(id: str, adapter: Adapter | None = None) -> Run | None
 Loads a run by ID from the HUD API.
 **Parameters:**
-- `id` (str): The ID of the run to load
-- `adapter` (Adapter, optional): An adapter to use with the run
+* `id` (str): The ID of the run to load
+* `adapter` (Adapter, optional): An adapter to use with the run
 **Returns:**
-- `Run | None`: The loaded run, or None if not found
+* `Run | None`: The loaded run, or None if not found
 **Example:**
 ```python
 run = await client.load_run(id="run-123")
 if run:
     print(f"Loaded run: {run.name}")
 ```
-### create_run
+### create\_run
 ```python
 async create_run(
@@ -138,17 +154,25 @@ async create_run(
 Creates a new run.
 **Parameters:**
-- `name` (str): The name of the run
-- `gym` (Gym): The gym to use for the run
-- `evalset` (EvalSet): The evaluation set to use for the run
-- `config` (dict, optional): Configuration parameters for the run
-- `metadata` (dict, optional): Metadata for the run
-- `adapter` (Adapter, optional): An adapter to use with the run
+* `name` (str): The name of the run
+* `gym` (Gym): The gym to use for the run
+* `evalset` (EvalSet): The evaluation set to use for the run
+* `config` (dict, optional): Configuration parameters for the run
+* `metadata` (dict, optional): Metadata for the run
+* `adapter` (Adapter, optional): An adapter to use with the run
 **Returns:**
-- `Run`: The created run
+* `Run`: The created run
 **Example:**
 ```python
 run = await client.create_run(
     name="example-run",
@@ -156,4 +180,4 @@ run = await client.create_run(
     evalset=evalset,
     metadata={"agent_id": "example"}
 )
-```
+```

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/concepts/adapter.mdx RENAMED

@@ -10,8 +10,10 @@ An `Adapter` in the HUD SDK is responsible for translating between your agent's
 ## Purpose
 Adapters serve as a bridge between:
-- Your agent's custom action format
-- The standardized CLA format expected by HUD environments
+* Your agent's custom action format
+* The standardized CLA format expected by HUD environments
 This allows you to use different agent implementations without changing how they interact with the environment.
@@ -19,9 +21,9 @@ This allows you to use different agent implementations without changing how they
 The HUD SDK includes several built-in adapters:
-- **Claude Adapter**: For integrating with Anthropic's Claude models
-- **OpenAI Adapter**: For integrating with OpenAI's models
-- **Common Adapter**: A base adapter that can be extended for custom implementations
+* **Claude Adapter**: For integrating with Anthropic's Claude models
+* **Common Adapter**: A base adapter that can be extended for custom implementations
 ## Creating a Custom Adapter
@@ -64,11 +66,4 @@ adapter = SimpleAdapter()
 env = await run.make(adapter=adapter, metadata={"agent_id": "simple-agent"})
 ```
-## Common Action Types
-The CLA format supports several action types:
-- **ClickAction**: For mouse clicks (left, right, double)
-- **TypeAction**: For keyboard input
-- **ScrollAction**: For scrolling the screen
-- **DragAction**: For drag-and-drop operations
+See [Common Action Types](/api-reference/adapters)

hud_python-0.1.2a0/docs/concepts/client.mdx ADDED

@@ -0,0 +1,32 @@
+---
+title: 'Client'
+description: 'Understanding the HUDClient'
+---
+# HUDClient
+The `HUDClient` is the main entry point for interacting with the HUD API. It provides methods to load gyms, evalsets, and create runs.
+## Initialization
+```python
+from hud import HUDClient
+client = HUDClient(api_key="your-api-key")
+```
+## Key Methods
+* `load_gym(id)`: Load a gym by ID from the HUD API
+* `load_evalset(id)`: Load an evalset by ID from the HUD API
+* `list_gyms()`: List all available gyms
+* `get_runs()`: Get all runs associated with the API key
+* `load_run(id)`: Load a run by ID from the HUD API
+* `create_run(name, gym, evalset)`: Create a new run
+* `display_stream(url)`: View an inline livestream of the environment VNC

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/concepts/environment.mdx RENAMED

@@ -3,8 +3,6 @@ title: 'Environment'
 description: 'Understanding HUD Environments'
 ---
-# Environment
 An `Env` in the HUD SDK represents a running instance of a gym where an agent can interact with tasks. It provides methods for observation, action, and evaluation.
 ## Initialization
@@ -69,23 +67,27 @@ await env.close()
 Observations from the environment include:
-- `screenshot`: A base64-encoded PNG image of the current screen
-- `text`: Text observation, if available
+* `screenshot`: A base64-encoded PNG image of the current screen
+* `text`: Text observation, if available
 ## Environment States
 An environment can be in one of several states:
-- `creating`: The environment is being created
-- `running`: The environment is running and ready for interaction
-- `error`: An error occurred during environment creation or execution
-- `closed`: The environment has been closed
+* `creating`: The environment is being created
+* `running`: The environment is running and ready for interaction
+* `error`: An error occurred during environment creation or execution
+* `closed`: The environment has been closed
 ## VNC Access
-For debugging purposes, you can access the environment directly via VNC:
+For debugging purposes, you can view the environment directly via VNC:
 ```python
-vnc_url = await env.get_vnc_url()
-print(f"Connect to VNC at: {vnc_url}")
-```
+live_url = await env.get_vnc_url()
+client.display_stream(live_url)
+```

{hud_python-0.1.0b3 → hud_python-0.1.2a0}/docs/concepts/gym.mdx RENAMED

@@ -3,8 +3,6 @@ title: 'Gym'
 description: 'Understanding HUD Gyms'
 ---
-# Gym
 A `Gym` in the HUD SDK represents a specific environment where tasks can be executed. It defines the operating system, available tools, and constraints for the agent.
 ## Initialization
@@ -22,7 +20,7 @@ gym = await client.load_gym(id="OSWorld-Ubuntu")
 The HUD platform offers several gyms, including:
-- **OSWorld-Ubuntu**: A Linux Ubuntu environment for general OS tasks
+* **OSWorld-Ubuntu**: A Linux Ubuntu environment for general OS tasks
 You can list all available gyms using:
@@ -35,8 +33,9 @@ print(gyms)  # List of available gym IDs
 Each gym has the following properties:
-- `id`: Unique identifier for the gym
-- `name`: Human-readable name of the gym
+* `id`: Unique identifier for the gym
+* `name`: Human-readable name of the gym
 ## Using Gyms
@@ -46,4 +45,4 @@ Gyms are used when creating a run:
 run = await client.create_run(name="my-run", gym=gym, evalset=evalset)
 ```
-This associates the run with the specific environment defined by the gym.
+This associates the run with the specific environment defined by the gym.
