PyPI - cua-agent - Versions diffs - 0.1.17__tar.gz → 0.1.18__tar.gz - Mend

cua-agent 0.1.17__tar.gz → 0.1.18__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of cua-agent might be problematic.

Files changed (77)

cua_agent-0.1.18/PKG-INFO ADDED

@@ -0,0 +1,165 @@
+Metadata-Version: 2.1
+Name: cua-agent
+Version: 0.1.18
+Summary: CUA (Computer Use) Agent for AI-driven computer interaction
+Author-Email: TryCua <gh@trycua.com>
+Requires-Python: <3.13,>=3.10
+Requires-Dist: httpx<0.29.0,>=0.27.0
+Requires-Dist: aiohttp<4.0.0,>=3.9.3
+Requires-Dist: asyncio
+Requires-Dist: anyio<5.0.0,>=4.4.1
+Requires-Dist: typing-extensions<5.0.0,>=4.12.2
+Requires-Dist: pydantic<3.0.0,>=2.6.4
+Requires-Dist: rich<14.0.0,>=13.7.1
+Requires-Dist: python-dotenv<2.0.0,>=1.0.1
+Requires-Dist: cua-computer<0.2.0,>=0.1.0
+Requires-Dist: cua-core<0.2.0,>=0.1.0
+Requires-Dist: certifi>=2024.2.2
+Provides-Extra: anthropic
+Requires-Dist: anthropic>=0.49.0; extra == "anthropic"
+Requires-Dist: boto3<2.0.0,>=1.35.81; extra == "anthropic"
+Provides-Extra: openai
+Requires-Dist: openai<2.0.0,>=1.14.0; extra == "openai"
+Requires-Dist: httpx<0.29.0,>=0.27.0; extra == "openai"
+Provides-Extra: som
+Requires-Dist: torch>=2.2.1; extra == "som"
+Requires-Dist: torchvision>=0.17.1; extra == "som"
+Requires-Dist: ultralytics>=8.0.0; extra == "som"
+Requires-Dist: transformers>=4.38.2; extra == "som"
+Requires-Dist: cua-som<0.2.0,>=0.1.0; extra == "som"
+Requires-Dist: anthropic<0.47.0,>=0.46.0; extra == "som"
+Requires-Dist: boto3<2.0.0,>=1.35.81; extra == "som"
+Requires-Dist: openai<2.0.0,>=1.14.0; extra == "som"
+Requires-Dist: groq<0.5.0,>=0.4.0; extra == "som"
+Requires-Dist: dashscope<2.0.0,>=1.13.0; extra == "som"
+Requires-Dist: requests<3.0.0,>=2.31.0; extra == "som"
+Provides-Extra: all
+Requires-Dist: torch>=2.2.1; extra == "all"
+Requires-Dist: torchvision>=0.17.1; extra == "all"
+Requires-Dist: ultralytics>=8.0.0; extra == "all"
+Requires-Dist: transformers>=4.38.2; extra == "all"
+Requires-Dist: cua-som<0.2.0,>=0.1.0; extra == "all"
+Requires-Dist: anthropic<0.47.0,>=0.46.0; extra == "all"
+Requires-Dist: boto3<2.0.0,>=1.35.81; extra == "all"
+Requires-Dist: openai<2.0.0,>=1.14.0; extra == "all"
+Requires-Dist: groq<0.5.0,>=0.4.0; extra == "all"
+Requires-Dist: dashscope<2.0.0,>=1.13.0; extra == "all"
+Requires-Dist: requests<3.0.0,>=2.31.0; extra == "all"
+Description-Content-Type: text/markdown
+<div align="center">
+<h1>
+  <div class="image-wrapper" style="display: inline-block;">
+    <picture>
+      <source media="(prefers-color-scheme: dark)" alt="logo" height="150" srcset="../../img/logo_white.png" style="display: block; margin: auto;">
+      <source media="(prefers-color-scheme: light)" alt="logo" height="150" srcset="../../img/logo_black.png" style="display: block; margin: auto;">
+      <img alt="Shows my svg">
+    </picture>
+  </div>
+  [![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
+  [![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
+  [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
+  [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
+</h1>
+</div>
+**cua-agent** is a general Computer-Use framework for running multi-app agentic workflows targeting macOS and Linux sandboxes created with Cua, supporting local (Ollama) and cloud model providers (OpenAI, Anthropic, Groq, DeepSeek, Qwen).
+### Get started with Agent
+<div align="center">
+    <img src="../../img/agent.png"/>
+</div>
+## Install
+```bash
+pip install "cua-agent[all]"
+# or install specific loop providers
+pip install "cua-agent[openai]" # OpenAI Cua Loop
+pip install "cua-agent[anthropic]" # Anthropic Cua Loop
+pip install "cua-agent[omni]" # Cua Loop based on OmniParser
+```
+## Run
+```python
+async with Computer() as macos_computer:
+  # Create agent with loop and provider
+  agent = ComputerAgent(
+      computer=macos_computer,
+      loop=AgentLoop.OPENAI,
+      model=LLM(provider=LLMProvider.OPENAI)
+  )
+  tasks = [
+      "Look for a repository named trycua/cua on GitHub.",
+      "Check the open issues, open the most recent one and read it.",
+      "Clone the repository in users/lume/projects if it doesn't exist yet.",
+      "Open the repository with an app named Cursor (on the dock, black background and white cube icon).",
+      "From Cursor, open Composer if not already open.",
+      "Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.",
+  ]
+  for i, task in enumerate(tasks):
+      print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
+      async for result in agent.run(task):
+          print(result)
+      print(f"\n✅ Task {i+1}/{len(tasks)} completed: {task}")
+```
+Refer to these notebooks for step-by-step guides on how to use the Computer-Use Agent (CUA):
+- [Agent Notebook](../../notebooks/agent_nb.ipynb) - Complete examples and workflows
+## Agent Loops
+The `cua-agent` package provides three agent loop variations, based on different CUA model providers and techniques:
+| Agent Loop | Supported Models | Description | Set-Of-Marks |
+|:-----------|:-----------------|:------------|:-------------|
+| `AgentLoop.OPENAI` | • `computer_use_preview` | Use OpenAI Operator CUA model | Not Required |
+| `AgentLoop.ANTHROPIC` | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219` | Use Anthropic Computer-Use | Not Required |
+| `AgentLoop.OMNI` <br>(preview) | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219`<br>• `gpt-4.5-preview`<br>• `gpt-4o`<br>• `gpt-4`<br>• `gpt-3.5-turbo` | Use OmniParser for element pixel-detection (SoM) and any VLMs | OmniParser |
+## AgentResponse
+The `AgentResponse` class represents the structured output returned after each agent turn. It contains the agent's response, reasoning, tool usage, and other metadata. The response format aligns with the new [OpenAI Agent SDK specification](https://platform.openai.com/docs/api-reference/responses) for better consistency across different agent loops.
+```python
+async for result in agent.run(task):
+  print("Response ID: ", result.get("id"))
+  # Print detailed usage information
+  usage = result.get("usage")
+  if usage:
+      print("\nUsage Details:")
+      print(f"  Input Tokens: {usage.get('input_tokens')}")
+      if "input_tokens_details" in usage:
+          print(f"  Input Tokens Details: {usage.get('input_tokens_details')}")
+      print(f"  Output Tokens: {usage.get('output_tokens')}")
+      if "output_tokens_details" in usage:
+          print(f"  Output Tokens Details: {usage.get('output_tokens_details')}")
+      print(f"  Total Tokens: {usage.get('total_tokens')}")
+  print("Response Text: ", result.get("text"))
+  # Print tools information
+  tools = result.get("tools")
+  if tools:
+      print("\nTools:")
+      print(tools)
+  # Print reasoning and tool call outputs
+  outputs = result.get("output", [])
+  for output in outputs:
+      output_type = output.get("type")
+      if output_type == "reasoning":
+          print("\nReasoning Output:")
+          print(output)
+      elif output_type == "computer_call":
+          print("\nTool Call Output:")
+          print(output)
+```

cua_agent-0.1.18/README.md ADDED

@@ -0,0 +1,116 @@
+<div align="center">
+<h1>
+  <div class="image-wrapper" style="display: inline-block;">
+    <picture>
+      <source media="(prefers-color-scheme: dark)" alt="logo" height="150" srcset="../../img/logo_white.png" style="display: block; margin: auto;">
+      <source media="(prefers-color-scheme: light)" alt="logo" height="150" srcset="../../img/logo_black.png" style="display: block; margin: auto;">
+      <img alt="Shows my svg">
+    </picture>
+  </div>
+  [![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
+  [![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
+  [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
+  [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
+</h1>
+</div>
+**cua-agent** is a general Computer-Use framework for running multi-app agentic workflows targeting macOS and Linux sandboxes created with Cua, supporting local (Ollama) and cloud model providers (OpenAI, Anthropic, Groq, DeepSeek, Qwen).
+### Get started with Agent
+<div align="center">
+    <img src="../../img/agent.png"/>
+</div>
+## Install
+```bash
+pip install "cua-agent[all]"
+# or install specific loop providers
+pip install "cua-agent[openai]" # OpenAI Cua Loop
+pip install "cua-agent[anthropic]" # Anthropic Cua Loop
+pip install "cua-agent[omni]" # Cua Loop based on OmniParser
+```
+## Run
+```python
+async with Computer() as macos_computer:
+  # Create agent with loop and provider
+  agent = ComputerAgent(
+      computer=macos_computer,
+      loop=AgentLoop.OPENAI,
+      model=LLM(provider=LLMProvider.OPENAI)
+  )
+  tasks = [
+      "Look for a repository named trycua/cua on GitHub.",
+      "Check the open issues, open the most recent one and read it.",
+      "Clone the repository in users/lume/projects if it doesn't exist yet.",
+      "Open the repository with an app named Cursor (on the dock, black background and white cube icon).",
+      "From Cursor, open Composer if not already open.",
+      "Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.",
+  ]
+  for i, task in enumerate(tasks):
+      print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
+      async for result in agent.run(task):
+          print(result)
+      print(f"\n✅ Task {i+1}/{len(tasks)} completed: {task}")
+```
+Refer to these notebooks for step-by-step guides on how to use the Computer-Use Agent (CUA):
+- [Agent Notebook](../../notebooks/agent_nb.ipynb) - Complete examples and workflows
+## Agent Loops
+The `cua-agent` package provides three agent loop variations, based on different CUA model providers and techniques:
+| Agent Loop | Supported Models | Description | Set-Of-Marks |
+|:-----------|:-----------------|:------------|:-------------|
+| `AgentLoop.OPENAI` | • `computer_use_preview` | Use OpenAI Operator CUA model | Not Required |
+| `AgentLoop.ANTHROPIC` | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219` | Use Anthropic Computer-Use | Not Required |
+| `AgentLoop.OMNI` <br>(preview) | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219`<br>• `gpt-4.5-preview`<br>• `gpt-4o`<br>• `gpt-4`<br>• `gpt-3.5-turbo` | Use OmniParser for element pixel-detection (SoM) and any VLMs | OmniParser |
+## AgentResponse
+The `AgentResponse` class represents the structured output returned after each agent turn. It contains the agent's response, reasoning, tool usage, and other metadata. The response format aligns with the new [OpenAI Agent SDK specification](https://platform.openai.com/docs/api-reference/responses) for better consistency across different agent loops.
+```python
+async for result in agent.run(task):
+  print("Response ID: ", result.get("id"))
+  # Print detailed usage information
+  usage = result.get("usage")
+  if usage:
+      print("\nUsage Details:")
+      print(f"  Input Tokens: {usage.get('input_tokens')}")
+      if "input_tokens_details" in usage:
+          print(f"  Input Tokens Details: {usage.get('input_tokens_details')}")
+      print(f"  Output Tokens: {usage.get('output_tokens')}")
+      if "output_tokens_details" in usage:
+          print(f"  Output Tokens Details: {usage.get('output_tokens_details')}")
+      print(f"  Total Tokens: {usage.get('total_tokens')}")
+  print("Response Text: ", result.get("text"))
+  # Print tools information
+  tools = result.get("tools")
+  if tools:
+      print("\nTools:")
+      print(tools)
+  # Print reasoning and tool call outputs
+  outputs = result.get("output", [])
+  for output in outputs:
+      output_type = output.get("type")
+      if output_type == "reasoning":
+          print("\nReasoning Output:")
+          print(output)
+      elif output_type == "computer_call":
+          print("\nTool Call Output:")
+          print(output)
+```
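
The response-handling loop in the README above can be folded into one helper. The following is a minimal sketch, assuming each result is a plain dict carrying the keys the README reads (`id`, `usage`, `text`, `output`). The helper name `summarize_response` and the sample payload are hypothetical and are not part of cua-agent.

```python
from typing import Any, Dict, List


def summarize_response(result: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the fields the README prints into one flat summary dict."""
    usage = result.get("usage") or {}
    outputs: List[Dict[str, Any]] = result.get("output", [])
    return {
        "id": result.get("id"),
        "text": result.get("text"),
        "total_tokens": usage.get("total_tokens"),
        # Split outputs by the two types the README distinguishes.
        "reasoning": [o for o in outputs if o.get("type") == "reasoning"],
        "computer_calls": [o for o in outputs if o.get("type") == "computer_call"],
    }


# Hypothetical payload shaped like the keys read in the README example.
sample = {
    "id": "resp_1",
    "text": "done",
    "usage": {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
    "output": [{"type": "reasoning"}, {"type": "computer_call"}, {"type": "message"}],
}
summary = summarize_response(sample)
print(summary["total_tokens"], len(summary["computer_calls"]))
```

Because every lookup goes through `.get`, a result that omits `usage` or `output` still produces a summary, which fits the `total=False` declaration on `AgentResponse` in `agent/core/types.py`.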

{cua_agent-0.1.17 → cua_agent-0.1.18}/agent/__init__.py RENAMED

@@ -49,7 +49,7 @@ except Exception as e:
     logger.warning(f"Error initializing telemetry: {e}")
 from .providers.omni.types import LLMProvider, LLM
-from .core.loop import AgentLoop
-from .core.computer_agent import ComputerAgent
+from .core.factory import AgentLoop
+from .core.agent import ComputerAgent
 __all__ = ["AgentLoop", "LLMProvider", "LLM", "ComputerAgent"]

{cua_agent-0.1.17 → cua_agent-0.1.18}/agent/core/__init__.py RENAMED

@@ -1,6 +1,6 @@
 """Core agent components."""
-from .loop import BaseLoop
+from .factory import BaseLoop
 from .messages import (
     BaseMessageManager,
     ImageRetentionConfig,

cua_agent-0.1.17/agent/core/computer_agent.py → cua_agent-0.1.18/agent/core/agent.py RENAMED

@@ -3,32 +3,18 @@
 import asyncio
 import logging
 import os
-from typing import Any, AsyncGenerator, Dict, Optional, cast, List
+from typing import AsyncGenerator, Optional
 from computer import Computer
-from ..providers.anthropic.loop import AnthropicLoop
-from ..providers.omni.loop import OmniLoop
-from ..providers.omni.parser import OmniParser
-from ..providers.omni.types import LLMProvider, LLM
+from ..providers.omni.types import LLM
 from .. import AgentLoop
-from .messages import StandardMessageManager, ImageRetentionConfig
 from .types import AgentResponse
+from .factory import LoopFactory
+from .provider_config import DEFAULT_MODELS, ENV_VARS
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
-# Default models for different providers
-DEFAULT_MODELS = {
-    LLMProvider.OPENAI: "gpt-4o",
-    LLMProvider.ANTHROPIC: "claude-3-7-sonnet-20250219",
-}
-# Map providers to their environment variable names
-ENV_VARS = {
-    LLMProvider.OPENAI: "OPENAI_API_KEY",
-    LLMProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
-}
 class ComputerAgent:
     """A computer agent that can perform automated tasks using natural language instructions."""
@@ -98,35 +84,27 @@ class ComputerAgent:
                     f"No model specified for provider {self.provider} and no default found"
                 )
-        # Ensure computer is properly cast for typing purposes
-        computer_instance = self.computer
         # Get API key from environment if not provided
         actual_api_key = api_key or os.environ.get(ENV_VARS[self.provider], "")
         if not actual_api_key:
             raise ValueError(f"No API key provided for {self.provider}")
-        # Initialize the appropriate loop based on the loop parameter
-        if loop == AgentLoop.ANTHROPIC:
-            self._loop = AnthropicLoop(
-                api_key=actual_api_key,
-                model=actual_model_name,
-                computer=computer_instance,
-                save_trajectory=save_trajectory,
-                base_dir=trajectory_dir,
-                only_n_most_recent_images=only_n_most_recent_images,
-            )
-        else:
-            self._loop = OmniLoop(
+        # Create the appropriate loop using the factory
+        try:
+            # Let the factory create the appropriate loop with needed components
+            self._loop = LoopFactory.create_loop(
+                loop_type=loop,
                 provider=self.provider,
+                computer=self.computer,
+                model_name=actual_model_name,
                 api_key=actual_api_key,
-                model=actual_model_name,
-                computer=computer_instance,
                 save_trajectory=save_trajectory,
-                base_dir=trajectory_dir,
+                trajectory_dir=trajectory_dir,
                 only_n_most_recent_images=only_n_most_recent_images,
-                parser=OmniParser(),
             )
+        except ValueError as e:
+            logger.error(f"Failed to create loop: {str(e)}")
+            raise
         # Initialize the message manager from the loop
         self.message_manager = self._loop.message_manager
@@ -152,21 +130,6 @@ class ComputerAgent:
             else:
                 logger.info("Computer already initialized, skipping initialization")
-            # Take a test screenshot to verify the computer is working
-            logger.info("Testing computer with a screenshot...")
-            try:
-                test_screenshot = await self.computer.interface.screenshot()
-                # Determine the screenshot size based on its type
-                if isinstance(test_screenshot, (bytes, bytearray, memoryview)):
-                    size = len(test_screenshot)
-                elif hasattr(test_screenshot, "base64_image"):
-                    size = len(test_screenshot.base64_image)
-                else:
-                    size = "unknown"
-                logger.info(f"Screenshot test successful, size: {size}")
-            except Exception as e:
-                logger.error(f"Screenshot test failed: {str(e)}")
-                # Even though screenshot failed, we continue since some tests might not need it
         except Exception as e:
             logger.error(f"Error initializing computer in __aenter__: {str(e)}")
             raise
@@ -232,7 +195,6 @@ class ComputerAgent:
             # Execute the task and yield results
             async for result in self._loop.run(self.message_manager.messages):
-                # Yield the result to the caller
                 yield result
         except Exception as e:

cua_agent-0.1.17/agent/core/loop.py → cua_agent-0.1.18/agent/core/base.py RENAMED

@@ -1,35 +1,21 @@
-"""Base agent loop implementation."""
+"""Base loop definitions."""
 import logging
 import asyncio
 from abc import ABC, abstractmethod
-from enum import Enum, auto
-from typing import Any, AsyncGenerator, Dict, List, Optional, Tuple
-from datetime import datetime
+from typing import Any, AsyncGenerator, Dict, List, Optional
 from computer import Computer
-from .experiment import ExperimentManager
 from .messages import StandardMessageManager, ImageRetentionConfig
 from .types import AgentResponse
+from .experiment import ExperimentManager
 logger = logging.getLogger(__name__)
-class AgentLoop(Enum):
-    """Enumeration of available loop types."""
-    ANTHROPIC = auto()  # Anthropic implementation
-    OMNI = auto()  # OmniLoop implementation
-    # Add more loop types as needed
 class BaseLoop(ABC):
     """Base class for agent loops that handle message processing and tool execution."""
-    ###########################################
-    # INITIALIZATION AND CONFIGURATION
-    ###########################################
     def __init__(
         self,
         computer: Computer,
@@ -68,6 +54,11 @@ class BaseLoop(ABC):
         self.only_n_most_recent_images = only_n_most_recent_images
         self._kwargs = kwargs
+        # Initialize message manager
+        self.message_manager = StandardMessageManager(
+            config=ImageRetentionConfig(num_images_to_keep=only_n_most_recent_images)
+        )
         # Initialize experiment manager
         if self.save_trajectory and self.base_dir:
             self.experiment_manager = ExperimentManager(
@@ -110,8 +101,7 @@ class BaseLoop(ABC):
                     )
                     raise RuntimeError(f"Failed to initialize: {str(e)}")
-        ###########################################
+    ###########################################
     # ABSTRACT METHODS TO BE IMPLEMENTED BY SUBCLASSES
     ###########################################
@@ -125,17 +115,14 @@ class BaseLoop(ABC):
         raise NotImplementedError
     @abstractmethod
-    async def run(self, messages: List[Dict[str, Any]]) -> AsyncGenerator[AgentResponse, None]:
+    def run(self, messages: List[Dict[str, Any]]) -> AsyncGenerator[AgentResponse, None]:
         """Run the agent loop with provided messages.
-        This method handles the main agent loop including message processing,
-        API calls, response handling, and action execution.
         Args:
             messages: List of message objects
-        Yields:
-            Agent response format
+        Returns:
+            An async generator that yields agent responses
         """
         raise NotImplementedError
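
The hunk above changes the abstract `run` from `async def` to a plain `def` that is annotated as returning an `AsyncGenerator`. A subclass can still implement it as an async generator function, and callers iterate the return value without awaiting the call. The sketch below illustrates that contract under stated assumptions: `SketchBaseLoop` and `EchoLoop` are hypothetical stand-ins, not classes from cua-agent, and the response dicts are simplified.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, AsyncGenerator, Dict, List


class SketchBaseLoop(ABC):
    """Stand-in for BaseLoop; only the run() contract is modelled."""

    @abstractmethod
    def run(self, messages: List[Dict[str, Any]]) -> AsyncGenerator[Dict[str, Any], None]:
        raise NotImplementedError


class EchoLoop(SketchBaseLoop):
    """Hypothetical subclass: an async generator function satisfies the contract."""

    async def run(self, messages):
        for message in messages:
            yield {"text": message["content"]}


async def collect(loop: SketchBaseLoop, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # No await on run(): the call returns the generator, which is then iterated.
    return [response async for response in loop.run(messages)]


results = asyncio.run(collect(EchoLoop(), [{"content": "hi"}, {"content": "bye"}]))
print(results)
```

One plausible motivation, not stated in the diff, is static typing: an `async def` with no `yield` in its body is read by type checkers as a coroutine that returns a generator, which does not match how `ComputerAgent` consumes `self._loop.run(...)` with `async for`.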

cua_agent-0.1.18/agent/core/factory.py ADDED

@@ -0,0 +1,104 @@
+"""Base agent loop implementation."""
+import logging
+import importlib.util
+from typing import Dict, Optional, Type, TYPE_CHECKING, Any, cast, Callable, Awaitable
+from computer import Computer
+from .types import AgentLoop
+from .base import BaseLoop
+# For type checking only
+if TYPE_CHECKING:
+    from ..providers.omni.types import LLMProvider
+logger = logging.getLogger(__name__)
+class LoopFactory:
+    """Factory class for creating agent loops."""
+    # Registry to store loop implementations
+    _loop_registry: Dict[AgentLoop, Type[BaseLoop]] = {}
+    @classmethod
+    def create_loop(
+        cls,
+        loop_type: AgentLoop,
+        api_key: str,
+        model_name: str,
+        computer: Computer,
+        provider: Any = None,
+        save_trajectory: bool = True,
+        trajectory_dir: str = "trajectories",
+        only_n_most_recent_images: Optional[int] = None,
+        acknowledge_safety_check_callback: Optional[Callable[[str], Awaitable[bool]]] = None,
+    ) -> BaseLoop:
+        """Create and return an appropriate loop instance based on type."""
+        if loop_type == AgentLoop.ANTHROPIC:
+            # Lazy import AnthropicLoop only when needed
+            try:
+                from ..providers.anthropic.loop import AnthropicLoop
+            except ImportError:
+                raise ImportError(
+                    "The 'anthropic' provider is not installed. "
+                    "Install it with 'pip install cua-agent[anthropic]'"
+                )
+            return AnthropicLoop(
+                api_key=api_key,
+                model=model_name,
+                computer=computer,
+                save_trajectory=save_trajectory,
+                base_dir=trajectory_dir,
+                only_n_most_recent_images=only_n_most_recent_images,
+            )
+        elif loop_type == AgentLoop.OPENAI:
+            # Lazy import OpenAILoop only when needed
+            try:
+                from ..providers.openai.loop import OpenAILoop
+            except ImportError:
+                raise ImportError(
+                    "The 'openai' provider is not installed. "
+                    "Install it with 'pip install cua-agent[openai]'"
+                )
+            return OpenAILoop(
+                api_key=api_key,
+                model=model_name,
+                computer=computer,
+                save_trajectory=save_trajectory,
+                base_dir=trajectory_dir,
+                only_n_most_recent_images=only_n_most_recent_images,
+                acknowledge_safety_check_callback=acknowledge_safety_check_callback,
+            )
+        elif loop_type == AgentLoop.OMNI:
+            # Lazy import OmniLoop and related classes only when needed
+            try:
+                from ..providers.omni.loop import OmniLoop
+                from ..providers.omni.parser import OmniParser
+                from ..providers.omni.types import LLMProvider
+            except ImportError:
+                raise ImportError(
+                    "The 'omni' provider is not installed. "
+                    "Install it with 'pip install cua-agent[all]'"
+                )
+            if provider is None:
+                raise ValueError("Provider is required for OMNI loop type")
+            # We know provider is the correct type at this point, so cast it
+            provider_instance = cast(LLMProvider, provider)
+            return OmniLoop(
+                provider=provider_instance,
+                api_key=api_key,
+                model=model_name,
+                computer=computer,
+                save_trajectory=save_trajectory,
+                base_dir=trajectory_dir,
+                only_n_most_recent_images=only_n_most_recent_images,
+                parser=OmniParser(),
+            )
+        else:
+            raise ValueError(f"Unsupported loop type: {loop_type}")
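
`LoopFactory.create_loop` above imports each provider only inside the branch that needs it and converts a failed import into an installation hint. The same pattern can be sketched in isolation. This is a minimal illustration under assumptions: `LoopKind`, `_BACKENDS`, `load_backend`, the extra names, and the package name `example-package` are all hypothetical, and the standard-library `json` module stands in for an installed provider.

```python
import importlib
from enum import Enum, auto
from types import ModuleType


class LoopKind(Enum):
    """Stand-in for AgentLoop."""
    JSON_BACKED = auto()
    MISSING = auto()
    UNREGISTERED = auto()


# Hypothetical mapping: loop kind -> (module to import lazily, extra to suggest).
_BACKENDS = {
    LoopKind.JSON_BACKED: ("json", "json-extra"),
    LoopKind.MISSING: ("module_that_does_not_exist_xyz", "missing-extra"),
}


def load_backend(kind: LoopKind) -> ModuleType:
    """Import the backend only when requested; turn a missing one into an install hint."""
    if kind not in _BACKENDS:
        raise ValueError(f"Unsupported loop type: {kind}")
    module_name, extra = _BACKENDS[kind]
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the real cause stays visible in the traceback.
        raise ImportError(
            f"The '{extra}' provider is not installed. "
            f"Install it with 'pip install example-package[{extra}]'"
        ) from exc


print(load_backend(LoopKind.JSON_BACKED).__name__)
```

The sketch uses an explicit `from exc`, whereas the factory in the diff re-raises inside the `except` block without it. Both keep the original exception attached, but the explicit form marks it as the direct cause.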

cua_agent-0.1.18/agent/core/provider_config.py ADDED

@@ -0,0 +1,15 @@
+"""Provider-specific configurations and constants."""
+from ..providers.omni.types import LLMProvider
+# Default models for different providers
+DEFAULT_MODELS = {
+    LLMProvider.OPENAI: "gpt-4o",
+    LLMProvider.ANTHROPIC: "claude-3-7-sonnet-20250219",
+}
+# Map providers to their environment variable names
+ENV_VARS = {
+    LLMProvider.OPENAI: "OPENAI_API_KEY",
+    LLMProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
+}
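
The two tables above feed the model and API-key resolution in `agent/core/agent.py` (`api_key or os.environ.get(ENV_VARS[self.provider], "")`). The sketch below restates that lookup as a standalone function. It rests on assumptions: `Provider` is a stand-in for `LLMProvider` whose member values are invented here, and `resolve_credentials` with its injectable `env` parameter is a hypothetical helper, not part of the package.

```python
import os
from enum import Enum
from typing import Mapping, Optional, Tuple


class Provider(Enum):
    """Stand-in for LLMProvider; the string values are assumptions."""
    OPENAI = "openai"
    ANTHROPIC = "anthropic"


DEFAULT_MODELS = {
    Provider.OPENAI: "gpt-4o",
    Provider.ANTHROPIC: "claude-3-7-sonnet-20250219",
}
ENV_VARS = {
    Provider.OPENAI: "OPENAI_API_KEY",
    Provider.ANTHROPIC: "ANTHROPIC_API_KEY",
}


def resolve_credentials(
    provider: Provider,
    model: Optional[str] = None,
    api_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (model_name, api_key), falling back to defaults and the environment."""
    model_name = model or DEFAULT_MODELS.get(provider)
    if not model_name:
        raise ValueError(f"No model specified for provider {provider} and no default found")
    # An explicit key wins; otherwise read the provider's environment variable.
    key = api_key or env.get(ENV_VARS[provider], "")
    if not key:
        raise ValueError(f"No API key provided for {provider}")
    return model_name, key


# Demo with an explicit mapping instead of the real process environment.
print(resolve_credentials(Provider.OPENAI, env={"OPENAI_API_KEY": "sk-test"}))
```

Passing `env` explicitly keeps the lookup testable without touching the process environment.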

{cua_agent-0.1.17 → cua_agent-0.1.18}/agent/core/types.py RENAMED

@@ -1,6 +1,16 @@
 """Core type definitions."""
 from typing import Any, Dict, List, Optional, TypedDict, Union
+from enum import Enum, auto
+class AgentLoop(Enum):
+    """Enumeration of available loop types."""
+    ANTHROPIC = auto()  # Anthropic implementation
+    OMNI = auto()  # OmniLoop implementation
+    OPENAI = auto()  # OpenAI implementation
+    # Add more loop types as needed
 class AgentResponse(TypedDict, total=False):
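
The `AgentLoop` enum moved into `types.py` gains `OPENAI` as its last member. Since `auto()` numbers members in definition order starting at 1, appending the new member leaves the values of `ANTHROPIC` and `OMNI` as they were in 0.1.17. The snippet below copies the enum from the hunk above to show this. Whether any caller depends on the numeric values is an assumption, since the diff only shows comparisons by member.

```python
from enum import Enum, auto


class AgentLoop(Enum):
    """Copy of the enum added in agent/core/types.py."""
    ANTHROPIC = auto()  # Anthropic implementation
    OMNI = auto()  # OmniLoop implementation
    OPENAI = auto()  # OpenAI implementation


# Members can be looked up by name, which is stable regardless of numeric value.
selected = AgentLoop["OPENAI"]
values = {member.name: member.value for member in AgentLoop}
print(selected, values)
```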

{cua_agent-0.1.17 → cua_agent-0.1.18}/agent/providers/anthropic/loop.py RENAMED

@@ -16,7 +16,7 @@ from datetime import datetime
 from computer import Computer
 # Base imports
-from ...core.loop import BaseLoop
+from ...core.base import BaseLoop
 from ...core.messages import StandardMessageManager, ImageRetentionConfig
 from ...core.types import AgentResponse

{cua_agent-0.1.17 → cua_agent-0.1.18}/agent/providers/anthropic/response_handler.py RENAMED

@@ -1,14 +1,11 @@
 """Response and tool handling for Anthropic provider."""
 import logging
-from typing import Any, Dict, List, Optional, Tuple, cast
+from typing import Any, Dict, List, Tuple, cast
 from anthropic.types.beta import (
     BetaMessage,
-    BetaMessageParam,
     BetaTextBlock,
-    BetaTextBlockParam,
-    BetaToolUseBlockParam,
     BetaContentBlockParam,
 )

{cua_agent-0.1.17 → cua_agent-0.1.18}/agent/providers/anthropic/utils.py RENAMED

@@ -1,14 +1,12 @@
 """Utility functions for Anthropic message handling."""
-import time
 import logging
 import re
 from typing import Any, Dict, List, Optional, Tuple, cast
-from anthropic.types.beta import BetaMessage, BetaMessageParam, BetaTextBlock
+from anthropic.types.beta import BetaMessage
 from ..omni.parser import ParseResult
 from ...core.types import AgentResponse
 from datetime import datetime
-import json
 # Configure module logger
 logger = logging.getLogger(__name__)
