PyPI - cua-agent - Versions diffs - 0.1.0__tar.gz → 0.1.1__tar.gz - Mend

cua-agent 0.1.0tar.gz → 0.1.1tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of cua-agent might be problematic. Click here for more details.

Files changed (67) hide show

{cua_agent-0.1.0 → cua_agent-0.1.1}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: cua-agent
-Version: 0.1.0
+Version: 0.1.1
 Summary: CUA (Computer Use) Agent for AI-driven computer interaction
 Author-Email: TryCua <gh@trycua.com>
 Requires-Python: <3.13,>=3.10

cua_agent-0.1.1/README.md ADDED Viewed

@@ -0,0 +1,126 @@
+<div align="center">
+<h1>
+  <div class="image-wrapper" style="display: inline-block;">
+    <picture>
+      <source media="(prefers-color-scheme: dark)" alt="logo" height="150" srcset="../../img/logo_white.png" style="display: block; margin: auto;">
+      <source media="(prefers-color-scheme: light)" alt="logo" height="150" srcset="../../img/logo_black.png" style="display: block; margin: auto;">
+      <img alt="Shows my svg">
+    </picture>
+  </div>
+  [![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
+  [![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
+  [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
+  [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
+</h1>
+</div>
+**Agent** is a Computer Use (CUA) framework for running multi-app agentic workflows targeting macOS and Linux sandbox, supporting local (Ollama) and cloud model providers (OpenAI, Anthropic, Groq, DeepSeek, Qwen). The framework integrates with Microsoft's OmniParser for enhanced UI understanding and interaction.
+### Get started with Agent
+```python
+from agent import ComputerAgent, AgenticLoop, APIProvider
+from computer import Computer
+computer = Computer(verbosity=logging.INFO)
+agent = ComputerAgent(
+    computer=computer,
+    api_key="<your-anthropic-api-key>",
+    loop_type=AgenticLoop.ANTHROPIC, # or AgenticLoop.OMNI
+    ai_provider=APIProvider.ANTHROPIC,
+    model='claude-3-7-sonnet-20250219',
+    save_trajectory=True,
+    trajectory_dir=str(Path("trajectories") / datetime.now().strftime("%Y%m%d_%H%M%S")),
+    only_n_most_recent_images=3,
+    verbosity=logging.INFO,
+)
+tasks = [
+"""
+Please help me with the following task:
+1. Open Safari browser
+2. Go to Wikipedia.org
+3. Search for "Claude AI"
+4. Summarize the main points you find about Claude AI
+"""
+]
+async with agent:
+    for i, task in enumerate(tasks, 1):
+        print(f"\nExecuting task {i}/{len(tasks)}: {task}")
+        async for result in agent.run(task):
+            print(result)
+        print(f"Task {i} completed")
+```
+## Install
+### cua-agent
+```bash
+pip install cua-agent[all]
+# or install specific loop providers
+pip install cua-agent[anthropic]
+pip install cua-agent[omni]
+```
+## Features
+### OmniParser Integration
+- Enhanced UI understanding with element detection
+- Automatic bounding box detection for UI elements
+- Improved accuracy for complex UI interactions
+- Support for icon and text element recognition
+### Basic Computer Control
+- Direct keyboard and mouse control
+- Window and application management
+- Screenshot capabilities
+- Basic UI element detection
+### Provider Support
+- OpenAI (GPT-4V) - Recommended for OmniParser integration
+- Anthropic (Claude) - Strong general performance
+- Groq - Fast inference with Llama models
+- DeepSeek - Alternative model provider
+- Qwen - Alibaba's multimodal model
+## Run
+Refer to these notebooks for step-by-step guides on how to use the Computer-Use Agent (CUA):
+- [Agent Notebook](../../notebooks/agent_nb.ipynb) - Complete examples and workflows
+## Components
+The library consists of several components:
+- **Core**
+  - `ComputerAgent`: Unified agent class supporting multiple loop types
+  - `BaseComputerAgent`: Abstract base class for computer agents
+- **Providers**
+  - `Anthropic`: Implementation for Anthropic Claude models
+  - `Omni`: Implementation for multiple providers (OpenAI, Groq, etc.)
+- **Loops**
+  - `AnthropicLoop`: Loop implementation for Anthropic
+  - `OmniLoop`: Generic loop supporting multiple providers
+## Configuration
+The agent can be configured with various parameters:
+- **loop_type**: The type of loop to use (ANTHROPIC or OMNI)
+- **provider**: AI provider to use with the loop
+- **model**: The AI model to use
+- **save_trajectory**: Whether to save screenshots and logs
+- **only_n_most_recent_images**: Only keep a specific number of recent images
+See the [Core README](./agent/core/README.md) for more details on the unified agent.

{cua_agent-0.1.0 → cua_agent-0.1.1}/agent/__init__.py RENAMED Viewed

@@ -5,6 +5,6 @@ __version__ = "0.1.0"
 from .core.factory import AgentFactory
 from .core.agent import ComputerAgent
 from .types.base import Provider, AgenticLoop
-from .providers.omni.types import APIProvider
+from .providers.omni.types import LLMProvider, LLM, Model, LLMModel, APIProvider
-__all__ = ["AgentFactory", "Provider", "ComputerAgent", "AgenticLoop", "APIProvider"]
+__all__ = ["AgentFactory", "Provider", "ComputerAgent", "AgenticLoop", "LLMProvider", "LLM", "Model", "LLMModel", "APIProvider"]

{cua_agent-0.1.0 → cua_agent-0.1.1}/agent/core/agent.py RENAMED Viewed

@@ -3,7 +3,7 @@
 import os
 import logging
 import asyncio
-from typing import Any, AsyncGenerator, Dict, List, Optional, TYPE_CHECKING
+from typing import Any, AsyncGenerator, Dict, List, Optional, TYPE_CHECKING, Union, cast
 from datetime import datetime
 from computer import Computer
@@ -17,23 +17,23 @@ if TYPE_CHECKING:
     from ..providers.omni.loop import OmniLoop
     from ..providers.omni.parser import OmniParser
-# Import the APIProvider enum without importing the whole module
-from ..providers.omni.types import APIProvider
+# Import the provider types
+from ..providers.omni.types import LLMProvider, LLM, Model, LLMModel, APIProvider
 logger = logging.getLogger(__name__)
 # Default models for different providers
 DEFAULT_MODELS = {
-    APIProvider.OPENAI: "gpt-4o",
-    APIProvider.ANTHROPIC: "claude-3-7-sonnet-20250219",
-    APIProvider.GROQ: "llama3-70b-8192",
+    LLMProvider.OPENAI: "gpt-4o",
+    LLMProvider.ANTHROPIC: "claude-3-7-sonnet-20250219",
+    LLMProvider.GROQ: "llama3-70b-8192",
 }
 # Map providers to their environment variable names
 ENV_VARS = {
-    APIProvider.OPENAI: "OPENAI_API_KEY",
-    APIProvider.GROQ: "GROQ_API_KEY",
-    APIProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
+    LLMProvider.OPENAI: "OPENAI_API_KEY",
+    LLMProvider.GROQ: "GROQ_API_KEY",
+    LLMProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
 }
@@ -48,9 +48,8 @@ class ComputerAgent(BaseComputerAgent):
         self,
         computer: Computer,
         loop_type: AgenticLoop = AgenticLoop.OMNI,
-        ai_provider: APIProvider = APIProvider.OPENAI,
+        model: Optional[Union[LLM, Dict[str, str], str]] = None,
         api_key: Optional[str] = None,
-        model: Optional[str] = None,
         save_trajectory: bool = True,
         trajectory_dir: Optional[str] = "trajectories",
         only_n_most_recent_images: Optional[int] = None,
@@ -63,9 +62,12 @@ class ComputerAgent(BaseComputerAgent):
         Args:
             computer: Computer instance to control
             loop_type: The type of loop to use (Anthropic or Omni)
-            ai_provider: AI provider to use (required for Cua loop)
+            model: LLM configuration. Can be:
+                  - LLM object with provider and name
+                  - Dict with 'provider' and 'name' keys
+                  - String with model name (defaults to OpenAI provider)
+                  - None (defaults based on loop_type)
             api_key: Optional API key (will use environment variable if not provided)
-            model: Optional model name (will use provider default if not specified)
             save_trajectory: Whether to save screenshots and logs
             trajectory_dir: Directory to save trajectories (defaults to "trajectories")
             only_n_most_recent_images: Limit history to N most recent images
@@ -88,7 +90,6 @@ class ComputerAgent(BaseComputerAgent):
         )
         self.loop_type = loop_type
-        self.provider = ai_provider
         self.save_trajectory = save_trajectory
         self.trajectory_dir = trajectory_dir
         self.only_n_most_recent_images = only_n_most_recent_images
@@ -98,14 +99,19 @@ class ComputerAgent(BaseComputerAgent):
         # Configure logging based on verbosity
         self._configure_logging(verbosity)
+        # Process model configuration
+        self.model_config = self._process_model_config(model, loop_type)
         # Get API key from environment if not provided
         if api_key is None:
             env_var = (
-                ENV_VARS.get(ai_provider) if loop_type == AgenticLoop.OMNI else "ANTHROPIC_API_KEY"
+                ENV_VARS.get(self.model_config.provider)
+                if loop_type == AgenticLoop.OMNI
+                else "ANTHROPIC_API_KEY"
             )
             if not env_var:
                 raise ValueError(
-                    f"Unsupported provider: {ai_provider}. Please use one of: {list(ENV_VARS.keys())}"
+                    f"Unsupported provider: {self.model_config.provider}. Please use one of: {list(ENV_VARS.keys())}"
                 )
             api_key = os.environ.get(env_var)
@@ -119,17 +125,51 @@ class ComputerAgent(BaseComputerAgent):
                 )
         self.api_key = api_key
-        # Set model based on provider if not specified
-        if model is None:
-            if loop_type == AgenticLoop.OMNI:
-                self.model = DEFAULT_MODELS[ai_provider]
-            else:  # Anthropic loop
-                self.model = DEFAULT_MODELS[APIProvider.ANTHROPIC]
-        else:
-            self.model = model
         # Initialize the appropriate loop based on loop_type
         self.loop = self._init_loop()
+    def _process_model_config(
+        self, model_input: Optional[Union[LLM, Dict[str, str], str]], loop_type: AgenticLoop
+    ) -> LLM:
+        """Process and normalize model configuration.
+        Args:
+            model_input: Input model configuration (LLM, dict, string, or None)
+            loop_type: The loop type being used
+        Returns:
+            Normalized LLM instance
+        """
+        # Handle case where model_input is None
+        if model_input is None:
+            # Use Anthropic for Anthropic loop, OpenAI for Omni loop
+            default_provider = (
+                LLMProvider.ANTHROPIC if loop_type == AgenticLoop.ANTHROPIC else LLMProvider.OPENAI
+            )
+            return LLM(provider=default_provider)
+        # Handle case where model_input is already a LLM or one of its aliases
+        if isinstance(model_input, (LLM, Model, LLMModel)):
+            return model_input
+        # Handle case where model_input is a dict
+        if isinstance(model_input, dict):
+            provider = model_input.get("provider", LLMProvider.OPENAI)
+            if isinstance(provider, str):
+                provider = LLMProvider(provider)
+            return LLM(
+                provider=provider,
+                name=model_input.get("name")
+            )
+        # Handle case where model_input is a string (model name)
+        if isinstance(model_input, str):
+            default_provider = (
+                LLMProvider.ANTHROPIC if loop_type == AgenticLoop.ANTHROPIC else LLMProvider.OPENAI
+            )
+            return LLM(provider=default_provider, name=model_input)
+        raise ValueError(f"Unsupported model configuration: {model_input}")
     def _configure_logging(self, verbosity: int):
         """Configure logging based on verbosity level."""
@@ -162,9 +202,12 @@ class ComputerAgent(BaseComputerAgent):
         if self.loop_type == AgenticLoop.ANTHROPIC:
             from ..providers.anthropic.loop import AnthropicLoop
+            # Ensure we always have a valid model name
+            model_name = self.model_config.name or DEFAULT_MODELS[LLMProvider.ANTHROPIC]
             return AnthropicLoop(
                 api_key=self.api_key,
-                model=self.model,
+                model=model_name,
                 computer=self.computer,
                 save_trajectory=self.save_trajectory,
                 base_dir=self.trajectory_dir,
@@ -176,10 +219,13 @@ class ComputerAgent(BaseComputerAgent):
         if "parser" not in self._kwargs:
             self._kwargs["parser"] = OmniParser()
+        # Ensure we always have a valid model name
+        model_name = self.model_config.name or DEFAULT_MODELS[self.model_config.provider]
         return OmniLoop(
-            provider=self.provider,
+            provider=self.model_config.provider,
             api_key=self.api_key,
-            model=self.model,
+            model=model_name,
             computer=self.computer,
             save_trajectory=self.save_trajectory,
             base_dir=self.trajectory_dir,

{cua_agent-0.1.0 → cua_agent-0.1.1}/agent/core/messages.py RENAMED Viewed

@@ -37,6 +37,17 @@ class BaseMessageManager:
         if self.image_retention_config.min_removal_threshold < 1:
             raise ValueError("min_removal_threshold must be at least 1")
+        # Track provider for message formatting
+        self.provider = "openai"  # Default provider
+    def set_provider(self, provider: str) -> None:
+        """Set the current provider to format messages for.
+        Args:
+            provider: Provider name (e.g., 'openai', 'anthropic')
+        """
+        self.provider = provider.lower()
     def prepare_messages(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
         """Prepare messages by applying image retention and caching as configured.
@@ -96,6 +107,10 @@ class BaseMessageManager:
         Args:
             messages: Messages to inject caching into
         """
+        # Only apply cache_control for Anthropic API, not OpenAI
+        if self.provider != "anthropic":
+            return
         # Default to caching last 3 turns
         turns_to_cache = 3
         for message in reversed(messages):

{cua_agent-0.1.0 → cua_agent-0.1.1}/agent/providers/omni/loop.py RENAMED Viewed

@@ -219,9 +219,13 @@ class OmniLoop(BaseLoop):
                     if self.client is None:
                         raise RuntimeError("Failed to initialize client")
+                # Set the provider in message manager based on current provider
+                provider_name = str(self.provider).split(".")[-1].lower()  # Extract name from enum
+                self.message_manager.set_provider(provider_name)
                 # Apply image retention and prepare messages
                 # This will limit the number of images based on only_n_most_recent_images
-                prepared_messages = self.message_manager.prepare_messages(messages.copy())
+                prepared_messages = self.message_manager.get_formatted_messages(provider_name)
                 # Filter out system messages for Anthropic
                 if self.provider == APIProvider.ANTHROPIC:

{cua_agent-0.1.0 → cua_agent-0.1.1}/agent/providers/omni/messages.py RENAMED Viewed

@@ -103,6 +103,9 @@ class OmniMessageManager(BaseMessageManager):
         Returns:
             List of formatted messages
         """
+        # Set the provider for message formatting
+        self.set_provider(provider)
         if provider == "anthropic":
             return self._format_for_anthropic()
         elif provider == "openai":

cua_agent-0.1.1/agent/providers/omni/types.py ADDED Viewed

@@ -0,0 +1,53 @@
+"""Type definitions for the Omni provider."""
+from enum import StrEnum
+from typing import Dict, Optional
+from dataclasses import dataclass
+class LLMProvider(StrEnum):
+    """Supported LLM providers."""
+    ANTHROPIC = "anthropic"
+    OPENAI = "openai"
+    GROQ = "groq"
+    QWEN = "qwen"
+# For backward compatibility
+APIProvider = LLMProvider
+@dataclass
+class LLM:
+    """Configuration for LLM model and provider."""
+    provider: LLMProvider
+    name: Optional[str] = None
+    def __post_init__(self):
+        """Set default model name if not provided."""
+        if self.name is None:
+            self.name = PROVIDER_TO_DEFAULT_MODEL.get(self.provider)
+# For backward compatibility
+LLMModel = LLM
+Model = LLM
+# Default models for each provider
+PROVIDER_TO_DEFAULT_MODEL: Dict[LLMProvider, str] = {
+    LLMProvider.ANTHROPIC: "claude-3-7-sonnet-20250219",
+    LLMProvider.OPENAI: "gpt-4o",
+    LLMProvider.GROQ: "deepseek-r1-distill-llama-70b",
+    LLMProvider.QWEN: "qwen2.5-vl-72b-instruct",
+}
+# Environment variable names for each provider
+PROVIDER_TO_ENV_VAR: Dict[LLMProvider, str] = {
+    LLMProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
+    LLMProvider.OPENAI: "OPENAI_API_KEY",
+    LLMProvider.GROQ: "GROQ_API_KEY",
+    LLMProvider.QWEN: "QWEN_API_KEY",
+}

{cua_agent-0.1.0 → cua_agent-0.1.1}/pyproject.toml RENAMED Viewed

@@ -6,7 +6,7 @@ build-backend = "pdm.backend"
 [project]
 name = "cua-agent"
-version = "0.1.0"
+version = "0.1.1"
 description = "CUA (Computer Use) Agent for AI-driven computer interaction"
 authors = [
     { name = "TryCua", email = "gh@trycua.com" },
@@ -78,7 +78,7 @@ target-version = [
 [tool.ruff]
 line-length = 100
-target-version = "0.1.0"
+target-version = "0.1.1"
 select = [
     "E",
     "F",
@@ -92,7 +92,7 @@ docstring-code-format = true
 [tool.mypy]
 strict = true
-python_version = "0.1.0"
+python_version = "0.1.1"
 ignore_missing_imports = true
 disallow_untyped_defs = true
 check_untyped_defs = true

cua_agent-0.1.0/README.md DELETED Viewed

@@ -1,213 +0,0 @@
-<div align="center">
-<h1>
-  <div class="image-wrapper" style="display: inline-block;">
-    <picture>
-      <source media="(prefers-color-scheme: dark)" alt="logo" height="150" srcset="../../img/logo_white.png" style="display: block; margin: auto;">
-      <source media="(prefers-color-scheme: light)" alt="logo" height="150" srcset="../../img/logo_black.png" style="display: block; margin: auto;">
-      <img alt="Shows my svg">
-    </picture>
-  </div>
-  [![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
-  [![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
-  [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
-  [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
-</h1>
-</div>
-**Agent** is a Computer Use (CUA) framework for running multi-app agentic workflows targeting macOS and Linux sandbox, supporting local (Ollama) and cloud model providers (OpenAI, Anthropic, Groq, DeepSeek, Qwen). The framework integrates with Microsoft's OmniParser for enhanced UI understanding and interaction.
-### Get started with Agent
-There are two ways to use the agent: with OmniParser for enhanced UI understanding (recommended) or with basic computer control.
-#### Option 1: With OmniParser (Recommended)
-<div align="center">
-    <img src="../../img/agent.png"/>
-</div>
-```python
-from agent.providers.omni import OmniComputerAgent, APIProvider
-# Set your API key
-export OPENAI_API_KEY="your-openai-api-key"
-# Initialize agent with OmniParser for enhanced UI understanding
-agent = OmniComputerAgent(
-    provider=APIProvider.OPENAI,
-    model="gpt-4o",
-    start_omniparser=True  # Automatically starts OmniParser server
-)
-task = """
-1. Search the ai-gradio repo on GitHub.
-2. Clone it to the desktop.
-3. Open the repo with the Cursor app.
-4. Work with Cursor to add a new provider for Cua.
-"""
-async with agent:  # Ensures proper cleanup
-    async for result in agent.run(task):
-        print(result)
-```
-#### Option 2: Basic Computer Control
-```python
-from agent.computer_agent import ComputerAgent
-from agent.base.types import Provider
-# Set your API key (supports any provider)
-export OPENAI_API_KEY="your-openai-api-key"  # or other provider keys
-# Initialize basic agent
-agent = ComputerAgent(
-    provider=Provider.OPENAI,  # or ANTHROPIC, GROQ, etc.
-)
-task = """
-1. Open Chrome and navigate to github.com
-2. Search for 'trycua/cua'
-3. Star the repository
-"""
-async with agent:
-    async for result in agent.run(task):
-        print(result)
-```
-## Install
-### cua-agent
-```bash
-# Basic installation with Anthropic
-pip install cua-agent[anthropic]
-# Install with OmniParser (recommended)
-# Includes all provider dependencies (OpenAI, Anthropic, etc.)
-pip install cua-agent[omni]
-# Install with all features and providers
-pip install cua-agent[all]
-```
-## Features
-### OmniParser Integration
-- Enhanced UI understanding with element detection
-- Automatic bounding box detection for UI elements
-- Improved accuracy for complex UI interactions
-- Support for icon and text element recognition
-### Basic Computer Control
-- Direct keyboard and mouse control
-- Window and application management
-- Screenshot capabilities
-- Basic UI element detection
-### Provider Support
-- OpenAI (GPT-4V) - Recommended for OmniParser integration
-- Anthropic (Claude) - Strong general performance
-- Groq - Fast inference with Llama models
-- DeepSeek - Alternative model provider
-- Qwen - Alibaba's multimodal model
-## Run
-Refer to these notebooks for step-by-step guides on how to use the Computer-Use Agent (CUA):
-- [Getting Started with OmniParser](../../notebooks/omniparser_nb.ipynb) - Enhanced UI understanding
-- [Basic Computer Control](../../notebooks/basic_agent_nb.ipynb) - Simple computer interactions
-- [Advanced Usage](../../notebooks/agent_nb.ipynb) - Complete examples and workflows
-# Computer Agent Library
-A Python library for controlling computer interactions with AI agents.
-## Introduction
-This library provides a unified interface for AI-powered computer interaction agents, allowing applications to automate UI interactions through various AI providers.
-## Key Features
-- **Unified Agent**: Single `ComputerAgent` class with configurable loop types
-- **Multiple AI providers**: Support for OpenAI, Anthropic, Groq, and other providers
-- **Screen analysis**: Intelligent screen parsing and element identification
-- **Tool execution**: Execute tools and commands to interact with the computer
-- **Trajectory saving**: Option to save screenshots and logs for debugging and analysis
-## Installation
-To install the library along with its dependencies:
-```bash
-pip install -e .
-```
-## Usage
-Here's a simple example of how to use the ComputerAgent:
-```python
-import asyncio
-from computer import Computer
-from agent.core.agent import ComputerAgent
-from agent.types.base import AgenticLoop
-from agent.providers.omni.types import APIProvider
-async def main():
-    # Initialize the computer interface
-    computer = Computer()
-    # Create a computer agent
-    agent = ComputerAgent(
-        computer=computer,
-        loop_type=AgenticLoop.OMNI,  # Use OMNI loop
-        provider=APIProvider.OPENAI,  # With OpenAI provider
-        model="gpt-4o",               # Specify the model
-        save_trajectory=True,         # Save logs and screenshots
-    )
-    # Use the agent with a context manager
-    async with agent:
-        # Run a task
-        async for result in agent.run("Open Safari and navigate to github.com"):
-            # Process the result
-            title = result["metadata"].get("title", "Screen Analysis")
-            content = result["content"]
-            print(f"\n{title}")
-            print(content)
-if __name__ == "__main__":
-    asyncio.run(main())
-```
-## Components
-The library consists of several components:
-- **Core**
-  - `ComputerAgent`: Unified agent class supporting multiple loop types
-  - `BaseComputerAgent`: Abstract base class for computer agents
-- **Providers**
-  - `Anthropic`: Implementation for Anthropic Claude models
-  - `Omni`: Implementation for multiple providers (OpenAI, Groq, etc.)
-- **Loops**
-  - `AnthropicLoop`: Loop implementation for Anthropic
-  - `OmniLoop`: Generic loop supporting multiple providers
-## Configuration
-The agent can be configured with various parameters:
-- **loop_type**: The type of loop to use (ANTHROPIC or OMNI)
-- **provider**: AI provider to use with the loop
-- **model**: The AI model to use
-- **save_trajectory**: Whether to save screenshots and logs
-- **only_n_most_recent_images**: Only keep a specific number of recent images
-See the [Core README](./agent/core/README.md) for more details on the unified agent.

cua_agent-0.1.0/agent/providers/omni/types.py DELETED Viewed

@@ -1,30 +0,0 @@
-"""Type definitions for the Omni provider."""
-from enum import StrEnum
-from typing import Dict
-class APIProvider(StrEnum):
-    """Supported API providers."""
-    ANTHROPIC = "anthropic"
-    OPENAI = "openai"
-    GROQ = "groq"
-    QWEN = "qwen"
-# Default models for each provider
-PROVIDER_TO_DEFAULT_MODEL: Dict[APIProvider, str] = {
-    APIProvider.ANTHROPIC: "claude-3-7-sonnet-20250219",
-    APIProvider.OPENAI: "gpt-4o",
-    APIProvider.GROQ: "deepseek-r1-distill-llama-70b",
-    APIProvider.QWEN: "qwen2.5-vl-72b-instruct",
-}
-# Environment variable names for each provider
-PROVIDER_TO_ENV_VAR: Dict[APIProvider, str] = {
-    APIProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
-    APIProvider.OPENAI: "OPENAI_API_KEY",
-    APIProvider.GROQ: "GROQ_API_KEY",
-    APIProvider.QWEN: "QWEN_API_KEY",
-}