PyPI - cua-agent - Versions diffs - 0.4.34__tar.gz → 0.4.36__tar.gz - Mend

cua-agent 0.4.34tar.gz → 0.4.36tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of cua-agent might be problematic.

Files changed (66)

{cua_agent-0.4.34 → cua_agent-0.4.36}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: cua-agent
-Version: 0.4.34
+Version: 0.4.36
 Summary: CUA (Computer Use) Agent for AI-driven computer interaction
 Author-Email: TryCua <gh@trycua.com>
 Requires-Python: >=3.12
@@ -18,6 +18,10 @@ Requires-Dist: certifi>=2024.2.2
 Requires-Dist: litellm>=1.74.12
 Provides-Extra: openai
 Provides-Extra: anthropic
+Provides-Extra: qwen
+Requires-Dist: qwen-vl-utils; extra == "qwen"
+Requires-Dist: qwen-agent; extra == "qwen"
+Requires-Dist: Pillow>=10.0.0; extra == "qwen"
 Provides-Extra: omni
 Requires-Dist: cua-som<0.2.0,>=0.1.0; extra == "omni"
 Provides-Extra: uitars
@@ -34,7 +38,7 @@ Requires-Dist: transformers-v4.55.0-GLM-4.5V-preview; extra == "glm45v-hf"
 Provides-Extra: opencua-hf
 Requires-Dist: accelerate; extra == "opencua-hf"
 Requires-Dist: torch; extra == "opencua-hf"
-Requires-Dist: transformers==4.53.0; extra == "opencua-hf"
+Requires-Dist: transformers>=4.53.0; extra == "opencua-hf"
 Requires-Dist: tiktoken>=0.11.0; extra == "opencua-hf"
 Requires-Dist: blobfile>=3.0.0; extra == "opencua-hf"
 Provides-Extra: internvl-hf
@@ -70,6 +74,9 @@ Requires-Dist: python-dotenv>=1.0.1; extra == "all"
 Requires-Dist: yaspin>=3.1.0; extra == "all"
 Requires-Dist: hud-python==0.4.52; extra == "all"
 Requires-Dist: google-genai>=1.41.0; extra == "all"
+Requires-Dist: qwen-vl-utils; extra == "all"
+Requires-Dist: qwen-agent; extra == "all"
+Requires-Dist: Pillow>=10.0.0; extra == "all"
 Description-Content-Type: text/markdown
 <div align="center">
@@ -82,10 +89,11 @@ Description-Content-Type: text/markdown
     </picture>
   </div>
-  [![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
-  [![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
-  [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
-  [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
+[![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
+[![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
+[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
+[![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
 </h1>
 </div>
@@ -121,7 +129,7 @@ async def main():
         name=os.getenv("CUA_CONTAINER_NAME"),
         api_key=os.getenv("CUA_API_KEY")
     ) as computer:
         # Create agent
         agent = ComputerAgent(
             model="anthropic/claude-3-5-sonnet-20241022",
@@ -130,10 +138,10 @@ async def main():
             trajectory_dir="trajectories",
             max_trajectory_budget=5.0  # $5 budget limit
         )
         # Run agent
         messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]
         async for result in agent.run(messages):
             for item in result["output"]:
                 if item["type"] == "message":
@@ -158,4 +166,4 @@ if __name__ == "__main__":
 ## License
-MIT License - see LICENSE file for details.
+MIT License - see LICENSE file for details.
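
Reviewer note on the `opencua-hf` change above: the pin moves from `transformers==4.53.0` to `transformers>=4.53.0`, so newer `transformers` releases now satisfy the extra. The sketch below illustrates the difference between the two clauses. It is a deliberately simplified comparison written for this note (the helper names `parse_version` and `satisfies` are hypothetical, and it handles only plain numeric versions with `==` and `>=`); real installers apply the full version-specifier rules.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.53.0' into (4, 53, 0). Numeric release segments only (simplification)."""
    parts = [int(piece) for piece in text.strip().split(".")]
    # Pad short versions such as '4.53' so tuple comparison treats them like '4.53.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies(installed: str, specifier: str) -> bool:
    """Check a single '==' or '>=' clause against an installed version."""
    for operator in ("==", ">="):
        if specifier.startswith(operator):
            wanted = parse_version(specifier[len(operator):])
            have = parse_version(installed)
            if operator == "==":
                return have == wanted
            return have >= wanted
    raise ValueError(f"unsupported specifier in this sketch: {specifier!r}")


OLD_PIN = "==4.53.0"  # clause in 0.4.34
NEW_PIN = ">=4.53.0"  # clause in 0.4.36

for candidate in ("4.52.0", "4.53.0", "4.55.0"):
    print(candidate, satisfies(candidate, OLD_PIN), satisfies(candidate, NEW_PIN))
```

Only the exact version passes the old clause, while the new clause also accepts later releases. The same hunk adds a `qwen` extra, which would presumably be installed with `pip install "cua-agent[qwen]"`.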

{cua_agent-0.4.34 → cua_agent-0.4.36}/README.md RENAMED

@@ -8,10 +8,11 @@
     </picture>
   </div>
-  [![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
-  [![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
-  [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
-  [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
+[![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
+[![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
+[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
+[![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/)
 </h1>
 </div>
@@ -47,7 +48,7 @@ async def main():
         name=os.getenv("CUA_CONTAINER_NAME"),
         api_key=os.getenv("CUA_API_KEY")
     ) as computer:
         # Create agent
         agent = ComputerAgent(
             model="anthropic/claude-3-5-sonnet-20241022",
@@ -56,10 +57,10 @@ async def main():
             trajectory_dir="trajectories",
             max_trajectory_budget=5.0  # $5 budget limit
         )
         # Run agent
         messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]
         async for result in agent.run(messages):
             for item in result["output"]:
                 if item["type"] == "message":
@@ -84,4 +85,4 @@ if __name__ == "__main__":
 ## License
-MIT License - see LICENSE file for details.
+MIT License - see LICENSE file for details.

{cua_agent-0.4.34 → cua_agent-0.4.36}/agent/__init__.py RENAMED

@@ -5,19 +5,13 @@ agent - Decorator-based Computer Use Agent with liteLLM integration
 import logging
 import sys
-from .decorators import register_agent
-from .agent import ComputerAgent
-from .types import Messages, AgentResponse
 # Import loops to register them
 from . import loops
+from .agent import ComputerAgent
+from .decorators import register_agent
+from .types import AgentResponse, Messages
-__all__ = [
-    "register_agent",
-    "ComputerAgent",
-    "Messages",
-    "AgentResponse"
-]
+__all__ = ["register_agent", "ComputerAgent", "Messages", "AgentResponse"]
 __version__ = "0.4.0"

{cua_agent-0.4.34 → cua_agent-0.4.36}/agent/__main__.py RENAMED

@@ -5,8 +5,9 @@ Usage:
     python -m agent.cli <model_string>
 """
-import sys
 import asyncio
+import sys
 from .cli import main
 if __name__ == "__main__":

{cua_agent-0.4.34 → cua_agent-0.4.36}/agent/adapters/huggingfacelocal_adapter.py RENAMED

@@ -2,27 +2,30 @@ import asyncio
 import functools
 import warnings
 from concurrent.futures import ThreadPoolExecutor
-from typing import Iterator, AsyncIterator, Dict, List, Any, Optional
-from litellm.types.utils import GenericStreamingChunk, ModelResponse
+from typing import Any, AsyncIterator, Dict, Iterator, List, Optional
+from litellm import acompletion, completion
 from litellm.llms.custom_llm import CustomLLM
-from litellm import completion, acompletion
+from litellm.types.utils import GenericStreamingChunk, ModelResponse
 # Try to import HuggingFace dependencies
 try:
     import torch
     from transformers import AutoModelForImageTextToText, AutoProcessor
     HF_AVAILABLE = True
 except ImportError:
     HF_AVAILABLE = False
 from .models import load_model as load_model_handler
 class HuggingFaceLocalAdapter(CustomLLM):
     """HuggingFace Local Adapter for running vision-language models locally."""
     def __init__(self, device: str = "auto", trust_remote_code: bool = False, **kwargs):
         """Initialize the adapter.
         Args:
             device: Device to load model on ("auto", "cuda", "cpu", etc.)
             trust_remote_code: Whether to trust remote code
@@ -34,129 +37,120 @@ class HuggingFaceLocalAdapter(CustomLLM):
         # Cache for model handlers keyed by model_name
         self._handlers: Dict[str, Any] = {}
         self._executor = ThreadPoolExecutor(max_workers=1)  # Single thread pool
     def _get_handler(self, model_name: str):
         """Get or create a model handler for the given model name."""
         if model_name not in self._handlers:
-            self._handlers[model_name] = load_model_handler(model_name=model_name, device=self.device, trust_remote_code=self.trust_remote_code)
+            self._handlers[model_name] = load_model_handler(
+                model_name=model_name, device=self.device, trust_remote_code=self.trust_remote_code
+            )
         return self._handlers[model_name]
     def _convert_messages(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
         """Convert OpenAI format messages to HuggingFace format.
         Args:
             messages: Messages in OpenAI format
         Returns:
             Messages in HuggingFace format
         """
         converted_messages = []
         for message in messages:
-            converted_message = {
-                "role": message["role"],
-                "content": []
-            }
+            converted_message = {"role": message["role"], "content": []}
             content = message.get("content", [])
             if isinstance(content, str):
                 # Simple text content
-                converted_message["content"].append({
-                    "type": "text",
-                    "text": content
-                })
+                converted_message["content"].append({"type": "text", "text": content})
             elif isinstance(content, list):
                 # Multi-modal content
                 for item in content:
                     if item.get("type") == "text":
-                        converted_message["content"].append({
-                            "type": "text",
-                            "text": item.get("text", "")
-                        })
+                        converted_message["content"].append(
+                            {"type": "text", "text": item.get("text", "")}
+                        )
                     elif item.get("type") == "image_url":
                         # Convert image_url format to image format
                         image_url = item.get("image_url", {}).get("url", "")
-                        converted_message["content"].append({
-                            "type": "image",
-                            "image": image_url
-                        })
+                        converted_message["content"].append({"type": "image", "image": image_url})
             converted_messages.append(converted_message)
         return converted_messages
     def _generate(self, **kwargs) -> str:
         """Generate response using the local HuggingFace model.
         Args:
             **kwargs: Keyword arguments containing messages and model info
         Returns:
             Generated text response
         """
         if not HF_AVAILABLE:
             raise ImportError(
                 "HuggingFace transformers dependencies not found. "
-                "Please install with: pip install \"cua-agent[uitars-hf]\""
+                'Please install with: pip install "cua-agent[uitars-hf]"'
             )
         # Extract messages and model from kwargs
-        messages = kwargs.get('messages', [])
-        model_name = kwargs.get('model', 'ByteDance-Seed/UI-TARS-1.5-7B')
-        max_new_tokens = kwargs.get('max_tokens', 128)
+        messages = kwargs.get("messages", [])
+        model_name = kwargs.get("model", "ByteDance-Seed/UI-TARS-1.5-7B")
+        max_new_tokens = kwargs.get("max_tokens", 128)
         # Warn about ignored kwargs
-        ignored_kwargs = set(kwargs.keys()) - {'messages', 'model', 'max_tokens'}
+        ignored_kwargs = set(kwargs.keys()) - {"messages", "model", "max_tokens"}
         if ignored_kwargs:
             warnings.warn(f"Ignoring unsupported kwargs: {ignored_kwargs}")
         # Convert messages to HuggingFace format
         hf_messages = self._convert_messages(messages)
         # Delegate to model handler
         handler = self._get_handler(model_name)
         generated_text = handler.generate(hf_messages, max_new_tokens=max_new_tokens)
         return generated_text
     def completion(self, *args, **kwargs) -> ModelResponse:
         """Synchronous completion method.
         Returns:
             ModelResponse with generated text
         """
         generated_text = self._generate(**kwargs)
         return completion(
             model=f"huggingface-local/{kwargs['model']}",
             mock_response=generated_text,
         )
     async def acompletion(self, *args, **kwargs) -> ModelResponse:
         """Asynchronous completion method.
         Returns:
             ModelResponse with generated text
         """
         # Run _generate in thread pool to avoid blocking
         loop = asyncio.get_event_loop()
         generated_text = await loop.run_in_executor(
-            self._executor,
-            functools.partial(self._generate, **kwargs)
+            self._executor, functools.partial(self._generate, **kwargs)
         )
         return await acompletion(
             model=f"huggingface-local/{kwargs['model']}",
             mock_response=generated_text,
         )
     def streaming(self, *args, **kwargs) -> Iterator[GenericStreamingChunk]:
         """Synchronous streaming method.
         Returns:
             Iterator of GenericStreamingChunk
         """
         generated_text = self._generate(**kwargs)
         generic_streaming_chunk: GenericStreamingChunk = {
             "finish_reason": "stop",
             "index": 0,
@@ -165,22 +159,21 @@ class HuggingFaceLocalAdapter(CustomLLM):
             "tool_use": None,
             "usage": {"completion_tokens": 0, "prompt_tokens": 0, "total_tokens": 0},
         }
         yield generic_streaming_chunk
     async def astreaming(self, *args, **kwargs) -> AsyncIterator[GenericStreamingChunk]:
         """Asynchronous streaming method.
         Returns:
             AsyncIterator of GenericStreamingChunk
         """
         # Run _generate in thread pool to avoid blocking
         loop = asyncio.get_event_loop()
         generated_text = await loop.run_in_executor(
-            self._executor,
-            functools.partial(self._generate, **kwargs)
+            self._executor, functools.partial(self._generate, **kwargs)
         )
         generic_streaming_chunk: GenericStreamingChunk = {
             "finish_reason": "stop",
             "index": 0,
@@ -189,5 +182,5 @@ class HuggingFaceLocalAdapter(CustomLLM):
             "tool_use": None,
             "usage": {"completion_tokens": 0, "prompt_tokens": 0, "total_tokens": 0},
         }
-        yield generic_streaming_chunk
+        yield generic_streaming_chunk
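
Reviewer note on `_convert_messages` above: the changes in this file are formatting-only, so the conversion logic is the same in both versions. The sketch below restates that logic as a standalone function (named `convert_messages` here for illustration, with no dependency on the adapter class) and runs it on a small sample. The sample message and its URL are made up for the example.

```python
from typing import Any, Dict, List


def convert_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert OpenAI-style messages to the HuggingFace-style shape used in the diff."""
    converted_messages = []
    for message in messages:
        converted = {"role": message["role"], "content": []}
        content = message.get("content", [])
        if isinstance(content, str):
            # Plain string content becomes a single text part.
            converted["content"].append({"type": "text", "text": content})
        elif isinstance(content, list):
            for item in content:
                if item.get("type") == "text":
                    converted["content"].append(
                        {"type": "text", "text": item.get("text", "")}
                    )
                elif item.get("type") == "image_url":
                    # {"image_url": {"url": ...}} is flattened to {"image": ...}.
                    url = item.get("image_url", {}).get("url", "")
                    converted["content"].append({"type": "image", "image": url})
        converted_messages.append(converted)
    return converted_messages


sample = [
    {"role": "system", "content": "You control a desktop."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is on screen?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/shot.png"}},
        ],
    },
]

result = convert_messages(sample)
print(result)
```

Content parts whose type is neither `text` nor `image_url` are silently dropped, as in the original method.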
