PyPI - coala-cli - Versions diffs - 0.2.0__tar.gz - Mend

coala-cli 0.2.0__tar.gz


Files changed (8)

coala_cli-0.2.0/PKG-INFO +157 -0
coala_cli-0.2.0/README.md +139 -0
coala_cli-0.2.0/coala/__init__.py +0 -0
coala_cli-0.2.0/coala/agent.py +102 -0
coala_cli-0.2.0/coala/mcp_api.py +442 -0
coala_cli-0.2.0/coala/remote_api.py +123 -0
coala_cli-0.2.0/coala/tool_logic.py +109 -0
coala_cli-0.2.0/pyproject.toml +27 -0

coala_cli-0.2.0/PKG-INFO ADDED

@@ -0,0 +1,157 @@
+Metadata-Version: 2.1
+Name: coala-cli
+Version: 0.2.0
+Summary: Convert any CMD tool into a LLM agent
+License: MIT
+Author: Qiang
+Requires-Python: >=3.12,<3.14
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.12
+Requires-Dist: cwltool (>=3.1.20240909164951,<4.0.0)
+Requires-Dist: fastapi (>=0.114.0)
+Requires-Dist: mcp (>=1.9.0,<2.0.0)
+Requires-Dist: pydantic (>=2.9.2,<3.0.0)
+Requires-Dist: requests (>=2.32.3,<3.0.0)
+Requires-Dist: uvicorn (>=0.30.6,<0.31.0)
+Description-Content-Type: text/markdown
+# coala-cli
+## Overview
+Coala, implemented as a Python package, is a standards-based framework for turning command-line tools into reproducible, agent-accessible toolsets that support natural-language interaction.
+## How the Framework Works
+Coala integrates the [Common Workflow Language (CWL)](https://www.commonwl.org/specification/) with the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro) to standardize tool execution. This approach allows Large Language Model (LLM) agents to discover and run tools through structured interfaces, while strictly enforcing the containerized environments and deterministic results necessary for reproducible science.
+### Core Components
+- **Client Layer:** Any MCP-compliant client application (e.g., Claude Desktop, Cursor, or custom interfaces) that utilizes LLMs (such as Gemini, GPT-5, or Claude) to enable natural language interaction.
+- **Bridge Layer:** A local, generic MCP server that acts as a schema translator. Unlike standard MCP servers that require custom Python wrappers for each tool, the bridge layer automatically parses CWL definitions and exposes the CWL-described command-line tools as executable MCP utilities.
+- **Execution Layer:** A standard CWL runner that executes the underlying binaries within containerized environments (Docker). This ensures that analyses are reproducible and isolated from the host system's dependencies.
+### Quick Start
+1. **Initialize:** Create a local MCP server instance using `mcp_api()`.
+2. **Register:** Load your domain-specific tools described in CWL via `add_tool()` (supports local files or repositories).
+3. **Serve:** Start the MCP server using `mcp.serve()`.
+### The Workflow
+- **Interact:** The user sends a natural language query to the MCP Client (e.g., Claude Desktop).
+- **Discover & Select:** The Client retrieves the tool list from the MCP server. The LLM selects the appropriate tool and sends a structured request for the analysis.
+- **Execute:** Coala translates this selection into a CWL job and executes it within a container (Docker), ensuring reproducibility.
+- **Respond:** The execution logs and results are returned to the LLM, which interprets them and presents the final answer to the user.
+## Get Started
+### Requirements
+* Python 3.12 or 3.13 (the package metadata declares `>=3.12,<3.14`)
+* FastAPI
+* Requests
+* Pydantic
+* Uvicorn
+* cwltool
+* mcp (Model Context Protocol SDK)
+### Installation
+To install coala-cli, run the following command:
+```bash
+pip install coala-cli
+```
+### Use Cases
+<!-- This text is a hidden note and will not be displayed in the rendered README.
+### MCP server
+The framework allows you to set up an MCP server with predefined tools for specific domains. For example, to create a bioinformatics-focused MCP server, you can use the following setup (as shown in [`examples/bioinfo_question.py`](examples/bioinfo_question.py)):
+```python
+from coala.mcp_api import mcp_api
+mcp = mcp_api(host='0.0.0.0', port=8000)
+mcp.add_tool('examples/ncbi_datasets_gene.cwl', 'ncbi_datasets_gene')
+mcp.add_tool('examples/bcftools_view.cwl', 'bcftools_view', read_outs=False)
+mcp.serve()
+```
+This creates an MCP server that exposes two bioinformatics tools:
+- `ncbi_datasets_gene`: Retrieves gene metadata from NCBI datasets
+- `bcftools_view`: Subsets and filters VCF/BCF files
+Once the server is running, you can configure your MCP client (e.g., in Cursor) to connect to it:
+```json
+{
+    "mcpServers": {
+        "cmdagent": {
+            "url": "http://localhost:8000/mcp",
+            "transport": "streamable-http"
+        }
+    }
+}
+```
+With this setup, you can ask the LLM natural language questions like:
+- "Give me a summary about gene BRCA1"
+- "Subset variants in the gene BRCA1 from the https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz"
+The LLM will automatically discover the available tools, understand their parameters, invoke the appropriate tool with the correct arguments, and present the results in a user-friendly format.
+* Start MCP server
+```
+python examples/bioinfo_question.py
+```
+* Call by MCP client from Cursor
+[![Demo md5](tests/cmdagent.gif)](https://www.youtube.com/watch?v=QqevFmQbTDU)
+### Function call
+* Creating an API
+To create an API, import the `tool_api` function from `coala.remote_api` and pass in the path to a CWL file and the name of the tool:
+```python
+from coala.remote_api import tool_api
+api = tool_api(cwl_file='tests/dockstore-tool-md5sum.cwl', tool_name='md5sum')
+api.serve()
+```
+The `api.serve()` method will start a RESTful API as a service, allowing you to run the tool remotely from the cloud or locally.
+* Creating a Tool Agent
+To create a tool agent, import the `tool_agent` class from `coala.agent` and pass in the API instance:
+```python
+from coala.agent import tool_agent
+ta = tool_agent(api)
+md5 = ta.create_tool()
+md5(input_file="tests/dockstore-tool-md5sum.cwl")
+```
+Function `md5` is created automatically based on the `api`.
+* Function call with Gemini
+To integrate the tool agent with Gemini, import `google.generativeai` and create a `GenerativeModel` instance that receives the generated function as a tool:
+```python
+import google.generativeai as genai
+genai.configure(api_key="******")
+model = genai.GenerativeModel(model_name='gemini-1.5-flash', tools=[md5])
+chat = model.start_chat(enable_automatic_function_calling=True)
+response = chat.send_message("what is md5 of tests/dockstore-tool-md5sum.cwl?")
+response.text
+```
+```
+'The md5sum of tests/dockstore-tool-md5sum.cwl is ad59d9e9ed6344f5c20ee7e0143c6c12. \n'
+```
+-->
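
Reviewer note on the metadata above: `Requires-Python: >=3.12,<3.14` is narrower than the README's "3.12 or later". A minimal sketch of what that range means at runtime, assuming only the two bounds shown in PKG-INFO; `python_supported` is a hypothetical helper and is not part of the package.

```python
import sys

# Bounds copied from the Requires-Python line: >=3.12,<3.14
MIN_VERSION = (3, 12)  # inclusive lower bound
MAX_VERSION = (3, 14)  # exclusive upper bound


def python_supported(version=None):
    """Return True if (major, minor) lies in the half-open range [3.12, 3.14)."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION


if __name__ == "__main__":
    # Reports whether the running interpreter falls inside the declared range
    print(python_supported())
```

In practice pip enforces this range at install time, so a check like this is only useful for diagnosing an environment by hand.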

coala_cli-0.2.0/README.md ADDED

@@ -0,0 +1,139 @@
+# coala-cli
+## Overview
+Coala, implemented as a Python package, is a standards-based framework for turning command-line tools into reproducible, agent-accessible toolsets that support natural-language interaction.
+## How the Framework Works
+Coala integrates the [Common Workflow Language (CWL)](https://www.commonwl.org/specification/) with the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro) to standardize tool execution. This approach allows Large Language Model (LLM) agents to discover and run tools through structured interfaces, while strictly enforcing the containerized environments and deterministic results necessary for reproducible science.
+### Core Components
+- **Client Layer:** Any MCP-compliant client application (e.g., Claude Desktop, Cursor, or custom interfaces) that utilizes LLMs (such as Gemini, GPT-5, or Claude) to enable natural language interaction.
+- **Bridge Layer:** A local, generic MCP server that acts as a schema translator. Unlike standard MCP servers that require custom Python wrappers for each tool, the bridge layer automatically parses CWL definitions and exposes the CWL-described command-line tools as executable MCP utilities.
+- **Execution Layer:** A standard CWL runner that executes the underlying binaries within containerized environments (Docker). This ensures that analyses are reproducible and isolated from the host system's dependencies.
+### Quick Start
+1. **Initialize:** Create a local MCP server instance using `mcp_api()`.
+2. **Register:** Load your domain-specific tools described in CWL via `add_tool()` (supports local files or repositories).
+3. **Serve:** Start the MCP server using `mcp.serve()`.
+### The Workflow
+- **Interact:** The user sends a natural language query to the MCP Client (e.g., Claude Desktop).
+- **Discover & Select:** The Client retrieves the tool list from the MCP server. The LLM selects the appropriate tool and sends a structured request for the analysis.
+- **Execute:** Coala translates this selection into a CWL job and executes it within a container (Docker), ensuring reproducibility.
+- **Respond:** The execution logs and results are returned to the LLM, which interprets them and presents the final answer to the user.
+## Get Started
+### Requirements
+* Python 3.12 or 3.13 (the package metadata declares `>=3.12,<3.14`)
+* FastAPI
+* Requests
+* Pydantic
+* Uvicorn
+* cwltool
+* mcp (Model Context Protocol SDK)
+### Installation
+To install coala-cli, run the following command:
+```bash
+pip install coala-cli
+```
+### Use Cases
+<!-- This text is a hidden note and will not be displayed in the rendered README.
+### MCP server
+The framework allows you to set up an MCP server with predefined tools for specific domains. For example, to create a bioinformatics-focused MCP server, you can use the following setup (as shown in [`examples/bioinfo_question.py`](examples/bioinfo_question.py)):
+```python
+from coala.mcp_api import mcp_api
+mcp = mcp_api(host='0.0.0.0', port=8000)
+mcp.add_tool('examples/ncbi_datasets_gene.cwl', 'ncbi_datasets_gene')
+mcp.add_tool('examples/bcftools_view.cwl', 'bcftools_view', read_outs=False)
+mcp.serve()
+```
+This creates an MCP server that exposes two bioinformatics tools:
+- `ncbi_datasets_gene`: Retrieves gene metadata from NCBI datasets
+- `bcftools_view`: Subsets and filters VCF/BCF files
+Once the server is running, you can configure your MCP client (e.g., in Cursor) to connect to it:
+```json
+{
+    "mcpServers": {
+        "cmdagent": {
+            "url": "http://localhost:8000/mcp",
+            "transport": "streamable-http"
+        }
+    }
+}
+```
+With this setup, you can ask the LLM natural language questions like:
+- "Give me a summary about gene BRCA1"
+- "Subset variants in the gene BRCA1 from the https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz"
+The LLM will automatically discover the available tools, understand their parameters, invoke the appropriate tool with the correct arguments, and present the results in a user-friendly format.
+* Start MCP server
+```
+python examples/bioinfo_question.py
+```
+* Call by MCP client from Cursor
+[![Demo md5](tests/cmdagent.gif)](https://www.youtube.com/watch?v=QqevFmQbTDU)
+### Function call
+* Creating an API
+To create an API, import the `tool_api` function from `coala.remote_api` and pass in the path to a CWL file and the name of the tool:
+```python
+from coala.remote_api import tool_api
+api = tool_api(cwl_file='tests/dockstore-tool-md5sum.cwl', tool_name='md5sum')
+api.serve()
+```
+The `api.serve()` method will start a RESTful API as a service, allowing you to run the tool remotely from the cloud or locally.
+* Creating a Tool Agent
+To create a tool agent, import the `tool_agent` class from `coala.agent` and pass in the API instance:
+```python
+from coala.agent import tool_agent
+ta = tool_agent(api)
+md5 = ta.create_tool()
+md5(input_file="tests/dockstore-tool-md5sum.cwl")
+```
+Function `md5` is created automatically based on the `api`.
+* Function call with Gemini
+To integrate the tool agent with Gemini, import `google.generativeai` and create a `GenerativeModel` instance that receives the generated function as a tool:
+```python
+import google.generativeai as genai
+genai.configure(api_key="******")
+model = genai.GenerativeModel(model_name='gemini-1.5-flash', tools=[md5])
+chat = model.start_chat(enable_automatic_function_calling=True)
+response = chat.send_message("what is md5 of tests/dockstore-tool-md5sum.cwl?")
+response.text
+```
+```
+'The md5sum of tests/dockstore-tool-md5sum.cwl is ad59d9e9ed6344f5c20ee7e0143c6c12. \n'
+```
+-->
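
Reviewer note on the README above: the Bridge Layer is described as a schema translator from CWL definitions to MCP tools. A rough standalone sketch of one piece of that idea, mapping a CWL input type to a Python annotation. This is a simplified illustration, not the package's code: `cwl_type_to_python` is a hypothetical name, and the type strings follow the examples given in the comments of `coala/mcp_api.py`.

```python
from typing import List, Optional


def cwl_type_to_python(raw_type):
    """Map a CWL input type to (python_annotation, is_optional)."""
    type_list = raw_type if isinstance(raw_type, list) else [raw_type]
    is_optional = 'null' in type_list  # a 'null' member marks the input optional
    non_null = [t for t in type_list if t != 'null']

    is_array = False
    base = 'string'  # fallback when no concrete type is given
    if non_null:
        first = non_null[0]
        if isinstance(first, dict) and first.get('type') == 'array':
            # Dict form: {'type': 'array', 'items': 'float'}
            is_array = True
            base = str(first.get('items', 'string'))
        else:
            base = str(first)
            if base.endswith('[]'):
                # Shorthand form: 'float[]'
                is_array = True
                base = base[:-2]

    # Files are passed around as path strings, so they map to str
    if 'File' in base or 'string' in base:
        py_type = str
    elif 'double' in base or 'float' in base:
        py_type = float
    elif 'int' in base:
        py_type = int
    elif 'boolean' in base:
        py_type = bool
    else:
        py_type = str

    if is_array:
        py_type = List[py_type]
    if is_optional:
        py_type = Optional[py_type]
    return py_type, is_optional


if __name__ == "__main__":
    for example in ('int', ['null', 'org.w3id.cwl.cwl.File'], 'float[]'):
        print(example, '->', cwl_type_to_python(example))
```

The resulting annotations are the kind of input a `pydantic.create_model` call can consume, which is how the shipped module builds its per-tool input model.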

coala_cli-0.2.0/coala/__init__.py ADDED

File without changes

coala_cli-0.2.0/coala/agent.py ADDED

@@ -0,0 +1,102 @@
+import requests
+from types import FunctionType
+class tool_agent():
+    def __init__(self, api):
+        # self.parameter_names = parameter_names
+        self.api = api.server
+        self.tool = api.tool
+        self.url = api.url
+        self.Base = api.Base
+        self.tool_name = api.tool_name
+        self.run = self._create_function()
+        self.parameter_names = [it['name'] for it in api.tool.t.inputs_record_schema['fields']]
+        #params = {'input_file': file_path}
+    def upload_file(self, file_path):
+        url_upload = f"http://{self.api.config.host}:{self.api.config.port}/uploadFile/"
+        # Open the file in a context manager so the handle is closed after the upload
+        with open(file_path, 'rb') as fh:
+            response = requests.post(url_upload, files={'file': fh})
+        if response.status_code == 200:
+            return response.json()
+        else:
+            raise Exception(f"Error uploading file: {response.text}")
+    def pre_inputs(self, inputs, kwargs):
+        params = kwargs.copy()
+        for ip in inputs:
+            if 'File' in ip['type']:
+                # upload to server
+                r_path = self.upload_file(kwargs[ip['name']])
+                params[ip['name']] = 'file://' + r_path['filepath']
+                # params[ip['name']] = {
+                #     "class": "File",
+                #     "location": r_path['filepath']
+                # }
+        return params
+    def _create_function(self):
+        def gen_function(**kwargs):
+            # Ensure every declared parameter is passed (optional inputs are not exempted here)
+            for param in self.parameter_names:
+                if param not in kwargs:
+                    raise ValueError(f"Missing required parameter: {param}")
+            print(", ".join(f"{param}={kwargs[param]}" for param in self.parameter_names))
+            inputs = self.tool.t.inputs_record_schema['fields']
+            params = self.pre_inputs(inputs, kwargs)
+            print(params)
+            response = requests.post(self.url, json=[params])
+            if response.status_code == 200:
+                return response.json()
+            else:
+                raise Exception(f"Error running tool '{self.tool_name}': {response.text}")
+        gen_function.__name__ = self.tool_name
+        ann = {}
+        for k, v in self.Base.model_fields.items():
+            ann[k] = v.annotation
+        ann['return'] = str
+        gen_function.__annotations__ = ann
+        # Use .get so a CWL tool without a 'doc' field does not raise KeyError
+        gen_function.__doc__ = self.tool.t.tool.get('doc', '')
+        return gen_function
+    def create_tool(self):
+        tool_name = self.tool_name
+        param_names = self.parameter_names
+        fun = self.run
+        # Start building the function definition as a string
+        function_code = "def {}({}):\n".format(tool_name, ", ".join(param_names))
+        function_code += "    kwargs = {" + ", ".join([f"'{name}': {name}" for name in param_names]) + "}\n"
+        function_code += f"    return {fun.__name__}(**kwargs)\n"
+        # Define a local namespace to execute the function
+        local_namespace = {}
+        # Execute the function code in the local namespace
+        exec(function_code, {}, local_namespace)
+        generated_function = local_namespace[tool_name]
+        # Use FunctionType to create the function, passing globals with the external function
+        dynamic_func = FunctionType(
+            generated_function.__code__,
+            {fun.__name__: fun},  # pass the predefined function to globals
+            generated_function.__name__
+        )
+        # Return the generated function
+        dynamic_func.__annotations__ = fun.__annotations__
+        dynamic_func.__doc__ = fun.__doc__
+        return dynamic_func
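
Reviewer note on `create_tool` above: it turns a `**kwargs` function into one with an explicit, named parameter list by generating source text, compiling it with `exec`, and rebinding the code object through `FunctionType`. A minimal standalone sketch of the same technique follows. `make_named_function` and `_target` are hypothetical names, and the identifier check is an addition that the shipped code does not perform.

```python
from types import FunctionType


def make_named_function(name, param_names, target):
    """Build `name(p1, p2, ...)` that forwards its arguments to target(**kwargs)."""
    # Names are spliced into source text, so reject anything that is not an identifier
    for ident in (name, *param_names):
        if not ident.isidentifier():
            raise ValueError(f"Not a valid Python identifier: {ident!r}")

    args = ", ".join(param_names)
    forwarded = ", ".join(f"'{p}': {p}" for p in param_names)
    source = (
        f"def {name}({args}):\n"
        f"    return _target(**{{{forwarded}}})\n"
    )

    namespace = {}
    exec(source, {}, namespace)  # compile the generated definition
    template = namespace[name]

    # Rebind the compiled code to globals that expose only the forwarding target
    func = FunctionType(template.__code__, {"_target": target}, name)
    func.__doc__ = target.__doc__
    return func


if __name__ == "__main__":
    def echo(**kwargs):
        """Return the keyword arguments unchanged."""
        return kwargs

    md5sum = make_named_function("md5sum", ["input_file"], echo)
    print(md5sum(input_file="tests/dockstore-tool-md5sum.cwl"))
```

The explicit signature is what makes the generated function usable by callers that inspect parameter names, such as the function-calling setup described in the README.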

coala_cli-0.2.0/coala/mcp_api.py ADDED

@@ -0,0 +1,442 @@
+# Suppress Pydantic warning about Field() with Optional/Union types
+# This must be done before any Pydantic imports to ensure the filter is active
+import warnings
+warnings.filterwarnings('ignore', message='.*default.*Field.*')
+warnings.filterwarnings('ignore', message='.*UnsupportedFieldAttributeWarning.*')
+from fastapi import FastAPI, UploadFile, File
+from pydantic import create_model, Field
+from pydantic.warnings import UnsupportedFieldAttributeWarning
+import logging
+import uvicorn
+from tempfile import NamedTemporaryFile, mkdtemp
+from cwltool import factory
+from cwltool.context import RuntimeContext
+from threading import Thread
+import time
+from typing import Optional, List, Annotated
+from mcp.server.fastmcp import FastMCP
+from coala.tool_logic import run_tool, configure_container_runner  # <-- import shared logic
+import threading
+import sys
+import os
+# Additional filter for the specific warning category (applied after import)
+warnings.filterwarnings('ignore', category=UnsupportedFieldAttributeWarning)
+logger = logging.getLogger(__name__)
+logger.setLevel(logging.INFO)
+# Use stderr for logging to avoid interfering with stdio transport
+logger.addHandler(logging.StreamHandler(sys.stderr))
+class mcp_api():
+    def __init__(self, host='0.0.0.0', port=8000, container_runner=None):
+        """
+        Initializes an MCP server that can host multiple CWL tools.
+        Parameters:
+            host (str): The host IP address. Defaults to '0.0.0.0'.
+            port (int): The port number. Defaults to 8000.
+            container_runner (str, optional): Container runtime to use for all tools.
+                                             Valid values: 'docker', 'podman', 'singularity', 'udocker', etc.
+                                             Defaults to None (uses tool's default, typically 'docker').
+        Notes:
+            Output-reading behavior is controlled per tool via `add_tool(..., read_outs=False)`.
+        """
+        self.host = host
+        self.port = port
+        self.container_runner = container_runner
+        self.server = None
+        self.url = None
+        self.mcp = FastMCP(host=host, port=port)
+        self.tools = {}  # tool_name -> tool info
+        self.system_prompt = """
+At the end of your response, append a summary listing the tool name and version from the tool's description using this exact format:
+```
+Tool Invocation Summary:
+tool_name: <TOOL_NAME>
+tool_version: <TOOL_VERSION>
+```
+"""
+        # @self.mcp.tool()
+        # async def uploadFile(file: UploadFile = File(description="The file to be uploaded to the server")) -> dict:
+        #     """
+        #     Upload a file to the server.
+        #     """
+        #     with NamedTemporaryFile(delete=False) as tmp:
+        #         contents = file.file.read()
+        #         tmp.write(contents)
+        #     return {"filename": file.filename, "filepath": tmp.name}
+    def _build_field_description(self, field_name, input_field, model_field):
+        """
+        Build field description with type hints.
+        """
+        doc = input_field.get('doc', '')
+        type_val = input_field.get('type', '')
+        type_list = type_val if isinstance(type_val, list) else [type_val]
+        type_str = ' '.join(str(t) for t in type_list)
+        type_hint = ""
+        if 'File' in type_str:
+            type_hint = "file path"
+        elif 'string' in type_str:
+            type_hint = "str"
+        elif 'double' in type_str or 'float' in type_str:
+            type_hint = "float"
+        elif 'int' in type_str:
+            type_hint = "int"
+        elif 'boolean' in type_str:
+            type_hint = "bool"
+        annotation = model_field.annotation.__name__ if hasattr(model_field.annotation, '__name__') else str(model_field.annotation)
+        if type_hint:
+            return f"{field_name}: {doc}, {annotation}, {type_hint}"
+        else:
+            return f"{field_name}: {doc}, {annotation}"
+    def _build_output_description(self, output_field):
+        """
+        Build output field description with type hints.
+        """
+        field_name = output_field.get('name', '')
+        doc = output_field.get('doc', '')
+        type_val = output_field.get('type', '')
+        type_list = type_val if isinstance(type_val, list) else [type_val]
+        type_str = ' '.join(str(t) for t in type_list)
+        type_hint = ""
+        if 'File' in type_str:
+            type_hint = "file path"
+        elif 'string' in type_str:
+            type_hint = "str"
+        elif 'double' in type_str or 'float' in type_str:
+            type_hint = "float"
+        elif 'int' in type_str:
+            type_hint = "int"
+        elif 'boolean' in type_str:
+            type_hint = "bool"
+        if type_hint:
+            return f"{field_name}: {doc}, {type_hint}"
+        else:
+            return f"{field_name}: {doc}"
+    def _transform_input_value(self, field_name, value, input_type):
+        """
+        Transform input values based on their expected type.
+        - For File types: If value is just a filename, try to resolve to full path
+        - For string types: If value is a full path, extract just the filename
+        - For array types: Transform each element in the array
+        Parameters:
+            field_name: Name of the input field
+            value: The input value to transform
+            input_type: The CWL type definition for this input
+        Returns:
+            Transformed value
+        """
+        if value is None:
+            return value
+        # Check if it's an array type
+        is_array = False
+        base_type = input_type
+        if isinstance(input_type, list):
+            # Filter out 'null' to get actual types
+            non_null_types = [t for t in input_type if t != 'null']
+            if non_null_types:
+                base_type = non_null_types[0]
+        # Check for array notation (e.g., 'float[]' or {'type': 'array', 'items': 'float'})
+        if isinstance(base_type, dict) and base_type.get('type') == 'array':
+            is_array = True
+            base_type = base_type.get('items', 'string')
+        elif isinstance(base_type, str) and '[]' in base_type:
+            is_array = True
+            base_type = base_type.replace('[]', '')
+        # If value is a list and type is array, transform each element
+        if is_array and isinstance(value, list):
+            return [self._transform_input_value(f"{field_name}[{i}]", item, base_type)
+                    for i, item in enumerate(value)]
+        # Convert type to string for checking
+        type_str = str(base_type) if not isinstance(base_type, dict) else base_type.get('type', '')
+        # Check if it's a File type
+        if 'File' in type_str and isinstance(value, str):
+            # If it's already a file:// URI, return as is
+            if value.startswith('file://'):
+                return value
+            # If it's already an absolute path that exists, return as is
+            if os.path.isabs(value) and os.path.isfile(value):
+                return value
+            # Try to resolve filename to full path
+            # Check if it's a file in current directory
+            if os.path.isfile(value):
+                return os.path.abspath(value)
+            # Check in current working directory
+            cwd_path = os.path.join(os.getcwd(), value)
+            if os.path.isfile(cwd_path):
+                return os.path.abspath(cwd_path)
+            # If not found, return as is (let run_tool handle it)
+            return value
+        # Check if it's a string type
+        elif 'string' in type_str and isinstance(value, str):
+            # If it looks like a full path, check if directory exists
+            if os.path.sep in value or (os.path.altsep and os.path.altsep in value):
+                # Get the directory part of the path
+                dir_path = os.path.dirname(value)
+                # Only extract filename if the directory exists
+                if dir_path and os.path.isdir(dir_path):
+                    # Extract filename from path
+                    filename = os.path.basename(value)
+                    logger.info(f"Transformed string input '{field_name}': '{value}' -> '{filename}'")
+                    return filename
+                # If directory doesn't exist, keep the full path as is
+                return value
+        return value
+    def add_tool(self, cwl_file, tool_name=None, read_outs=False):
+        """
+        Adds a CWL tool to the MCP server.
+        Parameters:
+            cwl_file: Path to the CWL tool file
+            tool_name: Optional tool name. If not provided, will use:
+                      1. The 'id' field from the CWL tool
+                      2. If 'id' is not defined, the basename of cwl_file (without .cwl extension)
+            read_outs: Whether to read output files
+        Raises:
+            FileNotFoundError: If the CWL file does not exist
+            Exception: If there's an error loading the CWL tool
+        """
+        # Check if file exists
+        if not os.path.exists(cwl_file):
+            raise FileNotFoundError(f"CWL file not found: {cwl_file}")
+        if not os.path.isfile(cwl_file):
+            raise ValueError(f"Path is not a file: {cwl_file}")
+        runtime_context = RuntimeContext()
+        runtime_context.outdir = mkdtemp()
+        # Configure container runner if specified
+        if self.container_runner:
+            configure_container_runner(runtime_context, self.container_runner)
+        fac = factory.Factory(runtime_context=runtime_context)
+        try:
+            tool = fac.make(cwl_file)
+        except Exception as e:
+            raise Exception(f"Failed to load CWL tool from {cwl_file}: {str(e)}") from e
+        # Determine tool_name if not provided
+        if tool_name is None:
+            # Try to get 'id' from CWL tool
+            tool_id = tool.t.tool.get('id') if hasattr(tool.t, 'tool') and tool.t.tool else None
+            # Only use id if it contains a '#' fragment (e.g., "file://path#ToolName")
+            # If id is just a file:// path without fragment, treat it as undefined
+            if tool_id and '#' in tool_id:
+                tool_name = tool_id.split('#')[-1]
+            # If 'id' is not defined or doesn't have a fragment, use basename of cwl_file without .cwl extension
+            if not tool_name:
+                # Strip only a trailing '.cwl'; str.replace would also remove '.cwl' inside the name
+                tool_name = os.path.basename(cwl_file).removesuffix('.cwl')
+        inputs = tool.t.inputs_record_schema['fields']
+        outputs = tool.t.outputs_record_schema['fields']
+        # Create a mapping from field name to input field definition
+        inputs_by_name = {it['name']: it for it in inputs}
+        # map types
+        it_map = {}
+        for it in inputs:
+            # it['type'] can be a list like ['null', 'org.w3id.cwl.cwl.File'] or ['null', 'float[]']
+            # or a dict like {'type': 'array', 'items': 'float'}
+            # or a string like 'float[]'
+            raw_type = it['type']
+            type_list = raw_type if isinstance(raw_type, list) else [raw_type]
+            # Check for 'null' in type list (optional field)
+            is_optional = 'null' in type_list
+            # Filter out 'null' to get the actual type(s)
+            non_null_types = [t for t in type_list if t != 'null']
+            # Check if it's a dict-based array type (e.g., {'type': 'array', 'items': 'float'})
+            is_array = False
+            base_type_str = None
+            if isinstance(raw_type, dict) and raw_type.get('type') == 'array':
+                is_array = True
+                items_type = raw_type.get('items', 'string')
+                base_type_str = str(items_type) if not isinstance(items_type, dict) else items_type.get('type', 'string')
+            elif non_null_types:
+                # Check for array notation in string (e.g., 'float[]')
+                # Look through non-null types for array notation
+                for t in non_null_types:
+                    t_str = str(t)
+                    if '[]' in t_str:
+                        is_array = True
+                        base_type_str = t_str.replace('[]', '')
+                        break
+                if not is_array and non_null_types:
+                    # Not an array, use the first non-null type
+                    base_type_str = str(non_null_types[0])
+            else:
+                # Fallback to string if no types found
+                base_type_str = 'string'
+            # Get field description from CWL input
+            field_doc = it.get('doc', '')
+            # Determine base Python type
+            if 'File' in base_type_str:
+                base_py_type = str
+            elif 'string' in base_type_str:
+                base_py_type = str
+            elif 'double' in base_type_str or 'float' in base_type_str:
+                base_py_type = float
+            elif 'int' in base_type_str:
+                base_py_type = int
+            elif 'boolean' in base_type_str:
+                base_py_type = bool
+            else:
+                base_py_type = str
+            # Wrap in List if it's an array
+            if is_array:
+                py_type = List[base_py_type]
+            else:
+                py_type = base_py_type
+            # Create Field with description
+            # For optional fields, use (Optional[type], None) - Field() can't be used with Union types
+            # For required fields, use Field directly
+            if is_optional:
+                # Use Optional type with None as default
+                # Note: We can't use Field() with Optional/Union types, so description will be set via field_doc in fields_desc
+                it_map[it['name']] = (Optional[py_type], None)
+            else:
+                it_map[it['name']] = (py_type, Field(description=field_doc))
+        Base = create_model(f'Base_{tool_name}', **it_map)
+        fields_desc = "\n\n".join(
+            self._build_field_description(k, inputs_by_name[k], v)
+            for k, v in Base.model_fields.items()
+        )
+        outputs_desc = "\n\n".join(
+            self._build_output_description(out)
+            for out in outputs
+        )
+        # Extract Docker image information
+        docker_info = ""
+        docker_version = ""
+        # Check requirements first
+        if hasattr(tool.t, 'requirements') and tool.t.requirements:
+            for req in tool.t.requirements:
+                if isinstance(req, dict) and req.get('class') == 'DockerRequirement':
+                    docker_pull = req.get('dockerPull', '')
+                    if docker_pull:
+                        docker_info = f"\n\ntool_version: {docker_pull}"
+                        docker_version = docker_pull
+                        break
+        # If not found in requirements, check hints
+        if not docker_info and hasattr(tool.t, 'hints') and tool.t.hints:
+            for hint in tool.t.hints:
+                if isinstance(hint, dict) and hint.get('class') == 'DockerRequirement':
+                    docker_pull = hint.get('dockerPull', '')
+                    if docker_pull:
+                        docker_info = f"\n\ntool_version: {docker_pull}"
+                        docker_version = docker_pull
+                        break
+        tool_desc = f"{tool_name}: {tool.t.tool.get('label', '')}\n\n {tool.t.tool.get('doc', '')}{docker_info}\n\nReturns:\n\n{outputs_desc}"
+        @self.mcp.tool(name=tool_name, description=f"{tool_desc}\n\nInput data for '{tool_name}'. Fields: \n\n{fields_desc}")
+        def mcp_tool(data: List[Base]) -> dict:
+            """MCP tool wrapper for CWL tool execution."""
+            # Store fields_desc as function attribute for programmatic access
+            mcp_tool.fields_desc = fields_desc
+            # Assign interpolated docstring with field descriptions
+            mcp_tool.__doc__ = f"""
+            MCP tool wrapper for CWL tool execution.
+            Input fields:
+            {fields_desc}
+            """
+            logger.info(data)
+            params = data[0].model_dump()
+            # Transform input values based on their types
+            # Create a mapping from field name to input type
+            inputs_by_name = {it['name']: it for it in inputs}
+            for field_name, value in params.items():
+                if field_name in inputs_by_name:
+                    input_field = inputs_by_name[field_name]
+                    input_type = input_field.get('type', 'string')
+                    transformed_value = self._transform_input_value(field_name, value, input_type)
+                    if transformed_value != value:
+                        logger.info(f"Transformed input '{field_name}': '{value}' -> '{transformed_value}'")
+                    params[field_name] = transformed_value
+            outs = run_tool(tool, params, outputs, read_outs, container_runner=self.container_runner)
+            outs['tool_name'] = tool_name
+            outs['tool_version'] = docker_version
+            outs['system_prompt'] = self.system_prompt
+            logger.info(outs)
+            return outs
+        # Store tool info if needed
+        self.tools[tool_name] = {
+            'cwl_file': cwl_file,
+            'tool': tool,
+            'Base': Base,
+            'inputs': inputs,
+            'outputs': outputs
+        }
+    def serve(self, transport=None):
+        """
+        Starts the MCP server.
+        Parameters:
+            transport (str, optional): Transport type ('stdio' or 'streamable-http').
+                                     If None, auto-detects: 'stdio' when stdin is not a TTY,
+                                     otherwise 'streamable-http'.
+        """
+        # Auto-detect transport: if stdin is not a TTY, use stdio transport
+        if transport is None:
+            if not sys.stdin.isatty():
+                transport = 'stdio'
+            else:
+                transport = 'streamable-http'
+        if transport == 'streamable-http':
+            # Print to stderr to avoid interfering with stdio transport
+            print(f"Starting MCP server at http://{self.host}:{self.port}/", file=sys.stderr, flush=True)
+        else:
+            # For stdio transport, don't print startup messages to stdout
+            logger.info("Starting MCP server with stdio transport")
+        self.mcp.run(transport=transport)
+        # thread = threading.Thread(target=self.mcp.run, kwargs={'transport': 'sse'}, daemon=True)
+        # thread.start()
+        # self.server_thread = thread
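
The transport auto-detection in `serve()` can be exercised in isolation. The following is a minimal sketch, not the package's API: `pick_transport` and `FakeTTY` are hypothetical names introduced here, and the `stdin` parameter exists only so the logic can be tried without a real terminal.

```python
import io
import sys


def pick_transport(transport=None, stdin=None):
    """Return the explicit transport, or guess one from whether stdin is a TTY."""
    if transport is not None:
        return transport
    stdin = sys.stdin if stdin is None else stdin
    # A piped (non-TTY) stdin suggests an MCP client started us as a subprocess.
    return "streamable-http" if stdin.isatty() else "stdio"


class FakeTTY(io.StringIO):
    """Hypothetical stand-in for an interactive terminal."""

    def isatty(self):
        return True


print(pick_transport(stdin=io.StringIO()))       # piped stdin
print(pick_transport(stdin=FakeTTY()))           # interactive stdin
print(pick_transport("stdio", stdin=FakeTTY()))  # explicit value is kept
```

An explicit `transport` argument always takes precedence, so the guess only applies when the caller passes `None`.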

coala_cli-0.2.0/coala/remote_api.py ADDED

@@ -0,0 +1,123 @@
+from fastapi import FastAPI, UploadFile, Body
+from pydantic import create_model
+import logging
+import uvicorn
+from tempfile import NamedTemporaryFile, mkdtemp
+from cwltool import factory
+from cwltool.context import RuntimeContext
+from threading import Thread
+import time
+from typing import Optional, List
+from coala.tool_logic import run_tool  # <-- import shared logic
+logger = logging.getLogger(__name__)
+logger.setLevel(logging.INFO)
+logger.addHandler(logging.StreamHandler())
+class tool_api():
+    def __init__(self, cwl_file, tool_name='tool', host='0.0.0.0', port=8000, read_outs=False):
+        """
+        Initializes a tool_api object, which is used to create a FastAPI server for a given CWL file.
+        Parameters:
+            cwl_file (str): The path to the CWL file.
+            tool_name (str): The name of the tool. Defaults to 'tool'.
+            host (str): The host IP address. Defaults to '0.0.0.0'.
+            port (int): The port number. Defaults to 8000.
+            read_outs (bool): Whether to read the outputs. Defaults to False.
+        Returns:
+            None
+        """
+        self.cwl_file = cwl_file
+        self.tool_name = tool_name
+        self.host = host
+        self.port = port
+        self.read_outs = read_outs
+        self.server = None
+        self.url = None
+        # cwl
+        runtime_context = RuntimeContext()
+        runtime_context.outdir = mkdtemp()
+        fac = factory.Factory(runtime_context=runtime_context)
+        self.tool = fac.make(cwl_file)
+        self.inputs = self.tool.t.inputs_record_schema['fields']
+        self.outputs = self.tool.t.outputs_record_schema['fields']
+        # map types
+        it_map = {}
+        for it in self.inputs:
+            # it['type'] can be a list like ['null', 'org.w3id.cwl.cwl.File']
+            type_list = it['type'] if isinstance(it['type'], list) else [it['type']]
+            type_str = ' '.join(str(t) for t in type_list)  # Join for checking substrings
+            if 'File' in type_str:
+                it_map[it['name']] = (str, None)
+            elif 'string' in type_str:
+                it_map[it['name']] = (str, None)
+            elif 'double' in type_str:
+                it_map[it['name']] = (float, None)
+            elif 'int' in type_str:
+                it_map[it['name']] = (int, None)
+            elif 'boolean' in type_str:
+                it_map[it['name']] = (bool, None)
+            else:
+                it_map[it['name']] = (str, None)
+            if 'null' in type_list:
+                # Avoid shadowing the built-in `type`
+                py_type, default = it_map[it['name']]
+                it_map[it['name']] = (Optional[py_type], default)
+        self.Base = create_model('Base', **it_map)
+        # define tool
+        # fastapi
+        self.app = FastAPI()
+        @self.app.post('/uploadFile/')
+        async def uploadFile(file: UploadFile):
+            with NamedTemporaryFile(delete=False) as tmp:
+                contents = file.file.read()
+                tmp.write(contents)
+            return {"filename": file.filename, "filepath": tmp.name}
+        @self.app.post(f"/{self.tool_name}/")
+        def tool(data: List[self.Base] = Body(...)):
+            logger.info(data)
+            params = data[0].model_dump()
+            outs = run_tool(self.tool, params, self.outputs, self.read_outs)
+            logger.info(outs)
+            return outs
+    def serve(self):
+        """
+        Starts a FastAPI server to serve the specified tool.
+        This function initializes a FastAPI server and sets up the necessary routes for the specified tool. The server listens for HTTP requests on the specified host and port.
+        """
+        config = uvicorn.Config(app=self.app, host=self.host, port=self.port)
+        self.server = uvicorn.Server(config=config)
+        thread = Thread(target=self.server.run)
+        thread.start()  # non-blocking call
+        # Block until uvicorn reports that it has started
+        while not self.server.started:
+            time.sleep(0.1)
+        print(f"HTTP server is now running on http://{self.host}:{self.port}")
+        self.url = f"http://{self.host}:{self.port}/{self.tool_name}/"
+    def stop(self):
+        """
+        Stops the server by setting the should_exit flag to True.
+        Does nothing if serve() has not been called yet.
+        """
+        if self.server is not None:
+            self.server.should_exit = True
+# api = tool_api(cwl_file='test_data/dockstore-tool-md5sum.cwl')
+# api.serve()
+# api.stop()
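
The type mapping in `tool_api.__init__` reduces to a substring lookup followed by an optional wrap. Below is a sketch of that idea as a standalone helper using only the standard library; `cwl_type_to_field` and `_SCALARS` are hypothetical names, not part of the package, and the lookup order mirrors the `if`/`elif` chain above.

```python
from typing import Optional

# Order matters: the first matching substring wins, as in the if/elif chain.
_SCALARS = (
    ("File", str),
    ("string", str),
    ("double", float),
    ("int", int),
    ("boolean", bool),
)


def cwl_type_to_field(cwl_type):
    """Map a CWL type (string or list of alternatives) to a (type, default) tuple."""
    type_list = cwl_type if isinstance(cwl_type, list) else [cwl_type]
    type_str = " ".join(str(t) for t in type_list)
    py_type = str  # fallback for anything unrecognised
    for needle, candidate in _SCALARS:
        if needle in type_str:
            py_type = candidate
            break
    if "null" in type_list:
        py_type = Optional[py_type]
    return (py_type, None)


print(cwl_type_to_field("int"))
print(cwl_type_to_field(["null", "org.w3id.cwl.cwl.File"]))
```

The returned tuples have the `(type, default)` shape that `pydantic.create_model` accepts as field definitions.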

coala_cli-0.2.0/coala/tool_logic.py ADDED

@@ -0,0 +1,109 @@
+# coala/tool_logic.py
+import os.path
+import gzip
+from cwltool.context import RuntimeContext
+def configure_container_runner(runtime_context: RuntimeContext, container_runner: str) -> None:
+    """
+    Configure the runtime context with the specified container runner.
+    Parameters:
+        runtime_context: The RuntimeContext to configure
+        container_runner: Container runtime to use ('docker', 'podman', 'singularity', 'udocker', etc.)
+    """
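+    # NOTE: in cwltool, default_container names the fallback container *image*
+    # (the --default-container option), not the container runtime.
+    runtime_context.default_container = container_runner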
+    # Set boolean flags for specific container runners
+    runtime_context.singularity = (container_runner == 'singularity')
+    runtime_context.podman = (container_runner == 'podman')
+def _read_file_content(filepath):
+    """Read file content, handling gzipped files."""
+    try:
+        if filepath.endswith('.gz'):
+            with gzip.open(filepath, 'rt', encoding='utf-8') as f:
+                return f.read().replace('\n', '')
+        else:
+            with open(filepath, 'r', encoding='utf-8') as f:
+                return f.read().replace('\n', '')
+    except (UnicodeDecodeError, OSError):
+        # If reading fails (binary file, etc.), return the filepath instead
+        return filepath
+def run_tool(tool, params, outputs, read_outs=False, container_runner=None):
+    """
+    Execute a CWL tool with the given parameters.
+    Parameters:
+        tool: The CWL tool object (created via factory.Factory().make())
+        params: Dictionary of input parameters
+        outputs: List of output field definitions
+        read_outs: Whether to read output file contents (default: False)
+        container_runner: Container runtime to use (default: None, uses tool's default)
+                         Valid values: 'docker', 'podman', 'singularity', 'udocker', etc.
+    Returns:
+        Dictionary mapping output field names to their values
+    """
+    # Prepare params for CWL tool
+    inputs = tool.t.inputs_record_schema['fields']
+    in_dict = {}
+    for i in inputs:
+        in_dict[i['name']] = i['type']
+    for k, v in params.items():
+        if k in in_dict:
+            type_val = in_dict[k]
+            # Handle both list and string types (e.g., ['null', 'File'] or 'File?')
+            # Convert each item to str to handle CommentedMap from ruamel.yaml (enum types)
+            type_str = ' '.join(str(t) for t in type_val) if isinstance(type_val, list) else str(type_val)
+            if 'File' in type_str and v is not None:
+                if isinstance(v, dict) and 'location' in v:
+                    location = v['location']
+                elif isinstance(v, str) and v.startswith('file://'):
+                    location = v
+                elif isinstance(v, str) and os.path.isfile(v):
+                    # Resolve relative paths so the file:// URI is well-formed
+                    location = f"file://{os.path.abspath(v)}"
+                else:
+                    continue  # Do nothing if v is not a file
+                params[k] = {
+                    "class": "File",
+                    "location": location
+                }
+    # Modify the tool's runtime context if container runner is specified
+    if container_runner:
+        # Try to get the original runtime context from the tool
+        original_runtime_context = None
+        if hasattr(tool, 'runtime_context'):
+            original_runtime_context = tool.runtime_context
+        elif hasattr(tool, 't') and hasattr(tool.t, 'runtime_context'):
+            original_runtime_context = tool.t.runtime_context
+        # If we found the runtime context, modify it in place
+        if original_runtime_context:
+            configure_container_runner(original_runtime_context, container_runner)
+    # Execute tool (no need to pass runtime_context if we modified it in place)
+    res = tool(**params)
+    outs = {}
+    for ot in outputs:
+        out_content = res[ot['name']]
+        # Handle both list and string types (e.g., ['null', 'File'] or 'File?')
+        # Convert each item to str to handle CommentedMap from ruamel.yaml (enum types)
+        type_val = ot['type']
+        type_str = ' '.join(str(t) for t in type_val) if isinstance(type_val, list) else str(type_val)
+        if read_outs and 'File' in type_str:
+            # Handle both single File and File[] (array) outputs
+            file_result = res[ot['name']]
+            if isinstance(file_result, list):
+                # File[] - read first file
+                if len(file_result) > 0:
+                    out_file = file_result[0]['location'].replace('file://', '')
+                    out_content = _read_file_content(out_file)
+            elif isinstance(file_result, dict):
+                # Single File (an optional output that resolved to None is left as-is)
+                out_file = file_result['location'].replace('file://', '')
+                out_content = _read_file_content(out_file)
+        outs[ot['name']] = out_content
+    return outs
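
The File-input handling in `run_tool` can be read as a small normalisation step. This is a sketch under stated assumptions: `to_cwl_file` is a hypothetical helper name, and resolving the path with `os.path.abspath` before adding the `file://` prefix is a choice made in this sketch.

```python
import os
import tempfile


def to_cwl_file(value):
    """Turn a path, file:// URI, or {'location': ...} dict into a CWL File object.

    Anything that is not recognisably a file is returned unchanged.
    """
    if isinstance(value, dict) and "location" in value:
        location = value["location"]
    elif isinstance(value, str) and value.startswith("file://"):
        location = value
    elif isinstance(value, str) and os.path.isfile(value):
        location = f"file://{os.path.abspath(value)}"
    else:
        return value
    return {"class": "File", "location": location}


# Usage: an existing local file becomes a File object; other values pass through.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"hello")
print(to_cwl_file(tmp.name))
print(to_cwl_file("file:///data/in.txt"))
print(to_cwl_file(42))
os.unlink(tmp.name)
```

Passing unrecognised values through untouched matches the `continue` branch above, which leaves the parameter as the caller supplied it.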

coala_cli-0.2.0/pyproject.toml ADDED

@@ -0,0 +1,27 @@
+[tool.poetry]
+name = "coala-cli"
+version = "0.2.0"
+description = "Convert any CMD tool into a LLM agent"
+authors = ["Qiang"]
+license = "MIT"
+readme = "README.md"
+packages = [{ include = "coala" }]
+[tool.poetry.dependencies]
+python = ">=3.12,<3.14"
+fastapi = ">=0.114.0"
+requests = "^2.32.3"
+pydantic = "^2.9.2"
+uvicorn = "^0.30.6"
+cwltool = "^3.1.20240909164951"
+mcp = "^1.9.0"
+[build-system]
+requires = ["poetry-core"]
+build-backend = "poetry.core.masonry.api"
+[tool.pytest.ini_options]
+filterwarnings = [
+    "ignore::pydantic.warnings.UnsupportedFieldAttributeWarning",
+]