llms-py 2.0.8__tar.gz → 2.0.9__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {llms_py-2.0.8/llms_py.egg-info → llms_py-2.0.9}/PKG-INFO +124 -39
- {llms_py-2.0.8 → llms_py-2.0.9}/README.md +123 -38
- {llms_py-2.0.8 → llms_py-2.0.9}/llms.json +15 -6
- {llms_py-2.0.8 → llms_py-2.0.9}/llms.py +133 -9
- {llms_py-2.0.8 → llms_py-2.0.9/llms_py.egg-info}/PKG-INFO +124 -39
- {llms_py-2.0.8 → llms_py-2.0.9}/pyproject.toml +1 -1
- {llms_py-2.0.8 → llms_py-2.0.9}/setup.py +1 -1
- {llms_py-2.0.8 → llms_py-2.0.9}/LICENSE +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/MANIFEST.in +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/index.html +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/llms_py.egg-info/SOURCES.txt +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/llms_py.egg-info/dependency_links.txt +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/llms_py.egg-info/entry_points.txt +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/llms_py.egg-info/not-zip-safe +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/llms_py.egg-info/requires.txt +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/llms_py.egg-info/top_level.txt +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/requirements.txt +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/setup.cfg +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/App.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/ChatPrompt.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/Main.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/Recents.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/Sidebar.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/app.css +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/fav.svg +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/highlight.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/idb.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/marked.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/servicestack-client.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/servicestack-vue.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/vue-router.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/vue.min.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/lib/vue.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/markdown.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/tailwind.input.css +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/threadStore.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/typography.css +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui/utils.mjs +0 -0
- {llms_py-2.0.8 → llms_py-2.0.9}/ui.json +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: llms-py
|
|
3
|
-
Version: 2.0.
|
|
3
|
+
Version: 2.0.9
|
|
4
4
|
Summary: A lightweight CLI tool and OpenAI-compatible server for querying multiple Large Language Model (LLM) providers
|
|
5
5
|
Home-page: https://github.com/ServiceStack/llms
|
|
6
6
|
Author: ServiceStack
|
|
@@ -51,7 +51,7 @@ Configure additional providers and models in [llms.json](llms.json)
|
|
|
51
51
|
## Features
|
|
52
52
|
|
|
53
53
|
- **Lightweight**: Single [llms.py](llms.py) Python file with single `aiohttp` dependency
|
|
54
|
-
- **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Mistral
|
|
54
|
+
- **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, Mistral
|
|
55
55
|
- **OpenAI-Compatible API**: Works with any client that supports OpenAI's chat completion API
|
|
56
56
|
- **Configuration Management**: Easy provider enable/disable and configuration management
|
|
57
57
|
- **CLI Interface**: Simple command-line interface for quick interactions
|
|
@@ -510,7 +510,52 @@ llms --default grok-4
|
|
|
510
510
|
|
|
511
511
|
# Update llms.py to latest version
|
|
512
512
|
llms --update
|
|
513
|
-
|
|
513
|
+
|
|
514
|
+
# Pass custom parameters to chat request (URL-encoded)
|
|
515
|
+
llms --args "temperature=0.7&seed=111" "What is 2+2?"
|
|
516
|
+
|
|
517
|
+
# Multiple parameters with different types
|
|
518
|
+
llms --args "temperature=0.5&max_completion_tokens=50" "Tell me a joke"
|
|
519
|
+
|
|
520
|
+
# URL-encoded special characters (stop sequences)
|
|
521
|
+
llms --args "stop=Two,Words" "Count to 5"
|
|
522
|
+
|
|
523
|
+
# Combine with other options
|
|
524
|
+
llms --system "You are helpful" --args "temperature=0.3" --raw "Hello"
|
|
525
|
+
```
|
|
526
|
+
|
|
527
|
+
#### Custom Parameters with `--args`
|
|
528
|
+
|
|
529
|
+
The `--args` option allows you to pass URL-encoded parameters to customize the chat request sent to LLM providers:
|
|
530
|
+
|
|
531
|
+
**Parameter Types:**
|
|
532
|
+
- **Floats**: `temperature=0.7`, `frequency_penalty=0.2`
|
|
533
|
+
- **Integers**: `max_completion_tokens=100`
|
|
534
|
+
- **Booleans**: `store=true`, `verbose=false`, `logprobs=true`
|
|
535
|
+
- **Strings**: `stop=one`
|
|
536
|
+
- **Lists**: `stop=two,words`
|
|
537
|
+
|
|
538
|
+
**Common Parameters:**
|
|
539
|
+
- `temperature`: Controls randomness (0.0 to 2.0)
|
|
540
|
+
- `max_completion_tokens`: Maximum tokens in response
|
|
541
|
+
- `seed`: For reproducible outputs
|
|
542
|
+
- `top_p`: Nucleus sampling parameter
|
|
543
|
+
- `stop`: Stop sequences (URL-encode special chars)
|
|
544
|
+
- `store`: Whether or not to store the output
|
|
545
|
+
- `frequency_penalty`: Penalize new tokens based on frequency
|
|
546
|
+
- `presence_penalty`: Penalize new tokens based on presence
|
|
547
|
+
- `logprobs`: Include log probabilities in response
|
|
548
|
+
- `parallel_tool_calls`: Enable parallel tool calls
|
|
549
|
+
- `prompt_cache_key`: Cache key for prompt
|
|
550
|
+
- `reasoning_effort`: Reasoning effort (low, medium, high, *minimal, *none, *default)
|
|
551
|
+
- `safety_identifier`: A string that uniquely identifies each user
|
|
552
|
+
- `seed`: For reproducible outputs
|
|
553
|
+
- `service_tier`: Service tier (free, standard, premium, *default)
|
|
554
|
+
- `top_logprobs`: Number of top logprobs to return
|
|
555
|
+
- `top_p`: Nucleus sampling parameter
|
|
556
|
+
- `verbosity`: Verbosity level (0, 1, 2, 3, *default)
|
|
557
|
+
- `enable_thinking`: Enable thinking mode (Qwen)
|
|
558
|
+
- `stream`: Enable streaming responses
|
|
514
559
|
|
|
515
560
|
### Default Model Configuration
|
|
516
561
|
|
|
@@ -558,6 +603,42 @@ llms "Explain quantum computing" | glow
|
|
|
558
603
|
|
|
559
604
|
## Supported Providers
|
|
560
605
|
|
|
606
|
+
Any OpenAI-compatible providers and their models can be added by configuring them in [llms.json](./llms.json). By default only AI Providers with free tiers are enabled which will only be "available" if their API Key is set.
|
|
607
|
+
|
|
608
|
+
You can list the available providers, their models and which are enabled or disabled with:
|
|
609
|
+
|
|
610
|
+
```bash
|
|
611
|
+
llms ls
|
|
612
|
+
```
|
|
613
|
+
|
|
614
|
+
They can be enabled/disabled in your `llms.json` file or with:
|
|
615
|
+
|
|
616
|
+
```bash
|
|
617
|
+
llms --enable <provider>
|
|
618
|
+
llms --disable <provider>
|
|
619
|
+
```
|
|
620
|
+
|
|
621
|
+
For a provider to be available, they also require their API Key configured in either your Environment Variables
|
|
622
|
+
or directly in your `llms.json`.
|
|
623
|
+
|
|
624
|
+
### Environment Variables
|
|
625
|
+
|
|
626
|
+
| Provider | Variable | Description | Example |
|
|
627
|
+
|-----------------|---------------------------|---------------------|---------|
|
|
628
|
+
| openrouter_free | `OPENROUTER_FREE_API_KEY` | OpenRouter FREE models API key | `sk-or-...` |
|
|
629
|
+
| groq | `GROQ_API_KEY` | Groq API key | `gsk_...` |
|
|
630
|
+
| google_free | `GOOGLE_FREE_API_KEY` | Google FREE API key | `AIza...` |
|
|
631
|
+
| codestral | `CODESTRAL_API_KEY` | Codestral API key | `...` |
|
|
632
|
+
| ollama | N/A | No API key required | |
|
|
633
|
+
| openrouter | `OPENROUTER_API_KEY` | OpenRouter API key | `sk-or-...` |
|
|
634
|
+
| google | `GOOGLE_API_KEY` | Google API key | `AIza...` |
|
|
635
|
+
| anthropic | `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
|
|
636
|
+
| openai | `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
|
|
637
|
+
| grok | `GROK_API_KEY` | Grok (X.AI) API key | `xai-...` |
|
|
638
|
+
| qwen | `DASHSCOPE_API_KEY` | Qwen (Alibaba) API key | `sk-...` |
|
|
639
|
+
| z.ai | `ZAI_API_KEY` | Z.ai API key | `sk-...` |
|
|
640
|
+
| mistral | `MISTRAL_API_KEY` | Mistral API key | `...` |
|
|
641
|
+
|
|
561
642
|
### OpenAI
|
|
562
643
|
- **Type**: `OpenAiProvider`
|
|
563
644
|
- **Models**: GPT-5, GPT-5 Codex, GPT-4o, GPT-4o-mini, o3, etc.
|
|
@@ -588,6 +669,26 @@ export GOOGLE_API_KEY="your-key"
|
|
|
588
669
|
llms --enable google_free
|
|
589
670
|
```
|
|
590
671
|
|
|
672
|
+
### OpenRouter
|
|
673
|
+
- **Type**: `OpenAiProvider`
|
|
674
|
+
- **Models**: 100+ models from various providers
|
|
675
|
+
- **Features**: Access to latest models, free tier available
|
|
676
|
+
|
|
677
|
+
```bash
|
|
678
|
+
export OPENROUTER_API_KEY="your-key"
|
|
679
|
+
llms --enable openrouter
|
|
680
|
+
```
|
|
681
|
+
|
|
682
|
+
### Grok (X.AI)
|
|
683
|
+
- **Type**: `OpenAiProvider`
|
|
684
|
+
- **Models**: Grok-4, Grok-3, Grok-3-mini, Grok-code-fast-1, etc.
|
|
685
|
+
- **Features**: Real-time information, humor, uncensored responses
|
|
686
|
+
|
|
687
|
+
```bash
|
|
688
|
+
export GROK_API_KEY="your-key"
|
|
689
|
+
llms --enable grok
|
|
690
|
+
```
|
|
691
|
+
|
|
591
692
|
### Groq
|
|
592
693
|
- **Type**: `OpenAiProvider`
|
|
593
694
|
- **Models**: Llama 3.3, Gemma 2, Kimi K2, etc.
|
|
@@ -608,44 +709,44 @@ llms --enable groq
|
|
|
608
709
|
llms --enable ollama
|
|
609
710
|
```
|
|
610
711
|
|
|
611
|
-
###
|
|
712
|
+
### Qwen (Alibaba Cloud)
|
|
612
713
|
- **Type**: `OpenAiProvider`
|
|
613
|
-
- **Models**:
|
|
614
|
-
- **Features**:
|
|
714
|
+
- **Models**: Qwen3-max, Qwen-max, Qwen-plus, Qwen2.5-VL, QwQ-plus, etc.
|
|
715
|
+
- **Features**: Multilingual, vision models, coding, reasoning, audio processing
|
|
615
716
|
|
|
616
717
|
```bash
|
|
617
|
-
export
|
|
618
|
-
llms --enable
|
|
718
|
+
export DASHSCOPE_API_KEY="your-key"
|
|
719
|
+
llms --enable qwen
|
|
619
720
|
```
|
|
620
721
|
|
|
621
|
-
###
|
|
722
|
+
### Z.ai
|
|
622
723
|
- **Type**: `OpenAiProvider`
|
|
623
|
-
- **Models**:
|
|
624
|
-
- **Features**:
|
|
724
|
+
- **Models**: GLM-4.6, GLM-4.5, GLM-4.5-air, GLM-4.5-x, GLM-4.5-airx, GLM-4.5-flash, GLM-4:32b
|
|
725
|
+
- **Features**: Advanced language models with strong reasoning capabilities
|
|
625
726
|
|
|
626
727
|
```bash
|
|
627
|
-
export
|
|
628
|
-
llms --enable
|
|
728
|
+
export ZAI_API_KEY="your-key"
|
|
729
|
+
llms --enable z.ai
|
|
629
730
|
```
|
|
630
731
|
|
|
631
|
-
###
|
|
732
|
+
### Mistral
|
|
632
733
|
- **Type**: `OpenAiProvider`
|
|
633
|
-
- **Models**:
|
|
634
|
-
- **Features**:
|
|
734
|
+
- **Models**: Mistral Large, Codestral, Pixtral, etc.
|
|
735
|
+
- **Features**: Code generation, multilingual
|
|
635
736
|
|
|
636
737
|
```bash
|
|
637
|
-
export
|
|
638
|
-
llms --enable
|
|
738
|
+
export MISTRAL_API_KEY="your-key"
|
|
739
|
+
llms --enable mistral
|
|
639
740
|
```
|
|
640
741
|
|
|
641
|
-
###
|
|
742
|
+
### Codestral
|
|
642
743
|
- **Type**: `OpenAiProvider`
|
|
643
|
-
- **Models**:
|
|
644
|
-
- **Features**:
|
|
744
|
+
- **Models**: Codestral
|
|
745
|
+
- **Features**: Code generation
|
|
645
746
|
|
|
646
747
|
```bash
|
|
647
|
-
export
|
|
648
|
-
llms --enable
|
|
748
|
+
export CODESTRAL_API_KEY="your-key"
|
|
749
|
+
llms --enable codestral
|
|
649
750
|
```
|
|
650
751
|
|
|
651
752
|
## Model Routing
|
|
@@ -654,22 +755,6 @@ The tool automatically routes requests to the first available provider that supp
|
|
|
654
755
|
|
|
655
756
|
Example: If both OpenAI and OpenRouter support `kimi-k2`, the request will first try OpenRouter (free), then fall back to Groq than OpenRouter (Paid) if requests fails.
|
|
656
757
|
|
|
657
|
-
## Environment Variables
|
|
658
|
-
|
|
659
|
-
| Variable | Description | Example |
|
|
660
|
-
|----------|-------------|---------|
|
|
661
|
-
| `LLMS_CONFIG_PATH` | Custom config file path | `/path/to/llms.json` |
|
|
662
|
-
| `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
|
|
663
|
-
| `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
|
|
664
|
-
| `GOOGLE_API_KEY` | Google API key | `AIza...` |
|
|
665
|
-
| `GROQ_API_KEY` | Groq API key | `gsk_...` |
|
|
666
|
-
| `MISTRAL_API_KEY` | Mistral API key | `...` |
|
|
667
|
-
| `OPENROUTER_API_KEY` | OpenRouter API key | `sk-or-...` |
|
|
668
|
-
| `OPENROUTER_FREE_API_KEY` | OpenRouter free tier key | `sk-or-...` |
|
|
669
|
-
| `CODESTRAL_API_KEY` | Codestral API key | `...` |
|
|
670
|
-
| `GROK_API_KEY` | Grok (X.AI) API key | `xai-...` |
|
|
671
|
-
| `DASHSCOPE_API_KEY` | Qwen (Alibaba Cloud) API key | `sk-...` |
|
|
672
|
-
|
|
673
758
|
## Configuration Examples
|
|
674
759
|
|
|
675
760
|
### Minimal Configuration
|
|
@@ -11,7 +11,7 @@ Configure additional providers and models in [llms.json](llms.json)
|
|
|
11
11
|
## Features
|
|
12
12
|
|
|
13
13
|
- **Lightweight**: Single [llms.py](llms.py) Python file with single `aiohttp` dependency
|
|
14
|
-
- **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Mistral
|
|
14
|
+
- **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, Mistral
|
|
15
15
|
- **OpenAI-Compatible API**: Works with any client that supports OpenAI's chat completion API
|
|
16
16
|
- **Configuration Management**: Easy provider enable/disable and configuration management
|
|
17
17
|
- **CLI Interface**: Simple command-line interface for quick interactions
|
|
@@ -470,7 +470,52 @@ llms --default grok-4
|
|
|
470
470
|
|
|
471
471
|
# Update llms.py to latest version
|
|
472
472
|
llms --update
|
|
473
|
-
|
|
473
|
+
|
|
474
|
+
# Pass custom parameters to chat request (URL-encoded)
|
|
475
|
+
llms --args "temperature=0.7&seed=111" "What is 2+2?"
|
|
476
|
+
|
|
477
|
+
# Multiple parameters with different types
|
|
478
|
+
llms --args "temperature=0.5&max_completion_tokens=50" "Tell me a joke"
|
|
479
|
+
|
|
480
|
+
# URL-encoded special characters (stop sequences)
|
|
481
|
+
llms --args "stop=Two,Words" "Count to 5"
|
|
482
|
+
|
|
483
|
+
# Combine with other options
|
|
484
|
+
llms --system "You are helpful" --args "temperature=0.3" --raw "Hello"
|
|
485
|
+
```
|
|
486
|
+
|
|
487
|
+
#### Custom Parameters with `--args`
|
|
488
|
+
|
|
489
|
+
The `--args` option allows you to pass URL-encoded parameters to customize the chat request sent to LLM providers:
|
|
490
|
+
|
|
491
|
+
**Parameter Types:**
|
|
492
|
+
- **Floats**: `temperature=0.7`, `frequency_penalty=0.2`
|
|
493
|
+
- **Integers**: `max_completion_tokens=100`
|
|
494
|
+
- **Booleans**: `store=true`, `verbose=false`, `logprobs=true`
|
|
495
|
+
- **Strings**: `stop=one`
|
|
496
|
+
- **Lists**: `stop=two,words`
|
|
497
|
+
|
|
498
|
+
**Common Parameters:**
|
|
499
|
+
- `temperature`: Controls randomness (0.0 to 2.0)
|
|
500
|
+
- `max_completion_tokens`: Maximum tokens in response
|
|
501
|
+
- `seed`: For reproducible outputs
|
|
502
|
+
- `top_p`: Nucleus sampling parameter
|
|
503
|
+
- `stop`: Stop sequences (URL-encode special chars)
|
|
504
|
+
- `store`: Whether or not to store the output
|
|
505
|
+
- `frequency_penalty`: Penalize new tokens based on frequency
|
|
506
|
+
- `presence_penalty`: Penalize new tokens based on presence
|
|
507
|
+
- `logprobs`: Include log probabilities in response
|
|
508
|
+
- `parallel_tool_calls`: Enable parallel tool calls
|
|
509
|
+
- `prompt_cache_key`: Cache key for prompt
|
|
510
|
+
- `reasoning_effort`: Reasoning effort (low, medium, high, *minimal, *none, *default)
|
|
511
|
+
- `safety_identifier`: A string that uniquely identifies each user
|
|
512
|
+
- `seed`: For reproducible outputs
|
|
513
|
+
- `service_tier`: Service tier (free, standard, premium, *default)
|
|
514
|
+
- `top_logprobs`: Number of top logprobs to return
|
|
515
|
+
- `top_p`: Nucleus sampling parameter
|
|
516
|
+
- `verbosity`: Verbosity level (0, 1, 2, 3, *default)
|
|
517
|
+
- `enable_thinking`: Enable thinking mode (Qwen)
|
|
518
|
+
- `stream`: Enable streaming responses
|
|
474
519
|
|
|
475
520
|
### Default Model Configuration
|
|
476
521
|
|
|
@@ -518,6 +563,42 @@ llms "Explain quantum computing" | glow
|
|
|
518
563
|
|
|
519
564
|
## Supported Providers
|
|
520
565
|
|
|
566
|
+
Any OpenAI-compatible providers and their models can be added by configuring them in [llms.json](./llms.json). By default only AI Providers with free tiers are enabled which will only be "available" if their API Key is set.
|
|
567
|
+
|
|
568
|
+
You can list the available providers, their models and which are enabled or disabled with:
|
|
569
|
+
|
|
570
|
+
```bash
|
|
571
|
+
llms ls
|
|
572
|
+
```
|
|
573
|
+
|
|
574
|
+
They can be enabled/disabled in your `llms.json` file or with:
|
|
575
|
+
|
|
576
|
+
```bash
|
|
577
|
+
llms --enable <provider>
|
|
578
|
+
llms --disable <provider>
|
|
579
|
+
```
|
|
580
|
+
|
|
581
|
+
For a provider to be available, they also require their API Key configured in either your Environment Variables
|
|
582
|
+
or directly in your `llms.json`.
|
|
583
|
+
|
|
584
|
+
### Environment Variables
|
|
585
|
+
|
|
586
|
+
| Provider | Variable | Description | Example |
|
|
587
|
+
|-----------------|---------------------------|---------------------|---------|
|
|
588
|
+
| openrouter_free | `OPENROUTER_FREE_API_KEY` | OpenRouter FREE models API key | `sk-or-...` |
|
|
589
|
+
| groq | `GROQ_API_KEY` | Groq API key | `gsk_...` |
|
|
590
|
+
| google_free | `GOOGLE_FREE_API_KEY` | Google FREE API key | `AIza...` |
|
|
591
|
+
| codestral | `CODESTRAL_API_KEY` | Codestral API key | `...` |
|
|
592
|
+
| ollama | N/A | No API key required | |
|
|
593
|
+
| openrouter | `OPENROUTER_API_KEY` | OpenRouter API key | `sk-or-...` |
|
|
594
|
+
| google | `GOOGLE_API_KEY` | Google API key | `AIza...` |
|
|
595
|
+
| anthropic | `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
|
|
596
|
+
| openai | `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
|
|
597
|
+
| grok | `GROK_API_KEY` | Grok (X.AI) API key | `xai-...` |
|
|
598
|
+
| qwen | `DASHSCOPE_API_KEY` | Qwen (Alibaba) API key | `sk-...` |
|
|
599
|
+
| z.ai | `ZAI_API_KEY` | Z.ai API key | `sk-...` |
|
|
600
|
+
| mistral | `MISTRAL_API_KEY` | Mistral API key | `...` |
|
|
601
|
+
|
|
521
602
|
### OpenAI
|
|
522
603
|
- **Type**: `OpenAiProvider`
|
|
523
604
|
- **Models**: GPT-5, GPT-5 Codex, GPT-4o, GPT-4o-mini, o3, etc.
|
|
@@ -548,6 +629,26 @@ export GOOGLE_API_KEY="your-key"
|
|
|
548
629
|
llms --enable google_free
|
|
549
630
|
```
|
|
550
631
|
|
|
632
|
+
### OpenRouter
|
|
633
|
+
- **Type**: `OpenAiProvider`
|
|
634
|
+
- **Models**: 100+ models from various providers
|
|
635
|
+
- **Features**: Access to latest models, free tier available
|
|
636
|
+
|
|
637
|
+
```bash
|
|
638
|
+
export OPENROUTER_API_KEY="your-key"
|
|
639
|
+
llms --enable openrouter
|
|
640
|
+
```
|
|
641
|
+
|
|
642
|
+
### Grok (X.AI)
|
|
643
|
+
- **Type**: `OpenAiProvider`
|
|
644
|
+
- **Models**: Grok-4, Grok-3, Grok-3-mini, Grok-code-fast-1, etc.
|
|
645
|
+
- **Features**: Real-time information, humor, uncensored responses
|
|
646
|
+
|
|
647
|
+
```bash
|
|
648
|
+
export GROK_API_KEY="your-key"
|
|
649
|
+
llms --enable grok
|
|
650
|
+
```
|
|
651
|
+
|
|
551
652
|
### Groq
|
|
552
653
|
- **Type**: `OpenAiProvider`
|
|
553
654
|
- **Models**: Llama 3.3, Gemma 2, Kimi K2, etc.
|
|
@@ -568,44 +669,44 @@ llms --enable groq
|
|
|
568
669
|
llms --enable ollama
|
|
569
670
|
```
|
|
570
671
|
|
|
571
|
-
###
|
|
672
|
+
### Qwen (Alibaba Cloud)
|
|
572
673
|
- **Type**: `OpenAiProvider`
|
|
573
|
-
- **Models**:
|
|
574
|
-
- **Features**:
|
|
674
|
+
- **Models**: Qwen3-max, Qwen-max, Qwen-plus, Qwen2.5-VL, QwQ-plus, etc.
|
|
675
|
+
- **Features**: Multilingual, vision models, coding, reasoning, audio processing
|
|
575
676
|
|
|
576
677
|
```bash
|
|
577
|
-
export
|
|
578
|
-
llms --enable
|
|
678
|
+
export DASHSCOPE_API_KEY="your-key"
|
|
679
|
+
llms --enable qwen
|
|
579
680
|
```
|
|
580
681
|
|
|
581
|
-
###
|
|
682
|
+
### Z.ai
|
|
582
683
|
- **Type**: `OpenAiProvider`
|
|
583
|
-
- **Models**:
|
|
584
|
-
- **Features**:
|
|
684
|
+
- **Models**: GLM-4.6, GLM-4.5, GLM-4.5-air, GLM-4.5-x, GLM-4.5-airx, GLM-4.5-flash, GLM-4:32b
|
|
685
|
+
- **Features**: Advanced language models with strong reasoning capabilities
|
|
585
686
|
|
|
586
687
|
```bash
|
|
587
|
-
export
|
|
588
|
-
llms --enable
|
|
688
|
+
export ZAI_API_KEY="your-key"
|
|
689
|
+
llms --enable z.ai
|
|
589
690
|
```
|
|
590
691
|
|
|
591
|
-
###
|
|
692
|
+
### Mistral
|
|
592
693
|
- **Type**: `OpenAiProvider`
|
|
593
|
-
- **Models**:
|
|
594
|
-
- **Features**:
|
|
694
|
+
- **Models**: Mistral Large, Codestral, Pixtral, etc.
|
|
695
|
+
- **Features**: Code generation, multilingual
|
|
595
696
|
|
|
596
697
|
```bash
|
|
597
|
-
export
|
|
598
|
-
llms --enable
|
|
698
|
+
export MISTRAL_API_KEY="your-key"
|
|
699
|
+
llms --enable mistral
|
|
599
700
|
```
|
|
600
701
|
|
|
601
|
-
###
|
|
702
|
+
### Codestral
|
|
602
703
|
- **Type**: `OpenAiProvider`
|
|
603
|
-
- **Models**:
|
|
604
|
-
- **Features**:
|
|
704
|
+
- **Models**: Codestral
|
|
705
|
+
- **Features**: Code generation
|
|
605
706
|
|
|
606
707
|
```bash
|
|
607
|
-
export
|
|
608
|
-
llms --enable
|
|
708
|
+
export CODESTRAL_API_KEY="your-key"
|
|
709
|
+
llms --enable codestral
|
|
609
710
|
```
|
|
610
711
|
|
|
611
712
|
## Model Routing
|
|
@@ -614,22 +715,6 @@ The tool automatically routes requests to the first available provider that supp
|
|
|
614
715
|
|
|
615
716
|
Example: If both OpenAI and OpenRouter support `kimi-k2`, the request will first try OpenRouter (free), then fall back to Groq than OpenRouter (Paid) if requests fails.
|
|
616
717
|
|
|
617
|
-
## Environment Variables
|
|
618
|
-
|
|
619
|
-
| Variable | Description | Example |
|
|
620
|
-
|----------|-------------|---------|
|
|
621
|
-
| `LLMS_CONFIG_PATH` | Custom config file path | `/path/to/llms.json` |
|
|
622
|
-
| `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
|
|
623
|
-
| `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
|
|
624
|
-
| `GOOGLE_API_KEY` | Google API key | `AIza...` |
|
|
625
|
-
| `GROQ_API_KEY` | Groq API key | `gsk_...` |
|
|
626
|
-
| `MISTRAL_API_KEY` | Mistral API key | `...` |
|
|
627
|
-
| `OPENROUTER_API_KEY` | OpenRouter API key | `sk-or-...` |
|
|
628
|
-
| `OPENROUTER_FREE_API_KEY` | OpenRouter free tier key | `sk-or-...` |
|
|
629
|
-
| `CODESTRAL_API_KEY` | Codestral API key | `...` |
|
|
630
|
-
| `GROK_API_KEY` | Grok (X.AI) API key | `xai-...` |
|
|
631
|
-
| `DASHSCOPE_API_KEY` | Qwen (Alibaba Cloud) API key | `sk-...` |
|
|
632
|
-
|
|
633
718
|
## Configuration Examples
|
|
634
719
|
|
|
635
720
|
### Minimal Configuration
|
|
@@ -9,7 +9,12 @@
|
|
|
9
9
|
"messages": [
|
|
10
10
|
{
|
|
11
11
|
"role": "user",
|
|
12
|
-
"content":
|
|
12
|
+
"content": [
|
|
13
|
+
{
|
|
14
|
+
"type": "text",
|
|
15
|
+
"text": ""
|
|
16
|
+
}
|
|
17
|
+
]
|
|
13
18
|
}
|
|
14
19
|
]
|
|
15
20
|
},
|
|
@@ -389,7 +394,8 @@
|
|
|
389
394
|
"qwen2.5-vl:7b": "qwen2.5-vl-7b-instruct",
|
|
390
395
|
"qwen2.5-vl:3b": "qwen2.5-vl-3b-instruct",
|
|
391
396
|
"qwen2.5-omni:7b": "qwen2.5-omni-7b"
|
|
392
|
-
}
|
|
397
|
+
},
|
|
398
|
+
"enable_thinking": false
|
|
393
399
|
},
|
|
394
400
|
"z.ai": {
|
|
395
401
|
"enabled": false,
|
|
@@ -404,7 +410,8 @@
|
|
|
404
410
|
"glm-4.5-airx": "glm-4.5-airx",
|
|
405
411
|
"glm-4.5-flash": "glm-4.5-flash",
|
|
406
412
|
"glm-4:32b": "glm-4-32b-0414-128k"
|
|
407
|
-
}
|
|
413
|
+
},
|
|
414
|
+
"temperature": 0.7
|
|
408
415
|
},
|
|
409
416
|
"mistral": {
|
|
410
417
|
"enabled": false,
|
|
@@ -417,20 +424,22 @@
|
|
|
417
424
|
"devstral-medium": "devstral-medium-2507",
|
|
418
425
|
"codestral:22b": "codestral-latest",
|
|
419
426
|
"mistral-ocr": "mistral-ocr-latest",
|
|
420
|
-
"voxtral-mini": "voxtral-mini-latest",
|
|
421
427
|
"mistral-small3.2:24b": "mistral-small-latest",
|
|
422
428
|
"magistral-small": "magistral-small-latest",
|
|
423
429
|
"devstral-small": "devstral-small-2507",
|
|
424
430
|
"voxtral-small": "voxtral-small-latest",
|
|
431
|
+
"voxtral-mini": "voxtral-mini-latest",
|
|
432
|
+
"codestral-embed": "codestral-embed-2505",
|
|
433
|
+
"mistral-embed": "mistral-embed",
|
|
425
434
|
"mistral-large:123b": "mistral-large-latest",
|
|
426
435
|
"pixtral-large:124b": "pixtral-large-latest",
|
|
427
436
|
"pixtral:12b": "pixtral-12b",
|
|
428
|
-
"mistral-nemo:12b": "mistral-nemo",
|
|
437
|
+
"mistral-nemo:12b": "open-mistral-nemo",
|
|
429
438
|
"mistral-saba": "mistral-saba-latest",
|
|
430
439
|
"mistral:7b": "open-mistral-7b",
|
|
431
440
|
"mixtral:8x7b": "open-mixtral-8x7b",
|
|
432
441
|
"mixtral:8x22b": "open-mixtral-8x22b",
|
|
433
|
-
"ministral:8b": "ministral-
|
|
442
|
+
"ministral:8b": "ministral-8b-latest",
|
|
434
443
|
"ministral:3b": "ministral-3b-latest"
|
|
435
444
|
}
|
|
436
445
|
}
|
|
@@ -14,6 +14,7 @@ import mimetypes
|
|
|
14
14
|
import traceback
|
|
15
15
|
import sys
|
|
16
16
|
import site
|
|
17
|
+
from urllib.parse import parse_qs
|
|
17
18
|
|
|
18
19
|
import aiohttp
|
|
19
20
|
from aiohttp import web
|
|
@@ -21,7 +22,7 @@ from aiohttp import web
|
|
|
21
22
|
from pathlib import Path
|
|
22
23
|
from importlib import resources # Py≥3.9 (pip install importlib_resources for 3.7/3.8)
|
|
23
24
|
|
|
24
|
-
VERSION = "2.0.
|
|
25
|
+
VERSION = "2.0.9"
|
|
25
26
|
_ROOT = None
|
|
26
27
|
g_config_path = None
|
|
27
28
|
g_ui_path = None
|
|
@@ -63,7 +64,8 @@ def chat_summary(chat):
|
|
|
63
64
|
elif 'file' in item:
|
|
64
65
|
if 'file_data' in item['file']:
|
|
65
66
|
data = item['file']['file_data']
|
|
66
|
-
|
|
67
|
+
prefix = url.split(',', 1)[0]
|
|
68
|
+
item['file']['file_data'] = prefix + f",({len(url) - len(prefix)})"
|
|
67
69
|
return json.dumps(clone, indent=2)
|
|
68
70
|
|
|
69
71
|
def gemini_chat_summary(gemini_chat):
|
|
@@ -89,6 +91,60 @@ def is_url(url):
|
|
|
89
91
|
def get_filename(file):
|
|
90
92
|
return file.rsplit('/',1)[1] if '/' in file else 'file'
|
|
91
93
|
|
|
94
|
+
def parse_args_params(args_str):
|
|
95
|
+
"""Parse URL-encoded parameters and return a dictionary."""
|
|
96
|
+
if not args_str:
|
|
97
|
+
return {}
|
|
98
|
+
|
|
99
|
+
# Parse the URL-encoded string
|
|
100
|
+
parsed = parse_qs(args_str, keep_blank_values=True)
|
|
101
|
+
|
|
102
|
+
# Convert to simple dict with single values (not lists)
|
|
103
|
+
result = {}
|
|
104
|
+
for key, values in parsed.items():
|
|
105
|
+
if len(values) == 1:
|
|
106
|
+
value = values[0]
|
|
107
|
+
# Try to convert to appropriate types
|
|
108
|
+
if value.lower() == 'true':
|
|
109
|
+
result[key] = True
|
|
110
|
+
elif value.lower() == 'false':
|
|
111
|
+
result[key] = False
|
|
112
|
+
elif value.isdigit():
|
|
113
|
+
result[key] = int(value)
|
|
114
|
+
else:
|
|
115
|
+
try:
|
|
116
|
+
# Try to parse as float
|
|
117
|
+
result[key] = float(value)
|
|
118
|
+
except ValueError:
|
|
119
|
+
# Keep as string
|
|
120
|
+
result[key] = value
|
|
121
|
+
else:
|
|
122
|
+
# Multiple values, keep as list
|
|
123
|
+
result[key] = values
|
|
124
|
+
|
|
125
|
+
return result
|
|
126
|
+
|
|
127
|
+
def apply_args_to_chat(chat, args_params):
|
|
128
|
+
"""Apply parsed arguments to the chat request."""
|
|
129
|
+
if not args_params:
|
|
130
|
+
return chat
|
|
131
|
+
|
|
132
|
+
# Apply each parameter to the chat request
|
|
133
|
+
for key, value in args_params.items():
|
|
134
|
+
if isinstance(value, str):
|
|
135
|
+
if key == 'stop':
|
|
136
|
+
if ',' in value:
|
|
137
|
+
value = value.split(',')
|
|
138
|
+
elif key == 'max_completion_tokens' or key == 'max_tokens' or key == 'n' or key == 'seed' or key == 'top_logprobs':
|
|
139
|
+
value = int(value)
|
|
140
|
+
elif key == 'temperature' or key == 'top_p' or key == 'frequency_penalty' or key == 'presence_penalty':
|
|
141
|
+
value = float(value)
|
|
142
|
+
elif key == 'store' or key == 'logprobs' or key == 'enable_thinking' or key == 'parallel_tool_calls' or key == 'stream':
|
|
143
|
+
value = bool(value)
|
|
144
|
+
chat[key] = value
|
|
145
|
+
|
|
146
|
+
return chat
|
|
147
|
+
|
|
92
148
|
def is_base_64(data):
|
|
93
149
|
try:
|
|
94
150
|
base64.b64decode(data)
|
|
@@ -190,8 +246,9 @@ async def process_chat(chat):
|
|
|
190
246
|
content = f.read()
|
|
191
247
|
file['filename'] = get_filename(url)
|
|
192
248
|
file['file_data'] = f"data:{mimetype};base64,{base64.b64encode(content).decode('utf-8')}"
|
|
193
|
-
elif
|
|
194
|
-
|
|
249
|
+
elif url.startswith('data:'):
|
|
250
|
+
if 'filename' not in file:
|
|
251
|
+
file['filename'] = 'file'
|
|
195
252
|
pass # use base64 data as-is
|
|
196
253
|
else:
|
|
197
254
|
raise Exception(f"Invalid file: {url}")
|
|
@@ -232,6 +289,25 @@ class OpenAiProvider:
|
|
|
232
289
|
if api_key is not None:
|
|
233
290
|
self.headers["Authorization"] = f"Bearer {api_key}"
|
|
234
291
|
|
|
292
|
+
self.frequency_penalty = float(kwargs['frequency_penalty']) if 'frequency_penalty' in kwargs else None
|
|
293
|
+
self.max_completion_tokens = int(kwargs['max_completion_tokens']) if 'max_completion_tokens' in kwargs else None
|
|
294
|
+
self.n = int(kwargs['n']) if 'n' in kwargs else None
|
|
295
|
+
self.parallel_tool_calls = bool(kwargs['parallel_tool_calls']) if 'parallel_tool_calls' in kwargs else None
|
|
296
|
+
self.presence_penalty = float(kwargs['presence_penalty']) if 'presence_penalty' in kwargs else None
|
|
297
|
+
self.prompt_cache_key = kwargs['prompt_cache_key'] if 'prompt_cache_key' in kwargs else None
|
|
298
|
+
self.reasoning_effort = kwargs['reasoning_effort'] if 'reasoning_effort' in kwargs else None
|
|
299
|
+
self.safety_identifier = kwargs['safety_identifier'] if 'safety_identifier' in kwargs else None
|
|
300
|
+
self.seed = int(kwargs['seed']) if 'seed' in kwargs else None
|
|
301
|
+
self.service_tier = kwargs['service_tier'] if 'service_tier' in kwargs else None
|
|
302
|
+
self.stop = kwargs['stop'] if 'stop' in kwargs else None
|
|
303
|
+
self.store = bool(kwargs['store']) if 'store' in kwargs else None
|
|
304
|
+
self.temperature = float(kwargs['temperature']) if 'temperature' in kwargs else None
|
|
305
|
+
self.top_logprobs = int(kwargs['top_logprobs']) if 'top_logprobs' in kwargs else None
|
|
306
|
+
self.top_p = float(kwargs['top_p']) if 'top_p' in kwargs else None
|
|
307
|
+
self.verbosity = kwargs['verbosity'] if 'verbosity' in kwargs else None
|
|
308
|
+
self.stream = bool(kwargs['stream']) if 'stream' in kwargs else None
|
|
309
|
+
self.enable_thinking = bool(kwargs['enable_thinking']) if 'enable_thinking' in kwargs else None
|
|
310
|
+
|
|
235
311
|
@classmethod
|
|
236
312
|
def test(cls, base_url=None, api_key=None, models={}, **kwargs):
|
|
237
313
|
return base_url is not None and api_key is not None and len(models) > 0
|
|
@@ -247,6 +323,41 @@ class OpenAiProvider:
|
|
|
247
323
|
# with open(os.path.join(os.path.dirname(__file__), 'chat.wip.json'), "w") as f:
|
|
248
324
|
# f.write(json.dumps(chat, indent=2))
|
|
249
325
|
|
|
326
|
+
if self.frequency_penalty is not None:
|
|
327
|
+
chat['frequency_penalty'] = self.frequency_penalty
|
|
328
|
+
if self.max_completion_tokens is not None:
|
|
329
|
+
chat['max_completion_tokens'] = self.max_completion_tokens
|
|
330
|
+
if self.n is not None:
|
|
331
|
+
chat['n'] = self.n
|
|
332
|
+
if self.parallel_tool_calls is not None:
|
|
333
|
+
chat['parallel_tool_calls'] = self.parallel_tool_calls
|
|
334
|
+
if self.presence_penalty is not None:
|
|
335
|
+
chat['presence_penalty'] = self.presence_penalty
|
|
336
|
+
if self.prompt_cache_key is not None:
|
|
337
|
+
chat['prompt_cache_key'] = self.prompt_cache_key
|
|
338
|
+
if self.reasoning_effort is not None:
|
|
339
|
+
chat['reasoning_effort'] = self.reasoning_effort
|
|
340
|
+
if self.safety_identifier is not None:
|
|
341
|
+
chat['safety_identifier'] = self.safety_identifier
|
|
342
|
+
if self.seed is not None:
|
|
343
|
+
chat['seed'] = self.seed
|
|
344
|
+
if self.service_tier is not None:
|
|
345
|
+
chat['service_tier'] = self.service_tier
|
|
346
|
+
if self.stop is not None:
|
|
347
|
+
chat['stop'] = self.stop
|
|
348
|
+
if self.store is not None:
|
|
349
|
+
chat['store'] = self.store
|
|
350
|
+
if self.temperature is not None:
|
|
351
|
+
chat['temperature'] = self.temperature
|
|
352
|
+
if self.top_logprobs is not None:
|
|
353
|
+
chat['top_logprobs'] = self.top_logprobs
|
|
354
|
+
if self.top_p is not None:
|
|
355
|
+
chat['top_p'] = self.top_p
|
|
356
|
+
if self.verbosity is not None:
|
|
357
|
+
chat['verbosity'] = self.verbosity
|
|
358
|
+
if self.enable_thinking is not None:
|
|
359
|
+
chat['enable_thinking'] = self.enable_thinking
|
|
360
|
+
|
|
250
361
|
chat = await process_chat(chat)
|
|
251
362
|
_log(f"POST {self.chat_url}")
|
|
252
363
|
_log(chat_summary(chat))
|
|
@@ -537,10 +648,14 @@ async def chat_completion(chat):
|
|
|
537
648
|
# If we get here, all providers failed
|
|
538
649
|
raise first_exception
|
|
539
650
|
|
|
540
|
-
async def cli_chat(chat, image=None, audio=None, file=None, raw=False):
|
|
651
|
+
async def cli_chat(chat, image=None, audio=None, file=None, args=None, raw=False):
|
|
541
652
|
if g_default_model:
|
|
542
653
|
chat['model'] = g_default_model
|
|
543
654
|
|
|
655
|
+
# Apply args parameters to chat request
|
|
656
|
+
if args:
|
|
657
|
+
chat = apply_args_to_chat(chat, args)
|
|
658
|
+
|
|
544
659
|
# process_chat downloads the image, just adding the reference here
|
|
545
660
|
if image is not None:
|
|
546
661
|
first_message = None
|
|
@@ -925,6 +1040,7 @@ def main():
|
|
|
925
1040
|
parser.add_argument('--image', default=None, help='Image input to use in chat completion')
|
|
926
1041
|
parser.add_argument('--audio', default=None, help='Audio input to use in chat completion')
|
|
927
1042
|
parser.add_argument('--file', default=None, help='File input to use in chat completion')
|
|
1043
|
+
parser.add_argument('--args', default=None, help='URL-encoded parameters to add to chat request (e.g. "temperature=0.7&seed=111")', metavar='PARAMS')
|
|
928
1044
|
parser.add_argument('--raw', action='store_true', help='Return raw AI JSON response')
|
|
929
1045
|
|
|
930
1046
|
parser.add_argument('--list', action='store_true', help='Show list of enabled providers and their models (alias ls provider?)')
|
|
@@ -1256,13 +1372,21 @@ def main():
|
|
|
1256
1372
|
if len(extra_args) > 0:
|
|
1257
1373
|
prompt = ' '.join(extra_args)
|
|
1258
1374
|
# replace content of last message if exists, else add
|
|
1259
|
-
last_msg = chat['messages'][-1]
|
|
1260
|
-
if last_msg['role'] == 'user':
|
|
1261
|
-
last_msg['content']
|
|
1375
|
+
last_msg = chat['messages'][-1] if 'messages' in chat else None
|
|
1376
|
+
if last_msg and last_msg['role'] == 'user':
|
|
1377
|
+
if isinstance(last_msg['content'], list):
|
|
1378
|
+
last_msg['content'][-1]['text'] = prompt
|
|
1379
|
+
else:
|
|
1380
|
+
last_msg['content'] = prompt
|
|
1262
1381
|
else:
|
|
1263
1382
|
chat['messages'].append({'role': 'user', 'content': prompt})
|
|
1264
1383
|
|
|
1265
|
-
|
|
1384
|
+
# Parse args parameters if provided
|
|
1385
|
+
args = None
|
|
1386
|
+
if cli_args.args is not None:
|
|
1387
|
+
args = parse_args_params(cli_args.args)
|
|
1388
|
+
|
|
1389
|
+
asyncio.run(cli_chat(chat, image=cli_args.image, audio=cli_args.audio, file=cli_args.file, args=args, raw=cli_args.raw))
|
|
1266
1390
|
exit(0)
|
|
1267
1391
|
except Exception as e:
|
|
1268
1392
|
print(f"{cli_args.logprefix}Error: {e}")
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: llms-py
|
|
3
|
-
Version: 2.0.
|
|
3
|
+
Version: 2.0.9
|
|
4
4
|
Summary: A lightweight CLI tool and OpenAI-compatible server for querying multiple Large Language Model (LLM) providers
|
|
5
5
|
Home-page: https://github.com/ServiceStack/llms
|
|
6
6
|
Author: ServiceStack
|
|
@@ -51,7 +51,7 @@ Configure additional providers and models in [llms.json](llms.json)
|
|
|
51
51
|
## Features
|
|
52
52
|
|
|
53
53
|
- **Lightweight**: Single [llms.py](llms.py) Python file with single `aiohttp` dependency
|
|
54
|
-
- **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Mistral
|
|
54
|
+
- **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, Mistral
|
|
55
55
|
- **OpenAI-Compatible API**: Works with any client that supports OpenAI's chat completion API
|
|
56
56
|
- **Configuration Management**: Easy provider enable/disable and configuration management
|
|
57
57
|
- **CLI Interface**: Simple command-line interface for quick interactions
|
|
@@ -510,7 +510,52 @@ llms --default grok-4
|
|
|
510
510
|
|
|
511
511
|
# Update llms.py to latest version
|
|
512
512
|
llms --update
|
|
513
|
-
|
|
513
|
+
|
|
514
|
+
# Pass custom parameters to chat request (URL-encoded)
|
|
515
|
+
llms --args "temperature=0.7&seed=111" "What is 2+2?"
|
|
516
|
+
|
|
517
|
+
# Multiple parameters with different types
|
|
518
|
+
llms --args "temperature=0.5&max_completion_tokens=50" "Tell me a joke"
|
|
519
|
+
|
|
520
|
+
# URL-encoded special characters (stop sequences)
|
|
521
|
+
llms --args "stop=Two,Words" "Count to 5"
|
|
522
|
+
|
|
523
|
+
# Combine with other options
|
|
524
|
+
llms --system "You are helpful" --args "temperature=0.3" --raw "Hello"
|
|
525
|
+
```
|
|
526
|
+
|
|
527
|
+
#### Custom Parameters with `--args`
|
|
528
|
+
|
|
529
|
+
The `--args` option allows you to pass URL-encoded parameters to customize the chat request sent to LLM providers:
|
|
530
|
+
|
|
531
|
+
**Parameter Types:**
|
|
532
|
+
- **Floats**: `temperature=0.7`, `frequency_penalty=0.2`
|
|
533
|
+
- **Integers**: `max_completion_tokens=100`
|
|
534
|
+
- **Booleans**: `store=true`, `verbose=false`, `logprobs=true`
|
|
535
|
+
- **Strings**: `stop=one`
|
|
536
|
+
- **Lists**: `stop=two,words`
|
|
537
|
+
|
|
538
|
+
**Common Parameters:**
|
|
539
|
+
- `temperature`: Controls randomness (0.0 to 2.0)
|
|
540
|
+
- `max_completion_tokens`: Maximum tokens in response
|
|
541
|
+
- `seed`: For reproducible outputs
|
|
542
|
+
- `top_p`: Nucleus sampling parameter
|
|
543
|
+
- `stop`: Stop sequences (URL-encode special chars)
|
|
544
|
+
- `store`: Whether or not to store the output
|
|
545
|
+
- `frequency_penalty`: Penalize new tokens based on frequency
|
|
546
|
+
- `presence_penalty`: Penalize new tokens based on presence
|
|
547
|
+
- `logprobs`: Include log probabilities in response
|
|
548
|
+
- `parallel_tool_calls`: Enable parallel tool calls
|
|
549
|
+
- `prompt_cache_key`: Cache key for prompt
|
|
550
|
+
- `reasoning_effort`: Reasoning effort (low, medium, high, *minimal, *none, *default)
|
|
551
|
+
- `safety_identifier`: A string that uniquely identifies each user
|
|
552
|
+
- `seed`: For reproducible outputs
|
|
553
|
+
- `service_tier`: Service tier (free, standard, premium, *default)
|
|
554
|
+
- `top_logprobs`: Number of top logprobs to return
|
|
555
|
+
- `top_p`: Nucleus sampling parameter
|
|
556
|
+
- `verbosity`: Verbosity level (0, 1, 2, 3, *default)
|
|
557
|
+
- `enable_thinking`: Enable thinking mode (Qwen)
|
|
558
|
+
- `stream`: Enable streaming responses
|
|
514
559
|
|
|
515
560
|
### Default Model Configuration
|
|
516
561
|
|
|
@@ -558,6 +603,42 @@ llms "Explain quantum computing" | glow
|
|
|
558
603
|
|
|
559
604
|
## Supported Providers
|
|
560
605
|
|
|
606
|
+
Any OpenAI-compatible providers and their models can be added by configuring them in [llms.json](./llms.json). By default only AI Providers with free tiers are enabled, which will only be "available" if their API Key is set.
|
|
607
|
+
|
|
608
|
+
You can list the available providers, their models and which are enabled or disabled with:
|
|
609
|
+
|
|
610
|
+
```bash
|
|
611
|
+
llms ls
|
|
612
|
+
```
|
|
613
|
+
|
|
614
|
+
They can be enabled/disabled in your `llms.json` file or with:
|
|
615
|
+
|
|
616
|
+
```bash
|
|
617
|
+
llms --enable <provider>
|
|
618
|
+
llms --disable <provider>
|
|
619
|
+
```
|
|
620
|
+
|
|
621
|
+
For a provider to be available, they also require their API Key configured in either your Environment Variables
|
|
622
|
+
or directly in your `llms.json`.
|
|
623
|
+
|
|
624
|
+
### Environment Variables
|
|
625
|
+
|
|
626
|
+
| Provider | Variable | Description | Example |
|
|
627
|
+
|-----------------|---------------------------|---------------------|---------|
|
|
628
|
+
| openrouter_free | `OPENROUTER_FREE_API_KEY` | OpenRouter FREE models API key | `sk-or-...` |
|
|
629
|
+
| groq | `GROQ_API_KEY` | Groq API key | `gsk_...` |
|
|
630
|
+
| google_free | `GOOGLE_FREE_API_KEY` | Google FREE API key | `AIza...` |
|
|
631
|
+
| codestral | `CODESTRAL_API_KEY` | Codestral API key | `...` |
|
|
632
|
+
| ollama | N/A | No API key required | |
|
|
633
|
+
| openrouter | `OPENROUTER_API_KEY` | OpenRouter API key | `sk-or-...` |
|
|
634
|
+
| google | `GOOGLE_API_KEY` | Google API key | `AIza...` |
|
|
635
|
+
| anthropic | `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
|
|
636
|
+
| openai | `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
|
|
637
|
+
| grok | `GROK_API_KEY` | Grok (X.AI) API key | `xai-...` |
|
|
638
|
+
| qwen | `DASHSCOPE_API_KEY` | Qwen (Alibaba) API key | `sk-...` |
|
|
639
|
+
| z.ai | `ZAI_API_KEY` | Z.ai API key | `sk-...` |
|
|
640
|
+
| mistral | `MISTRAL_API_KEY` | Mistral API key | `...` |
|
|
641
|
+
|
|
561
642
|
### OpenAI
|
|
562
643
|
- **Type**: `OpenAiProvider`
|
|
563
644
|
- **Models**: GPT-5, GPT-5 Codex, GPT-4o, GPT-4o-mini, o3, etc.
|
|
@@ -588,6 +669,26 @@ export GOOGLE_API_KEY="your-key"
|
|
|
588
669
|
llms --enable google_free
|
|
589
670
|
```
|
|
590
671
|
|
|
672
|
+
### OpenRouter
|
|
673
|
+
- **Type**: `OpenAiProvider`
|
|
674
|
+
- **Models**: 100+ models from various providers
|
|
675
|
+
- **Features**: Access to latest models, free tier available
|
|
676
|
+
|
|
677
|
+
```bash
|
|
678
|
+
export OPENROUTER_API_KEY="your-key"
|
|
679
|
+
llms --enable openrouter
|
|
680
|
+
```
|
|
681
|
+
|
|
682
|
+
### Grok (X.AI)
|
|
683
|
+
- **Type**: `OpenAiProvider`
|
|
684
|
+
- **Models**: Grok-4, Grok-3, Grok-3-mini, Grok-code-fast-1, etc.
|
|
685
|
+
- **Features**: Real-time information, humor, uncensored responses
|
|
686
|
+
|
|
687
|
+
```bash
|
|
688
|
+
export GROK_API_KEY="your-key"
|
|
689
|
+
llms --enable grok
|
|
690
|
+
```
|
|
691
|
+
|
|
591
692
|
### Groq
|
|
592
693
|
- **Type**: `OpenAiProvider`
|
|
593
694
|
- **Models**: Llama 3.3, Gemma 2, Kimi K2, etc.
|
|
@@ -608,44 +709,44 @@ llms --enable groq
|
|
|
608
709
|
llms --enable ollama
|
|
609
710
|
```
|
|
610
711
|
|
|
611
|
-
###
|
|
712
|
+
### Qwen (Alibaba Cloud)
|
|
612
713
|
- **Type**: `OpenAiProvider`
|
|
613
|
-
- **Models**:
|
|
614
|
-
- **Features**:
|
|
714
|
+
- **Models**: Qwen3-max, Qwen-max, Qwen-plus, Qwen2.5-VL, QwQ-plus, etc.
|
|
715
|
+
- **Features**: Multilingual, vision models, coding, reasoning, audio processing
|
|
615
716
|
|
|
616
717
|
```bash
|
|
617
|
-
export
|
|
618
|
-
llms --enable
|
|
718
|
+
export DASHSCOPE_API_KEY="your-key"
|
|
719
|
+
llms --enable qwen
|
|
619
720
|
```
|
|
620
721
|
|
|
621
|
-
###
|
|
722
|
+
### Z.ai
|
|
622
723
|
- **Type**: `OpenAiProvider`
|
|
623
|
-
- **Models**:
|
|
624
|
-
- **Features**:
|
|
724
|
+
- **Models**: GLM-4.6, GLM-4.5, GLM-4.5-air, GLM-4.5-x, GLM-4.5-airx, GLM-4.5-flash, GLM-4:32b
|
|
725
|
+
- **Features**: Advanced language models with strong reasoning capabilities
|
|
625
726
|
|
|
626
727
|
```bash
|
|
627
|
-
export
|
|
628
|
-
llms --enable
|
|
728
|
+
export ZAI_API_KEY="your-key"
|
|
729
|
+
llms --enable z.ai
|
|
629
730
|
```
|
|
630
731
|
|
|
631
|
-
###
|
|
732
|
+
### Mistral
|
|
632
733
|
- **Type**: `OpenAiProvider`
|
|
633
|
-
- **Models**:
|
|
634
|
-
- **Features**:
|
|
734
|
+
- **Models**: Mistral Large, Codestral, Pixtral, etc.
|
|
735
|
+
- **Features**: Code generation, multilingual
|
|
635
736
|
|
|
636
737
|
```bash
|
|
637
|
-
export
|
|
638
|
-
llms --enable
|
|
738
|
+
export MISTRAL_API_KEY="your-key"
|
|
739
|
+
llms --enable mistral
|
|
639
740
|
```
|
|
640
741
|
|
|
641
|
-
###
|
|
742
|
+
### Codestral
|
|
642
743
|
- **Type**: `OpenAiProvider`
|
|
643
|
-
- **Models**:
|
|
644
|
-
- **Features**:
|
|
744
|
+
- **Models**: Codestral
|
|
745
|
+
- **Features**: Code generation
|
|
645
746
|
|
|
646
747
|
```bash
|
|
647
|
-
export
|
|
648
|
-
llms --enable
|
|
748
|
+
export CODESTRAL_API_KEY="your-key"
|
|
749
|
+
llms --enable codestral
|
|
649
750
|
```
|
|
650
751
|
|
|
651
752
|
## Model Routing
|
|
@@ -654,22 +755,6 @@ The tool automatically routes requests to the first available provider that supp
|
|
|
654
755
|
|
|
655
756
|
Example: If both OpenAI and OpenRouter support `kimi-k2`, the request will first try OpenRouter (free), then fall back to Groq, then OpenRouter (Paid) if the request fails.
|
|
656
757
|
|
|
657
|
-
## Environment Variables
|
|
658
|
-
|
|
659
|
-
| Variable | Description | Example |
|
|
660
|
-
|----------|-------------|---------|
|
|
661
|
-
| `LLMS_CONFIG_PATH` | Custom config file path | `/path/to/llms.json` |
|
|
662
|
-
| `OPENAI_API_KEY` | OpenAI API key | `sk-...` |
|
|
663
|
-
| `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
|
|
664
|
-
| `GOOGLE_API_KEY` | Google API key | `AIza...` |
|
|
665
|
-
| `GROQ_API_KEY` | Groq API key | `gsk_...` |
|
|
666
|
-
| `MISTRAL_API_KEY` | Mistral API key | `...` |
|
|
667
|
-
| `OPENROUTER_API_KEY` | OpenRouter API key | `sk-or-...` |
|
|
668
|
-
| `OPENROUTER_FREE_API_KEY` | OpenRouter free tier key | `sk-or-...` |
|
|
669
|
-
| `CODESTRAL_API_KEY` | Codestral API key | `...` |
|
|
670
|
-
| `GROK_API_KEY` | Grok (X.AI) API key | `xai-...` |
|
|
671
|
-
| `DASHSCOPE_API_KEY` | Qwen (Alibaba Cloud) API key | `sk-...` |
|
|
672
|
-
|
|
673
758
|
## Configuration Examples
|
|
674
759
|
|
|
675
760
|
### Minimal Configuration
|
|
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
|
|
4
4
|
|
|
5
5
|
[project]
|
|
6
6
|
name = "llms-py"
|
|
7
|
-
version = "2.0.
|
|
7
|
+
version = "2.0.9"
|
|
8
8
|
description = "A lightweight CLI tool and OpenAI-compatible server for querying multiple Large Language Model (LLM) providers"
|
|
9
9
|
readme = "README.md"
|
|
10
10
|
license = "BSD-3-Clause"
|
|
@@ -16,7 +16,7 @@ with open(os.path.join(this_directory, "requirements.txt"), encoding="utf-8") as
|
|
|
16
16
|
|
|
17
17
|
setup(
|
|
18
18
|
name="llms-py",
|
|
19
|
-
version="2.0.
|
|
19
|
+
version="2.0.9",
|
|
20
20
|
author="ServiceStack",
|
|
21
21
|
author_email="team@servicestack.net",
|
|
22
22
|
description="A lightweight CLI tool and OpenAI-compatible server for querying multiple Large Language Model (LLM) providers",
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|