PyPI - llms-py - Versions diffs - 2.0.27__tar.gz → 2.0.29__tar.gz - Mend

llms-py 2.0.27__tar.gz → 2.0.29__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (58)

{llms_py-2.0.27/llms_py.egg-info → llms_py-2.0.29}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: llms-py
-Version: 2.0.27
+Version: 2.0.29
 Summary: A lightweight CLI tool and OpenAI-compatible server for querying multiple Large Language Model (LLM) providers
 Home-page: https://github.com/ServiceStack/llms
 Author: ServiceStack
@@ -50,7 +50,7 @@ Configure additional providers and models in [llms.json](llms/llms.json)
 ## Features
-- **Lightweight**: Single [llms.py](https://github.com/ServiceStack/llms/blob/main/llms/main.py) Python file with single `aiohttp` dependency
+- **Lightweight**: Single [llms.py](https://github.com/ServiceStack/llms/blob/main/llms/main.py) Python file with single `aiohttp` dependency (Pillow optional)
 - **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, Mistral
 - **OpenAI-Compatible API**: Works with any client that supports OpenAI's chat completion API
 - **Built-in Analytics**: Built-in analytics UI to visualize costs, requests, and token usage
@@ -58,6 +58,7 @@ Configure additional providers and models in [llms.json](llms/llms.json)
 - **CLI Interface**: Simple command-line interface for quick interactions
 - **Server Mode**: Run an OpenAI-compatible HTTP server at `http://localhost:{PORT}/v1/chat/completions`
 - **Image Support**: Process images through vision-capable models
+  - Auto resizes and converts to webp if exceeds configured limits
 - **Audio Support**: Process audio through audio-capable models
 - **Custom Chat Templates**: Configurable chat completion request templates for different modalities
 - **Auto-Discovery**: Automatically discover available Ollama models
@@ -68,23 +69,27 @@ Configure additional providers and models in [llms.json](llms/llms.json)
 Access all your local all remote LLMs with a single ChatGPT-like UI:
-[![](https://servicestack.net/img/posts/llms-py-ui/bg.webp?)](https://servicestack.net/posts/llms-py-ui)
+[![](https://servicestack.net/img/posts/llms-py-ui/bg.webp)](https://servicestack.net/posts/llms-py-ui)
-**Monthly Costs Analysis**
+#### Dark Mode Support
+[![](https://servicestack.net/img/posts/llms-py-ui/dark-attach-image.webp?)](https://servicestack.net/posts/llms-py-ui)
+#### Monthly Costs Analysis
 [![](https://servicestack.net/img/posts/llms-py-ui/analytics-costs.webp)](https://servicestack.net/posts/llms-py-ui)
-**Monthly Token Usage**
+#### Monthly Token Usage (Dark Mode)
-[![](https://servicestack.net/img/posts/llms-py-ui/analytics-tokens.webp)](https://servicestack.net/posts/llms-py-ui)
+[![](https://servicestack.net/img/posts/llms-py-ui/dark-analytics-tokens.webp?)](https://servicestack.net/posts/llms-py-ui)
-**Monthly Activity Log**
+#### Monthly Activity Log
 [![](https://servicestack.net/img/posts/llms-py-ui/analytics-activity.webp)](https://servicestack.net/posts/llms-py-ui)
 [More Features and Screenshots](https://servicestack.net/posts/llms-py-ui).
-**Check Provider Reliability and Response Times**
+#### Check Provider Reliability and Response Times
 Check the status of configured providers to test if they're configured correctly, reachable and what their response times is for the simplest `1+1=` request:
@@ -230,6 +235,22 @@ See [DOCKER.md](DOCKER.md) for detailed instructions on customizing configuratio
 llms.py supports optional GitHub OAuth authentication to secure your web UI and API endpoints. When enabled, users must sign in with their GitHub account before accessing the application.
+```json
+{
+    "auth": {
+        "enabled": true,
+        "github": {
+            "client_id": "$GITHUB_CLIENT_ID",
+            "client_secret": "$GITHUB_CLIENT_SECRET",
+            "redirect_uri": "http://localhost:8000/auth/github/callback",
+            "restrict_to": "$GITHUB_USERS"
+        }
+    }
+}
+```
+`GITHUB_USERS` is optional but if set will only allow access to the specified users.
 See [GITHUB_OAUTH_SETUP.md](GITHUB_OAUTH_SETUP.md) for detailed setup instructions.
 ## Configuration
@@ -243,6 +264,8 @@ The configuration file [llms.json](llms/llms.json) is saved to `~/.llms/llms.jso
 - `audio`: Default chat completion request template for audio prompts
 - `file`: Default chat completion request template for file prompts
 - `check`: Check request template for testing provider connectivity
+- `limits`: Override Request size limits
+- `convert`: Max image size and length limits and auto conversion settings
 ### Providers
@@ -1211,7 +1234,7 @@ This shows:
 - `llms/main.py` - Main script with CLI and server functionality
 - `llms/llms.json` - Default configuration file
 - `llms/ui.json` - UI configuration file
-- `requirements.txt` - Python dependencies (aiohttp)
+- `requirements.txt` - Python dependencies, required: `aiohttp`, optional: `Pillow`
 ### Provider Classes

{llms_py-2.0.27 → llms_py-2.0.29}/README.md RENAMED

@@ -10,7 +10,7 @@ Configure additional providers and models in [llms.json](llms/llms.json)
 ## Features
-- **Lightweight**: Single [llms.py](https://github.com/ServiceStack/llms/blob/main/llms/main.py) Python file with single `aiohttp` dependency
+- **Lightweight**: Single [llms.py](https://github.com/ServiceStack/llms/blob/main/llms/main.py) Python file with single `aiohttp` dependency (Pillow optional)
 - **Multi-Provider Support**: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, Mistral
 - **OpenAI-Compatible API**: Works with any client that supports OpenAI's chat completion API
 - **Built-in Analytics**: Built-in analytics UI to visualize costs, requests, and token usage
@@ -18,6 +18,7 @@ Configure additional providers and models in [llms.json](llms/llms.json)
 - **CLI Interface**: Simple command-line interface for quick interactions
 - **Server Mode**: Run an OpenAI-compatible HTTP server at `http://localhost:{PORT}/v1/chat/completions`
 - **Image Support**: Process images through vision-capable models
+  - Auto resizes and converts to webp if exceeds configured limits
 - **Audio Support**: Process audio through audio-capable models
 - **Custom Chat Templates**: Configurable chat completion request templates for different modalities
 - **Auto-Discovery**: Automatically discover available Ollama models
@@ -28,23 +29,27 @@ Configure additional providers and models in [llms.json](llms/llms.json)
 Access all your local all remote LLMs with a single ChatGPT-like UI:
-[![](https://servicestack.net/img/posts/llms-py-ui/bg.webp?)](https://servicestack.net/posts/llms-py-ui)
+[![](https://servicestack.net/img/posts/llms-py-ui/bg.webp)](https://servicestack.net/posts/llms-py-ui)
-**Monthly Costs Analysis**
+#### Dark Mode Support
+[![](https://servicestack.net/img/posts/llms-py-ui/dark-attach-image.webp?)](https://servicestack.net/posts/llms-py-ui)
+#### Monthly Costs Analysis
 [![](https://servicestack.net/img/posts/llms-py-ui/analytics-costs.webp)](https://servicestack.net/posts/llms-py-ui)
-**Monthly Token Usage**
+#### Monthly Token Usage (Dark Mode)
-[![](https://servicestack.net/img/posts/llms-py-ui/analytics-tokens.webp)](https://servicestack.net/posts/llms-py-ui)
+[![](https://servicestack.net/img/posts/llms-py-ui/dark-analytics-tokens.webp?)](https://servicestack.net/posts/llms-py-ui)
-**Monthly Activity Log**
+#### Monthly Activity Log
 [![](https://servicestack.net/img/posts/llms-py-ui/analytics-activity.webp)](https://servicestack.net/posts/llms-py-ui)
 [More Features and Screenshots](https://servicestack.net/posts/llms-py-ui).
-**Check Provider Reliability and Response Times**
+#### Check Provider Reliability and Response Times
 Check the status of configured providers to test if they're configured correctly, reachable and what their response times is for the simplest `1+1=` request:
@@ -190,6 +195,22 @@ See [DOCKER.md](DOCKER.md) for detailed instructions on customizing configuratio
 llms.py supports optional GitHub OAuth authentication to secure your web UI and API endpoints. When enabled, users must sign in with their GitHub account before accessing the application.
+```json
+{
+    "auth": {
+        "enabled": true,
+        "github": {
+            "client_id": "$GITHUB_CLIENT_ID",
+            "client_secret": "$GITHUB_CLIENT_SECRET",
+            "redirect_uri": "http://localhost:8000/auth/github/callback",
+            "restrict_to": "$GITHUB_USERS"
+        }
+    }
+}
+```
+`GITHUB_USERS` is optional but if set will only allow access to the specified users.
 See [GITHUB_OAUTH_SETUP.md](GITHUB_OAUTH_SETUP.md) for detailed setup instructions.
 ## Configuration
@@ -203,6 +224,8 @@ The configuration file [llms.json](llms/llms.json) is saved to `~/.llms/llms.jso
 - `audio`: Default chat completion request template for audio prompts
 - `file`: Default chat completion request template for file prompts
 - `check`: Check request template for testing provider connectivity
+- `limits`: Override Request size limits
+- `convert`: Max image size and length limits and auto conversion settings
 ### Providers
@@ -1171,7 +1194,7 @@ This shows:
 - `llms/main.py` - Main script with CLI and server functionality
 - `llms/llms.json` - Default configuration file
 - `llms/ui.json` - UI configuration file
-- `requirements.txt` - Python dependencies (aiohttp)
+- `requirements.txt` - Python dependencies, required: `aiohttp`, optional: `Pillow`
 ### Provider Classes

{llms_py-2.0.27 → llms_py-2.0.29}/llms/llms.json RENAMED

@@ -4,7 +4,8 @@
         "github": {
             "client_id": "$GITHUB_CLIENT_ID",
             "client_secret": "$GITHUB_CLIENT_SECRET",
-            "redirect_uri": "http://localhost:8000/auth/github/callback"
+            "redirect_uri": "http://localhost:8000/auth/github/callback",
+            "restrict_to": "$GITHUB_USERS"
         }
     },
     "defaults": {
@@ -104,6 +105,15 @@
             "stream": false
         }
     },
+    "limits": {
+        "client_max_size": 20971520
+    },
+    "convert": {
+        "image": {
+            "max_size": "1536x1024",
+            "max_length": 1572864
+        }
+    },
     "providers": {
         "openrouter_free": {
             "enabled": true,

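The two new `llms.json` sections hold plain byte counts and a `WIDTHxHEIGHT` string. As a reading aid, here is a minimal sketch that decodes the values shown in the hunk above into friendlier units. `parse_max_size` and `describe_limits` are hypothetical helpers written for this note and are not part of `llms/main.py`.

```python
import json

# Config fragment copied from the llms.json hunk above.
CONFIG_TEXT = """
{
    "limits": {"client_max_size": 20971520},
    "convert": {"image": {"max_size": "1536x1024", "max_length": 1572864}}
}
"""

MIB = 1024 * 1024


def parse_max_size(value):
    """Split a 'WIDTHxHEIGHT' string into a (width, height) tuple of ints."""
    width, height = map(int, value.split("x"))
    return width, height


def describe_limits(config):
    """Summarise the configured limits in MiB and pixels (hypothetical helper)."""
    image = config.get("convert", {}).get("image", {})
    return {
        "client_max_mib": config.get("limits", {}).get("client_max_size", 0) / MIB,
        "image_max_size": parse_max_size(image.get("max_size", "1536x1024")),
        "image_max_length_mib": image.get("max_length", 0) / MIB,
    }


summary = describe_limits(json.loads(CONFIG_TEXT))
print(summary)
```

Read this way, `client_max_size` is 20 MiB and `max_length` is 1.5 MiB, which matches the defaults used in the `main.py` changes.
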
{llms_py-2.0.27 → llms_py-2.0.29}/llms/main.py RENAMED

@@ -15,6 +15,8 @@ import traceback
 import sys
 import site
 import secrets
+import re
+from io import BytesIO
 from urllib.parse import parse_qs, urlencode
 import aiohttp
@@ -23,7 +25,13 @@ from aiohttp import web
 from pathlib import Path
 from importlib import resources   # Py≥3.9  (pip install importlib_resources for 3.7/3.8)
-VERSION = "2.0.27"
+try:
+    from PIL import Image
+    HAS_PIL = True
+except ImportError:
+    HAS_PIL = False
+VERSION = "2.0.29"
 _ROOT = None
 g_config_path = None
 g_ui_path = None
@@ -200,6 +208,77 @@ def price_to_string(price: float | int | str | None) -> str | None:
     except (ValueError, TypeError):
         return None
+def convert_image_if_needed(image_bytes, mimetype='image/png'):
+    """
+    Convert and resize image to WebP if it exceeds configured limits.
+    Args:
+        image_bytes: Raw image bytes
+        mimetype: Original image MIME type
+    Returns:
+        tuple: (converted_bytes, new_mimetype) or (original_bytes, original_mimetype) if no conversion needed
+    """
+    if not HAS_PIL:
+        return image_bytes, mimetype
+    # Get conversion config
+    convert_config = g_config.get('convert', {}).get('image', {}) if g_config else {}
+    if not convert_config:
+        return image_bytes, mimetype
+    max_size_str = convert_config.get('max_size', '1536x1024')
+    max_length = convert_config.get('max_length', 1.5*1024*1024) # 1.5MB
+    try:
+        # Parse max_size (e.g., "1536x1024")
+        max_width, max_height = map(int, max_size_str.split('x'))
+        # Open image
+        with Image.open(BytesIO(image_bytes)) as img:
+            original_width, original_height = img.size
+            # Check if image exceeds limits
+            needs_resize = original_width > max_width or original_height > max_height
+            # Check if base64 length would exceed max_length (in KB)
+            # Base64 encoding increases size by ~33%, so check raw bytes * 1.33 / 1024
+            estimated_kb = (len(image_bytes) * 1.33) / 1024
+            needs_conversion = estimated_kb > max_length
+            if not needs_resize and not needs_conversion:
+                return image_bytes, mimetype
+            # Convert RGBA to RGB if necessary (WebP doesn't support transparency in RGB mode)
+            if img.mode in ('RGBA', 'LA', 'P'):
+                # Create a white background
+                background = Image.new('RGB', img.size, (255, 255, 255))
+                if img.mode == 'P':
+                    img = img.convert('RGBA')
+                background.paste(img, mask=img.split()[-1] if img.mode in ('RGBA', 'LA') else None)
+                img = background
+            elif img.mode != 'RGB':
+                img = img.convert('RGB')
+            # Resize if needed (preserve aspect ratio)
+            if needs_resize:
+                img.thumbnail((max_width, max_height), Image.Resampling.LANCZOS)
+                _log(f"Resized image from {original_width}x{original_height} to {img.size[0]}x{img.size[1]}")
+            # Convert to WebP
+            output = BytesIO()
+            img.save(output, format='WEBP', quality=85, method=6)
+            converted_bytes = output.getvalue()
+            _log(f"Converted image to WebP: {len(image_bytes)} bytes -> {len(converted_bytes)} bytes ({len(converted_bytes)*100//len(image_bytes)}%)")
+            return converted_bytes, 'image/webp'
+    except Exception as e:
+        _log(f"Error converting image: {e}")
+        # Return original if conversion fails
+        return image_bytes, mimetype
 async def process_chat(chat):
     if not chat:
         raise Exception("No chat provided")
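
The size check in `convert_image_if_needed` relies on base64 growing the payload by about a third. The sketch below works that arithmetic out exactly. One caveat when reading the hunk: it divides the estimate by 1024 (kilobytes), while the `max_length` default of `1.5*1024*1024` reads as a byte count. The sketch assumes bytes are intended on both sides; that is an assumption about intent, and `base64_length` and `exceeds_max_length` are hypothetical names.

```python
import base64
import math


def base64_length(num_bytes):
    """Exact length of standard, padded base64 output for num_bytes of input."""
    return 4 * math.ceil(num_bytes / 3)


def exceeds_max_length(image_bytes, max_length_bytes):
    """True if the base64-encoded image would be longer than max_length_bytes."""
    return base64_length(len(image_bytes)) > max_length_bytes


raw = b"\x00" * 1_200_000  # 1.2 MB of placeholder data standing in for an image

# The closed-form length agrees with what the standard library produces.
assert base64_length(len(raw)) == len(base64.b64encode(raw))

print(base64_length(len(raw)))            # 1600000
print(exceeds_max_length(raw, 1572864))   # True
```

With the kilobyte-based estimate from the hunk, the same 1.2 MB input yields roughly 1559, which is far below 1572864, so under that comparison only the pixel-dimension check would trigger a conversion for an image of this size.
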
@@ -230,19 +309,31 @@ async def process_chat(chat):
                                     mimetype = get_file_mime_type(get_filename(url))
                                     if 'Content-Type' in response.headers:
                                         mimetype = response.headers['Content-Type']
+                                    # convert/resize image if needed
+                                    content, mimetype = convert_image_if_needed(content, mimetype)
                                     # convert to data uri
                                     image_url['url'] = f"data:{mimetype};base64,{base64.b64encode(content).decode('utf-8')}"
                             elif is_file_path(url):
                                 _log(f"Reading image: {url}")
                                 with open(url, "rb") as f:
                                     content = f.read()
-                                    ext = os.path.splitext(url)[1].lower().lstrip('.') if '.' in url else 'png'
                                     # get mimetype from file extension
                                     mimetype = get_file_mime_type(get_filename(url))
+                                    # convert/resize image if needed
+                                    content, mimetype = convert_image_if_needed(content, mimetype)
                                     # convert to data uri
                                     image_url['url'] = f"data:{mimetype};base64,{base64.b64encode(content).decode('utf-8')}"
                             elif url.startswith('data:'):
-                                pass
+                                # Extract existing data URI and process it
+                                if ';base64,' in url:
+                                    prefix = url.split(';base64,')[0]
+                                    mimetype = prefix.split(':')[1] if ':' in prefix else 'image/png'
+                                    base64_data = url.split(';base64,')[1]
+                                    content = base64.b64decode(base64_data)
+                                    # convert/resize image if needed
+                                    content, mimetype = convert_image_if_needed(content, mimetype)
+                                    # update data uri with potentially converted image
+                                    image_url['url'] = f"data:{mimetype};base64,{base64.b64encode(content).decode('utf-8')}"
                             else:
                                 raise Exception(f"Invalid image: {url}")
                     elif item['type'] == 'input_audio' and 'input_audio' in item:
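
The new `data:` branch decodes an existing data URI, passes the bytes through the converter, and re-encodes the result. A minimal standalone sketch of that decode and re-encode round trip follows, without the conversion step. `split_data_uri` and `build_data_uri` are hypothetical helpers for illustration only.

```python
import base64


def split_data_uri(url, default_mimetype="image/png"):
    """Return (mimetype, raw_bytes) for a base64 data URI, or None otherwise."""
    if not url.startswith("data:") or ";base64," not in url:
        return None
    prefix, _, payload = url.partition(";base64,")
    # Everything between 'data:' and ';base64,' is the media type.
    mimetype = prefix[len("data:"):] or default_mimetype
    return mimetype, base64.b64decode(payload)


def build_data_uri(mimetype, content):
    """Encode raw bytes back into a base64 data URI."""
    return f"data:{mimetype};base64,{base64.b64encode(content).decode('utf-8')}"


uri = build_data_uri("image/webp", b"example-bytes")
mimetype, content = split_data_uri(uri)
print(mimetype, content)
```

As in the hunk, data URIs that are not base64-encoded are left alone; here that case is signalled by returning `None`.
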
@@ -1314,6 +1405,66 @@ async def save_home_configs():
         print("Could not create llms.json. Create one with --init or use --config <path>")
         exit(1)
+async def reload_providers():
+    global g_config, g_handlers
+    g_handlers = init_llms(g_config)
+    await load_llms()
+    _log(f"{len(g_handlers)} providers loaded")
+    return g_handlers
+async def watch_config_files(config_path, ui_path, interval=1):
+    """Watch config files and reload providers when they change"""
+    global g_config
+    config_path = Path(config_path)
+    ui_path = Path(ui_path) if ui_path else None
+    file_mtimes = {}
+    _log(f"Watching config files: {config_path}" + (f", {ui_path}" if ui_path else ""))
+    while True:
+        await asyncio.sleep(interval)
+        # Check llms.json
+        try:
+            if config_path.is_file():
+                mtime = config_path.stat().st_mtime
+                if str(config_path) not in file_mtimes:
+                    file_mtimes[str(config_path)] = mtime
+                elif file_mtimes[str(config_path)] != mtime:
+                    _log(f"Config file changed: {config_path.name}")
+                    file_mtimes[str(config_path)] = mtime
+                    try:
+                        # Reload llms.json
+                        with open(config_path, "r") as f:
+                            g_config = json.load(f)
+                        # Reload providers
+                        await reload_providers()
+                        _log("Providers reloaded successfully")
+                    except Exception as e:
+                        _log(f"Error reloading config: {e}")
+        except FileNotFoundError:
+            pass
+        # Check ui.json
+        if ui_path:
+            try:
+                if ui_path.is_file():
+                    mtime = ui_path.stat().st_mtime
+                    if str(ui_path) not in file_mtimes:
+                        file_mtimes[str(ui_path)] = mtime
+                    elif file_mtimes[str(ui_path)] != mtime:
+                        _log(f"Config file changed: {ui_path.name}")
+                        file_mtimes[str(ui_path)] = mtime
+                        _log("ui.json reloaded - reload page to update")
+            except FileNotFoundError:
+                pass
 def main():
     global _ROOT, g_verbose, g_default_model, g_logprefix, g_config, g_config_path, g_ui_path
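
`watch_config_files` detects edits by polling each file's modification time and comparing it with the last value seen. The sketch below isolates that idea in a bounded loop so it can run to completion. `poll_for_change` and `demo` are hypothetical, and the real watcher loops forever and reloads providers instead of calling a callback.

```python
import asyncio
import os
import tempfile
from pathlib import Path


async def poll_for_change(path, on_change, interval=0.05, max_polls=20):
    """Call on_change(path) once when the file's mtime differs from the first seen."""
    path = Path(path)
    last_mtime = path.stat().st_mtime if path.is_file() else None
    for _ in range(max_polls):
        await asyncio.sleep(interval)
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime
        if last_mtime is None:
            last_mtime = mtime          # first sighting, nothing to compare yet
        elif mtime != last_mtime:
            on_change(path)
            return True
    return False


async def demo():
    changes = []
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "llms.json"
        config.write_text("{}")
        os.utime(config, (1_000_000_000, 1_000_000_000))   # pin a known mtime
        watcher = asyncio.create_task(poll_for_change(config, changes.append))
        await asyncio.sleep(0.1)                            # let the watcher start
        config.write_text('{"changed": true}')
        os.utime(config, (1_000_000_100, 1_000_000_100))   # force a different mtime
        detected = await watcher
    return detected, [p.name for p in changes]


result, changed = asyncio.run(demo())
print(result, changed)
```

Setting the timestamps explicitly with `os.utime` keeps the demo independent of filesystem timestamp resolution; a polling watcher like this can miss two edits that land within the same timestamp tick.
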
@@ -1401,8 +1552,7 @@ def main():
         g_ui_path = home_ui_path
         g_config = json.loads(text_from_file(g_config_path))
-    init_llms(g_config)
-    asyncio.run(load_llms())
+    asyncio.run(reload_providers())
     # print names
     _log(f"enabled providers: {', '.join(g_handlers.keys())}")
@@ -1480,7 +1630,9 @@ def main():
             _log("Authentication enabled - GitHub OAuth configured")
-        app = web.Application()
+        client_max_size = g_config.get('limits', {}).get('client_max_size', 20*1024*1024) # 20MB max request size (to handle base64 encoding overhead)
+        _log(f"client_max_size set to {client_max_size} bytes ({client_max_size/1024/1024:.1f}MB)")
+        app = web.Application(client_max_size=client_max_size)
         # Authentication middleware helper
         def check_auth(request):
@@ -1601,6 +1753,29 @@ def main():
             auth_url = f"https://github.com/login/oauth/authorize?{urlencode(params)}"
             return web.HTTPFound(auth_url)
+        def validate_user(github_username):
+            auth_config = g_config['auth']['github']
+            # Check if user is restricted
+            restrict_to = auth_config.get('restrict_to', '')
+            # Expand environment variables
+            if restrict_to.startswith('$'):
+                restrict_to = os.environ.get(restrict_to[1:], '')
+            # If restrict_to is configured, validate the user
+            if restrict_to:
+                # Parse allowed users (comma or space delimited)
+                allowed_users = [u.strip() for u in re.split(r'[,\s]+', restrict_to) if u.strip()]
+                # Check if user is in the allowed list
+                if not github_username or github_username not in allowed_users:
+                    _log(f"Access denied for user: {github_username}. Not in allowed list: {allowed_users}")
+                    return web.Response(
+                        text=f"Access denied. User '{github_username}' is not authorized to access this application.",
+                        status=403
+                    )
+            return None
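
The allow-list logic in `validate_user` can be exercised without a running server. The following is a minimal sketch of the same steps: expand a `$NAME` reference, split on commas or whitespace, then test membership. `is_allowed` is a hypothetical function, and the `environ` parameter is added here only so the example does not depend on the real process environment.

```python
import os
import re


def is_allowed(username, restrict_to, environ=None):
    """Return True if username passes the restrict_to allow-list."""
    environ = os.environ if environ is None else environ
    # Expand a "$NAME" reference to the value of that environment variable.
    if restrict_to.startswith("$"):
        restrict_to = environ.get(restrict_to[1:], "")
    # An empty or unset restriction means every authenticated user is accepted.
    if not restrict_to:
        return True
    # Users may be separated by commas, whitespace, or both.
    users = [u for u in re.split(r"[,\s]+", restrict_to) if u]
    return bool(username) and username in users


env = {"GITHUB_USERS": "alice, bob  carol"}
print(is_allowed("bob", "$GITHUB_USERS", env))       # True
print(is_allowed("mallory", "$GITHUB_USERS", env))   # False
print(is_allowed("anyone", "$GITHUB_USERS", {}))     # True
```

Note that the comparison is an exact, case-sensitive string match, as it is in the hunk above.
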
         async def github_callback_handler(request):
             """Handle GitHub OAuth callback"""
@@ -1664,6 +1839,11 @@ def main():
                 async with session.get(user_url, headers=headers) as resp:
                     user_data = await resp.json()
+                # Validate user
+                error_response = validate_user(user_data.get('login', ''))
+                if error_response:
+                    return error_response
             # Create session
             session_token = secrets.token_urlsafe(32)
             g_sessions[session_token] = {
@@ -1814,6 +1994,14 @@ def main():
         # Serve index.html as fallback route (SPA routing)
         app.router.add_route('*', '/{tail:.*}', index_handler)
+        # Setup file watcher for config files
+        async def start_background_tasks(app):
+            """Start background tasks when the app starts"""
+            # Start watching config files in the background
+            asyncio.create_task(watch_config_files(g_config_path, g_ui_path))
+        app.on_startup.append(start_background_tasks)
         print(f"Starting server on port {port}...")
         web.run_app(app, host='0.0.0.0', port=port, print=_log)
         exit(0)

{llms_py-2.0.27 → llms_py-2.0.29}/llms/ui/Analytics.mjs RENAMED

@@ -1021,7 +1021,9 @@ export default {
                         // Only display label if percentage > 1%
                         if (parseFloat(percentage) > 1) {
-                            chartCtx.fillStyle = '#000'
+                            // Use white color in dark mode, black in light mode
+                            const isDarkMode = document.documentElement.classList.contains('dark')
+                            chartCtx.fillStyle = isDarkMode ? '#fff' : '#000'
                             chartCtx.font = 'bold 12px Arial'
                             chartCtx.textAlign = 'center'
                             chartCtx.textBaseline = 'middle'
@@ -1078,7 +1080,9 @@ export default {
                         // Only display label if percentage > 1%
                         if (parseFloat(percentage) > 1) {
-                            chartCtx.fillStyle = '#000'
+                            // Use white color in dark mode, black in light mode
+                            const isDarkMode = document.documentElement.classList.contains('dark')
+                            chartCtx.fillStyle = isDarkMode ? '#fff' : '#000'
                             chartCtx.font = 'bold 12px Arial'
                             chartCtx.textAlign = 'center'
                             chartCtx.textBaseline = 'middle'
@@ -1135,7 +1139,9 @@ export default {
                         // Only display label if percentage > 1%
                         if (parseFloat(percentage) > 1) {
-                            chartCtx.fillStyle = '#000'
+                            // Use white color in dark mode, black in light mode
+                            const isDarkMode = document.documentElement.classList.contains('dark')
+                            chartCtx.fillStyle = isDarkMode ? '#fff' : '#000'
                             chartCtx.font = 'bold 12px Arial'
                             chartCtx.textAlign = 'center'
                             chartCtx.textBaseline = 'middle'
@@ -1192,7 +1198,9 @@ export default {
                         // Only display label if percentage > 1%
                         if (parseFloat(percentage) > 1) {
-                            chartCtx.fillStyle = '#000'
+                            // Use white color in dark mode, black in light mode
+                            const isDarkMode = document.documentElement.classList.contains('dark')
+                            chartCtx.fillStyle = isDarkMode ? '#fff' : '#000'
                             chartCtx.font = 'bold 12px Arial'
                             chartCtx.textAlign = 'center'
                             chartCtx.textBaseline = 'middle'

{llms_py-2.0.27 → llms_py-2.0.29}/llms/ui/ChatPrompt.mjs RENAMED

@@ -11,6 +11,7 @@ export function useChatPrompt() {
     const attachedFiles = ref([])
     const isGenerating = ref(false)
     const errorStatus = ref(null)
+    const abortController = ref(null)
     const hasImage = () => attachedFiles.value.some(f => imageExts.includes(lastRightPart(f.name, '.')))
     const hasAudio = () => attachedFiles.value.some(f => audioExts.includes(lastRightPart(f.name, '.')))
     const hasFile = () => attachedFiles.value.length > 0
@@ -21,6 +22,17 @@ export function useChatPrompt() {
         isGenerating.value = false
         attachedFiles.value = []
         messageText.value = ''
+        abortController.value = null
+    }
+    function cancel() {
+        // Cancel the pending request
+        if (abortController.value) {
+            abortController.value.abort()
+        }
+        // Reset UI state
+        isGenerating.value = false
+        abortController.value = null
     }
     return {
@@ -28,6 +40,7 @@ export function useChatPrompt() {
         attachedFiles,
         errorStatus,
         isGenerating,
+        abortController,
         get generating() {
             return isGenerating.value
         },
@@ -36,6 +49,7 @@ export function useChatPrompt() {
         hasFile,
         // hasText,
         reset,
+        cancel,
     }
 }
@@ -91,15 +105,18 @@ export default {
                         ]"
                         :disabled="isGenerating || !model"
                     ></textarea>
-                    <button title="Send (Enter)" type="button"
+                    <button v-if="!isGenerating" title="Send (Enter)" type="button"
                         @click="sendMessage"
                         :disabled="!messageText.trim() || isGenerating || !model"
                         class="absolute bottom-2 right-2 size-8 flex items-center justify-center rounded-md border border-gray-300 dark:border-gray-600 text-gray-600 dark:text-gray-400 hover:bg-gray-50 dark:hover:bg-gray-700 disabled:text-gray-400 disabled:cursor-not-allowed disabled:border-gray-200 dark:disabled:border-gray-700 transition-colors">
-                        <svg v-if="isGenerating" class="size-5 animate-spin" fill="none" viewBox="0 0 24 24">
-                            <circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
-                            <path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
+                        <svg class="size-5" xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24"><g fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="2"><path stroke-dasharray="20" stroke-dashoffset="20" d="M12 21l0 -17.5"><animate fill="freeze" attributeName="stroke-dashoffset" dur="0.2s" values="20;0"/></path><path stroke-dasharray="12" stroke-dashoffset="12" d="M12 3l7 7M12 3l-7 7"><animate fill="freeze" attributeName="stroke-dashoffset" begin="0.2s" dur="0.2s" values="12;0"/></path></g></svg>
+                    </button>
+                    <button v-else title="Cancel request" type="button"
+                        @click="cancelRequest"
+                        class="absolute bottom-2 right-2 size-8 flex items-center justify-center rounded-md border border-red-300 dark:border-red-600 text-red-600 dark:text-red-400 hover:bg-red-50 dark:hover:bg-red-900/30 transition-colors">
+                        <svg class="size-5" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+                            <rect x="3" y="3" width="18" height="18" rx="2" ry="2"></rect>
                         </svg>
-                        <svg v-else class="size-5" xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24"><g fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="2"><path stroke-dasharray="20" stroke-dashoffset="20" d="M12 21l0 -17.5"><animate fill="freeze" attributeName="stroke-dashoffset" dur="0.2s" values="20;0"/></path><path stroke-dasharray="12" stroke-dashoffset="12" d="M12 3l7 7M12 3l-7 7"><animate fill="freeze" attributeName="stroke-dashoffset" begin="0.2s" dur="0.2s" values="12;0"/></path></g></svg>
                     </button>
                 </div>
@@ -304,6 +321,10 @@ export default {
             }
             messageText.value = ''
+            // Create AbortController for this request
+            const controller = new AbortController()
+            chatPrompt.abortController.value = controller
             try {
                 let threadId
@@ -434,11 +455,15 @@ export default {
                     }))
                 }
+                chatRequest.metadata ??= {}
+                chatRequest.metadata.threadId = threadId
                 // Send to API
                 console.debug('chatRequest', chatRequest)
                 const startTime = Date.now()
                 const response = await ai.post('/v1/chat/completions', {
-                    body: JSON.stringify(chatRequest)
+                    body: JSON.stringify(chatRequest),
+                    signal: controller.signal
                 })
                 let result = null
@@ -513,11 +538,25 @@ export default {
                     attachedFiles.value = []
                     // Error will be cleared when user sends next message (no auto-timeout)
                 }
+            } catch (error) {
+                // Check if the error is due to abort
+                if (error.name === 'AbortError') {
+                    console.log('Request was cancelled by user')
+                    // Don't show error for cancelled requests
+                } else {
+                    // Re-throw other errors to be handled by outer catch
+                    throw error
+                }
             } finally {
                 isGenerating.value = false
+                chatPrompt.abortController.value = null
             }
         }
+        const cancelRequest = () => {
+            chatPrompt.cancel()
+        }
         const addNewLine = () => {
             // Enter key already adds new line
             //messageText.value += '\n'
@@ -538,6 +577,7 @@ export default {
             onDrop,
             removeAttachment,
             sendMessage,
+            cancelRequest,
             addNewLine,
         }
     }
