lm-deluge 0.0.90__tar.gz → 0.0.92__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (151)
  1. {lm_deluge-0.0.90/src/lm_deluge.egg-info → lm_deluge-0.0.92}/PKG-INFO +9 -10
  2. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/README.md +8 -8
  3. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/pyproject.toml +5 -6
  4. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/__init__.py +3 -3
  5. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/anthropic.py +4 -2
  6. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/base.py +1 -1
  7. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/bedrock.py +6 -1
  8. lm_deluge-0.0.92/src/lm_deluge/api_requests/bedrock_nova.py +299 -0
  9. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/common.py +2 -0
  10. lm_deluge-0.0.90/src/lm_deluge/request_context.py → lm_deluge-0.0.92/src/lm_deluge/api_requests/context.py +4 -4
  11. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/gemini.py +13 -11
  12. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/mistral.py +1 -1
  13. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/openai.py +13 -5
  14. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/batches.py +4 -4
  15. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/cache.py +1 -1
  16. lm_deluge-0.0.92/src/lm_deluge/cli.py +672 -0
  17. lm_deluge-0.0.90/src/lm_deluge/client.py → lm_deluge-0.0.92/src/lm_deluge/client/__init__.py +15 -12
  18. lm_deluge-0.0.92/src/lm_deluge/config.py +23 -0
  19. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/embed.py +2 -6
  20. lm_deluge-0.0.92/src/lm_deluge/models/__init__.py +269 -0
  21. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/anthropic.py +20 -12
  22. lm_deluge-0.0.92/src/lm_deluge/models/azure.py +269 -0
  23. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/bedrock.py +48 -0
  24. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/cerebras.py +2 -0
  25. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/cohere.py +2 -0
  26. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/google.py +13 -0
  27. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/grok.py +4 -0
  28. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/groq.py +2 -0
  29. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/meta.py +2 -0
  30. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/openai.py +24 -1
  31. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/openrouter.py +107 -1
  32. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/together.py +3 -0
  33. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/extract.py +4 -5
  34. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/__init__.py +1 -1
  35. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/examples/01_synthetic_keywords.py +1 -1
  36. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/examples/02_gsm8k_math.py +1 -1
  37. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/examples/03_hotpotqa_multihop.py +1 -1
  38. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/examples/04_batch_classification.py +1 -1
  39. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/examples/simple_qa.py +1 -1
  40. lm_deluge-0.0.92/src/lm_deluge/prompt/__init__.py +45 -0
  41. lm_deluge-0.0.90/src/lm_deluge/prompt.py → lm_deluge-0.0.92/src/lm_deluge/prompt/conversation.py +107 -1014
  42. {lm_deluge-0.0.90/src/lm_deluge → lm_deluge-0.0.92/src/lm_deluge/prompt}/file.py +4 -0
  43. {lm_deluge-0.0.90/src/lm_deluge → lm_deluge-0.0.92/src/lm_deluge/prompt}/image.py +20 -10
  44. lm_deluge-0.0.92/src/lm_deluge/prompt/message.py +583 -0
  45. lm_deluge-0.0.92/src/lm_deluge/prompt/serialization.py +21 -0
  46. lm_deluge-0.0.92/src/lm_deluge/prompt/signatures.py +77 -0
  47. lm_deluge-0.0.92/src/lm_deluge/prompt/text.py +50 -0
  48. lm_deluge-0.0.92/src/lm_deluge/prompt/thinking.py +58 -0
  49. lm_deluge-0.0.92/src/lm_deluge/prompt/tool_calls.py +278 -0
  50. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/app.py +1 -1
  51. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/__init__.py +65 -18
  52. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/anthropic/__init__.py +1 -1
  53. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/actions.py +26 -26
  54. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/batch.py +1 -2
  55. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/kernel.py +1 -1
  56. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/filesystem.py +2 -2
  57. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/full_text_search/__init__.py +3 -2
  58. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/memory.py +3 -1
  59. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/otc/executor.py +3 -3
  60. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/random.py +30 -54
  61. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/rlm/__init__.py +2 -2
  62. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/rlm/executor.py +1 -1
  63. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sandbox/daytona_sandbox.py +2 -2
  64. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sandbox/seatbelt_sandbox.py +9 -7
  65. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/subagents.py +1 -1
  66. lm_deluge-0.0.92/src/lm_deluge/tool/prefab/web_search.py +1020 -0
  67. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/logprobs.py +4 -4
  68. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/schema.py +6 -6
  69. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/validation.py +14 -9
  70. {lm_deluge-0.0.90 → lm_deluge-0.0.92/src/lm_deluge.egg-info}/PKG-INFO +9 -10
  71. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge.egg-info/SOURCES.txt +15 -5
  72. lm_deluge-0.0.92/src/lm_deluge.egg-info/entry_points.txt +3 -0
  73. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge.egg-info/requires.txt +0 -1
  74. lm_deluge-0.0.90/src/lm_deluge/cli.py +0 -300
  75. lm_deluge-0.0.90/src/lm_deluge/config.py +0 -45
  76. lm_deluge-0.0.90/src/lm_deluge/models/__init__.py +0 -160
  77. lm_deluge-0.0.90/src/lm_deluge/tool/prefab/web_search.py +0 -199
  78. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/LICENSE +0 -0
  79. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/setup.cfg +0 -0
  80. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/__init__.py +0 -0
  81. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/chat_reasoning.py +0 -0
  82. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/deprecated/bedrock.py +0 -0
  83. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/deprecated/cohere.py +0 -0
  84. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/deprecated/deepseek.py +0 -0
  85. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/deprecated/mistral.py +0 -0
  86. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/deprecated/vertex.py +0 -0
  87. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/api_requests/response.py +0 -0
  88. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/errors.py +0 -0
  89. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/arcee.py +0 -0
  90. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/deepseek.py +0 -0
  91. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/fireworks.py +0 -0
  92. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/kimi.py +0 -0
  93. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/minimax.py +0 -0
  94. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/mistral.py +0 -0
  95. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/models/zai.py +0 -0
  96. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/__init__.py +0 -0
  97. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/classify.py +0 -0
  98. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/core.py +0 -0
  99. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/docs/samples.py +0 -0
  100. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/optimizer.py +0 -0
  101. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/proposer.py +0 -0
  102. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/gepa/util.py +0 -0
  103. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/locate.py +0 -0
  104. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/ocr.py +0 -0
  105. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/score.py +0 -0
  106. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/pipelines/translate.py +0 -0
  107. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/rerank.py +0 -0
  108. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/__init__.py +0 -0
  109. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/__main__.py +0 -0
  110. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/adapters.py +0 -0
  111. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/auth.py +0 -0
  112. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/model_policy.py +0 -0
  113. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/models_anthropic.py +0 -0
  114. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/server/models_openai.py +0 -0
  115. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/skills/anthropic.py +0 -0
  116. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/skills/compat.py +0 -0
  117. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/anthropic/bash.py +0 -0
  118. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/anthropic/computer_use.py +0 -0
  119. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/anthropic/editor.py +0 -0
  120. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/base.py +0 -0
  121. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/gemini.py +0 -0
  122. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/builtin/openai.py +0 -0
  123. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/__init__.py +0 -0
  124. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/base.py +0 -0
  125. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/converters.py +0 -0
  126. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/cua/trycua.py +0 -0
  127. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/__init__.py +0 -0
  128. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/batch_tool.py +0 -0
  129. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/docs.py +0 -0
  130. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/email.py +0 -0
  131. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/full_text_search/tantivy_index.py +0 -0
  132. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/otc/__init__.py +0 -0
  133. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/otc/parse.py +0 -0
  134. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/rlm/parse.py +0 -0
  135. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sandbox/__init__.py +0 -0
  136. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sandbox/docker_sandbox.py +0 -0
  137. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sandbox/fargate_sandbox.py +0 -0
  138. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sandbox/modal_sandbox.py +0 -0
  139. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/sheets.py +0 -0
  140. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/skills.py +0 -0
  141. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/todos.py +0 -0
  142. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tool/prefab/tool_search.py +0 -0
  143. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/tracker.py +0 -0
  144. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/usage.py +0 -0
  145. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/harmony.py +0 -0
  146. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/json.py +0 -0
  147. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/spatial.py +0 -0
  148. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/util/xml.py +0 -0
  149. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge/warnings.py +0 -0
  150. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge.egg-info/dependency_links.txt +0 -0
  151. {lm_deluge-0.0.90 → lm_deluge-0.0.92}/src/lm_deluge.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: lm_deluge
- Version: 0.0.90
+ Version: 0.0.92
  Summary: Python utility for using LLM API models.
  Author-email: Benjamin Anderson <ben@trytaylor.ai>
  Requires-Python: >=3.10
@@ -9,7 +9,6 @@ License-File: LICENSE
  Requires-Dist: python-dotenv
  Requires-Dist: json5
  Requires-Dist: PyYAML
- Requires-Dist: pandas
  Requires-Dist: aiohttp
  Requires-Dist: tiktoken
  Requires-Dist: xxhash
@@ -49,9 +48,9 @@ Dynamic: license-file
  - **Spray across models/providers** – Configure a client with multiple models from any provider(s), and sampling weights. The client samples a model for each request.
  - **Tool Use** – Unified API for defining tools for all providers, and creating tools automatically from python functions.
  - **MCP Support** – Instantiate a `Tool` from a local or remote MCP server so that any LLM can use it, whether or not that provider natively supports MCP.
- - **Computer Use** – We support Claude Computer Use via the computer_use argument to process_prompts_sync/async. It works with Anthropic's API; Bedrock's API is broken right now and rejects the tool definitions, but in principle this will work there too when Bedrock gets their sh*t together.
- - **Caching** – Save completions in a local or distributed cache to avoid repeated LLM calls to process the same input.
- - **Convenient message constructor** – No more looking up how to build an Anthropic messages list with images. Our `Conversation` and `Message` classes work great with our client or with the `openai` and `anthropic` packages.
+ - **Computer Use** – We support computer use for all major providers, and have pre-fabricated tools to integrate with Kernel, TryCUA, and more.
+ - **Local & Remote Caching** – Use Anthropic caching more easily with common patterns (system-only, tools-only, last N messages, etc.) Use client-side caching to save completions to avoid repeated LLM calls to process the same input.
+ - **Convenient message constructor** – No more looking up how to build an Anthropic messages list with images. Our `Conversation` and `Message` classes work great with our `LLMClient` or with the `openai` and `anthropic` packages.
  - **Sync and async APIs** – Use the client from sync or async code.

  **STREAMING IS NOT IN SCOPE.** There are plenty of packages that let you stream chat completions across providers. The sole purpose of this package is to do very fast batch inference using APIs. Sorry!
@@ -146,7 +145,7 @@ Constructing conversations to pass to models is notoriously annoying. Each provi
  ```python
  from lm_deluge import Message, Conversation

- prompt = Conversation.system("You are a helpful assistant.").add(
+ prompt = Conversation().system("You are a helpful assistant.").add(
  Message.user("What's in this image?").add_image("tests/image.jpg")
  )

@@ -167,7 +166,7 @@ from lm_deluge import LLMClient, Conversation

  # Simple file upload
  client = LLMClient("gpt-4.1-mini")
- conversation = Conversation.user(
+ conversation = Conversation().user(
  "Please summarize this document",
  file="path/to/document.pdf"
  )
@@ -176,7 +175,7 @@ resps = client.process_prompts_sync([conversation])
  # You can also create File objects for more control
  from lm_deluge import File
  file = File("path/to/report.pdf", filename="Q4_Report.pdf")
- conversation = Conversation.user("Analyze this financial report")
+ conversation = Conversation().user("Analyze this financial report")
  conversation.messages[0].parts.append(file)
  ```

@@ -246,7 +245,7 @@ for tool_call in resps[0].tool_calls:
  import asyncio

  async def main():
- conv = Conversation.user("List the files in the current directory")
+ conv = Conversation().user("List the files in the current directory")
  conv, resp = await client.run_agent_loop(conv, tools=tools)
  print(resp.content.completion)

@@ -262,7 +261,7 @@ from lm_deluge import LLMClient, Conversation, Message

  # Create a conversation with system message
  conv = (
- Conversation.system("You are an expert Python developer with deep knowledge of async programming.")
+ Conversation().system("You are an expert Python developer with deep knowledge of async programming.")
  .add(Message.user("How do I use asyncio.gather?"))
  )

@@ -8,9 +8,9 @@
  - **Spray across models/providers** – Configure a client with multiple models from any provider(s), and sampling weights. The client samples a model for each request.
  - **Tool Use** – Unified API for defining tools for all providers, and creating tools automatically from python functions.
  - **MCP Support** – Instantiate a `Tool` from a local or remote MCP server so that any LLM can use it, whether or not that provider natively supports MCP.
- - **Computer Use** – We support Claude Computer Use via the computer_use argument to process_prompts_sync/async. It works with Anthropic's API; Bedrock's API is broken right now and rejects the tool definitions, but in principle this will work there too when Bedrock gets their sh*t together.
- - **Caching** – Save completions in a local or distributed cache to avoid repeated LLM calls to process the same input.
- - **Convenient message constructor** – No more looking up how to build an Anthropic messages list with images. Our `Conversation` and `Message` classes work great with our client or with the `openai` and `anthropic` packages.
+ - **Computer Use** – We support computer use for all major providers, and have pre-fabricated tools to integrate with Kernel, TryCUA, and more.
+ - **Local & Remote Caching** – Use Anthropic caching more easily with common patterns (system-only, tools-only, last N messages, etc.) Use client-side caching to save completions to avoid repeated LLM calls to process the same input.
+ - **Convenient message constructor** – No more looking up how to build an Anthropic messages list with images. Our `Conversation` and `Message` classes work great with our `LLMClient` or with the `openai` and `anthropic` packages.
  - **Sync and async APIs** – Use the client from sync or async code.

  **STREAMING IS NOT IN SCOPE.** There are plenty of packages that let you stream chat completions across providers. The sole purpose of this package is to do very fast batch inference using APIs. Sorry!
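The rewritten "Convenient message constructor" bullet above refers to the builder-style API that the README examples in the following hunks switch to: `Conversation()` is instantiated first, then `.system()` / `.user()` calls are chained. A minimal sketch assembled only from snippets that appear elsewhere in this diff (model name and image path included):

```python
from lm_deluge import LLMClient, Message, Conversation

# 0.0.92 builder style: instantiate Conversation() first, then chain calls.
prompt = Conversation().system("You are a helpful assistant.").add(
    Message.user("What's in this image?").add_image("tests/image.jpg")
)

client = LLMClient("gpt-4.1-mini")
resps = client.process_prompts_sync([prompt])
```

The only change relative to 0.0.90 is the leading `Conversation()` call replacing the old `Conversation.system(...)` classmethod style; the rest of the chain is unchanged.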
@@ -105,7 +105,7 @@ Constructing conversations to pass to models is notoriously annoying. Each provi
  ```python
  from lm_deluge import Message, Conversation

- prompt = Conversation.system("You are a helpful assistant.").add(
+ prompt = Conversation().system("You are a helpful assistant.").add(
  Message.user("What's in this image?").add_image("tests/image.jpg")
  )

@@ -126,7 +126,7 @@ from lm_deluge import LLMClient, Conversation

  # Simple file upload
  client = LLMClient("gpt-4.1-mini")
- conversation = Conversation.user(
+ conversation = Conversation().user(
  "Please summarize this document",
  file="path/to/document.pdf"
  )
@@ -135,7 +135,7 @@ resps = client.process_prompts_sync([conversation])
  # You can also create File objects for more control
  from lm_deluge import File
  file = File("path/to/report.pdf", filename="Q4_Report.pdf")
- conversation = Conversation.user("Analyze this financial report")
+ conversation = Conversation().user("Analyze this financial report")
  conversation.messages[0].parts.append(file)
  ```

@@ -205,7 +205,7 @@ for tool_call in resps[0].tool_calls:
  import asyncio

  async def main():
- conv = Conversation.user("List the files in the current directory")
+ conv = Conversation().user("List the files in the current directory")
  conv, resp = await client.run_agent_loop(conv, tools=tools)
  print(resp.content.completion)

@@ -221,7 +221,7 @@ from lm_deluge import LLMClient, Conversation, Message

  # Create a conversation with system message
  conv = (
- Conversation.system("You are an expert Python developer with deep knowledge of async programming.")
+ Conversation().system("You are an expert Python developer with deep knowledge of async programming.")
  .add(Message.user("How do I use asyncio.gather?"))
  )

@@ -3,7 +3,7 @@ requires = ["setuptools", "wheel"]

  [project]
  name = "lm_deluge"
- version = "0.0.90"
+ version = "0.0.92"
  authors = [{ name = "Benjamin Anderson", email = "ben@trytaylor.ai" }]
  description = "Python utility for using LLM API models."
  readme = "README.md"
@@ -15,7 +15,6 @@ dependencies = [
  "python-dotenv",
  "json5",
  "PyYAML",
- "pandas",
  "aiohttp",
  "tiktoken",
  "xxhash",
@@ -28,8 +27,7 @@ dependencies = [
  "pdf2image",
  "pillow",
  "fastmcp>=2.4",
- "rich",
- # "textual>=0.58.0"
+ "rich"
  ]

  [project.optional-dependencies]
@@ -39,5 +37,6 @@ full_text_search = ["tantivy>=0.21.0", "lenlp>=0.1.0"]
  sandbox = ["modal>=0.64.0", "daytona-sdk>=0.1.4", "docker>=7.0.0"]
  server = ["fastapi>=0.100.0", "uvicorn>=0.20.0"]

- # [project.scripts]
- # deluge = "lm_deluge.cli:main"
+ [project.scripts]
+ deluge = "lm_deluge.cli:main"
+ deluge-server = "lm_deluge.server.__main__:main"
@@ -1,7 +1,6 @@
  from .client import APIResponse, LLMClient, SamplingParams
- from .file import File
- from .prompt import Conversation, Message
- from .tool import Tool
+ from .prompt import Conversation, Message, File
+ from .tool import Tool, MCPServer

  # dotenv.load_dotenv() - don't do this, fucks with other packages

@@ -12,5 +11,6 @@ __all__ = [
  "Conversation",
  "Message",
  "Tool",
+ "MCPServer",
  "File",
  ]
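As a quick orientation for the new export surface: `File` now comes from the `lm_deluge.prompt` subpackage rather than `lm_deluge.file`, and `MCPServer` is exported at the top level for the first time. A hedged sketch of imports that should work after this change, using only names visible in the hunk above and in the README examples in this diff:

```python
from lm_deluge import (
    APIResponse,
    Conversation,
    File,
    LLMClient,
    MCPServer,   # newly re-exported in 0.0.92
    Message,
    SamplingParams,
    Tool,
)

# File still resolves from the package root even though it moved into
# the new lm_deluge.prompt subpackage.
conversation = Conversation().user("Analyze this financial report")
conversation.messages[0].parts.append(
    File("path/to/report.pdf", filename="Q4_Report.pdf")
)
```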
@@ -10,7 +10,7 @@ from lm_deluge.prompt import (
  Thinking,
  ToolCall,
  )
- from lm_deluge.request_context import RequestContext
+ from lm_deluge.api_requests.context import RequestContext
  from lm_deluge.tool import MCPServer, Tool
  from lm_deluge.usage import Usage
  from lm_deluge.util.schema import (
@@ -103,7 +103,9 @@ def _build_anthropic_request(
  if "top_p" in request_json:
  request_json["top_p"] = max(request_json["top_p"], 0.95)
  request_json["temperature"] = 1.0
- request_json["max_tokens"] += budget
+ max_tokens = request_json["max_tokens"]
+ assert isinstance(max_tokens, int)
+ request_json["max_tokens"] = max_tokens + budget
  else:
  request_json["thinking"] = {"type": "disabled"}
  if "kimi" in model.id and "thinking" in model.id:
@@ -10,7 +10,7 @@ from aiohttp import ClientResponse

  from ..errors import raise_if_modal_exception
  from ..models.openai import OPENAI_MODELS
- from ..request_context import RequestContext
+ from ..api_requests.context import RequestContext
  from .response import APIResponse

@@ -20,7 +20,7 @@ from lm_deluge.prompt import (
  Thinking,
  ToolCall,
  )
- from lm_deluge.request_context import RequestContext
+ from lm_deluge.api_requests.context import RequestContext
  from lm_deluge.tool import MCPServer, Tool
  from lm_deluge.usage import Usage

@@ -263,6 +263,11 @@ class BedrockRequest(APIRequestBase):
  # Create a fake requests.PreparedRequest object for AWS4Auth to sign
  import requests

+ assert self.url is not None, "URL must be set after build_request"
+ assert (
+ self.request_header is not None
+ ), "Headers must be set after build_request"
+
  fake_request = requests.Request(
  method="POST",
  url=self.url,
@@ -0,0 +1,299 @@
+ import asyncio
+ import json
+ import os
+
+ from aiohttp import ClientResponse
+
+ try:
+ from requests_aws4auth import AWS4Auth
+ except ImportError:
+ raise ImportError(
+ "aws4auth is required for bedrock support. Install with: pip install requests-aws4auth"
+ )
+
+ from lm_deluge.prompt import Message, Text, ToolCall
+ from lm_deluge.api_requests.context import RequestContext
+ from lm_deluge.tool import MCPServer, Tool
+ from lm_deluge.usage import Usage
+
+ from ..models import APIModel
+ from .base import APIRequestBase, APIResponse
+
+
+ def _convert_tool_to_nova(tool: Tool) -> dict:
+ """Convert a Tool to Nova toolSpec format."""
+ return {
+ "toolSpec": {
+ "name": tool.name,
+ "description": tool.description,
+ "inputSchema": {
+ "json": {
+ "type": "object",
+ "properties": tool.parameters,
+ "required": tool.required or [],
+ }
+ },
+ }
+ }
+
+
+ async def _build_nova_request(
+ model: APIModel,
+ context: RequestContext,
+ ):
+ """Build request for Amazon Nova models on Bedrock."""
+ prompt = context.prompt
+ tools = context.tools
+ sampling_params = context.sampling_params
+ cache_pattern = context.cache
+
+ # Handle AWS auth
+ access_key = os.getenv("AWS_ACCESS_KEY_ID")
+ secret_key = os.getenv("AWS_SECRET_ACCESS_KEY")
+ session_token = os.getenv("AWS_SESSION_TOKEN")
+
+ if not access_key or not secret_key:
+ raise ValueError(
+ "AWS credentials not found. Please set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables."
+ )
+
+ # Use us-west-2 for cross-region inference models
+ region = "us-west-2"
+
+ # Construct the endpoint URL
+ service = "bedrock"
+ url = f"https://bedrock-runtime.{region}.amazonaws.com/model/{model.name}/invoke"
+
+ # Prepare headers
+ auth = AWS4Auth(
+ access_key,
+ secret_key,
+ region,
+ service,
+ session_token=session_token,
+ )
+
+ base_headers = {
+ "Content-Type": "application/json",
+ }
+
+ # Convert conversation to Nova format with optional caching
+ system_list, messages = prompt.to_nova(cache_pattern=cache_pattern)
+
+ # Build request body
+ request_json = {
+ "schemaVersion": "messages-v1",
+ "messages": messages,
+ "inferenceConfig": {
+ "maxTokens": sampling_params.max_new_tokens,
+ "temperature": sampling_params.temperature,
+ "topP": sampling_params.top_p,
+ },
+ }
+
+ # Add system prompt if present
+ if system_list:
+ request_json["system"] = system_list
+
+ # Add tools if present
+ if tools:
+ tool_definitions = []
+ for tool in tools:
+ if isinstance(tool, Tool):
+ tool_definitions.append(_convert_tool_to_nova(tool))
+ elif isinstance(tool, MCPServer):
+ # Convert MCP server to individual tools
+ individual_tools = await tool.to_tools()
+ for individual_tool in individual_tools:
+ tool_definitions.append(_convert_tool_to_nova(individual_tool))
+
+ if tool_definitions:
+ request_json["toolConfig"] = {"tools": tool_definitions}
+
+ return request_json, base_headers, auth, url, region
+
+
+ class BedrockNovaRequest(APIRequestBase):
+ """Request handler for Amazon Nova models on Bedrock."""
+
+ def __init__(self, context: RequestContext):
+ super().__init__(context=context)
+ self.model = APIModel.from_registry(self.context.model_name)
+ self.region = None
+
+ async def build_request(self):
+ (
+ self.request_json,
+ base_headers,
+ self.auth,
+ self.url,
+ self.region,
+ ) = await _build_nova_request(self.model, self.context)
+
+ self.request_header = self.merge_headers(
+ base_headers, exclude_patterns=["anthropic", "openai", "gemini", "mistral"]
+ )
+
+ async def execute_once(self) -> APIResponse:
+ """Override execute_once to handle AWS4Auth signing."""
+ await self.build_request()
+ import aiohttp
+
+ assert self.context.status_tracker
+
+ self.context.status_tracker.total_requests += 1
+ timeout = aiohttp.ClientTimeout(total=self.context.request_timeout)
+
+ # Prepare the request data
+ payload = json.dumps(self.request_json, separators=(",", ":")).encode("utf-8")
+
+ # Create a fake requests.PreparedRequest object for AWS4Auth to sign
+ import requests
+
+ assert self.url is not None, "URL must be set after build_request"
+ assert (
+ self.request_header is not None
+ ), "Headers must be set after build_request"
+
+ fake_request = requests.Request(
+ method="POST",
+ url=self.url,
+ data=payload,
+ headers=self.request_header.copy(),
+ )
+
+ prepared_request = fake_request.prepare()
+ signed_request = self.auth(prepared_request)
+ signed_headers = dict(signed_request.headers)
+
+ try:
+ async with aiohttp.ClientSession(timeout=timeout) as session:
+ async with session.post(
+ url=self.url,
+ headers=signed_headers,
+ data=payload,
+ ) as http_response:
+ response: APIResponse = await self.handle_response(http_response)
+ return response
+
+ except asyncio.TimeoutError:
+ return APIResponse(
+ id=self.context.task_id,
+ model_internal=self.context.model_name,
+ prompt=self.context.prompt,
+ sampling_params=self.context.sampling_params,
+ status_code=None,
+ is_error=True,
+ error_message="Request timed out (terminated by client).",
+ content=None,
+ usage=None,
+ )
+
+ except Exception as e:
+ from ..errors import raise_if_modal_exception
+
+ raise_if_modal_exception(e)
+ return APIResponse(
+ id=self.context.task_id,
+ model_internal=self.context.model_name,
+ prompt=self.context.prompt,
+ sampling_params=self.context.sampling_params,
+ status_code=None,
+ is_error=True,
+ error_message=f"Unexpected {type(e).__name__}: {str(e) or 'No message.'}",
+ content=None,
+ usage=None,
+ )
+
+ async def handle_response(self, http_response: ClientResponse) -> APIResponse:
+ is_error = False
+ error_message = None
+ content = None
+ usage = None
+ finish_reason = None
+ status_code = http_response.status
+ mimetype = http_response.headers.get("Content-Type", None)
+ data = None
+ assert self.context.status_tracker
+
+ if status_code >= 200 and status_code < 300:
+ try:
+ data = await http_response.json()
+
+ # Parse Nova response format
+ # Nova returns: {"output": {"message": {"role": "assistant", "content": [...]}}, "usage": {...}, "stopReason": "..."}
+ output = data.get("output", {})
+ message = output.get("message", {})
+ response_content = message.get("content", [])
+ finish_reason = data.get("stopReason")
+
+ parts = []
+ for item in response_content:
+ if "text" in item:
+ parts.append(Text(item["text"]))
+ elif "toolUse" in item:
+ tool_use = item["toolUse"]
+ parts.append(
+ ToolCall(
+ id=tool_use["toolUseId"],
+ name=tool_use["name"],
+ arguments=tool_use.get("input", {}),
+ )
+ )
+
+ content = Message("assistant", parts)
+
+ # Parse usage including cache tokens
+ # Note: Nova uses "cacheReadInputTokenCount" and "cacheWriteInputTokenCount"
+ raw_usage = data.get("usage", {})
+ usage = Usage(
+ input_tokens=raw_usage.get("inputTokens", 0),
+ output_tokens=raw_usage.get("outputTokens", 0),
+ cache_read_tokens=raw_usage.get("cacheReadInputTokenCount", 0),
+ cache_write_tokens=raw_usage.get("cacheWriteInputTokenCount", 0),
+ )
+
+ except Exception as e:
+ is_error = True
+ error_message = (
+ f"Error calling .json() on response w/ status {status_code}: {e}"
+ )
+ elif mimetype and "json" in mimetype.lower():
+ is_error = True
+ data = await http_response.json()
+ error_message = json.dumps(data)
+ else:
+ is_error = True
+ text = await http_response.text()
+ error_message = text
+
+ # Handle special kinds of errors
+ retry_with_different_model = status_code in [529, 429, 400, 401, 403, 413]
+ if is_error and error_message is not None:
+ if (
+ "rate limit" in error_message.lower()
+ or "throttling" in error_message.lower()
+ or status_code == 429
+ ):
+ error_message += " (Rate limit error, triggering cooldown.)"
+ self.context.status_tracker.rate_limit_exceeded()
+ if "context length" in error_message or "too long" in error_message:
+ error_message += " (Context length exceeded, set retries to 0.)"
+ self.context.attempts_left = 0
+ retry_with_different_model = True
+
+ return APIResponse(
+ id=self.context.task_id,
+ status_code=status_code,
+ is_error=is_error,
+ error_message=error_message,
+ prompt=self.context.prompt,
+ content=content,
+ model_internal=self.context.model_name,
+ region=self.region,
+ sampling_params=self.context.sampling_params,
+ usage=usage,
+ raw_response=data,
+ finish_reason=finish_reason,
+ retry_with_different_model=retry_with_different_model,
+ )
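For context on the new module: `_build_nova_request` above assembles a body for Bedrock's `/model/{id}/invoke` endpoint using the `messages-v1` schema. A rough illustration of the resulting payload follows; the keys mirror the function above, the message and system shapes are assumed from the Bedrock Nova schema (the `prompt.to_nova(...)` converter itself is not shown in this diff), the values are placeholders, and `get_weather` is a hypothetical tool.

```python
# Illustrative only; keys mirror _build_nova_request, values are placeholders.
request_json = {
    "schemaVersion": "messages-v1",
    "messages": [
        # assumed shape produced by prompt.to_nova(); not shown in this diff
        {"role": "user", "content": [{"text": "Summarize this document."}]},
    ],
    "inferenceConfig": {
        "maxTokens": 1024,   # sampling_params.max_new_tokens
        "temperature": 0.7,  # sampling_params.temperature
        "topP": 0.9,         # sampling_params.top_p
    },
    # only included when the conversation carries a system prompt
    "system": [{"text": "You are a helpful assistant."}],
    # only included when Tool / MCPServer objects are passed
    "toolConfig": {
        "tools": [
            {
                "toolSpec": {
                    "name": "get_weather",
                    "description": "Look up the weather for a city.",
                    "inputSchema": {
                        "json": {
                            "type": "object",
                            "properties": {"city": {"type": "string"}},
                            "required": ["city"],
                        }
                    },
                }
            }
        ]
    },
}
```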
@@ -2,6 +2,7 @@ from .openai import OpenAIRequest, OpenAIResponsesRequest
  from .anthropic import AnthropicRequest
  from .mistral import MistralRequest
  from .bedrock import BedrockRequest
+ from .bedrock_nova import BedrockNovaRequest
  from .gemini import GeminiRequest

  CLASSES = {
@@ -10,5 +11,6 @@ CLASSES = {
  "anthropic": AnthropicRequest,
  "mistral": MistralRequest,
  "bedrock": BedrockRequest,
+ "bedrock-nova": BedrockNovaRequest,
  "gemini": GeminiRequest,
  }
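The two added lines above register the new handler under the `"bedrock-nova"` key. How the registry is consumed is not visible in this diff, so the dispatch helper below is a hypothetical sketch; only `CLASSES`, `BedrockNovaRequest(context)`, and `execute_once()` are taken from the code shown here.

```python
from lm_deluge.api_requests.common import CLASSES

async def dispatch(context, api_name: str):
    """Hypothetical helper: look up a request class by API name and run it."""
    request_cls = CLASSES[api_name]       # e.g. CLASSES["bedrock-nova"] -> BedrockNovaRequest
    request = request_cls(context)        # each handler is built from a RequestContext
    return await request.execute_once()   # returns an APIResponse
```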
@@ -2,9 +2,9 @@ from dataclasses import dataclass, field
  from functools import cached_property
  from typing import Any, Callable, Sequence, TYPE_CHECKING

- from .config import SamplingParams
- from .prompt import CachePattern, Conversation
- from .tracker import StatusTracker
+ from ..config import SamplingParams
+ from ..prompt import CachePattern, Conversation
+ from ..tracker import StatusTracker

  if TYPE_CHECKING:
  from pydantic import BaseModel
@@ -83,4 +83,4 @@ class RequestContext:
  # Update with any overrides
  current_values.update(overrides)

- return RequestContext(**current_values)
+ return RequestContext(**current_values) # type: ignore[arg-type]
@@ -1,9 +1,10 @@
  import json
  import os
+ from typing import Any

  from aiohttp import ClientResponse

- from lm_deluge.request_context import RequestContext
+ from lm_deluge.api_requests.context import RequestContext
  from lm_deluge.tool import Tool
  from lm_deluge.warnings import maybe_warn

@@ -37,13 +38,14 @@ async def _build_gemini_request(
  part_type="function call",
  )

- request_json = {
+ generation_config: dict[str, Any] = {
+ "temperature": sampling_params.temperature,
+ "topP": sampling_params.top_p,
+ "maxOutputTokens": sampling_params.max_new_tokens,
+ }
+ request_json: dict[str, Any] = {
  "contents": messages,
- "generationConfig": {
- "temperature": sampling_params.temperature,
- "topP": sampling_params.top_p,
- "maxOutputTokens": sampling_params.max_new_tokens,
- },
+ "generationConfig": generation_config,
  }

  # Add system instruction if present
@@ -83,7 +85,7 @@ async def _build_gemini_request(
  }
  effort = level_map[effort_key]
  thinking_config = {"thinkingLevel": effort}
- request_json["generationConfig"]["thinkingConfig"] = thinking_config
+ generation_config["thinkingConfig"] = thinking_config

  elif model.reasoning_model:
  if (
@@ -126,7 +128,7 @@ async def _build_gemini_request(
  # no thoughts head empty
  thinking_config = {"includeThoughts": False, "thinkingBudget": 0}

- request_json["generationConfig"]["thinkingConfig"] = thinking_config
+ generation_config["thinkingConfig"] = thinking_config

  else:
  if sampling_params.reasoning_effort:
@@ -171,14 +173,14 @@ async def _build_gemini_request(

  # Handle JSON mode
  if sampling_params.json_mode and model.supports_json:
- request_json["generationConfig"]["responseMimeType"] = "application/json"
+ generation_config["responseMimeType"] = "application/json"

  # Handle media_resolution for Gemini 3 (requires v1alpha)
  if sampling_params.media_resolution is not None:
  is_gemini_3 = "gemini-3" in model.name.lower()
  if is_gemini_3:
  # Add global media resolution to generationConfig
- request_json["generationConfig"]["mediaResolution"] = {
+ generation_config["mediaResolution"] = {
  "level": sampling_params.media_resolution
  }
  else:
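The refactor above simply hoists the nested `generationConfig` dict into a local `generation_config` variable so later branches can mutate it directly. A hedged sketch of the keys that can end up in it, all taken from the hunks above; the concrete values are placeholders, and only one of the two `thinkingConfig` shapes would be set for a given model:

```python
from typing import Any

generation_config: dict[str, Any] = {
    "temperature": 0.7,        # sampling_params.temperature
    "topP": 0.95,              # sampling_params.top_p
    "maxOutputTokens": 2048,   # sampling_params.max_new_tokens
    # reasoning models get either {"thinkingLevel": ...} (Gemini 3 effort levels)
    # or {"includeThoughts": ..., "thinkingBudget": ...}
    "thinkingConfig": {"includeThoughts": False, "thinkingBudget": 0},
    # only when json_mode is set and the model supports JSON output
    "responseMimeType": "application/json",
    # Gemini 3 only, when sampling_params.media_resolution is provided
    "mediaResolution": {"level": "..."},  # placeholder level value
}
request_json: dict[str, Any] = {
    "contents": [],  # messages from the conversation
    "generationConfig": generation_config,
}
```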
@@ -7,7 +7,7 @@ from lm_deluge.warnings import maybe_warn

  from ..models import APIModel
  from ..prompt import Message
- from ..request_context import RequestContext
+ from ..api_requests.context import RequestContext
  from ..usage import Usage
  from .base import APIRequestBase, APIResponse
13