lm-deluge 0.0.76__tar.gz → 0.0.79__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {lm_deluge-0.0.76/src/lm_deluge.egg-info → lm_deluge-0.0.79}/PKG-INFO +9 -8
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/README.md +8 -7
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/pyproject.toml +1 -1
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/gemini.py +78 -11
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/client.py +1 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/config.py +7 -0
- lm_deluge-0.0.79/src/lm_deluge/llm_tools/filesystem.py +821 -0
- lm_deluge-0.0.79/src/lm_deluge/llm_tools/sandbox.py +523 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/google.py +15 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/openrouter.py +10 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/prompt.py +62 -24
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/warnings.py +2 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79/src/lm_deluge.egg-info}/PKG-INFO +9 -8
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge.egg-info/SOURCES.txt +3 -0
- lm_deluge-0.0.79/tests/test_filesystem.py +119 -0
- lm_deluge-0.0.79/tests/test_filesystem_live.py +82 -0
- lm_deluge-0.0.76/src/lm_deluge/llm_tools/filesystem.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/LICENSE +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/setup.cfg +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/__init__.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/__init__.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/anthropic.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/base.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/bedrock.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/chat_reasoning.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/common.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/deprecated/bedrock.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/deprecated/cohere.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/deprecated/deepseek.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/deprecated/mistral.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/deprecated/vertex.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/mistral.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/openai.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/api_requests/response.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/batches.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/built_in_tools/anthropic/__init__.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/built_in_tools/anthropic/bash.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/built_in_tools/anthropic/computer_use.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/built_in_tools/anthropic/editor.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/built_in_tools/base.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/built_in_tools/openai.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/cache.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/cli.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/embed.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/errors.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/file.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/image.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/__init__.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/classify.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/extract.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/locate.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/ocr.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/score.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/subagents.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/todos.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/llm_tools/translate.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/mock_openai.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/__init__.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/anthropic.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/bedrock.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/cerebras.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/cohere.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/deepseek.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/fireworks.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/grok.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/groq.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/kimi.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/meta.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/minimax.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/mistral.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/openai.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/models/together.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/presets/cerebras.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/presets/meta.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/request_context.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/rerank.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/tool.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/tracker.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/usage.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/harmony.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/json.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/logprobs.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/schema.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/spatial.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/validation.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge/util/xml.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge.egg-info/dependency_links.txt +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge.egg-info/requires.txt +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/src/lm_deluge.egg-info/top_level.txt +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/tests/test_builtin_tools.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/tests/test_file_upload.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/tests/test_mock_openai.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/tests/test_native_mcp_server.py +0 -0
- {lm_deluge-0.0.76 → lm_deluge-0.0.79}/tests/test_openrouter_generic.py +0 -0
PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lm_deluge
-Version: 0.0.76
+Version: 0.0.79
 Summary: Python utility for using LLM API models.
 Author-email: Benjamin Anderson <ben@trytaylor.ai>
 Requires-Python: >=3.10
@@ -52,7 +52,7 @@ Dynamic: license-file
 pip install lm-deluge
 ```
 
-The package relies on environment variables for API keys. Typical variables include `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `COHERE_API_KEY`, `META_API_KEY`, and `
+The package relies on environment variables for API keys. Typical variables include `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `COHERE_API_KEY`, `META_API_KEY`, and `GEMINI_API_KEY`. `LLMClient` will automatically load the `.env` file when imported; we recommend using that to set the environment variables. For Bedrock, you'll need to set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
 
 ## Quickstart
 
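The updated paragraph above lists the variables the client reads and notes that `LLMClient` loads a `.env` file on import. A minimal sketch of setting those variables directly from Python instead (names taken from the paragraph; the values are placeholders):

```python
# Placeholder values; set only the keys for the providers you actually use.
import os

os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
os.environ["GEMINI_API_KEY"] = "..."
# Bedrock uses standard AWS credentials:
os.environ["AWS_ACCESS_KEY_ID"] = "..."
os.environ["AWS_SECRET_ACCESS_KEY"] = "..."

from lm_deluge import LLMClient  # a .env file in the working directory works as well
```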
@@ -61,9 +61,9 @@ The package relies on environment variables for API keys. Typical variables incl
 ```python
 from lm_deluge import LLMClient
 
-client = LLMClient("gpt-
+client = LLMClient("gpt-4.1-mini")
 resps = client.process_prompts_sync(["Hello, world!"])
-print(
+print(resps[0].completion)
 ```
 
 ## Spraying Across Models
@@ -74,13 +74,13 @@ To distribute your requests across models, just provide a list of more than one
 from lm_deluge import LLMClient
 
 client = LLMClient(
-    ["gpt-
+    ["gpt-4.1-mini", "claude-4.5-haiku"],
     max_requests_per_minute=10_000
 )
 resps = client.process_prompts_sync(
     ["Hello, ChatGPT!", "Hello, Claude!"]
 )
-print(
+print(resps[0].completion)
 ```
 
 ## Configuration
@@ -181,7 +181,7 @@ def get_weather(city: str) -> str:
     return f"The weather in {city} is sunny and 72°F"
 
 tool = Tool.from_function(get_weather)
-client = LLMClient("claude-
+client = LLMClient("claude-4.5-haiku")
 resps = client.process_prompts_sync(
     ["What's the weather in Paris?"],
     tools=[tool]
@@ -255,7 +255,7 @@ conv = (
 )
 
 # Use prompt caching to cache system message and tools
-client = LLMClient("claude-
+client = LLMClient("claude-4.5-sonnet")
 resps = client.process_prompts_sync(
     [conv],
     cache="system_and_tools" # Cache system message and any tools
@@ -301,5 +301,6 @@ The `lm_deluge.llm_tools` package exposes a few helper functions:
 - `extract` – structure text or images into a Pydantic model based on a schema.
 - `translate` – translate a list of strings to English.
 - `score_llm` – simple yes/no style scoring with optional log probability output.
+- `FilesystemManager` – expose a sandboxed read/write filesystem tool (with optional regex search and `apply_patch` support) that agents can call without touching the host machine.
 
 Experimental embeddings (`embed.embed_parallel_async`) and document reranking (`rerank.rerank_parallel_async`) clients are also provided.
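The new `FilesystemManager` entry is only described at the README level here; its actual interface lives in the new `src/lm_deluge/llm_tools/filesystem.py`, which this summary does not show. A purely hypothetical usage sketch in the spirit of the `tools=[...]` examples above (the constructor argument and the `.tools` attribute are assumptions, not the package's confirmed API):

```python
# Hypothetical sketch only: FilesystemManager's constructor and tool accessor are
# assumed for illustration; see src/lm_deluge/llm_tools/filesystem.py for the real API.
from lm_deluge import LLMClient
from lm_deluge.llm_tools import FilesystemManager

fs = FilesystemManager("/tmp/agent-sandbox")  # assumed: sandbox root directory
client = LLMClient("claude-4.5-sonnet")
resps = client.process_prompts_sync(
    ["Create notes.md summarizing the files in your sandbox."],
    tools=fs.tools,  # assumed: Tool objects for read/write/search/apply_patch
)
print(resps[0].completion)
```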
README.md

@@ -23,7 +23,7 @@
 pip install lm-deluge
 ```
 
-The package relies on environment variables for API keys. Typical variables include `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `COHERE_API_KEY`, `META_API_KEY`, and `
+The package relies on environment variables for API keys. Typical variables include `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `COHERE_API_KEY`, `META_API_KEY`, and `GEMINI_API_KEY`. `LLMClient` will automatically load the `.env` file when imported; we recommend using that to set the environment variables. For Bedrock, you'll need to set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
 
 ## Quickstart
 
@@ -32,9 +32,9 @@ The package relies on environment variables for API keys. Typical variables incl
 ```python
 from lm_deluge import LLMClient
 
-client = LLMClient("gpt-
+client = LLMClient("gpt-4.1-mini")
 resps = client.process_prompts_sync(["Hello, world!"])
-print(
+print(resps[0].completion)
 ```
 
 ## Spraying Across Models
@@ -45,13 +45,13 @@ To distribute your requests across models, just provide a list of more than one
 from lm_deluge import LLMClient
 
 client = LLMClient(
-    ["gpt-
+    ["gpt-4.1-mini", "claude-4.5-haiku"],
     max_requests_per_minute=10_000
 )
 resps = client.process_prompts_sync(
     ["Hello, ChatGPT!", "Hello, Claude!"]
 )
-print(
+print(resps[0].completion)
 ```
 
 ## Configuration
@@ -152,7 +152,7 @@ def get_weather(city: str) -> str:
     return f"The weather in {city} is sunny and 72°F"
 
 tool = Tool.from_function(get_weather)
-client = LLMClient("claude-
+client = LLMClient("claude-4.5-haiku")
 resps = client.process_prompts_sync(
     ["What's the weather in Paris?"],
     tools=[tool]
@@ -226,7 +226,7 @@ conv = (
 )
 
 # Use prompt caching to cache system message and tools
-client = LLMClient("claude-
+client = LLMClient("claude-4.5-sonnet")
 resps = client.process_prompts_sync(
     [conv],
     cache="system_and_tools" # Cache system message and any tools
@@ -272,5 +272,6 @@ The `lm_deluge.llm_tools` package exposes a few helper functions:
 - `extract` – structure text or images into a Pydantic model based on a schema.
 - `translate` – translate a list of strings to English.
 - `score_llm` – simple yes/no style scoring with optional log probability output.
+- `FilesystemManager` – expose a sandboxed read/write filesystem tool (with optional regex search and `apply_patch` support) that agents can call without touching the host machine.
 
 Experimental embeddings (`embed.embed_parallel_async`) and document reranking (`rerank.rerank_parallel_async`) clients are also provided.

src/lm_deluge/api_requests/gemini.py

@@ -23,6 +23,21 @@ async def _build_gemini_request(
 ) -> dict:
     system_message, messages = prompt.to_gemini()
 
+    # For Gemini 3, inject dummy signatures when missing for function calls
+    is_gemini_3 = "gemini-3" in model.name.lower()
+    if is_gemini_3:
+        dummy_sig = "context_engineering_is_the_way_to_go"
+        for msg in messages:
+            if "parts" in msg:
+                for part in msg["parts"]:
+                    # For function calls, inject dummy signature if missing
+                    if "functionCall" in part and "thoughtSignature" not in part:
+                        part["thoughtSignature"] = dummy_sig
+                        maybe_warn(
+                            "WARN_GEMINI3_MISSING_SIGNATURE",
+                            part_type="function call",
+                        )
+
     request_json = {
         "contents": messages,
         "generationConfig": {
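The block added above is a plain dict transform over the Gemini `contents` payload. A standalone sketch of the same logic, pulled into a hypothetical helper for illustration (the package runs it inline inside `_build_gemini_request` and reports through `maybe_warn` rather than returning a count):

```python
# Sketch of the signature-injection logic shown in the hunk above.
DUMMY_SIG = "context_engineering_is_the_way_to_go"

def inject_missing_signatures(messages: list[dict]) -> int:
    """Add a placeholder thoughtSignature to functionCall parts that lack one."""
    injected = 0
    for msg in messages:
        for part in msg.get("parts", []):
            if "functionCall" in part and "thoughtSignature" not in part:
                part["thoughtSignature"] = DUMMY_SIG
                injected += 1
    return injected

messages = [
    {
        "role": "model",
        "parts": [{"functionCall": {"name": "get_weather", "args": {"city": "Paris"}}}],
    }
]
print(inject_missing_signatures(messages))          # 1
print(messages[0]["parts"][0]["thoughtSignature"])  # context_engineering_is_the_way_to_go
```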
@@ -40,17 +55,44 @@ async def _build_gemini_request(
     if model.reasoning_model:
         thinking_config: dict[str, Any] | None = None
         effort = sampling_params.reasoning_effort
-
-
-
-
+        is_gemini_3 = "gemini-3" in model.name.lower()
+
+        if is_gemini_3:
+            # Gemini 3 uses thinkingLevel instead of thinkingBudget
+            if effort in {"none", "minimal"}:
+                thinking_config = {"thinkingLevel": "low"}
+            elif effort is None:
+                # Default to high when reasoning is enabled but no preference was provided
+                thinking_config = {"thinkingLevel": "high"}
+            else:
+                # Map reasoning_effort to thinkingLevel
+                level_map = {
+                    "minimal": "low",
+                    "low": "low",
+                    "medium": "medium", # Will work when supported
+                    "high": "high",
+                }
+                thinking_level = level_map.get(effort, "high")
+                thinking_config = {"thinkingLevel": thinking_level}
         else:
-
-            if effort
-            budget =
-
-
-
+            # Gemini 2.5 uses thinkingBudget (legacy)
+            if effort is None or effort == "none":
+                budget = 128 if "2.5-pro" in model.id else 0
+                # Explicitly disable thoughts when no effort is requested
+                thinking_config = {"includeThoughts": False, "thinkingBudget": budget}
+            else:
+                thinking_config = {"includeThoughts": True}
+                if (
+                    effort in {"minimal", "low", "medium", "high"}
+                    and "flash" in model.id
+                ):
+                    budget = {
+                        "minimal": 256,
+                        "low": 1024,
+                        "medium": 4096,
+                        "high": 16384,
+                    }[effort]
+                    thinking_config["thinkingBudget"] = budget
         request_json["generationConfig"]["thinkingConfig"] = thinking_config
 
     else:
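The branching added above reduces to a small mapping from `reasoning_effort` to Gemini thinking parameters. A self-contained sketch of that mapping, written as a hypothetical pure function for illustration (the package builds `thinking_config` inline; the model names in the example calls are placeholders):

```python
# Hypothetical helper mirroring the effort -> thinkingConfig mapping in the hunk above.
def build_thinking_config(model_name: str, model_id: str, effort: str | None) -> dict:
    if "gemini-3" in model_name.lower():
        # Gemini 3: thinkingLevel replaces thinkingBudget
        if effort in {"none", "minimal"}:
            return {"thinkingLevel": "low"}
        if effort is None:
            return {"thinkingLevel": "high"}
        level_map = {"minimal": "low", "low": "low", "medium": "medium", "high": "high"}
        return {"thinkingLevel": level_map.get(effort, "high")}
    # Gemini 2.5 (legacy): thinkingBudget
    if effort is None or effort == "none":
        budget = 128 if "2.5-pro" in model_id else 0
        return {"includeThoughts": False, "thinkingBudget": budget}
    config: dict = {"includeThoughts": True}
    if effort in {"minimal", "low", "medium", "high"} and "flash" in model_id:
        config["thinkingBudget"] = {"minimal": 256, "low": 1024, "medium": 4096, "high": 16384}[effort]
    return config

print(build_thinking_config("gemini-3-pro", "gemini-3-pro", "medium"))
# {'thinkingLevel': 'medium'}
print(build_thinking_config("gemini-2.5-flash", "gemini-2.5-flash", "low"))
# {'includeThoughts': True, 'thinkingBudget': 1024}
```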
@@ -66,6 +108,21 @@ async def _build_gemini_request(
     if sampling_params.json_mode and model.supports_json:
         request_json["generationConfig"]["responseMimeType"] = "application/json"
 
+    # Handle media_resolution for Gemini 3 (requires v1alpha)
+    if sampling_params.media_resolution is not None:
+        is_gemini_3 = "gemini-3" in model.name.lower()
+        if is_gemini_3:
+            # Add global media resolution to generationConfig
+            request_json["generationConfig"]["mediaResolution"] = {
+                "level": sampling_params.media_resolution
+            }
+        else:
+            # Warn if trying to use media_resolution on non-Gemini-3 models
+            maybe_warn(
+                "WARN_MEDIA_RESOLUTION_UNSUPPORTED",
+                model_name=model.name,
+            )
+
     return request_json
 
 
@@ -137,10 +194,19 @@ class GeminiRequest(APIRequestBase):
             candidate = data["candidates"][0]
             if "content" in candidate and "parts" in candidate["content"]:
                 for part in candidate["content"]["parts"]:
+                    # Extract thought signature if present
+                    thought_sig = part.get("thoughtSignature")
+
                     if "text" in part:
                         parts.append(Text(part["text"]))
                     elif "thought" in part:
-
+                        # Thought with optional signature
+                        parts.append(
+                            Thinking(
+                                content=part["thought"],
+                                thought_signature=thought_sig,
+                            )
+                        )
                     elif "functionCall" in part:
                         func_call = part["functionCall"]
                         # Generate a unique ID since Gemini doesn't provide one
@@ -152,6 +218,7 @@ class GeminiRequest(APIRequestBase):
                                 id=tool_id,
                                 name=func_call["name"],
                                 arguments=func_call.get("args", {}),
+                                thought_signature=thought_sig,
                             )
                         )
 
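For reference, the candidate parts the updated parser consumes look roughly like the dict below; the values are illustrative placeholders, and only the keys read by the code above (`text`, `thought`, `functionCall`, `thoughtSignature`) are shown:

```python
# Illustrative shape of Gemini candidate parts as consumed by the parsing hunks above.
candidate_parts = [
    {"thought": "Consider calling the weather tool.", "thoughtSignature": "sig-abc123"},
    {
        "functionCall": {"name": "get_weather", "args": {"city": "Paris"}},
        "thoughtSignature": "sig-abc123",
    },
    {"text": "Checking the weather in Paris now."},
]
# The loop above turns these into Thinking(..., thought_signature=...),
# ToolCall(..., thought_signature=...), and Text(...) parts respectively.
```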
src/lm_deluge/client.py

@@ -262,6 +262,7 @@ class _LLMClient(BaseModel):
         self.max_tokens_per_minute = max_tokens_per_minute
         if max_concurrent_requests:
             self.max_concurrent_requests = max_concurrent_requests
+        return self
 
     def _get_tracker(self) -> StatusTracker:
         if self._tracker is None:
src/lm_deluge/config.py

@@ -12,6 +12,13 @@ class SamplingParams(BaseModel):
     logprobs: bool = False
     top_logprobs: int | None = None
     strict_tools: bool = True
+    # Gemini 3 only - controls multimodal vision processing fidelity
+    media_resolution: (
+        Literal[
+            "media_resolution_low", "media_resolution_medium", "media_resolution_high"
+        ]
+        | None
+    ) = None
 
     def to_vllm(self):
         try:
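The new field pairs with the gemini.py handling above: when set on a Gemini 3 request it becomes `generationConfig.mediaResolution = {"level": ...}`, and on other models it only triggers `WARN_MEDIA_RESOLUTION_UNSUPPORTED`. A minimal construction sketch, assuming the remaining `SamplingParams` fields have defaults and leaving out how the params object is handed to `LLMClient`, since that wiring is not part of this diff:

```python
# Sketch: setting the new Gemini-3-only field on SamplingParams.
# Import path follows src/lm_deluge/config.py; other fields are assumed to default.
from lm_deluge.config import SamplingParams

params = SamplingParams(media_resolution="media_resolution_high")
print(params.media_resolution)  # "media_resolution_high"
```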