lm-deluge 0.0.25.tar.gz → 0.0.27.tar.gz

Potentially problematic release.

This version of lm-deluge might be problematic.

Files changed (62)

{lm_deluge-0.0.25/src/lm_deluge.egg-info → lm_deluge-0.0.27}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lm_deluge
-Version: 0.0.25
+Version: 0.0.27
 Summary: Python utility for using LLM API models.
 Author-email: Benjamin Anderson <ben@trytaylor.ai>
 Requires-Python: >=3.10
@@ -274,7 +274,7 @@ We support all models in `src/lm_deluge/models.py`. Vertex support is not planne
 ## Feature Support
-We support structured outputs via `json_mode` parameter provided to `SamplingParams`. Structured outputs with a schema are planned. Reasoning models are supported via the `reasoning_effort` parameter, which is translated to a thinking budget for Claude/Gemini. Image models are supported. We support tool use as documented above. We support logprobs for OpenAI models that return them.
+We support structured outputs via `json_mode` parameter provided to `SamplingParams`. Structured outputs with a schema are planned. Reasoning models are supported via the `reasoning_effort` parameter, which is translated to a thinking budget for Claude/Gemini. Passing `None` (or the string `"none"`) disables Gemini thoughts entirely. Image models are supported. We support tool use as documented above. We support logprobs for OpenAI models that return them.
 ## Built‑in tools

{lm_deluge-0.0.25 → lm_deluge-0.0.27}/README.md RENAMED

@@ -247,7 +247,7 @@ We support all models in `src/lm_deluge/models.py`. Vertex support is not planne
 ## Feature Support
-We support structured outputs via `json_mode` parameter provided to `SamplingParams`. Structured outputs with a schema are planned. Reasoning models are supported via the `reasoning_effort` parameter, which is translated to a thinking budget for Claude/Gemini. Image models are supported. We support tool use as documented above. We support logprobs for OpenAI models that return them.
+We support structured outputs via `json_mode` parameter provided to `SamplingParams`. Structured outputs with a schema are planned. Reasoning models are supported via the `reasoning_effort` parameter, which is translated to a thinking budget for Claude/Gemini. Passing `None` (or the string `"none"`) disables Gemini thoughts entirely. Image models are supported. We support tool use as documented above. We support logprobs for OpenAI models that return them.
 ## Built‑in tools

{lm_deluge-0.0.25 → lm_deluge-0.0.27}/pyproject.toml RENAMED

@@ -3,7 +3,7 @@ requires = ["setuptools", "wheel"]
 [project]
 name = "lm_deluge"
-version = "0.0.25"
+version = "0.0.27"
 authors = [{ name = "Benjamin Anderson", email = "ben@trytaylor.ai" }]
 description = "Python utility for using LLM API models."
 readme = "README.md"

{lm_deluge-0.0.25 → lm_deluge-0.0.27}/src/lm_deluge/api_requests/gemini.py RENAMED

@@ -37,14 +37,17 @@ async def _build_gemini_request(
     # Handle reasoning models (thinking)
     if model.reasoning_model:
-        request_json["generationConfig"]["thinkingConfig"] = {"includeThoughts": True}
-        if sampling_params.reasoning_effort and "flash" in model.id:
-            budget = {"low": 1024, "medium": 4096, "high": 16384}.get(
-                sampling_params.reasoning_effort
-            )
-            request_json["generationConfig"]["thinkingConfig"]["thinkingBudget"] = (
-                budget
-            )
+        thinking_config = None
+        effort = sampling_params.reasoning_effort
+        if effort is None or effort == "none":
+            # Explicitly disable thoughts when no effort is requested
+            thinking_config = {"includeThoughts": False, "thinkingBudget": 0}
+        else:
+            thinking_config = {"includeThoughts": True}
+            if effort in {"low", "medium", "high"} and "flash" in model.id:
+                budget = {"low": 1024, "medium": 4096, "high": 16384}[effort]
+                thinking_config["thinkingBudget"] = budget
+        request_json["generationConfig"]["thinkingConfig"] = thinking_config
     else:
         if sampling_params.reasoning_effort:
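
The reasoning-model branch in the hunk above can be read in isolation as a small pure function. The sketch below is only an illustration of that logic: `build_thinking_config` and `THINKING_BUDGETS` are hypothetical names that do not appear in the package, and the model ids used in the usage note are placeholders. One difference from 0.0.25 that the hunk suggests: an effort of `None` previously still produced `{"includeThoughts": True}`, whereas it now disables thoughts and sets a zero budget.

```python
from typing import Optional

# Budgets copied from the hunk above; the diff applies them only to "flash" model ids.
THINKING_BUDGETS = {"low": 1024, "medium": 4096, "high": 16384}


def build_thinking_config(effort: Optional[str], model_id: str) -> dict:
    """Hypothetical helper mirroring the 0.0.27 branch for reasoning models."""
    if effort is None or effort == "none":
        # No effort requested: explicitly disable thoughts.
        return {"includeThoughts": False, "thinkingBudget": 0}
    config = {"includeThoughts": True}
    # A budget is attached only for known effort levels on "flash" models.
    if effort in THINKING_BUDGETS and "flash" in model_id:
        config["thinkingBudget"] = THINKING_BUDGETS[effort]
    return config
```

Under these assumptions, `build_thinking_config("medium", "gemini-2.5-flash")` carries a budget of 4096, while the same effort on a non-flash id keeps thoughts enabled without a budget.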

{lm_deluge-0.0.25 → lm_deluge-0.0.27}/src/lm_deluge/file.py RENAMED

@@ -1,3 +1,4 @@
+from functools import cached_property
 import os
 import io
 import requests
@@ -68,13 +69,13 @@ class File:
             return encoded
         return f"data:{self._mime()};base64,{encoded}"
-    @property
+    @cached_property
     def fingerprint(self) -> str:
         # Hash the file contents for fingerprinting
         file_bytes = self._bytes()
         return xxhash.xxh64(file_bytes).hexdigest()
-    @property
+    @cached_property
     def size(self) -> int:
         """Return file size in bytes."""
         return len(self._bytes())
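
Switching from `@property` to `@cached_property` means the file bytes are read and hashed once per instance rather than on every access. A minimal sketch of that effect follows, assuming a hypothetical `Blob` class in place of `File` and using `hashlib.sha256` from the standard library instead of `xxhash` so the example has no third-party dependency. The `reads` counter exists only to make the caching visible.

```python
import hashlib
from functools import cached_property


class Blob:
    """Hypothetical stand-in for File; hashlib replaces xxhash here."""

    def __init__(self, data: bytes):
        self._data = data
        self.reads = 0  # counts calls to _bytes(), for illustration only

    def _bytes(self) -> bytes:
        self.reads += 1
        return self._data

    @cached_property
    def fingerprint(self) -> str:
        # Computed on first access, then stored in the instance __dict__.
        return hashlib.sha256(self._bytes()).hexdigest()

    @cached_property
    def size(self) -> int:
        return len(self._bytes())


blob = Blob(b"hello")
first = blob.fingerprint
second = blob.fingerprint  # served from the cache, no second read
```

The trade-off is that a cached value goes stale if the underlying contents change after the first access, so this pattern fits objects that are treated as immutable.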

{lm_deluge-0.0.25 → lm_deluge-0.0.27}/src/lm_deluge/request_context.py RENAMED

@@ -1,4 +1,5 @@
 from dataclasses import dataclass, field
+from functools import cached_property
 from typing import Any, Callable
 from .config import SamplingParams
@@ -39,14 +40,18 @@ class RequestContext:
     # Computed properties
     cache_key: str = field(init=False)
-    num_tokens: int = field(init=False)
+    # num_tokens: int = field(init=False)
-    def __post_init__(self):
-        # Compute cache key from prompt fingerprint
-        self.cache_key = self.prompt.fingerprint
+    # def __post_init__(self):
+    #     # Compute cache key from prompt fingerprint
+    #     # self.cache_key = self.prompt.fingerprint
-        # Compute token count
-        self.num_tokens = self.prompt.count_tokens(self.sampling_params.max_new_tokens)
+    #     # Compute token count
+    #     self.num_tokens =
+    @cached_property
+    def num_tokens(self):
+        return self.prompt.count_tokens(self.sampling_params.max_new_tokens)
     def maybe_callback(self, response, tracker):
         if not self.callback:
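
The hunk above moves the token count from eager computation in `__post_init__` to a lazy `cached_property`. The sketch below shows how that combination behaves on a dataclass; `Ctx` is a hypothetical stand-in for `RequestContext`, and the whitespace-based count is an assumption standing in for `prompt.count_tokens`. Because the method carries no class-level annotation, the dataclass machinery does not treat it as a field.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass
class Ctx:
    """Hypothetical stand-in for RequestContext."""

    prompt: str
    max_new_tokens: int

    @cached_property
    def num_tokens(self) -> int:
        # Crude whitespace count, an assumption replacing prompt.count_tokens.
        return len(self.prompt.split()) + self.max_new_tokens


ctx = Ctx(prompt="a b c", max_new_tokens=10)
before = "num_tokens" in vars(ctx)  # nothing computed at construction time
total = ctx.num_tokens              # first access runs the computation
after = "num_tokens" in vars(ctx)   # the result is now stored on the instance
```

Note that `cached_property` needs an instance `__dict__`, so this sketch would not work unchanged on a dataclass declared with `slots=True`.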

{lm_deluge-0.0.25 → lm_deluge-0.0.27/src/lm_deluge.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lm_deluge
-Version: 0.0.25
+Version: 0.0.27
 Summary: Python utility for using LLM API models.
 Author-email: Benjamin Anderson <ben@trytaylor.ai>
 Requires-Python: >=3.10
@@ -274,7 +274,7 @@ We support all models in `src/lm_deluge/models.py`. Vertex support is not planne
 ## Feature Support
-We support structured outputs via `json_mode` parameter provided to `SamplingParams`. Structured outputs with a schema are planned. Reasoning models are supported via the `reasoning_effort` parameter, which is translated to a thinking budget for Claude/Gemini. Image models are supported. We support tool use as documented above. We support logprobs for OpenAI models that return them.
+We support structured outputs via `json_mode` parameter provided to `SamplingParams`. Structured outputs with a schema are planned. Reasoning models are supported via the `reasoning_effort` parameter, which is translated to a thinking budget for Claude/Gemini. Passing `None` (or the string `"none"`) disables Gemini thoughts entirely. Image models are supported. We support tool use as documented above. We support logprobs for OpenAI models that return them.
 ## Built‑in tools