PyPI - lemonade-sdk - Versions diffs - 8.1.2__tar.gz → 8.1.4__tar.gz - Mend

lemonade-sdk 8.1.2__tar.gz → 8.1.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of lemonade-sdk might be problematic.

Files changed (85)

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/NOTICE.md RENAMED

@@ -1,7 +1,33 @@
 PORTIONS LICENSED AS FOLLOWS
+## llama.cpp
+Binaries for llama.cpp are downloaded under the MIT license from https://github.com/ggml-org/llama.cpp, as well as https://github.com/lemonade-sdk/llamacpp-rocm (which uses https://github.com/ggml-org/llama.cpp to build them.)
 Lemonade SDK used the [ONNX TurnkeyML](https://github.com/onnx/turnkeyml) project as a starting point under the [Apache 2.0 license](./LICENSE).
+> MIT License
+>
+> Copyright (c) 2023-2024 The ggml authors
+>
+> Permission is hereby granted, free of charge, to any person obtaining a copy
+> of this software and associated documentation files (the "Software"), to deal
+> in the Software without restriction, including without limitation the rights
+> to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+> copies of the Software, and to permit persons to whom the Software is
+> furnished to do so, subject to the following conditions:
+>
+> The above copyright notice and this permission notice shall be included in all
+> copies or substantial portions of the Software.
+>
+> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+> IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+> FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+> AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+> LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+> OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+> SOFTWARE.
 ## TurnkeyML Attribution
 TurnkeyML used code from other open source projects as a starting point (see [NOTICE.md](NOTICE.md)). Thank you Philip Colangelo, Derek Elkins, Jeremy Fowers, Dan Gard, Victoria Godsoe, Mark Heaps, Daniel Holanda, Brian Kurtz, Mariah Larwood, Philip Lassen, Andrew Ling, Adrian Macias, Gary Malik, Sarah Massengill, Ashwin Murthy, Hatice Ozen, Tim Sears, Sean Settle, Krishna Sivakumar, Aviv Weinstein, Xueli Xao, Bill Xing, and Lev Zlotnik for your contributions to that work.
@@ -18,4 +44,4 @@ TurnkeyML used code from other open source projects as a starting point (see [NO
 >
 >The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
 >
->THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

{lemonade_sdk-8.1.2/src/lemonade_sdk.egg-info → lemonade_sdk-8.1.4}/PKG-INFO RENAMED

@@ -1,18 +1,18 @@
 Metadata-Version: 2.4
 Name: lemonade-sdk
-Version: 8.1.2
+Version: 8.1.4
 Summary: Lemonade SDK: Your LLM Aide for Validation and Deployment
 Author-email: lemonade@amd.com
-Requires-Python: >=3.10, <3.13
+Requires-Python: >=3.10, <3.14
 Description-Content-Type: text/markdown
 License-File: LICENSE
 License-File: NOTICE.md
 Requires-Dist: invoke>=2.0.0
-Requires-Dist: onnx<1.18.0,>=1.11.0
+Requires-Dist: onnx==1.18.0
 Requires-Dist: pyyaml>=5.4
 Requires-Dist: typeguard>=2.3.13
 Requires-Dist: packaging>=20.9
-Requires-Dist: numpy<2.0.0
+Requires-Dist: numpy
 Requires-Dist: fasteners
 Requires-Dist: GitPython>=3.1.40
 Requires-Dist: psutil>=6.1.1
@@ -30,7 +30,7 @@ Requires-Dist: sentencepiece
 Requires-Dist: huggingface-hub[hf_xet]==0.33.0
 Requires-Dist: python-dotenv
 Provides-Extra: oga-ryzenai
-Requires-Dist: onnxruntime-genai-directml-ryzenai==0.7.0.2; extra == "oga-ryzenai"
+Requires-Dist: onnxruntime-genai-directml-ryzenai==0.7.0.2.1; extra == "oga-ryzenai"
 Requires-Dist: protobuf>=6.30.1; extra == "oga-ryzenai"
 Provides-Extra: oga-cpu
 Requires-Dist: onnxruntime-genai==0.8.2; extra == "oga-cpu"
@@ -41,9 +41,10 @@ Requires-Dist: accelerate; extra == "dev"
 Requires-Dist: datasets; extra == "dev"
 Requires-Dist: pandas>=1.5.3; extra == "dev"
 Requires-Dist: matplotlib; extra == "dev"
-Requires-Dist: model-generate==1.5.0; (platform_system == "Windows" and python_version == "3.10") and extra == "dev"
 Requires-Dist: human-eval-windows==1.0.4; extra == "dev"
 Requires-Dist: lm-eval[api]; extra == "dev"
+Provides-Extra: model-generate
+Requires-Dist: model-generate==1.5.0; (platform_system == "Windows" and python_version == "3.10") and extra == "model-generate"
 Provides-Extra: oga-hybrid
 Requires-Dist: lemonade-sdk[oga-ryzenai]; extra == "oga-hybrid"
 Provides-Extra: oga-unified
@@ -105,7 +106,7 @@ Dynamic: summary
     <img src="https://img.shields.io/badge/Ubuntu-24.04%20%7C%2025.04-E95420?logo=ubuntu&logoColor=white" alt="Ubuntu 24.04 | 25.04" />
   </a>
   <a href="docs/README.md#installation" title="Check out our instructions">
-    <img src="https://img.shields.io/badge/Python-3.10%20%7C%203.12-blue?logo=python&logoColor=white" alt="Made with Python" />
+    <img src="https://img.shields.io/badge/Python-3.10--3.13-blue?logo=python&logoColor=white" alt="Made with Python" />
   </a>
   <a href="https://github.com/lemonade-sdk/lemonade/blob/main/docs/contribute.md" title="Contribution Guide">
     <img src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg" alt="PRs Welcome" />
@@ -199,48 +200,11 @@ You can also import custom GGUF and ONNX models from Hugging Face by using our [
 Lemonade supports the following configurations, while also making it easy to switch between them at runtime. Find more information about it [here](./docs/README.md#software-and-hardware-overview).
-<table>
-  <thead>
-    <tr>
-      <th rowspan="2">Hardware</th>
-      <th colspan="3" align="center">🛠️ Engine Support</th>
-      <th colspan="2" align="center">🖥️ OS (x86/x64)</th>
-    </tr>
-    <tr>
-      <th align="center">OGA</th>
-      <th align="center">llamacpp</th>
-      <th align="center">HF</th>
-      <th align="center">Windows</th>
-      <th align="center">Linux</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-      <td><strong>🧠 CPU</strong></td>
-      <td align="center">All platforms</td>
-      <td align="center">All platforms</td>
-      <td align="center">All platforms</td>
-      <td align="center">✅</td>
-      <td align="center">✅</td>
-    </tr>
-    <tr>
-      <td><strong>🎮 GPU</strong></td>
-      <td align="center">—</td>
-      <td align="center">Vulkan: All platforms<br>ROCm: Selected AMD platforms*</td>
-      <td align="center">—</td>
-      <td align="center">✅</td>
-      <td align="center">✅</td>
-    </tr>
-    <tr>
-      <td><strong>🤖 NPU</strong></td>
-      <td align="center">AMD Ryzen™ AI 300 series</td>
-      <td align="center">—</td>
-      <td align="center">—</td>
-      <td align="center">✅</td>
-      <td align="center">—</td>
-    </tr>
-  </tbody>
-</table>
+| Hardware | Engine: OGA | Engine: llamacpp | Engine: HF | Windows | Linux |
+|----------|-------------|------------------|------------|---------|-------|
+| **🧠 CPU** | All platforms | All platforms | All platforms | ✅ | ✅ |
+| **🎮 GPU** | — | Vulkan: All platforms<br>ROCm: Selected AMD platforms* | — | ✅ | ✅ |
+| **🤖 NPU** | AMD Ryzen™ AI 300 series | — | — | ✅ | — |
 <details>
 <summary><small><i>* See supported AMD ROCm platforms</i></small></summary>
@@ -336,9 +300,19 @@ New contributors can find beginner-friendly issues tagged with "Good First Issue
 This project is sponsored by AMD. It is maintained by @danielholanda @jeremyfowers @ramkrishna @vgodsoe in equal measure. You can reach us by filing an [issue](https://github.com/lemonade-sdk/lemonade/issues), emailing [lemonade@amd.com](mailto:lemonade@amd.com), or joining our [Discord](https://discord.gg/5xXzkMu8Zk).
-## License
-This project is licensed under the [Apache 2.0 License](https://github.com/lemonade-sdk/lemonade/blob/main/LICENSE). Portions of the project are licensed as described in [NOTICE.md](./NOTICE.md).
+## License and Attribution
+This project is:
+- [Built with Python](https://www.amd.com/en/developer/resources/technical-articles/2025/rethinking-local-ai-lemonade-servers-python-advantage.html) with ❤️ for the open source community,
+- Standing on the shoulders of great tools from:
+  - [ggml/llama.cpp](https://github.com/ggml-org/llama.cpp)
+  - [OnnxRuntime GenAI](https://github.com/microsoft/onnxruntime-genai)
+  - [Hugging Face Hub](https://github.com/huggingface/huggingface_hub)
+  - [OpenAI API](https://github.com/openai/openai-python)
+  - and more...
+- Accelerated by mentorship from the OCV Catalyst program.
+- Licensed under the [Apache 2.0 License](https://github.com/lemonade-sdk/lemonade/blob/main/LICENSE).
+  - Portions of the project are licensed as described in [NOTICE.md](./NOTICE.md).
 <!--This file was originally licensed under Apache 2.0. It has been modified.
 Modifications Copyright (c) 2025 AMD-->

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/README.md RENAMED

@@ -14,7 +14,7 @@
     <img src="https://img.shields.io/badge/Ubuntu-24.04%20%7C%2025.04-E95420?logo=ubuntu&logoColor=white" alt="Ubuntu 24.04 | 25.04" />
   </a>
   <a href="docs/README.md#installation" title="Check out our instructions">
-    <img src="https://img.shields.io/badge/Python-3.10%20%7C%203.12-blue?logo=python&logoColor=white" alt="Made with Python" />
+    <img src="https://img.shields.io/badge/Python-3.10--3.13-blue?logo=python&logoColor=white" alt="Made with Python" />
   </a>
   <a href="https://github.com/lemonade-sdk/lemonade/blob/main/docs/contribute.md" title="Contribution Guide">
     <img src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg" alt="PRs Welcome" />
@@ -108,48 +108,11 @@ You can also import custom GGUF and ONNX models from Hugging Face by using our [
 Lemonade supports the following configurations, while also making it easy to switch between them at runtime. Find more information about it [here](./docs/README.md#software-and-hardware-overview).
-<table>
-  <thead>
-    <tr>
-      <th rowspan="2">Hardware</th>
-      <th colspan="3" align="center">🛠️ Engine Support</th>
-      <th colspan="2" align="center">🖥️ OS (x86/x64)</th>
-    </tr>
-    <tr>
-      <th align="center">OGA</th>
-      <th align="center">llamacpp</th>
-      <th align="center">HF</th>
-      <th align="center">Windows</th>
-      <th align="center">Linux</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-      <td><strong>🧠 CPU</strong></td>
-      <td align="center">All platforms</td>
-      <td align="center">All platforms</td>
-      <td align="center">All platforms</td>
-      <td align="center">✅</td>
-      <td align="center">✅</td>
-    </tr>
-    <tr>
-      <td><strong>🎮 GPU</strong></td>
-      <td align="center">—</td>
-      <td align="center">Vulkan: All platforms<br>ROCm: Selected AMD platforms*</td>
-      <td align="center">—</td>
-      <td align="center">✅</td>
-      <td align="center">✅</td>
-    </tr>
-    <tr>
-      <td><strong>🤖 NPU</strong></td>
-      <td align="center">AMD Ryzen™ AI 300 series</td>
-      <td align="center">—</td>
-      <td align="center">—</td>
-      <td align="center">✅</td>
-      <td align="center">—</td>
-    </tr>
-  </tbody>
-</table>
+| Hardware | Engine: OGA | Engine: llamacpp | Engine: HF | Windows | Linux |
+|----------|-------------|------------------|------------|---------|-------|
+| **🧠 CPU** | All platforms | All platforms | All platforms | ✅ | ✅ |
+| **🎮 GPU** | — | Vulkan: All platforms<br>ROCm: Selected AMD platforms* | — | ✅ | ✅ |
+| **🤖 NPU** | AMD Ryzen™ AI 300 series | — | — | ✅ | — |
 <details>
 <summary><small><i>* See supported AMD ROCm platforms</i></small></summary>
@@ -245,9 +208,19 @@ New contributors can find beginner-friendly issues tagged with "Good First Issue
 This project is sponsored by AMD. It is maintained by @danielholanda @jeremyfowers @ramkrishna @vgodsoe in equal measure. You can reach us by filing an [issue](https://github.com/lemonade-sdk/lemonade/issues), emailing [lemonade@amd.com](mailto:lemonade@amd.com), or joining our [Discord](https://discord.gg/5xXzkMu8Zk).
-## License
-This project is licensed under the [Apache 2.0 License](https://github.com/lemonade-sdk/lemonade/blob/main/LICENSE). Portions of the project are licensed as described in [NOTICE.md](./NOTICE.md).
+## License and Attribution
+This project is:
+- [Built with Python](https://www.amd.com/en/developer/resources/technical-articles/2025/rethinking-local-ai-lemonade-servers-python-advantage.html) with ❤️ for the open source community,
+- Standing on the shoulders of great tools from:
+  - [ggml/llama.cpp](https://github.com/ggml-org/llama.cpp)
+  - [OnnxRuntime GenAI](https://github.com/microsoft/onnxruntime-genai)
+  - [Hugging Face Hub](https://github.com/huggingface/huggingface_hub)
+  - [OpenAI API](https://github.com/openai/openai-python)
+  - and more...
+- Accelerated by mentorship from the OCV Catalyst program.
+- Licensed under the [Apache 2.0 License](https://github.com/lemonade-sdk/lemonade/blob/main/LICENSE).
+  - Portions of the project are licensed as described in [NOTICE.md](./NOTICE.md).
 <!--This file was originally licensed under Apache 2.0. It has been modified.
 Modifications Copyright (c) 2025 AMD-->

lemonade_sdk-8.1.4/pyproject.toml ADDED

@@ -0,0 +1,8 @@
+[build-system]
+requires = [
+  "setuptools>=68",
+  "wheel"
+]
+build-backend = "setuptools.build_meta"

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/setup.py RENAMED

@@ -28,13 +28,11 @@ setup(
         # Minimal dependencies required for end-users who are running
         # apps deployed on Lemonade SDK
         "invoke>=2.0.0",
-        "onnx>=1.11.0,<1.18.0",
+        "onnx==1.18.0",
         "pyyaml>=5.4",
         "typeguard>=2.3.13",
         "packaging>=20.9",
-        # Necessary until upstream packages account for the breaking
-        # change to numpy
-        "numpy<2.0.0",
+        "numpy",
         "fasteners",
         "GitPython>=3.1.40",
         "psutil>=6.1.1",
@@ -57,7 +55,7 @@ setup(
         # applications, without including developer-focused tools
         # Primary NPU extra using unified PyPI package
         "oga-ryzenai": [
-            "onnxruntime-genai-directml-ryzenai==0.7.0.2",
+            "onnxruntime-genai-directml-ryzenai==0.7.0.2.1",
             "protobuf>=6.30.1",
         ],
         "oga-cpu": [
@@ -74,12 +72,14 @@ setup(
             "datasets",
             "pandas>=1.5.3",
             "matplotlib",
-            "model-generate==1.5.0; platform_system=='Windows' and python_version=='3.10'",
             # Install human-eval from a forked repo with Windows support until the
             # PR (https://github.com/openai/human-eval/pull/53) is merged
             "human-eval-windows==1.0.4",
             "lm-eval[api]",
         ],
+        "model-generate": [
+            "model-generate==1.5.0; platform_system=='Windows' and python_version=='3.10'",
+        ],
         # Keep backwards compatibility for old extras names
         "oga-hybrid": ["lemonade-sdk[oga-ryzenai]"],
         "oga-unified": ["lemonade-sdk[oga-ryzenai]"],
@@ -128,13 +128,13 @@ setup(
             "lsdev=lemonade_server.cli:developer_entrypoint",
         ]
     },
-    python_requires=">=3.10, <3.13",
+    python_requires=">=3.10, <3.14",
     long_description=open("README.md", "r", encoding="utf-8").read(),
     long_description_content_type="text/markdown",
     include_package_data=True,
     package_data={
         "lemonade_server": ["server_models.json"],
-        "lemonade": ["tools/server/static/*"],
+        "lemonade": ["tools/server/static/**/*"],
     },
 )
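
The `package_data` change above swaps `tools/server/static/*` for `tools/server/static/**/*`, presumably so that files in nested static subdirectories are packaged too. The sketch below illustrates the difference between the two patterns using `pathlib` globbing, not setuptools itself, so it only approximates how the build matches files. The directory layout and the `match_static` helper are hypothetical.

```python
import tempfile
from pathlib import Path


def match_static(root: Path, pattern: str) -> set[str]:
    """Return the files (not directories) under root matched by a glob pattern."""
    return {
        p.relative_to(root).as_posix()
        for p in root.glob(pattern)
        if p.is_file()
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout: one top-level static file and one nested file.
    (root / "static" / "js").mkdir(parents=True)
    (root / "static" / "index.html").write_text("<html></html>")
    (root / "static" / "js" / "app.js").write_text("// app")

    # Single-level pattern: only files directly inside static/.
    flat = match_static(root, "static/*")
    # Recursive pattern: files at any depth below static/.
    recursive = match_static(root, "static/**/*")

print(sorted(flat))
print(sorted(recursive))
```

With this layout, the single-level pattern picks up only `static/index.html`, while the recursive pattern also picks up `static/js/app.js`.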

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/src/lemonade/api.py RENAMED

@@ -36,6 +36,7 @@ def _make_state(recipe, checkpoint) -> Dict:
 def from_pretrained(
     checkpoint: str,
     recipe: str = "hf-cpu",
+    do_not_upgrade: bool = True,
 ) -> Tuple[ModelAdapter, TokenizerAdapter]:
     """
     Load an LLM and the corresponding tokenizer using a lemonade recipe.
@@ -43,6 +44,9 @@ def from_pretrained(
     Args:
         - checkpoint: huggingface checkpoint that defines the LLM
         - recipe: defines the implementation and hardware used for the LLM
+        - do_not_upgrade: prioritize the local copy of the model, if available,
+            even if an upgraded copy is available on the server (note: only applies
+            for oga-* recipes)
     Recipe choices:
         - hf-cpu: Huggingface Transformers implementation for CPU with max-perf settings
@@ -118,6 +122,7 @@ def from_pretrained(
             input=checkpoint,
             device=user_backend,
             dtype=backend_to_dtype[user_backend],
+            do_not_upgrade=do_not_upgrade,
         )
         return state.model, state.tokenizer

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/src/lemonade/common/network.py RENAMED

@@ -2,6 +2,7 @@ import os
 from typing import Optional
 import socket
 from huggingface_hub import model_info, snapshot_download
+from huggingface_hub.errors import LocalEntryNotFoundError
 def is_offline():
@@ -50,10 +51,11 @@ def get_base_model(checkpoint: str) -> Optional[str]:
     return None
-def custom_snapshot_download(repo_id, **kwargs):
+def _symlink_safe_snapshot_download(repo_id, **kwargs):
     """
     Custom snapshot download with retry logic for Windows symlink privilege errors.
     """
     for attempt in range(2):
         try:
             return snapshot_download(repo_id=repo_id, **kwargs)
@@ -65,3 +67,27 @@ def custom_snapshot_download(repo_id, **kwargs):
             ):
                 continue
             raise
+def custom_snapshot_download(repo_id, do_not_upgrade=False, **kwargs):
+    """
+    Custom snapshot download with:
+        1) retry logic for Windows symlink privilege errors.
+        2) do_not_upgrade allows the caller to prioritize a local copy
+            of the model over an upgraded remote copy.
+    """
+    if do_not_upgrade:
+        try:
+            # Prioritize the local model, if available
+            return _symlink_safe_snapshot_download(
+                repo_id, local_files_only=True, **kwargs
+            )
+        except LocalEntryNotFoundError:
+            # LocalEntryNotFoundError means there was no local model, at this point
+            # we'll accept a remote model
+            return _symlink_safe_snapshot_download(
+                repo_id, local_files_only=False, **kwargs
+            )
+    else:
+        return _symlink_safe_snapshot_download(repo_id, **kwargs)
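
The new `custom_snapshot_download` wrapper above follows a local-first pattern: when `do_not_upgrade` is set, try the local cache only, and fall back to a remote download if nothing is cached. The sketch below isolates that control flow without depending on `huggingface_hub`. `LocalCopyMissing`, `local_first`, and `make_fake_fetch` are hypothetical names introduced for illustration; `LocalCopyMissing` stands in for `LocalEntryNotFoundError`, and the fake fetcher stands in for `snapshot_download`.

```python
from typing import Callable, List, Tuple


class LocalCopyMissing(Exception):
    """Hypothetical stand-in for huggingface_hub's LocalEntryNotFoundError."""


def local_first(fetch: Callable[[bool], str], do_not_upgrade: bool = False) -> str:
    """Call fetch(local_files_only), preferring a local copy when requested."""
    if not do_not_upgrade:
        # Normal path: allow the fetcher to reach the remote.
        return fetch(False)
    try:
        # Prefer whatever is already cached locally.
        return fetch(True)
    except LocalCopyMissing:
        # Nothing cached, so accept a remote copy instead.
        return fetch(False)


def make_fake_fetch(cached: bool) -> Tuple[Callable[[bool], str], List[bool]]:
    """Build a fake fetcher that records each local_files_only value it sees."""
    calls: List[bool] = []

    def fetch(local_files_only: bool) -> str:
        calls.append(local_files_only)
        if local_files_only and not cached:
            raise LocalCopyMissing("no local snapshot")
        return "local" if local_files_only else "remote"

    return fetch, calls


# No cached copy: the local attempt fails, then the remote attempt succeeds.
fetch_uncached, calls_uncached = make_fake_fetch(cached=False)
result_uncached = local_first(fetch_uncached, do_not_upgrade=True)

# Cached copy: the local attempt succeeds and the remote is never contacted.
fetch_cached, calls_cached = make_fake_fetch(cached=True)
result_cached = local_first(fetch_cached, do_not_upgrade=True)
```

One consequence of this design is that a cached model is never refreshed while `do_not_upgrade` is set, which is the trade-off the docstring in the diff describes.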

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/src/lemonade/tools/llamacpp/utils.py RENAMED

@@ -585,7 +585,7 @@ def identify_gguf_models(
     return core_files, sharded_files
-def download_gguf(config_checkpoint, config_mmproj=None) -> dict:
+def download_gguf(config_checkpoint, config_mmproj=None, do_not_upgrade=False) -> dict:
     """
     Downloads the GGUF file for the given model configuration.
@@ -605,6 +605,7 @@ def download_gguf(config_checkpoint, config_mmproj=None) -> dict:
     snapshot_folder = custom_snapshot_download(
         checkpoint,
         allow_patterns=list(core_files.values()) + sharded_files,
+        do_not_upgrade=do_not_upgrade,
     )
     # Ensure we downloaded all expected files

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/src/lemonade/tools/oga/load.py RENAMED

@@ -654,6 +654,7 @@ class OgaLoad(FirstTool):
         download_only: bool = False,
         trust_remote_code=False,
         subfolder: str = None,
+        do_not_upgrade: bool = False,
     ) -> State:
         from lemonade.common.network import (
             custom_snapshot_download,
@@ -744,7 +745,7 @@ class OgaLoad(FirstTool):
                 input_model_path = custom_snapshot_download(
                     checkpoint,
                     ignore_patterns=["*.md", "*.txt"],
-                    local_files_only=offline,
+                    local_files_only=offline or do_not_upgrade,
                 )
                 # Check if model is ONNX or safetensors
                 is_onnx_model = any(

{lemonade_sdk-8.1.2 → lemonade_sdk-8.1.4}/src/lemonade/tools/oga/utils.py RENAMED

@@ -100,9 +100,10 @@ class OrtGenaiModel(ModelAdapter):
         max_new_tokens=512,
         min_new_tokens=0,
         do_sample=True,
-        top_k=50,
-        top_p=1.0,
-        temperature=0.7,
+        top_k=None,
+        top_p=None,
+        temperature=None,
+        repeat_penalty=None,
         streamer: OrtGenaiStreamer = None,
         pad_token_id=None,
         stopping_criteria=None,
@@ -154,38 +155,58 @@ class OrtGenaiModel(ModelAdapter):
         if random_seed is None:
             random_seed = -1  # In og.Generator, -1 = seed with random device
+        # Get search config if available, otherwise use empty dict
+        # Thanks to the empty dict, if the model doesn't have a built-in search
+        #   config, the .get() calls will all just use the default values
+        search_config = {}
         if self.config and "search" in self.config:
             search_config = self.config["search"]
-            params.set_search_options(
-                do_sample=search_config.get("do_sample", do_sample),
-                top_k=search_config.get("top_k", top_k),
-                top_p=search_config.get("top_p", top_p),
-                temperature=search_config.get("temperature", temperature),
-                max_length=max_length_to_use,
-                min_length=min_length,
-                early_stopping=search_config.get("early_stopping", False),
-                length_penalty=search_config.get("length_penalty", 1.0),
-                num_beams=search_config.get("num_beams", 1),
-                num_return_sequences=search_config.get("num_return_sequences", 1),
-                repetition_penalty=search_config.get("repetition_penalty", 1.0),
-                past_present_share_buffer=search_config.get(
-                    "past_present_share_buffer", True
-                ),
-                random_seed=random_seed,
-                # Not currently supported by OGA
-                # diversity_penalty=search_config.get('diversity_penalty', 0.0),
-                # no_repeat_ngram_size=search_config.get('no_repeat_ngram_size', 0),
-            )
-        else:
-            params.set_search_options(
-                do_sample=do_sample,
-                top_k=top_k,
-                top_p=top_p,
-                temperature=temperature,
-                max_length=max_length_to_use,
-                min_length=min_length,
-                random_seed=random_seed,
-            )
+        # Apply parameter hierarchy: user provided > search config > defaults
+        default_top_k = 50
+        default_top_p = 1.0
+        default_temperature = 0.7
+        default_repetition_penalty = 1.0
+        top_k_to_use = (
+            top_k if top_k is not None else search_config.get("top_k", default_top_k)
+        )
+        top_p_to_use = (
+            top_p if top_p is not None else search_config.get("top_p", default_top_p)
+        )
+        temperature_to_use = (
+            temperature
+            if temperature is not None
+            else search_config.get("temperature", default_temperature)
+        )
+        # Map the llamacpp name, `repeat_penalty`, to the OGA name, `repetition_penalty`
+        repetition_penalty_to_use = (
+            repeat_penalty
+            if repeat_penalty is not None
+            else search_config.get("repetition_penalty", default_repetition_penalty)
+        )
+        # Set search options once with all parameters
+        params.set_search_options(
+            do_sample=search_config.get("do_sample", do_sample),
+            top_k=top_k_to_use,
+            top_p=top_p_to_use,
+            temperature=temperature_to_use,
+            repetition_penalty=repetition_penalty_to_use,
+            max_length=max_length_to_use,
+            min_length=min_length,
+            early_stopping=search_config.get("early_stopping", False),
+            length_penalty=search_config.get("length_penalty", 1.0),
+            num_beams=search_config.get("num_beams", 1),
+            num_return_sequences=search_config.get("num_return_sequences", 1),
+            past_present_share_buffer=search_config.get(
+                "past_present_share_buffer", True
+            ),
+            random_seed=random_seed,
+            # Not currently supported by OGA
+            # diversity_penalty=search_config.get('diversity_penalty', 0.0),
+            # no_repeat_ngram_size=search_config.get('no_repeat_ngram_size', 0),
+        )
         params.try_graph_capture_with_max_batch_size(1)
         generator = og.Generator(self.model, params)
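
The hunk above replaces two separate `set_search_options` calls with a single call that resolves each sampling parameter as "user provided > search config > defaults". A minimal sketch of that resolution rule follows. The `DEFAULTS` table and the `resolve` helper are hypothetical names for illustration; the default values are copied from the diff.

```python
from typing import Any, Dict, Optional

# Default values as they appear in the diff above.
DEFAULTS: Dict[str, Any] = {
    "top_k": 50,
    "top_p": 1.0,
    "temperature": 0.7,
    "repetition_penalty": 1.0,
}


def resolve(name: str, user_value: Optional[Any], search_config: Dict[str, Any]) -> Any:
    """Pick the user's value, else the model's search config, else the default."""
    if user_value is not None:
        # Compare against None so that explicit falsy values such as 0 or 0.0
        # still count as user-provided.
        return user_value
    return search_config.get(name, DEFAULTS[name])


# A hypothetical search config shipped with a model.
model_search_config = {"top_k": 40, "temperature": 0.2}

from_user = resolve("top_k", 10, model_search_config)
from_config = resolve("temperature", None, model_search_config)
from_default = resolve("top_p", None, model_search_config)
explicit_zero = resolve("temperature", 0.0, model_search_config)
```

Changing the signature defaults from concrete values to `None` is what makes this work: with the old defaults (`top_k=50`, and so on) the function could not tell a caller's explicit `50` apart from an omitted argument.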
