PyPI - lemonade-python-sdk - Versions diffs - 1.0.3__tar.gz → 1.0.5__tar.gz - Mend

lemonade-python-sdk 1.0.3 (tar.gz) → 1.0.5 (tar.gz)

This diff shows the changes between two publicly released versions of this package, as they appear in the public registry. It is provided for informational purposes only.

Files changed (17)

{lemonade_python_sdk-1.0.3 → lemonade_python_sdk-1.0.5}/PKG-INFO

@@ -1,10 +1,10 @@
 Metadata-Version: 2.4
 Name: lemonade-python-sdk
-Version: 1.0.3
+Version: 1.0.5
 Summary: A clean interface for interacting with the Lemonade LLM server
 Home-page: https://github.com/Tetramatrix/lemonade-python-sdk
-Author: Your Name
-Author-email: your.email@example.com
+Author: Tetramatrix
+Author-email: contact@tetramatrix.dev
 Project-URL: Bug Reports, https://github.com/Tetramatrix/lemonade-python-sdk/issues
 Project-URL: Source, https://github.com/Tetramatrix/lemonade-python-sdk
 Keywords: llm,ai,lemonade,sdk,api
@@ -52,7 +52,7 @@ This SDK provides a clean, pythonic interface for interacting with local LLMs ru
 * **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
 * **Low-Overhead Architecture:** Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
-* **Health Checks & Recovery:** Built-in utilities to verify server status and handle connection drops.
+* **Health Checks & Server Stats:** Lightweight `/api/v1/health` endpoint plus `get_stats()` for token usage, requests served, and performance metrics.
 * **Type-Safe Client:** Full Python type hinting for better developer experience (IDE autocompletion).
 * **Model Management:** Simple API to load, unload, and list models dynamically.
 * **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
@@ -92,6 +92,25 @@ else:
     print("No Lemonade instance found.")
 ```
+### 1.1 Health Check & Stats
+```python
+# Check if server is alive (uses /api/v1/health endpoint)
+if client.health_check():
+    print("Lemonade is running!")
+# Get server statistics (performance metrics from last request)
+stats = client.get_stats()
+if stats:
+    print(f"Time to first token: {stats.get('time_to_first_token', 0):.2f}s")
+    print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
+    print(f"Input tokens: {stats.get('input_tokens', 0)}")
+    print(f"Output tokens: {stats.get('output_tokens', 0)}")
+    print(f"Prompt tokens: {stats.get('prompt_tokens', 0)}")
+```
+**Available stats fields:** `time_to_first_token`, `tokens_per_second`, `input_tokens`, `output_tokens`, `decode_token_times`, `prompt_tokens`.
 ### 2. Chat Completion
 ```python

{lemonade_python_sdk-1.0.3 → lemonade_python_sdk-1.0.5}/README.md

@@ -11,7 +11,7 @@ This SDK provides a clean, pythonic interface for interacting with local LLMs ru
 * **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
 * **Low-Overhead Architecture:** Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
-* **Health Checks & Recovery:** Built-in utilities to verify server status and handle connection drops.
+* **Health Checks & Server Stats:** Lightweight `/api/v1/health` endpoint plus `get_stats()` for token usage, requests served, and performance metrics.
 * **Type-Safe Client:** Full Python type hinting for better developer experience (IDE autocompletion).
 * **Model Management:** Simple API to load, unload, and list models dynamically.
 * **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
@@ -51,6 +51,25 @@ else:
     print("No Lemonade instance found.")
 ```
+### 1.1 Health Check & Stats
+```python
+# Check if server is alive (uses /api/v1/health endpoint)
+if client.health_check():
+    print("Lemonade is running!")
+# Get server statistics (performance metrics from last request)
+stats = client.get_stats()
+if stats:
+    print(f"Time to first token: {stats.get('time_to_first_token', 0):.2f}s")
+    print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
+    print(f"Input tokens: {stats.get('input_tokens', 0)}")
+    print(f"Output tokens: {stats.get('output_tokens', 0)}")
+    print(f"Prompt tokens: {stats.get('prompt_tokens', 0)}")
+```
+**Available stats fields:** `time_to_first_token`, `tokens_per_second`, `input_tokens`, `output_tokens`, `decode_token_times`, `prompt_tokens`.
 ### 2. Chat Completion
 ```python

{lemonade_python_sdk-1.0.3 → lemonade_python_sdk-1.0.5}/lemonade_python_sdk.egg-info/PKG-INFO

@@ -1,10 +1,10 @@
 Metadata-Version: 2.4
 Name: lemonade-python-sdk
-Version: 1.0.3
+Version: 1.0.5
 Summary: A clean interface for interacting with the Lemonade LLM server
 Home-page: https://github.com/Tetramatrix/lemonade-python-sdk
-Author: Your Name
-Author-email: your.email@example.com
+Author: Tetramatrix
+Author-email: contact@tetramatrix.dev
 Project-URL: Bug Reports, https://github.com/Tetramatrix/lemonade-python-sdk/issues
 Project-URL: Source, https://github.com/Tetramatrix/lemonade-python-sdk
 Keywords: llm,ai,lemonade,sdk,api
@@ -52,7 +52,7 @@ This SDK provides a clean, pythonic interface for interacting with local LLMs ru
 * **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
 * **Low-Overhead Architecture:** Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
-* **Health Checks & Recovery:** Built-in utilities to verify server status and handle connection drops.
+* **Health Checks & Server Stats:** Lightweight `/api/v1/health` endpoint plus `get_stats()` for token usage, requests served, and performance metrics.
 * **Type-Safe Client:** Full Python type hinting for better developer experience (IDE autocompletion).
 * **Model Management:** Simple API to load, unload, and list models dynamically.
 * **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
@@ -92,6 +92,25 @@ else:
     print("No Lemonade instance found.")
 ```
+### 1.1 Health Check & Stats
+```python
+# Check if server is alive (uses /api/v1/health endpoint)
+if client.health_check():
+    print("Lemonade is running!")
+# Get server statistics (performance metrics from last request)
+stats = client.get_stats()
+if stats:
+    print(f"Time to first token: {stats.get('time_to_first_token', 0):.2f}s")
+    print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
+    print(f"Input tokens: {stats.get('input_tokens', 0)}")
+    print(f"Output tokens: {stats.get('output_tokens', 0)}")
+    print(f"Prompt tokens: {stats.get('prompt_tokens', 0)}")
+```
+**Available stats fields:** `time_to_first_token`, `tokens_per_second`, `input_tokens`, `output_tokens`, `decode_token_times`, `prompt_tokens`.
 ### 2. Chat Completion
 ```python

{lemonade_python_sdk-1.0.3 → lemonade_python_sdk-1.0.5}/lemonade_sdk/client.py

@@ -79,11 +79,31 @@ class LemonadeClient:
             bool: True if the server is reachable, otherwise False
         """
         try:
-            models = self.list_models()
-            return len(models) >= 0  # If we don't get an error, the server is reachable
-        except:
+            url = f"{self.base_url}/api/v1/health"
+            response = self.session.get(url, timeout=10)
+            return response.status_code == 200
+        except Exception:
             return False
+    def get_stats(self) -> Dict[str, Any]:
+        """
+        Retrieves server statistics including token usage, requests served, and performance metrics.
+        Returns:
+            Dict[str, Any]: Server stats with token counts, tokens_per_second, etc.
+        """
+        url = f"{self.base_url}/api/v1/stats"
+        try:
+            response = self.session.get(url, timeout=10)
+            response.raise_for_status()
+            return response.json()
+        except requests.exceptions.RequestException as e:
+            print(f"Error retrieving stats: {e}")
+            return {}
+        except json.JSONDecodeError as e:
+            print(f"Error parsing stats response: {e}")
+            return {}
     def get_current_model(self) -> Optional[str]:
         """
         Retrieves the currently active model from the Lemonade server.
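
Note on the client.py hunk above: `health_check()` now probes `/api/v1/health` directly instead of listing models, and the new `get_stats()` returns an empty dict on failure. The behaviour can be exercised without a running server with a minimal sketch like the one below. It is not the SDK's code: `StubResponse`, `StubSession`, the free functions, and the `BASE` address are hypothetical stand-ins introduced for illustration, and the status check replaces the `raise_for_status()` call used in the real method.

```python
import json
from typing import Any, Dict


class StubResponse:
    """Hypothetical stand-in for a requests.Response (illustration only)."""

    def __init__(self, status_code: int, body: str = "") -> None:
        self.status_code = status_code
        self._body = body

    def json(self) -> Any:
        # Raises json.JSONDecodeError when the body is not valid JSON.
        return json.loads(self._body)


class StubSession:
    """Hypothetical stand-in for requests.Session serving canned replies."""

    def __init__(self, routes: Dict[str, StubResponse]) -> None:
        self.routes = routes

    def get(self, url: str, timeout: float = 10) -> StubResponse:
        if url not in self.routes:
            raise ConnectionError(f"no server at {url}")
        return self.routes[url]


def health_check(session: Any, base_url: str) -> bool:
    """True only when GET {base_url}/api/v1/health answers with HTTP 200."""
    try:
        response = session.get(f"{base_url}/api/v1/health", timeout=10)
        return response.status_code == 200
    except Exception:
        # Connection errors and timeouts both mean "not reachable".
        return False


def get_stats(session: Any, base_url: str) -> Dict[str, Any]:
    """Return the parsed /api/v1/stats payload, or {} on any failure."""
    try:
        response = session.get(f"{base_url}/api/v1/stats", timeout=10)
        if response.status_code != 200:
            # Simplification: the diff calls raise_for_status() here.
            return {}
        return response.json()
    except (ConnectionError, json.JSONDecodeError):
        return {}


BASE = "http://localhost:8000"  # assumed address, adjust to your server
session = StubSession({
    f"{BASE}/api/v1/health": StubResponse(200),
    f"{BASE}/api/v1/stats": StubResponse(200, '{"tokens_per_second": 42.5}'),
})
print(health_check(session, BASE))
print(get_stats(session, BASE))
```

Because both functions fall back to a safe value (`False` or `{}`) instead of raising, callers can guard with a plain `if stats:` check, as the README example in this diff does. The trade-off is that a failed request and a genuinely empty stats payload look the same to the caller.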

{lemonade_python_sdk-1.0.3 → lemonade_python_sdk-1.0.5}/setup.py RENAMED Viewed

@@ -13,9 +13,9 @@ with open("LICENSE", "r", encoding="utf-8") as fh:
 setup(
     name="lemonade-python-sdk",
-    version="1.0.3",
-    author="Your Name",
-    author_email="your.email@example.com",
+    version="1.0.5",
+    author="Tetramatrix",
+    author_email="contact@tetramatrix.dev",
     description="A clean interface for interacting with the Lemonade LLM server",
     long_description=long_description,
     long_description_content_type="text/markdown",