PyPI - lemonade-python-sdk - Versions diffs - 1.0.4__tar.gz → 1.0.5__tar.gz - Mend

lemonade-python-sdk 1.0.4 (tar.gz) → 1.0.5 (tar.gz)

This diff shows the changes between two publicly released versions of the package, as they appear in their public registry. It is provided for informational purposes only.

Files changed (17)

{lemonade_python_sdk-1.0.4 → lemonade_python_sdk-1.0.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lemonade-python-sdk
-Version: 1.0.4
+Version: 1.0.5
 Summary: A clean interface for interacting with the Lemonade LLM server
 Home-page: https://github.com/Tetramatrix/lemonade-python-sdk
 Author: Tetramatrix
@@ -52,8 +52,7 @@ This SDK provides a clean, pythonic interface for interacting with local LLMs ru
 * **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
 * **Low-Overhead Architecture:** Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
-* **Health Checks & Stats:** Lightweight `/api/v1/health` endpoint for connectivity checks plus `get_stats()` for server performance metrics.
-* **Server Statistics:** Retrieve token usage, requests served, and performance metrics via `get_stats()`.
+* **Health Checks & Server Stats:** Lightweight `/api/v1/health` endpoint plus `get_stats()` for token usage, requests served, and performance metrics.
 * **Type-Safe Client:** Full Python type hinting for better developer experience (IDE autocompletion).
 * **Model Management:** Simple API to load, unload, and list models dynamically.
 * **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
@@ -100,14 +99,18 @@ else:
 if client.health_check():
     print("Lemonade is running!")
-# Get server statistics
+# Get server statistics (performance metrics from last request)
 stats = client.get_stats()
 if stats:
-    print(f"Tokens generated: {stats.get('total_tokens', 0)}")
+    print(f"Time to first token: {stats.get('time_to_first_token', 0):.2f}s")
     print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
-    print(f"Requests served: {stats.get('requests_served', 0)}")
+    print(f"Input tokens: {stats.get('input_tokens', 0)}")
+    print(f"Output tokens: {stats.get('output_tokens', 0)}")
+    print(f"Prompt tokens: {stats.get('prompt_tokens', 0)}")
 ```
+**Available stats fields:** `time_to_first_token`, `tokens_per_second`, `input_tokens`, `output_tokens`, `decode_token_times`, `prompt_tokens`.
 ### 2. Chat Completion
 ```python

{lemonade_python_sdk-1.0.4 → lemonade_python_sdk-1.0.5}/README.md RENAMED

@@ -11,8 +11,7 @@ This SDK provides a clean, pythonic interface for interacting with local LLMs ru
 * **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
 * **Low-Overhead Architecture:** Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
-* **Health Checks & Stats:** Lightweight `/api/v1/health` endpoint for connectivity checks plus `get_stats()` for server performance metrics.
-* **Server Statistics:** Retrieve token usage, requests served, and performance metrics via `get_stats()`.
+* **Health Checks & Server Stats:** Lightweight `/api/v1/health` endpoint plus `get_stats()` for token usage, requests served, and performance metrics.
 * **Type-Safe Client:** Full Python type hinting for better developer experience (IDE autocompletion).
 * **Model Management:** Simple API to load, unload, and list models dynamically.
 * **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
@@ -59,14 +58,18 @@ else:
 if client.health_check():
     print("Lemonade is running!")
-# Get server statistics
+# Get server statistics (performance metrics from last request)
 stats = client.get_stats()
 if stats:
-    print(f"Tokens generated: {stats.get('total_tokens', 0)}")
+    print(f"Time to first token: {stats.get('time_to_first_token', 0):.2f}s")
     print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
-    print(f"Requests served: {stats.get('requests_served', 0)}")
+    print(f"Input tokens: {stats.get('input_tokens', 0)}")
+    print(f"Output tokens: {stats.get('output_tokens', 0)}")
+    print(f"Prompt tokens: {stats.get('prompt_tokens', 0)}")
 ```
+**Available stats fields:** `time_to_first_token`, `tokens_per_second`, `input_tokens`, `output_tokens`, `decode_token_times`, `prompt_tokens`.
 ### 2. Chat Completion
 ```python

{lemonade_python_sdk-1.0.4 → lemonade_python_sdk-1.0.5}/lemonade_python_sdk.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lemonade-python-sdk
-Version: 1.0.4
+Version: 1.0.5
 Summary: A clean interface for interacting with the Lemonade LLM server
 Home-page: https://github.com/Tetramatrix/lemonade-python-sdk
 Author: Tetramatrix
@@ -52,8 +52,7 @@ This SDK provides a clean, pythonic interface for interacting with local LLMs ru
 * **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
 * **Low-Overhead Architecture:** Designed as a thin, efficient wrapper to leverage Lemonade's C++ performance with minimal Python latency.
-* **Health Checks & Stats:** Lightweight `/api/v1/health` endpoint for connectivity checks plus `get_stats()` for server performance metrics.
-* **Server Statistics:** Retrieve token usage, requests served, and performance metrics via `get_stats()`.
+* **Health Checks & Server Stats:** Lightweight `/api/v1/health` endpoint plus `get_stats()` for token usage, requests served, and performance metrics.
 * **Type-Safe Client:** Full Python type hinting for better developer experience (IDE autocompletion).
 * **Model Management:** Simple API to load, unload, and list models dynamically.
 * **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM & llamacpp backends).
@@ -100,14 +99,18 @@ else:
 if client.health_check():
     print("Lemonade is running!")
-# Get server statistics
+# Get server statistics (performance metrics from last request)
 stats = client.get_stats()
 if stats:
-    print(f"Tokens generated: {stats.get('total_tokens', 0)}")
+    print(f"Time to first token: {stats.get('time_to_first_token', 0):.2f}s")
     print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
-    print(f"Requests served: {stats.get('requests_served', 0)}")
+    print(f"Input tokens: {stats.get('input_tokens', 0)}")
+    print(f"Output tokens: {stats.get('output_tokens', 0)}")
+    print(f"Prompt tokens: {stats.get('prompt_tokens', 0)}")
 ```
+**Available stats fields:** `time_to_first_token`, `tokens_per_second`, `input_tokens`, `output_tokens`, `decode_token_times`, `prompt_tokens`.
 ### 2. Chat Completion
 ```python

{lemonade_python_sdk-1.0.4 → lemonade_python_sdk-1.0.5}/setup.py RENAMED

@@ -13,7 +13,7 @@ with open("LICENSE", "r", encoding="utf-8") as fh:
 setup(
     name="lemonade-python-sdk",
-    version="1.0.4",
+    version="1.0.5",
     author="Tetramatrix",
     author_email="contact@tetramatrix.dev",
     description="A clean interface for interacting with the Lemonade LLM server",
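
Note on the README hunks: the 1.0.5 example stops reading `total_tokens` and `requests_served` and reads per-request fields instead. The following is a minimal sketch of formatting a stats result defensively across both versions. It assumes `get_stats()` returns a plain dict, or a falsy value on failure, as the README example suggests. `summarize_stats` is a hypothetical helper for illustration and is not part of the SDK.

```python
from typing import Any, Mapping, Optional

# Field names listed under "Available stats fields" in the 1.0.5 README.
STATS_FIELDS_1_0_5 = (
    "time_to_first_token",
    "tokens_per_second",
    "input_tokens",
    "output_tokens",
    "decode_token_times",
    "prompt_tokens",
)

# Keys the 1.0.4 README example read that the 1.0.5 example no longer reads.
DROPPED_FROM_EXAMPLE = ("total_tokens", "requests_served")


def summarize_stats(stats: Optional[Mapping[str, Any]]) -> str:
    """Hypothetical helper: format whichever documented fields are present."""
    if not stats:
        return "no stats available"
    # Keep the documented order so output is stable across calls.
    parts = [f"{name}={stats[name]}" for name in STATS_FIELDS_1_0_5 if name in stats]
    # Flag older keys rather than silently ignoring them.
    legacy = [name for name in DROPPED_FROM_EXAMPLE if name in stats]
    if legacy:
        parts.append("legacy keys: " + ", ".join(legacy))
    return "; ".join(parts) if parts else "no documented fields present"
```

With a client created as in the README, this could be called as `print(summarize_stats(client.get_stats()))`. Checking for key presence instead of defaulting to `0` avoids reporting a missing field as a real zero.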