lemonade-python-sdk 1.0.2__tar.gz → 1.0.4__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- lemonade_python_sdk-1.0.4/PKG-INFO +384 -0
- lemonade_python_sdk-1.0.4/README.md +343 -0
- lemonade_python_sdk-1.0.4/lemonade_python_sdk.egg-info/PKG-INFO +384 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_python_sdk.egg-info/SOURCES.txt +1 -0
- lemonade_python_sdk-1.0.4/lemonade_sdk/__init__.py +17 -0
- lemonade_python_sdk-1.0.4/lemonade_sdk/audio_stream.py +403 -0
- lemonade_python_sdk-1.0.4/lemonade_sdk/client.py +474 -0
- lemonade_python_sdk-1.0.4/lemonade_sdk/request_builder.py +325 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/setup.py +3 -3
- lemonade_python_sdk-1.0.2/PKG-INFO +0 -180
- lemonade_python_sdk-1.0.2/README.md +0 -139
- lemonade_python_sdk-1.0.2/lemonade_python_sdk.egg-info/PKG-INFO +0 -180
- lemonade_python_sdk-1.0.2/lemonade_sdk/__init__.py +0 -6
- lemonade_python_sdk-1.0.2/lemonade_sdk/client.py +0 -177
- lemonade_python_sdk-1.0.2/lemonade_sdk/request_builder.py +0 -137
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/LICENSE +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_python_sdk.egg-info/dependency_links.txt +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_python_sdk.egg-info/requires.txt +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_python_sdk.egg-info/top_level.txt +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_sdk/model_discovery.py +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_sdk/port_scanner.py +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/lemonade_sdk/utils.py +0 -0
- {lemonade_python_sdk-1.0.2 → lemonade_python_sdk-1.0.4}/setup.cfg +0 -0
@@ -0,0 +1,384 @@
Metadata-Version: 2.4
Name: lemonade-python-sdk
Version: 1.0.4
Summary: A clean interface for interacting with the Lemonade LLM server
Home-page: https://github.com/Tetramatrix/lemonade-python-sdk
Author: Tetramatrix
Author-email: contact@tetramatrix.dev
Project-URL: Bug Reports, https://github.com/Tetramatrix/lemonade-python-sdk/issues
Project-URL: Source, https://github.com/Tetramatrix/lemonade-python-sdk
Keywords: llm,ai,lemonade,sdk,api
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.25.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license-file
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary
# 🍋 Lemonade Python SDK

[License: MIT](https://opensource.org/licenses/MIT)
[Python 3.8+](https://www.python.org/downloads/)

A robust, production-grade Python wrapper for the **Lemonade C++ Backend**.

This SDK provides a clean, Pythonic interface for interacting with local LLMs running on Lemonade. It was built to power **Sorana** (a visual workspace for AI), extracting the core integration logic into a standalone, open-source library for the developer community.

## 🚀 Key Features

* **Auto-Discovery:** Automatically scans multiple ports and hosts to find active Lemonade instances.
* **Low-Overhead Architecture:** Designed as a thin, efficient wrapper that leverages Lemonade's C++ performance with minimal Python latency.
* **Health Checks & Stats:** Lightweight `/api/v1/health` endpoint for connectivity checks, plus `get_stats()` for token usage, requests served, and performance metrics.
* **Type-Safe Client:** Full Python type hinting for a better developer experience (IDE autocompletion).
* **Model Management:** Simple API to load, unload, and list models dynamically.
* **Embeddings API:** Generate text embeddings for semantic search, RAG, and clustering (FLM and llamacpp backends).
* **Audio API:** Whisper speech-to-text and Kokoro text-to-speech.
* **Reranking API:** Reorder documents by relevance for better RAG results.
* **Image Generation:** Create images from text prompts using Stable Diffusion.
* **WebSocket Streaming:** Real-time audio transcription with VAD.
## 📦 Installation

```bash
pip install .
```

Alternatively, you can install it directly from GitHub:

```bash
pip install git+https://github.com/Tetramatrix/lemonade-python-sdk.git
```

## ⚡ Quick Start

### 1. Connecting to Lemonade

The SDK automatically handles port discovery, so you don't need to hardcode `localhost:8000`.

```python
from lemonade_sdk import LemonadeClient, find_available_lemonade_port

# Auto-discover running instance
port = find_available_lemonade_port()
if port:
    client = LemonadeClient(base_url=f"http://localhost:{port}")
    if client.health_check():
        print(f"Connected to Lemonade on port {port}")
else:
    print("No Lemonade instance found.")
```

### 1.1 Health Check & Stats

```python
# Check if the server is alive (uses the /api/v1/health endpoint)
if client.health_check():
    print("Lemonade is running!")

# Get server statistics
stats = client.get_stats()
if stats:
    print(f"Tokens generated: {stats.get('total_tokens', 0)}")
    print(f"Tokens/sec: {stats.get('tokens_per_second', 0):.1f}")
    print(f"Requests served: {stats.get('requests_served', 0)}")
```
### 2. Chat Completion

```python
response = client.chat_completion(
    model="Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Hello World in C++"}
    ],
    temperature=0.7
)

print(response['choices'][0]['message']['content'])
```

### 3. Model Management

```python
# List all available models
models = client.list_models()
for m in models:
    print(f"Found model: {m['id']}")

# Load a specific model into memory
client.load_model("Mistral-7B-v0.1")
```
### 4. Embeddings (NEW)

Generate text embeddings for semantic search, RAG pipelines, and clustering.

```python
# List available embedding models (filtered by the 'embeddings' label)
embedding_models = client.list_embedding_models()
for model in embedding_models:
    print(f"Embedding model: {model['id']}")

# Generate an embedding for a single text
response = client.embeddings(
    input="Hello, world!",
    model="nomic-embed-text-v1-GGUF"
)

embedding_vector = response["data"][0]["embedding"]
print(f"Vector length: {len(embedding_vector)}")

# Generate embeddings for multiple texts
texts = ["Text 1", "Text 2", "Text 3"]
response = client.embeddings(
    input=texts,
    model="nomic-embed-text-v1-GGUF"
)

for item in response["data"]:
    print(f"Text {item['index']}: {len(item['embedding'])} dimensions")
```

**Supported Backends (Lemonade):**
- ✅ **FLM (FastFlowLM)** - NPU-accelerated on Windows
- ✅ **llamacpp** (.GGUF models) - CPU/GPU
- ❌ ONNX/OGA - Not supported
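For semantic search, the vectors returned above are usually compared with cosine similarity. A minimal, dependency-free sketch (the toy vectors below stand in for real output of `client.embeddings()`):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
query_vec = [0.1, 0.9, 0.2]
doc_vecs = {
    "doc_a": [0.1, 0.8, 0.3],
    "doc_b": [0.9, 0.1, 0.0],
}

# Rank documents by similarity to the query
ranked = sorted(doc_vecs,
                key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                reverse=True)
print(ranked[0])  # doc_a is closest to the query
```

In a real pipeline you would embed your documents once, store the vectors, and embed only the query at search time.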
### 5. Audio Transcription (Whisper) - NEW

Transcribe audio files to text using Whisper.

```python
# List available audio models (Whisper + Kokoro)
audio_models = client.list_audio_models()
for model in audio_models:
    print(f"Audio model: {model['id']}")

# Transcribe an audio file
result = client.transcribe_audio(
    file_path="meeting.wav",
    model="Whisper-Tiny",
    language="en",           # Optional: None for auto-detection
    response_format="json"   # Options: "json", "text", "verbose_json"
)

if "error" not in result:
    print(f"Transcription: {result['text']}")
    # The verbose format also includes: duration, language, segments
```

**Supported Models:**
- `Whisper-Tiny` (~39M parameters)
- `Whisper-Base` (~74M parameters)
- `Whisper-Small` (~244M parameters)

**Supported Formats:** WAV, MP3, FLAC, OGG, WebM

**Backend:** whisper.cpp (NPU-accelerated on Windows)
### 6. Text-to-Speech (Kokoro) - NEW

Generate speech from text using Kokoro TTS.

```python
# Generate speech and save it to a file
client.text_to_speech(
    input_text="Hello, Lemonade can now speak!",
    model="kokoro-v1",
    voice="shimmer",          # Options: shimmer, corey, af_bella, am_adam, etc.
    speed=1.0,                # 0.5 - 2.0
    response_format="mp3",    # Options: mp3, wav, opus, pcm, aac, flac
    output_file="speech.mp3"  # Saves directly to file
)

# Or get the audio bytes directly
audio_bytes = client.text_to_speech(
    input_text="Short test!",
    model="kokoro-v1",
    voice="corey",
    response_format="mp3"
)

with open("speech.mp3", "wb") as f:
    f.write(audio_bytes)
```

**Supported Models:**
- `kokoro-v1` (~82M parameters)

**Available Voices:**

| Voice ID | Language | Gender |
|----------|----------|--------|
| `shimmer` | EN | Female |
| `corey` | EN | Male |
| `af_bella`, `af_nicole` | EN-US | Female |
| `am_adam`, `am_michael` | EN-US | Male |
| `bf_emma`, `bf_isabella` | EN-GB | Female |
| `bm_george`, `bm_lewis` | EN-GB | Male |

**Audio Formats:** MP3, WAV, OPUS, PCM, AAC, FLAC

**Backend:** Kokoros (.onnx, CPU)
### 7. Reranking (NEW)

Rerank documents based on relevance to a query.

```python
result = client.rerank(
    query="What is the capital of France?",
    documents=[
        "Berlin is the capital of Germany.",
        "Paris is the capital of France.",
        "London is the capital of the UK."
    ],
    model="bge-reranker-v2-m3-GGUF"
)

# Results are sorted by relevance score; 'index' is the document's
# position in the original list
for r in result["results"]:
    print(f"Document {r['index']}: Score={r['relevance_score']:.2f}")
```

**Supported Models:**
- `bge-reranker-v2-m3-GGUF`
- Other BGE reranker models

**Backend:** llamacpp (.GGUF only, not available for FLM or OGA)
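To use the reranker's output in a RAG pipeline, map each result's `index` back to the original document. A small sketch assuming the response shape shown above (the scores here are simulated, not real model output):

```python
def top_documents(documents, rerank_results, k=2):
    """Map reranker output back to document texts, best first.

    Assumes each result carries the original document 'index'
    and a 'relevance_score', as in the example above.
    """
    ordered = sorted(rerank_results,
                     key=lambda r: r["relevance_score"],
                     reverse=True)
    return [documents[r["index"]] for r in ordered[:k]]

docs = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "London is the capital of the UK.",
]
# Simulated reranker output for the query about France
results = [
    {"index": 0, "relevance_score": 0.12},
    {"index": 1, "relevance_score": 0.97},
    {"index": 2, "relevance_score": 0.15},
]
print(top_documents(docs, results, k=1))  # ['Paris is the capital of France.']
```

The top-k texts can then be concatenated into the context of a `chat_completion()` call.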
### 8. Image Generation (NEW)

Generate images from text prompts using Stable Diffusion.

```python
# Generate an image and save it to a file
client.generate_image(
    prompt="A sunset over mountains with lake reflection",
    model="SD-Turbo",
    size="512x512",
    steps=4,        # SD-Turbo needs only 4 steps
    cfg_scale=1.0,
    output_file="sunset.png"
)

# Or get the image bytes
image_bytes = client.generate_image(
    prompt="A cute cat",
    model="SD-Turbo"
)
```

**Supported Models:**
- `SD-Turbo` (fast, 4 steps)
- `SDXL-Turbo` (fast, 4 steps)
- `SD-1.5` (standard, 20 steps)
- `SDXL-Base-1.0` (high quality, 20 steps)

**Image Sizes:** 512x512, 1024x1024, or custom

**Backend:** stable-diffusion.cpp
### 9. WebSocket Streaming (NEW)

Real-time audio transcription with Voice Activity Detection (VAD).

```python
from lemonade_sdk import WhisperWebSocketClient

# Create a streaming client
stream = client.create_whisper_stream(model="Whisper-Tiny")
stream.connect()

# Set a callback for transcriptions
def on_transcript(text):
    print(f"Heard: '{text}'")

stream.on_transcription(on_transcript)

# Stream an audio file (PCM16, 16kHz, mono)
for text in stream.stream("audio.pcm"):
    pass  # Callback handles output

# Or stream from the microphone (requires pyaudio)
# for text in stream.stream_microphone():
#     print(f"Heard: {text}")

stream.disconnect()
```

**Audio Format:** 16kHz, mono, PCM16 (16-bit)

**Features:**
- Voice Activity Detection (VAD)
- Real-time streaming
- Microphone support (with pyaudio)
- Configurable sensitivity

**Backend:** whisper.cpp (NPU-accelerated on Windows)
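The stream expects raw 16-bit little-endian PCM at 16 kHz mono. If your audio arrives as float samples in [-1.0, 1.0], a stdlib-only conversion sketch (not part of the SDK) looks like:

```python
import struct

def floats_to_pcm16(samples):
    """Pack float samples in [-1.0, 1.0] into little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))   # clip to the valid range
        ints.append(int(s * 32767))  # scale to int16
    return struct.pack("<%dh" % len(ints), *ints)

chunk = floats_to_pcm16([0.0, 0.5, -0.5, 1.0])
print(len(chunk))  # 8 bytes: 4 samples x 2 bytes each
```

Audio at other sample rates or channel counts must be resampled and downmixed before packing.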
## 📚 Documentation

* **[Embeddings API](docs/embeddings_api.md)** - Complete guide for using embeddings
* **[Audio API](docs/audio_api.md)** - Whisper transcription and Kokoro TTS
* **[Implementation Plan](docs/AUDIO_IMPLEMENTATION.md)** - Audio API implementation roadmap
* [Lemonade Server Docs](https://lemonade-server.ai/docs/server/server_spec/) - Official Lemonade documentation

### 🖼️ Production Showcase

This SDK powers three real-world production applications:

[Sorana](https://tetramatrix.github.io/Sorana/) — AI Visual Workspace
* SDK drives semantic AI grouping of files and folders onto a spatial 2D canvas
* SDK handles auto-discovery and connection to local Lemonade instances (zero config)

[Aicono](https://tetramatrix.github.io/Aicono/) — AI Desktop Icon Organizer *(Featured in [CHIP Magazine](https://www.chip.de/downloads/Aicono_186527264.html) 🇩🇪)*
* SDK drives AI inference for grouping and categorizing desktop icons
* Reached millions of readers via [CHIP](https://www.chip.de/downloads/Aicono_186527264.html), one of Germany's largest IT publications

[TabNeuron](https://tetramatrix.github.io/TabNeuron/) — AI-Powered Tab Organizer
* SDK enables local AI inference for grouping and categorizing browser tabs
* Desktop companion app plus browser extension, demonstrating the SDK's viability in lightweight client architectures
## 🛠️ Project Structure

* **client.py:** Main entry point for API interactions (chat, embeddings, audio, reranking, images, model management).
* **port_scanner.py:** Utilities for detecting Lemonade instances across ports (8000-9000).
* **model_discovery.py:** Logic for fetching and parsing model metadata.
* **request_builder.py:** Helper functions to construct compliant payloads (chat, embeddings, audio, reranking, images).
* **audio_stream.py:** WebSocket client for real-time audio transcription with VAD.
* **utils.py:** Additional utility functions.
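As a rough illustration of what port discovery involves (a hypothetical sketch using only the standard library, not the SDK's actual `port_scanner` implementation):

```python
import socket

def find_open_port(host="localhost", ports=range(8000, 9001), timeout=0.2):
    """Return the first port accepting TCP connections, or None.

    Simplified sketch: tries each candidate port in order and
    reports the first one with a listening server.
    """
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            continue
    return None
```

A production scanner would additionally probe `/api/v1/health` to confirm the port really belongs to a Lemonade instance rather than some unrelated server.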
## 🤝 Contributing

Contributions are welcome! This project is intended to help the AMD Ryzen AI and Lemonade community build downstream applications faster.

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.