smallestai 2.0.0__tar.gz → 2.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of smallestai has been flagged as potentially problematic.

@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: smallestai
- Version: 2.0.0
+ Version: 2.2.0
  Summary: Official Python client for the Smallest AI API
  Author-email: Smallest <support@smallest.ai>
  License: MIT
@@ -15,7 +15,6 @@ License-File: LICENSE
  Requires-Dist: aiohttp
  Requires-Dist: aiofiles
  Requires-Dist: requests
- Requires-Dist: sacremoses
  Requires-Dist: pydub
  Provides-Extra: test
  Requires-Dist: jiwer; extra == "test"
@@ -59,8 +58,11 @@ Currently, the library supports direct synthesis and the ability to synthesize s
  - [Asynchronous](#asynchronous)
  - [LLM to Speech](#llm-to-speech)
  - [Add your Voice](#add-your-voice)
- - [Synchronously](#synchronously)
- - [Asynchronously](#asynchronously)
+ - [Synchronously](#add-synchronously)
+ - [Asynchronously](#add-asynchronously)
+ - [Delete your Voice](#delete-your-voice)
+ - [Synchronously](#delete-synchronously)
+ - [Asynchronously](#delete-asynchronously)
  - [Available Methods](#available-methods)
  - [Technical Note: WAV Headers in Streaming Audio](#technical-note-wav-headers-in-streaming-audio)

@@ -80,14 +82,6 @@ When using an SDK in your application, make sure to pin to at least the major ve
  3. Create a new API Key and copy it.
  4. Export the API Key in your environment with the name `SMALLEST_API_KEY`, ensuring that your application can access it securely for authentication.

- ## Best Practices for Input Text
- While the `transliterate` parameter is provided, please note that it is not fully supported and may not perform consistently across all cases. It is recommended to use the model without relying on this parameter.
-
- For optimal voice generation results:
-
- 1. For English, provide the input in Latin script (e.g., "Hello, how are you?").
- 2. For Hindi, provide the input in Devanagari script (e.g., "नमस्ते, आप कैसे हैं?").
- 3. For code-mixed input, use Latin script for English and Devanagari script for Hindi (e.g., "Hello, आप कैसे हैं?").

  ## Examples

@@ -115,9 +109,10 @@ if __name__ == "__main__":
  - `sample_rate`: Audio sample rate (default: 24000)
  - `voice_id`: Voice ID (default: "emily")
  - `speed`: Speech speed multiplier (default: 1.0)
- - `add_wav_header`: Include WAV header in output (default: True)
- - `transliterate`: Enable text transliteration (default: False)
- - `remove_extra_silence`: Remove additional silence (default: True)
+ - `consistency`: Controls word repetition and skipping. Decrease it to prevent skipped words, and increase it to prevent repetition. Only supported in the `lightning-large` model. (default: 0.5)
+ - `similarity`: Controls the similarity between the synthesized audio and the reference audio. Increase it to make the speech more similar to the reference audio. Only supported in the `lightning-large` model. (default: 0)
+ - `enhancement`: Enhances speech quality at the cost of increased latency. Only supported in the `lightning-large` model. (default: False)
+ - `add_wav_header`: Whether to add a WAV header to the output audio.

  These parameters are part of the `Smallest` instance. They can be set when creating the instance (as shown above). However, the `synthesize` function also accepts `kwargs`, allowing you to override these parameters for a specific synthesis request.
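The default-plus-override behaviour described above (instance-level parameters that per-request `kwargs` can shadow) can be sketched without the SDK. This is an illustrative pattern only: the `TTSDefaults` class below is a hypothetical stand-in for `Smallest`, not the library's actual implementation, though the parameter names mirror the README.

```python
# Sketch of the instance-default / per-request-override pattern.
# NOT the smallestai SDK; parameter names merely mirror the README.

class TTSDefaults:
    def __init__(self, **defaults):
        # Instance-level defaults, fixed at construction time.
        self.defaults = {"voice_id": "emily", "speed": 1.0, "consistency": 0.5}
        self.defaults.update(defaults)

    def synthesize(self, text, **overrides):
        # Per-request kwargs take precedence over instance defaults.
        params = {**self.defaults, **overrides}
        return text, params

client = TTSDefaults(speed=1.2)
_, params = client.synthesize("Hello", consistency=0.3)
print(params["speed"], params["consistency"])  # 1.2 0.3
```

Merging with `{**defaults, **overrides}` keeps the request-level values authoritative, because later keys win in a dict literal.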
@@ -141,9 +136,8 @@ import asyncio
  import aiofiles
  from smallest import AsyncSmallest

- client = AsyncSmallest(api_key="SMALLEST_API_KEY")
-
  async def main():
+     client = AsyncSmallest(api_key="SMALLEST_API_KEY")
      async with client as tts:
          audio_bytes = await tts.synthesize("Hello, this is a test of the async synthesis function.")
          async with aiofiles.open("async_synthesize.wav", "wb") as f:
@@ -153,15 +147,33 @@ if __name__ == "__main__":
      asyncio.run(main())
  ```

+ **Running Asynchronously in a Jupyter Notebook**
+ If you are using a Jupyter Notebook, use the following approach to execute the asynchronous function within the notebook's existing event loop:
+ ```python
+ import asyncio
+ import aiofiles
+ from smallest import AsyncSmallest
+
+ async def main():
+     client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+     async with client as tts:
+         audio_bytes = await tts.synthesize("Hello, this is a test of the async synthesis function.")
+         async with aiofiles.open("async_synthesize.wav", "wb") as f:
+             await f.write(audio_bytes)  # alternatively, you can use the `save_as` parameter.
+
+ await main()
+ ```
+
  **Parameters:**
  - `api_key`: Your API key (can be set via the SMALLEST_API_KEY environment variable)
  - `model`: TTS model to use (default: "lightning")
  - `sample_rate`: Audio sample rate (default: 24000)
  - `voice_id`: Voice ID (default: "emily")
  - `speed`: Speech speed multiplier (default: 1.0)
- - `add_wav_header`: Include WAV header in output (default: True)
- - `transliterate`: Enable text transliteration (default: False)
- - `remove_extra_silence`: Remove additional silence (default: True)
+ - `consistency`: Controls word repetition and skipping. Decrease it to prevent skipped words, and increase it to prevent repetition. Only supported in the `lightning-large` model.
+ - `similarity`: Controls the similarity between the synthesized audio and the reference audio. Increase it to make the speech more similar to the reference audio. Only supported in the `lightning-large` model.
+ - `enhancement`: Enhances speech quality at the cost of increased latency. Only supported in the `lightning-large` model.
+ - `add_wav_header`: Whether to add a WAV header to the output audio.

  These parameters are part of the `AsyncSmallest` instance. They can be set when creating the instance (as shown above). However, the `synthesize` function also accepts `kwargs`, allowing you to override any of these parameters on a per-request basis.

@@ -178,6 +190,58 @@ audio_bytes = await tts.synthesize(

  The `TextToAudioStream` class provides real-time text-to-speech processing, converting streaming text into audio output. It's particularly useful for applications like voice assistants, live captioning, or interactive chatbots that require immediate audio feedback from text generation. It supports both synchronous and asynchronous TTS instances.

+ #### Stream through a WebSocket
+
+ ```python
+ import asyncio
+ import websockets
+ from groq import Groq
+ from smallest import Smallest, TextToAudioStream
+
+ # Initialize Groq (LLM) and Smallest (TTS) instances
+ llm = Groq(api_key="GROQ_API_KEY")
+ tts = Smallest(api_key="SMALLEST_API_KEY")
+ WEBSOCKET_URL = "wss://echo.websocket.events"  # Mock WebSocket server
+
+ # Async function to stream text generation from the LLM
+ async def generate_text(prompt):
+     completion = llm.chat.completions.create(
+         messages=[{"role": "user", "content": prompt}],
+         model="llama3-8b-8192",
+         stream=True,
+     )
+
+     # Yield text as it is generated
+     for chunk in completion:
+         text = chunk.choices[0].delta.content
+         if text:
+             yield text
+
+ # Main function to run the process
+ async def main():
+     # Initialize the TTS processor
+     processor = TextToAudioStream(tts_instance=tts)
+
+     # Generate text from the LLM
+     llm_output = generate_text("Explain text to speech like I am five in 5 sentences.")
+
+     # Stream the generated speech through a websocket
+     async with websockets.connect(WEBSOCKET_URL) as ws:
+         print("Connected to WebSocket server.")
+
+         # Stream the generated speech
+         async for audio_chunk in processor.process(llm_output):
+             await ws.send(audio_chunk)  # Send audio chunk
+             echoed_data = await ws.recv()  # Receive the echoed message
+             print("Received from server:", echoed_data[:20], "...")  # Print the first 20 bytes
+
+     print("WebSocket connection closed.")
+
+ if __name__ == "__main__":
+     asyncio.run(main())
+ ```
+
+ #### Save to a File
  ```python
  import wave
  import asyncio
@@ -245,12 +309,12 @@ The processor yields raw audio data chunks without WAV headers for streaming eff
  ## Add your Voice
  The Smallest AI SDK allows you to clone your voice by uploading an audio file. This feature is available both synchronously and asynchronously, making it flexible for different use cases. Below are examples of how to use this functionality.

- ### Synchronously
+ ### Add Synchronously
  ```python
  from smallest import Smallest

  def main():
-     client = Smallest(api_key="YOUR_API_KEY")
+     client = Smallest(api_key="SMALLEST_API_KEY")
      res = client.add_voice(display_name="My Voice", file_path="my_voice.wav")
      print(res)

@@ -258,13 +322,13 @@ if __name__ == "__main__":
      main()
  ```

- ### Asynchronously
+ ### Add Asynchronously
  ```python
  import asyncio
  from smallest import AsyncSmallest

  async def main():
-     client = AsyncSmallest(api_key="YOUR_API_KEY")
+     client = AsyncSmallest(api_key="SMALLEST_API_KEY")
      res = await client.add_voice(display_name="My Voice", file_path="my_voice.wav")
      print(res)

@@ -272,6 +336,36 @@ if __name__ == "__main__":
      asyncio.run(main())
  ```

+ ## Delete your Voice
+ The Smallest AI SDK allows you to delete your cloned voice. This feature is available both synchronously and asynchronously, making it flexible for different use cases. Below are examples of how to use this functionality.
+
+ ### Delete Synchronously
+ ```python
+ from smallest import Smallest
+
+ def main():
+     client = Smallest(api_key="SMALLEST_API_KEY")
+     res = client.delete_voice(voice_id="voice_id")
+     print(res)
+
+ if __name__ == "__main__":
+     main()
+ ```
+
+ ### Delete Asynchronously
+ ```python
+ import asyncio
+ from smallest import AsyncSmallest
+
+ async def main():
+     client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+     res = await client.delete_voice(voice_id="voice_id")
+     print(res)
+
+ if __name__ == "__main__":
+     asyncio.run(main())
+ ```
+
  ## Available Methods

  ```python
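The README's technical note (referenced in its table of contents) says the streaming processor yields raw chunks without WAV headers. As a hedged sketch, independent of the SDK: raw 16-bit mono PCM can be wrapped in a WAV container with Python's standard `wave` module, using the SDK's documented default sample rate of 24000 Hz. The `pcm_to_wav` helper below is illustrative, not part of the smallestai API.

```python
import io
import wave

def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = 24000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container (header + data).

    Illustrative helper, not part of the smallestai SDK.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)         # mono
        wf.setsampwidth(2)         # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(pcm_bytes)
    return buf.getvalue()

# Concatenate the header-less chunks first, then write a single header,
# so the resulting file has exactly one RIFF/WAVE header.
chunks = [b"\x00\x01" * 100, b"\x02\x03" * 100]
wav_bytes = pcm_to_wav(b"".join(chunks))
print(wav_bytes[:4])  # b'RIFF'
```

Writing one header over the concatenated audio avoids the mid-stream header artifacts the technical note warns about.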
@@ -1,6 +1,6 @@
  [project]
  name = "smallestai"
- version = "2.0.0"
+ version = "2.2.0"
  description = "Official Python client for the Smallest AI API"
  authors = [
      {name = "Smallest", email = "support@smallest.ai"},
@@ -18,7 +18,6 @@ dependencies = [
      "aiohttp",
      "aiofiles",
      "requests",
-     "sacremoses",
      "pydub"
  ]