PyPI - smallestai - Versions diffs - 1.3.4__tar.gz → 2.1.0__tar.gz - Mend

smallestai 1.3.4__tar.gz → 2.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of smallestai might be problematic.

Files changed (25)

{smallestai-1.3.4/smallestai.egg-info → smallestai-2.1.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.2
 Name: smallestai
-Version: 1.3.4
+Version: 2.1.0
 Summary: Official Python client for the Smallest AI API
 Author-email: Smallest <support@smallest.ai>
 License: MIT
@@ -55,9 +55,15 @@ Currently, the library supports direct synthesis and the ability to synthesize s
 - [Get the API Key](#get-the-api-key)
 - [Best Practices for Input Text](#best-practices-for-input-text)
 - [Examples](#examples)
-  - [Sync](#sync)
-  - [Async](#async)
+  - [Synchronous](#Synchronous)
+  - [Aynchronous](#Synchronous)
   - [LLM to Speech](#llm-to-speech)
+  - [Add your Voice](#add-your-voice)
+    - [Synchronously](#add-synchronously)
+    - [Asynchronously](#add-asynchronously)
+  - [Delete your Voice](#delete-your-voice)
+    - [Synchronously](#delete-synchronously)
+    - [Asynchronously](#delete-asynchronously)
 - [Available Methods](#available-methods)
 - [Technical Note: WAV Headers in Streaming Audio](#technical-note-wav-headers-in-streaming-audio)
@@ -77,28 +83,22 @@ When using an SDK in your application, make sure to pin to at least the major ve
 3. Create a new API Key and copy it.
 4. Export the API Key in your environment with the name `SMALLEST_API_KEY`, ensuring that your application can access it securely for authentication.
-## Best Practices for Input Text
-While the `transliterate` parameter is provided, please note that it is not fully supported and may not perform consistently across all cases. It is recommended to use the model without relying on this parameter.
-For optimal voice generation results:
-1. For English, provide the input in Latin script (e.g., "Hello, how are you?").
-2. For Hindi, provide the input in Devanagari script (e.g., "नमस्ते, आप कैसे हैं?").
-3. For code-mixed input, use Latin script for English and Devanagari script for Hindi (e.g., "Hello, आप कैसे हैं?").
 ## Examples
-### Sync
+### Synchronous
 A synchronous text-to-speech synthesis client.
 **Basic Usage:**
 ```python
-import os
 from smallest import Smallest
 def main():
-    client = Smallest(api_key=os.environ.get("SMALLEST_API_KEY"))
-    client.synthesize("Hello, this is a test for sync synthesis function.", save_as="sync_synthesize.wav")
+    client = Smallest(api_key="SMALLEST_API_KEY")
+    client.synthesize(
+        text="Hello, this is a test for sync synthesis function.",
+        save_as="sync_synthesize.wav"
+    )
 if __name__ == "__main__":
     main()
@@ -108,11 +108,12 @@ if __name__ == "__main__":
 - `api_key`: Your API key (can be set via SMALLEST_API_KEY environment variable)
 - `model`: TTS model to use (default: "lightning")
 - `sample_rate`: Audio sample rate (default: 24000)
-- `voice`: Voice ID (default: "emily")
+- `voice_id`: Voice ID (default: "emily")
 - `speed`: Speech speed multiplier (default: 1.0)
-- `add_wav_header`: Include WAV header in output (default: True)
-- `transliterate`: Enable text transliteration (default: False)
-- `remove_extra_silence`: Remove additional silence (default: True)
+- `consistency`: Controls word repetition and skipping. Decrease it to prevent skipped words, and increase it to prevent repetition. Only supported in `lightning-large` model. (default: 0.5)
+- `similarity`: Controls the similarity between the synthesized audio and the reference audio. Increase it to make the speech more similar to the reference audio. Only supported in `lightning-large` model. (default: 0)
+- `enhancement`: Enhances speech quality at the cost of increased latency. Only supported in `lightning-large` model. (default: False)
+- `add_wav_header`: Whether to add a WAV header to the output audio.
 These parameters are part of the `Smallest` instance. They can be set when creating the instance (as shown above). However, the `synthesize` function also accepts `kwargs`, allowing you to override these parameters for a specific synthesis request.
@@ -127,19 +128,17 @@ client.synthesize(
 ```
-### Async
+### Asynchronous
 Asynchronous text-to-speech synthesis client.
 **Basic Usage:**
 ```python
-import os
 import asyncio
 import aiofiles
 from smallest import AsyncSmallest
-client = AsyncSmallest(api_key=os.environ.get("SMALLEST_API_KEY"))
 async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
     async with client as tts:
         audio_bytes = await tts.synthesize("Hello, this is a test of the async synthesis function.")
         async with aiofiles.open("async_synthesize.wav", "wb") as f:
@@ -149,15 +148,33 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
+**Running Asynchronously in a Jupyter Notebook**
+If you are using a Jupyter Notebook, use the following approach to execute the asynchronous function within an existing event loop:
+```python
+import asyncio
+import aiofiles
+from smallest import AsyncSmallest
+async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+    async with client as tts:
+        audio_bytes = await tts.synthesize("Hello, this is a test of the async synthesis function.")
+        async with aiofiles.open("async_synthesize.wav", "wb") as f:
+            await f.write(audio_bytes) # alternatively you can use the `save_as` parameter.
+await main()
+```
 **Parameters:**
 - `api_key`: Your API key (can be set via SMALLEST_API_KEY environment variable)
 - `model`: TTS model to use (default: "lightning")
 - `sample_rate`: Audio sample rate (default: 24000)
-- `voice`: Voice ID (default: "emily")
+- `voice_id`: Voice ID (default: "emily")
 - `speed`: Speech speed multiplier (default: 1.0)
-- `add_wav_header`: Include WAV header in output (default: True)
-- `transliterate`: Enable text transliteration (default: False)
-- `remove_extra_silence`: Remove additional silence (default: True)
+- `consistency`: Controls word repetition and skipping. Decrease it to prevent skipped words, and increase it to prevent repetition. Only supported in `lightning-large` model.
+- `similarity`: Controls the similarity between the synthesized audio and the reference audio. Increase it to make the speech more similar to the reference audio. Only supported in `lightning-large` model.
+- `enhancement`: Enhances speech quality at the cost of increased latency. Only supported in `lightning-large` model.
+- `add_wav_header`: Whether to add a WAV header to the output audio.
 These parameters are part of the `AsyncSmallest` instance. They can be set when creating the instance (as shown above). However, the `synthesize` function also accepts `kwargs`, allowing you to override any of these parameters on a per-request basis.
@@ -174,16 +191,66 @@ audio_bytes = await tts.synthesize(
 The `TextToAudioStream` class provides real-time text-to-speech processing, converting streaming text into audio output. It's particularly useful for applications like voice assistants, live captioning, or interactive chatbots that require immediate audio feedback from text generation. Supports both synchronous and asynchronous TTS instance.
+#### Stream through a WebSocket
+```python
+import asyncio
+import websockets
+from groq import Groq
+from smallest import Smallest, TextToAudioStream
+# Initialize Groq (LLM) and Smallest (TTS) instances
+llm = Groq(api_key="GROQ_API_KEY")
+tts = Smallest(api_key="SMALLEST_API_KEY")
+WEBSOCKET_URL = "wss://echo.websocket.events" # Mock WebSocket server
+# Async function to stream text generation from LLM
+async def generate_text(prompt):
+    completion = llm.chat.completions.create(
+        messages=[{"role": "user", "content": prompt}],
+        model="llama3-8b-8192",
+        stream=True,
+    )
+    # Yield text as it is generated
+    for chunk in completion:
+        text = chunk.choices[0].delta.content
+        if text:
+            yield text
+# Main function to run the process
+async def main():
+    # Initialize the TTS processor
+    processor = TextToAudioStream(tts_instance=tts)
+    # Generate text from LLM
+    llm_output = generate_text("Explain text to speech like I am five in 5 sentences.")
+    # Stream the generated speech throught a websocket
+    async with websockets.connect(WEBSOCKET_URL) as ws:
+        print("Connected to WebSocket server.")
+        # Stream the generated speech
+        async for audio_chunk in processor.process(llm_output):
+            await ws.send(audio_chunk)  # Send audio chunk
+            echoed_data = await ws.recv()  # Receive the echoed message
+            print("Received from server:", echoed_data[:20], "...")  # Print first 20 bytes
+        print("WebSocket connection closed.")
+if __name__ == "__main__":
+    asyncio.run(main())
+```
+#### Save to a File
 ```python
-import os
 import wave
 import asyncio
 from groq import Groq
-from smallest import Smallest
-from smallest import TextToAudioStream
+from smallest import Smallest, TextToAudioStream
-llm = Groq(api_key=os.environ.get("GROQ_API_KEY"))
-tts = Smallest(api_key=os.environ.get("SMALLEST_API_KEY"))
+llm = Groq(api_key="GROQ_API_KEY")
+tts = Smallest(api_key="SMALLEST_API_KEY")
 async def generate_text(prompt):
     """Async generator for streaming text from Groq. You can use any LLM"""
@@ -240,16 +307,76 @@ The processor yields raw audio data chunks without WAV headers for streaming eff
 - Streamed over a network
 - Further processed as needed
+## Add your Voice
+The Smallest AI SDK allows you to clone your voice by uploading an audio file. This feature is available both synchronously and asynchronously, making it flexible for different use cases. Below are examples of how to use this functionality.
+### Add Synchronously
+```python
+from smallest import Smallest
+def main():
+    client = Smallest(api_key="SMALLEST_API_KEY")
+    res = client.add_voice(display_name="My Voice", file_path="my_voice.wav")
+    print(res)
+if __name__ == "__main__":
+    main()
+```
+### Add Asynchronously
+```python
+import asyncio
+from smallest import AsyncSmallest
+async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+    res = await client.add_voice(display_name="My Voice", file_path="my_voice.wav")
+    print(res)
+if __name__ == "__main__":
+    asyncio.run(main())
+```
+## Delete your Voice
+The Smallest AI SDK allows you to delete your cloned voice. This feature is available both synchronously and asynchronously, making it flexible for different use cases. Below are examples of how to use this functionality.
+### Delete Synchronously
+```python
+from smallest import Smallest
+def main():
+    client = Smallest(api_key="SMALLEST_API_KEY")
+    res = client.delete_voice(voice_id="voice_id")
+    print(res)
+if __name__ == "__main__":
+    main()
+```
+### Delete Asynchronously
+```python
+import asyncio
+from smallest import AsyncSmallest
+async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+    res = await client.delete_voice(voice_id="voice_id")
+    print(res)
+if __name__ == "__main__":
+    asyncio.run(main())
+```
 ## Available Methods
 ```python
-from smallest.tts import Smallest
+from smallest import Smallest
-client = Smallest(api_key=os.environ.get("SMALLEST_API_KEY"))
+client = Smallest(api_key="SMALLEST_API_KEY")
-print(f"Avalaible Languages: {client.get_languages()}")
-print(f"Available Voices: {client.get_voices()}")
+print(f"Available Languages: {client.get_languages()}")
+print(f"Available Voices: {client.get_voices(model='lightning')}")
+print(f"Available Voices: {client.get_cloned_voices()}")
 print(f"Available Models: {client.get_models()}")
 ```
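
Note on the parameter changes above: the 2.1.0 parameter list renames `voice` to `voice_id` and no longer lists `transliterate` or `remove_extra_silence`. The sketch below shows one way a caller might translate 1.3.4-style keyword arguments before passing them on. `migrate_kwargs`, `RENAMED` and `REMOVED` are hypothetical names introduced here for illustration and are not part of smallestai. The diff shows only documentation, so whether the library itself still accepts the old names is an assumption left unverified.

```python
# Hypothetical helper, not part of smallestai: maps 1.3.4-style keyword
# arguments onto the parameter names listed in the 2.1.0 README.
RENAMED = {"voice": "voice_id"}
REMOVED = {"transliterate", "remove_extra_silence"}


def migrate_kwargs(old_kwargs):
    """Return (new_kwargs, dropped) for a dict of 1.3.4-style arguments."""
    new_kwargs = {}
    dropped = []
    for name, value in old_kwargs.items():
        if name in REMOVED:
            dropped.append(name)  # no longer listed in the 2.1.0 README
            continue
        new_kwargs[RENAMED.get(name, name)] = value
    return new_kwargs, sorted(dropped)


if __name__ == "__main__":
    old = {"voice": "emily", "speed": 1.0, "transliterate": False}
    new, dropped = migrate_kwargs(old)
    print("pass on:", new)
    print("dropped:", dropped)
```

Returning the dropped names separately, rather than silently discarding them, lets the caller log which settings no longer have a documented equivalent.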

{smallestai-1.3.4 → smallestai-2.1.0}/README.md RENAMED

@@ -28,9 +28,15 @@ Currently, the library supports direct synthesis and the ability to synthesize s
 - [Get the API Key](#get-the-api-key)
 - [Best Practices for Input Text](#best-practices-for-input-text)
 - [Examples](#examples)
-  - [Sync](#sync)
-  - [Async](#async)
+  - [Synchronous](#Synchronous)
+  - [Aynchronous](#Synchronous)
   - [LLM to Speech](#llm-to-speech)
+  - [Add your Voice](#add-your-voice)
+    - [Synchronously](#add-synchronously)
+    - [Asynchronously](#add-asynchronously)
+  - [Delete your Voice](#delete-your-voice)
+    - [Synchronously](#delete-synchronously)
+    - [Asynchronously](#delete-asynchronously)
 - [Available Methods](#available-methods)
 - [Technical Note: WAV Headers in Streaming Audio](#technical-note-wav-headers-in-streaming-audio)
@@ -50,28 +56,22 @@ When using an SDK in your application, make sure to pin to at least the major ve
 3. Create a new API Key and copy it.
 4. Export the API Key in your environment with the name `SMALLEST_API_KEY`, ensuring that your application can access it securely for authentication.
-## Best Practices for Input Text
-While the `transliterate` parameter is provided, please note that it is not fully supported and may not perform consistently across all cases. It is recommended to use the model without relying on this parameter.
-For optimal voice generation results:
-1. For English, provide the input in Latin script (e.g., "Hello, how are you?").
-2. For Hindi, provide the input in Devanagari script (e.g., "नमस्ते, आप कैसे हैं?").
-3. For code-mixed input, use Latin script for English and Devanagari script for Hindi (e.g., "Hello, आप कैसे हैं?").
 ## Examples
-### Sync
+### Synchronous
 A synchronous text-to-speech synthesis client.
 **Basic Usage:**
 ```python
-import os
 from smallest import Smallest
 def main():
-    client = Smallest(api_key=os.environ.get("SMALLEST_API_KEY"))
-    client.synthesize("Hello, this is a test for sync synthesis function.", save_as="sync_synthesize.wav")
+    client = Smallest(api_key="SMALLEST_API_KEY")
+    client.synthesize(
+        text="Hello, this is a test for sync synthesis function.",
+        save_as="sync_synthesize.wav"
+    )
 if __name__ == "__main__":
     main()
@@ -81,11 +81,12 @@ if __name__ == "__main__":
 - `api_key`: Your API key (can be set via SMALLEST_API_KEY environment variable)
 - `model`: TTS model to use (default: "lightning")
 - `sample_rate`: Audio sample rate (default: 24000)
-- `voice`: Voice ID (default: "emily")
+- `voice_id`: Voice ID (default: "emily")
 - `speed`: Speech speed multiplier (default: 1.0)
-- `add_wav_header`: Include WAV header in output (default: True)
-- `transliterate`: Enable text transliteration (default: False)
-- `remove_extra_silence`: Remove additional silence (default: True)
+- `consistency`: Controls word repetition and skipping. Decrease it to prevent skipped words, and increase it to prevent repetition. Only supported in `lightning-large` model. (default: 0.5)
+- `similarity`: Controls the similarity between the synthesized audio and the reference audio. Increase it to make the speech more similar to the reference audio. Only supported in `lightning-large` model. (default: 0)
+- `enhancement`: Enhances speech quality at the cost of increased latency. Only supported in `lightning-large` model. (default: False)
+- `add_wav_header`: Whether to add a WAV header to the output audio.
 These parameters are part of the `Smallest` instance. They can be set when creating the instance (as shown above). However, the `synthesize` function also accepts `kwargs`, allowing you to override these parameters for a specific synthesis request.
@@ -100,19 +101,17 @@ client.synthesize(
 ```
-### Async
+### Asynchronous
 Asynchronous text-to-speech synthesis client.
 **Basic Usage:**
 ```python
-import os
 import asyncio
 import aiofiles
 from smallest import AsyncSmallest
-client = AsyncSmallest(api_key=os.environ.get("SMALLEST_API_KEY"))
 async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
     async with client as tts:
         audio_bytes = await tts.synthesize("Hello, this is a test of the async synthesis function.")
         async with aiofiles.open("async_synthesize.wav", "wb") as f:
@@ -122,15 +121,33 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
+**Running Asynchronously in a Jupyter Notebook**
+If you are using a Jupyter Notebook, use the following approach to execute the asynchronous function within an existing event loop:
+```python
+import asyncio
+import aiofiles
+from smallest import AsyncSmallest
+async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+    async with client as tts:
+        audio_bytes = await tts.synthesize("Hello, this is a test of the async synthesis function.")
+        async with aiofiles.open("async_synthesize.wav", "wb") as f:
+            await f.write(audio_bytes) # alternatively you can use the `save_as` parameter.
+await main()
+```
 **Parameters:**
 - `api_key`: Your API key (can be set via SMALLEST_API_KEY environment variable)
 - `model`: TTS model to use (default: "lightning")
 - `sample_rate`: Audio sample rate (default: 24000)
-- `voice`: Voice ID (default: "emily")
+- `voice_id`: Voice ID (default: "emily")
 - `speed`: Speech speed multiplier (default: 1.0)
-- `add_wav_header`: Include WAV header in output (default: True)
-- `transliterate`: Enable text transliteration (default: False)
-- `remove_extra_silence`: Remove additional silence (default: True)
+- `consistency`: Controls word repetition and skipping. Decrease it to prevent skipped words, and increase it to prevent repetition. Only supported in `lightning-large` model.
+- `similarity`: Controls the similarity between the synthesized audio and the reference audio. Increase it to make the speech more similar to the reference audio. Only supported in `lightning-large` model.
+- `enhancement`: Enhances speech quality at the cost of increased latency. Only supported in `lightning-large` model.
+- `add_wav_header`: Whether to add a WAV header to the output audio.
 These parameters are part of the `AsyncSmallest` instance. They can be set when creating the instance (as shown above). However, the `synthesize` function also accepts `kwargs`, allowing you to override any of these parameters on a per-request basis.
@@ -147,16 +164,66 @@ audio_bytes = await tts.synthesize(
 The `TextToAudioStream` class provides real-time text-to-speech processing, converting streaming text into audio output. It's particularly useful for applications like voice assistants, live captioning, or interactive chatbots that require immediate audio feedback from text generation. Supports both synchronous and asynchronous TTS instance.
+#### Stream through a WebSocket
+```python
+import asyncio
+import websockets
+from groq import Groq
+from smallest import Smallest, TextToAudioStream
+# Initialize Groq (LLM) and Smallest (TTS) instances
+llm = Groq(api_key="GROQ_API_KEY")
+tts = Smallest(api_key="SMALLEST_API_KEY")
+WEBSOCKET_URL = "wss://echo.websocket.events" # Mock WebSocket server
+# Async function to stream text generation from LLM
+async def generate_text(prompt):
+    completion = llm.chat.completions.create(
+        messages=[{"role": "user", "content": prompt}],
+        model="llama3-8b-8192",
+        stream=True,
+    )
+    # Yield text as it is generated
+    for chunk in completion:
+        text = chunk.choices[0].delta.content
+        if text:
+            yield text
+# Main function to run the process
+async def main():
+    # Initialize the TTS processor
+    processor = TextToAudioStream(tts_instance=tts)
+    # Generate text from LLM
+    llm_output = generate_text("Explain text to speech like I am five in 5 sentences.")
+    # Stream the generated speech throught a websocket
+    async with websockets.connect(WEBSOCKET_URL) as ws:
+        print("Connected to WebSocket server.")
+        # Stream the generated speech
+        async for audio_chunk in processor.process(llm_output):
+            await ws.send(audio_chunk)  # Send audio chunk
+            echoed_data = await ws.recv()  # Receive the echoed message
+            print("Received from server:", echoed_data[:20], "...")  # Print first 20 bytes
+        print("WebSocket connection closed.")
+if __name__ == "__main__":
+    asyncio.run(main())
+```
+#### Save to a File
 ```python
-import os
 import wave
 import asyncio
 from groq import Groq
-from smallest import Smallest
-from smallest import TextToAudioStream
+from smallest import Smallest, TextToAudioStream
-llm = Groq(api_key=os.environ.get("GROQ_API_KEY"))
-tts = Smallest(api_key=os.environ.get("SMALLEST_API_KEY"))
+llm = Groq(api_key="GROQ_API_KEY")
+tts = Smallest(api_key="SMALLEST_API_KEY")
 async def generate_text(prompt):
     """Async generator for streaming text from Groq. You can use any LLM"""
@@ -213,16 +280,76 @@ The processor yields raw audio data chunks without WAV headers for streaming eff
 - Streamed over a network
 - Further processed as needed
+## Add your Voice
+The Smallest AI SDK allows you to clone your voice by uploading an audio file. This feature is available both synchronously and asynchronously, making it flexible for different use cases. Below are examples of how to use this functionality.
+### Add Synchronously
+```python
+from smallest import Smallest
+def main():
+    client = Smallest(api_key="SMALLEST_API_KEY")
+    res = client.add_voice(display_name="My Voice", file_path="my_voice.wav")
+    print(res)
+if __name__ == "__main__":
+    main()
+```
+### Add Asynchronously
+```python
+import asyncio
+from smallest import AsyncSmallest
+async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+    res = await client.add_voice(display_name="My Voice", file_path="my_voice.wav")
+    print(res)
+if __name__ == "__main__":
+    asyncio.run(main())
+```
+## Delete your Voice
+The Smallest AI SDK allows you to delete your cloned voice. This feature is available both synchronously and asynchronously, making it flexible for different use cases. Below are examples of how to use this functionality.
+### Delete Synchronously
+```python
+from smallest import Smallest
+def main():
+    client = Smallest(api_key="SMALLEST_API_KEY")
+    res = client.delete_voice(voice_id="voice_id")
+    print(res)
+if __name__ == "__main__":
+    main()
+```
+### Delete Asynchronously
+```python
+import asyncio
+from smallest import AsyncSmallest
+async def main():
+    client = AsyncSmallest(api_key="SMALLEST_API_KEY")
+    res = await client.delete_voice(voice_id="voice_id")
+    print(res)
+if __name__ == "__main__":
+    asyncio.run(main())
+```
 ## Available Methods
 ```python
-from smallest.tts import Smallest
+from smallest import Smallest
-client = Smallest(api_key=os.environ.get("SMALLEST_API_KEY"))
+client = Smallest(api_key="SMALLEST_API_KEY")
-print(f"Avalaible Languages: {client.get_languages()}")
-print(f"Available Voices: {client.get_voices()}")
+print(f"Available Languages: {client.get_languages()}")
+print(f"Available Voices: {client.get_voices(model='lightning')}")
+print(f"Available Voices: {client.get_cloned_voices()}")
 print(f"Available Models: {client.get_models()}")
 ```
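
Note on the API key changes above: the 2.1.0 examples pass the literal string `"SMALLEST_API_KEY"` where the 1.3.4 examples read `os.environ.get("SMALLEST_API_KEY")`. Assuming that literal is a placeholder for a real key, a caller who prefers the environment-variable approach could keep a small loader such as the sketch below. `load_api_key` is a hypothetical helper, not a smallestai function, and the commented-out client lines simply mirror the README examples.

```python
import os


def load_api_key(env_var="SMALLEST_API_KEY", environ=None):
    """Return the key stored in env_var, or raise if it is missing or empty."""
    # `environ` can be overridden with a plain dict, which helps in tests.
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


if __name__ == "__main__":
    # from smallest import Smallest              # as in the README examples
    # client = Smallest(api_key=load_api_key())
    try:
        print("key length:", len(load_api_key()))
    except RuntimeError as exc:
        print(exc)
```

Failing early with a clear message avoids sending a request with an empty or placeholder key.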

{smallestai-1.3.4 → smallestai-2.1.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "smallestai"
-version = "1.3.4"
+version = "2.1.0"
 description = "Official Python client for the Smallest AI API"
 authors = [
     {name = "Smallest", email = "support@smallest.ai"},
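
Note on the version bump above: the release moves from 1.3.4 to 2.1.0, a major-version change, and the README advises pinning to at least the major version. A requirement specifier such as `smallestai>=2.1,<3` is one way to do that. The sketch below is a runtime check using the standard library's `importlib.metadata`. `major_of` and `is_expected_major` are hypothetical helpers, and the expected major version of 2 is an assumption taken from this diff.

```python
from importlib.metadata import PackageNotFoundError, version


def major_of(version_string):
    """Return the leading integer of a version string such as '2.1.0'."""
    return int(version_string.split(".")[0])


def is_expected_major(installed, expected_major=2):
    """True when the installed version has the major number the code was written for."""
    return major_of(installed) == expected_major


if __name__ == "__main__":
    try:
        installed = version("smallestai")
    except PackageNotFoundError:
        print("smallestai is not installed")
    else:
        status = "ok" if is_expected_major(installed) else "unexpected major version"
        print(installed, status)
```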
