sinapsis-speech (PyPI): 0.4.0 → 0.4.1 (tar.gz)

Files changed (49)

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: sinapsis-speech
-Version: 0.4.0
+Version: 0.4.1
 Summary: Generate speech using various libraries.
 Author-email: SinapsisAI <dev@sinapsis.tech>
 Project-URL: Homepage, https://sinapsis.tech
@@ -35,7 +35,7 @@ Sinapsis Speech
 <br>
 </h1>
-<h4 align="center"> Templates for a wide range of voice generation tasks.</h4>
+<h4 align="center"> A monorepo housing multiple packages and templates for versatile voice generation, text-to-speech, speech-to-text, and beyond.</h4>
 <p align="center">
 <a href="#installation">🐍 Installation</a> •
@@ -108,10 +108,14 @@ This repository is organized into modular packages, each designed for integratio
 <details>
 <summary id="elevenlabs"><strong><span style="font-size: 1.4em;"> Sinapsis ElevenLabs </span></strong></summary>
-This package offers a suite of templates and utilities designed for effortless integrating, configuration, and execution of **text-to-speech (TTS)** and **voice generation** functionalities powered by [ElevenLabs](https://elevenlabs.io/).
+This package offers a suite of templates and utilities designed for effortless integrating, configuration, and execution of **text-to-speech (TTS)**, **speech-to-speech (STS)**, **voice cloning**, and **voice generation** functionalities powered by [ElevenLabs](https://elevenlabs.io/).
+- **ElevenLabsSTS**: Template for transforming a voice into a different character or style using the ElevenLabs Speech-to-Speech API.
 - **ElevenLabsTTS**: Template for converting text into speech using ElevenLabs' voice models.
+- **ElevenLabsVoiceClone**: Template for creating a synthetic copy of an existing voice using the ElevenLabs API.
 - **ElevenLabsVoiceGeneration**: Template for generating custom synthetic voices based on user-provided descriptions.
 For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_elevenlabs/README.md).
@@ -148,6 +152,30 @@ For specific instructions and further details, see the [README.md](https://githu
 </details>
+<details>
+<summary id="orpheus-cpp"><strong><span style="font-size: 1.4em;"> Sinapsis Orppheus-CPP</span></strong></summary>
+This package provides a template for seamlessly integrating, configuring, and running **text-to-speech (TTS)** functionalities powered by [Orpheus-TTS](https://github.com/canopyai/Orpheus-TTS).
+- **OrpheusTTS**: Converts text to speech using the Orpheus TTS model with advanced neural voice synthesis. The template processes text packets from the input container, generates corresponding audio using Orpheus TTS, and adds the resulting audio packets to the container. Features graceful error handling for out-of-memory conditions
+For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_orpheus_cpp/README.md).
+</details>
+<details>
+<summary id="parakeet-tdt"><strong><span style="font-size: 1.4em;"> Sinapsis Parakeet-TDT</span></strong></summary>
+This package provides a template for seamlessly integrating, configuring, and running **speech-to-text (STT)** functionalities powered by [NVIDIA's Parakeet TDT model](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2).
+- **ParakeetTDTInference**: Converts speech to text using NVIDIA's Parakeet TDT 0.6B model. This template processes audio packets from the input container or specified file paths, performs transcription with optional timestamp prediction, and adds the resulting text packets to the container.
+For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_parakeet_tdt/README.md).
+</details>
 <h2 id="webapp">🌐 Webapps</h2>
 The webapps included in this project showcase the modularity of the templates, in this case for speech generation tasks.
@@ -186,7 +214,6 @@ cd sinapsis-speech
 docker compose -f docker/compose.yaml build
 ```
 2. **Start the app container**:
 - For ElevenLabs:

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/README.md RENAMED

@@ -9,7 +9,7 @@ Sinapsis Speech
 <br>
 </h1>
-<h4 align="center"> Templates for a wide range of voice generation tasks.</h4>
+<h4 align="center"> A monorepo housing multiple packages and templates for versatile voice generation, text-to-speech, speech-to-text, and beyond.</h4>
 <p align="center">
 <a href="#installation">🐍 Installation</a> •
@@ -82,10 +82,14 @@ This repository is organized into modular packages, each designed for integratio
 <details>
 <summary id="elevenlabs"><strong><span style="font-size: 1.4em;"> Sinapsis ElevenLabs </span></strong></summary>
-This package offers a suite of templates and utilities designed for effortless integrating, configuration, and execution of **text-to-speech (TTS)** and **voice generation** functionalities powered by [ElevenLabs](https://elevenlabs.io/).
+This package offers a suite of templates and utilities designed for effortless integrating, configuration, and execution of **text-to-speech (TTS)**, **speech-to-speech (STS)**, **voice cloning**, and **voice generation** functionalities powered by [ElevenLabs](https://elevenlabs.io/).
+- **ElevenLabsSTS**: Template for transforming a voice into a different character or style using the ElevenLabs Speech-to-Speech API.
 - **ElevenLabsTTS**: Template for converting text into speech using ElevenLabs' voice models.
+- **ElevenLabsVoiceClone**: Template for creating a synthetic copy of an existing voice using the ElevenLabs API.
 - **ElevenLabsVoiceGeneration**: Template for generating custom synthetic voices based on user-provided descriptions.
 For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_elevenlabs/README.md).
@@ -122,6 +126,30 @@ For specific instructions and further details, see the [README.md](https://githu
 </details>
+<details>
+<summary id="orpheus-cpp"><strong><span style="font-size: 1.4em;"> Sinapsis Orppheus-CPP</span></strong></summary>
+This package provides a template for seamlessly integrating, configuring, and running **text-to-speech (TTS)** functionalities powered by [Orpheus-TTS](https://github.com/canopyai/Orpheus-TTS).
+- **OrpheusTTS**: Converts text to speech using the Orpheus TTS model with advanced neural voice synthesis. The template processes text packets from the input container, generates corresponding audio using Orpheus TTS, and adds the resulting audio packets to the container. Features graceful error handling for out-of-memory conditions
+For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_orpheus_cpp/README.md).
+</details>
+<details>
+<summary id="parakeet-tdt"><strong><span style="font-size: 1.4em;"> Sinapsis Parakeet-TDT</span></strong></summary>
+This package provides a template for seamlessly integrating, configuring, and running **speech-to-text (STT)** functionalities powered by [NVIDIA's Parakeet TDT model](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2).
+- **ParakeetTDTInference**: Converts speech to text using NVIDIA's Parakeet TDT 0.6B model. This template processes audio packets from the input container or specified file paths, performs transcription with optional timestamp prediction, and adds the resulting text packets to the container.
+For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_parakeet_tdt/README.md).
+</details>
 <h2 id="webapp">🌐 Webapps</h2>
 The webapps included in this project showcase the modularity of the templates, in this case for speech generation tasks.
@@ -160,7 +188,6 @@ cd sinapsis-speech
 docker compose -f docker/compose.yaml build
 ```
 2. **Start the app container**:
 - For ElevenLabs:

sinapsis_speech-0.4.1/packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/helpers/tags.py ADDED

@@ -0,0 +1,15 @@
+# -*- coding: utf-8 -*-
+from enum import Enum
+class Tags(Enum):
+    AUDIO = "audio"
+    AUDIO_GENERATION = "audio_generation"
+    ELEVENLABS = "elevenlabs"
+    PROMPT = "prompt"
+    SPEECH = "speech"
+    SPEECH_TO_SPEECH = "speech_to_speech"
+    TEXT_TO_SPEECH = "text_to_speech"
+    VOICE_CONVERSION = "voice_conversion"
+    VOICE_CLONING = "voice_cloning"
+    VOICE_GENERATION = "voice_generation"

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_base.py RENAMED

@@ -3,11 +3,10 @@
 import abc
 import os
-import uuid
-from io import BytesIO
-from typing import IO, Iterable, Iterator, Literal
+from typing import Generator, Iterable, Iterator, Literal
-from elevenlabs import Voice, VoiceSettings, save
+import numpy as np
+from elevenlabs import Voice, VoiceSettings
 from elevenlabs.client import ElevenLabs
 from elevenlabs.types import OutputFormat
 from pydantic import Field
@@ -19,9 +18,11 @@ from sinapsis_core.template_base.base_models import (
     UIPropertiesMetadata,
 )
 from sinapsis_core.template_base.template import Template
-from sinapsis_core.utils.env_var_keys import SINAPSIS_CACHE_DIR
+from sinapsis_core.utils.env_var_keys import WORKING_DIR
+from sinapsis_generic_data_tools.helpers.audio_encoder import audio_bytes_to_numpy
 from sinapsis_elevenlabs.helpers.env_var_keys import ELEVENLABS_API_KEY
+from sinapsis_elevenlabs.helpers.tags import Tags
 RESPONSE_TYPE = Iterator[bytes] | list[bytes] | list[Iterator[bytes]] | None
@@ -51,8 +52,6 @@ class ElevenLabsBase(Template, abc.ABC):
             output_format (OutputFormat): The output audio format and quality. Options include:
                 ["mp3_22050_32", "mp3_44100_32", "mp3_44100_64", "mp3_44100_96", "mp3_44100_128",
                 "mp3_44100_192", "pcm_16000", "pcm_22050", "pcm_24000", "pcm_44100", "ulaw_8000"]
-            output_folder (str): The folder where generated audio files will be saved.
-            stream (bool): If True, the audio is returned as a stream; otherwise, saved to a file.
             voice (str | Voice | None): The voice to use for speech synthesis. This can be a voice ID (str),
                 a voice name (str) or an elevenlabs voice object (Voice).
             voice_settings (VoiceSettings): A dictionary of settings that control the behavior of the voice.
@@ -74,17 +73,20 @@ class ElevenLabsBase(Template, abc.ABC):
         ] = "eleven_turbo_v2_5"
         output_file_name: str | None = None
         output_format: OutputFormat = "mp3_44100_128"
-        output_folder: str = os.path.join(SINAPSIS_CACHE_DIR, "elevenlabs", "audios")
+        output_folder: str = os.path.join(WORKING_DIR, "elevenlabs", "audios")
         stream: bool = False
         voice: str | Voice | None = None
         voice_settings: VoiceSettings = Field(default_factory=dict)  # type: ignore[arg-type]
-    UIProperties = UIPropertiesMetadata(category="Elevenlabs", output_type=OutputTypes.AUDIO)
+    UIProperties = UIPropertiesMetadata(
+        category="Elevenlabs",
+        output_type=OutputTypes.AUDIO,
+        tags=[Tags.AUDIO, Tags.ELEVENLABS, Tags.SPEECH],
+    )
     def __init__(self, attributes: TemplateAttributeType) -> None:
         """Initializes the ElevenLabs API client with the given attributes."""
         super().__init__(attributes)
-        os.makedirs(self.attributes.output_folder, exist_ok=True)
         self.client = self.init_elevenlabs_client()
     def init_elevenlabs_client(self) -> ElevenLabs:
@@ -92,44 +94,27 @@ class ElevenLabsBase(Template, abc.ABC):
         key = self.attributes.api_key if self.attributes.api_key else ELEVENLABS_API_KEY
         return ElevenLabs(api_key=key)
-    def reset_state(self) -> None:
+    def reset_state(self, template_name: str | None = None) -> None:
         """Resets state of model"""
+        _ = template_name
         self.client = self.init_elevenlabs_client()
     @abc.abstractmethod
     def synthesize_speech(self, input_data: list[Packet]) -> RESPONSE_TYPE:
         """Abstract method for ElevenLabs speech synthesis."""
-    def _save_audio(self, response: Iterable | bytes, file_format: str, idx: int) -> str:
-        """Saves the audio to a file and returns the file path."""
-        if self.attributes.output_file_name:
-            file_name = self.attributes.output_file_name + "_" + str(idx)
-        else:
-            file_name = uuid.uuid4()
-        output_file = os.path.join(self.attributes.output_folder, f"{file_name}.{file_format}")
-        try:
-            save(response, output_file)
-            self.logger.info(f"Audio saved to: {output_file}")
-            return output_file
-        except OSError as e:
-            self.logger.error(f"File system error while saving speech to file: {e}")
-            raise
-    def _generate_audio_stream(self, response: Iterable | bytes) -> IO[bytes]:
+    def _generate_audio_stream(self, response: Iterable | bytes) -> bytes:
         """Generates and returns the audio stream."""
-        audio_stream = BytesIO()
         try:
             if isinstance(response, Iterator):
-                for chunk in response:
-                    if chunk:
-                        audio_stream.write(chunk)
+                audio_stream = b"".join(chunk for chunk in response)
             elif isinstance(response, bytes):
-                audio_stream.write(response)
+                audio_stream = response
             else:
                 raise TypeError(f"Unsupported response type: {type(response)}")
-            audio_stream.seek(0)
             self.logger.info("Returning audio stream")
             return audio_stream
         except IOError as e:
@@ -139,14 +124,15 @@ class ElevenLabsBase(Template, abc.ABC):
             self.logger.error(f"Value error while processing audio chunks: {e}")
             raise
-    def _process_audio_output(self, idx: int, response: Iterable | bytes) -> str | IO[bytes]:
+    def _process_audio_output(self, response: Iterable | bytes) -> tuple[np.ndarray, int]:
         """Processes a single audio output (either stream or file)."""
-        if self.attributes.stream:
-            return self._generate_audio_stream(response)
-        file_format = "mp3" if "mp3" in self.attributes.output_format else "wav"
-        return self._save_audio(response, file_format, idx)
-    def generate_speech(self, input_data: list[Packet]) -> list[str | IO[bytes]] | None:
+        result = self._generate_audio_stream(response)
+        audio_np, sample_rate = audio_bytes_to_numpy(result)
+        return audio_np, sample_rate
+    def generate_speech(self, input_data: list[Packet]) -> list[tuple] | None:
         """Generates speech and saves it to a file."""
         responses: RESPONSE_TYPE = self.synthesize_speech(input_data)
         if not responses:
@@ -154,29 +140,29 @@ class ElevenLabsBase(Template, abc.ABC):
         if isinstance(responses, Iterator):
             responses = [responses]
-        audio_outputs = [self._process_audio_output(idx, response) for idx, response in enumerate(responses)]
+        elif isinstance(responses, Generator):
+            responses = list(responses)
+        audio_outputs = [self._process_audio_output(response) for response in responses]
         return audio_outputs
-    def _handle_streaming_output(self, audio_outputs: list[str | IO[bytes]]) -> list[AudioPacket]:
+    def _handle_streaming_output(self, audio_outputs: list[tuple]) -> list[AudioPacket]:
         """Handles audio stream output by adding it to the container as AudioPackets."""
         generated_audios: list[AudioPacket] = []
-        sample_rate = int(self.attributes.output_format.split("_")[1])
+        # sample_rate = int(self.attributes.output_format.split("_")[1])
         for audio_output in audio_outputs:
+            audio = audio_output[0]
+            sample_rate = audio_output[1]
             audio_packet = AudioPacket(
-                content=audio_output,
+                content=audio,
                 sample_rate=sample_rate,
             )
             generated_audios.append(audio_packet)
         return generated_audios
-    def _handle_audio_outputs(self, audio_outputs: list[str | IO[bytes]], container: DataContainer) -> None:
+    def _handle_audio_outputs(self, audio_outputs: list[tuple], container: DataContainer) -> None:
         """Handles the audio outputs by appending to the container based on the output type (stream or file)."""
-        if self.attributes.stream:
-            container.audios = container.audios or []
-            container.audios.extend(self._handle_streaming_output(audio_outputs))
-        else:
-            self._set_generic_data(container, audio_outputs)
+        container.audios = container.audios or []
+        container.audios = self._handle_streaming_output(audio_outputs)
     def execute(self, container: DataContainer) -> DataContainer:
         """

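Reviewer note on the `elevenlabs_base.py` changes above: `_generate_audio_stream` now joins streamed chunks into a single `bytes` value, and `generate_speech` gains an `elif isinstance(responses, Generator)` branch placed after an `isinstance(responses, Iterator)` check. The sketch below is not the package's code; `join_audio_chunks` and `fake_stream` are hypothetical names used only to illustrate the chunk-joining step and to show that generator objects already satisfy the `Iterator` check, so the first branch is the one that handles them.

```python
from __future__ import annotations

from typing import Generator, Iterable, Iterator


def join_audio_chunks(response: Iterable[bytes] | bytes) -> bytes:
    """Collapse a streamed response into a single bytes object."""
    if isinstance(response, bytes):
        return response
    if isinstance(response, Iterator):
        # Empty chunks carry no audio, so they are skipped while joining.
        return b"".join(chunk for chunk in response if chunk)
    raise TypeError(f"Unsupported response type: {type(response)}")


def fake_stream() -> Generator[bytes, None, None]:
    """Hypothetical stand-in for a streamed API response."""
    yield b"ID3"
    yield b""
    yield b"\x00\x01"


stream = fake_stream()
# Every generator object is also an Iterator, so an Iterator check that
# comes first already matches generators.
is_iterator = isinstance(stream, Iterator)
is_generator = isinstance(stream, Generator)
audio = join_audio_chunks(stream)
```

If generators need separate handling, one option would be to test for `Generator` before `Iterator`; whether that distinction matters here depends on what `synthesize_speech` returns in practice.
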
{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_sts.py RENAMED

@@ -5,9 +5,13 @@ from typing import Callable, Iterator, Literal
 from sinapsis_core.data_containers.data_packet import AudioPacket
+from sinapsis_elevenlabs.helpers.tags import Tags
 from sinapsis_elevenlabs.helpers.voice_utils import create_voice_settings, get_voice_id
 from sinapsis_elevenlabs.templates.elevenlabs_base import ElevenLabsBase
+ElevenLabsSTSUIProperties = ElevenLabsBase.UIProperties
+ElevenLabsSTSUIProperties.tags.extend([Tags.SPEECH_TO_SPEECH, Tags.VOICE_CONVERSION])
 class ElevenLabsSTS(ElevenLabsBase):
     """Template to interact with the ElevenLabs Speech-to-Speech API.
@@ -31,7 +35,7 @@ class ElevenLabsSTS(ElevenLabsBase):
         model: eleven_multilingual_sts_v2
         output_file_name: null
         output_format: mp3_44100_128
-        output_folder: ~/.cache/sinapsis/elevenlabs/audios
+        output_folder: <WORKING_DIR>/elevenlabs/audios
         stream: false
         voice: null
         voice_settings:
@@ -45,6 +49,7 @@ class ElevenLabsSTS(ElevenLabsBase):
     """
     PACKET_TYPE_NAME: str = "audios"
+    UIProperties = ElevenLabsSTSUIProperties
     class AttributesBaseModel(ElevenLabsBase.AttributesBaseModel):
         """Attributes specific to ElevenLabs STS API interaction.
@@ -73,9 +78,8 @@ class ElevenLabsSTS(ElevenLabsBase):
             KeyError: If the expected key is missing in the API response.
         """
         try:
-            method: Callable[..., Iterator[bytes]] = (
-                self.client.speech_to_speech.stream if self.attributes.stream else self.client.speech_to_speech.convert
-            )
+            method: Callable[..., Iterator[bytes]] = self.client.speech_to_speech.stream  # (
             return method(
                 voice_id=get_voice_id(self.client, voice=self.attributes.voice),
                 audio=input_data[0].content,

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_tts.py RENAMED

@@ -5,6 +5,7 @@ from typing import Callable, Iterator, Literal
 from sinapsis_core.data_containers.data_packet import TextPacket
+from sinapsis_elevenlabs.helpers.tags import Tags
 from sinapsis_elevenlabs.helpers.voice_utils import (
     create_voice_settings,
     get_voice_id,
@@ -12,6 +13,9 @@ from sinapsis_elevenlabs.helpers.voice_utils import (
 )
 from sinapsis_elevenlabs.templates.elevenlabs_base import ElevenLabsBase
+ElevenLabsTTSUIProperties = ElevenLabsBase.UIProperties
+ElevenLabsTTSUIProperties.tags.extend([Tags.TEXT_TO_SPEECH])
 class ElevenLabsTTS(ElevenLabsBase):
     """Template to interact with ElevenLabs text-to-speech API.
@@ -36,7 +40,7 @@ class ElevenLabsTTS(ElevenLabsBase):
         voice_settings: null
         model: eleven_turbo_v2_5
         output_format: mp3_44100_128
-        output_folder: /sinapsis/cache/dir/elevenlabs/audios
+        output_folder: <WORKING_DIR>/elevenlabs/audios
         stream: false
     """
@@ -65,9 +69,8 @@ class ElevenLabsTTS(ElevenLabsBase):
         """
         input_text: str = load_input_text(input_data)
         try:
-            method: Callable[..., Iterator[bytes]] = (
-                self.client.text_to_speech.stream if self.attributes.stream else self.client.text_to_speech.convert
-            )
+            method: Callable[..., Iterator[bytes]] = self.client.text_to_speech.stream
             return method(
                 text=input_text,
                 voice_id=get_voice_id(self.client, self.attributes.voice),

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_voice_clone.py RENAMED

@@ -4,8 +4,12 @@
 from elevenlabs import Voice
 from sinapsis_core.data_containers.data_packet import AudioPacket, DataContainer
+from sinapsis_elevenlabs.helpers.tags import Tags
 from sinapsis_elevenlabs.templates.elevenlabs_tts import ElevenLabsTTS
+ElevenLabsVoiceCloneUIProperties = ElevenLabsTTS.UIProperties
+ElevenLabsVoiceCloneUIProperties.tags.extend([Tags.VOICE_CLONING])
 class ElevenLabsVoiceClone(ElevenLabsTTS):
     """Template to clone a voice using the ElevenLabs API.
@@ -30,7 +34,7 @@ class ElevenLabsVoiceClone(ElevenLabsTTS):
         model: eleven_turbo_v2_5
         output_file_name: null
         output_format: mp3_44100_128
-        output_folder: ~/.cache/sinapsis/elevenlabs/audios
+        output_folder: <WORKING_DIR>/elevenlabs/audios
         stream: false
         voice: null
         voice_settings:
@@ -45,6 +49,8 @@ class ElevenLabsVoiceClone(ElevenLabsTTS):
     """
+    UIProperties = ElevenLabsVoiceCloneUIProperties
     class AttributesBaseModel(ElevenLabsTTS.AttributesBaseModel):
         """Attributes specific to the ElevenLabsVoiceClone class.

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_voice_generation.py RENAMED

@@ -5,9 +5,13 @@ import base64
 from sinapsis_core.data_containers.data_packet import TextPacket
+from sinapsis_elevenlabs.helpers.tags import Tags
 from sinapsis_elevenlabs.helpers.voice_utils import load_input_text
 from sinapsis_elevenlabs.templates.elevenlabs_base import ElevenLabsBase
+ElevenLabsVoiceGenerationUIProperties = ElevenLabsBase.UIProperties
+ElevenLabsVoiceGenerationUIProperties.tags.extend([Tags.VOICE_GENERATION, Tags.PROMPT])
 class ElevenLabsVoiceGeneration(ElevenLabsBase):
     """
@@ -33,12 +37,14 @@ class ElevenLabsVoiceGeneration(ElevenLabsBase):
         voice_settings: null
         model: eleven_turbo_v2_5
         output_format: mp3_44100_128
-        output_folder: /sinapsis/cache/dir/elevenlabs/audios
+        output_folder: <WORKING_DIR>/elevenlabs/audios
         stream: false
         voice_description: An old British male with a raspy, deep voice. Professional,
           relaxed and assertive
     """
+    UIProperties = ElevenLabsVoiceGenerationUIProperties
     class AttributesBaseModel(ElevenLabsBase.AttributesBaseModel):
         """
         Attributes for voice generation in ElevenLabs API.
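
Reviewer note on the four ElevenLabs template diffs above: each module binds a new name to `ElevenLabsBase.UIProperties` (or `ElevenLabsTTS.UIProperties`) and then calls `.tags.extend(...)` on it. Plain assignment in Python does not copy, so the sketch below illustrates how such a pattern can end up mutating one shared tag list. `UIMeta` and `Base` are hypothetical stand-ins; whether `UIPropertiesMetadata` in `sinapsis_core` behaves the same way is an assumption that would need checking against that library.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class UIMeta:
    """Hypothetical stand-in for UIPropertiesMetadata; only `tags` matters here."""

    category: str
    tags: list[str] = field(default_factory=list)


class Base:
    UIProperties = UIMeta(category="Elevenlabs", tags=["audio", "speech"])


# Pattern from the diff: assignment binds a second name to the same object.
aliased = Base.UIProperties
aliased.tags.extend(["text_to_speech"])
base_sees_new_tag = "text_to_speech" in Base.UIProperties.tags

# One possible alternative: copy first so the base metadata stays untouched.
copied = deepcopy(Base.UIProperties)
copied.tags.extend(["voice_cloning"])
base_unchanged = "voice_cloning" not in Base.UIProperties.tags
```

Under that assumption, importing all four templates would leave every class reporting the union of all tags; copying before extending keeps each template's tag list separate.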

sinapsis_speech-0.4.1/packages/sinapsis_f5_tts/src/sinapsis_f5_tts/helpers/tags.py ADDED

@@ -0,0 +1,10 @@
+# -*- coding: utf-8 -*-
+from enum import Enum
+class Tags(Enum):
+    AUDIO = "audio"
+    AUDIO_GENERATION = "audio_generation"
+    F5TTS = "f5tts"
+    SPEECH = "speech"
+    TEXT_TO_SPEECH = "text_to_speech"

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_f5_tts/src/sinapsis_f5_tts/templates/f5_tts_inference.py RENAMED

@@ -6,6 +6,7 @@ from typing import Any, Literal
 import numpy as np
 import soundfile as sf
+import torch
 from pydantic import Field
 from pydantic.dataclasses import dataclass
 from sinapsis_core.data_containers.data_packet import (
@@ -15,6 +16,8 @@ from sinapsis_core.data_containers.data_packet import (
 from sinapsis_core.template_base import Template
 from sinapsis_core.template_base.base_models import OutputTypes, TemplateAttributes, UIPropertiesMetadata
+from sinapsis_f5_tts.helpers.tags import Tags
 @dataclass
 class F5CliKeys:
@@ -146,7 +149,11 @@ class F5TTSInference(Template):
     """
     AttributesBaseModel = F5TTSInferenceAttributes
-    UIProperties = UIPropertiesMetadata(category="F5TTS", output_type=OutputTypes.AUDIO)
+    UIProperties = UIPropertiesMetadata(
+        category="F5TTS",
+        output_type=OutputTypes.AUDIO,
+        tags=[Tags.AUDIO, Tags.AUDIO_GENERATION, Tags.F5TTS, Tags.SPEECH, Tags.TEXT_TO_SPEECH],
+    )
     def _add_attribute_to_command(self, cli_command: list[str], field_name: str, field: Any) -> None:
         """
@@ -357,3 +364,8 @@ class F5TTSInference(Template):
                 )
         return container
+    def reset_state(self, template_name: str | None = None) -> None:
+        if "cuda" in self.attributes.device:
+            torch.cuda.empty_cache()
+        super().reset_state(template_name)

sinapsis_speech-0.4.1/packages/sinapsis_kokoro/src/sinapsis_kokoro/helpers/tags.py ADDED

@@ -0,0 +1,10 @@
+# -*- coding: utf-8 -*-
+from enum import Enum
+class Tags(Enum):
+    AUDIO = "audio"
+    AUDIO_GENERATION = "audio_generation"
+    KOKORO = "kokoro"
+    SPEECH = "speech"
+    TEXT_TO_SPEECH = "text_to_speech"

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_kokoro/src/sinapsis_kokoro/templates/kokoro_tts.py RENAMED

@@ -1,5 +1,5 @@
 # -*- coding: utf-8 -*-
-from typing import Generator
+from typing import Generator, Literal
 from urllib.error import HTTPError
 import torch
@@ -15,6 +15,7 @@ from sinapsis_core.template_base.template import Template
 from sinapsis_core.utils.logging_utils import make_loguru
 from sinapsis_kokoro.helpers.kokoro_utils import KokoroKeys, kokoro_voices
+from sinapsis_kokoro.helpers.tags import Tags
 class KokoroTTS(Template):
@@ -39,7 +40,11 @@ class KokoroTTS(Template):
         voice: af_heart
     """
-    UIProperties = UIPropertiesMetadata(category="Kokoro", output_type=OutputTypes.AUDIO)
+    UIProperties = UIPropertiesMetadata(
+        category="Kokoro",
+        output_type=OutputTypes.AUDIO,
+        tags=[Tags.AUDIO, Tags.AUDIO_GENERATION, Tags.KOKORO, Tags.SPEECH, Tags.TEXT_TO_SPEECH],
+    )
     class AttributesBaseModel(TemplateAttributes):
         """
@@ -56,6 +61,7 @@ class KokoroTTS(Template):
             https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md
         """
+        device: Literal["cpu", "cuda"] = "cpu"
         speed: int | float = 1
         split_pattern: str = r"\n+"
         voice: kokoro_voices = KokoroKeys.default_voice
@@ -73,7 +79,7 @@ class KokoroTTS(Template):
         Returns:
             KPipeline: The initialized TTS pipeline for generating speech.
         """
-        return KPipeline(lang_code=self.attributes.voice[0], repo_id=KokoroKeys.repo_id)
+        return KPipeline(lang_code=self.attributes.voice[0], repo_id=KokoroKeys.repo_id, device=self.attributes.device)
     def _create_audio_packet(
         self,
@@ -151,3 +157,8 @@ class KokoroTTS(Template):
         self.generate_speech(container)
         return container
+    def reset_state(self, template_name: str | None = None) -> None:
+        if "cuda" in self.attributes.device:
+            torch.cuda.empty_cache()
+        super().reset_state(template_name)

sinapsis_speech-0.4.1/packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/__init__.py ADDED

File without changes

sinapsis_speech-0.4.1/packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/helpers/__init__.py ADDED

File without changes

sinapsis_speech-0.4.1/packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/helpers/tags.py ADDED

@@ -0,0 +1,10 @@
+# -*- coding: utf-8 -*-
+from enum import Enum
+class Tags(Enum):
+    AUDIO = "audio"
+    AUDIO_GENERATION = "audio_generation"
+    ORPHEUS_CPP = "orpheus_cpp"
+    SPEECH = "speech"
+    TEXT_TO_SPEECH = "text_to_speech"

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/templates/orpheus_tts.py RENAMED

@@ -1,6 +1,7 @@
 # -*- coding: utf-8 -*-
 import numpy as np
+import torch
 from llama_cpp import Llama
 from orpheus_cpp import OrpheusCpp
 from orpheus_cpp.model import TTSOptions
@@ -18,6 +19,7 @@ from sinapsis_core.template_base.base_models import (
 )
 from sinapsis_core.utils.env_var_keys import SINAPSIS_CACHE_DIR
+from sinapsis_orpheus_cpp.helpers.tags import Tags
 from sinapsis_orpheus_cpp.thirdparty.helpers import download_model, setup_snac_session
@@ -129,7 +131,11 @@ class OrpheusTTS(Template):
     """
     AttributesBaseModel = OrpheusTTSAttributes
-    UIProperties = UIPropertiesMetadata(category="TTS", output_type=OutputTypes.AUDIO)
+    UIProperties = UIPropertiesMetadata(
+        category="TTS",
+        output_type=OutputTypes.AUDIO,
+        tags=[Tags.AUDIO, Tags.AUDIO_GENERATION, Tags.ORPHEUS_CPP, Tags.SPEECH, Tags.TEXT_TO_SPEECH],
+    )
     def __init__(self, attributes: TemplateAttributeType) -> None:
         super().__init__(attributes)
@@ -154,8 +160,9 @@ class OrpheusTTS(Template):
             model_variant=self.attributes.model_variant,
             cache_dir=self.attributes.cache_dir,
         )
-        self._setup_llm(model_file)
-        self._setup_snac_session()
+        if model_file:
+            self._setup_llm(model_file)
+            self._setup_snac_session()
     def _setup_llm(self, model_file: str) -> None:
         """Setup the Large Language Model component with specified parameters.
@@ -298,3 +305,8 @@ class OrpheusTTS(Template):
                 container.audios.append(audio_packet)
         return container
+    def reset_state(self, template_name: str | None = None) -> None:
+        if torch.cuda.is_available():
+            torch.cuda.empty_cache()
+        super().reset_state(template_name)

sinapsis_speech-0.4.1/packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/__init__.py ADDED

File without changes

sinapsis_speech-0.4.1/packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/helpers/__init__.py ADDED

File without changes

sinapsis_speech-0.4.1/packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/helpers/tags.py ADDED

@@ -0,0 +1,11 @@
+# -*- coding: utf-8 -*-
+from enum import Enum
+class Tags(Enum):
+    AUDIO = "audio"
+    SPEECH = "speech"
+    SPEECH_RECOGNITION = "speech_recognition"
+    PARAKEET_TDT = "parakeet_tdt"
+    SPEECH_TO_TEXT = "speech_to_text"
+    TRANSCRIPTION = "transcription"
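
Reviewer note on the `tags.py` files added in this release: each package defines its own `Tags` enum, and several members (`AUDIO`, `SPEECH`, `TEXT_TO_SPEECH`) repeat across packages with identical string values. The sketch below uses two hypothetical, trimmed-down enums (`KokoroTags`, `OrpheusTags`) to illustrate that members of different `Enum` classes do not compare equal even when their values match, so any cross-package comparison would presumably need to go through `.value`.

```python
from enum import Enum


# Hypothetical, shortened stand-ins for two of the per-package enums.
class KokoroTags(Enum):
    AUDIO = "audio"
    KOKORO = "kokoro"


class OrpheusTags(Enum):
    AUDIO = "audio"
    ORPHEUS_CPP = "orpheus_cpp"


def shared_tag_values(first: type[Enum], second: type[Enum]) -> set[str]:
    """Return the string values that both enums define."""
    return {member.value for member in first} & {member.value for member in second}


# Members of different Enum classes are distinct, even with equal values.
same_member = KokoroTags.AUDIO == OrpheusTags.AUDIO
same_value = KokoroTags.AUDIO.value == OrpheusTags.AUDIO.value
common = shared_tag_values(KokoroTags, OrpheusTags)
```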

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/templates/parakeet_tdt.py RENAMED

@@ -3,6 +3,7 @@ import os
 from typing import Any, Literal
 import nemo.collections.asr as nemo_asr
+import torch
 from sinapsis_core.data_containers.data_packet import (
     AudioPacket,
     DataContainer,
@@ -15,6 +16,8 @@ from sinapsis_core.template_base.base_models import (
 )
 from sinapsis_core.template_base.template import Template
+from sinapsis_parakeet_tdt.helpers.tags import Tags
 class ParakeetTDTInferenceAttributes(TemplateAttributes):
     """
@@ -68,7 +71,18 @@ class ParakeetTDTInference(Template):
         refresh_cache: False
     """
-    UIProperties = UIPropertiesMetadata(category="Parakeet TDT", output_type=OutputTypes.TEXT)
+    UIProperties = UIPropertiesMetadata(
+        category="Parakeet TDT",
+        output_type=OutputTypes.TEXT,
+        tags=[
+            Tags.AUDIO,
+            Tags.SPEECH,
+            Tags.PARAKEET_TDT,
+            Tags.SPEECH_RECOGNITION,
+            Tags.SPEECH_TO_TEXT,
+            Tags.TRANSCRIPTION,
+        ],
+    )
     AttributesBaseModel = ParakeetTDTInferenceAttributes
@@ -268,3 +282,8 @@ class ParakeetTDTInference(Template):
         container.texts.extend(text_packets)
         return container
+    def reset_state(self, template_name: str | None = None) -> None:
+        if "cuda" in self.attributes.device:
+            torch.cuda.empty_cache()
+        super().reset_state(template_name)
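
The `reset_state` override added here follows the same pattern as the OrpheusTTS one earlier in this diff: release cached GPU memory, then defer to the base class. OrpheusTTS guards on `torch.cuda.is_available()`, while this template guards on the configured device string. The sketch below is an assumption-laden illustration that combines both guards; `BaseTemplate` and `GpuBackedTemplate` are hypothetical stand-ins, not the `sinapsis_core` classes.

```python
from typing import Optional

try:
    import torch  # optional: the sketch still runs when PyTorch is absent
except ImportError:
    torch = None


class BaseTemplate:
    """Hypothetical stand-in for the sinapsis_core Template base class."""

    def __init__(self) -> None:
        # Record calls so the chaining behaviour is observable.
        self.reset_calls = []

    def reset_state(self, template_name: Optional[str] = None) -> None:
        self.reset_calls.append(template_name)


class GpuBackedTemplate(BaseTemplate):
    """Hypothetical template that frees cached GPU memory before resetting."""

    def __init__(self, device: str) -> None:
        super().__init__()
        self.device = device

    def reset_state(self, template_name: Optional[str] = None) -> None:
        # Only touch the CUDA allocator when the template was configured
        # for CUDA and a CUDA runtime is actually present.
        wants_cuda = "cuda" in self.device
        if wants_cuda and torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
        # Always forward the optional name to the base implementation.
        super().reset_state(template_name)


template = GpuBackedTemplate(device="cpu")
template.reset_state("parakeet")
print(template.reset_calls)
```

Checking the device string alone relies on the configuration being accurate; checking availability alone may clear the cache even for a CPU-configured template. Combining them is one conservative option, not necessarily what the package authors intend.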

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_speech.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: sinapsis-speech
-Version: 0.4.0
+Version: 0.4.1
 Summary: Generate speech using various libraries.
 Author-email: SinapsisAI <dev@sinapsis.tech>
 Project-URL: Homepage, https://sinapsis.tech
@@ -35,7 +35,7 @@ Sinapsis Speech
 <br>
 </h1>
-<h4 align="center"> Templates for a wide range of voice generation tasks.</h4>
+<h4 align="center"> A monorepo housing multiple packages and templates for versatile voice generation, text-to-speech, speech-to-text, and beyond.</h4>
 <p align="center">
 <a href="#installation">🐍 Installation</a> •
@@ -108,10 +108,14 @@ This repository is organized into modular packages, each designed for integratio
 <details>
 <summary id="elevenlabs"><strong><span style="font-size: 1.4em;"> Sinapsis ElevenLabs </span></strong></summary>
-This package offers a suite of templates and utilities designed for effortless integrating, configuration, and execution of **text-to-speech (TTS)** and **voice generation** functionalities powered by [ElevenLabs](https://elevenlabs.io/).
+This package offers a suite of templates and utilities designed for effortless integrating, configuration, and execution of **text-to-speech (TTS)**, **speech-to-speech (STS)**, **voice cloning**, and **voice generation** functionalities powered by [ElevenLabs](https://elevenlabs.io/).
+- **ElevenLabsSTS**: Template for transforming a voice into a different character or style using the ElevenLabs Speech-to-Speech API.
 - **ElevenLabsTTS**: Template for converting text into speech using ElevenLabs' voice models.
+- **ElevenLabsVoiceClone**: Template for creating a synthetic copy of an existing voice using the ElevenLabs API.
 - **ElevenLabsVoiceGeneration**: Template for generating custom synthetic voices based on user-provided descriptions.
 For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_elevenlabs/README.md).
@@ -148,6 +152,30 @@ For specific instructions and further details, see the [README.md](https://githu
 </details>
+<details>
+<summary id="orpheus-cpp"><strong><span style="font-size: 1.4em;"> Sinapsis Orppheus-CPP</span></strong></summary>
+This package provides a template for seamlessly integrating, configuring, and running **text-to-speech (TTS)** functionalities powered by [Orpheus-TTS](https://github.com/canopyai/Orpheus-TTS).
+- **OrpheusTTS**: Converts text to speech using the Orpheus TTS model with advanced neural voice synthesis. The template processes text packets from the input container, generates corresponding audio using Orpheus TTS, and adds the resulting audio packets to the container. Features graceful error handling for out-of-memory conditions
+For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_orpheus_cpp/README.md).
+</details>
+<details>
+<summary id="parakeet-tdt"><strong><span style="font-size: 1.4em;"> Sinapsis Parakeet-TDT</span></strong></summary>
+This package provides a template for seamlessly integrating, configuring, and running **speech-to-text (STT)** functionalities powered by [NVIDIA's Parakeet TDT model](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2).
+- **ParakeetTDTInference**: Converts speech to text using NVIDIA's Parakeet TDT 0.6B model. This template processes audio packets from the input container or specified file paths, performs transcription with optional timestamp prediction, and adds the resulting text packets to the container.
+For specific instructions and further details, see the [README.md](https://github.com/Sinapsis-AI/sinapsis-speech/blob/main/packages/sinapsis_parakeet_tdt/README.md).
+</details>
 <h2 id="webapp">🌐 Webapps</h2>
 The webapps included in this project showcase the modularity of the templates, in this case for speech generation tasks.
@@ -186,7 +214,6 @@ cd sinapsis-speech
 docker compose -f docker/compose.yaml build
 ```
 2. **Start the app container**:
 - For ElevenLabs:

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_speech.egg-info/SOURCES.txt RENAMED

@@ -4,6 +4,7 @@ pyproject.toml
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/__init__.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/helpers/__init__.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/helpers/env_var_keys.py
+packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/helpers/tags.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/helpers/voice_utils.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/__init__.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_base.py
@@ -12,14 +13,24 @@ packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_tts.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_voice_clone.py
 packages/sinapsis_elevenlabs/src/sinapsis_elevenlabs/templates/elevenlabs_voice_generation.py
 packages/sinapsis_f5_tts/src/sinapsis_f5_tts/__init__.py
+packages/sinapsis_f5_tts/src/sinapsis_f5_tts/helpers/__init__.py
+packages/sinapsis_f5_tts/src/sinapsis_f5_tts/helpers/tags.py
 packages/sinapsis_f5_tts/src/sinapsis_f5_tts/templates/__init__.py
 packages/sinapsis_f5_tts/src/sinapsis_f5_tts/templates/f5_tts_inference.py
+packages/sinapsis_kokoro/src/sinapsis_kokoro/__init__.py
 packages/sinapsis_kokoro/src/sinapsis_kokoro/helpers/kokoro_utils.py
+packages/sinapsis_kokoro/src/sinapsis_kokoro/helpers/tags.py
 packages/sinapsis_kokoro/src/sinapsis_kokoro/templates/__init__.py
 packages/sinapsis_kokoro/src/sinapsis_kokoro/templates/kokoro_tts.py
+packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/__init__.py
+packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/helpers/__init__.py
+packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/helpers/tags.py
 packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/templates/__init__.py
 packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/templates/orpheus_tts.py
 packages/sinapsis_orpheus_cpp/src/sinapsis_orpheus_cpp/thirdparty/helpers.py
+packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/__init__.py
+packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/helpers/__init__.py
+packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/helpers/tags.py
 packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/templates/__init__.py
 packages/sinapsis_parakeet_tdt/src/sinapsis_parakeet_tdt/templates/parakeet_tdt.py
 packages/sinapsis_speech.egg-info/PKG-INFO
@@ -29,6 +40,7 @@ packages/sinapsis_speech.egg-info/requires.txt
 packages/sinapsis_speech.egg-info/top_level.txt
 packages/sinapsis_zonos/src/sinapsis_zonos/__init__.py
 packages/sinapsis_zonos/src/sinapsis_zonos/helpers/__init__.py
+packages/sinapsis_zonos/src/sinapsis_zonos/helpers/tags.py
 packages/sinapsis_zonos/src/sinapsis_zonos/helpers/zonos_keys.py
 packages/sinapsis_zonos/src/sinapsis_zonos/helpers/zonos_tts_utils.py
 packages/sinapsis_zonos/src/sinapsis_zonos/templates/__init__.py

sinapsis_speech-0.4.1/packages/sinapsis_zonos/src/sinapsis_zonos/__init__.py ADDED

File without changes

sinapsis_speech-0.4.1/packages/sinapsis_zonos/src/sinapsis_zonos/helpers/__init__.py ADDED

File without changes

sinapsis_speech-0.4.1/packages/sinapsis_zonos/src/sinapsis_zonos/helpers/tags.py ADDED

@@ -0,0 +1,11 @@
+# -*- coding: utf-8 -*-
+from enum import Enum
+class Tags(Enum):
+    AUDIO = "audio"
+    AUDIO_GENERATION = "audio_generation"
+    SPEECH = "speech"
+    TEXT_TO_SPEECH = "text_to_speech"
+    VOICE_CLONING = "voice_cloning"
+    ZONOS = "zonos"

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_zonos/src/sinapsis_zonos/helpers/zonos_tts_utils.py RENAMED

@@ -3,7 +3,7 @@ from typing import Set
 import torch
 import torchaudio
-from sinapsis_core.template_base.template import TemplateAttributeType
+from sinapsis_core.template_base.base_models import TemplateAttributeType
 from sinapsis_core.utils.logging_utils import sinapsis_logger
 from zonos.conditioning import make_cond_dict, supported_language_codes
 from zonos.model import Zonos

{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/packages/sinapsis_zonos/src/sinapsis_zonos/templates/zonos_tts.py RENAMED

@@ -1,14 +1,11 @@
 # -*- coding: utf-8 -*-
 """Base template for Zonos speech synthesis"""
-import os
-import uuid
 from typing import Literal, Set
 import torch
-import torchaudio
 from pydantic import Field
-from sinapsis_core.data_containers.data_packet import DataContainer, TextPacket
+from sinapsis_core.data_containers.data_packet import AudioPacket, DataContainer, TextPacket
 from sinapsis_core.template_base.base_models import (
     OutputTypes,
     TemplateAttributes,
@@ -16,10 +13,10 @@ from sinapsis_core.template_base.base_models import (
     UIPropertiesMetadata,
 )
 from sinapsis_core.template_base.template import Template
-from sinapsis_core.utils.env_var_keys import SINAPSIS_CACHE_DIR
 from zonos.model import Zonos
 from zonos.utils import DEFAULT_DEVICE as device
+from sinapsis_zonos.helpers.tags import Tags
 from sinapsis_zonos.helpers.zonos_keys import EmotionsConfig, SamplingParams, TTSKeys
 from sinapsis_zonos.helpers.zonos_tts_utils import (
     get_audio_prefix_codes,
@@ -38,7 +35,11 @@ class ZonosTTS(Template):
     and fine control over various speech attributes like pitch, speaking rate, and emotions.
     """
-    UIProperties = UIPropertiesMetadata(category="Zonos", output_type=OutputTypes.AUDIO)
+    UIProperties = UIPropertiesMetadata(
+        category="Zonos",
+        output_type=OutputTypes.AUDIO,
+        tags=[Tags.AUDIO, Tags.AUDIO_GENERATION, Tags.ZONOS, Tags.SPEECH, Tags.TEXT_TO_SPEECH, Tags.VOICE_CLONING],
+    )
     class AttributesBaseModel(TemplateAttributes):
         """
@@ -71,7 +72,7 @@ class ZonosTTS(Template):
         fmax: float = 22050.0
         language: str = TTSKeys.en_language
         model: Literal["Zyphra/Zonos-v0.1-transformer", "Zyphra/Zonos-v0.1-hybrid"] = "Zyphra/Zonos-v0.1-transformer"
-        output_folder: str = os.path.join(SINAPSIS_CACHE_DIR, "zonos", "audios")
+        # output_folder: str = os.path.join(SINAPSIS_CACHE_DIR, "zonos", "audios")
         pitch_std: float = 20.0
         prefix_audio: str | None = None
         randomized_seed: bool = True
@@ -85,7 +86,7 @@ class ZonosTTS(Template):
     def __init__(self, attributes: TemplateAttributeType) -> None:
         """Initializes the Zonos model with the provided attributes."""
         super().__init__(attributes)
-        os.makedirs(self.attributes.output_folder, exist_ok=True)
+        # os.makedirs(self.attributes.output_folder, exist_ok=True)
         self.device = device
         self.model = self._init_model()
         init_seed(self.attributes)
@@ -112,8 +113,9 @@ class ZonosTTS(Template):
             del self.model
             torch.cuda.empty_cache()
-    def reset_state(self) -> None:
+    def reset_state(self, template_name: str | None = None) -> None:
         """Reinitialize the model and random seed."""
+        _ = template_name
         self._del_model()
         self.model = self._init_model()
         init_seed(self.attributes)
@@ -154,10 +156,8 @@ class ZonosTTS(Template):
             output_audio (torch.Tensor): The generated audio output tensor.
             container (DataContainer): The container to store metadata.
         """
-        output_path = os.path.join(self.attributes.output_folder, f"{uuid.uuid4()}.{TTSKeys.wav}")
-        torchaudio.save(output_path, output_audio[0], self.model.autoencoder.sampling_rate)
-        self._set_generic_data(container, [output_path])
-        self.logger.debug(f"Audio saved to: {output_path}")
+        audio_np = output_audio[0].cpu().numpy()
+        container.audios.append(AudioPacket(content=audio_np, sample_rate=self.model.autoencoder.sampling_rate))
     def execute(self, container: DataContainer) -> DataContainer:
         """Processes the input data and generates a speech output."""

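In 0.4.1 the ZonosTTS hunk above stops writing a WAV file and instead appends an in-memory `AudioPacket(content=..., sample_rate=...)` to `container.audios`. A consumer that still needs a file has to encode the samples itself. The sketch below shows one way to do that with only the standard library. It is an illustration under stated assumptions: `AudioPacketSketch` is a hypothetical stand-in for the real `AudioPacket`, the samples are assumed to be mono floats in [-1.0, 1.0] held in a plain list (the real content is a NumPy array and may carry a channel dimension), and the 44100 Hz rate is only an example value.

```python
import io
import struct
import wave
from dataclasses import dataclass


@dataclass
class AudioPacketSketch:
    """Hypothetical stand-in for the sinapsis_core AudioPacket."""

    content: list  # assumed: mono samples as floats in [-1.0, 1.0]
    sample_rate: int


def packet_to_wav_bytes(packet: AudioPacketSketch) -> bytes:
    """Encode a mono float packet as 16-bit PCM WAV bytes."""
    # Clamp each sample to [-1.0, 1.0], then scale to the int16 range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in packet.content]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(packet.sample_rate)
        # Little-endian signed 16-bit samples, as WAV PCM expects.
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buffer.getvalue()


packet = AudioPacketSketch(content=[0.0, 0.5, -0.5, 1.0], sample_rate=44100)
wav_bytes = packet_to_wav_bytes(packet)
print(len(wav_bytes))
```

Writing `wav_bytes` to disk with `open(path, "wb")` would recover roughly what the removed `torchaudio.save` call produced, although bit depth and channel layout may differ from the original output.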
{sinapsis_speech-0.4.0 → sinapsis_speech-0.4.1}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "sinapsis-speech"
-version = "0.4.0"
+version = "0.4.1"
 description = "Generate speech using various libraries."
 authors = [
     {name = "SinapsisAI", email = "dev@sinapsis.tech"},
@@ -28,6 +28,7 @@ all = [
     "sinapsis-zonos[all]",
     "sinapsis-parakeet-tdt[all]",
     "sinapsis-orpheus-cpp[all]",
 ]
 gradio-app = [
     "sinapsis[webapp]>=0.2.3",
@@ -50,6 +51,7 @@ sinapsis-zonos = { workspace = true }
 sinapsis-speech = { workspace = true }
 sinapsis-parakeet-tdt = { workspace = true }
 sinapsis-orpheus-cpp = { workspace = true }
+sinapsis-chatterbox = { workspace = true }
 [[tool.uv.index]]