PyPI - soniox - Versions diffs - 2.5.0__tar.gz → 2.7.0__tar.gz - Mend

soniox 2.5.0__tar.gz → 2.7.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (106)

{soniox-2.5.0 → soniox-2.7.0}/CHANGELOG.md RENAMED

@@ -73,6 +73,38 @@ Examples:
 ---
+## [2.7.0] - 24 jun 2026
+### Added
+- `endpoint_latency_adjustment_level` field on `RealtimeSTTConfig` (integer 0–3) to fine-tune the latency/accuracy trade-off of realtime endpoint detection.
+### Changed
+- TTS REST output settings (`audio_format`, `sample_rate`, `bitrate`) now live on `CreateTtsConfig`; `generate()` and `generate_to_file()` take the utterance's `text`, `voice`, `model`, and `language` directly. Each field now has a single home (no more flat-vs-config overlap). Existing flat-keyword calls keep working (see Deprecated).
+- During the deprecation overlap, when a deprecated field is set both on the config and as a flat argument, the config value now takes precedence uniformly across STT and TTS (previously `client_reference_id` resolved the other way).
+### Deprecated
+- Passing `audio_format`, `sample_rate`, or `bitrate` as keyword arguments to TTS `generate()` / `generate_to_file()` is deprecated; set them on `CreateTtsConfig` instead. The keyword arguments still work but emit a `DeprecationWarning` and will be removed in a future major release.
+- Setting `model`, `voice`, or `language` on `CreateTtsConfig` is deprecated; pass them directly to `generate()` / `generate_to_file()` (they describe the utterance, not output encoding).
+- Relying on the default TTS `language` (`"en"`) is deprecated; pass `language` explicitly to `generate()` / `generate_to_file()`. It will become a required argument in the next major release.
+- Setting `model` or `client_reference_id` on `CreateTranscriptionConfig` is deprecated; pass them directly to the transcription `create*` calls.
+### Removed
+- The internal module constants `DEFAULT_LANGUAGE` and `DEFAULT_AUDIO_FORMAT` in `soniox.api.tts` / `soniox.api.async_tts`. The defaults (`"en"` / `"wav"`) are now applied inside payload construction. Behavior is unchanged.
+---
+## [2.6.0] - 15 jun 2026
+### Added
+- `endpoint_sensitivity` field on `RealtimeSTTConfig`: adjusts how likely the model is to emit a speech endpoint. Allowed values are between -1.0 and 1.0; the default is 0.0. Introduced in the Soniox v5 model.
+---
 ## [2.5.0] - 12 jun 2026
 ### Added
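The deprecations listed for 2.7.0 are reported through `DeprecationWarning`, which Python hides by default outside `__main__` and test runners. The following is a minimal sketch of how a caller might surface such warnings while migrating. `generate_stub` and `collect_deprecations` are hypothetical stand-ins written for this illustration; they are not part of the Soniox SDK, and only the standard `warnings` module is used.

```python
from __future__ import annotations

import warnings


def generate_stub(*, text: str, audio_format: str | None = None) -> bytes:
    """Hypothetical stand-in for a TTS generate() call; not the Soniox SDK."""
    if audio_format is not None:
        # Mimics the kind of warning the changelog describes for flat kwargs.
        warnings.warn(
            "Passing audio_format directly is deprecated; set it on the config instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    return text.encode("utf-8")


def collect_deprecations(func, /, **kwargs) -> list[str]:
    """Call func(**kwargs) and return the DeprecationWarning messages it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # disable filtering and de-duplication
        func(**kwargs)
    return [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]


old_style = collect_deprecations(generate_stub, text="hello", audio_format="mp3")
new_style = collect_deprecations(generate_stub, text="hello")
print(len(old_style), len(new_style))
```

Running a test suite with `python -W error::DeprecationWarning` is another way to turn these warnings into failures before the next major release.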

{soniox-2.5.0 → soniox-2.7.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: soniox
-Version: 2.5.0
+Version: 2.7.0
 Summary: The official Python SDK for the Soniox API (STT, REST)
 Project-URL: Homepage, https://soniox.com
 Project-URL: Documentation, https://soniox.com/docs
@@ -111,7 +111,7 @@ from soniox.utils import render_tokens, throttle_audio, start_audio_thread
 DEMO_FILE = "path_to_your_audio_file"
 client = SonioxClient()
-config = RealtimeSTTConfig(model="stt-rt-v4", audio_format="mp3")
+config = RealtimeSTTConfig(model="stt-rt-v5", audio_format="mp3")
 final_tokens: list[Token] = []
 non_final_tokens: list[Token] = []

{soniox-2.5.0 → soniox-2.7.0}/README.md RENAMED

@@ -68,7 +68,7 @@ from soniox.utils import render_tokens, throttle_audio, start_audio_thread
 DEMO_FILE = "path_to_your_audio_file"
 client = SonioxClient()
-config = RealtimeSTTConfig(model="stt-rt-v4", audio_format="mp3")
+config = RealtimeSTTConfig(model="stt-rt-v5", audio_format="mp3")
 final_tokens: list[Token] = []
 non_final_tokens: list[Token] = []

{soniox-2.5.0 → soniox-2.7.0}/docs/async_client.md RENAMED

@@ -1245,13 +1245,17 @@ AsyncTtsAPI(client: AsyncSonioxClient)
 ### generate()
 ```python
-generate(*, text: str, voice: str, model: str = DEFAULT_MODEL, language: str = DEFAULT_LANGUAGE, audio_format: TtsAudioFormat = DEFAULT_AUDIO_FORMAT, sample_rate: TtsSampleRate | None = None, bitrate: TtsBitrate | None = None, config: CreateTtsConfig | None = None) -> bytes
+generate(*, text: str, voice: str, model: str = DEFAULT_MODEL, config: CreateTtsConfig | None = None, language: str | None = None, audio_format: TtsAudioFormat | None = None, sample_rate: TtsSampleRate | None = None, bitrate: TtsBitrate | None = None) -> bytes
 ```
 Generate speech audio from text and return raw audio bytes.
 Performs a POST request to the TTS REST endpoint.
+``audio_format``/``sample_rate``/``bitrate`` are deprecated; set them on
+``CreateTtsConfig`` instead. Pass ``language`` explicitly — relying on the default
+("en") is deprecated and ``language`` will be required in the next major release.
 **Parameters**
 | Parameter | Type | Description |
@@ -1259,11 +1263,11 @@ Performs a POST request to the TTS REST endpoint.
 | `text` | `str` | Longer free-form background text, prior interaction history, reference documents, or meeting notes. |
 | `voice` | `str` | Voice identifier to generate speech audio with. |
 | `model` | `str` | Speech-to-text model to use. |
-| `language` | `str` | Language code for Text-to-Speech (e.g., "en"). |
-| `audio_format` | `TtsAudioFormat` | Audio format for realtime transcription. |
+| `config` | `CreateTtsConfig \| None` | Configuration options for this operation. |
+| `language` | `str \| None` | Language code for Text-to-Speech (e.g., "en"). |
+| `audio_format` | `TtsAudioFormat \| None` | Audio format for realtime transcription. |
 | `sample_rate` | `TtsSampleRate \| None` | Audio sample rate in Hz. |
 | `bitrate` | `TtsBitrate \| None` | Output bitrate in bits-per-second for compressed formats. |
-| `config` | `CreateTtsConfig \| None` | Configuration options for this operation. |
 **Returns**
@@ -1280,11 +1284,15 @@ Performs a POST request to the TTS REST endpoint.
 ### generate_to_file()
 ```python
-generate_to_file(output: BinaryIO | Path | str, *, text: str, voice: str = DEFAULT_VOICE, model: str = DEFAULT_MODEL, language: str = DEFAULT_LANGUAGE, audio_format: TtsAudioFormat = DEFAULT_AUDIO_FORMAT, sample_rate: TtsSampleRate | None = None, bitrate: TtsBitrate | None = None, config: CreateTtsConfig | None = None) -> int
+generate_to_file(output: BinaryIO | Path | str, *, text: str, voice: str = DEFAULT_VOICE, model: str = DEFAULT_MODEL, config: CreateTtsConfig | None = None, language: str | None = None, audio_format: TtsAudioFormat | None = None, sample_rate: TtsSampleRate | None = None, bitrate: TtsBitrate | None = None) -> int
 ```
 Generate speech audio from text and write the audio bytes to a file-like output.
+``audio_format``/``sample_rate``/``bitrate`` are deprecated; set them on
+``CreateTtsConfig`` instead. Pass ``language`` explicitly — relying on the default
+("en") is deprecated and ``language`` will be required in the next major release.
 **Parameters**
 | Parameter | Type | Description |
@@ -1293,11 +1301,11 @@ Generate speech audio from text and write the audio bytes to a file-like output.
 | `text` | `str` | Longer free-form background text, prior interaction history, reference documents, or meeting notes. |
 | `voice` | `str` | Voice identifier to generate speech audio with. |
 | `model` | `str` | Speech-to-text model to use. |
-| `language` | `str` | Language code for Text-to-Speech (e.g., "en"). |
-| `audio_format` | `TtsAudioFormat` | Audio format for realtime transcription. |
+| `config` | `CreateTtsConfig \| None` | Configuration options for this operation. |
+| `language` | `str \| None` | Language code for Text-to-Speech (e.g., "en"). |
+| `audio_format` | `TtsAudioFormat \| None` | Audio format for realtime transcription. |
 | `sample_rate` | `TtsSampleRate \| None` | Audio sample rate in Hz. |
 | `bitrate` | `TtsBitrate \| None` | Output bitrate in bits-per-second for compressed formats. |
-| `config` | `CreateTtsConfig \| None` | Configuration options for this operation. |
 **Returns**

{soniox-2.5.0 → soniox-2.7.0}/docs/types.md RENAMED

@@ -171,9 +171,9 @@ Helper config used when building Text-to-Speech payloads.
 | Property | Type | Description |
 | ------ | ------ | ------ |
-| `model` | `str \| None` | Text-to-Speech model to use. |
-| `language` | `str \| None` | Language code for Text-to-Speech (e.g., "en"). |
-| `voice` | `str \| None` | Voice identifier to generate speech audio with. |
+| `model` | `str \| None` | Deprecated: pass ``model`` to generate()/generate_to_file() instead. |
+| `language` | `str \| None` | Deprecated: pass ``language`` to generate()/generate_to_file() instead. |
+| `voice` | `str \| None` | Deprecated: pass ``voice`` to generate()/generate_to_file() instead. |
 | `audio_format` | `TtsAudioFormat \| None` | Requested output audio format. |
 | `sample_rate` | `TtsSampleRate \| None` | Output sample rate in Hz. |
 | `bitrate` | `TtsBitrate \| None` | Output bitrate in bits-per-second for compressed formats. |
@@ -216,7 +216,7 @@ Helper config used when building transcription payloads.
 | Property | Type | Description |
 | ------ | ------ | ------ |
-| `model` | `str \| None` | Speech-to-text model to use. |
+| `model` | `str \| None` | Deprecated: pass ``model`` to the create call instead. |
 | `language_hints` | `list[LanguageCode] \| None` | Array of expected ISO language codes to bias recognition. |
 | `language_hints_strict` | `bool \| None` | When true, model relies more heavily on language hints. |
 | `enable_speaker_diarization` | `bool \| None` | Enable speaker diarization to identify different speakers. |
@@ -226,7 +226,7 @@ Helper config used when building transcription payloads.
 | `webhook_url` | `str \| None` | URL to receive webhook notifications when transcription is completed or fails. |
 | `webhook_auth_header_name` | `str \| None` | Name of the authentication header sent with webhook notifications |
 | `webhook_auth_header_value` | `str \| None` | Authentication header value sent with webhook notifications |
-| `client_reference_id` | `str \| None` | Optional tracking identifier |
+| `client_reference_id` | `str \| None` | Deprecated: pass ``client_reference_id`` to the create call instead. |
 ---
@@ -502,7 +502,15 @@ Audio formats accepted by the realtime STT websocket.
 ```python
 RealtimeSTTHeaderFormat = Literal[
-    "aac", "aiff", "amr", "asf", "flac", "mp3", "ogg", "wav", "webm",
+    "aac",
+    "aiff",
+    "amr",
+    "asf",
+    "flac",
+    "mp3",
+    "ogg",
+    "wav",
+    "webm",
 ]
 ```
@@ -517,16 +525,25 @@ Container formats whose header carries sample rate and channels.
 ```python
 RealtimeSTTRawFormat = Literal[
     "pcm_s8",
-    "pcm_s16le", "pcm_s16be",
-    "pcm_s24le", "pcm_s24be",
-    "pcm_s32le", "pcm_s32be",
+    "pcm_s16le",
+    "pcm_s16be",
+    "pcm_s24le",
+    "pcm_s24be",
+    "pcm_s32le",
+    "pcm_s32be",
     "pcm_u8",
-    "pcm_u16le", "pcm_u16be",
-    "pcm_u24le", "pcm_u24be",
-    "pcm_u32le", "pcm_u32be",
-    "pcm_f32le", "pcm_f32be",
-    "pcm_f64le", "pcm_f64be",
-    "mulaw", "alaw",
+    "pcm_u16le",
+    "pcm_u16be",
+    "pcm_u24le",
+    "pcm_u24be",
+    "pcm_u32le",
+    "pcm_u32be",
+    "pcm_f32le",
+    "pcm_f32be",
+    "pcm_f64le",
+    "pcm_f64be",
+    "mulaw",
+    "alaw",
 ]
 ```
@@ -992,6 +1009,8 @@ Configuration for initiating a realtime transcription session.
 | `enable_language_identification` | `bool \| None` | Enable automatic language detection. |
 | `enable_endpoint_detection` | `bool \| None` | Enable endpoint detection for utterance boundaries. |
 | `max_endpoint_delay_ms` | `int \| None` | Maximum delay between the end of speech and returned endpoint. Allowed values for maximum delay are between 500ms and 3000ms. The default value is 2000ms |
+| `endpoint_sensitivity` | `float \| None` | Adjusts how likely the model is to emit an endpoint. Higher values make endpoints more likely (finalizing sooner); lower values make them less likely. Allowed values are between -1.0 and 1.0; the default is 0.0. Introduced in the Soniox v5 model; earlier models reject it. |
+| `endpoint_latency_adjustment_level` | `int \| None` | Fine-tunes the latency/accuracy trade-off of endpoint detection. Allowed values are integers from 0 to 3. |
 | `translation` | `TranslationConfigInput \| None` | Translation configuration. |
 | `client_reference_id` | `str \| None` | Optional tracking identifier (max 256 chars). |
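The table above documents value ranges for the endpoint fields: 500 to 3000 ms for `max_endpoint_delay_ms`, -1.0 to 1.0 for `endpoint_sensitivity`, and integers 0 to 3 for `endpoint_latency_adjustment_level`. The diff does not show where these ranges are enforced, so the sketch below is only a hypothetical client-side pre-check. `EndpointSettings` is an illustrative class, not an SDK type, and treating the bounds as inclusive is an assumption.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EndpointSettings:
    """Hypothetical local mirror of three RealtimeSTTConfig fields; not an SDK type."""

    max_endpoint_delay_ms: int | None = None
    endpoint_sensitivity: float | None = None
    endpoint_latency_adjustment_level: int | None = None

    def __post_init__(self) -> None:
        # Bounds are taken from the documentation table; inclusivity is assumed.
        delay = self.max_endpoint_delay_ms
        if delay is not None and not 500 <= delay <= 3000:
            raise ValueError("max_endpoint_delay_ms must be between 500 and 3000")
        sensitivity = self.endpoint_sensitivity
        if sensitivity is not None and not -1.0 <= sensitivity <= 1.0:
            raise ValueError("endpoint_sensitivity must be between -1.0 and 1.0")
        level = self.endpoint_latency_adjustment_level
        if level is not None:
            # bool is a subclass of int, so it is rejected explicitly.
            if isinstance(level, bool) or not isinstance(level, int) or not 0 <= level <= 3:
                raise ValueError("endpoint_latency_adjustment_level must be an integer 0..3")

    def as_kwargs(self) -> dict[str, int | float]:
        """Return only the fields that were set, ready to pass as keyword arguments."""
        return {k: v for k, v in asdict(self).items() if v is not None}


settings = EndpointSettings(endpoint_sensitivity=0.5, endpoint_latency_adjustment_level=2)
print(settings.as_kwargs())
```

The resulting dictionary could then be unpacked into a config constructor. Per the table, `endpoint_sensitivity` was introduced with the v5 model and earlier models reject it.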

{soniox-2.5.0 → soniox-2.7.0}/examples/async_soniox_client/realtime_example.py RENAMED

@@ -10,7 +10,7 @@ DEMO_FILE = Path(__file__).resolve().parents[2] / "assets" / "coffee_shop.mp3"
 async def main() -> None:
     client = AsyncSonioxClient()
-    config = RealtimeSTTConfig(model="stt-rt-v4", audio_format="mp3")
+    config = RealtimeSTTConfig(model="stt-rt-v5", audio_format="mp3")
     final_tokens: list[Token] = []
     non_final_tokens: list[Token] = []
     async with client.realtime.stt.connect(config=config) as session:

{soniox-2.5.0 → soniox-2.7.0}/examples/async_soniox_client/tts_realtime_example.py RENAMED

@@ -40,9 +40,7 @@ async def main() -> None:
     try:
         async with client.realtime.tts.connect(config=config) as connection:
             send_task = asyncio.create_task(
-                connection.send_text_chunks(
-                    _iter_text_chunks(TEXT_CHUNKS), text_end=True
-                ),
+                connection.send_text_chunks(_iter_text_chunks(TEXT_CHUNKS), text_end=True),
                 name="tts-async-sender",
             )
             try:

{soniox-2.5.0 → soniox-2.7.0}/examples/async_soniox_client/tts_realtime_multiplexed_example.py RENAMED

@@ -69,9 +69,7 @@ def write_outputs(audio_by_stream: dict[str, bytes]) -> None:
         )
         if audio:
             output_path.write_bytes(audio)
-            print(
-                f"Wrote stream {key.upper()} ({len(audio)} bytes) to {output_path.resolve()}"
-            )
+            print(f"Wrote stream {key.upper()} ({len(audio)} bytes) to {output_path.resolve()}")
         else:
             print(f"No audio file was written for stream {key.upper()}.")
@@ -96,8 +94,7 @@ async def main() -> None:
     try:
         async with client.realtime.tts.connect_multi_stream() as connection:
             streams = {
-                key: await connection.open_stream(config=configs[key])
-                for key in sorted(configs)
+                key: await connection.open_stream(config=configs[key]) for key in sorted(configs)
             }
             receiver_tasks = [
@@ -124,9 +121,7 @@ async def main() -> None:
                 if isinstance(result, BaseException):
                     errors.append(result)
-            receiver_results = await asyncio.gather(
-                *receiver_tasks, return_exceptions=True
-            )
+            receiver_results = await asyncio.gather(*receiver_tasks, return_exceptions=True)
             for key, result in zip(streams.keys(), receiver_results, strict=True):
                 if isinstance(result, BaseException):
                     errors.append(result)

{soniox-2.5.0 → soniox-2.7.0}/examples/soniox_client/realtime_example.py RENAMED

@@ -9,7 +9,7 @@ DEMO_FILE = Path(__file__).resolve().parents[2] / "assets" / "coffee_shop.mp3"
 def main() -> None:
     client = SonioxClient()
-    config = RealtimeSTTConfig(model="stt-rt-v4", audio_format="mp3")
+    config = RealtimeSTTConfig(model="stt-rt-v5", audio_format="mp3")
     final_tokens: list[Token] = []
     non_final_tokens: list[Token] = []
     with client.realtime.stt.connect(config=config) as session:

{soniox-2.5.0 → soniox-2.7.0}/examples/soniox_client/tts_realtime_multiplexed_example.py RENAMED

@@ -74,9 +74,7 @@ def write_outputs(audio_by_stream: dict[str, bytes]) -> None:
         )
         if audio:
             output_path.write_bytes(audio)
-            print(
-                f"Wrote stream {key.upper()} ({len(audio)} bytes) to {output_path.resolve()}"
-            )
+            print(f"Wrote stream {key.upper()} ({len(audio)} bytes) to {output_path.resolve()}")
         else:
             print(f"No audio file was written for stream {key.upper()}.")
@@ -101,10 +99,7 @@ def main() -> None:
     try:
         with client.realtime.tts.connect_multi_stream() as connection:
-            streams = {
-                key: connection.open_stream(config=configs[key])
-                for key in sorted(configs)
-            }
+            streams = {key: connection.open_stream(config=configs[key]) for key in sorted(configs)}
             receiver_threads = [
                 threading.Thread(
@@ -135,9 +130,7 @@ def main() -> None:
             with errors_lock:
                 for exc in errors:
-                    print(
-                        "Realtime multiplexed TTS error (keeping partial audio):", exc
-                    )
+                    print("Realtime multiplexed TTS error (keeping partial audio):", exc)
     finally:
         write_outputs(audio_by_stream)
         client.close()

{soniox-2.5.0 → soniox-2.7.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "soniox"
-version = "2.5.0"
+version = "2.7.0"
 dependencies = ["httpx>0.25.0", "websockets>11.0", "pydantic>2"]
 requires-python = ">=3.10"
 authors = [{ name = "Soniox", email = "support@soniox.com" }]

{soniox-2.5.0 → soniox-2.7.0}/scripts/generate_docs.py RENAMED

@@ -281,7 +281,9 @@ def parse_raises_entries(lines: list[str]) -> list[tuple[str, str]]:
         match = re.match(r"^\s*([A-Za-z_][\w\.\[\], ]*)\s*:\s*(.*)$", line)
         if match:
             if current_name is not None:
-                entries.append((current_name.strip(), clean_paragraph_block("\n".join(current_desc))))
+                entries.append(
+                    (current_name.strip(), clean_paragraph_block("\n".join(current_desc)))
+                )
             current_name = match.group(1)
             current_desc = [match.group(2).strip()]
         elif current_name is not None:
@@ -343,7 +345,11 @@ def get_parsed_doc(obj: Object) -> ParsedDoc:
 def extract_name_target(node: ast.AST) -> str | None:
-    if isinstance(node, ast.Assign) and len(node.targets) == 1 and isinstance(node.targets[0], ast.Name):
+    if (
+        isinstance(node, ast.Assign)
+        and len(node.targets) == 1
+        and isinstance(node.targets[0], ast.Name)
+    ):
         return node.targets[0].id
     if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
         return node.target.id
@@ -434,7 +440,13 @@ def parse_dunder_all(path: Path) -> list[str]:
         name = extract_name_target(stmt)
         if name != "__all__":
             continue
-        value = stmt.value if isinstance(stmt, ast.Assign) else stmt.value if isinstance(stmt, ast.AnnAssign) else None
+        value = (
+            stmt.value
+            if isinstance(stmt, ast.Assign)
+            else stmt.value
+            if isinstance(stmt, ast.AnnAssign)
+            else None
+        )
         if not isinstance(value, (ast.List, ast.Tuple)):
             continue
         exports: list[str] = []
@@ -593,7 +605,9 @@ def get_call_name(call: ast.Call) -> str | None:
     return None
-def iter_function_nodes(tree: ast.Module) -> list[tuple[str, ast.FunctionDef | ast.AsyncFunctionDef]]:
+def iter_function_nodes(
+    tree: ast.Module,
+) -> list[tuple[str, ast.FunctionDef | ast.AsyncFunctionDef]]:
     nodes: list[tuple[str, ast.FunctionDef | ast.AsyncFunctionDef]] = []
     for stmt in tree.body:
         if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef)):
@@ -695,7 +709,7 @@ def format_constructor_signature(cls: Class, constructor: Function) -> str:
     sig_text = str(constructor.signature())
     match = re.match(r"^__init__\((.*)\)\s*->\s*None$", sig_text)
     if not match:
-        return f"{cls.name}{sig_text[sig_text.find('('):]}"
+        return f"{cls.name}{sig_text[sig_text.find('(') :]}"
     params = match.group(1).strip()
     if params.startswith("self, "):
         params = params[len("self, ") :]

soniox-2.7.0/src/soniox/api/_utils.py ADDED

@@ -0,0 +1,227 @@
+from __future__ import annotations
+import io
+import warnings
+from pathlib import Path
+from typing import BinaryIO, TypeVar
+import httpx
+from pydantic import BaseModel
+from ..errors import SonioxAPIError, SonioxValidationError
+from ..types import (
+    CreateTranscriptionConfig,
+    CreateTranscriptionPayload,
+    CreateTtsConfig,
+    CreateTtsPayload,
+    LanguageCode,
+    TranslationConfig,
+)
+ModelT = TypeVar("ModelT", bound=BaseModel)
+def _warn_deprecated_config_fields(
+    config: BaseModel | None, names: tuple[str, ...], advice: str
+) -> None:
+    """Emit a DeprecationWarning if any of ``names`` was explicitly set on ``config``.
+    Read via ``model_fields_set`` (not attribute access) so it never fires for
+    unset fields and never double-warns with pydantic's own ``deprecated=`` hook.
+    """
+    if config is None:
+        return
+    used = [n for n in names if n in config.model_fields_set]
+    if used:
+        warnings.warn(
+            f"Setting {', '.join(used)} on {type(config).__name__} is deprecated; "
+            f"{advice}. This will be removed in the next major release.",
+            DeprecationWarning,
+            stacklevel=3,
+        )
+def ensure_success(response: httpx.Response) -> None:
+    if response.is_error:
+        raise SonioxAPIError.from_response(response)
+def parse_response(response: httpx.Response, model: type[ModelT]) -> ModelT:
+    ensure_success(response)
+    payload = response.json()
+    return model.model_validate(payload)
+async def parse_async_response(response: httpx.Response, model: type[ModelT]) -> ModelT:
+    ensure_success(response)
+    payload = response.json()
+    return model.model_validate(payload)
+def normalize_file(
+    file: BinaryIO | bytes | Path | str,
+    filename: str | None = None,
+) -> tuple[BinaryIO, str, bool]:
+    """Return (file-like, filename, should_close) tuple for upload."""
+    if isinstance(file, bytes | bytearray):
+        file_obj = io.BytesIO(file)
+        effective_name = filename or "upload.bin"
+        return file_obj, effective_name, True
+    if isinstance(file, Path):
+        file_obj = file.open("rb")
+        effective_name = filename or file.name
+        return file_obj, effective_name, True
+    if isinstance(file, str):
+        return normalize_file(Path(file), filename=filename)
+    if isinstance(file, io.IOBase):
+        effective_name = filename or getattr(file, "name", "upload.bin")
+        return file, effective_name, False
+    raise TypeError("file must be bytes, Path, or file-like stream.")
+def build_create_payload(
+    *,
+    model: str,
+    file_id: str | None,
+    audio_url: str | None,
+    client_reference_id: str | None,
+    config: CreateTranscriptionConfig | None,
+) -> CreateTranscriptionPayload:
+    _warn_deprecated_config_fields(
+        config,
+        ("model", "client_reference_id"),
+        "pass it directly to the create call instead",
+    )
+    config_data = config.model_dump(exclude_none=True) if config else {}
+    model_override = config_data.pop("model", None)
+    client_ref_override = config_data.pop("client_reference_id", None)
+    return CreateTranscriptionPayload.model_validate(
+        {
+            "model": model_override if model_override is not None else model,
+            "file_id": file_id,
+            "audio_url": audio_url,
+            "client_reference_id": (
+                client_ref_override if client_ref_override is not None else client_reference_id
+            ),
+            **config_data,
+        }
+    )
+def build_tts_payload(
+    *,
+    text: str,
+    voice: str,
+    model: str,
+    config: CreateTtsConfig | None,
+    language: str | None = None,
+    audio_format: str | None = None,
+    sample_rate: int | None = None,
+    bitrate: int | None = None,
+) -> CreateTtsPayload:
+    # ponytail: deprecation shim. Next major — make `language` required (drop its default +
+    # the omit-language warn), delete the flat audio_format/sample_rate/bitrate kwargs + their
+    # warn block, drop model/voice/language from CreateTtsConfig + the _warn_deprecated_config_fields
+    # call + the override dance. Body collapses to: model_dump → defaults → model_validate.
+    # Don't delete _warn_deprecated_config_fields or `import warnings` until STT's shim goes too.
+    """Assemble a TTS payload from flat identity args (``text``/``voice``/``model``/
+    ``language``) plus a ``config`` settings bag (``audio_format``/``sample_rate``/``bitrate``).
+    Deprecated, kept one release: the flat ``audio_format``/``sample_rate``/``bitrate`` kwargs
+    (move them to ``config``); ``model``/``voice``/``language`` set on ``config`` (pass them
+    flat); and omitting ``language`` — it will be required in the next major release.
+    """
+    deprecated_flat = {
+        k: v
+        for k, v in {
+            "audio_format": audio_format,
+            "sample_rate": sample_rate,
+            "bitrate": bitrate,
+        }.items()
+        if v is not None
+    }
+    if deprecated_flat:
+        warnings.warn(
+            f"Passing {', '.join(deprecated_flat)} directly to generate()/generate_to_file() "
+            "is deprecated; set them on CreateTtsConfig instead. This will be removed in the "
+            "next major release.",
+            DeprecationWarning,
+            stacklevel=3,
+        )
+    _warn_deprecated_config_fields(
+        config,
+        ("model", "voice", "language"),
+        "pass it directly to generate()/generate_to_file() instead",
+    )
+    config_data = config.model_dump(exclude_none=True) if config else {}
+    settings = {**deprecated_flat, **config_data}  # config wins for output settings
+    # settings never holds None (exclude_none + deprecated_flat filter), so absent-key is
+    # the only fallback case: get(key, default) and is-None, never falsy `or`.
+    voice_override = settings.pop("voice", None)
+    model_override = settings.pop("model", None)
+    config_language = settings.pop("language", None)
+    # language is identity, not a setting: the flat arg is the blessed path and wins; a
+    # config value is honored (deprecated) next; relying on the "en" default is deprecated.
+    if language is not None:
+        resolved_language = language
+    elif config_language is not None:
+        resolved_language = config_language
+    else:
+        warnings.warn(
+            "Relying on the default Text-to-Speech language 'en' is deprecated; pass "
+            "language= explicitly. It will be required in the next major release.",
+            DeprecationWarning,
+            stacklevel=3,
+        )
+        resolved_language = "en"
+    return CreateTtsPayload.model_validate(
+        {
+            "text": text,
+            "voice": voice if voice_override is None else voice_override,
+            "model": model if model_override is None else model_override,
+            "language": resolved_language,
+            "audio_format": settings.get("audio_format", "wav"),
+            "sample_rate": settings.get("sample_rate"),
+            "bitrate": settings.get("bitrate"),
+        }
+    )
+def build_translate_config(
+    *,
+    to: LanguageCode | None,
+    source: LanguageCode | None,
+    between: tuple[LanguageCode, LanguageCode] | None,
+    config: CreateTranscriptionConfig | None,
+) -> CreateTranscriptionConfig:
+    """Return a config with translation and language fields populated from the kwargs.
+    Requires exactly one of ``to`` or ``between``. ``source`` is only valid with ``to``
+    and is passed as a strict language hint. Forces ``enable_language_identification=True``.
+    Other config fields are preserved.
+    """
+    if (to is None) == (between is None):
+        raise SonioxValidationError("Provide exactly one of `to` or `between`")
+    if source is not None and to is None:
+        raise SonioxValidationError("`source` is only valid with `to`")
+    base = config.model_copy() if config else CreateTranscriptionConfig()
+    if to is not None:
+        base.translation = TranslationConfig(type="one_way", target_language=to)
+        if source:
+            base.language_hints = [source]
+            base.language_hints_strict = True
+    else:
+        assert between is not None  # validated above
+        a, b = between
+        base.translation = TranslationConfig(type="two_way", language_a=a, language_b=b)
+    base.enable_language_identification = True
+    return base
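The precedence rules in `build_tts_payload` above can be easy to misread, because output settings and `language` resolve in opposite directions: for `audio_format`, `sample_rate`, and `bitrate` the config value wins over the flat argument, while for `language` the flat argument wins over the config value. The sketch below reproduces just that resolution order with plain dictionaries. `resolve_tts_fields` is a hypothetical helper written for this illustration, not an SDK function, and it omits the warnings and pydantic validation of the real code.

```python
from __future__ import annotations


def resolve_tts_fields(
    *,
    flat_language: str | None = None,
    flat_settings: dict[str, object] | None = None,
    config: dict[str, object] | None = None,
) -> dict[str, object]:
    """Hypothetical dict-based mirror of the resolution order in build_tts_payload."""
    # Drop None values, as exclude_none and the deprecated_flat filter do above.
    flat = {k: v for k, v in (flat_settings or {}).items() if v is not None}
    cfg = {k: v for k, v in (config or {}).items() if v is not None}
    settings = {**flat, **cfg}  # later keys win, so config overrides flat settings

    # language is resolved separately: flat argument, then config, then "en".
    config_language = settings.pop("language", None)
    if flat_language is not None:
        language = flat_language
    elif config_language is not None:
        language = config_language
    else:
        language = "en"

    return {
        "language": language,
        "audio_format": settings.get("audio_format", "wav"),
        "sample_rate": settings.get("sample_rate"),
        "bitrate": settings.get("bitrate"),
    }


resolved = resolve_tts_fields(
    flat_language="de",
    flat_settings={"audio_format": "mp3"},
    config={"audio_format": "flac", "language": "fr"},
)
print(resolved["language"], resolved["audio_format"])
```

In the diff, `voice` and `model` follow the config-wins direction as well: a value set on the config overrides the flat argument during the deprecation overlap.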

{soniox-2.5.0 → soniox-2.7.0}/src/soniox/api/async_stt.py RENAMED

@@ -694,4 +694,3 @@ class AsyncSttAPI:
             wait_timeout_sec=wait_timeout_sec,
             config=build_translate_config(to=to, source=source, between=between, config=config),
         )
