wyoming-microsoft-tts 1.3.3__tar.gz
- wyoming_microsoft_tts-1.3.3/MANIFEST.in +2 -0
- wyoming_microsoft_tts-1.3.3/PKG-INFO +92 -0
- wyoming_microsoft_tts-1.3.3/README.md +74 -0
- wyoming_microsoft_tts-1.3.3/pyproject.toml +66 -0
- wyoming_microsoft_tts-1.3.3/requirements.txt +7 -0
- wyoming_microsoft_tts-1.3.3/setup.cfg +4 -0
- wyoming_microsoft_tts-1.3.3/setup.py +46 -0
- wyoming_microsoft_tts-1.3.3/tests/__init__.py +1 -0
- wyoming_microsoft_tts-1.3.3/tests/conftest.py +26 -0
- wyoming_microsoft_tts-1.3.3/tests/test_download.py +75 -0
- wyoming_microsoft_tts-1.3.3/tests/test_microsoft_tts.py +17 -0
- wyoming_microsoft_tts-1.3.3/tests/test_voice_parsing.py +169 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/__init__.py +1 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/__main__.py +208 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/download.py +182 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/handler.py +183 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/microsoft_tts.py +62 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/sentence_boundary.py +63 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/version.py +3 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts/voices.json +12419 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts.egg-info/PKG-INFO +92 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts.egg-info/SOURCES.txt +23 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts.egg-info/dependency_links.txt +1 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts.egg-info/requires.txt +6 -0
- wyoming_microsoft_tts-1.3.3/wyoming_microsoft_tts.egg-info/top_level.txt +2 -0
|
@@ -0,0 +1,92 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: wyoming-microsoft-tts
|
|
3
|
+
Version: 1.3.3
|
|
4
|
+
Summary: Add your description here
|
|
5
|
+
Home-page: https://github.com/hugobloem/wyoming-microsoft-tts
|
|
6
|
+
Author: Hugo Bloem
|
|
7
|
+
Author-email:
|
|
8
|
+
Requires-Python: >=3.13
|
|
9
|
+
Description-Content-Type: text/markdown
|
|
10
|
+
Requires-Dist: azure-cognitiveservices-speech>=1.45.0
|
|
11
|
+
Requires-Dist: black>=25.1.0
|
|
12
|
+
Requires-Dist: lxml>=6.0.1
|
|
13
|
+
Requires-Dist: pycountry>=24.6.1
|
|
14
|
+
Requires-Dist: regex>=2025.7.34
|
|
15
|
+
Requires-Dist: wyoming>=1.7.2
|
|
16
|
+
Dynamic: author
|
|
17
|
+
Dynamic: home-page
|
|
18
|
+
|
|
19
|
+
# Wyoming Microsoft TTS
|
|
20
|
+
Wyoming protocol server for Microsoft Azure text-to-speech.
|
|
21
|
+
|
|
22
|
+
This Python package provides a Wyoming integration for Microsoft Azure text-to-speech and can be directly used with [Home Assistant](https://www.home-assistant.io/) voice and [Rhasspy](https://github.com/rhasspy/rhasspy3).
|
|
23
|
+
|
|
24
|
+
## Azure Speech Service
|
|
25
|
+
This program uses [Microsoft Azure Speech Service](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/). You can sign up for a free Azure account, which comes with a free tier of 500K characters per month. This should be enough for running a voice assistant, as each command is relatively short. In addition, Home Assistant caches the outputs, so each response is only requested once. Once this amount is exceeded, Azure could charge you for each second used (current pricing is $0.36 per audio hour). I am not responsible for any incurred charges and recommend that you set up a spending limit to reduce your exposure. However, for normal usage the free tier should suffice, and the resource should not switch to a paid service automatically.
|
|
26
|
+
|
|
27
|
+
If you have not set up a speech resource, you can follow the instructions below. You only need to do this once, and the same resource works for both [Speech-to-Text](https://github.com/hugobloem/wyoming-microsoft-stt) and [Text-to-Speech](https://github.com/hugobloem/wyoming-microsoft-tts).
|
|
28
|
+
|
|
29
|
+
1. Sign in or create an account on [portal.azure.com](https://portal.azure.com).
|
|
30
|
+
2. Create a subscription by searching for `subscription` in the search bar. [Consult Microsoft Learn for more information](https://learn.microsoft.com/en-gb/azure/cost-management-billing/manage/create-subscription#create-a-subscription-in-the-azure-portal).
|
|
31
|
+
3. Create a speech resource by searching for `speech service`.
|
|
32
|
+
4. Select the subscription you created, pick or create a resource group, select a region, pick an identifiable name, and select the pricing tier (you probably want Free F0).
|
|
33
|
+
5. Once created, copy one of the keys from the speech service page. You will need this to run this program.
|
|
34
|
+
|
|
35
|
+
|
|
36
|
+
## Installation
|
|
37
|
+
Depending on your use case there are different installation options.
|
|
38
|
+
|
|
39
|
+
- **Using pip**
|
|
40
|
+
Clone the repository and install the package using pip. Please note the platform requirements as noted [here](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/quickstarts/setup-platform?tabs=linux%2Cubuntu%2Cdotnetcli%2Cdotnet%2Cjre%2Cmaven%2Cnodejs%2Cmac%2Cpypi&pivots=programming-language-python#platform-requirements).
|
|
41
|
+
```sh
|
|
42
|
+
pip install .
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
- **Home Assistant Add-On**
|
|
46
|
+
Add the following repository as an add-on repository to your Home Assistant, or click the button below.
|
|
47
|
+
[https://github.com/hugobloem/homeassistant-addons](https://github.com/hugobloem/homeassistant-addons)
|
|
48
|
+
|
|
49
|
+
[](https://my.home-assistant.io/redirect/supervisor_add_addon_repository/?repository_url=https%3A%2F%2Fgithub.com%2Fhugobloem%2Fhomeassistant-addons)
|
|
50
|
+
|
|
51
|
+
- **Docker container**
|
|
52
|
+
To run as a Docker container use the following command:
|
|
53
|
+
```bash
|
|
54
|
+
docker run ghcr.io/hugobloem/wyoming-microsoft-tts-noha:latest --<key> <value>
|
|
55
|
+
```
|
|
56
|
+
For the relevant keys, see [the table below](#usage).
|
|
57
|
+
|
|
58
|
+
- **docker compose**
|
|
59
|
+
|
|
60
|
+
Below is a sample docker compose file. The Azure region and subscription key can be set via environment variables. Everything else must be passed as command-line arguments.
|
|
61
|
+
|
|
62
|
+
```yaml
|
|
63
|
+
wyoming-proxy-azure-tts:
|
|
64
|
+
image: ghcr.io/hugobloem/wyoming-microsoft-tts-noha
|
|
65
|
+
container_name: wyoming-azure-tts
|
|
66
|
+
ports:
|
|
67
|
+
- "10200:10200"
|
|
68
|
+
environment:
|
|
69
|
+
AZURE_SERVICE_REGION: swedencentral
|
|
70
|
+
AZURE_SUBSCRIPTION_KEY: XXX
|
|
71
|
+
command: --voice=en-GB-SoniaNeural --uri=tcp://0.0.0.0:10200
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
## Usage
|
|
75
|
+
Depending on the installation method, parameters are passed differently. However, the same options apply to every installation method and are listed in the table below. Your service region and subscription key can be found on the speech service resource page (step 5 of the Azure Speech Service instructions).
|
|
76
|
+
|
|
77
|
+
For the bare-metal Python install the program is run as follows:
|
|
78
|
+
```sh
|
|
79
|
+
python -m wyoming_microsoft_tts --<key> <value>
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
| Key | Optional | Description |
|
|
83
|
+
|---|---|---|
|
|
84
|
+
| `service-region` | No | Azure service region e.g., `uksouth` |
|
|
85
|
+
| `subscription-key` | No | Azure subscription key |
|
|
86
|
+
| `uri` | No | URI the server listens on, e.g., `tcp://0.0.0.0:10200` |
|
|
87
|
+
| `download-dir` | Yes | Directory to download voices.json into (default: /tmp/) |
|
|
88
|
+
| `voice` | Yes | Default voice to use for synthesis (default: `en-GB-SoniaNeural`) |
|
|
89
|
+
| `auto-punctuation` | Yes | Automatically add punctuation (default: `".?!"`) |
|
|
90
|
+
| `samples-per-chunk` | Yes | Number of samples per audio chunk (default: 1024) |
|
|
91
|
+
| `update-voices` | Yes | Download the latest voices.json during startup |
|
|
92
|
+
| `debug` | Yes | Log debug messages |
|
|
@@ -0,0 +1,74 @@
|
|
|
1
|
+
# Wyoming Microsoft TTS
|
|
2
|
+
Wyoming protocol server for Microsoft Azure text-to-speech.
|
|
3
|
+
|
|
4
|
+
This Python package provides a Wyoming integration for Microsoft Azure text-to-speech and can be directly used with [Home Assistant](https://www.home-assistant.io/) voice and [Rhasspy](https://github.com/rhasspy/rhasspy3).
|
|
5
|
+
|
|
6
|
+
## Azure Speech Service
|
|
7
|
+
This program uses [Microsoft Azure Speech Service](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/). You can sign up for a free Azure account, which comes with a free tier of 500K characters per month. This should be enough for running a voice assistant, as each command is relatively short. In addition, Home Assistant caches the outputs, so each response is only requested once. Once this amount is exceeded, Azure could charge you for each second used (current pricing is $0.36 per audio hour). I am not responsible for any incurred charges and recommend that you set up a spending limit to reduce your exposure. However, for normal usage the free tier should suffice, and the resource should not switch to a paid service automatically.
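To get a feel for the 500K-character free tier, the arithmetic can be sketched as follows. The response length and the number of responses per day are made-up illustrative values, not measurements, and the effect of caching is ignored.

```python
FREE_TIER_CHARS_PER_MONTH = 500_000  # free-tier allowance quoted in the README


def monthly_characters(chars_per_response: int, responses_per_day: int, days: int = 30) -> int:
    """Estimate how many characters are sent for synthesis in one month."""
    return chars_per_response * responses_per_day * days


# Hypothetical usage pattern: 100 responses a day, 80 characters each.
used = monthly_characters(chars_per_response=80, responses_per_day=100)
remaining = FREE_TIER_CHARS_PER_MONTH - used
print(f"Estimated usage: {used} characters, {remaining} left in the free tier")
```

With these assumed numbers the estimate stays under half of the allowance; plugging in your own values gives a quick sanity check before relying on the free tier.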
|
|
8
|
+
|
|
9
|
+
If you have not set up a speech resource, you can follow the instructions below. You only need to do this once, and the same resource works for both [Speech-to-Text](https://github.com/hugobloem/wyoming-microsoft-stt) and [Text-to-Speech](https://github.com/hugobloem/wyoming-microsoft-tts).
|
|
10
|
+
|
|
11
|
+
1. Sign in or create an account on [portal.azure.com](https://portal.azure.com).
|
|
12
|
+
2. Create a subscription by searching for `subscription` in the search bar. [Consult Microsoft Learn for more information](https://learn.microsoft.com/en-gb/azure/cost-management-billing/manage/create-subscription#create-a-subscription-in-the-azure-portal).
|
|
13
|
+
3. Create a speech resource by searching for `speech service`.
|
|
14
|
+
4. Select the subscription you created, pick or create a resource group, select a region, pick an identifiable name, and select the pricing tier (you probably want Free F0).
|
|
15
|
+
5. Once created, copy one of the keys from the speech service page. You will need this to run this program.
|
|
16
|
+
|
|
17
|
+
|
|
18
|
+
## Installation
|
|
19
|
+
Depending on your use case there are different installation options.
|
|
20
|
+
|
|
21
|
+
- **Using pip**
|
|
22
|
+
Clone the repository and install the package using pip. Please note the platform requirements as noted [here](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/quickstarts/setup-platform?tabs=linux%2Cubuntu%2Cdotnetcli%2Cdotnet%2Cjre%2Cmaven%2Cnodejs%2Cmac%2Cpypi&pivots=programming-language-python#platform-requirements).
|
|
23
|
+
```sh
|
|
24
|
+
pip install .
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
- **Home Assistant Add-On**
|
|
28
|
+
Add the following repository as an add-on repository to your Home Assistant, or click the button below.
|
|
29
|
+
[https://github.com/hugobloem/homeassistant-addons](https://github.com/hugobloem/homeassistant-addons)
|
|
30
|
+
|
|
31
|
+
[](https://my.home-assistant.io/redirect/supervisor_add_addon_repository/?repository_url=https%3A%2F%2Fgithub.com%2Fhugobloem%2Fhomeassistant-addons)
|
|
32
|
+
|
|
33
|
+
- **Docker container**
|
|
34
|
+
To run as a Docker container use the following command:
|
|
35
|
+
```bash
|
|
36
|
+
docker run ghcr.io/hugobloem/wyoming-microsoft-tts-noha:latest --<key> <value>
|
|
37
|
+
```
|
|
38
|
+
For the relevant keys, see [the table below](#usage).
|
|
39
|
+
|
|
40
|
+
- **docker compose**
|
|
41
|
+
|
|
42
|
+
Below is a sample docker compose file. The Azure region and subscription key can be set via environment variables. Everything else must be passed as command-line arguments.
|
|
43
|
+
|
|
44
|
+
```yaml
|
|
45
|
+
wyoming-proxy-azure-tts:
|
|
46
|
+
image: ghcr.io/hugobloem/wyoming-microsoft-tts-noha
|
|
47
|
+
container_name: wyoming-azure-tts
|
|
48
|
+
ports:
|
|
49
|
+
- "10200:10200"
|
|
50
|
+
environment:
|
|
51
|
+
AZURE_SERVICE_REGION: swedencentral
|
|
52
|
+
AZURE_SUBSCRIPTION_KEY: XXX
|
|
53
|
+
command: --voice=en-GB-SoniaNeural --uri=tcp://0.0.0.0:10200
|
|
54
|
+
```
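The compose sample sets the region and key through environment variables. How a program might resolve such settings, preferring an explicit command-line value and falling back to the environment, can be sketched as below. The helper `resolve_setting` is hypothetical and is not taken from this package; only the two variable names come from the sample.

```python
import os


def resolve_setting(cli_value, env_name, environ=os.environ):
    """Return the CLI value if given, else the environment variable, else fail."""
    if cli_value:
        return cli_value
    value = environ.get(env_name)
    if not value:
        raise ValueError(f"Missing setting: pass it on the command line or set {env_name}")
    return value


# A fake environment keeps the sketch runnable without real credentials.
fake_env = {"AZURE_SERVICE_REGION": "swedencentral", "AZURE_SUBSCRIPTION_KEY": "XXX"}
region = resolve_setting(None, "AZURE_SERVICE_REGION", fake_env)
key = resolve_setting("from-cli", "AZURE_SUBSCRIPTION_KEY", fake_env)
print(region, key)
```

Failing early with a clear message when neither source provides a value is a design choice of this sketch; the package may handle missing credentials differently.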
|
|
55
|
+
|
|
56
|
+
## Usage
|
|
57
|
+
Depending on the installation method, parameters are passed differently. However, the same options apply to every installation method and are listed in the table below. Your service region and subscription key can be found on the speech service resource page (step 5 of the Azure Speech Service instructions).
|
|
58
|
+
|
|
59
|
+
For the bare-metal Python install the program is run as follows:
|
|
60
|
+
```sh
|
|
61
|
+
python -m wyoming_microsoft_tts --<key> <value>
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
| Key | Optional | Description |
|
|
65
|
+
|---|---|---|
|
|
66
|
+
| `service-region` | No | Azure service region e.g., `uksouth` |
|
|
67
|
+
| `subscription-key` | No | Azure subscription key |
|
|
68
|
+
| `uri` | No | URI the server listens on, e.g., `tcp://0.0.0.0:10200` |
|
|
69
|
+
| `download-dir` | Yes | Directory to download voices.json into (default: /tmp/) |
|
|
70
|
+
| `voice` | Yes | Default voice to use for synthesis (default: `en-GB-SoniaNeural`) |
|
|
71
|
+
| `auto-punctuation` | Yes | Automatically add punctuation (default: `".?!"`) |
|
|
72
|
+
| `samples-per-chunk` | Yes | Number of samples per audio chunk (default: 1024) |
|
|
73
|
+
| `update-voices` | Yes | Download the latest voices.json during startup |
|
|
74
|
+
| `debug` | Yes | Log debug messages |
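The options in the table map naturally onto command-line flags. A minimal sketch using the standard-library `argparse` is shown below. It mirrors the keys and defaults from the table, but it is an illustration and not the package's actual `__main__.py`, which is not reproduced in this part of the diff. Treating `update-voices` and `debug` as boolean switches is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags mirror the README option table."""
    parser = argparse.ArgumentParser(prog="wyoming_microsoft_tts")
    # Required options ("Optional: No" in the table).
    parser.add_argument("--service-region", required=True, help="Azure service region, e.g. uksouth")
    parser.add_argument("--subscription-key", required=True, help="Azure subscription key")
    parser.add_argument("--uri", required=True, help="e.g. tcp://0.0.0.0:10200")
    # Optional options with the defaults listed in the table.
    parser.add_argument("--download-dir", default="/tmp/")
    parser.add_argument("--voice", default="en-GB-SoniaNeural")
    parser.add_argument("--auto-punctuation", default=".?!")
    parser.add_argument("--samples-per-chunk", type=int, default=1024)
    # Assumed to be simple on/off switches.
    parser.add_argument("--update-voices", action="store_true")
    parser.add_argument("--debug", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--service-region", "uksouth", "--subscription-key", "XXX", "--uri", "tcp://0.0.0.0:10200"]
)
print(args.voice, args.samples_per_chunk)
```

Note that `argparse` converts hyphens in flag names to underscores on the resulting namespace, so `--samples-per-chunk` is read back as `args.samples_per_chunk`.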
|
|
@@ -0,0 +1,66 @@
|
|
|
1
|
+
[project]
|
|
2
|
+
name = "wyoming-microsoft-tts"
|
|
3
|
+
version = "1.3.3"
|
|
4
|
+
description = "Add your description here"
|
|
5
|
+
readme = "README.md"
|
|
6
|
+
requires-python = ">=3.13"
|
|
7
|
+
dependencies = [
|
|
8
|
+
"azure-cognitiveservices-speech>=1.45.0",
|
|
9
|
+
"black>=25.1.0",
|
|
10
|
+
"lxml>=6.0.1",
|
|
11
|
+
"pycountry>=24.6.1",
|
|
12
|
+
"regex>=2025.7.34",
|
|
13
|
+
"wyoming>=1.7.2",
|
|
14
|
+
]
|
|
15
|
+
|
|
16
|
+
[dependency-groups]
|
|
17
|
+
dev = [
|
|
18
|
+
"pytest>=8.4.1",
|
|
19
|
+
"ruff>=0.12.10",
|
|
20
|
+
]
|
|
21
|
+
|
|
22
|
+
[tool.ruff]
|
|
23
|
+
lint.select = [
|
|
24
|
+
"B007", # Loop control variable {name} not used within loop body
|
|
25
|
+
"B014", # Exception handler with duplicate exception
|
|
26
|
+
"C", # complexity
|
|
27
|
+
"D", # docstrings
|
|
28
|
+
"E", # pycodestyle
|
|
29
|
+
"F", # pyflakes/autoflake
|
|
30
|
+
"ICN001", # import conventions; {name} should be imported as {asname}
|
|
31
|
+
"PGH004", # Use specific rule codes when using noqa
|
|
32
|
+
"PLC0414", # Useless import alias. Import alias does not rename original package.
|
|
33
|
+
"SIM105", # Use contextlib.suppress({exception}) instead of try-except-pass
|
|
34
|
+
"SIM117", # Merge with-statements that use the same scope
|
|
35
|
+
"SIM118", # Use {key} in {dict} instead of {key} in {dict}.keys()
|
|
36
|
+
"SIM201", # Use {left} != {right} instead of not {left} == {right}
|
|
37
|
+
"SIM212", # Use {a} if {a} else {b} instead of {b} if not {a} else {a}
|
|
38
|
+
"SIM300", # Yoda conditions. Use 'age == 42' instead of '42 == age'.
|
|
39
|
+
"SIM401", # Use get from dict with default instead of an if block
|
|
40
|
+
"T20", # flake8-print
|
|
41
|
+
"TRY004", # Prefer TypeError exception for invalid type
|
|
42
|
+
"RUF006", # Store a reference to the return value of asyncio.create_task
|
|
43
|
+
"UP", # pyupgrade
|
|
44
|
+
"W", # pycodestyle
|
|
45
|
+
]
|
|
46
|
+
|
|
47
|
+
lint.ignore = [
|
|
48
|
+
"D202", # No blank lines allowed after function docstring
|
|
49
|
+
"D203", # 1 blank line required before class docstring
|
|
50
|
+
"D213", # Multi-line docstring summary should start at the second line
|
|
51
|
+
"D404", # First word of the docstring should not be This
|
|
52
|
+
"D406", # Section name should end with a newline
|
|
53
|
+
"D407", # Section name underlining
|
|
54
|
+
"D411", # Missing blank line before section
|
|
55
|
+
"E501", # line too long
|
|
56
|
+
"E731", # do not assign a lambda expression, use a def
|
|
57
|
+
]
|
|
58
|
+
|
|
59
|
+
[tool.ruff.lint.flake8-pytest-style]
|
|
60
|
+
fixture-parentheses = false
|
|
61
|
+
|
|
62
|
+
[tool.ruff.lint.pyupgrade]
|
|
63
|
+
keep-runtime-typing = true
|
|
64
|
+
|
|
65
|
+
[tool.ruff.lint.mccabe]
|
|
66
|
+
max-complexity = 25
|
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
"""Setup file for wyoming-microsoft-tts."""
|
|
2
|
+
|
|
3
|
+
from pathlib import Path
|
|
4
|
+
|
|
5
|
+
import setuptools
|
|
6
|
+
from setuptools import setup
|
|
7
|
+
|
|
8
|
+
this_dir = Path(__file__).parent
|
|
9
|
+
module_dir = this_dir / "wyoming_microsoft_tts"
|
|
10
|
+
|
|
11
|
+
requirements = []
|
|
12
|
+
requirements_path = this_dir / "requirements.txt"
|
|
13
|
+
if requirements_path.is_file():
|
|
14
|
+
with open(requirements_path, encoding="utf-8") as requirements_file:
|
|
15
|
+
requirements = requirements_file.read().splitlines()
|
|
16
|
+
|
|
17
|
+
data_files = [module_dir / "voices.json"]
|
|
18
|
+
|
|
19
|
+
# -----------------------------------------------------------------------------
|
|
20
|
+
|
|
21
|
+
setup(
|
|
22
|
+
name="wyoming_microsoft_tts",
|
|
23
|
+
version="1.3.3",
|
|
24
|
+
description="Wyoming Server for Microsoft TTS",
|
|
25
|
+
url="https://github.com/hugobloem/wyoming-microsoft-tts",
|
|
26
|
+
author="Hugo Bloem",
|
|
27
|
+
author_email="",
|
|
28
|
+
license="MIT",
|
|
29
|
+
packages=setuptools.find_packages(),
|
|
30
|
+
package_data={
|
|
31
|
+
"wyoming_microsoft_tts": [str(p.relative_to(module_dir)) for p in data_files]
|
|
32
|
+
},
|
|
33
|
+
install_requires=requirements,
|
|
34
|
+
classifiers=[
|
|
35
|
+
"Development Status :: 3 - Alpha",
|
|
36
|
+
"Intended Audience :: Developers",
|
|
37
|
+
"Topic :: Text Processing :: Linguistic",
|
|
38
|
+
"License :: OSI Approved :: MIT License",
|
|
39
|
+
"Programming Language :: Python :: 3.7",
|
|
40
|
+
"Programming Language :: Python :: 3.8",
|
|
41
|
+
"Programming Language :: Python :: 3.9",
|
|
42
|
+
"Programming Language :: Python :: 3.10",
|
|
43
|
+
"Programming Language :: Python :: 3.11",
|
|
44
|
+
],
|
|
45
|
+
keywords="rhasspy wyoming microsoft tts",
|
|
46
|
+
)
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
"""Tests."""
|
|
@@ -0,0 +1,26 @@
|
|
|
1
|
+
"""Fixtures for tests."""
|
|
2
|
+
|
|
3
|
+
from types import SimpleNamespace
|
|
4
|
+
import pytest
|
|
5
|
+
from wyoming_microsoft_tts.microsoft_tts import MicrosoftTTS
|
|
6
|
+
import os
|
|
7
|
+
|
|
8
|
+
|
|
9
|
+
@pytest.fixture
|
|
10
|
+
def configuration():
|
|
11
|
+
"""Return configuration."""
|
|
12
|
+
return {
|
|
13
|
+
"voice": "en-GB-SoniaNeural",
|
|
14
|
+
}
|
|
15
|
+
|
|
16
|
+
|
|
17
|
+
@pytest.fixture
|
|
18
|
+
def microsoft_tts(configuration):
|
|
19
|
+
"""Return MicrosoftTTS instance."""
|
|
20
|
+
args = SimpleNamespace(
|
|
21
|
+
subscription_key=os.environ.get("SPEECH_KEY"),
|
|
22
|
+
service_region=os.environ.get("SPEECH_REGION"),
|
|
23
|
+
download_dir="/tmp/",
|
|
24
|
+
**configuration,
|
|
25
|
+
)
|
|
26
|
+
return MicrosoftTTS(args)
|
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
"""Tests for download functionality."""
|
|
2
|
+
|
|
3
|
+
import logging
|
|
4
|
+
import tempfile
|
|
5
|
+
from pathlib import Path
|
|
6
|
+
from unittest.mock import patch
|
|
7
|
+
|
|
8
|
+
from wyoming_microsoft_tts.download import get_voices
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
def test_get_voices_download_failure_logs_error(caplog):
|
|
12
|
+
"""Test that a failed download logs an error and continues with fallback."""
|
|
13
|
+
with (
|
|
14
|
+
tempfile.TemporaryDirectory() as temp_dir,
|
|
15
|
+
patch("wyoming_microsoft_tts.download.urlopen") as mock_urlopen,
|
|
16
|
+
):
|
|
17
|
+
mock_urlopen.side_effect = Exception("Network error")
|
|
18
|
+
|
|
19
|
+
# Capture logs at error level
|
|
20
|
+
with caplog.at_level(logging.ERROR):
|
|
21
|
+
# Call get_voices with update_voices=True to trigger download
|
|
22
|
+
voices = get_voices(
|
|
23
|
+
download_dir=temp_dir,
|
|
24
|
+
update_voices=True,
|
|
25
|
+
region="westus",
|
|
26
|
+
key="fake_key",
|
|
27
|
+
)
|
|
28
|
+
|
|
29
|
+
# Verify that we got an error log
|
|
30
|
+
assert len(caplog.records) > 0
|
|
31
|
+
error_logs = [
|
|
32
|
+
record for record in caplog.records if record.levelname == "ERROR"
|
|
33
|
+
]
|
|
34
|
+
assert len(error_logs) >= 1
|
|
35
|
+
|
|
36
|
+
# Check that the error message is about failed update
|
|
37
|
+
error_message = error_logs[0].message
|
|
38
|
+
assert "Failed to update voices list" in error_message
|
|
39
|
+
|
|
40
|
+
# Verify that voices are still returned (from embedded file)
|
|
41
|
+
assert isinstance(voices, dict)
|
|
42
|
+
assert len(voices) > 0 # Should have voices from embedded file
|
|
43
|
+
|
|
44
|
+
|
|
45
|
+
def test_get_voices_download_failure_uses_fallback():
|
|
46
|
+
"""Test that a failed download falls back to embedded voices."""
|
|
47
|
+
with (
|
|
48
|
+
tempfile.TemporaryDirectory() as temp_dir,
|
|
49
|
+
patch("wyoming_microsoft_tts.download.urlopen") as mock_urlopen,
|
|
50
|
+
):
|
|
51
|
+
mock_urlopen.side_effect = Exception("Network error")
|
|
52
|
+
|
|
53
|
+
# Call get_voices with update_voices=True to trigger download
|
|
54
|
+
voices = get_voices(
|
|
55
|
+
download_dir=temp_dir, update_voices=True, region="westus", key="fake_key"
|
|
56
|
+
)
|
|
57
|
+
|
|
58
|
+
# Verify that voices are still returned from embedded file
|
|
59
|
+
assert isinstance(voices, dict)
|
|
60
|
+
assert len(voices) > 0
|
|
61
|
+
|
|
62
|
+
# Verify that no downloaded file was created in temp directory
|
|
63
|
+
download_path = Path(temp_dir) / "voices.json"
|
|
64
|
+
assert not download_path.exists()
|
|
65
|
+
|
|
66
|
+
|
|
67
|
+
def test_get_voices_without_update_uses_embedded():
|
|
68
|
+
"""Test that get_voices works without update flag."""
|
|
69
|
+
with tempfile.TemporaryDirectory() as temp_dir:
|
|
70
|
+
# Call get_voices with update_voices=False (default)
|
|
71
|
+
voices = get_voices(download_dir=temp_dir)
|
|
72
|
+
|
|
73
|
+
# Should return voices from embedded file
|
|
74
|
+
assert isinstance(voices, dict)
|
|
75
|
+
assert len(voices) > 0
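The three tests above exercise a download-with-fallback pattern: try to refresh the voices list, log an error if that fails, and keep using the embedded copy. A minimal sketch of that pattern is given below. The function `load_voices`, its `fetch` parameter, and the sample data are hypothetical; the real `get_voices` in `download.py` is not shown in this part of the diff and may differ.

```python
import logging

_LOGGER = logging.getLogger(__name__)

# Stand-in for the voices.json file bundled with the package.
EMBEDDED_VOICES = {"en-GB-SoniaNeural": {"name": "Sonia"}}


def load_voices(fetch=None):
    """Return freshly fetched voices, or the embedded ones if fetching fails."""
    if fetch is not None:
        try:
            return fetch()
        except Exception:
            # Log at ERROR level (with traceback) and fall through to the fallback.
            _LOGGER.exception("Failed to update voices list")
    return EMBEDDED_VOICES


def failing_fetch():
    """Simulate a network failure during the download."""
    raise OSError("Network error")


voices = load_voices(fetch=failing_fetch)
print(sorted(voices))
```

Passing the download step in as a callable keeps the sketch testable without network access, much as the tests above patch `urlopen`.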
|
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
"""Tests for the MicrosoftTTS class."""
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
def test_initialize(microsoft_tts, configuration):
|
|
5
|
+
"""Test initialization."""
|
|
6
|
+
assert microsoft_tts.args.voice == configuration["voice"]
|
|
7
|
+
assert microsoft_tts.speech_config is not None
|
|
8
|
+
assert microsoft_tts.output_dir is not None
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
def test_synthesize(microsoft_tts):
|
|
12
|
+
"""Test synthesize."""
|
|
13
|
+
text = "Hello, world!"
|
|
14
|
+
voice = "en-US-JennyNeural"
|
|
15
|
+
|
|
16
|
+
result = microsoft_tts.synthesize(text, voice)
|
|
17
|
+
assert result.endswith(".wav")
|
|
@@ -0,0 +1,169 @@
|
|
|
1
|
+
"""Tests for voice parsing functionality."""
|
|
2
|
+
|
|
3
|
+
import json
|
|
4
|
+
from io import StringIO
|
|
5
|
+
|
|
6
|
+
|
|
7
|
+
from wyoming_microsoft_tts.download import transform_voices_files
|
|
8
|
+
|
|
9
|
+
|
|
10
|
+
def test_voice_parsing_with_script_codes():
|
|
11
|
+
"""Test that voices with script codes in locales are parsed correctly."""
|
|
12
|
+
# Sample Microsoft API response with problematic locales
|
|
13
|
+
sample_response = [
|
|
14
|
+
{
|
|
15
|
+
"ShortName": "iu-Cans-CA-SiqiniqNeural",
|
|
16
|
+
"Locale": "iu-Cans-CA",
|
|
17
|
+
"LocalName": "Siqiniq",
|
|
18
|
+
"LocaleName": "Inuktitut (Canadian Aboriginal Syllabics, Canada)",
|
|
19
|
+
"VoiceType": "Neural",
|
|
20
|
+
},
|
|
21
|
+
{
|
|
22
|
+
"ShortName": "iu-Latn-CA-TaqqiqNeural",
|
|
23
|
+
"Locale": "iu-Latn-CA",
|
|
24
|
+
"LocalName": "Taqqiq",
|
|
25
|
+
"LocaleName": "Inuktitut (Latin, Canada)",
|
|
26
|
+
"VoiceType": "Neural",
|
|
27
|
+
},
|
|
28
|
+
{
|
|
29
|
+
"ShortName": "sr-Latn-RS-NicholasNeural",
|
|
30
|
+
"Locale": "sr-Latn-RS",
|
|
31
|
+
"LocalName": "Nicholas",
|
|
32
|
+
"LocaleName": "Serbian (Latin, Serbia)",
|
|
33
|
+
"VoiceType": "Neural",
|
|
34
|
+
},
|
|
35
|
+
{
|
|
36
|
+
"ShortName": "en-US-JennyNeural",
|
|
37
|
+
"Locale": "en-US",
|
|
38
|
+
"LocalName": "Jenny",
|
|
39
|
+
"LocaleName": "English (United States)",
|
|
40
|
+
"VoiceType": "Neural",
|
|
41
|
+
},
|
|
42
|
+
]
|
|
43
|
+
|
|
44
|
+
# Create a StringIO object to simulate the API response
|
|
45
|
+
response_io = StringIO(json.dumps(sample_response))
|
|
46
|
+
|
|
47
|
+
# Transform the voices
|
|
48
|
+
voices = transform_voices_files(response_io)
|
|
49
|
+
|
|
50
|
+
# Verify that all voices were processed successfully
|
|
51
|
+
assert len(voices) == 4, f"Expected 4 voices, got {len(voices)}"
|
|
52
|
+
|
|
53
|
+
# Check that the problematic voices are included
|
|
54
|
+
assert "iu-Cans-CA-SiqiniqNeural" in voices
|
|
55
|
+
assert "iu-Latn-CA-TaqqiqNeural" in voices
|
|
56
|
+
assert "sr-Latn-RS-NicholasNeural" in voices
|
|
57
|
+
assert "en-US-JennyNeural" in voices
|
|
58
|
+
|
|
59
|
+
# Check that the voice data is properly structured
|
|
60
|
+
for _voice_name, voice_data in voices.items():
|
|
61
|
+
assert "key" in voice_data
|
|
62
|
+
assert "name" in voice_data
|
|
63
|
+
assert "language" in voice_data
|
|
64
|
+
assert "quality" in voice_data
|
|
65
|
+
assert "region" in voice_data["language"]
|
|
66
|
+
assert "country_english" in voice_data["language"]
|
|
67
|
+
|
|
68
|
+
# Verify that region and country_english are not None
|
|
69
|
+
assert voice_data["language"]["region"] is not None
|
|
70
|
+
assert voice_data["language"]["country_english"] is not None
|
|
71
|
+
|
|
72
|
+
|
|
73
|
+
def test_voice_parsing_with_secondary_locales():
|
|
74
|
+
"""Test that voices with secondary locales are parsed correctly."""
|
|
75
|
+
sample_response = [
|
|
76
|
+
{
|
|
77
|
+
"ShortName": "en-US-JennyMultilingualNeural",
|
|
78
|
+
"Locale": "en-US",
|
|
79
|
+
"LocalName": "Jenny",
|
|
80
|
+
"LocaleName": "English (United States)",
|
|
81
|
+
"VoiceType": "Neural",
|
|
82
|
+
"SecondaryLocaleList": ["de-DE", "es-ES"],
|
|
83
|
+
}
|
|
84
|
+
]
|
|
85
|
+
|
|
86
|
+
response_io = StringIO(json.dumps(sample_response))
|
|
87
|
+
voices = transform_voices_files(response_io)
|
|
88
|
+
|
|
89
|
+
# Should have 3 voices: original + 2 secondary locales
|
|
90
|
+
assert len(voices) == 3
|
|
91
|
+
assert "en-US-JennyMultilingualNeural" in voices
|
|
92
|
+
assert "de-DE-JennyMultilingualNeural" in voices
|
|
93
|
+
assert "es-ES-JennyMultilingualNeural" in voices
|
|
94
|
+
|
|
95
|
+
|
|
96
|
+
def test_voice_parsing_with_standard_locales():
|
|
97
|
+
"""Test that standard locale format (lang-COUNTRY) still works correctly."""
|
|
98
|
+
sample_response = [
|
|
99
|
+
{
|
|
100
|
+
"ShortName": "en-US-JennyNeural",
|
|
101
|
+
"Locale": "en-US",
|
|
102
|
+
"LocalName": "Jenny",
|
|
103
|
+
"LocaleName": "English (United States)",
|
|
104
|
+
"VoiceType": "Neural",
|
|
105
|
+
},
|
|
106
|
+
{
|
|
107
|
+
"ShortName": "fr-FR-DeniseNeural",
|
|
108
|
+
"Locale": "fr-FR",
|
|
109
|
+
"LocalName": "Denise",
|
|
110
|
+
"LocaleName": "French (France)",
|
|
111
|
+
"VoiceType": "Neural",
|
|
112
|
+
},
|
|
113
|
+
{
|
|
114
|
+
"ShortName": "de-DE-KatjaNeural",
|
|
115
|
+
"Locale": "de-DE",
|
|
116
|
+
"LocalName": "Katja",
|
|
117
|
+
"LocaleName": "German (Germany)",
|
|
118
|
+
"VoiceType": "Neural",
|
|
119
|
+
},
|
|
120
|
+
]
|
|
121
|
+
|
|
122
|
+
response_io = StringIO(json.dumps(sample_response))
|
|
123
|
+
voices = transform_voices_files(response_io)
|
|
124
|
+
|
|
125
|
+
# Should have all 3 voices
|
|
126
|
+
assert len(voices) == 3
|
|
127
|
+
|
|
128
|
+
# Check country mappings are correct for standard locales
|
|
129
|
+
assert voices["en-US-JennyNeural"]["language"]["region"] == "US"
|
|
130
|
+
assert voices["en-US-JennyNeural"]["language"]["country_english"] == "United States"
|
|
131
|
+
|
|
132
|
+
assert voices["fr-FR-DeniseNeural"]["language"]["region"] == "FR"
|
|
133
|
+
assert voices["fr-FR-DeniseNeural"]["language"]["country_english"] == "France"
|
|
134
|
+
|
|
135
|
+
assert voices["de-DE-KatjaNeural"]["language"]["region"] == "DE"
|
|
136
|
+
assert voices["de-DE-KatjaNeural"]["language"]["country_english"] == "Germany"
|
|
137
|
+
|
|
138
|
+
|
|
139
|
+
def test_voice_parsing_with_invalid_locales():
|
|
140
|
+
"""Test that voices with completely invalid locales use fallback values."""
|
|
141
|
+
sample_response = [
|
|
142
|
+
{
|
|
143
|
+
"ShortName": "xx-INVALID-TestNeural",
|
|
144
|
+
"Locale": "xx-INVALID",
|
|
145
|
+
"LocalName": "Test",
|
|
146
|
+
"LocaleName": "Test Language",
|
|
147
|
+
"VoiceType": "Neural",
|
|
148
|
+
},
|
|
149
|
+
{
|
|
150
|
+
"ShortName": "yy-ZZ-FAKE-TestNeural",
|
|
151
|
+
"Locale": "yy-ZZ-FAKE",
|
|
152
|
+
"LocalName": "Test2",
|
|
153
|
+
"LocaleName": "Test Language 2",
|
|
154
|
+
"VoiceType": "Neural",
|
|
155
|
+
},
|
|
156
|
+
]
|
|
157
|
+
|
|
158
|
+
response_io = StringIO(json.dumps(sample_response))
|
|
159
|
+
voices = transform_voices_files(response_io)
|
|
160
|
+
|
|
161
|
+
# Should have both voices with fallback values
|
|
162
|
+
assert len(voices) == 2
|
|
163
|
+
|
|
164
|
+
# Check fallback values are used
|
|
165
|
+
assert voices["xx-INVALID-TestNeural"]["language"]["region"] == "INVALID"
|
|
166
|
+
assert voices["xx-INVALID-TestNeural"]["language"]["country_english"] == "Unknown"
|
|
167
|
+
|
|
168
|
+
assert voices["yy-ZZ-FAKE-TestNeural"]["language"]["region"] == "FAKE"
|
|
169
|
+
assert voices["yy-ZZ-FAKE-TestNeural"]["language"]["country_english"] == "Unknown"
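The expectations in these tests suggest that the region is taken from the last component of the locale, with an optional script code in between, and that unrecognised regions fall back to "Unknown". A rough sketch consistent with those assertions is shown below. `split_locale`, `country_name`, and the small lookup table are hypothetical stand-ins (the package lists `pycountry` as a dependency, which may be what performs the real lookup); this is not the implementation of `transform_voices_files`.

```python
# Tiny illustrative lookup table; a real implementation would use a full country database.
COUNTRY_NAMES = {
    "US": "United States",
    "FR": "France",
    "DE": "Germany",
    "CA": "Canada",
    "RS": "Serbia",
}


def split_locale(locale: str):
    """Split 'lang-REGION' or 'lang-Script-REGION' into (language, script, region)."""
    parts = locale.split("-")
    language = parts[0]
    region = parts[-1]  # the last component is treated as the region
    script = parts[1] if len(parts) == 3 else None
    return language, script, region


def country_name(region: str) -> str:
    """Look up an English country name, falling back to 'Unknown'."""
    return COUNTRY_NAMES.get(region, "Unknown")


for locale in ["en-US", "iu-Cans-CA", "sr-Latn-RS", "xx-INVALID", "yy-ZZ-FAKE"]:
    language, script, region = split_locale(locale)
    print(locale, language, script, region, country_name(region))
```

Taking the last component as the region is what makes `iu-Cans-CA` resolve to `CA` rather than `Cans`, and it also matches the fallback case where `yy-ZZ-FAKE` is expected to yield the region `FAKE`.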
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
"""Wyoming server for Microsoft TTS."""
|