scribe-cli 0.12.1__tar.gz → 0.12.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (31)
  1. {scribe_cli-0.12.1/scribe_cli.egg-info → scribe_cli-0.12.3}/PKG-INFO +79 -28
  2. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/README.md +73 -26
  3. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/pyproject.toml +5 -1
  4. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/_version.py +9 -4
  5. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/app.py +1 -1
  6. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/models.py +3 -0
  7. {scribe_cli-0.12.1 → scribe_cli-0.12.3/scribe_cli.egg-info}/PKG-INFO +79 -28
  8. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/SOURCES.txt +2 -1
  9. scribe_cli-0.12.3/scripts/test_python_versions_install.sh +20 -0
  10. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/.github/workflows/pypi.yml +0 -0
  11. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/.gitignore +0 -0
  12. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/LICENSE +0 -0
  13. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/icon.xcf +0 -0
  14. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/__init__.py +0 -0
  15. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/audio.py +0 -0
  16. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/install_desktop.py +0 -0
  17. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/keyboard.py +0 -0
  18. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/models.toml +0 -0
  19. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/saverecording.py +0 -0
  20. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/testpynput.py +0 -0
  21. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/util.py +0 -0
  22. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/dependency_links.txt +0 -0
  23. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/entry_points.txt +0 -0
  24. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/requires.txt +0 -0
  25. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/top_level.txt +0 -0
  26. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/__init__.py +0 -0
  27. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/share/icon.png +0 -0
  28. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/share/icon_recording.png +0 -0
  29. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/share/icon_writing.png +0 -0
  30. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/templates/scribe.desktop +0 -0
  31. {scribe_cli-0.12.1 → scribe_cli-0.12.3}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.12.1
+ Version: 0.12.3
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -34,7 +34,11 @@ License: MIT License
  ensure compliance with their respective terms.
  Project-URL: Homepage, https://github.com/perrette/scribe
  Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
- Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
  Classifier: Operating System :: OS Independent
  Requires-Python: >=3.9
  Description-Content-Type: text/markdown
@@ -66,8 +70,8 @@ Requires-Dist: soundfile; extra == "all"
  Requires-Dist: vosk; extra == "all"
  Requires-Dist: pystray; extra == "all"

- [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
  [![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)
+ ![](https://img.shields.io/python/required-version-toml?tomlFilePath=https%3A%2F%2Fraw.githubusercontent.com%2Fperrette%2Fscribe%2Frefs%2Fheads%2Fmain%2Fpyproject.toml)

  # Scribe <img src="scribe_data/share/icon.png" width=48px>

@@ -77,22 +81,31 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a

  ## Compatibility

- In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
- Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
- and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
- This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
- Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
- A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).
+ The package is initially developped for python 3.12 with Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
+ Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
+
+ - python 3.13:
+   - at the time of writing, `openai-whisper` does not install.
+
+ - Ubuntu:
+   - see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
+ - MacOS:
+   - tested on a Macbook Air M1 8Gb RAM, with python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
+   - I expect better memory specs will have the local models run fine
+ - Windows:
+   - not tested yet

  ## Installation

- Install PortAudio library and xclip library. E.g. on Ubuntu:
+ Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:

  ```bash
  sudo apt-get install portaudio19-dev xclip
  ```

- See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
+ (`portaudio19-dev` becomes `portaudio ` with homebrew)
+
+ See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:

  ```bash
  pip install scribe-cli[all]
@@ -110,6 +123,37 @@ pip install -e .[all]

  You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).

+ At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
+
+ ### Manual selection of the dependencies
+
+ ```bash
+ # language models (at least one must be installed !)
+ pip install vosk
+ pip install openai soundfile # openaiapi
+ pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
+
+ # PortAUDIO (sounddevice)
+ pip install sounddevice # automatically installed as required dependency
+ sudo apt-get install portaudio19-dev
+ # MAC OS: brew install portaudio
+
+ # clipboard
+ pip install pyperclip # automatically installed as required dependency
+ sudo apt-get install xclip
+
+ # keyboard
+ pip install pynput
+
+ # app mode
+ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
+ pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
+ pip install pystray
+
+ # And finally
+ pip install scribe-cli
+ ```
+
  The language models for local backends `vosk` and `whisper` will download on-the-fly.
  The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.

@@ -134,11 +178,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
  It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
  There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).

- The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
+ The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
  ```bash
- scribe --backend openaiapi --api YOURAPIKEY
+ export OPENAI_API_KEY=YOURAPIKEY
+ scribe --backend openaiapi
  ```
- where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
+ The `openaiapi` backend is lightweight and handy if you have an API (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).

  ## Output media

@@ -174,9 +219,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
  The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
  Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).

- #### Use the keyboard with Wayland (default for Ubuntu 24.04)
+ #### Use the keyboard with Wayland

- In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
+ In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.

  One workaround is to use the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.

@@ -190,40 +235,46 @@ You're on the right path :)

  ## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>

+ <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
  To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
  To activate start with:
  ```bash
- scribe --app
+ scribe --app ...
  ```
  or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
  of predefined models (controlled by `--vosk-models` and `whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
  For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
- That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
+ That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
+
+ The `--vosk-models` and `--whisper-models` allow to predefined the set of available models to choose from in the app manu. E.g.
+ ```bash
+ scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
+ ```
+
+ ### Ubuntu
+
+ In Ubuntu the following dependencies were required to make the menus appear:

  ```bash
  sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

- <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
-
  ## Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
  `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.

- In a relatively basic form
-
+ Consider the following two flavors:
  ```bash
- scribe-install --clipboard --api YOUROPENAIAPIKEY
+ scribe-install --clipboard ...
+ scribe-install --name "Scribe App" --no-terminal --clipboard ...
  ```
- (`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+ The first will create an app named Scribe (the default) that simply opens a terminal and execute the command `scribe --clipboard ...`.
+ The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as only mode of interaction.

- It is also possible to run an app fully outside the terminal:
- ```bash
- scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
- ```

  ## Fine tuning

@@ -1,5 +1,5 @@
- [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
  [![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)
+ ![](https://img.shields.io/python/required-version-toml?tomlFilePath=https%3A%2F%2Fraw.githubusercontent.com%2Fperrette%2Fscribe%2Frefs%2Fheads%2Fmain%2Fpyproject.toml)

  # Scribe <img src="scribe_data/share/icon.png" width=48px>

@@ -9,22 +9,31 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a

  ## Compatibility

- In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
- Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
- and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
- This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
- Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
- A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).
+ The package is initially developped for python 3.12 with Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
+ Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
+
+ - python 3.13:
+   - at the time of writing, `openai-whisper` does not install.
+
+ - Ubuntu:
+   - see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
+ - MacOS:
+   - tested on a Macbook Air M1 8Gb RAM, with python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
+   - I expect better memory specs will have the local models run fine
+ - Windows:
+   - not tested yet

  ## Installation

- Install PortAudio library and xclip library. E.g. on Ubuntu:
+ Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:

  ```bash
  sudo apt-get install portaudio19-dev xclip
  ```

- See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
+ (`portaudio19-dev` becomes `portaudio ` with homebrew)
+
+ See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:

  ```bash
  pip install scribe-cli[all]
@@ -42,6 +51,37 @@ pip install -e .[all]

  You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).

+ At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
+
+ ### Manual selection of the dependencies
+
+ ```bash
+ # language models (at least one must be installed !)
+ pip install vosk
+ pip install openai soundfile # openaiapi
+ pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
+
+ # PortAUDIO (sounddevice)
+ pip install sounddevice # automatically installed as required dependency
+ sudo apt-get install portaudio19-dev
+ # MAC OS: brew install portaudio
+
+ # clipboard
+ pip install pyperclip # automatically installed as required dependency
+ sudo apt-get install xclip
+
+ # keyboard
+ pip install pynput
+
+ # app mode
+ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
+ pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
+ pip install pystray
+
+ # And finally
+ pip install scribe-cli
+ ```
+
  The language models for local backends `vosk` and `whisper` will download on-the-fly.
  The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.

@@ -66,11 +106,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
  It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
  There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).

- The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
+ The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
  ```bash
- scribe --backend openaiapi --api YOURAPIKEY
+ export OPENAI_API_KEY=YOURAPIKEY
+ scribe --backend openaiapi
  ```
- where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
+ The `openaiapi` backend is lightweight and handy if you have an API (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).

  ## Output media

@@ -106,9 +147,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
  The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
  Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).

- #### Use the keyboard with Wayland (default for Ubuntu 24.04)
+ #### Use the keyboard with Wayland

- In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
+ In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.

  One workaround is to use the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.

@@ -122,40 +163,46 @@ You're on the right path :)

  ## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>

+ <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
  To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
  To activate start with:
  ```bash
- scribe --app
+ scribe --app ...
  ```
  or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
  of predefined models (controlled by `--vosk-models` and `whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
  For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
- That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
+ That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
+
+ The `--vosk-models` and `--whisper-models` allow to predefined the set of available models to choose from in the app manu. E.g.
+ ```bash
+ scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
+ ```
+
+ ### Ubuntu
+
+ In Ubuntu the following dependencies were required to make the menus appear:

  ```bash
  sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

- <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
-
  ## Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
  `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.

- In a relatively basic form
-
+ Consider the following two flavors:
  ```bash
- scribe-install --clipboard --api YOUROPENAIAPIKEY
+ scribe-install --clipboard ...
+ scribe-install --name "Scribe App" --no-terminal --clipboard ...
  ```
- (`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+ The first will create an app named Scribe (the default) that simply opens a terminal and execute the command `scribe --clipboard ...`.
+ The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as only mode of interaction.

- It is also possible to run an app fully outside the terminal:
- ```bash
- scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
- ```

  ## Fine tuning

@@ -23,7 +23,11 @@ dependencies = [
  ]

  classifiers = [
- "Programming Language :: Python :: 3",
+ "Programming Language :: Python :: 3.9",
+ "Programming Language :: Python :: 3.10",
+ "Programming Language :: Python :: 3.11",
+ "Programming Language :: Python :: 3.12",
+ "Programming Language :: Python :: 3.13",
  "Operating System :: OS Independent",
  ]

@@ -1,8 +1,13 @@
- # file generated by setuptools_scm
+ # file generated by setuptools-scm
  # don't change, don't track in version control
+
+ __all__ = ["__version__", "__version_tuple__", "version", "version_tuple"]
+
  TYPE_CHECKING = False
  if TYPE_CHECKING:
-     from typing import Tuple, Union
+     from typing import Tuple
+     from typing import Union
+
      VERSION_TUPLE = Tuple[Union[int, str], ...]
  else:
      VERSION_TUPLE = object
@@ -12,5 +17,5 @@ __version__: str
  __version_tuple__: VERSION_TUPLE
  version_tuple: VERSION_TUPLE

- __version__ = version = '0.12.1'
- __version_tuple__ = version_tuple = (0, 12, 1)
+ __version__ = version = '0.12.3'
+ __version_tuple__ = version_tuple = (0, 12, 3)
@@ -203,7 +203,7 @@ def get_parser():
      group = parser.add_argument_group("whisper options")
      group.add_argument("--duration", default=120, type=float, help="Max duration of the whisper recording (default %(default)s s)")
      group.add_argument("--silence", default=2, type=float, help="silence duration (default %(default)s s)")
-     group.add_argument("--silence-db", default=-30, type=float, help="silence magnitude in decibel (default %(default)s db)")
+     group.add_argument("--silence-db", default=-40, type=float, help="silence magnitude in decibel (default %(default)s db)")
      group.add_argument("-a", "--restart-after-silence", action="store_true", help="Restart the recording after a transcription triggered by a silence")
      group.add_argument("--download-folder-whisper", help="Folder to store Whisper models.")

@@ -54,6 +54,9 @@ class AbstractTranscriber:
          return self.timeout is not None and time.time() - self.start_time > self.timeout

      def transcribe_realtime_audio(self, audio_bytes=b""):
+         """This method is generic and assumes the underlying model does not handle real-time audio.
+         The Vosk model handles real-time audio, so this method is overridden in the VoskTranscriber class.
+         """

          # Vérifier si le segment est un silence
          if is_silent(audio_bytes, self.silence_thresh):
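
Reviewer note on the two silence-related hunks above: `app.py` lowers the `--silence-db` default from -30 to -40, and `models.py` passes `self.silence_thresh` to `is_silent`. The diff does not show how `is_silent` computes its level, so what follows is only a minimal sketch, assuming 16-bit little-endian mono PCM and an RMS level expressed in dB relative to full scale. `rms_dbfs` and `is_below_threshold` are hypothetical names for illustration, not functions from `scribe`.

```python
import math
import struct


def rms_dbfs(audio_bytes: bytes) -> float:
    """RMS level of 16-bit little-endian PCM, in dB relative to full scale (assumed format)."""
    n = len(audio_bytes) // 2
    if n == 0:
        return float("-inf")
    samples = struct.unpack(f"<{n}h", audio_bytes[: 2 * n])
    rms = math.sqrt(sum(s * s for s in samples) / n)
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / 32768.0)


def is_below_threshold(audio_bytes: bytes, threshold_db: float) -> bool:
    """Hypothetical stand-in for a silence check: True when the block is quieter than the threshold."""
    return rms_dbfs(audio_bytes) < threshold_db


# A quiet block with amplitude 500 out of 32768 (roughly -36 dB full scale).
quiet = struct.pack("<4h", 500, -500, 500, -500)
print(is_below_threshold(quiet, -30))  # old default
print(is_below_threshold(quiet, -40))  # new default
```

Under these assumptions, a block at roughly -36 dB counts as silence with the old default but as sound with the new one, so the lower threshold would make quiet speech less likely to be treated as a pause.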
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.12.1
+ Version: 0.12.3
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -34,7 +34,11 @@ License: MIT License
  ensure compliance with their respective terms.
  Project-URL: Homepage, https://github.com/perrette/scribe
  Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
- Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
  Classifier: Operating System :: OS Independent
  Requires-Python: >=3.9
  Description-Content-Type: text/markdown
@@ -66,8 +70,8 @@ Requires-Dist: soundfile; extra == "all"
  Requires-Dist: vosk; extra == "all"
  Requires-Dist: pystray; extra == "all"

- [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
  [![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)
+ ![](https://img.shields.io/python/required-version-toml?tomlFilePath=https%3A%2F%2Fraw.githubusercontent.com%2Fperrette%2Fscribe%2Frefs%2Fheads%2Fmain%2Fpyproject.toml)

  # Scribe <img src="scribe_data/share/icon.png" width=48px>

@@ -77,22 +81,31 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
77
81
 
78
82
  ## Compatibility
79
83
 
80
- In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
81
- Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
82
- and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
83
- This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
84
- Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
85
- A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).
84
+ The package is initially developped for python 3.12 with Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
85
+ Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
86
+
87
+ - python 3.13:
88
+ - at the time of writing, `openai-whisper` does not install.
89
+
90
+ - Ubuntu:
91
+ - see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
92
+ - MacOS:
93
+ - tested on a Macbook Air M1 8Gb RAM, with python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
94
+ - I expect the local models to run fine on machines with more memory
95
+ - Windows:
96
+ - not tested yet
86
97
 
87
98
  ## Installation
88
99
 
89
- Install PortAudio library and xclip library. E.g. on Ubuntu:
100
+ Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
90
101
 
91
102
  ```bash
92
103
  sudo apt-get install portaudio19-dev xclip
93
104
  ```
94
105
 
95
- See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
106
+ (`portaudio19-dev` becomes `portaudio` with Homebrew)
107
+
108
+ See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
96
109
 
97
110
  ```bash
98
111
  pip install scribe-cli[all]
@@ -110,6 +123,37 @@ pip install -e .[all]
110
123
 
111
124
  You can leave out the optional dependencies (omit `[all]`) but must install at least one of the `vosk`, `openai-whisper` or `openai` packages (see Usage below).
112
125
 
126
+ At the time of writing `openai-whisper` does not install on Python 3.13. In that case you can install the dependencies manually and skip that package, which makes the `whisper` backend unavailable.
127
+
128
+ ### Manual selection of the dependencies
129
+
130
+ ```bash
131
+ # language models (at least one must be installed !)
132
+ pip install vosk
133
+ pip install openai soundfile # openaiapi
134
+ pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
135
+
136
+ # PortAUDIO (sounddevice)
137
+ pip install sounddevice # automatically installed as required dependency
138
+ sudo apt-get install portaudio19-dev
139
+ # MAC OS: brew install portaudio
140
+
141
+ # clipboard
142
+ pip install pyperclip # automatically installed as required dependency
143
+ sudo apt-get install xclip
144
+
145
+ # keyboard
146
+ pip install pynput
147
+
148
+ # app mode
149
+ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
150
+ pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
151
+ pip install pystray
152
+
153
+ # And finally
154
+ pip install scribe-cli
155
+ ```
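The Python 3.13 caveat above can be folded into a small helper that picks the backend packages for a given interpreter version. This is only a sketch: `backend_packages` is a hypothetical name, not something shipped with `scribe`, and the only version it treats specially is 3.13, as reported at the time of writing.

```bash
# Hypothetical helper (not part of scribe): print the backend packages to
# install for a given Python version, skipping openai-whisper on 3.13.
backend_packages() {
    local pyver="$1"
    local pkgs="vosk openai soundfile"
    if [ "$pyver" != "3.13" ]; then
        pkgs="$pkgs openai-whisper"
    fi
    echo "$pkgs"
}

# Possible usage (left commented out so the snippet installs nothing):
# pyver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
# pip install $(backend_packages "$pyver")
```

Passing the version as an argument keeps the helper easy to check without touching the active environment.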
156
+
113
157
  The language models for local backends `vosk` and `whisper` will download on-the-fly.
114
158
  The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
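To locate downloaded models on disk, the rule above can be reproduced in the shell. The function name `scribe_cache_dir` is an assumption made for illustration; it only mirrors the documented default and does not account for any option that overrides the download folder.

```bash
# Hypothetical helper (not part of scribe): print the default model folder
# for a backend, i.e. $XDG_CACHE_HOME/{backend} with a $HOME/.cache fallback.
scribe_cache_dir() {
    local backend="$1"
    printf '%s/%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}" "$backend"
}

# Show where the vosk and whisper models would be stored by default.
scribe_cache_dir vosk
scribe_cache_dir whisper
```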
115
159
 
@@ -134,11 +178,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
134
178
  It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
135
179
  There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
136
180
 
137
- The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
181
+ The `openaiapi` backend uses the `whisper-1` model at the time of writing. It requires an API key, best passed as an environment variable, e.g. in bash:
138
182
  ```bash
139
- scribe --backend openaiapi --api YOURAPIKEY
183
+ export OPENAI_API_KEY=YOURAPIKEY
184
+ scribe --backend openaiapi
140
185
  ```
141
- where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
186
+ The `openaiapi` backend is lightweight and handy if you have an API key (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
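Since the key is read from the environment, a thin wrapper can catch a missing `OPENAI_API_KEY` before `scribe` starts. The wrapper name `run_scribe_openaiapi` is hypothetical, and how `scribe` itself reacts to a missing key is not covered here.

```bash
# Hypothetical wrapper (not part of scribe): refuse to start the openaiapi
# backend when OPENAI_API_KEY is missing or empty.
run_scribe_openaiapi() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
    # Any extra arguments (e.g. --clipboard) are passed through unchanged.
    scribe --backend openaiapi "$@"
}
```

With the key exported, `run_scribe_openaiapi --clipboard` would then behave like `scribe --backend openaiapi --clipboard`.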
142
187
 
143
188
  ## Output media
144
189
 
@@ -174,9 +219,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
174
219
  The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
175
220
  Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
176
221
 
177
- #### Use the keyboard with Wayland (default for Ubuntu 24.04)
222
+ #### Use the keyboard with Wayland
178
223
 
179
- In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
224
+ In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
180
225
 
181
226
  One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
182
227
 
@@ -190,40 +235,46 @@ You're on the right path :)
190
235
 
191
236
  ## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
192
237
 
238
+ <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
239
+
193
240
  To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
194
241
  To activate start with:
195
242
  ```bash
196
- scribe --app
243
+ scribe --app ...
197
244
  ```
198
245
  or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
199
246
  of predefined models (controlled by `--vosk-models` and `--whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
200
247
  For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
201
- That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
248
+ That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
249
+
250
+ The `--vosk-models` and `--whisper-models` options predefine the set of models available in the app menu. E.g.
251
+ ```bash
252
+ scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
253
+ ```
254
+
255
+ ### Ubuntu
256
+
257
+ In Ubuntu the following dependencies were required to make the menus appear:
202
258
 
203
259
  ```bash
204
260
  sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
205
261
  pip install PyGObject
206
262
  ```
207
263
 
208
- <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
209
-
210
264
  ## Start as an application in GNOME
211
265
 
212
266
  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
213
267
  to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
214
268
  `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
215
269
 
216
- In a relatively basic form
217
-
270
+ Consider the following two flavors:
218
271
  ```bash
219
- scribe-install --clipboard --api YOUROPENAIAPIKEY
272
+ scribe-install --clipboard ...
273
+ scribe-install --name "Scribe App" --no-terminal --clipboard ...
220
274
  ```
221
- (`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
275
+ The first will create an app named Scribe (the default) that simply opens a terminal and executes the command `scribe --clipboard ...`.
276
+ The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as the only mode of interaction.
222
277
 
223
- It is also possible to run an app fully outside the terminal:
224
- ```bash
225
- scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
226
- ```
227
278
 
228
279
  ## Fine tuning
229
280
 
@@ -25,4 +25,5 @@ scribe_data/__init__.py
25
25
  scribe_data/share/icon.png
26
26
  scribe_data/share/icon_recording.png
27
27
  scribe_data/share/icon_writing.png
28
- scribe_data/templates/scribe.desktop
28
+ scribe_data/templates/scribe.desktop
29
+ scripts/test_python_versions_install.sh
@@ -0,0 +1,20 @@
1
+ subversion=$1
2
+ version=3.$subversion
3
+ name=py3$subversion
4
+
5
+ MAMBAENV=~/.local/share/mamba/envs/$name
6
+ VENVDIR=~/.virtualenvs/$name
7
+
8
+ if [ ! -d $MAMBAENV ] ; then
9
+ micromamba create -n $name python=$version --prefix $MAMBAENV -y
10
+ else
11
+ echo "Environment $name already exists at $MAMBAENV"
12
+ fi
13
+ if [ ! -d $VENVDIR ] ; then
14
+ $MAMBAENV/bin/python3 -m venv $VENVDIR
15
+ else
16
+ echo "Virtualenv $name already exists at $VENVDIR"
17
+ fi
18
+ source ~/.virtualenvs/$name/bin/activate
19
+ pip install -U pip
20
+ pip install scribe-cli[all]
File without changes