scribe-cli 0.12.1__tar.gz → 0.12.3__tar.gz
- {scribe_cli-0.12.1/scribe_cli.egg-info → scribe_cli-0.12.3}/PKG-INFO +79 -28
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/README.md +73 -26
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/pyproject.toml +5 -1
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/_version.py +9 -4
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/app.py +1 -1
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/models.py +3 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3/scribe_cli.egg-info}/PKG-INFO +79 -28
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/SOURCES.txt +2 -1
- scribe_cli-0.12.3/scripts/test_python_versions_install.sh +20 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/.gitignore +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/LICENSE +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/icon.xcf +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/__init__.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/audio.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/keyboard.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/models.toml +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/saverecording.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/testpynput.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe/util.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/requires.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/share/icon.png +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/share/icon_recording.png +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/share/icon_writing.png +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.3}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.12.1
|
|
3
|
+
Version: 0.12.3
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -34,7 +34,11 @@ License: MIT License
|
|
|
34
34
|
ensure compliance with their respective terms.
|
|
35
35
|
Project-URL: Homepage, https://github.com/perrette/scribe
|
|
36
36
|
Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
|
|
37
|
-
Classifier: Programming Language :: Python :: 3
|
|
37
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
41
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
38
42
|
Classifier: Operating System :: OS Independent
|
|
39
43
|
Requires-Python: >=3.9
|
|
40
44
|
Description-Content-Type: text/markdown
|
|
@@ -66,8 +70,8 @@ Requires-Dist: soundfile; extra == "all"
|
|
|
66
70
|
Requires-Dist: vosk; extra == "all"
|
|
67
71
|
Requires-Dist: pystray; extra == "all"
|
|
68
72
|
|
|
69
|
-
[]()
|
|
70
73
|
[](https://pypi.org/project/scribe-cli)
|
|
74
|
+

|
|
71
75
|
|
|
72
76
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
73
77
|
|
|
@@ -77,22 +81,31 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
|
|
|
77
81
|
|
|
78
82
|
## Compatibility
|
|
79
83
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
84
|
+
The package was initially developed for python 3.12 on Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
|
|
85
|
+
Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
|
|
86
|
+
|
|
87
|
+
- python 3.13:
|
|
88
|
+
- at the time of writing, `openai-whisper` does not install.
|
|
89
|
+
|
|
90
|
+
- Ubuntu:
|
|
91
|
+
- see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
|
|
92
|
+
- MacOS:
|
|
93
|
+
- tested on a MacBook Air M1 with 8 GB RAM and python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
|
|
94
|
+
- I expect better memory specs will have the local models run fine
|
|
95
|
+
- Windows:
|
|
96
|
+
- not tested yet
|
|
86
97
|
|
|
87
98
|
## Installation
|
|
88
99
|
|
|
89
|
-
Install PortAudio library and xclip library. E.g. on Ubuntu:
|
|
100
|
+
Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
|
|
90
101
|
|
|
91
102
|
```bash
|
|
92
103
|
sudo apt-get install portaudio19-dev xclip
|
|
93
104
|
```
|
|
94
105
|
|
|
95
|
-
|
|
106
|
+
(`portaudio19-dev` becomes `portaudio` with Homebrew)
|
|
107
|
+
|
|
108
|
+
See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
96
109
|
|
|
97
110
|
```bash
|
|
98
111
|
pip install scribe-cli[all]
|
|
@@ -110,6 +123,37 @@ pip install -e .[all]
|
|
|
110
123
|
|
|
111
124
|
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
112
125
|
|
|
126
|
+
At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
|
|
127
|
+
|
|
128
|
+
### Manual selection of the dependencies
|
|
129
|
+
|
|
130
|
+
```bash
|
|
131
|
+
# language models (at least one must be installed !)
|
|
132
|
+
pip install vosk
|
|
133
|
+
pip install openai soundfile # openaiapi
|
|
134
|
+
pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
|
|
135
|
+
|
|
136
|
+
# PortAUDIO (sounddevice)
|
|
137
|
+
pip install sounddevice # automatically installed as required dependency
|
|
138
|
+
sudo apt-get install portaudio19-dev
|
|
139
|
+
# MAC OS: brew install portaudio
|
|
140
|
+
|
|
141
|
+
# clipboard
|
|
142
|
+
pip install pyperclip # automatically installed as required dependency
|
|
143
|
+
sudo apt-get install xclip
|
|
144
|
+
|
|
145
|
+
# keyboard
|
|
146
|
+
pip install pynput
|
|
147
|
+
|
|
148
|
+
# app mode
|
|
149
|
+
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
|
|
150
|
+
pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
|
|
151
|
+
pip install pystray
|
|
152
|
+
|
|
153
|
+
# And finally
|
|
154
|
+
pip install scribe-cli
|
|
155
|
+
```
|
|
156
|
+
|
|
113
157
|
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
114
158
|
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
|
|
115
159
|
|
|
@@ -134,11 +178,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
134
178
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
135
179
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
136
180
|
|
|
137
|
-
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
181
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
|
|
138
182
|
```bash
|
|
139
|
-
|
|
183
|
+
export OPENAI_API_KEY=YOURAPIKEY
|
|
184
|
+
scribe --backend openaiapi
|
|
140
185
|
```
|
|
141
|
-
|
|
186
|
+
The `openaiapi` backend is lightweight and handy if you have an API key (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
|
|
142
187
|
|
|
143
188
|
## Output media
|
|
144
189
|
|
|
@@ -174,9 +219,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
|
|
|
174
219
|
The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
175
220
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
176
221
|
|
|
177
|
-
#### Use the keyboard with Wayland
|
|
222
|
+
#### Use the keyboard with Wayland
|
|
178
223
|
|
|
179
|
-
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
224
|
+
In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
180
225
|
|
|
181
226
|
One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
|
|
182
227
|
|
|
@@ -190,40 +235,46 @@ You're on the right path :)
|
|
|
190
235
|
|
|
191
236
|
## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
192
237
|
|
|
238
|
+
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
239
|
+
|
|
193
240
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
194
241
|
To activate start with:
|
|
195
242
|
```bash
|
|
196
|
-
scribe --app
|
|
243
|
+
scribe --app ...
|
|
197
244
|
```
|
|
198
245
|
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
199
246
|
of predefined models (controlled by `--vosk-models` and `--whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
|
|
200
247
|
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
|
|
201
|
-
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
248
|
+
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
249
|
+
|
|
250
|
+
The `--vosk-models` and `--whisper-models` options predefine the set of models available to choose from in the app menu. E.g.
|
|
251
|
+
```bash
|
|
252
|
+
scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
### Ubuntu
|
|
256
|
+
|
|
257
|
+
In Ubuntu the following dependencies were required to make the menus appear:
|
|
202
258
|
|
|
203
259
|
```bash
|
|
204
260
|
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
205
261
|
pip install PyGObject
|
|
206
262
|
```
|
|
207
263
|
|
|
208
|
-
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
209
|
-
|
|
210
264
|
## Start as an application in GNOME
|
|
211
265
|
|
|
212
266
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
213
267
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
214
268
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
215
269
|
|
|
216
|
-
|
|
217
|
-
|
|
270
|
+
Consider the following two flavors:
|
|
218
271
|
```bash
|
|
219
|
-
scribe-install --clipboard
|
|
272
|
+
scribe-install --clipboard ...
|
|
273
|
+
scribe-install --name "Scribe App" --no-terminal --clipboard ...
|
|
220
274
|
```
|
|
221
|
-
(
|
|
275
|
+
The first will create an app named Scribe (the default) that simply opens a terminal and executes the command `scribe --clipboard ...`.
|
|
276
|
+
The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as the only mode of interaction.
|
|
222
277
|
|
|
223
|
-
It is also possible to run an app fully outside the terminal:
|
|
224
|
-
```bash
|
|
225
|
-
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
|
|
226
|
-
```
|
|
227
278
|
|
|
228
279
|
## Fine tuning
|
|
229
280
|
|
|
@@ -1,5 +1,5 @@
|
|
|
1
|
-
[]()
|
|
2
1
|
[](https://pypi.org/project/scribe-cli)
|
|
2
|
+

|
|
3
3
|
|
|
4
4
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
5
5
|
|
|
@@ -9,22 +9,31 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
|
|
|
9
9
|
|
|
10
10
|
## Compatibility
|
|
11
11
|
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
12
|
+
The package was initially developed for python 3.12 on Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
|
|
13
|
+
Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
|
|
14
|
+
|
|
15
|
+
- python 3.13:
|
|
16
|
+
- at the time of writing, `openai-whisper` does not install.
|
|
17
|
+
|
|
18
|
+
- Ubuntu:
|
|
19
|
+
- see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
|
|
20
|
+
- MacOS:
|
|
21
|
+
- tested on a MacBook Air M1 with 8 GB RAM and python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
|
|
22
|
+
- I expect better memory specs will have the local models run fine
|
|
23
|
+
- Windows:
|
|
24
|
+
- not tested yet
|
|
18
25
|
|
|
19
26
|
## Installation
|
|
20
27
|
|
|
21
|
-
Install PortAudio library and xclip library. E.g. on Ubuntu:
|
|
28
|
+
Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
|
|
22
29
|
|
|
23
30
|
```bash
|
|
24
31
|
sudo apt-get install portaudio19-dev xclip
|
|
25
32
|
```
|
|
26
33
|
|
|
27
|
-
|
|
34
|
+
(`portaudio19-dev` becomes `portaudio` with Homebrew)
|
|
35
|
+
|
|
36
|
+
See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
28
37
|
|
|
29
38
|
```bash
|
|
30
39
|
pip install scribe-cli[all]
|
|
@@ -42,6 +51,37 @@ pip install -e .[all]
|
|
|
42
51
|
|
|
43
52
|
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
44
53
|
|
|
54
|
+
At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
|
|
55
|
+
|
|
56
|
+
### Manual selection of the dependencies
|
|
57
|
+
|
|
58
|
+
```bash
|
|
59
|
+
# language models (at least one must be installed !)
|
|
60
|
+
pip install vosk
|
|
61
|
+
pip install openai soundfile # openaiapi
|
|
62
|
+
pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
|
|
63
|
+
|
|
64
|
+
# PortAUDIO (sounddevice)
|
|
65
|
+
pip install sounddevice # automatically installed as required dependency
|
|
66
|
+
sudo apt-get install portaudio19-dev
|
|
67
|
+
# MAC OS: brew install portaudio
|
|
68
|
+
|
|
69
|
+
# clipboard
|
|
70
|
+
pip install pyperclip # automatically installed as required dependency
|
|
71
|
+
sudo apt-get install xclip
|
|
72
|
+
|
|
73
|
+
# keyboard
|
|
74
|
+
pip install pynput
|
|
75
|
+
|
|
76
|
+
# app mode
|
|
77
|
+
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
|
|
78
|
+
pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
|
|
79
|
+
pip install pystray
|
|
80
|
+
|
|
81
|
+
# And finally
|
|
82
|
+
pip install scribe-cli
|
|
83
|
+
```
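The "at least one backend" requirement can be checked before launching. This is a minimal sketch, not code from scribe: the helper name `available_backends` and the backend-to-module mapping are assumptions made for illustration (the `openai-whisper` package is imported as `whisper`).

```python
from importlib.util import find_spec

# Assumed mapping from scribe backend name to the module it needs at import time.
BACKEND_MODULES = {
    "vosk": "vosk",
    "whisper": "whisper",    # provided by the openai-whisper package
    "openaiapi": "openai",
}


def available_backends(modules=BACKEND_MODULES):
    """Return the backend names whose module can be located, in mapping order."""
    return [name for name, module in modules.items() if find_spec(module) is not None]


if __name__ == "__main__":
    found = available_backends()
    print("usable backends:", ", ".join(found) or "none (install vosk, openai-whisper or openai)")
```

On python 3.13, where `openai-whisper` reportedly does not install, a check like this would simply leave `whisper` out of the list.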
|
|
84
|
+
|
|
45
85
|
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
46
86
|
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
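The folder rule above can be written out as a small function. This is a sketch under stated assumptions: `default_download_folder` is a hypothetical name, and scribe's own resolution code is not shown in this diff.

```python
import os
from pathlib import Path


def default_download_folder(backend, env=None):
    """Return $XDG_CACHE_HOME/<backend>, falling back to $HOME/.cache/<backend>."""
    env = os.environ if env is None else env
    cache_home = env.get("XDG_CACHE_HOME")
    if not cache_home:
        # XDG_CACHE_HOME unset or empty: use the documented default under $HOME.
        home = env.get("HOME") or str(Path.home())
        cache_home = str(Path(home) / ".cache")
    return Path(cache_home) / backend


if __name__ == "__main__":
    for backend in ("vosk", "whisper"):
        print(backend, "->", default_download_folder(backend))
```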
|
|
47
87
|
|
|
@@ -66,11 +106,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
66
106
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
67
107
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
68
108
|
|
|
69
|
-
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
109
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
|
|
70
110
|
```bash
|
|
71
|
-
|
|
111
|
+
export OPENAI_API_KEY=YOURAPIKEY
|
|
112
|
+
scribe --backend openaiapi
|
|
72
113
|
```
|
|
73
|
-
|
|
114
|
+
The `openaiapi` backend is lightweight and handy if you have an API key (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
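Reading the key from the environment keeps it out of shell history and out of generated launcher files. The sketch below illustrates that lookup order. `resolve_api_key` is a hypothetical helper, and how scribe itself resolves the key is not shown in this diff.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Prefer an explicitly passed key, else fall back to OPENAI_API_KEY."""
    env = os.environ if env is None else env
    key = cli_value or env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("No API key found: export OPENAI_API_KEY before starting scribe")
    return key


if __name__ == "__main__":
    try:
        resolve_api_key()
        print("API key found")
    except RuntimeError as err:
        print(err)
```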
|
|
74
115
|
|
|
75
116
|
## Output media
|
|
76
117
|
|
|
@@ -106,9 +147,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
|
|
|
106
147
|
The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
107
148
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
108
149
|
|
|
109
|
-
#### Use the keyboard with Wayland
|
|
150
|
+
#### Use the keyboard with Wayland
|
|
110
151
|
|
|
111
|
-
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
152
|
+
In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
112
153
|
|
|
113
154
|
One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
|
|
114
155
|
|
|
@@ -122,40 +163,46 @@ You're on the right path :)
|
|
|
122
163
|
|
|
123
164
|
## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
124
165
|
|
|
166
|
+
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
167
|
+
|
|
125
168
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
126
169
|
To activate start with:
|
|
127
170
|
```bash
|
|
128
|
-
scribe --app
|
|
171
|
+
scribe --app ...
|
|
129
172
|
```
|
|
130
173
|
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
131
174
|
of predefined models (controlled by `--vosk-models` and `--whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
|
|
132
175
|
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
|
|
133
|
-
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
176
|
+
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
177
|
+
|
|
178
|
+
The `--vosk-models` and `--whisper-models` options predefine the set of models available to choose from in the app menu. E.g.
|
|
179
|
+
```bash
|
|
180
|
+
scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
### Ubuntu
|
|
184
|
+
|
|
185
|
+
In Ubuntu the following dependencies were required to make the menus appear:
|
|
134
186
|
|
|
135
187
|
```bash
|
|
136
188
|
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
137
189
|
pip install PyGObject
|
|
138
190
|
```
|
|
139
191
|
|
|
140
|
-
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
141
|
-
|
|
142
192
|
## Start as an application in GNOME
|
|
143
193
|
|
|
144
194
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
145
195
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
146
196
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
147
197
|
|
|
148
|
-
|
|
149
|
-
|
|
198
|
+
Consider the following two flavors:
|
|
150
199
|
```bash
|
|
151
|
-
scribe-install --clipboard
|
|
200
|
+
scribe-install --clipboard ...
|
|
201
|
+
scribe-install --name "Scribe App" --no-terminal --clipboard ...
|
|
152
202
|
```
|
|
153
|
-
(
|
|
203
|
+
The first will create an app named Scribe (the default) that simply opens a terminal and executes the command `scribe --clipboard ...`.
|
|
204
|
+
The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as the only mode of interaction.
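To make the two flavors concrete, here is a rough sketch of the kind of launcher file such an installer could write. It is not the `scribe-install` implementation, and the packaged `scribe.desktop` template is not shown in this diff. The function names are hypothetical, and only the standard freedesktop keys `Type`, `Name`, `Exec`, `Terminal` and `Icon` are used.

```python
import shlex
from pathlib import Path


def desktop_entry(name="Scribe", scribe_args=(), terminal=True, icon=""):
    """Build the text of a freedesktop .desktop entry that launches scribe."""
    # shlex.join gives shell-style quoting, which is adequate for simple flags.
    exec_line = shlex.join(["scribe", *scribe_args])
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        f"Exec={exec_line}",
        f"Terminal={'true' if terminal else 'false'}",
    ]
    if icon:
        lines.append(f"Icon={icon}")
    return "\n".join(lines) + "\n"


def install_path(home=None):
    """Location named in the README: $HOME/.local/share/applications/scribe.desktop."""
    home = Path.home() if home is None else Path(home)
    return home / ".local" / "share" / "applications" / "scribe.desktop"


if __name__ == "__main__":
    # Second flavor: hidden terminal, tray icon only.
    print(desktop_entry("Scribe App", ["--no-prompt", "--app", "--clipboard"], terminal=False))
    print("would be written to", install_path())
```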
|
|
154
205
|
|
|
155
|
-
It is also possible to run an app fully outside the terminal:
|
|
156
|
-
```bash
|
|
157
|
-
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
|
|
158
|
-
```
|
|
159
206
|
|
|
160
207
|
## Fine tuning
|
|
161
208
|
|
|
@@ -23,7 +23,11 @@ dependencies = [
|
|
|
23
23
|
]
|
|
24
24
|
|
|
25
25
|
classifiers = [
|
|
26
|
-
"Programming Language :: Python :: 3",
|
|
26
|
+
"Programming Language :: Python :: 3.9",
|
|
27
|
+
"Programming Language :: Python :: 3.10",
|
|
28
|
+
"Programming Language :: Python :: 3.11",
|
|
29
|
+
"Programming Language :: Python :: 3.12",
|
|
30
|
+
"Programming Language :: Python :: 3.13",
|
|
27
31
|
"Operating System :: OS Independent",
|
|
28
32
|
]
|
|
29
33
|
|
|
@@ -1,8 +1,13 @@
|
|
|
1
|
-
# file generated by
|
|
1
|
+
# file generated by setuptools-scm
|
|
2
2
|
# don't change, don't track in version control
|
|
3
|
+
|
|
4
|
+
__all__ = ["__version__", "__version_tuple__", "version", "version_tuple"]
|
|
5
|
+
|
|
3
6
|
TYPE_CHECKING = False
|
|
4
7
|
if TYPE_CHECKING:
|
|
5
|
-
from typing import Tuple
|
|
8
|
+
from typing import Tuple
|
|
9
|
+
from typing import Union
|
|
10
|
+
|
|
6
11
|
VERSION_TUPLE = Tuple[Union[int, str], ...]
|
|
7
12
|
else:
|
|
8
13
|
VERSION_TUPLE = object
|
|
@@ -12,5 +17,5 @@ __version__: str
|
|
|
12
17
|
__version_tuple__: VERSION_TUPLE
|
|
13
18
|
version_tuple: VERSION_TUPLE
|
|
14
19
|
|
|
15
|
-
__version__ = version = '0.12.1'
|
|
16
|
-
__version_tuple__ = version_tuple = (0, 12, 1)
|
|
20
|
+
__version__ = version = '0.12.3'
|
|
21
|
+
__version_tuple__ = version_tuple = (0, 12, 3)
|
|
@@ -203,7 +203,7 @@ def get_parser():
|
|
|
203
203
|
group = parser.add_argument_group("whisper options")
|
|
204
204
|
group.add_argument("--duration", default=120, type=float, help="Max duration of the whisper recording (default %(default)s s)")
|
|
205
205
|
group.add_argument("--silence", default=2, type=float, help="silence duration (default %(default)s s)")
|
|
206
|
-
group.add_argument("--silence-db", default=-
|
|
206
|
+
group.add_argument("--silence-db", default=-40, type=float, help="silence magnitude in decibel (default %(default)s db)")
|
|
207
207
|
group.add_argument("-a", "--restart-after-silence", action="store_true", help="Restart the recording after a transcription triggered by a silence")
|
|
208
208
|
group.add_argument("--download-folder-whisper", help="Folder to store Whisper models.")
|
|
209
209
|
|
|
@@ -54,6 +54,9 @@ class AbstractTranscriber:
|
|
|
54
54
|
return self.timeout is not None and time.time() - self.start_time > self.timeout
|
|
55
55
|
|
|
56
56
|
def transcribe_realtime_audio(self, audio_bytes=b""):
|
|
57
|
+
"""This method is generic and assumes the underlying model does not handle real-time audio.
|
|
58
|
+
The Vosk model handles real-time audio, so this method is overridden in the VoskTranscriber class.
|
|
59
|
+
"""
|
|
57
60
|
|
|
58
61
|
# Vérifier si le segment est un silence
|
|
59
62
|
if is_silent(audio_bytes, self.silence_thresh):
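For context on the `--silence-db` default of -40 in the `app.py` hunk, the sketch below shows one common way a decibel silence threshold can be applied to a raw audio chunk. It is an illustration only: the real `is_silent` in scribe is not part of this diff, the input is assumed to be 16-bit little-endian mono PCM, and the names `rms_dbfs` and `is_silent_sketch` are hypothetical.

```python
import math
import struct

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def rms_dbfs(audio_bytes):
    """RMS level of 16-bit little-endian PCM in dB relative to full scale."""
    n = len(audio_bytes) // 2
    if n == 0:
        return float("-inf")
    samples = struct.unpack(f"<{n}h", audio_bytes[: 2 * n])
    rms = math.sqrt(sum(s * s for s in samples) / n)
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / FULL_SCALE)


def is_silent_sketch(audio_bytes, silence_thresh_db=-40.0):
    """True when the chunk's RMS level is below the threshold."""
    return rms_dbfs(audio_bytes) < silence_thresh_db


if __name__ == "__main__":
    quiet = struct.pack("<4h", 100, -100, 100, -100)      # about -50 dBFS
    loud = struct.pack("<4h", 1000, -1000, 1000, -1000)   # about -30 dBFS
    print(is_silent_sketch(quiet), is_silent_sketch(loud))
```

With a threshold of -40 dB, the first chunk counts as silence and the second does not. Raising the threshold toward 0 makes more chunks count as silence.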
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.12.1
|
|
3
|
+
Version: 0.12.3
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -34,7 +34,11 @@ License: MIT License
|
|
|
34
34
|
ensure compliance with their respective terms.
|
|
35
35
|
Project-URL: Homepage, https://github.com/perrette/scribe
|
|
36
36
|
Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
|
|
37
|
-
Classifier: Programming Language :: Python :: 3
|
|
37
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
41
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
38
42
|
Classifier: Operating System :: OS Independent
|
|
39
43
|
Requires-Python: >=3.9
|
|
40
44
|
Description-Content-Type: text/markdown
|
|
@@ -66,8 +70,8 @@ Requires-Dist: soundfile; extra == "all"
|
|
|
66
70
|
Requires-Dist: vosk; extra == "all"
|
|
67
71
|
Requires-Dist: pystray; extra == "all"
|
|
68
72
|
|
|
69
|
-
[]()
|
|
70
73
|
[](https://pypi.org/project/scribe-cli)
|
|
74
|
+

|
|
71
75
|
|
|
72
76
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
73
77
|
|
|
@@ -77,22 +81,31 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
|
|
|
77
81
|
|
|
78
82
|
## Compatibility
|
|
79
83
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
84
|
+
The package was initially developed for python 3.12 on Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
|
|
85
|
+
Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
|
|
86
|
+
|
|
87
|
+
- python 3.13:
|
|
88
|
+
- at the time of writing, `openai-whisper` does not install.
|
|
89
|
+
|
|
90
|
+
- Ubuntu:
|
|
91
|
+
- see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
|
|
92
|
+
- MacOS:
|
|
93
|
+
- tested on a MacBook Air M1 with 8 GB RAM and python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
|
|
94
|
+
- I expect better memory specs will have the local models run fine
|
|
95
|
+
- Windows:
|
|
96
|
+
- not tested yet
|
|
86
97
|
|
|
87
98
|
## Installation
|
|
88
99
|
|
|
89
|
-
Install PortAudio library and xclip library. E.g. on Ubuntu:
|
|
100
|
+
Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
|
|
90
101
|
|
|
91
102
|
```bash
|
|
92
103
|
sudo apt-get install portaudio19-dev xclip
|
|
93
104
|
```
|
|
94
105
|
|
|
95
|
-
|
|
106
|
+
(`portaudio19-dev` becomes `portaudio` with Homebrew)
|
|
107
|
+
|
|
108
|
+
See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
96
109
|
|
|
97
110
|
```bash
|
|
98
111
|
pip install scribe-cli[all]
|
|
@@ -110,6 +123,37 @@ pip install -e .[all]
|
|
|
110
123
|
|
|
111
124
|
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
112
125
|
|
|
126
|
+
At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
|
|
127
|
+
|
|
128
|
+
### Manual selection of the dependencies
|
|
129
|
+
|
|
130
|
+
```bash
|
|
131
|
+
# language models (at least one must be installed !)
|
|
132
|
+
pip install vosk
|
|
133
|
+
pip install openai soundfile # openaiapi
|
|
134
|
+
pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
|
|
135
|
+
|
|
136
|
+
# PortAUDIO (sounddevice)
|
|
137
|
+
pip install sounddevice # automatically installed as required dependency
|
|
138
|
+
sudo apt-get install portaudio19-dev
|
|
139
|
+
# MAC OS: brew install portaudio
|
|
140
|
+
|
|
141
|
+
# clipboard
|
|
142
|
+
pip install pyperclip # automatically installed as required dependency
|
|
143
|
+
sudo apt-get install xclip
|
|
144
|
+
|
|
145
|
+
# keyboard
|
|
146
|
+
pip install pynput
|
|
147
|
+
|
|
148
|
+
# app mode
|
|
149
|
+
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
|
|
150
|
+
pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
|
|
151
|
+
pip install pystray
|
|
152
|
+
|
|
153
|
+
# And finally
|
|
154
|
+
pip install scribe-cli
|
|
155
|
+
```
|
|
156
|
+
|
|
113
157
|
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
114
158
|
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
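The fallback rule above can be sketched in shell. This is only an illustration of the documented default; `model_dir` is a hypothetical helper name and is not part of `scribe`.

```bash
# Hypothetical helper (not part of scribe): print the default model folder
# for a given backend, following the rule $XDG_CACHE_HOME/{backend}.
model_dir() {
    backend="$1"
    # Fall back to $HOME/.cache when XDG_CACHE_HOME is unset or empty
    cache_home="${XDG_CACHE_HOME:-$HOME/.cache}"
    printf '%s/%s\n' "$cache_home" "$backend"
}

model_dir vosk
model_dir whisper
```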
|
|
115
159
|
|
|
@@ -134,11 +178,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
134
178
|
It becomes really powerful when used for longer or interactive typing sessions with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
135
179
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
136
180
|
|
|
137
|
-
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
181
|
+
The `openaiapi` backend uses the `whisper-1` model at the time of writing. It requires an API key, best passed as an environment variable, e.g. in bash:
|
|
138
182
|
```bash
|
|
139
|
-
|
|
183
|
+
export OPENAI_API_KEY=YOURAPIKEY
|
|
184
|
+
scribe --backend openaiapi
|
|
140
185
|
```
|
|
141
|
-
|
|
186
|
+
The `openaiapi` backend is lightweight and handy if you have an API key (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
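Since the key is read from the environment, a small guard can catch a missing key before launching. This is a minimal sketch; `require_openai_key` is a hypothetical helper and is not provided by `scribe`.

```bash
# Hypothetical helper (not part of scribe): fail early if the key is missing.
require_openai_key() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set; export it before using --backend openaiapi" >&2
        return 1
    fi
}

# Usage (only launches scribe when the key is present):
# require_openai_key && scribe --backend openaiapi
```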
|
|
142
187
|
|
|
143
188
|
## Output media
|
|
144
189
|
|
|
@@ -174,9 +219,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
|
|
|
174
219
|
The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
175
220
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
176
221
|
|
|
177
|
-
#### Use the keyboard with Wayland
|
|
222
|
+
#### Use the keyboard with Wayland
|
|
178
223
|
|
|
179
|
-
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
224
|
+
In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
180
225
|
|
|
181
226
|
One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
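The uncommenting step can be scripted. The sketch below assumes the line appears as `#WaylandEnable=false`, with or without a space after `#`; `disable_wayland` is a hypothetical helper name. Try it on a copy of the file first, since editing the real file requires root.

```bash
# Hypothetical helper: uncomment the WaylandEnable=false line in a gdm3 config file.
disable_wayland() {
    conf="$1"
    # -i.bak edits in place and keeps a backup next to the original file
    sed -E -i.bak 's/^#[[:space:]]*(WaylandEnable=false)/\1/' "$conf"
}

# Real usage needs root and a restart afterwards, for example:
# sudo sed -E -i.bak 's/^#[[:space:]]*(WaylandEnable=false)/\1/' /etc/gdm3/custom.conf
```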
|
|
182
227
|
|
|
@@ -190,40 +235,46 @@ You're on the right path :)
|
|
|
190
235
|
|
|
191
236
|
## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
192
237
|
|
|
238
|
+
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
239
|
+
|
|
193
240
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
194
241
|
To activate start with:
|
|
195
242
|
```bash
|
|
196
|
-
scribe --app
|
|
243
|
+
scribe --app ...
|
|
197
244
|
```
|
|
198
245
|
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
199
246
|
of predefined models (controlled by `--vosk-models` and `--whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
|
|
200
247
|
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
|
|
201
|
-
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
248
|
+
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
249
|
+
|
|
250
|
+
The `--vosk-models` and `--whisper-models` options predefine the set of models available to choose from in the app menu, e.g.:
|
|
251
|
+
```bash
|
|
252
|
+
scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
### Ubuntu
|
|
256
|
+
|
|
257
|
+
In Ubuntu the following dependencies were required to make the menus appear:
|
|
202
258
|
|
|
203
259
|
```bash
|
|
204
260
|
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
205
261
|
pip install PyGObject
|
|
206
262
|
```
|
|
207
263
|
|
|
208
|
-
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
209
|
-
|
|
210
264
|
## Start as an application in GNOME
|
|
211
265
|
|
|
212
266
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
213
267
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
214
268
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
215
269
|
|
|
216
|
-
|
|
217
|
-
|
|
270
|
+
Consider the following two flavors:
|
|
218
271
|
```bash
|
|
219
|
-
scribe-install --clipboard
|
|
272
|
+
scribe-install --clipboard ...
|
|
273
|
+
scribe-install --name "Scribe App" --no-terminal --clipboard ...
|
|
220
274
|
```
|
|
221
|
-
(
|
|
275
|
+
The first will create an app named Scribe (the default) that simply opens a terminal and executes the command `scribe --clipboard ...`.
|
|
276
|
+
The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as the only mode of interaction.
|
|
222
277
|
|
|
223
|
-
It is also possible to run an app fully outside the terminal:
|
|
224
|
-
```bash
|
|
225
|
-
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
|
|
226
|
-
```
|
|
227
278
|
|
|
228
279
|
## Fine tuning
|
|
229
280
|
|
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
subversion=$1
|
|
2
|
+
version=3.$subversion
|
|
3
|
+
name=py3$subversion
|
|
4
|
+
|
|
5
|
+
MAMBAENV=~/.local/share/mamba/envs/$name
|
|
6
|
+
VENVDIR=~/.virtualenvs/$name
|
|
7
|
+
|
|
8
|
+
if [ ! -d $MAMBAENV ] ; then
|
|
9
|
+
micromamba create -n $name python=$version --prefix $MAMBAENV -y
|
|
10
|
+
else
|
|
11
|
+
echo "Environment $name already exists at $MAMBAENV"
|
|
12
|
+
fi
|
|
13
|
+
if [ ! -d $VENVDIR ] ; then
|
|
14
|
+
$MAMBAENV/bin/python3 -m venv $VENVDIR
|
|
15
|
+
else
|
|
16
|
+
echo "Virtualenv $name already exists at $VENVDIR"
|
|
17
|
+
fi
|
|
18
|
+
source ~/.virtualenvs/$name/bin/activate
|
|
19
|
+
pip install -U pip
|
|
20
|
+
pip install scribe-cli[all]
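The script takes the Python minor version as its only argument and derives the environment name and paths from it. The sketch below mirrors that naming as a dry run, without creating anything; `describe_env` is a hypothetical helper, and the minor versions in the loop are arbitrary examples.

```bash
# Hypothetical dry-run helper: print the names the install script would derive
# from a Python minor version (e.g. 13 -> py313, 3.13).
describe_env() {
    subversion="$1"
    version="3.$subversion"
    name="py3$subversion"
    printf '%s %s %s\n' "$name" "$version" "$HOME/.virtualenvs/$name"
}

# Example minor versions (assumed for illustration only)
for minor in 12 13; do
    describe_env "$minor"
done
```

The real script would be invoked in the same way, one minor version per run, e.g. `bash scripts/test_python_versions_install.sh 13`.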
|
|
File without changes
|
|