scribe-cli 0.12.1__tar.gz → 0.12.2__tar.gz
- {scribe_cli-0.12.1/scribe_cli.egg-info → scribe_cli-0.12.2}/PKG-INFO +77 -28
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/README.md +71 -26
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/pyproject.toml +5 -1
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/_version.py +2 -2
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/models.py +3 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2/scribe_cli.egg-info}/PKG-INFO +77 -28
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_cli.egg-info/SOURCES.txt +2 -1
- scribe_cli-0.12.2/scripts/test_python_versions_install.sh +20 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/.gitignore +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/LICENSE +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/icon.xcf +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/__init__.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/app.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/audio.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/keyboard.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/models.toml +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/saverecording.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/testpynput.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe/util.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_cli.egg-info/requires.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_data/share/icon.png +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_data/share/icon_recording.png +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_data/share/icon_writing.png +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.12.1 → scribe_cli-0.12.2}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.12.
|
|
3
|
+
Version: 0.12.2
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -34,7 +34,11 @@ License: MIT License
|
|
|
34
34
|
ensure compliance with their respective terms.
|
|
35
35
|
Project-URL: Homepage, https://github.com/perrette/scribe
|
|
36
36
|
Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
|
|
37
|
-
Classifier: Programming Language :: Python :: 3
|
|
37
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
41
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
38
42
|
Classifier: Operating System :: OS Independent
|
|
39
43
|
Requires-Python: >=3.9
|
|
40
44
|
Description-Content-Type: text/markdown
|
|
@@ -66,8 +70,8 @@ Requires-Dist: soundfile; extra == "all"
|
|
|
66
70
|
Requires-Dist: vosk; extra == "all"
|
|
67
71
|
Requires-Dist: pystray; extra == "all"
|
|
68
72
|
|
|
69
|
-
[]()
|
|
70
73
|
[](https://pypi.org/project/scribe-cli)
|
|
74
|
+

|
|
71
75
|
|
|
72
76
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
73
77
|
|
|
@@ -77,22 +81,29 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
|
|
|
77
81
|
|
|
78
82
|
## Compatibility
|
|
79
83
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
84
|
+
The package is initially developped for python 3.12 with Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
|
|
85
|
+
Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
|
|
86
|
+
|
|
87
|
+
- python 3.13:
|
|
88
|
+
- at the time of writing, `openai-whisper` does not install.
|
|
89
|
+
|
|
90
|
+
- Ubuntu:
|
|
91
|
+
- see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
|
|
92
|
+
- MacOS:
|
|
93
|
+
- tested on a Macbook Air M1 8Gb RAM, with python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
|
|
94
|
+
- I expect better memory specs will have the local models run fine
|
|
95
|
+
- Windows:
|
|
96
|
+
- not tested yet
|
|
86
97
|
|
|
87
98
|
## Installation
|
|
88
99
|
|
|
89
|
-
Install PortAudio library and xclip library. E.g. on Ubuntu:
|
|
100
|
+
Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
|
|
90
101
|
|
|
91
102
|
```bash
|
|
92
103
|
sudo apt-get install portaudio19-dev xclip
|
|
93
104
|
```
|
|
94
105
|
|
|
95
|
-
See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
106
|
+
See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
96
107
|
|
|
97
108
|
```bash
|
|
98
109
|
pip install scribe-cli[all]
|
|
@@ -110,6 +121,37 @@ pip install -e .[all]
|
|
|
110
121
|
|
|
111
122
|
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
112
123
|
|
|
124
|
+
At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
|
|
125
|
+
|
|
126
|
+
### Manual selection of the dependencies
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
# language models (at least one must be installed !)
|
|
130
|
+
pip install vosk
|
|
131
|
+
pip install openai soundfile # openaiapi
|
|
132
|
+
pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
|
|
133
|
+
|
|
134
|
+
# PortAUDIO (sounddevice)
|
|
135
|
+
pip install sounddevice # automatically installed as required dependency
|
|
136
|
+
sudo apt-get install portaudio19-dev
|
|
137
|
+
|
|
138
|
+
# clipboard
|
|
139
|
+
pip install pyperclip # automatically installed as required dependency
|
|
140
|
+
sudo apt-get install xclip
|
|
141
|
+
|
|
142
|
+
# keyboard
|
|
143
|
+
pip install pynput
|
|
144
|
+
|
|
145
|
+
# app mode
|
|
146
|
+
# Uncommand the line below for Ubuntu !
|
|
147
|
+
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
|
|
148
|
+
pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
|
|
149
|
+
pip install pystray
|
|
150
|
+
|
|
151
|
+
# And finally
|
|
152
|
+
pip install scribe-cli
|
|
153
|
+
```
|
|
154
|
+
|
|
113
155
|
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
114
156
|
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
|
|
115
157
|
|
|
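Reviewer note on the cache location described in the hunk above: the lookup rule (`$XDG_CACHE_HOME/{backend}`, falling back to `$HOME/.cache`) could be sketched as follows. This is a minimal illustration only; the helper name `default_model_dir` is hypothetical and is not taken from the scribe source.

```python
import os
from pathlib import Path


def default_model_dir(backend: str) -> Path:
    """Hypothetical helper: resolve $XDG_CACHE_HOME/{backend}.

    Falls back to $HOME/.cache when XDG_CACHE_HOME is unset or empty.
    """
    cache_home = os.environ.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(cache_home) / backend
```

Calling `default_model_dir("vosk")` or `default_model_dir("whisper")` would then give the folder where the corresponding models are expected, under the assumption stated above.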
@@ -134,11 +176,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
134
176
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
135
177
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
136
178
|
|
|
137
|
-
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
179
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
|
|
138
180
|
```bash
|
|
139
|
-
|
|
181
|
+
export OPENAI_API_KEY=YOURAPIKEY
|
|
182
|
+
scribe --backend openaiapi
|
|
140
183
|
```
|
|
141
|
-
|
|
184
|
+
The `openaiapi` backend is lightweight and handy if you have an API (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
|
|
142
185
|
|
|
143
186
|
## Output media
|
|
144
187
|
|
|
@@ -174,9 +217,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
|
|
|
174
217
|
The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
175
218
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
176
219
|
|
|
177
|
-
#### Use the keyboard with Wayland
|
|
220
|
+
#### Use the keyboard with Wayland
|
|
178
221
|
|
|
179
|
-
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
222
|
+
In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
180
223
|
|
|
181
224
|
One workaround is to use the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
|
|
182
225
|
|
|
@@ -190,40 +233,46 @@ You're on the right path :)
|
|
|
190
233
|
|
|
191
234
|
## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
192
235
|
|
|
236
|
+
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
237
|
+
|
|
193
238
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
194
239
|
To activate start with:
|
|
195
240
|
```bash
|
|
196
|
-
scribe --app
|
|
241
|
+
scribe --app ...
|
|
197
242
|
```
|
|
198
243
|
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
199
244
|
of predefined models (controlled by `--vosk-models` and `whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
|
|
200
245
|
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
|
|
201
|
-
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
246
|
+
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
247
|
+
|
|
248
|
+
The `--vosk-models` and `--whisper-models` allow to predefined the set of available models to choose from in the app manu. E.g.
|
|
249
|
+
```bash
|
|
250
|
+
scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
|
|
251
|
+
```
|
|
252
|
+
|
|
253
|
+
### Ubuntu
|
|
254
|
+
|
|
255
|
+
In Ubuntu the following dependencies were required to make the menus appear:
|
|
202
256
|
|
|
203
257
|
```bash
|
|
204
258
|
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
205
259
|
pip install PyGObject
|
|
206
260
|
```
|
|
207
261
|
|
|
208
|
-
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
209
|
-
|
|
210
262
|
## Start as an application in GNOME
|
|
211
263
|
|
|
212
264
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
213
265
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
214
266
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
215
267
|
|
|
216
|
-
|
|
217
|
-
|
|
268
|
+
Consider the following two flavors:
|
|
218
269
|
```bash
|
|
219
|
-
scribe-install --clipboard
|
|
270
|
+
scribe-install --clipboard ...
|
|
271
|
+
scribe-install --name "Scribe App" --no-terminal --clipboard ...
|
|
220
272
|
```
|
|
221
|
-
(
|
|
273
|
+
The first will create an app named Scribe (the default) that simply opens a terminal and execute the command `scribe --clipboard ...`.
|
|
274
|
+
The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as only mode of interaction.
|
|
222
275
|
|
|
223
|
-
It is also possible to run an app fully outside the terminal:
|
|
224
|
-
```bash
|
|
225
|
-
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
|
|
226
|
-
```
|
|
227
276
|
|
|
228
277
|
## Fine tuning
|
|
229
278
|
|
|
@@ -1,5 +1,5 @@
|
|
|
1
|
-
[]()
|
|
2
1
|
[](https://pypi.org/project/scribe-cli)
|
|
2
|
+

|
|
3
3
|
|
|
4
4
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
5
5
|
|
|
@@ -9,22 +9,29 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
|
|
|
9
9
|
|
|
10
10
|
## Compatibility
|
|
11
11
|
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
12
|
+
The package is initially developped for python 3.12 with Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
|
|
13
|
+
Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
|
|
14
|
+
|
|
15
|
+
- python 3.13:
|
|
16
|
+
- at the time of writing, `openai-whisper` does not install.
|
|
17
|
+
|
|
18
|
+
- Ubuntu:
|
|
19
|
+
- see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
|
|
20
|
+
- MacOS:
|
|
21
|
+
- tested on a Macbook Air M1 8Gb RAM, with python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
|
|
22
|
+
- I expect better memory specs will have the local models run fine
|
|
23
|
+
- Windows:
|
|
24
|
+
- not tested yet
|
|
18
25
|
|
|
19
26
|
## Installation
|
|
20
27
|
|
|
21
|
-
Install PortAudio library and xclip library. E.g. on Ubuntu:
|
|
28
|
+
Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
|
|
22
29
|
|
|
23
30
|
```bash
|
|
24
31
|
sudo apt-get install portaudio19-dev xclip
|
|
25
32
|
```
|
|
26
33
|
|
|
27
|
-
See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
34
|
+
See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
28
35
|
|
|
29
36
|
```bash
|
|
30
37
|
pip install scribe-cli[all]
|
|
@@ -42,6 +49,37 @@ pip install -e .[all]
|
|
|
42
49
|
|
|
43
50
|
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
44
51
|
|
|
52
|
+
At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
|
|
53
|
+
|
|
54
|
+
### Manual selection of the dependencies
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
# language models (at least one must be installed !)
|
|
58
|
+
pip install vosk
|
|
59
|
+
pip install openai soundfile # openaiapi
|
|
60
|
+
pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
|
|
61
|
+
|
|
62
|
+
# PortAUDIO (sounddevice)
|
|
63
|
+
pip install sounddevice # automatically installed as required dependency
|
|
64
|
+
sudo apt-get install portaudio19-dev
|
|
65
|
+
|
|
66
|
+
# clipboard
|
|
67
|
+
pip install pyperclip # automatically installed as required dependency
|
|
68
|
+
sudo apt-get install xclip
|
|
69
|
+
|
|
70
|
+
# keyboard
|
|
71
|
+
pip install pynput
|
|
72
|
+
|
|
73
|
+
# app mode
|
|
74
|
+
# Uncommand the line below for Ubuntu !
|
|
75
|
+
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
|
|
76
|
+
pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
|
|
77
|
+
pip install pystray
|
|
78
|
+
|
|
79
|
+
# And finally
|
|
80
|
+
pip install scribe-cli
|
|
81
|
+
```
|
|
82
|
+
|
|
45
83
|
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
46
84
|
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
|
|
47
85
|
|
|
@@ -66,11 +104,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
66
104
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
67
105
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
68
106
|
|
|
69
|
-
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
107
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
|
|
70
108
|
```bash
|
|
71
|
-
|
|
109
|
+
export OPENAI_API_KEY=YOURAPIKEY
|
|
110
|
+
scribe --backend openaiapi
|
|
72
111
|
```
|
|
73
|
-
|
|
112
|
+
The `openaiapi` backend is lightweight and handy if you have an API (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
|
|
74
113
|
|
|
75
114
|
## Output media
|
|
76
115
|
|
|
@@ -106,9 +145,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
|
|
|
106
145
|
The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
107
146
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
108
147
|
|
|
109
|
-
#### Use the keyboard with Wayland
|
|
148
|
+
#### Use the keyboard with Wayland
|
|
110
149
|
|
|
111
|
-
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
150
|
+
In my Ubuntu 24.04 + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
112
151
|
|
|
113
152
|
One workaround is to use the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
|
|
114
153
|
|
|
@@ -122,40 +161,46 @@ You're on the right path :)
|
|
|
122
161
|
|
|
123
162
|
## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
124
163
|
|
|
164
|
+
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
165
|
+
|
|
125
166
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
126
167
|
To activate start with:
|
|
127
168
|
```bash
|
|
128
|
-
scribe --app
|
|
169
|
+
scribe --app ...
|
|
129
170
|
```
|
|
130
171
|
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
131
172
|
of predefined models (controlled by `--vosk-models` and `whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
|
|
132
173
|
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording/waiting, transcribing and idle.
|
|
133
|
-
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
174
|
+
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
175
|
+
|
|
176
|
+
The `--vosk-models` and `--whisper-models` allow to predefined the set of available models to choose from in the app manu. E.g.
|
|
177
|
+
```bash
|
|
178
|
+
scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
### Ubuntu
|
|
182
|
+
|
|
183
|
+
In Ubuntu the following dependencies were required to make the menus appear:
|
|
134
184
|
|
|
135
185
|
```bash
|
|
136
186
|
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
137
187
|
pip install PyGObject
|
|
138
188
|
```
|
|
139
189
|
|
|
140
|
-
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
141
|
-
|
|
142
190
|
## Start as an application in GNOME
|
|
143
191
|
|
|
144
192
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
145
193
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
146
194
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
147
195
|
|
|
148
|
-
|
|
149
|
-
|
|
196
|
+
Consider the following two flavors:
|
|
150
197
|
```bash
|
|
151
|
-
scribe-install --clipboard
|
|
198
|
+
scribe-install --clipboard ...
|
|
199
|
+
scribe-install --name "Scribe App" --no-terminal --clipboard ...
|
|
152
200
|
```
|
|
153
|
-
(
|
|
201
|
+
The first will create an app named Scribe (the default) that simply opens a terminal and execute the command `scribe --clipboard ...`.
|
|
202
|
+
The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as only mode of interaction.
|
|
154
203
|
|
|
155
|
-
It is also possible to run an app fully outside the terminal:
|
|
156
|
-
```bash
|
|
157
|
-
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
|
|
158
|
-
```
|
|
159
204
|
|
|
160
205
|
## Fine tuning
|
|
161
206
|
|
|
@@ -23,7 +23,11 @@ dependencies = [
|
|
|
23
23
|
]
|
|
24
24
|
|
|
25
25
|
classifiers = [
|
|
26
|
-
"Programming Language :: Python :: 3",
|
|
26
|
+
"Programming Language :: Python :: 3.9",
|
|
27
|
+
"Programming Language :: Python :: 3.10",
|
|
28
|
+
"Programming Language :: Python :: 3.11",
|
|
29
|
+
"Programming Language :: Python :: 3.12",
|
|
30
|
+
"Programming Language :: Python :: 3.13",
|
|
27
31
|
"Operating System :: OS Independent",
|
|
28
32
|
]
|
|
29
33
|
|
|
@@ -54,6 +54,9 @@ class AbstractTranscriber:
|
|
|
54
54
|
return self.timeout is not None and time.time() - self.start_time > self.timeout
|
|
55
55
|
|
|
56
56
|
def transcribe_realtime_audio(self, audio_bytes=b""):
|
|
57
|
+
"""This method is generic and assumes the underlying model does not handle real-time audio.
|
|
58
|
+
The Vosk model handles real-time audio, so this method is overridden in the VoskTranscriber class.
|
|
59
|
+
"""
|
|
57
60
|
|
|
58
61
|
# Vérifier si le segment est un silence
|
|
59
62
|
if is_silent(audio_bytes, self.silence_thresh):
|
|
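Reviewer note on the docstring added in the hunk above: the override it describes might look roughly like the toy sketch below. Only the names `AbstractTranscriber`, `VoskTranscriber`, `transcribe_realtime_audio`, `is_silent` and `silence_thresh` come from the diff; the buffering logic, the `transcribe_audio` placeholder and the byte-level silence test are assumptions made for illustration, not the package's actual implementation.

```python
def is_silent(audio_bytes, silence_thresh):
    # Hypothetical silence test: every byte is at or below the threshold.
    return all(b <= silence_thresh for b in audio_bytes)


class AbstractTranscriber:
    """Generic path: the model cannot stream, so buffer until a silent chunk arrives."""

    def __init__(self, silence_thresh=0):
        self.silence_thresh = silence_thresh
        self.buffer = b""

    def transcribe_audio(self, audio_bytes):
        # Placeholder standing in for a batch model call (assumption).
        return f"<{len(audio_bytes)} bytes transcribed>"

    def transcribe_realtime_audio(self, audio_bytes=b""):
        if is_silent(audio_bytes, self.silence_thresh):
            if not self.buffer:
                return None
            # Silence marks the end of an utterance: transcribe what was buffered.
            text = self.transcribe_audio(self.buffer)
            self.buffer = b""
            return text
        self.buffer += audio_bytes
        return None


class VoskTranscriber(AbstractTranscriber):
    """Streaming path: each chunk is handed to the model as it arrives."""

    def transcribe_realtime_audio(self, audio_bytes=b""):
        return self.transcribe_audio(audio_bytes)
```

In this sketch the base class returns text only after a silent chunk closes an utterance, while the subclass responds to every chunk, which is the distinction the new docstring draws.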
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.12.
|
|
3
|
+
Version: 0.12.2
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -34,7 +34,11 @@ License: MIT License
|
|
|
34
34
|
ensure compliance with their respective terms.
|
|
35
35
|
Project-URL: Homepage, https://github.com/perrette/scribe
|
|
36
36
|
Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
|
|
37
|
-
Classifier: Programming Language :: Python :: 3
|
|
37
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
41
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
38
42
|
Classifier: Operating System :: OS Independent
|
|
39
43
|
Requires-Python: >=3.9
|
|
40
44
|
Description-Content-Type: text/markdown
|
|
@@ -66,8 +70,8 @@ Requires-Dist: soundfile; extra == "all"
|
|
|
66
70
|
Requires-Dist: vosk; extra == "all"
|
|
67
71
|
Requires-Dist: pystray; extra == "all"
|
|
68
72
|
|
|
69
|
-
[]()
|
|
70
73
|
[](https://pypi.org/project/scribe-cli)
|
|
74
|
+

|
|
71
75
|
|
|
72
76
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
73
77
|
|
|
@@ -77,22 +81,29 @@ It features local, downloadable models with the `vosk` and `whisper` backends, a
|
|
|
77
81
|
|
|
78
82
|
## Compatibility
|
|
79
83
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
84
|
+
The package is initially developped for python 3.12 with Ubuntu 24.04 with Gnome + Wayland, but it should work on other platforms as well (feedback welcome).
|
|
85
|
+
Basically check the pages of the dependencies for more info (i.e. pynput for the keyboard, pystray for the app).
|
|
86
|
+
|
|
87
|
+
- python 3.13:
|
|
88
|
+
- at the time of writing, `openai-whisper` does not install.
|
|
89
|
+
|
|
90
|
+
- Ubuntu:
|
|
91
|
+
- see caveats in the use of the keyboard under Wayland [keyboard section](#use-the-keyboard-with-wayland).
|
|
92
|
+
- MacOS:
|
|
93
|
+
- tested on a Macbook Air M1 8Gb RAM, with python 3.12. It runs, but poorly, presumably because of the low memory: prefer the `openaiapi` backend for such machines
|
|
94
|
+
- I expect better memory specs will have the local models run fine
|
|
95
|
+
- Windows:
|
|
96
|
+
- not tested yet
|
|
86
97
|
|
|
87
98
|
## Installation
|
|
88
99
|
|
|
89
|
-
Install PortAudio library and xclip library. E.g. on Ubuntu:
|
|
100
|
+
Install PortAudio library (required by `sounddevice`) and xclip library (required by `pyperclip`). E.g. on Ubuntu:
|
|
90
101
|
|
|
91
102
|
```bash
|
|
92
103
|
sudo apt-get install portaudio19-dev xclip
|
|
93
104
|
```
|
|
94
105
|
|
|
95
|
-
See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
106
|
+
See additional requirements for the [icon tray](#system-tray-icon-experimental-) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
|
|
96
107
|
|
|
97
108
|
```bash
|
|
98
109
|
pip install scribe-cli[all]
|
|
@@ -110,6 +121,37 @@ pip install -e .[all]
|
|
|
110
121
|
|
|
111
122
|
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
112
123
|
|
|
124
|
+
At the time of writing `openai-whisper` does not install on `python 3.13`. You can install the packages manually and skip that package. This makes the `whisper` API unavailable.
|
|
125
|
+
|
|
126
|
+
### Manual selection of the dependencies
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
# language models (at least one must be installed !)
|
|
130
|
+
pip install vosk
|
|
131
|
+
pip install openai soundfile # openaiapi
|
|
132
|
+
pip install openai-whisper # FAILS IN PYTHON 3.13 on Ubuntu
|
|
133
|
+
|
|
134
|
+
# PortAUDIO (sounddevice)
|
|
135
|
+
pip install sounddevice # automatically installed as required dependency
|
|
136
|
+
sudo apt-get install portaudio19-dev
|
|
137
|
+
|
|
138
|
+
# clipboard
|
|
139
|
+
pip install pyperclip # automatically installed as required dependency
|
|
140
|
+
sudo apt-get install xclip
|
|
141
|
+
|
|
142
|
+
# keyboard
|
|
143
|
+
pip install pynput
|
|
144
|
+
|
|
145
|
+
# app mode
|
|
146
|
+
# Uncommand the line below for Ubuntu !
|
|
147
|
+
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1 # Ubuntu ONLY (not needed on MacOS)
|
|
148
|
+
pip install PyGObject # Ubuntu ONLY (not needed on MacOS)
|
|
149
|
+
pip install pystray
|
|
150
|
+
|
|
151
|
+
# And finally
|
|
152
|
+
pip install scribe-cli
|
|
153
|
+
```
|
|
154
|
+
|
|
113
155
|
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
114
156
|
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
|
|
115
157
|
|
|
@@ -134,11 +176,12 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
134
176
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
135
177
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
136
178
|
|
|
137
|
-
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
179
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key best passed as an environment variable, e.g. in bash:
|
|
138
180
|
```bash
|
|
139
|
-
|
|
181
|
+
export OPENAI_API_KEY=YOURAPIKEY
|
|
182
|
+
scribe --backend openaiapi
|
|
140
183
|
```
|
|
141
|
-
|
|
184
|
+
The `openaiapi` backend is lightweight and handy if you have an API (you can create one for free for testing) and a low-spec computer (and don't care too much about privacy, obviously).
|
|
142
185
|
|
|
143
186
|
## Output media
|
|
144
187
|
|
|
@@ -174,9 +217,9 @@ This can be extremely useful with the `vosk` backend and its realtime transcript
|
|
|
174
217
|
The `--keyboard` option relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
175
218
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
176
219
|
|
|
177
|
-
#### Use the keyboard with Wayland
|
|
220
|
+
#### Use the keyboard with Wayland
|
|
178
221
|
|
|
179
|
-
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
222
|
+
In my Ubuntu 24.04 + Wayland system, the keyboard simulation works out of the box in Chromium-based applications (including VS Code), but not in Firefox, Sublime Text, or anything else (not even in a terminal!). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
180
223
|
|
|
181
224
|
One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf`, uncomment `# WaylandEnable=false` and restart your computer.
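
The edit described above can be sketched with `sed`. The helper name `disable_wayland` is hypothetical, and the example runs on a scratch file rather than the real GDM configuration.

```bash
# Hypothetical helper (not part of scribe): uncomment WaylandEnable=false
# in the given GDM config file, keeping a .bak copy of the original.
disable_wayland() {
    sed -i.bak 's/^[[:space:]]*#[[:space:]]*WaylandEnable=false/WaylandEnable=false/' "$1"
}

# Try it on a scratch copy first.
scratch=$(mktemp)
printf '[daemon]\n# WaylandEnable=false\n' > "$scratch"
disable_wayland "$scratch"
cat "$scratch"

# Shows the current session type, typically "wayland" or "x11".
echo "${XDG_SESSION_TYPE:-unknown}"
```

On a real system the same `sed` command would need root privileges and the path `/etc/gdm3/custom.conf`, followed by a restart.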
|
|
182
225
|
|
|
@@ -190,40 +233,46 @@ You're on the right path :)
|
|
|
190
233
|
|
|
191
234
|
## System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
192
235
|
|
|
236
|
+
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
237
|
+
|
|
193
238
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via a tray icon.
|
|
194
239
|
To activate it, start with:
|
|
195
240
|
```bash
|
|
196
|
-
scribe --app
|
|
241
|
+
scribe --app ...
|
|
197
242
|
```
|
|
198
243
|
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
199
244
|
of predefined models (controlled by `--vosk-models` and `--whisper-models`) and options, or to Quit and choose from the terminal before pressing Enter again.
|
|
200
245
|
For the vosk model, there are only two states: recording + transcribing, or idle. For the whisper model, there are three states visible from the icon: recording/waiting, transcribing, and idle.
|
|
201
|
-
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
246
|
+
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option.
|
|
247
|
+
|
|
248
|
+
The `--vosk-models` and `--whisper-models` options let you predefine the set of models available in the app menu, e.g.:
|
|
249
|
+
```bash
|
|
250
|
+
scribe --app --vosk-models vosk-model-fr-0.22 --whisper-models small turbo ...
|
|
251
|
+
```
|
|
252
|
+
|
|
253
|
+
### Ubuntu
|
|
254
|
+
|
|
255
|
+
In Ubuntu the following dependencies were required to make the menus appear:
|
|
202
256
|
|
|
203
257
|
```bash
|
|
204
258
|
sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
205
259
|
pip install PyGObject
|
|
206
260
|
```
|
|
207
261
|
|
|
208
|
-
<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
|
|
209
|
-
|
|
210
262
|
## Start as an application in GNOME
|
|
211
263
|
|
|
212
264
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
213
265
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
214
266
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
215
267
|
|
|
216
|
-
|
|
217
|
-
|
|
268
|
+
Consider the following two flavors:
|
|
218
269
|
```bash
|
|
219
|
-
scribe-install --clipboard
|
|
270
|
+
scribe-install --clipboard ...
|
|
271
|
+
scribe-install --name "Scribe App" --no-terminal --clipboard ...
|
|
220
272
|
```
|
|
221
|
-
(
|
|
273
|
+
The first will create an app named Scribe (the default) that simply opens a terminal and executes the command `scribe --clipboard ...`.
|
|
274
|
+
The second will create an app named Scribe App that executes in a hidden terminal: `scribe --no-prompt --app --clipboard ...`, thus leaving the tray icon as the only mode of interaction.
|
|
222
275
|
|
|
223
|
-
It is also possible to run an app fully outside the terminal:
|
|
224
|
-
```bash
|
|
225
|
-
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY --vosk-models vosk-model-fr-0.22 --whisper-models small turbo
|
|
226
|
-
```
|
|
227
276
|
|
|
228
277
|
## Fine tuning
|
|
229
278
|
|
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
subversion=$1
|
|
2
|
+
version=3.$subversion
|
|
3
|
+
name=py3$subversion
|
|
4
|
+
|
|
5
|
+
MAMBAENV=~/.local/share/mamba/envs/$name
|
|
6
|
+
VENVDIR=~/.virtualenvs/$name
|
|
7
|
+
|
|
8
|
+
if [ ! -d $MAMBAENV ] ; then
|
|
9
|
+
micromamba create -n $name python=$version --prefix $MAMBAENV -y
|
|
10
|
+
else
|
|
11
|
+
echo "Environment $name already exists at $MAMBAENV"
|
|
12
|
+
fi
|
|
13
|
+
if [ ! -d $VENVDIR ] ; then
|
|
14
|
+
$MAMBAENV/bin/python3 -m venv $VENVDIR
|
|
15
|
+
else
|
|
16
|
+
echo "Virtualenv $name already exists at $VENVDIR"
|
|
17
|
+
fi
|
|
18
|
+
source ~/.virtualenvs/$name/bin/activate
|
|
19
|
+
pip install -U pip
|
|
20
|
+
pip install scribe-cli[all]
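
Since the script takes the Python minor version as its only argument, it could be driven by a loop like the one below. This is a sketch only: the version list and the `SCRIPT` and `failed` variables are assumptions for illustration and are not part of the package.

```bash
# Assumed location of the script above; override SCRIPT if it lives elsewhere.
SCRIPT=${SCRIPT:-scripts/test_python_versions_install.sh}

# Hypothetical list of Python 3 minor versions to check.
failed=""
for minor in 10 11 12; do
    # Each run happens in its own bash process, so the virtualenv it
    # activates does not leak into this shell.
    bash "$SCRIPT" "$minor" || failed="$failed 3.$minor"
done
echo "failed versions:${failed:- none}"
```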
|
|
File without changes
|
|