scribe-cli 0.7.8__tar.gz → 0.7.10__tar.gz
- {scribe_cli-0.7.8/scribe_cli.egg-info → scribe_cli-0.7.10}/PKG-INFO +10 -7
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/README.md +9 -6
- scribe_cli-0.7.10/icon.xcf +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/_version.py +2 -2
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/app.py +108 -54
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/models.py +12 -4
- {scribe_cli-0.7.8 → scribe_cli-0.7.10/scribe_cli.egg-info}/PKG-INFO +10 -7
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/SOURCES.txt +4 -1
- scribe_cli-0.7.10/scribe_data/share/icon.png +0 -0
- scribe_cli-0.7.10/scribe_data/share/icon_recording.png +0 -0
- scribe_cli-0.7.10/scribe_data/share/icon_writing.png +0 -0
- scribe_cli-0.7.8/scribe_data/share/icon.jpg +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/.gitignore +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/LICENSE +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/pyproject.toml +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/__init__.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/audio.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/keyboard.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/models.toml +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/saverecording.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/testpynput.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/util.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/requires.txt +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.7.8 → scribe_cli-0.7.10}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.7.
|
|
3
|
+
Version: 0.7.10
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -64,14 +64,17 @@ Requires-Dist: pystray; extra == "all"
|
|
|
64
64
|
[]()
|
|
65
65
|
[](https://pypi.org/project/scribe-cli)
|
|
66
66
|
|
|
67
|
-
# Scribe
|
|
67
|
+
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
68
68
|
|
|
69
69
|
`scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer.
|
|
70
70
|
|
|
71
71
|
## Compatibility
|
|
72
72
|
|
|
73
|
-
In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland)
|
|
74
|
-
|
|
73
|
+
In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
|
|
74
|
+
Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
|
|
75
|
+
and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
|
|
76
|
+
This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
|
|
77
|
+
Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
|
|
75
78
|
A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).
|
|
76
79
|
|
|
77
80
|
## Installation
|
|
@@ -147,7 +150,7 @@ scribe --keyboard
|
|
|
147
150
|
It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
148
151
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
149
152
|
|
|
150
|
-
#### Use the keyboard
|
|
153
|
+
#### Use the keyboard with Wayland (default for Ubuntu 24.04)
|
|
151
154
|
|
|
152
155
|
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
153
156
|
|
|
@@ -161,7 +164,7 @@ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput
|
|
|
161
164
|
```
|
|
162
165
|
You're on the right path :)
|
|
163
166
|
|
|
164
|
-
### System tray icon (experimental)
|
|
167
|
+
### System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
165
168
|
|
|
166
169
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
167
170
|
To activate start with:
|
|
@@ -176,7 +179,7 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
|
176
179
|
pip install PyGObject
|
|
177
180
|
```
|
|
178
181
|
|
|
179
|
-
### Start as an application in
|
|
182
|
+
### Start as an application in GNOME
|
|
180
183
|
|
|
181
184
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
182
185
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
@@ -1,14 +1,17 @@
|
|
|
1
1
|
[]()
|
|
2
2
|
[](https://pypi.org/project/scribe-cli)
|
|
3
3
|
|
|
4
|
-
# Scribe
|
|
4
|
+
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
5
5
|
|
|
6
6
|
`scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer.
|
|
7
7
|
|
|
8
8
|
## Compatibility
|
|
9
9
|
|
|
10
|
-
In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland)
|
|
11
|
-
|
|
10
|
+
In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
|
|
11
|
+
Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
|
|
12
|
+
and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
|
|
13
|
+
This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
|
|
14
|
+
Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
|
|
12
15
|
A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).
|
|
13
16
|
|
|
14
17
|
## Installation
|
|
@@ -84,7 +87,7 @@ scribe --keyboard
|
|
|
84
87
|
It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
85
88
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
86
89
|
|
|
87
|
-
#### Use the keyboard
|
|
90
|
+
#### Use the keyboard with Wayland (default for Ubuntu 24.04)
|
|
88
91
|
|
|
89
92
|
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
90
93
|
|
|
@@ -98,7 +101,7 @@ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput
|
|
|
98
101
|
```
|
|
99
102
|
You're on the right path :)
|
|
100
103
|
|
|
101
|
-
### System tray icon (experimental)
|
|
104
|
+
### System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
102
105
|
|
|
103
106
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
104
107
|
To activate start with:
|
|
@@ -113,7 +116,7 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
|
113
116
|
pip install PyGObject
|
|
114
117
|
```
|
|
115
118
|
|
|
116
|
-
### Start as an application in
|
|
119
|
+
### Start as an application in GNOME
|
|
117
120
|
|
|
118
121
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
119
122
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
Binary file
|
|
@@ -1,9 +1,10 @@
|
|
|
1
1
|
from pathlib import Path
|
|
2
2
|
import tomllib
|
|
3
|
+
import time
|
|
3
4
|
import argparse
|
|
4
5
|
from scribe.audio import Microphone
|
|
5
6
|
from scribe.util import print_partial, clear_line, prompt_choices, check_dependencies, ansi_link, colored
|
|
6
|
-
from scribe.models import VoskTranscriber, WhisperTranscriber
|
|
7
|
+
from scribe.models import VoskTranscriber, WhisperTranscriber
|
|
7
8
|
|
|
8
9
|
with open(Path(__file__).parent / "models.toml", "rb") as f:
|
|
9
10
|
language_config_default = tomllib.load(f)
|
|
@@ -55,9 +56,18 @@ class DummyTranscriber:
|
|
|
55
56
|
|
|
56
57
|
def get_transcriber(o, prompt=True):
|
|
57
58
|
|
|
59
|
+
whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
|
|
60
|
+
whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
|
|
61
|
+
|
|
58
62
|
if o.dummy:
|
|
59
63
|
return DummyTranscriber("whisper", "dummy")
|
|
60
64
|
|
|
65
|
+
if o.model and not o.backend:
|
|
66
|
+
if o.model.startswith("vosk-"):
|
|
67
|
+
o.backend = "vosk"
|
|
68
|
+
elif o.model in whisper_models + whisper_english_models:
|
|
69
|
+
o.backend = "whisper"
|
|
70
|
+
|
|
61
71
|
if o.backend:
|
|
62
72
|
checked_backend = check_dependencies(o.backend)
|
|
63
73
|
if not checked_backend:
|
|
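Reviewer note: the hunk above lets `get_transcriber` guess the backend when only a model name is given. A minimal standalone sketch of that rule follows, assuming the same two model lists as the diff. The helper name `infer_backend` is hypothetical and does not exist in `scribe`. The real code mutates `o.backend` in place rather than returning a value.

```python
from typing import Optional

# Model names copied from the hunk above.
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large", "turbo"]
WHISPER_ENGLISH_MODELS = ["tiny.en", "base.en", "small.en", "medium.en"]


def infer_backend(model: Optional[str], backend: Optional[str] = None) -> Optional[str]:
    """Return the backend implied by a model name (hypothetical helper).

    An explicitly requested backend always wins. An unrecognized model
    name yields None, so the caller can fall back to prompting.
    """
    if backend:
        return backend
    if not model:
        return None
    if model.startswith("vosk-"):
        return "vosk"
    if model in WHISPER_MODELS + WHISPER_ENGLISH_MODELS:
        return "whisper"
    return None
```

Returning `None` for unknown names mirrors the diff, where neither branch fires and `o.backend` stays unset.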
@@ -95,29 +105,26 @@ def get_transcriber(o, prompt=True):
|
|
|
95
105
|
print(f"Or pick one of the pre-defined languages: ", " ".join(available_languages))
|
|
96
106
|
exit(1)
|
|
97
107
|
choices = [language_config[backend][o.language]["model"]]
|
|
98
|
-
default_model = choices[0]
|
|
108
|
+
default_model = choices[0] # this is a string
|
|
99
109
|
|
|
100
110
|
else:
|
|
101
111
|
available_models = [language_config[backend][lang]["model"] for lang in available_languages]
|
|
102
112
|
choices = list(zip(available_models, available_languages)) + [f" * [Any model from {ansi_link('https://alphacephei.com/vosk/models')}]"]
|
|
103
|
-
default_model = choices[0]
|
|
113
|
+
default_model = choices[0] # this is a tuple !!
|
|
104
114
|
|
|
105
|
-
print(f"For information about vosk models see: {ansi_link('https://alphacephei.com/vosk/models')}")
|
|
106
115
|
if prompt:
|
|
107
|
-
|
|
116
|
+
print(f"For information about vosk models see: {ansi_link('https://alphacephei.com/vosk/models')}")
|
|
117
|
+
model = prompt_choices(choices, default=default_model, label="model") # this always returns a string
|
|
108
118
|
else:
|
|
109
|
-
model = default_model
|
|
119
|
+
model = default_model[0] if isinstance(default_model, tuple) else default_model # tuple -> string
|
|
110
120
|
|
|
111
121
|
elif backend == "whisper":
|
|
112
|
-
|
|
113
|
-
models = ["tiny", "base", "small", "medium", "large", "turbo"]
|
|
114
|
-
english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
|
|
115
122
|
default_model = "small"
|
|
116
|
-
|
|
117
|
-
print("Some models have a specialized English version (.en) which will be selected as default is `-l en` was requested, but can also be requested explicitly below (option not listed). See [documentation](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages).")
|
|
118
123
|
if prompt:
|
|
119
|
-
|
|
120
|
-
|
|
124
|
+
# print("Some models have a specialized English version (.en) which will be selected as default is `-l en` was requested, but can also be requested explicitly below (option not listed). See [documentation](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages).")
|
|
125
|
+
print(f"See {ansi_link('https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages')} for available models.")
|
|
126
|
+
model = prompt_choices(whisper_models, default=default_model, label="model",
|
|
127
|
+
hidden_models=whisper_english_models)
|
|
121
128
|
else:
|
|
122
129
|
model = default_model
|
|
123
130
|
|
|
@@ -186,7 +193,7 @@ def get_parser():
|
|
|
186
193
|
|
|
187
194
|
|
|
188
195
|
# Commencer l'enregistrement
|
|
189
|
-
def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, ascii=False, **greetings):
|
|
196
|
+
def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, ascii=False, callback=None, **greetings):
|
|
190
197
|
|
|
191
198
|
if keyboard:
|
|
192
199
|
from scribe.keyboard import type_text
|
|
@@ -210,7 +217,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
|
|
|
210
217
|
|
|
211
218
|
if clipboard:
|
|
212
219
|
fulltext += result['text'] + " "
|
|
213
|
-
pyperclip.copy(fulltext)
|
|
220
|
+
pyperclip.copy(fulltext.strip())
|
|
214
221
|
|
|
215
222
|
else:
|
|
216
223
|
print_partial(result.get('partial', ''))
|
|
@@ -218,22 +225,8 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
|
|
|
218
225
|
if clipboard:
|
|
219
226
|
print("Copied to clipboard.")
|
|
220
227
|
|
|
221
|
-
|
|
222
|
-
|
|
223
|
-
"""Thanks Le Chat for this solution: https://stackoverflow.com/a/325528/2192272
|
|
224
|
-
"""
|
|
225
|
-
import ctypes
|
|
226
|
-
thread = icon._recording_thread
|
|
227
|
-
# Raise an exception in the thread using ctypes
|
|
228
|
-
thread_id = thread.ident
|
|
229
|
-
if thread_id is not None:
|
|
230
|
-
res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
|
|
231
|
-
ctypes.c_long(thread_id),
|
|
232
|
-
ctypes.py_object(StopRecording)
|
|
233
|
-
)
|
|
234
|
-
if res > 1:
|
|
235
|
-
ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, 0)
|
|
236
|
-
print("Failure to raise exception in thread")
|
|
228
|
+
if callback:
|
|
229
|
+
callback()
|
|
237
230
|
|
|
238
231
|
|
|
239
232
|
def create_app(micro, transcriber, **kwargs):
|
|
@@ -246,7 +239,42 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
246
239
|
import threading
|
|
247
240
|
|
|
248
241
|
# Load an image from a file
|
|
249
|
-
image = Image.open(Path(scribe_data.__file__).parent / "share" / "icon.
|
|
242
|
+
image = Image.open(Path(scribe_data.__file__).parent / "share" / "icon.png")
|
|
243
|
+
image_recording = Image.open(Path(scribe_data.__file__).parent / "share" / "icon_recording.png")
|
|
244
|
+
image_writing = Image.open(Path(scribe_data.__file__).parent / "share" / "icon_writing.png")
|
|
245
|
+
|
|
246
|
+
if transcriber.backend == "vosk":
|
|
247
|
+
# Recording and writing happen at the same time in this backend
|
|
248
|
+
# Overlay the writing image on top of the base image
|
|
249
|
+
image_recording = Image.alpha_composite(image_recording.convert("RGBA"), image_writing.convert("RGBA"))
|
|
250
|
+
|
|
251
|
+
def update_icon(icon, force=False):
|
|
252
|
+
if transcriber.recording:
|
|
253
|
+
if force or getattr(icon, "_icon_label", None) != "recording":
|
|
254
|
+
icon.icon = image_recording
|
|
255
|
+
icon._icon_label = "recording"
|
|
256
|
+
icon.update_menu()
|
|
257
|
+
|
|
258
|
+
elif transcriber.busy:
|
|
259
|
+
if force or getattr(icon, "_icon_label", None) != "busy":
|
|
260
|
+
icon.icon = image_writing
|
|
261
|
+
icon._icon_label = "busy"
|
|
262
|
+
icon.update_menu()
|
|
263
|
+
|
|
264
|
+
else:
|
|
265
|
+
if force or getattr(icon, "_icon_label", None) != None:
|
|
266
|
+
icon.icon = image
|
|
267
|
+
icon._icon_label = None
|
|
268
|
+
icon.update_menu()
|
|
269
|
+
|
|
270
|
+
def start_monitoring(icon):
|
|
271
|
+
try:
|
|
272
|
+
while transcriber.busy:
|
|
273
|
+
update_icon(icon)
|
|
274
|
+
time.sleep(0.1)
|
|
275
|
+
|
|
276
|
+
finally:
|
|
277
|
+
update_icon(icon)
|
|
250
278
|
|
|
251
279
|
def callback_quit(icon, item):
|
|
252
280
|
icon.visible = False
|
|
@@ -255,16 +283,34 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
255
283
|
icon.stop()
|
|
256
284
|
|
|
257
285
|
def callback_stop_recording(icon, item):
|
|
258
|
-
|
|
259
|
-
|
|
260
|
-
|
|
286
|
+
# Here we need to stop the recording thread
|
|
287
|
+
|
|
288
|
+
transcriber.recording = False
|
|
289
|
+
if hasattr(icon, "_recording_thread"):
|
|
290
|
+
icon._recording_thread.join()
|
|
291
|
+
if hasattr(icon, "_monitoring_thread"):
|
|
292
|
+
icon._monitoring_thread.join()
|
|
261
293
|
|
|
262
294
|
def callback_record(icon, item):
|
|
295
|
+
# kwargs["callback"] = icon.update_menu # NOTE: the thread will finish AFTER the callback is complete
|
|
296
|
+
if transcriber.busy:
|
|
297
|
+
print("Still busy recording or transcribing.")
|
|
298
|
+
return
|
|
299
|
+
|
|
300
|
+
if hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive():
|
|
301
|
+
icon._recording_thread.join()
|
|
302
|
+
|
|
303
|
+
if hasattr(icon, "_monitoring_thread") and icon._monitoring_thread.is_alive():
|
|
304
|
+
icon._monitoring_thread.join()
|
|
305
|
+
|
|
306
|
+
transcriber.busy = True # this is a hack to prevent race conditions between the below threads
|
|
263
307
|
icon._recording_thread = threading.Thread(target=start_recording, args=(micro, transcriber), kwargs=kwargs)
|
|
264
308
|
icon._recording_thread.start()
|
|
309
|
+
icon._monitoring_thread = threading.Thread(target=start_monitoring, args=(icon,))
|
|
310
|
+
icon._monitoring_thread.start()
|
|
265
311
|
|
|
266
312
|
def is_recording(item):
|
|
267
|
-
return
|
|
313
|
+
return transcriber.busy
|
|
268
314
|
|
|
269
315
|
def is_not_recording(item):
|
|
270
316
|
return not is_recording(item)
|
|
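Reviewer note: this release drops the `ctypes.pythonapi.PyThreadState_SetAsyncExc` trick and instead stops the recording thread cooperatively, by clearing `transcriber.recording` and joining the thread. The sketch below illustrates that pattern in isolation. `Worker`, `start` and `stop` are hypothetical names, not part of `scribe`, and the loop body stands in for the real audio processing.

```python
import threading
import time


class Worker:
    """Hypothetical stand-in for the transcriber: two plain boolean flags."""

    def __init__(self):
        self.recording = False
        self.busy = False

    def run(self):
        try:
            # The loop polls the flag, so clearing it ends the thread cleanly.
            while self.recording:
                time.sleep(0.01)  # placeholder for reading audio chunks
        finally:
            self.recording = False
            self.busy = False


def start(worker):
    # Set the flags before starting the thread, so a quick stop() cannot
    # be overwritten by the thread setting them afterwards.
    worker.recording = True
    worker.busy = True
    thread = threading.Thread(target=worker.run)
    thread.start()
    return thread


def stop(worker, thread):
    worker.recording = False  # ask the loop to exit
    thread.join()             # wait until the cleanup in `finally` has run
```

Setting `busy` before the thread starts corresponds to the line the diff itself labels a hack to prevent race conditions between the recording and monitoring threads.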
@@ -272,7 +318,6 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
272
318
|
|
|
273
319
|
# Create a menu
|
|
274
320
|
menu = pystrayMenu(
|
|
275
|
-
# Item('Record', callback_record),
|
|
276
321
|
Item("Record", callback_record, visible=is_not_recording),
|
|
277
322
|
Item("Stop", callback_stop_recording, visible=is_recording),
|
|
278
323
|
Item('Quit', callback_quit),
|
|
@@ -294,38 +339,47 @@ def main(args=None):
|
|
|
294
339
|
micro = Microphone(samplerate=o.samplerate, device=o.microphone_device)
|
|
295
340
|
|
|
296
341
|
transcriber = None
|
|
297
|
-
|
|
298
|
-
toggle = {True: "On", False: "Off"}
|
|
342
|
+
details = False
|
|
299
343
|
|
|
300
344
|
while True:
|
|
301
345
|
if transcriber is None:
|
|
302
346
|
transcriber = get_transcriber(o, prompt=o.prompt)
|
|
303
347
|
print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
|
|
348
|
+
show_options = ["clipboard", "keyboard", "ascii", "app"]
|
|
349
|
+
activated_options = [colored(option, 'light_blue') for option in show_options if getattr(o, option)]
|
|
350
|
+
print(f"Options: {' | '.join(activated_options)}")
|
|
304
351
|
if o.prompt:
|
|
305
352
|
print(f"Choose any of the following actions")
|
|
306
353
|
print(f"{colored('[q]', 'light_yellow')} quit")
|
|
307
354
|
print(f"{colored('[e]', 'light_yellow')} change model")
|
|
308
|
-
|
|
309
|
-
|
|
310
|
-
|
|
311
|
-
|
|
312
|
-
|
|
313
|
-
|
|
314
|
-
|
|
315
|
-
|
|
316
|
-
|
|
317
|
-
|
|
318
|
-
|
|
319
|
-
|
|
320
|
-
|
|
321
|
-
|
|
322
|
-
|
|
355
|
+
if details:
|
|
356
|
+
print(f"{colored('[x]', 'light_yellow')} app is {colored(o.app, 'light_blue')} toggle?")
|
|
357
|
+
print(f"{colored('[c]', 'light_yellow')} clipboard is {colored(o.clipboard, 'light_blue')} toggle?")
|
|
358
|
+
print(f"{colored('[k]', 'light_yellow')} keyboard is {colored(o.keyboard, 'light_blue')} toggle?")
|
|
359
|
+
if o.keyboard:
|
|
360
|
+
print(f"{colored('[latency]', 'light_yellow')} between keystrokes is {colored(o.latency, 'light_blue')} s")
|
|
361
|
+
if transcriber.backend == "whisper":
|
|
362
|
+
print(f"{colored('[t]', 'light_yellow')} change duration (currently {colored(transcriber.timeout, 'light_blue')} s)")
|
|
363
|
+
print(f"{colored('[b]', 'light_yellow')} change silence (currently {colored(transcriber.silence_duration, 'light_blue')} s)")
|
|
364
|
+
print(f"{colored('[a]', 'light_yellow')} auto-restart after silence is {colored(transcriber.restart_after_silence, 'light_blue')} toggle?")
|
|
365
|
+
exclude_flags = ["keyboard", "clipboard", "app", "prompt", "restart_after_silence"]
|
|
366
|
+
display_flags = [a.dest for a in parser._actions if a.help != argparse.SUPPRESS]
|
|
367
|
+
for key, value in vars(o).items():
|
|
368
|
+
if key not in display_flags or key in exclude_flags or not isinstance(value, bool):
|
|
369
|
+
continue
|
|
370
|
+
print(f"{colored(f'[{key}]', 'light_yellow')} is {colored(value, 'light_blue')} toggle?")
|
|
371
|
+
print(f"{colored('[o]', 'light_yellow')} hide options")
|
|
372
|
+
else:
|
|
373
|
+
print(f"{colored('[o]', 'light_yellow')} show options")
|
|
323
374
|
|
|
324
375
|
print(colored(f"Press [Enter] to start recording.", attrs=["bold"]))
|
|
325
376
|
|
|
326
377
|
key = input()
|
|
327
378
|
if key == "q":
|
|
328
379
|
exit(0)
|
|
380
|
+
if key == "o":
|
|
381
|
+
details = not details
|
|
382
|
+
continue
|
|
329
383
|
if key == "e":
|
|
330
384
|
transcriber = None
|
|
331
385
|
o.model = None
|
|
@@ -32,6 +32,8 @@ class AbstractTranscriber:
|
|
|
32
32
|
self.silence_thresh = silence_thresh
|
|
33
33
|
self.silence_duration = silence_duration
|
|
34
34
|
self.restart_after_silence = restart_after_silence
|
|
35
|
+
self.recording = False
|
|
36
|
+
self.busy = False
|
|
35
37
|
self.reset()
|
|
36
38
|
|
|
37
39
|
def get_elapsed(self):
|
|
@@ -54,16 +56,18 @@ class AbstractTranscriber:
|
|
|
54
56
|
|
|
55
57
|
def start_recording(self, microphone,
|
|
56
58
|
start_message="Recording... Press Ctrl+C to stop.",
|
|
57
|
-
stop_message="
|
|
59
|
+
stop_message="Done transcribing."):
|
|
58
60
|
|
|
59
61
|
self.reset()
|
|
62
|
+
self.recording = True
|
|
63
|
+
self.busy = True
|
|
60
64
|
|
|
61
65
|
try:
|
|
62
66
|
|
|
63
67
|
with microphone.open_stream():
|
|
64
68
|
print(start_message)
|
|
65
69
|
|
|
66
|
-
while
|
|
70
|
+
while self.recording:
|
|
67
71
|
while not microphone.q.empty():
|
|
68
72
|
data = microphone.q.get()
|
|
69
73
|
|
|
@@ -78,7 +82,7 @@ class AbstractTranscriber:
|
|
|
78
82
|
self.reset()
|
|
79
83
|
yield result
|
|
80
84
|
else:
|
|
81
|
-
raise
|
|
85
|
+
raise StopRecording("Silence detected: {:.2f} seconds".format(silence_duration))
|
|
82
86
|
|
|
83
87
|
else:
|
|
84
88
|
self.last_sound_time = time.time()
|
|
@@ -86,14 +90,18 @@ class AbstractTranscriber:
|
|
|
86
90
|
yield self.transcribe_realtime_audio(data)
|
|
87
91
|
|
|
88
92
|
if self.is_overtime():
|
|
89
|
-
raise
|
|
93
|
+
raise StopRecording("Overtime: {:.2f} seconds".format(self.get_elapsed()))
|
|
94
|
+
|
|
95
|
+
time.sleep(0.1) # avoid overheating
|
|
90
96
|
|
|
91
97
|
except (KeyboardInterrupt, StopRecording):
|
|
92
98
|
pass
|
|
93
99
|
|
|
94
100
|
finally:
|
|
101
|
+
self.recording = False
|
|
95
102
|
result = self.finalize()
|
|
96
103
|
microphone.q.queue.clear()
|
|
104
|
+
self.busy = False
|
|
97
105
|
yield result
|
|
98
106
|
|
|
99
107
|
print(stop_message)
|
|
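Reviewer note: in the `models.py` hunks above, `start_recording` is a generator that sets `recording` and `busy` on entry, raises `StopRecording` on silence or overtime, and clears both flags in `finally`. The sketch below shows the same control flow on a list of strings instead of audio. The function `transcribe`, its `max_chunks` parameter and the `state` dict are assumptions made for illustration; the exact placement of the final `yield` in the real code cannot be read from this flattened diff.

```python
class StopRecording(Exception):
    """Raised inside the loop to end recording early."""


def transcribe(chunks, max_chunks, state):
    """Yield partial results, then one final result (hypothetical sketch)."""
    state["recording"] = True
    state["busy"] = True
    seen = []
    try:
        for chunk in chunks:
            if len(seen) >= max_chunks:
                # Stands in for the is_overtime() check in the diff.
                raise StopRecording(f"Overtime: {len(seen)} chunks")
            seen.append(chunk)
            yield {"partial": chunk}
    except (KeyboardInterrupt, StopRecording):
        pass  # both are treated as a normal end of recording
    finally:
        # The flags are cleared on every exit path, including errors.
        state["recording"] = False
        state["busy"] = False
    yield {"text": " ".join(seen)}
```

Because the flags are reset in `finally`, a consumer that polls them (such as the tray icon monitor) sees a consistent idle state even when the loop ends through an exception.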
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.7.
|
|
3
|
+
Version: 0.7.10
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -64,14 +64,17 @@ Requires-Dist: pystray; extra == "all"
|
|
|
64
64
|
[]()
|
|
65
65
|
[](https://pypi.org/project/scribe-cli)
|
|
66
66
|
|
|
67
|
-
# Scribe
|
|
67
|
+
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
68
68
|
|
|
69
69
|
`scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer.
|
|
70
70
|
|
|
71
71
|
## Compatibility
|
|
72
72
|
|
|
73
|
-
In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland)
|
|
74
|
-
|
|
73
|
+
In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
|
|
74
|
+
Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
|
|
75
|
+
and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
|
|
76
|
+
This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
|
|
77
|
+
Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
|
|
75
78
|
A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).
|
|
76
79
|
|
|
77
80
|
## Installation
|
|
@@ -147,7 +150,7 @@ scribe --keyboard
|
|
|
147
150
|
It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
148
151
|
Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
|
|
149
152
|
|
|
150
|
-
#### Use the keyboard
|
|
153
|
+
#### Use the keyboard with Wayland (default for Ubuntu 24.04)
|
|
151
154
|
|
|
152
155
|
In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
|
|
153
156
|
|
|
@@ -161,7 +164,7 @@ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput
|
|
|
161
164
|
```
|
|
162
165
|
You're on the right path :)
|
|
163
166
|
|
|
164
|
-
### System tray icon (experimental)
|
|
167
|
+
### System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>
|
|
165
168
|
|
|
166
169
|
To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
|
|
167
170
|
To activate start with:
|
|
@@ -176,7 +179,7 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
|
|
|
176
179
|
pip install PyGObject
|
|
177
180
|
```
|
|
178
181
|
|
|
179
|
-
### Start as an application in
|
|
182
|
+
### Start as an application in GNOME
|
|
180
183
|
|
|
181
184
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
182
185
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
@@ -1,6 +1,7 @@
|
|
|
1
1
|
.gitignore
|
|
2
2
|
LICENSE
|
|
3
3
|
README.md
|
|
4
|
+
icon.xcf
|
|
4
5
|
pyproject.toml
|
|
5
6
|
.github/workflows/pypi.yml
|
|
6
7
|
scribe/__init__.py
|
|
@@ -21,5 +22,7 @@ scribe_cli.egg-info/entry_points.txt
|
|
|
21
22
|
scribe_cli.egg-info/requires.txt
|
|
22
23
|
scribe_cli.egg-info/top_level.txt
|
|
23
24
|
scribe_data/__init__.py
|
|
24
|
-
scribe_data/share/icon.
|
|
25
|
+
scribe_data/share/icon.png
|
|
26
|
+
scribe_data/share/icon_recording.png
|
|
27
|
+
scribe_data/share/icon_writing.png
|
|
25
28
|
scribe_data/templates/scribe.desktop
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|