scribe-cli 0.7.8__tar.gz → 0.7.10__tar.gz

Files changed (31)
  1. {scribe_cli-0.7.8/scribe_cli.egg-info → scribe_cli-0.7.10}/PKG-INFO +10 -7
  2. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/README.md +9 -6
  3. scribe_cli-0.7.10/icon.xcf +0 -0
  4. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/_version.py +2 -2
  5. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/app.py +108 -54
  6. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/models.py +12 -4
  7. {scribe_cli-0.7.8 → scribe_cli-0.7.10/scribe_cli.egg-info}/PKG-INFO +10 -7
  8. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/SOURCES.txt +4 -1
  9. scribe_cli-0.7.10/scribe_data/share/icon.png +0 -0
  10. scribe_cli-0.7.10/scribe_data/share/icon_recording.png +0 -0
  11. scribe_cli-0.7.10/scribe_data/share/icon_writing.png +0 -0
  12. scribe_cli-0.7.8/scribe_data/share/icon.jpg +0 -0
  13. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/.github/workflows/pypi.yml +0 -0
  14. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/.gitignore +0 -0
  15. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/LICENSE +0 -0
  16. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/pyproject.toml +0 -0
  17. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/__init__.py +0 -0
  18. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/audio.py +0 -0
  19. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/install_desktop.py +0 -0
  20. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/keyboard.py +0 -0
  21. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/models.toml +0 -0
  22. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/saverecording.py +0 -0
  23. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/testpynput.py +0 -0
  24. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe/util.py +0 -0
  25. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/dependency_links.txt +0 -0
  26. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/entry_points.txt +0 -0
  27. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/requires.txt +0 -0
  28. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_cli.egg-info/top_level.txt +0 -0
  29. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_data/__init__.py +0 -0
  30. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/scribe_data/templates/scribe.desktop +0 -0
  31. {scribe_cli-0.7.8 → scribe_cli-0.7.10}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.7.8
+ Version: 0.7.10
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -64,14 +64,17 @@ Requires-Dist: pystray; extra == "all"
  [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
  [![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)

- # Scribe
+ # Scribe <img src="scribe_data/share/icon.png" width=48px>

  `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer.

  ## Compatibility

- In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) and develop it for my own purposes so glitches are likely on other configurations.
- As of February 19, 2025 python 13 is not supported (I can't recall now which dependency is to blame).
+ In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
+ Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
+ and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
+ This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
+ Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
  A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).

  ## Installation
@@ -147,7 +150,7 @@ scribe --keyboard
  It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
  Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).

- #### Use the keyboard in Ubuntu
+ #### Use the keyboard with Wayland (default for Ubuntu 24.04)

  In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.

@@ -161,7 +164,7 @@ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput
  ```
  You're on the right path :)

- ### System tray icon (experimental)
+ ### System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>

  To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
  To activate start with:
@@ -176,7 +179,7 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

- ### Start as an application in Ubuntu
+ ### Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`.
@@ -1,14 +1,17 @@
  [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
  [![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)

- # Scribe
+ # Scribe <img src="scribe_data/share/icon.png" width=48px>

  `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer.

  ## Compatibility

- In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) and develop it for my own purposes so glitches are likely on other configurations.
- As of February 19, 2025 python 13 is not supported (I can't recall now which dependency is to blame).
+ In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
+ Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
+ and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
+ This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
+ Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
  A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).

  ## Installation
@@ -84,7 +87,7 @@ scribe --keyboard
  It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
  Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).

- #### Use the keyboard in Ubuntu
+ #### Use the keyboard with Wayland (default for Ubuntu 24.04)

  In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.

@@ -98,7 +101,7 @@ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput
  ```
  You're on the right path :)

- ### System tray icon (experimental)
+ ### System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>

  To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
  To activate start with:
@@ -113,7 +116,7 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

- ### Start as an application in Ubuntu
+ ### Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`.
Binary file
@@ -12,5 +12,5 @@ __version__: str
  __version_tuple__: VERSION_TUPLE
  version_tuple: VERSION_TUPLE

- __version__ = version = '0.7.8'
- __version_tuple__ = version_tuple = (0, 7, 8)
+ __version__ = version = '0.7.10'
+ __version_tuple__ = version_tuple = (0, 7, 10)
@@ -1,9 +1,10 @@
  from pathlib import Path
  import tomllib
+ import time
  import argparse
  from scribe.audio import Microphone
  from scribe.util import print_partial, clear_line, prompt_choices, check_dependencies, ansi_link, colored
- from scribe.models import VoskTranscriber, WhisperTranscriber, StopRecording
+ from scribe.models import VoskTranscriber, WhisperTranscriber

  with open(Path(__file__).parent / "models.toml", "rb") as f:
      language_config_default = tomllib.load(f)
@@ -55,9 +56,18 @@ class DummyTranscriber:

  def get_transcriber(o, prompt=True):

+     whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
+     whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
+
      if o.dummy:
          return DummyTranscriber("whisper", "dummy")

+     if o.model and not o.backend:
+         if o.model.startswith("vosk-"):
+             o.backend = "vosk"
+         elif o.model in whisper_models + whisper_english_models:
+             o.backend = "whisper"
+
      if o.backend:
          checked_backend = check_dependencies(o.backend)
          if not checked_backend:
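Note on the hunk above: the added branch infers the backend from the model name when a model is given without a backend. The following is a minimal standalone sketch of that rule, not code from the package. `infer_backend` is a hypothetical helper introduced for illustration; the two model lists are copied from the hunk.

```python
from types import SimpleNamespace

# Model names copied from the hunk above.
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large", "turbo"]
WHISPER_ENGLISH_MODELS = ["tiny.en", "base.en", "small.en", "medium.en"]


def infer_backend(model, backend=None):
    """Return the backend implied by a model name, or None if it cannot be inferred.

    Hypothetical helper mirroring the added branch; an explicit backend always wins.
    """
    if backend:
        return backend
    if not model:
        return None
    if model.startswith("vosk-"):
        return "vosk"
    if model in WHISPER_MODELS + WHISPER_ENGLISH_MODELS:
        return "whisper"
    return None  # unknown name: leave the decision to the later prompt/default logic


# Stand-in for the parsed command-line options object `o`.
options = SimpleNamespace(model="small.en", backend=None)
options.backend = infer_backend(options.model, options.backend)
print(options.backend)
```

As in the hunk, a name that matches neither rule leaves the backend unset, so the code that follows still has to handle a missing backend.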
@@ -95,29 +105,26 @@ def get_transcriber(o, prompt=True):
                  print(f"Or pick one of the pre-defined languages: ", " ".join(available_languages))
                  exit(1)
              choices = [language_config[backend][o.language]["model"]]
-             default_model = choices[0]
+             default_model = choices[0] # this is a string

          else:
              available_models = [language_config[backend][lang]["model"] for lang in available_languages]
              choices = list(zip(available_models, available_languages)) + [f" * [Any model from {ansi_link('https://alphacephei.com/vosk/models')}]"]
-             default_model = choices[0]
+             default_model = choices[0] # this is a tuple !!

-         print(f"For information about vosk models see: {ansi_link('https://alphacephei.com/vosk/models')}")
          if prompt:
-             model = prompt_choices(choices, default=default_model, label="model")
+             print(f"For information about vosk models see: {ansi_link('https://alphacephei.com/vosk/models')}")
+             model = prompt_choices(choices, default=default_model, label="model") # this always returns a string
          else:
-             model = default_model
+             model = default_model[0] if isinstance(default_model, tuple) else default_model # tuple -> string

      elif backend == "whisper":
-
-         models = ["tiny", "base", "small", "medium", "large", "turbo"]
-         english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
          default_model = "small"
-
-         print("Some models have a specialized English version (.en) which will be selected as default is `-l en` was requested, but can also be requested explicitly below (option not listed). See [documentation](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages).")
          if prompt:
-             model = prompt_choices(models, default=default_model, label="model",
-                 hidden_models=english_models)
+             # print("Some models have a specialized English version (.en) which will be selected as default is `-l en` was requested, but can also be requested explicitly below (option not listed). See [documentation](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages).")
+             print(f"See {ansi_link('https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages')} for available models.")
+             model = prompt_choices(whisper_models, default=default_model, label="model",
+                 hidden_models=whisper_english_models)
          else:
              model = default_model
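Note on the hunk above: the `# tuple -> string` change is needed because the multi-language branch builds `choices` with `zip`, so its first entry is a `(model, language)` tuple, while the single-language branch yields a plain string. A small sketch of that normalization, under the assumption that only the first tuple element is the model name; `normalize_default` and the model names are placeholders, not part of the package.

```python
def normalize_default(default):
    """Return the model name whether `default` is a string or a (model, language) tuple."""
    return default[0] if isinstance(default, tuple) else default


# Single-language case: choices holds plain model names (placeholder name).
single_choices = ["placeholder-model-en"]

# Multi-language case: choices pairs each model with its language (placeholder names).
multi_choices = list(zip(["placeholder-model-en", "placeholder-model-fr"], ["en", "fr"]))

print(normalize_default(single_choices[0]))
print(normalize_default(multi_choices[0]))
```

Both calls print the same model name, which is the point of the fix: without a prompt, the caller gets a string in either branch.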
@@ -186,7 +193,7 @@ def get_parser():


  # Commencer l'enregistrement
- def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, ascii=False, **greetings):
+ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, ascii=False, callback=None, **greetings):

      if keyboard:
          from scribe.keyboard import type_text
@@ -210,7 +217,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=

              if clipboard:
                  fulltext += result['text'] + " "
-                 pyperclip.copy(fulltext)
+                 pyperclip.copy(fulltext.strip())

          else:
              print_partial(result.get('partial', ''))
@@ -218,22 +225,8 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
      if clipboard:
          print("Copied to clipboard.")

-
- def interrupt_app_thread(icon):
-     """Thanks Le Chat for this solution: https://stackoverflow.com/a/325528/2192272
-     """
-     import ctypes
-     thread = icon._recording_thread
-     # Raise an exception in the thread using ctypes
-     thread_id = thread.ident
-     if thread_id is not None:
-         res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
-             ctypes.c_long(thread_id),
-             ctypes.py_object(StopRecording)
-         )
-         if res > 1:
-             ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, 0)
-             print("Failure to raise exception in thread")
+     if callback:
+         callback()


  def create_app(micro, transcriber, **kwargs):
@@ -246,7 +239,42 @@ def create_app(micro, transcriber, **kwargs):
      import threading

      # Load an image from a file
-     image = Image.open(Path(scribe_data.__file__).parent / "share" / "icon.jpg")
+     image = Image.open(Path(scribe_data.__file__).parent / "share" / "icon.png")
+     image_recording = Image.open(Path(scribe_data.__file__).parent / "share" / "icon_recording.png")
+     image_writing = Image.open(Path(scribe_data.__file__).parent / "share" / "icon_writing.png")
+
+     if transcriber.backend == "vosk":
+         # Recording and writing happen at the same time in this backend
+         # Overlay the writing image on top of the base image
+         image_recording = Image.alpha_composite(image_recording.convert("RGBA"), image_writing.convert("RGBA"))
+
+     def update_icon(icon, force=False):
+         if transcriber.recording:
+             if force or getattr(icon, "_icon_label", None) != "recording":
+                 icon.icon = image_recording
+                 icon._icon_label = "recording"
+                 icon.update_menu()
+
+         elif transcriber.busy:
+             if force or getattr(icon, "_icon_label", None) != "busy":
+                 icon.icon = image_writing
+                 icon._icon_label = "busy"
+                 icon.update_menu()
+
+         else:
+             if force or getattr(icon, "_icon_label", None) != None:
+                 icon.icon = image
+                 icon._icon_label = None
+                 icon.update_menu()
+
+     def start_monitoring(icon):
+         try:
+             while transcriber.busy:
+                 update_icon(icon)
+                 time.sleep(0.1)
+
+         finally:
+             update_icon(icon)

      def callback_quit(icon, item):
          icon.visible = False
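Note on the hunk above: `update_icon` is polled every 0.1 s by `start_monitoring`, but it only swaps the tray image when the state label changes (or when `force` is set). The sketch below isolates that idea without pystray or Pillow. `FakeIcon`, the image file names used as plain strings, and the single-function form of `update_icon` are assumptions made for illustration; the hunk itself uses three separate branches.

```python
class FakeIcon:
    """Hypothetical stand-in for a tray icon; counts how often the image is replaced."""

    def __init__(self):
        self._image = "icon.png"
        self.swaps = 0

    @property
    def icon(self):
        return self._image

    @icon.setter
    def icon(self, image):
        self._image = image
        self.swaps += 1


# State label -> image; None is the idle state, as in the hunk.
IMAGES = {"recording": "icon_recording.png", "busy": "icon_writing.png", None: "icon.png"}


def update_icon(icon, state, force=False):
    """Replace the image only when the state label differs from the last one applied."""
    if force or getattr(icon, "_icon_label", None) != state:
        icon.icon = IMAGES[state]
        icon._icon_label = state


icon = FakeIcon()
# Simulated polling: the state is sampled more often than it changes.
for state in ["recording", "recording", "busy", "busy", None, None]:
    update_icon(icon, state)

print(icon.swaps, icon.icon)
```

Six polls produce three image swaps here, one per state transition, which is what keeps a frequent polling loop from redrawing the tray icon on every iteration.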
@@ -255,16 +283,34 @@ def create_app(micro, transcriber, **kwargs):
          icon.stop()

      def callback_stop_recording(icon, item):
-         ## Here we need to stop the recording thread
-         interrupt_app_thread(icon)
-         icon._recording_thread.join()
+         # Here we need to stop the recording thread
+
+         transcriber.recording = False
+         if hasattr(icon, "_recording_thread"):
+             icon._recording_thread.join()
+         if hasattr(icon, "_monitoring_thread"):
+             icon._monitoring_thread.join()

      def callback_record(icon, item):
+         # kwargs["callback"] = icon.update_menu # NOTE: the thread will finish AFTER the callback is complete
+         if transcriber.busy:
+             print("Still busy recording or transcribing.")
+             return
+
+         if hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive():
+             icon._recording_thread.join()
+
+         if hasattr(icon, "_monitoring_thread") and icon._monitoring_thread.is_alive():
+             icon._monitoring_thread.join()
+
+         transcriber.busy = True # this is a hack to prevent race conditions between the below threads
          icon._recording_thread = threading.Thread(target=start_recording, args=(micro, transcriber), kwargs=kwargs)
          icon._recording_thread.start()
+         icon._monitoring_thread = threading.Thread(target=start_monitoring, args=(icon,))
+         icon._monitoring_thread.start()

      def is_recording(item):
-         return hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive()
+         return transcriber.busy

      def is_not_recording(item):
          return not is_recording(item)
@@ -272,7 +318,6 @@ def create_app(micro, transcriber, **kwargs):

      # Create a menu
      menu = pystrayMenu(
-         # Item('Record', callback_record),
          Item("Record", callback_record, visible=is_not_recording),
          Item("Stop", callback_stop_recording, visible=is_recording),
          Item('Quit', callback_quit),
@@ -294,38 +339,47 @@ def main(args=None):
      micro = Microphone(samplerate=o.samplerate, device=o.microphone_device)

      transcriber = None
-
-     toggle = {True: "On", False: "Off"}
+     details = False

      while True:
          if transcriber is None:
              transcriber = get_transcriber(o, prompt=o.prompt)
          print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
+         show_options = ["clipboard", "keyboard", "ascii", "app"]
+         activated_options = [colored(option, 'light_blue') for option in show_options if getattr(o, option)]
+         print(f"Options: {' | '.join(activated_options)}")
          if o.prompt:
              print(f"Choose any of the following actions")
              print(f"{colored('[q]', 'light_yellow')} quit")
              print(f"{colored('[e]', 'light_yellow')} change model")
-             print(f"{colored('[x]', 'light_yellow')} app is {colored(o.app, 'light_blue')} toggle?")
-             print(f"{colored('[c]', 'light_yellow')} clipboard is {colored(o.clipboard, 'light_blue')} toggle?")
-             print(f"{colored('[k]', 'light_yellow')} keyboard is {colored(o.keyboard, 'light_blue')} toggle?")
-             if o.keyboard:
-                 print(f"{colored('[latency]', 'light_yellow')} between keystrokes is {colored(o.latency, 'light_blue')} s")
-             if transcriber.backend == "whisper":
-                 print(f"{colored('[t]', 'light_yellow')} change duration (currently {colored(transcriber.timeout, 'light_blue')} s)")
-                 print(f"{colored('[b]', 'light_yellow')} change silence duration (currently {colored(transcriber.silence_duration, 'light_blue')} s)")
-                 print(f"{colored('[a]', 'light_yellow')} auto-restart after silence is {colored(transcriber.restart_after_silence, 'light_blue')} toggle?")
-             exclude_flags = ["keyboard", "clipboard", "app", "prompt", "restart_after_silence"]
-             display_flags = [a.dest for a in parser._actions if a.help != argparse.SUPPRESS]
-             for key, value in vars(o).items():
-                 if key not in display_flags or key in exclude_flags or not isinstance(value, bool):
-                     continue
-                 print(f"{colored(f'[{key}]', 'light_yellow')} is {colored(value, 'light_blue')} toggle?")
+             if details:
+                 print(f"{colored('[x]', 'light_yellow')} app is {colored(o.app, 'light_blue')} toggle?")
+                 print(f"{colored('[c]', 'light_yellow')} clipboard is {colored(o.clipboard, 'light_blue')} toggle?")
+                 print(f"{colored('[k]', 'light_yellow')} keyboard is {colored(o.keyboard, 'light_blue')} toggle?")
+                 if o.keyboard:
+                     print(f"{colored('[latency]', 'light_yellow')} between keystrokes is {colored(o.latency, 'light_blue')} s")
+                 if transcriber.backend == "whisper":
+                     print(f"{colored('[t]', 'light_yellow')} change duration (currently {colored(transcriber.timeout, 'light_blue')} s)")
+                     print(f"{colored('[b]', 'light_yellow')} change silence (currently {colored(transcriber.silence_duration, 'light_blue')} s)")
+                     print(f"{colored('[a]', 'light_yellow')} auto-restart after silence is {colored(transcriber.restart_after_silence, 'light_blue')} toggle?")
+                 exclude_flags = ["keyboard", "clipboard", "app", "prompt", "restart_after_silence"]
+                 display_flags = [a.dest for a in parser._actions if a.help != argparse.SUPPRESS]
+                 for key, value in vars(o).items():
+                     if key not in display_flags or key in exclude_flags or not isinstance(value, bool):
+                         continue
+                     print(f"{colored(f'[{key}]', 'light_yellow')} is {colored(value, 'light_blue')} toggle?")
+                 print(f"{colored('[o]', 'light_yellow')} hide options")
+             else:
+                 print(f"{colored('[o]', 'light_yellow')} show options")

              print(colored(f"Press [Enter] to start recording.", attrs=["bold"]))

              key = input()
              if key == "q":
                  exit(0)
+             if key == "o":
+                 details = not details
+                 continue
              if key == "e":
                  transcriber = None
                  o.model = None
@@ -32,6 +32,8 @@ class AbstractTranscriber:
          self.silence_thresh = silence_thresh
          self.silence_duration = silence_duration
          self.restart_after_silence = restart_after_silence
+         self.recording = False
+         self.busy = False
          self.reset()

      def get_elapsed(self):
@@ -54,16 +56,18 @@ class AbstractTranscriber:

      def start_recording(self, microphone,
                          start_message="Recording... Press Ctrl+C to stop.",
-                         stop_message="Stopped recording."):
+                         stop_message="Done transcribing."):

          self.reset()
+         self.recording = True
+         self.busy = True

          try:

              with microphone.open_stream():
                  print(start_message)

-                 while True:
+                 while self.recording:
                      while not microphone.q.empty():
                          data = microphone.q.get()

@@ -78,7 +82,7 @@ class AbstractTranscriber:
                                  self.reset()
                                  yield result
                              else:
-                                 raise KeyboardInterrupt("Silence detected: {:.2f} seconds".format(silence_duration))
+                                 raise StopRecording("Silence detected: {:.2f} seconds".format(silence_duration))

                          else:
                              self.last_sound_time = time.time()
@@ -86,14 +90,18 @@ class AbstractTranscriber:
                          yield self.transcribe_realtime_audio(data)

                      if self.is_overtime():
-                         raise KeyboardInterrupt("Overtime: {:.2f} seconds".format(self.get_elapsed()))
+                         raise StopRecording("Overtime: {:.2f} seconds".format(self.get_elapsed()))
+
+                     time.sleep(0.1) # avoid overheating

          except (KeyboardInterrupt, StopRecording):
              pass

          finally:
+             self.recording = False
              result = self.finalize()
              microphone.q.queue.clear()
+             self.busy = False
              yield result

          print(stop_message)
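Note on the `models.py` hunks above: together with the removal of `interrupt_app_thread` from `app.py`, they replace an exception injected into the recording thread through `ctypes` with a `recording` flag that the loop polls, plus a `busy` flag that stays set until finalization is done. A minimal sketch of this cooperative-stop pattern, using only the standard library; `Recorder`, `consume`, and the string results are hypothetical and stand in for the transcriber, the recording thread target, and the transcription output.

```python
import threading
import time


class Recorder:
    """Toy stand-in for a transcriber whose loop is guarded by a `recording` flag."""

    def __init__(self):
        self.recording = False
        self.busy = False

    def run(self):
        self.recording = True
        self.busy = True
        chunks = 0
        try:
            while self.recording:       # another thread clears this flag to request a stop
                chunks += 1
                yield f"partial {chunks}"
                time.sleep(0.01)        # short pause so the loop does not spin at full speed
        finally:
            self.recording = False
            self.busy = False           # cleared only once the loop has really ended
        yield "final"


def consume(recorder, results):
    """Thread target: drain the generator and collect its results."""
    for item in recorder.run():
        results.append(item)


recorder = Recorder()
results = []
worker = threading.Thread(target=consume, args=(recorder, results), daemon=True)
worker.start()

while not results:                      # wait until the loop has produced something
    time.sleep(0.005)

recorder.recording = False              # cooperative stop request
worker.join(timeout=2)
print(worker.is_alive(), recorder.busy, results[-1])
```

The design trade-off is that the loop only notices the request at its next check of the flag, so a stop can lag by up to one polling interval; in exchange, cleanup in `finally` always runs in the thread that owns the resources.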
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.7.8
+ Version: 0.7.10
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -64,14 +64,17 @@ Requires-Dist: pystray; extra == "all"
  [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
  [![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)

- # Scribe
+ # Scribe <img src="scribe_data/share/icon.png" width=48px>

  `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer.

  ## Compatibility

- In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) and develop it for my own purposes so glitches are likely on other configurations.
- As of February 19, 2025 python 13 is not supported (I can't recall now which dependency is to blame).
+ In principle `scribe` is compatible with any OS but I develop it under Ubuntu (Wayland) for my own purposes so glitches are likely on other configurations.
+ Moreover there are quite a bit of dependencies that rely on very OS-specific protocols under the hood, like access to the microphone, keyboard and clipboard,
+ and even though the python dependencies `scribe` relies on are not restricted to a single platform, there may be limitation and additional binaries to install.
+ This guide is based on python3.12 running on Ubuntu 24.04 with Gnome + Wayland, which is a relatively standard setting at the time of writing.
+ Note as of February 19, 2025 python 13 does not seem to produce any transcription (I am not sure which dependency is to blame).
  A test on Mac OS (M1 Air with 8Gb RAM) worked with python 12, though with a much inferior performance compared to my own system (Lenovo T14 Gen 5 with i5 125U 32 Gb RAM).

  ## Installation
@@ -147,7 +150,7 @@ scribe --keyboard
  It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
  Depending on your operating system, `pynput` may require additional configuration to work around its [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).

- #### Use the keyboard in Ubuntu
+ #### Use the keyboard with Wayland (default for Ubuntu 24.04)

  In my Ubuntu + Wayland system the keyboard simulation works out-of-the-box in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.

@@ -161,7 +164,7 @@ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput
  ```
  You're on the right path :)

- ### System tray icon (experimental)
+ ### System tray icon (experimental) <img src="scribe_data/share/icon.png" width=48px>

  To avoid switching back and forth with the terminal, it's possible to interact with the program via an icon tray.
  To activate start with:
@@ -176,7 +179,7 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

- ### Start as an application in Ubuntu
+ ### Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`.
@@ -1,6 +1,7 @@
  .gitignore
  LICENSE
  README.md
+ icon.xcf
  pyproject.toml
  .github/workflows/pypi.yml
  scribe/__init__.py
@@ -21,5 +22,7 @@ scribe_cli.egg-info/entry_points.txt
  scribe_cli.egg-info/requires.txt
  scribe_cli.egg-info/top_level.txt
  scribe_data/__init__.py
- scribe_data/share/icon.jpg
+ scribe_data/share/icon.png
+ scribe_data/share/icon_recording.png
+ scribe_data/share/icon_writing.png
  scribe_data/templates/scribe.desktop
Binary file
File without changes