scribe-cli 0.10.0__tar.gz → 0.11.1__tar.gz

Files changed (30)
  1. {scribe_cli-0.10.0/scribe_cli.egg-info → scribe_cli-0.11.1}/PKG-INFO +16 -11
  2. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/README.md +16 -11
  3. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/_version.py +2 -2
  4. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/app.py +102 -57
  5. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/models.py +9 -4
  6. {scribe_cli-0.10.0 → scribe_cli-0.11.1/scribe_cli.egg-info}/PKG-INFO +16 -11
  7. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/.github/workflows/pypi.yml +0 -0
  8. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/.gitignore +0 -0
  9. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/LICENSE +0 -0
  10. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/icon.xcf +0 -0
  11. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/pyproject.toml +0 -0
  12. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/__init__.py +0 -0
  13. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/audio.py +0 -0
  14. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/install_desktop.py +0 -0
  15. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/keyboard.py +0 -0
  16. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/models.toml +0 -0
  17. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/saverecording.py +0 -0
  18. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/testpynput.py +0 -0
  19. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/util.py +0 -0
  20. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/SOURCES.txt +0 -0
  21. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/dependency_links.txt +0 -0
  22. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/entry_points.txt +0 -0
  23. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/requires.txt +0 -0
  24. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/top_level.txt +0 -0
  25. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/__init__.py +0 -0
  26. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/share/icon.png +0 -0
  27. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/share/icon_recording.png +0 -0
  28. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/share/icon_writing.png +0 -0
  29. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/templates/scribe.desktop +0 -0
  30. {scribe_cli-0.10.0 → scribe_cli-0.11.1}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.10.0
+ Version: 0.11.1
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -158,7 +158,7 @@ The content of the (full) transcription is then pasted to the clipboard, and it
  Alternatively an output file can be indicated:

  ```bash
- --keyboard -o transcription.txt
+ scribe -o transcription.txt
  ```

  ### Virtual keyboard (experimental)
@@ -195,7 +195,8 @@ To activate start with:
  ```bash
  scribe --app
  ```
- or toggle the app option in the interactive menu. The scribe icon will show, with Record, Stop or Quit options. The icon will change based on what the app is doing.
+ or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+ of predefined models, or to Quit and choose from the terminal before pressing Enter again.
  For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
  That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:

@@ -204,23 +205,27 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

+ <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
  ## Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
  `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.

- e.g.
+ In a relatively basic form
+
+ ```bash
+ scribe-install --clipboard --api YOUROPENAIAPIKEY
+ ```
+ (`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+ And to make an app running outside the terminal:

  ```bash
- scribe-install
- scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
- scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+ scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY
  ```
- This will install three separate apps:
- - `Super + scribe` : will launch the default version with terminal prompt
- - `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard.
- - `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal (you need to press Record in the tray icon menu to start the recording).
+ This will install two separate apps (names "Scribe" and "Scribe App")


  ## Fine tuning
@@ -90,7 +90,7 @@ The content of the (full) transcription is then pasted to the clipboard, and it
  Alternatively an output file can be indicated:

  ```bash
- --keyboard -o transcription.txt
+ scribe -o transcription.txt
  ```

  ### Virtual keyboard (experimental)
@@ -127,7 +127,8 @@ To activate start with:
  ```bash
  scribe --app
  ```
- or toggle the app option in the interactive menu. The scribe icon will show, with Record, Stop or Quit options. The icon will change based on what the app is doing.
+ or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+ of predefined models, or to Quit and choose from the terminal before pressing Enter again.
  For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
  That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:

@@ -136,23 +137,27 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

+ <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
  ## Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
  `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.

- e.g.
+ In a relatively basic form
+
+ ```bash
+ scribe-install --clipboard --api YOUROPENAIAPIKEY
+ ```
+ (`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+ And to make an app running outside the terminal:

  ```bash
- scribe-install
- scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
- scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+ scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY
  ```
- This will install three separate apps:
- - `Super + scribe` : will launch the default version with terminal prompt
- - `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard.
- - `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal (you need to press Record in the tray icon menu to start the recording).
+ This will install two separate apps (names "Scribe" and "Scribe App")


  ## Fine tuning
@@ -162,4 +167,4 @@ Best is to check the available options in the online help:

  ```bash
  scribe --help
- ```
+ ```
@@ -12,5 +12,5 @@ __version__: str
  __version_tuple__: VERSION_TUPLE
  version_tuple: VERSION_TUPLE

- __version__ = version = '0.10.0'
- __version_tuple__ = version_tuple = (0, 10, 0)
+ __version__ = version = '0.11.1'
+ __version_tuple__ = version_tuple = (0, 11, 1)
@@ -55,49 +55,54 @@ class DummyTranscriber:
  def __getattr__(self, item):
  return None

- def get_transcriber(o, prompt=True):
+ whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
+ whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
+ whisperapi_models = ["whisper-1"]
+ vosk_models = [language_config["vosk"][lang]["model"] for lang in language_config["vosk"]]

- whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
- whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
- whisperapi_models = ["whisper-1"]

- if o.dummy:
+ def get_transcriber(model=None, backend=None, dummy=False, prompt=True, language=None,
+ samplerate=None, duration=None, silence=None, silence_db=None, restart_after_silence=None,
+ api_key=None,
+ download_folder_vosk=None, download_folder_whisper=None, **kwargs):
+
+ if dummy:
  return DummyTranscriber("whisper", "dummy")

- if o.model and not o.backend:
- if o.model.startswith("vosk-"):
- o.backend = "vosk"
- elif o.model in whisper_models + whisper_english_models:
- o.backend = "whisper"
- elif o.model in whisperapi_models:
- o.backend = "openaiapi"
+ if model and not backend:
+ if model.startswith("vosk-"):
+ backend = "vosk"
+ elif model in whisper_models + whisper_english_models:
+ backend = "whisper"
+ elif model in whisperapi_models:
+ backend = "openaiapi"

- if o.backend:
- backend = o.backend
+ if backend:
+ backend = backend

  elif not prompt:
  backend = BACKENDS[0]

  else:
- backend = prompt_choices(BACKENDS, o.backend, "backend", UNAVAILABLE_BACKENDS)
+ backend = prompt_choices(BACKENDS, backend, "backend", UNAVAILABLE_BACKENDS)

  print(f"Selected backend: {backend}")

- if o.model:
- model = pick_specialist_model(o.model, o.language, backend)
+ if model:
+ model = pick_specialist_model(model, language, backend)

  else:

  if backend == "vosk":
  available_languages = list(language_config[backend])
- if o.language:
- if o.language not in available_languages:
- print(f"Language '{o.language}' is not pre-defined (yet) for backend '{backend}'.")
+ if language:
+ if language not in available_languages:
+ print(f"Language '{language}' is not pre-defined (yet) for backend '{backend}'.")
  print(f"Yet it may actually exist.")
  print(f"Please choose the model explictly from {ansi_link('https://alphacephei.com/vosk/models')}.")
  print(f"Or pick one of the pre-defined languages: ", " ".join(available_languages))
  exit(1)
- choices = [language_config[backend][o.language]["model"]]
+ choices = [language_config[backend][language]["model"]]
  default_model = choices[0] # this is a string

  else:
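Review note: the model-to-backend inference in the hunk above now works on plain arguments instead of mutating `o`. A minimal sketch of that rule in isolation, assuming a hypothetical helper name `infer_backend` (it does not exist in the package) and copying the three model lists from the diff:

```python
from typing import Optional

# Model lists copied from the module-level constants added in this diff.
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large", "turbo"]
WHISPER_ENGLISH_MODELS = ["tiny.en", "base.en", "small.en", "medium.en"]
WHISPERAPI_MODELS = ["whisper-1"]


def infer_backend(model: Optional[str], backend: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper mirroring the `if model and not backend:` branch."""
    if backend or not model:
        # An explicit backend is kept as-is; without a model there is nothing to infer.
        return backend
    if model.startswith("vosk-"):
        return "vosk"
    if model in WHISPER_MODELS + WHISPER_ENGLISH_MODELS:
        return "whisper"
    if model in WHISPERAPI_MODELS:
        return "openaiapi"
    # Unknown model name: the real function then falls back to the default or the prompt.
    return None
```

Because the result is returned rather than written back to the options object, the same rule can be reused for each entry of the tray menu without side effects on the parsed arguments.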
@@ -121,10 +126,10 @@ def get_transcriber(o, prompt=True):
  else:
  model = default_model

- model = pick_specialist_model(model, o.language, backend)
+ model = pick_specialist_model(model, language, backend)

  elif backend == "openaiapi":
- model = o.model or "whisper-1"
+ model = model or "whisper-1"

  else:
  raise ValueError(f"Unknown backend: {backend}")
@@ -135,26 +140,26 @@ def get_transcriber(o, prompt=True):
  if backend == "vosk":
  try:
  transcriber = VoskTranscriber(model_name=model,
- language=o.language,
- samplerate=o.samplerate,
+ language=language,
+ samplerate=samplerate,
  timeout=None, # vosk keeps going (no timeout)
  silence_duration=None, # vosk handles silences internally
- model_kwargs={"download_root": o.download_folder_vosk})
+ model_kwargs={"download_root": download_folder_vosk})
  except Exception as error:
  print(error)
  print(f"Failed to (down)load model {model}.")
  exit(1)

  elif backend == "whisper":
- transcriber = WhisperTranscriber(model_name=model, language=o.language, samplerate=o.samplerate,
- timeout=o.duration, silence_duration=o.silence, silence_thresh=o.silence_db,
- restart_after_silence=o.restart_after_silence,
- model_kwargs={"download_root": o.download_folder_whisper})
+ transcriber = WhisperTranscriber(model_name=model, language=language, samplerate=samplerate,
+ timeout=duration, silence_duration=silence, silence_thresh=silence_db,
+ restart_after_silence=restart_after_silence,
+ model_kwargs={"download_root": download_folder_whisper})

  elif backend == "openaiapi":
- transcriber = OpenaiAPITranscriber(model_name=model, samplerate=o.samplerate,
- timeout=o.duration, silence_duration=o.silence, silence_thresh=o.silence_db,
- restart_after_silence=o.restart_after_silence, api_key=o.api_key)
+ transcriber = OpenaiAPITranscriber(model_name=model, samplerate=samplerate,
+ timeout=duration, silence_duration=silence, silence_thresh=silence_db,
+ restart_after_silence=restart_after_silence, api_key=api_key)


  else:
@@ -246,7 +251,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
  callback()


- def create_app(micro, transcriber, **kwargs):
+ def create_app(micro, transcriber, other_transcribers=None, **kwargs):
  import pystray
  from pystray import Menu as pystrayMenu, MenuItem as Item
  from PIL import Image
@@ -266,15 +271,8 @@ def create_app(micro, transcriber, **kwargs):
  image_recording = Image.alpha_composite(image_recording.convert("RGBA"), image_writing.convert("RGBA"))

  def update_icon(icon, force=False):
- if transcriber.recording and transcriber.waiting:
- # this is the situation with the whisper backend when the microphone is recording
- # but we wait for the speaker to speak (silence)
- if force or getattr(icon, "_icon_label", None) != None:
- icon.icon = image
- icon._icon_label = None
- icon.update_menu()
-
- elif transcriber.recording:
+ transcriber = icon._transcriber
+ if transcriber.recording:
  if force or getattr(icon, "_icon_label", None) != "recording":
  icon.icon = image_recording
  icon._icon_label = "recording"
@@ -293,6 +291,7 @@ def create_app(micro, transcriber, **kwargs):
  icon.update_menu()

  def start_monitoring(icon):
+ transcriber = icon._transcriber
  try:
  while transcriber.busy:
  update_icon(icon)
@@ -308,8 +307,8 @@ def create_app(micro, transcriber, **kwargs):
  icon.stop()

  def callback_stop_recording(icon, item):
+ transcriber = icon._transcriber
  # Here we need to stop the recording thread
-
  transcriber.interrupt = True
  if hasattr(icon, "_recording_thread"):
  icon._recording_thread.join()
@@ -317,10 +316,10 @@ def create_app(micro, transcriber, **kwargs):
  icon._monitoring_thread.join()

  def callback_record(icon, item):
- # kwargs["callback"] = icon.update_menu # NOTE: the thread will finish AFTER the callback is complete
+ transcriber = icon._transcriber
  if transcriber.busy:
- transcriber.log("Still busy recording or transcribing.")
- return
+ # transcriber.log("Still busy recording or transcribing.")
+ return callback_stop_recording(icon, item) # play / stop behavior

  if hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive():
  icon._recording_thread.join()
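Review note: the callbacks above now read `icon._transcriber` on every call instead of closing over the `transcriber` argument of `create_app`. The likely motivation is that `callback_set_model`, added later in this diff, replaces the transcriber at runtime, and a value captured by a closure would not see that replacement. A minimal sketch of the difference, assuming a hypothetical `Holder` class standing in for the pystray icon and plain strings standing in for transcriber objects:

```python
class Holder:
    """Hypothetical stand-in for the tray icon; it only carries mutable state."""


def make_callbacks(transcriber):
    holder = Holder()
    holder._transcriber = transcriber
    captured = transcriber  # what a closure over the original argument keeps seeing

    def current_via_holder():
        # Looked up at call time, so a later swap is visible.
        return holder._transcriber

    def current_via_closure():
        # Bound once; never updated by set_model below.
        return captured

    def set_model(new_transcriber):
        holder._transcriber = new_transcriber

    return current_via_holder, current_via_closure, set_model


via_holder, via_closure, set_model = make_callbacks("vosk")
set_model("whisper")
```

After the swap, `via_holder()` reports the new value while `via_closure()` still reports the old one, which is the stale-reference problem the attribute lookup avoids.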
@@ -334,22 +333,63 @@ def create_app(micro, transcriber, **kwargs):
  icon._monitoring_thread = threading.Thread(target=start_monitoring, args=(icon,))
  icon._monitoring_thread.start()

+ if other_transcribers:
+ other_transcribers_dict = {meta["model"]: meta for meta in other_transcribers}
+ else:
+ other_transcribers_dict = {}
+
+ def callback_set_model(icon, item):
+ transcriber = icon._transcriber
+ callback_stop_recording(icon, item)
+ model_name = str(item)
+ meta = other_transcribers_dict[model_name]
+ icon._transcriber = transcriber = get_transcriber(**meta)
+ icon.title = f"scribe :: {transcriber.backend} :: {transcriber.model_name}"
+ print("Set", transcriber.backend, transcriber.model_name)
+ # icon.menu.items[0].__name__ = f"Record [{str(item)}]"
+ icon._model_selection = False
+ icon.update_menu()
+
+ def callback_toggle_option(icon, item):
+ kwargs[str(item)] = not kwargs[str(item)]
+
+ def is_model_selection(item):
+ return icon._model_selection
+
  def is_recording(item):
- return transcriber.busy
+ return icon._transcriber.busy

  def is_not_recording(item):
- return not is_recording(item)
+ return not is_recording(item) and not is_model_selection(item)

+ def is_checked(item):
+ return icon._transcriber.model_name == str(item)

- # Create a menu
- menu = pystrayMenu(
- Item("Record", callback_record, visible=is_not_recording),
- Item("Stop", callback_stop_recording, visible=is_recording),
- Item('Quit', callback_quit),
+ def is_checked_option(item):
+ return kwargs[str(item)]
+
+ modeltitle = f"{transcriber.backend} :: {transcriber.model_name}"
+ title = f"scribe :: {modeltitle}"
+
+ menus = []
+ menus.append(Item(f"Record", callback_record, visible=is_not_recording, default=True))
+ menus.append(Item("Stop", callback_stop_recording, visible=is_recording))
+ menus.append(Item("Choose Model", pystrayMenu(
+ *(Item(f"{name}", callback_set_model, checked=is_checked) for name in other_transcribers_dict)))
+ )
+ menus.append(Item("Toggle Options", pystrayMenu(
+ *(Item(f"{name}", callback_toggle_option, checked=is_checked_option) for name in kwargs if isinstance(kwargs[name], bool))))
  )
+ menus.append(Item('Quit', callback_quit))
+
+ # Create a menu
+ menu = pystrayMenu(*menus)

  # Create the system tray icon
- icon = pystray.Icon('scribe', image, "scribe", menu)
+ icon = pystray.Icon('scribe', image, title, menu)
+ icon._model_selection = False
+ icon._transcriber = transcriber
+ del transcriber

  return icon

@@ -368,7 +408,7 @@ def main(args=None):

  while True:
  if transcriber is None:
- transcriber = get_transcriber(o, prompt=o.prompt)
+ transcriber = get_transcriber(**vars(o))
  print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
  show_output = ["clipboard", "keyboard", "output_file"]
  show_options = ["ascii", "restart_after_silence"]
@@ -482,7 +522,12 @@ def main(args=None):
  greetings = dict(
  start_message = "Listening... Use the try icon menu to stop.",
  )
- app = create_app(micro, transcriber, clipboard=o.clipboard, output_file=o.output_file,
+
+ app = create_app(micro, transcriber, other_transcribers=[
+ {**vars(o), "backend": "openaiapi", "model": "whisper-1"},
+ *[{**vars(o), "backend": "whisper", "model": model} for model in whisper_models],
+ *[{**vars(o), "backend": "vosk", "model": model} for model in vosk_models]],
+ clipboard=o.clipboard, output_file=o.output_file,
  keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
  print("Starting app...")
  app.run()
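Review note: the new call sites `get_transcriber(**vars(o))` and `{**vars(o), "backend": ..., "model": ...}` rely on two plain Python behaviours: `vars()` on an `argparse.Namespace` returns its attribute dict, and a trailing `**kwargs` absorbs options the function does not name. A minimal sketch, assuming a hypothetical stand-in function `get_transcriber_args` and an invented option set (not the package's real parser):

```python
import argparse


def get_transcriber_args(model=None, backend=None, prompt=True, language=None, **kwargs):
    """Hypothetical stand-in for get_transcriber: returns the values it would use."""
    # Options not named in the signature (clipboard, keyboard, ...) end up in
    # **kwargs and are simply ignored here.
    return {"model": model, "backend": backend, "prompt": prompt, "language": language}


# Stand-in for the parsed command-line options `o`.
o = argparse.Namespace(model=None, backend="whisper", prompt=False, language="fr",
                       clipboard=True, keyboard=False)

# Same call shape as `get_transcriber(**vars(o))` in main().
default = get_transcriber_args(**vars(o))

# Same shape as the `other_transcribers` entries: later keys override those from vars(o),
# and the dict literal builds a new dict, so `o` itself is left unchanged.
meta = {**vars(o), "backend": "openaiapi", "model": "whisper-1"}
alternative = get_transcriber_args(**meta)
```

One consequence worth checking in review: every key of `vars(o)` is forwarded, so an option name that collides with a positional parameter of the callee would raise a `TypeError`.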
@@ -242,6 +242,7 @@ class OpenaiAPITranscriber(WhisperTranscriber):
  def transcribe_audio(self, audio_bytes):
  self.log("\nTranscribing")
  import io
+ import openai
  import soundfile as sf
  audio_data = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
  # Write the audio data to an in-memory file in WAV format
@@ -249,8 +250,12 @@ class OpenaiAPITranscriber(WhisperTranscriber):
  sf.write(buffer, audio_data, self.samplerate, format='WAV')
  buffer.seek(0)
  buffer.name = "audio.wav" # Set a filename with a valid extension
- transcription = self.model.audio.transcriptions.create(
- model=self.model_name,
- file=buffer,
- )
+ try:
+ transcription = self.model.audio.transcriptions.create(
+ model=self.model_name,
+ file=buffer,
+ )
+ except openai.BadRequestError as e:
+ self.log(f"Error: {e}")
+ return {"text": ""}
  return {"text": transcription.text}
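Review note: the hunk above turns a rejected API request into an empty transcription instead of an uncaught exception. A minimal sketch of that fallback pattern, runnable without the `openai` package; `FakeBadRequestError`, `FakeTranscriptions`, and `transcribe` are hypothetical stand-ins, and the "empty audio" trigger is an invented condition for illustration only:

```python
class FakeBadRequestError(Exception):
    """Stand-in for openai.BadRequestError so the sketch has no third-party dependency."""


class FakeResult:
    def __init__(self, text):
        self.text = text


class FakeTranscriptions:
    """Stand-in for the client's transcription endpoint."""

    def create(self, model, file):
        if not file:
            # Invented failure condition: reject empty input.
            raise FakeBadRequestError("audio rejected")
        return FakeResult("hello")


def transcribe(client, model_name, audio, log=print):
    try:
        transcription = client.create(model=model_name, file=audio)
    except FakeBadRequestError as e:
        # Log and return an empty result so the recording loop can carry on.
        log(f"Error: {e}")
        return {"text": ""}
    return {"text": transcription.text}
```

Only the request-level error is caught here, matching the diff; other failures such as network or authentication errors would still propagate.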
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.10.0
+ Version: 0.11.1
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -158,7 +158,7 @@ The content of the (full) transcription is then pasted to the clipboard, and it
  Alternatively an output file can be indicated:

  ```bash
- --keyboard -o transcription.txt
+ scribe -o transcription.txt
  ```

  ### Virtual keyboard (experimental)
@@ -195,7 +195,8 @@ To activate start with:
  ```bash
  scribe --app
  ```
- or toggle the app option in the interactive menu. The scribe icon will show, with Record, Stop or Quit options. The icon will change based on what the app is doing.
+ or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+ of predefined models, or to Quit and choose from the terminal before pressing Enter again.
  For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
  That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:

@@ -204,23 +205,27 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
  pip install PyGObject
  ```

+ <img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
  ## Start as an application in GNOME

  If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
  to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
  `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.

- e.g.
+ In a relatively basic form
+
+ ```bash
+ scribe-install --clipboard --api YOUROPENAIAPIKEY
+ ```
+ (`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+ And to make an app running outside the terminal:

  ```bash
- scribe-install
- scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
- scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+ scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY
  ```
- This will install three separate apps:
- - `Super + scribe` : will launch the default version with terminal prompt
- - `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard.
- - `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal (you need to press Record in the tray icon menu to start the recording).
+ This will install two separate apps (names "Scribe" and "Scribe App")


  ## Fine tuning
File without changes