scribe-cli 0.10.0__tar.gz → 0.11.0__tar.gz
- {scribe_cli-0.10.0/scribe_cli.egg-info → scribe_cli-0.11.0}/PKG-INFO +13 -10
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/README.md +12 -9
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/_version.py +2 -2
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/app.py +103 -46
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/models.py +9 -4
- {scribe_cli-0.10.0 → scribe_cli-0.11.0/scribe_cli.egg-info}/PKG-INFO +13 -10
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/.gitignore +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/LICENSE +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/icon.xcf +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/pyproject.toml +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/__init__.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/audio.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/keyboard.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/models.toml +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/saverecording.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/testpynput.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe/util.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/SOURCES.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/requires.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_data/share/icon.png +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_data/share/icon_recording.png +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_data/share/icon_writing.png +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.0}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.
+Version: 0.11.0
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -195,7 +195,8 @@ To activate start with:
 ```bash
 scribe --app
 ```
-or toggle the app option in the interactive menu. The scribe icon will show, with Record
+or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+of predefined models, or to Quit and choose from the terminal before pressing Enter again.
 For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
 That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
 
@@ -210,17 +211,19 @@ If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will
 to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
 `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
 
-
+In a relatively basic form
 
 ```bash
-scribe-install
-scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
-scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+scribe-install --clipboard --api YOUROPENAIAPIKEY
 ```
-
-
-
-
+(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+And to make an app running outside the terminal:
+
+```bash
+scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --api YOUROPENAIAPIKEY
+```
+This will install two separate apps (names "Scribe" and "Scribe App")
 
 
 ## Fine tuning
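The README text in this hunk says that `scribe-install` forwards any option to `scribe` and that `--no-terminal` implies `--app --no-prompt`. A minimal sketch of how such an implication could be expressed with `argparse` follows. The function name `parse_install_args` and the default name `"Scribe"` are assumptions made for illustration; this is not the package's actual installer code.

```python
import argparse


def parse_install_args(argv):
    """Hypothetical parser: only the flags named in the README text are modelled."""
    parser = argparse.ArgumentParser(prog="scribe-install")
    parser.add_argument("--name", default="Scribe")  # assumed default
    parser.add_argument("--no-terminal", action="store_true")
    parser.add_argument("--app", action="store_true")
    parser.add_argument("--no-prompt", action="store_true")
    # Unknown options are collected untouched so they can be forwarded to `scribe`.
    args, passthrough = parser.parse_known_args(argv)
    if args.no_terminal:
        # Without a terminal there is nothing to prompt on, so imply both flags.
        args.app = True
        args.no_prompt = True
    return args, passthrough


args, passthrough = parse_install_args(
    ["--backend", "openaiapi", "--name", "Scribe App", "--no-terminal"]
)
```

Here `passthrough` holds the options the sketch does not know about, which is where a real installer would pick up everything destined for `scribe`.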
@@ -127,7 +127,8 @@ To activate start with:
 ```bash
 scribe --app
 ```
-or toggle the app option in the interactive menu. The scribe icon will show, with Record
+or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+of predefined models, or to Quit and choose from the terminal before pressing Enter again.
 For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
 That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
 
@@ -142,17 +143,19 @@ If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will
 to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
 `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
 
-
+In a relatively basic form
 
 ```bash
-scribe-install
-scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
-scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+scribe-install --clipboard --api YOUROPENAIAPIKEY
 ```
-
-
-
-
+(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+And to make an app running outside the terminal:
+
+```bash
+scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --api YOUROPENAIAPIKEY
+```
+This will install two separate apps (names "Scribe" and "Scribe App")
 
 
 ## Fine tuning
@@ -55,49 +55,54 @@ class DummyTranscriber:
     def __getattr__(self, item):
         return None
 
-
+whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
+whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
+whisperapi_models = ["whisper-1"]
+vosk_models = [language_config["vosk"][lang]["model"] for lang in language_config["vosk"]]
 
-whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
-whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
-whisperapi_models = ["whisper-1"]
 
-
+def get_transcriber(model=None, backend=None, dummy=False, prompt=True, language=None,
+                    samplerate=None, duration=None, silence=None, silence_db=None, restart_after_silence=None,
+                    api_key=None,
+                    download_folder_vosk=None, download_folder_whisper=None, **kwargs):
+
+    if dummy:
         return DummyTranscriber("whisper", "dummy")
 
-    if
-        if
-
-        elif
-
-        elif
-
+    if model and not backend:
+        if model.startswith("vosk-"):
+            backend = "vosk"
+        elif model in whisper_models + whisper_english_models:
+            backend = "whisper"
+        elif model in whisperapi_models:
+            backend = "openaiapi"
 
-    if
-        backend =
+    if backend:
+        backend = backend
 
     elif not prompt:
         backend = BACKENDS[0]
 
     else:
-        backend = prompt_choices(BACKENDS,
+        backend = prompt_choices(BACKENDS, backend, "backend", UNAVAILABLE_BACKENDS)
 
     print(f"Selected backend: {backend}")
 
-    if
-        model = pick_specialist_model(
+    if model:
+        model = pick_specialist_model(model, language, backend)
 
     else:
 
         if backend == "vosk":
             available_languages = list(language_config[backend])
-            if
-                if
-                    print(f"Language '{
+            if language:
+                if language not in available_languages:
+                    print(f"Language '{language}' is not pre-defined (yet) for backend '{backend}'.")
                     print(f"Yet it may actually exist.")
                     print(f"Please choose the model explictly from {ansi_link('https://alphacephei.com/vosk/models')}.")
                     print(f"Or pick one of the pre-defined languages: ", " ".join(available_languages))
                     exit(1)
-                choices = [language_config[backend][
+                choices = [language_config[backend][language]["model"]]
                 default_model = choices[0] # this is a string
 
             else:
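The added lines in this hunk let `get_transcriber` infer the backend from the model name when no backend is given. The rule can be read in isolation as the sketch below. The helper name `infer_backend` and the upper-case list names are assumptions for illustration; the model lists are copied from the diff, and the `vosk-` prefix test is the one used in the added code.

```python
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large", "turbo"]
WHISPER_ENGLISH_MODELS = ["tiny.en", "base.en", "small.en", "medium.en"]
WHISPERAPI_MODELS = ["whisper-1"]


def infer_backend(model, backend=None):
    """Return the explicit backend if given, else guess it from the model name."""
    if backend:
        return backend  # an explicit choice always wins
    if not model:
        return None  # nothing to infer from
    if model.startswith("vosk-"):
        return "vosk"
    if model in WHISPER_MODELS + WHISPER_ENGLISH_MODELS:
        return "whisper"
    if model in WHISPERAPI_MODELS:
        return "openaiapi"
    return None  # unknown name: the caller falls back to a default or a prompt
```

Returning `None` for an unrecognised name mirrors the diff, where an unset backend falls through to `BACKENDS[0]` or the interactive prompt.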
@@ -121,10 +126,10 @@ def get_transcriber(o, prompt=True):
             else:
                 model = default_model
 
-            model = pick_specialist_model(model,
+            model = pick_specialist_model(model, language, backend)
 
         elif backend == "openaiapi":
-            model =
+            model = model or "whisper-1"
 
         else:
             raise ValueError(f"Unknown backend: {backend}")
@@ -135,26 +140,26 @@ def get_transcriber(o, prompt=True):
     if backend == "vosk":
         try:
             transcriber = VoskTranscriber(model_name=model,
-                                          language=
-                                          samplerate=
+                                          language=language,
+                                          samplerate=samplerate,
                                           timeout=None, # vosk keeps going (no timeout)
                                           silence_duration=None, # vosk handles silences internally
-                                          model_kwargs={"download_root":
+                                          model_kwargs={"download_root": download_folder_vosk})
         except Exception as error:
             print(error)
             print(f"Failed to (down)load model {model}.")
             exit(1)
 
     elif backend == "whisper":
-        transcriber = WhisperTranscriber(model_name=model, language=
-                                         timeout=
-                                         restart_after_silence=
-                                         model_kwargs={"download_root":
+        transcriber = WhisperTranscriber(model_name=model, language=language, samplerate=samplerate,
+                                         timeout=duration, silence_duration=silence, silence_thresh=silence_db,
+                                         restart_after_silence=restart_after_silence,
+                                         model_kwargs={"download_root": download_folder_whisper})
 
     elif backend == "openaiapi":
-        transcriber = OpenaiAPITranscriber(model_name=model, samplerate=
-                                           timeout=
-                                           restart_after_silence=
+        transcriber = OpenaiAPITranscriber(model_name=model, samplerate=samplerate,
+                                           timeout=duration, silence_duration=silence, silence_thresh=silence_db,
+                                           restart_after_silence=restart_after_silence, api_key=api_key)
 
 
     else:
@@ -246,7 +251,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
         callback()
 
 
-def create_app(micro, transcriber, **kwargs):
+def create_app(micro, transcriber, other_transcribers=None, **kwargs):
     import pystray
     from pystray import Menu as pystrayMenu, MenuItem as Item
     from PIL import Image
@@ -266,6 +271,7 @@ def create_app(micro, transcriber, **kwargs):
     image_recording = Image.alpha_composite(image_recording.convert("RGBA"), image_writing.convert("RGBA"))
 
     def update_icon(icon, force=False):
+        transcriber = icon._transcriber
         if transcriber.recording and transcriber.waiting:
             # this is the situation with the whisper backend when the microphone is recording
             # but we wait for the speaker to speak (silence)
@@ -293,6 +299,7 @@ def create_app(micro, transcriber, **kwargs):
             icon.update_menu()
 
     def start_monitoring(icon):
+        transcriber = icon._transcriber
         try:
             while transcriber.busy:
                 update_icon(icon)
@@ -308,8 +315,8 @@ def create_app(micro, transcriber, **kwargs):
         icon.stop()
 
     def callback_stop_recording(icon, item):
+        transcriber = icon._transcriber
         # Here we need to stop the recording thread
-
         transcriber.interrupt = True
         if hasattr(icon, "_recording_thread"):
             icon._recording_thread.join()
@@ -317,7 +324,7 @@ def create_app(micro, transcriber, **kwargs):
             icon._monitoring_thread.join()
 
     def callback_record(icon, item):
-
+        transcriber = icon._transcriber
         if transcriber.busy:
             transcriber.log("Still busy recording or transcribing.")
             return
@@ -334,22 +341,67 @@ def create_app(micro, transcriber, **kwargs):
         icon._monitoring_thread = threading.Thread(target=start_monitoring, args=(icon,))
         icon._monitoring_thread.start()
 
+    if other_transcribers:
+        other_transcribers_dict = {meta["model"]: meta for meta in other_transcribers}
+    else:
+        other_transcribers_dict = {}
+
+    def callback_set_model(icon, item):
+        transcriber = icon._transcriber
+        callback_stop_recording(icon, item)
+        model_name = str(item)
+        meta = other_transcribers_dict[model_name]
+        icon._transcriber = transcriber = get_transcriber(**meta)
+        icon.title = f"scribe :: {transcriber.backend} :: {transcriber.model_name}"
+        print("Set", transcriber.backend, transcriber.model_name)
+        # icon.menu.items[0].__name__ = f"Record [{str(item)}]"
+        icon._model_selection = False
+        icon.update_menu()
+        icon.notify(f"Set {transcriber.backend} {transcriber.model_name}")
+
+    def callback_info(icon, item):
+        transcriber = icon._transcriber
+        # icon.notify(f"scribe {transcriber.backend} {transcriber.model_name}")
+        title = f"""{transcriber.backend} :: {transcriber.model_name}"""
+        info = [name for name in kwargs if isinstance(kwargs[name], bool) and kwargs[name]]
+        icon.notify(" | ".join(info), title=title)
+
+    def callback_toggle_option(icon, item):
+        kwargs[str(item)] = not kwargs[str(item)]
+        callback_info(icon, item)
+
+    def is_model_selection(item):
+        return icon._model_selection
+
     def is_recording(item):
-        return
+        return icon._transcriber.busy
 
     def is_not_recording(item):
-        return not is_recording(item)
+        return not is_recording(item) and not is_model_selection(item)
 
+    modeltitle = f"{transcriber.backend} :: {transcriber.model_name}"
+    title = f"scribe :: {modeltitle}"
 
-
-
-
-
-    Item(
+    menus = []
+    menus.append(Item(f"Record" if len(other_transcribers_dict) <= 1 else f"Record", callback_record, visible=is_not_recording))
+    menus.append(Item("Stop", callback_stop_recording, visible=is_recording))
+    menus.append(Item("Choose Model", pystrayMenu(
+        *(Item(f"{name}", callback_set_model) for name in other_transcribers_dict)))
+    )
+    menus.append(Item("Toggle Options", pystrayMenu(
+        *(Item(f"{name}", callback_toggle_option) for name in kwargs if isinstance(kwargs[name], bool))))
     )
+    menus.append(Item(f"Info", callback_info))
+    menus.append(Item('Quit', callback_quit))
+
+    # Create a menu
+    menu = pystrayMenu(*menus)
 
     # Create the system tray icon
-    icon = pystray.Icon('scribe', image,
+    icon = pystray.Icon('scribe', image, title, menu)
+    icon._model_selection = False
+    icon._transcriber = transcriber
+    del transcriber
 
     return icon
 
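A recurring change in the `create_app` hunks is that each callback now reads `icon._transcriber` instead of closing over the `transcriber` argument, which is what lets `callback_set_model` swap the model at runtime. The sketch below reproduces that pattern without `pystray`, under stated assumptions: `holder` stands in for the tray icon, `factory` for `get_transcriber`, and `make_callbacks` is a hypothetical helper that does not exist in the package.

```python
from types import SimpleNamespace


def make_callbacks(holder, registry, factory):
    """Build callbacks that look up the current transcriber on every call."""

    def set_model(name):
        # Rebuild the transcriber from its stored keyword arguments and swap it in.
        holder.transcriber = factory(**registry[name])

    def is_busy():
        # Reading through the holder means a swapped transcriber is always seen.
        return holder.transcriber.busy

    return set_model, is_busy


def factory(backend, model):
    # Stand-in for get_transcriber: returns a plain object with the same fields.
    return SimpleNamespace(backend=backend, model_name=model, busy=False)


# Registry keyed by model name, as in other_transcribers_dict.
registry = {meta["model"]: meta for meta in [
    {"backend": "whisper", "model": "small"},
    {"backend": "openaiapi", "model": "whisper-1"},
]}

holder = SimpleNamespace(transcriber=factory("whisper", "small"))
set_model, is_busy = make_callbacks(holder, registry, factory)
set_model("whisper-1")
```

Had `is_busy` captured the original transcriber in its closure, it would keep reporting the state of the replaced object after `set_model` runs. That is presumably also why the diff ends `create_app` with `del transcriber`.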
@@ -368,7 +420,7 @@ def main(args=None):
 
     while True:
         if transcriber is None:
-            transcriber = get_transcriber(o
+            transcriber = get_transcriber(**vars(o))
             print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
         show_output = ["clipboard", "keyboard", "output_file"]
         show_options = ["ascii", "restart_after_silence"]
@@ -482,7 +534,12 @@ def main(args=None):
             greetings = dict(
                 start_message = "Listening... Use the try icon menu to stop.",
             )
-
+
+            app = create_app(micro, transcriber, other_transcribers=[
+                {**vars(o), "backend": "openaiapi", "model": "whisper-1"},
+                *[{**vars(o), "backend": "whisper", "model": model} for model in whisper_models],
+                *[{**vars(o), "backend": "vosk", "model": model} for model in vosk_models]],
+                clipboard=o.clipboard, output_file=o.output_file,
                 keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
             print("Starting app...")
             app.run()
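The `other_transcribers` list in this hunk is built with `{**vars(o), "backend": ..., "model": ...}`: every entry copies all parsed options and overrides only the two keys that differ. A small sketch of that idiom follows, with a hand-made `argparse.Namespace` whose field values are invented for illustration.

```python
import argparse

# Hypothetical parsed options; only the field names echo the diff.
o = argparse.Namespace(backend="vosk", model=None, language="fr", clipboard=True)
whisper_models = ["tiny", "base"]

candidates = [
    # Later keys win, so "backend" and "model" replace the values copied from o.
    {**vars(o), "backend": "openaiapi", "model": "whisper-1"},
    *[{**vars(o), "backend": "whisper", "model": m} for m in whisper_models],
]
```

Each dictionary is a fresh copy, so `o` keeps its original values, and any entry can later be passed as `get_transcriber(**meta)` because the function accepts extra keys through `**kwargs`.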
@@ -242,6 +242,7 @@ class OpenaiAPITranscriber(WhisperTranscriber):
     def transcribe_audio(self, audio_bytes):
         self.log("\nTranscribing")
         import io
+        import openai
         import soundfile as sf
         audio_data = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
         # Write the audio data to an in-memory file in WAV format
@@ -249,8 +250,12 @@ class OpenaiAPITranscriber(WhisperTranscriber):
             sf.write(buffer, audio_data, self.samplerate, format='WAV')
             buffer.seek(0)
             buffer.name = "audio.wav" # Set a filename with a valid extension
-
-
-
-
+            try:
+                transcription = self.model.audio.transcriptions.create(
+                    model=self.model_name,
+                    file=buffer,
+                )
+            except openai.BadRequestError as e:
+                self.log(f"Error: {e}")
+                return {"text": ""}
             return {"text": transcription.text}
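The added `try`/`except` means a rejected API request is logged and yields an empty transcription instead of an unhandled exception. The control flow can be sketched without the `openai` package as shown below. `BadRequestError` here is a local stand-in class, and `safe_transcribe` and `rejected` are hypothetical names used only for this illustration.

```python
from types import SimpleNamespace


class BadRequestError(Exception):
    """Local stand-in for the client library's bad-request exception (assumption)."""


def safe_transcribe(create, log=print):
    """Call `create()` and fall back to empty text when the request is rejected."""
    try:
        transcription = create()
    except BadRequestError as error:
        log(f"Error: {error}")
        return {"text": ""}
    return {"text": transcription.text}


def rejected():
    # Simulates the API refusing the uploaded audio.
    raise BadRequestError("audio too short")


logs = []
ok = safe_transcribe(lambda: SimpleNamespace(text="hello"))
failed = safe_transcribe(rejected, log=logs.append)
```

Only the bad-request case is caught, as in the diff; other failures such as network or authentication errors would still propagate to the caller.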
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.
+Version: 0.11.0
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -195,7 +195,8 @@ To activate start with:
 ```bash
 scribe --app
 ```
-or toggle the app option in the interactive menu. The scribe icon will show, with Record
+or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+of predefined models, or to Quit and choose from the terminal before pressing Enter again.
 For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
 That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
 
@@ -210,17 +211,19 @@ If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will
 to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
 `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
 
-
+In a relatively basic form
 
 ```bash
-scribe-install
-scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
-scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+scribe-install --clipboard --api YOUROPENAIAPIKEY
 ```
-
-
-
-
+(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+And to make an app running outside the terminal:
+
+```bash
+scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --api YOUROPENAIAPIKEY
+```
+This will install two separate apps (names "Scribe" and "Scribe App")
 
 
 ## Fine tuning
File without changes