scribe-cli 0.10.0__tar.gz → 0.11.1__tar.gz
- {scribe_cli-0.10.0/scribe_cli.egg-info → scribe_cli-0.11.1}/PKG-INFO +16 -11
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/README.md +16 -11
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/_version.py +2 -2
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/app.py +102 -57
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/models.py +9 -4
- {scribe_cli-0.10.0 → scribe_cli-0.11.1/scribe_cli.egg-info}/PKG-INFO +16 -11
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/.gitignore +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/LICENSE +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/icon.xcf +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/pyproject.toml +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/__init__.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/audio.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/keyboard.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/models.toml +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/saverecording.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/testpynput.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe/util.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/SOURCES.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/requires.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/share/icon.png +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/share/icon_recording.png +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/share/icon_writing.png +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.10.0 → scribe_cli-0.11.1}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.
+Version: 0.11.1
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -158,7 +158,7 @@ The content of the (full) transcription is then pasted to the clipboard, and it
 Alternatively an output file can be indicated:
 
 ```bash
-
+scribe -o transcription.txt
 ```
 
 ### Virtual keyboard (experimental)
@@ -195,7 +195,8 @@ To activate start with:
 ```bash
 scribe --app
 ```
-or toggle the app option in the interactive menu. The scribe icon will show, with Record
+or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+of predefined models, or to Quit and choose from the terminal before pressing Enter again.
 For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
 That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
 
@@ -204,23 +205,27 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
 pip install PyGObject
 ```
 
+<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
 ## Start as an application in GNOME
 
 If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
 to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
 `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
 
-
+In a relatively basic form
+
+```bash
+scribe-install --clipboard --api YOUROPENAIAPIKEY
+```
+(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+And to make an app running outside the terminal:
 
 ```bash
-scribe-install
-scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
-scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY
 ```
-This will install
-- `Super + scribe` : will launch the default version with terminal prompt
-- `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard.
-- `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal (you need to press Record in the tray icon menu to start the recording).
+This will install two separate apps (names "Scribe" and "Scribe App")
 
 
 ## Fine tuning
@@ -90,7 +90,7 @@ The content of the (full) transcription is then pasted to the clipboard, and it
 Alternatively an output file can be indicated:
 
 ```bash
-
+scribe -o transcription.txt
 ```
 
 ### Virtual keyboard (experimental)
@@ -127,7 +127,8 @@ To activate start with:
 ```bash
 scribe --app
 ```
-or toggle the app option in the interactive menu. The scribe icon will show, with Record
+or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+of predefined models, or to Quit and choose from the terminal before pressing Enter again.
 For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
 That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
 
@@ -136,23 +137,27 @@ sudo apt install libcairo-dev libgirepository1.0-dev gir1.2-appindicator3-0.1
 pip install PyGObject
 ```
 
+<img src=https://github.com/user-attachments/assets/4c97f4b1-1a65-4d49-9f5a-a9f4287cfa5a width=300px>
+
 ## Start as an application in GNOME
 
 If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
 to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
 `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
 
-
+In a relatively basic form
+
+```bash
+scribe-install --clipboard --api YOUROPENAIAPIKEY
+```
+(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+And to make an app running outside the terminal:
 
 ```bash
-scribe-install
-scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
-scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --restart-after-silence --api YOUROPENAIAPIKEY
 ```
-This will install
-- `Super + scribe` : will launch the default version with terminal prompt
-- `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard.
-- `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal (you need to press Record in the tray icon menu to start the recording).
+This will install two separate apps (names "Scribe" and "Scribe App")
 
 
 ## Fine tuning
@@ -162,4 +167,4 @@ Best is to check the available options in the online help:
 
 ```bash
 scribe --help
-```
+```
@@ -55,49 +55,54 @@ class DummyTranscriber:
     def __getattr__(self, item):
         return None
 
-
+whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
+whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
+whisperapi_models = ["whisper-1"]
+vosk_models = [language_config["vosk"][lang]["model"] for lang in language_config["vosk"]]
 
-whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
-whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
-whisperapi_models = ["whisper-1"]
 
-
+def get_transcriber(model=None, backend=None, dummy=False, prompt=True, language=None,
+                    samplerate=None, duration=None, silence=None, silence_db=None, restart_after_silence=None,
+                    api_key=None,
+                    download_folder_vosk=None, download_folder_whisper=None, **kwargs):
+
+    if dummy:
         return DummyTranscriber("whisper", "dummy")
 
-    if
-        if
-
-        elif
-
-        elif
-
+    if model and not backend:
+        if model.startswith("vosk-"):
+            backend = "vosk"
+        elif model in whisper_models + whisper_english_models:
+            backend = "whisper"
+        elif model in whisperapi_models:
+            backend = "openaiapi"
 
-    if
-        backend =
+    if backend:
+        backend = backend
 
     elif not prompt:
         backend = BACKENDS[0]
 
     else:
-        backend = prompt_choices(BACKENDS,
+        backend = prompt_choices(BACKENDS, backend, "backend", UNAVAILABLE_BACKENDS)
 
     print(f"Selected backend: {backend}")
 
-    if
-        model = pick_specialist_model(
+    if model:
+        model = pick_specialist_model(model, language, backend)
 
     else:
 
         if backend == "vosk":
             available_languages = list(language_config[backend])
-            if
-                if
-                    print(f"Language '{
+            if language:
+                if language not in available_languages:
+                    print(f"Language '{language}' is not pre-defined (yet) for backend '{backend}'.")
                     print(f"Yet it may actually exist.")
                     print(f"Please choose the model explictly from {ansi_link('https://alphacephei.com/vosk/models')}.")
                     print(f"Or pick one of the pre-defined languages: ", " ".join(available_languages))
                     exit(1)
-                choices = [language_config[backend][
+                choices = [language_config[backend][language]["model"]]
                 default_model = choices[0] # this is a string
 
             else:
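The added lines in this hunk infer the backend from the model name when no backend is given. A minimal standalone sketch of that dispatch follows, assuming only the three whisper lists visible in the hunk. `infer_backend` is a hypothetical helper name that does not exist in scribe, and the vosk model name in the usage note is an illustrative value.

```python
# Model lists copied from the hunk; vosk models are recognized by their prefix.
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large", "turbo"]
WHISPER_ENGLISH_MODELS = ["tiny.en", "base.en", "small.en", "medium.en"]
WHISPERAPI_MODELS = ["whisper-1"]


def infer_backend(model, backend=None):
    """Hypothetical helper: return the backend implied by a model name.

    An explicit backend always wins. An unrecognized model returns None,
    which leaves the choice to a default or to an interactive prompt.
    """
    if backend:
        return backend
    if not model:
        return None
    if model.startswith("vosk-"):
        return "vosk"
    if model in WHISPER_MODELS + WHISPER_ENGLISH_MODELS:
        return "whisper"
    if model in WHISPERAPI_MODELS:
        return "openaiapi"
    return None
```

Under these assumptions, `infer_backend("small")` gives `"whisper"` and `infer_backend("vosk-model-small-fr-0.22")` gives `"vosk"`, while an explicit `backend` argument is returned unchanged.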
@@ -121,10 +126,10 @@ def get_transcriber(o, prompt=True):
             else:
                 model = default_model
 
-            model = pick_specialist_model(model,
+            model = pick_specialist_model(model, language, backend)
 
         elif backend == "openaiapi":
-            model =
+            model = model or "whisper-1"
 
         else:
             raise ValueError(f"Unknown backend: {backend}")
@@ -135,26 +140,26 @@ def get_transcriber(o, prompt=True):
     if backend == "vosk":
         try:
             transcriber = VoskTranscriber(model_name=model,
-                language=
-                samplerate=
+                language=language,
+                samplerate=samplerate,
                 timeout=None, # vosk keeps going (no timeout)
                 silence_duration=None, # vosk handles silences internally
-                model_kwargs={"download_root":
+                model_kwargs={"download_root": download_folder_vosk})
         except Exception as error:
             print(error)
             print(f"Failed to (down)load model {model}.")
             exit(1)
 
     elif backend == "whisper":
-        transcriber = WhisperTranscriber(model_name=model, language=
-            timeout=
-            restart_after_silence=
-            model_kwargs={"download_root":
+        transcriber = WhisperTranscriber(model_name=model, language=language, samplerate=samplerate,
+            timeout=duration, silence_duration=silence, silence_thresh=silence_db,
+            restart_after_silence=restart_after_silence,
+            model_kwargs={"download_root": download_folder_whisper})
 
     elif backend == "openaiapi":
-        transcriber = OpenaiAPITranscriber(model_name=model, samplerate=
-            timeout=
-            restart_after_silence=
+        transcriber = OpenaiAPITranscriber(model_name=model, samplerate=samplerate,
+            timeout=duration, silence_duration=silence, silence_thresh=silence_db,
+            restart_after_silence=restart_after_silence, api_key=api_key)
 
 
     else:
@@ -246,7 +251,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
         callback()
 
 
-def create_app(micro, transcriber, **kwargs):
+def create_app(micro, transcriber, other_transcribers=None, **kwargs):
     import pystray
     from pystray import Menu as pystrayMenu, MenuItem as Item
     from PIL import Image
@@ -266,15 +271,8 @@ def create_app(micro, transcriber, **kwargs):
     image_recording = Image.alpha_composite(image_recording.convert("RGBA"), image_writing.convert("RGBA"))
 
     def update_icon(icon, force=False):
-
-
-            # but we wait for the speaker to speak (silence)
-            if force or getattr(icon, "_icon_label", None) != None:
-                icon.icon = image
-                icon._icon_label = None
-                icon.update_menu()
-
-        elif transcriber.recording:
+        transcriber = icon._transcriber
+        if transcriber.recording:
             if force or getattr(icon, "_icon_label", None) != "recording":
                 icon.icon = image_recording
                 icon._icon_label = "recording"
@@ -293,6 +291,7 @@ def create_app(micro, transcriber, **kwargs):
                 icon.update_menu()
 
     def start_monitoring(icon):
+        transcriber = icon._transcriber
         try:
             while transcriber.busy:
                 update_icon(icon)
@@ -308,8 +307,8 @@ def create_app(micro, transcriber, **kwargs):
         icon.stop()
 
     def callback_stop_recording(icon, item):
+        transcriber = icon._transcriber
         # Here we need to stop the recording thread
-
         transcriber.interrupt = True
         if hasattr(icon, "_recording_thread"):
             icon._recording_thread.join()
@@ -317,10 +316,10 @@ def create_app(micro, transcriber, **kwargs):
             icon._monitoring_thread.join()
 
     def callback_record(icon, item):
-
+        transcriber = icon._transcriber
         if transcriber.busy:
-            transcriber.log("Still busy recording or transcribing.")
-            return
+            # transcriber.log("Still busy recording or transcribing.")
+            return callback_stop_recording(icon, item) # play / stop behavior
 
         if hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive():
             icon._recording_thread.join()
@@ -334,22 +333,63 @@ def create_app(micro, transcriber, **kwargs):
         icon._monitoring_thread = threading.Thread(target=start_monitoring, args=(icon,))
         icon._monitoring_thread.start()
 
+    if other_transcribers:
+        other_transcribers_dict = {meta["model"]: meta for meta in other_transcribers}
+    else:
+        other_transcribers_dict = {}
+
+    def callback_set_model(icon, item):
+        transcriber = icon._transcriber
+        callback_stop_recording(icon, item)
+        model_name = str(item)
+        meta = other_transcribers_dict[model_name]
+        icon._transcriber = transcriber = get_transcriber(**meta)
+        icon.title = f"scribe :: {transcriber.backend} :: {transcriber.model_name}"
+        print("Set", transcriber.backend, transcriber.model_name)
+        # icon.menu.items[0].__name__ = f"Record [{str(item)}]"
+        icon._model_selection = False
+        icon.update_menu()
+
+    def callback_toggle_option(icon, item):
+        kwargs[str(item)] = not kwargs[str(item)]
+
+    def is_model_selection(item):
+        return icon._model_selection
+
     def is_recording(item):
-        return
+        return icon._transcriber.busy
 
     def is_not_recording(item):
-        return not is_recording(item)
+        return not is_recording(item) and not is_model_selection(item)
 
+    def is_checked(item):
+        return icon._transcriber.model_name == str(item)
 
-
-
-
-
-
+    def is_checked_option(item):
+        return kwargs[str(item)]
+
+    modeltitle = f"{transcriber.backend} :: {transcriber.model_name}"
+    title = f"scribe :: {modeltitle}"
+
+    menus = []
+    menus.append(Item(f"Record", callback_record, visible=is_not_recording, default=True))
+    menus.append(Item("Stop", callback_stop_recording, visible=is_recording))
+    menus.append(Item("Choose Model", pystrayMenu(
+        *(Item(f"{name}", callback_set_model, checked=is_checked) for name in other_transcribers_dict)))
+    )
+    menus.append(Item("Toggle Options", pystrayMenu(
+        *(Item(f"{name}", callback_toggle_option, checked=is_checked_option) for name in kwargs if isinstance(kwargs[name], bool))))
     )
+    menus.append(Item('Quit', callback_quit))
+
+    # Create a menu
+    menu = pystrayMenu(*menus)
 
     # Create the system tray icon
-    icon = pystray.Icon('scribe', image,
+    icon = pystray.Icon('scribe', image, title, menu)
+    icon._model_selection = False
+    icon._transcriber = transcriber
+    del transcriber
 
     return icon
 
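This hunk moves the transcriber from a closure variable to an attribute on the icon (`icon._transcriber`), which is what lets `callback_set_model` swap the model while the other callbacks keep working. The sketch below illustrates that pattern in plain Python without pystray. `FakeTranscriber`, `make_callbacks` and the `SimpleNamespace` icon are hypothetical stand-ins, not scribe or pystray objects.

```python
from types import SimpleNamespace


class FakeTranscriber:
    """Hypothetical stand-in for a transcriber; it only carries a model name."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.busy = False


def make_callbacks(icon, factory):
    """Build callbacks that look up the transcriber on `icon` at call time."""

    def set_model(item):
        # Rebinding the attribute is immediately visible to every other callback.
        icon._transcriber = factory(str(item))

    def is_checked(item):
        return icon._transcriber.model_name == str(item)

    return set_model, is_checked


# Stand-in for the tray icon: any object that can hold attributes.
icon = SimpleNamespace(_transcriber=FakeTranscriber("small"))
set_model, is_checked = make_callbacks(icon, FakeTranscriber)
set_model("turbo")
```

After `set_model("turbo")`, `is_checked("turbo")` is true and `is_checked("small")` is false. A callback that had captured the original transcriber in a closure would still see `"small"`, which is presumably why the diff ends with `del transcriber` once the reference is stored on the icon.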
@@ -368,7 +408,7 @@ def main(args=None):
 
     while True:
         if transcriber is None:
-            transcriber = get_transcriber(o
+            transcriber = get_transcriber(**vars(o))
         print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
         show_output = ["clipboard", "keyboard", "output_file"]
         show_options = ["ascii", "restart_after_silence"]
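The call changes from passing the options object to `get_transcriber(**vars(o))`. A small sketch of what that unpacking does, assuming `o` is an `argparse.Namespace`: `get_transcriber_sketch` is a simplified, hypothetical stand-in for the real function, and the option values are illustrative.

```python
import argparse


def get_transcriber_sketch(model=None, backend=None, prompt=True, language=None, **kwargs):
    """Simplified stand-in: keeps four named options and ignores the rest."""
    return {"model": model, "backend": backend, "prompt": prompt, "language": language}


# Hypothetical parsed options; `clipboard` is not a named parameter above,
# so it is absorbed by **kwargs instead of raising a TypeError.
o = argparse.Namespace(model="small", backend=None, prompt=False,
                       language="fr", clipboard=True)

# vars(o) returns the namespace's attribute dict, unpacked as keyword arguments.
settings = get_transcriber_sketch(**vars(o))

# Overriding two keys in a copy gives a per-model configuration
# without touching the original options.
meta = {**vars(o), "backend": "whisper", "model": "turbo"}
other = get_transcriber_sketch(**meta)
```

The trailing `**kwargs` in the signature is what makes this safe: options that only matter to other parts of the program pass through silently.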
@@ -482,7 +522,12 @@ def main(args=None):
             greetings = dict(
                 start_message = "Listening... Use the try icon menu to stop.",
             )
-
+
+            app = create_app(micro, transcriber, other_transcribers=[
+                {**vars(o), "backend": "openaiapi", "model": "whisper-1"},
+                *[{**vars(o), "backend": "whisper", "model": model} for model in whisper_models],
+                *[{**vars(o), "backend": "vosk", "model": model} for model in vosk_models]],
+                clipboard=o.clipboard, output_file=o.output_file,
                 keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
             print("Starting app...")
             app.run()
@@ -242,6 +242,7 @@ class OpenaiAPITranscriber(WhisperTranscriber):
     def transcribe_audio(self, audio_bytes):
         self.log("\nTranscribing")
         import io
+        import openai
         import soundfile as sf
         audio_data = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
         # Write the audio data to an in-memory file in WAV format
@@ -249,8 +250,12 @@ class OpenaiAPITranscriber(WhisperTranscriber):
         sf.write(buffer, audio_data, self.samplerate, format='WAV')
         buffer.seek(0)
         buffer.name = "audio.wav" # Set a filename with a valid extension
-
-
-
-
+        try:
+            transcription = self.model.audio.transcriptions.create(
+                model=self.model_name,
+                file=buffer,
+            )
+        except openai.BadRequestError as e:
+            self.log(f"Error: {e}")
+            return {"text": ""}
         return {"text": transcription.text}
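The change wraps the API call so that a rejected request yields an empty transcription instead of an exception. The sketch below shows the same shape with the standard library only. It swaps `soundfile` for the stdlib `wave` module, and `BadRequestError`, `pcm16_to_wav_buffer` and `safe_transcribe` are hypothetical names standing in for the client error and the method body; nothing here calls the OpenAI API.

```python
import io
import wave


class BadRequestError(Exception):
    """Hypothetical stand-in for the API client's bad-request error."""


def pcm16_to_wav_buffer(audio_bytes, samplerate):
    """Wrap raw 16-bit mono PCM bytes in an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample, i.e. int16
        wav.setframerate(samplerate)
        wav.writeframes(audio_bytes)
    buffer.seek(0)
    buffer.name = "audio.wav"  # a filename with a valid extension
    return buffer


def safe_transcribe(create, buffer):
    """Call `create(file=buffer)`; fall back to empty text on a bad request."""
    try:
        text = create(file=buffer)
    except BadRequestError as error:
        print(f"Error: {error}")
        return {"text": ""}
    return {"text": text}
```

Here `create` is any callable that takes the file and returns the transcribed text, so the fallback can be exercised with a stub that raises `BadRequestError`. Only that one error type is caught, as in the diff; other failures still propagate.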
File without changes