scribe-cli 0.7.6__tar.gz → 0.7.8__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {scribe_cli-0.7.6/scribe_cli.egg-info → scribe_cli-0.7.8}/PKG-INFO +6 -4
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/README.md +4 -4
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/pyproject.toml +2 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/_version.py +2 -2
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/app.py +64 -15
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/keyboard.py +20 -3
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/models.py +1 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/util.py +3 -41
- {scribe_cli-0.7.6 → scribe_cli-0.7.8/scribe_cli.egg-info}/PKG-INFO +6 -4
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_cli.egg-info/requires.txt +2 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/.gitignore +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/LICENSE +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/__init__.py +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/audio.py +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/models.toml +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/saverecording.py +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/testpynput.py +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_cli.egg-info/SOURCES.txt +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_data/share/icon.jpg +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.7.6 → scribe_cli-0.7.8}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.7.6
+Version: 0.7.8
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -44,6 +44,8 @@ Requires-Dist: sounddevice
 Requires-Dist: tqdm
 Requires-Dist: requests
 Requires-Dist: pyperclip
+Requires-Dist: unidecode
+Requires-Dist: termcolor
 Provides-Extra: keyboard
 Requires-Dist: pynput; extra == "keyboard"
 Provides-Extra: whisper
@@ -60,7 +62,7 @@ Requires-Dist: vosk; extra == "all"
 Requires-Dist: pystray; extra == "all"
 
 []()
-[](https://pypi.org/project/scribe-cli)
 
 # Scribe
 
@@ -83,7 +85,7 @@ sudo apt-get install portaudio19-dev xclip
 See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
 
 ```bash
-pip install scribe-cli[all]
+pip install scribe-cli[all]
 ```
 
 (note the `-cli` suffix for client)
@@ -121,7 +123,7 @@ With the `whisker` model you need to stop the registration manually before the t
 there is a maximum duration after which it will stop by itself, which is setup to 60s by default (unless `--duration` is set to something else).
 
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
-
+It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 
 To skip the initial selection menu you can do:
@@ -1,5 +1,5 @@
 []()
-[](https://pypi.org/project/scribe-cli)
 
 # Scribe
 
@@ -22,7 +22,7 @@ sudo apt-get install portaudio19-dev xclip
 See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
 
 ```bash
-pip install scribe-cli[all]
+pip install scribe-cli[all]
 ```
 
 (note the `-cli` suffix for client)
@@ -60,7 +60,7 @@ With the `whisker` model you need to stop the registration manually before the t
 there is a maximum duration after which it will stop by itself, which is setup to 60s by default (unless `--duration` is set to something else).
 
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
-
+It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 
 To skip the initial selection menu you can do:
@@ -124,4 +124,4 @@ e.g.
 scribe-install --backend whisper --model small
 ```
 
-After that just typing Cmd + scri... at any time from any where will conveniently start the app in its own terminal with the prescribed options.
+After that just typing Cmd + scri... at any time from any where will conveniently start the app in its own terminal with the prescribed options.
@@ -37,8 +37,27 @@ def pick_specialist_model(model, language, backend):
     return model
 
 
+class DummyTranscriber:
+
+    def __init__(self, backend, model_name):
+        self.backend = backend
+        self.model_name = model_name
+
+    def start_recording(self, micro, **kwargs):
+        while True:
+            try:
+                yield {"text": input()}
+            except KeyboardInterrupt:
+                break
+
+    def __getattr__(self, item):
+        return None
+
 def get_transcriber(o, prompt=True):
 
+    if o.dummy:
+        return DummyTranscriber("whisper", "dummy")
+
     if o.backend:
         checked_backend = check_dependencies(o.backend)
         if not checked_backend:
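The `DummyTranscriber` added in this hunk replaces speech recognition with lines read from `input()`, so the rest of the recording loop can be exercised without a microphone or a model. Below is a minimal sketch of the same idea, assuming a hypothetical `ScriptedTranscriber` (not part of the package) that takes its lines from a list instead of stdin so that it runs non-interactively. The helper `collect` is also hypothetical.

```python
class ScriptedTranscriber:
    """Hypothetical stand-in: yields canned results instead of recognized speech."""

    def __init__(self, lines, backend="whisper", model_name="dummy"):
        self.lines = list(lines)
        self.backend = backend
        self.model_name = model_name

    def start_recording(self, micro=None, **kwargs):
        # Same result shape as in the diff: one dict with a "text" key per utterance.
        for line in self.lines:
            yield {"text": line}

    def __getattr__(self, item):
        # Only called for attributes that were never set; mirrors the diff's
        # choice of answering None for anything else (timeout, silence_duration, ...).
        return None


def collect(transcriber):
    """Consume the generator the way a recording loop would and join the text."""
    return " ".join(result["text"] for result in transcriber.start_recording())


transcriber = ScriptedTranscriber(["hello", "world"])
print(collect(transcriber))
print(transcriber.timeout)
```

Returning `None` from `__getattr__` keeps menu code that reads whisper-specific attributes from raising `AttributeError`, at the cost of hiding misspelled attribute names.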
@@ -143,6 +162,8 @@ def get_parser():
     parser.add_argument("-l", "--language", choices=list(language_config["vosk"]),
                         help="An alias for preselected models when using the vosk backend, or 'en' for the English version of whisper models.")
 
+    parser.add_argument("--dummy", action="store_true", help=argparse.SUPPRESS)
+
     parser.add_argument("--no-prompt", action="store_false", dest="prompt", help="Disable prompts for backend and model selection and jump to recording")
     parser.add_argument("--app", action="store_true", help="Start in app mode (relies on pystray)")
 
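The new `--dummy` flag is registered with `help=argparse.SUPPRESS`, which keeps it out of the generated help while still parsing it normally. A small standalone sketch of that behaviour follows; the parser here is a reduced, hypothetical one, not the package's `get_parser()`.

```python
import argparse

# Reduced, hypothetical parser with one hidden and one visible flag.
parser = argparse.ArgumentParser(prog="scribe")
parser.add_argument("--dummy", action="store_true", help=argparse.SUPPRESS)
parser.add_argument("--app", action="store_true", help="Start in app mode")

help_text = parser.format_help()
args = parser.parse_args(["--dummy"])

print("--dummy" in help_text)  # the hidden flag is absent from usage and help
print(args.dummy)              # but it is still accepted on the command line
```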
@@ -151,6 +172,7 @@ def get_parser():
     parser.add_argument("--keyboard", action="store_true")
     parser.add_argument("--no-clipboard", dest="clipboard", action="store_false")
     parser.add_argument("--latency", default=0, type=float, help="keyboard latency")
+    parser.add_argument("--ascii", action="store_true", help="Use unidecode for keyboard typing in ascii")
 
     group = parser.add_argument_group("whisper options")
     group.add_argument("--duration", default=120, type=int, help="Max duration of the whisper recording (default %(default)ss)")
@@ -164,7 +186,7 @@ def get_parser():
 
 
 # Commencer l'enregistrement
-def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, **greetings):
+def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, ascii=False, **greetings):
 
     if keyboard:
         from scribe.keyboard import type_text
@@ -184,7 +206,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
         clear_line()
         print(result.get('text'))
         if keyboard:
-            type_text(result['text'] + " ", interval=latency) # Simulate typing
+            type_text(result['text'] + " ", interval=latency, ascii=ascii) # Simulate typing
 
         if clipboard:
             fulltext += result['text'] + " "
@@ -278,25 +300,37 @@ def main(args=None):
     while True:
         if transcriber is None:
             transcriber = get_transcriber(o, prompt=o.prompt)
-            print(f"
+            print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
         if o.prompt:
-            print(f"Choose any of the following actions
-            print(f"[q] quit")
-            print(f"[e] change model")
-            print(f"[x]
-            print(f"[
-            print(f"[
+            print(f"Choose any of the following actions")
+            print(f"{colored('[q]', 'light_yellow')} quit")
+            print(f"{colored('[e]', 'light_yellow')} change model")
+            print(f"{colored('[x]', 'light_yellow')} app is {colored(o.app, 'light_blue')} toggle?")
+            print(f"{colored('[c]', 'light_yellow')} clipboard is {colored(o.clipboard, 'light_blue')} toggle?")
+            print(f"{colored('[k]', 'light_yellow')} keyboard is {colored(o.keyboard, 'light_blue')} toggle?")
+            if o.keyboard:
+                print(f"{colored('[latency]', 'light_yellow')} between keystrokes is {colored(o.latency, 'light_blue')} s")
             if transcriber.backend == "whisper":
-                print(f"[t] change duration (currently {transcriber.timeout}s)")
-                print(f"[b] change silence duration (currently {transcriber.silence_duration}s)")
-                print(f"[a]
-
+                print(f"{colored('[t]', 'light_yellow')} change duration (currently {colored(transcriber.timeout, 'light_blue')} s)")
+                print(f"{colored('[b]', 'light_yellow')} change silence duration (currently {colored(transcriber.silence_duration, 'light_blue')} s)")
+                print(f"{colored('[a]', 'light_yellow')} auto-restart after silence is {colored(transcriber.restart_after_silence, 'light_blue')} toggle?")
+            exclude_flags = ["keyboard", "clipboard", "app", "prompt", "restart_after_silence"]
+            display_flags = [a.dest for a in parser._actions if a.help != argparse.SUPPRESS]
+            for key, value in vars(o).items():
+                if key not in display_flags or key in exclude_flags or not isinstance(value, bool):
+                    continue
+                print(f"{colored(f'[{key}]', 'light_yellow')} is {colored(value, 'light_blue')} toggle?")
+
+            print(colored(f"Press [Enter] to start recording.", attrs=["bold"]))
 
             key = input()
             if key == "q":
                 exit(0)
             if key == "e":
                 transcriber = None
+                o.model = None
+                o.backend = None
+                o.language = None
                 continue
             if key == "k":
                 o.keyboard = not o.keyboard
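The menu code added in this hunk lists every remaining boolean option whose help is not suppressed, using `parser._actions` and `vars(o)`. The sketch below isolates that filter on a reduced, hypothetical parser. `toggleable_flags` is an illustrative helper name, and `_actions` is a private `argparse` attribute, used here only because the diff uses it.

```python
import argparse

# Reduced, hypothetical parser: one visible boolean, one hidden boolean, one float.
parser = argparse.ArgumentParser()
parser.add_argument("--ascii", action="store_true", help="type in ASCII")
parser.add_argument("--dummy", action="store_true", help=argparse.SUPPRESS)
parser.add_argument("--latency", default=0, type=float, help="keyboard latency")
o = parser.parse_args([])


def toggleable_flags(parser, options, exclude=()):
    """Names of boolean options that are documented and not handled elsewhere."""
    # Destinations of all actions whose help text is not suppressed.
    visible = [a.dest for a in parser._actions if a.help != argparse.SUPPRESS]
    return [
        key
        for key, value in vars(options).items()
        if key in visible and key not in exclude and isinstance(value, bool)
    ]


print(toggleable_flags(parser, o))
```

Because the filter is driven by the parser, a newly added `store_true` option would show up in such a menu without further changes, while hidden flags such as `--dummy` stay out of it.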
@@ -317,6 +351,13 @@ def main(args=None):
                 except:
                     print("Invalid duration. Must be an integer.")
                 continue
+            if key == "latency":
+                ans = input(f"Enter new keyboard latency in seconds (current: {o.latency}): ")
+                try:
+                    o.latency = float(ans)
+                except:
+                    print("Invalid latency. Must be a float.")
+                continue
             if key == "b":
                 ans = input(f"Enter new silence break duration in seconds (current: {transcriber.silence_duration}): ")
                 try:
@@ -324,19 +365,27 @@ def main(args=None):
                 except:
                     print("Invalid duration. Must be an integer.")
                 continue
+            if key:
+                if hasattr(o, key) and isinstance(getattr(o, key), bool):
+                    setattr(o, key, not getattr(o, key))
+                    print(f"Toggle {key} to [{getattr(o, key)}].")
+                print(f"Invalid choice: {key}.")
+                continue
 
         if o.app:
             greetings = dict(
                 start_message = "Listening... Use the try icon menu to stop.",
             )
-            app = create_app(micro, transcriber, clipboard=o.clipboard,
+            app = create_app(micro, transcriber, clipboard=o.clipboard,
+                keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
             print("Starting app...")
             app.run()
         else:
             greetings = dict(
                 start_message = "Listening... Press Ctrl+C to stop.",
             )
-            start_recording(micro, transcriber, clipboard=o.clipboard,
+            start_recording(micro, transcriber, clipboard=o.clipboard,
+                keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
 
         # if we arrived so far, that means we pressed Ctrl + C anyway, and need Enter to move on.
         # So we leave the wider range of options to change the model.
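The `if key:` fallback added in this hunk treats any otherwise unhandled input as the name of a boolean option to flip. A minimal sketch of that toggle-by-name pattern follows, written as a hypothetical helper `toggle_option` that returns its message and reports an invalid choice only when no boolean option matched. `SimpleNamespace` stands in for the parsed options object `o`.

```python
from types import SimpleNamespace


def toggle_option(options, key):
    """Flip a boolean attribute by name and return a message describing the outcome."""
    if hasattr(options, key) and isinstance(getattr(options, key), bool):
        setattr(options, key, not getattr(options, key))
        return f"Toggle {key} to [{getattr(options, key)}]."
    # Reached only when the key is unknown or does not name a boolean option.
    return f"Invalid choice: {key}."


o = SimpleNamespace(ascii=False, latency=0.0)
print(toggle_option(o, "ascii"))
print(toggle_option(o, "latency"))
```

Checking `isinstance(..., bool)` before flipping keeps numeric options such as `latency` from being silently turned into `True` or `False`.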
@@ -2,6 +2,8 @@
 """
 import platform
 import time
+import unidecode
+import logging
 
 try:
     # import pyautogui
@@ -30,11 +32,24 @@ def paste_text():
     keyboard.release('v')
     keyboard.release(Key.ctrl)
 
-def type_text(text, interval=0, paste=False):
+def safe_type_text(text):
+    """I got key errors with the uinput mode, so I'm using unidecode to convert
+    the text to ASCII before typing it."""
+    try:
+        keyboard.type(text)
+    except KeyError:
+        asciitext = unidecode.unidecode(text)
+        logging.warning(f"Key error with {text} -> convert to {asciitext}")
+        keyboard.type(asciitext)
+
+def type_text(text, interval=0, paste=False, ascii=False):
     # Simulate typing a string
     # import subprocess
     # subprocess.run(["ydotool", "type", text])
 
+    if ascii:
+        text = unidecode.unidecode(text)
+
     if paste:
         import pyperclip
         keep_state = pyperclip.paste()
@@ -45,7 +60,9 @@ def type_text(text, interval=0, paste=False):
 
     if interval > 0:
         for c in text:
-            keyboard.type(c)
+            # keyboard.type(c)
+            safe_type_text(c)
             time.sleep(interval)
     else:
-        keyboard.type(text)
+        # keyboard.type(text)
+        safe_type_text(text)
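The keyboard changes follow a try-then-fall-back pattern: type the text as-is, and only on a `KeyError` retype an ASCII transliteration. The sketch below illustrates that pattern without any third-party dependency. It is an approximation under stated assumptions: `to_ascii` uses the standard library's `unicodedata` instead of `unidecode` (it strips accents but drops characters `unidecode` would transliterate), `FakeKeyboard` is a hypothetical stand-in for the real controller, and `safe_type_text` takes the keyboard as a parameter rather than using a module-level object.

```python
import logging
import unicodedata


def to_ascii(text):
    # Stdlib approximation, NOT unidecode: decompose accented characters,
    # then drop whatever still is not ASCII.
    return unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")


class FakeKeyboard:
    """Hypothetical controller that rejects non-ASCII text with a KeyError."""

    def __init__(self):
        self.typed = []

    def type(self, text):
        if not text.isascii():
            raise KeyError(text)
        self.typed.append(text)


def safe_type_text(keyboard, text):
    try:
        keyboard.type(text)
    except KeyError:
        asciitext = to_ascii(text)
        logging.warning("Key error with %r -> convert to %r", text, asciitext)
        keyboard.type(asciitext)


kb = FakeKeyboard()
safe_type_text(kb, "café")
safe_type_text(kb, "ok")
print(kb.typed)
```

Falling back only on failure keeps accented characters intact on setups that can type them, whereas the `--ascii` option converts everything up front.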
@@ -102,6 +102,7 @@ class AbstractTranscriber:
 def get_vosk_model(model, download_root=None, url=None):
     """Load the Vosk recognizer"""
     import vosk
+    vosk.SetLogLevel(-1)
     if download_root is None:
         download_root = VOSK_MODELS_FOLDER
     model_path = os.path.join(download_root, model)
@@ -3,26 +3,7 @@ import re
 import tqdm
 import shutil
 from functools import partial
-
-
-class bcolors:
-    # https://stackoverflow.com/a/287944/2192272
-    HEADER = '\033[95m'
-    OKBLUE = '\033[94m'
-    OKGREEN = '\033[92m'
-    WARNING = '\033[93m'
-    FAIL = '\033[91m'
-    ENDC = '\033[0m'
-    BOLD = '\033[1m'
-    UNDERLINE = '\033[4m'
-
-def strip_colors(s):
-    for name, c in vars(bcolors).items():
-        if name.startswith("_"):
-            continue
-        s = s.replace(c, '')
-    return s
-
+from termcolor import colored
 
 def ansi_link(uri, label=None):
     """https://stackoverflow.com/a/71309268/2192272
@@ -36,25 +17,6 @@ def ansi_link(uri, label=None):
 
     return escape_mask.format(parameters, uri, label)
 
-def colored(text, color):
-    if hasattr(bcolors, color):
-        color = getattr(bcolors, color)
-    return f"{color}{text}{bcolors.ENDC}"
-
-
-ANSI_LINK_RE = re.compile(r'(?P<ansi_sequence>\033]8;(?P<parameter>.*?);(?P<uri>.*?)\033\\(?P<label>.*?)\033]8;;\033\\)')
-
-def strip_ansi_link(s):
-    for m in ANSI_LINK_RE.findall(s):
-        s = s.replace(m[0], m[3])
-    return s
-
-
-def strip_all(s):
-    s = strip_colors(s)
-    s = strip_ansi_link(s)
-    return s
-
 
 # Function to clear the terminal line
 def clear_line():
@@ -119,9 +81,9 @@ def format_choice(enum, default=None, unavailable=None):
             value_str = value
 
         if (default is not None and value == default) or (default is None and i == 0):
-            return f' ' + colored(f'({i+1}) {value_str} [Press Enter]', '
+            return f' ' + colored(f'({i+1}) {value_str} [Press Enter]', attrs=['bold'])
         elif unavailable and value in unavailable:
-            return f' ' + colored(f'{" "} {value_str} -> unavailable !!',
+            return f' ' + colored(f'{" "} {value_str} -> unavailable !!', attrs=["strike"])
         else:
             return f' ({i+1}) {value_str}'
 
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.7.6
+Version: 0.7.8
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -44,6 +44,8 @@ Requires-Dist: sounddevice
 Requires-Dist: tqdm
 Requires-Dist: requests
 Requires-Dist: pyperclip
+Requires-Dist: unidecode
+Requires-Dist: termcolor
 Provides-Extra: keyboard
 Requires-Dist: pynput; extra == "keyboard"
 Provides-Extra: whisper
@@ -60,7 +62,7 @@ Requires-Dist: vosk; extra == "all"
 Requires-Dist: pystray; extra == "all"
 
 []()
-[](https://pypi.org/project/scribe-cli)
 
 # Scribe
 
@@ -83,7 +85,7 @@ sudo apt-get install portaudio19-dev xclip
 See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
 
 ```bash
-pip install scribe-cli[all]
+pip install scribe-cli[all]
 ```
 
 (note the `-cli` suffix for client)
@@ -121,7 +123,7 @@ With the `whisker` model you need to stop the registration manually before the t
 there is a maximum duration after which it will stop by itself, which is setup to 60s by default (unless `--duration` is set to something else).
 
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
-
+It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 
 To skip the initial selection menu you can do:
File without changes