scribe-cli 0.6.1__tar.gz → 0.6.2__tar.gz
- {scribe_cli-0.6.1/scribe_cli.egg-info → scribe_cli-0.6.2}/PKG-INFO +14 -5
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/README.md +14 -5
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/_version.py +2 -2
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/keyboard.py +0 -1
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/models.py +7 -7
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/streamer.py +11 -5
- {scribe_cli-0.6.1 → scribe_cli-0.6.2/scribe_cli.egg-info}/PKG-INFO +14 -5
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/.gitignore +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/LICENSE +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/pyproject.toml +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/__init__.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/audio.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/models.toml +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/saverecording.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/testpynput.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe/util.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_cli.egg-info/SOURCES.txt +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_cli.egg-info/requires.txt +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_data/share/icon.jpg +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.6.1 → scribe_cli-0.6.2}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.6.
|
|
3
|
+
Version: 0.6.2
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -109,6 +109,7 @@ there is a maximum duration after which it will stop by itself, which is setup t
|
|
|
109
109
|
|
|
110
110
|
The `vosk` backend is good at
|
|
111
111
|
doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
|
|
112
|
+
Use it mainly for longer typing sessions with the [keyboard](#virtual-keyboard-experimental) option, e.g. to take notes.
|
|
112
113
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and a few of them are associated here with [a handful of languages](scribe/models.toml): `en`, `fr`, `it`, `de` (so far).
|
|
113
114
|
|
|
114
115
|
To skip the initial selection menu you can do:
|
|
@@ -117,9 +118,9 @@ scribe --backend whisper --model small --no-prompt
|
|
|
117
118
|
```
|
|
118
119
|
where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
|
|
119
120
|
|
|
120
|
-
###
|
|
121
|
+
### Virtual keyboard (experimental)
|
|
121
122
|
|
|
122
|
-
By default the content of the transcription is
|
|
123
|
+
By default the content of the transcription is copied to the clipboard, and it is up to the user to paste it (e.g. Ctrl + V).
|
|
123
124
|
With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the application under focus:
|
|
124
125
|
|
|
125
126
|
```bash
|
|
@@ -128,10 +129,18 @@ scribe --keyboard
|
|
|
128
129
|
|
|
129
130
|
It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
130
131
|
|
|
131
|
-
`pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html)
|
|
132
|
-
|
|
132
|
+
`pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html) and has [limitations](https://pynput.readthedocs.io/en/latest/limitations.html). On my Ubuntu + Wayland system it works in Chromium-based applications (including VS Code) but not in Firefox, Sublime Text, or any other application (not even in a terminal!). One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf`, uncomment `# WaylandEnable=false` and restart.
|
|
133
|
+
|
|
133
134
|
|
|
134
135
|
### Start as an application in Ubuntu
|
|
135
136
|
|
|
136
137
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
137
138
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
139
|
+
|
|
140
|
+
e.g.
|
|
141
|
+
|
|
142
|
+
```bash
|
|
143
|
+
scribe-install --backend whisper --model small
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
After that, just typing Cmd + scri... at any time from anywhere will conveniently start the app in its own terminal with the prescribed options.
|
|
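The Wayland limitation described in the virtual keyboard section of this hunk can be anticipated before enabling `--keyboard`. The following is a minimal sketch, not part of `scribe`: it assumes the desktop session exports the `XDG_SESSION_TYPE` environment variable (common on systemd-based Linux desktops, but not guaranteed), and the function names are hypothetical.

```python
import os


def session_type(environ=None):
    """Return the display-server type reported by the session, lower-cased.

    Assumption: XDG_SESSION_TYPE is set (typically "x11", "wayland" or "tty").
    Falls back to "unknown" when the variable is missing.
    """
    env = os.environ if environ is None else environ
    return env.get("XDG_SESSION_TYPE", "unknown").lower()


def keyboard_injection_may_fail(environ=None):
    """Heuristic only: simulated key presses are often blocked under Wayland."""
    return session_type(environ) == "wayland"


if keyboard_injection_may_fail():
    print("Wayland session detected: the --keyboard option may not reach every application.")
```

A check like this only warns; it cannot tell which individual applications will accept the simulated key presses.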
@@ -52,6 +52,7 @@ there is a maximum duration after which it will stop by itself, which is setup t
|
|
|
52
52
|
|
|
53
53
|
The `vosk` backend is good at
|
|
54
54
|
doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
|
|
55
|
+
Use it mainly for longer typing sessions with the [keyboard](#virtual-keyboard-experimental) option, e.g. to take notes.
|
|
55
56
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and a few of them are associated here with [a handful of languages](scribe/models.toml): `en`, `fr`, `it`, `de` (so far).
|
|
56
57
|
|
|
57
58
|
To skip the initial selection menu you can do:
|
|
@@ -60,9 +61,9 @@ scribe --backend whisper --model small --no-prompt
|
|
|
60
61
|
```
|
|
61
62
|
where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
|
|
62
63
|
|
|
63
|
-
###
|
|
64
|
+
### Virtual keyboard (experimental)
|
|
64
65
|
|
|
65
|
-
By default the content of the transcription is
|
|
66
|
+
By default the content of the transcription is copied to the clipboard, and it is up to the user to paste it (e.g. Ctrl + V).
|
|
66
67
|
With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the application under focus:
|
|
67
68
|
|
|
68
69
|
```bash
|
|
@@ -71,10 +72,18 @@ scribe --keyboard
|
|
|
71
72
|
|
|
72
73
|
It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
73
74
|
|
|
74
|
-
`pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html)
|
|
75
|
-
|
|
75
|
+
`pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html) and has [limitations](https://pynput.readthedocs.io/en/latest/limitations.html). On my Ubuntu + Wayland system it works in Chromium-based applications (including VS Code) but not in Firefox, Sublime Text, or any other application (not even in a terminal!). One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf`, uncomment `# WaylandEnable=false` and restart.
|
|
76
|
+
|
|
76
77
|
|
|
77
78
|
### Start as an application in Ubuntu
|
|
78
79
|
|
|
79
80
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
80
|
-
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
81
|
+
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
82
|
+
|
|
83
|
+
e.g.
|
|
84
|
+
|
|
85
|
+
```bash
|
|
86
|
+
scribe-install --backend whisper --model small
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
After that, just typing Cmd + scri... at any time from anywhere will conveniently start the app in its own terminal with the prescribed options.
|
|
@@ -95,16 +95,16 @@ class AbstractTranscriber:
|
|
|
95
95
|
print(stop_message)
|
|
96
96
|
|
|
97
97
|
|
|
98
|
-
def get_vosk_model(model,
|
|
98
|
+
def get_vosk_model(model, download_root=None, url=None):
|
|
99
99
|
"""Load the Vosk recognizer"""
|
|
100
100
|
import vosk
|
|
101
|
-
if
|
|
102
|
-
|
|
103
|
-
model_path = os.path.join(
|
|
101
|
+
if download_root is None:
|
|
102
|
+
download_root = VOSK_MODELS_FOLDER
|
|
103
|
+
model_path = os.path.join(download_root, model)
|
|
104
104
|
if not os.path.exists(model_path):
|
|
105
105
|
if url is None:
|
|
106
106
|
url = f"https://alphacephei.com/vosk/models/{model}.zip"
|
|
107
|
-
download_model(url,
|
|
107
|
+
download_model(url, download_root)
|
|
108
108
|
assert os.path.exists(model_path)
|
|
109
109
|
|
|
110
110
|
return vosk.Model(model_path)
|
|
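The new `download_root` parameter in `get_vosk_model` follows a common "fall back to a default folder" pattern. The sketch below isolates that pattern without importing `vosk` or downloading anything. `DEFAULT_MODELS_FOLDER` is a hypothetical stand-in for the package's `VOSK_MODELS_FOLDER` (whose real value is not shown in this diff), and the helper names and the model name are illustrative only.

```python
import os
import tempfile

# Hypothetical default; the package defines its own VOSK_MODELS_FOLDER elsewhere.
DEFAULT_MODELS_FOLDER = os.path.join(os.path.expanduser("~"), ".cache", "vosk-models-example")


def resolve_model_path(model, download_root=None):
    """Return (download_root, model_path), using the default folder when none is given."""
    if download_root is None:
        download_root = DEFAULT_MODELS_FOLDER
    return download_root, os.path.join(download_root, model)


def model_url(model, url=None):
    """Build the default download URL when the caller does not supply one."""
    if url is None:
        url = f"https://alphacephei.com/vosk/models/{model}.zip"
    return url


# Default location versus a caller-supplied folder.
root, path = resolve_model_path("vosk-model-small-en-us-0.15")
custom_root, custom_path = resolve_model_path(
    "vosk-model-small-en-us-0.15", download_root=tempfile.gettempdir()
)
print(path)
print(custom_path)
```

Resolving the folder once and reusing it for both the existence check and the download keeps the two in agreement, which appears to be the intent of the change.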
@@ -162,11 +162,11 @@ class WhisperTranscriber(AbstractTranscriber):
|
|
|
162
162
|
def __init__(self, model_name, language=None, model=None, model_kwargs={}, **kwargs):
|
|
163
163
|
import whisper
|
|
164
164
|
if model is None:
|
|
165
|
-
model = whisper.load_model(model_name)
|
|
165
|
+
model = whisper.load_model(model_name, **model_kwargs)
|
|
166
166
|
super().__init__(model, model_name, language, model_kwargs=model_kwargs, **kwargs)
|
|
167
167
|
|
|
168
168
|
def transcribe_audio(self, audio_bytes):
|
|
169
|
-
print("
|
|
169
|
+
print("\nTranscribing...")
|
|
170
170
|
audio_array = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
|
|
171
171
|
return self.model.transcribe(audio_array, fp16=False, language=self.language)
|
|
172
172
|
|
|
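The `transcribe_audio` context lines convert raw 16-bit PCM bytes into floats by dividing by 32768. As a rough illustration of that normalization, here is a dependency-free sketch using only the standard library instead of NumPy. It assumes little-endian input and a 2-byte `short`, and `pcm16_to_float` is a hypothetical name, not a function from the package.

```python
import array
import struct
import sys


def pcm16_to_float(audio_bytes):
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")  # signed short; assumed to be 2 bytes wide
    samples.frombytes(audio_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # the input is assumed to be little-endian
    # int16 spans -32768..32767, so dividing by 32768 maps it into [-1.0, 1.0).
    return [s / 32768.0 for s in samples]


# Three samples: silence, half scale, and the most negative value.
raw = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_float(raw))  # [0.0, 0.5, -1.0]
```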
@@ -47,7 +47,7 @@ def get_transcriber(o, prompt=True):
|
|
|
47
47
|
backend = o.backend
|
|
48
48
|
|
|
49
49
|
elif not prompt:
|
|
50
|
-
backend =
|
|
50
|
+
backend = BACKENDS[0]
|
|
51
51
|
|
|
52
52
|
else:
|
|
53
53
|
checked_backend = False
|
|
@@ -113,14 +113,16 @@ def get_transcriber(o, prompt=True):
|
|
|
113
113
|
samplerate=o.samplerate,
|
|
114
114
|
timeout=None, # vosk keeps going (no timeout)
|
|
115
115
|
silence_duration=None, # vosk handles silences internally
|
|
116
|
-
model_kwargs={"
|
|
116
|
+
model_kwargs={"download_root": o.download_folder_vosk})
|
|
117
117
|
except Exception as error:
|
|
118
118
|
print(error)
|
|
119
119
|
print(f"Failed to (down)load model {model}.")
|
|
120
120
|
exit(1)
|
|
121
121
|
|
|
122
122
|
elif backend == "whisper":
|
|
123
|
-
transcriber = WhisperTranscriber(model_name=model, language=o.language, samplerate=o.samplerate,
|
|
123
|
+
transcriber = WhisperTranscriber(model_name=model, language=o.language, samplerate=o.samplerate,
|
|
124
|
+
timeout=o.duration, silence_duration=o.silence, restart_after_silence=o.restart_after_silence,
|
|
125
|
+
model_kwargs={"download_root": o.download_folder_whisper})
|
|
124
126
|
|
|
125
127
|
else:
|
|
126
128
|
raise ValueError(f"Unknown backend: {backend}")
|
|
@@ -153,7 +155,8 @@ def get_parser():
|
|
|
153
155
|
group.add_argument("--silence", default=2, type=float, help="silence duration that prompt transcription (whisper) (default %(default)ss)")
|
|
154
156
|
group.add_argument("--restart-after-silence", action="store_true", help="Restart the recording after a transcription triggered by a silence")
|
|
155
157
|
|
|
156
|
-
parser.add_argument("--
|
|
158
|
+
parser.add_argument("--download-folder-vosk", help="Folder to store Vosk models.")
|
|
159
|
+
parser.add_argument("--download-folder-whisper", help="Folder to store Whisper models.")
|
|
157
160
|
|
|
158
161
|
return parser
|
|
159
162
|
|
|
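The two new options are read elsewhere in this diff as `o.download_folder_vosk` and `o.download_folder_whisper`. The following minimal sketch shows why those attribute names work: `argparse` turns the dashes of a long option into underscores and defaults to `None` when the option is omitted. The parser here is a stripped-down stand-in (hypothetical `prog` name, no other options), not the package's `get_parser`.

```python
import argparse


def build_parser():
    """Parser exposing only the two download-folder options from the hunk above."""
    parser = argparse.ArgumentParser(prog="scribe-example")
    parser.add_argument("--download-folder-vosk", help="Folder to store Vosk models.")
    parser.add_argument("--download-folder-whisper", help="Folder to store Whisper models.")
    return parser


o = build_parser().parse_args(["--download-folder-vosk", "/tmp/vosk"])
print(o.download_folder_vosk)     # /tmp/vosk
print(o.download_folder_whisper)  # None
```

Because the default is `None`, the value can be passed straight through as `download_root` and left to the loader to replace with its own default folder.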
@@ -235,7 +238,7 @@ def main(args=None):
|
|
|
235
238
|
if transcriber.backend == "whisper":
|
|
236
239
|
print(f"[t] change duration (currently {transcriber.timeout}s)")
|
|
237
240
|
print(f"[b] change silence duration (currently {transcriber.silence_duration}s)")
|
|
238
|
-
print(f"[a] toggle auto-restart after silence [{toggle[
|
|
241
|
+
print(f"[a] toggle auto-restart after silence [{toggle[transcriber.restart_after_silence]}] -> [{toggle[not transcriber.restart_after_silence]}]")
|
|
239
242
|
print(colored(f"Press [Enter] or any other key to start recording.", "BOLD"))
|
|
240
243
|
|
|
241
244
|
key = input()
|
|
@@ -250,6 +253,9 @@ def main(args=None):
|
|
|
250
253
|
if key == "c":
|
|
251
254
|
o.clipboard = not o.clipboard
|
|
252
255
|
continue
|
|
256
|
+
if key == "a":
|
|
257
|
+
transcriber.restart_after_silence = not transcriber.restart_after_silence
|
|
258
|
+
continue
|
|
253
259
|
if key == "t":
|
|
254
260
|
ans = input(f"Enter new duration in seconds (current: {transcriber.timeout}): ")
|
|
255
261
|
try:
|
|
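The `[a]` menu entry and its new handler flip a boolean and display the current and next state. A self-contained sketch of that toggle is given below. The `TOGGLE` labels and the `Settings` class are assumptions made for illustration, since the diff does not show how `toggle` or the transcriber attributes are defined.

```python
from dataclasses import dataclass

# Assumed labels; the package's own `toggle` mapping is not visible in this diff.
TOGGLE = {True: "on", False: "off"}


@dataclass
class Settings:
    """Stand-in for the transcriber object holding the flag."""
    restart_after_silence: bool = False


def menu_line(settings):
    """Render the menu entry as 'current state -> state after toggling'."""
    current = settings.restart_after_silence
    return f"[a] toggle auto-restart after silence [{TOGGLE[current]}] -> [{TOGGLE[not current]}]"


def handle_key(settings, key):
    """Flip the flag when 'a' is pressed; return True if the key was handled."""
    if key == "a":
        settings.restart_after_silence = not settings.restart_after_silence
        return True
    return False


settings = Settings()
print(menu_line(settings))
handle_key(settings, "a")
print(menu_line(settings))
```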
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.6.
|
|
3
|
+
Version: 0.6.2
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -109,6 +109,7 @@ there is a maximum duration after which it will stop by itself, which is setup t
|
|
|
109
109
|
|
|
110
110
|
The `vosk` backend is good at
|
|
111
111
|
doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
|
|
112
|
+
Use it mainly for longer typing sessions with the [keyboard](#virtual-keyboard-experimental) option, e.g. to take notes.
|
|
112
113
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and a few of them are associated here with [a handful of languages](scribe/models.toml): `en`, `fr`, `it`, `de` (so far).
|
|
113
114
|
|
|
114
115
|
To skip the initial selection menu you can do:
|
|
@@ -117,9 +118,9 @@ scribe --backend whisper --model small --no-prompt
|
|
|
117
118
|
```
|
|
118
119
|
where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
|
|
119
120
|
|
|
120
|
-
###
|
|
121
|
+
### Virtual keyboard (experimental)
|
|
121
122
|
|
|
122
|
-
By default the content of the transcription is
|
|
123
|
+
By default the content of the transcription is copied to the clipboard, and it is up to the user to paste it (e.g. Ctrl + V).
|
|
123
124
|
With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the application under focus:
|
|
124
125
|
|
|
125
126
|
```bash
|
|
@@ -128,10 +129,18 @@ scribe --keyboard
|
|
|
128
129
|
|
|
129
130
|
It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
|
|
130
131
|
|
|
131
|
-
`pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html)
|
|
132
|
-
|
|
132
|
+
`pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html) and has [limitations](https://pynput.readthedocs.io/en/latest/limitations.html). On my Ubuntu + Wayland system it works in Chromium-based applications (including VS Code) but not in Firefox, Sublime Text, or any other application (not even in a terminal!). One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf`, uncomment `# WaylandEnable=false` and restart.
|
|
133
|
+
|
|
133
134
|
|
|
134
135
|
### Start as an application in Ubuntu
|
|
135
136
|
|
|
136
137
|
If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will create a `scribe.desktop` file and place it under `$HOME/.local/share/applications`
|
|
137
138
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`.
|
|
139
|
+
|
|
140
|
+
e.g.
|
|
141
|
+
|
|
142
|
+
```bash
|
|
143
|
+
scribe-install --backend whisper --model small
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
After that, just typing Cmd + scri... at any time from anywhere will conveniently start the app in its own terminal with the prescribed options.
|
|
File without changes
|
|