scribe-cli 0.9.0__tar.gz → 0.11.0__tar.gz
- {scribe_cli-0.9.0/scribe_cli.egg-info → scribe_cli-0.11.0}/PKG-INFO +28 -19
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/README.md +22 -18
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/pyproject.toml +2 -1
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/_version.py +2 -2
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/app.py +122 -56
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/models.py +56 -7
- {scribe_cli-0.9.0 → scribe_cli-0.11.0/scribe_cli.egg-info}/PKG-INFO +28 -19
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/requires.txt +6 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/.github/workflows/pypi.yml +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/.gitignore +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/LICENSE +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/icon.xcf +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/__init__.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/audio.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/install_desktop.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/keyboard.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/models.toml +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/saverecording.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/testpynput.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe/util.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/SOURCES.txt +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/dependency_links.txt +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/entry_points.txt +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_cli.egg-info/top_level.txt +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_data/__init__.py +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_data/share/icon.png +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_data/share/icon_recording.png +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_data/share/icon_writing.png +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/scribe_data/templates/scribe.desktop +0 -0
- {scribe_cli-0.9.0 → scribe_cli-0.11.0}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.2
|
|
2
2
|
Name: scribe-cli
|
|
3
|
-
Version: 0.
|
|
3
|
+
Version: 0.11.0
|
|
4
4
|
Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
|
|
5
5
|
Author-email: Mahé Perrette <mahe.perrette@gmail.com>
|
|
6
6
|
License: MIT License
|
|
@@ -55,9 +55,14 @@ Requires-Dist: vosk; extra == "vosk"
|
|
|
55
55
|
Provides-Extra: app
|
|
56
56
|
Requires-Dist: pystray; extra == "app"
|
|
57
57
|
Requires-Dist: PyGObject; extra == "app"
|
|
58
|
+
Provides-Extra: openai
|
|
59
|
+
Requires-Dist: openai; extra == "openai"
|
|
60
|
+
Requires-Dist: soundfile; extra == "openai"
|
|
58
61
|
Provides-Extra: all
|
|
59
62
|
Requires-Dist: pynput; extra == "all"
|
|
60
63
|
Requires-Dist: openai-whisper; extra == "all"
|
|
64
|
+
Requires-Dist: openai; extra == "all"
|
|
65
|
+
Requires-Dist: soundfile; extra == "all"
|
|
61
66
|
Requires-Dist: vosk; extra == "all"
|
|
62
67
|
Requires-Dist: pystray; extra == "all"
|
|
63
68
|
|
|
@@ -66,7 +71,9 @@ Requires-Dist: pystray; extra == "all"
|
|
|
66
71
|
|
|
67
72
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
68
73
|
|
|
69
|
-
`scribe` is a
|
|
74
|
+
`scribe` is a speech recognition tool that provides real-time transcription using cutting-edge AI models, with the goal of serving as a virtual keyboard on a computer.
|
|
75
|
+
|
|
76
|
+
It features local, downloadable models with the `vosk` and `whisper` backends, as well as a client to open AI via `openaiapi` backend (API key required).
|
|
70
77
|
|
|
71
78
|
## Compatibility
|
|
72
79
|
|
|
@@ -101,12 +108,10 @@ cd scribe
|
|
|
101
108
|
pip install -e .[all]
|
|
102
109
|
```
|
|
103
110
|
|
|
104
|
-
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` packages (see Usage below).
|
|
105
|
-
|
|
106
|
-
The `vosk` language models will download on-the-fly.
|
|
107
|
-
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache` (note for the `whisper` backend
|
|
108
|
-
the default is left to the `openai-whisper` package and might change in the future).
|
|
111
|
+
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
109
112
|
|
|
113
|
+
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
114
|
+
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
|
|
110
115
|
|
|
111
116
|
## Usage
|
|
112
117
|
|
|
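The README text in this hunk states that local models are stored under `$XDG_CACHE_HOME/{backend}`, with `$XDG_CACHE_HOME` falling back to `$HOME/.cache`. A minimal sketch of that rule follows; `default_download_folder` is a hypothetical helper written for illustration and is not taken from the scribe sources.

```python
import os
from pathlib import Path


def default_download_folder(backend, environ=None):
    """Return the cache folder for a backend, following the rule in the README.

    `environ` is injectable so the function can be exercised without touching
    the real environment; it defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    # An unset or empty XDG_CACHE_HOME falls back to ~/.cache
    cache_home = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(cache_home) / backend


folder = default_download_folder("vosk", {"XDG_CACHE_HOME": "/tmp/cache"})
```

The `--download-folder-vosk` and `--download-folder-whisper` options shown later in this diff would presumably override such a default.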
@@ -115,7 +120,7 @@ Just type in the terminal:
|
|
|
115
120
|
```bash
|
|
116
121
|
scribe
|
|
117
122
|
```
|
|
118
|
-
and the script will guide you through the choice of backend (`whisper` or `vosk`) and the specific language model.
|
|
123
|
+
and the script will guide you through the choice of backend (`whisper` or `vosk` or `openaiapi`) and the specific language model.
|
|
119
124
|
After this, you will be prompted to start recording your microphone and print the transcribed text in real-time (`vosk`)
|
|
120
125
|
or until after recording is complete (`whisper`).
|
|
121
126
|
You can interrupt the recording via Ctrl + C and start again or change model.
|
|
@@ -129,9 +134,9 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
129
134
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
130
135
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
131
136
|
|
|
132
|
-
|
|
137
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
133
138
|
```bash
|
|
134
|
-
scribe --backend
|
|
139
|
+
scribe --backend openaiapi --api YOURAPIKEY
|
|
135
140
|
```
|
|
136
141
|
where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
|
|
137
142
|
|
|
@@ -190,7 +195,9 @@ To activate start with:
|
|
|
190
195
|
```bash
|
|
191
196
|
scribe --app
|
|
192
197
|
```
|
|
193
|
-
or toggle the app option in the interactive menu. The scribe icon will show, with Record
|
|
198
|
+
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
199
|
+
of predefined models, or to Quit and choose from the terminal before pressing Enter again.
|
|
200
|
+
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
|
|
194
201
|
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
|
|
195
202
|
|
|
196
203
|
```bash
|
|
@@ -204,17 +211,19 @@ If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will
|
|
|
204
211
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
205
212
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
206
213
|
|
|
207
|
-
|
|
214
|
+
In a relatively basic form
|
|
215
|
+
|
|
216
|
+
```bash
|
|
217
|
+
scribe-install --clipboard --api YOUROPENAIAPIKEY
|
|
218
|
+
```
|
|
219
|
+
(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
|
|
220
|
+
|
|
221
|
+
And to make an app running outside the terminal:
|
|
208
222
|
|
|
209
223
|
```bash
|
|
210
|
-
scribe-install
|
|
211
|
-
scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
|
|
212
|
-
scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
|
|
224
|
+
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --api YOUROPENAIAPIKEY
|
|
213
225
|
```
|
|
214
|
-
This will install
|
|
215
|
-
- `Super + scribe` : will launch the default version with terminal prompt
|
|
216
|
-
- `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard
|
|
217
|
-
- `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal.
|
|
226
|
+
This will install two separate apps (names "Scribe" and "Scribe App")
|
|
218
227
|
|
|
219
228
|
|
|
220
229
|
## Fine tuning
|
|
@@ -3,7 +3,9 @@
|
|
|
3
3
|
|
|
4
4
|
# Scribe <img src="scribe_data/share/icon.png" width=48px>
|
|
5
5
|
|
|
6
|
-
`scribe` is a
|
|
6
|
+
`scribe` is a speech recognition tool that provides real-time transcription using cutting-edge AI models, with the goal of serving as a virtual keyboard on a computer.
|
|
7
|
+
|
|
8
|
+
It features local, downloadable models with the `vosk` and `whisper` backends, as well as a client to open AI via `openaiapi` backend (API key required).
|
|
7
9
|
|
|
8
10
|
## Compatibility
|
|
9
11
|
|
|
@@ -38,12 +40,10 @@ cd scribe
|
|
|
38
40
|
pip install -e .[all]
|
|
39
41
|
```
|
|
40
42
|
|
|
41
|
-
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` packages (see Usage below).
|
|
42
|
-
|
|
43
|
-
The `vosk` language models will download on-the-fly.
|
|
44
|
-
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache` (note for the `whisper` backend
|
|
45
|
-
the default is left to the `openai-whisper` package and might change in the future).
|
|
43
|
+
You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
|
|
46
44
|
|
|
45
|
+
The language models for local backends `vosk` and `whisper` will download on-the-fly.
|
|
46
|
+
The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
|
|
47
47
|
|
|
48
48
|
## Usage
|
|
49
49
|
|
|
@@ -52,7 +52,7 @@ Just type in the terminal:
|
|
|
52
52
|
```bash
|
|
53
53
|
scribe
|
|
54
54
|
```
|
|
55
|
-
and the script will guide you through the choice of backend (`whisper` or `vosk`) and the specific language model.
|
|
55
|
+
and the script will guide you through the choice of backend (`whisper` or `vosk` or `openaiapi`) and the specific language model.
|
|
56
56
|
After this, you will be prompted to start recording your microphone and print the transcribed text in real-time (`vosk`)
|
|
57
57
|
or until after recording is complete (`whisper`).
|
|
58
58
|
You can interrupt the recording via Ctrl + C and start again or change model.
|
|
@@ -66,9 +66,9 @@ The `vosk` backend is much faster and very good at doing real-time transcription
|
|
|
66
66
|
It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
|
|
67
67
|
There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
|
|
68
68
|
|
|
69
|
-
|
|
69
|
+
The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
|
|
70
70
|
```bash
|
|
71
|
-
scribe --backend
|
|
71
|
+
scribe --backend openaiapi --api YOURAPIKEY
|
|
72
72
|
```
|
|
73
73
|
where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
|
|
74
74
|
|
|
@@ -127,7 +127,9 @@ To activate start with:
|
|
|
127
127
|
```bash
|
|
128
128
|
scribe --app
|
|
129
129
|
```
|
|
130
|
-
or toggle the app option in the interactive menu. The scribe icon will show, with Record
|
|
130
|
+
or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
|
|
131
|
+
of predefined models, or to Quit and choose from the terminal before pressing Enter again.
|
|
132
|
+
For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
|
|
131
133
|
That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
|
|
132
134
|
|
|
133
135
|
```bash
|
|
@@ -141,17 +143,19 @@ If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will
|
|
|
141
143
|
to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
|
|
142
144
|
`--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
|
|
143
145
|
|
|
144
|
-
|
|
146
|
+
In a relatively basic form
|
|
147
|
+
|
|
148
|
+
```bash
|
|
149
|
+
scribe-install --clipboard --api YOUROPENAIAPIKEY
|
|
150
|
+
```
|
|
151
|
+
(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
|
|
152
|
+
|
|
153
|
+
And to make an app running outside the terminal:
|
|
145
154
|
|
|
146
155
|
```bash
|
|
147
|
-
scribe-install
|
|
148
|
-
scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
|
|
149
|
-
scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
|
|
156
|
+
scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --api YOUROPENAIAPIKEY
|
|
150
157
|
```
|
|
151
|
-
This will install
|
|
152
|
-
- `Super + scribe` : will launch the default version with terminal prompt
|
|
153
|
-
- `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard
|
|
154
|
-
- `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal.
|
|
158
|
+
This will install two separate apps (names "Scribe" and "Scribe App")
|
|
155
159
|
|
|
156
160
|
|
|
157
161
|
## Fine tuning
|
|
@@ -44,7 +44,8 @@ keyboard = ["pynput"]
|
|
|
44
44
|
whisper = ["openai-whisper"]
|
|
45
45
|
vosk = ["vosk"]
|
|
46
46
|
app = ["pystray", "PyGObject"]
|
|
47
|
-
|
|
47
|
+
openai = ["openai", "soundfile"]
|
|
48
|
+
all = ["pynput", "openai-whisper", "openai", "soundfile", "vosk", "pystray"]
|
|
48
49
|
|
|
49
50
|
|
|
50
51
|
[tool.setuptools]
|
|
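The new `openai` extra pulls in the `openai` and `soundfile` packages, and the README asks for at least one backend package to be installed. The sketch below shows one way to check which backends are importable. `BACKEND_MODULES` and `available_backends` are hypothetical names for illustration; scribe's own `check_dependencies` helper is not shown in this diff and may work differently.

```python
import importlib.util

# Backend name -> top-level module provided by the matching package
# (vosk -> vosk, openai-whisper -> whisper, openai -> openai).
BACKEND_MODULES = {
    "vosk": "vosk",
    "whisper": "whisper",
    "openaiapi": "openai",
}


def available_backends(modules=BACKEND_MODULES):
    """List the backends whose module can be found, without importing it."""
    return [
        backend
        for backend, module in modules.items()
        if importlib.util.find_spec(module) is not None
    ]
```

Using `find_spec` avoids paying the import cost of a heavy package just to learn whether it is installed.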
@@ -4,8 +4,8 @@ import re
|
|
|
4
4
|
import time
|
|
5
5
|
import argparse
|
|
6
6
|
from scribe.audio import Microphone
|
|
7
|
-
from scribe.util import print_partial, clear_line, prompt_choices,
|
|
8
|
-
from scribe.models import VoskTranscriber, WhisperTranscriber
|
|
7
|
+
from scribe.util import print_partial, clear_line, prompt_choices, ansi_link, colored
|
|
8
|
+
from scribe.models import VoskTranscriber, WhisperTranscriber, OpenaiAPITranscriber
|
|
9
9
|
|
|
10
10
|
with open(Path(__file__).parent / "models.toml", "rb") as f:
|
|
11
11
|
language_config_default = tomllib.load(f)
|
|
@@ -24,7 +24,7 @@ def get_default_backend():
|
|
|
24
24
|
except ImportError:
|
|
25
25
|
raise ImportError("Please install either vosk or whisper to use this script.")
|
|
26
26
|
|
|
27
|
-
BACKENDS = ["whisper", "vosk"]
|
|
27
|
+
BACKENDS = ["whisper", "vosk", "openaiapi"]
|
|
28
28
|
UNAVAILABLE_BACKENDS = []
|
|
29
29
|
|
|
30
30
|
|
|
@@ -55,57 +55,54 @@ class DummyTranscriber:
|
|
|
55
55
|
def __getattr__(self, item):
|
|
56
56
|
return None
|
|
57
57
|
|
|
58
|
-
|
|
58
|
+
whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
|
|
59
|
+
whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
|
|
60
|
+
whisperapi_models = ["whisper-1"]
|
|
61
|
+
vosk_models = [language_config["vosk"][lang]["model"] for lang in language_config["vosk"]]
|
|
59
62
|
|
|
60
|
-
whisper_models = ["tiny", "base", "small", "medium", "large", "turbo"]
|
|
61
|
-
whisper_english_models = ["tiny.en", "base.en", "small.en", "medium.en"]
|
|
62
63
|
|
|
63
|
-
|
|
64
|
+
def get_transcriber(model=None, backend=None, dummy=False, prompt=True, language=None,
|
|
65
|
+
samplerate=None, duration=None, silence=None, silence_db=None, restart_after_silence=None,
|
|
66
|
+
api_key=None,
|
|
67
|
+
download_folder_vosk=None, download_folder_whisper=None, **kwargs):
|
|
68
|
+
|
|
69
|
+
if dummy:
|
|
64
70
|
return DummyTranscriber("whisper", "dummy")
|
|
65
71
|
|
|
66
|
-
if
|
|
67
|
-
if
|
|
68
|
-
|
|
69
|
-
elif
|
|
70
|
-
|
|
72
|
+
if model and not backend:
|
|
73
|
+
if model.startswith("vosk-"):
|
|
74
|
+
backend = "vosk"
|
|
75
|
+
elif model in whisper_models + whisper_english_models:
|
|
76
|
+
backend = "whisper"
|
|
77
|
+
elif model in whisperapi_models:
|
|
78
|
+
backend = "openaiapi"
|
|
71
79
|
|
|
72
|
-
if
|
|
73
|
-
|
|
74
|
-
if not checked_backend:
|
|
75
|
-
print(f"Backend {o.backend} is not available.")
|
|
76
|
-
exit(1)
|
|
77
|
-
backend = o.backend
|
|
80
|
+
if backend:
|
|
81
|
+
backend = backend
|
|
78
82
|
|
|
79
83
|
elif not prompt:
|
|
80
84
|
backend = BACKENDS[0]
|
|
81
85
|
|
|
82
86
|
else:
|
|
83
|
-
|
|
84
|
-
while not checked_backend:
|
|
85
|
-
backend = prompt_choices(BACKENDS, o.backend, "backend", UNAVAILABLE_BACKENDS)
|
|
86
|
-
# raise an error if the user has explicitly selected a backend that is not available
|
|
87
|
-
checked_backend = check_dependencies(backend, raise_error=backend==o.backend)
|
|
88
|
-
if not checked_backend:
|
|
89
|
-
print(f"Backend {o.backend} is not available.")
|
|
90
|
-
UNAVAILABLE_BACKENDS.append(backend)
|
|
87
|
+
backend = prompt_choices(BACKENDS, backend, "backend", UNAVAILABLE_BACKENDS)
|
|
91
88
|
|
|
92
89
|
print(f"Selected backend: {backend}")
|
|
93
90
|
|
|
94
|
-
if
|
|
95
|
-
model = pick_specialist_model(
|
|
91
|
+
if model:
|
|
92
|
+
model = pick_specialist_model(model, language, backend)
|
|
96
93
|
|
|
97
94
|
else:
|
|
98
95
|
|
|
99
96
|
if backend == "vosk":
|
|
100
97
|
available_languages = list(language_config[backend])
|
|
101
|
-
if
|
|
102
|
-
if
|
|
103
|
-
print(f"Language '{
|
|
98
|
+
if language:
|
|
99
|
+
if language not in available_languages:
|
|
100
|
+
print(f"Language '{language}' is not pre-defined (yet) for backend '{backend}'.")
|
|
104
101
|
print(f"Yet it may actually exist.")
|
|
105
102
|
print(f"Please choose the model explictly from {ansi_link('https://alphacephei.com/vosk/models')}.")
|
|
106
103
|
print(f"Or pick one of the pre-defined languages: ", " ".join(available_languages))
|
|
107
104
|
exit(1)
|
|
108
|
-
choices = [language_config[backend][
|
|
105
|
+
choices = [language_config[backend][language]["model"]]
|
|
109
106
|
default_model = choices[0] # this is a string
|
|
110
107
|
|
|
111
108
|
else:
|
|
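The added lines in this hunk let `get_transcriber` infer the backend from the model name when only a model is given. The same decision can be sketched as a standalone function. `infer_backend` is a hypothetical name; the model lists are copied from the diff, and the vosk model name in the usage line is only an example.

```python
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large", "turbo"]
WHISPER_ENGLISH_MODELS = ["tiny.en", "base.en", "small.en", "medium.en"]
WHISPERAPI_MODELS = ["whisper-1"]


def infer_backend(model):
    """Map a model name to a backend, or return None if it is not recognised."""
    if model.startswith("vosk-"):
        return "vosk"
    if model in WHISPER_MODELS + WHISPER_ENGLISH_MODELS:
        return "whisper"
    if model in WHISPERAPI_MODELS:
        return "openaiapi"
    return None


backend = infer_backend("vosk-model-small-en-us-0.15")
```

In the diff, an unrecognised model leaves `backend` unset, so the code falls through to the first entry of `BACKENDS` when prompting is disabled, or to the interactive prompt otherwise.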
@@ -129,28 +126,41 @@ def get_transcriber(o, prompt=True):
|
|
|
129
126
|
else:
|
|
130
127
|
model = default_model
|
|
131
128
|
|
|
132
|
-
model = pick_specialist_model(model,
|
|
129
|
+
model = pick_specialist_model(model, language, backend)
|
|
130
|
+
|
|
131
|
+
elif backend == "openaiapi":
|
|
132
|
+
model = model or "whisper-1"
|
|
133
|
+
|
|
134
|
+
else:
|
|
135
|
+
raise ValueError(f"Unknown backend: {backend}")
|
|
136
|
+
|
|
133
137
|
|
|
134
138
|
print(f"Selected model: {model}")
|
|
135
139
|
|
|
136
140
|
if backend == "vosk":
|
|
137
141
|
try:
|
|
138
142
|
transcriber = VoskTranscriber(model_name=model,
|
|
139
|
-
language=
|
|
140
|
-
samplerate=
|
|
143
|
+
language=language,
|
|
144
|
+
samplerate=samplerate,
|
|
141
145
|
timeout=None, # vosk keeps going (no timeout)
|
|
142
146
|
silence_duration=None, # vosk handles silences internally
|
|
143
|
-
model_kwargs={"download_root":
|
|
147
|
+
model_kwargs={"download_root": download_folder_vosk})
|
|
144
148
|
except Exception as error:
|
|
145
149
|
print(error)
|
|
146
150
|
print(f"Failed to (down)load model {model}.")
|
|
147
151
|
exit(1)
|
|
148
152
|
|
|
149
153
|
elif backend == "whisper":
|
|
150
|
-
transcriber = WhisperTranscriber(model_name=model, language=
|
|
151
|
-
timeout=
|
|
152
|
-
restart_after_silence=
|
|
153
|
-
model_kwargs={"download_root":
|
|
154
|
+
transcriber = WhisperTranscriber(model_name=model, language=language, samplerate=samplerate,
|
|
155
|
+
timeout=duration, silence_duration=silence, silence_thresh=silence_db,
|
|
156
|
+
restart_after_silence=restart_after_silence,
|
|
157
|
+
model_kwargs={"download_root": download_folder_whisper})
|
|
158
|
+
|
|
159
|
+
elif backend == "openaiapi":
|
|
160
|
+
transcriber = OpenaiAPITranscriber(model_name=model, samplerate=samplerate,
|
|
161
|
+
timeout=duration, silence_duration=silence, silence_thresh=silence_db,
|
|
162
|
+
restart_after_silence=restart_after_silence, api_key=api_key)
|
|
163
|
+
|
|
154
164
|
|
|
155
165
|
else:
|
|
156
166
|
raise ValueError(f"Unknown backend: {backend}")
|
|
@@ -195,6 +205,10 @@ def get_parser():
|
|
|
195
205
|
group.add_argument("--silence-db", default=-30, type=float, help="silence magnitude in decibel (default %(default)s db)")
|
|
196
206
|
group.add_argument("-a", "--restart-after-silence", action="store_true", help="Restart the recording after a transcription triggered by a silence")
|
|
197
207
|
|
|
208
|
+
group = parser.add_argument_group("whisper api")
|
|
209
|
+
group.add_argument("--api-key",
|
|
210
|
+
help="API key for the Whisper API backend.")
|
|
211
|
+
|
|
198
212
|
parser.add_argument("--download-folder-vosk", help="Folder to store Vosk models.")
|
|
199
213
|
parser.add_argument("--download-folder-whisper", help="Folder to store Whisper models.")
|
|
200
214
|
|
|
@@ -206,11 +220,11 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
|
|
|
206
220
|
|
|
207
221
|
if keyboard:
|
|
208
222
|
from scribe.keyboard import type_text
|
|
209
|
-
|
|
223
|
+
transcriber.log("Change focus to target app during transcription.")
|
|
210
224
|
|
|
211
225
|
if clipboard:
|
|
212
226
|
import pyperclip
|
|
213
|
-
|
|
227
|
+
transcriber.log("The full transcription will be copied to clipboard as it becomes available.")
|
|
214
228
|
|
|
215
229
|
fulltext = ""
|
|
216
230
|
|
|
@@ -237,7 +251,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
|
|
|
237
251
|
callback()
|
|
238
252
|
|
|
239
253
|
|
|
240
|
-
def create_app(micro, transcriber, **kwargs):
|
|
254
|
+
def create_app(micro, transcriber, other_transcribers=None, **kwargs):
|
|
241
255
|
import pystray
|
|
242
256
|
from pystray import Menu as pystrayMenu, MenuItem as Item
|
|
243
257
|
from PIL import Image
|
|
@@ -257,6 +271,7 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
257
271
|
image_recording = Image.alpha_composite(image_recording.convert("RGBA"), image_writing.convert("RGBA"))
|
|
258
272
|
|
|
259
273
|
def update_icon(icon, force=False):
|
|
274
|
+
transcriber = icon._transcriber
|
|
260
275
|
if transcriber.recording and transcriber.waiting:
|
|
261
276
|
# this is the situation with the whisper backend when the microphone is recording
|
|
262
277
|
# but we wait for the speaker to speak (silence)
|
|
@@ -284,6 +299,7 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
284
299
|
icon.update_menu()
|
|
285
300
|
|
|
286
301
|
def start_monitoring(icon):
|
|
302
|
+
transcriber = icon._transcriber
|
|
287
303
|
try:
|
|
288
304
|
while transcriber.busy:
|
|
289
305
|
update_icon(icon)
|
|
@@ -299,8 +315,8 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
299
315
|
icon.stop()
|
|
300
316
|
|
|
301
317
|
def callback_stop_recording(icon, item):
|
|
318
|
+
transcriber = icon._transcriber
|
|
302
319
|
# Here we need to stop the recording thread
|
|
303
|
-
|
|
304
320
|
transcriber.interrupt = True
|
|
305
321
|
if hasattr(icon, "_recording_thread"):
|
|
306
322
|
icon._recording_thread.join()
|
|
@@ -308,9 +324,9 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
308
324
|
icon._monitoring_thread.join()
|
|
309
325
|
|
|
310
326
|
def callback_record(icon, item):
|
|
311
|
-
|
|
327
|
+
transcriber = icon._transcriber
|
|
312
328
|
if transcriber.busy:
|
|
313
|
-
|
|
329
|
+
transcriber.log("Still busy recording or transcribing.")
|
|
314
330
|
return
|
|
315
331
|
|
|
316
332
|
if hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive():
|
|
@@ -325,22 +341,67 @@ def create_app(micro, transcriber, **kwargs):
|
|
|
325
341
|
icon._monitoring_thread = threading.Thread(target=start_monitoring, args=(icon,))
|
|
326
342
|
icon._monitoring_thread.start()
|
|
327
343
|
|
|
344
|
+
if other_transcribers:
|
|
345
|
+
other_transcribers_dict = {meta["model"]: meta for meta in other_transcribers}
|
|
346
|
+
else:
|
|
347
|
+
other_transcribers_dict = {}
|
|
348
|
+
|
|
349
|
+
def callback_set_model(icon, item):
|
|
350
|
+
transcriber = icon._transcriber
|
|
351
|
+
callback_stop_recording(icon, item)
|
|
352
|
+
model_name = str(item)
|
|
353
|
+
meta = other_transcribers_dict[model_name]
|
|
354
|
+
icon._transcriber = transcriber = get_transcriber(**meta)
|
|
355
|
+
icon.title = f"scribe :: {transcriber.backend} :: {transcriber.model_name}"
|
|
356
|
+
print("Set", transcriber.backend, transcriber.model_name)
|
|
357
|
+
# icon.menu.items[0].__name__ = f"Record [{str(item)}]"
|
|
358
|
+
icon._model_selection = False
|
|
359
|
+
icon.update_menu()
|
|
360
|
+
icon.notify(f"Set {transcriber.backend} {transcriber.model_name}")
|
|
361
|
+
|
|
362
|
+
def callback_info(icon, item):
|
|
363
|
+
transcriber = icon._transcriber
|
|
364
|
+
# icon.notify(f"scribe {transcriber.backend} {transcriber.model_name}")
|
|
365
|
+
title = f"""{transcriber.backend} :: {transcriber.model_name}"""
|
|
366
|
+
info = [name for name in kwargs if isinstance(kwargs[name], bool) and kwargs[name]]
|
|
367
|
+
icon.notify(" | ".join(info), title=title)
|
|
368
|
+
|
|
369
|
+
def callback_toggle_option(icon, item):
|
|
370
|
+
kwargs[str(item)] = not kwargs[str(item)]
|
|
371
|
+
callback_info(icon, item)
|
|
372
|
+
|
|
373
|
+
def is_model_selection(item):
|
|
374
|
+
return icon._model_selection
|
|
375
|
+
|
|
328
376
|
def is_recording(item):
|
|
329
|
-
return
|
|
377
|
+
return icon._transcriber.busy
|
|
330
378
|
|
|
331
379
|
def is_not_recording(item):
|
|
332
|
-
return not is_recording(item)
|
|
380
|
+
return not is_recording(item) and not is_model_selection(item)
|
|
333
381
|
|
|
382
|
+
modeltitle = f"{transcriber.backend} :: {transcriber.model_name}"
|
|
383
|
+
title = f"scribe :: {modeltitle}"
|
|
334
384
|
|
|
335
|
-
|
|
336
|
-
|
|
337
|
-
|
|
338
|
-
|
|
339
|
-
Item(
|
|
385
|
+
menus = []
|
|
386
|
+
menus.append(Item(f"Record" if len(other_transcribers_dict) <= 1 else f"Record", callback_record, visible=is_not_recording))
|
|
387
|
+
menus.append(Item("Stop", callback_stop_recording, visible=is_recording))
|
|
388
|
+
menus.append(Item("Choose Model", pystrayMenu(
|
|
389
|
+
*(Item(f"{name}", callback_set_model) for name in other_transcribers_dict)))
|
|
340
390
|
)
|
|
391
|
+
menus.append(Item("Toggle Options", pystrayMenu(
|
|
392
|
+
*(Item(f"{name}", callback_toggle_option) for name in kwargs if isinstance(kwargs[name], bool))))
|
|
393
|
+
)
|
|
394
|
+
menus.append(Item(f"Info", callback_info))
|
|
395
|
+
menus.append(Item('Quit', callback_quit))
|
|
396
|
+
|
|
397
|
+
# Create a menu
|
|
398
|
+
menu = pystrayMenu(*menus)
|
|
341
399
|
|
|
342
400
|
# Create the system tray icon
|
|
343
|
-
icon = pystray.Icon('scribe', image,
|
|
401
|
+
icon = pystray.Icon('scribe', image, title, menu)
|
|
402
|
+
icon._model_selection = False
|
|
403
|
+
icon._transcriber = transcriber
|
|
404
|
+
del transcriber
|
|
344
405
|
|
|
345
406
|
return icon
|
|
346
407
|
|
|
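In this hunk the callbacks stop closing over the `transcriber` argument and read `icon._transcriber` on every call, so that "Choose Model" can swap the transcriber at runtime. The sketch below illustrates that pattern without `pystray`. `FakeTranscriber`, `make_callbacks` and the `SimpleNamespace` standing in for the icon are all hypothetical.

```python
from types import SimpleNamespace


class FakeTranscriber:
    """Stand-in exposing only the attributes the callbacks use."""

    def __init__(self, backend, model_name):
        self.backend = backend
        self.model_name = model_name
        self.busy = False


def make_callbacks(icon, factories):
    def set_model(name):
        # Replace the shared transcriber; later callbacks see the new one
        icon._transcriber = factories[name]()
        t = icon._transcriber
        icon.title = f"scribe :: {t.backend} :: {t.model_name}"

    def is_recording():
        # Looked up at call time, not captured when the closure was created
        return icon._transcriber.busy

    return set_model, is_recording


icon = SimpleNamespace(_transcriber=FakeTranscriber("vosk", "example-model"), title="")
factories = {"whisper-1": lambda: FakeTranscriber("openaiapi", "whisper-1")}
set_model, is_recording = make_callbacks(icon, factories)

set_model("whisper-1")
icon._transcriber.busy = True
```

Had `is_recording` captured the original transcriber object, it would keep reporting the state of the replaced model. That is presumably why the diff ends `create_app` with `del transcriber`.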
@@ -359,7 +420,7 @@ def main(args=None):
|
|
|
359
420
|
|
|
360
421
|
while True:
|
|
361
422
|
if transcriber is None:
|
|
362
|
-
transcriber = get_transcriber(o
|
|
423
|
+
transcriber = get_transcriber(**vars(o))
|
|
363
424
|
print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
|
|
364
425
|
show_output = ["clipboard", "keyboard", "output_file"]
|
|
365
426
|
show_options = ["ascii", "restart_after_silence"]
|
|
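`get_transcriber` now takes keyword arguments and is called with `**vars(o)`, so every parsed option is passed by name and unused ones land in `**kwargs`. The sketch below shows that calling convention with a stub. `get_transcriber_stub` and the reduced parser are hypothetical. It also shows why the README's `--api` spelling can reach the `--api-key` option: `argparse` accepts unambiguous prefixes by default, assuming no other option starts with `--api`.

```python
import argparse


def get_transcriber_stub(model=None, backend=None, api_key=None, **kwargs):
    """Accept the options it needs by name and ignore the rest."""
    return {
        "model": model,
        "backend": backend,
        "api_key": api_key,
        "ignored": sorted(kwargs),
    }


parser = argparse.ArgumentParser()
parser.add_argument("--model")
parser.add_argument("--backend")
parser.add_argument("--api-key")
parser.add_argument("--app", action="store_true")
parser.add_argument("--clipboard", action="store_true")

# "--api" is an unambiguous prefix of "--api-key" ("--app" does not match it)
o = parser.parse_args(["--backend", "openaiapi", "--api", "KEY"])
result = get_transcriber_stub(**vars(o))
```

One consequence of this design is that the function no longer depends on the `argparse.Namespace` object, which is what allows `callback_set_model` to call `get_transcriber(**meta)` with a plain dictionary.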
@@ -473,7 +534,12 @@ def main(args=None):
|
|
|
473
534
|
greetings = dict(
|
|
474
535
|
start_message = "Listening... Use the try icon menu to stop.",
|
|
475
536
|
)
|
|
476
|
-
|
|
537
|
+
|
|
538
|
+
app = create_app(micro, transcriber, other_transcribers=[
|
|
539
|
+
{**vars(o), "backend": "openaiapi", "model": "whisper-1"},
|
|
540
|
+
*[{**vars(o), "backend": "whisper", "model": model} for model in whisper_models],
|
|
541
|
+
*[{**vars(o), "backend": "vosk", "model": model} for model in vosk_models]],
|
|
542
|
+
clipboard=o.clipboard, output_file=o.output_file,
|
|
477
543
|
keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
|
|
478
544
|
print("Starting app...")
|
|
479
545
|
app.run()
|
|
@@ -22,7 +22,7 @@ class StopRecording(Exception):
|
|
|
22
22
|
class AbstractTranscriber:
|
|
23
23
|
backend = None
|
|
24
24
|
def __init__(self, model, model_name=None, language=None, samplerate=16000, timeout=None, model_kwargs={},
|
|
25
|
-
silence_thresh=-40, silence_duration=2, restart_after_silence=False):
|
|
25
|
+
silence_thresh=-40, silence_duration=2, restart_after_silence=False, logger=None):
|
|
26
26
|
self.model_name = model_name
|
|
27
27
|
self.language = language
|
|
28
28
|
self.model = model
|
|
@@ -36,6 +36,11 @@ class AbstractTranscriber:
|
|
|
36
36
|
self.busy = False
|
|
37
37
|
self.waiting = False
|
|
38
38
|
self.interrupt = False
|
|
39
|
+
if logger is None:
|
|
40
|
+
import logging
|
|
41
|
+
logging.basicConfig(level=logging.INFO)
|
|
42
|
+
logger = logging.getLogger("scribe")
|
|
43
|
+
self.logger = logger
|
|
39
44
|
self.reset()
|
|
40
45
|
|
|
41
46
|
def get_elapsed(self):
|
|
@@ -55,9 +60,18 @@ class AbstractTranscriber:
         self.audio_buffer = b''
         self.start_time = time.time()
 
+    def log(self, text):
+        if text.startswith("\n"):
+            print("")
+            text = text[1:]
+        if self.logger:
+            self.logger.info(text)
+        else:
+            print(f"[{text}]")
+
     def start_recording(self, microphone,
                         start_message="Recording... Press Ctrl+C to stop.",
-                        stop_message="
+                        stop_message="Exit."):
 
         self.reset()
         self.interrupt = False
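The hunks above route status messages through a `logger` argument instead of printing directly. Since `__init__` replaces a `None` logger with `logging.getLogger("scribe")`, the `print` fallback in `log` appears to be reachable only when a falsy, non-`None` value is passed. The sketch below copies just that logic into a hypothetical `MiniTranscriber` class (not part of the package) to show both paths; `ListHandler` and the `scribe.demo` logger name are likewise made up for the demonstration:

```python
import logging


class MiniTranscriber:
    """Hypothetical stand-in that copies only the logging logic from the diff."""

    def __init__(self, logger=None):
        if logger is None:
            logging.basicConfig(level=logging.INFO)
            logger = logging.getLogger("scribe")
        self.logger = logger

    def log(self, text):
        # A leading newline becomes a blank line on stdout, then is stripped.
        if text.startswith("\n"):
            print("")
            text = text[1:]
        if self.logger:
            self.logger.info(text)
        else:
            # Only reached when a falsy, non-None logger was passed in.
            print(f"[{text}]")


class ListHandler(logging.Handler):
    """Collects messages so the routing can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
custom = logging.getLogger("scribe.demo")
custom.setLevel(logging.INFO)
custom.addHandler(handler)

MiniTranscriber(logger=custom).log("\nTranscribing")  # goes to the handler
MiniTranscriber(logger=False).log("Exit.")            # falls back to print
```

Passing a caller-owned logger this way lets an embedding application decide where messages end up, which is presumably why the parameter was added.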
@@ -73,7 +87,7 @@ class AbstractTranscriber:
         try:
 
             with microphone.open_stream():
-
+                self.log(start_message)
 
                 while not self.interrupt:
                     while not microphone.q.empty():
@@ -107,7 +121,7 @@ class AbstractTranscriber:
 
                     else:
                         if not previous_waiting:
-
+                            self.log("Silence detected...waiting for more audio")
 
                     if self.is_overtime():
                         raise StopRecording("Overtime: {:.2f} seconds".format(self.get_elapsed()))
@@ -125,7 +139,7 @@ class AbstractTranscriber:
             self.busy = False
             yield result
 
-
+        self.log(stop_message)
 
 
 def get_vosk_model(model, download_root=None, url=None):
@@ -200,7 +214,7 @@ class WhisperTranscriber(AbstractTranscriber):
         super().__init__(model, model_name, language, model_kwargs=model_kwargs, **kwargs)
 
     def transcribe_audio(self, audio_bytes):
-
+        self.log("\nTranscribing")
         audio_array = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
         return self.model.transcribe(audio_array, fp16=False, language=self.language)
 
@@ -209,4 +223,39 @@ class WhisperTranscriber(AbstractTranscriber):
             return {"text": ""}
         result = self.transcribe_audio(self.audio_buffer)
         self.audio_buffer = b''
-        return result
+        return result
+
+
+class OpenaiAPITranscriber(WhisperTranscriber):
+    backend = "openaiapi"
+
+    def __init__(self, model_name="whisper-1", language=None, model_kwargs={}, model=None, api_key=None, **kwargs):
+        if model is None:
+            import openai
+            model = openai.OpenAI(
+                api_key=api_key or openai.api_key,
+                # 20 seconds (default is 10 minutes)
+                timeout=20.0,
+            )
+        AbstractTranscriber.__init__(self, model, model_name, language, model_kwargs=model_kwargs, **kwargs)
+
+    def transcribe_audio(self, audio_bytes):
+        self.log("\nTranscribing")
+        import io
+        import openai
+        import soundfile as sf
+        audio_data = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
+        # Write the audio data to an in-memory file in WAV format
+        buffer = io.BytesIO()
+        sf.write(buffer, audio_data, self.samplerate, format='WAV')
+        buffer.seek(0)
+        buffer.name = "audio.wav"  # Set a filename with a valid extension
+        try:
+            transcription = self.model.audio.transcriptions.create(
+                model=self.model_name,
+                file=buffer,
+            )
+        except openai.BadRequestError as e:
+            self.log(f"Error: {e}")
+            return {"text": ""}
+        return {"text": transcription.text}
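The new `transcribe_audio` wraps raw 16-bit PCM bytes in an in-memory WAV file and gives the buffer a `name`, so the upload carries a recognizable extension. The sketch below shows the same idea using the standard-library `wave` module instead of `soundfile`; that substitution is made here only to keep the example free of third-party dependencies and is not what the package does. `pcm_to_wav_buffer` is a hypothetical helper name, and mono input at 16 kHz is assumed, matching the `samplerate=16000` default of `AbstractTranscriber`:

```python
import io
import wave


def pcm_to_wav_buffer(audio_bytes, samplerate=16000):
    """Wrap raw mono 16-bit PCM bytes in an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono (assumed)
        wav.setsampwidth(2)      # 2 bytes per sample = int16
        wav.setframerate(samplerate)
        wav.writeframes(audio_bytes)
    buffer.seek(0)               # rewind so a reader starts at the header
    buffer.name = "audio.wav"    # filename hint with a valid extension
    return buffer


# One second of silence as stand-in microphone data.
silence = b"\x00\x00" * 16000
buf = pcm_to_wav_buffer(silence)

# Read the buffer back to confirm it is a well-formed WAV file.
with wave.open(buf, "rb") as wav:
    print(wav.getframerate(), wav.getnframes())
```

A buffer built this way could be passed as the `file` argument in place of the `soundfile`-generated one, under the same assumption that the service infers the format from the `name` attribute.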
@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.
+Version: 0.11.0
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -55,9 +55,14 @@ Requires-Dist: vosk; extra == "vosk"
 Provides-Extra: app
 Requires-Dist: pystray; extra == "app"
 Requires-Dist: PyGObject; extra == "app"
+Provides-Extra: openai
+Requires-Dist: openai; extra == "openai"
+Requires-Dist: soundfile; extra == "openai"
 Provides-Extra: all
 Requires-Dist: pynput; extra == "all"
 Requires-Dist: openai-whisper; extra == "all"
+Requires-Dist: openai; extra == "all"
+Requires-Dist: soundfile; extra == "all"
 Requires-Dist: vosk; extra == "all"
 Requires-Dist: pystray; extra == "all"
 
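The metadata above adds an `openai` extra that pulls in the `openai` and `soundfile` distributions. A small sketch of how one might check, before selecting the `openaiapi` backend, whether those modules are importable; `missing_modules` and `OPENAI_EXTRA` are hypothetical names introduced here, not part of the package:

```python
from importlib.util import find_spec

# Import names pulled in by the "openai" extra, per the metadata above.
OPENAI_EXTRA = ("openai", "soundfile")


def missing_modules(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(OPENAI_EXTRA)
if missing:
    print("openaiapi backend unavailable, missing:", ", ".join(missing))
else:
    print("openaiapi backend dependencies found")
```

`find_spec` only locates the module without importing it, so the check stays cheap even for heavy packages.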
@@ -66,7 +71,9 @@ Requires-Dist: pystray; extra == "all"
 
 # Scribe <img src="scribe_data/share/icon.png" width=48px>
 
-`scribe` is a
+`scribe` is a speech recognition tool that provides real-time transcription using cutting-edge AI models, with the goal of serving as a virtual keyboard on a computer.
+
+It features local, downloadable models with the `vosk` and `whisper` backends, as well as a client to open AI via `openaiapi` backend (API key required).
 
 ## Compatibility
 
@@ -101,12 +108,10 @@ cd scribe
 pip install -e .[all]
 ```
 
-You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` packages (see Usage below).
-
-The `vosk` language models will download on-the-fly.
-The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache` (note for the `whisper` backend
-the default is left to the `openai-whisper` package and might change in the future).
+You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` or `openai` packages (see Usage below).
 
+The language models for local backends `vosk` and `whisper` will download on-the-fly.
+The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache`.
 
 ## Usage
 
@@ -115,7 +120,7 @@ Just type in the terminal:
 ```bash
 scribe
 ```
-and the script will guide you through the choice of backend (`whisper` or `vosk`) and the specific language model.
+and the script will guide you through the choice of backend (`whisper` or `vosk` or `openaiapi`) and the specific language model.
 After this, you will be prompted to start recording your microphone and print the transcribed text in real-time (`vosk`)
 or until after recording is complete (`whisper`).
 You can interrupt the recording via Ctrl + C and start again or change model.
@@ -129,9 +134,9 @@ The `vosk` backend is much faster and very good at doing real-time transcription
 It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 
-
+The `openaiapi` backend uses `whisper-1` model at the time of writing. It requires an API key
 ```bash
-scribe --backend
+scribe --backend openaiapi --api YOURAPIKEY
 ```
 where `--no-prompt` jumps right to the recording (after the first interruption, you can still choose to change the backend and model).
 
@@ -190,7 +195,9 @@ To activate start with:
 ```bash
 scribe --app
 ```
-or toggle the app option in the interactive menu. The scribe icon will show, with Record
+or toggle the app option in the interactive menu. The scribe icon will show, with Record and other options. The icon will change based on what the app is doing. It is possible to choose from a set
+of predefined models, or to Quit and choose from the terminal before pressing Enter again.
+For the vosk model, there are only two states : recording + transcribing or Idle. For the whisper model there are three states visible from the icon: recording, transcribing and idle/waiting.
 That option requires `pystray` to be installed. This is included with the `pip install ...[all]` option. In Ubuntu the following dependencies were required to make the menus appear:
 
 ```bash
@@ -204,17 +211,19 @@ If you run Ubuntu (or else?) with GNOME, the script `scribe-install [...]` will
 to make it available from the quick launch menu. Any option will be passed on to `scribe`, with the additional options `--name` and `--no-terminal`.
 `--no-terminal` means no terminal will show up, and it also implies the options `--app --no-prompt`.
 
-
+In a relatively basic form
+
+```bash
+scribe-install --clipboard --api YOUROPENAIAPIKEY
+```
+(`--api` is optional and only useful if you plan to use `openaiapi` backend later on)
+
+And to make an app running outside the terminal:
 
 ```bash
-scribe-install
-scribe-install --name "Scribe Whisper" --backend whisper --model small --clipboard --restart-after-silence --no-prompt
-scribe-install --name "Scribe Vosk FR" --backend vosk --language fr --keyboard --clipboard --no-terminal
+scribe-install --backend openaiapi --name "Scribe App" --keyboard --clipboard --app --no-prompt --no-terminal --api YOUROPENAIAPIKEY
 ```
-This will install
-- `Super + scribe` : will launch the default version with terminal prompt
-- `Super + whisper` : will launch a present version with the `small` model from `whisper` and start recording right away. You can see what is going on in the terminal and the result is ready to paste from the clipboard
-- `Super + vosk fr` : will launch a preset version for real-time transcription in French with the `vosk` backend, and throughput to the clipboard and the keyboard, not even opening a terminal.
+This will install two separate apps (names "Scribe" and "Scribe App")
 
 
 ## Fine tuning
File without changes