scribe-cli 0.7.2__tar.gz → 0.7.4__tar.gz

Files changed (29)
  1. {scribe_cli-0.7.2/scribe_cli.egg-info → scribe_cli-0.7.4}/PKG-INFO +20 -4
  2. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/README.md +19 -3
  3. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/_version.py +2 -2
  4. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/app.py +51 -31
  5. scribe_cli-0.7.4/scribe/keyboard.py +51 -0
  6. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/models.py +6 -3
  7. scribe_cli-0.7.4/scribe/models.toml +23 -0
  8. {scribe_cli-0.7.2 → scribe_cli-0.7.4/scribe_cli.egg-info}/PKG-INFO +20 -4
  9. scribe_cli-0.7.2/scribe/keyboard.py +0 -18
  10. scribe_cli-0.7.2/scribe/models.toml +0 -31
  11. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/.github/workflows/pypi.yml +0 -0
  12. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/.gitignore +0 -0
  13. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/LICENSE +0 -0
  14. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/pyproject.toml +0 -0
  15. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/__init__.py +0 -0
  16. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/audio.py +0 -0
  17. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/install_desktop.py +0 -0
  18. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/saverecording.py +0 -0
  19. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/testpynput.py +0 -0
  20. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe/util.py +0 -0
  21. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_cli.egg-info/SOURCES.txt +0 -0
  22. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_cli.egg-info/dependency_links.txt +0 -0
  23. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_cli.egg-info/entry_points.txt +0 -0
  24. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_cli.egg-info/requires.txt +0 -0
  25. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_cli.egg-info/top_level.txt +0 -0
  26. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_data/__init__.py +0 -0
  27. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_data/share/icon.jpg +0 -0
  28. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/scribe_data/templates/scribe.desktop +0 -0
  29. {scribe_cli-0.7.2 → scribe_cli-0.7.4}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.2
2
2
  Name: scribe-cli
3
- Version: 0.7.2
3
+ Version: 0.7.4
4
4
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
5
5
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
6
6
  License: MIT License
@@ -87,8 +87,8 @@ pip install -e .[all]
87
87
  You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` packages (see Usage below).
88
88
 
89
89
  The `vosk` language models will download on-the-fly.
90
- The default data folder is `$HOME/.local/share/vosk/language-models`.
91
- This can be modified.
90
+ The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache` (note for the `whisper` backend
91
+ the default is left to the `openai-whisper` package and might change in the future).
92
92
 
93
93
 
94
94
  ## Usage
@@ -122,6 +122,9 @@ where `--no-prompt` jumps right to the recording (after the first interruption,
122
122
  ### Virtual keyboard (experimental)
123
123
 
124
124
  By default the content of the transcription is pasted to the clipboard, and it is up to the user to paste (e.g. Ctrl + V).
125
+ However with the `vosk` backend and its realtime transcription, it is very handy to have the keys sent directly to the keyboard.
126
+ That can be achieved with the `--keyboard` option.
127
+
125
128
  With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the application under focus:
126
129
 
127
130
  ```bash
@@ -130,7 +133,20 @@ scribe --keyboard
130
133
 
131
134
  It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
132
135
 
133
- `pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html). It has [limitations]((https://pynput.readthedocs.io/en/latest/limitations.html)). In my Ubuntu + Wayland system it works in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). Workarounds include using the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart.
136
+ `pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html). It has [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
137
+
138
+ #### Use the keyboard in Ubuntu
139
+
140
+ In my Ubuntu + Wayland system the keyboard simulation works out of the box in Chromium-based applications (including VS Code), but not in Firefox, Sublime Text, or anything else (not even in a terminal!). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
141
+
142
+ One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
143
+
144
+ Another workaround under Wayland is to use the low-level `uinput` backend, but that requires running `scribe` as root (sudo) and likely some other configuration, such as loading the `uinput` kernel module (`sudo modprobe uinput` for a one-time test, or adding `uinput` to `/etc/modules-load.d/modules.conf` to make it persistent). Moreover, since `uinput` only simulates keystrokes, your keyboard must be set to an appropriate layout: for example, to get the letter `é` you'd want a French or Italian layout, otherwise the English layout will drop it or replace it with something else. Another caveat I encountered is that special characters were inserted at the wrong place; adding a small delay with the additional parameter `--latency 0.01` was enough to fix that. Finally, if you run as sudo you may need to reset some environment variables so that the list of audio devices (`XDG_RUNTIME_DIR`) and the download folder remain the same. To sum up, that gives something like:
145
+ ```bash
146
+ sudo modprobe uinput
147
+ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput $(which scribe) --latency 0.01
148
+ ```
149
+ You're on the right path :)
134
150
 
135
151
  ### System tray icon (experimental)
136
152
 
@@ -29,8 +29,8 @@ pip install -e .[all]
29
29
  You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` packages (see Usage below).
30
30
 
31
31
  The `vosk` language models will download on-the-fly.
32
- The default data folder is `$HOME/.local/share/vosk/language-models`.
33
- This can be modified.
32
+ The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache` (note for the `whisper` backend
33
+ the default is left to the `openai-whisper` package and might change in the future).
34
34
 
35
35
 
36
36
  ## Usage
@@ -64,6 +64,9 @@ where `--no-prompt` jumps right to the recording (after the first interruption,
64
64
  ### Virtual keyboard (experimental)
65
65
 
66
66
  By default the content of the transcription is pasted to the clipboard, and it is up to the user to paste (e.g. Ctrl + V).
67
+ However with the `vosk` backend and its realtime transcription, it is very handy to have the keys sent directly to the keyboard.
68
+ That can be achieved with the `--keyboard` option.
69
+
67
70
  With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the application under focus:
68
71
 
69
72
  ```bash
@@ -72,7 +75,20 @@ scribe --keyboard
72
75
 
73
76
  It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
74
77
 
75
- `pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html). It has [limitations]((https://pynput.readthedocs.io/en/latest/limitations.html)). In my Ubuntu + Wayland system it works in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). Workarounds include using the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart.
78
+ `pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html). It has [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
79
+
80
+ #### Use the keyboard in Ubuntu
81
+
82
+ In my Ubuntu + Wayland system the keyboard simulation works out of the box in Chromium-based applications (including VS Code), but not in Firefox, Sublime Text, or anything else (not even in a terminal!). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
83
+
84
+ One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
85
+
86
+ Another workaround under Wayland is to use the low-level `uinput` backend, but that requires running `scribe` as root (sudo) and likely some other configuration, such as loading the `uinput` kernel module (`sudo modprobe uinput` for a one-time test, or adding `uinput` to `/etc/modules-load.d/modules.conf` to make it persistent). Moreover, since `uinput` only simulates keystrokes, your keyboard must be set to an appropriate layout: for example, to get the letter `é` you'd want a French or Italian layout, otherwise the English layout will drop it or replace it with something else. Another caveat I encountered is that special characters were inserted at the wrong place; adding a small delay with the additional parameter `--latency 0.01` was enough to fix that. Finally, if you run as sudo you may need to reset some environment variables so that the list of audio devices (`XDG_RUNTIME_DIR`) and the download folder remain the same. To sum up, that gives something like:
87
+ ```bash
88
+ sudo modprobe uinput
89
+ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput $(which scribe) --latency 0.01
90
+ ```
91
+ You're on the right path :)
76
92
 
77
93
  ### System tray icon (experimental)
78
94
 
@@ -12,5 +12,5 @@ __version__: str
12
12
  __version_tuple__: VERSION_TUPLE
13
13
  version_tuple: VERSION_TUPLE
14
14
 
15
- __version__ = version = '0.7.2'
16
- __version_tuple__ = version_tuple = (0, 7, 2)
15
+ __version__ = version = '0.7.4'
16
+ __version_tuple__ = version_tuple = (0, 7, 4)
@@ -3,7 +3,7 @@ import tomllib
3
3
  import argparse
4
4
  from scribe.audio import Microphone
5
5
  from scribe.util import print_partial, clear_line, prompt_choices, check_dependencies, ansi_link, colored
6
- from scribe.models import VoskTranscriber, WhisperTranscriber
6
+ from scribe.models import VoskTranscriber, WhisperTranscriber, StopRecording
7
7
 
8
8
  with open(Path(__file__).parent / "models.toml", "rb") as f:
9
9
  language_config_default = tomllib.load(f)
@@ -147,6 +147,7 @@ def get_parser():
147
147
  parser.add_argument("--app", action="store_true", help="Start in app mode (relies on pystray)")
148
148
 
149
149
  parser.add_argument("--samplerate", default=16000, type=int, help=argparse.SUPPRESS)
150
+ parser.add_argument("--microphone-device", help="The device index of the microphone to use.", type=int)
150
151
  parser.add_argument("--keyboard", action="store_true")
151
152
  parser.add_argument("--no-clipboard", dest="clipboard", action="store_false")
152
153
  parser.add_argument("--latency", default=0, type=float, help="keyboard latency")
@@ -163,36 +164,20 @@ def get_parser():
163
164
 
164
165
 
165
166
  # Start recording
166
- def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0):
167
+ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, **greetings):
167
168
 
168
169
  if keyboard:
169
- try:
170
- from scribe.keyboard import type_text
171
- except ImportError:
172
- keyboard = False
173
- print("Keyboard simulation is not available.")
174
- return
175
-
170
+ from scribe.keyboard import type_text
176
171
  print("\nChange focus to target app during transcription.")
177
172
 
178
173
 
179
174
  if clipboard:
180
- try:
181
- import pyperclip
182
- except ImportError:
183
- clipboard = False
184
- print("Clipboard simulation is not available.")
185
- return
186
-
175
+ import pyperclip
187
176
  print("\nThe full transcription will be copied to clipboard as it becomes available.")
188
177
 
189
178
 
190
179
  fulltext = ""
191
180
 
192
- greetings = { k: v for k, v in language_config["_meta"].get(transcriber.language, {}).items()
193
- if v is not None and k.startswith(("start", "stop"))
194
- }
195
-
196
181
  for result in transcriber.start_recording(micro, **greetings):
197
182
 
198
183
  if result.get('text'):
@@ -212,6 +197,23 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
212
197
  print("Copied to clipboard.")
213
198
 
214
199
 
200
+ def interrupt_app_thread(icon):
201
+ """Thanks Le Chat for this solution: https://stackoverflow.com/a/325528/2192272
202
+ """
203
+ import ctypes
204
+ thread = icon._recording_thread
205
+ # Raise an exception in the thread using ctypes
206
+ thread_id = thread.ident
207
+ if thread_id is not None:
208
+ res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
209
+ ctypes.c_long(thread_id),
210
+ ctypes.py_object(StopRecording)
211
+ )
212
+ if res > 1:
213
+ ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, 0)
214
+ print("Failure to raise exception in thread")
215
+
216
+
215
217
  def create_app(micro, transcriber, **kwargs):
216
218
  import pystray
217
219
  from pystray import Menu as pystrayMenu, MenuItem as Item
@@ -219,26 +221,38 @@ def create_app(micro, transcriber, **kwargs):
219
221
  import PIL.ImageOps
220
222
 
221
223
  import scribe_data
224
+ import threading
222
225
 
223
226
  # Load an image from a file
224
227
  image = Image.open(Path(scribe_data.__file__).parent / "share" / "icon.jpg")
225
228
 
226
229
  def callback_quit(icon, item):
227
230
  icon.visible = False
231
+ ## Here we need to stop the recording thread
232
+ callback_stop_recording(icon, item)
228
233
  icon.stop()
229
234
 
235
+ def callback_stop_recording(icon, item):
236
+ ## Here we need to stop the recording thread
237
+ interrupt_app_thread(icon)
238
+ icon._recording_thread.join()
239
+
230
240
  def callback_record(icon, item):
231
- print(f"Clicked {item}")
232
- # icon.icon = PIL.ImageOps.invert(icon.icon)
233
- # icon.icon = PIL.ImageOps.invert(image)
234
- # icon.update_menu()
235
- start_recording(micro, transcriber, **kwargs)
236
- # icon.icon = image
237
- # icon.update_menu()
241
+ icon._recording_thread = threading.Thread(target=start_recording, args=(micro, transcriber), kwargs=kwargs)
242
+ icon._recording_thread.start()
243
+
244
+ def is_recording(item):
245
+ return hasattr(icon, "_recording_thread") and icon._recording_thread.is_alive()
246
+
247
+ def is_not_recording(item):
248
+ return not is_recording(item)
249
+
238
250
 
239
251
  # Create a menu
240
252
  menu = pystrayMenu(
241
- Item('Record', callback_record),
253
+ # Item('Record', callback_record),
254
+ Item("Record", callback_record, visible=is_not_recording),
255
+ Item("Stop", callback_stop_recording, visible=is_recording),
242
256
  Item('Quit', callback_quit),
243
257
  )
244
258
 
@@ -255,7 +269,7 @@ def main(args=None):
255
269
 
256
270
 
257
271
  # Set up the microphone for recording
258
- micro = Microphone(samplerate=o.samplerate)
272
+ micro = Microphone(samplerate=o.samplerate, device=o.microphone_device)
259
273
 
260
274
  transcriber = None
261
275
 
@@ -312,11 +326,17 @@ def main(args=None):
312
326
  continue
313
327
 
314
328
  if o.app:
315
- app = create_app(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency)
329
+ greetings = dict(
330
+ start_message = "Listening... Use the tray icon menu to stop.",
331
+ )
332
+ app = create_app(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency, **greetings)
316
333
  print("Starting app...")
317
334
  app.run()
318
335
  else:
319
- start_recording(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency)
336
+ greetings = dict(
337
+ start_message = "Listening... Press Ctrl+C to stop.",
338
+ )
339
+ start_recording(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency, **greetings)
320
340
 
321
341
  # if we arrived so far, that means we pressed Ctrl + C anyway, and need Enter to move on.
322
342
  # So we leave the wider range of options to change the model.
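The app.py changes above move recording into a background thread and stop it by raising `StopRecording` inside that thread through `ctypes`. The following is a minimal, self-contained sketch of that general technique, not the package's actual code: the `worker` and `interrupt_thread` names are hypothetical, and the sketch assumes CPython, where `PyThreadState_SetAsyncExc` is available and the exception is delivered the next time the target thread executes Python bytecode.

```python
import ctypes
import threading
import time


class StopRecording(Exception):
    """Raised inside the worker thread to request a clean stop."""


def worker(ready, stopped):
    # Hypothetical stand-in for a recording loop.
    try:
        ready.set()
        while True:
            # Short sleeps keep the thread returning to the interpreter,
            # so a pending asynchronous exception is delivered quickly.
            time.sleep(0.01)
    except StopRecording:
        stopped.set()


def interrupt_thread(thread, exc_type=StopRecording):
    """Ask CPython to raise exc_type in the given thread; return True on success."""
    if thread.ident is None:  # the thread was never started
        return False
    thread_id = ctypes.c_ulong(thread.ident)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, ctypes.py_object(exc_type))
    if res > 1:
        # More than one thread state was modified: cancel the pending exception.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, None)
        return False
    return res == 1


ready, stopped = threading.Event(), threading.Event()
thread = threading.Thread(target=worker, args=(ready, stopped), daemon=True)
thread.start()
ready.wait()
delivered = interrupt_thread(thread)
thread.join(timeout=5)
```

A thread blocked in a long C call (for example a blocking audio read) will only see the exception once that call returns, so this approach works best when the loop yields to the interpreter regularly.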
@@ -0,0 +1,51 @@
1
+ """This module handles typing characters as if they were typed on a keyboard.
2
+ """
3
+ import platform
4
+ import time
5
+
6
+ try:
7
+ # import pyautogui
8
+ from pynput.keyboard import Controller, Key
9
+
10
+ except ImportError:
11
+ print("Please install pynput to use the keyboard feature.")
12
+ raise
13
+
14
+ # Create a keyboard controller
15
+ keyboard = Controller()
16
+
17
+ def paste_text():
18
+ """This does not work with the uinput backend
19
+ """
20
+ os_name = platform.system()
21
+
22
+ if os_name == "Darwin": # macOS
23
+ with keyboard.pressed(Key.cmd):
24
+ keyboard.press('v')
25
+ keyboard.release('v')
26
+
27
+ else: # Windows and Linux
28
+ keyboard.press(Key.ctrl)
29
+ keyboard.press('v')
30
+ keyboard.release('v')
31
+ keyboard.release(Key.ctrl)
32
+
33
+ def type_text(text, interval=0, paste=False):
34
+ # Simulate typing a string
35
+ # import subprocess
36
+ # subprocess.run(["ydotool", "type", text])
37
+
38
+ if paste:
39
+ import pyperclip
40
+ keep_state = pyperclip.paste()
41
+ pyperclip.copy(text)
42
+ paste_text()
43
+ pyperclip.copy(keep_state)
44
+ return
45
+
46
+ if interval > 0:
47
+ for c in text:
48
+ keyboard.type(c)
49
+ time.sleep(interval)
50
+ else:
51
+ keyboard.type(text)
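The new `type_text` above types the whole string at once unless `interval` is positive, in which case it sends one character at a time with a pause in between (the behaviour behind `--latency`). The sketch below illustrates that branching without requiring `pynput`: `RecordingController` is a hypothetical stand-in that only records calls, and passing the controller as an argument is a choice made here for testability, not how the module is written.

```python
import time


class RecordingController:
    """Hypothetical stand-in for pynput.keyboard.Controller; it only records what is typed."""

    def __init__(self):
        self.typed = []

    def type(self, text):
        self.typed.append(text)


def type_text(controller, text, interval=0.0):
    """Send text in one call, or character by character when interval > 0."""
    if interval > 0:
        for char in text:
            controller.type(char)
            time.sleep(interval)  # small pause between simulated keystrokes
    else:
        controller.type(text)


fast = RecordingController()
type_text(fast, "été")

slow = RecordingController()
type_text(slow, "été", interval=0.001)
```

With no interval the controller receives a single call; with an interval it receives one call per character, which is what gives the target application time to process each keystroke.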
@@ -12,9 +12,12 @@ def is_silent(data, silence_thresh=-40):
12
12
  """
13
13
  return calculate_decibels(data) < silence_thresh
14
14
 
15
- VOSK_MODELS_FOLDER = os.path.join(os.environ.get("HOME"),
16
- ".local/share/vosk/language-models")
15
+ HOME = os.environ.get('HOME', os.path.expanduser('~'))
16
+ XDG_CACHE_HOME = os.environ.get('XDG_CACHE_HOME', os.path.join(HOME, '.cache'))
17
+ VOSK_MODELS_FOLDER = os.path.join(XDG_CACHE_HOME, "vosk")
17
18
 
19
+ class StopRecording(Exception):
20
+ pass
18
21
 
19
22
  class AbstractTranscriber:
20
23
  backend = None
@@ -85,7 +88,7 @@ class AbstractTranscriber:
85
88
  if self.is_overtime():
86
89
  raise KeyboardInterrupt("Overtime: {:.2f} seconds".format(self.get_elapsed()))
87
90
 
88
- except KeyboardInterrupt:
91
+ except (KeyboardInterrupt, StopRecording):
89
92
  pass
90
93
 
91
94
  finally:
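The models.py change above relocates the vosk models to `$XDG_CACHE_HOME/vosk`, falling back to `$HOME/.cache` when `XDG_CACHE_HOME` is not set. A minimal sketch of that lookup follows; `resolve_cache_dir` and its `environ` parameter are hypothetical names introduced for illustration. One deliberate difference from the diff: this sketch also treats an empty variable as unset, which is an assumption on my part rather than the package's behaviour.

```python
import os


def resolve_cache_dir(backend, environ=None):
    """Return the per-backend cache folder following the XDG convention."""
    env = os.environ if environ is None else environ
    # Fall back to the expanded home directory when HOME is missing or empty.
    home = env.get("HOME") or os.path.expanduser("~")
    # Assumption: an empty XDG_CACHE_HOME is treated the same as an unset one.
    cache_home = env.get("XDG_CACHE_HOME") or os.path.join(home, ".cache")
    return os.path.join(cache_home, backend)


default_dir = resolve_cache_dir("vosk", {"HOME": "/home/user"})
custom_dir = resolve_cache_dir("vosk", {"HOME": "/home/user", "XDG_CACHE_HOME": "/tmp/cache"})
```

Passing the environment as a dictionary keeps the function easy to check in isolation; calling it with no second argument reads the real process environment.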
@@ -0,0 +1,23 @@
1
+ [vosk.en]
2
+ model = "vosk-model-en-us-0.42-gigaspeech"
3
+
4
+ [vosk.fr]
5
+ model = "vosk-model-fr-0.22"
6
+
7
+ [vosk.de]
8
+ model = "vosk-model-de-tuda-0.6-900k"
9
+
10
+ [vosk.it]
11
+ model = "vosk-model-it-0.22"
12
+
13
+ [_meta.en]
14
+ language = "English (US)"
15
+
16
+ [_meta.fr]
17
+ language = "French"
18
+
19
+ [_meta.de]
20
+ language = "German"
21
+
22
+ [_meta.it]
23
+ language = "Italian"
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.2
2
2
  Name: scribe-cli
3
- Version: 0.7.2
3
+ Version: 0.7.4
4
4
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
5
5
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
6
6
  License: MIT License
@@ -87,8 +87,8 @@ pip install -e .[all]
87
87
  You can leave the optional dependencies (leave out `[all]`) but must install at least one of `vosk` or `openai-whisper` packages (see Usage below).
88
88
 
89
89
  The `vosk` language models will download on-the-fly.
90
- The default data folder is `$HOME/.local/share/vosk/language-models`.
91
- This can be modified.
90
+ The default download folder is `$XDG_CACHE_HOME/{backend}` where `$XDG_CACHE_HOME` defaults to `$HOME/.cache` (note for the `whisper` backend
91
+ the default is left to the `openai-whisper` package and might change in the future).
92
92
 
93
93
 
94
94
  ## Usage
@@ -122,6 +122,9 @@ where `--no-prompt` jumps right to the recording (after the first interruption,
122
122
  ### Virtual keyboard (experimental)
123
123
 
124
124
  By default the content of the transcription is pasted to the clipboard, and it is up to the user to paste (e.g. Ctrl + V).
125
+ However with the `vosk` backend and its realtime transcription, it is very handy to have the keys sent directly to the keyboard.
126
+ That can be achieved with the `--keyboard` option.
127
+
125
128
  With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the application under focus:
126
129
 
127
130
  ```bash
@@ -130,7 +133,20 @@ scribe --keyboard
130
133
 
131
134
  It relies on the optional `pynput` dependency (installed together with `scribe` if you used the `[all]` or `[keyboard]` option).
132
135
 
133
- `pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html). It has [limitations]((https://pynput.readthedocs.io/en/latest/limitations.html)). In my Ubuntu + Wayland system it works in chromium based applications (including vscode) but it does not in firefox and sublime text and any of the rest (not even in a terminal !). Workarounds include using the Xorg version of GNOME: in `etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart.
136
+ `pynput` may require [some configuration](https://pynput.readthedocs.io/en/latest/limitations.html). It has [limitations](https://pynput.readthedocs.io/en/latest/limitations.html).
137
+
138
+ #### Use the keyboard in Ubuntu
139
+
140
+ In my Ubuntu + Wayland system the keyboard simulation works out of the box in Chromium-based applications (including VS Code), but not in Firefox, Sublime Text, or anything else (not even in a terminal!). I am told this is because Chromium runs an X server emulator and so is compatible with the default pynput backend.
141
+
142
+ One workaround is to use the Xorg version of GNOME: in `/etc/gdm3/custom.conf` uncomment `# WaylandEnable=false` and restart your computer.
143
+
144
+ Another workaround under Wayland is to use the low-level `uinput` backend, but that requires running `scribe` as root (sudo) and likely some other configuration, such as loading the `uinput` kernel module (`sudo modprobe uinput` for a one-time test, or adding `uinput` to `/etc/modules-load.d/modules.conf` to make it persistent). Moreover, since `uinput` only simulates keystrokes, your keyboard must be set to an appropriate layout: for example, to get the letter `é` you'd want a French or Italian layout, otherwise the English layout will drop it or replace it with something else. Another caveat I encountered is that special characters were inserted at the wrong place; adding a small delay with the additional parameter `--latency 0.01` was enough to fix that. Finally, if you run as sudo you may need to reset some environment variables so that the list of audio devices (`XDG_RUNTIME_DIR`) and the download folder remain the same. To sum up, that gives something like:
145
+ ```bash
146
+ sudo modprobe uinput
147
+ sudo HOME=$HOME XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR PYNPUT_BACKEND_KEYBOARD=uinput $(which scribe) --latency 0.01
148
+ ```
149
+ You're on the right path :)
134
150
 
135
151
  ### System tray icon (experimental)
136
152
 
@@ -1,18 +0,0 @@
1
- """This module handles typing characters as if they were typed on a keyboard.
2
- """
3
- try:
4
- # import pyautogui
5
- from pynput.keyboard import Controller
6
-
7
- except ImportError:
8
- print("Please install pynput to use the keyboard feature.")
9
- raise
10
-
11
- # Create a keyboard controller
12
- keyboard = Controller()
13
-
14
- def type_text(text, interval=0):
15
- # Simulate typing a string
16
- # import subprocess
17
- # subprocess.run(["ydotool", "type", text])
18
- keyboard.type(text)
@@ -1,31 +0,0 @@
1
- [vosk.en]
2
- model = "vosk-model-en-us-0.42-gigaspeech"
3
-
4
- [vosk.fr]
5
- model = "vosk-model-fr-0.22"
6
-
7
- [vosk.de]
8
- model = "vosk-model-de-tuda-0.6-900k"
9
-
10
- [vosk.it]
11
- model = "vosk-model-it-0.22"
12
-
13
- [_meta.en]
14
- language = "English (US)"
15
- start_message = "Listening... Press Ctrl+C to stop."
16
- stop_message = "Recording stopped."
17
-
18
- [_meta.fr]
19
- language = "French"
20
- start_message = "En écoute... Appuyez sur Ctrl+C pour arrêter."
21
- stop_message = "Écoute arrêtée."
22
-
23
- [_meta.de]
24
- language = "German"
25
- start_message = "Hören... Drücken Sie Strg+C, um zu stoppen."
26
- stop_message = "Aufnahme gestoppt."
27
-
28
- [_meta.it]
29
- language = "Italian"
30
- start_message = "In ascolto... Premere Ctrl+C per interrompere."
31
- stop_message = "Registrazione interrotta."
File without changes