scribe-cli 0.4.0__tar.gz → 0.5.0__tar.gz

Files changed (27)
  1. {scribe_cli-0.4.0/scribe_cli.egg-info → scribe_cli-0.5.0}/PKG-INFO +5 -3
  2. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/README.md +3 -2
  3. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/pyproject.toml +1 -0
  4. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/_version.py +2 -2
  5. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/models.py +2 -2
  6. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/streamer.py +26 -5
  7. {scribe_cli-0.4.0 → scribe_cli-0.5.0/scribe_cli.egg-info}/PKG-INFO +5 -3
  8. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_cli.egg-info/requires.txt +1 -0
  9. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/.github/workflows/pypi.yml +0 -0
  10. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/.gitignore +0 -0
  11. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/LICENSE +0 -0
  12. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/__init__.py +0 -0
  13. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/audio.py +0 -0
  14. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/install_desktop.py +0 -0
  15. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/keyboard.py +0 -0
  16. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/models.toml +0 -0
  17. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/saverecording.py +0 -0
  18. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/testpynput.py +0 -0
  19. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe/util.py +0 -0
  20. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_cli.egg-info/SOURCES.txt +0 -0
  21. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_cli.egg-info/dependency_links.txt +0 -0
  22. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_cli.egg-info/entry_points.txt +0 -0
  23. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_cli.egg-info/top_level.txt +0 -0
  24. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_data/__init__.py +0 -0
  25. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_data/share/icon.jpg +0 -0
  26. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/scribe_data/templates/scribe.desktop +0 -0
  27. {scribe_cli-0.4.0 → scribe_cli-0.5.0}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.4.0
+ Version: 0.5.0
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -43,6 +43,7 @@ Requires-Dist: numpy
  Requires-Dist: sounddevice
  Requires-Dist: tqdm
  Requires-Dist: requests
+ Requires-Dist: pyperclip
  Provides-Extra: keyboard
  Requires-Dist: pynput; extra == "keyboard"
  Provides-Extra: whisper
@@ -56,7 +57,7 @@ Requires-Dist: vosk; extra == "all"
 
  # Scribe
 
- `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
+ `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard.
 
  ## Installation
 
@@ -99,7 +100,7 @@ scribe
  and the script will guide you through the choice of backend (`whisper` or `vosk`) and the specific language model.
  After this, you will be prompted to start recording your microphone and print the transcribed text in real-time (`vosk`)
  or until after recording is complete (`whisper`).
- You can interrupt the recording via Ctrl + C and start again or change model.
+ You can interrupt the recording via Ctrl + C and start again or change model. The full content of the transcription will be pasted to the clipboard by default, until interruption.
 
  The default (`whisper`) is excellent at transcribing a full-length audio sequences in [many languages](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). It is really impressive,
  but it cannot do real-time, and depending on the model can have relatively long execution time, especially with the `turbo` model (at least on my laptop with CPU only). The `small` model is also excellent and runs much faster. It is selected as default in `scribe` for that reason.
@@ -118,6 +119,7 @@ where `--no-prompt` jumps right to the recording (after the first interruption,
 
  ### Advanced usage as keyboard replacement
 
+ By default the content of the transcription is paster to the clipboard, but is not propagated further.
  With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the applcation under focus:
 
  ```bash
@@ -1,6 +1,6 @@
  # Scribe
 
- `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
+ `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard.
 
  ## Installation
 
@@ -43,7 +43,7 @@ scribe
  and the script will guide you through the choice of backend (`whisper` or `vosk`) and the specific language model.
  After this, you will be prompted to start recording your microphone and print the transcribed text in real-time (`vosk`)
  or until after recording is complete (`whisper`).
- You can interrupt the recording via Ctrl + C and start again or change model.
+ You can interrupt the recording via Ctrl + C and start again or change model. The full content of the transcription will be pasted to the clipboard by default, until interruption.
 
  The default (`whisper`) is excellent at transcribing a full-length audio sequences in [many languages](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). It is really impressive,
  but it cannot do real-time, and depending on the model can have relatively long execution time, especially with the `turbo` model (at least on my laptop with CPU only). The `small` model is also excellent and runs much faster. It is selected as default in `scribe` for that reason.
@@ -62,6 +62,7 @@ where `--no-prompt` jumps right to the recording (after the first interruption,
 
  ### Advanced usage as keyboard replacement
 
+ By default the content of the transcription is paster to the clipboard, but is not propagated further.
  With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the applcation under focus:
 
  ```bash
@@ -17,6 +17,7 @@ dependencies = [
  "sounddevice",
  "tqdm",
  "requests",
+ "pyperclip",
  ]
  optional-dependencies = { keyboard = ["pynput"], whisper = ["openai-whisper"], vosk = ["vosk"], all = ["pynput", "openai-whisper", "vosk"] }
 
@@ -12,5 +12,5 @@ __version__: str
  __version_tuple__: VERSION_TUPLE
  version_tuple: VERSION_TUPLE
 
- __version__ = version = '0.4.0'
- __version_tuple__ = version_tuple = (0, 4, 0)
+ __version__ = version = '0.5.0'
+ __version_tuple__ = version_tuple = (0, 5, 0)
@@ -135,8 +135,8 @@ class WhisperTranscriber(AbstractTranscriber):
  super().__init__(model, model_name, language, model_kwargs=model_kwargs, **kwargs)
 
  def transcribe_audio(self, audio_bytes):
- print("\nTranscribing...")
- print("If --keyboard is set, change focus to target app NOW !")
+ print("\nIf --keyboard is set, change focus to target app NOW !")
+ print("Transcribing...")
  audio_array = np.frombuffer(audio_bytes, dtype=np.int16).flatten().astype(np.float32) / 32768.0
  return self.model.transcribe(audio_array, fp16=False, language=self.language)
 
@@ -12,14 +12,25 @@ language_config = language_config_default.copy()
 
 
  # Commencer l'enregistrement
- def start_recording(micro, transcriber, keyboard=False, latency=0):
+ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0):
 
  if keyboard:
  try:
  from scribe.keyboard import type_text
  except ImportError:
  keyboard = False
- exit(1)
+ print("Keyboard simulation is not available.")
+ return
+
+ if clipboard:
+ try:
+ import pyperclip
+ except ImportError:
+ clipboard = False
+ print("Clipboard simulation is not available.")
+ return
+
+ fulltext = ""
 
  greetings = { k: v for k, v in language_config["_meta"].get(transcriber.language, {}).items()
  if v is not None and k.startswith(("start", "stop"))
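The hunk above attempts the `pyperclip` import at call time and, if it fails, prints a message and returns. A minimal sketch of a related optional-import pattern follows. The helper name `load_optional` is hypothetical and not part of scribe, and this variant deliberately differs from the hunk: it only disables the feature on failure instead of returning.

```python
import importlib
from typing import Any, Callable, Optional


def load_optional(module_name: str, attr: str) -> Optional[Callable[..., Any]]:
    """Return `module_name.attr`, or None when the module is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# Hypothetical usage: enable the clipboard only if pyperclip can be imported.
copy = load_optional("pyperclip", "copy")
clipboard = copy is not None
if not clipboard:
    print("Clipboard support is not available.")
```

Whether falling back silently or returning early (as the diff does) is preferable depends on how strongly the caller relies on the clipboard; the sketch only illustrates the first option.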
@@ -32,6 +43,11 @@ def start_recording(micro, transcriber, keyboard=False, latency=0):
  print(result.get('text'))
  if keyboard:
  type_text(result['text'] + " ", interval=latency) # Simulate typing
+
+ if clipboard:
+ fulltext += result['text'] + " "
+ pyperclip.copy(fulltext)
+
  else:
  print_partial(result.get('partial', ''))
 
@@ -170,6 +186,7 @@ def get_parser():
  parser.add_argument("--samplerate", default=16000, type=int, help=argparse.SUPPRESS)
  parser.add_argument("--duration", default=60, type=int, help="duration in seconds before whisper models start transcribing (default %(default)ss)")
  parser.add_argument("--keyboard", action="store_true")
+ parser.add_argument("--no-clipboard", dest="clipboard", action="store_false")
  parser.add_argument("--latency", default=0, type=float, help="keyboard latency")
 
  parser.add_argument("--data-folder", help="Folder to store Vosk models.")
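Because `--no-clipboard` is declared with `action="store_false"` and `dest="clipboard"`, the parsed `clipboard` attribute defaults to `True` and only becomes `False` when the flag is passed. A standalone sketch that reproduces just the two boolean flags from this hunk (the rest of scribe's parser is omitted, and the variable names `default_opts` and `disabled_opts` are illustrative):

```python
import argparse

# Only the two boolean flags from the hunk; all other scribe options are left out.
parser = argparse.ArgumentParser()
parser.add_argument("--keyboard", action="store_true")
parser.add_argument("--no-clipboard", dest="clipboard", action="store_false")

default_opts = parser.parse_args([])
disabled_opts = parser.parse_args(["--no-clipboard"])

print(default_opts.keyboard, default_opts.clipboard)    # False True
print(disabled_opts.keyboard, disabled_opts.clipboard)  # False False
```

This is why the call site further down can pass `clipboard=o.clipboard` directly without inverting the flag.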
@@ -191,12 +208,13 @@ def main(args=None):
  while True:
  if transcriber is None:
  transcriber = get_transcriber(o, prompt=o.prompt)
- print(f"[ Model {transcriber.model_name} from {transcriber.backend} selected. ]")
+ print(f"[ Model {transcriber.model_name} from {transcriber.backend} selected. Keyboard [{'on' if o.keyboard else 'off'}]. Clipboard [{'on' if o.clipboard else 'off'}]]")
  if o.prompt:
  print(f"Choose any of the following actions:")
  print(f"[q] quit")
  print(f"[e] change model")
- print(f"[k] toggle keyboard {'off' if o.keyboard else 'on'}")
+ print(f"[k] toggle keyboard [{'off' if o.keyboard else 'on'}]")
+ print(f"[c] toggle clipboard [{'off' if o.clipboard else 'on'}]")
  if transcriber.backend == "whisper":
  print(f"[t] change duration (currently {transcriber.max_duration}s)")
  print(colored(f"Press [Enter] or any other key to start recording.", "BOLD"))
@@ -210,6 +228,9 @@ def main(args=None):
  if key == "k":
  o.keyboard = not o.keyboard
  continue
+ if key == "c":
+ o.clipboard = not o.clipboard
+ continue
  if key == "t":
  duration = input(f"Enter new duration in seconds (current: {transcriber.max_duration}): ")
  try:
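The new `[c]` key mirrors the existing `[k]` key: it flips a boolean on the options object and restarts the menu loop via `continue`. A minimal sketch of that toggle logic in isolation; the function `apply_toggle` and the use of `SimpleNamespace` in place of the parsed argparse namespace are assumptions for illustration, not code from scribe.

```python
from types import SimpleNamespace


def apply_toggle(options: SimpleNamespace, key: str) -> bool:
    """Flip the option bound to `key`; return True if the menu should be shown again."""
    if key == "k":
        options.keyboard = not options.keyboard
        return True
    if key == "c":
        options.clipboard = not options.clipboard
        return True
    return False  # any other key falls through to the remaining handlers


# Start from the defaults implied by the parser: keyboard off, clipboard on.
o = SimpleNamespace(keyboard=False, clipboard=True)
handled = apply_toggle(o, "c")
print(f"Keyboard [{'on' if o.keyboard else 'off'}]. Clipboard [{'on' if o.clipboard else 'off'}]")
```

Returning `True` here plays the role of the `continue` statements in the diff, so the status line is printed again with the updated state.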
@@ -218,7 +239,7 @@ def main(args=None):
  print("Invalid duration. Must be an integer.")
  continue
 
- start_recording(micro, transcriber, keyboard=o.keyboard, latency=o.latency)
+ start_recording(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency)
 
  # if we arrived so far, that means we pressed Ctrl + C anyway, and need Enter to move on.
  # So we leave the wider range of options to change the model.
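Taken together, the `streamer.py` hunks keep a running `fulltext` string inside `start_recording` and copy it to the clipboard again after every finalized result. A minimal sketch of that accumulation follows. The function name `accumulate_to_clipboard`, the injected `copy` callable (standing in for `pyperclip.copy`) and the `"text" in result` check are assumptions for illustration, since the diff does not show the condition that separates final from partial results.

```python
from typing import Callable, Dict, Iterable, List


def accumulate_to_clipboard(results: Iterable[Dict[str, str]],
                            copy: Callable[[str], None]) -> str:
    """Append each final result to a running transcript and re-copy the whole text."""
    fulltext = ""
    for result in results:
        if "text" in result:  # assumed marker of a finalized result
            fulltext += result["text"] + " "
            copy(fulltext)    # the clipboard always holds the full transcript so far
        # partial results are only displayed in the diff and never copied
    return fulltext


# Record what would have been sent to the clipboard instead of touching the real one.
copied: List[str] = []
final = accumulate_to_clipboard(
    [{"partial": "hel"}, {"text": "hello"}, {"text": "world"}],
    copied.append,
)
print(copied)  # ['hello ', 'hello world ']
```

Passing `pyperclip.copy` as `copy` should give the behaviour shown in the diff, where the clipboard holds the whole transcription since recording started rather than only the latest segment.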
@@ -1,6 +1,6 @@
  Metadata-Version: 2.2
  Name: scribe-cli
- Version: 0.4.0
+ Version: 0.5.0
  Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
  Author-email: Mahé Perrette <mahe.perrette@gmail.com>
  License: MIT License
@@ -43,6 +43,7 @@ Requires-Dist: numpy
  Requires-Dist: sounddevice
  Requires-Dist: tqdm
  Requires-Dist: requests
+ Requires-Dist: pyperclip
  Provides-Extra: keyboard
  Requires-Dist: pynput; extra == "keyboard"
  Provides-Extra: whisper
@@ -56,7 +57,7 @@ Requires-Dist: vosk; extra == "all"
 
  # Scribe
 
- `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI.
+ `scribe` is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard.
 
  ## Installation
 
@@ -99,7 +100,7 @@ scribe
  and the script will guide you through the choice of backend (`whisper` or `vosk`) and the specific language model.
  After this, you will be prompted to start recording your microphone and print the transcribed text in real-time (`vosk`)
  or until after recording is complete (`whisper`).
- You can interrupt the recording via Ctrl + C and start again or change model.
+ You can interrupt the recording via Ctrl + C and start again or change model. The full content of the transcription will be pasted to the clipboard by default, until interruption.
 
  The default (`whisper`) is excellent at transcribing a full-length audio sequences in [many languages](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). It is really impressive,
  but it cannot do real-time, and depending on the model can have relatively long execution time, especially with the `turbo` model (at least on my laptop with CPU only). The `small` model is also excellent and runs much faster. It is selected as default in `scribe` for that reason.
@@ -118,6 +119,7 @@ where `--no-prompt` jumps right to the recording (after the first interruption,
 
  ### Advanced usage as keyboard replacement
 
+ By default the content of the transcription is paster to the clipboard, but is not propagated further.
  With the `--keyboard` option `scribe` will attempt to simulate a keyboard and send transcribed characters to the applcation under focus:
 
  ```bash
@@ -2,6 +2,7 @@ numpy
  sounddevice
  tqdm
  requests
+ pyperclip
 
  [all]
  pynput
File without changes