PyPI - scribe-cli - Versions diffs - 0.7.6__tar.gz → 0.7.8__tar.gz - Mend

scribe-cli 0.7.6.tar.gz → 0.7.8.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

{scribe_cli-0.7.6/scribe_cli.egg-info → scribe_cli-0.7.8}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.7.6
+Version: 0.7.8
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -44,6 +44,8 @@ Requires-Dist: sounddevice
 Requires-Dist: tqdm
 Requires-Dist: requests
 Requires-Dist: pyperclip
+Requires-Dist: unidecode
+Requires-Dist: termcolor
 Provides-Extra: keyboard
 Requires-Dist: pynput; extra == "keyboard"
 Provides-Extra: whisper
@@ -60,7 +62,7 @@ Requires-Dist: vosk; extra == "all"
 Requires-Dist: pystray; extra == "all"
 [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
-[![pypi](https://github.com/perrette/scribe/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/papers-cli)
+[![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)
 # Scribe
@@ -83,7 +85,7 @@ sudo apt-get install portaudio19-dev xclip
 See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
 ```bash
-pip install scribe-cli[all]"
+pip install scribe-cli[all]
 ```
 (note the `-cli` suffix for client)
@@ -121,7 +123,7 @@ With the `whisker` model you need to stop the registration manually before the t
 there is a maximum duration after which it will stop by itself, which is setup to 60s by default (unless `--duration` is set to something else).
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
-Use mainly for longer typing session with the [keyboard](#virtual-keyboard-advanced) option, e.g. to make notes.
+It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 To skip the initial selection menu you can do:

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/README.md RENAMED

@@ -1,5 +1,5 @@
 [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
-[![pypi](https://github.com/perrette/scribe/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/papers-cli)
+[![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)
 # Scribe
@@ -22,7 +22,7 @@ sudo apt-get install portaudio19-dev xclip
 See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
 ```bash
-pip install scribe-cli[all]"
+pip install scribe-cli[all]
 ```
 (note the `-cli` suffix for client)
@@ -60,7 +60,7 @@ With the `whisker` model you need to stop the registration manually before the t
 there is a maximum duration after which it will stop by itself, which is setup to 60s by default (unless `--duration` is set to something else).
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
-Use mainly for longer typing session with the [keyboard](#virtual-keyboard-advanced) option, e.g. to make notes.
+It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 To skip the initial selection menu you can do:
@@ -124,4 +124,4 @@ e.g.
 scribe-install --backend whisper --model small
 ```
-After that just typing Cmd + scri... at any time from any where will conveniently start the app in its own terminal with the prescribed options.
+After that just typing Cmd + scri... at any time from any where will conveniently start the app in its own terminal with the prescribed options.

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/pyproject.toml RENAMED

@@ -18,6 +18,8 @@ dependencies = [
     "tqdm",
     "requests",
     "pyperclip",
+    "unidecode",
+    "termcolor",
 ]
 classifiers = [

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/_version.py RENAMED

@@ -12,5 +12,5 @@ __version__: str
 __version_tuple__: VERSION_TUPLE
 version_tuple: VERSION_TUPLE
-__version__ = version = '0.7.6'
-__version_tuple__ = version_tuple = (0, 7, 6)
+__version__ = version = '0.7.8'
+__version_tuple__ = version_tuple = (0, 7, 8)

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/app.py RENAMED

@@ -37,8 +37,27 @@ def pick_specialist_model(model, language, backend):
     return model
+class DummyTranscriber:
+    def __init__(self, backend, model_name):
+        self.backend = backend
+        self.model_name = model_name
+    def start_recording(self, micro, **kwargs):
+        while True:
+            try:
+                yield {"text": input()}
+            except KeyboardInterrupt:
+                break
+    def __getattr__(self, item):
+        return None
 def get_transcriber(o, prompt=True):
+    if o.dummy:
+        return DummyTranscriber("whisper", "dummy")
     if o.backend:
         checked_backend = check_dependencies(o.backend)
         if not checked_backend:
@@ -143,6 +162,8 @@ def get_parser():
     parser.add_argument("-l", "--language", choices=list(language_config["vosk"]),
                         help="An alias for preselected models when using the vosk backend, or 'en' for the English version of whisper models.")
+    parser.add_argument("--dummy", action="store_true", help=argparse.SUPPRESS)
     parser.add_argument("--no-prompt", action="store_false", dest="prompt", help="Disable prompts for backend and model selection and jump to recording")
     parser.add_argument("--app", action="store_true", help="Start in app mode (relies on pystray)")
@@ -151,6 +172,7 @@ def get_parser():
     parser.add_argument("--keyboard", action="store_true")
     parser.add_argument("--no-clipboard", dest="clipboard", action="store_false")
     parser.add_argument("--latency", default=0, type=float, help="keyboard latency")
+    parser.add_argument("--ascii", action="store_true", help="Use unidecode for keyboard typing in ascii")
     group = parser.add_argument_group("whisper options")
     group.add_argument("--duration", default=120, type=int, help="Max duration of the whisper recording (default %(default)ss)")
@@ -164,7 +186,7 @@ def get_parser():
 # Commencer l'enregistrement
-def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, **greetings):
+def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=0, ascii=False, **greetings):
     if keyboard:
         from scribe.keyboard import type_text
@@ -184,7 +206,7 @@ def start_recording(micro, transcriber, clipboard=True, keyboard=False, latency=
             clear_line()
             print(result.get('text'))
             if keyboard:
-                type_text(result['text'] + " ", interval=latency) # Simulate typing
+                type_text(result['text'] + " ", interval=latency, ascii=ascii) # Simulate typing
             if clipboard:
                 fulltext += result['text'] + " "
@@ -278,25 +300,37 @@ def main(args=None):
     while True:
         if transcriber is None:
             transcriber = get_transcriber(o, prompt=o.prompt)
-        print(f">>> Model {transcriber.model_name} from {transcriber.backend} selected. Keyboard [{'on' if o.keyboard else 'off'}]. Clipboard [{'on' if o.clipboard else 'off'}] <<<")
+        print(f"Model [{colored(transcriber.model_name, 'light_blue', attrs=['bold'])}] from [{colored(transcriber.backend, 'light_blue', attrs=['bold'])}] selected.")
         if o.prompt:
-            print(f"Choose any of the following actions:")
-            print(f"[q] quit")
-            print(f"[e] change model")
-            print(f"[x] toggle app [{toggle[o.app]}] -> [{toggle[not o.app]}]")
-            print(f"[k] toggle keyboard [{toggle[o.keyboard]}] -> [{toggle[not o.keyboard]}]")
-            print(f"[c] toggle clipboard [{toggle[o.clipboard]}] -> [{toggle[not o.clipboard]}]")
+            print(f"Choose any of the following actions")
+            print(f"{colored('[q]', 'light_yellow')} quit")
+            print(f"{colored('[e]', 'light_yellow')} change model")
+            print(f"{colored('[x]', 'light_yellow')} app is {colored(o.app, 'light_blue')} toggle?")
+            print(f"{colored('[c]', 'light_yellow')} clipboard is {colored(o.clipboard, 'light_blue')} toggle?")
+            print(f"{colored('[k]', 'light_yellow')} keyboard is {colored(o.keyboard, 'light_blue')} toggle?")
+            if o.keyboard:
+                print(f"{colored('[latency]', 'light_yellow')} between keystrokes is {colored(o.latency, 'light_blue')} s")
             if transcriber.backend == "whisper":
-                print(f"[t] change duration (currently {transcriber.timeout}s)")
-                print(f"[b] change silence duration (currently {transcriber.silence_duration}s)")
-                print(f"[a] toggle auto-restart after silence [{toggle[transcriber.restart_after_silence]}] -> [{toggle[not transcriber.restart_after_silence]}]")
-            print(colored(f"Press [Enter] or any other key to start recording.", "BOLD"))
+                print(f"{colored('[t]', 'light_yellow')} change duration (currently {colored(transcriber.timeout, 'light_blue')} s)")
+                print(f"{colored('[b]', 'light_yellow')} change silence duration (currently {colored(transcriber.silence_duration, 'light_blue')} s)")
+                print(f"{colored('[a]', 'light_yellow')} auto-restart after silence is {colored(transcriber.restart_after_silence, 'light_blue')} toggle?")
+            exclude_flags = ["keyboard", "clipboard", "app", "prompt", "restart_after_silence"]
+            display_flags = [a.dest for a in parser._actions if a.help != argparse.SUPPRESS]
+            for key, value in vars(o).items():
+                if key not in display_flags or key in exclude_flags or not isinstance(value, bool):
+                    continue
+                print(f"{colored(f'[{key}]', 'light_yellow')} is {colored(value, 'light_blue')} toggle?")
+            print(colored(f"Press [Enter] to start recording.", attrs=["bold"]))
             key = input()
             if key == "q":
                 exit(0)
             if key == "e":
                 transcriber = None
+                o.model = None
+                o.backend = None
+                o.language = None
                 continue
             if key == "k":
                 o.keyboard = not o.keyboard
@@ -317,6 +351,13 @@ def main(args=None):
                 except:
                     print("Invalid duration. Must be an integer.")
                 continue
+            if key == "latency":
+                ans = input(f"Enter new keyboard latency in seconds (current: {o.latency}): ")
+                try:
+                    o.latency = float(ans)
+                except:
+                    print("Invalid latency. Must be a float.")
+                continue
             if key == "b":
                 ans = input(f"Enter new silence break duration in seconds (current: {transcriber.silence_duration}): ")
                 try:
@@ -324,19 +365,27 @@ def main(args=None):
                 except:
                     print("Invalid duration. Must be an integer.")
                 continue
+            if key:
+                if hasattr(o, key) and isinstance(getattr(o, key), bool):
+                    setattr(o, key, not getattr(o, key))
+                    print(f"Toggle {key} to [{getattr(o, key)}].")
+                print(f"Invalid choice: {key}.")
+                continue
         if o.app:
             greetings = dict(
                 start_message = "Listening... Use the try icon menu to stop.",
             )
-            app = create_app(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency, **greetings)
+            app = create_app(micro, transcriber, clipboard=o.clipboard,
+                             keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
             print("Starting app...")
             app.run()
         else:
             greetings = dict(
                 start_message = "Listening... Press Ctrl+C to stop.",
             )
-            start_recording(micro, transcriber, clipboard=o.clipboard, keyboard=o.keyboard, latency=o.latency, **greetings)
+            start_recording(micro, transcriber, clipboard=o.clipboard,
+                            keyboard=o.keyboard, latency=o.latency, ascii=o.ascii, **greetings)
         # if we arrived so far, that means we pressed Ctrl + C anyway, and need Enter to move on.
         # So we leave the wider range of options to change the model.
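
Note on the generic flag toggle added to `main` in this file: as the hunk reads, the `Invalid choice` message is printed even after a successful toggle, because nothing separates the two branches. The sketch below shows one way the menu listing and the toggle could fit together with the error reported only for unknown keys. It assumes a small hypothetical parser; `build_parser`, `displayable_bool_flags` and `toggle_flag` are illustrative names, not functions from scribe.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring a few of the flags shown in the diff.
    parser = argparse.ArgumentParser()
    parser.add_argument("--dummy", action="store_true", help=argparse.SUPPRESS)
    parser.add_argument("--ascii", action="store_true", help="Type in ASCII only")
    parser.add_argument("--keyboard", action="store_true", help="Enable keyboard")
    parser.add_argument("--latency", default=0, type=float, help="keyboard latency")
    return parser


def displayable_bool_flags(parser, options, exclude=()):
    """Return the names of boolean options whose help text is not suppressed."""
    # parser._actions is a private argparse attribute; the diff relies on it too.
    visible = {a.dest for a in parser._actions if a.help != argparse.SUPPRESS}
    return [key for key, value in vars(options).items()
            if key in visible and key not in exclude and isinstance(value, bool)]


def toggle_flag(options, key):
    """Flip a boolean option; return True on success, False for an invalid key."""
    if hasattr(options, key) and isinstance(getattr(options, key), bool):
        setattr(options, key, not getattr(options, key))
        return True
    return False


parser = build_parser()
o = parser.parse_args([])

# "keyboard" has its own menu entry, so it is excluded from the generic list.
flags = displayable_bool_flags(parser, o, exclude=["keyboard"])

toggled = toggle_flag(o, "ascii")      # boolean flag: flipped
rejected = toggle_flag(o, "latency")   # numeric option: reported as invalid
if not rejected:
    print("Invalid choice: latency.")
```

Returning a boolean from the toggle helper keeps the success and error messages in separate branches, which is the behaviour the menu text suggests was intended.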

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/keyboard.py RENAMED

@@ -2,6 +2,8 @@
 """
 import platform
 import time
+import unidecode
+import logging
 try:
     # import pyautogui
@@ -30,11 +32,24 @@ def paste_text():
         keyboard.release('v')
         keyboard.release(Key.ctrl)
-def type_text(text, interval=0, paste=False):
+def safe_type_text(text):
+    """I got key errors with the uinput mode, so I'm using unidecode to convert
+    the text to ASCII before typing it."""
+    try:
+        keyboard.type(text)
+    except KeyError:
+        asciitext = unidecode.unidecode(text)
+        logging.warning(f"Key error with {text} -> convert to {asciitext}")
+        keyboard.type(asciitext)
+def type_text(text, interval=0, paste=False, ascii=False):
     # Simulate typing a string
     # import subprocess
     # subprocess.run(["ydotool", "type", text])
+    if ascii:
+        text = unidecode.unidecode(text)
     if paste:
         import pyperclip
         keep_state = pyperclip.paste()
@@ -45,7 +60,9 @@ def type_text(text, interval=0, paste=False):
     if interval > 0:
         for c in text:
-            keyboard.type(c)
+            # keyboard.type(c)
+            safe_type_text(c)
             time.sleep(interval)
     else:
-        keyboard.type(text)
+        # keyboard.type(text)
+        safe_type_text(text)
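
Note on `safe_type_text`: the fallback pattern is to attempt to type the text as-is and, only if the keyboard backend raises `KeyError`, to retry with an ASCII transliteration. The sketch below reproduces that pattern with stand-ins so it runs without a display or extra packages: `FakeKeyboard` is a hypothetical controller that rejects non-ASCII characters, and `to_ascii` uses the standard-library `unicodedata` module as a rough approximation of `unidecode.unidecode` (which transliterates a much wider range of characters).

```python
import unicodedata


class FakeKeyboard:
    """Hypothetical stand-in for a keyboard controller that only knows ASCII keys."""

    def __init__(self):
        self.typed = []

    def type(self, text):
        # Reject the whole string before recording anything, like a failed key lookup.
        for c in text:
            if ord(c) > 127:
                raise KeyError(c)
        self.typed.append(text)


def to_ascii(text):
    # Stdlib approximation of unidecode: decompose accented characters,
    # then drop whatever cannot be encoded as ASCII (the combining marks).
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def safe_type_text(keyboard, text):
    """Type text as-is, falling back to an ASCII version on KeyError."""
    try:
        keyboard.type(text)
    except KeyError:
        keyboard.type(to_ascii(text))


kb = FakeKeyboard()
safe_type_text(kb, "hello")       # typed unchanged
safe_type_text(kb, "café déjà")   # falls back to the ASCII form
print(kb.typed)
```

With the `--ascii` flag the conversion is applied up front in `type_text` instead, so the fallback path is never needed for that text.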

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/models.py RENAMED

@@ -102,6 +102,7 @@ class AbstractTranscriber:
 def get_vosk_model(model, download_root=None, url=None):
     """Load the Vosk recognizer"""
     import vosk
+    vosk.SetLogLevel(-1)
     if download_root is None:
         download_root = VOSK_MODELS_FOLDER
     model_path = os.path.join(download_root, model)

{scribe_cli-0.7.6 → scribe_cli-0.7.8}/scribe/util.py RENAMED

@@ -3,26 +3,7 @@ import re
 import tqdm
 import shutil
 from functools import partial
-class bcolors:
-    # https://stackoverflow.com/a/287944/2192272
-    HEADER = '\033[95m'
-    OKBLUE = '\033[94m'
-    OKGREEN = '\033[92m'
-    WARNING = '\033[93m'
-    FAIL = '\033[91m'
-    ENDC = '\033[0m'
-    BOLD = '\033[1m'
-    UNDERLINE = '\033[4m'
-def strip_colors(s):
-    for name, c in vars(bcolors).items():
-        if name.startswith("_"):
-            continue
-        s = s.replace(c, '')
-    return s
+from termcolor import colored
 def ansi_link(uri, label=None):
     """https://stackoverflow.com/a/71309268/2192272
@@ -36,25 +17,6 @@ def ansi_link(uri, label=None):
     return escape_mask.format(parameters, uri, label)
-def colored(text, color):
-    if hasattr(bcolors, color):
-        color = getattr(bcolors, color)
-    return f"{color}{text}{bcolors.ENDC}"
-ANSI_LINK_RE = re.compile(r'(?P<ansi_sequence>\033]8;(?P<parameter>.*?);(?P<uri>.*?)\033\\(?P<label>.*?)\033]8;;\033\\)')
-def strip_ansi_link(s):
-    for m in ANSI_LINK_RE.findall(s):
-        s = s.replace(m[0], m[3])
-    return s
-def strip_all(s):
-    s = strip_colors(s)
-    s = strip_ansi_link(s)
-    return s
 # Function to clear the terminal line
 def clear_line():
@@ -119,9 +81,9 @@ def format_choice(enum, default=None, unavailable=None):
         value_str = value
     if (default is not None and value == default) or (default is None and i == 0):
-        return f'  ' + colored(f'({i+1}) {value_str} [Press Enter]', 'BOLD')
+        return f'  ' + colored(f'({i+1}) {value_str} [Press Enter]', attrs=['bold'])
     elif unavailable and value in unavailable:
-        return f'  ' + colored(f'{" "} {value_str} -> unavailable !!', 'FAIL')
+        return f'  ' + colored(f'{" "} {value_str} -> unavailable !!', attrs=["strike"])
     else:
         return f'  ({i+1}) {value_str}'

{scribe_cli-0.7.6 → scribe_cli-0.7.8/scribe_cli.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.7.6
+Version: 0.7.8
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -44,6 +44,8 @@ Requires-Dist: sounddevice
 Requires-Dist: tqdm
 Requires-Dist: requests
 Requires-Dist: pyperclip
+Requires-Dist: unidecode
+Requires-Dist: termcolor
 Provides-Extra: keyboard
 Requires-Dist: pynput; extra == "keyboard"
 Provides-Extra: whisper
@@ -60,7 +62,7 @@ Requires-Dist: vosk; extra == "all"
 Requires-Dist: pystray; extra == "all"
 [![python](https://img.shields.io/badge/python-3.12-blue.svg)]()
-[![pypi](https://github.com/perrette/scribe/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/papers-cli)
+[![pypi](https://img.shields.io/pypi/v/scribe-cli)](https://pypi.org/project/scribe-cli)
 # Scribe
@@ -83,7 +85,7 @@ sudo apt-get install portaudio19-dev xclip
 See additional requirements for the [icon tray](#system-tray-icon-experimental) and [keyboard](#virtual-keyboard-experimental) options. The python dependencies should be dealt with automatically:
 ```bash
-pip install scribe-cli[all]"
+pip install scribe-cli[all]
 ```
 (note the `-cli` suffix for client)
@@ -121,7 +123,7 @@ With the `whisker` model you need to stop the registration manually before the t
 there is a maximum duration after which it will stop by itself, which is setup to 60s by default (unless `--duration` is set to something else).
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
-Use mainly for longer typing session with the [keyboard](#virtual-keyboard-advanced) option, e.g. to make notes.
+It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
 There are many [vosk models](https://alphacephei.com/vosk/models) available, and here a few are associated to [a handful of languages](scribe/models.toml) `en`, `fr`, `it`, `de` (so far).
 To skip the initial selection menu you can do: