PyPI: scribe-cli 0.12.4.tar.gz → 0.13.0.tar.gz


Files changed (31)

{scribe_cli-0.12.4/scribe_cli.egg-info → scribe_cli-0.13.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.12.4
+Version: 0.13.0
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -33,7 +33,7 @@ License: MIT License
         licenses of all dependencies before using or distributing this software to
         ensure compliance with their respective terms.
 Project-URL: Homepage, https://github.com/perrette/scribe
-Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
+Keywords: speech-to-text,speech recognition,transcription,language,AI,local,API,vosk,whisper,openai,keyboard,clipboard
 Classifier: Programming Language :: Python :: 3.9
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
@@ -171,8 +171,8 @@ You can interrupt the recording via Ctrl + C and start again or change model.
 The default (`whisper`) is excellent at transcribing a full-length audio sequences in [many languages](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). It is really impressive,
 but it cannot do real-time, and depending on the model can have relatively long execution time, especially with the `turbo` model (at least on my laptop with CPU only). The `small` model is also excellent and runs much faster. It is selected as default in `scribe` for that reason.
-With the `whisper` model the registration stops after a 2-second silence is detected. You can also stop the registration manually before the transcription occurs (Ctrl + C or Stop in the `--app` mode).
-By default, the recording will only last 120 seconds. You can fine-tune this behaviour via the `--silence`, `--duration` and `--restart-after-silence` parameters.
+With the `whisper` model (`whisper` and `openaiapi` backends) the registration continues for 2 minutes until you stop the registration manually to trigger the transcription (Stop in the app, Ctrl + C in the terminal).
+These parameters can be changed. There is also the possibility to interrupt after a silence is detected. You would do: `--silence -40 --duration-duration 2` to interrupt the registration when a silence (less than -40 db recorded) lasts for more than 2 seconds. This is experimental, and the default is an exceedingly low silence threshold of -200 db and a silence duration of 120 s to effectively disable that feature and keep full manual control.
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
 It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.

{scribe_cli-0.12.4 → scribe_cli-0.13.0}/README.md RENAMED

@@ -99,8 +99,8 @@ You can interrupt the recording via Ctrl + C and start again or change model.
 The default (`whisper`) is excellent at transcribing a full-length audio sequences in [many languages](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). It is really impressive,
 but it cannot do real-time, and depending on the model can have relatively long execution time, especially with the `turbo` model (at least on my laptop with CPU only). The `small` model is also excellent and runs much faster. It is selected as default in `scribe` for that reason.
-With the `whisper` model the registration stops after a 2-second silence is detected. You can also stop the registration manually before the transcription occurs (Ctrl + C or Stop in the `--app` mode).
-By default, the recording will only last 120 seconds. You can fine-tune this behaviour via the `--silence`, `--duration` and `--restart-after-silence` parameters.
+With the `whisper` model (`whisper` and `openaiapi` backends) the registration continues for 2 minutes until you stop the registration manually to trigger the transcription (Stop in the app, Ctrl + C in the terminal).
+These parameters can be changed. There is also the possibility to interrupt after a silence is detected. You would do: `--silence -40 --duration-duration 2` to interrupt the registration when a silence (less than -40 db recorded) lasts for more than 2 seconds. This is experimental, and the default is an exceedingly low silence threshold of -200 db and a silence duration of 120 s to effectively disable that feature and keep full manual control.
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
 It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.
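
The silence rule described in the README hunk above (stop once the level stays below a dB threshold for a given number of seconds) can be illustrated with a minimal sketch. This is not scribe's implementation: the helper names `level_db`, `trailing_silence_seconds` and `should_stop`, the fixed-length chunking, and the RMS-based level measure are all assumptions made for illustration. It only shows why a threshold of -200 dB combined with a 120 s silence duration effectively disables the feature for any non-zero input.

```python
import math


def level_db(samples):
    """RMS level of a chunk in dB relative to full scale (1.0).

    Hypothetical helper: scribe may measure loudness differently.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


def trailing_silence_seconds(chunks, chunk_seconds, threshold_db):
    """Length of the run of quiet chunks at the end of the recording."""
    total = 0.0
    for chunk in reversed(chunks):
        if level_db(chunk) >= threshold_db:
            break  # first chunk (from the end) that is loud enough
        total += chunk_seconds
    return total


def should_stop(chunks, chunk_seconds, threshold_db, silence_seconds):
    """True when the trailing silence is at least `silence_seconds` long."""
    return trailing_silence_seconds(chunks, chunk_seconds, threshold_db) >= silence_seconds


# One second of speech-like signal followed by three seconds of faint noise.
loud = [0.5] * 100     # roughly -6 dB
quiet = [0.001] * 100  # roughly -60 dB
recording = [loud, quiet, quiet, quiet]

stop_old_defaults = should_stop(recording, 1.0, threshold_db=-40, silence_seconds=2)
stop_new_defaults = should_stop(recording, 1.0, threshold_db=-200, silence_seconds=120)
```

Under these assumptions, the 0.12.4 defaults (-40 dB, 2 s) stop the recording after the quiet tail, while the 0.13.0 defaults (-200 dB, 120 s) never treat the faint noise as silence, which leaves stopping to the user.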

{scribe_cli-0.12.4 → scribe_cli-0.13.0}/pyproject.toml RENAMED

@@ -32,10 +32,13 @@ classifiers = [
 ]
 keywords = [
+    "speech-to-text",
     "speech recognition",
     "transcription",
-    "AI",
     "language",
+    "AI",
+    "local",
+    "API",
     "vosk",
     "whisper",
     "openai",

{scribe_cli-0.12.4 → scribe_cli-0.13.0}/scribe/_version.py RENAMED

@@ -17,5 +17,5 @@ __version__: str
 __version_tuple__: VERSION_TUPLE
 version_tuple: VERSION_TUPLE
-__version__ = version = '0.12.4'
-__version_tuple__ = version_tuple = (0, 12, 4)
+__version__ = version = '0.13.0'
+__version_tuple__ = version_tuple = (0, 13, 0)

{scribe_cli-0.12.4 → scribe_cli-0.13.0}/scribe/app.py RENAMED

@@ -202,8 +202,8 @@ def get_parser():
     group = parser.add_argument_group("whisper options")
     group.add_argument("--duration", default=120, type=float, help="Max duration of the whisper recording (default %(default)s s)")
-    group.add_argument("--silence", default=2, type=float, help="silence duration (default %(default)s s)")
-    group.add_argument("--silence-db", default=-40, type=float, help="silence magnitude in decibel (default %(default)s db)")
+    group.add_argument("--silence", default=120, type=float, help="silence duration (default %(default)s s)")
+    group.add_argument("--silence-db", default=-200, type=float, help="silence magnitude in decibel (default %(default)s db)")
     group.add_argument("-a", "--restart-after-silence", action="store_true", help="Restart the recording after a transcription triggered by a silence")
     group.add_argument("--download-folder-whisper", help="Folder to store Whisper models.")
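
The effect of the changed defaults can be checked with a stand-alone sketch that re-declares only the four whisper options visible in this hunk. The function name `build_whisper_parser` and the `prog` value are made up for the example; the real `get_parser()` defines many more options. Note that the parser in this hunk exposes `--silence` (a duration in seconds) and `--silence-db` (a level in decibels), so the README example `--silence -40 --duration-duration 2` does not appear to match these flag names. Judging from the hunk alone, the equivalent invocation would seem to be `--silence-db -40 --silence 2`, which is what the sketch parses.

```python
import argparse


def build_whisper_parser():
    """Re-declare the whisper options from the hunk above with their 0.13.0 defaults.

    Illustrative only: this is not scribe's full parser.
    """
    parser = argparse.ArgumentParser(prog="scribe-sketch")
    group = parser.add_argument_group("whisper options")
    group.add_argument("--duration", default=120, type=float,
                       help="Max duration of the whisper recording (default %(default)s s)")
    group.add_argument("--silence", default=120, type=float,
                       help="silence duration (default %(default)s s)")
    group.add_argument("--silence-db", default=-200, type=float,
                       help="silence magnitude in decibel (default %(default)s db)")
    group.add_argument("-a", "--restart-after-silence", action="store_true",
                       help="Restart the recording after a transcription triggered by a silence")
    return parser


parser = build_whisper_parser()

# No flags: silence detection is effectively off (threshold -200 dB, 120 s).
defaults = parser.parse_args([])

# Restore the 0.12.4 behaviour: stop after 2 s below -40 dB.
custom = parser.parse_args(["--silence-db", "-40", "--silence", "2"])
```

Because `--silence` and `--duration` both default to 120 s in 0.13.0, a recording started without flags runs until the maximum duration or a manual stop, whichever comes first.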

{scribe_cli-0.12.4 → scribe_cli-0.13.0/scribe_cli.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.2
 Name: scribe-cli
-Version: 0.12.4
+Version: 0.13.0
 Summary: scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
 Author-email: Mahé Perrette <mahe.perrette@gmail.com>
 License: MIT License
@@ -33,7 +33,7 @@ License: MIT License
         licenses of all dependencies before using or distributing this software to
         ensure compliance with their respective terms.
 Project-URL: Homepage, https://github.com/perrette/scribe
-Keywords: speech recognition,transcription,AI,language,vosk,whisper,openai,keyboard,clipboard
+Keywords: speech-to-text,speech recognition,transcription,language,AI,local,API,vosk,whisper,openai,keyboard,clipboard
 Classifier: Programming Language :: Python :: 3.9
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
@@ -171,8 +171,8 @@ You can interrupt the recording via Ctrl + C and start again or change model.
 The default (`whisper`) is excellent at transcribing a full-length audio sequences in [many languages](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). It is really impressive,
 but it cannot do real-time, and depending on the model can have relatively long execution time, especially with the `turbo` model (at least on my laptop with CPU only). The `small` model is also excellent and runs much faster. It is selected as default in `scribe` for that reason.
-With the `whisper` model the registration stops after a 2-second silence is detected. You can also stop the registration manually before the transcription occurs (Ctrl + C or Stop in the `--app` mode).
-By default, the recording will only last 120 seconds. You can fine-tune this behaviour via the `--silence`, `--duration` and `--restart-after-silence` parameters.
+With the `whisper` model (`whisper` and `openaiapi` backends) the registration continues for 2 minutes until you stop the registration manually to trigger the transcription (Stop in the app, Ctrl + C in the terminal).
+These parameters can be changed. There is also the possibility to interrupt after a silence is detected. You would do: `--silence -40 --duration-duration 2` to interrupt the registration when a silence (less than -40 db recorded) lasts for more than 2 seconds. This is experimental, and the default is an exceedingly low silence threshold of -200 db and a silence duration of 120 s to effectively disable that feature and keep full manual control.
 The `vosk` backend is much faster and very good at doing real-time transcription for one language, but tended to make more mistakes in my tests and it does not do punctuation.
 It becomes really powerful when used for longer or interactive typing session with the [keyboard](#virtual-keyboard-experimental) option, e.g. to make notes or chat with an AI.