PyPI: whisper-hotkey 0.1.1__tar.gz

Files changed (12)

whisper_hotkey-0.1.1/LICENSE +21 -0
whisper_hotkey-0.1.1/PKG-INFO +84 -0
whisper_hotkey-0.1.1/README.md +64 -0
whisper_hotkey-0.1.1/pyproject.toml +34 -0
whisper_hotkey-0.1.1/src/whisper_dictation/__init__.py +1 -0
whisper_hotkey-0.1.1/src/whisper_dictation/__main__.py +3 -0
whisper_hotkey-0.1.1/src/whisper_dictation/cli.py +106 -0
whisper_hotkey-0.1.1/src/whisper_dictation/config.py +42 -0
whisper_hotkey-0.1.1/src/whisper_dictation/launchagent.py +114 -0
whisper_hotkey-0.1.1/src/whisper_dictation/permissions.py +117 -0
whisper_hotkey-0.1.1/src/whisper_dictation/service.py +157 -0
whisper_hotkey-0.1.1/uv.lock +797 -0

whisper_hotkey-0.1.1/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 Zane
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

whisper_hotkey-0.1.1/PKG-INFO ADDED

@@ -0,0 +1,84 @@
+Metadata-Version: 2.4
+Name: whisper-hotkey
+Version: 0.1.1
+Summary: Double-tap Control to dictate with Whisper. Transcribes and pastes to any text field. macOS only.
+Project-URL: Homepage, https://github.com/zane/whisper-hotkey
+License-Expression: MIT
+License-File: LICENSE
+Keywords: dictation,macos,speech-to-text,transcription,whisper
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: MacOS
+Classifier: Programming Language :: Python :: 3
+Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
+Requires-Python: >=3.12
+Requires-Dist: faster-whisper
+Requires-Dist: numpy
+Requires-Dist: pynput
+Requires-Dist: sounddevice
+Requires-Dist: soundfile
+Description-Content-Type: text/markdown
+# whisper-hotkey
+Double-tap Control to dictate with Whisper on macOS. Transcribes your speech and pastes it into whatever text field is focused.
+Uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) (CTranslate2) with int8 quantization for fast local inference. No cloud APIs, everything runs on your machine.
+## Install
+```bash
+uvx whisper-hotkey install
+```
+This will:
+1. Ask you to choose a model (tiny.en for speed, or distil-large-v3 for accuracy)
+2. Download the model
+3. Request microphone and accessibility permissions
+4. Install a LaunchAgent so it starts automatically on login
+## Usage
+After install, just **double-tap the Control key** to start recording. Double-tap again to stop — your speech is transcribed and pasted into the active text field.
+That's it. It works across all apps, survives reboots, and runs in the background.
+## Commands
+```bash
+whisper-hotkey install    # Interactive setup (model, permissions, auto-start)
+whisper-hotkey uninstall  # Remove auto-start
+whisper-hotkey status     # Check if the service is running
+whisper-hotkey             # Run the service directly (not needed after install)
+```
+## Requirements
+- macOS (Apple Silicon or Intel)
+- Python 3.12+
+- [uv](https://docs.astral.sh/uv/) (recommended) or pip
+## Models
+| Model | Size | Speed | Accuracy |
+|-------|------|-------|----------|
+| tiny.en (default) | ~75MB | Fastest | Good for clear English |
+| distil-large-v3 | ~1.5GB | Slower | Better with accents/noise |
+## Permissions
+The app needs two macOS permissions:
+- **Microphone** — to record your voice
+- **Accessibility** — to detect the keyboard shortcut and paste text
+The `install` command will prompt for these. If the shortcut doesn't work, check System Settings > Privacy & Security and make sure both your terminal and Python have Accessibility access.
+## Logs
+```bash
+tail -f ~/Library/Logs/whisper-hotkey/stdout.log
+tail -f ~/Library/Logs/whisper-hotkey/stderr.log
+```
+## License
+MIT

whisper_hotkey-0.1.1/README.md ADDED

@@ -0,0 +1,64 @@
+# whisper-hotkey
+Double-tap Control to dictate with Whisper on macOS. Transcribes your speech and pastes it into whatever text field is focused.
+Uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) (CTranslate2) with int8 quantization for fast local inference. No cloud APIs, everything runs on your machine.
+## Install
+```bash
+uvx whisper-hotkey install
+```
+This will:
+1. Ask you to choose a model (tiny.en for speed, or distil-large-v3 for accuracy)
+2. Download the model
+3. Request microphone and accessibility permissions
+4. Install a LaunchAgent so it starts automatically on login
+## Usage
+After install, just **double-tap the Control key** to start recording. Double-tap again to stop — your speech is transcribed and pasted into the active text field.
+That's it. It works across all apps, survives reboots, and runs in the background.
+## Commands
+```bash
+whisper-hotkey install    # Interactive setup (model, permissions, auto-start)
+whisper-hotkey uninstall  # Remove auto-start
+whisper-hotkey status     # Check if the service is running
+whisper-hotkey             # Run the service directly (not needed after install)
+```
+## Requirements
+- macOS (Apple Silicon or Intel)
+- Python 3.12+
+- [uv](https://docs.astral.sh/uv/) (recommended) or pip
+## Models
+| Model | Size | Speed | Accuracy |
+|-------|------|-------|----------|
+| tiny.en (default) | ~75MB | Fastest | Good for clear English |
+| distil-large-v3 | ~1.5GB | Slower | Better with accents/noise |
+## Permissions
+The app needs two macOS permissions:
+- **Microphone** — to record your voice
+- **Accessibility** — to detect the keyboard shortcut and paste text
+The `install` command will prompt for these. If the shortcut doesn't work, check System Settings > Privacy & Security and make sure both your terminal and Python have Accessibility access.
+## Logs
+```bash
+tail -f ~/Library/Logs/whisper-hotkey/stdout.log
+tail -f ~/Library/Logs/whisper-hotkey/stderr.log
+```
+## License
+MIT
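
Reviewer note: the README describes a double-tap on Control that toggles recording, but the file implementing it (`service.py`) is not shown in this part of the diff. The following is only a minimal sketch of how such a detector could work, not the package's implementation. The class name `DoubleTapDetector`, the 0.4 second window, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class DoubleTapDetector:
    """Fire a callback when two taps arrive within `window` seconds of each other."""

    def __init__(
        self,
        on_double_tap: Callable[[], None],
        window: float = 0.4,  # hypothetical threshold, not taken from the package
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.on_double_tap = on_double_tap
        self.window = window
        self.clock = clock
        self._last_tap: Optional[float] = None

    def tap(self) -> bool:
        """Record one tap; return True if it completed a double tap."""
        now = self.clock()
        if self._last_tap is not None and (now - self._last_tap) <= self.window:
            # Reset so that a third quick tap starts a new sequence
            # instead of immediately firing again.
            self._last_tap = None
            self.on_double_tap()
            return True
        self._last_tap = now
        return False
```

With pynput, `tap()` could plausibly be called from a `keyboard.Listener` `on_release` callback whenever the released key is `Key.ctrl`; the callback passed as `on_double_tap` would then start or stop recording.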

whisper_hotkey-0.1.1/pyproject.toml ADDED

@@ -0,0 +1,34 @@
+[project]
+name = "whisper-hotkey"
+version = "0.1.1"
+description = "Double-tap Control to dictate with Whisper. Transcribes and pastes to any text field. macOS only."
+readme = "README.md"
+license = "MIT"
+requires-python = ">=3.12"
+keywords = ["whisper", "dictation", "speech-to-text", "macos", "transcription"]
+classifiers = [
+    "Operating System :: MacOS",
+    "Topic :: Multimedia :: Sound/Audio :: Speech",
+    "License :: OSI Approved :: MIT License",
+    "Programming Language :: Python :: 3",
+]
+dependencies = [
+    "numpy",
+    "sounddevice",
+    "soundfile",
+    "faster-whisper",
+    "pynput",
+]
+[project.urls]
+Homepage = "https://github.com/zane/whisper-hotkey"
+[project.scripts]
+whisper-hotkey = "whisper_dictation.cli:main"
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+[tool.hatch.build.targets.wheel]
+packages = ["src/whisper_dictation"]

whisper_hotkey-0.1.1/src/whisper_dictation/__init__.py ADDED

@@ -0,0 +1 @@
+"""whisper-dictation: Double-tap Control to dictate with Whisper on macOS."""

whisper_hotkey-0.1.1/src/whisper_dictation/__main__.py ADDED

@@ -0,0 +1,3 @@
+from whisper_dictation.service import run
+run()

whisper_hotkey-0.1.1/src/whisper_dictation/cli.py ADDED

@@ -0,0 +1,106 @@
+"""CLI entry point for whisper-dictation."""
+import argparse
+import sys
+from whisper_dictation.config import MODELS, DEFAULT_MODEL, get_model_id, set_model_id
+def prompt_model_choice():
+    """Interactively ask the user which model to use."""
+    print("Select a Whisper model:\n")
+    options = list(MODELS.items())
+    for i, (model_id, info) in enumerate(options, 1):
+        print(f"  {i}) {info['label']}")
+        print(f"     {info['description']}\n")
+    while True:
+        choice = input(f"Enter choice [1]: ").strip()
+        if choice == "" or choice == "1":
+            return options[0][0]
+        if choice == "2":
+            return options[1][0]
+        print("  Please enter 1 or 2.")
+def do_install():
+    """Interactive install: model choice, permissions, LaunchAgent."""
+    from whisper_dictation.permissions import request_permissions
+    from whisper_dictation.launchagent import install_agent
+    print()
+    print("=" * 60)
+    print("  whisper-dictation: Install")
+    print("=" * 60)
+    print()
+    # 1. Model choice
+    model_id = prompt_model_choice()
+    set_model_id(model_id)
+    print(f"\n  Selected model: {model_id}\n")
+    # 2. Pre-download model
+    print("  Downloading model (this may take a moment)...")
+    from faster_whisper import WhisperModel
+    WhisperModel(model_id, device="cpu", compute_type="int8")
+    print("  Model ready.\n")
+    # 3. Permissions
+    request_permissions()
+    # 4. LaunchAgent
+    print("Installing LaunchAgent...\n")
+    install_agent()
+    print("=" * 60)
+    print("  Done! Double-tap Control to dictate.")
+    print()
+    print("  If the hotkey doesn't work, make sure these are all granted")
+    print("  in System Settings > Privacy & Security:")
+    print("    - Microphone")
+    print("    - Input Monitoring")
+    print("    - Accessibility")
+    print()
+    print("  Then restart your terminal or log out and back in.")
+    print("=" * 60)
+    print()
+def do_uninstall():
+    from whisper_dictation.launchagent import uninstall_agent
+    uninstall_agent()
+def do_status():
+    from whisper_dictation.launchagent import check_status
+    print(f"Configured model: {get_model_id()}")
+    check_status()
+def do_run():
+    from whisper_dictation.service import run
+    run()
+def main():
+    parser = argparse.ArgumentParser(
+        prog="whisper-hotkey",
+        description="Whisper dictation service for macOS. "
+                    "Double-tap Control to start/stop recording.",
+    )
+    sub = parser.add_subparsers(dest="command")
+    sub.add_parser("run", help="Run the dictation service (default)")
+    sub.add_parser("install", help="Set up model, permissions, and auto-start")
+    sub.add_parser("uninstall", help="Remove the auto-start LaunchAgent")
+    sub.add_parser("status", help="Check if the service is running")
+    args = parser.parse_args()
+    if args.command == "install":
+        do_install()
+    elif args.command == "uninstall":
+        do_uninstall()
+    elif args.command == "status":
+        do_status()
+    else:
+        do_run()
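
Reviewer note: `main()` falls through to `do_run()` whenever no subcommand is given, because `add_subparsers(dest="command")` leaves `args.command` as `None` in that case. The sketch below isolates that behaviour so it can be checked without starting the service. `build_parser` and `resolve_command` are hypothetical helper names, not functions from the package.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same four subcommands as the diff above."""
    parser = argparse.ArgumentParser(prog="whisper-hotkey")
    sub = parser.add_subparsers(dest="command")
    for name in ("run", "install", "uninstall", "status"):
        sub.add_parser(name)
    return parser


def resolve_command(argv: Optional[Sequence[str]] = None) -> str:
    """Return the chosen subcommand, treating a bare invocation as 'run'."""
    args = build_parser().parse_args(argv)
    # args.command is None when no subcommand was supplied.
    return args.command or "run"
```

For example, `resolve_command([])` and `resolve_command(["run"])` both yield `"run"`, which matches the README's statement that a bare `whisper-hotkey` runs the service directly.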

whisper_hotkey-0.1.1/src/whisper_dictation/config.py ADDED

@@ -0,0 +1,42 @@
+"""Configuration management for whisper-dictation."""
+import json
+from pathlib import Path
+CONFIG_DIR = Path.home() / ".config" / "whisper-dictation"
+CONFIG_FILE = CONFIG_DIR / "config.json"
+MODELS = {
+    "tiny.en": {
+        "label": "tiny.en (Recommended)",
+        "description": "~75MB, fastest, good for clear English dictation",
+    },
+    "distil-large-v3": {
+        "label": "distil-large-v3",
+        "description": "~1.5GB, slower but more accurate, handles accents/noise better",
+    },
+}
+DEFAULT_MODEL = "tiny.en"
+def load_config() -> dict:
+    if CONFIG_FILE.exists():
+        return json.loads(CONFIG_FILE.read_text())
+    return {}
+def save_config(config: dict):
+    CONFIG_DIR.mkdir(parents=True, exist_ok=True)
+    CONFIG_FILE.write_text(json.dumps(config, indent=2) + "\n")
+def get_model_id() -> str:
+    config = load_config()
+    return config.get("model", DEFAULT_MODEL)
+def set_model_id(model_id: str):
+    config = load_config()
+    config["model"] = model_id
+    save_config(config)
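
Reviewer note: the configuration is a single JSON file, and `get_model_id()` falls back to `DEFAULT_MODEL` when the file or the `model` key is missing. The sketch below reproduces that round trip against a temporary directory instead of `~/.config`. Passing the path as a parameter and the `demo` function are assumptions made so the example is self-contained; the package itself uses module-level constants.

```python
import json
import tempfile
from pathlib import Path

DEFAULT_MODEL = "tiny.en"


def load_config(config_file: Path) -> dict:
    """Return the stored config, or an empty dict if no file exists yet."""
    if config_file.exists():
        return json.loads(config_file.read_text())
    return {}


def save_config(config_file: Path, config: dict) -> None:
    """Write the config as indented JSON, creating parent directories."""
    config_file.parent.mkdir(parents=True, exist_ok=True)
    config_file.write_text(json.dumps(config, indent=2) + "\n")


def demo() -> tuple[str, str]:
    """Show the default before any save and the stored value after one."""
    with tempfile.TemporaryDirectory() as tmp:
        config_file = Path(tmp) / "whisper-dictation" / "config.json"
        before = load_config(config_file).get("model", DEFAULT_MODEL)
        save_config(config_file, {"model": "distil-large-v3"})
        after = load_config(config_file).get("model", DEFAULT_MODEL)
    return before, after
```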

whisper_hotkey-0.1.1/src/whisper_dictation/launchagent.py ADDED

@@ -0,0 +1,114 @@
+"""macOS LaunchAgent management for whisper-dictation."""
+import os
+import plistlib
+import shutil
+import subprocess
+import sys
+from pathlib import Path
+AGENT_LABEL = "com.whisper-hotkey.service"
+PLIST_FILENAME = f"{AGENT_LABEL}.plist"
+def get_plist_path() -> Path:
+    return Path.home() / "Library" / "LaunchAgents" / PLIST_FILENAME
+def get_log_dir() -> Path:
+    log_dir = Path.home() / "Library" / "Logs" / "whisper-hotkey"
+    log_dir.mkdir(parents=True, exist_ok=True)
+    return log_dir
+def find_binary() -> str:
+    """Find the whisper-dictation binary."""
+    # uv tool install location
+    uv_path = Path.home() / ".local" / "bin" / "whisper-hotkey"
+    if uv_path.exists():
+        return str(uv_path)
+    which = shutil.which("whisper-hotkey")
+    if which:
+        return which
+    return sys.executable
+def build_plist() -> dict:
+    binary = find_binary()
+    log_dir = get_log_dir()
+    if binary == sys.executable:
+        program_args = [binary, "-m", "whisper_dictation"]
+    else:
+        program_args = [binary]
+    return {
+        "Label": AGENT_LABEL,
+        "ProgramArguments": program_args,
+        "RunAtLoad": True,
+        "KeepAlive": True,
+        "StandardOutPath": str(log_dir / "stdout.log"),
+        "StandardErrorPath": str(log_dir / "stderr.log"),
+        "EnvironmentVariables": {
+            "PATH": "/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin",
+            "HOME": str(Path.home()),
+        },
+        "ProcessType": "Interactive",
+        "ThrottleInterval": 5,
+    }
+def install_agent():
+    """Write the LaunchAgent plist and load it."""
+    plist_path = get_plist_path()
+    plist_path.parent.mkdir(parents=True, exist_ok=True)
+    # Unload existing if present
+    if plist_path.exists():
+        subprocess.run(
+            ["launchctl", "unload", str(plist_path)],
+            capture_output=True,
+        )
+    plist_data = build_plist()
+    with open(plist_path, "wb") as f:
+        plistlib.dump(plist_data, f)
+    subprocess.run(["launchctl", "load", str(plist_path)], check=True)
+    print(f"  LaunchAgent installed: {plist_path}")
+    print(f"  Logs: {get_log_dir()}")
+    print(f"  Binary: {' '.join(plist_data['ProgramArguments'])}")
+    print()
+    print("  The service is now running and will auto-start on login.")
+    print("  Use 'whisper-hotkey uninstall' to remove.")
+    print()
+def uninstall_agent():
+    """Unload and remove the LaunchAgent plist."""
+    plist_path = get_plist_path()
+    if not plist_path.exists():
+        print("LaunchAgent not found. Nothing to uninstall.")
+        return
+    subprocess.run(
+        ["launchctl", "unload", str(plist_path)],
+        capture_output=True,
+    )
+    plist_path.unlink()
+    print(f"LaunchAgent removed: {plist_path}")
+def check_status():
+    """Check if the service is currently loaded."""
+    result = subprocess.run(
+        ["launchctl", "list", AGENT_LABEL],
+        capture_output=True, text=True,
+    )
+    if result.returncode == 0:
+        print(f"Service is running.\n{result.stdout}")
+    else:
+        print("Service is not running.")
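
Reviewer note: `build_plist()` returns a plain dict that `plistlib.dump` serializes to XML. The sketch below builds a reduced version of that dict and round-trips it through `plistlib` in memory, which makes it possible to inspect the generated LaunchAgent without touching `~/Library/LaunchAgents` or calling `launchctl`. Taking `binary` and `log_dir` as parameters and the `render_plist` helper are assumptions for illustration; only a subset of the keys from the diff is included.

```python
import plistlib
import sys
from pathlib import Path

AGENT_LABEL = "com.whisper-hotkey.service"


def build_plist(binary: str, log_dir: Path) -> dict:
    """Build a reduced LaunchAgent dict mirroring the structure in the diff."""
    if binary == sys.executable:
        # Falling back to the interpreter: run the package as a module.
        program_args = [binary, "-m", "whisper_dictation"]
    else:
        program_args = [binary]
    return {
        "Label": AGENT_LABEL,
        "ProgramArguments": program_args,
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": str(log_dir / "stdout.log"),
        "StandardErrorPath": str(log_dir / "stderr.log"),
    }


def render_plist(binary: str, log_dir: Path) -> bytes:
    """Serialize the plist to XML bytes without writing it to disk."""
    return plistlib.dumps(build_plist(binary, log_dir))
```

Parsing the result back with `plistlib.loads` is a cheap way to confirm which command `launchd` would be asked to run.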

whisper_hotkey-0.1.1/src/whisper_dictation/permissions.py ADDED

@@ -0,0 +1,117 @@
+"""macOS permission requests for microphone, input monitoring, and accessibility."""
+import subprocess
+import sys
+import os
+from pathlib import Path
+def get_python_path():
+    """Return the resolved Python interpreter path."""
+    return os.path.realpath(sys.executable)
+def open_system_settings(pane):
+    """Open a specific Privacy & Security pane in System Settings."""
+    subprocess.run(
+        ["open", f"x-apple.systempreferences:com.apple.preference.security?{pane}"],
+        capture_output=True,
+    )
+def reveal_in_finder(path):
+    """Reveal a file in Finder so the user can drag or select it."""
+    subprocess.run(["open", "-R", path], capture_output=True)
+def request_microphone_access():
+    """Trigger the macOS microphone permission dialog."""
+    print("  [1/3] Microphone")
+    print("  Requesting microphone access...")
+    print("  If prompted, click 'Allow' in the macOS dialog.\n")
+    try:
+        import sounddevice as sd
+        sd.rec(int(0.1 * 16000), samplerate=16000, channels=1, dtype="float32")
+        sd.wait()
+        print("  ✓ Microphone access granted.\n")
+    except Exception as e:
+        print(f"  ✗ Microphone access may have been denied: {e}")
+        print("  Go to System Settings > Privacy & Security > Microphone")
+        print(f"  and enable access for: {get_python_path()}\n")
+def request_input_monitoring():
+    """Guide the user to grant Input Monitoring permission.
+    pynput requires Input Monitoring to detect global keyboard events.
+    macOS only shows .app bundles in the Input Monitoring list by default,
+    so we reveal the Python binary in Finder and open the settings pane
+    to let the user add it manually.
+    """
+    python_path = get_python_path()
+    print("  [2/3] Input Monitoring")
+    print("  This is required for the double-tap Control hotkey to work.")
+    print()
+    print("  Opening System Settings > Input Monitoring and revealing")
+    print(f"  the Python binary in Finder so you can add it.\n")
+    print(f"    Python binary: {python_path}\n")
+    print("  Steps:")
+    print("    1. In the System Settings window that opens, click the '+' button")
+    print("    2. In the file dialog, press Cmd+Shift+G and paste this path:")
+    print(f"       {Path(python_path).parent}")
+    print(f"    3. Select '{Path(python_path).name}' and click Open")
+    print("    4. Make sure the toggle is ON\n")
+    open_system_settings("Privacy_ListenEvent")
+    reveal_in_finder(python_path)
+    input("  Press Enter after granting Input Monitoring access...")
+    print()
+def request_accessibility_access():
+    """Trigger the macOS accessibility permission prompt."""
+    python_path = get_python_path()
+    print("  [3/3] Accessibility")
+    print("  This is needed to paste transcribed text via Cmd+V.\n")
+    subprocess.run(
+        ["osascript", "-e",
+         'tell application "System Events" to keystroke ""'],
+        capture_output=True,
+    )
+    terminal = os.environ.get("TERM_PROGRAM", "Terminal")
+    print("  If pasting doesn't work after install, add BOTH of these")
+    print("  in System Settings > Privacy & Security > Accessibility:\n")
+    print(f"    1. Your terminal: {terminal}")
+    print(f"    2. Python: {python_path}\n")
+def request_permissions():
+    """Run all permission requests."""
+    python_path = get_python_path()
+    print("=" * 60)
+    print("  whisper-dictation: Permission Setup")
+    print("=" * 60)
+    print()
+    print(f"  Python interpreter: {python_path}")
+    print()
+    print("  This tool requires three macOS permissions:")
+    print("    1. Microphone — to record audio")
+    print("    2. Input Monitoring — to detect the hotkey")
+    print("    3. Accessibility — to paste transcribed text")
+    print()
+    request_microphone_access()
+    request_input_monitoring()
+    request_accessibility_access()
+    print("=" * 60)
+    print("  Setup complete! After granting all permissions,")
+    print("  restart your terminal for changes to take effect.")
+    print("=" * 60)
+    print()
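
Reviewer note: `open_system_settings()` builds an `x-apple.systempreferences:` URL and hands it to `open`. The sketch below separates building the command from running it, so the URL can be inspected on any platform. `settings_command` is a hypothetical helper; `Privacy_ListenEvent` comes from the diff above, while `Privacy_Microphone` and `Privacy_Accessibility` are pane anchors assumed here and not used by the package.

```python
from typing import List

SECURITY_PANE = "x-apple.systempreferences:com.apple.preference.security"


def settings_command(pane: str) -> List[str]:
    """Return the argv that would open the given Privacy & Security pane."""
    return ["open", f"{SECURITY_PANE}?{pane}"]


# Pane anchors: the first is taken from the diff, the other two are assumptions.
PANES = {
    "input_monitoring": "Privacy_ListenEvent",
    "microphone": "Privacy_Microphone",
    "accessibility": "Privacy_Accessibility",
}
```

On macOS the returned list could be passed to `subprocess.run`, as the package does; elsewhere it is only useful for inspection.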