whisper-hotkey 0.1.1__tar.gz

@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Zane
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,84 @@
+ Metadata-Version: 2.4
+ Name: whisper-hotkey
+ Version: 0.1.1
+ Summary: Double-tap Control to dictate with Whisper. Transcribes and pastes to any text field. macOS only.
+ Project-URL: Homepage, https://github.com/zane/whisper-hotkey
+ License-Expression: MIT
+ License-File: LICENSE
+ Keywords: dictation,macos,speech-to-text,transcription,whisper
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: MacOS
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
+ Requires-Python: >=3.12
+ Requires-Dist: faster-whisper
+ Requires-Dist: numpy
+ Requires-Dist: pynput
+ Requires-Dist: sounddevice
+ Requires-Dist: soundfile
+ Description-Content-Type: text/markdown
+
+ # whisper-hotkey
+
+ Double-tap Control to dictate with Whisper on macOS. Transcribes your speech and pastes it into whatever text field is focused.
+
+ Uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) (CTranslate2) with int8 quantization for fast local inference. No cloud APIs; everything runs on your machine.
+
+ ## Install
+
+ ```bash
+ uvx whisper-hotkey install
+ ```
+
+ This will:
+ 1. Ask you to choose a model (tiny.en for speed, or distil-large-v3 for accuracy)
+ 2. Download the model
+ 3. Request microphone, input monitoring, and accessibility permissions
+ 4. Install a LaunchAgent so it starts automatically on login
+
+ ## Usage
+
+ After install, **double-tap the Control key** to start recording. Double-tap again to stop; your speech is then transcribed and pasted into the active text field.
+
+ That's it. It works across all apps, survives reboots, and runs in the background.
+
+ ## Commands
+
+ ```bash
+ whisper-hotkey install # Interactive setup (model, permissions, auto-start)
+ whisper-hotkey uninstall # Remove auto-start
+ whisper-hotkey status # Check if the service is running
+ whisper-hotkey # Run the service directly (not needed after install)
+ ```
+
+ ## Requirements
+
+ - macOS (Apple Silicon or Intel)
+ - Python 3.12+
+ - [uv](https://docs.astral.sh/uv/) (recommended) or pip
+
+ ## Models
+
+ | Model | Size | Speed | Accuracy |
+ |-------|------|-------|----------|
+ | tiny.en (default) | ~75MB | Fastest | Good for clear English |
+ | distil-large-v3 | ~1.5GB | Slower | Better with accents/noise |
+
+ ## Permissions
+
+ The app needs three macOS permissions:
+ - **Microphone**: to record your voice
+ - **Input Monitoring** and **Accessibility**: to detect the keyboard shortcut and paste text
+
+ The `install` command will prompt for these. If the shortcut doesn't work, check System Settings > Privacy & Security and make sure Python has Input Monitoring access and both your terminal and Python have Accessibility access.
+
+ ## Logs
+
+ ```bash
+ tail -f ~/Library/Logs/whisper-hotkey/stdout.log
+ tail -f ~/Library/Logs/whisper-hotkey/stderr.log
+ ```
+
+ ## License
+
+ MIT
@@ -0,0 +1,64 @@
+ # whisper-hotkey
+
+ Double-tap Control to dictate with Whisper on macOS. Transcribes your speech and pastes it into whatever text field is focused.
+
+ Uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) (CTranslate2) with int8 quantization for fast local inference. No cloud APIs; everything runs on your machine.
+
+ ## Install
+
+ ```bash
+ uvx whisper-hotkey install
+ ```
+
+ This will:
+ 1. Ask you to choose a model (tiny.en for speed, or distil-large-v3 for accuracy)
+ 2. Download the model
+ 3. Request microphone, input monitoring, and accessibility permissions
+ 4. Install a LaunchAgent so it starts automatically on login
+
+ ## Usage
+
+ After install, **double-tap the Control key** to start recording. Double-tap again to stop; your speech is then transcribed and pasted into the active text field.
+
+ That's it. It works across all apps, survives reboots, and runs in the background.
+
+ ## Commands
+
+ ```bash
+ whisper-hotkey install # Interactive setup (model, permissions, auto-start)
+ whisper-hotkey uninstall # Remove auto-start
+ whisper-hotkey status # Check if the service is running
+ whisper-hotkey # Run the service directly (not needed after install)
+ ```
+
+ ## Requirements
+
+ - macOS (Apple Silicon or Intel)
+ - Python 3.12+
+ - [uv](https://docs.astral.sh/uv/) (recommended) or pip
+
+ ## Models
+
+ | Model | Size | Speed | Accuracy |
+ |-------|------|-------|----------|
+ | tiny.en (default) | ~75MB | Fastest | Good for clear English |
+ | distil-large-v3 | ~1.5GB | Slower | Better with accents/noise |
+
+ ## Permissions
+
+ The app needs three macOS permissions:
+ - **Microphone**: to record your voice
+ - **Input Monitoring** and **Accessibility**: to detect the keyboard shortcut and paste text
+
+ The `install` command will prompt for these. If the shortcut doesn't work, check System Settings > Privacy & Security and make sure Python has Input Monitoring access and both your terminal and Python have Accessibility access.
+
+ ## Logs
+
+ ```bash
+ tail -f ~/Library/Logs/whisper-hotkey/stdout.log
+ tail -f ~/Library/Logs/whisper-hotkey/stderr.log
+ ```
+
+ ## License
+
+ MIT
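The README describes the trigger as a double-tap of the Control key, but the service module that implements it is not part of this excerpt. The following is only a sketch of how such detection could work, assuming a simple timestamp comparison between successive key presses. The class name `DoubleTapDetector` and the 0.4 second window are hypothetical and do not come from the package.

```python
import time
from typing import Optional


class DoubleTapDetector:
    """Report a double-tap when two presses arrive within `window` seconds.

    Hypothetical helper for illustration; not taken from whisper-hotkey.
    """

    def __init__(self, window: float = 0.4):
        self.window = window
        self._last_press: Optional[float] = None

    def on_press(self, now: Optional[float] = None) -> bool:
        """Record a key press and return True if it completes a double-tap."""
        if now is None:
            now = time.monotonic()
        is_double = (
            self._last_press is not None
            and (now - self._last_press) <= self.window
        )
        # After a double-tap, reset so that a third quick press starts a new pair.
        self._last_press = None if is_double else now
        return is_double
```

In a real listener, `on_press` would be called from the keyboard callback for the Control key only. Holding a key down can produce repeated press events, so a production version would probably also need to track key releases.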
@@ -0,0 +1,34 @@
+ [project]
+ name = "whisper-hotkey"
+ version = "0.1.1"
+ description = "Double-tap Control to dictate with Whisper. Transcribes and pastes to any text field. macOS only."
+ readme = "README.md"
+ license = "MIT"
+ requires-python = ">=3.12"
+ keywords = ["whisper", "dictation", "speech-to-text", "macos", "transcription"]
+ classifiers = [
+     "Operating System :: MacOS",
+     "Topic :: Multimedia :: Sound/Audio :: Speech",
+     "License :: OSI Approved :: MIT License",
+     "Programming Language :: Python :: 3",
+ ]
+ dependencies = [
+     "numpy",
+     "sounddevice",
+     "soundfile",
+     "faster-whisper",
+     "pynput",
+ ]
+
+ [project.urls]
+ Homepage = "https://github.com/zane/whisper-hotkey"
+
+ [project.scripts]
+ whisper-hotkey = "whisper_dictation.cli:main"
+
+ [build-system]
+ requires = ["hatchling"]
+ build-backend = "hatchling.build"
+
+ [tool.hatch.build.targets.wheel]
+ packages = ["src/whisper_dictation"]
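The `[project.scripts]` entry maps the `whisper-hotkey` command to the object reference `whisper_dictation.cli:main`. As a rough illustration of how a `module:attribute` reference of this kind is resolved, here is a minimal sketch. The function name `resolve_entry_point` is hypothetical, and installers generate their own wrapper scripts rather than calling anything like it.

```python
import importlib
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Resolve a 'package.module:object.attr' reference to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted attributes after the colon, e.g. "cli:main" or "mod:Class.method".
    for part in attr_path.split(".") if attr_path else []:
        obj = getattr(obj, part)
    return obj
```

With the package installed, `resolve_entry_point("whisper_dictation.cli:main")` would return the `main` function that the console script calls.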
@@ -0,0 +1 @@
+ """whisper-dictation: Double-tap Control to dictate with Whisper on macOS."""
@@ -0,0 +1,3 @@
+ from whisper_dictation.service import run
+
+ run()
@@ -0,0 +1,106 @@
+ """CLI entry point for whisper-dictation."""
+
+ import argparse
+ import sys
+
+ from whisper_dictation.config import MODELS, DEFAULT_MODEL, get_model_id, set_model_id
+
+
+ def prompt_model_choice():
+     """Interactively ask the user which model to use."""
+     print("Select a Whisper model:\n")
+     options = list(MODELS.items())
+     for i, (model_id, info) in enumerate(options, 1):
+         print(f" {i}) {info['label']}")
+         print(f" {info['description']}\n")
+
+     while True:
+         choice = input("Enter choice [1]: ").strip()
+         if choice == "" or choice == "1":
+             return options[0][0]
+         if choice == "2":
+             return options[1][0]
+         print(" Please enter 1 or 2.")
+
+
+ def do_install():
+     """Interactive install: model choice, permissions, LaunchAgent."""
+     from whisper_dictation.permissions import request_permissions
+     from whisper_dictation.launchagent import install_agent
+
+     print()
+     print("=" * 60)
+     print(" whisper-dictation: Install")
+     print("=" * 60)
+     print()
+
+     # 1. Model choice
+     model_id = prompt_model_choice()
+     set_model_id(model_id)
+     print(f"\n Selected model: {model_id}\n")
+
+     # 2. Pre-download model
+     print(" Downloading model (this may take a moment)...")
+     from faster_whisper import WhisperModel
+     WhisperModel(model_id, device="cpu", compute_type="int8")
+     print(" Model ready.\n")
+
+     # 3. Permissions
+     request_permissions()
+
+     # 4. LaunchAgent
+     print("Installing LaunchAgent...\n")
+     install_agent()
+
+     print("=" * 60)
+     print(" Done! Double-tap Control to dictate.")
+     print()
+     print(" If the hotkey doesn't work, make sure these are all granted")
+     print(" in System Settings > Privacy & Security:")
+     print(" - Microphone")
+     print(" - Input Monitoring")
+     print(" - Accessibility")
+     print()
+     print(" Then restart your terminal or log out and back in.")
+     print("=" * 60)
+     print()
+
+
+ def do_uninstall():
+     from whisper_dictation.launchagent import uninstall_agent
+     uninstall_agent()
+
+
+ def do_status():
+     from whisper_dictation.launchagent import check_status
+     print(f"Configured model: {get_model_id()}")
+     check_status()
+
+
+ def do_run():
+     from whisper_dictation.service import run
+     run()
+
+
+ def main():
+     parser = argparse.ArgumentParser(
+         prog="whisper-hotkey",
+         description="Whisper dictation service for macOS. "
+         "Double-tap Control to start/stop recording.",
+     )
+     sub = parser.add_subparsers(dest="command")
+     sub.add_parser("run", help="Run the dictation service (default)")
+     sub.add_parser("install", help="Set up model, permissions, and auto-start")
+     sub.add_parser("uninstall", help="Remove the auto-start LaunchAgent")
+     sub.add_parser("status", help="Check if the service is running")
+
+     args = parser.parse_args()
+
+     if args.command == "install":
+         do_install()
+     elif args.command == "uninstall":
+         do_uninstall()
+     elif args.command == "status":
+         do_status()
+     else:
+         do_run()
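`prompt_model_choice` hardcodes the answers `1` and `2`, which matches the two entries currently in `MODELS`. If more models were added, a version driven by the length of the option list might be easier to maintain. The sketch below is one possible shape for that, not the package's code. The name `choose_option` and the injectable `read` parameter (used here so the loop can be exercised without a terminal) are assumptions.

```python
from typing import Callable, Sequence


def choose_option(
    options: Sequence[str],
    read: Callable[[str], str] = input,
    default: int = 1,
) -> str:
    """Ask for a 1-based index into `options` until a valid answer is given."""
    while True:
        raw = read(f"Enter choice [{default}]: ").strip()
        if raw == "":
            # Empty input selects the default entry.
            return options[default - 1]
        if raw.isdecimal() and 1 <= int(raw) <= len(options):
            return options[int(raw) - 1]
        print(f"  Please enter a number from 1 to {len(options)}.")
```

Called as `choose_option(list(MODELS))`, it would return the selected model ID in the same way the original function does.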
@@ -0,0 +1,42 @@
+ """Configuration management for whisper-dictation."""
+
+ import json
+ from pathlib import Path
+
+ CONFIG_DIR = Path.home() / ".config" / "whisper-dictation"
+ CONFIG_FILE = CONFIG_DIR / "config.json"
+
+ MODELS = {
+     "tiny.en": {
+         "label": "tiny.en (Recommended)",
+         "description": "~75MB, fastest, good for clear English dictation",
+     },
+     "distil-large-v3": {
+         "label": "distil-large-v3",
+         "description": "~1.5GB, slower but more accurate, handles accents/noise better",
+     },
+ }
+
+ DEFAULT_MODEL = "tiny.en"
+
+
+ def load_config() -> dict:
+     if CONFIG_FILE.exists():
+         return json.loads(CONFIG_FILE.read_text())
+     return {}
+
+
+ def save_config(config: dict):
+     CONFIG_DIR.mkdir(parents=True, exist_ok=True)
+     CONFIG_FILE.write_text(json.dumps(config, indent=2) + "\n")
+
+
+ def get_model_id() -> str:
+     config = load_config()
+     return config.get("model", DEFAULT_MODEL)
+
+
+ def set_model_id(model_id: str):
+     config = load_config()
+     config["model"] = model_id
+     save_config(config)
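As written, `load_config` raises `json.JSONDecodeError` if `config.json` exists but is malformed, and `get_model_id` returns whatever string is stored even if it is not a key of `MODELS`. A more defensive variant could look like the sketch below. The function names `load_config_safe` and `get_model_id_safe` and their explicit parameters are hypothetical, chosen so the example runs on its own without touching the real config path.

```python
import json
from pathlib import Path
from typing import Collection


def load_config_safe(config_file: Path) -> dict:
    """Return the parsed config, or {} if the file is missing or not a JSON object."""
    try:
        data = json.loads(config_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def get_model_id_safe(
    config_file: Path, known_models: Collection[str], default: str
) -> str:
    """Return the configured model, falling back to `default` if it is unknown."""
    model = load_config_safe(config_file).get("model", default)
    return model if model in known_models else default
```

Whether silently falling back is preferable to surfacing the error is a design choice; for a background service started by launchd, a fallback may avoid a restart loop on a corrupted file.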
@@ -0,0 +1,114 @@
+ """macOS LaunchAgent management for whisper-dictation."""
+
+ import os
+ import plistlib
+ import shutil
+ import subprocess
+ import sys
+ from pathlib import Path
+
+ AGENT_LABEL = "com.whisper-hotkey.service"
+ PLIST_FILENAME = f"{AGENT_LABEL}.plist"
+
+
+ def get_plist_path() -> Path:
+     return Path.home() / "Library" / "LaunchAgents" / PLIST_FILENAME
+
+
+ def get_log_dir() -> Path:
+     log_dir = Path.home() / "Library" / "Logs" / "whisper-hotkey"
+     log_dir.mkdir(parents=True, exist_ok=True)
+     return log_dir
+
+
+ def find_binary() -> str:
+     """Find the whisper-hotkey executable, falling back to the Python interpreter."""
+     # uv tool install location
+     uv_path = Path.home() / ".local" / "bin" / "whisper-hotkey"
+     if uv_path.exists():
+         return str(uv_path)
+
+     which = shutil.which("whisper-hotkey")
+     if which:
+         return which
+
+     return sys.executable
+
+
+ def build_plist() -> dict:
+     binary = find_binary()
+     log_dir = get_log_dir()
+
+     if binary == sys.executable:
+         program_args = [binary, "-m", "whisper_dictation"]
+     else:
+         program_args = [binary]
+
+     return {
+         "Label": AGENT_LABEL,
+         "ProgramArguments": program_args,
+         "RunAtLoad": True,
+         "KeepAlive": True,
+         "StandardOutPath": str(log_dir / "stdout.log"),
+         "StandardErrorPath": str(log_dir / "stderr.log"),
+         "EnvironmentVariables": {
+             "PATH": "/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin",
+             "HOME": str(Path.home()),
+         },
+         "ProcessType": "Interactive",
+         "ThrottleInterval": 5,
+     }
+
+
+ def install_agent():
+     """Write the LaunchAgent plist and load it."""
+     plist_path = get_plist_path()
+     plist_path.parent.mkdir(parents=True, exist_ok=True)
+
+     # Unload existing if present
+     if plist_path.exists():
+         subprocess.run(
+             ["launchctl", "unload", str(plist_path)],
+             capture_output=True,
+         )
+
+     plist_data = build_plist()
+     with open(plist_path, "wb") as f:
+         plistlib.dump(plist_data, f)
+
+     subprocess.run(["launchctl", "load", str(plist_path)], check=True)
+
+     print(f" LaunchAgent installed: {plist_path}")
+     print(f" Logs: {get_log_dir()}")
+     print(f" Binary: {' '.join(plist_data['ProgramArguments'])}")
+     print()
+     print(" The service is now running and will auto-start on login.")
+     print(" Use 'whisper-hotkey uninstall' to remove.")
+     print()
+
+
+ def uninstall_agent():
+     """Unload and remove the LaunchAgent plist."""
+     plist_path = get_plist_path()
+     if not plist_path.exists():
+         print("LaunchAgent not found. Nothing to uninstall.")
+         return
+
+     subprocess.run(
+         ["launchctl", "unload", str(plist_path)],
+         capture_output=True,
+     )
+     plist_path.unlink()
+     print(f"LaunchAgent removed: {plist_path}")
+
+
+ def check_status():
+     """Check if the service is currently loaded."""
+     result = subprocess.run(
+         ["launchctl", "list", AGENT_LABEL],
+         capture_output=True, text=True,
+     )
+     if result.returncode == 0:
+         print(f"Service is running.\n{result.stdout}")
+     else:
+         print("Service is not running.")
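`install_agent` serializes the dictionary from `build_plist` with `plistlib.dump`. To see what ends up on disk without installing anything, the same serialization can be done in memory. The sketch below uses a reduced set of the keys shown above; the function name `render_agent_plist` and its parameters are hypothetical.

```python
import plistlib
from typing import List


def render_agent_plist(label: str, program_args: List[str], log_dir: str) -> bytes:
    """Serialize a minimal LaunchAgent definition to XML plist bytes."""
    payload = {
        "Label": label,
        "ProgramArguments": program_args,
        "RunAtLoad": True,   # start when the agent is loaded (at login)
        "KeepAlive": True,   # ask launchd to restart the process if it exits
        "StandardOutPath": f"{log_dir}/stdout.log",
        "StandardErrorPath": f"{log_dir}/stderr.log",
    }
    # plistlib.dumps defaults to the XML format, the same as plistlib.dump.
    return plistlib.dumps(payload)
```

Printing `render_agent_plist(...).decode()` shows the XML, and `plistlib.loads` parses it back into a dictionary, which can be handy for checking an installed plist.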
@@ -0,0 +1,117 @@
+ """macOS permission requests for microphone, input monitoring, and accessibility."""
+
+ import subprocess
+ import sys
+ import os
+ from pathlib import Path
+
+
+ def get_python_path():
+     """Return the resolved Python interpreter path."""
+     return os.path.realpath(sys.executable)
+
+
+ def open_system_settings(pane):
+     """Open a specific Privacy & Security pane in System Settings."""
+     subprocess.run(
+         ["open", f"x-apple.systempreferences:com.apple.preference.security?{pane}"],
+         capture_output=True,
+     )
+
+
+ def reveal_in_finder(path):
+     """Reveal a file in Finder so the user can drag or select it."""
+     subprocess.run(["open", "-R", path], capture_output=True)
+
+
+ def request_microphone_access():
+     """Trigger the macOS microphone permission dialog."""
+     print(" [1/3] Microphone")
+     print(" Requesting microphone access...")
+     print(" If prompted, click 'Allow' in the macOS dialog.\n")
+     try:
+         import sounddevice as sd
+         sd.rec(int(0.1 * 16000), samplerate=16000, channels=1, dtype="float32")
+         sd.wait()
+         print(" ✓ Microphone access granted.\n")
+     except Exception as e:
+         print(f" ✗ Microphone access may have been denied: {e}")
+         print(" Go to System Settings > Privacy & Security > Microphone")
+         print(f" and enable access for: {get_python_path()}\n")
+
+
+ def request_input_monitoring():
+     """Guide the user to grant Input Monitoring permission.
+
+     pynput requires Input Monitoring to detect global keyboard events.
+     macOS only shows .app bundles in the Input Monitoring list by default,
+     so we reveal the Python binary in Finder and open the settings pane
+     to let the user add it manually.
+     """
+     python_path = get_python_path()
+
+     print(" [2/3] Input Monitoring")
+     print(" This is required for the double-tap Control hotkey to work.")
+     print()
+     print(" Opening System Settings > Input Monitoring and revealing")
+     print(" the Python binary in Finder so you can add it.\n")
+     print(f" Python binary: {python_path}\n")
+     print(" Steps:")
+     print(" 1. In the System Settings window that opens, click the '+' button")
+     print(" 2. In the file dialog, press Cmd+Shift+G and paste this path:")
+     print(f" {Path(python_path).parent}")
+     print(f" 3. Select '{Path(python_path).name}' and click Open")
+     print(" 4. Make sure the toggle is ON\n")
+
+     open_system_settings("Privacy_ListenEvent")
+     reveal_in_finder(python_path)
+
+     input(" Press Enter after granting Input Monitoring access...")
+     print()
+
+
+ def request_accessibility_access():
+     """Trigger the macOS accessibility permission prompt."""
+     python_path = get_python_path()
+
+     print(" [3/3] Accessibility")
+     print(" This is needed to paste transcribed text via Cmd+V.\n")
+
+     subprocess.run(
+         ["osascript", "-e",
+          'tell application "System Events" to keystroke ""'],
+         capture_output=True,
+     )
+
+     terminal = os.environ.get("TERM_PROGRAM", "Terminal")
+     print(" If pasting doesn't work after install, add BOTH of these")
+     print(" in System Settings > Privacy & Security > Accessibility:\n")
+     print(f" 1. Your terminal: {terminal}")
+     print(f" 2. Python: {python_path}\n")
+
+
+ def request_permissions():
+     """Run all permission requests."""
+     python_path = get_python_path()
+
+     print("=" * 60)
+     print(" whisper-dictation: Permission Setup")
+     print("=" * 60)
+     print()
+     print(f" Python interpreter: {python_path}")
+     print()
+     print(" This tool requires three macOS permissions:")
+     print(" 1. Microphone — to record audio")
+     print(" 2. Input Monitoring — to detect the hotkey")
+     print(" 3. Accessibility — to paste transcribed text")
+     print()
+
+     request_microphone_access()
+     request_input_monitoring()
+     request_accessibility_access()
+
+     print("=" * 60)
+     print(" Setup complete! After granting all permissions,")
+     print(" restart your terminal for changes to take effect.")
+     print("=" * 60)
+     print()
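`open_system_settings` builds a settings URL from a pane anchor, and the module only ever passes `Privacy_ListenEvent`. If the microphone and accessibility steps were also meant to open their panes directly, a small lookup table could keep the anchors in one place. This is a sketch under assumptions: `settings_url` and `PRIVACY_ANCHORS` are hypothetical names, and the `Privacy_Microphone` and `Privacy_Accessibility` anchors do not appear in the source, so they should be checked against the macOS versions being targeted.

```python
# Base URL as used by open_system_settings in the source.
SETTINGS_BASE = "x-apple.systempreferences:com.apple.preference.security"

PRIVACY_ANCHORS = {
    "microphone": "Privacy_Microphone",         # assumption, not in the source
    "input_monitoring": "Privacy_ListenEvent",  # anchor used by the source
    "accessibility": "Privacy_Accessibility",   # assumption, not in the source
}


def settings_url(permission: str) -> str:
    """Build the System Settings URL for a named privacy permission."""
    try:
        anchor = PRIVACY_ANCHORS[permission]
    except KeyError:
        raise ValueError(f"unknown permission: {permission!r}") from None
    return f"{SETTINGS_BASE}?{anchor}"
```

The resulting string could then be handed to `open`, as the existing helper does with its own URL.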