@arcforgelabs/dictate 2026.6.3

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Samuel Rodda
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,212 @@
+ # 🎙️ Dictate
+
+ Desktop dictation that types into the focused app.
+
+ `dictate` runs as a small tray app. Press your configured push-to-talk shortcut,
+ speak, and it transcribes into whatever app you are already using.
+
+ Current status: early desktop app. Linux installs, Windows 11 source installs,
+ tray controls, startup integration, recent history, model selection, API key
+ storage, update, and uninstall paths are implemented. Windows Store/signed
+ installer packaging is being prepared; see `docs/goal.md`.
+
+ ## Install
+
+ Windows 11 normal install:
+
+ The target public channel is Microsoft Store distribution. Until that listing is
+ ready, use the GitHub release installer artifacts for internal validation only.
+ The PowerShell bootstrap installer below is a developer/source path, not the
+ normal public install route.
+
+ Windows developer/source install from the hosted bootstrap:
+
+ ```powershell
+ powershell -ExecutionPolicy Bypass -Command "iwr -useb https://cdn.jsdelivr.net/npm/@arcforgelabs/dictate@latest/install.ps1 | iex"
+ ```
+
+ Open **Dictate** from the Start Menu after install.
+
+ Ubuntu/Debian source install:
+
+ ```bash
+ ./install-ubuntu.sh
+ ```
+
+ Generic Linux source install:
+
+ ```bash
+ ./install.sh
+ ```
+
+ Open **Dictate** from the app launcher after install.
+
+ Windows developer/source install, from the repo/source directory:
+
+ ```powershell
+ powershell -ExecutionPolicy Bypass -File .\install-windows-wizard.ps1
+ ```
+
+ The local `.ps1` installer scripts must be run from a checkout or extracted
+ source directory. They will not work from `C:\Windows\System32`.
+
+ Node/npm users can also run the developer bootstrap:
+
+ ```powershell
+ npx @arcforgelabs/dictate install
+ ```
+
+ ## Workflow
+
+ ```text
+ Open Dictate
+ Select Model
+ Set API key if using a hosted model
+ Set push-to-talk shortcut if desired
+ Hold shortcut, speak, release
+ Review Recent History when needed
+ ```
+
+ By default, Dictate installs a normal app launcher entry and starts on sign-in.
+ Startup can be changed from Settings.
+
+ ## Update And Uninstall
+
+ Windows developer/bootstrap install:
+
+ ```powershell
+ powershell -ExecutionPolicy Bypass -Command "iwr -useb https://cdn.jsdelivr.net/npm/@arcforgelabs/dictate@latest/update.ps1 | iex"
+ powershell -ExecutionPolicy Bypass -Command "iwr -useb https://cdn.jsdelivr.net/npm/@arcforgelabs/dictate@latest/uninstall.ps1 | iex"
+ ```
+
+ Windows from source:
+
+ ```powershell
+ powershell -ExecutionPolicy Bypass -File .\update-windows.ps1
+ powershell -ExecutionPolicy Bypass -File .\uninstall-windows.ps1
+ ```
+
+ Linux:
+
+ ```bash
+ ./update.sh
+ ./uninstall.sh
+ ```
+
+ Use `-RemoveUserData` on Windows or `--remove-user-data` on Linux only when you
+ also want to remove config, logs, history, and downloaded model data.
+
+ ## What It Does Today
+
+ - Starts from the Windows Start Menu or Linux app launcher
+ - Runs as a tray app
+ - Types dictated text into the focused app
+ - Supports configurable push-to-talk
+ - Shows selected model/status in the app UI
+ - Supports launch on startup
+ - Stores hosted-provider API keys in the OS secret store
+ - Keeps a small Recent History for copy/paste recovery
+ - Provides installer, updater, uninstaller, and doctor paths
+
+ ## Models
+
+ Supported provider defaults:
+
+ - `faster-whisper/turbo` for local transcription
+ - `openai/gpt-4o-mini-transcribe`
+ - `xai/grok-speech-to-text`
+ - `gemini/gemini-3-flash-preview`
+
+ Local transcription can use CPU or GPU where supported. Hosted providers require
+ an API key before they can be selected.
+
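The `provider/model` identifiers in the Models section can be split and classified with a few lines of Python. This is only an illustrative sketch: `split_model_id`, `requires_api_key`, and `LOCAL_BACKENDS` are hypothetical names, not part of the package, and the rule "everything except `faster-whisper` is hosted" is inferred from the README list rather than from the package source.

```python
# Hypothetical helper names; the README does not show how Dictate does this.
LOCAL_BACKENDS = {"faster-whisper"}  # the only local provider the README lists


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model)."""
    backend, sep, model = model_id.partition("/")
    if not sep or not backend or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return backend, model


def requires_api_key(model_id: str) -> bool:
    """Assume every provider that is not local is hosted and needs a key."""
    backend, _ = split_model_id(model_id)
    return backend not in LOCAL_BACKENDS
```

Under these assumptions, `requires_api_key("faster-whisper/turbo")` is false and the three hosted identifiers return true.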
+ ## Commands
+
+ ```bash
+ dictate
+ dictate --no-tray
+ dictate --once
+ dictate --once --copy
+ dictate doctor --quick
+ dictate doctor --quick --fix
+ dictate doctor --check-model-load
+ ```
+
+ Hotword and model options are available from Settings. CLI flags still exist for
+ automation and testing:
+
+ ```bash
+ dictate --stt-backend faster-whisper --model turbo
+ dictate --stt-backend openai --model gpt-4o-mini-transcribe
+ dictate --stt-backend xai --model grok-speech-to-text
+ dictate --stt-backend gemini --model gemini-3-flash-preview
+ dictate --add-hotword AcmeWidget
+ dictate --list-hotwords
+ ```
+
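For automation, the backend/model invocations in the Commands section can be assembled as argument lists instead of hand-written strings. The sketch below uses only the `--stt-backend` and `--model` flags and the four pairs that appear in the README. The helper names `dictate_argv` and `is_installed` are hypothetical, and nothing here describes the CLI's exit codes or output.

```python
import shlex
import shutil

# Backend/model pairs copied from the README's Commands section.
DOCUMENTED_PAIRS = {
    ("faster-whisper", "turbo"),
    ("openai", "gpt-4o-mini-transcribe"),
    ("xai", "grok-speech-to-text"),
    ("gemini", "gemini-3-flash-preview"),
}


def dictate_argv(backend: str, model: str) -> list[str]:
    """Build the argv for one documented backend/model pair."""
    if (backend, model) not in DOCUMENTED_PAIRS:
        raise ValueError(f"pair not shown in the README: {backend}/{model}")
    return ["dictate", "--stt-backend", backend, "--model", model]


def is_installed() -> bool:
    """True when a `dictate` executable is found on PATH."""
    return shutil.which("dictate") is not None


# Render one command the way it would be typed in a shell.
command_line = shlex.join(dictate_argv("faster-whisper", "turbo"))
```

Passing the list to `subprocess.run` avoids shell quoting issues; `shlex.join` is only used here to display the command.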
+ ## State
+
+ User state is local:
+
+ ```text
+ Linux:
+ ~/.config/dictate/config.yaml
+ ~/.local/share/dictate/
+
+ Windows:
+ %APPDATA%\dictate\config.yaml
+ %LOCALAPPDATA%\dictate\
+ ```
+
+ Repo defaults intentionally ship with `hotwords: []`. Hotwords are user-specific
+ and should not be packaged into the public repo default config.
+
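The locations in the State section can be computed from environment variables, which is handy for backup or cleanup scripts. This is a minimal sketch with hypothetical function names (`config_path`, `data_dir`). It takes the environment as a plain dict so it can be exercised on any OS, and it assumes the literal paths shown in the README, ignoring any `XDG_*` overrides the application may or may not honor.

```python
from pathlib import PurePosixPath, PureWindowsPath


def config_path(system: str, env: dict[str, str]):
    """Return the config.yaml location listed in the README for this system."""
    if system == "Windows":
        return PureWindowsPath(env["APPDATA"]) / "dictate" / "config.yaml"
    # Linux: ~/.config/dictate/config.yaml
    return PurePosixPath(env["HOME"]) / ".config" / "dictate" / "config.yaml"


def data_dir(system: str, env: dict[str, str]):
    """Return the per-user data directory listed in the README."""
    if system == "Windows":
        return PureWindowsPath(env["LOCALAPPDATA"]) / "dictate"
    # Linux: ~/.local/share/dictate/
    return PurePosixPath(env["HOME"]) / ".local" / "share" / "dictate"
```

On a real machine, `platform.system()` and `os.environ` would supply the two arguments.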
+ ## Safety
+
+ - Dictate does not intentionally write raw API keys to `config.yaml`.
+ - API keys configured in the app use the OS secret store.
+ - Dictation text can be sensitive; check logs and issue reports before sharing.
+ - Important transcriptions should be verified before relying on them.
+ - Support and maintenance are best-effort.
+
+ ## Desktop UI — the Quiet Console (preview)
+
+ A design-system desktop Settings window is being built alongside the tray:
+
+ - [`ui/`](ui/README.md) — React/Vite front-end (the Quiet Console: seven views,
+ ⌘K palette, listening HUD, light/dark, GNOME/KDE chrome).
+ - [`ui-shell/`](ui-shell/README.md) — Tauri 2 shell that hosts it on Linux.
+ - `src/dictate/ui_server.py` — the loopback control server the UI talks to; the
+ tray's **Open Settings…** launches the shell (falling back to native dialogs).
+ - See [`design/PLAN.md`](design/PLAN.md) for the cross-platform plan.
+
+ **Install it like a normal app:** tagged releases attach a **self-contained**
+ Linux **`.deb`** and **`.AppImage`** — they bundle the frozen Python engine
+ inside (PyInstaller sidecar), so there's no separate Python/pip step. Download,
+ install, launch; speech models download on first use.
+
+ ```bash
+ sudo apt install ./dictate_*_amd64.deb # or: chmod +x Dictate_*.AppImage && ./Dictate_*.AppImage
+ ```
+
+ The app lives in the tray (Open Settings / Quit) and does push-to-talk straight
+ away. Build the package yourself in one step with
+ [`scripts/build-linux-desktop.sh`](scripts/build-linux-desktop.sh) — see
+ [`ui-shell/README.md`](ui-shell/README.md). The `pip`/`install.sh` route remains
+ for source/dev installs.
+
+ ## Docs
+
+ - [Windows 11 support](docs/windows-11.md)
+ - [Recent History spec](docs/recent-dictation-history-spec.md)
+ - [Release/versioning](docs/release-versioning.md)
+ - [Desktop packaging & CI runbook](docs/desktop-packaging.md)
+ - [Microsoft Store automation](docs/msstore-automation.md)
+ - [Microsoft Store listing draft](docs/msstore-listing.md)
+ - [Windows release goal](docs/goal.md)
+ - [Development streams](docs/development-streams.md)
+ - [Security policy](SECURITY.md)
+
+ ## License
+
+ MIT. See [LICENSE](LICENSE).
Binary file
Binary file
Binary file
Binary file
@@ -0,0 +1,17 @@
+ hotwords: []
+ push_to_talk_combo: ctrl_r
+ stt_backend: faster-whisper
+ stt_compute_type: int8
+ stt_device: auto
+ # Local Whisper uses the single supported local model:
+ # stt_model: turbo
+ # To avoid local compute, set one hosted backend:
+ # stt_backend: openai
+ # stt_model: gpt-4o-mini-transcribe
+ # openai_api_key_command: /path/to/command/that/prints/the/key
+ # stt_backend: xai
+ # stt_model: grok-speech-to-text
+ # xai_api_key_command: /path/to/command/that/prints/the/key
+ # stt_backend: gemini
+ # stt_model: gemini-3-flash-preview
+ # gemini_api_key_command: /path/to/command/that/prints/the/key
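The commented `*_api_key_command` settings point at "a command that prints the key". One plausible way to consume such a setting is sketched below. The function name `read_key_from_command`, the use of `shlex.split`, the timeout, and the whitespace stripping are all assumptions for illustration; the diff does not show how Dictate actually runs the command. The sketch assumes POSIX-style quoting.

```python
import shlex
import subprocess
import sys


def read_key_from_command(command: str, timeout: float = 10.0) -> str:
    """Run `command` without a shell and return its stdout as the key."""
    result = subprocess.run(
        shlex.split(command),   # split into argv, no shell interpolation
        capture_output=True,
        text=True,
        check=True,             # raise CalledProcessError on non-zero exit
        timeout=timeout,
    )
    key = result.stdout.strip()
    if not key:
        raise ValueError("key command printed nothing")
    return key


# Stand-in for a real secret-manager CLI: a command that prints a fake key.
demo_command = shlex.join([sys.executable, "-c", "print('sk-demo-key')"])
```

Keeping the key behind a command means the config file only stores a path, which matches the README's statement that raw API keys are not intentionally written to `config.yaml`.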