@codexstar/pi-listen 1.0.4

# pi-voice QA results

This document records the interactive/manual QA evidence gathered after the onboarding overhaul and the follow-up model-aware onboarding work.

## Verification baseline

Fresh automated verification completed successfully before and after manual QA checks:

```sh
bun run check
```

That run currently covers:
- config migration / scope tests
- onboarding fallback / finalization tests
- diagnostics recommendation tests
- provisioning-plan tests
- model-detection metadata tests
- model-aware recommendation and labeling tests
- model-readiness confidence tests (`installed` / `download required` / `unknown` / `api`)
- TypeScript compilation
- Python compilation

## Manual / RPC-assisted checks completed

The checks below were run against the real Pi RPC mode and the current `extensions/voice.ts` implementation.

### 1. Fresh install prompts for onboarding on startup
**Result:** pass

Observed startup UI requests for a clean HOME / clean cwd:

```json
{
  "case": "fresh-first-run",
  "titles": [
    "setStatus",
    "Set up pi-voice now?"
  ],
  "selectTitle": "Set up pi-voice now?",
  "options": [
    "Start voice setup",
    "Remind me later"
  ]
}
```

### 2. Partial legacy config still re-enters onboarding
**Result:** pass

Seeded legacy config:

```json
{ "voice": { "enabled": true } }
```

Observed startup prompt:

```json
{
  "case": "partial-legacy",
  "selectTitle": "Set up pi-voice now?",
  "options": [
    "Start voice setup",
    "Remind me later"
  ]
}
```

This confirms partial legacy config no longer counts as fully onboarded.

### 3. Remind-me-later suppresses startup prompt and `/voice reconfigure` reopens setup
**Result:** pass

Seeded config with recent `onboarding.skippedAt` and observed no startup onboarding prompt before sending `/voice reconfigure`.

Observed UI requests:

```json
{
  "case": "skipped-then-reconfigure",
  "titles": [
    "setStatus",
    "How do you want to use speech-to-text?"
  ]
}
```

This confirms the defer window suppresses startup prompting while keeping reconfiguration available.

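The defer behavior verified above implies a simple time-window check. A minimal sketch follows; the field names beyond `onboarding.skippedAt`, the helper name, and the 24-hour window are assumptions for illustration, not the extension's actual code:

```typescript
// Hypothetical defer-window check, assuming a config shape like
// { onboarding: { skippedAt?: string } }. The 24-hour window is an assumed value.
const DEFER_WINDOW_MS = 24 * 60 * 60 * 1000;

interface OnboardingState {
  completed?: boolean;
  skippedAt?: string; // ISO timestamp written when the user picks "Remind me later"
}

function shouldPromptOnStartup(state: OnboardingState, now: Date = new Date()): boolean {
  if (state.completed) return false;
  if (state.skippedAt) {
    const skipped = Date.parse(state.skippedAt);
    // Suppress the startup prompt while the defer window is still open.
    if (!Number.isNaN(skipped) && now.getTime() - skipped < DEFER_WINDOW_MS) {
      return false;
    }
  }
  return true;
}
```

`/voice reconfigure` would bypass such a check entirely, which matches the observed behavior.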
### 4. Reconfigure flow saves project-scoped config
**Result:** pass

Executed `/voice reconfigure` in RPC mode and selected:
- Cloud API
- Deepgram
- `nova-3`
- Project scope

Observed notify + saved config:

```json
{
  "case": "project-scope-save",
  "notify": "Voice setup saved, but action is still required.\nMode: Cloud API\nBackend: deepgram\nModel: nova-3\nScope: project\n...",
  "savedVoice": {
    "version": 2,
    "enabled": true,
    "language": "en",
    "mode": "api",
    "backend": "deepgram",
    "model": "nova-3",
    "scope": "project",
    "btwEnabled": true,
    "onboarding": {
      "completed": false,
      "schemaVersion": 2,
      "source": "repair"
    }
  }
}
```

This confirms:
- reconfigure launches the onboarding flow
- project scope writes to `.pi/settings.json`
- incomplete provisioning/validation leaves onboarding in repair-needed state rather than falsely complete

### 5. Local path still exposes backend choices with install hints
**Result:** pass

Observed local backend selection options:

```json
{
  "case": "local-fallback-options",
  "backendOptions": [
    "faster-whisper — available",
    "moonshine — pip install useful-moonshine[onnx]",
    "whisper-cpp — brew install whisper-cpp",
    "parakeet — pip install nemo_toolkit[asr]"
  ]
}
```

This confirms the local flow does not dead-end when only some backends are installed and still surfaces install guidance.

### 6. `/voice doctor` separates current repair from recommended alternative
**Result:** pass

Observed output:

```json
{
  "case": "voice-doctor",
  "notify": "Voice doctor:\n python3: OK\n sox/rec: missing\n brew: OK\n deepgram key: missing\n available backends: faster-whisper (local)\n\nCurrent config: api/deepgram/nova-3\nRepair current setup:\n - brew install sox\n - Set DEEPGRAM_API_KEY before using Deepgram API mode\n\nRecommended alternative: local/faster-whisper/small\nWhy: Recommended local default with good balance of quality and setup effort.\nFixable issues:\n - Install SoX for microphone recording\nSuggested commands for the recommendation:\n - brew install sox"
}
```

This confirms doctor output now distinguishes:
- how to repair the current saved config
- what the system recommends instead

### 7. Model-aware backend metadata is now exposed from `transcribe.py`
**Result:** pass

Observed `--list-backends` metadata snippet:

```json
[
  {
    "name": "faster-whisper",
    "installed_models": [],
    "install_detection": "huggingface-cache"
  },
  {
    "name": "moonshine",
    "installed_models": [],
    "install_detection": "moonshine-cache-heuristic"
  },
  {
    "name": "whisper-cpp",
    "installed_models": [],
    "install_detection": "whisper-cpp-model-paths"
  },
  {
    "name": "deepgram",
    "installed_models": [],
    "install_detection": "api-key"
  },
  {
    "name": "parakeet",
    "installed_models": [],
    "install_detection": "huggingface-or-nemo-cache"
  }
]
```

This confirms the backend scan now exposes machine-readable model-readiness metadata, even when no models are currently detected on disk.

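The readiness states covered in the verification baseline (`installed` / `download required` / `unknown` / `api`) can be derived from metadata of this shape. The sketch below is an assumption about the mapping, not the extension's actual implementation; only the `BackendMeta` shape mirrors the JSON above:

```typescript
// Sketch: map --list-backends metadata plus a selected model to a readiness
// label. The classification rules here are assumptions for illustration.
interface BackendMeta {
  name: string;
  installed_models: string[];
  install_detection: string;
}

type ModelStatus = "installed" | "download required" | "unknown" | "api";

function modelStatus(meta: BackendMeta, model: string): ModelStatus {
  // Cloud backends have no local model cache; readiness hinges on the API key.
  if (meta.install_detection === "api-key") return "api";
  if (meta.installed_models.includes(model)) return "installed";
  // Heuristic detectors cannot confirm absence with confidence.
  if (meta.install_detection.includes("heuristic")) return "unknown";
  return "download required";
}
```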
### 8. Local reconfigure path stays model-aware even when no local model is currently cached
**Result:** pass

Observed reconfigure options in the current environment:

```json
{
  "backendOptions": [
    "faster-whisper — backend ready",
    "moonshine — pip install useful-moonshine[onnx]",
    "whisper-cpp — brew install whisper-cpp",
    "parakeet — pip install nemo_toolkit[asr]"
  ],
  "modelOptions": [
    "tiny",
    "tiny.en",
    "base",
    "base.en",
    "small (recommended)",
    "small.en",
    "medium",
    "medium.en",
    "large-v3",
    "large-v3-turbo",
    "distil-small.en",
    "distil-medium.en",
    "distil-large-v3"
  ]
}
```

This confirms:
- model-aware metadata does not break the local onboarding path when no cached models are found
- the user still gets a sensible backend list and model selection flow
- the recommended model remains visible even without an installed-model hit

### 9. Model-aware logic covers both installed-model and missing-model paths
**Result:** pass (automated coverage), release-machine confirmation still recommended

Automated model-aware checks currently cover:
- installed-model recommendation preference
- installed-model labeling in onboarding
- model readiness states for local vs cloud backends
- provisioning behavior when backend is available but selected model is not installed
- transcribe metadata contract including `installed_models` and `install_detection`

This gives regression safety for the model-aware phase even though the current QA machine does not have local STT model caches populated.

### 10. `/voice info` surfaces current model readiness and scope details
**Result:** pass

Observed output:

```text
Voice config:
  enabled: true
  mode: local
  scope: global
  backend: faster-whisper
  model: small
  model status: download required
  language: en
  state: idle
  setup: complete (setup-command)
  socket: /tmp/.../pi-voice-<hash>.sock
  daemon: stopped
```

This confirms `info` now reports:
- selected mode/backend/model
- model readiness state
- scope
- config-scoped socket path
- setup state

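The `pi-voice-<hash>.sock` path shown above suggests the socket name is derived from a stable hash so that different configs get different sockets. One way this could be done is sketched below; the hash input, truncation length, and filename pattern are all assumptions, not the extension's actual derivation:

```typescript
// Hypothetical sketch: derive a per-config socket path like
// /tmp/.../pi-voice-<hash>.sock by hashing the resolved config location,
// so sockets do not collide across projects.
import { createHash } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

function socketPathFor(configPath: string): string {
  // Truncated sha256 gives a short, stable, filename-safe identifier.
  const hash = createHash("sha256").update(configPath).digest("hex").slice(0, 12);
  return join(tmpdir(), `pi-voice-${hash}.sock`);
}
```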
### 11. `/voice backends` surfaces model-aware backend summaries
**Result:** pass

Observed output:

```text
Backends:
  + faster-whisper  local  no confirmed installed models
      detection: huggingface-cache
  - moonshine       local  install: pip install useful-moonshine[onnx]
      detection: moonshine-cache-heuristic
  - whisper-cpp     local  install: brew install whisper-cpp
      detection: whisper-cpp-model-paths
  - deepgram        cloud  needs setup: Set DEEPGRAM_API_KEY env var (free: deepgram.com)
      detection: api-key
  - parakeet        local  install: pip install nemo_toolkit[asr]
      detection: huggingface-or-nemo-cache
```

This confirms `backends` now distinguishes:
- installed models when present
- no confirmed installed models
- API readiness / setup-needed cloud wording
- install detection source hints

### 12. `/voice test` reports current model status plus missing-model guidance
**Result:** pass

Observed output:

```text
Voice test:
  mode: local
  backend: faster-whisper
  model: small
  model status: download required
  language: en
  onboarding: complete
  python3: OK
  sox/rec: missing
  daemon: not running

Suggested commands:
  - brew install sox

Manual steps:
  - Selected model small is not installed yet and may need to be downloaded on first use
```

This confirms `test` now combines:
- current model readiness
- recording dependency status
- targeted install/manual guidance for the selected model path

## Remaining target-machine checks

The highest-value interactive checks have been exercised. Remaining follow-up checks are mainly environment- or hardware-specific, especially for the new model-aware behavior.

### Strongly recommended on the target release machine
- real microphone capture with user audio input
- real Deepgram end-to-end API validation with a valid key
- local STT success path on a machine with SoX installed and the desired backend available
- at least one path where a local model is **already cached** so onboarding can display an explicit installed-model path (`already installed`, `ready now`, or equivalent)
- at least one path where a backend is installed but the selected model is **not** cached, to confirm the download-required messaging is clear
- `/voice info` and `/voice test` on a machine with real model caches, so the displayed model status can be compared against actual local assets

### Why these are still worth running
The current environment used for QA did not have local STT model caches populated, so the model-aware logic was verified through:
- automated tests
- backend metadata checks
- RPC-assisted onboarding checks

That is sufficient for regression safety and release evidence, but a final pass on a machine with real cached local models would provide the strongest product validation for the installed-model UX.

## Release signoff

Based on:
- `bun run check`
- the manual/RPC-assisted onboarding and repair checks above
- the model-detection metadata and model-aware onboarding checks documented here

`pi-voice` now satisfies the current release-hardening checklist for the onboarding overhaul and the initial model-aware onboarding upgrade, with the remaining checks clearly isolated to target-machine / real-audio validation.
# pi-voice troubleshooting

This guide focuses on the current `pi-voice` behavior and the most likely setup/runtime issues.

## First things to check

Run these built-in commands first:

- `/voice info` — shows the config the extension currently treats as active
- `/voice test` — checks SoX, daemon state, and current model readiness
- `/voice backends` — lists detected STT backends, installed models, and install hints
- `/voice doctor` — compares how to repair the current config vs a recommended alternative
- `/voice daemon status` — shows the current daemon backend/model state
- `/voice setup` — re-run backend/model selection

If you only do one thing, start with `/voice test`.

## Symptom: "Voice requires SoX. Install: brew install sox"

### What it means
`pi-voice` could not find the `rec` command used for audio recording.

### Fix
Install SoX:

```sh
brew install sox
```

Then restart Pi or run `/voice test` again.

### Why this matters
Without SoX, the extension cannot record microphone input, even if the transcription backend itself is installed correctly.

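The missing-`rec` detection amounts to checking whether an executable exists on `PATH`, like `command -v rec` in a shell. A small sketch of that lookup follows; the helper name is hypothetical and the real extension may detect SoX differently:

```typescript
// Sketch: scan PATH directories for an executable (e.g. SoX's `rec`).
// Not the extension's actual detection code.
import { accessSync, constants } from "node:fs";
import { delimiter, join } from "node:path";

function findOnPath(cmd: string, pathVar: string = process.env.PATH ?? ""): string | null {
  for (const dir of pathVar.split(delimiter)) {
    if (!dir) continue;
    const candidate = join(dir, cmd);
    try {
      accessSync(candidate, constants.X_OK); // executable by the current user
      return candidate;
    } catch {
      // not present in this directory; keep scanning
    }
  }
  return null;
}
```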
## Symptom: `/voice backends` shows everything as unavailable

### What it means
No STT backend is currently detected.

### Common fixes
Choose one path:

#### Local default path
```sh
python3 -m pip install faster-whisper
```

#### Lightweight local path
```sh
python3 -m pip install 'useful-moonshine[onnx]'
```

#### whisper.cpp path
```sh
brew install whisper-cpp
```

#### Cloud path
Set a Deepgram API key in your shell environment:

```sh
export DEEPGRAM_API_KEY=your_key_here
```

Then restart Pi so the environment is visible to the extension.

## Symptom: `/voice test` says `SoX (rec): OK` but `Daemon: not running`

### What it means
The warm daemon is not currently running. This is not always fatal because `pi-voice` can still fall back to direct transcription subprocesses.

### Fix
Start it manually:

```text
/voice daemon start
```

Then inspect it:

```text
/voice daemon status
```

### If it still will not start
Check Python availability and backend installation:

```sh
python3 --version
python3 transcribe.py --list-backends
```

## Symptom: `/voice daemon status` shows the wrong backend or model

### What it means
The running daemon does not match the config you expect.

Recent work in this repo is moving toward config-specific sockets and more explicit backend/model requests, but if you still see mismatch behavior, treat it as a runtime desynchronization issue.

### Fixes
1. Re-run setup:
   ```text
   /voice setup
   ```
2. Stop the daemon:
   ```text
   /voice daemon stop
   ```
3. Start it again:
   ```text
   /voice daemon start
   ```
4. Re-check:
   ```text
   /voice daemon status
   ```

## Symptom: recording starts, but transcription is empty or says "No speech detected"

### Likely causes
- recording was too short
- microphone input level is too low
- background noise or device permissions interfered
- the backend is installed but not functioning correctly for the chosen model

### Fixes
- hold the record key a bit longer
- try `/voice test` first to validate microphone capture
- confirm the recorded sample file is not empty
- switch to a more conservative model/backend through `/voice setup`

## Symptom: cloud setup is selected, but transcription still fails

### Likely causes
- `DEEPGRAM_API_KEY` is missing or invalid
- Pi was launched before the shell environment contained the key
- network access is blocked or failing

### Fixes
1. Verify the environment variable exists in the shell that launches Pi:
   ```sh
   echo $DEEPGRAM_API_KEY
   ```
2. Restart Pi after setting the variable.
3. Confirm the backend is detected:
   ```sh
   python3 transcribe.py --list-backends
   ```
4. Re-run `/voice setup` if needed.

## Symptom: backend is installed, but the selected model is still reported as missing

### What it means
`pi-voice` can see the backend package or CLI, but it cannot find the specific model you selected in any local cache.

### Typical examples
- `faster-whisper` installed, but `medium` or `large-v3-turbo` not cached yet
- `whisper-cpp` installed, but no `ggml-<model>.bin` file found
- backend available, but onboarding marks the selected model as **download required**

### Fixes
- choose an **installed** model in onboarding if one is already available
- keep the current model and allow first use to download it if that is acceptable
- use `/voice backends` to inspect installed-model hints
- use `/voice doctor` to compare your current setup with a recommended alternative

## Symptom: backend is installed, but model status is unknown

### What it means
The backend package exists, but `pi-voice` cannot verify local model presence with high confidence for that backend.

This is a conservative result, not necessarily an error.

### Fixes
- try the chosen model anyway if you expect it to already exist
- use `/voice test` and `/voice doctor` to see whether repair is still needed
- if you want a more deterministic local path, prefer a backend with stronger model detection, such as `faster-whisper` or `whisper-cpp`

## Symptom: local backend selected, but transcription is slow

### What it means
The chosen local model may be too heavy for the current machine or use case.

### Fixes
- switch to a smaller model (`small`, `small.en`, or backend default)
- prefer an already-installed smaller model if onboarding shows one
- prefer `faster-whisper` as the conservative local default
- use cloud mode if setup speed and responsiveness matter more than privacy/offline behavior

## Symptom: project config is ignored

### What it means
Either:
- the config was saved globally instead of at project scope, or
- the project does not have `.pi/settings.json`, or
- an older config file is still being read

### Fixes
1. Re-run setup and select **Project only** when prompted.
2. Inspect both files:
   - `~/.pi/agent/settings.json`
   - `.pi/settings.json`
3. Remember that project settings are intended to override global settings.

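The intended precedence can be pictured as a shallow merge in which project values win key by key. This is a sketch of the concept only; the real settings loader may merge differently, and the `VoiceSettings` shape here is illustrative rather than the full schema:

```typescript
// Sketch: project-scope settings shadow global settings key by key.
interface VoiceSettings {
  enabled?: boolean;
  mode?: "local" | "api";
  backend?: string;
  model?: string;
}

function resolveVoiceSettings(global: VoiceSettings, project: VoiceSettings): VoiceSettings {
  // Later spread wins: any key present in the project file overrides the global one,
  // while keys the project omits fall through to the global value.
  return { ...global, ...project };
}
```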
## Symptom: the hold-to-talk shortcut does nothing

### Current behavior to remember
- hold **Space** to talk only when the editor is empty
- `Ctrl+Shift+V` is the fallback toggle shortcut
- `Ctrl+Shift+B` is the BTW voice shortcut

### Fixes
- make sure the editor is empty before using hold-Space
- try `Ctrl+Shift+V` instead
- use `/voice on` if voice was disabled
- run `/voice info` to confirm `enabled: true`

## Symptom: "Recording too short" or "No audio recorded"

### What it means
The audio file was missing, too small, or recording ended before a usable sample was captured.

### Fixes
- hold the key slightly longer
- try a direct microphone test via `/voice test`
- confirm SoX can record in your environment
- avoid tapping the shortcut too quickly

## Manual backend checks

These are useful outside Pi too:

```sh
python3 transcribe.py --list-backends
python3 daemon.py ping
python3 daemon.py status
```

If you are debugging local setup, `--list-backends` is usually the most useful first command because it now includes installed-model hints and detection metadata.

## When to re-run setup

Use `/voice setup` again when:
- switching from cloud to local or vice versa
- changing model sizes
- moving from global to project scope
- recovering from a broken dependency install

## If you are still stuck

Capture these four pieces of information before debugging further:

1. `/voice info` output
2. `/voice test` output
3. `/voice backends` output
4. `/voice daemon status` output

That is usually enough to identify whether the issue is:
- recording
- backend installation
- selected model missing vs already installed
- model status unknown
- API credentials
- daemon state
- config scope