@codexstar/pi-listen 1.0.4

# pi-voice model-aware review

## Scope reviewed

Focused review of the current model-aware phase in:
- `transcribe.py`
- `extensions/voice/diagnostics.ts`
- `extensions/voice/onboarding.ts`
- `extensions/voice/install.ts`
- `extensions/voice.ts`

I also sanity-checked the stated behavior against the current implementation shape and looked specifically for:
- false claims
- command UX inconsistencies
- installed-model edge cases
- regression risks

---

## Summary

The model-aware phase is moving in the right direction:
- backend discovery now emits installed-model metadata
- recommendations can prefer an installed local model
- onboarding labels installed models distinctly
- provisioning distinguishes between backend-missing and model-missing cases
- doctor/test/info now surface model-aware state

However, a few important issues remain before this can be called fully product-grade.

---

## Findings

### 1. Onboarding stops offering alternative local backends as soon as any local backend is discovered
**Severity:** high

**Where:**
- `extensions/voice/onboarding.ts:81-83`

**What happens:**
`buildSelectableBackends("local", diagnostics)` returns `discoveredLocalBackends` immediately when *any* local backend is discovered. That means if the machine has only `faster-whisper` installed, onboarding will no longer offer:
- `moonshine`
- `whisper-cpp`
- `parakeet`

with install hints.

**Why it matters:**
This conflicts with the intended UX:
- “you already have this, we can configure it now”
- but still let the user pick another backend if they want

Right now the model-aware path improves the ready-now story, but narrows backend choice too aggressively.

**Recommended fix:**
Return a merged local backend list:
- discovered backends first
- then undiscovered fallback backends with install hints
- deduplicated by backend name

That preserves the “already installed” path without removing other selectable options.
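
A minimal sketch of that merge, in Python for brevity (the real code is TypeScript in `onboarding.ts`); the backend names come from this review, but the function shape and the `hint` placeholders are hypothetical:

```python
# Hypothetical sketch of the merged selectable-backend list; the actual
# implementation is buildSelectableBackends() in extensions/voice/onboarding.ts.

# Fallback local backends with placeholder install hints (hints are illustrative).
FALLBACK_LOCAL_BACKENDS = [
    {"name": "faster-whisper", "hint": "<install hint>"},
    {"name": "moonshine", "hint": "<install hint>"},
    {"name": "whisper-cpp", "hint": "<install hint>"},
    {"name": "parakeet", "hint": "<install hint>"},
]

def build_selectable_backends(discovered: list[dict]) -> list[dict]:
    """Discovered backends first, then undiscovered fallbacks, deduped by name."""
    seen = {backend["name"] for backend in discovered}
    merged = list(discovered)
    for fallback in FALLBACK_LOCAL_BACKENDS:
        if fallback["name"] not in seen:
            merged.append(fallback)
    return merged
```

Discovered entries keep their richer metadata (installed models, paths) because they are appended first and win the dedup.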

---

### 2. `faster-whisper` model detection likely produces false negatives for some models
**Severity:** high

**Where:**
- `transcribe.py:86-93`

**What happens:**
`faster_whisper_repo_ids()` assumes:
- standard models map to `Systran/faster-whisper-<model>`
- distil models map to `distil-whisper/<model>`

But the actual repo mapping for some models differs. In particular:
- `large-v3-turbo` is not guaranteed to live under `Systran/faster-whisper-large-v3-turbo`
- distil model repos are not reliably `distil-whisper/<model>`

**Why it matters:**
The product will claim “download required” or fail to recognize installed models when they already exist.

This undermines the core model-aware onboarding promise.

**Recommended fix:**
Use the real `faster_whisper` model mapping if available from the library, or maintain an explicit repo map for the supported model IDs instead of constructing repo IDs heuristically.

---

### 3. Heuristic `installed_models` are treated like high-confidence installed models in recommendation ranking
**Severity:** medium

**Where:**
- `extensions/voice/diagnostics.ts:89-97`
- `extensions/voice/diagnostics.ts:118-132`

**What happens:**
`getPreferredLocalBackend()` prefers the first local backend with any `installed_models`, regardless of detection confidence.

That means a heuristic backend such as:
- `moonshine`
- `parakeet`

can outrank a safer backend if its heuristic reports an installed model.

**Why it matters:**
The code already treats some detectors as low confidence (`unknown` in `getModelReadiness()`), but recommendation ranking does not use that same confidence model.

So the recommendation engine can still over-trust heuristic detections.

**Recommended fix:**
In recommendation ranking, rank high-confidence detections ahead of heuristic ones, for example:
1. high-confidence installed model
2. high-confidence available backend
3. heuristic installed model
4. heuristic available backend

At minimum, gate “already installed and ready to configure” recommendation language behind the same confidence rules used by `getModelReadiness()`.

---

### 4. `/voice backends` output is still backend-centric when a backend is available but no installed models are confirmed
**Severity:** low

**Where:**
- `extensions/voice.ts:915-931`

**What happens:**
If a backend is available and `installed_models` is empty, the output falls back to:
- `models: <count>`

That is not wrong, but it is weaker than the new model-aware UX elsewhere.

**Why it matters:**
Users comparing `/voice backends` with onboarding/doctor/test may get less clarity here than they expect.

Example ambiguity:
- backend ready
- zero confirmed installed models
- but output only says `models: 13`

That does not tell the user whether the likely next state is:
- download required
- unknown confidence
- or ready-now via API

**Recommended fix:**
Use model-aware wording here too, for example:
- `installed: small, medium`
- `no confirmed installed models`
- `model detection: unknown confidence`
- `api ready`
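
One way to sketch that wording (Python for brevity; the real code is TypeScript in `extensions/voice.ts`, and the field names here are assumptions about the backend record shape):

```python
# Hypothetical backend record; "installed_models", "kind", and
# "model_confidence" are assumed field names for illustration.
def describe_models(backend: dict) -> str:
    installed = backend.get("installed_models") or []
    if installed:
        return "installed: " + ", ".join(installed)
    if backend.get("kind") == "api":
        return "api ready"
    if backend.get("model_confidence") == "unknown":
        return "model detection: unknown confidence"
    return "no confirmed installed models"
```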

---

### 5. Moonshine cache detection may under-detect because it only checks directories, not concrete model files
**Severity:** low

**Where:**
- `transcribe.py:131-140`
- `_existing_dirs()` at `transcribe.py:28-34`

**What happens:**
The moonshine fallback local candidates are passed through `_existing_dirs()`, which only returns existing directories. If a practical moonshine install stores key artifacts as files rather than dedicated directories, this logic will miss them.

**Why it matters:**
This will most likely cause false negatives rather than false positives, but it still weakens the ready-now experience for moonshine.

**Recommended fix:**
If moonshine detection remains heuristic, consider checking both:
- directories
- likely file artifacts under those roots

Or explicitly document the backend as lower-confidence and keep recommendation weight conservative.
176
+
177
+ ---
178
+
179
+ ## What looks good
180
+
181
+ These parts are solid and worth keeping:
182
+ - model-aware tests were added before/with behavior changes
183
+ - `getModelReadiness()` now distinguishes high-confidence local detection from heuristic/unknown paths
184
+ - provisioning wording is more conservative for heuristic backends
185
+ - doctor output separates repair current setup from recommended alternative
186
+ - onboarding model labels are much better than before
187
+
188
+ ---
189
+
190
+ ## Recommended priority order
191
+
192
+ 1. **Fix local backend list merging in onboarding**
193
+ 2. **Fix `faster-whisper` repo/model mapping**
194
+ 3. **Make recommendation ranking confidence-aware**
195
+ 4. **Polish `/voice backends` output to match model-aware UX**
196
+ 5. **Optionally improve moonshine detection fidelity**

# pi-voice model-detection QA / release execution checklist

## Objective

Validate the next onboarding iteration where `pi-voice` can detect already-available models, prefer ready-to-use local setups when appropriate, and clearly distinguish:
- **already installed / ready now**
- **backend installed but model missing**
- **download required**
- **cloud/API path**

This checklist focuses on release confidence for **model-detection-aware onboarding**, not just the current onboarding baseline.

---

## Release gates

A release is not ready until all four gates pass.

### Gate 1 — Static / automated checks
- [ ] `bun run check`
- [ ] model-detection unit tests pass
- [ ] onboarding recommendation tests pass with installed-model scenarios
- [ ] provisioning-plan tests pass with installed vs missing model scenarios
- [ ] any new Python smoke checks for model discovery pass

### Gate 2 — Onboarding behavior
- [ ] first-run onboarding still launches correctly
- [ ] onboarding clearly marks installed models as available immediately
- [ ] onboarding does not ask users to re-download models already present
- [ ] onboarding still offers alternative models/backends when installed assets exist
- [ ] onboarding summary accurately reflects whether validation is complete or repair is still required

### Gate 3 — Runtime correctness
- [ ] selected installed model is actually used at runtime
- [ ] model-detection state does not get out of sync with daemon/runtime state
- [ ] project scope vs global scope still works correctly after model-aware setup
- [ ] `/voice info`, `/voice test`, and `/voice doctor` reflect installed-model state accurately

### Gate 4 — Docs / support readiness
- [ ] README explains installed-model detection behavior
- [ ] backend docs explain when cached/existing models may be reused
- [ ] troubleshooting docs include “backend installed but model missing” and “model detected but validation failed” cases
- [ ] QA evidence is captured in `docs/qa-results.md`

---
46
+
47
+ ## Test matrix
48
+
49
+ ## A. Fresh install / no cached models
50
+
51
+ ### A1. Fresh startup, no config, no local model assets
52
+ - [ ] onboarding prompt appears
53
+ - [ ] user can choose API or Local
54
+ - [ ] local path marks all local options as requiring install/download
55
+ - [ ] doctor output does not falsely claim any model is already available
56
+
57
+ ### A2. Local mode with nothing installed
58
+ - [ ] onboarding still offers backend choices
59
+ - [ ] install guidance appears for backend + model path
60
+ - [ ] completion state remains `repair` / incomplete until validation succeeds
61
+
62
+ ### A3. API mode with no key
63
+ - [ ] Deepgram path clearly says API key is missing
64
+ - [ ] onboarding does not mislabel API mode as “ready now”
65
+
66
+ ---
67
+
68
+ ## B. Existing installed local model paths
69
+
70
+ ### B1. Backend installed, model already cached
71
+ Example target: `faster-whisper` backend available and chosen model already present.
72
+ - [ ] onboarding highlights the model as **already installed** or equivalent
73
+ - [ ] recommendation prefers the installed model when it matches user goals reasonably well
74
+ - [ ] provisioning does not suggest re-downloading that same model
75
+ - [ ] summary says the model is ready for immediate configuration
76
+ - [ ] runtime validation uses the installed model successfully
77
+
78
+ ### B2. whisper.cpp backend installed with model file already present
79
+ - [ ] onboarding can identify existing whisper.cpp model file
80
+ - [ ] model is marked available without requiring re-download
81
+ - [ ] runtime uses the located model path successfully
82
+ - [ ] doctor output reports the model as found, not merely backend available
83
+
84
+ ### B3. Multiple installed local models
85
+ - [ ] onboarding distinguishes between multiple installed models
86
+ - [ ] recommended option is sensible and clearly justified
87
+ - [ ] user can override recommendation and choose a different installed model
88
+ - [ ] saving the non-default installed model persists correctly
89
+
90
+ ### B4. Installed local model but missing SoX
91
+ - [ ] onboarding correctly says model is ready but recording path still needs SoX
92
+ - [ ] summary distinguishes **model ready** vs **recording dependency missing**
93
+ - [ ] repair state is used instead of complete state until validation passes
94
+
95
+ ---
96
+
97
+ ## C. Backend installed, model missing
98
+
99
+ ### C1. Backend available but requested model not present
100
+ Example: `faster-whisper` available, `medium` not downloaded.
101
+ - [ ] onboarding marks backend as installed but selected model as **download required**
102
+ - [ ] recommendation may prefer an already installed smaller model if appropriate
103
+ - [ ] provisioning suggests only the missing model path, not a full backend reinstall
104
+ - [ ] summary and doctor explain the difference clearly
105
+
106
+ ### C2. whisper.cpp installed but no model file found
107
+ - [ ] onboarding does not say whisper.cpp is fully ready
108
+ - [ ] onboarding explains that model files are missing
109
+ - [ ] doctor separates “install backend” from “obtain model file” if backend already exists
110
+
111
+ ### C3. Partial cache / corrupted model asset
112
+ - [ ] discovery does not falsely mark model as ready if validation fails
113
+ - [ ] onboarding/doctor route user to repair path
114
+ - [ ] runtime does not mark onboarding complete after failed validation
115
+
116
+ ---
117
+
118
+ ## D. Cloud/API branch
119
+
120
+ ### D1. Cloud path with valid key
121
+ - [ ] API mode remains the fastest setup path
122
+ - [ ] onboarding does not incorrectly prefer stale local detection over an explicitly selected API choice
123
+ - [ ] completion state becomes complete after validation succeeds
124
+
125
+ ### D2. Cloud path with installed local model also present
126
+ - [ ] recommendation explains the tradeoff clearly
127
+ - [ ] API path remains selectable even when local is ready
128
+ - [ ] selected API mode is respected and saved
129
+ - [ ] doctor distinguishes current API config from local recommended alternative
130
+
131
+ ---
132
+
133
+ ## E. Migration and persistence
134
+
135
+ ### E1. Existing legacy config + installed local model
136
+ - [ ] migration does not skip onboarding incorrectly for partial legacy configs
137
+ - [ ] onboarding can suggest the already installed model immediately
138
+ - [ ] saved config includes the correct versioned onboarding state
139
+
140
+ ### E2. Reconfigure from API -> installed local model
141
+ - [ ] `/voice reconfigure` detects the local installed model
142
+ - [ ] reconfigure flow offers the installed model without download guidance
143
+ - [ ] config updates correctly
144
+ - [ ] runtime/doctor/info reflect the new local state
145
+
146
+ ### E3. Reconfigure from local -> API
147
+ - [ ] existing local detection does not block switching to API
148
+ - [ ] cloud config saves cleanly
149
+ - [ ] doctor still reports local assets accurately as alternatives
150
+
151
+ ### E4. Scope behavior
152
+ - [ ] global save still writes global settings
153
+ - [ ] project save still writes `.pi/settings.json`
154
+ - [ ] project-level model-aware config overrides global config cleanly
155
+ - [ ] local installed-model detection still behaves correctly under either scope
156
+
157
+ ---
158
+
159
+ ## F. Runtime and daemon regression checks
160
+
161
+ ### F1. Config-scoped socket behavior
162
+ - [ ] switching between projects/scopes/backends/models does not silently reuse stale daemon state
163
+ - [ ] already installed model in one scope does not cause another scope to falsely appear ready unless it truly is
164
+
165
+ ### F2. `/voice info`
166
+ - [ ] reports mode/backend/model/scope accurately
167
+ - [ ] if model-detection metadata is added, it reports installed vs missing status accurately
168
+
169
+ ### F3. `/voice test`
170
+ - [ ] reports installed-model readiness accurately
171
+ - [ ] does not imply success when model is missing
172
+ - [ ] still exercises mic sample flow correctly
173
+
174
+ ### F4. `/voice doctor`
175
+ - [ ] shows current-config repair path first
176
+ - [ ] shows recommended alternative separately
177
+ - [ ] reports model-ready vs model-missing status clearly
178
+
179
+ ### F5. Hold-to-talk regression
180
+ - [ ] hold `Space` still works when editor is empty
181
+ - [ ] `Ctrl+Shift+V` fallback still works
182
+ - [ ] `Ctrl+Shift+B` BTW voice path still works
183
+
184
+ ---
185
+
186
+ ## Suggested execution order
187
+
188
+ 1. **Automated checks**
189
+ - unit tests for model detection and recommendation changes
190
+ - `bun run check`
191
+ 2. **Fresh install / no model path**
192
+ 3. **Installed local model happy path**
193
+ 4. **Backend installed / model missing path**
194
+ 5. **API branch with and without local alternatives**
195
+ 6. **Migration + reconfigure paths**
196
+ 7. **Daemon/runtime regressions**
197
+ 8. **Docs verification and QA result capture**
198
+
199
+ ---
200
+
201
+ ## Evidence to capture
202
+
203
+ For each major scenario, capture at least one of:
204
+ - JSON/RPC output snippet
205
+ - screenshot of onboarding step
206
+ - saved config snippet
207
+ - `/voice doctor` output
208
+ - `/voice info` or `/voice test` output
209
+
210
+ Recommended artifact locations:
211
+ - `docs/qa-results.md` for pass/fail summaries
212
+ - optional raw snippets under `docs/qa-artifacts/`
213
+
214
+ ---
215
+
216
+ ## Signoff criteria
217
+
218
+ Model-detection-aware onboarding is release-ready when all are true:
219
+ - [ ] already-installed models are surfaced correctly in onboarding
220
+ - [ ] onboarding avoids unnecessary re-download guidance
221
+ - [ ] backend-installed/model-missing states are clearly explained
222
+ - [ ] API/local branches both remain understandable and correct
223
+ - [ ] migration + reconfigure paths remain safe
224
+ - [ ] runtime/daemon behavior matches selected config
225
+ - [ ] docs reflect the new behavior
226
+ - [ ] QA evidence is recorded