@codexstar/pi-listen 1.0.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +283 -0
- package/daemon.py +517 -0
- package/docs/API.md +273 -0
- package/docs/ARCHITECTURE.md +114 -0
- package/docs/backends.md +196 -0
- package/docs/plans/2026-03-12-pi-voice-master-plan.md +613 -0
- package/docs/plans/2026-03-12-pi-voice-model-aware-execution-plan.md +256 -0
- package/docs/plans/2026-03-12-pi-voice-onboarding-remediation-plan.md +391 -0
- package/docs/plans/pi-voice-model-aware-review.md +196 -0
- package/docs/plans/pi-voice-model-detection-qa-plan.md +226 -0
- package/docs/plans/pi-voice-model-detection-research.md +483 -0
- package/docs/plans/pi-voice-onboarding-ux-plan.md +388 -0
- package/docs/plans/pi-voice-release-validation-plan.md +386 -0
- package/docs/plans/pi-voice-remaining-implementation-plan.md +524 -0
- package/docs/plans/pi-voice-review-findings.md +227 -0
- package/docs/plans/pi-voice-technical-remediation-plan.md +613 -0
- package/docs/qa-matrix.md +69 -0
- package/docs/qa-results.md +357 -0
- package/docs/troubleshooting.md +265 -0
- package/extensions/voice/config.ts +206 -0
- package/extensions/voice/diagnostics.ts +212 -0
- package/extensions/voice/install.ts +62 -0
- package/extensions/voice/onboarding.ts +315 -0
- package/extensions/voice.ts +1149 -0
- package/package.json +48 -0
- package/scripts/setup-macos.sh +374 -0
- package/scripts/setup-windows.ps1 +271 -0
- package/transcribe.py +497 -0
|
@@ -0,0 +1,386 @@
|
|
|
1
|
+
# pi-voice release and validation plan
|
|
2
|
+
|
|
3
|
+
## Goal
|
|
4
|
+
|
|
5
|
+
Ship the pi-voice onboarding overhaul with enough automated coverage, manual verification, migration safety, and release hygiene that a fresh install feels reliable and polished for both local and cloud STT users.
|
|
6
|
+
|
|
7
|
+
This plan assumes the product work adds a first-run onboarding wizard, API-vs-local setup flow, richer config state, and daemon/config correctness fixes.
|
|
8
|
+
|
|
9
|
+
## Baseline findings
|
|
10
|
+
|
|
11
|
+
Current repo state, verified during planning:
|
|
12
|
+
|
|
13
|
+
- TypeScript currently compiles: `bunx tsc -p tsconfig.json`
|
|
14
|
+
- Python files currently compile: `python3 -m py_compile daemon.py transcribe.py`
|
|
15
|
+
- Backend scan currently works: `python3 transcribe.py --list-backends`
|
|
16
|
+
- `faster-whisper` is available in the current environment; other backends are not.
|
|
17
|
+
- The package currently lacks release-facing docs and package metadata hardening:
|
|
18
|
+
- no `README.md`
|
|
19
|
+
- no test scripts in `package.json`
|
|
20
|
+
- no peer dependency declarations
|
|
21
|
+
- no gallery preview metadata
|
|
22
|
+
- Setup is currently command-driven (`/voice setup`) rather than first-run guided onboarding.
|
|
23
|
+
|
|
24
|
+
## Release risks to manage
|
|
25
|
+
|
|
26
|
+
### Product risks
|
|
27
|
+
- First-run onboarding becomes annoying instead of helpful.
|
|
28
|
+
- Users cannot tell the difference between cloud and local tradeoffs.
|
|
29
|
+
- Project-scoped and global-scoped configuration behave inconsistently.
|
|
30
|
+
- Recovery paths for partial setup failures are unclear.
|
|
31
|
+
|
|
32
|
+
### Technical risks
|
|
33
|
+
- Daemon keeps stale backend/model state across sessions.
|
|
34
|
+
- Config migrations break existing users or re-trigger onboarding incorrectly.
|
|
35
|
+
- Auto-install/provisioning paths behave differently across machines.
|
|
36
|
+
- Validation passes in development but fails for fresh installs.
|
|
37
|
+
|
|
38
|
+
### Release risks
|
|
39
|
+
- Package ships without docs, screenshots, or troubleshooting.
|
|
40
|
+
- No repeatable QA checklist exists.
|
|
41
|
+
- Regressions in recording, daemon startup, or transcription are discovered after release.
|
|
42
|
+
|
|
43
|
+
## Test strategy
|
|
44
|
+
|
|
45
|
+
Use four layers of validation.
|
|
46
|
+
|
|
47
|
+
### 1. Static and smoke checks
|
|
48
|
+
These run on every change and before release.
|
|
49
|
+
|
|
50
|
+
Required commands:
|
|
51
|
+
|
|
52
|
+
```sh
|
|
53
|
+
bunx tsc -p tsconfig.json
|
|
54
|
+
python3 -m py_compile daemon.py transcribe.py
|
|
55
|
+
python3 transcribe.py --list-backends
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
Add package scripts so release validation is one command away:
|
|
59
|
+
|
|
60
|
+
```json
|
|
61
|
+
{
|
|
62
|
+
"scripts": {
|
|
63
|
+
"typecheck": "bunx tsc -p tsconfig.json",
|
|
64
|
+
"check:python": "python3 -m py_compile daemon.py transcribe.py",
|
|
65
|
+
"check:backends": "python3 transcribe.py --list-backends",
|
|
66
|
+
"check": "bun run typecheck && bun run check:python && bun run check:backends"
|
|
67
|
+
}
|
|
68
|
+
}
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
### 2. Unit and module tests
|
|
72
|
+
Add tests around deterministic logic that should not require live audio.
|
|
73
|
+
|
|
74
|
+
Priority test targets:
|
|
75
|
+
- config load/merge precedence
|
|
76
|
+
- config migration from old shape to new onboarding-aware shape
|
|
77
|
+
- onboarding state gating logic
|
|
78
|
+
- scope-aware save behavior (global vs project)
|
|
79
|
+
- recommendation logic for API vs local mode
|
|
80
|
+
- backend metadata normalization
|
|
81
|
+
- daemon reconciliation logic when config differs from loaded daemon state
|
|
82
|
+
- doctor/diagnostics result formatting
|
|
83
|
+
|
|
84
|
+
Suggested structure:
|
|
85
|
+
- `extensions/voice/config.test.ts`
|
|
86
|
+
- `extensions/voice/onboarding-state.test.ts`
|
|
87
|
+
- `extensions/voice/diagnostics.test.ts`
|
|
88
|
+
- `extensions/voice/runtime.test.ts`
|
|
89
|
+
- `test_python/test_daemon_contract.py` or lightweight Python script-based tests if no framework is added
|
|
90
|
+
|
|
91
|
+
### 3. Integration tests
|
|
92
|
+
Add smoke tests that exercise multi-step workflows with mocked or controlled dependencies.
|
|
93
|
+
|
|
94
|
+
High-value integration scenarios:
|
|
95
|
+
- first session with no voice config launches onboarding
|
|
96
|
+
- completed onboarding does not relaunch every session
|
|
97
|
+
- `/voice setup` reopens the same wizard
|
|
98
|
+
- saving to project scope persists under `.pi/settings.json`
|
|
99
|
+
- saving to global scope persists under `~/.pi/agent/settings.json`
|
|
100
|
+
- daemon reloads when backend/model changes
|
|
101
|
+
- `/voice doctor` reports missing SoX and missing API key correctly
|
|
102
|
+
- backend scan failure renders a recovery state instead of crashing
|
|
103
|
+
|
|
104
|
+
### 4. Manual end-to-end QA
|
|
105
|
+
Because this extension depends on terminal UI, mic input, Python libraries, and host tools, manual QA remains mandatory before release.
|
|
106
|
+
|
|
107
|
+
## Validation matrix
|
|
108
|
+
|
|
109
|
+
### Environment matrix
|
|
110
|
+
|
|
111
|
+
| Environment | Why it matters | Must pass |
|
|
112
|
+
|---|---|---|
|
|
113
|
+
| macOS Apple Silicon, fresh machine profile | primary target, likely most users | Yes |
|
|
114
|
+
| macOS Apple Silicon, existing faster-whisper installed | migration and detection path | Yes |
|
|
115
|
+
| macOS Apple Silicon, no SoX installed | local onboarding provisioning | Yes |
|
|
116
|
+
| macOS Apple Silicon, Deepgram key present | cloud onboarding fast path | Yes |
|
|
117
|
+
| macOS Apple Silicon, Deepgram key absent | cloud error/recovery path | Yes |
|
|
118
|
+
| Existing user with prior `voice` config | migration and no-regression path | Yes |
|
|
119
|
+
| Project-local config present overriding global | precedence and scope correctness | Yes |
|
|
120
|
+
| Multiple Pi sessions with daemon already running | stale daemon/state isolation | Yes |
|
|
121
|
+
|
|
122
|
+
### Feature matrix
|
|
123
|
+
|
|
124
|
+
| Area | Validation focus |
|
|
125
|
+
|---|---|
|
|
126
|
+
| First-run onboarding | launches once, clear choices, cancellable, resumable |
|
|
127
|
+
| API mode | key detection, validation, copy, recovery |
|
|
128
|
+
| Local mode | dependency install path, model selection, warmup |
|
|
129
|
+
| Scope | global/project save behavior and re-load behavior |
|
|
130
|
+
| Diagnostics | accurate pass/fail and actionable remediation |
|
|
131
|
+
| Runtime | hold-to-talk, toggle shortcut, BTW voice path |
|
|
132
|
+
| Daemon | correct backend/model, reload, shutdown, status |
|
|
133
|
+
| Docs | install guide, troubleshooting, shortcuts, privacy notes |
|
|
134
|
+
| Packaging | files included, metadata correct, package installs cleanly |
|
|
135
|
+
|
|
136
|
+
## Manual QA scenarios
|
|
137
|
+
|
|
138
|
+
Each scenario should end with a short written outcome and screenshots or terminal captures where useful.
|
|
139
|
+
|
|
140
|
+
### Fresh install scenarios
|
|
141
|
+
1. Install package into a clean environment with no voice config.
|
|
142
|
+
2. Start Pi interactively.
|
|
143
|
+
3. Confirm onboarding appears automatically.
|
|
144
|
+
4. Verify the user is first asked how they want to use STT: API or local download.
|
|
145
|
+
5. Complete setup and verify a successful test transcription.
|
|
146
|
+
6. Restart Pi and verify onboarding does not reappear.
|
|
147
|
+
|
|
148
|
+
### Local mode scenarios
|
|
149
|
+
1. Start from no SoX and no backend dependency.
|
|
150
|
+
2. Choose local/download mode.
|
|
151
|
+
3. Verify recommendation copy is understandable.
|
|
152
|
+
4. Verify missing dependencies are detected.
|
|
153
|
+
5. Verify auto-install or guided-install copy is correct.
|
|
154
|
+
6. Verify model choice reflects tradeoffs.
|
|
155
|
+
7. Verify daemon warms the selected model.
|
|
156
|
+
8. Record and transcribe successfully.
|
|
157
|
+
|
|
158
|
+
### Cloud mode scenarios
|
|
159
|
+
1. Start with no `DEEPGRAM_API_KEY`.
|
|
160
|
+
2. Choose API mode.
|
|
161
|
+
3. Verify missing key is explained clearly.
|
|
162
|
+
4. Add key and retry without restarting the world if possible.
|
|
163
|
+
5. Verify transcription succeeds.
|
|
164
|
+
|
|
165
|
+
### Migration scenarios
|
|
166
|
+
1. Seed existing config with the old shape (`enabled`, `backend`, `model`, `language`).
|
|
167
|
+
2. Start Pi.
|
|
168
|
+
3. Confirm config migrates safely.
|
|
169
|
+
4. Verify onboarding does not clobber a valid prior setup.
|
|
170
|
+
5. Verify invalid prior setups are detected and routed into repair flow.
|
|
171
|
+
|
|
172
|
+
### Failure and recovery scenarios
|
|
173
|
+
1. Break backend scan intentionally.
|
|
174
|
+
2. Break daemon startup intentionally.
|
|
175
|
+
3. Simulate missing mic audio.
|
|
176
|
+
4. Verify `/voice doctor` and onboarding recovery screens show next steps, not generic failures.
|
|
177
|
+
|
|
178
|
+
### Scope scenarios
|
|
179
|
+
1. Save global config.
|
|
180
|
+
2. Open a project with no project-level override.
|
|
181
|
+
3. Confirm global config is honored.
|
|
182
|
+
4. Save project-level config.
|
|
183
|
+
5. Confirm project config overrides global.
|
|
184
|
+
6. Switch to another project and confirm the project-specific override does not leak.
|
|
185
|
+
|
|
186
|
+
### Daemon correctness scenarios
|
|
187
|
+
1. Run session A with backend/model X.
|
|
188
|
+
2. Start session B with backend/model Y.
|
|
189
|
+
3. Verify the daemon uses the requested config and does not silently keep X.
|
|
190
|
+
4. Validate `/voice info` and `/voice daemon status` reflect reality.
|
|
191
|
+
|
|
192
|
+
## Automated release gates
|
|
193
|
+
|
|
194
|
+
Before tagging any release, require all of the following:
|
|
195
|
+
|
|
196
|
+
### Gate 1: code health
|
|
197
|
+
- TypeScript check passes.
|
|
198
|
+
- Python compile check passes.
|
|
199
|
+
- Backend list check passes.
|
|
200
|
+
- New unit/integration tests pass.
|
|
201
|
+
|
|
202
|
+
### Gate 2: packaging health
|
|
203
|
+
- `package.json` includes required metadata.
|
|
204
|
+
- published file list includes extension, Python files, README, docs if intended
|
|
205
|
+
- no accidental junk files in package
|
|
206
|
+
- install from local path works in a clean test directory
|
|
207
|
+
|
|
208
|
+
### Gate 3: UX acceptance
|
|
209
|
+
- first-run onboarding flow verified manually
|
|
210
|
+
- API and local paths both verified manually
|
|
211
|
+
- error copy reviewed for clarity
|
|
212
|
+
- success screen reviewed for polish and completeness
|
|
213
|
+
|
|
214
|
+
### Gate 4: documentation acceptance
|
|
215
|
+
- README explains first-run onboarding
|
|
216
|
+
- README includes cloud vs local comparison
|
|
217
|
+
- README includes troubleshooting section
|
|
218
|
+
- docs include shortcut summary and doctor/setup commands
|
|
219
|
+
|
|
220
|
+
## Migration and versioning strategy
|
|
221
|
+
|
|
222
|
+
### Config schema versioning
|
|
223
|
+
Introduce an explicit config version under `voice.version`.
|
|
224
|
+
|
|
225
|
+
Recommended migration rules:
|
|
226
|
+
- `version: 1` = legacy shape
|
|
227
|
+
- `version: 2` = onboarding-aware shape with scope, onboarding metadata, and validation markers
|
|
228
|
+
|
|
229
|
+
Migration requirements:
|
|
230
|
+
- old config is upgraded in memory first
|
|
231
|
+
- migration is idempotent
|
|
232
|
+
- migrated config is only written back after successful validation or explicit confirmation
|
|
233
|
+
- invalid legacy config routes to repair/onboarding instead of failing silently
|
|
234
|
+
|
|
235
|
+
### Onboarding versioning
|
|
236
|
+
Track onboarding independently from base config shape if needed:
|
|
237
|
+
|
|
238
|
+
```json
|
|
239
|
+
{
|
|
240
|
+
"voice": {
|
|
241
|
+
"version": 2,
|
|
242
|
+
"onboarding": {
|
|
243
|
+
"schemaVersion": 2,
|
|
244
|
+
"completed": true,
|
|
245
|
+
"completedAt": "...",
|
|
246
|
+
"lastValidatedAt": "..."
|
|
247
|
+
}
|
|
248
|
+
}
|
|
249
|
+
}
|
|
250
|
+
```
|
|
251
|
+
|
|
252
|
+
Re-run onboarding when:
|
|
253
|
+
- config version is too old
|
|
254
|
+
- onboarding schema version is too old
|
|
255
|
+
- required dependency state is no longer valid
|
|
256
|
+
- user explicitly runs `/voice setup` or `/voice reconfigure`
|
|
257
|
+
|
|
258
|
+
### Release versioning
|
|
259
|
+
Use semver aligned to impact:
|
|
260
|
+
- patch: bugfixes, copy, docs, non-breaking diagnostics improvements
|
|
261
|
+
- minor: new onboarding flow, new commands, new backend options, migration-safe config additions
|
|
262
|
+
- major: breaking config changes, command removals, behavior changes that require user adaptation
|
|
263
|
+
|
|
264
|
+
For the onboarding overhaul, a **minor release** is appropriate if migration is safe. Use a **major release** only if existing settings or workflows break.
|
|
265
|
+
|
|
266
|
+
## Docs and package metadata plan
|
|
267
|
+
|
|
268
|
+
### Required new docs
|
|
269
|
+
- `README.md`
|
|
270
|
+
- `docs/backends.md`
|
|
271
|
+
- `docs/troubleshooting.md`
|
|
272
|
+
- optional `docs/release-checklist.md`
|
|
273
|
+
|
|
274
|
+
### README requirements
|
|
275
|
+
Include:
|
|
276
|
+
- what pi-voice does
|
|
277
|
+
- what happens on first run after install
|
|
278
|
+
- API vs local decision guide
|
|
279
|
+
- supported backends and who each is for
|
|
280
|
+
- installation requirements
|
|
281
|
+
- shortcuts and commands
|
|
282
|
+
- troubleshooting basics
|
|
283
|
+
- privacy and cost notes
|
|
284
|
+
|
|
285
|
+
### Package metadata requirements
|
|
286
|
+
Update `package.json` with:
|
|
287
|
+
- `homepage`
|
|
288
|
+
- `repository`
|
|
289
|
+
- `bugs`
|
|
290
|
+
- richer description
|
|
291
|
+
- `peerDependencies` for Pi runtime packages
|
|
292
|
+
- scripts for checks
|
|
293
|
+
- `files` list including README and docs if shipped
|
|
294
|
+
- optional package gallery `image` or `video`
|
|
295
|
+
|
|
296
|
+
## Rollout sequencing
|
|
297
|
+
|
|
298
|
+
### Phase 1: stabilization
|
|
299
|
+
Ship only after daemon/config correctness issues are fixed.
|
|
300
|
+
|
|
301
|
+
Release focus:
|
|
302
|
+
- config-vs-daemon consistency
|
|
303
|
+
- scope-aware persistence
|
|
304
|
+
- no first-run UX yet if correctness is still shaky
|
|
305
|
+
|
|
306
|
+
### Phase 2: onboarding beta
|
|
307
|
+
Add the first-run onboarding flow behind a conservative release.
|
|
308
|
+
|
|
309
|
+
Release focus:
|
|
310
|
+
- clean new-user path
|
|
311
|
+
- migration-safe old-user path
|
|
312
|
+
- basic doctor/setup recovery
|
|
313
|
+
|
|
314
|
+
### Phase 3: provisioning and polish
|
|
315
|
+
Improve dependency installation, doctor flow, docs, and package presentation.
|
|
316
|
+
|
|
317
|
+
Release focus:
|
|
318
|
+
- fewer manual steps
|
|
319
|
+
- clearer remediation
|
|
320
|
+
- publication-grade package page
|
|
321
|
+
|
|
322
|
+
### Phase 4: broad confidence release
|
|
323
|
+
After at least one full QA pass across the matrix, publish the polished release broadly.
|
|
324
|
+
|
|
325
|
+
## Release checklist
|
|
326
|
+
|
|
327
|
+
### Pre-release
|
|
328
|
+
- [ ] Merge product, technical, and QA plans
|
|
329
|
+
- [ ] Add all required package scripts
|
|
330
|
+
- [ ] Add/refresh docs
|
|
331
|
+
- [ ] Verify package contents
|
|
332
|
+
- [ ] Run automated checks
|
|
333
|
+
- [ ] Run manual QA matrix
|
|
334
|
+
- [ ] Review migration behavior from old config
|
|
335
|
+
- [ ] Review stale-daemon scenarios
|
|
336
|
+
|
|
337
|
+
### Release candidate signoff
|
|
338
|
+
- [ ] New-user onboarding approved
|
|
339
|
+
- [ ] Existing-user migration approved
|
|
340
|
+
- [ ] Local mode approved
|
|
341
|
+
- [ ] API mode approved
|
|
342
|
+
- [ ] Diagnostics approved
|
|
343
|
+
- [ ] README/package metadata approved
|
|
344
|
+
|
|
345
|
+
### Post-release
|
|
346
|
+
- [ ] Validate install from published source
|
|
347
|
+
- [ ] Confirm first-run flow on a clean environment
|
|
348
|
+
- [ ] Capture any support issues or friction reports
|
|
349
|
+
- [ ] Patch critical regressions before expanding scope
|
|
350
|
+
|
|
351
|
+
## Definition of done
|
|
352
|
+
|
|
353
|
+
The onboarding overhaul is done only when all of the following are true:
|
|
354
|
+
|
|
355
|
+
### Functional
|
|
356
|
+
- Fresh installs trigger guided onboarding automatically in interactive sessions.
|
|
357
|
+
- Users are explicitly asked whether they want API or local STT before backend details.
|
|
358
|
+
- Local setup can provision or clearly guide missing dependencies.
|
|
359
|
+
- Cloud setup can detect and validate required API credentials.
|
|
360
|
+
- Config scope works correctly for global and project settings.
|
|
361
|
+
- Runtime always uses the selected backend/model, even with an already-running daemon.
|
|
362
|
+
|
|
363
|
+
### Quality
|
|
364
|
+
- Static checks and tests pass from a single release command.
|
|
365
|
+
- Manual QA matrix is completed and documented.
|
|
366
|
+
- Failure states provide actionable recovery guidance.
|
|
367
|
+
- Existing users migrate safely.
|
|
368
|
+
|
|
369
|
+
### Documentation and packaging
|
|
370
|
+
- README is complete and accurate.
|
|
371
|
+
- Troubleshooting docs exist.
|
|
372
|
+
- Package metadata is production-quality.
|
|
373
|
+
- Published package contains everything needed and nothing accidental.
|
|
374
|
+
|
|
375
|
+
### Product polish
|
|
376
|
+
- First-run flow feels deliberate and polished, not like an internal setup tool.
|
|
377
|
+
- Success state clearly tells the user what was configured and what to do next.
|
|
378
|
+
- Reconfiguration and doctor flows are discoverable.
|
|
379
|
+
|
|
380
|
+
## Recommended next actions
|
|
381
|
+
|
|
382
|
+
1. Add the release scripts and test scaffolding to `package.json`.
|
|
383
|
+
2. Implement config versioning and migration tests before building more UX.
|
|
384
|
+
3. Create a reusable manual QA checklist document from this plan.
|
|
385
|
+
4. Treat stale daemon correctness as a release blocker.
|
|
386
|
+
5. Do not publish the onboarding overhaul until both fresh-install and migration paths are signed off.
|