@codexstar/pi-listen 1.0.4
- package/LICENSE +21 -0
- package/README.md +283 -0
- package/daemon.py +517 -0
- package/docs/API.md +273 -0
- package/docs/ARCHITECTURE.md +114 -0
- package/docs/backends.md +196 -0
- package/docs/plans/2026-03-12-pi-voice-master-plan.md +613 -0
- package/docs/plans/2026-03-12-pi-voice-model-aware-execution-plan.md +256 -0
- package/docs/plans/2026-03-12-pi-voice-onboarding-remediation-plan.md +391 -0
- package/docs/plans/pi-voice-model-aware-review.md +196 -0
- package/docs/plans/pi-voice-model-detection-qa-plan.md +226 -0
- package/docs/plans/pi-voice-model-detection-research.md +483 -0
- package/docs/plans/pi-voice-onboarding-ux-plan.md +388 -0
- package/docs/plans/pi-voice-release-validation-plan.md +386 -0
- package/docs/plans/pi-voice-remaining-implementation-plan.md +524 -0
- package/docs/plans/pi-voice-review-findings.md +227 -0
- package/docs/plans/pi-voice-technical-remediation-plan.md +613 -0
- package/docs/qa-matrix.md +69 -0
- package/docs/qa-results.md +357 -0
- package/docs/troubleshooting.md +265 -0
- package/extensions/voice/config.ts +206 -0
- package/extensions/voice/diagnostics.ts +212 -0
- package/extensions/voice/install.ts +62 -0
- package/extensions/voice/onboarding.ts +315 -0
- package/extensions/voice.ts +1149 -0
- package/package.json +48 -0
- package/scripts/setup-macos.sh +374 -0
- package/scripts/setup-windows.ps1 +271 -0
- package/transcribe.py +497 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 codexstar69
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,283 @@
+# pi-listen
+
+**Voice input and side-channel conversations for [Pi](https://github.com/mariozechner/pi-coding-agent).**
+
+[npm](https://www.npmjs.com/package/@codexstar/pi-listen)
+[License: MIT](https://github.com/codexstar69/pi-listen/blob/main/LICENSE)
+[Tests](#testing)
+
+---
+
+## Overview
+
+pi-listen is a Pi extension that adds hands-free voice input and side-channel voice workflows to the [Pi coding agent](https://github.com/mariozechner/pi-coding-agent) CLI.
+
+### Key Features
+
+| Feature | Description |
+|---------|-------------|
+| **Hold-to-talk** | Hold `SPACE` (empty editor) to record → release to transcribe into the editor |
+| **Persistent STT daemon** | Keeps transcription models warm in memory for zero cold-start latency |
+| **5 STT backends** | faster-whisper · moonshine · whisper-cpp · Deepgram (cloud) · Parakeet (NVIDIA) |
+| **BTW side conversations** | Quick parallel questions (`/btw`) without interrupting the main agent session |
+| **Guided onboarding** | First-run wizard detects your environment and recommends the best setup |
+| **Voice → BTW glue** | `Ctrl+Shift+B` = hold to record → auto-send as a side conversation |
+
+---
+
+## Quick Start
+
+### Install
+
+```bash
+pi install npm:@codexstar/pi-listen
+```
+
+> **Migrating from `pi-listen`?** Run: `pi uninstall pi-listen && pi install npm:@codexstar/pi-listen`
+
+### First Run
+
+On first launch, pi-listen detects your environment and walks you through setup:
+
+```
+? Set up pi-voice now?
+❯ Start voice setup
+  Remind me later
+
+? How do you want to use speech-to-text?
+❯ Recommended for me
+  Cloud API (fastest setup)
+  Local download (offline / private)
+```
+
+### Manual Install (if needed)
+
+```bash
+# Local backend (recommended)
+pip install faster-whisper
+brew install sox  # microphone recording
+
+# Cloud backend (alternative)
+export DEEPGRAM_API_KEY="your-key-here"
+brew install sox
+```
+
+---
+
+## Usage
+
+### Voice Input
+
+| Action | Keybinding | Description |
+|--------|-----------|-------------|
+| Record → Editor | Hold `SPACE` | Hold spacebar with empty editor to record, release to transcribe |
+| Record → Editor (toggle) | `Ctrl+Shift+V` | Toggle recording on/off (non-Kitty terminals) |
+| Record → BTW | Hold `Ctrl+Shift+B` | Hold to record, release to send as BTW side question |
+
+### Commands
+
+```bash
+/voice                 # Toggle voice on/off
+/voice on              # Enable voice
+/voice off             # Disable voice
+/voice setup           # Run onboarding wizard
+/voice reconfigure     # Re-run setup
+/voice test            # Test microphone + STT pipeline
+/voice info            # Show current config & daemon status
+/voice doctor          # Diagnose issues with repair suggestions
+/voice backends        # List all available STT backends
+/voice daemon          # Start the STT daemon
+/voice daemon stop     # Stop the daemon
+/voice daemon status   # Show daemon status
+```
+
+### BTW Side Conversations
+
+```bash
+/btw <message>         # Ask a side question
+/btw:new [message]     # Start a fresh thread (optionally with a message)
+/btw:clear             # Dismiss the BTW widget
+/btw:inject            # Push BTW thread into main agent context
+/btw:summarize         # Summarize thread then inject into main agent
+```
+
+---
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        Pi CLI (Node.js)                         │
+│                                                                 │
+│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────┐     │
+│  │   voice.ts   │   │  config.ts   │   │  onboarding.ts   │     │
+│  │ (extension)  │   │  (settings)  │   │  (setup wizard)  │     │
+│  └──────┬───────┘   └──────────────┘   └──────────────────┘     │
+│         │                                                       │
+│         │  Unix Socket (newline-delimited JSON)                 │
+│         ▼                                                       │
+│  ┌──────────────┐   ┌──────────────┐                            │
+│  │  daemon.py   │◄──│ transcribe.py │                           │
+│  │ (persistent) │   │  (backends)  │                            │
+│  └──────────────┘   └──────────────┘                            │
+│                                                                 │
+│  Backends: faster-whisper │ moonshine │ whisper-cpp │ deepgram  │
+└─────────────────────────────────────────────────────────────────┘
+```
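The diagram labels the extension-to-daemon link as newline-delimited JSON over a Unix socket. As a rough sketch of what that framing implies, each message is one JSON object terminated by `\n`, and the reader has to buffer partial chunks until a full line arrives. The helper names and the `cmd`/`path` fields used below are hypothetical and are not taken from `daemon.py`.

```python
from __future__ import annotations

import json


def encode_message(message: dict) -> bytes:
    """Serialize one message as a single JSON line (hypothetical helper)."""
    # json.dumps escapes embedded newlines, so the trailing "\n" is the
    # only newline byte in the frame.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


class LineDecoder:
    """Buffers raw socket chunks and returns complete JSON messages."""

    def __init__(self) -> None:
        self._buffer = b""

    def feed(self, chunk: bytes) -> list[dict]:
        self._buffer += chunk
        # Everything before the last newline is complete; the tail is an
        # unfinished frame and stays buffered for the next call.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [json.loads(line) for line in lines if line.strip()]
```

In a real client the bytes passed to `feed` would come from `recv()` on an `AF_UNIX` stream socket; the decoder itself does not care where the chunks originate, which keeps it easy to test.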
+
+### Components
+
+| Component | Language | Purpose |
+|-----------|----------|---------|
+| `extensions/voice.ts` | TypeScript | Main extension: recording, daemon IPC, BTW, UI |
+| `extensions/voice/config.ts` | TypeScript | Config loading, saving, migration, socket path generation |
+| `extensions/voice/diagnostics.ts` | TypeScript | Environment scanning, backend detection, recommendations |
+| `extensions/voice/onboarding.ts` | TypeScript | Interactive first-run setup wizard |
+| `extensions/voice/install.ts` | TypeScript | Provisioning plan builder |
+| `daemon.py` | Python | Persistent STT daemon with warm model cache |
+| `transcribe.py` | Python | Multi-backend transcription engine |
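The value of a persistent daemon with a warm model cache is that the expensive model load happens once per backend/model pair instead of once per request. A minimal sketch of that idea follows, assuming a caller-supplied `loader` function; both `WarmModelCache` and `loader` are hypothetical stand-ins for whatever `daemon.py` actually does.

```python
from typing import Any, Callable, Dict, Tuple


class WarmModelCache:
    """Loads each (backend, model) pair once and reuses it afterwards."""

    def __init__(self, loader: Callable[[str, str], Any]) -> None:
        # loader is a hypothetical callback that performs the slow load.
        self._loader = loader
        self._models: Dict[Tuple[str, str], Any] = {}

    def get(self, backend: str, model: str) -> Any:
        key = (backend, model)
        if key not in self._models:
            # Cold path: pay the load cost only on first use.
            self._models[key] = self._loader(backend, model)
        return self._models[key]
```

Only the first `get` for a given pair pays the load cost; later calls return the same object, which is what lets a long-running process answer quickly.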
+
+---
+
+## STT Backends
+
+| Backend | Type | Speed | Quality | Install |
+|---------|------|-------|---------|---------|
+| **faster-whisper** | Local | ⭐⭐⭐ | ⭐⭐⭐⭐ | `pip install faster-whisper` |
+| **moonshine** | Local | ⭐⭐⭐⭐ | ⭐⭐⭐ | `pip install useful-moonshine[onnx]` |
+| **whisper-cpp** | Local | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | `brew install whisper-cpp` |
+| **deepgram** | Cloud | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | `DEEPGRAM_API_KEY` env var |
+| **parakeet** | Local | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | `pip install nemo_toolkit[asr]` |
+
+See [docs/backends.md](docs/backends.md) for detailed backend comparison, model lists, and platform notes.
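Choosing among these backends comes down to what is installed or configured on the machine. The sketch below shows one possible selection rule. The local-first preference order is an illustrative assumption, and `pick_backend` and `LOCAL_PREFERENCE` are hypothetical names; the package's real logic lives in `diagnostics.ts` and may differ.

```python
from typing import Iterable, Mapping, Optional

# Assumed preference order for local backends (illustrative only).
LOCAL_PREFERENCE = ("faster-whisper", "whisper-cpp", "moonshine", "parakeet")


def pick_backend(
    installed: Iterable[str],
    env: Mapping[str, str],
    prefer_local: bool = True,
) -> Optional[str]:
    """Return a backend name, or None if nothing usable was found."""
    available = set(installed)
    local = next((b for b in LOCAL_PREFERENCE if b in available), None)
    # The cloud backend only counts as usable when an API key is set.
    cloud = "deepgram" if env.get("DEEPGRAM_API_KEY") else None
    if prefer_local:
        return local or cloud
    return cloud or local
```

Passing the installed set and the environment in as arguments, instead of probing the system inside the function, keeps the rule deterministic and testable; a caller would typically pass `os.environ` for `env`.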
+
+---
+
+## Configuration
+
+pi-listen stores settings in Pi's settings files:
+
+| Scope | Path | When to use |
+|-------|------|-------------|
+| Global | `~/.pi/agent/settings.json` | Same setup across all projects |
+| Project | `<project>/.pi/settings.json` | Project-specific backend/model |
+
+### Settings Structure
+
+```json
+{
+  "voice": {
+    "version": 2,
+    "enabled": true,
+    "language": "en",
+    "mode": "auto",
+    "backend": "faster-whisper",
+    "model": "small",
+    "scope": "global",
+    "btwEnabled": true,
+    "onboarding": {
+      "completed": true,
+      "schemaVersion": 2,
+      "completedAt": "2026-03-12T00:00:00.000Z",
+      "source": "first-run"
+    }
+  }
+}
+```
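With two settings files in play, something has to decide which value wins. The sketch below assumes project settings override global ones key by key, with nested objects such as `onboarding` merged instead of replaced. That precedence is an assumption made for illustration, not documented behavior, and `merge_voice_settings` is a hypothetical helper.

```python
from typing import Any, Dict


def merge_voice_settings(
    global_cfg: Dict[str, Any], project_cfg: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge the "voice" sections, letting project values win (assumed)."""
    merged = dict(global_cfg.get("voice", {}))
    for key, value in project_cfg.get("voice", {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge one level of nesting so a partial project override
            # does not erase sibling keys from the global file.
            merged[key] = {**merged[key], **value}
        else:
            merged[key] = value
    return merged
```

For example, a project file that sets only `"model"` would keep the global `"backend"` under this rule; neither input dictionary is modified.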
+
+---
+
+## Bootstrap Scripts
+
+For zero-touch setup on a fresh machine:
+
+### macOS
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/codexstar69/pi-listen/main/scripts/setup-macos.sh | bash
+```
+
+### Windows (PowerShell)
+
+```powershell
+irm https://raw.githubusercontent.com/codexstar69/pi-listen/main/scripts/setup-windows.ps1 | iex
+```
+
+These scripts install all dependencies (Python, SoX, faster-whisper), create a virtualenv, download models, and configure microphone permissions. You should not need to run `/voice setup` on the happy path; the bootstrap scripts handle everything.
+
+---
+
+## Testing
+
+```bash
+bun run test        # Run all tests (37 tests)
+bun run typecheck   # TypeScript type checking
+bun run check       # Full check: typecheck + test + Python compile
+```
+
+Test coverage:
+
+| Suite | Tests | Covers |
+|-------|-------|--------|
+| config.test.ts | 8 | Config loading, migration, socket paths |
+| diagnostics.test.ts | 6 | Backend detection, recommendations |
+| onboarding.test.ts | 7 | Onboarding flow, model options |
+| install.test.ts | 4 | Provisioning plan generation |
+| auto-resolution.test.ts | 4 | Backend/model auto-detection |
+| model-selection.test.ts | 3 | Model-aware recommendations |
+| setup-scripts.test.ts | 3 | Bootstrap script validation |
+| transcribe-metadata.test.ts | 2 | Backend registry metadata |
+
+---
+
+## Troubleshooting
+
+See [docs/troubleshooting.md](docs/troubleshooting.md) for common issues and solutions.
+
+Quick diagnostics:
+
+```bash
+/voice test       # Test full pipeline
+/voice doctor     # Diagnose with repair suggestions
+/voice backends   # Check backend availability
+```
+
+---
+
+## Security
+
+See [SECURITY.md](SECURITY.md) for our security policy and how to report vulnerabilities.
+
+### Security Considerations
+
+- **Local-first:** Audio is processed locally by default (no cloud unless you opt in)
+- **No telemetry:** pi-listen does not collect or transmit any usage data
+- **Unix socket:** Daemon communication is local-only via Unix domain sockets
+- **No secrets in responses:** Error responses do not expose internal paths or stack traces
+
+---
+
+## Contributing
+
+See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and contribution guidelines.
+
+---
+
+## Changelog
+
+See [CHANGELOG.md](CHANGELOG.md) for release history.
+
+---
+
+## License
+
+[MIT](LICENSE) © 2026 codexstar69
+
+---
+
+## Links
+
+- **npm:** [npmjs.com/package/@codexstar/pi-listen](https://www.npmjs.com/package/@codexstar/pi-listen)
+- **GitHub:** [github.com/codexstar69/pi-listen](https://github.com/codexstar69/pi-listen)
+- **Issues:** [github.com/codexstar69/pi-listen/issues](https://github.com/codexstar69/pi-listen/issues)
+- **Pi CLI:** [github.com/mariozechner/pi-coding-agent](https://github.com/mariozechner/pi-coding-agent)