@codexstar/pi-listen 1.0.4

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 codexstar69
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,283 @@
1
+ # pi-listen
2
+
3
+ **Voice input and side-channel conversations for [Pi](https://github.com/mariozechner/pi-coding-agent).**
4
+
5
+ [![npm version](https://img.shields.io/npm/v/@codexstar/pi-listen.svg)](https://www.npmjs.com/package/@codexstar/pi-listen)
6
+ [![license](https://img.shields.io/npm/l/@codexstar/pi-listen.svg)](https://github.com/codexstar69/pi-listen/blob/main/LICENSE)
7
+ [![tests](https://img.shields.io/badge/tests-37%2F37%20passing-brightgreen)](#testing)
8
+
9
+ ---
10
+
11
+ ## Overview
12
+
13
+ pi-listen is a Pi extension that adds hands-free voice input and side-channel voice workflows to the [Pi coding agent](https://github.com/mariozechner/pi-coding-agent) CLI.
14
+
15
+ ### Key Features
16
+
17
+ | Feature | Description |
18
+ |---------|-------------|
19
+ | **Hold-to-talk** | Hold `SPACE` (empty editor) to record → release to transcribe into the editor |
20
+ | **Persistent STT daemon** | Keeps transcription models loaded in memory so requests skip the model cold start |
21
+ | **5 STT backends** | faster-whisper · moonshine · whisper-cpp · Deepgram (cloud) · Parakeet (NVIDIA) |
22
+ | **BTW side conversations** | Quick parallel questions (`/btw`) without interrupting the main agent session |
23
+ | **Guided onboarding** | First-run wizard detects your environment and recommends the best setup |
24
+ | **Voice → BTW glue** | `Ctrl+Shift+B` = hold to record → auto-send as a side conversation |
25
+
26
+ ---
27
+
28
+ ## Quick Start
29
+
30
+ ### Install
31
+
32
+ ```bash
33
+ pi install npm:@codexstar/pi-listen
34
+ ```
35
+
36
+ > **Migrating from `pi-listen`?** Run: `pi uninstall pi-listen && pi install npm:@codexstar/pi-listen`
37
+
38
+ ### First Run
39
+
40
+ On first launch, pi-listen detects your environment and walks you through setup:
41
+
42
+ ```
43
+ ? Set up pi-voice now?
44
+   ❯ Start voice setup
45
+     Remind me later
46
+
47
+ ? How do you want to use speech-to-text?
48
+   ❯ Recommended for me
49
+     Cloud API (fastest setup)
50
+     Local download (offline / private)
51
+ ```
52
+
53
+ ### Manual Install (if needed)
54
+
55
+ ```bash
56
+ # Local backend (recommended)
57
+ pip install faster-whisper
58
+ brew install sox # microphone recording
59
+
60
+ # Cloud backend (alternative)
61
+ export DEEPGRAM_API_KEY="your-key-here"
62
+ brew install sox
63
+ ```
64
+
65
+ ---
66
+
67
+ ## Usage
68
+
69
+ ### Voice Input
70
+
71
+ | Action | Keybinding | Description |
72
+ |--------|-----------|-------------|
73
+ | Record → Editor | Hold `SPACE` | Hold spacebar with empty editor to record, release to transcribe |
74
+ | Record → Editor (toggle) | `Ctrl+Shift+V` | Toggle recording on/off (non-Kitty terminals) |
75
+ | Record → BTW | Hold `Ctrl+Shift+B` | Hold to record, release to send as BTW side question |
76
+
77
+ ### Commands
78
+
79
+ ```bash
80
+ /voice                # Toggle voice on/off
81
+ /voice on             # Enable voice
82
+ /voice off            # Disable voice
83
+ /voice setup          # Run onboarding wizard
84
+ /voice reconfigure    # Re-run setup
85
+ /voice test           # Test microphone + STT pipeline
86
+ /voice info           # Show current config & daemon status
87
+ /voice doctor         # Diagnose issues with repair suggestions
88
+ /voice backends       # List all available STT backends
89
+ /voice daemon         # Start the STT daemon
90
+ /voice daemon stop    # Stop the daemon
91
+ /voice daemon status  # Show daemon status
92
+ ```
93
+
94
+ ### BTW Side Conversations
95
+
96
+ ```bash
97
+ /btw <message>      # Ask a side question
98
+ /btw:new [message]  # Start a fresh thread (optionally with a message)
99
+ /btw:clear          # Dismiss the BTW widget
100
+ /btw:inject         # Push BTW thread into main agent context
101
+ /btw:summarize      # Summarize thread then inject into main agent
102
+ ```
103
+
104
+ ---
105
+
106
+ ## Architecture
107
+
108
+ ```
109
+ ┌─────────────────────────────────────────────────────────────────┐
110
+ │                        Pi CLI (Node.js)                         │
111
+ │                                                                 │
112
+ │  ┌──────────────┐   ┌──────────────┐   ┌──────────────────┐     │
113
+ │  │  voice.ts    │   │  config.ts   │   │  onboarding.ts   │     │
114
+ │  │  (extension) │   │  (settings)  │   │  (setup wizard)  │     │
115
+ │  └──────┬───────┘   └──────────────┘   └──────────────────┘     │
116
+ │         │                                                       │
117
+ │         │  Unix Socket (newline-delimited JSON)                 │
118
+ │         ▼                                                       │
119
+ │  ┌──────────────┐   ┌───────────────┐                           │
120
+ │  │  daemon.py   │◄──│ transcribe.py │                           │
121
+ │  │ (persistent) │   │  (backends)   │                           │
122
+ │  └──────────────┘   └───────────────┘                           │
123
+ │                                                                 │
124
+ │  Backends: faster-whisper │ moonshine │ whisper-cpp │ deepgram  │
125
+ └─────────────────────────────────────────────────────────────────┘
126
+ ```
127
+
128
+ ### Components
129
+
130
+ | Component | Language | Purpose |
131
+ |-----------|----------|---------|
132
+ | `extensions/voice.ts` | TypeScript | Main extension: recording, daemon IPC, BTW, UI |
133
+ | `extensions/voice/config.ts` | TypeScript | Config loading, saving, migration, socket path generation |
134
+ | `extensions/voice/diagnostics.ts` | TypeScript | Environment scanning, backend detection, recommendations |
135
+ | `extensions/voice/onboarding.ts` | TypeScript | Interactive first-run setup wizard |
136
+ | `extensions/voice/install.ts` | TypeScript | Provisioning plan builder |
137
+ | `daemon.py` | Python | Persistent STT daemon with warm model cache |
138
+ | `transcribe.py` | Python | Multi-backend transcription engine |
139
+
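The extension and daemon exchange newline-delimited JSON over a Unix socket. The sketch below illustrates only that framing, not the daemon's actual protocol: the message fields (`cmd`, `path`) are hypothetical, and `socket.socketpair()` stands in for the real socket so the example runs without the daemon.

```python
import json
import socket


def send_message(sock: socket.socket, message: dict) -> None:
    """Serialize one message as a single JSON line and send it."""
    # json.dumps escapes newlines inside strings, so b"\n" only ever ends a frame.
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")


def read_messages(sock: socket.socket):
    """Yield decoded messages until the peer closes the connection."""
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buffer += chunk
        # A chunk may contain several frames, or only part of one.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)


def demo() -> list:
    # A connected socket pair replaces the daemon's Unix socket for this sketch.
    client, server = socket.socketpair()
    send_message(client, {"cmd": "ping"})  # hypothetical message shape
    send_message(client, {"cmd": "transcribe", "path": "clip.wav"})  # hypothetical
    client.close()
    received = list(read_messages(server))
    server.close()
    return received


if __name__ == "__main__":
    print(demo())
```

One JSON object per line keeps parsing simple: the reader only has to buffer bytes until it sees a newline, regardless of how the stream is split into chunks.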
140
+ ---
141
+
142
+ ## STT Backends
143
+
144
+ | Backend | Type | Speed | Quality | Install |
145
+ |---------|------|-------|---------|---------|
146
+ | **faster-whisper** | Local | ⭐⭐⭐ | ⭐⭐⭐⭐ | `pip install faster-whisper` |
147
+ | **moonshine** | Local | ⭐⭐⭐⭐ | ⭐⭐⭐ | `pip install useful-moonshine[onnx]` |
148
+ | **whisper-cpp** | Local | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | `brew install whisper-cpp` |
149
+ | **deepgram** | Cloud | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | `DEEPGRAM_API_KEY` env var |
150
+ | **parakeet** | Local | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | `pip install nemo_toolkit[asr]` |
151
+
152
+ See [docs/backends.md](docs/backends.md) for detailed backend comparison, model lists, and platform notes.
153
+
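The install column above suggests how backend availability could be probed: an importable Python module, a binary on `PATH`, or an environment variable. The following is a rough sketch of that idea, not pi-listen's diagnostics code. The module names (`faster_whisper`, `moonshine_onnx`, `nemo`) and the binary name (`whisper-cli`) are assumptions and may differ on your system.

```python
import importlib.util
import os
import shutil

# Probe tables. These module and binary names are assumptions for illustration.
PYTHON_MODULES = {
    "faster-whisper": "faster_whisper",
    "moonshine": "moonshine_onnx",
    "parakeet": "nemo",
}
BINARIES = {"whisper-cpp": "whisper-cli"}


def module_installed(name: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def binary_on_path(name: str) -> bool:
    """True if an executable with this name is on PATH."""
    return shutil.which(name) is not None


def detect_backends(env=os.environ, has_module=module_installed,
                    has_binary=binary_on_path) -> dict:
    """Map each backend name to whether it looks usable on this machine."""
    found = {backend: has_module(mod) for backend, mod in PYTHON_MODULES.items()}
    found.update({backend: has_binary(exe) for backend, exe in BINARIES.items()})
    # The cloud backend only needs an API key in the environment.
    found["deepgram"] = bool(env.get("DEEPGRAM_API_KEY"))
    return found


if __name__ == "__main__":
    for backend, available in detect_backends().items():
        print(f"{backend}: {'available' if available else 'missing'}")
```

The probes are passed in as parameters so the detection logic can be exercised with fakes; for the real report, use `/voice backends`.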
154
+ ---
155
+
156
+ ## Configuration
157
+
158
+ pi-listen stores settings in Pi's settings files:
159
+
160
+ | Scope | Path | When to use |
161
+ |-------|------|-------------|
162
+ | Global | `~/.pi/agent/settings.json` | Same setup across all projects |
163
+ | Project | `<project>/.pi/settings.json` | Project-specific backend/model |
164
+
165
+ ### Settings Structure
166
+
167
+ ```json
168
+ {
169
+   "voice": {
170
+     "version": 2,
171
+     "enabled": true,
172
+     "language": "en",
173
+     "mode": "auto",
174
+     "backend": "faster-whisper",
175
+     "model": "small",
176
+     "scope": "global",
177
+     "btwEnabled": true,
178
+     "onboarding": {
179
+       "completed": true,
180
+       "schemaVersion": 2,
181
+       "completedAt": "2026-03-12T00:00:00.000Z",
182
+       "source": "first-run"
183
+     }
184
+   }
185
+ }
186
+ ```
187
+
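As a minimal sketch of how the two scopes might combine, the snippet below reads the `voice` object from both settings files and lets project keys override global ones. That precedence is an assumption drawn from the "project-specific" description, not confirmed behavior of `config.ts`, and the function names are hypothetical.

```python
import json
from pathlib import Path


def load_voice_section(path: Path) -> dict:
    """Return the "voice" object from a settings file, or {} if missing or invalid."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}
    voice = data.get("voice", {}) if isinstance(data, dict) else {}
    return voice if isinstance(voice, dict) else {}


def resolve_voice_config(global_path: Path, project_path: Path) -> dict:
    """Shallow-merge both scopes; project keys win (assumed precedence)."""
    merged = dict(load_voice_section(global_path))
    merged.update(load_voice_section(project_path))
    return merged


if __name__ == "__main__":
    config = resolve_voice_config(
        Path.home() / ".pi" / "agent" / "settings.json",
        Path.cwd() / ".pi" / "settings.json",
    )
    print(config.get("backend"), config.get("model"))
```

The merge is shallow, so a project file that defines `onboarding` replaces the whole nested object rather than individual fields.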
188
+ ---
189
+
190
+ ## Bootstrap Scripts
191
+
192
+ For zero-touch setup on a fresh machine:
193
+
194
+ ### macOS
195
+
196
+ ```bash
197
+ curl -fsSL https://raw.githubusercontent.com/codexstar69/pi-listen/main/scripts/setup-macos.sh | bash
198
+ ```
199
+
200
+ ### Windows (PowerShell)
201
+
202
+ ```powershell
203
+ irm https://raw.githubusercontent.com/codexstar69/pi-listen/main/scripts/setup-windows.ps1 | iex
204
+ ```
205
+
206
+ These scripts install all dependencies (Python, SoX, faster-whisper), create a virtualenv, download models, and configure microphone permissions. On the happy path, the bootstrap scripts handle everything and you should not need to run `/voice setup`.
207
+
208
+ ---
209
+
210
+ ## Testing
211
+
212
+ ```bash
213
+ bun run test       # Run all tests (37 tests)
214
+ bun run typecheck  # TypeScript type checking
215
+ bun run check      # Full check: typecheck + test + Python compile
216
+ ```
217
+
218
+ Test coverage:
219
+
220
+ | Suite | Tests | Covers |
221
+ |-------|-------|--------|
222
+ | config.test.ts | 8 | Config loading, migration, socket paths |
223
+ | diagnostics.test.ts | 6 | Backend detection, recommendations |
224
+ | onboarding.test.ts | 7 | Onboarding flow, model options |
225
+ | install.test.ts | 4 | Provisioning plan generation |
226
+ | auto-resolution.test.ts | 4 | Backend/model auto-detection |
227
+ | model-selection.test.ts | 3 | Model-aware recommendations |
228
+ | setup-scripts.test.ts | 3 | Bootstrap script validation |
229
+ | transcribe-metadata.test.ts | 2 | Backend registry metadata |
230
+
231
+ ---
232
+
233
+ ## Troubleshooting
234
+
235
+ See [docs/troubleshooting.md](docs/troubleshooting.md) for common issues and solutions.
236
+
237
+ Quick diagnostics:
238
+
239
+ ```bash
240
+ /voice test      # Test full pipeline
241
+ /voice doctor    # Diagnose with repair suggestions
242
+ /voice backends  # Check backend availability
243
+ ```
244
+
245
+ ---
246
+
247
+ ## Security
248
+
249
+ See [SECURITY.md](SECURITY.md) for our security policy and how to report vulnerabilities.
250
+
251
+ ### Security Considerations
252
+
253
+ - **Local-first:** Audio is processed locally by default (no cloud unless you opt in)
254
+ - **No telemetry:** pi-listen does not collect or transmit any usage data
255
+ - **Unix socket:** Daemon communication is local-only via Unix domain sockets
256
+ - **No secrets in responses:** Error responses do not expose internal paths or stack traces
257
+
258
+ ---
259
+
260
+ ## Contributing
261
+
262
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and contribution guidelines.
263
+
264
+ ---
265
+
266
+ ## Changelog
267
+
268
+ See [CHANGELOG.md](CHANGELOG.md) for release history.
269
+
270
+ ---
271
+
272
+ ## License
273
+
274
+ [MIT](LICENSE) © 2026 codexstar69
275
+
276
+ ---
277
+
278
+ ## Links
279
+
280
+ - **npm:** [npmjs.com/package/@codexstar/pi-listen](https://www.npmjs.com/package/@codexstar/pi-listen)
281
+ - **GitHub:** [github.com/codexstar69/pi-listen](https://github.com/codexstar69/pi-listen)
282
+ - **Issues:** [github.com/codexstar69/pi-listen/issues](https://github.com/codexstar69/pi-listen/issues)
283
+ - **Pi CLI:** [github.com/mariozechner/pi-coding-agent](https://github.com/mariozechner/pi-coding-agent)