pi-voice 0.1.0 → 0.1.1
- package/README.md +49 -46
- package/out/cli/cli.js +719 -45
- package/out/main/index.js +571 -158
- package/out/preload/index.cjs +1 -9
- package/out/renderer/assets/index-CdX3ylbA.js +209 -0
- package/out/renderer/index.html +3 -140
- package/package.json +7 -9
- package/build/entitlements.mac.plist +0 -14
- package/out/renderer/assets/index-dks-nI81.js +0 -162
package/README.md
CHANGED
|
@@ -1,78 +1,81 @@
|
|
|
1
1
|
# pi-voice
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Voice interface for the [Pi Coding Agent](https://github.com/badlogic/pi-mono). Hold a key, speak, and pi executes your instructions with voice feedback.
|
|
4
|
+
|
|
5
|
+
## Installation
|
|
4
6
|
|
|
5
7
|
```bash
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
bun
|
|
8
|
+
npm i -g pi-voice
|
|
9
|
+
# or
|
|
10
|
+
bun i -g pi-voice
|
|
9
11
|
```
|
|
10
12
|
|
|
11
|
-
##
|
|
12
|
-
|
|
13
|
-
pi-voice is a **daemon-style** application. Like Docker, it stays resident in the background after `start` and is controlled through the CLI. No window is shown at startup.
|
|
13
|
+
## Usage
|
|
14
14
|
|
|
15
|
-
|
|
15
|
+
pi-voice is a daemon-style application that runs in the background once started. You can push-to-talk with the agent.
|
|
16
16
|
|
|
17
17
|
```bash
|
|
18
|
-
# daemon
|
|
19
|
-
pi-voice
|
|
18
|
+
pi-voice start # start the daemon in the background
|
|
19
|
+
pi-voice status # show state, PID, and uptime
|
|
20
|
+
pi-voice stop # stop the daemon
|
|
21
|
+
```
|
|
20
22
|
|
|
21
|
-
|
|
22
|
-
pi-voice status
|
|
23
|
+
The push-to-talk trigger defaults to `Cmd+Shift+I` (macOS) / `Win+Shift+I` (Windows). Hold the key to record, release to send.
|
|
23
24
|
|
|
24
|
-
|
|
25
|
-
pi-voice show
|
|
25
|
+
## Setting
|
|
26
26
|
|
|
27
|
-
|
|
28
|
-
pi-voice stop
|
|
29
|
-
```
|
|
27
|
+
### pi agent configuration
|
|
30
28
|
|
|
31
|
-
- `start`
|
|
32
|
-
- `start` requires running `bun run build` beforehand (it fails with an error if `out/main/index.js` does not exist).
|
|
33
|
-
- The daemon keeps running in the background even after the window is closed. To stop it completely, use `stop` or Cmd+Q.
|
|
34
|
-
- Runtime state is stored in `~/.pi-voice/runtime-state.json`, and the control socket is placed at `~/.pi-voice/daemon.sock`.
|
|
29
|
+
pi-voice launches a Pi agent session with the directory where `pi-voice start` was executed. This means **all standard pi configuration works as-is**:
|
|
35
30
|
|
|
36
|
-
|
|
31
|
+
- `AGENTS.md` — walked up from `cwd` to the filesystem root
|
|
32
|
+
- `.pi/settings.json` — project-level settings
|
|
33
|
+
- `.pi/skills/`, `.pi/extensions/`, `.pi/prompts/` — project-level resources
|
|
34
|
+
- `~/.pi/agent/` — global settings, skills, extensions, prompts, and models
|
|
35
|
+
- and more
|
|
37
36
|
|
|
38
|
-
|
|
39
|
-
bun run dev:electron
|
|
40
|
-
```
|
|
37
|
+
Refer to the [Pi documentation](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent) for details on these settings.
|
|
41
38
|
|
|
42
|
-
|
|
39
|
+
### pi-voice configuration
|
|
43
40
|
|
|
44
|
-
|
|
41
|
+
You can configure pi-voice in `.pi/pi-voice.json`:
|
|
45
42
|
|
|
46
|
-
```
|
|
47
|
-
|
|
43
|
+
```json
|
|
44
|
+
{
|
|
45
|
+
"key": "ctrl+t",
|
|
46
|
+
"provider": "local"
|
|
47
|
+
}
|
|
48
48
|
```
|
|
49
49
|
|
|
50
|
-
|
|
50
|
+
| Key | Description |
|
|
51
|
+
| --- | --- |
|
|
52
|
+
| `key` | Push-to-talk shortcut. Combine modifiers (`ctrl`, `shift`, `alt`/`opt`, `meta`/`cmd`) and a main key with `+`. Examples: `"ctrl+t"`, `"alt+space"`, `"ctrl+shift+r"`. Default: `"meta+shift+i"`. |
|
|
53
|
+
| `provider` | Speech provider for STT & TTS. `"local"`, `"gemini"` (Vertex AI), or `"openai"`. Default: `"local"`. |
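
The `key` format in the table above can be illustrated with a small parser. This is only a sketch of how such a string might be normalized, assuming `opt` and `cmd` are plain aliases for `alt` and `meta` and that matching is case-insensitive. `parseShortcut` and `Shortcut` are hypothetical names for illustration, not part of pi-voice.

```typescript
// Hypothetical helper for illustration; not pi-voice's actual parser.
type Modifier = "ctrl" | "shift" | "alt" | "meta";

interface Shortcut {
  modifiers: Modifier[]; // de-duplicated, sorted alphabetically
  key: string;           // the single non-modifier key, lowercased
}

// Modifier names and aliases as listed in the README table.
const MODIFIER_ALIASES = new Map<string, Modifier>([
  ["ctrl", "ctrl"],
  ["shift", "shift"],
  ["alt", "alt"],
  ["opt", "alt"],
  ["meta", "meta"],
  ["cmd", "meta"],
]);

function parseShortcut(input: string): Shortcut {
  const modifiers = new Set<Modifier>();
  let key: string | undefined;

  for (const raw of input.split("+")) {
    const part = raw.trim().toLowerCase();
    if (part === "") {
      throw new Error(`Empty segment in shortcut "${input}"`);
    }
    const modifier = MODIFIER_ALIASES.get(part);
    if (modifier !== undefined) {
      modifiers.add(modifier);
    } else if (key === undefined) {
      key = part;
    } else {
      throw new Error(`More than one main key in shortcut "${input}"`);
    }
  }

  if (key === undefined) {
    throw new Error(`No main key in shortcut "${input}"`);
  }
  return { modifiers: Array.from(modifiers).sort(), key };
}
```

Under these assumptions, `"cmd+shift+i"` and `"meta+shift+i"` normalize to the same value, which is why the documented default can be written either way.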
|
|
51
54
|
|
|
52
|
-
|
|
53
|
-
bun run build
|
|
54
|
-
```
|
|
55
|
-
|
|
56
|
-
Outputs the production build to `out/`.
|
|
55
|
+
### Environment variables
|
|
57
56
|
|
|
58
|
-
|
|
57
|
+
| Provider | Required variables |
|
|
58
|
+
| --- | --- |
|
|
59
|
+
| `local` | None (model is auto-downloaded on first launch). Optional: `WHISPER_MODEL_PATH` (custom model path), `WHISPER_MODEL` (model name, default `medium-q5_0`), `SAY_VOICE` (macOS `say` voice name, e.g. `"Kyoko"`). |
|
|
60
|
+
| `gemini` | `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION` (optional, default `us-central1`) |
|
|
61
|
+
| `openai` | `OPENAI_API_KEY` |
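
The table above can be turned into a quick pre-flight check before starting the daemon. The sketch below is illustrative only: `missingEnvVars` is a hypothetical helper, and treating an empty string as "missing" is an assumption, not documented pi-voice behavior.

```typescript
type Provider = "local" | "gemini" | "openai";

// Required variables per provider, copied from the README table.
// Optional variables (e.g. GOOGLE_CLOUD_LOCATION, WHISPER_MODEL) are omitted.
const REQUIRED_ENV: Record<Provider, string[]> = {
  local: [],
  gemini: ["GOOGLE_CLOUD_PROJECT"],
  openai: ["OPENAI_API_KEY"],
};

// Hypothetical helper: returns the required variables that are unset or empty.
function missingEnvVars(
  provider: Provider,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV[provider].filter((name) => {
    const value = env[name];
    return value === undefined || value === "";
  });
}
```

In a Node.js script, `process.env` could be passed as the `env` argument; an empty result means every variable the selected provider requires is set.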
|
|
59
62
|
|
|
60
|
-
|
|
61
|
-
bun run preview
|
|
62
|
-
```
|
|
63
|
+
#### Whisper model (local provider)
|
|
63
64
|
|
|
64
|
-
|
|
65
|
+
The `local` provider uses [Whisper](https://github.com/openai/whisper) for STT and the macOS `say` command for TTS. On first launch, a ggml-format Whisper model (`medium-q5_0`, ~514 MB) is automatically downloaded to `~/.pi-agent/whisper/` and cached for subsequent runs.
|
|
65
66
|
|
|
66
|
-
|
|
67
|
+
To use a different model, set `WHISPER_MODEL`:
|
|
67
68
|
|
|
68
69
|
```bash
|
|
69
|
-
|
|
70
|
+
export WHISPER_MODEL=base # smaller & faster
|
|
70
71
|
```
|
|
71
72
|
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
Directory output only, without packaging (for testing):
|
|
73
|
+
Or point to your own model file directly:
|
|
75
74
|
|
|
76
75
|
```bash
|
|
77
|
-
|
|
76
|
+
export WHISPER_MODEL_PATH=/path/to/ggml-custom.bin
|
|
78
77
|
```
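
How the two variables might interact can be sketched as follows. The README does not state which one wins when both are set, so the precedence here (`WHISPER_MODEL_PATH` first) is an assumption, as is the `ggml-<model>.bin` file name inside the cache directory. `resolveWhisperModel` is a hypothetical function, not pi-voice's API.

```typescript
interface WhisperEnv {
  WHISPER_MODEL_PATH?: string;
  WHISPER_MODEL?: string;
}

// Default model name as documented in the README.
const DEFAULT_WHISPER_MODEL = "medium-q5_0";

// Hypothetical resolution logic under the assumptions stated above.
function resolveWhisperModel(env: WhisperEnv, homeDir: string): string {
  // Assumption: an explicit model file path overrides the model name.
  if (env.WHISPER_MODEL_PATH) {
    return env.WHISPER_MODEL_PATH;
  }
  const model = env.WHISPER_MODEL || DEFAULT_WHISPER_MODEL;
  // Cache directory from the README; the file naming scheme is assumed.
  return `${homeDir}/.pi-agent/whisper/ggml-${model}.bin`;
}
```

With neither variable set, this resolves to the default `medium-q5_0` model under `~/.pi-agent/whisper/`.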
|
|
78
|
+
|
|
79
|
+
## Contributing
|
|
80
|
+
|
|
81
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, build commands, and release workflow.
|