pi-voice 0.1.0 → 0.2.0

package/README.md CHANGED
@@ -1,78 +1,93 @@
  # pi-voice
 
- ## Setup
+ Headless voice interface for the [Pi Coding Agent](https://github.com/badlogic/pi-mono). Hold a key, speak, and pi executes your instructions with voice feedback.
+
+ https://github.com/user-attachments/assets/06a0c56f-76bf-48cb-9de3-f9ca48a7245d
+
+ ## Installation
 
  ```bash
- bun install
- bun run build
- bun link # register the `pi-voice` command globally
+ npm i -g pi-voice
+ # or
+ bun i -g pi-voice
  ```
 
- ## CLI
+ ## Usage
 
- pi-voice is a **daemon-style** application. Like Docker, `start` keeps it resident in the background and you control it through the CLI. No window is shown at startup.
-
- `status`, `stop`, and `show` do not launch Electron; they talk to the daemon over a Unix socket and respond immediately.
+ pi-voice is a daemon-style application that runs in the background once started. You can push-to-talk with the agent.
 
  ```bash
- # Start the daemon in the background (no window is shown)
- pi-voice start
+ pi-voice start # start the daemon in the background
+ pi-voice status # show state, PID, and uptime
+ pi-voice stop # stop the daemon
+ ```
 
- # Check the daemon's status (shows state, PID, and uptime)
- pi-voice status
+ The push-to-talk trigger defaults to `Cmd+Shift+I` (macOS) / `Win+Shift+I` (Windows). Hold the key to record, release to send.
 
- # Show the window
- pi-voice show
+ ## Setting
 
- # Stop the daemon (also disables the Fn key)
- pi-voice stop
- ```
+ ### pi agent configuration
 
- - `start` is the default command when no arguments are given. It exits with an error if the daemon is already running.
- - `start` requires a prior `bun run build` (it fails if `out/main/index.js` does not exist).
- - Closing the window leaves the daemon running in the background. To stop it completely, use `stop` or Cmd+Q.
- - Runtime state is stored in `~/.pi-voice/runtime-state.json`, and the control socket is at `~/.pi-voice/daemon.sock`.
+ pi-voice launches a Pi agent session with the directory where `pi-voice start` was executed. This means **all standard pi configuration works as-is**:
 
- ### Development mode
+ - `AGENTS.md` — walked up from `cwd` to the filesystem root
+ - `.pi/settings.json` — project-level settings
+ - `.pi/skills/`, `.pi/extensions/`, `.pi/prompts/` — project-level resources
+ - `~/.pi/agent/` — global settings, skills, extensions, prompts, and models
+ - and more
 
- ```bash
- bun run dev:electron
- ```
+ Refer to the [Pi documentation](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent) for details on these settings.
 
- Serves the renderer from a Vite dev server with HMR while launching Electron (in development, closing the window exits the app).
+ ### pi-voice configuration
 
- To run the CLI on its own:
+ You can configure pi-voice in `.pi/pi-voice.json`:
 
- ```bash
- bun run dev:cli
+ ```json
+ {
+   "key": "ctrl+t",
+   "provider": "local"
+ }
  ```
 
- ## Build
+ | Key | Description |
+ | --- | --- |
+ | `key` | Push-to-talk shortcut. Combine modifiers (`ctrl`, `shift`, `alt`/`opt`, `meta`/`cmd`) and a main key with `+`. Examples: `"ctrl+t"`, `"alt+space"`, `"ctrl+shift+r"`. Default: `"meta+shift+i"`. |
+ | `provider` | Speech provider for STT & TTS. `"local"`, `"gemini"` (Vertex AI), or `"openai"`. Default: `"local"`. |
 
- ```bash
- bun run build
- ```
+ ### Environment variables
+
+ | Provider | Required variables |
+ | --- | --- |
+ | `local` | None (model is auto-downloaded on first launch). Optional: `WHISPER_MODEL_PATH` (custom model path), `WHISPER_MODEL` (model name, default `medium-q5_0`), `SAY_VOICE` (macOS `say` voice name, e.g. `"Kyoko"`). |
+ | `gemini` | `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION` (optional, default `us-central1`) |
+ | `openai` | `OPENAI_API_KEY` |
+
+ #### Logging
 
- Outputs the production build to `out/`.
+ The daemon writes structured JSON logs to both the console and a log file. The default log file path is `$XDG_CONFIG_HOME/pi-voice/daemon.log` (falls back to `~/.config/pi-voice/daemon.log`).
 
- ## Preview
+ To override the log file path:
 
  ```bash
- bun run preview
+ export PI_VOICE_LOG_PATH=/path/to/custom.log
  ```
 
- Launches Electron with the built artifacts to verify behavior.
+ #### Whisper model (local provider)
 
- ## Distribution
+ The `local` provider uses [Whisper](https://github.com/openai/whisper) for STT and the macOS `say` command for TTS. On first launch, a ggml-format Whisper model (`medium-q5_0`, ~514 MB) is automatically downloaded to `~/.pi-agent/whisper/` and cached for subsequent runs.
+
+ To use a different model, set `WHISPER_MODEL`:
 
  ```bash
- bun run dist
+ export WHISPER_MODEL=base # smaller & faster
  ```
 
- Runs `bun run build` plus electron-builder to generate a macOS dmg/zip in `release/`.
-
- To output the directory only, without packaging (for testing):
+ Or point to your own model file directly:
 
  ```bash
- bun run dist:dir
+ export WHISPER_MODEL_PATH=/path/to/ggml-custom.bin
  ```
+
+ ## Contributing
+
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, build commands, and release workflow.
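
The log file lookup described in the new Logging section (an explicit `PI_VOICE_LOG_PATH`, then `$XDG_CONFIG_HOME/pi-voice/daemon.log`, then `~/.config/pi-voice/daemon.log`) can be sketched in shell as follows. This is only a sketch of the documented fallback order, not the daemon's actual code: the function name `pi_voice_log_path` is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of pi-voice): prints the log file path
# following the fallback order documented in the 0.2.0 README.
pi_voice_log_path() {
  # 1. An explicit override wins (assumption: an empty value counts as unset).
  if [ -n "${PI_VOICE_LOG_PATH:-}" ]; then
    printf '%s\n' "$PI_VOICE_LOG_PATH"
    return
  fi
  # 2. Otherwise use $XDG_CONFIG_HOME, falling back to ~/.config.
  printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/pi-voice/daemon.log"
}
```

With the function loaded, something like `tail -f "$(pi_voice_log_path)"` would follow the daemon's log wherever the current environment places it.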