ai-notify 0.2.1 → 0.4.0

package/README.ja.md ADDED
@@ -0,0 +1,140 @@
1
+ # ai-notify
2
+
3
+ [English](README.md) · **日本語**
4
+
5
+ **ターミナルのAIエージェントが「あなたを必要とした瞬間」を逃さない** — Claude Code・Codex などのエージェントがターンを終えた/入力を求めた瞬間に、音・読み上げ・デスクトップ通知で知らせます。**全エージェント・全ターミナルを1つのスイッチで一括ミュート**。デーモンも常駐プロセスも無し。
6
+
7
+ ![ai-notify demo](https://raw.githubusercontent.com/unoryota/ai-notify/main/assets/hero-ja.gif)
8
+
9
+ ```sh
10
+ brew install unoryota/tap/ai-notify # macOS(Homebrew)
11
+ # または: npm i -g ai-notify
12
+
13
+ ai-notify init # インストール済みのエージェントを自動検出して配線
14
+ ```
15
+
16
+ ## 何が違うか
17
+
18
+ エージェントは数分間も沈黙しがち。ai-notify は適切な瞬間に呼び戻します。とくに **AIを並列でたくさん動かす運用**のために作られています:
19
+
20
+ - 🎙️ **ターミナルごとに声を変えられる。** ペインごとに別の声を割り当てれば、**どの窓が終わったか**を聞くだけで分かります — `export AI_NOTIFY_VOICE=Eddy`(または [VOICEVOX](#-voicevox-キャラクターボイス) のキャラ)。
21
+ - 🌐 **あなたの言語で読み上げ。** エージェントの英語の返信・プロンプトを翻訳してから読み上げ/表示(キーレス・無料)。日本人にうれしい。
22
+ - 📝 **「何をしたか」を教える。** 完了通知は「終わりました」だけでなく、エージェントの最後の返信(transcript)から作業内容を要約します。
23
+ - 🔕 **1スイッチで全部ミュート。** 全エージェント・全ターミナルが同じフラグを読むので、会議中はワンタップで全部静かに。
24
+ - 🔔 **ネイティブのメニューバーも内蔵。** `ai-notify menubar install` — Hammerspoon/SwiftBar 不要。
25
+
26
+ ## 対応エージェント
27
+
28
+ | エージェント | 状態 | 配線方法 |
29
+ | ----- | ------ | -------------- |
30
+ | Claude Code | ✅ | `~/.claude/settings.json` の `Notification` + `Stop` フック |
31
+ | Codex CLI | ✅ | `~/.codex/config.toml` の `notify`(`agent-turn-complete`) |
32
+ | Gemini CLI | 🧪 検出のみ・フック作業中 | PR歓迎 |
33
+
34
+ 別のエージェント(aider, opencode, amp …)の追加は小さなPRで可能:`src/providers/` にファイルを1つ置くだけ。[CONTRIBUTING](CONTRIBUTING.md) 参照。
35
+
36
+ ## コマンド
37
+
38
+ ```sh
39
+ ai-notify init [--dry-run] [--only claude,codex] # 検出したエージェントを配線
40
+ ai-notify toggle | on | off | status # ミュートスイッチ
41
+ ai-notify volume [0.0-2.0] # 音量の取得/設定
42
+ ai-notify voice [number|name|preview|default] # 読み上げ音声を選ぶ
43
+ ai-notify voicevox [on <id>|off|speakers|test] # VOICEVOXの声で読み上げ
44
+ ai-notify tsundere [on|off|level <0-1>|test] # ツンデレ口調(緊急度でツン⇄デレ)
45
+ ai-notify translate [on <lang>|off|test] # エージェントの文章を自分の言語で
46
+ ai-notify menubar [install|uninstall|status] # ネイティブのメニューバー(macOS)
47
+ ai-notify doctor # 依存・配線の確認
48
+ ai-notify uninstall # 配線をきれいに削除
49
+ ```
50
+
51
+ ペイン別の上書き — エージェントを起動する**前**に、その端末で `export` する:
52
+
53
+ ```sh
54
+ AI_NOTIFY_LABEL=api # この窓の読み上げ/通知での名前
55
+ AI_NOTIFY_VOICE=Eddy # この窓の `say` 音声
56
+ AI_NOTIFY_VOICEVOX_SPEAKER=3 # この窓の VOICEVOX 話者ID
57
+ AI_NOTIFY_VOLUME=0.5 # この窓の音量(0.0〜2.0)
58
+ AI_NOTIFY_TSUNDERE_LEVEL=0.8 # この窓のツンデレ既定値(0=デレ〜1=ツン)
59
+ ```
60
+
61
+ ## 🎛️ ネイティブのメニューバー — ミュート・音量・声
62
+
63
+ エージェントが走っているターミナルにはコマンドを打てないので、**メニューバー**から全部操作します:
64
+
65
+ ```sh
66
+ ai-notify menubar install # ネイティブのメニューバーアプリ・ログイン時に自動起動
67
+ ```
68
+
69
+ モノクロの波形アイコンが**状態を色で**表します(Adobe風):通常はシルエットのみ、入力待ちがあると**黄ドット**、ミュート中は**赤+斜線**。
70
+
71
+ - **左クリック** → メニュー:**音量スライダー**、**ツンデレ**トグル+デレ⇄ツンスライダー、**声の一覧**(システム+VOICEVOX)、**ペイン別**設定(開いている各ターミナルに個別の声と音量)。
72
+ - **右クリック** → 即ミュート切替。
73
+
74
+ 第三者アプリ不要。別の方法が好みなら、**Hammerspoon**・**SwiftBar/xbar**・**Raycast**・標準の**ショートカット**用レシピが [`recipes/`](recipes/) にあります。`ai-notify status --icon` は `🔔`/`🔕` だけを出力するので、tmux・プロンプト・Claude Code のステータスラインに埋め込めます。
75
+
76
+ > 切替は実行中でも効きます:次にエージェントが発火した時にフラグを読むので、トグルした瞬間に全稼働エージェントへ反映されます。
77
+
78
+ ## 🎙️ VOICEVOX キャラクターボイス
79
+
80
+ 通知を [VOICEVOX](https://voicevox.hiroshiba.jp/) のキャラ声(例:ずんだもん)で読み上げられます(無料・ローカル・オフライン)。
81
+
82
+ > **VOICEVOXアプリのインストールと起動が必要です。** ai-notify はローカルのエンジンを叩くだけで、音声データは同梱していません。未起動なら ai-notify は**OS標準の音声**(Samantha, Kyoko …)を使います(こちらは設定不要ですぐ動きます)。
83
+
84
+ `ai-notify voicevox setup` が案内します — ダウンロードページを開き、インストール済みならアプリを起動してエンジンの起動を待ちます。その後:
85
+
86
+ ```sh
87
+ ai-notify voicevox setup # VOICEVOXの導入/起動
88
+ ai-notify voicevox speakers # 利用可能なキャラとIDの一覧
89
+ ai-notify voicevox on 3 # 話者3(例:ずんだもん)を使う
90
+ ```
91
+
92
+ `AI_NOTIFY_VOICEVOX_SPEAKER` で各ペインに別キャラを割り当て可能。エンジンが起動していなければ自動でOS音声にフォールバックします。
93
+ *VOICEVOXのキャラには利用規約があります。録画などを共有する場合は [VOICEVOXのガイドライン](https://voicevox.hiroshiba.jp/term/) に従ってクレジットしてください。*
94
+
95
+ ## 🌐 あなたの言語で読み上げ
96
+
97
+ ```sh
98
+ ai-notify translate on ja # エージェントのメッセージを翻訳してから読み上げ
99
+ ai-notify translate test "I fixed the auth bug and added 3 tests."
100
+ ```
101
+
102
+ キーレス・無料(HTTP 1リクエスト。オフライン時はローカルの定型文にフォールバック)。デスクトップ通知には原文も表示されます。
103
+
104
+ ## 💢 ツンデレモード(任意・遊び心)
105
+
106
+ 読み上げに「ツンデレ」人格を載せ、**事象の緊急度で口調が変わります**:
107
+
108
+ - **失敗・危険な許可待ち** → 声大きめの鋭い**ツン**で「ちょっと!ビルドが失敗じゃない。早く直しなさいよね!」
109
+ - **問題なしのパス** → やさしい**デレ**で「…ふふ、よくやったじゃない。べ、別に褒めてないんだからね…えらいえらい。」
110
+
111
+ ```sh
112
+ ai-notify tsundere on # 既定はOFF
113
+ ai-notify tsundere level 0.6 # 既定の強さ 0(デレ)〜1(ツン)。メニューバーにもスライダー
114
+ ai-notify tsundere test # T3/T2/T1/T0 のサンプルを試聴
115
+ ```
116
+
117
+ **無API・決定論・オフライン**(テンプレートで生成。課金ゼロ)。緊急度はエージェントの文面からのキーワード推定(厳密な重大度ではなくベストエフォート)で、デスクトップ通知は素の文面のまま。**VOICEVOX**利用時は、強さに応じて同じキャラの**ツンツン/あまあま**スタイルを選ぶので、声色そのものがツン・デレに変わります。`lang` は `ja` / `en` 対応。
118
+
119
+ ## ⏳ どの窓が・何を求めているか
120
+
121
+ 各通知のタイトルに窓ラベルが付きます — 入力待ちは `⏳ <label>`、完了は `✓ <label>`。本文には**何を**(翻訳されたプロンプト、または作業内容の要約)が出ます。各ペインに短い `AI_NOTIFY_LABEL` を設定すれば、10個のターミナルもひと目で見分けられます。
122
+
123
+ ## 仕組み
124
+
125
+ XDGパス配下の単一のミュートフラグと設定だけ — デーモンも調整も無し:
126
+
127
+ ```
128
+ ${XDG_STATE_HOME:-~/.local/state}/ai-notify/muted # 存在=ミュート
129
+ ${XDG_CONFIG_HOME:-~/.config}/ai-notify/config.json # 音・声・各種オプション
130
+ ```
131
+
132
+ 各エージェントのフックが `ai-notify hook --source <agent>` を呼び、発火時にこの1つのフラグを読みます。`ai-notify config init` で編集可能な設定(エージェント別の音・声・TTSバックエンド・翻訳・テンプレート)を書き出せます。
133
+
134
+ ## 対応プラットフォーム
135
+
136
+ macOS は完全対応(`afplay` / `say` / VOICEVOX / `terminal-notifier` / ネイティブメニューバー)。Linux はベストエフォート(`paplay`/`canberra`, `notify-send`, `spd-say`/`espeak`, VOICEVOX)。Windows はビープ+PowerShell読み上げ。利用できないバックエンドは静かに縮退し、エラーにはなりません。
137
+
138
+ ## ライセンス
139
+
140
+ [MIT](LICENSE)。ランタイム依存ゼロ。
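
Note on the mute flag described in the added README: the flag lives at `${XDG_STATE_HOME:-~/.local/state}/ai-notify/muted`, and its mere presence means "muted". A minimal sketch of that lookup follows. The function names `muteFlagPath` and `isMutedSketch`, and the injected `exists` callback, are hypothetical and are not taken from the package; a real implementation would more likely use `node:path` and `fs.existsSync`.

```javascript
// Hypothetical helper: build the mute-flag path the README describes.
// `env` and `home` are passed in so the sketch has no side effects.
function muteFlagPath(env, home) {
  // Shell `${VAR:-default}` falls back when VAR is unset OR empty,
  // which `||` mirrors for strings.
  const stateHome = env.XDG_STATE_HOME || `${home}/.local/state`;
  return `${stateHome}/ai-notify/muted`;
}

// Presence of the file means "muted"; its contents are never inspected.
// `exists` stands in for a filesystem check (assumption for illustration).
function isMutedSketch(env, home, exists) {
  return exists(muteFlagPath(env, home)) === true;
}
```

Because every hook invocation would re-check the same path, a toggle takes effect the next time any agent fires, which matches the README's "no daemon" claim.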
package/README.md CHANGED
@@ -1,11 +1,15 @@
1
1
  # ai-notify
2
2
 
3
+ **English** · [日本語](README.ja.md)
4
+
3
5
  **Know the moment your terminal AI agent needs you** — a sound, a spoken read-out, and a desktop banner the instant Claude Code, Codex, or another agent finishes a turn or asks for input. One mute switch covers **all of them, across every terminal**. No daemon, no background process.
4
6
 
5
- ![ai-notify demo](https://raw.githubusercontent.com/unoryota/ai-notify/main/assets/demo.gif)
7
+ ![ai-notify demo](https://raw.githubusercontent.com/unoryota/ai-notify/main/assets/hero-en.gif)
6
8
 
7
9
  ```sh
8
- npm i -g ai-notify
10
+ brew install unoryota/tap/ai-notify # macOS (Homebrew)
11
+ # or: npm i -g ai-notify
12
+
9
13
  ai-notify init # auto-detects your agents and wires them
10
14
  ```
11
15
 
@@ -19,10 +23,6 @@ Plenty of agents go quiet for minutes. ai-notify pulls you back at the right mom
19
23
  - 🔕 **One switch mutes everything.** Every agent in every terminal reads the same flag — one tap silences them all for a meeting.
20
24
  - 🔔 **A real menu bar bell, built in.** `ai-notify menubar install` — no Hammerspoon/SwiftBar required.
21
25
 
22
- > ### 日本語
23
- > 複数のAIエージェント(Claude Code / Codex …)を**並列で動かすと、どのターミナルの通知か分からない**——を解決する通知ツール。
24
- > **ペインごとに声を変えられる**(VOICEVOXのキャラ声も)/**英語の出力を日本語に翻訳して読み上げ**/**完了通知に作業内容の要約**/**1タップで全部ミュート**(MTG用)/**メニューバーのベルも内蔵**。
25
-
26
26
  ## Supported agents
27
27
 
28
28
  | Agent | Status | How it's wired |
@@ -41,6 +41,7 @@ ai-notify toggle | on | off | status # the mute switch
41
41
  ai-notify volume [0.0-2.0] # get/set output volume
42
42
  ai-notify voice [number|name|preview|default] # pick the spoken voice
43
43
  ai-notify voicevox [on <id>|off|speakers|test] # speak in VOICEVOX voices
44
+ ai-notify tsundere [on|off|level <0-1>|test] # tsundere persona (ツン⇄デレ by urgency)
44
45
  ai-notify translate [on <lang>|off|test] # speak agent text in your language
45
46
  ai-notify menubar [install|uninstall|status] # native menu bar app (macOS)
46
47
  ai-notify doctor # check deps & wiring
@@ -53,6 +54,7 @@ Per-window overrides — `export` these in a terminal *before* launching the age
53
54
  AI_NOTIFY_LABEL=api # name this window in the read-out / notification
54
55
  AI_NOTIFY_VOICE=Eddy # this window's `say` voice
55
56
  AI_NOTIFY_VOICEVOX_SPEAKER=3 # this window's VOICEVOX speaker id
57
+ AI_NOTIFY_TSUNDERE_LEVEL=0.8 # this window's tsundere baseline (0=デレ … 1=ツン)
56
58
  AI_NOTIFY_VOLUME=0.5 # this window's volume (0.0–2.0)
57
59
  ```
58
60
 
@@ -66,7 +68,7 @@ ai-notify menubar install # native menu bar app, starts at login
66
68
 
67
69
  A monochrome waveform icon shows status by color (Adobe-style): plain when idle, a **yellow** dot when an agent is waiting for you, **red + slash** when muted.
68
70
 
69
- - **Left-click** → menu: a **volume slider**, the **voice list** (system + VOICEVOX), and **per-pane** controls — each open terminal gets its own voice *and* volume.
71
+ - **Left-click** → menu: a **volume slider**, a **tsundere** toggle + デレ⇄ツン slider, the **voice list** (system + VOICEVOX), and **per-pane** controls — each open terminal gets its own voice *and* volume.
70
72
  - **Right-click** → instant mute toggle.
71
73
 
72
74
  No third-party app needed. Prefer something else? There are drop-in recipes for **Hammerspoon**, **SwiftBar/xbar**, **Raycast**, and the built-in **macOS Shortcuts** in [`recipes/`](recipes/). `ai-notify status --icon` prints just `🔔`/`🔕` to embed in tmux / your prompt / Claude Code's status line.
@@ -75,9 +77,14 @@ No third-party app needed. Prefer something else? There are drop-in recipes for
75
77
 
76
78
  ## 🎙️ VOICEVOX character voices
77
79
 
78
- Speak your notifications in [VOICEVOX](https://voicevox.hiroshiba.jp/) character voices (free, local, offline). Run the VOICEVOX app, then:
80
+ Optionally speak your notifications in [VOICEVOX](https://voicevox.hiroshiba.jp/) character voices (e.g. ずんだもん) — free, local, offline.
81
+
82
+ > **Needs the VOICEVOX app installed and running.** ai-notify calls its local engine; it does not bundle the voices. Without it, ai-notify just uses your OS voice (Samantha, Kyoko, …) — no setup required.
83
+
84
+ `ai-notify voicevox setup` walks you through it — it opens the download page, or launches the app and waits for the engine if it's already installed. Then:
79
85
 
80
86
  ```sh
87
+ ai-notify voicevox setup # install / launch VOICEVOX
81
88
  ai-notify voicevox speakers # list available characters + ids
82
89
  ai-notify voicevox on 3 # use speaker 3 (e.g. ずんだもん)
83
90
  ```
@@ -94,6 +101,21 @@ ai-notify translate test "I fixed the auth bug and added 3 tests."
94
101
 
95
102
  Key-less and no cost (one HTTP request; falls back to a localized template offline). The desktop banner still shows the original text.
96
103
 
104
+ ## 💢 Tsundere mode (optional, fun)
105
+
106
+ Give the spoken read-out a tsundere persona whose tone tracks **how urgent the event is**:
107
+
108
+ - a **failure / dangerous approval** → a louder, sharp **ツン** scolding ("Hey! The build failed — don't just sit there, fix it!")
109
+ - a **clean pass / no issues** → a warm **デレ** "good job" ("...heh, not bad. N-not that I'm impressed or anything.")
110
+
111
+ ```sh
112
+ ai-notify tsundere on # off by default
113
+ ai-notify tsundere level 0.6 # baseline 0 (デレ) … 1 (ツン); the menu bar has a slider
114
+ ai-notify tsundere test # hear T3/T2/T1/T0 samples
115
+ ```
116
+
117
+ It's **deterministic and offline** — phrase banks, no API, no cost. The urgency is a keyword heuristic over the agent's text (so it's best-effort, not a real severity signal), and the desktop banner stays factual. With **VOICEVOX** the level also picks the character's own **ツンツン / あまあま** style, so the same character actually *sounds* harsher or sweeter. `lang` supports `ja` and `en`.
118
+
97
119
  ## ⏳ Which window, and what it's asking
98
120
 
99
121
  Each notification is titled with the window label — `⏳ <label>` when an agent is waiting, `✓ <label>` when it's done — and the body says **what** (the translated prompt, or a summary of what was done). Set a short `AI_NOTIFY_LABEL` per pane and you can tell ten terminals apart at a glance.
@@ -46,6 +46,12 @@ enum State {
46
46
  try? String(format: "%.2f", v).write(toFile: file("volume"), atomically: true, encoding: .utf8)
47
47
  }
48
48
 
49
+ // Tsundere baseline level 0.0 (デレ) – 1.0 (ツン). Same file the CLI reads.
50
+ static func setTsundereLevel(_ v: Double) {
51
+ try? FileManager.default.createDirectory(atPath: dir(), withIntermediateDirectories: true)
52
+ try? String(format: "%.2f", v).write(toFile: file("tsundere-level"), atomically: true, encoding: .utf8)
53
+ }
54
+
49
55
  @discardableResult
50
56
  static func cli(_ args: [String], capture: Bool = false) -> String? {
51
57
  let launcher = file("cli")
@@ -130,9 +136,36 @@ final class AppDelegate: NSObject, NSApplicationDelegate {
130
136
  @objc private func quit() { NSApp.terminate(nil) }
131
137
 
132
138
  @objc private func volumeChanged(_ s: NSSlider) { State.setVolume(s.doubleValue) }
139
+ // Slider is shown reversed (left = ツン, right = デレ) but the file keeps the
140
+ // canonical scale (0 = デレ, 1 = ツン), so write back 1 - position.
141
+ @objc private func tsundereLevelChanged(_ s: NSSlider) { State.setTsundereLevel(1 - s.doubleValue) }
142
+ @objc private func tsundereToggled(_ b: NSButton) { State.cli(["tsundere", "toggle"]) }
143
+ @objc private func paneTsundereChanged(_ s: NSSlider) {
144
+ if let tty = s.identifier?.rawValue { State.cli(["tsundere-pane", tty, String(format: "%.2f", 1 - s.doubleValue)]) }
145
+ }
133
146
  @objc private func paneVolumeChanged(_ s: NSSlider) {
134
147
  if let tty = s.identifier?.rawValue { State.cli(["volume-pane", tty, String(format: "%.2f", s.doubleValue)]) }
135
148
  }
149
+ // identifier carries the prosody key (speed | pitch | intonation).
150
+ @objc private func prosodyChanged(_ s: NSSlider) {
151
+ if let key = s.identifier?.rawValue { State.cli(["voice-prosody", key, String(format: "%.3f", s.doubleValue)]) }
152
+ }
153
+
154
+ // A labeled VOICEVOX base-prosody slider (speed / pitch / intonation). Applied
155
+ // on release (one subprocess per drag avoided). The key rides in the identifier.
156
+ private func prosodyRow(label: String, value: Double, lo: Double, hi: Double, key: String) -> NSMenuItem {
157
+ let row = NSView(frame: NSRect(x: 0, y: 0, width: 240, height: 24))
158
+ let cap = NSTextField(labelWithString: label)
159
+ cap.frame = NSRect(x: 12, y: 4, width: 48, height: 16)
160
+ cap.font = .systemFont(ofSize: 11); cap.textColor = .secondaryLabelColor
161
+ let slider = NSSlider(value: value, minValue: lo, maxValue: hi, target: self, action: #selector(prosodyChanged(_:)))
162
+ slider.frame = NSRect(x: 62, y: 3, width: 162, height: 20)
163
+ slider.isContinuous = false
164
+ slider.identifier = NSUserInterfaceItemIdentifier(key)
165
+ row.addSubview(cap); row.addSubview(slider)
166
+ let item = NSMenuItem(); item.view = row
167
+ return item
168
+ }
136
169
 
137
170
  // A 🔊 + slider row. identifier == nil => global (live); otherwise a pane tty
138
171
  // (applied on release to avoid a subprocess per drag tick).
@@ -149,6 +182,44 @@ final class AppDelegate: NSObject, NSApplicationDelegate {
149
182
  return item
150
183
  }
151
184
 
185
+ // A ツン ⇄ デレ slider for the tsundere baseline level. Shown reversed (left =
186
+ // ツン, right = デレ) for intuition, while the file keeps 0 = デレ, 1 = ツン — so
187
+ // the knob sits at 1 - value and writes back 1 - position. Continuous.
188
+ private func tsundereRow(value: Double, identifier: String? = nil) -> NSMenuItem {
189
+ let row = NSView(frame: NSRect(x: 0, y: 0, width: 220, height: 26))
190
+ let left = NSTextField(labelWithString: "ツン")
191
+ left.frame = NSRect(x: 12, y: 5, width: 30, height: 16)
192
+ left.font = .systemFont(ofSize: 10); left.textColor = .secondaryLabelColor
193
+ // identifier == nil => global (live, writes the level file); a pane tty =>
194
+ // per-pane override applied on release (one subprocess per drag avoided).
195
+ let action: Selector = identifier == nil ? #selector(tsundereLevelChanged(_:)) : #selector(paneTsundereChanged(_:))
196
+ let slider = NSSlider(value: 1 - value, minValue: 0, maxValue: 1, target: self, action: action)
197
+ slider.frame = NSRect(x: 46, y: 3, width: 128, height: 20)
198
+ slider.isContinuous = (identifier == nil)
199
+ slider.trackFillColor = .systemPink
200
+ if let id = identifier { slider.identifier = NSUserInterfaceItemIdentifier(id) }
201
+ let right = NSTextField(labelWithString: "デレ")
202
+ right.frame = NSRect(x: 178, y: 5, width: 30, height: 16)
203
+ right.font = .systemFont(ofSize: 10); right.textColor = .secondaryLabelColor
204
+ row.addSubview(left); row.addSubview(slider); row.addSubview(right)
205
+ let item = NSMenuItem(); item.view = row
206
+ return item
207
+ }
208
+
209
+ // ツンデレモード on/off as a checkbox living inside a view row, so a click
210
+ // toggles in place instead of dismissing the menu (a normal menu item closes
211
+ // on click). The level slider below stays mounted regardless of this state, so
212
+ // the menu height never jumps.
213
+ private func tsundereToggleRow(on: Bool) -> NSMenuItem {
214
+ let row = NSView(frame: NSRect(x: 0, y: 0, width: 220, height: 24))
215
+ let btn = NSButton(checkboxWithTitle: "ツンデレモード", target: self, action: #selector(tsundereToggled(_:)))
216
+ btn.frame = NSRect(x: 12, y: 2, width: 196, height: 20)
217
+ btn.state = on ? .on : .off
218
+ row.addSubview(btn)
219
+ let item = NSMenuItem(); item.view = row
220
+ return item
221
+ }
222
+
152
223
  // representedObject is the full CLI arg array to run.
153
224
  @objc private func runItem(_ item: NSMenuItem) {
154
225
  if let cmd = item.representedObject as? [String] { State.cli(cmd) }
@@ -163,16 +234,49 @@ final class AppDelegate: NSObject, NSApplicationDelegate {
163
234
  private func showMenu() {
164
235
  let menu = NSMenu()
165
236
 
166
- // Global volume slider.
167
- menu.addItem(sliderRow(value: State.volume, action: #selector(volumeChanged(_:)), identifier: nil))
168
- menu.addItem(.separator())
169
-
170
237
  // Parse menu-json once.
171
238
  let json = (State.cli(["menu-json"], capture: true)?.data(using: .utf8))
172
239
  .flatMap { try? JSONSerialization.jsonObject(with: $0) as? [String: Any] }
173
240
  let voices = (json?["voices"] as? [[String: Any]]) ?? []
174
241
  let panes = (json?["panes"] as? [[String: Any]]) ?? []
175
242
 
243
+ // Global volume slider.
244
+ menu.addItem(sliderRow(value: State.volume, action: #selector(volumeChanged(_:)), identifier: nil))
245
+
246
+ // Tsundere mode: checkbox toggle + ツン⇄デレ baseline slider. Both live in
247
+ // view rows and are always mounted, so toggling never closes the menu nor
248
+ // shifts its height.
249
+ let tsun = json?["tsundere"] as? [String: Any]
250
+ let tsunOn = (tsun?["enabled"] as? Bool) ?? false
251
+ let tsunLevel = (tsun?["level"] as? Double) ?? 0.5
252
+ menu.addItem(tsundereToggleRow(on: tsunOn))
253
+ menu.addItem(tsundereRow(value: tsunLevel))
254
+ menu.addItem(.separator())
255
+
256
+ // VOICEVOX base prosody (speed / pitch / intonation) — only when VOICEVOX
257
+ // is the active TTS, since these are VOICEVOX audio_query scales.
258
+ if (json?["tts"] as? String) == "voicevox" {
259
+ let pr = json?["prosody"] as? [String: Any] ?? [:]
260
+ let range = json?["prosodyRange"] as? [String: Any] ?? [:]
261
+ let bounds: (String, Double, Double) -> (Double, Double) = { key, dlo, dhi in
262
+ let r = range[key] as? [Any]
263
+ let lo = (r?.first as? Double) ?? dlo
264
+ let hi = (r?.last as? Double) ?? dhi
265
+ return (lo, hi)
266
+ }
267
+ menu.addItem(disabledHeader("読み上げ(VOICEVOX)"))
268
+ for (key, label, dlo, dhi, dflt) in [
269
+ ("speed", "速さ", 0.5, 1.5, 1.0),
270
+ ("pitch", "高さ", -0.15, 0.15, 0.0),
271
+ ("intonation", "抑揚", 0.0, 1.5, 1.0),
272
+ ] {
273
+ let (lo, hi) = bounds(key, dlo, dhi)
274
+ let v = (pr[key] as? Double) ?? dflt
275
+ menu.addItem(prosodyRow(label: label, value: v, lo: lo, hi: hi, key: key))
276
+ }
277
+ menu.addItem(.separator())
278
+ }
279
+
176
280
  if voices.isEmpty {
177
281
  menu.addItem(disabledHeader("(声の一覧を取得できません)"))
178
282
  } else {
@@ -200,6 +304,15 @@ final class AppDelegate: NSObject, NSApplicationDelegate {
200
304
  volDef.state = (p["volumeSet"] as? Bool ?? false) ? .off : .on
201
305
  sub.addItem(volDef)
202
306
  sub.addItem(.separator())
307
+ // Per-pane tsundere baseline (same ツン⇄デレ slider as global).
308
+ sub.addItem(disabledHeader("ツンデレ"))
309
+ let pts = (p["tsundere"] as? Double) ?? tsunLevel
310
+ sub.addItem(tsundereRow(value: pts, identifier: tty))
311
+ let tsDef = NSMenuItem(title: "強さを全体に従う", action: #selector(runItem(_:)), keyEquivalent: "")
312
+ tsDef.target = self; tsDef.representedObject = ["tsundere-pane", tty, "clear"]
313
+ tsDef.state = (p["tsundereSet"] as? Bool ?? false) ? .off : .on
314
+ sub.addItem(tsDef)
315
+ sub.addItem(.separator())
203
316
  // Per-pane voice.
204
317
  sub.addItem(disabledHeader("声"))
205
318
  let def = NSMenuItem(title: "デフォルト(全体に従う)", action: #selector(runItem(_:)), keyEquivalent: "")
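
Note on the reversed slider in the Swift hunks above: the menu shows ツン on the left and デレ on the right, while the stored value keeps the canonical scale (0 = デレ, 1 = ツン). The knob is therefore placed at `1 - value` and the handler writes back `1 - position`. A small sketch of that round trip, written in JavaScript for consistency with the CLI sources; the function names are hypothetical.

```javascript
// Canonical scale stored on disk: 0 = デレ, 1 = ツン.
// The slider is displayed reversed: position 0 (left) = ツン, 1 (right) = デレ.

// Where the knob should sit for a stored level.
function knobPosition(level) {
  return 1 - level;
}

// What to store when the knob is moved to `position`.
function levelFromKnob(position) {
  return 1 - position;
}

// Two-decimal formatting, comparable to the "%.2f" used when the value is written.
function formatLevel(level) {
  return level.toFixed(2);
}
```

The mapping is its own inverse, so reading a stored level, showing it, and writing it back without moving the knob leaves the formatted value unchanged.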
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ai-notify",
3
- "version": "0.2.1",
3
+ "version": "0.4.0",
4
4
  "description": "Desktop, sound, and spoken notifications for terminal AI coding agents (Claude Code, Codex, Gemini, ...) — with one mute switch that covers all of them, across every terminal.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -15,6 +15,7 @@
15
15
  "menubar/README.md",
16
16
  "menubar/dist",
17
17
  "README.md",
18
+ "README.ja.md",
18
19
  "LICENSE"
19
20
  ],
20
21
  "scripts": {
package/src/cli.mjs CHANGED
@@ -3,7 +3,7 @@
3
3
  // One mute switch for all of them, across every terminal. No daemon.
4
4
 
5
5
  import { readFileSync } from 'node:fs';
6
- import { execSync } from 'node:child_process';
6
+ import { execSync, execFileSync } from 'node:child_process';
7
7
  import { providers, byId } from './providers/index.mjs';
8
8
  import { emit } from './notify.mjs';
9
9
  import { deriveLabel, cliInvocation, isEphemeralInstall } from './util.mjs';
@@ -12,6 +12,7 @@ import * as menubar from './menubar.mjs';
12
12
  import { translate } from './translate.mjs';
13
13
  import { diagnose as highlightDiagnose, clearHighlight } from './highlight.mjs';
14
14
  import * as voicevox from './voicevox.mjs';
15
+ import * as tsundere from './tsundere.mjs';
15
16
  import {
16
17
  isMuted,
17
18
  setMuted,
@@ -22,12 +23,18 @@ import {
22
23
  DEFAULT_CONFIG,
23
24
  readVolume,
24
25
  setVolume,
26
+ readTsundereLevel,
27
+ setTsundereLevel,
28
+ readVoiceProsody,
29
+ setVoiceProsody,
30
+ resetVoiceProsody,
31
+ VOICE_PROSODY_RANGE,
25
32
  readPanes,
26
33
  readPaneSetting,
27
34
  updatePaneSetting,
28
35
  } from './state.mjs';
29
36
 
30
- const VERSION = '0.2.1';
37
+ const VERSION = '0.3.0';
31
38
 
32
39
  const args = process.argv.slice(2);
33
40
  const cmd = args[0];
@@ -253,6 +260,33 @@ const cmds = {
253
260
  const config = readConfig();
254
261
  const url = config.voicevox?.url || voicevox.DEFAULT_URL;
255
262
 
263
+ if (sub === 'setup') {
264
+ if (voicevox.isAvailable(url)) {
265
+ log('✓ VOICEVOX engine is already running.');
266
+ return log('Enable it: ai-notify voicevox on (list voices: ai-notify voicevox speakers)');
267
+ }
268
+ if (process.platform !== 'darwin') {
269
+ log(`VOICEVOX is not running. Install it from ${voicevox.DOWNLOAD_URL} and launch the app, then:`);
270
+ return log(' ai-notify voicevox on');
271
+ }
272
+ if (voicevox.appInstalled()) {
273
+ log('VOICEVOX is installed but not running. Launching it…');
274
+ voicevox.launchApp();
275
+ log(' Waiting for the engine to start (first launch can take ~30s)…');
276
+ if (voicevox.waitForEngine(url, 45000)) {
277
+ log('✓ engine ready.');
278
+ return log('Enable it: ai-notify voicevox on');
279
+ }
280
+ return log(' Still starting. Once VOICEVOX is open, run: ai-notify voicevox on');
281
+ }
282
+ log('VOICEVOX is not installed. Opening the download page…');
283
+ voicevox.openDownloadPage();
284
+ log(' 1. Download the macOS app and move it to Applications.');
285
+ log(' 2. First launch is Gatekeeper-blocked (it is open-source / unsigned):');
286
+ log(' xattr -dr com.apple.quarantine /Applications/VOICEVOX.app && open -a VOICEVOX');
287
+ log(' 3. Then run: ai-notify voicevox setup (or ai-notify voicevox on)');
288
+ return;
289
+ }
256
290
  if (sub === 'speakers') {
257
291
  const list = voicevox.listSpeakers(url);
258
292
  if (!list.length) return log(`No speakers (is VOICEVOX running at ${url}?).`);
@@ -302,6 +336,92 @@ const cmds = {
302
336
  log(`🔊 volume → ${n}`);
303
337
  },
304
338
 
339
+ // Tsundere mode: skin the spoken read-out with a tsundere persona that turns
340
+ // ツン (harsh + louder) on failures and デレ (warm) on clean passes. Offline,
341
+ // deterministic, no cost.
342
+ // tsundere on|off|toggle | level <0-1> | test [t3|t2|t1|t0] | status
343
+ tsundere() {
344
+ const sub = positionals[0] || 'status';
345
+ const config = readConfig();
346
+ const ts = config.tsundere;
347
+ const url = config.voicevox?.url || voicevox.DEFAULT_URL;
348
+
349
+ const sayText = (text, voice, tone = 'normal') => {
350
+ try {
351
+ const t = tsundere.decorateForSay(text, tone); // human contour, not 棒読み
352
+ execFileSync('say', voice ? ['-v', voice, t] : [t], { stdio: 'ignore' });
353
+ } catch {
354
+ /* non-mac / no say — ignore */
355
+ }
356
+ };
357
+
358
+ if (sub === 'on' || sub === 'off' || sub === 'toggle') {
359
+ const enabled = sub === 'toggle' ? !ts.enabled : sub === 'on';
360
+ config.tsundere = { ...ts, enabled };
361
+ // With VOICEVOX, resolve & cache the character's ツンツン/あまあま style ids
362
+ // now, so fire-time skips the lookup.
363
+ if (enabled && config.tts === 'voicevox') {
364
+ const sm = voicevox.resolveStyles(config.voicevox?.speaker, url);
365
+ if (sm) config.tsundere.styleMap = sm;
366
+ }
367
+ writeConfig(config);
368
+ log(enabled ? '💢 ツンデレ ON(デレ⇄ツン・緊急度で口調が変化)' : 'ツンデレ OFF');
369
+ if (enabled) {
370
+ log(' 既定の強さ: ai-notify tsundere level <0=デレ 〜 1=ツン>');
371
+ log(' 試聴: ai-notify tsundere test');
372
+ }
373
+ return;
374
+ }
375
+ if (sub === 'level') {
376
+ const arg = positionals[1];
377
+ if (arg === undefined) {
378
+ const v = readTsundereLevel();
379
+ return log(`tsundere level: ${v != null ? v : ts.level} (0=デレ 〜 1=ツン)`);
380
+ }
381
+ const n = setTsundereLevel(arg);
382
+ return log(`💢 tsundere level → ${n} (0=デレ 〜 1=ツン)`);
383
+ }
384
+ if (sub === 'test') {
385
+ const which = (positionals[1] || '').toLowerCase();
386
+ const lang = ts.lang || 'ja';
387
+ const level = readTsundereLevel() != null ? readTsundereLevel() : ts.level;
388
+ const ja = lang === 'ja';
389
+ const samples = {
390
+ t3: { event: 'done', raw: 'Build failed: TypeError in auth.ts', body: ja ? 'ビルドが失敗' : 'the build failed' },
391
+ t2: { event: 'waiting', raw: 'Claude needs your permission to run a command', body: ja ? '許可待ち' : 'waiting for your input' },
392
+ t1: { event: 'done', raw: 'Updated three files', body: ja ? '3ファイルを更新' : 'updated three files' },
393
+ t0: { event: 'done', raw: 'All tests passed, no issues', body: ja ? 'テスト全部パス' : 'all tests passed' },
394
+ };
395
+ const keys = samples[which] ? [which] : ['t3', 't2', 't1', 't0'];
396
+ const sm = config.tts === 'voicevox' ? ts.styleMap || voicevox.resolveStyles(config.voicevox?.speaker, url) : null;
397
+ log(`tsundere test (level ${level}, lang ${lang}):\n`);
398
+ for (const k of keys) {
399
+ const s = samples[k];
400
+ const tier = tsundere.classifyUrgency(s.event, s.raw, s.body);
401
+ const eff = tsundere.effectiveLevel(level, tier, ts.urgencyShift !== false);
402
+ const text = tsundere.wrap(s.body, eff, tier, lang, 0);
403
+ const mul = tsundere.volumeMul(tier, ts.volumeBoost !== false);
404
+ const tone = tsundere.axisFor(eff);
405
+ log(` [${tier} ×${mul} ${tone}] ${text}`);
406
+ if (sm) {
407
+ const speaker = sm[tone] ?? config.voicevox?.speaker;
408
+ voicevox.speak(text, speaker, url, mul, undefined, tsundere.effectiveProsody(tone, readVoiceProsody()));
409
+ } else {
410
+ sayText(text, config.voice || '', tone);
411
+ }
412
+ }
413
+ return;
414
+ }
415
+ // status
416
+ const lvl = readTsundereLevel() != null ? readTsundereLevel() : ts.level;
417
+ log(`tsundere: ${ts.enabled ? '💢 ON' : 'OFF'}`);
418
+ log(` level: ${lvl} (0=デレ 〜 1=ツン)`);
419
+ log(` urgencyShift: ${ts.urgencyShift !== false ? 'on' : 'off'} (緊急度で口調を増減)`);
420
+ log(` volumeBoost: ${ts.volumeBoost !== false ? 'on' : 'off'} (重大時は音量↑)`);
421
+ log(` lang: ${ts.lang || 'ja'}`);
422
+ if (!ts.enabled) log('\nEnable: ai-notify tsundere on 試聴: ai-notify tsundere test');
423
+ },
424
+
305
425
  // Assign a voice to a specific pane (by tty), from the menu bar.
306
426
  // voice-pane <tty> voicevox <id> | say <name> | clear
307
427
  'voice-pane'() {
@@ -340,6 +460,38 @@ const cmds = {
340
460
  log(`pane ${tty}: volume ${v}`);
341
461
  },
342
462
 
463
+ // Set a specific pane's tsundere baseline level (0=デレ – 1=ツン), or `clear` to
464
+ // follow the global level. tsundere-pane <tty> <0-1|clear>
465
+ 'tsundere-pane'() {
466
+ const [tty, arg] = positionals;
467
+ if (!tty || arg === undefined) {
468
+ console.error('usage: tsundere-pane <tty> <0-1|clear>');
469
+ process.exit(1);
470
+ }
471
+ if (arg === 'clear') {
472
+ updatePaneSetting(tty, { tsundere: null });
473
+ return log(`pane ${tty}: tsundere level reset to global`);
474
+ }
475
+ const v = Math.min(1, Math.max(0, Number(arg)));
476
+ updatePaneSetting(tty, { tsundere: v });
477
+ log(`pane ${tty}: tsundere level ${v}`);
478
+ },
479
+
480
+ // Get/set the VOICEVOX base prosody (the normal-tone scales the menu bar
481
+ // sliders drive). With no args, prints the current values as JSON.
482
+ // voice-prosody [speed|pitch|intonation <value> | reset]
483
+ 'voice-prosody'() {
484
+ const [key, val] = positionals;
485
+ if (key === 'reset') return log(JSON.stringify(resetVoiceProsody()));
486
+ if (!key || val === undefined) return log(JSON.stringify(readVoiceProsody()));
487
+ const next = setVoiceProsody(key, val);
488
+ if (!next) {
489
+ console.error('usage: voice-prosody <speed|pitch|intonation> <value> | reset');
490
+ process.exit(1);
491
+ }
492
+ log(`voice prosody ${key} → ${next[key]}`);
493
+ },
494
+
343
495
  // Machine-readable state for the menu bar agent: mute, volume, the selectable
344
496
  // voices, and the recently-active panes (for per-pane assignment). Not human.
345
497
  'menu-json'() {
@@ -371,6 +523,7 @@ const cmds = {
371
523
  // Panes = live terminals currently running an agent (so they show up before
372
524
  // they ever fire a notification) merged with previously-recorded ones.
373
525
  const globalVol = readVolume() != null ? readVolume() : typeof config.volume === 'number' ? config.volume : 1;
526
+ const tsLevel = readTsundereLevel() != null ? readTsundereLevel() : config.tsundere?.level ?? 0.5;
374
527
  const recorded = new Map(readPanes().map((p) => [p.tty, p.label]));
375
528
  const ttys = new Set([...livePanes(), ...recorded.keys()]);
376
529
  const panes = [...ttys].map((tty) => {
@@ -381,6 +534,8 @@ const cmds = {
381
534
  current: labelFor(s.tts ? s : null),
382
535
  volume: typeof s.volume === 'number' ? s.volume : globalVol,
383
536
  volumeSet: typeof s.volume === 'number',
537
+ tsundere: typeof s.tsundere === 'number' ? s.tsundere : tsLevel,
538
+ tsundereSet: typeof s.tsundere === 'number',
384
539
  };
385
540
  });
386
541
  log(
@@ -389,6 +544,10 @@ const cmds = {
389
544
  volume: readVolume() != null ? readVolume() : typeof config.volume === 'number' ? config.volume : 1,
390
545
  voices,
391
546
  panes,
547
+ tsundere: { enabled: !!config.tsundere?.enabled, level: tsLevel },
548
+ tts: config.tts || 'say',
549
+ prosody: readVoiceProsody(),
550
+ prosodyRange: VOICE_PROSODY_RANGE,
392
551
  })
393
552
  );
394
553
  },
@@ -524,7 +683,9 @@ Usage:
524
683
  ai-notify toggle | on | off | status control the mute switch
525
684
  ai-notify volume [0.0-2.0] get/set output volume
526
685
  ai-notify voice [number|name|preview|default] pick the spoken voice
527
- ai-notify voicevox [on <id>|off|speakers|test] speak in VOICEVOX character voices
686
+ ai-notify voicevox [setup|on <id>|off|speakers|test] speak in VOICEVOX character voices
687
+ ai-notify tsundere [on|off|level <0-1>|test|status] tsundere persona (ツン⇄デレ by urgency)
688
+ ai-notify voice-prosody [speed|pitch|intonation <v>|reset] VOICEVOX read-out tuning
528
689
  ai-notify menubar [install|uninstall|status] native menu bar bell (macOS)
529
690
  ai-notify translate [on <lang>|off|test] speak agent text in your language
530
691
  ai-notify doctor check deps & wiring
package/src/notify.mjs CHANGED
@@ -8,11 +8,22 @@ import { spawn, execFileSync } from 'node:child_process';
8
8
  import { existsSync, rmSync } from 'node:fs';
9
9
  import { tmpdir } from 'node:os';
10
10
  import { join } from 'node:path';
11
- import { isMuted, readConfig, readVolume, recordPane, readPaneSetting, setPaneWaiting } from './state.mjs';
11
+ import {
12
+ isMuted,
13
+ readConfig,
14
+ readVolume,
15
+ recordPane,
16
+ readPaneSetting,
17
+ setPaneWaiting,
18
+ readTsundereLevel,
19
+ readVoiceProsody,
20
+ nextCounter,
21
+ } from './state.mjs';
12
22
  import { controllingTty } from './util.mjs';
13
23
  import { translate } from './translate.mjs';
14
24
  import { highlightWaiting, clearHighlight } from './highlight.mjs';
15
25
  import * as voicevox from './voicevox.mjs';
26
+ import * as tsundere from './tsundere.mjs';
16
27
 
17
28
  const platform = process.platform; // 'darwin' | 'linux' | 'win32'
18
29
 
@@ -73,14 +84,23 @@ const sayWithVolume = (text, voice, vol) => {
73
84
  }
74
85
  };
75
86
 
76
- const speak = (text, voice, vol = 1) => {
87
+ const speak = (text, voice, vol = 1, tone = 'normal') => {
77
88
  if (!text) return;
78
89
  if (platform === 'darwin') {
79
- if (vol !== 1) return sayWithVolume(text, voice, vol);
80
- run('say', voice ? ['-v', voice, text] : [text]);
90
+ // Give the OS voice human contour (pace/pitch/intonation + real pauses)
91
+ // instead of a flat 棒読み monotone.
92
+ const t = tsundere.decorateForSay(text, tone);
93
+ if (vol !== 1) return sayWithVolume(t, voice, vol);
94
+ run('say', voice ? ['-v', voice, t] : [t]);
81
95
  } else if (platform === 'linux') {
82
- if (which('spd-say')) run('spd-say', [text]);
83
- else if (which('espeak')) run('espeak', [text]);
96
+ const e = tsundere.prosodyFor(tone).espeak;
97
+ if (which('spd-say')) {
98
+ const r = Math.max(-100, Math.min(100, Math.round((e.speed - 175) / 1.5)));
99
+ const pch = Math.max(-100, Math.min(100, Math.round((e.pitch - 50) * 2)));
100
+ run('spd-say', ['-r', String(r), '-p', String(pch), text]);
101
+ } else if (which('espeak')) {
102
+ run('espeak', ['-p', String(e.pitch), '-s', String(e.speed), text]);
103
+ }
84
104
  } else if (platform === 'win32') {
85
105
  run('powershell', [
86
106
  '-NoProfile',
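Note on the Linux branch in this hunk: the per-tone values are stored in espeak units (speed in words per minute, pitch 0-99) and rescaled for `spd-say`, whose `-r` and `-p` options take values from -100 to 100. The following is a minimal sketch of that conversion, assuming the per-tone espeak numbers from `PROSODY` in `tsundere.mjs`; the helper name `toSpdSay` is hypothetical and not part of the package.

```javascript
// Per-tone espeak settings, copied from PROSODY in tsundere.mjs.
const ESPEAK = {
  tsun: { pitch: 56, speed: 190 },
  normal: { pitch: 50, speed: 175 },
  dere: { pitch: 46, speed: 160 },
};

const clamp = (lo, hi, n) => Math.max(lo, Math.min(hi, n));

// Hypothetical helper: the same arithmetic as the spd-say branch of speak().
// 175 wpm and pitch 50 are treated as the neutral point (0 on spd-say's scale).
const toSpdSay = ({ pitch, speed }) => ({
  rate: clamp(-100, 100, Math.round((speed - 175) / 1.5)),
  pitch: clamp(-100, 100, Math.round((pitch - 50) * 2)),
});

for (const [tone, e] of Object.entries(ESPEAK)) {
  console.log(tone, toSpdSay(e));
}
```

With these inputs the tsun tone maps to rate 10 and pitch 12, normal to 0 and 0, and dere to rate -10 and pitch -8, so the shift stays small on the `spd-say` scale.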
@@ -187,14 +207,47 @@ export const emit = ({ provider = 'default', event = 'done', label = '', message
187
207
  ? config.volume
188
208
  : 1;
189
209
 
210
+ // Tsundere mode: skin the spoken text, scale volume, and (with VOICEVOX) pick
211
+ // the character's ツンツン/あまあま style — all driven by the event's urgency.
212
+ // The banner is left untouched (it stays factual). Off => identical to before.
213
+ let outText = speakText;
214
+ let outVol = vol;
215
+ let outSpeaker = speaker;
216
+ let speakTone = 'normal'; // delivery contour; tsundere sets it to tsun/dere
217
+ const ts = config.tsundere;
218
+ if (ts && ts.enabled) {
219
+ const tier = tsundere.classifyUrgency(event, message, fullBody);
220
+ const envLevel = parseFloat(process.env.AI_NOTIFY_TSUNDERE_LEVEL);
221
+ const baseLevel = Number.isFinite(envLevel)
222
+ ? Math.min(1, Math.max(0, envLevel))
223
+ : typeof pane.tsundere === 'number'
224
+ ? pane.tsundere
225
+ : readTsundereLevel() != null
226
+ ? readTsundereLevel()
227
+ : typeof ts.level === 'number'
228
+ ? ts.level
229
+ : 0.5;
230
+ const eff = tsundere.effectiveLevel(baseLevel, tier, ts.urgencyShift !== false);
231
+ speakTone = tsundere.axisFor(eff);
232
+ outVol = Math.min(2, Math.max(0, vol * tsundere.volumeMul(tier, ts.volumeBoost !== false)));
233
+ outText = tsundere.wrap(spokenBody, eff, tier, ts.lang || 'ja', nextCounter('tsundere'));
234
+ if (config.speakLabel === true && label) outText = `${label}、${outText}`;
235
+ if (tts === 'voicevox') {
236
+ const sm = ts.styleMap || voicevox.resolveStyles(outSpeaker, config.voicevox?.url);
237
+ const axis = tsundere.axisFor(eff);
238
+ if (sm && sm[axis] != null) outSpeaker = sm[axis];
239
+ }
240
+ }
241
+
190
242
  if (!muted) {
191
- playSound(soundName, vol);
192
- if (config.speak && vol > 0) {
243
+ playSound(soundName, outVol);
244
+ if (config.speak && outVol > 0) {
193
245
  let spoken = false;
194
246
  if (tts === 'voicevox') {
195
- spoken = voicevox.speak(speakText, speaker, config.voicevox?.url, vol);
247
+ const prosody = tsundere.effectiveProsody(speakTone, readVoiceProsody());
248
+ spoken = voicevox.speak(outText, outSpeaker, config.voicevox?.url, outVol, undefined, prosody);
196
249
  }
197
- if (!spoken) speak(speakText, voice, vol); // OS `say` (also the VOICEVOX fallback)
250
+ if (!spoken) speak(outText, voice, outVol, speakTone); // OS `say` (also the VOICEVOX fallback)
198
251
  }
199
252
  }
200
253
 
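Note on the `baseLevel` chain in `emit()`: the nested ternary picks the first available source in a fixed order (environment variable, per-pane setting, state file, `config.tsundere.level`, then 0.5). The sketch below restates that order as a pure function so it can be checked in isolation. The function name `resolveBaseLevel` and its argument object are hypothetical stand-ins for the values the real code reads.

```javascript
const clamp01 = (n) => Math.min(1, Math.max(0, n));

// Hypothetical pure version of the precedence chain in emit(). Each field
// stands in for one source: env = $AI_NOTIFY_TSUNDERE_LEVEL, pane = the
// per-pane setting, state = readTsundereLevel(), config = config.tsundere.level.
const resolveBaseLevel = ({ env, pane, state, config } = {}) => {
  const envLevel = parseFloat(env);
  if (Number.isFinite(envLevel)) return clamp01(envLevel); // only the env value is clamped here
  if (typeof pane === 'number') return pane;
  if (state != null) return state;
  if (typeof config === 'number') return config;
  return 0.5; // built-in default
};

console.log(resolveBaseLevel({ env: '0.9', pane: 0.2, state: 0.4, config: 0.6 })); // 0.9
console.log(resolveBaseLevel({ pane: 0.2, state: 0.4, config: 0.6 })); // 0.2
console.log(resolveBaseLevel({ state: 0.4, config: 0.6 })); // 0.4
console.log(resolveBaseLevel({ env: 'loud' })); // 0.5
```

An unparsable environment value falls through to the next source rather than forcing the default, which matches the `Number.isFinite` guard in the diff.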
package/src/state.mjs CHANGED
@@ -67,6 +67,98 @@ export const setVolume = (v) => {
67
67
  return n;
68
68
  };
69
69
 
70
+ // --- Tsundere level --------------------------------------------------------
71
+ // A single number 0.0 (full デレ) – 1.0 (full ツン) in a state file, written by
72
+ // the menu bar slider or `ai-notify tsundere level`, read at fire time. Overrides
73
+ // config.tsundere.level; $AI_NOTIFY_TSUNDERE_LEVEL overrides per window.
74
+
75
+ const tsundereLevelPath = () => join(stateDir(), 'tsundere-level');
76
+
77
+ export const readTsundereLevel = () => {
78
+ try {
79
+ const v = parseFloat(readFileSync(tsundereLevelPath(), 'utf8'));
80
+ return Number.isFinite(v) ? Math.min(1, Math.max(0, v)) : null;
81
+ } catch {
82
+ return null;
83
+ }
84
+ };
85
+
86
+ export const setTsundereLevel = (v) => {
87
+ const n = Math.min(1, Math.max(0, Number(v)));
88
+ ensureDir(stateDir());
89
+ writeFileSync(tsundereLevelPath(), String(n));
90
+ return n;
91
+ };
92
+
93
+ // --- VOICEVOX base prosody -------------------------------------------------
94
+ // User-tunable BASE scales for the VOICEVOX read-out — the values used at the
95
+ // NORMAL tone; tsundere tones nudge from here. Written by the menu bar sliders /
96
+ // `ai-notify voice-prosody`, read at fire time. One small JSON file so all three
97
+ // stay in sync. Defaults = neutral (identical to no tuning).
98
+ export const VOICE_PROSODY_DEFAULTS = { speed: 1.0, pitch: 0.0, intonation: 1.0 };
99
+ export const VOICE_PROSODY_RANGE = { speed: [0.5, 1.5], pitch: [-0.15, 0.15], intonation: [0.0, 1.5] };
100
+
101
+ const voiceProsodyPath = () => join(stateDir(), 'voice-prosody.json');
102
+
103
+ const clampProsody = (key, v) => {
104
+ const [lo, hi] = VOICE_PROSODY_RANGE[key] || [0, 2];
105
+ return Math.min(hi, Math.max(lo, Number(v)));
106
+ };
107
+
108
+ export const readVoiceProsody = () => {
109
+ let raw = {};
110
+ try {
111
+ raw = JSON.parse(readFileSync(voiceProsodyPath(), 'utf8')) || {};
112
+ } catch {
113
+ /* missing/corrupt -> defaults */
114
+ }
115
+ const out = {};
116
+ for (const k of Object.keys(VOICE_PROSODY_DEFAULTS)) {
117
+ out[k] = typeof raw[k] === 'number' ? clampProsody(k, raw[k]) : VOICE_PROSODY_DEFAULTS[k];
118
+ }
119
+ return out;
120
+ };
121
+
122
+ // Set one key (speed | pitch | intonation); returns the full updated object, or
123
+ // null for an unknown key.
124
+ export const setVoiceProsody = (key, value) => {
125
+ if (!(key in VOICE_PROSODY_DEFAULTS)) return null;
126
+ const cur = readVoiceProsody();
127
+ cur[key] = clampProsody(key, value);
128
+ ensureDir(stateDir());
129
+ writeFileSync(voiceProsodyPath(), JSON.stringify(cur));
130
+ return cur;
131
+ };
132
+
133
+ export const resetVoiceProsody = () => {
134
+ try {
135
+ rmSync(voiceProsodyPath(), { force: true });
136
+ } catch {
137
+ /* ignore */
138
+ }
139
+ return { ...VOICE_PROSODY_DEFAULTS };
140
+ };
141
+
142
+ // A small persisted counter (per name), so phrase rotation varies across fires
143
+ // even for identical input. Wraps to stay small; best-effort.
144
+ export const nextCounter = (name) => {
145
+ const p = join(stateDir(), `ctr-${name}`);
146
+ let n = 0;
147
+ try {
148
+ n = parseInt(readFileSync(p, 'utf8'), 10) || 0;
149
+ } catch {
150
+ /* first use */
151
+ }
152
+ n = (n + 1) % 1000000;
153
+ try {
154
+ ensureDir(stateDir());
155
+ writeFileSync(p, String(n));
156
+ } catch {
157
+ /* ignore */
158
+ }
159
+ return n;
160
+ };
161
+
70
162
  // --- Per-pane state --------------------------------------------------------
71
163
  // Recently-active terminal panes (so the menu bar can offer per-pane voices),
72
164
  // and a per-tty voice override. Both are small JSON files in the state dir.
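Note on `readVoiceProsody` in this hunk: every key is read independently, clamped to its range if it is a number, and replaced by the default otherwise, so a partial or damaged `voice-prosody.json` still yields a complete object. The sketch below reproduces that normalization on an in-memory object. The name `normalizeProsody` is hypothetical; the defaults and ranges are copied from the diff.

```javascript
const DEFAULTS = { speed: 1.0, pitch: 0.0, intonation: 1.0 };
const RANGE = { speed: [0.5, 1.5], pitch: [-0.15, 0.15], intonation: [0.0, 1.5] };

// Hypothetical in-memory stand-in for readVoiceProsody(): takes the parsed
// JSON instead of reading voice-prosody.json from the state dir.
const normalizeProsody = (raw = {}) => {
  const out = {};
  for (const key of Object.keys(DEFAULTS)) {
    const [lo, hi] = RANGE[key];
    out[key] = typeof raw[key] === 'number' ? Math.min(hi, Math.max(lo, raw[key])) : DEFAULTS[key];
  }
  return out;
};

// speed is clamped to 1.5; the string pitch and the unknown key are ignored.
console.log(normalizeProsody({ speed: 3, pitch: '0.1', extra: true }));
```

Because only the keys of the defaults object are iterated, unknown keys in the file never reach the caller.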
@@ -114,7 +206,8 @@ export const readPanes = () =>
114
206
  .map(([tty, v]) => ({ tty, label: v.label || '', ts: v.ts || 0 }))
115
207
  .sort((a, b) => b.ts - a.ts);
116
208
 
117
- // Per-pane settings: { tts, speaker, voice, volume }. Any subset may be set.
209
+ // Per-pane settings: { tts, speaker, voice, volume, tsundere }. Any subset may
210
+ // be set (tsundere = a 0–1 baseline level override; null/absent = follow global).
118
211
  export const readPaneSetting = (tty) => (tty ? readJson(paneVoicesPath(), {})[tty] || {} : {});
119
212
 
120
213
  // Merge `patch` into the pane's settings; keys set to null are removed; an empty
@@ -176,6 +269,21 @@ export const DEFAULT_CONFIG = {
176
269
  // Per window: $AI_NOTIFY_VOICEVOX_SPEAKER overrides the speaker id.
177
270
  tts: 'say',
178
271
  voicevox: { url: 'http://127.0.0.1:50021', speaker: 3 },
272
+ // Tsundere mode: skin the SPOKEN read-out with a tsundere persona whose
273
+ // harshness (ツン) ⇄ sweetness (デレ) tracks the event's urgency — high-urgency
274
+ // failures get a louder ツン scolding, clean passes get a デレ "good job".
275
+ // Off by default. `level` is the baseline 0 (デレ) – 1 (ツン); the menu bar
276
+ // slider / `ai-notify tsundere level` write a state file that overrides it.
277
+ // With VOICEVOX, the level also picks the character's ツンツン/あまあま style
278
+ // (cached in `styleMap`). No API, no cost — deterministic phrase banks.
279
+ tsundere: {
280
+ enabled: false,
281
+ level: 0.5,
282
+ urgencyShift: true, // modulate the level by the event's urgency
283
+ volumeBoost: true, // louder on high-urgency events
284
+ lang: 'ja', // phrase bank language (ja | en)
285
+ styleMap: null, // { normal, tsun, dere } VOICEVOX style ids; auto-resolved
286
+ },
179
287
  // Spoken read-out templates for agent events. The window label is added
180
288
  // separately (speakLabel), so leave {label} out here to avoid doubling it.
181
289
  // Override per language (e.g. Japanese) in config.json. An agent that supplies
@@ -193,7 +301,12 @@ export const DEFAULT_CONFIG = {
193
301
  export const readConfig = () => {
194
302
  try {
195
303
  const raw = JSON.parse(readFileSync(configPath(), 'utf8'));
196
- return { ...DEFAULT_CONFIG, ...raw, providers: { ...DEFAULT_CONFIG.providers, ...(raw.providers || {}) } };
304
+ return {
305
+ ...DEFAULT_CONFIG,
306
+ ...raw,
307
+ providers: { ...DEFAULT_CONFIG.providers, ...(raw.providers || {}) },
308
+ tsundere: { ...DEFAULT_CONFIG.tsundere, ...(raw.tsundere || {}) },
309
+ };
197
310
  } catch {
198
311
  return DEFAULT_CONFIG;
199
312
  }
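Note on the `readConfig` change: object spread is shallow, so without the extra `tsundere` line a user config that sets only one nested key would replace the whole default `tsundere` object. The sketch below contrasts the two merges. The `raw` object is a hypothetical `config.json`, and `DEFAULT_CONFIG` is trimmed to the fields needed for the illustration.

```javascript
// Trimmed copy of the defaults from state.mjs.
const DEFAULT_CONFIG = {
  tts: 'say',
  tsundere: { enabled: false, level: 0.5, urgencyShift: true, volumeBoost: true, lang: 'ja', styleMap: null },
};

// Hypothetical user config: only one nested key is set.
const raw = { tsundere: { enabled: true } };

// Shallow spread only: raw.tsundere replaces the default object wholesale.
const shallow = { ...DEFAULT_CONFIG, ...raw };

// Merge as written in readConfig(): defaults first, then the user's keys on top.
const merged = {
  ...DEFAULT_CONFIG,
  ...raw,
  tsundere: { ...DEFAULT_CONFIG.tsundere, ...(raw.tsundere || {}) },
};

console.log(shallow.tsundere.level); // undefined
console.log(merged.tsundere.level); // 0.5
console.log(merged.tsundere.enabled); // true
```

The same one-level-deep pattern is already used for `providers` in the original line, so the new `tsundere` entry follows the existing convention.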
package/src/tsundere.mjs ADDED
@@ -0,0 +1,218 @@
1
+ // Tsundere mode: skin the spoken read-out with a tsundere persona whose harshness
2
+ // (ツン) ⇄ sweetness (デレ) tracks the event's urgency.
3
+ //
4
+ // high urgency (error / failure / dangerous approval) -> ツン + louder
5
+ // low urgency (tests passed / no issues / approved) -> デレ (warm)
6
+ //
7
+ // Everything here is deterministic and offline — phrase banks, no API, no cost.
8
+ // The banner (visual) is never skinned; only the spoken text is wrapped.
9
+
10
+ // --- Urgency classifier ----------------------------------------------------
11
+ // We only see the agent's notification text, so urgency is a heuristic. Order
12
+ // matters: a "no errors" / "tests passed" message must read as POSITIVE even
13
+ // though it contains the word "error".
14
+
15
+ const POSITIVE =
16
+ /\b(passed|all tests? pass(ed)?|no (issues?|errors?|problems?|failures?)|looks good|lgtm|approved?|success(ful|fully)?|succeeded|completed successfully)\b|✅|問題(は)?(あ?り?ま?せん|な(い|し))|エラー(は)?(あり|出て)?(ま|い)?せん|テスト.*(成功|通過|パス|通り)|レビュー.*(通|問題|OK)|承認|無事(完了|成功)/i;
17
+
18
+ // Critical = a failure, OR an approval for a DESTRUCTIVE action. A generic
19
+ // "permission to run a command" is just a wait (T2) — only destructive verbs /
20
+ // dangerous commands escalate to T3.
21
+ const CRITICAL =
22
+ /\b(failed|failing|failure|crash(ed|ing)?|exception|panic|fatal|unrecoverable|aborted|broke(n)?|blocked)\b|❌|🛑|\b(permission|approval)\b[^.!?\n]*\b(delete|remove|overwrite|reset|drop|truncate|force|rm)\b|rm\s+-rf|force[- ]?push|git\s+push\s+-f|drop\s+table|truncate\b|エラーが|失敗|クラッシュ|例外が|落ちて|中断され|危険なコマンド/i;
23
+
24
+ // Returns one of 'T3' (critical) | 'T2' (waiting) | 'T1' (neutral done) | 'T0' (positive).
25
+ // `raw` is the agent's original text (pre-translation); `core` is the formatted
26
+ // body. Both are joined into one string and tested together.
27
+ export const classifyUrgency = (event = 'done', raw = '', core = '') => {
28
+ const text = `${raw || ''} ${core || ''}`;
29
+ if (POSITIVE.test(text) && !CRITICAL.test(text)) return 'T0';
30
+ if (CRITICAL.test(text)) return 'T3';
31
+ if (event === 'waiting') return 'T2';
32
+ return 'T1';
33
+ };
34
+
35
+ // Per-tier modulation: nudge the baseline tsun level toward ツン (positive bias)
36
+ // or デレ (negative bias), and scale the volume. Kept small (±0.25) so the
37
+ // SLIDER stays in charge — at either extreme the slider wins (even a success
38
+ // reads ツン at max, even a failure reads デレ at min); near the middle the
39
+ // urgency nudge is what tips the tone. T0 never lowers the volume.
40
+ const BIAS = { T3: 0.25, T2: 0.1, T1: 0, T0: -0.25 };
41
+ const VOLMUL = { T3: 1.3, T2: 1.1, T1: 1, T0: 1 };
42
+
43
+ export const effectiveLevel = (level, tier, urgencyShift = true) => {
44
+ const base = Number.isFinite(level) ? level : 0.5;
45
+ return Math.min(1, Math.max(0, base + (urgencyShift ? BIAS[tier] || 0 : 0)));
46
+ };
47
+
48
+ export const volumeMul = (tier, volumeBoost = true) => (volumeBoost ? VOLMUL[tier] || 1 : 1);
49
+
50
+ // eff >= 0.6 => ツン, <= 0.4 => デレ, else ノーマル. A narrow ノーマル band (only
51
+ // the genuinely-neutral middle) so the contrast between ツン and デレ is obvious
52
+ // instead of everything collapsing into a bland middle. Used for both the phrase
53
+ // tone and the VOICEVOX style pick.
54
+ export const axisFor = (eff) => (eff >= 0.6 ? 'tsun' : eff <= 0.4 ? 'dere' : 'normal');
55
+
56
+ // --- Phrase banks ----------------------------------------------------------
57
+ // BANK[lang][tone] = { <tier>: [...], default: [...] }. `{body}` is the task
58
+ // gist (kept, so the read-out is still informative). Tasteful, short, SFW.
59
+
60
+ const BANK = {
61
+ ja: {
62
+ // ツン (tsun): cold, prickly, never straightforward. Merciless about failures, grudging even about successes.
63
+ tsun: {
64
+ T3: [
65
+ 'ちょっと!また{body}じゃない。…ぼーっとしてないで早く直しなさいよ!',
66
+ 'はぁ?{body}って…どこ見てたのよ。さっさと直す!',
67
+ 'べ、別に心配なんてしてないけど…{body}よ。早くなんとかしなさいよね!',
68
+ ],
69
+ T2: [
70
+ '…{body}。あんたの判断待ちなの。さっさと決めなさいよ。',
71
+ 'ねえ、{body}でしょ。…わたしに聞いてないで自分で決めなさい。',
72
+ ],
73
+ T1: [
74
+ 'ふん、{body}。…言われなくてもやっといたわよ。',
75
+ '{body}。…別にあんたのためじゃないんだからね。',
76
+ ],
77
+ T0: [
78
+ '{body}…ま、まあ及第点ね。べ、別に褒めてないんだからね!',
79
+ 'ふん、{body}じゃない。…ちょっとは見直したけど、調子に乗らないでよね。',
80
+ ],
81
+ default: ['{body}。…さっさと次いきなさいよ。'],
82
+ },
83
+ // ノーマル (normal): the neutral middle band only. Curt, just the facts.
84
+ normal: {
85
+ default: ['{body}。', '{body}。…以上よ。'],
86
+ },
87
+ // デレ (dere): sweet, sincere, openly worried and supportive. Stays at your side even after a failure.
88
+ dere: {
89
+ T3: [
90
+ 'あっ、{body}…!大丈夫?あわてなくていいから、一緒に直そ?',
91
+ '{body}みたい…。落ち込まないで、ね?あなたならきっと直せるよ。',
92
+ ],
93
+ T2: [
94
+ 'ねぇ、{body}だって。…あなたの答え、ここで待ってるね。',
95
+ '{body}…どうするか、ゆっくり決めていいからね。',
96
+ ],
97
+ T1: [
98
+ '{body}、おしまい。…おつかれさま、えらいよ。',
99
+ '{body}。…ちゃんとできてる、すごいね。',
100
+ ],
101
+ T0: [
102
+ '{body}…!やったね、すごいすごい!わたし、ほんとに嬉しい!',
103
+ 'わぁ、{body}だって!さすがだなぁ、大好き…!',
104
+ 'お疲れさま。{body}…できるって信じてたよ、ほんとえらい!',
105
+ ],
106
+ default: ['{body}。…よくがんばったね。'],
107
+ },
108
+ },
109
+ en: {
110
+ tsun: {
111
+ T3: [
112
+ "Hey! {body} again?! ...Don't just sit there — fix it!",
113
+ 'Seriously? {body}. Clean it up, now.',
114
+ "I-it's not like I was worried, but... {body}. Deal with it.",
115
+ ],
116
+ T2: ['...{body}. It needs your call. Hurry up and decide already.'],
117
+ T1: [
118
+ 'Hmph. {body}. ...I did it without being asked, obviously.',
119
+ "{body}. ...Not that I did it for you or anything.",
120
+ ],
121
+ T0: [
122
+ "{body}... fine, that's passable. N-not that I'm impressed!",
123
+ 'Hmph, {body}. ...A little better, I guess. Don’t let it go to your head.',
124
+ ],
125
+ default: ['{body}. ...Get on with the next one.'],
126
+ },
127
+ normal: { default: ['{body}.', "{body}. ...That's that."] },
128
+ dere: {
129
+ T3: [
130
+ "Oh no, {body}...! Are you okay? Don't panic — let's fix it together, okay?",
131
+ "{body}, huh... Don't be down. You've got this, I know it.",
132
+ ],
133
+ T2: [
134
+ "Hey, {body}. ...I'll be right here waiting for your call.",
135
+ '{body}... take your time deciding, okay?',
136
+ ],
137
+ T1: [
138
+ '{body}, all done. ...Nice work, you did great.',
139
+ "{body}. ...You really pulled it off. I'm proud of you.",
140
+ ],
141
+ T0: [
142
+ "{body}...! You did it! Amazing, amazing! I'm so happy for you!",
143
+ "Wow, {body}! That's incredible — good job!",
144
+ 'Nice work. {body}... I always knew you could do it.',
145
+ ],
146
+ default: ['{body}. ...You did your best, well done.'],
147
+ },
148
+ },
149
+ };
150
+
151
+ export const isLangSupported = (lang) => !!BANK[lang];
152
+
153
+ // Wrap `body` with a tsundere line. `rot` rotates phrase choice so repeats vary.
154
+ // Unsupported language => body is returned unchanged (volume/voice still apply).
155
+ export const wrap = (body, eff, tier, lang = 'ja', rot = 0) => {
156
+ const bank = BANK[lang];
157
+ if (!bank || !body) return body;
158
+ const tone = axisFor(eff);
159
+ const group = bank[tone] || bank.normal;
160
+ const arr = (group && (group[tier] || group.default)) || ['{body}'];
161
+ const phrase = arr[((rot % arr.length) + arr.length) % arr.length];
162
+ return phrase.replace('{body}', () => body); // function replacer: `$&`, `$$` etc. in body stay literal
163
+ };
164
+
165
+ // --- Delivery / prosody ----------------------------------------------------
166
+ // The persona's VOICE, not just its words: how it's spoken, so the read-out has
167
+ // human contour instead of a flat 棒読み monotone. Each tone gets its own pace,
168
+ // pitch, and intonation range.
169
+ // say.* : macOS `say` embedded-command deltas, RELATIVE to the voice's own
170
+ // natural setting (rate wpm, pbas pitch base, pmod pitch range) — so
171
+ // it works on any voice without knowing its defaults.
172
+ // vv.* : VOICEVOX audio_query scales (speed/pitch/intonation; 1.0 = default).
173
+ // espeak : { pitch 0-99, speed wpm } for the Linux fallback.
174
+ // tsun = quick, higher, sharp swings (agitated scolding).
175
+ // dere = slower, gentle, wide warm intonation + longer pauses (affectionate).
176
+ // normal= mild, just enough lilt to not sound robotic.
177
+ // Kept deliberately SUBTLE: VOICEVOX is already expressive, so over-driving the
178
+ // scales (intonation ≫1.2, any pitch shift) is what makes it sound warbly and
179
+ // unnatural. The real ツン/デレ contrast comes from the character STYLE
180
+ // (ツンツン/あまあま, see voicevox.resolveStyles) — these scales only add a light
181
+ // pace/lilt on top, staying inside natural ranges. Same idea for `say`: a small
182
+ // pmod, not a big one (heavy pitch-modulation = robotic warble).
183
+ const PROSODY = {
184
+ tsun: { say: { rate: 16, pbas: 3, pmod: 3 }, vv: { speed: 1.06, pitch: 0.0, intonation: 1.2 }, espeak: { pitch: 56, speed: 190 } },
185
+ normal: { say: { rate: 0, pbas: 0, pmod: 2 }, vv: { speed: 1.0, pitch: 0.0, intonation: 1.0 }, espeak: { pitch: 50, speed: 175 } },
186
+ dere: { say: { rate: -12, pbas: 1, pmod: 4 }, vv: { speed: 0.96, pitch: 0.0, intonation: 1.1 }, espeak: { pitch: 46, speed: 160 } },
187
+ };
188
+
189
+ export const prosodyFor = (tone) => PROSODY[tone] || PROSODY.normal;
190
+
191
+ // Combine the user's GUI-tunable BASE scales (the normal-tone values) with this
192
+ // tone's nudge, for the VOICEVOX read-out. speed & intonation are scales (they
193
+ // multiply), pitch is an offset (it adds). base = {} => pure tone prosody, which
194
+ // equals the old behaviour. Returns { speed, pitch, intonation }.
195
+ export const effectiveProsody = (tone, base = {}) => {
196
+ const t = prosodyFor(tone).vv;
197
+ const b = { speed: 1, pitch: 0, intonation: 1, ...base };
198
+ return {
199
+ speed: b.speed * t.speed,
200
+ pitch: b.pitch + t.pitch,
201
+ intonation: b.intonation * t.intonation,
202
+ };
203
+ };
204
+
205
+ const sgn = (n) => (n >= 0 ? `+${n}` : `${n}`);
206
+
207
+ // Wrap spoken text with macOS `say` embedded commands for the given tone, and
208
+ // turn ellipses / commas into real beats so the line breathes. Stray `[[`/`]]`
209
+ // in the dynamic text is neutralized first so it can't inject its own commands.
210
+ export const decorateForSay = (text, tone = 'normal') => {
211
+ if (!text) return text;
212
+ const p = prosodyFor(tone).say;
213
+ const body = String(text)
214
+ .replace(/[\[\]]/g, '') // strip every bracket: removing only `[[`/`]]` pairs lets `[]][` collapse back into `[[`
215
+ .replace(/[…⋯]+|・・・+|\.{3,}/g, ' [[slnc 220]] ') // a short beat where it trails off
216
+ .replace(/([、,])\s*/g, '$1 [[slnc 70]] '); // commas breathe a touch
217
+ return `[[rate ${sgn(p.rate)}]] [[pbas ${sgn(p.pbas)}]] [[pmod ${sgn(p.pmod)}]] ${body}`;
218
+ };
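Note on how urgency and the slider interact in `tsundere.mjs`: the tier adds a small bias to the baseline level, the result is clamped to 0-1, and `axisFor` then maps it to one of three tones. The sketch below prints the tone for each tier at three slider positions. The constants and the two helpers are copied from the file above (with `urgencyShift` assumed to be on); the loop around them is only for illustration.

```javascript
// Copied from tsundere.mjs in this diff, with urgencyShift assumed enabled.
const BIAS = { T3: 0.25, T2: 0.1, T1: 0, T0: -0.25 };
const effectiveLevel = (level, tier) => Math.min(1, Math.max(0, level + (BIAS[tier] || 0)));
const axisFor = (eff) => (eff >= 0.6 ? 'tsun' : eff <= 0.4 ? 'dere' : 'normal');

// For each slider position, show which tone every urgency tier ends up with.
for (const level of [0, 0.5, 1]) {
  const row = ['T3', 'T2', 'T1', 'T0'].map((tier) => `${tier}=${axisFor(effectiveLevel(level, tier))}`);
  console.log(`level ${level}: ${row.join(' ')}`);
}
```

At level 0 every tier resolves to dere and at level 1 every tier resolves to tsun, which is the "slider wins at either extreme" behaviour described in the comment above `BIAS`. At the default 0.5, a critical event (T3) becomes tsun, a neutral one (T1) stays normal, and a positive one (T0) becomes dere.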
package/src/voicevox.mjs CHANGED
@@ -9,15 +9,57 @@
9
9
  // return false and the caller falls back to the OS `say` voice.
10
10
 
11
11
  import { execSync, execFileSync } from 'node:child_process';
12
- import { existsSync, statSync, mkdtempSync, rmSync, appendFileSync } from 'node:fs';
12
+ import { existsSync, statSync, mkdtempSync, rmSync, appendFileSync, readFileSync, writeFileSync } from 'node:fs';
13
13
  import { join } from 'node:path';
14
- import { tmpdir } from 'node:os';
14
+ import { tmpdir, homedir } from 'node:os';
15
15
  import { stateDir } from './state.mjs';
16
16
 
17
17
  export const DEFAULT_URL = 'http://127.0.0.1:50021';
18
+ export const DOWNLOAD_URL = 'https://voicevox.hiroshiba.jp/';
18
19
 
19
20
  const platform = process.platform;
20
21
 
22
+ const sleep = (ms) => {
23
+ try {
24
+ Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
25
+ } catch {
26
+ /* SharedArrayBuffer unavailable — skip the wait */
27
+ }
28
+ };
29
+
30
+ // Is the VOICEVOX app installed (macOS)?
31
+ export const appInstalled = () => {
32
+ if (platform !== 'darwin') return false;
33
+ return ['/Applications/VOICEVOX.app', join(homedir(), 'Applications/VOICEVOX.app')].some((p) => existsSync(p));
34
+ };
35
+
36
+ export const launchApp = () => {
37
+ try {
38
+ if (platform === 'darwin') execFileSync('open', ['-a', 'VOICEVOX']);
39
+ } catch {
40
+ /* ignore */
41
+ }
42
+ };
43
+
44
+ export const openDownloadPage = () => {
45
+ try {
46
+ if (platform === 'darwin') execFileSync('open', [DOWNLOAD_URL]);
47
+ else if (platform === 'linux') execFileSync('xdg-open', [DOWNLOAD_URL]);
48
+ } catch {
49
+ /* ignore */
50
+ }
51
+ };
52
+
53
+ // Poll until the engine answers, or timeout.
54
+ export const waitForEngine = (url = DEFAULT_URL, timeoutMs = 40000) => {
55
+ const start = Date.now();
56
+ while (Date.now() - start < timeoutMs) {
57
+ if (isAvailable(url, 1500)) return true;
58
+ sleep(2000);
59
+ }
60
+ return false;
61
+ };
62
+
21
63
  // Record why a synthesis fell back to the OS voice, so intermittent fallbacks
22
64
  // are diagnosable instead of silent. Best-effort.
23
65
  const logFail = (reason) => {
@@ -73,6 +115,34 @@ export const listCharacters = (url = DEFAULT_URL) => {
73
115
  }
74
116
  };
75
117
 
118
+ // For tsundere mode: given a speaker id, find the character that owns it and map
119
+ // its styles to { normal, tsun, dere } speaker ids (so the SAME character can
120
+ // speak in a ツンツン or あまあま voice). Missing styles fall back to normal.
121
+ export const resolveStyles = (speakerId, url = DEFAULT_URL) => {
122
+ try {
123
+ const out = execFileSync('curl', ['-s', '-m', '4', `${url}/speakers`], { encoding: 'utf8', timeout: 5000 });
124
+ const data = JSON.parse(out);
125
+ const sid = Number(speakerId);
126
+ for (const sp of data) {
127
+ const styles = sp.styles || [];
128
+ if (!styles.some((s) => Number(s.id) === sid)) continue;
129
+ const find = (re) => {
130
+ const m = styles.find((s) => re.test(s.name || ''));
131
+ return m ? Number(m.id) : null;
132
+ };
133
+ const normal = find(/ノーマル|普通/) ?? sid;
134
+ return {
135
+ normal,
136
+ tsun: find(/ツンツン|ツン/) ?? normal,
137
+ dere: find(/あまあま|甘え|デレ|ささやき/) ?? normal,
138
+ };
139
+ }
140
+ } catch {
141
+ /* engine down / parse error — caller falls back to the base speaker */
142
+ }
143
+ return null;
144
+ };
145
+
76
146
  const playWav = (wav, vol = 1) => {
77
147
  if (platform === 'darwin') execFileSync('afplay', ['-v', String(vol), wav], { timeout: 30000 });
78
148
  else if (platform === 'linux') {
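Note on `resolveStyles` in this hunk: it finds the character that owns the given speaker id, then matches that character's style names against three patterns, falling back to the normal style when a pattern has no match. The sketch below runs the same matching against an in-memory list instead of calling the engine. The sample `speakers` data (names and ids) is hypothetical and only shaped like a `/speakers` response; `resolveFrom` is not a function in the package.

```javascript
// Hypothetical sample shaped like a VOICEVOX /speakers response. The ids and
// character names are illustrative, not taken from a real engine.
const speakers = [
  { name: 'Character A', styles: [{ id: 10, name: 'ノーマル' }, { id: 11, name: 'あまあま' }, { id: 12, name: 'ツンツン' }] },
  { name: 'Character B', styles: [{ id: 20, name: 'ノーマル' }] },
];

// Same matching logic as resolveStyles(), minus the curl call.
const resolveFrom = (data, speakerId) => {
  const sid = Number(speakerId);
  for (const sp of data) {
    const styles = sp.styles || [];
    if (!styles.some((s) => Number(s.id) === sid)) continue;
    const find = (re) => {
      const m = styles.find((s) => re.test(s.name || ''));
      return m ? Number(m.id) : null;
    };
    const normal = find(/ノーマル|普通/) ?? sid;
    return { normal, tsun: find(/ツンツン|ツン/) ?? normal, dere: find(/あまあま|甘え|デレ|ささやき/) ?? normal };
  }
  return null; // the id is not owned by any character
};

console.log(resolveFrom(speakers, 11)); // { normal: 10, tsun: 12, dere: 11 }
console.log(resolveFrom(speakers, 20)); // { normal: 20, tsun: 20, dere: 20 }
console.log(resolveFrom(speakers, 99)); // null
```

Note that the lookup is by character, not by style: passing the id of the sweet style (11) still returns the character's normal style (10) under `normal`.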
@@ -84,8 +154,25 @@ const playWav = (wav, vol = 1) => {
84
154
  }
85
155
  };
86
156
 
157
+ // Apply a prosody profile to a VOICEVOX audio_query JSON in place, so the
158
+ // read-out has human contour (pace/pitch/intonation) instead of a flat 棒読み.
159
+ // Only the small query JSON passes through Node; the WAV never does.
160
+ const applyProsody = (queryPath, prosody) => {
161
+ if (!prosody) return;
162
+ try {
163
+ const q = JSON.parse(readFileSync(queryPath, 'utf8'));
164
+ if (typeof prosody.speed === 'number') q.speedScale = prosody.speed;
165
+ if (typeof prosody.pitch === 'number') q.pitchScale = prosody.pitch;
166
+ if (typeof prosody.intonation === 'number') q.intonationScale = prosody.intonation;
167
+ writeFileSync(queryPath, JSON.stringify(q));
168
+ } catch {
169
+ /* leave the query untouched on any parse/IO error */
170
+ }
171
+ };
172
+
87
173
  // Synthesize and play. Returns true if it spoke, false to fall back to `say`.
88
- export const speak = (text, speaker = 3, url = DEFAULT_URL, vol = 1, timeoutMs = 15000) => {
174
+ // `prosody` (optional) = { speed, pitch, intonation } audio_query scale overrides.
175
+ export const speak = (text, speaker = 3, url = DEFAULT_URL, vol = 1, timeoutMs = 15000, prosody = null) => {
89
176
  if (!text) return false;
90
177
  let dir;
91
178
  try {
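Note on how the prosody reaches VOICEVOX: `effectiveProsody` in `tsundere.mjs` combines the user's base values with the tone's nudge (scales multiply, pitch adds), and `applyProsody` in this hunk writes the result into the `speedScale`, `pitchScale` and `intonationScale` fields of the audio query. The sketch below chains the two steps on an in-memory object. The `query` object is a hypothetical, trimmed payload, and `effective` / `applyTo` are illustrative stand-ins that skip the file I/O.

```javascript
// Hypothetical, trimmed audio_query payload: the three fields applyProsody()
// overwrites, plus one field it leaves alone.
const query = { speedScale: 1, pitchScale: 0, intonationScale: 1, volumeScale: 1 };

// Tone nudges for VOICEVOX, copied from PROSODY[tone].vv in tsundere.mjs.
const VV = {
  tsun: { speed: 1.06, pitch: 0.0, intonation: 1.2 },
  normal: { speed: 1.0, pitch: 0.0, intonation: 1.0 },
  dere: { speed: 0.96, pitch: 0.0, intonation: 1.1 },
};

// Same composition as effectiveProsody(): scales multiply, pitch adds.
const effective = (tone, base) => ({
  speed: base.speed * VV[tone].speed,
  pitch: base.pitch + VV[tone].pitch,
  intonation: base.intonation * VV[tone].intonation,
});

// Illustrative stand-in for applyProsody(): returns a copy instead of
// rewriting q.json on disk.
const applyTo = (q, p) => ({ ...q, speedScale: p.speed, pitchScale: p.pitch, intonationScale: p.intonation });

const base = { speed: 1.2, pitch: 0.05, intonation: 1.0 }; // user-tuned base values
console.log(applyTo(query, effective('tsun', base)));
```

Since the range clamp in `state.mjs` applies to the base values only, the product can land slightly outside the slider range (for example, a base speed of 1.5 with the tsun nudge gives about 1.59).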
@@ -93,12 +180,26 @@ export const speak = (text, speaker = 3, url = DEFAULT_URL, vol = 1, timeoutMs =
93
180
  const wav = join(dir, 'v.wav');
94
181
  const sec = String(Math.max(2, Math.ceil(timeoutMs / 1000)));
95
182
  const enc = encodeURIComponent(text); // URL-encoded -> no shell metacharacters
96
- // Pipe audio_query straight into synthesis. execSync uses /bin/sh for the pipe.
97
- const cmd =
98
- `curl -s -m ${sec} -X POST "${url}/audio_query?speaker=${speaker}&text=${enc}" | ` +
99
- `curl -s -m ${sec} -X POST -H "Content-Type: application/json" -d @- ` +
100
- `"${url}/synthesis?speaker=${speaker}" -o "${wav}"`;
101
- execSync(cmd, { timeout: timeoutMs + 1000, stdio: 'ignore' });
183
+ if (prosody) {
184
+ // Two steps so we can tune the query JSON between them (still no WAV in Node).
185
+ const q = join(dir, 'q.json');
186
+ execSync(`curl -s -m ${sec} -X POST "${url}/audio_query?speaker=${speaker}&text=${enc}" -o "${q}"`, {
187
+ timeout: timeoutMs + 1000,
188
+ stdio: 'ignore',
189
+ });
190
+ applyProsody(q, prosody);
191
+ execSync(
192
+ `curl -s -m ${sec} -X POST -H "Content-Type: application/json" -d @"${q}" "${url}/synthesis?speaker=${speaker}" -o "${wav}"`,
193
+ { timeout: timeoutMs + 1000, stdio: 'ignore' }
194
+ );
195
+ } else {
196
+ // Pipe audio_query straight into synthesis. execSync uses /bin/sh for the pipe.
197
+ const cmd =
198
+ `curl -s -m ${sec} -X POST "${url}/audio_query?speaker=${speaker}&text=${enc}" | ` +
199
+ `curl -s -m ${sec} -X POST -H "Content-Type: application/json" -d @- ` +
200
+ `"${url}/synthesis?speaker=${speaker}" -o "${wav}"`;
201
+ execSync(cmd, { timeout: timeoutMs + 1000, stdio: 'ignore' });
202
+ }
102
203
  if (!existsSync(wav) || statSync(wav).size < 1000) {
103
204
  logFail(`empty/short wav (speaker ${speaker}, ${text.length} chars)`);
104
205
  return false;