ai-notify 0.1.1 → 0.2.0

package/README.md CHANGED
@@ -2,18 +2,26 @@
2
2
 
3
3
  **Know the moment your terminal AI agent needs you** — a sound, a spoken read-out, and a desktop banner the instant Claude Code, Codex, or another agent finishes a turn or asks for input. One mute switch covers **all of them, across every terminal**. No daemon, no background process.
4
4
 
5
- Long-running agents leave you staring at a quiet terminal. `ai-notify` wires a tiny notification hook into each agent CLI you have installed, so you can look away and get pulled back exactly when there's something to do. And when you're in a meeting, **one tap silences every agent at once** — because they all read the same shared switch.
5
+ ![ai-notify demo](https://raw.githubusercontent.com/unoryota/ai-notify/main/assets/demo.gif)
6
6
 
7
7
  ```sh
8
8
  npm i -g ai-notify
9
9
  ai-notify init # auto-detects your agents and wires them
10
10
  ```
11
11
 
12
- ## Why
12
+ ## What makes it different
13
13
 
14
- - **Get notified even if you never set it up.** The point is to *add* notifications. Muting is just a bonus feature on top.
15
- - **All your agents, one switch.** Use only Claude Code? Only Codex? Both? Plus others? Same experience. The mute flag is shared, so flipping it once is global.
16
- - **Zero friction.** No daemon. Re-run `init` anytime: it only wires what's newly detected and never clobbers your existing config.
14
+ Plenty of agents go quiet for minutes. ai-notify pulls you back at the right moment and is built for **running many agents at once**:
15
+
16
+ - 🎙️ **A different voice per terminal.** Give each pane its own spoken voice, so you know *which* window finished just by listening: `export AI_NOTIFY_VOICE=Eddy` (or a [VOICEVOX](#-voicevox-character-voices) character).
17
+ - 🌐 **Read out in your language.** An agent's English reply or prompt is translated before it's spoken/shown (key-less, no cost) — great for non-English speakers.
18
+ - 📝 **It tells you *what* was done.** The "done" notification summarizes the agent's last reply (from the transcript), not just "finished".
19
+ - 🔕 **One switch mutes everything.** Every agent in every terminal reads the same flag — one tap silences them all for a meeting.
20
+ - 🔔 **A real menu bar bell, built in.** `ai-notify menubar install` — no Hammerspoon/SwiftBar required.
21
+
22
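+ > ### Japanese summary
23
+ > A notification tool that solves the problem of **not knowing which terminal a notification came from when you run several AI agents (Claude Code / Codex …) in parallel**.
24
+ > **A different voice per pane** (including VOICEVOX character voices) / **English output translated into Japanese and read aloud** / **a summary of the work in the "done" notification** / **mute everything with one tap** (for meetings) / **a built-in menu bar bell**.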
17
25
 
18
26
  ## Supported agents
19
27
 
@@ -21,63 +29,90 @@ ai-notify init # auto-detects your agents and wires them
21
29
  | ----- | ------ | -------------- |
22
30
  | Claude Code | ✅ | `Notification` + `Stop` hooks in `~/.claude/settings.json` |
23
31
  | Codex CLI | ✅ | `notify` in `~/.codex/config.toml` (`agent-turn-complete`) |
24
- | Gemini CLI | 🧪 detected, hook WIP | see [CONTRIBUTING](CONTRIBUTING.md) — PRs welcome |
32
+ | Gemini CLI | 🧪 detected, hook WIP | PRs welcome |
25
33
 
26
- Adding another agent (aider, opencode, amp, ...) is a small PR: drop a file in `src/providers/`. See [CONTRIBUTING](CONTRIBUTING.md).
34
+ Adding another agent (aider, opencode, amp, ...) is a small PR: drop a file in `src/providers/`. See [CONTRIBUTING](CONTRIBUTING.md).
27
35
 
28
- ## Usage
36
+ ## Commands
29
37
 
30
38
  ```sh
31
39
  ai-notify init [--dry-run] [--only claude,codex] # wire detected agents
32
- ai-notify uninstall [--only ...] # cleanly remove wiring
33
- ai-notify toggle | on | off | status # the mute switch
40
+ ai-notify toggle | on | off | status # the mute switch
41
+ ai-notify volume [0.0-2.0] # get/set output volume
42
+ ai-notify voice [number|name|preview|default] # pick the spoken voice
43
+ ai-notify voicevox [on <id>|off|speakers|test] # speak in VOICEVOX voices
44
+ ai-notify translate [on <lang>|off|test] # speak agent text in your language
45
+ ai-notify menubar [install|uninstall|status] # native menu bar app (macOS)
34
46
  ai-notify doctor # check deps & wiring
35
- ai-notify config [init] # print / write config
47
+ ai-notify uninstall # cleanly remove wiring
48
+ ```
49
+
50
+ Per-window overrides — `export` these in a terminal *before* launching the agent:
51
+
52
+ ```sh
53
+ AI_NOTIFY_LABEL=api # name this window in the read-out / notification
54
+ AI_NOTIFY_VOICE=Eddy # this window's `say` voice
55
+ AI_NOTIFY_VOICEVOX_SPEAKER=3 # this window's VOICEVOX speaker id
56
+ AI_NOTIFY_VOLUME=0.5 # this window's volume (0.0–2.0)
57
+ ```
58
+
59
+ ## 🎛️ Native menu bar app — mute, volume, and voices
60
+
61
+ You can't type into the terminal that's running an agent, so drive everything from the **menu bar**:
62
+
63
+ ```sh
64
+ ai-notify menubar install # native menu bar app, starts at login
36
65
  ```
37
66
 
38
- > After `init`, restart any already-running Codex session so it re-reads its config.
67
+ A monochrome waveform icon shows status by color (Adobe-style): plain when idle, a **yellow** dot when an agent is waiting for you, **red + slash** when muted.
39
68
 
40
- ## Mute everything without touching a busy terminal
69
+ - **Left-click** menu: a **volume slider**, the **voice list** (system + VOICEVOX), and **per-pane** controls, so each open terminal gets its own voice *and* volume.
70
+ - **Right-click** → instant mute toggle.
41
71
 
42
- You can't type a command into the terminal that's running an agent, and you want
43
- the current state visible at a glance. So don't drive this from a prompt — drive
44
- it from the **menu bar / a hotkey**, and show the state where you can always see it.
72
+ No third-party app needed. Prefer something else? There are drop-in recipes for **Hammerspoon**, **SwiftBar/xbar**, **Raycast**, and the built-in **macOS Shortcuts** in [`recipes/`](recipes/). `ai-notify status --icon` prints just `🔔`/`🔕` to embed in tmux / your prompt / Claude Code's status line.
45
73
 
46
- > Toggling works mid-run: the flag is read the next time an agent fires a
47
- > notification, so flipping it instantly affects every running agent. A hotkey
48
- > runs in its own process — it never types into your busy terminal.
74
+ > Toggling works mid-run: the flag is read the next time an agent fires, so flipping it instantly affects every running agent.
49
75
 
50
- **Recommended always-visible menu bar toggle** (pick one):
76
+ ## 🎙️ VOICEVOX character voices
51
77
 
52
- - **Hammerspoon**: menu bar icon **and** a global hotkey (⌃⌥M) in ~20 lines. [recipes/hammerspoon](recipes/hammerspoon/).
53
- - **SwiftBar / xbar** — a 🔔/🔕 menu bar item you click to toggle. [recipes/swiftbar](recipes/swiftbar/).
54
- - **macOS Shortcuts** — `ai-notify toggle` pinned to the menu bar / a hotkey / iPhone. [recipes/macos-shortcut](recipes/macos-shortcut/).
55
- - **Raycast** — drop-in script command + hotkey. [recipes/raycast](recipes/raycast/).
78
+ Speak your notifications in [VOICEVOX](https://voicevox.hiroshiba.jp/) character voices (free, local, offline). Run the VOICEVOX app, then:
56
79
 
57
- **Always-visible state** — `ai-notify status --icon` prints just `🔔`/`🔕`, ready to embed:
80
+ ```sh
81
+ ai-notify voicevox speakers # list available characters + ids
82
+ ai-notify voicevox on 3 # use speaker 3 (e.g. ずんだもん)
83
+ ```
84
+
85
+ Give every pane its own character with `AI_NOTIFY_VOICEVOX_SPEAKER`. If the engine isn't running, ai-notify silently falls back to the OS voice.
86
+ *VOICEVOX characters have their own terms of use — credit them per [VOICEVOX's guidelines](https://voicevox.hiroshiba.jp/term/) if you share recordings.*
58
87
 
59
- - **Inside Claude Code's own status line** (the busy terminal shows its own state). [recipes/claude-statusline](recipes/claude-statusline/).
60
- - **tmux status bar / shell prompt / Starship.** [recipes/tmux](recipes/tmux/).
88
+ ## 🌐 Read out in your language
89
+
90
+ ```sh
91
+ ai-notify translate on ja # translate the agent's message, then speak it
92
+ ai-notify translate test "I fixed the auth bug and added 3 tests."
93
+ ```
94
+
95
+ Key-less and no cost (one HTTP request; falls back to a localized template offline). The desktop banner still shows the original text.
96
+
97
+ ## ⏳ Which window, and what it's asking
98
+
99
+ Each notification is titled with the window label — `⏳ <label>` when an agent is waiting, `✓ <label>` when it's done — and the body says **what** (the translated prompt, or a summary of what was done). Set a short `AI_NOTIFY_LABEL` per pane and you can tell ten terminals apart at a glance.
61
100
 
62
101
  ## How it works
63
102
 
64
- `ai-notify` keeps a single mute flag and config under XDG paths:
103
+ A single mute flag and config under XDG paths — no daemon, no coordination:
65
104
 
66
105
  ```
67
106
  ${XDG_STATE_HOME:-~/.local/state}/ai-notify/muted # presence = muted
68
107
  ${XDG_CONFIG_HOME:-~/.config}/ai-notify/config.json # sounds, voice, options
69
108
  ```
70
109
 
71
- Each agent's hook calls `ai-notify hook --source <agent>`, which reads that one flag at fire time. That's why every agent and every terminal stay in sync with no coordination.
72
-
73
- ### Configuration
74
-
75
- `ai-notify config init` writes a config you can edit — per-agent sounds and voice, whether the desktop banner still shows while muted, and whether to speak a read-out. Sounds default to OS built-ins, so nothing is bundled.
110
+ Each agent's hook calls `ai-notify hook --source <agent>`, which reads that one flag at fire time. `ai-notify config init` writes an editable config (per-agent sounds, voice, TTS backend, translation, templates).
76
111
 
77
112
  ## Platforms
78
113
 
79
- macOS is fully supported (`afplay` / `say` / `terminal-notifier` or `osascript`). Linux is best-effort (`paplay`/`canberra`, `notify-send`, `spd-say`/`espeak`). Windows plays a beep and speaks via PowerShell. Missing backends degrade silently — they never error.
114
+ macOS is fully supported (`afplay` / `say` / VOICEVOX / `terminal-notifier` / native menu bar). Linux is best-effort (`paplay`/`canberra`, `notify-send`, `spd-say`/`espeak`, VOICEVOX). Windows plays a beep and speaks via PowerShell. Missing backends degrade silently — they never error.
80
115
 
81
116
  ## License
82
117
 
83
- [MIT](LICENSE).
118
+ [MIT](LICENSE). Zero runtime dependencies.
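
The "How it works" section of the README above says the mute state is a single flag file whose presence means "muted", located under `${XDG_STATE_HOME:-~/.local/state}/ai-notify/`. A minimal sketch of that lookup follows. The function names are hypothetical (the package's real logic lives in `src/state.mjs`, which this diff does not show), and the file-existence check is injected so the sketch needs no imports.

```javascript
// Hypothetical helpers illustrating the documented path layout.
// Not the package's actual code; src/state.mjs is not part of this diff.

// Mirrors the shell expansion ${XDG_STATE_HOME:-~/.local/state}/ai-notify:
// an unset or empty XDG_STATE_HOME falls back to <home>/.local/state.
function stateDir(env, home) {
  const base = env.XDG_STATE_HOME || `${home}/.local/state`;
  return `${base}/ai-notify`;
}

// The flag file itself: its presence means "muted".
function mutedFlagPath(env, home) {
  return `${stateDir(env, home)}/muted`;
}

// `exists` is any predicate on a path, for example fs.existsSync in Node.
function isMuted(env, home, exists) {
  return exists(mutedFlagPath(env, home));
}
```

Because the check is a plain file-existence test done at fire time, any process that can see the same state directory agrees on the mute state without talking to the others, which is the property the README relies on.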
@@ -1,122 +1,251 @@
1
- // ai-notify menu bar agent — a tiny native NSStatusItem that mirrors the one
2
- // shared mute flag and toggles it on click. No third-party app required.
1
+ // ai-notify menu bar agent — native NSStatusItem, no third-party app.
3
2
  //
4
- // Single source of truth: the same file the CLI and every wired agent read,
5
- // ${XDG_STATE_HOME:-~/.local/state}/ai-notify/muted
6
- // Present = muted (🔕). Absent = on (🔔).
3
+ // Shared state (same files the CLI and every agent read), under
4
+ // ${XDG_STATE_HOME:-~/.local/state}/ai-notify/ :
5
+ // muted present = muted
6
+ // volume 0.0–2.0 (1.0 = normal)
7
+ // cli launcher -> `ai-notify`
7
8
  //
8
- // Left click : toggle mute/unmute (one tap)
9
- // Right click : menu (toggle / quit)
9
+ // Left click : menu: volume slider, voice list (flat), per-pane voices, quit
10
+ // Right click : toggle mute (one tap)
10
11
  //
11
12
  // Builds with the system `swiftc` — no Xcode project, no dependencies.
12
13
 
13
14
  import Cocoa
14
15
 
15
- // MARK: - Shared state (must match src/state.mjs)
16
-
17
16
  enum State {
18
- static func stateDir() -> String {
17
+ static func dir() -> String {
19
18
  let env = ProcessInfo.processInfo.environment
20
19
  let base = env["XDG_STATE_HOME"]
21
20
  ?? (NSHomeDirectory() as NSString).appendingPathComponent(".local/state")
22
21
  return (base as NSString).appendingPathComponent("ai-notify")
23
22
  }
23
+ static func file(_ name: String) -> String { (dir() as NSString).appendingPathComponent(name) }
24
24
 
25
- static func flagPath() -> String {
26
- (stateDir() as NSString).appendingPathComponent("muted")
25
+ static var isMuted: Bool { FileManager.default.fileExists(atPath: file("muted")) }
26
+ static func setMuted(_ m: Bool) {
27
+ let p = file("muted"), fm = FileManager.default
28
+ if m { try? fm.createDirectory(atPath: dir(), withIntermediateDirectories: true); fm.createFile(atPath: p, contents: Data()) }
29
+ else { try? fm.removeItem(atPath: p) }
27
30
  }
28
31
 
29
- static var isMuted: Bool {
30
- FileManager.default.fileExists(atPath: flagPath())
32
+ static var volume: Double {
33
+ guard let s = try? String(contentsOfFile: file("volume"), encoding: .utf8),
34
+ let v = Double(s.trimmingCharacters(in: .whitespacesAndNewlines)) else { return 1.0 }
35
+ return min(2, max(0, v))
31
36
  }
32
37
 
33
- static func setMuted(_ muted: Bool) {
34
- let path = flagPath()
35
- let fm = FileManager.default
36
- if muted {
37
- try? fm.createDirectory(atPath: stateDir(), withIntermediateDirectories: true)
38
- fm.createFile(atPath: path, contents: Data())
39
- } else {
40
- try? fm.removeItem(atPath: path)
38
+ // Any pane waiting for input -> the icon shows a yellow status.
39
+ static var hasWaiting: Bool {
40
+ guard let s = try? String(contentsOfFile: file("waiting.json"), encoding: .utf8) else { return false }
41
+ let t = s.trimmingCharacters(in: .whitespacesAndNewlines)
42
+ return !t.isEmpty && t != "{}" && t != "[]"
43
+ }
44
+ static func setVolume(_ v: Double) {
45
+ try? FileManager.default.createDirectory(atPath: dir(), withIntermediateDirectories: true)
46
+ try? String(format: "%.2f", v).write(toFile: file("volume"), atomically: true, encoding: .utf8)
47
+ }
48
+
49
+ @discardableResult
50
+ static func cli(_ args: [String], capture: Bool = false) -> String? {
51
+ let launcher = file("cli")
52
+ guard FileManager.default.isExecutableFile(atPath: launcher) else { return nil }
53
+ let task = Process()
54
+ task.executableURL = URL(fileURLWithPath: launcher)
55
+ task.arguments = args
56
+ let pipe = Pipe()
57
+ if capture { task.standardOutput = pipe; task.standardError = Pipe() }
58
+ do { try task.run() } catch { return nil }
59
+ if capture {
60
+ let data = pipe.fileHandleForReading.readDataToEndOfFile()
61
+ task.waitUntilExit()
62
+ return String(data: data, encoding: .utf8)
41
63
  }
64
+ return nil
42
65
  }
43
66
  }
44
67
 
45
- // MARK: - App
46
-
47
68
  final class AppDelegate: NSObject, NSApplicationDelegate {
48
69
  private var statusItem: NSStatusItem!
49
70
  private var timer: Timer?
50
71
 
51
72
  func applicationDidFinishLaunching(_ notification: Notification) {
52
73
  statusItem = NSStatusBar.system.statusItem(withLength: NSStatusItem.variableLength)
53
- if let button = statusItem.button {
54
- button.action = #selector(handleClick(_:))
55
- button.target = self
56
- button.sendAction(on: [.leftMouseUp, .rightMouseUp])
74
+ if let b = statusItem.button {
75
+ b.action = #selector(handleClick(_:))
76
+ b.target = self
77
+ b.sendAction(on: [.leftMouseUp, .rightMouseUp])
57
78
  }
58
79
  render()
59
- // Reconcile every second so external changes (CLI `ai-notify on/off`,
60
- // another tool) are reflected without any IPC.
61
- timer = Timer.scheduledTimer(withTimeInterval: 1.0, repeats: true) { [weak self] _ in
62
- self?.render()
80
+ timer = Timer.scheduledTimer(withTimeInterval: 1.0, repeats: true) { [weak self] _ in self?.render() }
81
+ }
82
+
83
+ // Black/white waveform silhouette (template, auto-adapting) when idle; a
84
+ // composite with a colored status dot when waiting (yellow) or muted (red +
85
+ // slash) — Adobe-style status-by-color.
86
+ private func statusImage(muted: Bool, waiting: Bool) -> NSImage {
87
+ let cfg = NSImage.SymbolConfiguration(pointSize: 15, weight: .regular)
88
+ let sym = (NSImage(systemSymbolName: "waveform", accessibilityDescription: "ai-notify")?
89
+ .withSymbolConfiguration(cfg)) ?? NSImage()
90
+
91
+ if !muted && !waiting {
92
+ sym.isTemplate = true // system tints to the menu bar color
93
+ return sym
94
+ }
95
+
96
+ let dark = (statusItem.button?.effectiveAppearance.bestMatch(from: [.aqua, .darkAqua]) == .darkAqua)
97
+ let fg: NSColor = muted ? .tertiaryLabelColor : (dark ? .white : .black)
98
+ let size = sym.size
99
+ let img = NSImage(size: size)
100
+ img.lockFocus()
101
+ let rect = NSRect(origin: .zero, size: size)
102
+ sym.draw(in: rect)
103
+ fg.set(); rect.fill(using: .sourceAtop) // tint the silhouette
104
+ // status dot, top-right
105
+ let d: CGFloat = 6
106
+ (muted ? NSColor.systemRed : NSColor.systemYellow).set()
107
+ NSBezierPath(ovalIn: NSRect(x: size.width - d, y: size.height - d, width: d, height: d)).fill()
108
+ if muted { // red slash
109
+ let s = NSBezierPath(); s.lineWidth = 1.6
110
+ s.move(to: NSPoint(x: 1.5, y: 1.5)); s.line(to: NSPoint(x: size.width - 1.5, y: size.height - 1.5))
111
+ NSColor.systemRed.set(); s.stroke()
63
112
  }
113
+ img.unlockFocus()
114
+ img.isTemplate = false
115
+ return img
64
116
  }
65
117
 
66
118
  private func render() {
67
- statusItem.button?.title = State.isMuted ? "🔕" : "🔔"
119
+ guard let b = statusItem.button else { return }
120
+ b.title = ""
121
+ b.image = statusImage(muted: State.isMuted, waiting: State.hasWaiting)
68
122
  }
69
123
 
70
124
  @objc private func handleClick(_ sender: Any?) {
71
- guard let event = NSApp.currentEvent else { toggle(); return }
72
- if event.type == .rightMouseUp {
73
- showMenu()
74
- } else {
75
- toggle()
76
- }
125
+ guard let e = NSApp.currentEvent else { showMenu(); return }
126
+ if e.type == .rightMouseUp { toggle() } else { showMenu() }
77
127
  }
78
128
 
79
- private func toggle() {
80
- let nowMuted = !State.isMuted
81
- State.setMuted(nowMuted)
82
- render()
83
- if !nowMuted { chime() } // brief confirmation on un-mute
129
+ private func toggle() { State.setMuted(!State.isMuted); render() }
130
+ @objc private func quit() { NSApp.terminate(nil) }
131
+
132
+ @objc private func volumeChanged(_ s: NSSlider) { State.setVolume(s.doubleValue) }
133
+ @objc private func paneVolumeChanged(_ s: NSSlider) {
134
+ if let tty = s.identifier?.rawValue { State.cli(["volume-pane", tty, String(format: "%.2f", s.doubleValue)]) }
84
135
  }
85
136
 
86
- @objc private func toggleFromMenu() { toggle() }
137
+ // A 🔊 + slider row. identifier == nil => global (live); otherwise a pane tty
138
+ // (applied on release to avoid a subprocess per drag tick).
139
+ private func sliderRow(value: Double, action: Selector, identifier: String?) -> NSMenuItem {
140
+ let row = NSView(frame: NSRect(x: 0, y: 0, width: 220, height: 26))
141
+ let icon = NSTextField(labelWithString: "🔊"); icon.frame = NSRect(x: 12, y: 4, width: 20, height: 18)
142
+ let slider = NSSlider(value: value, minValue: 0, maxValue: 2, target: self, action: action)
143
+ slider.frame = NSRect(x: 36, y: 3, width: 170, height: 20)
144
+ slider.isContinuous = (identifier == nil)
145
+ if let id = identifier { slider.identifier = NSUserInterfaceItemIdentifier(id) }
146
+ row.addSubview(icon); row.addSubview(slider)
147
+ let item = NSMenuItem(); item.view = row
148
+ return item
149
+ }
87
150
 
88
- @objc private func quit() { NSApp.terminate(nil) }
151
+ // representedObject is the full CLI arg array to run.
152
+ @objc private func runItem(_ item: NSMenuItem) {
153
+ if let cmd = item.representedObject as? [String] { State.cli(cmd) }
154
+ }
155
+
156
+ private func disabledHeader(_ title: String) -> NSMenuItem {
157
+ let it = NSMenuItem(title: title, action: nil, keyEquivalent: "")
158
+ it.isEnabled = false
159
+ return it
160
+ }
89
161
 
90
162
  private func showMenu() {
91
- let muted = State.isMuted
92
163
  let menu = NSMenu()
93
- let toggleItem = NSMenuItem(
94
- title: muted ? "通知をオンにする" : "ミュート",
95
- action: #selector(toggleFromMenu), keyEquivalent: "")
96
- toggleItem.target = self
97
- menu.addItem(toggleItem)
164
+
165
+ // Global volume slider.
166
+ menu.addItem(sliderRow(value: State.volume, action: #selector(volumeChanged(_:)), identifier: nil))
167
+ menu.addItem(.separator())
168
+
169
+ // Parse menu-json once.
170
+ let json = (State.cli(["menu-json"], capture: true)?.data(using: .utf8))
171
+ .flatMap { try? JSONSerialization.jsonObject(with: $0) as? [String: Any] }
172
+ let voices = (json?["voices"] as? [[String: Any]]) ?? []
173
+ let panes = (json?["panes"] as? [[String: Any]]) ?? []
174
+
175
+ if voices.isEmpty {
176
+ menu.addItem(disabledHeader("(声の一覧を取得できません)"))
177
+ } else {
178
+ // Global voice list — flat, at the top level.
179
+ menu.addItem(disabledHeader("ボイス(全体)"))
180
+ addVoiceItems(voices, to: menu, paneTty: nil, currentPaneLabel: nil)
181
+ }
182
+
183
+ // Per-pane voices: one submenu per recently-active pane.
184
+ if !panes.isEmpty {
185
+ menu.addItem(.separator())
186
+ menu.addItem(disabledHeader("ペイン別"))
187
+ for p in panes {
188
+ guard let tty = p["tty"] as? String else { continue }
189
+ let label = p["label"] as? String ?? tty
190
+ let cur = p["current"] as? String
191
+ let item = NSMenuItem(title: cur != nil ? "\(label) — \(cur!)" : label, action: nil, keyEquivalent: "")
192
+ let sub = NSMenu()
193
+ // Per-pane volume.
194
+ let pv = (p["volume"] as? Double) ?? State.volume
195
+ sub.addItem(disabledHeader("音量"))
196
+ sub.addItem(sliderRow(value: pv, action: #selector(paneVolumeChanged(_:)), identifier: tty))
197
+ let volDef = NSMenuItem(title: "音量を全体に従う", action: #selector(runItem(_:)), keyEquivalent: "")
198
+ volDef.target = self; volDef.representedObject = ["volume-pane", tty, "clear"]
199
+ volDef.state = (p["volumeSet"] as? Bool ?? false) ? .off : .on
200
+ sub.addItem(volDef)
201
+ sub.addItem(.separator())
202
+ // Per-pane voice.
203
+ sub.addItem(disabledHeader("声"))
204
+ let def = NSMenuItem(title: "デフォルト(全体に従う)", action: #selector(runItem(_:)), keyEquivalent: "")
205
+ def.target = self; def.representedObject = ["voice-pane", tty, "clear"]; def.state = (cur == nil) ? .on : .off
206
+ sub.addItem(def)
207
+ addVoiceItems(voices, to: sub, paneTty: tty, currentPaneLabel: cur)
208
+ item.submenu = sub
209
+ menu.addItem(item)
210
+ }
211
+ }
212
+
98
213
  menu.addItem(.separator())
99
214
  let quitItem = NSMenuItem(title: "ai-notify を終了", action: #selector(quit), keyEquivalent: "q")
100
215
  quitItem.target = self
101
216
  menu.addItem(quitItem)
102
217
 
103
- statusItem.menu = menu
104
- statusItem.button?.performClick(nil)
105
- statusItem.menu = nil // restore left-click-to-toggle
218
+ if let button = statusItem.button {
219
+ menu.popUp(positioning: nil, at: NSPoint(x: 0, y: button.bounds.height + 4), in: button)
220
+ }
106
221
  }
107
222
 
108
- private func chime() {
109
- let sound = "/System/Library/Sounds/Glass.aiff"
110
- guard FileManager.default.fileExists(atPath: sound) else { return }
111
- let task = Process()
112
- task.executableURL = URL(fileURLWithPath: "/usr/bin/afplay")
113
- task.arguments = ["-v", "2", sound]
114
- try? task.run()
223
+ // Add the voice list to `menu`. paneTty == nil => sets the global voice;
224
+ // otherwise assigns the voice to that pane.
225
+ private func addVoiceItems(_ voices: [[String: Any]], to menu: NSMenu, paneTty: String?, currentPaneLabel: String?) {
226
+ var lastSection = ""
227
+ for v in voices {
228
+ let section = v["section"] as? String ?? ""
229
+ let label = v["label"] as? String ?? "?"
230
+ let kind = v["kind"] as? String ?? "say"
231
+ let ref = v["ref"] as? String ?? ""
232
+ if section != lastSection { menu.addItem(disabledHeader("— \(section) —")); lastSection = section }
233
+ let it = NSMenuItem(title: label, action: #selector(runItem(_:)), keyEquivalent: "")
234
+ it.target = self
235
+ if let tty = paneTty {
236
+ it.representedObject = ["voice-pane", tty, kind, ref]
237
+ it.state = (currentPaneLabel == label) ? .on : .off
238
+ } else {
239
+ it.representedObject = kind == "voicevox" ? ["voicevox", "on", ref] : ["voice", ref]
240
+ it.state = (v["currentGlobal"] as? Bool ?? false) ? .on : .off
241
+ }
242
+ menu.addItem(it)
243
+ }
115
244
  }
116
245
  }
117
246
 
118
247
  let app = NSApplication.shared
119
- app.setActivationPolicy(.accessory) // no Dock icon, menu bar only
248
+ app.setActivationPolicy(.accessory)
120
249
  let delegate = AppDelegate()
121
250
  app.delegate = delegate
122
251
  app.run()
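
The Swift agent above and the CLI share state through small text files (`volume`, `waiting.json`), so both sides have to parse them the same way. The sketch below restates the Swift `State.volume` and `State.hasWaiting` rules in JavaScript for comparison. The function names are hypothetical, and treating non-finite values as "missing" is an assumption of this sketch rather than something the diff confirms.

```javascript
// Hypothetical JavaScript restatement of the Swift parsing rules shown above.

// volume file: trim, parse as a number, clamp to 0.0..2.0, default 1.0.
function parseVolume(text) {
  if (typeof text !== 'string') return 1.0; // file missing or unreadable
  const t = text.trim();
  if (t === '') return 1.0;                 // Number('') would be 0, so guard it
  const v = Number(t);
  if (!Number.isFinite(v)) return 1.0;      // assumption: garbage means "normal"
  return Math.min(2, Math.max(0, v));
}

// Counterpart of String(format: "%.2f", v) used when the slider writes the file.
function serializeVolume(v) {
  return v.toFixed(2);
}

// waiting.json: anything other than an empty file, "{}" or "[]" counts as waiting.
function hasWaiting(text) {
  if (typeof text !== 'string') return false;
  const t = text.trim();
  return t !== '' && t !== '{}' && t !== '[]';
}
```

Note that `hasWaiting` is a string comparison, not a JSON parse, so a file containing `{ }` with inner whitespace would count as "waiting" under these rules.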
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ai-notify",
3
- "version": "0.1.1",
3
+ "version": "0.2.0",
4
4
  "description": "Desktop, sound, and spoken notifications for terminal AI coding agents (Claude Code, Codex, Gemini, ...) — with one mute switch that covers all of them, across every terminal.",
5
5
  "type": "module",
6
6
  "bin": {
package/src/cli.mjs CHANGED
@@ -3,6 +3,7 @@
3
3
  // One mute switch for all of them, across every terminal. No daemon.
4
4
 
5
5
  import { readFileSync } from 'node:fs';
6
+ import { execSync } from 'node:child_process';
6
7
  import { providers, byId } from './providers/index.mjs';
7
8
  import { emit } from './notify.mjs';
8
9
  import { deriveLabel, cliInvocation, isEphemeralInstall } from './util.mjs';
@@ -10,9 +11,23 @@ import { curatedVoices, resolveVoice, previewVoice } from './voices.mjs';
10
11
  import * as menubar from './menubar.mjs';
11
12
  import { translate } from './translate.mjs';
12
13
  import { diagnose as highlightDiagnose, clearHighlight } from './highlight.mjs';
13
- import { isMuted, setMuted, toggleMuted, readConfig, writeConfig, paths, DEFAULT_CONFIG } from './state.mjs';
14
-
15
- const VERSION = '0.1.1';
14
+ import * as voicevox from './voicevox.mjs';
15
+ import {
16
+ isMuted,
17
+ setMuted,
18
+ toggleMuted,
19
+ readConfig,
20
+ writeConfig,
21
+ paths,
22
+ DEFAULT_CONFIG,
23
+ readVolume,
24
+ setVolume,
25
+ readPanes,
26
+ readPaneSetting,
27
+ updatePaneSetting,
28
+ } from './state.mjs';
29
+
30
+ const VERSION = '0.2.0';
16
31
 
17
32
  const args = process.argv.slice(2);
18
33
  const cmd = args[0];
@@ -75,6 +90,26 @@ const lastAssistantText = (transcriptPath) => {
75
90
  return '';
76
91
  };
77
92
 
93
+ // Terminals (ttys) currently running a wired agent — so all open panes can be
94
+ // assigned a voice from the menu bar without first firing a notification.
95
+ const livePanes = () => {
96
+ try {
97
+ const out = execSync('ps -Ao tty=,command=', { encoding: 'utf8', maxBuffer: 1 << 22 });
98
+ const ttys = new Set();
99
+ for (const line of out.split('\n')) {
100
+ const m = line.match(/^(\S+)\s+(.*)$/);
101
+ if (!m) continue;
102
+ const [, tty, cmd] = m;
103
+ if (tty === '??' || tty === '?') continue;
104
+ if (/ai-notify|menubar/.test(cmd)) continue; // skip our own hook/agent
105
+ if (/\bclaude\b|\bcodex\b|\bgemini\b/i.test(cmd)) ttys.add(`/dev/${tty}`);
106
+ }
107
+ return [...ttys];
108
+ } catch {
109
+ return [];
110
+ }
111
+ };
112
+
78
113
  const cmds = {
79
114
  init() {
80
115
  const dryRun = !!opt('dry-run');
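
The `livePanes` helper added in this hunk mixes process spawning with text parsing. As an illustration only, the parsing half can be pulled into a pure function that takes the `ps -Ao tty=,command=` output as a string, which makes the matching rules easy to exercise. The name `agentTtys` is hypothetical; the regular expressions are copied from the hunk.

```javascript
// Hypothetical pure variant of the parsing step inside livePanes().
function agentTtys(psOutput) {
  const ttys = new Set();
  for (const line of psOutput.split('\n')) {
    const m = line.match(/^(\S+)\s+(.*)$/);
    if (!m) continue;
    const [, tty, command] = m;
    if (tty === '??' || tty === '?') continue;        // no controlling terminal
    if (/ai-notify|menubar/.test(command)) continue;  // skip the tool's own processes
    if (/\bclaude\b|\bcodex\b|\bgemini\b/i.test(command)) ttys.add(`/dev/${tty}`);
  }
  return [...ttys]; // unique ttys, in first-seen order
}

// Usage with a made-up ps listing:
const sample = [
  'ttys001  /usr/local/bin/node /usr/local/bin/claude',
  'ttys001  -zsh',
  'ttys002  codex --model x',
  '??       /usr/libexec/helper claude',
  'ttys003  node ai-notify hook --source claude',
].join('\n');
const found = agentTtys(sample);
```

One consequence of the word-boundary match is that any command line containing the word, such as an editor opened on `claude.md`, also counts as a live agent pane.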
@@ -162,6 +197,7 @@ const cmds = {
162
197
 
163
198
  const setVoice = (name) => {
164
199
  config.voice = name; // '' = OS default
200
+ config.tts = 'say'; // choosing a system voice switches the backend off VOICEVOX
165
201
  // Global voice wins only if no per-provider override; clear them so the
166
202
  // single switch actually takes effect everywhere.
167
203
  for (const k of Object.keys(config.providers || {})) {
@@ -211,6 +247,152 @@ const cmds = {
211
247
  log(' Reset: ai-notify voice default');
212
248
  },
213
249
 
250
+ // Speak in VOICEVOX character voices (local engine, free, offline).
251
+ voicevox() {
252
+ const sub = positionals[0] || 'status';
253
+ const config = readConfig();
254
+ const url = config.voicevox?.url || voicevox.DEFAULT_URL;
255
+
256
+ if (sub === 'speakers') {
257
+ const list = voicevox.listSpeakers(url);
258
+ if (!list.length) return log(`No speakers (is VOICEVOX running at ${url}?).`);
259
+ list.forEach((s) => log(` ${String(s.id).padStart(3)} ${s.name}`));
260
+ log(`\nUse one: ai-notify voicevox on <id>`);
261
+ return;
262
+ }
263
+ if (sub === 'on') {
264
+ if (!voicevox.isAvailable(url)) {
265
+ console.error(`VOICEVOX engine not reachable at ${url}. Start the VOICEVOX app first.`);
266
+ process.exit(1);
267
+ }
268
+ const speaker = Number(positionals[1] || config.voicevox?.speaker || 3);
269
+ config.tts = 'voicevox';
270
+ config.voicevox = { ...(config.voicevox || {}), url, speaker };
271
+ writeConfig(config);
272
+ log(`✓ VOICEVOX on (speaker ${speaker}). Testing…`);
273
+ voicevox.speak('ボイスボックスで読み上げます。', speaker, url);
274
+ return;
275
+ }
276
+ if (sub === 'off') {
277
+ config.tts = 'say';
278
+ writeConfig(config);
279
+ return log('VOICEVOX off — using the OS voice.');
280
+ }
281
+ if (sub === 'test') {
282
+ const speaker = Number(positionals[1] || config.voicevox?.speaker || 3);
283
+ const ok = voicevox.speak('これはテスト読み上げです。完了しました。', speaker, url);
284
+ return log(ok ? `spoke with speaker ${speaker}` : `⚠ failed (is VOICEVOX running at ${url}?)`);
285
+ }
286
+ // status
287
+ log(`VOICEVOX: ${config.tts === 'voicevox' ? `on (speaker ${config.voicevox?.speaker})` : 'off'}`);
288
+ log(` engine ${url}: ${voicevox.isAvailable(url) ? '✓ reachable' : '✗ not running'}`);
289
+ if (config.tts !== 'voicevox') log('\nEnable: ai-notify voicevox on (list voices: ai-notify voicevox speakers)');
290
+ },
291
+
292
+ // Output volume 0.0–2.0 (1.0 = normal). Written to a state file the menu bar
293
+ // slider also drives; $AI_NOTIFY_VOLUME overrides per window.
294
+ volume() {
295
+ const arg = positionals[0];
296
+ if (arg === undefined) {
297
+ const config = readConfig();
298
+ const v = readVolume();
299
+ return log(`volume: ${v != null ? v : typeof config.volume === 'number' ? config.volume : 1}`);
300
+ }
301
+ const n = setVolume(arg);
302
+ log(`🔊 volume → ${n}`);
303
+ },
304
+
305
+ // Assign a voice to a specific pane (by tty), from the menu bar.
306
+ // voice-pane <tty> voicevox <id> | say <name> | clear
307
+ 'voice-pane'() {
308
+ const [tty, kind, ref] = positionals;
309
+ if (!tty) {
310
+ console.error('usage: voice-pane <tty> voicevox <id> | say <name> | clear');
311
+ process.exit(1);
312
+ }
313
+ if (!kind || kind === 'clear') {
314
+ updatePaneSetting(tty, { tts: null, speaker: null, voice: null });
315
+ return log(`pane ${tty}: voice reset to default`);
316
+ }
317
+ if (kind === 'voicevox') updatePaneSetting(tty, { tts: 'voicevox', speaker: Number(ref), voice: null });
318
+ else if (kind === 'say') updatePaneSetting(tty, { tts: 'say', voice: ref, speaker: null });
319
+ else {
320
+ console.error(`unknown kind: ${kind}`);
321
+ process.exit(1);
322
+ }
323
+ log(`pane ${tty}: ${kind} ${ref}`);
324
+ },
325
+
326
+ // Set a specific pane's output volume (0.0–2.0), or `clear` to follow global.
327
+ // volume-pane <tty> <0.0-2.0|clear>
328
+ 'volume-pane'() {
329
+ const [tty, arg] = positionals;
330
+ if (!tty || arg === undefined) {
331
+ console.error('usage: volume-pane <tty> <0.0-2.0|clear>');
332
+ process.exit(1);
333
+ }
334
+ if (arg === 'clear') {
335
+ updatePaneSetting(tty, { volume: null });
336
+ return log(`pane ${tty}: volume reset to global`);
337
+ }
338
+ const v = Math.min(2, Math.max(0, Number(arg)));
339
+ updatePaneSetting(tty, { volume: v });
340
+ log(`pane ${tty}: volume ${v}`);
341
+ },
342
+
343
+ // Machine-readable state for the menu bar agent: mute, volume, the selectable
344
+ // voices, and the recently-active panes (for per-pane assignment). Not human.
345
+ 'menu-json'() {
346
+ const config = readConfig();
347
+ const url = config.voicevox?.url || voicevox.DEFAULT_URL;
348
+ const chars = voicevox.isAvailable(url) ? voicevox.listCharacters(url) : [];
349
+ const idName = new Map(chars.map((c) => [c.id, c.name]));
350
+ const voices = [];
351
+ for (const c of chars)
352
+ voices.push({
353
+ section: 'VOICEVOX',
354
+ label: c.name,
355
+ kind: 'voicevox',
356
+ ref: String(c.id),
357
+ currentGlobal: config.tts === 'voicevox' && Number(config.voicevox?.speaker) === c.id,
358
+ });
359
+ for (const n of curatedVoices(10))
360
+ voices.push({
361
+ section: 'System',
362
+ label: n,
363
+ kind: 'say',
364
+ ref: n,
365
+ currentGlobal: config.tts !== 'voicevox' && config.voice === n,
366
+ });
367
+ const labelFor = (pv) => {
368
+ if (!pv) return null;
369
+ return pv.tts === 'voicevox' ? idName.get(Number(pv.speaker)) || `VOICEVOX ${pv.speaker}` : pv.voice || 'system';
370
+ };
371
+ // Panes = live terminals currently running an agent (so they show up before
372
+ // they ever fire a notification) merged with previously-recorded ones.
373
+ const globalVol = readVolume() != null ? readVolume() : typeof config.volume === 'number' ? config.volume : 1;
374
+ const recorded = new Map(readPanes().map((p) => [p.tty, p.label]));
375
+ const ttys = new Set([...livePanes(), ...recorded.keys()]);
376
+ const panes = [...ttys].map((tty) => {
377
+ const s = readPaneSetting(tty);
378
+ return {
379
+ tty,
380
+ label: recorded.get(tty) || tty.replace('/dev/', ''),
381
+ current: labelFor(s.tts ? s : null),
382
+ volume: typeof s.volume === 'number' ? s.volume : globalVol,
383
+ volumeSet: typeof s.volume === 'number',
384
+ };
385
+ });
386
+ log(
387
+ JSON.stringify({
388
+ muted: isMuted(),
389
+ volume: globalVol,
390
+ voices,
391
+ panes,
392
+ })
393
+ );
394
+ },
395
+
214
396
  // Native menu bar bell (macOS). Self-contained — no Hammerspoon/SwiftBar.
215
397
  menubar() {
216
398
  const sub = positionals[0] || 'status';
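The `menu-json` handler above merges the live terminals with the previously recorded panes, and lets each pane fall back to the global volume when it has no slider value of its own. A minimal sketch of that merge as a pure function follows. `mergePanes` and its parameters are hypothetical names for illustration; the package does this inline with `livePanes()`, `readPanes()` and `readPaneSetting()`.

```javascript
// Hypothetical pure version of the pane merge in `menu-json` (no file I/O).
// live:      array of tty paths currently running an agent
// recorded:  array of { tty, label } remembered from earlier notifications
// settings:  map of tty -> per-pane settings (may contain `volume`)
// globalVol: the volume used when a pane has no value of its own
const mergePanes = (live, recorded, settings, globalVol) => {
  const labels = new Map(recorded.map((p) => [p.tty, p.label]));
  const ttys = new Set([...live, ...labels.keys()]); // de-duplicates ttys
  return [...ttys].map((tty) => {
    const s = settings[tty] || {};
    const own = typeof s.volume === 'number';
    return {
      tty,
      label: labels.get(tty) || tty.replace('/dev/', ''),
      volume: own ? s.volume : globalVol,
      volumeSet: own,
    };
  });
};

const panes = mergePanes(
  ['/dev/ttys001'],
  [{ tty: '/dev/ttys002', label: 'api' }],
  { '/dev/ttys002': { volume: 0.5 } },
  1
);
console.log(JSON.stringify(panes));
// [{"tty":"/dev/ttys001","label":"ttys001","volume":1,"volumeSet":false},{"tty":"/dev/ttys002","label":"api","volume":0.5,"volumeSet":true}]
```

A live pane that has never fired a notification still appears, labelled by its tty name, which matches the comment in the handler.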
@@ -222,7 +404,7 @@ const cmds = {
222
404
  const r = menubar.install();
223
405
  log(` ✓ app: ${r.app}`);
224
406
  log(` ✓ agent: ${r.plist} (starts at login)`);
225
- log('A 🔔 should now be in your menu bar. Left-click toggles, right-click for a menu.');
407
+ log('A 🔔 is now in your menu bar. Left-click for the menu (volume, voices), right-click to mute.');
226
408
  return;
227
409
  }
228
410
  if (sub === 'uninstall') {
@@ -340,7 +522,9 @@ Usage:
340
522
  ai-notify init [--dry-run] [--only claude,codex] wire detected agents
341
523
  ai-notify uninstall [--only ...] remove wiring
342
524
  ai-notify toggle | on | off | status control the mute switch
525
+ ai-notify volume [0.0-2.0] get/set output volume
343
526
  ai-notify voice [number|name|preview|default] pick the spoken voice
527
+ ai-notify voicevox [on <id>|off|speakers|test] speak in VOICEVOX character voices
344
528
  ai-notify menubar [install|uninstall|status] native menu bar bell (macOS)
345
529
  ai-notify translate [on <lang>|off|test] speak agent text in your language
346
530
  ai-notify doctor check deps & wiring
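The `volume-pane` handler in this file clamps `Number(arg)` into the 0.0-2.0 range. For a non-numeric argument `Number(arg)` is `NaN`, and `Math.min`/`Math.max` pass `NaN` through, so the clamp does not reject it. A sketch of a stricter parser, assuming a hypothetical helper `parseVolumeArg` that is not part of the package:

```javascript
// Hypothetical helper (not in ai-notify): parse a CLI volume argument.
// Returns a number clamped to [0, 2], or null when the input is not numeric.
const parseVolumeArg = (arg) => {
  if (typeof arg !== 'string' || arg.trim() === '') return null;
  const n = Number(arg);
  if (!Number.isFinite(n)) return null; // rejects 'loud', 'NaN', 'Infinity'
  return Math.min(2, Math.max(0, n));
};

console.log(parseVolumeArg('0.5')); // 0.5
console.log(parseVolumeArg('7')); // 2
console.log(parseVolumeArg('-1')); // 0
console.log(parseVolumeArg('loud')); // null
```

A caller could print the usage line and exit when the result is `null`, instead of storing a value that is not a number.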
package/src/menubar.mjs CHANGED
@@ -10,8 +10,10 @@
10
10
  import { homedir, platform } from 'node:os';
11
11
  import { join, dirname } from 'node:path';
12
12
  import { fileURLToPath } from 'node:url';
13
- import { existsSync, mkdirSync, writeFileSync, rmSync } from 'node:fs';
13
+ import { existsSync, mkdirSync, writeFileSync, rmSync, chmodSync } from 'node:fs';
14
14
  import { execFileSync, spawnSync } from 'node:child_process';
15
+ import { cliInvocation } from './util.mjs';
16
+ import { stateDir } from './state.mjs';
15
17
 
16
18
  export const LABEL = 'com.ai-notify.menubar';
17
19
 
@@ -81,9 +83,21 @@ const unload = () => {
81
83
  return r;
82
84
  };
83
85
 
86
+ // A tiny launcher the menu bar app shells out to for data/actions, so it works
87
+ // regardless of PATH (embeds the resolved node + cli path).
88
+ const writeCliWrapper = () => {
89
+ const { node, cliPath } = cliInvocation();
90
+ const p = join(stateDir(), 'cli');
91
+ mkdirSync(stateDir(), { recursive: true });
92
+ writeFileSync(p, `#!/bin/sh\nexec "${node}" "${cliPath}" "$@"\n`);
93
+ chmodSync(p, 0o755);
94
+ return p;
95
+ };
96
+
84
97
  export const install = () => {
85
98
  if (!isMac()) throw new Error('the menu bar agent is macOS-only');
86
99
  if (!isBuilt()) build();
100
+ writeCliWrapper();
87
101
  writePlist();
88
102
  load();
89
103
  return { app: appPath(), plist: plistPath() };
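`writeCliWrapper` embeds the resolved `node` and CLI paths inside double quotes in a `/bin/sh` script. That is fine for ordinary paths, but a path containing `"`, `$` or a backtick would be reinterpreted by the shell. One possible hardening is single-quote escaping, sketched below. `shQuote` and `wrapperScript` are hypothetical helpers, not functions from the package.

```javascript
// Hypothetical helpers (not in ai-notify). Quote a string for POSIX sh by
// wrapping it in single quotes and rewriting each embedded ' as '\''.
const shQuote = (s) => `'${String(s).replace(/'/g, `'\\''`)}'`;

// Build the launcher text; "$@" forwards the caller's arguments unchanged.
const wrapperScript = (node, cliPath) =>
  `#!/bin/sh\nexec ${shQuote(node)} ${shQuote(cliPath)} "$@"\n`;

console.log(wrapperScript('/usr/local/bin/node', "/Users/me/it's here/cli.mjs"));
```

Inside single quotes the shell treats every character literally, so only the single quote itself needs special handling.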
package/src/notify.mjs CHANGED
@@ -4,11 +4,15 @@
4
4
  // so a Linux box without `notify-send` (or a Mac without `terminal-notifier`)
5
5
  // never errors — it just does what it can.
6
6
 
7
- import { spawn } from 'node:child_process';
8
- import { existsSync } from 'node:fs';
9
- import { isMuted, readConfig } from './state.mjs';
7
+ import { spawn, execFileSync } from 'node:child_process';
8
+ import { existsSync, rmSync } from 'node:fs';
9
+ import { tmpdir } from 'node:os';
10
+ import { join } from 'node:path';
11
+ import { isMuted, readConfig, readVolume, recordPane, readPaneSetting, setPaneWaiting } from './state.mjs';
12
+ import { controllingTty } from './util.mjs';
10
13
  import { translate } from './translate.mjs';
11
14
  import { highlightWaiting, clearHighlight } from './highlight.mjs';
15
+ import * as voicevox from './voicevox.mjs';
12
16
 
13
17
  const platform = process.platform; // 'darwin' | 'linux' | 'win32'
14
18
 
@@ -34,13 +38,14 @@ const resolveSound = (name) => {
34
38
  return name; // linux/win: treated as a freedesktop event id / ignored
35
39
  };
36
40
 
37
- const playSound = (name) => {
41
+ const playSound = (name, vol = 1) => {
38
42
  const sound = resolveSound(name);
39
43
  if (platform === 'darwin') {
40
44
  if (sound && existsSync(sound)) {
41
45
  // play twice, a touch louder, so it is hard to miss
42
- run('afplay', ['-v', '2', sound]);
43
- run('afplay', ['-v', '2', sound]);
46
+ const v = String(2 * vol);
47
+ run('afplay', ['-v', v, sound]);
48
+ run('afplay', ['-v', v, sound]);
44
49
  }
45
50
  } else if (platform === 'linux') {
46
51
  if (which('paplay') && existsSync('/usr/share/sounds/freedesktop/stereo/complete.oga')) {
@@ -55,9 +60,23 @@ const playSound = (name) => {
55
60
  }
56
61
  };
57
62
 
58
- const speak = (text, voice) => {
63
+ // `say` has no per-call volume, so when a non-default volume is set we render to
64
+ // a file and play it through afplay at the requested level.
65
+ const sayWithVolume = (text, voice, vol) => {
66
+ try {
67
+ const tmp = join(tmpdir(), `ai-notify-say-${process.pid}.aiff`);
68
+ execFileSync('say', voice ? ['-v', voice, '-o', tmp, text] : ['-o', tmp, text], { timeout: 30000 });
69
+ execFileSync('afplay', ['-v', String(vol), tmp], { timeout: 30000 });
70
+ rmSync(tmp, { force: true });
71
+ } catch {
72
+ /* ignore */
73
+ }
74
+ };
75
+
76
+ const speak = (text, voice, vol = 1) => {
59
77
  if (!text) return;
60
78
  if (platform === 'darwin') {
79
+ if (vol !== 1) return sayWithVolume(text, voice, vol);
61
80
  run('say', voice ? ['-v', voice, text] : [text]);
62
81
  } else if (platform === 'linux') {
63
82
  if (which('spd-say')) run('spd-say', [text]);
@@ -93,6 +112,18 @@ const banner = (title, subtitle, message, { activate, urgent } = {}) => {
93
112
  // win32: skipped (no dependency-free toast); sound/voice still fire.
94
113
  };
95
114
 
115
+ // A short, speakable gist of a summary: the first sentence, capped at `max`
116
+ // characters on a clause boundary — enough to tell which task, not a monologue.
117
+ const shortenForSpeech = (text, max = 40) => {
118
+ let s = String(text).replace(/\s+/g, ' ').trim();
119
+ s = (s.split(/[。.!?!?\n]/)[0] || s).trim(); // first sentence
120
+ if (s.length <= max) return s.replace(/[、,\s]+$/, '');
121
+ const cut = s.slice(0, max);
122
+ const ten = cut.lastIndexOf('、'); // prefer a clause boundary
123
+ const sep = ten > max * 0.4 ? ten : cut.lastIndexOf(' ');
124
+ return (sep > 0 ? cut.slice(0, sep) : cut).replace(/[、,\s]+$/, '').trim();
125
+ };
126
+
96
127
  // Public entry. Called by the hook handler with already-parsed fields.
97
128
  export const emit = ({ provider = 'default', event = 'done', label = '', message = '' }) => {
98
129
  const config = readConfig();
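To make the behaviour of `shortenForSpeech` concrete, here is a standalone copy with a few inputs. The body follows the diff; the only assumption is the sentence-end character class, which is written here with both ASCII and fullwidth `!`/`?` on the guess that the source intended both.

```javascript
// Standalone copy of shortenForSpeech for experimentation.
// Assumption: the character class includes fullwidth ! and ? as well as ASCII.
const shortenForSpeech = (text, max = 40) => {
  let s = String(text).replace(/\s+/g, ' ').trim(); // collapse whitespace
  s = (s.split(/[。.!?!?\n]/)[0] || s).trim(); // keep the first sentence
  if (s.length <= max) return s.replace(/[、,\s]+$/, '');
  const cut = s.slice(0, max);
  const ten = cut.lastIndexOf('、'); // prefer a Japanese clause boundary
  const sep = ten > max * 0.4 ? ten : cut.lastIndexOf(' ');
  return (sep > 0 ? cut.slice(0, sep) : cut).replace(/[、,\s]+$/, '').trim();
};

console.log(shortenForSpeech('Fixed the login bug. Also updated tests.'));
// Fixed the login bug
console.log(shortenForSpeech('Refactored the payment module and added retries', 20));
// Refactored the
console.log(shortenForSpeech('テストを修正しました。続けます'));
// テストを修正しました
```

Because the split treats every `.` as a sentence end, a message such as `Bumped to v1.2` would be cut at the first dot; that is a property of the approach worth knowing when choosing `speakMaxChars`.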
@@ -110,29 +141,61 @@ export const emit = ({ provider = 'default', event = 'done', label = '', message
110
141
  // (falling back to the template on failure).
111
142
  // default -> speak the raw message as-is.
112
143
  // The desktop banner always shows the full original message visually.
113
- // coreBody = WHAT happened, in the user's language, without the label.
114
- let coreBody;
115
- if (config.speakAgentMessage === false) {
116
- coreBody = fromTemplate || fallback;
117
- } else if (message) {
118
- const translated = config.translateTo ? translate(message, config.translateTo) : message;
119
- coreBody = translated || fromTemplate || fallback;
144
+ // Full text for the desktop banner — the translated summary / message. Length
145
+ // is fine here: a banner never gets cut off and you read it at a glance.
146
+ let fullBody;
147
+ if (message) {
148
+ fullBody = (config.translateTo ? translate(message, config.translateTo) : message) || fromTemplate || fallback;
120
149
  } else {
121
- coreBody = fromTemplate || fallback;
150
+ fullBody = fromTemplate || fallback;
122
151
  }
123
- // Prefix the window label for the spoken read-out so you can tell which of
124
- // many terminals is asking (set a short per-window name with $AI_NOTIFY_LABEL).
125
- const speakText = config.speakLabel !== false && label ? `${label}、${coreBody}` : coreBody;
152
+ // Spoken read-out: short enough not to get cut off, but enough to identify
153
+ // WHICH task: the window label + a short gist of what happened (the first
154
+ // clause of the summary). speakAgentMessage:true reads the whole thing.
155
+ let spokenBody;
156
+ if (!message) spokenBody = fromTemplate || fallback;
157
+ else if (config.speakAgentMessage) spokenBody = fullBody;
158
+ else spokenBody = shortenForSpeech(fullBody, config.speakMaxChars || 40);
159
+ // The task gist already tells you which pane; the label (often the working
160
+ // dir) is just slow filler. Prefix it only if explicitly enabled.
161
+ const speakText = config.speakLabel === true && label ? `${label}、${spokenBody}` : spokenBody;
162
+
163
+ // Per-pane voice: remember this pane (so the menu bar can list it) and apply
164
+ // any voice assigned to it. Precedence (most specific first):
165
+ // $AI_NOTIFY_* env — set in the pane's shell
166
+ // this pane's pick — assigned from the menu bar (keyed by tty)
167
+ // provider / global — config defaults
168
+ const tty = controllingTty();
169
+ recordPane(tty, label);
170
+ setPaneWaiting(tty, event === 'waiting'); // waiting -> yellow menu bar status; done clears it
171
+ const pane = readPaneSetting(tty);
172
+ const tts = pane.tts || config.tts;
173
+ const voice = process.env.AI_NOTIFY_VOICE || pane.voice || p.voice || config.voice;
174
+ const speaker = process.env.AI_NOTIFY_VOICEVOX_SPEAKER || (pane.speaker ?? config.voicevox?.speaker); // ?? keeps a pane speaker id of 0
126
175
 
127
- // Voice precedence (most specific first):
128
- // $AI_NOTIFY_VOICE — set per terminal window/pane to give each its own voice
129
- // provider voice — per agent (Claude vs Codex)
130
- // global voice — the single `ai-notify voice` switch
131
- const voice = process.env.AI_NOTIFY_VOICE || p.voice || config.voice;
176
+ // Volume (0–2): per-window env > this pane's slider > the global slider /
177
+ // `ai-notify volume` > config.
178
+ const envVol = parseFloat(process.env.AI_NOTIFY_VOLUME);
179
+ const fileVol = readVolume();
180
+ const vol = Number.isFinite(envVol)
181
+ ? Math.min(2, Math.max(0, envVol))
182
+ : typeof pane.volume === 'number'
183
+ ? pane.volume
184
+ : fileVol != null
185
+ ? fileVol
186
+ : typeof config.volume === 'number'
187
+ ? config.volume
188
+ : 1;
132
189
 
133
190
  if (!muted) {
134
- playSound(soundName);
135
- if (config.speak) speak(speakText, voice);
191
+ playSound(soundName, vol);
192
+ if (config.speak && vol > 0) {
193
+ let spoken = false;
194
+ if (tts === 'voicevox') {
195
+ spoken = voicevox.speak(speakText, speaker, config.voicevox?.url, vol);
196
+ }
197
+ if (!spoken) speak(speakText, voice, vol); // OS `say` (also the VOICEVOX fallback)
198
+ }
136
199
  }
137
200
 
138
201
  if (!muted || config.bannerWhenMuted) {
@@ -140,7 +203,7 @@ export const emit = ({ provider = 'default', event = 'done', label = '', message
140
203
  banner(
141
204
  waiting ? `⏳ ${label || 'input'}` : `✓ ${label || 'done'}`,
142
205
  waiting ? 'waiting for input' : '',
143
- coreBody,
206
+ fullBody,
144
207
  {
145
208
  // Click the notification to bring the waiting app (e.g. the IDE) forward.
146
209
  activate: config.notifyActivate !== false ? process.env.__CFBundleIdentifier : undefined,
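The volume lookup in `emit` is a chain of nested ternaries: the per-window environment variable wins, then the pane's own value, then the state file, then `config.volume`, and finally 1. The same precedence can be sketched as a pure function. `resolveVolume` and its argument names are illustrative, not part of the package.

```javascript
// Hypothetical pure version of the volume lookup in emit(): the first
// source that yields a usable number wins.
const clamp = (n) => Math.min(2, Math.max(0, n));

const resolveVolume = ({ env, paneVolume, fileVolume, configVolume }) => {
  const envVol = parseFloat(env); // undefined or '' parses to NaN
  if (Number.isFinite(envVol)) return clamp(envVol); // $AI_NOTIFY_VOLUME
  if (typeof paneVolume === 'number') return paneVolume; // this pane's slider
  if (fileVolume != null) return fileVolume; // global state file
  if (typeof configVolume === 'number') return configVolume; // config.json
  return 1;
};

console.log(resolveVolume({ env: '1.5', paneVolume: 0.3 })); // 1.5
console.log(resolveVolume({ paneVolume: 0.3, fileVolume: 2 })); // 0.3
console.log(resolveVolume({ fileVolume: 0 })); // 0
console.log(resolveVolume({})); // 1
```

Note that a stored volume of 0 is a valid value at every level, which is why the checks test for "is a number" or "is not null" rather than truthiness.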
package/src/state.mjs CHANGED
@@ -45,6 +45,90 @@ export const setMuted = (muted) => {
45
45
 
46
46
  export const toggleMuted = () => setMuted(!isMuted());
47
47
 
48
+ // --- Volume ----------------------------------------------------------------
49
+ // A single number (0.0–2.0) in a state file, written by the menu bar slider or
50
+ // `ai-notify volume`, read at fire time — just like the mute flag.
51
+
52
+ const volumeFlagPath = () => join(stateDir(), 'volume');
53
+
54
+ export const readVolume = () => {
55
+ try {
56
+ const v = parseFloat(readFileSync(volumeFlagPath(), 'utf8'));
57
+ return Number.isFinite(v) ? Math.min(2, Math.max(0, v)) : null;
58
+ } catch {
59
+ return null;
60
+ }
61
+ };
62
+
63
+ export const setVolume = (v) => {
64
+ const n = Math.min(2, Math.max(0, Number(v)));
65
+ ensureDir(stateDir());
66
+ writeFileSync(volumeFlagPath(), String(n));
67
+ return n;
68
+ };
69
+
70
+ // --- Per-pane state --------------------------------------------------------
71
+ // Recently-active terminal panes (so the menu bar can offer per-pane voices),
72
+ // and a per-tty voice override. Both are small JSON files in the state dir.
73
+
74
+ const readJson = (p, fallback) => {
75
+ try {
76
+ return JSON.parse(readFileSync(p, 'utf8'));
77
+ } catch {
78
+ return fallback;
79
+ }
80
+ };
81
+ const writeJson = (p, obj) => {
82
+ ensureDir(stateDir());
83
+ writeFileSync(p, JSON.stringify(obj));
84
+ };
85
+
86
+ const panesPath = () => join(stateDir(), 'panes.json');
87
+ const paneVoicesPath = () => join(stateDir(), 'pane-voices.json');
88
+ const waitingPath = () => join(stateDir(), 'waiting.json');
89
+
90
+ // Track which panes are waiting for input, so the menu bar icon can show a
91
+ // status color (yellow) when any agent needs you.
92
+ export const setPaneWaiting = (tty, waiting) => {
93
+ if (!tty) return;
94
+ const all = readJson(waitingPath(), {});
95
+ if (waiting) all[tty] = Date.now();
96
+ else delete all[tty];
97
+ writeJson(waitingPath(), all);
98
+ };
99
+ export const anyWaiting = () => Object.keys(readJson(waitingPath(), {})).length > 0;
100
+
101
+ // Record this pane as active (keyed by tty). Keeps the 16 most-recent.
102
+ export const recordPane = (tty, label) => {
103
+ if (!tty) return;
104
+ const all = readJson(panesPath(), {});
105
+ all[tty] = { label: label || '', ts: Date.now() };
106
+ const trimmed = Object.entries(all)
107
+ .sort((a, b) => b[1].ts - a[1].ts)
108
+ .slice(0, 16);
109
+ writeJson(panesPath(), Object.fromEntries(trimmed));
110
+ };
111
+
112
+ export const readPanes = () =>
113
+ Object.entries(readJson(panesPath(), {}))
114
+ .map(([tty, v]) => ({ tty, label: v.label || '', ts: v.ts || 0 }))
115
+ .sort((a, b) => b.ts - a.ts);
116
+
117
+ // Per-pane settings: { tts, speaker, voice, volume }. Any subset may be set.
118
+ export const readPaneSetting = (tty) => (tty ? readJson(paneVoicesPath(), {})[tty] || {} : {});
119
+
120
+ // Merge `patch` into the pane's settings; keys set to null are removed; an empty
121
+ // entry is deleted entirely.
122
+ export const updatePaneSetting = (tty, patch) => {
123
+ if (!tty) return;
124
+ const all = readJson(paneVoicesPath(), {});
125
+ const next = { ...(all[tty] || {}), ...patch };
126
+ for (const k of Object.keys(next)) if (next[k] == null) delete next[k];
127
+ if (Object.keys(next).length === 0) delete all[tty];
128
+ else all[tty] = next;
129
+ writeJson(paneVoicesPath(), all);
130
+ };
131
+
48
132
  // --- Config ----------------------------------------------------------------
49
133
 
50
134
  // Sounds default to OS built-ins so we ship no audio assets (clean repo, no
@@ -55,19 +139,27 @@ export const DEFAULT_CONFIG = {
55
139
  bannerWhenMuted: true,
56
140
  // Spoken read-out of which terminal finished (helps tell tabs apart).
57
141
  speak: true,
58
- // Prefix the window label to the spoken message so you can tell which of many
59
- // terminals is asking (set a short per-window name with $AI_NOTIFY_LABEL).
60
- speakLabel: true,
142
+ // Output volume 0.0–2.0 (1.0 = normal). The menu bar slider / `ai-notify
143
+ // volume` write a state file that overrides this; $AI_NOTIFY_VOLUME overrides
144
+ // per window. Applies to sounds, the spoken voice, and VOICEVOX.
145
+ volume: 1.0,
146
+ // Prefix the window label to the SPOKEN read-out. Off by default — the task
147
+ // gist already identifies the pane, and the label (often the working dir) just
148
+ // adds slow filler. Turn on if you set a short $AI_NOTIFY_LABEL per window.
149
+ // (The desktop banner is always titled with the label regardless.)
150
+ speakLabel: false,
61
151
  // Visually highlight the waiting terminal window/pane (best-effort, by tty).
62
152
  // Off by default; the color is yellow / orange / red / green / #RRGGBB.
63
153
  highlightWaiting: false,
64
154
  highlightColor: 'yellow',
65
155
  // Make the desktop notification click bring the terminal/IDE forward.
66
156
  notifyActivate: true,
67
- // Whether to speak the agent's own text (Codex's reply, a Claude prompt).
68
- // That text is in the agent's language; set this false to keep every spoken
69
- // read-out in your own language via doneMessage / waitingMessage instead.
70
- speakAgentMessage: true,
157
+ // Speak the agent's full message aloud (Codex's reply, a Claude prompt, the
158
+ // done-summary)? Default false = read only a short gist (first clause, capped
159
+ // at speakMaxChars): enough to tell which task, never cut off. The full text
160
+ // still shows in the desktop banner. Set true to read the whole thing.
161
+ speakAgentMessage: false,
162
+ speakMaxChars: 40,
71
163
  // Optional: translate the agent's message into this language before speaking
72
164
  // it (e.g. 'ja'). Empty = off. Key-less, no cost; makes a network request.
73
165
  // Toggle with `ai-notify translate on ja` / `off`.
@@ -79,6 +171,11 @@ export const DEFAULT_CONFIG = {
79
171
  // 'Kyoko'). Empty = OS default voice. Switch it with `ai-notify voice`. A
80
172
  // per-provider `voice` below, if set, overrides this for that agent.
81
173
  voice: '',
174
+ // TTS backend: 'say' (OS voice) or 'voicevox' (local VOICEVOX engine — speak
175
+ // in character voices). Falls back to 'say' if the engine isn't running.
176
+ // Per window: $AI_NOTIFY_VOICEVOX_SPEAKER overrides the speaker id.
177
+ tts: 'say',
178
+ voicevox: { url: 'http://127.0.0.1:50021', speaker: 3 },
82
179
  // Spoken read-out templates for agent events. The window label is added
83
180
  // separately (speakLabel), so leave {label} out here to avoid doubling it.
84
181
  // Override per language (e.g. Japanese) in config.json. An agent that supplies
@@ -108,4 +205,4 @@ export const writeConfig = (config) => {
108
205
  return configPath();
109
206
  };
110
207
 
111
- export const paths = { muteFlagPath, configPath, stateDir, configDir };
208
+ export const paths = { muteFlagPath, configPath, stateDir, configDir, volumeFlagPath };
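The merge rule in `updatePaneSetting` (keys patched to `null` are removed, and an entry with no keys left is dropped) is easier to see without the file I/O. A minimal sketch, assuming a hypothetical pure function `applyPatch` that returns a new settings map instead of writing `pane-voices.json`:

```javascript
// Pure sketch of the merge rule in updatePaneSetting (no file I/O).
// `applyPatch` is a hypothetical name; it returns a new settings map.
const applyPatch = (all, tty, patch) => {
  const next = { ...(all[tty] || {}), ...patch };
  for (const k of Object.keys(next)) if (next[k] == null) delete next[k];
  const out = { ...all };
  if (Object.keys(next).length === 0) delete out[tty];
  else out[tty] = next;
  return out;
};

let state = {};
state = applyPatch(state, '/dev/ttys010', { tts: 'voicevox', speaker: 3, voice: null });
console.log(JSON.stringify(state));
// {"/dev/ttys010":{"tts":"voicevox","speaker":3}}
state = applyPatch(state, '/dev/ttys010', { volume: 0.5 });
console.log(JSON.stringify(state));
// {"/dev/ttys010":{"tts":"voicevox","speaker":3,"volume":0.5}}
state = applyPatch(state, '/dev/ttys010', { tts: null, speaker: null, voice: null });
console.log(JSON.stringify(state));
// {"/dev/ttys010":{"volume":0.5}}
```

The last call mirrors `voice-pane <tty> clear`: it resets the voice fields and leaves the pane's volume in place, since volume is cleared separately by `volume-pane <tty> clear`.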
package/src/util.mjs CHANGED
@@ -39,3 +39,19 @@ export const cliInvocation = () => ({
39
39
  export const isEphemeralInstall = (cliPath) => /[/\\]_npx[/\\]/.test(cliPath);
40
40
 
41
41
  export const MARKER = 'ai-notify'; // substring used to detect our own wiring
42
+
43
+ // The controlling terminal of this process (e.g. "/dev/ttys010"), which is
44
+ // stable per terminal pane — used to scope per-pane settings. null if none.
45
+ export const controllingTty = () => {
46
+ try {
47
+ const t = execFileSync('ps', ['-o', 'tty=', '-p', String(process.pid)], {
48
+ stdio: ['ignore', 'pipe', 'ignore'],
49
+ })
50
+ .toString()
51
+ .trim();
52
+ if (!t || t === '??' || t === '?') return null;
53
+ return t.startsWith('/dev/') ? t : `/dev/${t}`;
54
+ } catch {
55
+ return null;
56
+ }
57
+ };
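`controllingTty` shells out to `ps -o tty=` and then normalises the result. The normalisation step alone can be sketched as a pure function, which is convenient for checking the edge cases. `normalizeTty` is a hypothetical name, and the sample inputs are assumed examples of what `ps` prints on macOS and Linux.

```javascript
// Hypothetical pure helper mirroring the normalisation in controllingTty():
// turn raw `ps -o tty=` output into a /dev path, or null when detached.
const normalizeTty = (raw) => {
  const t = String(raw).trim();
  if (!t || t === '??' || t === '?') return null; // no controlling terminal
  return t.startsWith('/dev/') ? t : `/dev/${t}`;
};

console.log(normalizeTty('ttys010\n')); // /dev/ttys010
console.log(normalizeTty('pts/3')); // /dev/pts/3
console.log(normalizeTty('??')); // null
console.log(normalizeTty('/dev/ttys001')); // /dev/ttys001
```

Returning `null` matters downstream: `recordPane`, `setPaneWaiting` and `updatePaneSetting` all return early when the tty is falsy, so a hook running without a terminal records nothing.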
@@ -0,0 +1,120 @@
1
+ // VOICEVOX read-out: synthesize the spoken notification with a local VOICEVOX
2
+ // engine (free, offline, no API key) so each terminal can speak in a distinct
3
+ // character voice (ずんだもん, 四国めたん, …).
4
+ //
5
+ // The engine exposes an HTTP API on 127.0.0.1:50021. We use `curl` (zero deps):
6
+ // POST /audio_query?speaker=ID&text=... -> query JSON
7
+ // POST /synthesis?speaker=ID (query body) -> WAV
8
+ // then play the WAV. Everything is best-effort: if the engine isn't running we
9
+ // return false and the caller falls back to the OS `say` voice.
10
+
11
+ import { execSync, execFileSync } from 'node:child_process';
12
+ import { existsSync, statSync, mkdtempSync, rmSync, appendFileSync } from 'node:fs';
13
+ import { join } from 'node:path';
14
+ import { tmpdir } from 'node:os';
15
+ import { stateDir } from './state.mjs';
16
+
17
+ export const DEFAULT_URL = 'http://127.0.0.1:50021';
18
+
19
+ const platform = process.platform;
20
+
21
+ // Record why a synthesis fell back to the OS voice, so intermittent fallbacks
22
+ // are diagnosable instead of silent. Best-effort.
23
+ const logFail = (reason) => {
24
+ try {
25
+ appendFileSync(join(stateDir(), 'voicevox.log'), `${new Date().toISOString()} ${reason}\n`);
26
+ } catch {
27
+ /* ignore */
28
+ }
29
+ };
30
+
31
+ export const isAvailable = (url = DEFAULT_URL, timeoutMs = 1500) => {
32
+ try {
33
+ const out = execFileSync('curl', ['-s', '-m', String(Math.ceil(timeoutMs / 1000)), `${url}/version`], {
34
+ encoding: 'utf8',
35
+ timeout: timeoutMs + 500,
36
+ });
37
+ return out.trim().length > 0;
38
+ } catch {
39
+ return false;
40
+ }
41
+ };
42
+
43
+ // Flatten /speakers into [{ id, name }] (character + style).
44
+ export const listSpeakers = (url = DEFAULT_URL) => {
45
+ try {
46
+ const out = execFileSync('curl', ['-s', '-m', '4', `${url}/speakers`], { encoding: 'utf8', timeout: 5000 });
47
+ const data = JSON.parse(out);
48
+ const rows = [];
49
+ for (const sp of data) {
50
+ for (const st of sp.styles || []) rows.push({ id: st.id, name: `${sp.name}(${st.name})` });
51
+ }
52
+ return rows;
53
+ } catch {
54
+ return [];
55
+ }
56
+ };
57
+
58
+ // One entry per character (preferring the ノーマル style) — a short, pickable
59
+ // list for the menu bar, vs the full style list from listSpeakers.
60
+ export const listCharacters = (url = DEFAULT_URL) => {
61
+ try {
62
+ const out = execFileSync('curl', ['-s', '-m', '4', `${url}/speakers`], { encoding: 'utf8', timeout: 5000 });
63
+ const data = JSON.parse(out);
64
+ const rows = [];
65
+ for (const sp of data) {
66
+ const styles = sp.styles || [];
67
+ const pick = styles.find((s) => s.name === 'ノーマル') || styles[0];
68
+ if (pick) rows.push({ id: pick.id, name: sp.name });
69
+ }
70
+ return rows;
71
+ } catch {
72
+ return [];
73
+ }
74
+ };
75
+
76
+ const playWav = (wav, vol = 1) => {
77
+ if (platform === 'darwin') execFileSync('afplay', ['-v', String(vol), wav], { timeout: 30000 });
78
+ else if (platform === 'linux') {
79
+ try {
80
+ execFileSync('aplay', ['-q', wav], { timeout: 30000 });
81
+ } catch {
82
+ execFileSync('paplay', [wav], { timeout: 30000 });
83
+ }
84
+ }
85
+ };
86
+
87
+ // Synthesize and play. Returns true if it spoke, false to fall back to `say`.
88
+ export const speak = (text, speaker = 3, url = DEFAULT_URL, vol = 1, timeoutMs = 15000) => {
89
+ if (!text) return false;
90
+ let dir;
91
+ try {
92
+ dir = mkdtempSync(join(tmpdir(), 'ai-notify-vv-'));
93
+ const wav = join(dir, 'v.wav');
94
+ const sec = String(Math.max(2, Math.ceil(timeoutMs / 1000)));
95
+ const enc = encodeURIComponent(text); // URL-encoded -> no shell metacharacters
96
+ // Pipe audio_query straight into synthesis. execSync uses /bin/sh for the pipe.
97
+ const cmd =
98
+ `curl -s -m ${sec} -X POST "${url}/audio_query?speaker=${speaker}&text=${enc}" | ` +
99
+ `curl -s -m ${sec} -X POST -H "Content-Type: application/json" -d @- ` +
100
+ `"${url}/synthesis?speaker=${speaker}" -o "${wav}"`;
101
+ execSync(cmd, { timeout: timeoutMs + 1000, stdio: 'ignore' });
102
+ if (!existsSync(wav) || statSync(wav).size < 1000) {
103
+ logFail(`empty/short wav (speaker ${speaker}, ${text.length} chars)`);
104
+ return false;
105
+ }
106
+ playWav(wav, vol);
107
+ return true;
108
+ } catch (e) {
109
+ logFail(`error (speaker ${speaker}): ${(e && e.message) || e}`);
110
+ return false;
111
+ } finally {
112
+ if (dir) {
113
+ try {
114
+ rmSync(dir, { recursive: true, force: true });
115
+ } catch {
116
+ /* ignore */
117
+ }
118
+ }
119
+ }
120
+ };
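The two-step flow described at the top of this file (`POST /audio_query` returns a query JSON, `POST /synthesis` turns it into a WAV) can also be written without a shell pipeline. The sketch below is an alternative, not what the package does: it assumes Node 18+ for the global `fetch`, and `audioQueryUrl` and `synthesize` are hypothetical names. Unlike the synchronous `curl` pipeline in `speak`, this version is asynchronous.

```javascript
// Hypothetical alternative to the curl pipeline, using the global fetch
// available in Node 18+. Only the two endpoints named in the file header are used.
const audioQueryUrl = (base, speaker, text) => {
  const u = new URL('/audio_query', base);
  u.searchParams.set('speaker', String(speaker));
  u.searchParams.set('text', text); // URLSearchParams handles the encoding
  return u.toString();
};

// Resolves to a Buffer holding the WAV bytes; throws on any HTTP failure so
// a caller can fall back to the OS voice.
const synthesize = async (text, speaker = 3, base = 'http://127.0.0.1:50021') => {
  const q = await fetch(audioQueryUrl(base, speaker, text), { method: 'POST' });
  if (!q.ok) throw new Error(`audio_query failed: ${q.status}`);
  const query = await q.text(); // pass the query JSON through untouched
  const s = await fetch(`${base}/synthesis?speaker=${encodeURIComponent(speaker)}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: query,
  });
  if (!s.ok) throw new Error(`synthesis failed: ${s.status}`);
  return Buffer.from(await s.arrayBuffer());
};

console.log(audioQueryUrl('http://127.0.0.1:50021', 3, 'done & ready'));
// http://127.0.0.1:50021/audio_query?speaker=3&text=done+%26+ready
```

Building the URL with `URLSearchParams` keeps the text out of any shell command line, so quoting is no longer a concern. The trade-off is that a hook handler would need to await the result before exiting.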