openclaw-stimm-voice 1.0.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,17 @@
+ # Changelog
+
+ ## 1.0.0 (2026-02-28)
+
+
+ ### Features
+
+ * **stimm-voice:** dual-agent pipeline: OpenClaw supervisor bridge, room manager, response generator ([934afb0](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/934afb0217f18b04e33e0c1fcbc94c9910041357))
+ * **stimm-voice:** extension scaffold: plugin manifest, entrypoint, core config types ([a981bf3](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/a981bf392c953fe13532aefde3b4c364d27041b6))
+ * **stimm-voice:** interactive setup wizard: multi-provider catalog (Deepgram, ElevenLabs, Hume, Groq, OpenAI) ([6d7c0ca](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/6d7c0caabadb4a89d3ecb555a20159f056cba0c8))
+ * **stimm-voice:** Python voice agent: LiveKit Agents v1 entrypoint, STT/TTS/LLM plugin support ([2965ebf](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/2965ebf2eec995473e1f95ee596ce8e21fd0de95))
+ * **stimm-voice:** web voice UI, claim-token flow, and CLI commands (voice:start, voice:logs, voice:setup) ([92f611f](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/92f611fd16f4011ef4e415a47928f0188c5394d9))
+
+
+ ### Bug Fixes
+
+ * **stimm-voice:** WSL2/Docker WebRTC: LD_PRELOAD shim for Nvidia crashes, TCP+UDP port mapping, TURN credentials ([e769b65](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/e769b65d197b8c7df1f69035a009121a82d9e5cc))
package/README.md ADDED
@@ -0,0 +1,167 @@
+ # openclaw-stimm-voice
+
+ Stimm Voice is a third-party OpenClaw plugin for real-time voice conversations.
+
+ It uses a dual-agent architecture:
+
+ - A fast Python voice agent (LiveKit + STT/TTS/LLM) handles low-latency speech.
+ - OpenClaw acts as the supervisor for reasoning, tools, and long-context decisions.
+
+ ## Presentation
+
+ What this plugin provides:
+
+ - Real-time voice sessions backed by LiveKit rooms.
+ - Browser entrypoint at `web.path` (default: `/voice`).
+ - Claim-token flow for web access (`/voice/claim`) with one-time, short-lived claims.
+ - Optional Cloudflare Quick Tunnel for temporary public access.
+ - Optional supervisor shared secret for `POST /stimm/supervisor`.
+
+ ## Install
+
+ ### Prerequisites
+
+ - Node.js 22+
+ - Python 3.10+
+ - OpenClaw gateway ≥ 2026.2.0 installed and running
+ - LiveKit deployment:
+   - local (`ws://localhost:7880`) or
+   - cloud (`wss://<your-project>.livekit.cloud`)
+
+ ### Install from npm
+
+ ```bash
+ openclaw plugins install openclaw-stimm-voice
+ ```
+
+ ### Install from GitHub (latest)
+
+ ```bash
+ openclaw plugins install https://github.com/EtienneLescot/openclaw-stimm-voice
+ ```
+
+ Then restart the OpenClaw gateway.
+
+ Python dependencies are installed only through Stimm extras:
+
+ - Base/default profile from [python/requirements.txt](python/requirements.txt): `stimm[deepgram,openai]`
+ - Additional provider plugins are installed by the setup wizard based on selected STT/TTS/LLM providers (`stimm[...]`).
+
+ ## Config
+
+ Set config under `plugins.entries.stimm-voice.config`.
+
+ ```json5
+ {
+   enabled: true,
+   livekit: {
+     url: "wss://your-project.livekit.cloud",
+     apiKey: "APIxxxxx",
+     apiSecret: "your-secret",
+   },
+   web: {
+     enabled: true,
+     path: "/voice",
+   },
+   access: {
+     mode: "quick-tunnel", // "none" | "quick-tunnel"
+     claimTtlSeconds: 120,
+     livekitTokenTtlSeconds: 300,
+     supervisorSecret: "change-me",
+     allowDirectWebSessionCreate: false,
+     claimRateLimitPerMinute: 20,
+   },
+   voiceAgent: {
+     spawn: { autoSpawn: true },
+     stt: { provider: "deepgram", model: "nova-3" },
+     tts: { provider: "openai", model: "gpt-4o-mini-tts", voice: "ash" },
+     llm: { provider: "openai", model: "gpt-4o-mini" },
+     bufferingLevel: "MEDIUM",
+     mode: "hybrid",
+   },
+ }
+ ```
+
+ Notes:
+
+ - The extension is disabled by default (`enabled: false`).
+ - `access.mode="quick-tunnel"` requires `cloudflared` on PATH.
+ - `voiceAgent.tts.voice` is provider-specific: OpenAI uses voice names (`ash`, `alloy`), ElevenLabs uses `voice_id`, and Cartesia uses voice UUIDs.
+ - API keys can be set directly in plugin config, or via env fallbacks (`STIMM_STT_API_KEY`, `STIMM_TTS_API_KEY`, `STIMM_LLM_API_KEY`, then provider-specific env vars).
+ - `access.supervisorSecret` also supports env fallback (`STIMM_SUPERVISOR_SECRET`, then `OPENCLAW_SUPERVISOR_SECRET`).
+
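The fallback order described in the notes above can be sketched in Python as follows. This is an illustration only, not the plugin's actual code: the helper names (`first_set`, `resolve_api_key`, `resolve_supervisor_secret`) are hypothetical, the provider-specific variable names are passed in by the caller because the README does not list them, and it assumes that a value set in plugin config takes precedence over any environment variable.

```python
from typing import Mapping, Optional, Sequence


def first_set(env: Mapping[str, str], names: Sequence[str]) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def resolve_api_key(
    configured: Optional[str],
    role: str,  # "STT", "TTS" or "LLM"
    provider_env_vars: Sequence[str],  # assumption: supplied by the caller
    env: Mapping[str, str],
) -> Optional[str]:
    """Assumed order: plugin config, STIMM_<ROLE>_API_KEY, provider variables."""
    if configured:
        return configured
    return first_set(env, [f"STIMM_{role}_API_KEY", *provider_env_vars])


def resolve_supervisor_secret(
    configured: Optional[str], env: Mapping[str, str]
) -> Optional[str]:
    """Assumed order: plugin config, STIMM_SUPERVISOR_SECRET, OPENCLAW_SUPERVISOR_SECRET."""
    if configured:
        return configured
    return first_set(env, ["STIMM_SUPERVISOR_SECRET", "OPENCLAW_SUPERVISOR_SECRET"])
```

Passing `os.environ` as `env` resolves against the real process environment, while a plain dict keeps the lookup easy to test. `EXAMPLE_PROVIDER_API_KEY` in a call such as `resolve_api_key(None, "TTS", ["EXAMPLE_PROVIDER_API_KEY"], os.environ)` would be a placeholder name, not a variable the plugin is known to read.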
+ ## Usage
+
+ ### Start session from CLI/tool/gateway
+
+ ```bash
+ openclaw voice:start --channel web
+ ```
+
+ ### Supervisor logs (high-level)
+
+ Use this command to inspect supervisor observability quickly without manual `grep`:
+
+ ```bash
+ openclaw voice:logs --limit 40
+ ```
+
+ Interactive follow mode:
+
+ ```bash
+ openclaw voice:logs --watch --interval 2
+ ```
+
+ Options:
+
+ - `--raw`: print raw `OBS_JSON` lines from `/tmp/stimm-agent.log`
+ - `--limit <n>`: number of entries to print (default: `40`)
+ - `--watch`: keep watching and print new entries continuously (Ctrl+C to stop)
+ - `--interval <s>`: refresh interval for watch mode in seconds (default: `2`)
+ - `--all-events`: include `inference_started` (hidden by default to reduce noise)
+
+ The command prints two sections:
+
+ - Stimm supervisor `OBS_JSON` events (`inference_started`, `inference_completed`, `trigger_sent`, `no_action`)
+ - Gateway-side synthesized lines (`[stimm-voice:supervisor]`) when available
+
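For readers who want to post-process the agent log themselves, the filtering that `--limit` and `--all-events` describe might look roughly like the sketch below. The log line format is an assumption: the README only says the log contains `OBS_JSON` lines, so this sketch guesses that each such line carries the `OBS_JSON` marker followed by a JSON object with an `event` field. The function name `select_events` is hypothetical.

```python
import json
from typing import Iterable, List

MARKER = "OBS_JSON"  # marker named in the README; the line layout is assumed
NOISY_EVENTS = {"inference_started"}  # hidden unless all_events is True


def select_events(
    lines: Iterable[str], limit: int = 40, all_events: bool = False
) -> List[dict]:
    """Return the last `limit` parsed OBS_JSON records, skipping noisy ones."""
    events: List[dict] = []
    for line in lines:
        _, found, payload = line.partition(MARKER)
        if not found:
            continue  # not an observability line
        start = payload.find("{")
        if start == -1:
            continue  # marker present but no JSON object follows
        try:
            record = json.loads(payload[start:])
        except json.JSONDecodeError:
            continue  # malformed payload: skip rather than fail
        if not isinstance(record, dict):
            continue
        if not all_events and record.get("event") in NOISY_EVENTS:
            continue
        events.append(record)
    return events[-limit:] if limit > 0 else []
```

Reading the file with `open("/tmp/stimm-agent.log")` and passing the file object as `lines` would work, since the function only needs an iterable of strings.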
+ `stimm.start` / `stimm_voice:start_session` returns:
+
+ - room metadata
+ - `shareUrl` (when quick tunnel is enabled)
+ - one-time `claimToken`
+
+ ### Browser flow
+
+ 1. Open the returned `shareUrl` on your phone.
+ 2. The page calls `POST /voice/claim` with the claim token.
+ 3. The gateway validates the claim and returns a short-lived LiveKit token.
+ 4. The browser joins the LiveKit room.
+
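The one-time, short-lived claim exchange in the steps above can be modelled with a small in-memory store. This is a minimal sketch under stated assumptions, not the gateway's implementation: the `ClaimStore` class and its methods are hypothetical, the default TTL mirrors `claimTtlSeconds: 120` from the config example, and minting the LiveKit token after a successful redeem is left out.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class ClaimStore:
    """Hypothetical in-memory store for one-time claim tokens."""

    def __init__(
        self,
        ttl_seconds: float = 120,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # token -> (room name, expiry timestamp)
        self._claims: Dict[str, Tuple[str, float]] = {}

    def issue(self, room: str) -> str:
        """Create an unguessable token bound to a room."""
        token = secrets.token_urlsafe(32)
        self._claims[token] = (room, self._clock() + self._ttl)
        return token

    def redeem(self, token: str) -> Optional[str]:
        """Return the room for a valid token, or None. A token works once."""
        entry = self._claims.pop(token, None)  # pop enforces one-time use
        if entry is None:
            return None
        room, expires_at = entry
        if self._clock() >= expires_at:
            return None  # claim expired before it was redeemed
        return room
```

Removing the entry before checking expiry means a token is consumed by its first redeem attempt, whether or not that attempt succeeds. The injectable `clock` exists only so expiry can be tested without sleeping.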
+ ### HTTP endpoints
+
+ - `GET <web.path>`: serves the web voice UI.
+ - `POST <web.path>/claim`: claim exchange endpoint.
+ - `POST <web.path>`: disabled by default (`403`) unless `access.allowDirectWebSessionCreate=true`.
+ - `POST /stimm/supervisor`: internal supervisor callback (protected if `access.supervisorSecret` is set).
+
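The optional protection on `POST /stimm/supervisor` amounts to a shared-secret comparison. The sketch below shows one way to express that rule. It is an assumption-laden illustration: the README does not say how the secret is transmitted (header, query, or body), so the function takes the presented value as a plain argument, and the name `supervisor_request_allowed` is hypothetical.

```python
import hmac
from typing import Optional


def supervisor_request_allowed(
    configured_secret: Optional[str], presented_secret: Optional[str]
) -> bool:
    """Decide whether a supervisor callback may proceed.

    With no secret configured the endpoint is treated as unprotected,
    matching "protected if `access.supervisorSecret` is set".
    """
    if not configured_secret:
        return True
    if presented_secret is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        configured_secret.encode("utf-8"), presented_secret.encode("utf-8")
    )
```

Using `hmac.compare_digest` instead of `==` is a common precaution for secret checks. Whether the plugin does the same is not stated in the README.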
+ ### Gateway methods
+
+ - `stimm.start`
+ - `stimm.end`
+ - `stimm.status`
+ - `stimm.instruct`
+ - `stimm.mode`
+
+ ### Tool
+
+ Tool name: `stimm_voice`
+
+ Actions:
+
+ - `start_session`
+ - `end_session`
+ - `status`
+ - `instruct`
+ - `add_context`
+ - `set_mode`