npm - openclaw-stimm-voice - Versions diffs - 1.0.0 - Mend

openclaw-stimm-voice 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19) hide show

package/CHANGELOG.md +17 -0
package/README.md +167 -0
package/index.ts +1437 -0
package/openclaw.plugin.json +390 -0
package/package.json +43 -0
package/python/Dockerfile +10 -0
package/python/agent.py +244 -0
package/python/lk_no_hw_video.c +76 -0
package/python/requirements.txt +3 -0
package/scripts/dev-setup.sh +89 -0
package/src/agent-process.ts +470 -0
package/src/cli.ts +536 -0
package/src/config.ts +247 -0
package/src/core-bridge.ts +130 -0
package/src/quick-tunnel.ts +122 -0
package/src/response-generator.ts +210 -0
package/src/setup-wizard.ts +1557 -0
package/src/web/voice.html +1237 -0
package/tsconfig.json +20 -0

package/CHANGELOG.md ADDED Viewed

@@ -0,0 +1,17 @@
+# Changelog
+## 1.0.0 (2026-02-28)
+### Features
+* **stimm-voice:** dual-agent pipeline — OpenClaw supervisor bridge, room manager, response generator ([934afb0](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/934afb0217f18b04e33e0c1fcbc94c9910041357))
+* **stimm-voice:** extension scaffold — plugin manifest, entrypoint, core config types ([a981bf3](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/a981bf392c953fe13532aefde3b4c364d27041b6))
+* **stimm-voice:** interactive setup wizard — multi-provider catalog (Deepgram, ElevenLabs, Hume, Groq, OpenAI) ([6d7c0ca](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/6d7c0caabadb4a89d3ecb555a20159f056cba0c8))
+* **stimm-voice:** Python voice agent — LiveKit Agents v1 entrypoint, STT/TTS/LLM plugin support ([2965ebf](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/2965ebf2eec995473e1f95ee596ce8e21fd0de95))
+* **stimm-voice:** web voice UI, claim-token flow, and CLI commands (voice:start, voice:logs, voice:setup) ([92f611f](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/92f611fd16f4011ef4e415a47928f0188c5394d9))
+### Bug Fixes
+* **stimm-voice:** WSL2/Docker WebRTC — LD_PRELOAD shim for Nvidia crashes, TCP+UDP port mapping, TURN credentials ([e769b65](https://github.com/EtienneLescot/openclaw-stimm-voice/commit/e769b65d197b8c7df1f69035a009121a82d9e5cc))

package/README.md ADDED Viewed

@@ -0,0 +1,167 @@
+# openclaw-stimm-voice
+Stimm Voice is a third-party OpenClaw plugin for real-time voice conversations.
+It uses a dual-agent architecture:
+- A fast Python voice agent (LiveKit + STT/TTS/LLM) handles low-latency speech.
+- OpenClaw acts as the supervisor for reasoning, tools, and long-context decisions.
+## Presentation
+What this plugin provides:
+- Real-time voice sessions backed by LiveKit rooms.
+- Browser entrypoint at `web.path` (default: `/voice`).
+- Claim-token flow for web access (`/voice/claim`) with one-time, short-lived claims.
+- Optional Cloudflare Quick Tunnel for temporary public access.
+- Optional supervisor shared secret for `POST /stimm/supervisor`.
+## Install
+### Prerequisites
+- Node.js 22+
+- Python 3.10+
+- OpenClaw gateway ≥ 2026.2.0 installed and running
+- LiveKit deployment:
+  - local (`ws://localhost:7880`) or
+  - cloud (`wss://<your-project>.livekit.cloud`)
+### Install from npm
+```bash
+openclaw plugins install openclaw-stimm-voice
+```
+### Install from GitHub (latest)
+```bash
+openclaw plugins install https://github.com/EtienneLescot/openclaw-stimm-voice
+```
+Then restart the OpenClaw gateway.
+Python dependencies use Stimm extras as the single installation contract:
+- Base/default profile from [python/requirements.txt](python/requirements.txt): `stimm[deepgram,openai]`
+- Additional provider plugins are installed by the setup wizard based on selected STT/TTS/LLM providers (`stimm[...]`).
+## Config
+Set config under `plugins.entries.stimm-voice.config`.
+```json5
+{
+  enabled: true,
+  livekit: {
+    url: "wss://your-project.livekit.cloud",
+    apiKey: "APIxxxxx",
+    apiSecret: "your-secret",
+  },
+  web: {
+    enabled: true,
+    path: "/voice",
+  },
+  access: {
+    mode: "quick-tunnel", // "none" | "quick-tunnel"
+    claimTtlSeconds: 120,
+    livekitTokenTtlSeconds: 300,
+    supervisorSecret: "change-me",
+    allowDirectWebSessionCreate: false,
+    claimRateLimitPerMinute: 20,
+  },
+  voiceAgent: {
+    spawn: { autoSpawn: true },
+    stt: { provider: "deepgram", model: "nova-3" },
+    tts: { provider: "openai", model: "gpt-4o-mini-tts", voice: "ash" },
+    llm: { provider: "openai", model: "gpt-4o-mini" },
+    bufferingLevel: "MEDIUM",
+    mode: "hybrid",
+  },
+}
+```
+Notes:
+- The extension is disabled by default (`enabled: false`).
+- `access.mode="quick-tunnel"` requires `cloudflared` on PATH.
+- `voiceAgent.tts.voice` is provider-specific: OpenAI uses voice names (`ash`, `alloy`), ElevenLabs uses `voice_id`, and Cartesia uses voice UUIDs.
+- API keys can be set directly in plugin config, or via env fallbacks (`STIMM_STT_API_KEY`, `STIMM_TTS_API_KEY`, `STIMM_LLM_API_KEY`, then provider-specific env vars).
+- `access.supervisorSecret` also supports env fallback (`STIMM_SUPERVISOR_SECRET`, then `OPENCLAW_SUPERVISOR_SECRET`).
+## Usage
+### Start session from CLI/tool/gateway
+```bash
+openclaw voice:start --channel web
+```
+### Supervisor logs (high-level)
+Use this command to inspect supervisor observability quickly without manual `grep`:
+```bash
+openclaw voice:logs --limit 40
+```
+Interactive follow mode:
+```bash
+openclaw voice:logs --watch --interval 2
+```
+Options:
+- `--raw`: print raw `OBS_JSON` lines from `/tmp/stimm-agent.log`
+- `--limit <n>`: number of entries to print (default: `40`)
+- `--watch`: keep watching and print new entries continuously (Ctrl+C to stop)
+- `--interval <s>`: refresh interval for watch mode in seconds (default: `2`)
+- `--all-events`: include `inference_started` (hidden by default to reduce noise)
+The command prints two sections:
+- Stimm supervisor `OBS_JSON` events (`inference_started`, `inference_completed`, `trigger_sent`, `no_action`)
+- Gateway-side synthesized lines (`[stimm-voice:supervisor]`) when available
+`stimm.start` / `stimm_voice:start_session` returns:
+- room metadata
+- `shareUrl` (when quick tunnel is enabled)
+- one-time `claimToken`
+### Browser flow
+1. Open the returned `shareUrl` on phone.
+2. The page calls `POST /voice/claim` with the claim token.
+3. Gateway validates claim and returns a short-lived LiveKit token.
+4. Browser joins LiveKit.
+### HTTP endpoints
+- `GET <web.path>`: serves the web voice UI.
+- `POST <web.path>/claim`: claim exchange endpoint.
+- `POST <web.path>`: disabled by default (`403`) unless `access.allowDirectWebSessionCreate=true`.
+- `POST /stimm/supervisor`: internal supervisor callback (protected if `access.supervisorSecret` is set).
+### Gateway methods
+- `stimm.start`
+- `stimm.end`
+- `stimm.status`
+- `stimm.instruct`
+- `stimm.mode`
+### Tool
+Tool name: `stimm_voice`
+Actions:
+- `start_session`
+- `end_session`
+- `status`
+- `instruct`
+- `add_context`
+- `set_mode`