npm - dw-kit - Versions diffs - 1.8.0-rc.1 → 1.8.0-rc.2 - Mend

dw-kit 1.8.0-rc.1 → 1.8.0-rc.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/.dw/config/connectors.local.yml +30 -0
package/CLAUDE.md +1 -1
package/package.json +1 -1
package/src/cli.mjs +4 -2
package/src/commands/voice.mjs +472 -21
package/src/lib/orchestrator.mjs +59 -9
package/src/lib/session-store.mjs +14 -2
package/src/lib/session-tree.mjs +127 -0
package/src/lib/tls-helpers.mjs +153 -0
package/src/lib/voice-action.mjs +259 -0
package/src/lib/voice-log.mjs +105 -0
package/src/lib/voice-parser.mjs +50 -34

package/.dw/config/connectors.local.yml CHANGED

@@ -6,3 +6,33 @@ telegram:
   bot_token: 8693679403:AAG9FrgUd5Ig9eDTAWnA9RqhbMnShRi3Si0
   allowed_user_ids:
     - 6603235862
+# Voice channel (Phase 4 substrate of G-rgoal-realtime-orch).
+# `dw voice` boots a localhost HTTP server with a browser page using Web
+# Speech API for ASR + TTS. Optional orchestrator hybrid: when the regex
+# parser misses, hand the transcript to the configured agent (Claude Code /
+# Codex / Gemini) with a voice-aware system prompt.
+voice:
+  # Default UI language for SpeechRecognition + TTS. User can override on the
+  # page via dropdown. Common values: en-US · vi-VN · ja-JP · ko-KR · zh-CN.
+  # Browser support varies — Chrome / Edge / Safari are best.
+  lang: en-US
+  # Extra languages to surface in the dropdown (beyond `lang` + en-US default).
+  extra_langs: [vi-VN]
+  # Server-side TTS fallback when no native browser/OS voice matches the
+  # selected language. F-23 (G-dogfood-v1.7): a fresh Windows install has
+  # no Vietnamese SAPI voice; we proxy MP3 from Google Translate's public
+  # TTS endpoint. Modes:
+  #   auto    — only when no native voice matches (default; balanced)
+  #   always  — always use server-side (best for cross-machine consistency)
+  #   none    — never use server-side (privacy: spoken text stays local)
+  # Privacy note: in `auto`/`always`, the spoken text leaves the machine to
+  # translate.google.com. Disable with `none` if that is unacceptable.
+  fallback_tts: auto
+  orchestrator:
+    enabled: true          # opt-in. To enable WITHOUT committing it, add the
+                             # same `voice:` section to `.dw/config/connectors.local.yml`
+                             # (gitignored) — the local file overrides this template
+                             # via deep-merge (same pattern as bot_token + allow-list).
+    agent: claude            # must exist in .dw/config/agents.yml; CLI must be on PATH
+    timeout_ms: 30000        # max wait per turn (Claude is typically 2-8s)
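The comments in this hunk describe two behaviors that a small sketch may make easier to follow: the local file overriding a committed template via deep-merge, and the three `fallback_tts` modes. The code below is an illustration only, not dw-kit's implementation. The helper names `deepMerge` and `shouldUseServerTts` are hypothetical, and the template values (in particular `enabled: false`) are assumed from the "opt-in" wording rather than taken from the package.

```javascript
// Hypothetical helpers for illustration; none of these names come from dw-kit.

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Recursively merge `override` into `base` without mutating either.
// Nested plain objects merge key by key; arrays and scalars are replaced whole.
function deepMerge(base, override) {
  const result = { ...base };
  for (const [key, value] of Object.entries(override)) {
    result[key] =
      isPlainObject(value) && isPlainObject(result[key])
        ? deepMerge(result[key], value)
        : value;
  }
  return result;
}

// Decide whether to use server-side TTS, following the three documented modes.
function shouldUseServerTts(mode, hasNativeVoice) {
  switch (mode) {
    case 'always':
      return true; // always proxy, regardless of installed voices
    case 'none':
      return false; // spoken text never leaves the machine
    case 'auto':
      return !hasNativeVoice; // proxy only when no native voice matches
    default:
      throw new Error(`Unknown fallback_tts mode: ${mode}`);
  }
}

// Assumed template: orchestrator disabled until a local file opts in.
const template = {
  voice: {
    lang: 'en-US',
    fallback_tts: 'auto',
    orchestrator: { enabled: false, agent: 'claude', timeout_ms: 30000 },
  },
};

// Local (gitignored) override: only the keys that differ.
const local = {
  voice: { extra_langs: ['vi-VN'], orchestrator: { enabled: true } },
};

const merged = deepMerge(template, local);
console.log(merged.voice.orchestrator);
console.log(shouldUseServerTts(merged.voice.fallback_tts, false));
```

Under these assumptions the local file only has to state `orchestrator.enabled: true`. The `agent` and `timeout_ms` values fall through from the template, which is the point of merging deeply instead of replacing the whole `voice:` section.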

package/CLAUDE.md CHANGED

@@ -3,7 +3,7 @@
 Workflow toolkit codebase. Rules live in `.claude/rules/` (auto-loaded).
 **v2.0 direction:** Context-First SDLC Governance Layer (5 pillars — see `.dw/core/PILLARS.md`)
-**Current:** **v1.8.0-rc.1** (2026-05-25) — 4/5 phase substrate for [G-rgoal-realtime-orch](.dw/goals/G-rgoal-realtime-orch/goal.md) voice-meeting Root Goal: persistent CLI-agent session runtime (`dw session *`), Telegram chat bridge (`dw connector telegram setup`) with 1-command interactive wizard, multi-workspace registry (`dw workspace *`), and browser voice MVP (`dw voice`) with hybrid orchestrator fallback (Claude/Codex/Gemini), full bilingual UX (en + vi), voice-not-installed fallback via Google Translate TTS proxy. **v1.7.0 stable** on `main` (2026-05-24). 284/284 smoke tests. Active ADRs: ADR-0001 (Pragmatic Lean), ADR-0005/0006 (Supply-Chain Guard; sunset review 2026-08-12), ADR-0008 (Task Docs v3, v1.5), ADR-0009 (Agent OS, v1.6), ADR-0010 (Goals Layer, v1.7). ADR-0011 (Session Runtime + Voice Orchestrator, v1.8) pending codification.
+**Current:** **v1.8.0-rc.1** (2026-05-25) — 4/5 phase substrate for [G-rgoal-realtime-orch](.dw/goals/G-rgoal-realtime-orch/goal.md) voice-meeting Root Goal: persistent CLI-agent session runtime (`dw session *`), Telegram chat bridge (`dw connector telegram setup`) with 1-command interactive wizard, multi-workspace registry (`dw workspace *`), and browser voice MVP (`dw voice`) with hybrid orchestrator fallback (Claude/Codex/Gemini), full bilingual UX (en + vi), voice-not-installed fallback via Google Translate TTS proxy. **v1.7.0 stable** on `main` (2026-05-24). 284/284 smoke tests. Active ADRs: ADR-0001 (Pragmatic Lean), ADR-0005/0006 (Supply-Chain Guard; sunset review 2026-08-12), ADR-0008 (Task Docs v3, v1.5), ADR-0009 (Agent OS, v1.6), ADR-0010 (Goals Layer, v1.7), ADR-0011 (Session Runtime + Voice Orchestrator, v1.8 — Accepted).
 ---

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "dw-kit",
-  "version": "1.8.0-rc.1",
+  "version": "1.8.0-rc.2",
   "description": "AI development workflow toolkit — structured, quality-assured, team-ready. From requirements to dashboard.",
   "type": "module",
   "bin": {

package/src/cli.mjs CHANGED

@@ -352,10 +352,12 @@ export function run(argv) {
   // ─ voice: browser-based voice MVP (G-rgoal phase 4 — KR-A destination) ─
   program
     .command('voice')
-    .description('Browser-based voice MVP — speak dw commands, hear results. Local server + Web Speech API. Spike scope: localhost only.')
-    .option('-p, --port <port>', 'Local server port (default 4500; auto-finds next free)', parseInt)
+    .description('Browser-based voice MVP — speak dw commands, hear results. Web Speech API + optional HTTPS for mobile/LAN reach.')
+    .option('-p, --port <port>', 'Server port (default 4500; auto-finds next free)', parseInt)
     .option('--default-agent <name>', 'Default agent for "start ..." without explicit agent name')
     .option('--no-open', 'Skip auto-opening browser tab (also implied by DW_NO_OPEN=1, CI=true, or non-TTY stdout)')
+    .option('--https', 'Serve over HTTPS using self-signed cert (auto-generated via mkcert or openssl). Required for mic access from non-localhost browsers (mobile / LAN / Tailscale).')
+    .option('--bind <addr>', 'Bind address (default 127.0.0.1). Use 0.0.0.0 to expose on all interfaces (LAN/Tailscale). MUST combine with --https for remote mic access.', '127.0.0.1')
     .action(async (opts) => {
       const { voiceCommand } = await import('./commands/voice.mjs');
       await voiceCommand(opts);
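The new `--bind` help text says a non-loopback address must be combined with `--https` for remote mic access, and the `--no-open` text lists three conditions that also suppress the browser tab. The sketch below shows one way such checks could be written. It is not taken from `voice.mjs` (that diff is not shown here), and `isLoopback`, `checkVoiceOptions`, and `shouldOpenBrowser` are hypothetical names. The only library behavior relied on is that Commander exposes a `--no-open` flag as `opts.open === false`.

```javascript
// Hypothetical helpers for illustration; not dw-kit's actual code.

// Treat the usual loopback spellings as local-only binds.
function isLoopback(addr) {
  return addr === 'localhost' || addr === '::1' || addr.startsWith('127.');
}

// Normalize the bind/https options and collect warnings for risky combinations.
function checkVoiceOptions({ bind = '127.0.0.1', https = false } = {}) {
  const warnings = [];
  if (!isLoopback(bind) && !https) {
    // Per the help text: mic access from non-localhost browsers needs HTTPS.
    warnings.push(
      `--bind ${bind} without --https: remote browsers may refuse microphone access.`
    );
  }
  return { bind, scheme: https ? 'https' : 'http', warnings };
}

// Mirror the documented rules for skipping the auto-opened browser tab.
function shouldOpenBrowser(opts, env, stdoutIsTTY) {
  if (opts.open === false) return false; // --no-open was passed
  if (env.DW_NO_OPEN === '1') return false; // DW_NO_OPEN=1
  if (env.CI === 'true') return false; // CI=true
  return Boolean(stdoutIsTTY); // non-TTY stdout also implies no-open
}

const checked = checkVoiceOptions({ bind: '0.0.0.0', https: false });
console.log(checked.scheme, checked.warnings);
console.log(shouldOpenBrowser({ open: true }, { CI: 'true' }, true));
```

Keeping these as pure functions over `opts`, an environment object, and a TTY flag makes them easy to test without starting a server. A real caller would presumably pass `process.env` and `process.stdout.isTTY`.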