npm - let-them-talk - Versions diffs - 5.4.0 → 5.4.2 - Mend

let-them-talk 5.4.0 → 5.4.2

Files changed (17)

package/README.md +315 -87
package/USAGE.md +1 -1
package/cli.js +21 -2
package/conversation-templates/autonomous-feature.json +4 -4
package/conversation-templates/code-review.json +3 -3
package/conversation-templates/debug-squad.json +3 -3
package/conversation-templates/feature-build.json +3 -3
package/conversation-templates/research-write.json +3 -3
package/dashboard.js +47 -2
package/office/index.js +43 -40
package/package.json +1 -1
package/server.js +40 -6
package/templates/debate.json +2 -2
package/templates/managed.json +4 -4
package/templates/pair.json +2 -2
package/templates/review.json +2 -2
package/templates/team.json +3 -3

package/README.md CHANGED

@@ -1,140 +1,363 @@
 <p align="center">
-  <img src="logo.png" alt="Let Them Talk" width="120">
+  <img src="logo.png" alt="Let Them Talk" width="140">
 </p>
 <h1 align="center">Let Them Talk</h1>
 <p align="center">
-  Local multi-agent collaboration for AI CLI terminals and API adapters.
+  <strong>Let your AI agents actually work as a team.</strong><br>
+  Multi-agent collaboration for Claude Code, Gemini CLI, Codex CLI, Ollama, and API-backed agents — with a live operator dashboard and a 3D virtual office to watch it all happen.
 </p>
-Let Them Talk is a local MCP broker and operator dashboard. Claude Code, Gemini CLI, Codex CLI, and API-backed agents share one project runtime, exchange messages, manage work, and expose the same branch, session, and evidence model through a shared `.agent-bridge/` directory.
+<p align="center">
+  <a href="https://www.npmjs.com/package/let-them-talk"><img src="https://img.shields.io/npm/v/let-them-talk.svg?style=flat&color=58a6ff" alt="npm version"></a>
+  <a href="https://www.npmjs.com/package/let-them-talk"><img src="https://img.shields.io/npm/dm/let-them-talk.svg?style=flat&color=3fb950" alt="npm downloads"></a>
+  <a href="https://github.com/Dekelelz/let-them-talk/blob/master/LICENSE"><img src="https://img.shields.io/badge/License-BSL%201.1-f59e0b.svg?style=flat" alt="BSL 1.1"></a>
+  <a href="https://discord.gg/6Y9YgkFNJP"><img src="https://img.shields.io/discord/1482478651000885359?color=5865F2&label=Discord&logo=discord&logoColor=white&style=flat" alt="Discord"></a>
+  <a href="https://nodejs.org/"><img src="https://img.shields.io/node/v/let-them-talk.svg?color=3fb950&style=flat" alt="Node.js"></a>
+</p>
+<p align="center">
+  <a href="https://talk.unrealai.studio">Website</a> ·
+  <a href="#-quick-start">Quick Start</a> ·
+  <a href="#-features">Features</a> ·
+  <a href="#-installation">Install</a> ·
+  <a href="#-dashboard-tour">Dashboard</a> ·
+  <a href="#-core-concepts">Concepts</a> ·
+  <a href="#-architecture">Architecture</a> ·
+  <a href="https://discord.gg/6Y9YgkFNJP">Discord</a>
+</p>
+---
+## What it is
-## Quick start
+Let Them Talk is a **local MCP broker and operator dashboard** that lets multiple AI CLI agents share one project runtime. Open Claude Code, Gemini CLI, or Codex CLI in separate terminals — they discover each other, exchange messages, assign tasks, review each other's work, coordinate through workflows, and coordinate branches, sessions, and evidence through a shared `.agent-bridge/` directory. A browser dashboard gives you real-time visibility with 12 tabs — including a 3D virtual office where chibi agent characters walk between desks, wave during broadcasts, and sleep when idle.
+If you want your agents to stop working in isolation and start collaborating like a real team, this is it.
+---
+## 🚀 Quick Start
 ```bash
+# 1. Configure the MCP broker for every installed CLI (Claude / Gemini / Codex)
 npx let-them-talk init
+# 2. Launch the web dashboard (localhost:3000)
 node .agent-bridge/launch.js
 ```
-In each agent terminal:
+Now open your CLI in a second terminal and tell it to join:
+```
+You are "Alice". Call register("Alice","Claude"), then get_briefing(),
+then listen_group() and stay in the loop.
+```
-1. Register an agent name.
-2. Call `get_briefing()` if you are joining existing work.
-3. Use `listen()` in direct mode, `listen_group()` in group or managed mode, or `get_work()` if you are running the proactive autonomy loop.
+Open a third terminal, tell that agent to register as `Bob`, and the two will start talking. Everything is visible in the dashboard Messages tab, and you can reply directly from there.
-## Current runtime model
+> **Skip the manual prompts** with `npx let-them-talk init --template team` — gives you Coordinator + Researcher + Coder prompts ready to paste.
-- Canonical runtime state is broker-owned and event-backed under `.agent-bridge/runtime/`.
-- Legacy JSON and JSONL files remain compatibility projections during migration. They are not the authority model.
-- The runtime contract treats branches as full-context namespaces. In the shipped runtime today, branch-local guarantees already cover messages and history, delivery and read state, conversation control and non-general channels, sessions, evidence, tasks and workflows, and workspaces.
-- Branch-local guarantees now also cover the governance surfaces that used to remain compatibility-shared during migration: decisions, KB, reviews, dependencies, votes, rules, and progress.
-- Branch switches replace the whole migrated branch-local collaboration view at once.
-- Sessions are tied to one agent on one branch. Switching branches suspends one branch session and creates or resumes another, and forks copy historical session and evidence context without cloning live execution.
-- Terminal task and workflow completion is only authoritative when structured evidence is recorded, including `recorded_at` and `recorded_by_session` metadata.
-- Markdown workspace export writes to `.agent-bridge-markdown/` and stays non-authoritative. Editing exported markdown does not change runtime state.
+---
-Packaged docs and architecture references:
+## ⚡ Why Let Them Talk
-- `USAGE.md`
-- `docs/architecture/runtime-contract.md`
-- `docs/architecture/branch-semantics.md`
-- `docs/architecture/canonical-event-schema.md`
-- `docs/architecture/markdown-workspace.md`
-- `docs/architecture/runtime-migration-hardening.md`
+| Without Let Them Talk | With Let Them Talk |
+|---|---|
+| One agent works, you copy-paste context to the next | Agents share one runtime and see each other's work automatically |
+| "Done" is just a message that says "done" | Completion requires structured evidence (summary, verification, files_changed, confidence) |
+| You babysit the loop all day | `get_work` / `verify_and_advance` + autonomy-v2 run the loop for you |
+| No visibility into what agents are doing | Dashboard with Messages, Tasks, Workflows, Graph, Plan, 3D Hub |
+| Provider lock-in | Claude Code, Gemini CLI, Codex CLI, Ollama, and custom API agents all first-class |
+| Coordination is "chat" | Branches are full execution contexts. Sessions are branch-scoped. Governance is event-backed. |
-## Command surface
+---
-Setup:
+## ✨ Features
+- **66 MCP tools** for the full coordination surface — `register`, `send_message`, `broadcast`, `listen_group`, `get_work`, `verify_and_advance`, `create_task`, `start_plan`, `advance_workflow`, `lock_file`, `log_decision`, `kb_write`, `call_vote`, `submit_review`, `handoff`, and 50+ more.
+- **Canonical runtime** — event-backed state under `.agent-bridge/runtime/` with replay, projections, and branch-local isolation.
+- **Branches as full execution contexts** — messages, tasks, workflows, sessions, evidence, governance (decisions, KB, reviews, votes, rules, progress) all switch together on a branch change.
+- **Sessions + evidence-backed completion** — first-class session records; "done" is authoritative only when structured evidence is recorded (`summary`, `verification`, `files_changed`, `confidence`, `recorded_at`, `recorded_by_session`).
+- **Explicit runtime descriptors** — `runtime_type`, `provider_id`, `model_id`, `capabilities` (chat, vision, image_generation, video_generation, texture_generation). Mixed-provider teams coordinate by capability, not guesswork.
+- **Autonomy-v2** — `get_work` picks the next item using canonical state + sessions + evidence + capabilities + contracts. Watchdog with idle detection, retry policy, circuit breakers, and bounded escalation.
+- **3D virtual office** — real-time chibi-style visualization of your team. Agents walk between desks, react to broadcasts, celebrate tasks, sleep when idle.
+- **Web dashboard** — 12 tabs: 3D Hub, Messages, Tasks, Workspaces, Workflows, Graph, Plan, Launch, Rules, Stats, Services, Docs.
+- **Managed mode** — structured turn-taking with a Manager agent (`claim_manager`, `yield_floor`, `set_phase`) — prevents 3+ agent chaos.
+- **Channels** — sub-team communication without flooding `#general`.
+- **Markdown workspace export** — Obsidian-friendly one-way export (`.agent-bridge-markdown/`), explicitly non-authoritative.
+- **Grouped verification** — `verify:contracts`, `verify:replay`, `verify:invariants`, `verify:smoke` — script-driven, deterministic, dozens of invariants covered.
+- **0-vulnerability dependencies** — only 2 direct deps (`@modelcontextprotocol/sdk`, `three`), every transitive pinned to a known-safe version.
+---
+## 📦 Installation
+### Prerequisites
+- [Node.js 18 or higher](https://nodejs.org/) — `node --version` to check
+- One or more AI CLIs:
+  - [Claude Code](https://claude.ai/code)
+  - [Gemini CLI](https://github.com/google-gemini/gemini-cli)
+  - [Codex CLI](https://github.com/openai/codex)
+### Init (auto-detect everything installed)
 ```bash
+cd your-project
 npx let-them-talk init
-npx let-them-talk init --claude
-npx let-them-talk init --gemini
-npx let-them-talk init --codex
-npx let-them-talk init --all
-npx let-them-talk init --ollama
-npx let-them-talk init --template <name>
 ```
-After init, prefer the local launcher that was written into the project:
+### Init for a specific CLI
 ```bash
-node .agent-bridge/launch.js
-node .agent-bridge/launch.js --lan
-node .agent-bridge/launch.js status
-node .agent-bridge/launch.js msg <agent> <text>
-node .agent-bridge/launch.js reset
+npx let-them-talk init --claude     # Claude Code only
+npx let-them-talk init --gemini     # Gemini CLI only
+npx let-them-talk init --codex      # Codex CLI only
+npx let-them-talk init --all        # All three
+npx let-them-talk init --ollama     # Add a local Ollama bridge
 ```
-Other packaged CLI helpers:
+### Init with a ready-made template
 ```bash
-npx let-them-talk dashboard
-npx let-them-talk status
-npx let-them-talk templates
-npx let-them-talk uninstall
-npx let-them-talk help
+npx let-them-talk init --template pair      # 2-agent chat
+npx let-them-talk init --template team      # Coordinator + Researcher + Coder
+npx let-them-talk init --template review    # Author + Reviewer code-review pair
+npx let-them-talk init --template debate    # Pro + Con structured debate
+npx let-them-talk init --template managed   # Manager + Designer + Coder + Tester
 ```
-## Template inventory
+### What init writes (all merge-safe)
+- `.mcp.json` — Claude Code MCP config
+- `.gemini/settings.json` — Gemini CLI MCP config
+- `.codex/config.toml` — Codex CLI MCP config
+- `AGENTS.md` / `CLAUDE.md` — background-worker rules block (marker-delimited, never clobbers your content)
+- `.agent-bridge/launch.js` — local launcher (no re-download needed)
+- `.gitignore` — adds sensible entries
+All existing configs are preserved — agent-bridge is added alongside your other MCP servers, with `.backup` files created before any edit.
+### Launch the dashboard
+```bash
+node .agent-bridge/launch.js              # localhost:3000
+node .agent-bridge/launch.js --lan        # also listen on LAN (phone/tablet)
+node .agent-bridge/launch.js status       # CLI status snapshot
+node .agent-bridge/launch.js msg <agent>  # send a message from the terminal
+node .agent-bridge/launch.js migrate      # backfill canonical events from legacy projects
+```
+---
+## 🎬 The 60-second demo
+```bash
+# In project folder
+npx let-them-talk init --template team
+node .agent-bridge/launch.js
+```
+Open three terminals. The `templates` output prints the exact prompt to paste into each:
+- **Terminal 1 (Coordinator):** receives the user's request, breaks it into tasks, delegates to Researcher and Coder.
+- **Terminal 2 (Researcher):** reads code, searches patterns, reports findings to Coordinator.
+- **Terminal 3 (Coder):** implements, reports summary + verification + files_changed back.
+From the dashboard Messages tab, send the Coordinator a task. Watch the team execute it across all three terminals, with every message, task transition, workflow step, and evidence record live on screen. The 3D Hub shows chibi versions of your agents walking to their desks and typing when working.
+---
+## 🎛️ Dashboard tour
+| Tab | What it does |
+|---|---|
+| **3D Hub** | Live chibi-style visualization of your team. Per-project worlds, buildings, behaviors. |
+| **Messages** | Full conversation timeline with threading, reactions, pinning, search, and direct reply-to-Dashboard. |
+| **Tasks** | Kanban of all tasks across the branch. Drag to change status. Evidence-backed completion. **Clear All Tasks** button for cleanup. |
+| **Workspaces** | Per-agent scratchpad. Other agents can read, only you can write. 50 keys, 100 KB values. |
+| **Workflows** | Multi-step plans with dependencies, parallel steps, and auto-advance on verify. |
+| **Graph** | Agent/task/dependency network view. |
+| **Plan** | Live autonomous-plan progress with pause/stop/skip/reassign controls. |
+| **Launch** | Start agents directly from the dashboard (Add Project initializes the target folder for you). |
+| **Rules** | Project-wide rules injected into every agent's guide. |
+| **Stats** | Messages, tasks, completion rates, per-agent activity. |
+| **Services** | Status of configured providers and API keys. |
+| **Docs** | Shipped architecture + usage docs, searchable. |
+The dashboard also supports:
+- Saved named layouts
+- Omnibox / command palette on the search bar
+- Per-project branch switching and Clear Messages (canonical-aware)
+- **Reinstall Providers** — rewrites per-project MCP configs and refreshes the `AGENTS.md` rule block without touching your other content
-Agent templates shipped today:
+---
-- `pair`
-- `team`
-- `review`
-- `debate`
-- `managed`
+## 📐 Core concepts
-Conversation templates shipped today:
+### Runtime
-- `autonomous-feature`
-- `code-review`
-- `debug-squad`
-- `feature-build`
-- `research-write`
+- **Canonical truth** lives in an event-backed runtime under `.agent-bridge/runtime/`.
+- Legacy flat `.json` / `.jsonl` files in `.agent-bridge/` are compatibility projections during migration — not the authority model.
+- All mutations go through a shared canonical facade (`state/canonical.js`). The dashboard is a client of the broker, not a second writer.
-## Runtime descriptors and provider capabilities
+### Branches
-API-backed agents persist an explicit runtime descriptor with these fields:
+Branches are **full execution contexts**, not just message logs. A branch switch replaces the migrated branch-local view all at once:
+- messages and history
+- delivery and read state
+- conversation control and non-general channels
+- tasks and workflows
+- workspaces
+- sessions and evidence
+- governance: decisions, KB, reviews, dependencies, votes, rules, progress
-- `runtime_type`
-- `provider_id`
+Branch creation snapshots the source branch at the fork point. Branch-local changes never bleed into `main` until explicitly advanced.
+### Sessions + evidence
+Sessions are branch-scoped records of one agent's work on one branch. Rejoining the same branch resumes that branch-scoped context. Forks carry historical session and evidence context but do not clone live execution.
+Completion is authoritative only when structured evidence is recorded:
+- `summary`
+- `verification`
+- `files_changed`
+- `confidence` (0–100)
+- `recorded_at`
+- `recorded_by_session`
+Anything less is a conversational "done", not a runtime "done".
+### Providers + capabilities
+Every agent has an explicit runtime descriptor:
+- `runtime_type` (CLI / API / custom)
+- `provider_id` (Claude / Codex / Gemini / Ollama / ...)
 - `model_id`
-- `capabilities`
+- `capabilities` — tokens like `chat`, `vision`, `image_generation`, `video_generation`, `texture_generation`
+Coordinators can route work by capability instead of by heuristic — `get_work` and task assignment both respect declared capabilities.
+### Autonomy loop
+Instead of babysitting the chat:
+```
+Coordinator → start_plan(name, steps, assignees)
+        ↓
+Each agent → get_work() → do work → verify_and_advance() → get_work() → ...
+```
+- **`get_work`** picks the highest-priority item from: assigned workflow step, claimable task, open review, help request, blocked dependency, and more.
+- **`verify_and_advance`** self-verifies with evidence. ≥ 70 confidence auto-advances. 40–69 advances with a flag. < 40 broadcasts a help request.
+- **`retry_with_improvement`** handles failures. 3 failed retries auto-escalate to the team. Skill accumulation is stored in the KB for everyone.
+- **Watchdog** detects idle agents, stuck steps, and dead owners. Can rotate ownership within bounds.
+---
-Supported capability tokens today:
+## 🧩 Agent templates
-- `chat`
-- `vision`
-- `image_generation`
-- `video_generation`
-- `texture_generation`
+### Agent templates (role prompts)
-Legacy `provider`, `provider_color`, and `bot_capability` fields remain compatibility projections over that descriptor.
+| Template | Agents | Use when |
+|---|---|---|
+| `pair` | A, B | Two-agent brainstorm or Q&A |
+| `team` | Coordinator, Researcher, Coder | Feature work with research + implementation |
+| `review` | Author, Reviewer | Code-review loop |
+| `debate` | Pro, Con | Explore tradeoffs / architecture decisions |
+| `managed` | Manager, Designer, Coder, Tester | 3+ agents with structured turn-taking |
-## Markdown workspace export
+### Conversation templates (workflow skeletons)
+| Template | Purpose |
+|---|---|
+| `feature-build` | End-to-end feature: research → design → implement → test |
+| `code-review` | Structured code review with evidence |
+| `debug-squad` | Coordinated bug triage and fix |
+| `research-write` | Research → synthesize → document |
+| `autonomous-feature` | Fully autonomous multi-agent feature build |
+List, show, or apply templates:
 ```bash
-npm run export:markdown-workspace
+npx let-them-talk templates                         # list all
+npx let-them-talk init --template team              # scaffold a team
 ```
-Default export root is `<project>/.agent-bridge-markdown/`. Exported files declare `authoritative: false` in frontmatter. The export is one-way only. There is no markdown write-back, watcher loop, or import path in the current runtime.
+---
-When a source surface is still compatibility-shared or main-only, export stays truthful by emitting it only for `main` or omitting it. The exporter does not fabricate non-main branch copies from shared state.
+## 🧪 Verification
-## Verification
-From this package directory:
+Script-driven, deterministic, no flake:
 ```bash
-npm test
+npm test                     # delegates to verify
+npm run verify               # full suite
+npm run verify:contracts     # runtime + schema + branches + markdown
+npm run verify:replay        # event replay (healthy + clean + negative)
+npm run verify:invariants    # dashboard, capabilities, parity, sessions, evidence, autonomy, hooks
+npm run verify:smoke         # representative subset
 ```
-Grouped package commands:
+The verify suite doesn't claim to cover every provider or runtime matrix, and does not include browser automation. But every shipped invariant is script-enforced on every release.
+---
+## 🔐 Security
+- **Dashboard binds to `127.0.0.1` by default.** LAN mode (`--lan`) requires explicit enablement and uses a file-based auth token.
+- **Rate-limited** API endpoints on non-localhost requests.
+- **No telemetry, no cloud.** Everything runs locally.
+- **0 known vulnerabilities** in the shipped tarball as of v5.4.2.
+- **Sensitive-path blocks** on file-share: `.env`, `.pem`, `.key`, `.lan-token`, `mcp.json`, and the agent-bridge data directory cannot be shared.
+- See [`SECURITY.md`](SECURITY.md) for the disclosure policy.
+---
+## 📚 Architecture
+Source-of-truth docs:
+- [`docs/architecture/runtime-contract.md`](docs/architecture/runtime-contract.md)
+- [`docs/architecture/branch-semantics.md`](docs/architecture/branch-semantics.md)
+- [`docs/architecture/canonical-event-schema.md`](docs/architecture/canonical-event-schema.md)
+- [`docs/architecture/markdown-workspace.md`](docs/architecture/markdown-workspace.md)
+- [`docs/architecture/runtime-migration-hardening.md`](docs/architecture/runtime-migration-hardening.md)
+---
+## 🧾 Commands reference
+Full CLI surface for copy-paste convenience:
 ```bash
+# Setup & init
+npx let-them-talk init
+npx let-them-talk init --claude
+npx let-them-talk init --gemini
+npx let-them-talk init --codex
+npx let-them-talk init --all
+npx let-them-talk init --ollama
+npx let-them-talk init --template <name>
+# Packaged helpers via npx
+npx let-them-talk dashboard
+npx let-them-talk status
+npx let-them-talk templates
+npx let-them-talk uninstall
+npx let-them-talk help
+# After init, local launcher (no re-download)
+node .agent-bridge/launch.js
+node .agent-bridge/launch.js --lan
+node .agent-bridge/launch.js status
+node .agent-bridge/launch.js msg <agent> <text>
+node .agent-bridge/launch.js reset
+node .agent-bridge/launch.js migrate
+# Verification (run inside agent-bridge/)
+npm test
 npm run verify
 npm run verify:contracts
 npm run verify:replay
@@ -142,17 +365,22 @@ npm run verify:invariants
 npm run verify:smoke
 ```
-Current grouped coverage:
+---
+## 💬 Community
-- `verify:contracts` checks the runtime contract, canonical event schema, branch semantics, and markdown workspace contract.
-- `verify:replay` checks healthy and clean replay plus expected-failure negative replay scenarios.
-- `verify:invariants` checks authority routing, dashboard control plane behavior, performance and indexing, provider capabilities, API-agent parity, dashboard semantic-gap coverage, migration hardening, branch isolation, session lifecycle, evidence-backed completion, session-aware context, autonomy v2, advisory contracts, managed-team integration, lifecycle hooks, and markdown workspace export and safety.
-- `verify:smoke` runs a representative subset, including the dashboard semantic-gap check.
+- [Discord](https://discord.gg/6Y9YgkFNJP) — questions, show-and-tell, feedback
+- [GitHub Issues](https://github.com/Dekelelz/let-them-talk/issues) — bugs and feature requests
+- [Website](https://talk.unrealai.studio) — project home
-Coverage is still partial. The suite does not claim a full provider or runtime matrix, and it does not include browser automation.
+---
-## Security and license
+## 📄 License
-Security notes live in `SECURITY.md`.
+[Business Source License 1.1](LICENSE). See the license file for usage terms.
-License: [Business Source License 1.1](LICENSE)
+---
+<p align="center">
+  <sub>Built for humans who want their AI agents to work as a team.</sub>
+</p>
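
The new README states that completion is authoritative only with structured evidence, and that `verify_and_advance` branches on confidence (≥ 70 auto-advances, 40–69 advances with a flag, < 40 broadcasts a help request). A minimal sketch of that rule is shown below. It is an illustration only, not the package's code: `REQUIRED_EVIDENCE_FIELDS`, `classifyCompletion`, and the returned action names are hypothetical, and the field types are assumed.

```javascript
// Hypothetical sketch of the evidence + confidence rule described in the README.
// None of these names come from the let-them-talk source.
const REQUIRED_EVIDENCE_FIELDS = [
  'summary',
  'verification',
  'files_changed',
  'confidence',
  'recorded_at',
  'recorded_by_session',
];

function classifyCompletion(evidence) {
  // A "done" without every evidence field is only a conversational "done".
  const missing = REQUIRED_EVIDENCE_FIELDS.filter(
    (field) => evidence == null || evidence[field] === undefined || evidence[field] === null
  );
  if (missing.length > 0) {
    return { action: 'reject_incomplete', missing };
  }

  // The README gives confidence as a 0-100 value.
  const { confidence } = evidence;
  if (!Number.isFinite(confidence) || confidence < 0 || confidence > 100) {
    return { action: 'reject_incomplete', missing: ['confidence'] };
  }

  if (confidence >= 70) return { action: 'advance', missing: [] };
  if (confidence >= 40) return { action: 'advance_with_flag', missing: [] };
  return { action: 'request_help', missing: [] };
}
```

Under these assumptions, a record with all six fields and `confidence: 69` would map to `advance_with_flag`, while a record carrying only `summary` would be rejected as incomplete.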

package/USAGE.md CHANGED

@@ -1,6 +1,6 @@
 <!-- Generated from ../USAGE.md by scripts/sync-packaged-docs.js for published package consumers. -->
-# Let Them Talk Usage Guide v5.4.0
+# Let Them Talk Usage Guide v5.4.2
 This guide is the short operator view of the current runtime. For normative architecture details, use the docs under `docs/architecture/`.

package/cli.js CHANGED

@@ -9,7 +9,7 @@ const { createCanonicalState } = require('./state/canonical');
 function printUsage() {
   console.log(`
-  Let Them Talk — Agent Bridge v5.4.0
+  Let Them Talk — Agent Bridge v5.4.2
   MCP message broker for inter-agent communication
   Supports: Claude Code, Gemini CLI, Codex CLI, Ollama
@@ -346,7 +346,26 @@ function setupAgentsMarkdown(cwd, log = console.log) {
     '5. **Do not answer on another agent\'s behalf.** If a message targets a',
     '   specific agent (`msg.to`), only that agent should reply.',
     '',
-    '6. The loop only ends when the owner sends a message telling you to stop.',
+    '6. **Self-reliance.** When the Owner gives you a goal, break it down',
+    '   yourself and work until done. NEVER stop to ask "should I do X?" or',
+    '   "do you want me to Y?" for decisions the team can make. Decide,',
+    '   `log_decision()` to record the choice, continue.',
+    '',
+    '7. **Team-first escalation.** Before DMing Owner with a question, try',
+    '   these in order: (a) `kb_read()` — did the team already decide this?',
+    '   (b) DM a teammate with the relevant skill (use `list_agents()`).',
+    '   (c) `call_vote()` if the team genuinely disagrees. (d) `log_decision()`',
+    '   to lock in your choice and move forward. Only escalate to Owner when',
+    '   the overall goal is complete OR a true blocker only the Owner can',
+    '   resolve (credentials, priorities, business rules).',
+    '',
+    '8. **Done-when-done.** "Done" means the Owner\'s original GOAL is',
+    '   achieved, not the current step. After `verify_and_advance()`, call',
+    '   `get_work()` again. If nothing is queued and the goal is not yet',
+    '   done, synthesize new tasks with `create_task()` and keep going.',
+    '',
+    '9. The loop only ends when the goal is achieved with evidence OR the',
+    '   Owner sends a message telling you to stop.',
     '',
     END,
   ].join('\n');
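
The hunk above builds the rules block as an array of lines ending in an `END` marker, and the README describes the block as marker-delimited so that existing `AGENTS.md` content is never clobbered. One way such an in-place update could work is sketched below. This is an assumption-laden illustration rather than the code in `setupAgentsMarkdown`: the marker strings and the `upsertMarkedBlock` helper are hypothetical, since the real `BEGIN`/`END` values are not visible in this diff.

```javascript
// Hypothetical markers; the actual BEGIN/END values in cli.js are not shown in the diff.
const BEGIN = '<!-- agent-bridge:begin -->';
const END = '<!-- agent-bridge:end -->';

// Replace an existing marked block, or append one, leaving other content untouched.
function upsertMarkedBlock(existing, ruleLines) {
  const block = [BEGIN, ...ruleLines, END].join('\n');
  const start = existing.indexOf(BEGIN);
  const end = start === -1 ? -1 : existing.indexOf(END, start + BEGIN.length);

  if (start !== -1 && end !== -1) {
    // Swap only the text between (and including) the markers.
    return existing.slice(0, start) + block + existing.slice(end + END.length);
  }
  if (existing.length === 0) {
    return block + '\n';
  }
  // Keep the user's content and add the block after a blank line.
  const separator = existing.endsWith('\n') ? '\n' : '\n\n';
  return existing + separator + block + '\n';
}
```

With this approach a version bump such as the one in this diff (rule 6 expanding into rules 6 through 9) would rewrite only the marked region, and re-running the update with the same rules would leave the file unchanged.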

package/conversation-templates/autonomous-feature.json CHANGED

@@ -11,7 +11,7 @@
         "interfaces",
         "architecture"
       ],
-      "prompt": "Register as \"Architect\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Design the feature architecture with clear interfaces. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message."
+      "prompt": "Register as \"Architect\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Design the feature architecture with clear interfaces. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message.\n\nAUTONOMY RULES (DO NOT BREAK):\n1. SELF-RELIANCE — When given a goal, break it down and work until done. Never pause to ask \"should I do X?\" or \"do you want me to Y?\" for decisions the team can make. Decide, log_decision() to record the choice, continue.\n2. TEAM-FIRST ESCALATION — Before DMing Owner with a question: kb_read() first, then DM a teammate with the relevant skill (list_agents() to find them), then call_vote() if disagreement, then log_decision() to lock your choice. Only escalate to Owner when the goal is complete OR a true blocker only the Owner can resolve (credentials, priorities, business rules).\n3. DONE-WHEN-DONE — Done means the Owner's original GOAL is achieved with evidence, not \"I finished my current step\". After verify_and_advance(), call get_work() again. If nothing is queued and the goal is not yet done, synthesize new tasks with create_task() and keep going."
     },
     {
       "name": "Backend",
@@ -22,7 +22,7 @@
         "server",
         "backend"
       ],
-      "prompt": "Register as \"Backend\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Implement server-side logic, APIs, and data models. Write unit tests. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message."
+      "prompt": "Register as \"Backend\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Implement server-side logic, APIs, and data models. Write unit tests. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message.\n\nAUTONOMY RULES (DO NOT BREAK):\n1. SELF-RELIANCE — When given a goal, break it down and work until done. Never pause to ask \"should I do X?\" or \"do you want me to Y?\" for decisions the team can make. Decide, log_decision() to record the choice, continue.\n2. TEAM-FIRST ESCALATION — Before DMing Owner with a question: kb_read() first, then DM a teammate with the relevant skill (list_agents() to find them), then call_vote() if disagreement, then log_decision() to lock your choice. Only escalate to Owner when the goal is complete OR a true blocker only the Owner can resolve (credentials, priorities, business rules).\n3. DONE-WHEN-DONE — Done means the Owner's original GOAL is achieved with evidence, not \"I finished my current step\". After verify_and_advance(), call get_work() again. If nothing is queued and the goal is not yet done, synthesize new tasks with create_task() and keep going."
     },
     {
       "name": "Frontend",
@@ -33,7 +33,7 @@
         "components",
         "frontend"
       ],
-      "prompt": "Register as \"Frontend\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Implement UI components, pages, and client-side logic. Write component tests. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message."
+      "prompt": "Register as \"Frontend\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Implement UI components, pages, and client-side logic. Write component tests. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message.\n\nAUTONOMY RULES (DO NOT BREAK):\n1. SELF-RELIANCE — When given a goal, break it down and work until done. Never pause to ask \"should I do X?\" or \"do you want me to Y?\" for decisions the team can make. Decide, log_decision() to record the choice, continue.\n2. TEAM-FIRST ESCALATION — Before DMing Owner with a question: kb_read() first, then DM a teammate with the relevant skill (list_agents() to find them), then call_vote() if disagreement, then log_decision() to lock your choice. Only escalate to Owner when the goal is complete OR a true blocker only the Owner can resolve (credentials, priorities, business rules).\n3. DONE-WHEN-DONE — Done means the Owner's original GOAL is achieved with evidence, not \"I finished my current step\". After verify_and_advance(), call get_work() again. If nothing is queued and the goal is not yet done, synthesize new tasks with create_task() and keep going."
     },
     {
       "name": "Tester",
@@ -44,7 +44,7 @@
         "integration",
         "verification"
       ],
-      "prompt": "Register as \"Tester\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Write and run integration tests. Verify all components work together. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message."
+      "prompt": "Register as \"Tester\", call get_briefing() for current branch and session context, call get_guide() once to load the active autonomous rules, then use get_work() for assignments. Write and run integration tests. Verify all components work together. When you complete a workflow step, call verify_and_advance() with summary, verification, files_changed, and confidence. Respect any required_capabilities or preferred_capabilities attached to work before continuing it.\n\nLOOP RULE (DO NOT BREAK): After every action, call listen() (or listen_group() in group/managed mode). When it returns with no messages, that is NORMAL — call it again immediately. Codex CLI may end the call near 120s due to its own tool-call timeout; treat the empty return as \"no messages yet\", NOT as a failure. Never stop the loop. Never sleep, never poll with check_messages().\n\nDASHBOARD REPLY RULE: When you receive a message from \"Dashboard\" or \"Owner\", reply ONLY via send_message(to=\"Dashboard\", content=...). The owner reads your reply in the dashboard Messages tab — NOT your CLI terminal. Anything you say in terminal output is invisible to the owner. If a message targets a different agent (msg.to is not you), do not answer on their behalf. After send_message, call listen_group() (or listen()) again immediately.\n\nDASHBOARD IS YOUR VOICE: Your CLI terminal is invisible to the owner and to every other agent. Everything you want anyone to see — status updates, questions for teammates, \"starting X\", \"done with Y\", \"blocked on Z\" — MUST go out through send_message() or broadcast(). Talk like humans on a team chat. Never just narrate in terminal and assume anyone will read it.\n\nTOOL ERROR RECOVERY: If listen_group() (or listen()) returns a tool error such as \"timed out awaiting tools/call\", that is a transport hiccup — immediately call it again. Do NOT summarize in terminal, do NOT stop the loop, do NOT treat the error as \"done\". 
The loop only ends when the owner tells you to stop via send_message.\n\nAUTONOMY RULES (DO NOT BREAK):\n1. SELF-RELIANCE — When given a goal, break it down and work until done. Never pause to ask \"should I do X?\" or \"do you want me to Y?\" for decisions the team can make. Decide, log_decision() to record the choice, continue.\n2. TEAM-FIRST ESCALATION — Before DMing Owner with a question: kb_read() first, then DM a teammate with the relevant skill (list_agents() to find them), then call_vote() if disagreement, then log_decision() to lock your choice. Only escalate to Owner when the goal is complete OR a true blocker only the Owner can resolve (credentials, priorities, business rules).\n3. DONE-WHEN-DONE — Done means the Owner's original GOAL is achieved with evidence, not \"I finished my current step\". After verify_and_advance(), call get_work() again. If nothing is queued and the goal is not yet done, synthesize new tasks with create_task() and keep going."
     }
   ],
   "workflow": {