openclaw-node-harness 2.0.4 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (115)
  1. package/README.md +646 -3
  2. package/bin/hyperagent.mjs +419 -0
  3. package/bin/mesh-agent.js +401 -12
  4. package/bin/mesh-bridge.js +66 -1
  5. package/bin/mesh-task-daemon.js +816 -26
  6. package/bin/mesh.js +403 -1
  7. package/config/claude-settings.json +95 -0
  8. package/config/daemon.json.template +2 -1
  9. package/config/git-hooks/pre-commit +13 -0
  10. package/config/git-hooks/pre-push +12 -0
  11. package/config/harness-rules.json +174 -0
  12. package/config/plan-templates/team-bugfix.yaml +52 -0
  13. package/config/plan-templates/team-deploy.yaml +50 -0
  14. package/config/plan-templates/team-feature.yaml +71 -0
  15. package/config/roles/qa-engineer.yaml +36 -0
  16. package/config/roles/solidity-dev.yaml +51 -0
  17. package/config/roles/tech-architect.yaml +36 -0
  18. package/config/rules/framework/solidity.md +22 -0
  19. package/config/rules/framework/typescript.md +21 -0
  20. package/config/rules/framework/unity.md +21 -0
  21. package/config/rules/universal/design-docs.md +18 -0
  22. package/config/rules/universal/git-hygiene.md +18 -0
  23. package/config/rules/universal/security.md +19 -0
  24. package/config/rules/universal/test-standards.md +19 -0
  25. package/identity/DELEGATION.md +6 -6
  26. package/install.sh +293 -8
  27. package/lib/circling-parser.js +119 -0
  28. package/lib/hyperagent-store.mjs +652 -0
  29. package/lib/kanban-io.js +9 -0
  30. package/lib/mcp-knowledge/bench.mjs +118 -0
  31. package/lib/mcp-knowledge/core.mjs +528 -0
  32. package/lib/mcp-knowledge/package.json +25 -0
  33. package/lib/mcp-knowledge/server.mjs +245 -0
  34. package/lib/mcp-knowledge/test.mjs +802 -0
  35. package/lib/memory-budget.mjs +261 -0
  36. package/lib/mesh-collab.js +301 -1
  37. package/lib/mesh-harness.js +427 -0
  38. package/lib/mesh-plans.js +13 -5
  39. package/lib/mesh-tasks.js +67 -0
  40. package/lib/plan-templates.js +226 -0
  41. package/lib/pre-compression-flush.mjs +320 -0
  42. package/lib/role-loader.js +292 -0
  43. package/lib/rule-loader.js +358 -0
  44. package/lib/session-store.mjs +458 -0
  45. package/lib/transcript-parser.mjs +292 -0
  46. package/mission-control/drizzle/soul_schema_update.sql +29 -0
  47. package/mission-control/drizzle.config.ts +1 -4
  48. package/mission-control/package-lock.json +1571 -83
  49. package/mission-control/package.json +6 -2
  50. package/mission-control/scripts/gen-chronology.js +3 -3
  51. package/mission-control/scripts/import-pipeline-v2.js +0 -16
  52. package/mission-control/scripts/import-pipeline.js +0 -15
  53. package/mission-control/src/app/api/cowork/clusters/[id]/members/route.ts +117 -0
  54. package/mission-control/src/app/api/cowork/clusters/[id]/route.ts +84 -0
  55. package/mission-control/src/app/api/cowork/clusters/route.ts +141 -0
  56. package/mission-control/src/app/api/cowork/dispatch/route.ts +128 -0
  57. package/mission-control/src/app/api/cowork/events/route.ts +65 -0
  58. package/mission-control/src/app/api/cowork/intervene/route.ts +259 -0
  59. package/mission-control/src/app/api/cowork/sessions/[id]/route.ts +37 -0
  60. package/mission-control/src/app/api/cowork/sessions/route.ts +64 -0
  61. package/mission-control/src/app/api/diagnostics/route.ts +97 -0
  62. package/mission-control/src/app/api/diagnostics/test-runner/route.ts +990 -0
  63. package/mission-control/src/app/api/mesh/events/route.ts +95 -19
  64. package/mission-control/src/app/api/mesh/identity/route.ts +11 -0
  65. package/mission-control/src/app/api/mesh/tasks/[id]/route.ts +92 -0
  66. package/mission-control/src/app/api/mesh/tasks/route.ts +91 -0
  67. package/mission-control/src/app/api/tasks/[id]/handoff/route.ts +1 -1
  68. package/mission-control/src/app/api/tasks/[id]/route.ts +90 -4
  69. package/mission-control/src/app/api/tasks/route.ts +21 -30
  70. package/mission-control/src/app/cowork/page.tsx +261 -0
  71. package/mission-control/src/app/diagnostics/page.tsx +385 -0
  72. package/mission-control/src/app/graph/page.tsx +26 -0
  73. package/mission-control/src/app/memory/page.tsx +1 -1
  74. package/mission-control/src/app/obsidian/page.tsx +36 -6
  75. package/mission-control/src/app/roadmap/page.tsx +24 -0
  76. package/mission-control/src/app/souls/page.tsx +2 -2
  77. package/mission-control/src/components/board/execution-config.tsx +431 -0
  78. package/mission-control/src/components/board/kanban-board.tsx +75 -9
  79. package/mission-control/src/components/board/kanban-column.tsx +135 -19
  80. package/mission-control/src/components/board/task-card.tsx +55 -2
  81. package/mission-control/src/components/board/unified-task-dialog.tsx +82 -4
  82. package/mission-control/src/components/cowork/cluster-card.tsx +176 -0
  83. package/mission-control/src/components/cowork/create-cluster-dialog.tsx +251 -0
  84. package/mission-control/src/components/cowork/dispatch-form.tsx +423 -0
  85. package/mission-control/src/components/cowork/role-picker.tsx +102 -0
  86. package/mission-control/src/components/cowork/session-card.tsx +284 -0
  87. package/mission-control/src/components/layout/sidebar.tsx +39 -2
  88. package/mission-control/src/lib/__tests__/daily-log.test.ts +82 -0
  89. package/mission-control/src/lib/__tests__/memory-md.test.ts +87 -0
  90. package/mission-control/src/lib/__tests__/mesh-kv-sync.test.ts +465 -0
  91. package/mission-control/src/lib/__tests__/mocks/mock-kv.ts +131 -0
  92. package/mission-control/src/lib/__tests__/status-kanban.test.ts +46 -0
  93. package/mission-control/src/lib/__tests__/task-markdown.test.ts +188 -0
  94. package/mission-control/src/lib/__tests__/wikilinks.test.ts +175 -0
  95. package/mission-control/src/lib/config.ts +58 -0
  96. package/mission-control/src/lib/db/index.ts +69 -0
  97. package/mission-control/src/lib/db/schema.ts +61 -3
  98. package/mission-control/src/lib/hooks.ts +309 -0
  99. package/mission-control/src/lib/memory/entities.ts +3 -2
  100. package/mission-control/src/lib/nats.ts +66 -1
  101. package/mission-control/src/lib/parsers/task-markdown.ts +52 -2
  102. package/mission-control/src/lib/parsers/transcript.ts +4 -4
  103. package/mission-control/src/lib/scheduler.ts +12 -11
  104. package/mission-control/src/lib/sync/mesh-kv.ts +279 -0
  105. package/mission-control/src/lib/sync/tasks.ts +23 -1
  106. package/mission-control/src/lib/task-id.ts +32 -0
  107. package/mission-control/src/lib/tts/index.ts +33 -9
  108. package/mission-control/tsconfig.json +2 -1
  109. package/mission-control/vitest.config.ts +14 -0
  110. package/package.json +15 -2
  111. package/services/service-manifest.json +1 -1
  112. package/skills/cc-godmode/references/agents.md +8 -8
  113. package/workspace-bin/memory-daemon.mjs +199 -5
  114. package/workspace-bin/session-search.mjs +204 -0
  115. package/workspace-bin/web-fetch.mjs +65 -0
package/README.md CHANGED
@@ -3,11 +3,16 @@
3
3
  Installable package for deploying an OpenClaw node. Includes the full infrastructure stack:
4
4
 
5
5
  - **Memory Daemon** — persistent background service managing session lifecycle, memory maintenance, and Obsidian sync
6
+ - **HyperAgent Protocol** — self-improving agent loop: telemetry, structured reflection, strategy archive, and self-modifying proposals with human-gated approval
6
7
  - **Mission Control** — Next.js web dashboard (kanban, timeline, graph visualization, memory browser)
7
8
  - **Soul System** — multi-soul orchestration with trust registry and evolution
8
9
  - **Skill Library** — 100+ skills for AI agent capabilities
9
10
  - **Boot Compiler** — profile-aware boot artifact generation for multiple AI models
10
11
  - **ClawVault** — structured knowledge vault with search and handoffs
12
+ - **Mesh Task Engine** — distributed task execution with Karpathy iteration (try → measure → keep/discard → retry)
13
+ - **Mechanical Enforcement** — path-scoped coding rules, dual-layer harness, role profiles with structural validation
14
+ - **Plan Pipelines** — YAML-based multi-phase workflows with dependency waves, failure cascade, and escalation recovery
15
+ - **Knowledge Server** — LLM-agnostic MCP server for semantic search over markdown (local embeddings, sqlite-vec, NATS mesh)
11
16
 
12
17
  ## Quick Start (Ubuntu)
13
18
 
@@ -25,6 +30,10 @@ The installer will:
25
30
  5. Install Mission Control and its dependencies
26
31
  6. Set up the memory daemon as a systemd user service
27
32
  7. Initialize the memory system
33
+ 8. Deploy path-scoped coding rules — installs universal rules (security, test-standards, design-docs, git-hygiene), auto-detects frameworks (Hardhat → Solidity rules, tsconfig → TypeScript rules, ProjectSettings → Unity rules), version-aware upgrades preserve user modifications
34
+ 9. Install plan templates — deploys `team-feature`, `team-bugfix`, `team-deploy` YAML pipeline templates (skips if already present)
35
+ 10. Set up Claude Code hooks + LLM-agnostic git hooks — deploys 6 lifecycle hooks (session-start, validate-commit, validate-push, pre-compact, session-stop, log-agent), symlinks `.claude/rules` → `~/.openclaw/rules/`, installs pre-commit/pre-push git hooks that delegate to the same scripts
36
+ 11. Merge enforcement settings — `jq`-based merge of `settings.json` that appends new hooks and permissions without overwriting existing user configuration
28
37
 
29
38
  ## Post-Install
30
39
 
@@ -74,16 +83,24 @@ bash uninstall.sh --purge # Remove everything including all data
74
83
  ├── openclaw.env # Your API keys and config
75
84
  ├── openclaw.json # Generated runtime config
76
85
  ├── config/ # Daemon, transcript, sync configs
86
+ ├── rules/ # Path-scoped coding rules (*.md)
87
+ ├── plan-templates/ # YAML pipeline templates
88
+ ├── harness-rules.json # Behavioral enforcement rules
77
89
  ├── souls/ # Soul definitions (daedalus, specialists)
78
90
  ├── services/ # Service reference files
79
91
  ├── workspace/
80
- │ ├── bin/ # All scripts (daemon, maintenance, etc.)
92
+ │ ├── bin/ # All scripts (daemon, mesh-agent, etc.)
93
+ │ ├── lib/ # Shared libraries (rule-loader, harness, roles, plans)
81
94
  │ ├── skills/ # 100+ skill definitions
82
95
  │ ├── memory/ # Daily logs, active tasks, archive
83
96
  │ ├── memory-vault/ # ClawVault structured knowledge
84
97
  │ ├── .boot/ # Compiled boot profiles
98
+ │ ├── .knowledge.db # Semantic search index (auto-generated)
85
99
  │ ├── .learnings/ # Corrections and lessons
86
100
  │ ├── .tmp/ # Runtime state (logs, sessions)
101
+ │ ├── .claude/
102
+ │ │ ├── hooks/ # Lifecycle hooks (session, commit, push, compact)
103
+ │ │ └── rules → ~/.openclaw/rules/ # Symlink for Claude Code native support
87
104
  │ ├── projects/
88
105
  │ │ └── mission-control/ # Next.js dashboard
89
106
  │ ├── SOUL.md # Identity
@@ -128,7 +145,7 @@ The installer auto-detects and installs these:
128
145
 
129
146
  ## Obsidian Setup
130
147
 
131
- The installer deploys the vault scaffold with 22 domain folders and the **Local REST API** plugin pre-installed. On first Obsidian launch:
148
+ The installer deploys the vault scaffold with 23 domain folders and the **Local REST API** plugin pre-installed. On first Obsidian launch:
132
149
 
133
150
  1. Obsidian will auto-download 5 missing community plugins (dataview, templater, kanban, git, graph-analysis) — requires internet
134
151
  2. Generate an API key in the Local REST API plugin settings
@@ -137,6 +154,133 @@ The installer deploys the vault scaffold with 22 domain folders and the **Local
137
154
 
138
155
  If not using Obsidian, the sync is disabled by default in `obsidian-sync.json` (set `"enabled": false`).
139
156
 
157
+ ## HyperAgent Protocol
158
+
159
+ A self-improving loop that makes any agent on any node better over time. Based on the DGM-Hyperagents framework (Zhang et al., 2026): the mechanism that generates improvements is itself subject to improvement.
160
+
161
+ ### How It Works
162
+
163
+ ```
164
+ Task completes → Telemetry logged (auto-detected pattern flags)
165
+
166
+ 5 tasks accumulate
167
+
168
+ Daemon triggers reflection (raw stats)
169
+
170
+ Agent synthesizes hypotheses + proposals (autonomous)
171
+
172
+ Human reviews proposals (safety gate)
173
+
174
+ Approved proposals update strategy archive
175
+
176
+ Next task consults strategies at start
177
+ ```
178
+
179
+ The loop is fully autonomous except for proposal approval. Telemetry, reflection, synthesis, and strategy consultation all happen without human intervention.
180
+
181
+ ### Components
182
+
183
+ | Component | Location | Purpose |
184
+ |---|---|---|
185
+ | `lib/hyperagent-store.mjs` | SQLite in `state.db` | 5 tables: telemetry, strategies, reflections, proposals, junction |
186
+ | `bin/hyperagent.mjs` | CLI | `status`, `log`, `reflect`, `strategies`, `approve`, `reject` |
187
+ | Harness rules (3) | `config/harness-rules.json` | Injected into any agent: task-close telemetry, task-start strategy lookup, reflection synthesis |
188
+ | Daemon phase | `memory-daemon.mjs` | Triggers reflection every 30min when 5+ unreflected tasks exist |
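The daemon-phase trigger in the table above can be sketched as a small predicate. This is only an illustration: the 30-minute interval and the 5-task threshold come from the table, while the function name and argument shape are assumptions, not the package's API.

```javascript
// Values taken from the README table; names are hypothetical.
const REFLECTION_INTERVAL_MS = 30 * 60 * 1000;
const MIN_UNREFLECTED_TASKS = 5;

// Returns true when the daemon phase should start a reflection run.
// `lastCheckAt` and `now` are millisecond timestamps.
function shouldTriggerReflection({ unreflectedCount, lastCheckAt, now }) {
  const intervalElapsed = now - lastCheckAt >= REFLECTION_INTERVAL_MS;
  return intervalElapsed && unreflectedCount >= MIN_UNREFLECTED_TASKS;
}
```

Both conditions must hold: a backlog of five tasks does not trigger a run before the interval has elapsed, and an elapsed interval does nothing while fewer than five tasks are unreflected.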
189
+
190
+ ### Agent-Agnostic
191
+
192
+ Works for any soul (daedalus, infra-ops, blockchain-auditor, etc.) on any node. Telemetry is tagged with `node_id` and `soul_id`. Strategies are queryable by domain. The harness rules inject into any agent session via companion-bridge.
193
+
194
+ ### Pattern Flags
195
+
196
+ Pathology detection is automatic. The store detects these flags at telemetry write time:
197
+
198
+ - `repeated-approach` — same strategy on last 3+ tasks in same domain
199
+ - `multiple-iterations` — more than 3 attempts to complete
200
+ - `always-escalated` — failed with only 1 iteration (didn't try)
201
+ - `no-meta-notes` — missing or insufficient observations
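
The four flags above could be computed roughly as follows. This is a minimal sketch, not the logic in `lib/hyperagent-store.mjs`: the telemetry field names (`domain`, `strategy`, `iterations`, `outcome`, `metaNotes`) and the 20-character threshold for "insufficient" notes are assumptions made for illustration.

```javascript
// Hypothetical telemetry shape: { domain, strategy, iterations, outcome, metaNotes }.
// `history` holds earlier entries, oldest first.
function detectPatternFlags(entry, history) {
  const flags = [];

  // repeated-approach: the current task plus the two most recent tasks
  // in the same domain all used the same strategy (3 in a row).
  const lastTwo = history.filter((t) => t.domain === entry.domain).slice(-2);
  if (lastTwo.length === 2 && lastTwo.every((t) => t.strategy === entry.strategy)) {
    flags.push("repeated-approach");
  }

  // multiple-iterations: more than 3 attempts to complete.
  if (entry.iterations > 3) flags.push("multiple-iterations");

  // always-escalated: failed after a single iteration.
  if (entry.outcome === "failed" && entry.iterations === 1) {
    flags.push("always-escalated");
  }

  // no-meta-notes: observations missing or too short (threshold is assumed).
  if ((entry.metaNotes || "").trim().length < 20) flags.push("no-meta-notes");

  return flags;
}
```

Computing the flags at write time means a later reflection pass can aggregate them with a simple query instead of re-deriving them from raw telemetry.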
202
+
203
+ ### CLI
204
+
205
+ ```bash
206
+ hyperagent status # overview
207
+ hyperagent log '<json>' # log telemetry
208
+ hyperagent strategies [--domain X] # list strategies
209
+ hyperagent reflect [--force] # trigger reflection
210
+ hyperagent reflect --pending # get pending synthesis (JSON)
211
+ hyperagent reflect --write-synthesis '<json>' # write agent synthesis
212
+ hyperagent proposals # list proposals
213
+ hyperagent approve <id> # approve (human gate)
214
+ hyperagent reject <id> [reason] # reject
215
+ hyperagent shadow <id> [--window 60] # start shadow eval
216
+ hyperagent seed-strategy '<json>' # import strategy manually
217
+ ```
218
+
219
+ ### Tests
220
+
221
+ ```bash
222
+ node test/hyperagent-store.test.js # 28 tests, no external deps
223
+ ```
224
+
225
+ ## Semantic Knowledge Search (MCP)
226
+
227
+ Local, LLM-agnostic semantic search over your markdown knowledge base. Uses vector embeddings to find documents by meaning, not just keywords.
228
+
229
+ ### How it works
230
+
231
+ The knowledge server scans markdown files in your workspace, splits them into chunks at heading boundaries, embeds each chunk with a local ONNX model (all-MiniLM-L6-v2, 384-dim), and stores the vectors in a sqlite-vec index. Queries return the most semantically similar chunks with file path, section name, relevance score, and a snippet.
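
To make the chunking step concrete, here is a hedged sketch of splitting markdown at heading boundaries. It is not the code in `lib/mcp-knowledge/core.mjs`; the function name, the `(preamble)` label, and the decision to ignore `#` lines inside code fences are assumptions.

```javascript
// Split markdown into chunks at heading boundaries.
// Each chunk keeps the heading text as its section name.
function splitByHeadings(markdown) {
  const chunks = [];
  let current = { section: "(preamble)", lines: [] };
  let inFence = false;

  const flush = () => {
    const text = current.lines.join("\n").trim();
    if (text !== "") chunks.push({ section: current.section, text });
  };

  for (const line of markdown.split("\n")) {
    // Track fenced code so that "# comment" lines are not treated as headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const heading = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (heading) {
      flush();
      current = { section: heading[2].trim(), lines: [] };
    } else {
      current.lines.push(line);
    }
  }
  flush();
  return chunks;
}
```

Each returned chunk would then be embedded and stored together with its file path, which is what lets a query report a section name alongside the snippet.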
232
+
233
+ ### Tools exposed
234
+
235
+ | Tool | Description |
236
+ |------|-------------|
237
+ | `semantic_search(query, limit)` | Find documents by meaning (e.g. "oracle threat model GPS spoofing") |
238
+ | `find_related(doc_path, limit)` | Find documents similar to a given file |
239
+ | `reindex(force)` | Re-scan and re-embed changed files |
240
+ | `knowledge_stats()` | Index statistics (doc count, chunk count, model info) |
241
+
242
+ ### Access paths
243
+
244
+ Any MCP-compatible client can use these tools. The server supports three transports:
245
+
246
+ | Transport | How | Use case |
247
+ |-----------|-----|----------|
248
+ | **stdio MCP** | Auto-starts via `.mcp.json` | Claude Code, Cursor, VS Code |
249
+ | **HTTP MCP** | `KNOWLEDGE_PORT=3100 node lib/mcp-knowledge/server.mjs` | Remote MCP clients, web UIs |
250
+ | **NATS mesh** | `mesh.tool.{nodeId}.knowledge.{method}` | Any mesh worker node |
251
+
252
+ The NATS transport means worker nodes get semantic search without needing the embedding model, database, or knowledge files locally. One index on the lead node, queried from anywhere on the mesh.
253
+
254
+ ### Configuration
255
+
256
+ Environment variables (set in `.mcp.json` env block or shell):
257
+
258
+ | Variable | Default | Description |
259
+ |----------|---------|-------------|
260
+ | `KNOWLEDGE_ROOT` | `~/.openclaw/workspace` | Directory to scan for markdown files |
261
+ | `KNOWLEDGE_DB` | `{KNOWLEDGE_ROOT}/.knowledge.db` | SQLite database path |
262
+ | `KNOWLEDGE_POLL_MS` | `300000` (5 min) | Background re-index interval |
263
+ | `KNOWLEDGE_PORT` | *(unset)* | Set to enable HTTP transport (e.g. `3100`) |
264
+ | `KNOWLEDGE_HOST` | `127.0.0.1` | HTTP bind address |
265
+ | `INCLUDE_DIRS` | `memory/,projects/,...` | Comma-separated directories to scan |
266
+
267
+ ### Performance
268
+
269
+ Benchmarked on a ~250-file workspace:
270
+
271
+ - **First index:** ~90s (one-time, downloads 23MB ONNX model on first run)
272
+ - **Incremental reindex:** <1s (SHA-256 content hashing, only re-embeds changed files)
273
+ - **Query latency:** 3-14ms
274
+ - **Database size:** ~22MB for 6,500 chunks
275
+
276
+ ### Running tests
277
+
278
+ ```bash
279
+ cd lib/mcp-knowledge
280
+ node test.mjs
281
+ # 98 assertions across 12 test groups
282
+ ```
283
+
140
284
  ## Mesh Network (Multi-Node)
141
285
 
142
286
  The installer detects Tailscale and optionally deploys a full mesh network across multiple machines. When enabled, nodes share files, execute remote commands, and broadcast session lifecycle events via NATS.
@@ -161,11 +305,510 @@ mesh exec "cmd" # run command on remote node
161
305
 
162
306
  - **NATS** — message bus for commands, heartbeats, file sync (runs on Ubuntu)
163
307
  - **Agent v3** — polling-based shared folder sync over NATS (`~/openclaw/shared/`)
164
- - **Memory Bridge** — broadcasts session lifecycle events across nodes (`mesh-bridge.mjs`)
308
+ - **Memory Bridge** — broadcasts session lifecycle events across nodes (`mesh-bridge.js`)
309
+ - **Knowledge Server** — semantic search via NATS (`mesh.tool.{nodeId}.knowledge.*`), workers query lead's index
165
310
  - **Tailscale** — encrypted WireGuard tunnel between nodes
311
+ - **Agent Activity Monitor** (`lib/agent-activity.js`) — zero-cost agent state detection via Claude Code JSONL session files (active, ready, idle, blocked)
312
+ - **Memory Budget** (`lib/memory-budget.mjs`) — character budget enforcement for MEMORY.md with freeze/thaw semantics per session
313
+ - **Mesh Registry** (`lib/mesh-registry.js`) — NATS KV-backed tool registry for discovering and calling remote tools across nodes
166
314
 
167
315
  The mesh is optional. Without Tailscale, everything runs as a standalone single node.
168
316
 
317
+ ## Mechanical Enforcement
318
+
319
+ The enforcement layer operates independently of the LLM backend. Rules are prompt-injected (soft enforcement) AND mechanically validated (hard enforcement). If the LLM ignores a rule, the mechanical check catches it.
320
+
321
+ ### Three-Layer Prompt Injection
322
+
323
+ Every mesh agent task receives context from three independent sources, injected in order:
324
+
325
+ 1. **Coding rules** (`~/.openclaw/rules/*.md`) — path-scoped technical standards. A task touching `contracts/Token.sol` auto-gets Solidity rules (reentrancy guards, events on state changes). Rules match via glob patterns in frontmatter.
326
+ 2. **Harness rules** (`harness-rules.json`) — universal behavioral constraints. "Never declare done without running tests." "Never silently swallow errors." Each rule has both a prompt injection AND a mechanical enforcement mapping.
327
+ 3. **Role profiles** (`config/roles/*.yaml`) — domain-specific responsibilities, must-not boundaries, thinking frameworks, and escalation maps. A `solidity-dev` role knows to check for test coverage and emit events.
328
+
329
+ ### Mechanical Checks (post-execution, pre-commit)
330
+
331
+ After the LLM exits and before results are committed:
332
+
333
+ | Check | What it does | Blocks on failure |
334
+ |---|---|---|
335
+ | **Scope enforcement** | `git diff` vs `task.scope` — reverts files outside allowed paths | Yes (revert + retry) |
336
+ | **Forbidden patterns** | Role-defined regex on changed files (e.g., hardcoded addresses in `.sol`) | Yes (violation + retry) |
337
+ | **Secret scanning** | gitleaks/trufflehog/regex on staged changes | Yes (block commit) |
338
+ | **Output block patterns** | Regex on LLM stdout for dangerous commands (`rm -rf`, `sudo`) | Yes (block completion) |
339
+ | **Error pattern scan** | Detects error/exception patterns in metric-less task output | Warning (forces review) |
340
+ | **Required outputs** | Role-defined structural checks (test files exist, events emitted) | Forces review |
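
As an illustration of the "output block patterns" row, the sketch below scans LLM stdout line by line against a small block list. The two patterns mirror the examples in the table (`rm -rf`, `sudo`); the list itself, the function name, and the result shape are assumptions rather than the contents of `harness-rules.json`.

```javascript
// Hypothetical block list; real patterns would come from configuration.
const OUTPUT_BLOCK_PATTERNS = [
  { id: "rm-rf", regex: /\brm\s+-\w*(?:rf|fr)\w*/ },
  { id: "sudo", regex: /\bsudo\s+\S+/ },
];

// Returns every matching line so the caller can report why completion was blocked.
function scanOutput(stdout) {
  const violations = [];
  stdout.split("\n").forEach((line, index) => {
    for (const { id, regex } of OUTPUT_BLOCK_PATTERNS) {
      if (regex.test(line)) {
        violations.push({ id, line: index + 1, text: line.trim() });
      }
    }
  });
  return { blocked: violations.length > 0, violations };
}
```

The regexes deliberately omit the `g` flag, so `test()` carries no `lastIndex` state between lines.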
341
+
342
+ ### Coding Rules
343
+
344
+ Rules live in `~/.openclaw/rules/` as markdown files with YAML frontmatter:
345
+
346
+ ```yaml
347
+ ---
348
+ id: solidity
349
+ version: 1.0.0
350
+ tier: framework # universal | framework | project
351
+ paths: ["contracts/**", "**/*.sol"]
352
+ detect: ["hardhat.config.js", "foundry.toml"]
353
+ priority: 80
354
+ ---
355
+ # Solidity Standards
356
+ - Reentrancy guards on all external calls
357
+ - Events on every state change
358
+ - checks-effects-interactions pattern
359
+ ```
360
+
361
+ Three tiers with precedence: `project > framework > universal`. Framework rules auto-activate when the installer detects matching config files. Version-aware upgrades preserve user modifications.
362
+
363
+ ### Rule Loader (`lib/rule-loader.js`)
364
+
365
+ The rule loader is a zero-dependency engine that:
366
+
367
+ 1. **Parses YAML frontmatter** from markdown rule files (custom parser, no `js-yaml` required)
368
+ 2. **Matches rules to file paths** using glob patterns (`*`, `**`, `?`, `{a,b}` brace expansion)
369
+ 3. **Sorts by tier + priority** — project rules (weight 20) override framework (10) override universal (0)
370
+ 4. **Auto-detects frameworks** — scans for `hardhat.config.js` → activates Solidity rules, `tsconfig.json` → TypeScript rules, `ProjectSettings/` → Unity rules
371
+ 5. **Caps prompt injection** at 4,000 characters to avoid context budget blowout
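
The matching, sorting, and capping steps above can be sketched as follows. This is an approximation under stated assumptions, not `lib/rule-loader.js`: the tier weights come from the list, but the rule object shape (`paths`, `tier`, `priority`, `body`), ordering by tier first and then priority, and dropping whole rules once the budget is reached are all guesses.

```javascript
const TIER_WEIGHT = { universal: 0, framework: 10, project: 20 };

function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Translate a glob with *, **, ? and {a,b} into an anchored RegExp.
function globToRegExp(glob) {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    const ch = glob[i];
    if (ch === "*" && glob[i + 1] === "*") {
      // "**/" matches zero or more directories; a bare "**" matches anything.
      if (glob[i + 2] === "/") { source += "(?:.*/)?"; i += 3; }
      else { source += ".*"; i += 2; }
    } else if (ch === "*") {
      source += "[^/]*"; i += 1;
    } else if (ch === "?") {
      source += "[^/]"; i += 1;
    } else if (ch === "{" && glob.indexOf("}", i) !== -1) {
      const close = glob.indexOf("}", i);
      const alternatives = glob.slice(i + 1, close).split(",").map(escapeRegExp);
      source += "(?:" + alternatives.join("|") + ")";
      i = close + 1;
    } else {
      source += escapeRegExp(ch); i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

// Pick the rules that apply to a file, highest tier and priority first,
// keeping whole rules until the character budget is exhausted.
function selectRules(rules, filePath, maxChars = 4000) {
  const matching = rules
    .filter((rule) => rule.paths.some((p) => globToRegExp(p).test(filePath)))
    .sort((a, b) =>
      TIER_WEIGHT[b.tier] - TIER_WEIGHT[a.tier] || b.priority - a.priority);

  const selected = [];
  let used = 0;
  for (const rule of matching) {
    if (used + rule.body.length > maxChars) break;
    selected.push(rule);
    used += rule.body.length;
  }
  return selected;
}
```

With a universal rule scoped to `**` and a framework rule scoped to `contracts/**`, a task touching `contracts/Token.sol` would receive the framework rule first and the universal rule second.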
372
+
373
+ **Shipped rules:**
374
+
375
+ | Tier | Rule | Auto-detects |
376
+ |------|------|-------------|
377
+ | Universal | `security.md` | Always active |
378
+ | Universal | `test-standards.md` | Always active |
379
+ | Universal | `design-docs.md` | Always active |
380
+ | Universal | `git-hygiene.md` | Always active |
381
+ | Framework | `solidity.md` | `hardhat.config.js`, `foundry.toml` |
382
+ | Framework | `typescript.md` | `tsconfig.json` |
383
+ | Framework | `unity.md` | `ProjectSettings/`, `Assets/` |
384
+
385
+ ### Rule Injection into Agents
386
+
387
+ When `mesh-agent.js` builds a prompt for any task, it calls `findRulesByScope(task.scope)` and injects matching rules into all three prompt paths:
388
+
389
+ - `buildInitialPrompt()` — first attempt
390
+ - `buildRetryPrompt()` — retry after failure
391
+ - `buildCollabPrompt()` — collaborative session
392
+
393
+ Rules are injected between the task description and the metric/success criteria, so the agent sees them as constraints on how to approach the work.
394
+
395
+ ### Role Profiles
396
+
397
+ Roles define domain-specific agent behavior with mechanical validation:
398
+
399
+ ```yaml
400
+ # config/roles/solidity-dev.yaml
401
+ id: solidity-dev
402
+ responsibilities:
403
+ - "Implement smart contract logic per specification"
404
+ - "Write comprehensive test coverage for all state transitions"
405
+ must_not:
406
+ - "Modify deployment scripts without explicit delegation"
407
+ - "Hardcode addresses — resolve through ArcaneKernel"
408
+ required_outputs:
409
+ - type: file_match
410
+ pattern: "test/**/*.test.js"
411
+ description: "Test file must accompany any contract change"
412
+ forbidden_patterns:
413
+ - pattern: "0x[a-fA-F0-9]{40}"
414
+ in: "contracts/**/*.sol"
415
+ description: "No hardcoded addresses"
416
+ scope_paths: ["contracts/**", "test/**"]
417
+ escalation:
418
+ on_metric_failure: qa-engineer
419
+ on_budget_exceeded: tech-architect
420
+ framework:
421
+ name: "Checks-Effects-Interactions"
422
+ prompt: "Structure all external calls using CEI pattern..."
423
+ ```
424
+
425
+ Roles auto-assign from task scope: a task with `scope: ["contracts/Token.sol"]` gets `role: solidity-dev` because that path matches the `contracts/**` glob in the role's `scope_paths`.
426
+
427
+ ## Plan Pipelines
428
+
429
+ Multi-phase workflows defined as YAML templates. Plans decompose into subtasks dispatched across mesh agents in dependency waves.
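
"Dependency waves" can be read as a level-by-level topological ordering: every subtask whose dependencies are already satisfied goes into the next wave. The sketch below shows that idea only; the function name is hypothetical, and the actual wave computation lives in `lib/mesh-plans.js`.

```javascript
// Group subtasks into waves. A subtask joins a wave once all of its
// `depends_on` entries have been placed in earlier waves.
function computeWaves(subtasks) {
  const remaining = new Map(
    subtasks.map((s) => [s.id, new Set(s.depends_on || [])])
  );
  const waves = [];

  while (remaining.size > 0) {
    const ready = [...remaining.keys()].filter(
      (id) => remaining.get(id).size === 0
    );
    if (ready.length === 0) {
      throw new Error("circular or unresolved dependency");
    }
    waves.push(ready);
    for (const id of ready) remaining.delete(id);
    for (const deps of remaining.values()) {
      for (const id of ready) deps.delete(id);
    }
  }
  return waves;
}
```

Subtasks inside one wave have no dependencies on each other, so they can be dispatched to different mesh agents in parallel.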
430
+
431
+ ### Usage
432
+
433
+ ```bash
434
+ # List available templates
435
+ mesh plan templates
436
+
437
+ # Create a plan from template
438
+ mesh plan create --template team-feature --context "Add token expiry logic"
439
+
440
+ # Inspect the full subtask tree before approving
441
+ mesh plan show PLAN-xxx
442
+
443
+ # Override template defaults
444
+ mesh plan create --template team-feature --context "..." \
445
+ --set implement.delegation.mode=collab_mesh \
446
+ --set test.budget_minutes=30
447
+
448
+ # Approve and start execution
449
+ mesh plan approve PLAN-xxx
450
+
451
+ # Monitor progress
452
+ mesh plan list --status executing
453
+ mesh plan show PLAN-xxx
454
+ ```
455
+
456
+ ### Shipped Templates
457
+
458
+ | Template | Phases | Failure Policy |
459
+ |---|---|---|
460
+ | `team-feature` | Design → Architecture Review → Implement → Test → Code Review | `abort_on_critical_fail` |
461
+ | `team-bugfix` | Reproduce → Diagnose → Fix → Regression Test | `abort_on_first_fail` |
462
+ | `team-deploy` | Pre-flight → Deploy → Smoke Test → Monitor | `abort_on_first_fail` |
463
+
464
+ ### Plan Templates (`lib/plan-templates.js`)
465
+
466
+ Templates are YAML files in `~/.openclaw/plan-templates/` that define reusable multi-phase workflows. The template engine:
467
+
468
+ 1. **Loads and validates** template structure (phases, subtasks, dependency IDs)
469
+ 2. **Detects circular dependencies** via DFS — rejects templates with cycles
470
+ 3. **Substitutes variables** — `{{context}}` gets the user's task description, `{{vars.key}}` for custom variables
471
+ 4. **Validates delegation modes** — only `solo_mesh`, `collab_mesh`, `local`, `soul`, `human`, `auto` allowed
472
+ 5. **Instantiates into executable plans** via `lib/mesh-plans.js` with wave computation and auto-routing
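
Steps 2 and 3 above can be illustrated with a short sketch. It assumes subtasks shaped like `{ id, depends_on }` and leaves unknown `{{vars.key}}` placeholders untouched; both choices, and the function names, are assumptions rather than the behavior of `lib/plan-templates.js`.

```javascript
// Depth-first search for a dependency cycle.
// Returns the cycle as a list of ids (first id repeated at the end), or null.
function findCycle(subtasks) {
  const deps = new Map(subtasks.map((s) => [s.id, s.depends_on || []]));
  const state = new Map(); // id -> "visiting" | "done"
  const stack = [];

  function visit(id) {
    if (state.get(id) === "done") return null;
    if (state.get(id) === "visiting") {
      return [...stack.slice(stack.indexOf(id)), id];
    }
    state.set(id, "visiting");
    stack.push(id);
    for (const dep of deps.get(id) || []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(id, "done");
    return null;
  }

  for (const id of deps.keys()) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}

// Replace {{context}} and {{vars.key}} placeholders in a template string.
function substitute(text, context, vars = {}) {
  return text
    .replace(/\{\{\s*context\s*\}\}/g, () => context)
    .replace(/\{\{\s*vars\.([\w-]+)\s*\}\}/g, (match, key) =>
      Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
    );
}
```

Returning the offending path rather than a boolean lets the rejection message name the subtasks involved in the cycle.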
473
+
474
+ ### Approval Gate
475
+
476
+ Tasks auto-compute whether human review is required:
477
+
478
+ | Delegation Mode | Has Metric | Review Required |
479
+ |---|---|---|
480
+ | `solo_mesh` | Yes | No (metric IS the approval) |
481
+ | `solo_mesh` | No | Yes |
482
+ | `soul` | Any | Yes |
483
+ | `collab_mesh` | No | Yes |
484
+ | `human` | Any | Yes (by definition) |
485
+
486
+ Tasks in `pending_review` block wave advancement — downstream subtasks don't dispatch until the review is completed via `mesh task approve <id>`.
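
The approval table can be encoded as a lookup. The sketch below covers only the rows listed above; treating every unlisted combination (for example `collab_mesh` with a metric, or `local` and `auto`) as requiring review is a conservative assumption, and the function names are hypothetical.

```javascript
// Keys are "<delegation mode>:<has metric>", values are "review required".
const REVIEW_TABLE = {
  "solo_mesh:true": false, // the metric is the approval
  "solo_mesh:false": true,
  "soul:true": true,
  "soul:false": true,
  "collab_mesh:false": true,
  "human:true": true,
  "human:false": true,
};

function requiresReview(mode, hasMetric) {
  const key = `${mode}:${Boolean(hasMetric)}`;
  // Assumption: combinations missing from the table default to review.
  return key in REVIEW_TABLE ? REVIEW_TABLE[key] : true;
}

// A wave cannot advance while any of its tasks is waiting for review.
function isWaveBlockedByReview(waveTasks) {
  return waveTasks.some((task) => task.status === "pending_review");
}
```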
487
+
488
+ ### Failure Policies
489
+
490
+ Each plan declares a `failure_policy` that controls what happens when a subtask fails:
491
+
492
+ | Policy | Behavior |
493
+ |--------|----------|
494
+ | `continue_best_effort` | Skip failed subtask, continue with non-dependent waves |
495
+ | `abort_on_first_fail` | Abort entire plan on any failure |
496
+ | `abort_on_critical_fail` | Abort only if the failed subtask has `critical: true` |
497
+
498
+ Subtasks can be marked `critical: true` to indicate their failure should trigger plan abort under the `abort_on_critical_fail` policy.
499
+
500
+ ### Failure Cascade and Escalation
501
+
502
+ When a subtask fails:
503
+ 1. **Cascade**: BFS blocks all transitive dependents (follows `depends_on` graph)
504
+ 2. **Blocked-critical check**: if any blocked subtask is `critical: true`, abort the plan
505
+ 3. **Escalation**: if the role defines an escalation target, create a recovery task
506
+ 4. **Recovery**: if the escalation task succeeds, override FAILED → COMPLETED and unblock dependents
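
Steps 1 and 2 amount to a breadth-first walk over the reversed `depends_on` graph. The following is a minimal sketch with a hypothetical function name and return shape; escalation and recovery are left out.

```javascript
// Block every transitive dependent of a failed subtask and report
// whether any blocked subtask is critical (which would abort the plan).
function cascadeFailure(subtasks, failedId) {
  // Reverse edges: dependency id -> ids of subtasks that depend on it.
  const dependents = new Map();
  for (const subtask of subtasks) {
    for (const dep of subtask.depends_on || []) {
      if (!dependents.has(dep)) dependents.set(dep, []);
      dependents.get(dep).push(subtask.id);
    }
  }

  const byId = new Map(subtasks.map((s) => [s.id, s]));
  const blocked = [];
  const seen = new Set([failedId]);
  const queue = [failedId];

  while (queue.length > 0) {
    const id = queue.shift();
    for (const next of dependents.get(id) || []) {
      if (seen.has(next)) continue;
      seen.add(next);
      blocked.push(next);
      queue.push(next);
    }
  }

  const abort = blocked.some((id) => byId.get(id).critical === true);
  return { blocked, abort };
}
```

Subtasks with no path from the failed one stay untouched, which is what allows a `continue_best_effort` plan to keep running its independent waves.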
507
+
508
+ ### Plan-Task Back-References
509
+
510
+ Each mesh task carries `plan_id` and `subtask_id` fields that link back to the parent plan. This enables O(1) plan progress checks: when a task completes, stalls, or exceeds budget, the daemon looks up the plan directly instead of scanning all plans. The functions in the daemon's enforcement loop (`checkPlanProgress`, `detectStalls`, `enforceBudgets`) all use these back-references to trigger cascade and wave advancement efficiently.
511
+
512
+ ### Heterogeneous Collaboration
513
+
514
+ Collab tasks can assign different souls to different nodes:
515
+
516
+ ```yaml
517
+ delegation:
518
+ mode: collab_mesh
519
+ collaboration:
520
+ mode: review
521
+ node_roles:
522
+ - soul: blockchain-auditor # primary executor
523
+ - soul: identity-architect # consultant
524
+ convergence: unanimous
525
+ ```
526
+
527
+ Both souls produce reflections. The shared intel compilation includes both perspectives.
528
+
529
+ ### Circling Strategy (Asymmetric Multi-Agent Review)
530
+
531
+ A directed collaboration mode where 3 agents — 1 Worker and 2 Reviewers — iterate through structured sub-rounds of work, review, and integration. Each agent sees only what the protocol decides it should see at each step, creating cognitive separation that prevents groupthink.
532
+
533
+ **Architecture:** Five layers with zero coupling:
534
+
535
+ ```
536
+ lib/circling-parser.js (parsing) Delimiter-based LLM output parser
537
+ bin/mesh-agent.js (execution) Prompt construction, LLM calls
538
+ bin/mesh-task-daemon.js (orchestration) NATS handlers, step lifecycle, timeouts
539
+ lib/mesh-collab.js (state) Session schema, artifact store, state machine
540
+ bin/mesh-bridge.js (human UI) Kanban materialization, gate messages
541
+ ```
542
+
543
+ **Workflow:**
544
+
545
+ ```
546
+ Task → RECRUITING (3 nodes join, roles assigned)
547
+ → INIT (Worker: workArtifact v0, Reviewers: reviewStrategy)
548
+ → SUB-ROUND LOOP (SR1..SRN):
549
+ Step 1 — Review Pass:
550
+ Worker analyzes review strategies (+ review findings in SR2+)
551
+ Reviewers review workArtifact using their strategy
552
+ Step 2 — Integration:
553
+ Worker judges each finding (ACCEPT/REJECT/MODIFY), updates artifact
554
+ Reviewers refine strategy using Worker feedback + cross-review
555
+ → FINALIZATION (Worker: final artifact + completionDiff, Reviewers: vote)
556
+ → COMPLETE (or gate → human approve/reject → loop)
557
+ ```
558
+
559
+ **Key features:**
560
+ - **Directed handoffs** — each node sees only its role-specific inputs per step (information flow matrix enforced by `compileDirectedInput`)
561
+ - **Cross-review** — in Step 2, Reviewer A sees Reviewer B's findings and vice versa, enabling inter-reviewer learning
562
+ - **Adaptive convergence** — if all nodes vote `converged` after step 2, skips remaining sub-rounds and goes directly to finalization
563
+ - **Stored role identities** — `worker_node_id`, `reviewerA_node_id`, `reviewerB_node_id` assigned once at recruiting close, stable for session lifetime
564
+ - **Dual-layer timeouts** — in-memory timers (fast, per-step) + periodic cron sweep every 60s (survives daemon restart via `step_started_at` in JetStream KV)
565
+ - **Tiered human gates** — Tier 1: fully autonomous. Tier 2: gate on finalization. Tier 3: gate every sub-round. Blocked votes always gate.
566
+ - **Delimiter-based parsing** — `===CIRCLING_ARTIFACT===` / `===END_ARTIFACT===` delimiters instead of JSON (LLMs produce reliable delimiter-separated output). Parser extracted to standalone `lib/circling-parser.js` (zero deps, shared by agent and tests).
567
+ - **Anti-preamble prompt hardening** — explicit instruction prevents LLM prose from contaminating code artifacts
568
+ - **Session blob monitoring** — warns at 800KB, critical at 950KB (JetStream KV max 1MB). KV write failures caught and recovered (artifact removed, session re-persisted).
569
+ - **Recruiting guard** — validates 1 worker + 2 reviewers before starting. `min_nodes` defaults to 3 for circling mode.
570
+
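The delimiter-based parsing feature can be illustrated with a minimal sketch. The function name `extractArtifact` is hypothetical and the actual API of `lib/circling-parser.js` may differ; only the two delimiter strings come from the text above.

```javascript
// Hypothetical helper: the real lib/circling-parser.js may expose a different API.
const START_DELIMITER = '===CIRCLING_ARTIFACT===';
const END_DELIMITER = '===END_ARTIFACT===';

function extractArtifact(llmOutput) {
  const start = llmOutput.indexOf(START_DELIMITER);
  if (start === -1) return null; // no artifact in this output

  const bodyStart = start + START_DELIMITER.length;
  const end = llmOutput.indexOf(END_DELIMITER, bodyStart);
  if (end === -1) return null; // unterminated artifact is treated as missing

  // Anything before the start delimiter (preamble) or after the end
  // delimiter (closing prose) is discarded.
  return llmOutput.slice(bodyStart, end).trim();
}
```

Because the function only searches for two fixed strings, prose that an LLM adds around the artifact does not affect the extracted body.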
571
+ **Information flow matrix — what each node receives:**
572
+
573
+ | Phase | Worker Receives | Reviewers Receive |
574
+ |-------|----------------|-------------------|
575
+ | Init | Task plan | Task plan |
576
+ | Step 1 (SR1) | Both reviewStrategies | workArtifact |
577
+ | Step 1 (SR2+) | Both strategies + review findings* | workArtifact + reconciliationDoc |
578
+ | Step 2 | Both reviewArtifacts | workerReviewsAnalysis + other reviewer's cross-review* |
579
+ | Finalization | Task plan + final workArtifact | Task plan + final workArtifact |
580
+
581
+ `*` = optional (silently skipped if null)
582
+
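One way to picture the matrix above is as a lookup table keyed by phase and role. This is only a sketch: the function name `directedInput`, the session field names, and the choice to throw on a missing required input are assumptions, not the behavior of `compileDirectedInput`.

```javascript
// Hypothetical encoding of the information flow matrix.
// A trailing "*" marks an optional input, as in the table.
const MATRIX = {
  init: { worker: ['taskPlan'], reviewer: ['taskPlan'] },
  'step1:first': { worker: ['reviewStrategies'], reviewer: ['workArtifact'] },
  'step1:later': {
    worker: ['reviewStrategies', 'reviewFindings*'],
    reviewer: ['workArtifact', 'reconciliationDoc'],
  },
  step2: {
    worker: ['reviewArtifacts'],
    reviewer: ['workerReviewsAnalysis', 'crossReview*'],
  },
  finalization: {
    worker: ['taskPlan', 'workArtifact'],
    reviewer: ['taskPlan', 'workArtifact'],
  },
};

function directedInput(session, phase, role, subRound = 1) {
  // Step 1 has two rows in the table: SR1 and SR2+.
  const row = phase === 'step1' ? (subRound === 1 ? 'step1:first' : 'step1:later') : phase;
  const input = {};
  for (const entry of MATRIX[row][role]) {
    const optional = entry.endsWith('*');
    const name = optional ? entry.slice(0, -1) : entry;
    const value = session[name] ?? null;
    if (value === null) {
      if (optional) continue; // starred inputs are silently skipped
      throw new Error(`missing required input "${name}" for ${role} in ${row}`); // assumption
    }
    input[name] = value;
  }
  return input;
}
```

A node only ever receives the fields listed for its role in the current row, so a reviewer in Step 1 never sees the other reviewer's strategy.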
583
+ **State machine:**
584
+
585
+ ```
586
+ [init] → [circling/SR1/step1] → [step2] → [SR2/step1] → ... → [finalization] → [complete]
586
+                                                ↑                    |
587
+                                       gate reject: max_subrounds++ ─┘   (all converged)
589
+ ```
590
+
591
+ **Gate behavior:**
592
+ - Tier 2+: gates on finalization entry
593
+ - Tier 3: also gates after every sub-round
594
+ - Blocked votes in finalization: always gate, reviewer reason shown on kanban (`[GATE] SR2 blocked — reentrancy guard missing on withdraw function`)
595
+
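The gate rules above reduce to a small decision function. The sketch below is illustrative: `shouldGate` and the phase names `finalization` and `subround_end` are hypothetical, and the daemon's actual checks may be structured differently.

```javascript
// Hypothetical decision helper for the tiered gate rules.
function shouldGate({ tier, phase, hasBlockedVote }) {
  // A blocked vote gates regardless of automation tier.
  if (hasBlockedVote) return true;
  // Tier 2 and above gate when entering finalization.
  if (phase === 'finalization') return tier >= 2;
  // Tier 3 additionally gates after every sub-round.
  if (phase === 'subround_end') return tier >= 3;
  return false;
}
```

For example, a Tier 1 session runs without gates unless a reviewer casts a blocked vote, in which case `shouldGate` returns `true`.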
596
+ **Usage:**
597
+
598
+ ```yaml
599
+ delegation:
600
+   mode: collab_mesh
601
+ collaboration:
602
+   mode: circling_strategy
603
+   min_nodes: 3
604
+   max_subrounds: 3
605
+   automation_tier: 2
606
+   node_roles:
607
+     - role: worker
608
+       soul: solidity-dev
609
+     - role: reviewer
610
+       soul: blockchain-auditor
611
+     - role: reviewer
612
+       soul: qa-engineer
613
+ ```
614
+
615
+ **Tests:**
616
+
617
+ ```bash
618
+ # All circling tests (93 tests, no external deps)
619
+ node --test test/collab-circling.test.js test/daemon-circling-handlers.test.js test/circling-comprehensive.test.js
620
+ ```
621
+
622
+ Full implementation reference: `docs/circling-strategy-implementationV3.md`
623
+
624
+ ## Lifecycle Hooks
625
+
626
+ Six hooks are wired into Claude Code lifecycle events, plus dual-wired git hooks for LLM-agnostic enforcement:
627
+
628
+ | Hook | Trigger | What it does |
629
+ |---|---|---|
630
+ | `session-start.sh` | SessionStart | Loads git state, active tasks, companion state, last session recap |
631
+ | `validate-commit.sh` | PreToolUse (Bash) | Blocks secrets, validates JSON, warns on bare TODOs, checks commit format |
632
+ | `validate-push.sh` | PreToolUse (Bash) | Warns on force-push and protected branch pushes |
633
+ | `pre-compact.sh` | PreCompact | Preserves session state before context compression |
634
+ | `session-stop.sh` | Stop | Logs session end to daily memory file |
635
+ | `log-agent.sh` | SubagentStart | Audit trail of every subagent spawn |
636
+
637
+ Git hooks (`pre-commit`, `pre-push`) delegate to the same scripts — enforcement works regardless of IDE or AI tool.
638
+
639
+ ---
640
+
641
+ ## Distributed Mission Control
642
+
643
+ Mission Control runs on **every node** in the mesh. Each instance operates independently against its own local SQLite database, while staying in sync through NATS JetStream KV buckets. This means any node can view all mesh tasks, and worker nodes get their own full MC dashboard instead of being headless executors.
644
+
645
+ ### How It Works
646
+
647
+ The system has two layers:
648
+
649
+ **Layer 1 — KV Mirror (read visibility):** Every MC instance watches NATS KV bucket `MESH_TASKS` in real-time. When the lead creates, updates, or completes a task, all connected MC instances see the change within milliseconds. Worker nodes display these tasks as read-only cards in the Kanban.
650
+
651
+ **Layer 2 — Sync Engine (write participation):** Worker nodes can *propose* new tasks to the mesh. Proposals land in the KV bucket with `status: proposed`. The lead's task daemon validates proposals within its 30-second enforcement loop and transitions them to `queued` (accepted) or `rejected`. Once queued, any node with the `claim` capability can execute the task.
652
+
653
+ ```
654
+               NATS KV: MESH_TASKS
655
+             ┌─────────────────────┐
656
+             │  T-001: running     │
657
+             │  T-002: queued      │
658
+             │  T-003: proposed    │
659
+             └────────┬────────────┘
660
+                      │
661
+       ┌──────────────┼──────────────┐
662
+       │              │              │
663
+ ┌─────┴─────┐  ┌─────┴─────┐  ┌─────┴─────┐
664
+ │  Lead MC  │  │ Worker MC │  │ Worker MC │
665
+ │           │  │           │  │           │
666
+ │  SQLite   │  │  SQLite   │  │  SQLite   │
667
+ │ (primary) │  │ (mirror)  │  │ (mirror)  │
668
+ │           │  │           │  │           │
669
+ │ Read/Write│  │ Read +    │  │ Read +    │
670
+ │ + Approve │  │ Propose   │  │ Propose   │
671
+ └───────────┘  └───────────┘  └───────────┘
672
+ ```
673
+
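The Layer 2 proposal step can be sketched as a single pass over the task set. This is an illustration under stated assumptions: `processProposals` and the `hasTitle` validation rule are hypothetical, and a plain `Map` stands in for the `MESH_TASKS` bucket. The real daemon's validation criteria are not described here.

```javascript
// Hypothetical single pass of the enforcement loop over an in-memory task map.
function processProposals(tasks, isValid) {
  const decisions = [];
  for (const task of tasks.values()) {
    if (task.status !== 'proposed') continue; // only proposals are touched
    const status = isValid(task) ? 'queued' : 'rejected';
    tasks.set(task.id, { ...task, status });
    decisions.push({ id: task.id, status });
  }
  return decisions;
}

// Placeholder validation rule (an assumption, not the daemon's actual check).
const hasTitle = (task) => typeof task.title === 'string' && task.title.trim() !== '';

const tasks = new Map([
  ['T-001', { id: 'T-001', title: 'Deploy vault', status: 'running' }],
  ['T-002', { id: 'T-002', title: 'Add fuzz tests', status: 'queued' }],
  ['T-003', { id: 'T-003', title: 'Audit withdraw()', status: 'proposed', origin: 'worker-1' }],
]);
const decisions = processProposals(tasks, hasTitle);
```

A daemon would call a function like this from its 30-second loop, so a proposal waits at most one loop interval before it is accepted or rejected.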
674
+ ### Data Flow
675
+
676
+ 1. **Lead creates a task** via MC UI or agent dispatch
677
+ - Task saved to local SQLite (primary)
678
+ - Task written to `MESH_TASKS` KV bucket
679
+ - SSE event broadcast to UI
680
+ - All other MC instances receive the KV watch event and update their local mirrors
681
+
682
+ 2. **Worker proposes a task** via `POST /api/mesh/tasks`
683
+ - Task written to KV with `status: proposed`, `origin: <worker-node-id>`
684
+ - Lead's `mesh-task-daemon` picks it up in the next enforcement loop (< 30s)
685
+ - Daemon validates and transitions: `proposed` → `queued` (or `rejected`)
686
+ - Worker's MC sees the status change via KV watch
687
+
688
+ 3. **Worker reads mesh state** via `GET /api/mesh/tasks`
689
+ - Returns all tasks from NATS KV (not local SQLite)
690
+ - UI merges KV tasks with local SQLite tasks (dedup by task ID)
691
+ - On workers: KV version preferred (more current for mesh tasks)
692
+ - On lead: SQLite version preferred (has richer fields like `kanbanColumn`, `sortOrder`)
693
+
694
+ 4. **Anyone updates a task** via `PATCH /api/mesh/tasks/:id`
695
+ - Authority check: only `lead` can transition most states
696
+ - Workers can update tasks they own (`origin` matches)
697
+ - Uses CAS (Compare-And-Swap) to prevent stale writes — the `revision` field must match
698
+ - On revision mismatch: HTTP 409 with the current state, so the client can retry
699
+
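The merge rule in step 3 (dedup by task ID, with the preferred source depending on node role) can be sketched as follows. `mergeTasks` is a hypothetical name and the task objects are simplified; the UI's actual merge code may differ.

```javascript
// Hypothetical merge: dedup by task ID, letting the preferred source win.
function mergeTasks(kvTasks, sqliteTasks, role) {
  // Lead prefers SQLite (richer fields); workers prefer KV (more current).
  const [base, preferred] = role === 'lead'
    ? [kvTasks, sqliteTasks]
    : [sqliteTasks, kvTasks];

  const merged = new Map();
  for (const task of base) merged.set(task.id, task);
  for (const task of preferred) merged.set(task.id, task); // overwrites duplicates
  return [...merged.values()];
}
```

Tasks that exist in only one source are kept unchanged, so purely local tasks and mesh-only tasks both appear in the merged list.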
700
+ ### Authority Model
701
+
702
+ The system enforces explicit authority boundaries:
703
+
704
+ | Action | Who Can Do It | Mechanism |
705
+ |--------|--------------|-----------|
706
+ | Create local task | Lead only | Direct SQLite + KV write |
707
+ | Propose mesh task | Any node | KV write with `status: proposed` |
708
+ | Accept/reject proposal | Lead only | Daemon enforcement loop |
709
+ | Claim a queued task | Any node | CAS on KV (first writer wins) |
710
+ | Complete a task | Task owner only | CAS with `origin` check |
711
+ | Approve (mark done) | Human's node only | `approve` capability gate |
712
+ | View all tasks | Any node | KV watch + local mirror |
713
+
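The table above can be read as a set of per-action predicates. The sketch below uses hypothetical action names and a hypothetical node shape (`id`, `role`, `capabilities`); it is not the check implemented in the API routes.

```javascript
// Hypothetical rule table mirroring the authority model.
const RULES = {
  create_local: (node) => node.role === 'lead',
  propose: () => true,
  review_proposal: (node) => node.role === 'lead',
  claim: () => true, // the CAS write decides which claimant wins
  complete: (node, task) => task.origin === node.id,
  approve: (node) => node.capabilities.includes('approve'),
  view: () => true,
};

function canPerform(action, node, task) {
  // Unknown actions are denied rather than allowed by default.
  if (!Object.hasOwn(RULES, action)) return false;
  return RULES[action](node, task);
}
```

Keeping the rules in one table makes it straightforward to compare the code against the documented authority boundaries.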
714
+ ### Key Files
715
+
716
+ ```
717
+ mission-control/
717
+ ├── src/
718
+ │   ├── app/api/mesh/
719
+ │   │   ├── tasks/
720
+ │   │   │   ├── route.ts        # GET (list from KV) + POST (propose)
721
+ │   │   │   └── [id]/route.ts   # GET (single) + PATCH (CAS update)
722
+ │   │   ├── identity/route.ts   # Node role/ID for sidebar badge
723
+ │   │   └── events/route.ts     # SSE: dual-iterator (NATS sub + KV watch)
724
+ │   ├── lib/
725
+ │   │   └── sync/
726
+ │   │       └── mesh-kv.ts      # Sync engine (KV watch → SQLite, CAS push)
727
+ │   └── components/layout/
728
+ │       └── sidebar.tsx         # Node badge (⬢ Lead / ◇ Worker)
729
+ ├── src/lib/__tests__/
730
+ │   ├── mesh-kv-sync.test.ts    # 30 unit tests (CAS, authority, merge, proposals)
731
+ │   └── mocks/mock-kv.ts        # Shared MockKV for all KV tests
732
+ bin/
733
+ └── mesh-task-daemon.js         # Proposal processing (30s enforcement loop)
734
+ lib/
735
+ └── mesh-tasks.js               # PROPOSED + REJECTED task statuses
736
+ test/
737
+ ├── mesh-tasks-status.test.js   # 7 unit tests (status enum, defaults)
738
+ └── distributed-mc.test.js      # 12 integration tests (needs NATS + daemon)
740
+ ```
741
+
742
+ ### CAS (Compare-And-Swap) Explained
743
+
744
+ Every task in the KV bucket has a `revision` number that increments on each write. To update a task, you must provide the current revision. If another node wrote between your read and your write, the revision won't match and the update fails with a 409.
745
+
746
+ This prevents stale writes from overwriting newer state, without locks or a central coordinator:
747
+
748
+ ```
749
+ Node A reads T-001 (revision 5)
750
+ Node B reads T-001 (revision 5)
751
+ Node A writes T-001 with revision 5 → succeeds (now revision 6)
752
+ Node B writes T-001 with revision 5 → FAILS (expected 5, got 6)
753
+ Node B re-reads T-001 (revision 6), retries → succeeds
754
+ ```
755
+
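The read, write, re-read, retry sequence can be reproduced with a small in-memory stand-in. `MemoryKV` and `updateWithRetry` are hypothetical names used only for illustration; this is not the NATS client API, and a real client would surface the mismatch through its own error type.

```javascript
// In-memory stand-in for a revisioned KV bucket (not the NATS client).
class MemoryKV {
  constructor() {
    this.entries = new Map();
  }

  get(key) {
    return this.entries.get(key) ?? null;
  }

  // Write only if the caller's expected revision matches the stored one.
  update(key, value, expectedRevision) {
    const current = this.entries.get(key);
    const currentRevision = current ? current.revision : 0;
    if (currentRevision !== expectedRevision) {
      const err = new Error(`expected ${expectedRevision}, got ${currentRevision}`);
      err.code = 'REVISION_MISMATCH';
      throw err;
    }
    const entry = { value, revision: currentRevision + 1 };
    this.entries.set(key, entry);
    return entry.revision;
  }
}

// Re-read and retry when another writer got there first.
function updateWithRetry(kv, key, mutate, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = kv.get(key);
    const next = mutate(current ? current.value : null);
    try {
      return kv.update(key, next, current ? current.revision : 0);
    } catch (err) {
      if (err.code !== 'REVISION_MISMATCH' || attempt === maxAttempts) throw err;
    }
  }
}
```

The `mutate` callback is re-run against the freshly read value on each attempt, so a retry never reapplies a change computed from stale state.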
756
+ ### SSE Dual-Iterator
757
+
758
+ The `/api/mesh/events` endpoint runs two async iterators in parallel:
759
+
760
+ 1. **NATS subscription** on `mesh.events.>` — receives all mesh event broadcasts
761
+ 2. **KV watcher** on `MESH_TASKS` — receives real-time task state changes
762
+
763
+ Both feed into a single SSE stream. When the client disconnects, both iterators are cleaned up (subscription unsubscribed, watcher stopped). This prevents zombie NATS connections.
764
+
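The dual-iterator pattern can be sketched without a NATS connection. In the sketch below, `createSource` is a hypothetical stand-in for a subscription or KV watcher (an async iterable with a `stop()` method), and `openStream` is a hypothetical name for the endpoint logic; the `send` callback represents writing one SSE message.

```javascript
// Stand-in for a NATS subscription or KV watcher: async iterable plus stop().
function createSource() {
  const queue = [];
  let wake = null;
  let stopped = false;
  return {
    push(value) { queue.push(value); wake?.(); },
    stop() { stopped = true; wake?.(); },
    get stopped() { return stopped; },
    async *[Symbol.asyncIterator]() {
      while (true) {
        while (queue.length > 0) yield queue.shift();
        if (stopped) return;
        // Sleep until push() or stop() wakes the iterator.
        await new Promise((resolve) => { wake = resolve; });
        wake = null;
      }
    },
  };
}

// Pump every source concurrently into one sink; return a cleanup function.
function openStream(sources, send) {
  const pumps = sources.map(async (source) => {
    for await (const event of source) send(event);
  });
  const finished = Promise.all(pumps);
  // A disconnect handler would call this: stop every source, then wait
  // for both pumps to drain so nothing is left running.
  return function close() {
    for (const source of sources) source.stop();
    return finished;
  };
}
```

Returning a single `close` function keeps the cleanup for both iterators in one place, which is the property that prevents a disconnected client from leaving one of them running.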
765
+ ### Node Badge
766
+
767
+ The sidebar shows the node's identity:
768
+
769
+ - **⬢ Lead** (green) — full read/write/approve authority
770
+ - **◇ Worker** (blue) — read + propose, no direct task management
771
+ - **◇ Offline** (gray) — NATS unreachable, operating in standalone mode
772
+
773
+ ### Configuration
774
+
775
+ Two environment variables control node identity:
776
+
777
+ | Variable | Default | Description |
778
+ |----------|---------|-------------|
779
+ | `OPENCLAW_NODE_ROLE` | Auto-detected | `lead` or `worker`. Auto-detected from `service-manifest.json` if unset |
780
+ | `OPENCLAW_NODE_ID` | `os.hostname()` | Unique identifier for this node in the mesh |
781
+
782
+ No configuration needed on the lead — it works exactly as before. Workers just need `OPENCLAW_NATS` pointed at the lead's NATS server.
783
+
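A minimal sketch of how the two identity variables might be resolved follows. `resolveNodeIdentity` and the `detectRole` callback are hypothetical; how the role is actually read from `service-manifest.json` is not shown here, and rejecting unknown roles is an assumption.

```javascript
// Hypothetical resolution of node role and ID from the environment.
const VALID_ROLES = new Set(['lead', 'worker']);

function resolveNodeIdentity(env, hostname, detectRole) {
  // Explicit env var wins; otherwise fall back to auto-detection.
  const role = env.OPENCLAW_NODE_ROLE || detectRole();
  if (!VALID_ROLES.has(role)) {
    throw new Error(`unsupported node role: ${role}`); // assumption: fail fast
  }
  // Node ID defaults to the machine hostname.
  return { role, id: env.OPENCLAW_NODE_ID || hostname };
}
```

In a running process this would be called with `process.env` and `os.hostname()`, with `detectRole` supplying whatever the manifest lookup returns.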
784
+ ### Testing
785
+
786
+ ```bash
787
+ # Unit tests (no dependencies — run anywhere)
788
+ cd mission-control && npm run test:unit # 30 tests: CAS, authority, merge, proposals
789
+ cd .. && npm run test:unit # 7 tests: status enum, task creation
790
+
791
+ # Integration tests (needs live NATS + mesh-task-daemon)
792
+ npm run test:integration # 12 tests: proposal lifecycle, RPC, events
793
+ # Skips gracefully if daemon not running
794
+
795
+ # Everything
796
+ npm run test:all
797
+ ```
798
+
799
+ ### Migration Path
800
+
801
+ This is Phase 1+2 of a 4-phase rollout:
802
+
803
+ | Phase | What Changes | Status |
804
+ |-------|-------------|--------|
805
+ | **1: KV Mirror** | Workers get read-only MC dashboards via KV watch | Done |
806
+ | **2: Sync Engine** | Workers can propose tasks, lead validates | Done |
807
+ | 3: Distributed Claiming | Any node can claim and execute queued tasks via CAS | Planned |
808
+ | 4: Full Sovereignty | No central daemon, each node schedules independently | Planned |
809
+
810
+ Phase 1+2 is **non-breaking** — the lead's existing task daemon, kanban sync, and agent dispatch all work exactly as before. The new code paths only activate when `OPENCLAW_NATS` is configured and reachable.
811
+
169
812
  ## Environment Variables
170
813
 
171
814
  See `openclaw.env.example` for all available configuration. Key variables: