brainclaw 1.8.0 → 1.9.1

Files changed (178)
  1. package/README.md +592 -505
  2. package/dist/brainclaw-vscode.vsix +0 -0
  3. package/dist/cli.js +138 -13
  4. package/dist/commands/add-step.js +1 -1
  5. package/dist/commands/bootstrap.js +2 -26
  6. package/dist/commands/check-security-mcp.js +50 -33
  7. package/dist/commands/check-security.js +86 -43
  8. package/dist/commands/claim.js +22 -21
  9. package/dist/commands/confirm.js +26 -0
  10. package/dist/commands/context-diff.js +1 -1
  11. package/dist/commands/dispatch-watch.js +142 -0
  12. package/dist/commands/doctor.js +113 -2
  13. package/dist/commands/estimation-report.js +115 -16
  14. package/dist/commands/harvest.js +286 -23
  15. package/dist/commands/hooks.js +73 -73
  16. package/dist/commands/init.js +124 -22
  17. package/dist/commands/install-hooks.js +78 -78
  18. package/dist/commands/loops-handlers.js +4 -0
  19. package/dist/commands/mcp-read-handlers.js +253 -41
  20. package/dist/commands/mcp.js +664 -102
  21. package/dist/commands/memory.js +21 -17
  22. package/dist/commands/migrate.js +81 -17
  23. package/dist/commands/prune.js +78 -4
  24. package/dist/commands/reflect.js +26 -20
  25. package/dist/commands/register-agent.js +57 -1
  26. package/dist/commands/repair.js +20 -0
  27. package/dist/commands/session-end.js +15 -6
  28. package/dist/commands/session-start.js +18 -1
  29. package/dist/commands/setup-security.js +39 -18
  30. package/dist/commands/setup.js +26 -27
  31. package/dist/commands/stale.js +16 -2
  32. package/dist/commands/switch.js +26 -5
  33. package/dist/commands/uninstall.js +126 -34
  34. package/dist/commands/update-step.js +6 -0
  35. package/dist/commands/version.js +1 -1
  36. package/dist/commands/worktree.js +60 -0
  37. package/dist/core/actions.js +12 -3
  38. package/dist/core/agent-capability.js +30 -17
  39. package/dist/core/agent-files.js +963 -666
  40. package/dist/core/agent-integrations.js +0 -3
  41. package/dist/core/agent-inventory.js +67 -0
  42. package/dist/core/agent-registry.js +163 -29
  43. package/dist/core/agentrun-reconciler.js +33 -2
  44. package/dist/core/agentruns.js +7 -1
  45. package/dist/core/ai-agent-detection.js +31 -44
  46. package/dist/core/archival.js +15 -9
  47. package/dist/core/assignment-reconciler.js +56 -0
  48. package/dist/core/assignment-sweeper.js +127 -4
  49. package/dist/core/assignments.js +69 -11
  50. package/dist/core/bootstrap.js +233 -67
  51. package/dist/core/brainclaw-version.js +22 -0
  52. package/dist/core/candidates.js +21 -1
  53. package/dist/core/claims.js +313 -150
  54. package/dist/core/codev-prompts.js +38 -38
  55. package/dist/core/config.js +6 -1
  56. package/dist/core/context-diff.js +148 -20
  57. package/dist/core/context.js +129 -8
  58. package/dist/core/coordination.js +22 -3
  59. package/dist/core/default-profiles/doctor.yaml +11 -11
  60. package/dist/core/default-profiles/janitor.yaml +11 -11
  61. package/dist/core/default-profiles/onboarder.yaml +11 -11
  62. package/dist/core/default-profiles/reviewer.yaml +13 -13
  63. package/dist/core/dispatch-status.js +79 -5
  64. package/dist/core/dispatcher.js +65 -12
  65. package/dist/core/entity-operations.js +74 -27
  66. package/dist/core/entity-registry.js +31 -5
  67. package/dist/core/event-log.js +138 -21
  68. package/dist/core/events/checkpoint.js +258 -0
  69. package/dist/core/events/genesis.js +220 -0
  70. package/dist/core/events/journal.js +507 -0
  71. package/dist/core/events/materialize.js +126 -0
  72. package/dist/core/events/registry-post-image.js +110 -0
  73. package/dist/core/events/verify.js +109 -0
  74. package/dist/core/execution-adapters.js +23 -0
  75. package/dist/core/execution.js +1 -1
  76. package/dist/core/facade-schema.js +38 -0
  77. package/dist/core/gc-semantic.js +130 -5
  78. package/dist/core/handoff-snapshot.js +68 -0
  79. package/dist/core/ids.js +19 -8
  80. package/dist/core/instruction-templates.js +34 -115
  81. package/dist/core/io.js +39 -3
  82. package/dist/core/json-store.js +10 -1
  83. package/dist/core/lock.js +153 -28
  84. package/dist/core/loops/bootstrap-acquire.js +25 -1
  85. package/dist/core/loops/facade-schema.js +2 -0
  86. package/dist/core/loops/hooks/survey-signals-baseline.js +36 -0
  87. package/dist/core/loops/index.js +1 -0
  88. package/dist/core/loops/presets/bootstrap.js +7 -0
  89. package/dist/core/loops/store.js +17 -0
  90. package/dist/core/loops/verbs.js +24 -2
  91. package/dist/core/markdown.js +8 -76
  92. package/dist/core/mcp-command-resolution.js +245 -0
  93. package/dist/core/memory-compactor.js +5 -3
  94. package/dist/core/memory-lifecycle.js +282 -0
  95. package/dist/core/merge-risk.js +150 -0
  96. package/dist/core/messaging.js +10 -3
  97. package/dist/core/migration.js +11 -1
  98. package/dist/core/observer-mode.js +26 -0
  99. package/dist/core/operations/memory-mutation.js +90 -65
  100. package/dist/core/operations/plan.js +27 -1
  101. package/dist/core/protocol-skills.js +210 -0
  102. package/dist/core/reflection-safety.js +6 -7
  103. package/dist/core/reputation.js +84 -2
  104. package/dist/core/runtime-signals.js +72 -10
  105. package/dist/core/runtime.js +84 -1
  106. package/dist/core/schema.js +114 -0
  107. package/dist/core/search.js +19 -2
  108. package/dist/core/security-detectors.js +125 -0
  109. package/dist/core/security-extract.js +189 -0
  110. package/dist/core/security-guard.js +217 -139
  111. package/dist/core/security-packages.js +121 -0
  112. package/dist/core/security-scoring.js +76 -9
  113. package/dist/core/security.js +34 -2
  114. package/dist/core/sequence.js +11 -2
  115. package/dist/core/setup-flow.js +141 -13
  116. package/dist/core/spawn-check.js +16 -2
  117. package/dist/core/staleness.js +73 -2
  118. package/dist/core/state.js +250 -54
  119. package/dist/core/store-resolution.js +45 -12
  120. package/dist/core/worktree.js +90 -26
  121. package/dist/facts.js +8 -8
  122. package/dist/facts.json +7 -7
  123. package/docs/PROTOCOL.md +223 -0
  124. package/docs/adapters/openclaw.md +43 -43
  125. package/docs/architecture/project-refs.md +328 -328
  126. package/docs/cli.md +2097 -2096
  127. package/docs/concepts/coordination.md +52 -52
  128. package/docs/concepts/coordinator-runbook.md +129 -0
  129. package/docs/concepts/dispatch-lifecycle.md +245 -245
  130. package/docs/concepts/event-log-store.md +928 -0
  131. package/docs/concepts/ideation-loop.md +317 -317
  132. package/docs/concepts/loop-engine.md +520 -511
  133. package/docs/concepts/mcp-governance.md +268 -268
  134. package/docs/concepts/memory.md +89 -88
  135. package/docs/concepts/multi-agent-workflows.md +167 -167
  136. package/docs/concepts/observer-protocol.md +361 -0
  137. package/docs/concepts/parallel-merge-protocol.md +71 -0
  138. package/docs/concepts/plans-and-claims.md +217 -174
  139. package/docs/concepts/project-md-convention.md +35 -35
  140. package/docs/concepts/runtime-notes.md +38 -38
  141. package/docs/concepts/skills.md +78 -0
  142. package/docs/concepts/troubleshooting.md +254 -254
  143. package/docs/concepts/workspace-bootstrapping.md +142 -81
  144. package/docs/context-format-changelog.md +35 -35
  145. package/docs/context-format.md +48 -48
  146. package/docs/index.md +65 -65
  147. package/docs/integrations/agents.md +162 -162
  148. package/docs/integrations/claude-code.md +23 -23
  149. package/docs/integrations/cline.md +87 -88
  150. package/docs/integrations/codex.md +2 -2
  151. package/docs/integrations/continue.md +60 -60
  152. package/docs/integrations/copilot.md +82 -80
  153. package/docs/integrations/cursor.md +23 -23
  154. package/docs/integrations/kilocode.md +72 -72
  155. package/docs/integrations/mcp.md +377 -377
  156. package/docs/integrations/mistral-vibe.md +122 -122
  157. package/docs/integrations/openclaw.md +99 -98
  158. package/docs/integrations/opencode.md +84 -84
  159. package/docs/integrations/overview.md +122 -122
  160. package/docs/integrations/roo.md +74 -74
  161. package/docs/integrations/windsurf.md +83 -83
  162. package/docs/mcp-schema-changelog.md +360 -329
  163. package/docs/playbooks/integration/index.md +121 -121
  164. package/docs/playbooks/orchestration.md +37 -0
  165. package/docs/playbooks/productivity/index.md +99 -99
  166. package/docs/playbooks/team/index.md +117 -117
  167. package/docs/product/agent-first-model.md +184 -184
  168. package/docs/product/entity-model-audit.md +462 -462
  169. package/docs/product/positioning.md +86 -86
  170. package/docs/quickstart-existing-project.md +107 -107
  171. package/docs/quickstart.md +148 -147
  172. package/docs/release-maintenance.md +79 -79
  173. package/docs/reputation.md +52 -52
  174. package/docs/review.md +45 -45
  175. package/docs/security.md +212 -53
  176. package/docs/server-operations.md +118 -118
  177. package/docs/storage.md +110 -108
  178. package/package.json +86 -69
@@ -1,53 +1,53 @@
1
- # Audience: Teams & Ops
2
-
3
- Design constraints for agents developing brainclaw features that serve team developers, project maintainers, and CI/CD operators.
4
-
5
- ## Profiles
6
-
7
- ### Team Developers
8
- Developers collaborating on shared projects where multiple humans (and their agents) contribute. They need async coordination to avoid conflicting edits.
9
-
10
- ### Project Maintainers
11
- Senior developers or tech leads responsible for code quality and architecture. They define the rules that all agents must follow in the repo.
12
-
13
- ### CI/CD Operators
14
- Engineers who run agents in headless pipelines. They need brainclaw to work without interactive prompts and produce machine-readable output.
15
-
16
- ## Design Constraints
17
-
18
- These rules apply to any feature that touches this audience:
19
-
20
- 1. **Concurrent agents is the nominal case, not the exception.** Any mutation of shared state (plans, claims, memory) must handle conflicts gracefully. File claims exist precisely for this — features that bypass claims are bugs. `bclaw_conflict_check` provides pre-edit safety, and `bclaw_check_policy` enforces scope compliance with glob-based pattern matching.
21
-
22
- 2. **Claims are mandatory for file-level coordination.** When multiple agents can be active, no agent should edit files without a claim. The claim system must be strict enough to prevent collisions and clear enough that agents understand scope boundaries. Policy checks return hard blocks (stops) and warnings (advisories) — both must be surfaced.
23
-
24
- 3. **Project constraints must be enforceable.** When a maintainer defines architecture rules or known traps, every agent joining the project must read them before working. This is not optional — it is the mechanism for scaling quality across agents. Lifecycle trigger tags (`trigger:post-claim`, `trigger:pre-session-end`) allow constraints to fire at specific moments.
25
-
26
- 4. **Headless operation must be first-class.** All CLI commands must work non-interactively (`--yes`, `--json`, exit codes). CI/CD agents cannot answer prompts. If a feature adds an interactive flow, it must have a non-interactive equivalent. Delivery methods (stdin_pipe, inline_arg, temp_file, inbox_structured) adapt to the agent's capabilities automatically.
27
-
28
- 5. **Audit trail matters.** For teams, knowing which agent did what and why is not a nice-to-have. The audit system (`audit.ts`) logs 30+ action types in JSONL with auto-rotation (10MB threshold), captures actor/action/before-after/scope/session_id/host_id. The governance posture report (`bclaw_audit`) aggregates this into constitution, red-lines, active claims, and recent activity summaries.
29
-
30
- 6. **Worktree isolation is the safe default for parallel work.** When multiple agents work simultaneously, each should get its own Git worktree. Worktrees are created under `~/.brainclaw/worktrees/<project-hash>/` with sidecar tracking (session_id, agent, user). `brainclaw worktree clean` prunes merged branches, `brainclaw worktree merge` handles restoration.
31
-
32
- ## Anti-Patterns
33
-
34
- - Assuming single-agent, single-session usage
35
- - Modifying shared state without conflict detection
36
- - Adding interactive-only flows with no `--yes` / `--json` fallback
37
- - Storing coordination state outside of Git-tracked `.brainclaw/` (breaks team sync)
38
- - Silently overwriting another agent's claims or plan state
39
- - Producing output that requires a human to interpret (no structured/machine-readable option)
40
- - Ignoring reputation signals when evaluating agent contributions
41
-
42
- ## Key Features for This Audience
43
-
44
- ### Coordination & Claims
45
- - `bclaw_claim` / `bclaw_release_claim` — file-level advisory locks with auto-worktree per claim
46
- - `bclaw_find(entity="claim", filter?)` / `bclaw_get(entity="claim", id)` — list and read claims via the canonical grammar
47
- - `bclaw_conflict_check` — pre-edit conflict detection between agents
48
- - `bclaw_check_policy` — scope compliance with glob matching, returns blocks + warnings
49
- - `bclaw_who` — list all active agent sessions on the workspace
50
-
1
+ # Audience: Teams & Ops
2
+
3
+ Design constraints for agents developing brainclaw features that serve team developers, project maintainers, and CI/CD operators.
4
+
5
+ ## Profiles
6
+
7
+ ### Team Developers
8
+ Developers collaborating on shared projects where multiple humans (and their agents) contribute. They need async coordination to avoid conflicting edits.
9
+
10
+ ### Project Maintainers
11
+ Senior developers or tech leads responsible for code quality and architecture. They define the rules that all agents must follow in the repo.
12
+
13
+ ### CI/CD Operators
14
+ Engineers who run agents in headless pipelines. They need brainclaw to work without interactive prompts and produce machine-readable output.
15
+
16
+ ## Design Constraints
17
+
18
+ These rules apply to any feature that touches this audience:
19
+
20
+ 1. **Concurrent agents is the nominal case, not the exception.** Any mutation of shared state (plans, claims, memory) must handle conflicts gracefully. File claims exist precisely for this — features that bypass claims are bugs. `bclaw_conflict_check` provides pre-edit safety, and `bclaw_check_policy` enforces scope compliance with glob-based pattern matching.
21
+
22
+ 2. **Claims are mandatory for file-level coordination.** When multiple agents can be active, no agent should edit files without a claim. The claim system must be strict enough to prevent collisions and clear enough that agents understand scope boundaries. Policy checks return hard blocks (stops) and warnings (advisories) — both must be surfaced.
23
+
24
+ 3. **Project constraints must be enforceable.** When a maintainer defines architecture rules or known traps, every agent joining the project must read them before working. This is not optional — it is the mechanism for scaling quality across agents. Lifecycle trigger tags (`trigger:post-claim`, `trigger:pre-session-end`) allow constraints to fire at specific moments.
25
+
26
+ 4. **Headless operation must be first-class.** All CLI commands must work non-interactively (`--yes`, `--json`, exit codes). CI/CD agents cannot answer prompts. If a feature adds an interactive flow, it must have a non-interactive equivalent. Delivery methods (stdin_pipe, inline_arg, temp_file, inbox_structured) adapt to the agent's capabilities automatically.
27
+
28
+ 5. **Audit trail matters.** For teams, knowing which agent did what and why is not a nice-to-have. The audit system (`audit.ts`) logs 30+ action types in JSONL with auto-rotation (10MB threshold), captures actor/action/before-after/scope/session_id/host_id. The governance posture report (`bclaw_audit`) aggregates this into constitution, red-lines, active claims, and recent activity summaries.
29
+
30
+ 6. **Worktree isolation is the safe default for parallel work.** When multiple agents work simultaneously, each should get its own Git worktree. Worktrees are created under `~/.brainclaw/worktrees/<project-hash>/` with sidecar tracking (session_id, agent, user). `brainclaw worktree clean` prunes merged branches, `brainclaw worktree merge` handles restoration.
31
+
32
+ ## Anti-Patterns
33
+
34
+ - Assuming single-agent, single-session usage
35
+ - Modifying shared state without conflict detection
36
+ - Adding interactive-only flows with no `--yes` / `--json` fallback
37
+ - Storing coordination state outside of Git-tracked `.brainclaw/` (breaks team sync)
38
+ - Silently overwriting another agent's claims or plan state
39
+ - Producing output that requires a human to interpret (no structured/machine-readable option)
40
+ - Ignoring reputation signals when evaluating agent contributions
41
+
42
+ ## Key Features for This Audience
43
+
44
+ ### Coordination & Claims
45
+ - `bclaw_claim` / `bclaw_release_claim` — file-level advisory locks with auto-worktree per claim
46
+ - `bclaw_find(entity="claim", filter?)` / `bclaw_get(entity="claim", id)` — list and read claims via the canonical grammar
47
+ - `bclaw_conflict_check` — pre-edit conflict detection between agents
48
+ - `bclaw_check_policy` — scope compliance with glob matching, returns blocks + warnings
49
+ - `bclaw_who` — list all active agent sessions on the workspace
50
+
51
51
  ### Planning & Sequencing
52
52
  - `bclaw_create(entity="plan", data)` / `bclaw_update(entity="plan", id, patch)` / `bclaw_transition(entity="plan", id, to)` — shared work planning with priority/effort/assignee/status
53
53
  - `bclaw_find(entity="plan", filter?)` — list plans
@@ -55,70 +55,70 @@ These rules apply to any feature that touches this audience:
55
55
  - `bclaw_create_sequence` / `bclaw_list_sequences` / `bclaw_update_sequence` — multi-step coordination sequences with lane analysis
56
56
 
57
57
  Sequence items are explicit lane records: `{ planId, stepId?, rank, hard_after?, soft_after?, lane?, scope_hint?, rationale? }`. `planId` and `rank` are required; `stepId` lets a sequence dispatch a specific plan step instead of the whole plan. A normal parallel flow is: create sequence as `draft`, update it to `active`, run `bclaw_dispatch(intent="analysis")`, then run `bclaw_dispatch(intent="execute", agents=[...])`.
58
-
59
- ### Multi-Agent Dispatch
60
- - `bclaw_dispatch(intent)` — `analysis` analyses an active sequence (ready/active/blocked/done per lane), `execute` fans out parallel work across lanes, `review` dispatches code reviews for completed handoffs
61
- - `bclaw_coordinate(intent)` — facade for `assign` / `consult` / `review` / `reroute` / `summarize`. Pass `open_loop: true` on `intent="review"` to also dispatch the reviewer turn (the recommended path).
62
- - `bclaw_loop(intent)` — drive a turn in an existing multi-turn loop (`turn`, `complete_turn`, `advance`, `close`)
63
- - Delivery channels: spawn (CLI subprocess) or inbox (structured messages)
64
- - Dry-run mode for previewing assignments
65
-
66
- ### Inter-Agent Messaging
67
- - `bclaw_send_message` / `bclaw_read_inbox` / `bclaw_ack_message` — structured messaging between agents
68
- - `bclaw_get_thread` — full thread view across agent inboxes
69
- - CLI: `brainclaw inbox list/ack/archive/send/thread`
70
-
71
- ### Visibility & Governance
72
- - `bclaw_context(kind="board")` / `bclaw_context(kind="board_summary")` — active plans, claims by agent, open handoffs, sequences, resolved instructions (full or compact form)
73
- - `bclaw_audit` — governance posture report (constitution, red-lines, claim activity, recommendations)
74
- - `bclaw_history` — full mutation history per memory item with before/after snapshots
75
- - Reputation signals — 4 scores: contribution_quality, review_reliability, continuity_hygiene, internal_trust
76
- - `bclaw_estimation_report` — estimation accuracy for completed plans
77
-
78
- ### Handoffs & Review
79
- - `bclaw_get(entity="handoff", id)` / `bclaw_update(entity="handoff", id, patch)` — structured agent transitions with git diff and state snapshot
80
- - `bclaw_create(entity="candidate", data)` / `bclaw_transition(entity="candidate", id, to="accepted"|"rejected")` — review workflow with candidate promotion
81
- - `bclaw_coordinate(intent="review", open_loop: true, review_mode: "symmetric")` — open a multi-turn review-and-fix loop in one call (the recommended way to dispatch a reviewer)
82
- - `brainclaw review` — reflective review from CLI
83
-
84
- ### Worktree Management
85
- - `bclaw_claim` with auto-worktree — creates isolated branch per agent/scope
86
- - `brainclaw worktree clean` — remove merged worktrees and orphan directories
87
- - `brainclaw worktree merge <branch>` — merge with auto-restoration
88
- - Sidecar tracking: session_id, agent name, user per worktree
89
-
90
- ## Known Gaps
91
-
92
- Features this audience would naturally expect but that are not yet implemented:
93
-
94
- ### No merge conflict resolution tooling
95
- brainclaw creates worktrees and manages branches but provides zero help when two agents' worktrees conflict at merge time. No pre-merge conflict detection, no merge strategy hints, no assisted resolution. Teams must handle all git merge/rebase manually.
96
- **Impact:** The hardest part of multi-agent parallel work (the merge) is entirely outside brainclaw's scope.
97
-
98
- ### No push notifications or webhooks
99
- The event log (`event-log.ts`) records all mutations in JSONL and supports cursor-based polling (`readUnseenEvents()`), but there is no push mechanism. No webhooks, no Slack integration, no email alerts. When a claim conflicts or a handoff is ready, agents must poll to discover it.
100
- **Impact:** Teams can't get real-time alerts on coordination events. Stale claim detection depends on someone polling.
101
-
102
- ### Claim expiry has no background enforcement
103
- Claim expiry logic exists (`claims.ts`: `isClaimExpired()`, `isClaimStale()`, `expireStaleActiveClaims()`), but there is no daemon or background process. Expiry only triggers when a command explicitly calls it. If an agent crashes and abandons a claim, it blocks other agents until someone runs a check.
104
- **Impact:** Stale claims from crashed agents can block team progress indefinitely.
105
-
106
- ### No path-level access restrictions
107
- Trust levels exist (observer/contributor/trusted/curator in `agent-registry.ts`), but they are all-or-nothing per tier. There is no way to say "junior agents can't modify src/core/" or restrict specific agents to specific scopes. Policy checks (`policy.ts`) warn on claim conflicts but don't enforce path-based permissions.
108
- **Impact:** Maintainers cannot protect critical code paths from untrusted agents.
109
-
110
- ### No web dashboard or visualization
111
- All output is CLI text or JSON (`metrics.ts`, `doctor.ts`, `audit`). There is no web UI, no HTML reports, no dependency graphs, no Gantt charts for plans. Teams that want visual coordination views must build their own from JSON exports.
112
- **Impact:** Ops teams without CLI fluency have no visibility into coordination state.
113
-
114
- ### No pre-built CI GitHub Action
115
- brainclaw works in CI (proven by `.github/workflows/brainclaw-sync.yml` in this repo), but there is no reusable GitHub Action or template. Operators must manually wire `brainclaw doctor --json` into their pipelines. No `--ci` mode flag, no structured exit codes (warning vs error).
116
- **Impact:** CI adoption requires per-repo custom work instead of a one-line Action reference.
117
-
118
- ### Plan dependencies lack cycle/deadlock detection
119
- Plans support `depends_on` and sequences support `hard_after`/`soft_after` (`schema.ts`, `dispatcher.ts`), but there is no transitive dependency validation. Circular dependencies (A→B→C→A) are not detected before dispatch. No critical path analysis or slack calculation.
120
- **Impact:** Complex plan graphs can deadlock silently.
121
-
122
- ### No team onboarding walkthrough
123
- When a new developer joins an existing brainclaw project, they get the same `init` flow as a greenfield project. There is no "get familiar with this project" step, no guided tour of existing plans/constraints/traps, no role assignment during setup.
124
- **Impact:** New team members start without understanding the project's coordination state.
58
+
59
+ ### Multi-Agent Dispatch
60
+ - `bclaw_dispatch(intent)` — `analysis` analyses an active sequence (ready/active/blocked/done per lane), `execute` fans out parallel work across lanes, `review` dispatches code reviews for completed handoffs
61
+ - `bclaw_coordinate(intent)` — facade for `assign` / `consult` / `review` / `reroute` / `summarize`. Pass `open_loop: true` on `intent="review"` to also dispatch the reviewer turn (the recommended path).
62
+ - `bclaw_loop(intent)` — drive a turn in an existing multi-turn loop (`turn`, `complete_turn`, `advance`, `close`)
63
+ - Delivery channels: spawn (CLI subprocess) or inbox (structured messages)
64
+ - Dry-run mode for previewing assignments
65
+
66
+ ### Inter-Agent Messaging
67
+ - `bclaw_send_message` / `bclaw_read_inbox` / `bclaw_ack_message` — structured messaging between agents
68
+ - `bclaw_get_thread` — full thread view across agent inboxes
69
+ - CLI: `brainclaw inbox list/ack/archive/send/thread`
70
+
71
+ ### Visibility & Governance
72
+ - `bclaw_context(kind="board")` / `bclaw_context(kind="board_summary")` — active plans, claims by agent, open handoffs, sequences, resolved instructions (full or compact form)
73
+ - `bclaw_audit` — governance posture report (constitution, red-lines, claim activity, recommendations)
74
+ - `bclaw_history` — full mutation history per memory item with before/after snapshots
75
+ - Reputation signals — 4 scores: contribution_quality, review_reliability, continuity_hygiene, internal_trust
76
+ - `bclaw_estimation_report` — estimation accuracy for completed plans
77
+
78
+ ### Handoffs & Review
79
+ - `bclaw_get(entity="handoff", id)` / `bclaw_update(entity="handoff", id, patch)` — structured agent transitions with git diff and state snapshot
80
+ - `bclaw_create(entity="candidate", data)` / `bclaw_transition(entity="candidate", id, to="accepted"|"rejected")` — review workflow with candidate promotion
81
+ - `bclaw_coordinate(intent="review", open_loop: true, review_mode: "symmetric")` — open a multi-turn review-and-fix loop in one call (the recommended way to dispatch a reviewer)
82
+ - `brainclaw review` — reflective review from CLI
83
+
84
+ ### Worktree Management
85
+ - `bclaw_claim` with auto-worktree — creates isolated branch per agent/scope
86
+ - `brainclaw worktree clean` — remove merged worktrees and orphan directories
87
+ - `brainclaw worktree merge <branch>` — merge with auto-restoration
88
+ - Sidecar tracking: session_id, agent name, user per worktree
89
+
90
+ ## Known Gaps
91
+
92
+ Features this audience would naturally expect but that are not yet implemented:
93
+
94
+ ### No merge conflict resolution tooling
95
+ brainclaw creates worktrees and manages branches but provides zero help when two agents' worktrees conflict at merge time. No pre-merge conflict detection, no merge strategy hints, no assisted resolution. Teams must handle all git merge/rebase manually.
96
+ **Impact:** The hardest part of multi-agent parallel work (the merge) is entirely outside brainclaw's scope.
97
+
98
+ ### No push notifications or webhooks
99
+ The event log (`event-log.ts`) records all mutations in JSONL and supports cursor-based polling (`readUnseenEvents()`), but there is no push mechanism. No webhooks, no Slack integration, no email alerts. When a claim conflicts or a handoff is ready, agents must poll to discover it.
100
+ **Impact:** Teams can't get real-time alerts on coordination events. Stale claim detection depends on someone polling.
101
+
102
+ ### Claim expiry has no background enforcement
103
+ Claim expiry logic exists (`claims.ts`: `isClaimExpired()`, `isClaimStale()`, `expireStaleActiveClaims()`), but there is no daemon or background process. Expiry only triggers when a command explicitly calls it. If an agent crashes and abandons a claim, it blocks other agents until someone runs a check.
104
+ **Impact:** Stale claims from crashed agents can block team progress indefinitely.
105
+
106
+ ### No path-level access restrictions
107
+ Trust levels exist (observer/contributor/trusted/curator in `agent-registry.ts`), but they are all-or-nothing per tier. There is no way to say "junior agents can't modify src/core/" or restrict specific agents to specific scopes. Policy checks (`policy.ts`) warn on claim conflicts but don't enforce path-based permissions.
108
+ **Impact:** Maintainers cannot protect critical code paths from untrusted agents.
109
+
110
+ ### No web dashboard or visualization
111
+ All output is CLI text or JSON (`metrics.ts`, `doctor.ts`, `audit`). There is no web UI, no HTML reports, no dependency graphs, no Gantt charts for plans. Teams that want visual coordination views must build their own from JSON exports.
112
+ **Impact:** Ops teams without CLI fluency have no visibility into coordination state.
113
+
114
+ ### No pre-built CI GitHub Action
115
+ brainclaw works in CI (proven by `.github/workflows/brainclaw-sync.yml` in this repo), but there is no reusable GitHub Action or template. Operators must manually wire `brainclaw doctor --json` into their pipelines. No `--ci` mode flag, no structured exit codes (warning vs error).
116
+ **Impact:** CI adoption requires per-repo custom work instead of a one-line Action reference.
117
+
118
+ ### Plan dependencies lack cycle/deadlock detection
119
+ Plans support `depends_on` and sequences support `hard_after`/`soft_after` (`schema.ts`, `dispatcher.ts`), but there is no transitive dependency validation. Circular dependencies (A→B→C→A) are not detected before dispatch. No critical path analysis or slack calculation.
120
+ **Impact:** Complex plan graphs can deadlock silently.
121
+
122
+ ### No team onboarding walkthrough
123
+ When a new developer joins an existing brainclaw project, they get the same `init` flow as a greenfield project. There is no "get familiar with this project" step, no guided tour of existing plans/constraints/traps, no role assignment during setup.
124
+ **Impact:** New team members start without understanding the project's coordination state.
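
The "Plan dependencies lack cycle/deadlock detection" gap above notes that circular dependencies such as A→B→C→A are not caught before dispatch. A minimal sketch of the kind of transitive check that gap describes, assuming dependencies can be flattened into a map from plan id to the ids it depends on. The `DependencyGraph` type and `findDependencyCycle` function are hypothetical names for illustration and are not part of brainclaw's API:

```typescript
// Hypothetical shape: plan id -> ids it depends on
// (loosely mirrors the `depends_on` field named in the doc; not brainclaw's real schema).
type DependencyGraph = Record<string, string[]>;

// Depth-first search with a "visiting" marker.
// Returns one cycle as a list of ids (first id repeated at the end),
// or null when the graph is acyclic.
function findDependencyCycle(graph: DependencyGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (id: string): string[] | null => {
    const seen = state.get(id);
    if (seen === "done") return null;
    if (seen === "visiting") {
      // Back edge: the cycle is the part of the stack starting at `id`.
      return [...stack.slice(stack.indexOf(id)), id];
    }
    state.set(id, "visiting");
    stack.push(id);
    for (const dep of graph[id] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(id, "done");
    return null;
  };

  for (const id of Object.keys(graph)) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}

// Usage: the A→B→C→A example from the text.
const cycle = findDependencyCycle({ A: ["B"], B: ["C"], C: ["A"] });
console.log(cycle); // ["A", "B", "C", "A"]
```

Running a check like this before dispatch would turn a silent deadlock into an explicit error that names the plans involved. The same traversal could also cover `hard_after` edges if they were merged into the same map, though that is an assumption about how sequences would be represented.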
@@ -1,184 +1,184 @@
1
- # Agent-first product model
2
-
3
- Captured from a strategic session on 2026-04-18. This document frames
4
- brainclaw's user model, the resulting two-layer product architecture,
5
- and the loop-protocols roadmap implied by both. It supersedes the
6
- implicit framing in the current audience playbooks (`docs/playbooks/*`)
7
- and should be read before recommending new features.
8
-
9
- ## 1. Who is the user?
10
-
11
- The playbooks describe personae as humans ("Non-Tech Creator", "Solo
12
- Developer", "CI/CD Operator"). That framing is misleading. brainclaw is
13
- built for LLM agents to consume — humans rarely type brainclaw CLI
14
- commands. The lines blur because:
15
-
16
- - The **user** of brainclaw is the agent. It calls MCP tools, consumes
17
- the context format, writes memory, participates in loops.
18
- - The **adopter** of brainclaw is the human. The human installs it,
19
- chooses which agents to deploy, reviews outputs, and — at enterprise
20
- scale — must be able to audit what the agents did.
21
-
22
- These are different people with different requirements. Mixing them
23
- biases priorities toward "visible to the human" features (dashboards,
24
- CLI UX, Slack webhooks) at the expense of "invisible but load-bearing
25
- for the agent" features (context compounding, memory staleness,
26
- cross-agent coordination).
27
-
28
- ## 2. Two-layer product architecture
29
-
30
- brainclaw is really two products sharing a data substrate:
31
-
32
- ### The engine — agent-facing
33
-
34
- - **Role.** Where the agent does work. Every feature here is designed
35
- to make the agent measurably more effective, which manifests
36
- externally as fewer re-prompts, better grounded outputs, and cleaner
37
- multi-agent coordination.
38
- - **Surface.** MCP tools, context format, memory schema, Loop engine,
39
- claims/plans/handoffs, federation signals.
40
- - **Design constraints.** Minimal cognitive load for the agent
41
- (structured inputs, discoverable vocabulary), no UX, machine-parseable
42
- outputs, strict contracts. The agent should be able to use it without
43
- the human intervening.
44
- - **Success metric.** Agent effectiveness: does a long-running agent
45
- with months of history outperform a fresh one? Does multi-agent
46
- review produce better code than single-agent? Does session N+1 pick
47
- up where session N left off without re-prompting?
48
-
49
- ### The cockpit — human-facing
50
-
51
- - **Role.** Where the human supervises, audits, and trusts. Rarely used
52
- operationally by the agent. Critical for adoption, especially at
53
- enterprise scale where visibility is a procurement gate.
54
- - **Surface.** Dashboard (local or remote), audit narrative reports,
55
- cost attribution, risk/policy surface, reputation views, forensics
56
- and replay, CI governance gates, webhooks for operational alerts.
57
- - **Design constraints.** Visual, aggregable, exportable,
58
- compliance-ready. Human cognitive model: time series, filters,
59
- drill-downs. Should run read-mostly on the data the engine emits.
60
- - **Success metric.** Human confidence: does a tech lead understand
61
- what N agents did this week? Can a compliance officer produce a
62
- report? Can an ops engineer detect a misbehaving agent before
63
- damage?
64
-
65
- ### The relationship between the two
66
-
67
- - The engine **emits** signals (events, audit entries, reputation
68
- scores, usage traces, loop transitions).
69
- - The cockpit **consumes** those signals and aggregates them for the
70
- human.
71
- - The engine is primary: without it, the cockpit is an empty chart.
72
- The cockpit is load-bearing for adoption: without it, the engine
73
- never reaches production at scale.
74
- - Engineering discipline: every engine feature should ask "what
75
- machine-readable signal should this emit so the cockpit can
76
- consume it?". The cockpit is then incremental capitalization on
77
- engine investments, not a parallel product effort.
78
-
79
- ## 3. Loop engine strategy
80
-
81
- The Loop engine (pln#394) was designed as a generic control plane —
82
- one engine, many protocols. Review & Fix Loop (pln#395) was the first
83
- shipped protocol. The strategic reflection clarifies that:
84
-
85
- - We do **not** need to code eight protocols. We need to wire four
86
- polished entry points for the high-leverage kinds, and document
87
- patterns for the rest as composition variants.
88
- - The engine already supports everything required: `open`, `turn`,
89
- `advance`, `complete_turn`, `add_artifact`, `pause`, `resume`,
90
- `close`, with per-phase `advance_when`, composite `StopCondition`,
91
- idempotency, and CAS.
92
-
93
- ### Ranked protocols to wire next
94
-
95
- 1. **Ideation Loop** — **MVP shipped in v1.5.0** (pln#492). The shipped
96
- shape is single-champion-plus-memory rather than the four-role
97
- framing originally drafted: empirical work in May 2026
98
- (`feedback_ideation_loop_single_agent_method`) showed that one
99
- model produces useful adversarial pressure when the critic phase's
100
- `context_filter` makes it confront only adversarial memory (traps,
101
- feedback, runtime_notes). Multi-agent slots are still supported as
102
- an opt-in for richer diversity. See [docs/concepts/ideation-loop.md](../concepts/ideation-loop.md).
103
- Reframer phase (pln#493) is the next layer — covers the
104
- novelty/simplicity/external-pattern blind spot of memory-driven
105
- critique.
106
- 2. **Debug & Root-Cause Loop**. Five phases: symptom → hypothesis →
107
- test → fix → verify. Targets the #1 pain point of single-agent
108
- debugging — the lack of structure. High daily impact.
109
- 3. **Research & Synthesis Loop**. Researcher → analyzer → synthesizer
110
- → validator. Replaces "the human reads twenty pages" with a
111
- condensed summary of the same sources. Novel utility vs the other
112
- protocols.
113
- 4. **Planning & Breakdown Loop**. Goal → decomposer → estimator →
114
- validator → refiner. Compounds with brainclaw's existing Plans and
115
- Sequences — makes plan creation less naive.
116
-
117
- ### Variants, not new protocols
118
-
119
- The following items from the brainstorm are compositions of the four
120
- above and do not require separate engine work:
121
-
122
- - **Reflection / Self-Critique** = ideation loop with `mode:
123
- 'symmetric'` and all slots assigned to the same agent. The engine
124
- already supports this.
125
- - **Validation & Approval Multi-Audience** = review loop with N
126
- reviewer slots (one per audience) plus a consolidator slot. Purely
127
- a slot-configuration pattern.
128
- - **Optimization / Refactoring** = implementation loop framed around
129
- a before/after artifact pair. A convention, not a new protocol.
130
-
131
- ### What "wiring" means concretely (per protocol)
132
-
133
- For each of the four priority protocols:
134
-
135
- - Polished `DEFAULT_PROTOCOLS` entry (phases, stop_condition, default
136
- roles) in `src/core/loops/types.ts`.
137
- - A facade entry point: either a new intent on `bclaw_coordinate`
138
- (e.g. `intent='ideate'`) or a direct `bclaw_loop(open, kind=...)`
139
- call pattern documented in the RFC.
140
- - A human-visible output: the terminal artifact should materialize as
141
- a candidate or handoff the human can read, not only an intra-loop
142
- artifact. This is how the loop becomes a "helper" rather than a
143
- process trace.
144
- - End-to-end tests that cover the happy path plus at least one
145
- iteration round.
146
-
147
- ## 4. Playbook refactoring note
148
-
149
- The current playbooks (`docs/playbooks/integration/`, `productivity/`,
150
- `team/`) mix the agent-user and human-adopter perspectives within a
151
- single "audience" section. They should be refactored so each audience
152
- file contains two explicit sections:
153
-
154
- - **For the agent-user.** What the agent gains operationally: memory,
155
- context, coordination, loops, review. This is the engine view.
156
- - **For the human-adopter.** What the human gains in trust,
157
- visibility, governance, compliance, cost control. This is the
158
- cockpit view.
159
-
160
- This split clarifies priorities: features that improve agent
161
- effectiveness belong to the engine slice and trade against each other;
162
- features that improve human confidence belong to the cockpit slice and
163
- trade against each other. Mixing them biased the existing "Known Gaps"
164
- sections toward visible-to-human items.
165
-
166
- ## 5. Practical implications
167
-
168
- - Next implementation move: reframer phase (pln#493) on top of the
169
- shipped ideation_loop, then the Debug & Root-Cause Loop.
170
- - Parallel track: the cockpit needs dedicated planning once the engine
171
- emits enough signals (event streaming, reputation exposure, audit
172
- narrative generation, cost attribution).
173
- - Any new feature proposal should explicitly state which layer it
174
- serves (engine or cockpit) and which audience slice (agent-user or
175
- human-adopter). Proposals that don't declare this should be sent
176
- back for clarification.
177
-
178
- ## References
179
-
180
- - `docs/concepts/loop-engine.md` — the v8 RFC (engine primitive)
181
- - `docs/playbooks/*` — current audience playbooks (pending refactor
182
- per §4 above)
183
- - pln#394 `feat/loop-engine-mvp` — shipped
184
- - pln#395 `feat/review-loop-protocol` — shipped
1
+ # Agent-first product model
2
+
3
+ Captured from a strategic session on 2026-04-18. This document frames
4
+ brainclaw's user model, the resulting two-layer product architecture,
5
+ and the loop-protocols roadmap implied by both. It supersedes the
6
+ implicit framing in the current audience playbooks (`docs/playbooks/*`)
7
+ and should be read before recommending new features.
8
+
9
+ ## 1. Who is the user?
10
+
11
+ The playbooks describe personae as humans ("Non-Tech Creator", "Solo
12
+ Developer", "CI/CD Operator"). That framing is misleading. brainclaw is
13
+ built for LLM agents to consume — humans rarely type brainclaw CLI
14
+ commands. The playbooks blur two distinct roles:
15
+
16
+ - The **user** of brainclaw is the agent. It calls MCP tools, consumes
17
+ the context format, writes memory, participates in loops.
18
+ - The **adopter** of brainclaw is the human. The human installs it,
19
+ chooses which agents to deploy, reviews outputs, and — at enterprise
20
+ scale — must be able to audit what the agents did.
21
+
22
+ These are different people with different requirements. Mixing them
23
+ biases priorities toward "visible to the human" features (dashboards,
24
+ CLI UX, Slack webhooks) at the expense of "invisible but load-bearing
25
+ for the agent" features (context compounding, memory staleness,
26
+ cross-agent coordination).
27
+
28
+ ## 2. Two-layer product architecture
29
+
30
+ brainclaw is really two products sharing a data substrate:
31
+
32
+ ### The engine — agent-facing
33
+
34
+ - **Role.** Where the agent does work. Every feature here is designed
35
+ to make the agent measurably more effective, which manifests
36
+ externally as fewer re-prompts, better grounded outputs, and cleaner
37
+ multi-agent coordination.
38
+ - **Surface.** MCP tools, context format, memory schema, Loop engine,
39
+ claims/plans/handoffs, federation signals.
40
+ - **Design constraints.** Minimal cognitive load for the agent
41
+ (structured inputs, discoverable vocabulary), no UX, machine-parseable
42
+ outputs, strict contracts. The agent should be able to use it without
43
+ the human intervening.
44
+ - **Success metric.** Agent effectiveness: does a long-running agent
45
+ with months of history outperform a fresh one? Does multi-agent
46
+ review produce better code than single-agent? Does session N+1 pick
47
+ up where session N left off without re-prompting?
48
+
49
+ ### The cockpit — human-facing
50
+
51
+ - **Role.** Where the human supervises, audits, and trusts. Rarely used
52
+ operationally by the agent. Critical for adoption, especially at
53
+ enterprise scale where visibility is a procurement gate.
54
+ - **Surface.** Dashboard (local or remote), audit narrative reports,
55
+ cost attribution, risk/policy surface, reputation views, forensics
56
+ and replay, CI governance gates, webhooks for operational alerts.
57
+ - **Design constraints.** Visual, aggregable, exportable,
58
+ compliance-ready. Human cognitive model: time series, filters,
59
+ drill-downs. Should run read-mostly on the data the engine emits.
60
+ - **Success metric.** Human confidence: does a tech lead understand
61
+ what N agents did this week? Can a compliance officer produce a
62
+ report? Can an ops engineer detect a misbehaving agent before
63
+ damage?
64
+
65
+ ### The relationship between the two
66
+
67
+ - The engine **emits** signals (events, audit entries, reputation
68
+ scores, usage traces, loop transitions).
69
+ - The cockpit **consumes** those signals and aggregates them for the
70
+ human.
71
+ - The engine is primary: without it, the cockpit is an empty chart.
72
+ The cockpit is load-bearing for adoption: without it, the engine
73
+ never reaches production at scale.
74
+ - Engineering discipline: every engine feature should ask "what
75
+ machine-readable signal should this emit so the cockpit can
76
+ consume it?". The cockpit is then incremental capitalization on
77
+ engine investments, not a parallel product effort.
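
The emit/consume split described above can be sketched as follows. The `EngineSignal` shape and the `summarize` function are assumptions made for illustration: the document names the signal categories but does not give their schema.

```typescript
// Hypothetical signal shape; field names are assumptions, not brainclaw's schema.
interface EngineSignal {
  agentId: string;
  kind: "event" | "audit_entry" | "loop_transition";
  at: string; // ISO 8601 timestamp
}

// Cockpit-side, read-only aggregation: count signals per agent and per kind.
// The engine only emits; nothing here writes back to engine state.
function summarize(
  signals: readonly EngineSignal[],
): Map<string, Record<string, number>> {
  const summary = new Map<string, Record<string, number>>();
  for (const signal of signals) {
    const counts = summary.get(signal.agentId) ?? {};
    counts[signal.kind] = (counts[signal.kind] ?? 0) + 1;
    summary.set(signal.agentId, counts);
  }
  return summary;
}
```

A weekly "what did N agents do" view would then be a filter on `at` followed by a call to `summarize`, which keeps the cockpit an incremental layer over what the engine already emits.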
78
+
79
+ ## 3. Loop engine strategy
80
+
81
+ The Loop engine (pln#394) was designed as a generic control plane —
82
+ one engine, many protocols. Review & Fix Loop (pln#395) was the first
83
+ shipped protocol. The strategic reflection clarifies that:
84
+
85
+ - We do **not** need to code eight protocols. We need to wire four
86
+ polished entry points for the high-leverage kinds, and document
87
+ patterns for the rest as composition variants.
88
+ - The engine already supports everything required: `open`, `turn`,
89
+ `advance`, `complete_turn`, `add_artifact`, `pause`, `resume`,
90
+ `close`, with per-phase `advance_when`, composite `StopCondition`,
91
+ idempotency, and CAS.
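
For readers unfamiliar with the last two guarantees, here is a minimal sketch of how idempotency and CAS (compare-and-swap) typically combine on a state transition. `LoopRecord`, `LoopStore`, and the `advance` signature are hypothetical and do not reflect the real Loop engine API.

```typescript
// Hypothetical loop record; the real engine's fields are not shown in this document.
interface LoopRecord {
  phase: string;
  version: number; // incremented on every successful write
}

type WriteResult =
  | { ok: true; record: LoopRecord; replayed: boolean }
  | { ok: false; reason: "version_conflict"; current: LoopRecord };

class LoopStore {
  private record: LoopRecord = { phase: "open", version: 0 };
  private applied = new Map<string, LoopRecord>();

  // Move to `nextPhase` only if the caller saw the latest version (CAS).
  // A repeated call with the same idempotency key returns the first result
  // instead of applying the transition twice.
  advance(
    nextPhase: string,
    expectedVersion: number,
    idempotencyKey: string,
  ): WriteResult {
    const previous = this.applied.get(idempotencyKey);
    if (previous) return { ok: true, record: previous, replayed: true };
    if (expectedVersion !== this.record.version) {
      return { ok: false, reason: "version_conflict", current: this.record };
    }
    this.record = { phase: nextPhase, version: this.record.version + 1 };
    this.applied.set(idempotencyKey, this.record);
    return { ok: true, record: this.record, replayed: false };
  }
}
```

With this pattern a retried request is harmless, and two agents racing on the same loop cannot both win: the slower one gets a `version_conflict` and has to re-read before trying again.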
92
+
93
+ ### Ranked protocols to wire next
94
+
95
+ 1. **Ideation Loop** — **MVP shipped in v1.5.0** (pln#492). The shipped
96
+ shape is single-champion-plus-memory rather than the four-role
97
+ framing originally drafted: empirical work in May 2026
98
+ (`feedback_ideation_loop_single_agent_method`) showed that one
99
+ model produces useful adversarial pressure when the critic phase's
100
+ `context_filter` makes it confront only adversarial memory (traps,
101
+ feedback, runtime_notes). Multi-agent slots are still supported as
102
+ an opt-in for richer diversity. See [docs/concepts/ideation-loop.md](../concepts/ideation-loop.md).
103
+ Reframer phase (pln#493) is the next layer — covers the
104
+ novelty/simplicity/external-pattern blind spot of memory-driven
105
+ critique.
106
+ 2. **Debug & Root-Cause Loop**. Five phases: symptom → hypothesis →
107
+ test → fix → verify. Targets the #1 pain point of single-agent
108
+ debugging — the lack of structure. High daily impact.
109
+ 3. **Research & Synthesis Loop**. Researcher → analyzer → synthesizer
110
+ → validator. Replaces "the human reads twenty pages" with a
111
+ condensed summary of the same sources. Novel utility vs the other
112
+ protocols.
113
+ 4. **Planning & Breakdown Loop**. Goal → decomposer → estimator →
114
+ validator → refiner. Compounds with brainclaw's existing Plans and
115
+ Sequences — makes plan creation less naive.
116
+
117
+ ### Variants, not new protocols
118
+
119
+ The following items from the brainstorm are compositions of the four
120
+ above and do not require separate engine work:
121
+
122
+ - **Reflection / Self-Critique** = ideation loop with `mode:
123
+ 'symmetric'` and all slots assigned to the same agent. The engine
124
+ already supports this.
125
+ - **Validation & Approval Multi-Audience** = review loop with N
126
+ reviewer slots (one per audience) plus a consolidator slot. Purely
127
+ a slot-configuration pattern.
128
+ - **Optimization / Refactoring** = implementation loop framed around
129
+ a before/after artifact pair. A convention, not a new protocol.
130
+
131
+ ### What "wiring" means concretely (per protocol)
132
+
133
+ For each of the four priority protocols:
134
+
135
+ - Polished `DEFAULT_PROTOCOLS` entry (phases, stop_condition, default
136
+ roles) in `src/core/loops/types.ts`.
137
+ - A facade entry point: either a new intent on `bclaw_coordinate`
138
+ (e.g. `intent='ideate'`) or a direct `bclaw_loop(open, kind=...)`
139
+ call pattern documented in the RFC.
140
+ - A human-visible output: the terminal artifact should materialize as
141
+ a candidate or handoff the human can read, not only an intra-loop
142
+ artifact. This is how the loop becomes a "helper" rather than a
143
+ process trace.
144
+ - End-to-end tests that cover the happy path plus at least one
145
+ iteration round.
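
As an illustration of the first bullet, a protocol entry for the Debug & Root-Cause Loop might look roughly like the sketch below. The `ProtocolDef` and `PhaseDef` types, the role names, the `maxIterations` field, and the `nextPhase` helper are all assumptions; the real definitions live in `src/core/loops/types.ts` and are not reproduced in this document.

```typescript
// All names below are hypothetical and for illustration only.
interface PhaseDef {
  name: string;
  role: string; // default role for the phase (invented names)
}

interface ProtocolDef {
  kind: string;
  phases: PhaseDef[];
  stopCondition: { maxIterations: number };
}

// The five phases come from the ranked list above; roles are made up.
const debugLoop: ProtocolDef = {
  kind: "debug_root_cause",
  phases: [
    { name: "symptom", role: "reporter" },
    { name: "hypothesis", role: "analyst" },
    { name: "test", role: "tester" },
    { name: "fix", role: "implementer" },
    { name: "verify", role: "reviewer" },
  ],
  stopCondition: { maxIterations: 3 },
};

// Returns the phase that follows `current`. After the last phase it wraps
// to the first phase for another round, or returns null once the
// iteration budget is spent. `iteration` is zero-based.
function nextPhase(
  protocol: ProtocolDef,
  current: string,
  iteration: number,
): { phase: string; iteration: number } | null {
  const index = protocol.phases.findIndex((p) => p.name === current);
  if (index === -1) throw new Error(`unknown phase: ${current}`);
  const next = protocol.phases[index + 1];
  if (next !== undefined) return { phase: next.name, iteration };
  const first = protocol.phases[0];
  if (first === undefined) return null;
  if (iteration + 1 >= protocol.stopCondition.maxIterations) return null;
  return { phase: first.name, iteration: iteration + 1 };
}
```

An end-to-end test in the sense of the last bullet would walk `nextPhase` through one full pass plus at least one wrap-around to the first phase.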
146
+
147
+ ## 4. Playbook refactoring note
148
+
149
+ The current playbooks (`docs/playbooks/integration/`, `productivity/`,
150
+ `team/`) mix the agent-user and human-adopter perspectives within a
151
+ single "audience" section. They should be refactored so each audience
152
+ file contains two explicit sections:
153
+
154
+ - **For the agent-user.** What the agent gains operationally: memory,
155
+ context, coordination, loops, review. This is the engine view.
156
+ - **For the human-adopter.** What the human gains in trust,
157
+ visibility, governance, compliance, cost control. This is the
158
+ cockpit view.
159
+
160
+ This split clarifies priorities: features that improve agent
161
+ effectiveness belong to the engine slice and trade against each other;
162
+ features that improve human confidence belong to the cockpit slice and
163
+ trade against each other. Mixing them biased the existing "Known Gaps"
164
+ sections toward visible-to-human items.
165
+
166
+ ## 5. Practical implications
167
+
168
+ - Next implementation move: reframer phase (pln#493) on top of the
169
+ shipped ideation_loop, then the Debug & Root-Cause Loop.
170
+ - Parallel track: the cockpit needs dedicated planning once the engine
171
+ emits enough signals (event streaming, reputation exposure, audit
172
+ narrative generation, cost attribution).
173
+ - Any new feature proposal should explicitly state which layer it
174
+ serves (engine or cockpit) and which audience slice (agent-user or
175
+ human-adopter). Proposals that don't declare this should be sent
176
+ back for clarification.
177
+
178
+ ## References
179
+
180
+ - `docs/concepts/loop-engine.md` — the v8 RFC (engine primitive)
181
+ - `docs/playbooks/*` — current audience playbooks (pending refactor
182
+ per §4 above)
183
+ - pln#394 `feat/loop-engine-mvp` — shipped
184
+ - pln#395 `feat/review-loop-protocol` — shipped