brainclaw 1.9.0 → 1.10.0
- package/README.md +631 -499
- package/dist/brainclaw-vscode.vsix +0 -0
- package/dist/cli.js +18 -1
- package/dist/commands/code-map.js +129 -0
- package/dist/commands/codev.js +7 -0
- package/dist/commands/harvest.js +1 -1
- package/dist/commands/hooks.js +73 -73
- package/dist/commands/init.js +1 -1
- package/dist/commands/install-hooks.js +78 -78
- package/dist/commands/mcp-read-handlers.js +57 -14
- package/dist/commands/mcp.js +200 -13
- package/dist/commands/run-profile.js +3 -2
- package/dist/commands/switch.js +125 -93
- package/dist/commands/version.js +1 -1
- package/dist/core/agent-capability.js +19 -4
- package/dist/core/agent-files.js +131 -119
- package/dist/core/code-map/backend.js +123 -0
- package/dist/core/code-map/core.js +81 -0
- package/dist/core/code-map/drafts.js +2 -0
- package/dist/core/code-map/extractor.js +29 -0
- package/dist/core/code-map/finalizer.js +191 -0
- package/dist/core/code-map/freshness.js +108 -0
- package/dist/core/code-map/ids.js +0 -0
- package/dist/core/code-map/importable.js +35 -0
- package/dist/core/code-map/indexes.js +197 -0
- package/dist/core/code-map/lang/java/imports.scm +17 -0
- package/dist/core/code-map/lang/java/index.js +254 -0
- package/dist/core/code-map/lang/java/tags.scm +48 -0
- package/dist/core/code-map/lang/php/imports.scm +21 -0
- package/dist/core/code-map/lang/php/index.js +251 -0
- package/dist/core/code-map/lang/php/tags.scm +44 -0
- package/dist/core/code-map/lang/provider.js +9 -0
- package/dist/core/code-map/lang/providers.js +24 -0
- package/dist/core/code-map/lang/python/imports.scm +90 -0
- package/dist/core/code-map/lang/python/index.js +364 -0
- package/dist/core/code-map/lang/python/tags.scm +81 -0
- package/dist/core/code-map/lang/query-runtime.js +374 -0
- package/dist/core/code-map/lang/registry.js +125 -0
- package/dist/core/code-map/lang/typescript/imports.scm +90 -0
- package/dist/core/code-map/lang/typescript/index.js +306 -0
- package/dist/core/code-map/lang/typescript/tags.js.scm +106 -0
- package/dist/core/code-map/lang/typescript/tags.scm +151 -0
- package/dist/core/code-map/lock.js +210 -0
- package/dist/core/code-map/materialized.js +51 -0
- package/dist/core/code-map/memory-reader.js +59 -0
- package/dist/core/code-map/paths.js +53 -0
- package/dist/core/code-map/query.js +568 -0
- package/dist/core/code-map/refresh.js +0 -0
- package/dist/core/code-map/resolve.js +177 -0
- package/dist/core/code-map/store.js +206 -0
- package/dist/core/code-map/types.js +288 -0
- package/dist/core/code-map/vocabulary.js +57 -0
- package/dist/core/code-map/wasm-loader.js +294 -0
- package/dist/core/code-map/work-section.js +206 -0
- package/dist/core/codev-prompts.js +38 -38
- package/dist/core/codev-rounds.js +4 -0
- package/dist/core/default-profiles/doctor.yaml +11 -11
- package/dist/core/default-profiles/janitor.yaml +11 -11
- package/dist/core/default-profiles/onboarder.yaml +11 -11
- package/dist/core/default-profiles/reviewer.yaml +13 -13
- package/dist/core/dispatcher.js +1 -1
- package/dist/core/entity-operations.js +29 -3
- package/dist/core/execution-adapters.js +11 -10
- package/dist/core/execution-profile.js +58 -0
- package/dist/core/execution.js +1 -1
- package/dist/core/facade-schema.js +9 -0
- package/dist/core/instruction-templates.js +2 -0
- package/dist/core/loops/verbs.js +0 -1
- package/dist/core/mcp-command-resolution.js +3 -1
- package/dist/core/messaging.js +2 -2
- package/dist/core/protocol-skills.js +164 -164
- package/dist/core/runtime-signals.js +1 -1
- package/dist/core/search.js +19 -2
- package/dist/core/security-guard.js +207 -207
- package/dist/core/spawn-check.js +16 -2
- package/dist/core/staleness.js +1 -1
- package/dist/core/store-resolution.js +67 -11
- package/dist/core/worktree.js +18 -18
- package/dist/facts.js +9 -5
- package/dist/facts.json +8 -4
- package/dist/vendor/web-tree-sitter/tree-sitter.js +3980 -0
- package/dist/vendor/web-tree-sitter/tree-sitter.wasm +0 -0
- package/dist/wasm/tree-sitter-java.wasm +0 -0
- package/dist/wasm/tree-sitter-javascript.wasm +0 -0
- package/dist/wasm/tree-sitter-php.wasm +0 -0
- package/dist/wasm/tree-sitter-python.wasm +0 -0
- package/dist/wasm/tree-sitter-tsx.wasm +0 -0
- package/dist/wasm/tree-sitter-typescript.wasm +0 -0
- package/dist/wasm/tree-sitter.wasm +0 -0
- package/docs/PROTOCOL.md +1 -1
- package/docs/adapters/openclaw.md +43 -43
- package/docs/architecture/project-refs.md +328 -328
- package/docs/cli.md +2131 -2093
- package/docs/code-map.md +198 -0
- package/docs/concepts/coordination.md +52 -52
- package/docs/concepts/coordinator-runbook.md +129 -129
- package/docs/concepts/dispatch-lifecycle.md +245 -245
- package/docs/concepts/event-log-store.md +928 -928
- package/docs/concepts/ideation-loop.md +317 -317
- package/docs/concepts/loop-engine.md +520 -511
- package/docs/concepts/mcp-governance.md +268 -268
- package/docs/concepts/memory.md +84 -84
- package/docs/concepts/multi-agent-workflows.md +167 -167
- package/docs/concepts/observer-protocol.md +361 -361
- package/docs/concepts/plans-and-claims.md +217 -217
- package/docs/concepts/project-md-convention.md +35 -35
- package/docs/concepts/runtime-notes.md +38 -38
- package/docs/concepts/troubleshooting.md +254 -254
- package/docs/concepts/workspace-bootstrapping.md +142 -142
- package/docs/context-format-changelog.md +35 -35
- package/docs/context-format.md +48 -48
- package/docs/index.md +65 -65
- package/docs/integrations/agents.md +158 -158
- package/docs/integrations/claude-code.md +23 -23
- package/docs/integrations/cline.md +77 -77
- package/docs/integrations/continue.md +55 -55
- package/docs/integrations/copilot.md +68 -68
- package/docs/integrations/cursor.md +23 -23
- package/docs/integrations/kilocode.md +72 -72
- package/docs/integrations/mcp.md +385 -378
- package/docs/integrations/mistral-vibe.md +122 -122
- package/docs/integrations/openclaw.md +92 -92
- package/docs/integrations/opencode.md +84 -84
- package/docs/integrations/overview.md +115 -115
- package/docs/integrations/roo.md +71 -71
- package/docs/integrations/windsurf.md +77 -77
- package/docs/mcp-schema-changelog.md +364 -356
- package/docs/playbooks/integration/index.md +121 -121
- package/docs/playbooks/orchestration.md +37 -0
- package/docs/playbooks/productivity/index.md +99 -99
- package/docs/playbooks/team/index.md +117 -117
- package/docs/product/agent-first-model.md +184 -184
- package/docs/product/entity-model-audit.md +462 -462
- package/docs/product/positioning.md +86 -86
- package/docs/quickstart-existing-project.md +107 -107
- package/docs/quickstart.md +183 -183
- package/docs/release-maintenance.md +79 -79
- package/docs/reputation.md +52 -52
- package/docs/review.md +45 -45
- package/docs/security.md +212 -212
- package/docs/server-operations.md +118 -118
- package/docs/storage.md +106 -106
- package/package.json +86 -66
- package/docs/concepts/event-log-store-critique-A.md +0 -333
- package/docs/concepts/event-log-store-critique-B.md +0 -353
- package/docs/concepts/event-log-store-phase0-measurements.md +0 -58
- package/docs/concepts/event-log-store-proposal-A.md +0 -365
- package/docs/concepts/event-log-store-proposal-B.md +0 -404
- package/docs/concepts/identity-model-proposal.md +0 -371
|
@@ -1,53 +1,53 @@
|
|
|
1
|
-
# Audience: Teams & Ops
|
|
2
|
-
|
|
3
|
-
Design constraints for agents developing brainclaw features that serve team developers, project maintainers, and CI/CD operators.
|
|
4
|
-
|
|
5
|
-
## Profiles
|
|
6
|
-
|
|
7
|
-
### Team Developers
|
|
8
|
-
Developers collaborating on shared projects where multiple humans (and their agents) contribute. They need async coordination to avoid conflicting edits.
|
|
9
|
-
|
|
10
|
-
### Project Maintainers
|
|
11
|
-
Senior developers or tech leads responsible for code quality and architecture. They define the rules that all agents must follow in the repo.
|
|
12
|
-
|
|
13
|
-
### CI/CD Operators
|
|
14
|
-
Engineers who run agents in headless pipelines. They need brainclaw to work without interactive prompts and produce machine-readable output.
|
|
15
|
-
|
|
16
|
-
## Design Constraints
|
|
17
|
-
|
|
18
|
-
These rules apply to any feature that touches this audience:
|
|
19
|
-
|
|
20
|
-
1. **Concurrent agents is the nominal case, not the exception.** Any mutation of shared state (plans, claims, memory) must handle conflicts gracefully. File claims exist precisely for this — features that bypass claims are bugs. `bclaw_conflict_check` provides pre-edit safety, and `bclaw_check_policy` enforces scope compliance with glob-based pattern matching.
|
|
21
|
-
|
|
22
|
-
2. **Claims are mandatory for file-level coordination.** When multiple agents can be active, no agent should edit files without a claim. The claim system must be strict enough to prevent collisions and clear enough that agents understand scope boundaries. Policy checks return hard blocks (stops) and warnings (advisories) — both must be surfaced.
|
|
23
|
-
|
|
24
|
-
3. **Project constraints must be enforceable.** When a maintainer defines architecture rules or known traps, every agent joining the project must read them before working. This is not optional — it is the mechanism for scaling quality across agents. Lifecycle trigger tags (`trigger:post-claim`, `trigger:pre-session-end`) allow constraints to fire at specific moments.
|
|
25
|
-
|
|
26
|
-
4. **Headless operation must be first-class.** All CLI commands must work non-interactively (`--yes`, `--json`, exit codes). CI/CD agents cannot answer prompts. If a feature adds an interactive flow, it must have a non-interactive equivalent. Delivery methods (stdin_pipe, inline_arg, temp_file, inbox_structured) adapt to the agent's capabilities automatically.
|
|
27
|
-
|
|
28
|
-
5. **Audit trail matters.** For teams, knowing which agent did what and why is not a nice-to-have. The audit system (`audit.ts`) logs 30+ action types in JSONL with auto-rotation (10MB threshold), captures actor/action/before-after/scope/session_id/host_id. The governance posture report (`bclaw_audit`) aggregates this into constitution, red-lines, active claims, and recent activity summaries.
|
|
29
|
-
|
|
30
|
-
6. **Worktree isolation is the safe default for parallel work.** When multiple agents work simultaneously, each should get its own Git worktree. Worktrees are created under `~/.brainclaw/worktrees/<project-hash>/` with sidecar tracking (session_id, agent, user). `brainclaw worktree clean` prunes merged branches, `brainclaw worktree merge` handles restoration.
|
|
31
|
-
|
|
32
|
-
## Anti-Patterns
|
|
33
|
-
|
|
34
|
-
- Assuming single-agent, single-session usage
|
|
35
|
-
- Modifying shared state without conflict detection
|
|
36
|
-
- Adding interactive-only flows with no `--yes` / `--json` fallback
|
|
37
|
-
- Storing coordination state outside of Git-tracked `.brainclaw/` (breaks team sync)
|
|
38
|
-
- Silently overwriting another agent's claims or plan state
|
|
39
|
-
- Producing output that requires a human to interpret (no structured/machine-readable option)
|
|
40
|
-
- Ignoring reputation signals when evaluating agent contributions
|
|
41
|
-
|
|
42
|
-
## Key Features for This Audience
|
|
43
|
-
|
|
44
|
-
### Coordination & Claims
|
|
45
|
-
- `bclaw_claim` / `bclaw_release_claim` — file-level advisory locks with auto-worktree per claim
|
|
46
|
-
- `bclaw_find(entity="claim", filter?)` / `bclaw_get(entity="claim", id)` — list and read claims via the canonical grammar
|
|
47
|
-
- `bclaw_conflict_check` — pre-edit conflict detection between agents
|
|
48
|
-
- `bclaw_check_policy` — scope compliance with glob matching, returns blocks + warnings
|
|
49
|
-
- `bclaw_who` — list all active agent sessions on the workspace
|
|
50
|
-
|
|
1
|
+
# Audience: Teams & Ops
|
|
2
|
+
|
|
3
|
+
Design constraints for agents developing brainclaw features that serve team developers, project maintainers, and CI/CD operators.
|
|
4
|
+
|
|
5
|
+
## Profiles
|
|
6
|
+
|
|
7
|
+
### Team Developers
|
|
8
|
+
Developers collaborating on shared projects where multiple humans (and their agents) contribute. They need async coordination to avoid conflicting edits.
|
|
9
|
+
|
|
10
|
+
### Project Maintainers
|
|
11
|
+
Senior developers or tech leads responsible for code quality and architecture. They define the rules that all agents must follow in the repo.
|
|
12
|
+
|
|
13
|
+
### CI/CD Operators
|
|
14
|
+
Engineers who run agents in headless pipelines. They need brainclaw to work without interactive prompts and produce machine-readable output.
|
|
15
|
+
|
|
16
|
+
## Design Constraints
|
|
17
|
+
|
|
18
|
+
These rules apply to any feature that touches this audience:
|
|
19
|
+
|
|
20
|
+
1. **Concurrent agents is the nominal case, not the exception.** Any mutation of shared state (plans, claims, memory) must handle conflicts gracefully. File claims exist precisely for this — features that bypass claims are bugs. `bclaw_conflict_check` provides pre-edit safety, and `bclaw_check_policy` enforces scope compliance with glob-based pattern matching.
|
|
21
|
+
|
|
22
|
+
2. **Claims are mandatory for file-level coordination.** When multiple agents can be active, no agent should edit files without a claim. The claim system must be strict enough to prevent collisions and clear enough that agents understand scope boundaries. Policy checks return hard blocks (stops) and warnings (advisories) — both must be surfaced.
|
|
23
|
+
|
|
24
|
+
3. **Project constraints must be enforceable.** When a maintainer defines architecture rules or known traps, every agent joining the project must read them before working. This is not optional — it is the mechanism for scaling quality across agents. Lifecycle trigger tags (`trigger:post-claim`, `trigger:pre-session-end`) allow constraints to fire at specific moments.
|
|
25
|
+
|
|
26
|
+
4. **Headless operation must be first-class.** All CLI commands must work non-interactively (`--yes`, `--json`, exit codes). CI/CD agents cannot answer prompts. If a feature adds an interactive flow, it must have a non-interactive equivalent. Delivery methods (stdin_pipe, inline_arg, temp_file, inbox_structured) adapt to the agent's capabilities automatically.
|
|
27
|
+
|
|
28
|
+
5. **Audit trail matters.** For teams, knowing which agent did what and why is not a nice-to-have. The audit system (`audit.ts`) logs 30+ action types in JSONL with auto-rotation (10MB threshold), captures actor/action/before-after/scope/session_id/host_id. The governance posture report (`bclaw_audit`) aggregates this into constitution, red-lines, active claims, and recent activity summaries.
|
|
29
|
+
|
|
30
|
+
6. **Worktree isolation is the safe default for parallel work.** When multiple agents work simultaneously, each should get its own Git worktree. Worktrees are created under `~/.brainclaw/worktrees/<project-hash>/` with sidecar tracking (session_id, agent, user). `brainclaw worktree clean` prunes merged branches, `brainclaw worktree merge` handles restoration.
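
The audit record described in constraint 5 can be sketched roughly as follows. This is a minimal illustration, not the contents of `audit.ts`: the field names follow the list in the text, while the value types, the `ts` timestamp field, the helper names, and the reading of "10MB" as 10 × 1024 × 1024 bytes are all assumptions.

```typescript
// Hypothetical record shape: field names follow the text (actor, action,
// before/after, scope, session_id, host_id); value types and "ts" are assumed.
interface AuditRecord {
  ts: string;
  actor: string;
  action: string;
  before?: unknown;
  after?: unknown;
  scope?: string;
  session_id?: string;
  host_id?: string;
}

// JSONL: one JSON object per line, each line newline-terminated.
function toJsonl(records: AuditRecord[]): string {
  return records.map((r) => JSON.stringify(r) + "\n").join("");
}

// Parse JSONL back into records, skipping blank lines.
function fromJsonl(text: string): AuditRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as AuditRecord);
}

// Rotation check against the 10 MB threshold named in the text
// (the exact byte count is an assumption).
const ROTATE_BYTES = 10 * 1024 * 1024;
function shouldRotate(currentSizeBytes: number): boolean {
  return currentSizeBytes >= ROTATE_BYTES;
}
```

Keeping one self-contained object per line is what makes append-only logging and size-based rotation simple: a file can be cut at any line boundary without corrupting earlier records.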
|
|
31
|
+
|
|
32
|
+
## Anti-Patterns
|
|
33
|
+
|
|
34
|
+
- Assuming single-agent, single-session usage
|
|
35
|
+
- Modifying shared state without conflict detection
|
|
36
|
+
- Adding interactive-only flows with no `--yes` / `--json` fallback
|
|
37
|
+
- Storing coordination state outside of Git-tracked `.brainclaw/` (breaks team sync)
|
|
38
|
+
- Silently overwriting another agent's claims or plan state
|
|
39
|
+
- Producing output that requires a human to interpret (no structured/machine-readable option)
|
|
40
|
+
- Ignoring reputation signals when evaluating agent contributions
|
|
41
|
+
|
|
42
|
+
## Key Features for This Audience
|
|
43
|
+
|
|
44
|
+
### Coordination & Claims
|
|
45
|
+
- `bclaw_claim` / `bclaw_release_claim` — file-level advisory locks with auto-worktree per claim
|
|
46
|
+
- `bclaw_find(entity="claim", filter?)` / `bclaw_get(entity="claim", id)` — list and read claims via the canonical grammar
|
|
47
|
+
- `bclaw_conflict_check` — pre-edit conflict detection between agents
|
|
48
|
+
- `bclaw_check_policy` — scope compliance with glob matching, returns blocks + warnings
|
|
49
|
+
- `bclaw_who` — list all active agent sessions on the workspace
|
|
50
|
+
|
|
51
51
|
### Planning & Sequencing
|
|
52
52
|
- `bclaw_create(entity="plan", data)` / `bclaw_update(entity="plan", id, patch)` / `bclaw_transition(entity="plan", id, to)` — shared work planning with priority/effort/assignee/status
|
|
53
53
|
- `bclaw_find(entity="plan", filter?)` — list plans
|
|
@@ -55,70 +55,70 @@ These rules apply to any feature that touches this audience:
|
|
|
55
55
|
- `bclaw_create_sequence` / `bclaw_list_sequences` / `bclaw_update_sequence` — multi-step coordination sequences with lane analysis
|
|
56
56
|
|
|
57
57
|
Sequence items are explicit lane records: `{ planId, stepId?, rank, hard_after?, soft_after?, lane?, scope_hint?, rationale? }`. `planId` and `rank` are required; `stepId` lets a sequence dispatch a specific plan step instead of the whole plan. A normal parallel flow is: create sequence as `draft`, update it to `active`, run `bclaw_dispatch(intent="analysis")`, then run `bclaw_dispatch(intent="execute", agents=[...])`.
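
The item shape above can be written down as a TypeScript type. This is only a sketch: the field names and the required/optional split come from the text, but the value types (strings, a numeric `rank`, arrays for `hard_after`/`soft_after`) and the `validateSequenceItem` helper are assumptions.

```typescript
// Hypothetical typing of a sequence item. Only the field names and which
// fields are required come from the text; the value types are assumed.
interface SequenceItem {
  planId: string;
  stepId?: string;
  rank: number;
  hard_after?: string[];
  soft_after?: string[];
  lane?: string;
  scope_hint?: string;
  rationale?: string;
}

// Returns a list of problems; an empty list means both required fields are present.
function validateSequenceItem(item: Partial<SequenceItem>): string[] {
  const problems: string[] = [];
  if (typeof item.planId !== "string" || item.planId.length === 0) {
    problems.push("planId is required");
  }
  if (typeof item.rank !== "number" || !Number.isFinite(item.rank)) {
    problems.push("rank is required");
  }
  return problems;
}
```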
|
|
58
|
-
|
|
59
|
-
### Multi-Agent Dispatch
|
|
60
|
-
- `bclaw_dispatch(intent)` — `analysis` analyses an active sequence (ready/active/blocked/done per lane), `execute` fans out parallel work across lanes, `review` dispatches code reviews for completed handoffs
|
|
61
|
-
- `bclaw_coordinate(intent)` — facade for `assign` / `consult` / `review` / `reroute` / `summarize`. Pass `open_loop: true` on `intent="review"` to also dispatch the reviewer turn (the recommended path).
|
|
62
|
-
- `bclaw_loop(intent)` — drive a turn in an existing multi-turn loop (`turn`, `complete_turn`, `advance`, `close`)
|
|
63
|
-
- Delivery channels: spawn (CLI subprocess) or inbox (structured messages)
|
|
64
|
-
- Dry-run mode for previewing assignments
|
|
65
|
-
|
|
66
|
-
### Inter-Agent Messaging
|
|
67
|
-
- `bclaw_send_message` / `bclaw_read_inbox` / `bclaw_ack_message` — structured messaging between agents
|
|
68
|
-
- `bclaw_get_thread` — full thread view across agent inboxes
|
|
69
|
-
- CLI: `brainclaw inbox list/ack/archive/send/thread`
|
|
70
|
-
|
|
71
|
-
### Visibility & Governance
|
|
72
|
-
- `bclaw_context(kind="board")` / `bclaw_context(kind="board_summary")` — active plans, claims by agent, open handoffs, sequences, resolved instructions (full or compact form)
|
|
73
|
-
- `bclaw_audit` — governance posture report (constitution, red-lines, claim activity, recommendations)
|
|
74
|
-
- `bclaw_history` — full mutation history per memory item with before/after snapshots
|
|
75
|
-
- Reputation signals — 4 scores: contribution_quality, review_reliability, continuity_hygiene, internal_trust
|
|
76
|
-
- `bclaw_estimation_report` — estimation accuracy for completed plans
|
|
77
|
-
|
|
78
|
-
### Handoffs & Review
|
|
79
|
-
- `bclaw_get(entity="handoff", id)` / `bclaw_update(entity="handoff", id, patch)` — structured agent transitions with git diff and state snapshot
|
|
80
|
-
- `bclaw_create(entity="candidate", data)` / `bclaw_transition(entity="candidate", id, to="accepted"|"rejected")` — review workflow with candidate promotion
|
|
81
|
-
- `bclaw_coordinate(intent="review", open_loop: true, review_mode: "symmetric")` — open a multi-turn review-and-fix loop in one call (the recommended way to dispatch a reviewer)
|
|
82
|
-
- `brainclaw review` — reflective review from CLI
|
|
83
|
-
|
|
84
|
-
### Worktree Management
|
|
85
|
-
- `bclaw_claim` with auto-worktree — creates isolated branch per agent/scope
|
|
86
|
-
- `brainclaw worktree clean` — remove merged worktrees and orphan directories
|
|
87
|
-
- `brainclaw worktree merge <branch>` — merge with auto-restoration
|
|
88
|
-
- Sidecar tracking: session_id, agent name, user per worktree
|
|
89
|
-
|
|
90
|
-
## Known Gaps
|
|
91
|
-
|
|
92
|
-
Features this audience would naturally expect but that are not yet implemented:
|
|
93
|
-
|
|
94
|
-
### No merge conflict resolution tooling
|
|
95
|
-
brainclaw creates worktrees and manages branches but provides zero help when two agents' worktrees conflict at merge time. No pre-merge conflict detection, no merge strategy hints, no assisted resolution. Teams must handle all git merge/rebase manually.
|
|
96
|
-
**Impact:** The hardest part of multi-agent parallel work (the merge) is entirely outside brainclaw's scope.
|
|
97
|
-
|
|
98
|
-
### No push notifications or webhooks
|
|
99
|
-
The event log (`event-log.ts`) records all mutations in JSONL and supports cursor-based polling (`readUnseenEvents()`), but there is no push mechanism. No webhooks, no Slack integration, no email alerts. When a claim conflicts or a handoff is ready, agents must poll to discover it.
|
|
100
|
-
**Impact:** Teams can't get real-time alerts on coordination events. Stale claim detection depends on someone polling.
|
|
101
|
-
|
|
102
|
-
### Claim expiry has no background enforcement
|
|
103
|
-
Claim expiry logic exists (`claims.ts`: `isClaimExpired()`, `isClaimStale()`, `expireStaleActiveClaims()`), but there is no daemon or background process. Expiry only triggers when a command explicitly calls it. If an agent crashes and abandons a claim, it blocks other agents until someone runs a check.
|
|
104
|
-
**Impact:** Stale claims from crashed agents can block team progress indefinitely.
|
|
105
|
-
|
|
106
|
-
### No path-level access restrictions
|
|
107
|
-
Trust levels exist (observer/contributor/trusted/curator in `agent-registry.ts`), but they are all-or-nothing per tier. There is no way to say "junior agents can't modify src/core/" or restrict specific agents to specific scopes. Policy checks (`policy.ts`) warn on claim conflicts but don't enforce path-based permissions.
|
|
108
|
-
**Impact:** Maintainers cannot protect critical code paths from untrusted agents.
|
|
109
|
-
|
|
110
|
-
### No web dashboard or visualization
|
|
111
|
-
All output is CLI text or JSON (`metrics.ts`, `doctor.ts`, `audit`). There is no web UI, no HTML reports, no dependency graphs, no Gantt charts for plans. Teams that want visual coordination views must build their own from JSON exports.
|
|
112
|
-
**Impact:** Ops teams without CLI fluency have no visibility into coordination state.
|
|
113
|
-
|
|
114
|
-
### No pre-built CI GitHub Action
|
|
115
|
-
brainclaw works in CI (proven by `.github/workflows/brainclaw-sync.yml` in this repo), but there is no reusable GitHub Action or template. Operators must manually wire `brainclaw doctor --json` into their pipelines. No `--ci` mode flag, no structured exit codes (warning vs error).
|
|
116
|
-
**Impact:** CI adoption requires per-repo custom work instead of a one-line Action reference.
|
|
117
|
-
|
|
118
|
-
### Plan dependencies lack cycle/deadlock detection
|
|
119
|
-
Plans support `depends_on` and sequences support `hard_after`/`soft_after` (`schema.ts`, `dispatcher.ts`), but there is no transitive dependency validation. Circular dependencies (A→B→C→A) are not detected before dispatch. No critical path analysis or slack calculation.
|
|
120
|
-
**Impact:** Complex plan graphs can deadlock silently.
|
|
121
|
-
|
|
122
|
-
### No team onboarding walkthrough
|
|
123
|
-
When a new developer joins an existing brainclaw project, they get the same `init` flow as a greenfield project. There is no "get familiar with this project" step, no guided tour of existing plans/constraints/traps, no role assignment during setup.
|
|
124
|
-
**Impact:** New team members start without understanding the project's coordination state.
|
|
58
|
+
|
|
59
|
+
### Multi-Agent Dispatch
|
|
60
|
+
- `bclaw_dispatch(intent)` — `analysis` analyses an active sequence (ready/active/blocked/done per lane), `execute` fans out parallel work across lanes, `review` dispatches code reviews for completed handoffs
|
|
61
|
+
- `bclaw_coordinate(intent)` — facade for `assign` / `consult` / `review` / `reroute` / `summarize`. Pass `open_loop: true` on `intent="review"` to also dispatch the reviewer turn (the recommended path).
|
|
62
|
+
- `bclaw_loop(intent)` — drive a turn in an existing multi-turn loop (`turn`, `complete_turn`, `advance`, `close`)
|
|
63
|
+
- Delivery channels: spawn (CLI subprocess) or inbox (structured messages)
|
|
64
|
+
- Dry-run mode for previewing assignments
|
|
65
|
+
|
|
66
|
+
### Inter-Agent Messaging
|
|
67
|
+
- `bclaw_send_message` / `bclaw_read_inbox` / `bclaw_ack_message` — structured messaging between agents
|
|
68
|
+
- `bclaw_get_thread` — full thread view across agent inboxes
|
|
69
|
+
- CLI: `brainclaw inbox list/ack/archive/send/thread`
|
|
70
|
+
|
|
71
|
+
### Visibility & Governance
|
|
72
|
+
- `bclaw_context(kind="board")` / `bclaw_context(kind="board_summary")` — active plans, claims by agent, open handoffs, sequences, resolved instructions (full or compact form)
|
|
73
|
+
- `bclaw_audit` — governance posture report (constitution, red-lines, claim activity, recommendations)
|
|
74
|
+
- `bclaw_history` — full mutation history per memory item with before/after snapshots
|
|
75
|
+
- Reputation signals — 4 scores: contribution_quality, review_reliability, continuity_hygiene, internal_trust
|
|
76
|
+
- `bclaw_estimation_report` — estimation accuracy for completed plans
|
|
77
|
+
|
|
78
|
+
### Handoffs & Review
|
|
79
|
+
- `bclaw_get(entity="handoff", id)` / `bclaw_update(entity="handoff", id, patch)` — structured agent transitions with git diff and state snapshot
|
|
80
|
+
- `bclaw_create(entity="candidate", data)` / `bclaw_transition(entity="candidate", id, to="accepted"|"rejected")` — review workflow with candidate promotion
|
|
81
|
+
- `bclaw_coordinate(intent="review", open_loop: true, review_mode: "symmetric")` — open a multi-turn review-and-fix loop in one call (the recommended way to dispatch a reviewer)
|
|
82
|
+
- `brainclaw review` — reflective review from CLI
|
|
83
|
+
|
|
84
|
+
### Worktree Management
|
|
85
|
+
- `bclaw_claim` with auto-worktree — creates isolated branch per agent/scope
|
|
86
|
+
- `brainclaw worktree clean` — remove merged worktrees and orphan directories
|
|
87
|
+
- `brainclaw worktree merge <branch>` — merge with auto-restoration
|
|
88
|
+
- Sidecar tracking: session_id, agent name, user per worktree
|
|
89
|
+
|
|
90
|
+
## Known Gaps
|
|
91
|
+
|
|
92
|
+
Features this audience would naturally expect but that are not yet implemented:
|
|
93
|
+
|
|
94
|
+
### No merge conflict resolution tooling
|
|
95
|
+
brainclaw creates worktrees and manages branches but provides zero help when two agents' worktrees conflict at merge time. No pre-merge conflict detection, no merge strategy hints, no assisted resolution. Teams must handle all git merge/rebase manually.
|
|
96
|
+
**Impact:** The hardest part of multi-agent parallel work (the merge) is entirely outside brainclaw's scope.
|
|
97
|
+
|
|
98
|
+
### No push notifications or webhooks
|
|
99
|
+
The event log (`event-log.ts`) records all mutations in JSONL and supports cursor-based polling (`readUnseenEvents()`), but there is no push mechanism. No webhooks, no Slack integration, no email alerts. When a claim conflicts or a handoff is ready, agents must poll to discover it.
|
|
100
|
+
**Impact:** Teams can't get real-time alerts on coordination events. Stale claim detection depends on someone polling.
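
Cursor-based polling, as described above, amounts to remembering the last event seen and asking only for newer ones. The sketch below illustrates the idea on an in-memory array; the `LoggedEvent` shape, the numeric `seq` cursor, and the `pollSince` helper are hypothetical and do not reflect the actual `event-log.ts` API.

```typescript
// Hypothetical event shape; the real event-log.ts types are not shown in the text.
interface LoggedEvent {
  seq: number; // assumed monotonically increasing sequence number
  type: string;
}

// Returns the events newer than the cursor, plus the advanced cursor.
// If nothing is new, the cursor is returned unchanged.
function pollSince(
  events: LoggedEvent[],
  cursor: number,
): { unseen: LoggedEvent[]; cursor: number } {
  const unseen = events.filter((e) => e.seq > cursor);
  const next = unseen.reduce((max, e) => Math.max(max, e.seq), cursor);
  return { unseen, cursor: next };
}
```

Each consumer keeps its own cursor, so discovery latency is bounded by how often that consumer polls, which is exactly the limitation this gap describes.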
|
|
101
|
+
|
|
102
|
+
### Claim expiry has no background enforcement
|
|
103
|
+
Claim expiry logic exists (`claims.ts`: `isClaimExpired()`, `isClaimStale()`, `expireStaleActiveClaims()`), but there is no daemon or background process. Expiry only triggers when a command explicitly calls it. If an agent crashes and abandons a claim, it blocks other agents until someone runs a check.
|
|
104
|
+
**Impact:** Stale claims from crashed agents can block team progress indefinitely.
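
To make "expiry only triggers when a command explicitly calls it" concrete, here is a sketch of a sweep that a command would have to invoke itself. The `Claim` shape, the `expiresAt` timestamp, and the `sweepExpired` helper are hypothetical; they are not the functions in `claims.ts`.

```typescript
// Hypothetical claim shape; the real claims.ts types are not shown in the text.
interface Claim {
  id: string;
  agent: string;
  status: "active" | "expired";
  expiresAt: number; // assumed epoch milliseconds
}

// Marks active claims past their deadline as expired. Nothing calls this on a
// timer: a claim stays "active" until some command runs the sweep.
function sweepExpired(claims: Claim[], now: number): Claim[] {
  return claims.map((c): Claim =>
    c.status === "active" && c.expiresAt <= now ? { ...c, status: "expired" } : c,
  );
}
```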
|
|
105
|
+
|
|
106
|
+
### No path-level access restrictions
|
|
107
|
+
Trust levels exist (observer/contributor/trusted/curator in `agent-registry.ts`), but they are all-or-nothing per tier. There is no way to say "junior agents can't modify src/core/" or restrict specific agents to specific scopes. Policy checks (`policy.ts`) warn on claim conflicts but don't enforce path-based permissions.
|
|
108
|
+
**Impact:** Maintainers cannot protect critical code paths from untrusted agents.
|
|
109
|
+
|
|
110
|
+
### No web dashboard or visualization
|
|
111
|
+
All output is CLI text or JSON (`metrics.ts`, `doctor.ts`, `audit`). There is no web UI, no HTML reports, no dependency graphs, no Gantt charts for plans. Teams that want visual coordination views must build their own from JSON exports.
|
|
112
|
+
**Impact:** Ops teams without CLI fluency have no visibility into coordination state.
|
|
113
|
+
|
|
114
|
+
### No pre-built CI GitHub Action
|
|
115
|
+
brainclaw works in CI (proven by `.github/workflows/brainclaw-sync.yml` in this repo), but there is no reusable GitHub Action or template. Operators must manually wire `brainclaw doctor --json` into their pipelines. No `--ci` mode flag, no structured exit codes (warning vs error).
|
|
116
|
+
**Impact:** CI adoption requires per-repo custom work instead of a one-line Action reference.
|
|
117
|
+
|
|
118
|
+
### Plan dependencies lack cycle/deadlock detection
|
|
119
|
+
Plans support `depends_on` and sequences support `hard_after`/`soft_after` (`schema.ts`, `dispatcher.ts`), but there is no transitive dependency validation. Circular dependencies (A→B→C→A) are not detected before dispatch. No critical path analysis or slack calculation.
|
|
120
|
+
**Impact:** Complex plan graphs can deadlock silently.
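
The missing check is a standard cycle search over the dependency graph. The sketch below uses a depth-first search over a plain map of plan id to the ids it depends on; the `DependencyGraph` type and `findCycle` function are hypothetical and are not part of `schema.ts` or `dispatcher.ts`.

```typescript
// Maps each plan id to the ids it depends on (the depends_on edges).
type DependencyGraph = Record<string, string[]>;

// Depth-first search tracking three states per node (unvisited, visiting,
// done). Returns one cycle as a path whose first and last ids are equal,
// or null if the graph is acyclic.
function findCycle(graph: DependencyGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (id: string): string[] | null => {
    if (state.get(id) === "done") return null;
    if (state.get(id) === "visiting") {
      // Reached a node already on the current path: that closes a cycle.
      return [...stack.slice(stack.indexOf(id)), id];
    }
    state.set(id, "visiting");
    stack.push(id);
    for (const dep of graph[id] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(id, "done");
    return null;
  };

  for (const id of Object.keys(graph)) {
    const cycle = visit(id);
    if (cycle) return cycle;
  }
  return null;
}
```

Running a check like this before dispatch would turn the A→B→C→A case into an explicit error instead of a silent deadlock.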
|
|
121
|
+
|
|
122
|
+
### No team onboarding walkthrough
|
|
123
|
+
When a new developer joins an existing brainclaw project, they get the same `init` flow as a greenfield project. There is no "get familiar with this project" step, no guided tour of existing plans/constraints/traps, no role assignment during setup.
|
|
124
|
+
**Impact:** New team members start without understanding the project's coordination state.
|
|
@@ -1,184 +1,184 @@
|
|
|
1
|
-
# Agent-first product model
|
|
2
|
-
|
|
3
|
-
Captured from a strategic session on 2026-04-18. This document frames
|
|
4
|
-
brainclaw's user model, the resulting two-layer product architecture,
|
|
5
|
-
and the loop-protocols roadmap implied by both. It supersedes the
|
|
6
|
-
implicit framing in the current audience playbooks (`docs/playbooks/*`)
|
|
7
|
-
and should be read before recommending new features.
|
|
8
|
-
|
|
9
|
-
## 1. Who is the user?
|
|
10
|
-
|
|
11
|
-
The playbooks describe personae as humans ("Non-Tech Creator", "Solo
|
|
12
|
-
Developer", "CI/CD Operator"). That framing is misleading. brainclaw is
|
|
13
|
-
built for LLM agents to consume — humans rarely type brainclaw CLI
|
|
14
|
-
commands. The lines blur because:
|
|
15
|
-
|
|
16
|
-
- The **user** of brainclaw is the agent. It calls MCP tools, consumes
|
|
17
|
-
the context format, writes memory, participates in loops.
|
|
18
|
-
- The **adopter** of brainclaw is the human. The human installs it,
|
|
19
|
-
chooses which agents to deploy, reviews outputs, and — at enterprise
|
|
20
|
-
scale — must be able to audit what the agents did.
|
|
21
|
-
|
|
22
|
-
These are different people with different requirements. Mixing them
|
|
23
|
-
biases priorities toward "visible to the human" features (dashboards,
|
|
24
|
-
CLI UX, Slack webhooks) at the expense of "invisible but load-bearing
|
|
25
|
-
for the agent" features (context compounding, memory staleness,
|
|
26
|
-
cross-agent coordination).
|
|
27
|
-
|
|
28
|
-
## 2. Two-layer product architecture
|
|
29
|
-
|
|
30
|
-
brainclaw is really two products sharing a data substrate:
|
|
31
|
-
|
|
32
|
-
### The engine — agent-facing
|
|
33
|
-
|
|
34
|
-
- **Role.** Where the agent does work. Every feature here is designed
|
|
35
|
-
to make the agent measurably more effective, which manifests
|
|
36
|
-
externally as fewer re-prompts, better grounded outputs, and cleaner
|
|
37
|
-
multi-agent coordination.
|
|
38
|
-
- **Surface.** MCP tools, context format, memory schema, Loop engine,
|
|
39
|
-
claims/plans/handoffs, federation signals.
|
|
40
|
-
- **Design constraints.** Minimal cognitive load for the agent
|
|
41
|
-
(structured inputs, discoverable vocabulary), no UX, machine-parseable
|
|
42
|
-
outputs, strict contracts. The agent should be able to use it without
|
|
43
|
-
the human intervening.
|
|
44
|
-
- **Success metric.** Agent effectiveness: does a long-running agent
|
|
45
|
-
with months of history outperform a fresh one? Does multi-agent
|
|
46
|
-
review produce better code than single-agent? Does session N+1 pick
|
|
47
|
-
up where session N left off without re-prompting?
|
|
48
|
-
|
|
49
|
-
### The cockpit — human-facing
|
|
50
|
-
|
|
51
|
-
- **Role.** Where the human supervises, audits, and trusts. Rarely used
|
|
52
|
-
operationally by the agent. Critical for adoption, especially at
|
|
53
|
-
enterprise scale where visibility is a procurement gate.
|
|
54
|
-
- **Surface.** Dashboard (local or remote), audit narrative reports,
|
|
55
|
-
cost attribution, risk/policy surface, reputation views, forensics
|
|
56
|
-
and replay, CI governance gates, webhooks for operational alerts.
|
|
57
|
-
- **Design constraints.** Visual, aggregable, exportable,
|
|
58
|
-
compliance-ready. Human cognitive model: time series, filters,
|
|
59
|
-
drill-downs. Should run read-mostly on the data the engine emits.
|
|
60
|
-
- **Success metric.** Human confidence: does a tech lead understand
|
|
61
|
-
what N agents did this week? Can a compliance officer produce a
|
|
62
|
-
report? Can an ops engineer detect a misbehaving agent before
|
|
63
|
-
damage?
|
|
64
|
-
|
|
65
|
-
### The relationship between the two
|
|
66
|
-
|
|
67
|
-
- The engine **emits** signals (events, audit entries, reputation
|
|
68
|
-
scores, usage traces, loop transitions).
|
|
69
|
-
- The cockpit **consumes** those signals and aggregates them for the
|
|
70
|
-
human.
|
|
71
|
-
- The engine is primary: without it, the cockpit is an empty chart.
|
|
72
|
-
The cockpit is load-bearing for adoption: without it, the engine
|
|
73
|
-
never reaches production at scale.
|
|
74
|
-
- Engineering discipline: every engine feature should ask "what
|
|
75
|
-
machine-readable signal should this emit so the cockpit can
|
|
76
|
-
consume it?". The cockpit is then incremental capitalization on
|
|
77
|
-
engine investments, not a parallel product effort.
|
|
78
|
-
|
|
79
|
-
## 3. Loop engine strategy
|
|
80
|
-
|
|
81
|
-
The Loop engine (pln#394) was designed as a generic control plane —
|
|
82
|
-
one engine, many protocols. Review & Fix Loop (pln#395) was the first
|
|
83
|
-
shipped protocol. The strategic reflection clarifies that:
|
|
84
|
-
|
|
85
|
-
- We do **not** need to code eight protocols. We need to wire four
|
|
86
|
-
polished entry points for the high-leverage kinds, and document
|
|
87
|
-
patterns for the rest as composition variants.
|
|
88
|
-
- The engine already supports everything required: `open`, `turn`,
|
|
89
|
-
`advance`, `complete_turn`, `add_artifact`, `pause`, `resume`,
|
|
90
|
-
`close`, with per-phase `advance_when`, composite `StopCondition`,
|
|
91
|
-
idempotency, and CAS.
|
|
92
|
-
|
|
93
|
-
### Ranked protocols to wire next
|
|
94
|
-
|
|
95
|
-
1. **Ideation Loop** — **MVP shipped in v1.5.0** (pln#492). The shipped
|
|
96
|
-
shape is single-champion-plus-memory rather than the four-role
|
|
97
|
-
framing originally drafted: empirical work in May 2026
|
|
98
|
-
(`feedback_ideation_loop_single_agent_method`) showed that one
|
|
99
|
-
model produces useful adversarial pressure when the critic phase's
|
|
100
|
-
`context_filter` makes it confront only adversarial memory (traps,
|
|
101
|
-
feedback, runtime_notes). Multi-agent slots are still supported as
|
|
102
|
-
an opt-in for richer diversity. See [docs/concepts/ideation-loop.md](../concepts/ideation-loop.md).
|
|
103
|
-
Reframer phase (pln#493) is the next layer — covers the
|
|
104
|
-
novelty/simplicity/external-pattern blind spot of memory-driven
|
|
105
|
-
critique.
|
|
106
|
-
2. **Debug & Root-Cause Loop**. Five phases: symptom → hypothesis →
|
|
107
|
-
test → fix → verify. Targets the #1 pain point of single-agent
|
|
108
|
-
debugging — the lack of structure. High daily impact.
|
|
109
|
-
3. **Research & Synthesis Loop**. Researcher → analyzer → synthesizer
|
|
110
|
-
→ validator. Replaces "the human reads twenty pages" with a
|
|
111
|
-
condensed summary of the same sources. Novel utility vs the other
|
|
112
|
-
protocols.
|
|
113
|
-
4. **Planning & Breakdown Loop**. Goal → decomposer → estimator →
|
|
114
|
-
validator → refiner. Compounds with brainclaw's existing Plans and
|
|
115
|
-
Sequences — makes plan creation less naive.
|
|
116
|
-
|
|
117
|
-
### Variants, not new protocols
|
|
118
|
-
|
|
119
|
-
The following items from the brainstorm are compositions of the four
|
|
120
|
-
above and do not require separate engine work:
|
|
121
|
-
|
|
122
|
-
- **Reflection / Self-Critique** = ideation loop with `mode:
|
|
123
|
-
'symmetric'` and all slots assigned to the same agent. The engine
|
|
124
|
-
already supports this.
|
|
125
|
-
- **Validation & Approval Multi-Audience** = review loop with N
|
|
126
|
-
reviewer slots (one per audience) plus a consolidator slot. Purely
|
|
127
|
-
a slot-configuration pattern.
|
|
128
|
-
- **Optimization / Refactoring** = implementation loop framed around
|
|
129
|
-
a before/after artifact pair. A convention, not a new protocol.
|
|
130
|
-
|
|
131
|
-
### What "wiring" means concretely (per protocol)
|
|
132
|
-
|
|
133
|
-
For each of the four priority protocols:
|
|
134
|
-
|
|
135
|
-
- Polished `DEFAULT_PROTOCOLS` entry (phases, stop_condition, default
|
|
136
|
-
roles) in `src/core/loops/types.ts`.
|
|
137
|
-
- A facade entry point: either a new intent on `bclaw_coordinate`
|
|
138
|
-
(e.g. `intent='ideate'`) or a direct `bclaw_loop(open, kind=...)`
|
|
139
|
-
call pattern documented in the RFC.
|
|
140
|
-
- A human-visible output: the terminal artifact should materialize as
|
|
141
|
-
a candidate or handoff the human can read, not only an intra-loop
|
|
142
|
-
artifact. This is how the loop becomes a "helper" rather than a
|
|
143
|
-
process trace.
|
|
144
|
-
- End-to-end tests that cover the happy path plus at least one
|
|
145
|
-
iteration round.
|
|
146
|
-
|
|
147
|
-
## 4. Playbook refactoring note
|
|
148
|
-
|
|
149
|
-
The current playbooks (`docs/playbooks/integration/`, `productivity/`,
|
|
150
|
-
`team/`) mix the agent-user and human-adopter perspectives within a
|
|
151
|
-
single "audience" section. They should be refactored so each audience
|
|
152
|
-
file contains two explicit sections:
|
|
153
|
-
|
|
154
|
-
- **For the agent-user.** What the agent gains operationally: memory,
|
|
155
|
-
context, coordination, loops, review. This is the engine view.
|
|
156
|
-
- **For the human-adopter.** What the human gains in trust,
|
|
157
|
-
visibility, governance, compliance, cost control. This is the
|
|
158
|
-
cockpit view.
|
|
159
|
-
|
|
160
|
-
This split clarifies priorities: features that improve agent
|
|
161
|
-
effectiveness belong to the engine slice and trade against each other;
|
|
162
|
-
features that improve human confidence belong to the cockpit slice and
|
|
163
|
-
trade against each other. Mixing them biased the existing "Known Gaps"
|
|
164
|
-
sections toward visible-to-human items.
|
|
165
|
-
|
|
166
|
-
## 5. Practical implications
|
|
167
|
-
|
|
168
|
-
- Next implementation move: reframer phase (pln#493) on top of the
|
|
169
|
-
shipped ideation_loop, then the Debug & Root-Cause Loop.
|
|
170
|
-
- Parallel track: the cockpit needs dedicated planning once the engine
|
|
171
|
-
emits enough signals (event streaming, reputation exposure, audit
|
|
172
|
-
narrative generation, cost attribution).
|
|
173
|
-
- Any new feature proposal should explicitly state which layer it
|
|
174
|
-
serves (engine or cockpit) and which audience slice (agent-user or
|
|
175
|
-
human-adopter). Proposals that don't declare this should be sent
|
|
176
|
-
back for clarification.
|
|
177
|
-
|
|
178
|
-
## References
|
|
179
|
-
|
|
180
|
-
- `docs/concepts/loop-engine.md` — the v8 RFC (engine primitive)
|
|
181
|
-
- `docs/playbooks/*` — current audience playbooks (pending refactor
|
|
182
|
-
per §4 above)
|
|
183
|
-
- pln#394 `feat/loop-engine-mvp` — shipped
|
|
184
|
-
- pln#395 `feat/review-loop-protocol` — shipped
|
|
1
|
+
# Agent-first product model
|
|
2
|
+
|
|
3
|
+
Captured from a strategic session on 2026-04-18. This document frames
|
|
4
|
+
brainclaw's user model, the resulting two-layer product architecture,
|
|
5
|
+
and the loop-protocols roadmap implied by both. It supersedes the
|
|
6
|
+
implicit framing in the current audience playbooks (`docs/playbooks/*`)
|
|
7
|
+
and should be read before recommending new features.
|
|
8
|
+
|
|
9
|
+
## 1. Who is the user?
|
|
10
|
+
|
|
11
|
+
The playbooks describe personae as humans ("Non-Tech Creator", "Solo
|
|
12
|
+
Developer", "CI/CD Operator"). That framing is misleading. brainclaw is
|
|
13
|
+
built for LLM agents to consume — humans rarely type brainclaw CLI
|
|
14
|
+
commands. The playbook framing blurs two distinct roles:
|
|
15
|
+
|
|
16
|
+
- The **user** of brainclaw is the agent. It calls MCP tools, consumes
|
|
17
|
+
the context format, writes memory, participates in loops.
|
|
18
|
+
- The **adopter** of brainclaw is the human. The human installs it,
|
|
19
|
+
chooses which agents to deploy, reviews outputs, and — at enterprise
|
|
20
|
+
scale — must be able to audit what the agents did.
|
|
21
|
+
|
|
22
|
+
These are different people with different requirements. Mixing them
|
|
23
|
+
biases priorities toward "visible to the human" features (dashboards,
|
|
24
|
+
CLI UX, Slack webhooks) at the expense of "invisible but load-bearing
|
|
25
|
+
for the agent" features (context compounding, memory staleness,
|
|
26
|
+
cross-agent coordination).
|
|
27
|
+
|
|
28
|
+
## 2. Two-layer product architecture
|
|
29
|
+
|
|
30
|
+
brainclaw is really two products sharing a data substrate:
|
|
31
|
+
|
|
32
|
+
### The engine — agent-facing
|
|
33
|
+
|
|
34
|
+
- **Role.** Where the agent does work. Every feature here is designed
|
|
35
|
+
to make the agent measurably more effective, which manifests
|
|
36
|
+
externally as fewer re-prompts, better grounded outputs, and cleaner
|
|
37
|
+
multi-agent coordination.
|
|
38
|
+
- **Surface.** MCP tools, context format, memory schema, Loop engine,
|
|
39
|
+
claims/plans/handoffs, federation signals.
|
|
40
|
+
- **Design constraints.** Minimal cognitive load for the agent
|
|
41
|
+
(structured inputs, discoverable vocabulary), no UX, machine-parseable
|
|
42
|
+
outputs, strict contracts. The agent should be able to use it without
|
|
43
|
+
the human intervening.
|
|
44
|
+
- **Success metric.** Agent effectiveness: does a long-running agent
|
|
45
|
+
with months of history outperform a fresh one? Does multi-agent
|
|
46
|
+
review produce better code than single-agent? Does session N+1 pick
|
|
47
|
+
up where session N left off without re-prompting?
|
|
48
|
+
|
|
49
|
+
### The cockpit — human-facing
|
|
50
|
+
|
|
51
|
+
- **Role.** Where the human supervises, audits, and trusts. Rarely used
|
|
52
|
+
operationally by the agent. Critical for adoption, especially at
|
|
53
|
+
enterprise scale where visibility is a procurement gate.
|
|
54
|
+
- **Surface.** Dashboard (local or remote), audit narrative reports,
|
|
55
|
+
cost attribution, risk/policy surface, reputation views, forensics
|
|
56
|
+
and replay, CI governance gates, webhooks for operational alerts.
|
|
57
|
+
- **Design constraints.** Visual, aggregable, exportable,
|
|
58
|
+
compliance-ready. Human cognitive model: time series, filters,
|
|
59
|
+
drill-downs. Should run read-mostly on the data the engine emits.
|
|
60
|
+
- **Success metric.** Human confidence: does a tech lead understand
|
|
61
|
+
what N agents did this week? Can a compliance officer produce a
|
|
62
|
+
report? Can an ops engineer detect a misbehaving agent before
|
|
63
|
+
damage?
|
|
64
|
+
|
|
65
|
+
### The relationship between the two
|
|
66
|
+
|
|
67
|
+
- The engine **emits** signals (events, audit entries, reputation
|
|
68
|
+
scores, usage traces, loop transitions).
|
|
69
|
+
- The cockpit **consumes** those signals and aggregates them for the
|
|
70
|
+
human.
|
|
71
|
+
- The engine is primary: without it, the cockpit is an empty chart.
|
|
72
|
+
The cockpit is load-bearing for adoption: without it, the engine
|
|
73
|
+
never reaches production at scale.
|
|
74
|
+
- Engineering discipline: every engine feature should ask "what
|
|
75
|
+
machine-readable signal should this emit so the cockpit can
|
|
76
|
+
consume it?" The cockpit is then incremental capitalization on
|
|
77
|
+
engine investments, not a parallel product effort.
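
The emit/consume split described above can be sketched as follows. This is a minimal illustration, not brainclaw's actual schema: the `EngineSignal` shape, its field names, and `aggregateByAgent` are all assumptions made for the example.

```typescript
// Hypothetical signal kinds, named after the list in the text above.
type SignalKind =
  | "event"
  | "audit_entry"
  | "reputation_score"
  | "usage_trace"
  | "loop_transition";

// Hypothetical machine-readable signal emitted by the engine.
interface EngineSignal {
  kind: SignalKind;
  agent: string; // which agent produced the signal
  timestamp: string; // ISO 8601
  payload: Record<string, unknown>;
}

// Cockpit-side aggregation: counts signals per agent and per kind.
// It only reads engine output and never writes back to it.
function aggregateByAgent(
  signals: EngineSignal[],
): Map<string, Record<string, number>> {
  const summary = new Map<string, Record<string, number>>();
  for (const s of signals) {
    const counts = summary.get(s.agent) ?? {};
    counts[s.kind] = (counts[s.kind] ?? 0) + 1;
    summary.set(s.agent, counts);
  }
  return summary;
}
```

Keeping the cockpit function read-only over emitted signals mirrors the "read-mostly" design constraint stated for the cockpit.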
|
|
78
|
+
|
|
79
|
+
## 3. Loop engine strategy
|
|
80
|
+
|
|
81
|
+
The Loop engine (pln#394) was designed as a generic control plane —
|
|
82
|
+
one engine, many protocols. Review & Fix Loop (pln#395) was the first
|
|
83
|
+
shipped protocol. The strategic reflection clarifies that:
|
|
84
|
+
|
|
85
|
+
- We do **not** need to code eight protocols. We need to wire four
|
|
86
|
+
polished entry points for the high-leverage kinds, and document
|
|
87
|
+
patterns for the rest as composition variants.
|
|
88
|
+
- The engine already supports everything required: `open`, `turn`,
|
|
89
|
+
`advance`, `complete_turn`, `add_artifact`, `pause`, `resume`,
|
|
90
|
+
`close`, with per-phase `advance_when`, composite `StopCondition`,
|
|
91
|
+
idempotency, and CAS.
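
To make the idempotency and CAS (compare-and-swap) guarantees concrete, here is a minimal sketch of how a subset of the listed operations could be guarded. It is not the engine's implementation: `LoopState`, `openLoop`, `applyOp`, and the `version` and `appliedKeys` fields are hypothetical names introduced for illustration.

```typescript
type LoopStatus = "open" | "paused" | "closed";
type Op = "advance" | "pause" | "resume" | "close";

// Hypothetical loop state; field names are assumptions.
interface LoopState {
  id: string;
  status: LoopStatus;
  phaseIndex: number;
  version: number; // bumped on every successful write, checked for CAS
  appliedKeys: Set<string>; // idempotency keys already applied
}

function openLoop(id: string): LoopState {
  return { id, status: "open", phaseIndex: 0, version: 0, appliedKeys: new Set() };
}

function applyOp(
  state: LoopState,
  op: Op,
  expectedVersion: number,
  idempotencyKey: string,
): LoopState {
  // Idempotency: replaying an already-applied request is a no-op.
  if (state.appliedKeys.has(idempotencyKey)) return state;
  // CAS: reject writes based on a stale view of the loop.
  if (state.version !== expectedVersion) {
    throw new Error(`CAS conflict: expected ${expectedVersion}, found ${state.version}`);
  }
  if (state.status === "closed") throw new Error("loop is closed");

  const next: LoopState = {
    ...state,
    version: state.version + 1,
    appliedKeys: new Set(state.appliedKeys).add(idempotencyKey),
  };
  switch (op) {
    case "advance":
      if (state.status !== "open") throw new Error("cannot advance a paused loop");
      next.phaseIndex += 1;
      break;
    case "pause":
      if (state.status !== "open") throw new Error("loop is not open");
      next.status = "paused";
      break;
    case "resume":
      if (state.status !== "paused") throw new Error("loop is not paused");
      next.status = "open";
      break;
    case "close":
      next.status = "closed";
      break;
  }
  return next;
}
```

Checking the idempotency key before the version means a retried request that carries an old `expectedVersion` returns the current state instead of failing, which is one reasonable way to combine the two guarantees.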
|
|
92
|
+
|
|
93
|
+
### Ranked protocols to wire next
|
|
94
|
+
|
|
95
|
+
1. **Ideation Loop** — **MVP shipped in v1.5.0** (pln#492). The shipped
|
|
96
|
+
shape is single-champion-plus-memory rather than the four-role
|
|
97
|
+
framing originally drafted: empirical work in May 2026
|
|
98
|
+
(`feedback_ideation_loop_single_agent_method`) showed that one
|
|
99
|
+
model produces useful adversarial pressure when the critic phase's
|
|
100
|
+
`context_filter` makes it confront only adversarial memory (traps,
|
|
101
|
+
feedback, runtime_notes). Multi-agent slots are still supported as
|
|
102
|
+
an opt-in for richer diversity. See [docs/concepts/ideation-loop.md](../concepts/ideation-loop.md).
|
|
103
|
+
Reframer phase (pln#493) is the next layer — covers the
|
|
104
|
+
novelty/simplicity/external-pattern blind spot of memory-driven
|
|
105
|
+
critique.
|
|
106
|
+
2. **Debug & Root-Cause Loop**. Five phases: symptom → hypothesis →
|
|
107
|
+
test → fix → verify. Targets the #1 pain point of single-agent
|
|
108
|
+
debugging — the lack of structure. High daily impact.
|
|
109
|
+
3. **Research & Synthesis Loop**. Researcher → analyzer → synthesizer
|
|
110
|
+
→ validator. Replaces "the human reads twenty pages" with a
|
|
111
|
+
condensed summary of the same sources. Novel utility compared with the other
|
|
112
|
+
protocols.
|
|
113
|
+
4. **Planning & Breakdown Loop**. Goal → decomposer → estimator →
|
|
114
|
+
validator → refiner. Compounds with brainclaw's existing Plans and
|
|
115
|
+
Sequences — makes plan creation less naive.
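
The `context_filter` idea from item 1 (the critic phase confronts only adversarial memory) can be sketched as a per-phase allow-list. The types below are assumptions for illustration; in particular the `decision` and `plan` kinds and the `contextFor` helper are hypothetical and not taken from brainclaw's memory schema.

```typescript
// Hypothetical memory kinds; the first three mirror the adversarial set in the text.
type MemoryKind = "trap" | "feedback" | "runtime_note" | "decision" | "plan";

interface MemoryEntry {
  kind: MemoryKind;
  text: string;
}

// Hypothetical phase definition carrying an optional allow-list of memory kinds.
interface PhaseSpec {
  name: string;
  context_filter?: MemoryKind[];
}

// Returns the memory a phase is allowed to see. No filter means full memory.
function contextFor(phase: PhaseSpec, memory: MemoryEntry[]): MemoryEntry[] {
  if (!phase.context_filter) return memory;
  const allowed = new Set<MemoryKind>(phase.context_filter);
  return memory.filter((m) => allowed.has(m.kind));
}

// The critic phase sees only adversarial memory; the champion phase sees everything.
const critic: PhaseSpec = {
  name: "critic",
  context_filter: ["trap", "feedback", "runtime_note"],
};
const champion: PhaseSpec = { name: "champion" };
```

With this shape, the same agent can fill both phases, and the difference in pressure comes only from what each phase is shown.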
|
|
116
|
+
|
|
117
|
+
### Variants, not new protocols
|
|
118
|
+
|
|
119
|
+
The following items from the brainstorm are compositions of the four
|
|
120
|
+
above and do not require separate engine work:
|
|
121
|
+
|
|
122
|
+
- **Reflection / Self-Critique** = ideation loop with `mode:
|
|
123
|
+
'symmetric'` and all slots assigned to the same agent. The engine
|
|
124
|
+
already supports this.
|
|
125
|
+
- **Validation & Approval Multi-Audience** = review loop with N
|
|
126
|
+
reviewer slots (one per audience) plus a consolidator slot. Purely
|
|
127
|
+
a slot-configuration pattern.
|
|
128
|
+
- **Optimization / Refactoring** = implementation loop framed around
|
|
129
|
+
a before/after artifact pair. A convention, not a new protocol.
|
|
130
|
+
|
|
131
|
+
### What "wiring" means concretely (per protocol)
|
|
132
|
+
|
|
133
|
+
For each of the four priority protocols:
|
|
134
|
+
|
|
135
|
+
- Polished `DEFAULT_PROTOCOLS` entry (phases, stop_condition, default
|
|
136
|
+
roles) in `src/core/loops/types.ts`.
|
|
137
|
+
- A facade entry point: either a new intent on `bclaw_coordinate`
|
|
138
|
+
(e.g. `intent='ideate'`) or a direct `bclaw_loop(open, kind=...)`
|
|
139
|
+
call pattern documented in the RFC.
|
|
140
|
+
- A human-visible output: the terminal artifact should materialize as
|
|
141
|
+
a candidate or handoff the human can read, not only an intra-loop
|
|
142
|
+
artifact. This is how the loop becomes a "helper" rather than a
|
|
143
|
+
process trace.
|
|
144
|
+
- End-to-end tests that cover the happy path plus at least one
|
|
145
|
+
iteration round.
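
As an illustration of what one polished protocol entry might contain, here is a sketch for the Debug & Root-Cause Loop. The real `DEFAULT_PROTOCOLS` shape in `src/core/loops/types.ts` is not shown in this document, so `ProtocolDef`, `PhaseDef`, the role names, the `StopCondition` variants, and `shouldStop` are all assumptions.

```typescript
// Hypothetical phase definition; "advance_when" values are illustrative.
interface PhaseDef {
  name: string;
  role: string;
  advance_when: "turn_completed" | "artifact_added";
}

// Hypothetical composite stop condition, modeled as a discriminated union.
type StopCondition =
  | { kind: "max_iterations"; value: number }
  | { kind: "phase_reached"; phase: string }
  | { kind: "any"; of: StopCondition[] };

interface ProtocolDef {
  kind: string;
  phases: PhaseDef[];
  stop_condition: StopCondition;
}

// Five phases as listed in the text: symptom, hypothesis, test, fix, verify.
const DEBUG_LOOP: ProtocolDef = {
  kind: "debug_loop",
  phases: [
    { name: "symptom", role: "reporter", advance_when: "artifact_added" },
    { name: "hypothesis", role: "analyst", advance_when: "turn_completed" },
    { name: "test", role: "analyst", advance_when: "artifact_added" },
    { name: "fix", role: "implementer", advance_when: "turn_completed" },
    { name: "verify", role: "reviewer", advance_when: "turn_completed" },
  ],
  stop_condition: {
    kind: "any",
    of: [
      { kind: "phase_reached", phase: "verify" },
      { kind: "max_iterations", value: 3 },
    ],
  },
};

// Evaluates a stop condition against the loop's current position.
function shouldStop(
  cond: StopCondition,
  ctx: { iteration: number; phase: string },
): boolean {
  switch (cond.kind) {
    case "max_iterations":
      return ctx.iteration >= cond.value;
    case "phase_reached":
      return ctx.phase === cond.phase;
    case "any":
      return cond.of.some((c) => shouldStop(c, ctx));
  }
}
```

An end-to-end test for such an entry would walk the happy path to `verify` and also exercise at least one extra iteration before the iteration cap triggers.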
|
|
146
|
+
|
|
147
|
+
## 4. Playbook refactoring note
|
|
148
|
+
|
|
149
|
+
The current playbooks (`docs/playbooks/integration/`, `productivity/`,
|
|
150
|
+
`team/`) mix the agent-user and human-adopter perspectives within a
|
|
151
|
+
single "audience" section. They should be refactored so each audience
|
|
152
|
+
file contains two explicit sections:
|
|
153
|
+
|
|
154
|
+
- **For the agent-user.** What the agent gains operationally: memory,
|
|
155
|
+
context, coordination, loops, review. This is the engine view.
|
|
156
|
+
- **For the human-adopter.** What the human gains in trust,
|
|
157
|
+
visibility, governance, compliance, cost control. This is the
|
|
158
|
+
cockpit view.
|
|
159
|
+
|
|
160
|
+
This split clarifies priorities: features that improve agent
|
|
161
|
+
effectiveness belong to the engine slice and trade against each other;
|
|
162
|
+
features that improve human confidence belong to the cockpit slice and
|
|
163
|
+
trade against each other. Mixing them biased the existing "Known Gaps"
|
|
164
|
+
sections toward visible-to-human items.
|
|
165
|
+
|
|
166
|
+
## 5. Practical implications
|
|
167
|
+
|
|
168
|
+
- Next implementation move: reframer phase (pln#493) on top of the
|
|
169
|
+
shipped ideation_loop, then the Debug & Root-Cause Loop.
|
|
170
|
+
- Parallel track: the cockpit needs dedicated planning once the engine
|
|
171
|
+
emits enough signals (event streaming, reputation exposure, audit
|
|
172
|
+
narrative generation, cost attribution).
|
|
173
|
+
- Any new feature proposal should explicitly state which layer it
|
|
174
|
+
serves (engine or cockpit) and which audience slice (agent-user or
|
|
175
|
+
human-adopter). Proposals that don't declare this should be sent
|
|
176
|
+
back for clarification.
|
|
177
|
+
|
|
178
|
+
## References
|
|
179
|
+
|
|
180
|
+
- `docs/concepts/loop-engine.md` — the v8 RFC (engine primitive)
|
|
181
|
+
- `docs/playbooks/*` — current audience playbooks (pending refactor
|
|
182
|
+
per §4 above)
|
|
183
|
+
- pln#394 `feat/loop-engine-mvp` — shipped
|
|
184
|
+
- pln#395 `feat/review-loop-protocol` — shipped
|