opencode-swarm-plugin 0.43.0 → 0.44.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (208)
  1. package/bin/cass.characterization.test.ts +422 -0
  2. package/bin/swarm.serve.test.ts +6 -4
  3. package/bin/swarm.test.ts +68 -0
  4. package/bin/swarm.ts +81 -8
  5. package/dist/compaction-prompt-scoring.js +139 -0
  6. package/dist/contributor-tools.d.ts +42 -0
  7. package/dist/contributor-tools.d.ts.map +1 -0
  8. package/dist/eval-capture.js +12811 -0
  9. package/dist/hive.d.ts.map +1 -1
  10. package/dist/index.d.ts +12 -0
  11. package/dist/index.d.ts.map +1 -1
  12. package/dist/index.js +7728 -62590
  13. package/dist/plugin.js +23833 -78695
  14. package/dist/sessions/agent-discovery.d.ts +59 -0
  15. package/dist/sessions/agent-discovery.d.ts.map +1 -0
  16. package/dist/sessions/index.d.ts +10 -0
  17. package/dist/sessions/index.d.ts.map +1 -0
  18. package/dist/swarm-orchestrate.d.ts.map +1 -1
  19. package/dist/swarm-prompts.d.ts.map +1 -1
  20. package/dist/swarm-review.d.ts.map +1 -1
  21. package/package.json +17 -5
  22. package/.changeset/swarm-insights-data-layer.md +0 -63
  23. package/.hive/analysis/eval-failure-analysis-2025-12-25.md +0 -331
  24. package/.hive/analysis/session-data-quality-audit.md +0 -320
  25. package/.hive/eval-results.json +0 -483
  26. package/.hive/issues.jsonl +0 -138
  27. package/.hive/memories.jsonl +0 -729
  28. package/.opencode/eval-history.jsonl +0 -327
  29. package/.turbo/turbo-build.log +0 -9
  30. package/CHANGELOG.md +0 -2255
  31. package/SCORER-ANALYSIS.md +0 -598
  32. package/docs/analysis/subagent-coordination-patterns.md +0 -902
  33. package/docs/analysis-socratic-planner-pattern.md +0 -504
  34. package/docs/planning/ADR-001-monorepo-structure.md +0 -171
  35. package/docs/planning/ADR-002-package-extraction.md +0 -393
  36. package/docs/planning/ADR-003-performance-improvements.md +0 -451
  37. package/docs/planning/ADR-004-message-queue-features.md +0 -187
  38. package/docs/planning/ADR-005-devtools-observability.md +0 -202
  39. package/docs/planning/ADR-007-swarm-enhancements-worktree-review.md +0 -168
  40. package/docs/planning/ADR-008-worker-handoff-protocol.md +0 -293
  41. package/docs/planning/ADR-009-oh-my-opencode-patterns.md +0 -353
  42. package/docs/planning/ROADMAP.md +0 -368
  43. package/docs/semantic-memory-cli-syntax.md +0 -123
  44. package/docs/swarm-mail-architecture.md +0 -1147
  45. package/docs/testing/context-recovery-test.md +0 -470
  46. package/evals/ARCHITECTURE.md +0 -1189
  47. package/evals/README.md +0 -768
  48. package/evals/compaction-prompt.eval.ts +0 -149
  49. package/evals/compaction-resumption.eval.ts +0 -289
  50. package/evals/coordinator-behavior.eval.ts +0 -307
  51. package/evals/coordinator-session.eval.ts +0 -154
  52. package/evals/evalite.config.ts.bak +0 -15
  53. package/evals/example.eval.ts +0 -31
  54. package/evals/fixtures/compaction-cases.ts +0 -350
  55. package/evals/fixtures/compaction-prompt-cases.ts +0 -311
  56. package/evals/fixtures/coordinator-sessions.ts +0 -328
  57. package/evals/fixtures/decomposition-cases.ts +0 -105
  58. package/evals/lib/compaction-loader.test.ts +0 -248
  59. package/evals/lib/compaction-loader.ts +0 -320
  60. package/evals/lib/data-loader.evalite-test.ts +0 -289
  61. package/evals/lib/data-loader.test.ts +0 -345
  62. package/evals/lib/data-loader.ts +0 -281
  63. package/evals/lib/llm.ts +0 -115
  64. package/evals/scorers/compaction-prompt-scorers.ts +0 -145
  65. package/evals/scorers/compaction-scorers.ts +0 -305
  66. package/evals/scorers/coordinator-discipline.evalite-test.ts +0 -539
  67. package/evals/scorers/coordinator-discipline.ts +0 -325
  68. package/evals/scorers/index.test.ts +0 -146
  69. package/evals/scorers/index.ts +0 -328
  70. package/evals/scorers/outcome-scorers.evalite-test.ts +0 -27
  71. package/evals/scorers/outcome-scorers.ts +0 -349
  72. package/evals/swarm-decomposition.eval.ts +0 -121
  73. package/examples/commands/swarm.md +0 -745
  74. package/examples/plugin-wrapper-template.ts +0 -2426
  75. package/examples/skills/hive-workflow/SKILL.md +0 -212
  76. package/examples/skills/skill-creator/SKILL.md +0 -223
  77. package/examples/skills/swarm-coordination/SKILL.md +0 -292
  78. package/global-skills/cli-builder/SKILL.md +0 -344
  79. package/global-skills/cli-builder/references/advanced-patterns.md +0 -244
  80. package/global-skills/learning-systems/SKILL.md +0 -644
  81. package/global-skills/skill-creator/LICENSE.txt +0 -202
  82. package/global-skills/skill-creator/SKILL.md +0 -352
  83. package/global-skills/skill-creator/references/output-patterns.md +0 -82
  84. package/global-skills/skill-creator/references/workflows.md +0 -28
  85. package/global-skills/swarm-coordination/SKILL.md +0 -995
  86. package/global-skills/swarm-coordination/references/coordinator-patterns.md +0 -235
  87. package/global-skills/swarm-coordination/references/strategies.md +0 -138
  88. package/global-skills/system-design/SKILL.md +0 -213
  89. package/global-skills/testing-patterns/SKILL.md +0 -430
  90. package/global-skills/testing-patterns/references/dependency-breaking-catalog.md +0 -586
  91. package/opencode-swarm-plugin-0.30.7.tgz +0 -0
  92. package/opencode-swarm-plugin-0.31.0.tgz +0 -0
  93. package/scripts/cleanup-test-memories.ts +0 -346
  94. package/scripts/init-skill.ts +0 -222
  95. package/scripts/migrate-unknown-sessions.ts +0 -349
  96. package/scripts/validate-skill.ts +0 -204
  97. package/src/agent-mail.ts +0 -1724
  98. package/src/anti-patterns.test.ts +0 -1167
  99. package/src/anti-patterns.ts +0 -448
  100. package/src/compaction-capture.integration.test.ts +0 -257
  101. package/src/compaction-hook.test.ts +0 -838
  102. package/src/compaction-hook.ts +0 -1204
  103. package/src/compaction-observability.integration.test.ts +0 -139
  104. package/src/compaction-observability.test.ts +0 -187
  105. package/src/compaction-observability.ts +0 -324
  106. package/src/compaction-prompt-scorers.test.ts +0 -475
  107. package/src/compaction-prompt-scoring.ts +0 -300
  108. package/src/dashboard.test.ts +0 -611
  109. package/src/dashboard.ts +0 -462
  110. package/src/error-enrichment.test.ts +0 -403
  111. package/src/error-enrichment.ts +0 -219
  112. package/src/eval-capture.test.ts +0 -1015
  113. package/src/eval-capture.ts +0 -929
  114. package/src/eval-gates.test.ts +0 -306
  115. package/src/eval-gates.ts +0 -218
  116. package/src/eval-history.test.ts +0 -508
  117. package/src/eval-history.ts +0 -214
  118. package/src/eval-learning.test.ts +0 -378
  119. package/src/eval-learning.ts +0 -360
  120. package/src/eval-runner.test.ts +0 -223
  121. package/src/eval-runner.ts +0 -402
  122. package/src/export-tools.test.ts +0 -476
  123. package/src/export-tools.ts +0 -257
  124. package/src/hive.integration.test.ts +0 -2241
  125. package/src/hive.ts +0 -1628
  126. package/src/index.ts +0 -935
  127. package/src/learning.integration.test.ts +0 -1815
  128. package/src/learning.ts +0 -1079
  129. package/src/logger.test.ts +0 -189
  130. package/src/logger.ts +0 -135
  131. package/src/mandate-promotion.test.ts +0 -473
  132. package/src/mandate-promotion.ts +0 -239
  133. package/src/mandate-storage.integration.test.ts +0 -601
  134. package/src/mandate-storage.test.ts +0 -578
  135. package/src/mandate-storage.ts +0 -794
  136. package/src/mandates.ts +0 -540
  137. package/src/memory-tools.test.ts +0 -195
  138. package/src/memory-tools.ts +0 -344
  139. package/src/memory.integration.test.ts +0 -334
  140. package/src/memory.test.ts +0 -158
  141. package/src/memory.ts +0 -527
  142. package/src/model-selection.test.ts +0 -188
  143. package/src/model-selection.ts +0 -68
  144. package/src/observability-tools.test.ts +0 -359
  145. package/src/observability-tools.ts +0 -871
  146. package/src/output-guardrails.test.ts +0 -438
  147. package/src/output-guardrails.ts +0 -381
  148. package/src/pattern-maturity.test.ts +0 -1160
  149. package/src/pattern-maturity.ts +0 -525
  150. package/src/planning-guardrails.test.ts +0 -491
  151. package/src/planning-guardrails.ts +0 -438
  152. package/src/plugin.ts +0 -23
  153. package/src/post-compaction-tracker.test.ts +0 -251
  154. package/src/post-compaction-tracker.ts +0 -237
  155. package/src/query-tools.test.ts +0 -636
  156. package/src/query-tools.ts +0 -324
  157. package/src/rate-limiter.integration.test.ts +0 -466
  158. package/src/rate-limiter.ts +0 -774
  159. package/src/replay-tools.test.ts +0 -496
  160. package/src/replay-tools.ts +0 -240
  161. package/src/repo-crawl.integration.test.ts +0 -441
  162. package/src/repo-crawl.ts +0 -610
  163. package/src/schemas/cell-events.test.ts +0 -347
  164. package/src/schemas/cell-events.ts +0 -807
  165. package/src/schemas/cell.ts +0 -257
  166. package/src/schemas/evaluation.ts +0 -166
  167. package/src/schemas/index.test.ts +0 -199
  168. package/src/schemas/index.ts +0 -286
  169. package/src/schemas/mandate.ts +0 -232
  170. package/src/schemas/swarm-context.ts +0 -115
  171. package/src/schemas/task.ts +0 -161
  172. package/src/schemas/worker-handoff.test.ts +0 -302
  173. package/src/schemas/worker-handoff.ts +0 -131
  174. package/src/skills.integration.test.ts +0 -1192
  175. package/src/skills.test.ts +0 -643
  176. package/src/skills.ts +0 -1549
  177. package/src/storage.integration.test.ts +0 -341
  178. package/src/storage.ts +0 -884
  179. package/src/structured.integration.test.ts +0 -817
  180. package/src/structured.test.ts +0 -1046
  181. package/src/structured.ts +0 -762
  182. package/src/swarm-decompose.test.ts +0 -188
  183. package/src/swarm-decompose.ts +0 -1302
  184. package/src/swarm-deferred.integration.test.ts +0 -157
  185. package/src/swarm-deferred.test.ts +0 -38
  186. package/src/swarm-insights.test.ts +0 -214
  187. package/src/swarm-insights.ts +0 -459
  188. package/src/swarm-mail.integration.test.ts +0 -970
  189. package/src/swarm-mail.ts +0 -739
  190. package/src/swarm-orchestrate.integration.test.ts +0 -282
  191. package/src/swarm-orchestrate.test.ts +0 -548
  192. package/src/swarm-orchestrate.ts +0 -3084
  193. package/src/swarm-prompts.test.ts +0 -1270
  194. package/src/swarm-prompts.ts +0 -2077
  195. package/src/swarm-research.integration.test.ts +0 -701
  196. package/src/swarm-research.test.ts +0 -698
  197. package/src/swarm-research.ts +0 -472
  198. package/src/swarm-review.integration.test.ts +0 -285
  199. package/src/swarm-review.test.ts +0 -879
  200. package/src/swarm-review.ts +0 -709
  201. package/src/swarm-strategies.ts +0 -407
  202. package/src/swarm-worktree.test.ts +0 -501
  203. package/src/swarm-worktree.ts +0 -575
  204. package/src/swarm.integration.test.ts +0 -2377
  205. package/src/swarm.ts +0 -38
  206. package/src/tool-adapter.integration.test.ts +0 -1221
  207. package/src/tool-availability.ts +0 -461
  208. package/tsconfig.json +0 -28
@@ -1,202 +0,0 @@
1
- # ADR-005: DevTools + Observability
2
-
3
- ## Status
4
-
5
- Proposed
6
-
7
- ## Context
8
-
9
- Swarm Mail currently has no visibility:
10
-
11
- - No UI to inspect events, messages, locks
12
- - No metrics on latency, queue depth, throughput
13
- - No distributed tracing across agents
14
- - Hard to debug coordination issues
15
-
16
- We need both developer tools (UI + CLI) and production observability (metrics + tracing).
17
-
18
- ## Decision
19
-
20
- Build layered observability:
21
-
22
- ### 1. DevTools UI (SvelteKit)
23
-
24
- **Stack:**
25
-
26
- - SvelteKit for SSR + static export
27
- - Vite for dev server + build
28
- - Server-Sent Events (SSE) for real-time updates
29
- - Embeddable static build
30
-
31
- **Features:**
32
-
33
- - Event stream viewer (filterable, searchable)
34
- - Message inbox/outbox per agent
35
- - File reservation timeline
36
- - Saga instance tracker (future)
37
-
38
- **Build:**
39
-
40
- ```bash
41
- cd apps/devtools
42
- bun run build # Static export to apps/devtools/build
43
- ```
44
-
45
- **Embed in plugin:**
46
-
47
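- ```typescript
- // Serve static UI at /_swarm/devtools
- // `serve` and `serveStatic` are assumed helpers; the ADR does not say which library provides them.
- const server = serve({
-   port: 4000,
-   fetch: (req) => {
-     // Assuming a Fetch-style Request, req.url is an absolute URL, so match on its pathname
-     const { pathname } = new URL(req.url);
-     if (pathname.startsWith("/_swarm/devtools")) {
-       return serveStatic("apps/devtools/build");
-     }
-     // Anything else is not a DevTools route
-     return new Response("Not found", { status: 404 });
-   },
- });
- ```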
58
-
59
- ### 2. CLI (@effect/cli)
60
-
61
- **Commands:**
62
-
63
- ```bash
64
- swarm events [--project <key>] [--type <type>] [--tail]
65
- swarm messages [--agent <name>] [--unread]
66
- swarm locks [--agent <name>]
67
- swarm replay --from <sequence> [--to <sequence>]
68
- swarm metrics
69
- ```
70
-
71
- **Implementation:**
72
-
73
- ```typescript
- import { Command, Options } from "@effect/cli";
- import { Effect } from "effect";
-
- const eventsCommand = Command.make(
-   "events",
-   {
-     // Optional flags are wrapped with Options.optional and arrive as Option values
-     project: Options.text("project").pipe(Options.optional),
-     type: Options.text("type").pipe(Options.optional),
-     tail: Options.boolean("tail"),
-   },
-   ({ project, type, tail }) =>
-     // Placeholder: query events table, optionally --tail with live query
-     Effect.log("events", { project, type, tail }),
- );
- ```
88
-
89
- ### 3. Metrics (Prometheus)
90
-
91
- **Histograms:**
92
-
93
- - `swarm_message_latency_seconds` - Send to receive time
94
- - `swarm_lock_contention_seconds` - Time waiting for lock
95
- - `swarm_queue_depth` - Unread messages per agent (a point-in-time value; Phase 3 below tracks it as a gauge)
96
-
97
- **Counters:**
98
-
99
- - `swarm_events_total{type}` - Events by type
100
- - `swarm_messages_sent_total{sender, recipient}`
101
- - `swarm_locks_acquired_total{agent}`
102
-
103
- **Example:**
104
-
105
- ```typescript
- import { Registry, Histogram } from 'prom-client'
-
- const messageLatency = new Histogram({
-   name: 'swarm_message_latency_seconds',
-   help: 'Message delivery latency',
-   buckets: [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
- })
-
- // Record latency
- const start = Date.now()
- await sendMessage(msg)
- const latency = (Date.now() - start) / 1000
- messageLatency.observe(latency)
- ```
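To make the bucket list above concrete, here is a minimal sketch of how a Prometheus-style histogram counts observations cumulatively. `LatencyHistogram` is a hypothetical stand-in written for illustration; it is not prom-client and only mirrors the `le` (less-than-or-equal) bucket convention.

```typescript
// Upper bounds in seconds, copied from the ADR example.
const BUCKETS = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0];

// Hypothetical stand-in for a Prometheus-style histogram (not prom-client itself).
class LatencyHistogram {
  // counts[i] holds observations <= buckets[i]; the final slot is the +Inf bucket.
  private counts: number[];

  constructor(private readonly buckets: number[]) {
    this.counts = new Array(buckets.length + 1).fill(0);
  }

  observe(seconds: number): void {
    // Buckets are cumulative: one observation increments every bucket it fits under.
    this.buckets.forEach((upper, i) => {
      if (seconds <= upper) this.counts[i] += 1;
    });
    this.counts[this.buckets.length] += 1; // +Inf counts every observation
  }

  snapshot(): { le: string; count: number }[] {
    return [...this.buckets.map(String), "+Inf"].map((le, i) => ({
      le,
      count: this.counts[i],
    }));
  }
}

const histogram = new LatencyHistogram(BUCKETS);
[0.004, 0.03, 0.2, 7].forEach((s) => histogram.observe(s));
const latencySnapshot = histogram.snapshot();
```

An observation slower than the largest bound (7 s here) lands only in `+Inf`, which is why the top bucket should sit above the slowest delivery you expect to care about.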
120
-
121
- ### 4. Distributed Tracing (OpenTelemetry)
122
-
123
- **Integration:**
124
-
125
- ```typescript
- import * as EffectOtel from '@effect/opentelemetry' // planned Effect integration; not used in this snippet
- import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node'
-
- const provider = new NodeTracerProvider()
- const tracer = provider.getTracer('swarm-mail')
-
- // Trace message send
- const span = tracer.startSpan('sendMessage', {
-   attributes: {
-     'swarm.sender': 'AgentA',
-     'swarm.recipient': 'AgentB',
-     'swarm.thread_id': 'bd-123'
-   }
- })
-
- try {
-   await sendMessage(msg)
- } finally {
-   span.end() // end the span even if sendMessage throws
- }
- ```
144
-
145
- **Trace Propagation:**
146
-
147
- - Add trace_id to message metadata
148
- - Worker agents continue traces from parents
149
- - Visualize full swarm execution flow
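The propagation steps above could look roughly like the following sketch. The `SwarmMessage` shape, the `parent_span_id` key, and the helper names are assumptions made for illustration; only `trace_id` in message metadata comes from the ADR, and a real implementation would more likely rely on OpenTelemetry's own context propagation than on hand-rolled IDs.

```typescript
// Hypothetical message shape; the real Swarm Mail envelope may differ.
interface SwarmMessage {
  subject: string;
  metadata: { [key: string]: string | undefined };
}

interface TraceContext {
  traceId: string; // shared by every span in one swarm run
  parentSpanId: string; // span that sent the message
}

// Random lowercase hex string (illustrative only, not cryptographically strong).
function randomHex(length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

// Sender side: copy the current trace identifiers into message metadata.
function injectTrace(message: SwarmMessage, ctx: TraceContext): SwarmMessage {
  return {
    ...message,
    metadata: {
      ...message.metadata,
      trace_id: ctx.traceId,
      parent_span_id: ctx.parentSpanId,
    },
  };
}

// Receiver side: continue the parent's trace, or start a new one if none was propagated.
function continueTrace(message: SwarmMessage): TraceContext & { spanId: string } {
  const traceId = message.metadata.trace_id ?? randomHex(32);
  const parentSpanId = message.metadata.parent_span_id ?? "";
  return { traceId, parentSpanId, spanId: randomHex(16) };
}
```

A worker that calls `continueTrace` on an incoming message keeps the coordinator's `traceId`, so all spans from one swarm run can be grouped in the trace backend.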
150
-
151
- ## Consequences
152
-
153
- ### Easier
154
-
155
- - **Visibility** - See all events, messages, locks in real-time
156
- - **Debugging** - Trace issues across agents via distributed tracing
157
- - **Performance** - Identify slow operations via histograms
158
- - **Operations** - CLI for prod debugging without UI
159
-
160
- ### More Difficult
161
-
162
- - **Maintenance** - Another app to maintain (DevTools UI)
163
- - **Bundle size** - Metrics/tracing deps increase plugin size
164
- - **Performance overhead** - Instrumentation adds latency
165
- - **Configuration** - Metrics exporters, trace backends
166
-
167
- ## Implementation Notes
168
-
169
- ### Phase 1: CLI (Week 1)
170
-
171
- - Add @effect/cli dependency
172
- - Implement events, messages, locks commands
173
- - Test with real swarm sessions
174
-
175
- ### Phase 2: DevTools UI (Week 2-3)
176
-
177
- - Scaffold SvelteKit app
178
- - Build event stream viewer
179
- - Add SSE endpoint for real-time updates
180
- - Static export + embed in plugin
181
-
182
- ### Phase 3: Metrics (Week 4)
183
-
184
- - Add prom-client dependency
185
- - Instrument send/receive latency
186
- - Add queue depth gauge
187
- - Expose /metrics endpoint
188
-
189
- ### Phase 4: Tracing (Week 5)
190
-
191
- - Add @effect/opentelemetry
192
- - Instrument message send/receive
193
- - Propagate trace context
194
- - Test with Jaeger/Zipkin
195
-
196
- ### Success Criteria
197
-
198
- - [ ] CLI can tail events in real-time
199
- - [ ] DevTools UI shows live message stream
200
- - [ ] Metrics exposed at /metrics endpoint
201
- - [ ] Traces visible in Jaeger UI
202
- - [ ] Documentation for all observability tools
@@ -1,168 +0,0 @@
1
- # ADR-007: Swarm Enhancements - Worktree Isolation + Structured Review
2
-
3
- ## Status
4
-
5
- Proposed
6
-
7
- ## Context
8
-
9
- After reviewing [nexxeln/opencode-config](https://github.com/nexxeln/opencode-config), we identified several patterns that would strengthen our swarm coordination:
10
-
11
- 1. **Git worktree isolation** - Each worker gets a complete isolated copy of the repo
12
- 2. **Structured review loop** - Workers must pass review before completion
13
- 3. **Retry options on abort** - Clean recovery paths when things go wrong
14
-
15
- Currently our swarm uses:
16
- - **File reservations** via Swarm Mail for conflict prevention
17
- - **UBS scan** on completion for bug detection
18
- - **Manual cleanup** on abort
19
-
20
- ## Decision
21
-
22
- ### 1. Optional Worktree Isolation Mode
23
-
24
- Add `isolation` parameter to swarm initialization:
25
-
26
- ```typescript
- swarm_init({
-   task: "Large refactor across 50 files",
-   isolation: "worktree" // or "reservation" (default)
- })
- ```
32
-
33
- **When to use worktrees:**
34
- - Large refactors touching many files
35
- - High risk of merge conflicts
36
- - Need complete isolation (different node_modules, etc.)
37
-
38
- **When to use reservations (default):**
39
- - Most swarm tasks
40
- - Quick parallel work
41
- - Lower overhead
42
-
43
- **Worktree lifecycle:**
44
- ```
45
- swarm_worktree_create(task_id) → /path/to/worktree
46
-
47
- worker does work in worktree
48
-
49
- swarm_worktree_merge(task_id) → cherry-pick commit to main
50
-
51
- swarm_worktree_cleanup(task_id) → remove worktree
52
- ```
53
-
54
- **On abort:** Hard reset main to start commit, delete all worktrees.
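As a rough illustration of the lifecycle and abort rule above, the sketch below builds the git commands each step might run. The `.swarm-worktrees/` directory, the `swarm/<task_id>` branch naming, and all function names are assumptions; the ADR only names the three lifecycle tools and the hard reset on abort. The git subcommands used (`worktree add -b`, `cherry-pick`, `worktree remove --force`, `reset --hard`) are standard.

```typescript
type GitCommand = string[];

// Assumed layout: one worktree directory and one branch per task.
function worktreePath(repoRoot: string, taskId: string): string {
  return `${repoRoot}/.swarm-worktrees/${taskId}`;
}

// Create an isolated checkout for a worker, branching from the swarm's start commit.
function createWorktreeCommand(repoRoot: string, taskId: string, startCommit: string): GitCommand {
  return ["git", "worktree", "add", "-b", `swarm/${taskId}`, worktreePath(repoRoot, taskId), startCommit];
}

// Bring the worker's commit onto main (run from the main checkout).
function mergeWorktreeCommand(workerCommit: string): GitCommand {
  return ["git", "cherry-pick", workerCommit];
}

// Remove the worktree directory once the task is merged or abandoned.
function cleanupWorktreeCommand(repoRoot: string, taskId: string): GitCommand {
  return ["git", "worktree", "remove", "--force", worktreePath(repoRoot, taskId)];
}

// On abort: reset main to the start commit, then delete every worktree.
function abortCommands(repoRoot: string, taskIds: string[], startCommit: string): GitCommand[] {
  return [
    ["git", "reset", "--hard", startCommit],
    ...taskIds.map((id) => cleanupWorktreeCommand(repoRoot, id)),
  ];
}
```

Returning command arrays instead of executing them keeps the sketch testable without a repository; note that removing a worktree leaves its branch behind, so a real cleanup would probably delete the branch as well.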
55
-
56
- ### 2. Structured Review Step
57
-
58
- The coordinator reviews worker output before marking complete. This replaces the current "trust but verify with UBS" approach.
59
-
60
- **Review flow:**
61
- ```
62
- worker completes → coordinator reviews → approved/needs_changes
63
-
64
- if needs_changes: worker fixes (max 3 attempts)
65
-
66
- if approved: mark complete
67
- ```
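The review flow above can be sketched as a bounded loop. This is an illustration under assumptions: `ReviewVerdict`, `runReviewLoop`, and the callback signatures are hypothetical, the callbacks are synchronous for brevity (real review and fix steps would be asynchronous), and "max 3 attempts" is read here as at most three review rounds.

```typescript
type ReviewVerdict =
  | { status: "approved" }
  | { status: "needs_changes"; feedback: string };

interface ReviewOutcome {
  approved: boolean;
  attempts: number; // review rounds actually run
  feedback: string[]; // feedback collected from rejected rounds
}

// Hypothetical coordinator loop: review, ask the worker to fix, review again.
function runReviewLoop(
  review: (attempt: number) => ReviewVerdict,
  fix: (feedback: string) => void,
  maxAttempts = 3,
): ReviewOutcome {
  const feedback: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const verdict = review(attempt);
    if (verdict.status === "approved") {
      return { approved: true, attempts: attempt, feedback };
    }
    feedback.push(verdict.feedback);
    // Only request a fix if another review round remains.
    if (attempt < maxAttempts) fix(verdict.feedback);
  }
  return { approved: false, attempts: maxAttempts, feedback };
}
```

When the loop returns `approved: false`, the collected feedback gives the coordinator something concrete to act on, for example when deciding whether to reassign or abort the task.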
68
-
69
- **Review prompt includes:**
70
- - Epic goal (the big picture)
71
- - Task requirements
72
- - What completed tasks this builds on (dependency context)
73
- - What future tasks depend on this (downstream context)
74
- - The actual code changes
75
-
76
- **Why coordinator reviews (not separate reviewer agent):**
77
- - Coordinator already has full epic context loaded
78
- - Avoids spawning another agent just for review
79
- - Keeps the feedback loop tight
80
- - Coordinator can make judgment calls about "good enough"
81
-
82
- **Review criteria:**
83
- 1. Does it fulfill the task requirements?
84
- 2. Does it serve the epic goal?
85
- 3. Will downstream tasks be able to use it?
86
- 4. Are there critical bugs? (UBS scan still runs)
87
-
88
- ### 3. Retry Options on Abort
89
-
90
- When a swarm aborts (user request or failure), provide clear recovery paths:
91
-
92
- ```json
- {
-   "retry_options": {
-     "same_plan": "/swarm --retry",
-     "edit_plan": "/swarm --retry --edit",
-     "fresh_start": "/swarm \"original task\""
-   }
- }
- ```
101
-
102
- - **`--retry`**: Resume with same plan, skip completed tasks
103
- - **`--retry --edit`**: Show plan for modification before resuming
104
- - **Fresh start**: Decompose from scratch
105
-
106
- This requires persisting swarm session state (already have this via Hive cells).
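A minimal sketch of what "skip completed tasks" might mean for `--retry`, assuming the persisted plan is a list of tasks with a status field. The `PlannedTask` shape and its status values are hypothetical and are not taken from the Hive cell schema.

```typescript
// Hypothetical persisted task record; real Hive cells may use other fields and statuses.
interface PlannedTask {
  id: string;
  status: "completed" | "in_progress" | "open" | "failed";
}

// Keep everything that did not finish, and reset it to "open" so it is picked up again.
function tasksToRetry(plan: PlannedTask[]): PlannedTask[] {
  return plan
    .filter((task) => task.status !== "completed")
    .map((task): PlannedTask => ({ ...task, status: "open" }));
}

const previousPlan: PlannedTask[] = [
  { id: "bd-123.1", status: "completed" },
  { id: "bd-123.2", status: "failed" },
  { id: "bd-123.3", status: "in_progress" },
];
const retryPlan = tasksToRetry(previousPlan);
```

Resetting interrupted tasks to `open` is a design choice in this sketch: work that was in progress at abort time is treated as not done, since its output was never reviewed.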
107
-
108
- ## Implementation
109
-
110
- ### Phase 1: Structured Review (Priority)
111
- 1. Add review step to `swarm_complete`
112
- 2. Create review prompt with epic context injection
113
- 3. Handle needs_changes → worker retry loop (max 3)
114
- 4. Keep UBS scan as additional safety net
115
-
116
- ### Phase 2: Worktree Isolation
117
- 1. Add `isolation` mode to `swarm_init`
118
- 2. Implement worktree lifecycle tools
119
- 3. Update worker prompts to work in worktree path
120
- 4. Add cherry-pick merge on completion
121
- 5. Add cleanup on abort
122
-
123
- ### Phase 3: Retry Options
124
- 1. Persist session state for recovery
125
- 2. Add `--retry` and `--retry --edit` flags
126
- 3. Skip completed tasks on retry
127
- 4. Show plan editor for `--edit` mode
128
-
129
- ## Consequences
130
-
131
- ### Positive
132
- - **Better quality**: Structured review catches issues before integration
133
- - **Safer large refactors**: Worktree isolation eliminates merge conflicts
134
- - **Cleaner recovery**: Retry options reduce friction after failures
135
- - **Coordinator stays in control**: Review keeps human-in-the-loop feel
136
-
137
- ### Negative
138
- - **More complexity**: Two isolation modes to maintain
139
- - **Slower completion**: Review step adds latency
140
- - **Disk usage**: Worktrees consume space (mitigated by cleanup)
141
-
142
- ### Neutral
143
- - **Credit**: Patterns inspired by nexxeln/opencode-config - should acknowledge in docs
144
-
145
- ## Alternatives Considered
146
-
147
- ### Separate Reviewer Agent
148
- nexxeln uses a dedicated reviewer subagent. We chose coordinator-as-reviewer because:
149
- - Avoids context duplication (coordinator already has epic context)
150
- - Faster feedback loop
151
- - Coordinator can make "ship it" judgment calls
152
-
153
- ### Staged Changes on Finalize
154
- nexxeln soft-resets to leave changes staged for user review. We're skipping this because:
155
- - Our flow already has explicit commit step
156
- - Hive tracks what changed
157
- - User can always `git diff` before committing
158
-
159
- ### Always Use Worktrees
160
- Could simplify by always using worktrees. Rejected because:
161
- - Overkill for most tasks
162
- - Slower setup/teardown
163
- - File reservations work fine for typical parallel work
164
-
165
- ## References
166
-
167
- - [nexxeln/opencode-config](https://github.com/nexxeln/opencode-config) - Source of inspiration
168
- - Epic: `bd-lf2p4u-mjaja96b9da` - Swarm Enhancements
@@ -1,293 +0,0 @@
1
- # ADR-008: Worker Handoff Protocol - Structured Contracts Over Prose
2
-
3
- ## Status
4
-
5
- Proposed
6
-
7
- ## Context
8
-
9
- The current `SUBTASK_PROMPT_V2` is a **280-line prose instruction manual** that gets injected into every swarm worker's context. This approach has fundamental problems:
10
-
11
- ### Current Problems
12
-
13
- 1. **Workers ignore prose** - Long text instructions get skimmed or missed entirely
14
- 2. **No validation** - Can't programmatically verify workers followed protocol
15
- 3. **Context bloat** - 280 lines * N workers burns tokens fast
16
- 4. **Drift and violations** - Workers modify files outside their scope, no automatic detection
17
- 5. **Manual error recovery** - Coordinator can't auto-detect contract violations
18
-
19
- **Concrete example of failure:**
20
- ```
21
- Worker assigned: ["src/auth/service.ts"]
22
- Worker actually touched: ["src/auth/service.ts", "src/lib/jwt.ts", "src/types/user.ts"]
23
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24
- Scope creep undetected until swarm_complete
25
- ```
26
-
27
- Current `swarm_complete` validates `files_touched ⊆ files_owned`, but the **contract** was never machine-readable to begin with.
28
-
29
- ### Research & Inspirations
30
-
31
- From "Patterns for Building AI Agents" and production event-driven systems:
32
-
33
- **mdflow adapter pattern:**
34
- - Convention-based behavior inference
35
- - Template variables define expectations
36
- - Minimal configuration, maximum clarity
37
-
38
- **Bellemare's event-driven orchestration:**
39
- - Explicit contracts between services
40
- - Commands vs Events distinction
41
- - Contract violations fail fast with clear errors
42
-
43
- **Key insight:** Agents need **two channels**:
44
- 1. **Contract** (machine-readable, validated) - WHAT to do, WHERE to do it
45
- 2. **Context** (human-readable, advisory) - WHY it matters, HOW it fits together
46
-
47
- ## Decision
48
-
49
- Replace 280-line prose with **WorkerHandoff envelope** that separates contract from context.
50
-
51
- ### WorkerHandoff Structure
52
-
53
- ```typescript
- interface WorkerHandoff {
-   // Machine-readable - enforced by tools
-   contract: {
-     task_id: string; // Cell ID for tracking
-     files_owned: string[]; // Exclusive write access (validated)
-     files_readonly: string[]; // Can read, MUST NOT modify (validated)
-     dependencies_completed: string[]; // Tasks that finished before this
-     success_criteria: string[]; // Exit conditions (checkable)
-   };
-
-   // Human-readable - advisory context
-   context: {
-     epic_summary: string; // Big picture goal
-     your_role: string; // What this subtask accomplishes
-     what_others_did: string; // Dependency outputs
-     what_comes_next: string; // Downstream task expectations
-   };
-
-   // Escalation paths - when things go wrong
-   escalation: {
-     blocked_contact: string; // "coordinator" or agent name
-     scope_change_protocol: string; // "swarmmail_send + await approval"
-   };
- }
- ```
79
-
80
- ### Example Handoff
81
-
82
- ```json
- {
-   "contract": {
-     "task_id": "bd-123.2",
-     "files_owned": ["src/auth/service.ts", "src/auth/service.test.ts"],
-     "files_readonly": ["src/types/user.ts", "src/lib/jwt.ts"],
-     "dependencies_completed": ["bd-123.1"],
-     "success_criteria": [
-       "AuthService.login() returns JWT token",
-       "Tests pass: bun test src/auth/service.test.ts",
-       "Type check passes: tsc --noEmit"
-     ]
-   },
-   "context": {
-     "epic_summary": "Add OAuth authentication to user service",
-     "your_role": "Implement AuthService with JWT token generation",
-     "what_others_did": "bd-123.1 created User schema with email/password fields",
-     "what_comes_next": "bd-123.3 will integrate this service into API routes"
-   },
-   "escalation": {
-     "blocked_contact": "coordinator",
-     "scope_change_protocol": "swarmmail_send(subject='Scope Change', ack_required=true)"
-   }
- }
- ```
107
-
108
- ### Validation in swarm_complete
109
-
110
- ```typescript
- // swarm_complete now validates against contract
- function validateCompletion(handoff: WorkerHandoff, result: CompletionReport) {
-   const violations: string[] = [];
-
-   // 1. File scope violations
-   const unauthorized = result.files_touched.filter(
-     f => !handoff.contract.files_owned.includes(f)
-   );
-   if (unauthorized.length > 0) {
-     violations.push(`Touched unauthorized files: ${unauthorized.join(", ")}`);
-   }
-
-   // 2. Success criteria (checkable ones)
-   for (const criterion of handoff.contract.success_criteria) {
-     if (criterion.startsWith("Tests pass:")) {
-       // Run the test command, validate exit 0
-     }
-     if (criterion.startsWith("Type check passes:")) {
-       // Run tsc --noEmit, validate exit 0
-     }
-   }
-
-   // 3. Learning signals from violations
-   if (violations.length > 0) {
-     recordLearningSignal({
-       task_id: handoff.contract.task_id,
-       violation_type: "scope_creep",
-       details: violations,
-       impact: "negative" // Penalize decomposition strategy
-     });
-   }
-
-   return { valid: violations.length === 0, violations };
- }
- ```
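The two placeholder branches in step 2 depend on turning a criterion string into something runnable. A minimal sketch of that parsing step follows, assuming the `Tests pass:` and `Type check passes:` prefixes from the example handoff are the only checkable forms; `parseCriterion` and the `CriterionCheck` type are hypothetical names.

```typescript
// A criterion is either backed by a shell command or left for manual/LLM review.
type CriterionCheck =
  | { kind: "command"; label: string; command: string }
  | { kind: "manual"; text: string };

// Prefixes taken from the example handoff; anything else is treated as not checkable.
const CHECKABLE_PREFIXES = ["Tests pass:", "Type check passes:"];

function parseCriterion(criterion: string): CriterionCheck {
  for (const prefix of CHECKABLE_PREFIXES) {
    if (criterion.startsWith(prefix)) {
      const command = criterion.slice(prefix.length).trim();
      if (command.length > 0) {
        // Drop the trailing colon from the prefix to get a readable label.
        return { kind: "command", label: prefix.slice(0, -1), command };
      }
    }
  }
  return { kind: "manual", text: criterion };
}
```

A runner would then execute `command` and treat a zero exit code as success, while `manual` criteria stay with the reviewer, which matches the noted limitation that not every criterion can be validated automatically.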
146
-
147
- ### Integration with Existing Tools
148
-
149
- **swarm_spawn_subtask generates handoffs:**
150
-
151
- ```typescript
- export const swarm_spawn_subtask = tool(/* ... */)
-   .handler(async ({ input, context }) => {
-     const handoff: WorkerHandoff = {
-       contract: {
-         task_id: input.bead_id,
-         files_owned: input.files,
-         files_readonly: inferReadonlyFiles(input.files, epicContext),
-         dependencies_completed: input.dependencies_completed || [],
-         success_criteria: generateSuccessCriteria(input.subtask_description)
-       },
-       context: {
-         epic_summary: epicContext.summary,
-         your_role: input.subtask_title,
-         what_others_did: summarizeDependencies(input.dependencies_completed),
-         what_comes_next: summarizeDownstream(input.bead_id)
-       },
-       escalation: {
-         blocked_contact: "coordinator",
-         scope_change_protocol: "swarmmail_send(subject='Scope Change', ack_required=true)"
-       }
-     };
-
-     return formatHandoff(handoff); // Compact JSON + minimal prose wrapper
-   });
- ```
177
-
178
- **swarm_complete validates contract:**
179
-
180
- ```typescript
- export const swarm_complete = tool(/* ... */)
-   .handler(async ({ input, context }) => {
-     const handoff = getStoredHandoff(input.bead_id);
-     const validation = validateCompletion(handoff, {
-       files_touched: input.files_touched,
-       summary: input.summary
-     });
-
-     if (!validation.valid) {
-       throw new Error(
-         `Contract violations detected:\n${validation.violations.join("\n")}`
-       );
-     }
-
-     // Proceed with UBS scan, reservation release, etc.
-   });
- ```
198
-
199
- ## Consequences
200
-
201
- ### Positive
202
-
203
- - **Validation enforced** - Can't complete with contract violations
204
- - **Clear boundaries** - Workers know exactly what's in/out of scope
205
- - **Better learning** - Scope creep violations feed back into strategy selection
206
- - **Context efficiency** - Contract is ~30 lines JSON vs 280 lines prose
207
- - **Fail fast** - Violations detected immediately, not during merge
208
- - **Programmatic recovery** - Coordinator can auto-detect and reassign work
209
-
210
- ### Negative
211
-
212
- - **Requires storage** - Handoffs must persist (already have event store)
213
- - **Success criteria limited** - Can't validate all criteria automatically
214
- - **Migration cost** - Existing `SUBTASK_PROMPT_V2` users need update
215
- - **More upfront work** - Coordinator must generate better contracts
216
-
217
- ### Neutral
218
-
219
- - **Prose still exists** - `context` field provides human explanation, just smaller
220
- - **Not eliminating checklist** - 9-step survival checklist stays, but moves to tool enforcement
221
-
222
- ## Implementation Notes
223
-
224
- ### Phase 1: Storage & Schema
225
-
226
- 1. Add `WorkerHandoff` schema to swarm-mail event types
227
- 2. Store handoffs in event log when spawning subtasks
228
- 3. Retrieve handoffs in `swarm_complete` for validation
229
-
230
- ### Phase 2: Generation Logic
231
-
232
- 1. Implement `inferReadonlyFiles()` - analyze imports/dependencies
233
- 2. Implement `generateSuccessCriteria()` - parse task description for checkable conditions
234
- 3. Implement `summarizeDependencies()` and `summarizeDownstream()` - build context from epic graph
235
-
236
- ### Phase 3: Validation
237
-
238
- 1. Add contract validation to `swarm_complete`
239
- 2. Implement checkable criteria runners (test commands, type checks)
240
- 3. Record learning signals for violations
241
-
242
- ### Phase 4: Migration
243
-
244
- 1. Update `formatSubtaskPromptV2` to generate handoff JSON
245
- 2. Deprecate 280-line prose template
246
- 3. Update tests for new handoff format
247
-
248
- ### Phase 5: Enhanced Features (Future)
249
-
250
- 1. **Readonly enforcement** - Detect modifications to `files_readonly` via git diff
251
- 2. **Dependency validation** - Verify `dependencies_completed` actually ran first
252
- 3. **Auto-generated success criteria** - Parse test files, infer criteria from code
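The readonly enforcement item could be sketched as a set comparison between the contract and the list of changed files, for example the output of `git diff --name-only` against the task's start commit. `checkScope`, `FileContract`, and `ScopeReport` are hypothetical names; only `files_owned` and `files_readonly` come from the handoff schema.

```typescript
// Subset of the WorkerHandoff contract needed for scope checks.
interface FileContract {
  files_owned: string[];
  files_readonly: string[];
}

interface ScopeReport {
  readonlyViolations: string[]; // files the worker was explicitly told not to modify
  unownedChanges: string[]; // files outside the contract entirely
}

// changedFiles is assumed to be a list of repo-relative paths, e.g. from `git diff --name-only`.
function checkScope(contract: FileContract, changedFiles: string[]): ScopeReport {
  const owned = new Set(contract.files_owned);
  const readonlyFiles = new Set(contract.files_readonly);
  return {
    readonlyViolations: changedFiles.filter((f) => readonlyFiles.has(f)),
    unownedChanges: changedFiles.filter((f) => !owned.has(f) && !readonlyFiles.has(f)),
  };
}
```

Separating the two lists lets the coordinator treat a modified readonly file as a hard violation while handling an unlisted file as a possible scope-change request.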
253
-
254
- ## Alternatives Considered
255
-
256
- ### Keep Prose, Add Validation
257
-
258
- Keep `SUBTASK_PROMPT_V2` but add validation after-the-fact. **Rejected** because:
259
- - Still burns 280 lines of context per worker
260
- - Workers still ignore prose
261
- - Validation happens too late (after work done)
262
-
263
- ### Minimal Contract Only
264
-
265
- Remove context entirely, pure machine contract. **Rejected** because:
266
- - Workers need WHY to make good judgment calls
267
- - Context helps with edge cases not in contract
268
- - Loss of human readability hurts debugging
269
-
270
- ### Command Pattern (Bellemare Style)
271
-
272
- Full event-sourcing with Command objects. **Rejected** because:
273
- - Over-engineered for current needs
274
- - Already have event store for coordination
275
- - Contract + context is simpler and sufficient
276
-
277
- ## References
278
-
279
- - **"Patterns for Building AI Agents"** - Subagent context sharing patterns
280
- - **mdflow** - Convention-based adapter design, template variable contracts
281
- - **Bellemare's "Building Event-Driven Microservices"** - Explicit contracts, fail-fast validation
282
- - **Current implementation:** `src/swarm-prompts.ts` (SUBTASK_PROMPT_V2, lines 253-530)
283
- - **Related:** ADR-007 (Structured Review), ADR-002 (Package Extraction)
284
-
285
- ## Success Criteria
286
-
287
- - [ ] `WorkerHandoff` schema defined and validated with Zod
288
- - [ ] `swarm_spawn_subtask` generates handoffs instead of raw prose
289
- - [ ] `swarm_complete` validates contract before accepting completion
290
- - [ ] Scope violations trigger learning signals (negative feedback)
291
- - [ ] Workers receive handoff as JSON + compact context wrapper (<50 lines)
292
- - [ ] Test suite validates contract enforcement catches violations
293
- - [ ] Migration path documented for existing swarm users