opencode-swarm-plugin 0.44.0 → 0.44.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (205)
  1. package/bin/swarm.serve.test.ts +6 -4
  2. package/bin/swarm.ts +16 -10
  3. package/dist/compaction-prompt-scoring.js +139 -0
  4. package/dist/eval-capture.js +12811 -0
  5. package/dist/hive.d.ts.map +1 -1
  6. package/dist/index.js +7644 -62599
  7. package/dist/plugin.js +23766 -78721
  8. package/dist/swarm-orchestrate.d.ts.map +1 -1
  9. package/dist/swarm-prompts.d.ts.map +1 -1
  10. package/dist/swarm-review.d.ts.map +1 -1
  11. package/package.json +17 -5
  12. package/.changeset/swarm-insights-data-layer.md +0 -63
  13. package/.hive/analysis/eval-failure-analysis-2025-12-25.md +0 -331
  14. package/.hive/analysis/session-data-quality-audit.md +0 -320
  15. package/.hive/eval-results.json +0 -483
  16. package/.hive/issues.jsonl +0 -138
  17. package/.hive/memories.jsonl +0 -729
  18. package/.opencode/eval-history.jsonl +0 -327
  19. package/.turbo/turbo-build.log +0 -9
  20. package/CHANGELOG.md +0 -2286
  21. package/SCORER-ANALYSIS.md +0 -598
  22. package/docs/analysis/subagent-coordination-patterns.md +0 -902
  23. package/docs/analysis-socratic-planner-pattern.md +0 -504
  24. package/docs/planning/ADR-001-monorepo-structure.md +0 -171
  25. package/docs/planning/ADR-002-package-extraction.md +0 -393
  26. package/docs/planning/ADR-003-performance-improvements.md +0 -451
  27. package/docs/planning/ADR-004-message-queue-features.md +0 -187
  28. package/docs/planning/ADR-005-devtools-observability.md +0 -202
  29. package/docs/planning/ADR-007-swarm-enhancements-worktree-review.md +0 -168
  30. package/docs/planning/ADR-008-worker-handoff-protocol.md +0 -293
  31. package/docs/planning/ADR-009-oh-my-opencode-patterns.md +0 -353
  32. package/docs/planning/ADR-010-cass-inhousing.md +0 -1215
  33. package/docs/planning/ROADMAP.md +0 -368
  34. package/docs/semantic-memory-cli-syntax.md +0 -123
  35. package/docs/swarm-mail-architecture.md +0 -1147
  36. package/docs/testing/context-recovery-test.md +0 -470
  37. package/evals/ARCHITECTURE.md +0 -1189
  38. package/evals/README.md +0 -768
  39. package/evals/compaction-prompt.eval.ts +0 -149
  40. package/evals/compaction-resumption.eval.ts +0 -289
  41. package/evals/coordinator-behavior.eval.ts +0 -307
  42. package/evals/coordinator-session.eval.ts +0 -154
  43. package/evals/evalite.config.ts.bak +0 -15
  44. package/evals/example.eval.ts +0 -31
  45. package/evals/fixtures/cass-baseline.ts +0 -217
  46. package/evals/fixtures/compaction-cases.ts +0 -350
  47. package/evals/fixtures/compaction-prompt-cases.ts +0 -311
  48. package/evals/fixtures/coordinator-sessions.ts +0 -328
  49. package/evals/fixtures/decomposition-cases.ts +0 -105
  50. package/evals/lib/compaction-loader.test.ts +0 -248
  51. package/evals/lib/compaction-loader.ts +0 -320
  52. package/evals/lib/data-loader.evalite-test.ts +0 -289
  53. package/evals/lib/data-loader.test.ts +0 -345
  54. package/evals/lib/data-loader.ts +0 -281
  55. package/evals/lib/llm.ts +0 -115
  56. package/evals/scorers/compaction-prompt-scorers.ts +0 -145
  57. package/evals/scorers/compaction-scorers.ts +0 -305
  58. package/evals/scorers/coordinator-discipline.evalite-test.ts +0 -539
  59. package/evals/scorers/coordinator-discipline.ts +0 -325
  60. package/evals/scorers/index.test.ts +0 -146
  61. package/evals/scorers/index.ts +0 -328
  62. package/evals/scorers/outcome-scorers.evalite-test.ts +0 -27
  63. package/evals/scorers/outcome-scorers.ts +0 -349
  64. package/evals/swarm-decomposition.eval.ts +0 -121
  65. package/examples/commands/swarm.md +0 -745
  66. package/examples/plugin-wrapper-template.ts +0 -2515
  67. package/examples/skills/hive-workflow/SKILL.md +0 -212
  68. package/examples/skills/skill-creator/SKILL.md +0 -223
  69. package/examples/skills/swarm-coordination/SKILL.md +0 -292
  70. package/global-skills/cli-builder/SKILL.md +0 -344
  71. package/global-skills/cli-builder/references/advanced-patterns.md +0 -244
  72. package/global-skills/learning-systems/SKILL.md +0 -644
  73. package/global-skills/skill-creator/LICENSE.txt +0 -202
  74. package/global-skills/skill-creator/SKILL.md +0 -352
  75. package/global-skills/skill-creator/references/output-patterns.md +0 -82
  76. package/global-skills/skill-creator/references/workflows.md +0 -28
  77. package/global-skills/swarm-coordination/SKILL.md +0 -995
  78. package/global-skills/swarm-coordination/references/coordinator-patterns.md +0 -235
  79. package/global-skills/swarm-coordination/references/strategies.md +0 -138
  80. package/global-skills/system-design/SKILL.md +0 -213
  81. package/global-skills/testing-patterns/SKILL.md +0 -430
  82. package/global-skills/testing-patterns/references/dependency-breaking-catalog.md +0 -586
  83. package/opencode-swarm-plugin-0.30.7.tgz +0 -0
  84. package/opencode-swarm-plugin-0.31.0.tgz +0 -0
  85. package/scripts/cleanup-test-memories.ts +0 -346
  86. package/scripts/init-skill.ts +0 -222
  87. package/scripts/migrate-unknown-sessions.ts +0 -349
  88. package/scripts/validate-skill.ts +0 -204
  89. package/src/agent-mail.ts +0 -1724
  90. package/src/anti-patterns.test.ts +0 -1167
  91. package/src/anti-patterns.ts +0 -448
  92. package/src/compaction-capture.integration.test.ts +0 -257
  93. package/src/compaction-hook.test.ts +0 -838
  94. package/src/compaction-hook.ts +0 -1204
  95. package/src/compaction-observability.integration.test.ts +0 -139
  96. package/src/compaction-observability.test.ts +0 -187
  97. package/src/compaction-observability.ts +0 -324
  98. package/src/compaction-prompt-scorers.test.ts +0 -475
  99. package/src/compaction-prompt-scoring.ts +0 -300
  100. package/src/contributor-tools.test.ts +0 -133
  101. package/src/contributor-tools.ts +0 -201
  102. package/src/dashboard.test.ts +0 -611
  103. package/src/dashboard.ts +0 -462
  104. package/src/error-enrichment.test.ts +0 -403
  105. package/src/error-enrichment.ts +0 -219
  106. package/src/eval-capture.test.ts +0 -1015
  107. package/src/eval-capture.ts +0 -929
  108. package/src/eval-gates.test.ts +0 -306
  109. package/src/eval-gates.ts +0 -218
  110. package/src/eval-history.test.ts +0 -508
  111. package/src/eval-history.ts +0 -214
  112. package/src/eval-learning.test.ts +0 -378
  113. package/src/eval-learning.ts +0 -360
  114. package/src/eval-runner.test.ts +0 -223
  115. package/src/eval-runner.ts +0 -402
  116. package/src/export-tools.test.ts +0 -476
  117. package/src/export-tools.ts +0 -257
  118. package/src/hive.integration.test.ts +0 -2241
  119. package/src/hive.ts +0 -1628
  120. package/src/index.ts +0 -940
  121. package/src/learning.integration.test.ts +0 -1815
  122. package/src/learning.ts +0 -1079
  123. package/src/logger.test.ts +0 -189
  124. package/src/logger.ts +0 -135
  125. package/src/mandate-promotion.test.ts +0 -473
  126. package/src/mandate-promotion.ts +0 -239
  127. package/src/mandate-storage.integration.test.ts +0 -601
  128. package/src/mandate-storage.test.ts +0 -578
  129. package/src/mandate-storage.ts +0 -794
  130. package/src/mandates.ts +0 -540
  131. package/src/memory-tools.test.ts +0 -195
  132. package/src/memory-tools.ts +0 -344
  133. package/src/memory.integration.test.ts +0 -334
  134. package/src/memory.test.ts +0 -158
  135. package/src/memory.ts +0 -527
  136. package/src/model-selection.test.ts +0 -188
  137. package/src/model-selection.ts +0 -68
  138. package/src/observability-tools.test.ts +0 -359
  139. package/src/observability-tools.ts +0 -871
  140. package/src/output-guardrails.test.ts +0 -438
  141. package/src/output-guardrails.ts +0 -381
  142. package/src/pattern-maturity.test.ts +0 -1160
  143. package/src/pattern-maturity.ts +0 -525
  144. package/src/planning-guardrails.test.ts +0 -491
  145. package/src/planning-guardrails.ts +0 -438
  146. package/src/plugin.ts +0 -23
  147. package/src/post-compaction-tracker.test.ts +0 -251
  148. package/src/post-compaction-tracker.ts +0 -237
  149. package/src/query-tools.test.ts +0 -636
  150. package/src/query-tools.ts +0 -324
  151. package/src/rate-limiter.integration.test.ts +0 -466
  152. package/src/rate-limiter.ts +0 -774
  153. package/src/replay-tools.test.ts +0 -496
  154. package/src/replay-tools.ts +0 -240
  155. package/src/repo-crawl.integration.test.ts +0 -441
  156. package/src/repo-crawl.ts +0 -610
  157. package/src/schemas/cell-events.test.ts +0 -347
  158. package/src/schemas/cell-events.ts +0 -807
  159. package/src/schemas/cell.ts +0 -257
  160. package/src/schemas/evaluation.ts +0 -166
  161. package/src/schemas/index.test.ts +0 -199
  162. package/src/schemas/index.ts +0 -286
  163. package/src/schemas/mandate.ts +0 -232
  164. package/src/schemas/swarm-context.ts +0 -115
  165. package/src/schemas/task.ts +0 -161
  166. package/src/schemas/worker-handoff.test.ts +0 -302
  167. package/src/schemas/worker-handoff.ts +0 -131
  168. package/src/sessions/agent-discovery.test.ts +0 -137
  169. package/src/sessions/agent-discovery.ts +0 -112
  170. package/src/sessions/index.ts +0 -15
  171. package/src/skills.integration.test.ts +0 -1192
  172. package/src/skills.test.ts +0 -643
  173. package/src/skills.ts +0 -1549
  174. package/src/storage.integration.test.ts +0 -341
  175. package/src/storage.ts +0 -884
  176. package/src/structured.integration.test.ts +0 -817
  177. package/src/structured.test.ts +0 -1046
  178. package/src/structured.ts +0 -762
  179. package/src/swarm-decompose.test.ts +0 -188
  180. package/src/swarm-decompose.ts +0 -1302
  181. package/src/swarm-deferred.integration.test.ts +0 -157
  182. package/src/swarm-deferred.test.ts +0 -38
  183. package/src/swarm-insights.test.ts +0 -214
  184. package/src/swarm-insights.ts +0 -459
  185. package/src/swarm-mail.integration.test.ts +0 -970
  186. package/src/swarm-mail.ts +0 -739
  187. package/src/swarm-orchestrate.integration.test.ts +0 -282
  188. package/src/swarm-orchestrate.test.ts +0 -548
  189. package/src/swarm-orchestrate.ts +0 -3084
  190. package/src/swarm-prompts.test.ts +0 -1270
  191. package/src/swarm-prompts.ts +0 -2077
  192. package/src/swarm-research.integration.test.ts +0 -701
  193. package/src/swarm-research.test.ts +0 -698
  194. package/src/swarm-research.ts +0 -472
  195. package/src/swarm-review.integration.test.ts +0 -285
  196. package/src/swarm-review.test.ts +0 -879
  197. package/src/swarm-review.ts +0 -709
  198. package/src/swarm-strategies.ts +0 -407
  199. package/src/swarm-worktree.test.ts +0 -501
  200. package/src/swarm-worktree.ts +0 -575
  201. package/src/swarm.integration.test.ts +0 -2377
  202. package/src/swarm.ts +0 -38
  203. package/src/tool-adapter.integration.test.ts +0 -1221
  204. package/src/tool-availability.ts +0 -461
  205. package/tsconfig.json +0 -28
@@ -1,202 +0,0 @@
1
- # ADR-005: DevTools + Observability
2
-
3
- ## Status
4
-
5
- Proposed
6
-
7
- ## Context
8
-
9
- Swarm Mail currently has no visibility:
10
-
11
- - No UI to inspect events, messages, locks
12
- - No metrics on latency, queue depth, throughput
13
- - No distributed tracing across agents
14
- - Hard to debug coordination issues
15
-
16
- We need both developer tools (UI + CLI) and production observability (metrics + tracing).
17
-
18
- ## Decision
19
-
20
- Build layered observability:
21
-
22
- ### 1. DevTools UI (SvelteKit)
23
-
24
- **Stack:**
25
-
26
- - SvelteKit for SSR + static export
27
- - Vite for dev server + build
28
- - Server-Sent Events (SSE) for real-time updates
29
- - Embeddable static build
30
-
31
- **Features:**
32
-
33
- - Event stream viewer (filterable, searchable)
34
- - Message inbox/outbox per agent
35
- - File reservation timeline
36
- - Saga instance tracker (future)
37
-
38
- **Build:**
39
-
40
- ```bash
41
- cd apps/devtools
42
- bun run build # Static export to apps/devtools/build
43
- ```
44
-
45
- **Embed in plugin:**
46
-
47
- ```typescript
48
- // Serve static UI at /_swarm/devtools
49
- const server = serve({
50
- port: 4000,
51
- fetch: (req) => {
52
- if (new URL(req.url).pathname.startsWith("/_swarm/devtools")) {
53
- return serveStatic("apps/devtools/build");
54
- }
55
- },
56
- });
57
- ```
58
-
59
- ### 2. CLI (@effect/cli)
60
-
61
- **Commands:**
62
-
63
- ```bash
64
- swarm events [--project <key>] [--type <type>] [--tail]
65
- swarm messages [--agent <name>] [--unread]
66
- swarm locks [--agent <name>]
67
- swarm replay --from <sequence> [--to <sequence>]
68
- swarm metrics
69
- ```
70
-
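The `swarm replay --from <sequence> [--to <sequence>]` command implies selecting a contiguous slice of the event log. A minimal sketch of that selection follows, assuming inclusive bounds and an open-ended range when `--to` is omitted; the ADR does not define these semantics, and `SequencedEvent`, `selectForReplay`, and the event type names are hypothetical.

```typescript
// Hypothetical event shape; only `sequence` matters for replay selection.
interface SequencedEvent {
  sequence: number;
  type: string;
}

// Returns the events to replay in ascending sequence order.
// Assumption: both bounds are inclusive and `to` is optional.
function selectForReplay<T extends SequencedEvent>(events: T[], from: number, to?: number): T[] {
  return events
    .filter((e) => e.sequence >= from && (to === undefined || e.sequence <= to))
    .sort((a, b) => a.sequence - b.sequence);
}

// Illustrative log with made-up event types, deliberately out of order.
const log: SequencedEvent[] = [
  { sequence: 3, type: "message_sent" },
  { sequence: 1, type: "agent_registered" },
  { sequence: 2, type: "file_reserved" },
  { sequence: 4, type: "file_released" },
];

const replayed = selectForReplay(log, 2, 3);
const replayedToEnd = selectForReplay(log, 3);
```

Sorting after filtering leaves the input array untouched, so the same log can be replayed repeatedly with different ranges.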
71
- **Implementation:**
72
-
73
- ```typescript
74
- import { Command, Options } from "@effect/cli";
75
-
76
- const eventsCommand = Command.make(
77
- "events",
78
- {
79
- project: Options.text("project").pipe(Options.optional),
80
- type: Options.text("type").pipe(Options.optional),
81
- tail: Options.boolean("tail"),
82
- },
83
- ({ project, type, tail }) => {
84
- // Query events table, optionally --tail with live query
85
- },
86
- );
87
- ```
88
-
89
- ### 3. Metrics (Prometheus)
90
-
91
- **Histograms:**
92
-
93
- - `swarm_message_latency_seconds` - Send to receive time
94
- - `swarm_lock_contention_seconds` - Time waiting for lock
95
- - `swarm_queue_depth` - Unread messages per agent (a point-in-time value; Phase 3 tracks it as a gauge)
96
-
97
- **Counters:**
98
-
99
- - `swarm_events_total{type}` - Events by type
100
- - `swarm_messages_sent_total{sender, recipient}`
101
- - `swarm_locks_acquired_total{agent}`
102
-
103
- **Example:**
104
-
105
- ```typescript
106
- import { Registry, Histogram } from 'prom-client'
107
-
108
- const messageLatency = new Histogram({
109
- name: 'swarm_message_latency_seconds',
110
- help: 'Message delivery latency',
111
- buckets: [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
112
- })
113
-
114
- // Record latency
115
- const start = Date.now()
116
- await sendMessage(msg)
117
- const latency = (Date.now() - start) / 1000
118
- messageLatency.observe(latency)
119
- ```
120
-
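How the `buckets` boundaries turn individual observations into counts can be sketched without any dependency. The class below is a hypothetical stand-in (`SimpleHistogram` is not part of prom-client) that illustrates the cumulative "less than or equal" bucket semantics Prometheus histograms use, with the same boundaries as the example.

```typescript
// Hypothetical illustration of cumulative histogram buckets; not the prom-client API.
class SimpleHistogram {
  private readonly bounds: number[];
  private readonly counts: number[];
  private total = 0;

  constructor(bounds: number[]) {
    // Bounds are upper limits in seconds, kept in ascending order.
    this.bounds = [...bounds].sort((a, b) => a - b);
    this.counts = this.bounds.map(() => 0);
  }

  observe(seconds: number): void {
    this.total += 1;
    // Cumulative semantics: every bucket whose bound is >= the value is incremented.
    this.bounds.forEach((bound, i) => {
      if (seconds <= bound) this.counts[i] += 1;
    });
  }

  snapshot(): { le: string; count: number }[] {
    const rows = this.bounds.map((bound, i) => ({ le: String(bound), count: this.counts[i] }));
    // The implicit +Inf bucket always equals the total number of observations.
    rows.push({ le: "+Inf", count: this.total });
    return rows;
  }
}

const latencySketch = new SimpleHistogram([0.01, 0.05, 0.1, 0.5, 1.0, 5.0]);
[0.004, 0.03, 0.2, 7].forEach((s) => latencySketch.observe(s));
const bucketRows = latencySketch.snapshot();
```

An observation above the largest boundary (7 seconds here) lands only in `+Inf`, which is why the top bucket should sit above the slowest delivery time worth distinguishing.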
121
- ### 4. Distributed Tracing (OpenTelemetry)
122
-
123
- **Integration:**
124
-
125
- ```typescript
126
- import * as EffectOtel from '@effect/opentelemetry'
127
- import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node'
128
-
129
- const provider = new NodeTracerProvider()
130
- const tracer = provider.getTracer('swarm-mail')
131
-
132
- // Trace message send
133
- const span = tracer.startSpan('sendMessage', {
134
- attributes: {
135
- 'swarm.sender': 'AgentA',
136
- 'swarm.recipient': 'AgentB',
137
- 'swarm.thread_id': 'bd-123'
138
- }
139
- })
140
-
141
- await sendMessage(msg)
142
- span.end()
143
- ```
144
-
145
- **Trace Propagation:**
146
-
147
- - Add trace_id to message metadata
148
- - Worker agents continue traces from parents
149
- - Visualize full swarm execution flow
150
-
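The propagation steps above can be sketched as plain data handling. This is a minimal illustration, assuming trace context travels in a `metadata.trace` field; the `TraceContext` and `SwarmMessage` shapes and both helper functions are hypothetical, and a real implementation would rely on OpenTelemetry's own context propagation and random IDs.

```typescript
// Hypothetical shapes for illustration; the real swarm-mail message type may differ.
interface TraceContext {
  trace_id: string; // Shared by every span in one swarm run
  span_id: string; // Identifies this unit of work
  parent_span_id?: string; // Links a worker span to the span that spawned it
}

interface SwarmMessage {
  sender: string;
  recipient: string;
  body: string;
  metadata: { trace?: TraceContext };
}

// Deterministic ID generator for the sketch; real code would use random IDs.
let nextId = 0;
const newId = (prefix: string): string => `${prefix}-${++nextId}`;

// Sender side: attach the current trace context to the outgoing message metadata.
function withTrace(msg: Omit<SwarmMessage, "metadata">, current: TraceContext): SwarmMessage {
  return { ...msg, metadata: { trace: current } };
}

// Worker side: continue the parent's trace, or start a new one if none was sent.
function continueTrace(msg: SwarmMessage): TraceContext {
  const parent = msg.metadata.trace;
  return parent
    ? { trace_id: parent.trace_id, span_id: newId("span"), parent_span_id: parent.span_id }
    : { trace_id: newId("trace"), span_id: newId("span") };
}

const rootSpan: TraceContext = { trace_id: newId("trace"), span_id: newId("span") };
const sent = withTrace({ sender: "AgentA", recipient: "AgentB", body: "start bd-123" }, rootSpan);
const workerSpan = continueTrace(sent);
```

Because every worker span keeps the coordinator's `trace_id` and records its parent, a trace backend can reassemble the full swarm execution as one tree.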
151
- ## Consequences
152
-
153
- ### Easier
154
-
155
- - **Visibility** - See all events, messages, locks in real-time
156
- - **Debugging** - Trace issues across agents via distributed tracing
157
- - **Performance** - Identify slow operations via histograms
158
- - **Operations** - CLI for prod debugging without UI
159
-
160
- ### More Difficult
161
-
162
- - **Maintenance** - Another app to maintain (DevTools UI)
163
- - **Bundle size** - Metrics/tracing deps increase plugin size
164
- - **Performance overhead** - Instrumentation adds latency
165
- - **Configuration** - Metrics exporters, trace backends
166
-
167
- ## Implementation Notes
168
-
169
- ### Phase 1: CLI (Week 1)
170
-
171
- - Add @effect/cli dependency
172
- - Implement events, messages, locks commands
173
- - Test with real swarm sessions
174
-
175
- ### Phase 2: DevTools UI (Week 2-3)
176
-
177
- - Scaffold SvelteKit app
178
- - Build event stream viewer
179
- - Add SSE endpoint for real-time updates
180
- - Static export + embed in plugin
181
-
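For the SSE endpoint in Phase 2, each event has to be serialized into the Server-Sent Events wire format. A minimal sketch follows, assuming events carry a sequence number, a type, and a JSON-serializable payload; `SwarmEvent` and `toSseFrame` are hypothetical names, and the events table schema is not shown in the ADR.

```typescript
// Hypothetical event shape for illustration.
interface SwarmEvent {
  sequence: number;
  type: string;
  payload: unknown;
}

// Formats one event as an SSE frame: `id`, `event`, and `data` fields,
// terminated by a blank line. JSON.stringify without indentation keeps
// the payload on a single `data:` line.
function toSseFrame(event: SwarmEvent): string {
  const data = JSON.stringify(event.payload ?? null);
  return `id: ${event.sequence}\nevent: ${event.type}\ndata: ${data}\n\n`;
}

const frame = toSseFrame({
  sequence: 42,
  type: "message_sent",
  payload: { to: "AgentB" },
});
```

Using the sequence number as the SSE `id` means a reconnecting browser reports the last event it saw in the `Last-Event-ID` header, so the server can resume the stream from that point.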
182
- ### Phase 3: Metrics (Week 4)
183
-
184
- - Add prom-client dependency
185
- - Instrument send/receive latency
186
- - Add queue depth gauge
187
- - Expose /metrics endpoint
188
-
189
- ### Phase 4: Tracing (Week 5)
190
-
191
- - Add @effect/opentelemetry
192
- - Instrument message send/receive
193
- - Propagate trace context
194
- - Test with Jaeger/Zipkin
195
-
196
- ### Success Criteria
197
-
198
- - [ ] CLI can tail events in real-time
199
- - [ ] DevTools UI shows live message stream
200
- - [ ] Metrics exposed at /metrics endpoint
201
- - [ ] Traces visible in Jaeger UI
202
- - [ ] Documentation for all observability tools
@@ -1,168 +0,0 @@
1
- # ADR-007: Swarm Enhancements - Worktree Isolation + Structured Review
2
-
3
- ## Status
4
-
5
- Proposed
6
-
7
- ## Context
8
-
9
- After reviewing [nexxeln/opencode-config](https://github.com/nexxeln/opencode-config), we identified several patterns that would strengthen our swarm coordination:
10
-
11
- 1. **Git worktree isolation** - Each worker gets a complete isolated copy of the repo
12
- 2. **Structured review loop** - Workers must pass review before completion
13
- 3. **Retry options on abort** - Clean recovery paths when things go wrong
14
-
15
- Currently our swarm uses:
16
- - **File reservations** via Swarm Mail for conflict prevention
17
- - **UBS scan** on completion for bug detection
18
- - **Manual cleanup** on abort
19
-
20
- ## Decision
21
-
22
- ### 1. Optional Worktree Isolation Mode
23
-
24
- Add `isolation` parameter to swarm initialization:
25
-
26
- ```typescript
27
- swarm_init({
28
- task: "Large refactor across 50 files",
29
- isolation: "worktree" // or "reservation" (default)
30
- })
31
- ```
32
-
33
- **When to use worktrees:**
34
- - Large refactors touching many files
35
- - High risk of merge conflicts
36
- - Need complete isolation (different node_modules, etc.)
37
-
38
- **When to use reservations (default):**
39
- - Most swarm tasks
40
- - Quick parallel work
41
- - Lower overhead
42
-
43
- **Worktree lifecycle:**
44
- ```
45
- swarm_worktree_create(task_id) → /path/to/worktree
46
-
47
- worker does work in worktree
48
-
49
- swarm_worktree_merge(task_id) → cherry-pick commit to main
50
-
51
- swarm_worktree_cleanup(task_id) → remove worktree
52
- ```
53
-
54
- **On abort:** Hard reset main to start commit, delete all worktrees.
55
-
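The lifecycle above maps onto a small set of git commands. The sketch below only builds the argument lists and does not execute them; the worktree path and branch naming scheme (`.swarm/worktrees/<task_id>`, `swarm/<task_id>`) and the function names are assumptions, not the plugin's actual layout.

```typescript
interface WorktreePlan {
  create: string[];
  merge: (commitSha: string) => string[];
  cleanup: string[];
}

// Builds git argument lists for one task's worktree.
// Assumption: one branch and one directory per task, named after the task ID.
function planWorktree(taskId: string, startCommit: string): WorktreePlan {
  const path = `.swarm/worktrees/${taskId}`;
  const branch = `swarm/${taskId}`;
  return {
    // git worktree add -b <branch> <path> <commit-ish>
    create: ["worktree", "add", "-b", branch, path, startCommit],
    // Run from the main checkout to bring the worker's commit over.
    merge: (commitSha) => ["cherry-pick", commitSha],
    // --force also removes a worktree that still has uncommitted changes.
    cleanup: ["worktree", "remove", "--force", path],
  };
}

// On abort: reset main to the start commit, then remove every worktree.
function planAbort(taskIds: string[], startCommit: string): string[][] {
  return [
    ["reset", "--hard", startCommit],
    ...taskIds.map((id) => planWorktree(id, startCommit).cleanup),
  ];
}

const worktreePlan = planWorktree("bd-123.2", "abc1234");
const abortPlan = planAbort(["bd-123.1", "bd-123.2"], "abc1234");
```

Each array can be handed to Node's `execFile("git", args)`. Keeping command construction separate from execution makes the lifecycle easy to test without touching a real repository.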
56
- ### 2. Structured Review Step
57
-
58
- The coordinator reviews worker output before marking complete. This replaces the current "trust but verify with UBS" approach.
59
-
60
- **Review flow:**
61
- ```
62
- worker completes → coordinator reviews → approved/needs_changes
63
-
64
- if needs_changes: worker fixes (max 3 attempts)
65
-
66
- if approved: mark complete
67
- ```
68
-
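The review flow above can be sketched as a bounded loop. This is a minimal illustration that assumes the limit of 3 counts review rounds; the ADR does not say what happens after the third rejection, so the `failed` status is hypothetical, as are the `ReviewHooks` callbacks standing in for the coordinator's review and the worker's fix.

```typescript
type Verdict = "approved" | "needs_changes";

// Hypothetical callbacks: `review` stands in for the coordinator's review prompt,
// `fix` for the worker applying feedback. Neither is an existing swarm tool.
interface ReviewHooks {
  review: (attempt: number) => Verdict;
  fix: (attempt: number) => void;
}

const MAX_ATTEMPTS = 3; // "max 3 attempts" from the review flow

function runReviewLoop(hooks: ReviewHooks): { status: "complete" | "failed"; attempts: number } {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    if (hooks.review(attempt) === "approved") {
      return { status: "complete", attempts: attempt };
    }
    // Ask for a fix only if another review round is still allowed.
    if (attempt < MAX_ATTEMPTS) hooks.fix(attempt);
  }
  return { status: "failed", attempts: MAX_ATTEMPTS };
}

// Example: the first review requests changes, the second approves.
let fixesApplied = 0;
const reviewResult = runReviewLoop({
  review: (attempt) => (attempt < 2 ? "needs_changes" : "approved"),
  fix: () => {
    fixesApplied += 1;
  },
});
```

Capping the loop keeps a worker that cannot satisfy the reviewer from consuming tokens indefinitely; what the coordinator does with a `failed` result (reassign, escalate, abort) is left open here.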
69
- **Review prompt includes:**
70
- - Epic goal (the big picture)
71
- - Task requirements
72
- - What completed tasks this builds on (dependency context)
73
- - What future tasks depend on this (downstream context)
74
- - The actual code changes
75
-
76
- **Why coordinator reviews (not separate reviewer agent):**
77
- - Coordinator already has full epic context loaded
78
- - Avoids spawning another agent just for review
79
- - Keeps the feedback loop tight
80
- - Coordinator can make judgment calls about "good enough"
81
-
82
- **Review criteria:**
83
- 1. Does it fulfill the task requirements?
84
- 2. Does it serve the epic goal?
85
- 3. Will downstream tasks be able to use it?
86
- 4. Are there critical bugs? (UBS scan still runs)
87
-
88
- ### 3. Retry Options on Abort
89
-
90
- When a swarm aborts (user request or failure), provide clear recovery paths:
91
-
92
- ```json
93
- {
94
- "retry_options": {
95
- "same_plan": "/swarm --retry",
96
- "edit_plan": "/swarm --retry --edit",
97
- "fresh_start": "/swarm \"original task\""
98
- }
99
- }
100
- ```
101
-
102
- **`--retry`**: Resume with same plan, skip completed tasks
103
- **`--retry --edit`**: Show plan for modification before resuming
104
- **Fresh start**: Decompose from scratch
105
-
106
- This requires persisting swarm session state (already have this via Hive cells).
107
-
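The `--retry` behavior ("resume with same plan, skip completed tasks") amounts to filtering the persisted plan. A minimal sketch follows, assuming each task is stored with a status field; the `PlannedTask` shape and status names are hypothetical, since the ADR only says the state lives in Hive cells.

```typescript
type TaskStatus = "pending" | "in_progress" | "completed" | "failed";

// Hypothetical persisted task shape.
interface PlannedTask {
  id: string;
  status: TaskStatus;
}

// --retry: keep the plan, skip anything already completed, and reset
// interrupted or failed work back to pending so it runs again.
function tasksForRetry(plan: PlannedTask[]): PlannedTask[] {
  return plan
    .filter((task) => task.status !== "completed")
    .map((task): PlannedTask => ({ ...task, status: "pending" }));
}

const abortedPlan: PlannedTask[] = [
  { id: "bd-123.1", status: "completed" },
  { id: "bd-123.2", status: "in_progress" },
  { id: "bd-123.3", status: "failed" },
];

const retryQueue = tasksForRetry(abortedPlan);
```

Treating `in_progress` as not done is the conservative choice: a task interrupted by the abort may have left partial changes, so it is rerun rather than trusted.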
108
- ## Implementation
109
-
110
- ### Phase 1: Structured Review (Priority)
111
- 1. Add review step to `swarm_complete`
112
- 2. Create review prompt with epic context injection
113
- 3. Handle needs_changes → worker retry loop (max 3)
114
- 4. Keep UBS scan as additional safety net
115
-
116
- ### Phase 2: Worktree Isolation
117
- 1. Add `isolation` mode to `swarm_init`
118
- 2. Implement worktree lifecycle tools
119
- 3. Update worker prompts to work in worktree path
120
- 4. Add cherry-pick merge on completion
121
- 5. Add cleanup on abort
122
-
123
- ### Phase 3: Retry Options
124
- 1. Persist session state for recovery
125
- 2. Add `--retry` and `--retry --edit` flags
126
- 3. Skip completed tasks on retry
127
- 4. Show plan editor for `--edit` mode
128
-
129
- ## Consequences
130
-
131
- ### Positive
132
- - **Better quality**: Structured review catches issues before integration
133
- - **Safer large refactors**: Worktree isolation keeps workers from overwriting each other's in-progress changes; conflicts can surface only at cherry-pick time
134
- - **Cleaner recovery**: Retry options reduce friction after failures
135
- - **Coordinator stays in control**: Review keeps human-in-the-loop feel
136
-
137
- ### Negative
138
- - **More complexity**: Two isolation modes to maintain
139
- - **Slower completion**: Review step adds latency
140
- - **Disk usage**: Worktrees consume space (mitigated by cleanup)
141
-
142
- ### Neutral
143
- - **Credit**: Patterns inspired by nexxeln/opencode-config - should acknowledge in docs
144
-
145
- ## Alternatives Considered
146
-
147
- ### Separate Reviewer Agent
148
- nexxeln uses a dedicated reviewer subagent. We chose coordinator-as-reviewer because:
149
- - Avoids context duplication (coordinator already has epic context)
150
- - Faster feedback loop
151
- - Coordinator can make "ship it" judgment calls
152
-
153
- ### Staged Changes on Finalize
154
- nexxeln soft-resets to leave changes staged for user review. We're skipping this because:
155
- - Our flow already has explicit commit step
156
- - Hive tracks what changed
157
- - User can always `git diff` before committing
158
-
159
- ### Always Use Worktrees
160
- Could simplify by always using worktrees. Rejected because:
161
- - Overkill for most tasks
162
- - Slower setup/teardown
163
- - File reservations work fine for typical parallel work
164
-
165
- ## References
166
-
167
- - [nexxeln/opencode-config](https://github.com/nexxeln/opencode-config) - Source of inspiration
168
- - Epic: `bd-lf2p4u-mjaja96b9da` - Swarm Enhancements
@@ -1,293 +0,0 @@
1
- # ADR-008: Worker Handoff Protocol - Structured Contracts Over Prose
2
-
3
- ## Status
4
-
5
- Proposed
6
-
7
- ## Context
8
-
9
- The current `SUBTASK_PROMPT_V2` is a **280-line prose instruction manual** that gets injected into every swarm worker's context. This approach has fundamental problems:
10
-
11
- ### Current Problems
12
-
13
- 1. **Workers ignore prose** - Long text instructions get skimmed or missed entirely
14
- 2. **No validation** - Can't programmatically verify workers followed protocol
15
- 3. **Context bloat** - 280 lines * N workers burns tokens fast
16
- 4. **Drift and violations** - Workers modify files outside their scope, no automatic detection
17
- 5. **Manual error recovery** - Coordinator can't auto-detect contract violations
18
-
19
- **Concrete example of failure:**
20
- ```
21
- Worker assigned: ["src/auth/service.ts"]
22
- Worker actually touched: ["src/auth/service.ts", "src/lib/jwt.ts", "src/types/user.ts"]
23
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24
- Scope creep undetected until swarm_complete
25
- ```
26
-
27
- Current `swarm_complete` validates `files_touched ⊆ files_owned`, but the **contract** was never machine-readable to begin with.
28
-
29
- ### Research & Inspirations
30
-
31
- From "Patterns for Building AI Agents" and production event-driven systems:
32
-
33
- **mdflow adapter pattern:**
34
- - Convention-based behavior inference
35
- - Template variables define expectations
36
- - Minimal configuration, maximum clarity
37
-
38
- **Bellemare's event-driven orchestration:**
39
- - Explicit contracts between services
40
- - Commands vs Events distinction
41
- - Contract violations fail fast with clear errors
42
-
43
- **Key insight:** Agents need **two channels**:
44
- 1. **Contract** (machine-readable, validated) - WHAT to do, WHERE to do it
45
- 2. **Context** (human-readable, advisory) - WHY it matters, HOW it fits together
46
-
47
- ## Decision
48
-
49
- Replace 280-line prose with **WorkerHandoff envelope** that separates contract from context.
50
-
51
- ### WorkerHandoff Structure
52
-
53
- ```typescript
54
- interface WorkerHandoff {
55
- // Machine-readable - enforced by tools
56
- contract: {
57
- task_id: string; // Cell ID for tracking
58
- files_owned: string[]; // Exclusive write access (validated)
59
- files_readonly: string[]; // Can read, MUST NOT modify (validated)
60
- dependencies_completed: string[]; // Tasks that finished before this
61
- success_criteria: string[]; // Exit conditions (checkable)
62
- };
63
-
64
- // Human-readable - advisory context
65
- context: {
66
- epic_summary: string; // Big picture goal
67
- your_role: string; // What this subtask accomplishes
68
- what_others_did: string; // Dependency outputs
69
- what_comes_next: string; // Downstream task expectations
70
- };
71
-
72
- // Escalation paths - when things go wrong
73
- escalation: {
74
- blocked_contact: string; // "coordinator" or agent name
75
- scope_change_protocol: string; // "swarmmail_send + await approval"
76
- };
77
- }
78
- ```
79
-
80
- ### Example Handoff
81
-
82
- ```json
83
- {
84
- "contract": {
85
- "task_id": "bd-123.2",
86
- "files_owned": ["src/auth/service.ts", "src/auth/service.test.ts"],
87
- "files_readonly": ["src/types/user.ts", "src/lib/jwt.ts"],
88
- "dependencies_completed": ["bd-123.1"],
89
- "success_criteria": [
90
- "AuthService.login() returns JWT token",
91
- "Tests pass: bun test src/auth/service.test.ts",
92
- "Type check passes: tsc --noEmit"
93
- ]
94
- },
95
- "context": {
96
- "epic_summary": "Add OAuth authentication to user service",
97
- "your_role": "Implement AuthService with JWT token generation",
98
- "what_others_did": "bd-123.1 created User schema with email/password fields",
99
- "what_comes_next": "bd-123.3 will integrate this service into API routes"
100
- },
101
- "escalation": {
102
- "blocked_contact": "coordinator",
103
- "scope_change_protocol": "swarmmail_send(subject='Scope Change', ack_required=true)"
104
- }
105
- }
106
- ```
107
-
108
- ### Validation in swarm_complete
109
-
110
- ```typescript
111
- // swarm_complete now validates against contract
112
- function validateCompletion(handoff: WorkerHandoff, result: CompletionReport) {
113
- const violations: string[] = [];
114
-
115
- // 1. File scope violations
116
- const unauthorized = result.files_touched.filter(
117
- f => !handoff.contract.files_owned.includes(f)
118
- );
119
- if (unauthorized.length > 0) {
120
- violations.push(`Touched unauthorized files: ${unauthorized.join(", ")}`);
121
- }
122
-
123
- // 2. Success criteria (checkable ones)
124
- for (const criterion of handoff.contract.success_criteria) {
125
- if (criterion.startsWith("Tests pass:")) {
126
- // Run the test command, validate exit 0
127
- }
128
- if (criterion.startsWith("Type check passes:")) {
129
- // Run tsc --noEmit, validate exit 0
130
- }
131
- }
132
-
133
- // 3. Learning signals from violations
134
- if (violations.length > 0) {
135
- recordLearningSignal({
136
- task_id: handoff.contract.task_id,
137
- violation_type: "scope_creep",
138
- details: violations,
139
- impact: "negative" // Penalize decomposition strategy
140
- });
141
- }
142
-
143
- return { valid: violations.length === 0, violations };
144
- }
145
- ```
146
-
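The validation example leaves the criterion runners as comments. One way to separate checkable criteria from advisory ones is sketched below; it relies on the `Tests pass:` and `Type check passes:` prefixes from the example handoff, and the `Check` type and `parseCriterion` helper are hypothetical. Running the extracted command and checking for exit code 0 is left out.

```typescript
// A criterion is either a command to run or text that needs human judgment.
type Check =
  | { kind: "command"; label: string; command: string }
  | { kind: "manual"; text: string };

// Prefixes taken from the example handoff; anything else is treated as
// not automatically checkable (an assumption of this sketch).
const CHECKABLE_PREFIXES = ["Tests pass:", "Type check passes:"];

function parseCriterion(criterion: string): Check {
  for (const prefix of CHECKABLE_PREFIXES) {
    if (criterion.startsWith(prefix)) {
      const command = criterion.slice(prefix.length).trim();
      if (command.length > 0) {
        // Drop the trailing colon from the prefix to form the label.
        return { kind: "command", label: prefix.slice(0, -1), command };
      }
    }
  }
  return { kind: "manual", text: criterion };
}

const checks = [
  "AuthService.login() returns JWT token",
  "Tests pass: bun test src/auth/service.test.ts",
  "Type check passes: tsc --noEmit",
].map(parseCriterion);
```

Encoding the command in the criterion string keeps the contract plain JSON, at the cost of a string convention that coordinator and validator must both honor.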
147
- ### Integration with Existing Tools
148
-
149
- **swarm_spawn_subtask generates handoffs:**
150
-
151
- ```typescript
152
- export const swarm_spawn_subtask = tool(/* ... */)
153
- .handler(async ({ input, context }) => {
154
- const handoff: WorkerHandoff = {
155
- contract: {
156
- task_id: input.bead_id,
157
- files_owned: input.files,
158
- files_readonly: inferReadonlyFiles(input.files, epicContext),
159
- dependencies_completed: input.dependencies_completed || [],
160
- success_criteria: generateSuccessCriteria(input.subtask_description)
161
- },
162
- context: {
163
- epic_summary: epicContext.summary,
164
- your_role: input.subtask_title,
165
- what_others_did: summarizeDependencies(input.dependencies_completed),
166
- what_comes_next: summarizeDownstream(input.bead_id)
167
- },
168
- escalation: {
169
- blocked_contact: "coordinator",
170
- scope_change_protocol: "swarmmail_send(subject='Scope Change', ack_required=true)"
171
- }
172
- };
173
-
174
- return formatHandoff(handoff); // Compact JSON + minimal prose wrapper
175
- });
176
- ```
177
-
178
- **swarm_complete validates contract:**
179
-
180
- ```typescript
181
- export const swarm_complete = tool(/* ... */)
182
- .handler(async ({ input, context }) => {
183
- const handoff = getStoredHandoff(input.bead_id);
184
- const validation = validateCompletion(handoff, {
185
- files_touched: input.files_touched,
186
- summary: input.summary
187
- });
188
-
189
- if (!validation.valid) {
190
- throw new Error(
191
- `Contract violations detected:\n${validation.violations.join("\n")}`
192
- );
193
- }
194
-
195
- // Proceed with UBS scan, reservation release, etc.
196
- });
197
- ```
198
-
199
- ## Consequences
200
-
201
- ### Positive
202
-
203
- - **Validation enforced** - Can't complete with contract violations
204
- - **Clear boundaries** - Workers know exactly what's in/out of scope
205
- - **Better learning** - Scope creep violations feed back into strategy selection
206
- - **Context efficiency** - Contract is ~30 lines JSON vs 280 lines prose
207
- - **Fail fast** - Violations detected immediately, not during merge
208
- - **Programmatic recovery** - Coordinator can auto-detect and reassign work
209
-
210
- ### Negative
211
-
212
- - **Requires storage** - Handoffs must persist (already have event store)
213
- - **Success criteria limited** - Can't validate all criteria automatically
214
- - **Migration cost** - Existing `SUBTASK_PROMPT_V2` users need update
215
- - **More upfront work** - Coordinator must generate better contracts
216
-
217
- ### Neutral
218
-
219
- - **Prose still exists** - `context` field provides human explanation, just smaller
220
- - **Not eliminating checklist** - 9-step survival checklist stays, but moves to tool enforcement
221
-
222
- ## Implementation Notes
223
-
224
- ### Phase 1: Storage & Schema
225
-
226
- 1. Add `WorkerHandoff` schema to swarm-mail event types
227
- 2. Store handoffs in event log when spawning subtasks
228
- 3. Retrieve handoffs in `swarm_complete` for validation
229
-
230
- ### Phase 2: Generation Logic
231
-
232
- 1. Implement `inferReadonlyFiles()` - analyze imports/dependencies
233
- 2. Implement `generateSuccessCriteria()` - parse task description for checkable conditions
234
- 3. Implement `summarizeDependencies()` and `summarizeDownstream()` - build context from epic graph
235
-
236
- ### Phase 3: Validation
237
-
238
- 1. Add contract validation to `swarm_complete`
239
- 2. Implement checkable criteria runners (test commands, type checks)
240
- 3. Record learning signals for violations
241
-
242
- ### Phase 4: Migration
243
-
244
- 1. Update `formatSubtaskPromptV2` to generate handoff JSON
245
- 2. Deprecate 280-line prose template
246
- 3. Update tests for new handoff format
247
-
248
- ### Phase 5: Enhanced Features (Future)
249
-
250
- 1. **Readonly enforcement** - Detect modifications to `files_readonly` via git diff
251
- 2. **Dependency validation** - Verify `dependencies_completed` actually ran first
252
- 3. **Auto-generated success criteria** - Parse test files, infer criteria from code
253
-
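Readonly enforcement from Phase 5 can be sketched as a classification of changed files against the contract. This is a minimal illustration, assuming the list of changed files has already been collected (for example from `git diff --name-only`); `classifyChanges` and `ScopeReport` are hypothetical names.

```typescript
// Subset of the contract needed for scope checks.
interface Scope {
  files_owned: string[];
  files_readonly: string[];
}

interface ScopeReport {
  ok: string[]; // Changes to owned files
  readonly_violations: string[]; // Changes to files marked read-only
  out_of_scope: string[]; // Changes to files not mentioned in the contract
}

// How `changedFiles` is collected is outside this sketch.
function classifyChanges(scope: Scope, changedFiles: string[]): ScopeReport {
  const owned = new Set(scope.files_owned);
  const readOnly = new Set(scope.files_readonly);
  const report: ScopeReport = { ok: [], readonly_violations: [], out_of_scope: [] };
  for (const file of changedFiles) {
    if (owned.has(file)) report.ok.push(file);
    else if (readOnly.has(file)) report.readonly_violations.push(file);
    else report.out_of_scope.push(file);
  }
  return report;
}

const scopeReport = classifyChanges(
  {
    files_owned: ["src/auth/service.ts", "src/auth/service.test.ts"],
    files_readonly: ["src/types/user.ts", "src/lib/jwt.ts"],
  },
  ["src/auth/service.ts", "src/lib/jwt.ts", "src/routes/api.ts"],
);
```

Distinguishing read-only violations from other out-of-scope edits lets the coordinator respond differently: the former breaks an explicit rule, while the latter may point to a decomposition that missed a file.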
254
- ## Alternatives Considered
255
-
256
- ### Keep Prose, Add Validation
257
-
258
- Keep `SUBTASK_PROMPT_V2` but add validation after-the-fact. **Rejected** because:
259
- - Still burns 280 lines of context per worker
260
- - Workers still ignore prose
261
- - Validation happens too late (after work done)
262
-
263
- ### Minimal Contract Only
264
-
265
- Remove context entirely, pure machine contract. **Rejected** because:
266
- - Workers need WHY to make good judgment calls
267
- - Context helps with edge cases not in contract
268
- - Loss of human readability hurts debugging
269
-
270
- ### Command Pattern (Bellemare Style)
271
-
272
- Full event-sourcing with Command objects. **Rejected** because:
273
- - Over-engineered for current needs
274
- - Already have event store for coordination
275
- - Contract + context is simpler and sufficient
276
-
277
- ## References
278
-
279
- - **"Patterns for Building AI Agents"** - Subagent context sharing patterns
280
- - **mdflow** - Convention-based adapter design, template variable contracts
281
- - **Bellemare's "Building Event-Driven Microservices"** - Explicit contracts, fail-fast validation
282
- - **Current implementation:** `src/swarm-prompts.ts` (SUBTASK_PROMPT_V2, lines 253-530)
283
- - **Related:** ADR-007 (Structured Review), ADR-002 (Package Extraction)
284
-
285
- ## Success Criteria
286
-
287
- - [ ] `WorkerHandoff` schema defined and validated with Zod
288
- - [ ] `swarm_spawn_subtask` generates handoffs instead of raw prose
289
- - [ ] `swarm_complete` validates contract before accepting completion
290
- - [ ] Scope violations trigger learning signals (negative feedback)
291
- - [ ] Workers receive handoff as JSON + compact context wrapper (<50 lines)
292
- - [ ] Test suite validates contract enforcement catches violations
293
- - [ ] Migration path documented for existing swarm users