opencode-swarm-plugin 0.43.0 → 0.44.1
- package/bin/cass.characterization.test.ts +422 -0
- package/bin/swarm.serve.test.ts +6 -4
- package/bin/swarm.test.ts +68 -0
- package/bin/swarm.ts +81 -8
- package/dist/compaction-prompt-scoring.js +139 -0
- package/dist/contributor-tools.d.ts +42 -0
- package/dist/contributor-tools.d.ts.map +1 -0
- package/dist/eval-capture.js +12811 -0
- package/dist/hive.d.ts.map +1 -1
- package/dist/index.d.ts +12 -0
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +7728 -62590
- package/dist/plugin.js +23833 -78695
- package/dist/sessions/agent-discovery.d.ts +59 -0
- package/dist/sessions/agent-discovery.d.ts.map +1 -0
- package/dist/sessions/index.d.ts +10 -0
- package/dist/sessions/index.d.ts.map +1 -0
- package/dist/swarm-orchestrate.d.ts.map +1 -1
- package/dist/swarm-prompts.d.ts.map +1 -1
- package/dist/swarm-review.d.ts.map +1 -1
- package/package.json +17 -5
- package/.changeset/swarm-insights-data-layer.md +0 -63
- package/.hive/analysis/eval-failure-analysis-2025-12-25.md +0 -331
- package/.hive/analysis/session-data-quality-audit.md +0 -320
- package/.hive/eval-results.json +0 -483
- package/.hive/issues.jsonl +0 -138
- package/.hive/memories.jsonl +0 -729
- package/.opencode/eval-history.jsonl +0 -327
- package/.turbo/turbo-build.log +0 -9
- package/CHANGELOG.md +0 -2255
- package/SCORER-ANALYSIS.md +0 -598
- package/docs/analysis/subagent-coordination-patterns.md +0 -902
- package/docs/analysis-socratic-planner-pattern.md +0 -504
- package/docs/planning/ADR-001-monorepo-structure.md +0 -171
- package/docs/planning/ADR-002-package-extraction.md +0 -393
- package/docs/planning/ADR-003-performance-improvements.md +0 -451
- package/docs/planning/ADR-004-message-queue-features.md +0 -187
- package/docs/planning/ADR-005-devtools-observability.md +0 -202
- package/docs/planning/ADR-007-swarm-enhancements-worktree-review.md +0 -168
- package/docs/planning/ADR-008-worker-handoff-protocol.md +0 -293
- package/docs/planning/ADR-009-oh-my-opencode-patterns.md +0 -353
- package/docs/planning/ROADMAP.md +0 -368
- package/docs/semantic-memory-cli-syntax.md +0 -123
- package/docs/swarm-mail-architecture.md +0 -1147
- package/docs/testing/context-recovery-test.md +0 -470
- package/evals/ARCHITECTURE.md +0 -1189
- package/evals/README.md +0 -768
- package/evals/compaction-prompt.eval.ts +0 -149
- package/evals/compaction-resumption.eval.ts +0 -289
- package/evals/coordinator-behavior.eval.ts +0 -307
- package/evals/coordinator-session.eval.ts +0 -154
- package/evals/evalite.config.ts.bak +0 -15
- package/evals/example.eval.ts +0 -31
- package/evals/fixtures/compaction-cases.ts +0 -350
- package/evals/fixtures/compaction-prompt-cases.ts +0 -311
- package/evals/fixtures/coordinator-sessions.ts +0 -328
- package/evals/fixtures/decomposition-cases.ts +0 -105
- package/evals/lib/compaction-loader.test.ts +0 -248
- package/evals/lib/compaction-loader.ts +0 -320
- package/evals/lib/data-loader.evalite-test.ts +0 -289
- package/evals/lib/data-loader.test.ts +0 -345
- package/evals/lib/data-loader.ts +0 -281
- package/evals/lib/llm.ts +0 -115
- package/evals/scorers/compaction-prompt-scorers.ts +0 -145
- package/evals/scorers/compaction-scorers.ts +0 -305
- package/evals/scorers/coordinator-discipline.evalite-test.ts +0 -539
- package/evals/scorers/coordinator-discipline.ts +0 -325
- package/evals/scorers/index.test.ts +0 -146
- package/evals/scorers/index.ts +0 -328
- package/evals/scorers/outcome-scorers.evalite-test.ts +0 -27
- package/evals/scorers/outcome-scorers.ts +0 -349
- package/evals/swarm-decomposition.eval.ts +0 -121
- package/examples/commands/swarm.md +0 -745
- package/examples/plugin-wrapper-template.ts +0 -2426
- package/examples/skills/hive-workflow/SKILL.md +0 -212
- package/examples/skills/skill-creator/SKILL.md +0 -223
- package/examples/skills/swarm-coordination/SKILL.md +0 -292
- package/global-skills/cli-builder/SKILL.md +0 -344
- package/global-skills/cli-builder/references/advanced-patterns.md +0 -244
- package/global-skills/learning-systems/SKILL.md +0 -644
- package/global-skills/skill-creator/LICENSE.txt +0 -202
- package/global-skills/skill-creator/SKILL.md +0 -352
- package/global-skills/skill-creator/references/output-patterns.md +0 -82
- package/global-skills/skill-creator/references/workflows.md +0 -28
- package/global-skills/swarm-coordination/SKILL.md +0 -995
- package/global-skills/swarm-coordination/references/coordinator-patterns.md +0 -235
- package/global-skills/swarm-coordination/references/strategies.md +0 -138
- package/global-skills/system-design/SKILL.md +0 -213
- package/global-skills/testing-patterns/SKILL.md +0 -430
- package/global-skills/testing-patterns/references/dependency-breaking-catalog.md +0 -586
- package/opencode-swarm-plugin-0.30.7.tgz +0 -0
- package/opencode-swarm-plugin-0.31.0.tgz +0 -0
- package/scripts/cleanup-test-memories.ts +0 -346
- package/scripts/init-skill.ts +0 -222
- package/scripts/migrate-unknown-sessions.ts +0 -349
- package/scripts/validate-skill.ts +0 -204
- package/src/agent-mail.ts +0 -1724
- package/src/anti-patterns.test.ts +0 -1167
- package/src/anti-patterns.ts +0 -448
- package/src/compaction-capture.integration.test.ts +0 -257
- package/src/compaction-hook.test.ts +0 -838
- package/src/compaction-hook.ts +0 -1204
- package/src/compaction-observability.integration.test.ts +0 -139
- package/src/compaction-observability.test.ts +0 -187
- package/src/compaction-observability.ts +0 -324
- package/src/compaction-prompt-scorers.test.ts +0 -475
- package/src/compaction-prompt-scoring.ts +0 -300
- package/src/dashboard.test.ts +0 -611
- package/src/dashboard.ts +0 -462
- package/src/error-enrichment.test.ts +0 -403
- package/src/error-enrichment.ts +0 -219
- package/src/eval-capture.test.ts +0 -1015
- package/src/eval-capture.ts +0 -929
- package/src/eval-gates.test.ts +0 -306
- package/src/eval-gates.ts +0 -218
- package/src/eval-history.test.ts +0 -508
- package/src/eval-history.ts +0 -214
- package/src/eval-learning.test.ts +0 -378
- package/src/eval-learning.ts +0 -360
- package/src/eval-runner.test.ts +0 -223
- package/src/eval-runner.ts +0 -402
- package/src/export-tools.test.ts +0 -476
- package/src/export-tools.ts +0 -257
- package/src/hive.integration.test.ts +0 -2241
- package/src/hive.ts +0 -1628
- package/src/index.ts +0 -935
- package/src/learning.integration.test.ts +0 -1815
- package/src/learning.ts +0 -1079
- package/src/logger.test.ts +0 -189
- package/src/logger.ts +0 -135
- package/src/mandate-promotion.test.ts +0 -473
- package/src/mandate-promotion.ts +0 -239
- package/src/mandate-storage.integration.test.ts +0 -601
- package/src/mandate-storage.test.ts +0 -578
- package/src/mandate-storage.ts +0 -794
- package/src/mandates.ts +0 -540
- package/src/memory-tools.test.ts +0 -195
- package/src/memory-tools.ts +0 -344
- package/src/memory.integration.test.ts +0 -334
- package/src/memory.test.ts +0 -158
- package/src/memory.ts +0 -527
- package/src/model-selection.test.ts +0 -188
- package/src/model-selection.ts +0 -68
- package/src/observability-tools.test.ts +0 -359
- package/src/observability-tools.ts +0 -871
- package/src/output-guardrails.test.ts +0 -438
- package/src/output-guardrails.ts +0 -381
- package/src/pattern-maturity.test.ts +0 -1160
- package/src/pattern-maturity.ts +0 -525
- package/src/planning-guardrails.test.ts +0 -491
- package/src/planning-guardrails.ts +0 -438
- package/src/plugin.ts +0 -23
- package/src/post-compaction-tracker.test.ts +0 -251
- package/src/post-compaction-tracker.ts +0 -237
- package/src/query-tools.test.ts +0 -636
- package/src/query-tools.ts +0 -324
- package/src/rate-limiter.integration.test.ts +0 -466
- package/src/rate-limiter.ts +0 -774
- package/src/replay-tools.test.ts +0 -496
- package/src/replay-tools.ts +0 -240
- package/src/repo-crawl.integration.test.ts +0 -441
- package/src/repo-crawl.ts +0 -610
- package/src/schemas/cell-events.test.ts +0 -347
- package/src/schemas/cell-events.ts +0 -807
- package/src/schemas/cell.ts +0 -257
- package/src/schemas/evaluation.ts +0 -166
- package/src/schemas/index.test.ts +0 -199
- package/src/schemas/index.ts +0 -286
- package/src/schemas/mandate.ts +0 -232
- package/src/schemas/swarm-context.ts +0 -115
- package/src/schemas/task.ts +0 -161
- package/src/schemas/worker-handoff.test.ts +0 -302
- package/src/schemas/worker-handoff.ts +0 -131
- package/src/skills.integration.test.ts +0 -1192
- package/src/skills.test.ts +0 -643
- package/src/skills.ts +0 -1549
- package/src/storage.integration.test.ts +0 -341
- package/src/storage.ts +0 -884
- package/src/structured.integration.test.ts +0 -817
- package/src/structured.test.ts +0 -1046
- package/src/structured.ts +0 -762
- package/src/swarm-decompose.test.ts +0 -188
- package/src/swarm-decompose.ts +0 -1302
- package/src/swarm-deferred.integration.test.ts +0 -157
- package/src/swarm-deferred.test.ts +0 -38
- package/src/swarm-insights.test.ts +0 -214
- package/src/swarm-insights.ts +0 -459
- package/src/swarm-mail.integration.test.ts +0 -970
- package/src/swarm-mail.ts +0 -739
- package/src/swarm-orchestrate.integration.test.ts +0 -282
- package/src/swarm-orchestrate.test.ts +0 -548
- package/src/swarm-orchestrate.ts +0 -3084
- package/src/swarm-prompts.test.ts +0 -1270
- package/src/swarm-prompts.ts +0 -2077
- package/src/swarm-research.integration.test.ts +0 -701
- package/src/swarm-research.test.ts +0 -698
- package/src/swarm-research.ts +0 -472
- package/src/swarm-review.integration.test.ts +0 -285
- package/src/swarm-review.test.ts +0 -879
- package/src/swarm-review.ts +0 -709
- package/src/swarm-strategies.ts +0 -407
- package/src/swarm-worktree.test.ts +0 -501
- package/src/swarm-worktree.ts +0 -575
- package/src/swarm.integration.test.ts +0 -2377
- package/src/swarm.ts +0 -38
- package/src/tool-adapter.integration.test.ts +0 -1221
- package/src/tool-availability.ts +0 -461
- package/tsconfig.json +0 -28
|
@@ -1,202 +0,0 @@
|
|
|
1
|
-
# ADR-005: DevTools + Observability
|
|
2
|
-
|
|
3
|
-
## Status
|
|
4
|
-
|
|
5
|
-
Proposed
|
|
6
|
-
|
|
7
|
-
## Context
|
|
8
|
-
|
|
9
|
-
Swarm Mail currently has no visibility:
|
|
10
|
-
|
|
11
|
-
- No UI to inspect events, messages, locks
|
|
12
|
-
- No metrics on latency, queue depth, throughput
|
|
13
|
-
- No distributed tracing across agents
|
|
14
|
-
- Hard to debug coordination issues
|
|
15
|
-
|
|
16
|
-
We need both developer tools (UI + CLI) and production observability (metrics + tracing).
|
|
17
|
-
|
|
18
|
-
## Decision
|
|
19
|
-
|
|
20
|
-
Build layered observability:
|
|
21
|
-
|
|
22
|
-
### 1. DevTools UI (SvelteKit)
|
|
23
|
-
|
|
24
|
-
**Stack:**
|
|
25
|
-
|
|
26
|
-
- SvelteKit for SSR + static export
|
|
27
|
-
- Vite for dev server + build
|
|
28
|
-
- Server-Sent Events (SSE) for real-time updates
|
|
29
|
-
- Embeddable static build
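As a rough illustration of the SSE piece of this stack, the sketch below formats one swarm event as a Server-Sent Events frame. The `SwarmEvent` shape and the `formatSseFrame` name are assumptions made for this example; the plugin's real event schema may look different.

```typescript
// Hypothetical event shape; not taken from the plugin.
interface SwarmEvent {
  sequence: number;
  type: string;
  payload: unknown;
}

// Format one event as an SSE frame: "id", "event", and "data" fields,
// terminated by a blank line as the SSE wire format requires.
function formatSseFrame(event: SwarmEvent): string {
  const data = JSON.stringify(event.payload);
  return `id: ${event.sequence}\nevent: ${event.type}\ndata: ${data}\n\n`;
}

const frame = formatSseFrame({
  sequence: 7,
  type: "message_sent",
  payload: { to: "AgentB" },
});
```

Using the event sequence as the SSE `id` would let a reconnecting browser resume from the last event it saw, since browsers send that value back in the `Last-Event-ID` header.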
|
|
30
|
-
|
|
31
|
-
**Features:**
|
|
32
|
-
|
|
33
|
-
- Event stream viewer (filterable, searchable)
|
|
34
|
-
- Message inbox/outbox per agent
|
|
35
|
-
- File reservation timeline
|
|
36
|
-
- Saga instance tracker (future)
|
|
37
|
-
|
|
38
|
-
**Build:**
|
|
39
|
-
|
|
40
|
-
```bash
|
|
41
|
-
cd apps/devtools
|
|
42
|
-
bun run build # Static export to apps/devtools/build
|
|
43
|
-
```
|
|
44
|
-
|
|
45
|
-
**Embed in plugin:**
|
|
46
|
-
|
|
47
|
-
```typescript
|
|
48
|
-
// Serve static UI at /_swarm/devtools
|
|
49
|
-
const server = serve({
|
|
50
|
-
port: 4000,
|
|
51
|
-
fetch: (req) => {
|
|
52
|
-
if (new URL(req.url).pathname.startsWith("/_swarm/devtools")) {
|
|
53
|
-
return serveStatic("apps/devtools/build");
|
|
54
|
-
}
|
|
55
|
-
},
|
|
56
|
-
});
|
|
57
|
-
```
|
|
58
|
-
|
|
59
|
-
### 2. CLI (@effect/cli)
|
|
60
|
-
|
|
61
|
-
**Commands:**
|
|
62
|
-
|
|
63
|
-
```bash
|
|
64
|
-
swarm events [--project <key>] [--type <type>] [--tail]
|
|
65
|
-
swarm messages [--agent <name>] [--unread]
|
|
66
|
-
swarm locks [--agent <name>]
|
|
67
|
-
swarm replay --from <sequence> [--to <sequence>]
|
|
68
|
-
swarm metrics
|
|
69
|
-
```
|
|
70
|
-
|
|
71
|
-
**Implementation:**
|
|
72
|
-
|
|
73
|
-
```typescript
|
|
74
|
-
import { Command, Options } from "@effect/cli";
|
|
75
|
-
|
|
76
|
-
const eventsCommand = Command.make(
|
|
77
|
-
"events",
|
|
78
|
-
{
|
|
79
|
-
project: Options.text("project").pipe(Options.optional),
|
|
80
|
-
type: Options.text("type").pipe(Options.optional),
|
|
81
|
-
tail: Options.boolean("tail"),
|
|
82
|
-
},
|
|
83
|
-
({ project, type, tail }) => {
|
|
84
|
-
// Query events table, optionally --tail with live query
|
|
85
|
-
},
|
|
86
|
-
);
|
|
87
|
-
```
|
|
88
|
-
|
|
89
|
-
### 3. Metrics (Prometheus)
|
|
90
|
-
|
|
91
|
-
**Histograms:**
|
|
92
|
-
|
|
93
|
-
- `swarm_message_latency_seconds` - Send to receive time
|
|
94
|
-
- `swarm_lock_contention_seconds` - Time waiting for lock
|
|
95
|
-
- `swarm_queue_depth` - Unread messages per agent (tracked as a gauge; see Phase 3)
|
|
96
|
-
|
|
97
|
-
**Counters:**
|
|
98
|
-
|
|
99
|
-
- `swarm_events_total{type}` - Events by type
|
|
100
|
-
- `swarm_messages_sent_total{sender, recipient}`
|
|
101
|
-
- `swarm_locks_acquired_total{agent}`
|
|
102
|
-
|
|
103
|
-
**Example:**
|
|
104
|
-
|
|
105
|
-
```typescript
|
|
106
|
-
import { Registry, Histogram } from 'prom-client'
|
|
107
|
-
|
|
108
|
-
const messageLatency = new Histogram({
|
|
109
|
-
name: 'swarm_message_latency_seconds',
|
|
110
|
-
help: 'Message delivery latency',
|
|
111
|
-
buckets: [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
|
|
112
|
-
})
|
|
113
|
-
|
|
114
|
-
// Record latency
|
|
115
|
-
const start = Date.now()
|
|
116
|
-
await sendMessage(msg)
|
|
117
|
-
const latency = (Date.now() - start) / 1000
|
|
118
|
-
messageLatency.observe(latency)
|
|
119
|
-
```
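To make the bucket list above concrete: Prometheus histogram buckets are cumulative, so each bucket counts every observation at or below its upper bound. The following sketch reproduces that counting without `prom-client`; the `cumulativeBucketCounts` helper is a hypothetical name used only for illustration.

```typescript
// Bucket upper bounds in seconds, matching the example above.
const buckets = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0];

// Count observations per bucket the way Prometheus does: cumulatively,
// so each bucket includes every observation <= its upper bound.
function cumulativeBucketCounts(bounds: number[], observations: number[]): number[] {
  return bounds.map((le) => observations.filter((v) => v <= le).length);
}

// Five latencies in seconds; the 7 s one exceeds every explicit bound.
const counts = cumulativeBucketCounts(buckets, [0.004, 0.03, 0.2, 0.2, 7]);
// counts is [1, 2, 2, 4, 4, 4]
```

The 7-second observation appears in none of the explicit buckets and would only be counted in the implicit `+Inf` bucket, which is a hint that the upper bounds may need extending if such latencies are common.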
|
|
120
|
-
|
|
121
|
-
### 4. Distributed Tracing (OpenTelemetry)
|
|
122
|
-
|
|
123
|
-
**Integration:**
|
|
124
|
-
|
|
125
|
-
```typescript
|
|
126
|
-
import * as Otel from '@effect/opentelemetry'
|
|
127
|
-
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node'
|
|
128
|
-
|
|
129
|
-
const provider = new NodeTracerProvider()
|
|
130
|
-
const tracer = provider.getTracer('swarm-mail')
|
|
131
|
-
|
|
132
|
-
// Trace message send
|
|
133
|
-
const span = tracer.startSpan('sendMessage', {
|
|
134
|
-
attributes: {
|
|
135
|
-
'swarm.sender': 'AgentA',
|
|
136
|
-
'swarm.recipient': 'AgentB',
|
|
137
|
-
'swarm.thread_id': 'bd-123'
|
|
138
|
-
}
|
|
139
|
-
})
|
|
140
|
-
|
|
141
|
-
await sendMessage(msg)
|
|
142
|
-
span.end()
|
|
143
|
-
```
|
|
144
|
-
|
|
145
|
-
**Trace Propagation:**
|
|
146
|
-
|
|
147
|
-
- Add trace_id to message metadata
|
|
148
|
-
- Worker agents continue traces from parents
|
|
149
|
-
- Visualize full swarm execution flow
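One possible way to carry trace context in message metadata is the W3C `traceparent` string. The sketch below assumes a `traceparent` field on the metadata object; that field name and the helper names are illustrative and not part of Swarm Mail.

```typescript
// Hypothetical message metadata; only `traceparent` matters for this sketch.
interface MessageMetadata {
  traceparent?: string;
}

interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  parentSpanId: string; // 16 lowercase hex characters
  sampled: boolean;
}

// Serialize a context in the W3C format: version-traceid-spanid-flags.
function toTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentSpanId}-${ctx.sampled ? "01" : "00"}`;
}

// Parse a `traceparent` value; returns undefined if it is malformed.
function fromTraceparent(value: string): TraceContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(value);
  if (!match) return undefined;
  return {
    traceId: match[1],
    parentSpanId: match[2],
    sampled: (parseInt(match[3], 16) & 1) === 1,
  };
}

// The coordinator attaches the context; the worker reads it back.
const outgoing: MessageMetadata = {
  traceparent: toTraceparent({
    traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
    parentSpanId: "00f067aa0ba902b7",
    sampled: true,
  }),
};
const incoming = fromTraceparent(outgoing.traceparent ?? "");
```

A worker that receives `incoming` would start its own span with the same `traceId` and use `parentSpanId` as the parent, which is what lets a tracing backend stitch the swarm's spans into one tree.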
|
|
150
|
-
|
|
151
|
-
## Consequences
|
|
152
|
-
|
|
153
|
-
### Easier
|
|
154
|
-
|
|
155
|
-
- **Visibility** - See all events, messages, locks in real-time
|
|
156
|
-
- **Debugging** - Trace issues across agents via distributed tracing
|
|
157
|
-
- **Performance** - Identify slow operations via histograms
|
|
158
|
-
- **Operations** - CLI for prod debugging without UI
|
|
159
|
-
|
|
160
|
-
### More Difficult
|
|
161
|
-
|
|
162
|
-
- **Maintenance** - Another app to maintain (DevTools UI)
|
|
163
|
-
- **Bundle size** - Metrics/tracing deps increase plugin size
|
|
164
|
-
- **Performance overhead** - Instrumentation adds latency
|
|
165
|
-
- **Configuration** - Metrics exporters, trace backends
|
|
166
|
-
|
|
167
|
-
## Implementation Notes
|
|
168
|
-
|
|
169
|
-
### Phase 1: CLI (Week 1)
|
|
170
|
-
|
|
171
|
-
- Add @effect/cli dependency
|
|
172
|
-
- Implement events, messages, locks commands
|
|
173
|
-
- Test with real swarm sessions
|
|
174
|
-
|
|
175
|
-
### Phase 2: DevTools UI (Week 2-3)
|
|
176
|
-
|
|
177
|
-
- Scaffold SvelteKit app
|
|
178
|
-
- Build event stream viewer
|
|
179
|
-
- Add SSE endpoint for real-time updates
|
|
180
|
-
- Static export + embed in plugin
|
|
181
|
-
|
|
182
|
-
### Phase 3: Metrics (Week 4)
|
|
183
|
-
|
|
184
|
-
- Add prom-client dependency
|
|
185
|
-
- Instrument send/receive latency
|
|
186
|
-
- Add queue depth gauge
|
|
187
|
-
- Expose /metrics endpoint
|
|
188
|
-
|
|
189
|
-
### Phase 4: Tracing (Week 5)
|
|
190
|
-
|
|
191
|
-
- Add @effect/opentelemetry
|
|
192
|
-
- Instrument message send/receive
|
|
193
|
-
- Propagate trace context
|
|
194
|
-
- Test with Jaeger/Zipkin
|
|
195
|
-
|
|
196
|
-
### Success Criteria
|
|
197
|
-
|
|
198
|
-
- [ ] CLI can tail events in real-time
|
|
199
|
-
- [ ] DevTools UI shows live message stream
|
|
200
|
-
- [ ] Metrics exposed at /metrics endpoint
|
|
201
|
-
- [ ] Traces visible in Jaeger UI
|
|
202
|
-
- [ ] Documentation for all observability tools
|
|
@@ -1,168 +0,0 @@
|
|
|
1
|
-
# ADR-007: Swarm Enhancements - Worktree Isolation + Structured Review
|
|
2
|
-
|
|
3
|
-
## Status
|
|
4
|
-
|
|
5
|
-
Proposed
|
|
6
|
-
|
|
7
|
-
## Context
|
|
8
|
-
|
|
9
|
-
After reviewing [nexxeln/opencode-config](https://github.com/nexxeln/opencode-config), we identified several patterns that would strengthen our swarm coordination:
|
|
10
|
-
|
|
11
|
-
1. **Git worktree isolation** - Each worker gets a complete isolated copy of the repo
|
|
12
|
-
2. **Structured review loop** - Workers must pass review before completion
|
|
13
|
-
3. **Retry options on abort** - Clean recovery paths when things go wrong
|
|
14
|
-
|
|
15
|
-
Currently our swarm uses:
|
|
16
|
-
- **File reservations** via Swarm Mail for conflict prevention
|
|
17
|
-
- **UBS scan** on completion for bug detection
|
|
18
|
-
- **Manual cleanup** on abort
|
|
19
|
-
|
|
20
|
-
## Decision
|
|
21
|
-
|
|
22
|
-
### 1. Optional Worktree Isolation Mode
|
|
23
|
-
|
|
24
|
-
Add `isolation` parameter to swarm initialization:
|
|
25
|
-
|
|
26
|
-
```typescript
|
|
27
|
-
swarm_init({
|
|
28
|
-
task: "Large refactor across 50 files",
|
|
29
|
-
isolation: "worktree" // or "reservation" (default)
|
|
30
|
-
})
|
|
31
|
-
```
|
|
32
|
-
|
|
33
|
-
**When to use worktrees:**
|
|
34
|
-
- Large refactors touching many files
|
|
35
|
-
- High risk of merge conflicts
|
|
36
|
-
- Need complete isolation (different node_modules, etc.)
|
|
37
|
-
|
|
38
|
-
**When to use reservations (default):**
|
|
39
|
-
- Most swarm tasks
|
|
40
|
-
- Quick parallel work
|
|
41
|
-
- Lower overhead
|
|
42
|
-
|
|
43
|
-
**Worktree lifecycle:**
|
|
44
|
-
```
|
|
45
|
-
swarm_worktree_create(task_id) → /path/to/worktree
|
|
46
|
-
↓
|
|
47
|
-
worker does work in worktree
|
|
48
|
-
↓
|
|
49
|
-
swarm_worktree_merge(task_id) → cherry-pick commit to main
|
|
50
|
-
↓
|
|
51
|
-
swarm_worktree_cleanup(task_id) → remove worktree
|
|
52
|
-
```
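The lifecycle above maps fairly directly onto standard git commands. The sketch below only builds the argument vectors and does not run them; the `swarm/<task_id>` branch naming and the `worktreePlan` helper are assumptions for illustration, not the plugin's actual implementation.

```typescript
// Hypothetical naming scheme: one branch and one directory per task.
function worktreePlan(taskId: string, baseDir: string, startCommit: string) {
  const path = `${baseDir}/${taskId}`;
  const branch = `swarm/${taskId}`;
  return {
    path,
    // Create an isolated checkout on its own branch, starting from the swarm's start commit.
    create: ["git", "worktree", "add", "-b", branch, path, startCommit],
    // Bring the worker's commit back (run from the main working tree).
    merge: (commit: string) => ["git", "cherry-pick", commit],
    // Remove the worktree directory, then delete its branch.
    cleanup: [
      ["git", "worktree", "remove", "--force", path],
      ["git", "branch", "-D", branch],
    ],
    // On abort: reset the main checkout to the start commit.
    abort: ["git", "reset", "--hard", startCommit],
  };
}

const plan = worktreePlan("bd-123.2", "/tmp/swarm-worktrees", "abc1234");
```

Returning argument vectors instead of shell strings keeps task IDs and paths from being interpreted by a shell when the commands are eventually spawned.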
|
|
53
|
-
|
|
54
|
-
**On abort:** Hard reset main to start commit, delete all worktrees.
|
|
55
|
-
|
|
56
|
-
### 2. Structured Review Step
|
|
57
|
-
|
|
58
|
-
The coordinator reviews worker output before marking complete. This replaces the current "trust but verify with UBS" approach.
|
|
59
|
-
|
|
60
|
-
**Review flow:**
|
|
61
|
-
```
|
|
62
|
-
worker completes → coordinator reviews → approved/needs_changes
|
|
63
|
-
↓
|
|
64
|
-
if needs_changes: worker fixes (max 3 attempts)
|
|
65
|
-
↓
|
|
66
|
-
if approved: mark complete
|
|
67
|
-
```
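The review flow above can be sketched as a bounded loop. This version assumes "max 3 attempts" means at most three review rounds, and it takes the reviewer and the worker's fix step as plain synchronous callbacks so it stays self-contained; the real coordinator and worker calls would be asynchronous, and all names here are hypothetical.

```typescript
type ReviewVerdict =
  | { status: "approved" }
  | { status: "needs_changes"; feedback: string };

const MAX_REVIEW_ATTEMPTS = 3;

// `review` and `fix` are injected so the sketch does not depend on how the
// coordinator and worker are actually invoked.
function reviewLoop(
  review: (attempt: number) => ReviewVerdict,
  fix: (feedback: string) => void,
): { approved: boolean; attempts: number } {
  for (let attempt = 1; attempt <= MAX_REVIEW_ATTEMPTS; attempt++) {
    const verdict = review(attempt);
    if (verdict.status === "approved") {
      return { approved: true, attempts: attempt };
    }
    // Let the worker fix the issues, unless this was the last allowed round.
    if (attempt < MAX_REVIEW_ATTEMPTS) fix(verdict.feedback);
  }
  return { approved: false, attempts: MAX_REVIEW_ATTEMPTS };
}
```

What happens after a third rejection is left open here; the caller only learns that approval was not reached and can decide whether to abort or escalate.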
|
|
68
|
-
|
|
69
|
-
**Review prompt includes:**
|
|
70
|
-
- Epic goal (the big picture)
|
|
71
|
-
- Task requirements
|
|
72
|
-
- What completed tasks this builds on (dependency context)
|
|
73
|
-
- What future tasks depend on this (downstream context)
|
|
74
|
-
- The actual code changes
|
|
75
|
-
|
|
76
|
-
**Why coordinator reviews (not separate reviewer agent):**
|
|
77
|
-
- Coordinator already has full epic context loaded
|
|
78
|
-
- Avoids spawning another agent just for review
|
|
79
|
-
- Keeps the feedback loop tight
|
|
80
|
-
- Coordinator can make judgment calls about "good enough"
|
|
81
|
-
|
|
82
|
-
**Review criteria:**
|
|
83
|
-
1. Does it fulfill the task requirements?
|
|
84
|
-
2. Does it serve the epic goal?
|
|
85
|
-
3. Will downstream tasks be able to use it?
|
|
86
|
-
4. Are there critical bugs? (UBS scan still runs)
|
|
87
|
-
|
|
88
|
-
### 3. Retry Options on Abort
|
|
89
|
-
|
|
90
|
-
When a swarm aborts (user request or failure), provide clear recovery paths:
|
|
91
|
-
|
|
92
|
-
```json
|
|
93
|
-
{
|
|
94
|
-
"retry_options": {
|
|
95
|
-
"same_plan": "/swarm --retry",
|
|
96
|
-
"edit_plan": "/swarm --retry --edit",
|
|
97
|
-
"fresh_start": "/swarm \"original task\""
|
|
98
|
-
}
|
|
99
|
-
}
|
|
100
|
-
```
|
|
101
|
-
|
|
102
|
-
**`--retry`**: Resume with same plan, skip completed tasks
|
|
103
|
-
**`--retry --edit`**: Show plan for modification before resuming
|
|
104
|
-
**Fresh start**: Decompose from scratch
|
|
105
|
-
|
|
106
|
-
This requires persisting swarm session state (already have this via Hive cells).
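A minimal sketch of what `--retry` could do with that persisted state, assuming each planned task carries a status. The `PlannedTask` shape and its status values are assumptions for this example and are not taken from the Hive cell schema.

```typescript
// Hypothetical persisted plan entry.
interface PlannedTask {
  id: string;
  status: "open" | "in_progress" | "closed";
}

// `--retry`: keep the same plan, but only re-run tasks that did not finish.
function tasksToResume(plan: PlannedTask[]): PlannedTask[] {
  return plan
    .filter((task) => task.status !== "closed")
    // Interrupted work starts over rather than resuming mid-flight.
    .map((task) => ({ ...task, status: "open" as const }));
}

const resumed = tasksToResume([
  { id: "bd-123.1", status: "closed" },
  { id: "bd-123.2", status: "in_progress" },
  { id: "bd-123.3", status: "open" },
]);
```

Resetting interrupted tasks to `open` is one design choice among several; a task aborted halfway may have left partial changes behind, so restarting it from a clean state seems safer than trying to continue it.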
|
|
107
|
-
|
|
108
|
-
## Implementation
|
|
109
|
-
|
|
110
|
-
### Phase 1: Structured Review (Priority)
|
|
111
|
-
1. Add review step to `swarm_complete`
|
|
112
|
-
2. Create review prompt with epic context injection
|
|
113
|
-
3. Handle needs_changes → worker retry loop (max 3)
|
|
114
|
-
4. Keep UBS scan as additional safety net
|
|
115
|
-
|
|
116
|
-
### Phase 2: Worktree Isolation
|
|
117
|
-
1. Add `isolation` mode to `swarm_init`
|
|
118
|
-
2. Implement worktree lifecycle tools
|
|
119
|
-
3. Update worker prompts to work in worktree path
|
|
120
|
-
4. Add cherry-pick merge on completion
|
|
121
|
-
5. Add cleanup on abort
|
|
122
|
-
|
|
123
|
-
### Phase 3: Retry Options
|
|
124
|
-
1. Persist session state for recovery
|
|
125
|
-
2. Add `--retry` and `--retry --edit` flags
|
|
126
|
-
3. Skip completed tasks on retry
|
|
127
|
-
4. Show plan editor for `--edit` mode
|
|
128
|
-
|
|
129
|
-
## Consequences
|
|
130
|
-
|
|
131
|
-
### Positive
|
|
132
|
-
- **Better quality**: Structured review catches issues before integration
|
|
133
|
-
- **Safer large refactors**: Worktree isolation eliminates merge conflicts
|
|
134
|
-
- **Cleaner recovery**: Retry options reduce friction after failures
|
|
135
|
-
- **Coordinator stays in control**: Review keeps human-in-the-loop feel
|
|
136
|
-
|
|
137
|
-
### Negative
|
|
138
|
-
- **More complexity**: Two isolation modes to maintain
|
|
139
|
-
- **Slower completion**: Review step adds latency
|
|
140
|
-
- **Disk usage**: Worktrees consume space (mitigated by cleanup)
|
|
141
|
-
|
|
142
|
-
### Neutral
|
|
143
|
-
- **Credit**: Patterns inspired by nexxeln/opencode-config - should acknowledge in docs
|
|
144
|
-
|
|
145
|
-
## Alternatives Considered
|
|
146
|
-
|
|
147
|
-
### Separate Reviewer Agent
|
|
148
|
-
nexxeln uses a dedicated reviewer subagent. We chose coordinator-as-reviewer because:
|
|
149
|
-
- Avoids context duplication (coordinator already has epic context)
|
|
150
|
-
- Faster feedback loop
|
|
151
|
-
- Coordinator can make "ship it" judgment calls
|
|
152
|
-
|
|
153
|
-
### Staged Changes on Finalize
|
|
154
|
-
nexxeln soft-resets to leave changes staged for user review. We're skipping this because:
|
|
155
|
-
- Our flow already has explicit commit step
|
|
156
|
-
- Hive tracks what changed
|
|
157
|
-
- User can always `git diff` before committing
|
|
158
|
-
|
|
159
|
-
### Always Use Worktrees
|
|
160
|
-
Could simplify by always using worktrees. Rejected because:
|
|
161
|
-
- Overkill for most tasks
|
|
162
|
-
- Slower setup/teardown
|
|
163
|
-
- File reservations work fine for typical parallel work
|
|
164
|
-
|
|
165
|
-
## References
|
|
166
|
-
|
|
167
|
-
- [nexxeln/opencode-config](https://github.com/nexxeln/opencode-config) - Source of inspiration
|
|
168
|
-
- Epic: `bd-lf2p4u-mjaja96b9da` - Swarm Enhancements
|
|
@@ -1,293 +0,0 @@
|
|
|
1
|
-
# ADR-008: Worker Handoff Protocol - Structured Contracts Over Prose
|
|
2
|
-
|
|
3
|
-
## Status
|
|
4
|
-
|
|
5
|
-
Proposed
|
|
6
|
-
|
|
7
|
-
## Context
|
|
8
|
-
|
|
9
|
-
The current `SUBTASK_PROMPT_V2` is a **280-line prose instruction manual** that gets injected into every swarm worker's context. This approach has fundamental problems:
|
|
10
|
-
|
|
11
|
-
### Current Problems
|
|
12
|
-
|
|
13
|
-
1. **Workers ignore prose** - Long text instructions get skimmed or missed entirely
|
|
14
|
-
2. **No validation** - Can't programmatically verify workers followed protocol
|
|
15
|
-
3. **Context bloat** - 280 lines * N workers burns tokens fast
|
|
16
|
-
4. **Drift and violations** - Workers modify files outside their scope, no automatic detection
|
|
17
|
-
5. **Manual error recovery** - Coordinator can't auto-detect contract violations
|
|
18
|
-
|
|
19
|
-
**Concrete example of failure:**
|
|
20
|
-
```
|
|
21
|
-
Worker assigned: ["src/auth/service.ts"]
|
|
22
|
-
Worker actually touched: ["src/auth/service.ts", "src/lib/jwt.ts", "src/types/user.ts"]
|
|
23
|
-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
24
|
-
Scope creep undetected until swarm_complete
|
|
25
|
-
```
|
|
26
|
-
|
|
27
|
-
Current `swarm_complete` validates `files_touched ⊆ files_owned`, but the **contract** was never machine-readable to begin with.
|
|
28
|
-
|
|
29
|
-
### Research & Inspirations
|
|
30
|
-
|
|
31
|
-
From "Patterns for Building AI Agents" and production event-driven systems:
|
|
32
|
-
|
|
33
|
-
**mdflow adapter pattern:**
|
|
34
|
-
- Convention-based behavior inference
|
|
35
|
-
- Template variables define expectations
|
|
36
|
-
- Minimal configuration, maximum clarity
|
|
37
|
-
|
|
38
|
-
**Bellemare's event-driven orchestration:**
|
|
39
|
-
- Explicit contracts between services
|
|
40
|
-
- Commands vs Events distinction
|
|
41
|
-
- Contract violations fail fast with clear errors
|
|
42
|
-
|
|
43
|
-
**Key insight:** Agents need **two channels**:
|
|
44
|
-
1. **Contract** (machine-readable, validated) - WHAT to do, WHERE to do it
|
|
45
|
-
2. **Context** (human-readable, advisory) - WHY it matters, HOW it fits together
|
|
46
|
-
|
|
47
|
-
## Decision
|
|
48
|
-
|
|
49
|
-
Replace 280-line prose with **WorkerHandoff envelope** that separates contract from context.
|
|
50
|
-
|
|
51
|
-
### WorkerHandoff Structure
|
|
52
|
-
|
|
53
|
-
```typescript
|
|
54
|
-
interface WorkerHandoff {
|
|
55
|
-
// Machine-readable - enforced by tools
|
|
56
|
-
contract: {
|
|
57
|
-
task_id: string; // Cell ID for tracking
|
|
58
|
-
files_owned: string[]; // Exclusive write access (validated)
|
|
59
|
-
files_readonly: string[]; // Can read, MUST NOT modify (validated)
|
|
60
|
-
dependencies_completed: string[]; // Tasks that finished before this
|
|
61
|
-
success_criteria: string[]; // Exit conditions (checkable)
|
|
62
|
-
};
|
|
63
|
-
|
|
64
|
-
// Human-readable - advisory context
|
|
65
|
-
context: {
|
|
66
|
-
epic_summary: string; // Big picture goal
|
|
67
|
-
your_role: string; // What this subtask accomplishes
|
|
68
|
-
what_others_did: string; // Dependency outputs
|
|
69
|
-
what_comes_next: string; // Downstream task expectations
|
|
70
|
-
};
|
|
71
|
-
|
|
72
|
-
// Escalation paths - when things go wrong
|
|
73
|
-
escalation: {
|
|
74
|
-
blocked_contact: string; // "coordinator" or agent name
|
|
75
|
-
scope_change_protocol: string; // "swarmmail_send + await approval"
|
|
76
|
-
};
|
|
77
|
-
}
|
|
78
|
-
```
|
|
79
|
-
|
|
80
|
-
### Example Handoff
|
|
81
|
-
|
|
82
|
-
```json
|
|
83
|
-
{
|
|
84
|
-
"contract": {
|
|
85
|
-
"task_id": "bd-123.2",
|
|
86
|
-
"files_owned": ["src/auth/service.ts", "src/auth/service.test.ts"],
|
|
87
|
-
"files_readonly": ["src/types/user.ts", "src/lib/jwt.ts"],
|
|
88
|
-
"dependencies_completed": ["bd-123.1"],
|
|
89
|
-
"success_criteria": [
|
|
90
|
-
"AuthService.login() returns JWT token",
|
|
91
|
-
"Tests pass: bun test src/auth/service.test.ts",
|
|
92
|
-
"Type check passes: tsc --noEmit"
|
|
93
|
-
]
|
|
94
|
-
},
|
|
95
|
-
"context": {
|
|
96
|
-
"epic_summary": "Add OAuth authentication to user service",
|
|
97
|
-
"your_role": "Implement AuthService with JWT token generation",
|
|
98
|
-
"what_others_did": "bd-123.1 created User schema with email/password fields",
|
|
99
|
-
"what_comes_next": "bd-123.3 will integrate this service into API routes"
|
|
100
|
-
},
|
|
101
|
-
"escalation": {
|
|
102
|
-
"blocked_contact": "coordinator",
|
|
103
|
-
"scope_change_protocol": "swarmmail_send(subject='Scope Change', ack_required=true)"
|
|
104
|
-
}
|
|
105
|
-
}
|
|
106
|
-
```
|
|
107
|
-
|
|
108
|
-
### Validation in swarm_complete
|
|
109
|
-
|
|
110
|
-
```typescript
|
|
111
|
-
// swarm_complete now validates against contract
|
|
112
|
-
function validateCompletion(handoff: WorkerHandoff, result: CompletionReport) {
|
|
113
|
-
const violations: string[] = [];
|
|
114
|
-
|
|
115
|
-
// 1. File scope violations
|
|
116
|
-
const unauthorized = result.files_touched.filter(
|
|
117
|
-
f => !handoff.contract.files_owned.includes(f)
|
|
118
|
-
);
|
|
119
|
-
if (unauthorized.length > 0) {
|
|
120
|
-
violations.push(`Touched unauthorized files: ${unauthorized.join(", ")}`);
|
|
121
|
-
}
|
|
122
|
-
|
|
123
|
-
// 2. Success criteria (checkable ones)
|
|
124
|
-
for (const criterion of handoff.contract.success_criteria) {
|
|
125
|
-
if (criterion.startsWith("Tests pass:")) {
|
|
126
|
-
// Run the test command, validate exit 0
|
|
127
|
-
}
|
|
128
|
-
if (criterion.startsWith("Type check passes:")) {
|
|
129
|
-
// Run tsc --noEmit, validate exit 0
|
|
130
|
-
}
|
|
131
|
-
}
|
|
132
|
-
|
|
133
|
-
// 3. Learning signals from violations
|
|
134
|
-
if (violations.length > 0) {
|
|
135
|
-
recordLearningSignal({
|
|
136
|
-
task_id: handoff.contract.task_id,
|
|
137
|
-
violation_type: "scope_creep",
|
|
138
|
-
details: violations,
|
|
139
|
-
impact: "negative" // Penalize decomposition strategy
|
|
140
|
-
});
|
|
141
|
-
}
|
|
142
|
-
|
|
143
|
-
return { valid: violations.length === 0, violations };
|
|
144
|
-
}
|
|
145
|
-
```
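The two placeholder branches in the criteria loop need a command to run. One way to obtain it, sketched here under the assumption that checkable criteria always follow the `<prefix>: <command>` convention from the example handoff, is to strip the known prefix; `commandForCriterion` is a hypothetical helper name.

```typescript
// Prefixes taken from the example handoff; anything else is treated as prose.
const CHECKABLE_PREFIXES = ["Tests pass:", "Type check passes:"];

// Extract the shell command from a checkable criterion, or undefined for
// criteria that can only be judged by a reviewer.
function commandForCriterion(criterion: string): string | undefined {
  const prefix = CHECKABLE_PREFIXES.find((p) => criterion.startsWith(p));
  if (!prefix) return undefined;
  const command = criterion.slice(prefix.length).trim();
  return command.length > 0 ? command : undefined;
}

const testCommand = commandForCriterion("Tests pass: bun test src/auth/service.test.ts");
const proseCriterion = commandForCriterion("AuthService.login() returns JWT token");
```

Because the extracted string would be executed, contracts should only come from a trusted coordinator; a criterion that returns `undefined` falls back to human or coordinator review.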
|
|
146
|
-
|
|
147
|
-
### Integration with Existing Tools
|
|
148
|
-
|
|
149
|
-
**swarm_spawn_subtask generates handoffs:**
|
|
150
|
-
|
|
151
|
-
```typescript
|
|
152
|
-
export const swarm_spawn_subtask = tool(/* ... */)
|
|
153
|
-
.handler(async ({ input, context }) => {
|
|
154
|
-
const handoff: WorkerHandoff = {
|
|
155
|
-
contract: {
|
|
156
|
-
task_id: input.bead_id,
|
|
157
|
-
files_owned: input.files,
|
|
158
|
-
files_readonly: inferReadonlyFiles(input.files, epicContext),
|
|
159
|
-
dependencies_completed: input.dependencies_completed || [],
|
|
160
|
-
success_criteria: generateSuccessCriteria(input.subtask_description)
|
|
161
|
-
},
|
|
162
|
-
context: {
|
|
163
|
-
epic_summary: epicContext.summary,
|
|
164
|
-
your_role: input.subtask_title,
|
|
165
|
-
what_others_did: summarizeDependencies(input.dependencies_completed),
|
|
166
|
-
what_comes_next: summarizeDownstream(input.bead_id)
|
|
167
|
-
},
|
|
168
|
-
escalation: {
|
|
169
|
-
blocked_contact: "coordinator",
|
|
170
|
-
scope_change_protocol: "swarmmail_send(subject='Scope Change', ack_required=true)"
|
|
171
|
-
}
|
|
172
|
-
};
|
|
173
|
-
|
|
174
|
-
return formatHandoff(handoff); // Compact JSON + minimal prose wrapper
|
|
175
|
-
});
|
|
176
|
-
```

**swarm_complete validates contract:**

```typescript
export const swarm_complete = tool(/* ... */)
  .handler(async ({ input, context }) => {
    const handoff = getStoredHandoff(input.bead_id);
    const validation = validateCompletion(handoff, {
      files_touched: input.files_touched,
      summary: input.summary
    });

    if (!validation.valid) {
      throw new Error(
        `Contract violations detected:\n${validation.violations.join("\n")}`
      );
    }

    // Proceed with UBS scan, reservation release, etc.
  });
```

## Consequences

### Positive

- **Validation enforced** - Can't complete with contract violations
- **Clear boundaries** - Workers know exactly what's in/out of scope
- **Better learning** - Scope creep violations feed back into strategy selection
- **Context efficiency** - Contract is ~30 lines of JSON vs 280 lines of prose
- **Fail fast** - Violations detected immediately, not during merge
- **Programmatic recovery** - Coordinator can auto-detect and reassign work

### Negative

- **Requires storage** - Handoffs must persist (already have event store)
- **Success criteria limited** - Can't validate all criteria automatically
- **Migration cost** - Existing `SUBTASK_PROMPT_V2` users need to update
- **More upfront work** - Coordinator must generate better contracts

### Neutral

- **Prose still exists** - `context` field provides human explanation, just smaller
- **Not eliminating checklist** - 9-step survival checklist stays, but moves to tool enforcement

## Implementation Notes

### Phase 1: Storage & Schema

1. Add `WorkerHandoff` schema to swarm-mail event types
2. Store handoffs in event log when spawning subtasks
3. Retrieve handoffs in `swarm_complete` for validation

### Phase 2: Generation Logic

1. Implement `inferReadonlyFiles()` - analyze imports/dependencies
2. Implement `generateSuccessCriteria()` - parse task description for checkable conditions
3. Implement `summarizeDependencies()` and `summarizeDownstream()` - build context from epic graph
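
One way to read "analyze imports/dependencies" for `inferReadonlyFiles()` is as a set difference: everything the owned files import, minus the owned files themselves. The sketch below assumes an already-resolved import graph is available. The `ImportGraph` type and the second parameter are hypothetical stand-ins for whatever the epic context actually provides.

```typescript
// Hypothetical input: each file path mapped to the (already resolved)
// paths it imports. How this graph is built is out of scope here.
type ImportGraph = Record<string, string[]>;

// Returns files that the owned files depend on but do not own.
// These are candidates for `files_readonly` in the contract.
function inferReadonlyFiles(filesOwned: string[], imports: ImportGraph): string[] {
  const owned = new Set(filesOwned);
  const readonly = new Set<string>();
  for (const file of filesOwned) {
    for (const dep of imports[file] ?? []) {
      // A dependency owned by the same worker is writable, not read-only.
      if (!owned.has(dep)) readonly.add(dep);
    }
  }
  // Sorted so the generated contract is stable across runs.
  return Array.from(readonly).sort();
}
```

This version only looks at direct imports. Whether transitive dependencies belong in the contract is a design choice the notes above leave open.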

### Phase 3: Validation

1. Add contract validation to `swarm_complete`
2. Implement checkable criteria runners (test commands, type checks)
3. Record learning signals for violations

### Phase 4: Migration

1. Update `formatSubtaskPromptV2` to generate handoff JSON
2. Deprecate 280-line prose template
3. Update tests for new handoff format

### Phase 5: Enhanced Features (Future)

1. **Readonly enforcement** - Detect modifications to `files_readonly` via git diff
2. **Dependency validation** - Verify `dependencies_completed` actually ran first
3. **Auto-generated success criteria** - Parse test files, infer criteria from code
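
Readonly enforcement reduces to intersecting the changed paths with `files_readonly`. The sketch below assumes the changed paths have already been collected, for example from `git diff --name-only` against the worker's base commit. `checkReadonlyFiles` and `ReadonlyCheck` are hypothetical names, and the result shape mirrors the `{ valid, violations }` object returned by the validation code earlier in this ADR.

```typescript
interface ReadonlyCheck {
  valid: boolean;
  violations: string[];
}

// changedFiles: paths reported as modified (assumed to come from
// something like `git diff --name-only <base>`).
// filesReadonly: the `files_readonly` list from the worker's contract.
function checkReadonlyFiles(changedFiles: string[], filesReadonly: string[]): ReadonlyCheck {
  const readonly = new Set(filesReadonly);
  const violations = changedFiles
    .filter((file) => readonly.has(file))
    .map((file) => `Modified read-only file: ${file}`);
  return { valid: violations.length === 0, violations };
}
```

Exact string matching is the simplest choice. Path normalization or glob support would be needed if contracts ever list directories rather than individual files.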

## Alternatives Considered

### Keep Prose, Add Validation

Keep `SUBTASK_PROMPT_V2` but add validation after the fact. **Rejected** because:
- Still burns 280 lines of context per worker
- Workers still ignore prose
- Validation happens too late (after work done)

### Minimal Contract Only

Remove context entirely, pure machine contract. **Rejected** because:
- Workers need WHY to make good judgment calls
- Context helps with edge cases not in contract
- Loss of human readability hurts debugging

### Command Pattern (Bellemare Style)

Full event-sourcing with Command objects. **Rejected** because:
- Over-engineered for current needs
- Already have event store for coordination
- Contract + context is simpler and sufficient

## References

- **"Patterns for Building AI Agents"** - Subagent context sharing patterns
- **mdflow** - Convention-based adapter design, template variable contracts
- **Bellemare's "Building Event-Driven Microservices"** - Explicit contracts, fail-fast validation
- **Current implementation:** `src/swarm-prompts.ts` (SUBTASK_PROMPT_V2, lines 253-530)
- **Related:** ADR-007 (Structured Review), ADR-002 (Package Extraction)

## Success Criteria

- [ ] `WorkerHandoff` schema defined and validated with Zod
- [ ] `swarm_spawn_subtask` generates handoffs instead of raw prose
- [ ] `swarm_complete` validates contract before accepting completion
- [ ] Scope violations trigger learning signals (negative feedback)
- [ ] Workers receive handoff as JSON + compact context wrapper (<50 lines)
- [ ] Test suite validates contract enforcement catches violations
- [ ] Migration path documented for existing swarm users