opencode-swarm-plugin 0.36.1 → 0.38.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,175 @@
  # opencode-swarm-plugin
 
+ ## 0.38.0
+
+ ### Minor Changes
+
+ - [`41a1965`](https://github.com/joelhooks/swarm-tools/commit/41a19657b252eb1c7a7dc82bc59ab13589e8758f) Thanks [@joelhooks](https://github.com/joelhooks)! - ## 🐝 Coordinators Now Delegate Research to Workers
+
+ Coordinators finally know their place. They orchestrate, they don't fetch.
+
+ **The Problem:**
+ Coordinators were calling `repo-crawl_file`, `webfetch`, `context7_*` directly, burning expensive Sonnet context on raw file contents instead of spawning researcher workers.
+
+ **The Fix:**
+
+ ### Forbidden Tools Section
+
+ COORDINATOR_PROMPT now explicitly lists tools coordinators must NEVER call:
+
+ - `repo-crawl_*`, `repo-autopsy_*` - repository fetching
+ - `webfetch`, `fetch_fetch` - web fetching
+ - `context7_*` - library documentation
+ - `pdf-brain_search`, `pdf-brain_read` - knowledge base
+
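The forbidden-tools list above lends itself to a simple name check. The following is a minimal sketch only: the changelog says the list lives in the coordinator prompt, and nothing in the diff confirms a runtime guard exists. `FORBIDDEN_COORDINATOR_TOOLS` and `isForbiddenForCoordinator` are hypothetical names, and treating a trailing `*` as a prefix wildcard is an assumption.

```typescript
// Hypothetical guard: the pattern list mirrors the changelog entry above.
const FORBIDDEN_COORDINATOR_TOOLS: readonly string[] = [
  "repo-crawl_*",
  "repo-autopsy_*",
  "webfetch",
  "fetch_fetch",
  "context7_*",
  "pdf-brain_search",
  "pdf-brain_read",
];

// Returns true when toolName equals an entry exactly, or starts with the
// prefix of an entry that ends in "*" (assumed wildcard semantics).
function isForbiddenForCoordinator(toolName: string): boolean {
  return FORBIDDEN_COORDINATOR_TOOLS.some((pattern) => {
    if (pattern.slice(-1) === "*") {
      const prefix = pattern.slice(0, -1);
      return toolName.indexOf(prefix) === 0;
    }
    return toolName === pattern;
  });
}
```

A coordinator wrapper could call such a check before dispatching a tool and spawn a researcher worker instead when it returns `true`.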
+ ### Phase 1.5: Research Phase
+
+ New workflow phase between Initialize and Knowledge Gathering:
+
+ ```
+ swarm_spawn_researcher(
+   research_id="research-nextjs-cache",
+   tech_stack=["Next.js 16 Cache Components"],
+   project_path="/path/to/project"
+ )
+ ```
+
+ ### Strong Coordinator Identity Post-Compaction
+
+ When context compacts, the resuming agent now sees:
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │ 🐝 YOU ARE THE COORDINATOR 🐝 │
+ │ NOT A WORKER. NOT AN IMPLEMENTER. │
+ │ YOU ORCHESTRATE. │
+ └─────────────────────────────────────────────────────────────┘
+ ```
+
+ ### runResearchPhase Returns Spawn Instructions
+
+ ```typescript
+ const result = await runResearchPhase(task, projectPath);
+ // result.spawn_instructions = [
+ //   { research_id, tech, prompt, subagent_type: "swarm/researcher" }
+ // ]
+ ```
+
+ **32+ new tests, all 425 passing.**
+
+ - [`b06f69b`](https://github.com/joelhooks/swarm-tools/commit/b06f69bc3db099c14f712585d88b42c801123d01) Thanks [@joelhooks](https://github.com/joelhooks)! - ## 🔬 Eval Capture Pipeline: Complete
+
+ > "The purpose of computing is insight, not numbers." — Richard Hamming
+
+ Wire all eval-capture functions into the swarm execution path, enabling ground-truth collection from real swarm executions.
+
+ **What changed:**
+
+ | Function | Wired Into | Purpose |
+ | ------------------------- | ------------------------------ | ---------------------------------- |
+ | `captureDecomposition()` | `swarm_validate_decomposition` | Records task → subtasks mapping |
+ | `captureSubtaskOutcome()` | `swarm_complete` | Records per-subtask execution data |
+ | `finalizeEvalRecord()` | `swarm_record_outcome` | Computes aggregate metrics |
+
+ **New npm scripts:**
+
+ ```bash
+ bun run eval:run            # Run all evals
+ bun run eval:decomposition  # Decomposition quality
+ bun run eval:coordinator    # Coordinator discipline
+ ```
+
+ **Data flow:**
+
+ ```
+ swarm_decompose → captureDecomposition → .opencode/eval-data.jsonl
+
+ swarm_complete → captureSubtaskOutcome → updates record with outcomes
+
+ swarm_record_outcome → finalizeEvalRecord → computes scope_accuracy, time_balance
+
+ evalite → reads JSONL → scores decomposition quality
+ ```
+
+ **Why it matters:**
+
+ - Enables data-driven decomposition strategy selection
+ - Tracks which strategies work for which task types
+ - Provides ground truth for Evalite evals
+ - Foundation for learning from swarm outcomes
+
+ **Key discovery:** The new cell ID format doesn't follow the `epicId.subtaskNum` pattern, so the epic ID for a subtask must be read from `cell.parent_id`.
+
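The key discovery above can be illustrated with a small lookup. This is a sketch under stated assumptions, not the plugin's code: `CellRef` is a hypothetical minimal shape (only `parent_id` comes from the changelog), and treating a cell without a parent as its own epic is an assumption made for illustration.

```typescript
// Assumed minimal cell shape; only `id` and `parent_id` matter here.
interface CellRef {
  id: string;
  parent_id?: string | null;
}

// Reads the epic ID from parent_id instead of parsing the cell ID string.
// Assumption: a cell with no parent is itself the epic.
function resolveEpicId(cell: CellRef): string {
  return cell.parent_id ?? cell.id;
}
```

The point of the design is that nothing is inferred from the ID string itself, so the lookup keeps working if the ID format changes again.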
+ ### Patch Changes
+
+ - [`56e5d4c`](https://github.com/joelhooks/swarm-tools/commit/56e5d4c5ac96ddd2184d12c63e163bb9c291fb69) Thanks [@joelhooks](https://github.com/joelhooks)! - ## 🔬 Eval Capture Pipeline: Phase 1
+
+ > "The first step toward wisdom is getting things right. The second step is getting them wrong in interesting ways." — Marvin Minsky
+
+ Wire `captureDecomposition()` into `swarm_validate_decomposition` to record decomposition inputs/outputs for evaluation.
+
+ **What changed:**
+
+ - `swarm_validate_decomposition` now calls `captureDecomposition()` after successful validation
+ - Captures: epicId, projectPath, task, context, strategy, epicTitle, subtasks
+ - Data persisted to `.opencode/eval-data.jsonl` for Evalite consumption
+
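To make the persisted format concrete, here is a minimal sketch of writing and reading one JSONL record per decomposition. The field names come from the "Captures" bullet above; the field types, the subtask shape, and the helper names `toJsonlLine` and `parseJsonl` are assumptions and may not match what `captureDecomposition()` actually stores.

```typescript
// Field names follow the changelog; the types are assumed for illustration.
interface DecompositionRecord {
  epicId: string;
  projectPath: string;
  task: string;
  context?: string;
  strategy: string;
  epicTitle: string;
  subtasks: Array<{ title: string; files: string[] }>;
}

// JSONL stores one JSON document per line, terminated by a newline.
function toJsonlLine(record: DecompositionRecord): string {
  return JSON.stringify(record) + "\n";
}

// Parses JSONL text back into records, skipping blank lines.
function parseJsonl(text: string): DecompositionRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as DecompositionRecord);
}
```

Appending the output of `toJsonlLine` to a file keeps earlier records intact, which is one reason an append-only line format suits capture from live runs.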
117
+ **Why it matters:**
118
+
119
+ - Enables ground-truth collection from real swarm executions
120
+ - Foundation for decomposition quality evals
121
+ - Tracks what strategies work for which task types
122
+
123
+ **Tests added:**
124
+
125
+ - Verifies `captureDecomposition` called with correct params on success
126
+ - Verifies NOT called on validation failure
127
+ - Handles optional context/description fields
128
+
129
+ **Next:** Wire `captureSubtaskOutcome()` and `finalizeEvalRecord()` to complete the pipeline.
130
+
131
+ ## 0.37.0
132
+
133
+ ### Minor Changes
134
+
135
+ - [`66b5795`](https://github.com/joelhooks/swarm-tools/commit/66b57951e2c114702c663b98829d5f7626607a16) Thanks [@joelhooks](https://github.com/joelhooks)! - ## 🐝 `swarm cells` - Query Your Hive Like a Pro
136
+
137
+ New CLI command AND plugin tool for querying cells directly from the database.
138
+
139
+ ### CLI: `swarm cells`
140
+
141
+ ```bash
142
+ swarm cells # List all cells (table format)
143
+ swarm cells --status open # Filter by status
144
+ swarm cells --type bug # Filter by type
145
+ swarm cells --ready # Next unblocked cell
146
+ swarm cells mjkmd # Partial ID lookup
147
+ swarm cells --json # Raw JSON for scripting
148
+ ```
149
+
150
+ **Replaces:** The awkward `swarm tool hive_query --json '{"status":"open"}'` pattern.
151
+
152
+ ### Plugin Tool: `hive_cells`
153
+
154
+ ```typescript
155
+ // Agents can now query cells directly
156
+ hive_cells({ status: "open", type: "task" });
157
+ hive_cells({ id: "mjkmd" }); // Partial ID works!
158
+ hive_cells({ ready: true }); // Next unblocked
159
+ ```
160
+
161
+ **Why this matters:**
162
+
163
+ - Reads from DATABASE (fast, indexed) not JSONL files
164
+ - Partial ID resolution built-in
165
+ - Consistent JSON array output
166
+ - Rich descriptions encourage agentic use
167
+
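Partial ID resolution, as listed above, can be sketched as follows. The diff does not show how `hive_cells` matches a fragment such as `mjkmd`, so the substring rule, the "exact match wins" shortcut, and the names `CellSummary` and `resolvePartialId` are all assumptions. The real tool reads from the database rather than an in-memory array.

```typescript
// Hypothetical in-memory stand-in for rows returned by the database.
interface CellSummary {
  id: string;
  title: string;
}

// Returns every cell whose ID matches the fragment. An exact ID match
// short-circuits, otherwise any ID containing the fragment qualifies
// (assumed matching rule).
function resolvePartialId(cells: CellSummary[], partial: string): CellSummary[] {
  const exact = cells.filter((c) => c.id === partial);
  if (exact.length > 0) {
    return exact;
  }
  return cells.filter((c) => c.id.indexOf(partial) !== -1);
}
```

Returning an array even for a single hit is in line with the "consistent JSON array output" bullet, and it leaves the caller to decide how to handle an ambiguous fragment.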
+ ### Also Fixed
+
+ - `swarm_review_feedback` tests updated for coordinator-driven retry architecture
+ - 425 tests passing
+
  ## 0.36.1
 
  ### Patch Changes
package/README.md CHANGED
@@ -231,6 +231,39 @@ bun test
  bun run typecheck
  ```
 
+ ### Evaluation Pipeline
+
+ Test decomposition quality and coordinator discipline with **Evalite** (TypeScript-native eval framework):
+
+ ```bash
+ # Run all evals
+ bun run eval:run
+
+ # Run specific suites
+ bun run eval:decomposition  # Task decomposition quality
+ bun run eval:coordinator    # Coordinator protocol compliance
+ ```
+
+ **What gets evaluated:**
+
+ | Eval Suite | Measures | Data Source |
+ |------------|----------|-------------|
+ | `swarm-decomposition` | Subtask independence, complexity balance, coverage, clarity | Fixtures + captured real decompositions |
+ | `coordinator-session` | Violation count, spawn efficiency, review thoroughness | Real sessions from `~/.config/swarm-tools/sessions/` |
+
+ **Data capture locations:**
+ - Decomposition inputs/outputs: `.opencode/eval-data.jsonl`
+ - Coordinator sessions: `~/.config/swarm-tools/sessions/*.jsonl`
+ - Subtask outcomes: swarm-mail database (used for pattern learning)
+
+ **Custom scorers:**
+ - Subtask independence (0-1): Files don't overlap between subtasks
+ - Complexity balance (0-1): Subtasks have similar estimated complexity
+ - Coverage completeness (0-1): Required files are covered
+ - Instruction clarity (0-1): Descriptions are specific and actionable
+
+ See [evals/README.md](./evals/README.md) for scorer details and how to write new evals.
+
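As an illustration of the first custom scorer, the sketch below computes a 0-1 independence score from file overlap. The README only states that files should not overlap between subtasks. The formula here (one minus the share of distinct files claimed by more than one subtask) and the name `scoreSubtaskIndependence` are assumptions, and the scorer shipped in `evals/` may be defined differently.

```typescript
// Assumed scoring rule: 1 - (files claimed by 2+ subtasks / distinct files).
// An empty decomposition scores 1 because nothing overlaps.
function scoreSubtaskIndependence(subtasks: Array<{ files: string[] }>): number {
  // Deduplicate within each subtask so a repeated path is not an overlap.
  const perSubtask = subtasks.map((s) =>
    s.files.filter((file, index, all) => all.indexOf(file) === index),
  );
  const allFiles = perSubtask.reduce<string[]>(
    (acc, files) => acc.concat(files),
    [],
  );
  const distinct = allFiles.filter(
    (file, index, all) => all.indexOf(file) === index,
  );
  if (distinct.length === 0) {
    return 1;
  }
  // A file is shared when it appears at two different positions overall.
  const shared = distinct.filter(
    (file) => allFiles.indexOf(file) !== allFiles.lastIndexOf(file),
  ).length;
  return 1 - shared / distinct.length;
}
```

Under this rule, two subtasks that both touch one of three distinct files score about 0.67, and two subtasks that touch only the same file score 0.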
  ---
 
  ## CLI
package/bin/swarm.test.ts CHANGED
@@ -197,6 +197,112 @@ READ-ONLY research agent. Never modifies code - only gathers intel and stores fi
  // Log Command Tests (TDD)
  // ============================================================================
 
+ // ============================================================================
+ // Cells Command Tests (TDD)
+ // ============================================================================
+
+ /**
+  * Format cells as table output
+  */
+ function formatCellsTable(cells: Array<{
+   id: string;
+   title: string;
+   status: string;
+   priority: number;
+ }>): string {
+   if (cells.length === 0) {
+     return "No cells found";
+   }
+
+   const rows = cells.map(c => ({
+     id: c.id,
+     title: c.title.length > 50 ? c.title.slice(0, 47) + "..." : c.title,
+     status: c.status,
+     priority: String(c.priority),
+   }));
+
+   // Calculate column widths
+   const widths = {
+     id: Math.max(2, ...rows.map(r => r.id.length)),
+     title: Math.max(5, ...rows.map(r => r.title.length)),
+     status: Math.max(6, ...rows.map(r => r.status.length)),
+     priority: Math.max(8, ...rows.map(r => r.priority.length)),
+   };
+
+   // Build header
+   const header = [
+     "ID".padEnd(widths.id),
+     "TITLE".padEnd(widths.title),
+     "STATUS".padEnd(widths.status),
+     "PRIORITY".padEnd(widths.priority),
+   ].join(" ");
+
+   const separator = "-".repeat(header.length);
+
+   // Build rows
+   const bodyRows = rows.map(r =>
+     [
+       r.id.padEnd(widths.id),
+       r.title.padEnd(widths.title),
+       r.status.padEnd(widths.status),
+       r.priority.padEnd(widths.priority),
+     ].join(" ")
+   );
+
+   return [header, separator, ...bodyRows].join("\n");
+ }
+
+ describe("Cells command", () => {
+   describe("formatCellsTable", () => {
+     test("formats cells as table with id, title, status, priority", () => {
+       const cells = [
+         {
+           id: "test-abc123-xyz",
+           title: "Fix bug",
+           status: "open",
+           priority: 0,
+           type: "bug",
+           created_at: 1234567890,
+           updated_at: 1234567890,
+         },
+         {
+           id: "test-def456-abc",
+           title: "Add feature",
+           status: "in_progress",
+           priority: 2,
+           type: "feature",
+           created_at: 1234567890,
+           updated_at: 1234567890,
+         },
+       ];
+
+       const table = formatCellsTable(cells);
+
+       // Should contain headers
+       expect(table).toContain("ID");
+       expect(table).toContain("TITLE");
+       expect(table).toContain("STATUS");
+       expect(table).toContain("PRIORITY");
+
+       // Should contain cell data
+       expect(table).toContain("test-abc123-xyz");
+       expect(table).toContain("Fix bug");
+       expect(table).toContain("open");
+       expect(table).toContain("0");
+
+       expect(table).toContain("test-def456-abc");
+       expect(table).toContain("Add feature");
+       expect(table).toContain("in_progress");
+       expect(table).toContain("2");
+     });
+
+     test("returns 'No cells found' for empty array", () => {
+       const table = formatCellsTable([]);
+       expect(table).toBe("No cells found");
+     });
+   });
+ });
+
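The added tests do not exercise the title truncation branch of `formatCellsTable`. As a minimal sketch, the rule can be isolated and checked on its own. `truncateTitle` is a hypothetical helper that mirrors the expression used in the diff (titles longer than 50 characters are cut to 47 and suffixed with `...`); it is not part of the package.

```typescript
// Hypothetical helper mirroring the truncation expression in formatCellsTable:
// titles longer than maxLength keep their first maxLength - 3 characters
// and gain a "..." suffix, so the result is exactly maxLength long.
function truncateTitle(title: string, maxLength: number = 50): string {
  return title.length > maxLength
    ? title.slice(0, maxLength - 3) + "..."
    : title;
}
```

A title of exactly 50 characters passes through unchanged, which is the boundary case a follow-up test would most usefully pin down.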
  describe("Log command helpers", () => {
    let testDir: string;