opencode-swarm 7.89.0 → 7.90.1

Files changed (30)
  1. package/.opencode/skills/commit-pr/SKILL.md +548 -0
  2. package/.opencode/skills/engineering-conventions/SKILL.md +57 -0
  3. package/.opencode/skills/phase-wrap/SKILL.md +1 -1
  4. package/.opencode/skills/running-tests/SKILL.md +282 -0
  5. package/.opencode/skills/writing-tests/SKILL.md +794 -0
  6. package/dist/cli/{evidence-summary-service-5me91eq8.js → evidence-summary-service-mr9sns2d.js} +5 -5
  7. package/dist/cli/{gate-evidence-y8zn7fe2.js → gate-evidence-nphg8hay.js} +4 -4
  8. package/dist/cli/{guardrail-explain-hy0zz0p6.js → guardrail-explain-w29j6dmx.js} +10 -10
  9. package/dist/cli/{index-9w07ye9b.js → index-4gm78w6c.js} +23 -14
  10. package/dist/cli/{index-1ccnwh54.js → index-5hrexm02.js} +3 -3
  11. package/dist/cli/{index-bcp79s17.js → index-91qtsbce.js} +1 -1
  12. package/dist/cli/{index-dprk5c5f.js → index-c5d6tgbs.js} +10 -10
  13. package/dist/cli/{index-6k31ysgd.js → index-j49ge0mg.js} +1 -1
  14. package/dist/cli/{index-fjwwrwr5.js → index-kv4dd5c5.js} +1 -1
  15. package/dist/cli/{index-e7h9bb6v.js → index-mh1ej70w.js} +2 -2
  16. package/dist/cli/{index-vqyfscxd.js → index-sf08zj91.js} +1 -1
  17. package/dist/cli/{index-axwxkbdd.js → index-w7gkpmq8.js} +2 -2
  18. package/dist/cli/{index-p0ye10nd.js → index-xchgryg4.js} +10 -2
  19. package/dist/cli/{index-8y7qetpg.js → index-y1z6yaq4.js} +3 -3
  20. package/dist/cli/index.js +9 -9
  21. package/dist/cli/{knowledge-store-gsy6p46z.js → knowledge-store-eqans52j.js} +4 -4
  22. package/dist/cli/{pending-delegations-35fvcj7z.js → pending-delegations-shqbvfjc.js} +2 -2
  23. package/dist/cli/{pr-subscriptions-b18n1yd8.js → pr-subscriptions-2565fpsc.js} +3 -3
  24. package/dist/cli/{skill-generator-1hzfyhth.js → skill-generator-d0jzw6n2.js} +5 -5
  25. package/dist/cli/{telemetry-9bbyxrvn.js → telemetry-aa1ma1dr.js} +4 -2
  26. package/dist/config/bundled-skills.d.ts +1 -1
  27. package/dist/config/skill-mirrors.d.ts +87 -0
  28. package/dist/index.js +21 -5
  29. package/dist/telemetry.d.ts +7 -0
  30. package/package.json +6 -1
@@ -0,0 +1,282 @@
1
+ ---
2
+ name: running-tests
3
+ description: >
4
+ Safe test execution patterns for opencode-swarm. Covers when to use the test_runner
5
+ tool vs shell bun commands, scope safety rules, per-file isolation loops (bash and
6
+ PowerShell), pre-existing failure verification, CI log reading, and failure
7
+ classification. Load this skill when you need to run tests — not when you need to
8
+ write them (see writing-tests for authoring guidance).
9
+ ---
10
+
11
+ # Running Tests for opencode-swarm
12
+
13
+ This skill is about **executing** tests safely. For **writing** tests, see `writing-tests`.
14
+
15
+ ---
16
+
17
+ ## ⛔ The One Rule That Prevents Session Kills
18
+
19
+ **Never use `test_runner` with more than one source file for any discovery scope.**
20
+
21
+ `graph` and `impact` each fan out per file through the import tree; `convention` maps
22
+ each source file to a test file by name convention. The union quickly exceeds
23
+ `MAX_SAFE_TEST_FILES = 50`, triggering `scope_exceeded`, which causes LLMs to
24
+ cascade to `scope: 'all'` and kill the session. All three scopes now reject with
25
+ `scope_exceeded` before fan-out when `sourceFiles.length > MAX_SAFE_SOURCE_FILES = 1`.
26
+
27
+ ---
28
+
29
+ ## Three-Layer Defense Against Session Blocking
30
+
31
+ `test_runner` has three guard layers that prevent unbounded fan-out from blocking the session:
32
+
33
+ ### Layer 1 — Source-file count guard (synchronous, fires before any I/O)
34
+ `sourceFiles.length > MAX_SAFE_SOURCE_FILES (1)` → returns `scope_exceeded` immediately. Catches the common case of multi-file calls before any filesystem access.
35
+
36
+ ### Layer 2 — Pre-resolution fan-out estimate (fast, ~100ms)
37
+ `estimateFanOut(sourceFiles, workingDir)` reads the cached impact map and counts unique test files without spawning subprocesses. If the estimate exceeds `MAX_SAFE_TEST_FILES = 50`, the call returns `scope_exceeded` immediately — before any graph traversal begins. Only fires when `sourceFiles.length === 1` (Layer 1 has already passed).
38
+
39
+ ### Layer 3 — Budget-limited traversal + post-resolution length check
40
+ `analyzeImpact` accepts a `budget` parameter (`MAX_SAFE_TEST_FILES = 50`). The traversal stops as soon as it has visited 50 test files and sets `budgetExceeded: true`. The call site checks this flag and returns `scope_exceeded` before processing results.
41
+ After graph resolution, the final `testFiles.length` is additionally compared to `MAX_SAFE_TEST_FILES`. If exceeded, `scope_exceeded` is returned.
42
+
43
+ **Result:** When fan-out exceeds the safe threshold, the session gets `outcome: 'scope_exceeded'` instead of hanging.
44
+
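A minimal TypeScript sketch of how the three layers described above could compose. It is an illustration under stated assumptions, not the package's code: `guardedResolve`, `estimateFanOutSketch`, `resolveWithBudget`, and the `Map`-based impact map are hypothetical stand-ins, and the exact point at which the real traversal sets `budgetExceeded` may differ.

```typescript
// Hypothetical sketch: names and data shapes are assumptions, not the package's API.
const MAX_SAFE_SOURCE_FILES = 1;
const MAX_SAFE_TEST_FILES = 50;

type GuardResult =
  | { outcome: "scope_exceeded"; layer: 1 | 2 | 3 }
  | { outcome: "ok"; testFiles: string[] };

// Assumed shape: source file -> test files that depend on it.
type ImpactMap = Map<string, string[]>;

// Layer 2 stand-in: count unique test files from a cached map, no subprocesses.
function estimateFanOutSketch(sourceFiles: string[], cached: ImpactMap): number {
  const unique = new Set<string>();
  for (const src of sourceFiles) {
    for (const testFile of cached.get(src) ?? []) unique.add(testFile);
  }
  return unique.size;
}

// Layer 3 stand-in: stop collecting once the budget is reached.
// Assumption: the flag is set only when a further file would exceed the budget.
function resolveWithBudget(source: string, graph: ImpactMap, budget: number) {
  const testFiles: string[] = [];
  let budgetExceeded = false;
  for (const testFile of graph.get(source) ?? []) {
    if (testFiles.length >= budget) {
      budgetExceeded = true;
      break;
    }
    testFiles.push(testFile);
  }
  return { testFiles, budgetExceeded };
}

function guardedResolve(
  sourceFiles: string[],
  cached: ImpactMap,
  graph: ImpactMap,
): GuardResult {
  // Layer 1: synchronous source-file count check, before any I/O.
  if (sourceFiles.length > MAX_SAFE_SOURCE_FILES) {
    return { outcome: "scope_exceeded", layer: 1 };
  }
  // Layer 2: cheap estimate from the cached impact map.
  if (estimateFanOutSketch(sourceFiles, cached) > MAX_SAFE_TEST_FILES) {
    return { outcome: "scope_exceeded", layer: 2 };
  }
  const [source] = sourceFiles;
  if (source === undefined) return { outcome: "ok", testFiles: [] };
  // Layer 3: budget-limited traversal plus a final length check.
  const { testFiles, budgetExceeded } = resolveWithBudget(source, graph, MAX_SAFE_TEST_FILES);
  if (budgetExceeded || testFiles.length > MAX_SAFE_TEST_FILES) {
    return { outcome: "scope_exceeded", layer: 3 };
  }
  return { outcome: "ok", testFiles };
}
```

In this sketch the cached map and the live graph are separate arguments so that a stale or empty cache can be simulated; that is the case where Layer 3 is the guard that fires.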
45
+ ---
46
+
47
+ ## Decision Tree: test_runner tool vs bun shell command
48
+
49
+ ```
50
+ Do you need to run tests?
51
+
52
+ ├─ Single test file, targeted validation
53
+ │ └─ Either works. Prefer shell: bun --smol test <file> --timeout 30000
54
+
55
+ ├─ Multiple files in the same directory (e.g. all agents tests)
56
+ │ └─ Shell only — per-file loop. Never test_runner with multiple files.
57
+
58
+ ├─ Find tests related to ONE changed source file
59
+ │ └─ test_runner is fine: { scope: 'graph', files: ['src/agents/coder.ts'] }
60
+ │ (single file → bounded fan-out)
61
+
62
+ ├─ Find tests related to MULTIPLE changed source files
63
+ │ └─ Shell only — per-file loop over the changed files, or run the whole directory.
64
+ │ test_runner with any discovery scope + multiple source files = scope_exceeded
65
+ │ (guard fires before fan-out for convention, graph, and impact scopes).
66
+
67
+ └─ Validate the entire repo (pre-push)
68
+ └─ Shell only — 5-tier suite from commit-pr skill. Never test_runner scope:'all'.
69
+ ```
70
+
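The decision tree above can be restated as a small lookup function. This is a hedged sketch only: the `Need` and `Choice` types and the `chooseRunner` name are hypothetical and exist purely to make the branches explicit.

```typescript
// Hypothetical helper mirroring the decision tree; not part of the package.
type Need =
  | { kind: "single-test-file" }
  | { kind: "directory" }
  | { kind: "related-to-source"; sourceFiles: string[] }
  | { kind: "whole-repo" };

type Choice = { runner: "shell" | "test_runner"; note: string };

function chooseRunner(need: Need): Choice {
  switch (need.kind) {
    case "single-test-file":
      // Either works; the skill prefers the shell command.
      return { runner: "shell", note: "bun --smol test <file> --timeout 30000" };
    case "directory":
      return { runner: "shell", note: "per-file loop" };
    case "related-to-source":
      // Exactly one source file keeps fan-out bounded; anything else goes to the shell.
      return need.sourceFiles.length === 1
        ? { runner: "test_runner", note: "scope: 'graph' with a single file" }
        : { runner: "shell", note: "per-file loop over the changed files" };
    case "whole-repo":
      return { runner: "shell", note: "5-tier suite from the commit-pr skill" };
  }
}
```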
71
+ ---
72
+
73
+ ## Scope Safety Reference
74
+
75
+ | Scope | With `files: [one]` | With `files: [many]` | Notes |
76
+ |-------|--------------------|--------------------|-------|
77
+ | `'convention'` | ✅ Safe | ❌ Rejected (`scope_exceeded`) | Guard fires before fan-out; direct test file paths exempt |
78
+ | `'graph'` | ✅ Safe (capped at 50 via budget) | ❌ Rejected (`scope_exceeded`) | Two-layer guard: source-file count + fan-out estimate |
79
+ | `'impact'` | ✅ Safe (capped at 50 via budget) | ❌ Rejected (`scope_exceeded`) | Two-layer guard: source-file count + fan-out estimate |
80
+ | `'all'` | ❌ Never | ❌ Never | Requires `allow_full_suite: true`; CI mirror only |
82
+
83
+ **Rule of thumb:** Pass exactly one source file to `test_runner`. For multiple files, use a shell loop.
84
+
85
+ ---
86
+
87
+ ## Per-File Isolation Loops
88
+
89
+ CI runs agents/tools/services in per-file isolation (one `bun --smol` process per file).
90
+ Reproduce this locally with the following loops.
91
+
92
+ ### bash (Linux / macOS)
93
+
94
+ ```bash
95
+ # Single directory — per-file isolation
96
+ for f in tests/unit/agents/*.test.ts; do
97
+ bun --smol test "$f" --timeout 30000
98
+ done
99
+
100
+ # Multiple directories
101
+ for dir in tests/unit/tools tests/unit/services tests/unit/agents; do
102
+ for f in "$dir"/*.test.ts; do
103
+ bun --smol test "$f" --timeout 30000
104
+ done
105
+ done
106
+
107
+ # Stop on first failure (useful for debugging)
108
+ for f in tests/unit/agents/*.test.ts; do
109
+ bun --smol test "$f" --timeout 30000 || { echo "FAILED: $f"; break; }
110
+ done
111
+ ```
112
+
113
+ ### PowerShell (Windows)
114
+
115
+ ```powershell
116
+ # Single directory — per-file isolation
117
+ Get-ChildItem tests/unit/agents/*.test.ts | ForEach-Object {
118
+ bun --smol test $_.FullName --timeout 30000
119
+ }
120
+
121
+ # Multiple directories
122
+ @('tests/unit/tools', 'tests/unit/services', 'tests/unit/agents') | ForEach-Object {
123
+ Get-ChildItem "$_/*.test.ts" | ForEach-Object {
124
+ bun --smol test $_.FullName --timeout 30000
125
+ }
126
+ }
127
+
128
+ # Capture output (avoids truncation on large output)
129
+ Get-ChildItem tests/unit/agents/*.test.ts | ForEach-Object {
130
+ bun --smol test $_.FullName --timeout 30000 2>&1
131
+ } | Out-File "$env:TEMP\test_out.txt"
132
+ Get-Content "$env:TEMP\test_out.txt" | Select-Object -Last 50
133
+ ```
134
+
135
+ **Common PowerShell pitfalls:**
136
+ - `for f in ...; do` — invalid, use `Get-ChildItem | ForEach-Object`
137
+ - `Select-String -Last N` — invalid parameter, use `Select-Object -Last N`
138
+ - `2>&1 2>&1` — duplicate redirection, causes parse error; use `2>&1` once
139
+ - `&&` — not supported in PowerShell 5.1; use `; if ($?) { cmd2 }` instead
140
+ - After `bun install --frozen-lockfile --force`, non-elevated Windows shells can hit `EPERM` while reading refreshed `node_modules` entries. Treat that as a host permission/access issue: rerun the same focused Bun command with approved/elevated access before diagnosing it as a code or test failure.
141
+
142
+ ---
143
+
144
+ ## Batch vs Per-File: Which Directories Need Isolation?
145
+
146
+ | Directory | Mode | Reason |
147
+ |-----------|------|--------|
148
+ | `tests/unit/tools/` | Per-file loop | Heavy `mock.module` usage; cache poisoning risk |
149
+ | `tests/unit/services/` | Per-file loop | Same |
150
+ | `tests/unit/agents/` | Per-file loop | Same |
151
+ | `tests/unit/hooks/` | Per-file loop | Same |
152
+ | `tests/unit/cli/` | Batch OK | Fewer mock conflicts |
153
+ | `tests/unit/commands/` | Batch OK | Fewer mock conflicts |
154
+ | `tests/unit/config/` | Batch OK | Fewer mock conflicts |
155
+ | `tests/integration/` | Batch OK | Integration fixtures, not mock-heavy |
156
+ | `tests/security/` | Batch OK | Adversarial inputs, no module mocks |
157
+ | `tests/smoke/` | Batch OK | Built-package tests |
158
+
159
+ ---
160
+
161
+ ## Truncated Output Recovery
162
+
163
+ When `bun test` output exceeds the bash tool's buffer, it is saved to a file with an ID
164
+ like `tool_dff778...`. This ID format is **not** accepted by `retrieve_summary` (which only
165
+ reads `S1`, `S2` etc. format IDs). The output is effectively lost.
166
+
167
+ **Prevention — pipe to a file explicitly:**
168
+
169
+ ```powershell
170
+ # PowerShell
171
+ bun --smol test tests/unit/agents --timeout 60000 2>&1 |
172
+ Out-File "$env:TEMP\test_out.txt"
173
+ Get-Content "$env:TEMP\test_out.txt" | Select-Object -Last 50
174
+ ```
175
+
176
+ ```bash
177
+ # bash
178
+ bun --smol test tests/unit/agents --timeout 60000 2>&1 | tee /tmp/test_out.txt
179
+ tail -50 /tmp/test_out.txt
180
+ ```
181
+
182
+ **To get a clean pass/fail summary only**, filter immediately:
183
+
184
+ ```powershell
185
+ # PowerShell — show only summary lines
186
+ bun --smol test tests/unit/agents --timeout 60000 2>&1 |
187
+ Select-String "pass|fail|error" |
188
+ Select-Object -Last 10
189
+ ```
190
+
191
+ ```bash
192
+ # bash
193
+ bun --smol test tests/unit/agents --timeout 60000 2>&1 | grep -E "pass|fail|error" | tail -10
194
+ ```
195
+
196
+ ---
197
+
198
+ ## Verifying Pre-Existing Failures
199
+
200
+ Before documenting a failure as "pre-existing," prove it exists on `main` without affecting
201
+ your working tree. Use a Git worktree — safer than `git stash` (stash can drop untracked
202
+ files, fail on locked files on Windows, and leave you in an inconsistent state).
203
+
204
+ ```bash
205
+ # bash — create a throwaway checkout of main
206
+ git worktree add /tmp/repro-check origin/main
207
+ bun --smol test /tmp/repro-check/tests/unit/agents/architect-workflow-security.test.ts --timeout 30000
208
+ git worktree remove /tmp/repro-check
209
+ ```
210
+
211
+ ```powershell
212
+ # PowerShell — same pattern (use Join-Path for robust separator handling)
213
+ git worktree add "$env:TEMP\repro-check" origin/main
214
+ $testPath = Join-Path "$env:TEMP\repro-check" "tests\unit\agents\architect-workflow-security.test.ts"
215
+ bun --smol test $testPath --timeout 30000
216
+ git worktree remove "$env:TEMP\repro-check"
217
+ ```
218
+
219
+ **Decision after checking:**
220
+ - Fails on `main` too → pre-existing. Document under `## Pre-existing failures` in PR body. Continue.
221
+ - Fails only on your branch → you introduced it. Fix before pushing.
222
+
223
+ **⚠️ Check your own session history first.** Before documenting anything as pre-existing, confirm you did not fix or update this test earlier in the current session. A test you fixed 20 messages ago is not pre-existing — listing it as such in the table or PR body is incorrect and will be caught in review.
224
+
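The decision above, including the session-history check, can be written as a tiny pure function. This is a sketch under assumptions: the `Observation` fields and the verdict names are hypothetical labels, and the three booleans are expected to come from running the test on your branch, running it in the `main` worktree, and reviewing your own session.

```typescript
// Hypothetical classifier; field and verdict names are illustrative only.
type Verdict = "not-failing" | "own-change" | "pre-existing" | "new-regression";

interface Observation {
  failsOnBranch: boolean; // result of the test on your working branch
  failsOnMain: boolean; // result of the same test in the main worktree
  touchedThisSession: boolean; // you fixed or updated this test earlier in the session
}

function classifyFailure(o: Observation): Verdict {
  if (!o.failsOnBranch) return "not-failing";
  // A test you already touched in this session is never "pre-existing".
  if (o.touchedThisSession) return "own-change";
  // Fails on main too: document it. Fails only on the branch: fix before pushing.
  return o.failsOnMain ? "pre-existing" : "new-regression";
}
```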
225
+ ---
226
+
227
+ ## Failure Classification
228
+
229
+ Not all failures are equal. Before deciding what to do, classify the failure:
230
+
231
+ | Class | Definition | Example | What to do |
232
+ |-------|-----------|---------|------------|
233
+ | **Stale assertion** | Test checks for text/value that was deliberately removed | `expect(prompt).toContain('CONSTRAINT: [what NOT to do]')` — template removed in refactor | Update the assertion to match current state |
234
+ | **Soft regression indicator** | Test checks a threshold the codebase has since exceeded | `expect(tokenCount).toBeLessThan(35000)` — prompt grew past limit | Fix the threshold or reduce the prompt; do not just document and ignore |
235
+ | **Genuine pre-existing** | Failure exists on `main` unrelated to any recent change | `full-auto-intercept.test.ts` logger gating issue | Document in PR body; do not fix unless scoped |
236
+ | **New regression** | Failure introduced by your changes | Tests for prompt text you removed without updating tests | Fix before pushing |
237
+
238
+ **Stale assertions and soft regression indicators are actionable** — they signal drift between
239
+ tests and code. Genuine pre-existing failures are not your responsibility to fix in this PR,
240
+ but they must be documented.
241
+
242
+ ---
243
+
244
+ ## Reading CI Failure Logs
245
+
246
+ When a CI job fails, the GitHub Actions log shows the exact `file:line` of the failure.
247
+ Do not guess — read the log.
248
+
249
+ ```bash
250
+ # Get the failing job URL from the PR
251
+ gh pr view <number> --json statusCheckRollup --jq '.statusCheckRollup[] | select(.conclusion=="FAILURE") | .detailsUrl'
252
+
253
+ # Fetch and search the log (if gh CLI available)
254
+ gh run view --log <run-id> | grep -iE "fail|error" | head -20
255
+ ```
256
+
257
+ Or open the `detailsUrl` directly in a browser / via WebFetch and search for:
258
+ - `(fail)` — Bun test failure marker
259
+ - `error:` — parse or runtime error
260
+ - `at <anonymous>` — stack frame pointing to the test file and line
261
+
262
+ Once you have `tests/unit/agents/some-file.test.ts:354`, reproduce locally:
263
+ ```bash
264
+ bun --smol test tests/unit/agents/some-file.test.ts --timeout 30000
265
+ ```
266
+
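If you have the log text saved locally, the `file:line` locations can also be pulled out programmatically. The sketch below is illustrative: `extractTestLocations` is a hypothetical helper, and the regular expression assumes test paths end in `.test.ts` and are followed by `:<line>`, which may not hold for every log format.

```typescript
// Hypothetical helper: extracts unique "<path>.test.ts:<line>" locations from log text.
interface TestLocation {
  file: string;
  line: number;
}

function extractTestLocations(log: string): TestLocation[] {
  // Assumed format: a path made of word characters, dots, slashes, and dashes.
  const pattern = /([\w./-]+\.test\.ts):(\d+)/g;
  const seen = new Set<string>();
  const found: TestLocation[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(log)) !== null) {
    const [, file, lineText] = match;
    if (file === undefined || lineText === undefined) continue;
    const key = `${file}:${lineText}`;
    if (seen.has(key)) continue; // report each location once
    seen.add(key);
    found.push({ file, line: Number(lineText) });
  }
  return found;
}
```

Each returned `file` can then be passed to `bun --smol test <file> --timeout 30000` to reproduce the failure in isolation.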
267
+ ---
268
+
269
+ ## Quick Reference: Common Failures and Causes
270
+
271
+ | Symptom | Likely cause | Fix |
272
+ |---------|-------------|-----|
273
+ | `scope_exceeded` returned from test_runner | More than one source file was passed, or fan-out exceeded 50 test files during graph/impact resolution | Pass exactly one source file, or switch to a per-file shell loop |
274
+ | Session killed during test_runner | Pre-fix: unbounded fan-out on multiple files | Now returns `scope_exceeded` instead — no more session kills |
275
+ | `mock.module` breaks unrelated tests | Missing spread of real module exports | Add `...realModule` spread |
276
+ | Windows tests fail with EBUSY | `mock.restore()` called while child process holds lock | Add `test.skipIf(process.platform === 'win32')` |
277
+ | Test output truncated, ID unreadable | Bash tool buffer exceeded | Pipe to `Out-File`/`tee` explicitly |
278
+ | `for f in ...; do` parse error | Bash syntax in PowerShell | Use `Get-ChildItem | ForEach-Object` |
279
+ | `Select-String -Last N` error | Invalid PowerShell parameter | Use `Select-Object -Last N` |
280
+ | Token budget test failure | Prompt grew past hardcoded threshold | Treat as soft regression; update the threshold or reduce the prompt |
281
+ | CONSTRAINT assertion fails after refactor | Test checks for removed format template | Update assertion to match current prompt |
282
+ | `package-check` CI failure | `package-check` validates the npm tarball (`npm pack` + tarball contents), so this is a source/build/package-manifest problem, not generated-file drift | `dist/` is generated and NOT committed; do not stage it. Run `bun run build` locally only when you need the bundle. There is no longer a committed-dist drift check. |