opencode-swarm 7.89.0 → 7.90.1

Files changed (30)
  1. package/.opencode/skills/commit-pr/SKILL.md +548 -0
  2. package/.opencode/skills/engineering-conventions/SKILL.md +57 -0
  3. package/.opencode/skills/phase-wrap/SKILL.md +1 -1
  4. package/.opencode/skills/running-tests/SKILL.md +282 -0
  5. package/.opencode/skills/writing-tests/SKILL.md +794 -0
  6. package/dist/cli/{evidence-summary-service-5me91eq8.js → evidence-summary-service-mr9sns2d.js} +5 -5
  7. package/dist/cli/{gate-evidence-y8zn7fe2.js → gate-evidence-nphg8hay.js} +4 -4
  8. package/dist/cli/{guardrail-explain-hy0zz0p6.js → guardrail-explain-w29j6dmx.js} +10 -10
  9. package/dist/cli/{index-9w07ye9b.js → index-4gm78w6c.js} +23 -14
  10. package/dist/cli/{index-1ccnwh54.js → index-5hrexm02.js} +3 -3
  11. package/dist/cli/{index-bcp79s17.js → index-91qtsbce.js} +1 -1
  12. package/dist/cli/{index-dprk5c5f.js → index-c5d6tgbs.js} +10 -10
  13. package/dist/cli/{index-6k31ysgd.js → index-j49ge0mg.js} +1 -1
  14. package/dist/cli/{index-fjwwrwr5.js → index-kv4dd5c5.js} +1 -1
  15. package/dist/cli/{index-e7h9bb6v.js → index-mh1ej70w.js} +2 -2
  16. package/dist/cli/{index-vqyfscxd.js → index-sf08zj91.js} +1 -1
  17. package/dist/cli/{index-axwxkbdd.js → index-w7gkpmq8.js} +2 -2
  18. package/dist/cli/{index-p0ye10nd.js → index-xchgryg4.js} +10 -2
  19. package/dist/cli/{index-8y7qetpg.js → index-y1z6yaq4.js} +3 -3
  20. package/dist/cli/index.js +9 -9
  21. package/dist/cli/{knowledge-store-gsy6p46z.js → knowledge-store-eqans52j.js} +4 -4
  22. package/dist/cli/{pending-delegations-35fvcj7z.js → pending-delegations-shqbvfjc.js} +2 -2
  23. package/dist/cli/{pr-subscriptions-b18n1yd8.js → pr-subscriptions-2565fpsc.js} +3 -3
  24. package/dist/cli/{skill-generator-1hzfyhth.js → skill-generator-d0jzw6n2.js} +5 -5
  25. package/dist/cli/{telemetry-9bbyxrvn.js → telemetry-aa1ma1dr.js} +4 -2
  26. package/dist/config/bundled-skills.d.ts +1 -1
  27. package/dist/config/skill-mirrors.d.ts +87 -0
  28. package/dist/index.js +21 -5
  29. package/dist/telemetry.d.ts +7 -0
  30. package/package.json +6 -1
@@ -0,0 +1,794 @@
1
+ ---
2
+ name: writing-tests
3
+ description: >
4
+ Guidelines for writing, organizing, and maintaining tests in the opencode-swarm repository.
5
+ Covers framework rules (bun:test), mock isolation, CI pipeline structure, file placement,
6
+ and anti-patterns that break cross-platform CI. Load this skill before writing or modifying
7
+ any test file.
8
+ ---
9
+
10
+ # Writing Tests for opencode-swarm
11
+
12
+ > **⚠️ Do NOT use the OpenCode `test_runner` tool to validate the full repo.** It is for targeted agent validation with explicit `files: [...]` or small targeted scopes. `scope: 'all'` requires `allow_full_suite: true` and is intended for opt-in CI mirrors only. Broad scopes can stall or kill OpenCode before the `MAX_SAFE_TEST_FILES = 50` guard in `src/tools/test-runner.ts` fires. For repo validation, use the shell commands in this file — per-file isolation loops match CI behavior. `allow_full_suite` should be used only when intentional and justified in the PR description. See [`AGENTS.md`](../../../AGENTS.md) invariant 6 for the full contract.
13
+
14
+ ## ⛔ STOP — Read Before Running Any Tests
15
+
16
+ **`test_runner` scope safety — one rule, no exceptions:**
17
+
18
+ | Scope | Files param | Safe? |
19
+ |-------|------------|-------|
20
+ | `'convention'` | single source file | ✅ Safe |
21
+ | `'convention'` | **multiple source files** | ❌ **Rejected** — guard fires (`scope_exceeded`) before fan-out; use shell loop |
22
+ | `'convention'` | direct test file paths | ✅ Safe — exempt from source-file limit |
23
+ | `'graph'` | single file | ✅ Safe |
24
+ | `'graph'` | **multiple files** | ❌ **Rejected** (`scope_exceeded`) — guard fires before import-graph traversal |
25
+ | `'impact'` | multiple files | ❌ **Rejected** (`scope_exceeded`) — same reason |
26
+ | `'all'` | any | ❌ **Never in agent context** |
27
+
28
+ **If you need to run tests across multiple source files: use a per-file shell loop, not `test_runner`.**
29
+
30
+ **Truncated output recovery:** When `bun test` output exceeds the bash tool buffer it is saved to a file whose ID (`tool_abc123...`) cannot be retrieved via `retrieve_summary` (which only accepts `S1`, `S2` format). Workaround — pipe to a temp file instead:
31
+ ```powershell
32
+ # PowerShell (Windows)
33
+ bun --smol test tests/unit/agents --timeout 60000 2>&1 | Out-File "$env:TEMP\test_out.txt"; Get-Content "$env:TEMP\test_out.txt" | Select-Object -Last 30
34
+ ```
35
+ ```bash
36
+ # bash (Linux/macOS)
37
+ bun --smol test tests/unit/agents --timeout 60000 2>&1 | tee /tmp/test_out.txt | tail -30
38
+ ```
39
+
40
+ ## Framework: bun:test Only
41
+
42
+ All test files MUST import from `bun:test`:
43
+
44
+ ```typescript
45
+ import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
46
+ ```
47
+
48
+ Bun provides a vitest compatibility layer (`vi.mock`, `vi.fn`, `vi.spyOn`) that works on Linux and macOS. However, `vi.mock()` has critical isolation bugs in Bun when multiple test directories run in the same process. Prefer `bun:test` native APIs:
49
+
50
+ | vitest API | bun:test equivalent | Notes |
51
+ |-----------|-------------------|-------|
52
+ | `vi.fn()` | `mock(() => ...)` | Import `mock` from `bun:test` |
53
+ | `vi.spyOn(obj, method)` | `spyOn(obj, method)` | Import `spyOn` from `bun:test` |
54
+ | `vi.mock('module', factory)` | `mock.module('module', factory)` | Import `mock` from `bun:test` |
55
+ | `vi.restoreAllMocks()` | `mock.restore()` | Call in `afterEach` |
56
+
57
+ ## Mock Isolation Rules
58
+
59
+ **CRITICAL: Module-level mocks leak across test files within the same Bun process.**
60
+
61
+ Bun's `--smol` mode shares the module cache between test files in the same worker process. A `mock.module()` call in file A replaces the module globally — file B gets the mock instead of the real module. This caused ~959 failures before per-file isolation was added (#330).
62
+
63
+ **Additional critical limitation (Bun v1.3.11):** `mock.restore()` does NOT reliably restore `mock.module` mocks. Cross-module mocks can persist across test boundaries even after an `afterEach(() => mock.restore())` hook has run. Three layers of defense are required: spreading real exports, best-effort `mock.restore()`, and per-file process isolation (see the Tier 2 rules below).
64
+
65
+ ### Rules
66
+
67
+ 1. **Spread the real module when mocking.** Only override the specific export you need:
68
+ ```typescript
69
+ import * as realChildProcess from 'node:child_process';
70
+ const mockExecFileSync = mock(() => '');
71
+ mock.module('node:child_process', () => ({
72
+ ...realChildProcess, // preserve all other exports
73
+ execFileSync: mockExecFileSync, // override only what you test
74
+ }));
75
+ ```
76
+ This prevents tests from accidentally nullifying exports that other code depends on. **This is mandatory for Node built-ins** (`node:fs`, `node:fs/promises`, `node:child_process`, etc.) because other code imports the full module — returning a partial mock without spreading real exports breaks unrelated imports.
77
+
78
+ 2. **Use lazy binding in source code.** Import the namespace, call methods at invocation time:
79
+ ```typescript
80
+ // GOOD — mockable via mock.module
81
+ import * as child_process from 'node:child_process';
82
+ function run() { return child_process.execFileSync('git', ['status']); }
83
+
84
+ // BAD — binds at module load, mock.module can't intercept
85
+ import { execFileSync } from 'node:child_process';
86
+ ```
87
+
88
+ 3. **Always add `afterEach(() => mock.restore())` for cross-module mocks.** Even though it is unreliable in Bun v1.3.11, it provides best-effort cleanup and reduces the window of cross-file contamination. Without it, the mock persists until the process exits:
89
+ ```typescript
90
+ import { afterEach, mock } from 'bun:test';
91
+
92
+ afterEach(() => {
93
+ mock.restore();
94
+ });
95
+ ```
96
+ **Exception — Windows EBUSY:** Test files that spawn async child processes (e.g. `pre-check-batch` tests) must **NOT** call `mock.restore()` on Windows. Child process handles can hold directory locks, and `mock.restore()` triggers cleanup that causes `EBUSY` errors. These files must use `describe.skipIf(process.platform === 'win32')` or `test.skipIf(process.platform === 'win32')` for affected tests.
97
+
98
+ Intentionally skipped on Windows (async child process handles cause EBUSY):
99
+ - `tests/unit/tools/pre-check-batch-sast-preexisting.test.ts`
100
+ - `tests/unit/tools/pre-check-batch.adversarial.test.ts`
101
+ - `tests/unit/tools/pre-check-batch-cwd.test.ts`
102
+ - `tests/unit/tools/pre-check-batch-cwd.adversarial.test.ts`
103
+ - `tests/unit/tools/pre-check-batch-contextdir-adversarial.test.ts`
104
+ - `tests/unit/tools/pre-check-batch-secretscan-evidence.test.ts`
105
+ - `tests/unit/tools/pre-check-batch.test.ts`
106
+
107
+ 4. **Never create circular mock imports.** This pattern deadlocks Bun:
108
+ ```typescript
109
+ // BROKEN — imports from the module it's about to mock
110
+ import { realFn } from '../../src/module.js';
111
+ vi.mock('../../src/module.js', () => ({
112
+ realFn: (...args) => realFn(...args), // circular!
113
+ otherFn: vi.fn(),
114
+ }));
115
+ ```
116
+ Instead, inline the function logic or extract the real functions into a separate utility module.
117
+
118
+ 5. **Prefer constructor/parameter injection over module mocking.** The swarm's hook factories (`createScopeGuardHook`, `createDelegationLedgerHook`, etc.) accept injected dependencies — test them by passing mock callbacks, not by replacing modules.
119
+
120
+ 6. **Mock `validateDirectory` when testing with Windows temp paths.** The `path-security.ts` validator rejects Windows absolute paths (`C:\...`). If your test uses `os.tmpdir()` and passes that path to a function that calls `validateDirectory`, mock it:
121
+ ```typescript
122
+ mock.module('../../../src/utils/path-security', () => ({
123
+ validateDirectory: () => {},
124
+ validateSwarmPath: (p: string) => p,
125
+ }));
126
+ ```
127
+
128
+ ## Diagnosing Test Isolation Failures
129
+
130
+ When test files pass individually but fail when run together, follow this protocol:
131
+
132
+ 1. **Isolate**: Run the failing file alone: `bun test <file>.test.ts --timeout 30000`
133
+ 2. **Pair**: Run it WITH its suspected polluting neighbor: `bun test <fileA>.test.ts <fileB>.test.ts`
134
+ 3. **Classify**:
135
+ - Both pass alone → fail together → **mock pollution** from neighbor
136
+ - Fails alone → **test logic bug** (not isolation issue)
137
+ - Passes alone + passes together but fails in full suite → **third-file pollution** (use binary search across directory)
138
+ 4. **For mock pollution**, check the neighbor for these patterns:
139
+ - `vi.mock()` or `mock.module()` inside `beforeEach()` (not at top level)
140
+ - `delete require.cache[...]` combined with re-import pattern
141
+ - These indicate hoist-time closure capture — see below
142
+ 5. **Specific symptom — closure capture failure**: `vi.mock()` captures closures at **hoist time** (before `beforeEach` runs). Reassigning `mockFn.mockImplementation(newFn)` in the test body does **NOT** update the hoisted closure — the mock still calls the original function.
143
+ - Symptom: `expect(mockFn).toHaveBeenCalledTimes(N)` fails with an unexpected count
144
+ - Symptom: `expect(mockFn).not.toHaveBeenCalled()` fails because the real function was called
145
+ 6. **Fix path**: Migrate the affected test file to `_internals` DI seam pattern per the `mock-to-internals-migration` skill. This eliminates both the `vi.mock()` call and the closure capture surface area. **Exception — reference-captured functions**: if the source code passes a function as a direct argument or captures it in a closure at module scope (e.g., `transactFile(path, readKnowledge, ...)`), the reference bypasses `_internals` entirely — mutating `_internals.readKnowledge` changes only the object property, not the module-scope binding the source already holds. Migrating to `_internals` does not help. In that case, test via observable outcomes (e.g., run concurrent callers and assert on final persisted state).
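The binary search mentioned in step 3 can be sketched as follows. This is a minimal sketch, not tooling from the repository: `findPolluter` and the `failsWith` callback are hypothetical names. The callback is assumed to run the victim file together with the given candidates in one process (for example by spawning `bun test` synchronously) and report whether the victim fails. The sketch also assumes exactly one polluting file.

```typescript
// Hypothetical helper: bisects candidate test files to find the one whose
// presence makes a victim test fail. Assumes a single polluter.
type FailsWith = (candidates: string[]) => boolean;

function findPolluter(
  candidates: string[],
  failsWith: FailsWith,
): string | undefined {
  if (candidates.length === 0) return undefined;
  let pool = candidates;
  while (pool.length > 1) {
    const mid = Math.ceil(pool.length / 2);
    const left = pool.slice(0, mid);
    // If the victim fails alongside the left half, the polluter is in it;
    // otherwise it must be in the right half.
    pool = failsWith(left) ? left : pool.slice(mid);
  }
  // Confirm the last remaining candidate really reproduces the failure.
  return failsWith(pool) ? pool[0] : undefined;
}
```

With N candidate files this needs roughly log2(N) runs instead of N pairwise runs. If more than one file pollutes the victim, the result is only one of them, so rerun after fixing it.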
146
+
147
+ ## Two-Tier Mock Convention
148
+
149
+ The codebase uses a two-tier strategy for mock isolation, plus a zero-mock testing pattern:
150
+
151
+ ### Tier 0: _test_exports Pure Function Testing (Zero Mocks)
152
+
153
+ When a module contains internal utility functions (formatters, normalizers, transformers) that don't need external dependencies, export them via a `_test_exports` object for direct unit testing. This avoids `mock.module` entirely and produces tests that are deterministic, fast, and immune to Bun's cross-file mock leakage:
154
+
155
+ ```typescript
156
+ // In source file (src/tools/formatter.ts)
157
+ function formatEntry(entry: SomeType): string {
158
+ // internal implementation — may use optional chaining, defaults, etc.
159
+ return entry.score?.toFixed(2) ?? 'N/A';
160
+ }
161
+
162
+ // Public API (tool handler, command handler, etc.)
163
+ export function handleQuery(ctx: Context) {
164
+ const entries = readData(ctx);
165
+ return entries.map(formatEntry);
166
+ }
167
+
168
+ // Export seam for testing — only used by test files
169
+ export const _test_exports = { formatEntry };
170
+ ```
171
+
172
+ ```typescript
173
+ // In test file (tests/unit/tools/formatter.test.ts)
174
+ import { _test_exports } from '../../../src/tools/formatter';
175
+
176
+ const { formatEntry } = _test_exports;
177
+
178
+ describe('formatEntry', () => {
179
+ test('handles missing score', () => {
180
+ expect(formatEntry({ score: undefined })).toBe('N/A');
181
+ });
182
+ test('formats numeric score', () => {
183
+ expect(formatEntry({ score: 0.85 })).toBe('0.85');
184
+ });
185
+ });
186
+ ```
187
+
188
+ **When to use Tier 0, Tier 1, or Tier 2:**
189
+ - **Tier 0 (`_test_exports`)**: The function is a pure utility (formatter, normalizer, transformer) that doesn't call external modules. No mocking needed — test it directly.
190
+ - **Tier 1 (`_internals`)**: You need to mock a function within the same module to test the caller in isolation. The function has side effects or calls external APIs.
191
+ - **Tier 2 (`mock.module`)**: You need to mock a dependency from another module (Node built-ins, other application modules).
192
+
193
+ **Benefits of Tier 0:**
194
+ - Zero mock pollution — no `mock.module` calls, no `mock.restore()` needed
195
+ - Works in batch test runs without per-file isolation
196
+ - Type-safe (the exported object carries the real TypeScript types)
197
+ - No filesystem dependencies (no tmpDir, no chdir, no existsSync)
198
+ - Deterministic on all platforms and CI environments
199
+
200
+ ### Tier 1: _internals DI Seams (Within-Module)
201
+
202
+ For mocking functions within the same module, source files export an `_internals` object that wraps key functions. Tests can replace individual functions without using `mock.module`:
203
+
204
+ ```typescript
205
+ // In source file (src/services/my-service.ts)
206
+ export const _internals = {
207
+ helperFn: () => { /* real implementation */ }
208
+ };
209
+
210
+ export function mainFn() {
211
+ return _internals.helperFn();
212
+ }
213
+ ```
214
+
215
+ ```typescript
216
+ // In test file
217
+ import { _internals, mainFn } from '../../../src/services/my-service';
218
+
219
+ test('mainFn uses mocked helper', () => {
220
+ const original = _internals.helperFn;
221
+ _internals.helperFn = mock(() => 'mocked');
222
+ // ... test ...
223
+ _internals.helperFn = original; // restore
224
+ });
225
+ ```
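The manual restore in the example above is skipped if the test body throws, which leaves the replacement installed for later tests in the same process. A minimal sketch of a safer variant using `try`/`finally` follows. `withReplaced` is a hypothetical helper, not part of the repository, and the `_internals` object here is a stand-in for a real module's seam.

```typescript
// Stand-in for a source module that exposes an _internals seam.
const _internals = {
  helperFn: (): string => 'real',
};

function mainFn(): string {
  // Reads the seam at call time, so replacements take effect.
  return _internals.helperFn();
}

// Hypothetical helper: swaps one property for the duration of a synchronous
// `body` and restores it even if `body` throws.
function withReplaced<T extends object, K extends keyof T, R>(
  target: T,
  key: K,
  replacement: T[K],
  body: () => R,
): R {
  const original = target[key];
  target[key] = replacement;
  try {
    return body();
  } finally {
    target[key] = original;
  }
}

// While the body runs, mainFn sees the replacement; afterwards the real
// helper is back in place.
const seen = withReplaced(_internals, 'helperFn', () => 'mocked', () => mainFn());
```

This version only covers synchronous bodies. For an async test body the restore would have to wait for the returned promise to settle.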
226
+
227
+ **Benefits:**
228
+ - No process-global mock pollution
229
+ - Type-safe
230
+ - Fast (no module re-parsing)
231
+ - Works in batch test runs without isolation
232
+
233
+ **Critical limitation — reference-captured functions:** `_internals` interception requires the source code to read `_internals.fn` at the call site. When a function is instead passed as a direct argument or captured in a closure at module definition time, replacing `_internals.fn` has no effect — the mock is silently ignored and the real function runs.
234
+
235
+ ```typescript
236
+ // Source: readKnowledge is captured at definition time, NOT via _internals
237
+ export async function transactKnowledge(filePath: string, mutate: Fn) {
238
+ return transactFile(filePath, readKnowledge, ...); // direct ref, captured at definition time
239
+ }
240
+ export const _internals = { readKnowledge }; // mutating this does NOT affect the closure above
241
+
242
+ // Test — mock is silently ignored; real readKnowledge still runs
243
+ const orig = _internals.readKnowledge;
244
+ _internals.readKnowledge = mock(() => []); // only mutates the object property
245
+ await transactKnowledge(path, mutate); // still calls the real readKnowledge
246
+ _internals.readKnowledge = orig;
247
+ ```
248
+
249
+ When `_internals` interception cannot work, verify **observable outcomes** instead: run concurrent callers and assert on final persisted state. See `tests/unit/hooks/knowledge-application.test.ts` ("two concurrent bumpCountersBatch calls") for the pattern.
250
+
251
+ ### Tier 2: mock.module (Cross-Module)
252
+
253
+ When mocking dependencies from other modules (especially Node built-ins), use `mock.module` with proper cleanup:
254
+
255
+ ```typescript
256
+ import * as realFs from 'node:fs/promises';
257
+
258
+ mock.module('node:fs/promises', () => ({
259
+ ...realFs, // MUST spread real exports
260
+ readFile: mock(() => Promise.resolve('mocked')),
261
+ }));
262
+
263
+ afterEach(() => mock.restore());
264
+ ```
265
+
266
+ **Critical rules for cross-module mocks:**
267
+ 1. **Always spread real exports** for Node built-ins — other code depends on exports you don't mock
268
+ 2. **Always add `afterEach(() => mock.restore())`** for best-effort cleanup
269
+ 3. **Run in per-file isolation** — CI runs each file in its own process (`for f in *.test.ts; do bun --smol test "$f"; done`)
270
+
271
+ ### Choosing Between Tiers
272
+
273
+ | Scenario | Pattern | Example |
274
+ |----------|---------|--------|
275
+ | Mocking a function in the same module you're testing | `_internals` seam | `src/state.ts` `_internals.loadSnapshot` |
276
+ | Mocking a Node built-in (fs, child_process, etc.) | `mock.module` + spread real | `mock.module('node:fs/promises', () => ({ ...realFs, readFile: mockFn }))` |
277
+ | Mocking another application module | `mock.module` + cleanup | `mock.module('../../../src/utils/logger', ...)` + `afterEach(() => mock.restore())` |
278
+ | File-scoped mock (applies to all tests in file) | `mock.module` at top level + `mockReset()` in `beforeEach` | Preflight tests with `mockLoadPlan.mockReset()` |
279
+
280
+ ## mock.module() Export Completeness
281
+
282
+ When using `mock.module()` (or `vi.mock()`) with Bun's test runner, the mock factory **MUST provide stubs for ALL named exports** of the target module — not just the ones your test calls. Bun validates the export set at dynamic-import time and throws `SyntaxError: Export named 'X' not found` if any export is missing.
283
+
284
+ ### Why this matters
285
+
286
+ Transitive imports may reference exports your test never calls directly. For example, if your test mocks `config/schema.js` and only uses `stripKnownSwarmPrefix`, but a transitive dependency imports `PluginConfigSchema` from the same module, the mock MUST include `PluginConfigSchema` as a stub — even though your test never calls it.
287
+
288
+ When the source module gains new exports (e.g., a PR adds 50 new Zod schemas to `config/schema.ts`), ALL existing `mock.module()` calls targeting that module must be updated — even if the new exports are irrelevant to your test.
289
+
290
+ ### How to verify completeness
291
+
292
+ Before finalizing a test that uses `mock.module()`:
293
+
294
+ 1. List all runtime exports of the target module (type-only exports are erased at compile time and need no stub):
295
+ ```bash
296
+ grep -E "^export (const|function|async function|class) " src/path/to/module.ts
297
+ ```
298
+ **Note:** This pattern does not match `export let`, `export default`, `export enum`, or `export { ... }` lists, so check for those separately. `type` and `interface` exports need no runtime stub.
299
+ 2. Ensure every export name has an entry in your `mock.module()` factory.
300
+ 3. Stubs can be minimal:
301
+ - Functions: `() => null` or `async () => {}`
302
+ - Zod schemas: use a comprehensive stub that supports common methods:
303
+ ```typescript
304
+ const zodStub = {
305
+ parse: (v: unknown) => v,
306
+ safeParse: (v: unknown) => ({ success: true as const, data: v }),
307
+ parseAsync: async (v: unknown) => v,
308
+ };
309
+ ```
310
+ - Constants: appropriate zero values (`''`, `0`, `null`, `[]`, `{}`)
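Steps 1 and 2 can also be expressed as a small check. This is a sketch under stated assumptions: `listRuntimeExports` and `missingStubs` are hypothetical names, the regex mirrors the `grep` pattern above (so it shares that pattern's blind spots), and the module source is passed in as a string rather than read from disk. The sample source text is illustrative only.

```typescript
// Hypothetical helper: collects names declared with
// `export const|function|async function|class`, mirroring the grep pattern.
// Type-only exports are not matched, so they are skipped automatically.
function listRuntimeExports(source: string): string[] {
  const pattern = /^export (?:const|function|async function|class) ([A-Za-z_$][\w$]*)/gm;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// Returns the runtime exports that the mock factory object does not provide.
function missingStubs(source: string, factory: Record<string, unknown>): string[] {
  return listRuntimeExports(source).filter((name) => !(name in factory));
}

// Illustrative source text, not the real contents of config/schema.ts.
const exampleSource = [
  'export const PluginConfigSchema = {};',
  'export function stripKnownSwarmPrefix(s: string) { return s; }',
  'export function isKnownCanonicalRole() { return false; }',
  'export type Role = string;',
].join('\n');

// The factory only stubs one export, so the other two runtime exports are reported.
const missing = missingStubs(exampleSource, { stripKnownSwarmPrefix: () => '' });
```

A check like this could run inside the test file itself and fail early with a readable list, instead of surfacing later as a `SyntaxError` from the module loader.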
311
+
312
+ ### Verification pattern
313
+
314
+ ```typescript
315
+ // ✅ CORRECT — all exports provided, test uses only the first one
316
+ mock.module('../../../src/config/schema.js', () => ({
317
+ // The one export your test actually uses
318
+ stripKnownSwarmPrefix: mockStripFn,
319
+ // Stubs for transitive import resolution (never called in test)
320
+ PluginConfigSchema: zodStub,
321
+ ScoringConfigSchema: zodStub,
322
+ isKnownCanonicalRole: () => false,
323
+ // ... all other runtime exports as stubs
324
+ }));
325
+
326
+ // ❌ WRONG — missing exports cause SyntaxError at module-load time
327
+ mock.module('../../../src/config/schema.js', () => ({
328
+ stripKnownSwarmPrefix: mockStripFn,
329
+ // Missing: PluginConfigSchema, ScoringConfigSchema, etc.
330
+ // → "SyntaxError: Export named 'PluginConfigSchema' not found"
331
+ }));
332
+ ```
333
+
334
+ ### What IS and IS NOT test theater
335
+
336
+ Adding stubs for ESM resolution is NOT test theater — it's a Bun runtime requirement. The distinction:
337
+
338
+ | Pattern | Test theater? | Why |
339
+ |---------|--------------|-----|
340
+ | Adding `PluginConfigSchema: zodStub` so the module loads | **No** | Required for ESM resolution; stub is never called |
341
+ | Stubbing `validateDirectory` to return `true` then asserting "validation works" | **Yes** | The stub bypasses the logic you should be testing |
342
+ | Using `zodStub` in assertions: `expect(zodStub.parse(input)).toBe(input)` | **Yes** | Testing the stub, not the real code |
343
+ | Adding stubs for ALL 50 Zod schemas in config/schema.ts | **No** | All are required for transitive import resolution |
344
+
345
+ The stubs exist solely to satisfy the module loader. Assertions must exercise the code under test and the mocks your test actually drives, never the placeholder stubs.
346
+
347
+ ### Files Intentionally Using File-Scoped Mocks
348
+
349
+ Some test files use top-level `mock.module` that must persist across all tests in the file. These files use `mockReset()`/`mockClear()` in `beforeEach` instead of `mock.restore()` in `afterEach`:
350
+
351
+ - `src/__tests__/preflight-phase.test.ts` — mocks `plan/manager` and `preflight-service`
352
+
353
+ ## Cross-Platform Test Patterns
354
+
355
+ Tests run on all three CI platforms (ubuntu, macos, windows). Path and filesystem behavior
356
+ differs between them. Follow these patterns to prevent platform-specific failures:
357
+
358
+ ### Mock keys with filesystem paths
359
+
360
+ **Never hardcode Unix-format paths as mock keys.** On Windows, `path.resolve('/dir', 'file')`
361
+ produces drive-letter-prefixed paths like `D:\dir\file`, not `/dir/file`. A mock that checks
362
+ for `/dir/file` will silently never match, causing the test to behave differently on Windows.
363
+
364
+ **Use `path.resolve()` to construct mock keys the same way the source code does:**
365
+
366
+ ```typescript
367
+ // ❌ WRONG — fails on Windows (mock expects '/safe/dir/linked.ts',
368
+ // but path.resolve('/safe/dir', 'linked.ts') = 'D:\safe\dir\linked.ts')
369
+ mockRealpathSync.mockImplementation((inputPath: string) => {
370
+ if (inputPath === '/safe/dir') return '/safe/dir';
371
+ if (inputPath === '/safe/dir/linked.ts') return '/outside/linked.ts';
372
+ return inputPath;
373
+ });
374
+
375
+ // ✅ CORRECT — path.resolve produces matching keys on all platforms
376
+ const mockDir = path.resolve('/safe/dir');
377
+ const linkedResolved = path.resolve(mockDir, 'linked.ts');
378
+ const outsideResolved = path.resolve('/outside/linked.ts');
379
+
380
+ // mockRealpathSync is a mock() function (bun:test) — see mocking patterns above
381
+ mockRealpathSync.mockImplementation((inputPath: string) => {
382
+ if (inputPath === mockDir) return mockDir;
383
+ if (inputPath === linkedResolved) return outsideResolved;
384
+ return inputPath;
385
+ });
386
+ ```
387
+
388
+ ### Symlink behavior differences
389
+
390
+ - On Windows, `fs.symlinkSync` for directories creates **junctions** by default, which
391
+ resolve differently than POSIX symlinks. Junction creation may require administrator
392
+ elevation on older Node.js versions.
393
+ - `fs.realpathSync` on a broken symlink throws `ENOENT` on POSIX but may throw
394
+ `EINVAL` on Windows, depending on symlink type.
395
+ - Use `test.skipIf(process.platform === 'win32')` for tests that directly manipulate
396
+ filesystem symlinks, unless the test's purpose is explicitly to verify cross-platform
397
+ symlink behavior.
398
+
399
+ ### Temporary directory patterns
400
+
401
+ - Use `os.tmpdir()` + `path.join()` for temp paths. **Never** hardcode `/tmp` or `C:\`.
402
+ - Wrap `mkdtempSync` in `realpathSync` if the result is `chdir`'d on macOS (temp
403
+ dirs are often symlinked to `/private/var/...`).
404
+ - Clean up temp dirs in `afterEach` or `afterAll` with a bounded helper that
405
+ verifies the resolved cleanup target is a child of `os.tmpdir()` before
406
+ calling recursive `rm`. Reuse `tests/helpers/safe-test-dir.ts` when possible.
407
+ Do not call recursive `rm` on a computed path unless the helper has rejected
408
+ empty strings, `os.tmpdir()` itself, and paths outside the temp root.
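The bounded-cleanup check described in the last bullet can be sketched as below. This is an assumption-laden illustration, not the contents of `tests/helpers/safe-test-dir.ts`: `isSafeCleanupTarget` is a hypothetical name, and it compares lexically resolved paths only, so a real helper may additionally want to resolve symlinks first.

```typescript
import * as path from 'node:path';

// Hypothetical guard: returns true only when `target` resolves to a path
// strictly inside `tempRoot`.
function isSafeCleanupTarget(target: string, tempRoot: string): boolean {
  // path.resolve('') would yield the current working directory, so reject early.
  if (target.trim() === '') return false;
  const root = path.resolve(tempRoot);
  const resolved = path.resolve(target);
  const rel = path.relative(root, resolved);
  if (rel === '') return false; // the temp root itself
  if (rel === '..' || rel.startsWith('..' + path.sep)) return false; // outside the root
  if (path.isAbsolute(rel)) return false; // e.g. a different drive on Windows
  return true;
}
```

In a test teardown, the second argument would typically be `os.tmpdir()`, and the recursive `rm` would run only when the guard returns `true`.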
409
+
410
+ ### Platform-specific environment variable redirection
411
+
412
+ When tests redirect `process.env.HOME` to isolate path-resolver-dependent code
413
+ (functions like `resolveHiveKnowledgePath`, `resolveSwarmKnowledgePath`, or any
414
+ function that reads `os.homedir()` / platform env vars), they MUST redirect ALL
415
+ platform-specific env vars, not just `HOME`. A partial redirect silently falls
416
+ back to the real user profile on some platforms, causing tests to read/write
417
+ actual user data instead of the isolated temp directory.
418
+
419
+ Per-platform requirements:
420
+
421
+ - **Linux**: redirect `HOME`, `XDG_CONFIG_HOME`, and `XDG_DATA_HOME`.
422
+ - **macOS**: redirect `HOME` (macOS resolves `~/Library/Application Support` from
423
+ the home directory).
424
+ - **Windows**: redirect `HOME`, `LOCALAPPDATA`, AND `APPDATA`. Windows path
425
+ resolvers read `LOCALAPPDATA` and `APPDATA`, neither of which is derived from
426
+ `HOME`. Redirecting only `HOME` silently fails on Windows, causing tests to
427
+ touch the real `%LOCALAPPDATA%` and `%APPDATA%` trees.
428
+
429
+ > **⚠️ Bun caches `os.homedir()` on first call.** If a module calls `os.homedir()`
430
+ > before the test sets `process.env.HOME`, the cached value persists for the
431
+ > lifetime of the process and later env changes are silently ignored. Set
432
+ > `process.env.HOME` (and other redirected vars) **before** importing any module
433
+ > that calls `os.homedir()`. The source code documents this at
434
+ > `src/hooks/knowledge-store.ts`: "Bun caches os.homedir(), so changing $HOME
435
+ > after first call is ignored."
436
+
437
+ Use per-variable save/restore rather than saving and replacing the entire
438
+ `process.env` object — the latter discards process-level env state and can
439
+ interfere with other test infrastructure:
440
+
441
+ ```typescript
442
+ import { beforeEach, afterEach } from 'bun:test';
443
+ import os from 'node:os';
444
+ import path from 'node:path';
445
+
446
+ const saved = {
447
+ HOME: process.env.HOME,
448
+ LOCALAPPDATA: process.env.LOCALAPPDATA,
449
+ APPDATA: process.env.APPDATA,
450
+ XDG_CONFIG_HOME: process.env.XDG_CONFIG_HOME,
451
+ XDG_DATA_HOME: process.env.XDG_DATA_HOME,
452
+ };
453
+
454
+ beforeEach(() => {
455
+ const isolatedDir = path.join(os.tmpdir(), 'test-home');
456
+ process.env.HOME = isolatedDir;
457
+ process.env.LOCALAPPDATA = isolatedDir;
458
+ process.env.APPDATA = isolatedDir;
459
+ process.env.XDG_CONFIG_HOME = isolatedDir;
460
+ process.env.XDG_DATA_HOME = isolatedDir;
461
+ });
462
+
463
+ afterEach(() => {
464
+ for (const [key, value] of Object.entries(saved)) {
465
+ if (value === undefined) delete process.env[key];
466
+ else process.env[key] = value;
467
+ }
468
+ });
469
+ ```
470
+
471
+ For cross-file isolation (tests that must survive across multiple files in the
472
+ same process, e.g. batch steps), use `beforeAll` / `afterAll` with the same
473
+ per-var save/restore pattern. Never mutate `process.env` without restoring it in
474
+ a matching teardown hook.
475
+
476
+ **Preferred approach:** Use `createIsolatedTestEnv()` from
477
+ `tests/helpers/isolated-test-env.ts`. It handles `XDG_CONFIG_HOME`, `APPDATA`,
478
+ `LOCALAPPDATA`, and `HOME` with correct per-variable save/restore and returns a
479
+ cleanup function that removes the temp directory. Use this helper unless your
480
+ test has specific requirements it doesn't cover.
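For tests that cannot use the helper, the per-variable save/restore pattern can be wrapped so that it returns its own cleanup function. The following is a minimal sketch; `redirectEnv` is a hypothetical name and is not a description of how `createIsolatedTestEnv()` is implemented. The environment object is passed in explicitly so the sketch can be exercised without touching the real `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: points every listed variable at `value` and returns a
// function that restores each one to its previous state.
function redirectEnv(env: Env, keys: readonly string[], value: string): () => void {
  // Snapshot each variable individually instead of copying the whole object.
  const saved = keys.map((key) => [key, env[key]] as const);
  for (const key of keys) env[key] = value;
  return () => {
    for (const [key, previous] of saved) {
      if (previous === undefined) delete env[key]; // variable was unset before
      else env[key] = previous;
    }
  };
}
```

In a real test, `env` would be `process.env`, the call would sit in `beforeEach` (or `beforeAll`), and the returned function would be invoked from the matching teardown hook.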
481
+
482
+ ### Line ending normalization
483
+
484
+ Git for Windows typically converts LF to CRLF on checkout (`core.autocrlf=true`). Tests that compare file contents
485
+ byte-by-byte against expected strings must normalize line endings:
486
+
487
+ ```typescript
488
+ const actual = readFileSync(path, 'utf-8').replace(/\r\n/g, '\n');
489
+ ```
490
+
491
+ ## CI Pipeline Structure
492
+
493
+ The CI runs on three platforms (ubuntu, macos, windows). Tests are split into sequential steps within each platform's job.
494
+
495
+ ```text
496
+ Step 1: hooks — per-file isolation loop on every platform
497
+ Step 2: cli — batch
498
+ Step 3: commands + config — batch
499
+ Step 4: tools — per-file isolation loop
500
+ Step 5: services + build + quality + sast + sbom + scripts — per-file isolation loop
501
+ Step 6: state + agents + knowledge + evidence + plan + misc — per-file isolation loop
502
+ ```
503
+
504
+ **Steps 1 and 4-6 use per-file isolation:** each `.test.ts` file runs in its own `bun --smol` process to prevent `mock.module()` cache poisoning (#330). Steps 2-3 run files in batch (one process per step) because they have fewer mock conflicts.
505
+
506
+ When writing a test, know which step your file will run in. In batch steps, do not assume isolation from other files in the same step.
507
+
508
+ **Job timeout: 15 minutes.** A single hanging test will kill the entire platform's test run.
509
+
510
+ ## File Placement
511
+
512
+ ### Convention
513
+
514
+ | Test type | Location | When to use |
515
+ |-----------|----------|-------------|
516
+ | Unit tests for `src/hooks/*.ts` | `tests/unit/hooks/` | Testing hook factories and hook behavior |
517
+ | Unit tests for `src/tools/*.ts` | `tests/unit/tools/` | Testing tool execute functions |
518
+ | Unit tests for `src/commands/*.ts` | `tests/unit/commands/` | Testing CLI command handlers |
519
+ | Unit tests for `src/config/*.ts` | `tests/unit/config/` | Testing schema validation, config loading |
520
+ | Unit tests for `src/agents/*.ts` | `tests/unit/agents/` | Testing agent prompt generation, factory logic |
521
+ | Colocated tests | `src/**/*.test.ts` | Integration-style tests tightly coupled to the source module |
522
+ | Integration tests | `tests/integration/` | Cross-module workflows, plugin initialization |
523
+ | Security tests | `tests/security/` | Adversarial input handling, injection resistance |
524
+ | Smoke tests | `tests/smoke/` | Built package validation |
525
+
526
+ ### Naming
527
+
528
+ - Base test: `<module>.test.ts`
529
+ - Adversarial variant: `<module>.adversarial.test.ts`
530
+
531
+ Only create an adversarial variant if it tests **distinct attack vectors** not covered by the base test. Do not duplicate base test assertions with different inputs — that's redundancy, not security coverage.
532
+
533
+ ### Regression tests (review-surfaced bugs)
534
+
535
+ When fixing a bug surfaced by code review, swarm review, or post-merge audit, **always add a regression test** with the following shape so the test's purpose survives future cleanup:
536
+
537
+ ```typescript
538
+ describe('<feature> — regression: <one-line description> (F#)', () => {
539
+ it('<exact behavior the bug violated>', () => {
540
+ // Previous code did <bad thing>: e.g. the regex `/^\.\/+/` only stripped
541
+ // a single leading `./`, so `././util.ts` survived as `./util.ts`.
542
+ expect(normalizeGraphPath('././util.ts')).toBe('util.ts');
543
+ });
544
+ });
545
+ ```
546
+
547
+ Rules:
548
+ - The describe label includes the original finding ID (e.g. `F8`, `F9`, `F1.1`) so future readers can map back to the review.
549
+ - The leading comment in the body explains the **prior buggy behavior** in concrete terms — what the code did before, not what it does now.
550
+ - One regression test per finding. Do not pile unrelated assertions into a single regression block.
551
+
552
+ Examples in-tree: `tests/unit/graph/graph-query.test.ts`, `tests/unit/graph/import-extractor.test.ts`, `tests/unit/graph/graph-store.test.ts`.
553
+
554
+ ### Guardrail Authority Tests
555
+
556
+ When testing `src/hooks/guardrails/file-authority.ts` or similar ordered
557
+ authority checks:
558
+
559
+ - Test the specific allow/deny rule under review, not just the final denial. A
560
+ later deny rule such as `blockedPrefix` can mask a bad earlier allow match.
561
+ - For case-sensitive glob behavior, place negative cases outside default blocked
562
+ prefixes or use a custom agent with no other deny rules and explicit
563
+ `allowedPrefix: []`. Include a positive case that the case-sensitive glob
564
+ allows, and for negative cases assert the denial reason is the allowlist
565
+ fallback (for example, `not in allowed list`) so the test proves the glob did
566
+ not match.
567
+ - For generated-zone precedence, include at least one case where the filename
568
+ matches the newly allowed convention under `dist/` or `build/`.
569
+ - For custom authority arrays, pin whether the array replaces or extends defaults
570
+ with tests for both an empty array and a custom non-empty array when the
571
+ semantics matter.
572
+ - For matcher caches or other shared state, test both priming orders when the
573
+ selected behavior depends on mode, platform, or prior calls.
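
The masking problem in the first bullet can be sketched with a simplified, hypothetical evaluator. `checkAuthority`, its rule shape, and the `allowCaseSensitive` switch are assumptions made for illustration; the real logic in `src/hooks/guardrails/file-authority.ts` may be structured differently. The point is that both a buggy and a correct allow matcher produce a denial, and only the denial reason tells them apart.

```typescript
// Hypothetical, simplified stand-in for an ordered authority check.
interface AuthorityRules {
  allowedPrefix: string[];
  blockedPrefix: string[];
}

interface Decision {
  allowed: boolean;
  reason: string;
}

function checkAuthority(
  filePath: string,
  rules: AuthorityRules,
  allowCaseSensitive: boolean,
): Decision {
  const matchesAllow = (prefix: string): boolean =>
    allowCaseSensitive
      ? filePath.startsWith(prefix)
      : filePath.toLowerCase().startsWith(prefix.toLowerCase());
  // Earlier rule: allowlist match.
  if (!rules.allowedPrefix.some(matchesAllow)) {
    return { allowed: false, reason: 'not in allowed list' };
  }
  // Later rule: blocked prefixes, which can hide a wrong allow match above.
  const blocked = rules.blockedPrefix.find((prefix) => filePath.startsWith(prefix));
  if (blocked !== undefined) {
    return { allowed: false, reason: `blocked prefix: ${blocked}` };
  }
  return { allowed: true, reason: 'allowed' };
}

const rules: AuthorityRules = { allowedPrefix: ['src/'], blockedPrefix: ['SRC/'] };
const buggy = checkAuthority('SRC/app.ts', rules, false);
const fixed = checkAuthority('SRC/app.ts', rules, true);
console.log(buggy.allowed, buggy.reason); // false blocked prefix: SRC/
console.log(fixed.allowed, fixed.reason); // false not in allowed list
```

A test that only asserts `allowed === false` passes for both variants; asserting on the reason is what proves the case-sensitive glob did not match.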
574
+
575
+ ## Cross-Entry Invariants (config maps)
576
+
577
+ When you modify any entry of a "map of agents/tools/roles" in `src/config/constants.ts` (`AGENT_TOOL_MAP`, `DEFAULT_MODELS`, `QA_AGENTS`, `PIPELINE_AGENTS`, etc.) or a tool-name registration in `src/tools/tool-names.ts`, keep in mind that existing tests assert **parity across sibling entries**, not just the shape of one entry.
578
+
579
+ Known parity assertions:
580
+
581
+ | Test | Invariant |
582
+ |---|---|
583
+ | `tests/unit/config/critic-registration.test.ts` | critic sibling maps include required shared tools such as `get_approved_plan` |
584
+ | `tests/unit/config/agent-tool-map.test.ts` | architect has broader access than subagents, and subagent tool lists stay bounded |
585
+ | `tests/unit/config/constants.test.ts` | declared agents, default models, and tool metadata stay coherent |
586
+
587
+ Workflow when adding a tool to a single agent:
588
+ 1. Add the entry.
589
+ 2. Run `bun --smol test tests/unit/config --timeout 60000` **before pushing**.
590
+ 3. If a parity test fails, decide: mirror the change to sibling agents, or update the invariant test if the design intent has actually changed.
591
+ 4. To inspect runtime shape quickly: `bun -e "import { AGENT_TOOL_MAP } from './src/config/constants.ts'; for (const [k,v] of Object.entries(AGENT_TOOL_MAP)) console.log(k, v.length);"`
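
A parity assertion of this kind can be sketched as follows. The map contents and the agent name `critic_sounding_board` are hypothetical sample data, and `findParityGaps` is an illustrative helper, not a function from the repository; the actual invariant tests may be written differently.

```typescript
// Hypothetical sample data: the real AGENT_TOOL_MAP lives in
// src/config/constants.ts and its agent and tool names may differ.
const exampleToolMap: Record<string, string[]> = {
  critic: ['get_approved_plan', 'read_file'],
  critic_sounding_board: ['read_file'],
};

// Returns, for each sibling agent, the required shared tools it is missing.
function findParityGaps(
  toolMap: Record<string, string[]>,
  siblings: string[],
  requiredTools: string[],
): Record<string, string[]> {
  const gaps: Record<string, string[]> = {};
  for (const agent of siblings) {
    const tools = new Set(toolMap[agent] ?? []);
    const missing = requiredTools.filter((tool) => !tools.has(tool));
    if (missing.length > 0) gaps[agent] = missing;
  }
  return gaps;
}

// critic_sounding_board is reported as missing get_approved_plan.
console.log(
  findParityGaps(exampleToolMap, ['critic', 'critic_sounding_board'], ['get_approved_plan']),
);
```

A parity test would then assert that the returned object is empty, which fails as soon as one sibling gains or loses a shared tool without the others.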
592
+
593
+ ## Debugging CI failures
594
+
595
+ When CI reports a `unit (ubuntu|macos|windows)` failure:
596
+
597
+ 1. **Identify the actual failing test from the job log first.** Do not assume it's a pre-existing failure based on a local repro of a different test. Open the failing job's URL and find the `<file>:<line>` in the Bun output. WebFetch can scrape this if the `gh` CLI isn't available.
598
+ 2. **Reproduce that exact file locally:** `bun --smol test tests/unit/<dir>/<file>.test.ts --timeout 30000`.
599
+ 3. **Then check if the same failure reproduces on `main`.** If yes, document as pre-existing in the PR description and continue with your branch's work; do not silently inherit the failure.
600
+ 4. **For `package-check` failures:** `package-check` validates the npm tarball (`npm pack` + tarball contents). A failing `package-check` is a source/build/package-manifest problem, not generated-file drift. `dist/` is generated and NOT committed — do not stage it; run `bun run build` locally only when you need the bundle. There is no longer a committed-dist drift check.
601
+
602
+ ## Test Quality Standards
603
+
604
+ ### DO
605
+
606
+ - Test real behavior: call the actual function with real inputs, assert on real outputs.
607
+ - Test error paths: what happens with `null`, `undefined`, empty string, oversized input?
608
+ - Use temp directories (`fs.mkdtemp`) for file I/O tests. Clean up in `afterEach`.
609
+ - Assert on specific values, not just truthiness: `expect(result.status).toBe('pending')` not `expect(result).toBeTruthy()`.
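
The temp-directory rule can be sketched with a small helper. `withTempDir` is hypothetical and uses the synchronous `fs.mkdtempSync` with `try`/`finally` so the snippet stands alone; in a `bun:test` file the creation would typically sit in `beforeEach` and the removal in `afterEach`, as the rule above says.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: creates a unique temp directory, runs the callback,
// and removes the directory even if the callback throws.
function withTempDir<T>(prefix: string, fn: (dir: string) => T): T {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  try {
    return fn(dir);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

const status = withTempDir('swarm-test-', (dir) => {
  const file = path.join(dir, 'plan.json');
  fs.writeFileSync(file, JSON.stringify({ status: 'pending' }));
  // Read the real file back and return a specific value to assert on.
  return JSON.parse(fs.readFileSync(file, 'utf8')).status as string;
});
console.log(status); // pending
```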
610
+
611
+ ### DO NOT
612
+
613
+ - **Do not test type definitions.** `expect(event.type === 'foo').toBe(true)` tests TypeScript, not your code.
614
+ - **Do not test framework behavior.** "Zod schema parses valid input" tests Zod, not your schema.
615
+ - **Do not test test utilities.** If it only exists to support other tests, it doesn't need its own test.
616
+ - **Do not mock everything.** If every dependency is mocked, you're testing the mock setup. Prefer real dependencies for pure functions and only mock I/O boundaries (filesystem, network, timers).
617
+
618
+ ### Anchored Content Assertions
619
+
620
+ When asserting that skill files, protocol docs, or structured markdown contain expected text, **anchor your assertions to the relevant section** rather than using bare `toContain()` on the full file content:
621
+
622
+ ```typescript
623
+ // WEAK — passes even if the word appears in prose outside the intended section
624
+ expect(content).toContain('DROP');
625
+
626
+ // STRONG — fails if the structured section is removed or relocated
627
+ const stage3Start = content.indexOf('#### Stage 3: Consult Critic Sounding Board');
628
+ const stage4Start = content.indexOf('#### Stage 4: Surface User Decision Packet');
+ // Guard against a missing heading: indexOf returns -1, which slice() would
+ // interpret as an offset from the end of the file.
+ expect(stage3Start).toBeGreaterThanOrEqual(0);
+ expect(stage4Start).toBeGreaterThan(stage3Start);
629
+ const stage3Section = content.slice(stage3Start, stage4Start);
630
+ expect(stage3Section).toContain('DROP');
631
+ expect(stage3Section).toContain('ASK_USER');
632
+ ```
633
+
634
+ **Why this matters:** A bare `toContain('DROP')` passes as long as the word appears anywhere in the file. If the structured outcomes section is deleted but a prose reference remains (e.g., "The critic may DROP irrelevant items"), the test still passes — silently hiding the removal. Section-anchored assertions fail when the content is actually removed from its intended location.
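
The slicing step can be wrapped in a helper so that a missing heading fails loudly. This is a sketch under assumptions: `sectionBetween` is a hypothetical name, not an existing utility in the repository. It matters because `indexOf` returns `-1` for a missing heading and `slice` treats a negative index as an offset from the end of the string.

```typescript
// Hypothetical helper: returns the text from startHeading up to (not including)
// endHeading, and throws if either heading is missing or out of order.
function sectionBetween(content: string, startHeading: string, endHeading: string): string {
  const start = content.indexOf(startHeading);
  if (start === -1) throw new Error(`start heading not found: ${startHeading}`);
  const end = content.indexOf(endHeading, start + startHeading.length);
  if (end === -1) throw new Error(`end heading not found after start: ${endHeading}`);
  return content.slice(start, end);
}

const doc = [
  '#### Stage 3: Consult Critic Sounding Board',
  'Outcomes: DROP, ASK_USER',
  '#### Stage 4: Surface User Decision Packet',
  'The critic may DROP irrelevant items.',
].join('\n');

const stage3 = sectionBetween(
  doc,
  '#### Stage 3: Consult Critic Sounding Board',
  '#### Stage 4: Surface User Decision Packet',
);
console.log(stage3.includes('ASK_USER')); // true
console.log(stage3.includes('irrelevant')); // false
```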
635
+
636
+ Use this pattern for:
637
+ - Critic outcome mappings in skill files (DROP, ASK_USER, RESOLVE, REPHRASE)
638
+ - Classification category lists (self_resolved, user_decision, etc.)
639
+ - Any structured section where a word must appear in a specific location, not merely somewhere in the file
640
+
+ The following rules continue the DO NOT list above:
+
+ - **Do not hardcode version numbers.** Version bumps are automated, so a test asserting `version === '6.31.3'` breaks on every release.
641
+ - **Do not use `sleep` or `setTimeout` for synchronization.** Prefer explicit signals or resolved promises; if a delay is unavoidable, use `Bun.sleep()` with tight bounds.
642
+ - **Do not spawn `cat /dev/zero`, `yes`, or other infinite-output commands.** Use `sleep 30` for "blocking command" tests.
643
+
644
+ ## Documented-Example Regression Tests
645
+
646
+ When a SKILL.md (or other agent-facing document) contains an **executable example** — a tool invocation with concrete arguments, a parser output with specific field values, a protocol transcript, or any output whose shape and values are runnable — write a test that executes the actual implementation on synthetic data and compares the result **field by field** to the documented example. Place the test file at `tests/unit/skills/<skill-name>-dry-run.test.ts` (or the analogous path for the tool/parser being tested).
647
+
648
+ **Why this matters:** Documented examples drift from the runtime they describe, and the drift is often subtle enough to survive casual review. Common failure modes include field-name drift (`ok` present vs. absent; `parse_errors: 0` vs. `parse_errors: 2`), refusal-shape drift (`invocation_envelope: null` in the example when the real shape is populated), value-level drift (`row_index: 1` 1-indexed in prose when the parser emits 0-indexed), and field-presence drift (new required fields added to an interface but omitted from the example). A field-by-field comparison test catches all of these on every CI run.
649
+
650
+ **Concrete protocol:**
651
+
652
+ 1. Locate the executable example in the SKILL.md (tool call, parser output, protocol transcript, etc.).
653
+ 2. Construct synthetic data that matches the example's input shape.
654
+ 3. Run the actual implementation (parser, tool, protocol handler) on the synthetic data.
655
+ 4. Assert field-by-field equality between the actual output and the documented example using `bun:test`'s `toEqual` (deep-equality). Do not use loose string matching.
656
+ 5. Iterate on the example (or fix the implementation) until every field matches.
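
When a `toEqual` assertion fails, it helps to see exactly which fields drifted. The sketch below is a hypothetical diagnostic helper, `diffFields`, that is not part of the repository; the sample objects reuse field names mentioned in this section with illustrative values. It complements the `toEqual` assertion rather than replacing it.

```typescript
// Hypothetical helper: lists every field path where `actual` and `documented`
// disagree, including fields present on only one side.
function diffFields(actual: unknown, documented: unknown, path = ''): string[] {
  const isObject = (v: unknown): v is Record<string, unknown> =>
    typeof v === 'object' && v !== null;
  if (!isObject(actual) || !isObject(documented)) {
    return Object.is(actual, documented) ? [] : [path || '(root)'];
  }
  if (Array.isArray(actual) !== Array.isArray(documented)) return [path || '(root)'];
  const keys = Array.from(new Set(Object.keys(actual).concat(Object.keys(documented))));
  const mismatches: string[] = [];
  for (const key of keys) {
    const childPath = path ? `${path}.${key}` : key;
    if (!(key in actual) || !(key in documented)) {
      mismatches.push(childPath); // field-presence drift
    } else {
      mismatches.push(...diffFields(actual[key], documented[key], childPath));
    }
  }
  return mismatches;
}

// Illustrative values only.
const runtimeOutput = { invocation_envelope: { parse_errors: 2 }, row_index: 0 };
const documentedExample = { invocation_envelope: { parse_errors: 0 } };
console.log(diffFields(runtimeOutput, documentedExample));
// lists 'invocation_envelope.parse_errors' and 'row_index'
```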
657
+
658
+ > **Working example:** `tests/unit/skills/swarm-pr-review-dry-run.test.ts` exercises the `swarm-pr-review` SKILL.md dry-run transcript (lines 866–1050) against the live `parse_lane_candidates` implementation. That test survived four review cycles to align the documentation with runtime output. Drift caught during those cycles included: `invocation_envelope.parse_errors` was `0` in the example but actually `2` (FR-017 both-discriminators detection); `invocation_envelope` was `null` on refusal in the example but actually populated; `sidecar_write_error: undefined` is not valid JSON and had to be replaced with an explicit value; `parse_error_details` field paths and message strings did not match the parser source.
659
+
660
+ **When NOT to use this pattern:**
661
+ - Skills without executable examples (pure conceptual guidance with no runnable artifact).
662
+ - Examples that are intentionally schematic ("the response looks roughly like this") rather than literal.
663
+ - Documentation that is auto-generated from source — drift is impossible by construction in that case.
664
+
665
+ ## Cross-Platform Requirements
666
+
667
+ > **See also**: [Cross-Platform Test Patterns](#cross-platform-test-patterns) above for detailed
668
+ > guidance on mock keys, symlink behavior, temp directories, and line endings.
669
+
670
+ All tests must pass on Linux, macOS, and Windows unless explicitly gated with:
671
+ ```typescript
672
+ const isWindows = process.platform === 'win32';
673
+ if (isWindows) test.skip('reason', () => {});
674
+ ```
675
+
676
+ ### Path handling
677
+ - Use `path.join()` or `path.resolve()`, never string concatenation with `/`.
678
+ - Temp directories: use `os.tmpdir()`, not hardcoded `/tmp`.
679
+ - File comparisons: normalize paths before comparing (`path.resolve(a) === path.resolve(b)`).
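
The three path rules can be sketched together. `tempFilePath` and `samePath` are hypothetical helper names used only for illustration.

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// os.tmpdir() instead of a hardcoded '/tmp'; path.join instead of '/' concatenation.
function tempFilePath(...segments: string[]): string {
  return path.join(os.tmpdir(), ...segments);
}

// Resolve both sides so './x', 'x', and 'y/../x' compare equal.
function samePath(a: string, b: string): boolean {
  return path.resolve(a) === path.resolve(b);
}

console.log(samePath('./src/index.ts', 'src/utils/../index.ts')); // true
console.log(path.basename(tempFilePath('swarm', 'out.json'))); // out.json
```

Note that `path.resolve` only normalizes the string; it does not follow symlinks, so symlink-sensitive comparisons need the guidance in the cross-platform patterns section referenced above.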
680
+
681
+ ### Process spawning
682
+ - Use `.cmd` extension on Windows for npm/bun binaries: `process.platform === 'win32' ? 'bun.cmd' : 'bun'`.
683
+ - Use array-form `spawn`/`spawnSync`, never shell string commands.
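
A minimal sketch of both spawning rules, assuming hypothetical helpers `binaryFor` and `runBun`. This sketch has not been verified on Windows: recent Node.js releases reject `.cmd` and `.bat` targets in `spawn`/`spawnSync` unless the `shell` option is set, so check the behavior under the runtime your tests actually use.

```typescript
import { spawnSync } from 'node:child_process';

// Picks the platform-specific binary name, following the rule above.
function binaryFor(name: string, platform: string = process.platform): string {
  return platform === 'win32' ? `${name}.cmd` : name;
}

// Hypothetical wrapper: arguments are passed as an array, never as a shell string,
// so paths with spaces or quotes need no escaping.
function runBun(args: string[], cwd: string) {
  return spawnSync(binaryFor('bun'), args, { cwd, encoding: 'utf8' });
}

console.log(binaryFor('bun', 'win32')); // bun.cmd
console.log(binaryFor('bun', 'linux')); // bun
```

A test would call something like `runBun(['--version'], tempDir)` and assert on `status` and `stdout` of the returned result.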
684
+
685
+ ## Running Tests
686
+
687
+ ### bash (Linux / macOS)
688
+
689
+ ```bash
690
+ # Single file
691
+ bun test src/hooks/scope-guard.test.ts
692
+
693
+ # Batch directory (safe for dirs without mock conflicts)
694
+ bun --smol test tests/unit/hooks --timeout 30000
695
+
696
+ # Per-file loop (required for tools/services/agents — prevents mock poisoning)
697
+ for f in tests/unit/tools/*.test.ts; do bun --smol test "$f" --timeout 30000; done
698
+
699
+ # CI-equivalent run for batch steps
700
+ bun --smol test tests/unit/cli --timeout 120000
701
+ bun --smol test tests/unit/commands tests/unit/config --timeout 120000
702
+ ```
703
+
704
+ ### PowerShell (Windows)
705
+
706
+ ```powershell
707
+ # Single file
708
+ bun test src/hooks/scope-guard.test.ts
709
+
710
+ # Batch directory (safe for dirs without mock conflicts)
711
+ bun --smol test tests/unit/hooks --timeout 30000
712
+
713
+ # Per-file loop (required for tools/services/agents — prevents mock poisoning)
714
+ Get-ChildItem tests/unit/tools/*.test.ts | ForEach-Object { bun --smol test $_.FullName --timeout 30000 }
715
+
716
+ # CI-equivalent run for batch steps
717
+ bun --smol test tests/unit/cli --timeout 120000
718
+ bun --smol test tests/unit/commands tests/unit/config --timeout 120000
719
+
720
+ # Capture output to file (avoids truncation when output is large)
721
+ bun --smol test tests/unit/agents --timeout 60000 | Out-File "$env:TEMP\test_out.txt"; Get-Content "$env:TEMP\test_out.txt" | Select-Object -Last 50
722
+ ```
723
+
724
+ **Note:** `for f in ...; do` bash syntax is invalid in PowerShell. Use `Get-ChildItem | ForEach-Object` instead. `Select-String -Last N` is also invalid — use `Select-Object -Last N`.
725
+
726
+ **Warning:** Running `bun --smol test tests/unit/tools` as a single batch will cause mock poisoning failures. Always use the per-file loop for directories that CI runs with per-file isolation (steps 1 and 4-6: hooks, tools, services, agents, etc.).
727
+
728
+ The `--smol` flag reduces Bun's memory footprint. Use it when running large directories (50+ files).
729
+
730
+ The `--timeout 120000` flag sets the per-test timeout to 120 seconds (the value is in milliseconds). Individual tests should complete in under 5 seconds. If a test needs more than 10 seconds, it is doing too much: split it or mock the slow dependency.
731
+
732
+ ## Before Submitting
733
+
734
+ 1. Run the tests for your changed files: `bun test path/to/your.test.ts`
735
+ 2. Run the full CI group your tests belong to (see pipeline structure above)
736
+ 3. Verify no `process.cwd()` usage — use the `directory` parameter from `createSwarmTool` or hook constructor
737
+ 4. Verify no hardcoded paths (`/tmp/...`, `C:\...`) — use `os.tmpdir()` + `path.join()`
738
+ 5. Verify mocks are restored in `afterEach` if using `spyOn` or `mock.module`
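
Checks 3 and 4 can be partly automated. The sketch below is a hypothetical pre-submit scanner, `scanTestSource`, with illustrative and deliberately non-exhaustive patterns; it is not an existing script in the repository.

```typescript
// Hypothetical pre-submit check; the patterns are illustrative, not exhaustive.
interface Finding {
  line: number;
  rule: string;
}

const RULES: Array<{ rule: string; pattern: RegExp }> = [
  { rule: 'process.cwd() usage', pattern: /process\.cwd\(\)/ },
  { rule: 'hardcoded POSIX temp path', pattern: /['"`]\/tmp\// },
  { rule: 'hardcoded Windows drive path', pattern: /['"`][A-Za-z]:\\/ },
];

// Scans source text line by line and reports 1-based line numbers.
function scanTestSource(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule });
    }
  });
  return findings;
}

const sample = [
  "const dir = path.join(os.tmpdir(), 'case');",
  "const bad = '/tmp/case';",
  'const cwd = process.cwd();',
].join('\n');
console.log(scanTestSource(sample).map((f) => `${f.line}: ${f.rule}`));
// reports line 2 (hardcoded POSIX temp path) and line 3 (process.cwd() usage)
```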
739
+
740
+ ## Known Pre-existing Test Failures
741
+
742
+ The following test failures are pre-existing and unrelated to mock isolation:
743
+
744
+ | Test file | Failures | Cause | Status |
745
+ |-----------|----------|-------|--------|
746
+ | `tests/unit/hooks/full-auto-intercept.test.ts` | 21/37 | `logger.log` returns early without `OPENCODE_SWARM_DEBUG=1` | Pre-existing |
747
+ | `tests/unit/hooks/full-auto-intercept.dispatch.test.ts` | 2/46 | Same logger issue | Pre-existing |
748
+ | `tests/unit/commands/help-compound-commands.test.ts` | Multiple | Command routing issues | Pre-existing |
749
+ | `tests/unit/commands/index.test.ts` | Multiple | Command routing issues | Pre-existing |
750
+ | `tests/unit/commands/issue-command.test.ts` | Multiple | Command routing issues | Pre-existing |
751
+ | `src/__tests__/preflight-phase.test.ts` | 3/3 | `loadPlan` called twice per invocation (lines 930 + 545) | Bug exposed by cleanup |
752
+ | `tests/unit/agents/architect-sounding-board-protocol.adversarial.test.ts` | 1 | Token budget threshold `35000` exceeded by prompt growth; soft regression indicator that prompt size needs attention | Pre-existing |
753
+
754
+ ## Known Cross-module mock.module Locations
755
+
756
+ The following directories contain test files that use cross-module `mock.module` (permitted under the two-tier convention):
757
+
758
+ - `tests/unit/commands/` — mocks tools, hooks, services, state
759
+ - `tests/unit/hooks/` — mocks knowledge-store, knowledge-validator, knowledge-reader, telemetry, utils
760
+ - `tests/unit/tools/` — mocks Node built-ins (fs, child_process), sast-baseline, build/discovery
761
+ - `tests/unit/services/` — mocks path-security
762
+ - `tests/unit/config/` — mocks node:fs/promises
763
+ - `tests/unit/background/` — mocks utils, event-bus, evidence-summary-service
764
+ - `tests/unit/council/` — mocks node:fs
765
+ - `tests/unit/plan/` — mocks spec-hash
766
+ - `tests/unit/mutation/` — mocks node:child_process
767
+ - `tests/unit/git/` — mocks node:child_process
768
+ - `tests/integration/` — mocks co-change-analyzer, knowledge-store
769
+ - `src/__tests__/` — mocks plan/manager, preflight-service, telemetry
770
+ - `src/hooks/` — mocks logger, event-bus
771
+ - `src/tools/__tests__/` — mocks test-impact/analyzer, build/discovery, path-security
772
+ - `src/mutation/__tests__/` — mocks state
773
+ - `src/agents/` — mocks node:fs/promises
774
+ - `src/background/` — mocks vulnerability trigger
775
+
776
+ ## Dead-code _internals Seams
777
+
778
+ The following source modules export `_internals` but have no test consumers (as of this writing). They are harmless but may be removed in future cleanup:
779
+
780
+ - `src/tools/secretscan.ts`
781
+ - `src/tools/knowledge-recall.ts`
782
+ - `src/tools/lint.ts`
783
+ - `src/tools/sast-scan.ts`
784
+ - `src/tools/sast-baseline.ts`
785
+ - `src/mutation/gate.ts`
786
+ - `src/mutation/equivalence.ts`
787
+ - `src/mutation/engine.ts`
788
+ - `src/db/qa-gate-profile.ts`
789
+ - `src/config/schema.ts`
790
+ - `src/config/index.ts`
791
+ - `src/commands/registry.ts`
792
+ - `src/background/manager.ts`
793
+ - `src/background/event-bus.ts`
794
+ - `src/agents/critic.ts`