pi-crew 0.5.13 → 0.5.16
- package/CHANGELOG.md +139 -0
- package/README.md +1 -1
- package/docs/pi-crew-v0.5.14-audit-fix-plan.md +75 -0
- package/docs/pi-crew-v0.5.16-audit-fix-plan.md +35 -0
- package/docs/pi-crew-v0.5.17-audit-fix-plan.md +80 -0
- package/docs/skills/REFERENCE.md +11 -0
- package/package.json +1 -1
- package/skills/iterative-audit/SKILL.md +330 -0
- package/src/extension/management.ts +1 -1
- package/src/extension/plan-orchestrate.ts +0 -1
- package/src/extension/register.ts +16 -7
- package/src/extension/registration/viewers.ts +1 -1
- package/src/extension/run-index.ts +1 -1
- package/src/extension/team-tool/explain.ts +0 -1
- package/src/extension/team-tool/handle-schedule.ts +0 -1
- package/src/extension/team-tool/health-monitor.ts +0 -1
- package/src/extension/team-tool/run.ts +2 -2
- package/src/extension/team-tool/status.ts +1 -1
- package/src/extension/team-tool.ts +2 -30
- package/src/observability/exporters/otlp-exporter.ts +11 -1
- package/src/runtime/checkpoint.ts +19 -0
- package/src/runtime/child-pi.ts +1 -1
- package/src/runtime/crash-recovery.ts +1 -1
- package/src/runtime/crew-agent-records.ts +23 -3
- package/src/runtime/crew-hooks.ts +1 -1
- package/src/runtime/handoff-manager.ts +0 -1
- package/src/runtime/heartbeat-watcher.ts +1 -1
- package/src/runtime/live-session-runtime.ts +0 -1
- package/src/runtime/loop-gates.ts +0 -1
- package/src/runtime/mcp-proxy.ts +2 -2
- package/src/runtime/pipeline-runner.ts +1 -2
- package/src/runtime/task-runner/live-executor.ts +1 -2
- package/src/runtime/task-runner.ts +1 -1
- package/src/state/jsonl-writer.ts +24 -0
- package/src/state/locks.ts +66 -35
- package/src/state/run-metrics.ts +1 -2
- package/src/state/schedule.ts +13 -5
- package/src/state/state-store.ts +1 -1
- package/src/tools/safe-bash.ts +0 -1
- package/src/ui/crew-widget.ts +2 -2
- package/src/ui/render-diff.ts +1 -1
- package/src/ui/run-dashboard.ts +1 -2
- package/src/ui/tool-render.ts +20 -3
- package/src/utils/conflict-detect.ts +0 -1
- package/src/utils/gh-protocol.ts +0 -2
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,144 @@
 # Changelog
 
+## [0.5.16] — Rounds 22–31 Audit Fixes (2026-06-02)
+
+### Highlights
+- **1 bug fix**: OTLP exporter `dispose()` now awaits in-flight push (bounded by 10s timeout)
+- **269 new unit tests** across 16 previously-untested modules (Pattern #3)
+- **72 unused imports removed** across 28 source files (Pattern #6)
+- **2 defensive caps** for unbounded Maps (Pattern #2)
+- **1 L1 fix**: `console.warn` → `logInternalError` in crew-hooks
+
+### Round 22: Defensive Caps (commit 85b3be6)
+- Bounded `autoRecoveryLast` and `agentEventSeqCache` Maps to 1000 entries
+- Eviction uses insertion-order oldest-first pattern
+
+### Round 23: Resource Cleanup (commit 4be2c4e)
+- OTLP exporter `dispose()` now async, awaits in-flight push with 10s timeout
+- Surveyed all setInterval/setTimeout, process.on, file watchers, event listeners, AbortControllers — all clean
+
+### Round 24: Test Coverage — discover-agents, markers, tiered-eval (commit cfe5242)
+- 50 new tests: `sanitizeAgentSystemPrompt` (6 rules), `sanitizeGuidanceContent` (5 rules), `TieredEvalRunner` class
+
+### Round 25: Test Coverage — adaptive-plan, group-join (commit 89e1cf1)
+- 42 new tests: `slug`, `extractAdaptivePlanJson`, `parseAdaptivePlan`, `repairAdaptivePlan`, `GroupJoinManager`
+
+### Round 26: Test Coverage — pi-args, i18n (commit 3669f24)
+- 38 new tests: `applyThinkingSuffix`, `resolveCrewMaxDepth`, `t()`, `addTranslations`, `listLocales`
+
+### Round 27: Test Coverage — validation-types, live-extension-bridge (commit 44a2366)
+- 36 new tests: `validateWithSeverity` strict/lenient modes, `buildExtensionBridge` mock session
+
+### Round 28: Test Coverage — direct-run, live-session-health (commit 339ac7d)
+- 17 new tests: `isDirectRun`, `directTeamAndWorkflowFromRun`, `collectLiveSessionHealth`
+
+### Round 29: Test Coverage — process-status, task-claims (commit 405e05d)
+- 43 new tests: `checkProcessLiveness`, `isActiveRunStatus`, full claim lifecycle
+
+### Round 30: Test Coverage — task-display, green-contract, session-utils (commit 7d065ca)
+- 43 new tests: `shouldMaterializeAgent`, `taskById`, `waitingReason`, `greenLevelSatisfies`, `assertValidSessionId`
+
+### Round 31: Code Quality — unused imports + L1 fix (commit 35cc0e7)
+- 72 unused imports removed across 28 source files
+- `crew-hooks.ts`: `console.warn` → `logInternalError` for unknown event types
+
+### Stats
+- Test suite: 2657 pass + 1 skip, 0 fail (was 2370 in v0.5.14; +287 net)
+- TypeScript: 0 errors
+- New test files: 13
+- Files touched: 58
+
+## [0.5.15] — Round 20 + 21 Audit Fixes (2026-06-02)
+
+### Source tour
+- Pulled latest `can1357/oh-my-pi` (1751 new commits since 2026-05-11) to working copy
+- Surveyed extensibility, skill system, and security/performance changes via 3 parallel explorer agents
+- Distilled 2 high-impact, immediately applicable patterns (Round 20)
+- Identified 5 more upgrade opportunities; applied 5 in Round 21
+
+### Round 20: Lock token guard + tool-error sanitization (commit f448d7d)
+
+#### 1. Per-process lock tokens (src/state/locks.ts)
+- **Pattern source**: oh-my-pi commit `cd578a86d` (`file-lock.ts:13-152`)
+- **Bug fixed**: "Losing contender wipes winner's lock" race when one process times out and steals a stale lock that the original holder is about to release
+- Lock file now carries a UUID token. `releaseLock` refuses to `fs.rm` unless the stored token matches.
+- 3 new tests in `test/unit/locks-race.test.ts`
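The token guard described above can be sketched as follows. This is a minimal illustration and not the code in `src/state/locks.ts`: the `acquireLock` helper, the lock-file shape, and the temp-directory demo are assumptions made for the example. Only the rule "release refuses to remove the file unless the stored token matches" comes from the changelog.

```typescript
import { randomUUID } from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical lock-file shape; the real file may carry more fields.
interface LockFile {
  token: string;
}

/** Try to create the lock file exclusively. Returns our token, or undefined if the lock is held. */
function acquireLock(lockPath: string): string | undefined {
  const token = randomUUID();
  const body: LockFile = { token };
  try {
    // "wx" fails with EEXIST when the file already exists.
    fs.writeFileSync(lockPath, JSON.stringify(body), { flag: "wx" });
    return token;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return undefined;
    throw err;
  }
}

/** Remove the lock only if it still carries our token. Returns true when the file was removed. */
function releaseLock(lockPath: string, token: string): boolean {
  let stored: Partial<LockFile>;
  try {
    stored = JSON.parse(fs.readFileSync(lockPath, "utf8"));
  } catch {
    return false; // missing or unreadable lock: nothing of ours to release
  }
  if (stored.token !== token) return false; // another process owns the lock now
  fs.rmSync(lockPath, { force: true });
  return true;
}

// Demo: the original holder cannot delete a lock that was stolen in the meantime.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-sketch-"));
const lockPath = path.join(dir, "run.lock");
const mine = acquireLock(lockPath);
```

Note that the sketch still has a small window between reading the token and removing the file; it only shows the ownership check, not a complete locking protocol.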
+
+#### 2. Tool-error sanitization (src/ui/tool-render.ts)
+- **Pattern source**: oh-my-pi `render-utils.ts:177-185` (`replaceTabs(truncateToWidth(clean, LINE_CAP))`)
+- **Bug fixed**: Embedded tabs/newlines/long strings in tool errors break TUI border alignment
+- Applied to `renderAgentProgress` and `renderAgentToolResult` (2 places)
+- `replaceTabs` is now exported from `src/ui/render-diff.ts` for reuse
+- 2 new tests in `test/unit/tool-render.test.ts`
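A rough sketch of the sanitization pipeline named above, under stated assumptions: the bodies of `replaceTabs` and `truncateToWidth` here are stand-ins written for this example (the real helpers may measure terminal display width rather than code points), and the `LINE_CAP` value and the `sanitizeToolError` wrapper are hypothetical.

```typescript
// Hypothetical cap; the actual LINE_CAP value is not given in the changelog.
const LINE_CAP = 120;

/** Stand-in: expand each tab to a fixed number of spaces. */
function replaceTabs(text: string, spaces: number = 2): string {
  return text.replace(/\t/g, " ".repeat(spaces));
}

/** Stand-in: truncate by code points and mark the cut with an ellipsis. */
function truncateToWidth(text: string, width: number): string {
  const chars = Array.from(text);
  if (chars.length <= width) return text;
  return chars.slice(0, Math.max(0, width - 1)).join("") + "…";
}

/** Collapse line breaks, then apply the same call order as the quoted pattern. */
function sanitizeToolError(raw: string): string {
  const clean = raw.replace(/\r?\n|\r/g, " ");
  // Tabs are expanded after truncation, so a line with tabs can end up
  // slightly wider than LINE_CAP in this simplified version.
  return replaceTabs(truncateToWidth(clean, LINE_CAP));
}
```

The result is always a single line, which is what keeps a bordered TUI row from wrapping or shifting.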
+
+### Round 21: L1 cleanup, lock kind, JSONL per-line cap, in-place loader test (commit 1bf120b)
+
+#### 1. L1 cleanup in src/state/schedule.ts
+- `console.warn` → `logInternalError` (consistency with rest of codebase)
+- `require("node:fs")` → top-level `fs`/`path` imports
+- 3 new tests in `test/unit/schedule-store.test.ts`
+
+#### 2. Dead code sweep in src/state/locks.ts
+- Removed misleadingly-named `readLockStateAsync` (sync I/O, called from async path) and its redundant call site
+- Async path now mirrors sync path exactly: stale-check + release + sleep
+
+#### 3. Lock file `kind` discriminator (forward compat)
+- Lock JSON now includes `kind: "run" | "file"`
+- `withRunLock` writes `kind="run"`; `withFileLockSync` writes `kind="file"`
+- Old locks (no `kind` field) still work — `releaseLock` only reads `token`, so the discriminator is purely additive
+- 3 new tests (kind for run, kind for file, back-compat with legacy locks)
+
+#### 4. JSONL per-line cap (defensive, src/state/jsonl-writer.ts)
+- Single huge line could exhaust memory during `redactJsonLine`
+- New `DEFAULT_MAX_LINE_BYTES = 1MB`. Lines exceeding the cap are dropped and counted
+- `logInternalError` fires on the first drop and every 100th drop thereafter
+- 2 new tests in `test/unit/jsonl-writer.test.ts`
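The drop-and-count behaviour above could look roughly like this. It is a sketch only: the `LineCapFilter` class and its `report` callback are hypothetical, "1MB" is assumed to mean 1024 * 1024 bytes, and "every 100th drop" is read as drops 100, 200, and so on.

```typescript
// Assumption: "1MB" is interpreted as 1 MiB.
const DEFAULT_MAX_LINE_BYTES = 1024 * 1024;

/** Hypothetical helper that rejects oversized JSONL lines and rate-limits the error report. */
class LineCapFilter {
  dropped = 0;
  private readonly encoder = new TextEncoder();
  private readonly maxLineBytes: number;
  private readonly report: (message: string) => void;

  constructor(report: (message: string) => void, maxLineBytes: number = DEFAULT_MAX_LINE_BYTES) {
    this.report = report;
    this.maxLineBytes = maxLineBytes;
  }

  /** Returns true when the line may be written, false when it was dropped. */
  accept(line: string): boolean {
    // UTF-8 never needs fewer bytes than the string has UTF-16 code units,
    // so a line longer than the cap can be rejected without encoding it.
    const tooLong =
      line.length > this.maxLineBytes ||
      this.encoder.encode(line).length > this.maxLineBytes;
    if (!tooLong) return true;
    this.dropped += 1;
    // Report the first drop, then every 100th, to avoid flooding the log.
    if (this.dropped === 1 || this.dropped % 100 === 0) {
      this.report(`dropped ${this.dropped} oversized JSONL line(s)`);
    }
    return false;
  }
}
```

Checking the size before any redaction work is the point of the cap: the expensive step never sees the oversized input.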
+
+#### 5. In-place extension loader integration test
+- **Pattern source**: oh-my-pi commit `c5e3698f4` (changed how extensions are loaded)
+- This test verifies pi-crew's `import.meta.url`-based skill path resolution still works with the new in-place loader
+- 2 new tests in `test/integration/extension-skill-resolution.test.ts`
+
+### Summary
+- **2 rounds** (Round 20 + 21)
+- **2 commits**: `f448d7d` (Round 20) + `1bf120b` (Round 21)
+- **10 new tests** across 4 test files
+- **Total tests**: 50 pass + 1 skip, **0 fail** (was 49 in v0.5.14)
+- **TypeScript**: 0 errors
+- **Patterns adopted**: 5 from `can1357/oh-my-pi` post-2026-05-11
+
+### Patterns surveyed but not applied (low applicability for pi-crew)
+- **Streaming JSON throttle** (3a733c480) — pi-crew has no streaming JSON parser
+- **In-place state mutation** (3a733c480) — pi-crew's spreads are bounded (small N), not hot paths
+- **Bounded row probing** (b522fde56) — pi-crew has no SQL queries
+- **MCP reconnect storm circuit breaker** — pi-crew has no MCP reconnect logic
+- **Drop `args` global from eval** (4ab40764d) — pi-crew's `dynamic-script-runner.ts` already safe
+- **Shell-injection rejection in git specs** (22e564a85) — pi-crew has no plugin install path
+- **NPM registry pinning** (9abce6e97) — pi-crew's `install.mjs` is config-only; user runs `pi install npm:pi-crew`
+- **Extension flag shadow** (1fbc2cbd7) — pi-crew has no `registerFlag` calls
+
+## [0.5.14] — Round 19 Audit Fixes (2026-06-02)
+
+### Phase 1: Path validation in checkpoint.ts (MEDIUM security)
+- All public functions now validate runId/taskId via `assertSafePathId()`:
+  - `saveCheckpoint(runId, taskId, ...)`
+  - `loadCheckpoint(runId, taskId)`
+  - `clearCheckpoint(runId, taskId)`
+  - `hasCheckpoint(runId, taskId)`
+  - `listCheckpoints(runId)`
+  - `FileCheckpointStore.save/load/delete` (validates taskId)
+- Prevents path traversal: malicious IDs like `../../../etc/passwd` throw "Invalid runId" instead of writing outside `.crew/`.
+
+### Phase 2-4: Test coverage (33 new tests)
+- 11 new tests in `test/unit/checkpoint.test.ts` (path validation)
+- 14 new tests in `test/unit/subagent-manager.test.ts` (basic + path validation)
+- 16 new tests in `test/unit/paths.test.ts` (findRepoRoot, projectPiRoot, projectCrewRoot)
+
+### Tests
+- 2370/2370 pass (was 2352 in v0.5.13; +18 net)
+- 33 new tests across 3 new test files
+- TypeScript: 0 errors
+
 ## [0.5.13] — Round 18 Audit Fixes (2026-06-02)
 
 ### Phase 1: Switch to execFileSync (HIGH security)
package/README.md
CHANGED
package/docs/pi-crew-v0.5.14-audit-fix-plan.md
@@ -0,0 +1,75 @@
+# pi-crew v0.5.14 Audit Fix Plan (Round 19)
+
+## Source Verification Findings
+
+I read the following files and identified 5 confirmed real issues:
+
+### Issue 1: `checkpoint.ts` lacks path validation for runId/taskId (MEDIUM security)
+**File**: `src/runtime/checkpoint.ts:133-200`
+
+The `saveCheckpoint(runId, taskId, ...)`, `loadCheckpoint(runId, taskId)`, `deleteCheckpoint(runId, taskId)`, `listCheckpoints(runId)`, `hasCheckpoint(runId, taskId)` functions all build paths like:
+
+```ts
+const stateRoot = path.join(process.cwd(), ".crew/state/runs", runId);
+const checkpointPath = path.join(stateRoot, "checkpoints", `${taskId}.json`);
+```
+
+If `runId` or `taskId` contains `../`, an attacker (or a bug) could write to arbitrary paths outside `.crew/`. The other modules (e.g., `state-store.ts`) use `assertSafePathId` and `resolveContainedRelativePath` to defend against this, but `checkpoint.ts` does not.
+
+**Note**: These functions are not currently used in production code (only in tests), so the attack surface is small. But the issue should be fixed for defense-in-depth.
+
+**Fix**: Use `assertSafePathId(runId)` and `assertSafePathId(taskId)` from `utils/safe-paths.ts`.
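To illustrate the fix, here is a minimal sketch of validating both IDs before building the checkpoint path. The `assertSafeId` function and its allow-list pattern are assumptions made for this example; the real `assertSafePathId` in `utils/safe-paths.ts` may apply different rules. The `root` parameter stands in for `process.cwd()`.

```typescript
import * as path from "node:path";

// Hypothetical allow-list: letters, digits, dot, underscore, hyphen; no separators.
const SAFE_ID = /^[A-Za-z0-9][A-Za-z0-9._-]*$/;

/** Illustrative stand-in for assertSafePathId; throws on anything that could escape a directory. */
function assertSafeId(value: string, label: string): void {
  if (!SAFE_ID.test(value) || value.includes("..")) {
    throw new Error(`Invalid ${label}`);
  }
}

/** Build the checkpoint path only after both IDs have been validated. */
function checkpointPath(root: string, runId: string, taskId: string): string {
  assertSafeId(runId, "runId");
  assertSafeId(taskId, "taskId");
  const stateRoot = path.join(root, ".crew/state/runs", runId);
  return path.join(stateRoot, "checkpoints", `${taskId}.json`);
}
```

Validating before `path.join` matters because `path.join` normalizes `..` segments, so a traversal ID would otherwise resolve silently to a location outside `.crew/`.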
+
+### Issue 2: `subagent-manager.ts` busy-polls blocked runs (MEDIUM performance)
+**File**: `src/runtime/subagent-manager.ts:323-356, 358-389`
+
+`pollRunToTerminal` and `scheduleBlockedTerminalPoll` use `setTimeout` to poll the run manifest every `pollIntervalMs` (default 1000ms). For long-running tasks (hours), this means thousands of `loadRunManifestById` calls.
+
+Each call does:
+- File stat
+- File read
+- JSON parse
+
+**Fix**: Use `fs.watch()` to be notified of manifest changes instead of polling. This is event-driven and only fires when the file actually changes.
+
+### Issue 3: `subagent-manager.ts:waitForRecord` busy-loops with 100ms sleep (LOW performance)
+**File**: `src/runtime/subagent-manager.ts:217-225`
+
+When `record.promise` is undefined (just created), the function busy-loops with 100ms `setTimeout`. This works but is inefficient.
+
+**Fix**: Use an event emitter or a promise that's resolved when the record transitions to terminal state.
+
+### Issue 4: `subagent-manager.ts:scheduleStuckBlockedNotify` timer holds strong ref to `record` (LOW memory)
+**File**: `src/runtime/subagent-manager.ts:393-407`
+
+The timer closure captures `record` strongly. If the agent is removed (via `removeAgent` or similar), the timer still holds a reference until it fires.
+
+**Fix**: Add `removeAgent(id)` method that clears the timer.
+
+### Issue 5: Test coverage gaps for subagent-manager, paths, checkpoint (LOW)
+- `test/unit/subagent-manager.test.ts` — does not exist
+- `test/unit/paths.test.ts` — does not exist
+- `test/unit/checkpoint.test.ts` — exists but no path-traversal tests
+
+## Plan (5 phases)
+
+### Phase 1: Path validation in checkpoint.ts
+- Use `assertSafePathId` from `utils/safe-paths.ts`
+- Update `saveCheckpoint`, `loadCheckpoint`, `deleteCheckpoint`, `listCheckpoints`, `hasCheckpoint`
+
+### Phase 2: Add tests for path validation
+- Test that `saveCheckpoint` rejects `../etc/passwd`
+- Test that `loadCheckpoint` rejects path-traversal IDs
+
+### Phase 3: Test coverage for subagent-manager
+- Test spawn, abort, waitForAll
+- Test path validation
+- Test concurrent limits
+- Test cleanup of controllers
+
+### Phase 4: Test coverage for paths
+- Test findRepoRoot with various project markers
+- Test cache TTL
+- Test projectPiRoot / projectCrewRoot
+
+### Phase 5: Release v0.5.14
package/docs/pi-crew-v0.5.16-audit-fix-plan.md
@@ -0,0 +1,35 @@
+# Round 22 Audit Fix Plan (Defensive Caps)
+
+## Findings
+
+### Issue 1: `autoRecoveryLast` Map grows unboundedly (MEDIUM, MEMORY)
+- **File**: `src/extension/register.ts:484`
+- **What**: Module-level `Map<string, number>` keyed by `${kind}_${runId}`. Holds cooldown timestamps for "recovery notifications" (5-minute gate per key).
+- **Bug**: Entries are NEVER removed during a session. Each run contributes up to 4 keys (one per `maybeNotifyHealth` kind). Long-running pi sessions that run 1000+ teams accumulate 4000+ entries (~32KB).
+- **Severity**: MEDIUM — silent memory growth in long-running process. Not a security issue.
+- **Fix**: Add `AUTO_RECOVERY_LAST_MAX_ENTRIES` cap. Evict oldest insertion (matches the 5-min cooldown gate semantics — once the gate has expired, the entry is irrelevant). The eviction loop runs on each `set()` to amortize the cost.
+
+### Issue 2: `agentEventSeqCache` Map grows unboundedly (MEDIUM, MEMORY)
+- **File**: `src/runtime/crew-agent-records.ts:265`
+- **What**: Module-level `Map<string, { size, mtimeMs, seq }>` keyed by `filePath` (each agent event log). Caches the `.seq` sidecar value.
+- **Bug**: Entries are NEVER removed. Each new agent task creates a new event log file, adding a cache entry. A long-running pi-crew process that spawns 1000s of agents accumulates 1000s of entries.
+- **Severity**: MEDIUM — silent memory growth. Plus, stale entries mask filesystem changes (mtime/size won't reflect a re-created file).
+- **Fix**: Add `AGENT_EVENT_SEQ_CACHE_MAX_ENTRIES` cap. Evict oldest insertion first (mirrors the `asyncAgentReaderCache` pattern at line 134-136 in the same file).
+
+## Plan (2 phases)
+
+### Phase 1: `autoRecoveryLast` defensive cap
+- `src/extension/register.ts:484` — add `AUTO_RECOVERY_LAST_MAX_ENTRIES = 1000` constant
+- Modify the `set()` site at line 1534 to evict oldest entries before inserting when size > cap
+- Add test in `test/unit/auto-recovery-cap.test.ts`
+
+### Phase 2: `agentEventSeqCache` defensive cap
+- `src/runtime/crew-agent-records.ts:265` — add `AGENT_EVENT_SEQ_CACHE_MAX_ENTRIES = 1000` constant
+- Add helper function `setAgentEventSeqCache()` that wraps the `.set()` and evicts oldest entries
+- Add test in `test/unit/crew-agent-records.test.ts` (or new file)
+
+## Expected impact
+- 2 new tests, 0 regressions
+- Total: 2 MEDIUM memory-leak fixes
+- No public API changes
+- Pattern: follows existing `NotificationRouter.SEEN_MAP_MAX_SIZE` and `asyncAgentReaderCache` patterns in the codebase
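The oldest-first eviction described in this plan relies on the fact that a JavaScript `Map` iterates its keys in insertion order. A minimal sketch follows, assuming a generic helper named `setBounded` (hypothetical; the plan itself names `setAgentEventSeqCache()` and an inline eviction loop in `register.ts`).

```typescript
// Cap value taken from the plan (1000 entries); the constant name here is illustrative.
const MAX_ENTRIES = 1000;

/**
 * Insert into a Map while keeping at most `maxEntries` keys.
 * The first key returned by keys() is the oldest insertion, so it is evicted first.
 */
function setBounded<K, V>(map: Map<K, V>, key: K, value: V, maxEntries: number = MAX_ENTRIES): void {
  // Updating an existing key keeps its original position in iteration order.
  map.set(key, value);
  while (map.size > maxEntries) {
    const oldest = map.keys().next();
    if (oldest.done) break;
    map.delete(oldest.value);
  }
}
```

This is insertion-order eviction, not LRU: reads and updates do not refresh an entry, which fits the cooldown case where only the age of the original insertion matters.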
package/docs/pi-crew-v0.5.17-audit-fix-plan.md
@@ -0,0 +1,80 @@
+# Round 23 Audit Findings (Resource Cleanup)
+
+## Skill: iterative-audit (Pattern #7: Resource Cleanup)
+
+## Findings
+
+### Issue 1: OTLP exporter `inFlight` push not awaited on dispose (LOW)
+- **File**: `src/observability/exporters/otlp-exporter.ts:80-86, 127-130`
+- **What**: When `dispose()` is called, the interval timer is cleared but the in-flight `push()` continues to run until the 10s fetch timeout. The result is lost (not awaited).
+- **Severity**: LOW — bounded by 10s fetch timeout. Not a real leak, just orphaned work.
+- **Fix**: Make `dispose()` async. Await the in-flight push before returning.
+- **Test**: 1 new test verifies `dispose()` waits for the in-flight push.
+
+## Patterns surveyed (all VERIFIED clean from source)
+
+### setInterval / setTimeout cleanup
+| File | Resource | Cleanup | Status |
+|------|----------|---------|--------|
+| `register.ts:411` | `autoRepairTimer` | cleared on line 308, 402, 1102 | OK |
+| `register.ts:442` | `tempReconcileTimer` | cleared on line 308, 402, 1102 | OK |
+| `result-watcher.ts:80` | `pollTimer` | cleared in `stopPolling()` | OK |
+| `result-watcher.ts:96` | `restartTimer` | cleared in `scheduleRestart()` and `stop()` | OK |
+| `async-notifier.ts:101` | `state.interval` | cleared in `stopAsyncRunNotifier()` | OK |
+| `subagent-tools.ts:228` | `timer` | cleanup function returned to caller | OK |
+| `team-tool.ts:160` | `timer` | `stop()` method clears it | OK |
+| `live-conversation-overlay.ts:55` | `pollTimer` | cleared in `close()` / `dispose()` | OK |
+| `loaders.ts:127` | `timer` | cleared in `dispose()` | OK |
+| `theme-adapter.ts:145` | `pollTimer` | cleared in unsubscribe (line 169) | OK |
+| `delivery-coordinator.ts:169` | `ttlTimer` | cleared in `dispose()` | OK |
+| `parent-guard.ts:61` | `guardInterval` | cleared in `stopParentGuard()` | OK |
+| `scheduler.ts:88` | `t` (timer) | cleared on job removal | OK |
+| `otlp-exporter.ts:80` | `timer` | cleared in `dispose()` (Round 23: also awaits inFlight) | OK |
+| `team-runner.ts:67` | `interval` | local scope (per-run) | OK |
+| `metric-sink.ts:68` | `timer` | cleared in `dispose()` (also closes fd) | OK |
+| `handoff-manager.ts:203` | `cleanupTimer` | cleared in `dispose()` (also clears Maps) | OK |
+| `live-session-runtime.ts:487` | `controlTimer` | cleared in `finally` block | OK |
+| `budget-tracker.ts:231` | `abortInterval` | cleared on abort/exhausted | OK |
+| `background-runner.ts:52, 74` | `interval` | local scope (process entry point) | OK |
+
+### process.on() signal handler registration
+| File | Handlers | Guard | Status |
+|------|----------|-------|--------|
+| `crew-cleanup.ts:79, 84` | SIGTERM, SIGHUP | `signalHandlersRegistered` flag (Round 16) | OK |
+| `background-runner.ts:107, 148, 175, 181, 194, 198` | many | process entry point (registered once per process) | OK |
+| `event-log.ts:490-492` | exit, SIGTERM, SIGINT | module-level (ESM caches) | OK |
+| `atomic-write.ts:265-267` | exit, SIGTERM, SIGINT | module-level (ESM caches) | OK |
+
+### File watchers
+| File | Watcher | Cleanup | Status |
+|------|---------|---------|--------|
+| `register.ts:682, 686` | `crewWatcher`, `userCrewWatcher` | `closeWatcher()` in cleanup paths | OK |
+| `result-watcher.ts` | `watcher` | `closeWatcher()` in `stop()` | OK |
+
+### Event listeners
+| File | Listener | Cleanup | Status |
+|------|----------|---------|--------|
+| `event-bus.ts:on()` | deduped via Set | cleanup function returned | OK |
+| `run-event-bus.ts:onAny()` etc. | deduped via Sets | cleanup function returned | OK |
+| `phase-tracker.ts:dispose()` | EventEmitter | `removeAllListeners()` | OK |
+| `team-tool.ts:72` | signal listener | `removeEventListener` in `finally` | OK |
+
+### AbortController
+| File | Controller | Cleanup | Status |
+|------|-----------|---------|--------|
+| `team-tool.ts:68` | per-tool | aborted via signal listener, removed in `finally` | OK |
+| `subagent-manager.ts:290` | per-run | cleaned in `cleanupRunSignal()` | OK |
+| `cancellation-token.ts:17` | per-token | aborted via `#controller.abort()` | OK |
+| `otlp-exporter.ts:106` | per-push | cleared in `finally` block | OK (also: dispose awaits inFlight) |
+
+## Plan (1 phase)
+
+### Phase 1: OTLP exporter `dispose()` awaits inFlight
+- `src/observability/exporters/otlp-exporter.ts:127-130` — make `dispose()` async, await `this.inFlight`
+- 1 new test in `test/unit/otlp-exporter.test.ts`
+
+## Expected impact
+- 1 new test, 0 regressions
+- Total: 1 LOW severity improvement
+- No public API change (callers that don't await still get synchronous timer clear)
+- Pattern: matches the existing `await` patterns elsewhere in the codebase
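The dispose behaviour planned above might be sketched like this. The `PushExporter` class, its `flush()` method, and the constructor arguments are hypothetical and only model the shape of the problem; the real exporter's fields and push logic are not shown in this plan. The 10s bound mirrors the timeout mentioned in the findings.

```typescript
/** Hypothetical exporter that pushes on an interval and tracks one in-flight push. */
class PushExporter {
  private timer: ReturnType<typeof setInterval> | undefined;
  private inFlight: Promise<void> | undefined;
  private readonly push: () => Promise<void>;

  constructor(push: () => Promise<void>, intervalMs: number) {
    this.push = push;
    this.timer = setInterval(() => this.flush(), intervalMs);
  }

  /** Start a push unless one is already running. */
  flush(): void {
    if (this.inFlight) return;
    this.inFlight = this.push()
      .catch(() => { /* export errors are ignored in this sketch */ })
      .finally(() => { this.inFlight = undefined; });
  }

  /** Clear the timer immediately, then wait for the in-flight push (bounded by timeoutMs). */
  async dispose(timeoutMs: number = 10_000): Promise<void> {
    // Code before the first await runs synchronously, so callers that
    // do not await dispose() still get the timer cleared right away.
    if (this.timer) clearInterval(this.timer);
    this.timer = undefined;
    const pending = this.inFlight;
    if (!pending) return;
    let timeout: ReturnType<typeof setTimeout> | undefined;
    const deadline = new Promise<void>((resolve) => {
      timeout = setTimeout(resolve, timeoutMs);
    });
    try {
      await Promise.race([pending, deadline]);
    } finally {
      clearTimeout(timeout); // do not keep the process alive for the full timeout
    }
  }
}
```

Racing against a deadline keeps `dispose()` from hanging if the push itself never settles, and clearing the deadline timer afterwards avoids delaying process exit.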
package/docs/skills/REFERENCE.md
CHANGED
@@ -38,6 +38,16 @@ multi-perspective-review (8-pass deep review)
 secure-agent-orchestration-review (security focus)
 ```
 
+### Multi-Round Audit (5-20 rounds)
+
+```
+iterative-audit (round planning, 7 patterns, diminishing-returns)
+↓
+multi-perspective-review (per round, optional)
+↓
+verification-before-done (per round)
+```
+
 ---
 
 ## When to Invoke
@@ -48,6 +58,7 @@ secure-agent-orchestration-review (security focus)
 | Before claiming done | `verification-before-done` |
 | Code review (quick) | `scrutinize` |
 | Code review (deep) | `multi-perspective-review` |
+| Multi-round audit (5-20 rounds) | `iterative-audit` |
 | Task delegation | `delegation-patterns` |
 | Complex multi-phase work | `orchestration` |
 | After bug is fixed | `post-mortem` |