pi-crew 0.1.46 → 0.1.51

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (262)
  1. package/CHANGELOG.md +115 -0
  2. package/agents/analyst.md +11 -11
  3. package/agents/critic.md +11 -11
  4. package/agents/executor.md +11 -11
  5. package/agents/explorer.md +11 -11
  6. package/agents/planner.md +11 -11
  7. package/agents/reviewer.md +11 -11
  8. package/agents/security-reviewer.md +11 -11
  9. package/agents/test-engineer.md +11 -11
  10. package/agents/verifier.md +11 -11
  11. package/agents/writer.md +11 -11
  12. package/docs/next-upgrade-roadmap.md +117 -42
  13. package/docs/refactor-tasks-phase3.md +394 -394
  14. package/docs/refactor-tasks-phase4.md +564 -564
  15. package/docs/refactor-tasks-phase5.md +402 -402
  16. package/docs/refactor-tasks-phase6.md +662 -662
  17. package/docs/research/AGENT-EXECUTION-ARCHITECTURE.md +261 -0
  18. package/docs/research/AGENT-LIFECYCLE-COMPARISON.md +111 -0
  19. package/docs/research/AUDIT_OH_MY_PI.md +261 -0
  20. package/docs/research/AUDIT_PI_CREW.md +457 -0
  21. package/docs/research/CAVEMAN-DEEP-RESEARCH.md +281 -0
  22. package/docs/research/COMPARISON_OH_MY_PI_VS_PI_CREW.md +264 -0
  23. package/docs/research/DEEP-RESEARCH-PI-POWERBAR.md +343 -0
  24. package/docs/research/DEEP_RESEARCH_SUBAGENT_ARCHITECTURE.md +480 -0
  25. package/docs/research/GAP_CLOSURE_IMPLEMENTATION_PLAN.md +354 -0
  26. package/docs/research/IMPLEMENTATION_PLAN.md +385 -0
  27. package/docs/research/LIVE-SESSION-PRODUCTION-READY-PLAN.md +502 -0
  28. package/docs/research/OH-MY-PI-DEEP-RESEARCH-v14.7.6.md +266 -0
  29. package/docs/research/REMAINING-GAPS-PLAN.md +363 -0
  30. package/docs/research/SESSION-SUMMARY-2026-05-08.md +146 -0
  31. package/docs/research/UI-RESPONSIVENESS-AUDIT.md +173 -0
  32. package/docs/research-awesome-agent-skills-distillation.md +100 -100
  33. package/docs/research-extension-examples.md +297 -297
  34. package/docs/research-extension-system.md +324 -324
  35. package/docs/research-oh-my-pi-distillation.md +56 -9
  36. package/docs/research-optimization-plan.md +548 -548
  37. package/docs/research-phase10-distillation.md +198 -198
  38. package/docs/research-phase11-distillation.md +201 -201
  39. package/docs/research-pi-coding-agent.md +357 -357
  40. package/docs/research-source-pi-crew-reference.md +174 -174
  41. package/docs/runtime-flow.md +148 -148
  42. package/docs/source-runtime-refactor-map.md +107 -107
  43. package/index.ts +6 -6
  44. package/package.json +99 -98
  45. package/schema.json +8 -0
  46. package/skills/async-worker-recovery/SKILL.md +42 -42
  47. package/skills/context-artifact-hygiene/SKILL.md +52 -52
  48. package/skills/delegation-patterns/SKILL.md +54 -54
  49. package/skills/mailbox-interactive/SKILL.md +40 -40
  50. package/skills/model-routing-context/SKILL.md +39 -39
  51. package/skills/multi-perspective-review/SKILL.md +58 -58
  52. package/skills/observability-reliability/SKILL.md +41 -41
  53. package/skills/orchestration/SKILL.md +157 -0
  54. package/skills/ownership-session-security/SKILL.md +41 -41
  55. package/skills/pi-extension-lifecycle/SKILL.md +39 -39
  56. package/skills/requirements-to-task-packet/SKILL.md +63 -63
  57. package/skills/resource-discovery-config/SKILL.md +41 -41
  58. package/skills/runtime-state-reader/SKILL.md +44 -44
  59. package/skills/secure-agent-orchestration-review/SKILL.md +45 -45
  60. package/skills/state-mutation-locking/SKILL.md +42 -42
  61. package/skills/systematic-debugging/SKILL.md +67 -67
  62. package/skills/ui-render-performance/SKILL.md +39 -39
  63. package/skills/verification-before-done/SKILL.md +57 -57
  64. package/skills/worktree-isolation/SKILL.md +39 -39
  65. package/src/agents/agent-config.ts +6 -0
  66. package/src/agents/agent-search.ts +98 -0
  67. package/src/agents/agent-serializer.ts +4 -0
  68. package/src/agents/discover-agents.ts +17 -4
  69. package/src/config/config.ts +25 -0
  70. package/src/config/defaults.ts +16 -5
  71. package/src/extension/autonomous-policy.ts +26 -33
  72. package/src/extension/cross-extension-rpc.ts +94 -82
  73. package/src/extension/help.ts +1 -0
  74. package/src/extension/management.ts +5 -0
  75. package/src/extension/project-init.ts +15 -3
  76. package/src/extension/register.ts +78 -19
  77. package/src/extension/registration/commands.ts +33 -1
  78. package/src/extension/registration/compaction-guard.ts +125 -125
  79. package/src/extension/registration/team-tool.ts +6 -4
  80. package/src/extension/run-bundle-schema.ts +89 -89
  81. package/src/extension/run-export.ts +26 -12
  82. package/src/extension/run-index.ts +24 -18
  83. package/src/extension/run-maintenance.ts +68 -62
  84. package/src/extension/team-tool/api.ts +23 -2
  85. package/src/extension/team-tool/cancel.ts +86 -11
  86. package/src/extension/team-tool/context.ts +4 -1
  87. package/src/extension/team-tool/handle-settings.ts +188 -188
  88. package/src/extension/team-tool/inspect.ts +41 -41
  89. package/src/extension/team-tool/intent-policy.ts +42 -0
  90. package/src/extension/team-tool/lifecycle-actions.ts +47 -18
  91. package/src/extension/team-tool/parallel-dispatch.ts +156 -0
  92. package/src/extension/team-tool/plan.ts +19 -19
  93. package/src/extension/team-tool/respond.ts +10 -2
  94. package/src/extension/team-tool/run.ts +3 -2
  95. package/src/extension/team-tool/status.ts +1 -1
  96. package/src/extension/team-tool-types.ts +1 -0
  97. package/src/extension/team-tool.ts +16 -5
  98. package/src/hooks/registry.ts +61 -0
  99. package/src/hooks/types.ts +41 -0
  100. package/src/i18n.ts +184 -184
  101. package/src/observability/exporters/otlp-exporter.ts +77 -77
  102. package/src/prompt/prompt-runtime.ts +72 -72
  103. package/src/runtime/agent-control.ts +108 -2
  104. package/src/runtime/agent-memory.ts +72 -72
  105. package/src/runtime/agent-observability.ts +114 -114
  106. package/src/runtime/async-marker.ts +26 -26
  107. package/src/runtime/async-runner.ts +3 -1
  108. package/src/runtime/attention-events.ts +28 -28
  109. package/src/runtime/background-runner.ts +19 -0
  110. package/src/runtime/cancellation-token.ts +89 -0
  111. package/src/runtime/cancellation.ts +61 -51
  112. package/src/runtime/capability-inventory.ts +116 -0
  113. package/src/runtime/child-pi.ts +2 -1
  114. package/src/runtime/code-summary.ts +247 -0
  115. package/src/runtime/completion-guard.ts +190 -190
  116. package/src/runtime/concurrency.ts +3 -1
  117. package/src/runtime/crash-recovery.ts +181 -0
  118. package/src/runtime/crew-agent-records.ts +35 -7
  119. package/src/runtime/crew-agent-runtime.ts +1 -0
  120. package/src/runtime/custom-tools/irc-tool.ts +201 -0
  121. package/src/runtime/custom-tools/submit-result-tool.ts +90 -0
  122. package/src/runtime/delivery-coordinator.ts +3 -1
  123. package/src/runtime/diagnostic-export.ts +3 -1
  124. package/src/runtime/direct-run.ts +35 -35
  125. package/src/runtime/effectiveness.ts +81 -76
  126. package/src/runtime/event-stream-bridge.ts +92 -0
  127. package/src/runtime/foreground-control.ts +82 -82
  128. package/src/runtime/green-contract.ts +46 -46
  129. package/src/runtime/group-join.ts +106 -106
  130. package/src/runtime/heartbeat-gradient.ts +28 -28
  131. package/src/runtime/heartbeat-watcher.ts +124 -124
  132. package/src/runtime/live-agent-control.ts +88 -88
  133. package/src/runtime/live-agent-manager.ts +78 -2
  134. package/src/runtime/live-control-realtime.ts +36 -36
  135. package/src/runtime/live-extension-bridge.ts +150 -0
  136. package/src/runtime/live-irc.ts +92 -0
  137. package/src/runtime/live-session-health.ts +100 -0
  138. package/src/runtime/live-session-runtime.ts +297 -7
  139. package/src/runtime/mcp-proxy.ts +113 -0
  140. package/src/runtime/notebook-helpers.ts +90 -0
  141. package/src/runtime/orphan-sentinel.ts +7 -0
  142. package/src/runtime/output-validator.ts +187 -0
  143. package/src/runtime/parallel-research.ts +44 -44
  144. package/src/runtime/parallel-utils.ts +57 -0
  145. package/src/runtime/parent-guard.ts +80 -0
  146. package/src/runtime/pi-args.ts +11 -2
  147. package/src/runtime/pi-json-output.ts +111 -111
  148. package/src/runtime/pi-spawn.ts +21 -3
  149. package/src/runtime/policy-engine.ts +79 -79
  150. package/src/runtime/process-status.ts +14 -1
  151. package/src/runtime/progress-event-coalescer.ts +43 -43
  152. package/src/runtime/prose-compressor.ts +164 -0
  153. package/src/runtime/recovery-recipes.ts +74 -74
  154. package/src/runtime/result-extractor.ts +121 -0
  155. package/src/runtime/role-permission.ts +39 -39
  156. package/src/runtime/runtime-resolver.ts +1 -4
  157. package/src/runtime/semaphore.ts +131 -0
  158. package/src/runtime/sensitive-paths.ts +92 -0
  159. package/src/runtime/session-resources.ts +25 -25
  160. package/src/runtime/session-snapshot.ts +59 -59
  161. package/src/runtime/session-usage.ts +79 -79
  162. package/src/runtime/sidechain-output.ts +29 -29
  163. package/src/runtime/stream-preview.ts +177 -0
  164. package/src/runtime/subagent-manager.ts +3 -2
  165. package/src/runtime/subprocess-tool-registry.ts +67 -0
  166. package/src/runtime/supervisor-contact.ts +59 -59
  167. package/src/runtime/task-display.ts +38 -38
  168. package/src/runtime/task-output-context.ts +59 -9
  169. package/src/runtime/task-runner/capabilities.ts +78 -78
  170. package/src/runtime/task-runner/live-executor.ts +2 -0
  171. package/src/runtime/task-runner/progress.ts +119 -119
  172. package/src/runtime/task-runner/prompt-builder.ts +71 -9
  173. package/src/runtime/task-runner/prompt-pipeline.ts +64 -64
  174. package/src/runtime/task-runner/result-utils.ts +14 -14
  175. package/src/runtime/task-runner/run-projection.ts +104 -0
  176. package/src/runtime/task-runner/state-helpers.ts +22 -22
  177. package/src/runtime/task-runner.ts +75 -4
  178. package/src/runtime/team-runner.ts +69 -8
  179. package/src/runtime/worker-heartbeat.ts +21 -21
  180. package/src/runtime/worker-startup.ts +57 -57
  181. package/src/runtime/workspace-tree.ts +298 -0
  182. package/src/runtime/yield-handler.ts +189 -0
  183. package/src/schema/config-schema.ts +7 -0
  184. package/src/schema/team-tool-schema.ts +11 -1
  185. package/src/skills/discover-skills.ts +67 -0
  186. package/src/state/active-run-registry.ts +4 -2
  187. package/src/state/artifact-store.ts +4 -1
  188. package/src/state/atomic-write.ts +50 -1
  189. package/src/state/blob-store.ts +117 -0
  190. package/src/state/contracts.ts +1 -0
  191. package/src/state/event-log-rotation.ts +158 -0
  192. package/src/state/event-log.ts +52 -2
  193. package/src/state/locks.ts +3 -1
  194. package/src/state/mailbox.ts +87 -7
  195. package/src/state/state-store.ts +24 -4
  196. package/src/state/task-claims.ts +44 -44
  197. package/src/state/types.ts +20 -0
  198. package/src/state/usage.ts +29 -29
  199. package/src/subagents/async-entry.ts +1 -1
  200. package/src/subagents/index.ts +3 -3
  201. package/src/subagents/live/control.ts +1 -1
  202. package/src/subagents/live/manager.ts +1 -1
  203. package/src/subagents/live/realtime.ts +1 -1
  204. package/src/subagents/live/session-runtime.ts +1 -1
  205. package/src/subagents/manager.ts +1 -1
  206. package/src/subagents/spawn.ts +1 -1
  207. package/src/teams/team-serializer.ts +38 -38
  208. package/src/types/diff.d.ts +18 -18
  209. package/src/ui/agent-management-overlay.ts +144 -0
  210. package/src/ui/crew-footer.ts +101 -101
  211. package/src/ui/crew-select-list.ts +111 -111
  212. package/src/ui/crew-widget.ts +15 -4
  213. package/src/ui/dashboard-panes/cancellation-pane.ts +43 -0
  214. package/src/ui/dashboard-panes/capability-pane.ts +60 -0
  215. package/src/ui/dashboard-panes/mailbox-pane.ts +35 -11
  216. package/src/ui/dashboard-panes/metrics-pane.ts +34 -34
  217. package/src/ui/dynamic-border.ts +25 -25
  218. package/src/ui/layout-primitives.ts +106 -106
  219. package/src/ui/live-run-sidebar.ts +4 -0
  220. package/src/ui/loaders.ts +158 -158
  221. package/src/ui/powerbar-publisher.ts +83 -15
  222. package/src/ui/render-coalescer.ts +51 -0
  223. package/src/ui/render-diff.ts +119 -119
  224. package/src/ui/render-scheduler.ts +143 -143
  225. package/src/ui/run-dashboard.ts +4 -0
  226. package/src/ui/run-event-bus.ts +209 -0
  227. package/src/ui/run-snapshot-cache.ts +68 -16
  228. package/src/ui/snapshot-types.ts +8 -0
  229. package/src/ui/spinner.ts +17 -17
  230. package/src/ui/status-colors.ts +58 -58
  231. package/src/ui/syntax-highlight.ts +116 -116
  232. package/src/ui/transcript-entries.ts +258 -0
  233. package/src/utils/atomic-write.ts +33 -33
  234. package/src/utils/completion-dedupe.ts +63 -63
  235. package/src/utils/frontmatter.ts +68 -68
  236. package/src/utils/git.ts +262 -262
  237. package/src/utils/ids.ts +17 -12
  238. package/src/utils/incremental-reader.ts +104 -0
  239. package/src/utils/names.ts +27 -27
  240. package/src/utils/redaction.ts +44 -44
  241. package/src/utils/safe-paths.ts +47 -47
  242. package/src/utils/scan-cache.ts +137 -0
  243. package/src/utils/sleep.ts +32 -32
  244. package/src/utils/sse-parser.ts +134 -0
  245. package/src/utils/task-name-generator.ts +337 -0
  246. package/src/utils/visual.ts +33 -2
  247. package/src/workflows/validate-workflow.ts +40 -40
  248. package/src/worktree/branch-freshness.ts +45 -45
  249. package/src/worktree/cleanup.ts +2 -1
  250. package/src/worktree/worktree-manager.ts +11 -3
  251. package/teams/default.team.md +12 -12
  252. package/teams/fast-fix.team.md +11 -11
  253. package/teams/implementation.team.md +18 -18
  254. package/teams/parallel-research.team.md +14 -14
  255. package/teams/research.team.md +11 -11
  256. package/teams/review.team.md +12 -12
  257. package/workflows/default.workflow.md +29 -29
  258. package/workflows/fast-fix.workflow.md +22 -22
  259. package/workflows/implementation.workflow.md +43 -38
  260. package/workflows/parallel-research.workflow.md +46 -46
  261. package/workflows/research.workflow.md +22 -22
  262. package/workflows/review.workflow.md +30 -30
package/CHANGELOG.md CHANGED
@@ -2,6 +2,121 @@
 
  ## Unreleased
 
+ ## 0.1.51
+
+ ### Fixed
+
+ - **Stale foreground spinner** — Working message/spinner now always clears when foreground run completes, even if session generation changed during the run.
+ - **Completed-run widget grace period (8s)** — Runs that just completed stay visible in the widget for 8 seconds so users can see results before the widget hides.
+
+ ## 0.1.50
+
+ ### Fixed
+
+ - **Parallel execution** — Raised default concurrency (implementation 2→4, review 2→3, research 2→3). Fixed `defaultWorkflowConcurrency()` routing bug where review/default both returned the implementation value.
+ - **Planner prompt** — Added explicit "MAXIMIZE PARALLELISM" instruction with examples, so planner models produce parallel phases instead of sequential.
+ - **20 review findings** — 6 CRITICAL (optional chaining crash, env leak, path redaction, RPC validation, hook JSON safety, temp dir security), 6 HIGH (unsafe casts, busy-wait CPU, timestamp merge guard, prompt injection delimiter, binary validation), 5 MEDIUM, 3 LOW.
+ - **Widget flicker** — Pinned preloaded manifests to widget component model to prevent manifestCache TTL race. Scoped snapshotCache invalidation to specific run instead of clearing all.
+ - **Delegation policy** — Rewritten as mandatory decision table with concrete thresholds (>3 files read or >2 files edit = must delegate). Injected into every session via system prompt.
+ - **ignoreMethod option** — New config to write ignore entries to `.git/info/exclude` instead of `.gitignore` (Closes #2).
+
+ ## 0.1.49
+
+ ### Added
+
+ - **Caveman output contracts** — Role-based output validation framework with `output-validator.ts`: regex-based format checking for explorer, executor, reviewer, verifier, security-reviewer roles. Non-blocking: validation failures emit `task.output_validation` events + set `needs_attention` but do NOT fail the task.
+ - **Prose compressor** — `prose-compressor.ts` compresses verbose worker output for token-sensitive contexts (role-aware compression levels).
+ - **Sensitive paths** — Word-boundary-aware token matching in `sensitive-paths.ts` prevents false positives (e.g. `secretary.ts` no longer flagged as `secret`).
+ - **Symlink-safe I/O** — Artifact and shared output paths reject traversal attempts and symlinked root escapes.
+ - **Output contract eval harness** — 19 unit tests covering three-arm evaluation (contract vs terse vs baseline), format compliance, token savings, regex safety (no `/g` lastIndex state leak).
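
Annotation on the **Sensitive paths** entry above: the difference between substring matching and word-boundary-aware token matching can be illustrated with a minimal sketch. Everything here (`SENSITIVE_TOKENS`, `pathTokens`, `isSensitivePath`, and the token list itself) is hypothetical and chosen for illustration; the actual contents of `sensitive-paths.ts` are not shown in this diff.

```typescript
// Hypothetical sketch, not the package's implementation.
// Assumed token list; the real list in sensitive-paths.ts is not shown here.
const SENSITIVE_TOKENS = new Set(["secret", "secrets", "token", "password", "credentials"]);

/** Split a path into lowercase word tokens on separators and camelCase boundaries. */
function pathTokens(path: string): string[] {
  return path
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // "apiToken" -> "api Token"
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on "/", ".", "-", "_", spaces, ...
    .filter((token) => token.length > 0);
}

/** A path is flagged only when a whole token matches, never a substring of a longer word. */
function isSensitivePath(path: string): boolean {
  return pathTokens(path).some((token) => SENSITIVE_TOKENS.has(token));
}

// Naive substring matching, shown only for contrast.
function isSensitiveBySubstring(path: string): boolean {
  const lower = path.toLowerCase();
  return Array.from(SENSITIVE_TOKENS).some((token) => lower.includes(token));
}
```

Under these assumptions, `src/secretary.ts` is flagged by the substring check but not by the token check, while `config/secret.json` is flagged by both.
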
+
+ ### Changed
+
+ - **Delegation policy rewritten** — Replaced advisory "you should consider" text with a mandatory decision table: concrete thresholds (>3 files read OR >2 files edit = MUST delegate), explicit YES/NO cases per task type, conflict-safe task splitting rules. Injected into every session via `before_agent_start` hook.
+ - **Powerbar dedup** — `powerbar-publisher.ts` now skips `powerbar:update` emit when segment data is unchanged (inspired by pi-powerbar's `segmentEquals` pattern). Combined with existing 200ms coalescing for minimal unnecessary renders.
+ - **UI responsiveness** — `task-runner.ts` now emits `streamBridge` event immediately after `task.started`, giving the widget agent status within ~100ms instead of 2-5s (child process startup delay).
+ - **"spawning…" indicator** — Widget shows "spawning…" for agents < 5 seconds old with no tool activity, distinguishing from "thinking…" for long-running agents.
+
+ ### Fixed
+
+ - **H1: MCP proxy fallback** — `mcp-proxy.ts` now falls back to `enableMcp: true` when `createMcpProxyTools()` returns empty, so child sessions self-discover MCP instead of losing all access.
+ - **H2: parallel-utils throw undefined** — `mapConcurrent` now throws the actual error instead of `throw undefined`.
+ - **H3: Semaphore over-release** — `release()` guard against `#current > 0` prevents over-release corruption.
+ - **M1: IRC tool TOCTOU** — `irc-tool.ts` wraps `sendIrcMessage`/`broadcastIrcMessage` in try-catch.
+ - **M2: submit-result ordering** — Builds response string before calling `onYield`, wrapped in try-catch.
+ - **M3: Sensitive paths false positives** — Word-boundary-aware token matching replaces substring matching.
+ - **M4: atomic-write sleepSync** — Added WARNING comment about blocking main thread.
+ - **M7: URL regex trailing punctuation** — Precise regex excludes trailing punctuation from URL matches.
+ - **L1: parent-guard comment** — Corrected misleading comment about `process.kill` on Windows.
+ - **Yield handler DRY** — Extracted `extractYieldDataFromArgs` helper, `isObjectRecord`/`isStringRecord` type guards, safe `find()` pattern.
+ - **Event-log-rotation TOCTOU** — `compactEventLog` re-reads file after initial read to merge concurrent appends; `readEvents` skips corrupt JSON lines.
+ - **Ghost agent dedup** — Fixed duplicate agent records in `crew-agent-records` after crash recovery.
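
Annotation on the **H3: Semaphore over-release** entry above: a counting semaphore can be corrupted when `release()` is called more often than `acquire()`, because the in-use count drops below zero and later acquisitions exceed the intended limit. The sketch below shows one way such a guard might look. The class name `SemaphoreSketch` and its members are hypothetical; the real `semaphore.ts` uses a private `#current` field whose surrounding code is not shown in this diff.

```typescript
// Hypothetical sketch of a counting semaphore with an over-release guard.
class SemaphoreSketch {
  private inUse = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(private readonly max: number) {
    if (!Number.isInteger(max) || max < 1) {
      throw new RangeError("max must be a positive integer");
    }
  }

  /** Number of slots currently held. */
  get current(): number {
    return this.inUse;
  }

  /** Take a slot immediately if one is free, otherwise wait in FIFO order. */
  async acquire(): Promise<void> {
    if (this.inUse < this.max) {
      this.inUse++;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  /** Give a slot back; a release without a matching acquire is ignored. */
  release(): void {
    if (this.inUse <= 0) return; // guard: never let the count go negative
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the slot straight to the next waiter; count is unchanged
      return;
    }
    this.inUse--;
  }
}
```

Ignoring the extra `release()` silently is a design choice made for this sketch; throwing or logging would be equally reasonable, and the changelog does not say which the package does.
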
+
+ ### Research
+
+ - `docs/research/AGENT-EXECUTION-ARCHITECTURE.md` — Detailed comparison of 3 execution modes (oh-my-pi in-process, pi-crew child-process, pi-crew live-session).
+ - `docs/research/UI-RESPONSIVENESS-AUDIT.md` — Root cause analysis for 2-5s agent spawn visibility delay, 5 proposed fixes with priority matrix.
+ - `docs/research/DEEP-RESEARCH-PI-POWERBAR.md` — Deep analysis of pi-powerbar architecture (producer/consumer pattern, rendering, settings, comparison with pi-crew's powerbar publisher).
+
+ ## 0.1.48
+
+ ### Added
+
+ - **Yield-based completion contract** — Workers can call `submit_result` tool to return structured results; task-runner warns on workers that don't yield.
+ - **Typed event channels** — `RunEventBus` supports 5 channels (`worker:progress`, `worker:lifecycle`, `worker:stream`, `run:state`, `ui:invalidate`) with `onChannel`/`onChannelForRun` subscriptions and auto-classification.
+ - **Human-readable task names** — `generateTaskName()` produces AdjectiveNoun names (14,400 combinations); `displayName` field on `TeamTaskState`.
+ - **SubprocessToolRegistry** — Extensible tool event handling with `register`/`extractAll`/`shouldTerminate` pattern; wired into event-stream-bridge.
+ - **Event log rotation/compaction** — Auto-compacts event logs over 5MB/50k events, keeping last 1000 events; atomic file replacement.
+ - **Incremental JSONL reader** — `readLinesSince`/`readJsonlSince` for seek-based file reading; wired into `readEventsCursor` with `fromByteOffset`.
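
Annotation on the **Event log rotation/compaction** entry above: the policy it describes (compact once a log exceeds a size or event-count threshold, keep only the newest events) can be sketched as two pure functions. This is a sketch under stated assumptions, not the code in `event-log-rotation.ts`: the names `needsCompaction` and `compactEvents` are hypothetical, "5MB" is read as 5 × 1024 × 1024 bytes, and the two thresholds are treated as alternatives (either one triggers compaction). The atomic file replacement step is omitted.

```typescript
// Hypothetical sketch of the compaction policy; thresholds taken from the changelog entry.
const MAX_LOG_BYTES = 5 * 1024 * 1024; // assumption: "5MB" means 5 MiB
const MAX_LOG_EVENTS = 50_000;
const KEEP_LAST_EVENTS = 1_000;

/** Assumption: exceeding EITHER threshold triggers compaction. */
function needsCompaction(sizeBytes: number, eventCount: number): boolean {
  return sizeBytes > MAX_LOG_BYTES || eventCount > MAX_LOG_EVENTS;
}

/** Keep only the newest `keepLast` events, preserving their order. */
function compactEvents<T>(events: readonly T[], keepLast: number = KEEP_LAST_EVENTS): T[] {
  if (events.length <= keepLast) return events.slice();
  return events.slice(events.length - keepLast);
}
```

In a real log writer the compacted events would then be written to a temporary file and renamed over the original, which is what "atomic file replacement" usually refers to.
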
+
+ ### Fixed
+
+ - Fixed `readBlob`/`readBlobMetadata` crash on missing files — now returns `undefined`.
+ - Fixed `readSseJson` crash on non-JSON SSE data — now skips malformed events.
+ - Fixed wrong value `"long_running"` → `"active_long_running"` in agent-control.
+ - Fixed `consecutiveFailures` type bypass — added to `CrewAgentProgress` interface.
+ - Fixed `streamBridge.dispose()` memory leak — now in try/finally.
+ - Fixed blob-store redundant ternary `typeof x === "string" ? x : x`.
+ - Fixed team-runner non-null assertion on potentially empty array.
+ - Fixed event-log silent error swallowing — now logs via `logInternalError`.
+ - Fixed team-tool switch case indentation.
+ - Removed dead code `expandIcon` in agent-management-overlay.
+
+ ### Changed
+
+ - Moved 6 research .md files from repo root to `docs/research/`.
+ - `discoverAgents`/`discoverSkills` silent catches now log via `logInternalError`.
+ - `executeHook` accumulates non-blocking diagnostics instead of short-circuiting.
+ - `CancellationToken.heartbeat` wired into `collectRuns` and `pruneFinishedRuns`.
+ - `CapabilitySource` extended with `"git"` to match `ResourceSource`.
+
+ ## 0.1.47
+
+ ### Added
+
+ - **Typed hook lifecycle** — 8 of 9 hooks wired: `before_run_start`, `before_task_start`, `task_result`, `before_cancel`, `before_forget`, `before_cleanup`, `before_publish`, `run_recovery`. Hooks are opt-in, blocking/non-blocking, with audit events.
+ - **Event-first UI bus** — `RunEventBus` emits on every `appendEvent` call; dashboard, crew widget, sidebar, and snapshot cache subscribe for event-driven invalidation instead of polling.
+ - **Shared scan cache** — `SharedScanCache` caches manifest reads and active-run entries with TTL, mtime/size invalidation, and LRU eviction.
+ - **Capability inventory** — `buildCapabilityInventory()` enumerates teams, workflows, agents, and skills with stable `kind:name` IDs; supports policy disable and shadowing detection.
+ - **Skills in capability inventory** — `discoverSkills()` reads SKILL.md frontmatter; skills appear with kind=`skill` and source=`package`/`project`.
+ - **Mailbox kind-separated breakdown** — `RunUiMailbox` tracks `steerUnread`/`followUpUnread`/`responseUnread`/`messageUnread`; mailbox pane shows urgency indicators.
+ - **Run recovery hook** — `applyRecoveryPlan` fires `run_recovery` hook; blocked recovery emits `crew.run.recovery_blocked` event.
+ - **Synthetic tool cancellation evidence** — Cancelled in-flight tasks receive `tool`-level terminal evidence alongside `worker`-level.
+ - **CancellationToken wired into production loops** — `collectRuns` and `pruneFinishedRuns` use `CancellationToken.heartbeat(stage)` for progress diagnostics.
+ - **Blob artifact store** — SHA-256 content-addressed storage with metadata sidecars.
+ - **Run event provenance** — Event metadata includes `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
+ - **Control channel reservation** — `ControlReservation` before worker spawn with deterministic `controllerId`.
+ - **Release smoke test** — `npm run smoke:release` automates tarball install + version consistency check.
+ - **Width-safety tests** — Crew widget rendering verified at widths 1/40/200/empty/multiple.
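
Annotation on the **Shared scan cache** entry above: the three invalidation mechanisms it names (TTL, mtime/size comparison, LRU eviction) can be combined in a small in-memory cache. The sketch below is only an illustration of that combination; `ScanCacheSketch`, `FileStamp`, and the injected `now` clock are hypothetical names, and the real `SharedScanCache` API in `scan-cache.ts` is not shown in this diff.

```typescript
// Hypothetical sketch: TTL + mtime/size invalidation + LRU eviction on top of a Map.
interface FileStamp {
  mtimeMs: number;
  size: number;
}

interface CacheEntry<V> {
  value: V;
  stamp: FileStamp;
  expiresAt: number;
}

class ScanCacheSketch<V> {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, CacheEntry<V>>();

  constructor(
    private readonly maxEntries: number,
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  get size(): number {
    return this.entries.size;
  }

  /** Return the cached value only if it is unexpired and the file stamp is unchanged. */
  get(key: string, stamp: FileStamp): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    const stale =
      this.now() >= entry.expiresAt ||
      entry.stamp.mtimeMs !== stamp.mtimeMs ||
      entry.stamp.size !== stamp.size;
    this.entries.delete(key);
    if (stale) return undefined; // stale entries are dropped
    this.entries.set(key, entry); // re-insert to mark as most recently used
    return entry.value;
  }

  /** Store a value and evict the least recently used entry when over capacity. */
  set(key: string, value: V, stamp: FileStamp): void {
    this.entries.delete(key);
    this.entries.set(key, { value, stamp, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

A caller would typically obtain the stamp from a cheap `stat` call on the manifest file and fall back to a full read only when `get` returns `undefined`.
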
+
+ ### Changed
+
+ - `handleCancel`, `handleForget`, `handleCleanup`, `handlePrune`, `handleExport` converted to async for hook execution.
+ - `before_cancel`/`before_forget`/`before_cleanup` hooks can block their respective operations.
+ - `before_publish` hook fires before run export.
+ - `task_result` hook fires before `task.completed`/`task.failed` events.
+ - Dashboard, widget, and sidebar auto-invalidate on `RunEventBus` events.
+
  ## 0.1.45
 
  ### Added
package/agents/analyst.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: analyst
- description: Analyze requirements, ambiguity, and hidden constraints
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls
- ---
-
- You are a requirements analyst. Identify what is known, unknown, risky, ambiguous, or underspecified. Produce clarifying assumptions and acceptance criteria.
+ ---
+ name: analyst
+ description: Analyze requirements, ambiguity, and hidden constraints
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls
+ ---
+
+ You are a requirements analyst. Identify what is known, unknown, risky, ambiguous, or underspecified. Produce clarifying assumptions and acceptance criteria.
package/agents/critic.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: critic
- description: Challenge plans and designs before execution
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls
- ---
-
- You are a critical reviewer. Find flaws, missing steps, unsafe assumptions, overengineering, underengineering, and verification gaps. Return concrete fixes to the plan.
+ ---
+ name: critic
+ description: Challenge plans and designs before execution
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls
+ ---
+
+ You are a critical reviewer. Find flaws, missing steps, unsafe assumptions, overengineering, underengineering, and verification gaps. Return concrete fixes to the plan.
package/agents/executor.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: executor
- description: Implement planned code changes
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls, bash, edit, write
- ---
-
- You are an implementation specialist. Follow the provided plan, make targeted changes, keep edits minimal, and report changed files plus validation status. Do not broaden scope without explaining why.
+ ---
+ name: executor
+ description: Implement planned code changes
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls, bash, edit, write
+ ---
+
+ You are an implementation specialist. Follow the provided plan, make targeted changes, keep edits minimal, and report changed files plus validation status. Do not broaden scope without explaining why.
package/agents/explorer.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: explorer
- description: Fast codebase discovery and file/symbol mapping
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls
- ---
-
- You are a fast codebase explorer. Map relevant files, symbols, data flow, and constraints. Do not modify files. Return concise findings with paths and evidence.
+ ---
+ name: explorer
+ description: Fast codebase discovery and file/symbol mapping
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls
+ ---
+
+ You are a fast codebase explorer. Map relevant files, symbols, data flow, and constraints. Do not modify files. Return concise findings with paths and evidence.
package/agents/planner.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: planner
- description: Create an execution plan with clear sequencing and risk notes
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls
- ---
-
- You are a planning specialist. Convert the goal and discovery notes into a concrete, ordered plan. Identify dependencies, risks, validation steps, and handoff instructions for implementers.
+ ---
+ name: planner
+ description: Create an execution plan with clear sequencing and risk notes
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls
+ ---
+
+ You are a planning specialist. Convert the goal and discovery notes into a concrete, ordered plan. Identify dependencies, risks, validation steps, and handoff instructions for implementers.
package/agents/reviewer.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: reviewer
- description: Review code changes for correctness, maintainability, and regressions
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls, bash
- ---
-
- You are a code reviewer. Review the implementation for bugs, regressions, maintainability issues, missing tests, and project-rule violations. Return prioritized findings with evidence.
+ ---
+ name: reviewer
+ description: Review code changes for correctness, maintainability, and regressions
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls, bash
+ ---
+
+ You are a code reviewer. Review the implementation for bugs, regressions, maintainability issues, missing tests, and project-rule violations. Return prioritized findings with evidence.
package/agents/security-reviewer.md CHANGED
@@ -1,11 +1,11 @@
- ---
- name: security-reviewer
- description: Review changes for security vulnerabilities and trust-boundary issues
- model: false
- systemPromptMode: replace
- inheritProjectContext: true
- inheritSkills: false
- tools: read, grep, find, ls, bash
- ---
-
- You are a security reviewer. Look for injection, authn/authz flaws, insecure defaults, secret exposure, unsafe filesystem/network behavior, and dependency risks. Return severity and remediation.
+ ---
+ name: security-reviewer
+ description: Review changes for security vulnerabilities and trust-boundary issues
+ model: false
+ systemPromptMode: replace
+ inheritProjectContext: true
+ inheritSkills: false
+ tools: read, grep, find, ls, bash
+ ---
+
+ You are a security reviewer. Look for injection, authn/authz flaws, insecure defaults, secret exposure, unsafe filesystem/network behavior, and dependency risks. Return severity and remediation.
@@ -1,11 +1,11 @@
1
- ---
2
- name: test-engineer
3
- description: Design and implement test strategy for a change
4
- model: false
5
- systemPromptMode: replace
6
- inheritProjectContext: true
7
- inheritSkills: false
8
- tools: read, grep, find, ls, bash, edit, write
9
- ---
10
-
11
- You are a test engineer. Identify the right test level, add or adjust tests when asked, detect flaky assumptions, and report exact validation commands and results.
1
+ ---
2
+ name: test-engineer
3
+ description: Design and implement test strategy for a change
4
+ model: false
5
+ systemPromptMode: replace
6
+ inheritProjectContext: true
7
+ inheritSkills: false
8
+ tools: read, grep, find, ls, bash, edit, write
9
+ ---
10
+
11
+ You are a test engineer. Identify the right test level, add or adjust tests when asked, detect flaky assumptions, and report exact validation commands and results.
package/agents/verifier.md CHANGED
@@ -1,11 +1,11 @@
1
- ---
2
- name: verifier
3
- description: Verify that implementation satisfies the requested goal
4
- model: false
5
- systemPromptMode: replace
6
- inheritProjectContext: true
7
- inheritSkills: false
8
- tools: read, grep, find, ls, bash
9
- ---
10
-
11
- You are a verification specialist. Check whether the work is complete, correct, tested, and aligned with project constraints. Prefer evidence over assumptions. Return PASS or FAIL with reasons.
1
+ ---
2
+ name: verifier
3
+ description: Verify that implementation satisfies the requested goal
4
+ model: false
5
+ systemPromptMode: replace
6
+ inheritProjectContext: true
7
+ inheritSkills: false
8
+ tools: read, grep, find, ls, bash
9
+ ---
10
+
11
+ You are a verification specialist. Check whether the work is complete, correct, tested, and aligned with project constraints. Prefer evidence over assumptions. Return PASS or FAIL with reasons.
package/agents/writer.md CHANGED
@@ -1,11 +1,11 @@
1
- ---
2
- name: writer
3
- description: Write concise documentation, migration notes, and summaries
4
- model: false
5
- systemPromptMode: replace
6
- inheritProjectContext: true
7
- inheritSkills: false
8
- tools: read, grep, find, ls, edit, write
9
- ---
10
-
11
- You are a documentation specialist. Produce clear, concise, maintainable docs and summaries. Preserve technical accuracy and avoid marketing fluff.
1
+ ---
2
+ name: writer
3
+ description: Write concise documentation, migration notes, and summaries
4
+ model: false
5
+ systemPromptMode: replace
6
+ inheritProjectContext: true
7
+ inheritSkills: false
8
+ tools: read, grep, find, ls, edit, write
9
+ ---
10
+
11
+ You are a documentation specialist. Produce clear, concise, maintainable docs and summaries. Preserve technical accuracy and avoid marketing fluff.
package/docs/next-upgrade-roadmap.md CHANGED
@@ -22,6 +22,66 @@ Already implemented and pushed:
22
22
  - Live-agent control distinguishes `steer` from `follow-up` at live-control/API level.
23
23
  - Retry attempts have `attemptId`; max-retry deadletters link to the final `attemptId`.
24
24
  - Worker prompt pipeline and capability inventory metadata artifacts are written per task.
25
+ - P0.1: effectiveness guard escalates `warn` to `blocked` for mutating-role tasks with no observable worker activity.
26
+ - P1.1: mailbox `readMailbox` accepts `kind` filter; API `read-mailbox` supports `config.kind`.
27
+ - P1.5: `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
28
+ - P1.6: `buildSyntheticTerminalEvidence()` produces `"worker"`/`"cancelled"` terminal records for cancelled in-flight tasks.
29
+ - P1.7: `buildCapabilityInventory(cwd)` normalizes teams/workflows/agents; API `operation=inventory`.
30
+ - P2.1: typed hook lifecycle — `registerHook`/`executeHook` registry; `before_run_start` and `before_task_start` wired.
31
+ - P2.4: `AbortSignal` wired into `collectRuns`, `validateMailbox`, `readAllMailboxMessages`, `pruneFinishedRuns`, `cleanupRunWorktrees`, etc.
32
+ - Resume scaffold runs preserve scaffold mode from original manifest when workers not disabled.
33
+
34
+ ## Implementation Status as of `v0.1.46`
35
+
36
+ This roadmap is **not complete overall**. The `v0.1.46` release completed several vertical slices, but multiple roadmap items remain partial or unimplemented.
37
+
38
+ ### Implemented / mostly implemented
39
+
40
+ - Baseline worker behavior: real child-process execution by default, explicit scaffold dry-runs, and blocked implicit scaffold/no-op runs.
41
+ - P0.1 ✅ effectiveness policy enforcement: default guard escalates `warn` to `blocked` for mutating-role tasks.
42
+ - P0.2 ✅ runtime safety persistence: manifests persist `runtimeResolution`; `runtime.resolved` event emitted; status shows safety; blocked runs persist evidence.
43
+ - Effectiveness reporting: summary/progress/status expose no-observed-work evidence and policy outcome.
44
+ - Structured cancellation basics: cancellation reasons flow through retry/backoff/team-runner paths and run/task events.
45
+ - Retry attempt evidence: retry attempts and max-retry deadletters carry/link `attemptId` data.
46
+ - Prompt pipeline artifacts and per-task capability metadata artifacts are written.
47
+ - P1.3 worker teardown evidence vertical slice: `WorkerExitStatus` and terminal worker cancellation evidence exist.
48
+
49
+ ### Completed in this upgrade cycle (after v0.1.46)
50
+
51
+ - P0.1 effectiveness policy enforcement: default guard now escalates `warn` to `blocked` for mutating-role tasks with no observable worker activity; read-only roles remain `warning`.
52
+ - P0.2 runtime safety persistence: manifests persist `runtimeResolution`; `runtime.resolved` event emitted; status shows safety; blocked runs persist evidence.
53
+ - P1.1 durable steering/follow-up queues: `readMailbox` accepts `kind` filter; API `read-mailbox` supports `config.kind`; steering and follow-up are isolatable by kind.
54
+ - P1.2 respond vs follow-up UX: `/team-follow-up` command added for continuation prompts; `/team-respond` remains for waiting-task replies.
55
+ - P1.3 two-phase child process teardown: `WorkerExitStatus` populated from graceful SIGTERM → grace window → hard kill pipeline.
56
+ - P1.5 event-tree provenance: `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`; retry and cancel events carry `attemptId`.
57
+ - P1.6 synthetic terminal results: `buildSyntheticTerminalEvidence()` in `cancellation.ts`; cancelled in-flight tasks receive `"worker"`/`"cancelled"` terminal evidence records.
58
+ - P1.7 unified capability inventory: `buildCapabilityInventory(cwd)` normalizes teams/workflows/agents into `CapabilityItem[]`; API `operation=inventory` returns sorted JSON.
59
+ - P1.8 capability disable by stable ID: `disabledCapabilities` in `CrewPolicyConfig`; inventory marks disabled items with reason.
60
+ - P2.1 typed hook lifecycle: `HookName`, `HookMode`, `HookOutcome`, `HookContext`, `HookResult`, `HookExecutionReport` types; `registerHook`/`executeHook`/`clearHooks` registry; `before_run_start` and `before_task_start` wired into team-runner.
61
+ - P2.2 intent gates for destructive actions: `enforceDestructiveIntent` wired in cancel/cleanup/forget/prune/delete; configurable via `policy.requireIntentForDestructiveActions`.
62
+ - P2.3 durable history projection: `transformRunContextBeforeWorkerStart()` and `convertRunHistoryToWorkerPrompt()` bounded projection functions.
63
+ - P2.4 CancellationToken wired into long scans: `AbortSignal` passed to `collectRuns`/`validateMailbox`/`readAllMailboxMessages`/`pruneFinishedRuns`/`cleanupRunWorktrees`.
64
+ - P2.5 content-addressed blob store: `writeBlob`/`readBlob`/`readBlobMetadata` with SHA-256 dedup and metadata sidecars.
65
+ - P2.6 dashboard panes for capability and cancellation: `renderCapabilityPane` and `renderCancellationPane`.
66
+ - Resume scaffold run fix: preserves scaffold mode from original manifest when workers not disabled.
67
+
68
+ ### Partial / not safe to mark complete
69
+
70
+ - P1.4 reserve worker control channel before spawn: controller metadata persistence during startup not yet implemented.
71
+ - P2.7 event-first UI: render coalescing and snapshot caches exist, but live UI still relies on durable file polling as a primary source in several panes.
72
+ - P2.8 shared raw scan-entry cache: not yet implemented.
73
+
74
+ ### Completed / no longer backlog
75
+
76
+ - P2.7 event-first UI — RunEventBus wired into appendEvent; dashboard, widget, sidebar auto-invalidate on events; snapshot cache invalidates on events.
77
+ - P2.8 shared raw scan-entry cache — SharedScanCache implemented and wired into manifest reads (run-index) and active-run-registry (active manifest reads).
78
+ - P3.1 tarball-install smoke — `scripts/release-smoke.mjs` verified; `npm run smoke:release` added.
79
+ - Hook lifecycle — All hooks wired: `before_run_start`, `before_task_start`, `before_cancel`, `before_forget`, `before_cleanup`, `before_publish`, `task_result`, `run_recovery`. Only `session_before_switch` remains (no cwd switch mechanism in current codebase).
80
+
81
+ ### Remaining items
82
+
83
+ - `session_before_switch` hook — no cwd/session switch mechanism in current codebase; placeholder for future.
84
+ - P3.2 CI gate — integrate `smoke:release` into CI pipeline (requires CI config).
25
85
 
26
86
  ## Priority Legend
27
87
 
@@ -82,12 +142,13 @@ Already implemented and pushed:
82
142
  - set run `blocked` or `failed` depending on config;
83
143
  - include task IDs in `data`.
84
144
 
85
- **Acceptance criteria**
145
+ **Acceptance criteria**
86
146
 
87
- - A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
88
- - Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
89
- - `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
90
- - Unit tests cover warn/block/fail modes.
147
+ - A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
148
+ - Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
149
+ - `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
150
+ - Unit tests cover warn/block/fail modes.
151
+ - ✅ Default guard escalates `warn` to `blocked` for mutating-role tasks.
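The escalation rule in the criteria above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TaskEvidence` shape, the `evaluateEffectiveness` name, and the set of roles treated as mutating are hypothetical and are not taken from pi-crew's source.

```typescript
// Hypothetical sketch of "no observed work" escalation; not pi-crew's actual guard.
type GuardOutcome = "ok" | "warning" | "blocked";

interface TaskEvidence {
  role: string;
  toolCalls: number;       // tool invocations seen in the worker transcript
  modelTurns: number;      // model responses seen
  transcriptBytes: number; // size of any transcript written
}

// Assumption: roles whose agents carry edit/write tools count as mutating.
const MUTATING_ROLES = new Set(["executor", "test-engineer", "writer"]);

function hasObservedWork(evidence: TaskEvidence): boolean {
  return evidence.toolCalls > 0 || evidence.modelTurns > 0 || evidence.transcriptBytes > 0;
}

function evaluateEffectiveness(evidence: TaskEvidence): GuardOutcome {
  if (hasObservedWork(evidence)) return "ok";
  // No evidence at all: mutating roles are escalated, read-only roles only warn.
  return MUTATING_ROLES.has(evidence.role) ? "blocked" : "warning";
}
```

Under these assumptions, an idle `executor` task is reported as `blocked`, while an idle `reviewer` task stays at `warning`.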
91
152
 
92
153
  **Verification**
93
154
 
@@ -130,11 +191,12 @@ npm run test:unit
130
191
  - `test/unit/team-run.test.ts`
131
192
  - `test/unit/runtime-resolver.test.ts`
132
193
 
133
- **Acceptance criteria**
194
+ **Acceptance criteria**
134
195
 
135
- - `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
136
- - Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
137
- - Existing manifest schema remains backward compatible.
196
+ - `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
197
+ - Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
198
+ - Existing manifest schema remains backward compatible.
199
+ - ✅ `runtimeResolution` persisted on manifest; `runtime.resolved` event emitted.
138
200
 
139
201
  ## P1 — Steering/Follow-up Semantics Beyond Live Control
140
202
 
@@ -170,12 +232,12 @@ npm run test:unit
170
232
  - `test/unit/live-agent-control.test.ts`
171
233
  - `test/unit/respond-tool.test.ts`
172
234
 
173
- **Acceptance criteria**
235
+ **Acceptance criteria** ✅ (partially — kind filter and API done; UI pane separation remaining)
174
236
 
175
- - Steering and follow-up can be inspected separately.
176
- - Existing inbox/outbox JSONL remains readable.
177
- - Durable queue survives process/session switch.
178
- - Realtime live delivery dedupes against durable replay.
237
+ - Steering and follow-up can be inspected separately via `readMailbox` kind filter and API `config.kind`.
238
+ - Existing inbox/outbox JSONL remains readable.
239
+ - Kind filter survives process/session switch (durable mailbox).
240
+ - UI/status separates urgent steering from follow-up backlog (mailbox pane shows kind breakdown with urgency indicators).
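A kind filter over a JSONL mailbox, as described in the criteria above, might look roughly like the sketch below. The `MailboxMessage` shape and the `readMailboxByKind` helper are hypothetical; the real `readMailbox` signature is not shown in this diff, and only the `steer` and `follow-up` kind names come from the roadmap text.

```typescript
// Hypothetical sketch of filtering a JSONL mailbox by message kind.
type MailboxKind = "steer" | "follow-up";

interface MailboxMessage {
  id: string;
  kind?: MailboxKind; // older records may lack a kind and must stay readable
  body: string;
}

// Parse one JSON object per line, ignoring blank lines.
function parseMailbox(jsonl: string): MailboxMessage[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as MailboxMessage);
}

// With no kind, return everything (existing behavior); otherwise keep only matches.
function readMailboxByKind(jsonl: string, kind?: MailboxKind): MailboxMessage[] {
  const all = parseMailbox(jsonl);
  return kind === undefined ? all : all.filter((message) => message.kind === kind);
}
```

Because the filter is applied at read time, records written before the `kind` field existed remain readable through the unfiltered path.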
179
241
 
180
242
  ### P1.2 Clarify `respond` vs `follow-up` UX
181
243
 
@@ -307,11 +369,12 @@ Retry attempts have `attemptId`, and deadletters link to final attempt. Event lo
307
369
  - `test/unit/event-metadata.test.ts`
308
370
  - `test/unit/retry-executor.test.ts`
309
371
 
310
- **Acceptance criteria**
372
+ **Acceptance criteria**
311
373
 
312
- - Retry attempt events and terminal task events share attempt provenance.
313
- - Deadletter records can be traced back to event sequence.
314
- - Existing JSONL readers ignore missing provenance fields.
374
+ - Retry attempt events and terminal task events share attempt provenance.
375
+ - Deadletter records can be traced back to event sequence.
376
+ - Existing JSONL readers ignore missing provenance fields.
377
+ - ✅ `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
315
378
 
316
379
  ### P1.6 Synthetic terminal results for cancelled in-flight operations
317
380
 
@@ -336,10 +399,11 @@ Run/task cancellation events are now structured, but worker/tool sub-operations
336
399
  - `src/state/contracts.ts`
337
400
  - `test/unit/cancellation.test.ts`
338
401
 
339
- **Acceptance criteria**
402
+ **Acceptance criteria**
340
403
 
341
- - No started tool/model operation is left without terminal evidence after cancellation.
342
- - Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
404
+ - No started tool/model operation is left without terminal evidence after cancellation.
405
+ - Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
406
+ - ✅ `buildSyntheticTerminalEvidence()` in `cancellation.ts` produces `"worker"`/`"cancelled"` records.
343
407
 
344
408
  ## P1 — Capability Inventory and Control Center
345
409
 
@@ -379,10 +443,10 @@ interface CapabilityItem {
379
443
 
380
444
  **Acceptance criteria**
381
445
 
382
- - Inventory is stable and sorted.
383
- - Shadowed project/user/builtin resources are visible.
384
- - Skill disabled/budget state is visible.
385
- - No file path is used as the only stable ID.
446
+ - Inventory is stable and sorted.
447
+ - Shadowed project/user/builtin resources are visible in capability inventory (state="shadowed", shadowedBy field).
448
+ - Skill disabled/budget state is visible in capability inventory (skills enumerated via discoverSkills).
449
+ - No file path is used as the only stable ID (uses `kind:name` IDs).
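The inventory criteria above (stable `kind:name` IDs, sorted output, visible shadowing) can be sketched as follows. This is an illustration under stated assumptions: `RawResource`, `InventoryEntry`, `buildInventory`, and the project > user > builtin precedence order are hypothetical and do not reproduce the actual `CapabilityItem` type or `buildCapabilityInventory(cwd)`.

```typescript
// Hypothetical sketch of a sorted inventory with shadowing; not pi-crew's implementation.
type Source = "project" | "user" | "builtin";
type Kind = "team" | "workflow" | "agent" | "skill";

interface RawResource { kind: Kind; name: string; source: Source; }

interface InventoryEntry extends RawResource {
  id: string;                 // stable `kind:name` ID, independent of file path
  state: "active" | "shadowed";
  shadowedBy?: Source;
}

// Assumption: earlier sources win over later ones.
const PRECEDENCE: Source[] = ["project", "user", "builtin"];
const rank = (source: Source): number => PRECEDENCE.indexOf(source);

function buildInventory(resources: RawResource[]): InventoryEntry[] {
  // First pass: find the winning source for each stable ID.
  const winners = new Map<string, Source>();
  for (const r of resources) {
    const id = `${r.kind}:${r.name}`;
    const current = winners.get(id);
    if (current === undefined || rank(r.source) < rank(current)) winners.set(id, r.source);
  }
  // Second pass: mark losers as shadowed, then sort for stable output.
  return resources
    .map((r): InventoryEntry => {
      const id = `${r.kind}:${r.name}`;
      const winner = winners.get(id) as Source;
      return r.source === winner
        ? { ...r, id, state: "active" }
        : { ...r, id, state: "shadowed", shadowedBy: winner };
    })
    .sort((a, b) => a.id.localeCompare(b.id) || rank(a.source) - rank(b.source));
}
```

Keeping shadowed entries in the output, rather than dropping them, is what makes an overridden builtin visible to an operator.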
386
450
 
387
451
  ### P1.8 Persist capability disables by stable ID
388
452
 
@@ -436,11 +500,18 @@ Errors are recorded in diagnostics/events, not uncontrolled exceptions.
436
500
  - `docs/resource-formats.md`
437
501
  - `test/unit/hooks*.test.ts`
438
502
 
439
- **Acceptance criteria**
503
+ **Acceptance criteria** ✅ (partial — `before_cancel` not yet wired for async)
440
504
 
441
- - Blocking hook can stop a run before worker start with clear event and status.
442
- - Non-blocking hook failure records diagnostic and does not crash run.
443
- - Hook context is redacted and bounded.
505
+ - Blocking hook can stop a run before worker start with clear event and status.
506
+ - Non-blocking hook failure records diagnostic and does not crash run.
507
+ - Hook context is redacted and bounded.
508
+ - ✅ `before_cancel` hook wired (async handleCancel conversion done).
509
+ - ✅ `before_forget` hook wired (async handleForget conversion done).
510
+ - ✅ `before_cleanup` hook wired (async handleCleanup conversion done).
511
+ - ✅ `task_result` hook wired in task-runner before completed/failed event.
512
+ - ✅ `before_publish` hook wired in handleExport.
513
+ - ✅ `run_recovery` hook wired in crash-recovery `applyRecoveryPlan`.
514
+ - ☐ `session_before_switch` not yet wired (no cwd switch mechanism in current codebase; placeholder for future Pi lifecycle integration).
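The blocking versus non-blocking behavior in the criteria above can be sketched with a small synchronous registry. The function names echo those listed in the roadmap (`registerHook`, `executeHook`, `clearHooks`), but every shape and signature below is an assumption, including the choice that a throwing blocking hook blocks the run. Context redaction and bounding are not modeled.

```typescript
// Hypothetical sketch of a hook registry; signatures are assumed, not pi-crew's.
type HookName = "before_run_start" | "before_task_start";
type HookMode = "blocking" | "non_blocking";

interface HookResult { allow: boolean; reason?: string; }
type HookHandler = (context: Record<string, unknown>) => HookResult;

interface HookExecutionReport {
  hook: HookName;
  blocked: boolean;
  reason?: string;
  diagnostics: string[]; // failures that were recorded but did not stop the run
}

const registry = new Map<HookName, { mode: HookMode; handler: HookHandler }[]>();

function registerHook(name: HookName, mode: HookMode, handler: HookHandler): void {
  const list = registry.get(name) ?? [];
  list.push({ mode, handler });
  registry.set(name, list);
}

function clearHooks(): void {
  registry.clear();
}

function executeHook(name: HookName, context: Record<string, unknown>): HookExecutionReport {
  const report: HookExecutionReport = { hook: name, blocked: false, diagnostics: [] };
  for (const { mode, handler } of registry.get(name) ?? []) {
    let result: HookResult;
    try {
      result = handler(context);
    } catch (err) {
      // Errors become a denial with a reason instead of escaping to the caller.
      result = { allow: false, reason: err instanceof Error ? err.message : String(err) };
    }
    if (result.allow) continue;
    if (mode === "blocking") {
      report.blocked = true;
      report.reason = result.reason;
      break;
    }
    report.diagnostics.push(result.reason ?? "hook declined without reason");
  }
  return report;
}
```

A caller would check `report.blocked` before starting workers and write `report.diagnostics` to its event log either way.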
444
515
 
445
516
  ### P2.2 Require intent via policy/hook for destructive actions
446
517
 
@@ -547,11 +618,11 @@ Use it in:
547
618
  - `src/ui/run-snapshot-cache.ts`
548
619
  - `test/unit/cancellation-token.test.ts`
549
620
 
550
- **Acceptance criteria**
621
+ **Acceptance criteria**
551
622
 
552
- - Long scan can abort within bounded cadence.
553
- - Heartbeat stage appears in diagnostics/logs.
554
- - Existing APIs can pass no token and keep current behavior.
623
+ - Long scan can abort within bounded cadence (`AbortSignal` wired into `collectRuns`, `validateMailbox`, `readAllMailboxMessages`, `pruneFinishedRuns`, `cleanupRunWorktrees`).
624
+ - `CancellationToken.heartbeat(stage)` wired into `collectRuns` and `pruneFinishedRuns` with stage diagnostics.
625
+ - Existing APIs can pass no token/signal and keep current behavior.
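Aborting a long scan "within bounded cadence" usually means checking the signal every N entries rather than on every entry. The sketch below assumes a hypothetical `scanEntries` helper and `ScanOptions` shape; it is not the code in `collectRuns` or `pruneFinishedRuns`. `AbortSignal` is the standard global available in modern Node.js and browsers.

```typescript
// Hypothetical sketch of a cancellable scan with periodic heartbeat.
interface ScanOptions {
  signal?: AbortSignal;                // optional: omitted means "never abort"
  heartbeat?: (stage: string) => void; // optional stage diagnostics
  checkEvery?: number;                 // how many entries between abort checks
}

interface ScanResult { visited: number; aborted: boolean; }

function scanEntries<T>(
  entries: readonly T[],
  visit: (entry: T) => void,
  options: ScanOptions = {},
): ScanResult {
  const { signal, heartbeat, checkEvery = 100 } = options;
  let visited = 0;
  for (const entry of entries) {
    // Check at a fixed cadence so abort latency is bounded by `checkEvery` entries.
    if (visited % checkEvery === 0) {
      if (signal?.aborted) return { visited, aborted: true };
      heartbeat?.(`scan:${visited}`);
    }
    visit(entry);
    visited++;
  }
  return { visited, aborted: false };
}
```

Callers that pass no options get the same full scan as before, which matches the backward-compatibility criterion.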
555
626
 
556
627
  ## P2 — Artifact Store Improvements
557
628
 
@@ -699,16 +770,20 @@ npm pack
699
770
 
700
771
  ## Suggested Implementation Order
701
772
 
702
- 1. **P0.1 Effectiveness policy enforcement**: prevents misleading completed runs.
703
- 2. **P0.2 Persist runtime safety**: improves debugging for worker spawn issues.
773
+ 1. ~~**P0.1 Effectiveness policy enforcement**~~ ✅ Completed: default guard escalates `warn` to `blocked` for mutating-role tasks.
774
+ 2. ~~**P0.2 Persist runtime safety**~~ ✅ Completed: manifests persist `runtimeResolution`; `runtime.resolved` event emitted.
704
775
  3. **P1.3 Two-phase worker teardown** — reduces stale/zombie worker risk.
705
- 4. **P1.1 Durable steering/follow-up queues**: completes semantic split started at live-control level.
706
- 5. **P1.5 Event-tree provenance**: builds on current `attemptId` work.
707
- 6. **P1.7 Capability inventory view**: turns existing per-task artifacts into operator UX.
708
- 7. **P2.3 Durable history projection**: reduces prompt/context risks.
709
- 8. **P2.4 CancellationToken**: improves responsiveness of internal scans.
710
- 9. **P2.5 Blob artifacts**: prevents log/transcript bloat.
711
- 10. **P2.6 Dashboard panels**: surface all new evidence in UI.
776
+ 4. ~~**P1.1 Durable steering/follow-up queues**~~ ✅ Completed: `readMailbox` kind filter; API `read-mailbox` supports `config.kind`.
777
+ 5. ~~**P1.5 Event-tree provenance**~~ ✅ Completed: `TeamEventMetadata` extended with `parentEventId`/`attemptId`/`branchId`/`causationId`/`correlationId`.
778
+ 6. ~~**P1.7 Capability inventory view**~~ ✅ Completed: `buildCapabilityInventory()` + API `operation=inventory` + dashboard pane.
779
+ 7. ~~**P2.3 Durable history projection**~~ ✅ Completed: `transformRunContextBeforeWorkerStart()` + `convertRunHistoryToWorkerPrompt()`.
780
+ 8. ~~**P2.4 CancellationToken**~~ ✅ Completed: wired into `collectRuns`/`validateMailbox`/`pruneFinishedRuns`/`cleanupRunWorktrees` etc.
781
+ 9. ~~**P2.5 Blob artifacts**~~ ✅ Completed: content-addressed blob store with SHA-256 dedup and metadata sidecars.
782
+ 10. ~~**P2.6 Dashboard panels**~~ ✅ Completed: capability and cancellation panes.
783
+
784
+ Also completed (not in original order list):
785
+ - ~~**P1.6 Synthetic terminal results**~~ ✅ — `buildSyntheticTerminalEvidence()` for cancelled in-flight tasks.
786
+ - ~~**P2.1 Typed hook lifecycle**~~ ✅ — `before_run_start`/`before_task_start` wired into team-runner.
712
787
 
713
788
  ## Release Guidance
714
789