npm - pi-crew - Versions diffs - 0.1.46 → 0.1.51 - Mend

pi-crew 0.1.46 → 0.1.51

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (262)

package/CHANGELOG.md +115 -0
package/agents/analyst.md +11 -11
package/agents/critic.md +11 -11
package/agents/executor.md +11 -11
package/agents/explorer.md +11 -11
package/agents/planner.md +11 -11
package/agents/reviewer.md +11 -11
package/agents/security-reviewer.md +11 -11
package/agents/test-engineer.md +11 -11
package/agents/verifier.md +11 -11
package/agents/writer.md +11 -11
package/docs/next-upgrade-roadmap.md +117 -42
package/docs/refactor-tasks-phase3.md +394 -394
package/docs/refactor-tasks-phase4.md +564 -564
package/docs/refactor-tasks-phase5.md +402 -402
package/docs/refactor-tasks-phase6.md +662 -662
package/docs/research/AGENT-EXECUTION-ARCHITECTURE.md +261 -0
package/docs/research/AGENT-LIFECYCLE-COMPARISON.md +111 -0
package/docs/research/AUDIT_OH_MY_PI.md +261 -0
package/docs/research/AUDIT_PI_CREW.md +457 -0
package/docs/research/CAVEMAN-DEEP-RESEARCH.md +281 -0
package/docs/research/COMPARISON_OH_MY_PI_VS_PI_CREW.md +264 -0
package/docs/research/DEEP-RESEARCH-PI-POWERBAR.md +343 -0
package/docs/research/DEEP_RESEARCH_SUBAGENT_ARCHITECTURE.md +480 -0
package/docs/research/GAP_CLOSURE_IMPLEMENTATION_PLAN.md +354 -0
package/docs/research/IMPLEMENTATION_PLAN.md +385 -0
package/docs/research/LIVE-SESSION-PRODUCTION-READY-PLAN.md +502 -0
package/docs/research/OH-MY-PI-DEEP-RESEARCH-v14.7.6.md +266 -0
package/docs/research/REMAINING-GAPS-PLAN.md +363 -0
package/docs/research/SESSION-SUMMARY-2026-05-08.md +146 -0
package/docs/research/UI-RESPONSIVENESS-AUDIT.md +173 -0
package/docs/research-awesome-agent-skills-distillation.md +100 -100
package/docs/research-extension-examples.md +297 -297
package/docs/research-extension-system.md +324 -324
package/docs/research-oh-my-pi-distillation.md +56 -9
package/docs/research-optimization-plan.md +548 -548
package/docs/research-phase10-distillation.md +198 -198
package/docs/research-phase11-distillation.md +201 -201
package/docs/research-pi-coding-agent.md +357 -357
package/docs/research-source-pi-crew-reference.md +174 -174
package/docs/runtime-flow.md +148 -148
package/docs/source-runtime-refactor-map.md +107 -107
package/index.ts +6 -6
package/package.json +99 -98
package/schema.json +8 -0
package/skills/async-worker-recovery/SKILL.md +42 -42
package/skills/context-artifact-hygiene/SKILL.md +52 -52
package/skills/delegation-patterns/SKILL.md +54 -54
package/skills/mailbox-interactive/SKILL.md +40 -40
package/skills/model-routing-context/SKILL.md +39 -39
package/skills/multi-perspective-review/SKILL.md +58 -58
package/skills/observability-reliability/SKILL.md +41 -41
package/skills/orchestration/SKILL.md +157 -0
package/skills/ownership-session-security/SKILL.md +41 -41
package/skills/pi-extension-lifecycle/SKILL.md +39 -39
package/skills/requirements-to-task-packet/SKILL.md +63 -63
package/skills/resource-discovery-config/SKILL.md +41 -41
package/skills/runtime-state-reader/SKILL.md +44 -44
package/skills/secure-agent-orchestration-review/SKILL.md +45 -45
package/skills/state-mutation-locking/SKILL.md +42 -42
package/skills/systematic-debugging/SKILL.md +67 -67
package/skills/ui-render-performance/SKILL.md +39 -39
package/skills/verification-before-done/SKILL.md +57 -57
package/skills/worktree-isolation/SKILL.md +39 -39
package/src/agents/agent-config.ts +6 -0
package/src/agents/agent-search.ts +98 -0
package/src/agents/agent-serializer.ts +4 -0
package/src/agents/discover-agents.ts +17 -4
package/src/config/config.ts +25 -0
package/src/config/defaults.ts +16 -5
package/src/extension/autonomous-policy.ts +26 -33
package/src/extension/cross-extension-rpc.ts +94 -82
package/src/extension/help.ts +1 -0
package/src/extension/management.ts +5 -0
package/src/extension/project-init.ts +15 -3
package/src/extension/register.ts +78 -19
package/src/extension/registration/commands.ts +33 -1
package/src/extension/registration/compaction-guard.ts +125 -125
package/src/extension/registration/team-tool.ts +6 -4
package/src/extension/run-bundle-schema.ts +89 -89
package/src/extension/run-export.ts +26 -12
package/src/extension/run-index.ts +24 -18
package/src/extension/run-maintenance.ts +68 -62
package/src/extension/team-tool/api.ts +23 -2
package/src/extension/team-tool/cancel.ts +86 -11
package/src/extension/team-tool/context.ts +4 -1
package/src/extension/team-tool/handle-settings.ts +188 -188
package/src/extension/team-tool/inspect.ts +41 -41
package/src/extension/team-tool/intent-policy.ts +42 -0
package/src/extension/team-tool/lifecycle-actions.ts +47 -18
package/src/extension/team-tool/parallel-dispatch.ts +156 -0
package/src/extension/team-tool/plan.ts +19 -19
package/src/extension/team-tool/respond.ts +10 -2
package/src/extension/team-tool/run.ts +3 -2
package/src/extension/team-tool/status.ts +1 -1
package/src/extension/team-tool-types.ts +1 -0
package/src/extension/team-tool.ts +16 -5
package/src/hooks/registry.ts +61 -0
package/src/hooks/types.ts +41 -0
package/src/i18n.ts +184 -184
package/src/observability/exporters/otlp-exporter.ts +77 -77
package/src/prompt/prompt-runtime.ts +72 -72
package/src/runtime/agent-control.ts +108 -2
package/src/runtime/agent-memory.ts +72 -72
package/src/runtime/agent-observability.ts +114 -114
package/src/runtime/async-marker.ts +26 -26
package/src/runtime/async-runner.ts +3 -1
package/src/runtime/attention-events.ts +28 -28
package/src/runtime/background-runner.ts +19 -0
package/src/runtime/cancellation-token.ts +89 -0
package/src/runtime/cancellation.ts +61 -51
package/src/runtime/capability-inventory.ts +116 -0
package/src/runtime/child-pi.ts +2 -1
package/src/runtime/code-summary.ts +247 -0
package/src/runtime/completion-guard.ts +190 -190
package/src/runtime/concurrency.ts +3 -1
package/src/runtime/crash-recovery.ts +181 -0
package/src/runtime/crew-agent-records.ts +35 -7
package/src/runtime/crew-agent-runtime.ts +1 -0
package/src/runtime/custom-tools/irc-tool.ts +201 -0
package/src/runtime/custom-tools/submit-result-tool.ts +90 -0
package/src/runtime/delivery-coordinator.ts +3 -1
package/src/runtime/diagnostic-export.ts +3 -1
package/src/runtime/direct-run.ts +35 -35
package/src/runtime/effectiveness.ts +81 -76
package/src/runtime/event-stream-bridge.ts +92 -0
package/src/runtime/foreground-control.ts +82 -82
package/src/runtime/green-contract.ts +46 -46
package/src/runtime/group-join.ts +106 -106
package/src/runtime/heartbeat-gradient.ts +28 -28
package/src/runtime/heartbeat-watcher.ts +124 -124
package/src/runtime/live-agent-control.ts +88 -88
package/src/runtime/live-agent-manager.ts +78 -2
package/src/runtime/live-control-realtime.ts +36 -36
package/src/runtime/live-extension-bridge.ts +150 -0
package/src/runtime/live-irc.ts +92 -0
package/src/runtime/live-session-health.ts +100 -0
package/src/runtime/live-session-runtime.ts +297 -7
package/src/runtime/mcp-proxy.ts +113 -0
package/src/runtime/notebook-helpers.ts +90 -0
package/src/runtime/orphan-sentinel.ts +7 -0
package/src/runtime/output-validator.ts +187 -0
package/src/runtime/parallel-research.ts +44 -44
package/src/runtime/parallel-utils.ts +57 -0
package/src/runtime/parent-guard.ts +80 -0
package/src/runtime/pi-args.ts +11 -2
package/src/runtime/pi-json-output.ts +111 -111
package/src/runtime/pi-spawn.ts +21 -3
package/src/runtime/policy-engine.ts +79 -79
package/src/runtime/process-status.ts +14 -1
package/src/runtime/progress-event-coalescer.ts +43 -43
package/src/runtime/prose-compressor.ts +164 -0
package/src/runtime/recovery-recipes.ts +74 -74
package/src/runtime/result-extractor.ts +121 -0
package/src/runtime/role-permission.ts +39 -39
package/src/runtime/runtime-resolver.ts +1 -4
package/src/runtime/semaphore.ts +131 -0
package/src/runtime/sensitive-paths.ts +92 -0
package/src/runtime/session-resources.ts +25 -25
package/src/runtime/session-snapshot.ts +59 -59
package/src/runtime/session-usage.ts +79 -79
package/src/runtime/sidechain-output.ts +29 -29
package/src/runtime/stream-preview.ts +177 -0
package/src/runtime/subagent-manager.ts +3 -2
package/src/runtime/subprocess-tool-registry.ts +67 -0
package/src/runtime/supervisor-contact.ts +59 -59
package/src/runtime/task-display.ts +38 -38
package/src/runtime/task-output-context.ts +59 -9
package/src/runtime/task-runner/capabilities.ts +78 -78
package/src/runtime/task-runner/live-executor.ts +2 -0
package/src/runtime/task-runner/progress.ts +119 -119
package/src/runtime/task-runner/prompt-builder.ts +71 -9
package/src/runtime/task-runner/prompt-pipeline.ts +64 -64
package/src/runtime/task-runner/result-utils.ts +14 -14
package/src/runtime/task-runner/run-projection.ts +104 -0
package/src/runtime/task-runner/state-helpers.ts +22 -22
package/src/runtime/task-runner.ts +75 -4
package/src/runtime/team-runner.ts +69 -8
package/src/runtime/worker-heartbeat.ts +21 -21
package/src/runtime/worker-startup.ts +57 -57
package/src/runtime/workspace-tree.ts +298 -0
package/src/runtime/yield-handler.ts +189 -0
package/src/schema/config-schema.ts +7 -0
package/src/schema/team-tool-schema.ts +11 -1
package/src/skills/discover-skills.ts +67 -0
package/src/state/active-run-registry.ts +4 -2
package/src/state/artifact-store.ts +4 -1
package/src/state/atomic-write.ts +50 -1
package/src/state/blob-store.ts +117 -0
package/src/state/contracts.ts +1 -0
package/src/state/event-log-rotation.ts +158 -0
package/src/state/event-log.ts +52 -2
package/src/state/locks.ts +3 -1
package/src/state/mailbox.ts +87 -7
package/src/state/state-store.ts +24 -4
package/src/state/task-claims.ts +44 -44
package/src/state/types.ts +20 -0
package/src/state/usage.ts +29 -29
package/src/subagents/async-entry.ts +1 -1
package/src/subagents/index.ts +3 -3
package/src/subagents/live/control.ts +1 -1
package/src/subagents/live/manager.ts +1 -1
package/src/subagents/live/realtime.ts +1 -1
package/src/subagents/live/session-runtime.ts +1 -1
package/src/subagents/manager.ts +1 -1
package/src/subagents/spawn.ts +1 -1
package/src/teams/team-serializer.ts +38 -38
package/src/types/diff.d.ts +18 -18
package/src/ui/agent-management-overlay.ts +144 -0
package/src/ui/crew-footer.ts +101 -101
package/src/ui/crew-select-list.ts +111 -111
package/src/ui/crew-widget.ts +15 -4
package/src/ui/dashboard-panes/cancellation-pane.ts +43 -0
package/src/ui/dashboard-panes/capability-pane.ts +60 -0
package/src/ui/dashboard-panes/mailbox-pane.ts +35 -11
package/src/ui/dashboard-panes/metrics-pane.ts +34 -34
package/src/ui/dynamic-border.ts +25 -25
package/src/ui/layout-primitives.ts +106 -106
package/src/ui/live-run-sidebar.ts +4 -0
package/src/ui/loaders.ts +158 -158
package/src/ui/powerbar-publisher.ts +83 -15
package/src/ui/render-coalescer.ts +51 -0
package/src/ui/render-diff.ts +119 -119
package/src/ui/render-scheduler.ts +143 -143
package/src/ui/run-dashboard.ts +4 -0
package/src/ui/run-event-bus.ts +209 -0
package/src/ui/run-snapshot-cache.ts +68 -16
package/src/ui/snapshot-types.ts +8 -0
package/src/ui/spinner.ts +17 -17
package/src/ui/status-colors.ts +58 -58
package/src/ui/syntax-highlight.ts +116 -116
package/src/ui/transcript-entries.ts +258 -0
package/src/utils/atomic-write.ts +33 -33
package/src/utils/completion-dedupe.ts +63 -63
package/src/utils/frontmatter.ts +68 -68
package/src/utils/git.ts +262 -262
package/src/utils/ids.ts +17 -12
package/src/utils/incremental-reader.ts +104 -0
package/src/utils/names.ts +27 -27
package/src/utils/redaction.ts +44 -44
package/src/utils/safe-paths.ts +47 -47
package/src/utils/scan-cache.ts +137 -0
package/src/utils/sleep.ts +32 -32
package/src/utils/sse-parser.ts +134 -0
package/src/utils/task-name-generator.ts +337 -0
package/src/utils/visual.ts +33 -2
package/src/workflows/validate-workflow.ts +40 -40
package/src/worktree/branch-freshness.ts +45 -45
package/src/worktree/cleanup.ts +2 -1
package/src/worktree/worktree-manager.ts +11 -3
package/teams/default.team.md +12 -12
package/teams/fast-fix.team.md +11 -11
package/teams/implementation.team.md +18 -18
package/teams/parallel-research.team.md +14 -14
package/teams/research.team.md +11 -11
package/teams/review.team.md +12 -12
package/workflows/default.workflow.md +29 -29
package/workflows/fast-fix.workflow.md +22 -22
package/workflows/implementation.workflow.md +43 -38
package/workflows/parallel-research.workflow.md +46 -46
package/workflows/research.workflow.md +22 -22
package/workflows/review.workflow.md +30 -30

package/CHANGELOG.md CHANGED

@@ -2,6 +2,121 @@
 ## Unreleased
+## 0.1.51
+### Fixed
+- **Stale foreground spinner** — Working message/spinner now always clears when foreground run completes, even if session generation changed during the run.
+- **Completed-run widget grace period (8s)** — Runs that just completed stay visible in the widget for 8 seconds so users can see results before the widget hides.
+## 0.1.50
+### Fixed
+- **Parallel execution** — Raised default concurrency (implementation 2→4, review 2→3, research 2→3). Fixed `defaultWorkflowConcurrency()` routing bug where review/default both returned the implementation value.
+- **Planner prompt** — Added explicit "MAXIMIZE PARALLELISM" instruction with examples, so planner models produce parallel phases instead of sequential.
+- **20 review findings** — 6 CRITICAL (optional chaining crash, env leak, path redaction, RPC validation, hook JSON safety, temp dir security), 6 HIGH (unsafe casts, busy-wait CPU, timestamp merge guard, prompt injection delimiter, binary validation), 5 MEDIUM, 3 LOW.
+- **Widget flicker** — Pinned preloaded manifests to widget component model to prevent manifestCache TTL race. Scoped snapshotCache invalidation to specific run instead of clearing all.
+- **Delegation policy** — Rewritten as mandatory decision table with concrete thresholds (>3 files read or >2 files edit = must delegate). Injected into every session via system prompt.
+- **ignoreMethod option** — New config to write ignore entries to `.git/info/exclude` instead of `.gitignore` (Closes #2).
+## 0.1.49
+### Added
+- **Caveman output contracts** — Role-based output validation framework with `output-validator.ts`: regex-based format checking for explorer, executor, reviewer, verifier, security-reviewer roles. Non-blocking: validation failures emit `task.output_validation` events + set `needs_attention` but do NOT fail the task.
+- **Prose compressor** — `prose-compressor.ts` compresses verbose worker output for token-sensitive contexts (role-aware compression levels).
+- **Sensitive paths** — Word-boundary-aware token matching in `sensitive-paths.ts` prevents false positives (e.g. `secretary.ts` no longer flagged as `secret`).
+- **Symlink-safe I/O** — Artifact and shared output paths reject traversal attempts and symlinked root escapes.
+- **Output contract eval harness** — 19 unit tests covering three-arm evaluation (contract vs terse vs baseline), format compliance, token savings, regex safety (no `/g` lastIndex state leak).
+### Changed
+- **Delegation policy rewritten** — Replaced advisory "you should consider" text with a mandatory decision table: concrete thresholds (>3 files read OR >2 files edit = MUST delegate), explicit YES/NO cases per task type, conflict-safe task splitting rules. Injected into every session via `before_agent_start` hook.
+- **Powerbar dedup** — `powerbar-publisher.ts` now skips `powerbar:update` emit when segment data is unchanged (inspired by pi-powerbar's `segmentEquals` pattern). Combined with existing 200ms coalescing for minimal unnecessary renders.
+- **UI responsiveness** — `task-runner.ts` now emits `streamBridge` event immediately after `task.started`, giving the widget agent status within ~100ms instead of 2-5s (child process startup delay).
+- **"spawning…" indicator** — Widget shows "spawning…" for agents < 5 seconds old with no tool activity, distinguishing from "thinking…" for long-running agents.
+### Fixed
+- **H1: MCP proxy fallback** — `mcp-proxy.ts` now falls back to `enableMcp: true` when `createMcpProxyTools()` returns empty, so child sessions self-discover MCP instead of losing all access.
+- **H2: parallel-utils throw undefined** — `mapConcurrent` now throws the actual error instead of `throw undefined`.
+- **H3: Semaphore over-release** — `release()` guard against `#current > 0` prevents over-release corruption.
+- **M1: IRC tool TOCTOU** — `irc-tool.ts` wraps `sendIrcMessage`/`broadcastIrcMessage` in try-catch.
+- **M2: submit-result ordering** — Builds response string before calling `onYield`, wrapped in try-catch.
+- **M3: Sensitive paths false positives** — Word-boundary-aware token matching replaces substring matching.
+- **M4: atomic-write sleepSync** — Added WARNING comment about blocking main thread.
+- **M7: URL regex trailing punctuation** — Precise regex excludes trailing punctuation from URL matches.
+- **L1: parent-guard comment** — Corrected misleading comment about `process.kill` on Windows.
+- **Yield handler DRY** — Extracted `extractYieldDataFromArgs` helper, `isObjectRecord`/`isStringRecord` type guards, safe `find()` pattern.
+- **Event-log-rotation TOCTOU** — `compactEventLog` re-reads file after initial read to merge concurrent appends; `readEvents` skips corrupt JSON lines.
+- **Ghost agent dedup** — Fixed duplicate agent records in `crew-agent-records` after crash recovery.
+### Research
+- `docs/research/AGENT-EXECUTION-ARCHITECTURE.md` — Detailed comparison of 3 execution modes (oh-my-pi in-process, pi-crew child-process, pi-crew live-session).
+- `docs/research/UI-RESPONSIVENESS-AUDIT.md` — Root cause analysis for 2-5s agent spawn visibility delay, 5 proposed fixes with priority matrix.
+- `docs/research/DEEP-RESEARCH-PI-POWERBAR.md` — Deep analysis of pi-powerbar architecture (producer/consumer pattern, rendering, settings, comparison with pi-crew's powerbar publisher).
+## 0.1.48
+### Added
+- **Yield-based completion contract** — Workers can call `submit_result` tool to return structured results; task-runner warns on workers that don't yield.
+- **Typed event channels** — `RunEventBus` supports 5 channels (`worker:progress`, `worker:lifecycle`, `worker:stream`, `run:state`, `ui:invalidate`) with `onChannel`/`onChannelForRun` subscriptions and auto-classification.
+- **Human-readable task names** — `generateTaskName()` produces AdjectiveNoun names (14,400 combinations); `displayName` field on `TeamTaskState`.
+- **SubprocessToolRegistry** — Extensible tool event handling with `register`/`extractAll`/`shouldTerminate` pattern; wired into event-stream-bridge.
+- **Event log rotation/compaction** — Auto-compacts event logs over 5MB/50k events, keeping last 1000 events; atomic file replacement.
+- **Incremental JSONL reader** — `readLinesSince`/`readJsonlSince` for seek-based file reading; wired into `readEventsCursor` with `fromByteOffset`.
+### Fixed
+- Fixed `readBlob`/`readBlobMetadata` crash on missing files — now returns `undefined`.
+- Fixed `readSseJson` crash on non-JSON SSE data — now skips malformed events.
+- Fixed wrong value `"long_running"` → `"active_long_running"` in agent-control.
+- Fixed `consecutiveFailures` type bypass — added to `CrewAgentProgress` interface.
+- Fixed `streamBridge.dispose()` memory leak — now in try/finally.
+- Fixed blob-store redundant ternary `typeof x === "string" ? x : x`.
+- Fixed team-runner non-null assertion on potentially empty array.
+- Fixed event-log silent error swallowing — now logs via `logInternalError`.
+- Fixed team-tool switch case indentation.
+- Removed dead code `expandIcon` in agent-management-overlay.
+### Changed
+- Moved 6 research .md files from repo root to `docs/research/`.
+- `discoverAgents`/`discoverSkills` silent catches now log via `logInternalError`.
+- `executeHook` accumulates non-blocking diagnostics instead of short-circuiting.
+- `CancellationToken.heartbeat` wired into `collectRuns` and `pruneFinishedRuns`.
+- `CapabilitySource` extended with `"git"` to match `ResourceSource`.
+## 0.1.47
+### Added
+- **Typed hook lifecycle** — 8 of 9 hooks wired: `before_run_start`, `before_task_start`, `task_result`, `before_cancel`, `before_forget`, `before_cleanup`, `before_publish`, `run_recovery`. Hooks are opt-in, blocking/non-blocking, with audit events.
+- **Event-first UI bus** — `RunEventBus` emits on every `appendEvent` call; dashboard, crew widget, sidebar, and snapshot cache subscribe for event-driven invalidation instead of polling.
+- **Shared scan cache** — `SharedScanCache` caches manifest reads and active-run entries with TTL, mtime/size invalidation, and LRU eviction.
+- **Capability inventory** — `buildCapabilityInventory()` enumerates teams, workflows, agents, and skills with stable `kind:name` IDs; supports policy disable and shadowing detection.
+- **Skills in capability inventory** — `discoverSkills()` reads SKILL.md frontmatter; skills appear with kind=`skill` and source=`package`/`project`.
+- **Mailbox kind-separated breakdown** — `RunUiMailbox` tracks `steerUnread`/`followUpUnread`/`responseUnread`/`messageUnread`; mailbox pane shows urgency indicators.
+- **Run recovery hook** — `applyRecoveryPlan` fires `run_recovery` hook; blocked recovery emits `crew.run.recovery_blocked` event.
+- **Synthetic tool cancellation evidence** — Cancelled in-flight tasks receive `tool`-level terminal evidence alongside `worker`-level.
+- **CancellationToken wired into production loops** — `collectRuns` and `pruneFinishedRuns` use `CancellationToken.heartbeat(stage)` for progress diagnostics.
+- **Blob artifact store** — SHA-256 content-addressed storage with metadata sidecars.
+- **Run event provenance** — Event metadata includes `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
+- **Control channel reservation** — `ControlReservation` before worker spawn with deterministic `controllerId`.
+- **Release smoke test** — `npm run smoke:release` automates tarball install + version consistency check.
+- **Width-safety tests** — Crew widget rendering verified at widths 1/40/200/empty/multiple.
+### Changed
+- `handleCancel`, `handleForget`, `handleCleanup`, `handlePrune`, `handleExport` converted to async for hook execution.
+- `before_cancel`/`before_forget`/`before_cleanup` hooks can block their respective operations.
+- `before_publish` hook fires before run export.
+- `task_result` hook fires before `task.completed`/`task.failed` events.
+- Dashboard, widget, and sidebar auto-invalidate on `RunEventBus` events.
 ## 0.1.45
 ### Added
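
The 0.1.49 entry "H3: Semaphore over-release" describes a guard in `release()` that prevents the permit counter from being corrupted. The sketch below illustrates that idea only; it is not the code in `src/runtime/semaphore.ts`. The class name `CountingSemaphore`, the `tryAcquire` method, and the `held` getter are hypothetical, and a TypeScript `private` field stands in for the `#current` field named in the changelog.

```typescript
// Minimal sketch of a counting semaphore with an over-release guard.
// All names here are illustrative assumptions, not pi-crew's actual API.
class CountingSemaphore {
  // Number of permits currently held (plays the role of `#current`).
  private current = 0;
  private readonly max: number;

  constructor(max: number) {
    if (max < 1 || Math.floor(max) !== max) {
      throw new RangeError("max must be a positive integer");
    }
    this.max = max;
  }

  // Permits currently held by callers.
  get held(): number {
    return this.current;
  }

  // Take a permit if one is free; returns false when the limit is reached.
  tryAcquire(): boolean {
    if (this.current >= this.max) return false;
    this.current += 1;
    return true;
  }

  // Return a permit. The guard ignores a release when nothing is held,
  // so the counter can never drop below zero.
  release(): boolean {
    if (this.current <= 0) return false;
    this.current -= 1;
    return true;
  }
}

const sem = new CountingSemaphore(2);
sem.tryAcquire(); // one permit held
sem.release();    // permit returned
sem.release();    // extra release is ignored; the counter stays at 0
```

Without the guard, the second `release()` would leave the counter at -1, and the semaphore would then admit three holders instead of two.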

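The "Sensitive paths" entries for 0.1.49 replace substring matching with word-boundary-aware token matching, so that `secretary.ts` is no longer flagged as `secret`. A rough sketch of one way to do this follows. The token list and the function names `tokenizePath` and `isSensitivePath` are assumptions made for illustration and are not taken from `src/runtime/sensitive-paths.ts`.

```typescript
// Hypothetical list of sensitive tokens; the real list is not shown in this diff.
const SENSITIVE_TOKENS: string[] = ["secret", "secrets", "credentials", "password", "token"];

// Split a path into lowercase word tokens. camelCase boundaries and any
// non-alphanumeric character (slashes, dots, dashes, underscores) end a token.
function tokenizePath(path: string): string[] {
  return path
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// A path is flagged only when a whole token equals a sensitive word,
// not when the word merely appears inside a longer token.
function isSensitivePath(path: string): boolean {
  return tokenizePath(path).some((token) => SENSITIVE_TOKENS.indexOf(token) !== -1);
}

isSensitivePath("src/secretary.ts");   // false: "secretary" is one token
isSensitivePath("config/secret.json"); // true: "secret" is a whole token
isSensitivePath("src/apiSecret.ts");   // true: camelCase splits into "api", "secret"
```

A substring check such as `path.includes("secret")` would flag all three paths, which is the false positive the changelog entry describes.
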
package/agents/analyst.md CHANGED

@@ -1,11 +1,11 @@
----
-name: analyst
-description: Analyze requirements, ambiguity, and hidden constraints
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls
----
-You are a requirements analyst. Identify what is known, unknown, risky, ambiguous, or underspecified. Produce clarifying assumptions and acceptance criteria.
+---
+name: analyst
+description: Analyze requirements, ambiguity, and hidden constraints
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls
+---
+You are a requirements analyst. Identify what is known, unknown, risky, ambiguous, or underspecified. Produce clarifying assumptions and acceptance criteria.

package/agents/critic.md CHANGED

@@ -1,11 +1,11 @@
----
-name: critic
-description: Challenge plans and designs before execution
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls
----
-You are a critical reviewer. Find flaws, missing steps, unsafe assumptions, overengineering, underengineering, and verification gaps. Return concrete fixes to the plan.
+---
+name: critic
+description: Challenge plans and designs before execution
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls
+---
+You are a critical reviewer. Find flaws, missing steps, unsafe assumptions, overengineering, underengineering, and verification gaps. Return concrete fixes to the plan.

package/agents/executor.md CHANGED

@@ -1,11 +1,11 @@
----
-name: executor
-description: Implement planned code changes
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls, bash, edit, write
----
-You are an implementation specialist. Follow the provided plan, make targeted changes, keep edits minimal, and report changed files plus validation status. Do not broaden scope without explaining why.
+---
+name: executor
+description: Implement planned code changes
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls, bash, edit, write
+---
+You are an implementation specialist. Follow the provided plan, make targeted changes, keep edits minimal, and report changed files plus validation status. Do not broaden scope without explaining why.

package/agents/explorer.md CHANGED

@@ -1,11 +1,11 @@
----
-name: explorer
-description: Fast codebase discovery and file/symbol mapping
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls
----
-You are a fast codebase explorer. Map relevant files, symbols, data flow, and constraints. Do not modify files. Return concise findings with paths and evidence.
+---
+name: explorer
+description: Fast codebase discovery and file/symbol mapping
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls
+---
+You are a fast codebase explorer. Map relevant files, symbols, data flow, and constraints. Do not modify files. Return concise findings with paths and evidence.

package/agents/planner.md CHANGED

@@ -1,11 +1,11 @@
----
-name: planner
-description: Create an execution plan with clear sequencing and risk notes
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls
----
-You are a planning specialist. Convert the goal and discovery notes into a concrete, ordered plan. Identify dependencies, risks, validation steps, and handoff instructions for implementers.
+---
+name: planner
+description: Create an execution plan with clear sequencing and risk notes
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls
+---
+You are a planning specialist. Convert the goal and discovery notes into a concrete, ordered plan. Identify dependencies, risks, validation steps, and handoff instructions for implementers.

package/agents/reviewer.md CHANGED

@@ -1,11 +1,11 @@
----
-name: reviewer
-description: Review code changes for correctness, maintainability, and regressions
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls, bash
----
-You are a code reviewer. Review the implementation for bugs, regressions, maintainability issues, missing tests, and project-rule violations. Return prioritized findings with evidence.
+---
+name: reviewer
+description: Review code changes for correctness, maintainability, and regressions
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls, bash
+---
+You are a code reviewer. Review the implementation for bugs, regressions, maintainability issues, missing tests, and project-rule violations. Return prioritized findings with evidence.

package/agents/security-reviewer.md CHANGED

@@ -1,11 +1,11 @@
----
-name: security-reviewer
-description: Review changes for security vulnerabilities and trust-boundary issues
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls, bash
----
-You are a security reviewer. Look for injection, authn/authz flaws, insecure defaults, secret exposure, unsafe filesystem/network behavior, and dependency risks. Return severity and remediation.
+---
+name: security-reviewer
+description: Review changes for security vulnerabilities and trust-boundary issues
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls, bash
+---
+You are a security reviewer. Look for injection, authn/authz flaws, insecure defaults, secret exposure, unsafe filesystem/network behavior, and dependency risks. Return severity and remediation.

package/agents/test-engineer.md CHANGED

@@ -1,11 +1,11 @@
----
-name: test-engineer
-description: Design and implement test strategy for a change
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls, bash, edit, write
----
-You are a test engineer. Identify the right test level, add or adjust tests when asked, detect flaky assumptions, and report exact validation commands and results.
+---
+name: test-engineer
+description: Design and implement test strategy for a change
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls, bash, edit, write
+---
+You are a test engineer. Identify the right test level, add or adjust tests when asked, detect flaky assumptions, and report exact validation commands and results.

package/agents/verifier.md CHANGED

@@ -1,11 +1,11 @@
----
-name: verifier
-description: Verify that implementation satisfies the requested goal
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls, bash
----
-You are a verification specialist. Check whether the work is complete, correct, tested, and aligned with project constraints. Prefer evidence over assumptions. Return PASS or FAIL with reasons.
+---
+name: verifier
+description: Verify that implementation satisfies the requested goal
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls, bash
+---
+You are a verification specialist. Check whether the work is complete, correct, tested, and aligned with project constraints. Prefer evidence over assumptions. Return PASS or FAIL with reasons.

package/agents/writer.md CHANGED

@@ -1,11 +1,11 @@
----
-name: writer
-description: Write concise documentation, migration notes, and summaries
-model: false
-systemPromptMode: replace
-inheritProjectContext: true
-inheritSkills: false
-tools: read, grep, find, ls, edit, write
----
-You are a documentation specialist. Produce clear, concise, maintainable docs and summaries. Preserve technical accuracy and avoid marketing fluff.
+---
+name: writer
+description: Write concise documentation, migration notes, and summaries
+model: false
+systemPromptMode: replace
+inheritProjectContext: true
+inheritSkills: false
+tools: read, grep, find, ls, edit, write
+---
+You are a documentation specialist. Produce clear, concise, maintainable docs and summaries. Preserve technical accuracy and avoid marketing fluff.

package/docs/next-upgrade-roadmap.md CHANGED

@@ -22,6 +22,66 @@ Already implemented and pushed:
 - Live-agent control distinguishes `steer` from `follow-up` at live-control/API level.
 - Retry attempts have `attemptId`; max-retry deadletters link to the final `attemptId`.
 - Worker prompt pipeline and capability inventory metadata artifacts are written per task.
+- P0.1: effectiveness guard escalates `warn` to `blocked` for mutating-role tasks with no observable worker activity.
+- P1.1: mailbox `readMailbox` accepts `kind` filter; API `read-mailbox` supports `config.kind`.
+- P1.5: `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
+- P1.6: `buildSyntheticTerminalEvidence()` produces `"worker"`/`"cancelled"` terminal records for cancelled in-flight tasks.
+- P1.7: `buildCapabilityInventory(cwd)` normalizes teams/workflows/agents; API `operation=inventory`.
+- P2.1: typed hook lifecycle — `registerHook`/`executeHook` registry; `before_run_start` and `before_task_start` wired.
+- P2.4: `AbortSignal` wired into `collectRuns`, `validateMailbox`, `readAllMailboxMessages`, `pruneFinishedRuns`, `cleanupRunWorktrees`, etc.
+- Resume scaffold runs preserve scaffold mode from original manifest when workers not disabled.
+## Implementation Status as of `v0.1.46`
+This roadmap is **not complete overall**. The `v0.1.46` release completed several vertical slices, but multiple roadmap items remain partial or unimplemented.
+### Implemented / mostly implemented
+- Baseline worker behavior: real child-process execution by default, explicit scaffold dry-runs, and blocked implicit scaffold/no-op runs.
+- P0.1 ✅ effectiveness policy enforcement: default guard escalates `warn` to `blocked` for mutating-role tasks.
+- P0.2 ✅ runtime safety persistence: manifests persist `runtimeResolution`; `runtime.resolved` event emitted; status shows safety; blocked runs persist evidence.
+- Effectiveness reporting: summary/progress/status expose no-observed-work evidence and policy outcome.
+- Structured cancellation basics: cancellation reasons flow through retry/backoff/team-runner paths and run/task events.
+- Retry attempt evidence: retry attempts and max-retry deadletters carry/link `attemptId` data.
+- Prompt pipeline artifacts and per-task capability metadata artifacts are written.
+- P1.3 worker teardown evidence vertical slice: `WorkerExitStatus` and terminal worker cancellation evidence exist.
+### Completed in this upgrade cycle (after v0.1.46)
+- P0.1 effectiveness policy enforcement: default guard now escalates `warn` to `blocked` for mutating-role tasks with no observable worker activity; read-only roles remain `warning`.
+- P0.2 runtime safety persistence: manifests persist `runtimeResolution`; `runtime.resolved` event emitted; status shows safety; blocked runs persist evidence.
+- P1.1 durable steering/follow-up queues: `readMailbox` accepts `kind` filter; API `read-mailbox` supports `config.kind`; steering and follow-up are isolatable by kind.
+- P1.2 respond vs follow-up UX: `/team-follow-up` command added for continuation prompts; `/team-respond` remains for waiting-task replies.
+- P1.3 two-phase child process teardown: `WorkerExitStatus` populated from graceful SIGTERM → grace window → hard kill pipeline.
+- P1.5 event-tree provenance: `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`; retry and cancel events carry `attemptId`.
+- P1.6 synthetic terminal results: `buildSyntheticTerminalEvidence()` in `cancellation.ts`; cancelled in-flight tasks receive `"worker"`/`"cancelled"` terminal evidence records.
+- P1.7 unified capability inventory: `buildCapabilityInventory(cwd)` normalizes teams/workflows/agents into `CapabilityItem[]`; API `operation=inventory` returns sorted JSON.
+- P1.8 capability disable by stable ID: `disabledCapabilities` in `CrewPolicyConfig`; inventory marks disabled items with reason.
+- P2.1 typed hook lifecycle: `HookName`, `HookMode`, `HookOutcome`, `HookContext`, `HookResult`, `HookExecutionReport` types; `registerHook`/`executeHook`/`clearHooks` registry; `before_run_start` and `before_task_start` wired into team-runner.
+- P2.2 intent gates for destructive actions: `enforceDestructiveIntent` wired in cancel/cleanup/forget/prune/delete; configurable via `policy.requireIntentForDestructiveActions`.
+- P2.3 durable history projection: `transformRunContextBeforeWorkerStart()` and `convertRunHistoryToWorkerPrompt()` bounded projection functions.
+- P2.4 CancellationToken wired into long scans: `AbortSignal` passed to `collectRuns`/`validateMailbox`/`readAllMailboxMessages`/`pruneFinishedRuns`/`cleanupRunWorktrees`.
+- P2.5 content-addressed blob store: `writeBlob`/`readBlob`/`readBlobMetadata` with SHA-256 dedup and metadata sidecars.
+- P2.6 dashboard panes for capability and cancellation: `renderCapabilityPane` and `renderCancellationPane`.
+- Resume scaffold run fix: preserves scaffold mode from original manifest when workers not disabled.
+### Partial / not safe to mark complete
+- P1.4 reserve worker control channel before spawn: controller metadata persistence during startup not yet implemented.
+- P2.7 event-first UI: render coalescing and snapshot caches exist, but live UI still relies on durable file polling as a primary source in several panes.
+- P2.8 shared raw scan-entry cache: not yet implemented.
+### Completed / no longer backlog
+- P2.7 event-first UI — RunEventBus wired into appendEvent; dashboard, widget, sidebar auto-invalidate on events; snapshot cache invalidates on events.
+- P2.8 shared raw scan-entry cache — SharedScanCache implemented and wired into manifest reads (run-index) and active-run-registry (active manifest reads).
+- P3.1 tarball-install smoke — `scripts/release-smoke.mjs` verified; `npm run smoke:release` added.
+- Hook lifecycle — All hooks wired: `before_run_start`, `before_task_start`, `before_cancel`, `before_forget`, `before_cleanup`, `before_publish`, `task_result`, `run_recovery`. Only `session_before_switch` remains (no cwd switch mechanism in current codebase).
+### Remaining items
+- `session_before_switch` hook — no cwd/session switch mechanism in current codebase; placeholder for future.
+- P3.2 CI gate — integrate `smoke:release` into CI pipeline (requires CI config).
 ## Priority Legend
@@ -82,12 +142,13 @@ Already implemented and pushed:
    - set run `blocked` or `failed` depending config;
    - include task IDs in `data`.
-**Acceptance criteria**
+**Acceptance criteria** ✅
-- A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
-- Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
-- `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
-- Unit tests cover warn/block/fail modes.
+- ✅ A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
+- ✅ Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
+- ✅ `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
+- ✅ Unit tests cover warn/block/fail modes.
+- ✅ Default guard escalates `warn` to `blocked` for mutating-role tasks.
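
The escalation rule in the criteria above can be sketched as follows. This is a minimal illustration, not pi-crew's implementation: `TaskEvidence`, `MUTATING_ROLES`, `hasObservedWork`, and `applyEffectivenessGuard` are hypothetical names, and treating `executor` and `writer` as the mutating roles is an assumption based on the agent files that list `edit`/`write` tools.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not pi-crew's real API.
type Outcome = "ok" | "warn" | "blocked";

interface TaskEvidence {
  taskId: string;
  role: string;
  toolCalls: number; // observed tool invocations
  modelTurns: number; // observed model responses
  transcriptBytes: number; // size of the captured transcript
}

// Assumption: roles whose tool list includes edit/write count as mutating.
const MUTATING_ROLES = new Set<string>(["executor", "writer"]);

function hasObservedWork(evidence: TaskEvidence): boolean {
  return evidence.toolCalls > 0 || evidence.modelTurns > 0 || evidence.transcriptBytes > 0;
}

function applyEffectivenessGuard(evidence: TaskEvidence): Outcome {
  if (hasObservedWork(evidence)) return "ok";
  // No observable activity: mutating roles escalate, read-only roles only warn.
  return MUTATING_ROLES.has(evidence.role) ? "blocked" : "warn";
}
```

Under these assumptions an idle `executor` task is reported as `blocked`, while an idle `verifier` task stays at `warn`.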
 **Verification**
@@ -130,11 +191,12 @@ npm run test:unit
 - `test/unit/team-run.test.ts`
 - `test/unit/runtime-resolver.test.ts`
-**Acceptance criteria**
+**Acceptance criteria** ✅
-- `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
-- Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
-- Existing manifest schema remains backward compatible.
+- ✅ `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
+- ✅ Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
+- ✅ Existing manifest schema remains backward compatible.
+- ✅ `runtimeResolution` persisted on manifest; `runtime.resolved` event emitted.
 ## P1 — Steering/Follow-up Semantics Beyond Live Control
@@ -170,12 +232,12 @@ npm run test:unit
 - `test/unit/live-agent-control.test.ts`
 - `test/unit/respond-tool.test.ts`
-**Acceptance criteria**
+**Acceptance criteria** ✅ (partially — kind filter and API done; UI pane separation remaining)
-- Steering and follow-up can be inspected separately.
-- Existing inbox/outbox JSONL remains readable.
-- Durable queue survives process/session switch.
-- Realtime live delivery dedupes against durable replay.
+- ✅ Steering and follow-up can be inspected separately via `readMailbox` kind filter and API `config.kind`.
+- ✅ Existing inbox/outbox JSONL remains readable.
+- ✅ Kind filter survives process/session switch (durable mailbox).
+- ✅ UI/status separates urgent steering from follow-up backlog (mailbox pane shows kind breakdown with urgency indicators).
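
A kind filter over a JSONL mailbox might look like the sketch below. The function name `readMailboxFromText`, the `MailboxMessage` shape, and the optional `kind` field are assumptions made for illustration; the roadmap only states that `readMailbox` accepts a `kind` filter and that existing JSONL stays readable.

```typescript
// Hypothetical sketch: the message shape and function name are assumptions.
type MailboxKind = "steer" | "follow-up";

interface MailboxMessage {
  id: string;
  kind?: MailboxKind; // assumed optional so older lines without a kind still parse
  body: string;
}

function readMailboxFromText(
  jsonl: string,
  options: { kind?: MailboxKind } = {},
): MailboxMessage[] {
  const messages = jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as MailboxMessage);
  // Without a filter every message is returned, including legacy ones.
  if (options.kind === undefined) return messages;
  return messages.filter((message) => message.kind === options.kind);
}
```

Calling the function without `options.kind` keeps the pre-filter behavior, which is one way to satisfy the backward-compatibility criterion.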
 ### P1.2 Clarify `respond` vs `follow-up` UX
@@ -307,11 +369,12 @@ Retry attempts have `attemptId`, and deadletters link to final attempt. Event lo
 - `test/unit/event-metadata.test.ts`
 - `test/unit/retry-executor.test.ts`
-**Acceptance criteria**
+**Acceptance criteria** ✅
-- Retry attempt events and terminal task events share attempt provenance.
-- Deadletter records can be traced back to event sequence.
-- Existing JSONL readers ignore missing provenance fields.
+- ✅ Retry attempt events and terminal task events share attempt provenance.
+- ✅ Deadletter records can be traced back to event sequence.
+- ✅ Existing JSONL readers ignore missing provenance fields.
+- ✅ `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
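
The provenance fields can be pictured as optional metadata on each event line. In this sketch the five field names come from the roadmap, while their `string` types, their optionality, the `TeamEvent` wrapper, and the helpers `parseEvents` and `eventsForAttempt` are assumptions.

```typescript
// Field names are from the roadmap; types and optionality are assumptions.
interface TeamEventMetadata {
  parentEventId?: string;
  attemptId?: string;
  branchId?: string;
  causationId?: string;
  correlationId?: string;
}

// Hypothetical event wrapper used only for this sketch.
interface TeamEvent {
  id: string;
  type: string;
  metadata?: TeamEventMetadata;
}

function parseEvents(jsonl: string): TeamEvent[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TeamEvent);
}

// Events written before the provenance fields existed simply never match.
function eventsForAttempt(events: TeamEvent[], attemptId: string): TeamEvent[] {
  return events.filter((event) => event.metadata?.attemptId === attemptId);
}
```

Because every field is optional, a reader that filters by `attemptId` skips older lines instead of failing on them.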
 ### P1.6 Synthetic terminal results for cancelled in-flight operations
@@ -336,10 +399,11 @@ Run/task cancellation events are now structured, but worker/tool sub-operations
 - `src/state/contracts.ts`
 - `test/unit/cancellation.test.ts`
-**Acceptance criteria**
+**Acceptance criteria** ✅
-- No started tool/model operation is left without terminal evidence after cancellation.
-- Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
+- ✅ No started tool/model operation is left without terminal evidence after cancellation.
+- ✅ Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
+- ✅ `buildSyntheticTerminalEvidence()` in `cancellation.ts` produces `"worker"`/`"cancelled"` records.
 ## P1 — Capability Inventory and Control Center
@@ -379,10 +443,10 @@ interface CapabilityItem {
 **Acceptance criteria**
-- Inventory is stable and sorted.
-- Shadowed project/user/builtin resources are visible.
-- Skill disabled/budget state is visible.
-- No file path is used as the only stable ID.
+- ✅ Inventory is stable and sorted.
+- ✅ Shadowed project/user/builtin resources are visible in capability inventory (state="shadowed", shadowedBy field).
+- ✅ Skill disabled/budget state is visible in capability inventory (skills enumerated via discoverSkills).
+- ✅ No file path is used as the only stable ID (uses `kind:name` IDs).
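
The inventory criteria above (stable `kind:name` IDs, sorted output, visible shadowing) can be sketched as below. `RawResource`, `buildInventorySketch`, and the precedence order project over user over builtin are assumptions; the `CapabilityItem` fields shown are limited to the ones the criteria mention and may not match the real interface.

```typescript
// Hypothetical sketch: shapes and precedence order are assumptions.
type Source = "project" | "user" | "builtin";
type Kind = "team" | "workflow" | "agent" | "skill";

interface RawResource {
  kind: Kind;
  name: string;
  source: Source;
}

interface CapabilityItem {
  id: string; // stable `kind:name` ID, never a file path
  kind: Kind;
  name: string;
  source: Source;
  state: "active" | "shadowed";
  shadowedBy?: string;
}

// Assumption: earlier entries win over later ones.
const PRECEDENCE: Source[] = ["project", "user", "builtin"];

function buildInventorySketch(resources: RawResource[]): CapabilityItem[] {
  // First pass: find the highest-precedence source for each ID.
  const winners = new Map<string, Source>();
  for (const resource of resources) {
    const id = `${resource.kind}:${resource.name}`;
    const current = winners.get(id);
    if (current === undefined || PRECEDENCE.indexOf(resource.source) < PRECEDENCE.indexOf(current)) {
      winners.set(id, resource.source);
    }
  }
  // Second pass: mark losers as shadowed, then sort for stable output.
  return resources
    .map((resource) => {
      const id = `${resource.kind}:${resource.name}`;
      const winner = winners.get(id)!;
      const base = { id, kind: resource.kind, name: resource.name, source: resource.source };
      return winner === resource.source
        ? { ...base, state: "active" as const }
        : { ...base, state: "shadowed" as const, shadowedBy: winner };
    })
    .sort(
      (a, b) =>
        a.id.localeCompare(b.id) || PRECEDENCE.indexOf(a.source) - PRECEDENCE.indexOf(b.source),
    );
}
```

Keeping shadowed entries in the output, rather than dropping them, is what makes the override visible to an operator.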
 ### P1.8 Persist capability disables by stable ID
@@ -436,11 +500,18 @@ Errors are recorded in diagnostics/events, not uncontrolled exceptions.
 - `docs/resource-formats.md`
 - `test/unit/hooks*.test.ts`
-**Acceptance criteria**
+**Acceptance criteria** ✅ (partial — `before_cancel` not yet wired for async)
-- Blocking hook can stop a run before worker start with clear event and status.
-- Non-blocking hook failure records diagnostic and does not crash run.
-- Hook context is redacted and bounded.
+- ✅ Blocking hook can stop a run before worker start with clear event and status.
+- ✅ Non-blocking hook failure records diagnostic and does not crash run.
+- ✅ Hook context is redacted and bounded.
+- ✅ `before_cancel` hook wired (async handleCancel conversion done).
+- ✅ `before_forget` hook wired (async handleForget conversion done).
+- ✅ `before_cleanup` hook wired (async handleCleanup conversion done).
+- ✅ `task_result` hook wired in task-runner before completed/failed event.
+- ✅ `before_publish` hook wired in handleExport.
+- ✅ `run_recovery` hook wired in crash-recovery `applyRecoveryPlan`.
+- ☐ `session_before_switch` not yet wired (no cwd switch mechanism in current codebase; placeholder for future Pi lifecycle integration).
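
A hook registry that meets the first two criteria could be sketched like this. The type and function names match those listed in the roadmap, but every shape and signature here is an assumption. The sketch is kept synchronous for brevity, whereas the async handler conversions mentioned above suggest the real registry is asynchronous. Recording a thrown error as a diagnostic for both hook modes is also an assumption.

```typescript
// Hypothetical sketch: names follow the roadmap, shapes and signatures are assumed.
type HookName = "before_run_start" | "before_task_start";
type HookMode = "blocking" | "non_blocking";

interface HookContext {
  runId: string;
  taskId?: string;
}

interface HookResult {
  allow: boolean;
  reason?: string;
}

interface HookExecutionReport {
  blocked: boolean;
  reason?: string;
  diagnostics: string[];
}

type HookHandler = (ctx: HookContext) => HookResult;

const hooks = new Map<HookName, Array<{ mode: HookMode; handler: HookHandler }>>();

function registerHook(name: HookName, mode: HookMode, handler: HookHandler): void {
  const list = hooks.get(name) ?? [];
  list.push({ mode, handler });
  hooks.set(name, list);
}

function clearHooks(): void {
  hooks.clear();
}

function executeHook(name: HookName, ctx: HookContext): HookExecutionReport {
  const report: HookExecutionReport = { blocked: false, diagnostics: [] };
  for (const { mode, handler } of hooks.get(name) ?? []) {
    try {
      const result = handler(ctx);
      // Only a blocking hook may stop the run; a non-blocking denial is ignored.
      if (!result.allow && mode === "blocking") {
        report.blocked = true;
        report.reason = result.reason;
        return report;
      }
    } catch (error) {
      // A failing hook is recorded as a diagnostic instead of crashing the run.
      report.diagnostics.push(`${name}: ${String(error)}`);
    }
  }
  return report;
}
```

A caller would check `report.blocked` before starting workers and surface `report.diagnostics` in status output.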
 ### P2.2 Require intent via policy/hook for destructive actions
@@ -547,11 +618,11 @@ Use it in:
 - `src/ui/run-snapshot-cache.ts`
 - `test/unit/cancellation-token.test.ts`
-**Acceptance criteria**
+**Acceptance criteria** ✅
-- Long scan can abort within bounded cadence.
-- Heartbeat stage appears in diagnostics/logs.
-- Existing APIs can pass no token and keep current behavior.
+- ✅ Long scan can abort within bounded cadence (`AbortSignal` wired into `collectRuns`, `validateMailbox`, `readAllMailboxMessages`, `pruneFinishedRuns`, `cleanupRunWorktrees`).
+- ✅ `CancellationToken.heartbeat(stage)` wired into `collectRuns` and `pruneFinishedRuns` with stage diagnostics.
+- ✅ Existing APIs can pass no token/signal and keep current behavior.
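
The bounded-cadence abort described above usually means checking the signal once per loop iteration. The sketch below shows that pattern with the standard `AbortSignal`; `collectRunsSketch`, `ScanResult`, and the `heartbeat` callback are hypothetical stand-ins for `collectRuns` and `CancellationToken.heartbeat(stage)`, whose real signatures are not shown in this diff.

```typescript
// Hypothetical sketch: function name, result shape, and callback are assumptions.
interface ScanResult {
  visited: string[];
  aborted: boolean;
}

function collectRunsSketch(
  runIds: string[],
  signal?: AbortSignal,
  heartbeat?: (stage: string) => void,
): ScanResult {
  const visited: string[] = [];
  for (const runId of runIds) {
    // Check once per entry so an abort takes effect within one iteration.
    if (signal?.aborted) return { visited, aborted: true };
    heartbeat?.(`collectRuns:${runId}`);
    visited.push(runId);
  }
  return { visited, aborted: false };
}
```

Both extra parameters are optional, so a caller that passes neither gets an uninterrupted scan, matching the last criterion.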
 ## P2 — Artifact Store Improvements
@@ -699,16 +770,20 @@ npm pack
 ## Suggested Implementation Order
-1. **P0.1 Effectiveness policy enforcement** — prevents misleading completed runs.
-2. **P0.2 Persist runtime safety** — improves debugging for worker spawn issues.
+1. ~~**P0.1 Effectiveness policy enforcement**~~ ✅ Completed — default guard escalates `warn` to `blocked` for mutating-role tasks.
+2. ~~**P0.2 Persist runtime safety**~~ ✅ Completed — manifests persist `runtimeResolution`; `runtime.resolved` event emitted.
 3. **P1.3 Two-phase worker teardown** — reduces stale/zombie worker risk.
-4. **P1.1 Durable steering/follow-up queues** — completes semantic split started at live-control level.
-5. **P1.5 Event-tree provenance** — builds on current `attemptId` work.
-6. **P1.7 Capability inventory view** — turns existing per-task artifacts into operator UX.
-7. **P2.3 Durable history projection** — reduces prompt/context risks.
-8. **P2.4 CancellationToken** — improves responsiveness of internal scans.
-9. **P2.5 Blob artifacts** — prevents log/transcript bloat.
-10. **P2.6 Dashboard panels** — surface all new evidence in UI.
+4. ~~**P1.1 Durable steering/follow-up queues**~~ ✅ Completed — `readMailbox` kind filter; API `read-mailbox` supports `config.kind`.
+5. ~~**P1.5 Event-tree provenance**~~ ✅ Completed — `TeamEventMetadata` extended with `parentEventId`/`attemptId`/`branchId`/`causationId`/`correlationId`.
+6. ~~**P1.7 Capability inventory view**~~ ✅ Completed — `buildCapabilityInventory()` + API `operation=inventory` + dashboard pane.
+7. ~~**P2.3 Durable history projection**~~ ✅ Completed — `transformRunContextBeforeWorkerStart()` + `convertRunHistoryToWorkerPrompt()`.
+8. ~~**P2.4 CancellationToken**~~ ✅ Completed — wired into `collectRuns`/`validateMailbox`/`pruneFinishedRuns`/`cleanupRunWorktrees` etc.
+9. ~~**P2.5 Blob artifacts**~~ ✅ Completed — content-addressed blob store with SHA-256 dedup and metadata sidecars.
+10. ~~**P2.6 Dashboard panels**~~ ✅ Completed — capability and cancellation panes.
+Also completed (not in original order list):
+- ~~**P1.6 Synthetic terminal results**~~ ✅ — `buildSyntheticTerminalEvidence()` for cancelled in-flight tasks.
+- ~~**P2.1 Typed hook lifecycle**~~ ✅ — `before_run_start`/`before_task_start` wired into team-runner.
 ## Release Guidance