pi-crew 0.1.46 → 0.1.51
- package/CHANGELOG.md +115 -0
- package/agents/analyst.md +11 -11
- package/agents/critic.md +11 -11
- package/agents/executor.md +11 -11
- package/agents/explorer.md +11 -11
- package/agents/planner.md +11 -11
- package/agents/reviewer.md +11 -11
- package/agents/security-reviewer.md +11 -11
- package/agents/test-engineer.md +11 -11
- package/agents/verifier.md +11 -11
- package/agents/writer.md +11 -11
- package/docs/next-upgrade-roadmap.md +117 -42
- package/docs/refactor-tasks-phase3.md +394 -394
- package/docs/refactor-tasks-phase4.md +564 -564
- package/docs/refactor-tasks-phase5.md +402 -402
- package/docs/refactor-tasks-phase6.md +662 -662
- package/docs/research/AGENT-EXECUTION-ARCHITECTURE.md +261 -0
- package/docs/research/AGENT-LIFECYCLE-COMPARISON.md +111 -0
- package/docs/research/AUDIT_OH_MY_PI.md +261 -0
- package/docs/research/AUDIT_PI_CREW.md +457 -0
- package/docs/research/CAVEMAN-DEEP-RESEARCH.md +281 -0
- package/docs/research/COMPARISON_OH_MY_PI_VS_PI_CREW.md +264 -0
- package/docs/research/DEEP-RESEARCH-PI-POWERBAR.md +343 -0
- package/docs/research/DEEP_RESEARCH_SUBAGENT_ARCHITECTURE.md +480 -0
- package/docs/research/GAP_CLOSURE_IMPLEMENTATION_PLAN.md +354 -0
- package/docs/research/IMPLEMENTATION_PLAN.md +385 -0
- package/docs/research/LIVE-SESSION-PRODUCTION-READY-PLAN.md +502 -0
- package/docs/research/OH-MY-PI-DEEP-RESEARCH-v14.7.6.md +266 -0
- package/docs/research/REMAINING-GAPS-PLAN.md +363 -0
- package/docs/research/SESSION-SUMMARY-2026-05-08.md +146 -0
- package/docs/research/UI-RESPONSIVENESS-AUDIT.md +173 -0
- package/docs/research-awesome-agent-skills-distillation.md +100 -100
- package/docs/research-extension-examples.md +297 -297
- package/docs/research-extension-system.md +324 -324
- package/docs/research-oh-my-pi-distillation.md +56 -9
- package/docs/research-optimization-plan.md +548 -548
- package/docs/research-phase10-distillation.md +198 -198
- package/docs/research-phase11-distillation.md +201 -201
- package/docs/research-pi-coding-agent.md +357 -357
- package/docs/research-source-pi-crew-reference.md +174 -174
- package/docs/runtime-flow.md +148 -148
- package/docs/source-runtime-refactor-map.md +107 -107
- package/index.ts +6 -6
- package/package.json +99 -98
- package/schema.json +8 -0
- package/skills/async-worker-recovery/SKILL.md +42 -42
- package/skills/context-artifact-hygiene/SKILL.md +52 -52
- package/skills/delegation-patterns/SKILL.md +54 -54
- package/skills/mailbox-interactive/SKILL.md +40 -40
- package/skills/model-routing-context/SKILL.md +39 -39
- package/skills/multi-perspective-review/SKILL.md +58 -58
- package/skills/observability-reliability/SKILL.md +41 -41
- package/skills/orchestration/SKILL.md +157 -0
- package/skills/ownership-session-security/SKILL.md +41 -41
- package/skills/pi-extension-lifecycle/SKILL.md +39 -39
- package/skills/requirements-to-task-packet/SKILL.md +63 -63
- package/skills/resource-discovery-config/SKILL.md +41 -41
- package/skills/runtime-state-reader/SKILL.md +44 -44
- package/skills/secure-agent-orchestration-review/SKILL.md +45 -45
- package/skills/state-mutation-locking/SKILL.md +42 -42
- package/skills/systematic-debugging/SKILL.md +67 -67
- package/skills/ui-render-performance/SKILL.md +39 -39
- package/skills/verification-before-done/SKILL.md +57 -57
- package/skills/worktree-isolation/SKILL.md +39 -39
- package/src/agents/agent-config.ts +6 -0
- package/src/agents/agent-search.ts +98 -0
- package/src/agents/agent-serializer.ts +4 -0
- package/src/agents/discover-agents.ts +17 -4
- package/src/config/config.ts +25 -0
- package/src/config/defaults.ts +16 -5
- package/src/extension/autonomous-policy.ts +26 -33
- package/src/extension/cross-extension-rpc.ts +94 -82
- package/src/extension/help.ts +1 -0
- package/src/extension/management.ts +5 -0
- package/src/extension/project-init.ts +15 -3
- package/src/extension/register.ts +78 -19
- package/src/extension/registration/commands.ts +33 -1
- package/src/extension/registration/compaction-guard.ts +125 -125
- package/src/extension/registration/team-tool.ts +6 -4
- package/src/extension/run-bundle-schema.ts +89 -89
- package/src/extension/run-export.ts +26 -12
- package/src/extension/run-index.ts +24 -18
- package/src/extension/run-maintenance.ts +68 -62
- package/src/extension/team-tool/api.ts +23 -2
- package/src/extension/team-tool/cancel.ts +86 -11
- package/src/extension/team-tool/context.ts +4 -1
- package/src/extension/team-tool/handle-settings.ts +188 -188
- package/src/extension/team-tool/inspect.ts +41 -41
- package/src/extension/team-tool/intent-policy.ts +42 -0
- package/src/extension/team-tool/lifecycle-actions.ts +47 -18
- package/src/extension/team-tool/parallel-dispatch.ts +156 -0
- package/src/extension/team-tool/plan.ts +19 -19
- package/src/extension/team-tool/respond.ts +10 -2
- package/src/extension/team-tool/run.ts +3 -2
- package/src/extension/team-tool/status.ts +1 -1
- package/src/extension/team-tool-types.ts +1 -0
- package/src/extension/team-tool.ts +16 -5
- package/src/hooks/registry.ts +61 -0
- package/src/hooks/types.ts +41 -0
- package/src/i18n.ts +184 -184
- package/src/observability/exporters/otlp-exporter.ts +77 -77
- package/src/prompt/prompt-runtime.ts +72 -72
- package/src/runtime/agent-control.ts +108 -2
- package/src/runtime/agent-memory.ts +72 -72
- package/src/runtime/agent-observability.ts +114 -114
- package/src/runtime/async-marker.ts +26 -26
- package/src/runtime/async-runner.ts +3 -1
- package/src/runtime/attention-events.ts +28 -28
- package/src/runtime/background-runner.ts +19 -0
- package/src/runtime/cancellation-token.ts +89 -0
- package/src/runtime/cancellation.ts +61 -51
- package/src/runtime/capability-inventory.ts +116 -0
- package/src/runtime/child-pi.ts +2 -1
- package/src/runtime/code-summary.ts +247 -0
- package/src/runtime/completion-guard.ts +190 -190
- package/src/runtime/concurrency.ts +3 -1
- package/src/runtime/crash-recovery.ts +181 -0
- package/src/runtime/crew-agent-records.ts +35 -7
- package/src/runtime/crew-agent-runtime.ts +1 -0
- package/src/runtime/custom-tools/irc-tool.ts +201 -0
- package/src/runtime/custom-tools/submit-result-tool.ts +90 -0
- package/src/runtime/delivery-coordinator.ts +3 -1
- package/src/runtime/diagnostic-export.ts +3 -1
- package/src/runtime/direct-run.ts +35 -35
- package/src/runtime/effectiveness.ts +81 -76
- package/src/runtime/event-stream-bridge.ts +92 -0
- package/src/runtime/foreground-control.ts +82 -82
- package/src/runtime/green-contract.ts +46 -46
- package/src/runtime/group-join.ts +106 -106
- package/src/runtime/heartbeat-gradient.ts +28 -28
- package/src/runtime/heartbeat-watcher.ts +124 -124
- package/src/runtime/live-agent-control.ts +88 -88
- package/src/runtime/live-agent-manager.ts +78 -2
- package/src/runtime/live-control-realtime.ts +36 -36
- package/src/runtime/live-extension-bridge.ts +150 -0
- package/src/runtime/live-irc.ts +92 -0
- package/src/runtime/live-session-health.ts +100 -0
- package/src/runtime/live-session-runtime.ts +297 -7
- package/src/runtime/mcp-proxy.ts +113 -0
- package/src/runtime/notebook-helpers.ts +90 -0
- package/src/runtime/orphan-sentinel.ts +7 -0
- package/src/runtime/output-validator.ts +187 -0
- package/src/runtime/parallel-research.ts +44 -44
- package/src/runtime/parallel-utils.ts +57 -0
- package/src/runtime/parent-guard.ts +80 -0
- package/src/runtime/pi-args.ts +11 -2
- package/src/runtime/pi-json-output.ts +111 -111
- package/src/runtime/pi-spawn.ts +21 -3
- package/src/runtime/policy-engine.ts +79 -79
- package/src/runtime/process-status.ts +14 -1
- package/src/runtime/progress-event-coalescer.ts +43 -43
- package/src/runtime/prose-compressor.ts +164 -0
- package/src/runtime/recovery-recipes.ts +74 -74
- package/src/runtime/result-extractor.ts +121 -0
- package/src/runtime/role-permission.ts +39 -39
- package/src/runtime/runtime-resolver.ts +1 -4
- package/src/runtime/semaphore.ts +131 -0
- package/src/runtime/sensitive-paths.ts +92 -0
- package/src/runtime/session-resources.ts +25 -25
- package/src/runtime/session-snapshot.ts +59 -59
- package/src/runtime/session-usage.ts +79 -79
- package/src/runtime/sidechain-output.ts +29 -29
- package/src/runtime/stream-preview.ts +177 -0
- package/src/runtime/subagent-manager.ts +3 -2
- package/src/runtime/subprocess-tool-registry.ts +67 -0
- package/src/runtime/supervisor-contact.ts +59 -59
- package/src/runtime/task-display.ts +38 -38
- package/src/runtime/task-output-context.ts +59 -9
- package/src/runtime/task-runner/capabilities.ts +78 -78
- package/src/runtime/task-runner/live-executor.ts +2 -0
- package/src/runtime/task-runner/progress.ts +119 -119
- package/src/runtime/task-runner/prompt-builder.ts +71 -9
- package/src/runtime/task-runner/prompt-pipeline.ts +64 -64
- package/src/runtime/task-runner/result-utils.ts +14 -14
- package/src/runtime/task-runner/run-projection.ts +104 -0
- package/src/runtime/task-runner/state-helpers.ts +22 -22
- package/src/runtime/task-runner.ts +75 -4
- package/src/runtime/team-runner.ts +69 -8
- package/src/runtime/worker-heartbeat.ts +21 -21
- package/src/runtime/worker-startup.ts +57 -57
- package/src/runtime/workspace-tree.ts +298 -0
- package/src/runtime/yield-handler.ts +189 -0
- package/src/schema/config-schema.ts +7 -0
- package/src/schema/team-tool-schema.ts +11 -1
- package/src/skills/discover-skills.ts +67 -0
- package/src/state/active-run-registry.ts +4 -2
- package/src/state/artifact-store.ts +4 -1
- package/src/state/atomic-write.ts +50 -1
- package/src/state/blob-store.ts +117 -0
- package/src/state/contracts.ts +1 -0
- package/src/state/event-log-rotation.ts +158 -0
- package/src/state/event-log.ts +52 -2
- package/src/state/locks.ts +3 -1
- package/src/state/mailbox.ts +87 -7
- package/src/state/state-store.ts +24 -4
- package/src/state/task-claims.ts +44 -44
- package/src/state/types.ts +20 -0
- package/src/state/usage.ts +29 -29
- package/src/subagents/async-entry.ts +1 -1
- package/src/subagents/index.ts +3 -3
- package/src/subagents/live/control.ts +1 -1
- package/src/subagents/live/manager.ts +1 -1
- package/src/subagents/live/realtime.ts +1 -1
- package/src/subagents/live/session-runtime.ts +1 -1
- package/src/subagents/manager.ts +1 -1
- package/src/subagents/spawn.ts +1 -1
- package/src/teams/team-serializer.ts +38 -38
- package/src/types/diff.d.ts +18 -18
- package/src/ui/agent-management-overlay.ts +144 -0
- package/src/ui/crew-footer.ts +101 -101
- package/src/ui/crew-select-list.ts +111 -111
- package/src/ui/crew-widget.ts +15 -4
- package/src/ui/dashboard-panes/cancellation-pane.ts +43 -0
- package/src/ui/dashboard-panes/capability-pane.ts +60 -0
- package/src/ui/dashboard-panes/mailbox-pane.ts +35 -11
- package/src/ui/dashboard-panes/metrics-pane.ts +34 -34
- package/src/ui/dynamic-border.ts +25 -25
- package/src/ui/layout-primitives.ts +106 -106
- package/src/ui/live-run-sidebar.ts +4 -0
- package/src/ui/loaders.ts +158 -158
- package/src/ui/powerbar-publisher.ts +83 -15
- package/src/ui/render-coalescer.ts +51 -0
- package/src/ui/render-diff.ts +119 -119
- package/src/ui/render-scheduler.ts +143 -143
- package/src/ui/run-dashboard.ts +4 -0
- package/src/ui/run-event-bus.ts +209 -0
- package/src/ui/run-snapshot-cache.ts +68 -16
- package/src/ui/snapshot-types.ts +8 -0
- package/src/ui/spinner.ts +17 -17
- package/src/ui/status-colors.ts +58 -58
- package/src/ui/syntax-highlight.ts +116 -116
- package/src/ui/transcript-entries.ts +258 -0
- package/src/utils/atomic-write.ts +33 -33
- package/src/utils/completion-dedupe.ts +63 -63
- package/src/utils/frontmatter.ts +68 -68
- package/src/utils/git.ts +262 -262
- package/src/utils/ids.ts +17 -12
- package/src/utils/incremental-reader.ts +104 -0
- package/src/utils/names.ts +27 -27
- package/src/utils/redaction.ts +44 -44
- package/src/utils/safe-paths.ts +47 -47
- package/src/utils/scan-cache.ts +137 -0
- package/src/utils/sleep.ts +32 -32
- package/src/utils/sse-parser.ts +134 -0
- package/src/utils/task-name-generator.ts +337 -0
- package/src/utils/visual.ts +33 -2
- package/src/workflows/validate-workflow.ts +40 -40
- package/src/worktree/branch-freshness.ts +45 -45
- package/src/worktree/cleanup.ts +2 -1
- package/src/worktree/worktree-manager.ts +11 -3
- package/teams/default.team.md +12 -12
- package/teams/fast-fix.team.md +11 -11
- package/teams/implementation.team.md +18 -18
- package/teams/parallel-research.team.md +14 -14
- package/teams/research.team.md +11 -11
- package/teams/review.team.md +12 -12
- package/workflows/default.workflow.md +29 -29
- package/workflows/fast-fix.workflow.md +22 -22
- package/workflows/implementation.workflow.md +43 -38
- package/workflows/parallel-research.workflow.md +46 -46
- package/workflows/research.workflow.md +22 -22
- package/workflows/review.workflow.md +30 -30
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,121 @@
|
|
|
2
2
|
|
|
3
3
|
## Unreleased
|
|
4
4
|
|
|
5
|
+
## 0.1.51
|
|
6
|
+
|
|
7
|
+
### Fixed
|
|
8
|
+
|
|
9
|
+
- **Stale foreground spinner** — Working message/spinner now always clears when foreground run completes, even if session generation changed during the run.
|
|
10
|
+
- **Completed-run widget grace period (8s)** — Runs that just completed stay visible in the widget for 8 seconds so users can see results before the widget hides.
|
|
11
|
+
|
|
12
|
+
## 0.1.50
|
|
13
|
+
|
|
14
|
+
### Fixed
|
|
15
|
+
|
|
16
|
+
- **Parallel execution** — Raised default concurrency (implementation 2→4, review 2→3, research 2→3). Fixed `defaultWorkflowConcurrency()` routing bug where review/default both returned the implementation value.
|
|
17
|
+
- **Planner prompt** — Added explicit "MAXIMIZE PARALLELISM" instruction with examples, so planner models produce parallel phases instead of sequential.
|
|
18
|
+
- **20 review findings** — 6 CRITICAL (optional chaining crash, env leak, path redaction, RPC validation, hook JSON safety, temp dir security), 6 HIGH (unsafe casts, busy-wait CPU, timestamp merge guard, prompt injection delimiter, binary validation), 5 MEDIUM, 3 LOW.
|
|
19
|
+
- **Widget flicker** — Pinned preloaded manifests to widget component model to prevent manifestCache TTL race. Scoped snapshotCache invalidation to specific run instead of clearing all.
|
|
20
|
+
- **Delegation policy** — Rewritten as mandatory decision table with concrete thresholds (>3 files read or >2 files edit = must delegate). Injected into every session via system prompt.
|
|
21
|
+
- **ignoreMethod option** — New config to write ignore entries to `.git/info/exclude` instead of `.gitignore` (Closes #2).
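The delegation thresholds quoted in the list above can be read as a simple predicate. The policy itself is prompt text, so this is only an illustrative sketch: `TaskEstimate` and `mustDelegate` are hypothetical names, not part of pi-crew.

```typescript
// Hypothetical shape: an estimate of how many files a task will touch.
interface TaskEstimate {
  filesToRead: number;
  filesToEdit: number;
}

// Thresholds quoted from the changelog entry: >3 files read or >2 files edited.
const MAX_INLINE_READS = 3;
const MAX_INLINE_EDITS = 2;

// Returns true when the task crosses either threshold and must be delegated.
function mustDelegate(task: TaskEstimate): boolean {
  return task.filesToRead > MAX_INLINE_READS || task.filesToEdit > MAX_INLINE_EDITS;
}

console.log(mustDelegate({ filesToRead: 3, filesToEdit: 2 })); // false
console.log(mustDelegate({ filesToRead: 4, filesToEdit: 0 })); // true
```

Both comparisons are strict, so a task sitting exactly on a threshold stays inline.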
|
|
22
|
+
|
|
23
|
+
## 0.1.49
|
|
24
|
+
|
|
25
|
+
### Added
|
|
26
|
+
|
|
27
|
+
- **Caveman output contracts** — Role-based output validation framework with `output-validator.ts`: regex-based format checking for explorer, executor, reviewer, verifier, security-reviewer roles. Non-blocking: validation failures emit `task.output_validation` events + set `needs_attention` but do NOT fail the task.
|
|
28
|
+
- **Prose compressor** — `prose-compressor.ts` compresses verbose worker output for token-sensitive contexts (role-aware compression levels).
|
|
29
|
+
- **Sensitive paths** — Word-boundary-aware token matching in `sensitive-paths.ts` prevents false positives (e.g. `secretary.ts` no longer flagged as `secret`).
|
|
30
|
+
- **Symlink-safe I/O** — Artifact and shared output paths reject traversal attempts and symlinked root escapes.
|
|
31
|
+
- **Output contract eval harness** — 19 unit tests covering three-arm evaluation (contract vs terse vs baseline), format compliance, token savings, regex safety (no `/g` lastIndex state leak).
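The word-boundary matching described for `sensitive-paths.ts` can be approximated as follows. This is a minimal sketch under stated assumptions: the token list and the helper names `tokenize` and `isSensitivePath` are hypothetical, since the real implementation is not shown in this diff.

```typescript
// Hypothetical token list; the real list in sensitive-paths.ts is not shown here.
const SENSITIVE_TOKENS = new Set(["secret", "token", "password", "credentials"]);

// Split a path into lowercase word tokens on separators and camelCase boundaries.
function tokenize(path: string): string[] {
  return path
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// A path is flagged only when a whole token matches, never a substring.
function isSensitivePath(path: string): boolean {
  return tokenize(path).some((token) => SENSITIVE_TOKENS.has(token));
}

console.log(isSensitivePath("src/secretary.ts")); // false
console.log(isSensitivePath("config/api-secret.json")); // true
console.log(isSensitivePath("apiToken.ts")); // true
```

Comparing whole tokens is what keeps `secretary` from matching `secret`, which a substring check would flag.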
|
|
32
|
+
|
|
33
|
+
### Changed
|
|
34
|
+
|
|
35
|
+
- **Delegation policy rewritten** — Replaced advisory "you should consider" text with a mandatory decision table: concrete thresholds (>3 files read OR >2 files edit = MUST delegate), explicit YES/NO cases per task type, conflict-safe task splitting rules. Injected into every session via `before_agent_start` hook.
|
|
36
|
+
- **Powerbar dedup** — `powerbar-publisher.ts` now skips `powerbar:update` emit when segment data is unchanged (inspired by pi-powerbar's `segmentEquals` pattern). Combined with existing 200ms coalescing for minimal unnecessary renders.
|
|
37
|
+
- **UI responsiveness** — `task-runner.ts` now emits `streamBridge` event immediately after `task.started`, giving the widget agent status within ~100ms instead of 2-5s (child process startup delay).
|
|
38
|
+
- **"spawning…" indicator** — Widget shows "spawning…" for agents < 5 seconds old with no tool activity, distinguishing from "thinking…" for long-running agents.
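The powerbar dedup entry above skips an emit when the segment data has not changed. A rough sketch of that idea follows; the `Segment` shape and the `DedupPublisher` class are assumptions made for illustration, and only the `segmentEquals` name comes from the changelog.

```typescript
// Hypothetical segment shape; the real powerbar payload is not shown in this diff.
interface Segment {
  id: string;
  text: string;
  color?: string;
}

// Field-by-field comparison against the previously published segment.
function segmentEquals(a: Segment | undefined, b: Segment): boolean {
  return a !== undefined && a.id === b.id && a.text === b.text && a.color === b.color;
}

class DedupPublisher {
  private last: Segment | undefined;
  private readonly emit: (segment: Segment) => void;

  constructor(emit: (segment: Segment) => void) {
    this.emit = emit;
  }

  // Emits only when the segment differs from the last one published.
  publish(segment: Segment): boolean {
    if (segmentEquals(this.last, segment)) return false;
    this.last = { ...segment };
    this.emit(segment);
    return true;
  }
}

const emitted: Segment[] = [];
const publisher = new DedupPublisher((segment) => emitted.push(segment));
publisher.publish({ id: "crew", text: "2 running" });
publisher.publish({ id: "crew", text: "2 running" }); // skipped, unchanged
publisher.publish({ id: "crew", text: "3 running" });
console.log(emitted.length); // 2
```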
|
|
39
|
+
|
|
40
|
+
### Fixed
|
|
41
|
+
|
|
42
|
+
- **H1: MCP proxy fallback** — `mcp-proxy.ts` now falls back to `enableMcp: true` when `createMcpProxyTools()` returns empty, so child sessions self-discover MCP instead of losing all access.
|
|
43
|
+
- **H2: parallel-utils throw undefined** — `mapConcurrent` now throws the actual error instead of `throw undefined`.
|
|
44
|
+
- **H3: Semaphore over-release** — `release()` guard against `#current > 0` prevents over-release corruption.
|
|
45
|
+
- **M1: IRC tool TOCTOU** — `irc-tool.ts` wraps `sendIrcMessage`/`broadcastIrcMessage` in try-catch.
|
|
46
|
+
- **M2: submit-result ordering** — Builds response string before calling `onYield`, wrapped in try-catch.
|
|
47
|
+
- **M3: Sensitive paths false positives** — Word-boundary-aware token matching replaces substring matching.
|
|
48
|
+
- **M4: atomic-write sleepSync** — Added WARNING comment about blocking main thread.
|
|
49
|
+
- **M7: URL regex trailing punctuation** — Precise regex excludes trailing punctuation from URL matches.
|
|
50
|
+
- **L1: parent-guard comment** — Corrected misleading comment about `process.kill` on Windows.
|
|
51
|
+
- **Yield handler DRY** — Extracted `extractYieldDataFromArgs` helper, `isObjectRecord`/`isStringRecord` type guards, safe `find()` pattern.
|
|
52
|
+
- **Event-log-rotation TOCTOU** — `compactEventLog` re-reads file after initial read to merge concurrent appends; `readEvents` skips corrupt JSON lines.
|
|
53
|
+
- **Ghost agent dedup** — Fixed duplicate agent records in `crew-agent-records` after crash recovery.
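The H3 semaphore fix above guards `release()` so the counter cannot drop below zero. The sketch below shows one way such a guard can look. It is not the code in `semaphore.ts`: the class body is an assumption, and it uses a `private current` field where the changelog names `#current`.

```typescript
class Semaphore {
  private current = 0;
  private readonly max: number;
  private readonly waiters: Array<() => void> = [];

  constructor(max: number) {
    if (!Number.isInteger(max) || max < 1) {
      throw new RangeError("max must be a positive integer");
    }
    this.max = max;
  }

  get inUse(): number {
    return this.current;
  }

  // Takes a slot immediately if one is free, otherwise waits for a release.
  async acquire(): Promise<void> {
    if (this.current < this.max) {
      this.current++;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    // Guard: a release with nothing held is ignored instead of going negative.
    if (this.current <= 0) return;
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the slot straight to the waiter; the count is unchanged
      return;
    }
    this.current--;
  }
}

const gate = new Semaphore(2);
gate.release(); // ignored: nothing is held
console.log(gate.inUse); // 0
```

Without the guard, a stray `release()` would push the counter negative and let more than `max` holders in later.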
|
|
54
|
+
|
|
55
|
+
### Research
|
|
56
|
+
|
|
57
|
+
- `docs/research/AGENT-EXECUTION-ARCHITECTURE.md` — Detailed comparison of 3 execution modes (oh-my-pi in-process, pi-crew child-process, pi-crew live-session).
|
|
58
|
+
- `docs/research/UI-RESPONSIVENESS-AUDIT.md` — Root cause analysis for 2-5s agent spawn visibility delay, 5 proposed fixes with priority matrix.
|
|
59
|
+
- `docs/research/DEEP-RESEARCH-PI-POWERBAR.md` — Deep analysis of pi-powerbar architecture (producer/consumer pattern, rendering, settings, comparison with pi-crew's powerbar publisher).
|
|
60
|
+
|
|
61
|
+
## 0.1.48
|
|
62
|
+
|
|
63
|
+
### Added
|
|
64
|
+
|
|
65
|
+
- **Yield-based completion contract** — Workers can call `submit_result` tool to return structured results; task-runner warns on workers that don't yield.
|
|
66
|
+
- **Typed event channels** — `RunEventBus` supports 5 channels (`worker:progress`, `worker:lifecycle`, `worker:stream`, `run:state`, `ui:invalidate`) with `onChannel`/`onChannelForRun` subscriptions and auto-classification.
|
|
67
|
+
- **Human-readable task names** — `generateTaskName()` produces AdjectiveNoun names (14,400 combinations); `displayName` field on `TeamTaskState`.
|
|
68
|
+
- **SubprocessToolRegistry** — Extensible tool event handling with `register`/`extractAll`/`shouldTerminate` pattern; wired into event-stream-bridge.
|
|
69
|
+
- **Event log rotation/compaction** — Auto-compacts event logs over 5MB/50k events, keeping last 1000 events; atomic file replacement.
|
|
70
|
+
- **Incremental JSONL reader** — `readLinesSince`/`readJsonlSince` for seek-based file reading; wired into `readEventsCursor` with `fromByteOffset`.
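The incremental reader entry above describes resuming a JSONL read from a byte offset. The following sketch shows the idea over an in-memory buffer rather than a file. `readJsonlFromBuffer` is a hypothetical helper, not the package's `readJsonlSince`, and skipping unparsable lines is an assumption borrowed from the corrupt-line handling mentioned elsewhere in this changelog.

```typescript
interface ReadResult<T> {
  items: T[];
  nextOffset: number; // byte offset to resume from on the next call
}

// Parses every complete line at or after fromByteOffset; a trailing partial
// line is left unread so it can be picked up once the writer finishes it.
function readJsonlFromBuffer<T>(data: Uint8Array, fromByteOffset: number): ReadResult<T> {
  const decoder = new TextDecoder();
  const items: T[] = [];
  let lineStart = fromByteOffset;
  for (let i = fromByteOffset; i < data.length; i++) {
    if (data[i] !== 0x0a) continue; // not a newline byte
    const text = decoder.decode(data.subarray(lineStart, i)).trim();
    lineStart = i + 1;
    if (text === "") continue;
    try {
      items.push(JSON.parse(text) as T);
    } catch {
      // Assumption: unparsable lines are skipped rather than failing the read.
    }
  }
  return { items, nextOffset: lineStart };
}

const encoder = new TextEncoder();
const partial = encoder.encode('{"n":1}\n{"n":2}\n{"n":');
const first = readJsonlFromBuffer<{ n: number }>(partial, 0);
console.log(first.items.length, first.nextOffset); // 2 16
```

A caller would store `nextOffset` and pass it back on the next read, so lines already parsed are never scanned again.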
|
|
71
|
+
|
|
72
|
+
### Fixed
|
|
73
|
+
|
|
74
|
+
- Fixed `readBlob`/`readBlobMetadata` crash on missing files — now returns `undefined`.
|
|
75
|
+
- Fixed `readSseJson` crash on non-JSON SSE data — now skips malformed events.
|
|
76
|
+
- Fixed wrong value `"long_running"` → `"active_long_running"` in agent-control.
|
|
77
|
+
- Fixed `consecutiveFailures` type bypass — added to `CrewAgentProgress` interface.
|
|
78
|
+
- Fixed `streamBridge.dispose()` memory leak — now in try/finally.
|
|
79
|
+
- Fixed blob-store redundant ternary `typeof x === "string" ? x : x`.
|
|
80
|
+
- Fixed team-runner non-null assertion on potentially empty array.
|
|
81
|
+
- Fixed event-log silent error swallowing — now logs via `logInternalError`.
|
|
82
|
+
- Fixed team-tool switch case indentation.
|
|
83
|
+
- Removed dead code `expandIcon` in agent-management-overlay.
|
|
84
|
+
|
|
85
|
+
### Changed
|
|
86
|
+
|
|
87
|
+
- Moved 6 research .md files from repo root to `docs/research/`.
|
|
88
|
+
- `discoverAgents`/`discoverSkills` silent catches now log via `logInternalError`.
|
|
89
|
+
- `executeHook` accumulates non-blocking diagnostics instead of short-circuiting.
|
|
90
|
+
- `CancellationToken.heartbeat` wired into `collectRuns` and `pruneFinishedRuns`.
|
|
91
|
+
- `CapabilitySource` extended with `"git"` to match `ResourceSource`.
|
|
92
|
+
|
|
93
|
+
## 0.1.47
|
|
94
|
+
|
|
95
|
+
### Added
|
|
96
|
+
|
|
97
|
+
- **Typed hook lifecycle** — 8 of 9 hooks wired: `before_run_start`, `before_task_start`, `task_result`, `before_cancel`, `before_forget`, `before_cleanup`, `before_publish`, `run_recovery`. Hooks are opt-in, blocking/non-blocking, with audit events.
|
|
98
|
+
- **Event-first UI bus** — `RunEventBus` emits on every `appendEvent` call; dashboard, crew widget, sidebar, and snapshot cache subscribe for event-driven invalidation instead of polling.
|
|
99
|
+
- **Shared scan cache** — `SharedScanCache` caches manifest reads and active-run entries with TTL, mtime/size invalidation, and LRU eviction.
|
|
100
|
+
- **Capability inventory** — `buildCapabilityInventory()` enumerates teams, workflows, agents, and skills with stable `kind:name` IDs; supports policy disable and shadowing detection.
|
|
101
|
+
- **Skills in capability inventory** — `discoverSkills()` reads SKILL.md frontmatter; skills appear with kind=`skill` and source=`package`/`project`.
|
|
102
|
+
- **Mailbox kind-separated breakdown** — `RunUiMailbox` tracks `steerUnread`/`followUpUnread`/`responseUnread`/`messageUnread`; mailbox pane shows urgency indicators.
|
|
103
|
+
- **Run recovery hook** — `applyRecoveryPlan` fires `run_recovery` hook; blocked recovery emits `crew.run.recovery_blocked` event.
|
|
104
|
+
- **Synthetic tool cancellation evidence** — Cancelled in-flight tasks receive `tool`-level terminal evidence alongside `worker`-level.
|
|
105
|
+
- **CancellationToken wired into production loops** — `collectRuns` and `pruneFinishedRuns` use `CancellationToken.heartbeat(stage)` for progress diagnostics.
|
|
106
|
+
- **Blob artifact store** — SHA-256 content-addressed storage with metadata sidecars.
|
|
107
|
+
- **Run event provenance** — Event metadata includes `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
|
|
108
|
+
- **Control channel reservation** — `ControlReservation` before worker spawn with deterministic `controllerId`.
|
|
109
|
+
- **Release smoke test** — `npm run smoke:release` automates tarball install + version consistency check.
|
|
110
|
+
- **Width-safety tests** — Crew widget rendering verified at widths 1/40/200/empty/multiple.
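The shared scan cache entry above combines TTL expiry with LRU eviction. The sketch below illustrates those two mechanisms only, leaving out the mtime/size invalidation. `TtlLruCache` and its constructor parameters are hypothetical and do not reflect the `SharedScanCache` API.

```typescript
class TtlLruCache<V> {
  // Map preserves insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop and report a miss
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

let clock = 0;
const cache = new TtlLruCache<string>(2, 100, () => clock);
cache.set("a", "manifest-a");
cache.set("b", "manifest-b");
cache.get("a"); // touch "a" so "b" becomes the eviction candidate
cache.set("c", "manifest-c");
console.log(cache.get("b")); // undefined
```

Injecting the clock keeps expiry deterministic in tests; production code would rely on the `Date.now` default.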
|
|
111
|
+
|
|
112
|
+
### Changed
|
|
113
|
+
|
|
114
|
+
- `handleCancel`, `handleForget`, `handleCleanup`, `handlePrune`, `handleExport` converted to async for hook execution.
|
|
115
|
+
- `before_cancel`/`before_forget`/`before_cleanup` hooks can block their respective operations.
|
|
116
|
+
- `before_publish` hook fires before run export.
|
|
117
|
+
- `task_result` hook fires before `task.completed`/`task.failed` events.
|
|
118
|
+
- Dashboard, widget, and sidebar auto-invalidate on `RunEventBus` events.
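The blocking hooks listed above (`before_cancel`, `before_forget`, `before_cleanup`) can veto their operation, while non-blocking diagnostics are collected. A simplified, synchronous sketch of that control flow is shown here. The `Hook`, `HookResult`, and `runHooks` names are assumptions, and the real handlers are described as async.

```typescript
// Hypothetical result shape for a hook invocation.
interface HookResult {
  block?: boolean;
  reason?: string;
  diagnostic?: string;
}

type Hook = (event: string) => HookResult;

interface HookOutcome {
  blocked: boolean;
  reason: string | undefined;
  diagnostics: string[];
}

// Runs hooks in order, collecting diagnostics; the first blocking result stops the operation.
function runHooks(event: string, hooks: Hook[]): HookOutcome {
  const diagnostics: string[] = [];
  for (const hook of hooks) {
    const result = hook(event);
    if (result.diagnostic !== undefined) diagnostics.push(result.diagnostic);
    if (result.block === true) {
      return { blocked: true, reason: result.reason, diagnostics };
    }
  }
  return { blocked: false, reason: undefined, diagnostics };
}

const outcome = runHooks("before_cancel", [
  () => ({ diagnostic: "audit: cancel requested" }),
  () => ({ block: true, reason: "run is publishing" }),
]);
console.log(outcome.blocked, outcome.reason); // true run is publishing
```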
|
|
119
|
+
|
|
5
120
|
## 0.1.45
|
|
6
121
|
|
|
7
122
|
### Added
|
package/agents/analyst.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: analyst
|
|
3
|
-
description: Analyze requirements, ambiguity, and hidden constraints
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a requirements analyst. Identify what is known, unknown, risky, ambiguous, or underspecified. Produce clarifying assumptions and acceptance criteria.
|
|
1
|
+
---
|
|
2
|
+
name: analyst
|
|
3
|
+
description: Analyze requirements, ambiguity, and hidden constraints
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a requirements analyst. Identify what is known, unknown, risky, ambiguous, or underspecified. Produce clarifying assumptions and acceptance criteria.
|
package/agents/critic.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: critic
|
|
3
|
-
description: Challenge plans and designs before execution
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a critical reviewer. Find flaws, missing steps, unsafe assumptions, overengineering, underengineering, and verification gaps. Return concrete fixes to the plan.
|
|
1
|
+
---
|
|
2
|
+
name: critic
|
|
3
|
+
description: Challenge plans and designs before execution
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a critical reviewer. Find flaws, missing steps, unsafe assumptions, overengineering, underengineering, and verification gaps. Return concrete fixes to the plan.
|
package/agents/executor.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: executor
|
|
3
|
-
description: Implement planned code changes
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls, bash, edit, write
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are an implementation specialist. Follow the provided plan, make targeted changes, keep edits minimal, and report changed files plus validation status. Do not broaden scope without explaining why.
|
|
1
|
+
---
|
|
2
|
+
name: executor
|
|
3
|
+
description: Implement planned code changes
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls, bash, edit, write
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are an implementation specialist. Follow the provided plan, make targeted changes, keep edits minimal, and report changed files plus validation status. Do not broaden scope without explaining why.
|
package/agents/explorer.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: explorer
|
|
3
|
-
description: Fast codebase discovery and file/symbol mapping
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a fast codebase explorer. Map relevant files, symbols, data flow, and constraints. Do not modify files. Return concise findings with paths and evidence.
|
|
1
|
+
---
|
|
2
|
+
name: explorer
|
|
3
|
+
description: Fast codebase discovery and file/symbol mapping
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a fast codebase explorer. Map relevant files, symbols, data flow, and constraints. Do not modify files. Return concise findings with paths and evidence.
|
package/agents/planner.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: planner
|
|
3
|
-
description: Create an execution plan with clear sequencing and risk notes
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a planning specialist. Convert the goal and discovery notes into a concrete, ordered plan. Identify dependencies, risks, validation steps, and handoff instructions for implementers.
|
|
1
|
+
---
|
|
2
|
+
name: planner
|
|
3
|
+
description: Create an execution plan with clear sequencing and risk notes
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a planning specialist. Convert the goal and discovery notes into a concrete, ordered plan. Identify dependencies, risks, validation steps, and handoff instructions for implementers.
|
package/agents/reviewer.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: reviewer
|
|
3
|
-
description: Review code changes for correctness, maintainability, and regressions
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls, bash
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a code reviewer. Review the implementation for bugs, regressions, maintainability issues, missing tests, and project-rule violations. Return prioritized findings with evidence.
|
|
1
|
+
---
|
|
2
|
+
name: reviewer
|
|
3
|
+
description: Review code changes for correctness, maintainability, and regressions
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls, bash
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a code reviewer. Review the implementation for bugs, regressions, maintainability issues, missing tests, and project-rule violations. Return prioritized findings with evidence.
|
package/agents/security-reviewer.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: security-reviewer
|
|
3
|
-
description: Review changes for security vulnerabilities and trust-boundary issues
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls, bash
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a security reviewer. Look for injection, authn/authz flaws, insecure defaults, secret exposure, unsafe filesystem/network behavior, and dependency risks. Return severity and remediation.
|
|
1
|
+
---
|
|
2
|
+
name: security-reviewer
|
|
3
|
+
description: Review changes for security vulnerabilities and trust-boundary issues
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls, bash
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a security reviewer. Look for injection, authn/authz flaws, insecure defaults, secret exposure, unsafe filesystem/network behavior, and dependency risks. Return severity and remediation.
|
package/agents/test-engineer.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: test-engineer
|
|
3
|
-
description: Design and implement test strategy for a change
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls, bash, edit, write
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a test engineer. Identify the right test level, add or adjust tests when asked, detect flaky assumptions, and report exact validation commands and results.
|
|
1
|
+
---
|
|
2
|
+
name: test-engineer
|
|
3
|
+
description: Design and implement test strategy for a change
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls, bash, edit, write
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a test engineer. Identify the right test level, add or adjust tests when asked, detect flaky assumptions, and report exact validation commands and results.
|
package/agents/verifier.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: verifier
|
|
3
|
-
description: Verify that implementation satisfies the requested goal
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls, bash
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a verification specialist. Check whether the work is complete, correct, tested, and aligned with project constraints. Prefer evidence over assumptions. Return PASS or FAIL with reasons.
|
|
1
|
+
---
|
|
2
|
+
name: verifier
|
|
3
|
+
description: Verify that implementation satisfies the requested goal
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls, bash
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a verification specialist. Check whether the work is complete, correct, tested, and aligned with project constraints. Prefer evidence over assumptions. Return PASS or FAIL with reasons.
|
package/agents/writer.md
CHANGED
|
@@ -1,11 +1,11 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: writer
|
|
3
|
-
description: Write concise documentation, migration notes, and summaries
|
|
4
|
-
model: false
|
|
5
|
-
systemPromptMode: replace
|
|
6
|
-
inheritProjectContext: true
|
|
7
|
-
inheritSkills: false
|
|
8
|
-
tools: read, grep, find, ls, edit, write
|
|
9
|
-
---
|
|
10
|
-
|
|
11
|
-
You are a documentation specialist. Produce clear, concise, maintainable docs and summaries. Preserve technical accuracy and avoid marketing fluff.
|
|
1
|
+
---
|
|
2
|
+
name: writer
|
|
3
|
+
description: Write concise documentation, migration notes, and summaries
|
|
4
|
+
model: false
|
|
5
|
+
systemPromptMode: replace
|
|
6
|
+
inheritProjectContext: true
|
|
7
|
+
inheritSkills: false
|
|
8
|
+
tools: read, grep, find, ls, edit, write
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
You are a documentation specialist. Produce clear, concise, maintainable docs and summaries. Preserve technical accuracy and avoid marketing fluff.
|
package/docs/next-upgrade-roadmap.md
CHANGED
|
@@ -22,6 +22,66 @@ Already implemented and pushed:
|
|
|
22
22
|
- Live-agent control distinguishes `steer` from `follow-up` at live-control/API level.
|
|
23
23
|
- Retry attempts have `attemptId`; max-retry deadletters link to the final `attemptId`.
|
|
24
24
|
- Worker prompt pipeline and capability inventory metadata artifacts are written per task.
|
|
25
|
+
- P0.1: effectiveness guard escalates `warn` to `blocked` for mutating-role tasks with no observable worker activity.
|
|
26
|
+
- P1.1: mailbox `readMailbox` accepts `kind` filter; API `read-mailbox` supports `config.kind`.
|
|
27
|
+
- P1.5: `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
|
|
28
|
+
- P1.6: `buildSyntheticTerminalEvidence()` produces `"worker"`/`"cancelled"` terminal records for cancelled in-flight tasks.
|
|
29
|
+
- P1.7: `buildCapabilityInventory(cwd)` normalizes teams/workflows/agents; API `operation=inventory`.
|
|
30
|
+
- P2.1: typed hook lifecycle — `registerHook`/`executeHook` registry; `before_run_start` and `before_task_start` wired.
|
|
31
|
+
- P2.4: `AbortSignal` wired into `collectRuns`, `validateMailbox`, `readAllMailboxMessages`, `pruneFinishedRuns`, `cleanupRunWorktrees`, etc.
|
|
32
|
+
- Resume scaffold runs preserve scaffold mode from original manifest when workers not disabled.
|
|
33
|
+
|
|
34
|
+
## Implementation Status as of `v0.1.46`
|
|
35
|
+
|
|
36
|
+
This roadmap is **not complete overall**. The `v0.1.46` release completed several vertical slices, but multiple roadmap items remain partial or unimplemented.
|
|
37
|
+
|
|
38
|
+
### Implemented / mostly implemented
|
|
39
|
+
|
|
40
|
+
- Baseline worker behavior: real child-process execution by default, explicit scaffold dry-runs, and blocked implicit scaffold/no-op runs.
|
|
41
|
+
- P0.1 ✅ effectiveness policy enforcement: default guard escalates `warn` to `blocked` for mutating-role tasks.
|
|
42
|
+
- P0.2 ✅ runtime safety persistence: manifests persist `runtimeResolution`; `runtime.resolved` event emitted; status shows safety; blocked runs persist evidence.
|
|
43
|
+
- Effectiveness reporting: summary/progress/status expose no-observed-work evidence and policy outcome.
|
|
44
|
+
- Structured cancellation basics: cancellation reasons flow through retry/backoff/team-runner paths and run/task events.
|
|
45
|
+
- Retry attempt evidence: retry attempts and max-retry deadletters carry/link `attemptId` data.
|
|
46
|
+
- Prompt pipeline artifacts and per-task capability metadata artifacts are written.
|
|
47
|
+
- P1.3 worker teardown evidence vertical slice: `WorkerExitStatus` and terminal worker cancellation evidence exist.
|
|
48
|
+
|
|
49
|
+
### Completed in this upgrade cycle (after v0.1.46)
|
|
50
|
+
|
|
51
|
+
- P0.1 effectiveness policy enforcement: default guard now escalates `warn` to `blocked` for mutating-role tasks with no observable worker activity; read-only roles remain `warning`.
|
|
52
|
+
- P0.2 runtime safety persistence: manifests persist `runtimeResolution`; `runtime.resolved` event emitted; status shows safety; blocked runs persist evidence.
|
|
53
|
+
- P1.1 durable steering/follow-up queues: `readMailbox` accepts `kind` filter; API `read-mailbox` supports `config.kind`; steering and follow-up are isolatable by kind.
|
|
54
|
+
- P1.2 respond vs follow-up UX: `/team-follow-up` command added for continuation prompts; `/team-respond` remains for waiting-task replies.
|
|
55
|
+
- P1.3 two-phase child process teardown: `WorkerExitStatus` populated from graceful SIGTERM → grace window → hard kill pipeline.
|
|
56
|
+
- P1.5 event-tree provenance: `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`; retry and cancel events carry `attemptId`.
|
|
57
|
+
- P1.6 synthetic terminal results: `buildSyntheticTerminalEvidence()` in `cancellation.ts`; cancelled in-flight tasks receive `"worker"`/`"cancelled"` terminal evidence records.
|
|
58
|
+
- P1.7 unified capability inventory: `buildCapabilityInventory(cwd)` normalizes teams/workflows/agents into `CapabilityItem[]`; API `operation=inventory` returns sorted JSON.
|
|
59
|
+
- P1.8 capability disable by stable ID: `disabledCapabilities` in `CrewPolicyConfig`; inventory marks disabled items with reason.
|
|
60
|
+
- P2.1 typed hook lifecycle: `HookName`, `HookMode`, `HookOutcome`, `HookContext`, `HookResult`, `HookExecutionReport` types; `registerHook`/`executeHook`/`clearHooks` registry; `before_run_start` and `before_task_start` wired into team-runner.
|
|
61
|
+
- P2.2 intent gates for destructive actions: `enforceDestructiveIntent` wired in cancel/cleanup/forget/prune/delete; configurable via `policy.requireIntentForDestructiveActions`.
|
|
62
|
+
- P2.3 durable history projection: `transformRunContextBeforeWorkerStart()` and `convertRunHistoryToWorkerPrompt()` bounded projection functions.
|
|
63
|
+
- P2.4 CancellationToken wired into long scans: `AbortSignal` passed to `collectRuns`/`validateMailbox`/`readAllMailboxMessages`/`pruneFinishedRuns`/`cleanupRunWorktrees`.
|
|
64
|
+
- P2.5 content-addressed blob store: `writeBlob`/`readBlob`/`readBlobMetadata` with SHA-256 dedup and metadata sidecars.
|
|
65
|
+
- P2.6 dashboard panes for capability and cancellation: `renderCapabilityPane` and `renderCancellationPane`.
|
|
66
|
+
- Resume scaffold run fix: preserves scaffold mode from original manifest when workers not disabled.
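
The P2.5 item in the list above describes a content-addressed blob store with SHA-256 dedup and metadata sidecars. The following is a minimal in-memory sketch of that idea, not the package's implementation: `InMemoryBlobStore`, `BlobMetadata`, and the `mediaType` field are hypothetical names chosen for illustration, and the real `writeBlob`/`readBlob`/`readBlobMetadata` functions may have different signatures and persist to disk.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sidecar shape; the actual metadata fields are not shown in the roadmap.
interface BlobMetadata {
  sha256: string;
  size: number;
  mediaType: string;
}

// Illustrative store: blobs are keyed by the SHA-256 of their bytes,
// so writing identical content twice stores it only once.
class InMemoryBlobStore {
  private readonly blobs = new Map<string, Buffer>();
  private readonly metadata = new Map<string, BlobMetadata>();

  write(content: string | Buffer, mediaType = "application/octet-stream"): BlobMetadata {
    const bytes = typeof content === "string" ? Buffer.from(content, "utf8") : content;
    const sha256 = createHash("sha256").update(bytes).digest("hex");

    // Dedup: identical content hashes to the same key, so reuse the existing entry.
    const existing = this.metadata.get(sha256);
    if (existing) return existing;

    const meta: BlobMetadata = { sha256, size: bytes.length, mediaType };
    this.blobs.set(sha256, bytes);
    this.metadata.set(sha256, meta);
    return meta;
  }

  read(sha256: string): Buffer | undefined {
    return this.blobs.get(sha256);
  }

  get count(): number {
    return this.blobs.size;
  }
}
```

Keying by digest means callers can store the hash in an artifact record and fetch the bytes later without caring where or how many times the content was written.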
|
|
67
|
+
|
|
68
|
+
### Partial / not safe to mark complete
|
|
69
|
+
|
|
70
|
+
- P1.4 reserve worker control channel before spawn: controller metadata persistence during startup not yet implemented.
|
|
71
|
+
- P2.7 event-first UI: render coalescing and snapshot caches exist, but live UI still relies on durable file polling as a primary source in several panes.
|
|
72
|
+
- P2.8 shared raw scan-entry cache: not yet implemented.
|
|
73
|
+
|
|
74
|
+
### Completed / no longer backlog
|
|
75
|
+
|
|
76
|
+
- P2.7 event-first UI — RunEventBus wired into appendEvent; dashboard, widget, sidebar auto-invalidate on events; snapshot cache invalidates on events.
|
|
77
|
+
- P2.8 shared raw scan-entry cache — SharedScanCache implemented and wired into manifest reads (run-index) and active-run-registry (active manifest reads).
|
|
78
|
+
- P3.1 tarball-install smoke — `scripts/release-smoke.mjs` verified; `npm run smoke:release` added.
|
|
79
|
+
- Hook lifecycle — All hooks wired: `before_run_start`, `before_task_start`, `before_cancel`, `before_forget`, `before_cleanup`, `before_publish`, `task_result`, `run_recovery`. Only `session_before_switch` remains (no cwd switch mechanism in current codebase).
|
|
80
|
+
|
|
81
|
+
### Remaining items
|
|
82
|
+
|
|
83
|
+
- `session_before_switch` hook — no cwd/session switch mechanism in current codebase; placeholder for future.
|
|
84
|
+
- P3.2 CI gate — integrate `smoke:release` into CI pipeline (requires CI config).
|
|
25
85
|
|
|
26
86
|
## Priority Legend
|
|
27
87
|
|
|
@@ -82,12 +142,13 @@ Already implemented and pushed:
|
|
|
82
142
|
- set run `blocked` or `failed` depending on config;
|
|
83
143
|
- include task IDs in `data`.
|
|
84
144
|
|
|
85
|
-
**Acceptance criteria**
|
|
145
|
+
**Acceptance criteria** ✅
|
|
86
146
|
|
|
87
|
-
- A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
|
|
88
|
-
- Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
|
|
89
|
-
- `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
|
|
90
|
-
- Unit tests cover warn/block/fail modes.
|
|
147
|
+
- ✅ A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
|
|
148
|
+
- ✅ Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
|
|
149
|
+
- ✅ `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
|
|
150
|
+
- ✅ Unit tests cover warn/block/fail modes.
|
|
151
|
+
- ✅ Default guard escalates `warn` to `blocked` for mutating-role tasks.
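
The acceptance criteria above can be read as a small decision rule: a task with no observed work is escalated to `blocked` when its role can mutate the workspace, and stays at a warning otherwise. The sketch below illustrates that rule only. `TaskObservation`, `evaluateEffectiveness`, and the `MUTATING_ROLES` set are hypothetical; which roles count as mutating and what evidence the real guard inspects are assumptions.

```typescript
type GuardOutcome = "ok" | "warn" | "blocked";

// Hypothetical evidence counters for one task.
interface TaskObservation {
  taskId: string;
  role: string;
  toolCalls: number;
  modelTurns: number;
  transcriptBytes: number;
}

// Assumption: roles whose agent definitions include edit/write tools are "mutating".
const MUTATING_ROLES = new Set(["test-engineer", "writer"]);

function evaluateEffectiveness(obs: TaskObservation): GuardOutcome {
  const observedWork =
    obs.toolCalls > 0 || obs.modelTurns > 0 || obs.transcriptBytes > 0;
  if (observedWork) return "ok";
  // No tool/model/transcript evidence: escalate only for mutating roles.
  return MUTATING_ROLES.has(obs.role) ? "blocked" : "warn";
}

// Collect the task IDs a status view might list under "needs attention".
function needsAttention(observations: readonly TaskObservation[]): string[] {
  return observations
    .filter((obs) => evaluateEffectiveness(obs) !== "ok")
    .map((obs) => obs.taskId);
}
```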
|
|
91
152
|
|
|
92
153
|
**Verification**
|
|
93
154
|
|
|
@@ -130,11 +191,12 @@ npm run test:unit
|
|
|
130
191
|
- `test/unit/team-run.test.ts`
|
|
131
192
|
- `test/unit/runtime-resolver.test.ts`
|
|
132
193
|
|
|
133
|
-
**Acceptance criteria**
|
|
194
|
+
**Acceptance criteria** ✅
|
|
134
195
|
|
|
135
|
-
- `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
|
|
136
|
-
- Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
|
|
137
|
-
- Existing manifest schema remains backward compatible.
|
|
196
|
+
- ✅ `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
|
|
197
|
+
- ✅ Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
|
|
198
|
+
- ✅ Existing manifest schema remains backward compatible.
|
|
199
|
+
- ✅ `runtimeResolution` persisted on manifest; `runtime.resolved` event emitted.
|
|
138
200
|
|
|
139
201
|
## P1 — Steering/Follow-up Semantics Beyond Live Control
|
|
140
202
|
|
|
@@ -170,12 +232,12 @@ npm run test:unit
|
|
|
170
232
|
- `test/unit/live-agent-control.test.ts`
|
|
171
233
|
- `test/unit/respond-tool.test.ts`
|
|
172
234
|
|
|
173
|
-
**Acceptance criteria**
|
|
235
|
+
**Acceptance criteria** ✅ (partially — kind filter and API done; UI pane separation remaining)
|
|
174
236
|
|
|
175
|
-
- Steering and follow-up can be inspected separately.
|
|
176
|
-
- Existing inbox/outbox JSONL remains readable.
|
|
177
|
-
-
|
|
178
|
-
-
|
|
237
|
+
- ✅ Steering and follow-up can be inspected separately via `readMailbox` kind filter and API `config.kind`.
|
|
238
|
+
- ✅ Existing inbox/outbox JSONL remains readable.
|
|
239
|
+
- ✅ Kind filter survives process/session switch (durable mailbox).
|
|
240
|
+
- ✅ UI/status separates urgent steering from follow-up backlog (mailbox pane shows kind breakdown with urgency indicators).
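
To make the kind filter concrete, here is a minimal sketch of reading a JSONL mailbox and isolating one kind. It is not the package's `readMailbox`: the `MailboxMessage` shape, the kind values `"steer"` and `"follow-up"`, and the function names are assumptions. Messages without a `kind` field are kept when no filter is given, which is one way older inbox/outbox files could stay readable.

```typescript
// Hypothetical kind values and message shape.
type MailboxKind = "steer" | "follow-up";

interface MailboxMessage {
  id: string;
  body: string;
  kind?: MailboxKind; // optional so legacy entries without a kind still parse
}

// Parse JSONL text, skipping blank lines.
function parseMailboxJsonl(text: string): MailboxMessage[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as MailboxMessage);
}

// With no kind, return everything; with a kind, return only matching messages.
function filterMailbox(
  messages: readonly MailboxMessage[],
  kind?: MailboxKind,
): MailboxMessage[] {
  if (kind === undefined) return [...messages];
  return messages.filter((message) => message.kind === kind);
}
```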
|
|
179
241
|
|
|
180
242
|
### P1.2 Clarify `respond` vs `follow-up` UX
|
|
181
243
|
|
|
@@ -307,11 +369,12 @@ Retry attempts have `attemptId`, and deadletters link to final attempt. Event lo
|
|
|
307
369
|
- `test/unit/event-metadata.test.ts`
|
|
308
370
|
- `test/unit/retry-executor.test.ts`
|
|
309
371
|
|
|
310
|
-
**Acceptance criteria**
|
|
372
|
+
**Acceptance criteria** ✅
|
|
311
373
|
|
|
312
|
-
- Retry attempt events and terminal task events share attempt provenance.
|
|
313
|
-
- Deadletter records can be traced back to event sequence.
|
|
314
|
-
- Existing JSONL readers ignore missing provenance fields.
|
|
374
|
+
- ✅ Retry attempt events and terminal task events share attempt provenance.
|
|
375
|
+
- ✅ Deadletter records can be traced back to event sequence.
|
|
376
|
+
- ✅ Existing JSONL readers ignore missing provenance fields.
|
|
377
|
+
- ✅ `TeamEventMetadata` extended with `parentEventId`, `attemptId`, `branchId`, `causationId`, `correlationId`.
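
One way to picture how the provenance fields let a deadletter be traced back through the event sequence is to follow `parentEventId` links. The sketch below assumes all five fields are optional (so older JSONL entries without them still load) and that each event has an `id`. `TeamEventSketch` and `traceToRoot` are hypothetical names, not the package's types.

```typescript
// The five field names come from the roadmap; treating them as optional strings is an assumption.
interface EventProvenance {
  parentEventId?: string;
  attemptId?: string;
  branchId?: string;
  causationId?: string;
  correlationId?: string;
}

interface TeamEventSketch {
  id: string;
  type: string;
  metadata?: EventProvenance;
}

// Walk parentEventId links from a starting event back to the root.
// Returns event IDs from the start event to the oldest reachable ancestor.
function traceToRoot(events: readonly TeamEventSketch[], startId: string): string[] {
  const byId = new Map(events.map((event) => [event.id, event] as const));
  const path: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current = byId.get(startId);
  while (current && !seen.has(current.id)) {
    seen.add(current.id);
    path.push(current.id);
    const parentId = current.metadata?.parentEventId;
    current = parentId ? byId.get(parentId) : undefined;
  }
  return path;
}
```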
|
|
315
378
|
|
|
316
379
|
### P1.6 Synthetic terminal results for cancelled in-flight operations
|
|
317
380
|
|
|
@@ -336,10 +399,11 @@ Run/task cancellation events are now structured, but worker/tool sub-operations
|
|
|
336
399
|
- `src/state/contracts.ts`
|
|
337
400
|
- `test/unit/cancellation.test.ts`
|
|
338
401
|
|
|
339
|
-
**Acceptance criteria**
|
|
402
|
+
**Acceptance criteria** ✅
|
|
340
403
|
|
|
341
|
-
- No started tool/model operation is left without terminal evidence after cancellation.
|
|
342
|
-
- Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
|
|
404
|
+
- ✅ No started tool/model operation is left without terminal evidence after cancellation.
|
|
405
|
+
- ✅ Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
|
|
406
|
+
- ✅ `buildSyntheticTerminalEvidence()` in `cancellation.ts` produces `"worker"`/`"cancelled"` records.
|
|
343
407
|
|
|
344
408
|
## P1 — Capability Inventory and Control Center
|
|
345
409
|
|
|
@@ -379,10 +443,10 @@ interface CapabilityItem {
|
|
|
379
443
|
|
|
380
444
|
**Acceptance criteria**
|
|
381
445
|
|
|
382
|
-
- Inventory is stable and sorted.
|
|
383
|
-
- Shadowed project/user/builtin resources are visible.
|
|
384
|
-
- Skill disabled/budget state is visible.
|
|
385
|
-
- No file path is used as the only stable ID.
|
|
446
|
+
- ✅ Inventory is stable and sorted.
|
|
447
|
+
- ✅ Shadowed project/user/builtin resources are visible in capability inventory (state="shadowed", shadowedBy field).
|
|
448
|
+
- ✅ Skill disabled/budget state is visible in capability inventory (skills enumerated via discoverSkills).
|
|
449
|
+
- ✅ No file path is used as the only stable ID (uses `kind:name` IDs).
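
The criteria above (stable sort, `kind:name` IDs, visible shadowing) can be sketched as follows. This is an illustration under assumptions rather than the real `buildCapabilityInventory`: the `RawCapability` input, the precedence order project over user over builtin, and storing the winning source in `shadowedBy` are all hypothetical choices.

```typescript
type CapabilitySource = "project" | "user" | "builtin";
type CapabilityKind = "team" | "workflow" | "agent" | "skill";

interface RawCapability {
  kind: CapabilityKind;
  name: string;
  source: CapabilitySource;
}

interface InventoryItem extends RawCapability {
  id: string; // stable `kind:name` ID, independent of file paths
  state: "active" | "shadowed";
  shadowedBy?: CapabilitySource;
}

// Assumption: lower number wins when the same ID is defined in several places.
const PRECEDENCE: Record<CapabilitySource, number> = { project: 0, user: 1, builtin: 2 };

function buildInventorySketch(raw: readonly RawCapability[]): InventoryItem[] {
  const winners = new Map<string, RawCapability>();
  for (const item of raw) {
    const id = `${item.kind}:${item.name}`;
    const current = winners.get(id);
    if (!current || PRECEDENCE[item.source] < PRECEDENCE[current.source]) {
      winners.set(id, item);
    }
  }
  const items = raw.map((item): InventoryItem => {
    const id = `${item.kind}:${item.name}`;
    const winner = winners.get(id)!;
    return winner === item
      ? { ...item, id, state: "active" }
      : { ...item, id, state: "shadowed", shadowedBy: winner.source };
  });
  // Deterministic order: by ID, then by source precedence.
  return items.sort((a, b) =>
    a.id < b.id ? -1 : a.id > b.id ? 1 : PRECEDENCE[a.source] - PRECEDENCE[b.source],
  );
}
```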
|
|
386
450
|
|
|
387
451
|
### P1.8 Persist capability disables by stable ID
|
|
388
452
|
|
|
@@ -436,11 +500,18 @@ Errors are recorded in diagnostics/events, not uncontrolled exceptions.
|
|
|
436
500
|
- `docs/resource-formats.md`
|
|
437
501
|
- `test/unit/hooks*.test.ts`
|
|
438
502
|
|
|
439
|
-
**Acceptance criteria**
|
|
503
|
+
**Acceptance criteria** ✅ (partial — `before_cancel` not yet wired for async)
|
|
440
504
|
|
|
441
|
-
- Blocking hook can stop a run before worker start with clear event and status.
|
|
442
|
-
- Non-blocking hook failure records diagnostic and does not crash run.
|
|
443
|
-
- Hook context is redacted and bounded.
|
|
505
|
+
- ✅ Blocking hook can stop a run before worker start with clear event and status.
|
|
506
|
+
- ✅ Non-blocking hook failure records diagnostic and does not crash run.
|
|
507
|
+
- ✅ Hook context is redacted and bounded.
|
|
508
|
+
- ✅ `before_cancel` hook wired (async handleCancel conversion done).
|
|
509
|
+
- ✅ `before_forget` hook wired (async handleForget conversion done).
|
|
510
|
+
- ✅ `before_cleanup` hook wired (async handleCleanup conversion done).
|
|
511
|
+
- ✅ `task_result` hook wired in task-runner before completed/failed event.
|
|
512
|
+
- ✅ `before_publish` hook wired in handleExport.
|
|
513
|
+
- ✅ `run_recovery` hook wired in crash-recovery `applyRecoveryPlan`.
|
|
514
|
+
- ☐ `session_before_switch` not yet wired (no cwd switch mechanism in current codebase; placeholder for future Pi lifecycle integration).
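
The first two criteria above (a blocking hook can stop a run; a non-blocking hook failure is recorded but does not crash the run) can be illustrated with a small registry. This sketch is synchronous for brevity, whereas the roadmap describes async handlers, and it does not reproduce the package's `registerHook`/`executeHook` API. `SketchHookRegistry`, `HookVerdict`, and the report shape are hypothetical, and treating a thrown error in a blocking hook as a block is an assumption.

```typescript
type HookMode = "blocking" | "non_blocking";
type HookContext = Readonly<Record<string, unknown>>;
type HookVerdict = { allow: boolean; reason?: string };
type HookHandler = (ctx: HookContext) => HookVerdict;

interface HookReport {
  hook: string;
  outcome: "allowed" | "blocked";
  diagnostics: string[];
}

class SketchHookRegistry {
  private readonly handlers = new Map<string, Array<{ mode: HookMode; handler: HookHandler }>>();

  register(hook: string, mode: HookMode, handler: HookHandler): void {
    const list = this.handlers.get(hook) ?? [];
    list.push({ mode, handler });
    this.handlers.set(hook, list);
  }

  run(hook: string, ctx: HookContext): HookReport {
    const diagnostics: string[] = [];
    for (const { mode, handler } of this.handlers.get(hook) ?? []) {
      let verdict: HookVerdict;
      try {
        verdict = handler(ctx);
      } catch (err) {
        // A thrown error is converted into a rejection instead of propagating.
        verdict = { allow: false, reason: err instanceof Error ? err.message : String(err) };
      }
      if (verdict.allow) continue;
      const reason = verdict.reason ?? "no reason given";
      if (mode === "blocking") {
        return { hook, outcome: "blocked", diagnostics: [...diagnostics, reason] };
      }
      // Non-blocking: record the problem and keep going.
      diagnostics.push(`non-blocking hook rejected or failed: ${reason}`);
    }
    return { hook, outcome: "allowed", diagnostics };
  }
}
```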
|
|
444
515
|
|
|
445
516
|
### P2.2 Require intent via policy/hook for destructive actions
|
|
446
517
|
|
|
@@ -547,11 +618,11 @@ Use it in:
|
|
|
547
618
|
- `src/ui/run-snapshot-cache.ts`
|
|
548
619
|
- `test/unit/cancellation-token.test.ts`
|
|
549
620
|
|
|
550
|
-
**Acceptance criteria**
|
|
621
|
+
**Acceptance criteria** ✅
|
|
551
622
|
|
|
552
|
-
- Long scan can abort within bounded cadence.
|
|
553
|
-
-
|
|
554
|
-
- Existing APIs can pass no token and keep current behavior.
|
|
623
|
+
- ✅ Long scan can abort within bounded cadence (`AbortSignal` wired into `collectRuns`, `validateMailbox`, `readAllMailboxMessages`, `pruneFinishedRuns`, `cleanupRunWorktrees`).
|
|
624
|
+
- ✅ `CancellationToken.heartbeat(stage)` wired into `collectRuns` and `pruneFinishedRuns` with stage diagnostics.
|
|
625
|
+
- ✅ Existing APIs can pass no token/signal and keep current behavior.
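
As a rough illustration of "abort within bounded cadence" while keeping the signal optional, the sketch below checks an `AbortSignal` every `checkEvery` entries during a scan. `scanEntries`, its options, and the default cadence of 64 are hypothetical and not taken from the package. The only platform API used is the standard `AbortSignal.aborted` property.

```typescript
interface ScanOptions {
  signal?: AbortSignal; // optional: callers that pass nothing get the old behavior
  checkEvery?: number;  // how many entries may be processed between abort checks
}

interface ScanResult {
  visited: number;
  aborted: boolean;
}

function scanEntries<T>(
  entries: Iterable<T>,
  visit: (entry: T) => void,
  options: ScanOptions = {},
): ScanResult {
  const { signal, checkEvery = 64 } = options;
  let visited = 0;
  for (const entry of entries) {
    // Bounded cadence: at most `checkEvery` entries are visited after an abort.
    if (signal && visited % checkEvery === 0 && signal.aborted) {
      return { visited, aborted: true };
    }
    visit(entry);
    visited++;
  }
  return { visited, aborted: false };
}
```

A smaller `checkEvery` tightens the abort latency at the cost of more frequent checks; returning a partial result rather than throwing is a design choice made here for simplicity.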
|
|
555
626
|
|
|
556
627
|
## P2 — Artifact Store Improvements
|
|
557
628
|
|
|
@@ -699,16 +770,20 @@ npm pack
|
|
|
699
770
|
|
|
700
771
|
## Suggested Implementation Order
|
|
701
772
|
|
|
702
|
-
1.
|
|
703
|
-
2.
|
|
773
|
+
1. ~~**P0.1 Effectiveness policy enforcement**~~ ✅ Completed — default guard escalates `warn` to `blocked` for mutating-role tasks.
|
|
774
|
+
2. ~~**P0.2 Persist runtime safety**~~ ✅ Completed — manifests persist `runtimeResolution`; `runtime.resolved` event emitted.
|
|
704
775
|
3. **P1.3 Two-phase worker teardown** — reduces stale/zombie worker risk.
|
|
705
|
-
4.
|
|
706
|
-
5.
|
|
707
|
-
6.
|
|
708
|
-
7.
|
|
709
|
-
8.
|
|
710
|
-
9.
|
|
711
|
-
10.
|
|
776
|
+
4. ~~**P1.1 Durable steering/follow-up queues**~~ ✅ Completed — `readMailbox` kind filter; API `read-mailbox` supports `config.kind`.
|
|
777
|
+
5. ~~**P1.5 Event-tree provenance**~~ ✅ Completed — `TeamEventMetadata` extended with `parentEventId`/`attemptId`/`branchId`/`causationId`/`correlationId`.
|
|
778
|
+
6. ~~**P1.7 Capability inventory view**~~ ✅ Completed — `buildCapabilityInventory()` + API `operation=inventory` + dashboard pane.
|
|
779
|
+
7. ~~**P2.3 Durable history projection**~~ ✅ Completed — `transformRunContextBeforeWorkerStart()` + `convertRunHistoryToWorkerPrompt()`.
|
|
780
|
+
8. ~~**P2.4 CancellationToken**~~ ✅ Completed — wired into `collectRuns`/`validateMailbox`/`pruneFinishedRuns`/`cleanupRunWorktrees` etc.
|
|
781
|
+
9. ~~**P2.5 Blob artifacts**~~ ✅ Completed — content-addressed blob store with SHA-256 dedup and metadata sidecars.
|
|
782
|
+
10. ~~**P2.6 Dashboard panels**~~ ✅ Completed — capability and cancellation panes.
|
|
783
|
+
|
|
784
|
+
Also completed (not in original order list):
|
|
785
|
+
- ~~**P1.6 Synthetic terminal results**~~ ✅ — `buildSyntheticTerminalEvidence()` for cancelled in-flight tasks.
|
|
786
|
+
- ~~**P2.1 Typed hook lifecycle**~~ ✅ — `before_run_start`/`before_task_start` wired into team-runner.
|
|
712
787
|
|
|
713
788
|
## Release Guidance
|
|
714
789
|
|