pi-crew 0.1.45 → 0.1.46
- package/README.md +5 -5
- package/agents/analyst.md +1 -1
- package/agents/critic.md +1 -1
- package/agents/executor.md +1 -1
- package/agents/explorer.md +1 -1
- package/agents/planner.md +1 -1
- package/agents/reviewer.md +1 -1
- package/agents/security-reviewer.md +1 -1
- package/agents/test-engineer.md +1 -1
- package/agents/verifier.md +1 -1
- package/agents/writer.md +1 -1
- package/docs/next-upgrade-roadmap.md +733 -0
- package/docs/refactor-tasks-phase3.md +394 -394
- package/docs/refactor-tasks-phase4.md +564 -564
- package/docs/refactor-tasks-phase5.md +402 -402
- package/docs/refactor-tasks-phase6.md +662 -662
- package/docs/research-awesome-agent-skills-distillation.md +100 -0
- package/docs/research-extension-examples.md +297 -297
- package/docs/research-extension-system.md +324 -324
- package/docs/research-oh-my-pi-distillation.md +322 -0
- package/docs/research-optimization-plan.md +548 -548
- package/docs/research-phase10-distillation.md +198 -198
- package/docs/research-phase11-distillation.md +201 -201
- package/docs/research-pi-coding-agent.md +357 -357
- package/docs/research-source-pi-crew-reference.md +174 -174
- package/docs/runtime-flow.md +148 -148
- package/docs/source-runtime-refactor-map.md +107 -83
- package/docs/usage.md +3 -3
- package/index.ts +6 -6
- package/install.mjs +52 -8
- package/package.json +1 -1
- package/schema.json +2 -1
- package/skills/async-worker-recovery/SKILL.md +42 -0
- package/skills/context-artifact-hygiene/SKILL.md +52 -0
- package/skills/delegation-patterns/SKILL.md +54 -0
- package/skills/mailbox-interactive/SKILL.md +40 -0
- package/skills/model-routing-context/SKILL.md +39 -0
- package/skills/multi-perspective-review/SKILL.md +58 -0
- package/skills/observability-reliability/SKILL.md +41 -0
- package/skills/ownership-session-security/SKILL.md +41 -0
- package/skills/pi-extension-lifecycle/SKILL.md +39 -0
- package/skills/requirements-to-task-packet/SKILL.md +63 -0
- package/skills/resource-discovery-config/SKILL.md +41 -0
- package/skills/runtime-state-reader/SKILL.md +44 -0
- package/skills/secure-agent-orchestration-review/SKILL.md +45 -0
- package/skills/state-mutation-locking/SKILL.md +42 -0
- package/skills/systematic-debugging/SKILL.md +67 -0
- package/skills/ui-render-performance/SKILL.md +39 -0
- package/skills/verification-before-done/SKILL.md +57 -0
- package/skills/worktree-isolation/SKILL.md +39 -0
- package/src/agents/agent-serializer.ts +34 -34
- package/src/agents/discover-agents.ts +12 -11
- package/src/config/config.ts +48 -24
- package/src/config/defaults.ts +14 -0
- package/src/extension/cross-extension-rpc.ts +82 -82
- package/src/extension/project-init.ts +62 -2
- package/src/extension/register.ts +11 -9
- package/src/extension/registration/commands.ts +32 -25
- package/src/extension/registration/compaction-guard.ts +125 -125
- package/src/extension/registration/subagent-helpers.ts +8 -0
- package/src/extension/registration/subagent-tools.ts +149 -148
- package/src/extension/registration/team-tool.ts +8 -6
- package/src/extension/run-bundle-schema.ts +89 -89
- package/src/extension/run-index.ts +13 -5
- package/src/extension/run-maintenance.ts +62 -43
- package/src/extension/team-tool/api.ts +25 -8
- package/src/extension/team-tool/cancel.ts +33 -4
- package/src/extension/team-tool/context.ts +5 -0
- package/src/extension/team-tool/handle-settings.ts +188 -188
- package/src/extension/team-tool/inspect.ts +41 -41
- package/src/extension/team-tool/lifecycle-actions.ts +91 -79
- package/src/extension/team-tool/plan.ts +19 -19
- package/src/extension/team-tool/respond.ts +37 -17
- package/src/extension/team-tool/run.ts +52 -10
- package/src/extension/team-tool/status.ts +12 -1
- package/src/extension/team-tool-types.ts +2 -0
- package/src/extension/team-tool.ts +32 -11
- package/src/i18n.ts +184 -184
- package/src/observability/event-to-metric.ts +8 -1
- package/src/observability/exporters/otlp-exporter.ts +77 -77
- package/src/prompt/prompt-runtime.ts +72 -72
- package/src/runtime/agent-control.ts +63 -63
- package/src/runtime/agent-memory.ts +72 -72
- package/src/runtime/agent-observability.ts +114 -114
- package/src/runtime/async-marker.ts +26 -26
- package/src/runtime/attention-events.ts +28 -28
- package/src/runtime/background-runner.ts +59 -53
- package/src/runtime/cancellation.ts +51 -0
- package/src/runtime/child-pi.ts +457 -444
- package/src/runtime/completion-guard.ts +190 -190
- package/src/runtime/crash-recovery.ts +1 -0
- package/src/runtime/crew-agent-records.ts +38 -6
- package/src/runtime/deadletter.ts +1 -0
- package/src/runtime/delivery-coordinator.ts +46 -25
- package/src/runtime/direct-run.ts +35 -35
- package/src/runtime/effectiveness.ts +76 -0
- package/src/runtime/foreground-control.ts +82 -82
- package/src/runtime/green-contract.ts +46 -46
- package/src/runtime/group-join.ts +106 -106
- package/src/runtime/heartbeat-gradient.ts +28 -28
- package/src/runtime/heartbeat-watcher.ts +124 -124
- package/src/runtime/live-agent-control.ts +88 -87
- package/src/runtime/live-agent-manager.ts +103 -85
- package/src/runtime/live-control-realtime.ts +36 -36
- package/src/runtime/live-session-runtime.ts +309 -305
- package/src/runtime/manifest-cache.ts +17 -2
- package/src/runtime/model-fallback.ts +6 -4
- package/src/runtime/parallel-research.ts +44 -44
- package/src/runtime/pi-args.ts +18 -3
- package/src/runtime/pi-json-output.ts +111 -111
- package/src/runtime/policy-engine.ts +79 -79
- package/src/runtime/process-status.ts +5 -1
- package/src/runtime/progress-event-coalescer.ts +43 -43
- package/src/runtime/recovery-recipes.ts +74 -74
- package/src/runtime/retry-executor.ts +81 -64
- package/src/runtime/role-permission.ts +39 -39
- package/src/runtime/runtime-resolver.ts +22 -6
- package/src/runtime/session-resources.ts +25 -25
- package/src/runtime/session-snapshot.ts +59 -59
- package/src/runtime/session-usage.ts +79 -79
- package/src/runtime/sidechain-output.ts +29 -29
- package/src/runtime/skill-instructions.ts +222 -0
- package/src/runtime/stale-reconciler.ts +4 -14
- package/src/runtime/subagent-manager.ts +3 -0
- package/src/runtime/supervisor-contact.ts +59 -59
- package/src/runtime/task-display.ts +38 -38
- package/src/runtime/task-output-context.ts +127 -127
- package/src/runtime/task-runner/capabilities.ts +78 -0
- package/src/runtime/task-runner/live-executor.ts +105 -101
- package/src/runtime/task-runner/progress.ts +119 -119
- package/src/runtime/task-runner/prompt-builder.ts +3 -1
- package/src/runtime/task-runner/prompt-pipeline.ts +64 -0
- package/src/runtime/task-runner/result-utils.ts +14 -14
- package/src/runtime/task-runner/state-helpers.ts +22 -22
- package/src/runtime/task-runner.ts +44 -5
- package/src/runtime/team-runner.ts +78 -15
- package/src/runtime/worker-heartbeat.ts +21 -21
- package/src/runtime/worker-startup.ts +57 -57
- package/src/schema/config-schema.ts +1 -0
- package/src/schema/team-tool-schema.ts +3 -3
- package/src/state/active-run-registry.ts +165 -0
- package/src/state/contracts.ts +1 -1
- package/src/state/mailbox.ts +44 -4
- package/src/state/state-store.ts +8 -1
- package/src/state/task-claims.ts +44 -44
- package/src/state/types.ts +44 -2
- package/src/state/usage.ts +29 -29
- package/src/subagents/async-entry.ts +1 -1
- package/src/subagents/index.ts +3 -3
- package/src/subagents/live/control.ts +1 -1
- package/src/subagents/live/manager.ts +1 -1
- package/src/subagents/live/realtime.ts +1 -1
- package/src/subagents/live/session-runtime.ts +1 -1
- package/src/subagents/manager.ts +1 -1
- package/src/subagents/spawn.ts +1 -1
- package/src/teams/team-config.ts +1 -0
- package/src/teams/team-serializer.ts +38 -38
- package/src/types/diff.d.ts +18 -18
- package/src/ui/crew-footer.ts +101 -101
- package/src/ui/crew-select-list.ts +111 -111
- package/src/ui/crew-widget.ts +4 -3
- package/src/ui/dashboard-panes/metrics-pane.ts +34 -34
- package/src/ui/dashboard-panes/progress-pane.ts +2 -0
- package/src/ui/dynamic-border.ts +25 -25
- package/src/ui/layout-primitives.ts +106 -106
- package/src/ui/loaders.ts +158 -158
- package/src/ui/render-diff.ts +119 -119
- package/src/ui/render-scheduler.ts +143 -143
- package/src/ui/run-snapshot-cache.ts +10 -2
- package/src/ui/snapshot-types.ts +2 -0
- package/src/ui/spinner.ts +17 -17
- package/src/ui/status-colors.ts +58 -58
- package/src/ui/syntax-highlight.ts +116 -116
- package/src/utils/atomic-write.ts +33 -33
- package/src/utils/completion-dedupe.ts +63 -63
- package/src/utils/frontmatter.ts +68 -68
- package/src/utils/git.ts +262 -262
- package/src/utils/ids.ts +12 -12
- package/src/utils/names.ts +27 -27
- package/src/utils/paths.ts +4 -2
- package/src/utils/redaction.ts +44 -44
- package/src/utils/safe-paths.ts +47 -47
- package/src/utils/sleep.ts +32 -32
- package/src/workflows/validate-workflow.ts +40 -40
- package/src/workflows/workflow-config.ts +1 -0
- package/src/worktree/branch-freshness.ts +45 -45
- package/teams/default.team.md +12 -12
- package/teams/fast-fix.team.md +11 -11
- package/teams/implementation.team.md +18 -18
- package/teams/parallel-research.team.md +14 -14
- package/teams/research.team.md +11 -11
- package/teams/review.team.md +12 -12
- package/workflows/default.workflow.md +29 -29
- package/workflows/fast-fix.workflow.md +22 -22
- package/workflows/implementation.workflow.md +38 -38
- package/workflows/parallel-research.workflow.md +46 -46
- package/workflows/research.workflow.md +22 -22
- package/workflows/review.workflow.md +30 -30
@@ -0,0 +1,733 @@
+# pi-crew Next Upgrade Roadmap
+
+Date: 2026-05-05
+Source inputs:
+
+- `docs/research-oh-my-pi-distillation.md`
+- `docs/source-runtime-refactor-map.md`
+- Recent runtime hardening commits through `f5d47aa feat: surface run effectiveness evidence`
+
+This document tracks the next practical upgrades after the current scaffold/no-op subagent fix, runtime safety classification, cancellation provenance, intent audit trail, prompt pipeline artifacts, capability inventory artifacts, and run effectiveness reporting.
+
+## Current Baseline
+
+Already implemented and pushed:
+
+- Real child worker execution is the default.
+- Implicit scaffold/no-op runs are blocked when worker execution is disabled by config/env.
+- Explicit `runtime.mode=scaffold` remains available for dry-run prompt/artifact generation.
+- Run `summary.md`, `progress.md`, and `status` now expose effectiveness evidence.
+- Structured cancellation reasons flow through retry/cancel/team-runner/run events/metrics/UI snapshot.
+- `cancel`, `cleanup`, `forget`, and `prune` accept audit intent metadata.
+- Live-agent control distinguishes `steer` from `follow-up` at live-control/API level.
+- Retry attempts have `attemptId`; max-retry deadletters link to the final `attemptId`.
+- Worker prompt pipeline and capability inventory metadata artifacts are written per task.
+
+## Priority Legend
+
+- **P0**: correctness/safety issue; should be addressed before next release if feasible.
+- **P1**: high user-visible value or reliability gain; good patch-release candidates.
+- **P2**: larger subsystem work; should be planned and sequenced.
+- **P3**: polish/UX/longer-term architecture.
+
+## P0 — Prevent Ineffective Completed Runs
+
+### P0.1 Enforce effectiveness policy for non-scaffold workers
+
+**Problem**
+
+`summary/status` now surface effectiveness evidence, but non-scaffold `child-process`/`live-session` runs can still end `completed` when task evidence is weak unless the existing mutation guard fires.
+
+**Target behavior**
+
+- For real workers, a run with completed tasks but no observable worker activity should be `blocked` or `failed`, not silently `completed`.
+- Keep explicit scaffold dry-runs allowed, but label them as dry-runs.
+- Policy should be configurable:
+  - `runtime.effectivenessGuard = "off" | "warn" | "block" | "fail"`
+  - default candidate: `warn` for read-only roles, `block` for mutating roles.
+
+**Suggested files**
+
+- `src/runtime/team-runner.ts`
+- `src/runtime/completion-guard.ts`
+- `src/state/types.ts` if storing guard result on manifest/tasks
+- `src/schema/config-schema.ts`
+- `src/config/config.ts`
+- `test/unit/summary.test.ts`
+- `test/unit/team-runner-merge.test.ts` or new `test/unit/effectiveness-guard.test.ts`
+
+**Implementation sketch**
+
+1. Extract run effectiveness calculation into a reusable exported helper, e.g.:
+
+```ts
+export interface RunEffectivenessSummary {
+  completed: number;
+  observable: number;
+  noObservedWorkTaskIds: string[];
+  needsAttentionTaskIds: string[];
+  workerExecution: "enabled" | "disabled/scaffold";
+  severity: "ok" | "warning" | "blocked" | "failed";
+}
+```
+
+2. Use this helper for:
+   - `progress.md`
+   - `summary.md`
+   - `status`
+   - policy enforcement before `run.completed`.
+
+3. For non-scaffold runs, if mutating tasks have no mutation/tool/model/transcript evidence:
+   - append `policy.action` with `reason: "ineffective_worker"`;
+   - set run `blocked` or `failed` depending config;
+   - include task IDs in `data`.
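
The three steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not pi-crew's implementation: `TaskEvidence`, `evaluateEffectiveness`, the direct mapping from guard value to severity, and the rule that every non-completed task "needs attention" are all hypothetical. Only the `RunEffectivenessSummary` fields and the four guard values come from the roadmap text.

```typescript
// Hypothetical sketch: everything except the RunEffectivenessSummary fields
// and the guard values is an assumption made for illustration.
type EffectivenessGuard = "off" | "warn" | "block" | "fail";

interface TaskEvidence {
  id: string;
  status: "completed" | "failed" | "waiting";
  // True when any mutation/tool/model/transcript evidence was recorded.
  hasObservedWork: boolean;
}

interface RunEffectivenessSummary {
  completed: number;
  observable: number;
  noObservedWorkTaskIds: string[];
  needsAttentionTaskIds: string[];
  workerExecution: "enabled" | "disabled/scaffold";
  severity: "ok" | "warning" | "blocked" | "failed";
}

function evaluateEffectiveness(
  tasks: TaskEvidence[],
  scaffold: boolean,
  guard: EffectivenessGuard,
): RunEffectivenessSummary {
  const completed = tasks.filter((t) => t.status === "completed");
  const silent = completed.filter((t) => !t.hasObservedWork).map((t) => t.id);
  // Assumption: any task that did not complete needs attention.
  const needsAttention = tasks.filter((t) => t.status !== "completed").map((t) => t.id);

  // Explicit scaffold dry-runs stay allowed; the guard only applies to real workers.
  let severity: RunEffectivenessSummary["severity"] = "ok";
  if (!scaffold && silent.length > 0 && guard !== "off") {
    severity = guard === "warn" ? "warning" : guard === "block" ? "blocked" : "failed";
  }

  return {
    completed: completed.length,
    observable: completed.length - silent.length,
    noObservedWorkTaskIds: silent,
    needsAttentionTaskIds: needsAttention,
    workerExecution: scaffold ? "disabled/scaffold" : "enabled",
    severity,
  };
}
```

A caller would then refuse to emit `run.completed` whenever `severity` is `blocked` or `failed`, and include `noObservedWorkTaskIds` in the event data.
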
+
+**Acceptance criteria**
+
+- A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
+- Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
+- `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
+- Unit tests cover warn/block/fail modes.
+
+**Verification**
+
+```bash
+npx tsc --noEmit
+node --experimental-strip-types --test --test-concurrency=1 --test-timeout=30000 test/unit/effectiveness-guard.test.ts test/unit/summary.test.ts
+npm run test:unit
+```
+
+### P0.2 Make runtime safety visible in manifest and run events
+
+**Problem**
+
+`runtime.safety` exists in runtime resolution, but it is not persisted as first-class run metadata. Debugging currently requires reading events or inferred artifacts.
+
+**Target behavior**
+
+- Manifest records resolved runtime:
+
+```json
+{
+  "runtimeResolution": {
+    "kind": "child-process",
+    "requestedMode": "auto",
+    "safety": "trusted",
+    "fallback": "child-process",
+    "reason": "..."
+  }
+}
+```
+
+- `run.running` or `run.blocked` event includes the same resolution.
+
+**Suggested files**
+
+- `src/state/types.ts`
+- `src/extension/team-tool/run.ts`
+- `src/runtime/background-runner.ts`
+- `src/extension/team-tool/status.ts`
+- `test/unit/team-run.test.ts`
+- `test/unit/runtime-resolver.test.ts`
+
+**Acceptance criteria**
+
+- `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
+- Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
+- Existing manifest schema remains backward compatible.
+
+## P1 — Steering/Follow-up Semantics Beyond Live Control
+
+### P1.1 Persist separate steering and follow-up queues in mailbox state
+
+**Current state**
+
+`follow-up-agent` exists in live-control, but durable mailbox is still generic inbox/outbox and `respond` still has waiting-task semantics.
+
+**Target behavior**
+
+- Mailbox messages can carry semantic kind:
+
+```ts
+kind?: "message" | "steer" | "follow-up" | "response" | "group_join";
+priority?: "urgent" | "normal" | "low";
+deliveryMode?: "interrupt" | "next_turn";
+```
+
+- `steer-agent` appends durable steering queue entry when no live session is present.
+- `follow-up-agent` appends durable follow-up queue entry, deliverable after task stop/resume.
+- UI/status separates urgent steering from follow-up backlog.
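
One way the queue separation above might look in code is sketched below. It assumes mailbox entries are JSONL objects with an `id` and `body` (both hypothetical field names), that legacy lines without `kind` should read as plain messages, and that replay can be deduplicated by `id`. Only `kind`, `priority`, and `deliveryMode` are taken from the roadmap.

```typescript
// Hypothetical sketch: `id`, `body`, and both helper functions are assumptions.
interface MailboxMessageSketch {
  id: string;
  body: string;
  kind?: "message" | "steer" | "follow-up" | "response" | "group_join";
  priority?: "urgent" | "normal" | "low";
  deliveryMode?: "interrupt" | "next_turn";
}

interface MailboxQueues {
  steering: MailboxMessageSketch[];
  followUp: MailboxMessageSketch[];
  other: MailboxMessageSketch[];
}

// Parse JSONL, treating legacy lines without `kind` as plain messages so
// existing inbox/outbox files stay readable.
function readMailboxJsonl(jsonl: string): MailboxMessageSketch[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => {
      const raw = JSON.parse(line) as MailboxMessageSketch;
      return { ...raw, kind: raw.kind ?? "message" };
    });
}

// Split messages into steering, follow-up, and everything else.
// Repeated ids are skipped so durable replay does not duplicate live delivery.
function partitionMailbox(messages: MailboxMessageSketch[]): MailboxQueues {
  const queues: MailboxQueues = { steering: [], followUp: [], other: [] };
  const seen = new Set<string>();
  for (const msg of messages) {
    if (seen.has(msg.id)) continue;
    seen.add(msg.id);
    if (msg.kind === "steer") queues.steering.push(msg);
    else if (msg.kind === "follow-up") queues.followUp.push(msg);
    else queues.other.push(msg);
  }
  return queues;
}
```
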
+
+**Suggested files**
+
+- `src/state/mailbox.ts`
+- `src/runtime/live-agent-control.ts`
+- `src/runtime/live-agent-manager.ts`
+- `src/extension/team-tool/api.ts`
+- `src/extension/team-tool/respond.ts`
+- `src/ui/dashboard-panes/mailbox-pane.ts`
+- `test/unit/mailbox-api.test.ts`
+- `test/unit/live-agent-control.test.ts`
+- `test/unit/respond-tool.test.ts`
+
+**Acceptance criteria**
+
+- Steering and follow-up can be inspected separately.
+- Existing inbox/outbox JSONL remains readable.
+- Durable queue survives process/session switch.
+- Realtime live delivery dedupes against durable replay.
+
+### P1.2 Clarify `respond` vs `follow-up` UX
+
+**Problem**
+
+`respond` is currently a waiting-task resume primitive. Users may expect it to send a general follow-up.
+
+**Target behavior**
+
+- `/team-respond` remains only for `waiting` tasks.
+- `/team-follow-up` or `api operation=follow-up-agent` is documented as continuation prompt.
+- Error messages recommend the correct command.
+
+**Suggested files**
+
+- `src/extension/registration/commands.ts`
+- `src/extension/help.ts`
+- `docs/usage.md`
+- `test/unit/registration-commands-coverage.test.ts`
+- `test/unit/respond-tool.test.ts`
+
+## P1 — Worker Lifecycle and Process Reliability
+
+### P1.3 Two-phase child process teardown
+
+**Current state**
+
+Child workers have improved post-exit stdio guards and bounded drains, but cancellation semantics can be made more deterministic.
+
+**Target behavior**
+
+Worker process cancellation returns structured status:
+
+```ts
+interface WorkerExitStatus {
+  exitCode: number | null;
+  cancelled: boolean;
+  timedOut: boolean;
+  killed: boolean;
+  signal?: string;
+  cleanupErrors: string[];
+  finalDrainMs: number;
+}
+```
+
+Process lifecycle:
+
+1. graceful cancel/TERM;
+2. wait grace window;
+3. hard kill process tree;
+4. bounded stdout/stderr drain;
+5. mark session non-reusable.
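
The five-step lifecycle above could be sketched against a small abstraction, so the ordering can be exercised with a fake worker. `KillableWorker`, `teardownWorker`, and the `cause` option are hypothetical names introduced for this example. The sketch signals a single worker rather than a whole process tree, and it leaves step 5 to the caller.

```typescript
interface WorkerExitStatus {
  exitCode: number | null;
  cancelled: boolean;
  timedOut: boolean;
  killed: boolean;
  signal?: string;
  cleanupErrors: string[];
  finalDrainMs: number;
}

// Hypothetical abstraction over a child process so the sequence is testable.
interface KillableWorker {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
  // Resolves with exit info, or null if the worker is still alive after timeoutMs.
  waitForExit(timeoutMs: number): Promise<{ exitCode: number | null; signal?: string } | null>;
  // Drains stdout/stderr for at most timeoutMs and resolves with the time spent.
  drainOutput(timeoutMs: number): Promise<number>;
}

async function teardownWorker(
  worker: KillableWorker,
  opts: { graceMs: number; drainMs: number; cause: "cancel" | "timeout" },
): Promise<WorkerExitStatus> {
  const cleanupErrors: string[] = [];
  let killed = false;

  worker.kill("SIGTERM"); // 1. graceful cancel/TERM
  let exit = await worker.waitForExit(opts.graceMs); // 2. wait grace window
  if (exit === null) {
    killed = true;
    worker.kill("SIGKILL"); // 3. hard kill (single process only in this sketch)
    exit = await worker.waitForExit(opts.graceMs);
    if (exit === null) cleanupErrors.push("worker did not exit after SIGKILL");
  }

  let finalDrainMs = 0;
  try {
    // 4. bounded drain: never report more than the configured budget
    finalDrainMs = Math.min(await worker.drainOutput(opts.drainMs), opts.drainMs);
  } catch (err) {
    cleanupErrors.push(`drain failed: ${String(err)}`);
  }

  // 5. The caller is expected to mark the session non-reusable after this returns.
  const status: WorkerExitStatus = {
    exitCode: exit?.exitCode ?? null,
    cancelled: opts.cause === "cancel",
    timedOut: opts.cause === "timeout",
    killed,
    cleanupErrors,
    finalDrainMs,
  };
  if (exit?.signal !== undefined) status.signal = exit.signal;
  return status;
}
```

Collecting failures in `cleanupErrors` rather than throwing keeps the terminal status available even when teardown is only partially successful.
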
+
+**Suggested files**
+
+- `src/runtime/child-pi.ts`
+- `src/runtime/pi-spawn.ts`
+- `src/runtime/post-exit-stdio-guard.ts`
+- `src/runtime/task-runner.ts`
+- `src/runtime/cancellation.ts`
+- `test/unit/child-pi*.test.ts`
+- `test/integration/mock-child-run.test.ts`
+
+**Acceptance criteria**
+
+- Cancelled worker always produces terminal task event.
+- Output drains are bounded.
+- Status includes `cancelled/timedOut/killed`.
+- No zombie/stale running task after cancellation.
+
+### P1.4 Reserve worker control channel before spawn
+
+**Problem**
+
+There can be a short window where a task is logically starting but cancel/steer cannot target a controller yet.
+
+**Target behavior**
+
+- Synchronously create a `WorkerRunCore`/controller before async spawn.
+- Persist controller metadata in agent status.
+- Cancel/steer requests can be queued immediately while startup is in progress.
+- Controller is cleared in `finally`.
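
A rough sketch of the reservation pattern described above: the controller is registered before the first `await`, requests sent during startup are queued, and the registry entry is removed in `finally`. `WorkerControllerSketch`, `controllers`, `SpawnedWorker`, and `runTaskWithController` are illustrative names, not the `WorkerRunCore` API, and persistence of controller metadata is omitted.

```typescript
// Hypothetical sketch of a control channel reserved before spawn.
type ControlRequest = { kind: "cancel" | "steer"; payload?: string };

class WorkerControllerSketch {
  private queue: ControlRequest[] = [];
  private deliver: ((req: ControlRequest) => void) | undefined;

  // Requests that arrive before the child is attached are queued, not dropped.
  send(req: ControlRequest): void {
    if (this.deliver) this.deliver(req);
    else this.queue.push(req);
  }

  // Called once the child exists; flushes everything queued during startup.
  attach(deliver: (req: ControlRequest) => void): void {
    this.deliver = deliver;
    for (const req of this.queue.splice(0)) deliver(req);
  }

  pendingCount(): number {
    return this.queue.length;
  }
}

interface SpawnedWorker {
  deliver: (req: ControlRequest) => void;
  done: Promise<void>;
}

const controllers = new Map<string, WorkerControllerSketch>();

async function runTaskWithController(
  taskId: string,
  spawn: () => Promise<SpawnedWorker>,
): Promise<void> {
  const controller = new WorkerControllerSketch();
  controllers.set(taskId, controller); // reserved synchronously, before any await
  try {
    const worker = await spawn();
    controller.attach(worker.deliver);
    await worker.done;
  } finally {
    controllers.delete(taskId); // cleared in finally, even if spawn fails
  }
}
```

Because an `async` function runs synchronously up to its first `await`, a cancel issued immediately after `runTaskWithController(...)` is called already finds the controller in the registry.
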
+
+**Suggested files**
+
+- `src/runtime/task-runner.ts`
+- `src/runtime/agent-control.ts`
+- `src/runtime/live-agent-control.ts`
+- `src/runtime/crew-agent-records.ts`
+- `src/extension/team-tool/api.ts`
+
+**Acceptance criteria**
+
+- Starting worker can be cancelled immediately.
+- Durable control request written during startup is applied or recorded as terminal no-op with reason.
+- Tests simulate control request before child process emits first output.
+
+## P1 — Cancellation and Attempt History
+
+### P1.5 Add event-tree provenance: `parentEventId`, `attemptId`, `branchId`
+
+**Current state**
+
+Retry attempts have `attemptId`, and deadletters link to final attempt. Event log has sequence and terminal fingerprints but no general event tree.
+
+**Target behavior**
+
+- `TeamEvent.metadata` supports:
+
+```ts
+parentEventId?: string;
+attemptId?: string;
+branchId?: string;
+causationId?: string;
+correlationId?: string;
+```
+
+- Retry events, task started/completed/failed, deadletter, recovery events link by `attemptId`.
+- UI/status can show attempt timeline.
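
To illustrate how these provenance fields might be consumed, the sketch below groups events into per-attempt timelines and walks `parentEventId` links back to a root. The `TeamEventSketch` shape (`id`, `seq`, `type`) and both helper functions are assumptions for the example; only the five metadata field names come from the roadmap. Events without provenance are skipped, which is one way older JSONL could stay readable.

```typescript
// Hypothetical sketch: event shape and helper names are assumptions.
interface EventProvenance {
  parentEventId?: string;
  attemptId?: string;
  branchId?: string;
  causationId?: string;
  correlationId?: string;
}

interface TeamEventSketch {
  id: string;
  seq: number;
  type: string;
  metadata?: EventProvenance;
}

// Group events by attemptId in sequence order; legacy events are ignored.
function buildAttemptTimeline(events: TeamEventSketch[]): Map<string, TeamEventSketch[]> {
  const timeline = new Map<string, TeamEventSketch[]>();
  for (const event of [...events].sort((a, b) => a.seq - b.seq)) {
    const attemptId = event.metadata?.attemptId;
    if (attemptId === undefined) continue;
    const bucket = timeline.get(attemptId) ?? [];
    bucket.push(event);
    timeline.set(attemptId, bucket);
  }
  return timeline;
}

// Follow parentEventId links from an event (e.g. a deadletter) to its root.
// The visited set guards against accidental cycles in malformed logs.
function traceToRoot(events: TeamEventSketch[], startId: string): string[] {
  const byId = new Map<string, TeamEventSketch>();
  for (const e of events) byId.set(e.id, e);
  const path: string[] = [];
  const visited = new Set<string>();
  let current = byId.get(startId);
  while (current !== undefined && !visited.has(current.id)) {
    visited.add(current.id);
    path.push(current.id);
    const parentId = current.metadata?.parentEventId;
    current = parentId === undefined ? undefined : byId.get(parentId);
  }
  return path;
}
```
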
+
+**Suggested files**
+
+- `src/state/event-log.ts`
+- `src/state/types.ts`
+- `src/runtime/team-runner.ts`
+- `src/runtime/retry-executor.ts`
+- `src/runtime/recovery-recipes.ts`
+- `src/extension/team-tool/status.ts`
+- `test/unit/event-metadata.test.ts`
+- `test/unit/retry-executor.test.ts`
+
+**Acceptance criteria**
+
+- Retry attempt events and terminal task events share attempt provenance.
+- Deadletter records can be traced back to event sequence.
+- Existing JSONL readers ignore missing provenance fields.
+
+### P1.6 Synthetic terminal results for cancelled in-flight operations
+
+**Problem**
+
+Run/task cancellation events are now structured, but worker/tool sub-operations can still lack synthetic terminal records if cancelled mid-operation.
+
+**Target behavior**
+
+- If a task started a worker/tool/model call and cancellation occurs, append a synthetic terminal record:
+  - `tool.cancelled` or `worker.cancelled`
+  - reason code/message
+  - startedAt/finishedAt
+  - attemptId if available
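
As an illustration of the target behavior, the sketch below closes every started operation that has no terminal record at cancellation time. The type names, the `opId` field, and the reason codes `user_cancel`/`timeout`/`shutdown` are hypothetical. The record fields mirror the bullet list above.

```typescript
// Hypothetical sketch: type names, opId, and reason codes are assumptions.
interface OperationStart {
  opId: string;
  kind: "tool" | "worker";
  startedAt: string;
  attemptId?: string;
}

interface CancellationReasonSketch {
  code: "user_cancel" | "timeout" | "shutdown";
  message: string;
}

interface SyntheticTerminalRecord {
  type: "tool.cancelled" | "worker.cancelled";
  opId: string;
  reason: CancellationReasonSketch;
  startedAt: string;
  finishedAt: string;
  attemptId?: string;
}

// Emit one synthetic terminal record per operation that started but never finished.
function synthesizeCancelledRecords(
  started: OperationStart[],
  finishedOpIds: Set<string>,
  reason: CancellationReasonSketch,
  now: string,
): SyntheticTerminalRecord[] {
  return started
    .filter((op) => !finishedOpIds.has(op.opId))
    .map((op) => {
      const record: SyntheticTerminalRecord = {
        type: op.kind === "tool" ? "tool.cancelled" : "worker.cancelled",
        opId: op.opId,
        reason,
        startedAt: op.startedAt,
        finishedAt: now,
      };
      if (op.attemptId !== undefined) record.attemptId = op.attemptId;
      return record;
    });
}
```
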
+
+**Suggested files**
+
+- `src/runtime/task-runner.ts`
+- `src/runtime/task-runner/progress.ts`
+- `src/runtime/child-pi.ts`
+- `src/runtime/cancellation.ts`
+- `src/state/contracts.ts`
+- `test/unit/cancellation.test.ts`
+
+**Acceptance criteria**
+
+- No started tool/model operation is left without terminal evidence after cancellation.
+- Status/diagnostics can distinguish user cancel vs timeout vs shutdown.
+
+## P1 — Capability Inventory and Control Center
+
+### P1.7 Build run/project capability inventory view
+
+**Current state**
+
+Per-task capability artifacts exist. There is no unified project/run inventory UI/API yet.
+
+**Target behavior**
+
+`/team-settings` or new `/team-control` shows normalized inventory:
+
+```ts
+interface CapabilityItem {
+  id: string;
+  kind: "team" | "workflow" | "agent" | "skill" | "tool" | "hook" | "runtime" | "provider";
+  name: string;
+  source: "builtin" | "project" | "user" | "runtime";
+  path?: string;
+  state: "active" | "disabled" | "shadowed" | "missing";
+  disabledReason?: string;
+  shadowedBy?: string;
+}
+```
+
+**Suggested files**
+
+- `src/extension/team-tool/handle-settings.ts`
+- `src/extension/management.ts`
+- `src/agents/discover-agents.ts`
+- `src/teams/discover-teams.ts`
+- `src/workflows/discover-workflows.ts`
+- `src/runtime/skill-instructions.ts`
+- `docs/resource-formats.md`
+- `test/unit/management.test.ts`
+
+**Acceptance criteria**
+
+- Inventory is stable and sorted.
+- Shadowed project/user/builtin resources are visible.
+- Skill disabled/budget state is visible.
+- No file path is used as the only stable ID.
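
The sorting and shadowing criteria above could be met along the following lines. This sketch makes two assumptions the roadmap does not state: that IDs take the form `kind:source:name`, and that precedence runs runtime, project, user, builtin. `DiscoveredResource` and `buildInventory` are hypothetical names.

```typescript
interface CapabilityItem {
  id: string;
  kind: "team" | "workflow" | "agent" | "skill" | "tool" | "hook" | "runtime" | "provider";
  name: string;
  source: "builtin" | "project" | "user" | "runtime";
  path?: string;
  state: "active" | "disabled" | "shadowed" | "missing";
  disabledReason?: string;
  shadowedBy?: string;
}

// Hypothetical discovery result fed into the inventory builder.
interface DiscoveredResource {
  kind: CapabilityItem["kind"];
  name: string;
  source: CapabilityItem["source"];
  path?: string;
}

// Assumed precedence (earlier wins); the roadmap does not specify the order.
const SOURCE_PRECEDENCE: CapabilityItem["source"][] = ["runtime", "project", "user", "builtin"];

function buildInventory(resources: DiscoveredResource[]): CapabilityItem[] {
  const items = resources.map((r) => {
    // The ID is derived from kind/source/name, so it survives path relocation.
    const item: CapabilityItem = {
      id: `${r.kind}:${r.source}:${r.name}`,
      kind: r.kind,
      name: r.name,
      source: r.source,
      state: "active",
    };
    if (r.path !== undefined) item.path = r.path;
    return item;
  });

  // The highest-precedence item per kind+name wins; the others are marked shadowed.
  const winners = new Map<string, CapabilityItem>();
  const byPrecedence = [...items].sort(
    (a, b) => SOURCE_PRECEDENCE.indexOf(a.source) - SOURCE_PRECEDENCE.indexOf(b.source),
  );
  for (const item of byPrecedence) {
    const key = `${item.kind}:${item.name}`;
    const winner = winners.get(key);
    if (winner === undefined) {
      winners.set(key, item);
    } else {
      item.state = "shadowed";
      item.shadowedBy = winner.id;
    }
  }

  // Plain code-unit comparison keeps the order stable across locales.
  return items.sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```
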
+
+### P1.8 Persist capability disables by stable ID
+
+**Target behavior**
+
+- Operator can disable a skill/agent/team by capability ID.
+- Disable config survives path relocation when resource identity remains stable.
+- Status explains disabled reason.
+
+**Suggested files**
+
+- `src/config/config.ts`
+- `src/schema/config-schema.ts`
+- discovery modules
+- `test/unit/config-schema-validation.test.ts`
+
+## P2 — Typed Hook Lifecycle
+
+### P2.1 Introduce typed hook contract
+
+**Target behavior**
+
+Define typed lifecycle gates:
+
+- `before_run_start`
+- `before_task_start`
+- `task_result`
+- `before_cancel`
+- `before_forget`
+- `before_cleanup`
+- `before_publish`
+- `session_before_switch`
+- `run_recovery`
+
+Each hook declares:
+
+```ts
+type HookMode = "blocking" | "non_blocking";
+type HookOutcome = "allow" | "block" | "modify" | "diagnostic";
+```
+
+Errors are recorded in diagnostics/events, not uncontrolled exceptions.
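
A minimal sketch of how such a hook contract might be enforced is shown below. `HookSketch`, `HookRunResult`, and `runHooksSketch` are hypothetical, and the hooks are synchronous for brevity, whereas a real implementation would likely be async. Two choices here are assumptions, not roadmap statements: a blocking hook that throws fails closed, and the `modify` outcome is ignored.

```typescript
type HookMode = "blocking" | "non_blocking";
type HookOutcome = "allow" | "block" | "modify" | "diagnostic";

// Hypothetical hook shape; the roadmap only defines HookMode and HookOutcome.
interface HookSketch {
  name: string;
  gate: string;
  mode: HookMode;
  run: (ctx: Readonly<Record<string, unknown>>) => HookOutcome;
}

interface HookRunResult {
  allowed: boolean;
  blockedBy?: string;
  diagnostics: string[];
}

function runHooksSketch(
  hooks: HookSketch[],
  gate: string,
  ctx: Readonly<Record<string, unknown>>,
): HookRunResult {
  const result: HookRunResult = { allowed: true, diagnostics: [] };
  for (const hook of hooks) {
    if (hook.gate !== gate) continue;
    let outcome: HookOutcome;
    try {
      outcome = hook.run(ctx);
    } catch (err) {
      // Errors become diagnostics instead of escaping to the caller.
      result.diagnostics.push(`${hook.name} threw: ${String(err)}`);
      // Assumption: a failing blocking hook fails closed.
      outcome = hook.mode === "blocking" ? "block" : "diagnostic";
    }
    // Only blocking hooks may stop the gate; "modify" is not handled in this sketch.
    if (outcome === "block" && hook.mode === "blocking") {
      result.allowed = false;
      result.blockedBy = hook.name;
      break;
    }
  }
  return result;
}
```
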
+
+**Suggested files**
+
+- new `src/hooks/*`
+- `src/extension/register.ts`
+- `src/runtime/team-runner.ts`
+- `src/extension/team-tool/cancel.ts`
+- `src/extension/team-tool/lifecycle-actions.ts`
+- `docs/resource-formats.md`
+- `test/unit/hooks*.test.ts`
+
+**Acceptance criteria**
+
+- Blocking hook can stop a run before worker start with clear event and status.
+- Non-blocking hook failure records diagnostic and does not crash run.
+- Hook context is redacted and bounded.
+
+### P2.2 Require intent via policy/hook for destructive actions
+
+**Current state**
+
+Intent is optional for cancel/cleanup/forget/prune.
+
+**Target behavior**
+
+- Optional config:
+
+```json
+{
+  "policy": {
+    "requireIntentForDestructiveActions": true
+  }
+}
+```
+
+- Actions requiring intent:
+  - cancel
+  - forget
+  - prune
+  - cleanup with force
+  - publish/release helpers if added
+  - worktree removal
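
The intent requirement above might be checked with a small pure function like the one below. `ActionRequest`, `PolicyConfigSketch`, `checkIntent`, and the `worktree-remove` action name are hypothetical. Only the `requireIntentForDestructiveActions` flag and the action list come from the roadmap, and publish/release helpers are left out because they do not exist yet.

```typescript
// Hypothetical sketch: type names and the action identifiers are assumptions.
type DestructiveAction = "cancel" | "forget" | "prune" | "cleanup" | "worktree-remove";

interface ActionRequest {
  action: DestructiveAction;
  force?: boolean;
  intent?: string;
}

interface PolicyConfigSketch {
  requireIntentForDestructiveActions?: boolean;
}

type IntentCheck = { ok: true } | { ok: false; error: string };

function checkIntent(req: ActionRequest, policy: PolicyConfigSketch): IntentCheck {
  // The policy is opt-in; without it, intent stays optional as it is today.
  if (!policy.requireIntentForDestructiveActions) return { ok: true };

  // Per the list above, cleanup only requires intent when it is forced.
  const needsIntent = req.action !== "cleanup" || req.force === true;
  if (!needsIntent) return { ok: true };

  if (req.intent !== undefined && req.intent.trim().length > 0) return { ok: true };
  return {
    ok: false,
    error: `Action "${req.action}" requires an intent describing why it is being run.`,
  };
}
```
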
+
+**Acceptance criteria**
+
+- Missing intent blocks action with actionable error.
+- Existing tests can opt out or provide intent.
+- Audit trail includes intent after approval.
+
+## P2 — Durable History vs Prompt Projection
+
+### P2.3 Separate durable run history projection from worker prompt text
+
+**Current state**
+
+Prompt pipeline artifacts exist, but context projection logic is still coupled to prompt construction in multiple places.
+
+**Target behavior**
+
+Introduce explicit projection functions:
+
+```ts
+transformRunContextBeforeWorkerStart(...)
+convertRunHistoryToWorkerPrompt(...)
+```
+
+Rules:
+
+- Durable history retains events, mailbox, artifacts, UI/runtime metadata.
+- Worker prompt gets a bounded projection.
+- UI/runtime events are not prompt text unless explicitly selected.
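
The three rules above can be illustrated with a small bounded projection. The roadmap elides the signatures of its two projection functions, so this sketch uses a separate hypothetical helper, `projectHistorySketch`, with an assumed `HistoryEntrySketch` shape and a simple character budget.

```typescript
// Hypothetical sketch: entry shape, budget model, and names are assumptions.
interface HistoryEntrySketch {
  source: "event" | "mailbox" | "artifact" | "ui" | "runtime";
  text: string;
}

interface PromptProjection {
  text: string;
  sources: string[]; // which history sources contributed to the prompt
  omitted: number; // entries left out because of the budget
}

function projectHistorySketch(
  history: HistoryEntrySketch[],
  maxChars: number,
  includeUiRuntime = false,
): PromptProjection {
  const lines: string[] = [];
  const sources = new Set<string>();
  let used = 0;
  let omitted = 0;

  for (const entry of history) {
    const isUiRuntime = entry.source === "ui" || entry.source === "runtime";
    // UI/runtime metadata stays in durable history unless explicitly selected.
    if (isUiRuntime && !includeUiRuntime) continue;
    if (used + entry.text.length > maxChars) {
      omitted += 1; // referenced in a marker below instead of being embedded
      continue;
    }
    lines.push(entry.text);
    sources.add(entry.source);
    used += entry.text.length;
  }

  if (omitted > 0) lines.push(`[${omitted} entries omitted; see durable history]`);
  return { text: lines.join("\n"), sources: Array.from(sources).sort(), omitted };
}
```

Returning `sources` alongside the text is what would let a prompt pipeline artifact name every projection source.
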
+
+**Suggested files**
+
+- `src/runtime/task-runner/prompt-pipeline.ts`
+- `src/runtime/task-runner/prompt-builder.ts`
+- `src/runtime/task-output-context.ts`
+- `src/runtime/task-runner.ts`
+- `test/unit/task-runner-prompt-pipeline.test.ts`
+
+**Acceptance criteria**
+
+- Prompt pipeline artifact identifies every projection source.
+- Large event/mailbox history is summarized or referenced, not blindly embedded.
+- Tests verify UI/runtime events are not injected as instructions.
+
+## P2 — Cooperative Cancellation for Internal Scans
+
+### P2.4 Add internal `CancellationToken`
+
+**Target behavior**
+
+A utility for long internal loops:
+
+```ts
+interface CancellationToken {
+  readonly aborted: boolean;
+  readonly reason?: CancellationReason;
+  heartbeat(stage?: string): void;
+  throwIfCancelled(): void;
+  wait(ms: number): Promise<void>;
+}
+```
+
+Use it in:
+
+- run index scans
+- artifact cleanup
+- mailbox validation/replay
+- worktree cleanup
+- diagnostic export
+- large transcript/event reads
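
One possible shape for this token is sketched below, together with a loop that treats the token as optional so existing callers keep their current behavior. The `CancellationReason` shape shown here is a placeholder, since the roadmap does not define it. `createCancellationSketch` and `scanRunIndexSketch` are hypothetical names, and `wait` is simplified so that it only checks for cancellation when it is called.

```typescript
// Placeholder shape: the real CancellationReason is not shown in the roadmap.
interface CancellationReason {
  code: string;
  message?: string;
}

interface CancellationToken {
  readonly aborted: boolean;
  readonly reason?: CancellationReason;
  heartbeat(stage?: string): void;
  throwIfCancelled(): void;
  wait(ms: number): Promise<void>;
}

// Hypothetical factory returning the token plus the controls that drive it.
function createCancellationSketch(): {
  token: CancellationToken;
  cancel: (reason: CancellationReason) => void;
  stages: string[];
} {
  let reason: CancellationReason | undefined;
  const stages: string[] = [];
  const token: CancellationToken = {
    get aborted() {
      return reason !== undefined;
    },
    get reason() {
      return reason;
    },
    heartbeat(stage) {
      if (stage !== undefined) stages.push(stage); // surfaced in diagnostics/logs
    },
    throwIfCancelled() {
      if (reason !== undefined) throw new Error(`cancelled: ${reason.code}`);
    },
    wait(ms) {
      // Simplification: cancellation is only checked when the wait begins.
      return new Promise<void>((resolve, reject) => {
        if (reason !== undefined) reject(new Error(`cancelled: ${reason.code}`));
        else setTimeout(resolve, ms);
      });
    },
  };
  return { token, cancel: (r) => { reason = r; }, stages };
}

// Example loop: without a token it behaves exactly as an uncancellable scan.
function scanRunIndexSketch(runIds: string[], token?: CancellationToken): string[] {
  const seen: string[] = [];
  for (const id of runIds) {
    token?.heartbeat(`scan:${id}`);
    token?.throwIfCancelled();
    seen.push(id);
  }
  return seen;
}
```
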
|
|
540
|
+
|
|
541
|
+
**Suggested files**
|
|
542
|
+
|
|
543
|
+
- new `src/runtime/cancellation-token.ts`
|
|
544
|
+
- `src/extension/run-index.ts`
|
|
545
|
+
- `src/extension/registration/artifact-cleanup.ts`
|
|
546
|
+
- `src/state/mailbox.ts`
|
|
547
|
+
- `src/ui/run-snapshot-cache.ts`
|
|
548
|
+
- `test/unit/cancellation-token.test.ts`
|
|
549
|
+
|
|
550
|
+
**Acceptance criteria**
|
|
551
|
+
|
|
552
|
+
- Long scan can abort within bounded cadence.
|
|
553
|
+
- Heartbeat stage appears in diagnostics/logs.
|
|
554
|
+
- Existing APIs can pass no token and keep current behavior.
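
A minimal sketch of how such a token might be implemented and threaded through a scan loop follows. It assumes `CancellationReason` is a plain string (this section does not define it), and `createCancellationToken`, `CancelledError`, and `scanEntries` are hypothetical names introduced only for the example.

```typescript
// Assumption: CancellationReason is a plain string; the roadmap does not define it here.
type CancellationReason = string;

interface CancellationToken {
  readonly aborted: boolean;
  readonly reason?: CancellationReason;
  heartbeat(stage?: string): void;
  throwIfCancelled(): void;
  wait(ms: number): Promise<void>;
}

class CancelledError extends Error {
  constructor(reason?: CancellationReason) {
    super(`cancelled: ${reason ?? "unknown"}`);
    this.name = "CancelledError";
  }
}

// Hypothetical factory: returns the token plus a cancel() handle for its owner.
function createCancellationToken(onHeartbeat?: (stage: string | undefined) => void) {
  let aborted = false;
  let reason: CancellationReason | undefined;
  const token: CancellationToken = {
    get aborted() { return aborted; },
    get reason() { return reason; },
    heartbeat(stage) { onHeartbeat?.(stage); },
    throwIfCancelled() { if (aborted) throw new CancelledError(reason); },
    // Simplification: cancellation during the wait is only observed when the timer fires.
    wait(ms) {
      return new Promise<void>((resolve, reject) => {
        if (aborted) { reject(new CancelledError(reason)); return; }
        setTimeout(() => (aborted ? reject(new CancelledError(reason)) : resolve()), ms);
      });
    },
  };
  return { token, cancel(why?: CancellationReason) { aborted = true; reason = why; } };
}

// Checks the token every `cadence` entries; with no token the loop behaves as before.
function scanEntries(entries: string[], token?: CancellationToken, cadence = 100): number {
  let scanned = 0;
  for (let i = 0; i < entries.length; i++) {
    if (token && i % cadence === 0) {
      token.heartbeat(`scan:${i}`);
      token.throwIfCancelled();
    }
    scanned++;
  }
  return scanned;
}
```

The optional `token` parameter is what keeps existing call sites unchanged, and the `cadence` value bounds how many entries can be processed after a cancel request before the loop notices it.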
## P2 — Artifact Store Improvements

### P2.5 Content-addressed blob artifacts

**Target behavior**

Large logs/transcripts/results are stored as blobs:

```text
artifacts/blobs/sha256/<hash>
artifacts/blob-metadata/<hash>.json
```

Metadata includes:

- runId/taskId
- MIME/type
- producer
- original path/name
- size/hash
- redaction status
- retention policy

**Suggested files**

- `src/state/artifact-store.ts`
- `src/runtime/task-runner.ts`
- `src/ui/transcript-viewer.ts`
- `src/extension/run-export.ts`
- `src/extension/run-import.ts`
- `test/unit/artifact-store*.test.ts`

**Acceptance criteria**

- Artifacts above threshold are blob-referenced.
- Run export/import preserves blobs.
- GC removes unreferenced blobs after retention.
- Path traversal protections remain intact.
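
As a rough illustration of the layout above, the sketch below derives both paths from a SHA-256 digest using Node's `node:crypto`. The `BlobMetadata` field names, `blobPaths`, and `describeBlob` are assumptions made for the example; the real artifact store may name and validate things differently.

```typescript
import { createHash } from "node:crypto";

// Hypothetical metadata shape covering the fields listed in this item.
interface BlobMetadata {
  runId: string;
  taskId: string;
  mimeType: string;
  producer: string;
  originalName: string;
  size: number;
  hash: string;
  redacted: boolean;
  retention: string;
}

const SHA256_HEX = /^[0-9a-f]{64}$/;

// Only a 64-character lowercase hex digest may become a path segment,
// so separators and ".." can never reach the file system layer.
function blobPaths(hash: string): { blob: string; metadata: string } {
  if (!SHA256_HEX.test(hash)) throw new Error(`invalid sha256 digest: ${hash}`);
  return {
    blob: `artifacts/blobs/sha256/${hash}`,
    metadata: `artifacts/blob-metadata/${hash}.json`,
  };
}

// Hashes the content and fills in the derived size/hash metadata fields.
function describeBlob(content: Uint8Array, info: Omit<BlobMetadata, "size" | "hash">) {
  const hash = createHash("sha256").update(content).digest("hex");
  const metadata: BlobMetadata = { ...info, size: content.byteLength, hash };
  return { paths: blobPaths(hash), metadata };
}
```

Because the path is a pure function of the content, identical logs or transcripts written by different tasks would map to the same blob, which is what makes reference-counted GC after retention workable.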
## P2 — UI and Dashboard Upgrades

### P2.6 Show capability/effectiveness/cancellation panels in dashboard

**Target behavior**

Dashboard panes expose:

- run effectiveness score and no-observed-work tasks;
- cancellation reason and intent;
- capability inventory for selected task;
- attempt/deadletter timeline.

**Suggested files**

- `src/ui/run-dashboard.ts`
- `src/ui/dashboard-panes/*`
- `src/ui/snapshot-types.ts`
- `src/ui/run-snapshot-cache.ts`
- `test/unit/run-dashboard.test.ts`
- new pane tests

**Acceptance criteria**

- No heavy synchronous scans in render path.
- Pane output is width-safe.
- Snapshot cache provides precomputed compact data.
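
One possible reading of these criteria is sketched below: a pane renders only from a precomputed snapshot and clamps every line to the available width. `EffectivenessSnapshot`, `fitToWidth`, and `renderEffectivenessPane` are hypothetical names, and the width check counts code points rather than terminal cells, so wide CJK or emoji glyphs would need a proper display-width routine.

```typescript
// Hypothetical precomputed data handed to the pane by the snapshot cache.
interface EffectivenessSnapshot {
  score: number;
  noObservedWorkTasks: string[];
}

// Clamps one line to `width` code points, marking truncation with "…".
function fitToWidth(line: string, width: number): string {
  if (width <= 0) return "";
  const chars = Array.from(line.replace(/[\r\n\t]/g, " "));
  if (chars.length <= width) return chars.join("");
  return width === 1 ? "…" : chars.slice(0, width - 1).join("") + "…";
}

// Pure function of the snapshot: no file reads or scans happen during render.
function renderEffectivenessPane(snapshot: EffectivenessSnapshot, width: number): string[] {
  return [
    `effectiveness: ${snapshot.score.toFixed(2)}`,
    `no observed work: ${snapshot.noObservedWorkTasks.join(", ") || "none"}`,
  ].map((line) => fitToWidth(line, width));
}
```

Keeping the pane a pure function of the snapshot also makes the pane tests cheap, since they only need fixture snapshots and a width.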

### P2.7 Event-first UI stream

**Target behavior**

Move more live UI updates from file polling to semantic events:

- `task_started`
- `task_completed`
- `worker_status`
- `mailbox_updated`
- `effectiveness_changed`

**Acceptance criteria**

- Render scheduler remains coalesced and overlap-safe.
- UI still recovers from durable files after restart.
- File polling is fallback, not the hot path.
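
To make "coalesced" concrete, here is a small sketch of a scheduler that batches a burst of semantic events into a single render. The event names come from the list above, but the payload fields, the `RenderScheduler` class, and the 16 ms default delay are assumptions for illustration rather than the existing scheduler's design.

```typescript
// Event names are from the roadmap; payload fields are assumed for the example.
type UiEvent =
  | { type: "task_started"; taskId: string }
  | { type: "task_completed"; taskId: string }
  | { type: "worker_status"; workerId: string; status: string }
  | { type: "mailbox_updated"; runId: string }
  | { type: "effectiveness_changed"; runId: string; score: number };

class RenderScheduler {
  private pending: UiEvent[] = [];
  private scheduled = false;
  private render: (batch: UiEvent[]) => void;
  private defer: (flush: () => void) => void;

  constructor(
    render: (batch: UiEvent[]) => void,
    // Injectable so tests can flush manually instead of waiting on a timer.
    defer: (flush: () => void) => void = (flush) => { setTimeout(flush, 16); },
  ) {
    this.render = render;
    this.defer = defer;
  }

  push(event: UiEvent): void {
    this.pending.push(event);
    if (this.scheduled) return; // coalesce: one flush per burst of events
    this.scheduled = true;
    this.defer(() => this.flush());
  }

  private flush(): void {
    // Swap the queue before rendering so events pushed during render
    // land in the next batch instead of overlapping this one.
    const batch = this.pending;
    this.pending = [];
    this.scheduled = false;
    this.render(batch);
  }
}
```

File polling would then only need to call the same `render` path on a slow timer or after restart, which keeps it a fallback rather than the hot path.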
## P2 — Raw Scan Entry Cache

### P2.8 Cache raw entries, not final semantic query results

**Target behavior**

Shared raw scan cache for:

- runs
- artifacts
- mailbox files
- transcript chunks
- worktree roots

Then apply filters/sorts after retrieval.

**Suggested files**

- `src/runtime/manifest-cache.ts`
- `src/ui/run-snapshot-cache.ts`
- `src/extension/run-index.ts`
- `src/utils/file-coalescer.ts`

**Acceptance criteria**

- Deterministic sort order.
- State mutation invalidates relevant raw entries.
- Large workspaces do not trigger full rescans on every render/status.
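
The split between raw entries and semantic queries might be sketched as follows. `RawScanCache`, `RawEntry`, and `newestFirst` are hypothetical names, and the scan callback stands in for whatever directory walk the real modules perform.

```typescript
type RawKind = "runs" | "artifacts" | "mailbox" | "transcripts" | "worktrees";

// Hypothetical raw entry: just what a directory scan would return.
interface RawEntry {
  path: string;
  mtimeMs: number;
}

class RawScanCache {
  private entries = new Map<RawKind, RawEntry[]>();
  private scan: (kind: RawKind) => RawEntry[];

  constructor(scan: (kind: RawKind) => RawEntry[]) {
    this.scan = scan;
  }

  // Scans at most once per kind until that kind is invalidated.
  get(kind: RawKind): readonly RawEntry[] {
    let cached = this.entries.get(kind);
    if (!cached) {
      cached = this.scan(kind);
      this.entries.set(kind, cached);
    }
    return cached;
  }

  // Called after a state mutation that touches this kind of entry.
  invalidate(kind: RawKind): void {
    this.entries.delete(kind);
  }
}

// Semantic query applied after retrieval; ties break on path so order is deterministic.
function newestFirst(entries: readonly RawEntry[], since = 0): RawEntry[] {
  return entries
    .filter((e) => e.mtimeMs >= since)
    .sort((a, b) => b.mtimeMs - a.mtimeMs || (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
}
```

Because `filter` copies before `sort`, different callers can apply different filters and orderings to the same cached entries without triggering a rescan or disturbing each other.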
## P3 — Release/Install Hardening

### P3.1 Tarball install smoke before publish

**Target behavior**

Release workflow requires:

```bash
npm run ci
npm pack --dry-run
npm pack
# install tarball in temp project
# verify pi extension load smoke
# verify npm package files and version/tag consistency
```

**Suggested files**

- `docs/publishing.md`
- `package.json` scripts
- `.github/workflows/*` if CI is added
- optional `scripts/release-smoke.mjs`

**Acceptance criteria**

- Packed tarball loads extension in temp Pi home.
- The version in `package.json`, the changelog, the git tag, and `npm view` output are consistent.
- Release instructions include rollback notes.
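
The version-consistency criterion lends itself to a small pure check that a smoke script could call after gathering the values. The sketch below is only one way to frame it: `VersionSources` and `findVersionMismatches` are hypothetical names, and it assumes git tags may carry a leading `v`.

```typescript
// Hypothetical inputs collected by a release smoke script.
interface VersionSources {
  packageJson: string; // "version" field of package.json
  changelog: string;   // newest version listed in the changelog
  tag: string;         // git tag, assumed to allow a leading "v"
  npmView?: string;    // registry version, if the caller chooses to compare it
}

// Returns one message per source that disagrees with package.json.
function findVersionMismatches(sources: VersionSources): string[] {
  const expected = sources.packageJson;
  const mismatches: string[] = [];
  const check = (name: string, actual: string | undefined): void => {
    if (actual !== undefined && actual !== expected) {
      mismatches.push(`${name}: expected ${expected}, got ${actual}`);
    }
  };
  check("changelog", sources.changelog);
  check("tag", sources.tag.replace(/^v/, ""));
  check("npm view", sources.npmView);
  return mismatches;
}
```

An empty result means the sources agree; a non-empty one gives the release script a message to print before it exits non-zero.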

## Suggested Implementation Order

1. **P0.1 Effectiveness policy enforcement** — prevents misleading completed runs.
2. **P0.2 Persist runtime safety** — improves debugging for worker spawn issues.
3. **P1.3 Two-phase worker teardown** — reduces stale/zombie worker risk.
4. **P1.1 Durable steering/follow-up queues** — completes semantic split started at live-control level.
5. **P1.5 Event-tree provenance** — builds on current `attemptId` work.
6. **P1.7 Capability inventory view** — turns existing per-task artifacts into operator UX.
7. **P2.3 Durable history projection** — reduces prompt/context risks.
8. **P2.4 CancellationToken** — improves responsiveness of internal scans.
9. **P2.5 Blob artifacts** — prevents log/transcript bloat.
10. **P2.6 Dashboard panels** — surface all new evidence in UI.

## Release Guidance

Before publishing a patch with these upgrades:

```bash
npx tsc --noEmit
npm run test:unit
npm run test:integration
npm pack --dry-run
```

For runtime/process changes, also run targeted child-worker integration tests:

```bash
node --experimental-strip-types --test --test-concurrency=1 --test-timeout=60000 \
  test/integration/mock-child-run.test.ts \
  test/integration/mock-child-json-run.test.ts \
  test/integration/phase6-runtime-hardening.test.ts
```

Do not publish without explicit user confirmation and a green verification pass.