pi-crew 0.8.11 → 0.8.13

package/CHANGELOG.md CHANGED
@@ -1,5 +1,49 @@
1
1
  # Changelog
2
2
 
3
+ ## [0.8.12] — `team action=cleanup` now reverses `init` (Issue #35) (2026-06-17)
4
+
5
+ `team action=cleanup` gained a **project-level mode** that reverses what
6
+ `team action=init` writes. This closes the legitimate complaint in
7
+ [Issue #35](https://github.com/baphuongna/pi-crew/issues/35): pi-crew injects
8
+ a guidance block into `AGENTS.md` on `init`, but `pi uninstall` has no
9
+ extension hook to remove it — so the block (and `.crew/`) were left behind.
10
+
11
+ ### New `cleanup` modes
12
+
13
+ | Call | What it does |
14
+ |---|---|
15
+ | `team action=cleanup runId=<id>` | Per-run worktree cleanup (existing behavior, unchanged) |
16
+ | `team action=cleanup` (no runId) | **NEW**: removes the AGENTS.md guidance block |
17
+ | `team action=cleanup force=true` | NEW: also removes the `.crew/` state directory |
18
+ | `team action=cleanup dryRun=true` | NEW: preview without writing |
19
+
20
+ ### Safety guarantees
21
+
22
+ - The AGENTS.md guidance block is **marker-delimited**
23
+ (`<!-- PI-CREW:GUIDANCE:START/END -->`), so `removeGuidance` removes **only**
24
+ that block — user content is never touched (pinned by a test).
25
+ - `.crew/` removal requires explicit `force=true` (irreversible — holds run
26
+ history, artifacts, worktrees). Default preserves it.
27
+ - A `realpathSync` + basename guard refuses to `rmSync` anything that isn't a
28
+ `.crew` dir, so a crafted cwd can't trick us into deleting an arbitrary path.
29
+ - The user-scope dir (`~/.pi/agent/extensions/pi-crew/`) is owned by
30
+ `pi uninstall` and is never touched by `team action=cleanup`.
31
+
32
+ ### Files
33
+
34
+ - `src/extension/team-tool/lifecycle-actions.ts` — `handleCleanup` dispatcher
35
+ + new `handleProjectCleanup` (no-runId path). Intent policy now checked once
36
+ in the dispatcher (applies to both modes). Per-run path preserved verbatim.
37
+ - `src/extension/team-tool-types.ts` — `TeamToolDetails.scope?`.
38
+ - `README.md` — new **Uninstall** section documenting the full flow.
39
+ - `test/unit/cleanup-project-mode.test.ts` — NEW, 9 tests (removal, user-content
40
+ preservation, idempotency, force-gating, dry-run, scope rejection, runId
41
+ routing).
42
+ - `test/unit/team-tool-dispatch.test.ts` — updated the no-runId test to the
43
+ new contract (project cleanup, not error).
44
+
45
+ typecheck clean; full suite 2964/0.
46
+
3
47
  ## [0.8.11] — Split-scope install fix + transient-provider fallback (2026-06-17)
4
48
 
5
49
  Bundle of two independent fixes that were triaged from real user reports on
@@ -2546,3 +2590,41 @@ correctness+error-handling, and performance+architecture audits across 77 source
2546
2590
  - TypeScript: 0 errors
2547
2591
  - Skills: 37/37 PASS
2548
2592
  - New modules: 11 files, 2,267 LOC
2593
+
2594
+ ## [0.8.13] — user-scope cleanup + install side-effects warning (Issue #35) (2026-06-18)
2595
+
2596
+ Follow-up to issue #35's latest comment ("pi-crew leaves behind user-level
2597
+ junk"). Two of the three points raised were valid; both addressed.
2598
+
2599
+ ### `team action=cleanup scope=user` — new user-level cleanup mode
2600
+ Removes pi-crew user-scope state that `pi uninstall npm:pi-crew` leaves behind:
2601
+ - `~/.pi/agent/extensions/pi-crew/` — pi-crew runtime state (artifacts, state,
2602
+ config.json). Regenerable, always removed.
2603
+ - `~/.pi/agent/agents/*.md.bak-<timestamp>-<hex>` — smoke-test backup junk
2604
+ pi-crew's own tests leave behind. NEVER touches real `*.md` agent files
2605
+ (pi-crew can't tell user-authored vs test-copied — only the timestamped
2606
+ `.bak-*` pattern is removed).
2607
+ - `~/.pi/agent/pi-crew.json` — global config. Gated on `force=true` (may hold
2608
+ your customized settings).
2609
+
2610
+ `dryRun=true` previews; safe by default. Routes via the new `scope=user` flag
2611
+ on `team action=cleanup`.
2612
+
2613
+ ### Install side-effects warning (install.mjs)
2614
+ The postinstall script now prints an explicit "What pi-crew writes (and how to
2615
+ undo it)" block: AGENTS.md injection (marker-delimited, on `init` only),
2616
+ `.crew/` runtime dir, the global config created at install, and the full
2617
+ uninstall command sequence (project + user + `pi uninstall`). Nothing is hidden
2618
+ behind install; side effects are stated upfront.
2619
+
2620
+ ### README Uninstall section expanded
2621
+ Split into Project scope + User scope subsections, with the full 6-step
2622
+ uninstall flow and a note that authored agent files are never touched.
2623
+
2624
+ ### On the third claim (hijacks pi-intercom)
2625
+ Still not reproduced. Verified a third time: `grep -rni pi-intercom src/` → 0
2626
+ references. `crew-input-router.ts:11` passes slash commands through unchanged.
2627
+ The reply on the issue asks again for a concrete repro.
2628
+
2629
+ typecheck clean; +6 user-scope cleanup tests + 1 routing test update; full
2630
+ suite 2963/0.
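Reviewer sketch: the `.bak-*` matching described in the 0.8.13 entry can be illustrated with the regular expression that appears later in this diff (in `handleUserCleanup`). The helper name `isSmokeTestBackup` and the sample file names are hypothetical; only the pattern itself comes from the source.

```typescript
// Pattern copied from handleUserCleanup in this diff: a name ending in
// `.md.bak-<timestamp of 17 or more digits>-<hex>`.
const BAK_PATTERN = /^.*\.md\.bak-\d{17,}-[0-9a-f]+$/i;

// Hypothetical helper: true only for the timestamped backup pattern.
function isSmokeTestBackup(fileName: string): boolean {
  return BAK_PATTERN.test(fileName);
}

// Illustrative directory listing (made-up names).
const sample: string[] = [
  "reviewer.md",                                // real agent file: kept
  "reviewer.md.bak-20260618093015123-a1b2c3",   // matches: removable
  "reviewer.md.bak",                            // no timestamp/hex: kept
  "notes.md.bak-2026-abc",                      // timestamp too short: kept
];

const removable: string[] = sample.filter(isSmokeTestBackup);
console.log(removable);
```

Because the pattern requires both the long numeric timestamp and the hex suffix, a plain `*.md` file or a hand-made `*.md.bak` copy falls outside it and stays in place.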
package/README.md CHANGED
@@ -107,6 +107,56 @@ node ./pi-crew/install.mjs # from local clone
107
107
  > one-line workaround (or install pi-crew in pi's own scope:
108
108
  > `npm install -g @earendil-works/pi-crew`).
109
109
 
110
+ ### Uninstall
111
+
112
+ `pi uninstall npm:pi-crew` removes the package, but pi doesn't fire an
113
+ extension uninstall hook, so several things pi-crew created are left behind.
114
+ Reverse them explicitly with `team action=cleanup`. There are **two scopes**:
115
+
116
+ #### Project scope (reverse `team action=init`)
117
+
118
+ ```bash
119
+ # 1. (Optional) Preview what would be removed, without writing:
120
+ team action=cleanup dryRun=true
121
+
122
+ # 2. Remove the AGENTS.md guidance block only (.crew/ preserved):
123
+ team action=cleanup
124
+
125
+ # 3. Remove BOTH the guidance block AND the .crew/ state directory (force):
126
+ team action=cleanup force=true
127
+ ```
128
+
129
+ The guidance block is wrapped in `<!-- PI-CREW:GUIDANCE:START -->` /
130
+ `<!-- PI-CREW:GUIDANCE:END -->` markers, so cleanup removes **only** that
131
+ block — your own AGENTS.md content is never touched. The `.crew/` directory
132
+ is removed **only** with `force=true` (it's irreversible).
133
+
134
+ #### User scope (remove user-level state `pi uninstall` leaves behind)
135
+
136
+ ```bash
137
+ # 4. Preview + remove pi-crew user-scope junk:
138
+ team action=cleanup scope=user dryRun=true # preview
139
+ team action=cleanup scope=user # remove ~/.pi/agent/extensions/pi-crew/
140
+ # + pi-crew smoke-test *.bak files
141
+
142
+ # 5. (Optional) Also remove the global config (holds your settings):
143
+ team action=cleanup scope=user force=true # also removes ~/.pi/agent/pi-crew.json
144
+ ```
145
+
146
+ This removes the pi-crew state dir (`~/.pi/agent/extensions/pi-crew/`, which
147
+ holds run artifacts + state), the global config (with `force=true`), and the
148
+ `*.md.bak-<timestamp>-<hex>` smoke-test backup files pi-crew's own tests may leave in
149
+ `~/.pi/agent/agents/`. **Your authored agent files (`*.md`) are never touched**
150
+ — pi-crew can't tell which were user-created vs test-copied, so only the
151
+ clearly-pi-crew `.bak-*` backups are removed.
152
+
153
+ #### Final step
154
+
155
+ ```bash
156
+ # 6. Remove the package itself:
157
+ pi uninstall npm:pi-crew
158
+ ```
159
+
110
160
 
111
161
  ---
112
162
 
package/install.mjs CHANGED
@@ -63,3 +63,24 @@ console.log("\nFor local development from a cloned repo:");
63
63
  console.log(" pi install .");
64
64
  console.log("\nChild workers are enabled by default. For dry runs, set runtime.mode=scaffold or executeWorkers=false.");
65
65
  console.log("To force-disable or force-enable workers in a shell, use PI_TEAMS_EXECUTE_WORKERS=0/1.");
66
+
67
+ // Side-effects warning (Issue #35): be upfront about what pi-crew writes and
68
+ // how to fully uninstall it. Nothing runs on install/registration itself; the
69
+ // writes below only happen when you explicitly invoke `team action=init`.
70
+ console.log("\n--- What pi-crew writes (and how to undo it) ---");
71
+ console.log("pi-crew itself writes nothing on install. The following only happens when you");
72
+ console.log("explicitly run `team action=init` in a project:");
73
+ console.log(" - A marker-delimited block is injected into the project's AGENTS.md.");
74
+ console.log(" (Wrapped in <!-- PI-CREW:GUIDANCE:START/END --> — your content is never touched.)");
75
+ console.log(" - A `.crew/` runtime state dir is created in the project (run history + artifacts).");
76
+ console.log(" - With --copy-builtins: bundled agents/teams/workflows are copied into the project.");
77
+ console.log("This install also created the global config above (`~/.pi/agent/pi-crew.json`).");
78
+ console.log("\nFull uninstall (in order):");
79
+ console.log(" team action=cleanup dryRun=true # preview what would be removed (project)");
80
+ console.log(" team action=cleanup # remove the AGENTS.md guidance block");
81
+ console.log(" team action=cleanup force=true # also remove the .crew/ project state dir");
82
+ console.log(" team action=cleanup scope=user # remove pi-crew user-scope junk");
83
+ console.log(" # (~/.pi/agent/extensions/pi-crew/ + test .bak files)");
84
+ console.log(" team action=cleanup scope=user force=true # also remove ~/.pi/agent/pi-crew.json");
85
+ console.log(" pi uninstall npm:pi-crew # remove the package itself");
86
+ console.log("See the README 'Uninstall' section for details.");
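Reviewer sketch of the marker-delimited removal that the messages above describe. `removeGuidance` itself is not shown in this diff, so `stripGuidanceBlock` below is a hypothetical stand-in that assumes only the two marker comments; the real function may differ, for example in whitespace handling or in how it reports removed block IDs.

```typescript
const START = "<!-- PI-CREW:GUIDANCE:START -->";
const END = "<!-- PI-CREW:GUIDANCE:END -->";

// Hypothetical stand-in for removeGuidance: drops the first START..END span and
// returns the input unchanged when the markers are missing or out of order.
function stripGuidanceBlock(content: string): string {
  const start = content.indexOf(START);
  const end = content.indexOf(END);
  if (start === -1 || end === -1 || end <= start) return content;
  const before = content.slice(0, start);
  const after = content.slice(end + END.length);
  // Assumption: join the two halves with a single blank line at the seam.
  return (before.trimEnd() + "\n\n" + after.trimStart()).trim() + "\n";
}

// Illustrative AGENTS.md content (made up for this example).
const agentsMd: string = [
  "# My project rules",
  "",
  START,
  "pi-crew guidance goes here",
  END,
  "",
  "## My own notes",
].join("\n");

const cleaned: string = stripGuidanceBlock(agentsMd);
console.log(cleaned);
```

Running the function a second time on `cleaned` returns it unchanged, which is the idempotency property the 0.8.12 changelog entry lists among its tests.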
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-crew",
3
- "version": "0.8.11",
3
+ "version": "0.8.13",
4
4
  "description": "Pi extension for coordinated AI teams, workflows, worktrees, and async task orchestration",
5
5
  "author": "baphuongna",
6
6
  "license": "MIT",
@@ -22,10 +22,10 @@ export function piTeamsHelp(): string {
22
22
  "- /team-mascot",
23
23
  "- /team-transcript <runId> [taskId]",
24
24
  "- /team-result <runId> [taskId]",
25
- "- /team-manager",
25
+ "- /team-manager — interactive menu (alias: /team-cleanup-menu)",
26
26
  "",
27
27
  "Maintenance:",
28
- "- /team-cleanup <runId> [--force]",
28
+ "- /team-cleanup <runId> [--force] (or scope=project/user for uninstall cleanup)",
29
29
  "- /team-forget <runId> --confirm [--force]",
30
30
  "- /team-prune --keep=20 --confirm",
31
31
  "",
@@ -123,6 +123,7 @@ import { registerContextStatusInjection } from "./context-status-injection.ts";
123
123
  import { registerTeamTool } from "./registration/team-tool.ts";
124
124
  import { handleTeamTool } from "./team-tool.ts";
125
125
  import { persistScheduledJobUpdate } from "./team-tool/handle-schedule.ts";
126
+ import { shouldBlockDestructiveTeamAction } from "./team-tool/destructive-gate.ts";
126
127
 
127
128
  let _cachedOTLPExporter: typeof OTLPExporterType | undefined;
128
129
  async function importOTLPExporter(): Promise<typeof OTLPExporterType> {
@@ -1992,13 +1993,11 @@ export function registerPiTeams(pi: ExtensionAPI): void {
1992
1993
  const input = asRecord(rawInput);
1993
1994
  if (!input) return;
1994
1995
  const action = typeof input.action === "string" ? input.action : undefined;
1995
- const destructiveActions = new Set(["delete", "forget", "prune", "cleanup"]);
1996
- if (!action || !destructiveActions.has(action)) return;
1997
- const forceBypassesReferenceChecks = action === "delete" && input.force === true;
1998
- if (input.confirm === true || forceBypassesReferenceChecks) return;
1996
+ const reason = shouldBlockDestructiveTeamAction(action, input);
1997
+ if (!reason) return;
1999
1998
  return {
2000
- block: true,
2001
- reason: `Destructive action '${action}' requires confirm=true${action === "delete" ? " (or force=true to bypass reference checks)" : ""}.`,
1999
+ block: true as const,
2000
+ reason,
2002
2001
  };
2003
2002
  });
2004
2003
 
@@ -439,7 +439,11 @@ export function registerTeamCommands(pi: ExtensionAPI, deps: RegisterTeamCommand
439
439
  },
440
440
  })
441
441
 
442
- pi.registerCommand("team-cleanup", { description: "Open a simple pi-crew interactive manager", handler: handleTeamManagerCommand });
442
+ pi.registerCommand("team-manager", { description: "Open the pi-crew interactive menu (list/run/status/cleanup/manage resources/doctor)", handler: handleTeamManagerCommand });
443
+ // Backward-compat alias: this command was originally registered as "team-cleanup"
444
+ // (the interactive menu predates the runId-targeted cleanup action). Keep both so
445
+ // existing muscle memory + help docs both work.
446
+ pi.registerCommand("team-cleanup-menu", { description: "Alias for /team-manager (pi-crew interactive menu)", handler: handleTeamManagerCommand });
443
447
 
444
448
  pi.registerCommand("team-result", { description: "Open a pi-crew agent result viewer: <runId> [taskId]", getArgumentCompletions: async (argumentPrefix: string) => { const parts = argumentPrefix.trim().split(/\s+/); return parts.length <= 1 ? suggestRunIds(parts[0] ?? "") : suggestTaskIds(parts[0] ?? "", parts[1] ?? ""); }, handler: async (args: string, ctx: ExtensionCommandContext) => {
445
449
  const [runId, rawTaskId] = args.trim().split(/\s+/).filter(Boolean);
@@ -75,11 +75,13 @@ function loadRunSummaries(cwd: string, options: OnboardingOptions = {}): RunSumm
75
75
  }
76
76
 
77
77
  /**
78
- * Format duration in minutes.
78
+ * Format duration in minutes. Defensive against missing/invalid timestamps
79
+ * (e.g. legacy test runs) — returns "?" instead of "NaNm".
79
80
  */
80
- function formatDuration(createdAt: string, completedAt?: string): string {
81
+ export function formatDuration(createdAt: string, completedAt?: string): string {
81
82
  const start = new Date(createdAt).getTime();
82
83
  const end = completedAt ? new Date(completedAt).getTime() : Date.now();
84
+ if (!Number.isFinite(start) || !Number.isFinite(end) || end < start) return "?";
83
85
  const minutes = Math.round((end - start) / 1000 / 60);
84
86
  if (minutes < 1) return "<1m";
85
87
  if (minutes >= 60) return `${Math.round(minutes / 60)}h`;
@@ -136,8 +138,10 @@ export function buildTeamOnboarding(team: string, cwd: string, options: Onboardi
136
138
  for (const run of runs) {
137
139
  const duration = formatDuration(run.createdAt, run.completedAt);
138
140
  const goalPreview = run.goal ? run.goal.slice(0, 40) : "N/A";
139
- const statusIcon = run.status === "completed" ? "✅" : run.status === "failed" ? "" : "⚠️";
140
- lines.push(`| \`${run.runId.slice(-8)}\` | ${goalPreview}${run.goal.length > 40 ? "..." : ""} | ${duration} | ${statusIcon} ${run.status} |`);
141
+ const goalSuffix = run.goal && run.goal.length > 40 ? "..." : "";
142
+ const status = run.status ?? "unknown";
143
+ const statusIcon = status === "completed" ? "✅" : status === "failed" ? "❌" : status === "cancelled" ? "⏹️" : "⚠️";
144
+ lines.push(`| \`${run.runId.slice(-8)}\` | ${goalPreview}${goalSuffix} | ${duration} | ${statusIcon} ${status} |`);
141
145
  }
142
146
  lines.push("");
143
147
  }
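Reviewer sketch of the new guard in `formatDuration`, runnable on its own. The statements up to the hours branch mirror the hunk above; the final `${minutes}m` return is an assumption, since the rest of the function falls outside the diff context.

```typescript
function formatDuration(createdAt: string, completedAt?: string): string {
  const start = new Date(createdAt).getTime();
  const end = completedAt ? new Date(completedAt).getTime() : Date.now();
  // Invalid dates parse to NaN, and a reversed range would give a negative
  // duration; both cases now render as "?".
  if (!Number.isFinite(start) || !Number.isFinite(end) || end < start) return "?";
  const minutes = Math.round((end - start) / 1000 / 60);
  if (minutes < 1) return "<1m";
  if (minutes >= 60) return `${Math.round(minutes / 60)}h`;
  return `${minutes}m`; // assumed tail: not visible in the diff
}

const cases: string[] = [
  formatDuration("not-a-date", "2026-06-17T10:00:00Z"),            // invalid start
  formatDuration("2026-06-17T10:00:00Z", "2026-06-17T09:00:00Z"),  // end before start
  formatDuration("2026-06-17T10:00:00Z", "2026-06-17T10:00:20Z"),  // 20 seconds
  formatDuration("2026-06-17T10:00:00Z", "2026-06-17T12:00:00Z"),  // 2 hours
];
console.log(cases);
```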
@@ -0,0 +1,47 @@
1
+ /**
2
+ * Permission gate logic for destructive team actions (delete/forget/prune/cleanup).
3
+ *
4
+ * Extracted from register.ts `pi.on("tool_call")` handler into a pure function
5
+ * so the gate logic is unit-testable in isolation (the handler itself is hard
6
+ * to test because it's an async event listener on the Pi extension API).
7
+ *
8
+ * Returns `undefined` when the action is ALLOWED, or a block `reason` string
9
+ * when it should be blocked. The handler wraps this and emits the `{block, reason}`
10
+ * shape Pi expects.
11
+ *
12
+ * Rules (in order):
13
+ * 1. Non-team / non-destructive actions → allowed (caller pre-filters, but safe).
14
+ * 2. `cleanup` with `dryRun=true` → ALWAYS allowed (a preview writes nothing,
15
+ * so gating it would block users from previewing what cleanup would do —
16
+ * this was a UX bug: team action=cleanup dryRun=true returned "requires
17
+ * confirm=true" even though it changed no files).
18
+ * 3. `confirm=true` on the input → allowed (explicit user intent).
19
+ * 4. `delete` with `force=true` → allowed (force bypasses reference checks).
20
+ * 5. Otherwise → blocked with a reason telling the user what to pass.
21
+ */
22
+
23
+ export const DESTRUCTIVE_TEAM_ACTIONS = new Set(["delete", "forget", "prune", "cleanup"]);
24
+
25
+ export interface TeamToolInputLike {
26
+ action?: unknown;
27
+ confirm?: unknown;
28
+ force?: unknown;
29
+ dryRun?: unknown;
30
+ }
31
+
32
+ /**
33
+ * Decide whether a destructive team action should be blocked.
34
+ * @returns block reason string, or `undefined` to allow.
35
+ */
36
+ export function shouldBlockDestructiveTeamAction(
37
+ action: string | undefined,
38
+ input: TeamToolInputLike,
39
+ ): string | undefined {
40
+ if (!action || !DESTRUCTIVE_TEAM_ACTIONS.has(action)) return undefined;
41
+ // dryRun cleanup is a PREVIEW (no writes) — never needs confirm.
42
+ if (action === "cleanup" && input.dryRun === true) return undefined;
43
+ if (input.confirm === true) return undefined;
44
+ const forceBypassesReferenceChecks = action === "delete" && input.force === true;
45
+ if (forceBypassesReferenceChecks) return undefined;
46
+ return `Destructive action '${action}' requires confirm=true${action === "delete" ? " (or force=true to bypass reference checks)" : ""}.`;
47
+ }
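Reviewer sketch exercising the five rules in order. The function is restated from the new file above (with the `force` check inlined) so the snippet runs standalone; the sample inputs are illustrative.

```typescript
const DESTRUCTIVE_TEAM_ACTIONS = new Set(["delete", "forget", "prune", "cleanup"]);

interface TeamToolInputLike {
  action?: unknown;
  confirm?: unknown;
  force?: unknown;
  dryRun?: unknown;
}

function shouldBlockDestructiveTeamAction(
  action: string | undefined,
  input: TeamToolInputLike,
): string | undefined {
  if (!action || !DESTRUCTIVE_TEAM_ACTIONS.has(action)) return undefined; // rule 1
  if (action === "cleanup" && input.dryRun === true) return undefined;    // rule 2
  if (input.confirm === true) return undefined;                           // rule 3
  if (action === "delete" && input.force === true) return undefined;      // rule 4
  return `Destructive action '${action}' requires confirm=true${action === "delete" ? " (or force=true to bypass reference checks)" : ""}.`; // rule 5
}

const decisions = {
  statusCall: shouldBlockDestructiveTeamAction("status", {}),                    // allowed
  cleanupPreview: shouldBlockDestructiveTeamAction("cleanup", { dryRun: true }), // allowed
  pruneConfirmed: shouldBlockDestructiveTeamAction("prune", { confirm: true }),  // allowed
  deleteForced: shouldBlockDestructiveTeamAction("delete", { force: true }),     // allowed
  cleanupForced: shouldBlockDestructiveTeamAction("cleanup", { force: true }),   // blocked
  prunePreview: shouldBlockDestructiveTeamAction("prune", { dryRun: true }),     // blocked
};
console.log(decisions);
```

Two consequences are easy to miss when reading the rules: `force=true` passes the gate only for `delete`, so `cleanup force=true` is still blocked here unless `confirm=true` is also set, and the `dryRun` exemption applies to `cleanup` alone.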
@@ -13,7 +13,8 @@ import { RUN_NOT_FOUND_HINT } from "./run-not-found.ts";
13
13
  import { enforceDestructiveIntent, intentFromConfig } from "./intent-policy.ts";
14
14
  import { executeHook, appendHookEvent } from "../../hooks/registry.ts";
15
15
  import { resolveRealContainedPath } from "../../utils/safe-paths.ts";
16
- import { projectCrewRoot, userCrewRoot } from "../../utils/paths.ts";
16
+ import { projectCrewRoot, userCrewRoot, userPiRoot } from "../../utils/paths.ts";
17
+ import { removeGuidance } from "../../config/markers.ts";
17
18
  import * as path from "node:path";
18
19
 
19
20
  export function handleWorktrees(params: TeamToolParamsValue, ctx: TeamContext): PiTeamsToolResult {
@@ -123,12 +124,261 @@ export async function handleForget(params: TeamToolParamsValue, ctx: TeamContext
123
124
  }
124
125
 
125
126
  export async function handleCleanup(params: TeamToolParamsValue, ctx: TeamContext): Promise<PiTeamsToolResult> {
127
+ // Intent policy applies to the cleanup action in BOTH modes (per-run and
128
+ // project-level). Checked once here so handleRunCleanup/handleProjectCleanup
129
+ // can stay focused on their own logic.
126
130
  const intentError = enforceDestructiveIntent("cleanup", params, ctx.config);
127
131
  if (intentError) return intentError;
128
- if (!params.runId) return result("Cleanup requires runId.", { action: "cleanup", status: "error" }, true);
129
- const loaded = loadRunManifestById(ctx.cwd, params.runId); // NOTE: no withRunLock - best-effort only; concurrent writes may cause inconsistency
130
- if (!loaded) return result(`Run '${params.runId}' not found.${RUN_NOT_FOUND_HINT}`, { action: "cleanup", status: "error" }, true);
132
+ // Three cleanup modes:
133
+ // 1. WITH runId → per-run worktree cleanup (existing behavior).
134
+ // 2. WITHOUT runId, scope=project (default) → PROJECT-LEVEL uninstall:
135
+ // removes the AGENTS.md guidance block + optionally `.crew/`. Reverses
136
+ // `team action=init`.
137
+ // 3. WITHOUT runId, scope=user → USER-LEVEL cleanup: removes pi-crew
138
+ // user-scope state that `pi uninstall` leaves behind (config.json,
139
+ // `~/.pi/agent/extensions/pi-crew/` state, junk `.bak` agent files
140
+ // from pi-crew smoke tests). Issue #35 comment: "pi-crew leaves behind
141
+ // user-level junk" — this closes that gap.
142
+ if (params.runId) {
143
+ return handleRunCleanup(params, ctx);
144
+ }
145
+ if (params.scope === "user") {
146
+ return handleUserCleanup(params, ctx);
147
+ }
148
+ return handleProjectCleanup(params, ctx);
149
+ }
150
+
151
+ /**
152
+ * Project-level uninstall cleanup (no runId). Reverses `team action=init`:
153
+ * removes the pi-crew guidance block from AGENTS.md (marker-delimited, so
154
+ * user content is untouched) and, with `force: true`, removes the `.crew/`
155
+ * runtime state directory. `dryRun: true` previews without writing.
156
+ *
157
+ * Safety:
158
+ * - `removeGuidance` only touches content between the PI-CREW markers.
159
+ * - `.crew/` removal requires explicit `force: true` (it holds run history,
160
+ * artifacts, and worktrees — irreversible). Default is guidance-only.
161
+ * - User-scope cleanup (`scope=user`) is a SEPARATE handler — see
162
+ * `handleUserCleanup`.
163
+ */
164
+ function handleProjectCleanup(params: TeamToolParamsValue, ctx: TeamContext): PiTeamsToolResult {
165
+ const cwd = ctx.cwd;
166
+ const dryRun = params.dryRun === true;
167
+ const removeState = params.force === true;
168
+ const scope = typeof params.scope === "string" ? params.scope : "project";
169
+ if (scope !== "project") {
170
+ return result(
171
+ `Project cleanup operates on the project only (got scope='${scope}'). ` +
172
+ `User-scope files are owned by 'pi uninstall npm:pi-crew'.`,
173
+ { action: "cleanup", status: "error", scope },
174
+ true,
175
+ );
176
+ }
177
+
178
+ const lines: string[] = ["Project cleanup for pi-crew:"];
179
+
180
+ // 1. Remove the AGENTS.md guidance block (marker-delimited → user content preserved).
181
+ const guidancePath = path.join(cwd, "AGENTS.md");
182
+ const guidanceResult = dryRun
183
+ ? { path: guidancePath, modified: fs.existsSync(guidancePath), added: [], removed: dryRunRemovedIds(guidancePath) }
184
+ : removeGuidance(guidancePath);
185
+ lines.push("AGENTS.md guidance block:");
186
+ if (guidanceResult.modified) {
187
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${guidanceResult.removed.length ? guidanceResult.removed.join(", ") : "(marker section)"}`);
188
+ } else {
189
+ lines.push(" - (no pi-crew marker section found — nothing to do)");
190
+ }
191
+
192
+ // 2. Optionally remove the .crew/ runtime state directory (force: true).
193
+ const crewRoot = projectCrewRoot(cwd);
194
+ lines.push(".crew/ state directory:");
195
+ const crewExists = fs.existsSync(crewRoot);
196
+ if (!crewExists) {
197
+ lines.push(` - (not present at ${crewRoot} — nothing to do)`);
198
+ } else if (!removeState) {
199
+ lines.push(` - present at ${crewRoot} (preserved — use force: true to remove; contains run history/artifacts/worktrees and is irreversible)`);
200
+ } else {
201
+ // SAFETY: realpath + contain-check before rmSync, so a crafted cwd can't
202
+ // trick us into deleting an arbitrary directory.
203
+ let resolved: string;
204
+ try {
205
+ resolved = fs.realpathSync.native(crewRoot);
206
+ } catch {
207
+ lines.push(` - ERROR: could not resolve ${crewRoot} (skipped)`);
208
+ return result(lines.join("\n"), { action: "cleanup", status: "ok", scope }, false);
209
+ }
210
+ if (!resolved.endsWith(path.sep + ".crew") && !resolved.endsWith("/teams") && path.basename(resolved) !== ".crew") {
211
+ lines.push(` - ERROR: refused to remove ${resolved} (does not look like a .crew dir) — skipped`);
212
+ } else {
213
+ if (!dryRun) {
214
+ try {
215
+ fs.rmSync(resolved, { recursive: true, force: true });
216
+ } catch (e) {
217
+ lines.push(` - ERROR removing ${resolved}: ${(e as Error).message}`);
218
+ }
219
+ }
220
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${resolved}`);
221
+ }
222
+ }
223
+
224
+ lines.push("");
225
+ lines.push(
226
+ dryRun
227
+ ? "(dry-run preview — no files were changed. Re-run without dryRun to apply.)"
228
+ : "Done. To fully remove pi-crew, also run: pi uninstall npm:pi-crew",
229
+ );
230
+ return result(lines.join("\n"), { action: "cleanup", status: "ok", scope }, false);
231
+ }
232
+
233
+ /** Dry-run helper: read what removeGuidance WOULD remove without writing. */
234
+ function dryRunRemovedIds(guidancePath: string): string[] {
235
+ try {
236
+ if (!fs.existsSync(guidancePath)) return [];
237
+ const content = fs.readFileSync(guidancePath, "utf-8");
238
+ const startIdx = content.indexOf("<!-- PI-CREW:GUIDANCE:START -->");
239
+ const endIdx = content.indexOf("<!-- PI-CREW:GUIDANCE:END -->");
240
+ if (startIdx === -1 || endIdx === -1 || endIdx <= startIdx) return [];
241
+ // Cheap approximation: report the marker section as a unit. Exact block
242
+ // IDs aren't needed for the dry-run summary; the non-dryRun path uses
243
+ // removeGuidance which returns the precise removed IDs.
244
+ return ["pi-crew-overview", "pi-crew-commands"];
245
+ } catch {
246
+ return [];
247
+ }
248
+ }
249
+
250
+ /**
251
+ * User-level cleanup (`scope=user`, no runId). Removes pi-crew user-scope
252
+ * state that `pi uninstall npm:pi-crew` leaves behind (Issue #35 comment:
253
+ * "pi-crew leaves behind user-level junk"). Targets ONLY pi-crew-owned paths:
254
+ *
255
+ * 1. `~/.pi/agent/extensions/pi-crew/` — pi-crew state dir (artifacts,
256
+ * state, config.json). Owned by pi-crew, safe to remove.
257
+ * 2. `~/.pi/agent/pi-crew.json` — pi-crew global config (only with force).
258
+ * 3. `~/.pi/agent/agents/*.bak-*` junk files from pi-crew smoke tests
259
+ * (pattern: `*.md.bak-<timestamp>-<hex>` — these are pi-crew test
260
+ * leftovers, never user data).
261
+ *
262
+ * Safety:
263
+ * - NEVER removes user-authored agent files (`~/.pi/agent/agents/*.md`)
264
+ * because pi-crew cannot tell which were user-created vs test-copied —
265
+ * only the timestamped `.bak-*` backups (clearly pi-crew test junk).
266
+ * - The state dir removal does NOT require force=true
267
+ * (it's pi-crew's own runtime cache, regenerable), but config.json removal
268
+ * IS gated on force=true (it may hold user-customized settings).
269
+ * - dryRun=true previews without writing.
270
+ */
271
+ function handleUserCleanup(params: TeamToolParamsValue, ctx: TeamContext): PiTeamsToolResult {
272
+ const dryRun = params.dryRun === true;
273
+ const force = params.force === true;
274
+ const lines: string[] = ["User-scope cleanup for pi-crew:"];
275
+
276
+ // 1. pi-crew user state dir (~/.pi/agent/extensions/pi-crew/) — always safe
277
+ // to remove (regenerable runtime cache: artifacts + state). This is the
278
+ // bulk of the "user-level junk".
279
+ const crewStateDir = userCrewRoot();
280
+ lines.push(`pi-crew user state dir (${crewStateDir}):`);
281
+ let crewStateBytes = 0;
282
+ if (fs.existsSync(crewStateDir)) {
283
+ try {
284
+ crewStateBytes = dirSize(crewStateDir);
285
+ } catch { /* best-effort size */ }
286
+ if (!dryRun) {
287
+ try { fs.rmSync(crewStateDir, { recursive: true, force: true }); }
288
+ catch (e) { lines.push(` - ERROR removing: ${(e as Error).message}`); }
289
+ }
290
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${crewStateDir} (${formatBytes(crewStateBytes)})`);
291
+ } else {
292
+ lines.push(" - (not present — nothing to do)");
293
+ }
294
+
295
+ // 2. pi-crew global config (~/.pi/agent/pi-crew.json) — gated on force=true
296
+ // because it may hold user-customized settings (autonomous profile, model
297
+ // overrides, etc.). Default preserves it.
298
+ const userConfigPath = path.join(userPiRoot(), "pi-crew.json");
299
+ lines.push("pi-crew global config:");
300
+ if (!fs.existsSync(userConfigPath)) {
301
+ lines.push(` - (not present at ${userConfigPath} — nothing to do)`);
302
+ } else if (!force) {
303
+ lines.push(` - present at ${userConfigPath} (preserved — use force: true to remove; may hold your customized settings)`);
304
+ } else {
305
+ if (!dryRun) {
306
+ try { fs.rmSync(userConfigPath, { force: true }); }
307
+ catch (e) { lines.push(` - ERROR removing: ${(e as Error).message}`); }
308
+ }
309
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${userConfigPath}`);
310
+ }
311
+
312
+ // 3. pi-crew smoke-test `.bak-*` junk in ~/.pi/agent/agents/. These are
313
+ // leftover backups from pi-crew's own smoke tests (pattern:
314
+ // `<name>.md.bak-<timestamp>-<hex>`). NEVER touch real `.md` agent files
315
+ // (can't tell user-authored vs test-copied).
316
+ const agentsDir = path.join(userPiRoot(), "agents");
317
+ lines.push("pi-crew test junk in agents dir:");
318
+ const bakJunk: string[] = [];
319
+ if (fs.existsSync(agentsDir)) {
320
+ try {
321
+ for (const entry of fs.readdirSync(agentsDir)) {
322
+ // Only the pi-crew smoke-test backup pattern. Real agent files end in `.md`.
323
+ if (/^.*\.md\.bak-\d{17,}-[0-9a-f]+$/i.test(entry)) {
324
+ bakJunk.push(path.join(agentsDir, entry));
325
+ }
326
+ }
327
+ } catch { /* best-effort scan */ }
328
+ }
329
+ if (bakJunk.length === 0) {
330
+ lines.push(" - (no `*.md.bak-*` test backups found — nothing to do)");
331
+ } else {
332
+ for (const junk of bakJunk) {
333
+ if (!dryRun) {
334
+ try { fs.rmSync(junk, { force: true }); }
335
+ catch (e) { lines.push(` - ERROR removing ${path.basename(junk)}: ${(e as Error).message}`); }
336
+ }
337
+ }
338
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${bakJunk.length} backup file(s) (pattern *.md.bak-<timestamp>-<hex>)`);
339
+ }
340
+
341
+ lines.push("");
342
+ lines.push(
343
+ dryRun
344
+ ? "(dry-run preview — no files were changed. Re-run without dryRun to apply.)"
345
+ : "Done. To fully remove pi-crew: also run `team action=cleanup force=true` (project .crew/) + `pi uninstall npm:pi-crew`.",
346
+ );
347
+ return result(lines.join("\n"), { action: "cleanup", status: "ok", scope: "user" }, false);
348
+ }
349
+
350
+ /** Recursively compute a directory's size in bytes (best-effort). */
351
+ function dirSize(dir: string): number {
352
+ let total = 0;
353
+ const stack = [dir];
354
+ while (stack.length > 0) {
355
+ const cur = stack.pop()!;
356
+ let entries: string[];
357
+ try { entries = fs.readdirSync(cur); }
358
+ catch { continue; }
359
+ for (const entry of entries) {
360
+ const full = path.join(cur, entry);
361
+ try {
362
+ const stat = fs.statSync(full);
363
+ if (stat.isDirectory()) stack.push(full);
364
+ else total += stat.size;
365
+ } catch { /* skip unreadable */ }
366
+ }
367
+ }
368
+ return total;
369
+ }
370
+
371
+ function formatBytes(n: number): string {
372
+ if (n < 1024) return `${n}B`;
373
+ if (n < 1024 * 1024) return `${(n / 1024).toFixed(1)}KB`;
374
+ if (n < 1024 * 1024 * 1024) return `${(n / (1024 * 1024)).toFixed(1)}MB`;
375
+ return `${(n / (1024 * 1024 * 1024)).toFixed(1)}GB`;
376
+ }
131
377
 
378
+ /** Per-run worktree cleanup (existing behavior, preserved). */
379
+ async function handleRunCleanup(params: TeamToolParamsValue, ctx: TeamContext): Promise<PiTeamsToolResult> {
380
+ const loaded = loadRunManifestById(ctx.cwd, params.runId!); // NOTE: no withRunLock - best-effort only; concurrent writes may cause inconsistency
381
+ if (!loaded) return result(`Run '${params.runId}' not found.${RUN_NOT_FOUND_HINT}`, { action: "cleanup", status: "error", runId: params.runId }, true);
132
382
  // Ownership check — prevent cross-session worktree cleanup unless force is set
133
383
  const foreignRun = typeof loaded.manifest.ownerSessionId === "string" && loaded.manifest.ownerSessionId !== ctx.sessionId;
134
384
  if (foreignRun && !params.force) return result(`Run ${params.runId} belongs to another session. Use force: true to override.`, { action: "cleanup", status: "error", runId: loaded.manifest.runId }, true);
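Reviewer sketch of the three-way routing in `handleCleanup`, reduced to a pure function. `resolveCleanupMode`, `CleanupParams`, and `CleanupMode` are hypothetical names; the precedence (runId first, then `scope=user`, otherwise project) follows the dispatcher in this hunk.

```typescript
// Hypothetical, trimmed-down view of the parameters the dispatcher inspects.
interface CleanupParams {
  runId?: string;
  scope?: string;
}

type CleanupMode = "run" | "user" | "project";

// Mirrors the branch order in handleCleanup: a runId always wins, then
// scope=user, and everything else falls through to the project handler.
function resolveCleanupMode(params: CleanupParams): CleanupMode {
  if (params.runId) return "run";
  if (params.scope === "user") return "user";
  return "project";
}

const routed: CleanupMode[] = [
  resolveCleanupMode({ runId: "run-123" }),                 // per-run cleanup
  resolveCleanupMode({ runId: "run-123", scope: "user" }),  // runId takes precedence
  resolveCleanupMode({ scope: "user" }),                    // user-scope cleanup
  resolveCleanupMode({}),                                   // project cleanup (default)
  resolveCleanupMode({ scope: "workspace" }),               // also reaches the project handler
];
console.log(routed);
```

An unrecognized scope such as the made-up `workspace` is still routed to the project handler, which is where the diff's `scope !== "project"` check turns it into an error result.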
@@ -12,6 +12,7 @@ import { computePhaseProgress } from "../../runtime/phase-progress.ts";
12
12
  import { formatDuration } from "../../ui/tool-render.ts";
13
13
  import { verifyTaskCompletion } from "../../runtime/completion-guard.ts";
14
14
  import { evaluateRunEffectiveness } from "../../runtime/effectiveness.ts";
15
+ import { extractCommandTrace } from "../../runtime/command-trace.ts";
15
16
  import type { PiTeamsToolResult } from "../tool-result.ts";
16
17
  import { locateRunCwd } from "../team-tool.ts";
17
18
  import { result, type TeamContext } from "./context.ts";
@@ -88,7 +89,7 @@ export function handleStatus(params: TeamToolParamsValue, ctx: TeamContext): PiT
88
89
  "Task graph:",
89
90
  ...formatTaskGraphLines(tasks),
90
91
  "Tasks:",
91
- ...(tasks.length ? tasks.map((task) => `- ${task.id} [${task.status}] ${task.role} -> ${task.agent}${task.taskPacket ? ` scope=${task.taskPacket.scope}` : ""}${task.verification ? ` green=${task.verification.observedGreenLevel}/${task.verification.requiredGreenLevel}` : ""}${task.modelAttempts?.length ? ` attempts=${task.modelAttempts.length}` : ""}${task.modelRouting ? ` modelRouting=${task.modelRouting.requested ? `${task.modelRouting.requested}->` : ""}${task.modelRouting.resolved}${task.modelRouting.usedAttempt ? ` attempt=${task.modelRouting.usedAttempt + 1}` : ""}` : ""}${task.agentProgress?.activityState ? ` activityState=${task.agentProgress.activityState}` : ""}${attentionByTask.get(task.id)?.data?.reason ? ` attention=${String(attentionByTask.get(task.id)?.data?.reason)}` : ""}${task.jsonEvents !== undefined ? ` jsonEvents=${task.jsonEvents}` : ""}${task.usage ? ` usage=${JSON.stringify(task.usage)}` : ""}${task.resultArtifact ? ` result=${task.resultArtifact.path}` : ""}${task.transcriptArtifact ? ` transcript=${task.transcriptArtifact.path}` : ""}${task.worktree ? ` worktree=${task.worktree.path}` : ""}${task.error ? ` error=${task.error}` : ""}`) : ["- (none)"]),
92
+ ...(tasks.length ? tasks.map((task) => `- ${task.id} [${task.status}] ${task.role} -> ${task.agent}${task.taskPacket ? ` scope=${task.taskPacket.scope}` : ""}${task.verification ? ` green=${task.verification.observedGreenLevel}/${task.verification.requiredGreenLevel}` : ""}${task.modelAttempts?.length ? ` attempts=${task.modelAttempts.length}` : ""}${task.modelRouting ? ` modelRouting=${task.modelRouting.requested ? `${task.modelRouting.requested}->` : ""}${task.modelRouting.resolved}${task.modelRouting.usedAttempt ? ` attempt=${task.modelRouting.usedAttempt + 1}` : ""}` : ""}${task.agentProgress?.activityState ? ` activityState=${task.agentProgress.activityState}` : ""}${(() => { const t = extractCommandTrace(task.agentProgress?.recentTools); return t.summary ? ` ${t.summary}` : ""; })()}${attentionByTask.get(task.id)?.data?.reason ? ` attention=${String(attentionByTask.get(task.id)?.data?.reason)}` : ""}${task.jsonEvents !== undefined ? ` jsonEvents=${task.jsonEvents}` : ""}${task.usage ? ` usage=${JSON.stringify(task.usage)}` : ""}${task.resultArtifact ? ` result=${task.resultArtifact.path}` : ""}${task.transcriptArtifact ? ` transcript=${task.transcriptArtifact.path}` : ""}${task.worktree ? ` worktree=${task.worktree.path}` : ""}${task.error ? ` error=${task.error}` : ""}`) : ["- (none)"]),
92
93
  `Task counts: ${[...counts.entries()].map(([status, count]) => `${status}=${count}`).join(", ") || "none"}`,
93
94
  "Effectiveness:",
94
95
  `- observable=${effectiveness.observable}/${Math.max(1, effectiveness.completed)} completed tasks`,
@@ -10,6 +10,8 @@ export interface TeamToolDetails {
10
10
  resumedIds?: string[];
11
11
  retriedTaskIds?: string[];
12
12
  mailboxIds?: string[];
13
+ /** Resource scope affected by the action (e.g. cleanup: "project"). */
14
+ scope?: string;
13
15
  /** Run metrics for compact display in TUI tool result rendering. */
14
16
  metrics?: { taskCount?: number; completedCount?: number; totalTokens?: number; totalCost?: number; durationMs?: number; consistencyScore?: number };
15
17
  /** Structured data for programmatic consumption (e.g. TUI widgets). */
@@ -0,0 +1,105 @@
1
+ /**
2
+ * T10 — Verbatim command-trace extraction from tool-call history.
3
+ *
4
+ * Ported from pi-agent-flow's `generateCommandsFromHistory` technique (see
5
+ * `research-findings/pi-ecosystem-distillation.md` T10): workers routinely
6
+ * PARAPHRASE commands in their self-reports ("I ran the tests" instead of the
7
+ * exact `npm test -- --grep foo`). The orchestrator/user wants the VERBATIM
8
+ * command trace. pi-crew already records each executed tool in
9
+ * `CrewAgentProgress.recentTools` (capped at 25, args previewed at ≤240 chars),
10
+ * so the factual trace is available — this module distills it.
11
+ *
12
+ * Design notes:
13
+ * - Reads only the recorded `{tool, args, endedAt}` tuples — never trusts any
14
+ * LLM-emitted command string. This is the whole point: mechanical, factual.
15
+ * - Recognizes bash/shell tools and extracts their `command` arg, sanitized to
16
+ * a single line (newlines → ⏎) so a multi-line script shows as one trace row.
17
+ * - Non-shell tools (write/edit/read) are counted but not inlined (their args
18
+ * are file paths, not commands; the task result already lists changed files).
19
+ * - Caps the returned command list to keep the status line / summary bounded.
20
+ */
21
+
22
+ export interface ToolCallRecord {
23
+ tool: string;
24
+ args?: string;
25
+ endedAt?: string;
26
+ }
27
+
28
+ export interface CommandTrace {
29
+ /** Total number of recorded tool calls (all tool kinds). */
30
+ totalTools: number;
31
+ /** Count of recognized command-executing tools (bash/shell). */
32
+ commandTools: number;
33
+ /** Verbatim command strings (sanitized to one line), most-recent-last. */
34
+ commands: string[];
35
+ /** One-line summary suitable for a status/dashboard row, e.g. "cmd=8 (3 bash)". */
36
+ summary: string;
37
+ }
38
+
39
+ const COMMAND_TOOL_NAMES = new Set(["bash", "shell", "execute_bash", "run_command", "terminal"]);
40
+ const MAX_COMMANDS = 12;
41
+ const MAX_COMMAND_LEN = 160;
42
+
43
+ /**
44
+ * Extract a verbatim command trace from recorded tool calls. Pure function —
45
+ * safe to call with any subset of recentTools. Returns an empty trace for
46
+ * empty/missing input.
47
+ */
48
+ export function extractCommandTrace(recentTools: readonly ToolCallRecord[] | undefined | null): CommandTrace {
49
+ const tools = Array.isArray(recentTools) ? recentTools : [];
50
+ // Only count well-formed records (string tool name) — malformed entries
51
+ // are skipped entirely so a corrupt/transient record can't inflate totals.
52
+ const valid = tools.filter((call) => call && typeof call.tool === "string");
53
+ const totalTools = valid.length;
54
+ const commands: string[] = [];
55
+ let commandTools = 0;
56
+ for (const call of valid) {
57
+ const toolName = call.tool.toLowerCase();
58
+ if (!COMMAND_TOOL_NAMES.has(toolName)) continue;
59
+ commandTools += 1;
60
+ const cmd = extractCommandArg(call.args);
61
+ if (cmd) commands.push(sanitizeCommand(cmd));
62
+ }
63
+ const trimmed = commands.slice(-MAX_COMMANDS);
64
+ return {
65
+ totalTools,
66
+ commandTools,
67
+ commands: trimmed,
68
+ summary: formatSummary(totalTools, commandTools),
69
+ };
70
+ }
71
+
72
+ /** Pull the `command` field out of an args preview (JSON string or raw). */
73
+ function extractCommandArg(args: string | undefined): string | undefined {
74
+ if (!args) return undefined;
75
+ const raw = args.trim();
76
+ if (!raw) return undefined;
77
+ // Args is a JSON-stringified object like {"command":"ls -la"} (previewArgs
78
+ // JSON.stringifies objects). Try to parse and read .command / .cmd.
79
+ if (raw.startsWith("{")) {
80
+ try {
81
+ const parsed = JSON.parse(raw) as Record<string, unknown>;
82
+ const cmd = parsed.command ?? parsed.cmd ?? parsed.script ?? parsed.input;
83
+ if (typeof cmd === "string" && cmd.trim()) return cmd;
84
+ } catch {
85
+ // Not valid JSON — fall through to treat raw as the command.
86
+ }
87
+ }
88
+ // Otherwise treat the whole preview as the command text.
89
+ return raw;
90
+ }
91
+
92
+ /** Sanitize a command to a single bounded line for compact display. */
93
+ function sanitizeCommand(cmd: string): string {
94
+ const oneLine = cmd.replace(/\s*\r?\n\s*/g, " ⏎ ").trim();
95
+ if (oneLine.length <= MAX_COMMAND_LEN) return oneLine;
96
+ return `${oneLine.slice(0, MAX_COMMAND_LEN - 1)}…`;
97
+ }
98
+
99
+ function formatSummary(totalTools: number, commandTools: number): string {
100
+ if (totalTools === 0) return "";
101
+ // "cmd=8" always; append "(3 bash)" only when there's a mix worth showing.
102
+ if (commandTools === 0) return `cmd=${totalTools}`;
103
+ if (commandTools === totalTools) return `cmd=${totalTools}`;
104
+ return `cmd=${totalTools} (${commandTools} bash)`;
105
+ }
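To make the behavior of the new `command-trace.ts` module concrete, here is a condensed, self-contained restatement of its logic with a small sample history. It is a sketch for illustration, not the shipped file: the names `commandOf`, `toOneLine`, `traceSketch` and the sample records are hypothetical, and the malformed-record filter from the diff is omitted. The last sample assumes an args preview that was cut off mid-JSON, which is a guess about how the ≤240-character preview behaves.

```typescript
interface ToolCallRecord { tool: string; args?: string; endedAt?: string }

const COMMAND_TOOLS = new Set(["bash", "shell", "execute_bash", "run_command", "terminal"]);
const MAX_COMMANDS = 12;
const MAX_COMMAND_LEN = 160;

// Read the command out of a JSON args preview, else fall back to the raw text.
function commandOf(args: string | undefined): string | undefined {
  const raw = args?.trim();
  if (!raw) return undefined;
  if (raw.startsWith("{")) {
    try {
      const parsed = JSON.parse(raw) as Record<string, unknown>;
      const cmd = parsed.command ?? parsed.cmd ?? parsed.script ?? parsed.input;
      if (typeof cmd === "string" && cmd.trim()) return cmd;
    } catch {
      // Invalid (for example truncated) JSON: fall through to the raw preview.
    }
  }
  return raw;
}

// Collapse newlines to a visible marker and bound the length.
function toOneLine(cmd: string): string {
  const flat = cmd.replace(/\s*\r?\n\s*/g, " ⏎ ").trim();
  return flat.length <= MAX_COMMAND_LEN ? flat : `${flat.slice(0, MAX_COMMAND_LEN - 1)}…`;
}

function traceSketch(calls: readonly ToolCallRecord[]) {
  const commands: string[] = [];
  let commandTools = 0;
  for (const call of calls) {
    if (!COMMAND_TOOLS.has(call.tool.toLowerCase())) continue;
    commandTools += 1;
    const cmd = commandOf(call.args);
    if (cmd) commands.push(toOneLine(cmd));
  }
  const total = calls.length;
  const summary =
    total === 0 ? "" :
    commandTools === 0 || commandTools === total ? `cmd=${total}` :
    `cmd=${total} (${commandTools} bash)`;
  return { totalTools: total, commandTools, commands: commands.slice(-MAX_COMMANDS), summary };
}

// Hypothetical recorded history: one file read, three shell calls.
const recordedCalls: ToolCallRecord[] = [
  { tool: "read", args: '{"path":"src/index.ts"}' },
  { tool: "bash", args: '{"command":"npm test -- --grep foo"}' },
  { tool: "Bash", args: '{"command":"cd pkg\\nnpm run build"}' },
  { tool: "bash", args: '{"command":"echo trunc' }, // preview cut off mid-JSON
];

const traced = traceSketch(recordedCalls);
// traced.summary is "cmd=4 (3 bash)"
// traced.commands[1] is "cd pkg ⏎ npm run build"
// traced.commands[2] is the raw, unparsed preview: {"command":"echo trunc
```

Two things stand out from the sample. The `cmd=` figure in the summary counts all recorded tool calls, not only shell commands, and the "bash" count includes every recognized command tool. A preview that fails to parse as JSON is shown as raw text, so a truncated entry appears in the trace with its JSON wrapper intact.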
@@ -14,6 +14,10 @@ export interface ParsedPiJsonOutput {
14
14
  usage?: ParsedPiUsage;
15
15
  /** Unified patches extracted from tool_result events (edit tool patch field) */
16
16
  patches?: string[];
17
+ /** Model/provider error messages extracted from message_end events (e.g.
18
+ * "429 ... overloaded"). Used to detect runs that exited 0 but produced
19
+ * nothing because the model was rate-limited — see task-runner 429 fix. */
20
+ errorMessages?: string[];
17
21
  }
18
22
 
19
23
  function asRecord(value: unknown): Record<string, unknown> | undefined {
@@ -90,6 +94,7 @@ export function parsePiJsonOutput(stdout: string): ParsedPiJsonOutput {
90
94
  let jsonEvents = 0;
91
95
  const textEvents: string[] = [];
92
96
  const patches: string[] = [];
97
+ const errorMessages: string[] = [];
93
98
  let usage: ParsedPiUsage | undefined;
94
99
  for (const line of stdout.split("\n")) {
95
100
  const trimmed = line.trim();
@@ -104,6 +109,9 @@ export function parsePiJsonOutput(stdout: string): ParsedPiJsonOutput {
104
109
  textEvents.push(...extractText(event));
105
110
  // Extract unified patches from tool_result events
106
111
  extractPatch(event, patches);
112
+ // Extract provider/model error messages from message_end events (429 fix).
113
+ const errMsg = extractErrorMessage(event);
114
+ if (errMsg) errorMessages.push(errMsg);
107
115
  const eventUsage = extractUsage(event);
108
116
  if (eventUsage) usage = mergeUsage(usage ?? {}, eventUsage);
109
117
  }
@@ -113,9 +121,24 @@ export function parsePiJsonOutput(stdout: string): ParsedPiJsonOutput {
113
121
  finalText: textEvents.length > 0 ? textEvents[textEvents.length - 1] : undefined,
114
122
  usage,
115
123
  patches: patches.length > 0 ? patches : undefined,
124
+ errorMessages: errorMessages.length > 0 ? errorMessages : undefined,
116
125
  };
117
126
  }
118
127
 
128
+ /**
129
+ * Pull the provider/model error message out of a `message_end` event. The shape
130
+ * is `{type:"message_end", message:{role:"assistant", content:[], errorMessage:"429 ...", stopReason:"error"}}`.
131
+ * Returns undefined for events without an errorMessage.
132
+ */
133
+ function extractErrorMessage(event: unknown): string | undefined {
134
+ const obj = asRecord(event);
135
+ if (!obj) return undefined;
136
+ // message_end events carry the error on the nested message object.
137
+ const message = asRecord(obj.message) ?? obj;
138
+ const errorMessage = message.errorMessage;
139
+ return typeof errorMessage === "string" && errorMessage.trim() ? errorMessage.trim() : undefined;
140
+ }
141
+
119
142
  /**
120
143
  * Extract unified patches from a tool_result event.
121
144
  * pi's edit tool now includes a `patch` field (standard unified diff format).
@@ -333,6 +333,13 @@ export function renderSkillInstructions(
333
333
  ? `Description: ${description}${confidenceNote}`
334
334
  : undefined,
335
335
  `Source: ${source}`,
336
+ // Path: pointer to the skill directory so the agent can deterministically
337
+ // `ls <Path>/references/` and `read` a co-located reference corpus.
338
+ // Without this, skills that defer to a local corpus (the Agent Skills
339
+ // spec "small instruction + large local reference" pattern, e.g.
340
+ // effective-html's `references/html-effectiveness/`) leave the agent
341
+ // guessing the skill dir. No behavior change for corpus-less skills.
342
+ `Path: ${path.dirname(loaded.path)}`,
336
343
  ]
337
344
  .filter(Boolean)
338
345
  .join("\n");
@@ -83,6 +83,7 @@ import {
83
83
  progressEventSummary,
84
84
  shouldFlushProgressEvent,
85
85
  } from "./task-runner/progress.ts";
86
+ import { extractCommandTrace } from "./command-trace.ts";
86
87
  import {
87
88
  checkpointTask,
88
89
  persistSingleTaskUpdate,
@@ -729,6 +730,18 @@ export async function runTeamTask(
729
730
  stderr: childResult.stderr,
730
731
  }).message;
731
732
  }
733
+ // 429/rate-limit fix (PI_CREW_TOOLING_429_NOTE.md): a worker can exit
734
+ // code 0 with NO hard error, but the transcript is full of
735
+ // `message_end` events with `errorMessage: "429 ... overloaded"` and
736
+ // empty content. The model never produced a tool call, so the worker
737
+ // "completed" without doing anything. Detect this: if no error was set
738
+ // above AND the parsed output carries a retryable model-failure message
739
+ // AND there is no real output text, surface it as an error so the
740
+ // model-fallback chain can retry on another model.
741
+ if (!error && parsedOutput) {
742
+ const rateLimitErr = detectRetryableModelFailureFromOutput(parsedOutput);
743
+ if (rateLimitErr) error = rateLimitErr;
744
+ }
732
745
  persistHeartbeat(true);
733
746
  persistChildProgress({ type: "attempt_finished" }, true);
734
747
  const attempt: ModelAttemptSummary = {
@@ -1206,12 +1219,16 @@ export async function runTeamTask(
1206
1219
 
1207
1220
  // Emit task completion hooks (100% reliable, fire-and-forget)
1208
1221
  const hookType = task.status === "completed" ? "task_completed" : task.status === "failed" ? "task_failed" : "task_started";
1222
+ // T10: attach the VERBATIM command trace (mechanically derived from
1223
+ // recorded tool-call history, never from the worker's self-report) so
1224
+ // event viewers + the orchestrator see exactly which commands ran.
1225
+ const commandTrace = extractCommandTrace(task.agentProgress?.recentTools);
1209
1226
  crewHooks.emit({
1210
1227
  type: hookType,
1211
1228
  timestamp: task.finishedAt ?? new Date().toISOString(),
1212
1229
  runId: manifest.runId,
1213
1230
  taskId: task.id,
1214
- data: { status: task.status, role: task.role, error: task.error, exitCode: task.exitCode, usage: task.usage },
1231
+ data: { status: task.status, role: task.role, error: task.error, exitCode: task.exitCode, usage: task.usage, commandTrace },
1215
1232
  });
1216
1233
 
1217
1234
  const packetArtifact = writeArtifact(manifest.artifactsRoot, {
@@ -1332,3 +1349,36 @@ async function resolveTaskScopeModelsPatterns(cwd: string): Promise<string[]> {
1332
1349
  if (!scopeModels) return [];
1333
1350
  return readEnabledModelsPatterns(cwd);
1334
1351
  }
1352
+
1353
+ /**
1354
+ * 429/rate-limit detection (PI_CREW_TOOLING_429_NOTE.md).
1355
+ *
1356
+ * A worker can exit code 0 with no hard error, yet the transcript is full of
1357
+ * `message_end` events carrying `errorMessage: "429 ... overloaded"` (or any
1358
+ * retryable model-failure pattern) and empty content arrays. The model never
1359
+ * produced a tool call, so the worker "completed" without doing anything.
1360
+ *
1361
+ * This helper inspects a ParsedPiJsonOutput and, if the run produced only
1362
+ * retryable model-failure messages AND no real output text (no finalText, no
1363
+ * text events, no patches), returns a surfaced error string so the
1364
+ * model-fallback chain (isRetryableModelFailure) can retry on another model.
1365
+ * Returns undefined when the run has real output (the 429s were recovered from)
1366
+ * or when there are no retryable error messages.
1367
+ */
1368
+ export function detectRetryableModelFailureFromOutput(parsed: ParsedPiJsonOutput): string | undefined {
1369
+ const messages = parsed.errorMessages;
1370
+ if (!messages || messages.length === 0) return undefined;
1371
+ // Find the first retryable model-failure message (429 / rate-limit / overloaded / 5xx / ...).
1372
+ const retryable = messages.find((m) => isRetryableModelFailure(m));
1373
+ if (!retryable) return undefined;
1374
+ // Did the run actually produce real output despite the transient errors?
1375
+ // If finalText / textEvents / patches exist, the model recovered and we
1376
+ // should NOT mark the run as failed — only flag it when the worker yielded
1377
+ // nothing (the 429-only case from the bug report).
1378
+ const hasRealOutput =
1379
+ (parsed.finalText?.trim().length ?? 0) > 0 ||
1380
+ parsed.textEvents.some((t) => t.trim().length > 0) ||
1381
+ (parsed.patches?.length ?? 0) > 0;
1382
+ if (hasRealOutput) return undefined;
1383
+ return `Model returned only retryable errors and no output: ${retryable}`;
1384
+ }
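The 429 path spans two files in this diff: `extractErrorMessage` collects `errorMessage` strings from `message_end` events, and `detectRetryableModelFailureFromOutput` turns them into a task error only when the run produced no real output. The sketch below wires both steps together under stated assumptions. `looksRetryable` is a hypothetical stand-in, since the real `isRetryableModelFailure` is not shown in this diff, and `ParsedOutputSketch`, the `Sketch`-suffixed functions, and the sample event text are illustrative names and data only.

```typescript
// Minimal subset of the parsed-output shape used by the detection logic.
interface ParsedOutputSketch {
  textEvents: string[];
  finalText?: string;
  patches?: string[];
  errorMessages?: string[];
}

// Hypothetical stand-in for isRetryableModelFailure (not shown in the diff).
function looksRetryable(message: string): boolean {
  return /\b429\b|rate.?limit|overloaded/i.test(message);
}

// Read errorMessage from the nested message object, else from the event itself.
function extractErrorMessageSketch(event: unknown): string | undefined {
  if (typeof event !== "object" || event === null) return undefined;
  const obj = event as Record<string, unknown>;
  const nested = obj.message;
  const holder =
    typeof nested === "object" && nested !== null ? (nested as Record<string, unknown>) : obj;
  const msg = holder.errorMessage;
  return typeof msg === "string" && msg.trim() ? msg.trim() : undefined;
}

// Flag the run only when a retryable error exists AND nothing real was produced.
function detectSketch(parsed: ParsedOutputSketch): string | undefined {
  const retryable = parsed.errorMessages?.find((m) => looksRetryable(m));
  if (!retryable) return undefined;
  const hasRealOutput =
    (parsed.finalText?.trim().length ?? 0) > 0 ||
    parsed.textEvents.some((t) => t.trim().length > 0) ||
    (parsed.patches?.length ?? 0) > 0;
  return hasRealOutput
    ? undefined
    : `Model returned only retryable errors and no output: ${retryable}`;
}

// Illustrative event in the shape described by the extractErrorMessage doc comment.
const sampleEvent = {
  type: "message_end",
  message: { role: "assistant", content: [], errorMessage: " 429 overloaded ", stopReason: "error" },
};
const extracted = extractErrorMessageSketch(sampleEvent); // "429 overloaded"

// Exit code 0, only rate-limit errors, no text or patches: surfaced as an error.
const silentFailure = detectSketch({ textEvents: [], errorMessages: [extracted ?? ""] });

// Same error, but the model recovered and produced text: not flagged.
const recoveredRun = detectSketch({
  textEvents: ["done"],
  finalText: "done",
  errorMessages: [extracted ?? ""],
});
```

The design choice worth noting is the output check: a run that hit transient 429s but still produced text or patches is left alone, so only the "completed without doing anything" case is handed to the model-fallback chain.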