pi-crew 0.8.10 → 0.8.12

package/CHANGELOG.md CHANGED
@@ -1,5 +1,188 @@
1
1
  # Changelog
2
2
 
3
+ ## [0.8.12] — `team action=cleanup` now reverses `init` (Issue #35) (2026-06-17)
4
+
5
+ `team action=cleanup` gained a **project-level mode** that reverses what
6
+ `team action=init` writes. This closes the legitimate complaint in
7
+ [Issue #35](https://github.com/baphuongna/pi-crew/issues/35): pi-crew injects
8
+ a guidance block into `AGENTS.md` on `init`, but `pi uninstall` has no
9
+ extension hook to remove it — so the block (and `.crew/`) were left behind.
10
+
11
+ ### New `cleanup` modes
12
+
13
+ | Call | What it does |
+ |---|---|
+ | `team action=cleanup runId=<id>` | Per-run worktree cleanup (existing behavior, unchanged) |
+ | `team action=cleanup` (no runId) | **NEW**: removes the AGENTS.md guidance block |
+ | `team action=cleanup force=true` | NEW: also removes the `.crew/` state directory |
+ | `team action=cleanup dryRun=true` | NEW: preview without writing |
19
+
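The mode table can be read as a small decision function. The sketch below is only an illustration of that reading: `CleanupParams`, `CleanupPlan`, and `planCleanup` are hypothetical names, not part of pi-crew, and the real `TeamToolParamsValue` carries more fields than the three shown.

```typescript
// Hypothetical parameter shape (assumption): only the fields the table mentions.
interface CleanupParams {
  runId?: string;
  force?: boolean;
  dryRun?: boolean;
}

// Hypothetical description of what a cleanup call would do.
type CleanupPlan =
  | { mode: "run"; runId: string }
  | { mode: "project"; removeGuidance: true; removeCrewDir: boolean; write: boolean };

// runId selects per-run cleanup. Without it the call is project-level:
// the guidance block is always targeted, force adds .crew/ removal,
// and dryRun turns the whole call into a preview (write = false).
function planCleanup(params: CleanupParams): CleanupPlan {
  if (params.runId) return { mode: "run", runId: params.runId };
  return {
    mode: "project",
    removeGuidance: true,
    removeCrewDir: params.force === true,
    write: params.dryRun !== true,
  };
}
```

Under these assumptions, `planCleanup({ force: true, dryRun: true })` describes a preview of both removals without writing anything.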
20
+ ### Safety guarantees
21
+
22
+ - The AGENTS.md guidance block is **marker-delimited**
23
+ (`<!-- PI-CREW:GUIDANCE:START/END -->`), so `removeGuidance` removes **only**
24
+ that block — user content is never touched (pinned by a test).
25
+ - `.crew/` removal requires explicit `force=true` (irreversible — holds run
26
+ history, artifacts, worktrees). Default preserves it.
27
+ - A `realpathSync` + basename guard refuses to `rmSync` anything that isn't a
28
+ `.crew` dir, so a crafted cwd can't trick us into deleting an arbitrary path.
29
+ - The user-scope dir (`~/.pi/agent/extensions/pi-crew/`) is owned by
30
+ `pi uninstall` and is never touched by `team action=cleanup`.
31
+
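To make the marker-delimited guarantee concrete, here is a minimal sketch of removing only the text between the two markers. The marker strings are the ones quoted in this changelog; `stripGuidanceBlock` is a hypothetical stand-in and is not claimed to match the real `removeGuidance` in `src/config/markers.ts`, whose implementation is not shown in this diff.

```typescript
const START = "<!-- PI-CREW:GUIDANCE:START -->";
const END = "<!-- PI-CREW:GUIDANCE:END -->";

// Removes the span from the START marker through the END marker, inclusive.
// Everything outside the markers is returned unchanged.
function stripGuidanceBlock(content: string): { content: string; modified: boolean } {
  const startIdx = content.indexOf(START);
  const endIdx = content.indexOf(END);
  // Missing or out-of-order markers: leave the file alone.
  if (startIdx === -1 || endIdx === -1 || endIdx <= startIdx) {
    return { content, modified: false };
  }
  const before = content.slice(0, startIdx);
  const after = content.slice(endIdx + END.length);
  return { content: before + after, modified: true };
}
```

Because a second call finds no markers, the operation is idempotent in this sketch, which matches the idempotency case listed among the new unit tests.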
32
+ ### Files
33
+
34
+ - `src/extension/team-tool/lifecycle-actions.ts` — `handleCleanup` dispatcher
35
+ + new `handleProjectCleanup` (no-runId path). Intent policy now checked once
36
+ in the dispatcher (applies to both modes). Per-run path preserved verbatim.
37
+ - `src/extension/team-tool-types.ts` — `TeamToolDetails.scope?`.
38
+ - `README.md` — new **Uninstall** section documenting the full flow.
39
+ - `test/unit/cleanup-project-mode.test.ts` — NEW, 9 tests (removal, user-content
40
+ preservation, idempotency, force-gating, dry-run, scope rejection, runId
41
+ routing).
42
+ - `test/unit/team-tool-dispatch.test.ts` — updated the no-runId test to the
43
+ new contract (project cleanup, not error).
44
+
45
+ typecheck clean; full suite 2964/0.
46
+
47
+ ## [0.8.11] — Split-scope install fix + transient-provider fallback (2026-06-17)
48
+
49
+ Bundle of two independent fixes that were triaged from real user reports on
50
+ 2026-06-17. Both are robustness fixes for failure modes that previously
51
+ killed team runs silently.
52
+
53
+ ### 1. `Cannot find module '@earendil-works/pi-coding-agent'` on Windows / global installs
54
+
55
+ **Symptom:** every `team` action (run / parallel / plan) crashed ~1 minute
56
+ after spawn, leaving all tasks permanently `queued`. The detached
57
+ background team-runner child threw:
58
+ ```
+ Error: Cannot find module '@earendil-works/pi-coding-agent'
+ Require stack:
+ - .../.pi/agent/npm/node_modules/pi-crew/src/runtime/skill-instructions.ts
+ ```
63
+
64
+ **Root cause:** pi-crew (an extension) is installed under
65
+ `~/.pi/agent/npm/node_modules/<ext>/`, but pi itself (the
66
+ `@earendil-works/pi-coding-agent` package extensions import from) lives in a
67
+ **separate** node_modules tree (nvm / `%APPDATA%\npm` / Volta / fnm /
68
+ pnpm-global). Node's resolver only walks UP ancestor `node_modules`, so a
69
+ static `import { getAgentDir } from "@earendil-works/pi-coding-agent"` in a
70
+ file loaded by the spawned child crashes. This is the **default** layout for
71
+ anyone who installs pi-crew via `pi install` — not a user misconfiguration.
72
+
73
+ **Additional constraint:** pi-coding-agent ships as **ESM-only**
74
+ (`type:module`, exports map with only an `import` condition). CJS
75
+ `createRequire(dir)(name)` / `require.resolve("<pkg>/package.json")` both
76
+ fail with `ERR_PACKAGE_PATH_NOT_EXPORTED` under node AND jiti/tsx (verified).
77
+ The ONLY working load mechanism is a dynamic `import()` of the resolved ESM
78
+ entry file URL.
79
+
80
+ **Fix — NEW `src/runtime/peer-dep.ts`:**
81
+ - `resolvePeerDep()` (sync): walks `node_modules` **manually** (bypasses the
82
+ restrictive exports map) across 6 strategies — env hint
83
+ (`PI_CREW_PEER_DEP_DIR`), this file, `process.argv[1]`, the node binary's
84
+ global node_modules (covers nvm/Volta/fnm), `npm root -g`, and
85
+ `%APPDATA%\npm`. Memoized.
86
+ - `primePeerDep()` (async): dynamic `import(fileURL)` the resolved ESM entry,
87
+ cache the module namespace. Memoized + retryable on failure.
88
+ - `getAgentDir()` (sync): reads the REAL fork-aware `getAgentDir` from the
89
+ primed cache; falls back to a computed default (`~/.pi/agent`, respecting
90
+ `PI_CODING_AGENT_DIR`) if not primed — **NEVER throws**.
91
+
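The manual `node_modules` walk described above might look roughly like the following. This is a sketch under stated assumptions, not the code in `peer-dep.ts`: `ancestors`, `candidateDirs`, and `resolveFirst` are hypothetical helpers, paths are POSIX-style for brevity, and the starting directories stand in for whatever the six strategies produce. Checking for `package.json` on disk, instead of asking the module resolver, is what sidesteps the exports map.

```typescript
const PEER = "@earendil-works/pi-coding-agent";

// Ancestors of a POSIX-style absolute path, nearest first, ending at "/".
function ancestors(dir: string): string[] {
  const parts = dir.split("/").filter((p) => p.length > 0);
  const out: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    out.push("/" + parts.slice(0, i).join("/"));
  }
  return out;
}

// Ordered, de-duplicated candidate package directories.
// An env hint (if any) is tried first, then each start dir's ancestor chain.
function candidateDirs(startDirs: string[], envHint?: string): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  const push = (p: string): void => {
    if (!seen.has(p)) {
      seen.add(p);
      out.push(p);
    }
  };
  if (envHint) push(envHint);
  for (const start of startDirs) {
    for (const dir of ancestors(start)) {
      push((dir === "/" ? "" : dir) + "/node_modules/" + PEER);
    }
  }
  return out;
}

// `exists` is injected so the sketch stays independent of the file system.
function resolveFirst(candidates: string[], exists: (p: string) => boolean): string | undefined {
  return candidates.find((c) => exists(c + "/package.json"));
}
```

In a real resolver `exists` would presumably be `fs.existsSync`, and the result would be memoized as the changelog notes.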
92
+ **Rewired:**
93
+ - `skill-instructions.ts`, `discover-skills.ts` — static peer-dep import →
94
+ lazy `getAgentDir()` from `peer-dep.ts` (this is the crash site).
95
+ - `background-runner.ts` — `primePeerDep()` before importing `team-runner`
96
+ (child process).
97
+ - `register.ts` — `primePeerDep()` at extension entry (main process).
98
+ - `async-runner.ts` — propagate `PI_CREW_PEER_DEP_DIR` to children so they
99
+ skip the ~200ms `npm root -g` probe.
100
+
101
+ **Tests:** NEW `test/unit/peer-dep-resolver.test.ts` (9 cases) — env-hint
102
+ resolution, manual node_modules walk past exports map, ESM dynamic-import
103
+ loading, memoization, graceful fallback, `PI_CODING_AGENT_DIR` override,
104
+ loadable fileURL under the child's loader.
105
+
106
+ ### 2. `500 api_error "unknown error, 999 (1000)"` aborted the run instead of falling back
107
+
108
+ **Symptom:** when the model provider went hard-down with
109
+ `500 {"type":"error","error":{"type":"api_error","message":"unknown
110
+ error, 999 (1000)"}}`, the run died even when the user had configured a
111
+ fallback model that would have worked.
112
+
113
+ **Root cause:** pi has two safety layers. (1) pi-core provider-retry retries
114
+ 3× with exponential backoff — its regex already matches `500`. (2) pi-crew's
115
+ `model-fallback` layer is the last safety net: when all 3 retries fail, it
116
+ tries the next configured model. But `isRetryableModelFailure`'s pattern
117
+ list covered 429 / rate-limit / 502-504 / overloaded / timeout and **MISSED**
118
+ generic `500`, `api_error`, `unknown error`, and internal/server-error
119
+ phrasings. So a transient provider outage was retried 3× then **aborted**
120
+ instead of failing over.
121
+
122
+ **Fix:** added to `RETRYABLE_MODEL_FAILURE_PATTERNS` —
123
+ `\b500\b`, `\b501\b`, `api_error`, `unknown error`,
124
+ `internal(?:_server)?[ _]error`, `server error`, `bad gateway`.
125
+
126
+ `NON_RETRYABLE` (auth/billing/key) still wins — checked first in
127
+ `isRetryableModelFailure` — so a transient-looking 500 wrapping an auth
128
+ failure won't loop the chain.
129
+
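The ordering rule (non-retryable checked first, then the retryable patterns) can be illustrated with a short classifier. The retryable patterns below are the ones listed in this entry; the non-retryable patterns, the case-insensitive flags, and the treatment of empty input as non-retryable are assumptions made for the sketch, since the real lists and edge-case behavior are not shown in this diff.

```typescript
// Patterns quoted from the changelog entry (flags are an assumption).
const RETRYABLE: RegExp[] = [
  /\b500\b/,
  /\b501\b/,
  /api_error/i,
  /unknown error/i,
  /internal(?:_server)?[ _]error/i,
  /server error/i,
  /bad gateway/i,
];

// Assumption: illustrative auth/billing/key patterns only.
const NON_RETRYABLE: RegExp[] = [/\b401\b/, /invalid api key/i, /billing/i, /unauthori[sz]ed/i];

function isRetryable(message: string | undefined): boolean {
  if (!message) return false; // assumption: nothing to classify
  // Non-retryable wins, so a 500 that wraps an auth failure does not loop the chain.
  if (NON_RETRYABLE.some((re) => re.test(message))) return false;
  return RETRYABLE.some((re) => re.test(message));
}
```

With these lists, the reported `500 ... api_error ... unknown error, 999 (1000)` message classifies as retryable, while `500 invalid api key` does not.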
130
+ **Tests:** 4 regression tests in `test/unit/model-fallback.test.ts` covering
131
+ the exact reported error, generic 5xx, auth-still-blocked, and undefined/empty.
132
+
133
+ ### Verification
134
+
135
+ typecheck clean; peer-dep suite 9/9; model-* suite 57/57; full suite 0 real
136
+ failures (1 known `result-watcher` fs.watch 10s timeout flake passes 7/7 in
137
+ isolation — unrelated).
138
+
139
+ ## [0.8.10] — Pre-warm 3 repro-observed cold-start crash-variant modules (2026-06-17)
140
+
141
+ The post-v0.8.9-restart 6-subagent repro surfaced 3 cold-start crash variants
142
+ in one batch: `existsSync` (peer-dep, latched v0.8.1 + warmup v0.8.6),
143
+ `effectiveRunConfig` (`team-tool/config-patch.ts`), `CREW_README`
144
+ (`state/crew-init.ts`, latched v0.8.9). v0.8.6's warmup covered `team-tool.ts`
145
+ transitively but not these specific modules explicitly — static-graph
146
+ reachability isn't reliable under tsx/jiti interop + concurrent fanout (the
147
+ `handleRun` latch serializes the CALL but not module-body instantiation of
148
+ `run.ts`'s static deps).
149
+
150
+ **Fix:** add the 3 repro-observed modules to `HOT_MODULE_SPECIFIERS` so their
151
+ module bodies instantiate at single-threaded registration:
152
+ `team-tool/run.ts`, `team-tool/config-patch.ts`, `workflows/validate-workflow.ts`.
153
+
154
+ Repro verification: 6/6 subagents clean (was 1/6) under loaded code.
155
+
156
+ ## [0.8.9] — crew-init dynamic-import latch (kills CREW_README TDZ race) (2026-06-17)
157
+
158
+ Module-scoped `loadCrewInit()` latch in `team-tool/run.ts` — concurrent `team`
159
+ tool calls share ONE in-flight import promise. Added `crew-init.ts` to
160
+ `HOT_MODULE_SPECIFIERS`. Targets the `CREW_README` TDZ variant observed in the
161
+ post-v0.8.8 repro.
162
+
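A latch of this kind is essentially a memoized promise. The sketch below shows the general pattern only; `createLatch` is a hypothetical helper, and the retry-after-failure behavior is an assumption borrowed from the `primePeerDep()` description, not something this entry states about `loadCrewInit()`.

```typescript
// Wraps an async loader so that concurrent callers share one in-flight promise.
function createLatch<T>(load: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | undefined;
  return () => {
    if (!inFlight) {
      const p: Promise<T> = load();
      inFlight = p;
      // On failure, clear the latch so a later call can try again.
      // Callers still observe the rejection through `p` itself.
      p.catch(() => {
        if (inFlight === p) inFlight = undefined;
      });
    }
    return inFlight;
  };
}
```

A hypothetical usage would be `const loadCrewInit = createLatch(() => import("../state/crew-init.ts"))`, where the import path is illustrative.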
163
+ ## [0.8.8] — Cross-project leak cwd-scope barrier (2026-06-17)
164
+
165
+ `collectInFlightRuns` filtered by STATUS only (queued/planning/running), not
166
+ by project scope. Multiple Pi sessions in the same project shared
167
+ `.crew/state/runs/`, so Session B's compaction picked up Session A's runs in
168
+ OTHER projects and injected them into Session B's continuation prompt.
169
+
170
+ The v0.8.8 (4bd6f5b) `ownerSessionId` filter was **unreliable** —
171
+ `ctx.sessionId` is absent on pi 0.79.6 `ExtensionContext`.
172
+
173
+ **Fix:** `isInProjectScope(run, queryCwd)` in `collectInFlightRuns` — keeps a
174
+ run only if `findRepoRoot(run.cwd) === findRepoRoot(queryCwd)`. Reliable,
175
+ version-independent. Filter at the consumption site, NOT in
176
+ `listRecentRuns`/`collectActiveRuns` (the cross-project dashboard view stays
177
+ unfiltered — 2 run-index tests pin that). Empirically verified: ambient
178
+ status shows only current-project runs, zero foreign-project bleed.
179
+
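The scope check reduces to comparing repository roots. In the sketch below, `makeFindRepoRoot` is a hypothetical stand-in for the real `findRepoRoot` (it matches against a fixed list of known roots and does not touch the file system), the extra `findRepoRoot` parameter exists only to make the example self-contained, and excluding runs whose root cannot be determined is an assumption.

```typescript
interface RunLike {
  runId: string;
  cwd: string;
}

type FindRepoRoot = (dir: string) => string | undefined;

// Hypothetical resolver: the longest known root that contains `dir`.
function makeFindRepoRoot(knownRoots: string[]): FindRepoRoot {
  const sorted = knownRoots.slice().sort((a, b) => b.length - a.length);
  return (dir) => sorted.find((root) => dir === root || dir.startsWith(root + "/"));
}

// Keep a run only if it resolves to the same repo root as the querying cwd.
function isInProjectScope(run: RunLike, queryCwd: string, findRepoRoot: FindRepoRoot): boolean {
  const runRoot = findRepoRoot(run.cwd);
  return runRoot !== undefined && runRoot === findRepoRoot(queryCwd);
}
```

Filtering a run list is then `runs.filter((r) => isInProjectScope(r, queryCwd, findRepoRoot))` at the consumption site, leaving the unfiltered listing available for a cross-project view.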
180
+ ## [0.8.7] — Doctor runtime-warmup status (2026-06-17)
181
+
182
+ `getRuntimeWarmupStatus()` diagnostic + a "Runtime warmup" section in
183
+ `team doctor` showing started/completed/duration/error. "Not started" is NOT
184
+ a doctor error (normal for direct unit-test calls).
185
+
3
186
  ## [0.8.6] — General cold-start race fix (runtime module-graph warmup) (2026-06-17)
4
187
 
5
188
  Fixes the `validateWorkflowForTeam` cold-start crash that v0.8.1 did NOT
package/README.md CHANGED
@@ -9,50 +9,48 @@ npm: pi-crew
9
9
  repo: https://github.com/baphuongna/pi-crew
10
10
  ```
11
11
 
12
- **v0.6.4**: See [CHANGELOG.md](CHANGELOG.md).
13
-
14
- ### Highlights (v0.6.4 → v0.7.0)
15
-
16
- This release implements **Phase 0 + Phase 1** of the long-term roadmap (synthesized from a 10-round research process), plus the **single-agent cliff hedge**. Principle: *build trust and cliff-resilience, stay lean, delete before adding.*
17
-
18
- - **🛡️ Compaction resilience (O10)** — the #1 user pain ("after auto-compact, the task stops midway") is fixed. In-flight crew runs are detected, a resume directive is injected into the compaction summary, and tasks re-attach after compaction.
19
- - **💰 Cost visibility (O1)** — `team summary <runId>` now shows a full cost report with per-role attribution and token breakdown (`$0.77 — executor 79%, reviewer 14%...`).
20
- - **✋ Plan-level HITL for any workflow (O5)** — set `runtime.requirePlanApproval = true` to gate any workflow at the plan→execute boundary; approve via `team api op=approve-plan`.
21
- - **🧠 Cross-run memory (O4)** — `.crew/knowledge.md` is auto-injected into every run's system prompt. pi-crew remembers project context across runs.
22
- - **🎯 Single-agent cliff hedge** — `team plan singleAgent=true` composes any workflow into one sequential prompt, so pi-crew's mission survives even if multi-agent is obsoleted by large-context models.
23
- - **🧹 2,335 LOC of dead code removed** + **Pi-api seam** centralizing the coupling surface.
24
-
25
- ### Highlights (v0.6.3 → v0.6.4)
26
-
27
- - **Visually rich tool rendering** — `team` and `Agent` tool calls now render as framed cards in the Pi TUI with box-drawing borders, colored status badges, and structured layouts
28
- - **Merged call+result into ONE connected frame** — the call header and result body now form a single seamless frame instead of two disconnected boxes
29
- - **Animated live progress bar during runs** — real-time `████░░░░ N/M` task progress with elapsed time, rendered DURING the run; indeterminate "starting" phase uses an animated scanning bar
30
- - **Compact completion summary** — collapsed cards show `✓ crew run 3/3 done · 1m2s · 26k tok · $0.068` with expand hint and per-agent briefs
31
- - **Critical crash fix on session resume** — `renderCall` was returning a `string` instead of a `Text` component, causing `TypeError: child.render is not a function` when Pi re-rendered stored tool calls
32
- - **Disabled brief tool overrides** — reverted the experimental brief mode that replaced Pi's superior native renderers (syntax highlighting, diff views, full content)
33
- - **Flaky test fix** — `AnimatedMascot` timing tests made CI-load-robust via polling loops
34
- - **CI green** — 0 failures on Ubuntu, macOS, and Windows
35
-
36
- ### Highlights (v0.6.2 → v0.6.3)
37
-
38
- - **137 commits** since v0.6.1 — 200 files changed (+16,955 / −2,057 lines)
39
- - **4,792 tests**, 506 test files — **0 failures** across the entire suite
40
- - **Cross-platform CI green** — 0 failures on Ubuntu, macOS, and Windows
41
- - **366 source files**, ~70K lines of TypeScript
42
- - **Worktree precondition validation** — friendly errors instead of crashes when cwd is not a git repo or repo is dirty
43
- - **Cross-platform path handling** — `canonicalizePath` with `realpathSync.native` for Windows short-name/long-name aliasing; macOS symlink resolution
44
- - **Scheduled job lifecycle** — spawned runs are tracked, cancelling a job kills its runs
45
- - **Heartbeat false-positive fix** — PID liveness gate prevents dead detection during long LLM responses
46
- - **ENOENT crash fix** — prune/forget race no longer crashes pi when persisting to deleted runs
47
- - **Pipe buffer deadlock fix** — test runner no longer deadlocks when OS pipe buffer fills
48
- - **Plugin registry** — extensible framework context injection for Next.js, Vite, Vitest
49
- - **Health score system** — penalty-based scoring with time-series snapshots
50
- - **CrewError taxonomy** — E001–E006 structured error codes replacing raw throws
51
- - **Atomic write v2** — fsync + rename pattern for crash-safe state persistence
52
- - **Pre-push review**: 56 unpushed commits reviewed, 1 release blocker found and fixed
53
- - **Security**: sandbox constructor escape strengthened; env-filter provider key handling fixed
54
- - **State-store race fix** — manifest/tasks mtime false positive eliminated
55
- - **Orphan worker/temp cleanup** — 4-layer defense with session-scoped tracking
12
+ **v0.8.11**: See [CHANGELOG.md](CHANGELOG.md).
13
+
14
+ ### Highlights (v0.6.4 → v0.8.11)
15
+
16
+ A long arc of **trust, cliff-resilience, and robustness** work. Principle: *build
17
+ trust and cliff-resilience, stay lean, delete before adding.*
18
+
19
+ #### v0.8.x — hardening & reliability (2026-06-17)
20
+ - **🛠️ Split-scope install fix (v0.8.11)** — `team` runs no longer crash with
21
+ `Cannot find module '@earendil-works/pi-coding-agent'` when pi-crew and pi
22
+ live in separate node_modules trees (the default for `pi install`). New
23
+ `src/runtime/peer-dep.ts` resolves the ESM-only peer dep across 6 strategies.
24
+ - **🔄 Model fallback on transient 5xx (v0.8.11)** — a hard-down provider
25
+ (`500 api_error "unknown error"`) now triggers the configured fallback
26
+ model instead of aborting the run. `isRetryableModelFailure` extended.
27
+ - **🧊 Cold-start race eliminated (v0.8.6 → v0.8.10)** — under tsx, concurrent
28
+ subagent spawns raced module instantiation (`existsSync` / `CREW_README` /
29
+ `effectiveRunConfig` / `validateWorkflowForTeam`). Fixed graph-wide: warm at
30
+ registration + gate at spawn boundaries + per-site latches. 6/6 repro clean.
31
+ - **🔒 Cross-project leak fixed (v0.8.8)** — ambient status / compaction no
32
+ longer bleed foreign-project runs into the current session. Cwd-scope
33
+ barrier (`isInProjectScope`), version-independent.
34
+ - **🩺 Doctor runtime-warmup status (v0.8.7)** — `team doctor` shows whether
35
+ the module-graph warmup fired.
36
+ - **🔍 Cold-verifier agent (v0.8.4)** — adversarial cross-check that re-derives
37
+ claims WITHOUT trusting prior analysis, catching confirmation bias.
38
+ - **⚡ Per-write validator (v0.8.5)** — zero-cost `JSON.parse` on every
39
+ `write`/`edit`, appends a `🔴` blocker on malformed files.
40
+ - **🎨 Terminal status (v0.8.3)** — tab title + Ghostty native progress bar.
41
+ - **🧠 Skill confidence revived (v0.8.2)** — `adjustConfidence()` was dead
42
+ code; the effectiveness system now actually learns.
43
+ - **🔧 Tool-restriction unification (v0.8.0)** — single `resolveToolPolicy`
44
+ across both spawn paths.
45
+ - **🎯 F6/F1 interop granularity (v0.7.9)** — 7 skill roots, `.pi/agents/`
46
+ tier, tool wildcards, `excludeExtensions` denylist.
47
+
48
+ #### v0.7.0 — Phase 0 + Phase 1 roadmap
49
+ - **🛡️ Compaction resilience (O10)** — in-flight runs survive auto-compact.
50
+ - **💰 Cost visibility (O1)** — per-role token + cost attribution.
51
+ - **✋ Plan-level HITL (O5)** — `requirePlanApproval` gates any workflow.
52
+ - **🧠 Cross-run memory (O4)** — `.crew/knowledge.md` injected every run.
53
+ - **🎯 Single-agent cliff hedge** — `team plan singleAgent=true`.
56
54
 
57
55
  ---
58
56
 
@@ -99,6 +97,46 @@ pi-crew # after npm install
99
97
  node ./pi-crew/install.mjs # from local clone
100
98
  ```
101
99
 
100
+ > **Split-scope install note (v0.8.11+):** pi installs extensions under
101
+ > `~/.pi/agent/npm/node_modules/<ext>/`, separate from pi's own
102
+ > node_modules tree (nvm / `%APPDATA%\npm` / Volta / fnm). Since v0.8.11
103
+ > pi-crew resolves the `@earendil-works/pi-coding-agent` peer dep robustly
104
+ > across these layouts — no symlink/NODE_PATH workaround needed. If you ever
105
+ > do hit `Cannot find module '@earendil-works/pi-coding-agent'`, set
106
+ > `PI_CREW_PEER_DEP_DIR=<path to the pi-coding-agent package dir>` as a
107
+ > one-line workaround (or install pi-crew in pi's own scope:
108
+ > `npm install -g @earendil-works/pi-crew`).
109
+
110
+ ### Uninstall
111
+
112
+ `pi uninstall npm:pi-crew` removes the package, but pi doesn't fire an
113
+ extension uninstall hook, so two things `team action=init` created are left
114
+ behind: the **marker-delimited guidance block in AGENTS.md** and the **`.crew/`
115
+ runtime state directory** (run history, artifacts, worktrees). Reverse them
116
+ explicitly:
117
+
118
+ ```bash
119
+ # 1. (Optional) Preview what would be removed, without writing:
120
+ team action=cleanup dryRun=true
121
+
122
+ # 2. Remove the AGENTS.md guidance block only (.crew/ preserved):
123
+ team action=cleanup
124
+
125
+ # 3. Remove BOTH the guidance block AND the .crew/ state directory (force):
126
+ team action=cleanup force=true
127
+
128
+ # 4. Finally, remove the package itself:
129
+ pi uninstall npm:pi-crew
130
+ ```
131
+
132
+ The guidance block is wrapped in `<!-- PI-CREW:GUIDANCE:START -->` /
133
+ `<!-- PI-CREW:GUIDANCE:END -->` markers, so `team action=cleanup` removes
134
+ **only** that block — your own AGENTS.md content is never touched. The
135
+ `.crew/` directory is removed **only** with `force=true` (it's irreversible).
136
+ The user-scope dir (`~/.pi/agent/extensions/pi-crew/`) is owned by
137
+ `pi uninstall` and is never touched by `team action=cleanup`.
138
+
139
+
102
140
  ---
103
141
 
104
142
  ## Quick Start
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-crew",
3
- "version": "0.8.10",
3
+ "version": "0.8.12",
4
4
  "description": "Pi extension for coordinated AI teams, workflows, worktrees, and async task orchestration",
5
5
  "author": "baphuongna",
6
6
  "license": "MIT",
@@ -84,6 +84,7 @@ import { runEventBus } from "../ui/run-event-bus.ts";
84
84
  import { createTerminalStatusController, type TerminalStatusController } from "../ui/terminal-status.ts";
85
85
  import { extractPathFromInput, validateWrittenFile, buildValidationBlocker } from "../runtime/per-write-validator.ts";
86
86
  import { startRuntimeWarmup } from "../runtime/runtime-warmup.ts";
87
+ import { primePeerDep } from "../runtime/peer-dep.ts";
87
88
  import { createRunSnapshotCache } from "../ui/run-snapshot-cache.ts";
88
89
  import { closeWatcher } from "../utils/fs-watch.ts";
89
90
  import { RunWatcherRegistry } from "../utils/run-watcher-registry.ts";
@@ -207,6 +208,11 @@ export function registerPiTeams(pi: ExtensionAPI): void {
207
208
  // Warming the graph here + awaiting it at spawn boundaries eliminates the
208
209
  // race window. See src/runtime/runtime-warmup.ts.
209
210
  startRuntimeWarmup();
211
+ // FIX (split-scope install): preload the ESM peer dep so discover-skills /
212
+ // skill-instructions can read the REAL getAgentDir (fork-aware) from cache.
213
+ // Fire-and-forget: getAgentDir() falls back to a safe computed default until
214
+ // this resolves. See src/runtime/peer-dep.ts.
215
+ primePeerDep().catch(() => {});
210
216
  // Deploy bundled themes (crew-dark, crew-dracula, etc.) to ~/.pi/agent/themes/
211
217
  // so Pi's theme loader discovers them. Best-effort, idempotent.
212
218
  deployBundledThemes();
@@ -14,6 +14,7 @@ import { enforceDestructiveIntent, intentFromConfig } from "./intent-policy.ts";
14
14
  import { executeHook, appendHookEvent } from "../../hooks/registry.ts";
15
15
  import { resolveRealContainedPath } from "../../utils/safe-paths.ts";
16
16
  import { projectCrewRoot, userCrewRoot } from "../../utils/paths.ts";
17
+ import { removeGuidance } from "../../config/markers.ts";
17
18
  import * as path from "node:path";
18
19
 
19
20
  export function handleWorktrees(params: TeamToolParamsValue, ctx: TeamContext): PiTeamsToolResult {
@@ -123,12 +124,128 @@ export async function handleForget(params: TeamToolParamsValue, ctx: TeamContext
123
124
  }
124
125
 
125
126
  export async function handleCleanup(params: TeamToolParamsValue, ctx: TeamContext): Promise<PiTeamsToolResult> {
127
+ // Intent policy applies to the cleanup action in BOTH modes (per-run and
128
+ // project-level). Checked once here so handleRunCleanup/handleProjectCleanup
129
+ // can stay focused on their own logic.
126
130
  const intentError = enforceDestructiveIntent("cleanup", params, ctx.config);
127
131
  if (intentError) return intentError;
128
- if (!params.runId) return result("Cleanup requires runId.", { action: "cleanup", status: "error" }, true);
129
- const loaded = loadRunManifestById(ctx.cwd, params.runId); // NOTE: no withRunLock - best-effort only; concurrent writes may cause inconsistency
130
- if (!loaded) return result(`Run '${params.runId}' not found.${RUN_NOT_FOUND_HINT}`, { action: "cleanup", status: "error" }, true);
132
+ // Two cleanup modes:
133
+ // 1. WITH runId → per-run worktree cleanup (existing behavior).
134
+ // 2. WITHOUT runId → PROJECT-LEVEL uninstall cleanup: removes the
135
+ // AGENTS.md guidance block pi-crew injected (`team action=init`) and
136
+ // optionally the `.crew/` state dir. Use this before/after
137
+ // `pi uninstall npm:pi-crew` to leave the project pristine.
138
+ // Issue #35: pi doesn't fire an uninstall hook for extensions, so this
139
+ // mode is the documented way to reverse an init.
140
+ if (params.runId) {
141
+ return handleRunCleanup(params, ctx);
142
+ }
143
+ return handleProjectCleanup(params, ctx);
144
+ }
145
+
146
+ /**
147
+ * Project-level uninstall cleanup (no runId). Reverses `team action=init`:
148
+ * removes the pi-crew guidance block from AGENTS.md (marker-delimited, so
149
+ * user content is untouched) and, with `force: true`, removes the `.crew/`
150
+ * runtime state directory. `dryRun: true` previews without writing.
151
+ *
152
+ * Safety:
153
+ * - `removeGuidance` only touches content between the PI-CREW markers.
154
+ * - `.crew/` removal requires explicit `force: true` (it holds run history,
155
+ * artifacts, and worktrees — irreversible). Default is guidance-only.
156
+ * - The user-scope pi-crew dir (`~/.pi/agent/extensions/pi-crew/`) is NEVER
157
+ * touched here — `pi uninstall` owns that; we only touch project state.
158
+ */
159
+ function handleProjectCleanup(params: TeamToolParamsValue, ctx: TeamContext): PiTeamsToolResult {
160
+ const cwd = ctx.cwd;
161
+ const dryRun = params.dryRun === true;
162
+ const removeState = params.force === true;
163
+ const scope = typeof params.scope === "string" ? params.scope : "project";
164
+ if (scope !== "project") {
165
+ return result(
166
+ `Project cleanup operates on the project only (got scope='${scope}'). ` +
167
+ `User-scope files are owned by 'pi uninstall npm:pi-crew'.`,
168
+ { action: "cleanup", status: "error", scope },
169
+ true,
170
+ );
171
+ }
172
+
173
+ const lines: string[] = ["Project cleanup for pi-crew:"];
174
+
175
+ // 1. Remove the AGENTS.md guidance block (marker-delimited → user content preserved).
176
+ const guidancePath = path.join(cwd, "AGENTS.md");
177
+ const guidanceResult = dryRun
178
+ ? { path: guidancePath, modified: fs.existsSync(guidancePath), added: [], removed: dryRunRemovedIds(guidancePath) }
179
+ : removeGuidance(guidancePath);
180
+ lines.push("AGENTS.md guidance block:");
181
+ if (guidanceResult.modified) {
182
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${guidanceResult.removed.length ? guidanceResult.removed.join(", ") : "(marker section)"}`);
183
+ } else {
184
+ lines.push(" - (no pi-crew marker section found — nothing to do)");
185
+ }
186
+
187
+ // 2. Optionally remove the .crew/ runtime state directory (force: true).
188
+ const crewRoot = projectCrewRoot(cwd);
189
+ lines.push(".crew/ state directory:");
190
+ const crewExists = fs.existsSync(crewRoot);
191
+ if (!crewExists) {
192
+ lines.push(` - (not present at ${crewRoot} — nothing to do)`);
193
+ } else if (!removeState) {
194
+ lines.push(` - present at ${crewRoot} (preserved — use force: true to remove; contains run history/artifacts/worktrees and is irreversible)`);
195
+ } else {
196
+ // SAFETY: realpath + contain-check before rmSync, so a crafted cwd can't
197
+ // trick us into deleting an arbitrary directory.
198
+ let resolved: string;
199
+ try {
200
+ resolved = fs.realpathSync.native(crewRoot);
201
+ } catch {
202
+ lines.push(` - ERROR: could not resolve ${crewRoot} (skipped)`);
203
+ return result(lines.join("\n"), { action: "cleanup", status: "ok", scope }, false);
204
+ }
205
+ if (!resolved.endsWith(path.sep + ".crew") && !resolved.endsWith("/teams") && path.basename(resolved) !== ".crew") {
206
+ lines.push(` - ERROR: refused to remove ${resolved} (does not look like a .crew dir) — skipped`);
207
+ } else {
208
+ if (!dryRun) {
209
+ try {
210
+ fs.rmSync(resolved, { recursive: true, force: true });
211
+ } catch (e) {
212
+ lines.push(` - ERROR removing ${resolved}: ${(e as Error).message}`);
213
+ }
214
+ }
215
+ lines.push(` - ${dryRun ? "would remove" : "removed"}: ${resolved}`);
216
+ }
217
+ }
218
+
219
+ lines.push("");
220
+ lines.push(
221
+ dryRun
222
+ ? "(dry-run preview — no files were changed. Re-run without dryRun to apply.)"
223
+ : "Done. To fully remove pi-crew, also run: pi uninstall npm:pi-crew",
224
+ );
225
+ return result(lines.join("\n"), { action: "cleanup", status: "ok", scope }, false);
226
+ }
227
+
228
+ /** Dry-run helper: read what removeGuidance WOULD remove without writing. */
229
+ function dryRunRemovedIds(guidancePath: string): string[] {
230
+ try {
231
+ if (!fs.existsSync(guidancePath)) return [];
232
+ const content = fs.readFileSync(guidancePath, "utf-8");
233
+ const startIdx = content.indexOf("<!-- PI-CREW:GUIDANCE:START -->");
234
+ const endIdx = content.indexOf("<!-- PI-CREW:GUIDANCE:END -->");
235
+ if (startIdx === -1 || endIdx === -1 || endIdx <= startIdx) return [];
236
+ // Cheap approximation: report the marker section as a unit. Exact block
237
+ // IDs aren't needed for the dry-run summary; the non-dryRun path uses
238
+ // removeGuidance which returns the precise removed IDs.
239
+ return ["pi-crew-overview", "pi-crew-commands"];
240
+ } catch {
241
+ return [];
242
+ }
243
+ }
131
244
 
245
+ /** Per-run worktree cleanup (existing behavior, preserved). */
246
+ async function handleRunCleanup(params: TeamToolParamsValue, ctx: TeamContext): Promise<PiTeamsToolResult> {
247
+ const loaded = loadRunManifestById(ctx.cwd, params.runId!); // NOTE: no withRunLock - best-effort only; concurrent writes may cause inconsistency
248
+ if (!loaded) return result(`Run '${params.runId}' not found.${RUN_NOT_FOUND_HINT}`, { action: "cleanup", status: "error", runId: params.runId }, true);
132
249
  // Ownership check — prevent cross-session worktree cleanup unless force is set
133
250
  const foreignRun = typeof loaded.manifest.ownerSessionId === "string" && loaded.manifest.ownerSessionId !== ctx.sessionId;
134
251
  if (foreignRun && !params.force) return result(`Run ${params.runId} belongs to another session. Use force: true to override.`, { action: "cleanup", status: "error", runId: loaded.manifest.runId }, true);
@@ -10,6 +10,8 @@ export interface TeamToolDetails {
10
10
  resumedIds?: string[];
11
11
  retriedTaskIds?: string[];
12
12
  mailboxIds?: string[];
13
+ /** Resource scope affected by the action (e.g. cleanup: "project"). */
14
+ scope?: string;
13
15
  /** Run metrics for compact display in TUI tool result rendering. */
14
16
  metrics?: { taskCount?: number; completedCount?: number; totalTokens?: number; totalCost?: number; durationMs?: number; consistencyScore?: number };
15
17
  /** Structured data for programmatic consumption (e.g. TUI widgets). */
@@ -6,6 +6,7 @@ import { fileURLToPath, pathToFileURL } from "node:url";
6
6
  import { logInternalError } from "../utils/internal-error.ts";
7
7
  import { appendEvent } from "../state/event-log.ts";
8
8
  import { sanitizeEnvSecrets } from "../utils/env-filter.ts";
9
+ import { resolvePeerDepDir, PEER_DEP_DIR_ENV } from "./peer-dep.ts";
9
10
  import {
10
11
  registerWorker,
11
12
  unregisterWorker,
@@ -202,6 +203,15 @@ export async function spawnBackgroundTeamRun(manifest: TeamRunManifest): Promise
202
203
  // FIX: removed delete workarounds — with explicit allowlist, these vars
203
204
  // are no longer auto-leaked. Matches child-pi.ts.
204
205
 
206
+ // FIX (split-scope install): pass the resolved peer-dep dir to the child so
207
+ // it can resolve @earendil-works/pi-coding-agent WITHOUT the ~200ms
208
+ // `npm root -g` probe. No-op when pi-crew and pi are co-located. See
209
+ // src/runtime/peer-dep.ts.
210
+ const peerDepDir = resolvePeerDepDir();
211
+ const childEnv = peerDepDir
212
+ ? { ...filteredEnv, [PEER_DEP_DIR_ENV]: peerDepDir }
213
+ : filteredEnv;
214
+
205
215
  const loader = resolveTypeScriptLoader();
206
216
  if (!loader) {
207
217
  const message = buildLoaderUnavailableMessage(packageRootFromRuntime());
@@ -227,7 +237,7 @@ export async function spawnBackgroundTeamRun(manifest: TeamRunManifest): Promise
227
237
  detached: true,
228
238
  setsid: true,
229
239
  stdio: ["ignore", "pipe", "pipe"],
230
- env: filteredEnv,
240
+ env: childEnv,
231
241
  windowsHide: true,
232
242
  } as unknown as Parameters<typeof spawn>[2];
233
243
  const child = spawn(process.execPath, command.args, spawnOpts);
@@ -19,6 +19,7 @@ import {
19
19
  } from "../workflows/discover-workflows.ts";
20
20
  // Heavy runtime — lazy-loaded to avoid pulling team-runner into background-runner
21
21
  // at module load time. Only needed when a background run actually starts.
22
+ import { primePeerDep } from "./peer-dep.ts";
22
23
  import type { executeTeamRun as ExecuteTeamRunFn } from "./team-runner.ts";
23
24
  import type { TeamRunManifest, TeamTaskState } from "../state/types.ts";
24
25
 
@@ -27,6 +28,10 @@ async function executeTeamRun(
27
28
  ...args: Parameters<typeof ExecuteTeamRunFn>
28
29
  ): Promise<Awaited<ReturnType<typeof ExecuteTeamRunFn>>> {
29
30
  if (!_cachedExecuteTeamRun) {
31
+ // FIX (split-scope install): prime the ESM peer dep BEFORE team-runner is
32
+ // imported, so its transitive skill-instructions.ts can read getAgentDir()
33
+ // from the primed cache instead of crashing on `Cannot find module`.
34
+ await primePeerDep().catch(() => {});
30
35
  // LAZY: avoid pulling team-runner into background-runner at module load time.
31
36
  const mod = await import("./team-runner.ts");
32
37
  _cachedExecuteTeamRun = mod.executeTeamRun;
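The ordering in this hunk (prime first, swallow any failure, then lazy-import the heavy module once) can be sketched in isolation. Everything below is a stand-in: `primeStub` plays the role of `primePeerDep()`, `loadHeavyModuleStub` plays the role of the dynamic import of `team-runner.ts`, and `callOrder` exists only to make the sequence observable.

```typescript
// Sketch only: all names here are hypothetical stand-ins.
type Runner = (input: string) => Promise<string>;

let cachedRunner: Runner | undefined;
const callOrder: string[] = [];

// Stand-in for primePeerDep(); rejects to simulate a missing peer dep.
async function primeStub(): Promise<void> {
  callOrder.push("prime");
  throw new Error("peer dep not found");
}

// Stand-in for the dynamic import of the heavy runtime module.
async function loadHeavyModuleStub(): Promise<{ run: Runner }> {
  callOrder.push("import");
  return { run: async (input) => `ran:${input}` };
}

async function runLazily(input: string): Promise<string> {
  if (!cachedRunner) {
    // Best-effort prime: a failure here must not block the import below.
    await primeStub().catch(() => {});
    cachedRunner = (await loadHeavyModuleStub()).run;
  }
  return cachedRunner(input);
}
```

Swallowing the prime failure is what lets a sync accessor with a computed fallback still serve the common case when the peer dep cannot be located.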
@@ -186,6 +186,29 @@ const RETRYABLE_MODEL_FAILURE_PATTERNS = [
186
186
  /\b502\b/,
187
187
  /\b503\b/,
188
188
  /\b504\b/,
189
+ //
190
+ // Provider-side 5xx / generic api_error. The pi-core retry layer already
191
+ // retries these (agent-session.ts matches `500|server error|internal error`),
192
+ // but the pi-crew MODEL FALLBACK layer must ALSO treat them as retryable so
193
+ // that when the provider is hard-down across all 3 provider retries, we fail
194
+ // over to the next configured model instead of giving up. Reported case
195
+ // (2026-06-17): `500 {"type":"error","error":{"type":"api_error",
196
+ // "message":"unknown error, 999 (1000)"}}` — a transient provider outage that
197
+ // should trigger the fallback chain, not abort.
198
+ //
199
+ // `api_error` is the Anthropic-style generic error type (vs rate_limit_error
200
+ // / overloaded_error / etc.) and almost always means a transient server fault.
201
+ //
202
+ // `unknown error` is the body of the generic message; `internal`/`server`
203
+ // catch the common phrasings. `\b500\b`/`\b501\b` catch the HTTP status in
204
+ // the rendered error string.
205
+ /\b500\b/,
206
+ /\b501\b/,
207
+ /api_error/i,
208
+ /unknown error/i,
209
+ /internal(?:_server)?[ _]error/i,
210
+ /server error/i,
211
+ /bad gateway/i,
189
212
  ];
190
213
 
191
214
  // These patterns indicate auth/key/billing issues that will never succeed on retry.
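To see how the added patterns classify the reported error string, here is a small sketch. The regexes are copied from the hunk above; the `isRetryableModelFailure` helper and the array name are hypothetical, and the real matcher in pi-crew may combine these with the non-retryable auth patterns differently.

```typescript
// Sketch only: the helper name is an assumption; the regexes are from the diff.
const ADDED_RETRYABLE_PATTERNS: readonly RegExp[] = [
  /\b500\b/,
  /\b501\b/,
  /api_error/i,
  /unknown error/i,
  /internal(?:_server)?[ _]error/i,
  /server error/i,
  /bad gateway/i,
];

function isRetryableModelFailure(message: string): boolean {
  // No `g` flag on the patterns, so test() keeps no state between calls.
  return ADDED_RETRYABLE_PATTERNS.some((pattern) => pattern.test(message));
}

const reported =
  '500 {"type":"error","error":{"type":"api_error","message":"unknown error, 999 (1000)"}}';
const retryable = isRetryableModelFailure(reported);
const unrelated = isRetryableModelFailure("port 15000 closed");
```

The `\b` anchors matter: a bare `500` inside a longer number such as `15000` is not treated as an HTTP status.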
@@ -0,0 +1,296 @@
1
+ /**
2
+ * Robust resolution + async loading of the @earendil-works/pi-coding-agent
3
+ * peer dependency. Fixes the "Cannot find module '@earendil-works/pi-coding-agent'"
4
+ * crash that blocks ALL team runs when pi-crew and pi are installed in
5
+ * SEPARATE node_modules trees.
6
+ *
7
+ * PROBLEM (Windows / global installs — reported 2026-06-17)
8
+ * pi-crew is a pi EXTENSION. pi installs extensions under
9
+ * `~/.pi/agent/npm/node_modules/<ext>/`, but pi itself (the
10
+ * @earendil-works/pi-coding-agent package that extensions import from)
11
+ * usually lives in a DIFFERENT node_modules tree — a global one (nvm,
12
+ * %APPDATA%\npm, Volta, fnm, pnpm-global). Node's resolver only walks UP
13
+ * through ancestor `node_modules` of the importing file, so a file under
14
+ * `~/.pi/agent/npm/node_modules/pi-crew/...` CANNOT resolve a peer dep
15
+ * installed under `~/.nvm/.../lib/node_modules/`. Every static
16
+ * `import { X } from "@earendil-works/pi-coding-agent"` that executes inside
17
+ * a SPAWNED CHILD PROCESS (the detached background team runner started by
18
+ * async-runner.spawnBackgroundTeamRun) therefore crashes at module load,
19
+ * leaving all team runs permanently `queued`.
20
+ *
21
+ * ADDITIONAL CONSTRAINT (verified empirically 2026-06-17)
22
+ * pi-coding-agent ships as ESM-only (`"type":"module"`, exports map has only
23
+ * an `import` condition). CJS `require()` / `createRequire(dir)(name)` fails
24
+ * with ERR_PACKAGE_PATH_NOT_EXPORTED under plain node AND under jiti/tsx. The
25
+ * ONLY working load mechanism is a dynamic `import()` of the resolved ESM
26
+ * entry file URL. Hence: sync resolution of the DIR, async load of the MODULE.
27
+ *
28
+ * APPROACH
29
+ * - resolvePeerDep() (sync) — find the install dir across many layouts.
30
+ * - primePeerDep() (async) — dynamic-import the resolved entry, cache
31
+ * the module namespace. Memoized. Called
32
+ * once per process during bootstrap.
33
+ * - getAgentDir() (sync) — read the cached module's getAgentDir.
34
+ * Falls back to a computed default if the
35
+ * cache was never primed, so it NEVER throws.
36
+ */
37
+ import * as fs from "node:fs";
38
+ import * as os from "node:os";
39
+ import * as path from "node:path";
40
+ import { fileURLToPath, pathToFileURL } from "node:url";
41
+ import { resolveNpmGlobalRoot } from "./pi-spawn.ts";
42
+
43
+ /**
44
+ * Package name(s) under which the pi-coding-agent peer dependency may be installed.
45
+ * @earendil-works is the canonical scope; @mariozechner is the historical fork.
46
+ */
47
+ export const PEER_DEP_NAMES = [
48
+ "@earendil-works/pi-coding-agent",
49
+ "@mariozechner/pi-coding-agent",
50
+ ] as const;
51
+
52
+ /**
53
+ * Env var a parent pi-crew process sets on spawned children so they can resolve
54
+ * the peer dep WITHOUT running `npm root -g` (~200ms probe). The resolver
55
+ * checks this FIRST. Absent (older parent, direct invocation, tests) → falls
56
+ * through to the probing strategies. Also lets users override the resolution
57
+ * explicitly as a last-resort fix.
58
+ */
59
+ export const PEER_DEP_DIR_ENV = "PI_CREW_PEER_DEP_DIR";
60
+
61
+ type PeerDepModule = typeof import("@earendil-works/pi-coding-agent");
62
+
63
+ interface ResolvedPeerDep {
64
+ dir: string;
65
+ name: string;
66
+ /** file:// URL of the ESM entry (exports["."].import || main). */
67
+ mainUrl: string;
68
+ }
69
+
70
+ let cachedResolve: ResolvedPeerDep | undefined | null = null;
71
+ let cachedModule: PeerDepModule | undefined;
72
+ let primingPromise: Promise<PeerDepModule> | undefined;
73
+
74
+ /**
75
+ * Build the ordered list of "resolution bases" — paths to seed
76
+ * the manual `node_modules` walk (findPackageDir) from. The walk goes UP from each
77
+ * base's directory, so any base inside (or beside) the peer dep's package
78
+ * tree will find it. Pure given env/process inputs; exported for unit tests.
79
+ */
80
+ export function peerDepResolutionBases(): string[] {
81
+ const bases: string[] = [];
82
+
83
+ // 0. Parent-provided hint (fastest — no probe). Set by async-runner.
84
+ const envHint = process.env[PEER_DEP_DIR_ENV]?.trim();
85
+ if (envHint) bases.push(path.resolve(envHint));
86
+
87
+ // 1. This file's location — works when pi-crew and pi-coding-agent share a
88
+ // node_modules ancestor (the common co-located install).
89
+ bases.push(fileURLToPath(import.meta.url));
90
+
91
+ // 2. The entry script. In the PARENT (main pi process) argv[1] is pi's CLI
92
+ // script, which lives INSIDE pi-coding-agent's package → resolves. In a
93
+ // SPAWNED CHILD argv[1] is a pi-crew script → cheap miss, falls through.
94
+ const argv1 = process.argv[1];
95
+ if (argv1) bases.push(path.resolve(argv1));
96
+
97
+ // 3. The Node binary's global node_modules. Covers nvm / nvm-windows /
98
+ // Volta / fnm where pi-coding-agent is `npm i -g`'d: node is at
99
+ // <prefix>/bin/node and globals live at <prefix>/lib/node_modules.
100
+ try {
101
+ const execDir = path.dirname(fs.realpathSync.native(process.execPath));
102
+ bases.push(path.join(path.dirname(execDir), "lib", "node_modules"));
103
+ // Some layouts (e.g. Windows global installs) keep node_modules in the same directory as the node binary.
104
+ bases.push(path.join(execDir, "node_modules"));
105
+ } catch {
106
+ /* realpath best-effort */
107
+ }
108
+
109
+ // 4. `npm root -g` — the canonical cross-layout global root (memoized in
110
+ // pi-spawn.ts, ~200ms once). Derive the scoped package dirs from it.
111
+ const npmRoot = resolveNpmGlobalRoot();
112
+ if (npmRoot) {
113
+ for (const pkgName of PEER_DEP_NAMES) {
114
+ bases.push(path.join(npmRoot, ...pkgName.split("/")));
115
+ }
116
+ }
117
+
118
+ // 5. Windows %APPDATA%\npm static layout (legacy npm-global, pre-npm-root-g).
119
+ if (process.env.APPDATA) {
120
+ bases.push(path.join(process.env.APPDATA, "npm", "node_modules"));
121
+ }
122
+
123
+ return bases;
124
+ }
125
+
126
+ /** Pull the ESM entry path out of package.json (exports import || main). */
127
+ function extractEsmMain(pkg: unknown): string | undefined {
128
+ if (!pkg || typeof pkg !== "object") return undefined;
129
+ const p = pkg as Record<string, unknown>;
130
+ const exp = p.exports;
131
+ if (exp && typeof exp === "object") {
132
+ const dot = (exp as Record<string, unknown>)["."];
133
+ if (dot && typeof dot === "object") {
134
+ const d = dot as Record<string, unknown>;
135
+ const rel = d.import ?? d.default ?? d.module;
136
+ if (typeof rel === "string") return rel;
137
+ } else if (typeof dot === "string") {
138
+ return dot;
139
+ }
140
+ }
141
+ const main = p.main;
142
+ return typeof main === "string" ? main : undefined;
143
+ }
144
+
145
+ /**
146
+ * Walk the node_modules resolution algorithm MANUALLY from `start` looking for
147
+ * any of `names`. We do NOT use createRequire/require.resolve here because
148
+ * pi-coding-agent ships an ESM-only package with a restrictive exports map
149
+ * (only the `.` import condition) — `require.resolve("<pkg>/package.json")`
150
+ * and `require.resolve("<pkg>")` both throw ERR_PACKAGE_PATH_NOT_EXPORTED.
151
+ * Reading package.json directly from the walked dir sidesteps the exports map
152
+ * entirely (exports only governs subpath IMPORTS, not raw file reads).
153
+ *
154
+ * At each directory we check BOTH `<dir>/node_modules/<pkg>` (the standard
155
+ * container case) AND `<dir>/<pkg>` (handles a base that IS a node_modules
156
+ * dir, e.g. the output of `npm root -g`), then walk up to root.
157
+ */
158
+ function findPackageDir(
159
+ start: string,
160
+ names: readonly string[],
161
+ ): { dir: string; name: string } | undefined {
162
+ let dir = path.resolve(start);
163
+ try {
164
+ if (fs.statSync(dir).isFile()) dir = path.dirname(dir);
165
+ } catch {
166
+ /* treat as directory */
167
+ }
168
+ while (true) {
169
+ for (const name of names) {
170
+ const segs = name.split("/");
171
+ const candidates = [
172
+ path.join(dir, "node_modules", ...segs, "package.json"),
173
+ path.join(dir, ...segs, "package.json"),
174
+ ];
175
+ for (const pkgJson of candidates) {
176
+ try {
177
+ const pkg = JSON.parse(fs.readFileSync(pkgJson, "utf-8"));
178
+ if (pkg?.name === name) {
179
+ return { dir: path.dirname(pkgJson), name };
180
+ }
181
+ } catch {
182
+ /* not present at this candidate */
183
+ }
184
+ }
185
+ }
186
+ const parent = path.dirname(dir);
187
+ if (parent === dir) break; // reached filesystem root
188
+ dir = parent;
189
+ }
190
+ return undefined;
191
+ }
192
+
193
+ function tryResolveFrom(base: string): ResolvedPeerDep | undefined {
194
+ const found = findPackageDir(base, PEER_DEP_NAMES);
195
+ if (!found) return undefined;
196
+ try {
197
+ const pkg = JSON.parse(
198
+ fs.readFileSync(path.join(found.dir, "package.json"), "utf-8"),
199
+ );
200
+ const mainRel = extractEsmMain(pkg);
201
+ if (!mainRel) return undefined;
202
+ const mainAbs = path.resolve(found.dir, mainRel);
203
+ if (!fs.existsSync(mainAbs)) return undefined;
204
+ return { dir: found.dir, name: found.name, mainUrl: pathToFileURL(mainAbs).href };
205
+ } catch {
206
+ return undefined;
207
+ }
208
+ }
209
+
210
+ /** Resolve the peer dep install dir + ESM entry URL. Memoized (sync). */
211
+ export function resolvePeerDep(): ResolvedPeerDep | undefined {
212
+ if (cachedResolve !== null) return cachedResolve ?? undefined;
213
+ for (const base of peerDepResolutionBases()) {
214
+ const found = tryResolveFrom(base);
215
+ if (found) {
216
+ cachedResolve = found;
217
+ return found;
218
+ }
219
+ }
220
+ cachedResolve = undefined; // mark attempted-and-failed; don't re-probe per call
221
+ return undefined;
222
+ }
223
+
224
+ /** Just the install directory (for env-hint propagation to children). */
225
+ export function resolvePeerDepDir(): string | undefined {
226
+ return resolvePeerDep()?.dir;
227
+ }
228
+
229
+ /**
230
+ * Dynamic-import the peer dep module, caching the namespace. Memoized via a
231
+ * shared promise so concurrent callers share one load. On failure the promise
232
+ * is cleared so a later caller can retry. Safe to call repeatedly.
233
+ */
234
+ export function primePeerDep(): Promise<PeerDepModule> {
235
+ if (cachedModule) return Promise.resolve(cachedModule);
236
+ if (primingPromise) return primingPromise;
237
+ primingPromise = (async () => {
238
+ const resolved = resolvePeerDep();
239
+ if (!resolved) {
240
+ throw new Error(buildMissingMessage());
241
+ }
242
+ cachedModule = (await import(resolved.mainUrl)) as PeerDepModule;
243
+ return cachedModule;
244
+ })();
245
+ // Clear on failure so a later caller can retry (e.g. after env fix).
246
+ primingPromise.catch(() => {
247
+ primingPromise = undefined;
248
+ });
249
+ return primingPromise;
250
+ }
251
+
252
+ /** Async module accessor (primes if needed). */
253
+ export async function loadPeerDep(): Promise<PeerDepModule> {
254
+ return primePeerDep();
255
+ }
256
+
257
+ function buildMissingMessage(): string {
258
+ return (
259
+ `pi-crew could not resolve the @earendil-works/pi-coding-agent peer dependency.\n` +
260
+ `This usually means pi-crew and pi are installed in separate node_modules trees\n` +
261
+ `(e.g. pi-crew under ~/.pi/agent/npm/ but pi under an nvm/Volta/fnm global scope).\n` +
262
+ `Resolution bases tried:\n` +
263
+ peerDepResolutionBases().map((b) => ` - ${b}`).join("\n") +
264
+ `\nFix: install pi-crew in the SAME scope as pi, e.g.\n` +
265
+ ` npm install -g @earendil-works/pi-crew\n` +
266
+ `or set the env var ${PEER_DEP_DIR_ENV}=<path to the pi-coding-agent package dir>.`
267
+ );
268
+ }
269
+
270
+ /**
271
+ * Read the user agent dir via the REAL peer-dep getAgentDir (fork-aware:
272
+ * correct for pi, tau, and renamed forks). Sync; reads the primed cache.
273
+ *
274
+ * If the cache was never primed (e.g. called before bootstrap completes, or
275
+ * prime failed), falls back to a computed default so it NEVER throws. The
276
+ * default matches standard pi (`~/.pi/agent`) and respects the
277
+ * `PI_CODING_AGENT_DIR` override — correct for the overwhelmingly common
278
+ * case. Forks rely on the primed real function (register.ts primes at startup).
279
+ */
280
+ export function getAgentDir(): string {
281
+ if (cachedModule?.getAgentDir) {
282
+ try {
283
+ return cachedModule.getAgentDir();
284
+ } catch {
285
+ /* fall through to computed default */
286
+ }
287
+ }
288
+ return process.env.PI_CODING_AGENT_DIR || path.join(os.homedir(), ".pi", "agent");
289
+ }
290
+
291
+ /** @internal — reset all caches for unit tests. */
292
+ export function __resetPeerDepCacheForTest(): void {
293
+ cachedResolve = null;
294
+ cachedModule = undefined;
295
+ primingPromise = undefined;
296
+ }
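The manual walk in `findPackageDir` is easier to follow against a concrete layout. The sketch below is a simplified single-name variant run against a throwaway temp directory. The package name `@demo-scope/demo-agent`, the directory names, and the function name `findPackageDirSketch` are all made up for illustration; it is not the module's own implementation.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch only: simplified, single-name variant of the walk described above.
function findPackageDirSketch(start: string, name: string): string | undefined {
  const segs = name.split("/");
  let dir = path.resolve(start);
  while (true) {
    // Check both <dir>/node_modules/<pkg> and <dir>/<pkg> at every level.
    const candidates = [path.join(dir, "node_modules", ...segs), path.join(dir, ...segs)];
    for (const candidate of candidates) {
      try {
        const raw = fs.readFileSync(path.join(candidate, "package.json"), "utf-8");
        if (JSON.parse(raw)?.name === name) return candidate;
      } catch {
        /* not present at this candidate */
      }
    }
    const parent = path.dirname(dir);
    if (parent === dir) return undefined; // reached filesystem root
    dir = parent;
  }
}

// Fake split install: the package lives in a "global" tree, the extension elsewhere.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "peer-dep-sketch-"));
const globalRoot = path.join(root, "global", "node_modules");
const pkgDir = path.join(globalRoot, "@demo-scope", "demo-agent");
fs.mkdirSync(pkgDir, { recursive: true });
fs.writeFileSync(path.join(pkgDir, "package.json"), JSON.stringify({ name: "@demo-scope/demo-agent" }));
const extSrc = path.join(root, "ext", "node_modules", "some-ext", "src");
fs.mkdirSync(extSrc, { recursive: true });

const fromGlobalRoot = findPackageDirSketch(globalRoot, "@demo-scope/demo-agent");
const fromExtensionTree = findPackageDirSketch(extSrc, "@demo-scope/demo-agent");
```

Starting from the global root finds the package through the `<dir>/<pkg>` candidate, while starting inside the separate extension tree walks to the filesystem root without a hit. That second case is the split-scope failure the extra resolution bases are meant to cover.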
@@ -22,7 +22,11 @@ const PACKAGE_SKILLS_DIR = path.resolve(
22
22
  "skills",
23
23
  );
24
24
  import * as os from "node:os";
25
- import { getAgentDir } from "@earendil-works/pi-coding-agent";
25
+ // peer-dep.ts resolves @earendil-works/pi-coding-agent robustly across install
26
+ // layouts (extension-under-~/.pi + pi-under-global). A static `import { getAgentDir }`
27
+ // here crashes detached child processes when pi-crew and pi live in separate
28
+ // node_modules trees. See src/runtime/peer-dep.ts.
29
+ import { getAgentDir } from "../runtime/peer-dep.ts";
26
30
  const MAX_SKILL_CHARS = 1500;
27
31
  const MAX_TOTAL_CHARS = 6000;
28
32
  const MAX_SKILL_NAME_CHARS = 80;
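The replacement import relies on a sync accessor that never throws. A rough sketch of that fallback order follows, with the cached module and the environment passed in as parameters so the behaviour can be checked without touching process state. The `AgentModule` interface and the `agentDirOrDefault` name are hypothetical; the `PI_CODING_AGENT_DIR` variable and the `~/.pi/agent` default come from the peer-dep.ts comments.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Sketch only: interface and function names are illustrative assumptions.
interface AgentModule {
  getAgentDir?: () => string;
}

function agentDirOrDefault(
  cached: AgentModule | undefined,
  env: Record<string, string | undefined>,
): string {
  if (cached?.getAgentDir) {
    try {
      // Prefer the real function when the module was primed.
      return cached.getAgentDir();
    } catch {
      /* fall through to the computed default */
    }
  }
  // Unprimed or failing module: env override first, then the standard location.
  return env.PI_CODING_AGENT_DIR || path.join(os.homedir(), ".pi", "agent");
}
```

Because the fallback is computed rather than imported, code paths that run before priming completes still get a usable directory in the standard install.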
@@ -2,7 +2,9 @@ import * as fs from "node:fs";
2
2
  import * as os from "node:os";
3
3
  import * as path from "node:path";
4
4
  import { fileURLToPath } from "node:url";
5
- import { getAgentDir } from "@earendil-works/pi-coding-agent";
5
+ // peer-dep.ts resolves @earendil-works/pi-coding-agent robustly across install
6
+ // layouts. See src/runtime/peer-dep.ts (split-scope install fix).
7
+ import { getAgentDir } from "../runtime/peer-dep.ts";
6
8
  import { logInternalError } from "../utils/internal-error.ts";
7
9
  import { isSafePathId, resolveContainedPath, resolveRealContainedPath } from "../utils/safe-paths.ts";
8
10