pi-crew 0.7.7 → 0.8.2

package/CHANGELOG.md CHANGED
@@ -1,5 +1,330 @@
1
1
  # Changelog
2
2
 
3
+ ## [0.8.2] — Skill confidence dead-code fix (T7) (2026-06-16)
4
+
5
+ Fixes a **real correctness bug** surfaced by the pi-extensions deep-dive
6
+ (pi-continuous-learning's tiered confidence model): pi-crew's skill
7
+ confidence system was effectively **inert**.
8
+
9
+ ### Bug fixed
10
+
11
+ `registerSkillEffectivenessHooks` had two defects that left every skill's
12
+ confidence stuck at ~0.3 regardless of outcomes:
13
+
14
+ 1. **`adjustConfidence()` was dead code.** The `task_completed` handler
15
+ hardcoded `confidence: computeInitialConfidence(1)` (= 0.3) on every
16
+ activation write. The function was defined and unit-tested in isolation,
17
+ but **never called in the recording path** — so every stored activation
18
+ had confidence 0.3, and `computeSkillMetrics.currentConfidence` (derived
19
+ from the last stored value + decay) never moved.
20
+ 2. **`task_failed` was a no-op.** Its comment claimed failures were "handled
21
+ by computeSkillMetrics", but `computeSkillMetrics` derives `passRate`
22
+ from *recorded* activations — and failed tasks recorded **nothing**, so a
23
+ failure never fed back into the confidence/decay loop.
24
+
25
+ Net effect: the entire confidence-weighted skill system was decorative.
26
+ Pass-rate, trend, and promotion-gate decisions were computed from a flat
27
+ 0.3 baseline.
28
+
29
+ ### Fix
30
+
31
+ New `computeNextActivationConfidence(skillId, activations, passed)` helper
32
+ computes the **rolling** confidence: it seeds the first activation of a
33
+ skill at 0.3, then applies `adjustConfidence` (+0.05 success / -0.1
34
+ failure, clamped [0.1, 0.95]) on the skill's last recorded confidence.
35
+
36
+ Both hooks now record activations with the rolling confidence:
37
+ - `task_completed` → records `passed:true` activations at the rolled-forward
38
+ confidence.
39
+ - `task_failed` → now records `passed:false` activations (was a no-op),
40
+ which lowers passRate AND triggers the -0.1 contradicting delta on the
41
+ next recorded activation.
42
+
43
+ This unblocks the confidence-weighted skill selection (`getWeightedSkillsForRole`)
44
+ and the promotion gate (`evaluatePromotionGate`) — they now reflect real
45
+ outcome history. Existing `adjustConfidence`/`computeInitialConfidence`/
46
+ `computeSkillMetrics` tests are preserved unchanged (they asserted on the
47
+ intended contract; the recording path now honors it).
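The rolling update described above can be sketched as follows. The deltas, the clamp range, and the 0.3 seed come from the changelog text; the `Activation` record shape and the choice to store the first activation at the seed regardless of outcome are assumptions for illustration, not taken from `skill-effectiveness.ts`.

```ts
// Hypothetical record shape; the real type in skill-effectiveness.ts may differ.
interface Activation {
  skillId: string;
  passed: boolean;
  confidence: number;
}

const SEED_CONFIDENCE = 0.3; // first activation of a skill (per the changelog)

// +0.05 on success, -0.1 on failure, clamped to [0.1, 0.95] (per the changelog).
function adjustConfidence(current: number, passed: boolean): number {
  const next = current + (passed ? 0.05 : -0.1);
  return Math.min(0.95, Math.max(0.1, next));
}

// Assumed behaviour: seed when the skill has no history, otherwise adjust
// the most recently recorded confidence for that skill.
function computeNextActivationConfidence(
  skillId: string,
  activations: readonly Activation[],
  passed: boolean,
): number {
  const history = activations.filter((a) => a.skillId === skillId);
  const last = history[history.length - 1];
  return last === undefined ? SEED_CONFIDENCE : adjustConfidence(last.confidence, passed);
}

// Record three outcomes in a row: pass, pass, fail.
const log: Activation[] = [];
for (const passed of [true, true, false]) {
  const confidence = computeNextActivationConfidence("lint", log, passed);
  log.push({ skillId: "lint", passed, confidence });
}
console.log(log.map((a) => a.confidence.toFixed(2)).join(" -> "));
```

Because each write reads the previous stored value, the stored confidence now moves with outcomes instead of staying at the seed.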
48
+
49
+ ### Files
50
+ - `src/runtime/skill-effectiveness.ts` — `computeNextActivationConfidence`
51
+ helper; both hooks rewired to record rolling-confidence activations.
52
+ - NEW `test/unit/t7-confidence-deadcode-fix.test.ts` (7 tests): rolling
53
+ confidence evolves across activations; failures feed back; `adjustConfidence`
54
+ is no longer dead.
55
+
56
+ typecheck clean; skill-effectiveness suite 44/44 pass. (One unrelated
57
+ `event-log-async` flake under local load passes 3/3 in isolation — clean on CI.)
58
+
59
+ ## [0.8.1] — Subagent cold-start race fix (module-scoped import latch) (2026-06-16)
60
+
61
+ Fixes a flaky, load-dependent crash that surfaced when launching multiple
62
+ subagents **concurrently** via `Agent({ run_in_background: true })`.
63
+
64
+ ### Bug fixed
65
+
66
+ When 2+ in-process live-session subagents spawned at once, some crashed at
67
+ cold-start with:
68
+
69
+ ```
70
+ Cannot read properties of undefined (reading 'existsSync')
71
+ Cannot read properties of undefined (reading 'validateWorkflowForTeam')
72
+ ```
73
+
74
+ These are property-access-on-`undefined` errors: a module namespace binding
75
+ observed mid-evaluation as `undefined`. The defining reproduction: 4 explorer
76
+ subagents launched together → 3 of 4 crashed; **all 3 succeeded on sequential
77
+ retry** (same code, same args, same repos — only concurrency changed). That is
78
+ the signature of a cold-start race, not a logic bug.
79
+
80
+ ### Root cause
81
+
82
+ `direct-agent` subagents run **in-process** via `createAgentSession` (the
83
+ live-session runtime), sharing one Node module graph. The spawn path called
84
+ `await import("@earendil-works/pi-coding-agent")` **independently** per
85
+ subagent. Under the **tsx loader** (which registers `load`/`resolve` hooks to
86
+ transpile TS), concurrent first-imports can each enter the loader and race
87
+ module-record instantiation — yielding a namespace binding seen mid-eval as
88
+ `undefined`. Engine-level ESM memoization is not guaranteed to be observed
89
+ synchronously across concurrent evaluation under transpiling loaders.
90
+
91
+ ### Fix
92
+
93
+ Module-scoped memoization in `src/runtime/live-session-runtime.ts`: the FIRST
94
+ caller sets `liveSessionModulePromise`; every later caller awaits the same
95
+ in-flight promise. Guarantees a single module-record instantiation regardless
96
+ of loader behavior. Concurrent callers then proceed in parallel as normal.
97
+
+ ```ts
+ let liveSessionModulePromise: Promise<LiveSessionModule> | undefined;
+ function loadLiveSessionModule(): Promise<LiveSessionModule> {
+   if (!liveSessionModulePromise) {
+     liveSessionModulePromise = import("@earendil-works/pi-coding-agent")
+       as unknown as Promise<LiveSessionModule>;
+   }
+   return liveSessionModulePromise;
+ }
+ ```
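The same latch can be exercised in isolation. The sketch below is a generic version, not the pi-crew code: `latch`, `loadFakeModule`, and the stand-in module object are hypothetical names, and a timer stands in for the slow first import.

```ts
// Generic form of the latch: the loader runs at most once, and every
// caller (concurrent or later) receives the same promise.
function latch<T>(load: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | undefined;
  return () => {
    if (!inFlight) {
      inFlight = load();
    }
    return inFlight;
  };
}

let loadCount = 0;

// Stand-in for the expensive first import (hypothetical module shape).
const loadFakeModule = latch(async () => {
  loadCount += 1;
  await new Promise((resolve) => setTimeout(resolve, 10));
  return { existsSync: (_p: string) => true };
});

async function main(): Promise<void> {
  // Four "subagents" cold-start at the same time.
  const modules = await Promise.all([
    loadFakeModule(),
    loadFakeModule(),
    loadFakeModule(),
    loadFakeModule(),
  ]);
  const sameInstance = modules.every((m) => m === modules[0]);
  console.log({ loadCount, sameInstance });
}

const done = main();
```

One property of this pattern to keep in mind: a rejected promise stays cached too, so a failed first load is replayed to every later caller. A variant could clear the latch on rejection if retries are wanted.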
108
+
109
+ ### Files
110
+ - `src/runtime/live-session-runtime.ts` — module-scoped `loadLiveSessionModule()`
111
+ latch; use site now `await loadLiveSessionModule()` (was un-memoized
112
+ `await import(...)`).
113
+ - NEW `test/unit/live-session-import-latch.test.ts` (2 tests): module loads
114
+ cleanly; latch variable + check-before-set + use site present, and the old
115
+ un-memoized pattern gone (regression guard).
116
+ - NEW `.github/issues/2026-06-16-subagent-cold-start-race.md` — full root-cause
117
+ write-up + lessons.
118
+
119
+ typecheck clean; full suite 0 failures (local EXIT=1 is the test-runner infra
120
+ `spawnSync ETIMEDOUT` on a background-subagent test under local load — clean
121
+ on CI).
122
+
123
+ ## [0.8.0] — Tool-restriction unification across spawn paths (2026-06-16)
124
+
125
+ Fixes a long-standing correctness gap where the same agent behaved
126
+ *differently* depending on which runtime spawned it.
127
+
128
+ ### Bug fixed
129
+
130
+ The child-pi path (`pi-args.ts`) and the live-session path
131
+ (`live-session-runtime.ts`) **disagreed on tool restrictions**:
132
+
133
+ | | allowlist | denylist |
134
+ |---|---|---|
135
+ | child-pi (before) | `roleConfig.tools ?? agent.tools` (role authoritative) | `roleConfig.excludeTools` only |
136
+ | live-session (before) | `agent.tools` only (frontmatter authoritative) | `agent.disallowedTools` only |
137
+
138
+ So a user defining `tools:` or `disallowed_tools:` in a custom agent's
139
+ frontmatter saw it honored on one path and ignored on the other:
140
+ - `disallowed_tools: web` was **silently ignored on child-pi** (the default
141
+ async path).
142
+ - A builtin `explorer` on the live-session path was **not bound by the role's
143
+ read-only security constraint** (it relied solely on the frontmatter).
144
+
145
+ ### Fix
146
+
147
+ A shared `resolveToolPolicy(agent, role)` helper in `agent-config.ts` is
148
+ now the **single source of truth** used by BOTH spawn paths. Stable,
149
+ unified semantics:
150
+
151
+ - **Allowlist precedence is source-aware**:
152
+ - `source === "builtin"` → role-config authoritative (security: a builtin
153
+ explorer MUST stay read-only even if its frontmatter is loose).
154
+ Frontmatter is the fallback when the role has no allowlist.
155
+ - `source !== "builtin"` (user / project) → frontmatter `tools:`
156
+ authoritative (user intent). Role-config is the fallback.
157
+ - **Denylist is additive**: `roleConfig.excludeTools` and
158
+ `agent.disallowedTools` are MERGED (dedup, order-insensitive). It is
159
+ always safe to forbid more, and merging means a security exclude from
160
+ the role can never be weakened by a frontmatter omission.
161
+
162
+ This is **not a regression** for builtin agents: their allowlist still comes
163
+ from `ROLE_TOOL_CONFIGS` (the authoritative security set), and the merged
164
+ denylist only adds constraints. Custom agents now behave identically
165
+ across both runtimes.
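A self-contained sketch of the two rules above (source-aware allowlist, additive denylist). The precedence logic follows the `resolveToolPolicy` description in this entry; the `ROLES` table, the `AgentLike` type, and the sample agents are hypothetical stand-ins for `ROLE_TOOL_CONFIGS` and `AgentConfig`.

```ts
type Source = "builtin" | "user" | "project";

interface AgentLike {
  source: Source;
  tools?: string[];
  disallowedTools?: string[];
}

interface ToolPolicy {
  tools?: string[];
  excludeTools?: string[];
}

// Hypothetical role table for illustration only.
const ROLES: Record<string, ToolPolicy> = {
  explorer: { tools: ["read", "grep", "find", "ls"], excludeTools: ["bash"] },
};

// Merge any number of optional lists, dropping duplicates.
function mergeUnique(...lists: Array<string[] | undefined>): string[] | undefined {
  const merged = Array.from(new Set(lists.flatMap((list) => list ?? [])));
  return merged.length > 0 ? merged : undefined;
}

function resolvePolicy(agent: AgentLike, role?: string): ToolPolicy {
  const roleConfig = (role ? ROLES[role] : undefined) ?? {};
  // Builtin agents: the role wins. Everything else: frontmatter wins.
  const tools = agent.source === "builtin"
    ? (roleConfig.tools ?? agent.tools)
    : (agent.tools ?? roleConfig.tools);
  return { tools, excludeTools: mergeUnique(roleConfig.excludeTools, agent.disallowedTools) };
}

const builtinPolicy = resolvePolicy({ source: "builtin", tools: ["read", "write"] }, "explorer");
const customPolicy = resolvePolicy(
  { source: "user", tools: ["read", "write"], disallowedTools: ["web"] },
  "explorer",
);
console.log(builtinPolicy);
console.log(customPolicy);
```

In this sketch the builtin agent keeps the role's read-only allowlist even though its frontmatter asks for `write`, while the custom agent keeps its own allowlist but still inherits the role's `bash` exclusion alongside its own `web` exclusion.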
166
+
167
+ ### Files
168
+ - `src/agents/agent-config.ts` — NEW `resolveToolPolicy` + `ResolvedToolPolicy`
169
+ (the shared resolver) + `uniqueToolMerge` helper.
170
+ - `src/runtime/pi-args.ts` — uses `resolveToolPolicy` (drops the inline
171
+ role-authoritative logic; removes now-unused `getAgentSessionOptions` import).
172
+ - `src/runtime/live-session-runtime.ts` — `filterActiveTools` now takes the
173
+ role and uses `resolveToolPolicy` (drops the inline frontmatter-only logic).
174
+ - NEW `test/unit/v0-8-0-tool-policy-unification.test.ts` (10 tests pinning
175
+ the resolver: source-aware allowlist, additive denylist, cross-path
176
+ determinism).
177
+
178
+ typecheck clean; 4980+ tests pass / 0 fail. CI green on win/ubuntu/macos.
179
+
180
+ ## [0.7.9] — Interop & agent granularity (4 grouped items, 2026-06-16)
181
+
182
+ One grouped release for four related, surgical interop / agent-granularity
183
+ items (all additive, no behavior change for existing configs):
184
+
185
+ ### F6 — Agent Skills spec skill-roots (interop)
186
+ - Skill discovery now reads 7 roots (was 2; 5 are new), matching pi-subagents'
187
+ `skill-loader` so skills authored under either convention are found:
188
+ - `<cwd>/.pi/skills` (project, Pi standard) — new
189
+ - `<cwd>/.agents/skills` (project, Agent Skills spec / agentskills.io) — new
190
+ - `<cwd>/skills` (project, legacy pi-crew) — kept
191
+ - `~/.pi/agent/skills` (user, Pi standard) — new
192
+ - `~/.agents/skills` (user, Agent Skills spec) — new
193
+ - `~/.pi/skills` (user, legacy) — new
194
+ - `PACKAGE_SKILLS_DIR` (bundled) — kept
195
+ - Affects both `discover-skills.ts` (capability inventory) and
196
+ `skill-instructions.ts` (actual prompt rendering). New `source` values
197
+ (`project-pi`, `project-agents`, `user-pi`, `user-agents`) extend
198
+ `CapabilitySource`; first hit per name wins, project overrides user.
199
+
200
+ ### F1 sub-gap — `.pi/agents/` project agent discovery (interop)
201
+ - Project agent discovery now reads BOTH the legacy pi-crew
202
+ `.crew/agents/` (or `.pi/teams/agents/` fallback) AND the Pi-standard
203
+ `.pi/agents/` as separate tiers. New `projectPi` field in
204
+ `AgentDiscoveryResult` (optional in the type for back-compat with
205
+ existing test fixtures; treated as `[]` when omitted). `allAgents`
206
+ merges them in priority order (project first, then project-pi so a
207
+ `.pi/agents/foo.md` is a fallback to `.crew/agents/foo.md` within
208
+ the project tier). `ResourceSource` extended with `"project-pi"`.
209
+
210
+ ### F1 — frontmatter `tools:` wildcards
211
+ - New `BUILTIN_TOOL_NAMES` constant + `parseToolsField` helper in
212
+ `agent-config.ts` (matching pi-subagents' `parseToolsField`):
213
+ - omitted → `undefined` (back-compat: use the runtime default)
214
+ - `*` or `all` (case-insensitive) → full `BUILTIN_TOOL_NAMES` list
215
+ - `none` / `[]` / empty → `[]` (zero built-ins)
216
+ - CSV → parsed entries (trimmed, empty dropped)
217
+ - `parseAgentFile` now uses `parseToolsField` instead of `parseCsv`,
218
+ so existing agent files keep working with no edits. The
219
+ `ext:<extension>/<tool>` selector from pi-subagents is a documented
220
+ future gap (deferred — would require pi SDK introspection).
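The four cases above can be checked with a few sample inputs. The function body below mirrors the `parseToolsField` added to `agent-config.ts` in this release; the sample inputs are illustrative.

```ts
const BUILTIN_TOOL_NAMES: readonly string[] = [
  "read", "edit", "write", "bash", "grep", "find", "ls",
];

function parseToolsField(raw: unknown): string[] | undefined {
  if (raw === undefined || raw === null) return undefined; // omitted
  const s = typeof raw === "string" ? raw.trim() : String(raw).trim();
  if (!s) return []; // empty string
  const lowered = s.toLowerCase();
  if (lowered === "none" || lowered === "[]") return [];
  if (lowered === "*" || lowered === "all") return [...BUILTIN_TOOL_NAMES];
  // Plain CSV: trim each entry and drop empty ones.
  return s.split(",").map((t) => t.trim()).filter(Boolean);
}

const samples: Array<[string, unknown]> = [
  ["omitted", undefined],
  ["wildcard", "*"],
  ["ALL (any case)", "ALL"],
  ["none", "none"],
  ["empty", ""],
  ["csv", " read, ,bash "],
];

for (const [label, raw] of samples) {
  console.log(label, "=>", JSON.stringify(parseToolsField(raw)));
}
```

Returning `undefined` for an omitted field (rather than `[]`) is what lets callers tell "no opinion, use the runtime default" apart from "explicitly no built-in tools".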
221
+
222
+ ### F1 — frontmatter `excludeExtensions` denylist
223
+ - New `excludeExtensions?: string[]` field on `AgentConfig`, parsed
224
+ from frontmatter `exclude_extensions: foo, bar`. Applied on the
225
+ **child-pi path** in `pi-args.ts` as a case-insensitive basename
226
+ denylist (an excluded extension is removed from the `--extension`
227
+ list; the trusted `PROMPT_RUNTIME_EXTENSION_PATH` is never
228
+ excludable). **Documented limitation**: the live-session path
229
+ (opt-in via `runtime.preferLiveSession`) ignores it for v0.7.9 —
230
+ pi's `DefaultResourceLoader` has no per-extension deny hook at the
231
+ point we hand off. Users who need the denylist on live-session
232
+ should stay on the child-pi runtime, or revisit when the SDK
233
+ exposes the hook.
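A minimal sketch of a case-insensitive basename denylist with one protected entry. It is not the `pi-args.ts` implementation: the function names, the sample paths, and the reading of "basename" as "last path segment without its file extension" are all assumptions.

```ts
// Last path segment without its file extension, lower-cased.
// Treating "basename" this way is an assumption of this sketch.
function extensionKey(p: string): string {
  const segment = p.split(/[\\/]/).pop() ?? "";
  const dot = segment.lastIndexOf(".");
  return (dot > 0 ? segment.slice(0, dot) : segment).toLowerCase();
}

function applyExtensionDenylist(
  extensionPaths: readonly string[],
  excludeExtensions: readonly string[] | undefined,
  trustedPath: string,
): string[] {
  const denied = new Set((excludeExtensions ?? []).map((name) => name.trim().toLowerCase()));
  // The trusted path is kept even if its name appears in the denylist.
  return extensionPaths.filter((p) => p === trustedPath || !denied.has(extensionKey(p)));
}

const TRUSTED = "/opt/pi-crew/prompt-runtime.ts"; // hypothetical stand-in path
const kept = applyExtensionDenylist(
  ["/ext/Foo.ts", "/ext/bar.js", "/ext/baz.ts", TRUSTED],
  ["foo", "BAR", "prompt-runtime"],
  TRUSTED,
);
console.log(kept);
```

Checking the trusted path before the denylist lookup is what keeps a frontmatter entry from disabling it by name.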
234
+
235
+ ### Files
236
+ - `src/skills/discover-skills.ts` — F6 (5 roots, new source values)
237
+ - `src/runtime/skill-instructions.ts` — F6 (5 roots, type updates)
238
+ - `src/runtime/capability-inventory.ts` — F6 (CapabilitySource extended)
239
+ - `src/agents/agent-config.ts` — F1 (BUILTIN_TOOL_NAMES, parseToolsField,
240
+ excludeExtensions field, ResourceSource +project-pi)
241
+ - `src/agents/discover-agents.ts` — F1 (projectPi tier, tools/excludeExtensions
242
+ parsing, allAgents merge)
243
+ - `src/runtime/pi-args.ts` — F1 (excludeExtensions denylist applied to
244
+ `--extension` args)
245
+ - `src/runtime/live-session-runtime.ts` — F1 (doc comment for the
246
+ live-session limitation)
247
+ - `src/ui/agent-management-overlay.ts` — F1 (ResourceSource order includes
248
+ project-pi)
249
+ - NEW `test/unit/v0-7-9-interop-granularity.test.ts` (15 tests)
250
+ - `test/unit/capability-inventory.test.ts` — accept expanded state set
251
+ (shadowed/missing now possible from user-skill-roots shadowing bundles)
252
+ - `test/unit/discover-skills.test.ts` — accept expanded source set
253
+
254
+ typecheck clean; 4980+ tests pass / 0 fail. CI green on win/ubuntu/macos.
255
+
256
+ ## [0.7.8] — F7 model-scope enforcement + cross-session leak fix (2026-06-16)
257
+
+ Two changes from the same session: one new opt-in capability, and one
+ correctness fix for a bug the user hit live while iterating on that
+ feature: a different Pi session's in-flight run kept being injected into
+ the current session's context via the ambient-status handler.
263
+
264
+ ### Features
265
+
266
+ - **F7 model-scope enforcement** — opt-in gate that validates subagent model
267
+ choices against the user's pi `enabledModels` allowlist. Trust distinction
268
+ matches the pi-subagents reference semantics:
269
+ - Caller-supplied (per-spawn `modelOverride` / `step.model` /
270
+ `teamRoleModel`) out-of-scope → **hard error** (`CrewError E013
271
+ ModelOutOfScope`) before spawn, fail-fast with actionable help hint.
272
+ - Frontmatter-pinned (`AgentConfig.model`) out-of-scope → **warning +
273
+ runs anyway** (frontmatter is authoritative; the agent author made a
274
+ deliberate choice).
275
+ Pattern semantics match pi's `--models` allowlist: exact
276
+ (case-insensitive), glob with `*` (unanchored, so `"claude-*"` matches
277
+ `anthropic/claude-opus-4-5`), and case-insensitive substring fallback.
278
+ Toggle: `runtime.reliability.scopeModels: true` (default `false` = no
279
+ enforcement, fully back-compat). The allowlist itself is read from
280
+ pi's `SettingsManager.getEnabledModels()` per spawn (no caching, so
281
+ changes take effect immediately). 20 new unit tests covering pattern
282
+ matching, scope verdicts, and the routing gate (caller/frontmatter
283
+ trust distinction + `isFrontmatterOverride` downgrade).
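The three pattern forms and the caller/frontmatter distinction can be sketched as below. This is an illustration of the semantics described in this entry, not the contents of `model-scope.ts`; the function names, the `Verdict` type, and the sample model ids are hypothetical, and the changelog does not say how an empty allowlist is treated (here it simply matches nothing).

```ts
// Escape regex metacharacters, then turn each `*` into `.*`.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(escaped, "i"); // deliberately unanchored
}

function matchesPattern(model: string, pattern: string): boolean {
  const m = model.toLowerCase();
  const p = pattern.trim().toLowerCase();
  if (p === "") return false; // assumption: blank patterns match nothing
  if (m === p) return true; // exact, case-insensitive
  if (p.includes("*")) return globToRegExp(p).test(m); // glob
  return m.includes(p); // substring fallback
}

type Verdict = "allowed" | "error" | "warning";

// Caller-supplied models fail hard; frontmatter-pinned models only warn.
function scopeVerdict(model: string, patterns: readonly string[], fromFrontmatter: boolean): Verdict {
  if (patterns.some((p) => matchesPattern(model, p))) return "allowed";
  return fromFrontmatter ? "warning" : "error";
}

const allowlist = ["claude-*"];
console.log(scopeVerdict("anthropic/claude-opus-4-5", allowlist, false));
console.log(scopeVerdict("other/model-x", allowlist, false));
console.log(scopeVerdict("other/model-x", allowlist, true));
```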
284
+
285
+ ### Bug Fixes
286
+
287
+ - **Cross-session run-context leak** (commit `4bd6f5b`) — `collectInFlightRuns(cwd)`
288
+ in `compaction-guard.ts` scanned the SHARED per-project `.crew/state/runs/`
289
+ dir and filtered by STATUS only, ignoring `ownerSessionId`. Multiple Pi
290
+ sessions in the same project share that directory, so Session B's
291
+ compaction picked up Session A's in-flight runs and injected them into B's
292
+ continuation prompt, making B wrongly try to resume A's run. The same
293
+ leak affected ambient-status injection (`context-status-injection.ts`),
294
+ showing A's runs in B's context stream. Fix: `collectInFlightRuns`
295
+ gains optional `currentSessionId?` → strict filter
296
+ `run.ownerSessionId === currentSessionId` (legacy ownerless runs
297
+ excluded; true orphans are crash-recovery's job). New canonical
298
+ `extractSessionId(ctx)` helper in `utils/session-utils.ts` (defensive
299
+ against Proxy/exotic objects, replaces inline
300
+ getOwnPropertyDescriptor in `register.ts`). Artifact index stays
301
+ UNFILTERED (durable cross-session memory, not a resume directive).
302
+ `triggerContinuation`'s `sendUserMessage` race ("Agent is already
303
+ processing a prompt...") is detected and downgraded to silent — it is
304
+ benign (the worker continues independently). 11 new regression tests
305
+ (compaction-cross-session-leak.test.ts). CI green on all 3 platforms
306
+ (run `27608398599`).
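The ownership filter described above can be sketched as follows. The `RunLike` shape, the status names, and the behaviour when no session id is passed (fall back to the old status-only filter) are assumptions of this sketch; only the strict `ownerSessionId === currentSessionId` comparison and the exclusion of ownerless runs come from the changelog.

```ts
interface RunLike {
  runId: string;
  status: "running" | "queued" | "completed" | "failed"; // hypothetical status set
  ownerSessionId?: string;
}

const IN_FLIGHT = new Set<RunLike["status"]>(["running", "queued"]);

function selectInFlightRuns(runs: readonly RunLike[], currentSessionId?: string): RunLike[] {
  const inFlight = runs.filter((run) => IN_FLIGHT.has(run.status));
  // Assumption: omitting the session id keeps the old status-only behaviour.
  if (currentSessionId === undefined) return inFlight;
  // Strict ownership: ownerless (legacy) runs never match.
  return inFlight.filter((run) => run.ownerSessionId === currentSessionId);
}

const runs: RunLike[] = [
  { runId: "a1", status: "running", ownerSessionId: "session-A" },
  { runId: "b1", status: "running", ownerSessionId: "session-B" },
  { runId: "b2", status: "completed", ownerSessionId: "session-B" },
  { runId: "legacy", status: "running" },
];

console.log(selectInFlightRuns(runs, "session-B").map((r) => r.runId));
console.log(selectInFlightRuns(runs).map((r) => r.runId));
```

With the session id supplied, Session B sees only its own in-flight run; Session A's run and the ownerless legacy run are both left out.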
307
+
308
+ ### Files
309
+
310
+ - NEW `src/runtime/model-scope.ts` — pattern matcher + verdict + SettingsManager
311
+ reader.
312
+ - `src/runtime/model-fallback.ts` — `buildConfiguredModelRouting` gains
313
+ `scopeModelsPatterns?` + `isFrontmatterOverride?` inputs; new
314
+ `CrewError E013 ModelOutOfScope` factory in `src/errors.ts`.
315
+ - `src/config/types.ts` — new `reliability.scopeModels?: boolean` toggle
316
+ (default `false`).
317
+ - `src/extension/team-tool/handle-settings.ts` — adds
318
+ `reliability.scopeModels` to the visible-keys list so it surfaces in
319
+ the settings overlay.
320
+ - `src/extension/registration/compaction-guard.ts`,
321
+ `src/extension/context-status-injection.ts`,
322
+ `src/extension/register.ts`, `src/utils/session-utils.ts` — leak fix.
323
+ - NEW `test/unit/model-scope.test.ts` (20 tests),
324
+ `test/unit/compaction-cross-session-leak.test.ts` (11 tests).
325
+
326
+ typecheck clean; 4968+ tests pass / 0 fail.
327
+
3
328
  ## [0.7.7] — Windows spawn fix + plan-approval crash-recovery fix + CI flake fixes (2026-06-16)
4
329
 
5
330
  A focused patch release driven by two community reports (Issue #33 and PR #32) plus the CI flake surfaced while validating them. CI green on Windows / Ubuntu / macOS (run 27599121797). 4965 tests pass / 0 fail.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-crew",
3
- "version": "0.7.7",
3
+ "version": "0.8.2",
4
4
  "description": "Pi extension for coordinated AI teams, workflows, worktrees, and async task orchestration",
5
5
  "author": "baphuongna",
6
6
  "license": "MIT",
package/src/agents/agent-config.ts CHANGED
@@ -1,7 +1,51 @@
1
1
  import type { RoleToolConfig } from "../config/role-tools.ts";
2
2
  import { getToolConfig } from "../config/role-tools.ts";
3
3
 
4
- export type ResourceSource = "builtin" | "user" | "project" | "git" | "dynamic";
4
+ /**
5
+ * F1 (v0.7.9): canonical built-in tool name list. Used by `parseToolsField`
6
+ * to expand wildcard `*` / `all` patterns in agent frontmatter. Matches
7
+ * pi-subagents' `BUILTIN_TOOL_NAMES` (derived from pi's `createCodingTools` /
8
+ * `createReadOnlyTools`). If pi adds a new built-in, update this list and
9
+ * the wildcard expansion will pick it up. The 7 names below are stable
10
+ * across pi v0.77+ and cover read, edit, write, bash, grep, find, ls.
11
+ */
12
+ export const BUILTIN_TOOL_NAMES: readonly string[] = [
13
+ "read",
14
+ "edit",
15
+ "write",
16
+ "bash",
17
+ "grep",
18
+ "find",
19
+ "ls",
20
+ ];
21
+
22
+ /**
23
+ * F1 (v0.7.9): normalize the raw `tools:` frontmatter CSV into a `string[]`.
24
+ * Semantics (matching pi-subagents' `parseToolsField`):
25
+ * - omitted / undefined → returns `undefined` (back-compat: use the
26
+ * runtime default — today this is the role-tools default; tomorrow this
27
+ * could become the wildcard expansion if the user opts in).
28
+ * - `*` or `all` (case-insensitive) → returns the full BUILTIN_TOOL_NAMES
29
+ * list (no duplicates).
30
+ * - `none`, `[]`, or empty string → returns `[]` (zero built-ins; extension
31
+ * tools via `ext:` can still be added, though pi-crew doesn't parse
32
+ * `ext:` selectors yet — see F1 sub-gap).
33
+ * - CSV → returns the parsed entries (trimmed, empty entries dropped).
34
+ * Plain tool names (no `*`) pass through unchanged so existing agent
35
+ * files keep working with no edits.
36
+ */
37
+ export function parseToolsField(raw: unknown): string[] | undefined {
38
+ if (raw === undefined || raw === null) return undefined;
39
+ const s = typeof raw === "string" ? raw.trim() : String(raw).trim();
40
+ if (!s) return [];
41
+ const lowered = s.toLowerCase();
42
+ if (lowered === "none" || lowered === "[]") return [];
43
+ if (lowered === "*" || lowered === "all") return [...BUILTIN_TOOL_NAMES];
44
+ const items = s.split(",").map((t) => t.trim()).filter(Boolean);
45
+ return items;
46
+ }
47
+
48
+ export type ResourceSource = "builtin" | "user" | "project" | "git" | "dynamic" | "project-pi";
5
49
 
6
50
  export interface RoutingMetadata {
7
51
  triggers?: string[];
@@ -22,6 +66,14 @@ export interface AgentConfig {
22
66
  thinking?: string;
23
67
  tools?: string[];
24
68
  extensions?: string[];
69
+ /**
70
+ * F1 (v0.7.9): extension denylist (case-insensitive plain names). Applied
71
+ * AFTER `extensions:` (which lists the allowed set) — an excluded
72
+ * extension is removed from the allowlist and never loads. Plain names
73
+ * only (no paths, no `*`); an unknown name logs a warning but is
74
+ * tolerated. Back-compat: omitted = no exclusion.
75
+ */
76
+ excludeExtensions?: string[];
25
77
  skills?: string[];
26
78
  systemPromptMode?: "replace" | "append";
27
79
  inheritProjectContext?: boolean;
@@ -64,6 +116,54 @@ export function getAgentSessionOptions(role: string): {
64
116
  return {};
65
117
  }
66
118
 
119
+ /**
120
+ * F1 unify (v0.8.0): the single source of truth for a worker's tool policy,
121
+ * used by BOTH spawn paths (child-pi `pi-args.ts` and live-session
122
+ * `live-session-runtime.ts`). Before this, the two paths disagreed:
123
+ * - child-pi: `roleConfig.tools ?? agent.tools` (role authoritative)
124
+ * - live-session: `agent.tools` only (frontmatter authoritative, role ignored)
125
+ * so the same agent behaved differently depending on the runtime. A user
126
+ * defining `tools:` or `disallowed_tools:` in a custom agent's frontmatter
127
+ * saw it honored on one path and ignored on the other.
128
+ *
129
+ * Unified semantics (stable across both paths):
130
+ * - **allowlist precedence is source-aware**:
131
+ * - `source === "builtin"` → role-config authoritative (security: a
132
+ * builtin explorer MUST stay read-only even if its frontmatter is
133
+ * loose). Frontmatter is the fallback when the role has no allowlist.
134
+ * - `source !== "builtin"` (user / project) → frontmatter `tools:`
135
+ * authoritative (user intent). Role-config is the fallback.
136
+ * - **denylist is additive**: `roleConfig.excludeTools` and
137
+ * `agent.disallowedTools` are MERGED (dedup, order-insensitive). It is
138
+ * always safe to forbid more, and merging means a security exclude
139
+ * from the role can never be weakened by a frontmatter omission.
140
+ *
141
+ * Returns `{ tools, excludeTools }` where each is `undefined` when no
142
+ * restriction of that kind applies (so callers no-op cleanly).
143
+ */
144
+ export interface ResolvedToolPolicy {
145
+ /** Allowlist; undefined = no allowlist restriction (all built-ins allowed). */
146
+ tools?: string[];
147
+ /** Denylist (additive); undefined = no denylist. */
148
+ excludeTools?: string[];
149
+ }
150
+
151
+ function uniqueToolMerge(...lists: Array<string[] | undefined>): string[] | undefined {
152
+ const merged = [...new Set(lists.flatMap((list) => list ?? []))];
153
+ return merged.length > 0 ? merged : undefined;
154
+ }
155
+
156
+ export function resolveToolPolicy(agent: AgentConfig, role?: string): ResolvedToolPolicy {
157
+ const roleConfig = role ? getToolConfig(role) : {};
158
+ // allowlist: source-aware precedence (see doc above).
159
+ const explicitTools = agent.source === "builtin"
160
+ ? (roleConfig.tools ?? agent.tools)
161
+ : (agent.tools ?? roleConfig.tools);
162
+ // denylist: additive merge of role excludeTools + agent disallowedTools.
163
+ const excludeTools = uniqueToolMerge(roleConfig.excludeTools, agent.disallowedTools);
164
+ return { tools: explicitTools, excludeTools };
165
+ }
166
+
67
167
  /**
68
168
  * Build agent session options including role-based tool restrictions.
69
169
  * @param agent - The agent configuration
package/src/agents/discover-agents.ts CHANGED
@@ -1,10 +1,11 @@
1
1
  import * as fs from "node:fs";
2
2
  import * as path from "node:path";
3
3
  import type { AgentConfig, ResourceSource } from "./agent-config.ts";
4
+ import { parseToolsField } from "./agent-config.ts";
4
5
  import { loadConfig, type LoadedPiTeamsConfig } from "../config/config.ts";
5
6
  import { parseCsv, parseFrontmatter } from "../utils/frontmatter.ts";
6
7
  import { logInternalError } from "../utils/internal-error.ts";
7
- import { packageRoot, projectCrewRoot, userPiRoot } from "../utils/paths.ts";
8
+ import { packageRoot, projectCrewRoot, userPiRoot, findRepoRoot } from "../utils/paths.ts";
8
9
 
9
10
  // ═══════════════════════════════════════════════════════════════════════════
10
11
  // SEC-001 Fix: Protected Agent Names Blocklist
@@ -225,7 +226,23 @@ function checkProjectAgentShadowsBuiltin(name: string): void {
225
226
  export interface AgentDiscoveryResult {
226
227
  builtin: AgentConfig[];
227
228
  user: AgentConfig[];
229
+ /**
230
+ * Project agents from the pi-crew legacy directory (`.crew/agents/`, or
231
+ * `.pi/teams/agents/` fallback). F1 (v0.7.9): the `.pi/agents/` Pi-standard
232
+ * project directory is read into `projectPi` (the 4th tier) so users who
233
+ * author agents under either convention find them.
234
+ */
228
235
  project: AgentConfig[];
236
+ /**
237
+ * F1 (v0.7.9): project agents read from `<repoRoot>/.pi/agents/` (Pi
238
+ * standard). Merged into the same priority order as `project` (project
239
+ * overrides user, but `.crew/agents/` and `.pi/agents/` are peers
240
+ * within the project tier — first hit per `name` wins, with a
241
+ * warning logged on shadow). Optional in the result shape so existing
242
+ * test fixtures that construct `AgentDiscoveryResult` literally don't
243
+ * have to add an empty array (treated as `[]` by `allAgents`).
244
+ */
245
+ projectPi?: AgentConfig[];
229
246
  }
230
247
 
231
248
  function parseCost(value: string | undefined): "free" | "cheap" | "expensive" | undefined {
@@ -365,8 +382,9 @@ function parseAgentFile(filePath: string, source: ResourceSource): AgentConfig |
365
382
  model: frontmatter.model === "false" ? undefined : frontmatter.model || undefined,
366
383
  fallbackModels: parseCsv(frontmatter.fallbackModels),
367
384
  thinking: frontmatter.thinking === "false" ? undefined : frontmatter.thinking || undefined,
368
- tools: parseCsv(frontmatter.tools),
385
+ tools: parseToolsField(frontmatter.tools),
369
386
  extensions: frontmatter.extensions === "" ? [] : parseCsv(frontmatter.extensions),
387
+ excludeExtensions: parseCsv(frontmatter.excludeExtensions ?? frontmatter.exclude_extensions),
370
388
  skills: parseCsv(frontmatter.skills ?? frontmatter.skill),
371
389
  systemPromptMode: frontmatter.systemPromptMode === "append" ? "append" : "replace",
372
390
  inheritProjectContext: frontmatter.inheritProjectContext === "true",
@@ -471,7 +489,14 @@ export function discoverAgents(cwd: string): AgentDiscoveryResult {
471
489
  const result: AgentDiscoveryResult = {
472
490
  builtin: applyAgentOverrides(readAgentDir(path.join(packageRoot(), "agents"), "builtin"), cwd, loaded),
473
491
  user: applyAgentOverrides(readAgentDir(path.join(userPiRoot(), "agents"), "user"), cwd, loaded),
492
+ // F1 (v0.7.9): two project roots — the legacy pi-crew `.crew/agents/`
493
+ // (or `.pi/teams/agents/` fallback) AND the Pi-standard `.pi/agents/`.
494
+ // Both are read; `allAgents` merges them in priority order (project
495
+ // first, then project-pi) so a project can override a global agent
496
+ // from either location. Same-name shadows within the project tier
497
+ // log a warning (SEC-001).
474
498
  project: applyAgentOverrides(readAgentDir(path.join(projectCrewRoot(cwd), "agents"), "project"), cwd, loaded),
499
+ projectPi: applyAgentOverrides(readAgentDir(path.join(findRepoRoot(cwd) ?? cwd, ".pi", "agents"), "project-pi"), cwd, loaded),
475
500
  };
476
501
  // SEC-005: Store with current version stamp
477
502
  discoveryCache.set(cwd, { result, expiresAt: Date.now() + DISCOVERY_CACHE_TTL_MS, cacheVersion: currentVersion });
@@ -520,7 +545,13 @@ export function allAgents(discovery: AgentDiscoveryResult | undefined): AgentCon
520
545
  // Priority for disambiguation (security): project < builtin < user.
521
546
  // Project config cannot override trusted builtins (security-hardening).
522
547
  // Later entries in the loop overwrite earlier ones, so user wins.
523
- for (const agent of [...discovery.project, ...discovery.builtin, ...discovery.user]) {
548
+ // F1 (v0.7.9): `projectPi` is appended AFTER `project` so a `.pi/agents/foo.md`
549
+ // is a fallback to `.crew/agents/foo.md` within the project tier (the
550
+ // legacy pi-crew directory takes precedence when both exist). This
551
+ // matches `applyAgentOverrides` semantics and keeps the SECURITY warning
552
+ // gate on the same source. `projectPi` is optional in the result type
553
+ // (older test fixtures may omit it) — fall back to an empty array.
554
+ for (const agent of [...discovery.project, ...(discovery.projectPi ?? []), ...discovery.builtin, ...discovery.user]) {
524
555
  byName.set(agent.name.toLowerCase(), agent);
525
556
  }
526
557
  // Dynamic agents only fill gaps — they cannot override builtin/user agents.
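To see which entry survives for a duplicated name, the merge loop in this hunk can be reduced to a few lines. The sketch uses only what the hunk shows (a `Map` keyed by lower-cased name, filled in list order); the `AgentStub` type, `mergeByName`, and the sample data are hypothetical.

```ts
interface AgentStub {
  name: string;
  source: string;
}

function mergeByName(...tiers: AgentStub[][]): Map<string, AgentStub> {
  const byName = new Map<string, AgentStub>();
  for (const agent of tiers.flat()) {
    // Map.set replaces any existing value, so a later entry wins.
    byName.set(agent.name.toLowerCase(), agent);
  }
  return byName;
}

const projectTier: AgentStub[] = [{ name: "foo", source: "project" }];
const projectPiTier: AgentStub[] = [{ name: "Foo", source: "project-pi" }];

const merged = mergeByName(projectTier, projectPiTier);
console.log(merged.get("foo")?.source);
```

In this reduced form the `project-pi` entry replaces the `project` one, because it is listed later. If the legacy `.crew/agents/` file is meant to take precedence when both exist, as the comment states, that is worth confirming against the full function, which this hunk shows only in part.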
package/src/config/types.ts CHANGED
@@ -180,6 +180,14 @@ export interface CrewReliabilityConfig {
180
180
  cleanupOrphanedTempDirs?: boolean;
181
181
  /** Inject a compact ambient crew-status note into the agent's context on every LLM call while crew runs are in-flight, so the agent stays continuously aware of active runs without calling the `team` tool. No-op when no runs are active. Default: true. */
182
182
  ambientStatusInjection?: boolean;
183
+ /**
184
+ * Opt-in model scope enforcement (F7). When true, subagent model choices
185
+ * that fall outside the user's pi `enabledModels` allowlist are flagged:
186
+ * caller-supplied out-of-scope → hard error before spawn; frontmatter-
187
+ * pinned out-of-scope → warning + runs anyway. Default: false (no
188
+ * enforcement, fully back-compat).
189
+ */
190
+ scopeModels?: boolean;
183
191
  }
184
192
 
185
193
  export interface CrewOtlpConfig {
package/src/errors.ts CHANGED
@@ -38,6 +38,7 @@ export const ErrorCode = {
38
38
  EventLogLockTimeout: "E010", // Could not acquire the event-log file lock
39
39
  DepthLimitExceeded: "E011", // Pipeline/chain recursion depth limit hit (circular dep)
40
40
  RunStale: "E012", // Run reconciled as stale/zombie (heartbeat expired)
41
+ ModelOutOfScope: "E013", // Caller-supplied model is not in pi's enabledModels allowlist (F7 scope gate)
41
42
  } as const;
42
43
 
43
44
  export type ErrorCode = typeof ErrorCode[keyof typeof ErrorCode];
@@ -56,6 +57,7 @@ const DEFAULT_HELP: Record<ErrorCode, string | undefined> = {
  [ErrorCode.EventLogLockTimeout]: "Another process holds the event-log lock. Check for orphaned `.lock` files or stale pi-crew processes, then retry.",
  [ErrorCode.DepthLimitExceeded]: "A pipeline/chain exceeded the recursion depth limit, which usually indicates a circular stage dependency. Review step `dependsOn` chains.",
  [ErrorCode.RunStale]: "The worker stopped heartbeating and was treated as a zombie. Re-run the team (resume or fresh); if it recurs, check `runtime.executeWorkers` / system load.",
+ [ErrorCode.ModelOutOfScope]: "The requested model is not in your pi `enabledModels` allowlist. Either pick a model listed in `enabledModels` (settings.json) or extend the allowlist. The scope gate is opt-in — disable `runtime.reliability.scopeModels` to allow any model.",
  };
 
  /**
@@ -188,4 +190,11 @@ export const errors = {
  `Stale run reconciled (reason=${reason}).${age} The worker stopped heartbeating and was treated as dead/zombie.`,
  ).withContext("stale-run reconciliation");
  },
+
+ modelOutOfScope(model: string, patterns: string[]): CrewError {
+ return new CrewError(
+ ErrorCode.ModelOutOfScope,
+ `Requested model "${model}" is not in enabledModels scope (allowlist: [${patterns.join(", ")}])`,
+ ).withContext("F7 model scope gate — caller override rejected");
+ },
  } as const;
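To illustrate how the new `E013` code and the `modelOutOfScope` factory fit together, here is a self-contained sketch. The `CrewError` class below is a simplified stand-in written for this example (the real class is not shown in this diff), and the model and pattern names are made up; only the error code, the message template, and the `withContext` call shape come from the hunks above.

```typescript
// Const-object + derived union type, mirroring the pattern in errors.ts.
const ErrorCode = { ModelOutOfScope: "E013" } as const;
type ErrorCode = typeof ErrorCode[keyof typeof ErrorCode];

// Simplified stand-in for pi-crew's CrewError (assumption, not the real class).
class CrewError extends Error {
  code: ErrorCode;
  context?: string;

  constructor(code: ErrorCode, message: string) {
    super(message);
    this.name = "CrewError";
    this.code = code;
  }

  // Attaches a short origin label and returns the same error for chaining.
  withContext(context: string): this {
    this.context = context;
    return this;
  }
}

function modelOutOfScope(model: string, patterns: string[]): CrewError {
  return new CrewError(
    ErrorCode.ModelOutOfScope,
    `Requested model "${model}" is not in enabledModels scope (allowlist: [${patterns.join(", ")}])`,
  ).withContext("F7 model scope gate: caller override rejected");
}

// Hypothetical model id and allowlist, for illustration only.
const err = modelOutOfScope("vendor-x/big-model", ["vendor-a/*", "vendor-b/small"]);
console.log(err.code, err.message);
```

Because `ErrorCode` is derived from the `as const` object, a `Record<ErrorCode, string | undefined>` such as `DEFAULT_HELP` must list every code, so adding `E013` without a help entry would fail type checking.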
@@ -35,6 +35,7 @@ import type { AgentMessage } from "@earendil-works/pi-agent-core";
  import type { Message } from "@earendil-works/pi-ai";
  import type { ExtensionAPI, ContextEvent } from "@earendil-works/pi-coding-agent";
  import { collectInFlightRuns } from "./registration/compaction-guard.ts";
+ import { extractSessionId } from "../utils/session-utils.ts";
  import type { TeamRunManifest } from "../state/types.ts";
 
  /** Sentinel that marks an injected ambient-status user message. */
@@ -133,10 +134,10 @@ export interface AmbientContextResult {
  *
  * Exported for unit testing.
  */
- export function handleContextEvent(event: ContextEvent, cwd: string): AmbientContextResult | undefined {
+ export function handleContextEvent(event: ContextEvent, cwd: string, sessionId?: string): AmbientContextResult | undefined {
  let runs: TeamRunManifest[] = [];
  try {
- runs = collectInFlightRuns(cwd);
+ runs = collectInFlightRuns(cwd, sessionId);
  } catch {
  // State read failure → don't inject, don't crash. Pi catches handler
  // errors anyway, but we avoid noisy error emission for a best-effort
@@ -167,8 +168,16 @@ export function registerContextStatusInjection(
  opts: { enabled?: boolean } = {},
  ): void {
  if (opts.enabled === false) return;
- pi.on("context", (event: ContextEvent): AmbientContextResult | undefined => {
- const cwd = typeof process.cwd === "function" ? process.cwd() : ".";
- return handleContextEvent(event, cwd);
+ pi.on("context", (event: ContextEvent, ctx: unknown): AmbientContextResult | undefined => {
+ // crew state is per-project; use the session ctx cwd when available,
+ // falling back to process.cwd(). Thread the session id so ambient
+ // status only reflects runs owned by THIS session (the state store is
+ // per-project, shared across sessions).
+ const cwd =
+ typeof ctx === "object" && ctx !== null && typeof (ctx as { cwd?: unknown }).cwd === "string"
+ ? (ctx as { cwd: string }).cwd
+ : typeof process.cwd === "function" ? process.cwd() : ".";
+ const sessionId = extractSessionId(ctx);
+ return handleContextEvent(event, cwd, sessionId);
  });
  }
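The hunks above call a shared `extractSessionId(ctx)` helper from `../utils/session-utils.ts`, whose body is not part of this section. The sketch below shows what such an accessor could look like, modeled on the inline logic this release removes from the `session_start` handler (own-property descriptor lookup, then a non-empty string check). The function name `extractSessionIdSketch` is hypothetical and the real helper may differ, for example in whether it logs invalid values.

```typescript
// Hypothetical stand-in for extractSessionId; the real helper is not shown here.
function extractSessionIdSketch(ctx: unknown): string | undefined {
  if (typeof ctx !== "object" || ctx === null) return undefined;

  let raw: unknown;
  try {
    // Reads only an own data property. Accessor (getter) properties and
    // inherited properties yield undefined here, because their descriptor
    // has no `value` field or does not exist on the object itself.
    raw = Object.getOwnPropertyDescriptor(ctx, "sessionId")?.value;
  } catch {
    // Defensive: exotic objects may throw on descriptor lookup.
    return undefined;
  }

  // Accept only non-empty strings; everything else counts as "no session".
  return typeof raw === "string" && raw.length > 0 ? raw : undefined;
}

// Example: a plain context object carrying a session id.
const sessionId = extractSessionIdSketch({ cwd: "/tmp/project", sessionId: "abc-123" });
console.log(sessionId);
```

Returning `undefined` for anything that is not a non-empty string lets callers such as `handleContextEvent(event, cwd, sessionId)` treat the id as an optional filter rather than a required argument.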
@@ -91,6 +91,7 @@ import {
  userCrewRoot,
  } from "../utils/paths.ts";
  import { resolveContainedPath } from "../utils/safe-paths.ts";
+ import { extractSessionId } from "../utils/session-utils.ts";
  import { resetTimings, time } from "../utils/timings.ts";
  import {
  type PiCrewRpcHandle,
@@ -1242,24 +1243,9 @@ export function registerPiTeams(pi: ExtensionAPI): void {
  notifyActiveRuns(ctx);
 
  // Auto-cancel orphaned runs from dead sessions
- // Extract sessionId from context use Object.getOwnPropertyDescriptor
- // to safely access property without triggering Proxy traps, then validate.
- const rawSessionId =
- typeof ctx === "object" && ctx !== null
- ? Object.getOwnPropertyDescriptor(ctx, "sessionId")?.value
- : undefined;
- const currentSessionId =
- typeof rawSessionId === "string" && rawSessionId.length > 0
- ? rawSessionId
- : undefined;
- if (rawSessionId !== undefined && currentSessionId === undefined) {
- logInternalError(
- "register.sessionId.invalid",
- new Error(
- `Invalid session ID: expected non-empty string, got ${typeof rawSessionId}`,
- ),
- );
- }
+ // Extract sessionId from context via the shared safe accessor (handles
+ // untyped runtime property + defensive against exotic objects).
+ const currentSessionId = extractSessionId(ctx);
 
  // Defer ALL heavy cleanup to after the session_start handler returns.
  // These operations involve synchronous directory scanning (readdirSync, readFileSync)