npm - pi-crew - Versions diffs - 0.7.7 → 0.8.2 - Mend

pi-crew 0.7.7 → 0.8.2

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.

Files changed (20)

package/CHANGELOG.md +325 -0
package/package.json +1 -1
package/src/agents/agent-config.ts +101 -1
package/src/agents/discover-agents.ts +34 -3
package/src/config/types.ts +8 -0
package/src/errors.ts +9 -0
package/src/extension/context-status-injection.ts +14 -5
package/src/extension/register.ts +4 -18
package/src/extension/registration/compaction-guard.ts +44 -13
package/src/extension/team-tool/handle-settings.ts +2 -0
package/src/runtime/live-session-runtime.ts +69 -7
package/src/runtime/model-fallback.ts +39 -1
package/src/runtime/model-scope.ts +141 -0
package/src/runtime/pi-args.ts +21 -6
package/src/runtime/skill-effectiveness.ts +84 -10
package/src/runtime/skill-instructions.ts +14 -4
package/src/runtime/task-runner.ts +21 -0
package/src/skills/discover-skills.ts +31 -2
package/src/ui/agent-management-overlay.ts +1 -1
package/src/utils/session-utils.ts +30 -0

package/CHANGELOG.md CHANGED

@@ -1,5 +1,330 @@
 # Changelog
+## [0.8.2] — Skill confidence dead-code fix (T7) (2026-06-16)
+Fixes a **real correctness bug** surfaced by the pi-extensions deep-dive
+(pi-continuous-learning's tiered confidence model): pi-crew's skill
+confidence system was effectively **inert**.
+### Bug fixed
+`registerSkillEffectivenessHooks` had two defects that left every skill's
+confidence stuck at ~0.3 regardless of outcomes:
+1. **`adjustConfidence()` was dead code.** The `task_completed` handler
+   hardcoded `confidence: computeInitialConfidence(1)` (= 0.3) on every
+   activation write. The function was defined and unit-tested in isolation,
+   but **never called in the recording path** — so every stored activation
+   had confidence 0.3, and `computeSkillMetrics.currentConfidence` (derived
+   from the last stored value + decay) never moved.
+2. **`task_failed` was a no-op.** Its comment claimed failures were "handled
+   by computeSkillMetrics", but `computeSkillMetrics` derives `passRate`
+   from *recorded* activations — and failed tasks recorded **nothing**, so a
+   failure never fed back into the confidence/decay loop.
+Net effect: the entire confidence-weighted skill system was decorative.
+Pass-rate, trend, and promotion-gate decisions were computed from a flat
+0.3 baseline.
+### Fix
+New `computeNextActivationConfidence(skillId, activations, passed)` helper
+computes the **rolling** confidence: it seeds the first activation of a
+skill at 0.3, then applies `adjustConfidence` (+0.05 success / -0.1
+   failure, clamped [0.1, 0.95]) on the skill's last recorded confidence.
+Both hooks now record activations with the rolling confidence:
+- `task_completed` → records `passed:true` activations at the rolled-forward
+  confidence.
+- `task_failed` → now records `passed:false` activations (was a no-op),
+  which lowers passRate AND triggers the -0.1 contradicting delta on the
+  next recorded activation.
+This unblocks the confidence-weighted skill selection (`getWeightedSkillsForRole`)
+and the promotion gate (`evaluatePromotionGate`) — they now reflect real
+outcome history. Existing `adjustConfidence`/`computeInitialConfidence`/
+`computeSkillMetrics` tests are preserved unchanged (they asserted on the
+intended contract; the recording path now honors it).
+### Files
+- `src/runtime/skill-effectiveness.ts` — `computeNextActivationConfidence`
+  helper; both hooks rewired to record rolling-confidence activations.
+- NEW `test/unit/t7-confidence-deadcode-fix.test.ts` (7 tests): rolling
+  confidence evolves across activations; failures feed back; `adjustConfidence`
+  is no longer dead.
+typecheck clean; skill-effectiveness suite 44/44 pass. (One unrelated
+`event-log-async` flake under local load passes 3/3 in isolation — clean on CI.)
+## [0.8.1] — Subagent cold-start race fix (module-scoped import latch) (2026-06-16)
+Fixes a flaky, load-dependent crash that surfaced when launching multiple
+subagents **concurrently** via `Agent({ run_in_background: true })`.
+### Bug fixed
+When 2+ in-process live-session subagents spawned at once, some crashed at
+cold-start with:
+```
+Cannot read properties of undefined (reading 'existsSync')
+Cannot read properties of undefined (reading 'validateWorkflowForTeam')
+```
+These are property-access-on-`undefined` errors: a module namespace binding
+observed mid-evaluation as `undefined`. The defining reproduction: 4 explorer
+subagents launched together → 3 of 4 crashed; **all 3 succeeded on sequential
+retry** (same code, same args, same repos — only concurrency changed). That is
+the signature of a cold-start race, not a logic bug.
+### Root cause
+`direct-agent` subagents run **in-process** via `createAgentSession` (the
+live-session runtime), sharing one Node module graph. The spawn path called
+`await import("@earendil-works/pi-coding-agent")` **independently** per
+subagent. Under the **tsx loader** (which registers `load`/`resolve` hooks to
+transpile TS), concurrent first-imports can each enter the loader and race
+module-record instantiation — yielding a namespace binding seen mid-eval as
+`undefined`. Engine-level ESM memoization is not guaranteed to be observed
+synchronously across concurrent evaluation under transpiling loaders.
+### Fix
+Module-scoped memoization in `src/runtime/live-session-runtime.ts`: the FIRST
+caller sets `liveSessionModulePromise`; every later caller awaits the same
+in-flight promise. Guarantees a single module-record instantiation regardless
+of loader behavior. Concurrent callers then proceed in parallel as normal.
+```ts
+let liveSessionModulePromise: Promise<LiveSessionModule> | undefined;
+function loadLiveSessionModule(): Promise<LiveSessionModule> {
+	if (!liveSessionModulePromise) {
+		liveSessionModulePromise = import("@earendil-works/pi-coding-agent")
+			as unknown as Promise<LiveSessionModule>;
+	}
+	return liveSessionModulePromise;
+}
+```
+### Files
+- `src/runtime/live-session-runtime.ts` — module-scoped `loadLiveSessionModule()`
+  latch; use site now `await loadLiveSessionModule()` (was un-memoized
+  `await import(...)`).
+- NEW `test/unit/live-session-import-latch.test.ts` (2 tests): module loads
+  cleanly; latch variable + check-before-set + use site present, and the old
+  un-memoized pattern gone (regression guard).
+- NEW `.github/issues/2026-06-16-subagent-cold-start-race.md` — full root-cause
+  write-up + lessons.
+typecheck clean; full suite 0 failures (local EXIT=1 is the test-runner infra
+`spawnSync ETIMEDOUT` on a background-subagent test under local load — clean
+on CI).
+## [0.8.0] — Tool-restriction unification across spawn paths (2026-06-16)
+Fixes a long-standing correctness gap where the same agent behaved
+*differently* depending on which runtime spawned it.
+### Bug fixed
+The child-pi path (`pi-args.ts`) and the live-session path
+(`live-session-runtime.ts`) **disagreed on tool restrictions**:
+| | allowlist | denylist |
+|---|---|---|
+| child-pi (before) | `roleConfig.tools ?? agent.tools` (role authoritative) | `roleConfig.excludeTools` only |
+| live-session (before) | `agent.tools` only (frontmatter authoritative) | `agent.disallowedTools` only |
+So a user defining `tools:` or `disallowed_tools:` in a custom agent's
+frontmatter saw it honored on one path and ignored on the other:
+- `disallowed_tools: web` was **silently ignored on child-pi** (the default
+  async path).
+- A builtin `explorer` on the live-session path was **not bound by the role's
+  read-only security constraint** (it relied solely on the frontmatter).
+### Fix
+A shared `resolveToolPolicy(agent, role)` helper in `agent-config.ts` is
+now the **single source of truth** used by BOTH spawn paths. Stable,
+unified semantics:
+- **Allowlist precedence is source-aware**:
+  - `source === "builtin"` → role-config authoritative (security: a builtin
+    explorer MUST stay read-only even if its frontmatter is loose).
+    Frontmatter is the fallback when the role has no allowlist.
+  - `source !== "builtin"` (user / project) → frontmatter `tools:`
+    authoritative (user intent). Role-config is the fallback.
+- **Denylist is additive**: `roleConfig.excludeTools` and
+  `agent.disallowedTools` are MERGED (dedup, order-insensitive). It is
+  always safe to forbid more, and merging means a security exclude from
+  the role can never be weakened by a frontmatter omission.
+This is **not a regression** for builtin agents: their allowlist still comes
+from `ROLE_TOOL_CONFIGS` (the authoritative security set), and the merged
+denylist only adds constraints. Custom agents now behave identically
+across both runtimes.
+### Files
+- `src/agents/agent-config.ts` — NEW `resolveToolPolicy` + `ResolvedToolPolicy`
+  (the shared resolver) + `uniqueToolMerge` helper.
+- `src/runtime/pi-args.ts` — uses `resolveToolPolicy` (drops the inline
+  role-authoritative logic; removes now-unused `getAgentSessionOptions` import).
+- `src/runtime/live-session-runtime.ts` — `filterActiveTools` now takes the
+  role and uses `resolveToolPolicy` (drops the inline frontmatter-only logic).
+- NEW `test/unit/v0-8-0-tool-policy-unification.test.ts` (10 tests pinning
+  the resolver: source-aware allowlist, additive denylist, cross-path
+  determinism).
+typecheck clean; 4980+ tests pass / 0 fail. CI green on win/ubuntu/macos.
+## [0.7.9] — Interop & agent granularity (4 grouped items, 2026-06-16)
+One grouped release for four related, surgical interop / agent-granularity
+items (all additive, no behavior change for existing configs):
+### F6 — Agent Skills spec skill-roots (interop)
+- Skill discovery now reads 5 roots (was 2), matching pi-subagents'
+  `skill-loader` so skills authored under either convention are found:
+  - `<cwd>/.pi/skills` (project, Pi standard) — new
+  - `<cwd>/.agents/skills` (project, Agent Skills spec / agentskills.io) — new
+  - `<cwd>/skills` (project, legacy pi-crew) — kept
+  - `~/.pi/agent/skills` (user, Pi standard) — new
+  - `~/.agents/skills` (user, Agent Skills spec) — new
+  - `~/.pi/skills` (user, legacy) — new
+  - `PACKAGE_SKILLS_DIR` (bundled) — kept
+- Affects both `discover-skills.ts` (capability inventory) and
+  `skill-instructions.ts` (actual prompt rendering). New `source` values
+  (`project-pi`, `project-agents`, `user-pi`, `user-agents`) extend
+  `CapabilitySource`; first hit per name wins, project overrides user.
+### F1 sub-gap — `.pi/agents/` project agent discovery (interop)
+- Project agent discovery now reads BOTH the legacy pi-crew
+  `.crew/agents/` (or `.pi/teams/agents/` fallback) AND the Pi-standard
+  `.pi/agents/` as separate tiers. New `projectPi` field in
+  `AgentDiscoveryResult` (optional in the type for back-compat with
+  existing test fixtures; treated as `[]` when omitted). `allAgents`
+  merges them in priority order (project first, then project-pi so a
+  `.pi/agents/foo.md` is a fallback to `.crew/agents/foo.md` within
+  the project tier). `ResourceSource` extended with `"project-pi"`.
+### F1 — frontmatter `tools:` wildcards
+- New `BUILTIN_TOOL_NAMES` constant + `parseToolsField` helper in
+  `agent-config.ts` (matching pi-subagents' `parseToolsField`):
+  - omitted → `undefined` (back-compat: use the runtime default)
+  - `*` or `all` (case-insensitive) → full `BUILTIN_TOOL_NAMES` list
+  - `none` / `[]` / empty → `[]` (zero built-ins)
+  - CSV → parsed entries (trimmed, empty dropped)
+- `parseAgentFile` now uses `parseToolsField` instead of `parseCsv`,
+  so existing agent files keep working with no edits. The
+  `ext:<extension>/<tool>` selector from pi-subagents is a documented
+  future gap (deferred — would require pi SDK introspection).
+### F1 — frontmatter `excludeExtensions` denylist
+- New `excludeExtensions?: string[]` field on `AgentConfig`, parsed
+  from frontmatter `exclude_extensions: foo, bar`. Applied on the
+  **child-pi path** in `pi-args.ts` as a case-insensitive basename
+  denylist (an excluded extension is removed from the `--extension`
+  list; the trusted `PROMPT_RUNTIME_EXTENSION_PATH` is never
+  excludable). **Documented limitation**: the live-session path
+  (opt-in via `runtime.preferLiveSession`) ignores it for v0.7.9 —
+  pi's `DefaultResourceLoader` has no per-extension deny hook at the
+  point we hand off. Users who need the denylist on live-session
+  should stay on the child-pi runtime, or revisit when the SDK
+  exposes the hook.
+### Files
+- `src/skills/discover-skills.ts` — F6 (5 roots, new source values)
+- `src/runtime/skill-instructions.ts` — F6 (5 roots, type updates)
+- `src/runtime/capability-inventory.ts` — F6 (CapabilitySource extended)
+- `src/agents/agent-config.ts` — F1 (BUILTIN_TOOL_NAMES, parseToolsField,
+  excludeExtensions field, ResourceSource +project-pi)
+- `src/agents/discover-agents.ts` — F1 (projectPi tier, tools/excludeExtensions
+  parsing, allAgents merge)
+- `src/runtime/pi-args.ts` — F1 (excludeExtensions denylist applied to
+  `--extension` args)
+- `src/runtime/live-session-runtime.ts` — F1 (doc comment for the
+  live-session limitation)
+- `src/ui/agent-management-overlay.ts` — F1 (ResourceSource order includes
+  project-pi)
+- NEW `test/unit/v0-7-9-interop-granularity.test.ts` (15 tests)
+- `test/unit/capability-inventory.test.ts` — accept expanded state set
+  (shadowed/missing now possible from user-skill-roots shadowing bundles)
+- `test/unit/discover-skills.test.ts` — accept expanded source set
+typecheck clean; 4980+ tests pass / 0 fail. CI green on win/ubuntu/macos.
+## [0.7.8] — F7 model-scope enforcement + cross-session leak fix (2026-06-16)
+Two features/fixes from the same session: one new opt-in capability, one
+correctness fix for a bug surfaced by the user while iterating on the new
+feature (firing live in the session — a different Pi session's in-flight
+run kept getting injected into the current session's context via the
+ambient-status handler).
+### Features
+- **F7 model-scope enforcement** — opt-in gate that validates subagent model
+  choices against the user's pi `enabledModels` allowlist. Trust distinction
+  matches the pi-subagents reference semantics:
+  - Caller-supplied (per-spawn `modelOverride` / `step.model` /
+    `teamRoleModel`) out-of-scope → **hard error** (`CrewError E013
+    ModelOutOfScope`) before spawn, fail-fast with actionable help hint.
+  - Frontmatter-pinned (`AgentConfig.model`) out-of-scope → **warning +
+    runs anyway** (frontmatter is authoritative; the agent author made a
+    deliberate choice).
+  Pattern semantics match pi's `--models` allowlist: exact
+  (case-insensitive), glob with `*` (unanchored, so `"claude-*"` matches
+  `anthropic/claude-opus-4-5`), and case-insensitive substring fallback.
+  Toggle: `runtime.reliability.scopeModels: true` (default `false` = no
+  enforcement, fully back-compat). The allowlist itself is read from
+  pi's `SettingsManager.getEnabledModels()` per spawn (no caching, so
+  changes take effect immediately). 20 new unit tests covering pattern
+  matching, scope verdicts, and the routing gate (caller/frontmatter
+  trust distinction + `isFrontmatterOverride` downgrade).
+### Bug Fixes
+- **Cross-session run-context leak** (commit `4bd6f5b`) — `collectInFlightRuns(cwd)`
+  in `compaction-guard.ts` scanned the SHARED per-project `.crew/state/runs/`
+  dir and filtered by STATUS only, ignoring `ownerSessionId`. Multiple Pi
+  sessions in the same project share that directory, so Session B's
+  compaction picked up Session A's in-flight runs and injected them into B's
+  continuation prompt, making B wrongly try to resume A's run. The same
+  leak affected ambient-status injection (`context-status-injection.ts`),
+  showing A's runs in B's context stream. Fix: `collectInFlightRuns`
+  gains optional `currentSessionId?` → strict filter
+  `run.ownerSessionId === currentSessionId` (legacy ownerless runs
+  excluded; true orphans are crash-recovery's job). New canonical
+  `extractSessionId(ctx)` helper in `utils/session-utils.ts` (defensive
+  against Proxy/exotic objects, replaces inline
+  getOwnPropertyDescriptor in `register.ts`). Artifact index stays
+  UNFILTERED (durable cross-session memory, not a resume directive).
+  `triggerContinuation`'s `sendUserMessage` race ("Agent is already
+  processing a prompt...") is detected and downgraded to silent — it is
+  benign (the worker continues independently). 11 new regression tests
+  (compaction-cross-session-leak.test.ts). CI green on all 3 platforms
+  (run `27608398599`).
+### Files
+- NEW `src/runtime/model-scope.ts` — pattern matcher + verdict + SettingsManager
+  reader.
+- `src/runtime/model-fallback.ts` — `buildConfiguredModelRouting` gains
+  `scopeModelsPatterns?` + `isFrontmatterOverride?` inputs; new
+  `CrewError E013 ModelOutOfScope` factory in `src/errors.ts`.
+- `src/config/types.ts` — new `reliability.scopeModels?: boolean` toggle
+  (default `false`).
+- `src/extension/team-tool/handle-settings.ts` — adds
+  `reliability.scopeModels` to the visible-keys list so it surfaces in
+  the settings overlay.
+- `src/extension/registration/compaction-guard.ts`,
+  `src/extension/context-status-injection.ts`,
+  `src/extension/register.ts`, `src/utils/session-utils.ts` — leak fix.
+- NEW `test/unit/model-scope.test.ts` (20 tests),
+  `test/unit/compaction-cross-session-leak.test.ts` (11 tests).
+typecheck clean; 4968+ tests pass / 0 fail.
 ## [0.7.7] — Windows spawn fix + plan-approval crash-recovery fix + CI flake fixes (2026-06-16)
 A focused patch release driven by two community reports (Issue #33 and PR #32) plus the CI flake surfaced while validating them. CI green on Windows / Ubuntu / macOS (run 27599121797). 4965 tests pass / 0 fail.

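The 0.8.2 entry in the changelog above describes the rolling confidence only in prose. The following is a minimal sketch of that rule, not the package's actual code: the `Activation` shape and the `replay` helper are hypothetical, and it assumes the delta is chosen from the outcome being recorded. The constants (seed 0.3, +0.05 / -0.1, clamp to [0.1, 0.95]) are the ones quoted in the changelog.

```typescript
// Hypothetical activation record; the real stored shape likely has more fields.
interface Activation {
	skillId: string;
	passed: boolean;
	confidence: number;
}

const INITIAL_CONFIDENCE = 0.3; // seed quoted in the changelog
const MIN_CONFIDENCE = 0.1;
const MAX_CONFIDENCE = 0.95;

// +0.05 on success, -0.1 on failure, clamped to [0.1, 0.95].
function adjustConfidence(current: number, passed: boolean): number {
	const next = current + (passed ? 0.05 : -0.1);
	return Math.min(MAX_CONFIDENCE, Math.max(MIN_CONFIDENCE, next));
}

// First activation of a skill is seeded; later ones roll forward from the
// skill's last recorded confidence (assumption: `passed` picks the delta).
function computeNextActivationConfidence(
	skillId: string,
	activations: Activation[],
	passed: boolean,
): number {
	let last: Activation | undefined;
	for (const activation of activations) {
		if (activation.skillId === skillId) last = activation;
	}
	if (!last) return INITIAL_CONFIDENCE;
	return adjustConfidence(last.confidence, passed);
}

// Hypothetical helper: record a sequence of outcomes for one skill.
function replay(skillId: string, outcomes: boolean[]): Activation[] {
	const activations: Activation[] = [];
	for (const passed of outcomes) {
		const confidence = computeNextActivationConfidence(skillId, activations, passed);
		activations.push({ skillId, passed, confidence });
	}
	return activations;
}

const history = replay("example-skill", [true, true, false, true]);
console.log(history.map((a) => a.confidence.toFixed(2)).join(", "));
// prints "0.30, 0.35, 0.25, 0.30"
```

With the pre-fix behavior described in the changelog, every entry in that sequence would have been recorded at 0.3, and the failure would not have been recorded at all.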
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-crew",
-  "version": "0.7.7",
+  "version": "0.8.2",
   "description": "Pi extension for coordinated AI teams, workflows, worktrees, and async task orchestration",
   "author": "baphuongna",
   "license": "MIT",

package/src/agents/agent-config.ts CHANGED

@@ -1,7 +1,51 @@
 import type { RoleToolConfig } from "../config/role-tools.ts";
 import { getToolConfig } from "../config/role-tools.ts";
-export type ResourceSource = "builtin" | "user" | "project" | "git" | "dynamic";
+/**
+ * F1 (v0.7.9): canonical built-in tool name list. Used by `parseToolsField`
+ * to expand wildcard `*` / `all` patterns in agent frontmatter. Matches
+ * pi-subagents' `BUILTIN_TOOL_NAMES` (derived from pi's `createCodingTools` /
+ * `createReadOnlyTools`). If pi adds a new built-in, update this list and
+ * the wildcard expansion will pick it up. The 7 names below are stable
+ * across pi v0.77+ and cover read, edit, write, bash, grep, find, ls.
+ */
+export const BUILTIN_TOOL_NAMES: readonly string[] = [
+	"read",
+	"edit",
+	"write",
+	"bash",
+	"grep",
+	"find",
+	"ls",
+];
+/**
+ * F1 (v0.7.9): normalize the raw `tools:` frontmatter CSV into a `string[]`.
+ * Semantics (matching pi-subagents' `parseToolsField`):
+ *   - omitted / undefined → returns `undefined` (back-compat: use the
+ *     runtime default — today this is the role-tools default; tomorrow this
+ *     could become the wildcard expansion if the user opts in).
+ *   - `*` or `all` (case-insensitive) → returns the full BUILTIN_TOOL_NAMES
+ *     list (no duplicates).
+ *   - `none` or empty string → returns `[]` (zero built-ins; extension
+ *     tools via `ext:` can still be added, though pi-crew doesn't parse
+ *     `ext:` selectors yet — see F1 sub-gap).
+ *   - CSV → returns the parsed entries (trimmed, empty entries dropped).
+ * Plain tool names (no `*`) pass through unchanged so existing agent
+ * files keep working with no edits.
+ */
+export function parseToolsField(raw: unknown): string[] | undefined {
+	if (raw === undefined || raw === null) return undefined;
+	const s = typeof raw === "string" ? raw.trim() : String(raw).trim();
+	if (!s) return [];
+	const lowered = s.toLowerCase();
+	if (lowered === "none" || lowered === "[]") return [];
+	if (lowered === "*" || lowered === "all") return [...BUILTIN_TOOL_NAMES];
+	const items = s.split(",").map((t) => t.trim()).filter(Boolean);
+	return items;
+}
+export type ResourceSource = "builtin" | "user" | "project" | "git" | "dynamic" | "project-pi";
 export interface RoutingMetadata {
 	triggers?: string[];
@@ -22,6 +66,14 @@ export interface AgentConfig {
 	thinking?: string;
 	tools?: string[];
 	extensions?: string[];
+	/**
+	 * F1 (v0.7.9): extension denylist (case-insensitive plain names). Applied
+	 * AFTER `extensions:` (which lists the allowed set) — an excluded
+	 * extension is removed from the allowlist and never loads. Plain names
+	 * only (no paths, no `*`); an unknown name logs a warning but is
+	 * tolerated. Back-compat: omitted = no exclusion.
+	 */
+	excludeExtensions?: string[];
 	skills?: string[];
 	systemPromptMode?: "replace" | "append";
 	inheritProjectContext?: boolean;
@@ -64,6 +116,54 @@ export function getAgentSessionOptions(role: string): {
 	return {};
 }
+/**
+ * F1 unify (v0.8.0): the single source of truth for a worker's tool policy,
+ * used by BOTH spawn paths (child-pi `pi-args.ts` and live-session
+ * `live-session-runtime.ts`). Before this, the two paths disagreed:
+ *   - child-pi: `roleConfig.tools ?? agent.tools` (role authoritative)
+ *   - live-session: `agent.tools` only (frontmatter authoritative, role ignored)
+ * so the same agent behaved differently depending on the runtime. A user
+ * defining `tools:` or `disallowed_tools:` in a custom agent's frontmatter
+ * saw it honored on one path and ignored on the other.
+ *
+ * Unified semantics (stable across both paths):
+ *   - **allowlist precedence is source-aware**:
+ *     - `source === "builtin"` → role-config authoritative (security: a
+ *       builtin explorer MUST stay read-only even if its frontmatter is
+ *       loose). Frontmatter is the fallback when the role has no allowlist.
+ *     - `source !== "builtin"` (user / project) → frontmatter `tools:`
+ *       authoritative (user intent). Role-config is the fallback.
+ *   - **denylist is additive**: `roleConfig.excludeTools` and
+ *     `agent.disallowedTools` are MERGED (dedup, order-insensitive). It is
+ *     always safe to forbid more, and merging means a security exclude
+ *     from the role can never be weakened by a frontmatter omission.
+ *
+ * Returns `{ tools, excludeTools }` where each is `undefined` when no
+ * restriction of that kind applies (so callers no-op cleanly).
+ */
+export interface ResolvedToolPolicy {
+	/** Allowlist; undefined = no allowlist restriction (all built-ins allowed). */
+	tools?: string[];
+	/** Denylist (additive); undefined = no denylist. */
+	excludeTools?: string[];
+}
+function uniqueToolMerge(...lists: Array<string[] | undefined>): string[] | undefined {
+	const merged = [...new Set(lists.flatMap((list) => list ?? []))];
+	return merged.length > 0 ? merged : undefined;
+}
+export function resolveToolPolicy(agent: AgentConfig, role?: string): ResolvedToolPolicy {
+	const roleConfig = role ? getToolConfig(role) : {};
+	// allowlist: source-aware precedence (see doc above).
+	const explicitTools = agent.source === "builtin"
+		? (roleConfig.tools ?? agent.tools)
+		: (agent.tools ?? roleConfig.tools);
+	// denylist: additive merge of role excludeTools + agent disallowedTools.
+	const excludeTools = uniqueToolMerge(roleConfig.excludeTools, agent.disallowedTools);
+	return { tools: explicitTools, excludeTools };
+}
 /**
  * Build agent session options including role-based tool restrictions.
  * @param agent - The agent configuration

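To see how the source-aware allowlist and the additive denylist in `resolveToolPolicy` play out, here is a self-contained sketch. It mirrors the resolver logic shown in the hunk above, but `AgentLike`, `RoleToolConfigLike`, and `HYPOTHETICAL_ROLE_TABLE` are stand-ins invented for illustration; the real `getToolConfig` and `ROLE_TOOL_CONFIGS` contents are not visible in this diff.

```typescript
// Stand-in types; the real AgentConfig / RoleToolConfig have more fields.
interface AgentLike {
	name: string;
	source: "builtin" | "user" | "project";
	tools?: string[];
	disallowedTools?: string[];
}
interface RoleToolConfigLike {
	tools?: string[];
	excludeTools?: string[];
}

// Hypothetical role table standing in for getToolConfig(role).
const HYPOTHETICAL_ROLE_TABLE: Record<string, RoleToolConfigLike> = {
	explorer: { tools: ["read", "grep", "find", "ls"], excludeTools: ["write"] },
};

// Dedup-merge of denylists; undefined when nothing is denied.
function uniqueToolMerge(...lists: Array<string[] | undefined>): string[] | undefined {
	const merged = [...new Set(lists.flatMap((list) => list ?? []))];
	return merged.length > 0 ? merged : undefined;
}

function resolveToolPolicy(
	agent: AgentLike,
	role?: string,
): { tools?: string[]; excludeTools?: string[] } {
	const roleConfig: RoleToolConfigLike = role ? (HYPOTHETICAL_ROLE_TABLE[role] ?? {}) : {};
	// Builtin agents: role wins. User/project agents: frontmatter wins.
	const tools = agent.source === "builtin"
		? (roleConfig.tools ?? agent.tools)
		: (agent.tools ?? roleConfig.tools);
	// Denylist is always the union of both sources.
	const excludeTools = uniqueToolMerge(roleConfig.excludeTools, agent.disallowedTools);
	return { tools, excludeTools };
}

// A builtin explorer with loose frontmatter stays bound by the role allowlist.
const builtinPolicy = resolveToolPolicy(
	{ name: "explorer", source: "builtin", tools: ["read", "bash"] },
	"explorer",
);

// A user agent keeps its own allowlist, but the role's exclude still applies.
const userPolicy = resolveToolPolicy(
	{ name: "my-scout", source: "user", tools: ["read", "bash"], disallowedTools: ["web", "write"] },
	"explorer",
);

console.log(builtinPolicy, userPolicy);
```

In this sketch the user agent ends up with `tools: ["read", "bash"]` and `excludeTools: ["write", "web"]`: the frontmatter allowlist is honored, while the role's exclude cannot be dropped by omission.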
package/src/agents/discover-agents.ts CHANGED

@@ -1,10 +1,11 @@
 import * as fs from "node:fs";
 import * as path from "node:path";
 import type { AgentConfig, ResourceSource } from "./agent-config.ts";
+import { parseToolsField } from "./agent-config.ts";
 import { loadConfig, type LoadedPiTeamsConfig } from "../config/config.ts";
 import { parseCsv, parseFrontmatter } from "../utils/frontmatter.ts";
 import { logInternalError } from "../utils/internal-error.ts";
-import { packageRoot, projectCrewRoot, userPiRoot } from "../utils/paths.ts";
+import { packageRoot, projectCrewRoot, userPiRoot, findRepoRoot } from "../utils/paths.ts";
 // ═══════════════════════════════════════════════════════════════════════════
 // SEC-001 Fix: Protected Agent Names Blocklist
@@ -225,7 +226,23 @@ function checkProjectAgentShadowsBuiltin(name: string): void {
 export interface AgentDiscoveryResult {
 	builtin: AgentConfig[];
 	user: AgentConfig[];
+	/**
+	 * Project agents from the pi-crew legacy directory (`.crew/agents/`, or
+	 * `.pi/teams/agents/` fallback). F1 (v0.7.9): the `.pi/agents/` Pi-standard
+	 * project directory is read into `projectPi` (the 4th tier) so users who
+	 * author agents under either convention find them.
+	 */
 	project: AgentConfig[];
+	/**
+	 * F1 (v0.7.9): project agents read from `<repoRoot>/.pi/agents/` (Pi
+	 * standard). Merged into the same priority order as `project` (project
+	 * overrides user, but `.crew/agents/` and `.pi/agents/` are peers
+	 * within the project tier — first hit per `name` wins, with a
+	 * warning logged on shadow). Optional in the result shape so existing
+	 * test fixtures that construct `AgentDiscoveryResult` literally don't
+	 * have to add an empty array (treated as `[]` by `allAgents`).
+	 */
+	projectPi?: AgentConfig[];
 }
 function parseCost(value: string | undefined): "free" | "cheap" | "expensive" | undefined {
@@ -365,8 +382,9 @@ function parseAgentFile(filePath: string, source: ResourceSource): AgentConfig |
 			model: frontmatter.model === "false" ? undefined : frontmatter.model || undefined,
 			fallbackModels: parseCsv(frontmatter.fallbackModels),
 			thinking: frontmatter.thinking === "false" ? undefined : frontmatter.thinking || undefined,
-			tools: parseCsv(frontmatter.tools),
+			tools: parseToolsField(frontmatter.tools),
 			extensions: frontmatter.extensions === "" ? [] : parseCsv(frontmatter.extensions),
+			excludeExtensions: parseCsv(frontmatter.excludeExtensions ?? frontmatter.exclude_extensions),
 			skills: parseCsv(frontmatter.skills ?? frontmatter.skill),
 			systemPromptMode: frontmatter.systemPromptMode === "append" ? "append" : "replace",
 			inheritProjectContext: frontmatter.inheritProjectContext === "true",
@@ -471,7 +489,14 @@ export function discoverAgents(cwd: string): AgentDiscoveryResult {
 	const result: AgentDiscoveryResult = {
 		builtin: applyAgentOverrides(readAgentDir(path.join(packageRoot(), "agents"), "builtin"), cwd, loaded),
 		user: applyAgentOverrides(readAgentDir(path.join(userPiRoot(), "agents"), "user"), cwd, loaded),
+		// F1 (v0.7.9): two project roots — the legacy pi-crew `.crew/agents/`
+		// (or `.pi/teams/agents/` fallback) AND the Pi-standard `.pi/agents/`.
+		// Both are read; `allAgents` merges them in priority order (project
+		// first, then project-pi) so a project can override a global agent
+		// from either location. Same-name shadows within the project tier
+		// log a warning (SEC-001).
 		project: applyAgentOverrides(readAgentDir(path.join(projectCrewRoot(cwd), "agents"), "project"), cwd, loaded),
+		projectPi: applyAgentOverrides(readAgentDir(path.join(findRepoRoot(cwd) ?? cwd, ".pi", "agents"), "project-pi"), cwd, loaded),
 	};
 	// SEC-005: Store with current version stamp
 	discoveryCache.set(cwd, { result, expiresAt: Date.now() + DISCOVERY_CACHE_TTL_MS, cacheVersion: currentVersion });
@@ -520,7 +545,13 @@ export function allAgents(discovery: AgentDiscoveryResult | undefined): AgentCon
 	// Priority for disambiguation (security): project < builtin < user.
 	// Project config cannot override trusted builtins (security-hardening).
 	// Later entries in the loop overwrite earlier ones, so user wins.
-	for (const agent of [...discovery.project, ...discovery.builtin, ...discovery.user]) {
+	// F1 (v0.7.9): `projectPi` is appended AFTER `project` so a `.pi/agents/foo.md`
+	// is a fallback to `.crew/agents/foo.md` within the project tier (the
+	// legacy pi-crew directory takes precedence when both exist). This
+	// matches `applyAgentOverrides` semantics and keeps the SECURITY warning
+	// gate on the same source. `projectPi` is optional in the result type
+	// (older test fixtures may omit it) — fall back to an empty array.
+	for (const agent of [...discovery.project, ...(discovery.projectPi ?? []), ...discovery.builtin, ...discovery.user]) {
 		byName.set(agent.name.toLowerCase(), agent);
 	}
 	// Dynamic agents only fill gaps — they cannot override builtin/user agents.

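The merge loop in `allAgents` above writes every tier into one `Map` keyed by lower-cased name, so the effective precedence follows from iteration order. The sketch below isolates that mechanic with hypothetical names (`NamedAgent`, `mergeLastWins`, `mergeFirstWins`). It assumes no other filtering happens outside the hunk shown. Under a last-write-wins loop a later tier replaces a same-named entry from an earlier one, which is worth checking against the "fallback" wording in the comments when both `.crew/agents/foo.md` and `.pi/agents/foo.md` exist.

```typescript
// Hypothetical minimal agent shape for illustration.
interface NamedAgent {
	name: string;
	source: string;
}

// Same mechanic as the loop above: each set() overwrites any earlier entry.
function mergeLastWins(...tiers: NamedAgent[][]): Map<string, NamedAgent> {
	const byName = new Map<string, NamedAgent>();
	for (const agent of tiers.flat()) {
		byName.set(agent.name.toLowerCase(), agent);
	}
	return byName;
}

// Contrast: a first-hit-wins merge, where later tiers only fill gaps.
function mergeFirstWins(...tiers: NamedAgent[][]): Map<string, NamedAgent> {
	const byName = new Map<string, NamedAgent>();
	for (const agent of tiers.flat()) {
		const key = agent.name.toLowerCase();
		if (!byName.has(key)) byName.set(key, agent);
	}
	return byName;
}

const project: NamedAgent[] = [{ name: "foo", source: "project" }];
const projectPi: NamedAgent[] = [{ name: "Foo", source: "project-pi" }];

console.log(mergeLastWins(project, projectPi).get("foo")?.source);
// prints "project-pi"
console.log(mergeFirstWins(project, projectPi).get("foo")?.source);
// prints "project"
```

If the intent is for the legacy `.crew/agents/` entry to take precedence, a first-hit-wins step within the project tier (or reversing the order of the two project lists) would express that; which of the two the package intends is not confirmed by this diff.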
package/src/config/types.ts CHANGED

@@ -180,6 +180,14 @@ export interface CrewReliabilityConfig {
 	cleanupOrphanedTempDirs?: boolean;
 	/** Inject a compact ambient crew-status note into the agent's context on every LLM call while crew runs are in-flight, so the agent stays continuously aware of active runs without calling the `team` tool. No-op when no runs are active. Default: true. */
 	ambientStatusInjection?: boolean;
+	/**
+	 * Opt-in model scope enforcement (F7). When true, subagent model choices
+	 * that fall outside the user's pi `enabledModels` allowlist are flagged:
+	 * caller-supplied out-of-scope → hard error before spawn; frontmatter-
+	 * pinned out-of-scope → warning + runs anyway. Default: false (no
+	 * enforcement, fully back-compat).
+	 */
+	scopeModels?: boolean;
 }
 export interface CrewOtlpConfig {

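The `scopeModels` toggle above gates a pattern match against `enabledModels`. `src/runtime/model-scope.ts` is not shown in this part of the diff, so the following is only a sketch of the three pattern kinds the changelog names (case-insensitive exact, unanchored `*` glob, case-insensitive substring fallback). The function names are hypothetical, and treating an empty pattern as matching nothing is an assumption.

```typescript
// Escape regex metacharacters in a literal fragment.
function escapeRegExp(fragment: string): string {
	return fragment.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical matcher for a single allowlist pattern.
function matchesModelPattern(model: string, pattern: string): boolean {
	const m = model.toLowerCase();
	const p = pattern.trim().toLowerCase();
	if (p === "") return false; // assumption: empty patterns match nothing
	if (m === p) return true; // exact, case-insensitive
	if (p.includes("*")) {
		// Glob: `*` becomes `.*`; the regex is deliberately unanchored.
		const source = p.split("*").map(escapeRegExp).join(".*");
		return new RegExp(source).test(m);
	}
	return m.includes(p); // substring fallback
}

// A model is in scope when any pattern matches.
function isModelInScope(model: string, patterns: string[]): boolean {
	return patterns.some((pattern) => matchesModelPattern(model, pattern));
}

console.log(isModelInScope("anthropic/claude-opus-4-5", ["claude-*"]));
// prints true
```

Because the glob is unanchored, `"claude-*"` matches a provider-prefixed id such as `anthropic/claude-opus-4-5`, which is the example the changelog gives.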
package/src/errors.ts CHANGED

@@ -38,6 +38,7 @@ export const ErrorCode = {
   EventLogLockTimeout: "E010",      // Could not acquire the event-log file lock
   DepthLimitExceeded: "E011",       // Pipeline/chain recursion depth limit hit (circular dep)
   RunStale: "E012",                 // Run reconciled as stale/zombie (heartbeat expired)
+  ModelOutOfScope: "E013",          // Caller-supplied model is not in pi's enabledModels allowlist (F7 scope gate)
 } as const;
 export type ErrorCode = typeof ErrorCode[keyof typeof ErrorCode];
@@ -56,6 +57,7 @@ const DEFAULT_HELP: Record<ErrorCode, string | undefined> = {
   [ErrorCode.EventLogLockTimeout]: "Another process holds the event-log lock. Check for orphaned `.lock` files or stale pi-crew processes, then retry.",
   [ErrorCode.DepthLimitExceeded]: "A pipeline/chain exceeded the recursion depth limit, which usually indicates a circular stage dependency. Review step `dependsOn` chains.",
   [ErrorCode.RunStale]: "The worker stopped heartbeating and was treated as a zombie. Re-run the team (resume or fresh); if it recurs, check `runtime.executeWorkers` / system load.",
+  [ErrorCode.ModelOutOfScope]: "The requested model is not in your pi `enabledModels` allowlist. Either pick a model listed in `enabledModels` (settings.json) or extend the allowlist. The scope gate is opt-in — disable `runtime.reliability.scopeModels` to allow any model.",
 };
 /**
@@ -188,4 +190,11 @@ export const errors = {
       `Stale run reconciled (reason=${reason}).${age} The worker stopped heartbeating and was treated as dead/zombie.`,
     ).withContext("stale-run reconciliation");
   },
+  modelOutOfScope(model: string, patterns: string[]): CrewError {
+    return new CrewError(
+      ErrorCode.ModelOutOfScope,
+      `Requested model "${model}" is not in enabledModels scope (allowlist: [${patterns.join(", ")}])`,
+    ).withContext("F7 model scope gate — caller override rejected");
+  },
 } as const;

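The `modelOutOfScope` factory above is raised only for caller-supplied models; per the changelog, a frontmatter-pinned model that is out of scope produces a warning and still runs. A rough sketch of that trust distinction follows. `applyScopeGate`, `ModelOutOfScopeError`, and the simple substring matcher are hypothetical stand-ins, not the package's `buildConfiguredModelRouting`.

```typescript
type ModelOrigin = "caller" | "frontmatter";

// Stand-in for CrewError E013; the message text follows the factory above.
class ModelOutOfScopeError extends Error {
	readonly code = "E013";
	constructor(model: string, patterns: string[]) {
		super(`Requested model "${model}" is not in enabledModels scope (allowlist: [${patterns.join(", ")}])`);
		this.name = "ModelOutOfScopeError";
	}
}

interface ScopeGateResult {
	model: string;
	warning?: string;
}

// Simplified matcher (substring only) so the sketch stays self-contained.
function substringInScope(model: string, patterns: string[]): boolean {
	const m = model.toLowerCase();
	return patterns.some((p) => p.trim() !== "" && m.includes(p.trim().toLowerCase()));
}

function applyScopeGate(
	model: string,
	origin: ModelOrigin,
	patterns: string[],
	scopeModels: boolean,
): ScopeGateResult {
	// Toggle off (the default) means no enforcement at all.
	if (!scopeModels || substringInScope(model, patterns)) return { model };
	// Caller overrides fail fast before spawn.
	if (origin === "caller") throw new ModelOutOfScopeError(model, patterns);
	// Frontmatter pins are treated as deliberate: warn, then run anyway.
	return { model, warning: `Model "${model}" is outside enabledModels; running anyway (frontmatter-pinned).` };
}

console.log(applyScopeGate("example/other-model", "frontmatter", ["claude"], true).warning !== undefined);
// prints true
```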
package/src/extension/context-status-injection.ts CHANGED

@@ -35,6 +35,7 @@ import type { AgentMessage } from "@earendil-works/pi-agent-core";
 import type { Message } from "@earendil-works/pi-ai";
 import type { ExtensionAPI, ContextEvent } from "@earendil-works/pi-coding-agent";
 import { collectInFlightRuns } from "./registration/compaction-guard.ts";
+import { extractSessionId } from "../utils/session-utils.ts";
 import type { TeamRunManifest } from "../state/types.ts";
 /** Sentinel that marks an injected ambient-status user message. */
@@ -133,10 +134,10 @@ export interface AmbientContextResult {
  *
  * Exported for unit testing.
  */
-export function handleContextEvent(event: ContextEvent, cwd: string): AmbientContextResult | undefined {
+export function handleContextEvent(event: ContextEvent, cwd: string, sessionId?: string): AmbientContextResult | undefined {
 	let runs: TeamRunManifest[] = [];
 	try {
-		runs = collectInFlightRuns(cwd);
+		runs = collectInFlightRuns(cwd, sessionId);
 	} catch {
 		// State read failure → don't inject, don't crash. Pi catches handler
 		// errors anyway, but we avoid noisy error emission for a best-effort
@@ -167,8 +168,16 @@ export function registerContextStatusInjection(
 	opts: { enabled?: boolean } = {},
 ): void {
 	if (opts.enabled === false) return;
-	pi.on("context", (event: ContextEvent): AmbientContextResult | undefined => {
-		const cwd = typeof process.cwd === "function" ? process.cwd() : ".";
-		return handleContextEvent(event, cwd);
+	pi.on("context", (event: ContextEvent, ctx: unknown): AmbientContextResult | undefined => {
+		// crew state is per-project; use the session ctx cwd when available,
+		// falling back to process.cwd(). Thread the session id so ambient
+		// status only reflects runs owned by THIS session (the state store is
+		// per-project, shared across sessions).
+		const cwd =
+			typeof ctx === "object" && ctx !== null && typeof (ctx as { cwd?: unknown }).cwd === "string"
+				? (ctx as { cwd: string }).cwd
+				: typeof process.cwd === "function" ? process.cwd() : ".";
+		const sessionId = extractSessionId(ctx);
+		return handleContextEvent(event, cwd, sessionId);
 	});
 }

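`handleContextEvent` above now threads `sessionId` into `collectInFlightRuns`, whose body lives in `compaction-guard.ts` and is not shown here. The sketch below illustrates the ownership filter the changelog describes (strict `ownerSessionId === currentSessionId`, ownerless runs excluded). The `RunManifestLike` shape and the status names are hypothetical, and returning the unfiltered list when no session id is passed is an assumption based on the parameter being optional.

```typescript
// Hypothetical subset of a run manifest.
interface RunManifestLike {
	runId: string;
	status: string;
	ownerSessionId?: string;
}

// Hypothetical in-flight statuses; the real set is not visible in this diff.
const IN_FLIGHT_STATUSES = new Set(["queued", "running"]);

function filterInFlightRuns(
	runs: RunManifestLike[],
	currentSessionId?: string,
): RunManifestLike[] {
	const inFlight = runs.filter((run) => IN_FLIGHT_STATUSES.has(run.status));
	// Assumption: no session id means the caller wants the legacy, unfiltered view.
	if (currentSessionId === undefined) return inFlight;
	// Strict ownership match: runs without an owner are excluded too.
	return inFlight.filter((run) => run.ownerSessionId === currentSessionId);
}

const sharedStore: RunManifestLike[] = [
	{ runId: "a1", status: "running", ownerSessionId: "session-A" },
	{ runId: "b1", status: "running", ownerSessionId: "session-B" },
	{ runId: "b2", status: "completed", ownerSessionId: "session-B" },
	{ runId: "legacy", status: "running" },
];

console.log(filterInFlightRuns(sharedStore, "session-B").map((run) => run.runId));
// prints [ 'b1' ]
```

Session A's run and the ownerless legacy run both drop out for Session B, which is the behavior the leak fix is after.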
package/src/extension/register.ts CHANGED

@@ -91,6 +91,7 @@ import {
 	userCrewRoot,
 } from "../utils/paths.ts";
 import { resolveContainedPath } from "../utils/safe-paths.ts";
+import { extractSessionId } from "../utils/session-utils.ts";
 import { resetTimings, time } from "../utils/timings.ts";
 import {
 	type PiCrewRpcHandle,
@@ -1242,24 +1243,9 @@ export function registerPiTeams(pi: ExtensionAPI): void {
 		notifyActiveRuns(ctx);
 		// Auto-cancel orphaned runs from dead sessions
-		// Extract sessionId from context — use Object.getOwnPropertyDescriptor
-		// to safely access property without triggering Proxy traps, then validate.
-		const rawSessionId =
-			typeof ctx === "object" && ctx !== null
-				? Object.getOwnPropertyDescriptor(ctx, "sessionId")?.value
-				: undefined;
-		const currentSessionId =
-			typeof rawSessionId === "string" && rawSessionId.length > 0
-				? rawSessionId
-				: undefined;
-		if (rawSessionId !== undefined && currentSessionId === undefined) {
-			logInternalError(
-				"register.sessionId.invalid",
-				new Error(
-					`Invalid session ID: expected non-empty string, got ${typeof rawSessionId}`,
-				),
-			);
-		}
+		// Extract sessionId from context via the shared safe accessor (handles
+		// untyped runtime property + defensive against exotic objects).
+		const currentSessionId = extractSessionId(ctx);
 		// Defer ALL heavy cleanup to after the session_start handler returns.
 		// These operations involve synchronous directory scanning (readdirSync, readFileSync)