npm - pi-crew - Versions diffs - 0.9.10 → 0.9.11 - Mend

pi-crew 0.9.10 → 0.9.11

Files changed (9)

package/CHANGELOG.md +52 -0
package/package.json +1 -1
package/src/config/role-tools.ts +39 -6
package/src/runtime/async-runner.ts +70 -74
package/src/runtime/background-runner.ts +13 -2
package/src/runtime/role-permission.ts +5 -21
package/src/runtime/task-runner/prompt-builder.ts +1 -0
package/src/state/artifact-store.ts +22 -2
package/src/utils/redaction.ts +49 -31

package/CHANGELOG.md CHANGED

@@ -1,5 +1,57 @@
 # Changelog
+## [v0.9.11] — Per-run lock path for background-runner (parallel-spawn race) (2026-06-27)
+Bug caught by an E2E parallel-spawn test in this session, NOT by unit tests (which cannot spawn multiple real processes). Independent of the F1-F5/redaction batches.
+### Bug fix
+- **Shared `run.lock` killed concurrent background runners** (`src/runtime/background-runner.ts:417`). The bootstrap call passed a fake manifest `{ stateRoot: "", runId, cwd }` to `withRunLockSync` because the real manifest was not loaded yet. `lockPath()` = `path.join(manifest.stateRoot, "run.lock")` = `path.join("", "run.lock")` = `"run.lock"` — a RELATIVE path at cwd, SHARED across every run regardless of runId. When multiple background agents spawned in the same instant (e.g. parallel `Agent` calls), they raced on the single shared lock: one acquired, the rest failed fast ("Run 'run.lock' is locked by another operation") and exited within 3s. The existing "FIX Issue #3" comment claimed to prevent concurrent runners "for the same runId", but the lock path never contained the runId. Fix: compute the real per-run stateRoot via `createRunPaths(cwd, runId).stateRoot` before locking, so each run locks its own `<cwd>/.crew/state/runs/<runId>/run.lock`. Matches `locks-race.test.ts`.
+### Verification
+- `npx tsc --noEmit` EXIT 0
+- 7 lock-related suites pass (locks-race 10, background-runner-console-redirect 4, async-runner 13, api-locks 1, orphan-worker-registry 15, locks-untested 11, team-runner-heartbeat 2)
+- E2E reproduce (decisive): BEFORE the fix, 3 parallel background explorers → 1 pass + 2 fail (background.log: "Failed to acquire lock"). AFTER the fix, same scenario → 3/3 pass, 0 lock errors.
+### Lesson
+Concurrency/lock bugs only reproduce when multiple real processes spawn simultaneously — unit tests mocking a single process can never catch them. E2E parallel-spawn smoke tests are the only way to verify. (Reinforces the v0.9.9 lesson: E2E with real extension load is decisive.)
+## [v0.9.11] — Read-only permission model fixes F1-F5 (2026-06-27)
+A review of the role permission model (question: "do read-only workflows still persist their task output?") confirmed that output persistence is runner-driven and correct, but produced 5 findings, one of them in the same defect class as the v0.9.10 writer incident (Fix 5) but in the opposite direction.
+### Bug fixes
+- **`security-reviewer`/`test-engineer` tool config unreachable (F1, HIGH)** (`src/config/role-tools.ts`). Map keys were `security_reviewer`/`test_engineer` (underscore) while the runtime role strings are hyphenated (`agents/security-reviewer.md` → `security-reviewer`). `getToolConfig` did not normalize, so it returned `{}` and the strictest tool restrictions in the codebase were silently never applied. Same defect class as the writer incident, opposite direction (under-enforce vs over-enforce). Tests masked it: they queried only the underscore forms. Fix: hyphenate and quote the keys (a bare `security-reviewer:` key is not a valid identifier because the hyphen parses as a minus operator, so it must be quoted) and normalize in `getToolConfig` (`role.replaceAll("_","-")`); added a regression test that derives role names from the runtime sets and asserts each resolves its intended config.
+- **`critic`/`planner` tool-config gaps (F2)** (`src/config/role-tools.ts`). `critic` had no entry (a custom critic agent had no tool-level read-only enforcement); `planner`'s entry only excluded `ask_question` and did not enforce read-only. Added a `critic` entry and strengthened `planner` to a read-only tool-set.
+- **`planner` kept read-only with deliverable guidance (F3)** (`src/runtime/task-runner/prompt-builder.ts`). `planner` emits deliverables (`output: plan.md`) but moving it to WRITE_ROLES would fire the plan-approval gate BEFORE planning (breaking default/implementation workflows — `team-runner.ts:399` relies on planner being read-only). Fix: keep planner read-only and add a prompt line telling read-only roles their RESULT TEXT is persisted by the runner, so they emit deliverables as text instead of attempting file writes.
+- **`verifier` reclassified read-only → write (F4)** (`src/runtime/role-permission.ts` + `src/config/role-tools.ts`). `verifier`'s task runs tests via bash with redirects/cache writes (`npm test | tee`, `mkdir`, `rm`), all forbidden by the read-only prompt gate — a direct contradiction with `agents/verifier.md`. Moved verifier to WRITE_ROLES; tool-config keeps bash but excludes edit/write so source integrity is preserved. Mirrors `cold-verifier`.
+- **Dead command-enforcement removed (F5)** (`src/runtime/role-permission.ts`). `isReadOnlyCommand`/`checkRolePermission`/`READ_ONLY_COMMANDS` had zero runtime callers (only tests). Real protection lives in the role tool-config + `safe-paths.ts`/`resolveRealContainedPath` (10+ runtime callers). Deleted the dead code.
+### Verification
+- `npx tsc --noEmit` EXIT 0
+- 124 tests pass / 0 fail across 13 suites + 1 integration (role-tools 15, role-permission-cov 23, role-permission 2, role-permission.spawn 3, prompt-builder-cov 15, v0-8-0-tool-policy-unification 10, skill-instructions 16, plan-approval-boundary 7, crew-contracts 6, goal-loop-team-roles 5, t9-cold-verifier 5, completion-guard 7, verification-gates 10, role-tools-integration 3)
+- E2E: `research` workflow 3/3 tasks — explorer+analyst (read-only) persisted findings, writer wrote the deliverable file
+## [v0.9.11] — Secret redaction & env hardening (2026-06-27)
+Independent security review (review team, 3/4 tasks, ~360K tokens) flagged 3 Medium findings in the secret-redaction and env-passthrough surfaces. All verified by live `npx tsx` repro + source trace before fixing.
+### Bug fixes
+- **`redactAuthHeader` leaked credential values (L3/L5)** (`src/utils/redaction.ts`). Two defects: (1) `indexOf` matched only the FIRST `authorization:` occurrence per call, so a second header on a later line leaked verbatim; (2) the word-boundary allow-list excluded `-` and `\t`, so `Proxy-Authorization:` / `X-Authorization:` and tab-indented headers were not recognized. Fix: loop over all occurrences and add `-`/`\t` to a shared `AUTH_HEADER_BOUNDARY_CHARS` set (used by both `redactAuthHeader` and `redactBearerTokens`). Latent weakness caught by repro (NOT by the reviewer's proposed fix): the old code only APPENDED a ` ***` marker without removing the value — `"authorization: Basic abc123"` became `"authorization: Basic abc123 ***"` (credential still visible). The redact branch now blanks the value: `line.substring(authIdx, authIdx+14) + " ***"` → `"authorization: ***"`. Consistent with `redactInlineSecrets`.
+- **`writeArtifact` flat-redaction only (M2)** (`src/state/artifact-store.ts:130`). Applied only `redactSecretString` (flat regex scan), so quoted-JSON secrets (`"api_key":"sk-..."`) and nested keys survived into persisted artifacts (e.g. `startup-evidence.json` holds up to 500 chars of raw child stderr). Fix: structural-then-flat — when content parses as JSON, run `redactSecrets` (recursive) first, then flat `redactSecretString`. Order matters: structural catches quoted keys, flat still catches Bearer/JWT/Auth headers inside JSON string values. Formatting is preserved: the input is re-stringified with the SAME indentation (pretty → indent 2, compact → compact), so pretty-printed artifacts like group-join metadata keep their `"partial": false` whitespace (caught by `test/integration/phase4-runtime.test.ts` regression on CI after the first attempt shipped a compact re-stringify).
+- **Provider API keys leaked into the detached background runner (M1)** (`src/runtime/async-runner.ts:162`). The env allowlist forwarded 14 provider keys (MINIMAX/OPENAI/ANTHROPIC/...) to the background runner, contradicting `child-pi.ts:275` ("API keys are NOT needed — config file"). Keys leaked into V8 fatal-error reports (`--report-on-fatalerror` writes `environmentVariables` unredacted). The inline comment "same as child-pi.ts" was false. Fix: extracted `BACKGROUND_RUNNER_ENV_ALLOWLIST` (exported, unit-testable) and removed the 14 provider keys. Prereq verified: `background-runner.ts` does not read provider keys directly.
+### Verification
+- `npx tsc --noEmit` EXIT 0
+- 21 targeted suites pass (~130 tests): redaction-cov (32), redaction-p1f (18), redaction-transcript-roundtrip (3), child-pi-sec1-redaction (8), artifact-store (4), async-runner (13), env-filter (4), env-filter-cov (9), security-hardening (8), round28-otlp-crlf (4), child-pi-compaction-real (9), + others
+- Live repro: `redactSecretString("Proxy-Authorization: Basic c2VjcmV0")` → `"Proxy-Authorization: ***"` (was: unchanged leak)
 ## [v0.9.10 (continued)] — Round 29 follow-ups: BG2 sweep bug fixes, test optimization, E2E verification (2026-06-26)
 A full-suite verify run (`verify-full2`, 5502 tests, 774 suites) surfaced 4 file-level timeouts and 2 real correctness bugs. This release fixes the 2 real bugs, the underlying cause of 2 of the 4 timeouts (chain-runner + orphan-worker-registry + cleanup-full-flow self-deadlock + HandoffManager interval leak), and adds E2E verification artifacts to prove all fixes hold against the live runtime, not just static analysis.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-crew",
-  "version": "0.9.10",
+  "version": "0.9.11",
   "description": "Pi extension for coordinated AI teams, workflows, worktrees, and async task orchestration",
   "author": "baphuongna",
   "license": "MIT",

package/src/config/role-tools.ts CHANGED

@@ -22,9 +22,23 @@ export const ROLE_TOOL_CONFIGS: Record<string, RoleToolConfig> = {
 		excludeTools: ["edit", "write", "ask_question"],
 	},
-	// Planner - Planning and documentation
+	// Planner - Read-only planning; emits plans as TEXT (runner persists result).
+	// F2/F3: strengthened to a read-only tool-set matching its READ_ONLY_ROLES
+	// classification. Deliverables are emitted as RESULT TEXT (consumed by
+	// adaptive-plan.ts / runner shared-output), NOT file writes — so the
+	// plan-approval gate boundary (planner = read-only) is preserved. Moving
+	// planner to WRITE_ROLES would fire the gate before planning, breaking the
+	// default/implementation workflows.
 	planner: {
-		excludeTools: ["ask_question"],
+		tools: ["read", "grep", "find", "ls", "glob"],
+		excludeTools: ["edit", "write", "bash", "web", "ask_question"],
+	},
+	// Critic - Read-only plan/design critique (F2: was missing from the map,
+	// so a custom critic agent had no tool-level read-only enforcement).
+	critic: {
+		tools: ["read", "grep", "find", "ls", "glob"],
+		excludeTools: ["edit", "write", "bash", "web"],
 	},
 	// Executor - Full access (default)
@@ -45,13 +59,26 @@ export const ROLE_TOOL_CONFIGS: Record<string, RoleToolConfig> = {
 	},
 	// Security Reviewer - Strict restrictions
-	security_reviewer: {
+	// F1: key is hyphenated to match the runtime role string (agents/
+	// security-reviewer.md → "security-reviewer"). The underscore form never
+	// resolved at runtime (returned {}), silently dropping enforcement.
+	"security-reviewer": {
 		tools: ["read", "grep", "find"],
 		excludeTools: ["edit", "write", "bash", "web", "ask_question"],
 	},
-	// Test Engineer - Can write tests
-	test_engineer: {
+	// Verifier - Runs tests (needs bash) but must NOT edit source (F4: moved
+	// from READ_ONLY_ROLES to WRITE_ROLES — the read-only prompt gate forbids
+	// the test-running redirects / cache writes its task requires, contradicting
+	// agents/verifier.md). Tool-set keeps bash but excludes edit/write so source
+	// integrity is preserved during verification. Mirrors cold-verifier behavior.
+	verifier: {
+		tools: ["read", "grep", "find", "ls", "bash"],
+		excludeTools: ["edit", "write", "web"],
+	},
+	// Test Engineer - Can write tests (F1: hyphenated key)
+	"test-engineer": {
 		tools: ["read", "edit", "write", "bash", "ls"],
 		excludeTools: ["web"],
 	},
@@ -61,7 +88,13 @@ export const ROLE_TOOL_CONFIGS: Record<string, RoleToolConfig> = {
  * Get tool configuration for a specific role.
  */
 export function getToolConfig(role: string): RoleToolConfig {
-	return ROLE_TOOL_CONFIGS[role] ?? {};
+	// F1: normalize hyphen/underscore. Runtime role strings are hyphenated
+	// (agents/security-reviewer.md → "security-reviewer") but map keys were
+	// historically underscored, silently returning {} at runtime — the same
+	// defect class as the v0.9.10 writer incident (opposite direction:
+	// under-enforce instead of over-enforce). Accept both forms.
+	const key = role.includes("_") ? role.replaceAll("_", "-") : role;
+	return ROLE_TOOL_CONFIGS[key] ?? ROLE_TOOL_CONFIGS[role] ?? {};
 }
 /**
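
The F1 lookup failure and its fix can be reproduced in isolation. This is a minimal sketch, not the package's code: `LEGACY_CONFIGS`, `FIXED_CONFIGS` and `getToolConfigSketch` are illustrative names, the maps are trimmed to one entry, and `replace(/_/g, "-")` is used in place of `replaceAll` only so the snippet compiles on older targets.

```typescript
interface RoleToolConfig {
	tools?: string[];
	excludeTools?: string[];
}

// Pre-fix shape (illustrative): the key uses an underscore.
const LEGACY_CONFIGS: Record<string, RoleToolConfig> = {
	security_reviewer: { tools: ["read", "grep", "find"], excludeTools: ["edit", "write", "bash"] },
};

// Post-fix shape (illustrative): the key is quoted and hyphenated.
const FIXED_CONFIGS: Record<string, RoleToolConfig> = {
	"security-reviewer": { tools: ["read", "grep", "find"], excludeTools: ["edit", "write", "bash"] },
};

// Pre-fix lookup: exact key only, so the hyphenated runtime role misses.
const legacyLookup: RoleToolConfig = LEGACY_CONFIGS["security-reviewer"] ?? {};

// Post-fix lookup: normalize underscores to hyphens, then fall back to the raw role.
function getToolConfigSketch(role: string): RoleToolConfig {
	const key = role.includes("_") ? role.replace(/_/g, "-") : role;
	return FIXED_CONFIGS[key] ?? FIXED_CONFIGS[role] ?? {};
}

console.log(Object.keys(legacyLookup).length); // 0 (restrictions silently dropped)
console.log(getToolConfigSketch("security-reviewer").excludeTools); // [ 'edit', 'write', 'bash' ]
console.log(getToolConfigSketch("security_reviewer").excludeTools); // [ 'edit', 'write', 'bash' ]
```

Because an empty config means "no restrictions", a missed lookup fails open, which is why both spellings are accepted after the fix.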

package/src/runtime/async-runner.ts CHANGED

@@ -150,6 +150,75 @@ export interface SpawnBackgroundTeamRunResult {
 	logPath: string;
 }
+/**
+ * Env vars explicitly forwarded to the detached background runner.
+ *
+ * Provider API keys (MINIMAX/OPENAI/ANTHROPIC/...) are INTENTIONALLY OMITTED
+ * (security review M1): the background runner only spawns child Pi workers,
+ * which read keys from the Pi config file (not env). Passing keys via env
+ * leaks them into V8 fatal-error reports (--report-on-fatalerror writes the
+ * `environmentVariables` section unredacted). Matches child-pi.ts policy.
+ * Exported so the invariant is unit-testable (test/unit/async-runner.test.ts).
+ */
+export const BACKGROUND_RUNNER_ENV_ALLOWLIST: string[] = [
+	// Essential non-secret vars
+	"PATH",
+	"HOME",
+	"USER",
+	"SHELL",
+	"TERM",
+	"LANG",
+	"LC_ALL",
+	"LC_COLLATE",
+	"LC_CTYPE",
+	"LC_MESSAGES",
+	"LC_MONETARY",
+	"LC_NUMERIC",
+	"LC_TIME",
+	"XDG_CONFIG_HOME",
+	"XDG_DATA_HOME",
+	"XDG_CACHE_HOME",
+	"XDG_RUNTIME_DIR",
+	// Windows essentials — see WINDOWS_ESSENTIAL_ENV_VARS (src/utils/env-allowlist.ts).
+	...WINDOWS_ESSENTIAL_ENV_VARS,
+	"NVM_BIN",
+	"NVM_DIR",
+	"NVM_INC",
+	"NODE_PATH",
+	"NODE_DISABLE_COLORS",
+	"NODE_EXTRA_CA_CERTS",
+	"NPM_CONFIG_REGISTRY",
+	"NPM_CONFIG_USERCONFIG",
+	"NPM_CONFIG_GLOBALCONFIG",
+	// PI_CREW_PARENT_PID is needed for parent-guard (liveness check).
+	"PI_CREW_DEPTH",
+	"PI_CREW_MAX_DEPTH",
+	"PI_CREW_INHERIT_PROJECT_CONTEXT",
+	"PI_CREW_INHERIT_SKILLS",
+	"PI_CREW_PARENT_PID",
+	"PI_TEAMS_DEPTH",
+	"PI_TEAMS_MAX_DEPTH",
+	"PI_TEAMS_INHERIT_PROJECT_CONTEXT",
+	"PI_TEAMS_INHERIT_SKILLS",
+	"PI_TEAMS_PI_BIN",
+	"PI_TEAMS_MOCK_CHILD_PI",
+	"PI_CREW_ALLOW_MOCK",
+	// Phase 1.5: worker-thread atomic writer opt-in (RFC 15).
+	"PI_CREW_WORKER_ATOMIC_WRITER",
+	"PI_TEAMS_WORKER_ATOMIC_WRITER",
+	// Phase 1.5 #1: verification env sanitization opt-in (RFC 13 §6).
+	"PI_CREW_VERIFICATION_SANITIZE_ENV",
+	"PI_TEAMS_VERIFICATION_SANITIZE_ENV",
+	"PI_CREW_VERIFICATION_PRESERVE_ENV",
+	"PI_TEAMS_VERIFICATION_PRESERVE_ENV",
+	// Phase 1.5 #2: verification git-worktree sandbox opt-in (RFC 16).
+	"PI_CREW_VERIFICATION_WORKTREE",
+	"PI_TEAMS_VERIFICATION_WORKTREE",
+	// Phase 1.5 #3: V8 diagnostic report on fatal error (RFC 17 — investigation).
+	"PI_CREW_BG_REPORT_ON_FATAL",
+	"PI_TEAMS_BG_REPORT_ON_FATAL",
+];
 export async function spawnBackgroundTeamRun(manifest: TeamRunManifest): Promise<SpawnBackgroundTeamRunResult> {
 	const runnerPath = path.join(path.dirname(fileURLToPath(import.meta.url)), "background-runner.ts");
 	const logPath = path.join(manifest.stateRoot, "background.log");
@@ -159,80 +228,7 @@ export async function spawnBackgroundTeamRun(manifest: TeamRunManifest): Promise
 	// to prevent leaking all env vars (including secrets) to detached background runner.
 	// Previously, destructuring only removed PI_CREW_PARENT_PID but kept everything else.
 	const filteredEnv = sanitizeEnvSecrets(process.env, {
-		allowList: [
-			// Model provider API keys (same as child-pi.ts)
-			"MINIMAX_API_KEY",
-			"MINIMAX_GROUP_ID",
-			"OPENAI_API_KEY",
-			"OPENAI_ORG_ID",
-			"ANTHROPIC_API_KEY",
-			"GOOGLE_API_KEY",
-			"GOOGLE_GENERATIVE_LANGUAGE_API_KEY",
-			"AZURE_OPENAI_API_KEY",
-			"AZURE_OPENAI_ENDPOINT",
-			"AWS_ACCESS_KEY_ID",
-			"AWS_SECRET_ACCESS_KEY",
-			"AWS_REGION",
-			"ZEU_API_KEY",
-			"ZERODEV_API_KEY",
-			// Essential non-secret vars
-			"PATH",
-			"HOME",
-			"USER",
-			"SHELL",
-			"TERM",
-			"LANG",
-			"LC_ALL",
-			"LC_COLLATE",
-			"LC_CTYPE",
-			"LC_MESSAGES",
-			"LC_MONETARY",
-			"LC_NUMERIC",
-			"LC_TIME",
-			"XDG_CONFIG_HOME",
-			"XDG_DATA_HOME",
-			"XDG_CACHE_HOME",
-			"XDG_RUNTIME_DIR",
-			// Windows essentials — see WINDOWS_ESSENTIAL_ENV_VARS (src/utils/env-allowlist.ts).
-			...WINDOWS_ESSENTIAL_ENV_VARS,
-			"NVM_BIN",
-			"NVM_DIR",
-			"NVM_INC",
-			"NODE_PATH",
-			"NODE_DISABLE_COLORS",
-			"NODE_EXTRA_CA_CERTS",
-			"NPM_CONFIG_REGISTRY",
-			"NPM_CONFIG_USERCONFIG",
-			"NPM_CONFIG_GLOBALCONFIG",
-			// FIX: explicit list matches child-pi.ts to prevent regression.
-			// PI_CREW_PARENT_PID is needed for parent-guard (liveness check).
-			"PI_CREW_DEPTH",
-			"PI_CREW_MAX_DEPTH",
-			"PI_CREW_INHERIT_PROJECT_CONTEXT",
-			"PI_CREW_INHERIT_SKILLS",
-			"PI_CREW_PARENT_PID",
-			"PI_TEAMS_DEPTH",
-			"PI_TEAMS_MAX_DEPTH",
-			"PI_TEAMS_INHERIT_PROJECT_CONTEXT",
-			"PI_TEAMS_INHERIT_SKILLS",
-			"PI_TEAMS_PI_BIN",
-			"PI_TEAMS_MOCK_CHILD_PI",
-			"PI_CREW_ALLOW_MOCK",
-			// Phase 1.5: worker-thread atomic writer opt-in (RFC 15).
-			"PI_CREW_WORKER_ATOMIC_WRITER",
-			"PI_TEAMS_WORKER_ATOMIC_WRITER",
-			// Phase 1.5 #1: verification env sanitization opt-in (RFC 13 §6).
-			"PI_CREW_VERIFICATION_SANITIZE_ENV",
-			"PI_TEAMS_VERIFICATION_SANITIZE_ENV",
-			"PI_CREW_VERIFICATION_PRESERVE_ENV",
-			"PI_TEAMS_VERIFICATION_PRESERVE_ENV",
-			// Phase 1.5 #2: verification git-worktree sandbox opt-in (RFC 16).
-			"PI_CREW_VERIFICATION_WORKTREE",
-			"PI_TEAMS_VERIFICATION_WORKTREE",
-			// Phase 1.5 #3: V8 diagnostic report on fatal error (RFC 17 — investigation).
-			"PI_CREW_BG_REPORT_ON_FATAL",
-			"PI_TEAMS_BG_REPORT_ON_FATAL",
-		],
+		allowList: BACKGROUND_RUNNER_ENV_ALLOWLIST,
 	});
 	// FIX: removed delete workarounds — with explicit allowlist, these vars
 	// are no longer auto-leaked. Matches child-pi.ts.
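
The effect of dropping the provider keys from the allowlist can be illustrated with a small filter. This is a sketch under stated assumptions: `pickAllowedEnv` is a hypothetical stand-in (the real `sanitizeEnvSecrets` is not shown in this diff and may behave differently), and the sample environment values are made up.

```typescript
// Hypothetical allowlist filter: copies only the named variables that are set.
function pickAllowedEnv(
	env: Record<string, string | undefined>,
	allowList: readonly string[],
): Record<string, string> {
	const out: Record<string, string> = {};
	for (const name of allowList) {
		const value = env[name];
		if (value !== undefined) out[name] = value;
	}
	return out;
}

// A few entries taken from the allowlist above; the full list is longer.
const ALLOWLIST_SAMPLE = ["PATH", "HOME", "PI_CREW_PARENT_PID"];

// Made-up parent environment containing one provider key.
const parentEnv: Record<string, string | undefined> = {
	PATH: "/usr/bin",
	HOME: "/home/dev",
	PI_CREW_PARENT_PID: "4242",
	OPENAI_API_KEY: "sk-example",
};

const childEnv = pickAllowedEnv(parentEnv, ALLOWLIST_SAMPLE);
console.log(Object.keys(childEnv)); // [ 'PATH', 'HOME', 'PI_CREW_PARENT_PID' ]
console.log("OPENAI_API_KEY" in childEnv); // false
```

Exporting the list as a constant also makes it possible for a unit test to assert that no provider key name appears in it, which is the invariant the doc comment on `BACKGROUND_RUNNER_ENV_ALLOWLIST` refers to.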

package/src/runtime/background-runner.ts CHANGED

@@ -7,6 +7,7 @@ import {
 	withRunLockSync,
 } from "../state/locks.ts";
 import {
+	createRunPaths,
 	loadRunManifestById,
 	saveRunManifest,
 	updateRunStatus,
@@ -411,11 +412,21 @@ async function main(): Promise<void> {
 		);
 	// FIX Issue #3: Wrap in withRunLockSync to prevent concurrent background-runners
 	// for the same runId from reading stale manifest state. If lock cannot be
-	// acquired within 5s, fail immediately rather than proceeding with stale data.
+	// be acquired within 5s, fail immediately rather than proceeding with stale data.
+	//
+	// BUGFIX (caught by E2E parallel-spawn, 2026-06-27): the lock manifest must
+	// carry the REAL per-run stateRoot, NOT an empty string. lockPath() derives
+	// `<stateRoot>/run.lock`, so `stateRoot: ""` collapses every concurrent
+	// background-runner (different runIds, same spawn instant) onto a SINGLE
+	// shared `run.lock` at cwd — 1 acquires, the rest fail-fast and die. Compute
+	// the per-run stateRoot from (cwd, runId) via createRunPaths (same helper
+	// resolveRunStateRoot uses internally), so each run locks its own
+	// `<cwd>/.crew/state/runs/<runId>/run.lock`. Matches locks-race.test.ts.
+	const bootstrapStateRoot = createRunPaths(cwd, runId).stateRoot;
 	let loaded: { manifest: TeamRunManifest; tasks: TeamTaskState[] } | undefined;
 	try {
 		loaded = withRunLockSync(
-			{ stateRoot: "", runId, cwd } as TeamRunManifest,
+			{ stateRoot: bootstrapStateRoot, runId, cwd } as TeamRunManifest,
 			() => loadRunManifestById(cwd, runId),
 			{ staleMs: 30_000 },
 		);
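
The shared-lock collapse comes down to how `path.join` treats an empty first segment. A minimal sketch, assuming two hypothetical helpers: `lockPathFor` mirrors the `path.join(manifest.stateRoot, "run.lock")` expression quoted in the changelog, and `perRunStateRoot` stands in for `createRunPaths(cwd, runId).stateRoot`, whose implementation is not part of this diff (only the `<cwd>/.crew/state/runs/<runId>` layout is taken from the comment above).

```typescript
import * as path from "node:path";

// Hypothetical stand-in for lockPath(): joins the state root with "run.lock".
function lockPathFor(stateRoot: string): string {
	return path.join(stateRoot, "run.lock");
}

// Hypothetical stand-in for createRunPaths(cwd, runId).stateRoot, assuming the
// `<cwd>/.crew/state/runs/<runId>` layout described in the comment.
function perRunStateRoot(cwd: string, runId: string): string {
	return path.join(cwd, ".crew", "state", "runs", runId);
}

// Before the fix: an empty stateRoot gives every run the same relative path.
const sharedLock = lockPathFor("");

// After the fix: each runId resolves to its own lock file.
const lockA = lockPathFor(perRunStateRoot("/work/project", "run-a"));
const lockB = lockPathFor(perRunStateRoot("/work/project", "run-b"));

console.log(sharedLock); // "run.lock"
console.log(lockA !== lockB); // true
```

Since `"run.lock"` is relative, it resolves against the process working directory, so every runner started from the same directory contends for one file regardless of `runId`.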

package/src/runtime/role-permission.ts CHANGED

@@ -1,11 +1,10 @@
-import { isSensitivePath } from "./sensitive-paths.ts";
 export type RolePermissionMode = "read_only" | "workspace_write" | "danger_full_access" | "explicit_confirm";
-const READ_ONLY_ROLES = new Set(["explorer", "reviewer", "security-reviewer", "verifier", "analyst", "critic", "planner"]);
-const WRITE_ROLES = new Set(["executor", "test-engineer", "writer"]);
-const READ_ONLY_COMMANDS = new Set(["cat", "head", "tail", "less", "more", "wc", "ls", "find", "grep", "rg", "awk", "sed", "echo", "printf", "which", "where", "whoami", "pwd", "env", "printenv", "date", "df", "du", "uname", "file", "stat", "diff", "sort", "uniq", "tr", "cut", "paste", "test", "true", "false", "type", "readlink", "realpath", "basename", "dirname", "sha256sum", "md5sum", "xxd", "hexdump", "od", "strings", "tree", "jq", "git", "gh"]);
+// Read-only roles: cannot mutate files/source. `verifier` is NOT here — it runs
+// tests (bash + cache writes) so it is a WRITE role (F4). `planner` stays
+// read-only to preserve the plan-approval gate boundary (F3).
+const READ_ONLY_ROLES = new Set(["explorer", "reviewer", "security-reviewer", "analyst", "critic", "planner"]);
+const WRITE_ROLES = new Set(["executor", "test-engineer", "writer", "verifier"]);
 export interface PermissionCheckResult {
 	allowed: boolean;
 	mode: RolePermissionMode;
@@ -18,21 +17,6 @@ export function permissionForRole(role: string): RolePermissionMode {
 	return "workspace_write";
 }
-export function isReadOnlyCommand(command: string): boolean {
-	const first = command.trim().split(/\s+/)[0]?.split(/[\\/]/).pop() ?? "";
-	return READ_ONLY_COMMANDS.has(first) && !/\s(-i|--in-place)\b|\s>{1,2}\s|\brm\b|\bmv\b|\bcp\b|\b(?:npm|pnpm|yarn|bun)\s+(install|add|ci|remove)\b|\bgit\s+(commit|push|merge|rebase|reset|checkout|clean)\b/.test(command);
-}
-export function checkRolePermission(role: string, command: string, filePath?: string): PermissionCheckResult {
-	const mode = permissionForRole(role);
-	// Also block access to known sensitive paths even for read-only commands
-	if (filePath && isSensitivePath(filePath)) {
-		return { allowed: false, mode, reason: `Path '${filePath}' is sensitive (credentials, SSH keys, etc.) — access denied for all roles.` };
-	}
-	if (mode === "read_only" && !isReadOnlyCommand(command)) return { allowed: false, mode, reason: `Role '${role}' is read-only and command may modify state.` };
-	return { allowed: true, mode };
-}
 export function currentCrewRole(env: NodeJS.ProcessEnv = process.env): string | undefined {
 	return env.PI_CREW_ROLE?.trim() || env.PI_TEAMS_ROLE?.trim() || undefined;
 }
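
The F4 reclassification can be shown with the two role sets side by side. This is a sketch only: `modeFor` is a hypothetical helper, and the mapping it uses (read-only set → `"read_only"`, anything else → `"workspace_write"`) is an assumption, since the body of `permissionForRole` is only partly visible in this hunk.

```typescript
type RolePermissionModeSketch = "read_only" | "workspace_write";

// READ_ONLY_ROLES as it appears on the removed and added lines of this diff.
const READ_ONLY_BEFORE = new Set(["explorer", "reviewer", "security-reviewer", "verifier", "analyst", "critic", "planner"]);
const READ_ONLY_AFTER = new Set(["explorer", "reviewer", "security-reviewer", "analyst", "critic", "planner"]);

// Assumed lookup: members of the read-only set map to "read_only",
// every other role falls through to "workspace_write".
function modeFor(role: string, readOnly: Set<string>): RolePermissionModeSketch {
	return readOnly.has(role) ? "read_only" : "workspace_write";
}

const verifierBefore = modeFor("verifier", READ_ONLY_BEFORE);
const verifierAfter = modeFor("verifier", READ_ONLY_AFTER);
const plannerAfter = modeFor("planner", READ_ONLY_AFTER);

console.log(verifierBefore, verifierAfter, plannerAfter);
// read_only workspace_write read_only
```

Under that assumption, `verifier` stops receiving the read-only prompt gate while `planner` keeps it, which matches the F3/F4 entries in the changelog.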

package/src/runtime/task-runner/prompt-builder.ts CHANGED

@@ -30,6 +30,7 @@ function readOnlyRoleInstructions(role: string): string {
 		"- Do not use shell redirects, heredocs, in-place edits, package installs, git commit/merge/rebase/reset/checkout, or other state-mutating commands.",
 		"- If implementation changes are needed, report exact recommendations instead of applying them.",
 		"- Prefer read/grep/find/listing tools and read-only git inspection commands.",
+		"- Your final RESULT TEXT is persisted automatically by the runner (as a result artifact and, if the step declares `output:`, to a shared file). To deliver a plan, report, or findings, EMIT THEM AS TEXT in your final result — do NOT try to write a file yourself.",
 	].join("\n");
 }

package/src/state/artifact-store.ts CHANGED

@@ -4,7 +4,7 @@ import { createHash } from "node:crypto";
 import type { ArtifactDescriptor } from "./types.ts";
 import { atomicWriteFile } from "./atomic-write.ts";
 import { resolveRealContainedPath } from "../utils/safe-paths.ts";
-import { redactSecretString } from "../utils/redaction.ts";
+import { redactSecretString, redactSecrets } from "../utils/redaction.ts";
 function hashContent(content: string): string {
 	return createHash("sha256").update(content).digest("hex");
@@ -127,7 +127,27 @@ export function writeArtifact(artifactsRoot: string, options: ArtifactWriteOptio
 	const filePath = resolveInside(artifactsRoot, options.relativePath);
 	fs.mkdirSync(path.dirname(filePath), { recursive: true });
 	resolveRealContainedPath(artifactsRoot, path.dirname(filePath));
-	const content = redactSecretString(options.content);
+	let content = options.content;
+	// Structural JSON redaction first — catches quoted-JSON secrets
+	// ("api_key":"sk-...") and nested keys that flat redactSecretString misses.
+	// The flat scan below still catches free-text patterns (Bearer/JWT/Auth
+	// headers) that may live inside JSON string values. See security review M2.
+	//
+	// Formatting preservation: re-stringify with the SAME indentation as the
+	// input so pretty-printed artifacts (e.g. group-join metadata expected by
+	// test/integration/phase4-runtime.test.ts to contain `"partial": false`)
+	// keep their whitespace. Detect pretty-vs-compact from the raw input.
+	const trimmed = content.trim();
+	if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
+		try {
+			const parsed: unknown = JSON.parse(content);
+			const isPretty = /\n|"\s*:\s/.test(content);
+			content = JSON.stringify(redactSecrets(parsed), null, isPretty ? 2 : undefined);
+		} catch {
+			// not valid JSON — fall through to flat redaction only
+		}
+	}
+	content = redactSecretString(content);
 	atomicWriteFile(filePath, content);
 	// Compute hash on written bytes for integrity verification.
 	// Read back the actual file content to handle atomicWrite fallback path
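
The structural-then-flat ordering and the indentation detection can be tried outside the store. The sketch below keeps the control flow from the added lines but swaps in two simplified, hypothetical redactors (`redactSecretsSketch`, `redactFlatSketch`); the real `redactSecrets` and `redactSecretString` cover far more patterns and are not reproduced here.

```typescript
// Hypothetical recursive redactor: blanks values whose key name looks secret.
function redactSecretsSketch(value: unknown): unknown {
	if (Array.isArray(value)) return value.map(redactSecretsSketch);
	if (value !== null && typeof value === "object") {
		const src = value as Record<string, unknown>;
		const out: Record<string, unknown> = {};
		for (const k of Object.keys(src)) {
			out[k] = /api[_-]?key|token|secret|password/i.test(k) ? "***" : redactSecretsSketch(src[k]);
		}
		return out;
	}
	return value;
}

// Hypothetical flat redactor: only handles "Bearer <token>" in free text.
function redactFlatSketch(text: string): string {
	return text.replace(/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer ***");
}

function redactArtifactContent(raw: string): string {
	let content = raw;
	const trimmed = content.trim();
	if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
		try {
			const parsed: unknown = JSON.parse(content);
			// Same heuristic as the diff: a newline or `": ` implies pretty-printed input.
			const isPretty = /\n|"\s*:\s/.test(content);
			content = JSON.stringify(redactSecretsSketch(parsed), null, isPretty ? 2 : undefined);
		} catch {
			// Not valid JSON: fall through to flat redaction only.
		}
	}
	return redactFlatSketch(content);
}

const compactOut = redactArtifactContent('{"api_key":"sk-123","note":"Authorization: Bearer abc.def"}');
const prettyOut = redactArtifactContent('{\n  "partial": false,\n  "api_key": "sk-123"\n}');

console.log(compactOut); // {"api_key":"***","note":"Authorization: Bearer ***"}
console.log(prettyOut.includes('"partial": false')); // true
```

The compact case shows why both passes are needed: the structural pass blanks the quoted `api_key`, while the flat pass catches the Bearer token sitting inside an otherwise innocuous string value.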

package/src/utils/redaction.ts CHANGED

@@ -97,34 +97,54 @@ export function isSecretKey(keyName: string): boolean {
 	return false;
 }
-// Linear-time Authorization header redaction
+// Boundary chars that may precede an "authorization:" or "Bearer " keyword.
+// Includes '-' so prefixed headers (Proxy-Authorization, X-Authorization) and
+// '\t' so tab-indented headers are recognized. See security review L5.
+const AUTH_HEADER_BOUNDARY_CHARS = new Set([" ", ",", "{", "[", "\"", "\r", "\n", "-", "\t"]);
+function isAuthHeaderBoundary(ch: string | undefined): boolean {
+	return ch !== undefined && AUTH_HEADER_BOUNDARY_CHARS.has(ch);
+}
+// Linear-time Authorization header redaction.
+// L3 fix: scan ALL occurrences (previously first-only via indexOf, so a second
+// "authorization:" on a later line leaked). L5 fix: boundary set includes '-'
+// and '\t' so Proxy-Authorization / X-Authorization / tab-indented headers are
+// redacted. Bearer values are left for redactBearerTokens.
 export function redactAuthHeader(line: string): string {
 	const lower = line.toLowerCase();
-	const authIdx = lower.indexOf("authorization:");
-	if (authIdx === -1) return line;
-	// Verify word boundary - must be at start of line or preceded by whitespace/comma/brace
-	if (authIdx > 0) {
-		const before = line[authIdx - 1];
-		if (before !== ' ' && before !== ',' && before !== '{' && before !== '[' && before !== '"' && before !== '\r' && before !== '\n') {
-			return line; // Not a word boundary
+	let result = "";
+	let i = 0;          // emit cursor into the original `line`
+	let searchFrom = 0;  // cursor for the next indexOf scan
+	for (;;) {
+		const authIdx = lower.indexOf("authorization:", searchFrom);
+		if (authIdx === -1) {
+			result += line.substring(i);
+			return result;
 		}
-	}
-	// Check if this is followed by Bearer token (don't redact Bearer tokens separately)
-	// Look for "Bearer" after "authorization:"
-	const afterAuth = lower.substring(authIdx + 14).trimStart();
-	if (!afterAuth.startsWith('bearer ')) {
-		// No Bearer token, this is a regular Authorization header - redact it
-		let end = authIdx + 14;
-		while (end < line.length && line[end] !== "\r" && line[end] !== "\n") {
-			end++;
+		// Emit the unchanged span up to this occurrence.
+		result += line.substring(i, authIdx);
+		const isBoundary = authIdx === 0 || isAuthHeaderBoundary(line[authIdx - 1]);
+		const afterAuth = lower.substring(authIdx + 14).trimStart();
+		if (isBoundary && !afterAuth.startsWith("bearer ")) {
+			// Regular Authorization header — blank the credential value (the rest
+			// of the line is the credential). Appending only a marker would leave
+			// the secret bytes visible; replace them with "***". Bearer values are
+			// left intact here for redactBearerTokens. See security review L3/L5.
+			let end = authIdx + 14;
+			while (end < line.length && line[end] !== "\r" && line[end] !== "\n") {
+				end++;
+			}
+			result += line.substring(authIdx, authIdx + 14) + " ***";
+			i = end;
+			searchFrom = end; // entire line consumed; resume after the line break
+		} else {
+			// Bearer token (handled by redactBearerTokens) OR not a boundary —
+			// keep the "authorization:" literal and continue scanning.
+			result += line.substring(authIdx, authIdx + 14);
+			i = authIdx + 14;
+			searchFrom = authIdx + 14;
 		}
-		return line.substring(0, end) + " ***" + (end < line.length ? line.substring(end) : "");
 	}
-	// It's a Bearer token format - don't redact here, let redactBearerTokens handle it
-	return line;
 }
 // Linear-time Bearer token redaction
@@ -135,14 +155,12 @@ export function redactBearerTokens(line: string): string {
 	while (i < line.length) {
 		if (upper.startsWith("BEARER ", i)) {
-			// Check word boundary: preceded by start, space, comma, brace, or newline
-			if (i > 0) {
-				const before = line[i - 1];
-				if (before !== ' ' && before !== ',' && before !== '{' && before !== '[' && before !== '"' && before !== '\r' && before !== '\n') {
-					result.push(line[i]);
-					i++;
-					continue;
-				}
+			// Check word boundary: start-of-string or a boundary char. Includes '-'
+			// and '\t' (L5) so "Proxy-Authorization: Bearer ..." is redacted.
+			if (i > 0 && !isAuthHeaderBoundary(line[i - 1])) {
+				result.push(line[i]);
+				i++;
+				continue;
 			}
 			// Found "Bearer " - now find the token
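
The new `redactAuthHeader` is small enough to run standalone. The snippet below lifts the added lines of the first hunk into a self-contained file (comments condensed) and exercises the three cases the changelog describes; the sample inputs are illustrative, and `redactBearerTokens` is deliberately left out, so Bearer values pass through untouched here.

```typescript
const AUTH_HEADER_BOUNDARY_CHARS = new Set([" ", ",", "{", "[", "\"", "\r", "\n", "-", "\t"]);

function isAuthHeaderBoundary(ch: string | undefined): boolean {
	return ch !== undefined && AUTH_HEADER_BOUNDARY_CHARS.has(ch);
}

function redactAuthHeader(line: string): string {
	const lower = line.toLowerCase();
	let result = "";
	let i = 0; // emit cursor into the original `line`
	let searchFrom = 0; // cursor for the next indexOf scan
	for (;;) {
		const authIdx = lower.indexOf("authorization:", searchFrom);
		if (authIdx === -1) {
			result += line.substring(i);
			return result;
		}
		// Emit the unchanged span up to this occurrence.
		result += line.substring(i, authIdx);
		const isBoundary = authIdx === 0 || isAuthHeaderBoundary(line[authIdx - 1]);
		const afterAuth = lower.substring(authIdx + 14).trimStart();
		if (isBoundary && !afterAuth.startsWith("bearer ")) {
			// Blank everything after "authorization:" up to the end of the line.
			let end = authIdx + 14;
			while (end < line.length && line[end] !== "\r" && line[end] !== "\n") {
				end++;
			}
			result += line.substring(authIdx, authIdx + 14) + " ***";
			i = end;
			searchFrom = end;
		} else {
			// Bearer value or not a boundary: keep the literal and keep scanning.
			result += line.substring(authIdx, authIdx + 14);
			i = authIdx + 14;
			searchFrom = authIdx + 14;
		}
	}
}

const prefixed = redactAuthHeader("Proxy-Authorization: Basic c2VjcmV0");
const twoLines = redactAuthHeader("authorization: Basic abc123\nauthorization: Basic def456");
const bearer = redactAuthHeader("Authorization: Bearer tok");

console.log(prefixed); // "Proxy-Authorization: ***"
console.log(JSON.stringify(twoLines)); // "authorization: ***\nauthorization: ***"
console.log(bearer); // "Authorization: Bearer tok" (left for redactBearerTokens)
```

The constant `14` is the length of the literal `authorization:`, which is why `substring(authIdx, authIdx + 14)` preserves the header name in its original casing while the value after it is replaced.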