pi-crew 0.8.13 → 0.9.0
- package/CHANGELOG.md +296 -0
- package/README.md +118 -2
- package/docs/FEATURE_INTAKE.md +1 -1
- package/docs/HARNESS.md +20 -19
- package/docs/PROJECT_REVIEW.md +132 -133
- package/docs/PROJECT_REVIEW_FIXES.md +130 -131
- package/docs/actions-reference.md +127 -121
- package/docs/architecture.md +1 -1
- package/docs/code-review-2026-05-11.md +134 -134
- package/docs/commands-reference.md +108 -106
- package/docs/comparison-pi-subagents-vs-pi-crew.md +105 -105
- package/docs/deep-review-report.md +1 -1
- package/docs/dynamic-workflows.md +90 -0
- package/docs/fixes/BATCH_A_H1_H2.md +17 -17
- package/docs/fixes/bug-007-async-notifier-stale-ctx.md +23 -23
- package/docs/followup-plan-2026-05-12.md +135 -135
- package/docs/followup-review-2026-05-12.md +86 -86
- package/docs/followup-review-round3-2026-05-12.md +123 -123
- package/docs/goals.md +59 -0
- package/docs/implementation-plan-top3.md +4 -4
- package/docs/issue-29-analysis.md +2 -2
- package/docs/oh-my-pi-research.md +154 -154
- package/docs/optimization-plan.md +2 -0
- package/docs/perf/baseline-2026-05.md +9 -9
- package/docs/perf/final-report-2026-05.md +2 -2
- package/docs/perf/sprint-1-report.md +2 -2
- package/docs/perf/sprint-2-report.md +1 -1
- package/docs/perf/upgrade-plan-2026-05.md +72 -72
- package/docs/pi-crew-bugs.md +230 -230
- package/docs/pi-crew-investigation-report.md +102 -102
- package/docs/pi-crew-test-round5.md +4 -4
- package/docs/runtime-analysis-child-vs-live.md +57 -57
- package/docs/runtime-migration-in-process-analysis.md +97 -97
- package/install.mjs +3 -2
- package/package.json +2 -4
- package/skills/orchestration/SKILL.md +11 -11
- package/src/agents/agent-config.ts +4 -0
- package/src/config/config.ts +39 -0
- package/src/config/types.ts +11 -0
- package/src/extension/action-suggestions.ts +2 -1
- package/src/extension/async-notifier.ts +10 -0
- package/src/extension/help.ts +14 -0
- package/src/extension/project-init.ts +7 -20
- package/src/extension/registration/commands.ts +27 -0
- package/src/extension/team-tool/destructive-gate.ts +1 -1
- package/src/extension/team-tool/goal-wrap.ts +288 -0
- package/src/extension/team-tool/goal.ts +405 -0
- package/src/extension/team-tool/run.ts +103 -4
- package/src/extension/team-tool/workflow-manage.ts +194 -0
- package/src/extension/team-tool.ts +20 -0
- package/src/hooks/types.ts +3 -1
- package/src/runtime/async-runner.ts +24 -2
- package/src/runtime/background-runner.ts +68 -19
- package/src/runtime/child-pi.ts +6 -1
- package/src/runtime/completion-guard.ts +1 -1
- package/src/runtime/dynamic-workflow-context.ts +450 -0
- package/src/runtime/dynamic-workflow-runner.ts +180 -0
- package/src/runtime/global-worker-cap.ts +96 -0
- package/src/runtime/goal-evaluator.ts +294 -0
- package/src/runtime/goal-loop-runner.ts +612 -0
- package/src/runtime/goal-state-store.ts +209 -0
- package/src/runtime/pi-args.ts +10 -2
- package/src/runtime/result-extractor.ts +32 -0
- package/src/runtime/team-runner.ts +11 -1
- package/src/runtime/verification-gates.ts +85 -5
- package/src/runtime/verification-integrity.ts +110 -0
- package/src/runtime/verification-worktree.ts +136 -0
- package/src/runtime/workspace-lock.ts +448 -0
- package/src/schema/config-schema.ts +26 -0
- package/src/schema/team-tool-schema.ts +39 -4
- package/src/state/atomic-write.ts +9 -0
- package/src/state/contracts.ts +14 -0
- package/src/state/crew-init.ts +18 -5
- package/src/state/event-log.ts +7 -1
- package/src/state/state-store.ts +2 -0
- package/src/state/types.ts +82 -0
- package/src/state/worker-atomic-writer.ts +176 -0
- package/src/utils/redaction.ts +104 -24
- package/src/workflows/discover-workflows.ts +25 -1
- package/src/workflows/workflow-config.ts +13 -0
- package/teams/parallel-research.team.md +1 -1
- package/workflows/examples/hello.dwf.ts +24 -0
|
@@ -1,60 +1,60 @@
|
|
|
1
1
|
# Follow-up Review — pi-crew (2026-05-12, round 3)
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Author: Droid (Factory) | Related: `docs/code-review-2026-05-11.md`, `docs/followup-plan-2026-05-12.md`, `docs/followup-review-2026-05-12.md`. HEAD: `5bee878`.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
This is the review round after commit `5bee878` resolved C1–C6. Goal: scrutinize the modules not yet examined closely (event-log, atomic-write, child-pi, redaction, sleep, hooks, cleanup) to find remaining risks.
|
|
6
6
|
|
|
7
|
-
##
|
|
7
|
+
## Result summary
|
|
8
8
|
|
|
9
9
|
- `npm run typecheck` → Passed
|
|
10
10
|
- `npm run check:lazy-imports` → Passed
|
|
11
11
|
- `npm run test:unit` → **1418 tests / 1415 pass / 0 fail / 3 skip** (212s)
|
|
12
12
|
|
|
13
|
-
|
|
13
|
+
The codebase is in a stable state. The findings below are **low risk** or **defense-in-depth**, not urgent bugs.
|
|
14
14
|
|
|
15
15
|
---
|
|
16
16
|
|
|
17
|
-
##
|
|
17
|
+
## Part A — New findings
|
|
18
18
|
|
|
19
|
-
### D1 — `event-log.appendEvent`
|
|
19
|
+
### D1 — `event-log.appendEvent` has no lock, JSONL may interleave on Windows
|
|
20
20
|
|
|
21
|
-
**Severity:** Medium | **Effort:** ~30
|
|
21
|
+
**Severity:** Medium | **Effort:** ~30 minutes | **File:** `src/state/event-log.ts:148`
|
|
22
22
|
|
|
23
|
-
**
|
|
23
|
+
**Current state:**
|
|
24
24
|
```ts
|
|
25
25
|
fs.appendFileSync(eventsPath, `${JSON.stringify(redactSecrets(fullEvent))}\n`, "utf-8");
|
|
26
26
|
```
|
|
27
27
|
|
|
28
|
-
**
|
|
29
|
-
- `fs.appendFileSync`
|
|
30
|
-
-
|
|
31
|
-
-
|
|
28
|
+
**Problem:**
|
|
29
|
+
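- On POSIX, `fs.appendFileSync` opens the file in append mode, which positions each write at end-of-file but does not guarantee that one large write lands as a single contiguous record (the `PIPE_BUF` ~4 KiB atomicity guarantee applies to pipes and FIFOs, not regular files). A full event JSON (with data, metadata, transcripts) can be large → interleaved lines between 2 processes (parent + background-runner).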
|
|
30
|
+
- On Windows, append is NOT atomic at any size; 2 processes appending to the same eventsPath can produce interleaved JSON lines → `JSON.parse(line)` in `readEvents`/`scanSequence` throws and skips the line.
|
|
31
|
+
- Consequence: lost events, sequence numbers jumping, "appended: false" is returned in a different path (size-limit) but the normal path gives no hint.
|
|
32
32
|
|
|
33
|
-
**Trigger:**
|
|
33
|
+
**Trigger:** running `background-runner` in parallel with the parent writing events to the same `eventsPath` (e.g. cancel + retry in succession).
|
|
34
34
|
|
|
35
|
-
**
|
|
36
|
-
1. Wrap `appendEvent`
|
|
37
|
-
2.
|
|
38
|
-
3.
|
|
35
|
+
**Proposed fix:**
|
|
36
|
+
1. Wrap `appendEvent` in `withRunLockSync(manifest, () => { ... })` — ensures exclusive access.
|
|
37
|
+
2. Or use `fs.openSync(..., O_APPEND | O_WRONLY)` + retry with an advisory lock (`flock` POSIX, `LockFileEx` Windows — via the npm package `proper-lockfile`).
|
|
38
|
+
3. Lightest option: switch to `appendEventAsync` and serialize writes through a queue.
|
|
39
39
|
|
|
40
|
-
**
|
|
40
|
+
**Tests to add:** stress test in `test/integration/`: 2 processes each appending 100 events concurrently → assert total parseable line count = 200.
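
The assertion at the end of that stress test could be sketched as below. This is a minimal sketch only: `countParseableLines` is a hypothetical helper name, not something taken from the pi-crew source, and it assumes the test has already read the whole `events.jsonl` file into a string.

```ts
// Hypothetical helper for the proposed stress test; not part of pi-crew.
function countParseableLines(jsonl: string): { ok: number; bad: number } {
  let ok = 0;
  let bad = 0;
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip the trailing newline and blank lines
    try {
      JSON.parse(line);
      ok += 1;
    } catch {
      bad += 1; // an interleaved or truncated record
    }
  }
  return { ok, bad };
}
```

With two writers appending 100 events each, the test would expect `ok === 200` and `bad === 0`. Counting the bad lines separately helps tell lost events apart from corrupted ones.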
|
|
41
41
|
|
|
42
42
|
---
|
|
43
43
|
|
|
44
|
-
### D2 — `event-log.sequenceCache` Map
|
|
44
|
+
### D2 — `event-log.sequenceCache` Map leaks by number of runs
|
|
45
45
|
|
|
46
|
-
**Severity:** Low | **Effort:** ~10
|
|
46
|
+
**Severity:** Low | **Effort:** ~10 minutes | **File:** `src/state/event-log.ts:60`
|
|
47
47
|
|
|
48
|
-
**
|
|
48
|
+
**Current state:**
|
|
49
49
|
```ts
|
|
50
50
|
const sequenceCache = new Map<string, { size: number; mtimeMs: number; seq: number }>();
|
|
51
51
|
```
|
|
52
52
|
|
|
53
|
-
**
|
|
54
|
-
- Module-level map,
|
|
55
|
-
- Memory
|
|
53
|
+
**Problem:**
|
|
54
|
+
- Module-level map, never evicts. Each `eventsPath` (1 per run) takes 1 entry. A long-running parent process (e.g. live-session-runtime) running for many days → the cache could reach thousands of entries.
|
|
55
|
+
- Memory isn't large (~100 bytes/entry) but is unbounded.
|
|
56
56
|
|
|
57
|
-
**
|
|
57
|
+
**Proposed fix:** use a simple LRU (Map with a max size, evict oldest when exceeding the threshold), or clear after the run ends:
|
|
58
58
|
```ts
|
|
59
59
|
export function evictSequenceCache(eventsPath: string): void {
|
|
60
60
|
sequenceCache.delete(eventsPath);
|
|
@@ -64,11 +64,11 @@ export function evictSequenceCache(eventsPath: string): void {
|
|
|
64
64
|
|
|
65
65
|
---
|
|
66
66
|
|
|
67
|
-
### D3 — `atomicWriteFileAsync`
|
|
67
|
+
### D3 — `atomicWriteFileAsync` has a "matches" fallback → sync path doesn't (parity)
|
|
68
68
|
|
|
69
|
-
**Severity:** Low | **Effort:** ~15
|
|
69
|
+
**Severity:** Low | **Effort:** ~15 minutes | **File:** `src/state/atomic-write.ts:122-138`
|
|
70
70
|
|
|
71
|
-
**
|
|
71
|
+
**Current state:**
|
|
72
72
|
```ts
|
|
73
73
|
// async path:
|
|
74
74
|
try { await renameWithRetryAsync(...); }
|
|
@@ -79,15 +79,15 @@ catch (renameError) {
|
|
|
79
79
|
throw renameError;
|
|
80
80
|
}
|
|
81
81
|
|
|
82
|
-
// sync path:
|
|
82
|
+
// sync path: just throws, no "matches" fallback.
|
|
83
83
|
```
|
|
84
84
|
|
|
85
|
-
**
|
|
86
|
-
-
|
|
87
|
-
-
|
|
88
|
-
-
|
|
85
|
+
**Problem:**
|
|
86
|
+
- The async path "forgives" the race condition (the file was already written with the correct content by another process). The sync path throws hard.
|
|
87
|
+
- Different semantics → hard to debug when someone uses sync with a race.
|
|
88
|
+
- This case is rare (identical content), but the asymmetry is a code smell.
|
|
89
89
|
|
|
90
|
-
**
|
|
90
|
+
**Proposed fix:** add the same fallback to sync, or remove the fallback from async (pick one consistent convention):
|
|
91
91
|
```ts
|
|
92
92
|
} catch (renameError) {
|
|
93
93
|
try {
|
|
@@ -103,22 +103,22 @@ catch (renameError) {
|
|
|
103
103
|
|
|
104
104
|
---
|
|
105
105
|
|
|
106
|
-
### D4 — `withRunLock` (async)
|
|
106
|
+
### D4 — `withRunLock` (async) waits until the deadline for an active lock, `withRunLockSync` throws immediately
|
|
107
107
|
|
|
108
|
-
**Severity:** Low | **Effort:** ~10
|
|
108
|
+
**Severity:** Low | **Effort:** ~10 minutes | **File:** `src/state/locks.ts:91-110`
|
|
109
109
|
|
|
110
|
-
**
|
|
111
|
-
- Sync: `if (!isLockStale(...)) throw ...` → fail fast
|
|
112
|
-
- Async:
|
|
110
|
+
**Current state:**
|
|
111
|
+
- Sync: `if (!isLockStale(...)) throw ...` → fail fast for an active lock.
|
|
112
|
+
- Async: only checks staleness in `readLockStateAsync`, doesn't throw for active → loops waiting until the deadline (`staleMs * 2`, usually 60s).
|
|
113
113
|
|
|
114
|
-
**
|
|
115
|
-
-
|
|
116
|
-
- Async
|
|
117
|
-
- BUG-004 (round 1)
|
|
114
|
+
**Problem:**
|
|
115
|
+
- The test `withRunLockSync throws immediately on active (non-stale) lock` proves sync throws immediately.
|
|
116
|
+
- Async will hang ~60s then throw → slow cancel/retry experience.
|
|
117
|
+
- BUG-004 (round 1) aimed to unify sync ↔ async, but this semantic asymmetry remains.
|
|
118
118
|
|
|
119
|
-
**
|
|
120
|
-
- Sync:
|
|
121
|
-
- Async: throw
|
|
119
|
+
**Proposed fix:** unify by one of:
|
|
120
|
+
- Sync: add a short wait + retry like async (wait up to 1-2s then throw).
|
|
121
|
+
- Async: throw immediately when the lock isn't stale (like sync) — usually better since the caller can retry with higher context.
|
|
122
122
|
|
|
123
123
|
```ts
|
|
124
124
|
async function acquireLockWithRetryAsync(filePath: string, staleMs: number): Promise<void> {
|
|
@@ -141,25 +141,25 @@ async function acquireLockWithRetryAsync(filePath: string, staleMs: number): Pro
|
|
|
141
141
|
}
|
|
142
142
|
```
|
|
143
143
|
|
|
144
|
-
**
|
|
144
|
+
**Tests to add:** mirror the sync test — `withRunLock async throws immediately on active (non-stale) lock`.
|
|
145
145
|
|
|
146
146
|
---
|
|
147
147
|
|
|
148
|
-
### D5 — `sleep.ts`
|
|
148
|
+
### D5 — `sleep.ts` uses `require()` in an ES module
|
|
149
149
|
|
|
150
|
-
**Severity:** Low (style) | **Effort:** ~5
|
|
150
|
+
**Severity:** Low (style) | **Effort:** ~5 minutes | **File:** `src/utils/sleep.ts:18`
|
|
151
151
|
|
|
152
|
-
**
|
|
152
|
+
**Current state:**
|
|
153
153
|
```ts
|
|
154
154
|
const { execFileSync } = require("node:child_process") as typeof import("node:child_process");
|
|
155
155
|
```
|
|
156
156
|
|
|
157
|
-
**
|
|
158
|
-
-
|
|
159
|
-
- AGENTS.md: "Avoid dynamic inline imports, EXCEPT at documented lazy-load boundaries to defer heavy runtime cost (mark with `// LAZY: <reason>`)". `require`
|
|
160
|
-
- `child_process`
|
|
157
|
+
**Problem:**
|
|
158
|
+
- The project is ESM (`"type": "module"`). `require` only works via strip-types backward-compat — not clean.
|
|
159
|
+
- AGENTS.md: "Avoid dynamic inline imports, EXCEPT at documented lazy-load boundaries to defer heavy runtime cost (mark with `// LAZY: <reason>`)". This `require` has no marker.
|
|
160
|
+
- `child_process` isn't heavy — top-level import is fine.
|
|
161
161
|
|
|
162
|
-
**
|
|
162
|
+
**Proposed fix:**
|
|
163
163
|
```ts
|
|
164
164
|
import { execFileSync } from "node:child_process";
|
|
165
165
|
// ...
|
|
@@ -168,20 +168,20 @@ execFileSync("sleep", [(ms / 1000).toFixed(3)], { timeout: ms + 1000, stdio: "pi
|
|
|
168
168
|
|
|
169
169
|
---
|
|
170
170
|
|
|
171
|
-
### D6 — `iteration-hooks.runIterationHook`
|
|
171
|
+
### D6 — `iteration-hooks.runIterationHook` doesn't filter env like post-checks
|
|
172
172
|
|
|
173
|
-
**Severity:** Low | **Effort:** ~5
|
|
173
|
+
**Severity:** Low | **Effort:** ~5 minutes | **File:** `src/runtime/iteration-hooks.ts:140`
|
|
174
174
|
|
|
175
|
-
**
|
|
175
|
+
**Current state:**
|
|
176
176
|
```ts
|
|
177
177
|
env: { PATH: process.env.PATH ?? "/usr/bin:/bin", HOME: process.env.HOME ?? "/tmp", USER: process.env.USER, LANG: process.env.LANG, PI_CREW_HOOK: "1" },
|
|
178
178
|
```
|
|
179
179
|
|
|
180
|
-
**
|
|
181
|
-
-
|
|
182
|
-
-
|
|
180
|
+
**Problem:**
|
|
181
|
+
- Already restricted manually, OK for Linux. But on Windows it lacks `USERPROFILE`, `TEMP`, `TMP`, `ComSpec`, `SystemRoot` → `.cmd/.ps1` scripts may fail.
|
|
182
|
+
- post-checks.ts has the same pattern (line 82) — inconsistent with worktree-manager.runSetupHook which moved to `sanitizeEnvSecrets(..., { allowList: [...] })`.
|
|
183
183
|
|
|
184
|
-
**
|
|
184
|
+
**Proposed fix:** apply `sanitizeEnvSecrets` with an allowList, unifying all 3 sites (post-checks, iteration-hooks, setup-hook):
|
|
185
185
|
```ts
|
|
186
186
|
import { sanitizeEnvSecrets } from "../utils/env-filter.ts";
|
|
187
187
|
const HOOK_ENV_ALLOW = ["PATH", "HOME", "USER", "USERPROFILE", "TEMP", "TMP", "TMPDIR", "LANG", "LC_ALL", "ComSpec", "SystemRoot", "PI_*"];
|
|
@@ -189,30 +189,30 @@ const HOOK_ENV_ALLOW = ["PATH", "HOME", "USER", "USERPROFILE", "TEMP", "TMP", "T
|
|
|
189
189
|
env: { ...sanitizeEnvSecrets(process.env, { allowList: HOOK_ENV_ALLOW }), PI_CREW_HOOK: "1" },
|
|
190
190
|
```
|
|
191
191
|
|
|
192
|
-
**
|
|
193
|
-
- 1
|
|
194
|
-
-
|
|
195
|
-
-
|
|
192
|
+
**Benefits:**
|
|
193
|
+
- 1 source of truth for the hook env whitelist.
|
|
194
|
+
- Supports Windows .cmd/.ps1 (USERPROFILE/TEMP needed).
|
|
195
|
+
- Avoids code duplication.
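
For readers unfamiliar with allow-list filtering, here is a rough sketch of how an entry such as `"PI_*"` might be matched. It is an illustration under stated assumptions: `pickAllowedEnv` and `matchesAllowEntry` are hypothetical names, and the real `sanitizeEnvSecrets` may treat wildcards and letter case differently.

```ts
// Hypothetical sketch; the real sanitizeEnvSecrets in pi-crew may behave differently.
type Env = Record<string, string | undefined>;

function matchesAllowEntry(name: string, entry: string, ignoreCase: boolean): boolean {
  const n = ignoreCase ? name.toUpperCase() : name;
  const e = ignoreCase ? entry.toUpperCase() : entry;
  // A trailing "*" is treated as a prefix wildcard, e.g. "PI_*" matches "PI_CREW_HOOK".
  if (e.endsWith("*")) return n.startsWith(e.slice(0, -1));
  return n === e;
}

function pickAllowedEnv(
  env: Env,
  allowList: readonly string[],
  ignoreCase = false,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(env)) {
    if (value === undefined) continue; // drop unset variables instead of passing "undefined"
    if (allowList.some((entry) => matchesAllowEntry(name, entry, ignoreCase))) {
      out[name] = value;
    }
  }
  return out;
}
```

The `ignoreCase` flag is there because Windows variable names are case-insensitive and often appear as `Path` rather than `PATH`; a shared helper would probably want to enable it on `win32`.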
|
|
196
196
|
|
|
197
197
|
---
|
|
198
198
|
|
|
199
|
-
### D7 — `cleanup.ts` git helper
|
|
199
|
+
### D7 — `cleanup.ts` git helper doesn't force locale (consistency with worktree-manager)
|
|
200
200
|
|
|
201
|
-
**Severity:** Info | **Effort:** ~2
|
|
201
|
+
**Severity:** Info | **Effort:** ~2 minutes | **File:** `src/worktree/cleanup.ts:15`
|
|
202
202
|
|
|
203
|
-
**
|
|
203
|
+
**Current state:**
|
|
204
204
|
```ts
|
|
205
205
|
function git(cwd: string, args: string[]): string {
|
|
206
206
|
return execFileSync("git", args, { cwd, encoding: "utf-8", stdio: ["ignore", "pipe", "pipe"] }).trim();
|
|
207
207
|
}
|
|
208
208
|
```
|
|
209
209
|
|
|
210
|
-
**
|
|
211
|
-
- `worktree-manager.ts`
|
|
212
|
-
-
|
|
213
|
-
- `branch-freshness.ts`
|
|
210
|
+
**Problem:**
|
|
211
|
+
- `worktree-manager.ts` already forces `LANG: "C", LC_ALL: "C"` (C5 round 2). `cleanup.ts` doesn't yet.
|
|
212
|
+
- Doesn't cause a current bug because cleanup doesn't parse error strings; but if error parsing is added later, this gets missed again.
|
|
213
|
+
- `branch-freshness.ts` has the same issue.
|
|
214
214
|
|
|
215
|
-
**
|
|
215
|
+
**Proposed fix:** extract a `gitExec()` helper into a shared `src/utils/git-exec.ts`, used by all 3 files:
|
|
216
216
|
```ts
|
|
217
217
|
// src/utils/git-exec.ts
|
|
218
218
|
import { execFileSync } from "node:child_process";
|
|
@@ -226,44 +226,44 @@ export function gitExec(cwd: string, args: string[]): string {
|
|
|
226
226
|
|
|
227
227
|
---
|
|
228
228
|
|
|
229
|
-
### D8 — `redaction.PEM_PRIVATE_KEY_PATTERN`
|
|
229
|
+
### D8 — `redaction.PEM_PRIVATE_KEY_PATTERN` has no length limit → low ReDoS potential
|
|
230
230
|
|
|
231
|
-
**Severity:** Info | **Effort:** ~5
|
|
231
|
+
**Severity:** Info | **Effort:** ~5 minutes | **File:** `src/utils/redaction.ts:7`
|
|
232
232
|
|
|
233
|
-
**
|
|
233
|
+
**Current state:**
|
|
234
234
|
```ts
|
|
235
235
|
const PEM_PRIVATE_KEY_PATTERN = /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g;
|
|
236
236
|
```
|
|
237
237
|
|
|
238
|
-
**
|
|
239
|
-
-
|
|
240
|
-
-
|
|
238
|
+
**Problem:**
|
|
239
|
+
- The lazy `[\s\S]*?` has no catastrophic backtracking, but if the input has a BEGIN without an END, the match attempt scans forward to the end of the string before failing. With a long JSONL transcript (10+ MB), the regex scans the whole thing.
|
|
240
|
+
- Not a true ReDoS (linear), but can be slow.
|
|
241
241
|
|
|
242
|
-
**
|
|
242
|
+
**Proposed fix:** add a hard 64KB limit for a PEM block (real PEM ~3KB):
|
|
243
243
|
```ts
|
|
244
244
|
const PEM_PRIVATE_KEY_PATTERN = /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]{0,65536}?-----END [A-Z ]*PRIVATE KEY-----/g;
|
|
245
245
|
```
|
|
246
246
|
|
|
247
|
-
Trade-off: PEM > 64KB
|
|
247
|
+
Trade-off: PEM > 64KB won't be fully redacted. Rare in practice.
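
The trade-off can be seen in a small experiment. The pattern below is the bounded one proposed for D8; the `redactPem` wrapper, the placeholder text, and the sample inputs are assumptions made only for this illustration.

```ts
// Bounded pattern as proposed for D8; redactPem is a hypothetical wrapper.
const PEM_PRIVATE_KEY_PATTERN =
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]{0,65536}?-----END [A-Z ]*PRIVATE KEY-----/g;

function redactPem(text: string): string {
  return text.replace(PEM_PRIVATE_KEY_PATTERN, "[REDACTED PEM]");
}

const begin = "-----BEGIN RSA PRIVATE KEY-----\n";
const end = "\n-----END RSA PRIVATE KEY-----";

// A realistically sized block (about 3 KB of body) is replaced as a whole.
const normal = `key: ${begin}${"A".repeat(3000)}${end} tail`;

// A body longer than the 65536-character bound no longer matches at all.
const oversized = `key: ${begin}${"A".repeat(70000)}${end} tail`;

const normalRedacted = redactPem(normal);
const oversizedRedacted = redactPem(oversized);
```

Here `normalRedacted` no longer contains the key body, while `oversizedRedacted` is identical to its input, which is the "won't be fully redacted" case described above.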
|
|
248
248
|
|
|
249
249
|
---
|
|
250
250
|
|
|
251
|
-
### D9 — `subagent-manager.persistedSubagentPath`
|
|
251
|
+
### D9 — `subagent-manager.persistedSubagentPath` doesn't validate `id` → potential path traversal
|
|
252
252
|
|
|
253
|
-
**Severity:** Low | **Effort:** ~5
|
|
253
|
+
**Severity:** Low | **Effort:** ~5 minutes | **File:** `src/runtime/subagent-manager.ts:58`
|
|
254
254
|
|
|
255
|
-
**
|
|
255
|
+
**Current state:**
|
|
256
256
|
```ts
|
|
257
257
|
function persistedSubagentPath(cwd: string, id: string): string {
|
|
258
258
|
return path.join(projectCrewRoot(cwd), DEFAULT_PATHS.state.subagentsSubdir, `${id}.json`);
|
|
259
259
|
}
|
|
260
260
|
```
|
|
261
261
|
|
|
262
|
-
**
|
|
263
|
-
- `id`
|
|
264
|
-
-
|
|
262
|
+
**Problem:**
|
|
263
|
+
- `id` is currently generated internally (`agent_${Date.now().toString(36)}_${counter.toString(36)}`) → safe.
|
|
264
|
+
- But `readPersistedSubagentRecord(cwd, id)` is called with `id` from an external source (e.g. `get_subagent_result` tool param). If tool param validation is missing, `id = "../../../etc/passwd"` could read a file outside the state dir.
|
|
265
265
|
|
|
266
|
-
**
|
|
266
|
+
**Proposed fix:** validate that `id` matches `^[a-z0-9_]+$` (case-insensitive) and is at most 128 characters:
|
|
267
267
|
```ts
|
|
268
268
|
function isValidSubagentId(id: string): boolean {
|
|
269
269
|
return /^[a-z0-9_]+$/i.test(id) && id.length <= 128;
|
|
@@ -274,59 +274,59 @@ function persistedSubagentPath(cwd: string, id: string): string {
|
|
|
274
274
|
}
|
|
275
275
|
```
|
|
276
276
|
|
|
277
|
-
|
|
277
|
+
Check the `get_subagent_result` tool schema to see if it sanitizes id already; if so, D9 is just defense-in-depth.
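
As a complementary defense-in-depth layer, the resolved path itself could also be checked for containment. The following is a sketch under assumptions: `resolveInsideDir` is a hypothetical helper written for this note and is not the `resolveInside` function that `artifact-store.ts` already has.

```ts
import * as path from "node:path";

// Hypothetical helper; not taken from the pi-crew source.
function resolveInsideDir(baseDir: string, fileName: string): string | undefined {
  const base = path.resolve(baseDir);
  const candidate = path.resolve(base, fileName);
  const rel = path.relative(base, candidate);
  // Reject the base dir itself, anything that climbs out of it ("..")
  // and, on Windows, paths that resolve to a different drive.
  const escapes = rel === ".." || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel);
  if (rel === "" || escapes) return undefined;
  return candidate;
}
```

A caller would treat `undefined` as "invalid id" and refuse to read. Note that this check is purely lexical and does not follow symlinks, so it complements the id pattern rather than replacing it.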
|
|
278
278
|
|
|
279
279
|
---
|
|
280
280
|
|
|
281
|
-
##
|
|
281
|
+
## Implementation priority
|
|
282
282
|
|
|
283
|
-
| # | Item | Severity | Effort |
|
|
283
|
+
| # | Item | Severity | Effort | Recommendation |
|
|
284
284
|
|---|---|---|---|---|
|
|
285
|
-
| 1 | D1 (event-log concurrent append) | Medium | 30
|
|
286
|
-
| 2 | D6 (hook env allowList
|
|
287
|
-
| 3 | D4 (async lock fail-fast
|
|
288
|
-
| 4 | D9 (subagent id validate) | Low | 5
|
|
289
|
-
| 5 | D2 (sequenceCache eviction) | Low | 10
|
|
290
|
-
| 6 | D3 (atomic-write sync/async parity) | Low | 15
|
|
291
|
-
| 7 | D5 (sleep.ts ESM require) | Low | 5
|
|
292
|
-
| 8 | D7 (git helper consolidate) | Info | 2
|
|
293
|
-
| 9 | D8 (PEM regex limit) | Info | 5
|
|
294
|
-
|
|
295
|
-
**
|
|
296
|
-
**
|
|
285
|
+
| 1 | D1 (event-log concurrent append) | Medium | 30 minutes | Current sprint — prevent event loss/corruption |
|
|
286
|
+
| 2 | D6 (consistent hook env allowList) | Low | 5 minutes | Current sprint — sync with the setup-hook fix |
|
|
287
|
+
| 3 | D4 (async lock fail-fast for active) | Low | 10 minutes | Current sprint — cancel/retry UX |
|
|
288
|
+
| 4 | D9 (subagent id validate) | Low | 5 minutes | Current sprint — defense-in-depth |
|
|
289
|
+
| 5 | D2 (sequenceCache eviction) | Low | 10 minutes | Next sprint |
|
|
290
|
+
| 6 | D3 (atomic-write sync/async parity) | Low | 15 minutes | Next sprint |
|
|
291
|
+
| 7 | D5 (sleep.ts ESM require) | Low | 5 minutes | Next sprint |
|
|
292
|
+
| 8 | D7 (git helper consolidate) | Info | 2 minutes | Anytime |
|
|
293
|
+
| 9 | D8 (PEM regex limit) | Info | 5 minutes | Anytime |
|
|
294
|
+
|
|
295
|
+
**Total effort priority 1 (must-fix):** ~50 minutes.
|
|
296
|
+
**Total effort priority 2 (nice-to-have):** ~37 minutes.
|
|
297
297
|
|
|
298
298
|
---
|
|
299
299
|
|
|
300
|
-
##
|
|
300
|
+
## Proposed commit batches
|
|
301
301
|
|
|
302
|
-
- **Batch 1 (correctness/security):** D1 + D9 + D4 → 1 PR "event-log lock + subagent id guard + async lock parity" (~45
|
|
303
|
-
- **Batch 2 (hardening):** D6 + D7 + D2 → 1 PR "hook env allowList consolidation + git helper extract + cache eviction" (~20
|
|
304
|
-
- **Batch 3 (polish):** D3 + D5 + D8 → 1 PR "atomic-write parity + ESM cleanup + redaction limit" (~25
|
|
302
|
+
- **Batch 1 (correctness/security):** D1 + D9 + D4 → 1 PR "event-log lock + subagent id guard + async lock parity" (~45 minutes).
|
|
303
|
+
- **Batch 2 (hardening):** D6 + D7 + D2 → 1 PR "hook env allowList consolidation + git helper extract + cache eviction" (~20 minutes).
|
|
304
|
+
- **Batch 3 (polish):** D3 + D5 + D8 → 1 PR "atomic-write parity + ESM cleanup + redaction limit" (~25 minutes).
|
|
305
305
|
|
|
306
306
|
---
|
|
307
307
|
|
|
308
|
-
##
|
|
308
|
+
## Positive notes after round 3
|
|
309
309
|
|
|
310
|
-
-
|
|
311
|
-
- 1418 tests pass (
|
|
312
|
-
- `npm run check:lazy-imports`
|
|
313
|
-
- `sanitizeEnvSecrets`
|
|
314
|
-
- `resolveShellForScript`
|
|
315
|
-
- `parent-guard` polling
|
|
316
|
-
-
|
|
317
|
-
- Atomic-write
|
|
318
|
-
- Subagent records persisted
|
|
319
|
-
- Background runner
|
|
310
|
+
- All C1–C6 (round 2) fixed correctly per spec.
|
|
311
|
+
- 1418 tests run (1415 pass, 3 skip; vs 1411 the previous round → +7 new tests), 0 fail.
|
|
312
|
+
- `npm run check:lazy-imports` now runs on Windows (after removing `sed`).
|
|
313
|
+
- `sanitizeEnvSecrets` has both a deny-list (default) and an allow-list mode → good flexibility.
|
|
314
|
+
- `resolveShellForScript` correctly handles `.bat/.cmd` to guard against CVE-2024-27980.
|
|
315
|
+
- `parent-guard` polling works well cross-platform (POSIX + Windows).
|
|
316
|
+
- Multi-layer redaction pipeline (key-name + inline-substring + auth-header + bearer + PEM).
|
|
317
|
+
- Atomic-write has O_EXCL + O_NOFOLLOW + post-open `isFile()` verification.
|
|
318
|
+
- Subagent records persisted under the redaction filter.
|
|
319
|
+
- Background runner has `parent-guard` + tempdir cleanup + final-drain timer.
|
|
320
320
|
|
|
321
321
|
---
|
|
322
322
|
|
|
323
|
-
##
|
|
323
|
+
## Areas with NO serious issues (reviewed)
|
|
324
324
|
|
|
325
|
-
- `src/schema/team-tool-schema.ts` — TypeBox schema
|
|
326
|
-
- `src/state/artifact-store.ts` — path traversal blocking
|
|
325
|
+
- `src/schema/team-tool-schema.ts` — TypeBox schema has the "retry" literal, strict additionalProperties.
|
|
326
|
+
- `src/state/artifact-store.ts` — 2-layer path traversal blocking (`resolveInside` + `resolveRealContainedPath`), post-redaction hash.
|
|
327
327
|
- `src/state/atomic-write.ts` — symlink-safe, O_EXCL, fd-based stat verification.
|
|
328
328
|
- `src/worktree/worktree-manager.ts` — branchExists local+remote, prune stale, env filter, locale-safe error parse.
|
|
329
|
-
- `src/runtime/async-runner.ts` — jiti + strip-types fallback,
|
|
329
|
+
- `src/runtime/async-runner.ts` — jiti + strip-types fallback, multi-candidate path.
|
|
330
330
|
- `src/runtime/child-pi.ts` — env sanitize, redacted transcript, post-exit stdio guard, hard kill timer.
|
|
331
331
|
- `src/runtime/parent-guard.ts` — kill(pid,0) cross-platform, unref'd interval.
|
|
332
332
|
|
package/docs/goals.md
ADDED
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# Goal Loops (`team action='goal'`)
|
|
2
|
+
|
|
3
|
+
pi-crew v0.9.0 introduces an autonomous goal loop, modeled on Claude Code's `/goal`.
|
|
4
|
+
|
|
5
|
+
## What it does
|
|
6
|
+
|
|
7
|
+
A goal loop turns a single objective into a long-running, self-directed multi-turn
|
|
8
|
+
process:
|
|
9
|
+
|
|
10
|
+
1. A **worker** agent does one turn of work (`executeTeamRun`).
|
|
11
|
+
2. A separate **evaluator** model (the "goal-judge") reads the turn's transcript +
|
|
12
|
+
tool calls + verification results and returns a verdict:
|
|
13
|
+
`{ achieved: bool, reason: string, evidenceRefs?: string[] }`.
|
|
14
|
+
3. If **not achieved**, the `reason` is prepended to the next turn's prompt and the
|
|
15
|
+
loop continues.
|
|
16
|
+
4. Stops on: `achieved` / `maxTurns` reached / `budgetAbort` exceeded / `BLOCKED:` /
|
|
17
|
+
user `stop`.
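
The steps above can be pictured with the following minimal sketch. Everything in it is an assumption made for illustration: the type names, the `runTurn` and `evaluate` callbacks (standing in for the worker turn and the goal-judge), and the choice to detect `BLOCKED:` as a prefix of the verdict's reason. The real loop lives in `goal-loop-runner.ts` and may differ; user `stop` is left out.

```ts
// Hypothetical types and names; not the actual pi-crew implementation.
interface Verdict { achieved: boolean; reason: string; evidenceRefs?: string[] }
interface TurnResult { transcript: string; tokens: number }
type StopReason = "achieved" | "maxTurns" | "budgetAbort" | "blocked";

interface GoalLoopOptions {
  objective: string;
  maxTurns: number;
  budgetTotal: number;
  runTurn: (prompt: string) => Promise<TurnResult>; // stands in for the worker turn
  evaluate: (turn: TurnResult) => Promise<Verdict>; // stands in for the goal-judge
}

async function runGoalLoopSketch(
  opts: GoalLoopOptions,
): Promise<{ stop: StopReason; turns: number; budgetUsed: number }> {
  let feedback = "";
  let budgetUsed = 0;
  for (let turn = 1; turn <= opts.maxTurns; turn++) {
    // The previous verdict's reason is prepended to the next turn's prompt.
    const prompt = feedback ? `${feedback}\n\n${opts.objective}` : opts.objective;
    const result = await opts.runTurn(prompt);
    budgetUsed += result.tokens; // budget is the sum of tokens over all turns
    const verdict = await opts.evaluate(result);
    if (verdict.achieved) return { stop: "achieved", turns: turn, budgetUsed };
    if (verdict.reason.startsWith("BLOCKED:")) return { stop: "blocked", turns: turn, budgetUsed };
    if (budgetUsed > opts.budgetTotal) return { stop: "budgetAbort", turns: turn, budgetUsed };
    feedback = verdict.reason;
  }
  return { stop: "maxTurns", turns: opts.maxTurns, budgetUsed };
}
```

Passing the worker and the judge in as callbacks keeps the sketch testable with fakes and mirrors the separation between the two roles described above.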
|
|
18
|
+
|
|
19
|
+
## Usage
|
|
20
|
+
|
|
21
|
+
```
|
|
22
|
+
team action='goal', config.subAction='start',
|
|
23
|
+
config.objective='Migrate src/auth from JS to TS. Done when tsc --noEmit=0 and test/auth passes.',
|
|
24
|
+
config.evaluatorModel='haiku',
|
|
25
|
+
config.maxTurns=20,
|
|
26
|
+
budgetTotal=500000
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
Sub-actions: `start | status | pause | resume | stop | step | clear`.
|
|
30
|
+
|
|
31
|
+
Slash command: `/team-goal start --objective='...' --evaluatorModel='...'`
|
|
32
|
+
|
|
33
|
+
## Design notes (see `research-findings/goal-workflow/`)
|
|
34
|
+
|
|
35
|
+
- **One manifest per turn** — `TEAM_RUN_STATUS_TRANSITIONS` + `shouldMergeTaskUpdate`
|
|
36
|
+
block re-driving a terminal manifest, so each turn is a fresh `createRunManifest`.
|
|
37
|
+
The goal loop owns OUTER state in `GoalLoopState` at
|
|
38
|
+
`<crewRoot>/state/goals/<goalId>.json`.
|
|
39
|
+
- **Feedback via `manifest.goal`** — the verdict's `reason` is composed into the next
|
|
40
|
+
turn's `manifest.goal` (re-read lazily each render). `session.steer` is NOT used
|
|
41
|
+
(it's a no-op for child-process runs).
|
|
42
|
+
- **Budget via `collectRunMetrics`** — `budgetUsed = Σ over turns of collectRunMetrics(cwd, turnRunId).totalTokens`.
|
|
43
|
+
(`loadRunMetrics`/`saveRunMetrics` have 0 callers — do not use.)
|
|
44
|
+
- **Judge lockdown** — the synthesized `goal-judge` AgentConfig sets `disableTools:true`
|
|
45
|
+
(Pi `--no-tools`), `excludeTools:[bash,read,write,edit]`, `inheritContext:false`,
|
|
46
|
+
`excludeContextBash:true`, `parentContext:undefined`, `maxTurns:1`. An empty
|
|
47
|
+
`tools:[]` is INSUFFICIENT because `pi-args.ts` skips empty arrays.
|
|
48
|
+
- **Trust boundary** — the judge is capability-locked (no agency, only emits a verdict).
|
|
49
|
+
|
|
50
|
+
## Background spawn
|
|
51
|
+
|
|
52
|
+
The loop runs in the background via `runKind:'goal-loop'` (background-runner.ts
|
|
53
|
+
dispatch). The handler `handleGoal('start')` writes the `GoalLoopState`; the
|
|
54
|
+
background process calls `runGoalLoop` which loops `executeTeamRun` per turn.
|
|
55
|
+
|
|
56
|
+
## Hooks
|
|
57
|
+
|
|
58
|
+
- `before_goal_step` — fires before each turn.
|
|
59
|
+
- `before_goal_abort` — fires before a budget/maxTurns abort.
|
|
@@ -303,12 +303,12 @@ interface ParsedPr {
|
|
|
303
303
|
|
|
304
304
|
**New tool:** `read-issue` / `read-pr`
|
|
305
305
|
```typescript
|
|
306
|
-
// agents
|
|
307
|
-
// read({ path: "issue://123" }) → markdown
|
|
308
|
-
// read({ path: "pr://456" }) → markdown
|
|
306
|
+
// agents can call:
|
|
307
|
+
// read({ path: "issue://123" }) → markdown of the issue
|
|
308
|
+
// read({ path: "pr://456" }) → markdown of the PR
|
|
309
309
|
```
|
|
310
310
|
|
|
311
|
-
**New slash command:** `/issue`
|
|
311
|
+
**New slash command:** `/issue` or use the existing `/crew`:
|
|
312
312
|
```
|
|
313
313
|
/crew create-issue "Task failed: fix memory leak in cache" --labels=bug --assignee=me
|
|
314
314
|
/crew link-issue TICKET-123 --task=explorer-1
|
|
@@ -182,8 +182,8 @@ For the `subagent-manager.ts` defense-in-depth: spawn a subagent whose `runner`
|
|
|
182
182
|
- ✅ Issue reproduced in code (all 11 sites verified)
|
|
183
183
|
- ✅ Root cause identified
|
|
184
184
|
- ✅ Fix plan drafted
|
|
185
|
-
- ❌ **Code changes NOT applied** (per user request: "
|
|
185
|
+
- ❌ **Code changes NOT applied** (per user request: "nothing has been changed yet")
|
|
186
186
|
- ❌ Tests NOT added
|
|
187
187
|
- ❌ v0.6.2 NOT released
|
|
188
188
|
|
|
189
|
-
|
|
189
|
+
Ready to apply when requested by the user.
|