pi-crew 0.8.14 → 0.9.1
- package/CHANGELOG.md +366 -0
- package/README.md +112 -2
- package/docs/FEATURE_INTAKE.md +1 -1
- package/docs/HARNESS.md +20 -19
- package/docs/PROJECT_REVIEW.md +132 -133
- package/docs/PROJECT_REVIEW_FIXES.md +130 -131
- package/docs/actions-reference.md +127 -121
- package/docs/architecture.md +1 -1
- package/docs/code-review-2026-05-11.md +134 -134
- package/docs/commands-reference.md +108 -106
- package/docs/comparison-pi-subagents-vs-pi-crew.md +105 -105
- package/docs/deep-review-report.md +1 -1
- package/docs/dynamic-workflows.md +90 -0
- package/docs/fixes/BATCH_A_H1_H2.md +17 -17
- package/docs/fixes/bug-007-async-notifier-stale-ctx.md +23 -23
- package/docs/followup-plan-2026-05-12.md +135 -135
- package/docs/followup-review-2026-05-12.md +86 -86
- package/docs/followup-review-round3-2026-05-12.md +123 -123
- package/docs/goals.md +59 -0
- package/docs/implementation-plan-top3.md +4 -4
- package/docs/issue-29-analysis.md +2 -2
- package/docs/oh-my-pi-research.md +154 -154
- package/docs/optimization-plan.md +2 -0
- package/docs/perf/baseline-2026-05.md +9 -9
- package/docs/perf/final-report-2026-05.md +2 -2
- package/docs/perf/sprint-1-report.md +2 -2
- package/docs/perf/sprint-2-report.md +1 -1
- package/docs/perf/upgrade-plan-2026-05.md +72 -72
- package/docs/pi-crew-bugs.md +230 -230
- package/docs/pi-crew-investigation-report.md +102 -102
- package/docs/pi-crew-test-round5.md +4 -4
- package/docs/runtime-analysis-child-vs-live.md +57 -57
- package/docs/runtime-migration-in-process-analysis.md +97 -97
- package/package.json +2 -4
- package/skills/orchestration/SKILL.md +11 -11
- package/src/agents/agent-config.ts +4 -0
- package/src/config/config.ts +39 -0
- package/src/config/types.ts +11 -0
- package/src/extension/action-suggestions.ts +2 -1
- package/src/extension/async-notifier.ts +10 -0
- package/src/extension/help.ts +14 -0
- package/src/extension/registration/commands.ts +27 -0
- package/src/extension/team-tool/destructive-gate.ts +1 -1
- package/src/extension/team-tool/goal-wrap.ts +288 -0
- package/src/extension/team-tool/goal.ts +405 -0
- package/src/extension/team-tool/run.ts +103 -4
- package/src/extension/team-tool/workflow-manage.ts +194 -0
- package/src/extension/team-tool.ts +20 -0
- package/src/hooks/types.ts +3 -1
- package/src/runtime/async-runner.ts +27 -2
- package/src/runtime/background-runner.ts +68 -19
- package/src/runtime/child-pi.ts +9 -1
- package/src/runtime/completion-guard.ts +1 -1
- package/src/runtime/dynamic-workflow-context.ts +450 -0
- package/src/runtime/dynamic-workflow-runner.ts +180 -0
- package/src/runtime/global-worker-cap.ts +96 -0
- package/src/runtime/goal-evaluator.ts +294 -0
- package/src/runtime/goal-loop-runner.ts +612 -0
- package/src/runtime/goal-state-store.ts +209 -0
- package/src/runtime/iteration-hooks.ts +2 -1
- package/src/runtime/pi-args.ts +10 -2
- package/src/runtime/post-checks.ts +2 -1
- package/src/runtime/result-extractor.ts +32 -0
- package/src/runtime/team-runner.ts +11 -1
- package/src/runtime/verification-gates.ts +88 -5
- package/src/runtime/verification-integrity.ts +110 -0
- package/src/runtime/verification-worktree.ts +136 -0
- package/src/runtime/workspace-lock.ts +448 -0
- package/src/schema/config-schema.ts +26 -0
- package/src/schema/team-tool-schema.ts +39 -4
- package/src/state/atomic-write.ts +9 -0
- package/src/state/contracts.ts +14 -0
- package/src/state/crew-init.ts +18 -5
- package/src/state/event-log.ts +7 -1
- package/src/state/state-store.ts +2 -0
- package/src/state/types.ts +82 -0
- package/src/state/worker-atomic-writer.ts +190 -0
- package/src/utils/env-allowlist.ts +30 -0
- package/src/utils/redaction.ts +104 -24
- package/src/utils/safe-paths.ts +55 -14
- package/src/workflows/discover-workflows.ts +25 -1
- package/src/workflows/workflow-config.ts +13 -0
- package/src/worktree/cleanup.ts +2 -1
- package/src/worktree/worktree-manager.ts +4 -3
- package/teams/parallel-research.team.md +1 -1
- package/workflows/examples/hello.dwf.ts +24 -0
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,341 @@
 # Changelog
 
+## [v0.9.1] — Windows essentials fix + cross-platform CI green (2026-06-22)
+
+Patch release. No new features. Fixes a real Windows bug reported by a user,
+plus the cross-platform CI failures that followed.
+
+### fix(windows): `${APPDATA}` npm-global resolution failure (root cause)
+
+Reported symptom (Windows): running pi-crew created a phantom literal
+`${APPDATA}/npm/` directory in the project root (containing `node_modules`,
+`pi-crew`, `pi-crew.cmd`, `pi-crew.ps1`) and leaked a literal `${APPDATA}`
+line into `.gitignore`.
+
+Root cause: pi-crew's subprocess env sanitization used explicit allowlists
+that stripped **all Windows-essential env vars** (`APPDATA`, `LOCALAPPDATA`,
+`USERPROFILE`, `SystemRoot`, `ComSpec`, `TEMP`, `TMP`). When a child pi
+process (or npm inside it) tried to resolve the npm-global prefix on Windows,
+it used `%APPDATA%` (cmd expansion) / `${APPDATA}` (bash expansion), but
+`APPDATA` was missing from the env — so the shell left the literal
+`${APPDATA}` in place and operations created/ignored paths under that
+literal name.
+
+Fix: added the 7 Windows essentials to all 7 subprocess env allowlists
+(child-pi, async-runner, verification-gates, post-checks, iteration-hooks,
+worktree/cleanup, worktree/worktree-manager). (commit `a7ddc50`)
+
+### refactor(env): centralize Windows essentials + regression guard
+
+The same 7 vars were duplicated inline across 9 call sites — easy to forget
+on a new allowlist, with nothing preventing a future site from omitting them.
+
+- New single source of truth: `WINDOWS_ESSENTIAL_ENV_VARS` in
+  `src/utils/env-allowlist.ts` (with full root-cause documentation).
+- All 9 call sites now spread the constant instead of inlining (net −42/+19
+  lines, behavior unchanged).
+- New regression test `test/unit/env-allowlist.test.ts`: scans ALL
+  `src/**/*.ts` files and fails if any hardcodes the 7 vars inline (the only
+  allowed location is the constant file). This catches any new allowlist that
+  forgets the constant — the exact regression that caused the bug.
+  (commit `6a0284c`)
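
The centralization described in this entry can be sketched roughly as follows. Only the constant name `WINDOWS_ESSENTIAL_ENV_VARS` and its seven entries come from the changelog; `CHILD_ENV_ALLOWLIST`, its POSIX entries, `buildChildEnv`, and the sample environment are hypothetical names chosen for illustration, and pi-crew's real code may look different.

```typescript
// Sketch only: the constant name and its seven entries come from the changelog;
// everything else in this block is a hypothetical illustration.
const WINDOWS_ESSENTIAL_ENV_VARS = [
  "APPDATA", "LOCALAPPDATA", "USERPROFILE", "SystemRoot", "ComSpec", "TEMP", "TMP",
] as const;

// Hypothetical call-site allowlist that spreads the shared constant
// instead of repeating the seven names inline.
const CHILD_ENV_ALLOWLIST: readonly string[] = ["PATH", "HOME", ...WINDOWS_ESSENTIAL_ENV_VARS];

type Env = Record<string, string | undefined>;

// Copy only allowlisted variables that are actually set in the parent env.
function buildChildEnv(parent: Env, allowlist: readonly string[]): Record<string, string> {
  const child: Record<string, string> = {};
  for (const key of allowlist) {
    const value = parent[key];
    if (value !== undefined) child[key] = value;
  }
  return child;
}

const childEnv = buildChildEnv(
  { PATH: "/usr/bin", APPDATA: "C:\\Users\\me\\AppData\\Roaming", SOME_API_KEY: "secret" },
  CHILD_ENV_ALLOWLIST,
);
console.log(Object.keys(childEnv)); // PATH and APPDATA survive; SOME_API_KEY does not
```

With the constant spread into every allowlist, a child process that later expands `%APPDATA%` finds a real value instead of an unset variable.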
+
+### fix(ci): cross-platform CI green (ubuntu + macOS + Windows)
+
+Five distinct containment, path, and timing bugs that only surfaced on non-ubuntu CI:
+
+1. **Windows 8.3 short-name paths** — `resolveWindowsCanonical()` used
+   non-native `realpathSync`, preserving the `RUNNER~1` vs `runneradmin`
+   form mismatch. A legitimately-contained dynamic workflow file was
+   rejected as "outside the allowed directories". Fixed by using
+   `realpathSync.native` (canonical long-name form) as the primary resolver.
+   (commit `e9e7137`)
+
+2. **ESM `file://` URLs** — two integration tests passed raw Windows paths
+   (`D:\…`) to native `import()`, which Node rejects on Windows
+   (`ERR_UNSUPPORTED_ESM_URL_SCHEME: protocol 'd:'`). Wrapped with
+   `pathToFileURL(…).href`. (commit `e9e7137`)
+
+3. **macOS symlink-ancestor** — `isSymlinkSafePath()` walked up the temp
+   path and hit `/var` (a symlink → `/private/var`). The old check compared
+   the resolved `/private/var` against the tmpdir
+   `/private/var/folders/…/T` — `/private/var` is an **ancestor**, not a
+   descendant, so it was wrongly rejected (5 macOS worker-atomic-writer
+   failures). Fixed by accepting a symlink whose target is a safe root, is
+   UNDER a safe root, OR is an ANCESTOR of a safe root. Added two behavioral
+   regression tests (symlink-ancestor accept + symlink-attack reject).
+   (commit `e9e7137`)
+
+4. **macOS `/var` containment** — `resolveContainedPath()` only
+   canonicalized paths on win32; on POSIX it compared raw paths, so base
+   (`/private/var`) vs target (`/var`) diverged → false "outside" rejection
+   (macOS dwf-setresult failure). Added platform-agnostic
+   `resolveCanonicalPath()`. Added a darwin-only regression test for the
+   real `/var` divergence. (commit `4821bb1`)
+
+5. **Windows wakeup timing** — `subagent-manager` polls the child run
+   manifest every 1000ms. On the slower Windows CI runner, child-process
+   spawn + first poll exceeded the test's 10s deadline (failed at 11.6s).
+   Bumped the mock-test deadline to 30s. (commit `4821bb1`)
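
The acceptance rule from item 3 (target is a safe root, under a safe root, or an ancestor of a safe root) can be illustrated with a small path-only sketch. The function names `isWithin` and `isAcceptableSymlinkTarget` and the sample paths are assumptions made for this example; the real `isSymlinkSafePath()` is not shown in the changelog and presumably does more, such as resolving symlinks on disk.

```typescript
import * as path from "node:path";

// True when `child` equals `parent` or lies beneath it (POSIX path semantics).
function isWithin(parent: string, child: string): boolean {
  const rel = path.posix.relative(parent, child);
  return rel === "" || (rel !== ".." && !rel.startsWith("../") && !path.posix.isAbsolute(rel));
}

// Hypothetical rule: accept a resolved symlink target if it is a safe root,
// sits under a safe root, or is an ancestor of a safe root.
function isAcceptableSymlinkTarget(target: string, safeRoots: readonly string[]): boolean {
  return safeRoots.some((root) => isWithin(root, target) || isWithin(target, root));
}

const safeRoots = ["/private/var/folders/ab/T"];

// `/var` resolves to `/private/var`, an ancestor of the temp dir: accepted.
const ancestorOk = isAcceptableSymlinkTarget("/private/var", safeRoots);
// A symlink pointing at an unrelated tree is still rejected.
const attackRejected = !isAcceptableSymlinkTarget("/etc", safeRoots);

console.log({ ancestorOk, attackRejected });
```

In this sketch the ancestor branch also lets a target of `/` through, so a real check would presumably pair it with the containment test on the final resolved path.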
+
+### Verification
+
+- tsc: 0
+- Full test suite: 5207 tests, 0 fail on **all three** platforms (ubuntu,
+  macOS, Windows)
+- CI run `27955398241`: success across ubuntu-latest, macos-latest,
+  windows-latest
+- Regression tests added: env-allowlist scan, worker-atomic-writer symlink
+  ancestor/attack, safe-paths darwin `/var` divergence
+
+### Breaking changes
+
+None. All fixes are additive or behavior-preserving. Windows users who hit
+the `${APPDATA}` bug should upgrade.
+
+---
+
+## [v0.9.0] — goal loops + dynamic workflows (2026-06-18)
+
+Two new features, both built on a shared `runKind` background-dispatch discriminator.
+
+### Phase 1.5 #4: TDZ fix — dynamic-workflow runs end-to-end via full pi pipeline (RFC 17 fix)
+
+Live `team action='run' workflow='<dynamic>'` was failing with
+`Dynamic workflow 'X' must export a default async function(ctx).` even
+though the .dwf.ts loaded correctly via direct jiti. Root cause was NOT
+in `dynamic-workflow-runner.ts` — it was a Temporal Dead Zone race in
+`team-tool/run.ts` when loaded via the full pi extension pipeline
+(`index.ts → register.ts → registration/team-tool.ts → team-tool.ts →
+run.ts`).
+
+**Race details**: jiti loads each .ts file inside an `async function
+_module(...)` wrapper. Static `import { X } from "..."` statements
+become `var _x = require(...)` calls. When a destructured `import` is
+referenced inside a hoisted function before its `let` declaration line
+runs, the reference hits TDZ.
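
The language rule behind this race can be reproduced in isolation. The sketch below is not pi-crew code: `readLatch`, `latch`, and the helper names are hypothetical, and the snippet is evaluated as plain JavaScript source so that a compiler targeting an older ECMAScript version cannot rewrite `let` into `var` and hide the effect.

```typescript
// Evaluate plain JavaScript source so `let` keeps its native semantics.
function run(source: string): unknown {
  return new Function(source)();
}

// A hoisted function reads `latch` before the declaration line has executed.
const body = (keyword: "let" | "var"): string => `
  function readLatch() { return latch; }
  let outcome;
  try { outcome = String(readLatch()); } catch (err) { outcome = err.name; }
  ${keyword} latch = "ready";
  return outcome;
`;

const withLet = run(body("let")); // the read lands in the temporal dead zone
const withVar = run(body("var")); // `var` is hoisted and starts as undefined

console.log(withLet, withVar); // ReferenceError undefined
```

This is why switching a latch from `let` to `var`, or deferring the lookup until call time, removes the failure: the binding already exists when the hoisted function runs.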
+
+**Fixes**:
+- `src/extension/team-tool/run.ts`:
+  - `crewInitPromise`: `let` → `var` (avoids TDZ)
+  - `expandParallelResearchWorkflow`, `validateWorkflowForTeam`,
+    `normalizeSkillOverride`: convert to lazy dynamic imports at call site
+- `src/state/crew-init.ts`:
+  - `CREW_README`: `const` → `function buildCrewReadme(): string` (function
+    declarations are fully hoisted)
+  - `updateGitignore`: convert usage to lazy dynamic import at call site
+
+**New test**: `test/integration/run-via-full-pipeline.test.ts` loads
+`index.ts` via `jiti.import()` the way pi does, invokes `handleRun` with
+dynamic-workflow params, and asserts no TDZ / ReferenceError is thrown.
+Fails without the fix, passes with it.
+
+**Verification**:
+- 108 unit tests pass (goal, dwf, redaction, verification, worker-writer)
+- New integration test passes
+- Direct simulation of pi pipeline → `Dynamic workflow 'demo-hello'
+  completed` (was: `failed: must export a default async function`)
+
+Closes RFC 17 §4 round-trip / investigated residual. See
+`research-findings/goal-workflow/17-PHASE1.5-CRASH-INVESTIGATION-RFC.md`
+for the full 8-attempt investigation log (gdb, strace, V8 report, sync
+workarounds, worker-thread atomic writer, auto-downgrade — none
+identified the real bug because they all skipped the full pi load path).
+
+### Phase 1.5 #3: V8 diagnostic report infrastructure + crash investigation closed
+
+`PI_CREW_BG_REPORT_ON_FATAL=1` makes the background goal-loop runner spawn
+with `--report-on-fatalerror --report-compact`. When V8 considers the
+process state fatal, it writes a diagnostic report (native stack, JS stack,
+libuv handles, environment) — crucial info that application-level signal
+handlers cannot capture.
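
A minimal sketch of how such a toggle might translate into Node.js flags is shown below. The environment variable and the two flags are taken from the entry above; the helper `reportFlags` and the script name `background-runner.js` are hypothetical placeholders, not pi-crew's actual spawn code.

```typescript
// Hypothetical helper: derive extra Node.js flags for the background runner.
function reportFlags(env: Record<string, string | undefined>): string[] {
  return env.PI_CREW_BG_REPORT_ON_FATAL === "1"
    ? ["--report-on-fatalerror", "--report-compact"]
    : [];
}

// The resulting array would be passed as the argument list of the child process;
// the script name here is a placeholder.
const nodeArgs = [...reportFlags({ PI_CREW_BG_REPORT_ON_FATAL: "1" }), "background-runner.js"];

console.log(nodeArgs.join(" "));
```

Keeping the flags behind an explicit opt-in means the default spawn arguments stay unchanged unless the variable is set to `1`.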
+
+**Investigation result** (RFC 17, 5+ reproduction attempts): the multi-step
+goal-wrap crash does NOT produce a V8 report. Combined with prior findings
+(no signal via strace, no coredump, no OOM, parent process alive) the crash
+signature is consistent with a V8/libuv internal abort that bypasses all
+Node.js reporting mechanisms. Cannot be diagnosed or fixed at the
+application level.
+
+**Decision**: ship with the existing auto-downgrade safety net
+(commit `6de5270`). Multi-step workflows silently run as normal team-runs
+(no goal-wrap layer); single-step workflows (implementation) continue to
+goal-wrap end-to-end. The `--report-on-fatalerror` infrastructure remains
+in place for future investigation if a Node.js version change alters the
+crash behavior.
+
+See `research-findings/goal-workflow/17-PHASE1.5-CRASH-INVESTIGATION-RFC.md`
+for the full 8-attempt investigation log and re-opening criteria.
+
+### Phase 1.5 #2: git-worktree verification sandbox (closes round-trip + invoked-script tamper)
+
+`PI_CREW_VERIFICATION_WORKTREE=1` runs verification commands in a pristine
+`git worktree` at the T_snap commit SHA. Closes the two remaining Med-High
+tampering residuals from RFC 13 §6 that P1a manifest hashing cannot detect:
+
+- **Round-trip manifest tamper (MAJ#2)**: worker edits `package.json`, runs
+  `npm test` (passes), reverts before T_verify_done → hash matches →
+  tamper undetected. With worktree: verification runs at original SHA →
+  worker edits invisible → tamper BLOCKED.
+- **Invoked-script tampering**: worker rewrites a script the verification
+  command invokes; only MANIFEST_FILES are hashed → invisible. With
+  worktree: script is at original SHA → tamper BLOCKED.
+
+Graceful fallback when ANY precondition fails (logged via
+logInternalError "goal-loop.worktreeSandboxBypassed"): opt-out env,
+not-a-git-repo, dirty index, git unavailable. NEVER blocks the goal loop.
+
+Implementation:
+- `src/runtime/verification-worktree.ts` (NEW, pure leaf module):
+  `isWorktreeSandboxEnabled`, `checkWorktreeSandboxAvailable`,
+  `prepareVerificationWorktree` (git worktree add --detach),
+  `withVerificationWorktree` (RAII cleanup, idempotent, finally-safe).
+- `src/runtime/verification-gates.ts`: `executeVerificationCommands`
+  accepts optional `worktreeCwd` — spawns commands with that cwd.
+- `src/runtime/goal-loop-runner.ts`: verification call site prepares
+  worktree at T_snap SHA when available; finally block always cleans up.
+- `src/runtime/async-runner.ts`: PI_CREW_VERIFICATION_WORKTREE env
+  inherited by bg-runner.
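
The "RAII cleanup, idempotent, finally-safe" property named in the implementation list can be sketched without touching git. Everything below is hypothetical: `Sandbox`, `withSandbox`, and the sample directory are stand-ins, the code is synchronous for brevity, and in the real module the prepare step would create the worktree (the changelog mentions `git worktree add --detach`) while the cleanup step would remove it.

```typescript
// Hypothetical shape: a prepared sandbox directory plus its cleanup routine.
interface Sandbox {
  dir: string;
  cleanup: () => void;
}

// Run `fn` against the sandbox and always clean up, at most once.
function withSandbox<T>(prepare: () => Sandbox, fn: (dir: string) => T): T {
  const sandbox = prepare();
  let cleaned = false;
  const cleanupOnce = (): void => {
    if (cleaned) return; // idempotent: repeated calls are no-ops
    cleaned = true;
    sandbox.cleanup();
  };
  try {
    return fn(sandbox.dir);
  } finally {
    cleanupOnce(); // runs on success and on exception alike
  }
}

// Stand-in prepare step that only counts cleanups instead of calling git.
let cleanups = 0;
const prepare = (): Sandbox => ({ dir: "/tmp/verify-sandbox", cleanup: () => { cleanups += 1; } });

const ok = withSandbox(prepare, (dir) => `ran in ${dir}`);

let threw = false;
try {
  withSandbox(prepare, () => { throw new Error("verification failed"); });
} catch {
  threw = true;
}

console.log(ok, threw, cleanups);
```

Putting the cleanup in `finally` is what keeps a failing verification command from leaving a stale sandbox behind.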
+
+Tests: 12 new unit tests in `test/unit/verification-worktree.test.ts`
+(flag opt-in, not-a-repo fallback, dirty-index fallback, clean-repo success,
+pristine-checkout property = the security guarantee, RAII cleanup on success
++ on exception, idempotent cleanup). All pass.
+5200 unit + 115 integration tests; no regression; tsc clean.
+
+RFC: `research-findings/goal-workflow/16-PHASE1.5-WORKTREE-SANDBOX-RFC.md`
+
+### Phase 1.5 #1: sanitized-env verification (opt-in info-disclosure mitigation)
+
+`PI_CREW_VERIFICATION_SANITIZE_ENV=1` strips model-provider secrets (and
+everything else not in the essential-vars allowlist) from the env passed to
+verification commands (`npm test`, `pytest`, etc.). Closes the info-disclosure
+residual at the SOURCE — P1f redaction at artifact-write + judge-bound is
+regex-best-effort against adversarial workers; this never gives the
+verification process the secret in the first place.
+
+Escape hatch: `PI_CREW_VERIFICATION_PRESERVE_ENV=KEY1,KEY2,...` lets users
+explicitly opt specific env vars back in (audited via the env-filter.ts
+allowlist validator). Essential non-secret vars (PATH, HOME, USER, SHELL,
+LANG, XDG_*, NPM_CONFIG_*, etc.) are always preserved.
+
+AllowList: 25 essential vars. NO model-provider keys by default.
+Inherited by bg-runner via async-runner.ts env allowlist.
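
As a rough illustration of the filter plus escape hatch, the sketch below keeps exact-name essentials, prefix families such as `XDG_*`, and anything listed in `PI_CREW_VERIFICATION_PRESERVE_ENV`. The essentials shown are only the subset named in this entry (the real allowlist is said to hold 25 entries), the function names and sample variables are hypothetical, and the audit step through `env-filter.ts` is deliberately left out.

```typescript
type Env = Record<string, string | undefined>;

// Illustrative subset only; the changelog says the real allowlist has 25 entries.
const ESSENTIAL_EXACT = ["PATH", "HOME", "USER", "SHELL", "LANG"];
const ESSENTIAL_PREFIXES = ["XDG_", "NPM_CONFIG_"];

// Parse "KEY1,KEY2" into a clean list of variable names.
function parsePreserveList(raw: string | undefined): string[] {
  return (raw ?? "").split(",").map((k) => k.trim()).filter((k) => k.length > 0);
}

// Hypothetical filter: keep essentials, prefix families, and explicitly preserved keys.
function sanitizeVerificationEnv(env: Env): Record<string, string> {
  const keep = new Set([
    ...ESSENTIAL_EXACT,
    ...parsePreserveList(env.PI_CREW_VERIFICATION_PRESERVE_ENV),
  ]);
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined) continue;
    if (keep.has(key) || ESSENTIAL_PREFIXES.some((p) => key.startsWith(p))) out[key] = value;
  }
  return out;
}

const sanitized = sanitizeVerificationEnv({
  PATH: "/usr/bin",
  XDG_CACHE_HOME: "/home/me/.cache",
  PROVIDER_API_KEY: "secret",
  CI_TOKEN: "needed-by-tests",
  PI_CREW_VERIFICATION_PRESERVE_ENV: "CI_TOKEN",
});

console.log(Object.keys(sanitized).sort());
```

Filtering at spawn time means the verification command never receives the secret, so nothing downstream has to redact it.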
+
+Tests: 7 new unit tests in test/unit/verification-env-sanitize.test.ts
+(3 flag checks + 4 integration tests spawning real `printenv` subprocesses).
+All pass. 5188 unit + 115 integration tests; no regression.
+
+### SAFETY: goal-wrap auto-downgrades multi-step workflows (no hidden crashes)
+
+Multi-step workflows (default: 4 steps, fast-fix: 3 steps) crash
+non-deterministically when run as goal-wrap worker turns in the background
+goal-loop process — V8/libuv race during event-loop yields in team-runner
+batch transition (see commit a9f6e09, RFC 15). Sync fs workarounds regress;
+worker-thread isolation doesn't help.
+
+When a user has goal-wrap enabled in config but the workflow is multi-step,
+the team-run handler now **auto-downgrades**: skips the goal-wrap layer and
+runs the workflow via the normal team-run path (foreground `executeTeamRun`
+or background `spawnBackgroundTeamRun`, depending on `async`). The user gets
+the run they asked for — no error, no hang, no need to remove config.
+
+The bypass reason is logged via `logInternalError("team-tool.run.goalWrapBypassed", ...)`
+for traceability (findable in debug logs / `internal-error.json`).
+
+Single-step workflows (e.g. `implementation`, only the adaptive `assess`
+step) continue to be goal-wrapped end-to-end.
+
+Implementation:
+- `shouldGoalWrap(cwd, workflow)` — pure decision function returning
+  `{enabled: true}` or `{enabled: false, reason, message}`. Reasons:
+  `config-off` (not enabled), `invalid-config` (malformed), `multi-step`
+  (more than `GOAL_WRAP_MAX_STEPS = 1` step).
+- `run.ts` calls `shouldGoalWrap` after `isGoalWrapEnabled`; if disabled,
+  falls through to normal team-run path. The original `isGoalWrapEnabled`
+  fast path (config check only) is kept as a cheap pre-filter.
+- 5 new unit tests in `test/unit/goal-wrap.test.ts` cover all 4 decisions
+  (config-off / invalid-config / multi-step refuse / single-step accept)
+  + the GOAL_WRAP_MAX_STEPS value invariant.
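
The four decisions listed above can be sketched as a pure function. The return shape, the reason strings, and `GOAL_WRAP_MAX_STEPS = 1` come from this entry. Everything else is an assumption: the real `shouldGoalWrap` takes `(cwd, workflow)` and reads the config itself, whereas this hypothetical `shouldGoalWrapSketch` receives an already-loaded toggle, and treating that toggle as a plain boolean is a guess about the config format.

```typescript
const GOAL_WRAP_MAX_STEPS = 1;

type GoalWrapDecision =
  | { enabled: true }
  | { enabled: false; reason: "config-off" | "invalid-config" | "multi-step"; message: string };

// Hypothetical input shape; only the step count matters for the decision.
interface WorkflowLike {
  name: string;
  steps: readonly unknown[];
}

// Assumption: the per-workflow toggle is a boolean once loaded from config.
function shouldGoalWrapSketch(goalWrapConfig: unknown, workflow: WorkflowLike): GoalWrapDecision {
  if (goalWrapConfig === undefined || goalWrapConfig === false) {
    return { enabled: false, reason: "config-off", message: "goal-wrap is not enabled" };
  }
  if (goalWrapConfig !== true) {
    return { enabled: false, reason: "invalid-config", message: "goal-wrap config is malformed" };
  }
  if (workflow.steps.length > GOAL_WRAP_MAX_STEPS) {
    const message = `${workflow.name} has ${workflow.steps.length} steps; running as a normal team-run`;
    return { enabled: false, reason: "multi-step", message };
  }
  return { enabled: true };
}

const off = shouldGoalWrapSketch(undefined, { name: "default", steps: [1, 2, 3, 4] });
const bad = shouldGoalWrapSketch("yes", { name: "default", steps: [1, 2, 3, 4] });
const multi = shouldGoalWrapSketch(true, { name: "fast-fix", steps: [1, 2, 3] });
const single = shouldGoalWrapSketch(true, { name: "implementation", steps: ["assess"] });

console.log(off, bad, multi, single);
```

Returning a reason alongside `enabled: false` is what lets the caller log why the wrap was bypassed before falling through to the normal run path.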
+
+### Phase 1.5: worker-thread atomic writer (opt-in, infrastructure)
+
+`PI_CREW_WORKER_ATOMIC_WRITER=1` routes `atomicWriteFileAsync` and
+`appendEventAsync` through a dedicated worker thread that performs SYNC fs
+ops with no internal yields. Implementation: `src/state/worker-atomic-writer.ts`.
+9 unit tests; 5169 existing tests pass; no regression.
+
+**Test result**: worker writer does NOT fix the multi-step crash (verified
+end-to-end with `default` workflow). The crash is NOT in fs writes — worker
+writes complete successfully but the process still dies during batch
+transition. Root cause is some other async operation yielding the main
+event loop. See `research-findings/goal-workflow/15-PHASE1.5-WORKER-WRITER-RFC.md`
+for full investigation notes.
+
+The worker writer is kept as **infrastructure** — opt-in, well-tested, no
+regression. It may help with future variants or concurrent-write contention.
+
+### Resolution: multi-step goal-wrap crash (3/3 tasks now complete end-to-end)
+
+The silent crash at `atomicWriteFileAsync` of the inner turn's `manifest.json`
+(size=7417) — which caused `team action='run' workflow='fast-fix'` (and other
+multi-step builtins) to hang at "1/3" forever — is **resolved** as a side
+effect of commit `d52cb81` ("fix(goal-wrap): persist async.pid on OUTER
+goal-loop manifest"). The extra `atomicWriteJson(manifestPath, asyncGoalManifest)`
+call in `startGoalWrappedRun` after `spawnBackgroundTeamRun` shifts timing
+enough to avoid the underlying race condition.
+
+Verified end-to-end with 3 consecutive runs of goal-wrapped fast-fix
+(`fix test.js so npm test passes`): all completed 3/3 tasks in ~120s with
+`npm test` PASS. The original deep-dive investigation (commit `a9f6e09`) is
+preserved as a reference; the proximate crash trigger is a Node.js / V8 /
+filesystem-level race that is not reliably reproducible in either direction.
+
+The user-facing symptom (must kill pi to recover from 1/3 hang) is also
+resolved: even if a future regression reintroduces the crash, async-notifier
+will detect the dead background-runner within ~30s and emit `async.died` —
+the user sees "Goal failed: Background runner died unexpectedly" instead of
+an infinite "running" state.
+
+
+
+### `goal` — autonomous goal loop (P0a + P0 + P1)
+
+- `team action='goal' config.subAction='start|status|pause|resume|stop|step|clear'`.
+- A worker does a turn (`executeTeamRun`), then a separate LLM judge (synthesized
+  `goal-judge` AgentConfig with `disableTools:true` → Pi `--no-tools`) evaluates the
+  transcript + evidence and returns `{achieved, reason, evidenceRefs}`. On
+  not-achieved, the `reason` is composed into the next turn's `manifest.goal`.
+- One manifest PER turn (status-transition invariants block reuse). Budget via
+  `collectRunMetrics`. `GoalLoopState` persisted at `<crewRoot>/state/goals/<goalId>.json`.
+- Slash command `/team-goal`. Hooks: `before_goal_step`, `before_goal_abort`.
+- Spec-driven: `research-findings/goal-workflow/00-SPEC.md` + `07-PLAN.md` v3.
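
The worker/judge cycle in the list above can be sketched as a small loop. Only the verdict shape `{achieved, reason, evidenceRefs}` and the idea of feeding `reason` into the next turn's goal come from the changelog. The names `runGoalLoopSketch`, `WorkerTurn`, and `JudgeFn`, the feedback wording, and the synchronous callbacks are assumptions for illustration; budget checks, pause/resume, and per-turn manifests are omitted.

```typescript
interface JudgeVerdict {
  achieved: boolean;
  reason: string;
  evidenceRefs: string[];
}

// Hypothetical callback types; real turns would be asynchronous subagent runs.
type WorkerTurn = (goal: string, turn: number) => string; // returns a transcript
type JudgeFn = (objective: string, transcript: string) => JudgeVerdict;

function runGoalLoopSketch(objective: string, worker: WorkerTurn, judge: JudgeFn, maxTurns: number) {
  let goal = objective;
  for (let turn = 1; turn <= maxTurns; turn++) {
    const transcript = worker(goal, turn);
    const verdict = judge(objective, transcript);
    if (verdict.achieved) return { status: "achieved" as const, turns: turn };
    // Compose the judge's reason into the goal text for the next turn.
    goal = `${objective}\n\nPrevious attempt was not accepted: ${verdict.reason}`;
  }
  return { status: "maxTurns" as const, turns: maxTurns };
}

// Stub worker and judge: the first turn fails, the second succeeds.
const goalsSeen: string[] = [];
const worker: WorkerTurn = (goal, turn) => {
  goalsSeen.push(goal);
  return turn === 1 ? "npm test: 1 failing" : "npm test: all passing";
};
const judge: JudgeFn = (_objective, transcript) =>
  transcript.includes("all passing")
    ? { achieved: true, reason: "tests pass", evidenceRefs: ["transcript"] }
    : { achieved: false, reason: "tests still fail", evidenceRefs: ["transcript"] };

const outcome = runGoalLoopSketch("make npm test pass", worker, judge, 5);
console.log(outcome, goalsSeen.length);
```

Keeping the judge separate from the worker means the component that decides "done" never has the tools to change the evidence it is grading.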
+
+### `workflow` — dynamic workflow scripts (P2 + P3)
+
+- `.dwf.ts` scripts orchestrate subagents via `ctx.agent()` / `ctx.fanOut()` with
+  JS loops/branch/cross-review; only `ctx.setResult()` reaches the main context.
+- Full `WorkflowCtx`: `agent`, `fanOut`, `review`, `retry`, `mail`, `gatherReplies`,
+  `renderTemplate`, `vars`, `setResult`.
+- `team action='workflow-{create,get,list,save,delete}'`. `workflow-create`/`-delete`
+  ACE-gated via `destructive-gate.ts` (`confirm:true`, user-initiated only, path-
+  allowlisted via `resolveRealContainedPath`, content-validated).
+- Capability-locked `WorkflowCtx` (Object.freeze + vm.runInNewContext);
+  `isolated-vm` deferred to v1.5.
+- Slash command `/workflows`. Example: `workflows/examples/hello.dwf.ts`.
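
The "Object.freeze + vm.runInNewContext" combination mentioned in the list can be shown with a heavily reduced context. This is a sketch under assumptions, not the real `WorkflowCtx`: only a `setResult` method is exposed, its signature is a guess, and the script text is invented for the example.

```typescript
import * as vm from "node:vm";

let result: unknown;

// Hypothetical, heavily reduced context: only `setResult` is exposed.
const ctx = Object.freeze({
  setResult(value: unknown): void {
    result = value;
  },
});

// The script can only reach what the sandbox object provides.
const script = `
  ctx.setResult(1 + 1);
  ctx.extra = "ignored";   // silently dropped: ctx is frozen and the script is not strict
  typeof process;          // host globals such as process are not injected
`;

// runInNewContext returns the value of the last statement in the script.
const lastValue = vm.runInNewContext(script, { ctx });

console.log(result, "extra" in ctx, lastValue);
```

Node's documentation states that `node:vm` is not a security mechanism, so this kind of lock limits accidental reach and should not be read as isolation from untrusted code.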
+
+### Shared infra (P0a)
+
+- `manifest.runKind?: 'team-run' | 'goal-loop' | 'dynamic-workflow'` discriminator;
+  background-runner.ts dispatches to `executeTeamRun` / `runGoalLoop` /
+  `runDynamicWorkflow`. Default `'team-run'` (backward-compatible).
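
The discriminator dispatch can be sketched as a lookup keyed by `runKind`. The three kinds, the three runner names, and the `'team-run'` default come from this entry; `ManifestLike`, `runners`, and `dispatch` are hypothetical, and the runners are replaced by stubs that only return their own name.

```typescript
type RunKind = "team-run" | "goal-loop" | "dynamic-workflow";

// Hypothetical minimal manifest: only the optional discriminator is modelled.
interface ManifestLike {
  runKind?: RunKind;
}

// Stand-ins for the real runners named in the changelog.
const runners: Record<RunKind, () => string> = {
  "team-run": () => "executeTeamRun",
  "goal-loop": () => "runGoalLoop",
  "dynamic-workflow": () => "runDynamicWorkflow",
};

function dispatch(manifest: ManifestLike): string {
  // Manifests written before the field existed have no runKind.
  const kind = manifest.runKind ?? "team-run";
  return runners[kind]();
}

console.log(dispatch({}), dispatch({ runKind: "goal-loop" }));
```

Typing the table as `Record<RunKind, …>` makes the compiler flag a missing runner whenever a new kind is added to the union.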
+
+### Other
+
+- `AgentConfig.disableTools?: boolean` — pushes Pi `--no-tools` (capability-locked agents).
+- `TEAM_EVENT_TYPES` += `goal.*` + `dwf.*` namespaces.
+- New agent-config field, new event types, new hooks — all additive, no breaking changes.
+
 ## [0.8.12] — `team action=cleanup` now reverses `init` (Issue #35) (2026-06-17)
 
 `team action=cleanup` gained a **project-level mode** that reverses what
@@ -2653,3 +2989,33 @@ user's project-instructions file was out-of-scope and unnecessary.
 
 +4 regression tests (init does NOT create/modify AGENTS.md; API fields removed).
 typecheck clean; full suite 2972/0.
+
+## [Unreleased] — dead-dep cleanup + non-blocking fallow CI (2026-06-18)
+
+Spotted by running `fallow` (deterministic Rust codebase intelligence) against
+the repo. Two genuine wins, plus an informational CI job that never blocks.
+
+### Removed (dead dependencies, verified unused)
+- **`typebox`** (`package.json:89`) — dead duplicate of `@sinclair/typebox`
+  (which 10 source files actually import). `typebox` (plain) had **zero**
+  imports anywhere in `src/`.
+- **`acorn`** (`package.json:84`) — **zero** runtime references in `src/`,
+  `scripts/`, or `*.mjs`. Verified the only other package referencing it
+  (`jiti`) lists it under its own `devDependencies` (for jiti's own tests), so
+  it is not a runtime transitive need. `npm ls acorn` confirmed `pi-crew` was
+  its sole parent.
+
+Both removals verified: typecheck clean, full suite 2965/0.
+
+### CI: added `fallow-audit` job (non-blocking)
+- New job in `.github/workflows/ci.yml`: ubuntu-only, `continue-on-error: true`
+  so it **never fails the build**.
+- Runs `fallow audit` (changed-code diff vs base ref) in JSON + human summary,
+  uploads `fallow-audit-report` artifact (14-day retention).
+- Surfaced findings (dead code, circular deps, duplication, complexity
+  hotspots, dependency hygiene) are for human/agent review, NOT a merge gate.
+- Rationale for non-blocking: fallow has high out-of-the-box noise (254 clone
+  families, 379 hotspots) + a false positive on the tsx/jiti path-loading
+  pattern (`jiti` flagged unused but is used via runtime path-loading). A
+  blocking gate would create an unpaid maintenance backlog unsuitable for a
+  solo-maintained extension.
package/README.md
CHANGED
@@ -1,5 +1,35 @@
 # pi-crew
 
+> ## ⚠️ IMPORTANT — Read before using
+>
+> **pi-crew is a sub-agent orchestration layer that was developed almost entirely
+> by AI, for the author's own workflow.** It is **not** a hardened, audited
+> product. Here's the honest framing:
+>
+> - **AI-generated code, limited human review.** The vast majority of pi-crew
+>   was written and iterated on by autonomous AI agents. While every change
+>   goes through static review + runtime tests, I (the author) have not
+>   line-by-line verified everything. There will be bugs, edge cases, and
+>   behaviors I haven't anticipated.
+> - **It can spawn processes, run shell commands, and write files on your
+>   behalf.** Dynamic workflows (`.dwf.ts`) and goal loops run with the same
+>   privileges as your Pi session — treat any `.dwf.ts` like `node script.js`
+>   you downloaded from the internet.
+> - **Built for *my* needs, not yours.** This scratches a personal itch. It
+>   likely won't fit every workflow, team setup, or risk tolerance — and
+>   that's fine.
+>
+> **If that sounds too risky, don't use it** — no hard feelings.
+>
+> **If you still want to use it**, the safest path is to **fork it, read the
+> parts you'll touch, and adapt it to your own setup.** If you find a bug,
+> a footgun, or a sharp edge, please open an issue or send a note — your
+> feedback is genuinely appreciated. Thanks. ✌️
+>
+> See also: [SECURITY-ISSUES.md](SECURITY-ISSUES.md),
+> [docs/dynamic-workflows.md](docs/dynamic-workflows.md#security-model-important)
+> (trust model), and the [Known limitations](#known-limitations) section below.
|
|
32
|
+
|
|
3
33
|
**Coordinate AI agent teams inside [Pi](https://github.com/nicekate/pi-coding-agent).**
|
|
4
34
|
|
|
5
35
|
pi-crew is a Pi extension that orchestrates autonomous multi-agent workflows — research, implementation, review, testing, and more — with durable state, parallel execution, worktree isolation, and safe defaults.
|
|
@@ -9,13 +39,52 @@ npm: pi-crew
|
|
|
9
39
|
repo: https://github.com/baphuongna/pi-crew
|
|
10
40
|
```
|
|
11
41
|
|
|
12
|
-
**v0.
|
|
42
|
+
**v0.9.0**: See [CHANGELOG.md](CHANGELOG.md).
|
|
13
43
|
|
|
14
|
-
### Highlights (v0.6.4 → v0.
|
|
44
|
+
### Highlights (v0.6.4 → v0.9.0)
|
|
15
45
|
|
|
16
46
|
A long arc of **trust, cliff-resilience, and robustness** work. Principle: *build
|
|
17
47
|
trust and cliff-resilience, stay lean, delete before adding.*
|
|
18
48
|
|
|
49
|
+
#### v0.9.0 — goal loops + dynamic workflows (2026-06-18)
|
|
50
|
+
Two new features, both modeled on Claude Code, built on a shared `runKind`
|
|
51
|
+
background-dispatch discriminator.
|
|
52
|
+
|
|
53
|
+
- **🎯 Autonomous goal loops** — `team action='goal'` runs a self-directed
|
|
54
|
+
multi-turn loop: a **worker** does a turn, a separate **LLM judge**
|
|
55
|
+
(capability-locked, no tools) evaluates the transcript + verification against
|
|
56
|
+
the objective, and on "not-achieved" the reason is fed into the next turn's
|
|
57
|
+
prompt. Stops on `achieved` / `maxTurns` / budget / `BLOCKED:` / user `stop`.
|
|
58
|
+
See [docs/goals.md](docs/goals.md).
|
|
59
|
+
- **📜 Dynamic workflows (`.dwf.ts`)** — author orchestration as a TypeScript
|
|
60
|
+
script (JS loops/branch/cross-review) instead of a static step list. Runs in
|
|
61
|
+
the background, spawns subagents via `ctx.agent()`/`ctx.fanOut()`, holds
|
|
62
|
+
intermediate results in JS variables, and only `ctx.setResult()` reaches the
|
|
63
|
+
main context. `workflow-create`/`-delete` are ACE-gated (`confirm:true`,
|
|
64
|
+
user-confirmed). See [docs/dynamic-workflows.md](docs/dynamic-workflows.md).
|
|
65
|
+
- **🛡️ Goal-wrap** (RFC v0.5 vision) — apply the goal completion-guarantee to
|
|
66
|
+
existing builtin workflows (`implementation`, `fast-fix`, `default`) via
|
|
67
|
+
per-workflow `.crew/config.json` toggle. Single-step workflows goal-wrap
|
|
68
|
+
end-to-end; multi-step workflows auto-downgrade to a normal team-run because
|
|
69
|
+
they can crash non-deterministically in a V8/libuv event-loop race (see [Known
|
|
70
|
+
limitations](#known-limitations)).
|
|
71
|
+
- **🔐 Phase 1 integrity hardening** (P1a–P1g) — verification bookend snapshots,
|
|
72
|
+
anti-oscillation (`stuck` non-terminal + resumable), budget enforcement
|
|
73
|
+
(required or explicit opt-out), nonce-token feedback sanitization, secret
|
|
74
|
+
redaction at artifact-write (O(n) fix), global worker cap + workspace lock
|
|
75
|
+
(O_EXCL, startTime-safe). The B2 confused-deputy case (auto-detecting verification
|
|
76
|
+
commands) is refused: the user must declare verification explicitly.
|
|
77
|
+
- **🧪 Phase 1.5 fast-follow** — opt-in mitigation toggles for residual risks:
|
|
78
|
+
`PI_CREW_VERIFICATION_SANITIZE_ENV=1` (strip provider secrets from the
|
|
79
|
+
verification subprocess), `PI_CREW_VERIFICATION_WORKTREE=1` (run verification
|
|
80
|
+
in a pristine git worktree at the T_snap commit SHA),
|
|
81
|
+
`PI_CREW_BG_REPORT_ON_FATAL=1` (V8 diagnostic report on fatal).
|
|
82
|
+
- **🐛 TDZ fix** (Phase 1.5 #4) — live `team action='run' workflow='<dynamic>'`
|
|
83
|
+
was failing with a misleading "must export a default async function" error.
|
|
84
|
+
Root cause was a Temporal Dead Zone race in `team-tool/run.ts` when loaded via
|
|
85
|
+
the full Pi extension pipeline (`index.ts → … → run.ts`). Fixed by
|
|
86
|
+
`let`→`var` on the latch + lazy dynamic imports at call sites.
|
|
87
|
+
|
|
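The goal-loop bullet above describes a worker turn, a separate judge verdict, and feedback fed into the next prompt. A minimal sketch of that control flow follows. Every name in it (`runGoalLoop`, `GoalLoopOptions`, `Verdict`, `budgetTokens`, `shouldStop`) is hypothetical and not pi-crew's API. The worker and judge are synchronous callbacks only to keep the sketch short, and treating a transcript that starts with `BLOCKED:` as the blocked signal is an assumption about how that marker is detected.

```typescript
// Hypothetical types for illustration; these are not pi-crew's real API.
type Verdict = { achieved: boolean; reason: string };
type StopReason = "achieved" | "maxTurns" | "budget" | "blocked" | "stopped";

interface GoalLoopOptions {
  objective: string;
  maxTurns: number;
  budgetTokens: number;
  // Worker: does one turn and returns its transcript plus the tokens it spent.
  worker: (prompt: string) => { transcript: string; tokens: number };
  // Judge: evaluates the transcript against the objective (it gets no tools).
  judge: (objective: string, transcript: string) => Verdict;
  // Polled once per turn to honour a user stop request.
  shouldStop?: () => boolean;
}

function runGoalLoop(opts: GoalLoopOptions): { reason: StopReason; turns: number } {
  let prompt = opts.objective;
  let spent = 0;
  for (let turn = 1; turn <= opts.maxTurns; turn++) {
    if (opts.shouldStop?.()) return { reason: "stopped", turns: turn - 1 };
    const { transcript, tokens } = opts.worker(prompt);
    spent += tokens;
    // Assumption: the worker signals a blocker by prefixing its output.
    if (transcript.startsWith("BLOCKED:")) return { reason: "blocked", turns: turn };
    const verdict = opts.judge(opts.objective, transcript);
    if (verdict.achieved) return { reason: "achieved", turns: turn };
    if (spent >= opts.budgetTokens) return { reason: "budget", turns: turn };
    // On "not achieved", the judge's reason is fed into the next turn's prompt.
    prompt = `${opts.objective}\n\nJudge feedback from turn ${turn}: ${verdict.reason}`;
  }
  return { reason: "maxTurns", turns: opts.maxTurns };
}

// Usage with stub callbacks: the judge accepts the third attempt.
const prompts: string[] = [];
const result = runGoalLoop({
  objective: "make the tests pass",
  maxTurns: 5,
  budgetTokens: 10000,
  worker: (prompt) => {
    prompts.push(prompt);
    return { transcript: `attempt ${prompts.length}`, tokens: 100 };
  },
  judge: (_objective, transcript) =>
    transcript === "attempt 3"
      ? { achieved: true, reason: "tests pass" }
      : { achieved: false, reason: "tests still failing" },
});
```

Keeping the judge as a separate callback that sees only the objective and the transcript mirrors the "capability-locked, no tools" separation described above: the component that does the work does not grade itself.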
19
88
|
#### v0.8.x — hardening & reliability (2026-06-17)
|
|
20
89
|
- **🛠️ Split-scope install fix (v0.8.11)** — `team` runs no longer crash with
|
|
21
90
|
`Cannot find module '@earendil-works/pi-coding-agent'` when pi-crew and pi
|
|
@@ -75,6 +144,8 @@ trust and cliff-resilience, stay lean, delete before adding.*
|
|
|
75
144
|
- **Scheduled runs** — `schedule`/`scheduled` actions with cron, interval, and one-shot support; spawned runs tracked and auto-cancelled on job removal
|
|
76
145
|
- **Plugin system** — framework-aware context injection (Next.js, Vite, Vitest) via plugin registry
|
|
77
146
|
- **Health scoring** — penalty-based run health with time-series snapshots
|
|
147
|
+
- **Autonomous goal loops** (P0/P1) — `team action='goal'` runs an autonomous multi-turn loop: a worker does a turn, a separate LLM judge evaluates the transcript+evidence against the goal, and on "not-achieved" the reason is fed into the next turn's prompt. Stops on achieved / maxTurns / budget / blocked. Claude-Code-style `/goal`. See `docs/goals.md`.
|
|
148
|
+
- **Dynamic workflows** (P2/P3) — author orchestration as a `.dwf.ts` script (JS loops/branch/cross-review) instead of a static step list. The script runs in the background, calls subagents via `ctx.agent()`/`ctx.fanOut()`, holds intermediate results in JS variables, and only `ctx.setResult()` reaches the main context. `workflow-create`/`-delete`/`-save` require `confirm:true` at the tool-call layer (the only gate — a malicious agent that passes `confirm:true` programmatically bypasses it; this is postinstall-equivalent trust, not a human-in-the-loop dialog). See `docs/dynamic-workflows.md`.
|
|
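As a rough illustration of the dynamic-workflow bullet above, the sketch below shows what a cross-review loop could look like as a script. The `SketchCtx` interface, the argument shapes of `agent()`, `fanOut()`, and `setResult()`, and the role names `executor` and `reviewer` are all assumptions made for this example; the real `WorkflowCtx` signatures are documented in `docs/dynamic-workflows.md`, not here. A stub context is included so the sketch runs without pi-crew.

```typescript
// Hypothetical context shape; the real WorkflowCtx signatures may differ.
interface SketchCtx {
  agent(role: string, prompt: string): Promise<string>;
  fanOut(role: string, prompts: string[]): Promise<string[]>;
  setResult(result: string): void;
}

// In a real `.dwf.ts` file this function would be the default export.
async function crossReviewWorkflow(ctx: SketchCtx): Promise<void> {
  // Intermediate results live in ordinary JS variables.
  let draft = await ctx.agent("executor", "Draft the change");
  for (let round = 1; round <= 3; round++) {
    const reviews = await ctx.fanOut("reviewer", [
      `Review for correctness:\n${draft}`,
      `Review for security:\n${draft}`,
    ]);
    const objections = reviews.filter((r) => !r.startsWith("LGTM"));
    if (objections.length === 0) break;
    draft = await ctx.agent(
      "executor",
      `Revise:\n${draft}\n\nObjections:\n${objections.join("\n")}`,
    );
  }
  // Only this value is meant to reach the main context.
  ctx.setResult(draft);
}

// Stub context: reviewers approve once they see the revised draft.
const calls: string[] = [];
let finalResult = "";
const stubCtx: SketchCtx = {
  agent: async (role, prompt) => {
    calls.push(role);
    return prompt.startsWith("Revise") ? "draft v2" : "draft v1";
  },
  fanOut: async (_role, prompts) =>
    prompts.map((p) => (p.includes("draft v2") ? "LGTM" : "needs work")),
  setResult: (r) => {
    finalResult = r;
  },
};
const done = crossReviewWorkflow(stubCtx);
```

The loop and the early `break` are the parts a static step list cannot express directly, which is the motivation the bullet gives for scripting the orchestration.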
78
149
|
|
|
79
150
|
---
|
|
80
151
|
|
|
@@ -582,6 +653,8 @@ Stats: **366 source files** (70K lines) · **506 test files** (66K lines) · **4
|
|
|
582
653
|
| [docs/troubleshooting.md](docs/troubleshooting.md) | Common errors, recovery, and error-code reference (E001–E012) |
|
|
583
654
|
| [docs/architecture.md](docs/architecture.md) | Internal architecture + run flow |
|
|
584
655
|
| [docs/runtime-flow.md](docs/runtime-flow.md) | Runtime execution details |
|
|
656
|
+
| [docs/goals.md](docs/goals.md) | **v0.9.0** Autonomous goal loops (`team action='goal'`) |
|
|
657
|
+
| [docs/dynamic-workflows.md](docs/dynamic-workflows.md) | **v0.9.0** `.dwf.ts` script runtime + trust model |
|
|
585
658
|
| [docs/live-mailbox-runtime.md](docs/live-mailbox-runtime.md) | Mailbox + live-session runtime |
|
|
586
659
|
| [docs/publishing.md](docs/publishing.md) | Release & publish process |
|
|
587
660
|
| [docs/next-upgrade-roadmap.md](docs/next-upgrade-roadmap.md) | Future upgrade roadmap |
|
|
@@ -591,6 +664,43 @@ Research docs (not in package): [`docs/pi-crew-research/`](https://github.com/ba
|
|
|
591
664
|
|
|
592
665
|
---
|
|
593
666
|
|
|
667
|
+
## Known limitations
|
|
668
|
+
|
|
669
|
+
This is AI-developed software built for a personal workflow. These are the
|
|
670
|
+
sharp edges I'm aware of; there are almost certainly others I've missed.
|
|
671
|
+
|
|
672
|
+
- **Multi-step goal-wrap crashes non-deterministically.** Goal-wrapping
|
|
673
|
+
multi-step builtin workflows (`fast-fix`, `default`) can hit a V8/libuv
|
|
674
|
+
event-loop race that kills the background process with no signal, no core,
|
|
675
|
+
and no V8 diagnostic report (8 investigation attempts: gdb, strace, perf,
|
|
676
|
+
`--report-on-fatalerror`, sync-fs workarounds, worker-thread atomic writer —
|
|
677
|
+
see `research-findings/goal-workflow/17-PHASE1.5-CRASH-INVESTIGATION-RFC.md`).
|
|
678
|
+
**Mitigation:** multi-step workflows silently auto-downgrade to a normal
|
|
679
|
+
team-run (no goal-wrap layer); single-step workflows (`implementation`)
|
|
680
|
+
goal-wrap end-to-end.
|
|
681
|
+
- **`.dwf.ts` scripts are NOT sandboxed in v1.** The `WorkflowCtx` is
|
|
682
|
+
`Object.freeze()`d, but the script runs in plain module scope with full
|
|
683
|
+
`require`/`import`/`process` access (postinstall-equivalent trust).
|
|
684
|
+
`isolated-vm` (real V8 isolate) is planned for a future release. Only place
|
|
685
|
+
`.dwf.ts` files you have reviewed. See
|
|
686
|
+
[docs/dynamic-workflows.md#security-model-important](docs/dynamic-workflows.md#security-model-important).
|
|
687
|
+
- **Editor/agent file caching.** After editing a loaded pi-crew source file,
|
|
688
|
+
restart the Pi session for changes to take effect (jiti in-memory cache).
|
|
689
|
+
Editing a `.dwf.ts` in place while a run is mid-flight can serve a stale
|
|
690
|
+
module body; rename the file or restart Pi to force a fresh load.
|
|
691
|
+
- **Verification integrity is best-effort against adversarial workers.** The
|
|
692
|
+
bookend snapshot (P1a) and git-worktree sandbox (Phase 1.5 #2, opt-in)
|
|
693
|
+
raise the bar, but a worker in the same process can still tamper with files
|
|
694
|
+
outside the snapshot window. Full isolation requires the planned sandbox.
|
|
695
|
+
- **Single maintainer + AI review.** Every change ships after 2+ consecutive
|
|
696
|
+
clean static-review rounds + runtime tests, but there's no independent human
|
|
697
|
+
audit. Fork and read before trusting anything that touches your data.
|
|
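The point about `Object.freeze()` in the list above can be shown in a few lines: freezing an object protects that object's properties, but it does nothing to confine the code that receives it. The `ctx` below is a hypothetical stand-in, not pi-crew's actual `WorkflowCtx`, and the sketch assumes it runs under Node.

```typescript
// Hypothetical stand-in for the frozen context; not pi-crew's real WorkflowCtx.
const ctx = Object.freeze({
  setResult: (value: string): string => `result:${value}`,
});

// 1. The freeze does hold: replacing a method on the frozen object fails.
let mutationBlocked = false;
try {
  (ctx as { setResult: unknown }).setResult = () => "hijacked";
  // Sloppy-mode code reaches this line; the assignment was silently ignored.
  mutationBlocked = ctx.setResult("x") === "result:x";
} catch {
  // Strict-mode code (ES modules, compiled TypeScript) throws a TypeError instead.
  mutationBlocked = true;
}

// 2. The freeze does not confine the script: Node globals remain reachable
//    from module scope, so environment variables and the rest of the process
//    are still within reach of any script that is loaded.
const g = globalThis as unknown as { process?: { env?: object } };
const canReachProcess = typeof g.process?.env === "object";
```

Both checks pass at once under Node, which is why the limitation is described as postinstall-equivalent trust rather than a sandbox.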
698
|
+
|
|
699
|
+
If you hit any of these — or a new one — please
|
|
700
|
+
[open an issue](https://github.com/baphuongna/pi-crew/issues).
|
|
701
|
+
|
|
702
|
+
---
|
|
703
|
+
|
|
594
704
|
## Acknowledgements
|
|
595
705
|
|
|
596
706
|
`pi-crew` builds on ideas and selected MIT-licensed implementation patterns from `pi-subagents` and `oh-my-claudecode`, with conceptual inspiration from `oh-my-openagent`.
|
package/docs/FEATURE_INTAKE.md
CHANGED
package/docs/HARNESS.md
CHANGED
|
@@ -1,11 +1,12 @@
|
|
|
1
1
|
# Harness
|
|
2
2
|
|
|
3
|
-
pi-crew
|
|
4
|
-
agents
|
|
5
|
-
|
|
3
|
+
pi-crew is a Pi extension for multi-agent orchestration. This harness helps
|
|
4
|
+
agents and humans collaborate on developing pi-crew in a reliable, inspectable,
|
|
5
|
+
and easy-to-steer way.
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
product, classify work, track decisions,
|
|
7
|
+
The product is pi-crew itself. The harness is the operating environment that
|
|
8
|
+
helps agents understand the product, classify work, track decisions, and
|
|
9
|
+
validate changes.
|
|
9
10
|
|
|
10
11
|
## Mental Model
|
|
11
12
|
|
|
@@ -36,26 +37,26 @@ Human intent (issue, prompt, request)
|
|
|
36
37
|
Next intent
|
|
37
38
|
```
|
|
38
39
|
|
|
39
|
-
|
|
40
|
+
Each task has two outputs:
|
|
40
41
|
1. **Product delta**: code changes, test changes, API shape, config changes
|
|
41
42
|
2. **Harness delta**: docs, decisions, test matrix updates, backlog items
|
|
42
43
|
|
|
43
44
|
## Source Hierarchy
|
|
44
45
|
|
|
45
|
-
Agents
|
|
46
|
+
Agents read in this order:
|
|
46
47
|
|
|
47
|
-
1. `AGENTS.md` — operating rules
|
|
48
|
-
2. `docs/HARNESS.md` — file
|
|
49
|
-
3. `docs/FEATURE_INTAKE.md` —
|
|
48
|
+
1. `AGENTS.md` — operating rules and important paths
|
|
49
|
+
2. `docs/HARNESS.md` — this file, the collaboration model
|
|
50
|
+
3. `docs/FEATURE_INTAKE.md` — before turning a request into work
|
|
50
51
|
4. `docs/product/` — current product contract
|
|
51
52
|
5. `docs/ARCHITECTURE.md` — implementation shape
|
|
52
|
-
6. `docs/stories/` — active
|
|
53
|
+
6. `docs/stories/` — active and completed stories
|
|
53
54
|
7. `docs/TEST_MATRIX.md` — proof status
|
|
54
55
|
8. `docs/decisions/` — why important choices were made
|
|
55
56
|
|
|
56
57
|
## Validation Ladder
|
|
57
58
|
|
|
58
|
-
pi-crew
|
|
59
|
+
pi-crew already has validation commands:
|
|
59
60
|
|
|
60
61
|
| Level | Command | What it proves |
|
|
61
62
|
|-------|---------|----------------|
|
|
@@ -68,14 +69,14 @@ Agents **must not** claim validation passes without running the actual command.
|
|
|
68
69
|
|
|
69
70
|
## Growth Rule
|
|
70
71
|
|
|
71
|
-
|
|
72
|
-
-
|
|
73
|
-
-
|
|
74
|
-
-
|
|
75
|
-
-
|
|
76
|
-
-
|
|
72
|
+
The harness grows from friction. When an agent:
|
|
73
|
+
- Gets confused about expected behavior
|
|
74
|
+
- Has to repeat manual reasoning
|
|
75
|
+
- Lacks a validation command
|
|
76
|
+
- Discovers a missing rule
|
|
77
|
+
- Sees a recurring failure pattern
|
|
77
78
|
|
|
78
|
-
→
|
|
79
|
+
→ The agent must improve the harness directly or propose changes in `docs/HARNESS_BACKLOG.md`.
|
|
79
80
|
|
|
80
81
|
## Working Conventions
|
|
81
82
|
|