pi-crew 0.8.14 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (80) hide show
  1. package/CHANGELOG.md +271 -0
  2. package/README.md +112 -2
  3. package/docs/FEATURE_INTAKE.md +1 -1
  4. package/docs/HARNESS.md +20 -19
  5. package/docs/PROJECT_REVIEW.md +132 -133
  6. package/docs/PROJECT_REVIEW_FIXES.md +130 -131
  7. package/docs/actions-reference.md +127 -121
  8. package/docs/architecture.md +1 -1
  9. package/docs/code-review-2026-05-11.md +134 -134
  10. package/docs/commands-reference.md +108 -106
  11. package/docs/comparison-pi-subagents-vs-pi-crew.md +105 -105
  12. package/docs/deep-review-report.md +1 -1
  13. package/docs/dynamic-workflows.md +90 -0
  14. package/docs/fixes/BATCH_A_H1_H2.md +17 -17
  15. package/docs/fixes/bug-007-async-notifier-stale-ctx.md +23 -23
  16. package/docs/followup-plan-2026-05-12.md +135 -135
  17. package/docs/followup-review-2026-05-12.md +86 -86
  18. package/docs/followup-review-round3-2026-05-12.md +123 -123
  19. package/docs/goals.md +59 -0
  20. package/docs/implementation-plan-top3.md +4 -4
  21. package/docs/issue-29-analysis.md +2 -2
  22. package/docs/oh-my-pi-research.md +154 -154
  23. package/docs/optimization-plan.md +2 -0
  24. package/docs/perf/baseline-2026-05.md +9 -9
  25. package/docs/perf/final-report-2026-05.md +2 -2
  26. package/docs/perf/sprint-1-report.md +2 -2
  27. package/docs/perf/sprint-2-report.md +1 -1
  28. package/docs/perf/upgrade-plan-2026-05.md +72 -72
  29. package/docs/pi-crew-bugs.md +230 -230
  30. package/docs/pi-crew-investigation-report.md +102 -102
  31. package/docs/pi-crew-test-round5.md +4 -4
  32. package/docs/runtime-analysis-child-vs-live.md +57 -57
  33. package/docs/runtime-migration-in-process-analysis.md +97 -97
  34. package/package.json +2 -4
  35. package/skills/orchestration/SKILL.md +11 -11
  36. package/src/agents/agent-config.ts +4 -0
  37. package/src/config/config.ts +39 -0
  38. package/src/config/types.ts +11 -0
  39. package/src/extension/action-suggestions.ts +2 -1
  40. package/src/extension/async-notifier.ts +10 -0
  41. package/src/extension/help.ts +14 -0
  42. package/src/extension/registration/commands.ts +27 -0
  43. package/src/extension/team-tool/destructive-gate.ts +1 -1
  44. package/src/extension/team-tool/goal-wrap.ts +288 -0
  45. package/src/extension/team-tool/goal.ts +405 -0
  46. package/src/extension/team-tool/run.ts +103 -4
  47. package/src/extension/team-tool/workflow-manage.ts +194 -0
  48. package/src/extension/team-tool.ts +20 -0
  49. package/src/hooks/types.ts +3 -1
  50. package/src/runtime/async-runner.ts +24 -2
  51. package/src/runtime/background-runner.ts +68 -19
  52. package/src/runtime/child-pi.ts +6 -1
  53. package/src/runtime/completion-guard.ts +1 -1
  54. package/src/runtime/dynamic-workflow-context.ts +450 -0
  55. package/src/runtime/dynamic-workflow-runner.ts +180 -0
  56. package/src/runtime/global-worker-cap.ts +96 -0
  57. package/src/runtime/goal-evaluator.ts +294 -0
  58. package/src/runtime/goal-loop-runner.ts +612 -0
  59. package/src/runtime/goal-state-store.ts +209 -0
  60. package/src/runtime/pi-args.ts +10 -2
  61. package/src/runtime/result-extractor.ts +32 -0
  62. package/src/runtime/team-runner.ts +11 -1
  63. package/src/runtime/verification-gates.ts +85 -5
  64. package/src/runtime/verification-integrity.ts +110 -0
  65. package/src/runtime/verification-worktree.ts +136 -0
  66. package/src/runtime/workspace-lock.ts +448 -0
  67. package/src/schema/config-schema.ts +26 -0
  68. package/src/schema/team-tool-schema.ts +39 -4
  69. package/src/state/atomic-write.ts +9 -0
  70. package/src/state/contracts.ts +14 -0
  71. package/src/state/crew-init.ts +18 -5
  72. package/src/state/event-log.ts +7 -1
  73. package/src/state/state-store.ts +2 -0
  74. package/src/state/types.ts +82 -0
  75. package/src/state/worker-atomic-writer.ts +176 -0
  76. package/src/utils/redaction.ts +104 -24
  77. package/src/workflows/discover-workflows.ts +25 -1
  78. package/src/workflows/workflow-config.ts +13 -0
  79. package/teams/parallel-research.team.md +1 -1
  80. package/workflows/examples/hello.dwf.ts +24 -0
package/CHANGELOG.md CHANGED
@@ -1,5 +1,246 @@
1
1
  # Changelog
2
2
 
3
+ ## [v0.9.0] — goal loops + dynamic workflows (2026-06-18)
4
+
5
+ Two new features, both built on a shared `runKind` background-dispatch discriminator.
6
+
7
+ ### Phase 1.5 #4: TDZ fix — dynamic-workflow runs end-to-end via full pi pipeline (RFC 17 fix)
8
+
9
+ Live `team action='run' workflow='<dynamic>'` was failing with
10
+ `Dynamic workflow 'X' must export a default async function(ctx).` even
11
+ though the .dwf.ts loaded correctly via direct jiti. Root cause was NOT
12
+ in `dynamic-workflow-runner.ts` — it was a Temporal Dead Zone race in
13
+ `team-tool/run.ts` when loaded via the full pi extension pipeline
14
+ (`index.ts → register.ts → registration/team-tool.ts → team-tool.ts →
15
+ run.ts`).
16
+
17
+ **Race details**: jiti loads each .ts file inside an `async function
18
+ _module(...)` wrapper. Static `import { X } from "..."` statements
19
+ become `var _x = require(...)` calls. When a destructured `import` is
20
+ referenced inside a hoisted function before its `let` declaration line
21
+ runs, the reference hits TDZ.
22
+
23
+ **Fixes**:
24
+ - `src/extension/team-tool/run.ts`:
25
+ - `crewInitPromise`: `let` → `var` (avoids TDZ)
26
+ - `expandParallelResearchWorkflow`, `validateWorkflowForTeam`,
27
+ `normalizeSkillOverride`: convert to lazy dynamic imports at call site
28
+ - `src/state/crew-init.ts`:
29
+ - `CREW_README`: `const` → `function buildCrewReadme(): string` (function
30
+ declarations are fully hoisted)
31
+ - `updateGitignore`: convert usage to lazy dynamic import at call site
32
+
33
+ **New test**: `test/integration/run-via-full-pipeline.test.ts` loads
34
+ `index.ts` via `jiti.import()` the way pi does, invokes `handleRun` with a
35
+ dynamic workflow params, and asserts no TDZ / ReferenceError is thrown.
36
+ Fails without the fix, passes with it.
37
+
38
+ **Verification**:
39
+ - 108 unit tests pass (goal, dwf, redaction, verification, worker-writer)
40
+ - New integration test passes
41
+ - Direct simulation of pi pipeline → `Dynamic workflow 'demo-hello'
42
+ completed` (was: `failed: must export a default async function`)
43
+
44
+ Closes RFC 17 §4 round-trip / investigated residual. See
45
+ `research-findings/goal-workflow/17-PHASE1.5-CRASH-INVESTIGATION-RFC.md`
46
+ for the full 8-attempt investigation log (gdb, strace, V8 report, sync
47
+ workarounds, worker-thread atomic writer, auto-downgrade — none
48
+ identified the real bug because they all skipped the full pi load path).
49
+
50
+ ### Phase 1.5 #3: V8 diagnostic report infrastructure + crash investigation closed
51
+
52
+ `PI_CREW_BG_REPORT_ON_FATAL=1` makes the background goal-loop runner spawn
53
+ with `--report-on-fatalerror --report-compact`. When V8 considers the
54
+ process state fatal, it writes a diagnostic report (native stack, JS stack,
55
+ libuv handles, environment) — crucial info that application-level signal
56
+ handlers cannot capture.
57
+
58
+ **Investigation result** (RFC 17, 5+ reproduction attempts): the multi-step
59
+ goal-wrap crash does NOT produce a V8 report. Combined with prior findings
60
+ (no signal via strace, no coredump, no OOM, parent process alive) the crash
61
+ signature is consistent with a V8/libuv internal abort that bypasses all
62
+ Node.js reporting mechanisms. Cannot be diagnosed or fixed at the
63
+ application level.
64
+
65
+ **Decision**: ship with the existing auto-downgrade safety net
66
+ (commit `6de5270`). Multi-step workflows silently run as normal team-runs
67
+ (no goal-wrap layer); single-step workflows (implementation) continue to
68
+ goal-wrap end-to-end. The `--report-on-fatalerror` infrastructure remains
69
+ in place for future investigation if a Node.js version change alters the
70
+ crash behavior.
71
+
72
+ See `research-findings/goal-workflow/17-PHASE1.5-CRASH-INVESTIGATION-RFC.md`
73
+ for the full 8-attempt investigation log and re-opening criteria.
74
+
75
+ ### Phase 1.5 #2: git-worktree verification sandbox (closes round-trip + invoked-script tamper)
76
+
77
+ `PI_CREW_VERIFICATION_WORKTREE=1` runs verification commands in a pristine
78
+ `git worktree` at the T_snap commit SHA. Closes the two remaining Med-High
79
+ tampering residuals from RFC 13 §6 that P1a manifest hashing cannot detect:
80
+
81
+ - **Round-trip manifest tamper (MAJ#2)**: worker edits `package.json`, runs
82
+ `npm test` (passes), reverts before T_verify_done → hash matches →
83
+ tamper undetected. With worktree: verification runs at original SHA →
84
+ worker edits invisible → tamper BLOCKED.
85
+ - **Invoked-script tampering**: worker rewrites a script the verification
86
+ command invokes; only MANIFEST_FILES are hashed → invisible. With
87
+ worktree: script is at original SHA → tamper BLOCKED.
88
+
89
+ Graceful fallback when ANY precondition fails (logged via
90
+ logInternalError "goal-loop.worktreeSandboxBypassed"): opt-out env,
91
+ not-a-git-repo, dirty index, git unavailable. NEVER blocks the goal loop.
92
+
93
+ Implementation:
94
+ - `src/runtime/verification-worktree.ts` (NEW, pure leaf module):
95
+ `isWorktreeSandboxEnabled`, `checkWorktreeSandboxAvailable`,
96
+ `prepareVerificationWorktree` (git worktree add --detach),
97
+ `withVerificationWorktree` (RAII cleanup, idempotent, finally-safe).
98
+ - `src/runtime/verification-gates.ts`: `executeVerificationCommands`
99
+ accepts optional `worktreeCwd` — spawns commands with that cwd.
100
+ - `src/runtime/goal-loop-runner.ts`: verification call site prepares
101
+ worktree at T_snap SHA when available; finally block always cleans up.
102
+ - `src/runtime/async-runner.ts`: PI_CREW_VERIFICATION_WORKTREE env
103
+ inherited by bg-runner.
104
+
105
+ Tests: 12 new unit tests in `test/unit/verification-worktree.test.ts`
106
+ (flag opt-in, not-a-repo fallback, dirty-index fallback, clean-repo success,
107
+ pristine-checkout property = the security guarantee, RAII cleanup on success
108
+ + on exception, idempotent cleanup). All pass.
109
+ 5200 unit + 115 integration tests; no regression; tsc clean.
110
+
111
+ RFC: `research-findings/goal-workflow/16-PHASE1.5-WORKTREE-SANDBOX-RFC.md`
112
+
113
+ ### Phase 1.5 #1: sanitized-env verification (opt-in info-disclosure mitigation)
114
+
115
+ `PI_CREW_VERIFICATION_SANITIZE_ENV=1` strips model-provider secrets (and
116
+ everything else not in the essential-vars allowlist) from the env passed to
117
+ verification commands (`npm test`, `pytest`, etc.). Closes the info-disclosure
118
+ residual at the SOURCE — P1f redaction at artifact-write + judge-bound is
119
+ regex-best-effort against adversarial workers; this never gives the
120
+ verification process the secret in the first place.
121
+
122
+ Escape hatch: `PI_CREW_VERIFICATION_PRESERVE_ENV=KEY1,KEY2,...` lets users
123
+ explicitly opt specific env vars back in (audited via the env-filter.ts
124
+ allowlist validator). Essential non-secret vars (PATH, HOME, USER, SHELL,
125
+ LANG, XDG_*, NPM_CONFIG_*, etc.) are always preserved.
126
+
127
+ AllowList: 25 essential vars. NO model-provider keys by default.
128
+ Inherited by bg-runner via async-runner.ts env allowlist.
129
+
130
+ Tests: 7 new unit tests in test/unit/verification-env-sanitize.test.ts
131
+ (3 flag checks + 4 integration tests spawning real `printenv` subprocesses).
132
+ All pass. 5188 unit + 115 integration tests; no regression.
133
+
134
+ ### SAFETY: goal-wrap auto-downgrades multi-step workflows (no hidden crashes)
135
+
136
+ Multi-step workflows (default: 4 steps, fast-fix: 3 steps) crash
137
+ non-deterministically when run as goal-wrap worker turns in the background
138
+ goal-loop process — V8/libuv race during event-loop yields in team-runner
139
+ batch transition (see commit a9f6e09, RFC 15). Sync fs workarounds regress;
140
+ worker-thread isolation doesn't help.
141
+
142
+ When a user has goal-wrap enabled in config but the workflow is multi-step,
143
+ the team-run handler now **auto-downgrades**: skips the goal-wrap layer and
144
+ runs the workflow via the normal team-run path (foreground `executeTeamRun`
145
+ or background `spawnBackgroundTeamRun`, depending on `async`). The user gets
146
+ the run they asked for — no error, no hang, no need to remove config.
147
+
148
+ The bypass reason is logged via `logInternalError("team-tool.run.goalWrapBypassed", ...)`
149
+ for traceability (findable in debug logs / `internal-error.json`).
150
+
151
+ Single-step workflows (e.g. `implementation`, only the adaptive `assess`
152
+ step) continue to be goal-wrapped end-to-end.
153
+
154
+ Implementation:
155
+ - `shouldGoalWrap(cwd, workflow)` — pure decision function returning
156
+ `{enabled: true}` or `{enabled: false, reason, message}`. Reasons:
157
+ `config-off` (not enabled), `invalid-config` (malformed), `multi-step`
158
+ (more than `GOAL_WRAP_MAX_STEPS = 1` step).
159
+ - `run.ts` calls `shouldGoalWrap` after `isGoalWrapEnabled`; if disabled,
160
+ falls through to normal team-run path. The original `isGoalWrapEnabled`
161
+ fast path (config check only) is kept as a cheap pre-filter.
162
+ - 5 new unit tests in `test/unit/goal-wrap.test.ts` cover all 4 decisions
163
+ (config-off / invalid-config / multi-step refuse / single-step accept)
164
+ + the GOAL_WRAP_MAX_STEPS value invariant.
165
+
166
+ ### Phase 1.5: worker-thread atomic writer (opt-in, infrastructure)
167
+
168
+ `PI_CREW_WORKER_ATOMIC_WRITER=1` routes `atomicWriteFileAsync` and
169
+ `appendEventAsync` through a dedicated worker thread that performs SYNC fs
170
+ ops with no internal yields. Implementation: `src/state/worker-atomic-writer.ts`.
171
+ 9 unit tests; 5169 existing tests pass; no regression.
172
+
173
+ **Test result**: worker writer does NOT fix the multi-step crash (verified
174
+ end-to-end with `default` workflow). The crash is NOT in fs writes — worker
175
+ writes complete successfully but the process still dies during batch
176
+ transition. Root cause is some other async operation yielding the main
177
+ event loop. See `research-findings/goal-workflow/15-PHASE1.5-WORKER-WRITER-RFC.md`
178
+ for full investigation notes.
179
+
180
+ The worker writer is kept as **infrastructure** — opt-in, well-tested, no
181
+ regression. It may help with future variants or concurrent-write contention.
182
+
183
+ ### Resolution: multi-step goal-wrap crash (3/3 tasks now complete end-to-end)
184
+
185
+ The silent crash at `atomicWriteFileAsync` of the inner turn's `manifest.json`
186
+ (size=7417) — which caused `team action='run' workflow='fast-fix'` (and other
187
+ multi-step builtins) to hang at "1/3" forever — is **resolved** as a side
188
+ effect of commit `d52cb81` ("fix(goal-wrap): persist async.pid on OUTER
189
+ goal-loop manifest"). The extra `atomicWriteJson(manifestPath, asyncGoalManifest)`
190
+ call in `startGoalWrappedRun` after `spawnBackgroundTeamRun` shifts timing
191
+ enough to avoid the underlying race condition.
192
+
193
+ Verified end-to-end with 3 consecutive runs of goal-wrapped fast-fix
194
+ (`fix test.js so npm test passes`): all completed 3/3 tasks in ~120s with
195
+ `npm test` PASS. The original deep-dive investigation (commit `a9f6e09`) is
196
+ preserved as a reference; the proximate crash trigger is a Node.js / V8 /
197
+ filesystem-level race that is not reliably reproducible in either direction.
198
+
199
+ The user-facing symptom (must kill pi to recover from 1/3 hang) is also
200
+ resolved: even if a future regression reintroduces the crash, async-notifier
201
+ will detect the dead background-runner within ~30s and emit `async.died` —
202
+ the user sees "Goal failed: Background runner died unexpectedly" instead of
203
+ an infinite "running" state.
204
+
205
+
206
+
207
+ ### `goal` — autonomous goal loop (P0a + P0 + P1)
208
+
209
+ - `team action='goal' config.subAction='start|status|pause|resume|stop|step|clear'`.
210
+ - A worker does a turn (`executeTeamRun`), then a separate LLM judge (synthesized
211
+ `goal-judge` AgentConfig with `disableTools:true` → Pi `--no-tools`) evaluates the
212
+ transcript + evidence and returns `{achieved, reason, evidenceRefs}`. On
213
+ not-achieved, the `reason` is composed into the next turn's `manifest.goal`.
214
+ - One manifest PER turn (status-transition invariants block reuse). Budget via
215
+ `collectRunMetrics`. `GoalLoopState` persisted at `<crewRoot>/state/goals/<goalId>.json`.
216
+ - Slash command `/team-goal`. Hooks: `before_goal_step`, `before_goal_abort`.
217
+ - Spec-driven: `research-findings/goal-workflow/00-SPEC.md` + `07-PLAN.md` v3.
218
+
219
+ ### `workflow` — dynamic workflow scripts (P2 + P3)
220
+
221
+ - `.dwf.ts` scripts orchestrate subagents via `ctx.agent()` / `ctx.fanOut()` with
222
+ JS loops/branch/cross-review; only `ctx.setResult()` reaches the main context.
223
+ - Full `WorkflowCtx`: `agent`, `fanOut`, `review`, `retry`, `mail`, `gatherReplies`,
224
+ `renderTemplate`, `vars`, `setResult`.
225
+ - `team action='workflow-{create,get,list,save,delete}'`. `workflow-create`/`-delete`
226
+ ACE-gated via `destructive-gate.ts` (`confirm:true`, user-initiated only, path-
227
+ allowlisted via `resolveRealContainedPath`, content-validated).
228
+ - Capability-locked `WorkflowCtx` (Object.freeze + vm.runInNewContext);
229
+ `isolated-vm` deferred to v1.5.
230
+ - Slash command `/workflows`. Example: `workflows/examples/hello.dwf.ts`.
231
+
232
+ ### Shared infra (P0a)
233
+
234
+ - `manifest.runKind?: 'team-run' | 'goal-loop' | 'dynamic-workflow'` discriminator;
235
+ background-runner.ts dispatches to `executeTeamRun` / `runGoalLoop` /
236
+ `runDynamicWorkflow`. Default `'team-run'` (backward-compatible).
237
+
238
+ ### Other
239
+
240
+ - `AgentConfig.disableTools?: boolean` — pushes Pi `--no-tools` (capability-locked agents).
241
+ - `TEAM_EVENT_TYPES` += `goal.*` + `dwf.*` namespaces.
242
+ - New agent-config field, new event types, new hooks — all additive, no breaking changes.
243
+
3
244
  ## [0.8.12] — `team action=cleanup` now reverses `init` (Issue #35) (2026-06-17)
4
245
 
5
246
  `team action=cleanup` gained a **project-level mode** that reverses what
@@ -2653,3 +2894,33 @@ user's project-instructions file was out-of-scope and unnecessary.
2653
2894
 
2654
2895
  +4 regression tests (init does NOT create/modify AGENTS.md; API fields removed).
2655
2896
  typecheck clean; full suite 2972/0.
2897
+
2898
+ ## [Unreleased] — dead-dep cleanup + non-blocking fallow CI (2026-06-18)
2899
+
2900
+ Spotted by running `fallow` (deterministic Rust codebase intelligence) against
2901
+ the repo. Two genuine wins, plus an informational CI job that never blocks.
2902
+
2903
+ ### Removed (dead dependencies, verified unused)
2904
+ - **`typebox`** (`package.json:89`) — dead duplicate of `@sinclair/typebox`
2905
+ (which 10 source files actually import). `typebox` (plain) had **zero**
2906
+ imports anywhere in `src/`.
2907
+ - **`acorn`** (`package.json:84`) — **zero** runtime references in `src/`,
2908
+ `scripts/`, or `*.mjs`. Verified the only other package referencing it
2909
+ (`jiti`) lists it under its own `devDependencies` (for jiti's own tests), so
2910
+ it is not a runtime transitive need. `npm ls acorn` confirmed `pi-crew` was
2911
+ its sole parent.
2912
+
2913
+ Both removals verified: typecheck clean, full suite 2965/0.
2914
+
2915
+ ### CI: added `fallow-audit` job (non-blocking)
2916
+ - New job in `.github/workflows/ci.yml`: ubuntu-only, `continue-on-error: true`
2917
+ so it **never fails the build**.
2918
+ - Runs `fallow audit` (changed-code diff vs base ref) in JSON + human summary,
2919
+ uploads `fallow-audit-report` artifact (14-day retention).
2920
+ - Surfaced findings (dead code, circular deps, duplication, complexity
2921
+ hotspots, dependency hygiene) are for human/agent review, NOT a merge gate.
2922
+ - Rationale for non-blocking: fallow has high out-of-the-box noise (254 clone
2923
+ families, 379 hotspots) + a false positive on the tsx/jiti path-loading
2924
+ pattern (`jiti` flagged unused but is used via runtime path-loading). A
2925
+ blocking gate would create an unpaid maintenance backlog unsuitable for a
2926
+ solo-maintained extension.
package/README.md CHANGED
@@ -1,5 +1,35 @@
1
1
  # pi-crew
2
2
 
3
+ > ## ⚠️ IMPORTANT — Read before using
4
+ >
5
+ > **pi-crew is a sub-agent orchestration layer that was developed almost entirely
6
+ > by AI, for the author's own workflow.** It is **not** a hardened, audited
7
+ > product. Here's the honest framing:
8
+ >
9
+ > - **AI-generated code, limited human review.** The vast majority of pi-crew
10
+ > was written and iterated on by autonomous AI agents. While every change
11
+ > goes through static review + runtime tests, I (the author) have not
12
+ > line-by-line verified everything. There will be bugs, edge cases, and
13
+ > behaviors I haven't anticipated.
14
+ > - **It can spawn processes, run shell commands, and write files on your
15
+ > behalf.** Dynamic workflows (`.dwf.ts`) and goal loops run with the same
16
+ > privileges as your Pi session — treat any `.dwf.ts` like `node script.js`
17
+ > you downloaded from the internet.
18
+ > - **Built for *my* needs, not yours.** This scratches a personal itch. It
19
+ > likely won't fit every workflow, team setup, or risk tolerance — and
20
+ > that's fine.
21
+ >
22
+ > **If that sounds too risky, don't use it** — no hard feelings.
23
+ >
24
+ > **If you still want to use it**, the safest path is to **fork it, read the
25
+ > parts you'll touch, and adapt it to your own setup.** If you find a bug,
26
+ > a footgun, or a sharp edge, please open an issue or send a note — your
27
+ > feedback is genuinely appreciated. Thanks. ✌️
28
+ >
29
+ > See also: [SECURITY-ISSUES.md](SECURITY-ISSUES.md),
30
+ > [docs/dynamic-workflows.md](docs/dynamic-workflows.md#security-model-important)
31
+ > (trust model), and the [Known limitations](#known-limitations) section below.
32
+
3
33
  **Coordinate AI agent teams inside [Pi](https://github.com/nicekate/pi-coding-agent).**
4
34
 
5
35
  pi-crew is a Pi extension that orchestrates autonomous multi-agent workflows — research, implementation, review, testing, and more — with durable state, parallel execution, worktree isolation, and safe defaults.
@@ -9,13 +39,52 @@ npm: pi-crew
9
39
  repo: https://github.com/baphuongna/pi-crew
10
40
  ```
11
41
 
12
- **v0.8.11**: See [CHANGELOG.md](CHANGELOG.md).
42
+ **v0.9.0**: See [CHANGELOG.md](CHANGELOG.md).
13
43
 
14
- ### Highlights (v0.6.4 → v0.8.11)
44
+ ### Highlights (v0.6.4 → v0.9.0)
15
45
 
16
46
  A long arc of **trust, cliff-resilience, and robustness** work. Principle: *build
17
47
  trust and cliff-resilience, stay lean, delete before adding.*
18
48
 
49
+ #### v0.9.0 — goal loops + dynamic workflows (2026-06-18)
50
+ Two new features, both modeled on Claude Code, built on a shared `runKind`
51
+ background-dispatch discriminator.
52
+
53
+ - **🎯 Autonomous goal loops** — `team action='goal'` runs a self-directed
54
+ multi-turn loop: a **worker** does a turn, a separate **LLM judge**
55
+ (capability-locked, no tools) evaluates the transcript + verification against
56
+ the objective, and on "not-achieved" the reason is fed into the next turn's
57
+ prompt. Stops on `achieved` / `maxTurns` / budget / `BLOCKED:` / user `stop`.
58
+ See [docs/goals.md](docs/goals.md).
59
+ - **📜 Dynamic workflows (`.dwf.ts`)** — author orchestration as a TypeScript
60
+ script (JS loops/branch/cross-review) instead of a static step list. Runs in
61
+ the background, spawns subagents via `ctx.agent()`/`ctx.fanOut()`, holds
62
+ intermediate results in JS variables, and only `ctx.setResult()` reaches the
63
+ main context. `workflow-create`/`-delete` are ACE-gated (`confirm:true`,
64
+ user-confirmed). See [docs/dynamic-workflows.md](docs/dynamic-workflows.md).
65
+ - **🛡️ Goal-wrap** (RFC v0.5 vision) — apply the goal completion-guarantee to
66
+ existing builtin workflows (`implementation`, `fast-fix`, `default`) via
67
+ per-workflow `.crew/config.json` toggle. Single-step workflows goal-wrap
68
+ end-to-end; multi-step workflows auto-downgrade to a normal team-run because
69
+ they crash non-deterministically under the V8/libuv event-loop (see [Known
70
+ limitations](#known-limitations)).
71
+ - **🔐 Phase 1 integrity hardening** (P1a–P1g) — verification bookend snapshots,
72
+ anti-oscillation (`stuck` non-terminal + resumable), budget enforcement
73
+ (required or explicit opt-out), nonce-token feedback sanitization, secret
74
+ redaction at artifact-write (O(n) fix), global worker cap + workspace lock
75
+ (O_EXCL, startTime-safe). B2 confused-deputy (auto-detecting verification
76
+ commands) refused — user must declare verification explicitly.
77
+ - **🧪 Phase 1.5 fast-follow** — opt-in mitigation toggles for residual risks:
78
+ `PI_CREW_VERIFICATION_SANITIZE_ENV=1` (strip provider secrets from the
79
+ verification subprocess), `PI_CREW_VERIFICATION_WORKTREE=1` (run verification
80
+ in a pristine git worktree at the T_snap commit SHA),
81
+ `PI_CREW_BG_REPORT_ON_FATAL=1` (V8 diagnostic report on fatal).
82
+ - **🐛 TDZ fix** (Phase 1.5 #4) — live `team action='run' workflow='<dynamic>'`
83
+ was failing with a misleading "must export a default async function" error.
84
+ Root cause was a Temporal Dead Zone race in `team-tool/run.ts` when loaded via
85
+ the full Pi extension pipeline (`index.ts → … → run.ts`). Fixed by
86
+ `let`→`var` on the latch + lazy dynamic imports at call sites.
87
+
19
88
  #### v0.8.x — hardening & reliability (2026-06-17)
20
89
  - **🛠️ Split-scope install fix (v0.8.11)** — `team` runs no longer crash with
21
90
  `Cannot find module '@earendil-works/pi-coding-agent'` when pi-crew and pi
@@ -75,6 +144,8 @@ trust and cliff-resilience, stay lean, delete before adding.*
75
144
  - **Scheduled runs** — `schedule`/`scheduled` actions with cron, interval, and one-shot support; spawned runs tracked and auto-cancelled on job removal
76
145
  - **Plugin system** — framework-aware context injection (Next.js, Vite, Vitest) via plugin registry
77
146
  - **Health scoring** — penalty-based run health with time-series snapshots
147
+ - **Autonomous goal loops** (P0/P1) — `team action='goal'` runs an autonomous multi-turn loop: a worker does a turn, a separate LLM judge evaluates the transcript+evidence against the goal, and on "not-achieved" the reason is fed into the next turn's prompt. Stops on achieved / maxTurns / budget / blocked. Claude-Code-style `/goal`. See `docs/goals.md`.
148
+ - **Dynamic workflows** (P2/P3) — author orchestration as a `.dwf.ts` script (JS loops/branch/cross-review) instead of a static step list. The script runs in the background, calls subagents via `ctx.agent()`/`ctx.fanOut()`, holds intermediate results in JS variables, and only `ctx.setResult()` reaches the main context. `workflow-create`/`-delete`/`-save` require `confirm:true` at the tool-call layer (the only gate — a malicious agent that passes `confirm:true` programmatically bypasses it; this is postinstall-equivalent trust, not a human-in-the-loop dialog). See `docs/dynamic-workflows.md`.
78
149
 
79
150
  ---
80
151
 
@@ -582,6 +653,8 @@ Stats: **366 source files** (70K lines) · **506 test files** (66K lines) · **4
582
653
  | [docs/troubleshooting.md](docs/troubleshooting.md) | Common errors, recovery, and error-code reference (E001–E012) |
583
654
  | [docs/architecture.md](docs/architecture.md) | Internal architecture + run flow |
584
655
  | [docs/runtime-flow.md](docs/runtime-flow.md) | Runtime execution details |
656
+ | [docs/goals.md](docs/goals.md) | **v0.9.0** Autonomous goal loops (`team action='goal'`) |
657
+ | [docs/dynamic-workflows.md](docs/dynamic-workflows.md) | **v0.9.0** `.dwf.ts` script runtime + trust model |
585
658
  | [docs/live-mailbox-runtime.md](docs/live-mailbox-runtime.md) | Mailbox + live-session runtime |
586
659
  | [docs/publishing.md](docs/publishing.md) | Release & publish process |
587
660
  | [docs/next-upgrade-roadmap.md](docs/next-upgrade-roadmap.md) | Future upgrade roadmap |
@@ -591,6 +664,43 @@ Research docs (not in package): [`docs/pi-crew-research/`](https://github.com/ba
591
664
 
592
665
  ---
593
666
 
667
+ ## Known limitations
668
+
669
+ This is AI-developed software built for a personal workflow. These are the
670
+ sharp edges I'm aware of — there are almost certainly others I'm not.
671
+
672
+ - **Multi-step goal-wrap crashes non-deterministically.** Goal-wrapping
673
+ multi-step builtin workflows (`fast-fix`, `default`) can hit a V8/libuv
674
+ event-loop race that kills the background process with no signal, no core,
675
+ and no V8 diagnostic report (8 investigation attempts: gdb, strace, perf,
676
+ `--report-on-fatalerror`, sync-fs workarounds, worker-thread atomic writer —
677
+ see `research-findings/goal-workflow/17-PHASE1.5-CRASH-INVESTIGATION-RFC.md`).
678
+ **Mitigation:** multi-step workflows silently auto-downgrade to a normal
679
+ team-run (no goal-wrap layer); single-step workflows (`implementation`)
680
+ goal-wrap end-to-end.
681
+ - **`.dwf.ts` scripts are NOT sandboxed in v1.** The `WorkflowCtx` is
682
+ `Object.freeze()`d, but the script runs in plain module scope with full
683
+ `require`/`import`/`process` access (postinstall-equivalent trust).
684
+ `isolated-vm` (real V8 isolate) is planned for a future release. Only place
685
+ `.dwf.ts` files you have reviewed. See
686
+ [docs/dynamic-workflows.md#security-model-important](docs/dynamic-workflows.md#security-model-important).
687
+ - **Editor/agent file caching.** After editing a loaded pi-crew source file,
688
+ restart the Pi session for changes to take effect (jiti in-memory cache).
689
+ Editing a `.dwf.ts` in place while a run is mid-flight can serve a stale
690
+ module body; rename the file or restart Pi to force a fresh load.
691
+ - **Verification integrity is best-effort against adversarial workers.** The
692
+ bookend snapshot (P1a) and git-worktree sandbox (Phase 1.5 #2, opt-in)
693
+ raise the bar, but a worker in the same process can still tamper with files
694
+ outside the snapshot window. Full isolation requires the planned sandbox.
695
+ - **Single maintainer + AI review.** Every change ships after 2+ consecutive
696
+ clean static-review rounds + runtime tests, but there's no independent human
697
+ audit. Fork and read before trusting anything that touches your data.
698
+
699
+ If you hit any of these — or a new one — please
700
+ [open an issue](https://github.com/baphuongna/pi-crew/issues).
701
+
702
+ ---
703
+
594
704
  ## Acknowledgements
595
705
 
596
706
  `pi-crew` builds on ideas and selected MIT-licensed implementation patterns from `pi-subagents` and `oh-my-claudecode`, with conceptual inspiration from `oh-my-openagent`.
@@ -1,6 +1,6 @@
1
1
  # Feature Intake
2
2
 
3
- Mọi implementation prompt phải đi qua intake gate trước khi code changes.
3
+ Every implementation prompt must pass through the intake gate before code changes.
4
4
 
5
5
  ## Intake Flow
6
6
 
package/docs/HARNESS.md CHANGED
@@ -1,11 +1,12 @@
1
1
  # Harness
2
2
 
3
- pi-crew một Pi extension cho multi-agent orchestration. Harness này giúp
4
- agents humans phối hợp phát triển pi-crew một cách reliable, inspectable,
5
- dễ steer.
3
+ pi-crew is a Pi extension for multi-agent orchestration. This harness helps
4
+ agents and humans collaborate on developing pi-crew in a reliable, inspectable,
5
+ and easy-to-steer way.
6
6
 
7
- Product pi-crew chính nó. Harness môi trường operating để agents hiểu
8
- product, classify work, track decisions, và validate changes.
7
+ The product is pi-crew itself. The harness is the operating environment that
8
+ helps agents understand the product, classify work, track decisions, and
9
+ validate changes.
9
10
 
10
11
  ## Mental Model
11
12
 
@@ -36,26 +37,26 @@ Human intent (issue, prompt, request)
36
37
  Next intent
37
38
  ```
38
39
 
39
- Mỗi task 2 outputs:
40
+ Each task has 2 outputs:
40
41
  1. **Product delta**: code changes, test changes, API shape, config changes
41
42
  2. **Harness delta**: docs, decisions, test matrix updates, backlog items
42
43
 
43
44
  ## Source Hierarchy
44
45
 
45
- Agents đọc theo thứ tự:
46
+ Agents read in this order:
46
47
 
47
- 1. `AGENTS.md` — operating rules important paths
48
- 2. `docs/HARNESS.md` — file này, collaboration model
49
- 3. `docs/FEATURE_INTAKE.md` — trước khi biến request thành work
48
+ 1. `AGENTS.md` — operating rules and important paths
49
+ 2. `docs/HARNESS.md` — this file, the collaboration model
50
+ 3. `docs/FEATURE_INTAKE.md` — before turning a request into work
50
51
  4. `docs/product/` — current product contract
51
52
  5. `docs/ARCHITECTURE.md` — implementation shape
52
- 6. `docs/stories/` — active completed stories
53
+ 6. `docs/stories/` — active and completed stories
53
54
  7. `docs/TEST_MATRIX.md` — proof status
54
55
  8. `docs/decisions/` — why important choices were made
55
56
 
56
57
  ## Validation Ladder
57
58
 
58
- pi-crew đã validation commands:
59
+ pi-crew already has validation commands:
59
60
 
60
61
  | Level | Command | What it proves |
61
62
  |-------|---------|----------------|
@@ -68,14 +69,14 @@ Agents **must not** claim validation passes without running the actual command.
68
69
 
69
70
  ## Growth Rule
70
71
 
71
- Harness grows từ friction. Khi agent:
72
- - Bị confused về expected behavior
73
- - Phải repeat manual reasoning
74
- - Thiếu validation command
75
- - Discover missing rule
76
- - Thấy recurring failure pattern
72
+ The harness grows from friction. When an agent:
73
+ - Gets confused about expected behavior
74
+ - Has to repeat manual reasoning
75
+ - Lacks a validation command
76
+ - Discovers a missing rule
77
+ - Sees a recurring failure pattern
77
78
 
78
- Agent must improve harness directly hoặc propose trong `docs/HARNESS_BACKLOG.md`.
79
+ The agent must improve the harness directly or propose changes in `docs/HARNESS_BACKLOG.md`.
79
80
 
80
81
  ## Working Conventions
81
82