
pi-crew 0.8.14 → 0.9.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (86)

package/CHANGELOG.md +366 -0
package/README.md +112 -2
package/docs/FEATURE_INTAKE.md +1 -1
package/docs/HARNESS.md +20 -19
package/docs/PROJECT_REVIEW.md +132 -133
package/docs/PROJECT_REVIEW_FIXES.md +130 -131
package/docs/actions-reference.md +127 -121
package/docs/architecture.md +1 -1
package/docs/code-review-2026-05-11.md +134 -134
package/docs/commands-reference.md +108 -106
package/docs/comparison-pi-subagents-vs-pi-crew.md +105 -105
package/docs/deep-review-report.md +1 -1
package/docs/dynamic-workflows.md +90 -0
package/docs/fixes/BATCH_A_H1_H2.md +17 -17
package/docs/fixes/bug-007-async-notifier-stale-ctx.md +23 -23
package/docs/followup-plan-2026-05-12.md +135 -135
package/docs/followup-review-2026-05-12.md +86 -86
package/docs/followup-review-round3-2026-05-12.md +123 -123
package/docs/goals.md +59 -0
package/docs/implementation-plan-top3.md +4 -4
package/docs/issue-29-analysis.md +2 -2
package/docs/oh-my-pi-research.md +154 -154
package/docs/optimization-plan.md +2 -0
package/docs/perf/baseline-2026-05.md +9 -9
package/docs/perf/final-report-2026-05.md +2 -2
package/docs/perf/sprint-1-report.md +2 -2
package/docs/perf/sprint-2-report.md +1 -1
package/docs/perf/upgrade-plan-2026-05.md +72 -72
package/docs/pi-crew-bugs.md +230 -230
package/docs/pi-crew-investigation-report.md +102 -102
package/docs/pi-crew-test-round5.md +4 -4
package/docs/runtime-analysis-child-vs-live.md +57 -57
package/docs/runtime-migration-in-process-analysis.md +97 -97
package/package.json +2 -4
package/skills/orchestration/SKILL.md +11 -11
package/src/agents/agent-config.ts +4 -0
package/src/config/config.ts +39 -0
package/src/config/types.ts +11 -0
package/src/extension/action-suggestions.ts +2 -1
package/src/extension/async-notifier.ts +10 -0
package/src/extension/help.ts +14 -0
package/src/extension/registration/commands.ts +27 -0
package/src/extension/team-tool/destructive-gate.ts +1 -1
package/src/extension/team-tool/goal-wrap.ts +288 -0
package/src/extension/team-tool/goal.ts +405 -0
package/src/extension/team-tool/run.ts +103 -4
package/src/extension/team-tool/workflow-manage.ts +194 -0
package/src/extension/team-tool.ts +20 -0
package/src/hooks/types.ts +3 -1
package/src/runtime/async-runner.ts +27 -2
package/src/runtime/background-runner.ts +68 -19
package/src/runtime/child-pi.ts +9 -1
package/src/runtime/completion-guard.ts +1 -1
package/src/runtime/dynamic-workflow-context.ts +450 -0
package/src/runtime/dynamic-workflow-runner.ts +180 -0
package/src/runtime/global-worker-cap.ts +96 -0
package/src/runtime/goal-evaluator.ts +294 -0
package/src/runtime/goal-loop-runner.ts +612 -0
package/src/runtime/goal-state-store.ts +209 -0
package/src/runtime/iteration-hooks.ts +2 -1
package/src/runtime/pi-args.ts +10 -2
package/src/runtime/post-checks.ts +2 -1
package/src/runtime/result-extractor.ts +32 -0
package/src/runtime/team-runner.ts +11 -1
package/src/runtime/verification-gates.ts +88 -5
package/src/runtime/verification-integrity.ts +110 -0
package/src/runtime/verification-worktree.ts +136 -0
package/src/runtime/workspace-lock.ts +448 -0
package/src/schema/config-schema.ts +26 -0
package/src/schema/team-tool-schema.ts +39 -4
package/src/state/atomic-write.ts +9 -0
package/src/state/contracts.ts +14 -0
package/src/state/crew-init.ts +18 -5
package/src/state/event-log.ts +7 -1
package/src/state/state-store.ts +2 -0
package/src/state/types.ts +82 -0
package/src/state/worker-atomic-writer.ts +190 -0
package/src/utils/env-allowlist.ts +30 -0
package/src/utils/redaction.ts +104 -24
package/src/utils/safe-paths.ts +55 -14
package/src/workflows/discover-workflows.ts +25 -1
package/src/workflows/workflow-config.ts +13 -0
package/src/worktree/cleanup.ts +2 -1
package/src/worktree/worktree-manager.ts +4 -3
package/teams/parallel-research.team.md +1 -1
package/workflows/examples/hello.dwf.ts +24 -0

package/docs/PROJECT_REVIEW.md CHANGED

@@ -1,60 +1,60 @@
 # pi-crew — Project Review
-> Ngày review: 2026-05-18
-> Phiên bản: `pi-crew@0.2.19`
-> Phạm vi: toàn bộ source (`index.ts`, `src/**`), config, test, tài liệu, scripts.
-> Method: đọc trực tiếp source, đối chiếu `AGENTS.md`/`docs/architecture.md`, chạy `npm run typecheck` + `npm run test:unit`.
+> Review date: 2026-05-18
+> Version: `pi-crew@0.2.19`
+> Scope: the entire source (`index.ts`, `src/**`), config, tests, docs, scripts.
+> Method: read source directly, cross-referenced against `AGENTS.md`/`docs/architecture.md`, ran `npm run typecheck` + `npm run test:unit`.
-## Tổng quan
+## Overview
-`pi-crew` là một Pi extension điều phối multi-agent (teams + workflows + worktrees + async background runs), với **mô hình durable-first**: mọi run/task/event được persist xuống ổ đĩa (JSONL + JSON atomic write) để foreground, async background, dashboard và crash recovery đều đọc cùng một nguồn sự thật.
+`pi-crew` is a multi-agent orchestration Pi extension (teams + workflows + worktrees + async background runs), with a **durable-first model**: every run/task/event is persisted to disk (JSONL + atomic JSON writes) so that foreground, async background, dashboard, and crash recovery all read the same source of truth.
-Codebase trưởng thành, có **TypeScript strict mode** (`noImplicitAny`, `strict: true`), bộ test rộng (1596 tests pass, 2 skipped, 0 fail), kiến trúc phân tầng rõ (extension / runtime / state / worktree / utils), và rất nhiều ghi chú phòng-thân ("3.1 backpressure", "2.10 cache", "P1 catch unhandled errors") cho thấy đã được iterate qua nhiều round review.
+The codebase is mature, uses **TypeScript strict mode** (`noImplicitAny`, `strict: true`), has a broad test suite (1596 tests pass, 2 skipped, 0 failures), clear layered architecture (extension / runtime / state / worktree / utils), and many defensive notes ("3.1 backpressure", "2.10 cache", "P1 catch unhandled errors") indicating it has been iterated over many review rounds.
-### Kết quả health-check nhanh
+### Quick health-check results
-| Hạng mục | Kết quả |
+| Category | Result |
 |---|---|
 | `npm run typecheck` (`tsc --noEmit` + strip-types import) | PASS |
 | `npm run test:unit` (1598 tests / 128 suites) | 1596 pass · 2 skip · 0 fail (~90s) |
-| `npm pack --dry-run` (qua `npm run ci`) | Không kiểm tra trong session này |
-| Linter (ESLint) | Không có script `lint`; dựa vào `tsc strict` |
-| Số file `.ts` trong `src/` | ~190 modules |
+| `npm pack --dry-run` (via `npm run ci`) | Not checked in this session |
+| Linter (ESLint) | No `lint` script; relies on `tsc strict` |
+| Number of `.ts` files in `src/` | ~190 modules |
 ---
-## 1. Điểm mạnh đáng ghi nhận
+## 1. Notable strengths
-1. **Path-safety nhất quán**: `utils/safe-paths.ts` (`assertSafePathId`, `resolveContainedPath`, `resolveRealContainedPath`) được dùng đồng đều ở `state-store.ts`, `artifact-store.ts`, `mailbox.ts`. Có cả hai lớp: containment check theo string và real-path check (chống symlink escape sau khi mkdir).
-2. **Atomic write nhiều lớp phòng thân** (`state/atomic-write.ts`):
-   - `O_EXCL | O_CREAT | O_NOFOLLOW` khi mở temp file.
-   - `fstatSync` post-open để verify regular-file (chống TOCTOU trên Windows nơi `O_NOFOLLOW = 0`).
-   - Rename retry với exponential backoff + jitter (chống lockstep starvation).
-   - Coalesced variant `atomicWriteJsonCoalesced` cho high-frequency state writes; flush trên `exit`/`SIGTERM`/`SIGINT`.
-3. **Redaction (`utils/redaction.ts`)** xử lý nhiều pattern: PEM private key, Authorization headers, Bearer tokens, inline secret patterns, key-name match (`apiKey`, `password`, `secret`, ...). Áp dụng cả ở `appendEvent`, `appendMailboxMessage`, `writeArtifact`, `appendTranscript`.
-4. **Env sanitization (`utils/env-filter.ts`)**: secret-pattern deny-list mặc định, allow-list mode cho `worktree.setupHook` để chỉ pass `PATH`, `HOME`, `PI_*`.
+1. **Consistent path-safety**: `utils/safe-paths.ts` (`assertSafePathId`, `resolveContainedPath`, `resolveRealContainedPath`) is used uniformly in `state-store.ts`, `artifact-store.ts`, `mailbox.ts`. It has two layers: a string-based containment check and a real-path check (defends against symlink escape after mkdir).
+2. **Multi-layered defensive atomic writes** (`state/atomic-write.ts`):
+   - `O_EXCL | O_CREAT | O_NOFOLLOW` when opening the temp file.
+   - `fstatSync` post-open to verify a regular file (defends against TOCTOU on Windows where `O_NOFOLLOW = 0`).
+   - Rename retry with exponential backoff + jitter (defends against lockstep starvation).
+   - Coalesced variant `atomicWriteJsonCoalesced` for high-frequency state writes; flush on `exit`/`SIGTERM`/`SIGINT`.
+3. **Redaction (`utils/redaction.ts`)** handles many patterns: PEM private keys, Authorization headers, Bearer tokens, inline secret patterns, key-name matching (`apiKey`, `password`, `secret`, ...). Applied in `appendEvent`, `appendMailboxMessage`, `writeArtifact`, `appendTranscript`.
+4. **Env sanitization (`utils/env-filter.ts`)**: default secret-pattern deny-list, allow-list mode for `worktree.setupHook` to pass only `PATH`, `HOME`, `PI_*`.
 5. **Process kill tree** (`runtime/child-pi.ts`):
-   - Windows: `taskkill /T /F` + verify-after-2s + retry nếu PID còn sống.
-   - POSIX: `process.kill(-pid, "SIGTERM")` (process group) với fallback absolute pid; SIGKILL escalation sau `HARD_KILL_MS`; fast-cancel SIGKILL sau 200ms khi user cancel.
-   - Lifecycle events có structured shape `{ type, pid, exitCode?, error?, ts }`.
-6. **Backpressure**: pause child stdout khi vượt 256KB chưa drain.
-7. **Lazy imports được đánh dấu `// LAZY:`** với lý do cụ thể (giảm ~1.4s import cost ở registration), kèm script `check:lazy-imports` để bảo vệ.
-8. **Run / task contract guards**: `shouldMergeTaskUpdate` (không cho stale snapshot regress terminal state), monotonic finishedAt, `canTransitionRunStatus`, plan-approval-gating cho mutating tasks.
-9. **Crash & cancellation paths**: `executeTeamRun` catch-all đảm bảo manifest/tasks chuyển sang terminal khi unhandled error (tránh "running mãi mãi"); `background-runner` có `unhandledRejection` guard ghi `async.failed` trước exit; `parent-guard` để background runner tự chết khi parent chết.
-10. **Test coverage rất rộng** cho cả happy path và edge cases (yield, atomic-write retry, mergeTaskUpdates, mailbox validation, cancellation, model fallback...).
+   - Windows: `taskkill /T /F` + verify-after-2s + retry if PID is still alive.
+   - POSIX: `process.kill(-pid, "SIGTERM")` (process group) with an absolute-pid fallback; SIGKILL escalation after `HARD_KILL_MS`; fast-cancel SIGKILL after 200ms on user cancel.
+   - Lifecycle events have a structured shape `{ type, pid, exitCode?, error?, ts }`.
+6. **Backpressure**: pause child stdout when more than 256KB is undrained.
+7. **Lazy imports marked with `// LAZY:`** with a specific reason (reduces ~1.4s import cost at registration), plus a `check:lazy-imports` script to enforce it.
+8. **Run / task contract guards**: `shouldMergeTaskUpdate` (prevents a stale snapshot from regressing terminal state), monotonic finishedAt, `canTransitionRunStatus`, plan-approval-gating for mutating tasks.
+9. **Crash & cancellation paths**: `executeTeamRun` catch-all ensures the manifest/tasks transition to terminal on unhandled error (avoids "running forever"); `background-runner` has an `unhandledRejection` guard that writes `async.failed` before exit; `parent-guard` so the background runner dies when its parent dies.
+10. **Very broad test coverage** for both happy paths and edge cases (yield, atomic-write retry, mergeTaskUpdates, mailbox validation, cancellation, model fallback...).
 11. **Config**:
-    - Schema validate qua TypeBox với fuzzy suggestions cho key sai chính tả.
-    - **Sanitize project-level config** (`sanitizeProjectConfig`): loại bỏ những key nhạy cảm (`executeWorkers`, `runtime.mode`, `worktree.setupHook`, `otlp.headers`, `agents.overrides`, …) ra khỏi project config, chỉ chấp nhận từ user config. Đây là phòng thân thiết yếu cho repo bị inject.
+    - Schema validation via TypeBox with fuzzy suggestions for misspelled keys.
+    - **Sanitize project-level config** (`sanitizeProjectConfig`): strips sensitive keys (`executeWorkers`, `runtime.mode`, `worktree.setupHook`, `otlp.headers`, `agents.overrides`, …) from the project config, accepting them only from user config. This is an essential safeguard for an injected repo.
 ---
-## 2. Bugs / Issues phát hiện
+## 2. Bugs / Issues found
-> Phân loại: **HIGH** (có thể gây mất dữ liệu / sai correctness), **MED** (correctness corner case / DX), **LOW** (cải thiện).
+> Classification: **HIGH** (can cause data loss / incorrectness), **MED** (correctness corner case / DX), **LOW** (improvement).
 ### HIGH
-**H1. `event-log.ts` — silent loss khi vượt `MAX_EVENTS_BYTES` (50MB)**
+**H1. `event-log.ts` — silent loss when exceeding `MAX_EVENTS_BYTES` (50MB)**
 ```ts
 // src/state/event-log.ts ~ appendEventInsideLock
 if (fs.existsSync(eventsPath) && fs.statSync(eventsPath).size > MAX_EVENTS_BYTES) {
@@ -62,150 +62,150 @@ if (fs.existsSync(eventsPath) && fs.statSync(eventsPath).size > MAX_EVENTS_BYTES
     return { ...fullEvent, metadata: { ...(fullEvent.metadata ?? {seq:0,...}), appended: false } };
 }
 ```
-- Vấn đề: event bị bỏ ngay (kể cả terminal event như `run.failed`, `task.completed`) nhưng `appendCounter` cũng không tăng → `compactEventLog` (chỉ chạy mỗi 100 append) không được kích hoạt khi cần nhất. Hậu quả: một khi vượt ngưỡng, log bị "khoá" silently cho đến khi 100 append tiếp theo trigger rotation.
-- Đề xuất: khi gặp ngưỡng, gọi `compactEventLog(eventsPath)` ngay hoặc rotate trước rồi append; đồng thời ưu tiên cho phép terminal event (TERMINAL_EVENT_TYPES) đi qua, vì những event đó là durable contract.
+- Problem: the event is dropped immediately (including terminal events like `run.failed`, `task.completed`) and `appendCounter` is also not incremented → `compactEventLog` (which only runs every 100 appends) is not triggered when it is needed most. Consequence: once the threshold is crossed, the log is "locked" silently until the next 100 appends trigger a rotation.
+- Suggestion: when the threshold is hit, call `compactEventLog(eventsPath)` immediately, or rotate first then append; also prioritize letting terminal events (TERMINAL_EVENT_TYPES) through, since those events are part of the durable contract.
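
The H1 suggestion (rotate first, then append, and never drop terminal events) could look roughly like the sketch below. It is not the package's implementation: `appendWithRotation`, the `.1` rotation suffix, the `CrewEvent` shape, and the contents of `TERMINAL_EVENT_TYPES` are assumptions made for illustration; only the constant names and the 50MB limit come from the review.

```typescript
import * as fs from "node:fs";

// Assumed values: the review names these constants but does not list their contents.
const MAX_EVENTS_BYTES = 50 * 1024 * 1024;
const TERMINAL_EVENT_TYPES = new Set(["run.failed", "run.completed", "task.completed", "task.failed"]);

interface CrewEvent {
  type: string;
  [key: string]: unknown;
}

/**
 * Hypothetical helper: rotates an oversized log to `<path>.1` before appending,
 * so the append that crosses the threshold is not silently dropped.
 * Returns false only when rotation fails and the event is non-terminal.
 */
function appendWithRotation(eventsPath: string, event: CrewEvent, maxBytes = MAX_EVENTS_BYTES): boolean {
  let size = 0;
  try {
    size = fs.statSync(eventsPath).size;
  } catch {
    // The log does not exist yet; appendFileSync below creates it.
  }
  if (size > maxBytes) {
    try {
      // Replaces any previous rotation file.
      fs.renameSync(eventsPath, `${eventsPath}.1`);
    } catch {
      // Rotation failed: drop ordinary events, but still let terminal events through.
      if (!TERMINAL_EVENT_TYPES.has(event.type)) return false;
    }
  }
  fs.appendFileSync(eventsPath, `${JSON.stringify(event)}\n`, "utf-8");
  return true;
}
```

The sketch leaves out the locking that the real `appendEventInsideLock` runs under; rotation and append would need to happen inside that same lock.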
-**H2. `mailbox.ts` — `appendMailboxMessage` không có lock cross-process**
+**H2. `mailbox.ts` — `appendMailboxMessage` has no cross-process lock**
 ```ts
 fs.appendFileSync(mailboxFile(manifest, complete.direction, complete.taskId), `${JSON.stringify(...)}\n`, "utf-8");
 ```
-- Vấn đề: `appendFileSync` không nguyên tử trên Windows giữa các process. Hai background runners + foreground steer cùng lúc có thể interleave JSON lines → `parseMailboxMessage` skip, message bị mất silently (lỗi report sau qua `validateMailbox`).
-- Đề xuất: dùng pattern `withEventLogLockSync` (đã có sẵn) cho mailbox, hoặc dùng `atomicWriteFile` để rewrite (chậm hơn nhưng nguyên tử). Tối thiểu nên thêm `O_APPEND` nguyên tử trên POSIX (chỉ guarantee tới PIPE_BUF) và lock trên Windows.
+- Problem: `appendFileSync` is not atomic across processes on Windows. Two background runners + foreground steering at the same time can interleave JSON lines → `parseMailboxMessage` skips them, messages are lost silently (reported later via `validateMailbox`).
+- Suggestion: use the existing `withEventLogLockSync` pattern for the mailbox, or use `atomicWriteFile` to rewrite (slower but atomic). At minimum, add atomic `O_APPEND` on POSIX (only guaranteed up to PIPE_BUF) and a lock on Windows.
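
One way to serialize mailbox appends across processes, as H2 suggests, is an exclusive lock file. The sketch below is a generic pattern, not the package's `withEventLogLockSync`; `withFileLockSync`, `appendLineLocked`, the `.lock` suffix, and the retry parameters are hypothetical names and values.

```typescript
import * as fs from "node:fs";

/** Blocks the current thread for `ms` milliseconds without spinning. */
function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

/**
 * Hypothetical helper: runs `fn` while holding an exclusive lock file.
 * The "wx" flag makes creation fail with EEXIST if another process holds the lock.
 * Stale-lock detection is deliberately left out of this sketch.
 */
function withFileLockSync<T>(lockPath: string, fn: () => T, retries = 50, delayMs = 10): T {
  for (let attempt = 0; ; attempt++) {
    try {
      fs.closeSync(fs.openSync(lockPath, "wx"));
      break;
    } catch (err) {
      const code = (err as NodeJS.ErrnoException).code;
      if (code !== "EEXIST" || attempt >= retries) throw err;
      sleepSync(delayMs);
    }
  }
  try {
    return fn();
  } finally {
    // Release only a lock this call acquired.
    fs.rmSync(lockPath, { force: true });
  }
}

/** Appends one JSON line while holding the lock, so lines cannot interleave. */
function appendLineLocked(filePath: string, message: unknown, retries = 50): void {
  withFileLockSync(`${filePath}.lock`, () => {
    fs.appendFileSync(filePath, `${JSON.stringify(message)}\n`, "utf-8");
  }, retries);
}
```

A crashed holder leaves the lock file behind, so a production version would also need an expiry or owner-PID check.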
-**H3. `atomic-write.ts` — fallback `writeFileSync` không có symlink guard**
+**H3. `atomic-write.ts` — fallback `writeFileSync` has no symlink guard**
 ```ts
 try { renameWithRetry(tempPath, filePath); }
 catch (renameError) {
-    try { fs.writeFileSync(filePath, content, "utf-8"); } // BYPASS symlink guard
+    try { fs.writeFileSync(filePath, content, "utf-8"); } // BYPASSES symlink guard
     catch { throw renameError; }
 }
 ```
-- Vấn đề: nếu rename fail với EPERM trên Windows, fallback đi trực tiếp `writeFileSync(filePath)` — nếu `filePath` được tạo thành symlink giữa `isSymlinkSafePath` check (top of function) và fallback, write sẽ follow link. Time window nhỏ nhưng có thể bị adversary trên multi-user host.
-- Đề xuất: trước fallback, re-check `fs.lstatSync(filePath).isSymbolicLink()`. Hoặc mở fd với `O_NOFOLLOW` và `O_TRUNC` rồi write.
+- Problem: if rename fails with EPERM on Windows, the fallback goes directly to `writeFileSync(filePath)` — if `filePath` becomes a symlink between the `isSymlinkSafePath` check (top of function) and the fallback, the write follows the link. The time window is small but could be exploited by an adversary on a multi-user host.
+- Suggestion: before the fallback, re-check `fs.lstatSync(filePath).isSymbolicLink()`. Or open an fd with `O_NOFOLLOW` and `O_TRUNC` then write.
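
The H3 suggestion combines two guards: an `lstat` re-check immediately before the fallback, and an open with `O_NOFOLLOW`. A minimal sketch follows, assuming a hypothetical helper name `writeFileNoFollow`; it is not taken from `atomic-write.ts`.

```typescript
import * as fs from "node:fs";

/**
 * Hypothetical fallback writer that refuses to write through a symlink.
 * It narrows the race window described in H3 but does not make the write atomic.
 */
function writeFileNoFollow(filePath: string, content: string): void {
  // lstat inspects the link itself rather than its target.
  const existing = fs.lstatSync(filePath, { throwIfNoEntry: false });
  if (existing?.isSymbolicLink()) {
    throw new Error(`refusing to write through symlink: ${filePath}`);
  }
  // O_NOFOLLOW may be unavailable on Windows; fall back to 0 and rely on the lstat check there.
  const noFollow = fs.constants.O_NOFOLLOW ?? 0;
  const flags = fs.constants.O_WRONLY | fs.constants.O_CREAT | fs.constants.O_TRUNC | noFollow;
  const fd = fs.openSync(filePath, flags, 0o600);
  try {
    // Verify on the open descriptor that we hold a regular file.
    if (!fs.fstatSync(fd).isFile()) {
      throw new Error(`not a regular file: ${filePath}`);
    }
    fs.writeFileSync(fd, content, "utf-8");
  } finally {
    fs.closeSync(fd);
  }
}
```

On POSIX the `O_NOFOLLOW` open fails if the final path component is a symlink, which covers the interval between the `lstat` check and the open.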
-**H4. `team-runner.ts` — Tên hàm `__test__mergeTaskUpdates` bị dùng trong production**
+**H4. `team-runner.ts` — function named `__test__mergeTaskUpdates` is used in production**
 ```ts
-// Re-export documented as test-only:
+// Re-exported and documented as test-only:
 export function __test__mergeTaskUpdates(...) { ... }
-// nhưng được gọi trong executeTeamRunCore:
+// but called in executeTeamRunCore:
 tasks = __test__mergeTaskUpdates(tasks, results);
 ```
-- Vấn đề: convention `__test__` ngụ ý chỉ test mới import; thực ra đây là core merge logic của runner. Một dev khác có thể "dọn" helper này hoặc thay đổi behavior nghĩ rằng chỉ ảnh hưởng test → silent regression.
-- Đề xuất: đổi tên `mergeTaskUpdatesPreservingTerminal()` (hoặc tương tự), giữ `__test__mergeTaskUpdates` làm alias export-only cho test, ghi comment.
+- Problem: the `__test__` convention implies only tests should import it; this is actually the runner's core merge logic. Another developer might "clean up" this helper or change its behavior thinking it only affects tests → silent regression.
+- Suggestion: rename to `mergeTaskUpdatesPreservingTerminal()` (or similar), keep `__test__mergeTaskUpdates` as an export-only alias for tests, add a comment.
 ### MED
 **M1. `task-runner.ts` — `transcriptPath` reused across model fallback attempts**
-- Mỗi attempt append vào cùng file transcript. `parsePiJsonOutput(fs.readFileSync(transcriptPath, "utf-8"))` parse toàn bộ → final text/usage có thể mixed giữa attempts. `resultArtifact.content` lấy `parsedOutput?.finalText` có thể là final của attempt 1 (đã fail) nếu attempt 2 không có message_end hợp lệ.
-- Đề xuất: hoặc dùng `transcripts/${task.id}.attempt-${i}.jsonl` per attempt, hoặc clear file đầu mỗi attempt nếu chính sách là "last attempt wins".
+- Each attempt appends to the same transcript file. `parsePiJsonOutput(fs.readFileSync(transcriptPath, "utf-8"))` parses everything → final text/usage may be mixed across attempts. `resultArtifact.content` takes `parsedOutput?.finalText`, which could be the final text of attempt 1 (which failed) if attempt 2 has no valid message_end.
+- Suggestion: either use `transcripts/${task.id}.attempt-${i}.jsonl` per attempt, or clear the file at the start of each attempt if the policy is "last attempt wins".
-**M2. `task-runner.ts` — read toàn bộ transcript vào memory cho `transcriptArtifact`**
+**M2. `task-runner.ts` — reads the entire transcript into memory for `transcriptArtifact`**
 ```ts
 content: fs.readFileSync(transcriptPath, "utf-8"),
 ```
-- Với task chạy lâu, transcript có thể vài chục MB. Cộng với việc compactChildPiEvent đã giảm size, nhưng vẫn unbounded. `MAX_CAPTURE_BYTES` chỉ áp dụng cho `stdout/stderr` in-memory, không cho transcript on-disk.
-- Đề xuất: cap transcript file size (rotate khi vượt ngưỡng) hoặc artifact dùng reference (đường dẫn) thay vì copy nội dung.
+- For long-running tasks the transcript can be tens of MB. Combined with compactChildPiEvent already reducing size, it is still unbounded. `MAX_CAPTURE_BYTES` only applies to in-memory `stdout/stderr`, not to the on-disk transcript.
+- Suggestion: cap the transcript file size (rotate when exceeding a threshold) or have the artifact use a reference (path) instead of copying content.
-**M3. `cleanup.ts` — `fs.statSync(worktreePath).isDirectory()` không guard race**
+**M3. `cleanup.ts` — `fs.statSync(worktreePath).isDirectory()` has no race guard**
 ```ts
 for (const entry of fs.readdirSync(worktreeRoot)) {
     const worktreePath = path.join(worktreeRoot, entry);
     if (!fs.statSync(worktreePath).isDirectory()) continue;
 ```
-- Nếu entry bị xóa giữa `readdirSync` và `statSync`, throw uncaught.
-- Đề xuất: bọc `try { fs.statSync... } catch { continue; }` hoặc dùng `fs.readdirSync(worktreeRoot, { withFileTypes: true })` rồi `entry.isDirectory()`.
+- If the entry is deleted between `readdirSync` and `statSync`, it throws uncaught.
+- Suggestion: wrap in `try { fs.statSync... } catch { continue; }` or use `fs.readdirSync(worktreeRoot, { withFileTypes: true })` then `entry.isDirectory()`.
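
The second option in M3, reading directory entries with their types in one call, removes the separate `statSync` and with it the race. A short sketch, with `listWorktreeDirs` as a hypothetical wrapper name:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

/**
 * Hypothetical helper: lists sub-directories of worktreeRoot without a per-entry stat,
 * so an entry deleted after the readdir call cannot cause a throw here.
 */
function listWorktreeDirs(worktreeRoot: string): string[] {
  const dirs: string[] = [];
  for (const entry of fs.readdirSync(worktreeRoot, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    dirs.push(path.join(worktreeRoot, entry.name));
  }
  return dirs;
}
```

One behavioral difference to keep in mind: `Dirent.isDirectory()` returns false for a symlink that points at a directory, whereas `fs.statSync(...).isDirectory()` follows the link.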
-**M4. `worktree-manager.ts` — `runSetupHook` parse JSON chỉ từ dòng cuối**
+**M4. `worktree-manager.ts` — `runSetupHook` parses JSON only from the last line**
 ```ts
 const lastLine = lines[lines.length - 1] ?? trimmed;
 const parsed = JSON.parse(lastLine);
 ```
-- Nếu hook xuất multi-line JSON (pretty-print) thì chỉ parse được dòng cuối → silently mất `syntheticPaths`. Đã có log warning, nhưng silent về phía caller.
-- Đề xuất: thử parse `trimmed` trước, fallback last-line. Hoặc đặt protocol rõ ràng (one-line JSON, terminator marker).
+- If the hook outputs multi-line JSON (pretty-printed), only the last line is parsed → `syntheticPaths` are silently lost. There is a log warning, but it is silent from the caller's side.
+- Suggestion: try parsing `trimmed` first, fall back to the last line. Or define a clear protocol (one-line JSON, terminator marker).
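
The M4 suggestion (parse the whole trimmed output first, then fall back to the last line) can be sketched as below. `parseHookOutput` is a hypothetical name, and returning `undefined` on failure is an assumption about how the caller would want errors surfaced.

```typescript
/**
 * Hypothetical parser for setup-hook stdout.
 * Tries the full output first so pretty-printed JSON works, then falls back
 * to the last line for hooks that print log lines before a one-line JSON result.
 */
function parseHookOutput(stdout: string): unknown {
  const trimmed = stdout.trim();
  if (!trimmed) return undefined;
  try {
    return JSON.parse(trimmed);
  } catch {
    // Not a single JSON document; try the last line below.
  }
  const lines = trimmed.split(/\r?\n/);
  const lastLine = lines[lines.length - 1] ?? trimmed;
  try {
    return JSON.parse(lastLine);
  } catch {
    return undefined;
  }
}
```

With this order, `parseHookOutput(JSON.stringify({ syntheticPaths: ["a"] }, null, 2))` yields the full object instead of failing on a lone closing brace.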
-**M5. `worktree-manager.ts` — `linkNodeModulesIfPresent` không cảnh báo khi `symlinkSync` fail**
+**M5. `worktree-manager.ts` — `linkNodeModulesIfPresent` does not warn when `symlinkSync` fails**
 ```ts
 try { fs.symlinkSync(...); return true; } catch { return false; }
 ```
-- Trên Windows không có quyền tạo symlink (yêu cầu SeCreateSymbolicLinkPrivilege), fail im lặng, agent chạy mà thiếu `node_modules` — có thể fail moduleResolution nhưng caller không biết.
-- Đề xuất: log lý do fail (đặc biệt cho Windows non-admin) qua `logInternalError`, hoặc trả về `{ linked, reason }`.
+- On Windows without the right to create symlinks (requires SeCreateSymbolicLinkPrivilege), it fails silently, the agent runs without `node_modules` — module resolution may fail but the caller does not know.
+- Suggestion: log the reason for failure (especially for non-admin Windows) via `logInternalError`, or return `{ linked, reason }`.
-**M6. `child-pi.ts` — `forcedFinalDrain` ép `exitCode: 0`**
+**M6. `child-pi.ts` — `forcedFinalDrain` forces `exitCode: 0`**
 ```ts
 const finalExitCode = forcedFinalDrain && !timeoutError ? 0 : exitCode;
 ```
-- Logic này (đã comment giải thích) chuyển một số exit ≠ 0 thành 0 sau khi child gửi final assistant event. Edge case: child crash trong cleanup sau final event → vẫn report success. Có thể che giấu memory leak hoặc crash trong child Pi.
-- Đề xuất: thêm telemetry/metric đếm số lần `forcedFinalDrain → 0` để phát hiện regression. Hiện tại chỉ có lifecycle event "final_drain" nhưng không có metric đếm conversion.
+- This logic (already explained in a comment) converts some exit ≠ 0 into 0 after the child sends the final assistant event. Edge case: child crashes during cleanup after the final event → still reports success. This could mask a memory leak or crash in the child Pi.
+- Suggestion: add telemetry/metrics counting how often `forcedFinalDrain → 0` happens to detect regressions. Currently there is only a lifecycle event "final_drain" but no conversion metric.
-**M7. `background-runner.ts` — `process.exit(130)` trong interrupt guard không await flush**
+**M7. `background-runner.ts` — `process.exit(130)` in the interrupt guard does not await flush**
 ```ts
 if (last?.type === "interrupt" && last?.acknowledged !== true) {
     appendEvent(...);
     process.exit(130);
 }
 ```
-- `process.exit` chạy `'exit'` handler nhưng không await async ops (e.g., `appendEventBuffered` Promise đang chờ). `flushEventLogBuffer` đăng ký trên `'exit'` là sync nên OK, nhưng `terminateLiveAgentsForRun` thì không. Có thể leak live agent.
-- Đề xuất: `await terminateLiveAgentsForRun(...)` rồi mới exit, hoặc dùng `process.exitCode = 130` + return để cleanup chạy bình thường.
+- `process.exit` runs the `'exit'` handler but does not await async ops (e.g., a pending `appendEventBuffered` Promise). `flushEventLogBuffer` registered on `'exit'` is sync so it's OK, but `terminateLiveAgentsForRun` is not. It could leak a live agent.
+- Suggestion: `await terminateLiveAgentsForRun(...)` before exiting, or use `process.exitCode = 130` + return so cleanup runs normally.
 **M8. `state-store.ts` — manifest cache TTL invariant**
-- Cache key là `stateRoot`, TTL 5 phút. Path validation phòng trường hợp manifest paths đổi. Nhưng nếu file mtime + size không đổi (extremely rare nhưng có thể với atomic-write coalesced khi same size & content), cache phục vụ stale content.
-- Đề xuất: thêm `contentHash` (cheap để stat → fingerprint kiểu first 32 bytes) trong cache key, hoặc invalidate cache trong `atomicWriteJsonCoalesced` flush callback.
+- The cache key is `stateRoot`, TTL 5 minutes. Path validation guards against manifest paths changing. But if the file mtime + size do not change (extremely rare but possible with coalesced atomic writes when the size & content are the same), the cache serves stale content.
+- Suggestion: add a `contentHash` (cheap to stat → a fingerprint like the first 32 bytes) to the cache key, or invalidate the cache in the `atomicWriteJsonCoalesced` flush callback.
-**M9. `event-log.ts` — `sequenceCache` không invalidated khi file truncate ngoài**
-- Nếu external tool truncate `events.jsonl` (rotate manual), cached `seq` vẫn cao, làm `nextSequence` sinh seq sai (đã có fallback: `cached.size === stat.size`). OK với same-size race, nhưng nếu truncate xảy ra giữa `statSync` và `appendFileSync`, hai append sẽ có cùng seq.
-- Đề xuất: persistSequence hiện đã dùng atomic write, có thể trust nó trong race. Test integration cho external truncate.
+**M9. `event-log.ts` — `sequenceCache` not invalidated when the file is truncated externally**
+- If an external tool truncates `events.jsonl` (manual rotation), the cached `seq` stays high, making `nextSequence` produce wrong seqs (there is a fallback: `cached.size === stat.size`). OK for the same-size race, but if truncation happens between `statSync` and `appendFileSync`, two appends will have the same seq.
+- Suggestion: persistSequence already uses atomic write, you can trust it in the race. Add an integration test for external truncation.
 **M10. `runtime-resolver` / config — `executeWorkers=false` default fallback path**
-- `handleResume` có logic phức tạp re-evaluate `runtime.mode` khi resume scaffold runs. Logic 3-cách (`resumeManifest.runtimeResolution?.safety === "explicit_dry_run"` + env var checks) dễ dẫn đến edge case nơi user expect actual workers nhưng resume vẫn scaffold. Khó test.
-- Đề xuất: refactor thành state machine rõ ràng `resolveResumeRuntime({ original, override, env })` với unit test full truth table.
+- `handleResume` has complex logic to re-evaluate `runtime.mode` when resuming scaffold runs. The 3-way logic (`resumeManifest.runtimeResolution?.safety === "explicit_dry_run"` + env var checks) easily leads to an edge case where the user expects actual workers but resume is still scaffold. Hard to test.
+- Suggestion: refactor into a clear state machine `resolveResumeRuntime({ original, override, env })` with unit tests covering the full truth table.
 ### LOW
-- **L1. `package.json` thiếu `lint` script**; `AGENTS.md` global có quy ước `eslint --max-warnings=0`. Hiện chỉ dựa vào `tsc strict`. Cân nhắc thêm ESLint hoặc Biome.
-- **L2. Many `JSON.stringify(value, null, 2)` cho metadata artifact**. Pretty-printing 50+ artifact/task tốn I/O. Cân nhắc minified JSON cho metadata, pretty chỉ cho summary/progress mà user đọc.
-- **L3. `task-runner.ts` tạo ~13 artifacts cho mỗi task** (prompt, result, inputs, coordination, skill, packet, verification, startup, permission, capability, prompt-pipeline, log, transcript, diff, diff-stat, output-validation). Mỗi cái là một `atomicWriteFile` syscall. Trong run lớn (50+ tasks), giảm xuống sub-artifacts hợp nhất sẽ giúp giảm I/O đáng kể.
-- **L4. `registerYieldTool()` chạy ở module top-level** (`task-runner.ts` dòng 35). Side-effect khi import — nếu module bị import 2 lần (e.g., jiti vs strip-types), `subprocessToolRegistry` có thể duplicate. Kiểm tra `subprocess-tool-registry.ts` xem có idempotent không.
-- **L5. `atomic-write.ts` `atomicWriteJsonCoalesced`** — API có caveat đáng kể (read-after-write trong buffer window đọc stale content). Risk surface lớn nếu future dev quên gọi `flushPendingAtomicWrites()`. Cân nhắc thêm read API riêng `readJsonFileWithCoalesceFlush()`.
-- **L6. Cancellation paths không có metric đếm**. Đã có observability events nhưng không có gauge số task cancelled per run.
-- **L7. `management.ts` `handleUpdate` rename+write** sequence không có rollback nếu writeFileSync fail sau rename (backup tồn tại, nhưng user phải manually restore). Có thể wrap trong try/catch + auto-restore from backup.
-- **L8. `child-pi.ts` mock paths đọc env `PI_TEAMS_MOCK_CHILD_PI`** — nên có guard không cho prod accidentally bật (kiểm tra `process.env.NODE_ENV === "test"` hoặc test-flag rõ ràng).
-- **L9. `worktree-manager.ts` `findGitRoot` throws** nếu cwd không phải git repo. `prepareTaskWorkspace` gọi nó trước khi check workspaceMode; thực ra check workspaceMode đầu hàm rồi, OK. Nhưng error message từ git ("not a git repository") sẽ propagate lên user — không user-friendly.
-- **L10. Naming `crewRoot` vs `.crew/` vs `.pi/teams/`** đã có doc nhưng dễ confuse. `projectCrewRoot` có cả ba branch (existing `.crew` → `.crew`; existing `.pi` → `.pi/teams`; else → `.crew`). Test có cover nhưng dev mới khi xem code dễ nghĩ nhầm.
-- **L11. Một số `let task: TeamTaskState = ...` rồi reassign nhiều lần trong `task-runner.ts`**. Hard to reason. Cân nhắc refactor thành reducer pattern.
-- **L12. `update-references-for-rename` chỉ cập nhật team→agent và team.defaultWorkflow**, không cover workflow→step.role hay agent references trong test fixtures. Comment đã ghi nhận. Vẫn nên fix để rename an toàn.
+- **L1. `package.json` missing a `lint` script**; the global `AGENTS.md` has a convention `eslint --max-warnings=0`. Currently it only relies on `tsc strict`. Consider adding ESLint or Biome.
+- **L2. Many `JSON.stringify(value, null, 2)` for metadata artifacts**. Pretty-printing 50+ artifact/task files costs I/O. Consider minified JSON for metadata; pretty only for summary/progress that users read.
+- **L3. `task-runner.ts` creates ~13 artifacts per task** (prompt, result, inputs, coordination, skill, packet, verification, startup, permission, capability, prompt-pipeline, log, transcript, diff, diff-stat, output-validation). Each is an `atomicWriteFile` syscall. In a large run (50+ tasks), consolidating into fewer sub-artifacts would significantly reduce I/O.
+- **L4. `registerYieldTool()` runs at module top level** (`task-runner.ts` line 35). Side effect on import — if the module is imported twice (e.g., jiti vs strip-types), `subprocessToolRegistry` could be duplicated. Check whether `subprocess-tool-registry.ts` is idempotent.
+- **L5. `atomic-write.ts` `atomicWriteJsonCoalesced`** — the API has a significant caveat (read-after-write within the buffer window reads stale content). Large risk surface if a future dev forgets to call `flushPendingAtomicWrites()`. Consider adding a dedicated read API `readJsonFileWithCoalesceFlush()`.
+- **L6. Cancellation paths have no counting metric**. There are observability events but no gauge for the number of tasks cancelled per run.
+- **L7. `management.ts` `handleUpdate` rename+write** sequence has no rollback if writeFileSync fails after rename (a backup exists, but the user must manually restore). Could wrap in try/catch + auto-restore from backup.
+- **L8. `child-pi.ts` mock paths read env `PI_TEAMS_MOCK_CHILD_PI`** — there should be a guard preventing accidental production activation (check `process.env.NODE_ENV === "test"` or a clear test flag).
+- **L9. `worktree-manager.ts` `findGitRoot` throws** if cwd is not a git repo. `prepareTaskWorkspace` checks workspaceMode at the top of the function before calling it, so the ordering is fine. However, the raw git error message ("not a git repository") still propagates to the user, which is not user-friendly.
+- **L10. The naming `crewRoot` vs `.crew/` vs `.pi/teams/`** is documented but easy to confuse. `projectCrewRoot` has three branches (existing `.crew` → `.crew`; existing `.pi` → `.pi/teams`; else → `.crew`). Tests cover it but a new dev reading the code can easily misunderstand.
+- **L11. Several `let task: TeamTaskState = ...` bindings are reassigned multiple times in `task-runner.ts`**. This is hard to reason about. Consider refactoring into a reducer pattern.
+- **L12. `update-references-for-rename` only updates team→agent and team.defaultWorkflow**; it does not cover workflow→step.role or agent references in test fixtures. The comment acknowledges this. Still worth fixing so renames are safe.
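The guard suggested in L8 could look roughly like the sketch below. Only the env name `PI_TEAMS_MOCK_CHILD_PI` and the `NODE_ENV === "test"` check come from the review; the function name `isMockChildPiAllowed` and its signature are hypothetical and are not taken from `child-pi.ts`.

```typescript
// Hypothetical helper (not pi-crew's actual API): decides whether the mock
// child-Pi path may be used. The env is passed in so it can be unit-tested.
type Env = Record<string, string | undefined>;

function isMockChildPiAllowed(env: Env = process.env): boolean {
  const value = env.PI_TEAMS_MOCK_CHILD_PI;
  const requested = value !== undefined && value !== "";
  if (!requested) return false;
  // Honor the mock only when the process explicitly runs as a test.
  return env.NODE_ENV === "test";
}
```

With a guard of this shape, a stray `PI_TEAMS_MOCK_CHILD_PI=1` in a production shell would be ignored rather than silently switching the runtime to mock output.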
 ---
 ## 3. Security review
-| Mục | Trạng thái | Ghi chú |
+| Item | Status | Notes |
 |---|---|---|
-| Path traversal | OK | `assertSafePathId`, `resolveContainedPath`, `resolveRealContainedPath` phủ khá đầy đủ. |
-| Symlink escape | OK (corner case H3) | `O_NOFOLLOW`, `lstatSync`, post-open `fstatSync`. Có 1 fallback path bỏ check (H3). |
-| Secret leak | OK | Redaction áp dụng đầu vào event log, transcript, mailbox, artifact. Env sanitization trước khi spawn child. |
-| Code injection via setup hook | Mitigated | `runSetupHook` validate file tồn tại, dùng `shell: false`, allow-list env, timeout 30s. Nhưng vẫn execute user-provided code. Phải tin user. |
-| Untrusted project config | OK | `sanitizeProjectConfig` strip key nhạy cảm trước khi merge. |
+| Path traversal | OK | `assertSafePathId`, `resolveContainedPath`, `resolveRealContainedPath` cover it fairly thoroughly. |
+| Symlink escape | OK (corner case H3) | `O_NOFOLLOW`, `lstatSync`, post-open `fstatSync`. One fallback path skips the check (H3). |
+| Secret leak | OK | Redaction is applied on input to the event log, transcript, mailbox, and artifacts. Env is sanitized before spawning the child. |
+| Code injection via setup hook | Mitigated | `runSetupHook` validates the file exists, uses `shell: false`, allow-lists env, 30s timeout. But it still executes user-provided code. Must trust the user. |
+| Untrusted project config | OK | `sanitizeProjectConfig` strips sensitive keys before merging. |
 | Process tree leak (zombie child Pi) | OK | `terminateActiveChildPiProcesses` + `parent-guard` + Windows `taskkill /T /F`. |
-| DoS qua concurrency | OK | Default hard-cap; `allowUnboundedConcurrency=true` cần explicit opt-in + emit event. |
-| Event log injection | Mitigated | JSON.stringify mỗi line; readEvents skip parse error. Có rủi ro JSON-line corrupted vì `appendFileSync` race (H2 trong mailbox, nhưng event log có lock). |
-| Dependency surface | Nhỏ | Chỉ runtime deps: typebox, cli-highlight, diff, jiti. |
+| DoS via concurrency | OK | Default hard-cap; `allowUnboundedConcurrency=true` requires explicit opt-in + emits an event. |
+| Event log injection | Mitigated | JSON.stringify per line; readEvents skips parse errors. Corrupted JSON lines from an `appendFileSync` race remain a risk for the mailbox (H2); the event log itself is protected by a lock. |
+| Dependency surface | Small | Only runtime deps: typebox, cli-highlight, diff, jiti. |
-Tóm lại: security posture **tốt**. Vấn đề lớn nhất là H2 (mailbox không lock) — có thể bị stale state nếu nhiều process race.
+In summary: the security posture is **good**. The biggest issue is H2 (mailbox has no lock) — stale state can occur if multiple processes race.
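One way to serialize mailbox appends across processes is an exclusive lock file, sketched below. This is an illustration under assumptions: the names `acquireLock` and `appendLineWithLock` are hypothetical, and the sketch does not reproduce how `withEventLogLockSync` is actually implemented in pi-crew.

```typescript
import * as fs from "node:fs";

// Hypothetical: take an exclusive lock by creating the lock file with "wx",
// which fails with EEXIST if another process already holds it.
function acquireLock(lockPath: string, timeoutMs: number): number {
  const deadline = Date.now() + timeoutMs;
  const sleeper = new Int32Array(new SharedArrayBuffer(4));
  for (;;) {
    try {
      return fs.openSync(lockPath, "wx");
    } catch (err) {
      if ((err as { code?: string }).code !== "EEXIST") throw err;
      if (Date.now() > deadline) throw new Error(`timed out waiting for ${lockPath}`);
      Atomics.wait(sleeper, 0, 0, 10); // block ~10ms instead of busy-spinning
    }
  }
}

// Hypothetical: append one JSON line while holding the lock.
function appendLineWithLock(filePath: string, line: string, timeoutMs = 2000): void {
  const lockPath = `${filePath}.lock`;
  const fd = acquireLock(lockPath, timeoutMs);
  try {
    fs.appendFileSync(filePath, line.endsWith("\n") ? line : `${line}\n`);
  } finally {
    fs.closeSync(fd);
    fs.unlinkSync(lockPath); // release the lock
  }
}
```

The sketch deliberately omits stale-lock recovery (a crashed holder leaves the lock file behind), which a real implementation would need to handle.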
 ---
 ## 4. Performance review
-- **Atomic write coalescer** (50ms window) đã giảm I/O cho high-frequency state writes.
-- **Manifest cache** với mtime+size key tránh re-parse khi không đổi.
-- **Lazy import boundaries** giảm import cost ~1.4s.
-- **`projectRootCache` TTL 30s** giảm 14 `existsSync` × ancestor levels mỗi render tick.
+- **Atomic write coalescer** (50ms window) has reduced I/O for high-frequency state writes.
+- **Manifest cache** with mtime+size key avoids re-parsing when unchanged.
+- **Lazy import boundaries** reduce import cost by ~1.4s.
+- **`projectRootCache` TTL 30s** saves 14 `existsSync` calls × ancestor levels per render tick.
-Nóng còn tiềm năng tối ưu:
-1. Mỗi task hoàn thành sinh ~13 artifacts (L3). 50 tasks = 650 atomic writes cho metadata. Cân nhắc batch.
-2. `progress.md` và `summary.md` được write lại nhiều lần per batch (writeProgress trong loop). Coalesce ổn nhưng có thể dùng `atomicWriteJsonCoalesced`.
-3. `parsePiJsonOutput(fs.readFileSync(transcriptPath))` chạy mỗi attempt, parse full transcript. Stream parsing rẻ hơn cho transcript lớn.
-4. `aggregateUsage(tasks)` chạy O(n) trên tasks mỗi summary write.
+Areas with optimization potential:
+1. Each completed task produces ~13 artifacts (L3). 50 tasks = 650 atomic writes for metadata. Consider batching.
+2. `progress.md` and `summary.md` are rewritten multiple times per batch (writeProgress in a loop). Coalescing is fine but `atomicWriteJsonCoalesced` could be used.
+3. `parsePiJsonOutput(fs.readFileSync(transcriptPath))` runs on each attempt, parsing the full transcript. Stream parsing is cheaper for large transcripts.
+4. `aggregateUsage(tasks)` runs O(n) over tasks on each summary write.
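Item 1 (batching metadata artifacts) could be approached as in the sketch below: one minified bundle per task, written via a temp file and a rename. The function name `writeMetadataBundle`, the `.metadata.json` suffix, and the bundle layout are assumptions for illustration; they do not describe pi-crew's `atomicWriteFile` or its artifact layout.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical bundle shape: artifact name -> artifact content.
type MetadataParts = Record<string, unknown>;

// Hypothetical: write all metadata artifacts of one task as a single file.
function writeMetadataBundle(dir: string, taskId: string, parts: MetadataParts): string {
  fs.mkdirSync(dir, { recursive: true });
  const finalPath = path.join(dir, `${taskId}.metadata.json`);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  // Minified JSON: no indent argument, unlike JSON.stringify(value, null, 2).
  fs.writeFileSync(tmpPath, JSON.stringify(parts));
  // Renaming within one directory replaces the target in a single step.
  fs.renameSync(tmpPath, finalPath);
  return finalPath;
}
```

Under these assumptions, a task that previously wrote several metadata files would perform one write and one rename instead, while user-facing files such as `summary.md` and `progress.md` stay separate and human-readable.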
 ---
@@ -214,40 +214,40 @@ Nóng còn tiềm năng tối ưu:
 | Aspect | Note |
 |---|---|
 | TS strict | OK, `noImplicitAny` enforced. |
-| Naming `__test__*` | Có lẫn lộn giữa pure test util và production helper (H4). |
-| File size | `team-runner.ts` (694 dòng), `task-runner.ts` (440+ dòng), `register.ts` (1k+ dòng), `live-session-runtime.ts` (~750 dòng) đều > 500 dòng. AGENTS.md đã nhắc "prefer small modules". |
-| Comment quality | Tốt — có "WHY" markers, version tags (`// 2.10`, `// H4`, `// 3.1`). |
-| Test layout | `test/unit/*.test.ts` + `test/integration/*.test.ts`. Concurrency hợp lý. |
-| Hard-coded magic numbers | Đã centralize vào `config/defaults.ts` cho phần lớn. |
-| Error reporting | `logInternalError` consistent — best-effort, không throw. |
-| Docs sync | `docs/architecture.md` khớp với code (trừ một số next-upgrade-roadmap chưa implement). |
+| Naming `__test__*` | Some mixing of pure test utils and production helpers (H4). |
+| File size | `team-runner.ts` (694 lines), `task-runner.ts` (440+ lines), `register.ts` (1k+ lines), `live-session-runtime.ts` (~750 lines) are all > 500 lines. AGENTS.md says "prefer small modules". |
+| Comment quality | Good — there are "WHY" markers, version tags (`// 2.10`, `// H4`, `// 3.1`). |
+| Test layout | `test/unit/*.test.ts` + `test/integration/*.test.ts`. Reasonable concurrency. |
+| Hard-coded magic numbers | Mostly centralized in `config/defaults.ts`. |
+| Error reporting | `logInternalError` is consistent — best-effort, does not throw. |
+| Docs sync | `docs/architecture.md` matches the code (except some next-upgrade-roadmap items not yet implemented). |
 ---
-## 6. Test-matrix gap (ứng viên thêm test)
+## 6. Test-matrix gaps (candidates for new tests)
-- Cross-process race trên mailbox append (H2).
-- Event log overflow recovery (H1) — đảm bảo terminal event vẫn được persist khi vượt 50MB.
-- `forcedFinalDrain` không che giấu real child crash (M6).
+- Cross-process race on mailbox append (H2).
+- Event log overflow recovery (H1) — ensure terminal events are still persisted when exceeding 50MB.
+- `forcedFinalDrain` does not mask a real child crash (M6).
 - Resume with mixed `runtime.mode` overrides (M10).
-- Atomic-write coalesced + read-after-write within window — đảm bảo doc behavior matches reality.
+- Atomic-write coalesced + read-after-write within the window — ensure documented behavior matches reality.
 - `linkNodeModulesIfPresent` Windows non-admin fallback (M5).
 - `runSetupHook` multi-line JSON output (M4).
 ---
-## 7. Đề xuất ưu tiên (sorted)
+## 7. Suggested priorities (sorted)
-1. **Fix H1** (event-log overflow): rotate ngay khi vượt ngưỡng + ưu tiên terminal events.
-2. **Fix H2** (mailbox lock): áp dụng `withEventLogLockSync` pattern cho mailbox append.
-3. **Fix H3** (atomic-write symlink TOCTOU): re-check lstat trước fallback `writeFileSync`.
-4. **Fix H4** (rename `__test__mergeTaskUpdates` → `mergeTaskUpdates`, giữ alias).
+1. **Fix H1** (event-log overflow): rotate immediately when the threshold is crossed + prioritize terminal events.
+2. **Fix H2** (mailbox lock): apply the `withEventLogLockSync` pattern to mailbox append.
+3. **Fix H3** (atomic-write symlink TOCTOU): re-check lstat before the `writeFileSync` fallback.
+4. **Fix H4** (rename `__test__mergeTaskUpdates` → `mergeTaskUpdates`, keep alias).
 5. **M1/M2** transcript per-attempt + cap size.
-6. **M3** race-safe `statSync` trong cleanup.
-7. **M6** thêm metric `crew.child.final_drain_force_zero_total`.
-8. **L1** thêm ESLint hoặc Biome cho consistency (AGENTS.md global yêu cầu).
-9. **L3** batch artifact writes cho metadata.
-10. **L12** mở rộng `updateReferencesForRename` cho workflow→step + agent references.
+6. **M3** race-safe `statSync` in cleanup.
+7. **M6** add a metric `crew.child.final_drain_force_zero_total`.
+8. **L1** add ESLint or Biome for consistency (global AGENTS.md requires it).
+9. **L3** batch artifact writes for metadata.
+10. **L12** expand `updateReferencesForRename` for workflow→step + agent references.
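For priority 3 (H3), the re-check before the fallback write might look like the following sketch. The helper name `writeFileNoFollow` is hypothetical and the code is not taken from `atomic-write.ts`; it only illustrates combining an `lstat` re-check with `O_NOFOLLOW` on platforms that define it.

```typescript
import * as fs from "node:fs";

// Hypothetical fallback writer that refuses to write through a symlink.
function writeFileNoFollow(filePath: string, data: string): void {
  // lstat inspects the link itself rather than its target.
  const stat = fs.lstatSync(filePath, { throwIfNoEntry: false });
  if (stat?.isSymbolicLink()) {
    throw new Error(`refusing to write through symlink: ${filePath}`);
  }
  // O_NOFOLLOW is not defined on Windows, so fall back to 0 there.
  const noFollow = fs.constants.O_NOFOLLOW ?? 0;
  const flags = fs.constants.O_WRONLY | fs.constants.O_CREAT | fs.constants.O_TRUNC | noFollow;
  const fd = fs.openSync(filePath, flags, 0o600);
  try {
    fs.writeFileSync(fd, data);
  } finally {
    fs.closeSync(fd);
  }
}
```

The `lstat` check alone still leaves a small window between check and open; where `O_NOFOLLOW` is available, the open itself fails on a symlink, which closes that window.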
 ---
@@ -259,13 +259,12 @@ node --experimental-strip-types -e "..."         → PASS (strip-types import ok
 node --test test/unit/*.test.ts                  → 1596 pass / 2 skip / 0 fail / 90s
 ```
-Không có lint command trong project (chỉ `tsc strict`), không tìm thấy file `.eslintrc*`.
+There is no lint command in the project (only `tsc strict`); no `.eslintrc*` file was found.
 ---
-## 9. Kết luận
+## 9. Conclusion
-`pi-crew` là một codebase **trưởng thành, kỷ luật cao**, có nhiều lớp phòng thân chống TOCTOU, race, và crash mid-write. Test coverage rộng, architecture rõ ràng. Các vấn đề tìm thấy chủ yếu là edge-case correctness và hardening, không có lỗ hổng nghiêm trọng nào ở mức "broken core flow".
-**Khuyến nghị**: ưu tiên fix H1–H4 và mở rộng test cho cross-process race (mailbox + event-log overflow). Tiếp theo là cân nhắc thêm linter, batch metadata artifact writes, và refactor một số orchestrator file lớn (`register.ts`, `team-runner.ts`, `live-session-runtime.ts`) thành sub-modules.
+`pi-crew` is a **mature, highly disciplined** codebase, with many defensive layers against TOCTOU, races, and mid-write crashes. Test coverage is broad and the architecture is clear. The issues found are mainly edge-case correctness and hardening; there is no serious "broken core flow" vulnerability.
+**Recommendation**: prioritize fixing H1–H4 and expanding tests for cross-process races (mailbox + event-log overflow). Next, consider adding a linter, batching metadata artifact writes, and refactoring some large orchestrator files (`register.ts`, `team-runner.ts`, `live-session-runtime.ts`) into sub-modules.