npm - pi-crew - Versions diffs - 0.9.9 → 0.9.10 - Mend

pi-crew 0.9.9 → 0.9.10


Files changed (35)

package/CHANGELOG.md +278 -0
package/docs/fixes/v0.9.10/locks-fix-verify.md +3 -0
package/docs/fixes/v0.9.10/smoke-test.md +12 -0
package/package.json +1 -1
package/src/extension/team-tool/doctor.ts +41 -18
package/src/runtime/child-pi.ts +122 -22
package/src/runtime/compact-pipeline.ts +56 -0
package/src/runtime/compact-stages/ansi-strip-stage.ts +25 -0
package/src/runtime/compact-stages/blank-collapse-stage.ts +31 -0
package/src/runtime/compact-stages/deduplicate-stage.ts +34 -0
package/src/runtime/compact-stages/head-snap-stage.ts +57 -0
package/src/runtime/compact-stages/index.ts +13 -0
package/src/runtime/compact-stages/tail-capture-stage.ts +72 -0
package/src/runtime/compact-stages/truncation-stage.ts +71 -0
package/src/runtime/handoff-manager.ts +10 -0
package/src/runtime/important-line-classifier.ts +130 -0
package/src/runtime/iteration-hooks.ts +7 -19
package/src/runtime/live-session-runtime.ts +50 -1
package/src/runtime/model-fallback.ts +29 -1
package/src/runtime/role-permission.ts +2 -2
package/src/runtime/stream-preview.ts +9 -2
package/src/runtime/task-output-context.ts +161 -27
package/src/runtime/task-runner.ts +76 -15
package/src/state/locks.ts +16 -0
package/src/state/state-store.ts +8 -2
package/src/ui/live-run-sidebar.ts +6 -1
package/src/ui/loaders.ts +24 -4
package/src/ui/run-dashboard.ts +6 -1
package/src/ui/run-event-bus.ts +1 -1
package/src/ui/run-snapshot-cache.ts +50 -16
package/src/ui/widget/index.ts +27 -5
package/src/ui/widget/widget-renderer.ts +43 -13
package/src/utils/redaction.ts +17 -1
package/src/utils/visual.ts +6 -0
package/src/ui/crew-widget.ts +0 -544

package/CHANGELOG.md CHANGED

@@ -1,5 +1,283 @@
 # Changelog
+## [v0.9.10 (continued)] — Round 29 follow-ups: BG2 sweep bug fixes, test optimization, E2E verification (2026-06-26)
+A full-suite verify run (`verify-full2`, 5502 tests, 774 suites) surfaced 4 file-level timeouts and 2 real correctness bugs. This release fixes the 2 real bugs, the underlying cause of 2 of the 4 timeouts (chain-runner + orphan-worker-registry + cleanup-full-flow self-deadlock + HandoffManager interval leak), and adds E2E verification artifacts to prove all fixes hold against the live runtime, not just static analysis.
+### Bug fixes (BG2 sweep)
+- **CountdownTimer drift** (commit cadb5b7, `src/ui/loaders.ts`). `setInterval`-based tick scheduling can skip a second value under event-loop load — the BG2 regression test caught the case where expected `[3, 2, 1, 0]` was emitted as `[3, 2, 0]`. Replaced with recursive `setTimeout` chain + `lastEmittedSeconds` guard so a busy event loop never drops a tick. `dispose()` updated to call `clearTimeout`. Test: `test/unit/loaders.test.ts` (5/5 pass).
+- **redactSecretString ReDoS regression** (commit cadb5b7, `src/utils/redaction.ts:177`). `isSecretKey` used `/[a-zA-Z0-9_-]/.test(value[j])` per character in a 100K-iteration loop — exceeded the 200ms budget on a `_`.repeat(100_000) + `=x` input. Replaced the regex call with a `charCodeAt` numeric check (`isKeyChar` helper, `src/utils/redaction.ts:223-232`). ~5× faster, no regex allocation, O(n²) → O(n) via early exit on the first non-key char. Test: `test/unit/redaction-p1f.test.ts` (4/4 pass).
+- **HandoffManager `setInterval` leak** (commit 5876c38, `src/runtime/handoff-manager.ts:202-213`). `startCleanupTimer()` did not call `.unref()` on the cleanup `setInterval`. Every `new HandoffManager()` in a test mock held an event-loop reference — `chain-runner.test.ts` created 42 instances and the test file's process never exited. Fix: `.unref()` on the interval handle at `:212-213`. Verified: 42/42 tests pass in 362ms (was hanging at 30s file-level timeout).
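The `charCodeAt` check described for `redactSecretString` above can be sketched roughly as follows. This is an illustrative reconstruction under assumptions, not the shipped code: the accepted character set is taken from the regex quoted in the entry, and `keyPrefixLength` is a hypothetical helper that exists only to show the early exit on the first non-key character.

```typescript
// Sketch only: classify a key character by UTF-16 code unit instead of
// running a regex per character. Accepts a-z, A-Z, 0-9, '_' and '-'.
function isKeyChar(code: number): boolean {
  return (
    (code >= 97 && code <= 122) || // a-z
    (code >= 65 && code <= 90) || // A-Z
    (code >= 48 && code <= 57) || // 0-9
    code === 95 || // _
    code === 45 // -
  );
}

// Hypothetical helper: length of the leading run of key characters.
// The loop stops at the first non-key character, so the scan is linear.
function keyPrefixLength(value: string): number {
  let i = 0;
  while (i < value.length && isKeyChar(value.charCodeAt(i))) i++;
  return i;
}
```

On the pathological input from the entry (`"_".repeat(100_000) + "=x"`), a scan of this shape touches each character once and allocates nothing per character.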
+### Lock re-entrance guard (Round 29 follow-up to BG2 sweep)
+- **`withFileLockSync` self-deadlock** (commit 7085d8d, `src/state/locks.ts:288`). The function was missing the re-entrance guard that its sibling functions `withRunLockSync` (line 333) and `withRunLock` (line 359+) already had via the `runLockHeldByUs` Map. When the same call stack tried to acquire the file lock twice on the same path (e.g. `registerWorker` → `cleanupOrphanWorkers` → `readRegistry`), the second acquisition read its own freshly-written lock file (same pid, fresh createdAt), failed the steal check, and retried for the full `staleMs` window — hanging `orphan-worker-registry.test.ts` and `cleanup-full-flow.test.ts` at 30s. Strace evidence in `.github/issues/pre-existing-2026-06-10/04-orphan-worker-registry-tests.md:75-86`. Smoking gun: `src/runtime/orphan-worker-registry.ts:220-221` already documents the bug in a code comment (the workaround was to skip the lock in one hot path; the tests hit it through a path the workaround did not cover). Fix: added parallel `fileLockHeldByUs` Map mirroring `runLockHeldByUs`; consult at function entry, set after acquire, delete in `finally`. Reuses the same shape so the two functions stay structurally parallel. Regression test: `test/unit/round29-file-lock-reentrance.test.ts` (5/5 pass, 547ms; `orphan-worker-registry` 15/15 496ms was hanging; `cleanup-full-flow` 4/4 1377ms was hanging).
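A minimal sketch of the re-entrance guard described above, assuming the pattern is "consult at entry, set after acquire, delete in `finally`". The `acquire` and `release` callbacks are hypothetical stand-ins for the real on-disk lock operations, and the value stored in the map is an assumption.

```typescript
// Sketch only: in-process registry of lock paths currently held by this process.
const fileLockHeldByUs = new Map<string, number>();

function withReentrantLockSync<T>(
  lockPath: string,
  acquire: (p: string) => void, // hypothetical: writes the lock file
  release: (p: string) => void, // hypothetical: removes the lock file
  fn: () => T,
): T {
  // Re-entrant call from the same stack: the lock is already ours, so run
  // the callback directly instead of waiting on our own lock file.
  if (fileLockHeldByUs.has(lockPath)) return fn();

  acquire(lockPath);
  fileLockHeldByUs.set(lockPath, Date.now());
  try {
    return fn();
  } finally {
    // Clear the marker before releasing so a later call re-acquires normally.
    fileLockHeldByUs.delete(lockPath);
    release(lockPath);
  }
}
```

With this shape, a nested call on the same path acquires the underlying lock once, so the inner call never retries against a lock file its own process just wrote.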
+### Doctor test optimization (16× speedup)
+- **Build-time + test-time doctor cost** (commit 8842e2c, `src/extension/team-tool/doctor.ts` + `test/unit/doctor-cov.test.ts`). `buildTeamDoctorReport` was NOT pure — it spawned `git --version` and `pi --version` via `spawnSync` (1-2s each), walked the filesystem 3× for discovery (agents/teams/workflows), and audited the JSON schema on every invocation. With 12 tests in `doctor-cov.test.ts` and 2 in `doctor-validation.test.ts`, the cost added up to 25.8s and 6.8s. Three independent fixes: (1) test-side: `doctor-cov` switched from `cwd: "/tmp"` to a fresh `mkdtempSync` cwd via `before`/`after` hooks; (2) production dedupe: hoisted `discoverX` calls to module-level consts (called twice — Drift + Discovery sections — now called once); (3) production memoize: `commandExists` and `piCommandExists` cached at module level. Cache is safe: a doctor check is informational, a stale `ok: true` self-corrects on next process restart, and the in-process discovery is what actually drives user-visible behavior in a long-running pi session. Result: `doctor-cov` 25.8s → 1.6s (16×), `doctor-validation` 6.8s → 3.0s (2.3×). `tsc --noEmit` clean.
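The module-level memoization mentioned in fix (3) might look something like the sketch below. The cache name and the `probe` parameter are hypothetical; `probe` stands in for the `spawnSync`-based check so the example stays self-contained.

```typescript
// Sketch only: module-level cache so each command is probed at most once
// per process lifetime.
const commandExistsCache = new Map<string, boolean>();

function memoizedCommandExists(
  command: string,
  probe: (command: string) => boolean, // hypothetical: the expensive check
): boolean {
  const cached = commandExistsCache.get(command);
  if (cached !== undefined) return cached;
  const result = probe(command);
  commandExistsCache.set(command, result);
  return result;
}
```

Both positive and negative results are cached here, which matches the trade-off the entry describes: a stale answer persists until the process restarts.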
+### Widget progress line flicker (v6 invariant format)
+- **`crew-widget` progress line coalesce** (commit 78cd813, `src/ui/widget/widget-renderer.ts`). The `├─ ...` run progress line flickered across renders because `progressPart` was recomputed from multiple optional sources on every snapshot. Reduced to the v6 invariant format: `${completed}/${agents.length} agents` only, with the surrounding `· toolCount · tokenCount · duration` field strip removed (those data points are still surfaced in the per-agent sub-lines below). Format is now stable across ticks, so the host Pi TUI sees no diff between consecutive renders → no flicker.
+### E2E verification artifacts
+- **`docs/fixes/v0.9.10/`** (commit 7bbda16) — two E2E smoke-test artifacts from real team runs, NOT unit tests:
+  - `smoke-test.md` — research workflow (`team_20260626102522_e00831a41ee1cdd8`) wrote a 3-bullet summary of the 3 fix commits with file:line citations. The writer agent writing the file is itself an end-to-end exercise of the v0.9.10 writer-permission fix.
+  - `locks-fix-verify.md` — implementation workflow (`team_20260626151258_edeadbe3c35de7de`) verified all four code paths affected by the 3 commits (CountdownTimer, redactSecretString, HandoffManager, withFileLockSync) in a single multi-agent run; reviewer re-verified every line citation against the source.
+### Lesson
+- When a test file "hangs" at the file-level timeout, the FIRST hypothesis to check is per-test slowness, not deadlock. Profile with `--test-reporter=tap` and look at per-test `duration_ms` before assuming resource leaks or lock contention. Saves hours of chasing ghosts (this Round 29 originally looked like a `notification-router` or `parent-guard` interval leak; the real cause was `withFileLockSync` self-deadlock, found only after a delegated research investigation).
+- Grep-by-pattern ("X timer has no .unref()") does NOT find re-entrance deadlocks. Read the call stack.
+### Verification (all under 60s timeout, per the 3863s lesson)
+- `test/unit/round29-file-lock-reentrance.test.ts`  5/5  547ms
+- `test/unit/loaders.test.ts`  5/5
+- `test/unit/redaction-p1f.test.ts`  4/4
+- `test/unit/orphan-worker-registry.test.ts`  15/15  496ms
+- `test/integration/cleanup-full-flow.test.ts`  4/4  1377ms
+- `test/unit/doctor-cov.test.ts`  12/12  1.6s  (was 25.8s)
+- `test/unit/doctor-validation.test.ts`  2/2  3.0s  (was 6.8s)
+- 134/134 Sprint 1-5 + bug-fix regression pass
+- `npx tsc --noEmit`  EXIT 0
+- E2E: research workflow (3/3 tasks, 93K tokens, 5m19s, writer wrote file); implementation workflow (3/3 tasks, 66K tokens, 5m01s, reviewer re-verified all citations)
+## [v0.9.10 (continued)] — migrate deferred truncation points through the stage-chain (Sprint 5) (2026-06-26)
+The Sprint 3 v0.9.10 (continued) entry listed 5 truncation points deferred from the P0-A stage-chain refactor. This release migrates 3 of them and defers 2 more with explicit reasons.
+### Features
+- **`TailCaptureStage`** (`src/runtime/compact-stages/tail-capture-stage.ts`) — keeps the last N chars/bytes, prepends an optional marker when truncation fires. Two cap modes: `maxChars` (UTF-8 safe) or `maxBytes` (legacy byte cap with UTF-8 boundary snap). Mutually exclusive caps; positive finite values required.
+- **`HeadSnapStage`** (`src/runtime/compact-stages/head-snap-stage.ts`) — keeps the first N bytes, optionally snapping to the last newline within the head region. Byte cap (not char cap) to preserve the original memory-budget semantic. UTF-8 boundary safety: walks back partial multi-byte sequences at the cut point. `snapToNewline: false` disables the snap.
+- **`TAIL_CAPTURE_STREAM_STAGE` singleton** — exported 16_384-char tail-capture with no marker for `stream-preview.ts` textBuffer.
+- **`appendBoundedTail` refactored** (child-pi.ts:52) — delegates to `TailCaptureStage` with dynamic marker `[pi-crew captured output truncated to last X KiB]` computed from `maxBytes`. Behavior bit-identical to pre-Sprint-5.
+- **`stream-preview.ts` textBuffer truncation refactored** (lines 114, 122) — delegates to `TAIL_CAPTURE_STREAM_STAGE`. Both call sites in `feedJsonEvent` now use the stage. Behavior bit-identical to pre-Sprint-5 inline `.slice(appended.length - MAX_TEXT_BUFFER)`.
+- **`iteration-hooks.ts` `truncateToLimit` removed** — replaced by inline `new HeadSnapStage({ maxBytes: MAX_STDOUT_BYTES }).apply(rawStdout.toString("utf-8"))` at the call site. Eliminates the Buffer → string → Buffer round-trip. Behavior preserved.
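To make the char-cap mode of the tail capture concrete, here is a small sketch under stated assumptions. `TailCaptureSketch` is a hypothetical class, not the exported `TailCaptureStage`; it covers only the `maxChars` mode and splits by code point so a surrogate pair is never cut in half. It uses field declarations plus constructor assignment rather than parameter properties.

```typescript
// Sketch only: keep the last `maxChars` characters, prepending a marker
// only when truncation actually happens.
class TailCaptureSketch {
  private readonly maxChars: number;
  private readonly marker: string;

  constructor(options: { maxChars: number; marker?: string }) {
    if (!Number.isFinite(options.maxChars) || options.maxChars <= 0) {
      throw new RangeError("maxChars must be a positive finite number");
    }
    this.maxChars = options.maxChars;
    this.marker = options.marker ?? "";
  }

  apply(text: string): string {
    // Array.from splits by code point, so an emoji counts as one character.
    const chars = Array.from(text);
    if (chars.length <= this.maxChars) return text; // under the cap: unchanged
    return this.marker + chars.slice(chars.length - this.maxChars).join("");
  }
}
```

For example, `new TailCaptureSketch({ maxChars: 3, marker: "[cut]" }).apply("abcdef")` yields `"[cut]def"`, while an input of three characters or fewer is returned untouched with no marker.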
+### Deferred (still not migrated — with reasons)
+- **`async-runner.ts` stderr "stop capturing" semantic** (lines 280-330) — chunk-by-chunk state machine: chunks accumulate up to `STDERR_CAPTURE_LIMIT` (256KB), then further chunks are DROPPED ENTIRELY with a single truncation marker. Fundamentally stateful flow, not a transform-on-string. Migrating would require a chunk-stream pipeline that no other call site needs.
+- **`chain-runner.ts` array caps** (lines ~503-520) — operate on arrays of mixed-shape objects, not strings. The string-only pipeline abstraction doesn't apply. Existing `.slice()` calls are simple and correct; migrating would be ceremony without value.
+### Tests
+- New real-function suite `test/unit/deferred-truncation-migration.test.ts` (31 tests, calling REAL exported stages):
+  - **TailCaptureStage**: char/byte cap under/over boundary; UTF-8 boundary snap; marker behavior (prepended ONLY when truncating); singleton behavior; constructor validation.
+  - **HeadSnapStage**: byte cap; newline-snap to last `\n` in head region; `snapToNewline: false` disables snap; UTF-8 boundary safety with emoji at cut boundary; constructor validation.
+  - **3 migration integrations**: `appendBoundedTail`, `stream-preview.ts`, `iteration-hooks.ts` all verified behavior-equivalent to pre-Sprint-5.
+  - **L4 backward-compat**: all 3 migrations produce bit-identical output to pre-Sprint-5 inline implementations.
+- 111 output-handling + child-pi/task-output-context importer tests pass (was 80 in pre-Sprint-5 — added 31 new tests); `tsc --noEmit` clean.
+### Verification
+```
+npx tsc --noEmit                                                  -> clean (exit 0)
+node --test test/unit/deferred-truncation-migration.test.ts      -> 31 tests, 0 fail
+node --test (full Sprint 1+2+3+4+5 regression set, 7 files)       -> 111 tests, 0 fail
+node --test (all child-pi/task-output-context importer tests)     -> 132 tests, 0 fail
+```
+The full integration suite (incl. slow E2E / mocked-child-pi) was not re-run in this release window; the targeted unit + importer set covering every changed symbol is green.
+---
+## [v0.9.10 (continued)] — tee-recovery for truncated shared artifacts (P1-A, Sprint 4) (2026-06-26)
+When `readIfSmall` truncated a shared artifact down to ~32KB of head+tail for inline injection into a downstream worker's prompt, the worker had no way to recover the dropped middle. The only options were: re-run the producing task (waste), guess what was missing (error-prone), or skip the work (capability loss). This release wires up **tee-recovery** so the truncated middle is recoverable on demand.
+### Features
+- **`readIfSmallWithTee(filePath, opts) -> { content, fullOutputPath? }`** — new function in `src/runtime/task-output-context.ts`. Reads the file (with the same multi-byte-safe truncation pipeline as `readIfSmall`), and when the file size exceeds `2 * MAX_RESULT_INLINE_BYTES` AND `opts.tee.fullOutputPath` was provided, ALSO writes the FULL untruncated content to that path. Returns the truncated content for inline injection PLUS the path so the caller can expose it to the worker. Returns `{ content }` (no `fullOutputPath`) when truncation is below the 2× threshold or when no tee opts were provided. Returns `undefined` if the file cannot be read.
+- **`teePathForArtifact(artifactsRoot, taskId, artifactName) -> string`** — public helper computing the canonical tee path `${artifactsRoot}/tee/${taskId}-${artifactName}.full.txt`. `taskId` and `artifactName` are sanitized to `[A-Za-z0-9._-]+` (path separators and `..` neutralized to `_`) so the resulting path is always a single segment inside the tee directory.
+- **Tee directory auto-creation** — `mkdirSync(path.dirname(teePath), { recursive: true })` so callers do not need to pre-create `${artifactsRoot}/tee/`.
+- **Best-effort tee write** — I/O failures (disk full, permission denied, parent-is-a-file, etc.) are swallowed and `fullOutputPath` is omitted from the result instead of failing the read. The truncated inline content is still returned either way.
+- **SharedReads entry shape extended** — the `sharedReads` array in `DependencyOutputContext` gains an optional `fullOutputPath?: string` field, set when tee was actually written. The construction site in `collectDependencyOutputContext` was refactored to compute the tee path via `teePathForArtifact(manifest.artifactsRoot, task.id, name)`, call `readIfSmallWithTee`, and include `fullOutputPath` in the entry when present.
+- **Worker prompt augmentation** — `renderDependencyOutputContext` now emits a `Full output (if you need the missing middle): ${fullOutputPath}` line whenever the entry has one. Downstream workers can `read` this path to recover the dropped middle. The `read` call goes through pi-crew's normal permission gate (writer role has `workspace_write`), so security is preserved. Existing entries (small / under-2× files) render exactly as before — no extra line.
+- **`readIfSmall` backward-compat wrapper** — the existing `readIfSmall(filePath, baseDir?) -> string | undefined` signature is preserved; it now delegates to `readIfSmallWithTee` and returns just the content string. All existing call sites (live task resultArtifact reads, prompt-builder contexts, etc.) compile and behave identically.
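The tee-path computation can be illustrated with a short sketch. The path format and the safe-character class come from the entry above; `sanitizeSegment` and `teePathSketch` are hypothetical names, and the exact way runs of dots are neutralized is an assumption.

```typescript
// Sketch only: map anything outside [A-Za-z0-9._-] to "_", then collapse
// runs of two or more dots so ".." cannot survive inside the segment.
function sanitizeSegment(value: string): string {
  return value
    .replace(/[^A-Za-z0-9._-]/g, "_") // path separators and other unsafe chars
    .replace(/\.{2,}/g, "_"); // assumption: ".." becomes "_"
}

// Builds `${artifactsRoot}/tee/${taskId}-${artifactName}.full.txt` with both
// variable parts sanitized, so the filename is always a single path segment.
function teePathSketch(artifactsRoot: string, taskId: string, artifactName: string): string {
  return `${artifactsRoot}/tee/${sanitizeSegment(taskId)}-${sanitizeSegment(artifactName)}.full.txt`;
}
```

A single `.` is left alone, so a name like `result.json` survives, while hostile input such as `"../../etc/passwd"` collapses into underscores and stays inside the `tee/` directory.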
+### Tests
+- New real-function suite `test/unit/tee-recovery-real.test.ts` (14 tests, calling the REAL exported `readIfSmallWithTee`, `readIfSmall`, `teePathForArtifact`):
+  - **Threshold boundary**: files at or below `MAX_RESULT_INLINE_BYTES` returned verbatim, no tee, no tee file created on disk.
+  - **Truncation without tee**: files between 1× and 2× threshold are truncated, marker present, NO tee file created (the head/tail is mostly intact, tee would be wasteful).
+  - **Truncation WITH tee**: files > 2× threshold trigger tee; the tee file on disk is byte-equal to the original (full content, not truncated); `result.fullOutputPath` equals the requested tee path.
+  - **Tee directory auto-creation**: nested non-existent directories (e.g. `${root}/deeply/nested/tee/`) are created via `mkdirSync recursive`.
+  - **Tee write failure (best-effort)**: when the tee path's parent is a regular file (not a directory), `writeFileSync` fails internally; the read still returns truncated content with `fullOutputPath: undefined`. Never throws.
+  - **No-op without tee opts**: `readIfSmallWithTee(file)` without `opts.tee` behaves like the legacy `readIfSmall` (content only, no `fullOutputPath`).
+  - **Legacy `readIfSmall` wrapper**: returns the same string as `readIfSmallWithTee(file).content` for any input (backward-compat verified end-to-end).
+  - **`teePathForArtifact` format**: `${artifactsRoot}/tee/${taskId}-${artifactName}.full.txt`. Path-safety invariants verified — final filename segment contains no path separators, ends with `.full.txt`.
+  - **`teePathForArtifact` sanitization**: input like `"../escape/me"` + `"../../etc/passwd"` produces a path whose final segment is single-segment and contains no `/`. (`.` is intentionally allowed in the safe-char class so legitimate filenames like `result.json` survive; the real safety guarantee is no path separators inside the segment.)
+  - **Integration**: sharedReads construction (replicated from `collectDependencyOutputContext`) — small entries have no `fullOutputPath`, medium (under 2×) have no `fullOutputPath`, large (over 2×) have `fullOutputPath` AND the tee file exists on disk.
+  - **L4 backward-compat**: `readIfSmallWithTee` truncated marker wording on plain text is bit-identical to the pre-P1-A format (no `important lines preserved` marker, exact `[pi-crew truncated N chars, head+tail preserved]`).
+- 133 output-handling + child-pi/task-output-context importer tests pass (was 119 in v0.9.12 — added 14 P1-A suite); `tsc --noEmit` clean.
+### Verification
+```
+npx tsc --noEmit                                                  -> clean (exit 0)
+node --test test/unit/tee-recovery-real.test.ts                   -> 14 tests, 0 fail
+node --test (full Sprint 1+2+3+P1-A regression set, 9 files)      -> 133 tests, 0 fail
+node --test (all child-pi/task-output-context importer tests)     -> 132 tests, 0 fail
+```
+The full integration suite (incl. slow E2E / mocked-child-pi) was not re-run in this release window; the targeted unit + importer set covering every changed symbol is green.
+### Lessons learned
+- **Backward-compat via thin wrapper** — adding `readIfSmallWithTee` (returns enriched object) as the new canonical function and refactoring `readIfSmall` to delegate + return just `result.content` avoided any change to the existing `readIfSmall(filePath, baseDir?) -> string | undefined` signature. All four existing call sites in the file compiled without edits. The enriched result type is a strict superset; new code can opt into the metadata without breaking old code.
+- **Best-effort tee is the right default** — making `writeTeeFile` swallow I/O errors and report success/failure via boolean means the read path is NEVER blocked by tee-side problems. The worker prompt augmentation (`if (read.fullOutputPath)`) is the natural fallback signal — when tee failed, the worker simply does not see the recovery hint and behaves as if the file is non-recoverable (same as pre-P1-A).
+- **Tee threshold of 2× is the right cutoff** — files just over `MAX_RESULT_INLINE_BYTES` (say 33KB) get a clean head/tail with most of the content visible inline; tee-ing them would be wasteful disk usage. Files over 2× (say 70KB+) have >38KB dropped in the middle, where tee provides real value.
+---
+## [v0.9.10 (continued)] — stage-chain compression pipeline (P0-A, Sprint 3) (2026-06-26)
+The output-handling & compression area had several ad-hoc truncation / cleaning functions, each with its own quirks (`appendBoundedTail`, `stream-preview`, `iteration-hooks`, `async-runner`, `compactString`, `readIfSmall`, `chain-runner`). This release introduces a composable **stage-chain compression pipeline** so that future clean-up stages (ANSI strip, blank-line collapse, deduplication, truncation, …) can be added once and reused at every call site, and so that ALL compaction is forced through a single **monotonic-shrink gate** that mathematically cannot expand its input.
+### Features
+- **`src/runtime/compact-pipeline.ts` — stage-chain with monotonic-shrink gate** — `ICompactStage { id, apply(text): string }`, `applyCompactPipeline(text, stages) -> { text, applied }`. A stage is applied only if its output is no longer than its input (gate: `next.length <= text.length`); expanding stages are silently dropped and their id is not recorded in `applied`. This is the safety property that prevents the family of L4 caveman-shrink bugs (24/27 artifacts null-byte-corrupted by a regex-based shrink that expanded in some cases — see `.crew/knowledge.md` §"L4 output-handling"). Ported from Hypa's `GenericOutputCompressor.cs:18-51`.
+- **Four concrete stages in `src/runtime/compact-stages/`** — `AnsiStripStage` (CSI color/cursor codes, fast-path when no `\x1b` present), `BlankCollapseStage` (collapses 3+ consecutive newlines to a single blank line; configurable `minConsecutive`), `DeduplicateStage` (collapses CONSECUTIVE duplicate lines, preserves `\r\n` endings; opt-in only — unsafe for assistant prose), `TruncationStage` (parameterized marker verb/separators so the SAME class serves both `compactString`'s "compacted ... chars" wording and `readIfSmall`'s "truncated ... chars" wording).
+- **`compactString` refactored onto the pipeline** — default pipeline is `[TruncationStage(maxChars, { preserveImportant })]`. Plain-text inputs with no ANSI / blank runs / consecutive duplicates pass through bit-identically (L4 backward-compat — marker wording unchanged when no important lines and no noise). The P0-B important-line preservation still works through the pipeline.
+- **`readIfSmall` refactored onto the pipeline with noise stripping** — artifact files (which frequently contain ANSI color codes + blank-line noise from npm/cargo/jest output captured to disk) now pass through `[AnsiStripStage, BlankCollapseStage, TruncationStage]` before the result is returned. Plain-text fixtures remain bit-identical (L4 backward-compat).
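The monotonic-shrink gate is easiest to see in code. The sketch below follows the interface shape quoted in the entry (`ICompactStage { id, apply(text): string }`), but the function body and the two sample stages are illustrative assumptions rather than the shipped implementation.

```typescript
interface ICompactStage {
  id: string;
  apply(text: string): string;
}

// Sketch only: run stages in order, accepting a stage's output only when it
// is no longer than its input. Rejected stages leave the text untouched and
// are not recorded in `applied`.
function applyCompactPipelineSketch(
  text: string,
  stages: ICompactStage[],
): { text: string; applied: string[] } {
  const applied: string[] = [];
  let current = text;
  for (const stage of stages) {
    const next = stage.apply(current);
    if (typeof next === "string" && next.length <= current.length) {
      current = next;
      applied.push(stage.id);
    }
  }
  return { text: current, applied };
}

// Hypothetical sample stages: one that shrinks, one that (wrongly) expands.
const collapseBlank: ICompactStage = {
  id: "blank-collapse",
  apply: (t) => t.replace(/\n{3,}/g, "\n\n"),
};
const badExpander: ICompactStage = {
  id: "bad-expander",
  apply: (t) => t + " [padding]",
};
```

Running `applyCompactPipelineSketch("a\n\n\n\nb", [badExpander, collapseBlank])` drops the expanding stage and returns `"a\n\nb"` with `applied` equal to `["blank-collapse"]`. The output can therefore never be longer than the input, whatever stages are passed in.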
+### Deferred to a future release
+Five other truncation points in the codebase were intentionally NOT ported to the pipeline in this release — each has call-site-specific semantics that warrant their own migration:
+| Code point | Current behavior | Migration scope |
+|---|---|---|
+| `appendBoundedTail` (`child-pi.ts`) | Tail-only accumulator, 256KB byte cap | Live-streaming chunked input; needs a streaming-friendly stage API |
+| `stream-preview.ts:114` | Tail-only live preview, 16KB | Live UI preview; ANSI strip is a feature but the tail-only semantics are UI-specific |
+| `iteration-hooks.ts:105` | Head-only, newline-snapped, 8KB | Hook output is small (single command); pattern is deliberately different |
+| `async-runner.ts:293` | Head+marker, stops capturing at 256KB stderr | Detached-process stderr; capture-stop semantics need a separate stage |
+| `chain-runner.ts:515` | Head-only array cap, 20/50 items | Array compaction (not string); pipeline operates on strings |
+These are tracked for a future v0.9.13+ cleanup. The P0-A infrastructure makes the migration mechanical (each is a ~30 LOC refactor once the per-point design is settled).
+### Tests
+- New real-function suite `compact-pipeline-real.test.ts` (23 tests, calling the REAL exported pipeline, stages, `compactString`, `readIfSmall`):
+  - **Monotonic-shrink gate (critical safety property)**: an expanding stage is silently dropped; an equal-length stage is accepted; chained stages with mixed expand/shrink produce a deterministic `applied[]`.
+  - **Malformed-stage defense**: non-string output, missing `apply` are skipped (pipeline never throws on bad input).
+  - **AnsiStripStage**: CSI color/cursor codes stripped; idempotent; fast-path on text without `\x1b`.
+  - **BlankCollapseStage**: 3+ newlines collapsed to 2; 1-2 newlines preserved; configurable threshold.
+  - **DeduplicateStage**: consecutive duplicates collapsed; non-adjacent duplicates preserved; `\r\n` endings preserved.
+  - **TruncationStage**: default marker matches `compactString` wording; truncated marker + `\n\n` headSeparator matches `readIfSmall` wording; rejects non-positive `maxChars`.
+  - **Pipeline integration**: `compactString` on plain text is bit-identical to pre-P0-A wording (L4 backward-compat); important-line preservation still works; monotonic-shrink holds across the boundary window. `readIfSmall` strips ANSI before truncating (assert: no `\x1b` in result); collapses blank-line noise before truncating; plain-text input is bit-identical to pre-P0-A wording.
+  - **Pipeline observability**: `compactString` returns a plain string (not a `PipelineResult`) — `applied[]` is internal; future dashboard wiring is a separate task.
+- 119 output-handling + child-pi/task-output-context importer tests pass (was 96 in v0.9.11 — added 23 P0-A suite); `tsc --noEmit` clean.
+### Verification
+```
+npx tsc --noEmit                                                  -> clean (exit 0)
+node --test test/unit/compact-pipeline-real.test.ts               -> 23 tests, 0 fail
+node --test (full Sprint 1+2+P0-A regression set, 8 files)        -> 119 tests, 0 fail
+node --test (all child-pi/task-output-context importer tests)     -> 118 tests, 0 fail
+```
+The full integration suite (incl. slow E2E / mocked-child-pi) was not re-run in this release window; the targeted unit + importer set covering every changed symbol is green.
+### Lessons learned during this work
+- **Parameter-property syntax** (`constructor(private readonly x = 3) { }`) is **NOT** supported by Node's `--experimental-strip-types` mode used by pi-crew's test runner. Sprint 3 originally hit a `SyntaxError: TypeScript parameter property is not supported in strip-only mode` that crashed every test file importing the affected module (not just the failing test). Field declaration + constructor assignment is the portable shape — documented inline in `BlankCollapseStage` so the next contributor doesn't re-introduce it.
+- **Delegation is not free**. The first P0-B and P0-A team-run attempts both failed at the executor stage (exploration without code production) despite 800K+ tokens of work. Direct implementation, with the spec written by the leader, was strictly faster and cheaper. Lesson: for tightly-scoped, well-specified refactors where the leader holds the full context, delegate only the verification (or skip delegation entirely).
+---
+## [v0.9.10 (continued)] — harden output-handling: fix UTF-8 corruption, dead-code path resolution, compaction expansion, stderr secret leakage, and add important-line classifier (Sprints 1-2) (2026-06-26)
+A code-review + security-review + verifier pass on the output-handling & compression code path surfaced 4 correctness bugs and 1 medium-severity security gap. This release fixes all of them and replaces the test "mirror" anti-pattern (tests re-implemented the algorithm locally instead of calling the real functions) with real-function tests, so the passing suite now actually guards the shipped code.
+### Fixes
+- **`readIfSmall` UTF-8 byte-boundary corruption** — `src/runtime/task-output-context.ts` previously read head/tail as raw bytes then `.toString("utf-8")`, splitting multi-byte sequences into `\uFFFD` replacement characters. This corrupted emoji, CJK text, and the `⬜`/`⬛` large-square symbols central to the v0.9.10 visual fix. Now reads the full file as a UTF-8 string and slices by character count (char-safe, consistent with `compactString`).
+- **`pruneSharedReads` path resolution (dead invalidation branch)** — `path.resolve("shared")` resolved against the process CWD instead of `manifest.artifactsRoot`, and then double-prefixed to `<cwd>/shared/shared/<name>`. The file-edit-after-read invalidation branch therefore never matched. `pruneSharedReads` now receives `artifactsRoot` and resolves artifact paths (already relative to `artifactsRoot`) directly.
+- **`compactString` / `readIfSmall` expansion at threshold boundary** — for inputs just over the threshold, `head(75%) + marker(~57) + tail(25%)` was *larger* than the input, so "compaction" expanded the content. Added a monotonic-shrink guard: if the compacted result is not shorter than the input, return the input unchanged. (This is the local seed of the P0-A stage-chain `if (next.length <= text.length)` gate planned for a later release.)
+- **`compactValue` silent array/object truncation** — `.slice(0, 20)` dropped items 21+ with no marker. Now appends a `[pi-crew truncated N entries]` marker (arrays) / `[truncated]` key (objects) so downstream consumers know data was elided, consistent with `compactString`.
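The threshold-boundary expansion is simple arithmetic: for an input only slightly longer than the cap, the head plus marker plus tail is longer than the input itself. A minimal sketch of head+tail compaction with the guard, assuming the 75/25 split and the marker wording quoted elsewhere in this changelog (`headTailCompactSketch` is a hypothetical name):

```typescript
// Sketch only: keep 75% head and 25% tail of the cap, with a marker between.
// If the result is not strictly shorter than the input, return the input.
function headTailCompactSketch(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const headLen = Math.floor(maxChars * 0.75);
  const tailLen = maxChars - headLen;
  const dropped = text.length - headLen - tailLen;
  const marker = `\n[pi-crew truncated ${dropped} chars, head+tail preserved]\n`;
  const compacted = text.slice(0, headLen) + marker + text.slice(text.length - tailLen);
  // Monotonic-shrink guard: never hand back something longer than the input.
  return compacted.length < text.length ? compacted : text;
}
```

With `maxChars = 100` and a 110-character input, the unguarded result would be 100 characters of content plus a marker of roughly 50 characters, so the guard returns the original 110 characters instead.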
+### Security
+- **`child-pi.ts` stderr/stdout secret leakage via lifecycle events (SEC-1, medium)** — the in-memory `stdout`/`stderr` accumulators receive raw worker output (structurally compacted only, not secret-redacted). Tail slices embedded in lifecycle events (`response_timeout`, `spawn_error`, `exit`) and error messages could therefore leak worker-emitted secrets (GitHub PATs, AWS keys, JWTs) through diagnostic logs that bypass artifact-store redaction. All 8 embed sites now route through a single `redactStderrExcerpt(stderr, maxChars)` helper that applies `redactSecretString` at the boundary.
+### Features
+- **Important-line classifier for truncated output (P0-B)** — when `compactString` or `readIfSmall` truncates a value below its threshold, the middle slice is now scanned for diagnostic lines and the most important ones are preserved between head and tail within a 15% slack budget. Five anchored regexes (error keywords, `file:line` diagnostics, HTTP 4xx/5xx, k8s/linter `Warning`, compiler/linter codes like `TS2304`) ported from Hypa's `ImportantLineClassifier.cs`. `compactString` gains an optional `{ preserveImportant?: boolean }` arg (default `true`); the assistant-text branch in `compactContentPart` opts out via `preserveImportant: false` so prose compaction behavior is unchanged. `readIfSmall` always preserves (artifact files are tool output context). When no important lines are picked the marker wording stays bit-identical to the pre-P0-B format (L4 regression safety). Local seed of the monotonic-shrink `if (next.length <= text.length)` gate planned for the P0-A stage-chain refactor.
+### Tests
+- Replaced the **test-mirror anti-pattern**: `output-handling-l4.test.ts` had re-implemented `headTailCompact` locally and used a local fd-read that "mirrored" `readIfSmall`, so the suite stayed green while the bugs above existed. `compactString`, `compactValue`, `readIfSmall`, `MAX_RESULT_INLINE_BYTES`, `splitWithImportantLines`, `isImportantLine`, and `extractImportantLines` are now exported and exercised directly.
+- New real-function suites: `child-pi-compaction-real.test.ts` (monotonic-shrink across the boundary window, head/tail/marker preservation), `task-output-context-compaction-real.test.ts` (UTF-8 multi-byte safety at the split point, monotonic-shrink at multiple thresholds), `child-pi-sec1-redaction.test.ts` (GitHub PAT / AWS key / JWT / Bearer redaction at the `redactStderrExcerpt` boundary, plus slice-window and passthrough cases), `important-line-classifier-real.test.ts` (per-pattern match, greedy whole-line slack selection, compactString/readIfSmall integration with real files, `preserveImportant:false` opt-out, L4 backward-compat marker wording).
+- 96 output-handling + child-pi importer tests pass (191 with the full importer set including the pre-existing UI/notification/pool/timeout/exit suites); `tsc --noEmit` clean.
+### Verification
+```
+npx tsc --noEmit                                  -> clean (exit 0)
+node --test (output-handling + importers)        -> 108 tests, 0 fail, 2.0s
+SEC-1 redaction boundary (8 subtests)            -> 8 pass, 0 fail
+```
+The full integration suite (incl. slow E2E / mocked-child-pi) was not re-run in this release window; the targeted unit/importer set covering every changed symbol is green.
+---
+## [v0.9.10] — fix TUI crash: count large-square emoji (⬜) as width 2 (2026-06-25)
+Fixes the recurring `uncaughtException: Rendered line N exceeds terminal width (160 > 159)` that killed the host Pi process while a pi-crew foreground team run was active. The crew run itself (a child process) kept running, but the Pi TUI died and could not be used to observe it.
+### Root cause — width-measurement mismatch (the earlier commit 7a3ac8b was WRONG about this)
+`src/utils/visual.ts` `WIDE_RANGES` did **not** include `⬜` U+2B1C (WHITE LARGE SQUARE) or `⬛` U+2B1B (BLACK LARGE SQUARE), so pi-crew's `visibleWidth`/`truncate` counted them as **1 column**. Upstream `@earendil-works/pi-tui` (the renderer Pi runtime uses to detect overflow) counts them as **2 columns** (RGI emoji).
+A widget sub-line such as
+```
+│     ⊶ | S7: pi-audit security test | ⬜ pending | | · 39 tools · *** tok · 49s
+```
+got composed, then `crew-widget.ts` `Box.render` **padded** it to 159 chars. pi-crew's own `visibleWidth` said 159 (⬜ = 1), so every truncate guard passed; but pi-tui re-measured the padded line at **160** (⬜ = 2) and threw → `uncaughtException` → Pi exits.
+Commit `7a3ac8b` (truncate widget lines) was **ineffective**: it measured with the same mismatched `visibleWidth`, so truncating "to 159" still left pi-tui seeing 160. It is kept as harmless defense-in-depth.
+### Fix (commit 3cd9001)
+Add `[0x2B1B, 0x2B1C]` to `WIDE_RANGES` so pi-crew's measurement agrees with upstream pi-tui for the large-square emoji that appear in task descriptions (e.g. pi-audit backlog markers `⬜ pending`). Surrounding codepoints (U+2B00, U+2BFF) stay width 1 — only the RGI large squares are wide.
+### Verification
+Cross-checked directly with the SAME `visibleWidth` the Pi runtime uses:
+```
+before fix: crewTruncate(paddedLine, 159) -> pi-tui visibleWidth 160  (CRASH)
+after  fix: crewTruncate(paddedLine, 159) -> pi-tui visibleWidth 159  (CRASH PREVENTED)
+```
+Tests: 2 new regression tests in `test/unit/visual.test.ts` pin `visibleWidth('⬜') === 2` (and `⬛`), that `⬀`/`⯿` stay 1, and that truncating a 159-char line containing `⬜` now fits `visibleWidth <= 159`. Existing visual/width-safety/widget/crew-widget suites (24 tests) stay green.
+### Also in v0.9.10
+**Stale `parentModel` from Pi runtime session state** (commit 06b2337). `ctx.model` in the Pi extension API is the **session's SAVED model** (`main.ts:389`), not the model currently running. A previous session saved `claude-sonnet-4-5` to session state; a new session inherits the stale `savedModel` but actually runs on `minimax-M3` (visible in the footer). When the saved model has no auth in `modelRegistry`, child-pi spawn fails immediately with `No API key found for anthropic` before any fallback can fire. New helper `resolveParentModelFromRegistry()` in `live-session-runtime.ts` resolves `parentModel` to the first auth-available model in `modelRegistry`, falling back to the raw value if the registry is empty. pi-crew config (fallbackModels, agentModel, overrideModel, F7 scope) is fully respected — fix only touches the parentModel argument. 6 new tests in `test/unit/resolve-parent-model-from-registry.test.ts`.
+**Model-fallback chain reliability** (commit c90b1bf). The fallback chain failed to advance past certain non-retryable look-alikes because of three gaps plus missing E2E coverage. (a) `RETRYABLE_MODEL_FAILURE_PATTERNS` broadened with 5 new patterns (`provider error`, `context_length_exceeded`, `safety`, `is overloaded`, `408`). (b) `!nextModel` short-circuit in `task-runner.ts` relaxed so the chain can advance past non-retryable look-alikes. (c) `detectRetryableModelFailureFromOutput` broadened to check `stopReason === "error"` events (previously only string-matched). (d) New `PI_TEAMS_MOCK_CHILD_PI=retryable-failure-then-success` mock branch + new `test/integration/model-fallback-chain-e2e.test.ts`. 39/39 tests pass (21 model-fallback + 14 chain E2E + 4 others).
+**Channel-based event subscription — eliminate over-invalidate spam** (commit a67aaee). `runEventBus.onAny()` in 5 widgets caused every emitted `worker_status` event (several events per second when an agent calls tools) to trigger a full state reload. Concurrent writes to `manifest.json` and `tasks.json` during live team runs triggered the retry-loop instability path in `state-store.ts:659-735`, spamming `console.warn` with `retry loop detected instability for run X`. The EventChannel taxonomy already existed in `run-event-bus.ts:19-49` but was not used. All 5 widgets were refactored to subscribe via `onChannel()`: `run-snapshot-cache` → `run:state + worker:lifecycle`; the other four → `run:state + worker:lifecycle + ui:invalidate`. Secondary bug: `run.cache_invalidated` was classified as `worker:progress` (the default fallback) because it was missing from the `UI_INVALIDATE_TYPES` set; it has been added. The retry-loop `console.warn` was downgraded to `console.debug` (best-effort by design, not an error). Expected impact: 80-95% reduction in retry-loop warnings; 100% reduction in log noise.
+**Writer role misclassification — unblock deliverable emission** (commit 4abc9f1, E2E verified). `permissionForRole("writer")` returned `read_only` because writer was in `READ_ONLY_ROLES` (`src/runtime/role-permission.ts:6`). This blocked file writes at runtime even though `agents/writer.md` already declares `tools: read, grep, find, ls, edit, write` and 3/3 workflows use writer EXACTLY for deliverable emission: `parallel-research.workflow.md:42-46` (`output: research-summary.md`), `research.workflow.md:18-22` (`output: research-summary.md`), `pipeline.workflow.md:24-29` (Stage 4 Documentation). Incident 2026-06-25: `parallel-research` ran 7/7 tasks successfully (714K tokens, 20m49s) but the deliverable file was NEVER written; the worker reported `My role contract states: Do not create, modify, delete, move, or copy files` — a permission-gate injection. Fix: 1 line — `writer` moved from `READ_ONLY_ROLES` to `WRITE_ROLES`. **E2E verified** with `team_20260625110218_e100ce5371db08ea` (research workflow): 3/3 tasks, 65,431 tokens, 3m41s — `research-summary.md` (1319 bytes) created successfully by the `03_write` worker.
+### Follow-up observation (not fixed in v0.9.10)
+**Planner has a 2-layer latent bug** (P1, deferred). `default.workflow.md:11-14` and `implementation.workflow.md:7` assign `planner` to write `plan.md` / `adaptive-plan.json`. Layer 1: `planner` is in `READ_ONLY_ROLES` (`role-permission.ts:5`). Layer 2: `agents/planner.md:7` declares `tools: read, grep, find, ls` — **no `write` or `edit` tool**. Fixing only one layer would leave the other in place, so `planner` would still be unable to write. Action item: separate audit + targeted fix + E2E test in next sprint.
 ## [v0.9.9] — gajae-code distillation (4 P0) + notification race fix (2026-06-25)
 Six changes: four high-impact/low-effort features distilled from researching [Yeachan-Heo/gajae-code](https://github.com/Yeachan-Heo/gajae-code) (full report: `research-findings/gajae-code-distill.md`), plus a fix for a redundant-notification bug the leader directly hit while running that research. Each was calibrated against real pi-crew code — two reported "gaps" turned out to be patterns pi-crew already implements (prompt-level stablePrefix, detached spawning), and four areas where pi-crew is already superior were deliberately left untouched (crash-recovery byte-offset cursor, declarative workflow + semaphores, run-snapshot-cache, event sourcing).
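
The width mismatch described in the v0.9.10 entry of the CHANGELOG hunk above can be illustrated with a minimal sketch. It assumes a much smaller range table than the real `WIDE_RANGES` in `src/utils/visual.ts`; `WIDE_RANGES_BEFORE`, `WIDE_RANGES_AFTER`, `charWidth` and `truncateToWidth` are hypothetical names, and the "after" table stands in for how upstream pi-tui is reported to measure the large squares. It is not the package's implementation.

```typescript
// Hypothetical, simplified width tables. The real WIDE_RANGES is larger.
type Range = readonly [number, number];

const WIDE_RANGES_BEFORE: readonly Range[] = [[0x1f300, 0x1f64f]];
// Same table plus the two large-square code points named in the changelog.
const WIDE_RANGES_AFTER: readonly Range[] = [[0x2b1b, 0x2b1c], [0x1f300, 0x1f64f]];

// A code point is 2 columns wide if it falls in any wide range, else 1.
function charWidth(codePoint: number, wide: readonly Range[]): number {
  return wide.some(([lo, hi]) => codePoint >= lo && codePoint <= hi) ? 2 : 1;
}

// Sum of per-code-point widths (ignores ANSI codes and combining marks).
function visibleWidth(text: string, wide: readonly Range[]): number {
  let width = 0;
  for (const ch of text) width += charWidth(ch.codePointAt(0)!, wide);
  return width;
}

// Keep whole code points while the running width stays within max.
function truncateToWidth(text: string, max: number, wide: readonly Range[]): string {
  let out = "";
  let width = 0;
  for (const ch of text) {
    const w = charWidth(ch.codePointAt(0)!, wide);
    if (width + w > max) break;
    out += ch;
    width += w;
  }
  return out;
}

// A line padded to 159 UTF-16 units that contains one large square.
const paddedLine = "⬜ pending".padEnd(159, " ");

// Truncating with the old table is a no-op, yet the stricter measurement sees 160.
const oldResult = truncateToWidth(paddedLine, 159, WIDE_RANGES_BEFORE);
// Truncating with the updated table drops one trailing space and fits in 159.
const newResult = truncateToWidth(paddedLine, 159, WIDE_RANGES_AFTER);

console.log(visibleWidth(oldResult, WIDE_RANGES_AFTER));
console.log(visibleWidth(newResult, WIDE_RANGES_AFTER));
```

The point of the sketch is that a truncate guard is only as good as the width function behind it: when the guard and the renderer disagree on a single code point, truncating "to 159" can still overflow.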

package/docs/fixes/v0.9.10/locks-fix-verify.md ADDED

@@ -0,0 +1,3 @@
+# Round 29 withFileLockSync re-entrance fix — verification
+Verified the v0.9.10 `withFileLockSync` re-entrance fix against `pi-crew/src/state/locks.ts`. The `fileLockHeldByUs` re-entrance map is declared at `src/state/locks.ts:371` as `const fileLockHeldByUs = new Map<string, string>(); // lockFile -> token`; `export function withFileLockSync<T>(filePath: string, fn: () => T, options: RunLockOptions = {}): T {` is at `src/state/locks.ts:288`; the re-entrance check `const existingToken = fileLockHeldByUs.get(lockFile);` is at `src/state/locks.ts:302` (immediately followed by the `if (existingToken) { return fn(); }` fast-path). After successful acquire, `fileLockHeldByUs.set(lockFile, token);` is at `src/state/locks.ts:336`, and the matching `fileLockHeldByUs.delete(lockFile);` lives inside the `finally` block at `src/state/locks.ts:341`, immediately followed by `releaseLock(lockFile, token)` — i.e. the map entry is cleared BEFORE the actual filesystem release, guaranteeing no stale map entries can survive a thrown exception. Test `pi-crew/test/unit/round29-file-lock-reentrance.test.ts` reports **5/5 passing** (≈0.41s wall; longest sub-test 3.76ms) via `npx tsx --test`, covering nested-same-path, deeply-nested (3 levels), different-paths, post-release reacquire, and value-flow cases. Both commits confirmed in local `git log --oneline`: `7085d8d` ("Add re-entrance guard to withFileLockSync (Round 29 follow-up)") and `5876c38` ("Fix HandoffManager setInterval leak: prevent test file-level hangs"). **Verdict: VERIFIED** — no anomalies; every line citation and the test pass-count match the claim exactly.
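
The re-entrance guard verified above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/state/locks.ts`: an in-memory `Set` stands in for the on-disk lock files, a busy lock throws instead of retrying for `staleMs`, and `withLockSketch`, `lockFilesOnDisk` and `heldByUs` are hypothetical names.

```typescript
// Stand-in for lock files on disk (assumption: the real code creates and removes files).
const lockFilesOnDisk = new Set<string>();
// lockFile -> token, playing the role the document ascribes to fileLockHeldByUs.
const heldByUs = new Map<string, string>();
let nextToken = 0;

function withLockSketch<T>(filePath: string, fn: () => T): T {
  const lockFile = `${filePath}.lock`;

  // Re-entrance fast path: this process already holds the lock, so run fn directly.
  if (heldByUs.has(lockFile)) return fn();

  // Simplification: fail fast instead of retrying until the lock goes stale.
  if (lockFilesOnDisk.has(lockFile)) throw new Error(`lock busy: ${lockFile}`);

  const token = `token-${++nextToken}`;
  lockFilesOnDisk.add(lockFile);
  heldByUs.set(lockFile, token);
  try {
    return fn();
  } finally {
    // Clear the map entry first, then release the lock, matching the order
    // described in the verification note.
    heldByUs.delete(lockFile);
    lockFilesOnDisk.delete(lockFile);
  }
}

// Nested same-path acquisition returns the inner value instead of blocking on itself.
const nested = withLockSketch("state.json", () =>
  withLockSketch("state.json", () => "inner value"),
);
console.log(nested);
```

Because the map entry is removed in `finally`, a callback that throws still leaves no stale entry behind, which is the property the verification note calls out.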

package/docs/fixes/v0.9.10/smoke-test.md ADDED

@@ -0,0 +1,12 @@
+# Smoke Test — v0.9.10 File-Level Test Hang Fixes
+3 commits resolved BG2 (`verify-full2`) file-level test hangs/timeouts. All verified under a 60s timeout bound (per the 3863s lesson); `tsc --noEmit` clean.
+- **cadb5b7** — `Fix CountdownTimer drift and redactSecretString ReDoS regression`
+  Correctness/perf fixes surfaced by the P1f ReDoS regression test (300KB no-dot input) and BG2 sweep; not attributed to a specific hanging test. `CountdownTimer` switched from `setInterval` to recursive `setTimeout` so a busy event loop no longer skips a second value (`src/ui/loaders.ts:109` class; `scheduleNextTick` at `:143`; `.unref()` defense-in-depth at `:157`; test `test/unit/loaders.test.ts`). `redactSecretString` (`src/utils/redaction.ts:177`) inner loop now advances past non-secret alphanumeric runs (O(n²)→O(n); FIX comment `:207-220`) and `isKeyChar` uses `charCodeAt` (`:223-232`) in place of `/[a-zA-Z0-9_-]/.test` — ~5× faster, no regex allocation.
+- **5876c38** — `Fix HandoffManager setInterval leak: prevent test file-level hangs`
+  Resolves `chain-runner.test.ts` file-level hang. `HandoffManager.startCleanupTimer` (`src/runtime/handoff-manager.ts:202-213`) now calls `.unref()` on the cleanup `setInterval` at `:212-213`. Without `.unref()`, every `new HandoffManager()` in a mock helper leaked a handle that kept Node alive past test completion.
+- **7085d8d** — `Add re-entrance guard to withFileLockSync (Round 29 follow-up)`
+  Resolves `orphan-worker-registry.test.ts` + `cleanup-full-flow.test.ts` hangs. Added `fileLockHeldByUs: Map<string,string>` re-entrance guard to `withFileLockSync` (`src/state/locks.ts:295` FIX comment; `:302` existingToken short-circuit; `:336` Map.set; `:341` finally delete; `:371` Map declaration) mirroring the existing `runLockHeldByUs` pattern. Without the guard, nested same-path acquisition read its own freshly-written lock file and retried for the full `staleMs` window. Regression test: `test/unit/round29-file-lock-reentrance.test.ts` (5/5 pass, 547ms; orphan-worker-registry 15/15 496ms was hanging at 30s; cleanup-full-flow 4/4 1377ms was hanging at 30s).
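
The `.unref()` fix noted for commit 5876c38 above relies on standard Node.js timer behavior: an unreferenced timer does not keep the event loop alive. A minimal sketch, assuming a hypothetical `CleanupTimerSketch` class rather than the real `HandoffManager`:

```typescript
import { setInterval, clearInterval } from "node:timers";

// Hypothetical stand-in for a manager that runs periodic cleanup.
class CleanupTimerSketch {
  private timer: ReturnType<typeof setInterval> | undefined;

  start(intervalMs: number, cleanup: () => void): void {
    if (this.timer) return; // already running
    this.timer = setInterval(cleanup, intervalMs);
    // Without unref(), this handle alone would keep the Node process alive,
    // e.g. after a test file has finished.
    this.timer.unref();
  }

  // True only if the timer would prevent the process from exiting.
  keepsProcessAlive(): boolean {
    return this.timer?.hasRef() ?? false;
  }

  stop(): void {
    if (this.timer) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }
}

const sketch = new CleanupTimerSketch();
sketch.start(60_000, () => {});
console.log(sketch.keepsProcessAlive());
sketch.stop();
```

Calling `stop()` explicitly remains the cleaner option where a teardown hook exists; `unref()` covers instances that are created in helpers and never torn down.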

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-crew",
-  "version": "0.9.9",
+  "version": "0.9.10",
   "description": "Pi extension for coordinated AI teams, workflows, worktrees, and async task orchestration",
   "author": "baphuongna",
   "license": "MIT",

package/src/extension/team-tool/doctor.ts CHANGED

@@ -28,27 +28,47 @@ function firstOutputLine(stdout: string | null | undefined, stderr: string | nul
 	return output.split(/\r?\n/).find((line) => line.trim().length > 0)?.trim() ?? "available";
 }
+// Round 29 optimization: memoize spawnSync probe results at module level.
+// The probes (git --version, pi --version) are stable for the process
+// lifetime, and spawnSync on a node script can cost 1-2s. Without the
+// cache, each buildTeamDoctorReport() call would pay that cost, and a
+// file with 12 tests would take 20s+ even with empty cwd. The cache is
+// safe: a doctor check is informational, and a stale ok=true would
+// self-correct on the next process restart.
+const commandExistsCache = new Map<string, { ok: boolean; detail: string }>();
 function commandExists(command: string, args: string[]): { ok: boolean; detail: string } {
+	const cacheKey = `${command} ${args.join(" ")}`;
+	const cached = commandExistsCache.get(cacheKey);
+	if (cached) return cached;
+	let result: { ok: boolean; detail: string };
 	try {
 		const output = spawnSync(command, args, { encoding: "utf-8", stdio: ["ignore", "pipe", "pipe"] });
 		if (output.error) {
-			return { ok: false, detail: output.error.message };
+			result = { ok: false, detail: output.error.message };
+		} else if (output.status !== 0) {
+			result = { ok: false, detail: firstOutputLine(output.stdout, output.stderr) || `status ${output.status}` };
+		} else {
+			result = { ok: true, detail: firstOutputLine(output.stdout, output.stderr) };
 		}
-		if (output.status !== 0) {
-			return { ok: false, detail: firstOutputLine(output.stdout, output.stderr) || `status ${output.status}` };
-		}
-		return { ok: true, detail: firstOutputLine(output.stdout, output.stderr) };
 	} catch (error) {
-		return { ok: false, detail: error instanceof Error ? error.message : String(error) };
+		result = { ok: false, detail: error instanceof Error ? error.message : String(error) };
 	}
+	commandExistsCache.set(cacheKey, result);
+	return result;
 }
+let piCommandExistsCache: { ok: boolean; detail: string } | undefined;
 function piCommandExists(): { ok: boolean; detail: string } {
+	if (piCommandExistsCache) return piCommandExistsCache;
 	const spec = getPiSpawnCommand(["--version"]);
 	const output = commandExists(spec.command, spec.args);
-	if (!output.ok) return output;
+	if (!output.ok) {
+		piCommandExistsCache = output;
+		return piCommandExistsCache;
+	}
 	const executable = spec.command === "pi" ? "pi" : `${spec.command} ${spec.args[0] ?? ""}`.trim();
-	return { ok: true, detail: `${output.detail} (${executable})` };
+	piCommandExistsCache = { ok: true, detail: `${output.detail} (${executable})` };
+	return piCommandExistsCache;
 }
 function checkWritableDir(dir: string): { ok: boolean; detail: string } {
@@ -119,12 +139,18 @@ export interface TeamDoctorReport {
 }
 export function buildTeamDoctorReport(input: TeamDoctorReportInput): TeamDoctorReport {
+	// Discover once — used in both Drift and Discovery sections. Walking the
+	// filesystem 3x (agents/teams/workflows) is the dominant cost of this
+	// function; calling it twice doubles the cost. Round 29 optimization.
+	const discoveredAgentsAll = allAgents(discoverAgents(input.cwd));
+	const discoveredTeamsAll = allTeams(discoverTeams(input.cwd));
+	const discoveredWorkflowsAll = allWorkflows(discoverWorkflows(input.cwd));
 	// Compute drift once — reused in both Drift section and return value
 	const driftResult = detectDrift(
 		{
-			agents: allAgents(discoverAgents(input.cwd)).map((a) => a.name),
-			teams: allTeams(discoverTeams(input.cwd)).map((t) => t.name),
-			workflows: allWorkflows(discoverWorkflows(input.cwd)).map((w) => w.name),
+			agents: discoveredAgentsAll.map((a) => a.name),
+			teams: discoveredTeamsAll.map((t) => t.name),
+			workflows: discoveredWorkflowsAll.map((w) => w.name),
 		},
 		loadConfig(input.cwd).config,
 	);
@@ -153,14 +179,11 @@ export function buildTeamDoctorReport(input: TeamDoctorReportInput): TeamDoctorR
 			];
 		}),
 		section("Discovery", () => {
-			const discoveredAgents = allAgents(discoverAgents(input.cwd));
-			const discoveredTeams = allTeams(discoverTeams(input.cwd));
-			const discoveredWorkflows = allWorkflows(discoverWorkflows(input.cwd));
-			const agentModelHints = discoveredAgents.filter((agent) => agent.model || agent.fallbackModels?.length).length;
+			const agentModelHints = discoveredAgentsAll.filter((agent) => agent.model || agent.fallbackModels?.length).length;
 			return [
-				{ label: "agents", ok: true, detail: `${discoveredAgents.length} discovered` },
-				{ label: "teams", ok: true, detail: `${discoveredTeams.length} discovered` },
-				{ label: "workflows", ok: true, detail: `${discoveredWorkflows.length} discovered` },
+				{ label: "agents", ok: true, detail: `${discoveredAgentsAll.length} discovered` },
+				{ label: "teams", ok: true, detail: `${discoveredTeamsAll.length} discovered` },
+				{ label: "workflows", ok: true, detail: `${discoveredWorkflowsAll.length} discovered` },
 				{ label: "resource model hints", ok: true, detail: `${agentModelHints} agents declare model/fallback preferences` },
 			];
 		}),