npm - agentic-orchestrator - Versions diffs - 0.1.2 → 0.1.4 - Mend

agentic-orchestrator 0.1.2 → 0.1.4

Files changed (300)

package/spec-files/outstanding/agentic_orchestrator_performance_improvements_spec.md ADDED

@@ -0,0 +1,1059 @@
+# Feature Spec: Runtime Performance & Agent Context Optimization
+**Version:** 1.0
+**Date:** 2026-03-03
+**Status:** Draft
+**Milestone:** M30 – Performance & Efficiency
+> **Purpose:** Diagnose and implement fixes for sub-optimal runtime, memory, and agent context-size patterns in the control plane. Section 2 is the diagnostic record (what is wrong and why). Section 3 is the implementation plan (what to change and how). Each task in Section 3 maps to one or more findings in Section 2.
+---
+## 0. Implementation Standards & References
+### 0.1 Testing Standards
+All new and modified code MUST follow the testing standards already established in the repo:
+- Use Vitest (`describe/it/expect`, `vi` mocks/spies)
+- Test files live in `apps/control-plane/test/*.spec.ts`
+- Use **Given / When / Then** naming: `GIVEN_<context>_WHEN_<action>_THEN_<expected>`
+- Maintain coverage thresholds: Lines ≥70%, Branches ≥70%, Functions ≥85%
+- Time-dependent tests must use `vi.useFakeTimers()`
+- No real filesystem I/O in unit tests outside of `tmp` directories; otherwise mock it with `vi.mock('node:fs/promises')`
+### 0.2 Guiding Constraints
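+- **No breaking tool contract changes.** Input/output schemas for all 33 MCP tools remain unchanged, apart from additive, backward-compatible extensions such as the optional `role` input that PER-T-008 adds to `feature.get_context`.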
+- **No index/state schema changes.** On-disk artifact formats stay compatible.
+- **No behavioral changes.** Parallelizing reads must not alter ordering of writes or state transitions.
+- **All existing tests must remain green** after each task.
+- **Each task is independently mergeable** — do not batch unrelated changes.
+---
+## 1. Objectives
+### 1.1 Must-Have Outcomes
+- Reduce per-orchestration-iteration latency by eliminating unnecessary sequential I/O in wave executors.
+- Eliminate the duplicate context fetch on gate failure in `BuildWaveExecutor`.
+- Bound evidence directory growth with a configurable retention policy.
+- Reduce agent context window token usage by 40–60% through role-specific projections and pre-filtering.
+- Evict dead entries from `RunCoordinator.statusCache` to prevent slow memory growth over long runs.
+- Replace the `JSON.stringify` diff in `readIndex` with a cheap version guard.
+### 1.2 Non-Goals (Patterns Examined and Rejected)
+The following patterns were reviewed but are **not considered sub-optimal** given current constraints:
+- **Schema validation on every state write** (`kernel.ts`): Correctness is the priority; AJV-compiled validators are fast after first compilation. Sampling or skipping would risk silent schema drift.
+- **AJV lazy compilation** (`schemas.ts`): First-call compilation is a one-time cost per schema type; the cache works correctly afterwards, and eager preloading adds startup latency without runtime benefit.
+- **Session activation polling loop** (`kernel.ts`): Fixed 250ms polling bounded by a 5s timeout is an internal synchronization mechanism on a cold path, not a hot loop.
+- **Object spread in `makeDefaultState`** (`kernel.ts`): Feature initialization is a cold path; spread of small constant-size default objects is negligible.
+---
+## 2. Problem Analysis
+Findings are organized by category. Numbers assume 10 active features over 3 orchestration iterations unless otherwise noted. Each finding cross-references its implementation task in Section 3.
+### 2.1 Runtime / I/O Hotspots
+#### Finding I-1 — `featureGetContext`: 5 sequential disk reads per call (HIGH) → PER-T-001
+**File:** `apps/control-plane/src/application/services/feature-lifecycle-service.ts:142–146`
+```typescript
+const state    = await this.port.featureStateGet(featureId);   // read 1
+const plan     = await this.port.planGet(featureId);            // read 2
+const qaIndex  = await this.port.qaTestIndexGet(featureId);     // read 3
+const evidence = await this.port.evidenceLatest(featureId);     // read 4
+const specText = await fs.readFile(this.port.specPath(...));    // read 5
+```
+All five reads are sequential `await`s with zero data dependencies on each other. Under MCP transport every read incurs a full round-trip. This function is called at least once per feature per wave by all three wave executors, plus a second time in `BuildWaveExecutor` on gate failure.
+**Quantified impact:** 10 features × 3 waves × 5 reads = 150 serial I/O calls per iteration that could be reduced to 30 parallel batches.
+---
+#### Finding I-2 — `BuildWaveExecutor`: duplicate full context fetch on gate failure (HIGH) → PER-T-003
+**File:** `apps/control-plane/src/supervisor/build-wave-executor.ts:42` and `95`
+The context bundle fetched before the initial `workerDecisionRunner.execute` call (line 42) remains valid through the gate failure event. A second identical `FEATURE_GET_CONTEXT` call at line 95 re-fetches the same 50KB+ bundle unconditionally before entering the repair retry loop.
+**Quantified impact:** With 5 retry attempts per failing feature × 50 KB context = 250 KB of redundant I/O per failing feature per wave.
+---
+#### Finding I-3 — Build + QA wave executors: sequential state-filter loops (HIGH) → PER-T-002
+**Files:** `apps/control-plane/src/supervisor/build-wave-executor.ts:30–38`, `qa-wave-executor.ts:54–62`
+Both executors loop sequentially over all active features to filter by status before any gate work begins. With 20 active features this is 20 serial `FEATURE_STATE_GET` tool calls just to produce a smaller subset.
+**Quantified impact:** 20 features × 2 wave executors = 40 serial tool calls per iteration that could be 2 parallel batches.
+---
+#### Finding I-4 — `reportDashboard`: sequential state + cost reads per feature (MEDIUM) → PER-T-004
+**File:** `apps/control-plane/src/application/services/reporting-service.ts:81–108`
+The dashboard loop reads `state.md` and the cost JSON file for each feature one after another. It is called once per orchestration cycle via `TOOLS.REPORT_DASHBOARD` and also from the CLI `status` command.
+**Quantified impact:** 50 features × 2 reads = 100 serial file reads per `reportDashboard` call.
+---
+#### Finding I-5 — `evidenceLatest`: O(N) stat scan to find newest file (MEDIUM) → PER-T-005
+**File:** `apps/control-plane/src/application/services/gate-service.ts:236–256`
+On every `featureGetContext` call, `evidenceLatest` lists the entire evidence directory, calls `fs.stat()` on every `.json` file to read mtimes, sorts by mtime, then reads the newest file. The evidence directory grows by 1–3 files per gate run (one per profile per retry).
+Critically, `gates.ts:430` already writes a `latest.json` sentinel on every gate run. `evidenceLatest` ignores it entirely and rediscovers the newest file the hard way every call.
+**Quantified impact:** 50 retries × 3 profiles = 150 evidence files per feature, so each context fetch issues 150 `stat()` calls. Across 10 features that is 1,500 `stat()` calls per wave, all avoidable by reading `latest.json` directly.
+---
+#### Finding I-6 — `readIndex`: `JSON.stringify` comparison on every call (MEDIUM) → PER-T-011
+**File:** `apps/control-plane/src/core/kernel.ts:761`
+```typescript
+const changed = JSON.stringify(existing ?? null) !== JSON.stringify(normalized);
+```
+`readIndex` is called on every orchestration cycle and every state transition. Serializing a 20–50 KB index object twice on every call is wasted CPU — schema migration (`changed === true`) only occurs after a code change that alters `normalizeIndexShape`.
+---
+#### Finding I-7 — `collisionsScan`: O(n²) plan comparisons with `JSON.stringify` in inner loop (MEDIUM) — Deferred
+**File:** `apps/control-plane/src/application/services/reporting-service.ts:40–68`
+`collisionsScan` reads every feature's `plan.json` (N disk reads via `collectAcceptedPlans`), then compares every pair (N(N-1)/2 `detectPlanCollisions` calls). `createCollisionFingerprint` runs `JSON.stringify(collisions)` inside the inner loop. For 100 features: 100 file reads + 4,950 comparisons.
+**Status:** Deferred from this spec. Preferred fix is event-driven: compute the collision matrix at plan-submission time and cache it, rather than recomputing on every dashboard poll. Requires a design change to the plan-submission flow.
+---
+#### Finding I-8 — `QaWaveExecutor`: `loadRolePrompts` called inside per-feature loop (LOW) → PER-T-014
+**File:** `apps/control-plane/src/supervisor/qa-wave-executor.ts:196`
+`loadRolePrompts` is called once per feature inside the QA batch loop. `PromptBundleLoader` caches after the first load so subsequent calls are cheap — but the placement inside the loop is structurally fragile. If the cache is externally invalidated, each iteration independently hits the filesystem, and the regression is invisible.
+---
+### 2.2 Memory / Allocation Hotspots
+#### Finding M-1 — `normalizeSet` copy-pasted in 4 files (LOW) → PER-T-013
+**Files:** `reporting-service.ts:7–9`, `merge-service.ts:17–19`, `feature-lifecycle-service.ts:15–17`, `feature-deletion-service.ts:67–69`
+The same 3-line function (deduplicate + sort string array) is independently defined in four service files. The four copies must be kept in sync: any change to deduplication or sort semantics has to be applied in all four places.
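
A shared helper could look like the sketch below. It assumes the existing copies deduplicate with a `Set` and sort with the default string comparison; those semantics are an assumption and should be confirmed against the four current definitions before consolidating.

```typescript
// Hypothetical shared helper; the real copies may differ in sort semantics.
function normalizeSet(values: readonly string[]): string[] {
  // Set drops duplicates; Array.from produces a fresh array, so the input is not mutated.
  const unique = Array.from(new Set(values));
  // The default sort compares UTF-16 code units, so the result is locale-independent.
  return unique.sort();
}
```

Because the function copies before sorting, callers can pass arrays they still own without risking in-place reordering.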
+---
+#### Finding M-2 — `structuredClone(plan)` for a 5-field mutation (LOW) → PER-T-012
+**File:** `apps/control-plane/src/supervisor/planning-wave-executor.ts:298`
+`buildUpdatedPlan` uses `structuredClone` to deep-copy the entire plan object, then immediately overwrites 5 top-level fields. Since those fields are scalars or wholly replaced arrays (not mutated in-place), a shallow object spread produces identical semantics at a fraction of the cost.
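
The equivalence can be illustrated with a small sketch. The `Plan` shape and the function name below are hypothetical stand-ins, not the real plan schema. The point is only that fields replaced wholesale never touch the original object, so untouched nested data can be shared as long as nothing mutates it in place.

```typescript
// Hypothetical plan shape, for illustration only.
interface Plan {
  plan_version: number;
  summary: string;
  files: string[];
  tasks: { id: string; done: boolean }[];
}

function buildUpdatedPlanShallow(plan: Plan, summary: string, files: string[]): Plan {
  // Spread copies top-level references; the overwritten fields get new values,
  // while untouched fields (here `tasks`) are shared with the original.
  return { ...plan, plan_version: plan.plan_version + 1, summary, files };
}
```

If any caller later mutates a shared nested field such as `tasks` in place, the change would be visible through both objects, so the shallow copy is only safe under the no-in-place-mutation condition stated in the finding.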
+---
+#### Finding M-3 — `statusCache` entries never evicted for terminal features (MEDIUM) → PER-T-010
+**File:** `apps/control-plane/src/supervisor/run-coordinator.ts:61–62, 272`
+`statusCache` accumulates one entry per feature that ever transitions state. When `closeFeatureCluster` is called for a terminal feature (`MERGED`, `FAILED`, `PAUSED_BUDGET`), the Map entry is not deleted. Over a long run processing hundreds of features, the Map holds dead entries indefinitely.
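
A minimal sketch of the eviction this finding calls for, assuming `statusCache` is a plain `Map` keyed by feature ID. The class and the `recordStatus` method are simplified stand-ins; only `statusCache`, `closeFeatureCluster`, and the three status names come from the finding.

```typescript
const TERMINAL_STATUSES = new Set<string>(['MERGED', 'FAILED', 'PAUSED_BUDGET']);

class RunCoordinatorSketch {
  // One entry per feature that has transitioned state.
  readonly statusCache = new Map<string, string>();

  recordStatus(featureId: string, status: string): void {
    this.statusCache.set(featureId, status);
  }

  closeFeatureCluster(featureId: string): void {
    const status = this.statusCache.get(featureId);
    // Drop the entry once the feature has reached a terminal status.
    if (status !== undefined && TERMINAL_STATUSES.has(status)) {
      this.statusCache.delete(featureId);
    }
  }
}
```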
+---
+#### Finding M-4 — Evidence directory grows unbounded across gate retry cycles (HIGH) → PER-T-006
+**File:** `apps/control-plane/src/core/gates.ts:428–430`
+Every gate run appends a new timestamped JSON file to `.aop/features/<id>/evidence/`. With 5 retry cycles × 3 gate profiles per feature the directory accumulates 15+ files per feature with no pruning. This directly worsens Finding I-5: the `stat()` scan cost scales linearly with the number of accumulated files.
+---
+### 2.3 Agent Context Size
+#### Finding C-1 — Full QA index + full evidence + full state delivered to every agent unconditionally (HIGH) → PER-T-008, PER-T-009
+**File:** `apps/control-plane/src/application/services/feature-lifecycle-service.ts:132–158`
+The context bundle returned to every agent role includes the complete QA test index (all test records, including passed), the full gate evidence JSON (verbose stdout/stderr), and the complete state frontmatter (all historical lock, gate, and PR metadata). A feature with 200 tests and 50 gate retry cycles produces a bundle exceeding 80–100 KB — roughly 20,000–25,000 tokens per agent invocation.
+Relevant bloat by field:
+| Field | Problem |
+|-------|---------|
+| `qa_test_index` | All test records including passed — agents only need `summary` + `failed` + `pending` |
+| `latest_evidence` | Full verbose gate output — agents need overall result + top failing steps |
+| `state.front_matter` | All historical locks/gates/PR metadata — agents need current status + held locks |
+| `plan` | Full plan when builder/QA roles only need current-phase tasks |
+---
+#### Finding C-2 — `PlanningWaveExecutor` fetches full context for non-planning features (MEDIUM) → PER-T-007
+**File:** `apps/control-plane/src/supervisor/planning-wave-executor.ts:125–133`
+`run()` fetches the full 50KB context bundle for every active feature, then immediately discards it if status is not `PLANNING` or `BLOCKED`. For 10 active features where 8 are in `BUILDING` status, 8 full context bundles (400 KB total) are fetched and thrown away on each planning wave.
+---
+#### Finding C-3 — Decision log entries embed raw `JSON.stringify` of complex objects (LOW) — Deferred
+**File:** `apps/control-plane/src/supervisor/planning-wave-executor.ts:313`
+`appendDecisionLog` serializes an `AnyRecord` containing `qa_summary`, `edge_case_checklist`, `reasons`, and `latest_gate_overall` as a raw JSON string in the decisions log. These accumulate linearly with iteration count. If decisions logs are surfaced to agents in future context bundles, this becomes a context-size problem.
+**Status:** Deferred. No agents currently consume decisions logs directly; addressing this requires a typed `DecisionLogEntry` interface and a schema for the log format, which is out of scope for this spec.
+---
+#### Finding C-4 — `loadRolePrompts` call placement risks sending prompts redundantly (LOW) → PER-T-014
+See Finding I-8. The structural fix (hoist above the loop) also ensures prompts are loaded once and passed at session creation time rather than risking re-loading mid-batch.
+---
+### 2.4 Complete Findings Table
+| ID | File | Lines | Severity | Category | Task |
+|----|------|-------|----------|----------|------|
+| I-1 | `feature-lifecycle-service.ts` | 142–146 | HIGH | I/O | PER-T-001 |
+| I-2 | `build-wave-executor.ts` | 42, 95 | HIGH | I/O | PER-T-003 |
+| I-3 | `build-wave-executor.ts`, `qa-wave-executor.ts` | 30–38, 54–62 | HIGH | I/O | PER-T-002 |
+| I-4 | `reporting-service.ts` | 81–108 | MEDIUM | I/O | PER-T-004 |
+| I-5 | `gate-service.ts` | 236–256 | MEDIUM | I/O | PER-T-005 |
+| I-6 | `kernel.ts` | 761 | MEDIUM | CPU | PER-T-011 |
+| I-7 | `reporting-service.ts` | 40–68 | MEDIUM | CPU/I/O | Deferred |
+| I-8 | `qa-wave-executor.ts` | 196 | LOW | I/O | PER-T-014 |
+| M-1 | 4 service files | various | LOW | Memory | PER-T-013 |
+| M-2 | `planning-wave-executor.ts` | 298 | LOW | Memory | PER-T-012 |
+| M-3 | `run-coordinator.ts` | 61–62, 272 | MEDIUM | Memory | PER-T-010 |
+| M-4 | `core/gates.ts` | 428–430 | HIGH | Disk | PER-T-006 |
+| C-1 | `feature-lifecycle-service.ts` | 132–158 | HIGH | Context | PER-T-008, PER-T-009 |
+| C-2 | `planning-wave-executor.ts` | 125–133 | MEDIUM | Context | PER-T-007 |
+| C-3 | `planning-wave-executor.ts` | 313 | LOW | Context | Deferred |
+| C-4 | `qa-wave-executor.ts`, `prompt-bundle-loader.ts` | 196, 15–52 | LOW | Context | PER-T-014 |
+---
+## 3. Milestone Plan
+### PER-M1: Parallel I/O in Wave Executors (Highest Impact)
+**Goal:** Eliminate sequential awaits where there are no data dependencies. No behavioral changes — only I/O ordering changes; write ordering is unaffected.
+---
+#### PER-T-001: Parallelize the 5 sequential reads in `featureGetContext`
+**Fixes:** Finding I-1
+**File:** `apps/control-plane/src/application/services/feature-lifecycle-service.ts`
+**Lines:** 142–146
+**Before:**
+```typescript
+const state    = await this.port.featureStateGet(featureId);
+const plan     = await this.port.planGet(featureId);
+const qaIndex  = await this.port.qaTestIndexGet(featureId);
+const evidence = await this.port.evidenceLatest(featureId);
+const specText = await fs.readFile(this.port.specPath(featureId), 'utf8').catch(() => '');
+```
+**After:**
+```typescript
+const [state, plan, qaIndex, evidence, specText] = await Promise.all([
+  this.port.featureStateGet(featureId),
+  this.port.planGet(featureId),
+  this.port.qaTestIndexGet(featureId),
+  this.port.evidenceLatest(featureId),
+  fs.readFile(this.port.specPath(featureId), 'utf8').catch(() => '')
+]);
+```
+**Tests to write:** `apps/control-plane/test/feature-lifecycle-service.spec.ts`
+- `GIVEN_featureGetContext_WHEN_called_THEN_all_reads_are_parallel` — assert each port method is called exactly once and all are called before any result is consumed.
+**Acceptance criteria:**
+1. `featureGetContext` returns identical output before and after.
+2. All four port read spies and the `fs.readFile` spy are called; no ordering between the individual calls is required.
+3. An I/O error in any port read propagates as a rejected promise (existing behavior); a failed spec read still falls back to `''` via its `.catch`.
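
One way to check the parallelism criterion without relying on timers is to count how many reads are in flight at once. The sketch below is framework-free and uses hypothetical names (`tracker`, `makeTrackedRead`, `readAllParallel`); in the real suite the same idea would be expressed with `vi.fn()` spies.

```typescript
// Tracks how many fake reads are in flight at the same time.
const tracker = { inFlight: 0, maxInFlight: 0, calls: 0 };

function makeTrackedRead<T>(value: T): () => Promise<T> {
  return async () => {
    tracker.calls += 1;
    tracker.inFlight += 1;
    tracker.maxInFlight = Math.max(tracker.maxInFlight, tracker.inFlight);
    await Promise.resolve(); // yield so sibling reads can start
    tracker.inFlight -= 1;
    return value;
  };
}

async function readAllParallel<T>(reads: Array<() => Promise<T>>): Promise<T[]> {
  // Every read is started before the first await, mirroring the Promise.all pattern.
  return Promise.all(reads.map((read) => read()));
}
```

With a sequential `for`/`await` loop the same tracker would never report more than one read in flight, which is what makes the counter a useful regression signal.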
+---
+#### PER-T-002: Parallelize the status-filter loops in `BuildWaveExecutor` and `QaWaveExecutor`
+**Fixes:** Finding I-3
+**Files:**
+- `apps/control-plane/src/supervisor/build-wave-executor.ts` (lines 30–38)
+- `apps/control-plane/src/supervisor/qa-wave-executor.ts` (lines 54–62)
+**Before (BuildWaveExecutor):**
+```typescript
+const batch: string[] = [];
+for (const featureId of featureIds) {
+  const state = await this.toolCaller.callTool<FeatureStatePayload>('builder', TOOLS.FEATURE_STATE_GET, {
+    feature_id: featureId
+  });
+  if (state.data.front_matter.status === STATUS.BUILDING) {
+    batch.push(featureId);
+  }
+}
+```
+**After (BuildWaveExecutor):**
+```typescript
+const states = await Promise.all(
+  featureIds.map((featureId) =>
+    this.toolCaller.callTool<FeatureStatePayload>('builder', TOOLS.FEATURE_STATE_GET, {
+      feature_id: featureId
+    })
+  )
+);
+const batch = featureIds.filter((_, i) => states[i].data.front_matter.status === STATUS.BUILDING);
+```
+Apply the same transformation to `QaWaveExecutor` substituting role `'qa'` and status `STATUS.QA`.
+**Tests to write:** `apps/control-plane/test/batch-operations.spec.ts` (existing file — add cases)
+- `GIVEN_BuildWaveExecutor_run_WHEN_multiple_features_THEN_state_reads_are_parallel`
+- `GIVEN_QaWaveExecutor_run_WHEN_multiple_features_THEN_state_reads_are_parallel`
+**Acceptance criteria:**
+1. Identical `batch` array produced before and after (same filter logic, same contents).
+2. All `FEATURE_STATE_GET` calls are issued concurrently (every call is started before any of them resolves).
+3. `selected` slice behavior (`batch.slice(0, maxParallelGateRuns)`) is unchanged.
+---
+#### PER-T-003: Eliminate duplicate `FEATURE_GET_CONTEXT` fetch in `BuildWaveExecutor` retry path
+**Fixes:** Finding I-2
+**File:** `apps/control-plane/src/supervisor/build-wave-executor.ts`
+**Lines:** 40–52 (initial fetch), 94–97 (duplicate fetch)
+The `context` variable captured at line 42 is valid through the gate failure. Re-fetching it at line 95 duplicates the entire I/O bundle for no benefit.
+**Before:**
+```typescript
+// Line 41–52: first fetch used for initial workerDecisionRunner call
+for (const featureId of selected) {
+  const context = await this.toolCaller.callTool('builder', TOOLS.FEATURE_GET_CONTEXT, {
+    feature_id: featureId
+  });
+  await this.workerDecisionRunner.execute({ ..., contextBundle: context.data, ... });
+}
+// Line 54+: separate Promise.all map — context is re-fetched inside
+const executing = selected.map(async (featureId) => {
+  ...
+  if (this.reactionsService && gateOverall === GATE_RESULT.FAIL) {
+    const context = await this.toolCaller.callTool('builder', TOOLS.FEATURE_GET_CONTEXT, {  // DUPLICATE
+      feature_id: featureId
+    });
+    ...
+  }
+});
+```
+**After:** Hoist the context fetch into the `executing` map so a single variable serves both the initial decision loop and the retry loop. The sequential `for` loop over `selected` (lines 41–52) is eliminated entirely.
+```typescript
+const executing = selected.map(async (featureId) => {
+  const context = await this.toolCaller.callTool('builder', TOOLS.FEATURE_GET_CONTEXT, {
+    feature_id: featureId
+  });
+  await this.workerDecisionRunner.execute({
+    role: 'builder',
+    featureId,
+    contextBundle: context.data,
+    instructions: '...'
+  });
+  const stateForRetry = await this.toolCaller.callTool<FeatureStatePayload>('builder', TOOLS.FEATURE_STATE_GET, {
+    feature_id: featureId
+  });
+  // ... retry loop reuses context.data from above
+});
+await Promise.allSettled(executing);
+```
+**Tests to write:** `apps/control-plane/test/batch-operations.spec.ts`
+- `GIVEN_BuildWaveExecutor_gate_fails_WHEN_retry_loop_runs_THEN_FEATURE_GET_CONTEXT_called_once_per_feature`
+**Acceptance criteria:**
+1. `FEATURE_GET_CONTEXT` is called exactly once per feature per `run()` invocation regardless of gate outcome.
+2. The same `context.data` is used for both the initial decision loop and the repair retry loop.
+3. Gate retry behavior (shouldRetry, recordRetry, escalate) is identical.
+---
+#### PER-T-004: Parallelize state and cost reads in `reportDashboard`
+**Fixes:** Finding I-4
+**File:** `apps/control-plane/src/application/services/reporting-service.ts`
+**Lines:** 81–108
+**Before:**
+```typescript
+const features = [];
+for (const featureId of featureIds) {
+  const statePath = this.port.statePath(featureId);
+  if (!(await pathExists(statePath))) {
+    continue;
+  }
+  const state = await this.port.readState(featureId);
+  const costData = await readJson<...>(this.port.featureCostPath(featureId), null);
+  features.push({ ... });
+}
+```
+**After:**
+```typescript
+const features = (
+  await Promise.all(
+    featureIds.map(async (featureId) => {
+      const statePath = this.port.statePath(featureId);
+      if (!(await pathExists(statePath))) {
+        return null;
+      }
+      const [state, costData] = await Promise.all([
+        this.port.readState(featureId),
+        readJson<{ estimated_cost_usd: number; tokens_used: number }>(this.port.featureCostPath(featureId), null)
+      ]);
+      return {
+        feature_id: featureId,
+        status: state.frontMatter.status,
+        branch: typeof state.frontMatter.worktree_branch === 'string'
+          ? state.frontMatter.worktree_branch
+          : typeof state.frontMatter.branch === 'string'
+            ? state.frontMatter.branch
+            : null,
+        locks: readHeldLocks(state.frontMatter),
+        gate_profile: state.frontMatter.gate_profile,
+        gates: state.frontMatter.gates,
+        pr: state.frontMatter.pr ?? null,
+        last_updated: state.frontMatter.last_updated,
+        activity_state: state.frontMatter.activity_state,
+        activity_last_event_at: state.frontMatter.activity_last_event_at,
+        activity_detected_via: state.frontMatter.activity_detected_via,
+        cost: costData
+          ? { estimated_cost_usd: costData.estimated_cost_usd, tokens_used: costData.tokens_used }
+          : null
+      };
+    })
+  )
+).filter((entry): entry is NonNullable<typeof entry> => entry !== null);
+```
+**Tests to write:** `apps/control-plane/test/services.spec.ts` (existing) or new `reporting-service.spec.ts`
+- `GIVEN_reportDashboard_WHEN_multiple_features_exist_THEN_reads_are_parallel`
+- `GIVEN_reportDashboard_WHEN_state_file_missing_THEN_feature_is_omitted`
+**Acceptance criteria:**
+1. Returned `features` array contains identical entries in the same order as before.
+2. Features without a state file are omitted from output (unchanged behavior).
+3. `readState` and `readJson` (cost) are called in parallel per feature.
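
The ordering criterion holds because `Promise.all` resolves to results in input order, not completion order. A small sketch of that property, using a hypothetical in-memory `fakeStates` map in place of the state files and a simplified `DashboardEntry` shape:

```typescript
interface DashboardEntry {
  feature_id: string;
  status: string;
}

// Hypothetical in-memory stand-in for per-feature state files; `b` has none.
const fakeStates: Record<string, string | undefined> = { a: 'BUILDING', b: undefined, c: 'QA' };

async function loadEntry(featureId: string, yields: number): Promise<DashboardEntry | null> {
  // Yield a different number of times per feature so completion order varies.
  for (let i = 0; i < yields; i++) {
    await Promise.resolve();
  }
  const status = fakeStates[featureId];
  return status === undefined ? null : { feature_id: featureId, status };
}

async function collectEntries(featureIds: string[]): Promise<DashboardEntry[]> {
  // Earlier IDs yield more often, so they finish later than the IDs after them.
  const entries = await Promise.all(
    featureIds.map((id, i) => loadEntry(id, featureIds.length - i))
  );
  // Missing features come back as null and are dropped; relative order is preserved.
  return entries.filter((entry): entry is DashboardEntry => entry !== null);
}
```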
+---
+### PER-M2: Evidence Directory Efficiency
+**Goal:** Stop scanning the full evidence directory on every context fetch. Read the `latest.json` sentinel that `gates.ts` already writes. Add a retention policy so the directory stays bounded.
+---
+#### PER-T-005: Read `latest.json` directly in `evidenceLatest` instead of scanning the directory
+**Fixes:** Finding I-5
+**File:** `apps/control-plane/src/application/services/gate-service.ts`
+**Lines:** 224–266
+`gates.ts` already writes two files on every gate run:
+- `evidence/<timestamp>-<profile>.json` — the timestamped archive record
+- `latest.json` — always the most recent result
+`evidenceLatest` ignores `latest.json` and instead lists the entire directory, stats every file, sorts by mtime, and reads the newest. This is O(N) in the number of historical evidence files.
+**Before:**
+```typescript
+const files = (await fs.readdir(evidenceDir))
+  .filter((file) => file.endsWith('.json'))
+  .map((file) => path.join(evidenceDir, file));
+if (files.length === 0) { return { data: { feature_id: featureId, latest: null } }; }
+const withStats = await Promise.all(
+  files.map(async (file) => {
+    const stat = await fs.stat(file);
+    return { file, mtimeMs: stat.mtimeMs };
+  })
+);
+withStats.sort((a, b) => b.mtimeMs - a.mtimeMs);
+const latestFile = withStats[0].file;
+const latest = JSON.parse(await fs.readFile(latestFile, 'utf8'));
+```
+**After:**
+```typescript
+const latestPath = path.join(evidenceDir, 'latest.json');
+if (!(await pathExists(latestPath))) {
+  return { data: { feature_id: featureId, latest: null } };
+}
+const latest = await readJson<AnyRecord>(latestPath, null);
+```
+The method drops from O(N) stat calls to O(1). Add a port method `evidenceLatestPath(featureId: string): string` to `GateServicePort` if not already present.
+**Tests to write:** `apps/control-plane/test/gate-service.spec.ts` (create if not existing)
+- `GIVEN_evidenceLatest_WHEN_latest_json_exists_THEN_returns_its_contents`
+- `GIVEN_evidenceLatest_WHEN_no_evidence_dir_exists_THEN_returns_null`
+- `GIVEN_evidenceLatest_WHEN_latest_json_missing_THEN_returns_null`
+**Acceptance criteria:**
+1. `evidenceLatest` reads only `latest.json`; no `readdir` or `stat` calls.
+2. Return shape `{ data: { feature_id, latest, path? } }` is identical to the existing interface.
+3. Existing tests that rely on `latest` content remain green.
+---
+#### PER-T-006: Add configurable evidence retention policy
+**Fixes:** Finding M-4
+**Files:**
+- `agentic/orchestrator/policy.yaml` — add `evidence_retention_count` field
+- `agentic/orchestrator/schemas/policy.schema.json` — add field to schema
+- `apps/control-plane/src/core/gates.ts` — prune after writing new evidence file
+**Policy change (`policy.yaml`):**
+```yaml
+cleanup:
+  grace_period_seconds: 300
+  auto_after_merge: true
+  evidence_retention_count: 10   # NEW: keep the N most recent evidence files per feature
+```
+**Schema change (`schemas/policy.schema.json`):** Add to the `cleanup` object properties:
+```json
+"evidence_retention_count": {
+  "type": "integer",
+  "minimum": 1,
+  "maximum": 100,
+  "default": 10,
+  "description": "Maximum number of timestamped evidence files to retain per feature. Oldest are pruned after each gate run."
+}
+```
+**Implementation in `gates.ts`** — after writing the new evidence file (currently line 428), add pruning:
+```typescript
+await pruneEvidenceFiles(evidenceDir, retentionCount);
+async function pruneEvidenceFiles(dir: string, keep: number): Promise<void> {
+  const entries = await fs.readdir(dir);
+  // Only prune timestamped files; never prune latest.json
+  const archived = entries
+    .filter((f) => f.endsWith('.json') && f !== 'latest.json')
+    .map((f) => ({ name: f, path: path.join(dir, f) }));
+  if (archived.length <= keep) return;
+  // Sort by name ascending — timestamp-prefixed filenames sort chronologically
+  archived.sort((a, b) => a.name.localeCompare(b.name));
+  const toDelete = archived.slice(0, archived.length - keep);
+  await Promise.all(toDelete.map(({ path: p }) => fs.unlink(p).catch(() => undefined)));
+}
+```
+The `retentionCount` value is read from the policy snapshot, defaulting to `10` if absent.
+**Tests to write:** `apps/control-plane/test/incremental-gates.spec.ts` (existing) — add cases:
+- `GIVEN_pruneEvidenceFiles_WHEN_archived_count_exceeds_retention_THEN_oldest_are_deleted`
+- `GIVEN_pruneEvidenceFiles_WHEN_archived_count_within_retention_THEN_nothing_deleted`
+- `GIVEN_pruneEvidenceFiles_WHEN_latest_json_present_THEN_it_is_never_pruned`
+**Acceptance criteria:**
+1. After every gate run, the evidence directory contains at most `evidence_retention_count` timestamped files.
+2. `latest.json` is never deleted by pruning.
+3. Pruning occurs only after the new evidence file is successfully written.
+4. Default `evidence_retention_count` of 10 applies when the field is absent from `policy.yaml`.
+5. Policy schema validates correctly; invalid values (less than 1, greater than 100, or non-integer) are rejected.
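
The name-based ordering in `pruneEvidenceFiles` is chronological only if the timestamp prefix has a fixed width (for example, zero-padded ISO 8601). The sketch below isolates the selection step as a pure function so that assumption can be unit-tested without touching the filesystem; `selectEvidenceToPrune` and the sample filenames used to exercise it are hypothetical.

```typescript
function selectEvidenceToPrune(fileNames: readonly string[], keep: number): string[] {
  // Only timestamped archives are candidates; latest.json is never pruned.
  const archived = fileNames.filter((f) => f.endsWith('.json') && f !== 'latest.json');
  if (archived.length <= keep) return [];
  // Plain code-unit comparison: matches chronological order only for
  // fixed-width, zero-padded timestamp prefixes (an assumption here).
  const sorted = archived.slice().sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  // Everything except the newest `keep` entries is returned for deletion.
  return sorted.slice(0, sorted.length - keep);
}
```

Keeping the selection separate from the `fs.unlink` calls also lets the three pruning tests listed above run against plain string arrays.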
+---
+### PER-M3: Agent Context Slimming
+**Goal:** Reduce the token footprint of context bundles delivered to agents. The biggest savings come from role-specific projections and pre-filtering non-relevant features before fetching full context.
+---
+#### PER-T-007: Pre-filter non-planning features in `PlanningWaveExecutor` before fetching full context
+**Fixes:** Finding C-2
+**File:** `apps/control-plane/src/supervisor/planning-wave-executor.ts`
+**Lines:** 124–153 (`run`) and 156–228 (`runPostQaReconciliation`)
+`run()` calls `FEATURE_GET_CONTEXT` (full 50KB+ bundle) for every active feature, then immediately discards it if status is not `PLANNING` or `BLOCKED`. For 10 features where 8 are in `BUILDING`, this wastes 400 KB of I/O per planning wave.
+**Before:**
+```typescript
+async run(featureIds: string[]): Promise<void> {
+  for (const featureId of featureIds) {
+    const context = await this.toolCaller.callTool<FeatureContextPayload>('planner', TOOLS.FEATURE_GET_CONTEXT, {
+      feature_id: featureId
+    });
+    const state = context.data.state.front_matter;
+    if (state.status !== STATUS.PLANNING && state.status !== STATUS.BLOCKED) {
+      continue;
+    }
+    // ...use context
+  }
+}
+```
+**After:**
+```typescript
+async run(featureIds: string[]): Promise<void> {
+  // Phase 1: Batch-fetch lightweight state to identify planning features
+  const states = await Promise.all(
+    featureIds.map((featureId) =>
+      this.toolCaller.callTool<FeatureStatePayload>('planner', TOOLS.FEATURE_STATE_GET, {
+        feature_id: featureId
+      })
+    )
+  );
+  const planningFeatureIds = featureIds.filter((_, i) => {
+    const status = states[i].data.front_matter.status;
+    return status === STATUS.PLANNING || status === STATUS.BLOCKED;
+  });
+  // Phase 2: Fetch full context only for features that need planning
+  for (const featureId of planningFeatureIds) {
+    const context = await this.toolCaller.callTool<FeatureContextPayload>('planner', TOOLS.FEATURE_GET_CONTEXT, {
+      feature_id: featureId
+    });
+    // ...rest unchanged
+  }
+}
+```
+Apply the same pre-filter pattern to `runPostQaReconciliation`, substituting `isPostQaStatus` as the predicate.
+**Tests to write:** `apps/control-plane/test/planning-wave-executor.spec.ts` (existing — add cases)
+- `GIVEN_run_WHEN_features_not_in_planning_status_THEN_FEATURE_GET_CONTEXT_not_called`
+- `GIVEN_run_WHEN_features_in_planning_THEN_FEATURE_GET_CONTEXT_called_only_for_those`
+- `GIVEN_runPostQaReconciliation_WHEN_features_not_in_post_qa_status_THEN_context_not_fetched`
+**Acceptance criteria:**
+1. `FEATURE_GET_CONTEXT` is never called for features not in `PLANNING`, `BLOCKED`, `QA`, or `READY_TO_MERGE` status (as applicable per method).
+2. `FEATURE_STATE_GET` batch calls are issued in parallel.
+3. Planning logic for features that reach the context fetch is unchanged.
+---
+#### PER-T-008: Introduce role-specific context projections in `featureGetContext`
+**Fixes:** Finding C-1
+**Files:**
+- `apps/control-plane/src/application/services/feature-lifecycle-service.ts`
+- `apps/control-plane/src/core/constants.ts` (add new tool constant)
+- `agentic/orchestrator/tools/catalog.json` (register new tool)
+- Tool input/output schemas under `agentic/orchestrator/tools/schemas/`
+Add an optional `role` parameter to the existing `feature.get_context` tool (backward-compatible: defaults to `'full'`):
+```jsonc
+// Input schema addition (fragment of the tool's input `properties`)
+{
+  "feature_id": { "type": "string" },
+  "role": {
+    "type": "string",
+    "enum": ["full", "planner", "builder", "qa"],
+    "default": "full",
+    "description": "Returns a role-scoped projection of the context bundle to reduce token usage."
+  }
+}
+```
+**Context projections by role:**
+| Field | `full` | `planner` | `builder` | `qa` |
+|-------|--------|-----------|-----------|------|
+| `feature_id` | ✓ | ✓ | ✓ | ✓ |
+| `spec` | ✓ | ✓ | ✓ (trimmed to 8KB) | ✓ (trimmed to 4KB) |
+| `state.front_matter` | full | full | `status`, `branch`, `locks.held`, `gates`, `gate_profile` | `status`, `branch`, `gates`, `gate_profile` |
+| `plan` | full | full | `tasks` (current phase only), `acceptance_criteria` | `acceptance_criteria`, `risk` |
+| `qa_test_index` | full | `summary` only | `summary` only | `summary` + `failed` + `pending` (no `passed`) |
+| `latest_evidence` | full | `{ overall, profile }` | `{ overall, profile, failed_steps[0..4] }` | `{ overall, profile, failed_steps[0..9], coverage }` |
+**Implementation in `FeatureLifecycleService`:**
+```typescript
+async featureGetContext(featureId: string, role: 'full' | 'planner' | 'builder' | 'qa' = 'full') {
+  const [state, plan, qaIndex, evidence, specText] = await Promise.all([...]);  // PER-T-001
+  const projected = projectContext({ state, plan, qaIndex, evidence, specText }, role);
+  return { data: { feature_id: featureId, ...projected } };
+}
+function projectContext(bundle: FullContextBundle, role: string): ProjectedBundle {
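+  if (role === 'full') {
+    // No projection: return every field under the existing output key names
+    return {
+      spec: bundle.specText,
+      state: bundle.state,
+      plan: bundle.plan,
+      qa_test_index: bundle.qaIndex,
+      latest_evidence: bundle.evidence
+    };
+  }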
+  const spec = trimSpec(bundle.specText, specBudgetByRole[role]);
+  const state = projectState(bundle.state, role);
+  const plan = projectPlan(bundle.plan, role);
+  const qaIndex = projectQaIndex(bundle.qaIndex, role);
+  const evidence = projectEvidence(bundle.evidence, role);
+  return { spec, state, plan, qa_test_index: qaIndex, latest_evidence: evidence };
+}
+```
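`trimSpec` and `projectEvidence` are referenced above but not defined in this spec. One possible shape that follows the projection table is sketched below. Everything in it is an assumption: the `Evidence` type, measuring the spec budget in UTF-8 bytes, and the helper bodies themselves.

```typescript
// Sketch only: types, budgets, and helper bodies are assumptions derived
// from the projection table, not code from the repo.
type Role = 'full' | 'planner' | 'builder' | 'qa';

interface Evidence {
  overall: string;
  profile: string;
  failed_steps: string[];
  coverage?: number;
}

// Spec budgets in bytes; roles without an entry are not trimmed.
const specBudgetByRole: Partial<Record<Role, number>> = {
  builder: 8 * 1024,
  qa: 4 * 1024
};

/** Number of UTF-8 bytes needed to encode one code point. */
function utf8Size(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

/** Trim text to at most `budgetBytes` UTF-8 bytes without splitting a code point. */
function trimSpec(specText: string, budgetBytes: number | undefined): string {
  if (budgetBytes === undefined) return specText;
  let used = 0;
  let end = 0; // position in UTF-16 code units
  for (const ch of specText) {
    const size = utf8Size(ch.codePointAt(0) ?? 0);
    if (used + size > budgetBytes) break;
    used += size;
    end += ch.length;
  }
  return specText.slice(0, end);
}

/** Keep only the evidence fields the table lists for each role. */
function projectEvidence(evidence: Evidence, role: Role): Partial<Evidence> {
  const { overall, profile } = evidence;
  switch (role) {
    case 'full':
      return evidence;
    case 'planner':
      return { overall, profile };
    case 'builder':
      // failed_steps[0..4] is the first five entries
      return { overall, profile, failed_steps: evidence.failed_steps.slice(0, 5) };
    case 'qa':
      // failed_steps[0..9] is the first ten entries
      return {
        overall,
        profile,
        failed_steps: evidence.failed_steps.slice(0, 10),
        coverage: evidence.coverage
      };
  }
}
```

Trimming on code-point boundaries keeps the projected spec valid text even when the budget falls inside a multi-byte character.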
+Wave executors pass their role when calling context:
+```typescript
+// BuildWaveExecutor
+const context = await this.toolCaller.callTool('builder', TOOLS.FEATURE_GET_CONTEXT, {
+  feature_id: featureId,
+  role: 'builder'    // NEW
+});
+```
+**Tests to write:** `apps/control-plane/test/feature-lifecycle-service.spec.ts`
+- `GIVEN_featureGetContext_with_role_builder_WHEN_called_THEN_qa_index_is_summary_only`
+- `GIVEN_featureGetContext_with_role_qa_WHEN_called_THEN_passed_tests_are_omitted`
+- `GIVEN_featureGetContext_with_role_planner_WHEN_called_THEN_state_is_projected`
+- `GIVEN_featureGetContext_with_role_full_WHEN_called_THEN_bundle_is_unchanged`
+**Acceptance criteria:**
+1. `role: 'full'` (and omitting `role`) returns the existing complete bundle — no regression for callers that do not pass a role.
+2. `role: 'builder'` reduces `qa_test_index` to its `summary` key; no per-test records (`passed`, `failed`, or `pending`) are present.
+3. `role: 'qa'` omits passed test records; includes `failed_steps` in evidence (max 10).
+4. `role: 'planner'` projects `state` to `{ status, branch, locks, gates, gate_profile }`.
+5. Context byte size for `builder` and `qa` roles is ≤50% of `full` bundle when QA index has ≥50 test records.
+6. Input schema is updated and validated by the MCP tool runtime.
+---
+#### PER-T-009: Extract and test the `projectQaIndex` projection helper
+**Fixes:** Finding C-1 (supporting task for PER-T-008)
+**File:** `apps/control-plane/src/application/services/feature-lifecycle-service.ts`
+Extracted from PER-T-008 as a standalone, independently testable function:
+```typescript
+export function projectQaIndex(
+  qaIndex: QaIndexRecord,
+  role: 'full' | 'planner' | 'builder' | 'qa'
+): Partial<QaIndexRecord> {
+  if (role === 'full') return qaIndex;
+  if (role === 'planner') return { summary: qaIndex.summary };
+  if (role === 'builder') return { summary: qaIndex.summary };
+  // 'qa': return summary + failed + pending; omit passed
+  const { passed: _omitted, ...rest } = qaIndex;
+  return rest;
+}
+```
+**Tests to write:** `apps/control-plane/test/feature-lifecycle-service.spec.ts`
+- `GIVEN_projectQaIndex_role_qa_WHEN_index_has_passed_tests_THEN_passed_is_omitted`
+- `GIVEN_projectQaIndex_role_full_WHEN_called_THEN_index_is_unchanged`
+- `GIVEN_projectQaIndex_role_planner_WHEN_called_THEN_only_summary_returned`
+---
+### PER-M4: Memory & CPU Cleanup
+**Goal:** Fix the slow memory leak in `RunCoordinator`, eliminate redundant serialization in `readIndex`, replace `structuredClone` with a targeted spread, unify the four duplicated `normalizeSet` implementations, and hoist `loadRolePrompts` out of the per-feature loop in `QaWaveExecutor`.
+---
+#### PER-T-010: Evict `statusCache` entries on feature termination
+**Fixes:** Finding M-3
+**File:** `apps/control-plane/src/supervisor/run-coordinator.ts`
+**Lines:** 183–191 (`rebalanceActiveFeatures`)
+When `closeFeatureCluster(featureId)` is called for a terminal-status feature, the corresponding `statusCache` entry is never deleted. Over a long run processing hundreds of features, the Map accumulates dead entries indefinitely.
+**Before:**
+```typescript
+for (const featureId of sortedCurrent) {
+  const status = await this.readFeatureStatus(featureId);
+  if (status && RunCoordinator.TERMINAL_STATUSES.has(status)) {
+    await this.sessionOrchestrator.closeFeatureCluster(featureId);
+    continue;  // cache entry remains
+  }
+  survivingActiveFeatureIds.push(featureId);
+}
+```
+**After:**
+```typescript
+for (const featureId of sortedCurrent) {
+  const status = await this.readFeatureStatus(featureId);
+  if (status && RunCoordinator.TERMINAL_STATUSES.has(status)) {
+    await this.sessionOrchestrator.closeFeatureCluster(featureId);
+    this.statusCache.delete(featureId);   // NEW
+    continue;
+  }
+  survivingActiveFeatureIds.push(featureId);
+}
+```
+Apply the same eviction in the queue-drain loop (lines 193–207) where features popped from `this.state.queue` are found to already be terminal.
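Because the eviction has to happen in two loops, one way to keep them from drifting apart is a small shared helper that closes the cluster and evicts the cache entry together. This is a sketch under assumptions: `retireIfTerminal`, the terminal status strings, and the `close` callback are hypothetical stand-ins for the `RunCoordinator` members.

```typescript
// Sketch only: helper name, status strings, and callback shape are assumptions.
const TERMINAL_STATUSES: ReadonlySet<string> = new Set(['MERGED', 'FAILED']);

/**
 * Close the feature cluster and evict its cached status when the feature is
 * terminal. Returns true when the feature was retired.
 */
async function retireIfTerminal(
  featureId: string,
  status: string | null,
  statusCache: Map<string, string>,
  close: (featureId: string) => Promise<void>
): Promise<boolean> {
  if (status === null || !TERMINAL_STATUSES.has(status)) return false;
  await close(featureId);
  statusCache.delete(featureId); // evict only after the cluster is closed
  return true;
}
```

Both the rebalance loop and the queue-drain loop could call the helper and `continue` when it returns `true`.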
+**Tests to write:** `apps/control-plane/test/run-coordinator.spec.ts` (existing — add case)
+- `GIVEN_rebalanceActiveFeatures_WHEN_feature_reaches_terminal_status_THEN_statusCache_entry_is_evicted`
+**Acceptance criteria:**
+1. After `closeFeatureCluster` is called for a terminal feature, `statusCache.has(featureId)` returns `false`.
+2. `notifyStatusTransitions` is unaffected — it reads `statusCache` only for currently active features.
+3. No change to orchestration loop behavior or status transition notification logic.
+---
+#### PER-T-011: Replace `JSON.stringify` diff with version-based guard in `readIndex`
+**Fixes:** Finding I-6
+**File:** `apps/control-plane/src/core/kernel.ts`
+**Lines:** 758–766
+`readIndex` serializes the entire index (20–50 KB) twice on every call just to detect whether schema migration is needed — a cold-path concern paid on every hot-path read.
+**Current behavior:**
+```typescript
+const changed = JSON.stringify(existing ?? null) !== JSON.stringify(normalized);
+if (changed) {
+  await atomicWriteJson(this.indexPath, normalized);
+}
+```
+**After:** Add a `schema_version` field to the index shape. `normalizeIndexShape` sets it to the current constant. Migration check becomes a single integer comparison:
+```typescript
+const CURRENT_INDEX_SCHEMA_VERSION = 1;   // increment when normalizeIndexShape changes shape
+async readIndex(): Promise<AnyRecord> {
+  const existing = await readJson(this.indexPath, null);
+  const needsMigration =
+    existing === null ||
+    (existing as AnyRecord).schema_version !== CURRENT_INDEX_SCHEMA_VERSION;
+  if (needsMigration) {
+    const normalized = this.normalizeIndexShape(existing);
+    await atomicWriteJson(this.indexPath, normalized);
+    return normalized;
+  }
+  return existing as AnyRecord;
+}
+```
+`normalizeIndexShape` output gains `schema_version: CURRENT_INDEX_SCHEMA_VERSION`. The field is added to `schemas/index.schema.json` as an optional integer.
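The intended effect of the version guard, one migration write followed by write-free reads, can be modelled with an in-memory stand-in for the index file. This is a sketch, not kernel code: `IndexStore`, the `features` default, and the counters are hypothetical and exist only to make the behavior observable.

```typescript
// Sketch only: an in-memory model of the guard; class and field names are assumptions.
type AnyRecord = Record<string, unknown>;

const CURRENT_INDEX_SCHEMA_VERSION = 1;

class IndexStore {
  writes = 0;
  normalizations = 0;
  private onDisk: AnyRecord | null;

  constructor(initial: AnyRecord | null) {
    this.onDisk = initial;
  }

  private normalizeIndexShape(existing: AnyRecord | null): AnyRecord {
    this.normalizations += 1;
    return { features: {}, ...(existing ?? {}), schema_version: CURRENT_INDEX_SCHEMA_VERSION };
  }

  async readIndex(): Promise<AnyRecord> {
    const existing = this.onDisk;
    const needsMigration =
      existing === null || existing['schema_version'] !== CURRENT_INDEX_SCHEMA_VERSION;
    if (needsMigration) {
      const normalized = this.normalizeIndexShape(existing);
      this.onDisk = normalized; // stands in for atomicWriteJson
      this.writes += 1;
      return normalized;
    }
    return existing;
  }
}
```

Reading a legacy index twice through this model normalizes and writes once; the second read returns the stored object without touching the normalizer.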
+**Tests to write:** `apps/control-plane/test/kernel.spec.ts` (existing — add cases)
+- `GIVEN_readIndex_WHEN_schema_version_matches_THEN_normalizeIndexShape_not_called`
+- `GIVEN_readIndex_WHEN_schema_version_absent_THEN_migration_runs_and_writes_file`
+- `GIVEN_readIndex_WHEN_index_is_null_THEN_migration_runs`
+**Acceptance criteria:**
+1. When the on-disk index has the current `schema_version`, `normalizeIndexShape` is not called and no write occurs.
+2. When the field is absent (legacy index), normalization and re-write happen exactly once.
+3. The `schema_version` field is included in `index.schema.json` and validates correctly.
+4. No other fields in the index are changed by this migration.
+---
+#### PER-T-012: Replace `structuredClone(plan)` with targeted field spread in `buildUpdatedPlan`
+**Fixes:** Finding M-2
+**File:** `apps/control-plane/src/supervisor/planning-wave-executor.ts`
+**Line:** 298
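+`structuredClone` deep-copies the entire plan object graph for a change that only touches five top-level fields. Because each of those fields is either a scalar or an array that is replaced wholesale, a shallow spread is sufficient and never writes to the original object. Fields that are not updated are shared by reference between the original and the returned plan, so nested plan values must be treated as read-only.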
+**Before:**
+```typescript
+private buildUpdatedPlan(plan: AnyRecord, planVersion: number, decision: ReconciliationDecision): AnyRecord {
+  const nextPlan = structuredClone(plan);
+  const acceptanceCriteria = asStringArray(nextPlan.acceptance_criteria);
+  const riskItems = asStringArray(nextPlan.risk);
+  nextPlan.plan_version = planVersion + 1;
+  nextPlan.revision_of = planVersion;
+  nextPlan.revision_reason = decision.reasons.join('; ');
+  nextPlan.acceptance_criteria = normalizeList([...acceptanceCriteria, ...decision.acceptanceCriteriaAdditions]);
+  nextPlan.risk = normalizeList([...riskItems, ...decision.edgeCaseChecklist]);
+  return nextPlan;
+}
+```
+**After:**
+```typescript
+private buildUpdatedPlan(plan: AnyRecord, planVersion: number, decision: ReconciliationDecision): AnyRecord {
+  return {
+    ...plan,
+    plan_version: planVersion + 1,
+    revision_of: planVersion,
+    revision_reason: decision.reasons.join('; '),
+    acceptance_criteria: normalizeList([
+      ...asStringArray(plan.acceptance_criteria),
+      ...decision.acceptanceCriteriaAdditions
+    ]),
+    risk: normalizeList([...asStringArray(plan.risk), ...decision.edgeCaseChecklist])
+  };
+}
+```
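The difference between the two copying strategies shows up only in nested values. The toy example below illustrates it; `bumpPlanVersion` and the `files` field are hypothetical and much smaller than the real `buildUpdatedPlan`.

```typescript
// Sketch only: a reduced version of the spread pattern on a toy plan object.
type AnyRecord = Record<string, unknown>;

function bumpPlanVersion(plan: AnyRecord, planVersion: number): AnyRecord {
  return { ...plan, plan_version: planVersion + 1, revision_of: planVersion };
}

const originalPlan: AnyRecord = { plan_version: 3, files: ['a.ts'] };
const updatedPlan = bumpPlanVersion(originalPlan, 3);

// Top-level fields are independent: the original still holds plan_version 3.
const originalUntouched = originalPlan['plan_version'] === 3;
// Untouched nested values are shared: both plans point at the same array.
const nestedShared = updatedPlan['files'] === originalPlan['files'];
```

A mutation-guard test can assert the first property directly; the second is the reason no code path should push into a nested array on the returned plan.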
+**Tests to write:** `apps/control-plane/test/planning-wave-executor.spec.ts` (existing — add cases)
+- `GIVEN_buildUpdatedPlan_WHEN_called_THEN_original_plan_is_not_mutated`
+- `GIVEN_buildUpdatedPlan_WHEN_called_THEN_only_mutated_fields_change`
+**Acceptance criteria:**
+1. Returned plan object has all fields from the original plan plus the five updated fields.
+2. The original `plan` argument is not mutated.
+3. `structuredClone` is not called.
+4. Existing `planVersion + 1` and `normalizeList` behavior is preserved.
+---
+#### PER-T-013: Extract `normalizeSet` to a shared utility
+**Fixes:** Finding M-1
+**Files:**
+- Create or update: `apps/control-plane/src/core/utils.ts`
+- Refactor: `apps/control-plane/src/application/services/reporting-service.ts:7–9`
+- Refactor: `apps/control-plane/src/application/services/merge-service.ts:17–19`
+- Refactor: `apps/control-plane/src/application/services/feature-lifecycle-service.ts:15–17`
+- Refactor: `apps/control-plane/src/application/services/feature-deletion-service.ts:67–69`
+Add to `apps/control-plane/src/core/utils.ts`:
+```typescript
+/** Deduplicate and sort a string array. */
+export function normalizeSet(array: string[]): string[] {
+  return [...new Set(array)].sort((a, b) => a.localeCompare(b));
+}
+```
+In each of the four service files, remove the local definition and add:
+```typescript
+import { normalizeSet } from '../../core/utils.js';
+```
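A short usage example of the shared helper follows; the helper body is copied from above so the snippet runs on its own, and the input values are made up. Note that `localeCompare` ordering is locale-sensitive and can differ from the default code-unit order of `sort()` for mixed-case or non-ASCII input, so the sorting test is safest with inputs where both orders agree.

```typescript
// Helper copied from the task above so this example is self-contained.
function normalizeSet(array: string[]): string[] {
  return [...new Set(array)].sort((a, b) => a.localeCompare(b));
}

// Sample inputs are illustrative only.
const heldLocks = normalizeSet(['src/b.ts', 'src/a.ts', 'src/b.ts']);
// heldLocks is ['src/a.ts', 'src/b.ts']

const noLocks = normalizeSet([]);
// noLocks is []
```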
+**Tests to write:** `apps/control-plane/test/core-utils.spec.ts` (existing — add cases)
+- `GIVEN_normalizeSet_WHEN_array_has_duplicates_THEN_deduplicates`
+- `GIVEN_normalizeSet_WHEN_array_is_unsorted_THEN_sorts_lexicographically`
+- `GIVEN_normalizeSet_WHEN_empty_array_THEN_returns_empty`
+**Acceptance criteria:**
+1. All four service files import from `../../core/utils.js`; no local `normalizeSet` definitions remain.
+2. Behavior is identical to the four previous implementations.
+3. No existing tests fail.
+---
+#### PER-T-014: Move `loadRolePrompts` outside the per-feature loop in `QaWaveExecutor`
+**Fixes:** Findings I-8, C-4
+**File:** `apps/control-plane/src/supervisor/qa-wave-executor.ts`
+**Line:** 196
+`loadRolePrompts()` is called once per feature inside the QA batch loop. The `PromptBundleLoader` cache makes subsequent calls cheap, but the placement is structurally fragile — cache invalidation silently causes per-feature file I/O.
+**Before:**
+```typescript
+for (const featureId of batch.slice(0, maxParallelGateRuns)) {
+  // ... gate run logic ...
+  const prompts = await this.promptProvider.loadRolePrompts();   // INSIDE LOOP
+  const newQa = await this.provider.createSession('qa', featureId, prompts.qa);
+  // ...
+}
+```
+**After:**
+```typescript
+const prompts = await this.promptProvider.loadRolePrompts();  // OUTSIDE LOOP
+for (const featureId of batch.slice(0, maxParallelGateRuns)) {
+  // ... gate run logic ...
+  const newQa = await this.provider.createSession('qa', featureId, prompts.qa);
+  // ...
+}
+```
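The call-count property that the new test asserts can be demonstrated with a counting stand-in for the prompt provider. This is a sketch only: `CountingPromptProvider` and `createQaSessions` are hypothetical, and session creation is reduced to building a string.

```typescript
// Sketch only: stand-ins for the prompt provider and the QA batch loop.
interface RolePrompts {
  qa: string;
}

class CountingPromptProvider {
  calls = 0;

  async loadRolePrompts(): Promise<RolePrompts> {
    this.calls += 1;
    return { qa: 'qa role prompt' };
  }
}

async function createQaSessions(
  provider: CountingPromptProvider,
  batch: readonly string[]
): Promise<string[]> {
  const prompts = await provider.loadRolePrompts(); // hoisted: one load per run
  const sessions: string[] = [];
  for (const featureId of batch) {
    // Every session in the batch reuses the same prompts object.
    sessions.push(`${featureId}:${prompts.qa}`);
  }
  return sessions;
}
```

With the load hoisted, the provider's call count stays at one per run whether the batch holds one feature or many, independent of any caching inside the provider.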
+**Tests to write:** `apps/control-plane/test/session-management.spec.ts` (existing — add case)
+- `GIVEN_QaWaveExecutor_run_WHEN_multiple_features_in_batch_THEN_loadRolePrompts_called_once`
+**Acceptance criteria:**
+1. `loadRolePrompts` is called exactly once per `run()` invocation regardless of batch size.
+2. The same `prompts` object is used for all session creations in the batch.
+3. QA session rotation behavior is unchanged.
+---
+## 4. Acceptance Criteria (Spec-Level)
+1. All 14 tasks pass their own unit tests and the full existing test suite.
+2. `npm run typecheck` and `npm run lint` report zero errors and zero warnings after all tasks.
+3. MCP contract validation (`npm run validate:mcp-contracts`) passes.
+4. Architecture validation (`npm run validate:architecture`) passes.
+5. No on-disk artifact format is changed except for the optional `schema_version` field added to the index.
+6. Tool input/output schemas for all existing tools are backward-compatible; `role` is an optional field with a default of `'full'`.
+7. The context bundle returned by `feature.get_context` with `role: 'full'` (or no role) is byte-for-byte identical to the current output.
+---
+## 5. Risks and Mitigations
+| Risk | Mitigation |
+|------|-----------|
+| `Promise.all` parallelism exposes hidden race conditions in test mocks | Add ordering assertions in tests; use `vi.fn()` with explicit mock return sequences |
+| Context projection silently drops fields agents depend on | Gate behind `role: 'full'` default; add integration test asserting agent tools still resolve |
+| `schema_version` migration triggers a spurious re-write on first deploy | Acceptable one-time cost; no data loss; migration is idempotent |
+| `structuredClone` removal causes mutation if spread is too shallow | The five mutated fields are scalars or wholly replaced arrays; add mutation-guard test to confirm |
+| Evidence pruning deletes a file being read concurrently | Pruning occurs after the gate write completes; `fs.unlink` uses `.catch(() => undefined)` |
+| `normalizeSet` extraction breaks an unknown caller | Signature is identical; all four service files updated atomically in this task |
+---
+## 6. Task Backlog
+| ID | Milestone | File(s) | Description |
+|----|-----------|---------|-------------|
+| PER-T-001 | PER-M1 | `feature-lifecycle-service.ts` | Parallelize 5 sequential reads in `featureGetContext` |
+| PER-T-002 | PER-M1 | `build-wave-executor.ts`, `qa-wave-executor.ts` | Parallelize status-filter loops |
+| PER-T-003 | PER-M1 | `build-wave-executor.ts` | Eliminate duplicate context fetch in retry path |
+| PER-T-004 | PER-M1 | `reporting-service.ts` | Parallelize `reportDashboard` state+cost reads |
+| PER-T-005 | PER-M2 | `gate-service.ts` | Read `latest.json` directly; remove directory scan |
+| PER-T-006 | PER-M2 | `policy.yaml`, `policy.schema.json`, `gates.ts` | Add `evidence_retention_count` and pruning logic |
+| PER-T-007 | PER-M3 | `planning-wave-executor.ts` | Pre-filter non-planning features before context fetch |
+| PER-T-008 | PER-M3 | `feature-lifecycle-service.ts`, schemas, catalog | Role-specific context projections |
+| PER-T-009 | PER-M3 | `feature-lifecycle-service.ts` | Extract and test `projectQaIndex` helper |
+| PER-T-010 | PER-M4 | `run-coordinator.ts` | Evict `statusCache` on feature termination |
+| PER-T-011 | PER-M4 | `kernel.ts`, `schemas/index.schema.json` | Replace `JSON.stringify` diff with `schema_version` guard |
+| PER-T-012 | PER-M4 | `planning-wave-executor.ts` | Replace `structuredClone(plan)` with spread |
+| PER-T-013 | PER-M4 | `core/utils.ts` + 4 service files | Deduplicate `normalizeSet` into shared utility |
+| PER-T-014 | PER-M4 | `qa-wave-executor.ts` | Move `loadRolePrompts` outside per-feature loop |
+---
+## 7. Implementation Order
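+Tasks within each milestone are independent and can be implemented in any order or in parallel, with one exception: PER-T-009 is a supporting task for PER-T-008 and should land with or before it. Milestone ordering recommendation: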
+1. **PER-M1 first** — highest-frequency hotpaths; no schema or interface changes; all changes are local refactors.
+2. **PER-M2 second** — bounds disk growth; PER-T-005 depends on `latest.json` being reliably written (confirmed at `gates.ts:430`).
+3. **PER-M4 third** — low-risk cleanups; PER-T-013 produces the shared utility that PER-T-008 will use.
+4. **PER-M3 last** — largest change surface; requires schema and catalog updates; builds on PER-T-001 from PER-M1.