npm - agentic-orchestrator - Versions diffs - 0.1.19 → 0.1.21 - Mend

agentic-orchestrator 0.1.19 → 0.1.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/spec-files/outstanding/agentic_orchestrator_dashboard_advanced_ux_spec.md CHANGED

@@ -1,563 +1,406 @@
-# Feature Spec: Dashboard Advanced UX — Second Pass (AOP)
+# Feature Spec: Dashboard Advanced UX - Second Pass (AOP)
-> **Purpose of this document**: Define a second wave of dashboard improvements that go beyond structural cleanup (M44) into genuinely advanced capability: surfacing data the backend already produces but never exposes, adding intelligent visualizations, and enabling operations that today require dropping to the CLI. These improvements are designed around the question: _what would the best possible version of this dashboard look like?_
+> Purpose: define the M45 dashboard wave with low-cognitive-load UX, strict implementation contracts, and deterministic behavior for agentic execution.
-**Version:** 1.0
+**Version:** 2.0 (rewritten for execution quality)
 **Date:** 2026-03-05
 **Status:** Draft
 **Roadmap Mapping:** M45
-**Depends On:** M44 (Dashboard UX Improvements, component architecture refactor must land first)
+**Depends On:** M44 (UX-01..UX-20 complete)
 ---
 ## 0. Scope and Standards
-### 0.1 Motivation
+### 0.1 Primary Users and Jobs
-The M44 spec (UX-01 through UX-20) established the structural foundation: component extraction, CSS tokens, rich cards, phase-aware panels, and basic filters. This spec adds the layer above: _insight and intelligence_.
+This spec optimizes for two primary jobs:
-The backend currently produces:
+1. **Reviewer job:** make correct approve/deny/request-changes decisions quickly.
+2. **Operator job:** detect stalled runs, lock contention, and systemic issues before throughput drops.
-- **Per-feature cost data** (`cost.get` tool) — tokens used, estimated USD cost
-- **Provider/model performance analytics** (`performance.get_analytics`) — success rates, retry counts, durations, costs by provider/model
-- **Agent role-level execution status** (`role_status` in state: planner/builder/qa each with ready/running/blocked/done)
-- **Agent session cluster IDs** (`cluster` in state)
-- **Full lock lease details** (`lock_leases` in index.json — includes holder, expires_at)
-- **Dependency-blocked features** (`dep_blocked` in index.json — with depends_on_unresolved)
-- **Plan acceptance criteria** (`plan.acceptance_criteria` — required conditions for completion)
-- **Plan risk notes** (`plan.risk` — known risks declared by the planner)
-- **Plan file scope** (`plan.files.create/modify/delete` — planned file operations)
-- **Plan revision history** (`plan.revision_of`, `plan.revision_reason`)
-- **QA test coverage index** (`qa_test_index.json` — per-file test status and coverage)
-- **Gate retry count and last retry time** (`gate_retry_count`, `last_retry_at`)
-- **Orchestrator run lease** (`runtime_sessions` in index.json — provider, model, heartbeat, expiry)
-- **Auto-generated review briefs** (`review_brief.json` from M34-36 PRQ5)
-- **Feasibility scores** (`planning_quality.feasibility_score` from M34-36 PRQ2)
-- **Collision matrix** (from `collisions.scan` tool)
-- **Flaky test quarantine data** (from `gate.flaky_report_get` tool, M34-36 PRQ4)
+Secondary job:
-None of this data is currently accessible from the dashboard. This spec surfaces it all.
+3. **Author job (in-scope for M45):** create a new feature run from dashboard with server-side validation and policy checks.
-### 0.2 Required Standards
+### 0.2 Critical UX Corrections from v1.0
-Same as M44: TypeScript strict, zero lint warnings, test coverage ≥ 90% on all metrics, existing API contracts preserved, Next.js build passes.
+The previous draft over-indexed on "more widgets" and under-specified decision flow. This rewrite makes these corrections:
-### 0.3 Implementation Approach
+| v1.0 Issue                            | Why It Is Problematic                         | v2.0 Correction                                                                    |
+| ------------------------------------- | --------------------------------------------- | ---------------------------------------------------------------------------------- |
+| Dense feature packing in one screen   | High cognitive load, weak scan hierarchy      | Progressive disclosure + clear view boundaries (Board vs Analytics vs Focus route) |
+| Heuristic language like "Likely Met"  | Creates false confidence                      | Rename to **Verification Signals** with explicit confidence labels and caveats     |
+| Color-first status encoding           | Accessibility failures for color-vision users | Every status includes icon + text + semantic label                                 |
+| Right-click-only action affordances   | Not keyboard/touch accessible                 | Explicit action menu button + full keyboard path                                   |
+| Unbounded client polling/render loops | Performance instability at scale              | Explicit budgets, throttling, and update intervals                                 |
+| `POST /api/run` shell ambiguity       | Security and reliability risk                 | Contract requires orchestrator tool path; shell-out disallowed in default path     |
+| Missing API error contracts           | Agent implementation ambiguity                | Uniform response envelope and explicit error codes                                 |
-New server-side data is accessed by adding:
+### 0.3 UX Principles (Mandatory)
-- New API routes under `packages/web-dashboard/src/app/api/`
-- New `callOrchestratorTool` invocations in `src/lib/orchestrator-tools.ts`
-- New fields on existing types or new types in `src/lib/types.ts`
-- New `aop-client.ts` file-read functions for artifacts not accessible via MCP tools
+1. **Task-first hierarchy:** surface what helps decisions first, details second.
+2. **Progressive disclosure:** default collapsed for deep diagnostics.
+3. **Truth over optimism:** inferred states must be labeled as inferred.
+4. **Accessibility by default:** keyboard complete, non-color semantics, ARIA labels.
+5. **Operational safety:** destructive or high-impact actions require guardrails and confirmation.
+6. **Determinism for agents:** every feature includes explicit data source, fallback, and testability.
----
-## 1. Improvements
-### UX-21 — Agent Pipeline Stepper
+### 0.4 Required Quality Standards (Mandatory)
-**What**: In the feature detail panel, render the planner → builder → QA pipeline as a visual three-step stepper showing the `role_status` for each agent role (ready / running / blocked / done).
+All M45 implementations MUST satisfy:
-**Why it matters**: The most common question a developer has when monitoring a feature is "what is the agent doing right now?" The `role_status` object in `state.md` frontmatter answers this precisely — it has per-role status for planner, builder, and qa. Yet this is completely invisible in the current dashboard. The pipeline stepper turns a three-field object into an instantly readable execution snapshot.
+- TypeScript strict mode passes.
+- ESLint zero warnings.
+- Coverage >= 90% lines/branches/functions/statements.
+- `npm run build` passes for workspace and dashboard build target.
+- Existing API contracts are preserved or versioned.
+- Accessibility: WCAG 2.2 AA baseline for interactive components.
+- No critical interaction depends on pointer-only behavior.
-**Specification**:
+### 0.5 Non-Functional Budgets (Mandatory)
-- Read `role_status.planner`, `role_status.builder`, `role_status.qa` from `FeatureSummary` (add these fields to the type).
-- Render as a horizontal stepper: `[Planner] → [Builder] → [QA]`.
-- Each step node has a status indicator:
-  - `ready` → grey circle (not started)
-  - `running` → amber animated pulse
-  - `done` → green checkmark
-  - `blocked` → red X
-- The active (running) step label is bold; done steps use muted/strikethrough styling.
-- Show alongside the feature name in the detail panel header, above status badges.
-- Update live with every SSE snapshot (no additional API calls needed).
+- SSE update handling SHOULD complete in < 50ms per snapshot for 200 features on a dev laptop baseline.
+- No component may run `setInterval(1000)` unless required by human-visible countdown; prefer 5s cadence for low-value timers.
+- Large lists/tables (>150 rows) MUST use virtualization or pagination.
+- API errors MUST render non-blocking inline states (no white-screen failures).
+- Responsive baseline is mandatory: dashboard core flows MUST be fully usable at **360px CSS width** in portrait orientation (phone-class viewport), including triage, feature selection, detail inspection, and review actions.
+- At the phone baseline, no horizontal page-level scrolling is allowed for primary workflows; controls must remain reachable by keyboard and touch.
 ---
-### UX-22 — Per-Feature Cost & Token Tracker
-**What**: Show token usage and estimated USD cost for each feature in the detail panel, sourced from the existing `cost.get` MCP tool.
-**Why it matters**: AI-driven development has a direct monetary cost. Engineers and managers need to see what each feature costs — both to understand ROI and to catch runaway agents that have been retrying for hours. The `cost.get` tool already exposes `tokens_used` and `estimated_cost_usd` per feature. Making this visible closes the most obvious information gap for any team managing budgets.
+## 1. Experience Architecture
-**Specification**:
+### 1.1 Views and Information Scent
-- Add a new API route `GET /api/features/:id/cost` that calls `callOrchestratorTool('cost.get', { feature_id })`.
-- Fetch when a feature is selected in the detail panel.
-- In the detail panel, add a small "Cost" section: "Tokens: X,XXX | Est. Cost: $0.04".
-- Format cost with 2–4 decimal places; use "< $0.01" for very small values.
-- If cost data is unavailable (feature never ran), show "No cost data recorded."
-- In the summary bar (UX-01), add a "Total Cost Today" tile: sum `estimated_cost_usd` across all features with a `recorded_at` within the last 24 hours.
-- Add a `CostSummary` type to `src/lib/types.ts`:
-  ```typescript
-  export interface CostSummary {
-    feature_id: string;
-    tokens_used: number;
-    estimated_cost_usd: number;
-    recorded_at: string | null;
-  }
-  ```
+M45 uses three intentional contexts:
----
+1. **Board (`/`)**
+   - For triage and workflow control.
+   - Shows columns, summary metrics, runtime health, lock/dependency summaries, and selected-feature detail panel.
+2. **Analytics (`/analytics`)**
+   - For trend and provider/collision insights.
+   - Keeps heavy tables/charts out of triage view.
+3. **Focus (`/feature/:id`)**
+   - For deep review of one feature.
+   - Full-width stacked detail sections.
-### UX-23 — Plan Scope File Tree
+### 1.2 URL-State Contract
-**What**: In the detail panel's plan view, render `plan.files.create`, `plan.files.modify`, and `plan.files.delete` as a visual collapsible directory tree with color-coded operation type instead of flat lists.
+To reduce user disorientation and allow shareable links:
-**Why it matters**: When reviewing a feature, understanding the _scope_ of changes matters as much as the content. A flat list of 40 file paths is hard to scan; a tree grouped by directory reveals the blast radius immediately. Color-coding (green = create, blue = modify, red = delete) maps directly to risk: red means deletion, which is the highest-risk operation.
+- Board selection state MUST sync to URL query (`?feature=<id>&filters=...`).
+- View mode MUST be route-driven (not only local state).
+- Browser back/forward MUST restore selected feature + filters.
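
A minimal sketch of the URL-state contract above, not part of the package contents. The `BoardUrlState` shape, the comma-joined `filters` encoding, and both function names are assumptions for illustration; the spec only fixes the `?feature=<id>&filters=...` query form.

```typescript
// Hypothetical shape for the Board selection state (names are illustrative).
interface BoardUrlState {
  feature: string | null;
  filters: string[];
}

// Serialize state into a query string; an empty state yields "".
function toQuery(state: BoardUrlState): string {
  const params = new URLSearchParams();
  if (state.feature) params.set("feature", state.feature);
  // Assumption: filters are joined with commas, so filter names must not contain commas.
  if (state.filters.length > 0) params.set("filters", state.filters.join(","));
  const qs = params.toString();
  return qs ? `?${qs}` : "";
}

// Parse a query string back into state; unknown keys are ignored.
function fromQuery(search: string): BoardUrlState {
  const params = new URLSearchParams(search);
  const rawFilters = params.get("filters") ?? "";
  return {
    feature: params.get("feature"),
    filters: rawFilters.split(",").filter((f) => f.length > 0),
  };
}
```

Because the state round-trips through the query string, restoring it on browser back/forward reduces to calling `fromQuery` on the current location's search string.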
-**Specification**:
+### 1.3 Progressive Disclosure Rules
-- Parse the three arrays from `detail.plan.files` (or `{create: [], modify: [], delete: []}`).
-- Build a directory tree from path strings: split on `/`, build a nested object, render recursively.
-- Leaf nodes (files) are colored: green for create, blue for modify, red for delete.
-- Directory nodes show a count badge: "src/services/ (3 modified, 1 created)".
-- Tree is collapsed by default at depth > 2; clicking a directory node expands it.
-- `allowed_areas` and `forbidden_areas` from the plan are shown as a separate labeled list below the tree ("Allowed: src/services/**, Forbidden: src/core/**").
-- This replaces the simple file list in UX-04's plan viewer.
+- Default expanded: summary, current phase status, review actions.
+- Default collapsed: collision matrix, raw evidence, live feed details, revision metadata.
+- In Focus view, reviewer-critical sections appear before diagnostics.
 ---
-### UX-24 — Acceptance Criteria Live Tracker
+## 2. Revised Improvements (UX-21 to UX-40)
-**What**: Render `plan.acceptance_criteria` as a live-status checklist in the detail panel, with each criterion's status inferred from available gate results and evidence artifacts.
+### UX-21 - Agent Pipeline Stepper
-**Why it matters**: The plan's acceptance criteria are the definition of done. They're declared by the planner agent and represent exactly what must be true for the feature to be complete. Yet they're never shown in the dashboard — a reviewer has no way to know what the agent was supposed to achieve. Surfacing them with inferred completion status turns the criteria list into a live progress tracker.
+**Intent:** reveal where a feature currently sits in planner -> builder -> qa.
-**Specification**:
+**Requirements:**
-- Read `plan.acceptance_criteria` (array of strings) from `detail.plan`.
-- For each criterion, infer status using a lightweight heuristic:
-  - If criterion text mentions "test" or "coverage" and the corresponding gate (`fast`/`full`) has `pass` status: mark as ✅ "Likely Met (gate passed)".
-  - If the gate has `fail` status: mark as ❌ "Gate Failed".
-  - If gate status is `na` or absent: mark as 🔲 "Unverified (requires review)".
-- Render as a checklist below the plan summary in the detail panel.
-- Each criterion shows its full text and a colored status badge.
-- A summary line: "3 / 5 criteria verifiable from gate results".
-- If no acceptance criteria exist, show "No acceptance criteria declared in plan."
+- Source from `feature.role_status`.
+- Render stepper with icon + text state: `Ready`, `Running`, `Blocked`, `Done`, `Unknown`.
+- Running step uses motion + text, never motion-only.
+- If role status is missing, render `Unknown` with neutral styling.
+- Update via SSE snapshots; no extra API call.
----
+### UX-22 - Per-Feature Cost and Token Tracker
-### UX-25 — QA Test Coverage Map
+**Intent:** make cost visibility immediate without noisy calls.
-**What**: Read `qa_test_index.json` for the selected feature and render a file-by-file test coverage status grid in the detail panel.
+**Requirements:**
-**Why it matters**: The `qa_test_index.json` artifact maps each changed file to its required tests and current test status (`pending / running / passed / failed / waived`). This is precisely the information a developer needs to understand test coverage during QA. Currently it is an invisible file. Surfacing it as a visual grid (file path → status color) gives reviewers confidence that changed code is actually tested.
+- API route: `GET /api/features/:id/cost`.
+- Fetch only when feature detail is open; cache per feature for 60s.
+- Use `AbortController` when switching selected feature.
+- Render `Tokens`, `Estimated Cost`, and `Last Recorded`.
+- Summary tile uses aggregated value from server payload (not N feature calls).
+- Empty state: `No cost data recorded yet`.
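
The caching and cancellation rules for UX-22 could be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `createCostCache` and its methods are hypothetical names, and `CostSummary` mirrors the type from the v1.0 draft shown in the removed lines.

```typescript
// Mirrors the CostSummary type from the v1.0 draft.
interface CostSummary {
  feature_id: string;
  tokens_used: number;
  estimated_cost_usd: number;
  recorded_at: string | null;
}

// Hypothetical helper: per-feature TTL cache plus "abort the previous request" tracking.
function createCostCache(ttlMs: number = 60_000, now: () => number = () => Date.now()) {
  const entries = new Map<string, { value: CostSummary; storedAt: number }>();
  let current: AbortController | null = null;

  return {
    // Return the cached value if it is younger than the TTL, otherwise undefined.
    lookup(featureId: string): CostSummary | undefined {
      const hit = entries.get(featureId);
      if (hit === undefined) return undefined;
      return now() - hit.storedAt < ttlMs ? hit.value : undefined;
    },
    // Record a freshly fetched value with the current timestamp.
    store(featureId: string, value: CostSummary): void {
      entries.set(featureId, { value, storedAt: now() });
    },
    // Abort whichever request is in flight and return a signal for the next one.
    beginRequest(): AbortSignal {
      if (current !== null) current.abort();
      current = new AbortController();
      return current.signal;
    },
  };
}
```

A caller would check `lookup` first and, on a miss, pass the signal from `beginRequest` to `fetch` so that switching the selected feature cancels the stale request.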
-**Specification**:
+### UX-23 - Plan Scope File Tree
-- Add a new API route `GET /api/features/:id/test-index` that reads `.aop/features/:id/qa_test_index.json`.
-- Return type matches `QaTestIndex` type (add to `src/lib/types.ts`):
-  ```typescript
-  export interface QaTestIndexItem {
-    path: string;
-    status: 'pending' | 'running' | 'passed' | 'failed' | 'waived';
-    required_tests: string[];
-    last_run_at?: string;
-  }
-  export interface QaTestIndex {
-    feature_id: string;
-    version: number;
-    items: QaTestIndexItem[];
-  }
-  ```
-- In the detail panel (shown for `qa` and `ready_to_merge` phases), add a "Test Coverage" section.
-- Render as a compact table: file path (truncated to 40 chars with tooltip), status pill (green = passed, red = failed, amber = pending/running, grey = waived), required test count.
-- Summary row: "N files: X passed, Y failed, Z pending."
-- If artifact missing: "No test index available."
+**Intent:** communicate blast radius faster than flat lists.
----
+**Requirements:**
-### UX-26 — Lock Resource Map
+- Parse `plan.files.create|modify|delete` into tree model.
+- Render operation badges (`Created`, `Modified`, `Deleted`) with icon + text.
+- Collapse depth > 2 by default.
+- For >200 nodes, use virtualization or chunked rendering.
+- Show `allowed_areas` and `forbidden_areas` as separate policy chips.
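
One way to parse `plan.files.create|modify|delete` into a tree model is sketched below. The `TreeNode` shape and the `buildTree` and `countOps` names are assumptions for illustration; the spec does not prescribe a model.

```typescript
type FileOp = "create" | "modify" | "delete";

// Hypothetical tree model: directories have children, file leaves carry an op.
interface TreeNode {
  name: string;
  op?: FileOp;
  children: Map<string, TreeNode>;
}

interface PlanFiles {
  create?: string[];
  modify?: string[];
  delete?: string[];
}

// Split each path on "/" and walk or create nodes from the root down.
function buildTree(files: PlanFiles): TreeNode {
  const root: TreeNode = { name: "", children: new Map() };
  const ops: FileOp[] = ["create", "modify", "delete"];
  for (const op of ops) {
    for (const path of files[op] ?? []) {
      const parts = path.split("/").filter((p) => p.length > 0);
      let node = root;
      for (const part of parts) {
        let child = node.children.get(part);
        if (child === undefined) {
          child = { name: part, children: new Map() };
          node.children.set(part, child);
        }
        node = child;
      }
      if (node !== root) node.op = op; // mark the leaf with its operation
    }
  }
  return root;
}

// Count operations at or beneath a node, e.g. for a directory badge.
function countOps(node: TreeNode): Record<FileOp, number> {
  const totals: Record<FileOp, number> = { create: 0, modify: 0, delete: 0 };
  if (node.op !== undefined) totals[node.op] += 1;
  node.children.forEach((child) => {
    const sub = countOps(child);
    totals.create += sub.create;
    totals.modify += sub.modify;
    totals.delete += sub.delete;
  });
  return totals;
}
```

Rendering then becomes a recursive walk over `children`, collapsing nodes deeper than two levels by default.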
-**What**: Add a "Locks" panel to the dashboard showing all resource locks from `index.lock_leases` with the current holder, time until lease expiry, and a stale indicator.
+### UX-24 - Acceptance Criteria Verification Signals
-**Why it matters**: Lock contention is the primary cause of features entering the blocked queue. Currently, developers have no way to see which features hold which locks without running `aop status` or reading raw JSON files. A live lock map makes contention immediately diagnosable — you can see at a glance that "openapi" is held by `feature-auth`, expires in 4 minutes, and `feature-payments` is waiting for it.
+**Intent:** show progress signals without overstating certainty.
-**Specification**:
+**Requirements:**
-- Extend `DashboardStatusPayload` (or add a new type) to include `lock_map`:
-  ```typescript
-  export interface LockLease {
-    resource: string;
-    holder: string | null;
-    expires_at: string | null;
-    is_stale: boolean;
-  }
-  ```
-- Extend `/api/status` or add `/api/locks` to return lock lease data read from `index.lock_leases`.
-- Compute `is_stale` server-side: `new Date(expires_at) < new Date()`.
-- Render as a small panel below or alongside the Kanban board (or as a tab in a future multi-panel layout, UX-40).
-- Each row: resource name, holder feature_id (as a clickable link to select that feature), expiry countdown ("expires in 4m 22s"), stale badge if stale.
-- Stale leases render in red with a "Stale" badge; healthy leases in green/neutral.
-- Countdown updates via `setInterval(1000)` from the `expires_at` timestamp (no server polling needed for the countdown).
+- Rename section to **Verification Signals**.
+- Never label inferred criteria as fully met.
+- Status values:
+  - `Verified` (direct artifact evidence)
+  - `At Risk` (explicit gate/evidence failure)
+  - `Unverified` (insufficient evidence)
+- Each item includes `reason` and `evidence_ref` when available.
+- Summary line: `X verified, Y at risk, Z unverified`.
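
The three status values and the summary line could be modeled as below. The `VerificationSignal` field `criterion` and the function name are hypothetical; `reason` and `evidence_ref` come from the requirement list.

```typescript
type SignalStatus = "Verified" | "At Risk" | "Unverified";

// Hypothetical item shape; evidence_ref is optional per the spec ("when available").
interface VerificationSignal {
  criterion: string;
  status: SignalStatus;
  reason: string;
  evidence_ref?: string;
}

// Build the "X verified, Y at risk, Z unverified" summary line.
function summarizeSignals(signals: VerificationSignal[]): string {
  const count = (s: SignalStatus): number =>
    signals.filter((x) => x.status === s).length;
  return `${count("Verified")} verified, ${count("At Risk")} at risk, ${count("Unverified")} unverified`;
}
```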
----
+### UX-25 - QA Test Coverage Map
-### UX-27 — Dependency Unblock Chain
+**Intent:** expose test accountability per changed file.
-**What**: When features are in `dep_blocked` (waiting for dependency features to merge), visualize the dependency relationships as a linked chain showing which features are blocked and what they're waiting for.
+**Requirements:**
-**Why it matters**: Dependency blocking is invisible in the current dashboard. `dep_blocked` entries are in `index.json` but never surfaced. A feature author has no way to see from the dashboard that their feature is waiting for `feature-auth` to merge. The dependency chain makes this explicit and shows the critical path.
+- API route: `GET /api/features/:id/test-index`.
+- Visible in `qa` and `ready_to_merge`.
+- Table columns: file path, status, required test count, last run.
+- Status labels must include icon + text.
+- Filter pills: `failed`, `pending/running`, `waived`, `passed`.
+- Empty state if artifact missing.
-**Specification**:
+### UX-26 - Lock Resource Map
-- Extend `FeaturesIndex` to include `dep_blocked`:
-  ```typescript
-  dep_blocked?: Array<{
-    feature_id: string;
-    depends_on_unresolved: string[];
-  }>;
-  ```
-- Features in `dep_blocked` render with a distinct "Awaiting Dependencies" sub-phase in their Kanban card.
-- In the detail panel, if `dep_blocked` contains the selected feature, show a "Dependencies" section listing unresolved dependencies as links.
-- At the bottom of the board (or in the collision queue area from UX-12), show a "Dependency Chains" section that renders the dep_blocked list as: `feature-payments → waiting for → [feature-auth]` where `feature-auth` is a clickable link.
-- When all dependencies for a feature are merged (list empties), the chain entry disappears on the next SSE snapshot.
+**Intent:** make lock contention diagnosable in one glance.
----
+**Requirements:**
-### UX-28 — Auto-Generated Review Brief Renderer
-**What**: When `.aop/features/<id>/review_brief.json` exists (generated by the M34-36 PRQ5 review brief service), render its structured sections in the detail panel as a rich, formatted review document.
-**Why it matters**: The M34-36 spec defines `review_brief.json` as a structured summary containing: intent summary, scope, contract risk, feasibility score, gate matrix, unresolved questions, and evidence references. This is designed to be the first thing a reviewer reads. Without dashboard rendering, reviewers must find and parse this JSON artifact manually. A rendered brief turns approval into a guided workflow.
-**Specification**:
-- Add API route `GET /api/features/:id/review-brief` that reads `.aop/features/:id/review_brief.json`.
-- Extend `FeatureDetail` to include `review_brief?: ReviewBrief | null`.
-- Add type:
-  ```typescript
-  export interface ReviewBrief {
-    intent_summary: string;
-    scope_summary: string;
-    contract_risk_summary: string;
-    feasibility_score?: number;
-    feasibility_breakdown?: Record<string, number>;
-    gate_matrix: Record<string, string>;
-    unresolved_questions: string[];
-    evidence_refs: string[];
-    generated_at: string;
-  }
-  ```
-- In the detail panel for `ready_to_merge` features, show "Review Brief" as a prominently styled card at the top.
-- Render each section under a labeled header: Intent, Scope, Contract Risk, Feasibility (as a score with colored bar), Gate Results, Open Questions.
-- If `review_brief` is null/absent: show "Review brief not yet generated." with a note that it is auto-generated on gate completion.
-- When `feasibility_breakdown` is present, render each component (scope_realism, test_sufficiency, etc.) as a labelled progress bar.
+- Extend status payload with normalized lock entries.
+- Row fields: resource, holder, expires in, stale flag.
+- Countdown refresh every 5s.
+- Stale rows have text badge `Stale lease`.
+- Clicking holder selects that feature.
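
A sketch of deriving one lock row from a lease, assuming the lease fields from the v1.0 draft (`resource`, `holder`, `expires_at`) and the v1.0 stale rule (expiry earlier than now). `LockRow`, `toLockRow`, and the `unknown`/`expired` labels are illustrative assumptions.

```typescript
interface LockLease {
  resource: string;
  holder: string | null;
  expires_at: string | null;
}

// Hypothetical normalized row for the lock table.
interface LockRow {
  resource: string;
  holder: string | null;
  expiresIn: string;
  stale: boolean;
}

// Format a positive duration as "4m 22s" or "45s".
function formatRemaining(ms: number): string {
  const totalSeconds = Math.max(0, Math.floor(ms / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes > 0 ? `${minutes}m ${seconds}s` : `${seconds}s`;
}

// Recompute on each tick (every 5s per the spec) with the current time.
function toLockRow(lease: LockLease, nowMs: number): LockRow {
  const expiresMs = lease.expires_at === null ? NaN : Date.parse(lease.expires_at);
  const known = !Number.isNaN(expiresMs);
  const stale = known && expiresMs < nowMs;
  // Assumption: a missing or unparseable expiry is shown as "unknown", not stale.
  const expiresIn = !known ? "unknown" : stale ? "expired" : formatRemaining(expiresMs - nowMs);
  return { resource: lease.resource, holder: lease.holder, expiresIn, stale };
}
```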
----
+### UX-27 - Dependency Unblock Chain
-### UX-29 — Plan Risk Annotations
+**Intent:** expose blocked critical path clearly.
-**What**: Render `plan.risk` (an array of risk notes declared by the planner) as prominent warning callouts in the detail panel.
+**Requirements:**
-**Why it matters**: The planner agent is instructed to enumerate known risks and edge cases in the plan. This is valuable information — the agent knows the codebase and has identified "this touches authentication, test the token refresh path carefully." But these notes are buried in `plan.json` and never shown to the human reviewer. Surfacing them as visible amber callout boxes puts the planner's concerns directly in front of the person who needs to act on them.
+- Source from `dep_blocked` in index/status payload.
+- Feature card shows `Awaiting Dependencies` substate when applicable.
+- Detail panel lists unresolved dependencies as navigable links.
+- Board section `Dependency Chains` shows `feature -> depends on -> feature` rows.
+- Rows disappear automatically on next snapshot when resolved.
-**Specification**:
+### UX-28 - Review Brief Renderer
-- Read `plan.risk` (array of strings) from `detail.plan`.
-- If `plan.risk` is non-empty, render a "Known Risks" section immediately above the review action buttons.
-- Each risk renders as an amber/yellow callout card with a ⚠️ prefix and the risk text.
-- Section has a header: "⚠ N Risk(s) Declared by Planner Agent".
-- If `plan.risk` is empty or absent, omit the section (no "no risks" message — absence implies clean).
-- Risk callouts are always visible (not collapsed) in the `ready_to_merge` phase; collapsed by default in earlier phases.
+**Intent:** present generated review context as first-class reviewer input.
----
+**Requirements:**
-### UX-30 — Gate Step Drill-Down
+- API route: `GET /api/features/:id/review-brief`.
+- Render in `ready_to_merge`; optional in `qa`.
+- Sections: intent, scope, contract risk, feasibility, gate matrix, unresolved questions, evidence references.
+- Show `generated_at` + freshness age.
+- Missing brief state: `Review brief not generated yet`.
-**What**: When a gate fails, parse the gate evidence artifact (e.g., `.aop/features/<id>/evidence/fast.json`) to show the specific step that failed, its command, exit code, and the last N lines of output.
+### UX-29 - Plan Risk Annotations
-**Why it matters**: Seeing "fast: fail" tells a developer nothing actionable. What step failed? Was it `lint`, `build`, `test`? What was the error? The gate evidence JSON contains this information — it records per-step results. Showing "Step `test` failed: exit code 1. Last output: [...]" makes the failure immediately actionable without navigating to any other tool.
+**Intent:** keep planner-declared risks visible at decision time.
-**Specification**:
+**Requirements:**
-- Gate evidence artifacts are JSON files at `.aop/features/<id>/evidence/<mode>.json` or `-<mode>.json`.
-- Extend the evidence viewer (UX-07) to detect `.json` gate evidence files and render them as structured gate step results rather than raw JSON.
-- Parse: `{ steps: [{ name: string, cmd: string[], exit_code: number, stdout_tail: string, duration_ms: number }] }` (adapt to actual artifact structure).
-- Failed steps render in red with a collapsed `<details>` containing the `stdout_tail` (last 50 lines).
-- Passed steps render in green; skipped/timeout steps in amber.
-- Step duration shown as "Xs" beside each step.
-- If the artifact format doesn't match the expected gate result shape, fall back to the raw JSON viewer.
+- Read from `plan.risk`.
+- In `ready_to_merge`, section is expanded by default.
+- Support optional severity prefix parsing (`[high]`, `[medium]`, `[low]`); default `medium`.
+- Risks render as callouts with severity text badges.
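
The optional severity prefix could be parsed as sketched here. Treating the prefix as case-insensitive and leaving unrecognized bracket prefixes in the text are assumptions; the spec only lists `[high]`, `[medium]`, `[low]` and the `medium` default.

```typescript
type Severity = "high" | "medium" | "low";

interface RiskNote {
  severity: Severity;
  text: string;
}

// Parse an optional leading "[high]", "[medium]" or "[low]" tag from a risk string.
function parseRisk(raw: string): RiskNote {
  const match = /^\s*\[(high|medium|low)\]\s*([\s\S]*)$/i.exec(raw);
  if (match === null) {
    // No recognized prefix: default to medium and keep the text as-is.
    return { severity: "medium", text: raw.trim() };
  }
  const severity = (match[1] ?? "medium").toLowerCase() as Severity;
  return { severity, text: (match[2] ?? "").trim() };
}
```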
----
+### UX-30 - Gate Step Drill-Down
-### UX-31 — Orchestrator Run Health Panel
+**Intent:** shorten time-to-diagnosis for gate failures.
-**What**: Add a small but prominent "Orchestrator" health card showing the current run's provider, model, uptime, last heartbeat age, and a countdown to lease expiry.
+**Requirements:**
-**Why it matters**: The `runtime_sessions` object in `index.json` contains: `provider`, `model`, `started_at`, `last_heartbeat_at`, `lease_expires_at`, `owner_instance_id`. If the orchestrator process dies, `last_heartbeat_at` stops updating and `lease_expires_at` approaches. Without this visibility, developers have no way to know from the dashboard that the orchestrator has crashed — they simply watch features stop making progress. A health card makes this obvious.
+- Parse gate evidence JSON using tolerant parser with shape guards.
+- Render per-step: name, command, status, exit code, duration.
+- First failed step expands by default.
+- Output tail is collapsible, capped to last 50 lines.
+- Unknown artifact shape falls back to raw JSON viewer.
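
A tolerant parser with shape guards might look like the sketch below. The step shape (`name`, `cmd`, `exit_code`, `stdout_tail`, `duration_ms`) is taken from the v1.0 draft, which itself says to adapt it to the actual artifact structure, so treat every field name as an assumption. Returning `null` stands in for "fall back to the raw JSON viewer".

```typescript
interface GateStep {
  name: string;
  cmd: string[];
  exit_code: number;
  stdout_tail: string;
  duration_ms: number;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Shape guard: accept only objects that carry every expected field and type.
function isGateStep(v: unknown): v is GateStep {
  if (!isRecord(v)) return false;
  const cmd = v["cmd"];
  return (
    typeof v["name"] === "string" &&
    Array.isArray(cmd) &&
    cmd.every((c) => typeof c === "string") &&
    typeof v["exit_code"] === "number" &&
    typeof v["stdout_tail"] === "string" &&
    typeof v["duration_ms"] === "number"
  );
}

// Returns parsed steps, or null when the artifact is not in the expected shape.
function parseGateEvidence(json: string): GateStep[] | null {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;
  const steps = data["steps"];
  if (!Array.isArray(steps)) return null;
  const out: GateStep[] = [];
  for (const s of steps) {
    if (!isGateStep(s)) return null;
    out.push(s);
  }
  return out;
}

// Keep only the last `max` lines of captured output.
function tailLines(text: string, max: number = 50): string {
  return text.split("\n").slice(-max).join("\n");
}

// Index of the first failed step (to expand by default), or -1 if none failed.
function firstFailedIndex(steps: GateStep[]): number {
  return steps.findIndex((s) => s.exit_code !== 0);
}
```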
-**Specification**:
+### UX-31 - Orchestrator Run Health Panel
-- Extend `/api/status` response to include `runtime` field from `index.runtime_sessions`.
-- Render in the top-right of the header, replacing or supplementing the connection indicator dot.
-- Show: provider name, model name (abbreviated), run uptime ("running Xh Ym"), heartbeat health ("heartbeat: Ns ago").
-- Heartbeat indicator: green if `last_heartbeat_at` is < 30s ago; amber if 30s-2min; red if > 2min.
-- Lease expiry countdown: "lease expires in Xm Ys" — red if < 60s.
-- If `runtime_sessions` is absent (no active run): show "No active orchestrator run" with grey styling.
-- Clicking the health card expands a tooltip with full details: run_id, owner_instance_id, full timestamps.
+**Intent:** expose run liveness and lease risk early.
----
+**Requirements:**
-### UX-32 — Throughput Sparklines
+- Include runtime session in status payload.
+- Show provider, model, uptime, heartbeat age, lease countdown.
+- Health states:
+  - `Healthy` (<30s heartbeat)
+  - `Degraded` (30s-2m)
+  - `Critical` (>2m)
+- No active runtime: show `No active orchestrator run`.
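
The heartbeat thresholds above map to a small classifier. This sketch assumes a heartbeat exactly 2 minutes old still counts as `Degraded` and that an unparseable timestamp is treated as `Critical`; the spec does not state either boundary case.

```typescript
type Health = "Healthy" | "Degraded" | "Critical";

// Classify run liveness from the age of the last heartbeat.
function classifyHeartbeat(lastHeartbeatAt: string, nowMs: number): Health {
  const ageMs = nowMs - Date.parse(lastHeartbeatAt);
  // Assumption: a timestamp that cannot be parsed is surfaced as Critical.
  if (Number.isNaN(ageMs)) return "Critical";
  if (ageMs < 30_000) return "Healthy";
  // Assumption: the 2-minute boundary itself is still Degraded.
  if (ageMs <= 120_000) return "Degraded";
  return "Critical";
}
```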
-**What**: Display a mini sparkline chart in the summary bar (UX-01) or a dedicated analytics section showing features merged per day/week, derived from the merged features list and their `last_updated` timestamps.
+### UX-32 - Throughput Sparkline
-**Why it matters**: Throughput — how many features are completing successfully per unit time — is the primary output metric for any engineering team using AOP. Without it, there's no way to know if the system is working well or if throughput has degraded. A sparkline gives instant trend visibility without requiring a full analytics page.
+**Intent:** provide fast trend check without analytics overload in Board view.
-**Specification**:
+**Requirements:**
-- No new backend data is needed: use `payload.features` already available, filtered to `phase === 'merged'`, grouped by `last_updated` date.
-- Compute a rolling 14-day histogram (date → count of merges on that date).
-- Render as an SVG sparkline: 14 bars, one per day, height proportional to count.
-- Show below or beside the summary bar with label "Merge throughput (14d)" and the total count "N features merged."
-- Bars for today and yesterday use a highlighted color; older bars use muted color.
-- Tooltip on hover: "March 4: 3 features merged."
-- Implement using pure SVG (no charting library dependency).
+- Compute 14-day merges histogram from merged feature timestamps.
+- Render compact SVG sparkline with tooltips.
+- Board shows tiny sparkline tile + link to full analytics route.
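
The 14-day histogram could be computed as sketched below. Bucketing by UTC calendar day and the `mergeHistogram` name are assumptions; the v1.0 draft derives the timestamps from `last_updated` on features in the `merged` phase.

```typescript
// Returns `days` counts, oldest day first and today last.
function mergeHistogram(mergedAt: string[], today: Date, days: number = 14): number[] {
  const dayMs = 86_400_000;
  // Assumption: days are bucketed by UTC date, not the viewer's local time zone.
  const todayIndex = Math.floor(today.getTime() / dayMs);
  const counts: number[] = new Array<number>(days).fill(0);
  for (const iso of mergedAt) {
    const t = Date.parse(iso);
    if (Number.isNaN(t)) continue; // skip unparseable timestamps
    const daysAgo = todayIndex - Math.floor(t / dayMs);
    if (daysAgo < 0 || daysAgo >= days) continue; // outside the window
    const slot = days - 1 - daysAgo;
    counts[slot] = (counts[slot] ?? 0) + 1;
  }
  return counts;
}
```

Each count can then drive the height of one SVG bar, scaled against the maximum value in the array.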
----
+### UX-33 - Provider Performance Analytics
-### UX-33 — Provider Performance Analytics Panel
+**Intent:** support provider/model tradeoff decisions.
-**What**: Add a dedicated "Analytics" tab or section that renders `performance.get_analytics` data as a sortable table showing provider/model performance: success rate, avg cost, avg duration, avg retry count.
+**Requirements:**
-**Why it matters**: The `performance.get_analytics` tool returns aggregated metrics per provider/model combination: `success_rate`, `avg_cost_usd`, `avg_duration_ms`, `avg_retry_count`, `total_features`. This data is extraordinarily valuable for teams choosing between Claude, Codex, Gemini, or custom providers — it answers "which model gives the best outcomes at the lowest cost?" with real data from your own runs.
+- API route: `GET /api/analytics`.
+- Place in `/analytics` route.
+- Table columns: provider, model, features, success rate, avg cost, avg duration, avg retries.
+- Sortable headers with accessible sort state labels.
+- Include recent outcomes timeline (last 20).
+- Refresh interval 30s, pause when tab hidden.
-**Specification**:
+### UX-34 - Collision Matrix Heatmap
-- Add API route `GET /api/analytics` that calls `callOrchestratorTool('performance.get_analytics', {})`.
-- Add a navigation tab or toggle: "Board" (default) | "Analytics" | "Locks" (see UX-40).
-- Analytics view renders two sections:
-  **Provider Comparison Table**: one row per provider/model, columns: Provider, Model, Features, Success Rate (as %), Avg Cost, Avg Duration, Avg Retries. Sortable by any column. Success rate rendered as a colored pill (green ≥ 80%, amber 50–79%, red < 50%).
-  **Recent Outcomes List**: the raw `outcomes` array as a timeline (last 20), showing feature_id, provider, model, status, gate_pass, cost_usd, duration (human-formatted).
-- Auto-refresh every 30s (less frequent than the main board polling).
-- Empty state: "No performance data recorded yet. Data accumulates as features complete."
+**Intent:** make collision topology understandable.
----
-### UX-34 — Collision Matrix Heatmap
-**What**: When multiple features have active plans, surface a collision matrix showing which features conflict with which others, and on which resources (files, areas, contracts).
-**Why it matters**: Collisions are the leading cause of blocked features. Currently the dashboard shows that a feature is blocked but gives no visual sense of the collision topology: which other feature is it conflicting with, on which files? The collision matrix answers this spatially — a quick scan shows if Feature A is blocking both Feature B and Feature C simultaneously, which indicates a high-priority merge situation.
-**Specification**:
+**Requirements:**
-- Add API route `GET /api/collisions` that calls `callOrchestratorTool('collisions.scan', {})`.
-- The collision scan returns a matrix structure indicating which features conflict.
-- Render as a compact grid: features on both axes, cells colored by collision type:
-  - 🔴 Red: file collision (highest priority)
-  - 🟠 Orange: area collision
-  - 🟡 Yellow: contract collision (openapi/events/db)
-  - ⚪ Grey: no collision
-- Clicking a cell opens a tooltip: "Feature A × Feature B: 3 file conflicts in `src/api/`."
-- Only show the matrix when ≥ 2 active features exist; show a message otherwise.
-- Render in the "Locks" / "Collisions" panel (see UX-40 multi-view).
-- Use `dep_blocked` entries as a key to highlight which features are actively queued.
+- API route: `GET /api/collisions`.
+- Render matrix for <=50 active features.
+- For >50 features, degrade to ranked collision pair list.
+- Cell tooltip includes collision type and count.
+- Use text legend + color legend.
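
The 50-feature threshold in UX-34 could be applied as sketched below. The `CollisionPair` and `CollisionView` shapes are assumptions for illustration, as is ranking the fallback list by descending conflict count; the spec does not define the `/api/collisions` payload.

```typescript
// Hypothetical pair record: two feature ids, a collision type, and a count.
interface CollisionPair {
  a: string;
  b: string;
  type: string;
  count: number;
}

type CollisionView =
  | { mode: 'matrix'; features: string[]; pairs: CollisionPair[] }
  | { mode: 'ranked_list'; pairs: CollisionPair[] };

const MATRIX_MAX_FEATURES = 50;

// Chooses the matrix for <=50 active features, otherwise a ranked pair list.
function buildCollisionView(activeFeatures: string[], pairs: CollisionPair[]): CollisionView {
  if (activeFeatures.length <= MATRIX_MAX_FEATURES) {
    return { mode: 'matrix', features: [...activeFeatures].sort(), pairs };
  }
  // Assumed ranking: most conflicts first.
  const ranked = [...pairs].sort((x, y) => y.count - x.count);
  return { mode: 'ranked_list', pairs: ranked };
}
```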
----
+### UX-35 - Plan Revision History
-### UX-35 — Plan Revision History
+**Intent:** provide context for scope churn.
-**What**: Show the full revision chain of a feature's plan in the detail panel, derived from `plan.plan_version`, `plan.revision_of`, and `plan.revision_reason`.
+**Requirements:**
-**Why it matters**: A feature that is on plan version 4 with reasons like "revised: build failed due to missing dependency" and "revised: QA requested broader test coverage" tells a completely different story than a feature on version 1. Revision history reveals how difficult the work was, whether the agent was on track, and why scope changed. This context is critical for reviews and for understanding systemic patterns.
+- Surface `plan_version`, `revision_of`, `revision_reason`.
+- Display as a compact revision timeline card.
+- If historical artifacts are not available, explicitly label: `Only current plan metadata is retained`.
-**Specification**:
+### UX-36 - Live Cross-Feature Event Feed
-- The current `plan.json` contains `plan_version`, and optionally `revision_of` and `revision_reason` for each version.
-- Note: previous plan versions are not retained on disk in the current architecture (only current plan.json exists). This improvement should expose what _is_ available: `plan_version` number, `revision_of` (which version it supersedes), and `revision_reason`.
-- In the plan viewer section, show: "Plan v{N}" as the header, with `revision_reason` if present.
-- If `plan_version > 1`, add a "This plan was revised from v{revision_of}" annotation with the revision reason.
-- If `revision_reason` is absent but `plan_version > 1`, show "Plan revised N time(s) — no reason recorded."
-- This is a light-touch display of already-available data; no new artifact storage is required.
+**Intent:** provide stream-level observability without noise overload.
----
+**Requirements:**
-### UX-36 — Live Cross-Feature Event Feed
+- Compute feed events by diffing SSE snapshots.
+- Event types: phase, gate, activity, pr, lock.
+- De-duplicate repeated identical events within 10s window.
+- Keep last 100 events.
+- Provide pause/resume and clear actions.
+- Collapsed by default.
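
The de-duplication and retention rules for UX-36 can be sketched as a pure function. `FeedItem` is a hypothetical reduced shape (the real `FeedEvent` type lives in `types.ts`), and treating two events as "identical" when feature, type, and message all match is an assumption.

```typescript
// Hypothetical reduced event shape for illustration only.
interface FeedItem {
  timestamp: string; // ISO-8601
  feature_id: string;
  type: 'phase' | 'gate' | 'activity' | 'pr' | 'lock';
  message: string;
}

const DEDUPE_WINDOW_MS = 10000;
const MAX_EVENTS = 100;

// Prepends `incoming` (newest first) unless an identical event exists
// within the 10s window, then trims the feed to the last 100 events.
function pushFeedEvent(feed: FeedItem[], incoming: FeedItem): FeedItem[] {
  const t = Date.parse(incoming.timestamp);
  const duplicate = feed.some(
    (e) =>
      e.feature_id === incoming.feature_id &&
      e.type === incoming.type &&
      e.message === incoming.message &&
      Math.abs(t - Date.parse(e.timestamp)) < DEDUPE_WINDOW_MS,
  );
  if (duplicate) return feed;
  return [incoming, ...feed].slice(0, MAX_EVENTS);
}
```

Returning a new array instead of mutating keeps the function usable as a React state updater, and the `slice` bound is what makes the "bounded memory" acceptance criterion checkable in a unit test.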
-**What**: Add a real-time event feed panel that shows a rolling log of state-change events across all features as they arrive via SSE — a "tail -f" view of your entire orchestration system.
+### UX-37 - Quick Launch Panel
-**Why it matters**: The Kanban board is a snapshot view. Developers monitoring a multi-feature run often want streaming visibility: "What just happened? Did something just fail? Did the QA agent just finish?" The SSE stream already delivers state snapshots every 2 seconds. Diffing consecutive snapshots surfaces discrete change events that can be rendered as a feed — "feature-auth transitioned building → qa" or "feature-payments gate fast: pass."
+**Intent:** provide in-dashboard feature launch for authorized operators in M45.
-**Specification**:
+**Requirements:**
-- Client-side only: diff the current `DashboardStatusPayload` against the previous one on each SSE snapshot event.
-- Detect changes: phase transitions, gate status changes, activity_state changes, PR status changes.
-- Emit `FeedEvent` objects:
-  ```typescript
-  interface FeedEvent {
-    id: string;
-    timestamp: string;
-    feature_id: string;
-    type: 'phase_change' | 'gate_update' | 'activity_change' | 'pr_update' | 'lock_change';
-    message: string;
-    severity: 'info' | 'success' | 'warning' | 'error';
-  }
-  ```
-- Render as a fixed-height (200px) scrollable panel at the bottom of the board, labeled "Live Events."
-- New events animate in at the top (prepend); keep last 100 events in memory.
-- Color-code by severity: green = success, red = error/fail, amber = warning, grey = info.
-- Events are clickable: clicking selects the associated feature for detail view.
-- A pause button stops auto-scrolling while reading.
-- This panel is collapsed by default; toggled via a small "Live Feed" toggle button.
+- Feature flagged: `dashboard.quick_launch`.
+- API route: `POST /api/run`.
+- Default path MUST call orchestrator tools (not shelling out).
+- Validate `feature_id` pattern and required goal.
+- Require explicit confirmation before launch.
+- If disabled by policy or permission, hide control and return authorization error on direct call.
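
A sketch of the server-side input check for `POST /api/run`. The regular expression is the `feature_id` pattern quoted in the previous revision of this spec (`^[a-z0-9_][a-z0-9_-]*$`); that it still applies in 2.0 is an assumption, and `validateLaunchRequest` and its result shape are hypothetical names.

```typescript
// Untrusted request body: fields are `unknown` until validated.
interface LaunchRequest {
  feature_id?: unknown;
  goal?: unknown;
}

type ValidationResult =
  | { ok: true; feature_id: string; goal: string }
  | { ok: false; code: 'invalid_input'; message: string };

// Pattern carried over from the earlier spec revision (assumed still current).
const FEATURE_ID_PATTERN = /^[a-z0-9_][a-z0-9_-]*$/;

function validateLaunchRequest(body: LaunchRequest): ValidationResult {
  const { feature_id, goal } = body;
  if (typeof feature_id !== 'string' || !FEATURE_ID_PATTERN.test(feature_id)) {
    return {
      ok: false,
      code: 'invalid_input',
      message: 'feature_id must match ' + FEATURE_ID_PATTERN.source,
    };
  }
  if (typeof goal !== 'string' || goal.trim().length === 0) {
    return { ok: false, code: 'invalid_input', message: 'goal is required' };
  }
  return { ok: true, feature_id, goal: goal.trim() };
}
```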
----
+### UX-38 - Flaky Test Indicator
-### UX-37 — Quick Launch Panel
+**Intent:** distinguish probable flaky failures from likely regressions.
-**What**: Add a "New Feature" button in the dashboard header that opens a modal form for starting a new feature run (`aop run`) directly from the UI, without requiring CLI access.
+**Requirements:**
-**Why it matters**: Every workflow involving AOP today starts at the CLI. A developer monitoring the dashboard must context-switch to a terminal to start the next feature. The quick launch panel makes the dashboard the home base for the full AOP workflow — start, monitor, review, approve — without leaving the browser.
+- API route: `GET /api/flaky`.
+- If data unavailable, omit indicator without error toast.
+- Show indicator only when failing required tests overlap flaky suspects.
+- Label as probabilistic signal, not definitive cause.
-**Specification**:
+### UX-39 - Feature Budget Meter
-- A "＋ New Feature" button in the dashboard header (top-right area).
-- Opens a modal form with fields:
-  - **Feature ID** (text input, validated against `^[a-z0-9_][a-z0-9_-]*$` pattern from `plan.schema.json`)
-  - **Goal / Description** (textarea, maps to `--goal` flag)
-  - **Provider** (optional select: codex / claude / gemini / custom / — use default)
-  - **Gate Profile** (optional select populated from `gates.list` tool response)
-- "Start Feature" button calls a new API endpoint `POST /api/run` that invokes the CLI programmatically or calls the appropriate orchestrator tool.
-- On success: close modal, show toast "Feature {id} started", feature appears in Planning column on next SSE update.
-- On error: show error in modal.
-- Implementation note: the `POST /api/run` endpoint shells out to `aop run` or calls `feature.init` + `plan.submit` via `callOrchestratorTool`. The exact mechanism depends on whether a live supervisor is running.
+**Intent:** prevent late surprise budget blocks.
----
+**Requirements:**
-### UX-38 — Flaky Test Indicator
+- API route: `GET /api/policy/budget`.
+- Meter compares `estimated_cost_usd` to per-feature limit when configured.
+- States: `Normal` (<60%), `Warning` (60-85%), `Critical` (>85%), `Exceeded`.
+- `paused_budget` phase/status must show explicit `Budget exhausted` banner.
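
The meter states map to a small classifier. This sketch assumes `Exceeded` means usage strictly above 100% of the limit and that an unset limit yields no state; neither boundary is stated in the spec, and `budgetState` is a hypothetical helper name.

```typescript
type BudgetState = 'Normal' | 'Warning' | 'Critical' | 'Exceeded';

// Returns null when no per-feature limit is configured.
function budgetState(estimatedCostUsd: number, limitUsd: number | null): BudgetState | null {
  if (limitUsd === null || limitUsd <= 0) return null;
  const ratio = estimatedCostUsd / limitUsd;
  if (ratio > 1) return 'Exceeded'; // assumed: strictly over the limit
  if (ratio > 0.85) return 'Critical';
  if (ratio >= 0.6) return 'Warning';
  return 'Normal';
}
```

For example, `budgetState(0.04, 2)` returns `'Normal'`, matching the 2% usage figure used in the earlier revision of this spec.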
-**What**: When the flaky gate intelligence system (M34-36 PRQ4) has quarantine data, surface a "known flaky" indicator on QA-phase feature cards and in the detail panel.
+### UX-40 - Multi-View Layout Toggle (Route-Native)
-**Why it matters**: One of the most frustrating experiences in CI/CD is a failed gate where the failure is a known flaky test. Without flaky awareness in the dashboard, a reviewer sees "gate: fail" and spends time investigating what turns out to be a pre-existing flakiness problem, not a regression introduced by the feature. Flaky indicators prevent this wasted investigation cycle.
+**Intent:** support triage and deep review without cramped hybrid layouts.
-**Specification**:
+**Requirements:**
-- Add API route `GET /api/flaky` that calls `callOrchestratorTool('gate.flaky_report_get', {})` (when M34-36 is implemented) or reads the flaky report artifact from `.aop/flaky-report.json` if available.
-- If the flaky tool/artifact is not available, omit the indicator gracefully (no error state shown).
-- Add to `DashboardStatusPayload`: `flaky_suspects: string[]` (list of test keys with flaky risk).
-- On feature cards in `qa` phase: if the feature's gate results include a failure AND any `required_tests` from its test index overlap with `flaky_suspects`, show a small amber badge "⚠ Known flaky tests may be involved."
-- In the detail panel, add a "Flaky Test Risk" section: list affected test names, their flaky probability, quarantine status, and expiry.
-- Quarantined tests (actively suppressed from blocking merge) are shown with a distinct "Quarantined" badge.
+- Views are route-native:
+  - Board: `/`
+  - List: `/?view=list`
+  - Focus: `/feature/:id`
+- List view supports sortable columns and keyboard row navigation.
+- Right-click-only interactions are prohibited; action menu button required.
+- Focus view includes full-width review sections with sticky action footer.
 ---
-### UX-39 — Feature Budget Meter
-**What**: For each feature, show a visual budget consumption meter comparing `cost.estimated_cost_usd` against the configured per-feature budget threshold from `policy.yaml`.
-**Why it matters**: Policy-level budget controls (`budget.per_feature_usd_limit` or similar) exist to prevent runaway agent costs. But without a visual indicator, developers don't know how close a feature is to exhausting its budget until it hits `paused_budget` status (which renders as blocked in the current board). A budget meter — like a fuel gauge — shows the consumption trajectory before it becomes a problem.
-**Specification**:
-- Add a new API route `GET /api/policy/budget` that reads the relevant budget fields from the composed policy (`policy.budget.per_feature_usd_limit` or equivalent).
-- On the feature detail panel, when cost data is available, show a progress bar: "Budget Usage: $0.04 / $2.00 (2%)".
-- Bar color: green < 60%, amber 60–85%, red > 85%.
-- If no budget limit is configured in policy, show the raw cost without a comparative bar.
-- On feature cards: show a small budget indicator dot (using the same color scheme) when budget usage > 60%.
-- Features at `paused_budget` status show "Budget exhausted" instead of a percentage.
----
-### UX-40 — Multi-View Layout Toggle
-**What**: Add a layout toggle that switches the dashboard between three views: **Board** (current Kanban layout), **List** (sortable table of all features), and **Focus** (single-column full-width view of a selected feature with maximum detail).
-**Why it matters**: Kanban is the right view when monitoring many features simultaneously. But it is a poor format for a reviewer who wants to deeply inspect one feature — the 1/3-width detail panel in a 2:1 grid is cramped. A **Focus** view gives a single feature the entire screen, rendering all detail sections (pipeline stepper, plan, diff, QA index, review brief, acceptance criteria, risk notes, cost) in a single scrolling document. A **List** view is better for operations tasks (sorting by cost, filtering by status, bulk review).
+## 3. API Contracts (Revised)
-**Specification**:
+### 3.1 Uniform Response Envelope (Mandatory)
-**Board View** (default): existing Kanban grid from M44.
+All new dashboard routes MUST return:
-**List View**:
-- Table with columns: Feature ID, Phase, Activity, Gates (fast/full/merge as icons), PR CI, Tokens, Cost, Last Updated (relative).
-- Sortable by any column (click header to sort ascending/descending).
-- Row click selects feature and opens a compact detail drawer (not full detail panel).
-- Row right-click opens a context menu with quick actions (approve, deny, request changes).
-**Focus View** (single-feature deep dive):
-- Full-page single-feature view.
-- Left column (60%): diff viewer (UX-05), plan scope tree (UX-23), QA test coverage map (UX-25).
-- Right column (40%): review brief (UX-28), acceptance criteria (UX-24), risk notes (UX-29), gate drill-down (UX-30), cost meter (UX-39), agent pipeline (UX-21).
-- Review action buttons fixed at the bottom of the right column.
-- "← Back to Board" button returns to the previous view.
-- Triggered by: clicking "Open in Focus" in the Kanban detail panel, or clicking a feature ID in the List view while holding Cmd/Ctrl.
-**Implementation**:
-- View state managed as `'board' | 'list' | 'focus'` in React state.
-- Toggle rendered as a three-button toggle group in the header bar.
-- Each view is a separate component in `src/components/views/`.
----
-## 2. Implementation Priorities
-### 2.1 Priority Tiers
-**Tier 1 — Immediately Valuable (no spec dependencies):**
-- UX-21: Agent Pipeline Stepper — data already in state frontmatter
-- UX-22: Cost & Token Tracker — cost.get tool already exists
-- UX-29: Plan Risk Annotations — plan.risk already in plan.json
-- UX-31: Orchestrator Run Health Panel — runtime_sessions already in index.json
-- UX-35: Plan Revision History — plan_version/revision_of already in plan.json
-**Tier 2 — High Impact, Requires New Data Wiring:**
-- UX-23: Plan Scope File Tree — plan.files already in plan.json, needs parsing
-- UX-24: Acceptance Criteria Live Tracker — plan.acceptance_criteria already available
-- UX-25: QA Test Coverage Map — requires reading qa_test_index.json
-- UX-26: Lock Resource Map — requires reading lock_leases from index.json
-- UX-27: Dependency Unblock Chain — dep_blocked in index.json
-- UX-32: Throughput Sparklines — derived from existing feature data
-- UX-36: Live Cross-Feature Event Feed — client-side diff of SSE data only
-**Tier 3 — Requires New API Routes or Upstream Specs:**
-- UX-28: Review Brief Renderer — depends on M34-36 PRQ5 artifact existing
-- UX-30: Gate Step Drill-Down — depends on gate evidence JSON structure
-- UX-33: Provider Performance Analytics — requires performance.get_analytics call
-- UX-34: Collision Matrix Heatmap — requires collisions.scan call
-- UX-37: Quick Launch Panel — requires `POST /api/run` implementation
-- UX-38: Flaky Test Indicator — depends on M34-36 PRQ4 flaky data
-- UX-39: Feature Budget Meter — requires policy budget config surfacing
-- UX-40: Multi-View Layout Toggle — large component, depends on UX-19 completion
----
-## 3. New API Routes Required
+- Success:
+  ```json
+  { "ok": true, "data": {}, "meta": { "stale": false, "source": "tool|artifact|derived" } }
+  ```
+- Error:
+  ```json
+  { "ok": false, "error": { "code": "string", "message": "string", "retryable": false } }
+  ```
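
One way to type the envelope in TypeScript is a discriminated union on `ok`. This sketch reads `"tool|artifact|derived"` as a three-value union, which is an interpretation of the JSON example; the type and helper names are hypothetical.

```typescript
type EnvelopeSource = 'tool' | 'artifact' | 'derived';

interface SuccessEnvelope<T> {
  ok: true;
  data: T;
  meta: { stale: boolean; source: EnvelopeSource };
}

interface ErrorEnvelope {
  ok: false;
  error: { code: string; message: string; retryable: boolean };
}

type ApiEnvelope<T> = SuccessEnvelope<T> | ErrorEnvelope;

// Builds a success envelope; `stale` defaults to false.
function okEnvelope<T>(data: T, source: EnvelopeSource, stale = false): SuccessEnvelope<T> {
  return { ok: true, data, meta: { stale, source } };
}

// Builds an error envelope; `retryable` defaults to false.
function errorEnvelope(code: string, message: string, retryable = false): ErrorEnvelope {
  return { ok: false, error: { code, message, retryable } };
}

// Callers narrow on `ok` before touching `data` or `error`.
function describe<T>(res: ApiEnvelope<T>): string {
  return res.ok ? 'ok from ' + res.meta.source : 'error ' + res.error.code;
}
```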
-| Route                            | Method | Purpose                                 | Spec  |
-| -------------------------------- | ------ | --------------------------------------- | ----- |
-| `/api/features/:id/cost`         | GET    | Return cost.get result via MCP tool     | UX-22 |
-| `/api/features/:id/test-index`   | GET    | Read `qa_test_index.json` from disk     | UX-25 |
-| `/api/features/:id/review-brief` | GET    | Read `review_brief.json` from disk      | UX-28 |
-| `/api/analytics`                 | GET    | Return performance.get_analytics result | UX-33 |
-| `/api/collisions`                | GET    | Return collisions.scan result           | UX-34 |
-| `/api/flaky`                     | GET    | Return gate.flaky_report_get result     | UX-38 |
-| `/api/policy/budget`             | GET    | Return budget policy fields             | UX-39 |
-| `/api/run`                       | POST   | Launch a new feature run                | UX-37 |
+### 3.2 Required Routes
+| Route                            | Method | Purpose                             | UX    |
+| -------------------------------- | ------ | ----------------------------------- | ----- |
+| `/api/features/:id/cost`         | GET    | Per-feature cost and tokens         | UX-22 |
+| `/api/features/:id/test-index`   | GET    | Per-feature QA test index           | UX-25 |
+| `/api/features/:id/review-brief` | GET    | Renderable review brief             | UX-28 |
+| `/api/analytics`                 | GET    | Provider/model analytics            | UX-33 |
+| `/api/collisions`                | GET    | Collision matrix or ranked list     | UX-34 |
+| `/api/flaky`                     | GET    | Flaky suspects/quarantine metadata  | UX-38 |
+| `/api/policy/budget`             | GET    | Budget policy thresholds            | UX-39 |
+| `/api/run`                       | POST   | Quick launch entry point (in-scope) | UX-37 |
+### 3.3 API Error Codes (Minimum Set)
+- `feature_not_found`
+- `artifact_missing`
+- `tool_unavailable`
+- `tool_timeout`
+- `policy_not_configured`
+- `unauthorized_action`
+- `invalid_input`
 ---
-## 4. New Type Additions to `src/lib/types.ts`
+## 4. Type Additions (`packages/web-dashboard/src/lib/types.ts`)
 ```typescript
-// UX-21
-export type RoleStatus = 'ready' | 'running' | 'blocked' | 'done';
+export type RoleStatus = 'ready' | 'running' | 'blocked' | 'done' | 'unknown';
 export interface AgentPipelineStatus {
   planner: RoleStatus;
   builder: RoleStatus;
   qa: RoleStatus;
 }
-// UX-22
 export interface CostSummary {
   feature_id: string;
   tokens_used: number;
@@ -565,20 +408,19 @@ export interface CostSummary {
   recorded_at: string | null;
 }
-// UX-25
 export interface QaTestIndexItem {
   path: string;
   status: 'pending' | 'running' | 'passed' | 'failed' | 'waived';
   required_tests: string[];
   last_run_at?: string;
 }
 export interface QaTestIndex {
   feature_id: string;
   version: number;
   items: QaTestIndexItem[];
 }
-// UX-26
 export interface LockLease {
   resource: string;
   holder: string | null;
@@ -586,7 +428,6 @@ export interface LockLease {
   is_stale: boolean;
 }
-// UX-28
 export interface ReviewBrief {
   intent_summary: string;
   scope_summary: string;
@@ -599,7 +440,6 @@ export interface ReviewBrief {
   generated_at: string;
 }
-// UX-33
 export interface ProviderAnalytics {
   provider: string;
   model: string;
@@ -611,7 +451,6 @@ export interface ProviderAnalytics {
   avg_cost_usd: number;
 }
-// UX-36
 export interface FeedEvent {
   id: string;
   timestamp: string;
@@ -622,39 +461,94 @@ export interface FeedEvent {
 }
 ```
-Additionally, extend `FeatureSummary` to include:
+`FeatureSummary` additions:
-- `role_status?: AgentPipelineStatus` (UX-21)
-- `gate_retry_count?: number` (UX-30)
-- `last_retry_at?: string | null` (UX-30)
+- `role_status?: AgentPipelineStatus`
+- `gate_retry_count?: number`
+- `last_retry_at?: string | null`
-And extend `DashboardStatusPayload` to include:
+`DashboardStatusPayload` additions:
-- `runtime?: RuntimeSession` (UX-31)
-- `lock_map?: LockLease[]` (UX-26)
-- `dep_blocked?: Array<{ feature_id: string; depends_on_unresolved: string[] }>` (UX-27)
-- `flaky_suspects?: string[]` (UX-38)
+- `runtime?: RuntimeSession`
+- `lock_map?: LockLease[]`
+- `dep_blocked?: Array<{ feature_id: string; depends_on_unresolved: string[] }>`
+- `flaky_suspects?: string[]`
+- `metrics?: { total_cost_today_usd?: number; merge_histogram_14d?: number[] }`
 ---
-## 5. Acceptance Criteria
+## 5. Agentic Implementation Plan
+### 5.1 Delivery Slices
+Execution priority for M45 is explicitly reviewer-first, then operator flow.
+1. **Slice A - Reviewer Core**
+   - UX-21, UX-23, UX-24, UX-29, UX-30, UX-35
+2. **Slice B - Operator Core**
+   - UX-26, UX-27, UX-31, UX-36, UX-39
+3. **Slice C - Analytics**
+   - UX-32, UX-33, UX-34, UX-38
+4. **Slice D - Launch + View Refinement**
+   - UX-37, UX-40, List/Focus routing polish
+### 5.2 Required File Targets
+- `packages/web-dashboard/src/app/page.tsx`
+- `packages/web-dashboard/src/app/analytics/page.tsx` (new)
+- `packages/web-dashboard/src/app/feature/[id]/page.tsx` (new)
+- `packages/web-dashboard/src/components/*` (new/updated per feature)
+- `packages/web-dashboard/src/lib/types.ts`
+- `packages/web-dashboard/src/lib/dashboard-utils.ts`
+- `packages/web-dashboard/src/lib/aop-client.ts`
+- `packages/web-dashboard/src/lib/orchestrator-tools.ts`
+- `packages/web-dashboard/src/app/api/**/route.ts` (new routes above)
+- `apps/control-plane/test/dashboard-*.spec.ts` (API + utils + interaction tests)
+### 5.3 Deterministic Fallback Rules
+- Missing artifact: return `ok: false`, `error.code = artifact_missing` and render empty state.
+- Missing tool support: `tool_unavailable`; UI renders feature section disabled with explanation.
+- Any optional section failure MUST NOT block board rendering.
+### 5.4 Security and Safety Rules
+- `POST /api/run` MUST validate inputs server-side, regardless of client validation.
+- Quick launch MUST honor policy + permission checks.
+- Role-based authorization enforcement is deferred for M45, but route handlers MUST call a pluggable authorization adapter interface so RBAC can be enabled without route rewrites in a later milestone.
+- Action confirmations required for launch/approve/deny/request changes.
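
The pluggable authorization adapter could look roughly like the sketch below. Every name here (`AuthorizationAdapter`, `allowAllAdapter`, `guardAction`) is hypothetical, and the adapter is kept synchronous for brevity; a real adapter would likely be asynchronous. The denial payload reuses the `unauthorized_action` code from section 3.3.

```typescript
interface AuthSubject {
  user_id: string | null;
}

// Hypothetical adapter contract; route handlers depend only on this interface.
interface AuthorizationAdapter {
  authorize(action: string, subject: AuthSubject): boolean;
}

// M45 default under the "enforcement deferred" rule: allow everything.
const allowAllAdapter: AuthorizationAdapter = {
  authorize: () => true,
};

type GuardResult =
  | { ok: true }
  | { ok: false; error: { code: 'unauthorized_action'; message: string; retryable: false } };

// Called at the top of a route handler before any side effects.
function guardAction(adapter: AuthorizationAdapter, action: string, subject: AuthSubject): GuardResult {
  if (adapter.authorize(action, subject)) return { ok: true };
  return {
    ok: false,
    error: { code: 'unauthorized_action', message: 'Not permitted: ' + action, retryable: false },
  };
}
```

Swapping `allowAllAdapter` for a role-aware implementation later would then change only the adapter wiring, not the route handlers.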
+---
+## 6. Acceptance Criteria
+M45 is accepted only when all are true:
+1. Functional criteria
+   - UX-21..UX-40 implemented according to this spec and degradations defined.
+2. Accessibility criteria
+   - Keyboard-only flow covers triage, feature selection, view switch, and actions.
+   - No status semantics rely on color alone.
+3. Reliability criteria
+   - Optional section failures do not break board rendering.
+   - API routes return uniform response envelope.
+4. Performance criteria
+   - No continuous high-frequency loops without justification.
+   - Event feed de-duplication and bounded memory confirmed.
+5. Engineering criteria
+   - `npm run lint`, `npm run typecheck`, `npm test`, `npm run build`, `npm run validate:mcp-contracts`, `npm run validate:architecture` all pass.
+---
-Each improvement is accepted when:
+## 7. Out of Scope
-1. It renders correctly in development mode (`npm run dev`).
-2. `npm run build` succeeds with no TypeScript or lint errors.
-3. Components that can be unit-tested have corresponding test files.
-4. Data that is unavailable (optional fields, features not yet run, specs not yet implemented) degrades gracefully with an informative empty/unavailable state — no crashes, no unhandled promise rejections.
-5. No new third-party runtime dependencies introduced without explicit justification. SVG charts, diff parsing, tree building — all implemented locally.
+- In-dashboard code editing or patch authoring.
+- Full mobile-first redesign beyond the mandatory 360px phone baseline.
+- Websocket migration (SSE remains transport for M45).
+- External alerting integrations (Slack/PagerDuty/webhooks).
 ---
-## 6. Out of Scope
+## 8. Open Questions (Need Product/Governance Input)
-- Real-time WebSocket upgrade (SSE polling at 2s is sufficient for this spec).
-- Server-side rendered pages for individual features (URL routing per feature).
-- In-dashboard code editing or patch application.
-- Notification webhooks or external alerting integration (belongs in M37 or a separate alerting spec).
-- AI-powered natural language queries over dashboard data.
-- Dark/light theme switching.
-- Mobile layout optimization.
+None for this revision.