npm - buildanything - Versions diffs - 2.0.0 → 2.1.1 - Mend

buildanything 2.0.0 → 2.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (115)

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +9 -1
package/README.md +57 -61
package/agents/a11y-architect.md +2 -0
package/agents/briefing-officer.md +172 -0
package/agents/business-model.md +14 -12
package/agents/code-architect.md +6 -1
package/agents/code-reviewer.md +3 -2
package/agents/code-simplifier.md +12 -4
package/agents/design-brand-guardian.md +19 -0
package/agents/design-critic.md +16 -11
package/agents/design-inclusive-visuals-specialist.md +2 -0
package/agents/design-ui-designer.md +17 -0
package/agents/design-ux-architect.md +15 -0
package/agents/design-ux-researcher.md +102 -7
package/agents/engineering-ai-engineer.md +2 -0
package/agents/engineering-backend-architect.md +2 -0
package/agents/engineering-data-engineer.md +2 -0
package/agents/engineering-devops-automator.md +2 -0
package/agents/engineering-frontend-developer.md +13 -0
package/agents/engineering-mobile-app-builder.md +2 -0
package/agents/engineering-rapid-prototyper.md +15 -2
package/agents/engineering-security-engineer.md +2 -0
package/agents/engineering-senior-developer.md +13 -0
package/agents/engineering-sre.md +2 -0
package/agents/engineering-technical-writer.md +2 -0
package/agents/feature-intel.md +8 -7
package/agents/ios-app-review-guardian.md +2 -0
package/agents/ios-foundation-models-specialist.md +2 -0
package/agents/ios-product-reality-auditor.md +292 -0
package/agents/ios-storekit-specialist.md +2 -0
package/agents/ios-swift-architect.md +1 -0
package/agents/ios-swift-search.md +1 -0
package/agents/ios-swift-ui-design.md +7 -4
package/agents/marketing-app-store-optimizer.md +2 -0
package/agents/planner.md +6 -1
package/agents/pr-test-analyzer.md +3 -2
package/agents/product-feedback-synthesizer.md +62 -0
package/agents/product-owner.md +163 -0
package/agents/product-reality-auditor.md +216 -0
package/agents/product-spec-writer.md +176 -0
package/agents/refactor-cleaner.md +9 -1
package/agents/security-reviewer.md +2 -1
package/agents/silent-failure-hunter.md +2 -1
package/agents/swift-build-resolver.md +2 -0
package/agents/swift-reviewer.md +2 -1
package/agents/tech-feasibility.md +5 -3
package/agents/testing-api-tester.md +2 -0
package/agents/testing-evidence-collector.md +24 -0
package/agents/testing-performance-benchmarker.md +2 -0
package/agents/testing-reality-checker.md +2 -1
package/agents/visual-research.md +7 -5
package/bin/adapters/scribe-tool.ts +4 -2
package/bin/adapters/write-lease-tool.ts +1 -1
package/bin/buildanything-runtime.ts +20 -107
package/bin/graph-index.js +24 -0
package/bin/graph-index.ts +340 -0
package/bin/mcp-servers/graph-mcp.js +26 -0
package/bin/mcp-servers/graph-mcp.ts +481 -0
package/bin/mcp-servers/orchestrator-mcp.js +26 -0
package/bin/mcp-servers/orchestrator-mcp.ts +361 -0
package/bin/setup.js +272 -111
package/commands/build.md +371 -158
package/commands/idea-sweep.md +2 -2
package/commands/setup.md +15 -4
package/commands/ux-review.md +3 -3
package/commands/verify.md +3 -0
package/docs/migration/phase-graph.yaml +573 -157
package/hooks/design-md-lint +4 -0
package/hooks/design-md-lint.ts +295 -0
package/hooks/pre-tool-use.ts +37 -6
package/hooks/record-mode-transitions.ts +63 -6
package/hooks/subagent-start.ts +3 -2
package/package.json +3 -1
package/protocols/agent-prompt-authoring.md +165 -0
package/protocols/architecture-schema.md +10 -3
package/protocols/cleanup.md +4 -0
package/protocols/decision-log.md +8 -4
package/protocols/design-md-authoring.md +520 -0
package/protocols/design-md-spec.md +362 -0
package/protocols/fake-data-detector.md +1 -1
package/protocols/ios-fake-data-detector.md +65 -0
package/protocols/ios-phase-branches.md +112 -27
package/protocols/launch-readiness.md +9 -5
package/protocols/metric-loop.md +1 -1
package/protocols/page-spec-schema.md +234 -0
package/protocols/product-spec-schema.md +354 -0
package/protocols/sprint-tasks-schema.md +53 -0
package/protocols/state-schema.json +38 -3
package/protocols/state-schema.md +32 -2
package/protocols/verify.md +29 -1
package/protocols/web-phase-branches.md +234 -64
package/skills/ios/ios-bootstrap/SKILL.md +1 -1
package/src/graph/ids.ts +86 -0
package/src/graph/index.ts +32 -0
package/src/graph/parser/architecture.ts +603 -0
package/src/graph/parser/component-manifest.ts +268 -0
package/src/graph/parser/decisions-jsonl.ts +407 -0
package/src/graph/parser/design-md-pass2.ts +253 -0
package/src/graph/parser/design-md.ts +477 -0
package/src/graph/parser/page-spec.ts +496 -0
package/src/graph/parser/product-spec.ts +930 -0
package/src/graph/parser/screenshot.ts +342 -0
package/src/graph/parser/sprint-tasks.ts +317 -0
package/src/graph/storage/index.ts +1154 -0
package/src/graph/types.ts +432 -0
package/src/graph/util/dhash.ts +84 -0
package/src/lrr/aggregator.ts +105 -10
package/src/orchestrator/hooks/context-header.ts +34 -10
package/src/orchestrator/hooks/token-accounting.ts +25 -14
package/src/orchestrator/mcp/cycle-counter.ts +2 -1
package/src/orchestrator/mcp/scribe.ts +27 -16
package/src/orchestrator/mcp/write-lease.ts +30 -13
package/src/orchestrator/phase4-shared-context.ts +20 -4
package/protocols/visual-dna.md +0 -185

package/protocols/web-phase-branches.md CHANGED

@@ -10,14 +10,14 @@ Every `subagent_type:` dispatch in this file prepends a CONTEXT header to its pr
 CONTEXT:
   project_type: web
   phase: <resolved: current phase number>
-  dna: <resolved: 6-axis DNA values extracted from docs/plans/visual-dna.md — NOT the full file content, just the axis values. Include only if phase >= 3 AND visual-dna.md exists>
+  dna: <resolved: 7-axis Brand DNA values extracted from `DESIGN.md` `## Overview > ### Brand DNA` block — NOT the full file content, just the axis values. Include only if phase >= 3 AND DESIGN.md exists>
 TASK:
 ```
 **Resolution rules:**
-- `dna` = the 6 axis values only (Scope, Density, Character, Material, Motion, Type) — NOT the full `visual-dna.md` content. ~100 tokens, not ~5K.
-- Phase 3 Step 3.0 (Visual DNA Selection) is the ONE exception — it runs BEFORE `visual-dna.md` exists, so its CONTEXT omits `dna`.
+- `dna` = the 7 axis values only (Scope, Density, Character, Material, Motion, Type, Copy) extracted from `DESIGN.md` `## Overview > ### Brand DNA` — NOT the full `DESIGN.md` content. ~100 tokens, not ~5K.
+- Phase 3 Step 3.0 (Visual DNA Selection) is the ONE exception — it runs BEFORE `DESIGN.md` exists, so its CONTEXT omits `dna`.
 - The rendered header is a stable prefix — it does not change between dispatches within a phase.
 Individual dispatches below reference `[CONTEXT header above]` and rely on this rendered template.
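
The resolution rule above (pass only the 7 axis values from `## Overview > ### Brand DNA`, not the whole file) can be sketched as follows. This is a minimal illustration, not the package's parser: the function name `extractBrandDna` is hypothetical, and the `- Axis: Value` bullet layout is an assumption, since the diff says only that the block lists all 7 axis values.

```typescript
// Hypothetical helper; the "- Axis: Value" bullet layout is an assumption.
const AXES = ["Scope", "Density", "Character", "Material", "Motion", "Type", "Copy"] as const;
type Axis = (typeof AXES)[number];

function isAxis(name: string): name is Axis {
  return (AXES as readonly string[]).includes(name);
}

// Collects axis values found under "## Overview" > "### Brand DNA" only.
function extractBrandDna(designMd: string): Partial<Record<Axis, string>> {
  const dna: Partial<Record<Axis, string>> = {};
  let inOverview = false;
  let inDna = false;
  for (const line of designMd.split(/\r?\n/)) {
    if (line.startsWith("## ")) {
      // A new h2 section closes any open Brand DNA block.
      inOverview = /^## Overview\s*$/.test(line);
      inDna = false;
      continue;
    }
    if (line.startsWith("### ")) {
      // Only a Brand DNA h3 that sits inside Overview counts.
      inDna = inOverview && /^### Brand DNA\s*$/.test(line);
      continue;
    }
    if (!inDna) continue;
    // Accepts "- Scope: Product" and "- **Scope**: Product".
    const m = line.match(/^\s*[-*]\s*\**([A-Za-z]+)\**\s*:\s*\**\s*(.+?)\s*$/);
    const name = m?.[1];
    const value = m?.[2];
    if (name && value && isAxis(name)) {
      dna[name] = value;
    }
  }
  return dna;
}
```

A caller would render the returned values into the `dna:` field of the CONTEXT header and omit the field when `DESIGN.md` is absent or the phase is below 3.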
@@ -30,29 +30,57 @@ Before the Quality Gate 2 approval prompt in `commands/build.md` is rendered, di
 Call the Agent tool once:
-1. Description: "Visual DNA directional preview" — subagent_type: `design-brand-guardian` — prompt: "[CONTEXT header above — phase: 2. NOTE: `dna` is omitted — this step produces the preview, not the lock.] Read `docs/plans/design-doc.md` (#persona, #scope, #voice), `docs/plans/findings-digest.md` (reference signals), and `docs/plans/architecture.md` (stack constraints). Emit a 3-5 bullet DIRECTIONAL preview of the intended Visual DNA — brand read in one line, then proposed leanings on Scope, Character, Material/Motion, and Type. NO rationale paragraphs, NO reference citations, NO incompatibility-matrix work. This is a sanity-check for the user at Gate 2, not the locked card. Save to `docs/plans/visual-dna-preview.md` as a flat bullet list. Target 150 tokens of output, max 250."
+1. Description: "Visual DNA directional preview" — subagent_type: `design-brand-guardian` — prompt: "[CONTEXT header above — phase: 2. NOTE: `dna` is omitted — this step produces the preview, not the lock.] Read `docs/plans/design-doc.md` (#persona, #scope, #voice), `docs/plans/phase1-scratch/findings-digest.md` (reference signals), and `docs/plans/architecture.md` (stack constraints). Emit a 3-5 bullet DIRECTIONAL preview of the intended Visual DNA — brand read in one line, then proposed leanings on Scope, Character, Material/Motion, and Type. NO rationale paragraphs, NO reference citations, NO incompatibility-matrix work. This is a sanity-check for the user at Gate 2, not the locked card. Save to `docs/plans/visual-dna-preview.md` as a flat bullet list. Target 150 tokens of output, max 250."
 Output: `docs/plans/visual-dna-preview.md` — surfaced by the orchestrator in the Gate 2 prompt alongside Architecture + Sprint Task List. Phase 3.0 Brand Guardian re-invokes to produce the full locked 6-axis card; the preview is discarded after Gate 2 approval.
 ## Phase 3 — Design (web branch)
-**Goal:** Lock a 6-axis Visual DNA card, then compose — not reconstruct — the product's visual system from a vendored component library. Every downstream step reads the DNA. Compositional beats reconstructive for visual quality. Fully autonomous.
+**Goal:** Lock the 7-axis Brand DNA inside `DESIGN.md` Pass 1, then compose — not reconstruct — the product's visual system from a vendored component library, then complete `DESIGN.md` Pass 2 with full tokens + prose. Every downstream step reads `DESIGN.md`. Compositional beats reconstructive for visual quality. Fully autonomous.
 **Skip if** the project has no user-facing frontend (CLI tools, pure APIs, backend services).
-HARD-GATE: UI/UX IS THE PRODUCT. This phase is a full peer to Architecture and Build — not a footnote, not an afterthought, not a "nice to have." Do NOT skip, compress, or rush this phase for any reason. Brand Guardian MUST lock the Visual DNA at Step 3.0 before any other agent runs. Every downstream step reads `docs/plans/visual-dna.md`. The `/design-system` route must be rendered and iterated with Playwright-verified feedback from the Design Critic before a single line of product code is written. Phase 4 (Build) Step 4.0 Scaffold WILL NOT START without both `docs/plans/visual-dna.md` AND `docs/plans/visual-design-spec.md`. If either is missing, return here.
+HARD-GATE: UI/UX IS THE PRODUCT. This phase is a full peer to Architecture and Build — not a footnote, not an afterthought, not a "nice to have." Do NOT skip, compress, or rush this phase for any reason. Brand Guardian MUST author Pass 1 of `DESIGN.md` (Overview + Brand DNA + Do's and Don'ts) at Step 3.0 before any other agent runs. Every downstream step reads `DESIGN.md`. The `/design-system` route must be rendered and iterated with Playwright-verified feedback from the Design Critic before a single line of product code is written. Phase 4 (Build) Step 4.0 Scaffold WILL NOT START without `DESIGN.md` complete (Pass 2 finished). If missing or incomplete, return here.
 HARD-GATE: **Compositional not reconstructive.** From Step 3.2 onward, every visual element that has a library variant MUST be mapped to that variant in `docs/plans/component-manifest.md`. Writing components from scratch when the library covers the case is a HARD-GATE violation that the cleanup agent will revert.
-### Step 3.0 — Visual DNA Selection (DNA owner, single agent)
+### Step 3.0 — DESIGN.md Pass 1 — Brand DNA + Overview + Do's and Don'ts (single agent)
-Dispatch a single agent to lock the 6-axis Visual DNA card that governs every downstream step in this phase.
+Dispatch a single agent to author Pass 1 of `DESIGN.md` (repo root). Pass 1 locks the 7-axis Brand DNA, writes the Overview prose, and seeds the Do's and Don'ts. Pass 2 (token + remaining prose) lands at Step 3.4.
 Call the Agent tool once:
-1. Description: "Visual DNA selection" — subagent_type: `design-brand-guardian` — prompt: "[CONTEXT header above — phase: 3. NOTE: Step 3.0 omits `dna` because this step PRODUCES it.] You are the DNA Owner for this build. Read these inputs: `docs/plans/design-doc.md` (product concept, user, voice), `docs/plans/findings-digest.md` (reference sites the user mentioned, competitor aesthetic landscape), `docs/plans/architecture.md` (stack constraints — e.g. server-rendered Rails can't ship Three.js), `docs/plans/quality-targets.json` (perf budget constrains motion and material choices), `docs/plans/user-decisions.md` (if the user said 'like Linear' or 'make it playful' during brainstorm). Lock a 6-axis Visual DNA card per the schema in `protocols/visual-dna.md`. The 6 axes: **Scope** (Marketing / Product / Dashboard / Internal Tool — gates library install), **Density** (Airy / Balanced / Dense), **Character** (Minimal / Editorial / Maximalist / Brutalist / Playful), **Material** (Flat / Glassy / Physical / Neumorphic), **Motion** (Still / Subtle / Expressive / Cinematic), **Type** (Neutral Sans / Humanist Sans / Serif-forward / Display-forward / Mono-accented). Consult the incompatibility matrix in `protocols/visual-dna.md` — you are FORBIDDEN from picking illegal combinations (e.g. Dashboard + Cinematic is contradictory). Write the locked DNA card to `docs/plans/visual-dna.md`."
+1. Description: "DESIGN.md Pass 1 — Brand DNA + Overview" — subagent_type: `design-brand-guardian` — prompt: "[CONTEXT header above — phase: 3. NOTE: Step 3.0 omits `dna` because this step PRODUCES it.] You are the Brand Guardian authoring Pass 1 of `DESIGN.md`. The format is specified by `protocols/design-md-spec.md` (vendored). The pipeline contract is in `protocols/design-md-authoring.md`. Read both before writing.
-Output: `docs/plans/visual-dna.md` — the locked DNA card. Every downstream Phase 3 step reads this file, and Phase 4 implementers read it via `refs.json`.
+Inputs (Read tool): `docs/plans/product-spec.md` (## App Overview for product identity, ## Screen Inventory for what screens exist, ## Permissions & Roles for complexity level — a dense admin panel needs different DNA than a simple consumer app), `docs/plans/design-doc.md` (product concept, user, voice), `docs/plans/phase1-scratch/findings-digest.md` (reference sites the user mentioned, competitor aesthetic landscape), `docs/plans/architecture.md` (stack constraints — e.g. server-rendered Rails can't ship Three.js), `docs/plans/quality-targets.json` (perf budget constrains motion and material choices), `docs/plans/phase1-scratch/user-decisions.md`.
+Lock the 7-axis Brand DNA per `protocols/design-md-authoring.md` §3 (incompatibility matrix). The 7 axes: **Scope** (Marketing / Product / Dashboard / Internal Tool), **Density** (Airy / Balanced / Dense), **Character** (Minimal / Editorial / Maximalist / Brutalist / Playful), **Material** (Flat / Glassy / Physical / Neumorphic), **Motion** (Still / Subtle / Expressive / Cinematic), **Type** (Neutral Sans / Humanist Sans / Serif-forward / Display-forward / Mono-accented), **Copy** (Functional / Narrative / Punchy / Technical). You are FORBIDDEN from picking illegal combinations from the §3 matrix.
+Write `DESIGN.md` at the **repository root** (NOT under `docs/plans/`) using the Pass 1 skeleton in `protocols/design-md-authoring.md` §5:
+- YAML front matter: `version: alpha`, `name: <Brand Name>`. Leave colors/typography/rounded/spacing/components empty for Pass 2.
+- `## Overview` with 2-4 paragraph brand description.
+- `### Brand DNA` h3 subsection listing all 7 axis values.
+- `### Rationale` h3 with 4-8 sentences citing design-doc.md sections + findings-digest signals.
+- `### Locked At` h3 with `locked_at` (ISO-8601, single-write), `locked_by: design-brand-guardian`, `build_session`.
+- `### References` h3 with at least 2 entries, each tied to specific axis pairs.
+- `## Colors`, `## Typography`, `## Layout`, `## Elevation & Depth`, `## Shapes`, `## Components` — present as headings with `<!-- Pass 2 — UI Designer at Step 3.4 -->` placeholder body. Section ORDER matters; the linter enforces it.
+- `## Do's and Don'ts` with at least 4 bullets (≥2 Do, ≥2 Don't), enforcing the anti-slop gates in §4 of the authoring protocol against the user's references.
+Apply the anti-slop gates from `protocols/design-md-authoring.md` §4 (font hard-ban, font overuse-ban, AI-slop pattern ban, Copy axis validation). When the user's references push toward a forbidden choice, reject it, pick the closest legal alternative, and emit a decision-log row naming the rejection.
+Output: `DESIGN.md` at repo root. Every downstream Phase 3 step reads this file."
+Output: `DESIGN.md` (repo root) — Pass 1. Step 3.4 completes Pass 2.
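
The Pass 1 skeleton above says section order matters and that a linter enforces it. The check below is a rough sketch of what such an order check might look like; it is not the package's `hooks/design-md-lint.ts`. The function name `checkSectionOrder` is hypothetical, and the expected order is assumed to be the order in which the skeleton lists the h2 headings.

```typescript
// Assumed order: the sequence in which the Pass 1 skeleton lists its h2 headings.
const EXPECTED_H2 = [
  "Overview",
  "Colors",
  "Typography",
  "Layout",
  "Elevation & Depth",
  "Shapes",
  "Components",
  "Do's and Don'ts",
];

// Returns a list of problems; an empty list means every section is present and in order.
function checkSectionOrder(designMd: string): string[] {
  const found = designMd
    .split(/\r?\n/)
    .filter((line) => line.startsWith("## "))
    .map((line) => line.slice(3).trim());
  const problems: string[] = [];
  let cursor = 0; // position of the last correctly ordered section
  for (const name of EXPECTED_H2) {
    const idx = found.indexOf(name);
    if (idx === -1) {
      problems.push(`missing section: ${name}`);
    } else if (idx < cursor) {
      problems.push(`section out of order: ${name}`);
    } else {
      cursor = idx;
    }
  }
  return problems;
}
```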
+#### Step 3.0.idx — DESIGN.md Pass 1 graph index
+After `design-brand-guardian` returns and `DESIGN.md` is on disk, index it into the build graph. Slice 2 graph index — required for downstream agents.
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js DESIGN.md`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
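
The index step above (run the indexer, log the result, stop on a non-zero exit) recurs after several Phase 3 steps. Its control flow can be sketched as below, assuming the command runner and the log writer are injected; `runIndexStep`, `Runner`, and the log message wording are hypothetical names chosen for illustration, not part of the package.

```typescript
// Result shape mirrors the fields of interest from a synchronous process spawn.
interface RunResult {
  status: number | null;
  stderr: string;
}
type Runner = (command: string, args: string[]) => RunResult;

// Returns true when the pipeline may continue, false when it must stop.
function runIndexStep(
  target: string,
  pluginRoot: string,
  run: Runner,
  appendLog: (line: string) => void,
): boolean {
  const script = `${pluginRoot}/bin/graph-index.js`;
  const result = run("node", [script, target]);
  if (result.status === 0) {
    appendLog(`graph-index OK: ${target}`);
    return true;
  }
  // Any other status (including null for a killed process) is treated as failure.
  appendLog(`graph-index FAILED (exit ${result.status}): ${target}: ${result.stderr.trim()}`);
  return false;
}
```

In Node, `run` could wrap `child_process.spawnSync` with `encoding: "utf8"`, and `appendLog` could append to `docs/plans/build-log.md`; injecting both keeps the decision logic testable without touching the file system.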
 ### Step 3.1 — Visual Research (2 agents, parallel, both Playwright-driven)
@@ -60,37 +88,94 @@ Research is now goal-directed — validate and enrich the locked DNA, not catalo
 Call the Agent tool 2 times in one message:
-1. Description: "Competitive visual audit" — subagent_type: `visual-research` — prompt: "[CONTEXT header above — phase: 3] Mode: `competitive-audit`. Read `docs/plans/visual-dna.md` to understand the locked DNA. Find 5-8 rival UIs that exemplify the chosen DNA axes (NOT all competitors — only ones that nail the axes we chose). Use Playwright to screenshot each at desktop 1920x1080 and mobile 375x812. For each site, analyze which DNA axes it nails and which it doesn't. Save screenshots to `docs/plans/design-references/competitors/`. Append findings to `docs/plans/design-references.md` grouped by DNA axis (motion refs, material refs, typography refs, character refs, density refs). Optional caller-supplied competitor URLs: [list or 'none']."
+1. Description: "Competitive visual audit" — subagent_type: `visual-research` — prompt: "[CONTEXT header above — phase: 3] Mode: `competitive-audit`. Read `DESIGN.md` (`## Overview > ### Brand DNA`) to understand the locked DNA. Find 5-8 rival UIs that exemplify the chosen DNA axes (NOT all competitors — only ones that nail the axes we chose). Use Playwright to screenshot each at desktop 1920x1080 and mobile 375x812. For each site, analyze which DNA axes it nails and which it doesn't. Save screenshots to `docs/plans/design-references/competitors/`. Append findings to `docs/plans/design-references.md` grouped by DNA axis (motion refs, material refs, typography refs, character refs, density refs). Optional caller-supplied competitor URLs: [list or 'none']."
-2. Description: "Design inspiration mining" — subagent_type: `visual-research` — prompt: "[CONTEXT header above — phase: 3] Mode: `inspiration-mining`. Read `docs/plans/visual-dna.md`. Search Awwwards.com, Godly.website, and SiteInspire for award-winning sites that match the DNA axes. Use Playwright to screenshot the top 5-8 results at desktop 1920x1080 and mobile 375x812. Save to `docs/plans/design-references/inspiration/`. Append findings to `docs/plans/design-references.md` grouped by DNA axis. Tag every reference with the specific axis (or axes) it validates."
+2. Description: "Design inspiration mining" — subagent_type: `visual-research` — prompt: "[CONTEXT header above — phase: 3] Mode: `inspiration-mining`. Read `DESIGN.md` (`## Overview > ### Brand DNA`). Search Awwwards.com, Godly.website, and SiteInspire for award-winning sites that match the DNA axes. Use Playwright to screenshot the top 5-8 results at desktop 1920x1080 and mobile 375x812. Save to `docs/plans/design-references/inspiration/`. Append findings to `docs/plans/design-references.md` grouped by DNA axis. Tag every reference with the specific axis (or axes) it validates."
 Output: `docs/plans/design-references.md` — reference paths grouped by DNA axis, ready to feed Step 3.2 component mapping and Step 3.6 critic scoring.
+#### Step 3.1.idx — Design references graph index
+After both `visual-research` agents return and `docs/plans/design-references/` is populated with screenshots, index the directory into the build graph as Slice 5 reference fragments. Required for downstream agents.
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/design-references/`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
 ### Step 3.2 — Component Library Mapping (single agent, HARD-GATE source)
 This is the compositional step. The Visual Designer picks specific library component variants for every slot the product needs, using the static DNA→variant catalog as its source of truth. The output is a locked manifest that Phase 4 implementers MUST import from.
 Call the Agent tool once:
-1. Description: "Component library mapping" — subagent_type: `design-ui-designer` — prompt: "[CONTEXT header above — phase: 3] Read `docs/plans/visual-dna.md`, `docs/plans/design-references.md`, and `docs/library-refs/component-library-catalog.md` (the static reference mapping DNA-axis combinations to library component variants). Pick specific component variants for each slot the product needs: hero, cards, cta, nav, marquee, chart, 3D, form elements, modals. The catalog is authoritative — when the DNA matches a row, use the variants that row specifies; do not reinvent. Write `docs/plans/component-manifest.md` with the locked component picks, one row per slot, naming the library and the variant. For any slot the catalog doesn't cover, emit a row tagged 'manifest gap' with a short fallback plan (stock shadcn primitive plus notes)."
+1. Description: "Component library mapping" — subagent_type: `design-ui-designer` — prompt: "[CONTEXT header above — phase: 3] Read `DESIGN.md` (`## Overview > ### Brand DNA` for axis values; `### References` for reference paths), `docs/plans/design-references.md`, `docs/plans/product-spec.md` (## Screen Inventory for what screens exist, per-feature States and Empty/Loading/Error States sections for what component states are needed — e.g. a feature with 7 states needs more component variants than one with 3), and `docs/library-refs/component-library-catalog.md` (the static reference mapping DNA-axis combinations to library component variants). Pick specific component variants for each slot the product needs: hero, cards, cta, nav, marquee, chart, 3D, form elements, modals. The catalog is authoritative — when the DNA matches a row, use the variants that row specifies; do not reinvent. Write `docs/plans/component-manifest.md` with the locked component picks, one row per slot, naming the library and the variant. For any slot the catalog doesn't cover, emit a row tagged 'manifest gap' with a short fallback plan (stock shadcn primitive plus notes)."
 Output: `docs/plans/component-manifest.md` — locked component manifest.
 **HARD-GATE:** Phase 4 implementers MUST import from this manifest. Writing components from scratch when the manifest names one is a HARD-GATE violation. The cleanup agent will flag and revert custom-written components that have a manifest entry. See the Phase 4 HARD-GATE block below.
+#### Step 3.2.idx — Component manifest graph index
+After `design-ui-designer` returns and `docs/plans/component-manifest.md` is on disk, index it into the build graph. Slice 2 graph index — required for downstream agents.
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/component-manifest.md`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
 ### Step 3.2b — DNA Persona Check
-Call the Agent tool — description: "DNA persona check" — subagent_type: design-ux-researcher — prompt: "[CONTEXT header above — phase: 3] Read docs/plans/visual-dna.md (the locked 6-axis DNA card) + docs/plans/design-doc.md (#persona and #jobs-to-be-done sections) + docs/plans/findings-digest.md. Validate: does the chosen Visual DNA actually serve this persona and these jobs-to-be-done? Cross-check each DNA axis against the persona's context (e.g., if persona is 'senior enterprise buyer on a tight schedule' but DNA chose Maximalist + Cinematic, that's wrong — Enterprise/Minimal/Subtle fits better). Report any DNA-persona mismatches. If mismatches found, the Brand Guardian may need to re-lock the DNA (backward edge to Step 3.0). Save findings to docs/plans/dna-persona-check.md."
+Call the Agent tool — description: "DNA persona check" — subagent_type: design-ux-researcher — prompt: "[CONTEXT header above — phase: 3] Read `DESIGN.md` (the full Pass 1 — `## Overview` including `### Brand DNA` is the locked 7-axis card and `### Rationale` explains why those axes were chosen) + docs/plans/design-doc.md (#persona and #jobs-to-be-done sections) + docs/plans/product-spec.md (## App Overview and per-feature Persona Constraints sections — these carry the specific behavioral patterns from research, e.g. 'user scans, doesn't read') + docs/plans/phase1-scratch/findings-digest.md. Validate: do the locked DNA axes actually serve this persona and these jobs-to-be-done? Cross-check each DNA axis against the persona's context (e.g., if persona is 'senior enterprise buyer on a tight schedule' but DNA chose Maximalist + Cinematic, that's wrong — Enterprise/Minimal/Subtle fits better). Report any DNA-persona mismatches. If mismatches found, the Brand Guardian may need to re-author DESIGN.md Pass 1 (backward edge to Step 3.0). Save findings to docs/plans/dna-persona-check.md."
-### Step 3.3 — UX Architecture (single agent)
+### Step 3.3 — UX Architecture + Page Layouts (single agent)
-Structural design must align to the locked DNA — a Dense layout behaves differently from an Airy layout even for the same user flow.
+Structural design must align to the locked DNA — a Dense layout behaves differently from an Airy layout even for the same user flow. This step produces BOTH the UX architecture (flows, navigation, IA) AND per-screen page specs with ASCII wireframes. Flows and layouts inform each other — a checkout flow might be 2 steps or 3 depending on what fits spatially, and a sidebar nav only makes sense if the screen count warrants it.
 Call the Agent tool once:
-1. Description: "UX architecture" — subagent_type: `design-ux-architect` — prompt: "[CONTEXT header above — phase: 3] Read `docs/plans/visual-dna.md`, `docs/plans/component-manifest.md`, and the #frontend anchor in `docs/plans/architecture.md`. Design information architecture, user flows, interaction patterns, and responsive strategy — all aligned to the locked DNA. Dense layout behaves differently than Airy layout even for the same flow; Cinematic motion reshapes page transitions versus Subtle motion. Map each user flow to the component-manifest slots it needs. Save to `docs/plans/ux-architecture.md`."
+1. Description: "UX architecture + page layouts" — subagent_type: `design-ux-architect` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 3] Read the page spec schema at `protocols/page-spec-schema.md` before writing. Then read these inputs via your Read tool:
+  - Product spec: `docs/plans/product-spec.md` (FULL document — this is your source of truth. Screen Inventory is your screen list. Per-feature sections define what each screen does, what data it shows, what states exist, what errors look like, persona constraints, business rules)
+  - Visual DNA: `DESIGN.md` `## Overview > ### Brand DNA` (Density axis drives layout — Airy = generous whitespace, Dense = compact data. Character and Motion axes shape navigation transitions and interaction patterns)
+  - Components: `docs/plans/component-manifest.md` (which library components for which slots — use these in your wireframes)
+  - Frontend architecture: `docs/plans/architecture.md#frontend` (component hierarchy, routing, state management)
+  - API contracts: `docs/plans/architecture.md#backend/api` (what data is available from each endpoint)
+  - Design references: `docs/plans/design-references/` (competitor/inspiration screenshots for layout reference)
+  - PRD: `docs/plans/design-doc.md` (#persona, #jobs-to-be-done, #scope)
-Output: `docs/plans/ux-architecture.md`.
+Produce TWO outputs:
+**Output 1: `docs/plans/ux-architecture.md`** — information architecture, user flows (derived from product spec feature flows, not invented), navigation model, interaction patterns, responsive strategy. Map each user flow to the component-manifest slots it needs. The product spec's feature flows are your behavioral source of truth — refine and structure them into screen-to-screen journeys, don't reinvent them.
+**Output 2: `docs/plans/page-specs/*.md`** — one file per screen from the Screen Inventory, following `protocols/page-spec-schema.md`. Each file includes: ASCII wireframe (desktop + mobile for web), content hierarchy with component refs from the manifest and data sources from the API contracts, key copy, responsive behavior, platform conventions, data loading strategy, and screen-specific states from the product spec.
+The Density axis from DESIGN.md is your primary layout driver. Airy = generous spacing, fewer items visible per viewport. Dense = compact rows, data tables, more items per viewport. Match the density to the persona constraints from the product spec.
+NOTE: The visual design spec (exact spacing values, typography ramp) does not exist yet at this step. Use the DNA Density axis for spatial decisions (airy vs dense) and the component manifest for component choices. Phase 4 implementers have specialized build skills and will apply the final token values from the visual design spec when they build — your layouts define the spatial arrangement and content hierarchy, not pixel-precise measurements."
+Output: `docs/plans/ux-architecture.md` + `docs/plans/page-specs/*.md`.
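
Output 2 requires one page-spec file per screen in the Screen Inventory. A coverage check along those lines might look like the sketch below. The file-naming rule (a lowercase, hyphenated slug of the screen name plus `.md`) is an assumption, since the diff does not state one, and both function names are hypothetical.

```typescript
// Assumed naming rule: "Order History" -> "order-history.md".
function slugify(screen: string): string {
  return screen
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Returns the screens from the inventory that have no matching page-spec file.
function missingPageSpecs(screens: string[], files: string[]): string[] {
  const present = new Set(files);
  return screens.filter((screen) => !present.has(`${slugify(screen)}.md`));
}
```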
+#### Step 3.3.idx — Page-specs graph index
+After `design-ux-architect` returns and `docs/plans/page-specs/` is populated with one .md file per screen, index the directory into the build graph. Slice 3 graph index — best-effort, BO falls back to file reads on failure.
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/page-specs/`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
+### Step 3.3b — UX Flow Validation
+Validate the UX architecture against the target persona's actual goals and jobs-to-be-done before the Visual Design Spec is built on top of it.
+Call the Agent tool once:
+1. Description: "UX flow validation" — subagent_type: `design-ux-researcher` — prompt: "[CONTEXT header above — phase: 3] Read `docs/plans/ux-architecture.md`, `docs/plans/page-specs/` (the ASCII wireframes — validate that layouts serve the persona), `docs/plans/product-spec.md` (per-feature Happy Path and Persona Constraints — these are the behavioral source of truth the flows must implement), `docs/plans/design-doc.md` (#persona, #jobs-to-be-done, #scope sections), and `DESIGN.md`. For each user flow in the UX architecture, walk through it as the target persona: narrate the steps, flag friction points, check if the flow serves the persona's jobs-to-be-done efficiently. Specifically check: (1) Are there screens or sections the persona doesn't need? (2) Are critical tasks reachable in the minimum number of steps? (3) Does the information hierarchy match what the persona cares about most? (4) Does the navigation pattern fit the persona's context (mobile-first for on-the-go users, sidebar for desktop power users, etc.)? (5) Does the responsive strategy degrade gracefully for the persona's primary device? Report findings to `docs/plans/ux-flow-validation.md` with pass/flag per flow. If critical flow issues are found, the UX Architect should revise `ux-architecture.md` before proceeding (backward edge to Step 3.3)."
+Output: `docs/plans/ux-flow-validation.md`.
 ### Step 3.4 — Visual Design Spec (single agent, second Visual Designer invocation)
@@ -98,7 +183,7 @@ The Visual Designer re-invokes as writer this time, producing the much richer Vi
 Call the Agent tool once:
-1. Description: "Visual design spec" — subagent_type: `design-ui-designer` — prompt: "[CONTEXT header above — phase: 3] Second invocation as writer. Read `docs/plans/visual-dna.md`, `docs/plans/component-manifest.md`, `docs/plans/ux-architecture.md`, and `docs/plans/design-references.md`. Write `docs/plans/visual-design-spec.md` with ALL the following layers:
+1. Description: "Visual design spec" — subagent_type: `design-ui-designer` — prompt: "[CONTEXT header above — phase: 3] Second invocation as writer. Read `DESIGN.md`, `docs/plans/component-manifest.md`, `docs/plans/ux-architecture.md`, `docs/plans/design-references.md`, `docs/plans/product-spec.md` (per-feature States and Empty/Loading/Error States — the state matrix must cover every state the product spec defines, not just generic defaults), and `docs/plans/page-specs/` (the ASCII wireframes — the typography ramp and spacing scale must work for the actual page layouts, not just in isolation). Write `DESIGN.md` with ALL the following layers:
 **TOKENS** (existing): color system (hex, light + dark), typography scale, spacing (8px base), shadows, radius.
@@ -112,13 +197,23 @@ Call the Agent tool once:
 Every token, parameter, and rule must be derivable from the DNA card plus the design references. Cite the reference path for every non-obvious choice."
-Output: `docs/plans/visual-design-spec.md` — substantially richer than the prior one-layer spec.
+Output: `DESIGN.md` — substantially richer than the prior one-layer spec.
+#### Step 3.4.idx — DESIGN.md Pass 2 token re-index
+After `design-ui-designer` completes Pass 2 of `DESIGN.md` (YAML front matter + Pass 2 prose sections populated), re-run the indexer on DESIGN.md. The CLI dispatch detects Pass 2 content and writes `slice-3-tokens.json` alongside the existing `slice-2-dna.json` (which is also overwritten with the latest Pass 1 state for consistency).
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js DESIGN.md`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
 ### Step 3.5 — Inclusive Visuals Check (single agent)
 Call the Agent tool once:
-1. Description: "Inclusive visuals check" — subagent_type: `design-inclusive-visuals-specialist` — prompt: "[CONTEXT header above — phase: 3] Read `docs/plans/visual-dna.md`, `docs/plans/component-manifest.md`, and `docs/plans/visual-design-spec.md`. Audit for representation gaps, imagery bias, color choices that exclude colorblind users, contrast failures, and culturally-specific iconography that doesn't translate. Write findings to `docs/plans/inclusive-visuals-audit.md`."
+1. Description: "Inclusive visuals check" — subagent_type: `design-inclusive-visuals-specialist` — prompt: "[CONTEXT header above — phase: 3] Read `DESIGN.md` and `docs/plans/component-manifest.md`. Audit for representation gaps, imagery bias, color choices that exclude colorblind users, contrast failures, and culturally-specific iconography that doesn't translate. Write findings to `docs/plans/inclusive-visuals-audit.md`."
 Output: `docs/plans/inclusive-visuals-audit.md`.
@@ -130,15 +225,15 @@ This is the only Phase 3 step that writes code. Wrapped in a generator/critic me
 Call the Agent tool once:
-1. Description: "Build living style guide" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 3] [COMPLEXITY: L] Read `docs/plans/component-manifest.md` and `docs/plans/visual-design-spec.md`. Build a `/design-system` route with rendered, interactive examples of every chosen variant from the manifest. **HARD-GATE: Import from the installed libraries. Do NOT write components from scratch when the manifest names one.** Every component must be interactive (hover, focus, transitions all work). Mobile-responsive. This ships with the product. Commit: 'feat: living style guide'."
+1. Description: "Build living style guide" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 3] [COMPLEXITY: L] Read `docs/plans/component-manifest.md` and `DESIGN.md`. Build a `/design-system` route with rendered, interactive examples of every chosen variant from the manifest. **HARD-GATE: Import from the installed libraries. Do NOT write components from scratch when the manifest names one.** Every component must be interactive (hover, focus, transitions all work). Mobile-responsive. This ships with the product. Commit: 'feat: living style guide'."
 **Metric loop wrapper** (per `protocols/metric-loop.md`):
-- **Critic** — Call the Agent tool — description: "Design critic scoring pass" — subagent_type: `design-critic` — prompt: "[CONTEXT header above — phase: 3] SCORING CRITERIA CHECKLIST: [paste the checklist from `active_metric_loop.scoring_criteria_checklist` in `.build-state.json` — NOT the raw reference docs]. Capture the rendered `/design-system` route via Playwright screenshot (desktop 1920x1080 + mobile 375x812). Score the gap on **6 DNA axes** (Scope fit, Density, Character, Material, Motion, Type — 20 points each) plus **5 craft dimensions** (whitespace rhythm, visual hierarchy, motion coherence, color harmony, typographic refinement — 20 points each). Total 220. Target 180. Every finding must cite a specific element with file:line reference AND reference the checklist criteria — score a gap, not an opinion. Suggest concrete improvements ('the card padding is 16px but the checklist says Density: Airy — 32px — bump to 32px'). Iteration 1 MAY Read `docs/plans/design-references.md` for visual comparison; iteration 2+ MUST NOT unless diagnosis explicitly flags a visual-reference gap. Default verdict: NEEDS WORK. Never edit code. Max 5 iterations before exit."
+- **Critic** — Call the Agent tool — description: "Design critic scoring pass" — subagent_type: `design-critic` — prompt: "[CONTEXT header above — phase: 3] SCORING CRITERIA CHECKLIST: [paste the checklist from `active_metric_loop.scoring_criteria_checklist` in `.build-state.json` — NOT the raw reference docs]. Capture the rendered `/design-system` route via Playwright screenshot (desktop 1920x1080 + mobile 375x812). Also read `docs/plans/page-specs/` to understand what page compositions these components will be used in — score components in the context of their actual usage, not just in isolation. Score the gap on **7 DNA axes** (Scope fit, Density, Character, Material, Motion, Type, Copy — 20 points each) plus **5 craft dimensions** (whitespace rhythm, visual hierarchy, motion coherence, color harmony, typographic refinement — 20 points each). Total 240. Target 195. <!-- Scoring scale: see agents/design-critic.md for authoritative thresholds --> Every finding must cite a specific element with file:line reference AND reference the checklist criteria — score a gap, not an opinion. Suggest concrete improvements ('the card padding is 16px but the checklist says Density: Airy — 32px — bump to 32px'). Iteration 1 MAY Read `docs/plans/design-references.md` for visual comparison; iteration 2+ MUST NOT unless diagnosis explicitly flags a visual-reference gap. Default verdict: NEEDS WORK. Never edit code. Max 5 iterations before exit."
 - **Generator (re-invocation, iteration 2+)** — Call the Agent tool — description: "Apply critic's top issue" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt: "TARGETED FIX from metric loop diagnosis: [paste top issue from Step 3 diagnosis]. Files: [paste file paths]. Relevant criteria from checklist: [paste the specific checklist values that relate to the top issue — e.g., 'Density: Airy — 32px card padding']. Apply ONLY the top issue. Do not re-critique. Do not refactor other parts. Re-render the `/design-system` route. Return the commit SHA." NOTE: Do NOT include `[CONTEXT header above]` on iteration 2+ — the generator already has the codebase context from iteration 1. Per `protocols/metric-loop.md` Step 4 iteration-aware context rule.
-- **Exit conditions:** quality target hit (score ≥ 180), stall (no score improvement for 2 consecutive rounds), or max iterations (5 total).
+- **Exit conditions:** quality target hit (score ≥ 195), stall (no score improvement for 2 consecutive rounds), or max iterations (5 total).
 Record the score history to `docs/plans/build-log.md` under `## Design Critic Loop`.
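The scoring arithmetic and exit conditions of the updated critic loop can be sketched as below. The constants come from the new prompt (7 DNA axes and 5 craft dimensions at 20 points each, target 195, at most 5 iterations). The function name `exit_reason`, the order in which the conditions are checked, and the reading of "stall" as two consecutive rounds that each fail to beat the round before are assumptions; `protocols/metric-loop.md` is the authority.

```python
DNA_AXES = 7
CRAFT_DIMENSIONS = 5
POINTS_EACH = 20
MAX_SCORE = (DNA_AXES + CRAFT_DIMENSIONS) * POINTS_EACH  # 240
TARGET = 195
MAX_ITERATIONS = 5
STALL_ROUNDS = 2


def exit_reason(score_history):
    """Return 'target', 'stall', 'max_iterations', or None to keep looping."""
    if not score_history:
        return None
    if score_history[-1] >= TARGET:
        return "target"
    if len(score_history) >= MAX_ITERATIONS:
        return "max_iterations"
    # Assumed stall rule: the last STALL_ROUNDS rounds each failed to
    # improve on the round immediately before them.
    if len(score_history) > STALL_ROUNDS:
        recent = score_history[-(STALL_ROUNDS + 1):]
        if all(later <= earlier for earlier, later in zip(recent, recent[1:])):
            return "stall"
    return None
```

For example, a history of `[150, 160, 160, 158]` exits on stall, while `[150, 160, 170]` keeps looping.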
@@ -154,9 +249,11 @@ Output: `docs/plans/a11y-design-review.md`.
 ### Step 3.8 — Autonomous Quality Gate
-Log to `docs/plans/build-log.md`: final screenshot paths, Design Critic score history (per-round totals plus per-axis subscores), a11y findings count by severity, and a DNA compliance score derived from the critic's 6 DNA-axis subscores. No user pause.
+Log to `docs/plans/build-log.md`: final screenshot paths, Design Critic score history (per-round totals plus per-axis subscores), a11y findings count by severity, a DNA compliance score derived from the critic's 7 DNA-axis subscores, and the DESIGN.md lint result (broken-refs count, warning count, hash). No user pause.
-Phase 4 HARD-GATE: web mode requires BOTH `docs/plans/visual-dna.md` AND `docs/plans/visual-design-spec.md` AND `docs/plans/component-manifest.md` to exist before Phase 4 starts. If any is missing, return to Phase 3.
+DESIGN.md lint runs at this step via `hooks/design-md-lint`. Broken-refs is a hard fail and routes back to Step 3.4 with the broken ref as the focused finding. Warnings (missing-primary, contrast-ratio WCAG AA, orphaned-tokens, missing-typography, section-order) are logged to `build-log.md` and feed the Phase 3.7 a11y review's contrast escalation rules but do NOT block Phase 4.
+Phase 4 HARD-GATE: web mode requires `DESIGN.md` (Pass 1 + Pass 2 complete, lint broken-refs == 0) AND `docs/plans/component-manifest.md` AND `docs/plans/page-specs/` (at least one file) to exist before Phase 4 starts. If any is missing or DESIGN.md fails the broken-refs lint, return to Phase 3.
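As a rough sketch, the revised gate could be checked like this. The function `phase4_gate` is hypothetical. Because the output format of `hooks/design-md-lint` is not shown here, the pass-completion flag and the broken-refs count are taken as plain parameters rather than parsed from anything.

```python
from pathlib import Path


def phase4_gate(project_root, design_passes_complete, broken_refs):
    """Return a list of gate failures; an empty list means Phase 4 may start."""
    root = Path(project_root)
    failures = []
    if not (root / "DESIGN.md").is_file():
        failures.append("DESIGN.md missing")
    elif not design_passes_complete:
        failures.append("DESIGN.md Pass 1 + Pass 2 incomplete")
    elif broken_refs != 0:
        failures.append(f"DESIGN.md lint: {broken_refs} broken refs")
    if not (root / "docs/plans/component-manifest.md").is_file():
        failures.append("component-manifest.md missing")
    page_specs = root / "docs/plans/page-specs"
    # The gate requires at least one file in the page-specs directory.
    if not page_specs.is_dir() or not any(p.is_file() for p in page_specs.iterdir()):
        failures.append("page-specs/ has no files")
    return failures
```

Any non-empty result corresponds to "return to Phase 3" in the gate text.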
 ## Phase 4 — Build (web branch)
@@ -176,11 +273,11 @@ Step 4.0 is three sequential dispatches: project scaffolding, design system setu
 #### 4.0.a — Project scaffolding
-Call the Agent tool — description: "Project scaffolding" — subagent_type: `engineering-rapid-prototyper` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 4] [COMPLEXITY: M] Set up the project from the architecture. Read `docs/plans/architecture.md` via your Read tool before starting. Create directory structure, dependencies, build tooling, linting config, test framework with one passing test, .gitignore, .env.example. Read `docs/plans/visual-dna.md` Scope axis and only install the component libraries the DNA needs — never ship Three.js for an internal admin panel. Commit: 'feat: initial scaffolding'."
+Call the Agent tool — description: "Project scaffolding" — subagent_type: `engineering-rapid-prototyper` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 4] [COMPLEXITY: M] Set up the project from the architecture. Read `docs/plans/architecture.md` via your Read tool before starting. Create directory structure, dependencies, build tooling, linting config, test framework with one passing test, .gitignore, .env.example. Read `DESIGN.md` Scope axis and only install the component libraries the DNA needs — never ship Three.js for an internal admin panel. Commit: 'feat: initial scaffolding'."
 #### 4.0.b — Design system setup
-Call the Agent tool — description: "Design system setup" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 4] Implement the design system from the Visual Design Spec. Read `docs/plans/visual-design-spec.md` via your Read tool before starting. Create CSS tokens matching the spec's color system, typography scale, spacing system, shadow/elevation tokens, and base layout components. The living style guide from Phase 3 is the reference implementation — components must match. Commit: 'feat: design system'."
+Call the Agent tool — description: "Design system setup" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 4] Implement the design system from the Visual Design Spec. Read `DESIGN.md` via your Read tool before starting. Create CSS tokens matching the spec's color system, typography scale, spacing system, shadow/elevation tokens, and base layout components. The living style guide from Phase 3 is the reference implementation — components must match. Commit: 'feat: design system'."
 #### 4.0.c — Acceptance test scaffolding
@@ -188,27 +285,55 @@ Call the Agent tool — description: "Scaffold acceptance tests" — subagent_ty
 ## Phase 4 — Build per-task flow (web branch)
-These are the web-specific prompt templates for the per-task flow inside Phase 4 Step 4.1+. The orchestrator-side machinery (wave-based parallel dispatch by DAG, Briefing Officer, Senior Dev cleanup, code review pair, Metric Loop, Verify Service) lives in `commands/build.md` Phase 4. This section only overrides the implementer dispatch and UI-specific verification prompts.
+These are the web-specific prompt templates for the per-task flow inside Phase 4 Step 4.1+. The orchestrator-side machinery (**three-tier: Product Owner → Briefing Officer → Execution Agents**, Senior Dev cleanup, code review pair, Metric Loop, Verify Service) lives in `commands/build.md` Phase 4. This section only overrides the implementer dispatch and UI-specific verification prompts.
-### Wave dispatch (topological, dependency-bounded)
+### Wave dispatch (feature-grained, from feature-delegation-plan.json)
-Build the DAG from the `Dependencies:` field on each row in `docs/plans/sprint-tasks.md`. A wave is the set of all not-yet-dispatched tasks whose declared dependencies are ALL complete. Dispatch every task in a wave as parallel Agent tool calls in ONE message, wait for the full wave to return, write back any `deviation_row` payloads via the orchestrator-scribe (single-writer pattern per `commands/build.md` §Decision log scribe), then compute the next wave. Repeat until the graph is drained.
+The Product Owner (Step 4.1) groups features into waves and writes `docs/plans/feature-delegation-plan.json`. The orchestrator reads that plan — not sprint-tasks.md Dependencies — to determine wave membership. Each wave dispatches one Briefing Officer per feature in parallel. Within a feature, tasks run in DAG-parallel batches (topological order from the `Dependencies:` field in sprint-tasks.md — independent sibling tasks run in parallel, yielding ~30-50% wall-clock saving).
-No magic parallelism cap — the dependency graph is the limit. A task that declares no dependencies runs in wave 1 alongside every other root. A task that declares `Dependencies: T1, T2` runs in whichever wave first satisfies both.
+No magic parallelism cap — the dependency graph is the limit within a feature. A task that declares no dependencies runs in the first intra-feature batch alongside every other root. A task that declares `Dependencies: T1, T2` runs in whichever batch first satisfies both.
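The intra-feature batching rule can be illustrated with a small layered topological sort. This is a sketch under assumptions: `dag_batches` is a hypothetical name, and the input is taken to be an already-parsed mapping from task id to its dependency list (how the `Dependencies:` field is parsed out of sprint-tasks.md is not shown here).

```python
def dag_batches(dependencies):
    """Group task ids into batches; each batch depends only on earlier batches.

    `dependencies` maps a task id to the list of task ids it depends on.
    """
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    done = set()
    batches = []
    while remaining:
        # A task is ready once every declared dependency has completed.
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"cycle or unknown dependency among: {sorted(remaining)}")
        batches.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return batches
```

With `{"T1": [], "T2": [], "T3": ["T1", "T2"]}`, both roots land in the first batch and `T3` runs in the first batch that satisfies both of its dependencies.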
 ### Step 4.1+ — Task execution overrides (web)
 #### Implementer dispatch (web)
-Call the Agent tool — description: "[task name]" — subagent_type: `[pick per task type]` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 4] TASK: [task description + acceptance criteria]. CONTEXT MAP from Briefing Officer: [paste Briefing Officer output]. Use the Read tool to pull refs on demand — design-doc.md, architecture.md, visual-design-spec.md, component-manifest.md — do NOT expect full pasted content. For UI tasks: the living style guide at /design-system shows every component's exact styling and states — match it, and import from the manifest-named library variants (Phase 4 HARD-GATE — do not write components from scratch when the manifest names one). Implement fully with real code and tests. Commit: 'feat: [task]'. Report what you built, files changed, and test results."
+The Briefing Officer's feature brief specifies the agent type (`subagent_type`) for each task — the orchestrator reads it from the brief rather than deciding itself.
+Call the Agent tool — description: "[task name]" — subagent_type: `[from BO brief]` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 4] [COMPLEXITY: S/M/L from sprint-tasks.md].
+TASK: [task description from BO brief]
+FEATURE CONTEXT:
+[product_context from BO brief — persona constraints, business rules, key error scenarios]
+PAGE LAYOUT:
+[relevant wireframe section from page-spec, pasted from BO brief. Omit for backend-only tasks.]
+COMPONENTS:
+[component picks from BO brief — name, variant, which slot. HARD-GATE: import from manifest, do NOT write from scratch.]
-Pick the right developer framing AND the matching `subagent_type`:
-- Frontend / UI tasks → `engineering-frontend-developer`
-- Backend / API / data-layer tasks → `engineering-backend-architect`
-- AI / ML / model-integration tasks → `engineering-ai-engineer`
-- Generalist / refactor / cross-cutting tasks → `engineering-senior-developer`
+API CONTRACT:
+[endpoint shape from BO brief — route, method, request/response]
-Set `[COMPLEXITY: S/M/L]` based on the task's Size from sprint-tasks.md.
+ERROR STATES:
+[specific failure modes from BO brief — trigger, user message, recovery]
+BUSINESS RULES:
+[concrete rules from BO brief — values, not 'configurable']
+SKILLS ASSIGNED: [skill list from BO brief]
+ACCEPTANCE: [criteria from BO brief]
+## Prior Learnings
+[paste contents of docs/plans/.active-learnings.md if it exists]
+## Deviation Reporting
+If your implementation deviates from the planned architecture, return a deviation_row object per protocols/decision-log.md. If no deviation, return deviation_row: null. Do NOT write decisions.jsonl directly.
+For UI tasks: the living style guide at /design-system shows every component's exact styling and states — match it. Import from the manifest-named library variants (Phase 4 HARD-GATE).
+Implement fully with real code and tests. Commit: 'feat: [task]'. Report what you built, files changed, and test results."
 #### Metric Loop (web behavioral verification)
@@ -222,19 +347,19 @@ Uses agent-browser against localhost to open the app, execute the task's behavio
 ## Phase 5 — Audit (web branch)
-Phase 5 in the web branch contains the 5-agent audit team, eval harness, hardening metric loop, 3-iteration E2E testing, autonomous dogfooding, and fake-data detector. The orchestrator-side machinery (TEAM dispatch, Feedback Synthesizer, evidence writes) follows `commands/build.md` Phase 5. Reality Check and LRR Aggregation moved to Phase 6 — do NOT run them here.
+Phase 5 in the web branch is split into three layers — Track A (engineering envelope: 5 parallel auditors), Track B (product reality: parallel per-feature audit driven by graph queries), and Cross-cutting (3-iteration Playwright E2E, autonomous agent-browser dogfood, fake-data detector). All findings route through the Feedback Synthesizer (Step 5.4) and Fix loop (Step 5.5). The orchestrator-side machinery (Track-A team dispatch, Track-B fan-out, synthesizer, evidence writes, fix loop) follows `commands/build.md` Phase 5 — this file carries web-branch-specific elaboration only. Reality Check and LRR Aggregation are Phase 6, not here.
-### Step 5.1 — Initial Audit (6 agents in parallel, ONE message)
+### Step 5.1 — Track A: Engineering Reality (5 agents in parallel, ONE message)
-Read the NFRs from `docs/plans/quality-targets.json` (and `docs/plans/sprint-tasks.md` NFR section if present). Pass the relevant NFR thresholds to each audit agent so they have concrete targets, not generic checks. The sixth auditor is the Brand Guardian drift check — it runs alongside the technical auditors to catch DNA drift before the Phase 6 LRR Brand Guardian chapter renders its verdict.
+Read the NFRs from `docs/plans/quality-targets.json` (and `docs/plans/sprint-tasks.md` NFR section if present). Pass the relevant NFR thresholds to each audit agent so they have concrete targets, not generic checks. The fifth auditor is the Brand Guardian drift check — it runs alongside the technical auditors to catch DNA drift before the Phase 6 LRR Brand Guardian chapter renders its verdict. Per-feature UX quality (loading states, empty states, error states, mobile responsiveness, visual consistency) is now covered feature-by-feature in Step 5.2 Track B — DO NOT add a generic UX-quality dispatch back here.
-Call the Agent tool 6 times in one message:
+Call the Agent tool 5 times in one message:
 1. Description: "API testing" — subagent_type: `testing-api-tester` — Prompt: "[CONTEXT header above — phase: 5] Comprehensive API validation: all endpoints, edge cases, error responses, auth flows. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for performance and reliability thresholds. Report findings with counts."
 2. Description: "Performance audit" — subagent_type: `testing-performance-benchmarker` — Prompt: "[CONTEXT header above — phase: 5] Measure response times, identify bottlenecks, flag performance issues. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for performance thresholds. Report benchmarks AGAINST these targets.
-**Bundle budget per Scope axis** (read `docs/plans/visual-dna.md` Scope field):
+**Bundle budget per Scope axis** (read `DESIGN.md` Scope field):
 - Marketing:     500KB gzipped (excluding images), LCP <= 2.5s
 - Product:       300KB gzipped, LCP <= 1.8s
 - Dashboard:     400KB gzipped, LCP <= 2.0s
@@ -246,44 +371,67 @@ Exceeding the budget by >25% auto-blocks the Phase 6 LRR SRE chapter. Budget vio
 4. Description: "Security audit" — subagent_type: `engineering-security-engineer` — Prompt: "[CONTEXT header above — phase: 5] Security review: auth, input validation, data exposure, dependency vulnerabilities. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for security thresholds. Report findings with severity."
-5. Description: "UX quality audit" — subagent_type: `design-ux-researcher` — Prompt: "[CONTEXT header above — phase: 5] UX quality review of every user-facing page. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for accessibility and UX thresholds. First, screenshot the living style guide at /design-system as your reference for how components should look. Then review every product page and check: loading states (every async action must show a loading indicator), error states (every form and API call must show user-friendly error feedback), empty states (every list/table must handle zero items gracefully), mobile responsiveness (test at 375px viewport — touch targets >= 44px, no horizontal scroll, readable text), form validation (inline feedback, not just alert()), transition smoothness (no layout shifts, no janky animations), visual consistency (compare each page's components against the style guide — buttons, inputs, cards, colors, spacing should match). Report issues with page, severity, and screenshot."
+5. Description: "Brand Guardian drift check" — subagent_type: `design-brand-guardian` — Prompt: "[CONTEXT header above — phase: 5] You are the Phase 5 drift check (proposed state §5 re-invite). Read `DESIGN.md` (the DNA card locked at Phase 3.0) + the actually-built pages via Playwright screenshots under `docs/plans/evidence/brand-drift/` (write production screenshots there as PNG/JPG files, one per page audited, named `<screen-id>.png`). Score whether Phase 4 implementers stayed true to the DNA or drifted away from it. Specifically check each of the 6 DNA axes (Scope / Density / Character / Material / Motion / Type) against what the built product actually renders. Report drift count and specific elements (file:line references). Save findings to `docs/plans/evidence/brand-drift.md`. This is a drift check only — the Phase 6 LRR Brand Guardian chapter does the verdict. You do NOT issue a pass/fail here, only surface findings for the LRR chapter to read."
+#### Step 5.1.idx — Brand drift screenshots graph index
-6. Description: "Brand Guardian drift check" — subagent_type: `design-brand-guardian` — Prompt: "[CONTEXT header above — phase: 5] You are the Phase 5 drift check (proposed state §5 re-invite). Read `docs/plans/visual-dna.md` (the DNA card locked at Phase 3.0) + the actually-built pages via Playwright screenshots under `docs/plans/evidence/`. Score whether Phase 4 implementers stayed true to the DNA or drifted away from it. Specifically check each of the 6 DNA axes (Scope / Density / Character / Material / Motion / Type) against what the built product actually renders. Report drift count and specific elements (file:line references). Save findings to `docs/plans/evidence/brand-drift.md`. This is a drift check only — the Phase 6 LRR Brand Guardian chapter does the verdict. You do NOT issue a pass/fail here, only surface findings for the LRR chapter to read."
+After `design-brand-guardian` returns and `docs/plans/evidence/brand-drift/` is populated with production screenshots, index the directory into the build graph as Slice 5 brand-drift fragments. Best-effort, the LRR Brand chapter falls back to direct file reads on failure.
-### Step 5.2 — Eval Harness
+Run via the Bash tool:
-Run the Eval Harness Protocol (`protocols/eval-harness.md`). Define 8-15 concrete, executable eval cases from the audit findings and architecture doc. For UI flows, eval cases should use agent-browser: "agent-browser open /dashboard -> agent-browser click @submit -> agent-browser wait --text Success -> expect text contains confirmation ID". Run the eval agent. Record baseline pass rate. CRITICAL and HIGH failures feed into the metric loop in Step 5.3 as specific issues to fix.
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/evidence/brand-drift/`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
-### Step 5.3 — Metric Loop: Hardening Quality
+### Step 5.2 — Track B: Product Reality (parallel per-feature, ONE message)
-Per `protocols/metric-loop.md` Step 0.5, extract audit findings (from Step 5.1 and Step 5.2 eval harness) into the Scoring Criteria Checklist via a **one-shot extractor dispatch** — single agent call reads the audit reports and outputs a prioritized findings checklist with severity, description, and file refs. Persist to `active_metric_loop.scoring_criteria_checklist` in `.build-state.json`. Critic receives the checklist + fresh measurement results each iteration. Do NOT re-inject full audit reports per iteration.
+Track B audits the built app against `product-spec.md` on a per-feature basis. The orchestrator-side dispatch shape (feature enumeration via the graph, zero-feature gate, parallel `product-reality-auditor` fan-out, post-dispatch evidence verification) is canonically described in `commands/build.md` Step 5.2 — follow that for orchestration. This section adds web-branch-specific elaboration.
-Run the Metric Loop Protocol on the full codebase using the checklist as scoring input. Define a composite metric based on what this project needs. Max 4 iterations.
+**What the auditor does** (per-feature, in parallel): synthesizes agent-browser scripts from the graph slice (states, transitions, business rules, happy path, persona constraints, page-spec wiring, manifest coverage), executes them against the running web app, captures screenshots, and writes structured evidence. The auditor's contract is in `agents/product-reality-auditor.md`. The seven check classes (a–g) and the routing table live there — do not paraphrase them here.
-When fixing, dispatch to the RIGHT specialist. Security → security agent. Accessibility → frontend agent. Don't send everything to one agent.
+**Web-branch specifics:**
+- The running app is at `http://localhost:[port]` (orchestrator must have the dev server running before Step 5.2 — same as for E2E/dogfood at Step 5.3).
+- agent-browser is the primary execution surface; Playwright loaded via the Skill tool is the fallback (one retry total).
+- Screenshots and per-case evidence land under `docs/plans/evidence/product-reality/{feature_id}/screenshots/`. Each case_id maps 1:1 to a PNG file (or `screenshot: null` for non-visual checks like manifest-slot-empty).
+- The four evidence files per feature (`tests-generated.md`, `results.json`, `findings.json`, `coverage.json`) are written by the auditor; the orchestrator verifies their presence + JSON parseability per `commands/build.md` Step 5.2 post-dispatch verification.
+- Failure modes (graph queries fail, graph layer absent, agent-browser unavailable, dev server not running, feature has no screens) are owned by the auditor — see `agents/product-reality-auditor.md` §Failure Modes. This file does not duplicate them.
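The presence and parseability check on the four evidence files might look like the sketch below. It assumes the files sit directly in the per-feature directory (`docs/plans/evidence/product-reality/{feature_id}/`), which this section implies but does not state. The function name is hypothetical, and the canonical verification is the one in `commands/build.md` Step 5.2.

```python
import json
from pathlib import Path

EVIDENCE_FILES = ("tests-generated.md", "results.json", "findings.json", "coverage.json")


def verify_feature_evidence(evidence_root, feature_id):
    """Return a list of problems with one feature's evidence directory."""
    feature_dir = Path(evidence_root) / feature_id
    problems = []
    for name in EVIDENCE_FILES:
        path = feature_dir / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        # Only the JSON files are parsed; the Markdown file just has to exist.
        if path.suffix == ".json":
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                problems.append(f"{name}: invalid JSON ({exc.msg})")
    return problems
```

An empty list means the feature's evidence passed this check.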
-### Step 5.3b — Eval Re-run
+**Failure routing:** Track B auditor failures route through the existing fix-loop spec-gap path (`target_phase: 1, target_task_or_step: "1.6"` to `product-spec-writer`) — see `commands/build.md` Step 5.2 post-dispatch verification for the escalation flow.
-Re-run the Eval Harness after the metric loop exits. All CRITICAL eval cases must now pass. If any CRITICAL case still fails, include it as evidence for the Phase 6 Reality Check sweep.
+#### Step 5.2.idx — Track B evidence graph index
-### Step 5.4 — E2E Testing (3 mandatory iterations)
+After all per-feature `product-reality-auditor` dispatches return and `docs/plans/evidence/product-reality/*/` is populated, index the directory into the build graph.
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/evidence/product-reality/`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
+### Step 5.3 — Cross-cutting (3 parallel, ONE message)
+Three checks run in parallel as a cross-cutting layer: 3-iteration Playwright E2E for multi-feature User Journeys, autonomous agent-browser dogfood for emergent issues, and the fake-data detector. The orchestrator fires THREE Agent dispatches in one message — the E2E one runs all 3 iterations internally (see Step 5.3a "Where this runs"). Dispatch shape canonicalized in `commands/build.md` Step 5.3.
+#### Step 5.3a — E2E Testing (3 mandatory iterations)
+**Where this runs:** The orchestrator fires ONE Agent dispatch (description: "E2E runner") at Step 5.3 alongside dogfood and fake-data — three parallel agents in one message. The three iterations below run INSIDE that single E2E runner agent, sequentially. The runner agent reads this section as its instruction body and internally drives iteration 1 → 2 → 3 without coming back to the orchestrator between iterations. Do NOT misread "3 mandatory iterations" as three separate orchestrator-side dispatches.
 HARD-GATE: ALL 3 ITERATIONS ARE MANDATORY. Do NOT stop after iteration 1 even if all tests pass. The purpose of 3 runs is to catch flaky tests, timing-dependent failures, and race conditions that only surface on repeated execution. Skip this step ONLY if the project has no user-facing frontend.
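To illustrate why repeated execution matters, the sketch below classifies tests from per-run outcomes: a test that passes in some runs and fails in others is flagged as flaky. The result format (one dict of test name to pass/fail per iteration) and the name `classify_runs` are assumptions made for illustration; the runner's actual report format is not specified in this section.

```python
def classify_runs(iterations):
    """Classify each test as 'pass', 'fail', or 'flaky' across repeated runs.

    `iterations` is a list of dicts mapping test name -> True (passed) / False.
    """
    names = set().union(*iterations) if iterations else set()
    verdicts = {}
    for name in sorted(names):
        outcomes = [run.get(name) for run in iterations]
        if all(o is True for o in outcomes):
            verdicts[name] = "pass"
        elif all(o is False for o in outcomes):
            verdicts[name] = "fail"
        else:
            # Mixed results, or the test is missing from at least one run.
            verdicts[name] = "flaky"
    return verdicts
```

A test that passes in iteration 1 alone would look healthy; only the second and third runs can reveal the mixed outcome.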
-Generate and execute end-to-end tests using Playwright against the running application. Tests cover the **User Journeys** defined in `docs/plans/sprint-tasks.md` (Step 0 of the Planning Protocol). Each journey = one E2E test file.
+**Scope (POST Track B):** E2E covers **multi-feature User Journeys ONLY** — login → browse → buy, signup → onboarding → first-action, etc. Single-feature happy paths are covered by Track B per-feature auditors at Step 5.2 — DO NOT duplicate. The User Journey list lives in `docs/plans/sprint-tasks.md` (Step 0 of the Planning Protocol). Each cross-feature journey = one E2E test file.
 **Iteration 1 — Generate & Run:**
 Call the Agent tool — description: "E2E test generation" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt:
-"[CONTEXT header above — phase: 5] [COMPLEXITY: L] Generate and run end-to-end Playwright tests for this application.
+"[CONTEXT header above — phase: 5] [COMPLEXITY: L] Generate and run end-to-end Playwright tests for cross-feature User Journeys ONLY (single-feature happy paths are covered by Track B at Step 5.2 — do NOT duplicate them here).
 INPUTS:
 Read these files via your Read tool before starting — do NOT expect pasted content:
 - User Journeys: `docs/plans/sprint-tasks.md` (User Journeys section — each journey becomes one E2E test)
 - Architecture (API contracts): `docs/plans/architecture.md`
 - NFRs: `docs/plans/sprint-tasks.md` (NFR section — use performance thresholds as test assertions)
-- Visual Design Spec (component selectors): `docs/plans/visual-design-spec.md`
+- Visual Design Spec (component selectors): `DESIGN.md`
 REQUIREMENTS:
 1. One E2E test per User Journey from sprint-tasks.md (each journey = one test file covering the full flow)
@@ -294,10 +442,10 @@ REQUIREMENTS:
 6. Configure multi-browser: Chromium + Firefox + WebKit
 7. Set up playwright.config.ts with: fullyParallel, retries: 0 (we handle retries ourselves), screenshot: 'only-on-failure', video: 'retain-on-failure', trace: 'on-first-retry'
 8. Run all tests. Report: total, passed, failed, with failure details and screenshot paths.
-9. Commit: 'test: e2e test suite for critical user journeys'
+9. Commit: 'test: e2e test suite for cross-feature user journeys'
 Test priority:
-- CRITICAL: Auth, core feature happy path, data submission, payment/transaction flows
+- CRITICAL: Auth, core happy path, data submission, payment/transaction flows
 - HIGH: Search, filtering, navigation, error states
 - MEDIUM: Responsive layout, animations, edge cases"
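Requirements 6 and 7 might translate into a config along the following lines. This is a sketch, not the generated file: it is written as a plain object so it stands alone, whereas a real `playwright.config.ts` would make it the default export, typically wrapped in `defineConfig` from `@playwright/test`. The name `e2eConfig` and the `fullyParallel: true` value are assumptions.

```typescript
// Sketch of the settings listed in requirements 6 and 7.
const e2eConfig = {
  fullyParallel: true,
  retries: 0, // retries are handled by the three-iteration protocol instead
  use: {
    screenshot: "only-on-failure",
    video: "retain-on-failure",
    // With retries: 0 no retry ever runs, so "on-first-retry" records no
    // traces. "retain-on-failure" is one alternative if traces are wanted.
    trace: "on-first-retry",
  },
  // One project per browser engine named in requirement 6.
  projects: [
    { name: "chromium", use: { browserName: "chromium" } },
    { name: "firefox", use: { browserName: "firefox" } },
    { name: "webkit", use: { browserName: "webkit" } },
  ],
};
```

Note that `screenshot`, `video`, and `trace` sit under `use`, while `fullyParallel` and `retries` are top-level options.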
@@ -321,17 +469,31 @@ Call the Agent tool — description: "E2E stability run" — subagent_type: `eng
 Record final results. Include in the Phase 6.0 Reality Check evidence sweep (see `commands/build.md` Phase 6 Step 6.0).
-### Step 5.5 — Autonomous Dogfooding
+#### Step 5.3b — Autonomous Dogfooding
-Run the agent-browser dogfood skill against the running app. Unlike the per-task smoke tests (which verify specific acceptance criteria), dogfooding is **exploratory** — it autonomously navigates every reachable page, clicks buttons, fills forms, checks console errors, and finds issues we didn't think to test.
+Run the agent-browser dogfood skill against the running app. Unlike Track B (which checks built features against the spec) and unlike per-task smoke tests (which verify specific acceptance criteria), dogfooding is **exploratory** — it autonomously navigates every reachable page, clicks buttons, fills forms, checks console errors, and finds issues we didn't think to test. Spec-blind by design — that's the point.
 Start the dev server if not running. Then invoke the dogfood skill:
-Call the Agent tool — description: "Dogfood the app" — subagent_type: `testing-evidence-collector` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 5] Run the agent-browser dogfood skill against the running app at http://localhost:[port]. Explore every reachable page. Click every button. Fill every form. Check console for errors. Report a structured list of issues with severity ratings (critical/high/medium/low), screenshots, and repro steps. If dogfood skill is not available, use agent-browser manually: snapshot each page, click all interactive elements, check errors and network requests. Also evaluate UX quality: missing loading states, poor error messages, broken mobile layouts (resize to 375px), visual inconsistencies, missing empty states, form validation gaps. Report UX issues separately from functional issues."
+Call the Agent tool — description: "Dogfood the app" — subagent_type: `testing-evidence-collector` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 5] Run the agent-browser dogfood skill against the running app at http://localhost:[port]. Explore every reachable page. Click every button. Fill every form. Check console for errors. Report a structured list of issues with severity ratings (critical/high/medium/low), screenshots, and repro steps. Save screenshots under `docs/plans/evidence/dogfood/` (one PNG/JPG per finding, named after the finding_id), and emit `docs/plans/evidence/dogfood/findings.json` (machine-readable mirror of findings.md — schema: `[{finding_id, severity, description, screenshot_path, affected_screen_id}, ...]` per agents/testing-evidence-collector.md \"Dogfood Evidence Outputs\") so the Slice 5 indexer can wire `screenshot_evidences_finding` edges.
+If dogfood skill is not available, use agent-browser manually: snapshot each page, click all interactive elements, check errors and network requests.
+Focus on emergent issues (console errors, broken layouts at 320/375/768px, failed network requests, broken navigation links) — do NOT re-audit per-feature spec coverage; that's Track B's job at Step 5.2."
 Classification and fix-routing of Dogfood findings is handled by the Feedback Synthesizer at `commands/build.md` Phase 5 Step 5.4 — do NOT self-classify or spawn fix agents from this step.
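The `findings.json` schema quoted in the prompt can be checked with a small type guard before anything consumes it. The sketch below assumes every field is a string and that `severity` is one of the four ratings named in the prompt. Neither assumption is confirmed by `agents/testing-evidence-collector.md` here, and `isDogfoodFinding` and `parseFindings` are hypothetical names.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// Field names come from the schema in the dispatch prompt; types are assumed.
interface DogfoodFinding {
  finding_id: string;
  severity: Severity;
  description: string;
  screenshot_path: string;
  affected_screen_id: string;
}

const SEVERITIES: string[] = ["critical", "high", "medium", "low"];
const STRING_FIELDS: string[] = [
  "finding_id",
  "description",
  "screenshot_path",
  "affected_screen_id",
];

function isDogfoodFinding(value: unknown): value is DogfoodFinding {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const severity = record["severity"];
  return (
    typeof severity === "string" &&
    SEVERITIES.indexOf(severity) !== -1 &&
    STRING_FIELDS.every((field) => typeof record[field] === "string")
  );
}

// Parses the file contents and rejects anything that is not a findings array.
function parseFindings(json: string): DogfoodFinding[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("findings.json must be a JSON array");
  }
  parsed.forEach((item, index) => {
    if (!isDogfoodFinding(item)) {
      throw new Error(`invalid finding at index ${index}`);
    }
  });
  return parsed as DogfoodFinding[];
}
```

Rejecting malformed entries early gives the indexer a clear error instead of a silently missing `screenshot_evidences_finding` edge.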
-### Step 5.6 — Fake Data Detector
+##### Step 5.3b.idx — Dogfood evidence graph index
+After `testing-evidence-collector` returns and `docs/plans/evidence/dogfood/` is populated with finding screenshots, index the directory into the build graph as Slice 5 dogfood fragments. This step is best-effort: if indexing fails, the feedback synthesizer falls back to reading the files directly. The indexer reads BOTH the screenshots in `evidence/dogfood/` AND the `findings.json` side-channel to wire `screenshot_evidences_finding` edges.
+Run via the Bash tool:
+- Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/evidence/dogfood/`
+- On exit 0: log success to `docs/plans/build-log.md` and continue.
+- On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
+#### Step 5.3c — Fake Data Detector
 Call the Agent tool — description: "Fake data audit" — subagent_type: `silent-failure-hunter` — mode: "bypassPermissions" — prompt: "[CONTEXT header above — phase: 5] Run the Fake Data Detector Protocol (protocols/fake-data-detector.md). Check for mock/hardcoded data in production paths. Static analysis: grep for Math.random() business data, hardcoded API responses, setTimeout faking async, placeholder text. Dynamic analysis: inspect HAR files from docs/plans/evidence/ for missing real API calls, static responses, absent WebSocket traffic. Report findings with file:line references and severity."
@@ -342,6 +504,14 @@ Call the Agent tool — description: "Fake data audit" — subagent_type: `silen
 Remaining findings feed into the Phase 6.0 Reality Check evidence sweep (see `commands/build.md` Phase 6 Step 6.0).
+### Step 5.4 — Feedback Synthesizer
+The orchestrator-side dispatch and prompt body live in `commands/build.md` Step 5.4. The synthesizer ingests both Track B `findings.json` (one per feature) and Dogfood `findings.md`/`findings.json`, validates target_phase routing against the graph, and emits `docs/plans/evidence/dogfood/classified-findings.json` with a `source: "dogfood" | "product-reality"` discriminator. Web-branch note: for `project_type=web` this is always the path; for iOS see `protocols/ios-phase-branches.md`.
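The `source` discriminator lets a consumer separate the two input streams again after they are merged. A minimal sketch follows, assuming each classified entry carries a `finding_id` alongside `source`. The `ClassifiedFinding` shape and `splitBySource` are hypothetical, and the real file may hold more fields.

```typescript
type FindingSource = "dogfood" | "product-reality";

interface ClassifiedFinding {
  source: FindingSource;
  finding_id: string; // assumed to be carried over from the input findings
}

// Groups classified findings by the stream they originally came from.
function splitBySource(
  findings: ClassifiedFinding[]
): Record<FindingSource, ClassifiedFinding[]> {
  const groups: Record<FindingSource, ClassifiedFinding[]> = {
    "dogfood": [],
    "product-reality": [],
  };
  for (const finding of findings) {
    groups[finding.source].push(finding);
  }
  return groups;
}
```

Typing `source` as a two-value union means a misspelled or unexpected source is caught at compile time.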
+### Step 5.5 — Fix loop
+The orchestrator-side fix-loop dispatch lives in `commands/build.md` Step 5.5. Max 2 fix cycles. Routing template at the bottom of `commands/build.md` ("Re-entry dispatch template"). Findings with `target_phase: 1, target_task_or_step: "1.6"` route back to `product-spec-writer`, which re-triggers Track B for the affected feature on the next loop.
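The cycle cap and the one routing rule stated above can be sketched as follows. Only the `1` / `"1.6"` route to `product-spec-writer` comes from the text; every other route lives in the re-entry dispatch template in `commands/build.md` and is deliberately left unresolved. `routeFinding` and `canRunAnotherCycle` are hypothetical names.

```typescript
const MAX_FIX_CYCLES = 2; // "Max 2 fix cycles"

interface RoutedFinding {
  target_phase: number;
  target_task_or_step: string;
}

// Encodes only the route named in the text; null means "consult build.md".
function routeFinding(finding: RoutedFinding): string | null {
  if (finding.target_phase === 1 && finding.target_task_or_step === "1.6") {
    return "product-spec-writer";
  }
  return null;
}

// True while the loop has not yet used up its allowed fix cycles.
function canRunAnotherCycle(cyclesCompleted: number): boolean {
  return cyclesCompleted < MAX_FIX_CYCLES;
}
```

Returning `null` for unknown routes avoids guessing at dispatch targets that this section does not define.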
 ## Phase 7 — Ship (web branch)
 ### Step 7.1 — Documentation (web)

package/skills/ios/ios-bootstrap/SKILL.md CHANGED

@@ -41,7 +41,7 @@ If missing or older: fail with `"Install Xcode 26.3 from the Mac App Store, then
 ### 3. Create project directory structure
 From project root, create:
-- `docs/plans/` — holds `.build-state.md`, `ios-design-board.md`, task lists
+- `docs/plans/` — holds `.build-state.md`, task lists (note: `DESIGN.md` lives at the repo root, not under `docs/plans/`)
 - `maestro/` — canonical name for Maestro YAML flows
 ### 4. User-assisted Xcode New Project dialog