buildanything 2.0.0 → 2.1.1

Files changed (115)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +9 -1
  3. package/README.md +57 -61
  4. package/agents/a11y-architect.md +2 -0
  5. package/agents/briefing-officer.md +172 -0
  6. package/agents/business-model.md +14 -12
  7. package/agents/code-architect.md +6 -1
  8. package/agents/code-reviewer.md +3 -2
  9. package/agents/code-simplifier.md +12 -4
  10. package/agents/design-brand-guardian.md +19 -0
  11. package/agents/design-critic.md +16 -11
  12. package/agents/design-inclusive-visuals-specialist.md +2 -0
  13. package/agents/design-ui-designer.md +17 -0
  14. package/agents/design-ux-architect.md +15 -0
  15. package/agents/design-ux-researcher.md +102 -7
  16. package/agents/engineering-ai-engineer.md +2 -0
  17. package/agents/engineering-backend-architect.md +2 -0
  18. package/agents/engineering-data-engineer.md +2 -0
  19. package/agents/engineering-devops-automator.md +2 -0
  20. package/agents/engineering-frontend-developer.md +13 -0
  21. package/agents/engineering-mobile-app-builder.md +2 -0
  22. package/agents/engineering-rapid-prototyper.md +15 -2
  23. package/agents/engineering-security-engineer.md +2 -0
  24. package/agents/engineering-senior-developer.md +13 -0
  25. package/agents/engineering-sre.md +2 -0
  26. package/agents/engineering-technical-writer.md +2 -0
  27. package/agents/feature-intel.md +8 -7
  28. package/agents/ios-app-review-guardian.md +2 -0
  29. package/agents/ios-foundation-models-specialist.md +2 -0
  30. package/agents/ios-product-reality-auditor.md +292 -0
  31. package/agents/ios-storekit-specialist.md +2 -0
  32. package/agents/ios-swift-architect.md +1 -0
  33. package/agents/ios-swift-search.md +1 -0
  34. package/agents/ios-swift-ui-design.md +7 -4
  35. package/agents/marketing-app-store-optimizer.md +2 -0
  36. package/agents/planner.md +6 -1
  37. package/agents/pr-test-analyzer.md +3 -2
  38. package/agents/product-feedback-synthesizer.md +62 -0
  39. package/agents/product-owner.md +163 -0
  40. package/agents/product-reality-auditor.md +216 -0
  41. package/agents/product-spec-writer.md +176 -0
  42. package/agents/refactor-cleaner.md +9 -1
  43. package/agents/security-reviewer.md +2 -1
  44. package/agents/silent-failure-hunter.md +2 -1
  45. package/agents/swift-build-resolver.md +2 -0
  46. package/agents/swift-reviewer.md +2 -1
  47. package/agents/tech-feasibility.md +5 -3
  48. package/agents/testing-api-tester.md +2 -0
  49. package/agents/testing-evidence-collector.md +24 -0
  50. package/agents/testing-performance-benchmarker.md +2 -0
  51. package/agents/testing-reality-checker.md +2 -1
  52. package/agents/visual-research.md +7 -5
  53. package/bin/adapters/scribe-tool.ts +4 -2
  54. package/bin/adapters/write-lease-tool.ts +1 -1
  55. package/bin/buildanything-runtime.ts +20 -107
  56. package/bin/graph-index.js +24 -0
  57. package/bin/graph-index.ts +340 -0
  58. package/bin/mcp-servers/graph-mcp.js +26 -0
  59. package/bin/mcp-servers/graph-mcp.ts +481 -0
  60. package/bin/mcp-servers/orchestrator-mcp.js +26 -0
  61. package/bin/mcp-servers/orchestrator-mcp.ts +361 -0
  62. package/bin/setup.js +272 -111
  63. package/commands/build.md +371 -158
  64. package/commands/idea-sweep.md +2 -2
  65. package/commands/setup.md +15 -4
  66. package/commands/ux-review.md +3 -3
  67. package/commands/verify.md +3 -0
  68. package/docs/migration/phase-graph.yaml +573 -157
  69. package/hooks/design-md-lint +4 -0
  70. package/hooks/design-md-lint.ts +295 -0
  71. package/hooks/pre-tool-use.ts +37 -6
  72. package/hooks/record-mode-transitions.ts +63 -6
  73. package/hooks/subagent-start.ts +3 -2
  74. package/package.json +3 -1
  75. package/protocols/agent-prompt-authoring.md +165 -0
  76. package/protocols/architecture-schema.md +10 -3
  77. package/protocols/cleanup.md +4 -0
  78. package/protocols/decision-log.md +8 -4
  79. package/protocols/design-md-authoring.md +520 -0
  80. package/protocols/design-md-spec.md +362 -0
  81. package/protocols/fake-data-detector.md +1 -1
  82. package/protocols/ios-fake-data-detector.md +65 -0
  83. package/protocols/ios-phase-branches.md +112 -27
  84. package/protocols/launch-readiness.md +9 -5
  85. package/protocols/metric-loop.md +1 -1
  86. package/protocols/page-spec-schema.md +234 -0
  87. package/protocols/product-spec-schema.md +354 -0
  88. package/protocols/sprint-tasks-schema.md +53 -0
  89. package/protocols/state-schema.json +38 -3
  90. package/protocols/state-schema.md +32 -2
  91. package/protocols/verify.md +29 -1
  92. package/protocols/web-phase-branches.md +234 -64
  93. package/skills/ios/ios-bootstrap/SKILL.md +1 -1
  94. package/src/graph/ids.ts +86 -0
  95. package/src/graph/index.ts +32 -0
  96. package/src/graph/parser/architecture.ts +603 -0
  97. package/src/graph/parser/component-manifest.ts +268 -0
  98. package/src/graph/parser/decisions-jsonl.ts +407 -0
  99. package/src/graph/parser/design-md-pass2.ts +253 -0
  100. package/src/graph/parser/design-md.ts +477 -0
  101. package/src/graph/parser/page-spec.ts +496 -0
  102. package/src/graph/parser/product-spec.ts +930 -0
  103. package/src/graph/parser/screenshot.ts +342 -0
  104. package/src/graph/parser/sprint-tasks.ts +317 -0
  105. package/src/graph/storage/index.ts +1154 -0
  106. package/src/graph/types.ts +432 -0
  107. package/src/graph/util/dhash.ts +84 -0
  108. package/src/lrr/aggregator.ts +105 -10
  109. package/src/orchestrator/hooks/context-header.ts +34 -10
  110. package/src/orchestrator/hooks/token-accounting.ts +25 -14
  111. package/src/orchestrator/mcp/cycle-counter.ts +2 -1
  112. package/src/orchestrator/mcp/scribe.ts +27 -16
  113. package/src/orchestrator/mcp/write-lease.ts +30 -13
  114. package/src/orchestrator/phase4-shared-context.ts +20 -4
  115. package/protocols/visual-dna.md +0 -185
package/commands/build.md CHANGED
@@ -19,34 +19,61 @@ Exception: Brainstorming in Phase 1 Step 1.0 and Step 1.3 uses an INTERNAL Brain
19
19
  <HARD-GATE>
20
20
  SUBAGENT_TYPE REQUIRED.
21
21
 
22
- Every Agent tool call MUST include a `subagent_type` field unless the dispatch is explicitly marked INTERNAL (inline role-string). INTERNAL dispatches are listed in `docs/plans/agent-dispatch-audit.md` — they are orchestrator helpers (Brainstorm Facilitator, Research Synthesizer, Design Doc Writer, Prereq Collector, Task DAG Validator, Refs Indexer, Briefing Officer, Dogfood runner, Fake-Data Detector, PM chapter, LRR Aggregator, Completion Report, Verify scaffolding dispatcher).
22
+ Every Agent tool call MUST include a `subagent_type` field unless the dispatch is explicitly marked INTERNAL (inline role-string). INTERNAL dispatches are orchestrator helpers: Brainstorm Facilitator, Research Synthesizer, Design Doc Writer, Prereq Collector, Task DAG Validator, Refs Indexer, Briefing Officer, PM chapter, LRR Aggregator, Completion Report, Verify scaffolding dispatcher.
23
23
 
24
- Missing `subagent_type` on a non-INTERNAL dispatch is a HARD-GATE violation. The orchestrator rejects dispatches that don't name a specific agent. If you catch yourself typing `description: "..."` without a `subagent_type:` line alongside it, STOP and look up the right agent from the dispatch audit.
24
+ Missing `subagent_type` on a non-INTERNAL dispatch is a HARD-GATE violation. The orchestrator rejects dispatches that don't name a specific agent. If you catch yourself typing `description: "..."` without a `subagent_type:` line alongside it, STOP and look up the right agent from the per-phase dispatch tables further down in this file.
25
25
  </HARD-GATE>
26
26
 
27
27
  <HARD-GATE>
28
28
  ARTIFACT WRITER-OWNER RULE.
29
29
 
30
- Every shared artifact has ONE concurrent writer at any instant. The writer-owner table in `docs/plans/orchestration-proposed-state.md` §6 defines which phase writes which file. Before any file write, the orchestrator verifies the current phase is the rightful writer. Non-owning phase writes are a HARD-GATE violation. For parallel-batch phases (e.g., Phase 4), intra-phase dispatches MUST NOT race on the same file — writes either target disjoint per-dispatch filenames OR route through an orchestrator-scribe handler (see `decisions.jsonl` handling below).
30
+ Every shared artifact has ONE concurrent writer at any instant. The writer-owner table below defines which phase writes which file. Before any file write, the orchestrator verifies the current phase is the rightful writer. Non-owning phase writes are a HARD-GATE violation. For parallel-batch phases (e.g., Phase 4), intra-phase dispatches MUST NOT race on the same file — writes either target disjoint per-dispatch filenames OR route through an orchestrator-scribe handler (see `decisions.jsonl` handling below).
31
31
 
32
32
  Live downstream docs (read across Phase 3+):
33
33
  - `CLAUDE.md` — P1 writer (then auto-loaded into every subagent)
34
- - `design-doc.md` (PRD) — P1 writer
35
- - `architecture.md` — P2 writer
36
- - `sprint-tasks.md` — P2 writer
37
- - `quality-targets.json` — P2 writer
38
- - `visual-design-spec.md` P3 writer (web) / `ios-design-board.md` P3 writer (iOS)
39
- - `refs.json` — P2 writer + P3 writer (P3 extends after visual spec lands)
40
- - `decisions.jsonl` orchestrator-scribe ONLY via `scribe_decision` MCP tool (subagents return `deviation_row` objects; the orchestrator forwards each row through the MCP, which owns ID allocation and atomic append)
41
- - `learnings.jsonl` P5, P7 writers
42
- - `evidence/*.json` P5 writer (P4 contributes per-task, P6/P7 readers)
43
- - `lrr/*.json` P6 writer (1 per chapter + Aggregator)
44
- - `lrr-aggregate.json` P6 writer (Aggregator only)
34
+ - `docs/plans/design-doc.md` (PRD) — P1 writer
35
+ - `docs/plans/product-spec.md` — P1 writer (Step 1.6), product-spec-writer writer
36
+ - `docs/plans/architecture.md` — P2 writer
37
+ - `docs/plans/sprint-tasks.md` — P2 writer
38
+ - `docs/plans/quality-targets.json` P2 writer
39
+ - `docs/plans/phase-2-contracts/*.md` — P2 writer (per-architect post-debate contract files)
40
+ - `docs/plans/visual-dna-preview.md` P2 writer, design-brand-guardian writer, ios-swift-ui-design writer (directional DNA preview at Gate 2)
41
+ - `DESIGN.md` P3 writers: design-brand-guardian (Pass 1 at Step 3.0, both modes); design-ui-designer (Pass 2 at Step 3.4, web); ios-swift-ui-design (Pass 2 at Step 3.2-ios, iOS). Replaces former visual-dna.md + visual-design-spec.md pair (web) and ios-design-board.md (iOS). Repo root.
42
+ - `docs/plans/component-manifest.md` P3 writer (web, HARD-GATE import source)
43
+ - `docs/plans/design-references.md` visual-research writer (web, Step 3.1)
44
+ - `docs/plans/design-references/**` visual-research writer (web, screenshots harvested by visual-research subagents)
45
+ - `docs/plans/dna-persona-check.md` — design-ux-researcher writer (web, Step 3.2b)
46
+ - `docs/plans/ux-architecture.md` — P3 writer (web)
47
+ - `docs/plans/ux-flow-validation.md` — design-ux-researcher writer (web, Step 3.3b)
48
+ - `docs/plans/inclusive-visuals-audit.md` — P3 writer (web)
49
+ - `docs/plans/a11y-design-review.md` — P3 writer, a11y-architect writer (web, Step 3.7)
50
+ - `docs/plans/page-specs/*.md` — P3 writer, design-ux-architect writer (web, Step 3.3 — per-screen wireframes + layout specs)
51
+ - `docs/plans/refs.json` — P2 writer, P3 writer (P3 extends after visual spec lands)
52
+ - `docs/plans/decisions.jsonl` — orchestrator-scribe ONLY via `scribe_decision` MCP tool (subagents return `deviation_row` objects; the orchestrator forwards each row through the MCP, which owns ID allocation and atomic append)
53
+ - `docs/plans/learnings.jsonl` — P5 writer, P7 writer
54
+ - `docs/plans/evidence/*.json` — P5 writer (P4 contributes per-task, P6/P7 readers)
55
+ - `docs/plans/evidence/*.md` — P5 writer, design-brand-guardian writer (brand-drift findings, fake-data-audit)
56
+ - `docs/plans/evidence/**/*.json` — P4 writer, P5 writer, P6 writer (nested per-task/per-run evidence JSON)
57
+ - `docs/plans/evidence/**/*.md` — P4 writer, P5 writer (nested per-task/per-run evidence markdown)
58
+ - `docs/plans/evidence/**/*.png` — P3 writer, P4 writer, P5 writer (screenshots: Playwright, SwiftUI Preview, Maestro, design-reference)
59
+ - `docs/plans/evidence/**/*.{txt,har}` — P4 writer, P5 writer (smoke-test HAR captures, DOM snapshots)
60
+ - `docs/plans/evidence/lrr/*.json` — code-reviewer writer, security-reviewer writer, engineering-sre writer, a11y-architect writer, design-brand-guardian writer, pr-test-analyzer writer (5 chapter verdicts + 1 sub-verdict)
61
+ - `docs/plans/evidence/lrr-aggregate.json` — phase-6-aggregator writer (Aggregator only)
62
+ - `docs/plans/evidence/lrr-incomplete.json` — phase-6-aggregator writer (file-completeness checkpoint)
63
+ - `docs/plans/evidence/lrr-routing.json` — phase-6-aggregator writer (BLOCK routing via decided_by)
64
+ - `docs/plans/evidence/reality-check-manifest.json` — testing-reality-checker writer (evidence-sweep manifest)
65
+ - `docs/plans/.build-state.json` — orchestrator writer (every phase boundary)
66
+ - `docs/plans/.build-state.md` — auto-rendered-view writer (regenerated from .build-state.json on every update)
67
+ - `docs/plans/.task-outputs/[task-id].json` — P4 writer (per-task output)
68
+ - `docs/plans/build-log.md` — every-phase writer (append on transition)
69
+ - `docs/plans/.active-learnings.md` — P0 writer (top-3 cross-run learnings for Phase 4 implementer briefings)
70
+ - `docs/plans/ios-verify-report.md` — P5 writer (iOS verify twin)
71
+ - `docs/plans/ios-ux-review-report.md` — P5 writer (iOS ux-review twin)
45
72
 
46
73
  Phase-internal scaffolding (lives in `docs/plans/phase1-scratch/` after Gate 1, never read by P3+):
47
74
  - `idea-draft.md`, `feature-intel.md`, `tech-feasibility.md`, `ux-research.md`, `business-model.md`, `findings-digest.md`, `suggested-questions.md`, `user-decisions.md`, `prereqs.json`
48
75
 
49
- Phase 4 implementers never reference Phase 1 raw research files. They are SPENT after Phase 2 dispatch.
76
+ Phase 4 implementers never reference Phase 1 raw research files. They are SPENT after the Product Spec step (Step 1.6). The product spec is the LAST consumer of raw research. After Step 1.6, research insights survive in `product-spec.md`, `design-doc.md`, and `CLAUDE.md`.
50
77
  </HARD-GATE>
51
78
 
52
79
  > **Default-deny (Stage 2+):** Once Stage 2 of the SDK migration activates, any `Write|Edit` tool call targeting a path absent from this table will be denied by the `PreToolUse` hook with message `"path not in writer-owner table — please add to phase-graph.yaml or route through scribe MCP"`. This is a pre-announcement; actual hook wiring ships in Task 2.1.3.
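A minimal TypeScript sketch of the writer-owner check and the default-deny rule described in this hunk. The names `WriterRule`, `globToRegExp`, and `checkWrite` are hypothetical, the table is a four-row excerpt, and reading `**/` as "zero or more directories" is an assumption. The package's actual enforcement lives in its `PreToolUse` hook, which this excerpt does not show.

```typescript
// Hypothetical sketch: names and glob semantics are assumptions, not the package's hook API.
type WriterRule = { pattern: string; writers: string[] };
type Verdict = { allowed: boolean; reason: string };

// Four rows excerpted from the writer-owner table in build.md.
const WRITER_TABLE: WriterRule[] = [
  { pattern: "docs/plans/design-doc.md", writers: ["P1"] },
  { pattern: "docs/plans/architecture.md", writers: ["P2"] },
  { pattern: "docs/plans/page-specs/*.md", writers: ["P3", "design-ux-architect"] },
  { pattern: "docs/plans/evidence/**/*.png", writers: ["P3", "P4", "P5"] },
];

// Assumed glob semantics: "**/" spans zero or more directories, "*" stays inside one segment.
function globToRegExp(glob: string): RegExp {
  const body = glob
    .replace(/[.+?^$(){}|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*\*\//g, "\u0000") // park "**/" behind a placeholder
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, "(?:.*/)?");
  return new RegExp("^" + body + "$");
}

function checkWrite(path: string, writer: string): Verdict {
  const matches = WRITER_TABLE.filter((r) => globToRegExp(r.pattern).test(path));
  if (matches.length === 0) {
    // Default-deny: a path absent from the table is rejected outright.
    return { allowed: false, reason: "path not in writer-owner table" };
  }
  const rule = matches[0];
  if (rule.writers.indexOf(writer) === -1) {
    return { allowed: false, reason: writer + " is not a listed writer for " + rule.pattern };
  }
  return { allowed: true, reason: "ok" };
}
```

Under these assumptions, `checkWrite("docs/plans/architecture.md", "P4")` is rejected because only P2 owns that file, and any path outside the table is rejected regardless of the writer.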
@@ -54,7 +81,7 @@ Phase 4 implementers never reference Phase 1 raw research files. They are SPENT
54
81
  <HARD-GATE>
55
82
  CONTEXT HEADER — RENDER ONCE, HOIST AS STABLE PREFIX.
56
83
 
57
- Every phase uses a CONTEXT header prepended to dispatch prompts. The orchestrator MUST render this header ONCE at the start of each phase by reading `.build-state.json` (and `visual-dna.md` for web, Phase 4+) and resolving all values into concrete strings. The rendered header is then reused verbatim for every dispatch in that phase.
84
+ Every phase uses a CONTEXT header prepended to dispatch prompts. The orchestrator MUST render this header ONCE at the start of each phase by reading `.build-state.json` (and `DESIGN.md` `## Overview > ### Brand DNA` for web, Phase 4+) and resolving all values into concrete strings. The rendered header is then reused verbatim for every dispatch in that phase.
58
85
 
59
86
  DO NOT paste `{read from .build-state.json}` placeholders into dispatch prompts. DO NOT re-read state files per dispatch. The values do not change within a phase.
60
87
 
@@ -63,7 +90,7 @@ DO NOT paste `{read from .build-state.json}` placeholders into dispatch prompts.
63
90
  CONTEXT:
64
91
  project_type: <resolved value>
65
92
  phase: <current phase number>
66
- dna: <resolved from docs/plans/visual-dna.md — INCLUDE only if project_type=web AND phase >= 4>
93
+ dna: <resolved from DESIGN.md `## Overview > ### Brand DNA` (7 axis values only, ~100 tokens) — INCLUDE only if project_type=web AND phase >= 4>
67
94
  ios_features: <resolved from .build-state.json — INCLUDE only if project_type=ios>
68
95
 
69
96
  TASK:
@@ -71,7 +98,7 @@ TASK:
71
98
 
72
99
  **Rendering procedure** (run once per phase boundary):
73
100
  1. Read `docs/plans/.build-state.json`. Extract `project_type`, `ios_features`.
74
- 2. If `project_type=web` AND phase >= 4: read `docs/plans/visual-dna.md` and extract the DNA summary (first 5 lines or the `## Summary` section). Otherwise omit the `dna` field.
101
+ 2. If `project_type=web` AND phase >= 4: read `DESIGN.md` and extract the DNA summary (first 5 lines or the `## Summary` section). Otherwise omit the `dna` field.
75
102
  3. If `project_type=ios`: include `ios_features`. Otherwise omit.
76
103
  4. Substitute all values into the template above. Store the result as `rendered_context_header`.
77
104
  5. For every dispatch in this phase, prepend `rendered_context_header` — do NOT re-read or re-interpolate.
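The rendering procedure above can be sketched as a pure function that runs once per phase boundary. This is an illustrative sketch only: `renderContextHeader` and `prependHeader` are hypothetical names, the two-space indentation is a guess at the template layout, and modelling `ios_features` as a string array is an assumption about the state schema.

```typescript
// Hypothetical sketch: function names and the ios_features type are assumptions.
type BuildState = { project_type: "web" | "ios"; ios_features?: string[] };

// Render the CONTEXT header once per phase; dna is the already-extracted DNA summary.
function renderContextHeader(state: BuildState, phase: number, dna?: string): string {
  const lines = ["CONTEXT:", "  project_type: " + state.project_type, "  phase: " + phase];
  if (state.project_type === "web" && phase >= 4 && dna !== undefined) {
    lines.push("  dna: " + dna); // included only for web builds at Phase 4+
  }
  if (state.project_type === "ios") {
    lines.push("  ios_features: " + (state.ios_features || []).join(", "));
  }
  return lines.join("\n");
}

// Reuse the rendered header verbatim for every dispatch in the phase.
function prependHeader(header: string, taskPrompts: string[]): string[] {
  return taskPrompts.map((task) => header + "\n\nTASK:\n" + task);
}
```

Because the header is computed once and passed in as a string, no dispatch can re-read state or leak an unresolved placeholder into its prompt.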
@@ -111,6 +138,8 @@ Increment after each agent returns (parallel dispatch of 6 agents = +6). Reset t
111
138
 
112
139
  **Compaction checkpoint format:** At every phase boundary, check `dispatch_count` in `docs/plans/.build-state.json`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.json` and regenerate `.build-state.md` as the rendered view. Reset `dispatch_count` to 0. TodoWrite does NOT survive compaction — rebuild it from the JSON state file on resume. See `protocols/state-schema.md` for the full schema and rendering contract.
113
140
 
141
+ Phase 4 context pressure: With 20+ tasks, compact returns accumulate ~30-40K tokens in the orchestrator's context. The compaction checkpoint (dispatch_count >= 8) is the primary relief valve. If Phase 4 has more than 15 tasks, force a compaction checkpoint after every wave transition regardless of dispatch_count.
142
+
114
143
  **Cumulative-cost banner at phase boundaries:** When announcing a phase transition (e.g. "Phase N complete — proceeding to Phase N+1"), prefix the message with `[Cost so far: $X.XX • Y tokens]`. Source the values from the last-appended entry in `docs/plans/build-log.md`'s token-accounting lines (fields `cumulative_usd=...` plus the sum of `input_tokens=...` + `output_tokens=...`), written by `src/orchestrator/hooks/token-accounting.ts` (see module for exact schema). If the build-log has no token-accounting entries yet, omit the prefix rather than guessing.
115
144
 
116
145
  Input: $ARGUMENTS
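The compaction rule in this hunk (checkpoint at `dispatch_count >= 8`, plus a forced checkpoint at every wave transition when Phase 4 has more than 15 tasks) reduces to a small predicate. The sketch below is illustrative: `shouldCompact`, `afterCheckpoint`, and the `phase4_task_count` and `at_wave_transition` fields are hypothetical names, not fields confirmed by the state schema.

```typescript
// Hypothetical sketch: only dispatch_count is a documented field; the rest are assumed inputs.
type CheckpointInput = {
  dispatch_count: number;
  phase: number;
  phase4_task_count: number; // assumed to be derivable from sprint-tasks.md
  at_wave_transition: boolean;
};

const DISPATCH_THRESHOLD = 8;
const LARGE_PHASE4_TASKS = 15;

function shouldCompact(input: CheckpointInput): boolean {
  // Primary relief valve: enough dispatches have accumulated.
  if (input.dispatch_count >= DISPATCH_THRESHOLD) return true;
  // Phase 4 context pressure: force a checkpoint at every wave transition.
  return input.phase === 4 && input.phase4_task_count > LARGE_PHASE4_TASKS && input.at_wave_transition;
}

// After saving state, the counter is reset to 0.
function afterCheckpoint(input: CheckpointInput): CheckpointInput {
  return { ...input, dispatch_count: 0 };
}
```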
@@ -162,9 +191,9 @@ The 7-check verification gate is called by Phase 2 (architecture check), Phase 4
162
191
 
163
192
  ### Refs-Not-Pastes Rule
164
193
 
165
- For Phase 3+ agents, the orchestrator passes REFS to live downstream docs (`design-doc.md`, `architecture.md`, `visual-design-spec.md`, `sprint-tasks.md`, `quality-targets.json`, `decisions.jsonl`) — NOT pasted content. The orchestrator reads `docs/plans/refs.json` (produced by the Phase 2 Refs Indexer), resolves the task topic against the flat anchor index, and passes a short ref list to the agent. The agent uses the Read tool to pull refs it needs. This keeps orchestrator context lean and lets the agent widen its view on demand. Phase 1-2 agents still receive full documents because the architecture anchors don't exist yet.
194
+ For Phase 3+ agents, the orchestrator passes REFS to live downstream docs (`design-doc.md`, `architecture.md`, `DESIGN.md`, `sprint-tasks.md`, `quality-targets.json`, `decisions.jsonl`) — NOT pasted content. The orchestrator reads `docs/plans/refs.json` (produced by the Phase 2 Refs Indexer), resolves the task topic against the flat anchor index, and passes a short ref list to the agent. The agent uses the Read tool to pull refs it needs. This keeps orchestrator context lean and lets the agent widen its view on demand. Phase 1-2 agents still receive full documents because the architecture anchors don't exist yet.
166
195
 
167
- **refs.json mutation invalidates sprint-context hash (Stage 6 / task 6.3.2).** Any orchestrator update to `docs/plans/refs.json` (Phase 2 Refs Indexer initial write, Phase 3 extension after `visual-design-spec.md` lands, or any subsequent correction) MUST be IMMEDIATELY followed by a `state_save` call that sets `.build-state.json.current_sprint_context_hash = null`. This invalidates the cached Phase 4 sprint-scoped shared-context block so the next subagent dispatch re-renders with fresh references. See `src/orchestrator/phase4-shared-context.ts#shouldInvalidate` for how the hash is consulted at render time. Skipping this invalidation causes Phase 4 implementers to read stale anchor indices — a silent correctness failure.
196
+ **refs.json mutation invalidates sprint-context hash (Stage 6 / task 6.3.2).** Any orchestrator update to `docs/plans/refs.json` (Phase 2 Refs Indexer initial write, Phase 3 extension after `DESIGN.md` lands, or any subsequent correction) MUST be IMMEDIATELY followed by a `state_save` call that sets `.build-state.json.current_sprint_context_hash = null`. This invalidates the cached Phase 4 sprint-scoped shared-context block so the next subagent dispatch re-renders with fresh references. See `src/orchestrator/phase4-shared-context.ts#shouldInvalidate` for how the hash is consulted at render time. Skipping this invalidation causes Phase 4 implementers to read stale anchor indices — a silent correctness failure.
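A rough sketch of the invalidation contract in the paragraph above: every write to `refs.json` nulls the cached hash, and a null hash forces a re-render on the next dispatch. `afterRefsWrite` and `mustRerender` are hypothetical helpers written for illustration; the real check is the `shouldInvalidate` function in `src/orchestrator/phase4-shared-context.ts`, whose signature is not shown in this excerpt.

```typescript
// Hypothetical sketch: helper names are assumptions; only the field name comes from build.md.
type SprintState = { phase: number; current_sprint_context_hash: string | null };

// Call immediately after any orchestrator write to docs/plans/refs.json.
function afterRefsWrite(state: SprintState): SprintState {
  return { ...state, current_sprint_context_hash: null };
}

// A null hash means the cached Phase 4 shared-context block is stale.
function mustRerender(state: SprintState): boolean {
  return state.current_sprint_context_hash === null;
}
```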
168
197
 
169
198
  ### Complexity Routing (Advisory)
170
199
 
@@ -181,23 +210,27 @@ Mode-specific tool stacks, per-phase branches, and persona rules live in separat
181
210
  When a later phase finds a problem whose root cause lives earlier, control flows BACKWARD to the authoring phase. The orchestrator codifies these edges so problems are fixed where they were introduced, not patched locally.
182
211
 
183
212
  ```
184
- PROBLEM FOUND AT ROUTES BACK TO
185
- ─────────────────────────────────────────────────────────────────
186
- Gate 1 (human says "no") → Phase 1 Step 1.0 with feedback
187
- Gate 2 (human says "no") → Phase 2 with feedback
188
- Phase 5 Audit (code issue) → Phase 4 target task
189
- Phase 5 Audit (design issue) → Phase 3 target step
190
- Phase 5 Audit (spec issue) → Phase 2 re-architect
191
- Phase 6 LRR BLOCK (⭐⭐) → Aggregator reads decisions.jsonl
192
- by related_decision_id
193
- authoring phase re-open
194
- Phase 6 LRR NEEDS_WORK (code) → Phase 4 target task
195
- Phase 6 LRR NEEDS_WORK (struct) → Phase 2 or Phase 3
213
+ PROBLEM FOUND AT ROUTES BACK TO
214
+ ──────────────────────────────────────────────────────────────────────────────────
215
+ Gate 1 (human says "no") → Phase 1 Step 1.0 with feedback
216
+ Gate 2 (human says "no") → Phase 2 with feedback
217
+ phase-3.step-3.2b-DNA-persona-mismatch → phase-3.step-3.0
218
+ Phase 5 Audit (code issue) → Phase 4 target task
219
+ Phase 5 Audit (design issue) → Phase 3 target step
220
+ Phase 5 Audit (spec issue) → phase-2
221
+ phase-5-dogfood-classified → target_phase-per-classified-findings.json
222
+ phase-5-dogfood-feedback-synthesizer → phase-4.target-task
223
+ Phase 6 LRR BLOCK (⭐⭐) → authoring-phase (per decisions.jsonl.decided_by)
224
+ LRR-BLOCK-decided_by==architect → phase-2
225
+ LRR-BLOCK-decided_by==design-brand-guardian-or-phase-3-writer → phase-3
226
+ Phase 6 LRR NEEDS_WORK (code) → Phase 4 target feature (via BO re-planning)
227
+ LRR-NEEDS_WORK-code-level → phase-4.target-task
228
+ phase-6-LRR-NEEDS_WORK-structural → phase-2-or-phase-3
196
229
  ```
197
230
 
198
231
  The ⭐⭐ star rule: when the LRR Aggregator receives a BLOCK verdict, it reads the `related_decision_id` on the blocker, looks up that row in `decisions.jsonl`, finds which phase authored the decision (the `decided_by` field), and re-opens that phase with the finding as input. Infrastructure already exists (decision IDs, author tracking) — wired here.
199
232
 
200
- **Re-entry halt rule (Stage 4 A7).** Before dispatching any backward routing (LRR BLOCK → Phase N re-open, Reality Checker BLOCK → Phase M re-entry, Gate "no" → Phase 1/2 re-entry, etc.), check `.build-state.json.backward_routing_count` AND the per-target-phase variant `.build-state.json.backward_routing_count_by_target_phase[<N>]`. If the new (post-increment) value of EITHER counter for the target phase would exceed `max_cycles` (currently 2, from `phase-graph.yaml:routing.max_cycles`) — i.e., on the attempted third backward iteration — the orchestrator MUST halt and escalate to the user instead of dispatching. The Stage 4 `cycle_counter_check` MCP is the authoritative enforcer at runtime — it increments atomically and returns `escalate_to_user` once the new value exceeds `max_cycles`. This prose documents the behavior for the markdown-mode rollback path and for human readers.
233
+ **Re-entry halt rule (Stage 4 A7).** Before dispatching any backward routing (LRR BLOCK to Phase N re-open, Reality Checker BLOCK to Phase M re-entry, Gate "no" to Phase 1/2 re-entry, etc.), check `.build-state.json.backward_routing_count` AND the per-target-phase variant `.build-state.json.backward_routing_count_by_target_phase[<N>]`. If the new (post-increment) value of EITHER counter for the target phase would exceed `max_cycles` (currently 2, from `phase-graph.yaml:routing.max_cycles`) — i.e., on the attempted third backward iteration — the orchestrator MUST halt and escalate to the user instead of dispatching. The Stage 4 `cycle_counter_check` MCP is the authoritative enforcer at runtime — it increments atomically and returns `escalate_to_user` once the new value exceeds `max_cycles`. This prose documents the behavior for the markdown-mode rollback path and for human readers.
201
234
 
202
235
  **Phase-entry `in_flight_backward_edge` clear (Stage 4 A3 / task 4.3.3).** On the FIRST `state_save` after any phase entry — whether forward progression or backward-edge re-entry — the orchestrator MUST explicitly set `.build-state.json.in_flight_backward_edge = null`. This is the "successful landing" signal that closes the atomic crash-seam opened by `cycle_counter_check` (which writes `in_flight_backward_edge` in the same atomic state_save that increments the counter). If the runtime crashes between edge dispatch and landing, `--resume` in `bin/buildanything-runtime.ts` observes a stale `in_flight_backward_edge` (age > 60s) and decrements the counter (see task 4.3.4). See `src/orchestrator/mcp/cycle-counter.ts#clearInFlightEdge` for the runtime primitive.
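The re-entry halt rule in this hunk can be illustrated with a pure function over the two counters. This is a sketch under stated assumptions, not the `cycle_counter_check` MCP itself: `checkCycle` and its return shape are hypothetical, and leaving the counters untouched on escalation is a simplification (the prose says the real tool increments atomically before returning).

```typescript
// Hypothetical sketch: checkCycle and Decision are illustrative, not the MCP tool's API.
type RoutingState = {
  backward_routing_count: number;
  backward_routing_count_by_target_phase: { [phase: string]: number };
};
type Decision = { action: "dispatch" | "escalate_to_user"; state: RoutingState };

function checkCycle(state: RoutingState, targetPhase: number, maxCycles: number = 2): Decision {
  const key = String(targetPhase);
  const nextTotal = state.backward_routing_count + 1;
  const nextForPhase = (state.backward_routing_count_by_target_phase[key] || 0) + 1;
  // Halt if EITHER post-increment counter would exceed max_cycles.
  if (nextTotal > maxCycles || nextForPhase > maxCycles) {
    return { action: "escalate_to_user", state: state };
  }
  return {
    action: "dispatch",
    state: {
      backward_routing_count: nextTotal,
      backward_routing_count_by_target_phase: {
        ...state.backward_routing_count_by_target_phase,
        [key]: nextForPhase,
      },
    },
  };
}
```

With `max_cycles` at 2, the first two backward edges to a phase dispatch normally and the attempted third returns `escalate_to_user`.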
203
236
 
@@ -211,7 +244,7 @@ Phase 0 is thin. No agent dispatch. No human input. Instant. The orchestrator re
211
244
  1. Read `docs/plans/.build-state.json` (source of truth) — verify it exists and has a `resume_point` field. Fall back to reading `docs/plans/.build-state.md` (rendered view) if the JSON file is missing but the markdown exists (graceful migration path from pre-W1-2 builds).
212
245
  If neither exists, OR neither has a Resume Point, warn the user: 'No previous build state found. Starting fresh.' Then proceed to Step 0.1 as a new build.
213
246
  2. Re-read this file and all protocol files in `protocols/`.
214
- 3. Re-read live downstream docs: `docs/plans/sprint-tasks.md`, `docs/plans/architecture.md`, `docs/plans/design-doc.md`, `docs/plans/visual-design-spec.md` (if exists), `CLAUDE.md`.
247
+ 3. Re-read live downstream docs: `docs/plans/sprint-tasks.md`, `docs/plans/architecture.md`, `docs/plans/design-doc.md`, `DESIGN.md` (if exists), `CLAUDE.md`.
215
248
  4. Read `docs/plans/decisions.jsonl` if it exists (top 5 most recent rows, filtered to the current phase and upstream phases). Pass short row fields + `ref` anchors into Phase 0 rehydration context — not the full row prose. See `protocols/decision-log.md`.
216
249
  5. Rebuild TodoWrite from the state file (TodoWrite does NOT survive compaction or session breaks).
217
250
  6. Reset `dispatches_since_save` to 0 (fresh context window).
@@ -231,7 +264,7 @@ Scan for existing context:
231
264
  | Context Level | What You Have | What Happens |
232
265
  |---|---|---|
233
266
  | **Full design** | Design doc with decisions, scope, tech stack, data models | Skip Phase 1. Feed design into Phase 2. |
234
- | **Decision brief** | An idea-sweep brief with verdicts and MVP definition | Phase 1 skips Step 1.1 research (already done). Brainstorming refines the brief into a design. |
267
+ | **Decision brief** | An idea-sweep brief with verdicts and product definition | Phase 1 skips Step 1.1 research (already done). Brainstorming refines the brief into a design. |
235
268
  | **Partial context** | Some notes, conversation, rough sketch | Phase 1 runs fully. Feed context into brainstorming + research. |
236
269
  | **Raw idea** | One-line build request, no prior work | Phase 1 runs fully from scratch. |
237
270
 
@@ -239,6 +272,8 @@ Scan for existing context:
239
272
 
240
273
  Scan the build request AND any context from Step 0.1 for iOS signals. Keywords: **iOS, iPhone, iPad, SwiftUI, Swift, App Store, TestFlight, Xcode, Apple, Liquid Glass, watchOS, visionOS, SwiftData, HIG**.
241
274
 
275
+ To avoid false positives (e.g., a web app mentioning "Apple Pay" or "Sign in with Apple"), require at least 2 iOS keywords OR 1 keyword + existing Swift/Xcode files before triggering the iOS confirmation prompt.
276
+
242
277
  | Signal | Action |
243
278
  |---|---|
244
279
  | iOS keywords present in prompt | Confirm with user: "This looks like an iOS app — confirm? [y/n]" |
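The false-positive guard added in this hunk (at least 2 iOS keywords, or 1 keyword plus existing Swift/Xcode files) could look roughly like the sketch below. The keyword list is copied from build.md; `countIosKeywords` and `shouldPromptIos` are hypothetical names, and whole-word, case-insensitive matching that counts each keyword at most once is an assumption the source does not specify.

```typescript
// Hypothetical sketch: matching rules (whole word, case-insensitive, distinct keywords) are assumptions.
const IOS_KEYWORDS = [
  "iOS", "iPhone", "iPad", "SwiftUI", "Swift", "App Store", "TestFlight",
  "Xcode", "Apple", "Liquid Glass", "watchOS", "visionOS", "SwiftData", "HIG",
];

// Count distinct keywords present as whole words, so "SwiftUI" does not also count as "Swift".
function countIosKeywords(text: string): number {
  return IOS_KEYWORDS.filter((kw) => new RegExp("\\b" + kw + "\\b", "i").test(text)).length;
}

// Trigger the confirmation prompt on 2+ keywords, or 1 keyword plus Swift/Xcode files on disk.
function shouldPromptIos(text: string, hasSwiftOrXcodeFiles: boolean): boolean {
  const hits = countIosKeywords(text);
  return hits >= 2 || (hits >= 1 && hasSwiftOrXcodeFiles);
}
```

Under these assumptions a web request that mentions only "Apple Pay" and "Sign in with Apple" counts one distinct keyword and does not trigger the prompt.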
@@ -282,7 +317,7 @@ Phase 4 implementer dispatch reads `.active-learnings.md` and injects its conten
282
317
 
283
318
  0. Create `docs/plans/` directory if it doesn't exist (greenfield projects won't have it).
284
319
  1. Create a TodoWrite checklist with Phases 0-7.
285
- 2. Write `docs/plans/.build-state.json` per the schema in `protocols/state-schema.md`. Required top-level fields: `project_type`, `phase`, `step`, `input`, `context_level`, `prerequisites`, `dispatch_count`, `last_save_phase`, `autonomous`, `session_id`, `session_started`, `completed_tasks[]`, `metric_loop_scores[]`, `decisions_next_id` (object keyed by phase number — see Phase 4 orchestrator-scribe handler), `resume_point { phase, step, completed_tasks, git_branch }`. Then regenerate `docs/plans/.build-state.md` from the JSON as a **read-only rendered view**.
320
+ 2. Write `docs/plans/.build-state.json` per the schema in `protocols/state-schema.md`. Required top-level fields: `project_type`, `phase`, `step`, `input`, `context_level`, `prerequisites`, `dispatch_count`, `last_save_phase`, `autonomous`, `session_id`, `session_started`, `completed_tasks[]`, `metric_loop_scores[]`, `decisions_next_id` (object keyed by phase number — see Phase 4 orchestrator-scribe handler), `resume_point { phase, step, completed_tasks, git_branch }`, `backward_routing_count_by_target_phase` (object), `feature_delegation_plan_path`, `current_wave`, `completed_features[]`, `feature_acceptance{}`, `feature_briefs{}`. See `protocols/state-schema.md` for the complete and authoritative field list. This inline list is a summary. Then regenerate `docs/plans/.build-state.md` from the JSON as a **read-only rendered view**.
286
321
  3. Go to Phase 1 (or Phase 2 if context level is "Full design").
287
322
 
288
323
  **NO prereq collection in Phase 0.** Stack isn't decided yet. Prereqs move to Step 1.5, after Gate 1. Asking for creds before the stack is picked means asking for wrong creds or re-asking on rejection.
@@ -295,7 +330,7 @@ Phase 4 implementer dispatch reads `.active-learnings.md` and injects its conten
295
330
 
296
331
  **If `project_type=ios` AND no `.xcodeproj` exists:** follow `protocols/ios-phase-branches.md` §Phase -1 — iOS Bootstrap. Otherwise skip entirely.
297
332
 
298
- iOS structural changes are out of scope for this orchestrator migration (per proposed-state §7).
333
+ iOS structural changes are out of scope for this orchestrator migration.
299
334
 
300
335
  **Compaction checkpoint.** Update `.build-state.json` per the format above.
301
336
 
@@ -333,7 +368,7 @@ Call the Agent tool 4 times in a single message. Each gets the build request + `
333
368
 
334
369
  1. Description: "Feature intel" — subagent_type: `feature-intel` — Prompt: "[CONTEXT header above] Extract competitor feature matrix for: [build request]. Idea draft: read docs/plans/phase1-scratch/idea-draft.md with your Read tool. Walk 5-10 rivals. Return must-haves (features present in >=80% of rivals — table stakes) + stand-outs (features unique to individual rivals — differentiation opportunities), sorted by competitor. Save to `docs/plans/phase1-scratch/feature-intel.md`."
335
370
 
336
- 2. Description: "Tech feasibility" — subagent_type: `tech-feasibility` — Prompt: "[CONTEXT header above] Evaluate hard technical problems (Solved/Hard/Unsolved), build-vs-buy decisions, MVP scope, stack validation for: [build request]. Idea draft: read docs/plans/phase1-scratch/idea-draft.md with your Read tool. Verify APIs and libraries from the draft exist and are maintained. Save to `docs/plans/phase1-scratch/tech-feasibility.md`. Report with a Technical Verdict."
371
+ 2. Description: "Tech feasibility" — subagent_type: `tech-feasibility` — Prompt: "[CONTEXT header above] Evaluate hard technical problems (Solved/Hard/Unsolved), build-vs-buy decisions, stack validation for: [build request]. Idea draft: read docs/plans/phase1-scratch/idea-draft.md with your Read tool. Verify APIs and libraries from the draft exist and are maintained. Save to `docs/plans/phase1-scratch/tech-feasibility.md`. Report with a Technical Verdict."
337
372
 
338
373
  3. Description: "UX research" — subagent_type: `design-ux-researcher` — Prompt: "[CONTEXT header above] Analyze target persona, jobs-to-be-done, current alternatives, and behavioral barriers for: [build request]. Idea draft: read docs/plans/phase1-scratch/idea-draft.md with your Read tool. Save to `docs/plans/phase1-scratch/ux-research.md`. Report with a User Verdict."
339
374
 
@@ -391,7 +426,7 @@ Write TWO outputs.
391
426
  OUTPUT 1 — `docs/plans/design-doc.md` — **THE PRD** (authoritative product document). Header MUST begin with `# [Product Name] — PRD`. Structure:
392
427
  - Product — what it is, core value prop, success criteria
393
428
  - User — persona, JTBD, hard constraints
394
- - Scope — MVP features (must-haves + chosen stand-outs), explicit Out-of-Scope boundary
429
+ - Scope — Features in scope (must-haves + chosen stand-outs), explicit Out-of-Scope boundary
395
430
  - Tech Stack — chosen stack with 1-line rationale
396
431
  - Data Model — shape of core entities
397
432
  - Decisions — links to `decisions.jsonl` rows
@@ -411,7 +446,7 @@ OUTPUT 2 — `CLAUDE.md` (project root, NOT `docs/plans/`). <200-line product br
411
446
  [Stack choices with 1-line rationale for each]
412
447
 
413
448
  ## Scope
414
- [What's in MVP vs. deferred. Hard boundaries.]
449
+ [What's in scope vs. deferred. Hard boundaries.]
415
450
 
416
451
  ## Rules
417
452
  [Project-specific hard rules derived from the product and user context.]
@@ -464,6 +499,32 @@ Output: `docs/plans/phase1-scratch/prereqs.json` with shape `{supabase_url, supa
464
499
 
465
500
  ---
466
501
 
502
+ ### Step 1.6 — PRODUCT SPEC
503
+
504
+ Call the Agent tool — description: "Product spec" — subagent_type: `product-spec-writer` — prompt: "[CONTEXT header above] Write `docs/plans/product-spec.md` following the structure in `protocols/product-spec-schema.md`. Read ALL of these via your Read tool before writing (do NOT expect pasted content):
505
+ - `docs/plans/design-doc.md` — PRD: features, persona, JTBD, value prop, scope, tech stack
506
+ - `docs/plans/phase1-scratch/findings-digest.md` — research synthesis
507
+ - `docs/plans/phase1-scratch/ux-research.md` — behavioral patterns, pain points
508
+ - `docs/plans/phase1-scratch/feature-intel.md` — competitive matrix, table-stakes vs differentiators
509
+ - `docs/plans/phase1-scratch/business-model.md` — revenue model implications
510
+ - `docs/plans/phase1-scratch/tech-feasibility.md` — technical constraints, rate limits, API limitations
511
+ - `docs/plans/phase1-scratch/user-decisions.md` — user's product decisions from informed brainstorm
512
+ This is the LAST step that reads raw research files. Every actionable insight must survive in product-spec.md in structured, queryable form. Commit: 'feat: product spec'."
513
+
514
+ #### Step 1.6.idx — Slice 1 graph index
515
+
516
+ After `product-spec-writer` returns and `docs/plans/product-spec.md` is on disk, index it into the build graph. Slice 1 graph index — required for downstream agents.
517
+
518
+ Run via the Bash tool:
519
+
520
+ - Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/product-spec.md`
521
+ - On exit 0: log success to `docs/plans/build-log.md` and continue.
522
+ - On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents (BO, PO, implementers) require the graph — do not proceed without a successful index.
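
The exit-code gate above can be sketched as follows, assuming Python for illustration. The command line and the log path come from the step itself; the function name `index_into_graph`, the `GraphIndexError` type, the log line format, and the injectable `run` parameter are hypothetical.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


class GraphIndexError(RuntimeError):
    """Raised when graph-index.js exits non-zero; the build must stop."""


def index_into_graph(doc_path, plugin_root,
                     log_path="docs/plans/build-log.md",
                     run=subprocess.run):
    # Same command the step runs through the Bash tool.
    cmd = ["node", f"{plugin_root}/bin/graph-index.js", doc_path]
    result = run(cmd, capture_output=True, text=True)

    stamp = datetime.now(timezone.utc).isoformat()
    log = Path(log_path)
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        if result.returncode == 0:
            fh.write(f"- {stamp} graph-index OK: {doc_path}\n")
        else:
            fh.write(f"- {stamp} graph-index FAILED "
                     f"(exit {result.returncode}): {doc_path}\n")
            fh.write(f"  stderr: {result.stderr.strip()}\n")

    # Non-zero exit is a hard stop: downstream agents require the graph.
    if result.returncode != 0:
        raise GraphIndexError(f"graph index failed for {doc_path}")
```

Both outcomes are logged before the exception is raised, so the failure is on record in the build log even though the build halts.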
523
+
524
+ **Compaction checkpoint.** Update `.build-state.json` per the format above.
525
+
526
+ ---
527
+
467
528
  ## Phase 2: Plan / Architect — TEAM of 6 + sequence
468
529
 
469
530
  **Goal**: Convert the PRD into a concrete architecture and ordered task list with explicit dependencies. Every architect receives the PRD (design-doc.md) + the Research Digest + its domain's raw research file (hybrid routing).
@@ -484,6 +545,8 @@ The 6 architects design as a TEAM — not 6 isolated subagents. Cross-domain con
484
545
 
485
546
  **On re-entry from LRR backward routing:** If Phase 2 is being re-opened via the re-entry dispatch template (Step 6.3), skip team creation if the original `phase-2-architects` team is still live from this build; otherwise recreate it. Pass the re-entry payload (`{blocking_finding, prior_output: "docs/plans/architecture.md", decision_row}`) into the dispatch prompt of the architect(s) whose domain matches `decision_row.author` — only those architects re-run, not all 6. The re-dispatched architect revises its `docs/plans/phase-2-contracts/<name>.md` in place, SendMessages peers on any contract boundary it now changes, and the synthesizer re-runs once to re-stitch `architecture.md`. Do NOT redo unaffected domains.
486
547
 
548
+ After the synthesizer re-stitches `architecture.md`, re-run the Refs Indexer (Step 2.3 dispatch #4) to update `docs/plans/refs.json` with fresh anchors, and re-run the DAG Validator (Step 2.3 dispatch #3) to verify sprint-tasks.md still references valid architecture sections. Invalidate the sprint-context hash per the refs.json mutation rule.
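
The re-entry sequence described above can be sketched as a small planner, assuming Python for illustration. The roster names are the architect `name` values used in the per-architect dispatches; matching `decision_row.author` directly against a roster name is an assumption, since the text does not spell out how an author maps to a domain. The function name and step labels are hypothetical.

```python
ARCHITECT_ROSTER = (
    "backend-architect", "frontend-architect", "data-engineer",
    "security-engineer", "accessibility-auditor", "performance-benchmarker",
)


def plan_phase2_reentry(payload: dict) -> list:
    """Return the ordered steps for a Phase 2 re-entry.

    Assumption: decision_row.author equals one roster name. A real
    implementation may need a domain lookup instead.
    """
    author = payload["decision_row"]["author"]
    matched = [name for name in ARCHITECT_ROSTER if name == author]
    if not matched:
        raise ValueError(f"no architect matches author {author!r}")

    steps = [f"redispatch:{name}" for name in matched]  # only affected domains
    steps.append("synthesizer:restitch-architecture")   # runs once
    steps.append("refs-indexer:update-refs-json")       # Step 2.3 dispatch #4
    steps.append("dag-validator:recheck-sprint-tasks")  # Step 2.3 dispatch #3
    steps.append("invalidate:sprint-context-hash")
    return steps
```

Returning an ordered list rather than executing anything keeps the sketch free of dispatch details while still showing that unaffected architects never appear in the plan.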
549
+
487
550
  **Step 2.2a — Create the team.**
488
551
 
489
552
  Call `TeamCreate` with `team_name: "phase-2-architects"`. This team scopes the SendMessage channel for the 6 architects below. Capture the team id in `.build-state.json` for teardown.
@@ -509,6 +572,7 @@ CROSS-CHECK PAIRINGS (mandatory — if your design touches one of these boundari
509
572
  - Security ↔ Backend on auth flow (token storage, refresh, session model, authz gates)
510
573
  - Accessibility ↔ Frontend on component patterns (primitives, focus management, landmark structure)
511
574
  - Performance ↔ Backend+Data on query shapes (N+1 risk, indexing strategy, bundle impact of data layer choices)
575
+ - Security ↔ Frontend on client-side auth (token storage location, CSRF protection, input sanitization, secure cookie flags)
512
576
 
513
577
  COORDINATION RULES:
514
578
  - Plain text in your output file is INVISIBLE to teammates. If a contract boundary intersects another architect's domain, you MUST `SendMessage` to that peer using the exact `name` from the roster above. Do not assume they will read your file.
@@ -524,17 +588,17 @@ Per-architect dispatches:
524
588
  **CONTEXT header:** Render `rendered_context_header` for phase 2 per the canonical template (see CONTEXT HEADER HARD-GATE above). Prepend to every Phase 2 architect prompt below.
525
589
 
526
590
 
527
- 1. Description: "Backend architecture" — subagent_type: `engineering-backend-architect` — team_name: `phase-2-architects` — name: `backend-architect` — Prompt: "[CONTEXT header above] Design system architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude services, data models, API contracts, database schema, integration points. Respect stack choices from PRD.\n\n[paste shared team brief above]"
591
+ 1. Description: "Backend architecture" — subagent_type: `engineering-backend-architect` — team_name: `phase-2-architects` — name: `backend-architect` — Prompt: "[CONTEXT header above] Design system architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude services, data models, API contracts, database schema, integration points. Respect stack choices from PRD. Map per-feature Business Rules and States to specific endpoints, persistence schemas, and validation logic — every State the product spec defines must have a backend behavior.\n\n[paste shared team brief above]"
528
592
 
529
- 2. Description: "Frontend architecture" — subagent_type: `engineering-frontend-developer` — team_name: `phase-2-architects` — name: `frontend-architect` — Prompt: "[CONTEXT header above] Design frontend architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude component hierarchy, layout strategy, responsive approach, state management, routing. Align UX with the persona from research.\n\n[paste shared team brief above]"
593
+ 2. Description: "Frontend architecture" — subagent_type: `engineering-frontend-developer` — team_name: `phase-2-architects` — name: `frontend-architect` — Prompt: "[CONTEXT header above] Design frontend architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude component hierarchy, layout strategy, responsive approach, state management, routing. Align UX with the persona from research. Map the Screen Inventory to your component hierarchy — every screen the product spec lists must have a routable view, and per-feature States must drive the component-state matrix.\n\n[paste shared team brief above]"
530
594
 
531
- 3. Description: "Data engineering" — subagent_type: `engineering-data-engineer` — team_name: `phase-2-architects` — name: `data-engineer` — Prompt: "[CONTEXT header above] Design data architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nInclude ETL/ELT patterns, schema versioning, query patterns, indexing strategy, data lineage, migration plan.\n\n[paste shared team brief above]"
595
+ 3. Description: "Data engineering" — subagent_type: `engineering-data-engineer` — team_name: `phase-2-architects` — name: `data-engineer` — Prompt: "[CONTEXT header above] Design data architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nInclude ETL/ELT patterns, schema versioning, query patterns, indexing strategy, data lineage, migration plan. Per-feature data requirements from the product spec drive your schema — derived fields, denormalizations, and access patterns must serve specific feature flows.\n\n[paste shared team brief above]"
532
596
 
533
- 4. Description: "Security architecture" — subagent_type: `engineering-security-engineer` — team_name: `phase-2-architects` — name: `security-engineer` — Prompt: "[CONTEXT header above] Security review. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\nCover auth model, input validation, secrets management, threat model, dependency hygiene. Note: no raw research file routed, digest only (security architecture is a cross-cutting concern).\n\n[paste shared team brief above]"
597
+ 4. Description: "Security architecture" — subagent_type: `engineering-security-engineer` — team_name: `phase-2-architects` — name: `security-engineer` — Prompt: "[CONTEXT header above] Security review. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nCover auth model, input validation, secrets management, threat model, dependency hygiene. Use the product spec's ## Permissions & Roles section to drive your auth model: roles in the product spec must map to enforceable permissions in the architecture.\n\n[paste shared team brief above]"
534
598
 
535
- 5. Description: "A11y constraints" — subagent_type: `a11y-architect` — team_name: `phase-2-architects` — name: `accessibility-auditor` — Prompt: "[CONTEXT header above] Accessibility-driven architecture constraints. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`\nIdentify WCAG 2.2 AA requirements that affect component choice, navigation structure, form patterns, focus management, landmark regions.\n\n[paste shared team brief above]"
599
+ 5. Description: "A11y constraints" — subagent_type: `a11y-architect` — team_name: `phase-2-architects` — name: `accessibility-auditor` — Prompt: "[CONTEXT header above] Accessibility-driven architecture constraints. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`\nIdentify WCAG 2.2 AA requirements that affect component choice, navigation structure, form patterns, focus management, landmark regions. Per-feature Persona Constraints (e.g., \"user scans, doesn't read\", \"operator on a phone in the field\") drive component-level a11y constraints.\n\n[paste shared team brief above]"
536
600
 
537
- 6. Description: "Performance constraints" — subagent_type: `testing-performance-benchmarker` — team_name: `phase-2-architects` — name: `performance-benchmarker` — Prompt: "[CONTEXT header above] Define quality targets for this build. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nWrite `docs/plans/quality-targets.json` covering bundle budget, LCP, TTI, API p95, Lighthouse scores. Use per-Scope budgets from `orchestration-proposed-state.md` §11: Marketing 500KB / Product 300KB / Dashboard 400KB / Internal 200KB gzipped.\n\n[paste shared team brief above]"
601
+ 6. Description: "Performance constraints" — subagent_type: `testing-performance-benchmarker` — team_name: `phase-2-architects` — name: `performance-benchmarker` — Prompt: "[CONTEXT header above] Define quality targets for this build. Read these files via your Read tool before starting — do NOT expect pasted content:\n - PRD: `docs/plans/design-doc.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nWrite `docs/plans/quality-targets.json` covering bundle budget, LCP, TTI, API p95, Lighthouse scores. Use per-Scope budgets: Marketing 500KB / Product 300KB / Dashboard 400KB / Internal 200KB gzipped. Per-feature critical-path performance derives from the product spec's Happy Path latency expectations.\n\n[paste shared team brief above]"
538
602
 
539
603
  **Step 2.2c — Wait for all 6 teammates to idle**, then proceed to synthesis. The `docs/plans/phase-2-contracts/*.md` files now contain post-debate positions (initial draft plus any SendMessage-driven revisions). The orchestrator does NOT read these files — the synthesizer below does.
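
The per-Scope bundle budgets named in the performance dispatch above could be checked with something like the following sketch, assuming Python for illustration. The four budget figures come from the prompt; treating 1 KB as 1024 bytes, and the names `BUNDLE_BUDGET_KB` and `within_budget`, are assumptions. The actual shape of `docs/plans/quality-targets.json` is not shown in this diff.

```python
# Gzipped bundle budgets per Scope, in KB, as stated in the dispatch prompt.
BUNDLE_BUDGET_KB = {
    "Marketing": 500,
    "Product": 300,
    "Dashboard": 400,
    "Internal": 200,
}


def within_budget(scope: str, gzipped_bytes: int) -> bool:
    """True when a gzipped bundle fits its Scope budget (1 KB = 1024 bytes)."""
    if scope not in BUNDLE_BUDGET_KB:
        raise KeyError(f"unknown scope: {scope}")
    return gzipped_bytes <= BUNDLE_BUDGET_KB[scope] * 1024
```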
540
604
 
@@ -548,9 +612,9 @@ Four sequential dispatches.
548
612
 
549
613
  **CONTEXT header:** Reuse `rendered_context_header` from phase 2 (already rendered above). Prepend to Step 2.3 synthesizer + sprint-breakdown prompts.
550
614
 
551
- 1. Description: "Implementation blueprint" — subagent_type: `code-architect` — Prompt: "[CONTEXT header above] Implementation blueprint. Read the PRD via your Read tool: `docs/plans/design-doc.md`. Read all 6 post-debate architect positions via your own Read tool from `docs/plans/phase-2-contracts/`:\n - `backend-architect.md`\n - `frontend-architect.md`\n - `data-engineer.md`\n - `security-engineer.md`\n - `accessibility-auditor.md`\n - `performance-benchmarker.md`\n\nThese files are the authoritative team positions AFTER any SendMessage-driven revisions — the architects already cross-checked each other's contract boundaries, so you can stitch without re-debating. Your job is to assemble, not adjudicate. Include specific files to create/modify, build sequence, dependency order. Write `docs/plans/architecture.md` with stable section anchors per `protocols/architecture-schema.md`. Required top-level sections: Overview, Frontend, Backend, Data Model, Security, Infrastructure, MVP Scope, Out of Scope. Scope to MVP boundary from the PRD."
615
+ 1. Description: "Implementation blueprint" — subagent_type: `code-architect` — Prompt: "[CONTEXT header above] Implementation blueprint. Read the PRD via your Read tool: `docs/plans/design-doc.md`. Read the product spec: `docs/plans/product-spec.md` (Screen Inventory + per-feature behavioral sections — your blueprint's file-and-build-order list must cover every feature in the spec). Read all 6 post-debate architect positions via your own Read tool from `docs/plans/phase-2-contracts/`:\n - `backend-architect.md`\n - `frontend-architect.md`\n - `data-engineer.md`\n - `security-engineer.md`\n - `accessibility-auditor.md`\n - `performance-benchmarker.md`\n\nThese files are the authoritative team positions AFTER any SendMessage-driven revisions — the architects already cross-checked each other's contract boundaries, so you can stitch without re-debating. Your job is to assemble the 6 positions into a coherent architecture. Where positions conflict OUTSIDE the 5 mandatory cross-check pairings, flag the contradiction explicitly in `architecture.md` under a `### Unresolved Tensions` section and pick the safer default. Do not silently absorb contradictions. Include specific files to create/modify, build sequence, dependency order. Write `docs/plans/architecture.md` with stable section anchors per `protocols/architecture-schema.md`. Required top-level sections: Overview, Frontend, Backend, Data Model, Security, Infrastructure, Scope, Out of Scope. Scope to the boundary from the PRD. Every API endpoint heading in the Backend section MUST include feature attribution annotations — e.g. `**POST /api/orders** (provides: order-placement)` — using the feature kebab names from `product-spec.md`. These annotations are required for the graph indexer to emit cross-feature dependency edges."
552
616
 
553
- 2. Description: "Sprint breakdown" — subagent_type: `planner` — Prompt: "[CONTEXT header above] Break this architecture into ordered, atomic tasks. Each task needs: description, acceptance criteria, **dependencies** (list of task IDs this depends on), size (S/M/L), **Behavioral Test** field for every UI task (concrete interaction: 'Navigate to [page], click [element], verify [outcome]') or curl-based acceptance test for API tasks. Read these files via your Read tool before starting:\n - ARCHITECTURE: `docs/plans/architecture.md`\n - PRD: `docs/plans/design-doc.md`\nScope to MVP only. Save to `docs/plans/sprint-tasks.md`. Dependencies field is load-bearing — Phase 4 uses it to batch independent tasks in parallel."
617
+ 2. Description: "Sprint breakdown" — subagent_type: `planner` — Prompt: "[CONTEXT header above] Break this architecture into ordered, atomic tasks. Each task needs: description, acceptance criteria, **dependencies** (list of task IDs this depends on), size (S/M/L), **Behavioral Test** field for every UI task (concrete interaction: 'Navigate to [page], click [element], verify [outcome]') or curl-based acceptance test for API tasks, **Feature** — the exact feature name from product-spec.md (e.g. 'Order Placement', 'Auth') that must match a `## Feature: X` heading in product-spec.md (use '—' for infrastructure tasks that don't belong to a specific feature), **Screens** — comma-separated screen names from the product-spec Screen Inventory (e.g. 'Catalog, Product Detail') that must match screen names in product-spec.md (use '—' for backend-only tasks). Read these files via your Read tool before starting:\n - ARCHITECTURE: `docs/plans/architecture.md`\n - PRODUCT SPEC: `docs/plans/product-spec.md` (per-feature behavioral sections — every feature in the spec must have at least one task, and per-feature acceptance criteria become Behavioral Test field values)\n - PRD: `docs/plans/design-doc.md`\nSave to `docs/plans/sprint-tasks.md`. The table must have these columns in order: Task ID, Title, Size, Dependencies, Behavioral Test, Owns Files, Implementing Phase, Feature, Screens. Dependencies field is load-bearing — Phase 4 uses it to batch independent tasks in parallel. Each task's Behavioral Test field SHOULD reference a specific feature acceptance criterion from the product spec (e.g., \"User can submit form with valid email; submitted form appears in admin dashboard within 5s\" — derived from product-spec.md's Happy Path or per-state criteria)."
554
618
 
555
619
  3. Description: "Task DAG validator" — INTERNAL inline role-string — Prompt: "You are the Task DAG Validator. Read `docs/plans/sprint-tasks.md`. Validate for DAG correctness:
556
620
  - No circular dependencies
@@ -565,12 +629,42 @@ Report any violations. If clean, return PASS. If violations, return a list of fi
565
629
  - `docs/plans/architecture.md`
566
630
  - `docs/plans/sprint-tasks.md`
567
631
  - `docs/plans/quality-targets.json`
568
- - `docs/plans/visual-design-spec.md` (if it exists yet — Phase 3 extends refs.json after it writes this file)
632
+ - `DESIGN.md` (if it exists yet — Phase 3 extends refs.json after it writes this file)
569
633
 
570
634
  For each doc, extract section anchors into a flat index. Schema: `[{\"anchor\": \"design-doc.md#persona\", \"topic\": \"user persona\", \"file_path\": \"docs/plans/design-doc.md\"}, ...]`. This index is consumed by the Phase 4 Briefing Officer for per-task context maps. Do NOT include Phase 1 scratch files — they are SPENT."
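
A rough sketch of the flat anchor index described above, assuming Python for illustration. The three keys (`anchor`, `topic`, `file_path`) follow the schema in the prompt; the slug rule and the choice to reuse the lowercased heading text as `topic` are assumptions, and `extract_refs` is a hypothetical name.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
FENCE = "`" * 3  # markdown code-fence marker


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def extract_refs(file_path: str, markdown: str) -> list:
    """Return one refs.json entry per markdown heading in the document."""
    file_name = file_path.rsplit("/", 1)[-1]
    refs = []
    in_fence = False
    for line in markdown.splitlines():
        # Skip fenced code so '# comment' lines are not read as headings.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            title = match.group(2)
            refs.append({
                "anchor": f"{file_name}#{slugify(title)}",
                "topic": title.lower(),
                "file_path": file_path,
            })
    return refs
```

Running this over each listed doc and concatenating the results yields the flat list the Phase 4 Briefing Officer consumes.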
571
635
 
572
636
  **Architecture Metric Loop (callable service):** Run the Metric Loop Protocol (`protocols/metric-loop.md`) on `architecture.md`. Define a metric: coverage of PRD requirements, specificity, consistency across the 6 architects, and **simplicity** — is this the simplest architecture that meets the requirements? Could any service, abstraction, or dependency be eliminated? Penalize over-engineering. Max 3 iterations.
573
637
 
638
+ #### Step 2.3.1.idx — Architecture graph index
639
+
640
+ After `code-architect` returns from the Implementation Blueprint dispatch (#1 above) AND the Architecture Metric Loop exits with `architecture.md` on disk, index it into the build graph. Slice 4 graph index — required for downstream agents.
641
+
642
+ Run via the Bash tool:
643
+
644
+ - Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/architecture.md`
645
+ - On exit 0: log success to `docs/plans/build-log.md` and continue.
646
+ - On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
647
+
648
+ #### Step 2.3.2.idx — Sprint tasks graph index
649
+
650
+ After `planner` returns from the Sprint Breakdown dispatch (#2 above) AND the Task DAG Validator (#3 above) returns PASS, index `sprint-tasks.md` into the build graph. Slice 4 graph index — best-effort.
651
+
652
+ Run via the Bash tool:
653
+
654
+ - Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/sprint-tasks.md`
655
+ - On exit 0: log success to `docs/plans/build-log.md` and continue.
656
+ - On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. Downstream agents require the graph — do not proceed without a successful index.
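
The PASS condition that gates this index step, a dependency graph with no cycles, can be sketched with Kahn's algorithm, assuming Python for illustration. The input shape (task ID mapped to a list of dependency IDs) and the unknown-dependency check are assumptions; only the circular-dependency rule is taken from the validator prompt.

```python
from collections import deque


def validate_dag(tasks: dict) -> list:
    """Return a list of violations; an empty list corresponds to PASS."""
    violations = []
    indegree = {task_id: 0 for task_id in tasks}
    dependents = {task_id: [] for task_id in tasks}

    for task_id, deps in tasks.items():
        for dep in deps:
            if dep not in tasks:
                # Assumed extra check: dependencies must name real tasks.
                violations.append(f"{task_id}: unknown dependency {dep}")
                continue
            indegree[task_id] += 1
            dependents[dep].append(task_id)

    # Kahn's algorithm: repeatedly remove tasks with no unmet dependencies.
    ready = deque(t for t, degree in indegree.items() if degree == 0)
    visited = 0
    while ready:
        current = ready.popleft()
        visited += 1
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if visited != len(tasks):
        # Tasks never released are in a cycle or downstream of one.
        stuck = sorted(t for t, degree in indegree.items() if degree > 0)
        violations.append(
            "circular dependency in or downstream of: " + ", ".join(stuck))
    return violations
```

The same traversal order is also what makes the Dependencies field usable for batching: tasks that become ready together have no edges between them.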
657
+
658
+ #### Step 2.3.4.idx — Decisions re-index (end of Phase 2)
659
+
660
+ After the four Step 2.3 dispatches complete and the orchestrator finishes routing the 4 Phase 2 `deviation_row` objects through `scribe_decision`, re-index `decisions.jsonl` so the Slice 4 fragment reflects every Phase 2 decision before the LRR aggregator or feedback synthesizer can read it. Skip silently if `docs/plans/decisions.jsonl` does not exist (no decisions written yet).
661
+
662
+ Run via the Bash tool:
663
+
664
+ - Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/decisions.jsonl`
665
+ - On exit 0: log success to `docs/plans/build-log.md` and continue.
666
+ - On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. The decisions graph fragment must be current before downstream consumers query it.
667
+
574
668
  **Architecture decisions:** The Implementation Blueprint synthesizer returns 4 `deviation_row` objects (or a `phase_2_decisions` array of row objects) in its structured result — one per cross-cutting Phase 2 decision (API contract, persistence layer, auth model, framework choice). The orchestrator forwards each row through the `scribe_decision` MCP tool (see Phase 4 "Orchestrator-scribe dispatch"); the MCP allocates `D-2-<seq>` IDs and atomically appends to `docs/plans/decisions.jsonl`. Author = `architect`. Each row carries a `ref` anchor pointing into `architecture.md` per `protocols/decision-log.md`. Total: 4 rows.
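
To illustrate the ID allocation described above, here is a sketch assuming Python. It is not the `scribe_decision` MCP tool, whose implementation is not shown in this diff; the function name, the `id` key, and starting the sequence at 1 are all assumptions. The counter mirrors the `decisions_next_id` object keyed by phase number. A single append-mode write is used for simplicity and does not by itself guarantee atomicity across concurrent writers.

```python
import json
from pathlib import Path


def scribe_decision_sketch(row: dict, phase: int,
                           next_ids: dict, log_path: str) -> str:
    """Allocate a D-<phase>-<seq> ID and append the row to a JSONL log."""
    key = str(phase)
    seq = next_ids.get(key, 1)  # assumed: sequences start at 1
    decision_id = f"D-{phase}-{seq}"

    record = {"id": decision_id, **row}
    log = Path(log_path)
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")  # one JSON object per line

    # Advance the counter only after the row is on disk.
    next_ids[key] = seq + 1
    return decision_id
```

Forwarding the four Phase 2 rows through this function in order would produce `D-2-1` through `D-2-4` under these assumptions.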
575
669
 
576
670
  **Writes:** `docs/plans/architecture.md`, `docs/plans/sprint-tasks.md`, `docs/plans/quality-targets.json`, `docs/plans/refs.json`. Decision rows (4) flow through the orchestrator's `scribe_decision` MCP calls.
@@ -608,90 +702,127 @@ Update TodoWrite and `.build-state.json`.
608
702
  <HARD-GATE>
609
703
  UI/UX IS THE PRODUCT. This phase is a full peer to Architecture and Build — not a footnote, not an afterthought. Do NOT skip, compress, or rush this phase for any reason.
610
704
 
611
- Phase 4 WILL NOT START without `docs/plans/visual-design-spec.md` (web) or `docs/plans/ios-design-board.md` (iOS). If the artifact does not exist, return here.
705
+ Phase 4 WILL NOT START without `DESIGN.md` (Pass 1 + Pass 2 complete). If the artifact does not exist, return here.
612
706
  </HARD-GATE>
613
707
 
614
708
  **Mode-specific branch files drive Phase 3 in detail:**
615
709
  - `project_type=ios`: follow `protocols/ios-phase-branches.md` §Phase 3 (HIG + App Store + screenlane harvest → iOS Design Board, SwiftUI Preview QA loop).
616
- - `project_type=web`: follow `protocols/web-phase-branches.md` §Phase 3 — this file contains the NEW structure with steps 3.0-3.7 covering Visual DNA Selection (Brand Guardian as DNA owner at 3.0), Visual Research, Component Library Mapping, UX Architecture, Visual Design Spec, Inclusive Visuals Check, Style Guide Implementation (wrapped in Design Critic metric loop), and A11y Design Review. See `orchestration-proposed-state.md` §11 for the component library strategy.
710
+ - `project_type=web`: follow `protocols/web-phase-branches.md` §Phase 3 — this file contains the NEW structure with steps 3.0-3.7 covering Visual DNA Selection (Brand Guardian as DNA owner at 3.0), Visual Research, Component Library Mapping, UX Architecture, Visual Design Spec, Inclusive Visuals Check, Style Guide Implementation (wrapped in Design Critic metric loop), and A11y Design Review. See the Component Library Mapping step in that protocol for the component library strategy.
711
+
712
+ **Phase 3 branch-file dispatch table (subagent_type references for SSOT lint):**
713
+ - Step 3.0 Visual DNA Selection: subagent_type: `design-brand-guardian` (web)
714
+ - Step 3.1 Visual Research (2 parallel): subagent_type: `visual-research` (web, competitive-audit + inspiration-mining)
715
+ - Step 3.2 Component Library Mapping: subagent_type: `design-ui-designer` (web)
716
+ - Step 3.2b DNA Persona Check: subagent_type: `design-ux-researcher` (web, may route to 3.0)
717
+ - Step 3.3 UX Architecture: subagent_type: `design-ux-architect` (web)
718
+ - Step 3.5 Inclusive Visuals Check: subagent_type: `design-inclusive-visuals-specialist` (web)
719
+ - Step 3.2-ios iOS Design Board: subagent_type: `ios-swift-ui-design` (iOS)
617
720
 
618
- **Phase 3 write discipline:** Phase 3 is the writer for `docs/plans/visual-design-spec.md` (web) and extends `docs/plans/refs.json` to cover the visual spec anchors once it lands. Phase 3 does NOT write to `architecture.md` or `sprint-tasks.md` — those are Phase 2's.
721
+ **Phase 3 write discipline:** Phase 3 is the writer for `DESIGN.md` (web) and extends `docs/plans/refs.json` to cover the visual spec anchors once it lands. Phase 3 does NOT write to `architecture.md` or `sprint-tasks.md` — those are Phase 2's.
619
722
 
620
723
  <HARD-GATE>
621
724
  LRR BLOCK backward edge: `LRR BLOCK authoring=Phase 3 → back to Phase 3`. The ⭐⭐ star rule routes BLOCK findings via Aggregator decisions.jsonl `decided_by` lookup; if `decided_by == design-brand-guardian` or any Phase 3 writer, the build re-opens Phase 3 with the finding as input.
622
725
  </HARD-GATE>
623
726
 
624
- **On re-entry from LRR backward routing:** When Phase 3 is re-opened via the re-entry dispatch template (Step 6.3), the orchestrator passes the re-entry payload (`{blocking_finding, prior_output: "docs/plans/visual-design-spec.md" or "docs/plans/visual-dna.md", decision_row}`) into the specific Phase 3 step named by `decision_row.author`. That step revises the prior output to address `blocking_finding` only — DNA lock, component manifest, or visual spec — and emits a new decision_row. Unaffected steps are NOT re-run. Mode-specific branch files (`protocols/web-phase-branches.md` / `protocols/ios-phase-branches.md`) define which step owns which `decided_by` value.
727
+ **On re-entry from LRR backward routing:** When Phase 3 is re-opened via the re-entry dispatch template (Step 6.3), the orchestrator passes the re-entry payload (`{blocking_finding, prior_output: "DESIGN.md", decision_row}`) into the specific Phase 3 step named by `decision_row.author`. That step revises the prior output to address `blocking_finding` only — DESIGN.md Pass 1 (Step 3.0), component manifest (Step 3.2), or DESIGN.md Pass 2 (Step 3.4) — and emits a new decision_row. Unaffected steps are NOT re-run. Mode-specific branch files (`protocols/web-phase-branches.md` / `protocols/ios-phase-branches.md`) define which step owns which `decided_by` value.
625
728
 
626
729
  **Compaction checkpoint.** Update `.build-state.json` per the format above.
627
730
 
628
731
  ---
629
732
 
630
- ## Phase 4: Build — PARALLEL BATCHES by task dependencies
733
+ ## Phase 4: Build — THREE-TIER FEATURE-BASED EXECUTION
631
734
 
632
735
  <HARD-GATE>
633
- Before starting Phase 4: Phase 2 must be approved, Phase 3 must have produced the design artifact for this mode (`visual-design-spec.md` web / `ios-design-board.md` iOS). You MUST call the Agent tool for EVERY task. No exceptions.
736
+ Before starting Phase 4: Phase 2 must be approved, Phase 3 must have produced the design artifact (`DESIGN.md` Pass 1 + Pass 2 complete; broken-refs lint == 0), and `docs/plans/page-specs/` must contain at least one file (web). You MUST call the Agent tool for EVERY task. No exceptions.
634
737
  </HARD-GATE>
635
738
 
636
- **Goal**: Scaffold project + execute sprint tasks in dependency-ordered batches. Independent sibling tasks run in parallel (~30-50% wall-clock saving on typical sprint).
739
+ **Goal**: Scaffold project, then execute sprint tasks organized by FEATURE with product adherence checked per-feature during build. Three tiers: Product Owner (product quality) → Briefing Officers (task planning per feature) → Execution Agents (code). The orchestrator drives all dispatches — PO and BO are planning agents that write artifacts to disk.
637
740
 
638
741
  **Mode-specific branch:**
639
- - `project_type=ios`: follow `protocols/ios-phase-branches.md` §Phase 4 (entitlements generator + Info.plist hardening, XcodeBuildMCP folder structure, SwiftUI design tokens, Maestro flow stubs).
640
- - `project_type=web`: follow `protocols/web-phase-branches.md` §Phase 4 (web project scaffolding, CSS design system tokens, Playwright acceptance test stubs).
641
-
642
- ### Step 4.0 — Scaffold (old Phase 4 Foundation merged here)
742
+ - `project_type=ios`: follow `protocols/ios-phase-branches.md` §Phase 4 for scaffold details and execution agent prompts.
743
+ - `project_type=web`: follow `protocols/web-phase-branches.md` §Phase 4 for scaffold details and execution agent prompts.
744
+
745
+ **Phase 4 dispatch table (subagent_type references for SSOT lint):**
746
+ - Product Owner (planning): subagent_type: `product-owner`
747
+ - Product Owner (acceptance): subagent_type: `product-owner`
748
+ - Briefing Officer (per feature): subagent_type: `briefing-officer`
749
+ - Web UI (S/M): subagent_type: `engineering-frontend-developer`
750
+ - Web UI (L): subagent_type: `engineering-senior-developer`
751
+ - Web backend: subagent_type: `engineering-backend-architect` OR `engineering-senior-developer`
752
+ - Web AI/ML: subagent_type: `engineering-ai-engineer`
753
+ - iOS UI planner: subagent_type: `ios-swift-ui-design`
754
+ - iOS UI impl: subagent_type: `engineering-senior-developer`, `engineering-mobile-app-builder`
755
+ - iOS Foundation Models: subagent_type: `ios-foundation-models-specialist`
756
+ - iOS StoreKit: subagent_type: `ios-storekit-specialist`
757
+ - iOS Swift review: subagent_type: `swift-reviewer`
758
+ - Security review: subagent_type: `security-reviewer`
759
+ - Cleanup: subagent_type: `code-simplifier`, `refactor-cleaner`
760
+ - Code review: subagent_type: `code-reviewer`, `silent-failure-hunter`
761
+
762
+ ### Step 4.0 — Scaffold (unchanged)
643
763
 
644
764
  Scaffolding is project skeleton + design system + acceptance test stubs. Three sequential dispatches (full details in the mode-specific branch file):
645
765
 
646
- **CONTEXT header:** Render `rendered_context_header` for phase 4 per the canonical template (see CONTEXT HEADER HARD-GATE above). Includes `dna` field for web projects. Prepend to every Phase 4 scaffold prompt below; branch files do the same for per-task flow.
766
+ **CONTEXT header:** Render `rendered_context_header` for phase 4 per the canonical template (see CONTEXT HEADER HARD-GATE above). Includes `dna` field for web projects. Prepend to every Phase 4 prompt below.
647
767
 
648
- 1. Description: "Project scaffolding" — subagent_type: `engineering-rapid-prototyper` — mode: "bypassPermissions" — prompt per branch file (web: Next.js/Vite/etc; iOS: Xcode project from bootstrap). Prepend CONTEXT header above. [COMPLEXITY: M]
768
+ 1. Description: "Project scaffolding" — subagent_type: `engineering-rapid-prototyper` — mode: "bypassPermissions" — prompt per branch file. [COMPLEXITY: M]
649
769
 
650
- 2. Description: "Design system setup" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt per branch file. Prepend CONTEXT header above. Implements design tokens from `visual-design-spec.md` or `ios-design-board.md`. The living style guide from Phase 3 is the reference implementation — components must match. [COMPLEXITY: M]
770
+ 2. Description: "Design system setup" — subagent_type: `engineering-frontend-developer` — mode: "bypassPermissions" — prompt per branch file. Implements design tokens from `DESIGN.md`. [COMPLEXITY: M]
651
771
 
652
- 3. Description: "Scaffold acceptance tests" — INTERNAL inline role-string — mode: "bypassPermissions" — prompt: "[CONTEXT header above] Scaffold acceptance tests from sprint-tasks.md. Use Page Object Model. Read `docs/plans/sprint-tasks.md`. For every task with a Behavioral Test field, create a Playwright test stub (web) or Maestro flow stub (iOS). Each stub: navigate → interact → assert. Stubs must FAIL right now (features aren't built yet) — that's correct. Commit: 'test: scaffold acceptance tests from sprint tasks'."
772
+ 3. Description: "Scaffold acceptance tests" — INTERNAL inline role-string — mode: "bypassPermissions" — prompt: "[CONTEXT header above] Scaffold acceptance tests from sprint-tasks.md. Use Page Object Model. For every task with a Behavioral Test field, create a Playwright test stub (web) or Maestro flow stub (iOS). Stubs must FAIL right now. Commit: 'test: scaffold acceptance tests from sprint tasks'."
653
773
 
654
- **Scaffold verification:** Run the Verify Protocol (INTERNAL inline "Verify scaffolding") 7 checks sequentially, stop on first FAIL. Do not proceed to Step 4.1 until PASS.
774
+ **Scaffold verification:** Run the Verify Protocol with `scope: static` (checks 1-3 and 6 only: Build, Type-Check, Lint, Diff Review). Test stubs are designed to fail at this point — do not run checks 4, 5, or 7 until after task implementation.
655
775
 
656
- ### Step 4.1+ — Task execution in dependency-ordered batches
776
+ ### Step 4.1 — Product Owner: Feature Planning
657
777
 
658
- Expand TodoWrite with each sprint task.
778
+ Dispatch the Product Owner in planning mode. It reads the full artifact set via graph queries, groups tasks by feature, sequences features into dependency-ordered waves, and writes a delegation plan.
659
779
 
660
- Build a DAG from `sprint-tasks.md` Dependencies fields. Execute in batches: the next batch is the set of all tasks whose dependencies are all complete. Dispatch each batch as parallel Agent tool calls in ONE message.
780
+ Call the Agent tool — description: "Product Owner: feature planning" — subagent_type: `product-owner` — prompt: "[CONTEXT header above] MODE: planning.
661
781
 
662
- **Per-task flow (runs for every task in every batch):**
782
+ Read these artifacts via graph queries:
783
+ - `docs/plans/product-spec.md` — feature list, cross-feature interactions, screen inventory
784
+ - `docs/plans/sprint-tasks.md` — task breakdown with dependencies
785
+ - `docs/plans/architecture.md` — cross-feature API contracts, shared data entities
786
+ - `docs/plans/page-specs/*.md` — screen assignments per feature
787
+ - `docs/plans/quality-targets.json` — NFRs
663
788
 
664
- #### Briefing Officer (INTERNAL inline)
789
+ Produce `docs/plans/feature-delegation-plan.json` per the schema in `agents/product-owner.md`. For each feature: list assigned tasks (from sprint-tasks.md), write a product_context summary (~100-200 tokens: persona constraints, key business rules, critical error scenarios, competitive differentiators), extract cross-feature contracts, list page-spec refs (web: `page-specs/*.md` paths; iOS: `DESIGN.md` section anchors). Sequence features into waves by dependency order."
665
790
 
666
- Dispatch before every implementer. Assembles a compact <40-line context map that tells the implementer EXACTLY where to look for each kind of context. Refs not pastes.
791
+ Output: `docs/plans/feature-delegation-plan.json`. Update `.build-state.json`: set `feature_delegation_plan_path`, initialize `current_wave: 1`, `completed_features: []`, `feature_acceptance: {}`.
667
792
 
668
- Call the Agent tool description: "Briefing for [task name]" — INTERNAL inline role-string — prompt: "You are the Briefing Officer. Read `docs/plans/refs.json` and the task row for [task-id] from `docs/plans/sprint-tasks.md`. Build a compact context map (~40 lines max) in this exact shape:
793
+ ### Step 4.2 — Wave Execution (repeat for each wave)
669
794
 
670
- ```
671
- CONTEXT MAP — [task-id] [task name]
672
- persona / JTBD → design-doc.md#persona
673
- product scope → design-doc.md#scope
674
- visual tokens → visual-design-spec.md#tokens
675
- component variants → component-manifest.md#[category]
676
- auth model → architecture.md#auth
677
- data schema → architecture.md#data-model
678
- sibling task deps → sprint-tasks.md#[dep-id-1], #[dep-id-2]
679
- prior decisions → decisions.jsonl rows [row-id-1], [row-id-2]
680
- quality targets → quality-targets.json (full file)
681
- ```
795
+ Read `feature-delegation-plan.json`. For each wave, execute all features. Features within a wave are independent and their Briefing Officers can be dispatched in parallel.
796
+
797
+ #### 4.2.a Briefing Officer dispatch (one per feature, parallel within wave)
798
+
799
+ For each feature in the current wave, dispatch a Briefing Officer. If the wave has multiple independent features, dispatch all BOs in ONE message (parallel).
800
+
801
+ Call the Agent tool — description: "Briefing Officer: [feature name]" — subagent_type: `briefing-officer` — mode: "bypassPermissions" — prompt: "[CONTEXT header above] FEATURE DELEGATION from Product Owner:
682
802
 
683
- CLAUDE.md is NOT in the map — it auto-loads into every subagent. Raw Phase 1 research is NOT in the map — it is SPENT. The implementer reads refs on-demand using the Read tool; no full pastes."
803
+ Feature: [feature name]
804
+ Product context: [paste product_context from delegation plan]
805
+ Cross-feature contracts: [paste contracts from delegation plan]
806
+ Assigned tasks: [paste task IDs]
807
+ Page spec refs: [paste page_spec_refs from delegation plan]
684
808
 
685
- The Briefing Officer's output is the handoff payload for the implementer — not for the orchestrator to re-paste.
809
+ Read the full feature spec via graph query. Read task rows from `docs/plans/sprint-tasks.md`. Read page specs, architecture, component manifest, visual design spec for this feature's screens.
686
810
 
687
- #### Implementer dispatch (subagent_type by task type)
811
+ Write `docs/plans/feature-briefs/[feature].md` per the schema in `agents/briefing-officer.md`. For each task: specify agent type, skills, structured context payload (layout, components, API contract, error states, business rules, persona constraints), and acceptance criteria."
688
812
 
689
- Dispatch by task type and complexity:
690
- - UI tasks: `subagent_type: engineering-frontend-developer` (S/M) or `subagent_type: engineering-senior-developer` (L)
691
- - Backend tasks: `subagent_type: engineering-backend-architect` (L) or `subagent_type: engineering-senior-developer` (M)
692
- - Hard / complex / cross-cutting tasks: `subagent_type: engineering-senior-developer`
813
+ Output: `docs/plans/feature-briefs/[feature].md`. Update `.build-state.json.feature_briefs[feature]` with the path.
693
814
 
694
- Call the Agent tool — description: "[task-id] [task name]" — subagent_type per above — mode: "bypassPermissions" — prompt: "[CONTEXT header above] [COMPLEXITY: S/M/L from sprint-tasks.md]. TASK: [task description + acceptance criteria from sprint-tasks.md]. Sprint context is prepended; focus on this task.
815
+ #### 4.2.b Task execution (orchestrator reads BO brief, dispatches per task)
816
+
817
+ After the Briefing Officer writes the feature brief, the orchestrator reads it and executes each task. Tasks within a feature are executed in DAG-parallel batches (topological ordering from the Dependencies field — independent siblings run in parallel, yielding ~30-50% wall-clock saving). The per-task pipeline is unchanged in structure — only the input to the execution agent changes.
818
+
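The DAG-parallel batching described above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual implementation: the function name `dependency_batches` and the `{task_id: [dependency_ids]}` input shape are assumptions made for the example, standing in for whatever the Dependencies field in `sprint-tasks.md` parses to.

```python
def dependency_batches(deps):
    """Group tasks into batches where each batch depends only on earlier batches.

    `deps` maps a task id to the list of task ids it depends on
    (hypothetical shape; the real Dependencies field may differ).
    """
    remaining = {task: set(d) for task, d in deps.items()}

    # Reject references to tasks that are not in the plan at all.
    unknown = {d for ds in remaining.values() for d in ds} - remaining.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    batches = []
    while remaining:
        # Every task with no unfinished dependency can run in parallel.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        batches.append(ready)
        for task in ready:
            del remaining[task]
        for d in remaining.values():
            d.difference_update(ready)
    return batches


if __name__ == "__main__":
    plan = {"T1": [], "T2": ["T1"], "T3": ["T1"], "T4": ["T2", "T3"]}
    for number, batch in enumerate(dependency_batches(plan), start=1):
        print(number, batch)
```

Each returned batch would correspond to one message containing parallel Agent tool calls; independent siblings such as `T2` and `T3` land in the same batch.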
819
+ **For each task in the feature brief:**
820
+
821
+ **1. Implementer dispatch** — The orchestrator reads the task's execution spec from the feature brief and pastes the structured context directly into the execution agent's prompt. See mode-specific branch file (`protocols/web-phase-branches.md` §Phase 4 or `protocols/ios-phase-branches.md` §Phase 4) for the exact prompt template.
822
+
823
+ Call the Agent tool — description: "[task-id] [task name]" — subagent_type: [from BO brief] — mode: "bypassPermissions" — prompt: "[CONTEXT header above] [COMPLEXITY: S/M/L from sprint-tasks.md].
824
+
825
+ [Paste the full structured context payload from the feature brief — TASK, FEATURE CONTEXT, PAGE LAYOUT, COMPONENTS, API CONTRACT, ERROR STATES, BUSINESS RULES, SKILLS ASSIGNED, ACCEPTANCE. See branch file for exact format.]
695
826
 
696
827
  ## Prior Learnings
697
828
  [paste contents of `docs/plans/.active-learnings.md` if it exists, otherwise omit this section]
@@ -699,56 +830,72 @@ Call the Agent tool — description: "[task-id] [task name]" — subagent_type p
699
830
  ## Deviation Reporting
700
831
  If your implementation deviates from the planned architecture, return a `deviation_row` object per the schema in `protocols/decision-log.md`. If no deviation, return `deviation_row: null`. Do NOT write `decisions.jsonl` directly.
701
832
 
702
- Implement fully with real code and tests. Commit: 'feat: [task]'. Report what you built, files changed, and test results.
833
+ Implement fully with real code and tests. Commit: 'feat: [task]'. Report what you built, files changed, and test results."
834
+
835
+ **2. Per-task security review (auth/PII tasks only)** — unchanged from prior design.
703
836
 
704
- ## On Re-entry (from LRR backward routing)
705
- **[ORCHESTRATOR: Include the "On Re-entry" section below ONLY when this is a re-entry dispatch from LRR backward routing. For normal Phase 4 execution, OMIT it.]**
837
+ Call the Agent tool — description: "Security review for [task-id]" — subagent_type: `security-reviewer` — prompt: "[CONTEXT header above] Review changed files from [task-id] for security issues. Scope: auth logic, input validation, secrets handling, dependency hygiene, OWASP Top 10 for web (or iOS Keychain / ATS / data protection for iOS). Return blocking findings only — 80%+ confidence threshold. Files to review: [list from implementer's changeset]."
706
838
 
707
- If this dispatch is a re-entry (the orchestrator passes `blocking_finding`, `prior_output`, and `decision_row` in the prompt), DO NOT treat this as a fresh task. Read `prior_output` (the path to your previous task artifact under `.task-outputs/[task-id].json` + changed files) and `decision_row` (the original deviation rationale from decisions.jsonl). Revise ONLY what `blocking_finding` requires — do not redo unaffected code, do not re-run acceptance tests that already passed, do not touch files outside the blast radius of the finding. Return a fresh `deviation_row` in your result payload documenting the revision rationale (author=this task-id, type and summary describing the revision)."
839
+ **3. Senior Dev cleanup** — unchanged. Two-pass, changeset-scoped.
708
840
 
709
- #### Per-task security review (auth/PII tasks only)
841
+ 1. Call the Agent tool — description: "Simplify [task-id]" — subagent_type: `code-simplifier` — mode: "bypassPermissions" — prompt: "[CONTEXT header above] Simplify changed files from [task-id]. Remove dead code, unused imports, redundant abstractions. Do NOT add features. Do NOT change architecture. Do NOT touch files outside the changeset. Files: [list]."
710
842
 
711
- FOR tasks touching auth, PII, secrets, or payment flows — add a per-task security review BEFORE Senior Dev cleanup:
843
+ 2. If TS/JS task: Call the Agent tool — description: "Refactor [task-id]" — subagent_type: `refactor-cleaner` — mode: "bypassPermissions" — prompt: "[CONTEXT header above] Run knip/depcheck/ts-prune on changed files from [task-id]. Changeset only. Files: [list]."
712
844
 
713
- Call the Agent tool — description: "Security review for [task-id]" — subagent_type: `security-reviewer` — prompt: "[CONTEXT header above] Review changed files from [task-id] for security issues. Scope: auth logic, input validation, secrets handling, dependency hygiene, OWASP Top 10 for web (or iOS Keychain / ATS / data protection for iOS). Return blocking findings only — 80%+ confidence threshold. Files to review: [list from implementer's changeset]."
845
+ **4. Per-task code review (parallel pair)** — unchanged.
714
846
 
715
- #### Senior Dev cleanup (simplifier + refactor-cleaner if TS)
847
+ Call the Agent tool 2 times in one message:
716
848
 
717
- Two-pass cleanup. Scope is sacred: ONLY files from the implementation changeset. Zero exceptions.
849
+ 1. Description: "Code review for [task-id]" — subagent_type: `code-reviewer` — Prompt: "[CONTEXT header above] Review changed files from [task-id]. 80%+ confidence threshold. Changeset only. Files: [list]."
718
850
 
719
- 1. Call the Agent tool — description: "Simplify [task-id]" — subagent_type: `code-simplifier` — mode: "bypassPermissions" — prompt: "[CONTEXT header above] Simplify changed files from [task-id]. Remove dead code, unused imports, redundant abstractions. Do NOT add features. Do NOT change architecture. Do NOT touch files outside the changeset. If simplification breaks acceptance criteria, revert and skip. Files: [list]."
851
+ 2. Description: "Silent failure hunt for [task-id]" — subagent_type: `silent-failure-hunter` — Prompt: "[CONTEXT header above] Hunt silent failures in changed files from [task-id]. Files: [list]."
720
852
 
721
- 2. If TS/JS task: Call the Agent tool — description: "Refactor [task-id]" — subagent_type: `refactor-cleaner` — mode: "bypassPermissions" — prompt: "[CONTEXT header above] Run knip/depcheck/ts-prune on changed files from [task-id]. Remove orphaned exports, unused deps, dead files. Same scope rules as simplifier — changeset only. Files: [list]."
853
+ **5. Metric Loop** — unchanged. Authoritative behavioral check per `protocols/metric-loop.md`. Max 5 iterations.
722
854
 
723
- Skip cleanup if trivial (< 20 lines, single file).
855
+ **6. Verify Service** — unchanged. Static checks only (type-check, lint, build). Max 2 fix attempts.
724
856
 
725
- #### Per-task code review (parallel pair)
857
+ **7. After each task completes** — Update TodoWrite and `.build-state.json`. Write summary to `docs/plans/.task-outputs/[task-id].json`.
726
858
 
727
- Call the Agent tool 2 times in one message after Senior Dev cleanup:
859
+ **8. Orchestrator-scribe** — After all tasks in a feature complete, collect deviation_rows and forward through `scribe_decision` MCP. Same mechanics as before.
728
860
 
729
- 1. Description: "Code review for [task-id]" — subagent_type: `code-reviewer` — Prompt: "[CONTEXT header above] Review changed files from [task-id]. Report findings with 80%+ confidence threshold only — skip low-confidence nitpicks. Scope: changeset only. Acceptance criteria: [paste from task]. Files: [list]."
861
+ ### Step 4.3 — Product Owner: Feature Acceptance
730
862
 
731
- 2. Description: "Silent failure hunt for [task-id]" — subagent_type: `silent-failure-hunter` — Prompt: "[CONTEXT header above] Hunt silent failures in changed files from [task-id]. Targets: empty catch blocks, try/catch returning null, swallowed errors, unhandled promise rejections, assertions disabled in production. Files: [list]. Report blocking findings only."
863
+ After all tasks for a feature complete, dispatch the Product Owner in acceptance mode. It checks whether the built feature matches the product spec.
732
864
 
733
- #### Metric Loop (generator/critic) — authoritative behavioral check
865
+ Call the Agent tool — description: "Product Owner: accept [feature name]" — subagent_type: `product-owner` — prompt: "[CONTEXT header above] MODE: acceptance. FEATURE: [feature name].
734
866
 
735
- Run the Metric Loop Protocol (callable service) on the task implementation. Define a metric based on the task's acceptance criteria. For UI-facing tasks, include behavioral verification per the mode-specific branch file (web: agent-browser; iOS: SwiftUI Preview captures). Max 5 iterations.
867
+ Read the feature's acceptance criteria and business rules via graph query. Read the feature's page spec(s) from `docs/plans/page-specs/`. Use agent-browser (web) or XcodeBuildMCP + Maestro (iOS) to verify the built feature.
736
868
 
737
- The metric loop's final measurement IS the authoritative behavioral verification for this task — no separate smoke-test dispatch. The critic's final score + pass/fail is what downstream steps consume.
869
+ Check: (1) Does the feature implement the product spec's happy path? (2) Are business rules correct? (3) Are error states from the product spec handled? (4) Does the layout match the page spec? (5) Does component usage match the manifest?
738
870
 
739
- Generator: same implementer agent re-invoked. Critic: measurement agent dispatched fresh. Never share context.
871
+ Write verdict: ACCEPTED or NEEDS_REVISION with specific findings citing product-spec sections."
740
872
 
741
- On target met: mark task complete. On stall: accept if score >= 60% of target (autonomous) or present to user (interactive).
873
+ **Verdict routing:**
874
+ - `ACCEPTED` → mark feature complete in `.build-state.json.feature_acceptance`. Proceed.
875
+ - `NEEDS_REVISION` → orchestrator re-dispatches the Briefing Officer for this feature with the findings. BO writes an updated brief targeting only the failing tasks. Orchestrator re-executes those tasks. Max 2 revision cycles per feature. After 2nd NEEDS_REVISION: interactive → present findings to user. Autonomous → accept with gap note in build-log.md AND append a structured gap entry to `.build-state.json.feature_acceptance[feature].gaps[]` with shape `{finding, severity, accepted_at_cycle}`. The Phase 6 LRR Eng-Quality chapter reads these gaps as input evidence.
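The verdict routing above can be sketched as a small decision function. This is an illustrative reading only: the function name `route_verdict`, the action labels, and the `needs_revision_count` argument are hypothetical, and the sketch assumes "after 2nd NEEDS_REVISION" means the second such verdict for the same feature triggers escalation. Only the gap entry shape `{finding, severity, accepted_at_cycle}` comes from the protocol text.

```python
MAX_NEEDS_REVISION = 2  # assumption: the 2nd NEEDS_REVISION verdict escalates


def route_verdict(verdict, needs_revision_count, mode, finding=None, severity=None):
    """Return the orchestrator's next action for one Product Owner verdict.

    needs_revision_count counts NEEDS_REVISION verdicts seen for this
    feature so far, including the current one (hypothetical bookkeeping).
    """
    if verdict == "ACCEPTED":
        return {"action": "mark_complete"}
    if verdict != "NEEDS_REVISION":
        raise ValueError(f"unexpected verdict: {verdict!r}")

    if needs_revision_count < MAX_NEEDS_REVISION:
        # Briefing Officer rewrites the brief for the failing tasks only.
        return {"action": "redispatch_briefing_officer"}

    if mode == "interactive":
        return {"action": "present_to_user"}

    # Autonomous mode: accept, but record a structured gap entry.
    return {
        "action": "accept_with_gap",
        "gap": {
            "finding": finding,
            "severity": severity,
            "accepted_at_cycle": needs_revision_count,
        },
    }
```

In this sketch the caller would append the returned `gap` to `.build-state.json.feature_acceptance[feature].gaps[]` and write the gap note to `build-log.md`.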
742
876
 
743
- #### Verify Service (static checks only)
877
+ ### Step 4.4 — Wave Transition
744
878
 
745
- Run the Verify Protocol (INTERNAL inline — "Verify scaffolding") after the metric loop exits. Verify now covers STATIC checks only: type-check, lint, build. Behavioral verification has already happened in the metric loop above — verify consumes the metric loop's final pass/fail + score from `.build-state.json.metric_loop_scores[]` rather than re-running behavioral checks. If any static check FAILS, dispatch a fix agent with the error, re-verify. Max 3 fix attempts.
879
+ After all features in the current wave are ACCEPTED:
746
880
 
747
- #### After each task completes
881
+ 1. Update `.build-state.json`: add features to `completed_features`, increment `current_wave`.
882
+ 2. Handle shared file mutations: if any BO flagged shared file changes needed by the next wave, apply them now. The orchestrator identifies shared files from BO cross-feature contract fields. For each shared file flagged by multiple features in the next wave, dispatch a single `code-architect` agent to reconcile the mutations before wave execution begins. Do NOT let multiple BOs independently modify the same shared file.
883
+ 3. Run a quick Verify Protocol (static checks) to confirm the wave didn't break anything.
884
+ 4. Proceed to next wave. Repeat Steps 4.2-4.4.
748
885
 
749
- Update TodoWrite and `.build-state.json`. Write a compact summary to `docs/plans/.task-outputs/[task-id].json` with {files-changed, tests-passing, verify-status}.
886
+ After all waves complete, Phase 4 is done.
750
887
 
751
- **Writes:** source code, `docs/plans/.task-outputs/`. Deviation rows flow through the orchestrator's `scribe_decision` MCP calls below — implementers do NOT touch `decisions.jsonl`.
888
+ #### Step 4.4.idx — Decisions re-index (end of wave)
889
+
890
+ After each wave's deviation rows have been routed through `scribe_decision` (per the Orchestrator-scribe dispatch below), re-index `decisions.jsonl` so the Slice 4 fragment reflects every wave-level decision before the next wave's BOs query open decisions. Skip silently if `docs/plans/decisions.jsonl` does not exist.
891
+
892
+ Run via the Bash tool:
893
+
894
+ - Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/decisions.jsonl`
895
+ - On exit 0: log success to `docs/plans/build-log.md` and continue.
896
+ - On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. The next wave's BOs require current decision data.
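The exit-code handling for the re-index step can be sketched as below. The protocol runs the command through the Bash tool; this Python wrapper is only an illustration of the skip / log / stop logic. The function name `reindex_decisions`, the injectable `run` parameter, and the log line wording are assumptions, while the command, the file paths, and the `CLAUDE_PLUGIN_ROOT` variable come from the step text.

```python
import os
import subprocess
from pathlib import Path


def reindex_decisions(plans_dir="docs/plans", run=subprocess.run):
    """Re-index decisions.jsonl; skip if absent, raise on a non-zero exit."""
    plans = Path(plans_dir)
    decisions = plans / "decisions.jsonl"
    build_log = plans / "build-log.md"

    # Skip silently when no decisions have been recorded yet.
    if not decisions.exists():
        return "skipped"

    script = os.path.join(os.environ.get("CLAUDE_PLUGIN_ROOT", ""), "bin", "graph-index.js")
    result = run(["node", script, str(decisions)], capture_output=True, text=True)

    # Log wording is hypothetical; the protocol only requires that it is logged.
    with build_log.open("a", encoding="utf-8") as fh:
        if result.returncode == 0:
            fh.write("- decisions re-index: ok\n")
        else:
            fh.write(f"- decisions re-index FAILED (exit {result.returncode}): {result.stderr.strip()}\n")

    if result.returncode != 0:
        # STOP: the next wave's Briefing Officers need current decision data.
        raise RuntimeError(f"graph-index.js exited with {result.returncode}")
    return "indexed"
```

Passing `run` as a parameter keeps the sketch testable without Node installed; in real use the default `subprocess.run` would invoke the indexer.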
897
+
898
+ **Writes:** source code, `docs/plans/.task-outputs/`, `docs/plans/feature-delegation-plan.json`, `docs/plans/feature-briefs/*.md`. Deviation rows flow through the orchestrator's `scribe_decision` MCP calls.
752
899
 
753
900
  <HARD-GATE>
754
901
  DECISIONS.JSONL — ORCHESTRATOR-SCRIBE ONLY via `scribe_decision` MCP. Only the orchestrator may cause appends to `docs/plans/decisions.jsonl`, and it does so exclusively by invoking the `scribe_decision` MCP tool. Any dispatch prompt asking a subagent to write this file is a bug. The orchestrator itself MUST NOT Write or Edit the file directly. Subagents return `deviation_row` objects in their structured result; the orchestrator forwards them through the MCP, which owns ID allocation and atomic append.
@@ -756,42 +903,42 @@ DECISIONS.JSONL — ORCHESTRATOR-SCRIBE ONLY via `scribe_decision` MCP. Only the
756
903
 
757
904
  #### Orchestrator-scribe dispatch (route deviation rows through `scribe_decision` MCP)
758
905
 
759
- Runs after every Phase 4 parallel batch returns (and anywhere else a subagent returns a `deviation_row`, including Phase 1 synthesis and Phase 2 architecture synthesis). The scribe MCP is the single writer for `docs/plans/decisions.jsonl`; the orchestrator is the single caller of the MCP.
906
+ Runs after each feature's tasks complete. Same mechanics as before:
760
907
 
761
- 1. Walk `batch_results`. Collect every non-null `deviation_row` from each subagent return.
762
- 2. For each row, invoke the `scribe_decision` MCP tool with the row's fields (`phase`, `category`/`type`, `summary`, `decided_by`/`author`, `impact_level`, `rationale`, `related_files`) per the MCP's documented schema. One MCP call per row.
763
- 3. The MCP allocates the `decision_id` (`D-{N}-<seq>`), stamps `timestamp` (ISO-8601) and `status: "open"`, validates against `decisions.schema.json`, and atomically appends the line. The orchestrator MUST NOT Write or Edit `docs/plans/decisions.jsonl` directly, MUST NOT pre-compute decision IDs, and MUST NOT read or allocate `.build-state.json.decisions_next_id.P{N}` — ID allocation is the MCP's responsibility.
764
- 4. Regenerate `.build-state.md` after the batch completes so the rendered view reflects the newly appended rows.
908
+ 1. Collect non-null `deviation_row` from each subagent return.
909
+ 2. For each row, invoke `scribe_decision` MCP. One call per row.
910
+ 3. MCP allocates `decision_id`, stamps timestamp, validates, atomically appends.
911
+ 4. Regenerate `.build-state.md`.
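The collect-and-forward steps above can be sketched as follows. `scribe_decision` is an MCP tool in the protocol; here it is modelled as a plain callable purely for illustration, and the result shape (a dict with a `deviation_row` key) and the function name `forward_deviation_rows` are assumptions.

```python
def forward_deviation_rows(subagent_results, scribe_decision):
    """Forward each non-null deviation_row to the scribe, one call per row.

    The orchestrator never writes decisions.jsonl itself; ID allocation,
    timestamping, validation, and the append all belong to the scribe.
    """
    rows = [
        result["deviation_row"]
        for result in subagent_results
        if result.get("deviation_row") is not None
    ]
    for row in rows:
        scribe_decision(row)  # stand-in for the MCP tool invocation
    return len(rows)


if __name__ == "__main__":
    results = [
        {"task": "T1", "deviation_row": None},
        {"task": "T2", "deviation_row": {"summary": "swapped REST for tRPC"}},
    ]
    forwarded = forward_deviation_rows(results, scribe_decision=print)
    print(f"forwarded {forwarded} row(s)")
```

After the rows are forwarded, the orchestrator would regenerate `.build-state.md` as step 4 describes.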
765
912
 
766
- **On resume:** the scribe MCP reconstructs its ID allocator internally on first invocation by scanning `docs/plans/decisions.jsonl` (for each phase `N`, `max(seq)+1` across rows whose `decision_id` matches `D-{N}-<seq>`). The orchestrator no longer maintains `decisions_next_id` in `.build-state.json`; the field is effectively deprecated under Stage 2 (scribe owns ID allocation end-to-end) and is scheduled for formal removal in Stage 4 schema bump A7 (see Task 4.5.1 in `docs/migration/sdk-hybrid/TASK-BREAKDOWN.md`). TODO(stage-4-A7): drop `decisions_next_id` from the state schema.
913
+ **On resume:** scribe MCP reconstructs its ID allocator by scanning `decisions.jsonl`. The `decisions_next_id` field in `.build-state.json` is deprecated (scribe owns ID allocation).
767
914
 
768
915
  <HARD-GATE>
769
- LRR NEEDS_WORK backward edge: `LRR NEEDS_WORK (code-level) → back to Phase 4 target task`. The Aggregator classifies the finding and routes to the specific task via `related_decision_id` lookup; Phase 4 re-opens that task with the finding as input.
916
+ LRR NEEDS_WORK backward edge: `LRR NEEDS_WORK (code-level) → back to Phase 4 target feature`. The Aggregator classifies the finding and routes to the specific feature's Briefing Officer via `related_decision_id` lookup. The BO re-plans the affected task(s), orchestrator re-executes. Product-level issues route to the Product Owner, who re-delegates to the relevant BO.
770
917
  </HARD-GATE>
771
918
 
772
- **Compaction checkpoint.** Update `.build-state.json` per the format above.
919
+ **Compaction checkpoint.** Update `.build-state.json` per the format above. Feature-level state (`completed_features`, `current_wave`, `feature_acceptance`, `feature_briefs`) survives compaction — all planning artifacts are on disk.
773
920
 
774
921
  ---
775
922
 
776
- ## Phase 5: Audit — TEAM of 6 + eval harness + 3 parallel + feedback synth
923
+ ## Phase 5: Audit — Track A (engineering reality) + Track B (product reality) + cross-cutting
777
924
 
778
925
  <HARD-GATE>
779
926
  Before starting Phase 5: run the Verify Protocol (7 checks) one more time. All checks must pass before expensive audit agents fire.
780
927
  </HARD-GATE>
781
928
 
782
- **Goal**: Surface quality issues before Launch Review. Split from old Phase 6 — old 6.1-6.3 (5-agent audit, eval harness, E2E + dogfood + fake-data) live here. Old 6.4-6.5 (Reality Check + LRR) move to Phase 6.
929
+ **Goal**: Surface quality issues before Launch Review. Phase 5 runs in three layers: Track A audits the engineering envelope (API / perf / a11y / security / brand drift), Track B audits the built product against `product-spec.md` per-feature (states, transitions, business rules, happy path, persona constraints, wiring, manifest coverage), and Cross-cutting checks (E2E user journeys, autonomous dogfood, fake-data detector) catch what neither track anticipates. Findings from all three layers route through one Feedback Synthesizer (Step 5.4) and one Fix loop (Step 5.5).
783
930
 
784
931
  **Mode-specific branch:**
785
- - `project_type=ios`: follow `protocols/ios-phase-branches.md` §Phase 5 (iOS twin commands: `/buildanything:verify` → `/buildanything:ux-review` → `/buildanything:fix` in sequence; Maestro smoke tests). Skip the web TEAM below and jump to Step 5.4 Feedback Synthesizer with iOS evidence.
932
+ - `project_type=ios`: follow `protocols/ios-phase-branches.md` §Phase 5 for iOS-adapted Track A/B + cross-cutting (XcodeBuildMCP + Maestro execution surface). Steps 5.1–5.3 are defined in the iOS protocol; Steps 5.4–5.5 below are shared.
786
933
  - `project_type=web`: continue below.
787
934
 
788
- ### Step 5.1 — TEAM of 6 parallel auditors (ONE message)
935
+ ### Step 5.1 — Track A: Engineering Reality (5 parallel auditors, ONE message)
789
936
 
790
937
  Read the NFRs from `docs/plans/quality-targets.json`. Pass the relevant targets to each audit agent so they have concrete thresholds, not generic checks.
791
938
 
792
- **CONTEXT header:** Render `rendered_context_header` for phase 5 per the canonical template (see CONTEXT HEADER HARD-GATE above). Prepend to every Phase 5 prompt below.
939
+ **CONTEXT header:** Render `rendered_context_header` for phase 5 per the canonical template (see CONTEXT HEADER HARD-GATE above). Prepend to every Step 5.X dispatch prompt below.
793
940
 
794
- Call the Agent tool 6 times in one message:
941
+ Call the Agent tool 5 times in one message:
795
942
 
796
943
  1. Description: "API testing" — subagent_type: `testing-api-tester` — Prompt: "[CONTEXT header above] Comprehensive API validation: all endpoints, edge cases, error responses, auth flows. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for performance and reliability thresholds. Report findings with severity counts."
797
944
 
@@ -801,42 +948,80 @@ Call the Agent tool 6 times in one message:
801
948
 
802
949
  4. Description: "Security audit" — subagent_type: `engineering-security-engineer` — Prompt: "[CONTEXT header above] Security review at app level: auth, input validation, data exposure, dependency vulnerabilities. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for security thresholds. Report findings with severity."
803
950
 
804
- 5. Description: "UX quality audit" — subagent_type: `design-ux-researcher` — Prompt: "[CONTEXT header above] UX quality review of every user-facing page. First, screenshot the living style guide at /design-system (web) as your reference. Then review every product page: loading states (every async action shows a loading indicator), error states (every form and API call shows user-friendly feedback), empty states (lists/tables handle zero items), mobile responsiveness (test at 375px: touch targets >= 44px, no horizontal scroll), form validation (inline feedback, not alert()), transition smoothness, visual consistency vs style guide (buttons, inputs, cards, colors, spacing should match). Report issues with page, severity, and screenshot."
951
+ 5. Description: "Brand Guardian drift check" — subagent_type: `design-brand-guardian` — Prompt: "[CONTEXT header above] You are the Phase 5 drift check. Read DESIGN.md (the DNA card locked at Phase 3.0) + the actually-built pages via Playwright screenshots under docs/plans/evidence/. Score whether Phase 4 implementers stayed true to the DNA or drifted away from it. Specifically check: does the built Character axis match the DNA? Does Density match? Is Material consistent? Is Motion aligned? Report drift count and specific elements. Save findings to docs/plans/evidence/brand-drift.md. Note: this is a drift check only; the Phase 6 LRR Brand Guardian chapter does the verdict. You do NOT issue a pass/fail here, only surface findings."
952
+
953
+ ### Step 5.2 — Track B: Product Reality (parallel per-feature, ONE message)
954
+
955
+ Track B audits the built app against `product-spec.md` on a per-feature basis. Each feature gets its own auditor; all auditors run in parallel.
956
+
957
+ **CONTEXT header:** Render `rendered_context_header` for phase 5 per the canonical template (see CONTEXT HEADER HARD-GATE above). Prepend to every Step 5.X dispatch prompt below.
805
958
 
806
- 6. Description: "Brand Guardian drift check" — subagent_type: `design-brand-guardian` — Prompt: "[CONTEXT header above] You are the Phase 5 drift check. Read docs/plans/visual-dna.md (the DNA card locked at Phase 3.0) + the actually-built pages via Playwright screenshots under docs/plans/evidence/. Score whether Phase 4 implementers stayed true to the DNA or drifted away from it. Specifically check: does the built Character axis match the DNA? Does Density match? Is Material consistent? Is Motion aligned? Report drift count and specific elements. Save findings to docs/plans/evidence/brand-drift.md. Note: this is a drift check only — the Phase 6 LRR Brand Guardian chapter does the verdict. You do NOT issue a pass/fail here, only surface findings."
959
+ **Feature enumeration:** Before dispatch, query the graph for the full feature inventory:
807
960
 
808
- ### Step 5.2 — Sequence: Eval Harness → Metric Loop
961
+ - Call `mcp__plugin_buildanything_graph__graph_list_features` (no arguments) to get the full feature inventory. Returns an array of `{id, label, kebab_anchor}` for every feature in the indexed product-spec. The orchestrator OWNS this enumeration — auditors never enumerate themselves.
962
+ - If the call fails, STOP. Log `TRACK B BLOCKED: graph_list_features failed` to `docs/plans/build-log.md` and report the error. The graph must be indexed correctly before Track B can run.
809
963
 
810
- Run the Eval Harness Protocol (`protocols/eval-harness.md`). Define 8-15 concrete, executable eval cases from the audit findings and architecture doc. Run the eval agent. Record baseline pass rate. CRITICAL and HIGH failures feed into the Metric Loop as specific issues to fix.
964
+ **Zero-feature gate:** If feature enumeration returns zero features, STOP. This indicates either the graph indexer is broken or `product-spec.md` has no recognizable `## Feature:` sections; neither is a Phase 5 problem. Log `TRACK B BLOCKED: zero features enumerated` to `docs/plans/build-log.md` and route the build back to Step 1.6 (product-spec-writer) via the standard backward-routing template. Do NOT proceed with Cross-cutting (Step 5.3) on a feature-less Track B run.
811
965
 
812
- Run the Metric Loop Protocol (callable service) on the full codebase using audit findings as initial input. Define a composite metric based on what this project needs. Max 4 iterations. When fixing, dispatch to the RIGHT specialist: security → `security-reviewer`, a11y → `engineering-frontend-developer`, perf → `testing-performance-benchmarker`. Don't send everything to one agent.
966
+ **Dispatch:** Call the Agent tool N times in ONE message, one per `feature_id`:
813
967
 
814
- Re-run the Eval Harness after the metric loop exits. All CRITICAL eval cases must now pass. If any CRITICAL case still fails, include it as evidence for Phase 6.
968
+ - Description: "Product Reality Audit: {feature_label}"
969
+ - subagent_type: product-reality-auditor
970
+ - Prompt: "[CONTEXT header above] Audit feature_id: {feature_id}. Follow your Cognitive Protocol (ABSORB → QUERY → SYNTHESIZE → EXECUTE → CLASSIFY → SCORE → WRITE). Write evidence to docs/plans/evidence/product-reality/{feature_id}/. Report manifest of evidence paths back."
815
971
 
816
- ### Step 5.3 — TEAM of 3 parallel (ONE message)
972
+ **Post-dispatch verification:** After all Track B auditors return, verify each feature has the four evidence files (`tests-generated.md`, `results.json`, `findings.json`, `coverage.json`) AND that each JSON file parses as valid JSON. If any feature is missing a file or has a malformed JSON file, log `TRACK B EVIDENCE MISSING/MALFORMED: {feature_id}: {path}` to `docs/plans/build-log.md` and re-dispatch that feature's auditor once (this distinguishes the retry from the first attempt). If the second attempt still fails, emit a synthetic finding with `target_phase: 1, target_task_or_step: "1.6"` (the auditor failing twice on the same feature is a strong signal the spec for that feature is malformed) and let it route through the existing spec-gap path at Step 5.4.
973
+
974
+ **Note on the metric loop:** The Metric Loop callable service is no longer wired as a primary Phase 5 step. It can still be invoked ad-hoc by Track A audit fixes via Step 5.5 if a single check class needs iterative tightening, but the structured per-feature audit replaces the orchestrator-improvised eval cases that the previous Step 5.2 (Eval Harness → Metric Loop) drove.
975
+
976
+ ### Step 5.3 — Cross-cutting (3 parallel, ONE message)
817
977
 
818
978
  Call the Agent tool 3 times in one message:
819
979
 
820
- 1. Description: "E2E runner" — INTERNAL inline role-string — mode: "bypassPermissions" — Prompt: "Run Playwright E2E test generation, execution, and stability check per `protocols/web-phase-branches.md` Phase 5 E2E steps (generate and run E2E tests for User Journeys, 3 mandatory iterations for flakiness detection). Report results + artifact paths. Records results to `docs/plans/evidence/e2e/iter-3-results.json`."
980
+ 1. Description: "E2E runner" — INTERNAL inline role-string — mode: "bypassPermissions" — Prompt: "Run Playwright E2E test generation, execution, and stability check per `protocols/web-phase-branches.md` Phase 5 E2E steps (generate and run E2E tests for User Journeys, 3 mandatory iterations for flakiness detection). Report results + artifact paths. Records results to `docs/plans/evidence/e2e/iter-3-results.json`. Scope: multi-feature User Journeys ONLY (login → browse → buy, signup → onboarding → first-action). Single-feature happy paths are covered by Track B per-feature auditors at Step 5.2 — do NOT duplicate. Additionally, read the `## Cross-Feature Interactions` section from `docs/plans/product-spec.md`. For each cross-feature rule (e.g., 'Auth → Checkout: user must be authenticated'), generate a targeted E2E test that verifies the rule holds. These are NOT user journeys — they are specific behavioral contracts between features."
821
981
 
822
- 2. Description: "Dogfood the app" — INTERNAL inline role-string + agent-browser skill — mode: "bypassPermissions" — Prompt: "You are the Dogfood runner. Run the agent-browser dogfood skill against the running app at http://localhost:[port]. Explore every reachable page. Click every button. Fill every form. Check console for errors. Report a structured list of issues with severity ratings, screenshots, repro steps. Write findings to `docs/plans/evidence/dogfood/findings.md`. Do NOT classify or route findings — that's the Feedback Synthesizer's job at Step 5.4."
982
+ 2. Description: "Dogfood the app" — subagent_type: `testing-evidence-collector`
823
983
 
824
- 3. Description: "Fake-data detector" — INTERNAL inline role-string — mode: "bypassPermissions" — Prompt: "Run the Fake Data Detector Protocol (`protocols/fake-data-detector.md`). Static analysis: grep for Math.random() in business data paths, hardcoded API responses, setTimeout faking async, placeholder text. Dynamic analysis: inspect HAR files from `docs/plans/evidence/` for missing real API calls, static responses, absent WebSocket traffic. Write findings to `docs/plans/evidence/fake-data-audit.md` with file:line refs and severity."
984
+ 3. Description: "Fake-data detector" — subagent_type: `silent-failure-hunter` — mode: "bypassPermissions" — Prompt: "Run the Fake Data Detector Protocol (`protocols/fake-data-detector.md`). Static analysis: grep for Math.random() in business data paths, hardcoded API responses, setTimeout faking async, placeholder text. Dynamic analysis: inspect HAR files from `docs/plans/evidence/` for missing real API calls, static responses, absent WebSocket traffic. Write findings to `docs/plans/evidence/fake-data-audit.md` with file:line refs and severity."
825
985
 
826
986
  ### Step 5.4 — Feedback Synthesizer
827
987
 
828
988
  The Dogfood findings used to dead-end. Now route them to fix loops.
829
989
 
830
- Call the Agent tool — description: "Synthesize dogfood findings" — subagent_type: `product-feedback-synthesizer` — Prompt: "[CONTEXT header above] Interpret Dogfood output. Input: `docs/plans/evidence/dogfood/findings.md`. For each finding, classify it and assign a target phase for the fix:
990
+ **Pre-dispatch: finding count check.**
991
+ Before dispatching the synthesizer, count total findings across all 5 input streams:
992
+ - Count lines in each `evidence/product-reality/*/findings.json`
993
+ - Count findings in `evidence/dogfood/findings.md` (count `### Finding` headings or JSON array length)
994
+ - Count entries in `evidence/track-a/*.json`
995
+ - Count failures in `evidence/e2e/iter-3-results.json`
996
+ - Count findings in `evidence/fake-data-audit.md`
997
+
998
+ If total findings ≤ 40: dispatch the synthesizer as a single pass (existing behavior below).
999
+
1000
+ If total findings > 40: split into two sequential dispatches:
1001
+ - **Pass 1 (mechanical routing):** Track B findings (pre-routed, validate only) + Track A findings (static routing) + E2E failures (route to phase 4) + fake-data findings (route to phase 4). These require minimal graph queries. Output: `docs/plans/evidence/dogfood/classified-findings-pass1.json`.
1002
+ - **Pass 2 (graph-heavy classification):** Dogfood findings only (need full graph-based classification). Input includes pass-1 output for dedup. Output: merge pass-1 + pass-2 into final `docs/plans/evidence/dogfood/classified-findings.json`.
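The threshold rule above can be sketched as a small planning function. This is an assumption-laden illustration: the stream keys reuse the `source` values from the classified-findings shape, while the function name `plan_synthesizer_dispatch` and the pass labels are hypothetical.

```python
SINGLE_PASS_LIMIT = 40  # threshold stated in the finding count check


def plan_synthesizer_dispatch(counts):
    """counts: {stream_name: finding_count}. Returns the ordered list of synthesizer passes."""
    total = sum(counts.values())
    if total <= SINGLE_PASS_LIMIT:
        # Small enough: one synthesizer dispatch covers every stream.
        return [{"pass": "single", "streams": sorted(counts)}]
    # Too many findings: mechanical routing first, graph-heavy dogfood classification second.
    mechanical = sorted(stream for stream in counts if stream != "dogfood")
    return [
        {"pass": "pass-1-mechanical", "streams": mechanical},
        {"pass": "pass-2-graph-heavy", "streams": ["dogfood"]},
    ]
```

Exactly 40 findings still takes the single-pass path, since the split only applies when the total is strictly greater than 40.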
1003
+
1004
+ Call the Agent tool — description: "Synthesize all findings" — subagent_type: `product-feedback-synthesizer` — Prompt: "[CONTEXT header above] Interpret findings from Track A, Track B, and Cross-cutting streams. Inputs:
1005
+
1006
+ - `docs/plans/evidence/dogfood/findings.md` — autonomous exploration findings, each requires classification + routing
1007
+ - `docs/plans/evidence/product-reality/*/findings.json` — one per feature (web uses agent-browser evidence; iOS uses XcodeBuildMCP + Maestro evidence). Each Track B finding ALREADY CARRIES `target_phase` and `target_task_or_step` set by the product-reality-auditor. VALIDATE these against the graph (same `graph_query_dependencies` walk used for dogfood findings) and pass through if valid; only re-route if validation fails (e.g., the targeted task no longer exists in the task DAG).
1008
+ - E2E test failures: `docs/plans/evidence/e2e/iter-3-results.json` — failures that persisted through 3 Playwright iterations. For each, set `source: "e2e"`, classify severity, route to `target_phase: 4`.
1009
+ - Fake-data findings: `docs/plans/evidence/fake-data-audit.md` — hardcoded/mock data in production paths. For each, set `source: "fake-data"`, classify severity, route to `target_phase: 4`.
1010
+ - Track A audit findings: `docs/plans/evidence/brand-drift.md`, `docs/plans/evidence/track-a/*.json` (API contract, performance, a11y, security). Web uses Playwright/Lighthouse; iOS uses XcodeBuildMCP/Instruments. These are engineering-focused findings. For each Track A finding, set `source: "track-a"`, classify severity, and route: API/perf/security findings → `target_phase: 4` (implementation fix); a11y findings → `target_phase: 4` (implementation fix); brand-drift findings → `target_phase: 3` (design fix, re-run Brand Guardian at Step 3.0).
1011
+
1012
+ For each finding, ensure it ends up classified with:
831
1013
  - Code-level bug (broken feature, failing logic, fake data) → `target_phase: 4`, assign to the specific task that owns the affected file
832
1014
  - Visual/design issue (styling drift, missing state, a11y gap) → `target_phase: 3`, assign to the Phase 3 step that owns the relevant artifact
833
1015
  - Structural/architecture issue (missing feature, wrong data flow, API mismatch) → `target_phase: 2`, assign to the architecture section
1016
+ - Spec-gap (acceptance criteria too vague, persona constraint not measurable) → `target_phase: 1, target_task_or_step: "1.6"`
1017
+
1018
+ Output: `docs/plans/evidence/dogfood/classified-findings.json` with shape `[{finding_id, source: \"dogfood\" | \"product-reality\" | \"track-a\" | \"e2e\" | \"fake-data\", severity, target_phase, target_task_or_step, description, evidence_ref, related_decision_id?: string}, ...]`. The `source` field distinguishes the five input streams. The file also carries a footer object with: `graph_used: boolean` (false if any graph call failed and grep fallback ran), `re_routed_findings: [{finding_id, original_target, new_target, reason}, ...]` (Track B findings whose routing the synthesizer overrode after graph validation failed — empty array if none), `source_counts: {dogfood: N, product_reality: M, track_a: P, e2e: N, fake_data: N}` (count by input stream). This file is read by the Phase 5 fix loop and by the Phase 6 LRR Aggregator for backward routing."
834
1019
 
835
- Output: `docs/plans/evidence/dogfood/classified-findings.json` with shape `[{finding_id, severity, target_phase, target_task_or_step, description, evidence_ref}, ...]`. This file is read by the Phase 5 fix loop and by the Phase 6 LRR Aggregator for backward routing."
1020
+ ### Step 5.5 — Fix loop
836
1021
 
837
- **Phase 5 fix loop:** For each CRITICAL/HIGH classified finding, dispatch the appropriate fix agent based on `target_phase`. Max 2 fix cycles.
1022
+ For each CRITICAL/HIGH classified finding, dispatch the appropriate fix agent based on `target_phase`. Max 2 fix cycles. Use the routing template at the bottom of this file ("Re-entry dispatch template"). Findings with `target_phase: 1, target_task_or_step: "1.6"` route back to `product-spec-writer` to tighten the spec, which re-triggers Track B for the affected feature on the next loop.
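As a rough illustration of the selection step in the fix loop, the sketch below filters `classified-findings.json` entries down to CRITICAL/HIGH and groups them by `target_phase`. The field names come from the output shape in Step 5.4; the function name and the case-insensitive severity comparison are assumptions.

```python
from collections import defaultdict

FIXABLE_SEVERITIES = {"CRITICAL", "HIGH"}


def group_fixable_findings(findings):
    """Group CRITICAL/HIGH finding IDs by target_phase, preserving input order."""
    by_phase = defaultdict(list)
    for finding in findings:
        # Assumption: severity may arrive in any letter case.
        if str(finding.get("severity", "")).upper() in FIXABLE_SEVERITIES:
            by_phase[finding["target_phase"]].append(finding["finding_id"])
    return dict(by_phase)
```

Each resulting group would then be handed to the fix agent for that phase, with the two-cycle cap enforced by the orchestrator.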
838
1023
 
839
- **Writes:** `docs/plans/evidence/*.json`, `docs/plans/evidence/fake-data-audit.md`, `docs/plans/evidence/dogfood/classified-findings.json`, `docs/plans/learnings.jsonl` (reality sweep writes PITFALL/PATTERN rows — see `protocols/decision-log.md` for the Dissent Log Revisit Pass path).
1024
+ **Writes:** `docs/plans/evidence/*.json`, `docs/plans/evidence/fake-data-audit.md`, `docs/plans/evidence/dogfood/classified-findings.json`, `docs/plans/evidence/product-reality/*/{tests-generated.md, results.json, findings.json, coverage.json, screenshots/}`, `docs/plans/learnings.jsonl` (reality sweep writes PITFALL/PATTERN rows — see `protocols/decision-log.md` for the Dissent Log Revisit Pass path).
840
1025
 
841
1026
  **Compaction checkpoint.** Update `.build-state.json` per the format above.
842
1027
 
@@ -848,6 +1033,16 @@ Output: `docs/plans/evidence/dogfood/classified-findings.json` with shape `[{fin
848
1033
 
849
1034
  Split from old Phase 6. Old 6.4 (Reality Check) and 6.5 (LRR) merged and restructured. Reality Checker keeps its evidence sweep role only — the combined verdict authority moved to the LRR Aggregator.
850
1035
 
1036
+ #### Step 6.0.idx — Decisions re-index (pre-LRR backfill)
1037
+
1038
+ Before dispatching the Reality Checker (Step 6.0) and the LRR chapter judges (Step 6.1), re-index `decisions.jsonl` so the Slice 4 fragment reflects any decisions appended since the last Phase 4 wave transition. The aggregator's backward-routing walk at Step 6.2 (the ⭐⭐ star rule) reads the indexed fragment via `graph_query_decisions` — running this once here catches any drift from hand-edits or out-of-band scribe writes. Skip silently if `docs/plans/decisions.jsonl` does not exist.
1039
+
1040
+ Run via the Bash tool:
1041
+
1042
+ - Command: `node ${CLAUDE_PLUGIN_ROOT}/bin/graph-index.js docs/plans/decisions.jsonl`
1043
+ - On exit 0: log success to `docs/plans/build-log.md` and continue.
1044
+ - On non-zero exit: STOP. Log the error to `docs/plans/build-log.md` and report the failure. The LRR aggregator's backward-routing walk requires current decision data.
1045
+
851
1046
  ### Step 6.0 — Reality Check (evidence sweep + dissent log revisit pass)
852
1047
 
853
1048
  Reality Checker runs its existing evidence sweep per `commands/build.md` precondition list. Writes the manifest to `docs/plans/evidence/reality-check-manifest.json`. Does NOT issue a combined verdict.
@@ -859,12 +1054,12 @@ REQUIRED EVIDENCE FOR ALL PROJECTS:
859
1054
  - `docs/plans/.build-state.json` exists, contains current build session id, contains a recent `VERIFY: PASS` line from this session.
860
1055
 
861
1056
  REQUIRED EVIDENCE FOR `project_type=web`:
862
- - `docs/plans/evidence/eval-harness/baseline.json` (non-empty)
863
- - `docs/plans/evidence/eval-harness/final.json` (non-empty)
864
1057
  - `docs/plans/evidence/e2e/iter-3-results.json` (non-empty)
865
1058
  - `docs/plans/evidence/dogfood/findings.md` (non-empty)
866
1059
  - `docs/plans/evidence/dogfood/classified-findings.json` (non-empty)
867
1060
  - `docs/plans/evidence/fake-data-audit.md` (non-empty)
1061
+ - `docs/plans/evidence/product-reality/*/coverage.json` — at least one per feature in product-spec.md (non-empty); a missing file for any feature listed in product-spec.md is itself a BLOCK
1062
+ - `docs/plans/evidence/product-reality/*/findings.json` (one per feature; may be an empty array `[]` if no failures)
868
1063
  - `docs/plans/evidence/manifest.json`
869
1064
 
870
1065
  REQUIRED EVIDENCE FOR `project_type=ios`:
@@ -904,23 +1099,23 @@ Call the Agent tool 5 times in ONE message. Note: the Eng-Quality chapter dispat
904
1099
 
905
1100
  1. Description: "LRR Eng-Quality chapter" — subagent_type: `code-reviewer` — Prompt: "[CONTEXT header above] You are the Eng-Quality chapter of the Launch Readiness Review. Your natural tendency is to be encouraging. Fight it. Default verdict: NEEDS WORK.
906
1101
 
907
- Read: `docs/plans/architecture.md`, `docs/plans/design-doc.md` (PRD — needed for requirements coverage evaluation), `docs/plans/sprint-tasks.md`, `docs/plans/.task-outputs/`, `protocols/verify.md` check outputs from `.build-state.json`, test results from Phase 4 and 5, eval-harness results from `docs/plans/evidence/eval-harness/`. Also read `docs/plans/decisions.jsonl` for cross-chapter context.
1102
+ Read: `docs/plans/architecture.md`, `docs/plans/design-doc.md` (PRD), `docs/plans/sprint-tasks.md`, `docs/plans/.task-outputs/`, `protocols/verify.md` check outputs from `.build-state.json`, test results from Phase 4 and 5, `docs/plans/evidence/product-reality/*/coverage.json` (Track B per-feature coverage). Also read `docs/plans/decisions.jsonl` for cross-chapter context.
908
1103
 
909
- Requirements coverage is folded into this chapter, not a separate dispatch. For EVERY feature listed in the MVP scope of `design-doc.md`, evaluate: (1) does it have a corresponding implemented task in sprint-tasks.md, (2) does it have a passing test or behavioral verification in evidence, (3) is it reachable and functional per the task-outputs. Emit a `requirements_coverage` field in your verdict JSON with shape `[{feature: \"<name>\", status: \"COVERED\" | \"PARTIAL\" | \"MISSING\"}, ...]`. Any MISSING feature is a BLOCK finding. Any PARTIAL feature is a CONCERNS finding at minimum.
1104
+ Requirements coverage is sourced from Phase 5 Track B evidence; do NOT recompute. Read every `docs/plans/evidence/product-reality/*/coverage.json` (one per feature). Aggregate the per-feature `coverage_pct` and `status` fields into a single `requirements_coverage[]` array on your verdict, one entry per feature with `{feature_id, feature_label, status, coverage_pct, blocker_summary}` where `blocker_summary` is a short string distilling `missing_states + broken_transitions + unenforced_rules + persona_constraint_violations` from coverage.json. Any `MISSING` status is a BLOCK finding. Any `PARTIAL` is CONCERNS at minimum. If a `coverage.json` file is missing for a feature listed in product-spec.md, that itself is a BLOCK finding (Track B did not run for that feature — pipeline integrity issue).
910
1105
 
911
1106
  Before writing the final verdict, spawn a parallel subagent dispatch: description: 'LRR test coverage adequacy' — subagent_type: `pr-test-analyzer` — prompt: 'You are a test-coverage auditor for the Eng-Quality LRR chapter. Read the test files under tests/, task-outputs/, and behavioral-test stub detector output. Evaluate: (1) do declared behavioral tests have non-stub bodies, (2) does coverage match the PR diff scope, (3) are edge cases covered, (4) are any flaky-test markers set. Return a JSON summary with test_coverage_score (0-100), stub_flagged_count, edge_case_gap_count, recommendations[]. Save to docs/plans/evidence/lrr/eng-quality-coverage.json.' Read the resulting eng-quality-coverage.json and fold its findings into your verdict.
912
1107
 
913
- Evaluate code quality + test coverage adequacy + architecture conformance + requirements coverage TOGETHER (single coherent view — merged from old Eng + QA chapters). Check: do declared behavioral tests actually exercise the features? Are there stub-flagged tests? Do tests match task acceptance criteria? Does the built code match architecture MUSTs? Are MVP features all COVERED?
1108
+ Evaluate code quality + test coverage adequacy + architecture conformance + requirements coverage TOGETHER (single coherent view — merged from old Eng + QA chapters). Check: do declared behavioral tests actually exercise the features? Are there stub-flagged tests? Do tests match task acceptance criteria? Does the built code match architecture MUSTs? Are features all COVERED?
914
1109
 
915
- Write verdict to `docs/plans/evidence/lrr/eng-quality.json` per `protocols/launch-readiness.md` schema. Fields: `chapter=eng-quality`, `verdict` (PASS|CONCERNS|BLOCK), `override_blocks_launch` (false unless BLOCK), `evidence_files_read` (non-empty, MUST include eng-quality-coverage.json), `findings[]` (each with `severity`, `description`, `evidence_ref`, `related_decision_id` if blocker ties to a decisions.jsonl row), `requirements_coverage[]` (one entry per MVP feature with `{feature, status}`), `follow_up_spawned=false`, `follow_up_findings=null`. Eng-Quality CANNOT spawn follow-ups."
1110
+ Write verdict to `docs/plans/evidence/lrr/eng-quality.json` per `protocols/launch-readiness.md` schema. Fields: `chapter=eng-quality`, `verdict` (PASS|CONCERNS|BLOCK), `override_blocks_launch` (false unless BLOCK), `evidence_files_read` (non-empty, MUST include eng-quality-coverage.json), `findings[]` (each with `severity`, `description`, `evidence_ref`, `related_decision_id` if blocker ties to a decisions.jsonl row), `requirements_coverage[]` (shape per the Track B aggregation paragraph above — `{feature_id, feature_label, status, coverage_pct, blocker_summary}`), `follow_up_spawned=false`, `follow_up_findings=null`. Eng-Quality CANNOT spawn follow-ups."
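The Track B aggregation that the Eng-Quality prompt asks for might be sketched as follows. Several details are assumptions rather than documented schema: that the four blocker fields in `coverage.json` are lists, that `coverage.json` carries a `feature_label`, and that a per-field count is an acceptable `blocker_summary`. The function names are hypothetical.

```python
# Fields distilled into blocker_summary (assumed here to be lists in coverage.json).
BLOCKER_FIELDS = (
    "missing_states",
    "broken_transitions",
    "unenforced_rules",
    "persona_constraint_violations",
)


def summarize_blockers(coverage):
    """Build a short 'field: count' summary of the non-empty blocker fields."""
    parts = []
    for field in BLOCKER_FIELDS:
        items = coverage.get(field) or []
        if items:
            parts.append(f"{field}: {len(items)}")
    return "; ".join(parts)


def aggregate_requirements_coverage(spec_feature_ids, coverage_by_feature):
    """Return (requirements_coverage entries, findings) from parsed coverage.json files."""
    entries, findings = [], []
    for feature_id in spec_feature_ids:
        coverage = coverage_by_feature.get(feature_id)
        if coverage is None:
            # Track B did not produce coverage for a spec'd feature: pipeline integrity issue.
            findings.append({"severity": "BLOCK", "description": f"coverage.json missing for {feature_id}"})
            continue
        status = coverage.get("status")
        entries.append({
            "feature_id": feature_id,
            "feature_label": coverage.get("feature_label", feature_id),
            "status": status,
            "coverage_pct": coverage.get("coverage_pct"),
            "blocker_summary": summarize_blockers(coverage),
        })
        if status == "MISSING":
            findings.append({"severity": "BLOCK", "description": f"{feature_id} is MISSING"})
        elif status == "PARTIAL":
            findings.append({"severity": "CONCERNS", "description": f"{feature_id} is PARTIAL"})
    return entries, findings
```

The values are copied from the evidence files rather than recomputed, which matches the instruction that coverage is sourced from Track B.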
916
1111
 
917
- 2. Description: "LRR Security chapter" — subagent_type: `security-reviewer` — Prompt: "[CONTEXT header above] You are the Security chapter of the LRR. Read: `docs/plans/evidence/fake-data-audit.md`, Phase 5 security audit output (from Step 5.1), eval-harness security cases. Also read `docs/plans/decisions.jsonl` for context.
1112
+ 2. Description: "LRR Security chapter" — subagent_type: `security-reviewer` — Prompt: "[CONTEXT header above] You are the Security chapter of the LRR. Read: `docs/plans/evidence/fake-data-audit.md`, Phase 5 security audit output (from Step 5.1). Also read `docs/plans/decisions.jsonl` for context.
918
1113
 
919
1114
  Evaluate auth model, input validation, secrets management, dependency vulnerabilities. Write verdict to `docs/plans/evidence/lrr/security.json` per schema. Fields: `chapter=security`, `verdict`, `override_blocks_launch`, `evidence_files_read` (non-empty), `findings[]` (with `related_decision_id` when applicable), `follow_up_spawned` (boolean), `follow_up_findings` (null or typed object).
920
1115
 
921
1116
  Security MAY spawn ONE read-only follow-up investigation, but ONLY if verdict would be BLOCK — NOT on suspicion. This is tightened from current behavior. Follow-up: read-only, Read/Grep/Glob only, max 15 tool calls, self-report tool_calls_used. See `protocols/launch-readiness.md` for follow-up flow."
922
1117
 
923
- 3. Description: "LRR SRE chapter" — subagent_type: `engineering-sre` — Prompt: "[CONTEXT header above] You are the SRE chapter of the LRR. Read: performance-audit outputs from Phase 5 (Step 5.1 performance auditor + Step 5.2 eval-harness perf cases), Performance Benchmarker evidence, NFRs from `docs/plans/quality-targets.json` and `docs/plans/sprint-tasks.md`, reliability checks. Also read `docs/plans/decisions.jsonl` for context.
1118
+ 3. Description: "LRR SRE chapter" — subagent_type: `engineering-sre` — Prompt: "[CONTEXT header above] You are the SRE chapter of the LRR. Read: performance-audit outputs from Phase 5 (Step 5.1 performance auditor), Performance Benchmarker evidence, NFRs from `docs/plans/quality-targets.json` and `docs/plans/sprint-tasks.md`, reliability checks. Also read `docs/plans/decisions.jsonl` for context.
924
1119
 
925
1120
  Evaluate whether the build meets NFR targets (response time, load handling, error rates) and is production-ready under load. Bundle-size budget violations (>25% over Scope budget) auto-block. Write verdict to `docs/plans/evidence/lrr/sre.json` per schema.
926
1121
 
@@ -937,9 +1132,9 @@ Write verdict to `docs/plans/evidence/lrr/a11y.json` per schema. A11y CANNOT spa
937
1132
 
938
1133
  5. Description: "LRR Brand Guardian chapter" — subagent_type: `design-brand-guardian` — Prompt: "[CONTEXT header above] You are the Brand Guardian chapter of the LRR (REPLACES the old Design mechanical check — real taste judgment, not a 15-line mechanical gate). Your natural tendency is to be encouraging. Fight it. Default verdict: NEEDS WORK.
939
1134
 
940
- Read: `docs/plans/visual-design-spec.md`, `docs/plans/visual-dna.md` (the 6-axis DNA card locked at Phase 3.0), `docs/plans/design-references.md`, Playwright screenshots under `docs/plans/evidence/` matching production pages, Phase 3.6 Design Critic final score from `.build-state.json`.
1135
+ Read: `DESIGN.md` (full file — `## Overview > ### Brand DNA` is the locked 7-axis card from Phase 3.0; YAML tokens are what Phase 4 was supposed to honor; `## Do's and Don'ts` are the explicit guardrails), `docs/plans/design-references.md`, Playwright screenshots under `docs/plans/evidence/` matching production pages, Phase 3.6 Design Critic final score from `.build-state.json`.
941
1136
 
942
- Evaluate DRIFT: did the built product stay true to the DNA card locked at Phase 3.0? Score the gap on 6 DNA axes (Scope, Density, Character, Material, Motion, Type) + 5 craft dimensions (whitespace rhythm, visual hierarchy, motion coherence, color harmony, typographic refinement). Cite specific elements ('the hero padding at landing.tsx:42 is 32px but DNA calls for Airy density — should be 48px+') — never vague ('needs polish').
1137
+ Evaluate DRIFT: did the built product stay true to DESIGN.md (DNA + tokens + guardrails)? Score the gap on 7 DNA axes (Scope, Density, Character, Material, Motion, Type, Copy) + 5 craft dimensions (whitespace rhythm, visual hierarchy, motion coherence, color harmony, typographic refinement). Cite specific elements ('the hero padding at landing.tsx:42 is 32px but DNA calls for Airy density — should be 48px+') — never vague ('needs polish').
943
1138
 
944
1139
  Write verdict to `docs/plans/evidence/lrr/brand-guardian.json` per schema. Fields per protocol. Brand Guardian CANNOT spawn follow-ups."
945
1140
 
@@ -947,7 +1142,7 @@ Write verdict to `docs/plans/evidence/lrr/brand-guardian.json` per schema. Field
947
1142
 
948
1143
  ### Step 6.1a — PM coverage fold-in
949
1144
 
950
- PM coverage is a sub-input of the Eng-Quality chapter — evaluated inline within the Eng-Quality dispatch at Step 6.1 above against `design-doc.md` MVP scope and emitted as a `requirements_coverage[]` field on `eng-quality.json`. The LRR Aggregator runs exactly once. Chapter count stays 5.
1145
+ PM coverage is a sub-input of the Eng-Quality chapter — evaluated inline within the Eng-Quality dispatch at Step 6.1 above against `design-doc.md` scope and emitted as a `requirements_coverage[]` field on `eng-quality.json`. The LRR Aggregator runs exactly once. Chapter count stays 5.
951
1146
 
952
1147
  ### Step 6.2 — LRR Aggregator (sequential, after all 5 chapter files exist)
953
1148
 
@@ -1012,6 +1207,8 @@ On re-entry from LRR BLOCK:
1012
1207
  blocking_finding: {chapter, finding_id, severity, description, related_decision_id, related_files}
1013
1208
  prior_output: path to the phase's previous artifact
1014
1209
  decision_row: the row from decisions.jsonl containing original reasoning + authorship
1210
+ cycle_number: current backward-routing cycle count for this target phase (from .build-state.json.backward_routing_count_by_target_phase)
1211
+ downstream_phases_affected: list of phases that consume this phase's output (e.g., Phase 2 re-entry affects Phases 3, 4, 5, 6)
1015
1212
  TASK for the re-opened phase:
1016
1213
  Revise prior_output to address blocking_finding. Do NOT redo unaffected work. Emit a new decision_row documenting the revision rationale.
1017
1214
  ```
@@ -1053,7 +1250,7 @@ Do not loop forever.
1053
1250
 
1054
1251
  4. Description: "Deploy" — subagent_type: `engineering-devops-automator` — mode: "bypassPermissions" — Prompt: "[CONTEXT header above] Deploy the app to the target from the PRD (`docs/plans/design-doc.md#tech-stack`). Run pre-deploy checks: build, env vars, secrets. Execute deploy. Verify the deployed URL returns 200 and serves the built app (not the placeholder). Report deploy URL and any smoke-test findings."
1055
1252
 
1056
- 5. Description: "Completion Report" — INTERNAL inline role-string — Prompt: "[CONTEXT header above] You are the Completion Report writer. Draw verification surface from the LRR Aggregator's structured output (`docs/plans/evidence/lrr-aggregate.json`) and the Reality Checker evidence manifest (`docs/plans/evidence/reality-check-manifest.json`) — NOT from orchestrator summary prose. Present:
1253
+ 5. Description: "Completion Report" — INTERNAL inline role-string — Prompt: "[CONTEXT header above] You are the Completion Report writer. Draw verification surface from THREE sources: the LRR Aggregator's structured output (`docs/plans/evidence/lrr-aggregate.json`), the Reality Checker evidence manifest (`docs/plans/evidence/reality-check-manifest.json`), and the build state (`docs/plans/.build-state.json` for backward-routing counts and mode transitions per state-schema v2). Do NOT draw from orchestrator summary prose. Present:
1057
1254
 
1058
1255
  ```
1059
1256
  BUILD COMPLETE
@@ -1077,6 +1274,22 @@ Remaining: [any NEEDS WORK items from lrr-routing.json]
1077
1274
  | LRR follow-ups spawned | count | — |
1078
1275
  | LRR triggered rule | rule number 1-6 | — |
1079
1276
 
1277
+ ```
1278
+ QUALITY METRICS (from .build-state.json schema v2)
1279
+ ================================================
1280
+ Backward routing: <total> events
1281
+ by target phase: <\"2\": N, \"3\": N, \"4\": N>
1282
+ top decisions re-opened: <decision_id: N> (up to 3)
1283
+
1284
+ Mode transitions: <count> (autonomous ↔ interactive)
1285
+ <if count > 0: list each transition timestamp + direction>
1286
+
1287
+ Interpretation:
1288
+ - Backward routing count is a quality signal — fewer means Phase 1-3 caught issues earlier.
1289
+ - A target phase appearing 3+ times suggests structural rework (re-architect or re-design); investigate the related decisions.
1290
+ - Mode transitions ≥ 2 in autonomous mode indicates the build hit a manual-review threshold — review the LRR rule that triggered.
1291
+ ```
1292
+
1080
1293
  If there's a Verification Gap (declared != passing, or stub-flagged > 0), surface a top-level 'Verification Gap' section BEFORE writing the report to disk. Ask user: 'Write Completion Report with this verification gap surfaced? [YES/NO]'. In autonomous mode: write but flag prominently.
1081
1294
 
1082
1295
  Create final commit. Mark all TodoWrite items complete. Update `.build-state.json`: 'Phase: 7 COMPLETE'."