buildanything 1.2.1 → 1.5.0

package/commands/build.md CHANGED
@@ -1,464 +1,460 @@
1
1
  ---
2
- description: "Full product build pipeline: takes a brainstormed idea through architecture, planning, implementation, testing, and hardening using coordinated agent teams — outputs working, tested, reviewed code"
3
- argument-hint: "Path to brainstorming doc or describe what we're building"
2
+ description: "Full product build pipeline: orchestrates specialist agents through brainstorming, research, architecture, implementation, testing, hardening, and shipping"
3
+ argument-hint: "Describe what to build, or path to a design doc. --autonomous for unattended mode. --resume to continue a previous build."
4
4
  ---
5
5
 
6
- # /build — buildanything pipeline
6
+ <HARD-GATE>
7
+ YOU ARE AN ORCHESTRATOR. YOU COORDINATE AGENTS. YOU DO NOT WRITE CODE.
7
8
 
8
- ## PROCESS INTEGRITY — READ THIS FIRST
9
+ Every step below tells you to call the Agent tool. DO IT. Do not role-play as the agent. Do not write implementation code yourself. Do not skip the Agent tool call "because it's faster." If you are typing code instead of calling the Agent tool, STOP. You are violating this process.
9
10
 
10
- <HARD-GATE>
11
- You are an ORCHESTRATOR. You coordinate specialist agents. You do NOT write implementation code yourself.
11
+ "Launch an agent" = call the Agent tool (the actual tool in your toolbar, the one that spawns a subprocess).
12
12
 
13
- If you are about to write implementation code directly, STOP. That is a violation of this process. Dispatch to a specialist agent instead.
13
+ For implementation agents, set mode: "bypassPermissions".
14
+ For parallel work, put multiple Agent tool calls in ONE message.
14
15
 
15
- This gate is non-negotiable. No exceptions. No "just this one quick fix." No "it's faster if I do it myself."
16
+ Exception: Brainstorming (Phase 1, Step 1.1) is a direct conversation with the user — you ask questions and process answers yourself. This is the ONE phase where you work directly, not through agents.
16
17
  </HARD-GATE>
17
18
 
18
- **Resuming after context compaction?** If your context was recently compacted or you are continuing a previous session:
19
- 1. Read `docs/plans/.build-state.md` to recover your phase, step, and progress
20
- 2. Re-read THIS file completely — you are reading it now
21
- 3. Check the TodoWrite list for task progress
22
- 4. Resume from the saved state, not from scratch
23
- 5. Do NOT skip ahead or fall back to default coding behavior
24
-
25
- ### Rationalization Prevention
26
-
27
- If you catch yourself thinking any of these, you are drifting from the process:
28
-
29
- | Thought | Reality |
30
- |---------|---------|
31
- | "It's faster if I just write this myself" | You are an orchestrator. Dispatch to an agent. Speed is not your job — coordination is. |
32
- | "This is too small for a subagent" | Every implementation task goes through an agent. No exceptions. Small tasks still need the Dev→QA loop. |
33
- | "I'll skip the code review for this one" | Every task gets reviewed. The code-reviewer agent exists for a reason. |
34
- | "The quality gate is obvious, I'll just proceed" | Present it to the user. Quality gates require explicit user approval. |
35
- | "I already know what to build, I'll skip architecture" | Phase 1 is mandatory. The architecture step catches design mistakes before they become code. |
36
- | "Tests aren't needed for this part" | Every task has acceptance criteria and tests. The Evidence Collector verifies. |
37
- | "I'll clean this up later" | The Harden phase (Phase 4) exists for this. Don't skip steps — follow the process. |
38
- | "Context was compacted, I'll just keep coding" | STOP. Re-read this file. Check .build-state.md. Reload the process. |
39
-
40
- ### Process Flowchart
41
-
42
- ```dot
43
- digraph build_pipeline {
44
- rankdir=TB;
45
- node [shape=box];
46
-
47
- start [label="User invokes /build" shape=ellipse];
48
- p1 [label="Phase 1: Architecture & Planning\n(Backend Architect + UX Architect +\nSecurity Engineer + code-architect +\nSprint Prioritizer + Senior PM)"];
49
- gate1 [label="Quality Gate 1\nUser approves architecture?" shape=diamond];
50
- p2 [label="Phase 2: Foundation\n(DevOps Automator + Frontend Dev\nor Backend Architect)"];
51
- gate2 [label="Quality Gate 2\nBuilds? Tests pass? Lint clean?" shape=diamond];
52
- p3 [label="Phase 3: Build — Dev↔QA Loops\nFor EACH task:\nAgent implements → Evidence Collector\nverifies → code-reviewer reviews"];
53
- retry [label="Retry (max 3)\nFeedback to dev agent" shape=box];
54
- escalate [label="Escalate to user\nafter 3 failures" shape=box];
55
- p4 [label="Phase 4: Harden\n(API Tester + Perf Benchmarker +\nAccessibility Auditor + Security Engineer +\ncode-simplifier + Reality Checker)"];
56
- gate4 [label="Quality Gate 4\nReality Checker: PRODUCTION READY?" shape=diamond];
57
- p5 [label="Phase 5: Ship\n(Technical Writer + final commit)"];
58
- done [label="BUILD COMPLETE" shape=ellipse];
59
-
60
- start -> p1;
61
- p1 -> gate1;
62
- gate1 -> p2 [label="approved"];
63
- gate1 -> p1 [label="changes requested"];
64
- p2 -> gate2;
65
- gate2 -> p3 [label="pass"];
66
- gate2 -> p2 [label="fix"];
67
- p3 -> retry [label="task fails"];
68
- retry -> p3 [label="< 3 retries"];
69
- retry -> escalate [label="3 retries"];
70
- p3 -> p4 [label="all tasks complete"];
71
- p4 -> gate4;
72
- gate4 -> p5 [label="PRODUCTION READY"];
73
- gate4 -> p4 [label="NEEDS WORK"];
74
- p5 -> done;
75
- }
76
- ```
19
+ ### Orchestrator Discipline
77
20
 
78
- ---
21
+ Your context window is precious. Protect it.
79
22
 
80
- You are the **Agents Orchestrator** running the buildanything pipeline. Your job is to take a brainstormed idea and build it into a working, tested, production-quality product, coordinating specialist agents the way a VP of Engineering at Meta or Google would run a product team.
23
+ **You are a DISPATCHER, not a DOER.** Your job is: read state → decide next step → compose agent prompt → dispatch → process result → decide next step.
81
24
 
82
- **This is NOT brainstorming. Brainstorming is done. This is execution.**
25
+ **Two types of agents handle their results differently:**
83
26
 
84
- Input: $ARGUMENTS
27
+ | Agent Type | Examples | What you keep |
28
+ |-----------|----------|---------------|
29
+ | **Research/analysis** | Market research, tech feasibility, architecture design, audits, measurement | **Full output** — their response IS the deliverable. You need it to synthesize, compare, and make decisions. Save to `docs/plans/` when applicable. |
30
+ | **Implementation** | Code writing, fixes, cleanup, verification, scaffolding | **Summary only** — their work product lives in the codebase. Keep: what was done, files changed, test results, pass/fail. Discard: code snippets, full build logs, lint output. |
85
31
 
86
- ## Operating Principles
32
+ **After implementation agents return:**
33
+ 1. Extract: what was built, files changed, test pass/fail, any blockers
34
+ 2. Record in `docs/plans/.build-state.md` under the current phase
35
+ 3. The code is in the repo — you don't need it in your context
87
36
 
88
- - **You are an orchestrator.** You dispatch work to specialist agents. You do NOT write implementation code yourself. Your job is coordination, synthesis, and quality enforcement.
89
- - **Phase gates are mandatory.** Do not advance to the next phase until the current phase passes its quality gate. Present phase output to the user for approval before advancing.
90
- - **Dev↔QA loops are mandatory.** Every implementation task gets tested. Failed tasks loop back to the developer agent with specific feedback. Max 3 retries per task before escalation to the user.
91
- - **Fresh agents per task.** Each task gets a fresh subagent to prevent context pollution from previous tasks. Do not reuse a subagent across multiple implementation tasks.
92
- - **Parallelism within phases.** Agents within the same step run in parallel via the Agent tool. Phases run sequentially.
93
- - **Real code, real tests, real commits.** This pipeline writes actual files, runs actual tests, and makes actual git commits. It does not produce documents about code.
94
- - **Evidence-based quality.** The Reality Checker defaults to NEEDS WORK. The Evidence Collector requires proof. Do not self-approve.
95
- - **TodoWrite for progress tracking.** Use TodoWrite to create and update a task checklist at the start of Phase 3. This is your primary progress tracker — it survives context compaction better than memory alone.
96
- - **State persistence.** After completing each step, update `docs/plans/.build-state.md` with your current phase, step, task progress, and agent usage. This file is your recovery point if context is compacted.
37
+ **After research/analysis agents return:**
38
+ 1. Read and use the full output: this is your decision-making input
39
+ 2. Save the output to the appropriate file in `docs/plans/` (research brief, architecture doc, etc.)
40
+ 3. Once saved to disk, you can reference the file later instead of holding it all in context
97
41
 
98
- ---
42
+ **Never do these yourself:**
43
+ - Read source code files to understand implementation details — spawn an Explore agent
44
+ - Write or edit code — spawn an implementation agent
45
+ - Debug failures — spawn a fix agent with the error message
99
46
 
100
- ## Phase 0: Initialize
47
+ If you catch yourself typing code or reading source files: STOP. You are wasting context. Spawn an agent.
101
48
 
102
- Before starting any work:
49
+ **Dispatch Counter:** Track agent dispatches in `docs/plans/.build-state.md` under `## Dispatch Counter`:
50
+ - `dispatches_since_save: [N]`
51
+ - `last_save: [Phase.Step]`
52
+ Increment after each agent returns (parallel dispatch of 4 agents = +4). Reset to 0 after each compaction save.
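The counter rules can be sketched in Python as follows. This is an illustrative sketch only: the `DispatchCounter` class and its method names are hypothetical and not part of the package, and the threshold of 8 is borrowed from the compaction checkpoint rule that appears later in this file.

```python
from dataclasses import dataclass

# Threshold taken from the compaction checkpoint rule ("If >= 8: save ALL state").
SAVE_THRESHOLD = 8


@dataclass
class DispatchCounter:
    """Hypothetical in-memory mirror of the `## Dispatch Counter` section."""

    dispatches_since_save: int = 0
    last_save: str = "Phase 0"

    def record_return(self, agents: int = 1) -> None:
        # A parallel dispatch of 4 agents counts as +4.
        self.dispatches_since_save += agents

    def needs_save(self) -> bool:
        return self.dispatches_since_save >= SAVE_THRESHOLD

    def mark_saved(self, phase_step: str) -> None:
        # Reset to 0 after each compaction save and remember where it happened.
        self.dispatches_since_save = 0
        self.last_save = phase_step
```

In practice the orchestrator keeps these two values in `docs/plans/.build-state.md` rather than in memory; the class only makes the increment and reset rules explicit.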
103
53
 
104
- 1. **Create a TodoWrite checklist** with the 5 phases:
105
- - [ ] Phase 1: Architecture & Planning
106
- - [ ] Phase 2: Foundation
107
- - [ ] Phase 3: Build (will expand into per-task items later)
108
- - [ ] Phase 4: Harden
109
- - [ ] Phase 5: Ship
54
+ Input: $ARGUMENTS
110
55
 
111
- 2. **Write initial state** to `docs/plans/.build-state.md`:
112
- ```
113
- Phase: 0 — Initializing
114
- Input: [user's build request]
115
- Started: [timestamp]
116
- ```
56
+ ### Autonomous Mode
117
57
 
118
- 3. Proceed to Phase 1.
58
+ If the input contains `--autonomous` or `--auto`, this build runs **unattended**. The user will not be present to approve quality gates. In autonomous mode:
59
+ - Quality gates auto-approve. Do NOT pause and wait for user input.
60
+ - Brainstorming runs in autonomous mode (see protocol).
61
+ - Metric loops that stall accept at >= 60% of target, skip below that.
62
+ - Log every decision to `docs/plans/build-log.md` so the user can review later.
119
63
 
120
- ---
64
+ If `--autonomous` is NOT present, all quality gates require user approval as described below.
65
+
66
+ When combining `--resume` with `--autonomous`: the current invocation's flags take precedence over saved state. If you resume a previously interactive build with `--autonomous`, it continues in autonomous mode.
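A minimal Python sketch of two of the autonomous-mode rules above: detecting the flag in the input, and deciding whether a stalled metric loop is accepted. The function names are hypothetical, and the sketch assumes that the flags of the current invocation alone decide the mode, which is one reading of "take precedence over saved state".

```python
def is_autonomous(arguments: str) -> bool:
    """Return True when the current invocation carries --autonomous or --auto."""
    tokens = arguments.split()
    # Match whole tokens so a request such as "build an autonomous rover" is not a false hit.
    return "--autonomous" in tokens or "--auto" in tokens


def accept_stalled_metric(score: int, target: int) -> bool:
    """Accept a stalled metric loop at >= 60% of target, skip below that."""
    # Integer arithmetic avoids floating-point rounding at the exact boundary.
    return score * 100 >= target * 60
```

For example, with a target of 85 a stalled loop at 51 would be accepted and one at 50 would be skipped.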
121
67
 
122
- ## Phase 1: Architecture & Planning
68
+ ### Metric Loop
123
69
 
124
- **Goal**: Define the technical architecture, component structure, UX foundation, and sprint task list. No code yet, just the blueprint.
70
+ Every phase uses a **metric-driven iteration loop** to drive quality. Read the full protocol at `commands/protocols/metric-loop.md`. Critical rules (survive compaction):
71
+
72
+ 1. YOU define a metric for this phase based on context (what you're building, what matters). The metric is NOT predefined.
73
+ 2. Spawn a **measurement agent** to score the artifact 0-100. Read its full output — it's analysis.
74
+ 3. Pick the ONE highest-impact issue. Spawn a separate **fix agent** with ONLY that issue + file paths.
75
+ 4. Re-measure. Repeat until: target met, stalled (2 consecutive delta <= 0), or max iterations.
76
+ 5. Track all scores in `docs/plans/.build-state.md` — this is your lifeline across compaction.
125
77
 
126
78
  <HARD-GATE>
127
- Quality Gate: User MUST approve the architecture and task list before any code is written. Do not proceed to Phase 2 without explicit user approval. "Looks good" counts. Silence does not.
79
+ METRIC LOOP NON-NEGOTIABLES:
80
+ - Measurement agent and fix agent are SEPARATE Agent tool calls — never share context (author-bias elimination).
81
+ - Fix agent gets ONLY the top issue + file paths + acceptance criteria. NOT the full measurement findings.
82
+ - One fix per iteration. Measure impact before fixing the next thing.
83
+ - Each measurement is fresh — don't accumulate findings across iterations.
128
84
  </HARD-GATE>
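The control flow of the metric loop can be sketched in Python. This is a sketch under assumptions, not the protocol in `commands/protocols/metric-loop.md`: `measure` and `fix` are hypothetical callables standing in for the separate measurement and fix agents, and `measure` is assumed to return a score plus the single highest-impact issue.

```python
from typing import Callable, List, Tuple


def run_metric_loop(
    measure: Callable[[], Tuple[int, str]],
    fix: Callable[[str], None],
    target: int,
    max_iterations: int,
) -> Tuple[str, List[int]]:
    """Measure, fix one issue, re-measure; stop on target, stall, or max iterations."""
    scores: List[int] = []
    score, top_issue = measure()  # fresh measurement each time
    scores.append(score)
    stalled = 0
    iterations = 0
    while score < target and iterations < max_iterations:
        fix(top_issue)  # one fix per iteration, only the top issue
        new_score, top_issue = measure()
        scores.append(new_score)
        # A stall is two consecutive iterations with delta <= 0.
        stalled = stalled + 1 if new_score - score <= 0 else 0
        score = new_score
        iterations += 1
        if stalled >= 2:
            return "stalled", scores
    if score >= target:
        return "target met", scores
    return "max iterations", scores
```

The returned score list corresponds to what rule 5 asks to be tracked in `docs/plans/.build-state.md`.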
129
85
 
130
- ### Step 1.1 — Codebase Understanding (if existing project)
86
+ ### Handoff Documents
131
87
 
132
- If this is being built in an existing codebase, launch 2-3 **code-explorer** agents in parallel to map:
133
- - Similar features and their implementation patterns
134
- - Architecture layers and abstractions
135
- - File organization conventions, testing patterns, build system
88
+ When spawning agents in sequence (e.g., architect → implementer → reviewer), pass **scoped handoffs**, not the full architecture dump. Each agent receives only what it needs:
136
89
 
137
- If this is a greenfield project, skip to Step 1.2.
90
+ 1. **Relevant architecture section** — the specific part of architecture.md that applies to this agent's task
91
+ 2. **Previous agent's output** — what the upstream agent produced (if any)
92
+ 3. **Acceptance criteria** — what "done" looks like for THIS agent
138
93
 
139
- ### Step 1.2 — Architecture Design (Parallel)
94
+ For implementation agents (Phase 4+): Do NOT paste the entire Design Document or Architecture Document. Extract the relevant sections only. For research and architecture agents (Phases 1-2): pass the full document; these agents need complete context to do their analysis.
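One way to picture a scoped handoff is a helper that pulls a single section out of the architecture document and combines it with the upstream output and acceptance criteria. The sketch below is hypothetical: the function names, the `##` heading convention, and the prompt layout are assumptions, not something the package defines.

```python
def extract_section(markdown: str, heading: str) -> str:
    """Return the `## heading` section of a markdown document, or '' if absent."""
    collected = []
    inside = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            if inside:
                break  # the next level-2 heading ends the section
            inside = line[3:].strip() == heading
        if inside:
            collected.append(line)
    return "\n".join(collected).strip()


def compose_handoff(architecture: str, heading: str, upstream: str, criteria: str) -> str:
    """Build a scoped handoff: relevant section, upstream output, acceptance criteria."""
    return "\n\n".join([
        "ARCHITECTURE (relevant section):\n" + extract_section(architecture, heading),
        "PREVIOUS AGENT OUTPUT:\n" + (upstream or "(none)"),
        "ACCEPTANCE CRITERIA:\n" + criteria,
    ])
```

Research and architecture agents would skip this step and receive the full document instead.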
140
95
 
141
- Launch these agents simultaneously:
96
+ ### Complexity Routing (Advisory)
142
97
 
143
- 1. **Backend Architect** — Design the system architecture: services, data models, API contracts, database schema, external integrations. Define the technical boundaries and data flows. Be specific — name tables, endpoints, data structures.
98
+ When composing agent prompts, prefix with `[COMPLEXITY: S/M/L]` to hint at the appropriate model tier:
144
99
 
145
- 2. **UX Architect** — Design the frontend architecture: component hierarchy, layout system, responsive strategy, CSS architecture, state management approach. Produce a component tree with clear responsibilities.
100
+ | Complexity | Task Types | Preferred Tier |
101
+ |-----------|-----------|----------------|
102
+ | S | Build-fix, cleanup, lint fix, single-error fix | Haiku-class (fastest) |
103
+ | M | Measurement, eval, testing, single-feature impl | Sonnet-class (balanced) |
104
+ | L | Architecture, research, multi-file impl, debugging | Opus-class (deepest reasoning) |
146
105
 
147
- 3. **Security Engineer** — Review the proposed architecture for security concerns: auth model, data handling, input validation strategy, secrets management, threat model for the top 3 attack vectors.
106
+ For sprint tasks, use the Size field from `docs/plans/sprint-tasks.md`. This is advisory: the tag documents intent for future model routing support.
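The tagging convention can be illustrated with a small Python helper. The helper name and the dictionary are hypothetical; only the `[COMPLEXITY: S/M/L]` prefix and the tier labels come from the table above. Because the routing is advisory, the sketch only tags the prompt and does not select a model.

```python
# Tier labels copied from the Complexity Routing table; advisory only.
PREFERRED_TIER = {
    "S": "Haiku-class (fastest)",
    "M": "Sonnet-class (balanced)",
    "L": "Opus-class (deepest reasoning)",
}


def tag_prompt(prompt: str, size: str) -> str:
    """Prefix an agent prompt with its complexity tag."""
    if size not in PREFERRED_TIER:
        raise ValueError(f"unknown complexity size: {size!r}")
    return f"[COMPLEXITY: {size}] {prompt}"
```

For a sprint task, `size` would be read from the Size field in `docs/plans/sprint-tasks.md`.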
148
107
 
149
- 4. **code-architect** (Claude Code agent) — Analyze the architecture proposals against the existing codebase (if any). Produce a concrete implementation blueprint: specific files to create/modify, build sequence, dependency order.
108
+ ---
150
109
 
151
- After all return, synthesize into a single **Architecture Document** that resolves any contradictions between agents.
110
+ ## Phase 0: Context & Pre-Flight
152
111
 
153
- ### Step 1.3 — Sprint Planning (Sequential, after 1.2)
112
+ **Resuming?** If the input contains `--resume` OR if context was just compacted (SessionStart hook fired with active state):
113
+ 1. Read `docs/plans/.build-state.md` — verify it exists and has a Resume Point section.
114
+ If `docs/plans/.build-state.md` does not exist or has no Resume Point, warn the user: 'No previous build state found. Starting fresh.' Then proceed to Step 0.1 as a new build.
115
+ 2. Re-read this file and all protocol files in `commands/protocols/`.
116
+ 3. Re-read `docs/plans/sprint-tasks.md`, `docs/plans/architecture.md`, and `CLAUDE.md`.
117
+ 4. Rebuild TodoWrite from the state file (TodoWrite does NOT survive compaction or session breaks).
118
+ 5. Reset `dispatches_since_save` to 0 (fresh context window).
119
+ 6. Resume from the saved phase and step. Skip Phase 0.
154
120
 
155
- Launch **Sprint Prioritizer** with the Architecture Document:
156
- - Break the build into ordered, atomic tasks
157
- - Each task should be implementable and testable independently
158
- - Define acceptance criteria for each task — what "done" looks like, what tests must pass
159
- - Identify dependencies between tasks — what must be built first
160
- - Estimate relative complexity (S/M/L) for each task
161
- - **Include the architectural rationale** — WHY this task exists, which part of the architecture it implements
121
+ ### Step 0.1 — Read the Room
162
122
 
163
- Then launch **Senior Project Manager** to validate the task list:
164
- - Confirm realistic scope — remove anything that isn't in the brainstorming spec
165
- - Verify no missing tasks — every component from the architecture has implementation tasks
166
- - Ensure task descriptions are specific enough that a developer agent can execute without ambiguity
123
+ Before doing anything, scan for existing context:
167
124
 
168
- Save the task list to `docs/plans/sprint-tasks.md`. The file MUST include this header:
125
+ - Check if the input is a file path (e.g., `docs/plans/brainstorm.md`). If so, read it.
126
+ - Check if `docs/plans/` or `docs/briefs/` exist with prior brainstorming, design docs, decision briefs, or research. Read them.
127
+ - Check if there's existing code in the project. If so, this is an enhancement, not greenfield.
128
+ - Check the conversation history — has the user been discussing this idea already?
129
+ - Check if `docs/plans/learnings.md` exists from a previous build. If so, read it. Apply relevant PATTERNS to agent prompt design, avoid listed PITFALLs, use HEURISTICS when applicable.
169
130
 
170
- ```
171
- # Sprint Tasks — buildanything pipeline
172
- # PROCESS: Execute each task using build.md Phase 3 Dev→QA loops.
173
- # DO NOT implement tasks directly. Dispatch to specialist agents.
174
- # If you lost context, re-read: commands/build.md
175
- #
176
- # Each task MUST go through: Implement (agent) → Verify (Evidence Collector) → Review (code-reviewer)
177
- ```
131
+ **Classify what you found:**
178
132
 
179
- ### Quality Gate 1
133
+ | Context Level | What You Have | What Happens |
134
+ |---|---|---|
135
+ | **Full design** | Design doc with decisions, scope, tech stack, data models | Skip Phase 1. Feed design into Phase 2. |
136
+ | **Decision brief** | An idea-sweep brief with verdicts and MVP definition | Phase 1 skips research (Step 1.2). Brainstorming refines the brief into a design. |
137
+ | **Partial context** | Some notes, conversation, rough sketch | Phase 1 runs fully. Feed context into brainstorming + research. |
138
+ | **Raw idea** | One-line build request, no prior work | Phase 1 runs fully from scratch. |
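The classification table can be read as a mapping from context level to the Phase 1 steps that still run. The Python sketch below is illustrative: the enum and function names are hypothetical, and it assumes that a "Decision brief" skips only Step 1.2 while Steps 1.1, 1.3, and 1.4 still run, which the table implies but does not state outright.

```python
from enum import Enum
from typing import List


class ContextLevel(Enum):
    FULL_DESIGN = "Full design"
    DECISION_BRIEF = "Decision brief"
    PARTIAL_CONTEXT = "Partial context"
    RAW_IDEA = "Raw idea"


def phase1_steps(level: ContextLevel) -> List[str]:
    """Return the Phase 1 steps to run for a given context level."""
    if level is ContextLevel.FULL_DESIGN:
        return []  # skip Phase 1, feed the design into Phase 2
    if level is ContextLevel.DECISION_BRIEF:
        return ["1.1", "1.3", "1.4"]  # research (Step 1.2) is already done
    return ["1.1", "1.2", "1.3", "1.4"]  # partial context and raw idea run fully
```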
180
139
 
181
- Present to the user:
182
- 1. Architecture Document (system diagram, component tree, data models, API contracts)
183
- 2. Sprint Task List (ordered tasks with acceptance criteria)
184
- 3. Identified risks or decisions that need user input
140
+ ### Step 0.2 — Human Prerequisites Checklist
185
141
 
186
- Ask: **"Architecture and sprint plan ready. Approve to start building, or flag changes?"**
142
+ Identify everything that requires HUMAN action before going heads-down:
187
143
 
188
- <HARD-GATE>
189
- DO NOT PROCEED WITHOUT USER APPROVAL. Wait for explicit confirmation.
190
- </HARD-GATE>
144
+ - **API keys & secrets** — External services the project integrates with. List each key needed.
145
+ - **Database setup** — Supabase, Postgres, etc. User needs to create it and provide credentials.
146
+ - **Repository** — Git repo on GitHub? Public or private?
147
+ - **Deployment** — Vercel, Railway, Fly.io? User needs to connect.
148
+ - **MCP servers** — Playwright for visual testing, database access, etc.
149
+ - **Local tooling** — Docker, specific runtimes, etc.
150
+
151
+ Present the checklist:
191
152
 
192
- **Save state:** Write `docs/plans/.build-state.md`:
193
153
  ```
194
- Phase: 1 COMPLETE — awaiting user approval
195
- Tasks: [total] planned
196
- Agents used: Backend Architect, UX Architect, Security Engineer, code-architect, Sprint Prioritizer, Senior Project Manager
154
+ BEFORE I GO HEADS-DOWN, please set up:
155
+
156
+ [ ] [Service] API key → add as [KEY_NAME] to .env
157
+ [ ] [Database] → add connection URL to .env
158
+ [ ] GitHub repo → share the URL
159
+ [ ] [Deployment service] connected (optional)
160
+
161
+ Once done, say "ready" and I'll start building.
197
162
  ```
198
163
 
199
- Update TodoWrite: mark Phase 1 complete.
164
+ <HARD-GATE>
165
+ Interactive mode: DO NOT proceed until the user confirms prerequisites (or says to skip).
166
+ Autonomous mode: Log checklist to `docs/plans/build-log.md`. Create `.env.example` with required keys. Proceed — log missing keys as blockers if hit during build.
167
+ </HARD-GATE>
168
+
169
+ ### Step 0.3 — Initialize
170
+
171
+ 0. Create `docs/plans/` directory if it doesn't exist (greenfield projects won't have it).
172
+ 1. Create a TodoWrite checklist with Phases 0-6.
173
+ 2. Create `docs/plans/.build-state.md` as a single write with ALL of the following: phase and step (`Phase: 0 — Starting`), input (`[build request]`), context level (`[classification]`), prerequisites (`[status]`), dispatch counter (`dispatches_since_save: 0, last_save: Phase 0`), and a `## Resume Point` section with: phase, step, autonomous mode flag, completed tasks (none), git branch name.
174
+ 3. Go to Phase 1 (or Phase 2 if context level is "Full design").
200
175
 
201
176
  ---
202
177
 
203
- ## Phase 2: Foundation
178
+ ## Phase 1: Brainstorm & Research
204
179
 
205
- **Goal**: Set up the project skeleton: build system, directory structure, base configuration, CI, design tokens. The scaffolding that every subsequent task builds on.
180
+ **Goal**: Turn the raw idea into a validated Design Document grounded in research. This ensures Phase 2 architects receive a design, not a guess.
206
181
 
207
- ### Step 2.1 — Project Scaffolding
182
+ **Skip if** Step 0.1 classified context as "Full design"; go straight to Phase 2.
208
183
 
209
- Based on the Architecture Document, set up:
210
- - Project directory structure
211
- - Package manager and dependencies
212
- - Build/dev tooling configuration
213
- - Linting, formatting, type checking config
214
- - Base test framework and first passing test
215
- - Git initialization and .gitignore
216
- - Environment configuration (.env.example)
184
+ ### Step 1.1 — Brainstorming
217
185
 
218
- Use the **DevOps Automator** to define the infrastructure and CI setup.
219
- Use the **Frontend Developer** or **Backend Architect** (as appropriate) to scaffold the actual project.
186
+ Follow the Brainstorm Protocol (`commands/protocols/brainstorm.md`).
220
187
 
221
- Commit: `feat: initial project scaffolding`
188
+ In interactive mode: this is a conversation. Ask questions one at a time, propose approaches with trade-offs, let the user decide. Output: Design Document saved to `docs/plans/`.
222
189
 
223
- ### Step 2.2 — Design System Foundation (if frontend)
190
+ In autonomous mode: synthesize a design document directly using the build request and available context. Pick pragmatic defaults. Log rationale to `docs/plans/build-log.md`.
224
191
 
225
- Launch **UX Architect** to implement:
226
- - CSS design tokens (colors, spacing, typography as variables)
227
- - Base layout components (grid, container, responsive breakpoints)
228
- - Core UI primitives that other components will build on
192
+ ### Step 1.2 — Parallel Research (5 agents, ONE message)
229
193
 
230
- Commit: `feat: design system foundation`
194
+ Skip if context level is "Decision brief" (research already done).
231
195
 
232
- ### Quality Gate 2
196
+ Call the Agent tool 5 times in a single message. Pass each agent the build request AND the Design Document draft.
233
197
 
234
- Run these checks:
235
- - Project builds without errors
236
- - Test framework runs and the initial test passes
237
- - Linting passes clean
238
- - Directory structure matches the Architecture Document
198
+ 1. Description: "Market research" — Prompt: "Research market size (TAM/SAM/SOM), competitive landscape (5-10 players), timing, and market structure for: [build request]. Design context: [paste design doc]. Use web search extensively. Report with a Market Verdict: GREEN/AMBER/RED."
239
199
 
240
- If any fail, fix before proceeding. Present status to user.
200
+ 2. Description: "Tech feasibility" — Prompt: "Evaluate hard technical problems (Solved/Hard/Unsolved), build-vs-buy decisions, MVP scope, and stack validation for: [build request]. Design context: [paste design doc]. Search for APIs and libraries mentioned in the design to verify they exist and are maintained. Report with a Technical Verdict."
241
201
 
242
- **Save state:** Update `docs/plans/.build-state.md`:
243
- ```
244
- Phase: 2 COMPLETE
245
- Foundation: scaffolded, builds clean, tests pass
246
- Next: Phase 3 — Dev↔QA loops
247
- ```
202
+ 3. Description: "User research" — Prompt: "Analyze target persona, jobs-to-be-done, current alternatives, behavioral barriers to adoption for: [build request]. Design context: [paste design doc]. Search for real user complaints and communities discussing this problem. Report with a User Verdict."
203
+
204
+ 4. Description: "Business model" — Prompt: "Evaluate revenue models, unit economics, growth loops, first-1000-users strategy for: [build request]. Design context: [paste design doc]. Search for comparable pricing and growth data. Report with a Business Verdict."
205
+
206
+ 5. Description: "Risk analysis" — Prompt: "Adversarial review: regulatory risk, security concerns, dependency risks, competitive response, top 3 failure modes for: [build request]. Design context: [paste design doc]. Search for enforcement actions and comparable failures. Report with a Risk Verdict."
207
+
208
+ After all 5 return, synthesize a **Research Brief** with a verdict table. Save to `docs/plans/research-brief.md`.
209
+
210
+ ### Step 1.3 — Design Refinement
211
+
212
+ Read the Design Document and Research Brief together. Check for contradictions:
213
+
214
+ - Tech-feasibility flagged "Unsolved" hard problem → simplify or flag as risk
215
+ - Risk-analysis returned RED → add mitigation or descope
216
+ - User-research says "no validated demand" → flag as pivot point
217
+ - Business-model says "no moat" → note for speed-to-market priority
218
+
219
+ Update the Design Document with corrections. Save final version.
220
+
221
+ ### Step 1.4 — Persist Decisions
222
+
223
+ Append key decisions to the project's `CLAUDE.md` (create if needed) under `## Build Decisions`:
224
+
225
+ - Project name and one-line description
226
+ - Primary user and core value prop
227
+ - Tech stack (with rationale)
228
+ - Key constraints or risks
229
+ - MVP scope boundary (in vs. deferred)
248
230
 
249
- Update TodoWrite: mark Phase 2 complete.
231
+ This ensures decisions survive context compaction.
232
+
233
+ ### Quality Gate 1
234
+
235
+ **Autonomous:** Log design and research paths to `docs/plans/build-log.md`. If 2+ RED verdicts, log warning. Proceed.
236
+
237
+ **Interactive:** Present Design Document summary + Research Brief verdict table. Ask: "Approve this design, or want to adjust?" <HARD-GATE>DO NOT PROCEED without user approval.</HARD-GATE>
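The autonomous branch of this gate can be sketched as a function that produces the log entries. This is a sketch under assumptions: the function name and log wording are hypothetical, and it assumes every research verdict uses the GREEN/AMBER/RED scale, which the prompts above spell out only for the Market Verdict.

```python
from typing import Dict, List


def gate1_autonomous_log(design_path: str, research_path: str, verdicts: Dict[str, str]) -> List[str]:
    """Build the build-log entries for Quality Gate 1 in autonomous mode."""
    entries = [f"Quality Gate 1: design={design_path}, research={research_path}"]
    red = sorted(name for name, verdict in verdicts.items() if verdict.upper() == "RED")
    if len(red) >= 2:
        # Two or more RED verdicts are logged as a warning; the build still proceeds.
        entries.append(f"WARNING: {len(red)} RED verdicts ({', '.join(red)})")
    entries.append("Proceeding (autonomous mode)")
    return entries
```

The returned entries would be appended to `docs/plans/build-log.md`; interactive mode has no equivalent shortcut and waits for the user.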
238
+
239
+ Update TodoWrite and `docs/plans/.build-state.md`.
240
+
241
+ **Compaction checkpoint:** Check `dispatches_since_save` in `docs/plans/.build-state.md`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.md`. Reset `dispatches_since_save` to 0. TodoWrite does NOT survive compaction — rebuild it from this state file on resume.
250
242
 
251
243
  ---
252
244
 
253
- ## Phase 3: Build — Dev↔QA Loops
245
+ ## Phase 2: Architecture & Planning
254
246
 
255
- <HARD-GATE>
256
- SENTINEL CHECK — Before starting Phase 3, verify ALL of these:
257
- - Phase 1 quality gate passed (user approved architecture)
258
- - Phase 2 quality gate passed (project builds, tests pass)
259
- - You are dispatching to agents, not coding directly
260
- - `docs/plans/.build-state.md` exists and is current
261
- - TodoWrite has Phases 1 and 2 marked complete
262
-
263
- If ANY check fails, STOP and resolve before continuing.
264
- </HARD-GATE>
247
+ **Goal**: Convert the validated Design Document into a concrete architecture and ordered task list. Every agent receives the Design Document — not just the build request.
265
248
 
266
- **Goal**: Implement every task from the Sprint Task List. Each task goes through a Dev→Test→Review loop. This is where the actual product gets built.
249
+ ### Step 2.1 — Explore (existing codebase only)
267
250
 
268
- **First:** Expand the TodoWrite list: add each task from sprint-tasks.md as a separate todo item under Phase 3.
251
+ If existing code, call the Agent tool with description: "Explore codebase" and prompt: "Explore this codebase. Map architecture layers, file conventions, testing patterns, existing features. Report findings."
269
252
 
270
- **For EACH task in the Sprint Task List, execute this loop:**
253
+ If greenfield, skip to Step 2.2.
271
254
 
272
- ### Step 3.1 — Implement
255
+ ### Step 2.2 — Architecture Design (4 agents in parallel, ONE message)
273
256
 
274
- Select the right developer agent based on task type:
275
- - **Frontend Developer** — UI components, pages, client-side logic
276
- - **Backend Architect** — APIs, database operations, server logic
277
- - **AI Engineer** — ML features, model integration, data pipelines
278
- - **Rapid Prototyper** — Quick integrations, glue code, utility functions
257
+ Read the Design Document and Research Brief. Pass both to every agent.
279
258
 
280
- **Launch a FRESH agent for each task.** Do not reuse agents across tasks — this prevents context pollution.
259
+ Call the Agent tool 4 times in a single message:
281
260
 
282
- The developer agent receives:
283
- - The specific task description and acceptance criteria from the Sprint Task List
284
- - The Architecture Document for context
285
- - Access to all existing code via Read/Grep/Glob tools
261
+ 1. Description: "Backend architecture" — Prompt: "Design system architecture. DESIGN DOC: [paste]. RESEARCH: [paste tech + risk sections]. Include services, data models, API contracts, database schema. Be specific. Respect tech stack and constraints from the design doc."
286
262
 
287
- The developer implements the task and writes tests that verify the acceptance criteria.
263
+ 2. Description: "Frontend architecture" — Prompt: "Design frontend architecture. DESIGN DOC: [paste]. RESEARCH: [paste user research section]. Include component hierarchy, layout, responsive strategy, state management. Align UX with the user persona from research."
288
264
 
289
- Commit after implementation: `feat: [task description]`
265
+ 3. Description: "Security architecture" — Prompt: "Security review. DESIGN DOC: [paste]. RESEARCH: [paste risk section]. Cover auth model, input validation, secrets management, threat model. Address any regulatory risks flagged in research."
290
266
 
291
- ### Step 3.2 — Test & Verify
267
+ 4. Description: "Implementation blueprint" — Prompt: "Implementation blueprint. DESIGN DOC: [paste]. Include specific files to create/modify, build sequence, dependency order. Scope to MVP boundary from design doc."
292
268
 
293
- Launch **Evidence Collector** to verify the implementation:
294
- - Run the tests the developer wrote — do they pass?
295
- - Check the acceptance criteria from the Sprint Task List — is each one met?
296
- - If frontend: take screenshots as visual proof
297
- - Report: **PASS** (all criteria met with evidence) or **FAIL** (specific failures listed)
269
+ After all 4 return, YOU synthesize into one Architecture Document. Save to `docs/plans/architecture.md`.
298
270
 
299
- ### Step 3.3 — Code Review
271
+ ### Step 2.3 — Metric Loop: Architecture Quality
300
272
 
301
- Launch **code-reviewer** (Claude Code agent) to review the implementation:
302
- - Bugs, logic errors, security issues
303
- - Adherence to project conventions from the Architecture Document
304
- - Code quality — is it simple, DRY, readable?
273
+ Run the Metric Loop Protocol (`commands/protocols/metric-loop.md`) on the Architecture Document. Define a metric based on this project — coverage of design doc requirements, specificity, consistency between agents. Max 3 iterations.
305
274
 
306
- ### Step 3.4 — Loop Decision
275
+ ### Step 2.4 — Sprint Planning
307
276
 
308
- **IF Evidence Collector = PASS AND code-reviewer finds no critical issues:**
309
- - Mark task as complete in TodoWrite
310
- - Move to next task
311
- - Reset retry counter
277
+ Follow the Planning Protocol (`commands/protocols/planning.md`). Use 2 sequential Agent tool calls:
312
278
 
313
- **IF Evidence Collector = FAIL OR code-reviewer finds critical issues:**
314
- - Increment retry counter
315
- - Send specific feedback to the developer agent: what failed, what the QA/reviewer found
316
- - Developer fixes and resubmits
317
- - Repeat Steps 3.2-3.3
279
+ Call the Agent tool — description: "Sprint breakdown" — prompt: "Break this architecture into ordered, atomic tasks. Each task needs: description, acceptance criteria, dependencies, size (S/M/L). ARCHITECTURE: [paste]. DESIGN DOC: [paste]. Scope to MVP only."
318
280
 
319
- **IF retry count reaches 3:**
320
- - Stop and escalate to the user with:
321
- - What the task is trying to do
322
- - What keeps failing
323
- - The specific error or QA feedback
324
- - Ask: "Fix manually, skip for now, or redesign the approach?"
281
+ Then call the Agent tool — description: "Validate task list" — prompt: "Validate this task list: [paste]. Check scope is realistic, no missing tasks, descriptions specific enough for a developer agent to execute, all tasks within MVP boundary."
325
282
 
326
- ### Progress Tracking
283
+ Save to `docs/plans/sprint-tasks.md`.
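The task shape requested in the sprint-breakdown prompt (description, acceptance criteria, dependencies, size) can be sketched as a small data structure with a dependency-respecting order. This is an illustrative sketch only: `SprintTask`, its `name` field, and `build_order` are hypothetical names, not part of the package.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import List

VALID_SIZES = {"S", "M", "L"}


@dataclass
class SprintTask:
    """Hypothetical record for one entry in docs/plans/sprint-tasks.md."""
    name: str
    description: str
    acceptance_criteria: List[str]
    dependencies: List[str] = field(default_factory=list)
    size: str = "M"

    def __post_init__(self) -> None:
        if self.size not in VALID_SIZES:
            raise ValueError(f"size must be one of {sorted(VALID_SIZES)}")


def build_order(tasks: List[SprintTask]) -> List[str]:
    """Return task names so that every dependency comes before its dependents."""
    names = {t.name for t in tasks}
    for t in tasks:
        unknown = set(t.dependencies) - names
        if unknown:
            raise ValueError(f"{t.name} depends on unknown tasks: {sorted(unknown)}")
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {t.name: set(t.dependencies) for t in tasks}
    return list(TopologicalSorter(graph).static_order())
```

A cycle in the dependencies raises `graphlib.CycleError`, which is one way a validation pass could flag a task list that cannot be executed in order.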
327
284
 
328
- After each task completes:
285
+ ### Quality Gate 2
329
286
 
330
- 1. Update TodoWrite: mark the task complete.
287
+ **Autonomous:** Log to `docs/plans/build-log.md`. Proceed.
331
288
 
332
- 2. Report to user:
333
- ```
334
- Task [X/total]: [task name] — COMPLETE
335
- Tests: [pass count] passing
336
- Attempts: [retry count]
337
- Next: [next task name]
338
- ```
289
+ **Interactive:** Present Architecture + Sprint Task List. Ask: "Approve to start building, or flag changes?" <HARD-GATE>DO NOT PROCEED without user approval.</HARD-GATE>
339
290
 
340
- 3. **Save state** — Update `docs/plans/.build-state.md`:
341
- ```
342
- Phase: 3 IN PROGRESS
343
- Current task: [X+1]/[total] — [next task name]
344
- Completed: [list of completed tasks]
345
- Retry counter: 0
346
- Agents used this phase: [list]
347
- ```
291
+ Update TodoWrite and `docs/plans/.build-state.md`.
292
+
293
+ **Compaction checkpoint:** Check `dispatches_since_save` in `docs/plans/.build-state.md`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.md`. Reset `dispatches_since_save` to 0. TodoWrite does NOT survive compaction — rebuild it from this state file on resume.
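The checkpoint rule can be sketched as a single function. This is a minimal illustration, not the plugin's implementation: `BuildState`, `compaction_checkpoint`, and the `save` callback are hypothetical names, and only the `dispatches_since_save` counter and the threshold of 8 come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable

SAVE_THRESHOLD = 8  # from the rule: save when dispatches_since_save >= 8


@dataclass
class BuildState:
    """Hypothetical in-memory mirror of docs/plans/.build-state.md."""
    phase: str = ""
    dispatches_since_save: int = 0
    task_statuses: dict = field(default_factory=dict)
    metric_scores: dict = field(default_factory=dict)
    decisions: list = field(default_factory=list)


def compaction_checkpoint(state: BuildState, save: Callable[[BuildState], None]) -> bool:
    """Persist the full state once the counter reaches the threshold.

    Returns True if a save happened. The counter is reset before saving so
    the persisted copy already holds 0 (an assumption about ordering).
    """
    if state.dispatches_since_save < SAVE_THRESHOLD:
        return False
    state.dispatches_since_save = 0
    save(state)
    return True
```

On resume, the same saved state would be the source for rebuilding TodoWrite, since the text notes that TodoWrite does not survive compaction.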
348
294
 
349
295
  ---
350
296
 
351
- ## Phase 4: Harden
297
+ ## Phase 3: Foundation
298
+
299
+ ### Step 3.1 — Scaffolding
300
+
301
+ Call the Agent tool — description: "Project scaffolding" — mode: "bypassPermissions" — prompt: "[COMPLEXITY: M] Set up the project from this architecture: [paste]. Create directory structure, dependencies, build tooling, linting config, test framework with one passing test, .gitignore, .env.example. Commit: 'feat: initial scaffolding'."
302
+
303
+ ### Step 3.2 — Design System (frontend only)
304
+
305
+ Call the Agent tool — description: "Design system setup" — mode: "bypassPermissions" — prompt: "Implement design system foundation from this architecture: [paste frontend section]. Create CSS tokens, base layout components, core UI primitives. Commit: 'feat: design system'."
306
+
307
+ ### Step 3.3 — Metric Loop: Scaffold Health
308
+
309
+ Run the Metric Loop Protocol. Define a metric: builds clean, tests pass, lint clean, structure matches architecture. Max 3 iterations.
310
+
311
+ ### Step 3.4 — Verification Gate
312
+
313
+ Run the Verification Protocol (`commands/protocols/verify.md`). Critical rules (survive compaction):
314
+ - ONE agent runs all 6 checks sequentially: Build → Type-Check → Lint → Test → Security → Diff Review. Stop on first FAIL.
315
+ - Agent auto-detects stack from manifest files (package.json → Node, go.mod → Go, etc.).
316
+ - On FAIL: for build/type/lint errors, use the Build-Fix Protocol (`commands/protocols/build-fix.md`) — fixes one error at a time with cascade detection. For test/security/diff failures, spawn a targeted fix agent. Re-verify. Max 3 fix attempts.
317
+ - On PASS: log `VERIFY: PASS (6/6)` to `docs/plans/.build-state.md`. Proceed.
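The rules above (six checks in order, stop on first FAIL, route the fix by check type, at most 3 fix attempts) can be sketched as follows. The function names, the route labels, and the exact FAIL message are assumptions for illustration; the real behavior lives in `commands/protocols/verify.md`, which is not shown here.

```python
from typing import Callable, Dict, Optional

# Check order as listed in the rules above.
CHECK_ORDER = ["Build", "Type-Check", "Lint", "Test", "Security", "Diff Review"]
# Failures in these checks go to the Build-Fix Protocol; the rest get a targeted fix agent.
BUILD_FIX_CHECKS = {"Build", "Type-Check", "Lint"}


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run checks in order; return the name of the first failure, or None."""
    for name in CHECK_ORDER:
        if not checks[name]():
            return name
    return None


def fix_route(failed_check: str) -> str:
    """Pick the fix path for a failed check (labels are hypothetical)."""
    return "build-fix" if failed_check in BUILD_FIX_CHECKS else "targeted-agent"


def verify(checks: Dict[str, Callable[[], bool]],
           fix: Callable[[str, str], None],
           max_fix_attempts: int = 3) -> str:
    """Verify, then fix and re-verify up to max_fix_attempts times."""
    failed = run_checks(checks)
    attempts = 0
    while failed is not None and attempts < max_fix_attempts:
        fix(failed, fix_route(failed))
        attempts += 1
        failed = run_checks(checks)
    if failed is None:
        return f"VERIFY: PASS ({len(CHECK_ORDER)}/{len(CHECK_ORDER)})"
    return f"VERIFY: FAIL ({failed} after {attempts} fix attempts)"
```

Each re-verification restarts from the first check, so a fix that breaks an earlier check is caught rather than skipped.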
318
+
319
+ Call the Agent tool — description: "Verify scaffolding" — mode: "bypassPermissions" — prompt: "Run the Verification Protocol. Execute all 6 checks sequentially, stop on first failure. Report: VERIFY: PASS or VERIFY: FAIL with details."
320
+
321
+ Do not proceed to Phase 4 until verification passes.
352
322
 
353
- **Goal**: The full product is built. Now stress-test it. This phase finds the bugs, performance issues, security holes, and accessibility failures that task-level QA misses.
323
+ Update TodoWrite and state.
324
+
325
+ **Compaction checkpoint:** Check `dispatches_since_save` in `docs/plans/.build-state.md`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.md`. Reset `dispatches_since_save` to 0. TodoWrite does NOT survive compaction — rebuild it from this state file on resume.
326
+
327
+ ---
328
+
329
+ ## Phase 4: Build — Metric-Driven Dev Loops
354
330
 
355
331
  <HARD-GATE>
356
- Quality Gate: Reality Checker must approve before this phase passes. The Reality Checker defaults to NEEDS WORK and requires overwhelming evidence for approval. Do NOT self-approve.
332
+ Before starting: Phase 2 must be approved, Phase 3 must pass. You MUST call the Agent tool for EVERY task. No exceptions.
357
333
  </HARD-GATE>
358
334
 
359
- ### Step 4.1 — Integration Testing (Parallel)
360
-
361
- Launch simultaneously:
335
+ Expand TodoWrite with each sprint task.
362
336
 
363
- 1. **API Tester** — Comprehensive API validation: all endpoints, edge cases, error responses, auth flows, rate limiting. Run the full test suite.
337
+ **For EACH task:**
364
338
 
365
- 2. **Performance Benchmarker** — Measure response times, identify bottlenecks, test under load if applicable. Flag anything that doesn't meet performance requirements from the Architecture Document.
339
+ ### Step 4.1 — Implement
366
340
 
367
- 3. **Accessibility Auditor** — WCAG compliance audit on all user-facing interfaces. Screen reader testing. Keyboard navigation. Color contrast. Flag every barrier found.
341
+ Call the Agent tool — description: "[task name]" — mode: "bypassPermissions" — prompt: "TASK: [task description + acceptance criteria]. HANDOFF Architecture section: [paste ONLY the relevant section from architecture.md]. Design section: [paste ONLY the relevant section from the design doc]. Previous task output: [what the last completed task produced, if relevant]. Implement fully with real code and tests. Commit: 'feat: [task]'. Report what you built, files changed, and test results."
368
342
 
369
- 4. **Security Engineer** — Security review of the built system: auth implementation, input validation, data exposure, dependency vulnerabilities. Run security scanning tools.
343
+ Pick the right developer framing: frontend, backend, AI, etc. Set `[COMPLEXITY: S/M/L]` based on the task's Size from sprint-tasks.md.
370
344
 
371
- ### Step 4.2 — Fix Critical Issues
345
+ ### Step 4.1b — Cleanup (De-Sloppify)
372
346
 
373
- For each critical issue found in 4.1:
374
- - Route to the appropriate developer agent with the specific finding
375
- - Developer fixes the issue
376
- - The agent that found the issue re-validates
377
- - Dev↔QA loop until the specific issue is resolved
347
+ Follow the Cleanup Protocol (`commands/protocols/cleanup.md`). Critical rules (survive compaction):
348
+ [COMPLEXITY: S]
349
+ - Skip if trivial (< 20 lines, single file).
350
+ - Cleanup agent is a SEPARATE agent from the implementer — no cleaning your own mess.
351
+ - Scope is sacred: ONLY files from the implementation changeset. Zero exceptions.
352
+ - Cleanup fixes: naming, dead code, unused imports, style, DRY. Does NOT: add features, change architecture, touch other files.
353
+ - If cleanup breaks acceptance criteria, revert and skip. Never block the metric loop on cleanup failure.
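Two of the cleanup rules above, the trivial-change skip and the changeset-only scope, lend themselves to a short sketch. The helper names are hypothetical, and reading "< 20 lines, single file" as both conditions holding together is an assumption about the rule's intent.

```python
from typing import Iterable, List

TRIVIAL_LINE_LIMIT = 20  # from the rule: skip if < 20 lines, single file


def should_skip_cleanup(lines_changed: int, files_changed: List[str]) -> bool:
    """Skip cleanup when the change is small AND confined to one file (assumed reading)."""
    return lines_changed < TRIVIAL_LINE_LIMIT and len(files_changed) == 1


def scope_violations(touched_by_cleanup: Iterable[str],
                     implementation_changeset: Iterable[str]) -> List[str]:
    """Return files the cleanup agent touched that were not in the changeset."""
    return sorted(set(touched_by_cleanup) - set(implementation_changeset))
```

A non-empty result from `scope_violations` would be grounds to revert the cleanup, in line with the "revert and skip" rule.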
378
354
 
379
- ### Step 4.3 — Code Quality Pass (Parallel)
355
+ Call the Agent tool — description: "Cleanup [task name]" — mode: "bypassPermissions" — with the list of files changed and the task's acceptance criteria.
380
356
 
381
- Launch simultaneously:
357
+ ### Step 4.2 — Metric Loop: Task Quality
382
358
 
383
- 1. **code-simplifier** (Claude Code) — Simplify any overly complex code while preserving functionality
384
- 2. **type-design-analyzer** (Claude Code) — Review all type definitions for proper encapsulation and invariants
385
- 3. **comment-analyzer** (Claude Code) — Verify all comments are accurate and useful
359
+ Run the Metric Loop Protocol on the task implementation. Define a metric based on the task's acceptance criteria. Max 5 iterations.
386
360
 
387
- Commit any improvements: `refactor: code quality improvements`
361
+ ### Step 4.3 — Loop Exit
388
362
 
389
- ### Step 4.4 — Final Verdict
363
+ On target met: mark task complete in TodoWrite, report "Task X/N: [name] — COMPLETE (score: [final], iterations: [count])".
390
364
 
391
- Launch **Reality Checker** for the final assessment:
392
- - Cross-validate all test results
393
- - Review all QA evidence from Phase 3 and Phase 4
394
- - Check every acceptance criterion from the Sprint Task List
395
- - Verdict: **PRODUCTION READY** or **NEEDS WORK** with specific items
365
+ On stall or max iterations:
366
+ - **Interactive:** present score history + top remaining issue to user.
367
+ - **Autonomous:** accept if score >= 60% of target, skip otherwise. Log to `docs/plans/build-log.md`.
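The stall handling above reduces to a small decision function. This is a sketch under stated assumptions: the metric is treated as "higher is better", and `on_stall` and its return labels are hypothetical names. Only the 60% threshold and the two modes come from the text.

```python
def on_stall(mode: str, score: float, target: float, accept_ratio: float = 0.60) -> str:
    """Decide what to do when a metric loop stalls or hits max iterations.

    Assumes a higher score is better. Return labels are illustrative.
    """
    if mode == "interactive":
        # Present score history and the top remaining issue to the user.
        return "ask-user"
    if mode == "autonomous":
        # Accept if the score reached at least 60% of the target, else skip.
        return "accept" if score >= accept_ratio * target else "skip"
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, with a target of 100 an autonomous run would accept a task that stalls at 61 and skip one that stalls at 59; in both cases the decision would be logged to `docs/plans/build-log.md`.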
396
368
 
397
- ### Quality Gate 4
369
+ After each task: update TodoWrite and `docs/plans/.build-state.md`.
398
370
 
399
- Present to the user:
400
- 1. Reality Checker's verdict
401
- 2. Test results summary (pass/fail counts, coverage)
402
- 3. Performance benchmarks
403
- 4. Security findings (resolved and any remaining)
404
- 5. Accessibility audit results
405
- 6. Any items the Reality Checker flagged as NEEDS WORK
371
+ ### Step 4.4 — Post-Task Verification
406
372
 
407
- **Save state:** Update `docs/plans/.build-state.md`:
408
- ```
409
- Phase: 4 COMPLETE
410
- Reality Checker: [verdict]
411
- ```
373
+ Run the Verification Protocol (`commands/protocols/verify.md`) to catch regressions. If FAIL, fix before starting the next task.
412
374
 
413
- Update TodoWrite: mark Phase 4 complete.
375
+ **Compaction checkpoint:** Check `dispatches_since_save` in `docs/plans/.build-state.md`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.md`. Reset `dispatches_since_save` to 0. TodoWrite does NOT survive compaction — rebuild it from this state file on resume.
414
376
 
415
377
  ---
416
378
 
417
- ## Phase 5: Ship
379
+ ## Phase 5: Harden — Metric-Driven Hardening
418
380
 
419
- **Goal**: Final documentation, clean git history, and handoff.
381
+ ### Step 5.0 — Pre-Hardening Verification
420
382
 
421
- ### Step 5.1 — Documentation
383
+ Run the Verification Protocol (`commands/protocols/verify.md`). ONE agent, 6 sequential checks (Build → Type → Lint → Test → Security → Diff), stop on first FAIL. Max 3 fix attempts. All checks must pass before starting expensive audit agents — do not waste audit agents on code that doesn't build or pass tests.
422
384
 
423
- Launch **Technical Writer**:
424
- - README with setup instructions, architecture overview, and usage
425
- - API documentation (if applicable)
426
- - Any environment/deployment notes
385
+ ### Step 5.1 — Initial Audit (4 agents in parallel, ONE message)
427
386
 
428
- Commit: `docs: add project documentation`
387
+ Call the Agent tool 4 times in one message:
429
388
 
430
- ### Step 5.2 — Final Commit
389
+ 1. Description: "API testing" — Prompt: "Comprehensive API validation: all endpoints, edge cases, error responses, auth flows. Report findings with counts."
431
390
 
432
- Use `/commit` to create a clean final commit with a summary of what was built.
391
+ 2. Description: "Performance audit" — Prompt: "Measure response times, identify bottlenecks, flag performance issues. Report benchmarks."
433
392
 
434
- ### Completion Report
393
+ 3. Description: "Accessibility audit" — Prompt: "WCAG compliance audit on all interfaces. Check screen reader, keyboard nav, contrast. Report issues with counts."
435
394
 
436
- Present to the user:
395
+ 4. Description: "Security audit" — Prompt: "Security review: auth, input validation, data exposure, dependency vulnerabilities. Report findings with severity."
437
396
 
438
- ```
439
- BUILD COMPLETE
440
- ==============
397
+ ### Step 5.1b — Eval Harness
441
398
 
442
- Project: [name]
443
- Tasks: [completed]/[total] ([pass rate]%)
444
- Tests: [count] passing
445
- Commits: [count]
399
+ Run the Eval Harness Protocol (`commands/protocols/eval-harness.md`). Define 8-15 concrete, executable eval cases from the audit findings and architecture doc. Run the eval agent. Record baseline pass rate. CRITICAL and HIGH failures feed into the metric loop in Step 5.2 as specific issues to fix.
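The bookkeeping this step describes (a baseline pass rate, plus CRITICAL and HIGH failures handed to the metric loop) can be sketched as below. `EvalCase` and both helpers are hypothetical names, and any severity level other than CRITICAL and HIGH is an assumption; the actual protocol is defined in `commands/protocols/eval-harness.md`, which is not shown here.

```python
from dataclasses import dataclass
from typing import List

# Severities whose failures feed the metric loop, per the step above.
LOOP_SEVERITIES = {"CRITICAL", "HIGH"}


@dataclass
class EvalCase:
    """Hypothetical record of one executed eval case."""
    name: str
    severity: str  # "CRITICAL", "HIGH", or a lower level (lower levels are assumed)
    passed: bool


def pass_rate(cases: List[EvalCase]) -> float:
    """Fraction of eval cases that passed."""
    if not cases:
        raise ValueError("no eval cases were run")
    return sum(1 for c in cases if c.passed) / len(cases)


def metric_loop_inputs(cases: List[EvalCase]) -> List[str]:
    """Names of failed CRITICAL/HIGH cases, to be fixed in the metric loop."""
    return [c.name for c in cases if not c.passed and c.severity in LOOP_SEVERITIES]
```

Recording `pass_rate` before and after hardening gives the two numbers the final verdict prompt asks for.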
446
400
 
447
- Architecture: [Backend Architect + UX Architect + Security Engineer]
448
- Implementation: [which developer agents were used]
449
- QA: [Evidence Collector + code-reviewer findings]
450
- Hardening: [API Tester + Performance Benchmarker + Accessibility Auditor + Security Engineer]
451
- Final Verdict: [Reality Checker's assessment]
401
+ ### Step 5.2 — Metric Loop: Hardening Quality
452
402
 
453
- Files Created: [count]
454
- Files Modified: [count]
403
+ Run the Metric Loop Protocol on the full codebase using audit findings as initial input. Define a composite metric based on what this project needs. Max 4 iterations.
455
404
 
456
- Remaining Items: [any NEEDS WORK items from Reality Checker]
457
- ```
405
+ When fixing, dispatch to the RIGHT specialist. Security → security agent. Accessibility → frontend agent. Don't send everything to one agent.
406
+
407
+ ### Step 5.2b — Eval Re-run
458
408
 
459
- Update TodoWrite: mark Phase 5 and all items complete.
409
+ Re-run the Eval Harness after the metric loop exits. All CRITICAL eval cases must now pass. If any CRITICAL case still fails, include it as evidence for the Reality Checker.
410
+
411
+ ### Step 5.3 — Reality Check
412
+
413
+ Call the Agent tool — description: "Final verdict" — prompt: "You are the Reality Checker. Default: NEEDS WORK. The hardening loop reached score [final_score] after [iterations] iterations. Score history: [paste table]. Review all evidence. Eval harness results: [baseline pass rate] → [final pass rate]. CRITICAL failures remaining: [list or none]. Verdict: PRODUCTION READY or NEEDS WORK with specifics."
414
+
415
+ <HARD-GATE>Do NOT self-approve. Reality Checker must give the verdict.</HARD-GATE>
416
+
417
+ **Autonomous:** Log verdict to `docs/plans/build-log.md`. Continue.
418
+ **Interactive:** Present score history + verdict to user. Update state.
419
+
420
+ **Compaction checkpoint:** Check `dispatches_since_save` in `docs/plans/.build-state.md`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.md`. Reset `dispatches_since_save` to 0. TodoWrite does NOT survive compaction — rebuild it from this state file on resume.
421
+
422
+ ---
423
+
424
+ ## Phase 6: Ship
425
+
426
+ ### Step 6.0 — Pre-Ship Verification
427
+
428
+ Final verification gate. Run the Verification Protocol (`commands/protocols/verify.md`). ONE agent, 6 sequential checks (Build → Type → Lint → Test → Security → Diff), stop on first FAIL. Max 3 fix attempts. All checks must pass before documenting and shipping. If FAIL persists, return to Phase 5 for targeted fixes.
429
+
430
+ ### Step 6.1 — Documentation
431
+
432
+ Call the Agent tool — description: "Documentation" — mode: "bypassPermissions" — prompt: "Write project docs: README with setup/architecture/usage, API docs if applicable, deployment notes. Commit: 'docs: project documentation'."
433
+
434
+ ### Step 6.2 — Metric Loop: Documentation Quality
435
+
436
+ Run the Metric Loop Protocol on documentation. Define a metric based on completeness and whether a new developer could follow the README. Max 3 iterations.
437
+
438
+ ### Step 6.3 — Record Learnings
439
+
440
+ Append to `docs/plans/learnings.md` (create if it doesn't exist). Review the build and record 3-5 learnings:
441
+
442
+ - **PATTERN:** [what worked well and should be repeated in future builds]
443
+ - **PITFALL:** [what failed, caused waste, or required excessive iterations]
444
+ - **HEURISTIC:** [project-specific tuning discovered during this build]
445
+
446
+ Base learnings on: metric loop stall patterns, build-fix frequency, phases that exceeded expected iterations, agent prompts that needed rework.
447
+
448
+ ### Completion Report
449
+
450
+ Create final commit. Present:
460
451
 
461
- **Save final state:** Update `docs/plans/.build-state.md`:
462
452
  ```
463
- Phase: 5 COMPLETE — BUILD DONE
453
+ BUILD COMPLETE
454
+ Project: [name] | Tasks: [done]/[total] | Tests: [count] passing
455
+ Agents used: [list] | Verdict: [Reality Checker result]
456
+ Metric loops run: [count] | Avg iterations: [N]
457
+ Remaining: [any NEEDS WORK items]
464
458
  ```
459
+
460
+ Mark all TodoWrite items complete. Update `docs/plans/.build-state.md`: "Phase: 6 COMPLETE."