claude-raid 0.2.7 → 0.2.9

Files changed (39)
  1. package/README.md +84 -23
  2. package/bin/cli.js +4 -2
  3. package/package.json +1 -1
  4. package/src/descriptions.js +10 -7
  5. package/src/init.js +36 -5
  6. package/src/merge-settings.js +53 -2
  7. package/src/remove.js +1 -1
  8. package/src/setup.js +32 -0
  9. package/src/ui.js +1 -0
  10. package/src/update.js +26 -3
  11. package/template/.claude/agents/archer.md +18 -4
  12. package/template/.claude/agents/rogue.md +18 -4
  13. package/template/.claude/agents/warrior.md +18 -4
  14. package/template/.claude/agents/wizard.md +32 -5
  15. package/template/.claude/dungeon-master-rules.md +120 -31
  16. package/template/.claude/hooks/raid-lib.sh +45 -4
  17. package/template/.claude/hooks/raid-pre-compact.sh +8 -4
  18. package/template/.claude/hooks/raid-session-end.sh +2 -2
  19. package/template/.claude/hooks/raid-session-start.sh +2 -0
  20. package/template/.claude/hooks/rtk-bridge.sh +46 -0
  21. package/template/.claude/hooks/validate-dungeon.sh +11 -3
  22. package/template/.claude/hooks/validate-file-naming.sh +6 -1
  23. package/template/.claude/hooks/validate-no-placeholders.sh +13 -2
  24. package/template/.claude/hooks/validate-write-gate.sh +7 -2
  25. package/template/.claude/party-rules.md +91 -65
  26. package/template/.claude/skills/raid-browser/SKILL.md +3 -5
  27. package/template/.claude/skills/raid-browser-chrome/SKILL.md +1 -1
  28. package/template/.claude/skills/raid-canonical-design/SKILL.md +309 -162
  29. package/template/.claude/skills/raid-canonical-implementation/SKILL.md +157 -132
  30. package/template/.claude/skills/raid-canonical-implementation-plan/SKILL.md +196 -141
  31. package/template/.claude/skills/raid-canonical-prd/SKILL.md +92 -89
  32. package/template/.claude/skills/raid-canonical-protocol/SKILL.md +29 -123
  33. package/template/.claude/skills/raid-canonical-review/SKILL.md +292 -148
  34. package/template/.claude/skills/raid-debugging/SKILL.md +1 -7
  35. package/template/.claude/skills/raid-init/SKILL.md +7 -5
  36. package/template/.claude/skills/raid-tdd/SKILL.md +5 -5
  37. package/template/.claude/skills/raid-teambuff/SKILL.md +6 -24
  38. package/template/.claude/skills/raid-verification/SKILL.md +0 -6
  39. package/template/.claude/skills/raid-wrap-up/SKILL.md +30 -29
@@ -5,248 +5,395 @@ description: "Use when Phase 2 (Design) begins in a Canonical Quest, after PRD i
5
5
 
6
6
  # Raid Design — Phase 2
7
7
 
8
- Turn ideas into battle-tested designs through agent-driven adversarial exploration.
8
+ Turn ideas into battle-tested designs through the writer/reviewer/defend-concede protocol.
9
9
 
10
10
  <HARD-GATE>
11
- Do NOT write any code, scaffold any project, or take any implementation action until the Wizard has approved the design and it is committed to git. All assigned agents participate. Agents communicate via SendMessage — do not spawn subagents.
11
+ Do NOT write any code, scaffold any project, or take any implementation action until the design is approved and committed.
12
12
  </HARD-GATE>
13
13
 
14
14
  ## Scope Check
15
15
 
16
- Before asking detailed questions, assess scope. If the request describes multiple independent subsystems (e.g., "build a platform with chat, file storage, billing, and analytics"), flag this immediately. Don't spend rounds refining details of a project that needs decomposition first.
16
+ Before dispatching agents, assess scope. If the request describes multiple independent subsystems (e.g., "build a platform with chat, file storage, billing, and analytics"), flag it immediately. Don't spend rounds refining a project that needs decomposition first.
17
17
 
18
- If too large for a single design: help the human decompose into sub-quests. Each sub-quest gets its own design → plan → implementation cycle. Design the first sub-quest through the normal flow.
19
-
20
- ## Mode Behavior
21
-
22
- - **Full Raid**: All 3 agents explore from different angles, fight directly, pin findings to Dungeon. Full design doc required.
23
- - **Skirmish**: 2 agents explore and interact, produce a lightweight design+plan combined doc.
24
- - **Scout**: Wizard assesses inline, no design doc required. Skip this skill entirely.
18
+ If too large for a single design: decompose into sub-quests with the human. Each sub-quest gets its own design → plan → implementation cycle.
25
19
 
26
20
  ## Process Flow
27
21
 
28
22
  ```dot
29
23
  digraph design {
30
- "Wizard comprehends request (reads 3x)" -> "Scope check";
31
- "Scope check" -> "Too large?" [shape=diamond];
32
- "Too large?" -> "Decompose into sub-projects" [label="yes"];
33
- "Decompose into sub-projects" -> "Brainstorm first sub-project";
34
- "Too large?" -> "Explore project context" [label="no"];
35
- "Explore project context" -> "Research dependencies";
36
- "Research dependencies" -> "Ask clarifying questions (one at a time)";
37
- "Ask clarifying questions (one at a time)" -> "Wizard opens Dungeon + dispatches";
38
- "Wizard opens Dungeon + dispatches" -> "Agents explore, challenge, build freely";
39
- "Agents explore, challenge, build freely" -> "Agents pin verified findings to Dungeon";
40
- "Agents pin verified findings to Dungeon" -> "Dungeon sufficient?" [shape=diamond];
41
- "Dungeon sufficient?" -> "Agents explore, challenge, build freely" [label="no"];
42
- "Dungeon sufficient?" -> "Wizard closes: synthesizes 2-3 approaches from Dungeon" [label="yes"];
43
- "Wizard closes: synthesizes 2-3 approaches from Dungeon" -> "Present design (section by section)";
44
- "Present design (section by section)" -> "Human approves?" [shape=diamond];
45
- "Human approves?" -> "Present design (section by section)" [label="revise"];
46
- "Human approves?" -> "Write design doc" [label="yes"];
47
- "Write design doc" -> "Adversarial spec review (agents attack directly)";
48
- "Adversarial spec review (agents attack directly)" -> "Spec self-review (fix inline)";
49
- "Spec self-review (fix inline)" -> "Human reviews written spec";
50
- "Human reviews written spec" -> "Commit + invoke raid-canonical-implementation-plan" [shape=doublecircle];
24
+ "Wizard comprehends request + scope check" -> "Explore codebase, ask human questions";
25
+ "Explore codebase, ask human questions" -> "Phase recap (PRD if exists)";
26
+ "Phase recap (PRD if exists)" -> "Roll dice for phase turn order";
27
+ "Roll dice for phase turn order" -> "Scaffold design.md template + create phase-2-design.md";
28
+ "Scaffold design.md template + create phase-2-design.md" -> "ROUND 1: Agent 1 WRITES initial design";
29
+ "ROUND 1: Agent 1 WRITES initial design" -> "Agent 2 REVIEWS, pins findings";
30
+ "Agent 2 REVIEWS, pins findings" -> "Agent 3 REVIEWS, pins findings";
31
+ "Agent 3 REVIEWS, pins findings" -> "Wizard evaluates, optionally intervenes";
32
+ "Wizard evaluates, optionally intervenes" -> "ROUND 2: Agent 1 DEFEND/CONCEDE, writes V2";
33
+ "ROUND 2: Agent 1 DEFEND/CONCEDE, writes V2" -> "Agents 2+3 review V2";
34
+ "Agents 2+3 review V2" -> "Wizard evaluates — Round 3 needed?" [shape=diamond];
35
+ "Wizard evaluates — Round 3 needed?" -> "ROUND 3 (FINAL): same cycle" [label="critical findings"];
36
+ "Wizard evaluates — Round 3 needed?" -> "Drift check: design.md vs prd.md" [label="solid"];
37
+ "ROUND 3 (FINAL): same cycle" -> "Drift check: design.md vs prd.md";
38
+ "Drift check: design.md vs prd.md" -> "Extract final design.md";
39
+ "Extract final design.md" -> "Present to human" -> "Approved?" [shape=diamond];
40
+ "Approved?" -> "Ask why, explain to agents, more rounds" [label="no"];
41
+ "Ask why, explain to agents, more rounds" -> "ROUND 2: Agent 1 DEFEND/CONCEDE, writes V2";
42
+ "Approved?" -> "Commit + report with file links" [label="yes"];
43
+ "Commit + report with file links" -> "Load raid-canonical-implementation-plan" [shape=doublecircle];
51
44
  }
52
45
  ```
53
46
 
54
47
  ## Wizard Checklist
55
48
 
56
- Complete in order:
57
-
58
49
  1. **Comprehend the request** — read 3 times, identify the real problem beneath the stated one
59
- 2. **Scope check** — if the request describes multiple independent subsystems, flag it immediately
50
+ 2. **Scope check** — if multiple independent subsystems, decompose first
60
51
  3. **Explore project context** — files, docs, recent commits, dependencies, conventions, patterns
61
- 4. **Research dependencies** — API surface, versioning, compatibility, known issues. Read docs COMPLETELY.
62
- 5. **Ask clarifying questions** — one at a time to the human, eliminate every ambiguity
63
- 6. **Open the Dungeon** — create `{questDir}/phase-2-design.md` (scoreboard) with Phase 2 header, quest, mode. Read `{questDir}/prd.md` if it exists.
64
- 7. **Dispatch with angles** — send each agent their angle via SendMessage, then go silent:
65
- ```
66
- SendMessage(to="warrior", message="DISPATCH: [quest]. Your angle: [X]...")
67
- SendMessage(to="archer", message="DISPATCH: [quest]. Your angle: [Y]...")
68
- SendMessage(to="rogue", message="DISPATCH: [quest]. Your angle: [Z]...")
69
- ```
70
- 8. **Round 1: Research** — agents explore their angles independently in their own panes. Pin findings to Dungeon. Signal `ROUND_COMPLETE:`. **Stop.** Agents do NOT self-initiate cross-testing. You receive messages automatically. Intervene only on protocol violations.
71
- 9. **Round 2: Cross-testing** — when ALL agents have flagged `ROUND_COMPLETE:`, dispatch explicit cross-verification assignments. Each agent challenges specific findings from the others. Signal `ROUND_COMPLETE:` when done. **Stop.**
72
- 10. **Repeat if needed** — if more exploration is needed, dispatch a new research round with refined angles
73
- 11. **Close the phase** — broadcast `HOLD`. Close when Dungeon has sufficient verified findings to form 2-3 approaches
74
- 12. **Synthesize approaches** — propose 2-3 approaches from Dungeon evidence, with trade-offs and recommendation
75
- 13. **Present design section by section** — scale each section to its complexity (a few sentences if straightforward, up to 200-300 words if nuanced). Ask the human after each section: "Does this look right so far?" Be ready to revise before moving on. Cover: architecture, components, data flow, error handling, testing.
76
- 14. **Write design doc** — save to `{questDir}/design.md` (separate from the phase scoreboard). May also create `{questDir}/design-diagrams.md` for mermaid charts.
77
- 15. **Adversarial spec review** — agents attack the written spec directly, challenging each other
78
- 16. **Spec self-review** — fix issues inline (see checklist below)
79
- 17. **Human reviews written spec** — human approves before proceeding
80
- 18. **Commit** — `docs(quest-{slug}): phase 2 design — {summary}`
81
- 19. **Transition** — invoke `raid-canonical-implementation-plan`
82
-
83
- ## Opening the Dungeon (Phase Scoreboard)
84
-
85
- Create `{questDir}/phase-2-design.md` — this is the **dungeon scoreboard**, not the deliverable. It tracks discoveries, battles, and shared knowledge from agent exploration. Every line in Discoveries/Active Battles must use a recognized prefix (`DUNGEON:`, `UNRESOLVED:`, `BLACKCARD:`, `RESOLVED:`, `TASK:`). Freeform content is only allowed in Resolved, Shared Knowledge, and Escalations sections.
52
+ 4. **Ask clarifying questions** — one at a time to the human, eliminate every ambiguity
53
+ 5. **Phase recap** — summarize PRD findings and deliverable (read `{questDir}/spoils/prd.md` if it exists). Or summarize the exploration context if PRD was skipped. Present to agents and human.
54
+ 6. **Roll dice** — randomly shuffle `["warrior", "archer", "rogue"]` for this phase's turn order. Update raid-session via Bash using the jq command from protocol "Dice Roll Reference". Announce: *"The dice have spoken. Turn order for this phase: {agent1} → {agent2} → {agent3}."*
55
+ 7. **Scaffold documents** — create `{questDir}/spoils/design.md` (template) and `{questDir}/phases/phase-2-design.md` (evolution log)
56
+ 8. **Run rounds** — see Round Protocol below
57
+ 9. **Drift check** — compare final design with `prd.md` (if exists). See Drift Detection below.
58
+ 10. **Extract final** — polish the final version into clean `design.md` from the evolution in `phase-2-design.md`
59
+ 11. **Present to human** — walk through the design. If not approved: ask why, understand, explain feedback to agents, run more rounds, re-extract. Repeat until approved.
60
+ 12. **Commit** — `docs(quest-{slug}): phase 2 design — {summary}`
61
+ 13. **Report** — link both `design.md` and `phase-2-design.md` file paths
62
+ 14. **Transition** — load `raid-canonical-implementation-plan`
63
+
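The dice roll in step 6 amounts to an unbiased shuffle of three names. A minimal JavaScript sketch, assuming a Fisher-Yates shuffle: `rollTurnOrder` and `announce` are hypothetical helper names, not functions from claude-raid, and the real skill records the order in raid-session through a jq command that this diff does not show.

```javascript
// Hypothetical helpers illustrating the "Roll dice" step; not part of claude-raid.
const AGENTS = ["warrior", "archer", "rogue"];

// Fisher-Yates shuffle on a copy, so the input array is left untouched.
// `random` is injectable to make the roll reproducible in tests.
function rollTurnOrder(agents = AGENTS, random = Math.random) {
  const order = [...agents];
  for (let i = order.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }
  return order;
}

// Formats the announcement text quoted in the checklist.
function announce(order) {
  return `The dice have spoken. Turn order for this phase: ${order.join(" → ")}.`;
}

console.log(announce(rollTurnOrder()));
```

The first name in the returned array becomes the WRITER for the phase; the other two review in order.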
64
+ ## Dispatch Templates
65
+
66
+ Dispatch carries only dynamic context the agent can't get from party-rules or the phase file's embedded comments. Keep dispatch lean: detailed instructions are in the scaffolded document sections.
67
+
68
+ **Writer (Round 1, Turn 1):**
69
+ ```
70
+ TURN_DISPATCH: Phase 2 Design, Round 1, Turn 1.
71
+ Quest: {description}
72
+ Phase recap: {summary of PRD/prior findings}
73
+ Your role: WRITER. Your section: "Version 1 — @{name} [R1]"
74
+
75
+ FIRST: Read the FULL document at {questDir}/phases/phase-2-design.md before writing anything.
76
+ Understand the structure, read the embedded instructions in your section, and read the
77
+ Writing Guidance at the bottom. Then read {questDir}/spoils/prd.md (if exists) + codebase.
78
+ THEN: Write in your designated section following the embedded instructions.
79
+ ```
80
+
81
+ **Reviewer (Round 1, Turns 2-3):**
82
+ ```
83
+ TURN_DISPATCH: Phase 2 Design, Round 1, Turn {T}.
84
+ Quest: {description}
85
+ {prior agent} just wrote Version 1.
86
+ Your role: REVIEWER. Your section: "@{name} [R1] Review"
87
+
88
+ FIRST: Read the FULL document at {questDir}/phases/phase-2-design.md before writing anything.
89
+ Understand the structure, read Version 1, read the embedded instructions in your review section.
90
+ THEN: Write your review in your designated section following the embedded instructions.
91
+ ```
92
+
93
+ **Writer (Round 2+, Defend/Concede):**
94
+ ```
95
+ TURN_DISPATCH: Phase 2 Design, Round {N}, Turn 1.
96
+ Quest: {description}
97
+ Round {N-1} reviews are in from @{reviewer1} and @{reviewer2}.
98
+ Your role: WRITER. Sections: "Defend/Concede — @{name} [R{N}]" then "Version {N} — @{name} [R{N}]"
99
+
100
+ FIRST: Read the FULL document at {questDir}/phases/phase-2-design.md.
101
+ Read every finding from Round {N-1}. Read the embedded instructions in your sections.
102
+ THEN: Respond to each finding, then write Version {N}.
103
+ ```
104
+
105
+ **Reviewer (Round 2+, Turns 2-3):**
106
+ ```
107
+ TURN_DISPATCH: Phase 2 Design, Round {N}, Turn {T}.
108
+ Quest: {description}
109
+ {writer} responded with DEFEND/CONCEDE and wrote Version {N}.
110
+ Your role: REVIEWER. Your section: "@{name} [R{N}] Review"
111
+
112
+ FIRST: Read the FULL document at {questDir}/phases/phase-2-design.md.
113
+ Read Version {N}, the defend/concede responses, and your embedded instructions.
114
+ THEN: Write your review in your designated section.
115
+ ```
116
+
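Because each dispatch is fixed text plus a handful of dynamic fields, it can be rendered from a small template function. The sketch below fills the Round 1 reviewer template above; `reviewerDispatch` and its parameter names are assumptions made for illustration, and nothing in the skill requires the Wizard to generate dispatches programmatically.

```javascript
// Hypothetical renderer for the "Reviewer (Round 1, Turns 2-3)" template.
function reviewerDispatch({ turn, description, priorAgent, name, questDir }) {
  const doc = `${questDir}/phases/phase-2-design.md`;
  return [
    `TURN_DISPATCH: Phase 2 Design, Round 1, Turn ${turn}.`,
    `Quest: ${description}`,
    `${priorAgent} just wrote Version 1.`,
    `Your role: REVIEWER. Your section: "@${name} [R1] Review"`,
    "",
    `FIRST: Read the FULL document at ${doc} before writing anything.`,
    "Understand the structure, read Version 1, read the embedded instructions in your review section.",
    "THEN: Write your review in your designated section following the embedded instructions.",
  ].join("\n");
}

console.log(
  reviewerDispatch({
    turn: 2,
    description: "Add rate limiting to the public API", // made-up example quest
    priorAgent: "warrior",
    name: "archer",
    questDir: "quests/example-quest", // made-up example path
  })
);
```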
117
+ ## Round Protocol
118
+
119
+ ### Round 1: Write + Review
120
+
121
+ **Agent 1 (dice-first) — WRITES the initial design:**
122
+ - Receives the PRD (or exploration context), codebase findings, and the `design.md` template
123
+ - Writes the complete initial design applying their unique lens
124
+ - Signs all work: `@{name} [R1]`
125
+ - Output goes to the "Version 1" section of `phase-2-design.md`
126
+ - Signals `TURN_COMPLETE:`
127
+
128
+ **Agent 2 — REVIEWS Agent 1's work:**
129
+ - Reads Agent 1's design in `phase-2-design.md`
130
+ - Writes review in the "Review — Round 1" section, pins findings
131
+ - Challenges gaps, weak assumptions, missing edge cases — from their unique lens
132
+ - Signs all findings: `@{name} [R1]`
133
+ - Signals `TURN_COMPLETE:`
134
+
135
+ **Agent 3 — REVIEWS both prior works:**
136
+ - Reads Agent 1's design AND Agent 2's review
137
+ - Writes their own review section, building on or challenging Agent 2's findings
138
+ - Signs all findings: `@{name} [R1]`
139
+ - Signals `TURN_COMPLETE:`
140
+
141
+ **Wizard evaluates Round 1:**
142
+ - Reads all work. Ultrathink synthesis.
143
+ - Optionally intervenes on the document — with human approval, explaining why. But only if needed; if the document is in good shape, move to Round 2.
144
+
145
+ ### Round 2: Defend/Concede + Review
146
+
147
+ **Agent 1 — DEFEND or CONCEDE each finding, write Version 2:**
148
+ - Reads every finding from Agents 2 and 3
149
+ - Responds to **each one** explicitly:
150
+ - `DEFEND:` — counter-evidence showing the approach is correct
151
+ - `CONCEDE:` — acknowledge the issue, commit to addressing it
152
+ - Writes Version 2 incorporating all conceded findings
153
+ - May intentionally mark specific findings as false positives (with explanation)
154
+ - Signs: `@{name} [R2]`
155
+ - Signals `TURN_COMPLETE:`
156
+
157
+ **Agents 2+3 — Review Version 2:**
158
+ - Same review pattern as Round 1, but now evaluating the V2 and the defend/concede responses
159
+ - Sign: `@{name} [R2]`
160
+
161
+ **Wizard evaluates Round 2:**
162
+ - If no critical or high-relevance findings remain → close
163
+ - If breaking concerns exist → announce Round 3 as FINAL: *"This is the final round. Make every move count."*
164
+
165
+ ### Round 3 (if needed): Final Round
166
+
167
+ Same cycle. Wizard makes clear this is the FINAL round — agents have limited moves, so every one must count. After Round 3, the Wizard closes regardless.
168
+
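The Wizard's close-or-continue decision across rounds can be summarized as a small function. This is a sketch under assumptions: findings are modeled as objects with a `severity` field and a `resolved` flag, neither of which the skill defines, and `nextStep` is a hypothetical name.

```javascript
// Hypothetical model of the Wizard's evaluation after each round.
// Assumption: each finding carries a severity and a resolved flag.
const BLOCKING = new Set(["critical", "high"]);
const MAX_ROUNDS = 3;

function nextStep(round, findings) {
  const open = findings.filter((f) => !f.resolved && BLOCKING.has(f.severity));
  // Round 1 is always followed by Round 2 (defend/concede).
  if (round < 2) return { action: "run-round", round: round + 1, final: false };
  // After the final round the Wizard closes regardless of what is still open.
  if (round >= MAX_ROUNDS) return { action: "close", blocking: open.length };
  // After Round 2: close if nothing blocking remains, otherwise one FINAL round.
  if (open.length === 0) return { action: "close", blocking: 0 };
  return { action: "run-round", round: 3, final: true };
}

console.log(nextStep(2, [{ severity: "critical", resolved: false }]));
```

The hard cap is the point of the design: Round 3 is announced as FINAL so that debate cannot continue indefinitely.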
169
+ ## Evolution Log Template
170
+
171
+ Scaffold `{questDir}/phases/phase-2-design.md`. Replace `{writer}`, `{reviewer1}`, `{reviewer2}` with actual agent names from the dice roll:
86
172
 
87
173
  ```markdown
88
- # Phase 2: Design
89
- ## Quest: <task description from human>
90
- ## Mode: <Full Raid | Skirmish>
91
- ## PRD: <link to prd.md if it exists>
174
+ # Phase 2: Design — Evolution Log
175
+
176
+ ## Quest: [quest description]
177
+ ## Quest Type: Canonical Quest
178
+ ## Turn Order: @{agent1} → @{agent2} → @{agent3}
92
179
 
93
- ### Discoveries
180
+ ## References
181
+ - PRD: `{questDir}/spoils/prd.md` (if exists)
94
182
 
95
- ### Active Battles
183
+ ## Quest Goal
184
+ <!-- Wizard writes 2-3 lines: what the design phase aims to produce,
185
+ key constraints from the PRD, and the main architectural question to answer -->
96
186
 
97
- ### Resolved
187
+ ---
98
188
 
99
- ### Shared Knowledge
189
+ ## Version 1 — @{writer} [R1]
190
+
191
+ <!-- @{writer}: WRITER for this phase. Read references above first.
192
+ Fill EVERY section. Scale depth to complexity (simple→bullets, complex→full detail).
193
+ Make reasoning explicit — reviewers will challenge everything. -->
194
+
195
+ ### Problem Restatement
196
+ <!-- Restate the problem in technical terms. How does it manifest in the codebase?
197
+ What specific code, systems, or flows are affected? -->
198
+
199
+ ### Requirements Summary
200
+ <!-- Numbered list extracted from PRD. Each requirement that this design must satisfy.
201
+ If no PRD exists, derive from the wizard's context. -->
202
+
203
+ ### Constraints
204
+ <!-- Technical: language, framework, infrastructure, backwards compatibility.
205
+ Business: timeline, compliance, dependencies on other teams.
206
+ Only constraints that affect design decisions. -->
207
+
208
+ ### Architecture
209
+ <!-- Scale depth to complexity:
210
+ - Describe the main components/modules and how they connect
211
+ - Show data flow: what enters, what's processed, what exits
212
+ - Define key interfaces between components
213
+ - For complex features: include sequence of operations, state transitions
214
+ - Reference existing code patterns in the codebase where you're extending them
215
+ - Call out what's NEW vs what EXTENDS existing code -->
216
+
217
+ ### File Structure
218
+ <!-- Map of files to create or modify:
219
+ | File | Action | Purpose |
220
+ |------|--------|---------|
221
+ Use the project's existing structure as the guide. -->
222
+
223
+ ### Error Handling Strategy
224
+ <!-- What errors can occur at each boundary?
225
+ How is each error surfaced to the user or calling code?
226
+ What's the recovery path? What's unrecoverable? -->
227
+
228
+ ### Testing Strategy
229
+ <!-- What types of tests? (unit, integration, e2e)
230
+ What's the mocking strategy?
231
+ What are the critical paths that MUST have test coverage?
232
+ When browser.enabled: which flows need Playwright tests? -->
233
+
234
+ ### Edge Cases
235
+ <!-- Catalog by category:
236
+ - Boundary: empty input, max values, zero, negative
237
+ - State: concurrent access, partial failure, interrupted operations
238
+ - Input: malformed data, unicode, unexpected types
239
+ - Environment: network failure, timeout, missing dependencies
240
+ Only include edge cases relevant to THIS feature. -->
241
+
242
+ ### Alternatives Considered
243
+ <!-- At least 2 alternatives to your chosen approach.
244
+ For each: what it is, why it was rejected (specific technical reason). -->
100
245
 
101
- ### Escalations
102
- ```
246
+ ---
103
247
 
104
- ## Question Chain
248
+ ## Review — Round 1
105
249
 
106
- **Agents NEVER ask the human directly.** The question flow is:
107
- 1. Agent discovers they need clarification → sends `WIZARD:` with the question
108
- 2. Wizard reasons: can I answer this confidently from the PRD, codebase, or prior context?
109
- 3. If yes → answer the agent directly via SendMessage
110
- 4. If unsure → digest the question, formulate it clearly for the human, ask human
111
- 5. Wizard passes human's answer back to agents with his own interpretation added
112
- 6. Goal: minimize questions to human, batch related questions
250
+ ### @{reviewer1} [R1] Review
113
251
 
114
- ## Dispatch Pattern
252
+ <!-- @{reviewer1}: REVIEWER. Read Version 1, then verify claims against actual code.
253
+ For each finding: 1) WHAT is wrong 2) WHY it matters 3) WHAT should change.
254
+ Use FINDING:/CHALLENGE:/BUILDING: signals. Sign @{reviewer1} [R1]. -->
115
255
 
116
- Each agent gets the same objective but a different starting angle. After dispatch, the Wizard goes silent.
256
+ ### @{reviewer2} [R1] Review
117
257
 
118
- **DISPATCH:**
258
+ <!-- @{reviewer2}: REVIEWER. Read Version 1 + @{reviewer1}'s review.
259
+ Find what was missed. Challenge with evidence. Don't repeat — add new value. -->
119
260
 
120
- > **@Warrior**: Explore from the data/infrastructure side. What are the hard technical constraints? What schemas, migrations, APIs are needed? What breaks if we get this wrong? Find the structural load-bearing walls. Challenge @Archer and @Rogue's findings directly. Pin verified findings to the Dungeon.
121
- >
122
- > **@Archer**: Explore from the integration/consistency side. How does this fit with existing patterns? What implicit contracts exist? What ripple effects? Trace the dependency chain. Check naming and file structure conventions. Challenge @Warrior and @Rogue's findings directly. Pin verified findings to the Dungeon.
123
- >
124
- > **@Rogue**: Explore from the failure/adversarial side. What assumptions about inputs, state, timing, availability? Build failure scenarios. What does a malicious user do? What does a slow network do? What does concurrent access do? Challenge @Warrior and @Archer's findings directly. Pin verified findings to the Dungeon.
125
- >
126
- > **All**: Read the Dungeon. Build on each other's discoveries. Challenge everything. Pin only what survives. Escalate to me with `WIZARD:` only when genuinely stuck.
261
+ ### Wizard [R1] Synthesis
262
+ <!-- Wizard evaluates the round. Key findings, open questions,
263
+ direction for Round 2. Optional interventions (with human approval). -->
127
264
 
128
- ## Design Principles
265
+ ---
129
266
 
130
- - **Isolation:** Break into units with one clear purpose, well-defined interfaces, testable independently. For each unit: what does it do, how do you use it, what does it depend on?
131
- - **Encapsulation:** Can someone understand a unit without reading its internals? Can you change internals without breaking consumers? If not, the boundaries need work.
132
- - **Size:** Smaller, well-bounded units are easier to reason about. When a file grows large, that's a signal it's doing too much.
133
- - **Existing codebases:** Explore current structure first. Follow existing patterns. Only include targeted improvements where they serve the current goal — no unrelated refactoring.
267
+ ## Defend/Concede — @{writer} [R2]
134
268
 
135
- ## What Agents Must Cover
269
+ <!-- @{writer}: Respond to EACH finding from both reviewers.
270
+ DEFEND: [ref] — counter-evidence. CONCEDE: [ref] — what you'll fix in V2.
271
+ No silent ignoring. Every finding gets a response. -->
136
272
 
137
- Every agent addresses ALL of these from their assigned angle:
273
+ ## Version 2 — @{writer} [R2]
138
274
 
139
- - **Performance** — scale, bottlenecks, complexity
140
- - **Robustness** — retries, fallbacks, graceful degradation
141
- - **Reliability** — blast radius of failure, production-readiness
142
- - **Testability** — meaningful tests, mock strategy, test-friendly design. When `browser.enabled`: can this feature be E2E tested with Playwright? What user flows need browser verification? Are there loading states, client-side routing, or visual states that unit tests can't catch?
143
- - **Error handling** — what errors occur, how surfaced, UX of failure
144
- - **Edge cases** — empty, null, boundary, Unicode, timezones, large payloads
145
- - **Cascading effects** — blast radius, what else changes
146
- - **Clean architecture** — separation of concerns, single responsibility, dependency inversion
147
- - **Modularity & composability** — replaceable, extensible, composable
148
- - **DRY** — duplicating logic? reuse existing code?
149
- - **Dependencies** — version compatibility, security, maintenance, licensing
275
+ <!-- @{writer}: Incorporate all conceded findings into a revised design.
276
+ Mark what changed from V1 and why.
277
+ Defended items remain as-is; state why they survived challenge. -->
150
278
 
151
- ## The Fight — What Makes It Productive
279
+ [Same sections as Version 1]
152
280
 
153
- ```
154
- Agents interact DIRECTLY — @Name addressing, building, challenging, roasting:
155
- 1. Present findings with EVIDENCE (file paths, docs, concrete examples)
156
- 2. Challenge other agents DIRECTLY with COUNTER-EVIDENCE (not opinions)
157
- 3. Build on each other's discoveries — BUILDING: with independent verification
158
- 4. Go to the EDGES — push every finding to its extreme
159
- 5. LEARN from each other — incorporate discoveries into your model
160
- 6. Pin verified findings — DUNGEON: only after surviving challenge
161
- 7. Challenge weak analysis — back every challenge with your own independent evidence
162
- 8. Escalate to Wizard — WIZARD: only when genuinely stuck
163
- ```
281
+ ---
164
282
 
165
- **The goal is not to tear each other down. The goal is to forge the strongest design by testing it from every angle. The Dungeon captures what survived.**
283
+ ## Review — Round 2
166
284
 
167
- ## Closing the Phase
285
+ ### @{reviewer1} [R2] Review
286
+ <!-- @{reviewer1}: Focus on Version 2 changes and the defend/concede responses.
287
+ Did @{writer} address your findings adequately?
288
+ Are the defenses valid? Are the concessions properly incorporated?
289
+ Any NEW issues introduced by the changes? -->
168
290
 
169
- The Wizard closes when the Dungeon has sufficient verified findings — enough Discoveries, Shared Knowledge, and Resolved battles to synthesize 2-3 approaches.
291
+ ### @{reviewer2} [R2] Review
292
+ <!-- @{reviewer2}: Same focus. Challenge defenses you disagree with.
293
+ Confirm concessions were properly incorporated. -->
170
294
 
171
- **How the Wizard knows it's time to close:**
172
- - Dungeon has verified findings covering all major aspects (performance, robustness, testability, etc.)
173
- - Active Battles section is empty or has only minor unresolved points
174
- - Agents are converging — new findings are variations, not revelations
175
- - Shared Knowledge section has the foundational truths the design needs
295
+ ### Wizard [R2] Synthesis
296
+ <!-- Wizard evaluates. If critical findings remain, announce Round 3 as FINAL.
297
+ If solid, proceed to extraction. -->
176
298
 
177
- **RULING:** Synthesize from Dungeon evidence. Propose 2-3 approaches. Recommend one. Archive Dungeon.
299
+ ---
178
300
 
179
- ## Spec Self-Review
301
+ ## Final Extraction Notes — Wizard
302
+ <!-- What was incorporated into design.md and why.
303
+ What was intentionally excluded and why.
304
+ Drift check result against prd.md (if exists). -->
180
305
 
181
- After writing the design doc, the Wizard reviews with fresh eyes:
306
+ ---
182
307
 
183
- 1. **Placeholder scan:** Any TBD, TODO, incomplete sections, vague requirements? Fix them.
184
- 2. **Internal consistency:** Do any sections contradict each other? Architecture match feature descriptions?
185
- 3. **Scope check:** Focused enough for a single implementation plan, or needs decomposition?
186
- 4. **Ambiguity check:** Could any requirement be interpreted two ways? Pick one and make it explicit.
308
+ ## Writing Guidance
309
+ - Sign all work: `@{name} [R{N}]`
310
+ - Evidence-based: file paths, line numbers, concrete examples; no opinions without proof
311
+ - No placeholders: no TBD, TODO, or vague references
312
+ - Scale depth to complexity — a few sentences if straightforward, detailed if nuanced
313
+ - Reviewers: respond to EVERY finding with DEFEND: or CONCEDE:
314
+ - Each review must add NEW value — don't repeat what prior reviewers said
315
+ ```
187
316
 
188
- Fix issues inline.
317
+ **Round 3:** If needed, the wizard appends Round 3 sections to the evolution log before dispatching. Do NOT pre-scaffold Round 3.
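The Defend/Concede section of the template requires a response to every finding, which lends itself to a mechanical check. A minimal sketch, assuming findings are referenced by ids such as `F1` (the skill only says `[ref]`, so the id format is an assumption) and that each response is a line beginning with `DEFEND:` or `CONCEDE:`; `unansweredFindings` is a hypothetical helper.

```javascript
// Hypothetical check: which finding ids have no DEFEND:/CONCEDE: response?
// Assumption: response lines look like "DEFEND: F1 ..." or "CONCEDE: [F2] ...".
function unansweredFindings(findingIds, defendConcedeText) {
  const answered = new Set();
  for (const line of defendConcedeText.split("\n")) {
    const match = line.trim().match(/^(DEFEND|CONCEDE):\s*\[?([A-Za-z0-9_-]+)\]?/);
    if (match) answered.add(match[2]);
  }
  return findingIds.filter((id) => !answered.has(id));
}

const section = [
  "DEFEND: F1 - the retry wrapper already caps attempts",
  "CONCEDE: [F3] - will add the missing timeout in V2",
].join("\n");

// F2 has no response, so it is reported as silently ignored.
console.log(unansweredFindings(["F1", "F2", "F3"], section));
```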
189
318
 
190
- ## Design Document Structure (Phase Deliverable)
319
+ ## Design Document Template
191
320
 
192
- The actual design doc is a **separate file**: `{questDir}/design.md`. This file is not validated by the dungeon hook and can contain freeform markdown. Write it when closing the phase — synthesize from scoreboard findings and agent exploration.
321
+ Scaffold `{questDir}/spoils/design.md` (wizard-only, a clean deliverable extracted from the evolution log):
193
322
 
194
323
  ```markdown
195
324
  # [Feature Name] Design Specification
196
325
 
197
326
  **Date:** YYYY-MM-DD
198
327
  **Status:** Draft | Under Review | Approved
199
- **Raid Team:** Wizard (dungeon master), [agents used]
200
- **Mode:** Full Raid | Skirmish
328
+ **Quest Type:** Canonical Quest
201
329
 
202
330
  ## Problem Statement
203
331
  ## Requirements (numbered, unambiguous)
204
332
  ## Constraints
205
- ## Dungeon Findings (verified, from Phase 1 Dungeon)
206
- ### Key Discoveries (survived cross-testing)
207
- ### Lessons Learned (wrong assumptions corrected)
208
- ## Design Decision
209
- ### Alternatives Considered (2-3 with rejection reasons)
210
333
  ## Architecture
211
334
  ## File Structure
212
335
  ## Error Handling Strategy
213
336
  ## Testing Strategy
214
337
  ## Edge Cases
215
338
  ## Future Considerations (NOT building now, designing to accommodate)
339
+ ## Design Decision
340
+ ### Alternatives Considered (with rejection reasons)
216
341
  ## RULING
217
342
  ```
218
343
 
219
- ## Red Flags — Thoughts That Signal Violations
344
+ ## What Agents Must Cover
220
345
 
221
- | Thought | Reality |
222
- |---------|---------|
223
- | "This is too simple to need a design" | Simple projects are where unexamined assumptions cause the most waste. |
224
- | "I already know the right approach" | Knowing and verifying are different. Propose 2-3 anyway. |
225
- | "Let's just start coding and figure it out" | Code without design becomes the design. And it's usually wrong. |
226
- | "The agents all agree, let's move on" | Agreement without challenge is groupthink. Did they actually cross-test? |
227
- | "I'll wait for the Wizard to tell me what to do" | You own the phase. Explore, challenge, build. Self-organize. |
228
- | "Let me just post everything to the Dungeon" | Only verified, challenged findings get pinned. |
229
- | "I need the Wizard to mediate this disagreement" | Talk to the other agent directly first. Escalate only if stuck. |
346
+ Every agent addresses ALL of these from their assigned angle:
230
347
 
231
- ## Escalation
348
+ - **Performance** — scale, bottlenecks, complexity
349
+ - **Robustness** — retries, fallbacks, graceful degradation
350
+ - **Testability** — meaningful tests, mock strategy, test-friendly design
351
+ - **Error handling** — what errors occur, how surfaced, UX of failure
352
+ - **Edge cases** — empty, null, boundary, Unicode, timezones, large payloads
353
+ - **Cascading effects** — blast radius, what else changes
354
+ - **Clean architecture** — separation of concerns, single responsibility
355
+ - **Dependencies** — version compatibility, security, licensing
232
356
 
233
- If the team is stuck on a fundamental design choice after genuine direct debate:
234
- 1. Present the top 2 options with trade-offs to the human
235
- 2. Let the human decide
236
- 3. Never ask the human to resolve something the team should handle
357
+ ## Drift Detection
237
358
 
238
- ---
359
+ Before closing, the Wizard compares `design.md` with `prd.md` (if it exists). If the design contradicts or omits a PRD requirement without explicit rationale, that's drift.
360
+
361
+ If drift is detected, present these options to the human:
362
+ - **(a)** Change PRD to match design — the design exploration revealed the PRD was wrong
363
+ - **(b)** Change design to match PRD — the design drifted from the original intent
364
+ - **(c)** Something else — explain the situation, let the human decide
365
+
366
+ ## Design Principles
367
+
368
+ - **Isolation:** Break into units with one clear purpose, well-defined interfaces, testable independently.
369
+ - **Encapsulation:** Can someone understand a unit without reading its internals?
370
+ - **Size:** When a file grows large, that's a signal it's doing too much.
371
+ - **Existing codebases:** Follow existing patterns. Only improve where it serves the current goal.
372
+
373
+ ## Red Flags
374
+
375
+ | Thought | Reality |
376
+ |---------|---------|
377
+ | "This is too simple to need a design" | Simple projects hide unexamined assumptions. |
378
+ | "I already know the right approach" | Knowing and verifying are different. |
379
+ | "The agents all agree after one round" | Minimum 2 rounds. Agreement without challenge is groupthink. |
380
+ | "Let me silently ignore that finding" | Every finding must get DEFEND: or CONCEDE:. No silent ignoring. |
381
+ | "Good enough, let's move on" | Present to human. Only they decide when it's good enough. |
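The "no silent ignoring" rule in the new Red Flags table could be spot-checked mechanically. The sketch below is only an illustration, not part of the package: it assumes findings in the evolution log start a line with a hypothetical `FINDING:` marker, while `DEFEND:` and `CONCEDE:` are the response markers named in the table. It compares counts only, so it cannot tell which finding a response belongs to.

```shell
#!/usr/bin/env bash
# Sketch: flag an evolution log whose findings outnumber its responses.
# ASSUMPTION: "FINDING:" is a hypothetical marker chosen for illustration;
# DEFEND: and CONCEDE: come from the Red Flags table in the diff.

check_responses() {
  local log="$1"
  [ -f "$log" ] || { echo "missing log: $log" >&2; return 2; }

  local findings responses
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  findings=$(grep -cE '^[[:space:]]*FINDING:' "$log" || true)
  responses=$(grep -cE '^[[:space:]]*(DEFEND|CONCEDE):' "$log" || true)

  if [ "$responses" -lt "$findings" ]; then
    echo "unanswered findings: $((findings - responses))"
    return 1
  fi
  echo "all $findings findings answered"
}
```

A caller might run `check_responses "$questDir/phases/phase-2-design.md"` before presenting the design to the human, treating a non-zero exit as a prompt to revisit the log rather than as a hard gate.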
239
382
 
240
383
  ## Phase Transition
241
384
 
242
385
  When the design is approved and committed:
243
386
 
244
- 1. Update `.claude/raid-session` phase via Bash (write gate blocks Write/Edit on this file):
387
+ 1. Update raid-session phase via Bash:
245
388
  ```bash
246
389
  jq '.phase="plan"' .claude/raid-session > .claude/raid-session.tmp && mv .claude/raid-session.tmp .claude/raid-session
247
390
  ```
248
391
  2. **Commit:** `docs(quest-{slug}): phase 2 design — {summary}`
249
- 3. **Send phase report to human:** summarize key design decisions, trade-offs resolved, what's next
250
- 4. **Load the `raid-canonical-implementation-plan` skill now and begin Phase 3.**
392
+ 3. **Report:** Send the human the file paths for `design.md` and `phase-2-design.md`.
393
+ 4. **Load `raid-canonical-implementation-plan` and begin Phase 3.**
394
+
395
+ ## Phase Spoils
251
396
 
252
- Do not wait. Do not ask. The next action after committing the design doc is loading the next skill.
397
+ **Two outputs:**
398
+ - `{questDir}/phases/phase-2-design.md` — Full evolution timeline (all versions, reviews, defend/concede responses)
399
+ - `{questDir}/spoils/design.md` — Clean final design specification (wizard-polished)
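The two outputs listed under Phase Spoils lend themselves to a quick existence check before the phase is closed. The following is a minimal sketch under stated assumptions: the function name `check_phase_spoils` is hypothetical, and the quest directory is passed in by the caller. Only the two relative paths come from the diff.

```shell
#!/usr/bin/env bash
# Sketch: confirm both Phase 2 outputs exist and are non-empty.
# ASSUMPTION: check_phase_spoils is an illustrative helper, not a hook
# shipped with the package; the two paths are the ones listed in the diff.

check_phase_spoils() {
  local quest_dir="$1" missing=0 f
  for f in "phases/phase-2-design.md" "spoils/design.md"; do
    # -s is true only for files that exist and have a size greater than zero.
    if [ ! -s "$quest_dir/$f" ]; then
      echo "missing or empty: $quest_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_phase_spoils "$questDir"` returns 0 only when both files are present, which makes it easy to chain in front of the commit step with `&&`.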