claude-raid 0.2.7 → 0.2.9

Files changed (39)
  1. package/README.md +84 -23
  2. package/bin/cli.js +4 -2
  3. package/package.json +1 -1
  4. package/src/descriptions.js +10 -7
  5. package/src/init.js +36 -5
  6. package/src/merge-settings.js +53 -2
  7. package/src/remove.js +1 -1
  8. package/src/setup.js +32 -0
  9. package/src/ui.js +1 -0
  10. package/src/update.js +26 -3
  11. package/template/.claude/agents/archer.md +18 -4
  12. package/template/.claude/agents/rogue.md +18 -4
  13. package/template/.claude/agents/warrior.md +18 -4
  14. package/template/.claude/agents/wizard.md +32 -5
  15. package/template/.claude/dungeon-master-rules.md +120 -31
  16. package/template/.claude/hooks/raid-lib.sh +45 -4
  17. package/template/.claude/hooks/raid-pre-compact.sh +8 -4
  18. package/template/.claude/hooks/raid-session-end.sh +2 -2
  19. package/template/.claude/hooks/raid-session-start.sh +2 -0
  20. package/template/.claude/hooks/rtk-bridge.sh +46 -0
  21. package/template/.claude/hooks/validate-dungeon.sh +11 -3
  22. package/template/.claude/hooks/validate-file-naming.sh +6 -1
  23. package/template/.claude/hooks/validate-no-placeholders.sh +13 -2
  24. package/template/.claude/hooks/validate-write-gate.sh +7 -2
  25. package/template/.claude/party-rules.md +91 -65
  26. package/template/.claude/skills/raid-browser/SKILL.md +3 -5
  27. package/template/.claude/skills/raid-browser-chrome/SKILL.md +1 -1
  28. package/template/.claude/skills/raid-canonical-design/SKILL.md +309 -162
  29. package/template/.claude/skills/raid-canonical-implementation/SKILL.md +157 -132
  30. package/template/.claude/skills/raid-canonical-implementation-plan/SKILL.md +196 -141
  31. package/template/.claude/skills/raid-canonical-prd/SKILL.md +92 -89
  32. package/template/.claude/skills/raid-canonical-protocol/SKILL.md +29 -123
  33. package/template/.claude/skills/raid-canonical-review/SKILL.md +292 -148
  34. package/template/.claude/skills/raid-debugging/SKILL.md +1 -7
  35. package/template/.claude/skills/raid-init/SKILL.md +7 -5
  36. package/template/.claude/skills/raid-tdd/SKILL.md +5 -5
  37. package/template/.claude/skills/raid-teambuff/SKILL.md +6 -24
  38. package/template/.claude/skills/raid-verification/SKILL.md +0 -6
  39. package/template/.claude/skills/raid-wrap-up/SKILL.md +30 -29
@@ -5,232 +5,376 @@ description: "Use when Phase 5 (Review) begins in a Canonical Quest, after imple
5
5
 
6
6
  # Raid Review — Phase 5 (Optional)
7
7
 
8
- Three reviewers, three angles, zero mercy. Pinning then fixing. Black cards for the unfixable.
8
+ Two sub-phases: **Review** (find issues, build fix plan) then **Fix Session** (execute fixes). The review digests all prior deliverables — PRD, Design, Plan, Implementation — and verifies the implementation is correct, complete, and coherent.
9
9
 
10
10
  <HARD-GATE>
11
- This phase is OPTIONAL — the Wizard asks the human before entering. All assigned agents review the ENTIRE implementation independently, then attack each other's findings. Use `raid-verification` before any completion claims. No subagents.
11
+ This phase is OPTIONAL — the Wizard asks the human before entering. All agents review the ENTIRE implementation. Use `raid-verification` before any completion claims.
12
12
  </HARD-GATE>
13
13
 
14
- ## Mode Behavior
15
-
16
- - **Full Raid**: 3 independent reviews, then agents fight directly over findings. All severity levels enforced.
17
- - **Skirmish**: 1 agent reviews + Wizard. Cross-testing between reviewer and Wizard.
18
- - **Scout**: Wizard reviews alone. Checks against requirements and runs tests.
19
-
20
14
  ## Process Flow
21
15
 
22
16
  ```dot
23
17
  digraph review {
24
- "Wizard reads design doc, plan, Phase 4 implementation log" -> "Wizard opens review board";
25
- "Wizard opens Dungeon + dispatches" -> "Agents review independently";
26
- "Agents review independently" -> "Agents fight over findings directly";
27
- "Agents fight over findings directly" -> "Agents challenge missing findings";
28
- "Agents challenge missing findings" -> "Agents pin severity-classified issues to Dungeon";
29
- "Agents pin severity-classified issues to Dungeon" -> "Wizard closes: categorizes surviving issues";
30
- "Wizard closes: categorizes surviving issues" -> "Critical or Important?" [shape=diamond];
31
- "Critical or Important?" -> "Assign fixes" [label="yes"];
32
- "Assign fixes" -> "Fix + verify + challengers re-attack";
33
- "Fix + verify + challengers re-attack" -> "Wizard closes: categorizes surviving issues";
34
- "Critical or Important?" -> "Wizard final ruling" [label="no"];
35
- "Wizard final ruling" -> "Commit + invoke raid-wrap-up" [shape=doublecircle];
18
+ "Wizard reads all prior deliverables" -> "Phase recap (PRD + Design + Plan + Implementation)";
19
+ "Phase recap (PRD + Design + Plan + Implementation)" -> "SUB-PHASE A: REVIEW";
20
+
21
+ subgraph cluster_review {
22
+ label="Sub-phase A: Review";
23
+ "Roll dice for review turn order" -> "ROUND 1: Agent 1 reviews, pins findings";
24
+ "ROUND 1: Agent 1 reviews, pins findings" -> "Agent 2 adversary-tests findings + adds own";
25
+ "Agent 2 adversary-tests findings + adds own" -> "Agent 3 reviews + adds own";
26
+ "Agent 3 reviews + adds own" -> "Wizard evaluates Round 1";
27
+ "Wizard evaluates Round 1" -> "ROUND 2: Agent 1 converges findings, proposes fix plan";
28
+ "ROUND 2: Agent 1 converges findings, proposes fix plan" -> "Agents 2+3 attack fix plan";
29
+ "Agents 2+3 attack fix plan" -> "Wizard evaluates — Round 3?" [shape=diamond];
30
+ "Wizard evaluates — Round 3?" -> "ROUND 3 (FINAL)" [label="critical gaps"];
31
+ "Wizard evaluates — Round 3?" -> "Extract review.md fix plan" [label="solid"];
32
+ "ROUND 3 (FINAL)" -> "Extract review.md fix plan";
33
+ }
34
+
35
+ "SUB-PHASE A: REVIEW" -> "Roll dice for review turn order";
36
+ "Extract review.md fix plan" -> "Human approves fix plan?" [shape=diamond];
37
+ "Human approves fix plan?" -> "Ask why, revise" [label="no"];
38
+ "Ask why, revise" -> "Extract review.md fix plan";
39
+ "Human approves fix plan?" -> "Fixes needed?" [shape=diamond];
40
+ "Fixes needed?" -> "SUB-PHASE B: FIX SESSION" [label="yes"];
41
+ "Fixes needed?" -> "Commit + load wrap-up" [label="no fixes", shape=doublecircle];
42
+
43
+ subgraph cluster_fix {
44
+ label="Sub-phase B: Fix Session";
45
+ "Fresh dice roll for fix order" -> "Agent 1 makes fixes from review.md, reports";
46
+ "Agent 1 makes fixes from review.md, reports" -> "Agent 2 reviews fixes, reports";
47
+ "Agent 2 reviews fixes, reports" -> "Agent 3 reviews fixes, reports";
48
+ "Agent 3 reviews fixes, reports" -> "Wizard evaluates fixes";
49
+ "Wizard evaluates fixes" -> "More fix rounds?" [shape=diamond];
50
+ "More fix rounds?" -> "Next fix round" [label="yes"];
51
+ "More fix rounds?" -> "Wizard extracts results" [label="done"];
52
+ }
53
+
54
+ "SUB-PHASE B: FIX SESSION" -> "Fresh dice roll for fix order";
55
+ "Wizard extracts results" -> "Present to human";
56
+ "Present to human" -> "Commit + load wrap-up" [shape=doublecircle];
36
57
  }
37
58
  ```
38
59
 
39
- ## Wizard Checklist
60
+ ## Sub-phase A: Review
61
+
62
+ ### Wizard Checklist (Review)
63
+
64
+ 1. **Prepare** — gather all prior deliverables: PRD, design.md, task files, phase-4-implementation.md, git diff range
65
+ 2. **Phase recap** — summarize all prior phases. Present to agents and human.
66
+ 3. **Roll dice** — randomly shuffle `["warrior", "archer", "rogue"]` for the review turn order. Update raid-session via Bash using the jq command from protocol "Dice Roll Reference". Announce: *"The dice have spoken. Review turn order: {agent1} → {agent2} → {agent3}."*
67
+ 4. **Create evolution log** — `{questDir}/phases/phase-5-review.md`
68
+ 5. **Run rounds** — see Round Protocol below
69
+ 6. **Extract fix plan** — polish into `{questDir}/spoils/review.md`
70
+ 7. **Present to human** for approval
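Step 3 of the checklist above asks for a random shuffle of the three reviewer names. A minimal sketch of that shuffle, assuming `shuf` from GNU coreutils is available. The function name `roll_turn_order` is hypothetical and not part of claude-raid, and the raid-session update itself is left to the jq command in the protocol's "Dice Roll Reference", which is not reproduced here:

```bash
# Hypothetical helper (not part of claude-raid): print the three reviewers
# in a random order, one name per line. Assumes `shuf` from GNU coreutils.
roll_turn_order() {
  printf '%s\n' warrior archer rogue | shuf
}

# Join the shuffled names onto one line, then split them into $1 $2 $3.
order=$(roll_turn_order | paste -sd' ' -)
set -- $order
echo "The dice have spoken. Review turn order: $1 → $2 → $3."
```

The unquoted `$order` relies on word splitting, which is safe here only because the agent names contain no spaces.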
71
+
72
+ ### Dispatch Templates (Review)
73
+
74
+ Dispatch carries only dynamic context. Detailed instructions (severity format, checklist, finding structure) are embedded in the scaffolded phase file.
75
+
76
+ **Reviewer (Round 1):**
77
+ ```
78
+ TURN_DISPATCH: Phase 5 Review, Round 1, Turn {T}.
79
+ Quest: {description}
80
+ Phase recap: {summary of all prior phases — what was built, key decisions}
81
+ Your role: REVIEWER. Your section: "@{name} [R1]"
82
+
83
+ FIRST: Read the FULL document at {questDir}/phases/phase-5-review.md to understand the structure.
84
+ Read the embedded instructions in your section. Then read the code changes (git diff),
85
+ {questDir}/spoils/design.md, and task files.
86
+ THEN: Write your review in your designated section following the embedded instructions.
87
+ ```
88
+
89
+ **Fix Plan Writer (Round 2, Turn 1):**
90
+ ```
91
+ TURN_DISPATCH: Phase 5 Review, Round 2, Turn 1.
92
+ Quest: {description}
93
+ All Round 1 findings are in.
94
+ Your role: converge findings into fix plan. Your section: "@{name} [R2] — Converged Fix Plan"
95
+
96
+ FIRST: Read the FULL document at {questDir}/phases/phase-5-review.md.
97
+ Read all Round 1 findings. Read the embedded instructions in your section.
98
+ THEN: Write the converged fix plan following the embedded instructions.
99
+ ```
40
100
 
41
- 1. **Prepare** — gather git range, design doc, plan doc, read `{questDir}/phase-4-implementation.md`
42
- 2. **Open the review board** — create `{questDir}/phase-5-review.md`
43
- 3. **Dispatch** — all agents review independently, then interact directly
44
- 4. **Observe the fight** — agents challenge findings and missing findings directly
45
- 5. **Close** — categorize surviving issues by severity from Dungeon
46
- 6. **Browser inspection** — dispatch agents to inspect in Chrome (if `browser.enabled`)
47
- 7. **Observe browser fights** — agents cross-verify findings on separate instances
48
- 8. **Rule on fixes** — Critical and Important must be fixed (code AND browser)
49
- 9. **Verify fixes** — targeted re-attack after fixes (use `raid-verification`)
50
- 10. **Final ruling** — approved or rejected
51
- 11. **Commit** — `fix(quest-{slug}): phase 5 review — {N} findings resolved`
52
- 12. **Transition** — invoke `raid-wrap-up`
101
+ **Fix Session dispatch:**
102
+ ```
103
+ TURN_DISPATCH: Phase 5 Fix Session, Round 1, Turn {T}.
104
+ Quest: {description}
105
+ Fix plan: {questDir}/spoils/review.md
106
+
107
+ FIRST: Read the FULL document at {questDir}/phases/phase-5-review.md to find
108
+ the Fix Session section and your embedded instructions. Then read the fix plan.
109
+ THEN: Execute your role following the embedded instructions.
110
+ TDD enforced — load raid-tdd. Signal TURN_COMPLETE with status when done.
111
+ ```
53
112
 
54
- ## Opening the Dungeon
113
+ ### Evolution Log Template (Sub-phase A)
55
114
 
56
- Create `{questDir}/phase-5-review.md`:
115
+ Scaffold `{questDir}/phases/phase-5-review.md`. Replace agent name placeholders with actual names from dice roll:
57
116
 
58
117
  ```markdown
59
- # Phase 5: Review
60
- ## Quest: Full adversarial review of <feature> implementation
61
- ## Mode: <Full Raid | Skirmish>
118
+ # Phase 5: Review — Evolution Log
62
119
 
63
- ### Discoveries
120
+ ## Quest: [quest description]
121
+ ## Quest Type: Canonical Quest
122
+ ## Turn Order (Review): @{agent1} → @{agent2} → @{agent3}
64
123
 
65
- ### Active Battles
124
+ ## References
125
+ - PRD: `{questDir}/spoils/prd.md` (if exists)
126
+ - Design: `{questDir}/spoils/design.md`
127
+ - Tasks: `{questDir}/spoils/tasks/phase-3-plan-task-*.md`
128
+ - Implementation: `{questDir}/phases/phase-4-implementation.md`
66
129
 
67
- ### Resolved
130
+ ## Quest Goal
131
+ <!-- Wizard writes 2-3 lines: what this review must verify,
132
+ total file count from implementation, key risk areas to focus on -->
68
133
 
69
- ### Shared Knowledge
134
+ ---
70
135
 
71
- ### Escalations
72
- ```
136
+ ## Sub-phase A: Review
73
137
 
74
- ## Dispatch
138
+ ### @{agent1} [R1] — Full Implementation Review
75
139
 
76
- **DISPATCH:**
140
+ <!-- @{agent1}: FIRST REVIEWER. Read ACTUAL CODE — not reports.
141
+ For each finding: [Severity] `file:line` — what, why, proposed fix.
142
+ Example: [Critical] `src/auth/handler.ts:23` — missing validation. Fix: add zod schema.
143
+ Checklist: requirements, code quality, testing, architecture, naming, production. -->
77
144
 
78
- > **@Warrior**: Review full implementation. Run every test. Check error handling at every boundary. Verify all requirements from design doc. Find the bugs that crash in production. Then fight @Archer and @Rogue over their findings.
79
- >
80
- > **@Archer**: Review full implementation. Does it match the design doc exactly? Patterns consistent? Interfaces correct? Types sound? Naming conventions followed? File structure clean? Find the bugs that silently produce wrong results. Then fight @Warrior and @Rogue.
81
- >
82
- > **@Rogue**: Review full implementation. Think like an attacker. What inputs break it? What timing causes races? What happens when dependencies fail? Find the bugs nobody else will find. Then fight @Warrior and @Archer.
83
- >
84
- > **All**: Review independently first, then fight directly. Challenge each other's findings AND each other's blind spots. Pin severity-classified issues to Dungeon with `DUNGEON:`. Reference the Phase 3 Dungeon for context.
145
+ ### @{agent2} [R1] — Adversarial Review
85
146
 
86
- ## Review Checklist — Each Agent
147
+ <!-- @{agent2}: ADVERSARIAL REVIEWER. Verify @{agent1}'s findings against actual code.
148
+ Challenge severity if overblown. Add findings @{agent1} missed. Don't repeat.
149
+ Same format: [Severity] `file:line` — what, why, fix. -->
87
150
 
88
- **Requirements:** Every design doc requirement implemented? No extras (YAGNI)? Nothing misinterpreted?
151
+ ### @{agent3} [R1] — Final Review Pass
89
152
 
90
- **Code Quality:** Clean separation? Error handling at every boundary? DRY? Clear names?
153
+ <!-- @{agent3}: FINAL REVIEWER. Read all prior findings. Challenge what you disagree with.
154
+ Find what BOTH reviewers missed. Same format: [Severity] `file:line` — what, why, fix. -->
91
155
 
92
- **Testing:** Every function tested? Edge cases? Failure paths? All passing?
156
+ ### Wizard [R1] Synthesis
157
+ <!-- Wizard categorizes all surviving findings by severity.
158
+ Counts: N Critical, N Important, N Minor.
159
+ Direction for Round 2. -->
93
160
 
94
- **Architecture:** Design decisions implemented correctly? Interfaces match spec? No drift?
161
+ ---
95
162
 
96
- **Naming & Structure:** Consistent naming? File system follows conventions? Modules clean?
163
+ ### @{agent1} [R2] — Converged Fix Plan
97
164
 
98
- **Production:** Performance OK? External calls have timeouts? No secrets in code?
165
+ <!-- @{agent1}: Read EVERY finding from all reviewers (R1).
166
+ Your job is to produce a SINGLE converged fix plan.
99
167
 
100
- ## Verification Protocol for Findings
168
+ 1. Group all findings by severity (Critical → Important → Minor)
169
+ 2. Within each group, order by domain/file for efficient fixing
170
+ 3. For each finding: confirm, mark as false positive (with evidence), or merge duplicates
171
+ 4. Propose concrete fix for each confirmed finding
172
+ 5. Note execution order (dependencies between fixes)
101
173
 
102
- Before acting on ANY finding (yours or a teammate's):
103
- 1. **READ:** Complete the finding without reacting
104
- 2. **VERIFY:** Check against codebase reality — read the actual code at the referenced location
105
- 3. **EVALUATE:** Is this technically sound for THIS codebase? Does fixing it break something else?
106
- 4. **RESPOND:** Technical evidence or reasoned pushback
174
+ Format per finding:
175
+ **[Critical-1]** `src/auth/handler.ts:23` — Missing input validation
176
+ - Found by: @{agent2} [R1], confirmed by @{agent3} [R1]
177
+ - Fix: Add zod schema validation in validateToken() before line 23
178
+ - Blocked by: none -->
107
179
 
108
- ## No Performative Agreement
180
+ ### @{agent2} [R2] — Fix Plan Review
109
181
 
110
- NEVER respond with "You're absolutely right!" or "Great point!" or "Good catch!"
111
- Instead: state the technical finding, show evidence, or push back.
112
- Actions speak. Fix and show — don't compliment.
182
+ <!-- @{agent2}: Review fix plan. Are fixes correct? Execution order right?
183
+ Challenge false positive designations. Flag dropped findings. -->
113
184
 
114
- If a finding IS correct: `"Fixed. [Brief description of what changed]."` or just fix it silently.
185
+ ### @{agent3} [R2] — Fix Plan Review
115
186
 
116
- ## YAGNI Check on Findings
187
+ <!-- @{agent3}: Same focus. Challenge what @{agent2} missed.
188
+ Confirm or dispute false positive designations. -->
117
189
 
118
- Before implementing a "professional improvement" suggestion:
119
- - Grep codebase for actual usage of the component
120
- - If unused: suggest removing (YAGNI): `"This endpoint isn't called. Remove it?"`
121
- - If used: implement properly
122
- - Don't gold-plate during review
190
+ ### Wizard [R2] Synthesis
191
+ <!-- Wizard evaluates the fix plan. If solid → extract to review.md.
192
+ If critical gaps → announce Round 3 as FINAL. -->
123
193
 
124
- ## The Fight — Agents Challenge Each Other
194
+ ---
125
195
 
126
- After independent reviews, agents fight DIRECTLY over findings AND missing findings:
196
+ ## Final Extraction Notes — Wizard
197
+ <!-- What was incorporated into review.md.
198
+ False positives excluded and why.
199
+ Total findings: N confirmed, N false positives, N deferred. -->
127
200
 
128
- - `CHALLENGE: @Archer, you gave the auth module a pass but didn't check the session rotation path — review it now.`
129
- - `BUILDING: @Warrior, your finding about the missing error handler — the impact is worse than you stated because...`
130
- - `CHALLENGE: @Rogue, your "Critical" severity on the naming inconsistency is overblown — here's why it's actually Minor...`
131
- - `DUNGEON: [Critical] handler.js:23 — missing input validation allows injection. Verified by @Warrior and @Rogue.`
201
+ ---
132
202
 
133
- **Agents classify severity when pinning to Dungeon:**
203
+ ## Writing Guidance
204
+ - Sign all work: `@{name} [R{N}]`
205
+ - Read ACTUAL CODE — not summaries, not reports, not commit messages
206
+ - Every finding needs: severity, location, what, why, proposed fix
207
+ - No performative agreement — no "Great catch!" Just evidence or pushback.
208
+ - Reviewers: challenge severity classifications, not just content
209
+ - Fix plan must be actionable — concrete fixes, not "improve error handling"
210
+ ```
134
211
 
135
- | Severity | Definition | Action |
136
- |----------|------------|--------|
137
- | **Critical** | Bugs, security holes, data loss, crashes | Must fix. No exceptions. |
138
- | **Important** | Missing features, poor error handling, test gaps, naming inconsistencies | Must fix. |
139
- | **Minor** | Style, docs, optimization | Note for future. |
212
+ **Round 3:** If needed, wizard appends Round 3 sections before dispatching. Do NOT pre-scaffold.
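The template above asks for findings in the form ``[Severity] `file:line` `` followed by what, why, and a proposed fix, and the converged fix plan groups them Critical → Important → Minor. A minimal sketch of checking and ordering such lines, assuming one finding per line on stdin. The function name `sort_findings` is hypothetical, and the only sample input is the example line already given in the template:

```bash
# Hypothetical helper (not part of claude-raid): order findings
# Critical -> Important -> Minor, keeping input order within a severity.
# Lines that do not start with [Severity] `file:line` are reported on
# stderr and skipped.
sort_findings() {
  awk '
    BEGIN { rank["Critical"] = 1; rank["Important"] = 2; rank["Minor"] = 3 }
    {
      # Require a bracketed severity followed by a backticked file:line.
      if (match($0, /^\[[A-Za-z]+\] `[^`]+:[0-9]+`/) == 0) {
        print "malformed finding: " $0 > "/dev/stderr"
        next
      }
      sev = substr($0, 2, index($0, "]") - 2)
      if (!(sev in rank)) {
        print "unknown severity: " $0 > "/dev/stderr"
        next
      }
      # Prefix severity rank and input line number as sort keys.
      printf "%d\t%06d\t%s\n", rank[sev], NR, $0
    }
  ' | sort -t "$(printf '\t')" -k1,1n -k2,2n | cut -f3-
}

# Usage with the example finding from the template:
printf '%s\n' \
  '[Critical] `src/auth/handler.ts:23` — missing validation. Fix: add zod schema.' \
  | sort_findings
```

The sort keys are stripped again by `cut`, so the output lines are identical to the accepted input lines and can be pasted back into the evolution log.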
140
213
 
141
- ## Browser Inspection Phase (when `browser.enabled` in raid.json)
214
+ ### Browser Inspection (when `browser.enabled`)
142
215
 
143
- After code review findings are pinned, the Wizard announces browser inspection.
216
+ After code review findings are pinned, agents inspect the live application:
217
+ 1. Each reviewer boots their own instance on separate ports (invoke `raid-browser`)
218
+ 2. Pre-flight: state test subject, check auth, discover routes
219
+ 3. Inspect from angle (invoke `raid-browser-chrome`): Warrior=stress, Archer=visual, Rogue=security
220
+ 4. Cross-verify others' findings on own instance
221
+ 5. Pin browser findings alongside code findings
222
+ 6. Cleanup instances
144
223
 
145
- ### Process
224
+ Browser bugs block merge the same way code bugs do.
146
225
 
147
- 1. **Wizard announces:** "Browser inspection phase — each reviewer boots their own instance"
148
- 2. **Each reviewer BOOTs** their own app instance on separate ports (invoke `raid-browser`)
149
- 3. **Each reviewer runs PRE-FLIGHT** — state test subject, check auth, discover routes
150
- 4. **Each reviewer LOGINs** if auth is required (credentials from `.env.raid`)
151
- 5. **Each reviewer inspects** from their angle (invoke `raid-browser-chrome`):
152
- - Minimum gates first (console, network, page loads)
153
- - Then angle-driven exploration (Warrior: stress, Archer: visual/precision, Rogue: security)
154
- - Evidence captured for every finding (GIF, screenshot, console/network)
155
- 6. **Cross-verification** — each reviewer reproduces others' findings on their own instance
156
- 7. **Pin browser findings** to Dungeon alongside code review findings
157
- 8. **Each reviewer CLEANUPs** their instance
158
- 9. **Wizard rules** on ALL findings (code + browser) together
226
+ ## Sub-phase B: Fix Session
159
227
 
160
- ### Browser findings follow the same severity rules:
228
+ Only entered if `review.md` contains fixes to make. This is different from the Implementation phase — the source is `review.md`, not numbered plan tasks.
161
229
 
162
- - **Critical** (crash, security, layout broken) — must fix
163
- - **Important** (broken feature, visual inconsistency, responsive breakage) — must fix
164
- - **Minor** (polish, console warnings) — note for future
230
+ ### Wizard Checklist (Fix Session)
165
231
 
166
- **Browser bugs block merge the same way code bugs do.**
232
+ 1. **Fresh dice roll** — a new turn order for the fix session. Update raid-session via Bash using the jq command from protocol "Dice Roll Reference". Announce: *"Fresh dice for the fix session: {agent1} → {agent2} → {agent3}."*
233
+ 2. **Dispatch fixes** — round-based, sequential
167
234
 
168
- ## Black Card System
235
+ ### Fix Session Evolution Log (Appended Dynamically)
169
236
 
170
- If any agent finds something that fundamentally breaks the architecture — a change so deep it invalidates the implementation — they play a `BLACKCARD:`:
237
+ When Sub-phase B begins, the wizard appends these sections to `phase-5-review.md` with fresh agent names from the new dice roll:
171
238
 
239
+ ```markdown
240
+ ---
241
+
242
+ ## Sub-phase B: Fix Session
243
+
244
+ ## Turn Order (Fix Session): @{agent1} → @{agent2} → @{agent3}
245
+ <!-- Fresh dice roll — may be different order from review sub-phase -->
246
+
247
+ ### @{agent1} [R1] — Implementing Fixes
248
+
249
+ <!-- @{agent1}: Work through review.md fix plan in order.
250
+ For each fix:
251
+ 1. Implement the fix following TDD (write test → verify fail → fix → verify pass)
252
+ 2. Report what was fixed and how
253
+
254
+ Format per fix:
255
+ **[Critical-1]** `src/auth/handler.ts:23` — FIXED
256
+ - Change: Added zod schema validation in validateToken()
257
+ - Test: `tests/auth/handler.test.ts` — added "rejects malformed tokens" test
258
+ - Commit: `fix(auth): add input validation to token handler`
259
+
260
+ Prioritize: blocking issues first, then simple fixes, then complex fixes. -->
261
+
262
+ ### @{agent2} [R1] — Fix Verification
263
+
264
+ <!-- @{agent2}: Read the ACTUAL CODE changes for each fix above.
265
+ - Does each fix address the original finding?
266
+ - Does any fix introduce new issues?
267
+ - Run the full test suite — any regressions?
268
+ Report per fix: VERIFIED or ISSUE: [what's wrong] -->
269
+
270
+ ### @{agent3} [R1] — Fix Verification
271
+
272
+ <!-- @{agent3}: Same focus. Verify fixes AND @{agent2}'s verification.
273
+ Final pass — anything missed? -->
274
+
275
+ ### Wizard [R1] Synthesis
276
+ <!-- All fixes verified? If issues remain → another round.
277
+ If clean → extract results, present to human. -->
172
278
  ```
173
- BLACKCARD: [description of breaking concern]
174
- Evidence: [file paths, scenarios, why this is unfixable within current design]
175
- Impact: [what breaks, how deep the damage goes]
176
- ```
177
279
 
178
- **Black Card flow:**
179
- 1. Agent plays `BLACKCARD:` → other agents independently verify
180
- 2. If 2+ agents agree it's a black card → Wizard escalates to human
181
- 3. Wizard presents to human with full context (digested, not raw):
182
- - What the black card is
183
- - Why it's unfixable in current design
184
- - Options:
185
- a) **Rollback** — Go back to PRD or Design phase (creates `phase-2-design-v2.md`)
186
- b) **Accept** — Live with the limitation, document it, continue
187
- 4. Human decides → Wizard acts accordingly
280
+ 2-3 rounds until the Wizard is satisfied all fixes are sound.
281
+
282
+ ### Review Deliverable Template
283
+
284
+ Wizard extracts into `{questDir}/spoils/review.md` — issue-centric, grouped by severity:
285
+
286
+ ```markdown
287
+ # [Feature Name] Review Report
288
+
289
+ ## Quest: [quest description]
290
+ ## Date: YYYY-MM-DD
291
+ ## Author: Wizard (extracted from phase-5-review.md)
292
+
293
+ ---
294
+
295
+ ## Summary
296
+ <!-- Total findings, breakdown by severity, fix session outcome -->
188
297
 
189
- **Black cards are RARE.** Most issues are Critical or Important, not black cards. A black card means "the foundation is wrong" — not "there's a bug."
298
+ ## Critical Issues
190
299
 
191
- ## Fix Implementation Order
300
+ ### [Critical-1] `file:line` — Short description
301
+ - **Found by:** @agent [R1], confirmed by @agent [R1]
302
+ - **Description:** What is wrong and why it matters
303
+ - **Fix:** What was done to resolve it
304
+ - **Status:** Fixed | Deferred — [reason]
305
+ - **Verification:** Test name or evidence that the fix works
192
306
 
193
- When the Wizard assigns fixes during the Fixing subphase, prioritize in this order within each severity level:
194
- 1. **Blocking issues** — crashes, security holes, data loss
195
- 2. **Simple fixes** — typos, imports, naming inconsistencies
196
- 3. **Complex fixes** — refactoring, logic changes, architectural adjustments
307
+ ## Important Issues
197
308
 
198
- Test each fix individually. Verify no regressions before moving to the next fix.
309
+ ### [Important-1] `file:line` — Short description
310
+ <!-- Same structure -->
199
311
 
200
- ## Unclear Finding Protocol
312
+ ## Minor Issues (Noted for Future)
201
313
 
202
- If ANY finding is unclear — unclear what the problem is, unclear how to reproduce, unclear what the fix should be — **STOP**. Clarify ALL unclear items before implementing ANY fixes. Partial understanding leads to wrong implementation.
314
+ ### [Minor-1] `file:line` — Short description
315
+ - **Found by:** @agent [R1]
316
+ - **Description:** What and why
317
+ - **Status:** Deferred — not blocking
203
318
 
204
- ## Closing the Phase
319
+ ## False Positives
205
320
 
206
- The Wizard closes when agents have exhausted their findings and the review board has all issues classified:
321
+ ### [FP-1] `file:line` — Short description
322
+ - **Raised by:** @agent [R1]
323
+ - **Dismissed by:** @agent [R2] — [evidence why it's not an issue]
324
+ ```
325
+
326
+ ## Black Card System
327
+
328
+ If any agent finds something that fundamentally breaks the architecture — unfixable within current design:
207
329
 
208
- **RULING: APPROVED FOR MERGE** — all Critical/Important fixed, tests pass, requirements met.
330
+ ```
331
+ BLACKCARD: [description]
332
+ Evidence: [file paths, scenarios, why unfixable]
333
+ Impact: [what breaks, how deep]
334
+ ```
209
335
 
210
- **RULING: REJECTED** — specify what must change and which phase to return to.
336
+ **Flow:** Agent plays → 2+ agents verify → Wizard escalates to human → Options: (a) rollback to earlier phase, (b) accept limitation.
337
+
338
+ Black cards are RARE. Most issues are Critical or Important, not black cards.
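The escalation rule in the flow above (a black card reaches the human only once 2+ agents verify it) can be sketched as a small check. This is an illustration under assumptions: the function name `should_escalate` and the `agree`/`dispute` verdict strings are hypothetical, and the sketch simply counts every `agree` verdict it is given:

```bash
# Hypothetical helper (not part of claude-raid): succeed when at least two
# of the verdicts passed as arguments are "agree".
# POSIX sh has no local variables, so `agree` and `verdict` are global.
should_escalate() {
  agree=0
  for verdict in "$@"; do
    if [ "$verdict" = "agree" ]; then
      agree=$((agree + 1))
    fi
  done
  [ "$agree" -ge 2 ]
}

# Usage: one verdict per verifying agent.
if should_escalate agree dispute agree; then
  echo "Escalate to human: rollback or accept the limitation."
else
  echo "Not a black card: handle as Critical or Important."
fi
```

Keeping the threshold in one place makes it easy to adjust if the party decides whether the agent who played the card counts toward the two.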
339
+
340
+ ## No Performative Agreement
341
+
342
+ NEVER respond with "Great catch!" or "You're absolutely right!" Instead: state the finding, show evidence, or push back. If a finding IS correct: fix it and move on.
343
+
344
+ ## Verification Protocol
345
+
346
+ Before acting on ANY finding:
347
+ 1. **READ:** Complete the finding without reacting
348
+ 2. **VERIFY:** Check against actual code at the referenced location
349
+ 3. **EVALUATE:** Is this technically sound for THIS codebase?
350
+ 4. **RESPOND:** Technical evidence or reasoned pushback
211
351
 
212
352
  ## Red Flags
213
353
 
214
354
  | Thought | Reality |
215
355
  |---------|---------|
216
- | "The implementation looks fine, no issues" | Every review finds at least one issue. Look harder. |
217
- | "I'll report my findings to the Wizard" | Report to the other agents directly. Fight over them. |
218
- | "This is a Minor issue" (when it causes wrong behavior) | Wrong results = Important or Critical. |
356
+ | "The implementation looks fine" | Every review finds at least one issue. Look harder. |
357
+ | "This is Minor" (when it causes wrong behavior) | Wrong results = Important or Critical. |
219
358
  | "The tests pass, so it works" | Tests prove what they test. What DON'T they test? |
220
- | "Let's skip re-review of the fixes" | Fixes introduce new bugs. Always re-attack. |
359
+ | "Let me silently ignore that finding" | Every finding gets addressed in the fix plan. |
360
+ | "Fixes are simple, skip re-review" | Fixes introduce new bugs. Always re-verify. |
221
361
 
222
362
  ---
223
363
 
224
364
  ## Phase Transition
225
365
 
226
- When the RULING is APPROVED FOR MERGE:
366
+ When the review is complete and all fixes verified:
227
367
 
228
- 1. Update `.claude/raid-session` phase via Bash (write gate blocks Write/Edit on this file):
368
+ 1. Update raid-session phase via Bash:
229
369
  ```bash
230
370
  jq '.phase="wrap-up"' .claude/raid-session > .claude/raid-session.tmp && mv .claude/raid-session.tmp .claude/raid-session
231
371
  ```
232
- 2. **Commit**: `fix(quest-{slug}): phase 5 review — {N} findings resolved`
233
- 3. **Send phase report to human**: findings count, fixes applied, any black cards
234
- 4. **Load the `raid-wrap-up` skill now and begin Phase 6.**
372
+ 2. **Commit:** `fix(quest-{slug}): phase 5 review — {N} findings resolved`
373
+ 3. **Report:** Link `review.md` and `phase-5-review.md` file paths to the human.
374
+ 4. **Load `raid-wrap-up` and begin Phase 6.**
375
+
376
+ ## Phase Spoils
235
377
 
236
- Do not wait. Do not ask. The next action after approving for merge is loading the next skill.
378
+ **Two outputs:**
379
+ - `{questDir}/phases/phase-5-review.md` — Full evolution (findings, challenges, fix plan debate, fix session)
380
+ - `{questDir}/spoils/review.md` — Clean fix plan deliverable (what was found, what was fixed, what was deferred)
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: raid-debugging
3
- description: "Use when encountering any bug, test failure, or unexpected behavior. Agents investigate competing hypotheses in parallel. No fixes without root cause. No subagents."
3
+ description: "Use when encountering any bug, test failure, or unexpected behavior. Agents investigate competing hypotheses in sequential turns. No fixes without root cause. No subagents."
4
4
  ---
5
5
 
6
6
  # Raid Debugging — Adversarial Root Cause Analysis
@@ -19,12 +19,6 @@ NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
19
19
 
20
20
  If you haven't completed Phase 1, you cannot propose fixes.
21
21
 
22
- ## Mode Behavior
23
-
24
- - **Full Raid**: 3 agents investigate competing hypotheses in parallel.
25
- - **Skirmish**: 2 agents with different hypotheses.
26
- - **Scout**: 1 agent investigates + Wizard challenges the hypothesis.
27
-
28
22
  ## Process Flow
29
23
 
30
24
  ```dot
@@ -26,7 +26,7 @@ digraph init {
26
26
  "Coming soon message" -> "Present quest menu";
27
27
  "Ask: PRD needed?" -> "Human describes task";
28
28
  "Human describes task" -> "Spawn full team + create quest dir";
29
- "Spawn full team + create quest dir" -> "Begin first phase" [shape=doublecircle];
29
+ "Spawn full team + create quest dir" -> "Announce quest + begin first phase" [shape=doublecircle];
30
30
  }
31
31
  ```
32
32
 
@@ -90,12 +90,12 @@ Ask the human to describe the task/feature they want to build. Listen carefully.
90
90
 
91
91
  ### 4c. Spawn Team & Setup
92
92
 
93
- The Canonical Quest always runs as Full Raid (Warrior, Archer, Rogue). Do NOT ask the human to confirm the mode — it is implicit.
93
+ The Canonical Quest always runs with the full party (Wizard + Warrior + Archer + Rogue). 4 agents, no reduced configurations.
94
94
 
95
95
  1. Update `.claude/raid-session` (created by the session-start hook) via **Bash with jq** — the write gate blocks Write/Edit on this file, so always use Bash:
96
96
  ```bash
97
97
  jq --arg qt "canonical" --arg qid "{questId}" --arg qdir ".claude/dungeon/{questId}" \
98
- '.questType=$qt | .questId=$qid | .questDir=$qdir | .mode="full"' \
98
+ '.questType=$qt | .questId=$qid | .questDir=$qdir' \
99
99
  .claude/raid-session > .claude/raid-session.tmp && mv .claude/raid-session.tmp .claude/raid-session
100
100
  ```
101
101
  2. Create quest directory if not already created by hook:
@@ -116,13 +116,15 @@ The Canonical Quest always runs as Full Raid (Warrior, Archer, Rogue). Do NOT as
116
116
  - If PRD skipped → Load `raid-canonical-design` skill, begin Phase 2
117
117
 
118
118
  **Announce the quest to the party and the human:**
119
- > "The quest begins: **{task description}**. Mode: **{mode}**. {agent count} brave souls answer the call."
119
+ > "The quest begins: **{task description}**. 4 brave souls answer the call. The dice will roll at each phase to determine turn order."
120
+
121
+ Dice rolls happen **per phase**, not at quest start. The first dice roll happens when Phase 2 (Design) opens — or whenever the first agent phase begins. Phase 1 (PRD) is wizard+human only, so no dice needed there.
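
A minimal sketch of the per-phase roll described above, not the package's actual implementation. It assumes the turn order covers only the three non-Wizard agents and that a roll is a plain shuffle. The function name `roll_turn_order` and the phase numbers in the loop are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: produce a fresh random turn order for one phase.
# Assumption: only warrior, archer and rogue take turns; the Wizard does not roll.
roll_turn_order() {
  printf '%s\n' warrior archer rogue | shuf | paste -sd' ' -
}

# One roll per agent phase, never a single roll reused for the whole quest.
# Phase 1 (PRD) is skipped because it is wizard+human only.
for phase in 2 3 4 5; do
  echo "phase=${phase} order=$(roll_turn_order)"
done
```

Each call shuffles independently, so the order in one phase tells you nothing about the order in the next.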
120
122
 
121
123
  ## Red Flags
122
124
 
123
125
  | Thought | Reality |
124
126
  |---------|---------|
125
127
  | "Skip the greeting, get to work" | The greeting sets the tone. It takes 5 seconds. Do it. |
126
- | "Let me ask which mode to use" | Canonical Quest = Full Raid. Always. Don't ask. |
128
+ | "Let me ask which mode to use" | Canonical Quest = full party, always. Don't ask. |
127
129
  | "Let me start exploring the codebase" | You are the Wizard. You don't explore. You dispatch. |
128
130
  | "I'll figure out the quest type later" | Quest type determines the phase flow. Choose now. |
@@ -11,7 +11,7 @@ Write the test first. Watch it fail. Write minimal code to pass. Then the others
11
11
 
12
12
  **Violating the letter of these rules is violating their spirit.**
13
13
 
14
- **TDD is enforced in ALL modes — Full Raid, Skirmish, and Scout. No exceptions.**
14
+ **TDD is enforced. No exceptions.**
15
15
 
16
16
  ## The Iron Law
17
17
 
@@ -116,7 +116,7 @@ When claiming tests pass, both must pass:
116
116
 
117
117
  ## Adversarial Test Review
118
118
 
119
- After TDD cycle, challengers attack the TESTS directly — and build on each other's critiques:
119
+ After TDD cycle, challengers attack the TESTS in their sequential turns, each building on prior challengers' findings via Dungeon pins:
120
120
 
121
121
  1. **Does this test prove the behavior, or just confirm the implementation?** If you renamed an internal method, would the test break? It shouldn't.
122
122
  2. **What input would make this test pass even with a broken implementation?** (e.g., a test that only checks the happy path passes for any implementation that doesn't crash)
@@ -124,9 +124,9 @@ After TDD cycle, challengers attack the TESTS directly — and build on each oth
124
124
  4. **Is it testing real code or mock behavior?** Mocks that don't match real behavior = false confidence.
125
125
  5. **Would this catch a regression?** If someone changes the implementation next month, does this test catch the break?
126
126
 
127
- **Challengers interact directly:**
128
- - `CHALLENGE: @Warrior, your test at line 15 only validates the happy path — here's an input that passes with a broken implementation: ...`
129
- - `BUILDING: @Archer, your edge case finding — the same gap exists in the error path test at line 32...`
127
+ **Challengers pin findings to the Dungeon on their turns:**
128
+ - `@archer [R1] CHALLENGE: @warrior's test at line 15 only validates the happy path — here's an input that passes with a broken implementation: ...`
129
+ - `@rogue [R1] BUILDING: @archer's edge case finding — the same gap exists in the error path test at line 32...`
130
130
  - `CHALLENGE: @Rogue, you claimed the test is implementation-dependent but renaming the internal method doesn't break it — here's proof: ...`
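
The pinned format in the added lines (`@agent [R1] TYPE: ...`) is regular enough to check mechanically. The sketch below is an assumption about how that could be done, not part of the package: `parse_pin` is a hypothetical name, `[R1]` is assumed to be a round number, and only the two pin types shown above are accepted.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a Dungeon pin into "agent round type".
# Assumptions: agent names are lowercase, "[R<n>]" is a round number,
# and the type is either CHALLENGE or BUILDING.
parse_pin() {
  # sed prints only when the whole pattern matches; grep turns
  # "no output" into a non-zero exit status for the caller.
  printf '%s\n' "$1" \
    | sed -nE 's/^@([a-z]+) \[R([0-9]+)\] (CHALLENGE|BUILDING): .*/\1 \2 \3/p' \
    | grep .
}

parse_pin '@archer [R1] CHALLENGE: test at line 15 only validates the happy path'
parse_pin 'CHALLENGE: @Rogue, unprefixed pin' || echo "rejected: missing @agent [R<n>] prefix"
```

Under these assumptions the unprefixed `CHALLENGE: @Rogue, ...` form that remains in the context line would not match, which may or may not be intended by the skill author.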
131
131
 
132
132
  **Browser-specific attacks (when `browser.enabled`):**