claude-raid 0.2.6 → 0.2.8
- package/README.md +108 -66
- package/bin/cli.js +47 -11
- package/package.json +1 -1
- package/src/descriptions.js +11 -7
- package/src/init.js +37 -6
- package/src/merge-settings.js +43 -1
- package/src/remove.js +2 -2
- package/src/setup.js +33 -1
- package/src/ui.js +24 -19
- package/src/update.js +26 -3
- package/template/.claude/agents/archer.md +18 -4
- package/template/.claude/agents/rogue.md +18 -4
- package/template/.claude/agents/warrior.md +18 -4
- package/template/.claude/agents/wizard.md +32 -5
- package/template/.claude/dungeon-master-rules.md +132 -37
- package/template/.claude/hooks/raid-lib.sh +45 -4
- package/template/.claude/hooks/raid-pre-compact.sh +8 -4
- package/template/.claude/hooks/raid-session-end.sh +2 -2
- package/template/.claude/hooks/raid-session-start.sh +2 -0
- package/template/.claude/hooks/rtk-bridge.sh +46 -0
- package/template/.claude/hooks/validate-dungeon.sh +11 -3
- package/template/.claude/hooks/validate-file-naming.sh +6 -1
- package/template/.claude/hooks/validate-no-placeholders.sh +13 -2
- package/template/.claude/hooks/validate-write-gate.sh +7 -2
- package/template/.claude/party-rules.md +93 -64
- package/template/.claude/skills/raid-browser/SKILL.md +4 -6
- package/template/.claude/skills/raid-browser-chrome/SKILL.md +2 -2
- package/template/.claude/skills/raid-canonical-design/SKILL.md +306 -166
- package/template/.claude/skills/raid-canonical-implementation/SKILL.md +161 -133
- package/template/.claude/skills/raid-canonical-implementation-plan/SKILL.md +200 -142
- package/template/.claude/skills/raid-canonical-prd/SKILL.md +101 -78
- package/template/.claude/skills/raid-canonical-protocol/SKILL.md +30 -124
- package/template/.claude/skills/raid-canonical-review/SKILL.md +296 -149
- package/template/.claude/skills/raid-debugging/SKILL.md +1 -7
- package/template/.claude/skills/raid-init/SKILL.md +19 -29
- package/template/.claude/skills/raid-tdd/SKILL.md +5 -5
- package/template/.claude/skills/raid-teambuff/SKILL.md +281 -0
- package/template/.claude/skills/raid-verification/SKILL.md +0 -6
- package/template/.claude/skills/raid-wrap-up/SKILL.md +36 -32
|
@@ -1,233 +1,380 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: raid-canonical-review
|
|
3
|
-
description: "Phase 5
|
|
3
|
+
description: "Use when Phase 5 (Review) begins in a Canonical Quest, after implementation is complete and the human opts in."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Raid Review — Phase 5 (Optional)
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
Two sub-phases: **Review** (find issues, build fix plan) then **Fix Session** (execute fixes). The review digests all prior deliverables — PRD, Design, Plan, Implementation — and verifies the implementation is correct, complete, and coherent.
|
|
9
9
|
|
|
10
10
|
<HARD-GATE>
|
|
11
|
-
This phase is OPTIONAL — the Wizard asks the human before entering. All
|
|
11
|
+
This phase is OPTIONAL — the Wizard asks the human before entering. All agents review the ENTIRE implementation. Use `raid-verification` before any completion claims.
|
|
12
12
|
</HARD-GATE>
|
|
13
13
|
|
|
14
|
-
## Mode Behavior
|
|
15
|
-
|
|
16
|
-
- **Full Raid**: 3 independent reviews, then agents fight directly over findings. All severity levels enforced.
|
|
17
|
-
- **Skirmish**: 1 agent reviews + Wizard. Cross-testing between reviewer and Wizard.
|
|
18
|
-
- **Scout**: Wizard reviews alone. Checks against requirements and runs tests.
|
|
19
|
-
|
|
20
14
|
## Process Flow
|
|
21
15
|
|
|
22
16
|
```dot
|
|
23
17
|
digraph review {
|
|
24
|
-
"Wizard reads
|
|
25
|
-
"
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
18
|
+
"Wizard reads all prior deliverables" -> "Phase recap (PRD + Design + Plan + Implementation)";
|
|
19
|
+
"Phase recap (PRD + Design + Plan + Implementation)" -> "SUB-PHASE A: REVIEW";
|
|
20
|
+
|
|
21
|
+
subgraph cluster_review {
|
|
22
|
+
label="Sub-phase A: Review";
|
|
23
|
+
"Roll dice for review turn order" -> "ROUND 1: Agent 1 reviews, pins findings";
|
|
24
|
+
"ROUND 1: Agent 1 reviews, pins findings" -> "Agent 2 adversary-tests findings + adds own";
|
|
25
|
+
"Agent 2 adversary-tests findings + adds own" -> "Agent 3 reviews + adds own";
|
|
26
|
+
"Agent 3 reviews + adds own" -> "Wizard evaluates Round 1";
|
|
27
|
+
"Wizard evaluates Round 1" -> "ROUND 2: Agent 1 converges findings, proposes fix plan";
|
|
28
|
+
"ROUND 2: Agent 1 converges findings, proposes fix plan" -> "Agents 2+3 attack fix plan";
|
|
29
|
+
"Agents 2+3 attack fix plan" -> "Wizard evaluates — Round 3?" [shape=diamond];
|
|
30
|
+
"Wizard evaluates — Round 3?" -> "ROUND 3 (FINAL)" [label="critical gaps"];
|
|
31
|
+
"Wizard evaluates — Round 3?" -> "Extract review.md fix plan" [label="solid"];
|
|
32
|
+
"ROUND 3 (FINAL)" -> "Extract review.md fix plan";
|
|
33
|
+
}
|
|
34
|
+
|
|
35
|
+
"SUB-PHASE A: REVIEW" -> "Roll dice for review turn order";
|
|
36
|
+
"Extract review.md fix plan" -> "Human approves fix plan?" [shape=diamond];
|
|
37
|
+
"Human approves fix plan?" -> "Ask why, revise" [label="no"];
|
|
38
|
+
"Ask why, revise" -> "Extract review.md fix plan";
|
|
39
|
+
"Human approves fix plan?" -> "Fixes needed?" [shape=diamond];
|
|
40
|
+
"Fixes needed?" -> "SUB-PHASE B: FIX SESSION" [label="yes"];
|
|
41
|
+
"Fixes needed?" -> "Commit + load wrap-up" [label="no fixes", shape=doublecircle];
|
|
42
|
+
|
|
43
|
+
subgraph cluster_fix {
|
|
44
|
+
label="Sub-phase B: Fix Session";
|
|
45
|
+
"Fresh dice roll for fix order" -> "Agent 1 makes fixes from review.md, reports";
|
|
46
|
+
"Agent 1 makes fixes from review.md, reports" -> "Agent 2 reviews fixes, reports";
|
|
47
|
+
"Agent 2 reviews fixes, reports" -> "Agent 3 reviews fixes, reports";
|
|
48
|
+
"Agent 3 reviews fixes, reports" -> "Wizard evaluates fixes";
|
|
49
|
+
"Wizard evaluates fixes" -> "More fix rounds?" [shape=diamond];
|
|
50
|
+
"More fix rounds?" -> "Next fix round" [label="yes"];
|
|
51
|
+
"More fix rounds?" -> "Wizard extracts results" [label="done"];
|
|
52
|
+
}
|
|
53
|
+
|
|
54
|
+
"SUB-PHASE B: FIX SESSION" -> "Fresh dice roll for fix order";
|
|
55
|
+
"Wizard extracts results" -> "Present to human";
|
|
56
|
+
"Present to human" -> "Commit + load wrap-up" [shape=doublecircle];
|
|
36
57
|
}
|
|
37
58
|
```
|
|
38
59
|
|
|
39
|
-
##
|
|
60
|
+
## Sub-phase A: Review
|
|
61
|
+
|
|
62
|
+
### Wizard Checklist (Review)
|
|
63
|
+
|
|
64
|
+
1. **Prepare** — gather all prior deliverables: PRD, design.md, task files, phase-4-implementation.md, git diff range
|
|
65
|
+
2. **Phase recap** — summarize all prior phases. Present to agents and human.
|
|
66
|
+
3. **Roll dice** — randomly shuffle `["warrior", "archer", "rogue"]` for the review turn order. Update raid-session via Bash using the jq command from protocol "Dice Roll Reference". Announce: *"The dice have spoken. Review turn order: {agent1} → {agent2} → {agent3}."*
|
|
67
|
+
4. **Create evolution log** — `{questDir}/phases/phase-5-review.md`
|
|
68
|
+
5. **Run rounds** — see Round Protocol below
|
|
69
|
+
6. **Extract fix plan** — polish into `{questDir}/spoils/review.md`
|
|
70
|
+
7. **Present to human** for approval
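The dice roll in step 3 amounts to a uniform shuffle of the three agent names. The following is a minimal Python sketch of that idea only. The helper names `roll_turn_order` and `announce` are hypothetical and not part of claude-raid. The skill itself records the order in raid-session with the jq command from the protocol's "Dice Roll Reference", which this diff does not show.

```python
import random

# Agent names as listed in the checklist step.
AGENTS = ["warrior", "archer", "rogue"]


def roll_turn_order(rng=None):
    """Return a shuffled copy of AGENTS (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    order = list(AGENTS)  # copy so the canonical list is never mutated
    rng.shuffle(order)    # in-place uniform shuffle of the copy
    return order


def announce(order):
    """Format the announcement line quoted in the checklist."""
    return "The dice have spoken. Review turn order: " + " → ".join(order) + "."


if __name__ == "__main__":
    print(announce(roll_turn_order()))
```

Passing a seeded `random.Random` makes the order reproducible, which may help when replaying a session. An unseeded call gives a fresh order each time.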
|
|
71
|
+
|
|
72
|
+
### Dispatch Templates (Review)
|
|
73
|
+
|
|
74
|
+
Dispatch carries only dynamic context. Detailed instructions (severity format, checklist, finding structure) are embedded in the scaffolded phase file.
|
|
75
|
+
|
|
76
|
+
**Reviewer (Round 1):**
|
|
77
|
+
```
|
|
78
|
+
TURN_DISPATCH: Phase 5 Review, Round 1, Turn {T}.
|
|
79
|
+
Quest: {description}
|
|
80
|
+
Phase recap: {summary of all prior phases — what was built, key decisions}
|
|
81
|
+
Your role: REVIEWER. Your section: "@{name} [R1]"
|
|
82
|
+
|
|
83
|
+
FIRST: Read the FULL document at {questDir}/phases/phase-5-review.md to understand the structure.
|
|
84
|
+
Read the embedded instructions in your section. Then read the code changes (git diff),
|
|
85
|
+
{questDir}/spoils/design.md, and task files.
|
|
86
|
+
THEN: Write your review in your designated section following the embedded instructions.
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
**Fix Plan Writer (Round 2, Turn 1):**
|
|
90
|
+
```
|
|
91
|
+
TURN_DISPATCH: Phase 5 Review, Round 2, Turn 1.
|
|
92
|
+
Quest: {description}
|
|
93
|
+
All Round 1 findings are in.
|
|
94
|
+
Your role: converge findings into fix plan. Your section: "@{name} [R2] — Converged Fix Plan"
|
|
95
|
+
|
|
96
|
+
FIRST: Read the FULL document at {questDir}/phases/phase-5-review.md.
|
|
97
|
+
Read all Round 1 findings. Read the embedded instructions in your section.
|
|
98
|
+
THEN: Write the converged fix plan following the embedded instructions.
|
|
99
|
+
```
|
|
40
100
|
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
12. **Transition** — invoke `raid-wrap-up`
|
|
101
|
+
**Fix Session dispatch:**
|
|
102
|
+
```
|
|
103
|
+
TURN_DISPATCH: Phase 5 Fix Session, Round 1, Turn {T}.
|
|
104
|
+
Quest: {description}
|
|
105
|
+
Fix plan: {questDir}/spoils/review.md
|
|
106
|
+
|
|
107
|
+
FIRST: Read the FULL document at {questDir}/phases/phase-5-review.md to find
|
|
108
|
+
the Fix Session section and your embedded instructions. Then read the fix plan.
|
|
109
|
+
THEN: Execute your role following the embedded instructions.
|
|
110
|
+
TDD enforced — load raid-tdd. Signal TURN_COMPLETE with status when done.
|
|
111
|
+
```
|
|
53
112
|
|
|
54
|
-
|
|
113
|
+
### Evolution Log Template (Sub-phase A)
|
|
55
114
|
|
|
56
|
-
|
|
115
|
+
Scaffold `{questDir}/phases/phase-5-review.md`. Replace agent name placeholders with actual names from dice roll:
|
|
57
116
|
|
|
58
117
|
```markdown
|
|
59
|
-
# Phase 5: Review
|
|
60
|
-
## Quest: Full adversarial review of <feature> implementation
|
|
61
|
-
## Mode: <Full Raid | Skirmish>
|
|
118
|
+
# Phase 5: Review — Evolution Log
|
|
62
119
|
|
|
63
|
-
|
|
120
|
+
## Quest: [quest description]
|
|
121
|
+
## Quest Type: Canonical Quest
|
|
122
|
+
## Turn Order (Review): @{agent1} → @{agent2} → @{agent3}
|
|
64
123
|
|
|
65
|
-
|
|
124
|
+
## References
|
|
125
|
+
- PRD: `{questDir}/spoils/prd.md` (if exists)
|
|
126
|
+
- Design: `{questDir}/spoils/design.md`
|
|
127
|
+
- Tasks: `{questDir}/spoils/tasks/phase-3-plan-task-*.md`
|
|
128
|
+
- Implementation: `{questDir}/phases/phase-4-implementation.md`
|
|
66
129
|
|
|
67
|
-
|
|
130
|
+
## Quest Goal
|
|
131
|
+
<!-- Wizard writes 2-3 lines: what this review must verify,
|
|
132
|
+
total file count from implementation, key risk areas to focus on -->
|
|
68
133
|
|
|
69
|
-
|
|
134
|
+
---
|
|
70
135
|
|
|
71
|
-
|
|
72
|
-
```
|
|
136
|
+
## Sub-phase A: Review
|
|
73
137
|
|
|
74
|
-
|
|
138
|
+
### @{agent1} [R1] — Full Implementation Review
|
|
75
139
|
|
|
76
|
-
|
|
140
|
+
<!-- @{agent1}: FIRST REVIEWER. Read ACTUAL CODE — not reports.
|
|
141
|
+
For each finding: [Severity] `file:line` — what, why, proposed fix.
|
|
142
|
+
Example: [Critical] `src/auth/handler.ts:23` — missing validation. Fix: add zod schema.
|
|
143
|
+
Checklist: requirements, code quality, testing, architecture, naming, production. -->
|
|
77
144
|
|
|
78
|
-
|
|
79
|
-
>
|
|
80
|
-
> **@Archer**: Review full implementation. Does it match the design doc exactly? Patterns consistent? Interfaces correct? Types sound? Naming conventions followed? File structure clean? Find the bugs that silently produce wrong results. Then fight @Warrior and @Rogue.
|
|
81
|
-
>
|
|
82
|
-
> **@Rogue**: Review full implementation. Think like an attacker. What inputs break it? What timing causes races? What happens when dependencies fail? Find the bugs nobody else will find. Then fight @Warrior and @Archer.
|
|
83
|
-
>
|
|
84
|
-
> **All**: Review independently first, then fight directly. Challenge each other's findings AND each other's blind spots. Pin severity-classified issues to Dungeon with `DUNGEON:`. Reference the Phase 3 Dungeon for context.
|
|
145
|
+
### @{agent2} [R1] — Adversarial Review
|
|
85
146
|
|
|
86
|
-
|
|
147
|
+
<!-- @{agent2}: ADVERSARIAL REVIEWER. Verify @{agent1}'s findings against actual code.
|
|
148
|
+
Challenge severity if overblown. Add findings @{agent1} missed. Don't repeat.
|
|
149
|
+
Same format: [Severity] `file:line` — what, why, fix. -->
|
|
87
150
|
|
|
88
|
-
|
|
151
|
+
### @{agent3} [R1] — Final Review Pass
|
|
89
152
|
|
|
90
|
-
|
|
153
|
+
<!-- @{agent3}: FINAL REVIEWER. Read all prior findings. Challenge what you disagree with.
|
|
154
|
+
Find what BOTH reviewers missed. Same format: [Severity] `file:line` — what, why, fix. -->
|
|
91
155
|
|
|
92
|
-
|
|
156
|
+
### Wizard [R1] Synthesis
|
|
157
|
+
<!-- Wizard categorizes all surviving findings by severity.
|
|
158
|
+
Counts: N Critical, N Important, N Minor.
|
|
159
|
+
Direction for Round 2. -->
|
|
93
160
|
|
|
94
|
-
|
|
161
|
+
---
|
|
95
162
|
|
|
96
|
-
|
|
163
|
+
### @{agent1} [R2] — Converged Fix Plan
|
|
97
164
|
|
|
98
|
-
|
|
165
|
+
<!-- @{agent1}: Read EVERY finding from all reviewers (R1).
|
|
166
|
+
Your job is to produce a SINGLE converged fix plan.
|
|
99
167
|
|
|
100
|
-
|
|
168
|
+
1. Group all findings by severity (Critical → Important → Minor)
|
|
169
|
+
2. Within each group, order by domain/file for efficient fixing
|
|
170
|
+
3. For each finding: confirm, mark as false positive (with evidence), or merge duplicates
|
|
171
|
+
4. Propose concrete fix for each confirmed finding
|
|
172
|
+
5. Note execution order (dependencies between fixes)
|
|
101
173
|
|
|
102
|
-
|
|
103
|
-
1
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
174
|
+
Format per finding:
|
|
175
|
+
**[Critical-1]** `src/auth/handler.ts:23` — Missing input validation
|
|
176
|
+
- Found by: @{agent2} [R1], confirmed by @{agent3} [R1]
|
|
177
|
+
- Fix: Add zod schema validation in validateToken() before line 23
|
|
178
|
+
- Blocked by: none -->
|
|
107
179
|
|
|
108
|
-
|
|
180
|
+
### @{agent2} [R2] — Fix Plan Review
|
|
109
181
|
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
Actions speak. Fix and show — don't compliment.
|
|
182
|
+
<!-- @{agent2}: Review fix plan. Are fixes correct? Execution order right?
|
|
183
|
+
Challenge false positive designations. Flag dropped findings. -->
|
|
113
184
|
|
|
114
|
-
|
|
185
|
+
### @{agent3} [R2] — Fix Plan Review
|
|
115
186
|
|
|
116
|
-
|
|
187
|
+
<!-- @{agent3}: Same focus. Challenge what @{agent2} missed.
|
|
188
|
+
Confirm or dispute false positive designations. -->
|
|
117
189
|
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
|
|
121
|
-
- If used: implement properly
|
|
122
|
-
- Don't gold-plate during review
|
|
190
|
+
### Wizard [R2] Synthesis
|
|
191
|
+
<!-- Wizard evaluates the fix plan. If solid → extract to review.md.
|
|
192
|
+
If critical gaps → announce Round 3 as FINAL. -->
|
|
123
193
|
|
|
124
|
-
|
|
194
|
+
---
|
|
125
195
|
|
|
126
|
-
|
|
196
|
+
## Final Extraction Notes — Wizard
|
|
197
|
+
<!-- What was incorporated into review.md.
|
|
198
|
+
False positives excluded and why.
|
|
199
|
+
Total findings: N confirmed, N false positives, N deferred. -->
|
|
127
200
|
|
|
128
|
-
|
|
129
|
-
- `BUILDING: @Warrior, your finding about the missing error handler — the impact is worse than you stated because...`
|
|
130
|
-
- `CHALLENGE: @Rogue, your "Critical" severity on the naming inconsistency is overblown — here's why it's actually Minor...`
|
|
131
|
-
- `DUNGEON: [Critical] handler.js:23 — missing input validation allows injection. Verified by @Warrior and @Rogue.`
|
|
201
|
+
---
|
|
132
202
|
|
|
133
|
-
|
|
203
|
+
## Writing Guidance
|
|
204
|
+
- Sign all work: `@{name} [R{N}]`
|
|
205
|
+
- Read ACTUAL CODE — not summaries, not reports, not commit messages
|
|
206
|
+
- Every finding needs: severity, location, what, why, proposed fix
|
|
207
|
+
- No performative agreement — no "Great catch!" Just evidence or pushback.
|
|
208
|
+
- Reviewers: challenge severity classifications, not just content
|
|
209
|
+
- Fix plan must be actionable — concrete fixes, not "improve error handling"
|
|
210
|
+
```
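The finding format used throughout the template (severity tag, backticked `file:line`, a dash separator, then the description) is regular enough to parse mechanically. The sketch below is an illustration, not part of the package. It assumes findings sit one per line in exactly that shape, and it shows steps 1 and 2 of the converged fix plan: group by severity, then order by file and line. The names `parse_finding` and `group_findings` are hypothetical.

```python
import re

# Severity levels in the order the fix plan lists them.
SEVERITY_ORDER = ["Critical", "Important", "Minor"]

# Matches: [Severity] `path:line` <dash> free text
# \u2014 is the long dash the template uses as a separator.
FINDING_RE = re.compile(
    r"^\[(?P<severity>Critical|Important|Minor)\]\s+"
    r"`(?P<path>[^`:]+):(?P<line>\d+)`\s+\u2014\s+(?P<text>.+)$"
)


def parse_finding(raw):
    """Return a dict for one finding line, or None if it does not match."""
    match = FINDING_RE.match(raw.strip())
    if match is None:
        return None
    return {
        "severity": match.group("severity"),
        "path": match.group("path"),
        "line": int(match.group("line")),
        "text": match.group("text"),
    }


def group_findings(raw_lines):
    """Group parsed findings by severity, sorted by (path, line) within each group."""
    grouped = {severity: [] for severity in SEVERITY_ORDER}
    for raw in raw_lines:
        finding = parse_finding(raw)
        if finding is not None:  # silently skip lines that are not findings
            grouped[finding["severity"]].append(finding)
    for findings in grouped.values():
        findings.sort(key=lambda f: (f["path"], f["line"]))
    return grouped
```

Confirming findings, marking false positives, and merging duplicates (step 3) still need a reviewer's judgment, so the sketch stops at sorting.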
|
|
134
211
|
|
|
135
|
-
|
|
136
|
-
|----------|------------|--------|
|
|
137
|
-
| **Critical** | Bugs, security holes, data loss, crashes | Must fix. No exceptions. |
|
|
138
|
-
| **Important** | Missing features, poor error handling, test gaps, naming inconsistencies | Must fix. |
|
|
139
|
-
| **Minor** | Style, docs, optimization | Note for future. |
|
|
212
|
+
**Round 3:** If needed, the Wizard appends Round 3 sections before dispatching. Do NOT pre-scaffold.
|
|
140
213
|
|
|
141
|
-
|
|
214
|
+
### Browser Inspection (when `browser.enabled`)
|
|
142
215
|
|
|
143
|
-
After code review findings are pinned,
|
|
216
|
+
After code review findings are pinned, agents inspect the live application:
|
|
217
|
+
1. Each reviewer boots their own instance on separate ports (invoke `raid-browser`)
|
|
218
|
+
2. Pre-flight: state test subject, check auth, discover routes
|
|
219
|
+
3. Inspect from angle (invoke `raid-browser-chrome`): Warrior=stress, Archer=visual, Rogue=security
|
|
220
|
+
4. Cross-verify others' findings on own instance
|
|
221
|
+
5. Pin browser findings alongside code findings
|
|
222
|
+
6. Cleanup instances
|
|
144
223
|
|
|
145
|
-
|
|
224
|
+
Browser bugs block merge the same way code bugs do.
|
|
146
225
|
|
|
147
|
-
|
|
148
|
-
2. **Each reviewer BOOTs** their own app instance on separate ports (invoke `raid-browser`)
|
|
149
|
-
3. **Each reviewer runs PRE-FLIGHT** — state test subject, check auth, discover routes
|
|
150
|
-
4. **Each reviewer LOGINs** if auth is required (credentials from `.env.raid`)
|
|
151
|
-
5. **Each reviewer inspects** from their angle (invoke `raid-browser-chrome`):
|
|
152
|
-
- Minimum gates first (console, network, page loads)
|
|
153
|
-
- Then angle-driven exploration (Warrior: stress, Archer: visual/precision, Rogue: security)
|
|
154
|
-
- Evidence captured for every finding (GIF, screenshot, console/network)
|
|
155
|
-
6. **Cross-verification** — each reviewer reproduces others' findings on their own instance
|
|
156
|
-
7. **Pin browser findings** to Dungeon alongside code review findings
|
|
157
|
-
8. **Each reviewer CLEANUPs** their instance
|
|
158
|
-
9. **Wizard rules** on ALL findings (code + browser) together
|
|
226
|
+
## Sub-phase B: Fix Session
|
|
159
227
|
|
|
160
|
-
|
|
228
|
+
Only entered if `review.md` contains fixes to make. This is different from the Implementation phase — the source is `review.md`, not numbered plan tasks.
|
|
161
229
|
|
|
162
|
-
|
|
163
|
-
- **Important** (broken feature, visual inconsistency, responsive breakage) — must fix
|
|
164
|
-
- **Minor** (polish, console warnings) — note for future
|
|
230
|
+
### Wizard Checklist (Fix Session)
|
|
165
231
|
|
|
166
|
-
**
|
|
232
|
+
1. **Fresh dice roll** — a new turn order for the fix session. Update raid-session via Bash using the jq command from protocol "Dice Roll Reference". Announce: *"Fresh dice for the fix session: {agent1} → {agent2} → {agent3}."*
|
|
233
|
+
2. **Dispatch fixes** — round-based, sequential
|
|
167
234
|
|
|
168
|
-
|
|
235
|
+
### Fix Session Evolution Log (Appended Dynamically)
|
|
169
236
|
|
|
170
|
-
|
|
237
|
+
When Sub-phase B begins, the Wizard appends these sections to `phase-5-review.md` with fresh agent names from the new dice roll:
|
|
171
238
|
|
|
239
|
+
```markdown
|
|
240
|
+
---
|
|
241
|
+
|
|
242
|
+
## Sub-phase B: Fix Session
|
|
243
|
+
|
|
244
|
+
## Turn Order (Fix Session): @{agent1} → @{agent2} → @{agent3}
|
|
245
|
+
<!-- Fresh dice roll — may be different order from review sub-phase -->
|
|
246
|
+
|
|
247
|
+
### @{agent1} [R1] — Implementing Fixes
|
|
248
|
+
|
|
249
|
+
<!-- @{agent1}: Work through review.md fix plan in order.
|
|
250
|
+
For each fix:
|
|
251
|
+
1. Implement the fix following TDD (write test → verify fail → fix → verify pass)
|
|
252
|
+
2. Report what was fixed and how
|
|
253
|
+
|
|
254
|
+
Format per fix:
|
|
255
|
+
**[Critical-1]** `src/auth/handler.ts:23` — FIXED
|
|
256
|
+
- Change: Added zod schema validation in validateToken()
|
|
257
|
+
- Test: `tests/auth/handler.test.ts` — added "rejects malformed tokens" test
|
|
258
|
+
- Commit: `fix(auth): add input validation to token handler`
|
|
259
|
+
|
|
260
|
+
Prioritize: blocking issues first, then simple fixes, then complex fixes. -->
|
|
261
|
+
|
|
262
|
+
### @{agent2} [R1] — Fix Verification
|
|
263
|
+
|
|
264
|
+
<!-- @{agent2}: Read the ACTUAL CODE changes for each fix above.
|
|
265
|
+
- Does each fix address the original finding?
|
|
266
|
+
- Does any fix introduce new issues?
|
|
267
|
+
- Run the full test suite — any regressions?
|
|
268
|
+
Report per fix: VERIFIED or ISSUE: [what's wrong] -->
|
|
269
|
+
|
|
270
|
+
### @{agent3} [R1] — Fix Verification
|
|
271
|
+
|
|
272
|
+
<!-- @{agent3}: Same focus. Verify fixes AND @{agent2}'s verification.
|
|
273
|
+
Final pass — anything missed? -->
|
|
274
|
+
|
|
275
|
+
### Wizard [R1] Synthesis
|
|
276
|
+
<!-- All fixes verified? If issues remain → another round.
|
|
277
|
+
If clean → extract results, present to human. -->
|
|
172
278
|
```
|
|
173
|
-
BLACKCARD: [description of breaking concern]
|
|
174
|
-
Evidence: [file paths, scenarios, why this is unfixable within current design]
|
|
175
|
-
Impact: [what breaks, how deep the damage goes]
|
|
176
|
-
```
|
|
177
279
|
|
|
178
|
-
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
280
|
+
2-3 rounds until the Wizard is satisfied all fixes are sound.
|
|
281
|
+
|
|
282
|
+
### Review Deliverable Template
|
|
283
|
+
|
|
284
|
+
Wizard extracts into `{questDir}/spoils/review.md` — issue-centric, grouped by severity:
|
|
285
|
+
|
|
286
|
+
```markdown
|
|
287
|
+
# [Feature Name] — Review Report
|
|
288
|
+
|
|
289
|
+
## Quest: [quest description]
|
|
290
|
+
## Date: YYYY-MM-DD
|
|
291
|
+
## Author: Wizard (extracted from phase-5-review.md)
|
|
292
|
+
|
|
293
|
+
---
|
|
294
|
+
|
|
295
|
+
## Summary
|
|
296
|
+
<!-- Total findings, breakdown by severity, fix session outcome -->
|
|
188
297
|
|
|
189
|
-
|
|
298
|
+
## Critical Issues
|
|
190
299
|
|
|
191
|
-
|
|
300
|
+
### [Critical-1] `file:line` — Short description
|
|
301
|
+
- **Found by:** @agent [R1], confirmed by @agent [R1]
|
|
302
|
+
- **Description:** What is wrong and why it matters
|
|
303
|
+
- **Fix:** What was done to resolve it
|
|
304
|
+
- **Status:** Fixed | Deferred — [reason]
|
|
305
|
+
- **Verification:** Test name or evidence that the fix works
|
|
192
306
|
|
|
193
|
-
|
|
194
|
-
1. **Blocking issues** — crashes, security holes, data loss
|
|
195
|
-
2. **Simple fixes** — typos, imports, naming inconsistencies
|
|
196
|
-
3. **Complex fixes** — refactoring, logic changes, architectural adjustments
|
|
307
|
+
## Important Issues
|
|
197
308
|
|
|
198
|
-
|
|
309
|
+
### [Important-1] `file:line` — Short description
|
|
310
|
+
<!-- Same structure -->
|
|
199
311
|
|
|
200
|
-
##
|
|
312
|
+
## Minor Issues (Noted for Future)
|
|
201
313
|
|
|
202
|
-
|
|
314
|
+
### [Minor-1] `file:line` — Short description
|
|
315
|
+
- **Found by:** @agent [R1]
|
|
316
|
+
- **Description:** What and why
|
|
317
|
+
- **Status:** Deferred — not blocking
|
|
203
318
|
|
|
204
|
-
##
|
|
319
|
+
## False Positives
|
|
205
320
|
|
|
206
|
-
|
|
321
|
+
### [FP-1] `file:line` — Short description
|
|
322
|
+
- **Raised by:** @agent [R1]
|
|
323
|
+
- **Dismissed by:** @agent [R2] — [evidence why it's not an issue]
|
|
324
|
+
```
|
|
325
|
+
|
|
326
|
+
## Black Card System
|
|
327
|
+
|
|
328
|
+
If any agent finds something that fundamentally breaks the architecture — unfixable within current design:
|
|
207
329
|
|
|
208
|
-
|
|
330
|
+
```
|
|
331
|
+
BLACKCARD: [description]
|
|
332
|
+
Evidence: [file paths, scenarios, why unfixable]
|
|
333
|
+
Impact: [what breaks, how deep]
|
|
334
|
+
```
|
|
209
335
|
|
|
210
|
-
**
|
|
336
|
+
**Flow:** Agent plays → 2+ agents verify → Wizard escalates to human → Options: (a) rollback to earlier phase, (b) accept limitation.
|
|
337
|
+
|
|
338
|
+
Black cards are RARE. Most issues are Critical or Important, not black cards.
|
|
339
|
+
|
|
340
|
+
## No Performative Agreement
|
|
341
|
+
|
|
342
|
+
NEVER respond with "Great catch!" or "You're absolutely right!" Instead: state the finding, show evidence, or push back. If a finding IS correct: fix it and move on.
|
|
343
|
+
|
|
344
|
+
## Verification Protocol
|
|
345
|
+
|
|
346
|
+
Before acting on ANY finding:
|
|
347
|
+
1. **READ:** Complete the finding without reacting
|
|
348
|
+
2. **VERIFY:** Check against actual code at the referenced location
|
|
349
|
+
3. **EVALUATE:** Is this technically sound for THIS codebase?
|
|
350
|
+
4. **RESPOND:** Technical evidence or reasoned pushback
|
|
211
351
|
|
|
212
352
|
## Red Flags
|
|
213
353
|
|
|
214
354
|
| Thought | Reality |
|
|
215
355
|
|---------|---------|
|
|
216
|
-
| "The implementation looks fine
|
|
217
|
-
| "
|
|
218
|
-
| "This is a Minor issue" (when it causes wrong behavior) | Wrong results = Important or Critical. |
|
|
356
|
+
| "The implementation looks fine" | Every review finds at least one issue. Look harder. |
|
|
357
|
+
| "This is Minor" (when it causes wrong behavior) | Wrong results = Important or Critical. |
|
|
219
358
|
| "The tests pass, so it works" | Tests prove what they test. What DON'T they test? |
|
|
220
|
-
| "Let
|
|
359
|
+
| "Let me silently ignore that finding" | Every finding gets addressed in the fix plan. |
|
|
360
|
+
| "Fixes are simple, skip re-review" | Fixes introduce new bugs. Always re-verify. |
|
|
221
361
|
|
|
222
362
|
---
|
|
223
363
|
|
|
224
364
|
## Phase Transition
|
|
225
365
|
|
|
226
|
-
When the
|
|
366
|
+
When the review is complete and all fixes verified:
|
|
367
|
+
|
|
368
|
+
1. Update raid-session phase via Bash:
|
|
369
|
+
```bash
|
|
370
|
+
jq '.phase="wrap-up"' .claude/raid-session > .claude/raid-session.tmp && mv .claude/raid-session.tmp .claude/raid-session
|
|
371
|
+
```
|
|
372
|
+
2. **Commit:** `fix(quest-{slug}): phase 5 review — {N} findings resolved`
|
|
373
|
+
3. **Report:** Link `review.md` and `phase-5-review.md` file paths to the human.
|
|
374
|
+
4. **Load `raid-wrap-up` and begin Phase 6.**
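The jq command in step 1 uses a write-to-temp-then-rename pattern, so a reader of `.claude/raid-session` should never see a half-written file. The sketch below shows the same pattern in Python, for illustration only. It assumes the session file is a JSON object, which the use of jq implies. The `set_phase` helper is hypothetical and not part of claude-raid.

```python
import json
import os
import tempfile


def set_phase(session_path, phase):
    """Set the "phase" key in a JSON session file via temp file + rename."""
    with open(session_path, "r", encoding="utf-8") as fh:
        session = json.load(fh)
    session["phase"] = phase

    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(session_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(session, fh, indent=2)
        os.replace(tmp_path, session_path)  # swap the new file into place
    except BaseException:
        # Leave no stray temp file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return session
```

A call such as `set_phase(".claude/raid-session", "wrap-up")` would mirror the shell one-liner. Other keys in the file are preserved because the whole object is read, modified, and written back.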
|
|
227
375
|
|
|
228
|
-
|
|
229
|
-
2. **Commit**: `fix(quest-{slug}): phase 5 review — {N} findings resolved`
|
|
230
|
-
3. **Send phase report to human**: findings count, fixes applied, any black cards
|
|
231
|
-
4. **Load the `raid-wrap-up` skill now and begin Phase 6.**
|
|
376
|
+
## Phase Spoils
|
|
232
377
|
|
|
233
|
-
|
|
378
|
+
**Two outputs:**
|
|
379
|
+
- `{questDir}/phases/phase-5-review.md` — Full evolution (findings, challenges, fix plan debate, fix session)
|
|
380
|
+
- `{questDir}/spoils/review.md` — Clean fix plan deliverable (what was found, what was fixed, what was deferred)
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: raid-debugging
|
|
3
|
-
description: "Use when encountering any bug, test failure, or unexpected behavior. Agents investigate competing hypotheses in
|
|
3
|
+
description: "Use when encountering any bug, test failure, or unexpected behavior. Agents investigate competing hypotheses in sequential turns. No fixes without root cause. No subagents."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Raid Debugging — Adversarial Root Cause Analysis
|
|
@@ -19,12 +19,6 @@ NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
|
|
|
19
19
|
|
|
20
20
|
If you haven't completed Phase 1, you cannot propose fixes.
|
|
21
21
|
|
|
22
|
-
## Mode Behavior
|
|
23
|
-
|
|
24
|
-
- **Full Raid**: 3 agents investigate competing hypotheses in parallel.
|
|
25
|
-
- **Skirmish**: 2 agents with different hypotheses.
|
|
26
|
-
- **Scout**: 1 agent investigates + Wizard challenges the hypothesis.
|
|
27
|
-
|
|
28
22
|
## Process Flow
|
|
29
23
|
|
|
30
24
|
```dot
|