claude-raid 0.1.0

Files changed (31)
  1. package/LICENSE +21 -0
  2. package/README.md +345 -0
  3. package/bin/cli.js +34 -0
  4. package/package.json +37 -0
  5. package/src/detect-project.js +112 -0
  6. package/src/doctor.js +201 -0
  7. package/src/init.js +138 -0
  8. package/src/merge-settings.js +119 -0
  9. package/src/remove.js +92 -0
  10. package/src/update.js +110 -0
  11. package/template/.claude/agents/archer.md +115 -0
  12. package/template/.claude/agents/rogue.md +116 -0
  13. package/template/.claude/agents/warrior.md +114 -0
  14. package/template/.claude/agents/wizard.md +206 -0
  15. package/template/.claude/hooks/validate-commit-message.sh +78 -0
  16. package/template/.claude/hooks/validate-file-naming.sh +73 -0
  17. package/template/.claude/hooks/validate-no-placeholders.sh +67 -0
  18. package/template/.claude/hooks/validate-phase-gate.sh +60 -0
  19. package/template/.claude/hooks/validate-tests-pass.sh +43 -0
  20. package/template/.claude/hooks/validate-verification.sh +70 -0
  21. package/template/.claude/raid-rules.md +21 -0
  22. package/template/.claude/skills/raid-debugging/SKILL.md +159 -0
  23. package/template/.claude/skills/raid-design/SKILL.md +208 -0
  24. package/template/.claude/skills/raid-finishing/SKILL.md +123 -0
  25. package/template/.claude/skills/raid-git-worktrees/SKILL.md +96 -0
  26. package/template/.claude/skills/raid-implementation/SKILL.md +155 -0
  27. package/template/.claude/skills/raid-implementation-plan/SKILL.md +173 -0
  28. package/template/.claude/skills/raid-protocol/SKILL.md +288 -0
  29. package/template/.claude/skills/raid-review/SKILL.md +133 -0
  30. package/template/.claude/skills/raid-tdd/SKILL.md +147 -0
  31. package/template/.claude/skills/raid-verification/SKILL.md +113 -0
package/template/.claude/skills/raid-design/SKILL.md
@@ -0,0 +1,208 @@
1
+ ---
2
+ name: raid-design
3
+ description: "Phase 1 of Raid protocol. Wizard opens the Dungeon, agents explore freely from different angles, challenge and build on each other directly, and pin verified findings. Wizard closes when design is battle-tested."
4
+ ---
5
+
6
+ # Raid Design — Phase 1
7
+
8
+ Turn ideas into battle-tested designs through agent-driven adversarial exploration.
9
+
10
+ <HARD-GATE>
11
+ Do NOT write any code, scaffold any project, or take any implementation action until the Wizard has approved the design and it is committed to git. All assigned agents participate. No subagents.
12
+ </HARD-GATE>
13
+
14
+ ## Mode Behavior
15
+
16
+ - **Full Raid**: All 3 agents explore from different angles, fight directly, pin findings to Dungeon. Full design doc required.
17
+ - **Skirmish**: 2 agents explore and interact, produce a lightweight design+plan combined doc.
18
+ - **Scout**: Wizard assesses inline, no design doc required. Skip this skill entirely.
19
+
20
+ ## Process Flow
21
+
22
+ ```dot
23
+ digraph design {
24
+ "Wizard comprehends request (reads 3x)" -> "Scope check";
25
+ "Too large?" [shape=diamond]; "Scope check" -> "Too large?";
26
+ "Too large?" -> "Decompose into sub-projects" [label="yes"];
27
+ "Decompose into sub-projects" -> "Brainstorm first sub-project";
28
+ "Too large?" -> "Explore project context" [label="no"];
29
+ "Explore project context" -> "Research dependencies";
30
+ "Research dependencies" -> "Ask clarifying questions (one at a time)";
31
+ "Ask clarifying questions (one at a time)" -> "Wizard opens Dungeon + dispatches";
32
+ "Wizard opens Dungeon + dispatches" -> "Agents explore, challenge, build freely";
33
+ "Agents explore, challenge, build freely" -> "Agents pin verified findings to Dungeon";
34
+ "Dungeon sufficient?" [shape=diamond]; "Agents pin verified findings to Dungeon" -> "Dungeon sufficient?";
35
+ "Dungeon sufficient?" -> "Agents explore, challenge, build freely" [label="no"];
36
+ "Dungeon sufficient?" -> "Wizard closes: synthesizes 2-3 approaches from Dungeon" [label="yes"];
37
+ "Wizard closes: synthesizes 2-3 approaches from Dungeon" -> "Present design (section by section)";
38
+ "Human approves?" [shape=diamond]; "Present design (section by section)" -> "Human approves?";
39
+ "Human approves?" -> "Present design (section by section)" [label="revise"];
40
+ "Human approves?" -> "Write design doc" [label="yes"];
41
+ "Write design doc" -> "Adversarial spec review (agents attack directly)";
42
+ "Adversarial spec review (agents attack directly)" -> "Spec self-review (fix inline)";
43
+ "Spec self-review (fix inline)" -> "Human reviews written spec";
44
+ "Commit + invoke raid-implementation-plan" [shape=doublecircle]; "Human reviews written spec" -> "Commit + invoke raid-implementation-plan";
45
+ }
46
+ ```
47
+
48
+ ## Wizard Checklist
49
+
50
+ Complete in order:
51
+
52
+ 1. **Comprehend the request** — read 3 times, identify the real problem beneath the stated one
53
+ 2. **Scope check** — if the request describes multiple independent subsystems, flag it immediately
54
+ 3. **Explore project context** — files, docs, recent commits, dependencies, conventions, patterns
55
+ 4. **Research dependencies** — API surface, versioning, compatibility, known issues. Read docs COMPLETELY.
56
+ 5. **Ask clarifying questions** — one at a time to the human, eliminate every ambiguity
57
+ 6. **Open the Dungeon** — create `.claude/raid-dungeon.md` with Phase 1 header, quest, mode
58
+ 7. **Dispatch with angles** — give each agent their angle, then go silent
59
+ 8. **Observe the fight** — agents explore, challenge, build, roast, and pin findings to Dungeon. Intervene only on triggers.
60
+ 9. **Close the phase** — when Dungeon has sufficient verified findings to form 2-3 approaches
61
+ 10. **Synthesize approaches** — propose 2-3 approaches from Dungeon evidence, with trade-offs and recommendation
62
+ 11. **Present design** — in sections scaled to complexity, get human approval per section
63
+ 12. **Write design doc** — save to specs path from `.claude/raid.json`
64
+ 13. **Adversarial spec review** — agents attack the written spec directly, challenging each other
65
+ 14. **Spec self-review** — fix issues inline (see checklist below)
66
+ 15. **Human reviews written spec** — human approves before proceeding
67
+ 16. **Commit** — `docs(design): <topic> specification`
68
+ 17. **Archive Dungeon** — rename to `.claude/raid-dungeon-phase-1.md`
69
+ 18. **Transition** — invoke `raid-implementation-plan`
70
+
71
+ ## Opening the Dungeon
72
+
73
+ Create `.claude/raid-dungeon.md`:
74
+
75
+ ```markdown
76
+ # Dungeon — Phase 1: Design
77
+ ## Quest: <task description from human>
78
+ ## Mode: <Full Raid | Skirmish>
79
+
80
+ ### Discoveries
81
+
82
+ ### Active Battles
83
+
84
+ ### Resolved
85
+
86
+ ### Shared Knowledge
87
+
88
+ ### Escalations
89
+ ```
90
+
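The template above could be written to disk with a small helper. This is a minimal sketch, not part of the package: `open_dungeon` and its argument order are hypothetical, and the caller supplies the header text so the same helper can serve other phases.

```bash
# Hypothetical helper (not shipped with claude-raid): write a Dungeon skeleton.
# $1 = header text for the first line, $2 = quest, $3 = mode,
# $4 = target file (assumed default: .claude/raid-dungeon.md)
open_dungeon() {
  dungeon_file="${4:-.claude/raid-dungeon.md}"
  mkdir -p "$(dirname "$dungeon_file")"
  # Unquoted heredoc delimiter so the positional parameters expand.
  cat > "$dungeon_file" <<EOF
# $1
## Quest: $2
## Mode: $3

### Discoveries

### Active Battles

### Resolved

### Shared Knowledge

### Escalations
EOF
}
```

Call it with the header line from the template, the human's task description, and the mode, for example `open_dungeon "<header>" "Add rate limiting" "Full Raid"`.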
91
+ ## Dispatch Pattern
92
+
93
+ Each agent gets the same objective but a different starting angle. After dispatch, the Wizard goes silent.
94
+
95
+ **📡 DISPATCH:**
96
+
97
+ > **@Warrior**: Explore from the data/infrastructure side. What are the hard technical constraints? What schemas, migrations, APIs are needed? What breaks if we get this wrong? Find the structural load-bearing walls. Challenge @Archer and @Rogue's findings directly. Pin verified findings to the Dungeon.
98
+ >
99
+ > **@Archer**: Explore from the integration/consistency side. How does this fit with existing patterns? What implicit contracts exist? What are the ripple effects? Trace the dependency chain. Check naming and file structure conventions. Challenge @Warrior and @Rogue's findings directly. Pin verified findings to the Dungeon.
100
+ >
101
+ > **@Rogue**: Explore from the failure/adversarial side. What assumptions does the design make about inputs, state, timing, and availability? Build failure scenarios. What does a malicious user do? What does a slow network do? What does concurrent access do? Challenge @Warrior and @Archer's findings directly. Pin verified findings to the Dungeon.
102
+ >
103
+ > **All**: Read the Dungeon. Build on each other's discoveries. Challenge everything. Pin only what survives. Escalate to me with `🆘 WIZARD:` only when genuinely stuck.
104
+
105
+ ## What Agents Must Cover
106
+
107
+ Every agent addresses ALL of these from their assigned angle:
108
+
109
+ - **Performance** — scale, bottlenecks, complexity
110
+ - **Robustness** — retries, fallbacks, graceful degradation
111
+ - **Reliability** — blast radius of failure, production-readiness
112
+ - **Testability** — meaningful tests, mock strategy, test-friendly design
113
+ - **Error handling** — what errors occur, how surfaced, UX of failure
114
+ - **Edge cases** — empty, null, boundary, Unicode, timezones, large payloads
115
+ - **Cascading effects** — blast radius, what else changes
116
+ - **Clean architecture** — separation of concerns, single responsibility, dependency inversion
117
+ - **Modularity & composability** — replaceable, extensible, composable
118
+ - **DRY** — duplicating logic? reuse existing code?
119
+ - **Dependencies** — version compatibility, security, maintenance, licensing
120
+
121
+ ## The Fight — What Makes It Productive
122
+
123
+ ```
124
+ Agents interact DIRECTLY — @Name addressing, building, challenging, roasting:
125
+ 1. Present findings with EVIDENCE (file paths, docs, concrete examples)
126
+ 2. Challenge other agents DIRECTLY with COUNTER-EVIDENCE (not opinions)
127
+ 3. Build on each other's discoveries — 🔗 BUILDING ON @Name:
128
+ 4. Go to the EDGES — push every finding to its extreme
129
+ 5. LEARN from each other — incorporate discoveries into your model
130
+ 6. Pin verified findings — 📌 DUNGEON: only after surviving challenge
131
+ 7. Roast weak analysis — 🔥 ROAST: with evidence, not insults
132
+ 8. Escalate to Wizard — 🆘 WIZARD: only when genuinely stuck
133
+ ```
134
+
135
+ **The goal is not to tear each other down. The goal is to forge the strongest design by testing it from every angle. The Dungeon captures what survived.**
136
+
137
+ ## Closing the Phase
138
+
139
+ The Wizard closes when the Dungeon has sufficient verified findings — enough Discoveries, Shared Knowledge, and Resolved battles to synthesize 2-3 approaches.
140
+
141
+ **How the Wizard knows it's time to close:**
142
+ - Dungeon has verified findings covering all major aspects (performance, robustness, testability, etc.)
143
+ - Active Battles section is empty or has only minor unresolved points
144
+ - Agents are converging — new findings are variations, not revelations
145
+ - Shared Knowledge section has the foundational truths the design needs
146
+
147
+ **⚡ WIZARD RULING:** Synthesize from Dungeon evidence. Propose 2-3 approaches. Recommend one. Archive Dungeon.
148
+
149
+ ## Spec Self-Review
150
+
151
+ After writing the design doc, the Wizard reviews with fresh eyes:
152
+
153
+ 1. **Placeholder scan:** Any TBD, TODO, incomplete sections, vague requirements? Fix them.
154
+ 2. **Internal consistency:** Do any sections contradict each other? Does the architecture match the feature descriptions?
155
+ 3. **Scope check:** Is the spec focused enough for a single implementation plan, or does it need decomposition?
156
+ 4. **Ambiguity check:** Could any requirement be interpreted two ways? Pick one and make it explicit.
157
+
158
+ Fix issues inline.
159
+
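The placeholder scan in step 1 lends itself to a mechanical first pass. The sketch below is an illustration only: `scan_placeholders` is a hypothetical name, it is not the package's `validate-no-placeholders.sh` hook, and it catches only literal `TBD` and `TODO` markers. Vague requirements still need a human or agent read.

```bash
# Hypothetical check: print spec lines that still contain TBD or TODO.
# Returns 0 when none are found, 1 when placeholders remain.
# $1 = path to the design doc
scan_placeholders() {
  # -w matches whole words only, so "TODOS" or "outbid" are not flagged.
  if grep -n -w -E 'TBD|TODO' "$1"; then
    return 1
  fi
  return 0
}
```

A non-zero return means the spec is not ready for the human review step.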
160
+ ## Design Document Structure
161
+
162
+ Save to: specs path from `.claude/raid.json` (default: `docs/raid/specs/YYYY-MM-DD-<topic>-design.md`)
163
+
164
+ ```markdown
165
+ # [Feature Name] Design Specification
166
+
167
+ **Date:** YYYY-MM-DD
168
+ **Status:** Draft | Under Review | Approved
169
+ **Raid Team:** Wizard (dungeon master), [agents used]
170
+ **Mode:** Full Raid | Skirmish
171
+
172
+ ## Problem Statement
173
+ ## Requirements (numbered, unambiguous)
174
+ ## Constraints
175
+ ## Dungeon Findings (verified, from Phase 1 Dungeon)
176
+ ### Key Discoveries (survived cross-testing)
177
+ ### Lessons Learned (wrong assumptions corrected)
178
+ ## Design Decision
179
+ ### Alternatives Considered (2-3 with rejection reasons)
180
+ ## Architecture
181
+ ## File Structure
182
+ ## Error Handling Strategy
183
+ ## Testing Strategy
184
+ ## Edge Cases
185
+ ## Future Considerations (NOT building now, designing to accommodate)
186
+ ## ⚡ WIZARD RULING
187
+ ```
188
+
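The default file name pattern `YYYY-MM-DD-<topic>-design.md` can be sketched as a shell function. Everything here beyond the pattern itself is an assumption: the name `spec_path`, the slug rules (lowercase, runs of non-alphanumerics collapsed to one hyphen), and the fact that it ignores any custom specs path configured in `.claude/raid.json`.

```bash
# Hypothetical helper: build the default design-doc path for a topic.
# $1 = topic as free text, $2 = date as YYYY-MM-DD (defaults to today)
spec_path() {
  # Slug rules are an assumption, not taken from the package.
  topic_slug=$(printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-//; s/-$//')
  printf 'docs/raid/specs/%s-%s-design.md\n' "${2:-$(date +%Y-%m-%d)}" "$topic_slug"
}
```

Passing the date explicitly keeps the result reproducible, which is convenient when the path is computed once and reused in the commit step.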
189
+ ## Red Flags — Thoughts That Signal Violations
190
+
191
+ | Thought | Reality |
192
+ |---------|---------|
193
+ | "This is too simple to need a design" | Simple projects are where unexamined assumptions cause the most waste. |
194
+ | "I already know the right approach" | Knowing and verifying are different. Propose 2-3 anyway. |
195
+ | "Let's just start coding and figure it out" | Code without design becomes the design. And it's usually wrong. |
196
+ | "The agents all agree, let's move on" | Agreement without challenge is groupthink. Did they actually cross-test? |
197
+ | "I'll wait for the Wizard to tell me what to do" | You own the phase. Explore, challenge, build. Self-organize. |
198
+ | "Let me just post everything to the Dungeon" | Only verified, challenged findings get pinned. |
199
+ | "I need the Wizard to mediate this disagreement" | Talk to the other agent directly first. Escalate only if stuck. |
200
+
201
+ ## Escalation
202
+
203
+ If the team is stuck on a fundamental design choice after genuine direct debate:
204
+ 1. Present the top 2 options with trade-offs to the human
205
+ 2. Let the human decide
206
+ 3. Never ask the human to resolve something the team should handle
207
+
208
+ **Terminal state:** ⚡ WIZARD RULING: Design approved. Commit. Archive Dungeon. Invoke `raid-implementation-plan`.
package/template/.claude/skills/raid-finishing/SKILL.md
@@ -0,0 +1,123 @@
1
+ ---
2
+ name: raid-finishing
3
+ description: "Use after Phase 4 review is approved. Agents debate completeness directly, fighting over what's truly done. Wizard closes with verdict, presents merge options, cleans up Dungeon files and session."
4
+ ---
5
+
6
+ # Raid Finishing — Complete the Development Branch
7
+
8
+ Agents debate completeness directly. Verify. Present options. Execute. Clean up.
9
+
10
+ **Violating the letter of this process is violating its spirit.**
11
+
12
+ ## Mode Behavior
13
+
14
+ - **Full Raid**: All 3 agents debate completeness directly. Full verification.
15
+ - **Skirmish**: 1 agent + Wizard verify completeness.
16
+ - **Scout**: Wizard verifies alone.
17
+
18
+ ## Process Flow
19
+
20
+ ```dot
21
+ digraph finishing {
22
+ "Wizard opens final debate" -> "Agents argue directly: truly done?";
23
+ "Any agent says incomplete?" [shape=diamond]; "Agents argue directly: truly done?" -> "Any agent says incomplete?";
24
+ "Any agent says incomplete?" -> "Agent presents evidence, others attack" [label="yes"];
25
+ "Wizard rules" [shape=diamond]; "Agent presents evidence, others attack" -> "Wizard rules";
26
+ "Wizard rules" -> "Return to Phase 3 or 4" [label="incomplete"];
27
+ "Wizard rules" -> "Verify all tests pass (fresh run)" [label="complete"];
28
+ "Any agent says incomplete?" -> "Verify all tests pass (fresh run)" [label="no, all agree"];
29
+ "Tests pass?" [shape=diamond]; "Verify all tests pass (fresh run)" -> "Tests pass?";
30
+ "Tests pass?" -> "Fix first. Do not present options." [label="no"];
31
+ "Fix first. Do not present options." -> "Verify all tests pass (fresh run)";
32
+ "Tests pass?" -> "Present 4 options" [label="yes"];
33
+ "Present 4 options" -> "Execute choice";
34
+ "Execute choice" -> "Clean up: Dungeon files + worktree + raid-session";
35
+ "Done" [shape=doublecircle]; "Clean up: Dungeon files + worktree + raid-session" -> "Done";
36
+ }
37
+ ```
38
+
39
+ ## Wizard Checklist
40
+
41
+ 1. **Open final debate** — dispatch agents to argue completeness directly
42
+ 2. **Observe the fight** — agents challenge each other on what's done vs. missing
43
+ 3. **Wizard rules on completeness** — only proceed if ruling is "complete"
44
+ 4. **Verify all tests pass** — full suite, fresh run
45
+ 5. **Present options** — exactly 4 choices
46
+ 6. **Execute choice** — merge, PR, keep, or discard
47
+ 7. **Clean up** — remove all Dungeon files (`.claude/raid-dungeon.md`, `.claude/raid-dungeon-phase-*.md`), worktree if applicable, remove `.claude/raid-session`
48
+
49
+ ## Step 1: The Completeness Debate
50
+
51
+ **📡 DISPATCH:**
52
+
53
+ > **@Warrior**: Review the implementation against the plan. Is every task completed? Every acceptance criterion met? Every test passing? Is anything half-done? Fight @Archer and @Rogue directly on their assessments.
54
+ >
55
+ > **@Archer**: Review the implementation against the design doc. Is every requirement covered? Naming patterns consistent throughout? File structure clean? Did we introduce inconsistencies with the rest of the codebase? Fight @Warrior and @Rogue directly.
56
+ >
57
+ > **@Rogue**: Review from the adversarial angle. What did we miss? What edge case is untested? What requirement was subtly misinterpreted? What will break in the first week of production? Fight @Warrior and @Archer directly.
58
+ >
59
+ > **All**: Reference ALL archived Dungeons (Phase 1-4) for full context. Debate directly. If you believe the work is incomplete, present evidence. Others challenge your claim. Pin conclusions to conversation (no Dungeon for finishing — this is the final debate).
60
+
61
+ **The agents must fight over this.** If any agent believes the work is incomplete, they present evidence. The other two challenge that claim directly.
62
+
63
+ ⚡ WIZARD RULING: [Complete — proceed | Incomplete — return to Phase 3/4 with specific issues]
64
+
65
+ ## Step 2: Final Verification
66
+
67
+ ```
68
+ BEFORE presenting options:
69
+ 1. IDENTIFY: test command from .claude/raid.json
70
+ 2. RUN: Execute the FULL test suite (fresh, complete)
71
+ 3. READ: Full output, check exit code, count failures
72
+ 4. VERIFY: Zero failures?
73
+ If NO → STOP. Fix first. Do not present options.
74
+ If YES → Proceed with evidence.
75
+ ```
76
+
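The verification gate above can be sketched as a function that refuses to continue unless the test command exits cleanly. This is a simplified illustration: `verify_tests` is a hypothetical name, the command is passed in rather than read from `.claude/raid.json`, and only the exit code is checked. Reading the full output and counting failures, as step 3 requires, is left to the caller.

```bash
# Hypothetical gate: run the full test command and stop on failure.
# $1 = test command string (the skill reads it from .claude/raid.json)
verify_tests() {
  if [ -z "$1" ]; then
    echo "No test command configured" >&2
    return 2
  fi
  if sh -c "$1"; then
    echo "PASS: proceed with evidence"
  else
    status=$?   # exit status of the test command
    echo "FAIL (exit $status): fix first, do not present options" >&2
    return 1
  fi
}
```

A caller would present the four options only when `verify_tests` returns 0.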
77
+ ## Step 3: Present Options
78
+
79
+ ```
80
+ ⚡ WIZARD RULING: Implementation complete and verified.
81
+
82
+ Tests: [N] passing, 0 failures (evidence: [command output])
83
+
84
+ Options:
85
+ 1. Merge back to [base-branch] locally
86
+ 2. Push and create a Pull Request
87
+ 3. Keep the branch as-is (handle later)
88
+ 4. Discard this work
89
+
90
+ Which option?
91
+ ```
92
+
93
+ ## Step 4: Execute
94
+
95
+ | Option | Actions |
96
+ |--------|---------|
97
+ | **1. Merge** | Checkout base -> pull -> merge -> run tests on merged result -> delete branch -> clean up |
98
+ | **2. PR** | Push with -u -> create PR via gh -> clean up |
99
+ | **3. Keep** | Report branch location. Done. |
100
+ | **4. Discard** | Require typed "discard" confirmation -> delete branch (force) -> clean up |
101
+
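Option 4 requires a typed "discard" confirmation before anything is deleted. A minimal sketch of that guard follows. `confirm_discard` is a hypothetical name, and treating the match as case-sensitive is an assumption.

```bash
# Hypothetical guard for option 4: succeed only when the typed answer
# is exactly "discard". Reads one line from standard input.
confirm_discard() {
  printf 'Type "discard" to delete this work: ' >&2
  IFS= read -r answer || return 1
  [ "$answer" = "discard" ]
}
```

The force delete of the branch would run only after this returns 0, for example `confirm_discard && echo "confirmed"`.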
102
+ ## Step 5: Clean Up
103
+
104
+ Remove ALL Dungeon artifacts:
105
+ - `.claude/raid-dungeon.md` (if exists)
106
+ - `.claude/raid-dungeon-phase-1.md`
107
+ - `.claude/raid-dungeon-phase-2.md`
108
+ - `.claude/raid-dungeon-phase-3.md`
109
+ - `.claude/raid-dungeon-phase-4.md`
110
+ - `.claude/raid-session`
111
+ - Worktree (if applicable)
112
+
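The file removals listed above can be sketched as one function. This is an illustration under assumptions: `raid_cleanup` is a hypothetical name, `.claude/raid-session` is assumed to be a regular file, and worktree removal is not covered.

```bash
# Hypothetical cleanup: remove Dungeon artifacts and the session marker.
# $1 = path to the .claude directory (defaults to .claude)
raid_cleanup() {
  claude_dir="${1:-.claude}"
  # rm -f stays quiet when a file is missing or the glob matches nothing.
  rm -f "$claude_dir/raid-dungeon.md" \
        "$claude_dir"/raid-dungeon-phase-*.md \
        "$claude_dir/raid-session"
}
```

The glob removes however many phase archives exist, and other files in the directory, such as `raid.json`, are left untouched.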
113
+ ## Red Flags
114
+
115
+ | Thought | Reality |
116
+ |---------|---------|
117
+ | "Tests passed earlier, no need to re-run" | Verification Iron Law. Fresh run or no claim. |
118
+ | "The completeness debate is a formality" | It's where missed requirements surface. Take it seriously. |
119
+ | "Let me report to the Wizard whether it's complete" | Debate with the other agents directly. |
120
+ | "Merge without testing the merged result" | Merges introduce conflicts. Always test after merge. |
121
+ | "Leave the Dungeon files, they might be useful" | Clean up. Session artifacts don't belong in the repo. |
122
+
123
+ **Terminal state:** Choice executed. All Dungeon files removed. `.claude/raid-session` removed. Session over.
package/template/.claude/skills/raid-git-worktrees/SKILL.md
@@ -0,0 +1,96 @@
1
+ ---
2
+ name: raid-git-worktrees
3
+ description: "Use when starting Raid implementation that needs isolation. Creates isolated git worktree with safety verification and clean test baseline."
4
+ ---
5
+
6
+ # Raid Git Worktrees — Isolated Workspaces
7
+
8
+ Systematic directory selection + safety verification = reliable isolation.
9
+
10
+ ## Process Flow
11
+
12
+ ```dot
13
+ digraph worktree {
14
+ "Check worktree path from raid.json" -> "Directory exists?";
15
+ "Directory exists?" -> "Verify gitignored" [label="yes"];
16
+ "Directory exists?" -> "Create directory" [label="no"];
17
+ "Create directory" -> "Add to .gitignore + commit";
18
+ "Add to .gitignore + commit" -> "Verify gitignored";
19
+ "Ignored?" [shape=diamond]; "Verify gitignored" -> "Ignored?";
20
+ "Ignored?" -> "Create worktree" [label="yes"];
21
+ "Ignored?" -> "Add to .gitignore + commit" [label="no"];
22
+ "Create worktree" -> "Install dependencies";
23
+ "Install dependencies" -> "Run baseline tests";
24
+ "Tests pass?" [shape=diamond]; "Run baseline tests" -> "Tests pass?";
25
+ "Report ready" [shape=doublecircle]; "Tests pass?" -> "Report ready" [label="yes"];
26
+ "Tests pass?" -> "Report failures, ask user" [label="no"];
27
+ }
28
+ ```
29
+
30
+ ## Directory Selection Priority
31
+
32
+ 1. Check worktrees path from `.claude/raid.json` (default: `.worktrees/`) -> use it (verify ignored)
33
+ 2. Check CLAUDE.md for preference -> use it
34
+ 3. Ask the user
35
+
36
+ ## Safety Verification
37
+
38
+ ```bash
39
+ # MUST verify directory is gitignored before creating worktree
40
+ git check-ignore -q [worktrees-path] 2>/dev/null
41
+ ```
42
+
43
+ If NOT ignored: add to `.gitignore`, commit immediately, then proceed. Fix broken things immediately — don't leave unignored worktree directories.
44
+
45
+ ## Creation
46
+
47
+ ```bash
48
+ WORKTREE_PATH=$(jq -r '.paths.worktrees // ".worktrees"' .claude/raid.json)
49
+ git worktree add "$WORKTREE_PATH/$BRANCH_NAME" -b "$BRANCH_NAME"  # BRANCH_NAME must already be set by the caller
50
+ cd "$WORKTREE_PATH/$BRANCH_NAME"
51
+
52
+ # Auto-detect and install deps
53
+ [ -f package.json ] && npm install
54
+ [ -f Cargo.toml ] && cargo build
55
+ [ -f requirements.txt ] && pip install -r requirements.txt
56
+ [ -f pyproject.toml ] && poetry install
57
+ [ -f go.mod ] && go mod download
58
+
59
+ # Verify clean baseline
60
+ TEST_CMD=$(jq -r '.project.testCommand // empty' .claude/raid.json)
61
+ [ -n "$TEST_CMD" ] && eval "$TEST_CMD"
62
+ ```
63
+
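To preview what the auto-detect lines above would run without installing anything, the same manifest checks can print the commands instead. This is a sketch only: `detect_install_cmds` is a hypothetical name, and the manifest-to-command mapping is copied from the block above, including its choice of Poetry for `pyproject.toml`.

```bash
# Hypothetical dry run: print the install command for each manifest found.
# $1 = directory to inspect (defaults to the current directory)
detect_install_cmds() {
  dir="${1:-.}"
  [ -f "$dir/package.json" ] && echo "npm install"
  [ -f "$dir/Cargo.toml" ] && echo "cargo build"
  [ -f "$dir/requirements.txt" ] && echo "pip install -r requirements.txt"
  [ -f "$dir/pyproject.toml" ] && echo "poetry install"
  [ -f "$dir/go.mod" ] && echo "go mod download"
  return 0   # finding no manifest is not an error
}
```

An empty result means no known manifest was found, so there is nothing to install before the baseline test run.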
64
+ ## Report
65
+
66
+ ```
67
+ Worktree ready at [path]
68
+ Branch: [branch-name]
69
+ Tests: [N] passing, 0 failures
70
+ Ready for Raid implementation
71
+
72
+ Note: Dungeon files (.claude/raid-dungeon*.md) are session artifacts
73
+ and will be cleaned up by raid-finishing. No gitignore needed.
74
+ ```
75
+
76
+ ## Quick Reference
77
+
78
+ | Situation | Action |
79
+ |-----------|--------|
80
+ | `.worktrees/` exists | Use it (verify ignored) |
81
+ | `worktrees/` exists | Use it (verify ignored) |
82
+ | Both exist | Use `.worktrees/` |
83
+ | Neither exists | Check raid.json -> CLAUDE.md -> ask user |
84
+ | Directory not ignored | Add to .gitignore + commit first |
85
+ | Tests fail during baseline | Report failures + ask user before proceeding |
86
+ | No test command configured | Warn, proceed without baseline |
87
+
88
+ ## Red Flags
89
+
90
+ | Thought | Reality |
91
+ |---------|---------|
92
+ | "I'll add it to .gitignore later" | Fix it now. Worktree dirs must never be committed. |
93
+ | "Baseline tests don't matter" | Failing baseline = you'll waste time debugging pre-existing failures. |
94
+ | "Skip dependency install, it'll be fine" | Missing deps = mysterious failures during implementation. |
95
+
96
+ **Never** create a worktree without verifying it's gitignored. **Never** skip baseline test verification. **Never** proceed with failing baseline tests without asking.
package/template/.claude/skills/raid-implementation/SKILL.md
@@ -0,0 +1,155 @@
1
+ ---
2
+ name: raid-implementation
3
+ description: "Phase 3 of Raid protocol. Wizard assigns implementer and opens task Dungeon. Implementer builds with TDD. Challengers attack directly, building on each other's critiques. Wizard rotates and closes each task."
4
+ ---
5
+
6
+ # Raid Implementation — Phase 3
7
+
8
+ One builds, two attack — and the attackers attack each other's reviews too. Every implementation earns its approval through direct adversarial pressure.
9
+
10
+ <HARD-GATE>
11
+ Do NOT implement without an approved plan (except Scout mode). Do NOT skip TDD. Do NOT let any implementation pass unchallenged. Do NOT use subagents. Use `raid-tdd` skill for all test-driven development. Use `raid-verification` before any completion claims.
12
+ </HARD-GATE>
13
+
14
+ ## Mode Behavior
15
+
16
+ - **Full Raid**: 1 implements, 2 challenge (and challenge each other's reviews). Rotate implementer.
17
+ - **Skirmish**: 1 implements, 1 challenges. Swap roles each task.
18
+ - **Scout**: 1 agent implements. Wizard reviews. Self-challenge ruthlessly.
19
+
20
+ TDD is enforced in ALL modes. This is an Iron Law.
21
+
22
+ ## Process Flow
23
+
24
+ ```dot
25
+ digraph implementation {
26
+ "Wizard reads plan + Phase 2 Dungeon" -> "Create task tracking (TaskCreate)";
27
+ "Create task tracking (TaskCreate)" -> "Wizard assigns task N (rotate implementer)";
28
+ "Wizard assigns task N (rotate implementer)" -> "Wizard opens task Dungeon";
29
+ "Wizard opens task Dungeon" -> "Implementer executes (TDD)";
30
+ "Implementer executes (TDD)" -> "Implementer reports status";
31
+ "Status?" [shape=diamond]; "Implementer reports status" -> "Status?";
32
+ "Status?" -> "Challengers attack directly" [label="DONE"];
33
+ "Status?" -> "Read concerns, decide" [label="DONE_WITH_CONCERNS"];
34
+ "Status?" -> "Provide context, re-dispatch" [label="NEEDS_CONTEXT"];
35
+ "Status?" -> "Assess blocker" [label="BLOCKED"];
36
+ "Read concerns, decide" -> "Challengers attack directly";
37
+ "Provide context, re-dispatch" -> "Implementer executes (TDD)";
38
+ "Assess blocker" -> "Break down task / escalate";
39
+ "Challengers attack directly" -> "Challengers build on each other's critiques";
40
+ "Challengers build on each other's critiques" -> "Implementer defends against both";
41
+ "All issues resolved?" [shape=diamond]; "Implementer defends against both" -> "All issues resolved?";
42
+ "All issues resolved?" -> "Challengers attack directly" [label="no, continue"];
43
+ "All issues resolved?" -> "Wizard closes task: ruling" [label="yes"];
44
+ "More tasks?" [shape=diamond]; "Wizard closes task: ruling" -> "More tasks?";
45
+ "More tasks?" -> "Wizard assigns task N (rotate implementer)" [label="yes"];
46
+ "Archive Dungeon + invoke raid-review" [shape=doublecircle]; "More tasks?" -> "Archive Dungeon + invoke raid-review" [label="no"];
47
+ }
48
+ ```
49
+
50
+ ## Wizard Checklist
51
+
52
+ 1. **Read the plan** — extract all tasks, dependencies, ordering
53
+ 2. **Read Phase 2 archived Dungeon** — carry forward context
54
+ 3. **Set up worktree** — use `raid-git-worktrees` for isolation (optional)
55
+ 4. **Create task tracking** — use TaskCreate for every plan task
56
+ 5. **Per task:** Assign implementer (rotate), open Dungeon, observe attack, close with ruling
57
+ 6. **Track progress** — mark complete only after Wizard ruling per task
58
+ 7. **After all tasks** — archive Dungeon, invoke `raid-review`
59
+
60
+ ## The Implementation Gauntlet (per task)
61
+
62
+ ### Step 1: Wizard Assigns + Opens Dungeon
63
+
64
+ One agent implements. Others prepare to attack. **Rotate the implementer** across tasks.
65
+
66
+ The Wizard doesn't open a new Dungeon for every task — the Phase 3 Dungeon is continuous across all tasks. But the Wizard announces each task assignment clearly.
67
+
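Rotation can be made mechanical with a round-robin over the three agents. The skill only says to rotate, so the Warrior, Archer, Rogue order and the name `implementer_for_task` are assumptions for illustration.

```bash
# Hypothetical round-robin: pick the implementer for task N (1-based).
# The agent order is an assumption; the skill only requires rotation.
implementer_for_task() {
  case $(( ($1 - 1) % 3 )) in
    0) echo "Warrior" ;;
    1) echo "Archer" ;;
    2) echo "Rogue" ;;
  esac
}
```

With this scheme no agent implements two consecutive tasks in Full Raid mode, which is the property the Red Flags table asks for.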
68
+ ### Step 2: Implementer Executes (TDD)
69
+
70
+ Following `raid-tdd` strictly:
71
+ 1. Write the failing test from the plan
72
+ 2. Run test command from `.claude/raid.json` — verify it fails for the RIGHT reason
73
+ 3. Write minimal code to pass
74
+ 4. Run — verify pass
75
+ 5. Run FULL test suite — verify no regressions
76
+ 6. Self-review against acceptance criteria
77
+ 7. Commit: `feat(scope): descriptive message`
78
+
79
+ Report status: **DONE** | **DONE_WITH_CONCERNS** | **NEEDS_CONTEXT** | **BLOCKED**
80
+
81
+ ### Step 3: Challengers Attack Directly
82
+
83
+ This is where direct interaction between agents shines. Challengers don't just report to the Wizard. They:
84
+
85
+ 1. **Read ACTUAL CODE** (not the implementer's report — reports lie)
86
+ 2. **Challenge the implementer directly:** `⚔️ CHALLENGE: @Warrior, your implementation at handler.js:23 doesn't validate...`
87
+ 3. **Build on each other's critiques:** `🔗 BUILDING ON @Archer: Your naming drift finding — the inconsistency also affects the test at...`
88
+ 4. **Roast weak implementations:** `🔥 ROAST: @Rogue, you claimed this handles concurrent access but there's no lock at...`
89
+ 5. **Pin verified issues to Dungeon:** `📌 DUNGEON: Confirmed issue — handler.js:23 missing validation [verified by @Archer and @Rogue]`
90
+
91
+ **Challengers check:**
92
+ - Spec compliance — does it match the task spec line by line?
93
+ - Design doc compliance — does it match the design requirements?
94
+ - Edge cases — what inputs break it?
95
+ - Test quality — do tests prove correctness or just confirm happy path?
96
+ - Naming consistency — do new names follow established patterns?
97
+ - File structure — does new code follow project conventions?
98
+
99
+ ### Step 4: Implementer Defends
100
+
101
+ The implementer defends against BOTH challengers simultaneously:
102
+ - Respond to each challenge with evidence or concede immediately
103
+ - Fix conceded issues
104
+ - Re-run all tests
105
+ - Pin resolved issues to Dungeon: `📌 DUNGEON: Resolved — added validation at handler.js:23 [tests pass]`
106
+
107
+ ### Step 5: Wizard Closes Task
108
+
109
+ ⚡ WIZARD RULING: Task N [approved | needs fixes]
110
+
111
+ The Wizard closes when the Dungeon shows all issues resolved and challengers have no remaining critiques.
112
+
113
+ ## Handling Implementer Status
114
+
115
+ | Status | Action |
116
+ |--------|--------|
117
+ | **DONE** | Challengers attack directly |
118
+ | **DONE_WITH_CONCERNS** | Read concerns. If they affect correctness, address them before the attack. If they are observations only, note them and proceed. |
119
+ | **NEEDS_CONTEXT** | Provide missing information. Re-dispatch. |
120
+ | **BLOCKED** | 1) Context → provide more. 2) Too complex → break into subtasks. 3) Plan wrong → fix plan. |
121
+
122
+ **Never ignore an escalation.** If the implementer says it's stuck, something needs to change.
123
+
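The status table above amounts to a four-way dispatch. A minimal sketch follows, with the hypothetical name `next_action` and action strings that paraphrase the table.

```bash
# Hypothetical dispatch: map an implementer status to the Wizard's next step.
# $1 = one of DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED
next_action() {
  case "$1" in
    DONE)               echo "challengers attack directly" ;;
    DONE_WITH_CONCERNS) echo "read concerns, then decide" ;;
    NEEDS_CONTEXT)      echo "provide context and re-dispatch" ;;
    BLOCKED)            echo "assess blocker: add context, split task, or fix plan" ;;
    *)                  echo "unknown status: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown values keeps a mistyped status from being silently treated as done.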
124
+ ## Quality Gates Per Task
125
+
126
+ - [ ] Tests written BEFORE implementation (TDD)
127
+ - [ ] Tests fail for the right reason
128
+ - [ ] Tests pass after implementation
129
+ - [ ] Full test suite passes (no regressions)
130
+ - [ ] Challengers attacked ACTUAL CODE directly
131
+ - [ ] Challengers built on each other's critiques
132
+ - [ ] All challenges addressed (fixed or defended with evidence)
133
+ - [ ] Implementation matches task spec (nothing more, nothing less)
134
+ - [ ] Naming follows established patterns
135
+ - [ ] Verified issues pinned to Dungeon
136
+ - [ ] Code committed with descriptive message
137
+
138
+ ## Red Flags
139
+
140
+ | Thought | Reality |
141
+ |---------|---------|
142
+ | "This task is simple, skip cross-testing" | Simple tasks are where assumptions slip through. |
143
+ | "The challengers should report to the Wizard" | Challengers attack the implementer and each other directly. |
144
+ | "We can batch the review for multiple tasks" | Review per task. Batching lets issues compound. |
145
+ | "I trust this agent's work" | Trust without verification is the definition of a bug farm. |
146
+ | "The same agent can implement twice in a row" | Rotation prevents blind spots. Enforce it. |
147
+ | "I'll wait for the Wizard to coordinate the review" | Attack directly. Build on each other's findings. |
148
+
149
+ ## Escalation
150
+
151
+ - **3+ fix attempts on one task:** Question whether the task spec or design is wrong.
152
+ - **Agent repeatedly blocked:** The plan may need revision.
153
+ - **Tests can't be written:** The design may not be testable. Return to Phase 1.
154
+
155
+ **Terminal state:** All tasks approved. Archive Dungeon. Invoke `raid-review`.