@qball-inc/the-bulwark 1.0.0

Files changed (175)
  1. package/.claude-plugin/plugin.json +43 -0
  2. package/agents/bulwark-fix-validator.md +633 -0
  3. package/agents/bulwark-implementer.md +391 -0
  4. package/agents/bulwark-issue-analyzer.md +308 -0
  5. package/agents/bulwark-standards-reviewer.md +221 -0
  6. package/agents/plan-creation-architect.md +323 -0
  7. package/agents/plan-creation-eng-lead.md +352 -0
  8. package/agents/plan-creation-po.md +300 -0
  9. package/agents/plan-creation-qa-critic.md +334 -0
  10. package/agents/product-ideation-competitive-analyzer.md +298 -0
  11. package/agents/product-ideation-idea-validator.md +268 -0
  12. package/agents/product-ideation-market-researcher.md +292 -0
  13. package/agents/product-ideation-pattern-documenter.md +308 -0
  14. package/agents/product-ideation-segment-analyzer.md +303 -0
  15. package/agents/product-ideation-strategist.md +259 -0
  16. package/agents/statusline-setup.md +97 -0
  17. package/hooks/hooks.json +59 -0
  18. package/package.json +45 -0
  19. package/scripts/hooks/cleanup-stale.sh +13 -0
  20. package/scripts/hooks/enforce-quality.sh +166 -0
  21. package/scripts/hooks/implementer-quality.sh +256 -0
  22. package/scripts/hooks/inject-protocol.sh +52 -0
  23. package/scripts/hooks/suggest-pipeline.sh +175 -0
  24. package/scripts/hooks/track-pipeline-start.sh +37 -0
  25. package/scripts/hooks/track-pipeline-stop.sh +52 -0
  26. package/scripts/init-rules.sh +35 -0
  27. package/scripts/init.sh +151 -0
  28. package/skills/anthropic-validator/SKILL.md +607 -0
  29. package/skills/anthropic-validator/references/agents-checklist.md +131 -0
  30. package/skills/anthropic-validator/references/commands-checklist.md +102 -0
  31. package/skills/anthropic-validator/references/hooks-checklist.md +151 -0
  32. package/skills/anthropic-validator/references/mcp-checklist.md +136 -0
  33. package/skills/anthropic-validator/references/plugins-checklist.md +148 -0
  34. package/skills/anthropic-validator/references/skills-checklist.md +85 -0
  35. package/skills/assertion-patterns/SKILL.md +296 -0
  36. package/skills/bug-magnet-data/SKILL.md +284 -0
  37. package/skills/bug-magnet-data/context/cli-args.md +91 -0
  38. package/skills/bug-magnet-data/context/db-query.md +104 -0
  39. package/skills/bug-magnet-data/context/file-contents.md +103 -0
  40. package/skills/bug-magnet-data/context/http-body.md +91 -0
  41. package/skills/bug-magnet-data/context/process-spawn.md +123 -0
  42. package/skills/bug-magnet-data/data/booleans/boundaries.yaml +143 -0
  43. package/skills/bug-magnet-data/data/collections/arrays.yaml +114 -0
  44. package/skills/bug-magnet-data/data/collections/objects.yaml +123 -0
  45. package/skills/bug-magnet-data/data/concurrency/race-conditions.yaml +118 -0
  46. package/skills/bug-magnet-data/data/concurrency/state-machines.yaml +115 -0
  47. package/skills/bug-magnet-data/data/dates/boundaries.yaml +137 -0
  48. package/skills/bug-magnet-data/data/dates/invalid.yaml +132 -0
  49. package/skills/bug-magnet-data/data/dates/timezone.yaml +118 -0
  50. package/skills/bug-magnet-data/data/encoding/charset.yaml +79 -0
  51. package/skills/bug-magnet-data/data/encoding/normalization.yaml +105 -0
  52. package/skills/bug-magnet-data/data/formats/email.yaml +154 -0
  53. package/skills/bug-magnet-data/data/formats/json.yaml +187 -0
  54. package/skills/bug-magnet-data/data/formats/url.yaml +165 -0
  55. package/skills/bug-magnet-data/data/language-specific/javascript.yaml +182 -0
  56. package/skills/bug-magnet-data/data/language-specific/python.yaml +174 -0
  57. package/skills/bug-magnet-data/data/language-specific/rust.yaml +148 -0
  58. package/skills/bug-magnet-data/data/numbers/boundaries.yaml +161 -0
  59. package/skills/bug-magnet-data/data/numbers/precision.yaml +89 -0
  60. package/skills/bug-magnet-data/data/numbers/special.yaml +69 -0
  61. package/skills/bug-magnet-data/data/strings/boundaries.yaml +109 -0
  62. package/skills/bug-magnet-data/data/strings/injection.yaml +208 -0
  63. package/skills/bug-magnet-data/data/strings/special-chars.yaml +190 -0
  64. package/skills/bug-magnet-data/data/strings/unicode.yaml +139 -0
  65. package/skills/bug-magnet-data/references/external-lists.md +115 -0
  66. package/skills/bulwark-brainstorm/SKILL.md +563 -0
  67. package/skills/bulwark-brainstorm/references/at-teammate-prompts.md +60 -0
  68. package/skills/bulwark-brainstorm/references/role-critical-analyst.md +78 -0
  69. package/skills/bulwark-brainstorm/references/role-development-lead.md +66 -0
  70. package/skills/bulwark-brainstorm/references/role-product-delivery-lead.md +79 -0
  71. package/skills/bulwark-brainstorm/references/role-product-manager.md +62 -0
  72. package/skills/bulwark-brainstorm/references/role-project-sme.md +59 -0
  73. package/skills/bulwark-brainstorm/references/role-technical-architect.md +66 -0
  74. package/skills/bulwark-research/SKILL.md +298 -0
  75. package/skills/bulwark-research/references/viewpoint-contrarian.md +63 -0
  76. package/skills/bulwark-research/references/viewpoint-direct-investigation.md +62 -0
  77. package/skills/bulwark-research/references/viewpoint-first-principles.md +65 -0
  78. package/skills/bulwark-research/references/viewpoint-practitioner.md +62 -0
  79. package/skills/bulwark-research/references/viewpoint-prior-art.md +66 -0
  80. package/skills/bulwark-scaffold/SKILL.md +330 -0
  81. package/skills/bulwark-statusline/SKILL.md +161 -0
  82. package/skills/bulwark-statusline/scripts/statusline.sh +144 -0
  83. package/skills/bulwark-verify/SKILL.md +519 -0
  84. package/skills/code-review/SKILL.md +428 -0
  85. package/skills/code-review/examples/anti-patterns/linting.ts +181 -0
  86. package/skills/code-review/examples/anti-patterns/security.ts +91 -0
  87. package/skills/code-review/examples/anti-patterns/standards.ts +195 -0
  88. package/skills/code-review/examples/anti-patterns/type-safety.ts +108 -0
  89. package/skills/code-review/examples/recommended/linting.ts +195 -0
  90. package/skills/code-review/examples/recommended/security.ts +154 -0
  91. package/skills/code-review/examples/recommended/standards.ts +231 -0
  92. package/skills/code-review/examples/recommended/type-safety.ts +181 -0
  93. package/skills/code-review/frameworks/angular.md +218 -0
  94. package/skills/code-review/frameworks/django.md +235 -0
  95. package/skills/code-review/frameworks/express.md +207 -0
  96. package/skills/code-review/frameworks/flask.md +298 -0
  97. package/skills/code-review/frameworks/generic.md +146 -0
  98. package/skills/code-review/frameworks/react.md +152 -0
  99. package/skills/code-review/frameworks/vue.md +244 -0
  100. package/skills/code-review/references/linting-patterns.md +221 -0
  101. package/skills/code-review/references/security-patterns.md +125 -0
  102. package/skills/code-review/references/standards-patterns.md +246 -0
  103. package/skills/code-review/references/type-safety-patterns.md +130 -0
  104. package/skills/component-patterns/SKILL.md +131 -0
  105. package/skills/component-patterns/references/pattern-cli-command.md +118 -0
  106. package/skills/component-patterns/references/pattern-database.md +166 -0
  107. package/skills/component-patterns/references/pattern-external-api.md +139 -0
  108. package/skills/component-patterns/references/pattern-file-parser.md +168 -0
  109. package/skills/component-patterns/references/pattern-http-server.md +162 -0
  110. package/skills/component-patterns/references/pattern-process-spawner.md +133 -0
  111. package/skills/continuous-feedback/SKILL.md +327 -0
  112. package/skills/continuous-feedback/references/collect-instructions.md +81 -0
  113. package/skills/continuous-feedback/references/specialize-code-review.md +82 -0
  114. package/skills/continuous-feedback/references/specialize-general.md +98 -0
  115. package/skills/continuous-feedback/references/specialize-test-audit.md +81 -0
  116. package/skills/create-skill/SKILL.md +359 -0
  117. package/skills/create-skill/references/agent-conventions.md +194 -0
  118. package/skills/create-skill/references/agent-template.md +195 -0
  119. package/skills/create-skill/references/content-guidance.md +291 -0
  120. package/skills/create-skill/references/decision-framework.md +124 -0
  121. package/skills/create-skill/references/template-pipeline.md +217 -0
  122. package/skills/create-skill/references/template-reference-heavy.md +111 -0
  123. package/skills/create-skill/references/template-research.md +210 -0
  124. package/skills/create-skill/references/template-script-driven.md +172 -0
  125. package/skills/create-skill/references/template-simple.md +80 -0
  126. package/skills/create-subagent/SKILL.md +353 -0
  127. package/skills/create-subagent/references/agent-conventions.md +268 -0
  128. package/skills/create-subagent/references/content-guidance.md +232 -0
  129. package/skills/create-subagent/references/decision-framework.md +134 -0
  130. package/skills/create-subagent/references/template-single-agent.md +192 -0
  131. package/skills/fix-bug/SKILL.md +241 -0
  132. package/skills/governance-protocol/SKILL.md +116 -0
  133. package/skills/init/SKILL.md +341 -0
  134. package/skills/issue-debugging/SKILL.md +385 -0
  135. package/skills/issue-debugging/references/anti-patterns.md +245 -0
  136. package/skills/issue-debugging/references/debug-report-schema.md +227 -0
  137. package/skills/mock-detection/SKILL.md +511 -0
  138. package/skills/mock-detection/references/false-positive-prevention.md +402 -0
  139. package/skills/mock-detection/references/stub-patterns.md +236 -0
  140. package/skills/pipeline-templates/SKILL.md +215 -0
  141. package/skills/pipeline-templates/references/code-change-workflow.md +277 -0
  142. package/skills/pipeline-templates/references/code-review.md +336 -0
  143. package/skills/pipeline-templates/references/fix-validation.md +421 -0
  144. package/skills/pipeline-templates/references/new-feature.md +335 -0
  145. package/skills/pipeline-templates/references/research-brainstorm.md +161 -0
  146. package/skills/pipeline-templates/references/research-planning.md +257 -0
  147. package/skills/pipeline-templates/references/test-audit.md +389 -0
  148. package/skills/pipeline-templates/references/test-execution-fix.md +238 -0
  149. package/skills/plan-creation/SKILL.md +497 -0
  150. package/skills/product-ideation/SKILL.md +372 -0
  151. package/skills/product-ideation/references/analysis-frameworks.md +161 -0
  152. package/skills/session-handoff/SKILL.md +139 -0
  153. package/skills/session-handoff/references/examples.md +223 -0
  154. package/skills/setup-lsp/SKILL.md +312 -0
  155. package/skills/setup-lsp/references/server-registry.md +85 -0
  156. package/skills/setup-lsp/references/troubleshooting.md +135 -0
  157. package/skills/subagent-output-templating/SKILL.md +415 -0
  158. package/skills/subagent-output-templating/references/examples.md +440 -0
  159. package/skills/subagent-prompting/SKILL.md +364 -0
  160. package/skills/subagent-prompting/references/examples.md +342 -0
  161. package/skills/test-audit/SKILL.md +531 -0
  162. package/skills/test-audit/references/known-limitations.md +41 -0
  163. package/skills/test-audit/references/priority-classification.md +30 -0
  164. package/skills/test-audit/references/prompts/deep-mode-detection.md +83 -0
  165. package/skills/test-audit/references/prompts/synthesis.md +57 -0
  166. package/skills/test-audit/references/rewrite-instructions.md +46 -0
  167. package/skills/test-audit/references/schemas/audit-output.yaml +100 -0
  168. package/skills/test-audit/references/schemas/diagnostic-output.yaml +49 -0
  169. package/skills/test-audit/scripts/data-flow-analyzer.ts +509 -0
  170. package/skills/test-audit/scripts/integration-mock-detector.ts +462 -0
  171. package/skills/test-audit/scripts/package.json +20 -0
  172. package/skills/test-audit/scripts/skip-detector.ts +211 -0
  173. package/skills/test-audit/scripts/verification-counter.ts +295 -0
  174. package/skills/test-classification/SKILL.md +310 -0
  175. package/skills/test-fixture-creation/SKILL.md +295 -0
@@ -0,0 +1,497 @@
1
+ ---
2
+ name: plan-creation
3
+ description: Create structured implementation plans using a 4-role scrum team with optional Agent Teams peer debate
4
+ user-invocable: true
5
+ argument-hint: "<topic, filepath, or directory> [--research <synthesis-file>]"
6
+ skills:
7
+ - subagent-prompting
8
+ ---
9
+
10
+ # Plan Creation
11
+
12
+ Create structured implementation plans through a 4-role collaborative scrum team: Product Owner, Technical Architect, Engineering & Delivery Lead, and QA/Critic. The Product Owner explores the codebase first, then Architect and Eng Lead analyze in parallel, and the QA/Critic challenges everything last. The orchestrator synthesizes all outputs into a hybrid Markdown + YAML plan.
13
+
14
+ ---
15
+
16
+ ## When to Use This Skill
17
+
18
+ **Load this skill when the user request matches ANY of these patterns:**
19
+
20
+ | Trigger Pattern | Example User Request |
21
+ |-----------------|---------------------|
22
+ | Implementation planning | "Create an implementation plan for X" |
23
+ | Feature planning | "Plan how we'd build X" |
24
+ | Project scoping | "Break down X into phases and workpackages" |
25
+ | Post-research planning | "We've researched X, now create a plan" |
26
+ | Task brief creation | "Create a task brief for X" |
27
+
28
+ **DO NOT use for:**
29
+ - Initial topic research (use `bulwark-research` first)
30
+ - Feasibility brainstorming (use `bulwark-brainstorm`)
31
+ - Quick technical questions (ask directly)
32
+ - Code review or debugging (use `code-review` or `issue-debugging`)
33
+
34
+ ---
35
+
36
+ ## Dependencies
37
+
38
+ | Category | Files | Requirement | When to Load |
39
+ |----------|-------|-------------|--------------|
40
+ | **Plan output template** | `templates/plan-output.md` | **REQUIRED** | Load at Stage 5 for plan structure |
41
+ | **Critic output template** | `templates/critic-output.md` | **REQUIRED** | Include in QA/Critic agent prompt |
42
+ | **Synthesis template** | `templates/synthesis-output.md` | **REQUIRED** | Use when writing synthesis |
43
+ | **Diagnostic template** | `templates/diagnostic-output.yaml` | **REQUIRED** | Use at Stage 6 |
44
+ | **Role output reference** | `templates/role-output.md` | OPTIONAL | Reference for parsing agent outputs |
45
+ | **Subagent prompting** | `subagent-prompting` skill | **REQUIRED** | Load at Stage 1 for 4-part prompt template |
46
+ | **Research synthesis** | `--research <file>` | OPTIONAL | If provided, include in all agent prompts |
47
+
48
+ **Fallback behavior:**
49
+ - If an agent fails to spawn: re-spawn once. If it still fails, skip that role and document the gap in the synthesis under "Incomplete Coverage"
50
+ - If PO fails: STOP — all downstream agents depend on PO output. Inform user.
51
+ - If output template is missing: Use the schema from this SKILL.md directly
52
+ - If research synthesis not provided: Agents work from problem statement alone (warn user that quality may be lower)
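Reviewer sketch of the fallback rules above, under stated assumptions: `run_role`, `PipelineStopped`, and the `spawn` callable are hypothetical names, not part of the skill or of any Claude Code API. The sketch also assumes the Product Owner gets the same single re-spawn before the pipeline stops, which the list does not state either way.

```python
from typing import Callable, List, Optional


class PipelineStopped(RuntimeError):
    """Raised when a role that every later stage depends on cannot be produced."""


def run_role(
    role: str,
    spawn: Callable[[], str],
    gaps: List[str],
    required: bool = False,
) -> Optional[str]:
    """Try a role at most twice; record a gap or stop, depending on `required`."""
    for _attempt in range(2):  # first attempt plus one re-spawn
        try:
            return spawn()
        except Exception:
            continue
    if required:
        # Assumption: the PO is the only required role, since downstream agents need its output.
        raise PipelineStopped(f"{role} failed twice; downstream roles depend on it")
    gaps.append(role)  # later listed under "Incomplete Coverage" in the synthesis
    return None
```

Passing the gap list in explicitly keeps the "Incomplete Coverage" bookkeeping in one place that the synthesis step can read.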
53
+
54
+ ---
55
+
56
+ ## Mandatory Execution Checklist (BINDING)
57
+
58
+ **Every item below is mandatory. No deviations. No substitutions. No skipping.**
59
+
60
+ This skill uses a multi-stage pipeline. You are the orchestrator. Follow every item in order. Do NOT return to the user until all applicable items are checked.
61
+
62
+ - [ ] **Stage 1 — Pre-Flight**: Topic parsed (from argument, --doc, or AskUserQuestion)
63
+ - [ ] **Stage 1 — Pre-Flight**: subagent-prompting skill loaded
64
+ - [ ] **Stage 1 — Pre-Flight**: If topic is ambiguous or under-specified, AskUserQuestion interview conducted (2-3 questions per round)
65
+ - [ ] **Stage 1 — Pre-Flight**: If --research not provided, user warned via displayed message AND asked to confirm proceeding
66
+ - [ ] **Stage 1 — Mode Detection**: `$CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` env var checked — you MUST check this, no exceptions
67
+ - [ ] **Stage 1 — Mode Detection**: If env var is SET, user offered choice via AskUserQuestion (Agent Teams vs Task tool) — you MUST NOT default silently
68
+ - [ ] **Stage 1 — Mode Detection**: If user selects Agent Teams, AT Confirmation Flow executed (RED banner + model class choice)
69
+ - [ ] **Stage 2 — Product Owner**: PO spawned via Task tool (`plan-creation-po`, Opus) and output read
70
+ - [ ] **Stage 3A or 3B**: Correct mode executed based on user's Stage 1 choice
71
+ - [ ] **Stage 3A (Task tool)**: Architect + Eng Lead spawned in parallel, then QA/Critic spawned with all 3 prior outputs
72
+ - [ ] **Stage 3B (Agent Teams)**: Agent files read, delegate mode entered, 3 teammates spawned with correct model class
73
+ - [ ] **Stage 5 — Synthesis**: ALL role outputs read, synthesis written, plan drafted using template
74
+ - [ ] **Stage 5 — Approval**: Plan presented to user via AskUserQuestion — you MUST NOT write the final plan without user approval
75
+ - [ ] **Stage 5 — Plan Written**: Final plan written to `plans/{slug}/plan_v{N}.md`
76
+ - [ ] **Stage 6 — Diagnostics**: Diagnostic YAML written to `$PROJECT_DIR/logs/diagnostics/`
77
+
78
+ ---
79
+
80
+ ## Usage
81
+
82
+ ```
83
+ /plan-creation <topic-or-prompt> [--research <synthesis-file>]
84
+ /plan-creation --doc <path-to-document> [--research <synthesis-file>]
85
+ ```
86
+
87
+ **Arguments:**
88
+ - `<topic-or-prompt>` - Free-text topic description or problem statement
89
+ - `--doc <path>` - Use a document as the topic source
90
+ - `--research <synthesis-file>` - Path to research synthesis (from bulwark-research or bulwark-brainstorm). Strongly recommended.
91
+
92
+ **Examples:**
93
+ - `/plan-creation "add user authentication" --research logs/research/auth/synthesis.md`
94
+ - `/plan-creation --doc plans/proposal.md`
95
+ - `/plan-creation "migrate database to PostgreSQL"`
96
+
97
+ **Plan Versioning:**
98
+
99
+ Plans are written to `plans/{slug}/plan_v{N}.md` with automatic version detection:
100
+
101
+ | Scenario | Version | Example |
102
+ |----------|---------|---------|
103
+ | First plan for a topic | `v1` | `plans/add-auth/plan_v1.md` |
104
+ | Minor revision (user iterates on current plan) | `v1.1`, `v1.2` | Approval gate feedback → revision |
105
+ | Major version (full re-run or pivot) | `v2`, `v3` | New invocation for same slug |
106
+
107
+ The skill checks for existing `plans/{slug}/plan_v*.md` files before writing. When ambiguous (re-run vs revision), it asks the user.
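Reviewer sketch of the version rules in the table above. The filename pattern (`plan_v1.md`, `plan_v1.1.md`) is inferred from the `plan_v{N}.md` convention and is an assumption, as are the function names. The caller is expected to pass basenames from `plans/{slug}/` and to have already decided whether the run is a revision.

```python
from __future__ import annotations

import re
from typing import Iterable, Optional

# Assumed naming: plan_v<major>.md or plan_v<major>.<minor>.md
_PLAN_RE = re.compile(r"^plan_v(\d+)(?:\.(\d+))?\.md$")


def latest_version(filenames: Iterable[str]) -> Optional[tuple[int, int]]:
    """Return the highest (major, minor) among existing plan files, or None."""
    versions = []
    for name in filenames:
        match = _PLAN_RE.match(name)
        if match:
            versions.append((int(match.group(1)), int(match.group(2) or 0)))
    return max(versions) if versions else None


def next_version(filenames: Iterable[str], revision: bool) -> str:
    """Minor bump for a revision of the current plan, major bump otherwise."""
    latest = latest_version(filenames)
    if latest is None:
        return "v1"  # first plan for this slug
    major, minor = latest
    if revision:
        return f"v{major}.{minor + 1}"
    return f"v{major + 1}"
```

For example, with `plan_v1.md` and `plan_v1.1.md` on disk, a revision yields `v1.2` and a fresh run yields `v2`.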
108
+
109
+ ---
110
+
111
+ ## Stages
112
+
113
+ ### Stage 1: Pre-Flight
114
+
115
+ ```
116
+ Stage 1: Pre-Flight
117
+ ├── Read problem statement / document
118
+ ├── Load research synthesis if --research provided
119
+ ├── AskUserQuestion if ambiguous (iterative, 2-3 questions per round)
120
+ ├── Slugify topic for output directory
121
+ ├── Create output directory: $PROJECT_DIR/logs/plan-creation/{slug}/
122
+ ├── Load subagent-prompting skill
123
+ ├── Detect mode: Task tool (default) or Agent Teams (opt-in)
124
+ └── Token budget check (warn if >30% consumed)
125
+ ```
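Reviewer sketch of the "Slugify topic" and "Create output directory" steps in the tree above. The skill does not define its slug rule, so the one used here (lowercase, alphanumeric runs joined by hyphens) is an assumption, and both function names are hypothetical.

```python
import os
import re


def slugify(topic: str) -> str:
    """Lowercase the topic and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return "-".join(words)


def output_dir(project_dir: str, topic: str) -> str:
    """Build <project_dir>/logs/plan-creation/<slug> without creating it."""
    return os.path.join(project_dir, "logs", "plan-creation", slugify(topic))
```

A caller would typically follow this with `os.makedirs(path, exist_ok=True)` so that re-runs for the same slug do not fail.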
126
+
127
+ **AskUserQuestion Protocol (Pre-Spawn):**
128
+
129
+ If the problem statement is ambiguous, under-specified, or could benefit from scope boundaries:
130
+
131
+ 1. Ask 2-3 clarifying questions using AskUserQuestion
132
+ 2. Assess whether the answers provide sufficient clarity to construct high-quality prompts
133
+ 3. If not, ask up to 3 more questions in a follow-up round
134
+ 4. Repeat until clarity is achieved (no hard cap on rounds, but each round is 2-3 questions max)
135
+ 5. If the problem statement is clear and well-scoped from the start, skip this step and note in diagnostics: `pre_flight_interview: skipped (problem statement sufficient)`
136
+
137
+ If `--research` was not provided, warn the user: "No research synthesis provided. Plan quality is significantly higher when preceded by `/bulwark-research` or `/bulwark-brainstorm`. Proceed without research?"
138
+
139
+ **Mode Detection:**
140
+
141
+ 1. Check `$CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` env var
142
+ 2. If env var is SET: offer user choice via AskUserQuestion — "Agent Teams enhanced mode is available. Use Agent Teams or Task tool?" Default to Task tool if user doesn't specify.
143
+ 3. If env var is NOT SET: use Task tool mode. If user explicitly requested Agent Teams, notify: "Agent Teams requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1. Using Task tool mode."
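Reviewer sketch of the three mode-detection steps above. It assumes that any non-empty value counts as SET; the skill only says the variable must be set and mentions `=1` in the notice text. `detect_mode` and its return labels are hypothetical.

```python
from typing import Mapping, Optional, Tuple

ENV_VAR = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"
NOTICE = (
    "Agent Teams requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1. "
    "Using Task tool mode."
)


def detect_mode(
    env: Mapping[str, str], requested_agent_teams: bool = False
) -> Tuple[str, Optional[str]]:
    """Return (next_step, notice) for Stage 1 mode detection."""
    if env.get(ENV_VAR):  # assumption: any non-empty value means SET
        # The choice belongs to the user; never pick Agent Teams silently.
        return "ask_user", None
    notice = NOTICE if requested_agent_teams else None
    return "task_tool", notice
```

In real use the first argument would be `os.environ`; taking a mapping keeps the function easy to exercise without touching the process environment.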
144
+
145
+ **AT Confirmation Flow (if user selects Agent Teams):**
146
+
147
+ Execute this confirmation flow BEFORE spawning any teammates:
148
+
149
+ **Step 1 — Display RED warning banner** using ANSI color `\033[38;2;255;154;150m` (#FF9A96):
150
+
151
+ ```
152
+ ⚠️ NOTICE: Claude Code's Agent Teams is an experimental feature. Unexpected
153
+ issues like teammates being stuck or unresponsive may occur. Agent Teams mode
154
+ is also significantly more token-expensive than Task tool mode (4 concurrent
155
+ agents vs sequential sub-agents).
156
+
157
+ If you run into issues, start a new session and re-run /plan-creation.
158
+ If the final output does not match expectations, re-run with Task tool mode.
159
+ ```
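Reviewer sketch showing how the hex colors named in this skill map to 24-bit ANSI foreground sequences. The helper names are hypothetical; the escape format `ESC[38;2;R;G;Bm` and the reset `ESC[0m` are standard SGR sequences, though rendering depends on the terminal supporting truecolor.

```python
RESET = "\033[0m"


def truecolor(hex_color: str) -> str:
    """Convert '#RRGGBB' into a 24-bit ANSI foreground escape sequence."""
    value = hex_color.lstrip("#")
    red, green, blue = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return f"\033[38;2;{red};{green};{blue}m"


def banner(text: str, hex_color: str) -> str:
    """Wrap text in the color sequence and reset attributes afterwards."""
    return f"{truecolor(hex_color)}{text}{RESET}"
```

`truecolor("#FF9A96")` produces the `\033[38;2;255;154;150m` sequence quoted in Step 1.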
160
+
161
+ **Step 2 — AskUserQuestion: model class + mode confirmation**
162
+
163
+ Present a single question with 3 options:
164
+
165
+ | Option | Label | Description |
166
+ |--------|-------|-------------|
167
+ | 1 | Opus agents (Recommended) | Higher quality analysis, highest token cost. Opus-class agents for all roles. |
168
+ | 2 | Sonnet agents | Good quality, lower token cost. Sonnet-class agents for all roles. |
169
+ | 3 | Switch to Task tool mode | Cancel Agent Teams and use sequential sub-agents instead. |
170
+
171
+ - Option 1: proceed with Opus agents in AT mode (Stage 3B)
172
+ - Option 2: proceed with Sonnet agents in AT mode (Stage 3B)
173
+ - Option 3: fall back to Task tool mode (Stage 3A), skip AT entirely
174
+
175
+ ### Stage 2: Product Owner (Opus, Sequential — First)
176
+
177
+ ```
178
+ Stage 2: Product Owner
179
+ ├── Construct prompt using 4-part template
180
+ │ ├── GOAL: Explore codebase and produce requirements analysis for {topic}
181
+ │ ├── CONSTRAINTS: Do not make architectural decisions or estimate effort
182
+ │ ├── CONTEXT: Problem statement + research synthesis (if available)
183
+ │ └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/01-product-owner.md
184
+ ├── Spawn plan-creation-po agent via Task tool
185
+ │ ├── subagent_type: plan-creation-po
186
+ │ ├── model: opus (specified in agent frontmatter)
187
+ │ ├── Agent autonomously explores codebase (Glob, Grep, Read)
188
+ │ └── NO hardcoded document paths — agent discovers what's relevant
189
+ ├── Read PO output from logs/plan-creation/{slug}/01-product-owner.md
190
+ └── Token budget check
191
+ ```
192
+
193
+ **CRITICAL — PO Autonomy**: The PO agent MUST NOT receive hardcoded project document paths. Instead:
194
+
195
+ - PO receives the problem statement and (optionally) research synthesis
196
+ - PO is spawned as `plan-creation-po` subagent type
197
+ - PO autonomously explores the codebase using Glob, Grep, Read
198
+ - PO output documents which files it read and why
199
+
200
+ This makes the skill portable across any project.
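Reviewer sketch of assembling the Stage 2 prompt from its four parts. The real 4-part template lives in the `subagent-prompting` skill, which is not shown in this file, so the labelled layout below is an assumption and both function names are hypothetical. Note that the context carries only the problem statement and optional research text, with no project document paths, in line with the PO autonomy rule.

```python
import os
from typing import Optional


def build_prompt(goal: str, constraints: str, context: str, output: str) -> str:
    """Join the four labelled sections in a fixed order (assumed layout)."""
    parts = (
        ("GOAL", goal),
        ("CONSTRAINTS", constraints),
        ("CONTEXT", context),
        ("OUTPUT", output),
    )
    return "\n\n".join(f"{label}:\n{body}" for label, body in parts)


def product_owner_prompt(
    topic: str,
    problem_statement: str,
    output_dir: str,
    research: Optional[str] = None,
) -> str:
    """Build the PO prompt; research synthesis is appended only when provided."""
    context = problem_statement
    if research:
        context += "\n\nResearch synthesis:\n" + research
    return build_prompt(
        goal=f"Explore codebase and produce requirements analysis for {topic}",
        constraints="Do not make architectural decisions or estimate effort",
        context=context,
        output=os.path.join(output_dir, "01-product-owner.md"),
    )
```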
201
+
202
+ ### Stage 3A: Scrum Team — Task Tool Mode
203
+
204
+ ```
205
+ Stage 3A: Scrum Team (Task Tool Mode)
206
+ ├── Read PO output in full
207
+ ├── Construct 2 prompts using 4-part template:
208
+ │ ├── Technical Architect:
209
+ │ │ ├── GOAL: Analyze system design, components, integration, trade-offs for {topic}
210
+ │ │ ├── CONSTRAINTS: Do not estimate effort or sequence work
211
+ │ │ ├── CONTEXT: Problem statement + research synthesis + PO output (full text)
212
+ │ │ └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/02-technical-architect.md
213
+ │ └── Engineering & Delivery Lead:
214
+ │ ├── GOAL: Produce WBS, estimates, dependencies, milestones, risk register for {topic}
215
+ │ ├── CONSTRAINTS: Do not redesign architecture — work with Architect's design
216
+ │ ├── CONTEXT: Problem statement + research synthesis + PO output (full text)
217
+ │ └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/03-eng-delivery-lead.md
218
+ ├── Spawn BOTH agents in parallel via Task tool (single message, 2 Task tool calls)
219
+ │ ├── subagent_type: plan-creation-architect (opus)
220
+ │ └── subagent_type: plan-creation-eng-lead (sonnet)
221
+ ├── Read both outputs
222
+ └── Token budget check (checkpoint if >55%)
223
+ ```
224
+
225
+ **CRITICAL**: Spawn both agents in a single message with 2 Task tool calls. Do NOT spawn sequentially.
226
+
227
+ **Note**: Both Architect and Eng Lead receive the PO output directly in their prompt CONTEXT. They do NOT read each other's output — they work independently in parallel. The QA/Critic cross-references their outputs in Stage 4.
228
+
229
+ ### Stage 3B: Scrum Team — Agent Teams Mode (Enhanced, Opt-In)
230
+
231
+ **Pre-condition**: User selected Agent Teams in Pre-Flight AND confirmed model class in AT Confirmation Flow.
232
+
233
+ ```
234
+ Stage 3B: Scrum Team (Agent Teams Mode)
235
+ ├── Read PO output in full (from Stage 2)
236
+ ├── Read agent definition files from .claude/agents/:
237
+ │ ├── .claude/agents/plan-creation-architect.md
238
+ │ ├── .claude/agents/plan-creation-eng-lead.md
239
+ │ └── .claude/agents/plan-creation-qa-critic.md
240
+ ├── YOU (the orchestrator) become the delegate-mode Scrum Lead
241
+ │ └── Your role: coordination ONLY — do not perform analysis yourself
242
+ ├── Enter delegate mode with 3 teammates
243
+ ├── Create shared task list with initial tasks:
244
+ │ ├── "[Architect] Analyze system design for {topic}"
245
+ │ ├── "[Eng Lead] Produce WBS and delivery plan for {topic}"
246
+ │ └── "[QA/Critic] Adversarially review all analyses for {topic}"
247
+ ├── Spawn 3 teammates using agent file content as system prompts:
248
+ │ ├── Technical Architect (model: user's choice from AT Confirmation)
249
+ │ ├── Engineering & Delivery Lead (model: user's choice from AT Confirmation)
250
+ │ └── QA / Critic (model: user's choice from AT Confirmation)
251
+ ├── Each teammate prompt MUST include (in addition to agent file content):
252
+ │ ├── Problem statement + research synthesis (if available)
253
+ │ ├── PO output (full text)
254
+ │ ├── Dual-output contract (see below)
255
+ │ ├── CC-to-lead instruction (see below)
256
+ │ ├── Task list coordination instruction (see below)
257
+ │ └── Rendezvous instruction (see below)
258
+ ├── Use in-process display mode (WSL2 safe — no tmux)
259
+ ├── Shutdown gate: see below
260
+ └── Token budget check
261
+ ```
262
+
263
+ **Dual-Output Contract** (include in EVERY teammate prompt):
264
+
265
+ > You MUST produce two outputs:
266
+ > 1. **Full analysis** → Write to `$PROJECT_DIR/logs/plan-creation/{slug}/{NN}-{role-name}.md` using the Write tool. This is your SA2-compliant artifact. Include all analysis, tables, and findings.
267
+ > 2. **Coordination summary** → Send a 3-5 sentence summary of your key findings and conclusions to the Scrum Lead via mailbox. This is for coordination only — the full analysis is in the log file.
268
+
269
+ **CC-to-Lead Instruction** (include in EVERY teammate prompt):
270
+
271
+ > When sending peer DMs to other teammates with work instructions, challenges, or significant findings, also send a 1-line summary to the Scrum Lead. Example: "Sent Architect a challenge on component coupling — see my full analysis in logs."
272
+
273
+ **Task List Coordination Instruction** (include in EVERY teammate prompt):
274
+
275
+ > When you receive work from a peer via DM (e.g., "review this section", "stress-test this estimate"), create a new task in the shared task list describing the peer-dispatched work before starting it. Mark it in_progress immediately. This gives the Scrum Lead visibility into peer-coordinated work.
276
+
277
+ **Rendezvous Instruction** (include in EVERY teammate prompt):
278
+
279
+ > Your FINAL action before going idle is to send the Scrum Lead: "WORK COMPLETE — all tasks done, log written to {path}". Do NOT go idle without sending this message.
280
+
281
+ **Shutdown Gate** (Scrum Lead logic — YOU enforce this):
282
+
283
+ The Scrum Lead (you, the orchestrator) MUST NOT call `requestShutdown` for ANY teammate until ALL of the following are true:
284
+
285
+ 1. All shared task list tasks are in terminal state (completed or blocked)
286
+ 2. WORK COMPLETE message received from ALL 3 teammates
287
+ 3. All 3 log files exist and are non-empty:
288
+ - `logs/plan-creation/{slug}/02-technical-architect.md`
289
+ - `logs/plan-creation/{slug}/03-eng-delivery-lead.md`
290
+ - `logs/plan-creation/{slug}/04-qa-critic.md`
291
+
292
+ If a teammate appears idle but has NOT sent WORK COMPLETE:
293
+ - Check the shared task list for in-progress tasks assigned to that teammate
294
+ - Send a status check message: "Status update? Are you still working on [task]?"
295
+ - Do NOT send `requestShutdown` — they may be executing a long tool call
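Reviewer sketch of the three shutdown-gate conditions as a single predicate. Everything here is illustrative: `may_request_shutdown`, `size_on_disk`, and the shape of the inputs are assumptions, since the skill does not say how the orchestrator tracks task states or received messages. The state names `completed`, `blocked`, and `in_progress` are taken from the text above.

```python
import os
from typing import Callable, Iterable, Mapping

TERMINAL_STATES = frozenset({"completed", "blocked"})


def size_on_disk(path: str) -> int:
    """Size in bytes, or 0 when the file does not exist."""
    return os.path.getsize(path) if os.path.isfile(path) else 0


def may_request_shutdown(
    task_states: Mapping[str, str],
    teammates: Iterable[str],
    work_complete_from: Iterable[str],
    log_paths: Iterable[str],
    size_of: Callable[[str], int] = size_on_disk,
) -> bool:
    """True only when all three gate conditions hold at once."""
    # 1. Every shared task is completed or blocked.
    tasks_done = all(state in TERMINAL_STATES for state in task_states.values())
    # 2. Every teammate has sent its WORK COMPLETE message.
    all_reported = set(teammates) <= set(work_complete_from)
    # 3. Every expected log file exists and is non-empty.
    logs_written = all(size_of(path) > 0 for path in log_paths)
    return tasks_done and all_reported and logs_written
```

The `size_of` parameter exists only so the check can be exercised without real files; the default reads the filesystem.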
296
+
297
+ **AT Completion Banner** (display after ALL teammates shut down):
298
+
299
+ After Agent Teams execution completes successfully, display an AMBER banner using ANSI color `\033[38;2;255;244;176m` (#FFF4B0):
300
+
301
+ ```
302
+ ℹ️ Agent Teams is an experimental feature. If the final plan does not match
303
+ your expectations, try re-running /plan-creation with Task tool mode
304
+ (sequential sub-agents). Individual role outputs are preserved in
305
+ logs/plan-creation/{slug}/ for inspection.
306
+ ```
307
+
308
+ Then proceed to Stage 5 (Synthesis) — Stage 4 is skipped in AT mode.
309
+
310
+ **CRITICAL — Agent File Reuse**: The 3 agent files in `.claude/agents/` contain the role expertise (system prompts, output formats, tool constraints). For AT mode, read each file's content and use it as the teammate's system prompt. This keeps a single source of truth per role — the agent files serve both Task tool mode (via `subagent_type`) and AT mode (content embedded in teammate prompts).
311
+
312
+ **Note**: In AT mode, the QA/Critic participates throughout — it can challenge the Architect and Eng Lead via peer DMs during their analysis, not just after. This is the primary quality advantage over Task tool mode.
313
+
314
+ ---
315
+
316
+ ### Stage 4: QA / Critic (Sonnet, Sequential — Last, Task Tool Mode ONLY)
317
+
318
+ ```
319
+ Stage 4: QA / Critic
320
+ ├── Load templates/critic-output.md
321
+ ├── Read ALL 3 prior output files:
322
+ │ ├── 01-product-owner.md
323
+ │ ├── 02-technical-architect.md
324
+ │ └── 03-eng-delivery-lead.md
325
+ ├── Construct prompt using 4-part template
326
+ │ ├── GOAL: Adversarially review all prior analyses — challenge assumptions, identify gaps, stress-test estimates, produce APPROVE/MODIFY/REJECT verdict
327
+ │ ├── CONSTRAINTS: Do not redesign or re-plan — only challenge and validate
328
+ │ ├── CONTEXT: Problem statement + research synthesis + ALL 3 prior outputs (full text) + critic-output.md template
329
+ │ └── OUTPUT: $PROJECT_DIR/logs/plan-creation/{slug}/04-qa-critic.md
330
+ ├── Spawn plan-creation-qa-critic agent via Task tool
331
+ │ ├── subagent_type: plan-creation-qa-critic
332
+ │ └── model: sonnet (specified in agent frontmatter)
333
+ ├── Read Critic output
334
+ └── Token budget check
335
+ ```
336
+
337
+ **CRITICAL**: The QA/Critic MUST receive ALL 3 prior outputs in full. This is the entire point — the Critic cross-references PO requirements against Architect components against Eng Lead workpackages to find gaps.
338
+
339
+ **Skip condition**: If Agent Teams mode was used (Stage 3B), skip Stage 4 entirely — the QA/Critic already participated throughout Stage 3B via peer debate. Proceed directly to Stage 5.
340
+
341
+ ### Stage 5: Synthesis & Plan Output (SHARED — Mode-Aware)
342
+
343
+ ```
344
+ Stage 5: Synthesis
345
+ ├── Read ALL 4 agent output files (MANDATORY — do not skip any):
346
+ │ ├── logs/plan-creation/{slug}/01-product-owner.md
347
+ │ ├── logs/plan-creation/{slug}/02-technical-architect.md
348
+ │ ├── logs/plan-creation/{slug}/03-eng-delivery-lead.md
349
+ │ └── logs/plan-creation/{slug}/04-qa-critic.md
350
+ ├── If Agent Teams mode: also review lead coordination notes from the AT session
351
+ ├── If any output is missing or empty → re-spawn that agent once (max 1 retry)
352
+ │ └── In AT mode: re-spawning is NOT possible — document gap in "Incomplete Coverage"
353
+ ├── If retry fails → document gap in synthesis under "Incomplete Coverage"
354
+ ├── Load templates/synthesis-output.md
355
+ ├── Load templates/plan-output.md
356
+ ├── Write synthesis to $PROJECT_DIR/logs/plan-creation/{slug}/synthesis.md
357
+ ├── Compose plan draft:
358
+ │ ├── Executive Summary from synthesis consensus + PO problem statement
359
+ │ ├── YAML body from:
360
+ │ │ ├── Phases and workpackages: Eng Lead's WBS + Architect's component structure
361
+ │ │ ├── Milestones: Eng Lead's milestones
362
+ │ │ ├── Dependency graph: Eng Lead's dependency analysis + Architect's integration order
363
+ │ │ ├── Risks: Consolidated from all roles, prioritized by Critic
364
+ │ │ └── Kill criteria: From Critic's verdict
365
+ │ └── Apply Critic's MODIFY requirements (if verdict was MODIFY)
366
+ ├── Present draft plan to user via AskUserQuestion approval gate
367
+ ├── Critical Evaluation Gate (see below)
368
+ ├── Determine plan version:
369
+ │ ├── Glob for existing plans/{slug}/plan_v*.md
370
+ │ ├── If none found: version = v1 (first plan)
371
+ │ ├── If user is iterating on the current plan (minor revision): bump minor (v1 → v1.1, v1.1 → v1.2)
372
+ │ ├── If user is starting fresh or pivoting: bump major (v1 → v2, v2 → v3)
373
+ │ └── When ambiguous, ask user: "Is this a revision of the existing plan (v1.1) or a new plan (v2)?"
374
+ ├── On approval: write final plan to plans/{slug}/plan_v{N}.md
375
+ └── Token budget check (must be <65% after synthesis)
376
+ ```
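The version-bump rules in the tree above can be sketched as a small pure function. This is an illustration under assumptions: the helper names are hypothetical, minor revisions are assumed to be stored as `plan_v1.1.md`, and the caller is assumed to have already globbed `plans/{slug}/plan_v*.md` and decided whether the user is iterating or starting fresh.

```python
import re

# Matches plan_v1.md and plan_v1.2.md (the minor-version filename shape is an assumption).
VERSION_RE = re.compile(r"plan_v(\d+)(?:\.(\d+))?\.md$")


def parse_version(filename):
    """Return (major, minor) for a plan filename, or None if it does not match."""
    match = VERSION_RE.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def next_version(existing_filenames, fresh_start):
    """Pick the next plan version label from the existing plan files."""
    versions = [v for v in map(parse_version, existing_filenames) if v is not None]
    if not versions:
        return "v1"  # first plan
    major, minor = max(versions)  # tuple comparison orders v1.10 after v1.9
    if fresh_start:
        return f"v{major + 1}"  # pivot or new plan: bump major
    return f"v{major}.{minor + 1}"  # iteration on current plan: bump minor
```

For example, with `plan_v1.md` and `plan_v1.1.md` on disk, an iteration yields `v1.2` and a pivot yields `v2`.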
377
+
378
+ **Enforcement**: Do NOT begin writing synthesis until ALL available agent outputs have been read. The orchestrator must reference every agent's output at least once in the synthesis.
379
+
380
+ #### Critical Evaluation Gate (Post-User Q&A)
381
+
382
+ After each AskUserQuestion round, do NOT blindly incorporate user responses. Instead:
383
+
384
+ **Step 1 — Classify each user response:**
385
+
386
+ | Classification | Definition | Action |
387
+ |---------------|------------|--------|
388
+ | **Preference** | Scope, priority, or UX choice (e.g., "I'd prefer v1 to focus on X", "Let's defer Y") | Incorporate directly. These are user decisions — no validation needed. |
389
+ | **Technical Claim** | Assertion about a technology, library, or API (e.g., "Library X supports this", "That API has rate limits") | **Do NOT incorporate.** Trigger Step 2. |
390
+ | **Architectural Suggestion** | Proposed structural approach (e.g., "What if we structure it as a plugin?", "We could use event sourcing") | **Do NOT incorporate.** Trigger Step 2. |
391
+
392
+ **Step 2 — For Technical Claims and Architectural Suggestions, present to user:**
393
+
394
+ > "Your suggestion about [X] involves a technical claim / architectural approach that hasn't been validated against the codebase and research. I recommend a targeted follow-up with 2 focused agents (Technical Architect + QA/Critic) to verify feasibility and stress-test the approach.
395
+ >
396
+ > This will spawn 2 agents and consume additional token budget.
397
+ >
398
+ > [Run follow-up validation / Incorporate as-is with LOW confidence caveat]"
399
+
400
+ **Step 3 — If follow-up validation approved:**
401
+
402
+ 1. Spawn 2 agents in parallel (single message, 2 Task tool calls):
403
+ - **Technical Architect** (`plan-creation-architect`) — validates the suggestion against the codebase and research
404
+ - **QA/Critic** (`plan-creation-qa-critic`) — stress-tests the suggestion
405
+ 2. Use the same 4-part prompt template (GOAL/CONSTRAINTS/CONTEXT/OUTPUT)
406
+ 3. Provide both agents with: original research synthesis, PO output, and the specific user suggestion
407
+ 4. Output to: `$PROJECT_DIR/logs/plan-creation/{slug}/followup-{NN}-architect.md` and `followup-{NN}-critic.md`
408
+ 5. Read both outputs, then update plan with validated findings
409
+ 6. Tag follow-up findings in plan with: `[Follow-up: validated]` or `[Follow-up: refuted]` or `[Follow-up: mixed — see details]`
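As a rough sketch of item 4, the two follow-up output paths could be derived as below. The function name is hypothetical, and zero-padding `{NN}` to two digits is an assumption; the source only shows the `{NN}` placeholder.

```python
from pathlib import Path


def followup_paths(project_dir, slug, round_number):
    """Return (architect_path, critic_path) for one follow-up validation round."""
    base = Path(project_dir) / "logs" / "plan-creation" / slug
    nn = f"{round_number:02d}"  # assumption: {NN} is a zero-padded round counter
    return (
        base / f"followup-{nn}-architect.md",
        base / f"followup-{nn}-critic.md",
    )
```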
410
+
411
+ **Step 4 — If user declines follow-up:**
412
+
413
+ Incorporate the user's suggestion into the plan with an explicit caveat:
414
+ > **[Unvalidated — user suggestion, not verified against codebase or research]**: {suggestion}
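Steps 1 through 4 amount to a small decision function. The sketch below is illustrative only: the enum, the function name, and the returned action labels are hypothetical, and classifying a free-text user response is assumed to happen upstream.

```python
from enum import Enum


class Classification(Enum):
    PREFERENCE = "preference"
    TECHNICAL_CLAIM = "technical_claim"
    ARCHITECTURAL_SUGGESTION = "architectural_suggestion"


def gate_decision(classification, followup_approved=None):
    """Map a classified user response to the next action.

    followup_approved is None until the user has answered the Step 2 prompt.
    """
    if classification is Classification.PREFERENCE:
        return "incorporate_directly"  # Step 1: user decision, no validation
    if followup_approved is None:
        return "ask_user_about_followup"  # Step 2: present the choice
    if followup_approved:
        return "run_followup_validation"  # Step 3: spawn Architect + QA/Critic
    return "incorporate_with_unvalidated_caveat"  # Step 4: user declined
```

Keeping the Preference branch first makes it explicit that scope and priority choices never trigger agent spawns.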
415
+
416
+ **Repeat**: After updating the plan, ask whether the user has additional input, and apply the same classification gate to each round. Each round that triggers follow-up validation spawns 2 agents and consumes roughly 10-15% of the token budget; warn the user if total consumption is approaching 55%.
417
+
418
+ ### Stage 6: Diagnostics (REQUIRED)
419
+
420
+ ```
421
+ Stage 6: Diagnostics
422
+ ├── Write diagnostic YAML to $PROJECT_DIR/logs/diagnostics/plan-creation-{YYYYMMDD-HHMMSS}.yaml
423
+ └── Verify Mandatory Execution Checklist (top of skill)
424
+ ```
425
+
426
+ ---
427
+
428
+ ## Execution Flow (F# Pipeline)
429
+
430
+ ```fsharp
431
+ // Task tool mode (default)
432
+ ProductOwner(topic, research?) // Stage 2: Opus, solo
433
+ |> [Architect, EngDeliveryLead](po_output) // Stage 3A: parallel Task tool
434
+ |> QACritic(all_prior_outputs) // Stage 4: sequential Task tool
435
+ |> Synthesis |> ApprovalGate |> PlanOutput(plan_v{N}) // Stage 5: versioned output
436
+
437
+ // Agent Teams mode (enhanced, opt-in)
438
+ ProductOwner(topic, research?) // Stage 2: Opus, Task tool (solo)
439
+ |> AgentTeam[Architect, EngDeliveryLead, QACritic](po_output) // Stage 3B: peer debate
440
+ |> Synthesis |> ApprovalGate |> PlanOutput(plan_v{N}) // Stage 5: versioned output
441
+ // Note: Stage 4 skipped in AT mode — QA/Critic participates throughout Stage 3B
442
+ ```
443
+
444
+ ---
445
+
446
+ ## Token Budget Management
447
+
448
+ | Checkpoint | Threshold | Action |
449
+ |------------|-----------|--------|
450
+ | After constructing PO prompt | >30% consumed | Warn user: "4 agents will consume significant context" |
451
+ | After reading Stage 3A outputs | Running tally | If approaching 55%, checkpoint with user |
452
+ | After synthesis | Must be <65% | Leave room for plan approval + session closing |
453
+ | Synthesis complete at >65% | Immediate | Write plan as-is, create handoff, do not start additional work |
454
+
455
+ If the remaining token budget is insufficient to run all 4 agents plus synthesis, inform the user and suggest splitting the work across sessions (e.g., "PO + Architect/Eng Lead this session, Critic + synthesis next session").
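The checkpoint table can be read as a lookup from (checkpoint, percent consumed) to an action. The sketch below assumes hypothetical checkpoint and action labels. It also assumes "approaching 55%" means within 5 percentage points, and treats exactly 65% as over budget because the table requires usage to be below 65% after synthesis.

```python
def budget_action(checkpoint, consumed_pct, near_margin=5.0):
    """Return the action for a token-budget checkpoint (labels are hypothetical)."""
    if checkpoint == "po_prompt":
        return "warn_user" if consumed_pct > 30 else "continue"
    if checkpoint == "stage_3a_outputs":
        # Assumption: "approaching 55%" means within near_margin points of it.
        if consumed_pct >= 55 - near_margin:
            return "checkpoint_with_user"
        return "continue"
    if checkpoint == "synthesis":
        # Usage must be below 65% after synthesis, so 65 itself counts as over.
        if consumed_pct >= 65:
            return "write_plan_and_handoff"
        return "continue"
    raise ValueError(f"unknown checkpoint: {checkpoint}")
```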
456
+
457
+ ---
458
+
459
+ ## Error Handling
460
+
461
+ **Both modes:**
462
+
463
+ | Scenario | Action |
464
+ |----------|--------|
465
+ | Agent returns empty output | Re-spawn once. If still empty, document gap in synthesis. |
466
+ | Agent returns truncated output | Accept as-is, note in diagnostics. |
467
+ | Agent fails to spawn | Re-spawn once. If it still fails, skip the role and document the gap. |
468
+ | PO fails | STOP — subsequent agents depend on PO. Inform user. |
469
+ | Token budget exceeded mid-session | Stop spawning, synthesize from available outputs, note incomplete. |
470
+ | Research synthesis not provided | Warn user, proceed with lower quality. |
471
+ | User rejects plan draft | Ask what needs to change, re-enter Critical Evaluation Gate. |
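The re-spawn rows above share one pattern: try once, retry once, then either stop (PO) or record a gap. A minimal sketch follows, assuming a hypothetical `spawn` callable that returns the agent's output text and may raise on spawn failure. The role label `"product-owner"` is also an assumption.

```python
def spawn_with_single_retry(spawn, role):
    """Return (output, gap_note). Exactly one retry is attempted.

    `spawn` is a hypothetical callable taking a role name and returning text.
    """
    for _attempt in range(2):  # initial attempt plus one retry
        try:
            output = spawn(role)
        except Exception:  # treat a failed spawn like an empty output
            output = None
        if output and output.strip():
            return output, None
    if role == "product-owner":
        # Later agents depend on PO output, so there is nothing to synthesize from.
        raise RuntimeError("PO failed; stop and inform the user")
    return None, f"Incomplete Coverage: no output from {role} after 1 retry"
```

Truncated output is deliberately not retried here, which matches the table's "accept as-is" row.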
472
+
473
+ **Agent Teams mode only:**
474
+
475
+ | Scenario | Action |
476
+ |----------|--------|
477
+ | Teammate appears stuck (no WORK COMPLETE, no task updates) | Send status check via mailbox. Wait for response before any shutdown attempt. |
478
+ | Teammate never sends WORK COMPLETE | After status check + 1 follow-up, check if log file was written. If log exists and is non-empty, treat as implicit completion. Document in diagnostics. |
479
+ | Peer DM traffic invisible to lead | Expected — this is an AT architectural constraint. Rely on CC-to-lead summaries and task list state. |
480
+ | One teammate fails, others succeed | Document gap. Do NOT shut down working teammates — let them complete. Synthesize from available outputs. |
481
+ | AT env var absent but user requested AT | Notify user, fall back to Task tool mode (Stage 3A). |
482
+ | User selects "Switch to Task tool" in AT Confirmation | Execute Stage 3A instead. No AT infrastructure spawned. |
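The "never sends WORK COMPLETE" row can be sketched as a check on the teammate's log file. The function and action names are hypothetical, and falling back to documenting a gap when the log is missing is an assumption drawn from the "one teammate fails" row rather than something the table states for this case.

```python
from pathlib import Path


def resolve_silent_teammate(log_file, checks_sent):
    """Decide what to do about a teammate that has not sent WORK COMPLETE.

    checks_sent counts the status check plus any follow-up already sent.
    """
    if checks_sent < 2:
        return "send_status_check"  # status check first, then 1 follow-up
    log_file = Path(log_file)
    if log_file.is_file() and log_file.stat().st_size > 0:
        return "implicit_completion"  # non-empty log: treat as done, note in diagnostics
    return "document_gap"  # assumption: no usable log means a documented gap
```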
483
+
484
+ ---
485
+
486
+ ## Diagnostic Output (REQUIRED)
487
+
488
+ **MANDATORY**: You MUST write diagnostic output after every invocation. This is Stage 6 and cannot be skipped.
489
+
490
+ Write to: `$PROJECT_DIR/logs/diagnostics/plan-creation-{YYYYMMDD-HHMMSS}.yaml`
491
+
492
+ **Template**: Use `templates/diagnostic-output.yaml` for the schema. Fill in actual values from the session.
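One way to build the timestamped diagnostics path is shown below. This is a sketch: the function name is hypothetical, and using local time rather than UTC is an assumption, since the skill only specifies the `{YYYYMMDD-HHMMSS}` pattern.

```python
from datetime import datetime
from pathlib import Path


def diagnostic_path(project_dir, now=None):
    """Return the diagnostics YAML path for this invocation."""
    now = now or datetime.now()  # assumption: local time
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return Path(project_dir) / "logs" / "diagnostics" / f"plan-creation-{stamp}.yaml"
```

Passing `now` explicitly keeps the function deterministic when it is exercised in tests.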
493
+
494
+ ---
495
+
496
+
497
+ **Do NOT return to user until all applicable checkboxes can be marked complete.**