@wazir-dev/cli 1.1.0 → 1.3.0

Files changed (138)
  1. package/CHANGELOG.md +74 -10
  2. package/README.md +15 -15
  3. package/assets/demo.cast +47 -0
  4. package/assets/demo.gif +0 -0
  5. package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
  6. package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
  7. package/docs/concepts/architecture.md +1 -1
  8. package/docs/concepts/roles-and-workflows.md +2 -0
  9. package/docs/concepts/why-wazir.md +59 -0
  10. package/docs/decisions/2026-03-19-deferred-items.md +564 -0
  11. package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
  12. package/docs/readmes/INDEX.md +21 -5
  13. package/docs/readmes/features/expertise/README.md +2 -2
  14. package/docs/readmes/features/exports/README.md +2 -2
  15. package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
  16. package/docs/readmes/features/schemas/README.md +3 -0
  17. package/docs/readmes/features/skills/README.md +17 -0
  18. package/docs/readmes/features/skills/clarifier.md +5 -0
  19. package/docs/readmes/features/skills/claude-cli.md +5 -0
  20. package/docs/readmes/features/skills/codex-cli.md +5 -0
  21. package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
  22. package/docs/readmes/features/skills/executing-plans.md +5 -0
  23. package/docs/readmes/features/skills/executor.md +5 -0
  24. package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
  25. package/docs/readmes/features/skills/gemini-cli.md +5 -0
  26. package/docs/readmes/features/skills/humanize.md +5 -0
  27. package/docs/readmes/features/skills/init-pipeline.md +5 -0
  28. package/docs/readmes/features/skills/receiving-code-review.md +5 -0
  29. package/docs/readmes/features/skills/requesting-code-review.md +5 -0
  30. package/docs/readmes/features/skills/reviewer.md +5 -0
  31. package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
  32. package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
  33. package/docs/readmes/features/skills/wazir.md +5 -0
  34. package/docs/readmes/features/skills/writing-skills.md +5 -0
  35. package/docs/readmes/features/workflows/prepare-next.md +1 -1
  36. package/docs/reference/configuration-reference.md +47 -6
  37. package/docs/reference/hooks.md +1 -0
  38. package/docs/reference/launch-checklist.md +4 -4
  39. package/docs/reference/review-loop-pattern.md +119 -9
  40. package/docs/reference/roles-reference.md +1 -0
  41. package/docs/reference/skill-tiers.md +147 -0
  42. package/docs/reference/tooling-cli.md +3 -1
  43. package/docs/truth-claims.yaml +12 -0
  44. package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
  45. package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
  46. package/exports/hosts/claude/.claude/commands/verify.md +30 -1
  47. package/exports/hosts/claude/.claude/settings.json +9 -0
  48. package/exports/hosts/claude/CLAUDE.md +1 -1
  49. package/exports/hosts/claude/export.manifest.json +6 -4
  50. package/exports/hosts/claude/host-package.json +3 -1
  51. package/exports/hosts/codex/AGENTS.md +1 -1
  52. package/exports/hosts/codex/export.manifest.json +6 -4
  53. package/exports/hosts/codex/host-package.json +3 -1
  54. package/exports/hosts/cursor/.cursor/hooks.json +4 -0
  55. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
  56. package/exports/hosts/cursor/export.manifest.json +6 -4
  57. package/exports/hosts/cursor/host-package.json +3 -1
  58. package/exports/hosts/gemini/GEMINI.md +1 -1
  59. package/exports/hosts/gemini/export.manifest.json +6 -4
  60. package/exports/hosts/gemini/host-package.json +3 -1
  61. package/hooks/context-mode-router +191 -0
  62. package/hooks/definitions/context_mode_router.yaml +19 -0
  63. package/hooks/hooks.json +31 -6
  64. package/hooks/protected-path-write-guard +8 -0
  65. package/hooks/routing-matrix.json +45 -0
  66. package/hooks/session-start +62 -1
  67. package/llms-full.txt +937 -134
  68. package/package.json +2 -4
  69. package/schemas/hook.schema.json +2 -1
  70. package/schemas/phase-report.schema.json +89 -0
  71. package/schemas/usage.schema.json +25 -1
  72. package/schemas/wazir-manifest.schema.json +19 -0
  73. package/skills/brainstorming/SKILL.md +32 -157
  74. package/skills/clarifier/SKILL.md +289 -111
  75. package/skills/claude-cli/SKILL.md +320 -0
  76. package/skills/codex-cli/SKILL.md +260 -0
  77. package/skills/debugging/SKILL.md +13 -0
  78. package/skills/design/SKILL.md +13 -0
  79. package/skills/dispatching-parallel-agents/SKILL.md +13 -0
  80. package/skills/executing-plans/SKILL.md +13 -0
  81. package/skills/executor/SKILL.md +139 -19
  82. package/skills/finishing-a-development-branch/SKILL.md +13 -0
  83. package/skills/gemini-cli/SKILL.md +260 -0
  84. package/skills/humanize/SKILL.md +13 -0
  85. package/skills/init-pipeline/SKILL.md +72 -164
  86. package/skills/prepare-next/SKILL.md +81 -10
  87. package/skills/receiving-code-review/SKILL.md +13 -0
  88. package/skills/requesting-code-review/SKILL.md +13 -0
  89. package/skills/reviewer/SKILL.md +369 -24
  90. package/skills/run-audit/SKILL.md +13 -0
  91. package/skills/scan-project/SKILL.md +13 -0
  92. package/skills/self-audit/SKILL.md +217 -16
  93. package/skills/skill-research/SKILL.md +188 -0
  94. package/skills/subagent-driven-development/SKILL.md +13 -0
  95. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
  96. package/skills/subagent-driven-development/implementer-prompt.md +8 -0
  97. package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
  98. package/skills/tdd/SKILL.md +13 -0
  99. package/skills/using-git-worktrees/SKILL.md +13 -0
  100. package/skills/using-skills/SKILL.md +13 -0
  101. package/skills/verification/SKILL.md +54 -3
  102. package/skills/wazir/SKILL.md +464 -381
  103. package/skills/writing-plans/SKILL.md +14 -1
  104. package/skills/writing-skills/SKILL.md +13 -0
  105. package/templates/artifacts/implementation-plan.md +3 -0
  106. package/templates/artifacts/tasks-template.md +133 -0
  107. package/templates/examples/phase-report.example.json +48 -0
  108. package/tooling/src/adapters/composition-engine.js +256 -0
  109. package/tooling/src/adapters/model-router.js +84 -0
  110. package/tooling/src/capture/command.js +41 -2
  111. package/tooling/src/capture/run-config.js +3 -1
  112. package/tooling/src/capture/store.js +56 -0
  113. package/tooling/src/capture/usage.js +106 -0
  114. package/tooling/src/capture/user-input.js +66 -0
  115. package/tooling/src/checks/ac-matrix.js +256 -0
  116. package/tooling/src/checks/command-registry.js +12 -0
  117. package/tooling/src/checks/docs-truth.js +1 -1
  118. package/tooling/src/checks/security-sensitivity.js +69 -0
  119. package/tooling/src/checks/skills.js +111 -0
  120. package/tooling/src/cli.js +31 -20
  121. package/tooling/src/commands/stats.js +161 -0
  122. package/tooling/src/commands/validate.js +5 -1
  123. package/tooling/src/export/compiler.js +33 -37
  124. package/tooling/src/gating/agent.js +145 -0
  125. package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
  126. package/tooling/src/hooks/routing-logic.js +69 -0
  127. package/tooling/src/init/auto-detect.js +258 -0
  128. package/tooling/src/init/command.js +38 -170
  129. package/tooling/src/input/scanner.js +46 -0
  130. package/tooling/src/reports/command.js +103 -0
  131. package/tooling/src/reports/phase-report.js +323 -0
  132. package/tooling/src/state/command.js +160 -0
  133. package/tooling/src/state/db.js +287 -0
  134. package/tooling/src/status/command.js +58 -1
  135. package/tooling/src/verify/proof-collector.js +299 -0
  136. package/wazir.manifest.yaml +26 -14
  137. package/workflows/plan-review.md +3 -1
  138. package/workflows/verify.md +30 -1
@@ -5,9 +5,22 @@ description: Run the clarification pipeline — research, clarify scope, brainst
5
5
 
6
6
  # Clarifier
7
7
 
8
- Run Phase 0 (Research) + Phase 1 (Clarify, Brainstorm, Plan) for the current project.
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
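
The routing rule above can be sketched roughly as follows. This is a minimal illustration, not the hook's actual logic: the schema of `hooks/routing-matrix.json` is not shown in this diff, so the prefix list and the `route_command` helper are hypothetical stand-ins.

```python
from typing import Optional, Tuple

# Hypothetical prefixes for illustration only. The real classification is
# defined in hooks/routing-matrix.json, whose schema is not shown here.
LARGE_COMMAND_PREFIXES = ("npm test", "npm run build", "git diff", "npm ls", "eslint")


def route_command(command: str, context_mode_available: bool) -> Tuple[str, Optional[str]]:
    """Return (target, warning) for a shell command.

    Large commands go to context-mode tools when available; everything else,
    including unknown commands, runs in native Bash.
    """
    normalized = " ".join(command.split())
    is_large = any(
        normalized == prefix or normalized.startswith(prefix + " ")
        for prefix in LARGE_COMMAND_PREFIXES
    )
    if not is_large:
        return ("bash", None)
    if context_mode_available:
        return ("context-mode", None)
    # Fallback path: still run the command, but surface a warning.
    return ("bash", "context-mode unavailable; running a large command in native Bash")
```

Treating unknown commands as small is an assumption of this sketch; a stricter router could ask for an explicit classification instead.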
9
13
 
10
- **Pacing rule:** This skill has mandatory user checkpoints between phases. Do NOT skip checkpoints. Do NOT combine phases. Complete each phase fully, present the output, and wait for explicit user approval before advancing.
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
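
One way to read the "maximum 10 direct file reads" rule is as a small budget that only unjustified reads consume. The sketch below assumes that interpretation; the `ReadBudget` class and its method names are hypothetical and are not part of the wazir CLI.

```python
class ReadBudget:
    """Tracks direct file reads that were not justified by an index query."""

    def __init__(self, limit: int = 10):
        self.limit = limit
        self.unjustified_reads = 0
        self.indexed_paths = set()

    def record_index_hit(self, path: str) -> None:
        # Call this for every path returned by an index query.
        self.indexed_paths.add(path)

    def may_read(self, path: str) -> bool:
        # Reads of files surfaced by the index are always allowed.
        if path in self.indexed_paths:
            return True
        # Other reads consume the budget until it is exhausted.
        if self.unjustified_reads < self.limit:
            self.unjustified_reads += 1
            return True
        return False
```

Once the budget is spent, the agent would have to go back to the index before reading further files.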
20
+
21
+ Run the Clarifier phase — everything from reading input to having an approved execution plan.
22
+
23
+ **Pacing rule:** This skill has mandatory user checkpoints between sub-workflows. Do NOT skip checkpoints. Do NOT combine sub-workflows. Complete each fully, present output, and wait for explicit user approval before advancing.
11
24
 
12
25
  Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All reviewer invocations use explicit `--mode`.
13
26
 
@@ -17,8 +30,10 @@ Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All
17
30
 
18
31
  1. Check `.wazir/state/config.json` exists. If not, run `wazir init` first.
19
32
  2. Check `.wazir/input/briefing.md` exists. If not, ask the user what they want to build and save it there.
20
- 3. Read config for `default_depth`, `default_intent`, `team_mode`, and `multi_tool` settings.
21
- 4. Create a run directory if one doesn't exist:
33
+ 3. Scan `input/` (project-level) and `.wazir/input/` (state-level) for additional input files. Present what's found.
34
+ 4. Read config for `default_depth` and `multi_tool` settings.
35
+ 5. **Load accepted learnings:** Glob `memory/learnings/accepted/*.md`. For each accepted learning, read scope tags. Inject learnings whose scope matches the current run's intent/stack into context. Limit: top 10 by confidence, most recent first. This is how prior run insights improve future runs.
36
+ 6. Create a run directory if one doesn't exist:
22
37
  ```bash
23
38
  mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews,clarified}
24
39
  ln -sfn run-YYYYMMDD-HHMMSS .wazir/runs/latest
@@ -26,194 +41,357 @@ Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All
26
41
 
27
42
  ---
28
43
 
29
- ## Phase 0: Research (delegated)
44
+ ## Context-Mode Usage
45
+
46
+ Read `context_mode` from `.wazir/state/config.json`:
47
+
48
+ - **If `context_mode.enabled: true`:** Use `fetch_and_index` for URL fetching, `search` for follow-up queries on indexed content. Use `execute` or `execute_file` for large outputs instead of Bash.
49
+ - **If `context_mode.enabled: false`:** Fall back to `WebFetch` for URLs and `Bash` for commands.
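
As a rough sketch of the branch above, tool selection can be driven by a single flag read from the config. The `select_tools` helper and its return shape are assumptions for illustration; treating a missing `context_mode` key as disabled is also an assumption.

```python
import json


def select_tools(config_text: str) -> dict:
    """Pick tool names based on context_mode.enabled in the config JSON."""
    config = json.loads(config_text)
    # Assumption: a missing context_mode block means the feature is off.
    enabled = bool(config.get("context_mode", {}).get("enabled", False))
    if enabled:
        return {"fetch": "fetch_and_index", "query": "search", "run": "execute"}
    return {"fetch": "WebFetch", "run": "Bash"}
```

In practice `config_text` would be the contents of `.wazir/state/config.json`.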
50
+
51
+ ---
52
+
53
+ ## Sub-Workflow 1: Research (discover workflow)
54
+
55
+ **Before starting this phase, output to the user:**
56
+
57
+ > **Research** — About to scan the codebase and fetch external references to understand the existing architecture, tech stack, and any standards referenced in the briefing.
58
+ >
59
+ > **Why this matters:** Without research, I'd assume the wrong framework version, miss existing patterns in the codebase, and contradict established conventions. Every wrong assumption here cascades into a wrong spec and wrong implementation.
60
+ >
61
+ > **Looking for:** Existing code patterns, dependency versions, external standard definitions, architectural constraints
30
62
 
31
63
  Delegate to the discover workflow (`workflows/discover.md`):
32
64
 
33
- 1. The **researcher role** produces the research artifact
34
- (codebase scan, external sources, source manifest, research brief).
35
- 2. The **reviewer role** runs the research-review loop
36
- using research dimensions with `--mode research-review`
37
- (see `docs/reference/review-loop-pattern.md`).
38
- 3. The researcher resolves findings from each pass.
39
- 4. Loop runs for `pass_counts[depth]` passes.
40
- 5. Research artifact flows back to the clarifier for Checkpoint 0.
65
+ 1. **Keyword extraction:** Read the briefing and extract concepts/terms that are vague, reference external standards, or use unfamiliar terminology.
66
+ - **When to research:** concept references an external standard by name, uses a tool/library not seen in the codebase, or is ambiguous enough that two agents could interpret it differently.
67
+ - **When NOT to research:** concept is fully defined in the input, or it's a well-known programming concept.
68
+ 2. **Fetch sources:** For each concept needing research:
69
+ - Use `fetch_and_index` (if context-mode available) or `WebFetch` to fetch the source.
70
+ - Save fetched content to `.wazir/runs/latest/sources/`.
71
+ - Track each fetch in `sources/manifest.json`.
72
+ 3. **Error handling:** On a 404 or an unreachable source, log the failure and continue. Research is best-effort.
73
+ 4. The **researcher role** produces the research artifact.
74
+ 5. The **reviewer role** runs the research-review loop with `--mode research-review`.
75
+ 6. Loop runs for `pass_counts[depth]` passes.
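
The fetch, save, and track steps above, together with the best-effort error handling, might look like the sketch below. The manifest fields (`url`, `status`, `file`, `error`) and the file naming are hypothetical, since the diff does not show the format of `sources/manifest.json`. The `fetch` callable is injected so the example does not depend on any particular fetch tool.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def fetch_sources(urls: Iterable[str], fetch: Callable[[str], str], sources_dir: Path) -> List[dict]:
    """Fetch each URL, save its content, and record the outcome in a manifest."""
    sources_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    for index, url in enumerate(urls, start=1):
        entry = {"url": url, "status": "ok", "file": None}
        try:
            content = fetch(url)
            filename = f"source-{index:03d}.md"
            (sources_dir / filename).write_text(content, encoding="utf-8")
            entry["file"] = filename
        except Exception as error:
            # Best-effort: record the failure and move on to the next source.
            entry["status"] = "failed"
            entry["error"] = str(error)
        manifest.append(entry)
    manifest_path = sources_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

A failed fetch never aborts the loop, which matches the "log failure, continue" rule.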
41
76
 
42
77
  Save result to `.wazir/runs/latest/clarified/research-brief.md`.
43
78
 
44
- ### Checkpoint 0: Research Review
79
+ **After completing this phase, output to the user:**
45
80
 
46
- Present the research brief to the user:
81
+ > **Research complete.**
82
+ >
83
+ > **Found:** [N] external sources fetched, [N] codebase patterns identified, [N] architectural constraints documented
84
+ >
85
+ > **Without this phase:** Spec would be built on assumptions instead of evidence — wrong framework APIs, missed existing utilities, contradicted naming conventions
86
+ >
87
+ > **Changed because of this work:** [List of key discoveries — e.g., "found existing auth middleware at src/middleware/auth.ts", "project uses Vitest not Jest"]
88
+
89
+ ### Checkpoint: Research Review
47
90
 
48
91
  > **Research complete. Here's what I found:**
49
92
  >
50
- > [Summary of existing codebase state, relevant architecture, external context]
51
- >
52
- > **Does this match your understanding? Anything to add or correct?**
53
- > 1. **Looks good, continue** (Recommended)
54
- > 2. **Missing context** — let me add more information
55
- > 3. **Wrong direction**: let me clarify the intent
93
+ > [Summary of codebase state, relevant architecture, external context]
94
+
95
+ Ask the user via AskUserQuestion:
96
+ - **Question:** "Does the research look complete and accurate?"
97
+ - **Options:**
98
+ 1. "Looks good, continue" *(Recommended)*
99
+ 2. "Missing context — let me add more information"
100
+ 3. "Wrong direction — let me clarify the intent"
56
101
 
57
- **Wait for user response before continuing.**
102
+ Wait for the user's selection before continuing.
58
103
 
59
104
  ---
60
105
 
61
- ## Phase 1A: Clarify (autonomous, then review, then checkpoint)
106
+ ## Sub-Workflow 2: Clarify (clarify workflow)
107
+
108
+ **Before starting this phase, output to the user:**
109
+
110
+ > **Clarification** — About to transform the briefing and research into a precise scope document with explicit constraints, assumptions, and boundaries.
111
+ >
112
+ > **Why this matters:** Without explicit clarification, "add user auth" could mean OAuth, magic links, or username/password. Every ambiguity left here becomes a 50/50 coin flip during implementation that could produce the wrong feature.
113
+ >
114
+ > **Looking for:** Ambiguous requirements, implicit assumptions, missing constraints, scope boundaries, unresolved questions
115
+
116
+ ### Input Preservation (before producing clarification)
62
117
 
63
- Read the briefing, research brief, and codebase context. Produce:
118
+ 1. Glob `.wazir/input/tasks/*.md`. If files exist:
119
+ - Adopt those specs as the starting point — copy content verbatim into the clarification's item descriptions.
120
+ - Enhance with codebase scan + research findings. **Never remove detail — only add.**
121
+ - Every acceptance criterion from input must appear verbatim.
122
+ - Every API endpoint, color hex code, and UI dimension from input must appear in the relevant item section.
123
+ 2. If `.wazir/input/tasks/` is empty or missing, synthesize from `briefing.md` alone.
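
The "must appear verbatim" requirement lends itself to a mechanical check. The helper below is a hypothetical sketch: it assumes the acceptance criteria have already been extracted from the input files as plain strings, which the skill text does not specify.

```python
from typing import Iterable, List


def missing_verbatim(input_criteria: Iterable[str], clarification_text: str) -> List[str]:
    """Return the input criteria that do not appear verbatim in the clarification."""
    missing = []
    for criterion in input_criteria:
        needle = criterion.strip()
        # Exact substring match: paraphrased criteria count as missing.
        if needle and needle not in clarification_text:
            missing.append(needle)
    return missing
```

An empty result means every criterion survived; anything else points at detail that was dropped or reworded.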
64
124
 
65
- - **What** we're building: concrete deliverables, not vague descriptions
125
+ ### Informed Question Batching (after research, before producing clarification)
126
+
127
+ Research has completed. You now have codebase context and external findings. Before producing the clarification, ask the user INFORMED questions — informed by the research, not guesses.
128
+
129
+ **Rules:**
130
+ 1. **Research runs FIRST, questions come AFTER.** Never ask questions before research completes.
131
+ 2. **Batch questions:** 1-3 batches of 3-7 questions each. Never one-at-a-time.
132
+ 3. **Every scope exclusion must be explicitly confirmed by the user.** You MUST NOT decide that something is "out of scope" without asking. If the input doesn't mention docs, ask: "The input doesn't mention documentation — should we include API docs, or is that explicitly out of scope?" Do NOT assume.
133
+ 4. **If the input is clear and complete:** Zero questions is fine. State: "Input is clear and specific. No ambiguities detected. Proceeding with clarification."
134
+ 5. **In auto mode (`interaction_mode: auto`):** Questions go to the gating agent, not the user.
135
+ 6. **In interactive mode (`interaction_mode: interactive`):** More detailed questions, present research findings that informed each question.
136
+
137
+ **Question format:**
138
+ ```
139
+ Based on research, I have [N] questions before proceeding:
140
+
141
+ **Scope & Intent**
142
+ 1. [Question informed by research finding]
143
+ 2. [Question about ambiguous requirement]
144
+
145
+ **Technical Decisions**
146
+ 3. [Question about architecture choice discovered during research]
147
+ 4. [Question about dependency/framework preference]
148
+
149
+ **Boundaries**
150
+ 5. [Explicit scope boundary question — "Should X be included or excluded?"]
151
+ ```
152
+
153
+ Ask via AskUserQuestion with the full batch. Wait for answers. If answers introduce new ambiguity, ask a follow-up batch (max 3 batches total).
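
The batching limits above (at most 3 batches, at most 7 questions each) can be sketched as a small helper. This is an illustration under assumptions: `batch_questions` is a hypothetical name, batches are balanced in size, and a list of one or two questions is passed through as a single short batch even though the rule names 3 as the lower bound.

```python
import math
from typing import List

MAX_BATCHES = 3
MAX_PER_BATCH = 7


def batch_questions(questions: List[str]) -> List[List[str]]:
    """Split questions into at most 3 balanced batches of at most 7 each."""
    if not questions:
        # Zero questions is fine when the input is clear and complete.
        return []
    batch_count = math.ceil(len(questions) / MAX_PER_BATCH)
    if batch_count > MAX_BATCHES:
        raise ValueError("too many questions: consolidate before asking")
    base, extra = divmod(len(questions), batch_count)
    batches, start = [], 0
    for i in range(batch_count):
        # The first `extra` batches take one additional question.
        size = base + (1 if i < extra else 0)
        batches.append(questions[start:start + size])
        start += size
    return batches
```

With these limits the hard ceiling is 21 questions per run; anything beyond that raises instead of silently adding a fourth batch.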
154
+
155
+ ### Clarification Production
156
+
157
+ Read the briefing, research brief, user answers to questions, and codebase context. Produce:
158
+
159
+ - **What** we're building — concrete deliverables
66
160
  - **Why** — the motivation and business value
67
161
  - **Constraints** — technical, timeline, dependencies
68
- - **Assumptions** — what we're taking as given (explicitly stated)
69
- - **Scope boundaries** — what's IN and what's explicitly OUT
70
- - **Unresolved questions** — anything ambiguous that could change architecture or acceptance criteria
162
+ - **Assumptions** — what we're taking as given (each explicitly confirmed by user or clearly stated in input)
163
+ - **Scope boundaries** — what's IN and what's explicitly OUT (every exclusion must reference the user's confirmation: "Out of scope per user confirmation in question batch 1, Q5")
164
+ - **Unresolved questions** — anything still ambiguous after question batches
71
165
 
72
166
  Save to `.wazir/runs/latest/clarified/clarification.md`.
73
167
 
74
- Invoke the review loop for the clarification artifact using spec/clarification dimensions with `--mode clarification-review`. The **reviewer role** runs the loop (see `docs/reference/review-loop-pattern.md`). Resolve any findings before presenting to user.
168
+ Invoke `wz:reviewer --mode clarification-review`. Resolve findings before presenting to user.
75
169
 
76
- ### Checkpoint 1A: Clarification Review
170
+ **After completing this phase, output to the user:**
77
171
 
78
- Present the full clarification to the user:
172
+ > **Clarification complete.**
173
+ >
174
+ > **Found:** [N] ambiguities resolved, [N] assumptions documented, [N] scope boundaries defined, [N] items explicitly marked out-of-scope
175
+ >
176
+ > **Without this phase:** Implementation would proceed with hidden assumptions, scope would creep mid-build, and acceptance criteria would be vague enough to pass any implementation
177
+ >
178
+ > **Changed because of this work:** [List of resolved ambiguities — e.g., "clarified auth means OAuth2 with Google provider only", "out-of-scope: mobile responsive for v1"]
179
+
180
+ ### Checkpoint: Clarification Review
79
181
 
80
182
  > **Here's the clarified scope:**
81
183
  >
82
- > [Full clarification with what/why/constraints/assumptions/scope/questions]
83
- >
84
- > **Are there any corrections, missing context, or open questions to resolve?**
85
- > 1. **Approved — continue to spec hardening**
86
- > 2. **Needs changes** — [user provides corrections]
87
- > 3. **Missing important context** — [user adds information]
184
+ > [Full clarification]
88
185
 
89
- **Wait for user response. If the user provides corrections, update the clarification and re-present.**
186
+ Ask the user via AskUserQuestion:
187
+ - **Question:** "Does the clarified scope accurately capture what you want to build?"
188
+ - **Options:**
189
+ 1. "Approved — continue to spec hardening" *(Recommended)*
190
+ 2. "Needs changes — let me provide corrections"
191
+ 3. "Missing important context — let me add information"
192
+
193
+ Wait for the user's selection before continuing. Route feedback: plan corrections → `user-feedback.md`, new requirements → `briefing.md`.
90
194
 
91
195
  ---
92
196
 
93
- ## Phase 1A+: Spec Harden (delegated, then checkpoint)
197
+ ## Sub-Workflow 3: Spec Harden (specify + spec-challenge workflows)
198
+
199
+ **Before starting this phase, output to the user:**
200
+
201
+ > **Spec Hardening** — About to convert the clarified scope into a measurable, testable specification and then run adversarial spec-challenge review to find gaps.
202
+ >
203
+ > **Why this matters:** Without hardening, acceptance criteria stay vague ("it should work well") instead of measurable ("response time under 200ms for 95th percentile"). Vague specs pass any implementation, making review meaningless.
204
+ >
205
+ > **Looking for:** Untestable criteria, missing error handling specs, undefined edge cases, performance requirements, security constraints
94
206
 
95
207
  Delegate to the specify workflow (`workflows/specify.md`):
96
208
 
97
- 1. The **specifier role** produces a measurable spec from the clarification
98
- and research artifacts.
99
- 2. The **reviewer role** runs the spec-challenge loop
100
- (`workflows/spec-challenge.md`) with `--mode spec-challenge`.
101
- 3. The specifier resolves findings from each pass.
102
- 4. Loop runs for `pass_counts[depth]` passes.
209
+ 1. The **specifier role** produces a measurable spec from clarification + research.
210
+ 2. Invoke `wz:reviewer --mode spec-challenge`.
211
+ 3. Loop runs for `pass_counts[depth]` passes.
103
212
 
104
213
  Save result to `.wazir/runs/latest/clarified/spec-hardened.md`.
105
214
 
106
- ### Checkpoint 1A+: Hardened Spec Review
215
+ **After completing this phase, output to the user:**
107
216
 
108
- Present the changes made during hardening:
217
+ > **Spec Hardening complete.**
218
+ >
219
+ > **Found:** [N] acceptance criteria tightened, [N] edge cases added, [N] error handling requirements specified, [N] spec-challenge findings resolved
220
+ >
221
+ > **Without this phase:** Acceptance criteria would be subjective, review would have no concrete standard to measure against, and "done" would mean whatever the implementer decided
222
+ >
223
+ > **Changed because of this work:** [List of hardening changes — e.g., "added 404 handling spec for missing resources", "specified max payload size of 5MB", "added rate limit requirement of 100 req/min"]
224
+
225
+ ### Content-Author Detection
226
+
227
+ After spec hardening, scan the spec for content needs. Auto-enable the `author` workflow if the spec mentions any of:
228
+ - Database seeding, seed data, fixtures, sample records
229
+ - Sample content, placeholder text, demo data
230
+ - Test fixtures, mock API responses, test data files
231
+ - Translations, i18n strings, localization
232
+ - Copy (button labels, error messages, onboarding text)
233
+ - Documentation content, user guides, API docs
234
+ - Email templates, notification text
235
+
236
+ If detected, set `workflow_policy.author.enabled = true` in the run config and note:
237
+ > **Content needs detected.** The content-author workflow will run after design approval to produce: [list detected content types].
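
A keyword scan is one plausible way to implement the detection described above. The sketch below assumes simple case-insensitive phrase matching and a hypothetical grouping of the listed triggers into content types; the skill text does not say how "mentions" is evaluated.

```python
import re
from typing import Dict, List

# Hypothetical grouping of the trigger phrases listed in the skill text.
CONTENT_KEYWORDS: Dict[str, List[str]] = {
    "seed data": ["database seeding", "seed data", "fixtures", "sample records"],
    "sample content": ["sample content", "placeholder text", "demo data"],
    "test data": ["test fixtures", "mock api responses", "test data files"],
    "translations": ["translations", "i18n", "localization"],
    "copy": ["button labels", "error messages", "onboarding text"],
    "documentation": ["documentation", "user guides", "api docs"],
    "email": ["email templates", "notification text"],
}


def detect_content_needs(spec_text: str) -> List[str]:
    """Return the content types whose trigger phrases appear in the spec."""
    lowered = spec_text.lower()
    found = []
    for content_type, phrases in CONTENT_KEYWORDS.items():
        if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases):
            found.append(content_type)
    return found


def apply_author_policy(run_config: dict, spec_text: str) -> List[str]:
    """Enable workflow_policy.author in the run config when content is detected."""
    found = detect_content_needs(spec_text)
    if found:
        run_config.setdefault("workflow_policy", {}).setdefault("author", {})["enabled"] = True
    return found
```

The returned list can be used to fill in the "[list detected content types]" placeholder in the note to the user.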
238
+
239
+ ### Checkpoint: Hardened Spec Review
109
240
 
110
241
  > **Spec hardened. Changes made:**
111
242
  >
112
- > [List of each gap found and how it was tightened]
113
- >
114
- > **Review the hardened spec. Approve or adjust?**
115
- > 1. **Approved, continue to brainstorming** (Recommended)
116
- > 2. **Disagree with a change** — [user specifies]
117
- > 3. **Found more gaps**: [user adds]
243
+ > [List of gaps found and how they were tightened]
244
+
245
+ Ask the user via AskUserQuestion:
246
+ - **Question:** "Are the spec hardening changes acceptable?"
247
+ - **Options:**
248
+ 1. "Approved, continue to brainstorming" *(Recommended)*
249
+ 2. "Disagree with a change — let me specify"
250
+ 3. "Found more gaps — let me add"
118
251
 
119
- **Wait for user response before continuing.**
252
+ Wait for the user's selection before continuing.
120
253
 
121
254
  ---
122
255
 
123
- ## Phase 1B: Brainstorm (interactive, always pauses)
256
+ ## Sub-Workflow 4: Brainstorm (design + design-review workflows)
257
+
258
+ **Before starting this phase, output to the user:**
259
+
260
+ > **Brainstorming** — About to propose 2-3 design approaches with explicit trade-offs, then run design-review on the approved choice.
261
+ >
262
+ > **Why this matters:** Without exploring alternatives, the first approach that comes to mind gets built — even if a simpler, more maintainable, or more performant option exists. This is where architectural mistakes get caught cheaply instead of discovered during implementation.
263
+ >
264
+ > **Looking for:** Architectural trade-offs, scalability implications, complexity vs. simplicity, alignment with existing codebase patterns
124
265
 
125
- Invoke the `brainstorming` skill (`wz:brainstorming`) and follow it.
266
+ Invoke the `brainstorming` skill (`wz:brainstorming`):
126
267
 
127
- This phase explores design approaches:
128
268
  1. Propose 2-3 viable approaches with explicit trade-offs
129
269
  2. For each approach: effort estimate, risk assessment, what it enables/prevents
130
270
  3. Recommend one approach with rationale
131
271
 
132
- If `team_mode: parallel` in config, the brainstorming skill activates its
133
- **Agent Teams Structured Dialogue** mode:
272
+ ### Checkpoint: Design Approval
273
+
274
+ Ask the user via AskUserQuestion:
275
+ - **Question:** "Which design approach should we implement?"
276
+ - **Options:**
277
+ 1. "Approach A — [one-line summary]" *(Recommended)*
278
+ 2. "Approach B — [one-line summary]"
279
+ 3. "Approach C — [one-line summary]"
280
+ 4. "Modify an approach — let me specify changes"
134
281
 
135
- 1. Checks that `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` is enabled (falls back
136
- to single-agent brainstorming if not)
137
- 2. Creates a team via `TeamCreate` (`wazir-brainstorm-<concept-slug>`)
138
- 3. Spawns three teammates via `Agent` with `team_name`:
139
- - **Free Thinker** — proposes creative directions via `SendMessage`
140
- - **Grounder** — challenges each direction with practical concerns via `SendMessage`
141
- - **Synthesizer** — observes silently, writes the design document on convergence
142
- 4. You (the Arbiter) coordinate the dialogue, signal convergence, and clean up
143
- with `TeamDelete`
282
+ Wait for the user's selection before continuing. This is the most important checkpoint.
144
283
 
145
- See `skills/brainstorming/SKILL.md` "Team Mode: Agent Teams Structured Dialogue"
146
- for full spawn prompts, convergence criteria, and constraints.
284
+ Save approved design to `.wazir/runs/latest/clarified/design.md`.
147
285
 
148
- ### Checkpoint 1B: Design Approval
286
+ **After completing this phase, output to the user:**
149
287
 
150
- > **Proposed design approaches:**
288
+ > **Brainstorming complete.**
289
+ >
290
+ > **Found:** [N] approaches evaluated, [N] trade-offs documented, [N] design-review findings resolved
151
291
  >
152
- > [Approaches with trade-offs, recommendation]
292
+ > **Without this phase:** The first viable approach would be built without considering alternatives — potentially choosing a complex solution when a simple one exists, or an approach that conflicts with existing patterns
153
293
  >
154
- > **Which approach should we implement?**
155
- > 1. **Approach A** — [one-line summary]
156
- > 2. **Approach B** — [one-line summary]
157
- > 3. **Approach C** — [one-line summary]
158
- > 4. **Modify an approach** — [user specifies changes]
294
+ > **Changed because of this work:** [Selected approach and why, rejected alternatives and why, design-review adjustments made]
159
295
 
160
- **This is the most important checkpoint. Do NOT proceed without explicit design approval.**
296
+ After approval: design-review loop with `--mode design-review` (5 canonical dimensions: spec coverage, design-spec consistency, accessibility, visual consistency, exported-code fidelity).
161
297
 
162
- Save approved design to `.wazir/runs/latest/clarified/design.md`.
163
-
164
- ### Design Review
298
+ ---
165
299
 
166
- After the user approves the design concept, invoke the design-review loop with `--mode design-review`. The **reviewer role** validates the design against the approved spec using the canonical design-review dimensions:
300
+ ## Sub-Workflow 5: Plan (plan + plan-review workflows)
167
301
 
168
- - Spec coverage
169
- - Design-spec consistency
170
- - Accessibility
171
- - Visual consistency
172
- - Exported-code fidelity
302
+ **Before starting this phase, output to the user:**
173
303
 
174
- See `workflows/design-review.md` and `docs/reference/review-loop-pattern.md`. The designer resolves findings. Proceed to planning only after all design-review passes complete.
304
+ > **Planning:** About to break the approved design into ordered, dependency-aware implementation tasks with a gap analysis against the original input.
305
+ >
306
+ > **Why this matters:** Without explicit planning, tasks get implemented in the wrong order (breaking dependencies), items from the input get silently dropped, and task granularity is either too coarse (monolithic changes that are hard to review) or too fine (overhead without value).
307
+ >
308
+ > **Looking for:** Correct dependency ordering, complete input coverage, appropriate task granularity, clear acceptance criteria per task
175
309
 
176
- ---
310
+ Delegate to `wz:writing-plans`:
177
311
 
178
- ## Phase 1C: Plan (delegated, then checkpoint)
312
+ 1. Planner produces a SINGLE execution plan at `.wazir/runs/latest/clarified/execution-plan.md` in spec-kit format.
313
+ 2. **Gap analysis exit gate:** Compare original input against plan. Invoke `wz:reviewer --mode plan-review`.
314
+ 3. Loop until the review comes back clean or the pass cap is reached.
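The review loop in step 3 can be sketched as a bounded loop. This is an illustration only: `run_review_loop`, its callbacks, and the `max_passes` parameter are hypothetical names, and how Wazir actually derives the cap is not shown in this diff.

```python
from typing import Callable, List


def run_review_loop(
    review: Callable[[int], List[str]],
    resolve: Callable[[List[str]], None],
    max_passes: int,
) -> dict:
    """Run review passes until one comes back clean or the cap is reached."""
    if max_passes < 1:
        raise ValueError("max_passes must be at least 1")
    findings: List[str] = []
    for pass_number in range(1, max_passes + 1):
        findings = review(pass_number)
        if not findings:
            # A clean pass ends the loop early.
            return {"clean": True, "passes": pass_number, "open_findings": []}
        resolve(findings)
    # Cap reached: the last findings were resolved but never re-reviewed.
    return {"clean": False, "passes": max_passes, "open_findings": findings}


if __name__ == "__main__":
    # Stub reviewer: two passes with findings, then a clean pass.
    scripted = [["item 3 has no task"], ["task 2 precedes its dependency"], []]
    result = run_review_loop(lambda n: scripted[n - 1], lambda f: None, max_passes=5)
    print(result)
```

Returning the unverified findings when the cap is hit keeps the "cap reached" outcome distinguishable from a clean exit, so a caller can escalate to the user instead of treating both the same way.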
179
315
 
180
- Delegate to `wz:writing-plans`:
316
+ **After completing this phase, output to the user:**
181
317
 
182
- 1. `wz:writing-plans` (using **planner role**) produces the execution plan
183
- and task specs.
184
- 2. The **reviewer role** runs the plan-review loop
185
- (`workflows/plan-review.md`) with `--mode plan-review`.
186
- 3. The planner resolves findings from each pass.
187
- 4. Loop runs for `pass_counts[depth]` passes.
318
+ > **Planning complete.**
319
+ >
320
+ > **Found:** [N] tasks created, [N] dependencies mapped, [N] plan-review findings resolved, [N] gap analysis items addressed
321
+ >
322
+ > **Without this phase:** Tasks would be implemented in ad-hoc order breaking dependencies, input items would be silently dropped, and task sizes would vary wildly making review inconsistent
323
+ >
324
+ > **Changed because of this work:** [Task count, dependency chain summary, any items reordered or split during plan-review]
188
325
 
189
- ### Checkpoint 1C: Plan Review
326
+ ### Checkpoint: Plan Review
190
327
 
191
328
  > **Implementation plan: [N] tasks**
192
329
  >
193
330
  > | # | Task | Complexity | Dependencies | Description |
194
331
  > |---|------|-----------|--------------|-------------|
195
- > | 1 | ... | S | none | ... |
196
- > | 2 | ... | M | task-1 | ... |
332
+
333
+ Ask the user via AskUserQuestion:
334
+ - **Question:** "Does the implementation plan look correct and complete?"
335
+ - **Options:**
336
+ 1. "Approved — ready for execution" *(Recommended)*
337
+ 2. "Reorder or split tasks"
338
+ 3. "Missing tasks"
339
+ 4. "Too granular / too coarse"
340
+
341
+ Wait for the user's selection before continuing.
342
+
343
+ ---
344
+
345
+ ### Scope Coverage Gate (Hard Gate)
346
+
347
+ Before presenting the plan to the user, verify ALL input items are covered:
348
+
349
+ 1. Count distinct items/deliverables in the input briefing (`.wazir/input/briefing.md` + any `input/*.md` files)
350
+ 2. Count tasks in the execution plan
351
+ 3. **If `tasks_in_plan < items_in_input`:** STOP and present:
352
+
353
+ > **Scope reduction detected.** The input contains [N] items but the plan only covers [M].
197
354
  >
198
- > **Review the plan. Approve or adjust?**
199
- > 1. **Approved — ready for execution** (Recommended)
200
- > 2. **Reorder or split tasks** — [user specifies]
201
- > 3. **Missing tasks** [user adds]
202
- > 4. **Too granular / too coarse** — [user adjusts scope]
355
+ > Missing items: [list]
356
+
357
+ Ask the user via AskUserQuestion:
358
+ - **Question:** "The plan is missing [N-M] items from your input. How should we proceed?"
359
+ - **Options:**
360
+ 1. "Add missing items to the plan" *(Recommended)*
361
+ 2. "Approve reduced scope — I confirm these items can be dropped"
203
362
 
204
- **Wait for user response before completing.**
363
+ **The clarifier MUST NOT autonomously drop items into "future tiers", "deferred", or "out of scope" without explicit user approval. This is a hard rule.**
364
+
365
+ Invariant: `tasks_in_plan >= items_in_input` unless the user explicitly approves a reduction.
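The gate above can be sketched as a small check. Note the assumptions: the source compares counts, whereas this sketch tracks input items by name so that it can also report which ones are missing. `scope_coverage_gate`, `GateResult`, and the item names are hypothetical and not part of Wazir.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class GateResult:
    passed: bool
    missing_items: List[str]
    message: str


def scope_coverage_gate(
    input_items: List[str],
    covered_items: Set[str],
    user_approved_reduction: bool = False,
) -> GateResult:
    """Fail unless every input item is covered or the user approved the reduction."""
    missing = [item for item in input_items if item not in covered_items]
    if not missing:
        return GateResult(True, [], "All input items are covered.")
    if user_approved_reduction:
        # Only an explicit user decision may shrink the scope.
        return GateResult(True, missing, "Reduced scope explicitly approved by the user.")
    total = len(input_items)
    covered = total - len(missing)
    message = (
        f"Scope reduction detected. The input contains {total} items "
        f"but the plan only covers {covered}."
    )
    return GateResult(False, missing, message)


if __name__ == "__main__":
    result = scope_coverage_gate(["auth", "billing", "export"], {"auth", "export"})
    print(result.message)
    print("Missing items:", result.missing_items)
```

Checking by name rather than by count also catches the case where the plan has enough tasks numerically but several of them cover the same input item.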
205
366
 
206
367
  ---
207
368
 
369
+ ## Reasoning Output
370
+
371
+ Throughout the clarifier phase, produce reasoning at two layers:
372
+
373
+ **Conversation (Layer 1):** Before each sub-workflow, explain the trigger and why it matters. After each sub-workflow, state what was found and the counterfactual — what would have gone wrong without it.
374
+
375
+ **File (Layer 2):** Write `.wazir/runs/<id>/reasoning/phase-clarifier-reasoning.md` with structured entries per decision:
376
+ - **Trigger** — what prompted the decision
377
+ - **Options considered** — alternatives evaluated
378
+ - **Chosen** — selected option
379
+ - **Reasoning** — why
380
+ - **Confidence** — high/medium/low
381
+ - **Counterfactual** — what would go wrong without this info
382
+
383
+ Examples of clarifier reasoning entries:
384
+ - "Trigger: input says 'auth' without specifying provider. Options: ask user, assume OAuth2, assume magic links. Chosen: ask user. Counterfactual: assuming OAuth2 when user wanted Supabase auth = wrong middleware, 2 days rework."
385
+ - "Trigger: 13 items in input. Options: plan all 13, tier into must/should/could. Chosen: plan all 13 (user explicitly said 'do not tier'). Counterfactual: tiering would silently drop 5 items."
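One way to keep Layer 2 entries uniform is to model the six fields as a record and render it to Markdown. This is a sketch under assumptions: the diff names the fields but does not specify the on-disk layout, so the `ReasoningEntry` class and its bullet format are hypothetical.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_CONFIDENCE = ("high", "medium", "low")


@dataclass
class ReasoningEntry:
    trigger: str
    options_considered: List[str]
    chosen: str
    reasoning: str
    confidence: str
    counterfactual: str

    def to_markdown(self) -> str:
        """Render the entry as a Markdown bullet list, one bullet per field."""
        if self.confidence not in ALLOWED_CONFIDENCE:
            raise ValueError(f"confidence must be one of {ALLOWED_CONFIDENCE}")
        lines = [
            f"- **Trigger:** {self.trigger}",
            f"- **Options considered:** {', '.join(self.options_considered)}",
            f"- **Chosen:** {self.chosen}",
            f"- **Reasoning:** {self.reasoning}",
            f"- **Confidence:** {self.confidence}",
            f"- **Counterfactual:** {self.counterfactual}",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    entry = ReasoningEntry(
        trigger="input says 'auth' without specifying provider",
        options_considered=["ask user", "assume OAuth2", "assume magic links"],
        chosen="ask user",
        reasoning="the provider choice determines the middleware",
        confidence="high",
        counterfactual="assuming OAuth2 when the user wanted Supabase auth means rework",
    )
    print(entry.to_markdown())
```

Rejecting confidence values outside high/medium/low keeps the file consistent enough to be scanned or aggregated later.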
386
+
208
387
  ## Done
209
388
 
210
- When the plan is approved, present:
389
+ When the plan is approved:
211
390
 
212
- > **Clarification complete.**
391
+ > **Clarifier phase complete.**
213
392
  >
214
393
  > - Spec: `.wazir/runs/latest/clarified/spec-hardened.md`
215
394
  > - Design: `.wazir/runs/latest/clarified/design.md`
216
- > - Tasks: [count] tasks in `.wazir/runs/latest/tasks/`
217
395
  > - Plan: `.wazir/runs/latest/clarified/execution-plan.md`
218
396
  >
219
- > **Next:** Run `/wazir:executor` to execute the plan.
397
+ > **Next:** Run `/executor` to implement the plan.