@wazir-dev/cli 1.1.0 → 1.3.0

Files changed (138)
  1. package/CHANGELOG.md +74 -10
  2. package/README.md +15 -15
  3. package/assets/demo.cast +47 -0
  4. package/assets/demo.gif +0 -0
  5. package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
  6. package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
  7. package/docs/concepts/architecture.md +1 -1
  8. package/docs/concepts/roles-and-workflows.md +2 -0
  9. package/docs/concepts/why-wazir.md +59 -0
  10. package/docs/decisions/2026-03-19-deferred-items.md +564 -0
  11. package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
  12. package/docs/readmes/INDEX.md +21 -5
  13. package/docs/readmes/features/expertise/README.md +2 -2
  14. package/docs/readmes/features/exports/README.md +2 -2
  15. package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
  16. package/docs/readmes/features/schemas/README.md +3 -0
  17. package/docs/readmes/features/skills/README.md +17 -0
  18. package/docs/readmes/features/skills/clarifier.md +5 -0
  19. package/docs/readmes/features/skills/claude-cli.md +5 -0
  20. package/docs/readmes/features/skills/codex-cli.md +5 -0
  21. package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
  22. package/docs/readmes/features/skills/executing-plans.md +5 -0
  23. package/docs/readmes/features/skills/executor.md +5 -0
  24. package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
  25. package/docs/readmes/features/skills/gemini-cli.md +5 -0
  26. package/docs/readmes/features/skills/humanize.md +5 -0
  27. package/docs/readmes/features/skills/init-pipeline.md +5 -0
  28. package/docs/readmes/features/skills/receiving-code-review.md +5 -0
  29. package/docs/readmes/features/skills/requesting-code-review.md +5 -0
  30. package/docs/readmes/features/skills/reviewer.md +5 -0
  31. package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
  32. package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
  33. package/docs/readmes/features/skills/wazir.md +5 -0
  34. package/docs/readmes/features/skills/writing-skills.md +5 -0
  35. package/docs/readmes/features/workflows/prepare-next.md +1 -1
  36. package/docs/reference/configuration-reference.md +47 -6
  37. package/docs/reference/hooks.md +1 -0
  38. package/docs/reference/launch-checklist.md +4 -4
  39. package/docs/reference/review-loop-pattern.md +119 -9
  40. package/docs/reference/roles-reference.md +1 -0
  41. package/docs/reference/skill-tiers.md +147 -0
  42. package/docs/reference/tooling-cli.md +3 -1
  43. package/docs/truth-claims.yaml +12 -0
  44. package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
  45. package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
  46. package/exports/hosts/claude/.claude/commands/verify.md +30 -1
  47. package/exports/hosts/claude/.claude/settings.json +9 -0
  48. package/exports/hosts/claude/CLAUDE.md +1 -1
  49. package/exports/hosts/claude/export.manifest.json +6 -4
  50. package/exports/hosts/claude/host-package.json +3 -1
  51. package/exports/hosts/codex/AGENTS.md +1 -1
  52. package/exports/hosts/codex/export.manifest.json +6 -4
  53. package/exports/hosts/codex/host-package.json +3 -1
  54. package/exports/hosts/cursor/.cursor/hooks.json +4 -0
  55. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
  56. package/exports/hosts/cursor/export.manifest.json +6 -4
  57. package/exports/hosts/cursor/host-package.json +3 -1
  58. package/exports/hosts/gemini/GEMINI.md +1 -1
  59. package/exports/hosts/gemini/export.manifest.json +6 -4
  60. package/exports/hosts/gemini/host-package.json +3 -1
  61. package/hooks/context-mode-router +191 -0
  62. package/hooks/definitions/context_mode_router.yaml +19 -0
  63. package/hooks/hooks.json +31 -6
  64. package/hooks/protected-path-write-guard +8 -0
  65. package/hooks/routing-matrix.json +45 -0
  66. package/hooks/session-start +62 -1
  67. package/llms-full.txt +937 -134
  68. package/package.json +2 -4
  69. package/schemas/hook.schema.json +2 -1
  70. package/schemas/phase-report.schema.json +89 -0
  71. package/schemas/usage.schema.json +25 -1
  72. package/schemas/wazir-manifest.schema.json +19 -0
  73. package/skills/brainstorming/SKILL.md +32 -157
  74. package/skills/clarifier/SKILL.md +289 -111
  75. package/skills/claude-cli/SKILL.md +320 -0
  76. package/skills/codex-cli/SKILL.md +260 -0
  77. package/skills/debugging/SKILL.md +13 -0
  78. package/skills/design/SKILL.md +13 -0
  79. package/skills/dispatching-parallel-agents/SKILL.md +13 -0
  80. package/skills/executing-plans/SKILL.md +13 -0
  81. package/skills/executor/SKILL.md +139 -19
  82. package/skills/finishing-a-development-branch/SKILL.md +13 -0
  83. package/skills/gemini-cli/SKILL.md +260 -0
  84. package/skills/humanize/SKILL.md +13 -0
  85. package/skills/init-pipeline/SKILL.md +72 -164
  86. package/skills/prepare-next/SKILL.md +81 -10
  87. package/skills/receiving-code-review/SKILL.md +13 -0
  88. package/skills/requesting-code-review/SKILL.md +13 -0
  89. package/skills/reviewer/SKILL.md +369 -24
  90. package/skills/run-audit/SKILL.md +13 -0
  91. package/skills/scan-project/SKILL.md +13 -0
  92. package/skills/self-audit/SKILL.md +217 -16
  93. package/skills/skill-research/SKILL.md +188 -0
  94. package/skills/subagent-driven-development/SKILL.md +13 -0
  95. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
  96. package/skills/subagent-driven-development/implementer-prompt.md +8 -0
  97. package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
  98. package/skills/tdd/SKILL.md +13 -0
  99. package/skills/using-git-worktrees/SKILL.md +13 -0
  100. package/skills/using-skills/SKILL.md +13 -0
  101. package/skills/verification/SKILL.md +54 -3
  102. package/skills/wazir/SKILL.md +464 -381
  103. package/skills/writing-plans/SKILL.md +14 -1
  104. package/skills/writing-skills/SKILL.md +13 -0
  105. package/templates/artifacts/implementation-plan.md +3 -0
  106. package/templates/artifacts/tasks-template.md +133 -0
  107. package/templates/examples/phase-report.example.json +48 -0
  108. package/tooling/src/adapters/composition-engine.js +256 -0
  109. package/tooling/src/adapters/model-router.js +84 -0
  110. package/tooling/src/capture/command.js +41 -2
  111. package/tooling/src/capture/run-config.js +3 -1
  112. package/tooling/src/capture/store.js +56 -0
  113. package/tooling/src/capture/usage.js +106 -0
  114. package/tooling/src/capture/user-input.js +66 -0
  115. package/tooling/src/checks/ac-matrix.js +256 -0
  116. package/tooling/src/checks/command-registry.js +12 -0
  117. package/tooling/src/checks/docs-truth.js +1 -1
  118. package/tooling/src/checks/security-sensitivity.js +69 -0
  119. package/tooling/src/checks/skills.js +111 -0
  120. package/tooling/src/cli.js +31 -20
  121. package/tooling/src/commands/stats.js +161 -0
  122. package/tooling/src/commands/validate.js +5 -1
  123. package/tooling/src/export/compiler.js +33 -37
  124. package/tooling/src/gating/agent.js +145 -0
  125. package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
  126. package/tooling/src/hooks/routing-logic.js +69 -0
  127. package/tooling/src/init/auto-detect.js +258 -0
  128. package/tooling/src/init/command.js +38 -170
  129. package/tooling/src/input/scanner.js +46 -0
  130. package/tooling/src/reports/command.js +103 -0
  131. package/tooling/src/reports/phase-report.js +323 -0
  132. package/tooling/src/state/command.js +160 -0
  133. package/tooling/src/state/db.js +287 -0
  134. package/tooling/src/status/command.js +58 -1
  135. package/tooling/src/verify/proof-collector.js +299 -0
  136. package/wazir.manifest.yaml +26 -14
  137. package/workflows/plan-review.md +3 -1
  138. package/workflows/verify.md +30 -1
@@ -9,6 +9,29 @@ The user typed `/wazir <their request>`. Run the entire pipeline end-to-end, han
9
9
 
10
10
  All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
11
11
 
12
+ ## User Input Capture
13
+
14
+ After every user response (approval, correction, rejection, redirect, instruction), capture it:
15
+
16
+ ```
17
+ captureUserInput(runDir, { phase: '<current-phase>', type: '<instruction|approval|correction|rejection|redirect>', content: '<user message>', context: '<what prompted the question>' })
18
+ ```
19
+
20
+ This uses `tooling/src/capture/user-input.js`. The log at `user-input-log.ndjson` feeds the learning system — user corrections are the strongest signal for improvement. At run end, prune logs older than 10 runs via `pruneOldInputLogs(stateRoot, 10)`.
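The capture step above can be sketched as an append-only NDJSON writer. This is an illustration in Python, not the contents of `tooling/src/capture/user-input.js`, which this diff does not show. The function names, the `timestamp` field, and placing the log directly inside the run directory are assumptions.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

# Entry types taken from the call signature quoted above.
ALLOWED_TYPES = {"instruction", "approval", "correction", "rejection", "redirect"}


def capture_user_input(run_dir, phase, type_, content, context=""):
    """Append one user response as a single JSON line (hypothetical helper)."""
    if type_ not in ALLOWED_TYPES:
        raise ValueError(f"unknown input type: {type_}")
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed field
        "phase": phase,
        "type": type_,
        "content": content,
        "context": context,
    }
    log_path = os.path.join(run_dir, "user-input-log.ndjson")
    # NDJSON: one JSON object per line, appended so earlier entries are never rewritten.
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def read_user_inputs(run_dir):
    """Read every captured entry back in the order it was written."""
    log_path = os.path.join(run_dir, "user-input-log.ndjson")
    with open(log_path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Demo against a throwaway directory standing in for a run directory.
demo_dir = tempfile.mkdtemp()
capture_user_input(demo_dir, "clarifier", "correction", "Use OAuth, not API keys", "auth approach")
capture_user_input(demo_dir, "clarifier", "approval", "Looks good", "spec review")
entries = read_user_inputs(demo_dir)
```

Appending one line per response keeps each write independent, so a crash mid-run should leave earlier entries intact.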
21
+
22
+ ## Command Routing
23
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
24
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
25
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
26
+ - If context-mode unavailable, fall back to native Bash with warning
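The routing rule above could be sketched as a prefix lookup with a fallback. The format of `hooks/routing-matrix.json` is not shown in this diff, so the prefix list below is a hypothetical stand-in, and treating unlisted commands as small is an assumption.

```python
# Hypothetical examples of "large" commands; the real list lives in hooks/routing-matrix.json.
LARGE_PREFIXES = ("npm test", "npm run build", "git diff", "npm ls", "eslint")


def route_command(command, context_mode_available=True):
    """Return (target, warning) for a shell command string."""
    normalized = command.strip()
    is_large = any(
        normalized == prefix or normalized.startswith(prefix + " ")
        for prefix in LARGE_PREFIXES
    )
    if not is_large:
        # Small or unlisted commands run in native Bash (assumed default).
        return ("bash", None)
    if context_mode_available:
        return ("context-mode", None)
    # Fallback described above: native Bash, but with a warning.
    return ("bash", "context-mode unavailable; falling back to native Bash")
```

For example, `route_command("git status")` stays in Bash, while `route_command("git diff HEAD~1", context_mode_available=False)` falls back to Bash and returns a warning string.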
27
+
28
+ ## Codebase Exploration
29
+ 1. Query `wazir index search-symbols <query>` first
30
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
31
+ 3. Fall back to direct file reads ONLY for files identified by index queries
32
+ 4. Maximum 10 direct file reads without a justifying index query
33
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
34
+
12
35
  ## Subcommand Detection
13
36
 
14
37
  Before anything else, check if the request starts with a known subcommand:
@@ -18,11 +41,24 @@ Before anything else, check if the request starts with a known subcommand:
18
41
  | `/wazir audit ...` | Jump to **Audit Mode** (see below) |
19
42
  | `/wazir prd [run-id]` | Jump to **PRD Mode** (see below) |
20
43
  | `/wazir init` | Invoke the `init-pipeline` skill directly, then stop |
21
- | Anything else | Continue to Step 1 (normal pipeline) |
44
+ | Anything else | Continue to Phase 1 (Init) |
45
+
46
+ ---
47
+
48
+ # 4-Phase Pipeline
49
+
50
+ The pipeline has 4 phases. Each phase groups related workflows. Individual workflows within a phase can be enabled/disabled via `workflow_policy` in run-config.
51
+
52
+ | Phase | Contains | Owner Skill | Key Output |
53
+ |-------|----------|-------------|------------|
54
+ | **Init** | Setup, prereqs, run directory, input scan | `wz:wazir` (inline) | `run-config.yaml` |
55
+ | **Clarifier** | Research, clarify, specify, brainstorm, plan | `wz:clarifier` | Approved spec + design + plan |
56
+ | **Executor** | Implement, verify | `wz:executor` | Code + verification proof |
57
+ | **Final Review** | Review vs original input, learn, prepare next | `wz:reviewer` | Verdict + learnings + handoff |
22
58
 
23
59
  ---
24
60
 
25
- # Normal Pipeline Mode
61
+ # Phase 1: Init
26
62
 
27
63
  ## Step 1: Capture the Request
28
64
 
@@ -37,16 +73,28 @@ If the user provided no text after `/wazir`, ask:
37
73
 
38
74
  Save their answer as the briefing, then continue.
39
75
 
76
+ ### Scan Input Directory
77
+
78
+ Scan both `input/` (project-level) and `.wazir/input/` (state-level) for existing briefing materials. If files exist beyond `briefing.md`, list them:
79
+
80
+ > **Found input files:**
81
+ > - `input/2026-03-19-deferred-items.md`
82
+ > - `.wazir/input/briefing.md`
83
+ >
84
+ > Using all found input as context for clarification.
85
+
40
86
  ### Inline Modifiers
41
87
 
42
- Parse the request for inline modifiers before the main text. These skip the corresponding interview question:
88
+ Parse the request for inline modifiers before the main text:
43
89
 
44
90
  - `/wazir quick fix the login redirect` → depth = quick, intent = bugfix
45
91
  - `/wazir deep design a new onboarding flow` → depth = deep, intent = feature
46
- - `/wazir feature add CSV export` → intent = feature, depth = standard (default)
47
92
 
48
93
  Recognized modifiers:
49
94
  - **Depth:** `quick`, `deep` (standard is default when omitted)
95
+ - **Interaction mode:** `auto`, `interactive` (guided is default when omitted)
96
+ - `/wazir auto fix the auth bug` → interaction_mode = auto
97
+ - `/wazir interactive design the onboarding` → interaction_mode = interactive
50
98
  - **Intent:** `bugfix`, `feature`, `refactor`, `docs`, `spike`
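One way to read the modifier rules above is as a loop that consumes leading words while they are recognized modifiers. The sketch below assumes the `/wazir` prefix has already been stripped and that modifiers may appear in any order. The function name and the returned dictionary shape are hypothetical.

```python
DEPTHS = {"quick", "deep"}
MODES = {"auto", "interactive"}
INTENTS = {"bugfix", "feature", "refactor", "docs", "spike"}


def parse_inline_modifiers(request):
    """Consume leading modifier words; return (settings, remaining request text)."""
    # Defaults from the list above: standard depth, guided interaction mode.
    settings = {"depth": "standard", "interaction_mode": "guided", "intent": None}
    words = request.split()
    index = 0
    while index < len(words):
        word = words[index].lower()
        if word in DEPTHS:
            settings["depth"] = word
        elif word in MODES:
            settings["interaction_mode"] = word
        elif word in INTENTS:
            settings["intent"] = word
        else:
            break  # first non-modifier word starts the main request text
        index += 1
    return settings, " ".join(words[index:])


settings, remainder = parse_inline_modifiers("quick fix the login redirect")
```

Here `intent` stays `None` when no intent modifier is given, leaving it to be inferred from the request text later.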
51
99
 
52
100
  ## Step 2: Check Prerequisites
@@ -58,24 +106,18 @@ Run `which wazir` to check if the CLI is installed.
58
106
  **If not installed**, present:
59
107
 
60
108
  > **The Wazir CLI is not installed. It's required for event capture, validation, and indexing.**
61
- >
62
- > **How would you like to install it?**
63
- >
64
- > 1. **npm** (Recommended) — `npm install -g @wazir-dev/cli`
65
- > 2. **Local link** — `npm link` from the Wazir project root
66
-
67
- If the user picks 1, run `npm install -g @wazir-dev/cli` and verify with `wazir --version`.
68
- If the user picks 2, run `npm link` from the project root and verify.
69
109
 
70
- The CLI is **required** — the pipeline uses `wazir capture`, `wazir validate`, `wazir index`, and `wazir doctor` throughout execution. There is no skip option.
110
+ Ask the user via AskUserQuestion:
111
+ - **Question:** "The Wazir CLI is not installed. How would you like to install it?"
112
+ - **Options:**
113
+ 1. "npm install -g @wazir-dev/cli" *(Recommended)*
114
+ 2. "npm link from the Wazir project root"
71
115
 
72
- **If installed**, run `wazir doctor --json` to verify repo health.
116
+ Wait for the user's selection before continuing.
73
117
 
74
- If doctor reports unhealthy:
75
- > **Repo health check failed:** [details from doctor output]
76
- > Fix issues before running the pipeline.
118
+ The CLI is **required** — the pipeline uses `wazir capture`, `wazir validate`, `wazir index`, and `wazir doctor` throughout execution.
77
119
 
78
- Stop. Do NOT continue the pipeline until the health check passes.
120
+ **If installed**, run `wazir doctor --json` to verify repo health. Stop if unhealthy.
79
121
 
80
122
  ### Branch Check
81
123
 
@@ -83,13 +125,14 @@ Run `wazir validate branches` to check the current git branch.
83
125
 
84
126
  - If on `main` or `develop`:
85
127
  > You're on **[branch]**. The pipeline requires a feature branch.
86
- >
87
- > 1. **Create feat/<slug>** (Recommended) — branch from current
88
- > 2. **Continue on [branch]** — not recommended for feature/refactor work
89
128
 
90
- Wait for the user to answer before continuing.
129
+ Ask the user via AskUserQuestion:
130
+ - **Question:** "You're on a protected branch. Create a feature branch?"
131
+ - **Options:**
132
+ 1. "Create feat/<slug> from current branch" *(Recommended)*
133
+ 2. "Continue on current branch — not recommended"
91
134
 
92
- - If branch name is invalid (not `feat/`, `fix/`, `chore/`, etc.): warn but continue.
135
+ Wait for the user's selection before continuing.
93
136
 
94
137
  ### Index Check
95
138
 
@@ -107,87 +150,83 @@ fi
107
150
 
108
151
  Check if `.wazir/state/config.json` exists.
109
152
 
110
- - **If missing** — invoke the `init-pipeline` skill. This will ask the user interactive questions to set up the config.
111
- - **If exists** — continue to Step 2.5.
153
+ - **If missing** — invoke the `init-pipeline` skill.
154
+ - **If exists** — continue.
112
155
 
113
- ## Step 2.5: Create Run Directory
156
+ ## Step 3: Create Run Directory
114
157
 
115
158
  Generate a run ID using the current timestamp: `run-YYYYMMDD-HHMMSS`
116
159
 
117
160
  ```bash
118
- mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews}
161
+ mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews,clarified}
119
162
  ln -sfn run-YYYYMMDD-HHMMSS .wazir/runs/latest
120
163
  ```
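As an illustration of the naming scheme, the run ID and subdirectories could be produced as follows. The use of UTC is an assumption, since the source does not say which clock the timestamp comes from. The helper names are hypothetical, and the `latest` symlink is left to the `ln -sfn` command above.

```python
import os
import tempfile
from datetime import datetime, timezone

# Subdirectories from the mkdir command above.
RUN_SUBDIRS = ("sources", "tasks", "artifacts", "reviews", "clarified")


def make_run_id(now=None):
    """Format a timestamp as run-YYYYMMDD-HHMMSS (UTC assumed)."""
    moment = now or datetime.now(timezone.utc)
    return moment.strftime("run-%Y%m%d-%H%M%S")


def create_run_dir(runs_root, run_id):
    """Create the run directory and its subdirectories; return the run path."""
    run_dir = os.path.join(runs_root, run_id)
    for name in RUN_SUBDIRS:
        os.makedirs(os.path.join(run_dir, name), exist_ok=True)
    return run_dir


example_id = make_run_id(datetime(2026, 3, 17, 14, 30, 0, tzinfo=timezone.utc))
example_dir = create_run_dir(tempfile.mkdtemp(), example_id)
```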
121
164
 
122
- If a previous completed run exists (check for a `completed_at` field in the previous `latest` run's `run-config.yaml`), record its `run_id` as `parent_run_id` in the new run's config.
123
-
124
- After creating the run directory, initialize event capture:
165
+ Initialize event capture:
125
166
 
126
167
  ```bash
127
- wazir capture init --run <run-id> --phase clarify --status starting
168
+ wazir capture init --run <run-id> --phase init --status starting
128
169
  ```
129
170
 
130
- ## Step 3: Pre-Flight Configuration
171
+ ### Resume Detection
172
+
173
+ Check if a previous incomplete run exists (via `latest` symlink pointing to a run without `completed_at`).
131
174
 
132
- Build the run configuration. Skip questions that were answered via inline modifiers.
175
+ **If previous incomplete run found**, present:
133
176
 
134
- ### Question 1: Depth (if not set via modifier)
177
+ > **A previous incomplete run was detected:** `<previous-run-id>`
135
178
 
136
- > **How thorough should this run be?**
137
- >
138
- > 1. **Quick** — Minimal research, single-pass review, fast execution. Good for small fixes and config changes.
139
- > 2. **Standard** (Recommended): Balanced research, multi-pass hardening, full review. Good for most features.
140
- > 3. **Deep**: Extended research, thorough hardening, strict review thresholds. Good for complex or security-critical work.
179
+ Ask the user via AskUserQuestion:
180
+ - **Question:** "A previous incomplete run was detected. Resume or start fresh?"
181
+ - **Options:**
182
+ 1. "Resume from the last completed phase" *(Recommended)*
183
+ 2. "Start fresh with a new empty run"
141
184
 
142
- ### Question 2: Intent (if not set via modifier and not obvious from the request)
185
+ Wait for the user's selection before continuing.
143
186
 
144
- Only ask this if the request is ambiguous. If the intent is clear from the text (e.g., "fix the bug" → bugfix), infer it and skip.
187
+ **If Resume:**
188
+ - Copy `clarified/` from previous run into new run, EXCEPT `user-feedback.md`.
189
+ - Detect last completed phase by checking which artifacts exist.
190
+ - **Staleness check:** If input files are newer than copied artifacts, warn and offer to re-run clarification.
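The staleness check above can be approximated by comparing file modification times. The sketch assumes artifacts were copied with their timestamps preserved (for example with `shutil.copy2`) and compares the newest input against the newest artifact. The source does not specify either detail, and the function names are hypothetical.

```python
import os
import tempfile


def newest_mtime(directory):
    """Return the latest modification time of any file under directory, or None if empty."""
    times = [
        os.path.getmtime(os.path.join(root, name))
        for root, _dirs, files in os.walk(directory)
        for name in files
    ]
    return max(times) if times else None


def is_stale(input_dir, artifacts_dir):
    """True when some input file is newer than every copied artifact."""
    newest_input = newest_mtime(input_dir)
    newest_artifact = newest_mtime(artifacts_dir)
    if newest_input is None or newest_artifact is None:
        return False  # nothing to compare
    return newest_input > newest_artifact


def touch(path, mtime):
    """Create an empty file with a fixed modification time (demo helper)."""
    with open(path, "w", encoding="utf-8"):
        pass
    os.utime(path, (mtime, mtime))


inputs_dir = tempfile.mkdtemp()
artifacts_dir = tempfile.mkdtemp()
touch(os.path.join(artifacts_dir, "spec.md"), 1000)
touch(os.path.join(inputs_dir, "briefing.md"), 2000)
stale_now = is_stale(inputs_dir, artifacts_dir)
```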
145
191
 
146
- > **What kind of work is this?**
147
- >
148
- > 1. **Feature** (Recommended) — New functionality or enhancement
149
- > 2. **Bugfix** — Fix broken behavior
150
- > 3. **Refactor** — Restructure without changing behavior
151
- > 4. **Docs** — Documentation only
152
- > 5. **Spike** — Research and exploration, no production code
192
+ ## Step 4: Build Run Config
153
193
 
154
- ### Question 3: Agent Teams (conditional)
194
+ **No questions asked.** Depth, intent, and mode are all inferred or defaulted.
155
195
 
156
- Only ask this if ALL of these are true:
157
- - The host is Claude Code (not Codex/Gemini/Cursor)
158
- - Depth is `standard` or `deep`
159
- - Intent is `feature` or `refactor` (not bugfix/docs/spike)
196
+ ### Intent Inference
160
197
 
161
- > **Would you like to use Agent Teams for parallel execution?**
162
- >
163
- > 1. **No** (Recommended): Tasks run sequentially. Predictable, lower cost.
164
- > 2. **Yes** — Spawns parallel teammates for independent tasks. Potentially faster and richer output.
165
- >
166
- > *Agent Teams is experimental from Claude's side. Requires Opus model. Higher token consumption.*
198
+ Infer intent from the request text using keyword matching:
199
+
200
+ | Keywords in request | Inferred Intent |
201
+ |-------------------|-----------------|
202
+ | fix, bug, broken, crash, error, issue, wrong | `bugfix` |
203
+ | refactor, clean, restructure, reorganize, rename, simplify | `refactor` |
204
+ | doc, document, readme, guide, explain | `docs` |
205
+ | research, spike, explore, investigate, prototype | `spike` |
206
+ | (anything else) | `feature` |
207
+
208
+ Depth defaults to `standard`. Override only via inline modifiers (`/wazir quick ...`, `/wazir deep ...`).
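The keyword table above maps directly onto a small lookup. The sketch below assumes whole-word, case-insensitive matching and uses the table's row order as the priority when several rows match. Neither detail is stated in the source, and `infer_intent` is a hypothetical name.

```python
import re

# Keyword rows copied from the table above; list order is the assumed priority.
INTENT_KEYWORDS = [
    ("bugfix", {"fix", "bug", "broken", "crash", "error", "issue", "wrong"}),
    ("refactor", {"refactor", "clean", "restructure", "reorganize", "rename", "simplify"}),
    ("docs", {"doc", "document", "readme", "guide", "explain"}),
    ("spike", {"research", "spike", "explore", "investigate", "prototype"}),
]


def infer_intent(request):
    """Return the first intent whose keywords appear as whole words in the request."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    for intent, keywords in INTENT_KEYWORDS:
        if words & keywords:
            return intent
    return "feature"  # fallback row: anything else


example_intent = infer_intent("fix the login redirect")
```

With whole-word matching, inflected forms such as "fixing" fall through to `feature`; a stemmer or prefix match would change that behavior.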
167
209
 
168
210
  ### Write Run Config
169
211
 
170
- Save all decisions to `.wazir/runs/<run-id>/run-config.yaml`:
212
+ Save to `.wazir/runs/<run-id>/run-config.yaml`:
171
213
 
172
214
  ```yaml
173
- # Identity
174
- run_id: run-20260317-143000
175
- parent_run_id: null # set if resuming/continuing from a prior run
176
- continuation_reason: null # e.g. "review found minor fixes"
215
+ run_id: run-YYYYMMDD-HHMMSS
216
+ parent_run_id: null
217
+ continuation_reason: null
177
218
 
178
- # User request
179
219
  request: "the original user request"
180
- request_summary: "short summary of intent"
181
- parsed_intent: feature # feature | bugfix | refactor | docs | spike
182
- entry_point: "/wazir" # how the user entered the pipeline
220
+ request_summary: "short summary"
221
+ parsed_intent: feature
222
+ entry_point: "/wazir"
183
223
 
184
- # Configuration
185
- depth: standard # quick | standard | deep
186
- team_mode: sequential # sequential | parallel
187
- parallel_backend: none # none | claude_teams (future: subagents, worktrees)
224
+ depth: standard
225
+ interaction_mode: guided # auto | guided | interactive
188
226
 
189
- # Phase policy (system-decided, not user-facing)
190
- phase_policy:
227
+ # Workflow policy: individual workflows within each phase
228
+ workflow_policy:
229
+ # Clarifier phase workflows
191
230
  discover: { enabled: true, loop_cap: 10 }
192
231
  clarify: { enabled: true, loop_cap: 10 }
193
232
  specify: { enabled: true, loop_cap: 10 }
@@ -197,333 +236,419 @@ phase_policy:
197
236
  design-review: { enabled: true, loop_cap: 10 }
198
237
  plan: { enabled: true, loop_cap: 10 }
199
238
  plan-review: { enabled: true, loop_cap: 10 }
239
+ # Executor phase workflows
200
240
  execute: { enabled: true, loop_cap: 10 }
201
241
  verify: { enabled: true, loop_cap: 5 }
242
+ # Final Review phase workflows
202
243
  review: { enabled: true, loop_cap: 10 }
203
- learn: { enabled: false, loop_cap: 5 }
204
- prepare_next: { enabled: false, loop_cap: 5 }
244
+ learn: { enabled: true, loop_cap: 5 }
245
+ prepare_next: { enabled: true, loop_cap: 5 }
205
246
  run_audit: { enabled: false, loop_cap: 10 }
206
247
 
207
- # Research
208
- research_topics: [] # populated by researcher phase
248
+ research_topics: []
209
249
 
210
- # Timestamps
211
- created_at: 2026-03-17T14:30:00Z
250
+ created_at: "YYYY-MM-DDTHH:MM:SSZ"
212
251
  completed_at: null
213
252
  ```
214
253
 
215
- Mutable execution state (current phase, task progress, error counts) lives in `.wazir/runs/<run-id>/status.json`, NOT in `run-config.yaml`. The run config captures setup decisions only.
216
-
217
- ### Phase Policy
218
-
219
- Map intent + depth to applicable phases. The system decides — the user does NOT pick phases.
254
+ ### Workflow Skip Rules
220
255
 
221
- **Phase classes:**
256
+ Map intent + depth to applicable workflows. The system decides — the user does NOT pick.
222
257
 
223
- | Class | Phases | Rules |
224
- |-------|--------|-------|
225
- | **Core** (always run) | `clarify`, `verify`, `review` | Never skipped |
258
+ | Class | Workflows | Rules |
259
+ |-------|-----------|-------|
260
+ | **Core** (always run) | `clarify`, `execute`, `verify`, `review` | Never skipped |
226
261
  | **Adaptive** (run when evidence says so) | `discover`, `design`, `author`, `specify` | Skipped for bugfix/docs/spike at quick depth |
227
262
  | **Scale** (intensity varies) | `spec-challenge`, `plan-review`, `design-review` | Loop cap controls iteration depth |
263
+ | **Post-run** (always run) | `learn`, `prepare_next` | Part of Final Review phase |
228
264
 
229
- Log skip decisions to the run's `run-config.yaml` with reasons:
230
-
231
- ```yaml
232
- phase_policy:
233
- discover: { enabled: true, loop_cap: 10 }
234
- design: { enabled: false, loop_cap: 10, reason: "bugfix intent — no design needed" }
235
- spec-challenge: { enabled: true, loop_cap: 10 }
236
- ```
265
+ Log skip decisions with reasons in `workflow_policy`.
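The Adaptive row of the table above could be applied to a `workflow_policy` mapping roughly as follows. The `reason` key follows the example in the removed 1.1.0 text. The function name and the exact reason wording are assumptions.

```python
ADAPTIVE_WORKFLOWS = ("discover", "design", "author", "specify")
LIGHT_INTENTS = {"bugfix", "docs", "spike"}


def apply_skip_rules(workflow_policy, intent, depth):
    """Return a copy of the policy with Adaptive workflows disabled where the table says so."""
    updated = {name: dict(settings) for name, settings in workflow_policy.items()}
    if depth == "quick" and intent in LIGHT_INTENTS:
        for name in ADAPTIVE_WORKFLOWS:
            if name in updated:
                updated[name]["enabled"] = False
                # Record why the workflow was skipped, as the text above requires.
                updated[name]["reason"] = f"{intent} intent at quick depth"
    return updated


policy = {
    "discover": {"enabled": True, "loop_cap": 10},
    "clarify": {"enabled": True, "loop_cap": 10},
    "design": {"enabled": True, "loop_cap": 10},
}
quick_bugfix = apply_skip_rules(policy, "bugfix", "quick")
```

Core workflows such as `clarify` are never touched, and the input mapping is left unmodified so the original defaults stay available.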
237
266
 
238
267
  ### Confidence Gate
239
268
 
240
- After building the run config, evaluate confidence:
241
-
242
- - **High confidence** (clear intent, depth set, no ambiguity) — show a one-line summary and proceed:
243
- > **Running: standard depth, feature, sequential. 11 of 15 phases. Proceeding...**
244
-
245
- - **Low confidence** (ambiguous intent, unclear scope) — show the full plan and ask:
246
- > **Here's the run plan:**
247
- > - Depth: standard
248
- > - Intent: feature
249
- > - Phases: [list enabled phases]
250
- > - Skipped: [list skipped with reasons]
251
- >
252
- > **Does this look right?**
253
- > 1. **Yes, proceed** (Recommended)
254
- > 2. **No, let me adjust**
255
-
256
- ## Step 4: Run Pipeline Phases
257
-
258
- The full pipeline runs these phases in order. Each phase produces an artifact that must pass its review loop before flowing to the next phase. Review mode is always passed explicitly (`--mode`) -- no auto-detection.
259
-
260
- ### 4a: Source Capture
261
-
262
- Before invoking the clarifier, capture all referenced sources locally:
263
-
264
- - Fetch all URLs referenced in `.wazir/input/` briefing files
265
- - Save fetched content to `.wazir/runs/<run-id>/sources/`
266
- - Name files as `src-NNN-<slug>.md` (fetched content) or `src-NNN-fetch-failed.json` (failures)
267
- - Create `.wazir/runs/<run-id>/sources/manifest.json` indexing all captures:
268
-
269
- ```json
270
- [
271
- {
272
- "id": "src-001",
273
- "origin_url": "https://...",
274
- "fetch_time": "2026-03-17T14:30:00Z",
275
- "content_hash": "sha256:abc...",
276
- "status": "captured",
277
- "local_path": "src-001-github-readme.md"
278
- },
279
- {
280
- "id": "src-002",
281
- "origin_url": "https://...",
282
- "status": "failed",
283
- "error": "403 Forbidden",
284
- "fetch_time": "2026-03-17T14:30:01Z"
285
- }
286
- ]
287
- ```
269
+ After building run config:
288
270
 
289
- Research briefs produced by the researcher must reference local paths (`sources/src-001-...`) instead of live URLs. The original URL is preserved in the manifest for provenance. Failures are recorded explicitly — never silently skipped.
271
+ - **High confidence:** show a one-line summary and proceed:
272
+ > **Running: standard depth, feature, sequential. Proceeding...**
290
273
 
291
- ### 4b: Clarify (clarifier role)
274
+ - **Low confidence:** show the plan and ask:
292
275
 
293
- ```bash
294
- wazir capture event --run <run-id> --event phase_enter --phase clarify --status in_progress
295
- ```
276
+ Ask the user via AskUserQuestion:
277
+ - **Question:** "Does this run configuration look right?"
278
+ - **Options:**
279
+ 1. "Yes, proceed" *(Recommended)*
280
+ 2. "No, let me adjust"
296
281
 
297
- Invoke the clarifier skill for Phase 1A.
298
- Produces clarification artifact.
299
- Review: clarification-review loop (`--mode clarification-review`, spec/clarification dimensions).
300
- Pass count: quick=3, standard=5, deep=7. No extension.
301
- Checkpoint: user approves clarification.
282
+ Wait for the user's selection before continuing.
302
283
 
303
284
  ```bash
304
- wazir capture event --run <run-id> --event phase_exit --phase clarify --status completed
285
+ wazir capture event --run <run-id> --event phase_exit --phase init --status completed
305
286
  ```
306
287
 
307
- ### 4c: Research (researcher role via discover workflow)
308
-
288
+ Run the phase report and display it to the user:
309
289
  ```bash
310
- wazir capture event --run <run-id> --event phase_enter --phase discover --status in_progress
290
+ wazir report phase --run <run-id> --phase init
311
291
  ```
312
292
 
313
- Clarifier delegates to discover workflow (researcher role).
314
- Produces research artifact.
315
- Review: research-review loop (`--mode research-review`, research dimensions).
316
- Pass count: quick=3, standard=5, deep=7. No extension.
317
- Skip condition: depth=quick AND intent=bugfix.
293
+ Output the report content to the user in the conversation.
318
294
 
319
- ```bash
320
- wazir capture event --run <run-id> --event phase_exit --phase discover --status completed
321
- ```
295
+ ---
322
296
 
323
- ### 4d: Specify (specifier role)
297
+ # Interaction Modes
324
298
 
325
- ```bash
326
- wazir capture event --run <run-id> --event phase_enter --phase specify --status in_progress
327
- ```
299
+ The `interaction_mode` field in run-config controls how the pipeline interacts with the user:
328
300
 
329
- Delegate to specify workflow.
330
- Specifier produces measurable spec from clarification + research.
331
- Review: spec-challenge loop (`--mode spec-challenge`, spec/clarification dimensions).
332
- Pass count: quick=3, standard=5, deep=7. No extension.
333
- Checkpoint: user approves spec.
301
+ | Mode | Inline modifier | Behavior | Best for |
302
+ |------|----------------|----------|----------|
303
+ | **`guided`** | (default) | Pipeline runs, pauses at phase checkpoints for user approval. Current default behavior. | Most work |
304
+ | **`auto`** | `/wazir auto ...` | No human checkpoints. Codex reviews all. Gating agent decides continue/loop_back/escalate. Stops ONLY on escalate. | Overnight, clear spec, well-understood domain |
305
+ | **`interactive`** | `/wazir interactive ...` | More questions, more discussion, co-designs with user. Researcher presents options. Executor checks approach before coding. | Ambiguous requirements, new domain, learning |
334
306
 
335
- ```bash
336
- wazir capture event --run <run-id> --event phase_exit --phase specify --status completed
337
- ```
307
+ ## `auto` mode constraints
338
308
 
339
- ### 4d.5: Author (content-author role) [ADAPTIVE]
309
+ - **Codex REQUIRED** — refuse to start auto mode if `multi_tool.codex` is not configured in `.wazir/state/config.json`. Error: "Auto mode requires an external reviewer (Codex). Configure it first or use guided mode."
310
+ - **On escalate:** STOP immediately, write the escalation reason to `.wazir/runs/<id>/escalations/`, and wait for user input
311
+ - **Wall-clock limit:** default 4 hours. If exceeded, stop with escalation.
312
+ - **Never auto-commits to main** — always work on feature branch
313
+ - All checkpoints (AskUserQuestion) are skipped — gating agent evaluates phase reports and decides
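The auto-mode constraints above amount to a few guard checks. The sketch below assumes `.wazir/state/config.json` has been parsed into a dictionary with a nested `multi_tool.codex` entry. The function name, the branch check on `main` only, and the messages other than the quoted Codex error are hypothetical.

```python
def check_auto_mode(config, current_branch, elapsed_hours, limit_hours=4.0):
    """Return the reasons auto mode must refuse or stop; an empty list means it may continue."""
    problems = []
    if not config.get("multi_tool", {}).get("codex"):
        problems.append(
            "Auto mode requires an external reviewer (Codex). "
            "Configure it first or use guided mode."
        )
    if current_branch == "main":
        # Auto mode never commits to main; a feature branch is required.
        problems.append("auto mode must run on a feature branch")
    if elapsed_hours > limit_hours:
        problems.append("wall-clock limit exceeded; stop with escalation")
    return problems


ok = check_auto_mode({"multi_tool": {"codex": {"enabled": True}}}, "feat/csv-export", 1.5)
blocked = check_auto_mode({}, "main", 5.0)
```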
340
314
 
341
- ```bash
342
- wazir capture event --run <run-id> --event phase_enter --phase author --status in_progress
343
- ```
315
+ ## `guided` mode (default)
344
316
 
345
- Enabled when `phase_policy.author.enabled = true` (default: false).
346
- Content-author writes non-code content artifacts.
347
- Approval gate: human approval required (not a review loop).
348
- Skip condition: disabled by default. Enable for content-heavy projects.
317
+ Current behavior — no changes needed. Checkpoints at phase boundaries, user approves before advancing.
349
318
 
350
- ```bash
351
- wazir capture event --run <run-id> --event phase_exit --phase author --status completed
319
+ ## `interactive` mode
320
+
321
+ - **Clarifier:** asks more detailed questions, presents research findings with options: "I found 3 approaches — which interests you?"
322
+ - **Executor:** checks approach before coding: "I'm about to implement auth with Supabase — sound right?"
323
+ - **Reviewer:** discusses findings with user, not just presents verdict: "I found a potential auth bypass — here's why I think it's high severity, do you agree?"
324
+ - Slower but highest quality for complex/ambiguous work
325
+
326
+ ## Mode checking in phase skills
327
+
328
+ All phase skills check `interaction_mode` from run-config at every checkpoint:
329
+
330
+ ```
331
+ # Read from run-config
332
+ interaction_mode = run_config.interaction_mode ?? 'guided'
333
+
334
+ # At each checkpoint:
335
+ if interaction_mode == 'auto':
336
+ # Skip checkpoint, let gating agent decide
337
+ elif interaction_mode == 'interactive':
338
+ # More detailed question, present options, discuss
339
+ else:
340
+ # guided — standard checkpoint with AskUserQuestion
352
341
  ```
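The mode check above can be sketched in Python. This is a minimal illustration, assuming the run-config has been loaded into a plain mapping; the function names and the returned action labels are hypothetical and are not part of the wazir CLI.

```python
from typing import Any, Mapping


def resolve_interaction_mode(run_config: Mapping[str, Any]) -> str:
    """Return the configured mode, falling back to 'guided' when unset or None."""
    return run_config.get("interaction_mode") or "guided"


def checkpoint_action(run_config: Mapping[str, Any]) -> str:
    """Map the interaction mode to a (hypothetical) checkpoint action label."""
    mode = resolve_interaction_mode(run_config)
    if mode == "auto":
        # Skip the checkpoint and let the gating agent decide.
        return "defer_to_gating_agent"
    if mode == "interactive":
        # Ask a more detailed question, present options, discuss.
        return "ask_detailed_question"
    # Anything else is treated as guided: standard checkpoint with AskUserQuestion.
    return "ask_user_question"
```

As in the pseudocode, any value other than `auto` or `interactive` falls through to the guided behavior.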
353
342
 
354
- ### 4e: Brainstorm (designer role)
343
+ ---
344
+
345
+ # Two-Level Phase Model
346
+
347
+ The pipeline has 4 top-level **phases**. Phases 2 through 4 each contain multiple **workflows**, most with their own review loop; Phase 1 (Init) runs inline:
348
+
349
+ ```
350
+ Phase 1: Init
351
+ └── (inline — no sub-workflows)
352
+
353
+ Phase 2: Clarifier
354
+ ├── discover (research) ← research-review loop
355
+ ├── clarify ← clarification-review loop
356
+ ├── specify ← spec-challenge loop
357
+ ├── author (adaptive) ← approval gate
358
+ ├── design ← design-review loop
359
+ └── plan ← plan-review loop
360
+
361
+ Phase 3: Executor
362
+ ├── execute (per-task) ← task-review loop per task
363
+ └── verify
364
+
365
+ Phase 4: Final Review
366
+ ├── review (final) ← scored review
367
+ ├── learn
368
+ └── prepare_next
369
+ ```
355
370
 
371
+ **Event capture uses both levels.** When emitting phase events, include `--parent-phase`:
356
372
  ```bash
357
- wazir capture event --run <run-id> --event phase_enter --phase design --status in_progress
373
+ wazir capture event --run <run-id> --event phase_enter --phase discover --parent-phase clarifier --status in_progress
358
374
  ```
359
375
 
360
- Invoke brainstorming skill for Phase 1B.
361
- Interactive -- pauses for user approval of design concept.
362
- After user approval: design-review loop (`--mode design-review`,
363
- canonical design-review dimensions: spec coverage, design-spec consistency,
364
- accessibility, visual consistency, exported-code fidelity).
365
- Pass count: quick=3, standard=5, deep=7. No extension.
366
- Skip condition: intent=bugfix/docs.
376
+ **Progress markers between workflows:** After each workflow completes, output:
377
+ > Phase 2: Clarifier > Workflow: specify (3 of 6 workflows complete)
378
+
379
+ **`wazir status` shows both levels:** "Phase 2: Clarifier > Workflow: specify"
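The progress marker format can be illustrated with a short Python sketch. The workflow list is taken from the Phase 2 tree above; the function name and its signature are assumptions made for illustration only.

```python
from typing import Sequence

# Workflow order for Phase 2, as listed in the phase tree.
CLARIFIER_WORKFLOWS = ["discover", "clarify", "specify", "author", "design", "plan"]


def progress_marker(phase_number: int, phase_name: str,
                    workflows: Sequence[str], completed_workflow: str) -> str:
    """Format the marker emitted after `completed_workflow` finishes."""
    done = list(workflows).index(completed_workflow) + 1
    return (
        f"Phase {phase_number}: {phase_name} > Workflow: {completed_workflow} "
        f"({done} of {len(workflows)} workflows complete)"
    )
```

Calling `progress_marker(2, "Clarifier", CLARIFIER_WORKFLOWS, "specify")` reproduces the example marker shown above.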
380
+
381
+ ---
382
+
383
+ # Phase 2: Clarifier
384
+
385
+ **Before starting this phase, output to the user:**
386
+
387
+ > **Clarifier Phase** — About to research your codebase, clarify requirements, harden the spec, brainstorm designs, and produce an execution plan.
388
+ >
389
+ > **Why this matters:** Without this, I'd guess your tech stack, misunderstand constraints, miss edge cases in the spec, and build the wrong architecture. Every ambiguity left unresolved here becomes a bug or rework cycle later.
390
+ >
391
+ > **Looking for:** Unstated assumptions, scope boundaries, conflicting requirements, missing acceptance criteria
367
392
 
368
393
  ```bash
369
- wazir capture event --run <run-id> --event phase_exit --phase design --status completed
394
+ wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
370
395
  ```
371
396
 
372
- ### 4f: Plan (planner role via wz:writing-plans)
397
+ Invoke the `wz:clarifier` skill. It handles all sub-workflows internally:
398
+
399
+ 1. **Source Capture** — fetch URLs from input
400
+ 2. **Research** (discover workflow) — codebase + external research
401
+ 3. **Clarify** (clarify workflow) — scope, constraints, assumptions
402
+ 4. **Spec Harden** (specify + spec-challenge workflows) — measurable spec
403
+ 5. **Brainstorm** (design + design-review workflows) — design approaches
404
+ 6. **Plan** (plan + plan-review workflows) — execution plan
405
+
406
+ Each sub-workflow has its own review loop, and user checkpoints occur between major steps.
407
+
408
+ ### Scope Invariant
409
+
410
+ **Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input. It can suggest prioritization, but the decision belongs to the user.
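One way to picture the scope invariant is as a count check that only an explicit user approval can override. The sketch below is hypothetical: the exception class, function name, and the idea of representing items as strings are assumptions, not the clarifier's actual implementation.

```python
from typing import List, Sequence


class ScopeReductionError(Exception):
    """Raised when the plan covers fewer items than the input without approval."""


def enforce_scope_invariant(input_items: Sequence[str], plan_items: Sequence[str],
                            user_approved_reduction: bool = False) -> List[str]:
    """Check items_in_plan >= items_in_input; return input items absent from the plan."""
    planned = set(plan_items)
    missing = [item for item in input_items if item not in planned]
    if len(plan_items) < len(input_items) and not user_approved_reduction:
        raise ScopeReductionError(
            f"plan has {len(plan_items)} items, input has {len(input_items)}; "
            f"missing: {missing}"
        )
    return missing
```

The returned list lets the caller show the user exactly which items a suggested prioritization would drop before asking for approval.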
411
+
412
+ Output: approved spec + design + execution plan in `.wazir/runs/latest/clarified/`.
413
+
414
+ **After completing this phase, output to the user:**
415
+
416
+ > **Clarifier Phase complete.**
417
+ >
418
+ > **Found:** [N] ambiguities resolved, [N] assumptions made explicit, [N] scope boundaries drawn, [N] acceptance criteria hardened
419
+ >
420
+ > **Without this phase:** Requirements would be interpreted differently across tasks, acceptance criteria would be vague and untestable, the design would be ad-hoc, and the plan would miss dependency ordering
421
+ >
422
+ > **Changed because of this work:** [List spec tightening changes, resolved questions, design decisions, scope adjustments]
373
423
 
374
424
  ```bash
375
- wazir capture event --run <run-id> --event phase_enter --phase plan --status in_progress
425
+ wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
376
426
  ```
377
427
 
378
- Delegate to `wz:writing-plans`.
379
- Planner produces execution plan and task specs.
380
- Review: plan-review loop (`--mode plan-review`, plan dimensions).
381
- Pass count: quick=3, standard=5, deep=7. No extension.
382
- Checkpoint: user approves plan.
383
-
428
+ Run the phase report and display savings to the user:
384
429
  ```bash
385
- wazir capture event --run <run-id> --event phase_exit --phase plan --status completed
430
+ wazir report phase --run <run-id> --phase clarifier
431
+ wazir stats --run <run-id>
386
432
  ```
387
433
 
388
- ### 4g: Execute (executor role)
434
+ **Show savings in conversation output:**
435
+ > **Context savings this phase:** Used wazir index for [N] queries and context-mode for [M] commands, saving ~[X] tokens ([Y]% reduction). Without these, this phase would have consumed [A] tokens instead of [B].
436
+
437
+ Output the report content to the user in the conversation.
438
+
439
+ ---
440
+
441
+ # Phase 3: Executor
442
+
443
+ **Before starting this phase, output to the user:**
444
+
445
+ > **Executor Phase** — About to implement [N] tasks in dependency order with TDD (test-first), per-task code review, and verification before each commit.
446
+ >
447
+ > **Why this matters:** Without this discipline, tests get skipped, edge cases get missed, integration points break silently, and review catches problems too late when they're expensive to fix.
448
+ >
449
+ > **Looking for:** Correct dependency ordering, test coverage for each task, clean per-task review passes, no implementation drift from the approved plan
450
+
451
+ ## Phase Gate (Hard Gate)
452
+
453
+ Before entering the Executor phase, verify ALL clarifier artifacts exist:
454
+
455
+ - [ ] `.wazir/runs/latest/clarified/clarification.md`
456
+ - [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
457
+ - [ ] `.wazir/runs/latest/clarified/design.md`
458
+ - [ ] `.wazir/runs/latest/clarified/execution-plan.md`
459
+
460
+ If ANY file is missing, **STOP**:
461
+
462
+ > **Cannot enter Executor phase: missing prerequisite artifacts from Clarifier.**
463
+ >
464
+ > Missing: [list missing files]
465
+ >
466
+ > The Clarifier phase must complete before execution can begin. Run `/wazir:clarifier` first.
467
+
468
+ **Do NOT skip this check. Do NOT rationalize that the input is "clear enough" to bypass clarification. Every pipeline run must produce these artifacts.**
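The artifact check for this gate could be sketched as follows. The four filenames come from the checklist above; the function name is hypothetical, and the directory is passed in rather than hard-coded so the sketch works for any run directory.

```python
from pathlib import Path
from typing import List

# Filenames from the Phase Gate checklist.
REQUIRED_CLARIFIER_ARTIFACTS = (
    "clarification.md",
    "spec-hardened.md",
    "design.md",
    "execution-plan.md",
)


def missing_clarifier_artifacts(clarified_dir: str) -> List[str]:
    """Return the required artifact filenames that are not present as files."""
    base = Path(clarified_dir)
    return [name for name in REQUIRED_CLARIFIER_ARTIFACTS
            if not (base / name).is_file()]
```

An empty result means the gate passes; a non-empty result is the list to report under "Missing:" before stopping.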
389
469
 
390
470
  ```bash
391
- wazir capture event --run <run-id> --event phase_enter --phase execute --status in_progress
471
+ wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
392
472
  ```
393
473
 
394
- **Pre-execution gate** — run before the first task:
474
+ **Pre-execution gate:**
395
475
 
396
476
  ```bash
397
477
  wazir validate manifest && wazir validate hooks
398
- # If either fails, stop and report the failure. Do NOT proceed to task execution.
478
+ # Hard gate: stop if either command fails.
399
479
  ```
400
480
 
401
- Invoke executor skill for Phase 2.
402
- Per-task review: task-review loop (`--mode task-review --task-id <NNN>`,
403
- 5 task-execution dimensions) before each commit.
404
- Review logs: `execute-task-<NNN>-review-pass-<N>.md`
405
- Cap tracking: `wazir capture loop-check --task-id <NNN>`
406
- Codex error handling: non-zero exit -> codex-unavailable, self-review only.
407
- NOTE: per-task review is NOT the final review.
481
+ Invoke the `wz:executor` skill. It handles:
408
482
 
409
- If `team_mode: parallel` in run-config, the executor spawns Agent Teams for independent tasks. Otherwise, tasks run sequentially.
483
+ 1. **Execute** (execute workflow) — per-task TDD cycle with review before each commit
484
+ 2. **Verify** (verify workflow) — deterministic verification of all claims
410
485
 
411
- ```bash
412
- wazir capture event --run <run-id> --event phase_exit --phase execute --status completed
413
- ```
486
+ Per-task review: `--mode task-review`, 5 task-execution dimensions.
487
+ Tasks always run sequentially.
414
488
 
415
- ### 4h: Verify (verifier role)
489
+ Output: code changes + verification proof in `.wazir/runs/latest/artifacts/`.
416
490
 
417
- ```bash
418
- wazir capture event --run <run-id> --event phase_enter --phase verify --status in_progress
419
- ```
491
+ **After completing this phase, output to the user:**
420
492
 
421
- Deterministic verification of execution claims.
422
- Not a review loop -- produces proof, not findings.
493
+ > **Executor Phase complete.**
494
+ >
495
+ > **Found:** [N]/[N] tasks implemented, [N] tests written, [N] per-task review passes completed, [N] findings fixed before commit
496
+ >
497
+ > **Without this phase:** Code would ship without tests, review findings would accumulate until final review (10x more expensive to fix), and verification claims would be unsubstantiated
498
+ >
499
+ > **Changed because of this work:** [List of commits with conventional commit messages, test counts, verification evidence collected]
423
500
 
424
501
  ```bash
425
- wazir capture event --run <run-id> --event phase_exit --phase verify --status completed
502
+ wazir capture event --run <run-id> --event phase_exit --phase executor --status completed
426
503
  ```
427
504
 
428
- ### 4i: Final Review (reviewer role in final mode)
429
-
505
+ Run the phase report and display savings to the user:
430
506
  ```bash
431
- wazir capture event --run <run-id> --event phase_enter --phase review --status in_progress
507
+ wazir report phase --run <run-id> --phase executor
508
+ wazir stats --run <run-id>
432
509
  ```
433
510
 
434
- Invoke reviewer skill with `--mode final`.
435
- 7-dimension scored review (correctness, completeness, wiring, verification,
436
- drift, quality, documentation). Score 0-70.
437
- This IS the scored final review gate.
511
+ Output the report content to the user in the conversation.
512
+
513
+ **Show savings in conversation output:**
514
+ > **Context savings this phase:** Used wazir index for [N] queries and context-mode for [M] commands, saving ~[X] tokens ([Y]% reduction).
515
+
516
+ ---
517
+
518
+ # Phase 4: Final Review
519
+
520
+ **Before starting this phase, output to the user:**
521
+
522
+ > **Final Review Phase** — About to run adversarial 7-dimension review comparing the implementation against your original input, extract durable learnings, and prepare the handoff.
523
+ >
524
+ > **Why this matters:** Without this, implementation drift ships undetected, missing acceptance criteria go unnoticed, untested code paths hide bugs, and the same mistakes repeat in the next run.
525
+ >
526
+ > **Looking for:** Spec violations, missing features, dead code paths, unsubstantiated claims, scope creep, security gaps, stale documentation
527
+
528
+ ## Phase Gate (Hard Gate)
529
+
530
+ Before entering the Final Review phase, verify the Executor produced its proof:
531
+
532
+ - [ ] `.wazir/runs/latest/artifacts/verification-proof.md`
533
+
534
+ If missing, **STOP**:
535
+
536
+ > **Cannot enter Final Review: missing verification proof from Executor.**
537
+ >
538
+ > The Executor phase must complete and produce `verification-proof.md` before final review. Run `/wazir:executor` first.
438
539
 
439
540
  ```bash
440
- wazir capture event --run <run-id> --event phase_exit --phase review --status completed
541
+ wazir capture event --run <run-id> --event phase_enter --phase final_review --status in_progress
441
542
  ```
442
543
 
443
- ### 4j: Learn (learner role) [ADAPTIVE]
544
+ This phase validates the implementation against the **ORIGINAL INPUT** (not the task specs — the executor's per-task reviewer already covered that).
444
545
 
445
- Enabled when `phase_policy.learn.enabled = true` (default: false).
446
- Extract durable learnings from the completed run.
447
- No review loop. Learnings require explicit scope tags.
448
- Skip condition: disabled by default. Enable for retrospective runs.
546
+ ### 4a: Review (reviewer role in final mode)
449
547
 
450
- ### 4k: Prepare Next (planner role) [ADAPTIVE]
548
+ Invoke `wz:reviewer --mode final`.
549
+ 7-dimension scored review comparing implementation against the original user input.
550
+ Score 0-70. Verdicts: PASS (56+), NEEDS MINOR FIXES (42-55), NEEDS REWORK (28-41), FAIL (0-27).
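The verdict bands can be expressed as a small lookup. This is an illustrative sketch using the thresholds stated above; the function name is an assumption, and rejecting scores outside 0-70 is a choice made here rather than documented behavior.

```python
def verdict_for_score(score: int) -> str:
    """Map a 0-70 final review score to its verdict band."""
    if not 0 <= score <= 70:
        raise ValueError(f"score must be between 0 and 70, got {score}")
    if score >= 56:
        return "PASS"
    if score >= 42:
        return "NEEDS MINOR FIXES"
    if score >= 28:
        return "NEEDS REWORK"
    return "FAIL"
```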
451
551
 
452
- Enabled when `phase_policy.prepare_next.enabled = true` (default: false).
453
- Prepare context and handoff for the next run.
454
- No review loop. No implicit carry-forward of unapproved learnings.
455
- Skip condition: disabled by default.
552
+ ### 4b: Learn (learner role)
456
553
 
457
- `run_audit` is NOT part of the pipeline flow -- it is an on-demand standalone phase invoked separately.
554
+ Extract durable learnings from the completed run:
555
+ - Scan all review findings (internal + Codex)
556
+ - Propose learnings to `memory/learnings/proposed/`
557
+ - Findings that recur across 2+ runs → auto-proposed as learnings
558
+ - Learnings require explicit scope tags (roles, stacks, concerns)
458
559
 
459
- ### Resume Detection
560
+ ### 4c: Prepare Next (planner role)
460
561
 
461
- If the run has partial progress, detect the latest completed phase and resume:
562
+ Prepare context and handoff for the next run:
563
+ - Write handoff document
564
+ - Compress/archive unneeded files
565
+ - Record what's left to do
462
566
 
463
- - If clarification exists but no spec: resume at 4d (specify)
464
- - If spec exists but no design: resume at 4e (brainstorm)
465
- - If design exists but no plan: resume at 4f (plan)
466
- - If plan exists but no task artifacts: resume at 4g (execute)
467
- - If task artifacts exist but no verification: resume at 4h (verify)
468
- - If verification exists: resume at 4i (final review)
567
+ **After completing this phase, output to the user:**
469
568
 
470
- Present resume options:
471
-
472
- > **Previous progress detected (completed through [phase]).**
569
+ > **Final Review Phase complete.**
570
+ >
571
+ > **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings, [N] learnings proposed for future runs
473
572
  >
474
- > **What would you like to do?**
475
- > 1. **Resume from [next phase]** (Recommended)
476
- > 2. **Start fresh**: re-run all phases from scratch
573
+ > **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs, and recurring mistakes would never get captured as learnings
574
+ >
575
+ > **Changed because of this work:** [List of findings fixed, score achieved, learnings extracted, handoff prepared]
576
+
577
+ ```bash
578
+ wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
579
+ ```
477
580
 
478
- ## Step 5: Present Results
581
+ Run the phase report and display it to the user:
582
+ ```bash
583
+ wazir report phase --run <run-id> --phase final_review
584
+ ```
479
585
 
480
- After the reviewer completes, present the verdict and offer next steps with numbered options:
586
+ Output the report content to the user in the conversation.
587
+
588
+ ---
589
+
590
+ ## Step 5: CHANGELOG + Gitflow Validation (Hard Gates)
591
+
592
+ Before presenting results:
593
+
594
+ ```bash
595
+ wazir validate changelog --require-entries --base main
596
+ wazir validate commits --base main
597
+ ```
598
+
599
+ Both must pass before a PR is created. These are hard gates, not warnings.
600
+
601
+ ## Step 6: Present Results
602
+
603
+ After the reviewer completes, present the verdict with numbered options:
481
604
 
482
605
  ### If PASS (score 56+):
483
606
 
484
607
  > **Result: PASS (score/70)**
485
- >
486
- > [score breakdown]
487
- >
488
- > **What would you like to do?**
489
- > 1. **Create a PR** (Recommended)
490
- > 2. **Merge directly**
491
- > 3. **Review the changes first**
608
+
609
+ Ask the user via AskUserQuestion:
610
+ - **Question:** "Pipeline passed. What would you like to do next?"
611
+ - **Options:**
612
+ 1. "Create a PR" *(Recommended)*
613
+ 2. "Merge directly"
614
+ 3. "Review the changes first"
615
+
616
+ Wait for the user's selection before continuing.
492
617
 
493
618
  ### If NEEDS MINOR FIXES (score 42-55):
494
619
 
495
620
  > **Result: NEEDS MINOR FIXES (score/70)**
496
- >
497
- > [findings list]
498
- >
499
- > **What would you like to do?**
500
- > 1. **Auto-fix and re-review** (Recommended)
501
- > 2. **Fix manually**
502
- > 3. **Accept as-is**
621
+
622
+ Ask the user via AskUserQuestion:
623
+ - **Question:** "Minor issues found. How should we handle them?"
624
+ - **Options:**
625
+ 1. "Auto-fix and re-review" *(Recommended)*
626
+ 2. "Fix manually"
627
+ 3. "Accept as-is"
628
+
629
+ Wait for the user's selection before continuing.
503
630
 
504
631
  ### If NEEDS REWORK (score 28-41):
505
632
 
506
633
  > **Result: NEEDS REWORK (score/70)**
507
- >
508
- > [findings list with affected tasks]
509
- >
510
- > **What would you like to do?**
511
- > 1. **Re-run affected tasks** (Recommended)
512
- > 2. **Review findings in detail**
513
- > 3. **Abandon this run**
634
+
635
+ Ask the user via AskUserQuestion:
636
+ - **Question:** "Significant issues found. How should we proceed?"
637
+ - **Options:**
638
+ 1. "Re-run affected tasks" *(Recommended)*
639
+ 2. "Review findings in detail"
640
+ 3. "Abandon this run"
641
+
642
+ Wait for the user's selection before continuing.
514
643
 
515
644
  ### If FAIL (score 0-27):
516
645
 
517
646
  > **Result: FAIL (score/70)**
518
647
  >
519
- > [full findings]
520
- >
521
- > Something fundamental went wrong. Review the findings above and decide how to proceed.
648
+ > Something fundamental went wrong. Review the findings above.
522
649
 
523
650
  ### Run Summary
524
651
 
525
- After presenting results (regardless of verdict), capture the run summary:
526
-
527
652
  ```bash
528
653
  wazir capture summary --run <run-id>
529
654
  wazir status --run <run-id> --json
@@ -531,19 +656,18 @@ wazir status --run <run-id> --json
531
656
 
532
657
  ## Error Handling
533
658
 
534
- If any phase fails or the user cancels:
535
-
536
- 1. Report which phase failed and why
537
- 2. Present recovery options:
659
+ If any phase fails:
538
660
 
539
661
  > **Phase [name] failed: [reason]**
540
- >
541
- > **What would you like to do?**
542
- > 1. **Retry this phase** (Recommended)
543
- > 2. **Skip and continue** (only if phase is adaptive, not core)
544
- > 3. **Abort the run**
545
662
 
546
- The run config persists, so running `/wazir` again will detect the partial state and offer to resume.
663
+ Ask the user via AskUserQuestion:
664
+ - **Question:** "Phase [name] failed: [reason]. How should we proceed?"
665
+ - **Options:**
666
+ 1. "Retry this phase" *(Recommended)*
667
+ 2. "Skip and continue" *(only if workflows within phase are adaptive)*
668
+ 3. "Abort the run"
669
+
670
+ Wait for the user's selection before continuing.
547
671
 
548
672
  ---
549
673
 
@@ -551,27 +675,22 @@ The run config persists, so running `/wazir` again will detect the partial state
551
675
 
552
676
  Triggered by `/wazir audit` or `/wazir audit <focus>`.
553
677
 
554
- Runs a structured codebase audit. Invokes the `run-audit` skill with the interactive question flow.
678
+ Runs a structured codebase audit. Invokes the `run-audit` skill.
555
679
 
556
- ## Inline Audit Modifiers
680
+ Parse inline audit types: `/wazir audit security` → skip Question 1.
557
681
 
558
- Parse for known audit types after `audit`:
682
+ After audit:
559
683
 
560
- - `/wazir audit security` → audit type = security, skip Question 1
561
- - `/wazir audit deps` → audit type = dependencies, skip Question 1
562
- - `/wazir audit` → ask Question 1
684
+ Ask the user via AskUserQuestion:
685
+ - **Question:** "Audit complete. What would you like to do with the findings?"
686
+ - **Options:**
687
+ 1. "Review the findings" *(Recommended)*
688
+ 2. "Generate a fix plan"
689
+ 3. "Run the pipeline on the fix plan"
563
690
 
564
- Then let the `run-audit` skill handle the rest (scope, output mode). All its questions already follow the interactive numbered pattern.
691
+ Wait for the user's selection before continuing.
565
692
 
566
- After the audit completes:
567
-
568
- > **Audit complete. What would you like to do?**
569
- >
570
- > 1. **Review the findings** (Recommended)
571
- > 2. **Generate a fix plan** — turn findings into implementation tasks
572
- > 3. **Run the pipeline on the fix plan** — generate plan, then execute and review fixes
573
-
574
- If the user picks option 3, save the findings as the briefing and run the normal pipeline (Steps 3-5) with intent = `bugfix`.
693
+ If the user picks option 3, save the findings as the briefing and run the pipeline with intent = `bugfix`.
575
694
 
576
695
  ---
577
696
 
@@ -579,89 +698,53 @@ If the user picks option 3, save the findings as the briefing and run the normal
579
698
 
580
699
  Triggered by `/wazir prd` or `/wazir prd <run-id>`.
581
700
 
582
- Generates a Product Requirements Document from a completed pipeline run.
583
-
584
- ## Pre-Flight
585
-
586
- 1. If a `<run-id>` was provided, use that run's directory. Otherwise, use `.wazir/runs/latest`.
587
- 2. Verify the run has completed artifacts:
588
- - Design doc in the run's tasks or in `docs/plans/`
589
- - Task specs in the run's `clarified/`
590
- - Review results in the run's `reviews/` (if available)
591
- 3. If the run is incomplete or has no artifacts:
592
-
593
- > **No completed run found. Run `/wazir <your request>` first to create a pipeline run, then use `/wazir prd` to generate the PRD.**
701
+ Generates a PRD from a completed run. Reads the approved design, task specs, execution plan, and review results. Saves to `docs/prd/YYYY-MM-DD-<topic>-prd.md`.
594
702
 
595
- ## Inputs (read-only)
703
+ After generation:
596
704
 
597
- Read these artifacts from the completed run:
598
- - Approved design document
599
- - Task specs (all `spec.md` files in `clarified/`)
600
- - Execution plan
601
- - Review results and verification proofs (if available)
602
- - Run config (for context on depth, intent, decisions)
603
-
604
- ## Output
605
-
606
- Generate a PRD and save to `docs/prd/YYYY-MM-DD-<topic>-prd.md`.
607
-
608
- ### PRD Template
609
-
610
- ```markdown
611
- # Product Requirements Document — <Topic>
705
+ Ask the user via AskUserQuestion:
706
+ - **Question:** "PRD generated. What would you like to do?"
707
+ - **Options:**
708
+ 1. "Review the PRD" *(Recommended)*
709
+ 2. "Commit it"
710
+ 3. "Edit before committing"
612
711
 
613
- **Generated from run:** `<run-id>`
614
- **Date:** YYYY-MM-DD
712
+ Wait for the user's selection before continuing.
615
713
 
616
- ## Vision & Core Thesis
617
-
618
- [1-2 paragraphs synthesized from the design document's core approach]
619
-
620
- ## What We're Building
621
-
622
- ### Feature Area 1: <name>
714
+ ---
623
715
 
624
- **What:** [description from task specs]
625
- **Why:** [rationale from design doc]
626
- **Requirements:**
627
- - [ ] [from task spec acceptance criteria]
628
- - [ ] ...
716
+ ## Reasoning Chain Output
629
717
 
630
- ### Feature Area 2: <name>
631
- ...
718
+ Every phase produces reasoning output at two layers:
632
719
 
633
- ## Success Criteria
720
+ ### Layer 1: Conversation Output (concise — for the user)
634
721
 
635
- [From review results and verification proofs: what was tested and confirmed]
722
+ Before each major decision, output one trigger sentence and one reasoning sentence:
636
723
 
637
- ## Technical Constraints
724
+ > "Your request mentions 'overnight autonomous run' — researching how Devin and Karpathy's autoresearch handle this, because unattended runs need different safety constraints than interactive ones."
638
725
 
639
- [From architecture decisions, run config, and design trade-offs]
726
+ After each phase, output what was found and a counterfactual:
640
727
 
641
- ## What's NOT in Scope
728
+ > "Found: you use Supabase auth (not custom JWT). If I'd skipped research, I would have built JWT middleware — completely wrong."
642
729
 
643
- [From design doc's rejected alternatives and explicit exclusions]
730
+ ### Layer 2: File Output (detailed — for learning and reports)
644
731
 
645
- ## Open Questions
732
+ Save full reasoning chain to `.wazir/runs/<id>/reasoning/phase-<name>-reasoning.md` with entries:
646
733
 
647
- [From design doc's open questions and review findings]
734
+ ```markdown
735
+ ### Decision: [title]
736
+ - **Trigger:** What prompted this decision
737
+ - **Options considered:** List of alternatives
738
+ - **Chosen:** The selected option
739
+ - **Reasoning:** Why this option was chosen
740
+ - **Confidence:** high | medium | low
741
+ - **Counterfactual:** What would have gone wrong without this information
648
742
  ```
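A reasoning entry in this format could be produced by a small renderer such as the sketch below. The `Decision` dataclass and `render_decision` function are hypothetical names for illustration; only the field labels and the three confidence levels come from the template above.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_LEVELS = ("high", "medium", "low")


@dataclass
class Decision:
    title: str
    trigger: str
    options: List[str]
    chosen: str
    reasoning: str
    confidence: str
    counterfactual: str


def render_decision(decision: Decision) -> str:
    """Render one decision as a markdown entry matching the template fields."""
    if decision.confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")
    return "\n".join([
        f"### Decision: {decision.title}",
        f"- **Trigger:** {decision.trigger}",
        f"- **Options considered:** {', '.join(decision.options)}",
        f"- **Chosen:** {decision.chosen}",
        f"- **Reasoning:** {decision.reasoning}",
        f"- **Confidence:** {decision.confidence}",
        f"- **Counterfactual:** {decision.counterfactual}",
    ])
```

Validating the confidence value up front keeps the reasoning files consistent enough to be scanned later by a learner or report step.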
649
743
 
650
- ## After Generation
651
-
652
- > **PRD generated at `docs/prd/YYYY-MM-DD-<topic>-prd.md`.**
653
- >
654
- > **What would you like to do?**
655
- > 1. **Review the PRD** (Recommended)
656
- > 2. **Commit it**
657
- > 3. **Edit before committing**
658
-
659
- ---
744
+ Create the `reasoning/` directory during run init. Every phase skill (clarifier, executor, reviewer) writes its own reasoning file. Counterfactuals appear in BOTH conversation output AND reasoning files.
660
745
 
661
746
  ## Interaction Rules
662
747
 
663
- These rules apply to ALL questions in the pipeline, including those asked by sub-skills (clarifier, executor, reviewer) and audit modes:
664
-
665
748
  - **One question at a time** — never combine multiple questions
666
749
  - **Numbered options** — always present choices as numbered lists
667
750
  - **Mark defaults** — always show "(Recommended)" on the suggested option