@wazir-dev/cli 1.2.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (161)
  1. package/CHANGELOG.md +54 -44
  2. package/README.md +13 -13
  3. package/assets/demo.cast +47 -0
  4. package/assets/demo.gif +0 -0
  5. package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
  6. package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
  7. package/docs/concepts/architecture.md +1 -1
  8. package/docs/concepts/why-wazir.md +1 -1
  9. package/docs/readmes/INDEX.md +1 -1
  10. package/docs/readmes/features/expertise/README.md +1 -1
  11. package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
  12. package/docs/reference/hooks.md +1 -0
  13. package/docs/reference/launch-checklist.md +3 -3
  14. package/docs/reference/review-loop-pattern.md +3 -2
  15. package/docs/reference/skill-tiers.md +2 -2
  16. package/docs/research/2026-03-20-agents/a18fb002157904af5.txt +187 -0
  17. package/docs/research/2026-03-20-agents/a1d0ac79ac2f11e6f.txt +2 -0
  18. package/docs/research/2026-03-20-agents/a324079de037abd7c.txt +198 -0
  19. package/docs/research/2026-03-20-agents/a357586bccfafb0e5.txt +256 -0
  20. package/docs/research/2026-03-20-agents/a4365394e4d753105.txt +137 -0
  21. package/docs/research/2026-03-20-agents/a492af28bc52d3613.txt +136 -0
  22. package/docs/research/2026-03-20-agents/a4984db0b6a8eee07.txt +124 -0
  23. package/docs/research/2026-03-20-agents/a5b30e59d34bbb062.txt +214 -0
  24. package/docs/research/2026-03-20-agents/a5cf7829dab911586.txt +165 -0
  25. package/docs/research/2026-03-20-agents/a607157c30dd97c9e.txt +96 -0
  26. package/docs/research/2026-03-20-agents/a60b68b1e19d1e16b.txt +115 -0
  27. package/docs/research/2026-03-20-agents/a722af01c5594aba0.txt +166 -0
  28. package/docs/research/2026-03-20-agents/a787bdc516faa5829.txt +181 -0
  29. package/docs/research/2026-03-20-agents/a7c46d1bba1056ed2.txt +132 -0
  30. package/docs/research/2026-03-20-agents/a7e5abbab2b281a0d.txt +100 -0
  31. package/docs/research/2026-03-20-agents/a8dbadc66cd0d7d5a.txt +95 -0
  32. package/docs/research/2026-03-20-agents/a904d9f45d6b86a6d.txt +75 -0
  33. package/docs/research/2026-03-20-agents/a927659a942ee7f60.txt +102 -0
  34. package/docs/research/2026-03-20-agents/a962cb569191f7583.txt +125 -0
  35. package/docs/research/2026-03-20-agents/aab6decea538aac41.txt +148 -0
  36. package/docs/research/2026-03-20-agents/abd58b853dd938a1b.txt +295 -0
  37. package/docs/research/2026-03-20-agents/ac009da573eff7f65.txt +100 -0
  38. package/docs/research/2026-03-20-agents/ac1bc783364405e5f.txt +190 -0
  39. package/docs/research/2026-03-20-agents/aca5e2b57fde152a0.txt +132 -0
  40. package/docs/research/2026-03-20-agents/ad849b8c0a7e95b8b.txt +176 -0
  41. package/docs/research/2026-03-20-agents/adc2b12a4da32c962.txt +258 -0
  42. package/docs/research/2026-03-20-agents/af97caaaa9a80e4cb.txt +146 -0
  43. package/docs/research/2026-03-20-agents/afc5faceee368b3ca.txt +111 -0
  44. package/docs/research/2026-03-20-agents/afdb282d866e3c1e4.txt +164 -0
  45. package/docs/research/2026-03-20-agents/afe9d1f61c02b1e8d.txt +299 -0
  46. package/docs/research/2026-03-20-agents/b4hmkwril.txt +1856 -0
  47. package/docs/research/2026-03-20-agents/b80ptk89g.txt +1856 -0
  48. package/docs/research/2026-03-20-agents/bf54s1jss.txt +1150 -0
  49. package/docs/research/2026-03-20-agents/bhd6kq2kx.txt +1856 -0
  50. package/docs/research/2026-03-20-agents/bmb2fodyr.txt +988 -0
  51. package/docs/research/2026-03-20-agents/bmmsrij8i.txt +826 -0
  52. package/docs/research/2026-03-20-agents/bn4t2ywpu.txt +2175 -0
  53. package/docs/research/2026-03-20-agents/bu22t9f1z.txt +0 -0
  54. package/docs/research/2026-03-20-agents/bwvl98v2p.txt +738 -0
  55. package/docs/research/2026-03-20-agents/psych-a3697a7fd06eb64fd.txt +135 -0
  56. package/docs/research/2026-03-20-agents/psych-a37776fabc870feae.txt +123 -0
  57. package/docs/research/2026-03-20-agents/psych-a5b1fe05c0589efaf.txt +2 -0
  58. package/docs/research/2026-03-20-agents/psych-a95c15b1f29424435.txt +76 -0
  59. package/docs/research/2026-03-20-agents/psych-a9c26f4d9172dde7c.txt +2 -0
  60. package/docs/research/2026-03-20-agents/psych-aa19c69f0ca2c5ad3.txt +2 -0
  61. package/docs/research/2026-03-20-agents/psych-aa4e4cb70e1be5ecb.txt +95 -0
  62. package/docs/research/2026-03-20-agents/psych-ab5b302f26a554663.txt +102 -0
  63. package/docs/research/2026-03-20-deep-research-complete.md +101 -0
  64. package/docs/research/2026-03-20-deep-research-status.md +38 -0
  65. package/docs/research/2026-03-20-enforcement-research.md +107 -0
  66. package/expertise/antipatterns/process/ai-coding-antipatterns.md +117 -0
  67. package/expertise/composition-map.yaml +27 -8
  68. package/expertise/digests/reviewer/ai-coding-digest.md +83 -0
  69. package/expertise/digests/reviewer/architectural-thinking-digest.md +63 -0
  70. package/expertise/digests/reviewer/architecture-antipatterns-digest.md +49 -0
  71. package/expertise/digests/reviewer/code-smells-digest.md +53 -0
  72. package/expertise/digests/reviewer/coupling-cohesion-digest.md +54 -0
  73. package/expertise/digests/reviewer/ddd-digest.md +60 -0
  74. package/expertise/digests/reviewer/dependency-risk-digest.md +40 -0
  75. package/expertise/digests/reviewer/error-handling-digest.md +55 -0
  76. package/expertise/digests/reviewer/review-methodology-digest.md +49 -0
  77. package/exports/hosts/claude/.claude/commands/learn.md +61 -8
  78. package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
  79. package/exports/hosts/claude/.claude/commands/verify.md +30 -1
  80. package/exports/hosts/claude/.claude/settings.json +7 -6
  81. package/exports/hosts/claude/export.manifest.json +8 -5
  82. package/exports/hosts/claude/host-package.json +3 -0
  83. package/exports/hosts/codex/export.manifest.json +8 -5
  84. package/exports/hosts/codex/host-package.json +3 -0
  85. package/exports/hosts/cursor/.cursor/hooks.json +6 -6
  86. package/exports/hosts/cursor/export.manifest.json +8 -5
  87. package/exports/hosts/cursor/host-package.json +3 -0
  88. package/exports/hosts/gemini/export.manifest.json +8 -5
  89. package/exports/hosts/gemini/host-package.json +3 -0
  90. package/hooks/definitions/pretooluse_dispatcher.yaml +26 -0
  91. package/hooks/definitions/pretooluse_pipeline_guard.yaml +22 -0
  92. package/hooks/definitions/stop_pipeline_gate.yaml +22 -0
  93. package/hooks/hooks.json +7 -6
  94. package/hooks/pretooluse-dispatcher +84 -0
  95. package/hooks/pretooluse-pipeline-guard +9 -0
  96. package/hooks/stop-pipeline-gate +9 -0
  97. package/llms-full.txt +48 -18
  98. package/package.json +2 -3
  99. package/schemas/decision.schema.json +15 -0
  100. package/schemas/hook.schema.json +4 -1
  101. package/schemas/phase-report.schema.json +9 -0
  102. package/skills/TEMPLATE-3-ZONE.md +160 -0
  103. package/skills/brainstorming/SKILL.md +137 -21
  104. package/skills/clarifier/SKILL.md +364 -53
  105. package/skills/claude-cli/SKILL.md +91 -12
  106. package/skills/codex-cli/SKILL.md +91 -12
  107. package/skills/debugging/SKILL.md +133 -38
  108. package/skills/design/SKILL.md +173 -37
  109. package/skills/dispatching-parallel-agents/SKILL.md +129 -31
  110. package/skills/executing-plans/SKILL.md +113 -25
  111. package/skills/executor/SKILL.md +252 -21
  112. package/skills/finishing-a-development-branch/SKILL.md +107 -18
  113. package/skills/gemini-cli/SKILL.md +91 -12
  114. package/skills/humanize/SKILL.md +92 -13
  115. package/skills/init-pipeline/SKILL.md +90 -18
  116. package/skills/prepare-next/SKILL.md +93 -24
  117. package/skills/receiving-code-review/SKILL.md +90 -16
  118. package/skills/requesting-code-review/SKILL.md +100 -24
  119. package/skills/requesting-code-review/code-reviewer.md +29 -17
  120. package/skills/reviewer/SKILL.md +270 -57
  121. package/skills/run-audit/SKILL.md +92 -15
  122. package/skills/scan-project/SKILL.md +93 -14
  123. package/skills/self-audit/SKILL.md +133 -39
  124. package/skills/skill-research/SKILL.md +275 -0
  125. package/skills/subagent-driven-development/SKILL.md +129 -30
  126. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +30 -2
  127. package/skills/subagent-driven-development/implementer-prompt.md +40 -27
  128. package/skills/subagent-driven-development/spec-reviewer-prompt.md +25 -12
  129. package/skills/tdd/SKILL.md +125 -20
  130. package/skills/using-git-worktrees/SKILL.md +118 -28
  131. package/skills/using-skills/SKILL.md +116 -29
  132. package/skills/verification/SKILL.md +160 -17
  133. package/skills/wazir/SKILL.md +750 -120
  134. package/skills/writing-plans/SKILL.md +134 -28
  135. package/skills/writing-skills/SKILL.md +91 -13
  136. package/skills/writing-skills/anthropic-best-practices.md +104 -64
  137. package/skills/writing-skills/persuasion-principles.md +100 -34
  138. package/tooling/src/capture/command.js +46 -2
  139. package/tooling/src/capture/decision.js +40 -0
  140. package/tooling/src/capture/store.js +33 -0
  141. package/tooling/src/capture/user-input.js +66 -0
  142. package/tooling/src/checks/security-sensitivity.js +69 -0
  143. package/tooling/src/cli.js +28 -26
  144. package/tooling/src/config/depth-table.js +60 -0
  145. package/tooling/src/export/compiler.js +7 -8
  146. package/tooling/src/guards/guardrail-functions.js +131 -0
  147. package/tooling/src/guards/phase-prerequisite-guard.js +97 -3
  148. package/tooling/src/hooks/pretooluse-dispatcher.js +300 -0
  149. package/tooling/src/hooks/pretooluse-pipeline-guard.js +141 -0
  150. package/tooling/src/hooks/stop-pipeline-gate.js +92 -0
  151. package/tooling/src/init/auto-detect.js +0 -2
  152. package/tooling/src/init/command.js +3 -95
  153. package/tooling/src/learn/pipeline.js +177 -0
  154. package/tooling/src/state/db.js +251 -2
  155. package/tooling/src/state/pipeline-state.js +262 -0
  156. package/tooling/src/status/command.js +6 -1
  157. package/tooling/src/verify/proof-collector.js +299 -0
  158. package/wazir.manifest.yaml +3 -0
  159. package/workflows/learn.md +61 -8
  160. package/workflows/plan-review.md +3 -1
  161. package/workflows/verify.md +30 -1
@@ -1,26 +1,57 @@
1
1
  ---
2
2
  name: wz:wazir
3
- description: One-command pipeline. Type /wazir followed by what you want to build. Handles init, clarification, execution, review, and audits automatically.
3
+ description: "Use when the user types /wazir to run the full pipeline for building, reviewing, and auditing."
4
4
  ---
5
5
 
6
6
  # Wazir — Full Pipeline Runner
7
7
 
8
- The user typed `/wazir <their request>`. Run the entire pipeline end-to-end, handling each phase automatically and only pausing where human input is required.
8
+ <!-- ═══════════════════════════════════════════════════════════════════
9
+ ZONE 1 — PRIMACY
10
+ ═══════════════════════════════════════════════════════════════════ -->
9
11
 
10
- All questions use **numbered interactive options**: one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
12
+ You are the **Pipeline Controller**. Your value is orchestrating the full Wazir pipeline end-to-end (init, clarification, execution, review), handling each phase automatically and pausing only where human input is required. Following the pipeline IS how you help.
11
13
 
12
- ## Command Routing
13
- Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
14
- - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
15
- - Small commands (git status, ls, pwd, wazir CLI) → native Bash
16
- - If context-mode unavailable, fall back to native Bash with warning
14
+ The user typed `/wazir <their request>`. Run the entire pipeline end-to-end.
17
15
 
18
- ## Codebase Exploration
19
- 1. Query `wazir index search-symbols <query>` first
20
- 2. Use `wazir recall file <path> --tier L1` for targeted reads
21
- 3. Fall back to direct file reads ONLY for files identified by index queries
22
- 4. Maximum 10 direct file reads without a justifying index query
23
- 5. If no index exists: `wazir index build && wazir index summarize --tier all`
16
+ ## Iron Laws
17
+
18
+ 1. **NEVER skip a core pipeline phase** (clarify, execute, verify, review). Core workflows always run.
19
+ 2. **NEVER run a phase inline in the controller.** The controller ONLY dispatches subagents, validates guardrails, and manages state. No phase runs inside the controller context.
20
+ 3. **NEVER let a subagent see or skip another phase.** Each subagent gets only its own phase instructions and artifact paths.
21
+ 4. **ALWAYS capture events for every phase transition** via `wazir capture event`.
22
+ 5. **ALWAYS validate artifacts BETWEEN phases** via guardrails. No phase starts without previous phase artifacts verified.
23
+
24
+ ## Priority Stack
25
+
26
+ | Priority | Name | Beats | Conflict Example |
27
+ |----------|------|-------|------------------|
28
+ | P0 | Iron Laws | Everything | User says "skip review" → review anyway |
29
+ | P1 | Pipeline gates | P2-P5 | Spec not approved → do not code |
30
+ | P2 | Correctness | P3-P5 | Partial correct > complete wrong |
31
+ | P3 | Completeness | P4-P5 | All criteria before optimizing |
32
+ | P4 | Speed | P5 | Fast execution, never fewer steps |
33
+ | P5 | User comfort | Nothing | Minimize friction, never weaken P0-P4 |
34
+
35
+ ## Override Boundary
36
+
37
+ User CAN choose depth (quick/standard/deep), interaction mode (auto/guided/interactive), and which adaptive workflows to enable.
38
+ User CANNOT skip core phases, bypass guardrails, or run phases inline in the controller.
39
+
40
+ <!-- ═══════════════════════════════════════════════════════════════════
41
+ ZONE 2 — PROCESS
42
+ ═══════════════════════════════════════════════════════════════════ -->
43
+
44
+ ## Signature
45
+
46
+ **Inputs:**
47
+ - User request (text after `/wazir`)
48
+ - Project repo state
49
+ - `.wazir/state/config.json` (if exists)
50
+
51
+ **Outputs:**
52
+ - Completed pipeline run with all artifacts
53
+ - Review verdict with numeric score
54
+ - Event log, reasoning chain, learnings
24
55
 
25
56
  ## Subcommand Detection
26
57
 
@@ -33,6 +64,23 @@ Before anything else, check if the request starts with a known subcommand:
33
64
  | `/wazir init` | Invoke the `init-pipeline` skill directly, then stop |
34
65
  | Anything else | Continue to Phase 1 (Init) |
35
66
 
67
+ ## Commitment Priming
68
+
69
+ Before executing, announce your plan:
70
+ > "Running the Wazir pipeline at [depth] depth in [mode] mode. I will orchestrate 4 phases — Init, Clarifier, Executor, Final Review — dispatching isolated subagents for each, validating artifacts between phases."
71
+
72
+ All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
73
+
74
+ ## User Input Capture
75
+
76
+ After every user response (approval, correction, rejection, redirect, instruction), capture it:
77
+
78
+ ```
79
+ captureUserInput(runDir, { phase: '<current-phase>', type: '<instruction|approval|correction|rejection|redirect>', content: '<user message>', context: '<what prompted the question>' })
80
+ ```
81
+
82
+ This uses `tooling/src/capture/user-input.js`. The log at `user-input-log.ndjson` feeds the learning system — user corrections are the strongest signal for improvement. At run end, prune logs older than 10 runs via `pruneOldInputLogs(stateRoot, 10)`.
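The capture step above amounts to appending one JSON object per line to an NDJSON log. The following is a minimal sketch of that idea, not the package's `captureUserInput` implementation: the function names `appendUserInput` and `readUserInputs`, the `timestamp` field, and the assumption that the log lives directly inside the run directory are all hypothetical.

```javascript
// Hypothetical sketch: NOT the code in tooling/src/capture/user-input.js.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Input types listed in the skill text.
const ALLOWED_TYPES = ['instruction', 'approval', 'correction', 'rejection', 'redirect'];

// Append one user-input record as a single JSON line (NDJSON).
function appendUserInput(runDir, { phase, type, content, context = '' }) {
  if (!ALLOWED_TYPES.includes(type)) {
    throw new Error(`Unknown input type: ${type}`);
  }
  const record = { timestamp: new Date().toISOString(), phase, type, content, context };
  fs.mkdirSync(runDir, { recursive: true });
  // Assumed location: <runDir>/user-input-log.ndjson
  fs.appendFileSync(path.join(runDir, 'user-input-log.ndjson'), JSON.stringify(record) + '\n');
  return record;
}

// Read the log back as an array of records (empty if no log exists yet).
function readUserInputs(runDir) {
  const file = path.join(runDir, 'user-input-log.ndjson');
  if (!fs.existsSync(file)) return [];
  return fs.readFileSync(file, 'utf8').split('\n').filter(Boolean).map((line) => JSON.parse(line));
}

// Usage: write two records to a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'wazir-sketch-'));
appendUserInput(demoDir, { phase: 'clarifier', type: 'approval', content: 'Looks good' });
appendUserInput(demoDir, { phase: 'executor', type: 'correction', content: 'Use pnpm', context: 'install step' });
```

Appending line by line keeps each write independent, so a partially written run still leaves earlier records readable.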
83
+
36
84
  ---
37
85
 
38
86
  # 4-Phase Pipeline
@@ -82,6 +130,9 @@ Parse the request for inline modifiers before the main text:
82
130
 
83
131
  Recognized modifiers:
84
132
  - **Depth:** `quick`, `deep` (standard is default when omitted)
133
+ - **Interaction mode:** `auto`, `interactive` (guided is default when omitted)
134
+ - `/wazir auto fix the auth bug` → interaction_mode = auto
135
+ - `/wazir interactive design the onboarding` → interaction_mode = interactive
85
136
  - **Intent:** `bugfix`, `feature`, `refactor`, `docs`, `spike`
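The modifier rules above can be sketched as a small parser that consumes leading words while they match a known modifier. This is an illustration under assumptions: `parseModifiers` and its return shape are hypothetical, and the real skill may resolve ambiguous words (for example a request that genuinely begins with "feature") differently.

```javascript
// Hypothetical sketch of inline-modifier parsing; not taken from the package.
const DEPTHS = ['quick', 'deep'];        // "standard" is the default when omitted
const MODES = ['auto', 'interactive'];   // "guided" is the default when omitted
const INTENTS = ['bugfix', 'feature', 'refactor', 'docs', 'spike'];

function parseModifiers(request) {
  const words = request.trim().split(/\s+/).filter(Boolean);
  const parsed = { depth: 'standard', interaction_mode: 'guided', intent: null };
  const seen = new Set();
  // Consume leading words only while they are recognized, not-yet-seen modifiers.
  while (words.length > 0) {
    const word = words[0].toLowerCase();
    let key = null;
    if (DEPTHS.includes(word)) key = 'depth';
    else if (MODES.includes(word)) key = 'interaction_mode';
    else if (INTENTS.includes(word)) key = 'intent';
    if (key === null || seen.has(key)) break;
    parsed[key] = word;
    seen.add(key);
    words.shift();
  }
  // Whatever remains is the main request text.
  return { ...parsed, text: words.join(' ') };
}
```

With this sketch, `parseModifiers('auto fix the auth bug')` yields `interaction_mode: 'auto'` with the defaults for depth and intent, matching the example in the list.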
86
137
 
87
138
  ## Step 2: Check Prerequisites
@@ -93,11 +144,14 @@ Run `which wazir` to check if the CLI is installed.
93
144
  **If not installed**, present:
94
145
 
95
146
  > **The Wazir CLI is not installed. It's required for event capture, validation, and indexing.**
96
- >
97
- > **How would you like to install it?**
98
- >
99
- > 1. **npm** (Recommended) — `npm install -g @wazir-dev/cli`
100
- > 2. **Local link** — `npm link` from the Wazir project root
147
+
148
+ Ask the user via AskUserQuestion:
149
+ - **Question:** "The Wazir CLI is not installed. How would you like to install it?"
150
+ - **Options:**
151
+ 1. "npm install -g @wazir-dev/cli" *(Recommended)*
152
+ 2. "npm link from the Wazir project root"
153
+
154
+ Wait for the user's selection before continuing.
101
155
 
102
156
  The CLI is **required** — the pipeline uses `wazir capture`, `wazir validate`, `wazir index`, and `wazir doctor` throughout execution.
103
157
 
@@ -109,9 +163,14 @@ Run `wazir validate branches` to check the current git branch.
109
163
 
110
164
  - If on `main` or `develop`:
111
165
  > You're on **[branch]**. The pipeline requires a feature branch.
112
- >
113
- > 1. **Create feat/<slug>** (Recommended) — branch from current
114
- > 2. **Continue on [branch]**: not recommended for feature/refactor work
166
+
167
+ Ask the user via AskUserQuestion:
168
+ - **Question:** "You're on a protected branch. Create a feature branch?"
169
+ - **Options:**
170
+ 1. "Create feat/<slug> from current branch" *(Recommended)*
171
+ 2. "Continue on current branch — not recommended"
172
+
173
+ Wait for the user's selection before continuing.
115
174
 
116
175
  ### Index Check
117
176
 
@@ -154,9 +213,14 @@ Check if a previous incomplete run exists (via `latest` symlink pointing to a ru
154
213
  **If previous incomplete run found**, present:
155
214
 
156
215
  > **A previous incomplete run was detected:** `<previous-run-id>`
157
- >
158
- > 1. **Resume** (Recommended) — continue from the last completed phase
159
- > 2. **Start fresh**: create a new empty run
216
+
217
+ Ask the user via AskUserQuestion:
218
+ - **Question:** "A previous incomplete run was detected. Resume or start fresh?"
219
+ - **Options:**
220
+ 1. "Resume from the last completed phase" *(Recommended)*
221
+ 2. "Start fresh with a new empty run"
222
+
223
+ Wait for the user's selection before continuing.
160
224
 
161
225
  **If Resume:**
162
226
  - Copy `clarified/` from previous run into new run, EXCEPT `user-feedback.md`.
@@ -196,29 +260,30 @@ parsed_intent: feature
196
260
  entry_point: "/wazir"
197
261
 
198
262
  depth: standard
199
- team_mode: sequential
200
- parallel_backend: none
263
+ interaction_mode: guided # auto | guided | interactive
201
264
 
202
- # Workflow policy — individual workflows within each phase
265
+ # Workflow policy — loop_cap is set from the depth table:
266
+ # quick: loop_cap=5, standard: loop_cap=10, deep: loop_cap=15
267
+ # See tooling/src/config/depth-table.js for the canonical values.
203
268
  workflow_policy:
204
269
  # Clarifier phase workflows
205
- discover: { enabled: true, loop_cap: 10 }
206
- clarify: { enabled: true, loop_cap: 10 }
207
- specify: { enabled: true, loop_cap: 10 }
208
- spec-challenge: { enabled: true, loop_cap: 10 }
209
- author: { enabled: false, loop_cap: 10 }
210
- design: { enabled: true, loop_cap: 10 }
211
- design-review: { enabled: true, loop_cap: 10 }
212
- plan: { enabled: true, loop_cap: 10 }
213
- plan-review: { enabled: true, loop_cap: 10 }
270
+ discover: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
271
+ clarify: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
272
+ specify: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
273
+ spec-challenge: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
274
+ author: { enabled: false, loop_cap: DEPTH_TABLE[depth].loop_cap }
275
+ design: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
276
+ design-review: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
277
+ plan: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
278
+ plan-review: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
214
279
  # Executor phase workflows
215
- execute: { enabled: true, loop_cap: 10 }
216
- verify: { enabled: true, loop_cap: 5 }
280
+ execute: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
281
+ verify: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
217
282
  # Final Review phase workflows
218
- review: { enabled: true, loop_cap: 10 }
219
- learn: { enabled: true, loop_cap: 5 }
220
- prepare_next: { enabled: true, loop_cap: 5 }
221
- run_audit: { enabled: false, loop_cap: 10 }
283
+ review: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
284
+ learn: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
285
+ prepare_next: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
286
+ run_audit: { enabled: false, loop_cap: DEPTH_TABLE[depth].loop_cap }
222
287
 
223
288
  research_topics: []
224
289
 
@@ -247,127 +312,425 @@ After building run config:
247
312
  > **Running: standard depth, feature, sequential. Proceeding...**
248
313
 
249
314
  - **Low confidence** — show plan and ask:
250
- > **Does this look right?**
251
- > 1. **Yes, proceed** (Recommended)
252
- > 2. **No, let me adjust**
315
+
316
+ Ask the user via AskUserQuestion:
317
+ - **Question:** "Does this run configuration look right?"
318
+ - **Options:**
319
+ 1. "Yes, proceed" *(Recommended)*
320
+ 2. "No, let me adjust"
321
+
322
+ Wait for the user's selection before continuing.
253
323
 
254
324
  ```bash
255
325
  wazir capture event --run <run-id> --event phase_exit --phase init --status completed
256
326
  ```
257
327
 
328
+ Run the phase report and display it to the user:
329
+ ```bash
330
+ wazir report phase --run <run-id> --phase init
331
+ ```
332
+
333
+ Output the report content to the user in the conversation.
334
+
258
335
  ---
259
336
 
260
- # Phase 2: Clarifier
337
+ # Interaction Modes
261
338
 
262
- ```bash
263
- wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
264
- ```
339
+ The `interaction_mode` field in run-config controls how the pipeline interacts with the user:
265
340
 
266
- Invoke the `wz:clarifier` skill. It handles all sub-workflows internally:
341
+ | Mode | Inline modifier | Behavior | Best for |
342
+ |------|----------------|----------|----------|
343
+ | **`guided`** | (default) | Pipeline runs, pauses at phase checkpoints for user approval. Current default behavior. | Most work |
344
+ | **`auto`** | `/wazir auto ...` | No human checkpoints. Codex reviews all. Gating agent decides continue/loop_back/escalate. Stops ONLY on escalate. | Overnight, clear spec, well-understood domain |
345
+ | **`interactive`** | `/wazir interactive ...` | More questions, more discussion, co-designs with user. Researcher presents options. Executor checks approach before coding. | Ambiguous requirements, new domain, learning |
267
346
 
268
- 1. **Source Capture** — fetch URLs from input
269
- 2. **Research** (discover workflow) — codebase + external research
270
- 3. **Clarify** (clarify workflow) — scope, constraints, assumptions
271
- 4. **Spec Harden** (specify + spec-challenge workflows) — measurable spec
272
- 5. **Brainstorm** (design + design-review workflows) — design approaches
273
- 6. **Plan** (plan + plan-review workflows) — execution plan
347
+ ## `auto` mode constraints
274
348
 
275
- Each sub-workflow has its own review loop. User checkpoints between major steps.
349
+ - **Codex REQUIRED:** refuse to start auto mode if `multi_tool.codex` is not configured in `.wazir/state/config.json`. Error: "Auto mode requires an external reviewer (Codex). Configure it first or use guided mode."
350
+ - **On escalate:** STOP immediately, write the escalation reason to `.wazir/runs/<id>/escalations/`, and wait for user input
351
+ - **Wall-clock limit:** default 4 hours. If exceeded, stop with escalation.
352
+ - **Never auto-commits to main** — always work on feature branch
353
+ - All checkpoints (AskUserQuestion) are skipped — gating agent evaluates phase reports and decides
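The auto-mode constraints above could be enforced by a preflight check before the first phase starts. The sketch below assumes the config object mirrors the `multi_tool.codex` key named in the text; the function names and return shape are hypothetical, and only `main` is checked because that is the only branch this list names.

```javascript
// Hypothetical preflight for auto mode; config shape assumed from `multi_tool.codex`.
const DEFAULT_WALL_CLOCK_MS = 4 * 60 * 60 * 1000; // default limit: 4 hours

function checkAutoModePreconditions(config, branch) {
  if (!config || !config.multi_tool || !config.multi_tool.codex) {
    return {
      ok: false,
      reason: 'Auto mode requires an external reviewer (Codex). Configure it first or use guided mode.',
    };
  }
  if (branch === 'main') {
    return { ok: false, reason: 'Auto mode never commits to main; switch to a feature branch.' };
  }
  return { ok: true, reason: null };
}

// True once the run has exceeded its wall-clock budget and should escalate.
function wallClockExceeded(startedAtMs, nowMs, limitMs = DEFAULT_WALL_CLOCK_MS) {
  return nowMs - startedAtMs > limitMs;
}
```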
276
354
 
277
- ### Scope Invariant
355
+ ## `guided` mode (default)
356
+
357
+ Current behavior — no changes needed. Checkpoints at phase boundaries, user approves before advancing.
358
+
359
+ ## `interactive` mode
360
+
361
+ - **Clarifier:** asks more detailed questions, presents research findings with options: "I found 3 approaches — which interests you?"
362
+ - **Executor:** checks approach before coding: "I'm about to implement auth with Supabase — sound right?"
363
+ - **Reviewer:** discusses findings with user, not just presents verdict: "I found a potential auth bypass — here's why I think it's high severity, do you agree?"
364
+ - Slower but highest quality for complex/ambiguous work
365
+
366
+ ## Mode checking in phase skills
367
+
368
+ All phase skills check `interaction_mode` from run-config at every checkpoint:
369
+
370
+ ```
371
+ # Read from run-config
372
+ interaction_mode = run_config.interaction_mode ?? 'guided'
373
+
374
+ # At each checkpoint:
375
+ if interaction_mode == 'auto':
376
+ # Skip checkpoint, let gating agent decide
377
+ elif interaction_mode == 'interactive':
378
+ # More detailed question, present options, discuss
379
+ else:
380
+ # guided — standard checkpoint with AskUserQuestion
381
+ ```
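The mode check above is pseudocode. One way to express the same branching in JavaScript is shown below, as a sketch only: `planCheckpoint`, the `handler` values, and the `discuss` flag are hypothetical names, and an unrecognized mode is assumed to fall back to guided behavior.

```javascript
// Hypothetical sketch: decide who handles a checkpoint based on interaction_mode.
function planCheckpoint(runConfig, checkpoint) {
  const mode = runConfig.interaction_mode ?? 'guided';
  if (mode === 'auto') {
    // No human checkpoint: the gating agent evaluates the phase report instead.
    return { handler: 'gating-agent', prompt: checkpoint.question, options: [], discuss: false };
  }
  if (mode === 'interactive') {
    // Same question, but with options presented for discussion.
    return { handler: 'user', prompt: checkpoint.question, options: checkpoint.options, discuss: true };
  }
  // guided (default): standard checkpoint via AskUserQuestion.
  return { handler: 'user', prompt: checkpoint.question, options: checkpoint.options, discuss: false };
}
```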
382
+
383
+ ---
384
+
385
+ # Two-Level Phase Model
278
386
 
279
- **Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input. It can suggest prioritization, but the decision belongs to the user.
387
+ The pipeline has 4 top-level **phases**, each containing multiple **workflows** with review loops:
280
388
 
281
- Output: approved spec + design + execution plan in `.wazir/runs/latest/clarified/`.
389
+ ```
390
+ Phase 1: Init
391
+ └── (inline — controller handles directly)
392
+
393
+ Phase 2: Clarifier → dispatched as SUBAGENT
394
+ ├── discover (research) ← research-review loop
395
+ ├── clarify ← clarification-review loop
396
+ ├── specify ← spec-challenge loop
397
+ ├── author (adaptive) ← approval gate
398
+ ├── design ← design-review loop
399
+ └── plan ← plan-review loop
400
+
401
+ Phase 3: Executor → dispatched as SUBAGENT
402
+ ├── execute (per-task) ← task-review loop per task
403
+ └── verify
404
+
405
+ Phase 4: Final Review → dispatched as SUBAGENT
406
+ ├── review (final) ← scored review
407
+ ├── learn
408
+ └── prepare_next
409
+ ```
282
410
 
411
+ **Event capture uses both levels.** When emitting phase events, include `--parent-phase`:
283
412
  ```bash
284
- wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
413
+ wazir capture event --run <id> --event phase_enter --phase discover --parent-phase clarifier --status in_progress
285
414
  ```
286
415
 
416
+ **Progress markers between workflows:** After each workflow completes, output:
417
+ > Phase 2: Clarifier > Workflow: specify (3 of 6 workflows complete)
418
+
419
+ **`wazir status` shows both levels:** "Phase 2: Clarifier > Workflow: specify"
420
+
287
421
  ---
288
422
 
289
- # Phase 3: Executor
423
+ # Subagent Controller Architecture
290
424
 
291
- ## Phase Gate (Hard Gate)
425
+ **This is the core enforcement mechanism.** The controller (this skill, wz:wazir) dispatches ONE fresh Agent per phase. Each subagent gets a clean 200K context with only its skill instructions and artifact paths — never the full pipeline context.
292
426
 
293
- Before entering the Executor phase, verify ALL clarifier artifacts exist:
427
+ ## Why Subagents
294
428
 
295
- - [ ] `.wazir/runs/latest/clarified/clarification.md`
296
- - [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
297
- - [ ] `.wazir/runs/latest/clarified/design.md`
298
- - [ ] `.wazir/runs/latest/clarified/execution-plan.md`
429
+ A single-context pipeline allows the agent to rationalize skipping phases ("the input is clear enough"). Subagent isolation prevents this:
430
+ - Each subagent ONLY sees its own phase instructions
431
+ - No subagent can see or skip another phase
432
+ - The controller validates artifacts BETWEEN phases
433
+ - Hooks provide a second enforcement layer independent of prompt compliance
299
434
 
300
- If ANY file is missing, **STOP**:
435
+ ## Controller Loop
301
436
 
302
- > **Cannot enter Executor phase: missing prerequisite artifacts from Clarifier.**
303
- >
304
- > Missing: [list missing files]
437
+ ```
438
+ initialize pipeline-state.json via createPipelineState(runId, stateRoot)
439
+ transitionPhase(stateRoot, 'clarify')
440
+
441
+ for each phase in [clarify, execute, review]:
442
+ 1. Update pipeline-state.json: current_phase = phase
443
+ 2. Run pre-phase guardrail (validate previous phase artifacts)
444
+ 3. Build subagent prompt (see Subagent Prompt Template below)
445
+ 4. Dispatch: Agent(prompt=..., description="wazir: <phase>", mode="bypassPermissions")
446
+ 5. On completion: validate output artifacts via runGuardrail(phase, state, runDir)
447
+ 6. If guardrail passes:
448
+ a. completePhase(stateRoot, phase, artifacts)
449
+ b. Continue to next phase
450
+ 7. If guardrail fails: execute Retry Ladder
451
+ 8. Capture events:
452
+ wazir capture event --run <id> --event phase_exit --phase <phase> --status completed
453
+
454
+ transitionPhase(stateRoot, 'complete')
455
+ ```
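As a rough illustration of the loop above, the sketch below injects every dependency so that nothing about the package's real signatures is assumed. `runController`, the callback names, and the `failed` status are hypothetical; the pre-phase guardrail and state-file updates are omitted for brevity.

```javascript
// Hypothetical sketch of the controller loop; all dependencies are injected stand-ins.
const PHASES = ['clarify', 'execute', 'review'];

function runController({ dispatch, runGuardrail, onGuardrailFailure, recordEvent }) {
  const completed = [];
  for (const phase of PHASES) {
    recordEvent({ event: 'phase_enter', phase, status: 'in_progress' });
    const output = dispatch(phase);                 // one fresh subagent per phase
    let verdict = runGuardrail(phase, output);      // expected shape: { passed, reason }
    if (!verdict.passed) {
      verdict = onGuardrailFailure(phase, verdict); // the retry ladder would live here
    }
    if (!verdict.passed) {
      // Assumed behavior: stop the run rather than advance past a failed guardrail.
      recordEvent({ event: 'phase_exit', phase, status: 'failed' });
      return { status: 'stopped', failedPhase: phase, completed };
    }
    completed.push(phase);
    recordEvent({ event: 'phase_exit', phase, status: 'completed' });
  }
  return { status: 'complete', failedPhase: null, completed };
}
```

The controller itself never does phase work in this sketch: it only dispatches, validates, and records, which is the separation the skill text insists on.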
456
+
457
+ **CRITICAL: No phase runs inline in the controller.** The controller ONLY:
458
+ - Manages state transitions
459
+ - Dispatches subagents
460
+ - Validates guardrails
461
+ - Handles retry/escalation
462
+ - Presents results to the user
463
+
464
+ ## Subagent Prompt Template
465
+
466
+ Each subagent receives this prompt structure:
467
+
468
+ ```
469
+ You are running the {PHASE} phase of the Wazir pipeline.
470
+
471
+ Run ID: {run_id}
472
+ Run directory: {run_dir}
473
+ State root: {state_root}
474
+ Depth: {depth}
475
+ Interaction mode: {interaction_mode}
476
+
477
+ ## Your Instructions
478
+ {Read and paste the full content of skills/{phase_skill}/SKILL.md here}
479
+
480
+ ## Input Artifacts (read from disk)
481
+ {List of file paths the subagent should read as input}
482
+
483
+ ## Output Artifacts (write to disk)
484
+ {List of file paths the subagent must produce}
485
+
486
+ ## Rules
487
+ - Read your input artifacts from the paths above
488
+ - Write your output artifacts to the paths above
489
+ - Do NOT skip any step in your instructions
490
+ - Use wazir index for codebase exploration
491
+ - Use context-mode for large command outputs
492
+ - When done, state which artifacts you produced
493
+ ```
494
+
495
+ The controller reads the phase skill from disk and includes it in the prompt. This ensures each subagent has the latest skill version.
496
+
497
+ ## Subagent Dispatch Rules
498
+
499
+ 1. **No nesting** — all subagents dispatched at depth=1 from the controller
500
+ 2. **No context sharing** — subagents communicate only via artifacts on disk
501
+ 3. **No pipeline state awareness** — subagents don't read pipeline-state.json
502
+ 4. **Controller reads skills** — Read `skills/{name}/SKILL.md` before dispatch, paste into prompt
503
+ 5. **Verify phase handled by executor** — the executor subagent handles both execute + verify workflows
504
+
505
+ ## Retry Ladder
506
+
507
+ If a guardrail fails after a phase subagent completes:
508
+
509
+ ```
510
+ retry_count = 0
511
+ while guardrail fails:
512
+     retry_count++
513
+     if retry_count <= 2:
514
+         # Re-dispatch same phase with failure feedback
515
+         prompt += "\n\nPREVIOUS ATTEMPT FAILED GUARDRAIL:\n{guardrail.reason}\nMissing: {guardrail.missing}\nFix these issues."
516
+         Dispatch Agent again
517
+     elif retry_count == 3:
518
+         # Escalate model (use Opus if not already)
519
+         prompt += "\n\nESCALATED: Previous attempts failed. Produce ALL required artifacts."
520
+         Dispatch Agent with model="opus"
521
+     else:
522
+         # Escalate to human
523
+         Ask user: "Phase {phase} failed guardrail after {retry_count} attempts: {reason}"
524
+         Options: 1. Retry manually 2. Skip phase 3. Abort run
525
+         break
526
+ ```
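
The ladder above could be sketched in JavaScript roughly as follows. This is an illustration only: `runWithRetryLadder`, the `dispatch`, `checkGuardrail`, and `askUser` callbacks, and the `{ passed, reason, missing }` guardrail result shape are all assumptions, and the callbacks are treated as synchronous for brevity.

```javascript
// Hypothetical sketch of the Retry Ladder. None of these names are Wazir APIs.
// Assumed guardrail result shape: { passed: boolean, reason: string, missing: string[] }.
function runWithRetryLadder({ phase, basePrompt, dispatch, checkGuardrail, askUser }) {
  let prompt = basePrompt;
  let retryCount = 0;

  dispatch({ prompt }); // initial phase dispatch
  let result = checkGuardrail();

  while (!result.passed) {
    retryCount += 1;
    if (retryCount <= 2) {
      // Retries 1 and 2: same phase, with the failure fed back into the prompt.
      prompt += `\n\nPREVIOUS ATTEMPT FAILED GUARDRAIL:\n${result.reason}` +
        `\nMissing: ${result.missing.join(', ')}\nFix these issues.`;
      dispatch({ prompt });
    } else if (retryCount === 3) {
      // Retry 3: escalate the model.
      prompt += '\n\nESCALATED: Previous attempts failed. Produce ALL required artifacts.';
      dispatch({ prompt, model: 'opus' });
    } else {
      // Ladder exhausted: hand the decision to the user.
      const choice = askUser(
        `Phase ${phase} failed guardrail after ${retryCount} attempts: ${result.reason}`,
        ['Retry manually', 'Skip phase', 'Abort run'],
      );
      return { status: 'escalated', retryCount, choice };
    }
    result = checkGuardrail();
  }
  return { status: 'passed', retryCount };
}
```

With a guardrail that never passes, this sketch dispatches four times in total (the initial attempt, two feedback retries, one escalated retry) before asking the user.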
527
+
528
+ ## Pipeline State Management
529
+
530
+ The controller manages `pipeline-state.json` at `$STATE_ROOT/pipeline-state.json`:
531
+
532
+ ```javascript
533
+ // Before first phase
534
+ createPipelineState(runId, stateRoot)
535
+ transitionPhase(stateRoot, 'clarify')
536
+
537
+ // Between phases
538
+ transitionPhase(stateRoot, 'execute')
539
+
540
+ // After each phase
541
+ completePhase(stateRoot, phase, { artifactName: { path: '...' } })
542
+
543
+ // When done
544
+ transitionPhase(stateRoot, 'complete')
545
+ ```
546
+
547
+ The Stop hook reads this file to block premature completion.
548
+ The PreToolUse hook reads this file to enforce phase-specific tool restrictions.
549
+
550
+ ---
551
+
552
+ # Phase 2: Clarifier (Subagent)
553
+
554
+ **Before dispatching, output to the user:**
555
+
556
+ > **Clarifier Phase** — Dispatching clarifier subagent to research your codebase, clarify requirements, harden the spec, brainstorm designs, and produce an execution plan.
305
557
  >
306
- > The Clarifier phase must complete before execution can begin. Run `/wazir:clarifier` first.
558
+ > **Why this matters:** Without this, I'd guess your tech stack, misunderstand constraints, miss edge cases in the spec, and build the wrong architecture. Every ambiguity left unresolved here becomes a bug or rework cycle later.
307
559
 
308
- **Do NOT skip this check. Do NOT rationalize that the input is "clear enough" to bypass clarification. Every pipeline run must produce these artifacts.**
560
+ ## Pre-Dispatch
309
561
 
310
562
  ```bash
311
- wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
563
+ wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
312
564
  ```
313
565
 
314
- **Pre-execution gate:**
566
+ Update pipeline state:
567
+ ```
568
+ transitionPhase(stateRoot, 'clarify')
569
+ ```
570
+
571
+ ## Dispatch
572
+
573
+ Read `skills/clarifier/SKILL.md` from disk. Build the subagent prompt using the Subagent Prompt Template above.
574
+
575
+ **Input artifacts for clarifier subagent:**
576
+ - `.wazir/input/briefing.md`
577
+ - `.wazir/runs/<id>/sources/` (all captured sources)
578
+ - `.wazir/runs/<id>/run-config.yaml`
579
+ - `input/` directory (project-level input files)
580
+
581
+ **Required output artifacts:**
582
+ - `.wazir/runs/<id>/clarified/clarification.md`
583
+ - `.wazir/runs/<id>/clarified/spec-hardened.md`
584
+ - `.wazir/runs/<id>/clarified/design.md`
585
+ - `.wazir/runs/<id>/clarified/execution-plan.md`
586
+
587
+ Dispatch: `Agent(prompt=..., description="wazir: clarifier")`
588
+
589
+ ## Post-Dispatch
590
+
591
+ Run guardrail: `validateClarifyComplete(state, runDir)`
592
+
593
+ If guardrail passes:
594
+ ```bash
595
+ completePhase(stateRoot, 'clarify', { clarification: {...}, spec: {...}, design: {...}, plan: {...} })
596
+ wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
597
+ wazir report phase --run <run-id> --phase clarifier
598
+ ```
599
+
600
+ If guardrail fails: execute Retry Ladder.
601
+
602
+ ### Scope Invariant
603
+
604
+ **Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input.
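
As an illustration of the invariant, a guardrail could compare item counts like this. The function name and its parameters are hypothetical; how Wazir actually counts input and plan items is not shown in the source.

```javascript
// Hypothetical sketch: enforce items_in_plan >= items_in_input unless the
// user has explicitly approved a scope reduction.
function checkScopeInvariant({ itemsInInput, itemsInPlan, userApprovedReduction = false }) {
  if (itemsInPlan >= itemsInInput || userApprovedReduction) {
    return { ok: true };
  }
  return {
    ok: false,
    reason: `Plan covers ${itemsInPlan} of ${itemsInInput} input items ` +
      'without user-approved scope reduction.',
  };
}
```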
605
+
606
+ **After clarifier subagent completes, output to the user:**
607
+
608
+ > **Clarifier Phase complete.**
609
+ >
610
+ > **Found:** [N] ambiguities resolved, [N] assumptions made explicit, [N] scope boundaries drawn, [N] acceptance criteria hardened
611
+ >
612
+ > **Without this phase:** Requirements would be interpreted differently across tasks, acceptance criteria would be vague and untestable, the design would be ad-hoc, and the plan would miss dependency ordering
613
+
614
+ ---
615
+
616
+ # Phase 3: Executor (Subagent)
617
+
618
+ **Before dispatching, output to the user:**
619
+
620
+ > **Executor Phase** — Dispatching executor subagent to implement [N] tasks with TDD, per-task review, and verification.
621
+ >
622
+ > **Why this matters:** Without this discipline, tests get skipped, edge cases get missed, integration points break silently, and review catches problems too late.
623
+
624
+ ## Pre-Dispatch Guardrail (Hard Gate)
625
+
626
+ Run `validateClarifyComplete(state, runDir)` to verify ALL clarifier artifacts exist. If ANY file is missing, **STOP** — do not dispatch the executor subagent.
315
627
 
316
628
  ```bash
317
629
  wazir validate manifest && wazir validate hooks
318
630
  # Hard gate — stop if either fails.
319
631
  ```
320
632
 
321
- Invoke the `wz:executor` skill. It handles:
633
+ Update pipeline state:
634
+ ```
635
+ transitionPhase(stateRoot, 'execute')
636
+ wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
637
+ ```
322
638
 
323
- 1. **Execute** (execute workflow) — per-task TDD cycle with review before each commit
324
- 2. **Verify** (verify workflow) — deterministic verification of all claims
639
+ ## Dispatch
325
640
 
326
- Per-task review: `--mode task-review`, 5 task-execution dimensions.
327
- Tasks always run sequentially.
641
+ Read `skills/executor/SKILL.md` from disk. Build the subagent prompt.
328
642
 
329
- Output: code changes + verification proof in `.wazir/runs/latest/artifacts/`.
643
+ **Input artifacts for executor subagent:**
644
+ - `.wazir/runs/<id>/clarified/clarification.md`
645
+ - `.wazir/runs/<id>/clarified/spec-hardened.md`
646
+ - `.wazir/runs/<id>/clarified/design.md`
647
+ - `.wazir/runs/<id>/clarified/execution-plan.md`
648
+ - `.wazir/runs/<id>/run-config.yaml`
649
+ - `.wazir/state/config.json`
330
650
 
651
+ **Required output artifacts:**
652
+ - `.wazir/runs/<id>/artifacts/task-NNN/` (at least one)
653
+ - `.wazir/runs/<id>/artifacts/verification-proof.md`
654
+
655
+ Dispatch: `Agent(prompt=..., description="wazir: executor")`
656
+
657
+ The executor subagent handles BOTH the execute and verify workflows internally.
658
+
659
+ ## Post-Dispatch
660
+
661
+ Run guardrail: `validateExecuteComplete(state, runDir)`
662
+
663
+ If guardrail passes:
331
664
  ```bash
665
+ completePhase(stateRoot, 'execute', { verification_proof: { path: '...' } })
666
+ transitionPhase(stateRoot, 'verify')
667
+ completePhase(stateRoot, 'verify', { verification_proof: { path: '...' } })
332
668
  wazir capture event --run <run-id> --event phase_exit --phase executor --status completed
669
+ wazir report phase --run <run-id> --phase executor
333
670
  ```
334
671
 
335
- ---
672
+ If guardrail fails: execute Retry Ladder.
336
673
 
337
- # Phase 4: Final Review
674
+ **After executor subagent completes, output to the user:**
338
675
 
339
- ## Phase Gate (Hard Gate)
676
+ > **Executor Phase complete.**
677
+ >
678
+ > **Found:** [N]/[N] tasks implemented, [N] tests written, [N] per-task review passes completed
679
+ >
680
+ > **Without this phase:** Code would ship without tests, review findings would accumulate until final review (10x more expensive to fix), and verification claims would be unsubstantiated
340
681
 
341
- Before entering the Final Review phase, verify the Executor produced its proof:
682
+ ---
342
683
 
343
- - [ ] `.wazir/runs/latest/artifacts/verification-proof.md`
684
+ # Phase 4: Final Review (Subagent)
344
685
 
345
- If missing, **STOP**:
686
+ **Before dispatching, output to the user:**
346
687
 
347
- > **Cannot enter Final Review: missing verification proof from Executor.**
688
+ > **Final Review Phase:** Dispatching reviewer subagent for an adversarial 7-dimension review comparing the implementation against your original input.
348
689
  >
349
- > The Executor phase must complete and produce `verification-proof.md` before final review. Run `/wazir:executor` first.
690
+ > **Why this matters:** Without this, implementation drift ships undetected, missing acceptance criteria go unnoticed, and the same mistakes repeat.
350
691
 
351
- ```bash
692
+ ## Pre-Dispatch Guardrail (Hard Gate)
693
+
694
+ Run `validateVerifyComplete(state, runDir)` to verify that the verification proof exists. If it is missing, **STOP**.
695
+
696
+ Update pipeline state:
697
+ ```
698
+ transitionPhase(stateRoot, 'review')
352
699
  wazir capture event --run <run-id> --event phase_enter --phase final_review --status in_progress
353
700
  ```
354
701
 
355
- This phase validates the implementation against the **ORIGINAL INPUT** (not the task specs — the executor's per-task reviewer already covered that).
702
+ ## Dispatch
356
703
 
357
- ### 4a: Review (reviewer role in final mode)
704
+ Read `skills/reviewer/SKILL.md` from disk. Build the subagent prompt.
358
705
 
359
- Invoke `wz:reviewer --mode final`.
360
- 7-dimension scored review comparing implementation against the original user input.
361
- Score 0-70. Verdicts: PASS (56+), NEEDS MINOR FIXES (42-55), NEEDS REWORK (28-41), FAIL (0-27).
706
+ **Input artifacts for reviewer subagent:**
707
+ - `.wazir/input/briefing.md` (original input — compare implementation against THIS)
708
+ - `.wazir/runs/<id>/clarified/spec-hardened.md`
709
+ - `.wazir/runs/<id>/artifacts/verification-proof.md`
710
+ - `.wazir/runs/<id>/run-config.yaml`
711
+ - `.wazir/state/config.json`
712
+ - Git diff: `git diff main..HEAD`
362
713
 
363
- ### 4b: Learn (learner role)
714
+ **Required output artifacts:**
715
+ - `.wazir/runs/<id>/reviews/final-review.md`
716
+ - `.wazir/runs/<id>/reviews/verdict.json` (must have numeric `score` field)
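
A rough sketch of checking the numeric `score` field and mapping it to a verdict, using the score bands this document states elsewhere (PASS 56+, NEEDS MINOR FIXES 42-55, NEEDS REWORK 28-41, FAIL 0-27, on a 0-70 scale). The function names are hypothetical and this is not the actual `validateReviewComplete` implementation.

```javascript
// Hypothetical sketch: map a 0-70 review score to a verdict label.
function verdictForScore(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 70) {
    throw new RangeError(`score must be a number from 0 to 70, got ${score}`);
  }
  if (score >= 56) return 'PASS';
  if (score >= 42) return 'NEEDS_MINOR_FIXES';
  if (score >= 28) return 'NEEDS_REWORK';
  return 'FAIL';
}

// Parse the text of a verdict.json file and derive the verdict from its score.
// Throws if the JSON is malformed or the score is missing or not numeric.
function parseVerdict(jsonText) {
  const { score } = JSON.parse(jsonText);
  return { score, verdict: verdictForScore(score) };
}
```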
364
717
 
718
+ <<<<<<< HEAD
719
+ Additional instructions in the subagent prompt:
720
+ ```
721
+ Run in --mode final. Produce a 7-dimension scored review.
722
+ Write verdict.json with { "score": N, "verdict": "PASS|NEEDS_MINOR_FIXES|NEEDS_REWORK|FAIL" }
723
+ Compare implementation against the ORIGINAL INPUT (briefing.md), not just the spec.
724
+ Use Codex for external review if configured in config.json.
725
+ =======
365
726
  Extract durable learnings from the completed run:
366
727
  - Scan all review findings (internal + Codex)
367
728
  - Propose learnings to `memory/learnings/proposed/`
368
729
  - Findings that recur across 2+ runs → auto-proposed as learnings
369
730
  - Learnings require explicit scope tags (roles, stacks, concerns)
370
731
 
732
+ **Learn workflow completion guard:** If `workflow_policy.learn.enabled: true` in run config AND no files exist in `memory/learnings/proposed/` matching the current run ID pattern (`run-<current-id>-*.md`): log a warning finding: 'Learn workflow enabled but no proposed learnings written for this run'. This ensures the learn workflow always produces output when enabled.
733
+
371
734
  ### 4c: Prepare Next (planner role)
372
735
 
373
736
  Prepare context and handoff for the next run:
@@ -375,10 +738,45 @@ Prepare context and handoff for the next run:
375
738
  - Compress/archive unneeded files
376
739
  - Record what's left to do
377
740
 
741
+ **After completing this phase, output to the user:**
742
+
743
+ > **Final Review Phase complete.**
744
+ >
745
+ > **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings, [N] learnings proposed for future runs
746
+ >
747
+ > **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs, and recurring mistakes would never get captured as learnings
748
+ >
749
+ > **Changed because of this work:** [List of findings fixed, score achieved, learnings extracted, handoff prepared]
750
+
751
+ ```bash
752
+ wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
753
+ >>>>>>> d54b700 (feat(learnings): activate learning pipeline feedback loop)
754
+ ```
755
+
756
+ Dispatch: `Agent(prompt=..., description="wazir: reviewer")`
757
+
758
+ ## Post-Dispatch
759
+
760
+ Run guardrail: `validateReviewComplete(state, runDir)`
761
+
762
+ If guardrail passes:
378
763
  ```bash
764
+ completePhase(stateRoot, 'review', { review_verdict: { path: '...' } })
379
765
  wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
766
+ transitionPhase(stateRoot, 'complete')
767
+ wazir report phase --run <run-id> --phase final_review
380
768
  ```
381
769
 
770
+ If guardrail fails: execute Retry Ladder.
771
+
772
+ **After reviewer subagent completes, output to the user:**
773
+
774
+ > **Final Review Phase complete.**
775
+ >
776
+ > **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings
777
+ >
778
+ > **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs
779
+
382
780
  ---
383
781
 
384
782
  ## Step 5: CHANGELOG + Gitflow Validation (Hard Gates)
@@ -399,26 +797,41 @@ After the reviewer completes, present verdict with numbered options:
399
797
  ### If PASS (score 56+):
400
798
 
401
799
  > **Result: PASS (score/70)**
402
- >
403
- > 1. **Create a PR** (Recommended)
404
- > 2. **Merge directly**
405
- > 3. **Review the changes first**
800
+
801
+ Ask the user via AskUserQuestion:
802
+ - **Question:** "Pipeline passed. What would you like to do next?"
803
+ - **Options:**
804
+ 1. "Create a PR" *(Recommended)*
805
+ 2. "Merge directly"
806
+ 3. "Review the changes first"
807
+
808
+ Wait for the user's selection before continuing.
406
809
 
407
810
  ### If NEEDS MINOR FIXES (score 42-55):
408
811
 
409
812
  > **Result: NEEDS MINOR FIXES (score/70)**
410
- >
411
- > 1. **Auto-fix and re-review** (Recommended)
412
- > 2. **Fix manually**
413
- > 3. **Accept as-is**
813
+
814
+ Ask the user via AskUserQuestion:
815
+ - **Question:** "Minor issues found. How should we handle them?"
816
+ - **Options:**
817
+ 1. "Auto-fix and re-review" *(Recommended)*
818
+ 2. "Fix manually"
819
+ 3. "Accept as-is"
820
+
821
+ Wait for the user's selection before continuing.
414
822
 
415
823
  ### If NEEDS REWORK (score 28-41):
416
824
 
417
825
  > **Result: NEEDS REWORK (score/70)**
418
- >
419
- > 1. **Re-run affected tasks** (Recommended)
420
- > 2. **Review findings in detail**
421
- > 3. **Abandon this run**
826
+
827
+ Ask the user via AskUserQuestion:
828
+ - **Question:** "Significant issues found. How should we proceed?"
829
+ - **Options:**
830
+ 1. "Re-run affected tasks" *(Recommended)*
831
+ 2. "Review findings in detail"
832
+ 3. "Abandon this run"
833
+
834
+ Wait for the user's selection before continuing.
422
835
 
423
836
  ### If FAIL (score 0-27):
424
837
 
@@ -438,10 +851,15 @@ wazir status --run <run-id> --json
438
851
  If any phase fails:
439
852
 
440
853
  > **Phase [name] failed: [reason]**
441
- >
442
- > 1. **Retry this phase** (Recommended)
443
- > 2. **Skip and continue** (only if workflows within phase are adaptive)
444
- > 3. **Abort the run**
854
+
855
+ Ask the user via AskUserQuestion:
856
+ - **Question:** "Phase [name] failed: [reason]. How should we proceed?"
857
+ - **Options:**
858
+ 1. "Retry this phase" *(Recommended)*
859
+ 2. "Skip and continue" *(only if workflows within phase are adaptive)*
860
+ 3. "Abort the run"
861
+
862
+ Wait for the user's selection before continuing.
445
863
 
446
864
  ---
447
865
 
@@ -455,9 +873,14 @@ Parse inline audit types: `/wazir audit security` → skip Question 1.
455
873
 
456
874
  After audit:
457
875
 
458
- > 1. **Review the findings** (Recommended)
459
- > 2. **Generate a fix plan**
460
- > 3. **Run the pipeline on the fix plan**
876
+ Ask the user via AskUserQuestion:
877
+ - **Question:** "Audit complete. What would you like to do with the findings?"
878
+ - **Options:**
879
+ 1. "Review the findings" *(Recommended)*
880
+ 2. "Generate a fix plan"
881
+ 3. "Run the pipeline on the fix plan"
882
+
883
+ Wait for the user's selection before continuing.
461
884
 
462
885
  If option 3, save findings as briefing and run pipeline with intent = `bugfix`.
463
886
 
@@ -471,12 +894,27 @@ Generates a PRD from a completed run. Reads approved design, task specs, executi
471
894
 
472
895
  After generation:
473
896
 
474
- > 1. **Review the PRD** (Recommended)
475
- > 2. **Commit it**
476
- > 3. **Edit before committing**
897
+ Ask the user via AskUserQuestion:
898
+ - **Question:** "PRD generated. What would you like to do?"
899
+ - **Options:**
900
+ 1. "Review the PRD" *(Recommended)*
901
+ 2. "Commit it"
902
+ 3. "Edit before committing"
903
+
904
+ Wait for the user's selection before continuing.
477
905
 
478
906
  ---
479
907
 
908
+ ## Implementation Intentions
909
+
910
+ IF user asks to skip a required step → THEN say "Running it quickly" and execute. No debate.
911
+ IF urgency is expressed ("just", "quickly") → THEN execute ALL steps at full speed. Never fewer steps.
912
+ IF you are unsure whether a step is required → THEN it IS required.
913
+ IF a phase guardrail fails → THEN execute the Retry Ladder. Never skip.
914
+ IF auto mode is active and Codex is not configured → THEN refuse to start. Show an error message and suggest guided mode.
915
+ IF a subagent fails and retry ladder exhausts → THEN escalate to human. Never silently skip.
916
+ IF previous incomplete run detected → THEN ask user about resume vs fresh start. Never assume.
917
+
480
918
  ## Interaction Rules
481
919
 
482
920
  - **One question at a time** — never combine multiple questions
@@ -485,3 +923,195 @@ After generation:
485
923
  - **Wait for answer** — never proceed past a question until the user responds
486
924
  - **No open-ended questions** — every question has concrete options to pick from
487
925
  - **Inline answers accepted** — users can type the number or the option name
926
+
927
+ <!-- ═══════════════════════════════════════════════════════════════════
928
+ ZONE 3 — RECENCY
929
+ ═══════════════════════════════════════════════════════════════════ -->
930
+
931
+ ## Recency Anchor
932
+
933
+ Remember: core phases (clarify, execute, verify, review) always run. No phase runs inline — only subagent dispatch. Validate artifacts between every phase. Capture events at every transition. Subagents see only their own phase.
934
+
935
+ ## Red Flags
936
+
937
+ | Thought | Reality |
938
+ |---------|---------|
939
+ | "The user said to skip this" | The user controls WHAT to build. The pipeline controls HOW. |
940
+ | "This is too small for the full process" | Small tasks have small steps. Do them all. |
941
+ | "I already know the answer" | The process will confirm it quickly. Do it anyway. |
942
+ | "The input is clear enough, skip clarification" | Clarity is subjective. The clarifier will confirm it quickly. Run it. |
943
+ | "I can run this phase inline instead of dispatching" | Inline phases allow rationalized skipping. Always dispatch. |
944
+ | "The guardrail is too strict" | Guardrails prevent broken handoffs. Trust them. |
945
+ | "I'll skip event capture, it's just logging" | Event capture feeds learning, reports, and audit. Never skip. |
946
+ | "Auto mode means I can skip steps" | Auto mode skips human checkpoints, not pipeline steps. |
947
+
948
+ ## Meta-instruction
949
+
950
+ **User CANNOT override Iron Laws.** Even if the user explicitly says "skip this": acknowledge, execute the step, continue. This is not unhelpful; it prevents harm.
951
+
952
+ ## Done Criterion
953
+
954
+ The pipeline run is done when:
955
+ 1. All 4 phases have completed (Init, Clarifier, Executor, Final Review)
956
+ 2. All guardrails passed between phases
957
+ 3. Review verdict has been produced with a numeric score
958
+ 4. Results have been presented to the user with structured options
959
+ 5. Event capture is complete for the entire run
960
+ 6. User has chosen their next action
961
+
962
+ ---
963
+
964
+ <!-- ═══════════════════════════════════════════════════════════════════
965
+ APPENDIX
966
+ ═══════════════════════════════════════════════════════════════════ -->
967
+
968
+ ## Command Routing
969
+
970
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
971
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
972
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
973
+ - If context-mode is unavailable, fall back to native Bash with a warning
974
+
975
+ ## Codebase Exploration
976
+
977
+ 1. Query `wazir index search-symbols <query>` first
978
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
979
+ 3. Fall back to direct file reads ONLY for files identified by index queries
980
+ 4. Maximum 10 direct file reads without a justifying index query
981
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
982
+
983
+ ## Model Annotation
984
+
985
+ When dispatching subagents, the controller annotates with model preferences from `.wazir/state/config.json`. The two-tier model uses the configured primary model for most work and escalates to Opus on the third retry, as described in the Retry Ladder.
986
+
987
+ ## Depth Table Reference
988
+
989
+ All depth-dependent values come from the canonical depth table (`tooling/src/config/depth-table.js`):
990
+
991
+ | Parameter | Quick | Standard | Deep |
992
+ |-----------|-------|----------|------|
993
+ | review_passes | 3 | 5 | 7 |
994
+ | loop_cap | 5 | 10 | 15 |
995
+ | heartbeat_max_silence_s | 180 | 120 | 90 |
996
+ | research_intensity | minimal | balanced | thorough |
997
+ | challenge_intensity | surface | balanced | adversarial |
998
+ | spec_hardening_passes | 1 | 3 | 5 |
999
+ | design_review_passes | 1 | 3 | 5 |
1000
+ | time_estimate_label | ~15-30 min | ~45-90 min | ~2-3 hrs |
1001
+
1002
+ When any skill or workflow needs a depth-dependent value, look it up from this table. Never hardcode depth values.
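
A lookup along these lines would avoid hardcoded values. The values are copied from the table above, but the object shape and the `depthValue` helper are assumptions; the actual exports of `tooling/src/config/depth-table.js` are not shown in this diff.

```javascript
// Hypothetical sketch: depth table as a frozen object plus a checked lookup.
const DEPTH_TABLE = Object.freeze({
  quick: {
    review_passes: 3, loop_cap: 5, heartbeat_max_silence_s: 180,
    research_intensity: 'minimal', challenge_intensity: 'surface',
    spec_hardening_passes: 1, design_review_passes: 1,
    time_estimate_label: '~15-30 min',
  },
  standard: {
    review_passes: 5, loop_cap: 10, heartbeat_max_silence_s: 120,
    research_intensity: 'balanced', challenge_intensity: 'balanced',
    spec_hardening_passes: 3, design_review_passes: 3,
    time_estimate_label: '~45-90 min',
  },
  deep: {
    review_passes: 7, loop_cap: 15, heartbeat_max_silence_s: 90,
    research_intensity: 'thorough', challenge_intensity: 'adversarial',
    spec_hardening_passes: 5, design_review_passes: 5,
    time_estimate_label: '~2-3 hrs',
  },
});

// Return one depth-dependent value, failing loudly on unknown keys so that
// a typo never silently falls back to a default.
function depthValue(depth, parameter) {
  const row = DEPTH_TABLE[depth];
  if (!row) throw new Error(`Unknown depth: ${depth}`);
  if (!Object.prototype.hasOwnProperty.call(row, parameter)) {
    throw new Error(`Unknown depth parameter: ${parameter}`);
  }
  return row[parameter];
}
```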
1003
+
1004
+ ## Progressive Disclosure Progress Reporting
1005
+
1006
+ Apply these 5 patterns throughout the pipeline:
1007
+
1008
+ ### Pattern 1: Phase Map
1009
+ At every phase transition, display the enabled phases with a position indicator:
1010
+
1011
+ ```
1012
+ [CLARIFY] → SPECIFY → DESIGN → PLAN → EXECUTE → VERIFY → REVIEW
1013
+ ```
1014
+
1015
+ Skipped phases are omitted from the map. The current phase is wrapped in brackets.
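
One possible way to render the phase map, assuming the phase order shown in the example above. `PHASE_ORDER` and `renderPhaseMap` are illustrative names, not Wazir functions.

```javascript
// Hypothetical sketch: render the phase map with the current phase in brackets.
const PHASE_ORDER = ['clarify', 'specify', 'design', 'plan', 'execute', 'verify', 'review'];

function renderPhaseMap(enabledPhases, currentPhase) {
  return PHASE_ORDER
    .filter((phase) => enabledPhases.includes(phase)) // skipped phases are omitted
    .map((phase) => (phase === currentPhase ? `[${phase.toUpperCase()}]` : phase.toUpperCase()))
    .join(' → ');
}
```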
1016
+
1017
+ ### Pattern 2: Meaningful Updates
1018
+ Follow this formula: **"Name the action. State the dependency. Omit the journey."**
1019
+
1020
+ Good: `"Running spec-challenge pass 3/5 on spec-hardened.md..."`
1021
+ Bad: `"Now I'm going to start the process of challenging the spec to make sure it's robust..."`
1022
+
1023
+ ### Pattern 3: Artifact Previews
1024
+ After producing any artifact, show the first 3-5 meaningful lines:
1025
+
1026
+ ```
1027
+ > clarification.md (preview):
1028
+ > ## Scope: 5 features from deep research
1029
+ > - Interactive checkpoints via AskUserQuestion
1030
+ > - Progressive disclosure progress reporting
1031
+ > ...
1032
+ ```
1033
+
1034
+ ### Pattern 4: Time Estimates
1035
+ At phase entry, show the rough duration from the depth table:
1036
+
1037
+ ```
1038
+ "Entering EXECUTE phase (estimated ~45-90 min at standard depth)..."
1039
+ ```
1040
+
1041
+ ### Pattern 5: Heartbeat
1042
+ Never exceed the silence threshold for the current depth level:
1043
+ - **Quick:** max 3 minutes between outputs
1044
+ - **Standard:** max 2 minutes between outputs
1045
+ - **Deep:** max 90 seconds between outputs
1046
+
1047
+ If a long operation is running, emit a heartbeat: `"Still running tests (47 passed, 2 remaining)..."`
1048
+
1049
+ ## Steerability: Mutation Classification and Selective Regeneration
1050
+
1051
+ When the user requests changes to an already-produced artifact:
1052
+
1053
+ ### Step 1: Classify the Mutation Level
1054
+
1055
+ | Level | Name | Trigger | Action |
1056
+ |-------|------|---------|--------|
1057
+ | **L0** | Cosmetic | Typo, formatting, wording only | Apply fix. No regeneration. |
1058
+ | **L1** | Local | Change to a leaf artifact with no downstream dependents | Regenerate only this artifact. |
1059
+ | **L2** | Structural | Change to a mid-graph artifact (e.g., design.md) | Regenerate this artifact and all downstream dependents. |
1060
+ | **L3** | Fundamental | Change to scope, intent, or root artifact (clarification.md) | Restart from the clarification phase onward. |
1061
+
1062
+ ### Step 2: Show Impact Preview
1063
+
1064
+ Before regenerating, tell the user what will be affected:
1065
+
1066
+ ```
1067
+ "This change to design.md is L2 (structural). It will regenerate:
1068
+ - execution-plan.md (depends on design.md)
1069
+ Preserved (unaffected): clarification.md, spec-hardened.md"
1070
+ ```
1071
+
1072
+ Use AskUserQuestion:
1073
+ 1. **Proceed with regeneration** (Recommended) — regenerate affected artifacts
1074
+ 2. **Apply change only** — update this artifact without regenerating downstream
1075
+ 3. **Cancel** — discard the change
1076
+
1077
+ ### Step 3: Selective Regeneration
1078
+
1079
+ Walk the artifact dependency graph (from `pipeline-state.js`) starting from the changed artifact. Regenerate only downstream artifacts. Preserve all completed artifacts that are not downstream.
1080
+
1081
+ ### Artifact Dependency Graph
1082
+
1083
+ ```
1084
+ clarification.md → spec-hardened.md → design.md → execution-plan.md
1085
+ ```
1086
+
1087
+ Each arrow means "is required by." Change an upstream artifact and everything downstream may need regeneration.
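
The graph walk can be sketched as a breadth-first traversal over an adjacency map. The `REQUIRED_BY` map mirrors the chain above; its shape and the `downstreamOf` helper are assumptions, since the real graph lives in `pipeline-state.js` and is not shown here.

```javascript
// Hypothetical sketch: each key lists the artifacts that directly depend on it.
const REQUIRED_BY = {
  'clarification.md': ['spec-hardened.md'],
  'spec-hardened.md': ['design.md'],
  'design.md': ['execution-plan.md'],
  'execution-plan.md': [],
};

// Return every artifact downstream of `changed`, in breadth-first order.
// The changed artifact itself is not included.
function downstreamOf(changed, graph = REQUIRED_BY) {
  const seen = new Set();
  const queue = [...(graph[changed] ?? [])];
  while (queue.length > 0) {
    const next = queue.shift();
    if (seen.has(next)) continue;
    seen.add(next);
    queue.push(...(graph[next] ?? []));
  }
  return [...seen];
}
```

Under this sketch, a change to `design.md` yields only `execution-plan.md`, which matches the impact preview example in Step 2.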
1088
+
1089
+ ## Reasoning Chain Output
1090
+
1091
+ Every phase produces reasoning output at two layers:
1092
+
1093
+ ### Layer 1: Conversation Output (concise — for the user)
1094
+
1095
+ Before each major decision, output one trigger sentence and one reasoning sentence:
1096
+
1097
+ > "Your request mentions 'overnight autonomous run' — researching how Devin and Karpathy's autoresearch handle this, because unattended runs need different safety constraints than interactive ones."
1098
+
1099
+ After each phase, output what was found and a counterfactual:
1100
+
1101
+ > "Found: you use Supabase auth (not custom JWT). If I'd skipped research, I would have built JWT middleware — completely wrong."
1102
+
1103
+ ### Layer 2: File Output (detailed — for learning and reports)
1104
+
1105
+ Save full reasoning chain to `.wazir/runs/<id>/reasoning/phase-<name>-reasoning.md` with entries:
1106
+
1107
+ ```markdown
1108
+ ### Decision: [title]
1109
+ - **Trigger:** What prompted this decision
1110
+ - **Options considered:** List of alternatives
1111
+ - **Chosen:** The selected option
1112
+ - **Reasoning:** Why this option was chosen
1113
+ - **Confidence:** high | medium | low
1114
+ - **Counterfactual:** What would have gone wrong without this information
1115
+ ```
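
For illustration, an entry in the format above could be produced by a small formatter like the one below. The function name and the field names of its argument are hypothetical.

```javascript
// Hypothetical sketch: render one reasoning entry in the documented layout.
function formatReasoningEntry({ title, trigger, options, chosen, reasoning, confidence, counterfactual }) {
  if (!['high', 'medium', 'low'].includes(confidence)) {
    throw new Error(`confidence must be high, medium, or low, got ${confidence}`);
  }
  return [
    `### Decision: ${title}`,
    `- **Trigger:** ${trigger}`,
    `- **Options considered:** ${options.join('; ')}`,
    `- **Chosen:** ${chosen}`,
    `- **Reasoning:** ${reasoning}`,
    `- **Confidence:** ${confidence}`,
    `- **Counterfactual:** ${counterfactual}`,
  ].join('\n');
}
```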
1116
+
1117
+ Create the `reasoning/` directory during run init. Every phase skill (clarifier, executor, reviewer) writes its own reasoning file. Counterfactuals appear in BOTH conversation output AND reasoning files.