npm - @wazir-dev/cli - Versions diffs - 1.2.0 → 1.4.0 - Mend

@wazir-dev/cli 1.2.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (161) hide show

package/CHANGELOG.md +54 -44
package/README.md +13 -13
package/assets/demo.cast +47 -0
package/assets/demo.gif +0 -0
package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
package/docs/concepts/architecture.md +1 -1
package/docs/concepts/why-wazir.md +1 -1
package/docs/readmes/INDEX.md +1 -1
package/docs/readmes/features/expertise/README.md +1 -1
package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
package/docs/reference/hooks.md +1 -0
package/docs/reference/launch-checklist.md +3 -3
package/docs/reference/review-loop-pattern.md +3 -2
package/docs/reference/skill-tiers.md +2 -2
package/docs/research/2026-03-20-agents/a18fb002157904af5.txt +187 -0
package/docs/research/2026-03-20-agents/a1d0ac79ac2f11e6f.txt +2 -0
package/docs/research/2026-03-20-agents/a324079de037abd7c.txt +198 -0
package/docs/research/2026-03-20-agents/a357586bccfafb0e5.txt +256 -0
package/docs/research/2026-03-20-agents/a4365394e4d753105.txt +137 -0
package/docs/research/2026-03-20-agents/a492af28bc52d3613.txt +136 -0
package/docs/research/2026-03-20-agents/a4984db0b6a8eee07.txt +124 -0
package/docs/research/2026-03-20-agents/a5b30e59d34bbb062.txt +214 -0
package/docs/research/2026-03-20-agents/a5cf7829dab911586.txt +165 -0
package/docs/research/2026-03-20-agents/a607157c30dd97c9e.txt +96 -0
package/docs/research/2026-03-20-agents/a60b68b1e19d1e16b.txt +115 -0
package/docs/research/2026-03-20-agents/a722af01c5594aba0.txt +166 -0
package/docs/research/2026-03-20-agents/a787bdc516faa5829.txt +181 -0
package/docs/research/2026-03-20-agents/a7c46d1bba1056ed2.txt +132 -0
package/docs/research/2026-03-20-agents/a7e5abbab2b281a0d.txt +100 -0
package/docs/research/2026-03-20-agents/a8dbadc66cd0d7d5a.txt +95 -0
package/docs/research/2026-03-20-agents/a904d9f45d6b86a6d.txt +75 -0
package/docs/research/2026-03-20-agents/a927659a942ee7f60.txt +102 -0
package/docs/research/2026-03-20-agents/a962cb569191f7583.txt +125 -0
package/docs/research/2026-03-20-agents/aab6decea538aac41.txt +148 -0
package/docs/research/2026-03-20-agents/abd58b853dd938a1b.txt +295 -0
package/docs/research/2026-03-20-agents/ac009da573eff7f65.txt +100 -0
package/docs/research/2026-03-20-agents/ac1bc783364405e5f.txt +190 -0
package/docs/research/2026-03-20-agents/aca5e2b57fde152a0.txt +132 -0
package/docs/research/2026-03-20-agents/ad849b8c0a7e95b8b.txt +176 -0
package/docs/research/2026-03-20-agents/adc2b12a4da32c962.txt +258 -0
package/docs/research/2026-03-20-agents/af97caaaa9a80e4cb.txt +146 -0
package/docs/research/2026-03-20-agents/afc5faceee368b3ca.txt +111 -0
package/docs/research/2026-03-20-agents/afdb282d866e3c1e4.txt +164 -0
package/docs/research/2026-03-20-agents/afe9d1f61c02b1e8d.txt +299 -0
package/docs/research/2026-03-20-agents/b4hmkwril.txt +1856 -0
package/docs/research/2026-03-20-agents/b80ptk89g.txt +1856 -0
package/docs/research/2026-03-20-agents/bf54s1jss.txt +1150 -0
package/docs/research/2026-03-20-agents/bhd6kq2kx.txt +1856 -0
package/docs/research/2026-03-20-agents/bmb2fodyr.txt +988 -0
package/docs/research/2026-03-20-agents/bmmsrij8i.txt +826 -0
package/docs/research/2026-03-20-agents/bn4t2ywpu.txt +2175 -0
package/docs/research/2026-03-20-agents/bu22t9f1z.txt +0 -0
package/docs/research/2026-03-20-agents/bwvl98v2p.txt +738 -0
package/docs/research/2026-03-20-agents/psych-a3697a7fd06eb64fd.txt +135 -0
package/docs/research/2026-03-20-agents/psych-a37776fabc870feae.txt +123 -0
package/docs/research/2026-03-20-agents/psych-a5b1fe05c0589efaf.txt +2 -0
package/docs/research/2026-03-20-agents/psych-a95c15b1f29424435.txt +76 -0
package/docs/research/2026-03-20-agents/psych-a9c26f4d9172dde7c.txt +2 -0
package/docs/research/2026-03-20-agents/psych-aa19c69f0ca2c5ad3.txt +2 -0
package/docs/research/2026-03-20-agents/psych-aa4e4cb70e1be5ecb.txt +95 -0
package/docs/research/2026-03-20-agents/psych-ab5b302f26a554663.txt +102 -0
package/docs/research/2026-03-20-deep-research-complete.md +101 -0
package/docs/research/2026-03-20-deep-research-status.md +38 -0
package/docs/research/2026-03-20-enforcement-research.md +107 -0
package/expertise/antipatterns/process/ai-coding-antipatterns.md +117 -0
package/expertise/composition-map.yaml +27 -8
package/expertise/digests/reviewer/ai-coding-digest.md +83 -0
package/expertise/digests/reviewer/architectural-thinking-digest.md +63 -0
package/expertise/digests/reviewer/architecture-antipatterns-digest.md +49 -0
package/expertise/digests/reviewer/code-smells-digest.md +53 -0
package/expertise/digests/reviewer/coupling-cohesion-digest.md +54 -0
package/expertise/digests/reviewer/ddd-digest.md +60 -0
package/expertise/digests/reviewer/dependency-risk-digest.md +40 -0
package/expertise/digests/reviewer/error-handling-digest.md +55 -0
package/expertise/digests/reviewer/review-methodology-digest.md +49 -0
package/exports/hosts/claude/.claude/commands/learn.md +61 -8
package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
package/exports/hosts/claude/.claude/commands/verify.md +30 -1
package/exports/hosts/claude/.claude/settings.json +7 -6
package/exports/hosts/claude/export.manifest.json +8 -5
package/exports/hosts/claude/host-package.json +3 -0
package/exports/hosts/codex/export.manifest.json +8 -5
package/exports/hosts/codex/host-package.json +3 -0
package/exports/hosts/cursor/.cursor/hooks.json +6 -6
package/exports/hosts/cursor/export.manifest.json +8 -5
package/exports/hosts/cursor/host-package.json +3 -0
package/exports/hosts/gemini/export.manifest.json +8 -5
package/exports/hosts/gemini/host-package.json +3 -0
package/hooks/definitions/pretooluse_dispatcher.yaml +26 -0
package/hooks/definitions/pretooluse_pipeline_guard.yaml +22 -0
package/hooks/definitions/stop_pipeline_gate.yaml +22 -0
package/hooks/hooks.json +7 -6
package/hooks/pretooluse-dispatcher +84 -0
package/hooks/pretooluse-pipeline-guard +9 -0
package/hooks/stop-pipeline-gate +9 -0
package/llms-full.txt +48 -18
package/package.json +2 -3
package/schemas/decision.schema.json +15 -0
package/schemas/hook.schema.json +4 -1
package/schemas/phase-report.schema.json +9 -0
package/skills/TEMPLATE-3-ZONE.md +160 -0
package/skills/brainstorming/SKILL.md +137 -21
package/skills/clarifier/SKILL.md +364 -53
package/skills/claude-cli/SKILL.md +91 -12
package/skills/codex-cli/SKILL.md +91 -12
package/skills/debugging/SKILL.md +133 -38
package/skills/design/SKILL.md +173 -37
package/skills/dispatching-parallel-agents/SKILL.md +129 -31
package/skills/executing-plans/SKILL.md +113 -25
package/skills/executor/SKILL.md +252 -21
package/skills/finishing-a-development-branch/SKILL.md +107 -18
package/skills/gemini-cli/SKILL.md +91 -12
package/skills/humanize/SKILL.md +92 -13
package/skills/init-pipeline/SKILL.md +90 -18
package/skills/prepare-next/SKILL.md +93 -24
package/skills/receiving-code-review/SKILL.md +90 -16
package/skills/requesting-code-review/SKILL.md +100 -24
package/skills/requesting-code-review/code-reviewer.md +29 -17
package/skills/reviewer/SKILL.md +270 -57
package/skills/run-audit/SKILL.md +92 -15
package/skills/scan-project/SKILL.md +93 -14
package/skills/self-audit/SKILL.md +133 -39
package/skills/skill-research/SKILL.md +275 -0
package/skills/subagent-driven-development/SKILL.md +129 -30
package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +30 -2
package/skills/subagent-driven-development/implementer-prompt.md +40 -27
package/skills/subagent-driven-development/spec-reviewer-prompt.md +25 -12
package/skills/tdd/SKILL.md +125 -20
package/skills/using-git-worktrees/SKILL.md +118 -28
package/skills/using-skills/SKILL.md +116 -29
package/skills/verification/SKILL.md +160 -17
package/skills/wazir/SKILL.md +750 -120
package/skills/writing-plans/SKILL.md +134 -28
package/skills/writing-skills/SKILL.md +91 -13
package/skills/writing-skills/anthropic-best-practices.md +104 -64
package/skills/writing-skills/persuasion-principles.md +100 -34
package/tooling/src/capture/command.js +46 -2
package/tooling/src/capture/decision.js +40 -0
package/tooling/src/capture/store.js +33 -0
package/tooling/src/capture/user-input.js +66 -0
package/tooling/src/checks/security-sensitivity.js +69 -0
package/tooling/src/cli.js +28 -26
package/tooling/src/config/depth-table.js +60 -0
package/tooling/src/export/compiler.js +7 -8
package/tooling/src/guards/guardrail-functions.js +131 -0
package/tooling/src/guards/phase-prerequisite-guard.js +97 -3
package/tooling/src/hooks/pretooluse-dispatcher.js +300 -0
package/tooling/src/hooks/pretooluse-pipeline-guard.js +141 -0
package/tooling/src/hooks/stop-pipeline-gate.js +92 -0
package/tooling/src/init/auto-detect.js +0 -2
package/tooling/src/init/command.js +3 -95
package/tooling/src/learn/pipeline.js +177 -0
package/tooling/src/state/db.js +251 -2
package/tooling/src/state/pipeline-state.js +262 -0
package/tooling/src/status/command.js +6 -1
package/tooling/src/verify/proof-collector.js +299 -0
package/wazir.manifest.yaml +3 -0
package/workflows/learn.md +61 -8
package/workflows/plan-review.md +3 -1
package/workflows/verify.md +30 -1

package/skills/wazir/SKILL.md CHANGED Viewed

@@ -1,26 +1,57 @@
 ---
 name: wz:wazir
-description: One-command pipeline — type /wazir followed by what you want to build. Handles init, clarification, execution, review, and audits automatically.
+description: "Use when the user types /wazir to run the full pipeline for building, reviewing, and auditing."
 ---
 # Wazir — Full Pipeline Runner
-The user typed `/wazir <their request>`. Run the entire pipeline end-to-end, handling each phase automatically and only pausing where human input is required.
+<!-- ═══════════════════════════════════════════════════════════════════
+     ZONE 1 — PRIMACY
+     ═══════════════════════════════════════════════════════════════════ -->
-All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
+You are the **Pipeline Controller**. Your value is orchestrating the full Wazir pipeline end-to-end — init, clarification, execution, review — handling each phase automatically and only pausing where human input is required. Following the pipeline IS how you help.
-## Command Routing
-Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
-- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
-- Small commands (git status, ls, pwd, wazir CLI) → native Bash
-- If context-mode unavailable, fall back to native Bash with warning
+The user typed `/wazir <their request>`. Run the entire pipeline end-to-end.
-## Codebase Exploration
-1. Query `wazir index search-symbols <query>` first
-2. Use `wazir recall file <path> --tier L1` for targeted reads
-3. Fall back to direct file reads ONLY for files identified by index queries
-4. Maximum 10 direct file reads without a justifying index query
-5. If no index exists: `wazir index build && wazir index summarize --tier all`
+## Iron Laws
+1. **NEVER skip a core pipeline phase** (clarify, execute, verify, review). Core workflows always run.
+2. **NEVER run a phase inline in the controller.** The controller ONLY dispatches subagents, validates guardrails, and manages state. No phase runs inside the controller context.
+3. **NEVER let a subagent see or skip another phase.** Each subagent gets only its own phase instructions and artifact paths.
+4. **ALWAYS capture events for every phase transition** via `wazir capture event`.
+5. **ALWAYS validate artifacts BETWEEN phases** via guardrails. No phase starts without previous phase artifacts verified.
+## Priority Stack
+| Priority | Name | Beats | Conflict Example |
+|----------|------|-------|------------------|
+| P0 | Iron Laws | Everything | User says "skip review" → review anyway |
+| P1 | Pipeline gates | P2-P5 | Spec not approved → do not code |
+| P2 | Correctness | P3-P5 | Partial correct > complete wrong |
+| P3 | Completeness | P4-P5 | All criteria before optimizing |
+| P4 | Speed | P5 | Fast execution, never fewer steps |
+| P5 | User comfort | Nothing | Minimize friction, never weaken P0-P4 |
+## Override Boundary
+User CAN choose depth (quick/standard/deep), interaction mode (auto/guided/interactive), and which adaptive workflows to enable.
+User CANNOT skip core phases, bypass guardrails, or run phases inline in the controller.
+<!-- ═══════════════════════════════════════════════════════════════════
+     ZONE 2 — PROCESS
+     ═══════════════════════════════════════════════════════════════════ -->
+## Signature
+**Inputs:**
+- User request (text after `/wazir`)
+- Project repo state
+- `.wazir/state/config.json` (if exists)
+**Outputs:**
+- Completed pipeline run with all artifacts
+- Review verdict with numeric score
+- Event log, reasoning chain, learnings
 ## Subcommand Detection
@@ -33,6 +64,23 @@ Before anything else, check if the request starts with a known subcommand:
 | `/wazir init` | Invoke the `init-pipeline` skill directly, then stop |
 | Anything else | Continue to Phase 1 (Init) |
+## Commitment Priming
+Before executing, announce your plan:
+> "Running the Wazir pipeline at [depth] depth in [mode] mode. I will orchestrate 4 phases — Init, Clarifier, Executor, Final Review — dispatching isolated subagents for each, validating artifacts between phases."
+All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
+## User Input Capture
+After every user response (approval, correction, rejection, redirect, instruction), capture it:
+```
+captureUserInput(runDir, { phase: '<current-phase>', type: '<instruction|approval|correction|rejection|redirect>', content: '<user message>', context: '<what prompted the question>' })
+```
+This uses `tooling/src/capture/user-input.js`. The log at `user-input-log.ndjson` feeds the learning system — user corrections are the strongest signal for improvement. At run end, prune logs older than 10 runs via `pruneOldInputLogs(stateRoot, 10)`.
 ---
 # 4-Phase Pipeline
@@ -82,6 +130,9 @@ Parse the request for inline modifiers before the main text:
 Recognized modifiers:
 - **Depth:** `quick`, `deep` (standard is default when omitted)
+- **Interaction mode:** `auto`, `interactive` (guided is default when omitted)
+  - `/wazir auto fix the auth bug` → interaction_mode = auto
+  - `/wazir interactive design the onboarding` → interaction_mode = interactive
 - **Intent:** `bugfix`, `feature`, `refactor`, `docs`, `spike`
 ## Step 2: Check Prerequisites
@@ -93,11 +144,14 @@ Run `which wazir` to check if the CLI is installed.
 **If not installed**, present:
 > **The Wazir CLI is not installed. It's required for event capture, validation, and indexing.**
->
-> **How would you like to install it?**
->
-> 1. **npm** (Recommended) — `npm install -g @wazir-dev/cli`
-> 2. **Local link** — `npm link` from the Wazir project root
+Ask the user via AskUserQuestion:
+- **Question:** "The Wazir CLI is not installed. How would you like to install it?"
+- **Options:**
+  1. "npm install -g @wazir-dev/cli" *(Recommended)*
+  2. "npm link from the Wazir project root"
+Wait for the user's selection before continuing.
 The CLI is **required** — the pipeline uses `wazir capture`, `wazir validate`, `wazir index`, and `wazir doctor` throughout execution.
@@ -109,9 +163,14 @@ Run `wazir validate branches` to check the current git branch.
 - If on `main` or `develop`:
   > You're on **[branch]**. The pipeline requires a feature branch.
-  >
-  > 1. **Create feat/<slug>** (Recommended) — branch from current
-  > 2. **Continue on [branch]** — not recommended for feature/refactor work
+  Ask the user via AskUserQuestion:
+  - **Question:** "You're on a protected branch. Create a feature branch?"
+  - **Options:**
+    1. "Create feat/<slug> from current branch" *(Recommended)*
+    2. "Continue on current branch — not recommended"
+  Wait for the user's selection before continuing.
 ### Index Check
@@ -154,9 +213,14 @@ Check if a previous incomplete run exists (via `latest` symlink pointing to a ru
 **If previous incomplete run found**, present:
 > **A previous incomplete run was detected:** `<previous-run-id>`
->
-> 1. **Resume** (Recommended) — continue from the last completed phase
-> 2. **Start fresh** — create a new empty run
+Ask the user via AskUserQuestion:
+- **Question:** "A previous incomplete run was detected. Resume or start fresh?"
+- **Options:**
+  1. "Resume from the last completed phase" *(Recommended)*
+  2. "Start fresh with a new empty run"
+Wait for the user's selection before continuing.
 **If Resume:**
 - Copy `clarified/` from previous run into new run, EXCEPT `user-feedback.md`.
@@ -196,29 +260,30 @@ parsed_intent: feature
 entry_point: "/wazir"
 depth: standard
-team_mode: sequential
-parallel_backend: none
+interaction_mode: guided  # auto | guided | interactive
-# Workflow policy — individual workflows within each phase
+# Workflow policy — loop_cap is set from the depth table:
+#   quick: loop_cap=5, standard: loop_cap=10, deep: loop_cap=15
+# See tooling/src/config/depth-table.js for the canonical values.
 workflow_policy:
   # Clarifier phase workflows
-  discover:       { enabled: true, loop_cap: 10 }
-  clarify:        { enabled: true, loop_cap: 10 }
-  specify:        { enabled: true, loop_cap: 10 }
-  spec-challenge: { enabled: true, loop_cap: 10 }
-  author:         { enabled: false, loop_cap: 10 }
-  design:         { enabled: true, loop_cap: 10 }
-  design-review:  { enabled: true, loop_cap: 10 }
-  plan:           { enabled: true, loop_cap: 10 }
-  plan-review:    { enabled: true, loop_cap: 10 }
+  discover:       { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  clarify:        { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  specify:        { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  spec-challenge: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  author:         { enabled: false, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  design:         { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  design-review:  { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  plan:           { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  plan-review:    { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
   # Executor phase workflows
-  execute:        { enabled: true, loop_cap: 10 }
-  verify:         { enabled: true, loop_cap: 5 }
+  execute:        { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  verify:         { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
   # Final Review phase workflows
-  review:         { enabled: true, loop_cap: 10 }
-  learn:          { enabled: true, loop_cap: 5 }
-  prepare_next:   { enabled: true, loop_cap: 5 }
-  run_audit:      { enabled: false, loop_cap: 10 }
+  review:         { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  learn:          { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  prepare_next:   { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  run_audit:      { enabled: false, loop_cap: DEPTH_TABLE[depth].loop_cap }
 research_topics: []
@@ -247,127 +312,425 @@ After building run config:
   > **Running: standard depth, feature, sequential. Proceeding...**
 - **Low confidence** — show plan and ask:
-  > **Does this look right?**
-  > 1. **Yes, proceed** (Recommended)
-  > 2. **No, let me adjust**
+  Ask the user via AskUserQuestion:
+  - **Question:** "Does this run configuration look right?"
+  - **Options:**
+    1. "Yes, proceed" *(Recommended)*
+    2. "No, let me adjust"
+  Wait for the user's selection before continuing.
 ```bash
 wazir capture event --run <run-id> --event phase_exit --phase init --status completed
 ```
+Run the phase report and display it to the user:
+```bash
+wazir report phase --run <run-id> --phase init
+```
+Output the report content to the user in the conversation.
 ---
-# Phase 2: Clarifier
+# Interaction Modes
-```bash
-wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
-```
+The `interaction_mode` field in run-config controls how the pipeline interacts with the user:
-Invoke the `wz:clarifier` skill. It handles all sub-workflows internally:
+| Mode | Inline modifier | Behavior | Best for |
+|------|----------------|----------|----------|
+| **`guided`** | (default) | Pipeline runs, pauses at phase checkpoints for user approval. Current default behavior. | Most work |
+| **`auto`** | `/wazir auto ...` | No human checkpoints. Codex reviews all. Gating agent decides continue/loop_back/escalate. Stops ONLY on escalate. | Overnight, clear spec, well-understood domain |
+| **`interactive`** | `/wazir interactive ...` | More questions, more discussion, co-designs with user. Researcher presents options. Executor checks approach before coding. | Ambiguous requirements, new domain, learning |
-1. **Source Capture** — fetch URLs from input
-2. **Research** (discover workflow) — codebase + external research
-3. **Clarify** (clarify workflow) — scope, constraints, assumptions
-4. **Spec Harden** (specify + spec-challenge workflows) — measurable spec
-5. **Brainstorm** (design + design-review workflows) — design approaches
-6. **Plan** (plan + plan-review workflows) — execution plan
+## `auto` mode constraints
-Each sub-workflow has its own review loop. User checkpoints between major steps.
+- **Codex REQUIRED** — refuse to start auto mode if `multi_tool.codex` is not configured in `.wazir/state/config.json`. Error: "Auto mode requires an external reviewer (Codex). Configure it first or use guided mode."
+- **On escalate:** STOP immediately, write the escalation reason to `.wazir/runs/<id>/escalations/`, and wait for user input
+- **Wall-clock limit:** default 4 hours. If exceeded, stop with escalation.
+- **Never auto-commits to main** — always work on feature branch
+- All checkpoints (AskUserQuestion) are skipped — gating agent evaluates phase reports and decides
-### Scope Invariant
+## `guided` mode (default)
+Current behavior — no changes needed. Checkpoints at phase boundaries, user approves before advancing.
+## `interactive` mode
+- **Clarifier:** asks more detailed questions, presents research findings with options: "I found 3 approaches — which interests you?"
+- **Executor:** checks approach before coding: "I'm about to implement auth with Supabase — sound right?"
+- **Reviewer:** discusses findings with user, not just presents verdict: "I found a potential auth bypass — here's why I think it's high severity, do you agree?"
+- Slower but highest quality for complex/ambiguous work
+## Mode checking in phase skills
+All phase skills check `interaction_mode` from run-config at every checkpoint:
+```
+# Read from run-config
+interaction_mode = run_config.interaction_mode ?? 'guided'
+# At each checkpoint:
+if interaction_mode == 'auto':
+    # Skip checkpoint, let gating agent decide
+elif interaction_mode == 'interactive':
+    # More detailed question, present options, discuss
+else:
+    # guided — standard checkpoint with AskUserQuestion
+```
+---
+# Two-Level Phase Model
-**Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input. It can suggest prioritization, but the decision belongs to the user.
+The pipeline has 4 top-level **phases**, each containing multiple **workflows** with review loops:
-Output: approved spec + design + execution plan in `.wazir/runs/latest/clarified/`.
+```
+Phase 1: Init
+  └── (inline — controller handles directly)
+Phase 2: Clarifier        → dispatched as SUBAGENT
+  ├── discover (research) ← research-review loop
+  ├── clarify ← clarification-review loop
+  ├── specify ← spec-challenge loop
+  ├── author (adaptive) ← approval gate
+  ├── design ← design-review loop
+  └── plan ← plan-review loop
+Phase 3: Executor          → dispatched as SUBAGENT
+  ├── execute (per-task) ← task-review loop per task
+  └── verify
+Phase 4: Final Review      → dispatched as SUBAGENT
+  ├── review (final) ← scored review
+  ├── learn
+  └── prepare_next
+```
+**Event capture uses both levels.** When emitting phase events, include `--parent-phase`:
 ```bash
-wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
+wazir capture event --run <id> --event phase_enter --phase discover --parent-phase clarifier --status in_progress
 ```
+**Progress markers between workflows:** After each workflow completes, output:
+> Phase 2: Clarifier > Workflow: specify (3 of 6 workflows complete)
+**`wazir status` shows both levels:** "Phase 2: Clarifier > Workflow: specify"
 ---
-# Phase 3: Executor
+# Subagent Controller Architecture
-## Phase Gate (Hard Gate)
+**This is the core enforcement mechanism.** The controller (this skill, wz:wazir) dispatches ONE fresh Agent per phase. Each subagent gets a clean 200K context with only its skill instructions and artifact paths — never the full pipeline context.
-Before entering the Executor phase, verify ALL clarifier artifacts exist:
+## Why Subagents
-- [ ] `.wazir/runs/latest/clarified/clarification.md`
-- [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
-- [ ] `.wazir/runs/latest/clarified/design.md`
-- [ ] `.wazir/runs/latest/clarified/execution-plan.md`
+A single-context pipeline allows the agent to rationalize skipping phases ("the input is clear enough"). Subagent isolation prevents this:
+- Each subagent ONLY sees its own phase instructions
+- No subagent can see or skip another phase
+- The controller validates artifacts BETWEEN phases
+- Hooks provide a second enforcement layer independent of prompt compliance
-If ANY file is missing, **STOP**:
+## Controller Loop
-> **Cannot enter Executor phase: missing prerequisite artifacts from Clarifier.**
->
-> Missing: [list missing files]
+```
+initialize pipeline-state.json via createPipelineState(runId, stateRoot)
+transitionPhase(stateRoot, 'clarify')
+for each phase in [clarify, execute, review]:
+  1. Update pipeline-state.json: current_phase = phase
+  2. Run pre-phase guardrail (validate previous phase artifacts)
+  3. Build subagent prompt (see Subagent Prompt Template below)
+  4. Dispatch: Agent(prompt=..., description="wazir: <phase>", mode="bypassPermissions")
+  5. On completion: validate output artifacts via runGuardrail(phase, state, runDir)
+  6. If guardrail passes:
+     a. completePhase(stateRoot, phase, artifacts)
+     b. Continue to next phase
+  7. If guardrail fails: execute Retry Ladder
+  8. Capture events:
+     wazir capture event --run <id> --event phase_exit --phase <phase> --status completed
+transitionPhase(stateRoot, 'complete')
+```
+**CRITICAL: No phase runs inline in the controller.** The controller ONLY:
+- Manages state transitions
+- Dispatches subagents
+- Validates guardrails
+- Handles retry/escalation
+- Presents results to the user
+## Subagent Prompt Template
+Each subagent receives this prompt structure:
+```
+You are running the {PHASE} phase of the Wazir pipeline.
+Run ID: {run_id}
+Run directory: {run_dir}
+State root: {state_root}
+Depth: {depth}
+Interaction mode: {interaction_mode}
+## Your Instructions
+{Read and paste the full content of skills/{phase_skill}/SKILL.md here}
+## Input Artifacts (read from disk)
+{List of file paths the subagent should read as input}
+## Output Artifacts (write to disk)
+{List of file paths the subagent must produce}
+## Rules
+- Read your input artifacts from the paths above
+- Write your output artifacts to the paths above
+- Do NOT skip any step in your instructions
+- Use wazir index for codebase exploration
+- Use context-mode for large command outputs
+- When done, state which artifacts you produced
+```
+The controller reads the phase skill from disk and includes it in the prompt. This ensures each subagent has the latest skill version.
+## Subagent Dispatch Rules
+1. **No nesting** — all subagents dispatched at depth=1 from the controller
+2. **No context sharing** — subagents communicate only via artifacts on disk
+3. **No pipeline state awareness** — subagents don't read pipeline-state.json
+4. **Controller reads skills** — Read `skills/{name}/SKILL.md` before dispatch, paste into prompt
+5. **Verify phase handled by executor** — the executor subagent handles both execute + verify workflows
+## Retry Ladder
+If a guardrail fails after a phase subagent completes:
+```
+retry_count = 0
+while guardrail fails:
+  retry_count++
+  if retry_count <= 2:
+    # Re-dispatch same phase with failure feedback
+    prompt += "\n\nPREVIOUS ATTEMPT FAILED GUARDRAIL:\n{guardrail.reason}\nMissing: {guardrail.missing}\nFix these issues."
+    Dispatch Agent again
+  elif retry_count == 3:
+    # Escalate model (use Opus if not already)
+    prompt += "\n\nESCALATED: Previous attempts failed. Produce ALL required artifacts."
+    Dispatch Agent with model="opus"
+  else:
+    # Escalate to human
+    Ask user: "Phase {phase} failed guardrail after {retry_count} attempts: {reason}"
+    Options: 1. Retry manually  2. Skip phase  3. Abort run
+    break
+```
+## Pipeline State Management
+The controller manages `pipeline-state.json` at `$STATE_ROOT/pipeline-state.json`:
+```javascript
+// Before first phase
+createPipelineState(runId, stateRoot)
+transitionPhase(stateRoot, 'clarify')
+// Between phases
+transitionPhase(stateRoot, 'execute')
+// After each phase
+completePhase(stateRoot, phase, { artifactName: { path: '...' } })
+// When done
+transitionPhase(stateRoot, 'complete')
+```
+The Stop hook reads this file to block premature completion.
+The PreToolUse hook reads this file to enforce phase-specific tool restrictions.
+---
+# Phase 2: Clarifier (Subagent)
+**Before dispatching, output to the user:**
+> **Clarifier Phase** — Dispatching clarifier subagent to research your codebase, clarify requirements, harden the spec, brainstorm designs, and produce an execution plan.
 >
-> The Clarifier phase must complete before execution can begin. Run `/wazir:clarifier` first.
+> **Why this matters:** Without this, I'd guess your tech stack, misunderstand constraints, miss edge cases in the spec, and build the wrong architecture. Every ambiguity left unresolved here becomes a bug or rework cycle later.
-**Do NOT skip this check. Do NOT rationalize that the input is "clear enough" to bypass clarification. Every pipeline run must produce these artifacts.**
+## Pre-Dispatch
 ```bash
-wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
+wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
 ```
-**Pre-execution gate:**
+Update pipeline state:
+```
+transitionPhase(stateRoot, 'clarify')
+```
+## Dispatch
+Read `skills/clarifier/SKILL.md` from disk. Build the subagent prompt using the Subagent Prompt Template above.
+**Input artifacts for clarifier subagent:**
+- `.wazir/input/briefing.md`
+- `.wazir/runs/<id>/sources/` (all captured sources)
+- `.wazir/runs/<id>/run-config.yaml`
+- `input/` directory (project-level input files)
+**Required output artifacts:**
+- `.wazir/runs/<id>/clarified/clarification.md`
+- `.wazir/runs/<id>/clarified/spec-hardened.md`
+- `.wazir/runs/<id>/clarified/design.md`
+- `.wazir/runs/<id>/clarified/execution-plan.md`
+Dispatch: `Agent(prompt=..., description="wazir: clarifier")`
+## Post-Dispatch
+Run guardrail: `validateClarifyComplete(state, runDir)`
+If guardrail passes:
+```bash
+completePhase(stateRoot, 'clarify', { clarification: {...}, spec: {...}, design: {...}, plan: {...} })
+wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
+wazir report phase --run <run-id> --phase clarifier
+```
+If guardrail fails: execute Retry Ladder.
+### Scope Invariant
+**Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input.
+**After clarifier subagent completes, output to the user:**
+> **Clarifier Phase complete.**
+>
+> **Found:** [N] ambiguities resolved, [N] assumptions made explicit, [N] scope boundaries drawn, [N] acceptance criteria hardened
+>
+> **Without this phase:** Requirements would be interpreted differently across tasks, acceptance criteria would be vague and untestable, the design would be ad-hoc, and the plan would miss dependency ordering
+---
+# Phase 3: Executor (Subagent)
+**Before dispatching, output to the user:**
+> **Executor Phase** — Dispatching executor subagent to implement [N] tasks with TDD, per-task review, and verification.
+>
+> **Why this matters:** Without this discipline, tests get skipped, edge cases get missed, integration points break silently, and review catches problems too late.
+## Pre-Dispatch Guardrail (Hard Gate)
+Run `validateClarifyComplete(state, runDir)` to verify ALL clarifier artifacts exist. If ANY file is missing, **STOP** — do not dispatch the executor subagent.
 ```bash
 wazir validate manifest && wazir validate hooks
 # Hard gate — stop if either fails.
 ```
-Invoke the `wz:executor` skill. It handles:
+Update pipeline state:
+```
+transitionPhase(stateRoot, 'execute')
+wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
+```
-1. **Execute** (execute workflow) — per-task TDD cycle with review before each commit
-2. **Verify** (verify workflow) — deterministic verification of all claims
+## Dispatch
-Per-task review: `--mode task-review`, 5 task-execution dimensions.
-Tasks always run sequentially.
+Read `skills/executor/SKILL.md` from disk. Build the subagent prompt.
-Output: code changes + verification proof in `.wazir/runs/latest/artifacts/`.
+**Input artifacts for executor subagent:**
+- `.wazir/runs/<id>/clarified/clarification.md`
+- `.wazir/runs/<id>/clarified/spec-hardened.md`
+- `.wazir/runs/<id>/clarified/design.md`
+- `.wazir/runs/<id>/clarified/execution-plan.md`
+- `.wazir/runs/<id>/run-config.yaml`
+- `.wazir/state/config.json`
+**Required output artifacts:**
+- `.wazir/runs/<id>/artifacts/task-NNN/` (at least one)
+- `.wazir/runs/<id>/artifacts/verification-proof.md`
+Dispatch: `Agent(prompt=..., description="wazir: executor")`
+The executor subagent handles BOTH the execute and verify workflows internally.
+## Post-Dispatch
+Run guardrail: `validateExecuteComplete(state, runDir)`
+If guardrail passes:
 ```bash
+completePhase(stateRoot, 'execute', { verification_proof: { path: '...' } })
+transitionPhase(stateRoot, 'verify')
+completePhase(stateRoot, 'verify', { verification_proof: { path: '...' } })
 wazir capture event --run <run-id> --event phase_exit --phase executor --status completed
+wazir report phase --run <run-id> --phase executor
 ```
----
+If guardrail fails: execute Retry Ladder.
-# Phase 4: Final Review
+**After executor subagent completes, output to the user:**
-## Phase Gate (Hard Gate)
+> **Executor Phase complete.**
+>
+> **Found:** [N]/[N] tasks implemented, [N] tests written, [N] per-task review passes completed
+>
+> **Without this phase:** Code would ship without tests, review findings would accumulate until final review (10x more expensive to fix), and verification claims would be unsubstantiated
-Before entering the Final Review phase, verify the Executor produced its proof:
+---
-- [ ] `.wazir/runs/latest/artifacts/verification-proof.md`
+# Phase 4: Final Review (Subagent)
-If missing, **STOP**:
+**Before dispatching, output to the user:**
-> **Cannot enter Final Review: missing verification proof from Executor.**
+> **Final Review Phase** — Dispatching reviewer subagent for adversarial 7-dimension review comparing implementation against your original input.
 >
-> The Executor phase must complete and produce `verification-proof.md` before final review. Run `/wazir:executor` first.
+> **Why this matters:** Without this, implementation drift ships undetected, missing acceptance criteria go unnoticed, and the same mistakes repeat.
-```bash
+## Pre-Dispatch Guardrail (Hard Gate)
+Run `validateVerifyComplete(state, runDir)` to verify verification proof exists. If missing, **STOP**.
+Update pipeline state:
+```
+transitionPhase(stateRoot, 'review')
 wazir capture event --run <run-id> --event phase_enter --phase final_review --status in_progress
 ```
-This phase validates the implementation against the **ORIGINAL INPUT** (not the task specs — the executor's per-task reviewer already covered that).
+## Dispatch
-### 4a: Review (reviewer role in final mode)
+Read `skills/reviewer/SKILL.md` from disk. Build the subagent prompt.
-Invoke `wz:reviewer --mode final`.
-7-dimension scored review comparing implementation against the original user input.
-Score 0-70. Verdicts: PASS (56+), NEEDS MINOR FIXES (42-55), NEEDS REWORK (28-41), FAIL (0-27).
+**Input artifacts for reviewer subagent:**
+- `.wazir/input/briefing.md` (original input — compare implementation against THIS)
+- `.wazir/runs/<id>/clarified/spec-hardened.md`
+- `.wazir/runs/<id>/artifacts/verification-proof.md`
+- `.wazir/runs/<id>/run-config.yaml`
+- `.wazir/state/config.json`
+- Git diff: `git diff main..HEAD`
-### 4b: Learn (learner role)
+**Required output artifacts:**
+- `.wazir/runs/<id>/reviews/final-review.md`
+- `.wazir/runs/<id>/reviews/verdict.json` (must have numeric `score` field)
+<<<<<<< HEAD
+Additional instructions in the subagent prompt:
+```
+Run in --mode final. Produce a 7-dimension scored review.
+Write verdict.json with { "score": N, "verdict": "PASS|NEEDS_MINOR_FIXES|NEEDS_REWORK|FAIL" }
+Compare implementation against the ORIGINAL INPUT (briefing.md), not just the spec.
+Use Codex for external review if configured in config.json.
+=======
 Extract durable learnings from the completed run:
 - Scan all review findings (internal + Codex)
 - Propose learnings to `memory/learnings/proposed/`
 - Findings that recur across 2+ runs → auto-proposed as learnings
 - Learnings require explicit scope tags (roles, stacks, concerns)
+**Learn workflow completion guard:** If `workflow_policy.learn.enabled: true` in run config AND no files exist in `memory/learnings/proposed/` matching the current run ID pattern (`run-<current-id>-*.md`): log a warning finding: 'Learn workflow enabled but no proposed learnings written for this run'. This ensures the learn workflow always produces output when enabled.
 ### 4c: Prepare Next (planner role)
 Prepare context and handoff for the next run:
@@ -375,10 +738,45 @@ Prepare context and handoff for the next run:
 - Compress/archive unneeded files
 - Record what's left to do
+**After completing this phase, output to the user:**
+> **Final Review Phase complete.**
+>
+> **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings, [N] learnings proposed for future runs
+>
+> **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs, and recurring mistakes would never get captured as learnings
+>
+> **Changed because of this work:** [List of findings fixed, score achieved, learnings extracted, handoff prepared]
+```bash
+wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
+>>>>>>> d54b700 (feat(learnings): activate learning pipeline feedback loop)
+```
+Dispatch: `Agent(prompt=..., description="wazir: reviewer")`
+## Post-Dispatch
+Run guardrail: `validateReviewComplete(state, runDir)`
+If guardrail passes:
 ```bash
+completePhase(stateRoot, 'review', { review_verdict: { path: '...' } })
 wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
+transitionPhase(stateRoot, 'complete')
+wazir report phase --run <run-id> --phase final_review
 ```
+If guardrail fails: execute Retry Ladder.
+**After reviewer subagent completes, output to the user:**
+> **Final Review Phase complete.**
+>
+> **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings
+>
+> **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs
 ---
 ## Step 5: CHANGELOG + Gitflow Validation (Hard Gates)
@@ -399,26 +797,41 @@ After the reviewer completes, present verdict with numbered options:
 ### If PASS (score 56+):
 > **Result: PASS (score/70)**
->
-> 1. **Create a PR** (Recommended)
-> 2. **Merge directly**
-> 3. **Review the changes first**
+Ask the user via AskUserQuestion:
+- **Question:** "Pipeline passed. What would you like to do next?"
+- **Options:**
+  1. "Create a PR" *(Recommended)*
+  2. "Merge directly"
+  3. "Review the changes first"
+Wait for the user's selection before continuing.
 ### If NEEDS MINOR FIXES (score 42-55):
 > **Result: NEEDS MINOR FIXES (score/70)**
->
-> 1. **Auto-fix and re-review** (Recommended)
-> 2. **Fix manually**
-> 3. **Accept as-is**
+Ask the user via AskUserQuestion:
+- **Question:** "Minor issues found. How should we handle them?"
+- **Options:**
+  1. "Auto-fix and re-review" *(Recommended)*
+  2. "Fix manually"
+  3. "Accept as-is"
+Wait for the user's selection before continuing.
 ### If NEEDS REWORK (score 28-41):
 > **Result: NEEDS REWORK (score/70)**
->
-> 1. **Re-run affected tasks** (Recommended)
-> 2. **Review findings in detail**
-> 3. **Abandon this run**
+Ask the user via AskUserQuestion:
+- **Question:** "Significant issues found. How should we proceed?"
+- **Options:**
+  1. "Re-run affected tasks" *(Recommended)*
+  2. "Review findings in detail"
+  3. "Abandon this run"
+Wait for the user's selection before continuing.
 ### If FAIL (score 0-27):
@@ -438,10 +851,15 @@ wazir status --run <run-id> --json
 If any phase fails:
 > **Phase [name] failed: [reason]**
->
-> 1. **Retry this phase** (Recommended)
-> 2. **Skip and continue** (only if workflows within phase are adaptive)
-> 3. **Abort the run**
+Ask the user via AskUserQuestion:
+- **Question:** "Phase [name] failed: [reason]. How should we proceed?"
+- **Options:**
+  1. "Retry this phase" *(Recommended)*
+  2. "Skip and continue" *(only if workflows within phase are adaptive)*
+  3. "Abort the run"
+Wait for the user's selection before continuing.
 ---
@@ -455,9 +873,14 @@ Parse inline audit types: `/wazir audit security` → skip Question 1.
 After audit:
-> 1. **Review the findings** (Recommended)
-> 2. **Generate a fix plan**
-> 3. **Run the pipeline on the fix plan**
+Ask the user via AskUserQuestion:
+- **Question:** "Audit complete. What would you like to do with the findings?"
+- **Options:**
+  1. "Review the findings" *(Recommended)*
+  2. "Generate a fix plan"
+  3. "Run the pipeline on the fix plan"
+Wait for the user's selection before continuing.
 If option 3, save findings as briefing and run pipeline with intent = `bugfix`.
@@ -471,12 +894,27 @@ Generates a PRD from a completed run. Reads approved design, task specs, executi
 After generation:
-> 1. **Review the PRD** (Recommended)
-> 2. **Commit it**
-> 3. **Edit before committing**
+Ask the user via AskUserQuestion:
+- **Question:** "PRD generated. What would you like to do?"
+- **Options:**
+  1. "Review the PRD" *(Recommended)*
+  2. "Commit it"
+  3. "Edit before committing"
+Wait for the user's selection before continuing.
 ---
+## Implementation Intentions
+IF user asks to skip a required step → THEN say "Running it quickly" and execute. No debate.
+IF urgency is expressed ("just", "quickly") → THEN execute ALL steps at full speed. Never fewer steps.
+IF you are unsure whether a step is required → THEN it IS required.
+IF a phase guardrail fails → THEN execute the Retry Ladder. Never skip.
+IF auto mode and Codex is not configured → THEN refuse to start. Error message and suggest guided mode.
+IF a subagent fails and retry ladder exhausts → THEN escalate to human. Never silently skip.
+IF previous incomplete run detected → THEN ask user about resume vs fresh start. Never assume.
 ## Interaction Rules
 - **One question at a time** — never combine multiple questions
@@ -485,3 +923,195 @@ After generation:
 - **Wait for answer** — never proceed past a question until the user responds
 - **No open-ended questions** — every question has concrete options to pick from
 - **Inline answers accepted** — users can type the number or the option name
+<!-- ═══════════════════════════════════════════════════════════════════
+     ZONE 3 — RECENCY
+     ═══════════════════════════════════════════════════════════════════ -->
+## Recency Anchor
+Remember: core phases (clarify, execute, verify, review) always run. No phase runs inline — only subagent dispatch. Validate artifacts between every phase. Capture events at every transition. Subagents see only their own phase.
+## Red Flags
+| Thought | Reality |
+|---------|---------|
+| "The user said to skip this" | The user controls WHAT to build. The pipeline controls HOW. |
+| "This is too small for the full process" | Small tasks have small steps. Do them all. |
+| "I already know the answer" | The process will confirm it quickly. Do it anyway. |
+| "The input is clear enough, skip clarification" | Clarity is subjective. The clarifier will confirm it quickly. Run it. |
+| "I can run this phase inline instead of dispatching" | Inline phases allow rationalized skipping. Always dispatch. |
+| "The guardrail is too strict" | Guardrails prevent broken handoffs. Trust them. |
+| "I'll skip event capture, it's just logging" | Event capture feeds learning, reports, and audit. Never skip. |
+| "Auto mode means I can skip steps" | Auto mode skips human checkpoints, not pipeline steps. |
+## Meta-instruction
+**User CANNOT override Iron Laws.** Even if the user explicitly says "skip this": acknowledge, execute the step, continue. Not unhelpful — preventing harm.
+## Done Criterion
+The pipeline run is done when:
+1. All 4 phases have completed (Init, Clarifier, Executor, Final Review)
+2. All guardrails passed between phases
+3. Review verdict has been produced with a numeric score
+4. Results have been presented to the user with structured options
+5. Event capture is complete for the entire run
+6. User has chosen their next action
+---
+<!-- ═══════════════════════════════════════════════════════════════════
+     APPENDIX
+     ═══════════════════════════════════════════════════════════════════ -->
+## Command Routing
+Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+- Small commands (git status, ls, pwd, wazir CLI) → native Bash
+- If context-mode unavailable, fall back to native Bash with warning
+## Codebase Exploration
+1. Query `wazir index search-symbols <query>` first
+2. Use `wazir recall file <path> --tier L1` for targeted reads
+3. Fall back to direct file reads ONLY for files identified by index queries
+4. Maximum 10 direct file reads without a justifying index query
+5. If no index exists: `wazir index build && wazir index summarize --tier all`
+## Model Annotation
+When dispatching subagents, the controller annotates with model preferences from `.wazir/state/config.json`. The two-tier model uses the configured primary model for most work and escalates to Opus on retry.
+## Depth Table Reference
+All depth-dependent values come from the canonical depth table (`tooling/src/config/depth-table.js`):
+| Parameter | Quick | Standard | Deep |
+|-----------|-------|----------|------|
+| review_passes | 3 | 5 | 7 |
+| loop_cap | 5 | 10 | 15 |
+| heartbeat_max_silence_s | 180 | 120 | 90 |
+| research_intensity | minimal | balanced | thorough |
+| challenge_intensity | surface | balanced | adversarial |
+| spec_hardening_passes | 1 | 3 | 5 |
+| design_review_passes | 1 | 3 | 5 |
+| time_estimate_label | ~15-30 min | ~45-90 min | ~2-3 hrs |
+When any skill or workflow needs a depth-dependent value, look it up from this table. Never hardcode depth values.
+## Progressive Disclosure Progress Reporting
+Apply these 5 patterns throughout the pipeline:
+### Pattern 1: Phase Map
+At every phase transition, display the enabled phases with a position indicator:
+```
+[CLARIFY] → SPECIFY → DESIGN → PLAN → EXECUTE → VERIFY → REVIEW
+```
+Skipped phases are omitted from the map. The current phase is wrapped in brackets.
+### Pattern 2: Meaningful Updates
+Follow this formula: **"Name the action. State the dependency. Omit the journey."**
+Good: `"Running spec-challenge pass 3/5 on spec-hardened.md..."`
+Bad: `"Now I'm going to start the process of challenging the spec to make sure it's robust..."`
+### Pattern 3: Artifact Previews
+After producing any artifact, show the first 3-5 meaningful lines:
+```
+> clarification.md (preview):
+> ## Scope: 5 features from deep research
+> - Interactive checkpoints via AskUserQuestion
+> - Progressive disclosure progress reporting
+> ...
+```
+### Pattern 4: Time Estimates
+At phase entry, show the rough duration from the depth table:
+```
+"Entering EXECUTE phase (estimated ~45-90 min at standard depth)..."
+```
+### Pattern 5: Heartbeat
+Never exceed the silence threshold for the current depth level:
+- **Quick:** max 3 minutes between outputs
+- **Standard:** max 2 minutes between outputs
+- **Deep:** max 90 seconds between outputs
+If a long operation is running, emit a heartbeat: `"Still running tests (47 passed, 2 remaining)..."`
+## Steerability: Mutation Classification and Selective Regeneration
+When the user requests changes to an already-produced artifact:
+### Step 1: Classify the Mutation Level
+| Level | Name | Trigger | Action |
+|-------|------|---------|--------|
+| **L0** | Cosmetic | Typo, formatting, wording only | Apply fix. No regeneration. |
+| **L1** | Local | Change to a leaf artifact with no downstream dependents | Regenerate only this artifact. |
+| **L2** | Structural | Change to a mid-graph artifact (e.g., design.md) | Regenerate this artifact and all downstream dependents. |
+| **L3** | Fundamental | Change to scope, intent, or root artifact (clarification.md) | Restart from the clarification phase onward. |
+### Step 2: Show Impact Preview
+Before regenerating, tell the user what will be affected:
+```
+"This change to design.md is L2 (structural). It will regenerate:
+  - execution-plan.md (depends on design.md)
+Preserved (unaffected): clarification.md, spec-hardened.md"
+```
+Use AskUserQuestion:
+1. **Proceed with regeneration** (Recommended) — regenerate affected artifacts
+2. **Apply change only** — update this artifact without regenerating downstream
+3. **Cancel** — discard the change
+### Step 3: Selective Regeneration
+Walk the artifact dependency graph (from `pipeline-state.js`) starting from the changed artifact. Regenerate only downstream artifacts. Preserve all completed artifacts that are not downstream.
+### Artifact Dependency Graph
+```
+clarification.md → spec-hardened.md → design.md → execution-plan.md
+```
+Each arrow means "is required by." Change an upstream artifact and everything downstream may need regeneration.
+## Reasoning Chain Output
+Every phase produces reasoning output at two layers:
+### Layer 1: Conversation Output (concise — for the user)
+Before each major decision, output one trigger sentence and one reasoning sentence:
+> "Your request mentions 'overnight autonomous run' — researching how Devin and Karpathy's autoresearch handle this, because unattended runs need different safety constraints than interactive ones."
+After each phase, output what was found and a counterfactual:
+> "Found: you use Supabase auth (not custom JWT). If I'd skipped research, I would have built JWT middleware — completely wrong."
+### Layer 2: File Output (detailed — for learning and reports)
+Save full reasoning chain to `.wazir/runs/<id>/reasoning/phase-<name>-reasoning.md` with entries:
+```markdown
+### Decision: [title]
+- **Trigger:** What prompted this decision
+- **Options considered:** List of alternatives
+- **Chosen:** The selected option
+- **Reasoning:** Why this option was chosen
+- **Confidence:** high | medium | low
+- **Counterfactual:** What would have gone wrong without this information
+```
+Create the `reasoning/` directory during run init. Every phase skill (clarifier, executor, reviewer) writes its own reasoning file. Counterfactuals appear in BOTH conversation output AND reasoning files.