npm - @wazir-dev/cli - Versions diffs - 1.3.0 → 1.4.0 - Mend

@wazir-dev/cli 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (133) hide show

package/CHANGELOG.md +17 -2
package/docs/research/2026-03-20-agents/a18fb002157904af5.txt +187 -0
package/docs/research/2026-03-20-agents/a1d0ac79ac2f11e6f.txt +2 -0
package/docs/research/2026-03-20-agents/a324079de037abd7c.txt +198 -0
package/docs/research/2026-03-20-agents/a357586bccfafb0e5.txt +256 -0
package/docs/research/2026-03-20-agents/a4365394e4d753105.txt +137 -0
package/docs/research/2026-03-20-agents/a492af28bc52d3613.txt +136 -0
package/docs/research/2026-03-20-agents/a4984db0b6a8eee07.txt +124 -0
package/docs/research/2026-03-20-agents/a5b30e59d34bbb062.txt +214 -0
package/docs/research/2026-03-20-agents/a5cf7829dab911586.txt +165 -0
package/docs/research/2026-03-20-agents/a607157c30dd97c9e.txt +96 -0
package/docs/research/2026-03-20-agents/a60b68b1e19d1e16b.txt +115 -0
package/docs/research/2026-03-20-agents/a722af01c5594aba0.txt +166 -0
package/docs/research/2026-03-20-agents/a787bdc516faa5829.txt +181 -0
package/docs/research/2026-03-20-agents/a7c46d1bba1056ed2.txt +132 -0
package/docs/research/2026-03-20-agents/a7e5abbab2b281a0d.txt +100 -0
package/docs/research/2026-03-20-agents/a8dbadc66cd0d7d5a.txt +95 -0
package/docs/research/2026-03-20-agents/a904d9f45d6b86a6d.txt +75 -0
package/docs/research/2026-03-20-agents/a927659a942ee7f60.txt +102 -0
package/docs/research/2026-03-20-agents/a962cb569191f7583.txt +125 -0
package/docs/research/2026-03-20-agents/aab6decea538aac41.txt +148 -0
package/docs/research/2026-03-20-agents/abd58b853dd938a1b.txt +295 -0
package/docs/research/2026-03-20-agents/ac009da573eff7f65.txt +100 -0
package/docs/research/2026-03-20-agents/ac1bc783364405e5f.txt +190 -0
package/docs/research/2026-03-20-agents/aca5e2b57fde152a0.txt +132 -0
package/docs/research/2026-03-20-agents/ad849b8c0a7e95b8b.txt +176 -0
package/docs/research/2026-03-20-agents/adc2b12a4da32c962.txt +258 -0
package/docs/research/2026-03-20-agents/af97caaaa9a80e4cb.txt +146 -0
package/docs/research/2026-03-20-agents/afc5faceee368b3ca.txt +111 -0
package/docs/research/2026-03-20-agents/afdb282d866e3c1e4.txt +164 -0
package/docs/research/2026-03-20-agents/afe9d1f61c02b1e8d.txt +299 -0
package/docs/research/2026-03-20-agents/b4hmkwril.txt +1856 -0
package/docs/research/2026-03-20-agents/b80ptk89g.txt +1856 -0
package/docs/research/2026-03-20-agents/bf54s1jss.txt +1150 -0
package/docs/research/2026-03-20-agents/bhd6kq2kx.txt +1856 -0
package/docs/research/2026-03-20-agents/bmb2fodyr.txt +988 -0
package/docs/research/2026-03-20-agents/bmmsrij8i.txt +826 -0
package/docs/research/2026-03-20-agents/bn4t2ywpu.txt +2175 -0
package/docs/research/2026-03-20-agents/bu22t9f1z.txt +0 -0
package/docs/research/2026-03-20-agents/bwvl98v2p.txt +738 -0
package/docs/research/2026-03-20-agents/psych-a3697a7fd06eb64fd.txt +135 -0
package/docs/research/2026-03-20-agents/psych-a37776fabc870feae.txt +123 -0
package/docs/research/2026-03-20-agents/psych-a5b1fe05c0589efaf.txt +2 -0
package/docs/research/2026-03-20-agents/psych-a95c15b1f29424435.txt +76 -0
package/docs/research/2026-03-20-agents/psych-a9c26f4d9172dde7c.txt +2 -0
package/docs/research/2026-03-20-agents/psych-aa19c69f0ca2c5ad3.txt +2 -0
package/docs/research/2026-03-20-agents/psych-aa4e4cb70e1be5ecb.txt +95 -0
package/docs/research/2026-03-20-agents/psych-ab5b302f26a554663.txt +102 -0
package/docs/research/2026-03-20-deep-research-complete.md +101 -0
package/docs/research/2026-03-20-deep-research-status.md +38 -0
package/docs/research/2026-03-20-enforcement-research.md +107 -0
package/expertise/composition-map.yaml +27 -8
package/expertise/digests/reviewer/ai-coding-digest.md +83 -0
package/expertise/digests/reviewer/architectural-thinking-digest.md +63 -0
package/expertise/digests/reviewer/architecture-antipatterns-digest.md +49 -0
package/expertise/digests/reviewer/code-smells-digest.md +53 -0
package/expertise/digests/reviewer/coupling-cohesion-digest.md +54 -0
package/expertise/digests/reviewer/ddd-digest.md +60 -0
package/expertise/digests/reviewer/dependency-risk-digest.md +40 -0
package/expertise/digests/reviewer/error-handling-digest.md +55 -0
package/expertise/digests/reviewer/review-methodology-digest.md +49 -0
package/exports/hosts/claude/.claude/commands/learn.md +61 -8
package/exports/hosts/claude/.claude/settings.json +7 -6
package/exports/hosts/claude/export.manifest.json +6 -3
package/exports/hosts/claude/host-package.json +3 -0
package/exports/hosts/codex/export.manifest.json +6 -3
package/exports/hosts/codex/host-package.json +3 -0
package/exports/hosts/cursor/.cursor/hooks.json +6 -6
package/exports/hosts/cursor/export.manifest.json +6 -3
package/exports/hosts/cursor/host-package.json +3 -0
package/exports/hosts/gemini/export.manifest.json +6 -3
package/exports/hosts/gemini/host-package.json +3 -0
package/hooks/definitions/pretooluse_dispatcher.yaml +26 -0
package/hooks/definitions/pretooluse_pipeline_guard.yaml +22 -0
package/hooks/definitions/stop_pipeline_gate.yaml +22 -0
package/hooks/hooks.json +7 -6
package/hooks/pretooluse-dispatcher +84 -0
package/hooks/pretooluse-pipeline-guard +9 -0
package/hooks/stop-pipeline-gate +9 -0
package/package.json +2 -2
package/schemas/decision.schema.json +15 -0
package/schemas/hook.schema.json +4 -1
package/skills/TEMPLATE-3-ZONE.md +160 -0
package/skills/brainstorming/SKILL.md +127 -23
package/skills/clarifier/SKILL.md +175 -18
package/skills/claude-cli/SKILL.md +91 -12
package/skills/codex-cli/SKILL.md +91 -12
package/skills/debugging/SKILL.md +133 -38
package/skills/design/SKILL.md +173 -37
package/skills/dispatching-parallel-agents/SKILL.md +129 -31
package/skills/executing-plans/SKILL.md +113 -25
package/skills/executor/SKILL.md +185 -21
package/skills/finishing-a-development-branch/SKILL.md +107 -18
package/skills/gemini-cli/SKILL.md +91 -12
package/skills/humanize/SKILL.md +92 -13
package/skills/init-pipeline/SKILL.md +90 -17
package/skills/prepare-next/SKILL.md +93 -24
package/skills/receiving-code-review/SKILL.md +90 -16
package/skills/requesting-code-review/SKILL.md +100 -24
package/skills/requesting-code-review/code-reviewer.md +29 -17
package/skills/reviewer/SKILL.md +190 -50
package/skills/run-audit/SKILL.md +92 -15
package/skills/scan-project/SKILL.md +93 -14
package/skills/self-audit/SKILL.md +113 -39
package/skills/skill-research/SKILL.md +94 -7
package/skills/subagent-driven-development/SKILL.md +129 -30
package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +30 -2
package/skills/subagent-driven-development/implementer-prompt.md +40 -27
package/skills/subagent-driven-development/spec-reviewer-prompt.md +25 -12
package/skills/tdd/SKILL.md +125 -20
package/skills/using-git-worktrees/SKILL.md +118 -28
package/skills/using-skills/SKILL.md +116 -29
package/skills/verification/SKILL.md +127 -22
package/skills/wazir/SKILL.md +517 -153
package/skills/writing-plans/SKILL.md +134 -28
package/skills/writing-skills/SKILL.md +91 -13
package/skills/writing-skills/anthropic-best-practices.md +104 -64
package/skills/writing-skills/persuasion-principles.md +100 -34
package/tooling/src/capture/command.js +29 -1
package/tooling/src/capture/decision.js +40 -0
package/tooling/src/capture/store.js +1 -0
package/tooling/src/config/depth-table.js +60 -0
package/tooling/src/export/compiler.js +7 -8
package/tooling/src/guards/guardrail-functions.js +131 -0
package/tooling/src/guards/phase-prerequisite-guard.js +39 -3
package/tooling/src/hooks/pretooluse-dispatcher.js +300 -0
package/tooling/src/hooks/pretooluse-pipeline-guard.js +141 -0
package/tooling/src/hooks/stop-pipeline-gate.js +92 -0
package/tooling/src/learn/pipeline.js +177 -0
package/tooling/src/state/db.js +251 -2
package/tooling/src/state/pipeline-state.js +262 -0
package/wazir.manifest.yaml +3 -0
package/workflows/learn.md +61 -8

package/skills/wazir/SKILL.md CHANGED Viewed

@@ -1,36 +1,57 @@
 ---
 name: wz:wazir
-description: One-command pipeline — type /wazir followed by what you want to build. Handles init, clarification, execution, review, and audits automatically.
+description: "Use when the user types /wazir to run the full pipeline for building, reviewing, and auditing."
 ---
 # Wazir — Full Pipeline Runner
-The user typed `/wazir <their request>`. Run the entire pipeline end-to-end, handling each phase automatically and only pausing where human input is required.
+<!-- ═══════════════════════════════════════════════════════════════════
+     ZONE 1 — PRIMACY
+     ═══════════════════════════════════════════════════════════════════ -->
-All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
+You are the **Pipeline Controller**. Your value is orchestrating the full Wazir pipeline end-to-end — init, clarification, execution, review — handling each phase automatically and only pausing where human input is required. Following the pipeline IS how you help.
-## User Input Capture
+The user typed `/wazir <their request>`. Run the entire pipeline end-to-end.
-After every user response (approval, correction, rejection, redirect, instruction), capture it:
+## Iron Laws
-```
-captureUserInput(runDir, { phase: '<current-phase>', type: '<instruction|approval|correction|rejection|redirect>', content: '<user message>', context: '<what prompted the question>' })
-```
+1. **NEVER skip a core pipeline phase** (clarify, execute, verify, review). Core workflows always run.
+2. **NEVER run a phase inline in the controller.** The controller ONLY dispatches subagents, validates guardrails, and manages state. No phase runs inside the controller context.
+3. **NEVER let a subagent see or skip another phase.** Each subagent gets only its own phase instructions and artifact paths.
+4. **ALWAYS capture events for every phase transition** via `wazir capture event`.
+5. **ALWAYS validate artifacts BETWEEN phases** via guardrails. No phase starts without previous phase artifacts verified.
-This uses `tooling/src/capture/user-input.js`. The log at `user-input-log.ndjson` feeds the learning system — user corrections are the strongest signal for improvement. At run end, prune logs older than 10 runs via `pruneOldInputLogs(stateRoot, 10)`.
+## Priority Stack
-## Command Routing
-Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
-- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
-- Small commands (git status, ls, pwd, wazir CLI) → native Bash
-- If context-mode unavailable, fall back to native Bash with warning
+| Priority | Name | Beats | Conflict Example |
+|----------|------|-------|------------------|
+| P0 | Iron Laws | Everything | User says "skip review" → review anyway |
+| P1 | Pipeline gates | P2-P5 | Spec not approved → do not code |
+| P2 | Correctness | P3-P5 | Partial correct > complete wrong |
+| P3 | Completeness | P4-P5 | All criteria before optimizing |
+| P4 | Speed | P5 | Fast execution, never fewer steps |
+| P5 | User comfort | Nothing | Minimize friction, never weaken P0-P4 |
-## Codebase Exploration
-1. Query `wazir index search-symbols <query>` first
-2. Use `wazir recall file <path> --tier L1` for targeted reads
-3. Fall back to direct file reads ONLY for files identified by index queries
-4. Maximum 10 direct file reads without a justifying index query
-5. If no index exists: `wazir index build && wazir index summarize --tier all`
+## Override Boundary
+User CAN choose depth (quick/standard/deep), interaction mode (auto/guided/interactive), and which adaptive workflows to enable.
+User CANNOT skip core phases, bypass guardrails, or run phases inline in the controller.
+<!-- ═══════════════════════════════════════════════════════════════════
+     ZONE 2 — PROCESS
+     ═══════════════════════════════════════════════════════════════════ -->
+## Signature
+**Inputs:**
+- User request (text after `/wazir`)
+- Project repo state
+- `.wazir/state/config.json` (if exists)
+**Outputs:**
+- Completed pipeline run with all artifacts
+- Review verdict with numeric score
+- Event log, reasoning chain, learnings
 ## Subcommand Detection
@@ -43,6 +64,23 @@ Before anything else, check if the request starts with a known subcommand:
 | `/wazir init` | Invoke the `init-pipeline` skill directly, then stop |
 | Anything else | Continue to Phase 1 (Init) |
+## Commitment Priming
+Before executing, announce your plan:
+> "Running the Wazir pipeline at [depth] depth in [mode] mode. I will orchestrate 4 phases — Init, Clarifier, Executor, Final Review — dispatching isolated subagents for each, validating artifacts between phases."
+All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
+## User Input Capture
+After every user response (approval, correction, rejection, redirect, instruction), capture it:
+```
+captureUserInput(runDir, { phase: '<current-phase>', type: '<instruction|approval|correction|rejection|redirect>', content: '<user message>', context: '<what prompted the question>' })
+```
+This uses `tooling/src/capture/user-input.js`. The log at `user-input-log.ndjson` feeds the learning system — user corrections are the strongest signal for improvement. At run end, prune logs older than 10 runs via `pruneOldInputLogs(stateRoot, 10)`.
 ---
 # 4-Phase Pipeline
@@ -224,26 +262,28 @@ entry_point: "/wazir"
 depth: standard
 interaction_mode: guided  # auto | guided | interactive
-# Workflow policy — individual workflows within each phase
+# Workflow policy — loop_cap is set from the depth table:
+#   quick: loop_cap=5, standard: loop_cap=10, deep: loop_cap=15
+# See tooling/src/config/depth-table.js for the canonical values.
 workflow_policy:
   # Clarifier phase workflows
-  discover:       { enabled: true, loop_cap: 10 }
-  clarify:        { enabled: true, loop_cap: 10 }
-  specify:        { enabled: true, loop_cap: 10 }
-  spec-challenge: { enabled: true, loop_cap: 10 }
-  author:         { enabled: false, loop_cap: 10 }
-  design:         { enabled: true, loop_cap: 10 }
-  design-review:  { enabled: true, loop_cap: 10 }
-  plan:           { enabled: true, loop_cap: 10 }
-  plan-review:    { enabled: true, loop_cap: 10 }
+  discover:       { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  clarify:        { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  specify:        { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  spec-challenge: { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  author:         { enabled: false, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  design:         { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  design-review:  { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  plan:           { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  plan-review:    { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
   # Executor phase workflows
-  execute:        { enabled: true, loop_cap: 10 }
-  verify:         { enabled: true, loop_cap: 5 }
+  execute:        { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  verify:         { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
   # Final Review phase workflows
-  review:         { enabled: true, loop_cap: 10 }
-  learn:          { enabled: true, loop_cap: 5 }
-  prepare_next:   { enabled: true, loop_cap: 5 }
-  run_audit:      { enabled: false, loop_cap: 10 }
+  review:         { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  learn:          { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  prepare_next:   { enabled: true, loop_cap: DEPTH_TABLE[depth].loop_cap }
+  run_audit:      { enabled: false, loop_cap: DEPTH_TABLE[depth].loop_cap }
 research_topics: []
@@ -348,9 +388,9 @@ The pipeline has 4 top-level **phases**, each containing multiple **workflows**
 ```
 Phase 1: Init
-  └── (inline — no sub-workflows)
+  └── (inline — controller handles directly)
-Phase 2: Clarifier
+Phase 2: Clarifier        → dispatched as SUBAGENT
   ├── discover (research) ← research-review loop
   ├── clarify ← clarification-review loop
   ├── specify ← spec-challenge loop
@@ -358,11 +398,11 @@ Phase 2: Clarifier
   ├── design ← design-review loop
   └── plan ← plan-review loop
-Phase 3: Executor
+Phase 3: Executor          → dispatched as SUBAGENT
   ├── execute (per-task) ← task-review loop per task
   └── verify
-Phase 4: Final Review
+Phase 4: Final Review      → dispatched as SUBAGENT
   ├── review (final) ← scored review
   ├── learn
   └── prepare_next
@@ -380,183 +420,317 @@ wazir capture event --run <id> --event phase_enter --phase discover --parent-pha
 ---
-# Phase 2: Clarifier
+# Subagent Controller Architecture
-**Before starting this phase, output to the user:**
+**This is the core enforcement mechanism.** The controller (this skill, wz:wazir) dispatches ONE fresh Agent per phase. Each subagent gets a clean 200K context with only its skill instructions and artifact paths — never the full pipeline context.
-> **Clarifier Phase** — About to research your codebase, clarify requirements, harden the spec, brainstorm designs, and produce an execution plan.
->
-> **Why this matters:** Without this, I'd guess your tech stack, misunderstand constraints, miss edge cases in the spec, and build the wrong architecture. Every ambiguity left unresolved here becomes a bug or rework cycle later.
->
-> **Looking for:** Unstated assumptions, scope boundaries, conflicting requirements, missing acceptance criteria
+## Why Subagents
-```bash
-wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
+A single-context pipeline allows the agent to rationalize skipping phases ("the input is clear enough"). Subagent isolation prevents this:
+- Each subagent ONLY sees its own phase instructions
+- No subagent can see or skip another phase
+- The controller validates artifacts BETWEEN phases
+- Hooks provide a second enforcement layer independent of prompt compliance
+## Controller Loop
+```
+initialize pipeline-state.json via createPipelineState(runId, stateRoot)
+transitionPhase(stateRoot, 'clarify')
+for each phase in [clarify, execute, review]:
+  1. Update pipeline-state.json: current_phase = phase
+  2. Run pre-phase guardrail (validate previous phase artifacts)
+  3. Build subagent prompt (see Subagent Prompt Template below)
+  4. Dispatch: Agent(prompt=..., description="wazir: <phase>", mode="bypassPermissions")
+  5. On completion: validate output artifacts via runGuardrail(phase, state, runDir)
+  6. If guardrail passes:
+     a. completePhase(stateRoot, phase, artifacts)
+     b. Continue to next phase
+  7. If guardrail fails: execute Retry Ladder
+  8. Capture events:
+     wazir capture event --run <id> --event phase_exit --phase <phase> --status completed
+transitionPhase(stateRoot, 'complete')
 ```
-Invoke the `wz:clarifier` skill. It handles all sub-workflows internally:
+**CRITICAL: No phase runs inline in the controller.** The controller ONLY:
+- Manages state transitions
+- Dispatches subagents
+- Validates guardrails
+- Handles retry/escalation
+- Presents results to the user
-1. **Source Capture** — fetch URLs from input
-2. **Research** (discover workflow) — codebase + external research
-3. **Clarify** (clarify workflow) — scope, constraints, assumptions
-4. **Spec Harden** (specify + spec-challenge workflows) — measurable spec
-5. **Brainstorm** (design + design-review workflows) — design approaches
-6. **Plan** (plan + plan-review workflows) — execution plan
+## Subagent Prompt Template
-Each sub-workflow has its own review loop. User checkpoints between major steps.
+Each subagent receives this prompt structure:
-### Scope Invariant
+```
+You are running the {PHASE} phase of the Wazir pipeline.
+Run ID: {run_id}
+Run directory: {run_dir}
+State root: {state_root}
+Depth: {depth}
+Interaction mode: {interaction_mode}
+## Your Instructions
+{Read and paste the full content of skills/{phase_skill}/SKILL.md here}
+## Input Artifacts (read from disk)
+{List of file paths the subagent should read as input}
+## Output Artifacts (write to disk)
+{List of file paths the subagent must produce}
+## Rules
+- Read your input artifacts from the paths above
+- Write your output artifacts to the paths above
+- Do NOT skip any step in your instructions
+- Use wazir index for codebase exploration
+- Use context-mode for large command outputs
+- When done, state which artifacts you produced
+```
-**Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input. It can suggest prioritization, but the decision belongs to the user.
+The controller reads the phase skill from disk and includes it in the prompt. This ensures each subagent has the latest skill version.
-Output: approved spec + design + execution plan in `.wazir/runs/latest/clarified/`.
+## Subagent Dispatch Rules
-**After completing this phase, output to the user:**
+1. **No nesting** — all subagents dispatched at depth=1 from the controller
+2. **No context sharing** — subagents communicate only via artifacts on disk
+3. **No pipeline state awareness** — subagents don't read pipeline-state.json
+4. **Controller reads skills** — Read `skills/{name}/SKILL.md` before dispatch, paste into prompt
+5. **Verify phase handled by executor** — the executor subagent handles both execute + verify workflows
-> **Clarifier Phase complete.**
->
-> **Found:** [N] ambiguities resolved, [N] assumptions made explicit, [N] scope boundaries drawn, [N] acceptance criteria hardened
->
-> **Without this phase:** Requirements would be interpreted differently across tasks, acceptance criteria would be vague and untestable, the design would be ad-hoc, and the plan would miss dependency ordering
->
-> **Changed because of this work:** [List spec tightening changes, resolved questions, design decisions, scope adjustments]
+## Retry Ladder
-```bash
-wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
-```
+If a guardrail fails after a phase subagent completes:
-Run the phase report and display savings to the user:
-```bash
-wazir report phase --run <run-id> --phase clarifier
-wazir stats --run <run-id>
+```
+retry_count = 0
+while guardrail fails:
+  retry_count++
+  if retry_count <= 2:
+    # Re-dispatch same phase with failure feedback
+    prompt += "\n\nPREVIOUS ATTEMPT FAILED GUARDRAIL:\n{guardrail.reason}\nMissing: {guardrail.missing}\nFix these issues."
+    Dispatch Agent again
+  elif retry_count == 3:
+    # Escalate model (use Opus if not already)
+    prompt += "\n\nESCALATED: Previous attempts failed. Produce ALL required artifacts."
+    Dispatch Agent with model="opus"
+  else:
+    # Escalate to human
+    Ask user: "Phase {phase} failed guardrail after {retry_count} attempts: {reason}"
+    Options: 1. Retry manually  2. Skip phase  3. Abort run
+    break
 ```
-**Show savings in conversation output:**
-> **Context savings this phase:** Used wazir index for [N] queries and context-mode for [M] commands, saving ~[X] tokens ([Y]% reduction). Without these, this phase would have consumed [A] tokens instead of [B].
+## Pipeline State Management
-Output the report content to the user in the conversation.
+The controller manages `pipeline-state.json` at `$STATE_ROOT/pipeline-state.json`:
----
+```javascript
+// Before first phase
+createPipelineState(runId, stateRoot)
+transitionPhase(stateRoot, 'clarify')
-# Phase 3: Executor
+// Between phases
+transitionPhase(stateRoot, 'execute')
-**Before starting this phase, output to the user:**
+// After each phase
+completePhase(stateRoot, phase, { artifactName: { path: '...' } })
-> **Executor Phase** — About to implement [N] tasks in dependency order with TDD (test-first), per-task code review, and verification before each commit.
->
-> **Why this matters:** Without this discipline, tests get skipped, edge cases get missed, integration points break silently, and review catches problems too late when they're expensive to fix.
->
-> **Looking for:** Correct dependency ordering, test coverage for each task, clean per-task review passes, no implementation drift from the approved plan
+// When done
+transitionPhase(stateRoot, 'complete')
+```
-## Phase Gate (Hard Gate)
+The Stop hook reads this file to block premature completion.
+The PreToolUse hook reads this file to enforce phase-specific tool restrictions.
-Before entering the Executor phase, verify ALL clarifier artifacts exist:
+---
-- [ ] `.wazir/runs/latest/clarified/clarification.md`
-- [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
-- [ ] `.wazir/runs/latest/clarified/design.md`
-- [ ] `.wazir/runs/latest/clarified/execution-plan.md`
+# Phase 2: Clarifier (Subagent)
-If ANY file is missing, **STOP**:
+**Before dispatching, output to the user:**
-> **Cannot enter Executor phase: missing prerequisite artifacts from Clarifier.**
+> **Clarifier Phase** — Dispatching clarifier subagent to research your codebase, clarify requirements, harden the spec, brainstorm designs, and produce an execution plan.
 >
-> Missing: [list missing files]
->
-> The Clarifier phase must complete before execution can begin. Run `/wazir:clarifier` first.
+> **Why this matters:** Without this, I'd guess your tech stack, misunderstand constraints, miss edge cases in the spec, and build the wrong architecture. Every ambiguity left unresolved here becomes a bug or rework cycle later.
-**Do NOT skip this check. Do NOT rationalize that the input is "clear enough" to bypass clarification. Every pipeline run must produce these artifacts.**
+## Pre-Dispatch
 ```bash
-wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
+wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
 ```
-**Pre-execution gate:**
+Update pipeline state:
+```
+transitionPhase(stateRoot, 'clarify')
+```
+## Dispatch
+Read `skills/clarifier/SKILL.md` from disk. Build the subagent prompt using the Subagent Prompt Template above.
+**Input artifacts for clarifier subagent:**
+- `.wazir/input/briefing.md`
+- `.wazir/runs/<id>/sources/` (all captured sources)
+- `.wazir/runs/<id>/run-config.yaml`
+- `input/` directory (project-level input files)
+**Required output artifacts:**
+- `.wazir/runs/<id>/clarified/clarification.md`
+- `.wazir/runs/<id>/clarified/spec-hardened.md`
+- `.wazir/runs/<id>/clarified/design.md`
+- `.wazir/runs/<id>/clarified/execution-plan.md`
+Dispatch: `Agent(prompt=..., description="wazir: clarifier")`
+## Post-Dispatch
+Run guardrail: `validateClarifyComplete(state, runDir)`
+If guardrail passes:
 ```bash
-wazir validate manifest && wazir validate hooks
-# Hard gate — stop if either fails.
+completePhase(stateRoot, 'clarify', { clarification: {...}, spec: {...}, design: {...}, plan: {...} })
+wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
+wazir report phase --run <run-id> --phase clarifier
 ```
-Invoke the `wz:executor` skill. It handles:
+If guardrail fails: execute Retry Ladder.
-1. **Execute** (execute workflow) — per-task TDD cycle with review before each commit
-2. **Verify** (verify workflow) — deterministic verification of all claims
+### Scope Invariant
-Per-task review: `--mode task-review`, 5 task-execution dimensions.
-Tasks always run sequentially.
+**Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input.
-Output: code changes + verification proof in `.wazir/runs/latest/artifacts/`.
+**After clarifier subagent completes, output to the user:**
-**After completing this phase, output to the user:**
-> **Executor Phase complete.**
+> **Clarifier Phase complete.**
 >
-> **Found:** [N]/[N] tasks implemented, [N] tests written, [N] per-task review passes completed, [N] findings fixed before commit
+> **Found:** [N] ambiguities resolved, [N] assumptions made explicit, [N] scope boundaries drawn, [N] acceptance criteria hardened
 >
-> **Without this phase:** Code would ship without tests, review findings would accumulate until final review (10x more expensive to fix), and verification claims would be unsubstantiated
+> **Without this phase:** Requirements would be interpreted differently across tasks, acceptance criteria would be vague and untestable, the design would be ad-hoc, and the plan would miss dependency ordering
+---
+# Phase 3: Executor (Subagent)
+**Before dispatching, output to the user:**
+> **Executor Phase** — Dispatching executor subagent to implement [N] tasks with TDD, per-task review, and verification.
 >
-> **Changed because of this work:** [List of commits with conventional commit messages, test counts, verification evidence collected]
+> **Why this matters:** Without this discipline, tests get skipped, edge cases get missed, integration points break silently, and review catches problems too late.
+## Pre-Dispatch Guardrail (Hard Gate)
+Run `validateClarifyComplete(state, runDir)` to verify ALL clarifier artifacts exist. If ANY file is missing, **STOP** — do not dispatch the executor subagent.
 ```bash
-wazir capture event --run <run-id> --event phase_exit --phase executor --status completed
+wazir validate manifest && wazir validate hooks
+# Hard gate — stop if either fails.
 ```
-Run the phase report and display savings to the user:
-```bash
-wazir report phase --run <run-id> --phase executor
-wazir stats --run <run-id>
+Update pipeline state:
+```
+transitionPhase(stateRoot, 'execute')
+wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
 ```
-Output the report content to the user in the conversation.
+## Dispatch
-**Show savings in conversation output:**
-> **Context savings this phase:** Used wazir index for [N] queries and context-mode for [M] commands, saving ~[X] tokens ([Y]% reduction).
+Read `skills/executor/SKILL.md` from disk. Build the subagent prompt.
----
+**Input artifacts for executor subagent:**
+- `.wazir/runs/<id>/clarified/clarification.md`
+- `.wazir/runs/<id>/clarified/spec-hardened.md`
+- `.wazir/runs/<id>/clarified/design.md`
+- `.wazir/runs/<id>/clarified/execution-plan.md`
+- `.wazir/runs/<id>/run-config.yaml`
+- `.wazir/state/config.json`
+**Required output artifacts:**
+- `.wazir/runs/<id>/artifacts/task-NNN/` (at least one)
+- `.wazir/runs/<id>/artifacts/verification-proof.md`
+Dispatch: `Agent(prompt=..., description="wazir: executor")`
+The executor subagent handles BOTH the execute and verify workflows internally.
+## Post-Dispatch
+Run guardrail: `validateExecuteComplete(state, runDir)`
+If guardrail passes:
+```bash
+completePhase(stateRoot, 'execute', { verification_proof: { path: '...' } })
+transitionPhase(stateRoot, 'verify')
+completePhase(stateRoot, 'verify', { verification_proof: { path: '...' } })
+wazir capture event --run <run-id> --event phase_exit --phase executor --status completed
+wazir report phase --run <run-id> --phase executor
+```
-# Phase 4: Final Review
+If guardrail fails: execute Retry Ladder.
-**Before starting this phase, output to the user:**
+**After executor subagent completes, output to the user:**
-> **Final Review Phase** — About to run adversarial 7-dimension review comparing the implementation against your original input, extract durable learnings, and prepare the handoff.
+> **Executor Phase complete.**
 >
-> **Why this matters:** Without this, implementation drift ships undetected, missing acceptance criteria go unnoticed, untested code paths hide bugs, and the same mistakes repeat in the next run.
+> **Found:** [N]/[N] tasks implemented, [N] tests written, [N] per-task review passes completed
 >
-> **Looking for:** Spec violations, missing features, dead code paths, unsubstantiated claims, scope creep, security gaps, stale documentation
-## Phase Gate (Hard Gate)
+> **Without this phase:** Code would ship without tests, review findings would accumulate until final review (10x more expensive to fix), and verification claims would be unsubstantiated
-Before entering the Final Review phase, verify the Executor produced its proof:
+---
-- [ ] `.wazir/runs/latest/artifacts/verification-proof.md`
+# Phase 4: Final Review (Subagent)
-If missing, **STOP**:
+**Before dispatching, output to the user:**
-> **Cannot enter Final Review: missing verification proof from Executor.**
+> **Final Review Phase** — Dispatching reviewer subagent for adversarial 7-dimension review comparing implementation against your original input.
 >
-> The Executor phase must complete and produce `verification-proof.md` before final review. Run `/wazir:executor` first.
+> **Why this matters:** Without this, implementation drift ships undetected, missing acceptance criteria go unnoticed, and the same mistakes repeat.
-```bash
+## Pre-Dispatch Guardrail (Hard Gate)
+Run `validateVerifyComplete(state, runDir)` to verify verification proof exists. If missing, **STOP**.
+Update pipeline state:
+```
+transitionPhase(stateRoot, 'review')
 wazir capture event --run <run-id> --event phase_enter --phase final_review --status in_progress
 ```
-This phase validates the implementation against the **ORIGINAL INPUT** (not the task specs — the executor's per-task reviewer already covered that).
+## Dispatch
-### 4a: Review (reviewer role in final mode)
+Read `skills/reviewer/SKILL.md` from disk. Build the subagent prompt.
-Invoke `wz:reviewer --mode final`.
-7-dimension scored review comparing implementation against the original user input.
-Score 0-70. Verdicts: PASS (56+), NEEDS MINOR FIXES (42-55), NEEDS REWORK (28-41), FAIL (0-27).
+**Input artifacts for reviewer subagent:**
+- `.wazir/input/briefing.md` (original input — compare implementation against THIS)
+- `.wazir/runs/<id>/clarified/spec-hardened.md`
+- `.wazir/runs/<id>/artifacts/verification-proof.md`
+- `.wazir/runs/<id>/run-config.yaml`
+- `.wazir/state/config.json`
+- Git diff: `git diff main..HEAD`
-### 4b: Learn (learner role)
+**Required output artifacts:**
+- `.wazir/runs/<id>/reviews/final-review.md`
+- `.wazir/runs/<id>/reviews/verdict.json` (must have numeric `score` field)
+<<<<<<< HEAD
+Additional instructions in the subagent prompt:
+```
+Run in --mode final. Produce a 7-dimension scored review.
+Write verdict.json with { "score": N, "verdict": "PASS|NEEDS_MINOR_FIXES|NEEDS_REWORK|FAIL" }
+Compare implementation against the ORIGINAL INPUT (briefing.md), not just the spec.
+Use Codex for external review if configured in config.json.
+=======
 Extract durable learnings from the completed run:
 - Scan all review findings (internal + Codex)
 - Propose learnings to `memory/learnings/proposed/`
 - Findings that recur across 2+ runs → auto-proposed as learnings
 - Learnings require explicit scope tags (roles, stacks, concerns)
+**Learn workflow completion guard:** If `workflow_policy.learn.enabled: true` in run config AND no files exist in `memory/learnings/proposed/` matching the current run ID pattern (`run-<current-id>-*.md`): log a warning finding: 'Learn workflow enabled but no proposed learnings written for this run'. This ensures the learn workflow always produces output when enabled.
 ### 4c: Prepare Next (planner role)
 Prepare context and handoff for the next run:
@@ -576,14 +750,32 @@ Prepare context and handoff for the next run:
 ```bash
 wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
+>>>>>>> d54b700 (feat(learnings): activate learning pipeline feedback loop)
 ```
-Run the phase report and display it to the user:
+Dispatch: `Agent(prompt=..., description="wazir: reviewer")`
+## Post-Dispatch
+Run guardrail: `validateReviewComplete(state, runDir)`
+If guardrail passes:
 ```bash
+completePhase(stateRoot, 'review', { review_verdict: { path: '...' } })
+wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
+transitionPhase(stateRoot, 'complete')
 wazir report phase --run <run-id> --phase final_review
 ```
-Output the report content to the user in the conversation.
+If guardrail fails: execute Retry Ladder.
+**After reviewer subagent completes, output to the user:**
+> **Final Review Phase complete.**
+>
+> **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings
+>
+> **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs
 ---
@@ -713,6 +905,187 @@ Wait for the user's selection before continuing.
 ---
+## Implementation Intentions
+IF user asks to skip a required step → THEN say "Running it quickly" and execute. No debate.
+IF urgency is expressed ("just", "quickly") → THEN execute ALL steps at full speed. Never fewer steps.
+IF you are unsure whether a step is required → THEN it IS required.
+IF a phase guardrail fails → THEN execute the Retry Ladder. Never skip.
+IF auto mode and Codex is not configured → THEN refuse to start. Error message and suggest guided mode.
+IF a subagent fails and retry ladder exhausts → THEN escalate to human. Never silently skip.
+IF previous incomplete run detected → THEN ask user about resume vs fresh start. Never assume.
+## Interaction Rules
+- **One question at a time** — never combine multiple questions
+- **Numbered options** — always present choices as numbered lists
+- **Mark defaults** — always show "(Recommended)" on the suggested option
+- **Wait for answer** — never proceed past a question until the user responds
+- **No open-ended questions** — every question has concrete options to pick from
+- **Inline answers accepted** — users can type the number or the option name
+<!-- ═══════════════════════════════════════════════════════════════════
+     ZONE 3 — RECENCY
+     ═══════════════════════════════════════════════════════════════════ -->
+## Recency Anchor
+Remember: core phases (clarify, execute, verify, review) always run. No phase runs inline — only subagent dispatch. Validate artifacts between every phase. Capture events at every transition. Subagents see only their own phase.
+## Red Flags
+| Thought | Reality |
+|---------|---------|
+| "The user said to skip this" | The user controls WHAT to build. The pipeline controls HOW. |
+| "This is too small for the full process" | Small tasks have small steps. Do them all. |
+| "I already know the answer" | The process will confirm it quickly. Do it anyway. |
+| "The input is clear enough, skip clarification" | Clarity is subjective. The clarifier will confirm it quickly. Run it. |
+| "I can run this phase inline instead of dispatching" | Inline phases allow rationalized skipping. Always dispatch. |
+| "The guardrail is too strict" | Guardrails prevent broken handoffs. Trust them. |
+| "I'll skip event capture, it's just logging" | Event capture feeds learning, reports, and audit. Never skip. |
+| "Auto mode means I can skip steps" | Auto mode skips human checkpoints, not pipeline steps. |
+## Meta-instruction
+**User CANNOT override Iron Laws.** Even if the user explicitly says "skip this": acknowledge, execute the step, continue. Not unhelpful — preventing harm.
+## Done Criterion
+The pipeline run is done when:
+1. All 4 phases have completed (Init, Clarifier, Executor, Final Review)
+2. All guardrails passed between phases
+3. Review verdict has been produced with a numeric score
+4. Results have been presented to the user with structured options
+5. Event capture is complete for the entire run
+6. User has chosen their next action
+---
+<!-- ═══════════════════════════════════════════════════════════════════
+     APPENDIX
+     ═══════════════════════════════════════════════════════════════════ -->
+## Command Routing
+Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+- Small commands (git status, ls, pwd, wazir CLI) → native Bash
+- If context-mode unavailable, fall back to native Bash with warning
+## Codebase Exploration
+1. Query `wazir index search-symbols <query>` first
+2. Use `wazir recall file <path> --tier L1` for targeted reads
+3. Fall back to direct file reads ONLY for files identified by index queries
+4. Maximum 10 direct file reads without a justifying index query
+5. If no index exists: `wazir index build && wazir index summarize --tier all`
+## Model Annotation
+When dispatching subagents, the controller annotates with model preferences from `.wazir/state/config.json`. The two-tier model uses the configured primary model for most work and escalates to Opus on retry.
+## Depth Table Reference
+All depth-dependent values come from the canonical depth table (`tooling/src/config/depth-table.js`):
+| Parameter | Quick | Standard | Deep |
+|-----------|-------|----------|------|
+| review_passes | 3 | 5 | 7 |
+| loop_cap | 5 | 10 | 15 |
+| heartbeat_max_silence_s | 180 | 120 | 90 |
+| research_intensity | minimal | balanced | thorough |
+| challenge_intensity | surface | balanced | adversarial |
+| spec_hardening_passes | 1 | 3 | 5 |
+| design_review_passes | 1 | 3 | 5 |
+| time_estimate_label | ~15-30 min | ~45-90 min | ~2-3 hrs |
+When any skill or workflow needs a depth-dependent value, look it up from this table. Never hardcode depth values.
+## Progressive Disclosure Progress Reporting
+Apply these 5 patterns throughout the pipeline:
+### Pattern 1: Phase Map
+At every phase transition, display the enabled phases with a position indicator:
+```
+[CLARIFY] → SPECIFY → DESIGN → PLAN → EXECUTE → VERIFY → REVIEW
+```
+Skipped phases are omitted from the map. The current phase is wrapped in brackets.
+### Pattern 2: Meaningful Updates
+Follow this formula: **"Name the action. State the dependency. Omit the journey."**
+Good: `"Running spec-challenge pass 3/5 on spec-hardened.md..."`
+Bad: `"Now I'm going to start the process of challenging the spec to make sure it's robust..."`
+### Pattern 3: Artifact Previews
+After producing any artifact, show the first 3-5 meaningful lines:
+```
+> clarification.md (preview):
+> ## Scope: 5 features from deep research
+> - Interactive checkpoints via AskUserQuestion
+> - Progressive disclosure progress reporting
+> ...
+```
+### Pattern 4: Time Estimates
+At phase entry, show the rough duration from the depth table:
+```
+"Entering EXECUTE phase (estimated ~45-90 min at standard depth)..."
+```
+### Pattern 5: Heartbeat
+Never exceed the silence threshold for the current depth level:
+- **Quick:** max 3 minutes between outputs
+- **Standard:** max 2 minutes between outputs
+- **Deep:** max 90 seconds between outputs
+If a long operation is running, emit a heartbeat: `"Still running tests (47 passed, 2 remaining)..."`
+## Steerability: Mutation Classification and Selective Regeneration
+When the user requests changes to an already-produced artifact:
+### Step 1: Classify the Mutation Level
+| Level | Name | Trigger | Action |
+|-------|------|---------|--------|
+| **L0** | Cosmetic | Typo, formatting, wording only | Apply fix. No regeneration. |
+| **L1** | Local | Change to a leaf artifact with no downstream dependents | Regenerate only this artifact. |
+| **L2** | Structural | Change to a mid-graph artifact (e.g., design.md) | Regenerate this artifact and all downstream dependents. |
+| **L3** | Fundamental | Change to scope, intent, or root artifact (clarification.md) | Restart from the clarification phase onward. |
+### Step 2: Show Impact Preview
+Before regenerating, tell the user what will be affected:
+```
+"This change to design.md is L2 (structural). It will regenerate:
+  - execution-plan.md (depends on design.md)
+Preserved (unaffected): clarification.md, spec-hardened.md"
+```
+Use AskUserQuestion:
+1. **Proceed with regeneration** (Recommended) — regenerate affected artifacts
+2. **Apply change only** — update this artifact without regenerating downstream
+3. **Cancel** — discard the change
+### Step 3: Selective Regeneration
+Walk the artifact dependency graph (from `pipeline-state.js`) starting from the changed artifact. Regenerate only downstream artifacts. Preserve all completed artifacts that are not downstream.
+### Artifact Dependency Graph
+```
+clarification.md → spec-hardened.md → design.md → execution-plan.md
+```
+Each arrow means "is required by." Change an upstream artifact and everything downstream may need regeneration.
 ## Reasoning Chain Output
 Every phase produces reasoning output at two layers:
@@ -742,12 +1115,3 @@ Save full reasoning chain to `.wazir/runs/<id>/reasoning/phase-<name>-reasoning.
 ```
 Create the `reasoning/` directory during run init. Every phase skill (clarifier, executor, reviewer) writes its own reasoning file. Counterfactuals appear in BOTH conversation output AND reasoning files.
-## Interaction Rules
-- **One question at a time** — never combine multiple questions
-- **Numbered options** — always present choices as numbered lists
-- **Mark defaults** — always show "(Recommended)" on the suggested option
-- **Wait for answer** — never proceed past a question until the user responds
-- **No open-ended questions** — every question has concrete options to pick from
-- **Inline answers accepted** — users can type the number or the option name