npm - @wazir-dev/cli - Versions diffs - 1.1.0 → 1.3.0 - Mend

@wazir-dev/cli 1.1.0 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (138)

package/CHANGELOG.md +74 -10
package/README.md +15 -15
package/assets/demo.cast +47 -0
package/assets/demo.gif +0 -0
package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
package/docs/concepts/architecture.md +1 -1
package/docs/concepts/roles-and-workflows.md +2 -0
package/docs/concepts/why-wazir.md +59 -0
package/docs/decisions/2026-03-19-deferred-items.md +564 -0
package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
package/docs/readmes/INDEX.md +21 -5
package/docs/readmes/features/expertise/README.md +2 -2
package/docs/readmes/features/exports/README.md +2 -2
package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
package/docs/readmes/features/schemas/README.md +3 -0
package/docs/readmes/features/skills/README.md +17 -0
package/docs/readmes/features/skills/clarifier.md +5 -0
package/docs/readmes/features/skills/claude-cli.md +5 -0
package/docs/readmes/features/skills/codex-cli.md +5 -0
package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
package/docs/readmes/features/skills/executing-plans.md +5 -0
package/docs/readmes/features/skills/executor.md +5 -0
package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
package/docs/readmes/features/skills/gemini-cli.md +5 -0
package/docs/readmes/features/skills/humanize.md +5 -0
package/docs/readmes/features/skills/init-pipeline.md +5 -0
package/docs/readmes/features/skills/receiving-code-review.md +5 -0
package/docs/readmes/features/skills/requesting-code-review.md +5 -0
package/docs/readmes/features/skills/reviewer.md +5 -0
package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
package/docs/readmes/features/skills/wazir.md +5 -0
package/docs/readmes/features/skills/writing-skills.md +5 -0
package/docs/readmes/features/workflows/prepare-next.md +1 -1
package/docs/reference/configuration-reference.md +47 -6
package/docs/reference/hooks.md +1 -0
package/docs/reference/launch-checklist.md +4 -4
package/docs/reference/review-loop-pattern.md +119 -9
package/docs/reference/roles-reference.md +1 -0
package/docs/reference/skill-tiers.md +147 -0
package/docs/reference/tooling-cli.md +3 -1
package/docs/truth-claims.yaml +12 -0
package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
package/exports/hosts/claude/.claude/commands/verify.md +30 -1
package/exports/hosts/claude/.claude/settings.json +9 -0
package/exports/hosts/claude/CLAUDE.md +1 -1
package/exports/hosts/claude/export.manifest.json +6 -4
package/exports/hosts/claude/host-package.json +3 -1
package/exports/hosts/codex/AGENTS.md +1 -1
package/exports/hosts/codex/export.manifest.json +6 -4
package/exports/hosts/codex/host-package.json +3 -1
package/exports/hosts/cursor/.cursor/hooks.json +4 -0
package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
package/exports/hosts/cursor/export.manifest.json +6 -4
package/exports/hosts/cursor/host-package.json +3 -1
package/exports/hosts/gemini/GEMINI.md +1 -1
package/exports/hosts/gemini/export.manifest.json +6 -4
package/exports/hosts/gemini/host-package.json +3 -1
package/hooks/context-mode-router +191 -0
package/hooks/definitions/context_mode_router.yaml +19 -0
package/hooks/hooks.json +31 -6
package/hooks/protected-path-write-guard +8 -0
package/hooks/routing-matrix.json +45 -0
package/hooks/session-start +62 -1
package/llms-full.txt +937 -134
package/package.json +2 -4
package/schemas/hook.schema.json +2 -1
package/schemas/phase-report.schema.json +89 -0
package/schemas/usage.schema.json +25 -1
package/schemas/wazir-manifest.schema.json +19 -0
package/skills/brainstorming/SKILL.md +32 -157
package/skills/clarifier/SKILL.md +289 -111
package/skills/claude-cli/SKILL.md +320 -0
package/skills/codex-cli/SKILL.md +260 -0
package/skills/debugging/SKILL.md +13 -0
package/skills/design/SKILL.md +13 -0
package/skills/dispatching-parallel-agents/SKILL.md +13 -0
package/skills/executing-plans/SKILL.md +13 -0
package/skills/executor/SKILL.md +139 -19
package/skills/finishing-a-development-branch/SKILL.md +13 -0
package/skills/gemini-cli/SKILL.md +260 -0
package/skills/humanize/SKILL.md +13 -0
package/skills/init-pipeline/SKILL.md +72 -164
package/skills/prepare-next/SKILL.md +81 -10
package/skills/receiving-code-review/SKILL.md +13 -0
package/skills/requesting-code-review/SKILL.md +13 -0
package/skills/reviewer/SKILL.md +369 -24
package/skills/run-audit/SKILL.md +13 -0
package/skills/scan-project/SKILL.md +13 -0
package/skills/self-audit/SKILL.md +217 -16
package/skills/skill-research/SKILL.md +188 -0
package/skills/subagent-driven-development/SKILL.md +13 -0
package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
package/skills/subagent-driven-development/implementer-prompt.md +8 -0
package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
package/skills/tdd/SKILL.md +13 -0
package/skills/using-git-worktrees/SKILL.md +13 -0
package/skills/using-skills/SKILL.md +13 -0
package/skills/verification/SKILL.md +54 -3
package/skills/wazir/SKILL.md +464 -381
package/skills/writing-plans/SKILL.md +14 -1
package/skills/writing-skills/SKILL.md +13 -0
package/templates/artifacts/implementation-plan.md +3 -0
package/templates/artifacts/tasks-template.md +133 -0
package/templates/examples/phase-report.example.json +48 -0
package/tooling/src/adapters/composition-engine.js +256 -0
package/tooling/src/adapters/model-router.js +84 -0
package/tooling/src/capture/command.js +41 -2
package/tooling/src/capture/run-config.js +3 -1
package/tooling/src/capture/store.js +56 -0
package/tooling/src/capture/usage.js +106 -0
package/tooling/src/capture/user-input.js +66 -0
package/tooling/src/checks/ac-matrix.js +256 -0
package/tooling/src/checks/command-registry.js +12 -0
package/tooling/src/checks/docs-truth.js +1 -1
package/tooling/src/checks/security-sensitivity.js +69 -0
package/tooling/src/checks/skills.js +111 -0
package/tooling/src/cli.js +31 -20
package/tooling/src/commands/stats.js +161 -0
package/tooling/src/commands/validate.js +5 -1
package/tooling/src/export/compiler.js +33 -37
package/tooling/src/gating/agent.js +145 -0
package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
package/tooling/src/hooks/routing-logic.js +69 -0
package/tooling/src/init/auto-detect.js +258 -0
package/tooling/src/init/command.js +38 -170
package/tooling/src/input/scanner.js +46 -0
package/tooling/src/reports/command.js +103 -0
package/tooling/src/reports/phase-report.js +323 -0
package/tooling/src/state/command.js +160 -0
package/tooling/src/state/db.js +287 -0
package/tooling/src/status/command.js +58 -1
package/tooling/src/verify/proof-collector.js +299 -0
package/wazir.manifest.yaml +26 -14
package/workflows/plan-review.md +3 -1
package/workflows/verify.md +30 -1

package/docs/reference/hooks.md CHANGED

@@ -15,6 +15,7 @@ These hook definitions are product contracts first. Host-specific native hooks o
 | `stop_handoff_harvest` | Persist final handoff and stop-time observability data | capture |
 | `protected_path_write_guard` | Block writes to protected canonical paths outside approved flows | block |
 | `loop_cap_guard` | Block extra iterations after the configured loop cap | block |
+| `context_mode_router` | Route large command output through context-mode tools to avoid flooding model context | warn |
 ## Source of truth

package/docs/reference/launch-checklist.md CHANGED

@@ -26,7 +26,7 @@ Submit pull requests to these curated lists (one PR per list, follow each repo's
 ### awesome-claude-code
 - **Repo:** `github.com/anthropics/awesome-claude-code` (or the most-starred community fork)
 - **Section:** Tools / Plugins / Extensions
-- **Entry format:** `[Wazir](https://github.com/MohamedAbdallah-14/Wazir) - Host-native engineering OS kit with 10 roles, 14 phases, and 308 expertise modules.`
+- **Entry format:** `[Wazir](https://github.com/MohamedAbdallah-14/Wazir) - Host-native engineering OS kit with 10 roles, 4 phases (15 workflows), and 315 expertise modules.`
 - **Tips:** Keep the description under 120 characters. Link directly to the repo.
 ### awesome-ai-agents
@@ -56,7 +56,7 @@ Show HN: Wazir – Engineering OS kit for AI coding agents (Claude, Codex, Gemin
 ### First comment
 Post a comment immediately after submission explaining:
 1. What problem Wazir solves (AI agents lack structured engineering workflows)
-2. How it works (10 canonical roles, 14-phase pipeline, 308 expertise modules)
+2. How it works (10 canonical roles, 15-workflow pipeline, 315 expertise modules)
 3. What makes it different (host-native, works across Claude/Codex/Gemini/Cursor)
 4. Quick install: `npx @wazir-dev/cli init`
 5. Invite feedback -- HN readers appreciate genuine requests for input
@@ -75,7 +75,7 @@ Post a comment immediately after submission explaining:
 **Title:** "How I Built an Engineering OS for AI Coding Agents"
 1. **Hook** -- The problem: AI agents write code but lack engineering discipline.
-2. **Architecture overview** -- 10 roles, 14 phases, expertise modules, quality gates.
+2. **Architecture overview** -- 10 roles, 4 phases (15 workflows), expertise modules, quality gates.
 3. **Code walkthrough** -- Show a real workflow: how a feature moves from requirements through TDD to deployment.
 4. **Host-native approach** -- Explain why one kit works across Claude, Codex, Gemini, and Cursor.
 5. **Results** -- Concrete metrics or before/after comparisons.
@@ -100,7 +100,7 @@ Structure as a 5-7 tweet thread:
 1. **Hook tweet:** One-liner about the problem + link to repo.
 2. **What it is:** Brief description of Wazir.
-3. **Architecture:** 10 roles, 14 phases, 308 modules (include a diagram image).
+3. **Architecture:** 10 roles, 4 phases (15 workflows), 315 modules (include a diagram image).
 4. **Demo:** Short GIF or screenshot of a workflow in action.
 5. **Multi-host:** Works with Claude, Codex, Gemini, and Cursor.
 6. **Install:** `npx @wazir-dev/cli init`

package/docs/reference/review-loop-pattern.md CHANGED

@@ -134,10 +134,25 @@ review_loop(artifact_path, phase, dimensions[], depth, config, options={}):
       log(pass_number+1, dimension, findings) -> log_path
     if findings.has_issues:
-      # --- Fix inline, do NOT return ---
+      # --- Fix and re-submit (MANDATORY) ---
+      # The producer MUST fix findings and the reviewer MUST re-review.
+      # "Fix and continue without re-review" is EXPLICITLY PROHIBITED.
       producer_fix(artifact_path, findings)
       # Continue to next pass -- the fix will be re-reviewed
+  # --- Post-loop: escalation if issues remain ---
+  if remaining.has_issues:
+    # Cap reached with unresolved findings. Present to user:
+    # 1. Approve with known issues (Recommended if non-blocking)
+    # 2. Fix manually and re-run
+    # 3. Abort
+    escalate_to_user(remaining, options=[
+      "approve-with-issues",
+      "fix-manually-and-rerun",
+      "abort"
+    ])
+    # User decides. If approved, log "user-approved-with-issues" in final pass file.
   return { pass_count: total_passes, issues_found, issues_fixed, remaining, attributions }
 ```
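Reviewer note: the post-loop escalation added in this hunk can be sketched in Python as follows. This is a minimal illustration, not the package's implementation. The callables `review`, `producer_fix`, and `escalate_to_user` are injected stand-ins, findings are assumed to be plain strings, and the choice to leave last-pass findings unfixed (because no pass is left to re-review them) is an assumption the pseudocode does not settle.

```python
ESCALATION_OPTIONS = ["approve-with-issues", "fix-manually-and-rerun", "abort"]


def review_loop(artifact, total_passes, review, producer_fix, escalate_to_user):
    """Run a fixed number of review passes, then escalate leftovers to the user."""
    seen = set()      # distinct findings observed across all passes
    remaining = []    # findings reported by the most recent pass
    for pass_number in range(1, total_passes + 1):
        findings = review(artifact, pass_number)
        seen.update(findings)
        remaining = list(findings)
        # Assumption: a fix is only applied when a later pass can re-review it.
        if findings and pass_number < total_passes:
            artifact = producer_fix(artifact, findings)

    decision = None
    if remaining:
        # Passes exhausted with unresolved findings: the user decides.
        decision = escalate_to_user(remaining, ESCALATION_OPTIONS)
        if decision not in ESCALATION_OPTIONS:
            raise ValueError(f"unknown escalation choice: {decision}")

    return {
        "pass_count": total_passes,
        "issues_found": len(seen),
        "issues_fixed": len(seen) - len(remaining),
        "remaining": remaining,
        "decision": decision,
    }


# Hypothetical demo: the artifact is modeled as a set of defects, and each fix
# removes only the first reported defect, so one defect survives three passes.
def demo_review(artifact, pass_number):
    return sorted(artifact)


def demo_fix(artifact, findings):
    return artifact - {findings[0]}


def demo_escalate(remaining, options):
    return options[0]


result = review_loop({"a", "b", "c"}, 3, demo_review, demo_fix, demo_escalate)
```

In the demo, the loop ends with one unresolved finding, so the escalation callback runs and its choice is recorded in the returned summary.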
@@ -278,7 +293,7 @@ Matches canonical `workflows/design-review.md`:
 4. **Visual consistency** -- design tokens form a coherent system, dark/light mode alignment
 5. **Exported-code fidelity** -- do exported scaffolds match the designs? Mismatches are failures here, not implementation concerns.
-### Plan Dimensions (7)
+### Plan Dimensions (8)
 1. **Completeness** -- all design decisions mapped to tasks
 2. **Ordering** -- dependencies correct, parallelizable identified
@@ -287,6 +302,7 @@ Matches canonical `workflows/design-review.md`:
 5. **Edge cases** -- error paths covered
 6. **Security** -- auth, injection, data exposure
 7. **Integration** -- tasks connect end-to-end
+8. **Input Coverage** -- every distinct item in the original input maps to at least one task. If `tasks < input items`, HIGH finding listing missing items
 ### Task Execution Dimensions (5)
@@ -328,10 +344,11 @@ Pass counts are FIXED per depth. Quick = 3 passes, standard = 5 passes, deep = 7
 ## Loop Cap Configuration
-The `phase_policy` section of `run-config.yaml` controls which phases are enabled and sets an absolute safety ceiling per phase. Only two fields exist: `enabled` and `loop_cap`. There is no `passes` field -- depth determines pass counts (3/5/7), not phase policy.
+The `workflow_policy` section of `run-config.yaml` (legacy: `phase_policy`) controls which workflows are enabled and sets an absolute safety ceiling per workflow. Only two fields exist: `enabled` and `loop_cap`. There is no `passes` field -- depth determines pass counts (3/5/7), not workflow policy.
 ```yaml
-phase_policy:
+workflow_policy:
+  # Clarifier phase workflows
   discover:       { enabled: true, loop_cap: 10 }
   clarify:        { enabled: true, loop_cap: 10 }
   specify:        { enabled: true, loop_cap: 10 }
@@ -341,21 +358,24 @@ phase_policy:
   design-review:  { enabled: true, loop_cap: 10 }
   plan:           { enabled: true, loop_cap: 10 }
   plan-review:    { enabled: true, loop_cap: 10 }
+  # Executor phase workflows
   execute:        { enabled: true, loop_cap: 10 }
   verify:         { enabled: true, loop_cap: 5 }
   review:         { enabled: true, loop_cap: 10 }
-  learn:          { enabled: false, loop_cap: 5 }
-  prepare_next:   { enabled: false, loop_cap: 5 }
+  learn:          { enabled: true, loop_cap: 5 }
+  prepare_next:   { enabled: true, loop_cap: 5 }
   run_audit:      { enabled: false, loop_cap: 10 }
 ```
 **`loop_cap`** is an absolute safety ceiling that prevents runaway loops regardless of depth. It is checked by `wazir capture loop-check` in pipeline mode. It is NOT the same as pass count (which is determined by depth: 3/5/7). Example: depth=deep gives 7 passes, but if `loop_cap: 5`, the cap guard fires at pass 5 and escalates. This is intentional -- the operator can constrain expensive phases.
-**Adaptive phases** (`author`, `learn`, `prepare_next`, `run_audit`) default to `enabled: false`. They are activated by explicit operator config or intent detection. They do not participate in the standard review loop pattern because:
+**Adaptive workflows** (`author`, `run_audit`) default to `enabled: false`. They are activated by explicit operator config or intent detection.
+**Post-run workflows** (`learn`, `prepare_next`) default to `enabled: true`. They run as part of the Final Review phase:
+- `learn` extracts durable learnings from review findings -- recurring findings become accepted learnings.
+- `prepare_next` prepares context and handoff for the next run.
 - `author` has a human approval gate, not an iterative review loop.
-- `learn` extracts learnings from the completed run -- it is post-execution housekeeping.
-- `prepare_next` prepares context for the next run -- it is a handoff phase.
 - `run_audit` is an on-demand standalone audit, not part of the main pipeline flow.
 ---
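Reviewer note: the interaction between depth-driven pass counts and `loop_cap` described in this hunk can be sketched as follows. This is an illustration under stated assumptions: the policy is written as a Python dict rather than parsed from `run-config.yaml`, the function name `plan_passes` is hypothetical, and the guard is assumed to fire only when the depth asks for more passes than the cap allows.

```python
# Pass counts are fixed per depth, as stated in the document.
PASSES_BY_DEPTH = {"quick": 3, "standard": 5, "deep": 7}

# A hand-written excerpt of the workflow_policy block shown in the diff.
WORKFLOW_POLICY = {
    "execute": {"enabled": True, "loop_cap": 10},
    "verify": {"enabled": True, "loop_cap": 5},
    "run_audit": {"enabled": False, "loop_cap": 10},
}


def plan_passes(workflow, depth, policy=WORKFLOW_POLICY):
    """Return how many passes a workflow may run and whether the cap guard fires."""
    entry = policy[workflow]
    if not entry["enabled"]:
        return {"wanted": 0, "allowed": 0, "cap_fires": False}
    wanted = PASSES_BY_DEPTH[depth]
    allowed = min(wanted, entry["loop_cap"])
    # Assumption: the guard escalates only when passes are actually cut short.
    return {"wanted": wanted, "allowed": allowed, "cap_fires": wanted > entry["loop_cap"]}


deep_verify = plan_passes("verify", "deep")
deep_execute = plan_passes("execute", "deep")
```

With `depth=deep`, `verify` wants 7 passes but is capped at 5, matching the example in the text, while `execute` runs all 7 under its cap of 10.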
@@ -427,3 +447,93 @@ Do NOT load or invoke any skills."
 For committed changes, replace `--uncommitted` with `--base <sha>`.
 Replace `[DIMENSION]`, `[dimension description]`, and `[criteria]` with the task-specific values from the execution plan and spec.
+---
+## Codex Output Context Protection
+Codex CLI output includes internal traces (file reads, tool calls, reasoning) that are NOT useful for the review — only the final findings matter. To prevent context flooding:
+### Tee + Extract Pattern
+1. **Always tee** Codex output to a file:
+   ```bash
+   codex exec ... 2>&1 | tee .wazir/runs/latest/reviews/<phase>-review-pass-<N>.md
+   ```
+2. **Extract findings** after the last `codex` marker using `execute_file`:
+   ```bash
+   # If context-mode available (has_execute_file: true):
+   mcp__plugin_context-mode_context-mode__execute_file(
+     path: ".wazir/runs/latest/reviews/<phase>-review-pass-<N>.md",
+     language: "shell",
+     code: "tac $FILE | sed '/^codex$/q' | tac | tail -n +2"
+   )
+   ```
+3. **Present extracted findings only** — the raw trace stays in the file for debugging but never enters the main context window.
+### Fallback (no context-mode)
+If `context_mode.has_execute_file` is false, extract using shell directly:
+```bash
+tac <file> | sed '/^codex$/q' | tac | tail -n +2
+```
+This reverses the file, finds the first (= last original) `codex` marker, reverses back, and skips the marker line.
+**If no marker found:** fail closed — log "codex marker not found in output, cannot extract findings" and present a warning to the user with 0 findings extracted. The raw file is preserved for manual review. Do NOT fall back to `tail` or any best-effort extraction that could leak traces into context.
+---
+## Phase Scoring: First vs Final Artifact Comparison
+At the start of each review loop (pass 1), score the artifact on its phase's canonical dimension set (1-10 per dimension). At the end of the loop (final pass), score again using the **same canonical dimensions**. Present the delta in the end-of-phase report.
+### Canonical Dimension Sets Per Phase
+These are the fixed rubrics — no ad-hoc dimension selection:
+| Phase | Canonical Dimensions |
+|-------|---------------------|
+| research-review | Coverage, Source quality, Relevance, Gaps identified, Actionability |
+| clarification-review / spec-challenge | Completeness, Testability, Ambiguity, Assumptions, Scope creep |
+| design-review | Spec coverage, Design-spec consistency, Accessibility, Visual consistency, Exported-code fidelity |
+| plan-review | Completeness, Testability, Task granularity, Dependency correctness, Phase structure, File coverage, Estimation accuracy, Input coverage |
+| task-review | Correctness, Tests, Wiring, Drift, Quality |
+| final | Correctness, Completeness, Wiring, Verification, Drift, Quality, Documentation |
+### Scoring Rules
+1. Initial and final scores MUST use the **same dimension set** — the delta is only meaningful on the same rubric.
+2. The reviewer records which dimension set was used in each pass file.
+3. Delta format: `Dimension: X/10 → Y/10 (+Z)`.
+### Quality Delta Report Section
+The end-of-phase report (see "End-of-Phase Report" below) includes a **Quality Delta** section:
+```markdown
+## Quality Delta
+| Dimension | Initial | Final | Delta |
+|-----------|---------|-------|-------|
+| Completeness | 4/10 | 9/10 | +5 |
+| Testability | 3/10 | 8/10 | +5 |
+| Ambiguity | 5/10 | 9/10 | +4 |
+```
+---
+## End-of-Phase Report
+Every phase exit produces a report saved to `.wazir/runs/latest/reviews/<phase>-report.md` containing:
+1. **Summary** — what the phase produced
+2. **Key Changes** — first-version vs final-version highlights (not full diff — what improved)
+3. **Quality Delta** — per-dimension before/after scores (see Phase Scoring above)
+4. **Findings Log** — per-pass finding counts by severity (e.g., "Pass 1: 6 findings (3 blocking, 2 warning, 1 note). Pass 7: 0 findings. All resolved.")
+5. **Usage** — token usage from `wazir capture usage` (runs before report generation)
+6. **Context Savings** — context-mode stats if available, omit section if not
+7. **Time Spent** — wall-clock elapsed time from phase start to end
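Reviewer note: the shell pipeline in the Codex Output Context Protection section does not fail closed by itself. When no `codex` line exists, `sed` never quits early and the pipeline emits the whole file minus its first line. A minimal Python sketch of the fail-closed behavior the section asks for is shown below. The names `extract_findings`, `safe_extract`, and `MarkerNotFound` are hypothetical and not part of the package.

```python
MARKER_WARNING = "codex marker not found in output, cannot extract findings"


class MarkerNotFound(Exception):
    """Raised when the raw Codex output contains no standalone marker line."""


def extract_findings(raw, marker="codex"):
    """Return the text after the last line that is exactly `marker`."""
    lines = raw.splitlines()
    # Scan backwards so the first hit is the last marker in the file,
    # mirroring the effect of `tac | sed '/^codex$/q' | tac | tail -n +2`.
    for index in range(len(lines) - 1, -1, -1):
        if lines[index] == marker:
            return "\n".join(lines[index + 1:])
    raise MarkerNotFound(MARKER_WARNING)


def safe_extract(raw):
    """Fail closed: on a missing marker, return no findings plus a warning."""
    try:
        return extract_findings(raw), None
    except MarkerNotFound as error:
        return "", str(error)


findings, warning = safe_extract("codex\ntrace line\ncodex\nfinding 1\nfinding 2")
empty, missing_warning = safe_extract("trace only, no marker line")
```

The second call returns an empty findings string and the warning text, so nothing from the raw trace reaches the caller.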

package/docs/reference/roles-reference.md CHANGED

@@ -35,6 +35,7 @@ This is the lookup reference for canonical roles, workflows, and their contracts
 | `review` | `verify` | Adversarial quality review |
 | `learn` | `review` | Capture scoped learnings |
 | `prepare-next` | `learn` | Produce clean next-run handoff |
+| `run-audit` | (standalone) | Structured codebase audit with source-backed findings |
 ## Role routing valid values

package/docs/reference/skill-tiers.md ADDED

@@ -0,0 +1,147 @@
+# Skill Tier Classification
+Audit of Wazir skills against Superpowers v4.3.1 skills.
+Each skill is classified into one of three tiers:
+- **Delegate** -- use superpowers skill as-is, delete Wazir fork
+- **Augment** -- use superpowers skill + inject Wazir context addendum (strictly additive, no overrides). **NOTE:** R2 validation found this tier is not implementable -- see [Augment Mechanism](#augment-mechanism) below.
+- **Own** -- Wazir-original or structurally rewritten skill, rename to `wz:` prefix
+---
+## Classification Table
+| Wazir Skill | Superpowers Equivalent | Tier | Rationale | Risk Notes |
+|---|---|---|---|---|
+| brainstorming | brainstorming | **Own** | Structurally rewritten. Superpowers version is a linear checklist (explore context, ask questions, propose approaches, present design, write doc, invoke writing-plans). Wazir replaces the entire process: adds Command Routing and Codebase Exploration preambles, replaces the design-doc step with a design-review loop (`--mode design-review` with canonical dimensions), and outputs to `.wazir/runs/latest/clarified/design.md` instead of `docs/plans/`. None of the superpowers process steps survive intact. | -- |
+| clarifier | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| debugging | systematic-debugging | **Own** | Structurally rewritten. Superpowers has a 4-phase process (Root Cause Investigation with 5 substeps, Pattern Analysis, Hypothesis and Testing, Implementation) totaling ~300 lines with detailed examples, rationalization tables, and supporting technique references. Wazir condenses this to a 4-step observe-hypothesize-test-fix loop (~75 lines), replaces all codebase exploration with Wazir CLI symbol-first exploration (`wazir index search-symbols`, `wazir recall symbol` and `wazir recall file`), adds loop cap awareness (pipeline mode with `wazir capture loop-check` vs. standalone mode), and removes all superpowers examples, rationalization tables, and red-flag lists. The methodology is fundamentally different in structure despite sharing the spirit of "root cause first." | Delegating would lose Wazir CLI integration and loop cap awareness. Superpowers version is far more detailed on anti-patterns and may be worth referencing separately. |
+| design | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| dispatching-parallel-agents | dispatching-parallel-agents | **Own** | Reclassified from Augment to Own (R2). Skill shadowing is full-override, so Augment tier is not implementable via `~/.claude/skills/`. Wazir already carries the full content: superpowers core (When to Use decision tree, The Pattern with 4 steps, Agent Prompt Structure, Common Mistakes section) plus Wazir additions (Command Routing preamble, Codebase Exploration preamble, philosophical paragraph in Overview, Problem/Fix format for Common Mistakes). Drops superpowers-only sections: "When NOT to Use," "Real Example from Session," "Key Benefits," "Verification," "Real-World Impact." | Superpowers informational sections (Real Example, Key Benefits, Verification, Real-World Impact) not carried forward. Low risk -- these are teaching content, not behavioral. |
+| executing-plans | executing-plans | **Own** | Structurally rewritten. Superpowers uses batch execution (default first 3 tasks) with report-and-wait checkpoints and explicit batch feedback loops. Wazir replaces batching with per-task execution, adds a per-task review loop (`--mode task-review` with 5 task-execution dimensions, Codex integration, review log filenames, loop cap tracking via `wazir capture loop-check`), adds standalone vs. pipeline mode detection, and adds a note recommending wz:subagent-driven-development when subagents are available. The batch-vs-per-task change is a core behavioral difference. All integration references point to `wz:` skills. | Delegating would lose per-task review loops and pipeline mode integration. |
+| executor | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| finishing-a-development-branch | finishing-a-development-branch | **Own** | Reclassified from Augment to Own (R2). Skill shadowing is full-override, so Augment tier is not implementable via `~/.claude/skills/`. Wazir already carries the full content: superpowers process (5 steps: verify tests, determine base branch, present 4 options, execute choice, cleanup worktree) preserved with identical structure and identical option semantics. Wazir adds Command Routing and Codebase Exploration preambles. Minor cosmetic changes: `<N>` removed from failure template, `<base-branch>` shortened to `<base>`, emoji checkmarks replaced with Y/-, `<commit-list>` changed to `<count>`, PR body simplified. Red Flags and Integration sections trimmed but no behavioral contradiction. | Low risk. The superpowers version has more detailed Red Flags and Integration sections not carried forward. |
+| humanize | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| init-pipeline | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| prepare-next | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| receiving-code-review | receiving-code-review | **Own** | Structurally rewritten. Superpowers has extensive sections: Forbidden Responses, Source-Specific Handling, YAGNI Check, Implementation Order, When To Push Back, Acknowledging Correct Feedback (with detailed anti-patterns for gratitude), Gracefully Correcting Pushback, Common Mistakes table, Real Examples, and GitHub Thread Replies. Wazir preserves the core Response Pattern and Forbidden Responses but: (1) adds Loop Tracking section (pipeline mode with `wazir capture loop-check` and standalone pass counts), (2) restructures Implementation Order to a 4-tier priority (blocking, functional, quality, nice-to-have) instead of 3-tier, (3) adds a Quick Reference decision table, (4) removes the entire "Acknowledging Correct Feedback" anti-gratitude section, the "Gracefully Correcting Pushback" section, the Common Mistakes table, all Real Examples, the "When To Push Back" enumeration, and the GitHub Thread Replies section. The Loop Tracking addition and structural deletions make this a substantive rewrite. | Delegating would lose loop tracking. The removed anti-gratitude and pushback sections from superpowers are valuable behavioral guardrails worth preserving. |
+| requesting-code-review | requesting-code-review | **Own** | Structurally rewritten. Both skills share the same When to Request triggers and Example structure. But Wazir: (1) replaces `superpowers:code-reviewer` with `wz:code-reviewer`, (2) adds explicit review loop parameters (`--mode`, depth-aware dimensions, pass number), (3) adds `codex review --uncommitted` and `codex review --base` commands, (4) adds Codex Error Handling section, (5) adds `{REVIEW_MODE}` placeholder, (6) changes Integration section to reference per-task review checkpoints instead of batch review, (7) adds "Dispatch review without explicit `--mode`" to Red Flags. The Codex integration and review loop parameter system are structural additions that change how reviews are dispatched. | Delegating would lose Codex integration and review loop protocol. |
+| reviewer | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| run-audit | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| scan-project | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| self-audit | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| subagent-driven-development | subagent-driven-development | **Own** | Structurally rewritten. Both share the same high-level process (fresh subagent per task, two-stage review, spec then quality). But Wazir: (1) adds `Capture PRE_TASK_SHA` step to the process flowchart for diff scoping, (2) adds Code Review Scoping section (`codex review --base <pre-task-sha>`), (3) adds Review Loop Alignment section (explicit `--mode task-review`, task-scoped log filenames, loop cap via `wazir capture loop-check`), (4) adds Codex Error Handling section, (5) adds standalone mode fallback, (6) changes all skill references from `superpowers:` to `wz:`, (7) adds "Review the wrong diff" to Red Flags, (8) removes the Example Workflow, Advantages detail, and Cost breakdown from superpowers. The diff-scoping and review-loop integration are structural process changes. | Delegating would lose diff-scoped reviews and Codex integration. The removed Example Workflow from superpowers is a useful teaching tool. |
+| tdd | test-driven-development | **Own** | Structurally rewritten. Superpowers has an exhaustive treatment (~370 lines): detailed Red-Green-Refactor with Good/Bad code examples, Iron Law with explicit "delete and start over" rules, a Verification Checklist, extensive Why Order Matters section, Common Rationalizations table, When Stuck guide, Testing Anti-Patterns reference, and Debugging Integration. Wazir condenses to ~45 lines with 3 steps (RED, GREEN, REFACTOR), adds a single-pass test quality check in RED phase ("Are these tests testing the right behavior? Are they real assertions?"), and removes all examples, rationalization tables, and elaboration. Different description and name (`wz:tdd` vs `test-driven-development`). | Delegating would lose the test quality check. The superpowers version's extensive rationalization prevention and examples are valuable for discipline enforcement but costly in tokens. |
+| using-git-worktrees | using-git-worktrees | **Own** | Reclassified from Augment to Own (R2). Skill shadowing is full-override, so Augment tier is not implementable via `~/.claude/skills/`. Wazir already carries the full content: superpowers core process (directory selection priority, safety verification with `git check-ignore`, creation steps, project setup auto-detection, clean baseline verification) preserved structurally intact. Wazir adds: Command Routing preamble, Codebase Exploration preamble, global directory changed from `~/.config/superpowers/worktrees/` to `~/.wazir/worktrees/`, Cleanup and Common Issues sections (submodules, lock files, stale worktrees). Drops superpowers-only sections: Example Workflow, Quick Reference table, Common Mistakes, Red Flags, Integration. | Dropped superpowers sections (Quick Reference, Common Mistakes, Red Flags, Integration) reduce operational guardrails. Could be recovered into the Own skill. |
+| using-skills | using-superpowers | **Own** | Structurally rewritten. Both enforce the same core rule (invoke skills before any response, even at 1% chance). But Wazir: (1) renames from `using-superpowers` to `using-skills`, (2) changes all internal skill references from `superpowers:` to `wz:` throughout flowchart and examples, (3) removes the Skill Types section detail about "Rigid vs Flexible" elaboration, (4) removes User Instructions elaboration. The name change and systematic `wz:` prefix replacement throughout the flowchart make this a namespace-level rewrite. | Could potentially be Augment if namespace mapping were handled at a routing layer rather than in-skill. |
+| verification | verification-before-completion | **Own** | Structurally rewritten. Superpowers has an exhaustive treatment (~140 lines): Iron Law, Gate Function (5-step IDENTIFY/RUN/READ/VERIFY/CLAIM), Common Failures table, Red Flags list, Rationalization Prevention table, Key Patterns (tests, regression, build, requirements, agent delegation), Why This Matters section with 24 failure memories, and When To Apply section. Wazir condenses to ~35 lines with 3 bullet requirements (what was verified, exact command, actual result), a minimum rule, and a brief "when verification fails" section. Different name (`wz:verification` vs `verification-before-completion`). | Delegating would lose the concise Wazir format. The superpowers version's extensive rationalization prevention is valuable for discipline but token-expensive. The Wazir version may be too terse to enforce the discipline effectively. |
+| wazir | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
+| writing-plans | writing-plans | **Own** | Structurally rewritten. Superpowers focuses on plan document format (header template, task structure with bite-sized steps, code examples in plan, execution handoff to subagent-driven or parallel session). Wazir: (1) changes inputs to "approved design or approved clarified direction" instead of "spec or requirements", (2) adds pipeline-aware output paths (`.wazir/runs/latest/clarified/execution-plan.md` and `.wazir/runs/latest/tasks/task-NNN/spec.md` vs. standalone `docs/plans/`), (3) removes the plan document format template entirely (no header template, no task structure template, no code examples), (4) adds Plan Review Loop section with `wz:reviewer --mode plan-review`, Codex integration via stdin pipe, Codex error handling, depth-aware pass counts, and standalone fallback. The plan review loop and pipeline path system are structural additions; the removal of the format template is a structural deletion. | Delegating would lose pipeline integration and plan review loop. The removed format template from superpowers is valuable for plan quality and could be worth recovering. |
+| writing-skills | writing-skills | **Own** | Structurally rewritten. Both share the TDD-for-skills philosophy and RED-GREEN-REFACTOR mapping. But Wazir: (1) condenses from ~650 lines to ~170 lines, (2) removes the extensive SKILL.md Structure template, CSO (Claude Search Optimization) section, Flowchart Usage guidelines, Code Examples guidelines, Token Efficiency section, File Organization examples, Testing All Skill Types section (discipline/technique/pattern/reference), Common Rationalizations for Skipping Testing table, Bulletproofing Skills Against Rationalization section (with Cialdini psychology reference), Skill Creation Checklist, Discovery Workflow, Anti-Patterns section, and STOP deployment gate, (3) adds "Be Prescriptive, Not Descriptive" guidance, "Use Rationalization Prevention" example, "Include Decision Trees" guidance, and skill reference syntax. The massive content reduction and different teaching approach make this a structural rewrite. | Delegating would lose the concise prescriptive format. The superpowers version's CSO guidelines, testing methodology, and anti-pattern catalog are extremely valuable reference material. |
+---
+## Superpowers Skills with No Wazir Counterpart
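+Superpowers skills with no Wazir fork would be listed here and could be used as-is via the superpowers plugin. At present there are none: the only entry below is a renamed skill, not a missing one.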
+| Superpowers Skill | Status | Notes |
+|---|---|---|
+| using-superpowers | Replaced by `wz:using-skills` | See using-skills row above. |
+All 14 superpowers skills have a Wazir counterpart (using-superpowers maps to using-skills, systematic-debugging maps to debugging, test-driven-development maps to tdd, verification-before-completion maps to verification).
+---
+## Summary by Tier
+| Tier | Count | Skills |
+|---|---|---|
+| **Own** | 25 | brainstorming, clarifier, debugging, design, dispatching-parallel-agents, executing-plans, executor, finishing-a-development-branch, humanize, init-pipeline, prepare-next, receiving-code-review, requesting-code-review, reviewer, run-audit, scan-project, self-audit, subagent-driven-development, tdd, using-git-worktrees, using-skills, verification, wazir, writing-plans, writing-skills |
+| **Augment** | 0 | _(none -- tier not implementable, see [Augment Mechanism](#augment-mechanism))_ |
+| **Delegate** | 0 | _(none)_ |
+---
+## Common Wazir Additions (Appear in All Forked Skills)
+Every Wazir fork of a superpowers skill adds these two preamble sections:
+1. **Command Routing** -- routes large commands to context-mode tools and small commands to native Bash, following `hooks/routing-matrix.json`.
+2. **Codebase Exploration** -- prescribes symbol-first exploration via `wazir index search-symbols` and `wazir recall`, with fallback to direct file reads.
+These preambles alone would have justified **Augment** tier for any skill with no other structural changes, had that tier been implementable (see [Augment Mechanism](#augment-mechanism)).
+---
+## Augment Mechanism
+**Research date:** 2026-03-19 (R2: Composition Infrastructure Validation)
+### Finding: Augment tier is not implementable
+The Augment tier assumed that placing a Wazir addendum at `~/.claude/skills/<skill-name>/SKILL.md` would layer Wazir context on top of the superpowers base skill. This assumption is wrong. **Skill shadowing is full-override, not merge/append.**
+### Evidence
+**1. `skills-core.js` `resolveSkillPath()` (superpowers v4.3.1)**
+The function at `lib/skills-core.js:108-140` checks the personal skills directory first. If `~/.claude/skills/<name>/SKILL.md` exists, it returns that file immediately and never reads the superpowers version. There is no content merging.
+```
+// Try personal skills first (unless explicitly superpowers:)
+if (!forceSuperpowers && personalDir) {
+    const personalSkillFile = path.join(personalDir, actualSkillName, 'SKILL.md');
+    if (fs.existsSync(personalSkillFile)) {
+        return { skillFile: personalSkillFile, sourceType: 'personal', ... };
+        // ^^^ returns here -- superpowers version never consulted
+    }
+}
+```
+**2. Superpowers test suite confirms override behavior**
+`tests/opencode/test-skills-core.sh` line 336 asserts:
+```
+[PASS] Personal skills shadow superpowers skills
+```
+The test creates `personal-skills/shared-skill/SKILL.md` and `superpowers-skills/shared-skill/SKILL.md`, resolves `shared-skill`, and verifies `sourceType` is `"personal"` -- the superpowers version is invisible.
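+The override can be reproduced in isolation. The following is a minimal sketch, not the superpowers code: `resolveSkill` and `writeSkill` are hypothetical helpers, and the temporary directory layout only mirrors the `personal-skills` / `superpowers-skills` fixture names from the test described above.
```javascript
// Sketch only: resolveSkill, writeSkill, and the directory layout are
// illustrative assumptions, not the superpowers implementation.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Return the first SKILL.md found, checking the personal directory first.
function resolveSkill(skillName, personalDir, superpowersDir) {
  const candidates = [
    { dir: personalDir, sourceType: 'personal' },
    { dir: superpowersDir, sourceType: 'superpowers' },
  ];
  for (const { dir, sourceType } of candidates) {
    const skillFile = path.join(dir, skillName, 'SKILL.md');
    if (fs.existsSync(skillFile)) {
      // First match wins; later candidates are never read or merged.
      return { skillFile, sourceType, content: fs.readFileSync(skillFile, 'utf8') };
    }
  }
  return null;
}

// Helper: write <root>/<skillName>/SKILL.md with the given body.
function writeSkill(root, skillName, body) {
  const dir = path.join(root, skillName);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'SKILL.md'), body);
}

const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-shadow-'));
const personalDir = path.join(tmp, 'personal-skills');
const superpowersDir = path.join(tmp, 'superpowers-skills');
writeSkill(superpowersDir, 'shared-skill', 'base process\n');
writeSkill(personalDir, 'shared-skill', 'addendum only\n');
writeSkill(superpowersDir, 'base-only-skill', 'base process\n');

const shadowed = resolveSkill('shared-skill', personalDir, superpowersDir);
const unshadowed = resolveSkill('base-only-skill', personalDir, superpowersDir);
console.log(shadowed.sourceType, JSON.stringify(shadowed.content));
console.log(unshadowed.sourceType);
```
+Because the first match returns immediately, a personal SKILL.md that contains only an addendum leaves the agent with the addendum and nothing else, which is why an Augment-style overlay cannot work under this lookup order.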
+**3. Superpowers RELEASE-NOTES.md v3.3.0**
+Line 385 documents the behavior explicitly: "Personal skills override superpowers skills when names match."
+**4. The `superpowers:` prefix bypass is not available in Claude Code**
+`skills-core.js` supports `superpowers:skill-name` syntax to force resolution to the superpowers version even when a personal skill shadows it. However, `skills-core.js` is only used by the OpenCode plugin (`/.opencode/plugins/superpowers.js`). Claude Code's native `Skill` tool has its own built-in resolution logic that does not expose this prefix bypass.
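+For illustration, the prefix handling might look like the sketch below. `parseSkillName` and `searchOrder` are hypothetical names and the parsing rule is an assumption; only the `superpowers:` prefix and the `forceSuperpowers` / `actualSkillName` variables come from the excerpt in item 1.
```javascript
// Sketch only: parseSkillName and searchOrder are hypothetical helpers; the
// real parsing in skills-core.js may differ.
const SUPERPOWERS_PREFIX = 'superpowers:';

// Split an optional "superpowers:" prefix off the requested skill name.
function parseSkillName(requested) {
  const forceSuperpowers = requested.startsWith(SUPERPOWERS_PREFIX);
  const actualSkillName = forceSuperpowers
    ? requested.slice(SUPERPOWERS_PREFIX.length)
    : requested;
  return { forceSuperpowers, actualSkillName };
}

// Decide which source directories a resolver would be allowed to consult.
function searchOrder(requested) {
  const { forceSuperpowers } = parseSkillName(requested);
  return forceSuperpowers ? ['superpowers'] : ['personal', 'superpowers'];
}

console.log(parseSkillName('superpowers:brainstorming'));
console.log(searchOrder('brainstorming'));
```
+A host that never runs this parsing step, as described above for Claude Code's native `Skill` tool, has no way to skip the personal directory, so a shadowing personal skill always wins there.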
+### Alternatives Considered
+| Approach | Viable? | Why |
+|---|---|---|
+| Place addendum in `~/.claude/skills/<name>/` | No | Full override -- base skill content lost |
+| Merge base + addendum in SKILL.md at install time | Partial | Would work but creates a maintenance coupling: every superpowers update requires re-merging. This is functionally identical to Own tier. |
+| Inject Wazir context via CLAUDE.md | No | CLAUDE.md is project-scoped; skill behavior should be global across all projects |
+| Use `superpowers:` prefix to load base, then append | No | Prefix only works in OpenCode's `skills-core.js`, not in Claude Code's native Skill tool |
+| Propose upstream merge/append feature | Future | Would require a superpowers or Claude Code platform change |
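+To make the maintenance coupling in the install-time merge row concrete, here is a minimal sketch under stated assumptions. `mergeSkill`, `isStale`, and the trailing marker comment are hypothetical and are not part of Wazir or superpowers; the point is only that a merged file embeds a snapshot of the base and must be regenerated when the base changes.
```javascript
// Sketch only: mergeSkill, isStale, and the marker comment format are
// hypothetical, not part of Wazir or superpowers.
const crypto = require('node:crypto');

const sha256 = (text) => crypto.createHash('sha256').update(text).digest('hex');

// Build a standalone SKILL.md body: base content, then the addendum, then a
// marker recording the hash of the base that was merged in.
function mergeSkill(baseContent, addendum) {
  const marker = `<!-- merged-from-base sha256:${sha256(baseContent)} -->`;
  return [baseContent.trimEnd(), '', addendum.trimEnd(), '', marker, ''].join('\n');
}

// A merged file is stale once the upstream base no longer matches the hash
// recorded at merge time (or when no marker is present at all).
function isStale(mergedContent, currentBase) {
  const match = mergedContent.match(/merged-from-base sha256:([0-9a-f]{64})/);
  return !match || match[1] !== sha256(currentBase);
}

const merged = mergeSkill('# Base skill v1\n', '## Wazir addendum\n');
console.log(isStale(merged, '# Base skill v1\n'));
console.log(isStale(merged, '# Base skill v2\n'));
```
+Any upstream edit flips `isStale` to `true`, so the merged copy has to be rebuilt after every superpowers release, which is the same upkeep an Own skill already requires.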
+### Conclusion
+The Augment tier is architecturally impossible with the current skill discovery mechanism. All three former Augment skills (dispatching-parallel-agents, finishing-a-development-branch, using-git-worktrees) are reclassified to **Own** tier. Since the Wazir versions already carry the full superpowers base content plus Wazir additions, no content is lost -- the skills simply cannot delegate to a shared base.
+If superpowers or Claude Code introduces a composition/layering mechanism in the future (e.g., `extends: superpowers:dispatching-parallel-agents` in frontmatter), the Augment tier could be revisited.
+---
+## Observations
+1. **No Delegate candidates exist.** Every Wazir fork adds at minimum the Command Routing and Codebase Exploration preambles, which prevents pure delegation.
+2. **Augment tier is not implementable.** R2 validation (2026-03-19) found that skill shadowing in both superpowers `skills-core.js` and Claude Code's native Skill tool is full-override: placing a SKILL.md in `~/.claude/skills/<name>/` completely replaces the superpowers skill with the same name. There is no merge or append mechanism. The three former Augment candidates (dispatching-parallel-agents, finishing-a-development-branch, using-git-worktrees) have been reclassified to Own. See [Augment Mechanism](#augment-mechanism) for full analysis.
+3. **All 14 forked skills are Own** because either (a) they introduce structural process changes (review loops, pipeline mode, Codex integration, content restructuring) or (b) the Augment composition mechanism does not exist in the platform.
+4. **Token cost tradeoff is significant.** Several Wazir Own skills (tdd, verification, debugging, writing-skills) are dramatically shorter than their superpowers counterparts. The superpowers versions contain valuable rationalization prevention tables, detailed examples, and anti-pattern catalogs that enforce discipline. The Wazir versions trade this for token efficiency. This tradeoff should be revisited -- some of the removed discipline content may be worth recovering as separate reference files.
+5. **The `wz:` prefix is already applied** in skill names within the Wazir SKILL.md frontmatter for all forked skills, consistent with the Own tier convention.
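+The naming convention in observation 5 is simple to check mechanically. The sketch below assumes a frontmatter block delimited by `---` lines with a `name:` key; `readFrontmatterName` and `checkSkillName` are hypothetical helpers and do not describe how Wazir itself validates skills.
```javascript
// Sketch only: readFrontmatterName and checkSkillName are hypothetical, and
// the frontmatter layout (a `name:` key between `---` lines) is assumed.
function readFrontmatterName(skillMarkdown) {
  const block = skillMarkdown.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!block) return null;
  const nameLine = block[1].split(/\r?\n/).find((line) => /^name\s*:/.test(line));
  if (!nameLine) return null;
  // Drop the key, surrounding whitespace, and optional quotes.
  return nameLine.replace(/^name\s*:\s*/, '').trim().replace(/^["']|["']$/g, '');
}

// Report whether a SKILL.md declares a name carrying the wz: prefix.
function checkSkillName(skillMarkdown) {
  const name = readFrontmatterName(skillMarkdown);
  if (name === null) return { ok: false, reason: 'missing frontmatter name' };
  if (!name.startsWith('wz:')) return { ok: false, reason: `"${name}" lacks the wz: prefix` };
  return { ok: true, name };
}

const good = '---\nname: wz:verification\ndescription: example\n---\n# Verification\n';
const bad = '---\nname: verification-before-completion\n---\n# Verification\n';
console.log(checkSkillName(good));
console.log(checkSkillName(bad));
```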

package/docs/reference/tooling-cli.md CHANGED

@@ -15,6 +15,7 @@ The `wazir` CLI is minimal on purpose. It exists to validate and export the host
 | `wazir validate commits` | implemented | Validates conventional commit format for commits in the range `--base..--head` (or auto-detected base to HEAD). |
 | `wazir validate changelog` | implemented | Validates `CHANGELOG.md` structure; with `--require-entries` and `--base`, enforces new entries since the base. |
 | `wazir validate docs-drift` | implemented | Detects when source files (roles, workflows, skills, hooks) change without corresponding documentation updates. Advisory by default; `--strict` exits non-zero on drift. |
+| `wazir validate skills` | implemented | Validates skill frontmatter and checks for name conflicts with superpowers skills (requires `wz:` prefix). Rejects any `CONTEXT.md` files (augment tier concluded not implementable in R2). |
 | `wazir validate artifacts` | reserved | Exits `2` until artifact-template and example validation expands. |
 | `wazir export build` | implemented | Generates host packages under `exports/hosts/*` from canonical sources. |
 | `wazir export --check` | implemented | Verifies generated host packages still match current canonical source hashes. |
@@ -28,7 +29,8 @@ The `wazir` CLI is minimal on purpose. It exists to validate and export the host
 | `wazir recall file` | implemented | Returns an exact line-bounded slice from an indexed file. Supports `--tier L0\|L1` for summary recall. |
 | `wazir recall symbol` | implemented | Returns an exact slice for an indexed symbol match. Supports `--tier L0\|L1` for summary recall. |
 | `wazir doctor` | implemented | Validates the active repo surface for manifest, hooks, state-root policy, and host export directory presence. |
-| `wazir status` | implemented | Reads run status directly from `<state-root>/runs/<run-id>/status.json`. |
+| `wazir status` | implemented | Reads run status directly from `<state-root>/runs/<run-id>/status.json`. Includes a one-line context savings summary when usage data is available. |
+| `wazir stats` | implemented | Shows token savings statistics for a run, including total queries, estimated tokens saved, bytes avoided, per-tool breakdown, and overall savings ratio. |
 | `wazir capture init` | implemented | Creates a run ledger with `status.json`, `events.ndjson`, and a captures directory under the configured state root. |
 | `wazir capture event` | implemented | Appends a run event and can update phase, status, and loop counts in `status.json`. |
 | `wazir capture route` | implemented | Reserves a run-local capture file path for large tool output. |

package/docs/truth-claims.yaml CHANGED

@@ -130,6 +130,12 @@
   subject: wazir status
   verifier: command_registry
   required: true
+- id: command-stats
+  file: docs/reference/tooling-cli.md
+  claim_type: command
+  subject: wazir stats
+  verifier: command_registry
+  required: true
 - id: command-capture-family
   file: docs/reference/tooling-cli.md
   claim_type: command
@@ -202,6 +208,12 @@
   subject: wazir validate docs-drift
   verifier: command_registry
   required: true
+- id: command-validate-skills
+  file: docs/reference/tooling-cli.md
+  claim_type: command
+  subject: wazir validate skills
+  verifier: command_registry
+  required: true
 - id: generated-claude-package
   file: docs/reference/host-exports.md
   claim_type: generated_file