npm - @wazir-dev/cli - Versions diffs - 1.0.0 → 1.2.0 - Mend

@wazir-dev/cli 1.0.0 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (163)

package/CHANGELOG.md +100 -2
package/README.md +6 -6
package/docs/concepts/architecture.md +1 -1
package/docs/concepts/roles-and-workflows.md +2 -0
package/docs/concepts/why-wazir.md +59 -0
package/docs/decisions/2026-03-19-deferred-items.md +564 -0
package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +1 -1
package/docs/readmes/INDEX.md +21 -5
package/docs/readmes/features/expertise/README.md +2 -2
package/docs/readmes/features/exports/README.md +2 -2
package/docs/readmes/features/schemas/README.md +3 -0
package/docs/readmes/features/skills/README.md +17 -0
package/docs/readmes/features/skills/clarifier.md +5 -0
package/docs/readmes/features/skills/claude-cli.md +5 -0
package/docs/readmes/features/skills/codex-cli.md +5 -0
package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
package/docs/readmes/features/skills/executing-plans.md +5 -0
package/docs/readmes/features/skills/executor.md +5 -0
package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
package/docs/readmes/features/skills/gemini-cli.md +5 -0
package/docs/readmes/features/skills/humanize.md +5 -0
package/docs/readmes/features/skills/init-pipeline.md +5 -0
package/docs/readmes/features/skills/receiving-code-review.md +5 -0
package/docs/readmes/features/skills/requesting-code-review.md +5 -0
package/docs/readmes/features/skills/reviewer.md +5 -0
package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
package/docs/readmes/features/skills/wazir.md +5 -0
package/docs/readmes/features/skills/writing-skills.md +5 -0
package/docs/readmes/features/workflows/prepare-next.md +1 -1
package/docs/reference/configuration-reference.md +47 -6
package/docs/reference/launch-checklist.md +4 -4
package/docs/reference/review-loop-pattern.md +538 -0
package/docs/reference/roles-reference.md +1 -0
package/docs/reference/skill-tiers.md +147 -0
package/docs/reference/tooling-cli.md +5 -1
package/docs/truth-claims.yaml +18 -0
package/expertise/antipatterns/process/ai-coding-antipatterns.md +97 -1
package/exports/hosts/claude/.claude/agents/clarifier.md +3 -0
package/exports/hosts/claude/.claude/agents/designer.md +3 -0
package/exports/hosts/claude/.claude/agents/executor.md +2 -0
package/exports/hosts/claude/.claude/agents/planner.md +3 -0
package/exports/hosts/claude/.claude/agents/researcher.md +2 -0
package/exports/hosts/claude/.claude/agents/reviewer.md +5 -1
package/exports/hosts/claude/.claude/agents/specifier.md +3 -0
package/exports/hosts/claude/.claude/commands/clarify.md +4 -0
package/exports/hosts/claude/.claude/commands/design-review.md +4 -0
package/exports/hosts/claude/.claude/commands/design.md +4 -0
package/exports/hosts/claude/.claude/commands/discover.md +4 -0
package/exports/hosts/claude/.claude/commands/execute.md +4 -0
package/exports/hosts/claude/.claude/commands/plan-review.md +4 -0
package/exports/hosts/claude/.claude/commands/plan.md +4 -0
package/exports/hosts/claude/.claude/commands/spec-challenge.md +4 -0
package/exports/hosts/claude/.claude/commands/specify.md +4 -0
package/exports/hosts/claude/.claude/commands/verify.md +4 -0
package/exports/hosts/claude/.claude/settings.json +9 -0
package/exports/hosts/claude/CLAUDE.md +1 -1
package/exports/hosts/claude/export.manifest.json +22 -20
package/exports/hosts/claude/host-package.json +3 -1
package/exports/hosts/codex/AGENTS.md +1 -1
package/exports/hosts/codex/export.manifest.json +22 -20
package/exports/hosts/codex/host-package.json +3 -1
package/exports/hosts/cursor/.cursor/hooks.json +4 -0
package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
package/exports/hosts/cursor/export.manifest.json +22 -20
package/exports/hosts/cursor/host-package.json +3 -1
package/exports/hosts/gemini/GEMINI.md +1 -1
package/exports/hosts/gemini/export.manifest.json +22 -20
package/exports/hosts/gemini/host-package.json +3 -1
package/hooks/context-mode-router +191 -0
package/hooks/definitions/context_mode_router.yaml +19 -0
package/hooks/definitions/loop_cap_guard.yaml +1 -1
package/hooks/hooks.json +43 -0
package/hooks/protected-path-write-guard +8 -0
package/hooks/routing-matrix.json +45 -0
package/hooks/session-start +62 -1
package/llms-full.txt +905 -132
package/package.json +3 -3
package/roles/clarifier.md +3 -0
package/roles/designer.md +3 -0
package/roles/executor.md +2 -0
package/roles/planner.md +3 -0
package/roles/researcher.md +2 -0
package/roles/reviewer.md +5 -1
package/roles/specifier.md +3 -0
package/schemas/hook.schema.json +2 -1
package/schemas/phase-report.schema.json +80 -0
package/schemas/usage.schema.json +25 -1
package/schemas/wazir-manifest.schema.json +19 -0
package/skills/brainstorming/SKILL.md +20 -56
package/skills/clarifier/SKILL.md +243 -0
package/skills/claude-cli/SKILL.md +320 -0
package/skills/codex-cli/SKILL.md +260 -0
package/skills/debugging/SKILL.md +24 -1
package/skills/design/SKILL.md +13 -0
package/skills/dispatching-parallel-agents/SKILL.md +13 -0
package/skills/executing-plans/SKILL.md +28 -2
package/skills/executor/SKILL.md +129 -0
package/skills/finishing-a-development-branch/SKILL.md +13 -0
package/skills/gemini-cli/SKILL.md +260 -0
package/skills/humanize/SKILL.md +13 -0
package/skills/init-pipeline/SKILL.md +76 -78
package/skills/prepare-next/SKILL.md +81 -10
package/skills/receiving-code-review/SKILL.md +21 -0
package/skills/requesting-code-review/SKILL.md +38 -5
package/skills/reviewer/SKILL.md +423 -0
package/skills/run-audit/SKILL.md +13 -0
package/skills/scan-project/SKILL.md +13 -0
package/skills/self-audit/SKILL.md +197 -16
package/skills/subagent-driven-development/SKILL.md +38 -2
package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
package/skills/subagent-driven-development/implementer-prompt.md +8 -0
package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
package/skills/tdd/SKILL.md +21 -0
package/skills/using-git-worktrees/SKILL.md +13 -0
package/skills/using-skills/SKILL.md +13 -0
package/skills/verification/SKILL.md +13 -0
package/skills/wazir/SKILL.md +286 -262
package/skills/writing-plans/SKILL.md +44 -4
package/skills/writing-skills/SKILL.md +13 -0
package/templates/artifacts/implementation-plan.md +3 -0
package/templates/artifacts/tasks-template.md +133 -0
package/templates/examples/phase-report.example.json +48 -0
package/templates/examples/wazir-manifest.example.yaml +1 -1
package/tooling/src/adapters/composition-engine.js +256 -0
package/tooling/src/adapters/model-router.js +84 -0
package/tooling/src/capture/command.js +111 -2
package/tooling/src/capture/run-config.js +23 -0
package/tooling/src/capture/store.js +24 -0
package/tooling/src/capture/usage.js +106 -0
package/tooling/src/checks/ac-matrix.js +256 -0
package/tooling/src/checks/brand-truth.js +3 -6
package/tooling/src/checks/command-registry.js +13 -0
package/tooling/src/checks/docs-truth.js +1 -1
package/tooling/src/checks/runtime-surface.js +3 -7
package/tooling/src/checks/skills.js +111 -0
package/tooling/src/cli.js +17 -3
package/tooling/src/commands/stats.js +161 -0
package/tooling/src/commands/validate.js +5 -1
package/tooling/src/export/compiler.js +33 -37
package/tooling/src/gating/agent.js +145 -0
package/tooling/src/guards/phase-prerequisite-guard.js +127 -0
package/tooling/src/hooks/routing-logic.js +69 -0
package/tooling/src/init/auto-detect.js +260 -0
package/tooling/src/init/command.js +161 -0
package/tooling/src/input/scanner.js +46 -0
package/tooling/src/reports/command.js +103 -0
package/tooling/src/reports/phase-report.js +323 -0
package/tooling/src/state/command.js +160 -0
package/tooling/src/state/db.js +287 -0
package/tooling/src/status/command.js +53 -1
package/wazir.manifest.yaml +26 -17
package/workflows/clarify.md +4 -0
package/workflows/design-review.md +4 -0
package/workflows/design.md +4 -0
package/workflows/discover.md +4 -0
package/workflows/execute.md +4 -0
package/workflows/plan-review.md +4 -0
package/workflows/plan.md +4 -0
package/workflows/spec-challenge.md +4 -0
package/workflows/specify.md +4 -0
package/workflows/verify.md +4 -0

package/CHANGELOG.md CHANGED

@@ -1,9 +1,48 @@
-# 1.0.0 (2026-03-17)
+# [1.2.0](https://github.com/MohamedAbdallah-14/Wazir/compare/v1.1.0...v1.2.0) (2026-03-19)
+### Bug Fixes
+* address 4 Codex review findings — nested payload, fallback, hashing, freshness key ([2276cae](https://github.com/MohamedAbdallah-14/Wazir/commit/2276caefc3fb22ab2ed1cd1b78152f64f3e5685c))
+* address 5 Codex review findings — routing state root, stats accuracy, Cursor hooks ([2cb21ba](https://github.com/MohamedAbdallah-14/Wazir/commit/2cb21ba258f63a663b8eddc0cc25322900022125))
+* address final review findings — context-mode CLI detection, AC heading overlap, CHANGELOG ([c33947f](https://github.com/MohamedAbdallah-14/Wazir/commit/c33947f8481647017506b0929ff87511d5dc6cad))
+* **hooks:** canonicalize hook registry and fix Claude Code payload mapping (I9) ([3e8810a](https://github.com/MohamedAbdallah-14/Wazir/commit/3e8810af7625de206979ebb356eeae6b7a1b5e67))
+* **self-audit:** loop 1 — add missing run-audit workflow to reference docs ([f830d84](https://github.com/MohamedAbdallah-14/Wazir/commit/f830d842cad4756db3311ff5cedb98fdbb5b0f72))
+* **self-audit:** loop 2 — add run-audit to reference docs, register 2 unlisted test files ([3e65c89](https://github.com/MohamedAbdallah-14/Wazir/commit/3e65c89b0532eed28313a17898abdf3627a5dadf))
+* **self-audit:** loop 3 — fix expertise count drift (261/308→268), schema count (16/18→19), regenerate exports ([500df05](https://github.com/MohamedAbdallah-14/Wazir/commit/500df057c59bcf82d691e52d0d9e09e8cec33edc))
+* **self-audit:** loop 4 — remove unused gray-matter dep, complete skill roster (11→28), add 17 skill readme stubs ([87047d1](https://github.com/MohamedAbdallah-14/Wazir/commit/87047d1f009b2f6cb1c73cd8db69c97d740c6385))
+* **self-audit:** loop 5 — fix INDEX.md counts (60→76 readmes, 11→28 skills), add 3 missing schemas to catalog, fix export diagram counts, remove gray-matter ref, regenerate exports ([92d187c](https://github.com/MohamedAbdallah-14/Wazir/commit/92d187c12786a6d19ec3c65c2d4b09499d853582))
 ### Features
-* Wazir v0.1.0 - Engineering with itqan ([d9a5c1b](https://github.com/MohamedAbdallah-14/Wazir/commit/d9a5c1bf1ffe615f67d55181458ead68e5cf7ecf))
+* **clarifier:** auto-detect content needs and enable author workflow ([c0f9523](https://github.com/MohamedAbdallah-14/Wazir/commit/c0f95230e0ef104bcbfb0b9bad16a0ab09ba9fb4)), closes [#17](https://github.com/MohamedAbdallah-14/Wazir/issues/17)
+* **clarifier:** context-mode fallbacks (item-6) ([4694143](https://github.com/MohamedAbdallah-14/Wazir/commit/4694143028066e373728feb1e5100cdc5fb6aec2))
+* **clarifier:** gap analysis exit gate (item-3) ([83df703](https://github.com/MohamedAbdallah-14/Wazir/commit/83df703f251a541c2155a2f360aa2e7ec5206f02))
+* **clarifier:** online research phase (item-9) ([0ac5ac8](https://github.com/MohamedAbdallah-14/Wazir/commit/0ac5ac849edeed975fc5051c04a3a234e9ca7994))
+* **clarifier:** preserve input quality (item-1) ([ba46424](https://github.com/MohamedAbdallah-14/Wazir/commit/ba46424d4900a1b13f1456156cb22f54e9b5ba1d))
+* **clarifier:** reviewer skill invocation policy (item-13) ([4b3f59d](https://github.com/MohamedAbdallah-14/Wazir/commit/4b3f59dd7ae3f7f93b4d1f7d7da070dc98c3a369))
+* **clarifier:** run-scoped user feedback routing (item-11) ([5436746](https://github.com/MohamedAbdallah-14/Wazir/commit/5436746d4da453c23c213db7c11e4497870352da))
+* **clarifier:** spec-kit plan format (item-2) ([0266247](https://github.com/MohamedAbdallah-14/Wazir/commit/02662474db97d87c2ce6f82b2e2a7b960d386d00))
+* **executor:** changelog and gitflow enforcement (item-5) ([5a5986a](https://github.com/MohamedAbdallah-14/Wazir/commit/5a5986a0131bbefe7d601e799411ae48cf68fe10))
+* **hooks:** add index refresh to session-start hook (D2) ([ff4647f](https://github.com/MohamedAbdallah-14/Wazir/commit/ff4647f2fc18e963a8180ec3b93937ce70d33be4))
+* **hooks:** extract routing logic and add context-mode router tests (D1/D3) ([1e7650a](https://github.com/MohamedAbdallah-14/Wazir/commit/1e7650a7cf5af944807897f5a3200a69b691cd5f)), closes [passthrou#vs-large](https://github.com/passthrou/issues/vs-large)
+* implement all 8 remaining enhancement items ([00acff5](https://github.com/MohamedAbdallah-14/Wazir/commit/00acff57e7f0bc9515ef8cce8acf9596577fca83))
+* **init-pipeline:** context-mode detection (item-6-init) ([839a5b3](https://github.com/MohamedAbdallah-14/Wazir/commit/839a5b3b41b0246468d4e9b0da61c4f17e7c3c41))
+* **input:** extract input scanner utility and verify scanning (I3) ([8a232a2](https://github.com/MohamedAbdallah-14/Wazir/commit/8a232a2312e4a42b1aefc5c9c6283d738c58f0c4))
+* **review-loop:** fix-and-loop with convergence cap (item-12) ([d2fcb9c](https://github.com/MohamedAbdallah-14/Wazir/commit/d2fcb9c7bd93407a88dda2c8c96ffd397834f268))
+* **review-loop:** phase scoring with dimension deltas (item-15) ([1fe9f6d](https://github.com/MohamedAbdallah-14/Wazir/commit/1fe9f6d85d36d610fb6c48e2e1035d2f71134832))
+* **reviewer:** codex output context protection (item-17) ([abb8a77](https://github.com/MohamedAbdallah-14/Wazir/commit/abb8a77e922268adbfaee721facbdcbaa1e30646))
+* **skills:** interactive numbered options at all checkpoints (item-10) ([45bc6fd](https://github.com/MohamedAbdallah-14/Wazir/commit/45bc6fdf5f8feb558a89823befa72b0ca78cdce3))
+* **templates:** create spec-kit task template (item-8) ([cd55d73](https://github.com/MohamedAbdallah-14/Wazir/commit/cd55d73316a59437c76ddeb22643020a60d76b55))
+* **tooling:** AC verification scaffold — 111 checks (T000) ([ea61684](https://github.com/MohamedAbdallah-14/Wazir/commit/ea616843b5deca97ead38b8db06c1a1eef15458f))
+* **wazir:** enforce pipeline phases — agent must never skip phases ([3e21bd2](https://github.com/MohamedAbdallah-14/Wazir/commit/3e21bd2d36af57d0db4b8dc68458cc63bbb8676a))
+* **wazir:** full end-of-phase reports (item-16) ([6c84455](https://github.com/MohamedAbdallah-14/Wazir/commit/6c84455e867eb19d49b69cfcb130b147a80917f9))
+* **wazir:** implement 9 enhancement decisions from brainstorming session ([885a2c1](https://github.com/MohamedAbdallah-14/Wazir/commit/885a2c1f04fc9ac02c65ed06461114e4c3251393))
+* **wazir:** implement learning system — extraction, injection, and handoff ([06eb107](https://github.com/MohamedAbdallah-14/Wazir/commit/06eb107e8641b50e075ca1744f899da0fe9d09e6)), closes [#1](https://github.com/MohamedAbdallah-14/Wazir/issues/1) [#2](https://github.com/MohamedAbdallah-14/Wazir/issues/2) [#13](https://github.com/MohamedAbdallah-14/Wazir/issues/13)
+* **wazir:** restructure pipeline from 14 phases to 4 (Init, Clarifier, Executor, Final Review) ([d6e2372](https://github.com/MohamedAbdallah-14/Wazir/commit/d6e2372d5a0c6824905569de26ddb2434eb74dca)), closes [#4](https://github.com/MohamedAbdallah-14/Wazir/issues/4) [#5](https://github.com/MohamedAbdallah-14/Wazir/issues/5)
+* **wazir:** resume copies artifacts (item-4) ([14633d4](https://github.com/MohamedAbdallah-14/Wazir/commit/14633d4cd66d04846b3297c3d50d99046a89fb7c))
+* **wazir:** usage reports at phase exits (item-7) ([8f055be](https://github.com/MohamedAbdallah-14/Wazir/commit/8f055be12992c56e01619e2a9547dc3e9045dc7c))
 # Changelog
@@ -12,3 +51,62 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/), and this project adheres to [Semantic Versioning](https://semver.org/).
 ## [Unreleased]
+### Changed
+- Restructured pipeline from 14 micro-phases to 4 main phases: Init, Clarifier, Executor, Final Review
+- Removed depth and intent questions from pipeline init — depth defaults to standard (override via inline modifiers), intent inferred from request keywords
+- Enabled learn + prepare-next workflows by default (part of Final Review phase)
+- Renamed `phase_policy` to `workflow_policy` in run-config (legacy name still supported)
+- Pipeline init no longer asks about Agent Teams — always sequential
+- Input directory (`input/`) now scanned automatically at startup
+- Learning extraction with concrete proposal format in reviewer final mode
+- Accepted learnings injected into clarifier context (top 10 by confidence, scope-matched)
+- Prepare-next skill produces structured handoff document
+### Fixed
+- Router logs now write to manifest-derived state root instead of `_default` (Codex P1)
+- Routing log replay scoped to current run via timestamp filtering (Codex P2)
+- Index-query savings now computed from avoided bytes, not raw bytes (Codex P2)
+- Index-query savings included in savings-ratio denominator (Codex P2)
+- Cursor export now includes context-mode-router hook (Codex P2)
+### Added
+- Core review loop pattern across all pipeline phases with Codex CLI integration
+- `wazir capture loop-check` CLI subcommand with task-scoped cap tracking and run-config loader
+- `wazir init` interactive CLI command with arrow-key selection (depth, intent, teams, codex model)
+- `docs/reference/review-loop-pattern.md` canonical reference for the review loop pattern
+- Standalone skills: `/wazir:clarifier`, `/wazir:executor`, `/wazir:reviewer`
+- Agent Teams real implementation in brainstorming (TeamCreate, SendMessage, TeamDelete)
+- Codex prompt templates (artifact + code) with "Do NOT load skills" instruction
+- Git branch enforcement in `/wazir` runner (validates branch, offers to create feature branch)
+- CLI wiring across pipeline phases (doctor gate, index build/refresh, capture events, validate gates)
+- CHANGELOG enforcement in executor and reviewer skills
+- 10 new tests: 7 for handleLoopCheck, 4 for init command (406 total)
+- Spec-kit task template (`templates/artifacts/tasks-template.md`) with checklist format, phase structure, parallel markers, MVP strategy
+- AC verification scaffold (`tooling/src/checks/ac-matrix.js`) — 111 automated acceptance criteria checks
+- Context-mode detection in `wazir init` (3 core tools + optional execute_file under MCP prefix)
+- Input preservation logic in clarifier (adopt input specs verbatim, never remove detail)
+- Gap analysis exit gate in clarifier (invoke wz:reviewer --mode plan-review, fix-and-loop)
+- Online research in clarifier Phase 0 (keyword extraction, fetch_and_index/WebFetch, error handling)
+- Codex output context protection (tee + extract via execute_file, fail-closed fallback)
+- Resume detection with staleness check and interactive checkpoint in /wazir runner
+- Usage capture at every phase_exit event
+- Run-scoped user feedback routing (plan corrections vs scope changes)
+- Phase scoring with canonical dimension sets and quality delta reporting
+- Full end-of-phase reports (7 sections: Summary, Key Changes, Quality Delta, Findings Log, Usage, Context Savings, Time Spent)
+### Changed
+- All Codex CLI calls now read model from `config.multi_tool.codex.model` with fallback to `gpt-5.4`
+- Producer-reviewer separation enforced: no role reviews its own output
+- Reviewer skill is phase-aware with 7 explicit modes (final, spec-challenge, design-review, plan-review, task-review, research-review, clarification-review)
+- Brainstorming design-review gate replaces direct handoff to writing-plans
+- Clarifier delegates research to discover workflow, spec to specify workflow, planning to writing-plans
+- `/wazir` runner pipeline rewritten with all manifest phases and review loops
+- Wazir CLI is now required (removed "Skip" option)
+- Fixed pass counts: quick=3, standard=5, deep=7 (no extension)
+- Clarifier now invokes `wz:reviewer --mode` explicitly instead of ad-hoc codex calls
+- Fix-and-loop pattern: re-submission after fixes is mandatory, "fix and continue" prohibited
+- Review loop escalation at cap: 3 user options (approve-with-issues, fix-manually, abort)
+- CHANGELOG/gitflow hard gates before PR (validate changelog + validate commits)
+- All checkpoints use numbered interactive options with (Recommended) markers
+- Reviewer documents 5 owned responsibilities (Codex integration, dimensions, pass counting, attribution, dimension set recording)
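
The review-loop entries above (fixed pass counts, mandatory re-submission after fixes, escalation at the cap) can be sketched roughly as follows. This is an illustrative reading of the changelog, not the package's implementation: `run_review_loop`, `review`, and `fix` are hypothetical names, and only the pass counts and the three escalation options come from the entries above.

```python
from typing import Callable, List

# Pass caps per depth, as listed in the changelog ("no extension").
PASS_CAPS = {"quick": 3, "standard": 5, "deep": 7}

# The three user options the changelog names for a loop that reaches its cap.
ESCALATION_OPTIONS = ("approve-with-issues", "fix-manually", "abort")


def run_review_loop(
    depth: str,
    review: Callable[[str], List[str]],     # hypothetical: returns open findings
    fix: Callable[[str, List[str]], str],   # hypothetical: returns a revised artifact
    artifact: str,
) -> dict:
    """Review, fix, and re-submit until clean or until the pass cap is reached."""
    cap = PASS_CAPS[depth]
    findings: List[str] = []
    for pass_number in range(1, cap + 1):
        findings = review(artifact)
        if not findings:
            return {"status": "approved", "passes": pass_number, "artifact": artifact}
        if pass_number == cap:
            break  # no passes left to re-review a fix, so stop and escalate
        # Fix-and-loop: every fix is followed by another review pass.
        artifact = fix(artifact, findings)
    return {
        "status": "escalate",
        "passes": cap,
        "findings": findings,
        "options": ESCALATION_OPTIONS,
        "artifact": artifact,
    }
```

In this sketch a fix is never applied on the final pass, because nothing would re-review it; that choice is an assumption made to honor the "fix and continue prohibited" entry.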

package/README.md CHANGED

@@ -32,7 +32,7 @@
 I'm Mohamed Abdallah. I kept watching AI agents write confident code that broke in production, skip tests, and forget what we agreed on yesterday. So I stopped asking them to be better and built them an engineering department instead.
 **Wazir puts engineering discipline inside AI coding agents.**
-No wrapper. No server. Just structure -- inside Claude, Codex, Gemini, and Cursor. Built on 300+ research sources distilled into 261 curated expertise modules across 12 domains.
+No wrapper. No server. Just structure -- inside Claude, Codex, Gemini, and Cursor. Built on 300+ research sources distilled into 268 curated expertise modules across 12 domains.
 ---
@@ -122,9 +122,9 @@ Three concepts.
 **1 -- Roles are isolation boundaries, not personas.** Each of the 10 roles has defined inputs, allowed tools, required outputs, escalation rules, and failure conditions. An agent inside a role cannot write to protected paths, cannot skip required outputs, and must escalate when ambiguity conditions are met. The discipline is structural, not instructional. See [Roles & Workflows](docs/concepts/roles-and-workflows.md).
-**2 -- Phases are artifact checkpoints, not conversation stages.** Every phase consumes a named artifact from the previous phase and produces a named artifact for the next. Nothing flows through conversation history. A session can end, a new agent can pick up the artifacts, and delivery continues. The handoff is explicit, structured, and schema-validated against 18 JSON schemas. See [Architecture](docs/concepts/architecture.md).
+**2 -- Phases are artifact checkpoints, not conversation stages.** Every phase consumes a named artifact from the previous phase and produces a named artifact for the next. Nothing flows through conversation history. A session can end, a new agent can pick up the artifacts, and delivery continues. The handoff is explicit, structured, and schema-validated against 19 JSON schemas. See [Architecture](docs/concepts/architecture.md).
-**3 -- The composition engine loads the right expert automatically.** One agent pretending to be an expert in everything is an expert in nothing. A 4-layer system (always, auto, stacks, concerns) decides which of 261 expertise modules load into each role's context. The executor gets modules on how to build. The verifier gets modules on what to detect. The reviewer gets modules on what to flag. All resolved automatically from the task's declared stack and concerns. Max 15 modules per dispatch, token budget enforced.
+**3 -- The composition engine loads the right expert automatically.** One agent pretending to be an expert in everything is an expert in nothing. A 4-layer system (always, auto, stacks, concerns) decides which of 268 expertise modules load into each role's context. The executor gets modules on how to build. The verifier gets modules on what to detect. The reviewer gets modules on what to flag. All resolved automatically from the task's declared stack and concerns. Max 15 modules per dispatch, token budget enforced.
 ---
@@ -171,7 +171,7 @@ Run `wazir capture usage` at the end of a session to see the savings:
 **Adversarial review at three chokepoints.** Spec-challenge, plan-review, and final review run by the reviewer role, never the phase author. Nine hard approval gates span the 14-phase pipeline. Nothing advances without explicit clearance. [Architecture](docs/concepts/architecture.md)
-**261 curated expertise modules across 12 domains.** Loaded selectively per role per phase via a 4-layer composition engine. Max 15 modules per dispatch, token budget enforced. Wazir ships with 261. Yours could be next. [Expertise index](docs/reference/expertise-index.md)
+**268 curated expertise modules across 12 domains.** Loaded selectively per role per phase via a 4-layer composition engine. Max 15 modules per dispatch, token budget enforced. Wazir ships with 268. Yours could be next. [Expertise index](docs/reference/expertise-index.md)
 **Three-tier recall for token savings.** L0 (~~100 tokens), L1 (~~500-2k tokens), direct read for full source. Symbol-first exploration searches the index before reading source. Capture routing redirects large tool output to files. Result: 60-80% token reduction on exploration-heavy phases, measured per-session by `wazir capture usage`. [Indexing and Recall](docs/concepts/indexing-and-recall.md)
@@ -200,7 +200,7 @@ The AI coding tool space is fragmenting. Developers bolt together separate plugi
 | **Phase model**        | 14 explicit, artifact-gated   | 7-step (advisory)                                  | 3-step                                         | 1 (generate/test)                                       | N/A                                             | N/A                                                    | 5-step pipeline                                        |
 | **Adversarial review** | 3 gate phases                 | Code review skill                                  | No                                             | No                                                      | No                                              | No                                                     | team-verify step                                       |
 | **Context management** | L0/L1 tiered recall           | None                                               | None                                           | None                                                    | LLM compression                                 | Vector DB (ChromaDB)                                   | Token routing                                          |
-| **Schema validation**  | 18 JSON schemas               | No                                                 | No                                             | No                                                      | No                                              | No                                                     | No                                                     |
+| **Schema validation**  | 19 JSON schemas               | No                                                 | No                                             | No                                                      | No                                              | No                                                     | No                                                     |
 | **Guardrails**         | 7 hook contracts              | None                                               | None                                           | None                                                    | None                                            | 5 hooks (memory)                                       | Agent tracking                                         |
 | **External deps**      | None (host-native)            | None (prompt-only)                                 | Python CLI                                     | Node.js CLI                                             | Node.js + LLM                                   | ChromaDB, SQLite, Bun                                  | tmux, exp. teams API                                   |
 | **Host support**       | Claude, Codex, Gemini, Cursor | Claude, Codex, Gemini, Cursor, OpenCode            | Claude, Copilot, Gemini                        | Any LLM provider                                        | Any LLM                                         | Claude Code only                                       | Claude Code (+ workers)                                |
@@ -265,7 +265,7 @@ The pipeline, roles, and expertise modules are stable and used in production by
 What's solid:
 - The 14-phase pipeline and 10 role contracts
-- 261 expertise modules across 12 domains
+- 268 expertise modules across 12 domains
 - Host exports for Claude, Codex, Gemini, and Cursor
 - The composition engine and tiered recall system
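
The README text in this diff describes a 4-layer composition engine (always, auto, stacks, concerns) with a cap of 15 modules per dispatch and an enforced token budget. A minimal sketch of that selection logic, under stated assumptions, might look like the code below. The `Module` type, the `compose` function, the layer precedence, and the skip-if-too-large policy are all hypothetical; the diff does not show how `composition-engine.js` actually resolves layers.

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_MODULES = 15  # per-dispatch cap stated in the README

# Assumed precedence: layers are visited in the order the README lists them.
LAYER_ORDER = ("always", "auto", "stacks", "concerns")


@dataclass(frozen=True)
class Module:
    path: str    # hypothetical identifier for an expertise module
    tokens: int  # hypothetical estimated token cost


def compose(layers: Dict[str, List[Module]], token_budget: int) -> List[Module]:
    """Pick modules layer by layer, respecting the module cap and token budget."""
    selected: List[Module] = []
    seen = set()
    used = 0
    for layer in LAYER_ORDER:
        for module in layers.get(layer, []):
            if module.path in seen:
                continue  # a module listed in several layers is loaded once
            if len(selected) >= MAX_MODULES:
                return selected
            if used + module.tokens > token_budget:
                continue  # does not fit; a smaller later module still might
            selected.append(module)
            seen.add(module.path)
            used += module.tokens
    return selected
```

Visiting layers in a fixed order means the "always" layer wins when the budget is tight, which seems the natural reading of the layer names but is not confirmed by the source.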

package/docs/concepts/architecture.md CHANGED

@@ -10,7 +10,7 @@ Wazir is a host-native engineering OS kit. The host environment (Claude, Codex,
 | Workflows | Phase entrypoints that sequence roles through delivery |
 | Skills | Reusable procedures (wz:tdd, wz:debugging, wz:verification, wz:brainstorming) |
 | Hooks | Guardrails enforcing protected paths, loop caps, and capture routing |
-| Expertise | 308 curated knowledge modules composed into agent prompts |
+| Expertise | 268 curated knowledge modules composed into agent prompts |
 | Templates | Artifact templates for phase outputs and handoff |
 | Schemas | Validation schemas for manifest, hooks, artifacts, and exports |
 | Exports | Generated host packages tailored per supported host |

package/docs/concepts/roles-and-workflows.md CHANGED

@@ -31,6 +31,8 @@ The canonical workflow sequence is:
 13. **learn** — capture scoped learnings
 14. **prepare-next** — produce a clean handoff for the next run
+Additionally, **run-audit** is a standalone workflow that can be invoked outside the linear pipeline to perform structured codebase audits with source-backed findings.
 ## Role routing
 The orchestrator dispatches three roles per task: `executor`, `reviewer`, and `verifier`. By default, all three run for every task. The `required_roles` field in a task's YAML frontmatter controls which roles are dispatched, allowing the orchestrator to skip unnecessary roles and save context window budget.
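
The role-routing rule above (three roles by default, narrowed by `required_roles` in the task frontmatter) could be sketched like this. The function name, the already-parsed `frontmatter` dict, and the choice to reject unknown role names are assumptions for illustration; only the three role names and the `required_roles` field come from the text.

```python
from typing import List

# The three roles the orchestrator dispatches by default, per the text above.
DEFAULT_ROLES = ("executor", "reviewer", "verifier")


def roles_to_dispatch(frontmatter: dict) -> List[str]:
    """Return the roles to run for a task, given its parsed YAML frontmatter."""
    requested = frontmatter.get("required_roles")
    if requested is None:
        return list(DEFAULT_ROLES)  # field absent: all three roles run
    unknown = [role for role in requested if role not in DEFAULT_ROLES]
    if unknown:
        # Assumption: an unrecognized role is an error rather than silently ignored.
        raise ValueError("unknown roles: " + ", ".join(unknown))
    # Keep the canonical order regardless of how the task listed them.
    return [role for role in DEFAULT_ROLES if role in requested]
```

For example, a task whose frontmatter lists only `executor` and `verifier` would skip the reviewer dispatch in this sketch.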

package/docs/concepts/why-wazir.md ADDED

@@ -0,0 +1,59 @@
+# Why Wazir
+What makes Wazir the best engineering OS you can add to an AI coding agent.
+## 1. Measure Twice, Cut Once
+Wazir clarifies before coding. The pipeline forces research, spec hardening, design review, and plan approval before a single line of implementation code is written. Most AI agents jump straight to code and fix mistakes after. Wazir prevents the mistakes.
+## 2. Deep Research
+Every AI agent knows how to research. Users don't ask them to. Wazir makes research a mandatory phase — the researcher role scans the codebase, fetches external sources, and produces a research brief before clarification begins. The agent starts informed, not guessing.
+## 3. Clarifier + Task Planning
+A structured clarification pipeline turns vague requests into measurable specs. Spec hardening catches ambiguity, missing constraints, and untestable acceptance criteria before they become bugs. Task planning produces execution-grade task specs — not TODO lists.
+## 4. Content Author
+A dedicated role for any content need — database seeding, sample content, test fixtures, translations, copy, email templates, notification text. Most AI agents treat content as an afterthought bolted onto code tasks. Wazir gives content its own phase with editorial standards, i18n awareness, and humanization rules.
+## 5. Self-Audit
+The agent audits its own work in an isolated git worktree. Validates, finds structural issues, fixes what it can, verifies the fixes, and only merges on all-green. 5-loop cycle with convergence detection. Protected-path safety rails prevent the agent from modifying its own identity-defining files. Safe self-improvement.
+## 6. Composer
+~300 curated expertise modules across 12 domains. The composition engine assembles task-specific agents by loading the right expertise for each role, stack, and concern. The executor building a Flutter RTL app gets Flutter patterns, RTL layout rules, and mobile antipatterns composed into its context. The reviewer gets the corresponding antipattern catalog. Every dispatched agent is a specialist, not a generalist pretending.
+## 7. Review Loops
+Multi-pass adversarial review at every pipeline checkpoint — not a single rubber-stamp at the end. Research-review, clarification-review, spec-challenge, design-review, plan-review, per-task execution review, and final review. Each uses phase-specific dimensions. Findings are resolved before advancing. The reviewer is an adversary, not a cheerleader.
+## 8. Continuous Learning
+Wazir evolves from its own mistakes. Review findings, audit findings, and user corrections feed into a learning system. Recurring issues become accepted learnings injected into future runs. A drift budget prevents learned behavior from diverging too far from the original design. The agent that builds your 10th feature is better than the one that built your 1st.
+## 9. Antipatterns
+A first-class antipattern catalog loaded into reviewer context BEFORE domain expertise. Catches AI-specific failure modes: fake completion, unwired abstractions, shallow tests, security theater, architecture drift. The reviewer's first lens is "what could go wrong" — not "does this look right."
+## 10. Multi-Host
+One canonical source, four host exports. Wazir works on Claude Code, Codex, Gemini, and Cursor from a single `wazir export build`. Roles, workflows, skills, and expertise are written once and compiled into each host's native format. Switch hosts without rewriting your engineering process.
+## 11. Context Efficiency
+AI agents waste most of their context window on brute-force file reads and verbose command output. Wazir's routing hook auto-routes large commands through context-mode. The index provides symbol-first exploration — query first, read only what's needed. Capture routing redirects large output to files. Result: 60-80% token reduction on exploration-heavy phases. The agent thinks more, reads less.
+## 12. Verification Before Completion
+No success claims without evidence. The verify phase produces deterministic proof — test results, lint output, type-check results — not "I believe it works." Every completion claim is backed by a command that was actually run and output that was actually checked. Evidence before assertions, always.
+## 13. Gating Agent
+Autonomous phase transition decisions. After each phase, a gating agent reads the phase report and decides: continue (all gates pass), loop back (specific failures with fix paths), or escalate to human (ambiguous trade-offs, scope changes). Default posture: escalate. The pipeline doesn't blindly advance — it stops when it should stop.
+## 14. Humanize
+Anti-AI-writing patterns across all text output. A vocabulary blacklist, domain-specific rules, and a self-audit checklist ensure that specs, plans, code comments, commit messages, and documentation read like they were written by a human engineer — not generated by an LLM. Because AI-sounding output erodes trust.
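
The gating agent described in section 13 of the added file chooses between continue, loop back, and escalate, with escalation as the default posture. A minimal sketch of that decision rule, under stated assumptions, is shown below. The report fields (`gates`, `passed`, `fix_path`, `scope_changed`) are hypothetical and are not taken from `phase-report.schema.json`, whose contents this diff excerpt does not show.

```python
def decide_transition(report: dict) -> dict:
    """Decide the next pipeline step from a phase report (hypothetical shape)."""
    gates = report.get("gates")
    # Default posture: escalate when gate data is missing or empty.
    if not isinstance(gates, list) or not gates:
        return {"decision": "escalate", "reason": "no gate results in report"}
    # Scope changes go to a human, per the description above.
    if report.get("scope_changed", False):
        return {"decision": "escalate", "reason": "scope change"}
    failures = [gate for gate in gates if not gate.get("passed", False)]
    if not failures:
        return {"decision": "continue"}
    # Loop back only when every failure comes with a concrete fix path.
    if all(gate.get("fix_path") for gate in failures):
        return {
            "decision": "loop_back",
            "fix_paths": [gate["fix_path"] for gate in failures],
        }
    return {"decision": "escalate", "reason": "failure without a fix path"}
```

Every branch that cannot positively justify "continue" or "loop back" falls through to "escalate", which is one way to encode the stated default.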