@wazir-dev/cli 1.1.0 → 1.2.0

Files changed (124)
  1. package/CHANGELOG.md +73 -4
  2. package/README.md +6 -6
  3. package/docs/concepts/architecture.md +1 -1
  4. package/docs/concepts/roles-and-workflows.md +2 -0
  5. package/docs/concepts/why-wazir.md +59 -0
  6. package/docs/decisions/2026-03-19-deferred-items.md +564 -0
  7. package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
  8. package/docs/readmes/INDEX.md +21 -5
  9. package/docs/readmes/features/expertise/README.md +2 -2
  10. package/docs/readmes/features/exports/README.md +2 -2
  11. package/docs/readmes/features/schemas/README.md +3 -0
  12. package/docs/readmes/features/skills/README.md +17 -0
  13. package/docs/readmes/features/skills/clarifier.md +5 -0
  14. package/docs/readmes/features/skills/claude-cli.md +5 -0
  15. package/docs/readmes/features/skills/codex-cli.md +5 -0
  16. package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
  17. package/docs/readmes/features/skills/executing-plans.md +5 -0
  18. package/docs/readmes/features/skills/executor.md +5 -0
  19. package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
  20. package/docs/readmes/features/skills/gemini-cli.md +5 -0
  21. package/docs/readmes/features/skills/humanize.md +5 -0
  22. package/docs/readmes/features/skills/init-pipeline.md +5 -0
  23. package/docs/readmes/features/skills/receiving-code-review.md +5 -0
  24. package/docs/readmes/features/skills/requesting-code-review.md +5 -0
  25. package/docs/readmes/features/skills/reviewer.md +5 -0
  26. package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
  27. package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
  28. package/docs/readmes/features/skills/wazir.md +5 -0
  29. package/docs/readmes/features/skills/writing-skills.md +5 -0
  30. package/docs/readmes/features/workflows/prepare-next.md +1 -1
  31. package/docs/reference/configuration-reference.md +47 -6
  32. package/docs/reference/launch-checklist.md +4 -4
  33. package/docs/reference/review-loop-pattern.md +117 -8
  34. package/docs/reference/roles-reference.md +1 -0
  35. package/docs/reference/skill-tiers.md +147 -0
  36. package/docs/reference/tooling-cli.md +3 -1
  37. package/docs/truth-claims.yaml +12 -0
  38. package/expertise/antipatterns/process/ai-coding-antipatterns.md +97 -1
  39. package/exports/hosts/claude/.claude/settings.json +9 -0
  40. package/exports/hosts/claude/CLAUDE.md +1 -1
  41. package/exports/hosts/claude/export.manifest.json +4 -2
  42. package/exports/hosts/claude/host-package.json +3 -1
  43. package/exports/hosts/codex/AGENTS.md +1 -1
  44. package/exports/hosts/codex/export.manifest.json +4 -2
  45. package/exports/hosts/codex/host-package.json +3 -1
  46. package/exports/hosts/cursor/.cursor/hooks.json +4 -0
  47. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
  48. package/exports/hosts/cursor/export.manifest.json +4 -2
  49. package/exports/hosts/cursor/host-package.json +3 -1
  50. package/exports/hosts/gemini/GEMINI.md +1 -1
  51. package/exports/hosts/gemini/export.manifest.json +4 -2
  52. package/exports/hosts/gemini/host-package.json +3 -1
  53. package/hooks/context-mode-router +191 -0
  54. package/hooks/definitions/context_mode_router.yaml +19 -0
  55. package/hooks/hooks.json +31 -6
  56. package/hooks/protected-path-write-guard +8 -0
  57. package/hooks/routing-matrix.json +45 -0
  58. package/hooks/session-start +62 -1
  59. package/llms-full.txt +905 -132
  60. package/package.json +2 -3
  61. package/schemas/hook.schema.json +2 -1
  62. package/schemas/phase-report.schema.json +80 -0
  63. package/schemas/usage.schema.json +25 -1
  64. package/schemas/wazir-manifest.schema.json +19 -0
  65. package/skills/brainstorming/SKILL.md +18 -155
  66. package/skills/clarifier/SKILL.md +122 -98
  67. package/skills/claude-cli/SKILL.md +320 -0
  68. package/skills/codex-cli/SKILL.md +260 -0
  69. package/skills/debugging/SKILL.md +13 -0
  70. package/skills/design/SKILL.md +13 -0
  71. package/skills/dispatching-parallel-agents/SKILL.md +13 -0
  72. package/skills/executing-plans/SKILL.md +13 -0
  73. package/skills/executor/SKILL.md +72 -19
  74. package/skills/finishing-a-development-branch/SKILL.md +13 -0
  75. package/skills/gemini-cli/SKILL.md +260 -0
  76. package/skills/humanize/SKILL.md +13 -0
  77. package/skills/init-pipeline/SKILL.md +73 -164
  78. package/skills/prepare-next/SKILL.md +81 -10
  79. package/skills/receiving-code-review/SKILL.md +13 -0
  80. package/skills/requesting-code-review/SKILL.md +13 -0
  81. package/skills/reviewer/SKILL.md +287 -15
  82. package/skills/run-audit/SKILL.md +13 -0
  83. package/skills/scan-project/SKILL.md +13 -0
  84. package/skills/self-audit/SKILL.md +197 -16
  85. package/skills/subagent-driven-development/SKILL.md +13 -0
  86. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
  87. package/skills/subagent-driven-development/implementer-prompt.md +8 -0
  88. package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
  89. package/skills/tdd/SKILL.md +13 -0
  90. package/skills/using-git-worktrees/SKILL.md +13 -0
  91. package/skills/using-skills/SKILL.md +13 -0
  92. package/skills/verification/SKILL.md +13 -0
  93. package/skills/wazir/SKILL.md +194 -377
  94. package/skills/writing-plans/SKILL.md +14 -1
  95. package/skills/writing-skills/SKILL.md +13 -0
  96. package/templates/artifacts/implementation-plan.md +3 -0
  97. package/templates/artifacts/tasks-template.md +133 -0
  98. package/templates/examples/phase-report.example.json +48 -0
  99. package/tooling/src/adapters/composition-engine.js +256 -0
  100. package/tooling/src/adapters/model-router.js +84 -0
  101. package/tooling/src/capture/command.js +24 -1
  102. package/tooling/src/capture/run-config.js +3 -1
  103. package/tooling/src/capture/store.js +24 -0
  104. package/tooling/src/capture/usage.js +106 -0
  105. package/tooling/src/checks/ac-matrix.js +256 -0
  106. package/tooling/src/checks/command-registry.js +12 -0
  107. package/tooling/src/checks/docs-truth.js +1 -1
  108. package/tooling/src/checks/skills.js +111 -0
  109. package/tooling/src/cli.js +9 -0
  110. package/tooling/src/commands/stats.js +161 -0
  111. package/tooling/src/commands/validate.js +5 -1
  112. package/tooling/src/export/compiler.js +33 -37
  113. package/tooling/src/gating/agent.js +145 -0
  114. package/tooling/src/guards/phase-prerequisite-guard.js +127 -0
  115. package/tooling/src/hooks/routing-logic.js +69 -0
  116. package/tooling/src/init/auto-detect.js +260 -0
  117. package/tooling/src/init/command.js +95 -135
  118. package/tooling/src/input/scanner.js +46 -0
  119. package/tooling/src/reports/command.js +103 -0
  120. package/tooling/src/reports/phase-report.js +323 -0
  121. package/tooling/src/state/command.js +160 -0
  122. package/tooling/src/state/db.js +287 -0
  123. package/tooling/src/status/command.js +53 -1
  124. package/wazir.manifest.yaml +26 -14
@@ -5,6 +5,19 @@ description: Use when completing tasks, implementing major features, or before m
5
5
 
6
6
  # Requesting Code Review
7
7
 
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
8
21
  Dispatch wz:code-reviewer subagent to catch issues before they cascade. The reviewer gets precisely crafted context for evaluation — never your session's history. This keeps the reviewer focused on the work product, not your thought process, and preserves your own context for continued work.
9
22
 
10
23
  **Core principle:** Review early, review often. Review follows the loop pattern in `docs/reference/review-loop-pattern.md`. Dispatch the reviewer with explicit `--mode` and depth-aware loop parameters.
@@ -5,10 +5,40 @@ description: Run the review phase — adversarial review of implementation again
5
5
 
6
6
  # Reviewer
7
7
 
8
- Run Phase 3 (Review) for the current project.
8
+ ## Model Annotation
9
+ When multi-model mode is enabled:
10
+ - **Sonnet** for internal review passes (internal-review)
11
+ - **Opus** for final review mode (final-review)
12
+ - **Opus** for spec-challenge mode (spec-harden)
13
+ - **Opus** for design-review mode (design)
14
+
15
+ ## Command Routing
16
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
17
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
18
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
19
+ - If context-mode unavailable, fall back to native Bash with warning
20
+
21
+ ## Codebase Exploration
22
+ 1. Query `wazir index search-symbols <query>` first
23
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
24
+ 3. Fall back to direct file reads ONLY for files identified by index queries
25
+ 4. Maximum 10 direct file reads without a justifying index query
26
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
27
+
28
+ Run the Final Review phase — or any review mode invoked by other phases.
9
29
 
10
30
  The reviewer role owns all review loops across the pipeline: research-review, clarification-review, spec-challenge, design-review, plan-review, per-task execution review, and final review. Each uses phase-specific dimensions from `docs/reference/review-loop-pattern.md`.
11
31
 
32
+ **Key principle for `final` mode:** Compare implementation against the **ORIGINAL INPUT** (briefing + input files), NOT the task specs. The executor's per-task reviewer already validated against task specs — that concern is covered. The final reviewer catches drift: does what we built match what the user actually asked for?
33
+
34
+ **Reviewer-owned responsibilities** (callers must NOT replicate these):
35
+ 1. **Two-tier review** — internal review first (fast, cheap, expertise-loaded), Codex second (fresh eyes on clean code)
36
+ 2. **Dimension selection** — the reviewer selects the correct dimension set for the review mode and depth
37
+ 3. **Pass counting** — the reviewer tracks pass numbers and enforces the depth-based cap (quick=3, standard=5, deep=7)
38
+ 4. **Finding attribution** — each finding is tagged `[Internal]`, `[Codex]`, or `[Both]` based on source
39
+ 5. **Dimension set recording** — each review pass file records which canonical dimension set was used, enabling Phase Scoring (first vs final delta)
40
+ 6. **Learning pipeline** — ALL findings (internal + Codex) feed into `state.sqlite` and the learning system
41
+
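As an illustration of the pass-counting responsibility above, here is a minimal sketch of a depth-based cap check. The caps (quick=3, standard=5, deep=7) come from the list; the names `PASS_CAPS` and `canRunAnotherPass` are hypothetical and are not taken from the package.

```javascript
// Depth-based pass caps as quoted in the responsibilities list.
const PASS_CAPS = new Map([
  ['quick', 3],
  ['standard', 5],
  ['deep', 7],
]);

// Hypothetical helper: returns true while another review pass is still allowed.
function canRunAnotherPass(depth, completedPasses) {
  const cap = PASS_CAPS.get(depth);
  if (cap === undefined) {
    throw new Error(`Unknown review depth: ${depth}`);
  }
  return completedPasses < cap;
}

console.log(canRunAnotherPass('quick', 2));    // true
console.log(canRunAnotherPass('standard', 5)); // false
```

Rejecting an unknown depth, rather than silently choosing a default cap, keeps a misconfigured run from looping more often than intended.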
12
42
  ## Review Modes
13
43
 
14
44
  The reviewer operates in different modes depending on the phase. Mode MUST be passed explicitly by the caller (`--mode <mode>`). The reviewer does NOT auto-detect mode from artifact availability. If `--mode` is not provided, ask the user which review to run.
@@ -34,6 +64,23 @@ In `task-review` and `final` modes, flag missing CHANGELOG entries for user-faci
34
64
  Prerequisites depend on the review mode:
35
65
 
36
66
  ### `final` mode
67
+
68
+ **Phase Prerequisites (Hard Gate):** Before proceeding, verify ALL of these artifacts exist. If ANY is missing, **STOP** and report which are missing.
69
+
70
+ - [ ] `.wazir/runs/latest/clarified/clarification.md`
71
+ - [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
72
+ - [ ] `.wazir/runs/latest/clarified/design.md`
73
+ - [ ] `.wazir/runs/latest/clarified/execution-plan.md`
74
+ - [ ] `.wazir/runs/latest/artifacts/verification-proof.md`
75
+
76
+ If any file is missing:
77
+
78
+ > **Cannot run final review: missing prerequisite artifacts.**
79
+ >
80
+ > Missing: [list missing files]
81
+ >
82
+ > Run `/wazir:clarifier` (for clarified/* files) or `/wazir:executor` (for verification-proof.md) first.
83
+
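The hard gate above could be checked mechanically along these lines. This is a sketch only: the file list is copied from the checklist, while `findMissingPrerequisites` and `assertFinalPrerequisites` are hypothetical names and do not describe what `tooling/src/guards/phase-prerequisite-guard.js` actually exports.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Artifacts required before `final` mode may run (from the checklist above).
const FINAL_PREREQUISITES = [
  '.wazir/runs/latest/clarified/clarification.md',
  '.wazir/runs/latest/clarified/spec-hardened.md',
  '.wazir/runs/latest/clarified/design.md',
  '.wazir/runs/latest/clarified/execution-plan.md',
  '.wazir/runs/latest/artifacts/verification-proof.md',
];

// Returns the required paths that do not exist under projectRoot.
function findMissingPrerequisites(projectRoot, required = FINAL_PREREQUISITES) {
  return required.filter((rel) => !fs.existsSync(path.join(projectRoot, rel)));
}

// Stops with an error that names every missing artifact.
function assertFinalPrerequisites(projectRoot) {
  const missing = findMissingPrerequisites(projectRoot);
  if (missing.length > 0) {
    throw new Error(
      'Cannot run final review: missing prerequisite artifacts.\n' +
        'Missing: ' + missing.join(', ')
    );
  }
}

// Usage: list what is missing in the current working directory.
console.log(findMissingPrerequisites(process.cwd()));
```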
37
84
  1. Check `.wazir/runs/latest/artifacts/` has completed task artifacts. If not, tell the user to run `/wazir:executor` first.
38
85
  2. Read the approved spec, plan, and design from `.wazir/runs/latest/clarified/`.
39
86
  3. Read `.wazir/state/config.json` for depth and multi_tool settings.
@@ -48,13 +95,15 @@ Prerequisites depend on the review mode:
48
95
 
49
96
  ## Review Process (`final` mode)
50
97
 
98
+ **Input:** Read the ORIGINAL user input (`.wazir/input/briefing.md`, `input/` directory files) and compare against what was built. This catches intent drift that task-level review misses.
99
+
51
100
  Perform adversarial review across 7 dimensions:
52
101
 
53
- 1. **Correctness** — Does the code do what the spec says?
54
- 2. **Completeness** — Are all acceptance criteria met?
102
+ 1. **Correctness** — Does the code do what the original input asked for?
103
+ 2. **Completeness** — Are all requirements from the original input met?
55
104
  3. **Wiring** — Are all paths connected end-to-end?
56
105
  4. **Verification** — Is there evidence (tests, type checks) for each claim?
57
- 5. **Drift** — Does the implementation match the approved plan?
106
+ 5. **Drift** — Does the implementation match what the user originally requested? (not just the plan — the INPUT)
58
107
  6. **Quality** — Code style, naming, error handling, security
59
108
  7. **Documentation** — Changelog entries, commit messages, comments
60
109
 
@@ -76,11 +125,28 @@ Score each dimension 0-10. Total out of 70.
76
125
  | **NEEDS REWORK** | 28-41 | Re-run affected tasks |
77
126
  | **FAIL** | 0-27 | Fundamental issues |
78
127
 
79
- ## Secondary Review
128
+ ## Two-Tier Review Flow
129
+
130
+ The review process has two tiers. Internal review catches ~80% of issues quickly and cheaply. Codex review provides fresh eyes on clean code.
131
+
132
+ ### Tier 1: Internal Review (Fast, Cheap, Expertise-Loaded)
133
+
134
+ 1. **Compose expertise:** Load relevant expertise modules from `expertise/composition-map.yaml` into context based on the review mode and detected stack. This gives the internal reviewer domain-specific knowledge.
135
+ 2. **Run internal review** using the dimension set for the current mode. When multi-model is enabled, use **Sonnet** (not Opus) for internal review passes — it's fast and good enough for pattern matching against expertise.
136
+ 3. **Produce findings:** Each finding is tagged `[Internal]` with severity (blocking, warning, note).
137
+ 4. **Fix cycle:** If blocking findings exist, the executor fixes them. Re-run internal review. Repeat until clean or cap reached.
138
+
139
+ Internal review passes are logged to `.wazir/runs/latest/reviews/<mode>-internal-pass-<N>.md`.
80
140
 
81
- Read `.wazir/state/config.json`. If `multi_tool.tools` includes external reviewers, run them **after** your own review and **before** producing the final verdict.
141
+ ### Tier 2: External Review (Fresh Eyes on Clean Code)
82
142
 
83
- ### Codex Review
143
+ Only runs AFTER Tier 1 produces a clean pass (no blocking findings).
144
+
145
+ Read `.wazir/state/config.json`. If `multi_tool.tools` includes external reviewers:
146
+
147
+ #### Codex Review
148
+
149
+ **For detailed Codex CLI usage, see `wz:codex-cli` skill.**
84
150
 
85
151
  If `codex` is in `multi_tool.tools`:
86
152
 
@@ -101,10 +167,10 @@ If `codex` is in `multi_tool.tools`:
101
167
  2>&1 | tee .wazir/runs/latest/reviews/codex-review.md
102
168
  ```
103
169
 
104
- 2. Read the Codex findings from `.wazir/runs/latest/reviews/codex-review.md`
105
- 3. Incorporate Codex findings into your scoring — if Codex flags something you missed, add it. If you disagree with a Codex finding, note it with your rationale.
170
+ 2. **Extract findings only** (context protection): After tee, use `execute_file` to extract only the final findings from the Codex output (everything after the last `codex` marker). If context-mode is unavailable, use `tac <file> | sed '/^codex$/q' | tac | tail -n +2`. If no marker found, fail closed (0 findings, warn user). See `docs/reference/review-loop-pattern.md` "Codex Output Context Protection" for full protocol.
171
+ 3. Incorporate extracted Codex findings into your scoring — if Codex flags something you missed, add it. If you disagree with a Codex finding, note it with your rationale.
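The extraction in step 2 can be sketched in JavaScript as follows. It mirrors the `tac | sed | tac | tail` fallback (keep only what follows the last line that is exactly `codex`) and adds the fail-closed rule for a missing marker. The function name and return shape are assumptions made for illustration, not the package's API.

```javascript
// Hypothetical helper: keep only the text after the last line that is exactly "codex".
// If no marker line exists, fail closed: return no findings and flag it for a warning.
function extractCodexFindings(rawOutput) {
  const lines = rawOutput.split(/\r?\n/);
  const markerIndex = lines.lastIndexOf('codex');
  if (markerIndex === -1) {
    return { findings: '', markerFound: false };
  }
  const findings = lines.slice(markerIndex + 1).join('\n').trim();
  return { findings, markerFound: true };
}

const sample = 'setup noise\ncodex\nearly draft\ncodex\nFinding A\nFinding B\n';
console.log(extractCodexFindings(sample));
// { findings: 'Finding A\nFinding B', markerFound: true }
```

A caller would treat `markerFound: false` as zero findings plus a user-facing warning, never as a clean review.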
106
172
 
107
- **Codex error handling:** If codex exits non-zero (auth/rate-limit/transport failure), log the full stderr, mark the pass as `codex-unavailable` in the review log, and use self-review findings only for that pass. Do NOT treat a Codex failure as a clean review. Do NOT skip the pass. The next pass still attempts Codex (transient failures may recover).
173
+ **Codex error handling:** If codex exits non-zero (auth/rate-limit/transport failure), log the full stderr, mark the pass as `codex-unavailable` in the review log, and use internal review findings only for that pass. Do NOT treat a Codex failure as a clean review. Do NOT skip the pass. The next pass still attempts Codex (transient failures may recover).
108
174
 
109
175
  **Code review scoping by mode:**
110
176
  - Use `--uncommitted` when reviewing uncommitted changes (`task-review` mode).
@@ -112,16 +178,51 @@ If `codex` is in `multi_tool.tools`:
112
178
  - Use `codex exec -c model="$CODEX_MODEL"` with stdin pipe for non-code artifacts (`spec-challenge`, `design-review`, `plan-review`, `research-review`, `clarification-review` modes).
113
179
  - See `docs/reference/review-loop-pattern.md` for code review scoping rules.
114
180
 
115
- ### Gemini Review
181
+ #### Gemini Review
182
+
183
+ If `gemini` is in `multi_tool.tools`, follow the same pattern using the Gemini CLI when available. **For detailed Gemini CLI usage, see `wz:gemini-cli` skill.**
116
184
 
117
- If `gemini` is in `multi_tool.tools`, follow the same pattern using the Gemini CLI when available.
185
+ ### Fix Cycle (Codex Findings)
186
+
187
+ If Codex produces blocking findings:
188
+ 1. Executor fixes the Codex findings
189
+ 2. Re-run internal review (quick pass) to verify fixes didn't introduce regressions
190
+ 3. Optionally re-run Codex for a clean pass
118
191
 
119
192
  ### Merging Findings
120
193
 
121
194
  The final review report must clearly attribute each finding:
122
- - `[Wazir]` — found by primary review
123
- - `[Codex]` — found by Codex secondary review
124
- - `[Both]` — found independently by both
195
+ - `[Internal]` — found by Tier 1 internal review
196
+ - `[Codex]` — found by Tier 2 Codex review
197
+ - `[Gemini]` — found by Tier 2 Gemini review
198
+ - `[Both]` — found independently by multiple sources
199
+
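One way to derive the attribution tags above is to group findings by a key and tag any group seen by more than one source as `[Both]`. This is a sketch under stated assumptions: the key used here (a lowercased, trimmed description) and the `source` and `description` field names are illustrative, and how Wazir actually matches duplicate findings is not shown in this diff.

```javascript
const TAGS = { internal: '[Internal]', codex: '[Codex]', gemini: '[Gemini]' };

// Hypothetical merge: findings with the same normalized description are one finding.
function mergeFindings(findings) {
  const byKey = new Map();
  for (const finding of findings) {
    const key = finding.description.trim().toLowerCase();
    const entry = byKey.get(key) ?? { description: finding.description, sources: new Set() };
    entry.sources.add(finding.source);
    byKey.set(key, entry);
  }
  return [...byKey.values()].map(({ description, sources }) => ({
    description,
    // More than one distinct source means the finding was made independently.
    attribution: sources.size > 1 ? '[Both]' : TAGS[[...sources][0]],
  }));
}

console.log(mergeFindings([
  { source: 'internal', description: 'Missing null check in parser' },
  { source: 'codex', description: 'missing null check in parser' },
  { source: 'gemini', description: 'Unbounded retry loop' },
]));
```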
200
+ ### Finding Persistence (Learning Pipeline)
201
+
202
+ ALL findings from both tiers are persisted to `state.sqlite` for cross-run learning:
203
+
204
+ ```javascript
205
+ // After each review pass
206
+ const { insertFinding, getRecurringFindingHashes } = require('tooling/src/state/db');
207
+ const db = openStateDb(stateRoot);
208
+
209
+ for (const finding of allFindings) {
210
+ insertFinding(db, {
211
+ run_id: runId,
212
+ phase: reviewMode,
213
+ source: finding.attribution, // 'internal', 'codex', 'gemini'
214
+ severity: finding.severity,
215
+ description: finding.description,
216
+ finding_hash: hashFinding(finding.description),
217
+ });
218
+ }
219
+
220
+ // Check for recurring patterns
221
+ const recurring = getRecurringFindingHashes(db, 2);
222
+ // Recurring findings → auto-propose as learnings in the learn phase
223
+ ```
224
+
225
+ This is how Wazir evolves — findings that recur across runs become accepted learnings injected into future executor context, preventing the same mistakes.
125
226
 
126
227
  ## Task-Review Log Filenames
127
228
 
@@ -137,6 +238,174 @@ Save review results to `.wazir/runs/latest/reviews/review.md` with:
137
238
  - Score breakdown
138
239
  - Verdict
139
240
 
241
+ ## Phase Report Generation
242
+
243
+ After completing any review pass, generate a phase report following `schemas/phase-report.schema.json`:
244
+
245
+ 1. **`attempted_actions`** — Populate from the review findings. Each finding becomes an action entry:
246
+ - `description`: the finding summary
247
+ - `outcome`: `"success"` if the finding passed, `"fail"` if it is a blocking issue, `"uncertain"` if ambiguous
248
+ - `evidence`: the rationale or evidence supporting the outcome
249
+
250
+ 2. **`drift_analysis`** — Compare review findings against the approved spec:
251
+ - `delta`: count of deviations between implementation and spec (0 = no drift)
252
+ - `description`: summary of any drift detected and its impact
253
+
254
+ 3. **`quality_metrics`** — Populate from test, lint, and type-check results gathered during review:
255
+ - `test_pass_count`, `test_fail_count`: from test runner output
256
+ - `lint_errors`: from linter output
257
+ - `type_errors`: from type checker output
258
+
259
+ 4. **`risk_flags`** — Populate from any high-severity findings:
260
+ - `severity`: `"low"`, `"medium"`, or `"high"`
261
+ - `description`: what the risk is
262
+ - `mitigation`: recommended mitigation (if known)
263
+
264
+ 5. **`decisions`** — Populate from any scope or approach decisions made during the review:
265
+ - `description`: what was decided
266
+ - `rationale`: why
267
+ - `alternatives_considered`: other options evaluated (optional)
268
+ - `source`: `"[Wazir]"`, `"[Codex]"`, or `"[Both]"` (optional)
269
+
270
+ 6. **`verdict_recommendation`** — Set based on the gating rules in `config/gating-rules.yaml`:
271
+ - `verdict`: `"continue"` (PASS), `"loop_back"` (NEEDS MINOR FIXES / NEEDS REWORK), or `"escalate"` (FAIL with fundamental issues)
272
+ - `reasoning`: brief explanation of why this verdict was chosen
273
+
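The verdict mapping in item 6 can be written as a small lookup. This sketch hardcodes the correspondence stated above; in the package the decision is said to follow `config/gating-rules.yaml`, which is not shown here, so `VERDICT_MAP` and `toVerdictRecommendation` are hypothetical names.

```javascript
// Review verdict -> gating verdict, as described in item 6.
const VERDICT_MAP = new Map([
  ['PASS', 'continue'],
  ['NEEDS MINOR FIXES', 'loop_back'],
  ['NEEDS REWORK', 'loop_back'],
  ['FAIL', 'escalate'],
]);

// Builds the `verdict_recommendation` object for the phase report.
function toVerdictRecommendation(reviewVerdict, reasoning) {
  const verdict = VERDICT_MAP.get(reviewVerdict);
  if (verdict === undefined) {
    throw new Error(`Unknown review verdict: ${reviewVerdict}`);
  }
  return { verdict, reasoning };
}

console.log(toVerdictRecommendation('NEEDS REWORK', 'Two blocking findings remain.'));
// { verdict: 'loop_back', reasoning: 'Two blocking findings remain.' }
```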
274
+ ### Report Output Paths
275
+
276
+ Save reports to two formats under the run directory:
277
+ - `.wazir/runs/<id>/reports/phase-<name>-report.json` — machine-readable, validated against `schemas/phase-report.schema.json`
278
+ - `.wazir/runs/<id>/reports/phase-<name>-report.md` — human-readable Markdown summary
279
+
280
+ The gating agent (`tooling/src/gating/agent.js`) consumes the JSON report to decide: **continue**, **loop_back**, or **escalate**.
281
+
282
+ ### Report Fields Reference
283
+
284
+ All required fields per `schemas/phase-report.schema.json`:
285
+
286
+ | Field | Type | Required | Description |
287
+ |-------|------|----------|-------------|
288
+ | `phase_name` | string | yes | Review mode name (e.g., `"final"`, `"task-review"`) |
289
+ | `run_id` | string | yes | Current run identifier |
290
+ | `timestamp` | string (date-time) | yes | ISO 8601 timestamp of report generation |
291
+ | `attempted_actions` | array | yes | Findings mapped to action outcomes |
292
+ | `drift_analysis` | object | yes | Spec-vs-implementation drift summary |
293
+ | `quality_metrics` | object | yes | Test/lint/type results |
294
+ | `risk_flags` | array | yes | High-severity risk items |
295
+ | `decisions` | array | yes | Scope/approach decisions made |
296
+ | `verdict_recommendation` | object | no | Gating verdict based on `config/gating-rules.yaml` |
297
+
298
+ ## Post-Review: Learn (final mode only)
299
+
300
+ After the final review verdict, extract durable learnings using the **learner role** (`roles/learner.md`).
301
+
302
+ ### Step 1: Gather all findings
303
+
304
+ Collect review findings from ALL sources in this run:
305
+ - `.wazir/runs/<run-id>/reviews/` — all review pass logs (task-review, final review)
306
+ - Codex findings (attributed `[Codex]` or `[Both]`)
307
+ - Self-audit findings (if `run_audit` was enabled)
308
+
309
+ ### Step 2: Identify learning candidates
310
+
311
+ A finding becomes a learning candidate if:
312
+ - It recurred across 2+ review passes within this run (same issue found repeatedly)
313
+ - It matches a finding from a prior run (check `memory/learnings/proposed/` and `accepted/` for similar patterns)
314
+ - It represents a class of mistake, not just a single instance (e.g., "missing error handling in async functions" vs "missing try-catch on line 42")
315
+
316
+ ### Step 3: Write learning proposals
317
+
318
+ For each candidate, write a proposal to `memory/learnings/proposed/<run-id>-<NNN>.md`:
319
+
320
+ ```markdown
321
+ ---
322
+ artifact_type: proposed_learning
323
+ phase: learn
324
+ role: learner
325
+ run_id: <run-id>
326
+ status: proposed
327
+ sources:
328
+ - <review-file-1>
329
+ - <review-file-2>
330
+ approval_status: required
331
+ ---
332
+
333
+ # Proposed Learning: <title>
334
+
335
+ ## Scope
336
+ - **Roles:** [which roles should receive this learning — e.g., executor, reviewer]
337
+ - **Stacks:** [which tech stacks — e.g., node, react, or "all"]
338
+ - **Concerns:** [which concerns — e.g., error-handling, testing, security]
339
+
340
+ ## Evidence
341
+ - [finding from review pass N: description]
342
+ - [finding from review pass M: same pattern]
343
+ - [optional: similar finding from prior run <run-id>]
344
+
345
+ ## Learning
346
+ [The concrete, actionable instruction that should be injected into future executor context]
347
+
348
+ ## Expected Benefit
349
+ [What this prevents in future runs]
350
+
351
+ ## Confidence
352
+ - **Level:** low | medium | high
353
+ - **Basis:** [single run observation | multi-run recurrence | user correction]
354
+ ```
355
+
356
+ ### Step 4: Report
357
+
358
+ Present proposed learnings to the user:
359
+
360
+ > **Learnings proposed:** [count]
361
+ > - [title 1] (confidence: high, scope: executor/node)
362
+ > - [title 2] (confidence: medium, scope: reviewer/all)
363
+ >
364
+ > Proposals saved to `memory/learnings/proposed/`. Review and accept with `/wazir audit learnings`.
365
+
366
+ Learnings are NEVER auto-applied. They require explicit user acceptance before being injected into future runs.
367
+
368
+ ## Post-Review: Prepare Next (final mode only)
369
+
370
+ After learning extraction, invoke the `prepare-next` skill to prepare the handoff:
371
+
372
+ ### Handoff document
373
+
374
+ Write to `.wazir/runs/<run-id>/handoff.md`:
375
+
376
+ ```markdown
377
+ # Handoff — <run-id>
378
+
379
+ **Status:** [Completed | Partial]
380
+ **Branch:** <branch-name>
381
+ **Date:** YYYY-MM-DD
382
+
383
+ ## What Was Done
384
+ [List of completed tasks with commit hashes]
385
+
386
+ ## Test Results
387
+ [Test count, pass/fail, validator status]
388
+
389
+ ## Review Score
390
+ [Final review verdict and score]
391
+
392
+ ## What's Next
393
+ [Pending items, deferred work, follow-up tasks]
394
+
395
+ ## Open Bugs
396
+ [Any known issues discovered during this run]
397
+
398
+ ## Learnings From This Run
399
+ [Key insights — what worked, what didn't, what to change]
400
+ ```
401
+
402
+ ### Cleanup
403
+
404
+ - Archive verbose intermediate review logs (compress to summary)
405
+ - Update `.wazir/runs/latest` symlink if creating a new run
406
+ - Do NOT mutate `input/` — it belongs to the user
407
+ - Do NOT auto-load proposed learnings into the next run
408
+
140
409
  ## Done
141
410
 
142
411
  Present the verdict and offer next steps:
@@ -145,6 +414,9 @@ Present the verdict and offer next steps:
145
414
  >
146
415
  > [Score breakdown and findings summary]
147
416
  >
417
+ > **Learnings proposed:** [count] (see `memory/learnings/proposed/`)
418
+ > **Handoff:** `.wazir/runs/<run-id>/handoff.md`
419
+ >
148
420
  > **What would you like to do?**
149
421
  > 1. **Create a PR** (if PASS)
150
422
  > 2. **Auto-fix and re-review** (if MINOR FIXES)
@@ -5,6 +5,19 @@ description: Run a structured audit on your codebase — security, code quality,
5
5
 
6
6
  # Run Audit — Structured Codebase Audit Pipeline
7
7
 
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
8
21
  ## Overview
9
22
 
10
23
  This skill runs a structured audit on your codebase. It collects three parameters interactively (audit type, scope, output mode), then feeds them through the pipeline: Research → Audit → Report or Plan.
@@ -5,6 +5,19 @@ description: Build a project profile from manifests, docs, tests, and `input/` s
5
5
 
6
6
  # Scan Project
7
7
 
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
8
21
  Inspect the smallest set of repo surfaces needed to answer:
9
22
 
10
23
  - what kind of project this is