@wazir-dev/cli 1.2.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (161) hide show
  1. package/CHANGELOG.md +54 -44
  2. package/README.md +13 -13
  3. package/assets/demo.cast +47 -0
  4. package/assets/demo.gif +0 -0
  5. package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
  6. package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
  7. package/docs/concepts/architecture.md +1 -1
  8. package/docs/concepts/why-wazir.md +1 -1
  9. package/docs/readmes/INDEX.md +1 -1
  10. package/docs/readmes/features/expertise/README.md +1 -1
  11. package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
  12. package/docs/reference/hooks.md +1 -0
  13. package/docs/reference/launch-checklist.md +3 -3
  14. package/docs/reference/review-loop-pattern.md +3 -2
  15. package/docs/reference/skill-tiers.md +2 -2
  16. package/docs/research/2026-03-20-agents/a18fb002157904af5.txt +187 -0
  17. package/docs/research/2026-03-20-agents/a1d0ac79ac2f11e6f.txt +2 -0
  18. package/docs/research/2026-03-20-agents/a324079de037abd7c.txt +198 -0
  19. package/docs/research/2026-03-20-agents/a357586bccfafb0e5.txt +256 -0
  20. package/docs/research/2026-03-20-agents/a4365394e4d753105.txt +137 -0
  21. package/docs/research/2026-03-20-agents/a492af28bc52d3613.txt +136 -0
  22. package/docs/research/2026-03-20-agents/a4984db0b6a8eee07.txt +124 -0
  23. package/docs/research/2026-03-20-agents/a5b30e59d34bbb062.txt +214 -0
  24. package/docs/research/2026-03-20-agents/a5cf7829dab911586.txt +165 -0
  25. package/docs/research/2026-03-20-agents/a607157c30dd97c9e.txt +96 -0
  26. package/docs/research/2026-03-20-agents/a60b68b1e19d1e16b.txt +115 -0
  27. package/docs/research/2026-03-20-agents/a722af01c5594aba0.txt +166 -0
  28. package/docs/research/2026-03-20-agents/a787bdc516faa5829.txt +181 -0
  29. package/docs/research/2026-03-20-agents/a7c46d1bba1056ed2.txt +132 -0
  30. package/docs/research/2026-03-20-agents/a7e5abbab2b281a0d.txt +100 -0
  31. package/docs/research/2026-03-20-agents/a8dbadc66cd0d7d5a.txt +95 -0
  32. package/docs/research/2026-03-20-agents/a904d9f45d6b86a6d.txt +75 -0
  33. package/docs/research/2026-03-20-agents/a927659a942ee7f60.txt +102 -0
  34. package/docs/research/2026-03-20-agents/a962cb569191f7583.txt +125 -0
  35. package/docs/research/2026-03-20-agents/aab6decea538aac41.txt +148 -0
  36. package/docs/research/2026-03-20-agents/abd58b853dd938a1b.txt +295 -0
  37. package/docs/research/2026-03-20-agents/ac009da573eff7f65.txt +100 -0
  38. package/docs/research/2026-03-20-agents/ac1bc783364405e5f.txt +190 -0
  39. package/docs/research/2026-03-20-agents/aca5e2b57fde152a0.txt +132 -0
  40. package/docs/research/2026-03-20-agents/ad849b8c0a7e95b8b.txt +176 -0
  41. package/docs/research/2026-03-20-agents/adc2b12a4da32c962.txt +258 -0
  42. package/docs/research/2026-03-20-agents/af97caaaa9a80e4cb.txt +146 -0
  43. package/docs/research/2026-03-20-agents/afc5faceee368b3ca.txt +111 -0
  44. package/docs/research/2026-03-20-agents/afdb282d866e3c1e4.txt +164 -0
  45. package/docs/research/2026-03-20-agents/afe9d1f61c02b1e8d.txt +299 -0
  46. package/docs/research/2026-03-20-agents/b4hmkwril.txt +1856 -0
  47. package/docs/research/2026-03-20-agents/b80ptk89g.txt +1856 -0
  48. package/docs/research/2026-03-20-agents/bf54s1jss.txt +1150 -0
  49. package/docs/research/2026-03-20-agents/bhd6kq2kx.txt +1856 -0
  50. package/docs/research/2026-03-20-agents/bmb2fodyr.txt +988 -0
  51. package/docs/research/2026-03-20-agents/bmmsrij8i.txt +826 -0
  52. package/docs/research/2026-03-20-agents/bn4t2ywpu.txt +2175 -0
  53. package/docs/research/2026-03-20-agents/bu22t9f1z.txt +0 -0
  54. package/docs/research/2026-03-20-agents/bwvl98v2p.txt +738 -0
  55. package/docs/research/2026-03-20-agents/psych-a3697a7fd06eb64fd.txt +135 -0
  56. package/docs/research/2026-03-20-agents/psych-a37776fabc870feae.txt +123 -0
  57. package/docs/research/2026-03-20-agents/psych-a5b1fe05c0589efaf.txt +2 -0
  58. package/docs/research/2026-03-20-agents/psych-a95c15b1f29424435.txt +76 -0
  59. package/docs/research/2026-03-20-agents/psych-a9c26f4d9172dde7c.txt +2 -0
  60. package/docs/research/2026-03-20-agents/psych-aa19c69f0ca2c5ad3.txt +2 -0
  61. package/docs/research/2026-03-20-agents/psych-aa4e4cb70e1be5ecb.txt +95 -0
  62. package/docs/research/2026-03-20-agents/psych-ab5b302f26a554663.txt +102 -0
  63. package/docs/research/2026-03-20-deep-research-complete.md +101 -0
  64. package/docs/research/2026-03-20-deep-research-status.md +38 -0
  65. package/docs/research/2026-03-20-enforcement-research.md +107 -0
  66. package/expertise/antipatterns/process/ai-coding-antipatterns.md +117 -0
  67. package/expertise/composition-map.yaml +27 -8
  68. package/expertise/digests/reviewer/ai-coding-digest.md +83 -0
  69. package/expertise/digests/reviewer/architectural-thinking-digest.md +63 -0
  70. package/expertise/digests/reviewer/architecture-antipatterns-digest.md +49 -0
  71. package/expertise/digests/reviewer/code-smells-digest.md +53 -0
  72. package/expertise/digests/reviewer/coupling-cohesion-digest.md +54 -0
  73. package/expertise/digests/reviewer/ddd-digest.md +60 -0
  74. package/expertise/digests/reviewer/dependency-risk-digest.md +40 -0
  75. package/expertise/digests/reviewer/error-handling-digest.md +55 -0
  76. package/expertise/digests/reviewer/review-methodology-digest.md +49 -0
  77. package/exports/hosts/claude/.claude/commands/learn.md +61 -8
  78. package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
  79. package/exports/hosts/claude/.claude/commands/verify.md +30 -1
  80. package/exports/hosts/claude/.claude/settings.json +7 -6
  81. package/exports/hosts/claude/export.manifest.json +8 -5
  82. package/exports/hosts/claude/host-package.json +3 -0
  83. package/exports/hosts/codex/export.manifest.json +8 -5
  84. package/exports/hosts/codex/host-package.json +3 -0
  85. package/exports/hosts/cursor/.cursor/hooks.json +6 -6
  86. package/exports/hosts/cursor/export.manifest.json +8 -5
  87. package/exports/hosts/cursor/host-package.json +3 -0
  88. package/exports/hosts/gemini/export.manifest.json +8 -5
  89. package/exports/hosts/gemini/host-package.json +3 -0
  90. package/hooks/definitions/pretooluse_dispatcher.yaml +26 -0
  91. package/hooks/definitions/pretooluse_pipeline_guard.yaml +22 -0
  92. package/hooks/definitions/stop_pipeline_gate.yaml +22 -0
  93. package/hooks/hooks.json +7 -6
  94. package/hooks/pretooluse-dispatcher +84 -0
  95. package/hooks/pretooluse-pipeline-guard +9 -0
  96. package/hooks/stop-pipeline-gate +9 -0
  97. package/llms-full.txt +48 -18
  98. package/package.json +2 -3
  99. package/schemas/decision.schema.json +15 -0
  100. package/schemas/hook.schema.json +4 -1
  101. package/schemas/phase-report.schema.json +9 -0
  102. package/skills/TEMPLATE-3-ZONE.md +160 -0
  103. package/skills/brainstorming/SKILL.md +137 -21
  104. package/skills/clarifier/SKILL.md +364 -53
  105. package/skills/claude-cli/SKILL.md +91 -12
  106. package/skills/codex-cli/SKILL.md +91 -12
  107. package/skills/debugging/SKILL.md +133 -38
  108. package/skills/design/SKILL.md +173 -37
  109. package/skills/dispatching-parallel-agents/SKILL.md +129 -31
  110. package/skills/executing-plans/SKILL.md +113 -25
  111. package/skills/executor/SKILL.md +252 -21
  112. package/skills/finishing-a-development-branch/SKILL.md +107 -18
  113. package/skills/gemini-cli/SKILL.md +91 -12
  114. package/skills/humanize/SKILL.md +92 -13
  115. package/skills/init-pipeline/SKILL.md +90 -18
  116. package/skills/prepare-next/SKILL.md +93 -24
  117. package/skills/receiving-code-review/SKILL.md +90 -16
  118. package/skills/requesting-code-review/SKILL.md +100 -24
  119. package/skills/requesting-code-review/code-reviewer.md +29 -17
  120. package/skills/reviewer/SKILL.md +270 -57
  121. package/skills/run-audit/SKILL.md +92 -15
  122. package/skills/scan-project/SKILL.md +93 -14
  123. package/skills/self-audit/SKILL.md +133 -39
  124. package/skills/skill-research/SKILL.md +275 -0
  125. package/skills/subagent-driven-development/SKILL.md +129 -30
  126. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +30 -2
  127. package/skills/subagent-driven-development/implementer-prompt.md +40 -27
  128. package/skills/subagent-driven-development/spec-reviewer-prompt.md +25 -12
  129. package/skills/tdd/SKILL.md +125 -20
  130. package/skills/using-git-worktrees/SKILL.md +118 -28
  131. package/skills/using-skills/SKILL.md +116 -29
  132. package/skills/verification/SKILL.md +160 -17
  133. package/skills/wazir/SKILL.md +750 -120
  134. package/skills/writing-plans/SKILL.md +134 -28
  135. package/skills/writing-skills/SKILL.md +91 -13
  136. package/skills/writing-skills/anthropic-best-practices.md +104 -64
  137. package/skills/writing-skills/persuasion-principles.md +100 -34
  138. package/tooling/src/capture/command.js +46 -2
  139. package/tooling/src/capture/decision.js +40 -0
  140. package/tooling/src/capture/store.js +33 -0
  141. package/tooling/src/capture/user-input.js +66 -0
  142. package/tooling/src/checks/security-sensitivity.js +69 -0
  143. package/tooling/src/cli.js +28 -26
  144. package/tooling/src/config/depth-table.js +60 -0
  145. package/tooling/src/export/compiler.js +7 -8
  146. package/tooling/src/guards/guardrail-functions.js +131 -0
  147. package/tooling/src/guards/phase-prerequisite-guard.js +97 -3
  148. package/tooling/src/hooks/pretooluse-dispatcher.js +300 -0
  149. package/tooling/src/hooks/pretooluse-pipeline-guard.js +141 -0
  150. package/tooling/src/hooks/stop-pipeline-gate.js +92 -0
  151. package/tooling/src/init/auto-detect.js +0 -2
  152. package/tooling/src/init/command.js +3 -95
  153. package/tooling/src/learn/pipeline.js +177 -0
  154. package/tooling/src/state/db.js +251 -2
  155. package/tooling/src/state/pipeline-state.js +262 -0
  156. package/tooling/src/status/command.js +6 -1
  157. package/tooling/src/verify/proof-collector.js +299 -0
  158. package/wazir.manifest.yaml +3 -0
  159. package/workflows/learn.md +61 -8
  160. package/workflows/plan-review.md +3 -1
  161. package/workflows/verify.md +30 -1
@@ -1,22 +1,70 @@
1
1
  ---
2
2
  name: wz:verification
3
- description: Use before claiming work is complete. Every completion claim needs fresh command evidence or another deterministic proof path.
3
+ description: Use before claiming work is complete every completion claim needs fresh evidence or deterministic proof.
4
4
  ---
5
5
 
6
6
  # Verification
7
7
 
8
- ## Command Routing
9
- Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
- - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
- - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
- - If context-mode unavailable, fall back to native Bash with warning
8
+ <!-- ═══════════════════════════════════════════════════════════════════
9
+ ZONE 1 PRIMACY
10
+ ═══════════════════════════════════════════════════════════════════ -->
13
11
 
14
- ## Codebase Exploration
15
- 1. Query `wazir index search-symbols <query>` first
16
- 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
- 3. Fall back to direct file reads ONLY for files identified by index queries
18
- 4. Maximum 10 direct file reads without a justifying index query
19
- 5. If no index exists: `wazir index build && wazir index summarize --tier all`
12
+ You are the **Verification Gate**. Your value is ensuring no completion claim passes without fresh, deterministic evidence from the current change. Following the pipeline IS how you help.
13
+
14
+ ## Iron Laws of Verification
15
+
16
+ These are non-negotiable. No context makes them optional.
17
+
18
+ 1. **Every claim requires fresh evidence from THIS change.** Prior test runs, earlier conversations, and memory are not evidence. Run it now.
19
+ 2. **Stale evidence is NEVER evidence.** If you modified code after the last test run, the test run is stale. Run it again.
20
+ 3. **"It should work" is NEVER acceptable.** The difference between "it should work" and "it works" is a command execution and 10 seconds.
21
+ 4. **Verification MUST be deterministic.** If the evidence depends on timing, external state, or manual inspection, it is not proof.
22
+
23
+ **Violating the letter of verification is violating the spirit.** Claiming "tests pass" based on a run from before your latest change is the most common verification fraud. The proof must post-date the implementation. Always.
24
+
25
+ ## Priority Stack
26
+
27
+ | Priority | Name | Beats | Conflict Example |
28
+ |----------|------|-------|------------------|
29
+ | P0 | Iron Laws | Everything | User says "skip review" → review anyway |
30
+ | P1 | Pipeline gates | P2-P5 | Spec not approved → do not code |
31
+ | P2 | Correctness | P3-P5 | Partial correct > complete wrong |
32
+ | P3 | Completeness | P4-P5 | All criteria before optimizing |
33
+ | P4 | Speed | P5 | Fast execution, never fewer steps |
34
+ | P5 | User comfort | Nothing | Minimize friction, never weaken P0-P4 |
35
+
36
+ ## Override Boundary
37
+
38
+ - **User CAN override:** verification depth, evidence format, which additional checks to include.
39
+ - **User CANNOT override:** Iron Laws, fresh-evidence requirement, deterministic-proof requirement.
40
+
41
+ <!-- ═══════════════════════════════════════════════════════════════════
42
+ ZONE 2 — PROCESS
43
+ ═══════════════════════════════════════════════════════════════════ -->
44
+
45
+ ## Signature
46
+
47
+ **(implementation artifacts, task spec, run config) → (structured proof artifact with evidence array, pass/fail verdict)**
48
+
49
+ ## Commitment Priming
50
+
51
+ Before executing, announce your plan: state what you will verify, which proof-collection strategy applies (runnable vs. non-runnable), and which commands you expect to run.
52
+
53
+ ## Steps
54
+
55
+ ### 1. Proof of Implementation
56
+
57
+ 1. Detect project type: `detectRunnableType(projectRoot)` → web | api | cli | library
58
+ 2. Collect evidence: `collectProof(taskSpec, runConfig)`
59
+ 3. Save evidence to `.wazir/runs/<id>/artifacts/proof-<task>.json`
60
+
61
+ **For runnable output (web/api/cli):** Run the application and capture evidence (build output, screenshots, curl responses, CLI output).
62
+
63
+ **For non-runnable output (library/config/skills):** Run lint, format check, type check, and tests. All must pass.
64
+
65
+ Evidence collection uses `tooling/src/verify/proof-collector.js`.
66
+
67
+ ### 2. Verification Requirements
20
68
 
21
69
  Every completion claim must include:
22
70
 
@@ -24,12 +72,107 @@ Every completion claim must include:
24
72
  - the exact command or deterministic check
25
73
  - the actual result
26
74
 
27
- Minimum rule:
75
+ ### 3. Proof Collection
76
+
77
+ Use `proof-collector` (`tooling/src/verify/proof-collector.js`) for automated evidence gathering:
78
+
79
+ 1. **`detectRunnableType(projectRoot)`** — detects whether the project is `web`, `api`, `cli`, or `library` from `package.json`. Detection order: `pkg.bin` (cli), web framework deps (web), API framework deps (api), default (library).
80
+
81
+ 2. **`collectProof(projectRoot, opts?)`** — runs type-appropriate verification commands and returns structured evidence:
82
+ - **web:** `npm run build` + library checks
83
+ - **api:** library checks (test, tsc, eslint, prettier)
84
+ - **cli:** `<bin> --help` + library checks
85
+ - **library:** `npm test`, `tsc --noEmit`, `eslint .`, `prettier --check .`
28
86
 
29
- - no success claim without fresh evidence from the current change
87
+ All commands use `execFileSync` (never shell `exec`) for security. Evidence is returned as `{ type, evidence: [{ check, ok, output }] }`.
88
+
89
+ ### 4. Failure Handling
30
90
 
31
91
  When verification fails:
32
92
 
33
- - do not mark the work complete
34
- - fix the issue or report the gap honestly
35
- - rerun verification after the fix
93
+ - Do not mark the work complete.
94
+ - Report the gap honestly.
95
+
96
+ Ask the user via AskUserQuestion:
97
+ - **Question:** "Verification failed for [specific criteria]. How should we proceed?"
98
+ - **Options:**
99
+ 1. "Fix the issue and re-verify" *(Recommended)*
100
+ 2. "Accept partial verification with documented gaps"
101
+ 3. "Abort and review what went wrong"
102
+
103
+ Wait for the user's selection before continuing.
104
+
105
+ ## Minimum Rules
106
+
107
+ - No success claim without fresh evidence from the current change.
108
+ - Always use `proof-collector` for Node.js projects to gather deterministic evidence.
109
+ - Attach the evidence array to the verification proof artifact.
110
+
111
+ ## Implementation Intentions
112
+
113
+ ```
114
+ IF user asks to skip a required step → THEN say "Running it quickly" and execute. No debate.
115
+ IF urgency is expressed ("just", "quickly") → THEN execute ALL steps at full speed. Never fewer steps.
116
+ IF you are unsure whether a step is required → THEN it IS required.
117
+ IF code was modified after the last test run → THEN the previous evidence is stale; re-run all checks.
118
+ IF verification fails → THEN report honestly and ask the user how to proceed; never mark complete.
119
+ IF project type is ambiguous → THEN run the broadest verification set (library checks cover everything).
120
+ ```
121
+
122
+ <!-- ═══════════════════════════════════════════════════════════════════
123
+ ZONE 3 — RECENCY
124
+ ═══════════════════════════════════════════════════════════════════ -->
125
+
126
+ ## Recency Anchor
127
+
128
+ Remember: every claim needs fresh evidence from this change. Stale runs are not proof. "It should work" is never acceptable. Evidence must be deterministic.
129
+
130
+ ## Red Flags — You Are Rationalizing
131
+
132
+ If you catch yourself thinking any of these, STOP. You are about to skip verification.
133
+
134
+ | Thought | Reality |
135
+ |---------|---------|
136
+ | "I already tested this earlier" | Did you test it after your last edit? If not, you have not tested it. |
137
+ | "The code is simple enough to verify by reading" | Code review finds ~60% of bugs. Testing finds ~90%. Run the tests. |
138
+ | "It's the same pattern as what worked before" | Same pattern, different context. Context is where bugs hide. Verify. |
139
+ | "The tests are slow, I'll skip them this once" | This once becomes every time. Run them. |
140
+ | "I just changed a string/comment/config" | Config changes cause production incidents. Verify. |
141
+ | "The type checker will catch any problems" | Type checkers verify types, not logic. Tests verify logic. Do both. |
142
+ | "I'll verify at the end when everything is done" | Compound errors are exponentially harder to diagnose. Verify incrementally. |
143
+ | "The CI will catch it" | CI is a safety net, not a substitute. Verify locally first. |
144
+ | "Nothing could have broken" | Famous last words. Run the tests. |
145
+ | "The user said to skip this" | The user controls WHAT to build. The pipeline controls HOW. |
146
+ | "This is too small for the full process" | Small tasks have small steps. Do them all. |
147
+ | "I already know the answer" | The process will confirm it quickly. Do it anyway. |
148
+
149
+ **User CANNOT override Iron Laws.** Even if the user explicitly says "skip this":
150
+ 1. Acknowledge their preference
151
+ 2. Execute the required step quickly
152
+ 3. Continue with their task
153
+ This is not being unhelpful — this is preventing harm.
154
+
155
+ ## Done Criterion
156
+
157
+ The skill is complete when: all verification checks have been run with fresh evidence, the evidence array is saved to the proof artifact, and every completion claim has a corresponding deterministic check result.
158
+
159
+ ---
160
+
161
+ <!-- ═══════════════════════════════════════════════════════════════════
162
+ APPENDIX
163
+ ═══════════════════════════════════════════════════════════════════ -->
164
+
165
+ ## Appendix: Command Routing
166
+
167
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
168
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
169
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
170
+ - If context-mode unavailable, fall back to native Bash with warning
171
+
172
+ ## Appendix: Codebase Exploration
173
+
174
+ 1. Query `wazir index search-symbols <query>` first
175
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
176
+ 3. Fall back to direct file reads ONLY for files identified by index queries
177
+ 4. Maximum 10 direct file reads without a justifying index query
178
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`