@wazir-dev/cli 1.2.0 → 1.4.0
- package/CHANGELOG.md +54 -44
- package/README.md +13 -13
- package/assets/demo.cast +47 -0
- package/assets/demo.gif +0 -0
- package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
- package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
- package/docs/concepts/architecture.md +1 -1
- package/docs/concepts/why-wazir.md +1 -1
- package/docs/readmes/INDEX.md +1 -1
- package/docs/readmes/features/expertise/README.md +1 -1
- package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
- package/docs/reference/hooks.md +1 -0
- package/docs/reference/launch-checklist.md +3 -3
- package/docs/reference/review-loop-pattern.md +3 -2
- package/docs/reference/skill-tiers.md +2 -2
- package/docs/research/2026-03-20-agents/a18fb002157904af5.txt +187 -0
- package/docs/research/2026-03-20-agents/a1d0ac79ac2f11e6f.txt +2 -0
- package/docs/research/2026-03-20-agents/a324079de037abd7c.txt +198 -0
- package/docs/research/2026-03-20-agents/a357586bccfafb0e5.txt +256 -0
- package/docs/research/2026-03-20-agents/a4365394e4d753105.txt +137 -0
- package/docs/research/2026-03-20-agents/a492af28bc52d3613.txt +136 -0
- package/docs/research/2026-03-20-agents/a4984db0b6a8eee07.txt +124 -0
- package/docs/research/2026-03-20-agents/a5b30e59d34bbb062.txt +214 -0
- package/docs/research/2026-03-20-agents/a5cf7829dab911586.txt +165 -0
- package/docs/research/2026-03-20-agents/a607157c30dd97c9e.txt +96 -0
- package/docs/research/2026-03-20-agents/a60b68b1e19d1e16b.txt +115 -0
- package/docs/research/2026-03-20-agents/a722af01c5594aba0.txt +166 -0
- package/docs/research/2026-03-20-agents/a787bdc516faa5829.txt +181 -0
- package/docs/research/2026-03-20-agents/a7c46d1bba1056ed2.txt +132 -0
- package/docs/research/2026-03-20-agents/a7e5abbab2b281a0d.txt +100 -0
- package/docs/research/2026-03-20-agents/a8dbadc66cd0d7d5a.txt +95 -0
- package/docs/research/2026-03-20-agents/a904d9f45d6b86a6d.txt +75 -0
- package/docs/research/2026-03-20-agents/a927659a942ee7f60.txt +102 -0
- package/docs/research/2026-03-20-agents/a962cb569191f7583.txt +125 -0
- package/docs/research/2026-03-20-agents/aab6decea538aac41.txt +148 -0
- package/docs/research/2026-03-20-agents/abd58b853dd938a1b.txt +295 -0
- package/docs/research/2026-03-20-agents/ac009da573eff7f65.txt +100 -0
- package/docs/research/2026-03-20-agents/ac1bc783364405e5f.txt +190 -0
- package/docs/research/2026-03-20-agents/aca5e2b57fde152a0.txt +132 -0
- package/docs/research/2026-03-20-agents/ad849b8c0a7e95b8b.txt +176 -0
- package/docs/research/2026-03-20-agents/adc2b12a4da32c962.txt +258 -0
- package/docs/research/2026-03-20-agents/af97caaaa9a80e4cb.txt +146 -0
- package/docs/research/2026-03-20-agents/afc5faceee368b3ca.txt +111 -0
- package/docs/research/2026-03-20-agents/afdb282d866e3c1e4.txt +164 -0
- package/docs/research/2026-03-20-agents/afe9d1f61c02b1e8d.txt +299 -0
- package/docs/research/2026-03-20-agents/b4hmkwril.txt +1856 -0
- package/docs/research/2026-03-20-agents/b80ptk89g.txt +1856 -0
- package/docs/research/2026-03-20-agents/bf54s1jss.txt +1150 -0
- package/docs/research/2026-03-20-agents/bhd6kq2kx.txt +1856 -0
- package/docs/research/2026-03-20-agents/bmb2fodyr.txt +988 -0
- package/docs/research/2026-03-20-agents/bmmsrij8i.txt +826 -0
- package/docs/research/2026-03-20-agents/bn4t2ywpu.txt +2175 -0
- package/docs/research/2026-03-20-agents/bu22t9f1z.txt +0 -0
- package/docs/research/2026-03-20-agents/bwvl98v2p.txt +738 -0
- package/docs/research/2026-03-20-agents/psych-a3697a7fd06eb64fd.txt +135 -0
- package/docs/research/2026-03-20-agents/psych-a37776fabc870feae.txt +123 -0
- package/docs/research/2026-03-20-agents/psych-a5b1fe05c0589efaf.txt +2 -0
- package/docs/research/2026-03-20-agents/psych-a95c15b1f29424435.txt +76 -0
- package/docs/research/2026-03-20-agents/psych-a9c26f4d9172dde7c.txt +2 -0
- package/docs/research/2026-03-20-agents/psych-aa19c69f0ca2c5ad3.txt +2 -0
- package/docs/research/2026-03-20-agents/psych-aa4e4cb70e1be5ecb.txt +95 -0
- package/docs/research/2026-03-20-agents/psych-ab5b302f26a554663.txt +102 -0
- package/docs/research/2026-03-20-deep-research-complete.md +101 -0
- package/docs/research/2026-03-20-deep-research-status.md +38 -0
- package/docs/research/2026-03-20-enforcement-research.md +107 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +117 -0
- package/expertise/composition-map.yaml +27 -8
- package/expertise/digests/reviewer/ai-coding-digest.md +83 -0
- package/expertise/digests/reviewer/architectural-thinking-digest.md +63 -0
- package/expertise/digests/reviewer/architecture-antipatterns-digest.md +49 -0
- package/expertise/digests/reviewer/code-smells-digest.md +53 -0
- package/expertise/digests/reviewer/coupling-cohesion-digest.md +54 -0
- package/expertise/digests/reviewer/ddd-digest.md +60 -0
- package/expertise/digests/reviewer/dependency-risk-digest.md +40 -0
- package/expertise/digests/reviewer/error-handling-digest.md +55 -0
- package/expertise/digests/reviewer/review-methodology-digest.md +49 -0
- package/exports/hosts/claude/.claude/commands/learn.md +61 -8
- package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
- package/exports/hosts/claude/.claude/commands/verify.md +30 -1
- package/exports/hosts/claude/.claude/settings.json +7 -6
- package/exports/hosts/claude/export.manifest.json +8 -5
- package/exports/hosts/claude/host-package.json +3 -0
- package/exports/hosts/codex/export.manifest.json +8 -5
- package/exports/hosts/codex/host-package.json +3 -0
- package/exports/hosts/cursor/.cursor/hooks.json +6 -6
- package/exports/hosts/cursor/export.manifest.json +8 -5
- package/exports/hosts/cursor/host-package.json +3 -0
- package/exports/hosts/gemini/export.manifest.json +8 -5
- package/exports/hosts/gemini/host-package.json +3 -0
- package/hooks/definitions/pretooluse_dispatcher.yaml +26 -0
- package/hooks/definitions/pretooluse_pipeline_guard.yaml +22 -0
- package/hooks/definitions/stop_pipeline_gate.yaml +22 -0
- package/hooks/hooks.json +7 -6
- package/hooks/pretooluse-dispatcher +84 -0
- package/hooks/pretooluse-pipeline-guard +9 -0
- package/hooks/stop-pipeline-gate +9 -0
- package/llms-full.txt +48 -18
- package/package.json +2 -3
- package/schemas/decision.schema.json +15 -0
- package/schemas/hook.schema.json +4 -1
- package/schemas/phase-report.schema.json +9 -0
- package/skills/TEMPLATE-3-ZONE.md +160 -0
- package/skills/brainstorming/SKILL.md +137 -21
- package/skills/clarifier/SKILL.md +364 -53
- package/skills/claude-cli/SKILL.md +91 -12
- package/skills/codex-cli/SKILL.md +91 -12
- package/skills/debugging/SKILL.md +133 -38
- package/skills/design/SKILL.md +173 -37
- package/skills/dispatching-parallel-agents/SKILL.md +129 -31
- package/skills/executing-plans/SKILL.md +113 -25
- package/skills/executor/SKILL.md +252 -21
- package/skills/finishing-a-development-branch/SKILL.md +107 -18
- package/skills/gemini-cli/SKILL.md +91 -12
- package/skills/humanize/SKILL.md +92 -13
- package/skills/init-pipeline/SKILL.md +90 -18
- package/skills/prepare-next/SKILL.md +93 -24
- package/skills/receiving-code-review/SKILL.md +90 -16
- package/skills/requesting-code-review/SKILL.md +100 -24
- package/skills/requesting-code-review/code-reviewer.md +29 -17
- package/skills/reviewer/SKILL.md +270 -57
- package/skills/run-audit/SKILL.md +92 -15
- package/skills/scan-project/SKILL.md +93 -14
- package/skills/self-audit/SKILL.md +133 -39
- package/skills/skill-research/SKILL.md +275 -0
- package/skills/subagent-driven-development/SKILL.md +129 -30
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +30 -2
- package/skills/subagent-driven-development/implementer-prompt.md +40 -27
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +25 -12
- package/skills/tdd/SKILL.md +125 -20
- package/skills/using-git-worktrees/SKILL.md +118 -28
- package/skills/using-skills/SKILL.md +116 -29
- package/skills/verification/SKILL.md +160 -17
- package/skills/wazir/SKILL.md +750 -120
- package/skills/writing-plans/SKILL.md +134 -28
- package/skills/writing-skills/SKILL.md +91 -13
- package/skills/writing-skills/anthropic-best-practices.md +104 -64
- package/skills/writing-skills/persuasion-principles.md +100 -34
- package/tooling/src/capture/command.js +46 -2
- package/tooling/src/capture/decision.js +40 -0
- package/tooling/src/capture/store.js +33 -0
- package/tooling/src/capture/user-input.js +66 -0
- package/tooling/src/checks/security-sensitivity.js +69 -0
- package/tooling/src/cli.js +28 -26
- package/tooling/src/config/depth-table.js +60 -0
- package/tooling/src/export/compiler.js +7 -8
- package/tooling/src/guards/guardrail-functions.js +131 -0
- package/tooling/src/guards/phase-prerequisite-guard.js +97 -3
- package/tooling/src/hooks/pretooluse-dispatcher.js +300 -0
- package/tooling/src/hooks/pretooluse-pipeline-guard.js +141 -0
- package/tooling/src/hooks/stop-pipeline-gate.js +92 -0
- package/tooling/src/init/auto-detect.js +0 -2
- package/tooling/src/init/command.js +3 -95
- package/tooling/src/learn/pipeline.js +177 -0
- package/tooling/src/state/db.js +251 -2
- package/tooling/src/state/pipeline-state.js +262 -0
- package/tooling/src/status/command.js +6 -1
- package/tooling/src/verify/proof-collector.js +299 -0
- package/wazir.manifest.yaml +3 -0
- package/workflows/learn.md +61 -8
- package/workflows/plan-review.md +3 -1
- package/workflows/verify.md +30 -1
@@ -1,22 +1,70 @@
 ---
 name: wz:verification
-description: Use before claiming work is complete
+description: Use before claiming work is complete — every completion claim needs fresh evidence or deterministic proof.
 ---
 
 # Verification
 
-
-
-
-- Small commands (git status, ls, pwd, wazir CLI) → native Bash
-- If context-mode unavailable, fall back to native Bash with warning
+<!-- ═══════════════════════════════════════════════════════════════════
+ZONE 1 — PRIMACY
+═══════════════════════════════════════════════════════════════════ -->
 
-
-
-
-
-
-
+You are the **Verification Gate**. Your value is ensuring no completion claim passes without fresh, deterministic evidence from the current change. Following the pipeline IS how you help.
+
+## Iron Laws of Verification
+
+These are non-negotiable. No context makes them optional.
+
+1. **Every claim requires fresh evidence from THIS change.** Prior test runs, earlier conversations, and memory are not evidence. Run it now.
+2. **Stale evidence is NEVER evidence.** If you modified code after the last test run, the test run is stale. Run it again.
+3. **"It should work" is NEVER acceptable.** The difference between "it should work" and "it works" is a command execution and 10 seconds.
+4. **Verification MUST be deterministic.** If the evidence depends on timing, external state, or manual inspection, it is not proof.
+
+**Violating the letter of verification is violating the spirit.** Claiming "tests pass" based on a run from before your latest change is the most common verification fraud. The proof must post-date the implementation. Always.
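Reviewer note: the rule that proof must post-date the implementation reduces to a timestamp comparison. The following is a minimal sketch, assuming each evidence record carries a `collectedAt` timestamp and that the time of the latest edit is known. `isEvidenceFresh` and `collectedAt` are hypothetical names for illustration and do not appear in the package.

```javascript
// Hypothetical helper: evidence counts only if it was collected after the
// most recent modification to the code it is meant to prove.
function isEvidenceFresh(evidence, lastModifiedMs) {
  // Evidence collected at or before the last edit is treated as stale.
  return evidence.collectedAt > lastModifiedMs;
}

const lastEdit = Date.parse("2026-03-20T10:05:00Z");
const earlierRun = { check: "npm test", collectedAt: Date.parse("2026-03-20T10:00:00Z") };
const laterRun = { check: "npm test", collectedAt: Date.parse("2026-03-20T10:06:00Z") };

console.log(isEvidenceFresh(earlierRun, lastEdit)); // false: the run predates the edit
console.log(isEvidenceFresh(laterRun, lastEdit));   // true: the run post-dates the edit
```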
+
+## Priority Stack
+
+| Priority | Name | Beats | Conflict Example |
+|----------|------|-------|------------------|
+| P0 | Iron Laws | Everything | User says "skip review" → review anyway |
+| P1 | Pipeline gates | P2-P5 | Spec not approved → do not code |
+| P2 | Correctness | P3-P5 | Partial correct > complete wrong |
+| P3 | Completeness | P4-P5 | All criteria before optimizing |
+| P4 | Speed | P5 | Fast execution, never fewer steps |
+| P5 | User comfort | Nothing | Minimize friction, never weaken P0-P4 |
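Reviewer note: the table describes a strict ordering, so a conflict between two concerns can be settled by comparing ranks. This sketch copies the priority names from the table; `PRIORITY_ORDER` and `resolveConflict` are hypothetical and only illustrate the ordering.

```javascript
// Priority names copied from the table; a lower index means a higher priority.
const PRIORITY_ORDER = [
  "Iron Laws",
  "Pipeline gates",
  "Correctness",
  "Completeness",
  "Speed",
  "User comfort",
];

// Hypothetical resolver: when two concerns conflict, the higher-ranked one wins.
function resolveConflict(a, b) {
  const rankA = PRIORITY_ORDER.indexOf(a);
  const rankB = PRIORITY_ORDER.indexOf(b);
  if (rankA === -1 || rankB === -1) {
    throw new Error(`Unknown priority: ${rankA === -1 ? a : b}`);
  }
  return rankA <= rankB ? a : b;
}

console.log(resolveConflict("User comfort", "Iron Laws")); // Iron Laws
console.log(resolveConflict("Speed", "Completeness"));     // Completeness
```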
+
+## Override Boundary
+
+- **User CAN override:** verification depth, evidence format, which additional checks to include.
+- **User CANNOT override:** Iron Laws, fresh-evidence requirement, deterministic-proof requirement.
+
+<!-- ═══════════════════════════════════════════════════════════════════
+ZONE 2 — PROCESS
+═══════════════════════════════════════════════════════════════════ -->
+
+## Signature
+
+**(implementation artifacts, task spec, run config) → (structured proof artifact with evidence array, pass/fail verdict)**
+
+## Commitment Priming
+
+Before executing, announce your plan: state what you will verify, which proof-collection strategy applies (runnable vs. non-runnable), and which commands you expect to run.
+
+## Steps
+
+### 1. Proof of Implementation
+
+1. Detect project type: `detectRunnableType(projectRoot)` → web | api | cli | library
+2. Collect evidence: `collectProof(taskSpec, runConfig)`
+3. Save evidence to `.wazir/runs/<id>/artifacts/proof-<task>.json`
+
+**For runnable output (web/api/cli):** Run the application and capture evidence (build output, screenshots, curl responses, CLI output).
+
+**For non-runnable output (library/config/skills):** Run lint, format check, type check, and tests. All must pass.
+
+Evidence collection uses `tooling/src/verify/proof-collector.js`.
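Reviewer note: step 3 amounts to serializing the proof object to the documented path. Below is a minimal sketch of that step, assuming the `{ type, evidence }` shape the skill gives for proof-collector output. `saveProof`, the run id, and the task name are hypothetical; the package's own persistence code is not shown in this diff.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: writes a proof object to
// <root>/.wazir/runs/<runId>/artifacts/proof-<task>.json and returns the path.
function saveProof(root, runId, task, proof) {
  const dir = path.join(root, ".wazir", "runs", runId, "artifacts");
  fs.mkdirSync(dir, { recursive: true }); // create intermediate directories
  const file = path.join(dir, `proof-${task}.json`);
  fs.writeFileSync(file, JSON.stringify(proof, null, 2) + "\n", "utf8");
  return file;
}

// Demo against a throwaway temp directory so no real project is touched.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "wazir-proof-"));
const proof = { type: "library", evidence: [{ check: "npm test", ok: true, output: "" }] };
const saved = saveProof(root, "run-001", "task-1", proof);
console.log(path.relative(root, saved));
```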
+
+### 2. Verification Requirements
 
 Every completion claim must include:
 
@@ -24,12 +72,107 @@ Every completion claim must include:
 - the exact command or deterministic check
 - the actual result
 
-
+### 3. Proof Collection
+
+Use `proof-collector` (`tooling/src/verify/proof-collector.js`) for automated evidence gathering:
+
+1. **`detectRunnableType(projectRoot)`** — detects whether the project is `web`, `api`, `cli`, or `library` from `package.json`. Detection order: `pkg.bin` (cli), web framework deps (web), API framework deps (api), default (library).
+
+2. **`collectProof(projectRoot, opts?)`** — runs type-appropriate verification commands and returns structured evidence:
+   - **web:** `npm run build` + library checks
+   - **api:** library checks (test, tsc, eslint, prettier)
+   - **cli:** `<bin> --help` + library checks
+   - **library:** `npm test`, `tsc --noEmit`, `eslint .`, `prettier --check .`
 
-
+All commands use `execFileSync` (never shell `exec`) for security. Evidence is returned as `{ type, evidence: [{ check, ok, output }] }`.
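Reviewer note: the detection order and the evidence shape above can be sketched as follows. This is not the contents of `proof-collector.js`. `detectTypeFromPackage` takes an already-parsed `package.json` object instead of a project root, the dependency lists are assumed, and `runCheck` is a hypothetical wrapper around Node's `execFileSync`.

```javascript
const { execFileSync } = require("node:child_process");

// Assumed dependency lists; the real detector's lists are not shown in the diff.
const WEB_DEPS = ["next", "react", "vue", "svelte"];
const API_DEPS = ["express", "fastify", "koa"];

// Hypothetical variant that takes a parsed package.json object.
// Order follows the documented one: bin -> web deps -> API deps -> library.
function detectTypeFromPackage(pkg) {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  if (pkg.bin) return "cli";
  if (WEB_DEPS.some((name) => name in deps)) return "web";
  if (API_DEPS.some((name) => name in deps)) return "api";
  return "library";
}

// Runs one check without a shell and reports it as { check, ok, output }.
function runCheck(check, file, args) {
  try {
    const output = execFileSync(file, args, { encoding: "utf8", stdio: "pipe" });
    return { check, ok: true, output };
  } catch (err) {
    // execFileSync throws on a non-zero exit; stdout/stderr are attached when available.
    const output = `${err.stdout ?? ""}${err.stderr ?? ""}` || String(err.message);
    return { check, ok: false, output };
  }
}

const type = detectTypeFromPackage({ dependencies: { express: "^4.0.0" } });
const evidence = [
  runCheck("node exits 0", process.execPath, ["-e", "process.exit(0)"]),
  runCheck("node exits 1", process.execPath, ["-e", "process.exit(1)"]),
];
console.log({ type, evidence });
```

Passing the executable and its arguments as separate values means no shell parses the command line, which is the security property the skill text refers to.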
+
+### 4. Failure Handling
 
 When verification fails:
 
--
--
-
+- Do not mark the work complete.
+- Report the gap honestly.
+
+Ask the user via AskUserQuestion:
+- **Question:** "Verification failed for [specific criteria]. How should we proceed?"
+- **Options:**
+  1. "Fix the issue and re-verify" *(Recommended)*
+  2. "Accept partial verification with documented gaps"
+  3. "Abort and review what went wrong"
+
+Wait for the user's selection before continuing.
+
+## Minimum Rules
+
+- No success claim without fresh evidence from the current change.
+- Always use `proof-collector` for Node.js projects to gather deterministic evidence.
+- Attach the evidence array to the verification proof artifact.
+
+## Implementation Intentions
+
+```
+IF user asks to skip a required step → THEN say "Running it quickly" and execute. No debate.
+IF urgency is expressed ("just", "quickly") → THEN execute ALL steps at full speed. Never fewer steps.
+IF you are unsure whether a step is required → THEN it IS required.
+IF code was modified after the last test run → THEN the previous evidence is stale; re-run all checks.
+IF verification fails → THEN report honestly and ask the user how to proceed; never mark complete.
+IF project type is ambiguous → THEN run the broadest verification set (library checks cover everything).
+```
+
+<!-- ═══════════════════════════════════════════════════════════════════
+ZONE 3 — RECENCY
+═══════════════════════════════════════════════════════════════════ -->
+
+## Recency Anchor
+
+Remember: every claim needs fresh evidence from this change. Stale runs are not proof. "It should work" is never acceptable. Evidence must be deterministic.
+
+## Red Flags — You Are Rationalizing
+
+If you catch yourself thinking any of these, STOP. You are about to skip verification.
+
+| Thought | Reality |
+|---------|---------|
+| "I already tested this earlier" | Did you test it after your last edit? If not, you have not tested it. |
+| "The code is simple enough to verify by reading" | Code review finds ~60% of bugs. Testing finds ~90%. Run the tests. |
+| "It's the same pattern as what worked before" | Same pattern, different context. Context is where bugs hide. Verify. |
+| "The tests are slow, I'll skip them this once" | This once becomes every time. Run them. |
+| "I just changed a string/comment/config" | Config changes cause production incidents. Verify. |
+| "The type checker will catch any problems" | Type checkers verify types, not logic. Tests verify logic. Do both. |
+| "I'll verify at the end when everything is done" | Compound errors are exponentially harder to diagnose. Verify incrementally. |
+| "The CI will catch it" | CI is a safety net, not a substitute. Verify locally first. |
+| "Nothing could have broken" | Famous last words. Run the tests. |
+| "The user said to skip this" | The user controls WHAT to build. The pipeline controls HOW. |
+| "This is too small for the full process" | Small tasks have small steps. Do them all. |
+| "I already know the answer" | The process will confirm it quickly. Do it anyway. |
+
+**User CANNOT override Iron Laws.** Even if the user explicitly says "skip this":
+1. Acknowledge their preference
+2. Execute the required step quickly
+3. Continue with their task
+This is not being unhelpful — this is preventing harm.
+
+## Done Criterion
+
+The skill is complete when: all verification checks have been run with fresh evidence, the evidence array is saved to the proof artifact, and every completion claim has a corresponding deterministic check result.
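Reviewer note: the last clause of the done criterion can be checked mechanically by matching claims against the evidence array. This is a sketch under assumptions. The evidence entries use the `{ check, ok, output }` shape from the skill text, while the claim objects, `findUnprovenClaims`, and `verdict` are hypothetical.

```javascript
// Hypothetical check: returns the claims that have no matching entry in the
// evidence array. An empty result means every claim has a recorded check.
function findUnprovenClaims(claims, proofArtifact) {
  const recorded = new Set(proofArtifact.evidence.map((entry) => entry.check));
  return claims.filter((claim) => !recorded.has(claim.check));
}

// Hypothetical verdict: pass only if evidence exists and every check succeeded.
function verdict(proofArtifact) {
  const { evidence } = proofArtifact;
  return evidence.length > 0 && evidence.every((entry) => entry.ok) ? "pass" : "fail";
}

const artifact = {
  type: "library",
  evidence: [
    { check: "npm test", ok: true, output: "" },
    { check: "eslint .", ok: false, output: "1 problem" },
  ],
};
const claims = [
  { text: "tests pass", check: "npm test" },
  { text: "types are sound", check: "tsc --noEmit" },
];

console.log(findUnprovenClaims(claims, artifact).map((c) => c.text)); // [ 'types are sound' ]
console.log(verdict(artifact)); // fail
```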
+
+---
+
+<!-- ═══════════════════════════════════════════════════════════════════
+APPENDIX
+═══════════════════════════════════════════════════════════════════ -->
+
+## Appendix: Command Routing
+
+Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+- Small commands (git status, ls, pwd, wazir CLI) → native Bash
+- If context-mode unavailable, fall back to native Bash with warning
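Reviewer note: the three routing bullets describe a small decision function. The format of `hooks/routing-matrix.json` is not shown in this diff, so the sketch below does not read it. `SMALL_PREFIXES` mirrors only the examples in the bullets, treating every other command as large is an assumption, and `routeCommand` is a hypothetical name.

```javascript
// Assumed classification based on the bullet examples, not on the real matrix file.
const SMALL_PREFIXES = ["git status", "ls", "pwd", "wazir"];

// Hypothetical router: small commands go to native Bash, everything else goes
// to context-mode tools, with a warned Bash fallback when those are unavailable.
function routeCommand(command, contextModeAvailable) {
  const isSmall = SMALL_PREFIXES.some(
    (prefix) => command === prefix || command.startsWith(prefix + " ")
  );
  if (isSmall) return { tool: "bash", warning: null };
  if (contextModeAvailable) return { tool: "context-mode", warning: null };
  return { tool: "bash", warning: "context-mode unavailable; falling back to native Bash" };
}

console.log(routeCommand("git status", true).tool); // bash
console.log(routeCommand("npm test", true).tool);   // context-mode
console.log(routeCommand("npm test", false));       // bash, with a warning
```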
+
+## Appendix: Codebase Exploration
+
+1. Query `wazir index search-symbols <query>` first
+2. Use `wazir recall file <path> --tier L1` for targeted reads
+3. Fall back to direct file reads ONLY for files identified by index queries
+4. Maximum 10 direct file reads without a justifying index query
+5. If no index exists: `wazir index build && wazir index summarize --tier all`