@wazir-dev/cli 1.2.0 → 1.4.0
- package/CHANGELOG.md +54 -44
- package/README.md +13 -13
- package/assets/demo.cast +47 -0
- package/assets/demo.gif +0 -0
- package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
- package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
- package/docs/concepts/architecture.md +1 -1
- package/docs/concepts/why-wazir.md +1 -1
- package/docs/readmes/INDEX.md +1 -1
- package/docs/readmes/features/expertise/README.md +1 -1
- package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
- package/docs/reference/hooks.md +1 -0
- package/docs/reference/launch-checklist.md +3 -3
- package/docs/reference/review-loop-pattern.md +3 -2
- package/docs/reference/skill-tiers.md +2 -2
- package/docs/research/2026-03-20-agents/a18fb002157904af5.txt +187 -0
- package/docs/research/2026-03-20-agents/a1d0ac79ac2f11e6f.txt +2 -0
- package/docs/research/2026-03-20-agents/a324079de037abd7c.txt +198 -0
- package/docs/research/2026-03-20-agents/a357586bccfafb0e5.txt +256 -0
- package/docs/research/2026-03-20-agents/a4365394e4d753105.txt +137 -0
- package/docs/research/2026-03-20-agents/a492af28bc52d3613.txt +136 -0
- package/docs/research/2026-03-20-agents/a4984db0b6a8eee07.txt +124 -0
- package/docs/research/2026-03-20-agents/a5b30e59d34bbb062.txt +214 -0
- package/docs/research/2026-03-20-agents/a5cf7829dab911586.txt +165 -0
- package/docs/research/2026-03-20-agents/a607157c30dd97c9e.txt +96 -0
- package/docs/research/2026-03-20-agents/a60b68b1e19d1e16b.txt +115 -0
- package/docs/research/2026-03-20-agents/a722af01c5594aba0.txt +166 -0
- package/docs/research/2026-03-20-agents/a787bdc516faa5829.txt +181 -0
- package/docs/research/2026-03-20-agents/a7c46d1bba1056ed2.txt +132 -0
- package/docs/research/2026-03-20-agents/a7e5abbab2b281a0d.txt +100 -0
- package/docs/research/2026-03-20-agents/a8dbadc66cd0d7d5a.txt +95 -0
- package/docs/research/2026-03-20-agents/a904d9f45d6b86a6d.txt +75 -0
- package/docs/research/2026-03-20-agents/a927659a942ee7f60.txt +102 -0
- package/docs/research/2026-03-20-agents/a962cb569191f7583.txt +125 -0
- package/docs/research/2026-03-20-agents/aab6decea538aac41.txt +148 -0
- package/docs/research/2026-03-20-agents/abd58b853dd938a1b.txt +295 -0
- package/docs/research/2026-03-20-agents/ac009da573eff7f65.txt +100 -0
- package/docs/research/2026-03-20-agents/ac1bc783364405e5f.txt +190 -0
- package/docs/research/2026-03-20-agents/aca5e2b57fde152a0.txt +132 -0
- package/docs/research/2026-03-20-agents/ad849b8c0a7e95b8b.txt +176 -0
- package/docs/research/2026-03-20-agents/adc2b12a4da32c962.txt +258 -0
- package/docs/research/2026-03-20-agents/af97caaaa9a80e4cb.txt +146 -0
- package/docs/research/2026-03-20-agents/afc5faceee368b3ca.txt +111 -0
- package/docs/research/2026-03-20-agents/afdb282d866e3c1e4.txt +164 -0
- package/docs/research/2026-03-20-agents/afe9d1f61c02b1e8d.txt +299 -0
- package/docs/research/2026-03-20-agents/b4hmkwril.txt +1856 -0
- package/docs/research/2026-03-20-agents/b80ptk89g.txt +1856 -0
- package/docs/research/2026-03-20-agents/bf54s1jss.txt +1150 -0
- package/docs/research/2026-03-20-agents/bhd6kq2kx.txt +1856 -0
- package/docs/research/2026-03-20-agents/bmb2fodyr.txt +988 -0
- package/docs/research/2026-03-20-agents/bmmsrij8i.txt +826 -0
- package/docs/research/2026-03-20-agents/bn4t2ywpu.txt +2175 -0
- package/docs/research/2026-03-20-agents/bu22t9f1z.txt +0 -0
- package/docs/research/2026-03-20-agents/bwvl98v2p.txt +738 -0
- package/docs/research/2026-03-20-agents/psych-a3697a7fd06eb64fd.txt +135 -0
- package/docs/research/2026-03-20-agents/psych-a37776fabc870feae.txt +123 -0
- package/docs/research/2026-03-20-agents/psych-a5b1fe05c0589efaf.txt +2 -0
- package/docs/research/2026-03-20-agents/psych-a95c15b1f29424435.txt +76 -0
- package/docs/research/2026-03-20-agents/psych-a9c26f4d9172dde7c.txt +2 -0
- package/docs/research/2026-03-20-agents/psych-aa19c69f0ca2c5ad3.txt +2 -0
- package/docs/research/2026-03-20-agents/psych-aa4e4cb70e1be5ecb.txt +95 -0
- package/docs/research/2026-03-20-agents/psych-ab5b302f26a554663.txt +102 -0
- package/docs/research/2026-03-20-deep-research-complete.md +101 -0
- package/docs/research/2026-03-20-deep-research-status.md +38 -0
- package/docs/research/2026-03-20-enforcement-research.md +107 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +117 -0
- package/expertise/composition-map.yaml +27 -8
- package/expertise/digests/reviewer/ai-coding-digest.md +83 -0
- package/expertise/digests/reviewer/architectural-thinking-digest.md +63 -0
- package/expertise/digests/reviewer/architecture-antipatterns-digest.md +49 -0
- package/expertise/digests/reviewer/code-smells-digest.md +53 -0
- package/expertise/digests/reviewer/coupling-cohesion-digest.md +54 -0
- package/expertise/digests/reviewer/ddd-digest.md +60 -0
- package/expertise/digests/reviewer/dependency-risk-digest.md +40 -0
- package/expertise/digests/reviewer/error-handling-digest.md +55 -0
- package/expertise/digests/reviewer/review-methodology-digest.md +49 -0
- package/exports/hosts/claude/.claude/commands/learn.md +61 -8
- package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
- package/exports/hosts/claude/.claude/commands/verify.md +30 -1
- package/exports/hosts/claude/.claude/settings.json +7 -6
- package/exports/hosts/claude/export.manifest.json +8 -5
- package/exports/hosts/claude/host-package.json +3 -0
- package/exports/hosts/codex/export.manifest.json +8 -5
- package/exports/hosts/codex/host-package.json +3 -0
- package/exports/hosts/cursor/.cursor/hooks.json +6 -6
- package/exports/hosts/cursor/export.manifest.json +8 -5
- package/exports/hosts/cursor/host-package.json +3 -0
- package/exports/hosts/gemini/export.manifest.json +8 -5
- package/exports/hosts/gemini/host-package.json +3 -0
- package/hooks/definitions/pretooluse_dispatcher.yaml +26 -0
- package/hooks/definitions/pretooluse_pipeline_guard.yaml +22 -0
- package/hooks/definitions/stop_pipeline_gate.yaml +22 -0
- package/hooks/hooks.json +7 -6
- package/hooks/pretooluse-dispatcher +84 -0
- package/hooks/pretooluse-pipeline-guard +9 -0
- package/hooks/stop-pipeline-gate +9 -0
- package/llms-full.txt +48 -18
- package/package.json +2 -3
- package/schemas/decision.schema.json +15 -0
- package/schemas/hook.schema.json +4 -1
- package/schemas/phase-report.schema.json +9 -0
- package/skills/TEMPLATE-3-ZONE.md +160 -0
- package/skills/brainstorming/SKILL.md +137 -21
- package/skills/clarifier/SKILL.md +364 -53
- package/skills/claude-cli/SKILL.md +91 -12
- package/skills/codex-cli/SKILL.md +91 -12
- package/skills/debugging/SKILL.md +133 -38
- package/skills/design/SKILL.md +173 -37
- package/skills/dispatching-parallel-agents/SKILL.md +129 -31
- package/skills/executing-plans/SKILL.md +113 -25
- package/skills/executor/SKILL.md +252 -21
- package/skills/finishing-a-development-branch/SKILL.md +107 -18
- package/skills/gemini-cli/SKILL.md +91 -12
- package/skills/humanize/SKILL.md +92 -13
- package/skills/init-pipeline/SKILL.md +90 -18
- package/skills/prepare-next/SKILL.md +93 -24
- package/skills/receiving-code-review/SKILL.md +90 -16
- package/skills/requesting-code-review/SKILL.md +100 -24
- package/skills/requesting-code-review/code-reviewer.md +29 -17
- package/skills/reviewer/SKILL.md +270 -57
- package/skills/run-audit/SKILL.md +92 -15
- package/skills/scan-project/SKILL.md +93 -14
- package/skills/self-audit/SKILL.md +133 -39
- package/skills/skill-research/SKILL.md +275 -0
- package/skills/subagent-driven-development/SKILL.md +129 -30
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +30 -2
- package/skills/subagent-driven-development/implementer-prompt.md +40 -27
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +25 -12
- package/skills/tdd/SKILL.md +125 -20
- package/skills/using-git-worktrees/SKILL.md +118 -28
- package/skills/using-skills/SKILL.md +116 -29
- package/skills/verification/SKILL.md +160 -17
- package/skills/wazir/SKILL.md +750 -120
- package/skills/writing-plans/SKILL.md +134 -28
- package/skills/writing-skills/SKILL.md +91 -13
- package/skills/writing-skills/anthropic-best-practices.md +104 -64
- package/skills/writing-skills/persuasion-principles.md +100 -34
- package/tooling/src/capture/command.js +46 -2
- package/tooling/src/capture/decision.js +40 -0
- package/tooling/src/capture/store.js +33 -0
- package/tooling/src/capture/user-input.js +66 -0
- package/tooling/src/checks/security-sensitivity.js +69 -0
- package/tooling/src/cli.js +28 -26
- package/tooling/src/config/depth-table.js +60 -0
- package/tooling/src/export/compiler.js +7 -8
- package/tooling/src/guards/guardrail-functions.js +131 -0
- package/tooling/src/guards/phase-prerequisite-guard.js +97 -3
- package/tooling/src/hooks/pretooluse-dispatcher.js +300 -0
- package/tooling/src/hooks/pretooluse-pipeline-guard.js +141 -0
- package/tooling/src/hooks/stop-pipeline-gate.js +92 -0
- package/tooling/src/init/auto-detect.js +0 -2
- package/tooling/src/init/command.js +3 -95
- package/tooling/src/learn/pipeline.js +177 -0
- package/tooling/src/state/db.js +251 -2
- package/tooling/src/state/pipeline-state.js +262 -0
- package/tooling/src/status/command.js +6 -1
- package/tooling/src/verify/proof-collector.js +299 -0
- package/wazir.manifest.yaml +3 -0
- package/workflows/learn.md +61 -8
- package/workflows/plan-review.md +3 -1
- package/workflows/verify.md +30 -1
package/CHANGELOG.md
CHANGED
@@ -1,48 +1,9 @@
-# [1.
-
-
-### Bug Fixes
-
-* address 4 Codex review findings — nested payload, fallback, hashing, freshness key ([2276cae](https://github.com/MohamedAbdallah-14/Wazir/commit/2276caefc3fb22ab2ed1cd1b78152f64f3e5685c))
-* address 5 Codex review findings — routing state root, stats accuracy, Cursor hooks ([2cb21ba](https://github.com/MohamedAbdallah-14/Wazir/commit/2cb21ba258f63a663b8eddc0cc25322900022125))
-* address final review findings — context-mode CLI detection, AC heading overlap, CHANGELOG ([c33947f](https://github.com/MohamedAbdallah-14/Wazir/commit/c33947f8481647017506b0929ff87511d5dc6cad))
-* **hooks:** canonicalize hook registry and fix Claude Code payload mapping (I9) ([3e8810a](https://github.com/MohamedAbdallah-14/Wazir/commit/3e8810af7625de206979ebb356eeae6b7a1b5e67))
-* **self-audit:** loop 1 — add missing run-audit workflow to reference docs ([f830d84](https://github.com/MohamedAbdallah-14/Wazir/commit/f830d842cad4756db3311ff5cedb98fdbb5b0f72))
-* **self-audit:** loop 2 — add run-audit to reference docs, register 2 unlisted test files ([3e65c89](https://github.com/MohamedAbdallah-14/Wazir/commit/3e65c89b0532eed28313a17898abdf3627a5dadf))
-* **self-audit:** loop 3 — fix expertise count drift (261/308→268), schema count (16/18→19), regenerate exports ([500df05](https://github.com/MohamedAbdallah-14/Wazir/commit/500df057c59bcf82d691e52d0d9e09e8cec33edc))
-* **self-audit:** loop 4 — remove unused gray-matter dep, complete skill roster (11→28), add 17 skill readme stubs ([87047d1](https://github.com/MohamedAbdallah-14/Wazir/commit/87047d1f009b2f6cb1c73cd8db69c97d740c6385))
-* **self-audit:** loop 5 — fix INDEX.md counts (60→76 readmes, 11→28 skills), add 3 missing schemas to catalog, fix export diagram counts, remove gray-matter ref, regenerate exports ([92d187c](https://github.com/MohamedAbdallah-14/Wazir/commit/92d187c12786a6d19ec3c65c2d4b09499d853582))
+# [1.4.0](https://github.com/MohamedAbdallah-14/Wazir/compare/v1.3.0...v1.4.0) (2026-03-20)
 
 
 ### Features
 
-*
-* **clarifier:** context-mode fallbacks (item-6) ([4694143](https://github.com/MohamedAbdallah-14/Wazir/commit/4694143028066e373728feb1e5100cdc5fb6aec2))
-* **clarifier:** gap analysis exit gate (item-3) ([83df703](https://github.com/MohamedAbdallah-14/Wazir/commit/83df703f251a541c2155a2f360aa2e7ec5206f02))
-* **clarifier:** online research phase (item-9) ([0ac5ac8](https://github.com/MohamedAbdallah-14/Wazir/commit/0ac5ac849edeed975fc5051c04a3a234e9ca7994))
-* **clarifier:** preserve input quality (item-1) ([ba46424](https://github.com/MohamedAbdallah-14/Wazir/commit/ba46424d4900a1b13f1456156cb22f54e9b5ba1d))
-* **clarifier:** reviewer skill invocation policy (item-13) ([4b3f59d](https://github.com/MohamedAbdallah-14/Wazir/commit/4b3f59dd7ae3f7f93b4d1f7d7da070dc98c3a369))
-* **clarifier:** run-scoped user feedback routing (item-11) ([5436746](https://github.com/MohamedAbdallah-14/Wazir/commit/5436746d4da453c23c213db7c11e4497870352da))
-* **clarifier:** spec-kit plan format (item-2) ([0266247](https://github.com/MohamedAbdallah-14/Wazir/commit/02662474db97d87c2ce6f82b2e2a7b960d386d00))
-* **executor:** changelog and gitflow enforcement (item-5) ([5a5986a](https://github.com/MohamedAbdallah-14/Wazir/commit/5a5986a0131bbefe7d601e799411ae48cf68fe10))
-* **hooks:** add index refresh to session-start hook (D2) ([ff4647f](https://github.com/MohamedAbdallah-14/Wazir/commit/ff4647f2fc18e963a8180ec3b93937ce70d33be4))
-* **hooks:** extract routing logic and add context-mode router tests (D1/D3) ([1e7650a](https://github.com/MohamedAbdallah-14/Wazir/commit/1e7650a7cf5af944807897f5a3200a69b691cd5f)), closes [passthrou#vs-large](https://github.com/passthrou/issues/vs-large)
-* implement all 8 remaining enhancement items ([00acff5](https://github.com/MohamedAbdallah-14/Wazir/commit/00acff57e7f0bc9515ef8cce8acf9596577fca83))
-* **init-pipeline:** context-mode detection (item-6-init) ([839a5b3](https://github.com/MohamedAbdallah-14/Wazir/commit/839a5b3b41b0246468d4e9b0da61c4f17e7c3c41))
-* **input:** extract input scanner utility and verify scanning (I3) ([8a232a2](https://github.com/MohamedAbdallah-14/Wazir/commit/8a232a2312e4a42b1aefc5c9c6283d738c58f0c4))
-* **review-loop:** fix-and-loop with convergence cap (item-12) ([d2fcb9c](https://github.com/MohamedAbdallah-14/Wazir/commit/d2fcb9c7bd93407a88dda2c8c96ffd397834f268))
-* **review-loop:** phase scoring with dimension deltas (item-15) ([1fe9f6d](https://github.com/MohamedAbdallah-14/Wazir/commit/1fe9f6d85d36d610fb6c48e2e1035d2f71134832))
-* **reviewer:** codex output context protection (item-17) ([abb8a77](https://github.com/MohamedAbdallah-14/Wazir/commit/abb8a77e922268adbfaee721facbdcbaa1e30646))
-* **skills:** interactive numbered options at all checkpoints (item-10) ([45bc6fd](https://github.com/MohamedAbdallah-14/Wazir/commit/45bc6fdf5f8feb558a89823befa72b0ca78cdce3))
-* **templates:** create spec-kit task template (item-8) ([cd55d73](https://github.com/MohamedAbdallah-14/Wazir/commit/cd55d73316a59437c76ddeb22643020a60d76b55))
-* **tooling:** AC verification scaffold — 111 checks (T000) ([ea61684](https://github.com/MohamedAbdallah-14/Wazir/commit/ea616843b5deca97ead38b8db06c1a1eef15458f))
-* **wazir:** enforce pipeline phases — agent must never skip phases ([3e21bd2](https://github.com/MohamedAbdallah-14/Wazir/commit/3e21bd2d36af57d0db4b8dc68458cc63bbb8676a))
-* **wazir:** full end-of-phase reports (item-16) ([6c84455](https://github.com/MohamedAbdallah-14/Wazir/commit/6c84455e867eb19d49b69cfcb130b147a80917f9))
-* **wazir:** implement 9 enhancement decisions from brainstorming session ([885a2c1](https://github.com/MohamedAbdallah-14/Wazir/commit/885a2c1f04fc9ac02c65ed06461114e4c3251393))
-* **wazir:** implement learning system — extraction, injection, and handoff ([06eb107](https://github.com/MohamedAbdallah-14/Wazir/commit/06eb107e8641b50e075ca1744f899da0fe9d09e6)), closes [#1](https://github.com/MohamedAbdallah-14/Wazir/issues/1) [#2](https://github.com/MohamedAbdallah-14/Wazir/issues/2) [#13](https://github.com/MohamedAbdallah-14/Wazir/issues/13)
-* **wazir:** restructure pipeline from 14 phases to 4 (Init, Clarifier, Executor, Final Review) ([d6e2372](https://github.com/MohamedAbdallah-14/Wazir/commit/d6e2372d5a0c6824905569de26ddb2434eb74dca)), closes [#4](https://github.com/MohamedAbdallah-14/Wazir/issues/4) [#5](https://github.com/MohamedAbdallah-14/Wazir/issues/5)
-* **wazir:** resume copies artifacts (item-4) ([14633d4](https://github.com/MohamedAbdallah-14/Wazir/commit/14633d4cd66d04846b3297c3d50d99046a89fb7c))
-* **wazir:** usage reports at phase exits (item-7) ([8f055be](https://github.com/MohamedAbdallah-14/Wazir/commit/8f055be12992c56e01619e2a9547dc3e9045dc7c))
+* Wazir v2 — three-layer enforcement, psychology-driven skill rewriting, 33-agent research ([#6](https://github.com/MohamedAbdallah-14/Wazir/issues/6)) ([893ca14](https://github.com/MohamedAbdallah-14/Wazir/commit/893ca143fc29365f92aa923c87f50d3c66c8f485))
 
 # Changelog
 
@@ -52,16 +13,61 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
 
 ## [Unreleased]
 
+### Added
+- Three-layer enforcement architecture: Stop hook blocks completion, PreToolUse hooks enforce phase gates, subagent-per-phase controller isolates workflows
+- Pipeline state machine (`tooling/src/state/pipeline-state.js`) with session isolation, atomic writes, and crash recovery
+- Per-phase guardrail validators with concrete pass/fail criteria per workflow boundary
+- Consolidated PreToolUse dispatcher replacing 3 separate hooks (reduces false "hook error" labels)
+- Psychology-driven 3-zone skill architecture across all 29 SKILL.md files (primacy Iron Laws → structured process → recency re-anchoring)
+- Rationalization tables, red flags, IF-THEN implementation intentions in every discipline skill
+- CSO description fixes — triggers only, never process summaries
+- Priority stack P0-P5 with conflict examples in every skill
+- Identity framing: "pipeline compliance IS helpfulness"
+- Mode-specific reviewer composition with 8 digest modules (3-5K tokens each)
+- Findings-to-antipattern learning pipeline with LLM-assisted clustering and threshold-based promotion
+- Depth parameter table (`tooling/src/config/depth-table.js`) with 40+ parameters across all phases
+- Artifact dependency graph with digest verification and selective regeneration
+- Decision logging module (`tooling/src/capture/decision.js`) with NDJSON audit trail
+- 33-agent architecture research + 8-agent psychology research (docs/research/)
+- Workflow completion enforcement — `validateRunCompletion()` ensures all enabled workflows complete before run finalizes (`wazir capture summary --complete`)
+- Mandatory security gate — pattern-based diff scanner (`tooling/src/checks/security-sensitivity.js`) auto-adds 6 security review dimensions when auth/token/SQL/etc. patterns detected
+- Three interaction modes: `auto` (overnight, Codex-required), `guided` (default), `interactive` (co-design) via `/wazir auto|interactive ...`
+- User input capture — NDJSON logging of all user messages during a run (`tooling/src/capture/user-input.js`) with retention pruning
+- Two-layer reasoning chain output — concise conversation triggers + detailed file output at `reasoning/phase-<name>-reasoning.md`
+- Input Coverage dimension in self-audit (compares original input vs plan vs commits)
+- Input Coverage dimension in plan-review (8th dimension, catches scope reduction)
+- Two-level phase model — `parent_phase` and `workflow` fields in phase report schema, hierarchy display in `wazir status`
+- CLI/context-mode enforcement — reviewer flags >5 direct reads without index query and large commands without context-mode
+- Per-phase context savings display at phase boundaries via `wazir stats`
+- Overnight skill research skill (`skills/skill-research/SKILL.md`) for competitive analysis against superpowers and other frameworks
+- Anti-pattern docs: AP-23 (skipping enabled workflows), AP-24 (clarifier deciding scope without asking)
+
+### Changed
+- Clarifier Phase 1A rewritten — research runs first, then informed question batches (3-7 per batch), every scope exclusion requires user confirmation
+- Executor enforces one commit per task (hard rule, reviewer rejects multi-task batching)
+- Per-phase savings display added to clarifier and executor phase boundaries
+
+### Fixed
+- SQLite ExperimentalWarning suppressed via lazy dynamic imports in CLI entrypoint
+- `--complete` flag properly parsed in `wazir capture summary`
+- `validateRunCompletion` filters by `workflow_policy` (enabled workflows only), not full manifest list
+
 ### Changed
 - Restructured pipeline from 14 micro-phases to 4 main phases: Init, Clarifier, Executor, Final Review
 - Removed depth and intent questions from pipeline init — depth defaults to standard (override via inline modifiers), intent inferred from request keywords
 - Enabled learn + prepare-next workflows by default (part of Final Review phase)
 - Renamed `phase_policy` to `workflow_policy` in run-config (legacy name still supported)
-- Pipeline init no longer asks about Agent Teams — always sequential
 - Input directory (`input/`) now scanned automatically at startup
 - Learning extraction with concrete proposal format in reviewer final mode
 - Accepted learnings injected into clarifier context (top 10 by confidence, scope-matched)
 - Prepare-next skill produces structured handoff document
+- All pipeline checkpoints now use AskUserQuestion pattern instead of numbered lists
+- Every pipeline phase outputs value-reporting text (before/after) explaining why the phase matters and what it found
+- Review dimensions annotated with "catches:" descriptions explaining what class of bugs each dimension prevents
+
+### Removed
+- `@inquirer/prompts` dependency and `--interactive` init path (always fails in non-TTY)
+- All Agent Teams references (team_mode, parallel_backend, TeamCreate/SendMessage/TeamDelete, Free Thinker/Grounder/Synthesizer)
 
 ### Fixed
 - Router logs now write to manifest-derived state root instead of `_default` (Codex P1)
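Reviewer note on the `validateRunCompletion` fix in the hunk above: the changelog says completion is checked against workflows enabled in `workflow_policy` rather than the full manifest list. The sketch below illustrates that filtering step only. The function signature, the `{ enabled: true }` policy shape, and the workflow names are assumptions made for illustration; the real implementation is not shown in this diff.

```javascript
// Hypothetical sketch. The real validateRunCompletion() signature and the
// run-config shape are NOT visible in this diff; everything here is assumed.
function validateRunCompletion(manifestWorkflows, workflowPolicy, completedWorkflows) {
  const completed = new Set(completedWorkflows);

  // Only workflows enabled in workflow_policy count toward completion;
  // disabled entries from the manifest are ignored.
  const enabled = manifestWorkflows.filter(
    (name) => workflowPolicy[name] !== undefined && workflowPolicy[name].enabled === true
  );

  // Anything enabled but not yet completed blocks finalization.
  const missing = enabled.filter((name) => !completed.has(name));
  return { complete: missing.length === 0, enabled, missing };
}

const result = validateRunCompletion(
  ['clarify', 'execute', 'verify', 'learn'],
  {
    clarify: { enabled: true },
    execute: { enabled: true },
    verify: { enabled: true },
    learn: { enabled: false },
  },
  ['clarify', 'execute']
);
console.log(result.missing); // [ 'verify' ]
```

Checking against the full manifest instead would also report `learn` as missing here, which appears to be the behavior the fix removes.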
@@ -69,15 +75,19 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
 - Index-query savings now computed from avoided bytes, not raw bytes (Codex P2)
 - Index-query savings included in savings-ratio denominator (Codex P2)
 - Cursor export now includes context-mode-router hook (Codex P2)
+- SessionStart hook uses correct `database_path` key for index freshness check
+- TabManager stop hook error documented as Claude Code internal (cannot fix from Wazir side)
 
 ### Added
 - Core review loop pattern across all pipeline phases with Codex CLI integration
 - `wazir capture loop-check` CLI subcommand with task-scoped cap tracking and run-config loader
-- `wazir init`
+- `wazir init` zero-config auto-init (no prompts, infer everything)
 - `docs/reference/review-loop-pattern.md` canonical reference for the review loop pattern
 - Standalone skills: `/wazir:clarifier`, `/wazir:executor`, `/wazir:reviewer`
-- Agent Teams real implementation in brainstorming (TeamCreate, SendMessage, TeamDelete)
 - Codex prompt templates (artifact + code) with "Do NOT load skills" instruction
+- `tooling/src/verify/proof-collector.js` — detects project type (web/api/cli/library) and collects mechanical proof of implementation
+- Phase reports wired into pipeline — `wazir report phase` called after each phase exit and displayed to user
+- Proof-of-implementation in verify workflow — runnable vs non-runnable detection with evidence collection
 - Git branch enforcement in `/wazir` runner (validates branch, offers to create feature branch)
 - CLI wiring across pipeline phases (doctor gate, index build/refresh, capture events, validate gates)
 - CHANGELOG enforcement in executor and reviewer skills
package/README.md
CHANGED
@@ -32,7 +32,7 @@
 I'm Mohamed Abdallah. I kept watching AI agents write confident code that broke in production, skip tests, and forget what we agreed on yesterday. So I stopped asking them to be better and built them an engineering department instead.
 
 **Wazir puts engineering discipline inside AI coding agents.**
-No wrapper. No server. Just structure -- inside Claude, Codex, Gemini, and Cursor. Built on 300+ research sources distilled into
+No wrapper. No server. Just structure -- inside Claude, Codex, Gemini, and Cursor. Built on 300+ research sources distilled into 315 curated expertise modules across 12 domains.
 
 ---
 
@@ -77,7 +77,7 @@ AI agents love to announce they're finished. Wazir doesn't care. Every phase loo
 
 ## The Pipeline
 
-Every task flows through
+Every task flows through 15 workflows grouped into 4 phases. Three are adversarial review gates that block progress until the reviewer explicitly approves. Rejection loops back to the authoring phase.
 
 ```mermaid
 graph LR
@@ -124,7 +124,7 @@ Three concepts.
 
 **2 -- Phases are artifact checkpoints, not conversation stages.** Every phase consumes a named artifact from the previous phase and produces a named artifact for the next. Nothing flows through conversation history. A session can end, a new agent can pick up the artifacts, and delivery continues. The handoff is explicit, structured, and schema-validated against 19 JSON schemas. See [Architecture](docs/concepts/architecture.md).
 
-**3 -- The composition engine loads the right expert automatically.** One agent pretending to be an expert in everything is an expert in nothing. A 4-layer system (always, auto, stacks, concerns) decides which of
+**3 -- The composition engine loads the right expert automatically.** One agent pretending to be an expert in everything is an expert in nothing. A 4-layer system (always, auto, stacks, concerns) decides which of 315 expertise modules load into each role's context. The executor gets modules on how to build. The verifier gets modules on what to detect. The reviewer gets modules on what to flag. All resolved automatically from the task's declared stack and concerns. Max 15 modules per dispatch, token budget enforced.
 
 ---
 
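Reviewer note on the composition engine described in the hunk above: the README text names four layers (always, auto, stacks, concerns), a cap of 15 modules per dispatch, and an enforced token budget. The sketch below shows one way such a selection could work. Treating the layer order as a priority, the module shape (`id`, `layer`, `tokens`), and the budget value are all assumptions; the diff does not show the actual resolution logic in `expertise/composition-map.yaml` or the tooling.

```javascript
// Hypothetical sketch. Only the four layer names and the cap of 15 come from
// the README; the priority ordering, module shape, and budget are assumed.
const LAYER_ORDER = ['always', 'auto', 'stacks', 'concerns'];
const MAX_MODULES = 15;

function selectModules(candidates, tokenBudget) {
  // Copy before sorting so the caller's array is left untouched.
  const ranked = [...candidates].sort(
    (a, b) => LAYER_ORDER.indexOf(a.layer) - LAYER_ORDER.indexOf(b.layer)
  );

  const selected = [];
  let used = 0;
  for (const mod of ranked) {
    if (selected.length >= MAX_MODULES) break; // hard cap per dispatch
    if (used + mod.tokens > tokenBudget) continue; // skip modules that would exceed the budget
    selected.push(mod.id);
    used += mod.tokens;
  }
  return { selected, used };
}

const picked = selectModules(
  [
    { id: 'concern-security', layer: 'concerns', tokens: 3000 },
    { id: 'core-rules', layer: 'always', tokens: 2000 },
    { id: 'stack-node', layer: 'stacks', tokens: 4000 },
  ],
  6000
);
console.log(picked.selected); // [ 'core-rules', 'stack-node' ]
```

With a 6000-token budget the two higher-priority modules fill the budget exactly, so the `concerns` module is skipped rather than truncated.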
@@ -169,17 +169,17 @@ Run `wazir capture usage` at the end of a session to see the savings:
 
 **10 canonical role contracts.** Clarifier, researcher, specifier, content-author, designer, planner, executor, verifier, reviewer, learner. Each has enforceable inputs, outputs, and escalation rules. [Roles reference](docs/reference/roles-reference.md)
 
-**Adversarial review at three chokepoints.** Spec-challenge, plan-review, and final review run by the reviewer role, never the phase author. Nine hard approval gates span the
+**Adversarial review at three chokepoints.** Spec-challenge, plan-review, and final review run by the reviewer role, never the phase author. Nine hard approval gates span the 15-workflow pipeline. Nothing advances without explicit clearance. [Architecture](docs/concepts/architecture.md)
 
-**
+**315 curated expertise modules across 12 domains.** Loaded selectively per role per phase via a 4-layer composition engine. Max 15 modules per dispatch, token budget enforced. Wazir ships with 315. Yours could be next. [Expertise index](docs/reference/expertise-index.md)
 
 **Three-tier recall for token savings.** L0 (~~100 tokens), L1 (~~500-2k tokens), direct read for full source. Symbol-first exploration searches the index before reading source. Capture routing redirects large tool output to files. Result: 60-80% token reduction on exploration-heavy phases, measured per-session by `wazir capture usage`. [Indexing and Recall](docs/concepts/indexing-and-recall.md)
 
 **Structured learning.** Proposed learnings require explicit review and scope tagging before promotion. Only learnings whose file patterns overlap the current task get injected into context. The system improves per-project without drifting.
 
-**
+**8 hook contracts for structural guardrails.** These enforce protected path writes (exit 42), loop caps (exit 43), and session observability. [Hooks](docs/reference/hooks.md)
 
-**
+**28 callable skills.** `/wazir` runs the full pipeline. `/wazir audit security` runs a codebase audit. `/wazir prd` generates a product requirements document from completed runs. Plus TDD, verification, debugging, and more -- each enforcing an exact procedure with evidence at every step. [Skills](docs/reference/skills.md)
 
 **Built-in text humanization.** The composition engine loads domain-specific language rules per role: code rules for the executor (commit messages, comments), content rules for the content-author (microcopy, glossary), and technical-docs rules for the specifier, planner, reviewer, and learner. A 61-item vocabulary blacklist, 24-pattern sentence taxonomy, and two-pass self-audit checklist keep all output sounding like it was written by a person.
 
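Reviewer note on the hook contracts in the hunk above: the README associates exit code 42 with protected path writes and exit code 43 with loop caps. The sketch below shows how a guard might map a proposed tool call to one of those codes. The payload fields, the protected prefixes, and the order of the checks are assumptions; only the two exit codes come from the README.

```javascript
// Sketch under stated assumptions: payload shape, protected prefixes, and
// check order are hypothetical. Exit codes 42 and 43 are from the README.
const EXIT_OK = 0;
const EXIT_PROTECTED_PATH = 42;
const EXIT_LOOP_CAP = 43;

function guardExitCode({ writePath, protectedPrefixes, loopCount, loopCap }) {
  // Block writes that land under any protected prefix.
  if (writePath && protectedPrefixes.some((prefix) => writePath.startsWith(prefix))) {
    return EXIT_PROTECTED_PATH;
  }
  // Block further iterations once the loop cap is exceeded.
  if (loopCount > loopCap) {
    return EXIT_LOOP_CAP;
  }
  return EXIT_OK;
}

const code = guardExitCode({
  writePath: 'hooks/hooks.json',
  protectedPrefixes: ['hooks/', 'schemas/'],
  loopCount: 1,
  loopCap: 3,
});
console.log(code); // 42
// A hook script could then finish with: process.exitCode = code;
```

Returning the code from a pure function, and setting `process.exitCode` only at the edge of the script, keeps the decision logic testable without spawning a process.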
@@ -189,19 +189,19 @@ Run `wazir capture usage` at the end of a session to see the savings:
 
 ## Compared to Other Tools
 
-The AI coding tool space is fragmenting. Developers bolt together separate plugins for workflow management, specification, memory, output compression, and orchestration. Not every project needs
+The AI coding tool space is fragmenting. Developers bolt together separate plugins for workflow management, specification, memory, output compression, and orchestration. Not every project needs 15 workflows. For a weekend hack, prompting is fine. For production, you want structure.
 
 
 | Dimension | Wazir | [Superpowers](https://github.com/obra/superpowers) | [Spec-Kit](https://github.com/github/spec-kit) | [Micro-Agent](https://github.com/BuilderIO/micro-agent) | [Distill](https://github.com/samuelfaj/distill) | [Claude-Mem](https://github.com/thedotmack/claude-mem) | [OMC](https://github.com/yeachan-heo/oh-my-claudecode) |
 | ---------------------- | ----------------------------- | -------------------------------------------------- | ---------------------------------------------- | ------------------------------------------------------- | ----------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------ |
 | **Category** | Engineering OS | Skills framework | Spec toolkit | Code gen agent | Output compressor | Memory plugin | Orchestration layer |
-| **Scope** | Full lifecycle (
+| **Scope** | Full lifecycle (15 workflows) | Dev workflow (~20 skills) | Specify / Plan / Implement | Single-file TDD loop | CLI output compression | Session memory | Multi-agent orchestration |
 | **Enforced roles** | 10 canonical, contractual | None (skills only) | None | None | None | None | 32 agents (behavioral) |
-| **Phase model** |
+| **Phase model** | 15 explicit, artifact-gated | 7-step (advisory) | 3-step | 1 (generate/test) | N/A | N/A | 5-step pipeline |
 | **Adversarial review** | 3 gate phases | Code review skill | No | No | No | No | team-verify step |
 | **Context management** | L0/L1 tiered recall | None | None | None | LLM compression | Vector DB (ChromaDB) | Token routing |
 | **Schema validation** | 19 JSON schemas | No | No | No | No | No | No |
-| **Guardrails** |
+| **Guardrails** | 8 hook contracts | None | None | None | None | 5 hooks (memory) | Agent tracking |
 | **External deps** | None (host-native) | None (prompt-only) | Python CLI | Node.js CLI | Node.js + LLM | ChromaDB, SQLite, Bun | tmux, exp. teams API |
 | **Host support** | Claude, Codex, Gemini, Cursor | Claude, Codex, Gemini, Cursor, OpenCode | Claude, Copilot, Gemini | Any LLM provider | Any LLM | Claude Code only | Claude Code (+ workers) |
 
@@ -264,8 +264,8 @@ The pipeline, roles, and expertise modules are stable and used in production by
|
|
|
264
264
|
|
|
265
265
|
What's solid:
|
|
266
266
|
|
|
267
|
-
- The
|
|
268
|
-
-
|
|
267
|
+
- The 15-workflow pipeline and 10 role contracts
|
|
268
|
+
- 315 expertise modules across 12 domains
|
|
269
269
|
- Host exports for Claude, Codex, Gemini, and Cursor
|
|
270
270
|
- The composition engine and tiered recall system
|
|
271
271
|
|
package/assets/demo.cast
ADDED
|
@@ -0,0 +1,47 @@
|
|
|
1
|
+
{"version":3,"term":{"cols":387,"rows":85,"type":"xterm-256color","version":"Warp(v0.2026.03.04.08.20.stable_03)"},"timestamp":1773955554,"command":"bash assets/demo-script.sh","env":{"SHELL":"/bin/zsh"}}
|
|
2
|
+
[0.008, "o", "$ wazir doctor\r\n"]
|
|
3
|
+
[0.315, "o", "\u001b[1G"]
|
|
4
|
+
[0.000, "o", "\u001b[0K⠙"]
|
|
5
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠹"]
|
|
6
|
+
[0.080, "o", "\u001b[1G\u001b[0K⠸"]
|
|
7
|
+
[0.082, "o", "\u001b[1G\u001b[0K⠼"]
|
|
8
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠴"]
|
|
9
|
+
[0.080, "o", "\u001b[1G\u001b[0K⠦"]
|
|
10
|
+
[0.082, "o", "\u001b[1G\u001b[0K⠧"]
|
|
11
|
+
[0.082, "o", "\u001b[1G\u001b[0K⠇"]
|
|
12
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠏"]
|
|
13
|
+
[0.080, "o", "\u001b[1G\u001b[0K⠋"]
|
|
14
|
+
[0.080, "o", "\u001b[1G\u001b[0K⠙"]
|
|
15
|
+
[0.025, "o", "\u001b[1G\u001b[0K"]
|
|
16
|
+
[0.187, "o", "(node:20008) ExperimentalWarning: SQLite is an experimental feature and might change at any time\r\n(Use `node --trace-warnings ...` to show where the warning was created)\r\n"]
|
|
17
|
+
[0.074, "o", "PASS manifest: Manifest is valid.\r\nPASS hooks: Hook definitions are valid.\r\nPASS state-root: /Users/mohamedabdallah/.wazir/projects/wazir stays outside the project root\r\nPASS host-exports: All required host export directories exist.\r\n"]
|
|
18
|
+
[0.004, "o", "\u001b[1G\u001b[0K⠙"]
|
|
19
|
+
[0.001, "o", "\u001b[1G\u001b[0K"]
|
|
20
|
+
[2.010, "o", "\r\n$ wazir export build\r\n"]
|
|
21
|
+
[0.287, "o", "\u001b[1G"]
|
|
22
|
+
[0.000, "o", "\u001b[0K⠙"]
|
|
23
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠹"]
|
|
24
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠸"]
|
|
25
|
+
[0.080, "o", "\u001b[1G\u001b[0K⠼"]
|
|
26
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠴"]
|
|
27
|
+
[0.082, "o", "\u001b[1G\u001b[0K⠦"]
|
|
28
|
+
[0.012, "o", "\u001b[1G\u001b[0K"]
|
|
29
|
+
[0.164, "o", "(node:20085) ExperimentalWarning: SQLite is an experimental feature and might change at any time\r\n(Use `node --trace-warnings ...` to show where the warning was created)\r\n"]
|
|
30
|
+
[0.070, "o", "Generated host exports for claude, codex, gemini, cursor.\r\n"]
|
|
31
|
+
[0.004, "o", "\u001b[1G\u001b[0K⠙"]
|
|
32
|
+
[0.000, "o", "\u001b[1G\u001b[0K"]
|
|
33
|
+
[2.013, "o", "\r\n$ wazir index build\r\n"]
|
|
34
|
+
[0.304, "o", "\u001b[1G"]
|
|
35
|
+
[0.000, "o", "\u001b[0K⠙"]
|
|
36
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠹"]
|
|
37
|
+
[0.080, "o", "\u001b[1G\u001b[0K⠸"]
|
|
38
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠼"]
|
|
39
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠴"]
|
|
40
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠦"]
|
|
41
|
+
[0.081, "o", "\u001b[1G\u001b[0K⠧"]
|
|
42
|
+
[0.018, "o", "\u001b[1G\u001b[0K"]
|
|
43
|
+
[0.189, "o", "(node:20186) ExperimentalWarning: SQLite is an experimental feature and might change at any time\r\n(Use `node --trace-warnings ...` to show where the warning was created)\r\n"]
|
|
44
|
+
[1.042, "o", "Indexed 889 files, 7493 symbols, and 26395 outlines.\r\n"]
|
|
45
|
+
[0.005, "o", "\u001b[1G\u001b[0K⠙"]
|
|
46
|
+
[0.001, "o", "\u001b[1G\u001b[0K"]
|
|
47
|
+
[1.014, "x", "0"]
|
package/assets/demo.gif
ADDED
|
Binary file
|
package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md
ADDED
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
# AP-23: Selectively Skipping Enabled Workflows Within a Phase
|
|
2
|
+
|
|
3
|
+
## Pattern
|
|
4
|
+
|
|
5
|
+
An agent completes a phase but skips one or more enabled workflows. The run proceeds to completion without the skipped workflow's output. No error is raised because the phase gate only checks artifacts from explicitly required predecessors, not workflow-level completeness.
|
|
6
|
+
|
|
7
|
+
## Example
|
|
8
|
+
|
|
9
|
+
The final review phase has three workflows: `review`, `learn`, `prepare_next`. The agent completes `review` and presents the verdict, but skips `learn` and `prepare_next`. The run is marked complete. No learnings are captured, no handoff document is produced.
|
|
10
|
+
|
|
11
|
+
## Harm
|
|
12
|
+
|
|
13
|
+
- Learnings from the run are lost — the same mistakes repeat in future runs
|
|
14
|
+
- Handoff documents are missing — the next session starts without context
|
|
15
|
+
- Verification evidence is incomplete — claims cannot be audited
|
|
16
|
+
- The user believes the pipeline ran fully when it did not
|
|
17
|
+
|
|
18
|
+
## Detection
|
|
19
|
+
|
|
20
|
+
`validateRunCompletion(runDir, manifestPath)` in `tooling/src/guards/phase-prerequisite-guard.js` checks that every workflow declared in `wazir.manifest.yaml` has a `phase_exit` event with `status: completed` in the run's `events.ndjson`.
|
|
21
|
+
|
|
22
|
+
`wazir capture summary --complete` calls this check and refuses to finalize the run if any enabled workflow was skipped.
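The completeness check described above can be sketched as follows. This is a minimal illustration, not the code in `tooling/src/guards/phase-prerequisite-guard.js`: the function name `findIncompleteWorkflows` is hypothetical, the list of enabled workflows is passed in directly instead of being read from `wazir.manifest.yaml`, and the event shape (`event`, `phase`, `status` fields) is an assumption inferred from the `wazir capture event` flags.

```javascript
// Hypothetical sketch of a run-completion check; the real guard may differ.
// Assumed event shape, one JSON object per line of events.ndjson:
//   { "event": "phase_exit", "phase": "<workflow>", "status": "completed" }
function findIncompleteWorkflows(enabledWorkflows, eventsNdjson) {
  const completed = new Set();
  for (const line of eventsNdjson.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    let event;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed lines rather than abort the whole check
    }
    if (event && event.event === "phase_exit" && event.status === "completed") {
      completed.add(event.phase);
    }
  }
  // Every enabled workflow without a completed phase_exit counts as skipped.
  return enabledWorkflows.filter((name) => !completed.has(name));
}

// Example mirroring the final review phase: only `review` exited cleanly.
const events = [
  JSON.stringify({ event: "phase_exit", phase: "review", status: "completed" }),
  JSON.stringify({ event: "phase_enter", phase: "learn", status: "started" }),
].join("\n");

const missing = findIncompleteWorkflows(["review", "learn", "prepare_next"], events);
console.log(missing);
```

A non-empty result is the condition under which a finalizing command would refuse to mark the run complete.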
|
|
23
|
+
|
|
24
|
+
## Fix
|
|
25
|
+
|
|
26
|
+
1. Always emit `phase_exit` events for every workflow: `wazir capture event --run <id> --event phase_exit --phase <workflow> --status completed`
|
|
27
|
+
2. Use `wazir capture summary --complete` instead of bare `wazir capture summary` at run end
|
|
28
|
+
3. The wazir pipeline skill checks completion before presenting final results
|
package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md
ADDED
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
# AP-24: Clarifier Making Scope Decisions Without Asking
|
|
2
|
+
|
|
3
|
+
## Pattern
|
|
4
|
+
|
|
5
|
+
The clarifier autonomously decides that certain items are "out of scope" without asking the user. This typically happens when the input doesn't explicitly mention something (e.g., documentation, i18n, testing strategy), and the clarifier assumes silence means exclusion.
|
|
6
|
+
|
|
7
|
+
## Example
|
|
8
|
+
|
|
9
|
+
User input: "Build a user authentication system with OAuth2."
|
|
10
|
+
|
|
11
|
+
Clarifier produces: "Out of scope: documentation, i18n, rate limiting, password recovery."
|
|
12
|
+
|
|
13
|
+
The user never agreed to exclude any of these. The clarifier decided unilaterally.
|
|
14
|
+
|
|
15
|
+
## Harm
|
|
16
|
+
|
|
17
|
+
- Items the user wanted are silently dropped
|
|
18
|
+
- The user sees the final output and assumes the pipeline covered everything
|
|
19
|
+
- 21 input items become 5 tasks because the clarifier excluded 16 without asking
|
|
20
|
+
- Trust in the pipeline erodes when users discover missing features after delivery
|
|
21
|
+
|
|
22
|
+
## Detection
|
|
23
|
+
|
|
24
|
+
- Clarification document contains "out of scope" items that were never discussed with the user
|
|
25
|
+
- Plan has fewer tasks than distinct items in the original input
|
|
26
|
+
- Scope coverage guard (`evaluateScopeCoverageGuard`) flags plan < input items
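The coverage comparison behind the last detection signal can be sketched roughly as below. This is an assumption-laden illustration, not the actual `evaluateScopeCoverageGuard`: the function name `checkInputCoverage`, the `covers` field on each task, and the shape of the returned finding are all hypothetical.

```javascript
// Hypothetical sketch of an input-coverage check; the real guard may differ.
// Assumption: each plan task lists the input items it addresses in `covers`.
function checkInputCoverage(inputItems, tasks) {
  const covered = new Set(tasks.flatMap((task) => task.covers ?? []));
  const missing = inputItems.filter((item) => !covered.has(item));
  if (missing.length === 0) {
    return { ok: true, missing: [] };
  }
  // Uncovered input items are reported as a HIGH finding that names them.
  return { ok: false, severity: "HIGH", missing };
}

// Example: three items in the original input, one task in the plan.
const inputItems = ["oauth2-login", "password-recovery", "rate-limiting"];
const tasks = [{ id: "T1", covers: ["oauth2-login"] }];

const finding = checkInputCoverage(inputItems, tasks);
console.log(finding);
```

Matching on item identity rather than on raw counts means a plan with many tasks still fails when one input item has no task mapped to it.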
|
|
27
|
+
|
|
28
|
+
## Fix
|
|
29
|
+
|
|
30
|
+
1. Research runs FIRST — the clarifier must have context before asking questions
|
|
31
|
+
2. After research, ask INFORMED questions in batches of 3-7
|
|
32
|
+
3. Every scope exclusion must reference an explicit user confirmation
|
|
33
|
+
4. If the input is clear, zero questions is fine — but the clarifier must state "no ambiguities detected" rather than silently proceeding
|
|
34
|
+
5. The clarification document must cite user responses for every scope boundary decision
|
package/docs/concepts/architecture.md
CHANGED
|
@@ -10,7 +10,7 @@ Wazir is a host-native engineering OS kit. The host environment (Claude, Codex,
|
|
|
10
10
|
| Workflows | Phase entrypoints that sequence roles through delivery |
|
|
11
11
|
| Skills | Reusable procedures (wz:tdd, wz:debugging, wz:verification, wz:brainstorming) |
|
|
12
12
|
| Hooks | Guardrails enforcing protected paths, loop caps, and capture routing |
|
|
13
|
-
| Expertise |
|
|
13
|
+
| Expertise | 315 curated knowledge modules composed into agent prompts |
|
|
14
14
|
| Templates | Artifact templates for phase outputs and handoff |
|
|
15
15
|
| Schemas | Validation schemas for manifest, hooks, artifacts, and exports |
|
|
16
16
|
| Exports | Generated host packages tailored per supported host |
|
package/docs/concepts/why-wazir.md
CHANGED
|
@@ -24,7 +24,7 @@ The agent audits its own work in an isolated git worktree. Validates, finds stru
|
|
|
24
24
|
|
|
25
25
|
## 6. Composer
|
|
26
26
|
|
|
27
|
-
|
|
27
|
+
315 curated expertise modules across 12 domains. The composition engine assembles task-specific agents by loading the right expertise for each role, stack, and concern. The executor building a Flutter RTL app gets Flutter patterns, RTL layout rules, and mobile antipatterns composed into its context. The reviewer gets the corresponding antipattern catalog. Every dispatched agent is a specialist, not a generalist pretending.
|
|
28
28
|
|
|
29
29
|
## 7. Review Loops
|
|
30
30
|
|
package/docs/readmes/INDEX.md
CHANGED
|
@@ -94,7 +94,7 @@
|
|
|
94
94
|
### Other Features
|
|
95
95
|
| File | Description |
|
|
96
96
|
|------|-------------|
|
|
97
|
-
| [expertise/README.md](features/expertise/README.md) | Expertise system —
|
|
97
|
+
| [expertise/README.md](features/expertise/README.md) | Expertise system — 315 modules across 12 domains |
|
|
98
98
|
| [schemas/README.md](features/schemas/README.md) | Schema system — 19 JSON schemas for artifact validation |
|
|
99
99
|
| [tooling/README.md](features/tooling/README.md) | CLI tooling — all commands with options and examples |
|
|
100
100
|
| [exports/README.md](features/exports/README.md) | Host exports — Claude, Codex, Gemini, Cursor packages |
|
package/docs/readmes/features/expertise/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Expertise System
|
|
2
2
|
|
|
3
|
-
Wazir's expertise system is a curated library of **
|
|
3
|
+
Wazir's expertise system is a curated library of **315 knowledge modules** spanning
|
|
4
4
|
architecture, security, performance, design, and more. Modules are loaded selectively into
|
|
5
5
|
agent prompts — giving the right knowledge to the right role at the right phase — without
|
|
6
6
|
flooding context with irrelevant content.
|
package/docs/readmes/features/hooks/pre-compact-summary.md
CHANGED
|
@@ -64,7 +64,7 @@ This hook does not block. On failure:
|
|
|
64
64
|
|
|
65
65
|
## Key Findings
|
|
66
66
|
- All 127 tests passing after rate-limit middleware implementation
|
|
67
|
-
- Hook validation: all
|
|
67
|
+
- Hook validation: all 8 hooks validated
|
|
68
68
|
|
|
69
69
|
## Captures
|
|
70
70
|
- captures/npm-test-001.txt — full test run output (47KB)
|
package/docs/reference/hooks.md
CHANGED
|
@@ -15,6 +15,7 @@ These hook definitions are product contracts first. Host-specific native hooks o
|
|
|
15
15
|
| `stop_handoff_harvest` | Persist final handoff and stop-time observability data | capture |
|
|
16
16
|
| `protected_path_write_guard` | Block writes to protected canonical paths outside approved flows | block |
|
|
17
17
|
| `loop_cap_guard` | Block extra iterations after the configured loop cap | block |
|
|
18
|
+
| `context_mode_router` | Route large command output through context-mode tools to avoid flooding model context | warn |
|
|
18
19
|
|
|
19
20
|
## Source of truth
|
|
20
21
|
|
package/docs/reference/launch-checklist.md
CHANGED
|
@@ -26,7 +26,7 @@ Submit pull requests to these curated lists (one PR per list, follow each repo's
|
|
|
26
26
|
### awesome-claude-code
|
|
27
27
|
- **Repo:** `github.com/anthropics/awesome-claude-code` (or the most-starred community fork)
|
|
28
28
|
- **Section:** Tools / Plugins / Extensions
|
|
29
|
-
- **Entry format:** `[Wazir](https://github.com/MohamedAbdallah-14/Wazir) - Host-native engineering OS kit with 10 roles, 4 phases (15 workflows), and
|
|
29
|
+
- **Entry format:** `[Wazir](https://github.com/MohamedAbdallah-14/Wazir) - Host-native engineering OS kit with 10 roles, 4 phases (15 workflows), and 315 expertise modules.`
|
|
30
30
|
- **Tips:** Keep the description under 120 characters. Link directly to the repo.
|
|
31
31
|
|
|
32
32
|
### awesome-ai-agents
|
|
@@ -56,7 +56,7 @@ Show HN: Wazir – Engineering OS kit for AI coding agents (Claude, Codex, Gemin
|
|
|
56
56
|
### First comment
|
|
57
57
|
Post a comment immediately after submission explaining:
|
|
58
58
|
1. What problem Wazir solves (AI agents lack structured engineering workflows)
|
|
59
|
-
2. How it works (10 canonical roles,
|
|
59
|
+
2. How it works (10 canonical roles, 15-workflow pipeline, 315 expertise modules)
|
|
60
60
|
3. What makes it different (host-native, works across Claude/Codex/Gemini/Cursor)
|
|
61
61
|
4. Quick install: `npx @wazir-dev/cli init`
|
|
62
62
|
5. Invite feedback -- HN readers appreciate genuine requests for input
|
|
@@ -100,7 +100,7 @@ Structure as a 5-7 tweet thread:
|
|
|
100
100
|
|
|
101
101
|
1. **Hook tweet:** One-liner about the problem + link to repo.
|
|
102
102
|
2. **What it is:** Brief description of Wazir.
|
|
103
|
-
3. **Architecture:** 10 roles, 4 phases (15 workflows),
|
|
103
|
+
3. **Architecture:** 10 roles, 4 phases (15 workflows), 315 modules (include a diagram image).
|
|
104
104
|
4. **Demo:** Short GIF or screenshot of a workflow in action.
|
|
105
105
|
5. **Multi-host:** Works with Claude, Codex, Gemini, and Cursor.
|
|
106
106
|
6. **Install:** `npx @wazir-dev/cli init`
|
package/docs/reference/review-loop-pattern.md
CHANGED
|
@@ -293,7 +293,7 @@ Matches canonical `workflows/design-review.md`:
|
|
|
293
293
|
4. **Visual consistency** -- design tokens form a coherent system, dark/light mode alignment
|
|
294
294
|
5. **Exported-code fidelity** -- do exported scaffolds match the designs? Mismatches are failures here, not implementation concerns.
|
|
295
295
|
|
|
296
|
-
### Plan Dimensions (
|
|
296
|
+
### Plan Dimensions (8)
|
|
297
297
|
|
|
298
298
|
1. **Completeness** -- all design decisions mapped to tasks
|
|
299
299
|
2. **Ordering** -- dependencies correct, parallelizable identified
|
|
@@ -302,6 +302,7 @@ Matches canonical `workflows/design-review.md`:
|
|
|
302
302
|
5. **Edge cases** -- error paths covered
|
|
303
303
|
6. **Security** -- auth, injection, data exposure
|
|
304
304
|
7. **Integration** -- tasks connect end-to-end
|
|
305
|
+
8. **Input Coverage** -- every distinct item in the original input maps to at least one task. If `tasks < input items`, HIGH finding listing missing items
|
|
305
306
|
|
|
306
307
|
### Task Execution Dimensions (5)
|
|
307
308
|
|
|
@@ -499,7 +500,7 @@ These are the fixed rubrics — no ad-hoc dimension selection:
|
|
|
499
500
|
| research-review | Coverage, Source quality, Relevance, Gaps identified, Actionability |
|
|
500
501
|
| clarification-review / spec-challenge | Completeness, Testability, Ambiguity, Assumptions, Scope creep |
|
|
501
502
|
| design-review | Spec coverage, Design-spec consistency, Accessibility, Visual consistency, Exported-code fidelity |
|
|
502
|
-
| plan-review | Completeness, Testability, Task granularity, Dependency correctness, Phase structure, File coverage, Estimation accuracy |
|
|
503
|
+
| plan-review | Completeness, Testability, Task granularity, Dependency correctness, Phase structure, File coverage, Estimation accuracy, Input coverage |
|
|
503
504
|
| task-review | Correctness, Tests, Wiring, Drift, Quality |
|
|
504
505
|
| final | Correctness, Completeness, Wiring, Verification, Drift, Quality, Documentation |
|
|
505
506
|
|
package/docs/reference/skill-tiers.md
CHANGED
|
@@ -13,7 +13,7 @@ Each skill is classified into one of three tiers:
|
|
|
13
13
|
|
|
14
14
|
| Wazir Skill | Superpowers Equivalent | Tier | Rationale | Risk Notes |
|
|
15
15
|
|---|---|---|---|---|
|
|
16
|
-
| brainstorming | brainstorming | **Own** | Structurally rewritten. Superpowers version is a linear checklist (explore context, ask questions, propose approaches, present design, write doc, invoke writing-plans). Wazir replaces the entire process: adds Command Routing and Codebase Exploration preambles, replaces the design-doc step with a design-review loop (`--mode design-review` with canonical dimensions), outputs to `.wazir/runs/latest/clarified/design.md` instead of `docs/plans
|
|
16
|
+
| brainstorming | brainstorming | **Own** | Structurally rewritten. Superpowers version is a linear checklist (explore context, ask questions, propose approaches, present design, write doc, invoke writing-plans). Wazir replaces the entire process: adds Command Routing and Codebase Exploration preambles, replaces the design-doc step with a design-review loop (`--mode design-review` with canonical dimensions), and outputs to `.wazir/runs/latest/clarified/design.md` instead of `docs/plans/`. None of the superpowers process steps survive intact. | -- |
|
|
17
17
|
| clarifier | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
18
18
|
| debugging | systematic-debugging | **Own** | Structurally rewritten. Superpowers has a 4-phase process (Root Cause Investigation with 5 substeps, Pattern Analysis, Hypothesis and Testing, Implementation) totaling ~300 lines with detailed examples, rationalization tables, and supporting technique references. Wazir condenses this to a 4-step observe-hypothesize-test-fix loop (~75 lines), replaces all codebase exploration with Wazir CLI symbol-first exploration (`wazir index search-symbols`, `wazir recall symbol` and `wazir recall file`), adds loop cap awareness (pipeline mode with `wazir capture loop-check` vs. standalone mode), and removes all superpowers examples, rationalization tables, and red-flag lists. The methodology is fundamentally different in structure despite sharing the spirit of "root cause first." | Delegating would lose Wazir CLI integration and loop cap awareness. Superpowers version is far more detailed on anti-patterns and may be worth referencing separately. |
|
|
19
19
|
| design | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
@@ -140,7 +140,7 @@ If superpowers or Claude Code introduces a composition/layering mechanism in the
|
|
|
140
140
|
|
|
141
141
|
2. **Augment tier is not implementable.** R2 validation (2026-03-19) found that skill shadowing in both superpowers `skills-core.js` and Claude Code's native Skill tool is full-override: placing a SKILL.md in `~/.claude/skills/<name>/` completely replaces the superpowers skill with the same name. There is no merge or append mechanism. The three former Augment candidates (dispatching-parallel-agents, finishing-a-development-branch, using-git-worktrees) have been reclassified to Own. See [Augment Mechanism](#augment-mechanism) for full analysis.
|
|
142
142
|
|
|
143
|
-
3. **All 14 forked skills are Own** because either (a) they introduce structural process changes (review loops, pipeline mode, Codex integration,
|
|
143
|
+
3. **All 14 forked skills are Own** because either (a) they introduce structural process changes (review loops, pipeline mode, Codex integration, content restructuring) or (b) the Augment composition mechanism does not exist in the platform.
|
|
144
144
|
|
|
145
145
|
4. **Token cost tradeoff is significant.** Several Wazir Own skills (tdd, verification, debugging, writing-skills) are dramatically shorter than their superpowers counterparts. The superpowers versions contain valuable rationalization prevention tables, detailed examples, and anti-pattern catalogs that enforce discipline. The Wazir versions trade this for token efficiency. This tradeoff should be revisited -- some of the removed discipline content may be worth recovering as separate reference files.
|
|
146
146
|
|