@wazir-dev/cli 1.1.0 → 1.3.0
- package/CHANGELOG.md +74 -10
- package/README.md +15 -15
- package/assets/demo.cast +47 -0
- package/assets/demo.gif +0 -0
- package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
- package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
- package/docs/concepts/architecture.md +1 -1
- package/docs/concepts/roles-and-workflows.md +2 -0
- package/docs/concepts/why-wazir.md +59 -0
- package/docs/decisions/2026-03-19-deferred-items.md +564 -0
- package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
- package/docs/readmes/INDEX.md +21 -5
- package/docs/readmes/features/expertise/README.md +2 -2
- package/docs/readmes/features/exports/README.md +2 -2
- package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
- package/docs/readmes/features/schemas/README.md +3 -0
- package/docs/readmes/features/skills/README.md +17 -0
- package/docs/readmes/features/skills/clarifier.md +5 -0
- package/docs/readmes/features/skills/claude-cli.md +5 -0
- package/docs/readmes/features/skills/codex-cli.md +5 -0
- package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
- package/docs/readmes/features/skills/executing-plans.md +5 -0
- package/docs/readmes/features/skills/executor.md +5 -0
- package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
- package/docs/readmes/features/skills/gemini-cli.md +5 -0
- package/docs/readmes/features/skills/humanize.md +5 -0
- package/docs/readmes/features/skills/init-pipeline.md +5 -0
- package/docs/readmes/features/skills/receiving-code-review.md +5 -0
- package/docs/readmes/features/skills/requesting-code-review.md +5 -0
- package/docs/readmes/features/skills/reviewer.md +5 -0
- package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
- package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
- package/docs/readmes/features/skills/wazir.md +5 -0
- package/docs/readmes/features/skills/writing-skills.md +5 -0
- package/docs/readmes/features/workflows/prepare-next.md +1 -1
- package/docs/reference/configuration-reference.md +47 -6
- package/docs/reference/hooks.md +1 -0
- package/docs/reference/launch-checklist.md +4 -4
- package/docs/reference/review-loop-pattern.md +119 -9
- package/docs/reference/roles-reference.md +1 -0
- package/docs/reference/skill-tiers.md +147 -0
- package/docs/reference/tooling-cli.md +3 -1
- package/docs/truth-claims.yaml +12 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
- package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
- package/exports/hosts/claude/.claude/commands/verify.md +30 -1
- package/exports/hosts/claude/.claude/settings.json +9 -0
- package/exports/hosts/claude/CLAUDE.md +1 -1
- package/exports/hosts/claude/export.manifest.json +6 -4
- package/exports/hosts/claude/host-package.json +3 -1
- package/exports/hosts/codex/AGENTS.md +1 -1
- package/exports/hosts/codex/export.manifest.json +6 -4
- package/exports/hosts/codex/host-package.json +3 -1
- package/exports/hosts/cursor/.cursor/hooks.json +4 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
- package/exports/hosts/cursor/export.manifest.json +6 -4
- package/exports/hosts/cursor/host-package.json +3 -1
- package/exports/hosts/gemini/GEMINI.md +1 -1
- package/exports/hosts/gemini/export.manifest.json +6 -4
- package/exports/hosts/gemini/host-package.json +3 -1
- package/hooks/context-mode-router +191 -0
- package/hooks/definitions/context_mode_router.yaml +19 -0
- package/hooks/hooks.json +31 -6
- package/hooks/protected-path-write-guard +8 -0
- package/hooks/routing-matrix.json +45 -0
- package/hooks/session-start +62 -1
- package/llms-full.txt +937 -134
- package/package.json +2 -4
- package/schemas/hook.schema.json +2 -1
- package/schemas/phase-report.schema.json +89 -0
- package/schemas/usage.schema.json +25 -1
- package/schemas/wazir-manifest.schema.json +19 -0
- package/skills/brainstorming/SKILL.md +32 -157
- package/skills/clarifier/SKILL.md +289 -111
- package/skills/claude-cli/SKILL.md +320 -0
- package/skills/codex-cli/SKILL.md +260 -0
- package/skills/debugging/SKILL.md +13 -0
- package/skills/design/SKILL.md +13 -0
- package/skills/dispatching-parallel-agents/SKILL.md +13 -0
- package/skills/executing-plans/SKILL.md +13 -0
- package/skills/executor/SKILL.md +139 -19
- package/skills/finishing-a-development-branch/SKILL.md +13 -0
- package/skills/gemini-cli/SKILL.md +260 -0
- package/skills/humanize/SKILL.md +13 -0
- package/skills/init-pipeline/SKILL.md +72 -164
- package/skills/prepare-next/SKILL.md +81 -10
- package/skills/receiving-code-review/SKILL.md +13 -0
- package/skills/requesting-code-review/SKILL.md +13 -0
- package/skills/reviewer/SKILL.md +369 -24
- package/skills/run-audit/SKILL.md +13 -0
- package/skills/scan-project/SKILL.md +13 -0
- package/skills/self-audit/SKILL.md +217 -16
- package/skills/skill-research/SKILL.md +188 -0
- package/skills/subagent-driven-development/SKILL.md +13 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
- package/skills/subagent-driven-development/implementer-prompt.md +8 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
- package/skills/tdd/SKILL.md +13 -0
- package/skills/using-git-worktrees/SKILL.md +13 -0
- package/skills/using-skills/SKILL.md +13 -0
- package/skills/verification/SKILL.md +54 -3
- package/skills/wazir/SKILL.md +464 -381
- package/skills/writing-plans/SKILL.md +14 -1
- package/skills/writing-skills/SKILL.md +13 -0
- package/templates/artifacts/implementation-plan.md +3 -0
- package/templates/artifacts/tasks-template.md +133 -0
- package/templates/examples/phase-report.example.json +48 -0
- package/tooling/src/adapters/composition-engine.js +256 -0
- package/tooling/src/adapters/model-router.js +84 -0
- package/tooling/src/capture/command.js +41 -2
- package/tooling/src/capture/run-config.js +3 -1
- package/tooling/src/capture/store.js +56 -0
- package/tooling/src/capture/usage.js +106 -0
- package/tooling/src/capture/user-input.js +66 -0
- package/tooling/src/checks/ac-matrix.js +256 -0
- package/tooling/src/checks/command-registry.js +12 -0
- package/tooling/src/checks/docs-truth.js +1 -1
- package/tooling/src/checks/security-sensitivity.js +69 -0
- package/tooling/src/checks/skills.js +111 -0
- package/tooling/src/cli.js +31 -20
- package/tooling/src/commands/stats.js +161 -0
- package/tooling/src/commands/validate.js +5 -1
- package/tooling/src/export/compiler.js +33 -37
- package/tooling/src/gating/agent.js +145 -0
- package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
- package/tooling/src/hooks/routing-logic.js +69 -0
- package/tooling/src/init/auto-detect.js +258 -0
- package/tooling/src/init/command.js +38 -170
- package/tooling/src/input/scanner.js +46 -0
- package/tooling/src/reports/command.js +103 -0
- package/tooling/src/reports/phase-report.js +323 -0
- package/tooling/src/state/command.js +160 -0
- package/tooling/src/state/db.js +287 -0
- package/tooling/src/status/command.js +58 -1
- package/tooling/src/verify/proof-collector.js +299 -0
- package/wazir.manifest.yaml +26 -14
- package/workflows/plan-review.md +3 -1
- package/workflows/verify.md +30 -1
package/llms-full.txt
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Wazir — Complete Documentation
|
|
2
2
|
|
|
3
|
-
> Generated: 2026-03-
|
|
3
|
+
> Generated: 2026-03-20T04:20:52Z
|
|
4
4
|
|
|
5
5
|
---
|
|
6
6
|
## Source: docs/concepts/architecture.md
|
|
@@ -17,7 +17,7 @@ Wazir is a host-native engineering OS kit. The host environment (Claude, Codex,
|
|
|
17
17
|
| Workflows | Phase entrypoints that sequence roles through delivery |
|
|
18
18
|
| Skills | Reusable procedures (wz:tdd, wz:debugging, wz:verification, wz:brainstorming) |
|
|
19
19
|
| Hooks | Guardrails enforcing protected paths, loop caps, and capture routing |
|
|
20
|
-
| Expertise |
|
|
20
|
+
| Expertise | 315 curated knowledge modules composed into agent prompts |
|
|
21
21
|
| Templates | Artifact templates for phase outputs and handoff |
|
|
22
22
|
| Schemas | Validation schemas for manifest, hooks, artifacts, and exports |
|
|
23
23
|
| Exports | Generated host packages tailored per supported host |
|
|
@@ -93,6 +93,7 @@ open-pencil is integrated as an optional adapter (`open_pencil`) — it is not r
|
|
|
93
93
|
- [Hooks](../reference/hooks.md)
|
|
94
94
|
- [Expertise & Antipatterns](composition-engine.md)
|
|
95
95
|
|
|
96
|
+
|
|
96
97
|
---
|
|
97
98
|
## Source: docs/concepts/artifact-model.md
|
|
98
99
|
|
|
@@ -157,6 +158,7 @@ That path must remain gitignored so a normal run still leaves `git status` clean
|
|
|
157
158
|
- archive stale or disproven learnings instead of rewriting history in place
|
|
158
159
|
- prune external run-state captures when they no longer provide audit or debugging value
|
|
159
160
|
|
|
161
|
+
|
|
160
162
|
---
|
|
161
163
|
## Source: docs/concepts/composition-engine.md
|
|
162
164
|
|
|
@@ -197,6 +199,7 @@ Brute-force loading of all expertise modules would flood the context window. The
|
|
|
197
199
|
|
|
198
200
|
For the complete module listing and anti-pattern catalog, see the [Expertise Index reference](../reference/expertise-index.md).
|
|
199
201
|
|
|
202
|
+
|
|
200
203
|
---
|
|
201
204
|
## Source: docs/concepts/indexing-and-recall.md
|
|
202
205
|
|
|
@@ -361,6 +364,7 @@ The optional `context-mode` adapter remains:
|
|
|
361
364
|
|
|
362
365
|
See the adapter docs for current status and constraints.
|
|
363
366
|
|
|
367
|
+
|
|
364
368
|
---
|
|
365
369
|
## Source: docs/concepts/observability.md
|
|
366
370
|
|
|
@@ -406,6 +410,7 @@ The capture command family writes under:
|
|
|
406
410
|
- captured output should reduce context flooding, not increase it
|
|
407
411
|
- summaries must point back to captured files instead of pretending the full output stayed in context
|
|
408
412
|
|
|
413
|
+
|
|
409
414
|
---
|
|
410
415
|
## Source: docs/concepts/roles-and-workflows.md
|
|
411
416
|
|
|
@@ -442,6 +447,8 @@ The canonical workflow sequence is:
|
|
|
442
447
|
13. **learn** — capture scoped learnings
|
|
443
448
|
14. **prepare-next** — produce a clean handoff for the next run
|
|
444
449
|
|
|
450
|
+
Additionally, **run-audit** is a standalone workflow that can be invoked outside the linear pipeline to perform structured codebase audits with source-backed findings.
|
|
451
|
+
|
|
445
452
|
## Role routing
|
|
446
453
|
|
|
447
454
|
The orchestrator dispatches three roles per task: `executor`, `reviewer`, and `verifier`. By default, all three run for every task. The `required_roles` field in a task's YAML frontmatter controls which roles are dispatched, allowing the orchestrator to skip unnecessary roles and save context window budget.
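The role-routing rule above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: the `roles_to_dispatch` function and the plain-dict frontmatter are assumptions, while the three role names, the `required_roles` field, and the `security_critical` rule come from the text.

```python
ALL_ROLES = ("executor", "reviewer", "verifier")


def roles_to_dispatch(frontmatter: dict) -> list:
    """Return the roles to dispatch for one task, in canonical order.

    `frontmatter` stands in for the task's parsed YAML frontmatter
    (hypothetical shape: a plain dict).
    """
    # Default from the docs: all three roles run when nothing is specified.
    requested = frontmatter.get("required_roles") or ALL_ROLES
    selected = set(requested)
    # From the docs: security-critical tasks always include the reviewer.
    if frontmatter.get("security_critical") is True:
        selected.add("reviewer")
    # Assumption: unknown role names are ignored rather than rejected.
    return [role for role in ALL_ROLES if role in selected]
```

For example, a task declaring `required_roles: [executor]` together with `security_critical: true` would dispatch the executor and the reviewer under this sketch.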
|
|
@@ -469,6 +476,7 @@ If `security_critical: true`, `reviewer` is always included.
|
|
|
469
476
|
|
|
470
477
|
Use the files under `roles/` and `workflows/` as the canonical source for role contracts and phase entrypoints. For exact role and workflow tables, see the [Roles Reference](../reference/roles-reference.md).
|
|
471
478
|
|
|
479
|
+
|
|
472
480
|
---
|
|
473
481
|
## Source: docs/concepts/terminology-policy.md
|
|
474
482
|
|
|
@@ -500,6 +508,71 @@ Do not use terms that describe Wazir as a background service, a web-based contro
|
|
|
500
508
|
- Use the canonical terms above in all roles, workflows, skills, and documentation.
|
|
501
509
|
- When in doubt, describe what Wazir is, not what it is not.
|
|
502
510
|
|
|
511
|
+
|
|
512
|
+
---
|
|
513
|
+
## Source: docs/concepts/why-wazir.md
|
|
514
|
+
|
|
515
|
+
# Why Wazir
|
|
516
|
+
|
|
517
|
+
What makes Wazir the best engineering OS you can add to an AI coding agent.
|
|
518
|
+
|
|
519
|
+
## 1. Measure Twice, Cut Once
|
|
520
|
+
|
|
521
|
+
Wazir clarifies before coding. The pipeline forces research, spec hardening, design review, and plan approval before a single line of implementation code is written. Most AI agents jump straight to code and fix mistakes after. Wazir prevents the mistakes.
|
|
522
|
+
|
|
523
|
+
## 2. Deep Research
|
|
524
|
+
|
|
525
|
+
Every AI agent knows how to research. Users don't ask them to. Wazir makes research a mandatory phase — the researcher role scans the codebase, fetches external sources, and produces a research brief before clarification begins. The agent starts informed, not guessing.
|
|
526
|
+
|
|
527
|
+
## 3. Clarifier + Task Planning
|
|
528
|
+
|
|
529
|
+
A structured clarification pipeline turns vague requests into measurable specs. Spec hardening catches ambiguity, missing constraints, and untestable acceptance criteria before they become bugs. Task planning produces execution-grade task specs — not TODO lists.
|
|
530
|
+
|
|
531
|
+
## 4. Content Author
|
|
532
|
+
|
|
533
|
+
A dedicated role for any content need — database seeding, sample content, test fixtures, translations, copy, email templates, notification text. Most AI agents treat content as an afterthought bolted onto code tasks. Wazir gives content its own phase with editorial standards, i18n awareness, and humanization rules.
|
|
534
|
+
|
|
535
|
+
## 5. Self-Audit
|
|
536
|
+
|
|
537
|
+
The agent audits its own work in an isolated git worktree: it validates, finds structural issues, fixes what it can, verifies the fixes, and merges only on all-green. The audit runs as a 5-loop cycle with convergence detection. Protected-path safety rails prevent the agent from modifying its own identity-defining files, which keeps self-improvement safe.
|
|
538
|
+
|
|
539
|
+
## 6. Composer
|
|
540
|
+
|
|
541
|
+
315 curated expertise modules across 12 domains. The composition engine assembles task-specific agents by loading the right expertise for each role, stack, and concern. The executor building a Flutter RTL app gets Flutter patterns, RTL layout rules, and mobile antipatterns composed into its context. The reviewer gets the corresponding antipattern catalog. Every dispatched agent is a specialist, not a generalist pretending.
|
|
542
|
+
|
|
543
|
+
## 7. Review Loops
|
|
544
|
+
|
|
545
|
+
Multi-pass adversarial review at every pipeline checkpoint — not a single rubber-stamp at the end. Research-review, clarification-review, spec-challenge, design-review, plan-review, per-task execution review, and final review. Each uses phase-specific dimensions. Findings are resolved before advancing. The reviewer is an adversary, not a cheerleader.
|
|
546
|
+
|
|
547
|
+
## 8. Continuous Learning
|
|
548
|
+
|
|
549
|
+
Wazir evolves from its own mistakes. Review findings, audit findings, and user corrections feed into a learning system. Recurring issues become accepted learnings injected into future runs. A drift budget prevents learned behavior from diverging too far from the original design. The agent that builds your 10th feature is better than the one that built your 1st.
|
|
550
|
+
|
|
551
|
+
## 9. Antipatterns
|
|
552
|
+
|
|
553
|
+
A first-class antipattern catalog loaded into reviewer context BEFORE domain expertise. Catches AI-specific failure modes: fake completion, unwired abstractions, shallow tests, security theater, architecture drift. The reviewer's first lens is "what could go wrong" — not "does this look right."
|
|
554
|
+
|
|
555
|
+
## 10. Multi-Host
|
|
556
|
+
|
|
557
|
+
One canonical source, four host exports. Wazir works on Claude Code, Codex, Gemini, and Cursor from a single `wazir export build`. Roles, workflows, skills, and expertise are written once and compiled into each host's native format. Switch hosts without rewriting your engineering process.
|
|
558
|
+
|
|
559
|
+
## 11. Context Efficiency
|
|
560
|
+
|
|
561
|
+
AI agents waste most of their context window on brute-force file reads and verbose command output. Wazir's routing hook auto-routes large commands through context-mode. The index provides symbol-first exploration — query first, read only what's needed. Capture routing redirects large output to files. Result: 60-80% token reduction on exploration-heavy phases. The agent thinks more, reads less.
|
|
562
|
+
|
|
563
|
+
## 12. Verification Before Completion
|
|
564
|
+
|
|
565
|
+
No success claims without evidence. The verify phase produces deterministic proof — test results, lint output, type-check results — not "I believe it works." Every completion claim is backed by a command that was actually run and output that was actually checked. Evidence before assertions, always.
|
|
566
|
+
|
|
567
|
+
## 13. Gating Agent
|
|
568
|
+
|
|
569
|
+
Autonomous phase transition decisions. After each phase, a gating agent reads the phase report and decides: continue (all gates pass), loop back (specific failures with fix paths), or escalate to human (ambiguous trade-offs, scope changes). Default posture: escalate. The pipeline doesn't blindly advance — it stops when it should stop.
|
|
570
|
+
|
|
571
|
+
## 14. Humanize
|
|
572
|
+
|
|
573
|
+
Anti-AI-writing patterns across all text output. A vocabulary blacklist, domain-specific rules, and a self-audit checklist ensure that specs, plans, code comments, commit messages, and documentation read like they were written by a human engineer — not generated by an LLM. Because AI-sounding output erodes trust.
|
|
574
|
+
|
|
575
|
+
|
|
503
576
|
---
|
|
504
577
|
## Source: docs/getting-started/01-installation.md
|
|
505
578
|
|
|
@@ -582,6 +655,7 @@ npx wazir doctor
|
|
|
582
655
|
|
|
583
656
|
[Your First Run](02-first-run.md) — walk through the full pipeline from brief to shipped code.
|
|
584
657
|
|
|
658
|
+
|
|
585
659
|
---
|
|
586
660
|
## Source: docs/getting-started/02-first-run.md
|
|
587
661
|
|
|
@@ -688,6 +762,7 @@ The 7 steps above map to 14 internal phases:
|
|
|
688
762
|
- [Roles & Workflows](../concepts/roles-and-workflows.md) — deep dive into role contracts
|
|
689
763
|
- [Composition Engine](../concepts/composition-engine.md) — how expertise modules are loaded
|
|
690
764
|
|
|
765
|
+
|
|
691
766
|
---
|
|
692
767
|
## Source: docs/guides/memory-and-learnings.md
|
|
693
768
|
|
|
@@ -726,6 +801,7 @@ Wazir keeps learning durable but scoped.
|
|
|
726
801
|
3. Promote it to `memory/learnings/accepted/` only when the scope and evidence are durable.
|
|
727
802
|
4. Move disproven or obsolete learnings to `memory/learnings/archived/`.
|
|
728
803
|
|
|
804
|
+
|
|
729
805
|
---
|
|
730
806
|
## Source: docs/guides/troubleshooting.md
|
|
731
807
|
|
|
@@ -831,6 +907,7 @@ If it says the run status is missing:
|
|
|
831
907
|
- confirm the file exists on disk
|
|
832
908
|
- use `--json` for machine-readable output during automation
|
|
833
909
|
|
|
910
|
+
|
|
834
911
|
---
|
|
835
912
|
## Source: docs/reference/configuration-reference.md
|
|
836
913
|
|
|
@@ -969,15 +1046,56 @@ Out of scope for this manifest check:
|
|
|
969
1046
|
|
|
970
1047
|
Maintainers are responsible for policing those surfaces with the separate docs-truth, runtime-surface, and repository review checks.
|
|
971
1048
|
|
|
972
|
-
##
|
|
1049
|
+
## Phases vs workflows
|
|
973
1050
|
|
|
974
|
-
|
|
975
|
-
- `workflows` are the canonical callable or review-gated entrypoints that drive those phases.
|
|
1051
|
+
The pipeline has **4 phases** (Init, Clarifier, Executor, Final Review) and **15 workflows** (atomic units within those phases).
|
|
976
1052
|
|
|
977
|
-
|
|
1053
|
+
- **Phases** are the top-level pipeline stages. Event capture and tracking use phase names: `init`, `clarifier`, `executor`, `final_review`.
|
|
1054
|
+
- **Workflows** are the canonical callable or review-gated entrypoints that run within phases. Each workflow can be independently enabled/disabled via `workflow_policy` in run-config.
|
|
978
1055
|
|
|
979
|
-
|
|
980
|
-
|
|
1056
|
+
| Phase | Workflows |
|
|
1057
|
+
|-------|-----------|
|
|
1058
|
+
| Init | (inline — no workflow files) |
|
|
1059
|
+
| Clarifier | clarify, discover, specify, spec_challenge, author, design, design_review, plan, plan_review |
|
|
1060
|
+
| Executor | execute, verify |
|
|
1061
|
+
| Final Review | review, learn, prepare_next |
|
|
1062
|
+
|
|
1063
|
+
`run_audit` is a standalone on-demand workflow, not part of the main pipeline flow.
|
|
1064
|
+
|
|
1065
|
+
Validators and exports should treat manifest-declared workflows as the canonical workflow file roster.
|
|
1066
|
+
|
|
1067
|
+
## Hook configuration
|
|
1068
|
+
|
|
1069
|
+
### `hooks/routing-matrix.json`
|
|
1070
|
+
|
|
1071
|
+
The routing matrix defines how the context-mode router classifies commands:
|
|
1072
|
+
|
|
1073
|
+
- `large` — array of command prefixes that always route to context-mode (AC-3.1). The `# wazir:passthrough` marker does NOT exempt commands in this category.
|
|
1074
|
+
- `small` — array of command prefixes that always pass through without context-mode processing.
|
|
1075
|
+
- `ambiguous_heuristic` — rules for commands that match neither large nor small:
|
|
1076
|
+
- `pipe_detected` — classify piped commands as ambiguous
|
|
1077
|
+
- `redirect_detected` — classify redirected commands as ambiguous
|
|
1078
|
+
- `verbose_binaries` — array of binary names whose output is typically large
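A minimal sketch of how a router could apply this matrix. It is not the shipped `hooks/context-mode-router` logic: the matrix contents, the `classify` function, the check order, and the treatment of unmatched commands are all assumptions. Only the three key names and the rule that the passthrough marker never exempts `large` commands come from the reference above.

```python
# Hypothetical matrix contents; only the key names come from the reference.
ROUTING_MATRIX = {
    "large": ["npm test", "pytest"],
    "small": ["git status", "ls"],
    "ambiguous_heuristic": {
        "pipe_detected": True,
        "redirect_detected": True,
        "verbose_binaries": ["find", "grep"],
    },
}

PASSTHROUGH_MARKER = "# wazir:passthrough"


def classify(command: str, matrix: dict = ROUTING_MATRIX) -> str:
    """Return 'context-mode', 'passthrough', or 'ambiguous' for a shell command."""
    stripped = command.strip()
    # Large prefixes always route to context-mode; the marker does not exempt them.
    if any(stripped.startswith(prefix) for prefix in matrix["large"]):
        return "context-mode"
    # Assumption: for every other command the marker forces a passthrough.
    if PASSTHROUGH_MARKER in stripped:
        return "passthrough"
    if any(stripped.startswith(prefix) for prefix in matrix["small"]):
        return "passthrough"
    heuristic = matrix["ambiguous_heuristic"]
    words = stripped.split()
    binary = words[0] if words else ""
    if heuristic["pipe_detected"] and "|" in stripped:
        return "ambiguous"
    if heuristic["redirect_detected"] and (">" in stripped or "<" in stripped):
        return "ambiguous"
    if binary in heuristic["verbose_binaries"]:
        return "ambiguous"
    # Assumption: anything that matches no rule passes through untouched.
    return "passthrough"
```

What the router does with an `ambiguous` result is not described in this section, so the sketch stops at classification.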
|
|
1079
|
+
|
|
1080
|
+
### `config/gating-rules.yaml`
|
|
1081
|
+
|
|
1082
|
+
The gating rules file defines conditions for phase transition decisions:
|
|
1083
|
+
|
|
1084
|
+
- `rules.continue` — all conditions must pass for a phase to advance; the conditions cover test failures, lint errors, type errors, drift delta, risk flags, and uncertain outcomes
|
|
1085
|
+
- `rules.loop_back` — any deterministic failure (test failures, lint errors, or type errors) triggers a loop-back with actionable fix descriptions
|
|
1086
|
+
- `rules.escalate` — fallback when neither continue nor loop_back match
|
|
1087
|
+
- `default_verdict` — verdict when the report is empty or missing (defaults to `escalate`)
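The three rule groups can be read as a small decision function. The sketch below is an assumption-laden illustration, not the code in `tooling/src/gating/agent.js`: the report field names, the `max_drift_delta` parameter, and the choice to escalate on missing fields are hypothetical. The verdict names and the escalate default come from the list above.

```python
from typing import Optional

DEFAULT_VERDICT = "escalate"

# Hypothetical phase-report field names, one per condition named in the docs.
REQUIRED_FIELDS = (
    "test_failures",
    "lint_errors",
    "type_errors",
    "drift_delta",
    "risk_flags",
    "uncertain_outcomes",
)


def decide(report: Optional[dict], max_drift_delta: int = 0) -> str:
    """Return 'continue', 'loop_back', or 'escalate' for a phase report."""
    # Empty or missing report: fall back to the default verdict.
    if not report:
        return DEFAULT_VERDICT
    # Assumption: an incomplete report is treated like a missing one.
    if any(field not in report for field in REQUIRED_FIELDS):
        return DEFAULT_VERDICT
    # Any deterministic failure triggers a loop-back.
    if any(report[key] > 0 for key in ("test_failures", "lint_errors", "type_errors")):
        return "loop_back"
    # Continue only when every remaining condition also passes.
    clean = (
        report["drift_delta"] <= max_drift_delta
        and not report["risk_flags"]
        and not report["uncertain_outcomes"]
    )
    return "continue" if clean else "escalate"
```

Checking `loop_back` before `continue` is a design choice of this sketch: a report with deterministic failures can never satisfy `continue`, so the order does not change the outcome.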
|
|
1088
|
+
|
|
1089
|
+
### Composition proof artifacts
|
|
1090
|
+
|
|
1091
|
+
The composition engine (`tooling/src/adapters/composition-engine.js`) writes a proof artifact per dispatch to `.wazir/runs/<id>/artifacts/composition-<role>-<task>.json` containing:
|
|
1092
|
+
|
|
1093
|
+
- `modules_included[]` — `{ path, layer, tokens }` for each loaded module
|
|
1094
|
+
- `modules_dropped[]` — `{ path, layer, tokens, reason }` for each dropped module. Reason values:
|
|
1095
|
+
- `module_cap_exceeded` — module count exceeded the 15-module cap
|
|
1096
|
+
- `token_ceiling_exceeded` — total tokens exceeded the configurable ceiling (default: 50,000)
|
|
1097
|
+
- `total_tokens` — total token count of composed prompt
|
|
1098
|
+
- `prompt_hash` — SHA-256 hash of the composed prompt for audit traceability
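A rough sketch of how such a proof artifact could be assembled. It does not reproduce `composition-engine.js`: the `build_proof` function, the priority-ordered input, and summing module tokens into `total_tokens` are assumptions. The field names, the 15-module cap, the 50,000-token default ceiling, and the SHA-256 hash come from the list above.

```python
import hashlib

MODULE_CAP = 15
TOKEN_CEILING = 50_000


def build_proof(modules, prompt, module_cap=MODULE_CAP, token_ceiling=TOKEN_CEILING):
    """Build a composition proof for one dispatch.

    `modules` is a list of {"path", "layer", "tokens"} dicts, assumed to be
    sorted by priority. `prompt` is the already-composed prompt text.
    """
    included, dropped, total = [], [], 0
    for module in modules:
        if len(included) >= module_cap:
            dropped.append({**module, "reason": "module_cap_exceeded"})
        elif total + module["tokens"] > token_ceiling:
            dropped.append({**module, "reason": "token_ceiling_exceeded"})
        else:
            included.append(module)
            total += module["tokens"]
    return {
        "modules_included": included,
        "modules_dropped": dropped,
        # Assumption: the prompt total is approximated by the module token sum.
        "total_tokens": total,
        "prompt_hash": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
```

Recording the drop reason next to each module lets an auditor see why a module was left out without rerunning the composition.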
|
|
981
1099
|
|
|
982
1100
|
## Current index parser roster
|
|
983
1101
|
|
|
@@ -994,6 +1112,7 @@ The active manifest currently declares built-in heuristic extractors for:
|
|
|
994
1112
|
- YAML
|
|
995
1113
|
- Markdown
|
|
996
1114
|
|
|
1115
|
+
|
|
997
1116
|
---
|
|
998
1117
|
## Source: docs/reference/expertise-index.md
|
|
999
1118
|
|
|
@@ -1050,6 +1169,7 @@ The `expertise/humanize/` domain provides AI text pattern detection and removal.
|
|
|
1050
1169
|
|
|
1051
1170
|
For conceptual understanding of how the composition engine works, see [Composition Engine](../concepts/composition-engine.md).
|
|
1052
1171
|
|
|
1172
|
+
|
|
1053
1173
|
---
|
|
1054
1174
|
## Source: docs/reference/git-flow.md
|
|
1055
1175
|
|
|
@@ -1097,6 +1217,7 @@ Allowed types: `feat`, `fix`, `docs`, `chore`, `refactor`, `test`, `ci`, `perf`,
|
|
|
1097
1217
|
- **CI:** All three validators run on pull requests; `--require-entries` blocks feature/codex/hotfix branches without changelog entries
|
|
1098
1218
|
- **Roles:** Each role has documented git-flow responsibilities in its contract
|
|
1099
1219
|
|
|
1220
|
+
|
|
1100
1221
|
---
|
|
1101
1222
|
## Source: docs/reference/hooks.md
|
|
1102
1223
|
|
|
@@ -1188,6 +1309,7 @@ Hook definitions are the authoritative product contracts. The canonical definiti
|
|
|
1188
1309
|
- `0` allow
|
|
1189
1310
|
- `43` block
|
|
1190
1311
|
|
|
1312
|
+
|
|
1191
1313
|
---
|
|
1192
1314
|
## Source: docs/reference/host-exports.md
|
|
1193
1315
|
|
|
@@ -1242,6 +1364,7 @@ The compiler generates the canonical host packages under `exports/hosts/*`.
|
|
|
1242
1364
|
|
|
1243
1365
|
The only root host bootstrap retained is `.claude/settings.json`, which mirrors the generated Claude settings contract.
|
|
1244
1366
|
|
|
1367
|
+
|
|
1245
1368
|
---
|
|
1246
1369
|
## Source: docs/reference/launch-checklist.md
|
|
1247
1370
|
|
|
@@ -1273,7 +1396,7 @@ Submit pull requests to these curated lists (one PR per list, follow each repo's
|
|
|
1273
1396
|
### awesome-claude-code
|
|
1274
1397
|
- **Repo:** `github.com/anthropics/awesome-claude-code` (or the most-starred community fork)
|
|
1275
1398
|
- **Section:** Tools / Plugins / Extensions
|
|
1276
|
-
- **Entry format:** `[Wazir](https://github.com/MohamedAbdallah-14/Wazir) - Host-native engineering OS kit with 10 roles,
|
|
1399
|
+
- **Entry format:** `[Wazir](https://github.com/MohamedAbdallah-14/Wazir) - Host-native engineering OS kit with 10 roles, 4 phases (15 workflows), and 315 expertise modules.`
|
|
1277
1400
|
- **Tips:** Keep the description under 120 characters. Link directly to the repo.
|
|
1278
1401
|
|
|
1279
1402
|
### awesome-ai-agents
|
|
@@ -1303,7 +1426,7 @@ Show HN: Wazir – Engineering OS kit for AI coding agents (Claude, Codex, Gemin
|
|
|
1303
1426
|
### First comment
|
|
1304
1427
|
Post a comment immediately after submission explaining:
|
|
1305
1428
|
1. What problem Wazir solves (AI agents lack structured engineering workflows)
|
|
1306
|
-
2. How it works (10 canonical roles,
|
|
1429
|
+
2. How it works (10 canonical roles, 4-phase pipeline with 15 workflows, 315 expertise modules)
|
|
1307
1430
|
3. What makes it different (host-native, works across Claude/Codex/Gemini/Cursor)
|
|
1308
1431
|
4. Quick install: `npx @wazir-dev/cli init`
|
|
1309
1432
|
5. Invite feedback -- HN readers appreciate genuine requests for input
|
|
@@ -1322,7 +1445,7 @@ Post a comment immediately after submission explaining:
|
|
|
1322
1445
|
**Title:** "How I Built an Engineering OS for AI Coding Agents"
|
|
1323
1446
|
|
|
1324
1447
|
1. **Hook** -- The problem: AI agents write code but lack engineering discipline.
|
|
1325
|
-
2. **Architecture overview** -- 10 roles,
|
|
1448
|
+
2. **Architecture overview** -- 10 roles, 4 phases (15 workflows), expertise modules, quality gates.
|
|
1326
1449
|
3. **Code walkthrough** -- Show a real workflow: how a feature moves from requirements through TDD to deployment.
|
|
1327
1450
|
4. **Host-native approach** -- Explain why one kit works across Claude, Codex, Gemini, and Cursor.
|
|
1328
1451
|
5. **Results** -- Concrete metrics or before/after comparisons.
|
|
@@ -1347,7 +1470,7 @@ Structure as a 5-7 tweet thread:
|
|
|
1347
1470
|
|
|
1348
1471
|
1. **Hook tweet:** One-liner about the problem + link to repo.
|
|
1349
1472
|
2. **What it is:** Brief description of Wazir.
|
|
1350
|
-
3. **Architecture:** 10 roles,
|
|
1473
|
+
3. **Architecture:** 10 roles, 4 phases (15 workflows), 315 modules (include a diagram image).
|
|
1351
1474
|
4. **Demo:** Short GIF or screenshot of a workflow in action.
|
|
1352
1475
|
5. **Multi-host:** Works with Claude, Codex, Gemini, and Cursor.
|
|
1353
1476
|
6. **Install:** `npx @wazir-dev/cli init`
|
|
@@ -1418,6 +1541,7 @@ Monitor these metrics weekly for the first month, then monthly:
|
|
|
1418
1541
|
| External PRs | 2+ |
|
|
1419
1542
|
| HN points | 50+ |
|
|
1420
1543
|
|
|
1544
|
+
|
|
1421
1545
|
---
|
|
1422
1546
|
## Source: docs/reference/marketplace-listings.md
|
|
1423
1547
|
|
|
@@ -1498,6 +1622,7 @@ Run through this checklist after every `npm publish`:
|
|
|
1498
1622
|
- [ ] **Host exports:** Run `npx wazir export --check` to verify no drift
|
|
1499
1623
|
- [ ] **CHANGELOG:** Verify `CHANGELOG.md` is updated with the new version entry
|
|
1500
1624
|
|
|
1625
|
+
|
|
1501
1626
|
---
|
|
1502
1627
|
## Source: docs/reference/release-process.md
|
|
1503
1628
|
|
|
@@ -1536,6 +1661,550 @@ When no Wazir release tag exists yet:
|
|
|
1536
1661
|
- Legacy tags are not considered release boundaries
|
|
1537
1662
|
- The first release tag will be `v1.0.0` (or `v0.1.0` if pre-stable)
|
|
1538
1663
|
|
|
1664
|
+
|
|
1665
|
+
---
|
|
1666
|
+
## Source: docs/reference/review-loop-pattern.md
|
|
1667
|
+
|
|
1668
|
+
# Review Loop Pattern Reference
|
|
1669
|
+
|
|
1670
|
+
Canonical reference for the review loop pattern used across all Wazir pipeline phases. Skills and workflows link to this document rather than embedding loop logic inline.
|
|
1671
|
+
|
|
1672
|
+
---
|
|
1673
|
+
|
|
1674
|
+
## Core Principle: Producer-Reviewer Separation
|
|
1675
|
+
|
|
1676
|
+
The producer skill (clarifier, planner, designer, etc.) **emits** an artifact and calls for review. The **reviewer role** owns the review loop. The producer receives findings and resolves them. No role reviews its own output.
|
|
1677
|
+
|
|
1678
|
+
```
|
|
1679
|
+
Producer emits artifact
|
|
1680
|
+
-> Reviewer runs review loop (N passes, Codex if available)
|
|
1681
|
+
-> Findings returned to producer
|
|
1682
|
+
-> Producer fixes and resubmits
|
|
1683
|
+
-> Loop until all passes exhausted or cap reached
|
|
1684
|
+
-> Escalate to user if cap exceeded
|
|
1685
|
+
```
|
|
1686
|
+
|
|
1687
|
+
When Codex is available, the reviewer role delegates to `codex review` as a secondary input while maintaining its own independent primary verdict.
|
|
1688
|
+
|
|
1689
|
+
---
|
|
1690
|
+
|
|
1691
|
+
## Per-Task Review vs Final Review
|
|
1692
|
+
|
|
1693
|
+
These are two structurally different constructs:
|
|
1694
|
+
|
|
1695
|
+
| | Per-Task Review | Final Review |
|
|
1696
|
+
|---|---|---|
|
|
1697
|
+
| **When** | During execution, after each task | After all execution + verification complete |
|
|
1698
|
+
| **Dimensions** | 5 task-execution dims (correctness, tests, wiring, drift, quality) | 7 scored dims (correctness, completeness, wiring, verification, drift, quality, documentation) |
|
|
1699
|
+
| **Scope** | Single task's uncommitted changes | Entire implementation vs spec/plan |
|
|
1700
|
+
| **Output** | Pass/fix loop, no score | Scored verdict (0-70), PASS/FAIL |
|
|
1701
|
+
| **Workflow** | Inline in execution flow | `workflows/review.md` |
|
|
1702
|
+
| **Skill** | `wz:reviewer` in `task-review` mode | `wz:reviewer` in `final` mode |
|
|
1703
|
+
| **Log filename** | `<phase>-task-<NNN>-review-pass-<N>.md` | `final-review.md` |
|
|
1704
|
+
|
|
1705
|
+
---
|
|
1706
|
+
|
|
1707
|
+
## Standalone Mode
|
|
1708
|
+
|
|
1709
|
+
When no `.wazir/runs/latest/` directory exists (standalone skill invocation outside a pipeline run):
|
|
1710
|
+
|
|
1711
|
+
1. **Review loops still run** -- the review logic is embedded in the skill, not dependent on run state.
|
|
1712
|
+
2. **Artifact location** -- artifacts live in `docs/plans/`. This is the canonical standalone artifact path.
|
|
1713
|
+
3. **Review log location** -- review logs go alongside the artifact: `docs/plans/YYYY-MM-DD-<topic>-review-pass-<N>.md`. No temp dir.
|
|
1714
|
+
4. **Loop cap is SKIPPED entirely** -- no `wazir capture loop-check` call. The loop runs for exactly `pass_counts[depth]` passes (3/5/7) and stops. No cap guard, no fallback constant.
|
|
1715
|
+
5. **`wazir capture loop-check`** -- not invoked in standalone mode. The standalone detection happens before the cap guard call.
|
|
1716
|
+
|
|
1717
|
+
Detection logic:
|
|
1718
|
+
|
|
1719
|
+
```
|
|
1720
|
+
if .wazir/runs/latest/ exists:
|
|
1721
|
+
run_mode = "pipeline"
|
|
1722
|
+
  log_dir = .wazir/runs/latest/reviews/
|
|
1723
|
+
  cap_guard = wazir capture loop-check (full guard)
|
|
1724
|
+
else:
|
|
1725
|
+
  run_mode = "standalone"
|
|
1726
|
+
  artifact_dir = docs/plans/
|
|
1727
|
+
  log_dir = docs/plans/ (alongside artifact)
|
|
1728
|
+
  cap_guard = none (depth pass count is the only limit)
|
|
1729
|
+
```
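The detection logic above can be sketched in Python. This is a minimal illustration, not Wazir's implementation: the function names `detect_run_mode` and `review_settings` and the `project_root` parameter are assumptions made for the example; only the paths and the pipeline/standalone split come from the text.

```python
from pathlib import Path


def detect_run_mode(project_root: Path) -> str:
    """Return "pipeline" if .wazir/runs/latest/ exists under project_root, else "standalone"."""
    latest = project_root / ".wazir" / "runs" / "latest"
    return "pipeline" if latest.is_dir() else "standalone"


def review_settings(project_root: Path) -> dict:
    """Mirror the detection table: log directory and whether the cap guard runs."""
    if detect_run_mode(project_root) == "pipeline":
        return {
            "run_mode": "pipeline",
            "log_dir": project_root / ".wazir" / "runs" / "latest" / "reviews",
            "cap_guard": True,  # `wazir capture loop-check` runs before each pass
        }
    return {
        "run_mode": "standalone",
        "log_dir": project_root / "docs" / "plans",  # logs sit alongside the artifact
        "cap_guard": False,  # the depth pass count is the only limit
    }
```

Because detection happens once and returns the cap-guard decision with it, the standalone branch never reaches the `loop-check` call.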
|
|
1730
|
+
|
|
1731
|
+
---
|
|
1732
|
+
|
|
1733
|
+
## Review Loop Pseudocode
|
|
1734
|
+
|
|
1735
|
+
```
|
|
1736
|
+
review_loop(artifact_path, phase, dimensions[], depth, config, options={}):
|
|
1737
|
+
|
|
1738
|
+
  # options.mode -- explicit review mode (required)
|
|
1739
|
+
  # options.task_id -- task identifier for task-scoped reviews (optional)
|
|
1740
|
+
|
|
1741
|
+
  # Standalone detection
|
|
1742
|
+
  run_mode = detect_run_mode() # "pipeline" or "standalone"
|
|
1743
|
+
|
|
1744
|
+
  # Fixed pass counts -- no extension
|
|
1745
|
+
  pass_counts = { quick: 3, standard: 5, deep: 7 }
|
|
1746
|
+
  total_passes = pass_counts[depth]
|
|
1747
|
+
|
|
1748
|
+
  # Depth-aware dimension subsets (coverage contract)
|
|
1749
|
+
  depth_dimensions = {
|
|
1750
|
+
    quick: dimensions[0:3], # first 3 dimensions only
|
|
1751
|
+
    standard: dimensions[0:5], # first 5
|
|
1752
|
+
    deep: dimensions, # all available
|
|
1753
|
+
  }
|
|
1754
|
+
  active_dims = depth_dimensions[depth]
|
|
1755
|
+
|
|
1756
|
+
  codex_available = check_codex() # which codex && codex --version
|
|
1757
|
+
|
|
1758
|
+
  for pass_number in 0..total_passes-1:
|
|
1759
|
+
|
|
1760
|
+
    # --- Cap guard check (pipeline mode only, before each pass) ---
|
|
1761
|
+
    if run_mode == "pipeline":
|
|
1762
|
+
      loop_check_args = "--run <run-id> --phase <phase> --loop-count <pass_number+1>"
|
|
1763
|
+
      if options.task_id:
|
|
1764
|
+
        loop_check_args += " --task-id <task_id>"
|
|
1765
|
+
      wazir capture loop-check $loop_check_args
|
|
1766
|
+
      # loop-check wraps: event capture + evaluateLoopCapGuard
|
|
1767
|
+
      # If loop_cap_guard fires (exit 43), stop immediately:
|
|
1768
|
+
      if last_exit_code == 43:
|
|
1769
|
+
        log("Loop cap reached for phase: <phase>. Escalating to user.")
|
|
1770
|
+
        escalate_to_user(evidence_gathered_so_far)
|
|
1771
|
+
        return { pass_count: pass_number, escalated: true }
|
|
1772
|
+
    # Standalone mode: no cap guard. Loop runs for total_passes and stops.
|
|
1773
|
+
|
|
1774
|
+
    dimension = active_dims[pass_number % len(active_dims)]
|
|
1775
|
+
|
|
1776
|
+
    # --- Primary review (reviewer role, not producer) ---
|
|
1777
|
+
    # Mode is always explicit -- passed by caller via options.mode
|
|
1778
|
+
    findings = self_review(artifact_path, focus=dimension, mode=options.mode)
|
|
1779
|
+
|
|
1780
|
+
    # --- Secondary review (Codex, if available) ---
|
|
1781
|
+
    if codex_available:
|
|
1782
|
+
      codex_exit_code, codex_output = run_codex_review(artifact_path, dimension)
|
|
1783
|
+
      if codex_exit_code != 0:
|
|
1784
|
+
        # Codex failed -- log error, fall back to self-review for this pass
|
|
1785
|
+
        log_error("Codex exited " + codex_exit_code + ": " + codex_output.stderr)
|
|
1786
|
+
        mark_pass_codex_unavailable(pass_number)
|
|
1787
|
+
        # Do NOT treat Codex failure as clean. Self-review findings stand alone.
|
|
1788
|
+
      else:
|
|
1789
|
+
        codex_findings = parse(codex_output.stdout)
|
|
1790
|
+
        merge(findings, codex_findings, preserve_attribution=true)
|
|
1791
|
+
|
|
1792
|
+
    # --- Log the review pass ---
|
|
1793
|
+
    if run_mode == "pipeline":
|
|
1794
|
+
      if options.task_id:
|
|
1795
|
+
        log_path = .wazir/runs/latest/reviews/<phase>-task-<task_id>-review-pass-<N>.md
|
|
1796
|
+
      else:
|
|
1797
|
+
        log_path = .wazir/runs/latest/reviews/<phase>-review-pass-<N>.md
|
|
1798
|
+
      log(pass_number+1, dimension, findings) -> log_path
|
|
1799
|
+
    else:
|
|
1800
|
+
      log_path = docs/plans/YYYY-MM-DD-<topic>-review-pass-<N>.md
|
|
1801
|
+
      log(pass_number+1, dimension, findings) -> log_path
|
|
1802
|
+
|
|
1803
|
+
    if findings.has_issues:
|
|
1804
|
+
      # --- Fix and re-submit (MANDATORY) ---
|
|
1805
|
+
      # The producer MUST fix findings and the reviewer MUST re-review.
|
|
1806
|
+
      # "Fix and continue without re-review" is EXPLICITLY PROHIBITED.
|
|
1807
|
+
      producer_fix(artifact_path, findings)
|
|
1808
|
+
      # Continue to next pass -- the fix will be re-reviewed
|
|
1809
|
+
|
|
1810
|
+
  # --- Post-loop: escalation if issues remain ---
|
|
1811
|
+
  if remaining.has_issues:
|
|
1812
|
+
    # Cap reached with unresolved findings. Present to user:
|
|
1813
|
+
    # 1. Approve with known issues (Recommended if non-blocking)
|
|
1814
|
+
    # 2. Fix manually and re-run
|
|
1815
|
+
    # 3. Abort
|
|
1816
|
+
    escalate_to_user(remaining, options=[
|
|
1817
|
+
"approve-with-issues",
|
|
1818
|
+
"fix-manually-and-rerun",
|
|
1819
|
+
"abort"
|
|
1820
|
+
    ])
|
|
1821
|
+
    # User decides. If approved, log "user-approved-with-issues" in final pass file.
|
|
1822
|
+
|
|
1823
|
+
  return { pass_count: total_passes, issues_found, issues_fixed, remaining, attributions }
|
|
1824
|
+
```
|
|
1825
|
+
|
|
1826
|
+
Key properties of this pseudocode:
|
|
1827
|
+
|
|
1828
|
+
1. **Fixed pass counts** -- Quick is exactly 3, standard exactly 5, deep exactly 7. No `max_passes = min_passes + 3`. No clean-streak early-exit. No extension.
|
|
1829
|
+
2. **Task-scoped log filenames** -- `<phase>-task-<NNN>-review-pass-<N>.md` for per-task reviews, preventing log clobbering in parallel mode.
|
|
1830
|
+
3. **Task-scoped loop cap keys** -- `--task-id` flag on `loop-check` so each task gets its own counter in `phase_loop_counts`.
|
|
1831
|
+
4. **Explicit review mode** -- `options.mode` is always passed by the caller. No auto-detection.
|
|
1832
|
+
5. **Codex error handling** -- non-zero exit is logged, pass marked `codex-unavailable`, self-review findings used alone. Never treated as clean.
|
|
1833
|
+
6. **Standalone mode** -- uses `docs/plans/` for artifacts and logs. No temp dir. No cap guard at all.
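To make the fixed pass counts and the dimension rotation concrete, here is a small Python sketch of the schedule the pseudocode produces. The helper names `active_dimensions` and `pass_schedule` are illustrative assumptions; the pass counts, the depth subsets, and the `pass_number % len(active_dims)` rotation are taken from the pseudocode.

```python
PASS_COUNTS = {"quick": 3, "standard": 5, "deep": 7}


def active_dimensions(dimensions: list, depth: str) -> list:
    """Depth-aware subset: first 3 for quick, first 5 for standard, all for deep."""
    if depth == "quick":
        return list(dimensions[:3])
    if depth == "standard":
        return list(dimensions[:5])
    if depth == "deep":
        return list(dimensions)
    raise ValueError(f"unknown depth: {depth!r}")


def pass_schedule(dimensions: list, depth: str) -> list:
    """Return the dimension reviewed on each pass, in order."""
    dims = active_dimensions(dimensions, depth)
    if not dims:
        raise ValueError("at least one dimension is required")
    # Fixed number of passes; dimensions repeat when passes outnumber them.
    return [dims[i % len(dims)] for i in range(PASS_COUNTS[depth])]
```

With the five task-execution dimensions at `deep` depth, the seven passes cover all five dimensions once and then revisit the first two.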
|
|
1834
|
+
|
|
1835
|
+
---
|
|
1836
|
+
|
|
1837
|
+
## Codex Error Handling Contract
|
|
1838
|
+
|
|
1839
|
+
```
|
|
1840
|
+
run_codex_review(artifact_path, dimension):
|
|
1841
|
+
  CODEX_MODEL = read_config('.wazir/state/config.json', '.multi_tool.codex.model') or "gpt-5.4"
|
|
1842
|
+
|
|
1843
|
+
  if is_code_artifact:
|
|
1844
|
+
    cmd = codex review -c model="$CODEX_MODEL" --uncommitted --title "..." "Review for [dimension]..."
|
|
1845
|
+
    # or: codex review -c model="$CODEX_MODEL" --base <sha> for committed changes
|
|
1846
|
+
  else:
|
|
1847
|
+
    cmd = cat <artifact_path> | codex exec -c model="$CODEX_MODEL" "Review this [type] for [dimension]..."
|
|
1848
|
+
|
|
1849
|
+
  result = execute(cmd, timeout=120s, capture_stderr=true)
|
|
1850
|
+
|
|
1851
|
+
  if result.exit_code != 0:
|
|
1852
|
+
    return (result.exit_code, { stderr: result.stderr, stdout: "" })
|
|
1853
|
+
    # Caller handles: log error, mark codex-unavailable, use self-review only
|
|
1854
|
+
|
|
1855
|
+
  return (0, { stdout: result.stdout, stderr: result.stderr })
|
|
1856
|
+
```
|
|
1857
|
+
|
|
1858
|
+
Rules:
|
|
1859
|
+
|
|
1860
|
+
- If Codex exits non-zero, log the full stderr.
|
|
1861
|
+
- Mark the pass as `codex-unavailable` in the review log metadata.
|
|
1862
|
+
- Fall back to self-review for that pass only. Do not skip the pass.
|
|
1863
|
+
- Do not retry Codex on the same pass. If Codex fails on pass 2, pass 3 still tries Codex (transient failures recover).
|
|
1864
|
+
- Never treat a Codex failure as a clean review pass.
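The rules above can be sketched as a single-pass helper in Python. This is a hedged illustration rather than Wazir's code: `PassResult`, `run_once`, and `review_pass` are hypothetical names, and the reviewer command is passed in as an argument list. Three behaviours are assumptions of the sketch: each non-empty stdout line is one finding, a timeout maps to exit code 124, and a missing binary maps to exit code 127.

```python
import subprocess
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PassResult:
    findings: List[Tuple[str, str]]  # (source, text); source is "self" or "codex"
    codex_status: str                # "ok" or "codex-unavailable"
    errors: List[str] = field(default_factory=list)


def run_once(cmd: List[str], stdin_text: Optional[str] = None,
             timeout_s: float = 120.0) -> Tuple[int, str, str]:
    """Run the reviewer command a single time (no retry), capturing both streams."""
    try:
        proc = subprocess.run(cmd, input=stdin_text, capture_output=True,
                              text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return 124, "", f"timed out after {timeout_s}s"  # assumed: timeout counts as failure
    except OSError as exc:
        return 127, "", str(exc)  # assumed: binary missing or not runnable
    return proc.returncode, proc.stdout, proc.stderr


def review_pass(self_findings: List[str], codex_cmd: List[str],
                stdin_text: Optional[str] = None) -> PassResult:
    """Merge self-review and Codex findings for one pass, keeping attribution."""
    findings = [("self", text) for text in self_findings]
    code, out, err = run_once(codex_cmd, stdin_text)
    if code != 0:
        # Failure is never a clean pass: keep self-review findings, mark the pass.
        return PassResult(findings, "codex-unavailable", [f"Codex exited {code}: {err}"])
    if out.strip() != "CLEAN":
        findings += [("codex", line.strip()) for line in out.splitlines() if line.strip()]
    return PassResult(findings, "ok")
```

Calling `review_pass` once per pass gives the "no retry on the same pass, try again on the next pass" behaviour without extra state.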
|
|
1865
|
+
|
|
1866
|
+
---
|
|
1867
|
+
|
|
1868
|
+
## Codex Availability Probe
|
|
1869
|
+
|
|
1870
|
+
Before any Codex call, verify availability once at loop start:
|
|
1871
|
+
|
|
1872
|
+
```bash
|
|
1873
|
+
which codex >/dev/null 2>&1 && codex --version >/dev/null 2>&1
|
|
1874
|
+
```
|
|
1875
|
+
|
|
1876
|
+
If the probe fails, set `codex_available = false` for the entire loop. Fall back to self-review only. Never error out.
|
|
1877
|
+
|
|
1878
|
+
Per-invocation failures (Codex available but a single call fails) are handled separately by the error contract above.
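For callers that are not shell scripts, the same probe can be approximated in Python. A minimal sketch, assuming a hypothetical helper name `tool_available`; the 10-second timeout is an addition of this example and is not part of the documented probe.

```python
import shutil
import subprocess


def tool_available(name: str = "codex", timeout_s: float = 10.0) -> bool:
    """Equivalent of `which NAME && NAME --version`, with output discarded."""
    path = shutil.which(name)
    if path is None:
        return False
    try:
        proc = subprocess.run([path, "--version"], stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        return False  # never error out; treat any probe problem as unavailable
    return proc.returncode == 0
```

The result would be computed once at loop start and reused for every pass.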
|
|
1879
|
+
|
|
1880
|
+
---
|
|
1881
|
+
|
|
1882
|
+
## Codex Artifact-Scoped Review
|
|
1883
|
+
|
|
1884
|
+
Never use `codex review` for non-code artifacts (specs, plans, designs). Instead, pipe the artifact content via stdin:
|
|
1885
|
+
|
|
1886
|
+
```bash
|
|
1887
|
+
CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null)
|
|
1888
|
+
CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}
|
|
1889
|
+
cat .wazir/runs/latest/clarified/spec-hardened.md | \
|
|
1890
|
+
codex exec -c model="$CODEX_MODEL" "Review this specification for: [dimension]. Be specific, cite sections. Say CLEAN if no issues." \
|
|
1891
|
+
2>&1 | tee .wazir/runs/latest/reviews/spec-challenge-review-pass-N.md
|
|
1892
|
+
```
|
|
1893
|
+
|
|
1894
|
+
For code artifacts, use `codex review -c model="$CODEX_MODEL" --uncommitted` (or `--base` for committed changes). See the next section for details.
|
|
1895
|
+
|
|
1896
|
+
---
|
|
1897
|
+
|
|
1898
|
+
## Code Review Scoping
|
|
1899
|
+
|
|
1900
|
+
**Rule: review BEFORE commit.**
|
|
1901
|
+
|
|
1902
|
+
For each task during execution:
|
|
1903
|
+
|
|
1904
|
+
1. Implement the task (changes are uncommitted).
|
|
1905
|
+
2. Review the uncommitted changes using the **5 task-execution dimensions** (NOT the 7 final-review dimensions):
|
|
1906
|
+
```bash
|
|
1907
|
+
CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null)
|
|
1908
|
+
CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}
|
|
1909
|
+
codex review -c model="$CODEX_MODEL" --uncommitted --title "Task NNN: <summary>" \
|
|
1910
|
+
"Review against acceptance criteria: <criteria>" \
|
|
1911
|
+
2>&1 | tee .wazir/runs/latest/reviews/execute-task-NNN-review-pass-N.md
|
|
1912
|
+
```
|
|
1913
|
+
3. Fix any findings (still uncommitted).
|
|
1914
|
+
4. Re-review until all passes are exhausted or the cap is reached.
|
|
1915
|
+
5. **Only after review passes:** commit with conventional commit format.
|
|
1916
|
+
|
|
1917
|
+
**If changes are already committed** (e.g., subagent workflow where the implementer subagent commits before review):
|
|
1918
|
+
|
|
1919
|
+
```bash
|
|
1920
|
+
# Capture the SHA before the task starts
|
|
1921
|
+
PRE_TASK_SHA=$(git rev-parse HEAD)
|
|
1922
|
+
|
|
1923
|
+
# ... subagent implements and commits ...
|
|
1924
|
+
|
|
1925
|
+
# Review the committed changes against the pre-task baseline
|
|
1926
|
+
CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null)
|
|
1927
|
+
CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}
|
|
1928
|
+
codex review -c model="$CODEX_MODEL" --base $PRE_TASK_SHA --title "Task NNN: <summary>" \
|
|
1929
|
+
"Review against acceptance criteria: <criteria>" \
|
|
1930
|
+
2>&1 | tee .wazir/runs/latest/reviews/execute-task-NNN-review-pass-N.md
|
|
1931
|
+
```
|
|
1932
|
+
|
|
1933
|
+
---
|
|
1934
|
+
|
|
1935
|
+
## Dimension Sets
|
|
1936
|
+
|
|
1937
|
+
### Research Dimensions (5)
|
|
1938
|
+
|
|
1939
|
+
1. **Coverage** -- all briefing topics researched
|
|
1940
|
+
2. **Source quality** -- authoritative, current sources
|
|
1941
|
+
3. **Relevance** -- research answers the actual questions
|
|
1942
|
+
4. **Gaps** -- missing info that blocks later phases
|
|
1943
|
+
5. **Contradictions** -- conflicting sources identified
|
|
1944
|
+
|
|
1945
|
+
### Spec/Clarification Dimensions (5)
|
|
1946
|
+
|
|
1947
|
+
1. **Completeness** -- all requirements covered
|
|
1948
|
+
2. **Testability** -- each criterion verifiable
|
|
1949
|
+
3. **Ambiguity** -- no dual-interpretation statements
|
|
1950
|
+
4. **Assumptions** -- hidden assumptions explicit
|
|
1951
|
+
5. **Scope creep** -- nothing beyond briefing
|
|
1952
|
+
|
|
1953
|
+
### Design-Review Dimensions (5)
|
|
1954
|
+
|
|
1955
|
+
Matches canonical `workflows/design-review.md`:
|
|
1956
|
+
|
|
1957
|
+
1. **Spec coverage** -- does the design address every acceptance criterion with a visual component?
|
|
1958
|
+
2. **Design-spec consistency** -- does the design introduce anything not in the spec? (scope creep check)
|
|
1959
|
+
3. **Accessibility** -- color contrast ratios (WCAG 2.1 AA), focus states, touch target sizes (44x44px minimum)
|
|
1960
|
+
4. **Visual consistency** -- design tokens form a coherent system, dark/light mode alignment
|
|
1961
|
+
5. **Exported-code fidelity** -- do exported scaffolds match the designs? Mismatches are failures here, not implementation concerns.
|
|
1962
|
+
|
|
1963
|
+
### Plan Dimensions (7)
|
|
1964
|
+
|
|
1965
|
+
1. **Completeness** -- all design decisions mapped to tasks
|
|
1966
|
+
2. **Ordering** -- dependencies correct, parallelizable identified
|
|
1967
|
+
3. **Atomicity** -- each task fits one session
|
|
1968
|
+
4. **Testability** -- concrete verification per task
|
|
1969
|
+
5. **Edge cases** -- error paths covered
|
|
1970
|
+
6. **Security** -- auth, injection, data exposure
|
|
1971
|
+
7. **Integration** -- tasks connect end-to-end
|
|
1972
|
+
|
|
1973
|
+
### Task Execution Dimensions (5)
|
|
1974
|
+
|
|
1975
|
+
Used for per-task review during execution:
|
|
1976
|
+
|
|
1977
|
+
1. **Correctness** -- code matches spec
|
|
1978
|
+
2. **Tests** -- real tests, not mocked/faked
|
|
1979
|
+
3. **Wiring** -- all paths connected
|
|
1980
|
+
4. **Drift** -- matches task spec
|
|
1981
|
+
5. **Quality** -- naming, error handling
|
|
1982
|
+
|
|
1983
|
+
### Final Review Dimensions (7)
|
|
1984
|
+
|
|
1985
|
+
Used for `workflows/review.md` scored gate:
|
|
1986
|
+
|
|
1987
|
+
1. **Correctness** -- does the code do what the spec says?
|
|
1988
|
+
2. **Completeness** -- are all acceptance criteria met?
|
|
1989
|
+
3. **Wiring** -- are all paths connected end-to-end?
|
|
1990
|
+
4. **Verification** -- is there evidence (tests, type checks) for each claim?
|
|
1991
|
+
5. **Drift** -- does the implementation match the approved plan?
|
|
1992
|
+
6. **Quality** -- code style, naming, error handling, security
|
|
1993
|
+
7. **Documentation** -- changelog entries, commit messages, comments
|
|
1994
|
+
|
|
1995
|
+
The final review dimensions are the existing 7 from `skills/reviewer/SKILL.md`. `workflows/review.md` is not modified by this pattern.
|
|
1996
|
+
|
|
1997
|
+
---
|
|
1998
|
+
|
|
1999
|
+
## Per-Depth Coverage Contract
|
|
2000
|
+
|
|
2001
|
+
| Depth | Research | Spec | Design-Review | Plan | Task Execution | Final Review |
|
|
2002
|
+
|-------|----------|------|---------------|------|----------------|--------------|
|
|
2003
|
+
| Quick | dims 1-3, 3 passes | dims 1-3, 3 passes | dims 1-3, 3 passes | dims 1-3, 3 passes | dims 1-3, 3 passes | always 7 dims, 1 pass |
|
|
2004
|
+
| Standard | dims 1-5, 5 passes | dims 1-5, 5 passes | dims 1-5, 5 passes | dims 1-5, 5 passes | dims 1-5, 5 passes | always 7 dims, 1 pass |
|
|
2005
|
+
| Deep | dims 1-5, 7 passes | dims 1-5, 7 passes | dims 1-5, 7 passes | dims 1-7, 7 passes | dims 1-5, 7 passes | always 7 dims, 1 pass |
|
|
2006
|
+
|
|
2007
|
+
Pass counts are FIXED per depth. Quick = 3 passes, standard = 5 passes, deep = 7 passes. No extension. No early-exit. Final review is always a single scored pass across all 7 dimensions -- it is a gate, not a loop.
|
|
2008
|
+
|
|
2009
|
+
---
|
|
2010
|
+
|
|
2011
|
+
## Loop Cap Configuration
|
|
2012
|
+
|
|
2013
|
+
The `workflow_policy` section of `run-config.yaml` (legacy: `phase_policy`) controls which workflows are enabled and sets an absolute safety ceiling per workflow. Only two fields exist: `enabled` and `loop_cap`. There is no `passes` field -- depth determines pass counts (3/5/7), not workflow policy.
|
|
2014
|
+
|
|
2015
|
+
```yaml
|
|
2016
|
+
workflow_policy:
|
|
2017
|
+
  # Clarifier phase workflows
|
|
2018
|
+
  discover: { enabled: true, loop_cap: 10 }
|
|
2019
|
+
  clarify: { enabled: true, loop_cap: 10 }
|
|
2020
|
+
  specify: { enabled: true, loop_cap: 10 }
|
|
2021
|
+
  spec-challenge: { enabled: true, loop_cap: 10 }
|
|
2022
|
+
  author: { enabled: false, loop_cap: 10 }
|
|
2023
|
+
  design: { enabled: true, loop_cap: 10 }
|
|
2024
|
+
  design-review: { enabled: true, loop_cap: 10 }
|
|
2025
|
+
  plan: { enabled: true, loop_cap: 10 }
|
|
2026
|
+
  plan-review: { enabled: true, loop_cap: 10 }
|
|
2027
|
+
  # Executor phase workflows
|
|
2028
|
+
  execute: { enabled: true, loop_cap: 10 }
|
|
2029
|
+
  verify: { enabled: true, loop_cap: 5 }
|
|
2030
|
+
  review: { enabled: true, loop_cap: 10 }
|
|
2031
|
+
  learn: { enabled: true, loop_cap: 5 }
|
|
2032
|
+
  prepare_next: { enabled: true, loop_cap: 5 }
|
|
2033
|
+
  run_audit: { enabled: false, loop_cap: 10 }
|
|
2034
|
+
```
|
|
2035
|
+
|
|
2036
|
+
**`loop_cap`** is an absolute safety ceiling that prevents runaway loops regardless of depth. It is checked by `wazir capture loop-check` in pipeline mode. It is NOT the same as pass count (which is determined by depth: 3/5/7). Example: depth=deep gives 7 passes, but if `loop_cap: 5`, the cap guard fires at pass 5 and escalates. This is intentional -- the operator can constrain expensive phases.
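The interaction between the depth pass count and `loop_cap` can be traced with a short Python sketch. It assumes the guard trips as soon as the loop count passed to `loop-check` reaches `loop_cap`, which is how the example above reads; the real `evaluateLoopCapGuard` may apply a different comparison. The function name `planned_outcome` is hypothetical.

```python
PASS_COUNTS = {"quick": 3, "standard": 5, "deep": 7}


def planned_outcome(depth: str, loop_cap: int) -> dict:
    """Simulate a pipeline-mode loop in which no pass finds a reason to stop early."""
    total_passes = PASS_COUNTS[depth]
    for pass_number in range(total_passes):
        loop_count = pass_number + 1  # value sent as --loop-count before the pass
        if loop_count >= loop_cap:    # assumed guard condition (see lead-in)
            return {"pass_count": pass_number, "escalated": True}
    return {"pass_count": total_passes, "escalated": False}
```

Under that assumption, `deep` with `loop_cap: 5` escalates when pass 5 is about to start, while the default `loop_cap: 10` never interferes with any depth.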
|
|
2037
|
+
|
|
2038
|
+
**Adaptive workflows** (`author`, `run_audit`) default to `enabled: false`. They are activated by explicit operator config or intent detection.
|
|
2039
|
+
|
|
2040
|
+
**Post-run workflows** (`learn`, `prepare_next`) default to `enabled: true`. They run as part of the Final Review phase. Notes on the post-run and adaptive workflows:
|
|
2041
|
+
|
|
2042
|
+
- `learn` extracts durable learnings from review findings -- recurring findings become accepted learnings.
|
|
2043
|
+
- `prepare_next` prepares context and handoff for the next run.
|
|
2044
|
+
- `author` has a human approval gate, not an iterative review loop.
|
|
2045
|
+
- `run_audit` is an on-demand standalone audit, not part of the main pipeline flow.
|
|
2046
|
+
|
|
2047
|
+
---
|
|
2048
|
+
|
|
2049
|
+
## Reviewer Mode Table
|
|
2050
|
+
|
|
2051
|
+
The reviewer skill operates in different modes depending on the phase. **Mode is always explicit** -- the caller passes `--mode <mode>`. There is no auto-detection based on artifact availability.
|
|
2052
|
+
|
|
2053
|
+
| Mode | Invoked during | Prerequisites | Dimensions | Output |
|
|
2054
|
+
|------|---------------|---------------|------------|--------|
|
|
2055
|
+
| `final` | After execution + verification | Completed task artifacts in `.wazir/runs/latest/artifacts/` | 7 final-review dims, scored 0-70 | Verdict: PASS/NEEDS FIXES/NEEDS REWORK/FAIL |
|
|
2056
|
+
| `spec-challenge` | After specify | Draft spec artifact | 5 spec/clarification dims | Findings with severity, no score |
|
|
2057
|
+
| `design-review` | After design approval | Design artifact, approved spec, accessibility guidelines | 5 design-review dims (canonical) | Findings with severity (blocking/advisory) |
|
|
2058
|
+
| `plan-review` | After planning | Draft plan, approved spec, design artifact | 7 plan dims | Findings with severity, no score |
|
|
2059
|
+
| `task-review` | During execution, per task | Uncommitted changes (or committed with known base SHA) | 5 task-execution dims | Pass/fail per task, no score |
|
|
2060
|
+
| `research-review` | During discover | Research artifact | 5 research dims | Findings with severity, no score |
|
|
2061
|
+
| `clarification-review` | During clarify | Clarification artifact | 5 spec/clarification dims | Findings with severity, no score |
|
|
2062
|
+
|
|
2063
|
+
If `--mode` is not provided, the reviewer asks the user which review to run. Auto-detection based on artifact availability is NOT used -- it causes ambiguity in resumed/multi-phase runs where stale artifacts from prior phases exist.
|
|
2064
|
+
|
|
2065
|
+
Each caller is responsible for passing the correct mode:
|
|
2066
|
+
|
|
2067
|
+
- Clarifier passes `--mode clarification-review` after Phase 1A
|
|
2068
|
+
- Discover workflow passes `--mode research-review` after research
|
|
2069
|
+
- Specifier flow passes `--mode spec-challenge` after specify
|
|
2070
|
+
- Brainstorming passes `--mode design-review` after user approval
|
|
2071
|
+
- Writing-plans passes `--mode plan-review` after planning
|
|
2072
|
+
- Executor passes `--mode task-review` for each task
|
|
2073
|
+
- `/wazir` runner passes `--mode final` for the final review gate
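One way to picture the explicit-mode rule is a lookup that refuses to guess. The Python sketch below is illustrative only: `MODE_DIMENSIONS` and `dimensions_for` are hypothetical names, the dimension lists are copied from the Dimension Sets section, and raising an error stands in for "ask the user which review to run".

```python
from typing import List, Optional

SPEC_DIMS = ["completeness", "testability", "ambiguity", "assumptions", "scope creep"]

MODE_DIMENSIONS = {
    "final": ["correctness", "completeness", "wiring", "verification",
              "drift", "quality", "documentation"],
    "spec-challenge": SPEC_DIMS,
    "clarification-review": SPEC_DIMS,
    "design-review": ["spec coverage", "design-spec consistency", "accessibility",
                      "visual consistency", "exported-code fidelity"],
    "plan-review": ["completeness", "ordering", "atomicity", "testability",
                    "edge cases", "security", "integration"],
    "task-review": ["correctness", "tests", "wiring", "drift", "quality"],
    "research-review": ["coverage", "source quality", "relevance", "gaps",
                        "contradictions"],
}


def dimensions_for(mode: Optional[str]) -> List[str]:
    """Return the dimension set for an explicitly supplied mode; never auto-detect."""
    if mode is None:
        # Stand-in for prompting the user; artifacts on disk are not inspected.
        raise ValueError("review mode must be passed explicitly (--mode <mode>)")
    if mode not in MODE_DIMENSIONS:
        raise ValueError(f"unknown review mode: {mode!r}")
    return list(MODE_DIMENSIONS[mode])
```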
|
|
2074
|
+
|
|
2075
|
+
---
|
|
2076
|
+
|
|
2077
|
+
## Codex Prompt Templates
|
|
2078
|
+
|
|
2079
|
+
All Codex invocations read the model from config with a fallback:
|
|
2080
|
+
|
|
2081
|
+
```bash
|
|
2082
|
+
CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null)
|
|
2083
|
+
CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}
|
|
2084
|
+
```
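A Python caller could resolve the model the same way as the `jq` plus shell-default pair above. This is a sketch under assumptions: the function name `codex_model` is hypothetical, and non-string values fall back to the default, which is slightly stricter than `jq -r`.

```python
import json
from pathlib import Path

DEFAULT_CODEX_MODEL = "gpt-5.4"


def codex_model(config_path=Path(".wazir/state/config.json")) -> str:
    """Read .multi_tool.codex.model, falling back to the default on any problem."""
    try:
        node = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return DEFAULT_CODEX_MODEL  # missing, unreadable, or invalid config file
    for key in ("multi_tool", "codex", "model"):
        if not isinstance(node, dict):
            return DEFAULT_CODEX_MODEL
        node = node.get(key)
    # Empty or non-string values are treated like an unset key.
    return node if isinstance(node, str) and node else DEFAULT_CODEX_MODEL
```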
|
|
2085
|
+
|
|
2086
|
+
### Artifact Review (specs, plans, designs via stdin)
|
|
2087
|
+
|
|
2088
|
+
Use this template with `codex exec` for non-code artifacts piped via stdin:
|
|
2089
|
+
|
|
2090
|
+
```bash
|
|
2091
|
+
cat <artifact_path> | codex exec -c model="$CODEX_MODEL" \
|
|
2092
|
+
"You are reviewing a [ARTIFACT_TYPE] for the Wazir engineering OS.
|
|
2093
|
+
Focus on [DIMENSION]: [dimension description].
|
|
2094
|
+
Rules: cite specific sections, be actionable, say CLEAN if no issues.
|
|
2095
|
+
Do NOT load or invoke any skills. Do NOT read the codebase.
|
|
2096
|
+
Review ONLY the content provided via stdin."
|
|
2097
|
+
```
|
|
2098
|
+
|
|
2099
|
+
Replace `[ARTIFACT_TYPE]` with: `specification`, `implementation plan`, `design document`, `research brief`, or `clarification`.
|
|
2100
|
+
Replace `[DIMENSION]` and `[dimension description]` with the current review pass dimension from the relevant dimension set above.
|
|
2101
|
+
|
|
2102
|
+
### Code Review (diffs via --uncommitted or --base)
|
|
2103
|
+
|
|
2104
|
+
Use this template with `codex review` for code changes:
|
|
2105
|
+
|
|
2106
|
+
```bash
|
|
2107
|
+
codex review -c model="$CODEX_MODEL" --uncommitted --title "Task NNN: <summary>" \
|
|
2108
|
+
"Review the code changes for [DIMENSION]: [dimension description].
|
|
2109
|
+
Check against acceptance criteria: [criteria].
|
|
2110
|
+
Flag: correctness issues, missing tests, unwired paths, drift from spec.
|
|
2111
|
+
Do NOT load or invoke any skills."
|
|
2112
|
+
```
|
|
2113
|
+
|
|
2114
|
+
For committed changes, replace `--uncommitted` with `--base <sha>`.
|
|
2115
|
+
Replace `[DIMENSION]`, `[dimension description]`, and `[criteria]` with the task-specific values from the execution plan and spec.
|
|
2116
|
+
|
|
2117
|
+
---
|
|
2118
|
+
|
|
2119
|
+
## Codex Output Context Protection
|
|
2120
|
+
|
|
2121
|
+
Codex CLI output includes internal traces (file reads, tool calls, reasoning) that are NOT useful for the review — only the final findings matter. To prevent context flooding:
|
|
2122
|
+
|
|
2123
|
+
### Tee + Extract Pattern
|
|
2124
|
+
|
|
2125
|
+
1. **Always tee** Codex output to a file:
|
|
2126
|
+
```bash
|
|
2127
|
+
codex exec ... 2>&1 | tee .wazir/runs/latest/reviews/<phase>-review-pass-<N>.md
|
|
2128
|
+
```
|
|
2129
|
+
|
|
2130
|
+
2. **Extract findings** after the last `codex` marker using `execute_file`:
|
|
2131
|
+
```bash
|
|
2132
|
+
# If context-mode available (has_execute_file: true):
|
|
2133
|
+
mcp__plugin_context-mode_context-mode__execute_file(
|
|
2134
|
+
path: ".wazir/runs/latest/reviews/<phase>-review-pass-<N>.md",
|
|
2135
|
+
language: "shell",
|
|
2136
|
+
code: "tac $FILE | sed '/^codex$/q' | tac | tail -n +2"
|
|
2137
|
+
)
|
|
2138
|
+
```
|
|
2139
|
+
|
|
2140
|
+
3. **Present extracted findings only** — the raw trace stays in the file for debugging but never enters the main context window.
|
|
2141
|
+
|
|
2142
|
+
### Fallback (no context-mode)
|
|
2143
|
+
|
|
2144
|
+
If `context_mode.has_execute_file` is false, extract using shell directly:
|
|
2145
|
+
|
|
2146
|
+
```bash
|
|
2147
|
+
tac <file> | sed '/^codex$/q' | tac | tail -n +2
|
|
2148
|
+
```
|
|
2149
|
+
|
|
2150
|
+
This reverses the file, finds the first (= last original) `codex` marker, reverses back, and skips the marker line.
|
|
2151
|
+
|
|
2152
|
+
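**If no marker found:** fail closed. Log "codex marker not found in output, cannot extract findings" and present a warning to the user with 0 findings extracted. The raw file is preserved for manual review. Do NOT fall back to `tail` or any best-effort extraction that could leak traces into context.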
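The shell pipeline on its own does not fail closed: when no line equals `codex`, `sed` never quits and the pipeline prints the whole file minus its first line. A Python sketch of the same extraction with an explicit no-marker result is shown here; the function name `extract_findings` and the choice of `None` as the fail-closed signal are assumptions of the example.

```python
from typing import Optional

MARKER = "codex"


def extract_findings(raw_output: str) -> Optional[str]:
    """Return the text after the last line that is exactly `codex`, or None if absent."""
    lines = raw_output.splitlines()
    last_marker = None
    for index, line in enumerate(lines):
        if line == MARKER:  # same match as the sed pattern ^codex$
            last_marker = index
    if last_marker is None:
        return None  # fail closed: the caller warns and reports 0 findings
    return "\n".join(lines[last_marker + 1:])
```

A caller would treat `None` as "marker not found" and leave the raw file on disk for manual review.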
|
|
2153
|
+
|
|
2154
|
+
---
|
|
2155
|
+
|
|
2156
|
+
## Phase Scoring: First vs Final Artifact Comparison
|
|
2157
|
+
|
|
2158
|
+
At the start of each review loop (pass 1), score the artifact on its phase's canonical dimension set (1-10 per dimension). At the end of the loop (final pass), score again using the **same canonical dimensions**. Present the delta in the end-of-phase report.
|
|
2159
|
+
|
|
2160
|
+
### Canonical Dimension Sets Per Phase
|
|
2161
|
+
|
|
2162
|
+
These are the fixed rubrics — no ad-hoc dimension selection:
|
|
2163
|
+
|
|
2164
|
+
| Phase | Canonical Dimensions |
|
|
2165
|
+
|-------|---------------------|
|
|
2166
|
+
| research-review | Coverage, Source quality, Relevance, Gaps identified, Actionability |
|
|
2167
|
+
| clarification-review / spec-challenge | Completeness, Testability, Ambiguity, Assumptions, Scope creep |
|
|
2168
|
+
| design-review | Spec coverage, Design-spec consistency, Accessibility, Visual consistency, Exported-code fidelity |
|
|
2169
|
+
| plan-review | Completeness, Testability, Task granularity, Dependency correctness, Phase structure, File coverage, Estimation accuracy |
|
|
2170
|
+
| task-review | Correctness, Tests, Wiring, Drift, Quality |
|
|
2171
|
+
| final | Correctness, Completeness, Wiring, Verification, Drift, Quality, Documentation |
|
|
2172
|
+
|
|
2173
|
+
### Scoring Rules
|
|
2174
|
+
|
|
2175
|
+
1. Initial and final scores MUST use the **same dimension set** — the delta is only meaningful on the same rubric.
|
|
2176
|
+
2. The reviewer records which dimension set was used in each pass file.
|
|
2177
|
+
3. Delta format: `Dimension: X/10 → Y/10 (+Z)`.
|
|
2178
|
+
|
|
2179
|
+
### Quality Delta Report Section
|
|
2180
|
+
|
|
2181
|
+
The end-of-phase report (see "End-of-Phase Report" below) includes a **Quality Delta** section:
|
|
2182
|
+
|
|
2183
|
+
```markdown
|
|
2184
|
+
## Quality Delta
|
|
2185
|
+
|
|
2186
|
+
| Dimension | Initial | Final | Delta |
|
|
2187
|
+
|-----------|---------|-------|-------|
|
|
2188
|
+
| Completeness | 4/10 | 9/10 | +5 |
|
|
2189
|
+
| Testability | 3/10 | 8/10 | +5 |
|
|
2190
|
+
| Ambiguity | 5/10 | 9/10 | +4 |
|
|
2191
|
+
```
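The delta format from the scoring rules is straightforward to generate. Below is a minimal Python sketch; `quality_delta_rows` is a hypothetical helper, and the check that both score sets use the same dimensions reflects rule 1.

```python
from typing import Dict, List


def quality_delta_rows(initial: Dict[str, int], final: Dict[str, int]) -> List[str]:
    """Format `Dimension: X/10 → Y/10 (+Z)` rows for matching dimension sets."""
    if set(initial) != set(final):
        raise ValueError("initial and final scores must use the same dimension set")
    rows = []
    for dimension, before in initial.items():
        after = final[dimension]
        for score in (before, after):
            if not 1 <= score <= 10:
                raise ValueError(f"score out of range 1-10: {score}")
        # The :+d format keeps the sign, so regressions show as negative deltas.
        rows.append(f"{dimension}: {before}/10 → {after}/10 ({after - before:+d})")
    return rows
```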
|
|
2192
|
+
|
|
2193
|
+
---
|
|
2194
|
+
|
|
2195
|
+
## End-of-Phase Report
|
|
2196
|
+
|
|
2197
|
+
Every phase exit produces a report saved to `.wazir/runs/latest/reviews/<phase>-report.md` containing:
|
|
2198
|
+
|
|
2199
|
+
1. **Summary** — what the phase produced
|
|
2200
|
+
2. **Key Changes** — first-version vs final-version highlights (not full diff — what improved)
|
|
2201
|
+
3. **Quality Delta** — per-dimension before/after scores (see Phase Scoring above)
|
|
2202
|
+
4. **Findings Log** — per-pass finding counts by severity (e.g., "Pass 1: 6 findings (3 blocking, 2 warning, 1 note). Pass 7: 0 findings. All resolved.")
|
|
2203
|
+
5. **Usage** — token usage from `wazir capture usage` (runs before report generation)
|
|
2204
|
+
6. **Context Savings** — context-mode stats if available, omit section if not
|
|
2205
|
+
7. **Time Spent** — wall-clock elapsed time from phase start to end
|
|
2206
|
+
|
|
2207
|
+
|
|
1539
2208
|
---
|
|
1540
2209
|
## Source: docs/reference/roles-reference.md
|
|
1541
2210
|
|
|
@@ -1576,6 +2245,7 @@ This is the lookup reference for canonical roles, workflows, and their contracts
|
|
|
1576
2245
|
| `review` | `verify` | Adversarial quality review |
|
|
1577
2246
|
| `learn` | `review` | Capture scoped learnings |
|
|
1578
2247
|
| `prepare-next` | `learn` | Produce clean next-run handoff |
|
|
2248
|
+
| `run-audit` | (standalone) | Structured codebase audit with source-backed findings |
|
|
1579
2249
|
|
|
1580
2250
|
## Role routing valid values
|
|
1581
2251
|
|
|
@@ -1617,6 +2287,159 @@ Roles that explore broadly (clarifier, researcher, planner) benefit most from L1
|
|
|
1617
2287
|
|
|
1618
2288
|
See [Indexing and Recall](../concepts/indexing-and-recall.md) for full details on tiers and commands.
|
|
1619
2289
|
|
|
2290
|
+
|
|
2291
|
+
---
|
|
2292
|
+
## Source: docs/reference/skill-tiers.md
|
|
2293
|
+
|
|
2294
|
+
# Skill Tier Classification
|
|
2295
|
+
|
|
2296
|
+
Audit of Wazir skills against Superpowers v4.3.1 skills.
|
|
2297
|
+
Each skill is classified into one of three tiers:
|
|
2298
|
+
|
|
2299
|
+
- **Delegate** -- use superpowers skill as-is, delete Wazir fork
|
|
2300
|
+
- **Augment** -- use superpowers skill + inject Wazir context addendum (strictly additive, no overrides). **NOTE:** R2 validation found this tier is not implementable -- see [Augment Mechanism](#augment-mechanism) below.
|
|
2301
|
+
- **Own** -- Wazir-original or structurally rewritten skill, rename to `wz:` prefix
|
|
2302
|
+
|
|
2303
|
+
---
|
|
2304
|
+
|
|
2305
|
+
## Classification Table
|
|
2306
|
+
|
|
2307
|
+
| Wazir Skill | Superpowers Equivalent | Tier | Rationale | Risk Notes |
|
|
2308
|
+
|---|---|---|---|---|
|
|
2309
|
+
| brainstorming | brainstorming | **Own** | Structurally rewritten. Superpowers version is a linear checklist (explore context, ask questions, propose approaches, present design, write doc, invoke writing-plans). Wazir replaces the entire process: adds Command Routing and Codebase Exploration preambles, replaces the design-doc step with a design-review loop (`--mode design-review` with canonical dimensions), outputs to `.wazir/runs/latest/clarified/design.md` instead of `docs/plans/`, and adds a complete Agent Teams multi-agent brainstorming mode (Free Thinker / Grounder / Synthesizer / Arbiter pattern using TeamCreate/SendMessage). None of the superpowers process steps survive intact. | Dropping the Agent Teams mode would lose Wazir's most differentiated brainstorming capability. |
|
|
2310
|
+
| clarifier | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2311
|
+
| debugging | systematic-debugging | **Own** | Structurally rewritten. Superpowers has a 4-phase process (Root Cause Investigation with 5 substeps, Pattern Analysis, Hypothesis and Testing, Implementation) totaling ~300 lines with detailed examples, rationalization tables, and supporting technique references. Wazir condenses this to a 4-step observe-hypothesize-test-fix loop (~75 lines), replaces all codebase exploration with Wazir CLI symbol-first exploration (`wazir index search-symbols`, `wazir recall symbol` and `wazir recall file`), adds loop cap awareness (pipeline mode with `wazir capture loop-check` vs. standalone mode), and removes all superpowers examples, rationalization tables, and red-flag lists. The methodology is fundamentally different in structure despite sharing the spirit of "root cause first." | Delegating would lose Wazir CLI integration and loop cap awareness. Superpowers version is far more detailed on anti-patterns and may be worth referencing separately. |
|
|
2312
|
+
| design | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2313
|
+
| dispatching-parallel-agents | dispatching-parallel-agents | **Own** | Reclassified from Augment to Own (R2). Skill shadowing is full-override, so Augment tier is not implementable via `~/.claude/skills/`. Wazir already carries the full content: superpowers core (When to Use decision tree, The Pattern with 4 steps, Agent Prompt Structure, Common Mistakes section) plus Wazir additions (Command Routing preamble, Codebase Exploration preamble, philosophical paragraph in Overview, Problem/Fix format for Common Mistakes). Drops superpowers-only sections: "When NOT to Use," "Real Example from Session," "Key Benefits," "Verification," "Real-World Impact." | Superpowers informational sections (Real Example, Key Benefits, Verification, Real-World Impact) not carried forward. Low risk -- these are teaching content, not behavioral. |
|
|
2314
|
+
| executing-plans | executing-plans | **Own** | Structurally rewritten. Superpowers uses batch execution (default first 3 tasks) with report-and-wait checkpoints and explicit batch feedback loops. Wazir replaces batching with per-task execution, adds a per-task review loop (`--mode task-review` with 5 task-execution dimensions, Codex integration, review log filenames, loop cap tracking via `wazir capture loop-check`), adds standalone vs. pipeline mode detection, and adds a note recommending wz:subagent-driven-development when subagents are available. The batch-vs-per-task change is a core behavioral difference. All integration references point to `wz:` skills. | Delegating would lose per-task review loops and pipeline mode integration. |
|
|
2315
|
+
| executor | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2316
|
+
| finishing-a-development-branch | finishing-a-development-branch | **Own** | Reclassified from Augment to Own (R2). Skill shadowing is full-override, so Augment tier is not implementable via `~/.claude/skills/`. Wazir already carries the full content: superpowers process (5 steps: verify tests, determine base branch, present 4 options, execute choice, cleanup worktree) preserved with identical structure and identical option semantics. Wazir adds Command Routing and Codebase Exploration preambles. Minor cosmetic changes: `<N>` removed from failure template, `<base-branch>` shortened to `<base>`, emoji checkmarks replaced with Y/-, `<commit-list>` changed to `<count>`, PR body simplified. Red Flags and Integration sections trimmed but no behavioral contradiction. | Low risk. The superpowers version has more detailed Red Flags and Integration sections not carried forward. |
|
|
2317
|
+
| humanize | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2318
|
+
| init-pipeline | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2319
|
+
| prepare-next | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2320
|
+
| receiving-code-review | receiving-code-review | **Own** | Structurally rewritten. Superpowers has extensive sections: Forbidden Responses, Source-Specific Handling, YAGNI Check, Implementation Order, When To Push Back, Acknowledging Correct Feedback (with detailed anti-patterns for gratitude), Gracefully Correcting Pushback, Common Mistakes table, Real Examples, and GitHub Thread Replies. Wazir preserves the core Response Pattern and Forbidden Responses but: (1) adds Loop Tracking section (pipeline mode with `wazir capture loop-check` and standalone pass counts), (2) restructures Implementation Order to a 4-tier priority (blocking, functional, quality, nice-to-have) instead of 3-tier, (3) adds a Quick Reference decision table, (4) removes the entire "Acknowledging Correct Feedback" anti-gratitude section, the "Gracefully Correcting Pushback" section, the Common Mistakes table, all Real Examples, the "When To Push Back" enumeration, and the GitHub Thread Replies section. The Loop Tracking addition and structural deletions make this a substantive rewrite. | Delegating would lose loop tracking. The removed anti-gratitude and pushback sections from superpowers are valuable behavioral guardrails worth preserving. |
|
|
2321
|
+
| requesting-code-review | requesting-code-review | **Own** | Structurally rewritten. Both skills share the same When to Request triggers and Example structure. But Wazir: (1) replaces `superpowers:code-reviewer` with `wz:code-reviewer`, (2) adds explicit review loop parameters (`--mode`, depth-aware dimensions, pass number), (3) adds `codex review --uncommitted` and `codex review --base` commands, (4) adds Codex Error Handling section, (5) adds `{REVIEW_MODE}` placeholder, (6) changes Integration section to reference per-task review checkpoints instead of batch review, (7) adds "Dispatch review without explicit `--mode`" to Red Flags. The Codex integration and review loop parameter system are structural additions that change how reviews are dispatched. | Delegating would lose Codex integration and review loop protocol. |
|
|
2322
|
+
| reviewer | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2323
|
+
| run-audit | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2324
|
+
| scan-project | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2325
|
+
| self-audit | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2326
|
+
| subagent-driven-development | subagent-driven-development | **Own** | Structurally rewritten. Both share the same high-level process (fresh subagent per task, two-stage review, spec then quality). But Wazir: (1) adds `Capture PRE_TASK_SHA` step to the process flowchart for diff scoping, (2) adds Code Review Scoping section (`codex review --base <pre-task-sha>`), (3) adds Review Loop Alignment section (explicit `--mode task-review`, task-scoped log filenames, loop cap via `wazir capture loop-check`), (4) adds Codex Error Handling section, (5) adds standalone mode fallback, (6) changes all skill references from `superpowers:` to `wz:`, (7) adds "Review the wrong diff" to Red Flags, (8) removes the Example Workflow, Advantages detail, and Cost breakdown from superpowers. The diff-scoping and review-loop integration are structural process changes. | Delegating would lose diff-scoped reviews and Codex integration. The removed Example Workflow from superpowers is a useful teaching tool. |
|
|
2327
|
+
| tdd | test-driven-development | **Own** | Structurally rewritten. Superpowers has an exhaustive treatment (~370 lines): detailed Red-Green-Refactor with Good/Bad code examples, Iron Law with explicit "delete and start over" rules, a Verification Checklist, extensive Why Order Matters section, Common Rationalizations table, When Stuck guide, Testing Anti-Patterns reference, and Debugging Integration. Wazir condenses to ~45 lines with 3 steps (RED, GREEN, REFACTOR), adds a single-pass test quality check in RED phase ("Are these tests testing the right behavior? Are they real assertions?"), and removes all examples, rationalization tables, and elaboration. Different description and name (`wz:tdd` vs `test-driven-development`). | Delegating would lose the test quality check. The superpowers version's extensive rationalization prevention and examples are valuable for discipline enforcement but costly in tokens. |
|
|
2328
|
+
| using-git-worktrees | using-git-worktrees | **Own** | Reclassified from Augment to Own (R2). Skill shadowing is full-override, so Augment tier is not implementable via `~/.claude/skills/`. Wazir already carries the full content: superpowers core process (directory selection priority, safety verification with `git check-ignore`, creation steps, project setup auto-detection, clean baseline verification) preserved structurally intact. Wazir adds: Command Routing preamble, Codebase Exploration preamble, global directory changed from `~/.config/superpowers/worktrees/` to `~/.wazir/worktrees/`, Cleanup and Common Issues sections (submodules, lock files, stale worktrees). Drops superpowers-only sections: Example Workflow, Quick Reference table, Common Mistakes, Red Flags, Integration. | Dropped superpowers sections (Quick Reference, Common Mistakes, Red Flags, Integration) reduce operational guardrails. Could be recovered into the Own skill. |
|
|
2329
|
+
| using-skills | using-superpowers | **Own** | Structurally rewritten. Both enforce the same core rule (invoke skills before any response, even at 1% chance). But Wazir: (1) renames from `using-superpowers` to `using-skills`, (2) changes all internal skill references from `superpowers:` to `wz:` throughout flowchart and examples, (3) removes the Skill Types section detail about "Rigid vs Flexible" elaboration, (4) removes User Instructions elaboration. The name change and systematic `wz:` prefix replacement throughout the flowchart make this a namespace-level rewrite. | Could potentially be Augment if namespace mapping were handled at a routing layer rather than in-skill. |
|
|
2330
|
+
| verification | verification-before-completion | **Own** | Structurally rewritten. Superpowers has an exhaustive treatment (~140 lines): Iron Law, Gate Function (5-step IDENTIFY/RUN/READ/VERIFY/CLAIM), Common Failures table, Red Flags list, Rationalization Prevention table, Key Patterns (tests, regression, build, requirements, agent delegation), Why This Matters section with 24 failure memories, and When To Apply section. Wazir condenses to ~35 lines with 3 bullet requirements (what was verified, exact command, actual result), a minimum rule, and a brief "when verification fails" section. Different name (`wz:verification` vs `verification-before-completion`). | Delegating would lose the concise Wazir format. The superpowers version's extensive rationalization prevention is valuable for discipline but token-expensive. The Wazir version may be too terse to enforce the discipline effectively. |
|
|
2331
|
+
| wazir | _(none)_ | **Own** | Wazir-original. No superpowers counterpart exists. | -- |
|
|
2332
|
+
| writing-plans | writing-plans | **Own** | Structurally rewritten. Superpowers focuses on plan document format (header template, task structure with bite-sized steps, code examples in plan, execution handoff to subagent-driven or parallel session). Wazir: (1) changes inputs to "approved design or approved clarified direction" instead of "spec or requirements", (2) adds pipeline-aware output paths (`.wazir/runs/latest/clarified/execution-plan.md` and `.wazir/runs/latest/tasks/task-NNN/spec.md` vs. standalone `docs/plans/`), (3) removes the plan document format template entirely (no header template, no task structure template, no code examples), (4) adds Plan Review Loop section with `wz:reviewer --mode plan-review`, Codex integration via stdin pipe, Codex error handling, depth-aware pass counts, and standalone fallback. The plan review loop and pipeline path system are structural additions; the removal of the format template is a structural deletion. | Delegating would lose pipeline integration and plan review loop. The removed format template from superpowers is valuable for plan quality and could be worth recovering. |
|
|
2333
|
+
| writing-skills | writing-skills | **Own** | Structurally rewritten. Both share the TDD-for-skills philosophy and RED-GREEN-REFACTOR mapping. But Wazir: (1) condenses from ~650 lines to ~170 lines, (2) removes the extensive SKILL.md Structure template, CSO (Claude Search Optimization) section, Flowchart Usage guidelines, Code Examples guidelines, Token Efficiency section, File Organization examples, Testing All Skill Types section (discipline/technique/pattern/reference), Common Rationalizations for Skipping Testing table, Bulletproofing Skills Against Rationalization section (with Cialdini psychology reference), Skill Creation Checklist, Discovery Workflow, Anti-Patterns section, and STOP deployment gate, (3) adds "Be Prescriptive, Not Descriptive" guidance, "Use Rationalization Prevention" example, "Include Decision Trees" guidance, and skill reference syntax. The massive content reduction and different teaching approach make this a structural rewrite. | Delegating would lose the concise prescriptive format. The superpowers version's CSO guidelines, testing methodology, and anti-pattern catalog are extremely valuable reference material. |
|
|
2334
|
+
|
|
2335
|
+
---
|
|
2336
|
+
|
|
2337
|
+
## Superpowers Skills with No Wazir Counterpart
|
|
2338
|
+
|
|
2339
|
+
This section tracks superpowers skills that have no same-named Wazir fork. A skill with no Wazir equivalent at all could be used as-is via the superpowers plugin.
|
|
2340
|
+
|
|
2341
|
+
| Superpowers Skill | Status | Notes |
|---|---|---|
| using-superpowers | Replaced by `wz:using-skills` | See using-skills row above. |
|
|
2344
|
+
|
|
2345
|
+
All 14 superpowers skills have a Wazir counterpart (using-superpowers maps to using-skills, systematic-debugging maps to debugging, test-driven-development maps to tdd, verification-before-completion maps to verification).
|
|
2346
|
+
|
|
2347
|
+
---
|
|
2348
|
+
|
|
2349
|
+
## Summary by Tier
|
|
2350
|
+
|
|
2351
|
+
| Tier | Count | Skills |
|---|---|---|
| **Own** | 25 | brainstorming, clarifier, debugging, design, dispatching-parallel-agents, executing-plans, executor, finishing-a-development-branch, humanize, init-pipeline, prepare-next, receiving-code-review, requesting-code-review, reviewer, run-audit, scan-project, self-audit, subagent-driven-development, tdd, using-git-worktrees, using-skills, verification, wazir, writing-plans, writing-skills |
| **Augment** | 0 | _(none -- tier not implementable, see [Augment Mechanism](#augment-mechanism))_ |
| **Delegate** | 0 | _(none)_ |
|
|
2356
|
+
|
|
2357
|
+
---
|
|
2358
|
+
|
|
2359
|
+
## Common Wazir Additions (Appear in All Forked Skills)
|
|
2360
|
+
|
|
2361
|
+
Every Wazir fork of a superpowers skill adds these two preamble sections:
|
|
2362
|
+
|
|
2363
|
+
1. **Command Routing** -- routes large commands to context-mode tools and small commands to native Bash, following `hooks/routing-matrix.json`.
|
|
2364
|
+
2. **Codebase Exploration** -- prescribes symbol-first exploration via `wazir index search-symbols` and `wazir recall`, with fallback to direct file reads.
|
|
2365
|
+
|
|
2366
|
+
These preambles alone would have justified the **Augment** tier for any skill with no other structural changes, had that tier been implementable.
|
|
2367
|
+
|
|
2368
|
+
---
|
|
2369
|
+
|
|
2370
|
+
## Augment Mechanism
|
|
2371
|
+
|
|
2372
|
+
**Research date:** 2026-03-19 (R2: Composition Infrastructure Validation)
|
|
2373
|
+
|
|
2374
|
+
### Finding: Augment tier is not implementable
|
|
2375
|
+
|
|
2376
|
+
The Augment tier assumed that placing a Wazir addendum at `~/.claude/skills/<skill-name>/SKILL.md` would layer Wazir context on top of the superpowers base skill. This assumption is wrong. **Skill shadowing is full-override, not merge/append.**
|
|
2377
|
+
|
|
2378
|
+
### Evidence
|
|
2379
|
+
|
|
2380
|
+
**1. `skills-core.js` `resolveSkillPath()` (superpowers v4.3.1)**
|
|
2381
|
+
|
|
2382
|
+
The function at `lib/skills-core.js:108-140` checks personal skills directory first. If `~/.claude/skills/<name>/SKILL.md` exists, it returns that file immediately and never reads the superpowers version. There is no content merging.
|
|
2383
|
+
|
|
2384
|
+
```
// Try personal skills first (unless explicitly superpowers:)
if (!forceSuperpowers && personalDir) {
  const personalSkillFile = path.join(personalDir, actualSkillName, 'SKILL.md');
  if (fs.existsSync(personalSkillFile)) {
    return { skillFile: personalSkillFile, sourceType: 'personal', ... };
    // ^^^ returns here -- superpowers version never consulted
  }
}
```
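The excerpt above is abbreviated, so the following is a minimal runnable sketch of the same personal-first lookup, not the superpowers source. The function name `resolveSkill`, the fixture layout, and the file contents are assumptions chosen for illustration; the directory names mirror the ones used in the superpowers test described in this section.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical re-implementation of a personal-first lookup (not the superpowers code).
function resolveSkill(skillName, personalDir, superpowersDir) {
  const candidates = [
    { dir: personalDir, sourceType: 'personal' },
    { dir: superpowersDir, sourceType: 'superpowers' },
  ];
  for (const { dir, sourceType } of candidates) {
    if (!dir) continue;
    const skillFile = path.join(dir, skillName, 'SKILL.md');
    if (fs.existsSync(skillFile)) {
      // First hit wins: later candidates are never read, so nothing is merged.
      return { skillFile, sourceType, content: fs.readFileSync(skillFile, 'utf8') };
    }
  }
  return null;
}

// Build a throwaway fixture with the same skill name in both locations.
function makeFixture() {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-shadow-'));
  const personalDir = path.join(root, 'personal-skills');
  const superpowersDir = path.join(root, 'superpowers-skills');
  const fixtures = [
    [personalDir, 'Wazir addendum only'],
    [superpowersDir, 'Superpowers base skill'],
  ];
  for (const [dir, body] of fixtures) {
    fs.mkdirSync(path.join(dir, 'shared-skill'), { recursive: true });
    fs.writeFileSync(path.join(dir, 'shared-skill', 'SKILL.md'), body);
  }
  return { personalDir, superpowersDir };
}

const { personalDir, superpowersDir } = makeFixture();
const resolved = resolveSkill('shared-skill', personalDir, superpowersDir);
console.log(resolved.sourceType, '->', resolved.content);
```

Under these assumptions the resolved content is the addendum alone, which is why an addendum-only `SKILL.md` cannot serve as an Augment layer.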
|
|
2394
|
+
|
|
2395
|
+
**2. Superpowers test suite confirms override behavior**
|
|
2396
|
+
|
|
2397
|
+
`tests/opencode/test-skills-core.sh` line 336 asserts:
|
|
2398
|
+
```
[PASS] Personal skills shadow superpowers skills
```
|
|
2401
|
+
|
|
2402
|
+
The test creates `personal-skills/shared-skill/SKILL.md` and `superpowers-skills/shared-skill/SKILL.md`, resolves `shared-skill`, and verifies `sourceType` is `"personal"` -- the superpowers version is invisible.
|
|
2403
|
+
|
|
2404
|
+
**3. Superpowers RELEASE-NOTES.md v3.3.0**
|
|
2405
|
+
|
|
2406
|
+
Line 385 documents the behavior explicitly: "Personal skills override superpowers skills when names match."
|
|
2407
|
+
|
|
2408
|
+
**4. The `superpowers:` prefix bypass is not available in Claude Code**
|
|
2409
|
+
|
|
2410
|
+
`skills-core.js` supports `superpowers:skill-name` syntax to force resolution to the superpowers version even when a personal skill shadows it. However, `skills-core.js` is only used by the OpenCode plugin (`/.opencode/plugins/superpowers.js`). Claude Code's native `Skill` tool has its own built-in resolution logic that does not expose this prefix bypass.
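As a rough illustration of the prefix bypass, the sketch below resolves the same skill name with and without the `superpowers:` prefix. The parsing rule in `parseSkillName` and the in-memory maps are assumptions made for this example; only the variable names `forceSuperpowers` and `actualSkillName` come from the excerpt quoted earlier.

```javascript
// Hypothetical helper: the prefix-parsing rule is an assumption, not the superpowers source.
function parseSkillName(requested) {
  const prefix = 'superpowers:';
  const forceSuperpowers = requested.startsWith(prefix);
  const actualSkillName = forceSuperpowers ? requested.slice(prefix.length) : requested;
  return { forceSuperpowers, actualSkillName };
}

// In-memory stand-ins for the personal and superpowers skill directories.
function resolveWithPrefix(requested, personalSkills, superpowersSkills) {
  const { forceSuperpowers, actualSkillName } = parseSkillName(requested);
  if (!forceSuperpowers && personalSkills.has(actualSkillName)) {
    return { sourceType: 'personal', content: personalSkills.get(actualSkillName) };
  }
  if (superpowersSkills.has(actualSkillName)) {
    return { sourceType: 'superpowers', content: superpowersSkills.get(actualSkillName) };
  }
  return null;
}

const personalSkills = new Map([['using-git-worktrees', 'Wazir version']]);
const superpowersSkills = new Map([['using-git-worktrees', 'Superpowers version']]);

const plain = resolveWithPrefix('using-git-worktrees', personalSkills, superpowersSkills);
const forced = resolveWithPrefix('superpowers:using-git-worktrees', personalSkills, superpowersSkills);
console.log(plain.sourceType, forced.sourceType);
```

A host without an equivalent of the `forced` path can only ever reach the personal copy once a same-named personal skill exists.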
|
|
2411
|
+
|
|
2412
|
+
### Alternatives Considered
|
|
2413
|
+
|
|
2414
|
+
| Approach | Viable? | Why |
|---|---|---|
| Place addendum in `~/.claude/skills/<name>/` | No | Full override -- base skill content lost |
| Merge base + addendum in SKILL.md at install time | Partial | Would work but creates a maintenance coupling: every superpowers update requires re-merging. This is functionally identical to Own tier. |
| Inject Wazir context via CLAUDE.md | No | CLAUDE.md is project-scoped; skill behavior should be global across all projects |
| Use `superpowers:` prefix to load base, then append | No | Prefix only works in OpenCode's `skills-core.js`, not in Claude Code's native Skill tool |
| Propose upstream merge/append feature | Future | Would require a superpowers or Claude Code platform change |
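To make the maintenance coupling of the install-time merge option concrete, here is a small sketch of what such a merge might involve. Everything in it is hypothetical: the `mergeSkill` and `needsRemerge` helpers, the marker comment format, and the sample skill text are illustrations, not part of Wazir or superpowers.

```javascript
const crypto = require('crypto');

function sha256(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// Hypothetical install-time merge: base content, a provenance marker, then the addendum.
function mergeSkill(baseContent, addendum) {
  const marker = `<!-- merged-from-base: ${sha256(baseContent)} -->`;
  return `${baseContent.trimEnd()}\n\n${marker}\n${addendum.trimEnd()}\n`;
}

// The merged file goes stale as soon as the upstream base changes.
function needsRemerge(mergedContent, currentBaseContent) {
  const match = mergedContent.match(/<!-- merged-from-base: ([0-9a-f]{64}) -->/);
  return match === null || match[1] !== sha256(currentBaseContent);
}

const baseV1 = '# using-git-worktrees\nCore process.\n';
const merged = mergeSkill(baseV1, '## Command Routing\nWazir preamble.');
const freshCheck = needsRemerge(merged, baseV1);
const staleCheck = needsRemerge(merged, baseV1 + 'New upstream section.\n');
console.log({ freshCheck, staleCheck });
```

Every upstream change flips the staleness check, which is the re-merge burden that makes this option equivalent to maintaining an Own skill.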
|
|
2421
|
+
|
|
2422
|
+
### Conclusion
|
|
2423
|
+
|
|
2424
|
+
The Augment tier is not implementable with the current skill discovery mechanism. All three former Augment skills (dispatching-parallel-agents, finishing-a-development-branch, using-git-worktrees) are reclassified to **Own** tier. Since the Wazir versions already carry the superpowers core process plus Wazir additions, the reclassification itself loses no content -- the skills simply cannot delegate to a shared base.
|
|
2425
|
+
|
|
2426
|
+
If superpowers or Claude Code introduces a composition/layering mechanism in the future (e.g., `extends: superpowers:dispatching-parallel-agents` in frontmatter), the Augment tier could be revisited.
|
|
2427
|
+
|
|
2428
|
+
---
|
|
2429
|
+
|
|
2430
|
+
## Observations
|
|
2431
|
+
|
|
2432
|
+
1. **No Delegate candidates exist.** Every Wazir fork adds at minimum the Command Routing and Codebase Exploration preambles, which prevents pure delegation.
|
|
2433
|
+
|
|
2434
|
+
2. **Augment tier is not implementable.** R2 validation (2026-03-19) found that skill shadowing in both superpowers `skills-core.js` and Claude Code's native Skill tool is full-override: placing a SKILL.md in `~/.claude/skills/<name>/` completely replaces the superpowers skill with the same name. There is no merge or append mechanism. The three former Augment candidates (dispatching-parallel-agents, finishing-a-development-branch, using-git-worktrees) have been reclassified to Own. See [Augment Mechanism](#augment-mechanism) for full analysis.
|
|
2435
|
+
|
|
2436
|
+
3. **All 14 forked skills are Own** because either (a) they introduce structural process changes (review loops, pipeline mode, Codex integration, Agent Teams, content restructuring) or (b) the Augment composition mechanism does not exist in the platform.
|
|
2437
|
+
|
|
2438
|
+
4. **Token cost tradeoff is significant.** Several Wazir Own skills (tdd, verification, debugging, writing-skills) are dramatically shorter than their superpowers counterparts. The superpowers versions contain valuable rationalization prevention tables, detailed examples, and anti-pattern catalogs that enforce discipline. The Wazir versions trade this for token efficiency. This tradeoff should be revisited -- some of the removed discipline content may be worth recovering as separate reference files.
|
|
2439
|
+
|
|
2440
|
+
5. **The `wz:` prefix is already applied** in skill names within the Wazir SKILL.md frontmatter for all forked skills, consistent with the Own tier convention.
|
|
2441
|
+
|
|
2442
|
+
|
|
1620
2443
|
---
|
|
1621
2444
|
## Source: docs/reference/skills.md
|
|
1622
2445
|
|
|
@@ -1654,6 +2477,7 @@ These skills remain on the active surface:
|
|
|
1654
2477
|
- Skills must not instruct users to run background services or wrapper scripts that are not part of the canonical workflow surface.
|
|
1655
2478
|
- When a skill becomes contradictory to the current operating model, remove it from `skills/`.
|
|
1656
2479
|
|
|
2480
|
+
|
|
1657
2481
|
---
|
|
1658
2482
|
## Source: docs/reference/templates.md
|
|
1659
2483
|
|
|
@@ -1687,6 +2511,7 @@ Each template requires run metadata, sources, loop number, and approval status w
|
|
|
1687
2511
|
|
|
1688
2512
|
Schema-backed examples under `templates/examples/` exist to keep schemas, examples, and validation in sync.
|
|
1689
2513
|
|
|
2514
|
+
|
|
1690
2515
|
---
|
|
1691
2516
|
## Source: docs/reference/tooling-cli.md
|
|
1692
2517
|
|
|
@@ -1707,6 +2532,7 @@ The `wazir` CLI is minimal on purpose. It exists to validate and export the host
|
|
|
1707
2532
|
| `wazir validate commits` | implemented | Validates conventional commit format for commits in the range `--base..--head` (or auto-detected base to HEAD). |
|
|
1708
2533
|
| `wazir validate changelog` | implemented | Validates `CHANGELOG.md` structure; with `--require-entries` and `--base`, enforces new entries since the base. |
|
|
1709
2534
|
| `wazir validate docs-drift` | implemented | Detects when source files (roles, workflows, skills, hooks) change without corresponding documentation updates. Advisory by default; `--strict` exits non-zero on drift. |
|
|
2535
|
+
| `wazir validate skills` | implemented | Validates skill frontmatter and checks for name conflicts with superpowers skills (requires `wz:` prefix). Rejects any `CONTEXT.md` files (augment tier concluded not implementable in R2). |
|
|
1710
2536
|
| `wazir validate artifacts` | reserved | Exits `2` until artifact-template and example validation expands. |
|
|
1711
2537
|
| `wazir export build` | implemented | Generates host packages under `exports/hosts/*` from canonical sources. |
|
|
1712
2538
|
| `wazir export --check` | implemented | Verifies generated host packages still match current canonical source hashes. |
|
|
@@ -1720,19 +2546,22 @@ The `wazir` CLI is minimal on purpose. It exists to validate and export the host
|
|
|
1720
2546
|
| `wazir recall file` | implemented | Returns an exact line-bounded slice from an indexed file. Supports `--tier L0\|L1` for summary recall. |
|
|
1721
2547
|
| `wazir recall symbol` | implemented | Returns an exact slice for an indexed symbol match. Supports `--tier L0\|L1` for summary recall. |
|
|
1722
2548
|
| `wazir doctor` | implemented | Validates the active repo surface for manifest, hooks, state-root policy, and host export directory presence. |
|
|
1723
|
-
| `wazir status` | implemented | Reads run status directly from `<state-root>/runs/<run-id>/status.json`. |
|
|
2549
|
+
| `wazir status` | implemented | Reads run status directly from `<state-root>/runs/<run-id>/status.json`. Includes a one-line context savings summary when usage data is available. |
|
|
2550
|
+
| `wazir stats` | implemented | Shows token savings statistics for a run, including total queries, estimated tokens saved, bytes avoided, per-tool breakdown, and overall savings ratio. |
|
|
1724
2551
|
| `wazir capture init` | implemented | Creates a run ledger with `status.json`, `events.ndjson`, and a captures directory under the configured state root. |
|
|
1725
2552
|
| `wazir capture event` | implemented | Appends a run event and can update phase, status, and loop counts in `status.json`. |
|
|
1726
2553
|
| `wazir capture route` | implemented | Reserves a run-local capture file path for large tool output. |
|
|
1727
2554
|
| `wazir capture output` | implemented | Writes captured tool output to a run-local file and records a `post_tool_capture` event. |
|
|
1728
2555
|
| `wazir capture summary` | implemented | Writes `summary.md` and records the chosen summary or handoff event. |
|
|
1729
2556
|
| `wazir capture usage` | implemented | Generates a token savings report for a run, showing capture routing statistics and context window savings. |
|
|
2557
|
+
| `wazir capture loop-check` | implemented | Records a loop iteration event and evaluates the loop cap guard. Exits 43 if the phase loop cap is exceeded. Accepts `--task-id` for task-scoped cap tracking. In standalone mode (no status.json), exits 0. |
|
|
1730
2558
|
|
|
1731
2559
|
## Exit codes
|
|
1732
2560
|
|
|
1733
2561
|
- `0`: requested check passed
|
|
1734
2562
|
- `1`: invalid input or validation failure
|
|
1735
2563
|
- `2`: command surface exists but the implementation is intentionally not complete yet
|
|
2564
|
+
- `43`: phase loop cap exceeded (returned by `wazir capture loop-check`)
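A caller that wraps `wazir capture loop-check` has to treat exit `43` differently from an ordinary failure. The sketch below shows one possible mapping from the documented exit codes to caller actions; the `loopCheckDecision` function and its action names are assumptions for illustration, not part of the CLI.

```javascript
// Exit codes as listed in this CLI reference.
const EXIT_CODES = Object.freeze({
  0: 'requested check passed',
  1: 'invalid input or validation failure',
  2: 'command surface exists but the implementation is intentionally not complete yet',
  43: 'phase loop cap exceeded',
});

// Hypothetical policy: what a caller might do with a loop-check exit status.
function loopCheckDecision(status) {
  if (status === 0) return { action: 'continue', reason: EXIT_CODES[0] };
  if (status === 43) return { action: 'stop-and-escalate', reason: EXIT_CODES[43] };
  const reason = EXIT_CODES[status] || `unexpected exit code ${status}`;
  return { action: 'fail', reason };
}

for (const status of [0, 43, 1]) {
  console.log(status, loopCheckDecision(status).action);
}
```

In a Node wrapper the status would typically come from the `status` field returned by `child_process.spawnSync`; the command arguments are left out here because only `--task-id` is documented in this reference.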
|
|
1736
2565
|
|
|
1737
2566
|
## Root discovery
|
|
1738
2567
|
|
|
@@ -1785,6 +2614,7 @@ Executable documentation claims are registered in:
|
|
|
1785
2614
|
|
|
1786
2615
|
`wazir validate docs` uses that file plus active markdown link checks to prevent stale command and path claims from silently drifting.
|
|
1787
2616
|
|
|
2617
|
+
|
|
1788
2618
|
---
|
|
1789
2619
|
## Source: README.md
|
|
1790
2620
|
|
|
@@ -1796,7 +2626,7 @@ Executable documentation claims are registered in:
|
|
|
1796
2626
|
</picture>
|
|
1797
2627
|
</p>
|
|
1798
2628
|
|
|
1799
|
-
<h3 align="center">
|
|
2629
|
+
<h3 align="center">Engineering with itqan.</h3>
|
|
1800
2630
|
|
|
1801
2631
|
<p align="center">
|
|
1802
2632
|
<a href="https://github.com/MohamedAbdallah-14/Wazir/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/MohamedAbdallah-14/Wazir/ci.yml?branch=main&label=CI" alt="CI"></a>
|
|
@@ -1814,80 +2644,60 @@ Executable documentation claims are registered in:
|
|
|
1814
2644
|
<img src="https://img.shields.io/badge/Cursor-supported-FF6B35" alt="Cursor">
|
|
1815
2645
|
</p>
|
|
1816
2646
|
|
|
1817
|
-
<!-- Demo GIF: run assets/record-demo.sh to generate assets/demo.gif, then uncomment the img tag below -->
|
|
1818
|
-
<!-- <p align="center"><img src="assets/demo.gif" alt="Wazir Demo" width="700"></p> -->
|
|
1819
|
-
|
|
1820
|
-
A host-native operating model for AI coding agents. Wazir gives Claude, Codex, Gemini, and Cursor a 14-phase delivery pipeline, 10 canonical roles with enforceable contracts, 3 adversarial review phases with 9 hard approval gates, and 261 curated expertise modules loaded automatically per task. No server. No wrapper. No custom orchestration.
|
|
1821
|
-
|
|
1822
|
-
Install once. Your agent works the way your best engineer does.
|
|
1823
|
-
|
|
1824
|
-
---
|
|
1825
|
-
|
|
1826
|
-
## Table of Contents
|
|
1827
|
-
|
|
1828
|
-
- [Why Wazir?](#why-wazir)
|
|
1829
|
-
- [Quick Start](#quick-start)
|
|
1830
|
-
- [The Pipeline](#the-pipeline)
|
|
1831
|
-
- [How It Works](#how-it-works)
|
|
1832
|
-
- [How Wazir Handles Complex Tasks](#how-wazir-handles-complex-tasks)
|
|
1833
|
-
- [Token Savings](#token-savings)
|
|
1834
|
-
- [What's Included](#whats-included)
|
|
1835
|
-
- [Compared to Other Tools](#compared-to-other-tools)
|
|
1836
|
-
- [Install](#install)
|
|
1837
|
-
- [Documentation](#documentation)
|
|
1838
|
-
- [Project Status](#project-status)
|
|
1839
|
-
- [Acknowledgments](#acknowledgments)
|
|
1840
|
-
- [Contributing](#contributing)
|
|
1841
|
-
- [License](#license)
|
|
1842
2647
|
|
|
1843
2648
|
---
|
|
1844
2649
|
|
|
1845
|
-
|
|
1846
|
-
|
|
1847
|
-
AI coding agents fail the same five ways. Every time.
|
|
1848
|
-
|
|
1849
|
-
**Ambiguous specs become wrong code.** The clarifier role escalates unresolved ambiguity instead of guessing. No spec ships until material questions get answers. Escalation is a required output, not an option.
|
|
1850
|
-
|
|
1851
|
-
**Output quality varies randomly.** The reviewer role is never the phase author. Adversarial review runs at three chokepoints -- spec-challenge, design-review, and final review -- always by a different model or model family. Nine hard approval gates block advancement until artifacts pass.
|
|
2650
|
+
> AI agents don't have a quality problem. They have a management problem.
|
|
1852
2651
|
|
|
1853
|
-
|
|
2652
|
+
I'm Mohamed Abdallah. I kept watching AI agents write confident code that broke in production, skip tests, and forget what we agreed on yesterday. So I stopped asking them to be better and built them an engineering department instead.
|
|
1854
2653
|
|
|
1855
|
-
**
|
|
1856
|
-
|
|
1857
|
-
**Nothing prevents structural failures.** Seven hook contracts enforce protected paths (exit 42), loop caps (exit 43), and session observability. Hooks are enforcement, not suggestions.
|
|
2654
|
+
**Wazir puts engineering discipline inside AI coding agents.**
|
|
2655
|
+
No wrapper. No server. Just structure -- inside Claude, Codex, Gemini, and Cursor. Built on 300+ research sources distilled into 315 curated expertise modules across 12 domains.
|
|
1858
2656
|
|
|
1859
2657
|
---
|
|
1860
2658
|
|
|
1861
2659
|
## Quick Start
|
|
1862
2660
|
|
|
1863
|
-
**Step 1: Install**
|
|
1864
|
-
|
|
1865
2661
|
```bash
|
|
1866
2662
|
/plugin marketplace add MohamedAbdallah-14/Wazir
|
|
1867
2663
|
/plugin install wazir
|
|
1868
2664
|
```
|
|
1869
2665
|
|
|
1870
|
-
|
|
2666
|
+
Then tell your agent what to build:
|
|
1871
2667
|
|
|
1872
|
-
```
|
|
1873
|
-
/
|
|
2668
|
+
```
|
|
2669
|
+
/wazir Build a REST API for managing tasks with authentication
|
|
1874
2670
|
```
|
|
1875
2671
|
|
|
1876
|
-
|
|
2672
|
+
That's it. The pipeline takes over -- clarifies your requirements, writes a spec, plans the work, implements with TDD, reviews, and learns for next time. You approve at the gates. Everything else is automatic.
|
|
1877
2673
|
|
|
1878
|
-
|
|
2674
|
+
You can also control the depth and intent directly:
|
|
1879
2675
|
|
|
1880
2676
|
```
|
|
1881
|
-
/
|
|
2677
|
+
/wazir quick fix the login redirect bug
|
|
2678
|
+
/wazir deep design a new onboarding flow
|
|
2679
|
+
/wazir audit security
|
|
1882
2680
|
```
|
|
1883
2681
|
|
|
1884
|
-
|
|
2682
|
+
---
|
|
2683
|
+
|
|
2684
|
+
### The reviewer is never the author.
|
|
2685
|
+
|
|
2686
|
+
When your AI agent reviews its own code, it finds what it expected to find -- nothing. Wazir's adversarial reviewer is a separate agent with different expertise modules. It catches the mistakes your agent is structurally blind to.
|
|
2687
|
+
|
|
2688
|
+
### Silence isn't confidence -- it's assumptions.
|
|
2689
|
+
|
|
2690
|
+
Your AI agent isn't silent because it's sure. It's silent because it's trained to be helpful. Wazir's clarifier forces ambiguity to the surface before a single line is written.
|
|
2691
|
+
|
|
2692
|
+
### Done means verified, not declared.
|
|
2693
|
+
|
|
2694
|
+
AI agents love to announce they're finished. Wazir doesn't care. Every phase loops until the work and its verification converge. The agent doesn't get to say "done." The process decides.
|
|
1885
2695
|
|
|
1886
2696
|
---
|
|
1887
2697
|
|
|
1888
2698
|
## The Pipeline
|
|
1889
2699
|
|
|
1890
|
-
Every task flows through
|
|
2700
|
+
Every task flows through 15 workflows grouped into 4 phases. Three are adversarial review gates that block progress until the reviewer explicitly approves. Rejection loops back to the authoring phase.
|
|
1891
2701
|
|
|
1892
2702
|
```mermaid
|
|
1893
2703
|
graph LR
|
|
@@ -1920,6 +2730,8 @@ graph LR
|
|
|
1920
2730
|
style P8 fill:#c62828,color:#fff
|
|
1921
2731
|
```
|
|
1922
2732
|
|
|
2733
|
+
|
|
2734
|
+
|
|
1923
2735
|
> **GATE** = Approval gate. The phase blocks until the reviewer explicitly approves. Rejection loops back to the authoring phase.
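The gate behavior can be pictured as a bounded author-review loop. The following is only a sketch under stated assumptions: `runGatedPhase`, the verdict shape, and the loop cap value are hypothetical and do not describe Wazir's actual orchestration code.

```javascript
// Hypothetical gate loop; names and shapes here are illustrative, not Wazir's API.
function runGatedPhase(author, review, loopCap) {
  let findings = [];
  for (let pass = 1; pass <= loopCap; pass++) {
    const artifact = author(findings); // authoring phase, given prior findings
    const verdict = review(artifact); // a separate reviewer decides
    if (verdict.approved) {
      return { status: 'approved', artifact, passes: pass };
    }
    findings = verdict.findings; // rejection loops back to the author
  }
  return { status: 'loop-cap-exceeded', artifact: null, passes: loopCap };
}

const author = (findings) =>
  ['spec draft', ...findings.map((f) => `addressed: ${f}`)].join('\n');
const review = (artifact) =>
  artifact.includes('addressed: missing acceptance criteria')
    ? { approved: true, findings: [] }
    : { approved: false, findings: ['missing acceptance criteria'] };

const outcome = runGatedPhase(author, review, 3);
console.log(outcome.status, outcome.passes);
```

The cap keeps a disagreement between author and reviewer from looping forever, which is the role the loop cap guard plays in the pipeline.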
|
|
1924
2736
|
|
|
1925
2737
|
---
|
|
@@ -1930,23 +2742,9 @@ Three concepts.
|
|
|
1930
2742
|
|
|
1931
2743
|
**1 -- Roles are isolation boundaries, not personas.** Each of the 10 roles has defined inputs, allowed tools, required outputs, escalation rules, and failure conditions. An agent inside a role cannot write to protected paths, cannot skip required outputs, and must escalate when ambiguity conditions are met. The discipline is structural, not instructional. See [Roles & Workflows](docs/concepts/roles-and-workflows.md).
|
|
1932
2744
|
|
|
1933
|
-
**2 -- Phases are artifact checkpoints, not conversation stages.** Every phase consumes a named artifact from the previous phase and produces a named artifact for the next. Nothing flows through conversation history. A session can end, a new agent can pick up the artifacts, and delivery continues. The handoff is explicit, structured, and schema-validated against
|
|
2745
|
+
**2 -- Phases are artifact checkpoints, not conversation stages.** Every phase consumes a named artifact from the previous phase and produces a named artifact for the next. Nothing flows through conversation history. A session can end, a new agent can pick up the artifacts, and delivery continues. The handoff is explicit, structured, and schema-validated against 19 JSON schemas. See [Architecture](docs/concepts/architecture.md).
|
|
1934
2746
|
|
|
1935
|
-
**3 -- The composition engine loads the right expert automatically.** A 4-layer system (always, auto, stacks, concerns) decides which of
|
|
1936
|
-
|
|
1937
|
-
---
|
|
1938
|
-
|
|
1939
|
-
## How Wazir Handles Complex Tasks
|
|
1940
|
-
|
|
1941
|
-
Large coding tasks fail when agents lose track of quality. Wazir addresses this with three reinforcing mechanisms.
|
|
1942
|
-
|
|
1943
|
-
**14-phase pipeline with 9 hard approval gates.** Every task passes through clarify, research, specify, design, plan, execute, verify, review, and learn. Nine transitions have hard blocking conditions. No phase is skipped, no shortcut taken. The pipeline is defined in `workflows/` and enforced by the orchestrator.
|
|
1944
|
-
|
|
1945
|
-
**Adversarial review built in.** The reviewer role operates independently from the executor. It starts with structural summaries (L1 recall) to triage, then reads full source for logic errors, security concerns, or ambiguous code. Review criteria come from expertise modules, not guesswork.
|
|
1946
|
-
|
|
1947
|
-
**TDD and verification-before-completion.** The executor writes failing tests before implementation (red-green-refactor). The verifier independently runs all tests, checks truth claims, and validates exports. No task completes until the verifier confirms all acceptance criteria pass. This catches regressions that the executor's own testing misses.
|
|
1948
|
-
|
|
1949
|
-
The output is code held to the same standard a senior engineering team would enforce.
|
|
2747
|
+
**3 -- The composition engine loads the right expert automatically.** One agent pretending to be an expert in everything is an expert in nothing. A 4-layer system (always, auto, stacks, concerns) decides which of 315 expertise modules load into each role's context. The executor gets modules on how to build. The verifier gets modules on what to detect. The reviewer gets modules on what to flag. All resolved automatically from the task's declared stack and concerns. Max 15 modules per dispatch, token budget enforced.
|
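A rough sketch of how layered module selection with a module cap and a token budget could work. The layer names and the cap of 15 come from the text; the catalog shape (`name`, `layer`, `roles`, `tags`, `tokens`), the matching rules, and the function name are assumptions made for illustration, and the real engine may resolve modules differently.

```python
MAX_MODULES = 15  # cap stated in the text
LAYER_ORDER = ["always", "auto", "stacks", "concerns"]  # layer names from the text


def resolve_modules(catalog, role, stack, concerns, token_budget):
    """Pick module names for one role; the catalog entry shape is hypothetical."""
    selected, used = [], 0
    for layer in LAYER_ORDER:
        for mod in catalog:
            if mod["layer"] != layer or role not in mod["roles"]:
                continue
            # Assumed rule: stack/concern layers need a tag in the task's declaration.
            if layer == "stacks" and not set(mod["tags"]) & set(stack):
                continue
            if layer == "concerns" and not set(mod["tags"]) & set(concerns):
                continue
            # Skip anything that would break the module cap or the token budget.
            if len(selected) >= MAX_MODULES or used + mod["tokens"] > token_budget:
                continue
            selected.append(mod["name"])
            used += mod["tokens"]
    return selected
```

Earlier layers are filled first, so when the budget is tight the always-on modules win over concern-specific ones in this sketch.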
|
1950
2748
|
|
|
1951
2749
|
---
|
|
1952
2750
|
|
|
@@ -1954,11 +2752,13 @@ The output is code held to the same standard a senior engineering team would enf
|
|
|
1954
2752
|
|
|
1955
2753
|
Wazir's tiered recall system loads the minimum context each role needs.
|
|
1956
2754
|
|
|
1957
|
-
|
|
1958
|
-
|
|
1959
|
-
|
|
|
1960
|
-
|
|
|
1961
|
-
|
|
|
2755
|
+
|
|
2756
|
+
| Tier | Tokens | Content | Used by |
|
|
2757
|
+
| ----------- | --------- | ------------------- | ------------------------------------------------------ |
|
|
2758
|
+
| L0 | ~100 | One-line identifier | learner (inventory scans) |
|
|
2759
|
+
| L1 | ~500-2k | Structural summary | clarifier, researcher, planner, reviewer (exploration) |
|
|
2760
|
+
| Direct read | Full file | Exact source lines | executor, verifier (implementation) |
|
|
2761
|
+
|
|
1962
2762
|
|
|
1963
2763
|
Capture routing redirects large tool output to run-local files. The agent gets a file path (~50 tokens) instead of the full output. Combined with tiered recall, this yields 60-80% token reduction on exploration-heavy phases.
|
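Capture routing, as described above, amounts to swapping a large payload for a short pointer. The sketch below assumes a character threshold and a `.log` file in a run-local directory; `route_output`, `max_inline_chars`, and the message format are hypothetical and not taken from the Wazir CLI.

```python
import tempfile
from pathlib import Path


def route_output(output: str, run_dir: Path, max_inline_chars: int = 2000) -> str:
    """Return small output inline; write large output to a file and return its path."""
    if len(output) <= max_inline_chars:
        return output
    run_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile(
        "w", dir=run_dir, suffix=".log", delete=False, encoding="utf-8"
    ) as fh:
        fh.write(output)
    # The agent sees only this short line, not the full payload.
    return f"[captured {len(output)} chars] {fh.name}"
```

The agent can still open the file with a direct read when a later phase needs the exact lines.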
|
1964
2764
|
|
|
@@ -1987,23 +2787,21 @@ Run `wazir capture usage` at the end of a session to see the savings:
|
|
|
1987
2787
|
|
|
1988
2788
|
## What's Included
|
|
1989
2789
|
|
|
1990
|
-
**10 canonical role contracts.** Clarifier, researcher, specifier, content-author, designer, planner, executor, verifier, reviewer, learner. Each has enforceable inputs, outputs, and escalation rules.
|
|
2790
|
+
**10 canonical role contracts.** Clarifier, researcher, specifier, content-author, designer, planner, executor, verifier, reviewer, learner. Each has enforceable inputs, outputs, and escalation rules. [Roles reference](docs/reference/roles-reference.md)
|
|
1991
2791
|
|
|
1992
|
-
**Adversarial review at three chokepoints.** Spec-challenge, plan-review, and final review run by the reviewer role, never the phase author.
|
|
2792
|
+
**Adversarial review at three chokepoints.** Spec-challenge, plan-review, and final review run by the reviewer role, never the phase author. Nine hard approval gates span the 15-workflow pipeline. Nothing advances without explicit clearance. [Architecture](docs/concepts/architecture.md)
|
|
1993
2793
|
|
|
1994
|
-
**
|
|
2794
|
+
**315 curated expertise modules across 12 domains.** Loaded selectively per role per phase via a 4-layer composition engine. Max 15 modules per dispatch, token budget enforced. Wazir ships with 315. Yours could be next. [Expertise index](docs/reference/expertise-index.md)
|
|
1995
2795
|
|
|
1996
|
-
**Three-tier recall for token savings.** L0 (
|
|
2796
|
+
**Three-tier recall for token savings.** L0 (~100 tokens), L1 (~500-2k tokens), direct read for full source. Symbol-first exploration searches the index before reading source. Capture routing redirects large tool output to files. Result: 60-80% token reduction on exploration-heavy phases, measured per-session by `wazir capture usage`. [Indexing and Recall](docs/concepts/indexing-and-recall.md)
|
|
1997
2797
|
|
|
1998
2798
|
**Structured learning.** Proposed learnings require explicit review and scope tagging before promotion. Only learnings whose file patterns overlap the current task get injected into context. The system improves per-project without drifting.
|
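The scope filter for learnings can be pictured as glob matching between each learning's file patterns and the files a task touches. This is a sketch under assumptions: the `text` and `patterns` keys and the function name are hypothetical, and Wazir may define "overlap" differently.

```python
from fnmatch import fnmatchcase


def relevant_learnings(learnings, task_files):
    """Keep learnings with at least one pattern matching a file in the task."""
    return [
        item["text"]
        for item in learnings
        if any(fnmatchcase(f, pat) for pat in item["patterns"] for f in task_files)
    ]
```

A learning tagged `docs/*.md` would stay out of context for a task that only edits files under `src/api/`.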
|
1999
2799
|
|
|
2000
|
-
**
|
|
2001
|
-
|
|
2002
|
-
**20 callable skills.** wz:tdd, wz:verification, wz:debugging, wz:scan-project, wz:writing-plans, and 14 more. Each enforces an exact procedure with evidence at each step. [Skills](docs/reference/skills.md)
|
|
2800
|
+
**8 hook contracts for structural guardrails.** These block writes to protected paths (exit 42), enforce loop caps (exit 43), and provide session observability. [Hooks](docs/reference/hooks.md)
|
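The two exit codes named above suggest a guard that maps a violation to a process exit status. The following is an illustrative sketch only: the codes 42 and 43 come from the text, while `hook_exit_code`, its parameters, and the path-matching rule are assumptions rather than the actual hook contract.

```python
from pathlib import PurePosixPath

EXIT_OK = 0
EXIT_PROTECTED_PATH = 42  # from the text: protected path write
EXIT_LOOP_CAP = 43        # from the text: loop cap exceeded


def hook_exit_code(target, protected_dirs, loop_count, loop_cap):
    """Return the exit status a guard hook might use (hypothetical signature)."""
    path = PurePosixPath(target)
    for d in protected_dirs:
        root = PurePosixPath(d)
        # Match the protected directory itself or anything beneath it.
        if root == path or root in path.parents:
            return EXIT_PROTECTED_PATH
    if loop_count > loop_cap:
        return EXIT_LOOP_CAP
    return EXIT_OK
```

A real hook would pass this value to `sys.exit` so the host can stop the tool call on any non-zero status.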
|
2003
2801
|
|
|
2004
|
-
**
|
|
2802
|
+
**20+ callable skills.** `/wazir` runs the full pipeline. `/wazir audit security` runs a codebase audit. `/wazir prd` generates a product requirements document from completed runs. Plus TDD, verification, debugging, and more -- each enforcing an exact procedure with evidence at every step. [Skills](docs/reference/skills.md)
|
|
2005
2803
|
|
|
2006
|
-
**
|
|
2804
|
+
**Built-in text humanization.** The composition engine loads domain-specific language rules per role: code rules for the executor (commit messages, comments), content rules for the content-author (microcopy, glossary), and technical-docs rules for the specifier, planner, reviewer, and learner. A 61-item vocabulary blacklist, 24-pattern sentence taxonomy, and two-pass self-audit checklist keep all output sounding like it was written by a person.
|
|
2007
2805
|
|
|
2008
2806
|
**Runs on 4 platforms.** `wazir export build` compiles canonical sources into native packages for Claude, Codex, Gemini, and Cursor. SHA-256 drift detection catches stale exports in CI. [Host exports](docs/reference/host-exports.md)
|
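Drift detection with SHA-256 can be sketched as hashing the canonical source tree and comparing the result with a digest recorded at export time. The function names and the idea of a single tree digest are assumptions for illustration; the text does not say how `wazir export build` stores or compares hashes.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> str:
    """Hash every file under root (relative path + bytes) in a stable order."""
    h = hashlib.sha256()
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h.update(p.relative_to(root).as_posix().encode("utf-8"))
            h.update(b"\0")  # separator so path and content cannot blur together
            h.update(p.read_bytes())
    return h.hexdigest()


def has_drifted(source_root: Path, recorded_digest: str) -> bool:
    """True when the sources no longer match the digest recorded at export time."""
    return digest_tree(source_root) != recorded_digest
```

In CI, a `True` result would mean the exports are stale and need to be rebuilt before merging.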
|
2009
2807
|
|
|
@@ -2011,26 +2809,24 @@ Run `wazir capture usage` at the end of a session to see the savings:
|
|
|
2011
2809
|
|
|
2012
2810
|
## Compared to Other Tools
|
|
2013
2811
|
|
|
2014
|
-
The AI coding tool space is fragmenting. Developers bolt together separate plugins for workflow management, specification, memory, output compression, and orchestration.
|
|
2812
|
+
The AI coding tool space is fragmenting. Developers bolt together separate plugins for workflow management, specification, memory, output compression, and orchestration. Not every project needs 15 workflows. For a weekend hack, prompting is fine. For production, you want structure.
|
|
2015
2813
|
|
|
2016
|
-
Wazir takes a different path: one integrated operating model instead of many independent plugins.
|
|
2017
2814
|
|
|
2018
|
-
| Dimension
|
|
2019
|
-
|
|
2020
|
-
| **Category**
|
|
2021
|
-
| **Scope**
|
|
2022
|
-
| **Enforced roles**
|
|
2023
|
-
| **Phase model**
|
|
2024
|
-
| **Adversarial review** | 3 gate phases
|
|
2025
|
-
| **Context management** | L0/L1 tiered recall
|
|
2026
|
-
| **Schema validation**
|
|
2027
|
-
| **Guardrails**
|
|
2028
|
-
| **External deps**
|
|
2029
|
-
| **Host support**
|
|
2815
|
+
| Dimension | Wazir | [Superpowers](https://github.com/obra/superpowers) | [Spec-Kit](https://github.com/github/spec-kit) | [Micro-Agent](https://github.com/BuilderIO/micro-agent) | [Distill](https://github.com/samuelfaj/distill) | [Claude-Mem](https://github.com/thedotmack/claude-mem) | [OMC](https://github.com/yeachan-heo/oh-my-claudecode) |
|
|
2816
|
+
| ---------------------- | ----------------------------- | -------------------------------------------------- | ---------------------------------------------- | ------------------------------------------------------- | ----------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------ |
|
|
2817
|
+
| **Category** | Engineering OS | Skills framework | Spec toolkit | Code gen agent | Output compressor | Memory plugin | Orchestration layer |
|
|
2818
|
+
| **Scope** | Full lifecycle (15 workflows) | Dev workflow (~20 skills) | Specify / Plan / Implement | Single-file TDD loop | CLI output compression | Session memory | Multi-agent orchestration |
|
|
2819
|
+
| **Enforced roles** | 10 canonical, contractual | None (skills only) | None | None | None | None | 32 agents (behavioral) |
|
|
2820
|
+
| **Phase model** | 15 explicit, artifact-gated | 7-step (advisory) | 3-step | 1 (generate/test) | N/A | N/A | 5-step pipeline |
|
|
2821
|
+
| **Adversarial review** | 3 gate phases | Code review skill | No | No | No | No | team-verify step |
|
|
2822
|
+
| **Context management** | L0/L1 tiered recall | None | None | None | LLM compression | Vector DB (ChromaDB) | Token routing |
|
|
2823
|
+
| **Schema validation** | 19 JSON schemas | No | No | No | No | No | No |
|
|
2824
|
+
| **Guardrails** | 8 hook contracts | None | None | None | None | 5 hooks (memory) | Agent tracking |
|
|
2825
|
+
| **External deps** | None (host-native) | None (prompt-only) | Python CLI | Node.js CLI | Node.js + LLM | ChromaDB, SQLite, Bun | tmux, exp. teams API |
|
|
2826
|
+
| **Host support** | Claude, Codex, Gemini, Cursor | Claude, Codex, Gemini, Cursor, OpenCode | Claude, Copilot, Gemini | Any LLM provider | Any LLM | Claude Code only | Claude Code (+ workers) |
|
|
2030
2827
|
|
|
2031
|
-
Each of these tools solves a real problem. Wazir's approach is to solve them together -- one system, shared context, structural enforcement -- instead of asking developers to wire separate plugins into a coherent workflow.
|
|
2032
2828
|
|
|
2033
|
-
|
|
2829
|
+
Each of these tools solves a real problem. Wazir's approach is to solve them together -- one system, shared context, structural enforcement -- instead of asking developers to wire separate plugins into a coherent workflow.
|
|
2034
2830
|
|
|
2035
2831
|
---
|
|
2036
2832
|
|
|
@@ -2043,63 +2839,58 @@ Each of these tools solves a real problem. Wazir's approach is to solve them tog
|
|
|
2043
2839
|
/plugin install wazir
|
|
2044
2840
|
```
|
|
2045
2841
|
|
|
2046
|
-
The plugin loads skills, roles, and workflows into your Claude sessions.
|
|
2842
|
+
The plugin loads skills, roles, and workflows into your Claude sessions. Then type `/wazir` and go.
|
|
2047
2843
|
|
|
2048
2844
|
**npm / Homebrew:**
|
|
2049
2845
|
|
|
2050
2846
|
```bash
|
|
2051
|
-
npm install -g @wazir-dev/cli
|
|
2052
|
-
brew tap MohamedAbdallah-14/
|
|
2847
|
+
npm install -g @wazir-dev/cli # npm
|
|
2848
|
+
brew tap MohamedAbdallah-14/homebrew-wazir && brew install wazir # Homebrew
|
|
2053
2849
|
```
|
|
2054
2850
|
|
|
2055
|
-
**Deploy to your project:**
|
|
2056
|
-
|
|
2057
|
-
| Host | Command |
|
|
2058
|
-
|------|---------|
|
|
2059
|
-
| **Claude** | `cp -r exports/hosts/claude/.claude ~/your-project/ && cp exports/hosts/claude/CLAUDE.md ~/your-project/` |
|
|
2060
|
-
| **Codex** | `cp exports/hosts/codex/AGENTS.md ~/your-project/` |
|
|
2061
|
-
| **Gemini** | `cp exports/hosts/gemini/GEMINI.md ~/your-project/` |
|
|
2062
|
-
| **Cursor** | `cp -r exports/hosts/cursor/.cursor ~/your-project/` |
|
|
2063
|
-
|
|
2064
|
-
> npm/Homebrew users: clone the source and run `npx wazir export build` to generate host exports. See [Installation Guide](docs/getting-started/01-installation.md) for the full path.
|
|
2065
|
-
|
|
2066
2851
|
---
|
|
2067
2852
|
|
|
2068
2853
|
## Documentation
|
|
2069
2854
|
|
|
2070
2855
|
**For users:**
|
|
2071
2856
|
|
|
2072
|
-
|
|
2073
|
-
|
|
2074
|
-
|
|
|
2075
|
-
|
|
|
2076
|
-
|
|
|
2857
|
+
|
|
2858
|
+
| I want to... | Go to |
|
|
2859
|
+
| ------------------------------- | --------------------------------------------------------- |
|
|
2860
|
+
| Install and get started | [Installation](docs/getting-started/01-installation.md) |
|
|
2861
|
+
| Run my first task | [First Run](docs/getting-started/02-first-run.md) |
|
|
2862
|
+
| Understand the architecture | [Architecture](docs/concepts/architecture.md) |
|
|
2077
2863
|
| Learn about roles and workflows | [Roles & Workflows](docs/concepts/roles-and-workflows.md) |
|
|
2078
2864
|
|
|
2865
|
+
|
|
2079
2866
|
**For contributors:**
|
|
2080
2867
|
|
|
2081
|
-
|
|
2082
|
-
|
|
2083
|
-
|
|
|
2084
|
-
|
|
|
2085
|
-
|
|
|
2086
|
-
|
|
|
2868
|
+
|
|
2869
|
+
| I want to... | Go to |
|
|
2870
|
+
| ------------------------ | -------------------------------------------------------------------- |
|
|
2871
|
+
| Set up for development | [CONTRIBUTING.md](CONTRIBUTING.md) |
|
|
2872
|
+
| Look up CLI commands | [CLI Reference](docs/reference/tooling-cli.md) |
|
|
2873
|
+
| Configure the manifest | [Configuration Reference](docs/reference/configuration-reference.md) |
|
|
2874
|
+
| Browse all documentation | [Documentation Hub](docs/README.md) |
|
|
2875
|
+
|
|
2087
2876
|
|
|
2088
2877
|
---
|
|
2089
2878
|
|
|
2090
2879
|
## Project Status
|
|
2091
2880
|
|
|
2092
|
-
Wazir is in active early development (
|
|
2881
|
+
Wazir is in active early development (pre-1.0-alpha).
|
|
2093
2882
|
|
|
2094
2883
|
The pipeline, roles, and expertise modules are stable and used in production by the maintainers. The CLI, schemas, and hook contracts work. But this is early software -- APIs may change before 1.0.
|
|
2095
2884
|
|
|
2096
2885
|
What's solid:
|
|
2097
|
-
|
|
2098
|
-
-
|
|
2886
|
+
|
|
2887
|
+
- The 4-phase pipeline (15 workflows) and 10 role contracts
|
|
2888
|
+
- 315 expertise modules across 12 domains
|
|
2099
2889
|
- Host exports for Claude, Codex, Gemini, and Cursor
|
|
2100
2890
|
- The composition engine and tiered recall system
|
|
2101
2891
|
|
|
2102
2892
|
What may change:
|
|
2893
|
+
|
|
2103
2894
|
- CLI command surface and flags
|
|
2104
2895
|
- Schema field names
|
|
2105
2896
|
- Hook contract signatures
|
|
@@ -2109,6 +2900,14 @@ Feedback and contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md).
|
|
|
2109
2900
|
|
|
2110
2901
|
---
|
|
2111
2902
|
|
|
2903
|
+
## Why "Wazir"?
|
|
2904
|
+
|
|
2905
|
+
Wazir (وزير) -- the vizier. The operational mastermind who ran empires while the sultan held authority. In Arabic chess, the wazir became the queen: the most powerful piece on the board.
|
|
2906
|
+
|
|
2907
|
+
The Arabic word *itqan* (إتقان) means mastery -- doing something so well that nothing remains to improve. This isn't a tagline. It's the test every commit runs against.
|
|
2908
|
+
|
|
2909
|
+
---
|
|
2910
|
+
|
|
2112
2911
|
## Acknowledgments
|
|
2113
2912
|
|
|
2114
2913
|
Wazir builds on ideas and patterns from these projects:
|
|
@@ -2120,6 +2919,7 @@ Wazir builds on ideas and patterns from these projects:
|
|
|
2120
2919
|
- **[micro-agent](https://github.com/BuilderIO/micro-agent)** by Builder.io -- test-driven code generation patterns
|
|
2121
2920
|
- **[distill](https://github.com/samuelfaj/distill)** by [@samuelfaj](https://github.com/samuelfaj) -- CLI output compression for token savings
|
|
2122
2921
|
- **[claude-mem](https://github.com/thedotmack/claude-mem)** by [@thedotmack](https://github.com/thedotmack) -- persistent memory patterns for coding agents
|
|
2922
|
+
- **[ideation](https://github.com/bladnman/ideation_team_skill)** by [@bladnman](https://github.com/bladnman) -- multi-agent structured dialogue patterns
|
|
2123
2923
|
|
|
2124
2924
|
---
|
|
2125
2925
|
|
|
@@ -2238,6 +3038,7 @@ Not sure where to start? Open a [Discussion](https://github.com/MohamedAbdallah-
|
|
|
2238
3038
|
4. **Merge:** Once approved and all checks pass, a maintainer will merge your PR using a squash merge with a conventional commit message.
|
|
2239
3039
|
5. **Timeline:** We aim to provide initial review feedback within a few days. If your PR has been open for more than a week without a response, feel free to leave a comment or ping in Discussions.
|
|
2240
3040
|
|
|
3041
|
+
|
|
2241
3042
|
---
|
|
2242
3043
|
## Source: AGENTS.md
|
|
2243
3044
|
|
|
@@ -2353,3 +3154,5 @@ This project uses Codex as a secondary reviewer. Review artifacts are in `tasks/
|
|
|
2353
3154
|
- Use isolated feature branches
|
|
2354
3155
|
- Reference `wazir.manifest.yaml` for the project manifest and schema
|
|
2355
3156
|
|
|
3157
|
+
|
|
3158
|
+
---
|