@wazir-dev/cli 1.1.0 → 1.3.0
- package/CHANGELOG.md +74 -10
- package/README.md +15 -15
- package/assets/demo.cast +47 -0
- package/assets/demo.gif +0 -0
- package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
- package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
- package/docs/concepts/architecture.md +1 -1
- package/docs/concepts/roles-and-workflows.md +2 -0
- package/docs/concepts/why-wazir.md +59 -0
- package/docs/decisions/2026-03-19-deferred-items.md +564 -0
- package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
- package/docs/readmes/INDEX.md +21 -5
- package/docs/readmes/features/expertise/README.md +2 -2
- package/docs/readmes/features/exports/README.md +2 -2
- package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
- package/docs/readmes/features/schemas/README.md +3 -0
- package/docs/readmes/features/skills/README.md +17 -0
- package/docs/readmes/features/skills/clarifier.md +5 -0
- package/docs/readmes/features/skills/claude-cli.md +5 -0
- package/docs/readmes/features/skills/codex-cli.md +5 -0
- package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
- package/docs/readmes/features/skills/executing-plans.md +5 -0
- package/docs/readmes/features/skills/executor.md +5 -0
- package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
- package/docs/readmes/features/skills/gemini-cli.md +5 -0
- package/docs/readmes/features/skills/humanize.md +5 -0
- package/docs/readmes/features/skills/init-pipeline.md +5 -0
- package/docs/readmes/features/skills/receiving-code-review.md +5 -0
- package/docs/readmes/features/skills/requesting-code-review.md +5 -0
- package/docs/readmes/features/skills/reviewer.md +5 -0
- package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
- package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
- package/docs/readmes/features/skills/wazir.md +5 -0
- package/docs/readmes/features/skills/writing-skills.md +5 -0
- package/docs/readmes/features/workflows/prepare-next.md +1 -1
- package/docs/reference/configuration-reference.md +47 -6
- package/docs/reference/hooks.md +1 -0
- package/docs/reference/launch-checklist.md +4 -4
- package/docs/reference/review-loop-pattern.md +119 -9
- package/docs/reference/roles-reference.md +1 -0
- package/docs/reference/skill-tiers.md +147 -0
- package/docs/reference/tooling-cli.md +3 -1
- package/docs/truth-claims.yaml +12 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
- package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
- package/exports/hosts/claude/.claude/commands/verify.md +30 -1
- package/exports/hosts/claude/.claude/settings.json +9 -0
- package/exports/hosts/claude/CLAUDE.md +1 -1
- package/exports/hosts/claude/export.manifest.json +6 -4
- package/exports/hosts/claude/host-package.json +3 -1
- package/exports/hosts/codex/AGENTS.md +1 -1
- package/exports/hosts/codex/export.manifest.json +6 -4
- package/exports/hosts/codex/host-package.json +3 -1
- package/exports/hosts/cursor/.cursor/hooks.json +4 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
- package/exports/hosts/cursor/export.manifest.json +6 -4
- package/exports/hosts/cursor/host-package.json +3 -1
- package/exports/hosts/gemini/GEMINI.md +1 -1
- package/exports/hosts/gemini/export.manifest.json +6 -4
- package/exports/hosts/gemini/host-package.json +3 -1
- package/hooks/context-mode-router +191 -0
- package/hooks/definitions/context_mode_router.yaml +19 -0
- package/hooks/hooks.json +31 -6
- package/hooks/protected-path-write-guard +8 -0
- package/hooks/routing-matrix.json +45 -0
- package/hooks/session-start +62 -1
- package/llms-full.txt +937 -134
- package/package.json +2 -4
- package/schemas/hook.schema.json +2 -1
- package/schemas/phase-report.schema.json +89 -0
- package/schemas/usage.schema.json +25 -1
- package/schemas/wazir-manifest.schema.json +19 -0
- package/skills/brainstorming/SKILL.md +32 -157
- package/skills/clarifier/SKILL.md +289 -111
- package/skills/claude-cli/SKILL.md +320 -0
- package/skills/codex-cli/SKILL.md +260 -0
- package/skills/debugging/SKILL.md +13 -0
- package/skills/design/SKILL.md +13 -0
- package/skills/dispatching-parallel-agents/SKILL.md +13 -0
- package/skills/executing-plans/SKILL.md +13 -0
- package/skills/executor/SKILL.md +139 -19
- package/skills/finishing-a-development-branch/SKILL.md +13 -0
- package/skills/gemini-cli/SKILL.md +260 -0
- package/skills/humanize/SKILL.md +13 -0
- package/skills/init-pipeline/SKILL.md +72 -164
- package/skills/prepare-next/SKILL.md +81 -10
- package/skills/receiving-code-review/SKILL.md +13 -0
- package/skills/requesting-code-review/SKILL.md +13 -0
- package/skills/reviewer/SKILL.md +369 -24
- package/skills/run-audit/SKILL.md +13 -0
- package/skills/scan-project/SKILL.md +13 -0
- package/skills/self-audit/SKILL.md +217 -16
- package/skills/skill-research/SKILL.md +188 -0
- package/skills/subagent-driven-development/SKILL.md +13 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
- package/skills/subagent-driven-development/implementer-prompt.md +8 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
- package/skills/tdd/SKILL.md +13 -0
- package/skills/using-git-worktrees/SKILL.md +13 -0
- package/skills/using-skills/SKILL.md +13 -0
- package/skills/verification/SKILL.md +54 -3
- package/skills/wazir/SKILL.md +464 -381
- package/skills/writing-plans/SKILL.md +14 -1
- package/skills/writing-skills/SKILL.md +13 -0
- package/templates/artifacts/implementation-plan.md +3 -0
- package/templates/artifacts/tasks-template.md +133 -0
- package/templates/examples/phase-report.example.json +48 -0
- package/tooling/src/adapters/composition-engine.js +256 -0
- package/tooling/src/adapters/model-router.js +84 -0
- package/tooling/src/capture/command.js +41 -2
- package/tooling/src/capture/run-config.js +3 -1
- package/tooling/src/capture/store.js +56 -0
- package/tooling/src/capture/usage.js +106 -0
- package/tooling/src/capture/user-input.js +66 -0
- package/tooling/src/checks/ac-matrix.js +256 -0
- package/tooling/src/checks/command-registry.js +12 -0
- package/tooling/src/checks/docs-truth.js +1 -1
- package/tooling/src/checks/security-sensitivity.js +69 -0
- package/tooling/src/checks/skills.js +111 -0
- package/tooling/src/cli.js +31 -20
- package/tooling/src/commands/stats.js +161 -0
- package/tooling/src/commands/validate.js +5 -1
- package/tooling/src/export/compiler.js +33 -37
- package/tooling/src/gating/agent.js +145 -0
- package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
- package/tooling/src/hooks/routing-logic.js +69 -0
- package/tooling/src/init/auto-detect.js +258 -0
- package/tooling/src/init/command.js +38 -170
- package/tooling/src/input/scanner.js +46 -0
- package/tooling/src/reports/command.js +103 -0
- package/tooling/src/reports/phase-report.js +323 -0
- package/tooling/src/state/command.js +160 -0
- package/tooling/src/state/db.js +287 -0
- package/tooling/src/status/command.js +58 -1
- package/tooling/src/verify/proof-collector.js +299 -0
- package/wazir.manifest.yaml +26 -14
- package/workflows/plan-review.md +3 -1
- package/workflows/verify.md +30 -1
package/skills/wazir/SKILL.md
CHANGED
|
@@ -9,6 +9,29 @@ The user typed `/wazir <their request>`. Run the entire pipeline end-to-end, han
|
|
|
9
9
|
|
|
10
10
|
All questions use **numbered interactive options** — one question at a time, defaults marked "(Recommended)", wait for user response before proceeding.
|
|
11
11
|
|
|
12
|
+
## User Input Capture
|
|
13
|
+
|
|
14
|
+
After every user response (approval, correction, rejection, redirect, instruction), capture it:
|
|
15
|
+
|
|
16
|
+
```
|
|
17
|
+
captureUserInput(runDir, { phase: '<current-phase>', type: '<instruction|approval|correction|rejection|redirect>', content: '<user message>', context: '<what prompted the question>' })
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
This uses `tooling/src/capture/user-input.js`. The log at `user-input-log.ndjson` feeds the learning system — user corrections are the strongest signal for improvement. At run end, prune logs older than 10 runs via `pruneOldInputLogs(stateRoot, 10)`.
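As a rough illustration of the append-only log described above, here is a minimal sketch that assumes one JSON object per line. `appendInputRecord` and `readInputRecords` are hypothetical names and the `timestamp` field is an assumption; neither is confirmed as part of `tooling/src/capture/user-input.js`. Only the `user-input-log.ndjson` filename and the `phase`, `type`, `content`, and `context` fields come from the text.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: append one user response as a single JSON line.
function appendInputRecord(runDir, record) {
  const logPath = path.join(runDir, "user-input-log.ndjson");
  // The timestamp field is an assumption added for illustration.
  const line = JSON.stringify({ timestamp: new Date().toISOString(), ...record });
  fs.mkdirSync(runDir, { recursive: true });
  fs.appendFileSync(logPath, line + "\n", "utf8");
  return logPath;
}

// Hypothetical helper: read the log back as an array of objects.
function readInputRecords(runDir) {
  const logPath = path.join(runDir, "user-input-log.ndjson");
  if (!fs.existsSync(logPath)) return [];
  return fs
    .readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Usage against a throwaway directory.
const demoRunDir = fs.mkdtempSync(path.join(os.tmpdir(), "wazir-demo-"));
appendInputRecord(demoRunDir, {
  phase: "clarifier",
  type: "correction",
  content: "Use Postgres, not SQLite",
  context: "database choice",
});
appendInputRecord(demoRunDir, {
  phase: "clarifier",
  type: "approval",
  content: "Looks good",
  context: "spec checkpoint",
});
console.log(readInputRecords(demoRunDir).map((r) => r.type));
```

Appending whole lines keeps each record independent, so a partially written run still leaves a readable log.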
|
|
21
|
+
|
|
22
|
+
## Command Routing
|
|
23
|
+
Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
|
|
24
|
+
- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
|
|
25
|
+
- Small commands (git status, ls, pwd, wazir CLI) → native Bash
|
|
26
|
+
- If context-mode is unavailable, fall back to native Bash with a warning
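The routing rules above could be sketched roughly as follows. The matrix shape, the example command prefixes, and the `routeCommand` name are all assumptions for illustration; the actual contents of `hooks/routing-matrix.json` are not shown here and may differ. Unlisted commands defaulting to native Bash is also an assumption.

```javascript
// Hypothetical matrix shape; the real hooks/routing-matrix.json may differ.
const ROUTING_MATRIX = {
  large: ["npm test", "npm run build", "git diff", "npm ls", "eslint"],
  small: ["git status", "ls", "pwd", "wazir"],
};

// Decide where a command should run, following the three rules above.
function routeCommand(command, contextModeAvailable) {
  const matches = (prefix) => command === prefix || command.startsWith(prefix + " ");
  const isLarge = ROUTING_MATRIX.large.some(matches);
  // Assumption: anything not listed as large runs in native Bash.
  if (!isLarge) return { target: "bash", warning: null };
  if (contextModeAvailable) return { target: "context-mode", warning: null };
  return {
    target: "bash",
    warning: "context-mode unavailable; falling back to native Bash",
  };
}

console.log(routeCommand("npm test", true));
console.log(routeCommand("git status", true));
console.log(routeCommand("git diff HEAD~1", false));
```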
|
|
27
|
+
|
|
28
|
+
## Codebase Exploration
|
|
29
|
+
1. Query `wazir index search-symbols <query>` first
|
|
30
|
+
2. Use `wazir recall file <path> --tier L1` for targeted reads
|
|
31
|
+
3. Fall back to direct file reads ONLY for files identified by index queries
|
|
32
|
+
4. Maximum 10 direct file reads without a justifying index query
|
|
33
|
+
5. If no index exists: `wazir index build && wazir index summarize --tier all`
|
|
34
|
+
|
|
12
35
|
## Subcommand Detection
|
|
13
36
|
|
|
14
37
|
Before anything else, check if the request starts with a known subcommand:
|
|
@@ -18,11 +41,24 @@ Before anything else, check if the request starts with a known subcommand:
|
|
|
18
41
|
| `/wazir audit ...` | Jump to **Audit Mode** (see below) |
|
|
19
42
|
| `/wazir prd [run-id]` | Jump to **PRD Mode** (see below) |
|
|
20
43
|
| `/wazir init` | Invoke the `init-pipeline` skill directly, then stop |
|
|
21
|
-
| Anything else | Continue to
|
|
44
|
+
| Anything else | Continue to Phase 1 (Init) |
|
|
45
|
+
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
# 4-Phase Pipeline
|
|
49
|
+
|
|
50
|
+
The pipeline has 4 phases. Each phase groups related workflows. Individual workflows within a phase can be enabled/disabled via `workflow_policy` in run-config.
|
|
51
|
+
|
|
52
|
+
| Phase | Contains | Owner Skill | Key Output |
|
|
53
|
+
|-------|----------|-------------|------------|
|
|
54
|
+
| **Init** | Setup, prereqs, run directory, input scan | `wz:wazir` (inline) | `run-config.yaml` |
|
|
55
|
+
| **Clarifier** | Research, clarify, specify, brainstorm, plan | `wz:clarifier` | Approved spec + design + plan |
|
|
56
|
+
| **Executor** | Implement, verify | `wz:executor` | Code + verification proof |
|
|
57
|
+
| **Final Review** | Review vs original input, learn, prepare next | `wz:reviewer` | Verdict + learnings + handoff |
|
|
22
58
|
|
|
23
59
|
---
|
|
24
60
|
|
|
25
|
-
#
|
|
61
|
+
# Phase 1: Init
|
|
26
62
|
|
|
27
63
|
## Step 1: Capture the Request
|
|
28
64
|
|
|
@@ -37,16 +73,28 @@ If the user provided no text after `/wazir`, ask:
|
|
|
37
73
|
|
|
38
74
|
Save their answer as the briefing, then continue.
|
|
39
75
|
|
|
76
|
+
### Scan Input Directory
|
|
77
|
+
|
|
78
|
+
Scan both `input/` (project-level) and `.wazir/input/` (state-level) for existing briefing materials. If files exist beyond `briefing.md`, list them:
|
|
79
|
+
|
|
80
|
+
> **Found input files:**
|
|
81
|
+
> - `input/2026-03-19-deferred-items.md`
|
|
82
|
+
> - `.wazir/input/briefing.md`
|
|
83
|
+
>
|
|
84
|
+
> Using all found input as context for clarification.
|
|
85
|
+
|
|
40
86
|
### Inline Modifiers
|
|
41
87
|
|
|
42
|
-
Parse the request for inline modifiers before the main text
|
|
88
|
+
Parse the request for inline modifiers before the main text:
|
|
43
89
|
|
|
44
90
|
- `/wazir quick fix the login redirect` → depth = quick, intent = bugfix
|
|
45
91
|
- `/wazir deep design a new onboarding flow` → depth = deep, intent = feature
|
|
46
|
-
- `/wazir feature add CSV export` → intent = feature, depth = standard (default)
|
|
47
92
|
|
|
48
93
|
Recognized modifiers:
|
|
49
94
|
- **Depth:** `quick`, `deep` (standard is default when omitted)
|
|
95
|
+
- **Interaction mode:** `auto`, `interactive` (guided is default when omitted)
|
|
96
|
+
- `/wazir auto fix the auth bug` → interaction_mode = auto
|
|
97
|
+
- `/wazir interactive design the onboarding` → interaction_mode = interactive
|
|
50
98
|
- **Intent:** `bugfix`, `feature`, `refactor`, `docs`, `spike`
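One way to read the modifier rules above is as a loop that consumes leading words while they are recognized modifiers. This is a sketch under that assumption; `parseInlineModifiers` is a hypothetical name, and the input is assumed to be the text that follows `/wazir`.

```javascript
const DEPTH_MODIFIERS = new Set(["quick", "deep"]);
const MODE_MODIFIERS = new Set(["auto", "interactive"]);
const INTENT_MODIFIERS = new Set(["bugfix", "feature", "refactor", "docs", "spike"]);

// Consume leading modifier words; everything after them is the main text.
function parseInlineModifiers(requestText) {
  const words = requestText.trim().split(/\s+/).filter(Boolean);
  // Defaults when a modifier is omitted: standard depth, guided mode.
  const result = { depth: "standard", interaction_mode: "guided", intent: null };
  let index = 0;
  for (; index < words.length; index++) {
    const word = words[index].toLowerCase();
    if (DEPTH_MODIFIERS.has(word)) result.depth = word;
    else if (MODE_MODIFIERS.has(word)) result.interaction_mode = word;
    else if (INTENT_MODIFIERS.has(word)) result.intent = word;
    else break; // first non-modifier word starts the main text
  }
  result.text = words.slice(index).join(" ");
  return result;
}

console.log(parseInlineModifiers("quick fix the login redirect"));
console.log(parseInlineModifiers("auto fix the auth bug"));
```

An `intent` of `null` here means no explicit intent modifier was given, leaving intent to be inferred later from the request text.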
|
|
51
99
|
|
|
52
100
|
## Step 2: Check Prerequisites
|
|
@@ -58,24 +106,18 @@ Run `which wazir` to check if the CLI is installed.
|
|
|
58
106
|
**If not installed**, present:
|
|
59
107
|
|
|
60
108
|
> **The Wazir CLI is not installed. It's required for event capture, validation, and indexing.**
|
|
61
|
-
>
|
|
62
|
-
> **How would you like to install it?**
|
|
63
|
-
>
|
|
64
|
-
> 1. **npm** (Recommended) — `npm install -g @wazir-dev/cli`
|
|
65
|
-
> 2. **Local link** — `npm link` from the Wazir project root
|
|
66
|
-
|
|
67
|
-
If the user picks 1, run `npm install -g @wazir-dev/cli` and verify with `wazir --version`.
|
|
68
|
-
If the user picks 2, run `npm link` from the project root and verify.
|
|
69
109
|
|
|
70
|
-
|
|
110
|
+
Ask the user via AskUserQuestion:
|
|
111
|
+
- **Question:** "The Wazir CLI is not installed. How would you like to install it?"
|
|
112
|
+
- **Options:**
|
|
113
|
+
1. "npm install -g @wazir-dev/cli" *(Recommended)*
|
|
114
|
+
2. "npm link from the Wazir project root"
|
|
71
115
|
|
|
72
|
-
|
|
116
|
+
Wait for the user's selection before continuing.
|
|
73
117
|
|
|
74
|
-
|
|
75
|
-
> **Repo health check failed:** [details from doctor output]
|
|
76
|
-
> Fix issues before running the pipeline.
|
|
118
|
+
The CLI is **required** — the pipeline uses `wazir capture`, `wazir validate`, `wazir index`, and `wazir doctor` throughout execution.
|
|
77
119
|
|
|
78
|
-
|
|
120
|
+
**If installed**, run `wazir doctor --json` to verify repo health. Stop if unhealthy.
|
|
79
121
|
|
|
80
122
|
### Branch Check
|
|
81
123
|
|
|
@@ -83,13 +125,14 @@ Run `wazir validate branches` to check the current git branch.
|
|
|
83
125
|
|
|
84
126
|
- If on `main` or `develop`:
|
|
85
127
|
> You're on **[branch]**. The pipeline requires a feature branch.
|
|
86
|
-
>
|
|
87
|
-
> 1. **Create feat/<slug>** (Recommended) — branch from current
|
|
88
|
-
> 2. **Continue on [branch]** — not recommended for feature/refactor work
|
|
89
128
|
|
|
90
|
-
|
|
129
|
+
Ask the user via AskUserQuestion:
|
|
130
|
+
- **Question:** "You're on a protected branch. Create a feature branch?"
|
|
131
|
+
- **Options:**
|
|
132
|
+
1. "Create feat/<slug> from current branch" *(Recommended)*
|
|
133
|
+
2. "Continue on current branch — not recommended"
|
|
91
134
|
|
|
92
|
-
|
|
135
|
+
Wait for the user's selection before continuing.
|
|
93
136
|
|
|
94
137
|
### Index Check
|
|
95
138
|
|
|
@@ -107,87 +150,83 @@ fi
|
|
|
107
150
|
|
|
108
151
|
Check if `.wazir/state/config.json` exists.
|
|
109
152
|
|
|
110
|
-
- **If missing** — invoke the `init-pipeline` skill.
|
|
111
|
-
- **If exists** — continue
|
|
153
|
+
- **If missing** — invoke the `init-pipeline` skill.
|
|
154
|
+
- **If exists** — continue.
|
|
112
155
|
|
|
113
|
-
## Step
|
|
156
|
+
## Step 3: Create Run Directory
|
|
114
157
|
|
|
115
158
|
Generate a run ID using the current timestamp: `run-YYYYMMDD-HHMMSS`
|
|
116
159
|
|
|
117
160
|
```bash
|
|
118
|
-
mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews}
|
|
161
|
+
mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews,clarified}
|
|
119
162
|
ln -sfn run-YYYYMMDD-HHMMSS .wazir/runs/latest
|
|
120
163
|
```
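The `run-YYYYMMDD-HHMMSS` format can be produced with a small helper like the one below. This is a sketch: `makeRunId` is a hypothetical name, and using UTC is an assumption, since the text does not say which time zone the timestamp uses.

```javascript
// Build a run ID of the form run-YYYYMMDD-HHMMSS (UTC is an assumption).
function makeRunId(date = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const ymd = `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad(date.getUTCDate())}`;
  const hms = `${pad(date.getUTCHours())}${pad(date.getUTCMinutes())}${pad(date.getUTCSeconds())}`;
  return `run-${ymd}-${hms}`;
}

console.log(makeRunId(new Date(Date.UTC(2026, 2, 19, 14, 30, 5)))); // run-20260319-143005
```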
|
|
121
164
|
|
|
122
|
-
|
|
123
|
-
|
|
124
|
-
After creating the run directory, initialize event capture:
|
|
165
|
+
Initialize event capture:
|
|
125
166
|
|
|
126
167
|
```bash
|
|
127
|
-
wazir capture init --run <run-id> --phase
|
|
168
|
+
wazir capture init --run <run-id> --phase init --status starting
|
|
128
169
|
```
|
|
129
170
|
|
|
130
|
-
|
|
171
|
+
### Resume Detection
|
|
172
|
+
|
|
173
|
+
Check if a previous incomplete run exists (via `latest` symlink pointing to a run without `completed_at`).
|
|
131
174
|
|
|
132
|
-
|
|
175
|
+
**If a previous incomplete run is found**, present:
|
|
133
176
|
|
|
134
|
-
|
|
177
|
+
> **A previous incomplete run was detected:** `<previous-run-id>`
|
|
135
178
|
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
179
|
+
Ask the user via AskUserQuestion:
|
|
180
|
+
- **Question:** "A previous incomplete run was detected. Resume or start fresh?"
|
|
181
|
+
- **Options:**
|
|
182
|
+
1. "Resume from the last completed phase" *(Recommended)*
|
|
183
|
+
2. "Start fresh with a new empty run"
|
|
141
184
|
|
|
142
|
-
|
|
185
|
+
Wait for the user's selection before continuing.
|
|
143
186
|
|
|
144
|
-
|
|
187
|
+
**If Resume:**
|
|
188
|
+
- Copy `clarified/` from the previous run into the new run, EXCEPT `user-feedback.md`.
|
|
189
|
+
- Detect last completed phase by checking which artifacts exist.
|
|
190
|
+
- **Staleness check:** If input files are newer than copied artifacts, warn and offer to re-run clarification.
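The staleness check could be approximated by comparing modification times, as in this sketch. It assumes the copy preserves the artifacts' original timestamps and treats the run as stale when the newest input is newer than the oldest artifact; the text does not spell out the exact comparison, and `isClarificationStale` is a hypothetical name.

```javascript
// inputMtimes and artifactMtimes are arrays of modification times in
// milliseconds since the epoch (for example from fs.statSync(p).mtimeMs).
function isClarificationStale(inputMtimes, artifactMtimes) {
  // With nothing to compare, there is no evidence of staleness.
  if (inputMtimes.length === 0 || artifactMtimes.length === 0) return false;
  const newestInput = Math.max(...inputMtimes);
  const oldestArtifact = Math.min(...artifactMtimes);
  return newestInput > oldestArtifact;
}

const artifactTimes = [Date.UTC(2026, 2, 19, 10, 0, 0), Date.UTC(2026, 2, 19, 10, 5, 0)];
console.log(isClarificationStale([Date.UTC(2026, 2, 19, 9, 0, 0)], artifactTimes));
console.log(isClarificationStale([Date.UTC(2026, 2, 19, 11, 0, 0)], artifactTimes));
```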
|
|
145
191
|
|
|
146
|
-
|
|
147
|
-
>
|
|
148
|
-
> 1. **Feature** (Recommended) — New functionality or enhancement
|
|
149
|
-
> 2. **Bugfix** — Fix broken behavior
|
|
150
|
-
> 3. **Refactor** — Restructure without changing behavior
|
|
151
|
-
> 4. **Docs** — Documentation only
|
|
152
|
-
> 5. **Spike** — Research and exploration, no production code
|
|
192
|
+
## Step 4: Build Run Config
|
|
153
193
|
|
|
154
|
-
|
|
194
|
+
**No questions asked.** Depth, intent, and mode are all inferred or defaulted.
|
|
155
195
|
|
|
156
|
-
|
|
157
|
-
- The host is Claude Code (not Codex/Gemini/Cursor)
|
|
158
|
-
- Depth is `standard` or `deep`
|
|
159
|
-
- Intent is `feature` or `refactor` (not bugfix/docs/spike)
|
|
196
|
+
### Intent Inference
|
|
160
197
|
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
|
|
165
|
-
|
|
166
|
-
|
|
198
|
+
Infer intent from the request text using keyword matching:
|
|
199
|
+
|
|
200
|
+
| Keywords in request | Inferred Intent |
|
|
201
|
+
|-------------------|-----------------|
|
|
202
|
+
| fix, bug, broken, crash, error, issue, wrong | `bugfix` |
|
|
203
|
+
| refactor, clean, restructure, reorganize, rename, simplify | `refactor` |
|
|
204
|
+
| doc, document, readme, guide, explain | `docs` |
|
|
205
|
+
| research, spike, explore, investigate, prototype | `spike` |
|
|
206
|
+
| (anything else) | `feature` |
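The keyword table above might be implemented along these lines. This is a sketch, not the shipped logic: `inferIntent` is a hypothetical name, and both whole-word matching and first-matching-row precedence are assumptions.

```javascript
// Rows in the same order as the table; the first matching row wins (assumption).
const INTENT_KEYWORDS = [
  ["bugfix", ["fix", "bug", "broken", "crash", "error", "issue", "wrong"]],
  ["refactor", ["refactor", "clean", "restructure", "reorganize", "rename", "simplify"]],
  ["docs", ["doc", "document", "readme", "guide", "explain"]],
  ["spike", ["research", "spike", "explore", "investigate", "prototype"]],
];

function inferIntent(requestText) {
  // Whole-word matching (assumption), so "prefix" does not trigger "fix".
  const words = new Set(requestText.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
  for (const [intent, keywords] of INTENT_KEYWORDS) {
    if (keywords.some((keyword) => words.has(keyword))) return intent;
  }
  return "feature"; // fallback row: anything else
}

console.log(inferIntent("fix the login redirect"));
console.log(inferIntent("design a new onboarding flow"));
```

Under these assumptions the two inline-modifier examples resolve to `bugfix` and `feature`, which agrees with the intents stated for them.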
|
|
207
|
+
|
|
208
|
+
Depth defaults to `standard`. Override only via inline modifiers (`/wazir quick ...`, `/wazir deep ...`).
|
|
167
209
|
|
|
168
210
|
### Write Run Config
|
|
169
211
|
|
|
170
|
-
Save
|
|
212
|
+
Save to `.wazir/runs/<run-id>/run-config.yaml`:
|
|
171
213
|
|
|
172
214
|
```yaml
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
continuation_reason: null # e.g. "review found minor fixes"
|
|
215
|
+
run_id: run-YYYYMMDD-HHMMSS
|
|
216
|
+
parent_run_id: null
|
|
217
|
+
continuation_reason: null
|
|
177
218
|
|
|
178
|
-
# User request
|
|
179
219
|
request: "the original user request"
|
|
180
|
-
request_summary: "short summary
|
|
181
|
-
parsed_intent: feature
|
|
182
|
-
entry_point: "/wazir"
|
|
220
|
+
request_summary: "short summary"
|
|
221
|
+
parsed_intent: feature
|
|
222
|
+
entry_point: "/wazir"
|
|
183
223
|
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
team_mode: sequential # sequential | parallel
|
|
187
|
-
parallel_backend: none # none | claude_teams (future: subagents, worktrees)
|
|
224
|
+
depth: standard
|
|
225
|
+
interaction_mode: guided # auto | guided | interactive
|
|
188
226
|
|
|
189
|
-
#
|
|
190
|
-
|
|
227
|
+
# Workflow policy — individual workflows within each phase
|
|
228
|
+
workflow_policy:
|
|
229
|
+
# Clarifier phase workflows
|
|
191
230
|
discover: { enabled: true, loop_cap: 10 }
|
|
192
231
|
clarify: { enabled: true, loop_cap: 10 }
|
|
193
232
|
specify: { enabled: true, loop_cap: 10 }
|
|
@@ -197,333 +236,419 @@ phase_policy:
|
|
|
197
236
|
design-review: { enabled: true, loop_cap: 10 }
|
|
198
237
|
plan: { enabled: true, loop_cap: 10 }
|
|
199
238
|
plan-review: { enabled: true, loop_cap: 10 }
|
|
239
|
+
# Executor phase workflows
|
|
200
240
|
execute: { enabled: true, loop_cap: 10 }
|
|
201
241
|
verify: { enabled: true, loop_cap: 5 }
|
|
242
|
+
# Final Review phase workflows
|
|
202
243
|
review: { enabled: true, loop_cap: 10 }
|
|
203
|
-
learn: { enabled:
|
|
204
|
-
prepare_next: { enabled:
|
|
244
|
+
learn: { enabled: true, loop_cap: 5 }
|
|
245
|
+
prepare_next: { enabled: true, loop_cap: 5 }
|
|
205
246
|
run_audit: { enabled: false, loop_cap: 10 }
|
|
206
247
|
|
|
207
|
-
|
|
208
|
-
research_topics: [] # populated by researcher phase
|
|
248
|
+
research_topics: []
|
|
209
249
|
|
|
210
|
-
|
|
211
|
-
created_at: 2026-03-17T14:30:00Z
|
|
250
|
+
created_at: "YYYY-MM-DDTHH:MM:SSZ"
|
|
212
251
|
completed_at: null
|
|
213
252
|
```
|
|
214
253
|
|
|
215
|
-
|
|
216
|
-
|
|
217
|
-
### Phase Policy
|
|
218
|
-
|
|
219
|
-
Map intent + depth to applicable phases. The system decides — the user does NOT pick phases.
|
|
254
|
+
### Workflow Skip Rules
|
|
220
255
|
|
|
221
|
-
|
|
256
|
+
Map intent + depth to applicable workflows. The system decides — the user does NOT pick.
|
|
222
257
|
|
|
223
|
-
| Class |
|
|
224
|
-
|
|
225
|
-
| **Core** (always run) | `clarify`, `verify`, `review` | Never skipped |
|
|
258
|
+
| Class | Workflows | Rules |
|
|
259
|
+
|-------|-----------|-------|
|
|
260
|
+
| **Core** (always run) | `clarify`, `execute`, `verify`, `review` | Never skipped |
|
|
226
261
|
| **Adaptive** (run when evidence says so) | `discover`, `design`, `author`, `specify` | Skipped for bugfix/docs/spike at quick depth |
|
|
227
262
|
| **Scale** (intensity varies) | `spec-challenge`, `plan-review`, `design-review` | Loop cap controls iteration depth |
|
|
263
|
+
| **Post-run** (always run) | `learn`, `prepare_next` | Part of Final Review phase |
|
|
228
264
|
|
|
229
|
-
Log skip decisions
|
|
230
|
-
|
|
231
|
-
```yaml
|
|
232
|
-
phase_policy:
|
|
233
|
-
discover: { enabled: true, loop_cap: 10 }
|
|
234
|
-
design: { enabled: false, loop_cap: 10, reason: "bugfix intent — no design needed" }
|
|
235
|
-
spec-challenge: { enabled: true, loop_cap: 10 }
|
|
236
|
-
```
|
|
265
|
+
Log skip decisions with reasons in `workflow_policy`.
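To make the skip rules concrete, the sketch below disables the Adaptive workflows for bugfix, docs, and spike intents at quick depth and records why. `applySkipRules` is a hypothetical name, and the `reason` key is an assumption about how a skip decision is logged.

```javascript
const ADAPTIVE_WORKFLOWS = ["discover", "design", "author", "specify"];
const QUICK_SKIP_INTENTS = new Set(["bugfix", "docs", "spike"]);

// Return a copy of the policy with Adaptive workflows skipped where the rule applies.
// Core and Post-run workflows are never touched.
function applySkipRules(workflowPolicy, intent, depth) {
  const policy = JSON.parse(JSON.stringify(workflowPolicy)); // deep copy, input untouched
  if (depth === "quick" && QUICK_SKIP_INTENTS.has(intent)) {
    for (const name of ADAPTIVE_WORKFLOWS) {
      if (policy[name]) {
        policy[name].enabled = false;
        policy[name].reason = `${intent} intent at quick depth`; // hypothetical field
      }
    }
  }
  return policy;
}

const basePolicy = {
  discover: { enabled: true, loop_cap: 10 },
  clarify: { enabled: true, loop_cap: 10 },
  verify: { enabled: true, loop_cap: 5 },
};
console.log(applySkipRules(basePolicy, "bugfix", "quick"));
```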
|
|
237
266
|
|
|
238
267
|
### Confidence Gate
|
|
239
268
|
|
|
240
|
-
After building
|
|
241
|
-
|
|
242
|
-
- **High confidence** (clear intent, depth set, no ambiguity) — show a one-line summary and proceed:
|
|
243
|
-
> **Running: standard depth, feature, sequential. 11 of 15 phases. Proceeding...**
|
|
244
|
-
|
|
245
|
-
- **Low confidence** (ambiguous intent, unclear scope) — show the full plan and ask:
|
|
246
|
-
> **Here's the run plan:**
|
|
247
|
-
> - Depth: standard
|
|
248
|
-
> - Intent: feature
|
|
249
|
-
> - Phases: [list enabled phases]
|
|
250
|
-
> - Skipped: [list skipped with reasons]
|
|
251
|
-
>
|
|
252
|
-
> **Does this look right?**
|
|
253
|
-
> 1. **Yes, proceed** (Recommended)
|
|
254
|
-
> 2. **No, let me adjust**
|
|
255
|
-
|
|
256
|
-
## Step 4: Run Pipeline Phases
|
|
257
|
-
|
|
258
|
-
The full pipeline runs these phases in order. Each phase produces an artifact that must pass its review loop before flowing to the next phase. Review mode is always passed explicitly (`--mode`) -- no auto-detection.
|
|
259
|
-
|
|
260
|
-
### 4a: Source Capture
|
|
261
|
-
|
|
262
|
-
Before invoking the clarifier, capture all referenced sources locally:
|
|
263
|
-
|
|
264
|
-
- Fetch all URLs referenced in `.wazir/input/` briefing files
|
|
265
|
-
- Save fetched content to `.wazir/runs/<run-id>/sources/`
|
|
266
|
-
- Name files as `src-NNN-<slug>.md` (fetched content) or `src-NNN-fetch-failed.json` (failures)
|
|
267
|
-
- Create `.wazir/runs/<run-id>/sources/manifest.json` indexing all captures:
|
|
268
|
-
|
|
269
|
-
```json
|
|
270
|
-
[
|
|
271
|
-
{
|
|
272
|
-
"id": "src-001",
|
|
273
|
-
"origin_url": "https://...",
|
|
274
|
-
"fetch_time": "2026-03-17T14:30:00Z",
|
|
275
|
-
"content_hash": "sha256:abc...",
|
|
276
|
-
"status": "captured",
|
|
277
|
-
"local_path": "src-001-github-readme.md"
|
|
278
|
-
},
|
|
279
|
-
{
|
|
280
|
-
"id": "src-002",
|
|
281
|
-
"origin_url": "https://...",
|
|
282
|
-
"status": "failed",
|
|
283
|
-
"error": "403 Forbidden",
|
|
284
|
-
"fetch_time": "2026-03-17T14:30:01Z"
|
|
285
|
-
}
|
|
286
|
-
]
|
|
287
|
-
```
|
|
269
|
+
After building run config:
|
|
288
270
|
|
|
289
|
-
|
|
271
|
+
- **High confidence** — one-line summary and proceed:
|
|
272
|
+
> **Running: standard depth, feature, sequential. Proceeding...**
|
|
290
273
|
|
|
291
|
-
|
|
274
|
+
- **Low confidence** — show plan and ask:
|
|
292
275
|
|
|
293
|
-
|
|
294
|
-
|
|
295
|
-
|
|
276
|
+
Ask the user via AskUserQuestion:
|
|
277
|
+
- **Question:** "Does this run configuration look right?"
|
|
278
|
+
- **Options:**
|
|
279
|
+
1. "Yes, proceed" *(Recommended)*
|
|
280
|
+
2. "No, let me adjust"
|
|
296
281
|
|
|
297
|
-
|
|
298
|
-
Produces clarification artifact.
|
|
299
|
-
Review: clarification-review loop (`--mode clarification-review`, spec/clarification dimensions).
|
|
300
|
-
Pass count: quick=3, standard=5, deep=7. No extension.
|
|
301
|
-
Checkpoint: user approves clarification.
|
|
282
|
+
Wait for the user's selection before continuing.
|
|
302
283
|
|
|
303
284
|
```bash
|
|
304
|
-
wazir capture event --run <run-id> --event phase_exit --phase
|
|
285
|
+
wazir capture event --run <run-id> --event phase_exit --phase init --status completed
|
|
305
286
|
```
|
|
306
287
|
|
|
307
|
-
|
|
308
|
-
|
|
288
|
+
Run the phase report and display it to the user:
|
|
309
289
|
```bash
|
|
310
|
-
wazir
|
|
290
|
+
wazir report phase --run <run-id> --phase init
|
|
311
291
|
```
|
|
312
292
|
|
|
313
|
-
|
|
314
|
-
Produces research artifact.
|
|
315
|
-
Review: research-review loop (`--mode research-review`, research dimensions).
|
|
316
|
-
Pass count: quick=3, standard=5, deep=7. No extension.
|
|
317
|
-
Skip condition: depth=quick AND intent=bugfix.
|
|
293
|
+
Output the report content to the user in the conversation.
|
|
318
294
|
|
|
319
|
-
|
|
320
|
-
wazir capture event --run <run-id> --event phase_exit --phase discover --status completed
|
|
321
|
-
```
|
|
295
|
+
---
|
|
322
296
|
|
|
323
|
-
|
|
297
|
+
# Interaction Modes
|
|
324
298
|
|
|
325
|
-
|
|
326
|
-
wazir capture event --run <run-id> --event phase_enter --phase specify --status in_progress
|
|
327
|
-
```
|
|
299
|
+
The `interaction_mode` field in run-config controls how the pipeline interacts with the user:
|
|
328
300
|
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
|
|
332
|
-
|
|
333
|
-
|
|
301
|
+
| Mode | Inline modifier | Behavior | Best for |
|
|
302
|
+
|------|----------------|----------|----------|
|
|
303
|
+
| **`guided`** | (default) | Pipeline runs, pauses at phase checkpoints for user approval. Current default behavior. | Most work |
|
|
304
|
+
| **`auto`** | `/wazir auto ...` | No human checkpoints. Codex reviews all. Gating agent decides continue/loop_back/escalate. Stops ONLY on escalate. | Overnight, clear spec, well-understood domain |
|
|
305
|
+
| **`interactive`** | `/wazir interactive ...` | More questions, more discussion, co-designs with user. Researcher presents options. Executor checks approach before coding. | Ambiguous requirements, new domain, learning |
|
|
334
306
|
|
|
335
|
-
|
|
336
|
-
wazir capture event --run <run-id> --event phase_exit --phase specify --status completed
|
|
337
|
-
```
|
|
307
|
+
## `auto` mode constraints
|
|
338
308
|
|
|
339
|
-
|
|
309
|
+
- **Codex REQUIRED** — refuse to start auto mode if `multi_tool.codex` is not configured in `.wazir/state/config.json`. Error: "Auto mode requires an external reviewer (Codex). Configure it first or use guided mode."
|
|
310
|
+
- **On escalate:** STOP immediately, write the escalation reason to `.wazir/runs/<id>/escalations/`, and wait for user input
|
|
311
|
+
- **Wall-clock limit:** default 4 hours. If exceeded, stop with escalation.
|
|
312
|
+
- **Never auto-commits to main:** always works on a feature branch
|
|
313
|
+
- All checkpoints (AskUserQuestion) are skipped — gating agent evaluates phase reports and decides
|
|
340
314
|
|
|
341
|
-
|
|
342
|
-
wazir capture event --run <run-id> --event phase_enter --phase author --status in_progress
|
|
343
|
-
```
|
|
315
|
+
## `guided` mode (default)
|
|
344
316
|
|
|
345
|
-
|
|
346
|
-
Content-author writes non-code content artifacts.
|
|
347
|
-
Approval gate: human approval required (not a review loop).
|
|
348
|
-
Skip condition: disabled by default. Enable for content-heavy projects.
|
|
317
|
+
Current behavior — no changes needed. Checkpoints at phase boundaries, user approves before advancing.
|
|
349
318
|
|
|
350
|
-
|
|
351
|
-
|
|
319
|
+
## `interactive` mode
|
|
320
|
+
|
|
321
|
+
- **Clarifier:** asks more detailed questions, presents research findings with options: "I found 3 approaches — which interests you?"
|
|
322
|
+
- **Executor:** checks approach before coding: "I'm about to implement auth with Supabase — sound right?"
|
|
323
|
+
- **Reviewer:** discusses findings with the user instead of only presenting a verdict: "I found a potential auth bypass. Here's why I think it's high severity. Do you agree?"
|
|
324
|
+
- Slower but highest quality for complex/ambiguous work
|
|
325
|
+
|
|
326
|
+
## Mode checking in phase skills
|
|
327
|
+
|
|
328
|
+
All phase skills check `interaction_mode` from run-config at every checkpoint:
|
|
329
|
+
|
|
330
|
+
```
|
|
331
|
+
# Read from run-config
|
|
332
|
+
interaction_mode = run_config.interaction_mode ?? 'guided'
|
|
333
|
+
|
|
334
|
+
# At each checkpoint:
|
|
335
|
+
if interaction_mode == 'auto':
|
|
336
|
+
# Skip checkpoint, let gating agent decide
|
|
337
|
+
elif interaction_mode == 'interactive':
|
|
338
|
+
# More detailed question, present options, discuss
|
|
339
|
+
else:
|
|
340
|
+
# guided — standard checkpoint with AskUserQuestion
|
|
352
341
|
```
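The pseudocode above translates fairly directly into JavaScript. In this sketch the function names and the returned action labels are hypothetical; the `guided` default, the `multi_tool.codex` requirement, and the error message are taken from the text.

```javascript
// Decide what a checkpoint should do for the current run (labels are hypothetical).
function checkpointAction(runConfig) {
  const mode = runConfig.interaction_mode ?? "guided";
  if (mode === "auto") return "defer-to-gating-agent"; // skip the human checkpoint
  if (mode === "interactive") return "discuss-with-user"; // more detail, present options
  return "ask-user"; // guided: standard checkpoint
}

// Refuse to start auto mode when no external reviewer is configured.
function assertAutoModeAllowed(runConfig, stateConfig) {
  const mode = runConfig.interaction_mode ?? "guided";
  if (mode === "auto" && !stateConfig?.multi_tool?.codex) {
    throw new Error(
      "Auto mode requires an external reviewer (Codex). Configure it first or use guided mode."
    );
  }
}

console.log(checkpointAction({}));
console.log(checkpointAction({ interaction_mode: "auto" }));
```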
|
|
353
342
|
|
|
354
|
-
|
|
343
|
+
---
|
|
344
|
+
|
|
345
|
+
# Two-Level Phase Model
|
|
346
|
+
|
|
347
|
+
The pipeline has 4 top-level **phases**, each containing multiple **workflows** with review loops:
|
|
348
|
+
|
|
349
|
+
```
|
|
350
|
+
Phase 1: Init
|
|
351
|
+
└── (inline — no sub-workflows)
|
|
352
|
+
|
|
353
|
+
Phase 2: Clarifier
|
|
354
|
+
├── discover (research) ← research-review loop
|
|
355
|
+
├── clarify ← clarification-review loop
|
|
356
|
+
├── specify ← spec-challenge loop
|
|
357
|
+
├── author (adaptive) ← approval gate
|
|
358
|
+
├── design ← design-review loop
|
|
359
|
+
└── plan ← plan-review loop
|
|
360
|
+
|
|
361
|
+
Phase 3: Executor
|
|
362
|
+
├── execute (per-task) ← task-review loop per task
|
|
363
|
+
└── verify
|
|
364
|
+
|
|
365
|
+
Phase 4: Final Review
|
|
366
|
+
├── review (final) ← scored review
|
|
367
|
+
├── learn
|
|
368
|
+
└── prepare_next
|
|
369
|
+
```
|
|
355
370
|
|
|
371
|
+
**Event capture uses both levels.** When emitting phase events, include `--parent-phase`:
|
|
356
372
|
```bash
|
|
357
|
-
wazir capture event --run <
|
|
373
|
+
wazir capture event --run <id> --event phase_enter --phase discover --parent-phase clarifier --status in_progress
|
|
358
374
|
```
|
|
359
375
|
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
|
|
363
|
-
|
|
364
|
-
|
|
365
|
-
|
|
366
|
-
|
|
376
|
+
**Progress markers between workflows:** After each workflow completes, output:
|
|
377
|
+
> Phase 2: Clarifier > Workflow: specify (3 of 6 workflows complete)
|
|
378
|
+
|
|
379
|
+
**`wazir status` shows both levels:** "Phase 2: Clarifier > Workflow: specify"
|
|
380
|
+
|
|
381
|
+
---
|
|
382
|
+
|
|
383
|
+
# Phase 2: Clarifier
|
|
384
|
+
|
|
385
|
+
**Before starting this phase, output to the user:**
|
|
386
|
+
|
|
387
|
+
> **Clarifier Phase** — About to research your codebase, clarify requirements, harden the spec, brainstorm designs, and produce an execution plan.
|
|
388
|
+
>
|
|
389
|
+
> **Why this matters:** Without this, I'd guess your tech stack, misunderstand constraints, miss edge cases in the spec, and build the wrong architecture. Every ambiguity left unresolved here becomes a bug or rework cycle later.
|
|
390
|
+
>
|
|
391
|
+
> **Looking for:** Unstated assumptions, scope boundaries, conflicting requirements, missing acceptance criteria
|
|
367
392
|
|
|
368
393
|
```bash
|
|
369
|
-
wazir capture event --run <run-id> --event
|
|
394
|
+
wazir capture event --run <run-id> --event phase_enter --phase clarifier --status in_progress
|
|
370
395
|
```
|
|
371
396
|
|
|
372
|
-
|
|
397
|
+
Invoke the `wz:clarifier` skill. It handles all sub-workflows internally:
|
|
398
|
+
|
|
399
|
+
1. **Source Capture** — fetch URLs from input
|
|
400
|
+
2. **Research** (discover workflow) — codebase + external research
|
|
401
|
+
3. **Clarify** (clarify workflow) — scope, constraints, assumptions
|
|
402
|
+
4. **Spec Harden** (specify + spec-challenge workflows) — measurable spec
|
|
403
|
+
5. **Brainstorm** (design + design-review workflows) — design approaches
|
|
404
|
+
6. **Plan** (plan + plan-review workflows) — execution plan
|
|
405
|
+
|
|
406
|
+
Each sub-workflow has its own review loop, and the user is given a checkpoint between major steps.
|
|
407
|
+
|
|
408
|
+
### Scope Invariant
|
|
409
|
+
|
|
410
|
+
**Hard rule:** `items_in_plan >= items_in_input` unless the user explicitly approves scope reduction. The clarifier MUST NOT autonomously tier, defer, or drop items from the user's input. It can suggest prioritization, but the decision belongs to the user.
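The invariant can be read as a simple predicate over two counts plus an approval flag. The sketch below assumes the items have already been counted; `scope_invariant_holds` and its parameter names are hypothetical.

```python
def scope_invariant_holds(items_in_input, items_in_plan, user_approved_reduction=False):
    """True when the plan keeps every input item, or the user approved a reduction."""
    return items_in_plan >= items_in_input or user_approved_reduction
```

A plan with fewer items than the input only passes when `user_approved_reduction` is set, which mirrors the rule that the decision belongs to the user.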
|
|
411
|
+
|
|
412
|
+
Output: approved spec + design + execution plan in `.wazir/runs/latest/clarified/`.
|
|
413
|
+
|
|
414
|
+
**After completing this phase, output to the user:**
|
|
415
|
+
|
|
416
|
+
> **Clarifier Phase complete.**
|
|
417
|
+
>
|
|
418
|
+
> **Found:** [N] ambiguities resolved, [N] assumptions made explicit, [N] scope boundaries drawn, [N] acceptance criteria hardened
|
|
419
|
+
>
|
|
420
|
+
> **Without this phase:** Requirements would be interpreted differently across tasks, acceptance criteria would be vague and untestable, the design would be ad-hoc, and the plan would miss dependency ordering
|
|
421
|
+
>
|
|
422
|
+
> **Changed because of this work:** [List spec tightening changes, resolved questions, design decisions, scope adjustments]
|
|
373
423
|
|
|
374
424
|
```bash
|
|
375
|
-
wazir capture event --run <run-id> --event
|
|
425
|
+
wazir capture event --run <run-id> --event phase_exit --phase clarifier --status completed
|
|
376
426
|
```
|
|
377
427
|
|
|
378
|
-
|
|
379
|
-
Planner produces execution plan and task specs.
|
|
380
|
-
Review: plan-review loop (`--mode plan-review`, plan dimensions).
|
|
381
|
-
Pass count: quick=3, standard=5, deep=7. No extension.
|
|
382
|
-
Checkpoint: user approves plan.
|
|
383
|
-
|
|
428
|
+
Run the phase report and display savings to the user:
|
|
384
429
|
```bash
|
|
385
|
-
wazir
|
|
430
|
+
wazir report phase --run <run-id> --phase clarifier
|
|
431
|
+
wazir stats --run <run-id>
|
|
386
432
|
```
|
|
387
433
|
|
|
388
|
-
|
|
434
|
+
**Show savings in conversation output:**
|
|
435
|
+
> **Context savings this phase:** Used wazir index for [N] queries and context-mode for [M] commands, saving ~[X] tokens ([Y]% reduction). Without these, this phase would have consumed [A] tokens instead of [B].
|
|
436
|
+
|
|
437
|
+
Output the report content to the user in the conversation.
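The savings line has four placeholders that are related by simple arithmetic. The sketch below assumes `[A]` is the token count without the tools, `[B]` the count with them, `[X]` their difference, and `[Y]` that difference as a percentage of `[A]`; this reading is inferred from the template wording, and `context_savings` is a hypothetical helper.

```python
def context_savings(tokens_without, tokens_with):
    """Return (tokens saved, percent reduction) for one phase."""
    if tokens_without <= 0:
        raise ValueError("tokens_without must be positive")
    saved = tokens_without - tokens_with
    percent = round(100 * saved / tokens_without, 1)
    return saved, percent
```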
|
|
438
|
+
|
|
439
|
+
---
|
|
440
|
+
|
|
441
|
+
# Phase 3: Executor
|
|
442
|
+
|
|
443
|
+
**Before starting this phase, output to the user:**
|
|
444
|
+
|
|
445
|
+
> **Executor Phase** — About to implement [N] tasks in dependency order with TDD (test-first), per-task code review, and verification before each commit.
|
|
446
|
+
>
|
|
447
|
+
> **Why this matters:** Without this discipline, tests get skipped, edge cases get missed, integration points break silently, and review catches problems too late when they're expensive to fix.
|
|
448
|
+
>
|
|
449
|
+
> **Looking for:** Correct dependency ordering, test coverage for each task, clean per-task review passes, no implementation drift from the approved plan
|
|
450
|
+
|
|
451
|
+
## Phase Gate (Hard Gate)
|
|
452
|
+
|
|
453
|
+
Before entering the Executor phase, verify ALL clarifier artifacts exist:
|
|
454
|
+
|
|
455
|
+
- [ ] `.wazir/runs/latest/clarified/clarification.md`
|
|
456
|
+
- [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
|
|
457
|
+
- [ ] `.wazir/runs/latest/clarified/design.md`
|
|
458
|
+
- [ ] `.wazir/runs/latest/clarified/execution-plan.md`
|
|
459
|
+
|
|
460
|
+
If ANY file is missing, **STOP**:
|
|
461
|
+
|
|
462
|
+
> **Cannot enter Executor phase: missing prerequisite artifacts from Clarifier.**
|
|
463
|
+
>
|
|
464
|
+
> Missing: [list missing files]
|
|
465
|
+
>
|
|
466
|
+
> The Clarifier phase must complete before execution can begin. Run `/wazir:clarifier` first.
|
|
467
|
+
|
|
468
|
+
**Do NOT skip this check. Do NOT rationalize that the input is "clear enough" to bypass clarification. Every pipeline run must produce these artifacts.**
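The gate amounts to checking that four files exist and reporting any that do not. Here is a minimal sketch of that check; the function name and the `run_dir` parameter are hypothetical, while the file names come from the checklist above.

```python
from pathlib import Path

# Artifact names taken from the checklist; the constant name itself is illustrative.
REQUIRED_CLARIFIER_ARTIFACTS = [
    "clarification.md",
    "spec-hardened.md",
    "design.md",
    "execution-plan.md",
]


def missing_clarifier_artifacts(run_dir=".wazir/runs/latest"):
    """Return the required clarifier artifacts that are absent from <run_dir>/clarified."""
    clarified = Path(run_dir) / "clarified"
    return [
        name
        for name in REQUIRED_CLARIFIER_ARTIFACTS
        if not (clarified / name).is_file()
    ]
```

An empty list means the gate passes. A non-empty list is what would fill the `Missing: [list missing files]` line of the stop message.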
|
|
389
469
|
|
|
390
470
|
```bash
|
|
391
|
-
wazir capture event --run <run-id> --event phase_enter --phase
|
|
471
|
+
wazir capture event --run <run-id> --event phase_enter --phase executor --status in_progress
|
|
392
472
|
```
|
|
393
473
|
|
|
394
|
-
**Pre-execution gate
|
|
474
|
+
**Pre-execution gate:**
|
|
395
475
|
|
|
396
476
|
```bash
|
|
397
477
|
wazir validate manifest && wazir validate hooks
|
|
398
|
-
#
|
|
478
|
+
# Hard gate — stop if either fails.
|
|
399
479
|
```
|
|
400
480
|
|
|
401
|
-
Invoke executor skill
|
|
402
|
-
Per-task review: task-review loop (`--mode task-review --task-id <NNN>`,
|
|
403
|
-
5 task-execution dimensions) before each commit.
|
|
404
|
-
Review logs: `execute-task-<NNN>-review-pass-<N>.md`
|
|
405
|
-
Cap tracking: `wazir capture loop-check --task-id <NNN>`
|
|
406
|
-
Codex error handling: non-zero exit -> codex-unavailable, self-review only.
|
|
407
|
-
NOTE: per-task review is NOT the final review.
|
|
481
|
+
Invoke the `wz:executor` skill. It handles:
|
|
408
482
|
|
|
409
|
-
|
|
483
|
+
1. **Execute** (execute workflow) — per-task TDD cycle with review before each commit
|
|
484
|
+
2. **Verify** (verify workflow) — deterministic verification of all claims
|
|
410
485
|
|
|
411
|
-
|
|
412
|
-
|
|
413
|
-
```
|
|
486
|
+
Per-task review runs with `--mode task-review` and covers 5 task-execution dimensions.
|
|
487
|
+
Tasks always run sequentially.
|
|
414
488
|
|
|
415
|
-
|
|
489
|
+
Output: code changes + verification proof in `.wazir/runs/latest/artifacts/`.
|
|
416
490
|
|
|
417
|
-
|
|
418
|
-
wazir capture event --run <run-id> --event phase_enter --phase verify --status in_progress
|
|
419
|
-
```
|
|
491
|
+
**After completing this phase, output to the user:**
|
|
420
492
|
|
|
421
|
-
|
|
422
|
-
|
|
493
|
+
> **Executor Phase complete.**
|
|
494
|
+
>
|
|
495
|
+
> **Found:** [N]/[N] tasks implemented, [N] tests written, [N] per-task review passes completed, [N] findings fixed before commit
|
|
496
|
+
>
|
|
497
|
+
> **Without this phase:** Code would ship without tests, review findings would accumulate until final review (10x more expensive to fix), and verification claims would be unsubstantiated
|
|
498
|
+
>
|
|
499
|
+
> **Changed because of this work:** [List of commits with conventional commit messages, test counts, verification evidence collected]
|
|
423
500
|
|
|
424
501
|
```bash
|
|
425
|
-
wazir capture event --run <run-id> --event phase_exit --phase
|
|
502
|
+
wazir capture event --run <run-id> --event phase_exit --phase executor --status completed
|
|
426
503
|
```
|
|
427
504
|
|
|
428
|
-
|
|
429
|
-
|
|
505
|
+
Run the phase report and display savings to the user:
|
|
430
506
|
```bash
|
|
431
|
-
wazir
|
|
507
|
+
wazir report phase --run <run-id> --phase executor
|
|
508
|
+
wazir stats --run <run-id>
|
|
432
509
|
```
|
|
433
510
|
|
|
434
|
-
|
|
435
|
-
|
|
436
|
-
|
|
437
|
-
|
|
511
|
+
Output the report content to the user in the conversation.
|
|
512
|
+
|
|
513
|
+
**Show savings in conversation output:**
|
|
514
|
+
> **Context savings this phase:** Used wazir index for [N] queries and context-mode for [M] commands, saving ~[X] tokens ([Y]% reduction).
|
|
515
|
+
|
|
516
|
+
---
|
|
517
|
+
|
|
518
|
+
# Phase 4: Final Review
|
|
519
|
+
|
|
520
|
+
**Before starting this phase, output to the user:**
|
|
521
|
+
|
|
522
|
+
> **Final Review Phase** — About to run adversarial 7-dimension review comparing the implementation against your original input, extract durable learnings, and prepare the handoff.
|
|
523
|
+
>
|
|
524
|
+
> **Why this matters:** Without this, implementation drift ships undetected, missing acceptance criteria go unnoticed, untested code paths hide bugs, and the same mistakes repeat in the next run.
|
|
525
|
+
>
|
|
526
|
+
> **Looking for:** Spec violations, missing features, dead code paths, unsubstantiated claims, scope creep, security gaps, stale documentation
|
|
527
|
+
|
|
528
|
+
## Phase Gate (Hard Gate)
|
|
529
|
+
|
|
530
|
+
Before entering the Final Review phase, verify the Executor produced its proof:
|
|
531
|
+
|
|
532
|
+
- [ ] `.wazir/runs/latest/artifacts/verification-proof.md`
|
|
533
|
+
|
|
534
|
+
If missing, **STOP**:
|
|
535
|
+
|
|
536
|
+
> **Cannot enter Final Review: missing verification proof from Executor.**
|
|
537
|
+
>
|
|
538
|
+
> The Executor phase must complete and produce `verification-proof.md` before final review. Run `/wazir:executor` first.
|
|
438
539
|
|
|
439
540
|
```bash
|
|
440
|
-
wazir capture event --run <run-id> --event
|
|
541
|
+
wazir capture event --run <run-id> --event phase_enter --phase final_review --status in_progress
|
|
441
542
|
```
|
|
442
543
|
|
|
443
|
-
|
|
544
|
+
This phase validates the implementation against the **ORIGINAL INPUT** (not the task specs — the executor's per-task reviewer already covered that).
|
|
444
545
|
|
|
445
|
-
|
|
446
|
-
Extract durable learnings from the completed run.
|
|
447
|
-
No review loop. Learnings require explicit scope tags.
|
|
448
|
-
Skip condition: disabled by default. Enable for retrospective runs.
|
|
546
|
+
### 4a: Review (reviewer role in final mode)
|
|
449
547
|
|
|
450
|
-
|
|
548
|
+
Invoke `wz:reviewer --mode final`.
|
|
549
|
+
Runs a 7-dimension scored review that compares the implementation against the original user input.
|
|
550
|
+
Score 0-70. Verdicts: PASS (56+), NEEDS MINOR FIXES (42-55), NEEDS REWORK (28-41), FAIL (0-27).
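The score bands map to verdicts as a simple threshold ladder. A minimal sketch, with `verdict` as a hypothetical function name and the thresholds copied from the line above:

```python
def verdict(score):
    """Map a 0-70 final review score to its verdict label."""
    if not 0 <= score <= 70:
        raise ValueError("score must be between 0 and 70")
    if score >= 56:
        return "PASS"
    if score >= 42:
        return "NEEDS MINOR FIXES"
    if score >= 28:
        return "NEEDS REWORK"
    return "FAIL"
```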
|
|
451
551
|
|
|
452
|
-
|
|
453
|
-
Prepare context and handoff for the next run.
|
|
454
|
-
No review loop. No implicit carry-forward of unapproved learnings.
|
|
455
|
-
Skip condition: disabled by default.
|
|
552
|
+
### 4b: Learn (learner role)
|
|
456
553
|
|
|
457
|
-
|
|
554
|
+
Extract durable learnings from the completed run:
|
|
555
|
+
- Scan all review findings (internal + Codex)
|
|
556
|
+
- Propose learnings to `memory/learnings/proposed/`
|
|
557
|
+
- Findings that recur across 2+ runs → auto-proposed as learnings
|
|
558
|
+
- Learnings require explicit scope tags (roles, stacks, concerns)
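The "recurs across 2+ runs" rule can be sketched as a count of distinct runs per finding. How findings are matched across runs is not specified here, so the example assumes exact string equality; `recurring_findings` and its input shape are hypothetical.

```python
from collections import defaultdict


def recurring_findings(findings_by_run, min_runs=2):
    """Return findings that appear in at least min_runs distinct runs, sorted by text."""
    runs_seen = defaultdict(set)
    for run_id, findings in findings_by_run.items():
        for finding in findings:
            runs_seen[finding].add(run_id)
    return sorted(f for f, runs in runs_seen.items() if len(runs) >= min_runs)
```

Counting distinct run ids rather than raw occurrences keeps a finding that is reported twice within a single run from qualifying on its own.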
|
|
458
559
|
|
|
459
|
-
###
|
|
560
|
+
### 4c: Prepare Next (planner role)
|
|
460
561
|
|
|
461
|
-
|
|
562
|
+
Prepare context and handoff for the next run:
|
|
563
|
+
- Write handoff document
|
|
564
|
+
- Compress/archive unneeded files
|
|
565
|
+
- Record what's left to do
|
|
462
566
|
|
|
463
|
-
|
|
464
|
-
- If spec exists but no design: resume at 4e (brainstorm)
|
|
465
|
-
- If design exists but no plan: resume at 4f (plan)
|
|
466
|
-
- If plan exists but no task artifacts: resume at 4g (execute)
|
|
467
|
-
- If task artifacts exist but no verification: resume at 4h (verify)
|
|
468
|
-
- If verification exists: resume at 4i (final review)
|
|
567
|
+
**After completing this phase, output to the user:**
|
|
469
568
|
|
|
470
|
-
|
|
471
|
-
|
|
472
|
-
> **
|
|
569
|
+
> **Final Review Phase complete.**
|
|
570
|
+
>
|
|
571
|
+
> **Found:** [N] findings across 7 dimensions, [N] blocking issues, [N] warnings, [N] learnings proposed for future runs
|
|
473
572
|
>
|
|
474
|
-
> **
|
|
475
|
-
>
|
|
476
|
-
>
|
|
573
|
+
> **Without this phase:** Implementation drift from the original request would ship undetected, untested paths would hide production bugs, and recurring mistakes would never get captured as learnings
|
|
574
|
+
>
|
|
575
|
+
> **Changed because of this work:** [List of findings fixed, score achieved, learnings extracted, handoff prepared]
|
|
576
|
+
|
|
577
|
+
```bash
|
|
578
|
+
wazir capture event --run <run-id> --event phase_exit --phase final_review --status completed
|
|
579
|
+
```
|
|
477
580
|
|
|
478
|
-
|
|
581
|
+
Run the phase report and display it to the user:
|
|
582
|
+
```bash
|
|
583
|
+
wazir report phase --run <run-id> --phase final_review
|
|
584
|
+
```
|
|
479
585
|
|
|
480
|
-
|
|
586
|
+
Output the report content to the user in the conversation.
|
|
587
|
+
|
|
588
|
+
---
|
|
589
|
+
|
|
590
|
+
## Step 5: CHANGELOG + Gitflow Validation (Hard Gates)
|
|
591
|
+
|
|
592
|
+
Before presenting results:
|
|
593
|
+
|
|
594
|
+
```bash
|
|
595
|
+
wazir validate changelog --require-entries --base main
|
|
596
|
+
wazir validate commits --base main
|
|
597
|
+
```
|
|
598
|
+
|
|
599
|
+
Both must pass before a PR is created. These are hard gates, not warnings.
|
|
600
|
+
|
|
601
|
+
## Step 6: Present Results
|
|
602
|
+
|
|
603
|
+
After the reviewer completes, present the verdict with numbered options:
|
|
481
604
|
|
|
482
605
|
### If PASS (score 56+):
|
|
483
606
|
|
|
484
607
|
> **Result: PASS (score/70)**
|
|
485
|
-
|
|
486
|
-
|
|
487
|
-
|
|
488
|
-
|
|
489
|
-
|
|
490
|
-
|
|
491
|
-
|
|
608
|
+
|
|
609
|
+
Ask the user via AskUserQuestion:
|
|
610
|
+
- **Question:** "Pipeline passed. What would you like to do next?"
|
|
611
|
+
- **Options:**
|
|
612
|
+
1. "Create a PR" *(Recommended)*
|
|
613
|
+
2. "Merge directly"
|
|
614
|
+
3. "Review the changes first"
|
|
615
|
+
|
|
616
|
+
Wait for the user's selection before continuing.
|
|
492
617
|
|
|
493
618
|
### If NEEDS MINOR FIXES (score 42-55):
|
|
494
619
|
|
|
495
620
|
> **Result: NEEDS MINOR FIXES (score/70)**
|
|
496
|
-
|
|
497
|
-
|
|
498
|
-
|
|
499
|
-
|
|
500
|
-
|
|
501
|
-
|
|
502
|
-
|
|
621
|
+
|
|
622
|
+
Ask the user via AskUserQuestion:
|
|
623
|
+
- **Question:** "Minor issues found. How should we handle them?"
|
|
624
|
+
- **Options:**
|
|
625
|
+
1. "Auto-fix and re-review" *(Recommended)*
|
|
626
|
+
2. "Fix manually"
|
|
627
|
+
3. "Accept as-is"
|
|
628
|
+
|
|
629
|
+
Wait for the user's selection before continuing.
|
|
503
630
|
|
|
504
631
|
### If NEEDS REWORK (score 28-41):
|
|
505
632
|
|
|
506
633
|
> **Result: NEEDS REWORK (score/70)**
|
|
507
|
-
|
|
508
|
-
|
|
509
|
-
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
|
|
513
|
-
|
|
634
|
+
|
|
635
|
+
Ask the user via AskUserQuestion:
|
|
636
|
+
- **Question:** "Significant issues found. How should we proceed?"
|
|
637
|
+
- **Options:**
|
|
638
|
+
1. "Re-run affected tasks" *(Recommended)*
|
|
639
|
+
2. "Review findings in detail"
|
|
640
|
+
3. "Abandon this run"
|
|
641
|
+
|
|
642
|
+
Wait for the user's selection before continuing.
|
|
514
643
|
|
|
515
644
|
### If FAIL (score 0-27):
|
|
516
645
|
|
|
517
646
|
> **Result: FAIL (score/70)**
|
|
518
647
|
>
|
|
519
|
-
>
|
|
520
|
-
>
|
|
521
|
-
> Something fundamental went wrong. Review the findings above and decide how to proceed.
|
|
648
|
+
> Something fundamental went wrong. Review the findings above.
|
|
522
649
|
|
|
523
650
|
### Run Summary
|
|
524
651
|
|
|
525
|
-
After presenting results (regardless of verdict), capture the run summary:
|
|
526
|
-
|
|
527
652
|
```bash
|
|
528
653
|
wazir capture summary --run <run-id>
|
|
529
654
|
wazir status --run <run-id> --json
|
|
@@ -531,19 +656,18 @@ wazir status --run <run-id> --json
|
|
|
531
656
|
|
|
532
657
|
## Error Handling
|
|
533
658
|
|
|
534
|
-
If any phase fails
|
|
535
|
-
|
|
536
|
-
1. Report which phase failed and why
|
|
537
|
-
2. Present recovery options:
|
|
659
|
+
If any phase fails:
|
|
538
660
|
|
|
539
661
|
> **Phase [name] failed: [reason]**
|
|
540
|
-
>
|
|
541
|
-
> **What would you like to do?**
|
|
542
|
-
> 1. **Retry this phase** (Recommended)
|
|
543
|
-
> 2. **Skip and continue** (only if phase is adaptive, not core)
|
|
544
|
-
> 3. **Abort the run**
|
|
545
662
|
|
|
546
|
-
|
|
663
|
+
Ask the user via AskUserQuestion:
|
|
664
|
+
- **Question:** "Phase [name] failed: [reason]. How should we proceed?"
|
|
665
|
+
- **Options:**
|
|
666
|
+
1. "Retry this phase" *(Recommended)*
|
|
667
|
+
2. "Skip and continue" *(only if the workflows within the phase are adaptive)*
|
|
668
|
+
3. "Abort the run"
|
|
669
|
+
|
|
670
|
+
Wait for the user's selection before continuing.
|
|
547
671
|
|
|
548
672
|
---
|
|
549
673
|
|
|
@@ -551,27 +675,22 @@ The run config persists, so running `/wazir` again will detect the partial state
|
|
|
551
675
|
|
|
552
676
|
Triggered by `/wazir audit` or `/wazir audit <focus>`.
|
|
553
677
|
|
|
554
|
-
Runs a structured codebase audit. Invokes the `run-audit` skill
|
|
678
|
+
Runs a structured codebase audit. Invokes the `run-audit` skill.
|
|
555
679
|
|
|
556
|
-
|
|
680
|
+
Parse inline audit types: `/wazir audit security` → skip Question 1.
|
|
557
681
|
|
|
558
|
-
|
|
682
|
+
After audit:
|
|
559
683
|
|
|
560
|
-
|
|
561
|
-
-
|
|
562
|
-
-
|
|
684
|
+
Ask the user via AskUserQuestion:
|
|
685
|
+
- **Question:** "Audit complete. What would you like to do with the findings?"
|
|
686
|
+
- **Options:**
|
|
687
|
+
1. "Review the findings" *(Recommended)*
|
|
688
|
+
2. "Generate a fix plan"
|
|
689
|
+
3. "Run the pipeline on the fix plan"
|
|
563
690
|
|
|
564
|
-
|
|
691
|
+
Wait for the user's selection before continuing.
|
|
565
692
|
|
|
566
|
-
|
|
567
|
-
|
|
568
|
-
> **Audit complete. What would you like to do?**
|
|
569
|
-
>
|
|
570
|
-
> 1. **Review the findings** (Recommended)
|
|
571
|
-
> 2. **Generate a fix plan** — turn findings into implementation tasks
|
|
572
|
-
> 3. **Run the pipeline on the fix plan** — generate plan, then execute and review fixes
|
|
573
|
-
|
|
574
|
-
If the user picks option 3, save the findings as the briefing and run the normal pipeline (Steps 3-5) with intent = `bugfix`.
|
|
693
|
+
If the user picks option 3, save the findings as the briefing and run the pipeline with intent = `bugfix`.
|
|
575
694
|
|
|
576
695
|
---
|
|
577
696
|
|
|
@@ -579,89 +698,53 @@ If the user picks option 3, save the findings as the briefing and run the normal
|
|
|
579
698
|
|
|
580
699
|
Triggered by `/wazir prd` or `/wazir prd <run-id>`.
|
|
581
700
|
|
|
582
|
-
Generates a
|
|
583
|
-
|
|
584
|
-
## Pre-Flight
|
|
585
|
-
|
|
586
|
-
1. If a `<run-id>` was provided, use that run's directory. Otherwise, use `.wazir/runs/latest`.
|
|
587
|
-
2. Verify the run has completed artifacts:
|
|
588
|
-
- Design doc in the run's tasks or in `docs/plans/`
|
|
589
|
-
- Task specs in the run's `clarified/`
|
|
590
|
-
- Review results in the run's `reviews/` (if available)
|
|
591
|
-
3. If the run is incomplete or has no artifacts:
|
|
592
|
-
|
|
593
|
-
> **No completed run found. Run `/wazir <your request>` first to create a pipeline run, then use `/wazir prd` to generate the PRD.**
|
|
701
|
+
Generates a PRD from a completed run. Reads the approved design, task specs, execution plan, and review results, then saves the PRD to `docs/prd/YYYY-MM-DD-<topic>-prd.md`.
|
|
594
702
|
|
|
595
|
-
|
|
703
|
+
After generation:
|
|
596
704
|
|
|
597
|
-
|
|
598
|
-
-
|
|
599
|
-
-
|
|
600
|
-
|
|
601
|
-
|
|
602
|
-
|
|
603
|
-
|
|
604
|
-
## Output
|
|
605
|
-
|
|
606
|
-
Generate a PRD and save to `docs/prd/YYYY-MM-DD-<topic>-prd.md`.
|
|
607
|
-
|
|
608
|
-
### PRD Template
|
|
609
|
-
|
|
610
|
-
```markdown
|
|
611
|
-
# Product Requirements Document — <Topic>
|
|
705
|
+
Ask the user via AskUserQuestion:
|
|
706
|
+
- **Question:** "PRD generated. What would you like to do?"
|
|
707
|
+
- **Options:**
|
|
708
|
+
1. "Review the PRD" *(Recommended)*
|
|
709
|
+
2. "Commit it"
|
|
710
|
+
3. "Edit before committing"
|
|
612
711
|
|
|
613
|
-
|
|
614
|
-
**Date:** YYYY-MM-DD
|
|
712
|
+
Wait for the user's selection before continuing.
|
|
615
713
|
|
|
616
|
-
|
|
617
|
-
|
|
618
|
-
[1-2 paragraphs synthesized from the design document's core approach]
|
|
619
|
-
|
|
620
|
-
## What We're Building
|
|
621
|
-
|
|
622
|
-
### Feature Area 1: <name>
|
|
714
|
+
---
|
|
623
715
|
|
|
624
|
-
|
|
625
|
-
**Why:** [rationale from design doc]
|
|
626
|
-
**Requirements:**
|
|
627
|
-
- [ ] [from task spec acceptance criteria]
|
|
628
|
-
- [ ] ...
|
|
716
|
+
## Reasoning Chain Output
|
|
629
717
|
|
|
630
|
-
|
|
631
|
-
...
|
|
718
|
+
Every phase produces reasoning output at two layers:
|
|
632
719
|
|
|
633
|
-
|
|
720
|
+
### Layer 1: Conversation Output (concise — for the user)
|
|
634
721
|
|
|
635
|
-
|
|
722
|
+
Before each major decision, output one trigger sentence and one reasoning sentence:
|
|
636
723
|
|
|
637
|
-
|
|
724
|
+
> "Your request mentions 'overnight autonomous run' — researching how Devin and Karpathy's autoresearch handle this, because unattended runs need different safety constraints than interactive ones."
|
|
638
725
|
|
|
639
|
-
|
|
726
|
+
After each phase, output what was found and a counterfactual:
|
|
640
727
|
|
|
641
|
-
|
|
728
|
+
> "Found: you use Supabase auth (not custom JWT). If I'd skipped research, I would have built JWT middleware — completely wrong."
|
|
642
729
|
|
|
643
|
-
|
|
730
|
+
### Layer 2: File Output (detailed — for learning and reports)
|
|
644
731
|
|
|
645
|
-
|
|
732
|
+
Save the full reasoning chain to `.wazir/runs/<id>/reasoning/phase-<name>-reasoning.md`, with one entry per decision in this format:
|
|
646
733
|
|
|
647
|
-
|
|
734
|
+
```markdown
|
|
735
|
+
### Decision: [title]
|
|
736
|
+
- **Trigger:** What prompted this decision
|
|
737
|
+
- **Options considered:** List of alternatives
|
|
738
|
+
- **Chosen:** The selected option
|
|
739
|
+
- **Reasoning:** Why this option was chosen
|
|
740
|
+
- **Confidence:** high | medium | low
|
|
741
|
+
- **Counterfactual:** What would have gone wrong without this information
|
|
648
742
|
```
|
|
649
743
|
|
|
650
|
-
|
|
651
|
-
|
|
652
|
-
> **PRD generated at `docs/prd/YYYY-MM-DD-<topic>-prd.md`.**
|
|
653
|
-
>
|
|
654
|
-
> **What would you like to do?**
|
|
655
|
-
> 1. **Review the PRD** (Recommended)
|
|
656
|
-
> 2. **Commit it**
|
|
657
|
-
> 3. **Edit before committing**
|
|
658
|
-
|
|
659
|
-
---
|
|
744
|
+
Create the `reasoning/` directory during run init. Every phase skill (clarifier, executor, reviewer) writes its own reasoning file. Counterfactuals appear in BOTH conversation output AND reasoning files.
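One way to keep reasoning entries uniform is to render them from a small record type. The sketch below follows the entry template above field by field; the `Decision` dataclass and `render_decision` function are hypothetical names, not part of any wazir skill.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    title: str
    trigger: str
    options: list
    chosen: str
    reasoning: str
    confidence: str  # expected: "high", "medium", or "low"
    counterfactual: str


def render_decision(d):
    """Render one decision as a markdown entry in the Layer 2 format."""
    if d.confidence not in ("high", "medium", "low"):
        raise ValueError("confidence must be high, medium, or low")
    lines = [
        f"### Decision: {d.title}",
        f"- **Trigger:** {d.trigger}",
        f"- **Options considered:** {', '.join(d.options)}",
        f"- **Chosen:** {d.chosen}",
        f"- **Reasoning:** {d.reasoning}",
        f"- **Confidence:** {d.confidence}",
        f"- **Counterfactual:** {d.counterfactual}",
    ]
    return "\n".join(lines) + "\n"
```

Rejecting unknown confidence values up front keeps the files easy to scan later, since every entry then uses one of the three labels from the template.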
|
|
660
745
|
|
|
661
746
|
## Interaction Rules
|
|
662
747
|
|
|
663
|
-
These rules apply to ALL questions in the pipeline, including those asked by sub-skills (clarifier, executor, reviewer) and audit modes:
|
|
664
|
-
|
|
665
748
|
- **One question at a time** — never combine multiple questions
|
|
666
749
|
- **Numbered options** — always present choices as numbered lists
|
|
667
750
|
- **Mark defaults** — always show "(Recommended)" on the suggested option
|