@wazir-dev/cli 1.1.0 → 1.3.0
- package/CHANGELOG.md +74 -10
- package/README.md +15 -15
- package/assets/demo.cast +47 -0
- package/assets/demo.gif +0 -0
- package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
- package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
- package/docs/concepts/architecture.md +1 -1
- package/docs/concepts/roles-and-workflows.md +2 -0
- package/docs/concepts/why-wazir.md +59 -0
- package/docs/decisions/2026-03-19-deferred-items.md +564 -0
- package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
- package/docs/readmes/INDEX.md +21 -5
- package/docs/readmes/features/expertise/README.md +2 -2
- package/docs/readmes/features/exports/README.md +2 -2
- package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
- package/docs/readmes/features/schemas/README.md +3 -0
- package/docs/readmes/features/skills/README.md +17 -0
- package/docs/readmes/features/skills/clarifier.md +5 -0
- package/docs/readmes/features/skills/claude-cli.md +5 -0
- package/docs/readmes/features/skills/codex-cli.md +5 -0
- package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
- package/docs/readmes/features/skills/executing-plans.md +5 -0
- package/docs/readmes/features/skills/executor.md +5 -0
- package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
- package/docs/readmes/features/skills/gemini-cli.md +5 -0
- package/docs/readmes/features/skills/humanize.md +5 -0
- package/docs/readmes/features/skills/init-pipeline.md +5 -0
- package/docs/readmes/features/skills/receiving-code-review.md +5 -0
- package/docs/readmes/features/skills/requesting-code-review.md +5 -0
- package/docs/readmes/features/skills/reviewer.md +5 -0
- package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
- package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
- package/docs/readmes/features/skills/wazir.md +5 -0
- package/docs/readmes/features/skills/writing-skills.md +5 -0
- package/docs/readmes/features/workflows/prepare-next.md +1 -1
- package/docs/reference/configuration-reference.md +47 -6
- package/docs/reference/hooks.md +1 -0
- package/docs/reference/launch-checklist.md +4 -4
- package/docs/reference/review-loop-pattern.md +119 -9
- package/docs/reference/roles-reference.md +1 -0
- package/docs/reference/skill-tiers.md +147 -0
- package/docs/reference/tooling-cli.md +3 -1
- package/docs/truth-claims.yaml +12 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
- package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
- package/exports/hosts/claude/.claude/commands/verify.md +30 -1
- package/exports/hosts/claude/.claude/settings.json +9 -0
- package/exports/hosts/claude/CLAUDE.md +1 -1
- package/exports/hosts/claude/export.manifest.json +6 -4
- package/exports/hosts/claude/host-package.json +3 -1
- package/exports/hosts/codex/AGENTS.md +1 -1
- package/exports/hosts/codex/export.manifest.json +6 -4
- package/exports/hosts/codex/host-package.json +3 -1
- package/exports/hosts/cursor/.cursor/hooks.json +4 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
- package/exports/hosts/cursor/export.manifest.json +6 -4
- package/exports/hosts/cursor/host-package.json +3 -1
- package/exports/hosts/gemini/GEMINI.md +1 -1
- package/exports/hosts/gemini/export.manifest.json +6 -4
- package/exports/hosts/gemini/host-package.json +3 -1
- package/hooks/context-mode-router +191 -0
- package/hooks/definitions/context_mode_router.yaml +19 -0
- package/hooks/hooks.json +31 -6
- package/hooks/protected-path-write-guard +8 -0
- package/hooks/routing-matrix.json +45 -0
- package/hooks/session-start +62 -1
- package/llms-full.txt +937 -134
- package/package.json +2 -4
- package/schemas/hook.schema.json +2 -1
- package/schemas/phase-report.schema.json +89 -0
- package/schemas/usage.schema.json +25 -1
- package/schemas/wazir-manifest.schema.json +19 -0
- package/skills/brainstorming/SKILL.md +32 -157
- package/skills/clarifier/SKILL.md +289 -111
- package/skills/claude-cli/SKILL.md +320 -0
- package/skills/codex-cli/SKILL.md +260 -0
- package/skills/debugging/SKILL.md +13 -0
- package/skills/design/SKILL.md +13 -0
- package/skills/dispatching-parallel-agents/SKILL.md +13 -0
- package/skills/executing-plans/SKILL.md +13 -0
- package/skills/executor/SKILL.md +139 -19
- package/skills/finishing-a-development-branch/SKILL.md +13 -0
- package/skills/gemini-cli/SKILL.md +260 -0
- package/skills/humanize/SKILL.md +13 -0
- package/skills/init-pipeline/SKILL.md +72 -164
- package/skills/prepare-next/SKILL.md +81 -10
- package/skills/receiving-code-review/SKILL.md +13 -0
- package/skills/requesting-code-review/SKILL.md +13 -0
- package/skills/reviewer/SKILL.md +369 -24
- package/skills/run-audit/SKILL.md +13 -0
- package/skills/scan-project/SKILL.md +13 -0
- package/skills/self-audit/SKILL.md +217 -16
- package/skills/skill-research/SKILL.md +188 -0
- package/skills/subagent-driven-development/SKILL.md +13 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
- package/skills/subagent-driven-development/implementer-prompt.md +8 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
- package/skills/tdd/SKILL.md +13 -0
- package/skills/using-git-worktrees/SKILL.md +13 -0
- package/skills/using-skills/SKILL.md +13 -0
- package/skills/verification/SKILL.md +54 -3
- package/skills/wazir/SKILL.md +464 -381
- package/skills/writing-plans/SKILL.md +14 -1
- package/skills/writing-skills/SKILL.md +13 -0
- package/templates/artifacts/implementation-plan.md +3 -0
- package/templates/artifacts/tasks-template.md +133 -0
- package/templates/examples/phase-report.example.json +48 -0
- package/tooling/src/adapters/composition-engine.js +256 -0
- package/tooling/src/adapters/model-router.js +84 -0
- package/tooling/src/capture/command.js +41 -2
- package/tooling/src/capture/run-config.js +3 -1
- package/tooling/src/capture/store.js +56 -0
- package/tooling/src/capture/usage.js +106 -0
- package/tooling/src/capture/user-input.js +66 -0
- package/tooling/src/checks/ac-matrix.js +256 -0
- package/tooling/src/checks/command-registry.js +12 -0
- package/tooling/src/checks/docs-truth.js +1 -1
- package/tooling/src/checks/security-sensitivity.js +69 -0
- package/tooling/src/checks/skills.js +111 -0
- package/tooling/src/cli.js +31 -20
- package/tooling/src/commands/stats.js +161 -0
- package/tooling/src/commands/validate.js +5 -1
- package/tooling/src/export/compiler.js +33 -37
- package/tooling/src/gating/agent.js +145 -0
- package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
- package/tooling/src/hooks/routing-logic.js +69 -0
- package/tooling/src/init/auto-detect.js +258 -0
- package/tooling/src/init/command.js +38 -170
- package/tooling/src/input/scanner.js +46 -0
- package/tooling/src/reports/command.js +103 -0
- package/tooling/src/reports/phase-report.js +323 -0
- package/tooling/src/state/command.js +160 -0
- package/tooling/src/state/db.js +287 -0
- package/tooling/src/status/command.js +58 -1
- package/tooling/src/verify/proof-collector.js +299 -0
- package/wazir.manifest.yaml +26 -14
- package/workflows/plan-review.md +3 -1
- package/workflows/verify.md +30 -1
@@ -5,9 +5,22 @@ description: Run the clarification pipeline — research, clarify scope, brainst
 
 # Clarifier
 
-
+## Command Routing
+Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+- Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+- Small commands (git status, ls, pwd, wazir CLI) → native Bash
+- If context-mode unavailable, fall back to native Bash with warning
 
-
+## Codebase Exploration
+1. Query `wazir index search-symbols <query>` first
+2. Use `wazir recall file <path> --tier L1` for targeted reads
+3. Fall back to direct file reads ONLY for files identified by index queries
+4. Maximum 10 direct file reads without a justifying index query
+5. If no index exists: `wazir index build && wazir index summarize --tier all`
+
+Run the Clarifier phase — everything from reading input to having an approved execution plan.
+
+**Pacing rule:** This skill has mandatory user checkpoints between sub-workflows. Do NOT skip checkpoints. Do NOT combine sub-workflows. Complete each fully, present output, and wait for explicit user approval before advancing.
 
 Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All reviewer invocations use explicit `--mode`.
 
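The Command Routing rule in this hunk (large commands to context-mode tools, small commands to native Bash, fallback with a warning) can be sketched as a small classifier. This is only an illustration in Python, not the package's `hooks/context-mode-router`. The prefix list, the `route_command` name, and the return labels are assumptions made for the example; the real matrix is in `hooks/routing-matrix.json`.

```python
import shlex
from typing import Optional, Tuple

# Hypothetical prefixes for "large" commands; the real list is in routing-matrix.json.
LARGE_PREFIXES = ("npm test", "pytest", "git diff", "npm ls", "eslint", "make")


def route_command(command: str, context_mode_available: bool) -> Tuple[str, Optional[str]]:
    """Return (target, warning) for a shell command."""
    normalized = " ".join(shlex.split(command))
    is_large = any(
        normalized == prefix or normalized.startswith(prefix + " ")
        for prefix in LARGE_PREFIXES
    )
    if not is_large:
        # Small commands (git status, ls, pwd, wazir CLI) stay on native Bash.
        return ("native-bash", None)
    if context_mode_available:
        return ("context-mode", None)
    # Large command but no context-mode: fall back and warn.
    return ("native-bash", "context-mode unavailable; falling back to native Bash")
```

Treating every unlisted command as small keeps the sketch short; a real router would probably want an explicit default in the matrix.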
@@ -17,8 +30,10 @@ Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All
 
 1. Check `.wazir/state/config.json` exists. If not, run `wazir init` first.
 2. Check `.wazir/input/briefing.md` exists. If not, ask the user what they want to build and save it there.
-3.
-4.
+3. Scan `input/` (project-level) and `.wazir/input/` (state-level) for additional input files. Present what's found.
+4. Read config for `default_depth` and `multi_tool` settings.
+5. **Load accepted learnings:** Glob `memory/learnings/accepted/*.md`. For each accepted learning, read scope tags. Inject learnings whose scope matches the current run's intent/stack into context. Limit: top 10 by confidence, most recent first. This is how prior run insights improve future runs.
+6. Create a run directory if one doesn't exist:
 ```bash
 mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews,clarified}
 ln -sfn run-YYYYMMDD-HHMMSS .wazir/runs/latest
@@ -26,194 +41,357 @@ Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All
 
 ---
 
-##
+## Context-Mode Usage
+
+Read `context_mode` from `.wazir/state/config.json`:
+
+- **If `context_mode.enabled: true`:** Use `fetch_and_index` for URL fetching, `search` for follow-up queries on indexed content. Use `execute` or `execute_file` for large outputs instead of Bash.
+- **If `context_mode.enabled: false`:** Fall back to `WebFetch` for URLs and `Bash` for commands.
+
+---
+
+## Sub-Workflow 1: Research (discover workflow)
+
+**Before starting this phase, output to the user:**
+
+> **Research** — About to scan the codebase and fetch external references to understand the existing architecture, tech stack, and any standards referenced in the briefing.
+>
+> **Why this matters:** Without research, I'd assume the wrong framework version, miss existing patterns in the codebase, and contradict established conventions. Every wrong assumption here cascades into a wrong spec and wrong implementation.
+>
+> **Looking for:** Existing code patterns, dependency versions, external standard definitions, architectural constraints
 
 Delegate to the discover workflow (`workflows/discover.md`):
 
-1.
-
-
-
-(
-
-
-
+1. **Keyword extraction:** Read the briefing and extract concepts/terms that are vague, reference external standards, or use unfamiliar terminology.
+- **When to research:** concept references an external standard by name, uses a tool/library not seen in the codebase, or is ambiguous enough that two agents could interpret it differently.
+- **When NOT to research:** concept is fully defined in the input, or it's a well-known programming concept.
+2. **Fetch sources:** For each concept needing research:
+- Use `fetch_and_index` (if context-mode available) or `WebFetch` to fetch the source.
+- Save fetched content to `.wazir/runs/latest/sources/`.
+- Track each fetch in `sources/manifest.json`.
+3. **Error handling:** 404/unreachable → log failure, continue. Research is best-effort.
+4. The **researcher role** produces the research artifact.
+5. The **reviewer role** runs the research-review loop with `--mode research-review`.
+6. Loop runs for `pass_counts[depth]` passes.
 
 Save result to `.wazir/runs/latest/clarified/research-brief.md`.
 
-
+**After completing this phase, output to the user:**
 
-
+> **Research complete.**
+>
+> **Found:** [N] external sources fetched, [N] codebase patterns identified, [N] architectural constraints documented
+>
+> **Without this phase:** Spec would be built on assumptions instead of evidence — wrong framework APIs, missed existing utilities, contradicted naming conventions
+>
+> **Changed because of this work:** [List of key discoveries — e.g., "found existing auth middleware at src/middleware/auth.ts", "project uses Vitest not Jest"]
+
+### Checkpoint: Research Review
 
 > **Research complete. Here's what I found:**
 >
-> [Summary of
-
-
-
-
-
+> [Summary of codebase state, relevant architecture, external context]
+
+Ask the user via AskUserQuestion:
+- **Question:** "Does the research look complete and accurate?"
+- **Options:**
+1. "Looks good, continue" *(Recommended)*
+2. "Missing context — let me add more information"
+3. "Wrong direction — let me clarify the intent"
 
-
+Wait for the user's selection before continuing.
 
 ---
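The "Fetch sources" and "Error handling" steps of Sub-Workflow 1 describe a best-effort loop: fetch, save, record in a manifest, and continue on failure. A minimal Python sketch of that shape follows. The `fetch_sources` name, the injected `fetch` callable, the `source-NNN.txt` file naming, and the manifest fields are all assumptions for illustration; the diff does not show the real `sources/manifest.json` schema.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def fetch_sources(
    urls: List[str], fetch: Callable[[str], str], sources_dir: Path
) -> List[Dict[str, str]]:
    """Fetch each URL, save the body, and record the outcome in manifest.json."""
    sources_dir.mkdir(parents=True, exist_ok=True)
    manifest: List[Dict[str, str]] = []
    for index, url in enumerate(urls):
        entry = {"url": url}
        try:
            content = fetch(url)
        except Exception as error:
            # 404/unreachable: log the failure and keep going (best-effort).
            entry.update(status="failed", error=str(error))
        else:
            file_name = f"source-{index:03d}.txt"  # hypothetical naming scheme
            (sources_dir / file_name).write_text(content, encoding="utf-8")
            entry.update(status="ok", file=file_name)
        manifest.append(entry)
    manifest_path = sources_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

Passing the fetcher in as a parameter lets the same loop wrap either `fetch_and_index` or `WebFetch`, and makes the failure path easy to exercise without network access.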
 
-##
+## Sub-Workflow 2: Clarify (clarify workflow)
+
+**Before starting this phase, output to the user:**
+
+> **Clarification** — About to transform the briefing and research into a precise scope document with explicit constraints, assumptions, and boundaries.
+>
+> **Why this matters:** Without explicit clarification, "add user auth" could mean OAuth, magic links, or username/password. Every ambiguity left here becomes a 50/50 coin flip during implementation that could produce the wrong feature.
+>
+> **Looking for:** Ambiguous requirements, implicit assumptions, missing constraints, scope boundaries, unresolved questions
+
+### Input Preservation (before producing clarification)
 
-
+1. Glob `.wazir/input/tasks/*.md`. If files exist:
+- Adopt those specs as the starting point — copy content verbatim into the clarification's item descriptions.
+- Enhance with codebase scan + research findings. **Never remove detail — only add.**
+- Every acceptance criterion from input must appear verbatim.
+- Every API endpoint, color hex code, and UI dimension from input must appear in the relevant item section.
+2. If `.wazir/input/tasks/` is empty or missing, synthesize from `briefing.md` alone.
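The Input Preservation rule ("every acceptance criterion from input must appear verbatim") lends itself to a mechanical check. The Python sketch below is one possible way to do it, under the assumption that acceptance criteria are written as markdown bullets; the `missing_verbatim_lines` helper is hypothetical and is not part of the package.

```python
import re
from typing import List


def missing_verbatim_lines(input_spec: str, clarification: str) -> List[str]:
    """Return input bullet lines that do not appear verbatim in the clarification."""
    missing = []
    for line in input_spec.splitlines():
        text = line.strip()
        # Assumption: criteria are markdown bullets starting with "- " or "* ".
        if re.match(r"^[-*] ", text) and text not in clarification:
            missing.append(text)
    return missing
```

An empty result means no bulleted detail was dropped or reworded; any returned line is a candidate violation of the "never remove detail" rule.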
 
-
+### Informed Question Batching (after research, before producing clarification)
+
+Research has completed. You now have codebase context and external findings. Before producing the clarification, ask the user INFORMED questions — informed by the research, not guesses.
+
+**Rules:**
+1. **Research runs FIRST, questions come AFTER.** Never ask questions before research completes.
+2. **Batch questions:** 1-3 batches of 3-7 questions each. Never one-at-a-time.
+3. **Every scope exclusion must be explicitly confirmed by the user.** You MUST NOT decide that something is "out of scope" without asking. If the input doesn't mention docs, ask: "The input doesn't mention documentation — should we include API docs, or is that explicitly out of scope?" Do NOT assume.
+4. **If the input is clear and complete:** Zero questions is fine. State: "Input is clear and specific. No ambiguities detected. Proceeding with clarification."
+5. **In auto mode (`interaction_mode: auto`):** Questions go to the gating agent, not the user.
+6. **In interactive mode (`interaction_mode: interactive`):** More detailed questions, present research findings that informed each question.
+
+**Question format:**
+```
+Based on research, I have [N] questions before proceeding:
+
+**Scope & Intent**
+1. [Question informed by research finding]
+2. [Question about ambiguous requirement]
+
+**Technical Decisions**
+3. [Question about architecture choice discovered during research]
+4. [Question about dependency/framework preference]
+
+**Boundaries**
+5. [Explicit scope boundary question — "Should X be included or excluded?"]
+```
+
+Ask via AskUserQuestion with the full batch. Wait for answers. If answers introduce new ambiguity, ask a follow-up batch (max 3 batches total).
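The size limits in the batching rules (at most 3 batches, at most 7 questions each, zero questions allowed) can be illustrated with a small helper. This Python sketch covers only the size constraint. In the skill itself, later batches are follow-ups driven by the user's answers, not a pre-split list. The `batch_questions` name and the even-split strategy are assumptions.

```python
from typing import List

MAX_PER_BATCH = 7
MAX_BATCHES = 3


def batch_questions(questions: List[str]) -> List[List[str]]:
    """Split questions into at most 3 batches of at most 7, spread evenly."""
    if not questions:
        return []  # zero questions is fine when the input is clear
    if len(questions) > MAX_PER_BATCH * MAX_BATCHES:
        raise ValueError("more than 21 questions; prioritize before asking")
    batch_count = -(-len(questions) // MAX_PER_BATCH)  # ceiling division
    base, extra = divmod(len(questions), batch_count)
    batches, start = [], 0
    for i in range(batch_count):
        size = base + (1 if i < extra else 0)
        batches.append(questions[start:start + size])
        start += size
    return batches
```

Spreading evenly keeps every batch at 4 or more questions once a second batch is needed; a list of one or two questions still yields a single short batch, which the sketch allows instead of padding.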
+
+### Clarification Production
+
+Read the briefing, research brief, user answers to questions, and codebase context. Produce:
+
+- **What** we're building — concrete deliverables
 - **Why** — the motivation and business value
 - **Constraints** — technical, timeline, dependencies
-- **Assumptions** — what we're taking as given (explicitly stated)
-- **Scope boundaries** — what's IN and what's explicitly OUT
-- **Unresolved questions** — anything ambiguous
+- **Assumptions** — what we're taking as given (each explicitly confirmed by user or clearly stated in input)
+- **Scope boundaries** — what's IN and what's explicitly OUT (every exclusion must reference the user's confirmation: "Out of scope per user confirmation in question batch 1, Q5")
+- **Unresolved questions** — anything still ambiguous after question batches
 
 Save to `.wazir/runs/latest/clarified/clarification.md`.
 
-Invoke
+Invoke `wz:reviewer --mode clarification-review`. Resolve findings before presenting to user.
 
-
+**After completing this phase, output to the user:**
 
-
+> **Clarification complete.**
+>
+> **Found:** [N] ambiguities resolved, [N] assumptions documented, [N] scope boundaries defined, [N] items explicitly marked out-of-scope
+>
+> **Without this phase:** Implementation would proceed with hidden assumptions, scope would creep mid-build, and acceptance criteria would be vague enough to pass any implementation
+>
+> **Changed because of this work:** [List of resolved ambiguities — e.g., "clarified auth means OAuth2 with Google provider only", "out-of-scope: mobile responsive for v1"]
+
+### Checkpoint: Clarification Review
 
 > **Here's the clarified scope:**
 >
-> [Full clarification
->
-> **Are there any corrections, missing context, or open questions to resolve?**
-> 1. **Approved — continue to spec hardening**
-> 2. **Needs changes** — [user provides corrections]
-> 3. **Missing important context** — [user adds information]
+> [Full clarification]
 
-
+Ask the user via AskUserQuestion:
+- **Question:** "Does the clarified scope accurately capture what you want to build?"
+- **Options:**
+1. "Approved — continue to spec hardening" *(Recommended)*
+2. "Needs changes — let me provide corrections"
+3. "Missing important context — let me add information"
+
+Wait for the user's selection before continuing. Route feedback: plan corrections → `user-feedback.md`, new requirements → `briefing.md`.
 
 ---
 
-##
+## Sub-Workflow 3: Spec Harden (specify + spec-challenge workflows)
+
+**Before starting this phase, output to the user:**
+
+> **Spec Hardening** — About to convert the clarified scope into a measurable, testable specification and then run adversarial spec-challenge review to find gaps.
+>
+> **Why this matters:** Without hardening, acceptance criteria stay vague ("it should work well") instead of measurable ("response time under 200ms for 95th percentile"). Vague specs pass any implementation, making review meaningless.
+>
+> **Looking for:** Untestable criteria, missing error handling specs, undefined edge cases, performance requirements, security constraints
 
 Delegate to the specify workflow (`workflows/specify.md`):
 
-1. The **specifier role** produces a measurable spec from
-
-
-(`workflows/spec-challenge.md`) with `--mode spec-challenge`.
-3. The specifier resolves findings from each pass.
-4. Loop runs for `pass_counts[depth]` passes.
+1. The **specifier role** produces a measurable spec from clarification + research.
+2. Invoke `wz:reviewer --mode spec-challenge`.
+3. Loop runs for `pass_counts[depth]` passes.
 
 Save result to `.wazir/runs/latest/clarified/spec-hardened.md`.
 
-
+**After completing this phase, output to the user:**
 
-
+> **Spec Hardening complete.**
+>
+> **Found:** [N] acceptance criteria tightened, [N] edge cases added, [N] error handling requirements specified, [N] spec-challenge findings resolved
+>
+> **Without this phase:** Acceptance criteria would be subjective, review would have no concrete standard to measure against, and "done" would mean whatever the implementer decided
+>
+> **Changed because of this work:** [List of hardening changes — e.g., "added 404 handling spec for missing resources", "specified max payload size of 5MB", "added rate limit requirement of 100 req/min"]
+
+### Content-Author Detection
+
+After spec hardening, scan the spec for content needs. Auto-enable the `author` workflow if the spec mentions any of:
+- Database seeding, seed data, fixtures, sample records
+- Sample content, placeholder text, demo data
+- Test fixtures, mock API responses, test data files
+- Translations, i18n strings, localization
+- Copy (button labels, error messages, onboarding text)
+- Documentation content, user guides, API docs
+- Email templates, notification text
+
+If detected, set `workflow_policy.author.enabled = true` in the run config and note:
+> **Content needs detected.** The content-author workflow will run after design approval to produce: [list detected content types].
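Content-Author Detection as described is a keyword scan followed by a config flag. The Python sketch below shows one way that could look. The `CONTENT_TRIGGERS` table is a hand-picked subset of the phrases listed in the section, and the function names and the in-memory shape of the run config are assumptions; only the `workflow_policy.author.enabled` key comes from the source.

```python
import re
from typing import Dict, List

# Hypothetical trigger phrases, a subset of the bullet list in the skill text.
CONTENT_TRIGGERS: Dict[str, List[str]] = {
    "seed data": ["database seeding", "seed data", "fixtures", "sample records"],
    "sample content": ["sample content", "placeholder text", "demo data"],
    "translations": ["translations", "i18n", "localization"],
    "copy": ["button labels", "error messages", "onboarding text"],
    "documentation": ["user guides", "api docs"],
    "email": ["email templates", "notification text"],
}


def detect_content_needs(spec_text: str) -> List[str]:
    """Return the content types whose trigger phrases appear in the spec."""
    lowered = spec_text.lower()
    return [
        kind
        for kind, phrases in CONTENT_TRIGGERS.items()
        if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases)
    ]


def apply_detection(run_config: dict, spec_text: str) -> List[str]:
    """Enable the author workflow in run_config when content needs are found."""
    needs = detect_content_needs(spec_text)
    if needs:
        policy = run_config.setdefault("workflow_policy", {})
        policy.setdefault("author", {})["enabled"] = True
    return needs
```

Plain phrase matching will produce false positives (for example "error messages" in a logging requirement), so a real implementation would likely leave the final call to the reviewing agent.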
+
+### Checkpoint: Hardened Spec Review
 
 > **Spec hardened. Changes made:**
 >
-> [List of
-
-
-
-
-
+> [List of gaps found and how they were tightened]
+
+Ask the user via AskUserQuestion:
+- **Question:** "Are the spec hardening changes acceptable?"
+- **Options:**
+1. "Approved — continue to brainstorming" *(Recommended)*
+2. "Disagree with a change — let me specify"
+3. "Found more gaps — let me add"
 
-
+Wait for the user's selection before continuing.
 
 ---
 
-##
+## Sub-Workflow 4: Brainstorm (design + design-review workflows)
+
+**Before starting this phase, output to the user:**
+
+> **Brainstorming** — About to propose 2-3 design approaches with explicit trade-offs, then run design-review on the approved choice.
+>
+> **Why this matters:** Without exploring alternatives, the first approach that comes to mind gets built — even if a simpler, more maintainable, or more performant option exists. This is where architectural mistakes get caught cheaply instead of discovered during implementation.
+>
+> **Looking for:** Architectural trade-offs, scalability implications, complexity vs. simplicity, alignment with existing codebase patterns
 
-Invoke the `brainstorming` skill (`wz:brainstorming`)
+Invoke the `brainstorming` skill (`wz:brainstorming`):
 
-This phase explores design approaches:
 1. Propose 2-3 viable approaches with explicit trade-offs
 2. For each approach: effort estimate, risk assessment, what it enables/prevents
 3. Recommend one approach with rationale
 
-
-
+### Checkpoint: Design Approval
+
+Ask the user via AskUserQuestion:
+- **Question:** "Which design approach should we implement?"
+- **Options:**
+1. "Approach A — [one-line summary]" *(Recommended)*
+2. "Approach B — [one-line summary]"
+3. "Approach C — [one-line summary]"
+4. "Modify an approach — let me specify changes"
 
-
-to single-agent brainstorming if not)
-2. Creates a team via `TeamCreate` (`wazir-brainstorm-<concept-slug>`)
-3. Spawns three teammates via `Agent` with `team_name`:
-- **Free Thinker** — proposes creative directions via `SendMessage`
-- **Grounder** — challenges each direction with practical concerns via `SendMessage`
-- **Synthesizer** — observes silently, writes the design document on convergence
-4. You (the Arbiter) coordinate the dialogue, signal convergence, and clean up
-with `TeamDelete`
+Wait for the user's selection before continuing. This is the most important checkpoint.
 
-
-for full spawn prompts, convergence criteria, and constraints.
+Save approved design to `.wazir/runs/latest/clarified/design.md`.
 
-
+**After completing this phase, output to the user:**
 
-> **
+> **Brainstorming complete.**
+>
+> **Found:** [N] approaches evaluated, [N] trade-offs documented, [N] design-review findings resolved
 >
->
+> **Without this phase:** The first viable approach would be built without considering alternatives — potentially choosing a complex solution when a simple one exists, or an approach that conflicts with existing patterns
 >
-> **
-> 1. **Approach A** — [one-line summary]
-> 2. **Approach B** — [one-line summary]
-> 3. **Approach C** — [one-line summary]
-> 4. **Modify an approach** — [user specifies changes]
+> **Changed because of this work:** [Selected approach and why, rejected alternatives and why, design-review adjustments made]
 
-
+After approval: design-review loop with `--mode design-review` (5 canonical dimensions: spec coverage, design-spec consistency, accessibility, visual consistency, exported-code fidelity).
 
-
-
-### Design Review
+---
 
-
+## Sub-Workflow 5: Plan (plan + plan-review workflows)
 
-
-- Design-spec consistency
-- Accessibility
-- Visual consistency
-- Exported-code fidelity
+**Before starting this phase, output to the user:**
 
-
+> **Planning** — About to break the approved design into ordered, dependency-aware implementation tasks with a gap analysis against the original input.
+>
+> **Why this matters:** Without explicit planning, tasks get implemented in the wrong order (breaking dependencies), items from the input get silently dropped, and task granularity is either too coarse (monolithic changes that are hard to review) or too fine (overhead without value).
+>
+> **Looking for:** Correct dependency ordering, complete input coverage, appropriate task granularity, clear acceptance criteria per task
 
-
+Delegate to `wz:writing-plans`:
 
-
+1. Planner produces a SINGLE execution plan at `.wazir/runs/latest/clarified/execution-plan.md` in spec-kit format.
+2. **Gap analysis exit gate:** Compare original input against plan. Invoke `wz:reviewer --mode plan-review`.
+3. Loop until clean or cap reached.
 
-
+**After completing this phase, output to the user:**
 
-
-
-
-
-
-
+> **Planning complete.**
+>
+> **Found:** [N] tasks created, [N] dependencies mapped, [N] plan-review findings resolved, [N] gap analysis items addressed
+>
+> **Without this phase:** Tasks would be implemented in ad-hoc order breaking dependencies, input items would be silently dropped, and task sizes would vary wildly making review inconsistent
+>
+> **Changed because of this work:** [Task count, dependency chain summary, any items reordered or split during plan-review]
 
-### Checkpoint
+### Checkpoint: Plan Review
 
 > **Implementation plan: [N] tasks**
 >
 > | # | Task | Complexity | Dependencies | Description |
 > |---|------|-----------|--------------|-------------|
-
-
+
+Ask the user via AskUserQuestion:
+- **Question:** "Does the implementation plan look correct and complete?"
+- **Options:**
+1. "Approved — ready for execution" *(Recommended)*
+2. "Reorder or split tasks"
+3. "Missing tasks"
+4. "Too granular / too coarse"
+
+Wait for the user's selection before continuing.
+
+---
+
+### Scope Coverage Gate (Hard Gate)
+
+Before presenting the plan to the user, verify ALL input items are covered:
+
+1. Count distinct items/deliverables in the input briefing (`.wazir/input/briefing.md` + any `input/*.md` files)
+2. Count tasks in the execution plan
+3. **If `tasks_in_plan < items_in_input`:** STOP and present:
+
+> **Scope reduction detected.** The input contains [N] items but the plan only covers [M].
 >
->
-
-
-
-
+> Missing items: [list]
+
+Ask the user via AskUserQuestion:
+- **Question:** "The plan is missing [N-M] items from your input. How should we proceed?"
+- **Options:**
+1. "Add missing items to the plan" *(Recommended)*
+2. "Approve reduced scope — I confirm these items can be dropped"
 
-**
+**The clarifier MUST NOT autonomously drop items into "future tiers", "deferred", or "out of scope" without explicit user approval. This is a hard rule.**
+
+Invariant: `items_in_plan >= items_in_input` unless user explicitly approves reduction.
 
 ---
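The Scope Coverage Gate compares counts (`tasks_in_plan < items_in_input`). A per-item variant is sketched below in Python. It is stricter than the count comparison in the source, because counts can match while a specific item is still missing. The `scope_coverage_gate` function, the `CoverageResult` type, and the idea of tracking covered items as a set of names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List


@dataclass
class CoverageResult:
    passed: bool
    missing: List[str] = field(default_factory=list)


def scope_coverage_gate(
    input_items: List[str],
    covered_items: Iterable[str],
    approved_drops: Iterable[str] = frozenset(),
) -> CoverageResult:
    """Fail unless every input item is covered or its removal was user-approved."""
    covered: FrozenSet[str] = frozenset(covered_items)
    approved: FrozenSet[str] = frozenset(approved_drops)
    missing = [
        item for item in input_items
        if item not in covered and item not in approved
    ]
    return CoverageResult(passed=not missing, missing=missing)
```

A failing result maps onto the "Scope reduction detected" prompt: `missing` supplies the list shown to the user, and only an explicit approval moves an item into `approved_drops`.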
 
+## Reasoning Output
+
+Throughout the clarifier phase, produce reasoning at two layers:
+
+**Conversation (Layer 1):** Before each sub-workflow, explain the trigger and why it matters. After each sub-workflow, state what was found and the counterfactual — what would have gone wrong without it.
+
+**File (Layer 2):** Write `.wazir/runs/<id>/reasoning/phase-clarifier-reasoning.md` with structured entries per decision:
+- **Trigger** — what prompted the decision
+- **Options considered** — alternatives evaluated
+- **Chosen** — selected option
+- **Reasoning** — why
+- **Confidence** — high/medium/low
+- **Counterfactual** — what would go wrong without this info
+
+Examples of clarifier reasoning entries:
+- "Trigger: input says 'auth' without specifying provider. Options: ask user, assume OAuth2, assume magic links. Chosen: ask user. Counterfactual: assuming OAuth2 when user wanted Supabase auth = wrong middleware, 2 days rework."
+- "Trigger: 13 items in input. Options: plan all 13, tier into must/should/could. Chosen: plan all 13 (user explicitly said 'do not tier'). Counterfactual: tiering would silently drop 5 items."
+
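The six fields of a Layer 2 reasoning entry map naturally onto a small record type. The Python sketch below renders one entry as markdown bullets. The `ReasoningEntry` class and the exact bullet layout are assumptions; the source names the fields but does not show the file format of `phase-clarifier-reasoning.md`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReasoningEntry:
    trigger: str
    options: List[str]
    chosen: str
    reasoning: str
    confidence: str  # expected: "high", "medium", or "low"
    counterfactual: str

    def to_markdown(self) -> str:
        """Render the entry as six markdown bullets, one per field."""
        if self.confidence not in {"high", "medium", "low"}:
            raise ValueError(f"unknown confidence: {self.confidence}")
        return "\n".join([
            f"- **Trigger:** {self.trigger}",
            f"- **Options considered:** {'; '.join(self.options)}",
            f"- **Chosen:** {self.chosen}",
            f"- **Reasoning:** {self.reasoning}",
            f"- **Confidence:** {self.confidence}",
            f"- **Counterfactual:** {self.counterfactual}",
        ])
```

Validating `confidence` at render time keeps free-form values out of the file, which matters if the entries are later aggregated.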
 ## Done
 
-When the plan is approved
+When the plan is approved:
 
-> **
+> **Clarifier phase complete.**
 >
 > - Spec: `.wazir/runs/latest/clarified/spec-hardened.md`
 > - Design: `.wazir/runs/latest/clarified/design.md`
-> - Tasks: [count] tasks in `.wazir/runs/latest/tasks/`
 > - Plan: `.wazir/runs/latest/clarified/execution-plan.md`
 >
-> **Next:** Run `/
+> **Next:** Run `/executor` to implement the plan.