@wazir-dev/cli 1.0.0 → 1.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +31 -2
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +1 -1
- package/docs/reference/review-loop-pattern.md +429 -0
- package/docs/reference/tooling-cli.md +2 -0
- package/docs/truth-claims.yaml +6 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +3 -0
- package/exports/hosts/claude/.claude/agents/designer.md +3 -0
- package/exports/hosts/claude/.claude/agents/executor.md +2 -0
- package/exports/hosts/claude/.claude/agents/planner.md +3 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +2 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +5 -1
- package/exports/hosts/claude/.claude/agents/specifier.md +3 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +4 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +4 -0
- package/exports/hosts/claude/.claude/commands/design.md +4 -0
- package/exports/hosts/claude/.claude/commands/discover.md +4 -0
- package/exports/hosts/claude/.claude/commands/execute.md +4 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +4 -0
- package/exports/hosts/claude/.claude/commands/plan.md +4 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +4 -0
- package/exports/hosts/claude/.claude/commands/specify.md +4 -0
- package/exports/hosts/claude/.claude/commands/verify.md +4 -0
- package/exports/hosts/claude/export.manifest.json +19 -19
- package/exports/hosts/codex/export.manifest.json +19 -19
- package/exports/hosts/cursor/export.manifest.json +19 -19
- package/exports/hosts/gemini/export.manifest.json +19 -19
- package/hooks/definitions/loop_cap_guard.yaml +1 -1
- package/hooks/hooks.json +18 -0
- package/package.json +3 -2
- package/roles/clarifier.md +3 -0
- package/roles/designer.md +3 -0
- package/roles/executor.md +2 -0
- package/roles/planner.md +3 -0
- package/roles/researcher.md +2 -0
- package/roles/reviewer.md +5 -1
- package/roles/specifier.md +3 -0
- package/skills/brainstorming/SKILL.md +139 -38
- package/skills/clarifier/SKILL.md +219 -0
- package/skills/debugging/SKILL.md +11 -1
- package/skills/executing-plans/SKILL.md +15 -2
- package/skills/executor/SKILL.md +76 -0
- package/skills/init-pipeline/SKILL.md +106 -17
- package/skills/receiving-code-review/SKILL.md +8 -0
- package/skills/requesting-code-review/SKILL.md +25 -5
- package/skills/reviewer/SKILL.md +151 -0
- package/skills/subagent-driven-development/SKILL.md +25 -2
- package/skills/tdd/SKILL.md +8 -0
- package/skills/wazir/SKILL.md +250 -43
- package/skills/writing-plans/SKILL.md +31 -4
- package/templates/examples/wazir-manifest.example.yaml +1 -1
- package/tooling/src/capture/command.js +87 -1
- package/tooling/src/capture/run-config.js +21 -0
- package/tooling/src/checks/brand-truth.js +3 -6
- package/tooling/src/checks/command-registry.js +1 -0
- package/tooling/src/checks/docs-truth.js +1 -1
- package/tooling/src/checks/runtime-surface.js +3 -7
- package/tooling/src/cli.js +8 -3
- package/tooling/src/init/command.js +201 -0
- package/wazir.manifest.yaml +0 -3
- package/workflows/clarify.md +4 -0
- package/workflows/design-review.md +4 -0
- package/workflows/design.md +4 -0
- package/workflows/discover.md +4 -0
- package/workflows/execute.md +4 -0
- package/workflows/plan-review.md +4 -0
- package/workflows/plan.md +4 -0
- package/workflows/spec-challenge.md +4 -0
- package/workflows/specify.md +4 -0
- package/workflows/verify.md +4 -0
|
@@ -12,8 +12,8 @@ Rules:
|
|
|
12
12
|
1. Do not write implementation code before the design is reviewed with the operator.
|
|
13
13
|
2. Ask clarifying questions only when the ambiguity changes scope, architecture, or acceptance criteria.
|
|
14
14
|
3. Propose 2-3 approaches with trade-offs and a recommendation.
|
|
15
|
-
4. Write the approved design to `docs/plans/YYYY-MM-DD-<topic>-design.md
|
|
16
|
-
5.
|
|
15
|
+
4. Write the approved design to `.wazir/runs/latest/clarified/design.md` (if inside a pipeline run) or `docs/plans/YYYY-MM-DD-<topic>-design.md` (if standalone).
|
|
16
|
+
5. After user approves the design concept, the reviewer role runs the design-review loop with `--mode design-review` using canonical design-review dimensions (spec coverage, design-spec consistency, accessibility, visual consistency, exported-code fidelity). See `workflows/design-review.md` and `docs/reference/review-loop-pattern.md`. The designer resolves any findings. If the design-review loop completes all passes clean, hand off to `wz:writing-plans`. Planning does not start until design-review is complete.
|
|
17
17
|
|
|
18
18
|
Required outputs:
|
|
19
19
|
|
|
@@ -23,55 +23,156 @@ Required outputs:
|
|
|
23
23
|
|
|
24
24
|
---
|
|
25
25
|
|
|
26
|
-
## Team Mode: Structured Dialogue
|
|
26
|
+
## Team Mode: Agent Teams Structured Dialogue
|
|
27
27
|
|
|
28
28
|
**Condition:** Only activate when `team_mode: parallel` in `.wazir/runs/latest/run-config.yaml`. Otherwise, use the default single-agent brainstorming above.
|
|
29
29
|
|
|
30
|
-
|
|
30
|
+
This mode uses **Agent Teams** (experimental, Claude Code + Opus 4.6) to run a
|
|
31
|
+
multi-agent brainstorming session. Your role is the **Arbiter** — you coordinate
|
|
32
|
+
the dialogue, evaluate convergence, and signal when to stop. You do NOT generate
|
|
33
|
+
design ideas yourself.
|
|
34
|
+
|
|
35
|
+
### Infrastructure: Claude Code Agent Teams
|
|
36
|
+
|
|
37
|
+
This skill uses **Agent Teams** — not subagents. The distinction matters:
|
|
38
|
+
|
|
39
|
+
| | Subagents (Task tool) | Agent Teams |
|
|
40
|
+
|---|---|---|
|
|
41
|
+
| **Lifecycle** | Spawn, return result, die | Full independent sessions that persist for team lifetime |
|
|
42
|
+
| **Communication** | Report back to parent only | Direct peer-to-peer messaging via `SendMessage` |
|
|
43
|
+
| **Coordination** | Parent manages everything | Shared task list with self-coordination |
|
|
44
|
+
|
|
45
|
+
**Critical constraint:** Text output from teammates is NOT visible to the team.
|
|
46
|
+
Teammates MUST use `SendMessage` to communicate with each other. Regular text
|
|
47
|
+
output is only visible in a teammate's own terminal pane.
|
|
48
|
+
|
|
49
|
+
### Prerequisites
|
|
50
|
+
|
|
51
|
+
```bash
|
|
52
|
+
# Check if Agent Teams is enabled
|
|
53
|
+
echo $CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS
|
|
54
|
+
```
|
|
55
|
+
|
|
56
|
+
If not set (empty), tell the user:
|
|
57
|
+
|
|
58
|
+
> Agent Teams is not enabled. Run this command and restart Claude Code:
|
|
59
|
+
> ```bash
|
|
60
|
+
> claude config set env.CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS 1
|
|
61
|
+
> ```
|
|
62
|
+
|
|
63
|
+
Then fall back to single-agent brainstorming (rules 1-5 above).
|
|
64
|
+
|
|
65
|
+
### Step 1: Create the Team
|
|
66
|
+
|
|
67
|
+
Derive `<concept-slug>` from the briefing topic (lowercased, hyphens for spaces).
|
|
68
|
+
|
|
69
|
+
Use `TeamCreate` to initialize the team with the name `wazir-brainstorm-<concept-slug>`.
|
|
70
|
+
|
|
71
|
+
### Step 2: Spawn Teammates
|
|
72
|
+
|
|
73
|
+
Spawn three teammates using the `Agent` tool with the `team_name` parameter set
|
|
74
|
+
to `wazir-brainstorm-<concept-slug>`. Each agent receives a detailed system
|
|
75
|
+
prompt via the `prompt` parameter.
|
|
76
|
+
|
|
77
|
+
#### Free Thinker
|
|
78
|
+
|
|
79
|
+
```
|
|
80
|
+
Agent(
|
|
81
|
+
team_name: "wazir-brainstorm-<concept-slug>",
|
|
82
|
+
prompt: "You are the Free Thinker in a Wazir brainstorming session.
|
|
83
|
+
Your job is to propose creative design directions without self-censoring.
|
|
84
|
+
Open new threads, explore possibilities, make connections. Communicate
|
|
85
|
+
ONLY via SendMessage — your text output is not visible to the team.
|
|
86
|
+
After proposing a direction, wait for the Grounder's response before
|
|
87
|
+
opening a new one."
|
|
88
|
+
)
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
#### Grounder
|
|
92
|
+
|
|
93
|
+
```
|
|
94
|
+
Agent(
|
|
95
|
+
team_name: "wazir-brainstorm-<concept-slug>",
|
|
96
|
+
prompt: "You are the Grounder in a Wazir brainstorming session. Your
|
|
97
|
+
job is to challenge every proposal from the Free Thinker with practical
|
|
98
|
+
concerns: feasibility, complexity, risk, alternatives. After 3-5
|
|
99
|
+
exchanges on a direction, decide: pursue, park, or redirect.
|
|
100
|
+
Communicate ONLY via SendMessage — your text output is not visible to
|
|
101
|
+
the team."
|
|
102
|
+
)
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
#### Synthesizer
|
|
106
|
+
|
|
107
|
+
```
|
|
108
|
+
Agent(
|
|
109
|
+
team_name: "wazir-brainstorm-<concept-slug>",
|
|
110
|
+
prompt: "You are the Synthesizer in a Wazir brainstorming session.
|
|
111
|
+
You NEVER participate in dialogue — only observe. Read all SendMessage
|
|
112
|
+
traffic between Free Thinker and Grounder. When the Arbiter signals
|
|
113
|
+
convergence, write the final design document to
|
|
114
|
+
.wazir/runs/latest/clarified/design.md with: design summary, pursued
|
|
115
|
+
directions, rejected alternatives, open questions, recommendation."
|
|
116
|
+
)
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
### Step 3: Coordinate the Dialogue (You Are the Arbiter)
|
|
120
|
+
|
|
121
|
+
1. Use `SendMessage` to tell the Free Thinker to open the first direction
|
|
122
|
+
based on the briefing, research brief, and hardened spec.
|
|
123
|
+
2. Monitor exchanges via `SendMessage`. Do NOT generate ideas — only
|
|
124
|
+
coordinate, nudge, and evaluate.
|
|
125
|
+
3. After each direction is explored (3-5 exchanges), the Grounder decides:
|
|
126
|
+
**pursue**, **park**, or **redirect**.
|
|
127
|
+
4. After depth-appropriate directions are explored:
|
|
128
|
+
|
|
129
|
+
| Depth | Directions to explore | Exchanges per direction |
|
|
130
|
+
|-------|-----------------------|------------------------|
|
|
131
|
+
| Standard | 3-5 | 3-5 exchanges |
|
|
132
|
+
| Deep | 5-8 | 5-8 exchanges |
|
|
133
|
+
|
|
134
|
+
5. **Signal convergence:** Use `SendMessage` to tell the Synthesizer to
|
|
135
|
+
produce the final design document.
|
|
136
|
+
6. Wait for the Synthesizer to write the design to
|
|
137
|
+
`.wazir/runs/latest/clarified/design.md` (if inside a pipeline run) or
|
|
138
|
+
`docs/plans/YYYY-MM-DD-<topic>-design.md` (if standalone).
|
|
139
|
+
|
|
140
|
+
### Step 4: Convergence Criteria
|
|
31
141
|
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
| Agent | Role | Cognitive Mode |
|
|
35
|
-
|-------|------|----------------|
|
|
36
|
-
| **Free Thinker** | Proposes design directions, creative leaps, "what if..." scenarios. Speaks first, opens new threads. | Divergent generation |
|
|
37
|
-
| **Grounder** | Challenges proposals, sorts signal from noise, picks winners, redirects dead ends. Responds to Free Thinker. | Convergent editing |
|
|
38
|
-
| **Synthesizer** | Observes silently, maintains a running summary, produces the final design document. Never participates in dialogue. | Synthesis only |
|
|
39
|
-
|
|
40
|
-
### Communication Protocol
|
|
41
|
-
|
|
42
|
-
- Free Thinker and Grounder exchange via **broadcast** (all agents see every message)
|
|
43
|
-
- Synthesizer **NEVER** participates in dialogue — only observes and writes to files
|
|
44
|
-
- After each direction is explored (3-5 exchanges), the Grounder decides: pursue, park, or redirect
|
|
45
|
-
|
|
46
|
-
### Dialogue Flow
|
|
142
|
+
The dialogue has converged when:
|
|
47
143
|
|
|
48
|
-
1.
|
|
49
|
-
2.
|
|
50
|
-
3.
|
|
51
|
-
4.
|
|
144
|
+
1. Enough directions have been explored for the depth level
|
|
145
|
+
2. The pursued directions have genuine range (not variations of the same idea)
|
|
146
|
+
3. The Grounder signals satisfaction
|
|
147
|
+
4. Further dialogue is producing diminishing returns
|
|
52
148
|
|
|
53
|
-
Early rounds
|
|
149
|
+
Early rounds should be more divergent. Later rounds more convergent.
|
|
54
150
|
|
|
55
|
-
###
|
|
151
|
+
### Step 5: Clean Up
|
|
56
152
|
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
| Standard | 3-5 | 3 exchanges |
|
|
60
|
-
| Deep | 5-8 | 5 exchanges |
|
|
153
|
+
Use `TeamDelete` to tear down the team after the Synthesizer has written the
|
|
154
|
+
design document.
|
|
61
155
|
|
|
62
|
-
###
|
|
156
|
+
### Constraints
|
|
63
157
|
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
158
|
+
- Text output from teammates is NOT visible to the team — they MUST use
|
|
159
|
+
`SendMessage`
|
|
160
|
+
- The Arbiter (you) coordinates but does NOT generate ideas
|
|
161
|
+
- The Synthesizer NEVER sends messages — only reads and writes files
|
|
162
|
+
- Free Thinker and Grounder exchange via broadcast (all agents see every
|
|
163
|
+
message)
|
|
69
164
|
|
|
70
165
|
### Output
|
|
71
166
|
|
|
72
|
-
The Synthesizer produces the design document following the same format as
|
|
167
|
+
The Synthesizer produces the design document following the same format as
|
|
168
|
+
single-agent brainstorming:
|
|
169
|
+
|
|
73
170
|
- Design summary
|
|
171
|
+
- Pursued directions with rationale
|
|
172
|
+
- Rejected alternatives with reasons
|
|
74
173
|
- Open questions or resolved assumptions
|
|
75
|
-
- Explicit recommendation
|
|
174
|
+
- Explicit recommendation
|
|
76
175
|
|
|
77
|
-
|
|
176
|
+
After the design is written, submit it for the design-review loop
|
|
177
|
+
(`--mode design-review`). After design-review is complete, hand off to
|
|
178
|
+
`wz:writing-plans`.
|
|
@@ -0,0 +1,219 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: wz:clarifier
|
|
3
|
+
description: Run the clarification pipeline — research, clarify scope, brainstorm design, generate task specs and execution plan. Pauses for user approval between phases.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Clarifier
|
|
7
|
+
|
|
8
|
+
Run Phase 0 (Research) + Phase 1 (Clarify, Brainstorm, Plan) for the current project.
|
|
9
|
+
|
|
10
|
+
**Pacing rule:** This skill has mandatory user checkpoints between phases. Do NOT skip checkpoints. Do NOT combine phases. Complete each phase fully, present the output, and wait for explicit user approval before advancing.
|
|
11
|
+
|
|
12
|
+
Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. All reviewer invocations use explicit `--mode`.
|
|
13
|
+
|
|
14
|
+
**Standalone mode:** If no `.wazir/runs/latest/` exists, artifacts go to `docs/plans/` and review logs go alongside.
|
|
15
|
+
|
|
16
|
+
## Prerequisites
|
|
17
|
+
|
|
18
|
+
1. Check `.wazir/state/config.json` exists. If not, run `wazir init` first.
|
|
19
|
+
2. Check `.wazir/input/briefing.md` exists. If not, ask the user what they want to build and save it there.
|
|
20
|
+
3. Read config for `default_depth`, `default_intent`, `team_mode`, and `multi_tool` settings.
|
|
21
|
+
4. Create a run directory if one doesn't exist:
|
|
22
|
+
```bash
|
|
23
|
+
mkdir -p .wazir/runs/run-YYYYMMDD-HHMMSS/{sources,tasks,artifacts,reviews,clarified}
|
|
24
|
+
ln -sfn run-YYYYMMDD-HHMMSS .wazir/runs/latest
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
## Phase 0: Research (delegated)
|
|
30
|
+
|
|
31
|
+
Delegate to the discover workflow (`workflows/discover.md`):
|
|
32
|
+
|
|
33
|
+
1. The **researcher role** produces the research artifact
|
|
34
|
+
(codebase scan, external sources, source manifest, research brief).
|
|
35
|
+
2. The **reviewer role** runs the research-review loop
|
|
36
|
+
using research dimensions with `--mode research-review`
|
|
37
|
+
(see `docs/reference/review-loop-pattern.md`).
|
|
38
|
+
3. The researcher resolves findings from each pass.
|
|
39
|
+
4. Loop runs for `pass_counts[depth]` passes.
|
|
40
|
+
5. Research artifact flows back to the clarifier for Checkpoint 0.
|
|
41
|
+
|
|
42
|
+
Save result to `.wazir/runs/latest/clarified/research-brief.md`.
|
|
43
|
+
|
|
44
|
+
### Checkpoint 0: Research Review
|
|
45
|
+
|
|
46
|
+
Present the research brief to the user:
|
|
47
|
+
|
|
48
|
+
> **Research complete. Here's what I found:**
|
|
49
|
+
>
|
|
50
|
+
> [Summary of existing codebase state, relevant architecture, external context]
|
|
51
|
+
>
|
|
52
|
+
> **Does this match your understanding? Anything to add or correct?**
|
|
53
|
+
> 1. **Looks good, continue** (Recommended)
|
|
54
|
+
> 2. **Missing context** — let me add more information
|
|
55
|
+
> 3. **Wrong direction** — let me clarify the intent
|
|
56
|
+
|
|
57
|
+
**Wait for user response before continuing.**
|
|
58
|
+
|
|
59
|
+
---
|
|
60
|
+
|
|
61
|
+
## Phase 1A: Clarify (autonomous, then review, then checkpoint)
|
|
62
|
+
|
|
63
|
+
Read the briefing, research brief, and codebase context. Produce:
|
|
64
|
+
|
|
65
|
+
- **What** we're building — concrete deliverables, not vague descriptions
|
|
66
|
+
- **Why** — the motivation and business value
|
|
67
|
+
- **Constraints** — technical, timeline, dependencies
|
|
68
|
+
- **Assumptions** — what we're taking as given (explicitly stated)
|
|
69
|
+
- **Scope boundaries** — what's IN and what's explicitly OUT
|
|
70
|
+
- **Unresolved questions** — anything ambiguous that could change architecture or acceptance criteria
|
|
71
|
+
|
|
72
|
+
Save to `.wazir/runs/latest/clarified/clarification.md`.
|
|
73
|
+
|
|
74
|
+
Invoke the review loop for the clarification artifact using spec/clarification dimensions with `--mode clarification-review`. The **reviewer role** runs the loop (see `docs/reference/review-loop-pattern.md`). Resolve any findings before presenting to user.
|
|
75
|
+
|
|
76
|
+
### Checkpoint 1A: Clarification Review
|
|
77
|
+
|
|
78
|
+
Present the full clarification to the user:
|
|
79
|
+
|
|
80
|
+
> **Here's the clarified scope:**
|
|
81
|
+
>
|
|
82
|
+
> [Full clarification with what/why/constraints/assumptions/scope/questions]
|
|
83
|
+
>
|
|
84
|
+
> **Are there any corrections, missing context, or open questions to resolve?**
|
|
85
|
+
> 1. **Approved — continue to spec hardening**
|
|
86
|
+
> 2. **Needs changes** — [user provides corrections]
|
|
87
|
+
> 3. **Missing important context** — [user adds information]
|
|
88
|
+
|
|
89
|
+
**Wait for user response. If the user provides corrections, update the clarification and re-present.**
|
|
90
|
+
|
|
91
|
+
---
|
|
92
|
+
|
|
93
|
+
## Phase 1A+: Spec Harden (delegated, then checkpoint)
|
|
94
|
+
|
|
95
|
+
Delegate to the specify workflow (`workflows/specify.md`):
|
|
96
|
+
|
|
97
|
+
1. The **specifier role** produces a measurable spec from the clarification
|
|
98
|
+
and research artifacts.
|
|
99
|
+
2. The **reviewer role** runs the spec-challenge loop
|
|
100
|
+
(`workflows/spec-challenge.md`) with `--mode spec-challenge`.
|
|
101
|
+
3. The specifier resolves findings from each pass.
|
|
102
|
+
4. Loop runs for `pass_counts[depth]` passes.
|
|
103
|
+
|
|
104
|
+
Save result to `.wazir/runs/latest/clarified/spec-hardened.md`.
|
|
105
|
+
|
|
106
|
+
### Checkpoint 1A+: Hardened Spec Review
|
|
107
|
+
|
|
108
|
+
Present the changes made during hardening:
|
|
109
|
+
|
|
110
|
+
> **Spec hardened. Changes made:**
|
|
111
|
+
>
|
|
112
|
+
> [List of each gap found and how it was tightened]
|
|
113
|
+
>
|
|
114
|
+
> **Review the hardened spec. Approve or adjust?**
|
|
115
|
+
> 1. **Approved — continue to brainstorming** (Recommended)
|
|
116
|
+
> 2. **Disagree with a change** — [user specifies]
|
|
117
|
+
> 3. **Found more gaps** — [user adds]
|
|
118
|
+
|
|
119
|
+
**Wait for user response before continuing.**
|
|
120
|
+
|
|
121
|
+
---
|
|
122
|
+
|
|
123
|
+
## Phase 1B: Brainstorm (interactive — always pauses)
|
|
124
|
+
|
|
125
|
+
Invoke the `brainstorming` skill (`wz:brainstorming`) and follow it.
|
|
126
|
+
|
|
127
|
+
This phase explores design approaches:
|
|
128
|
+
1. Propose 2-3 viable approaches with explicit trade-offs
|
|
129
|
+
2. For each approach: effort estimate, risk assessment, what it enables/prevents
|
|
130
|
+
3. Recommend one approach with rationale
|
|
131
|
+
|
|
132
|
+
If `team_mode: parallel` in config, the brainstorming skill activates its
|
|
133
|
+
**Agent Teams Structured Dialogue** mode:
|
|
134
|
+
|
|
135
|
+
1. Checks that `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` is enabled (falls back
|
|
136
|
+
to single-agent brainstorming if not)
|
|
137
|
+
2. Creates a team via `TeamCreate` (`wazir-brainstorm-<concept-slug>`)
|
|
138
|
+
3. Spawns three teammates via `Agent` with `team_name`:
|
|
139
|
+
- **Free Thinker** — proposes creative directions via `SendMessage`
|
|
140
|
+
- **Grounder** — challenges each direction with practical concerns via `SendMessage`
|
|
141
|
+
- **Synthesizer** — observes silently, writes the design document on convergence
|
|
142
|
+
4. You (the Arbiter) coordinate the dialogue, signal convergence, and clean up
|
|
143
|
+
with `TeamDelete`
|
|
144
|
+
|
|
145
|
+
See `skills/brainstorming/SKILL.md` "Team Mode: Agent Teams Structured Dialogue"
|
|
146
|
+
for full spawn prompts, convergence criteria, and constraints.
|
|
147
|
+
|
|
148
|
+
### Checkpoint 1B: Design Approval
|
|
149
|
+
|
|
150
|
+
> **Proposed design approaches:**
|
|
151
|
+
>
|
|
152
|
+
> [Approaches with trade-offs, recommendation]
|
|
153
|
+
>
|
|
154
|
+
> **Which approach should we implement?**
|
|
155
|
+
> 1. **Approach A** — [one-line summary]
|
|
156
|
+
> 2. **Approach B** — [one-line summary]
|
|
157
|
+
> 3. **Approach C** — [one-line summary]
|
|
158
|
+
> 4. **Modify an approach** — [user specifies changes]
|
|
159
|
+
|
|
160
|
+
**This is the most important checkpoint. Do NOT proceed without explicit design approval.**
|
|
161
|
+
|
|
162
|
+
Save approved design to `.wazir/runs/latest/clarified/design.md`.
|
|
163
|
+
|
|
164
|
+
### Design Review
|
|
165
|
+
|
|
166
|
+
After the user approves the design concept, invoke the design-review loop with `--mode design-review`. The **reviewer role** validates the design against the approved spec using the canonical design-review dimensions:
|
|
167
|
+
|
|
168
|
+
- Spec coverage
|
|
169
|
+
- Design-spec consistency
|
|
170
|
+
- Accessibility
|
|
171
|
+
- Visual consistency
|
|
172
|
+
- Exported-code fidelity
|
|
173
|
+
|
|
174
|
+
See `workflows/design-review.md` and `docs/reference/review-loop-pattern.md`. The designer resolves findings. Proceed to planning only after all design-review passes complete.
|
|
175
|
+
|
|
176
|
+
---
|
|
177
|
+
|
|
178
|
+
## Phase 1C: Plan (delegated, then checkpoint)
|
|
179
|
+
|
|
180
|
+
Delegate to `wz:writing-plans`:
|
|
181
|
+
|
|
182
|
+
1. `wz:writing-plans` (using **planner role**) produces the execution plan
|
|
183
|
+
and task specs.
|
|
184
|
+
2. The **reviewer role** runs the plan-review loop
|
|
185
|
+
(`workflows/plan-review.md`) with `--mode plan-review`.
|
|
186
|
+
3. The planner resolves findings from each pass.
|
|
187
|
+
4. Loop runs for `pass_counts[depth]` passes.
|
|
188
|
+
|
|
189
|
+
### Checkpoint 1C: Plan Review
|
|
190
|
+
|
|
191
|
+
> **Implementation plan: [N] tasks**
|
|
192
|
+
>
|
|
193
|
+
> | # | Task | Complexity | Dependencies | Description |
|
|
194
|
+
> |---|------|-----------|--------------|-------------|
|
|
195
|
+
> | 1 | ... | S | none | ... |
|
|
196
|
+
> | 2 | ... | M | task-1 | ... |
|
|
197
|
+
>
|
|
198
|
+
> **Review the plan. Approve or adjust?**
|
|
199
|
+
> 1. **Approved — ready for execution** (Recommended)
|
|
200
|
+
> 2. **Reorder or split tasks** — [user specifies]
|
|
201
|
+
> 3. **Missing tasks** — [user adds]
|
|
202
|
+
> 4. **Too granular / too coarse** — [user adjusts scope]
|
|
203
|
+
|
|
204
|
+
**Wait for user response before completing.**
|
|
205
|
+
|
|
206
|
+
---
|
|
207
|
+
|
|
208
|
+
## Done
|
|
209
|
+
|
|
210
|
+
When the plan is approved, present:
|
|
211
|
+
|
|
212
|
+
> **Clarification complete.**
|
|
213
|
+
>
|
|
214
|
+
> - Spec: `.wazir/runs/latest/clarified/spec-hardened.md`
|
|
215
|
+
> - Design: `.wazir/runs/latest/clarified/design.md`
|
|
216
|
+
> - Tasks: [count] tasks in `.wazir/runs/latest/tasks/`
|
|
217
|
+
> - Plan: `.wazir/runs/latest/clarified/execution-plan.md`
|
|
218
|
+
>
|
|
219
|
+
> **Next:** Run `/wazir:executor` to execute the plan.
|
|
@@ -43,7 +43,17 @@ Follow this order:
|
|
|
43
43
|
|
|
44
44
|
Apply the minimum corrective change, then rerun the failing check and the relevant broader verification set.
|
|
45
45
|
|
|
46
|
-
|
|
46
|
+
## Loop Cap Awareness
|
|
47
|
+
|
|
48
|
+
Debugging loops respect the loop cap when running inside a pipeline:
|
|
49
|
+
- **Pipeline mode** (`.wazir/runs/latest/` exists): use `wazir capture loop-check` to track iteration count. If the cap is reached (exit 43), escalate to the user with all evidence collected so far.
|
|
50
|
+
- **Standalone mode** (no `.wazir/runs/latest/`): the loop runs for `pass_counts[depth]` passes (quick=3, standard=5, deep=7) with no cap guard. Track iteration count manually.
|
|
51
|
+
|
|
52
|
+
In standalone mode, any debug logs go to `docs/plans/` alongside the artifact.
|
|
53
|
+
|
|
54
|
+
See `docs/reference/review-loop-pattern.md` for cap guard integration.
|
|
55
|
+
|
|
56
|
+
## Rules
|
|
47
57
|
|
|
48
58
|
- change one thing at a time
|
|
49
59
|
- keep evidence for each failed hypothesis
|
|
@@ -7,7 +7,7 @@ description: Use when you have a written implementation plan to execute in a sep
|
|
|
7
7
|
|
|
8
8
|
## Overview
|
|
9
9
|
|
|
10
|
-
Load plan, review critically, execute all tasks, report when complete.
|
|
10
|
+
Load plan, review critically, execute all tasks with per-task review checkpoints, report when complete.
|
|
11
11
|
|
|
12
12
|
**Announce at start:** "I'm using the executing-plans skill to implement this plan."
|
|
13
13
|
|
|
@@ -27,7 +27,18 @@ For each task:
|
|
|
27
27
|
1. Mark as in_progress
|
|
28
28
|
2. Follow each step exactly (plan has bite-sized steps)
|
|
29
29
|
3. Run verifications as specified
|
|
30
|
-
4.
|
|
30
|
+
4. Review BEFORE marking complete (per-task review, 5 task-execution dimensions):
|
|
31
|
+
- Run task-review loop with `--mode task-review`
|
|
32
|
+
- Use `codex review --uncommitted` for uncommitted changes, or `codex review --base <sha>` if already committed
|
|
33
|
+
- Codex error handling: if codex exits non-zero, log the error, mark the pass as codex-unavailable, and use self-review findings only. Do not treat a Codex failure as a clean pass.
|
|
34
|
+
- Resolve all findings before proceeding
|
|
35
|
+
- Log to: `.wazir/runs/latest/reviews/execute-task-<NNN>-review-pass-<N>.md`
|
|
36
|
+
- Cap tracking: `wazir capture loop-check --task-id <NNN>`
|
|
37
|
+
- This is NOT the final scored review -- it is a per-task gate using 5 task-execution dimensions
|
|
38
|
+
- See `docs/reference/review-loop-pattern.md` for the full review loop contract
|
|
39
|
+
5. Only after review passes: mark as completed, commit
|
|
40
|
+
|
|
41
|
+
**Standalone mode:** When no `.wazir/runs/latest/` exists, review logs go to `docs/plans/` alongside the artifact. The loop runs for `pass_counts[depth]` passes with no cap guard.
|
|
31
42
|
|
|
32
43
|
### Step 3: Complete Development
|
|
33
44
|
|
|
@@ -58,9 +69,11 @@ After all tasks complete and verified:
|
|
|
58
69
|
- Review plan critically first
|
|
59
70
|
- Follow plan steps exactly
|
|
60
71
|
- Don't skip verifications
|
|
72
|
+
- Don't skip per-task review -- it catches issues before they cascade to later tasks
|
|
61
73
|
- Reference skills when plan says to
|
|
62
74
|
- Stop when blocked, don't guess
|
|
63
75
|
- Never start implementation on main/master branch without explicit user consent
|
|
76
|
+
- Review loop pattern: see `docs/reference/review-loop-pattern.md`
|
|
64
77
|
|
|
65
78
|
## Integration
|
|
66
79
|
|
|
@@ -0,0 +1,76 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: wz:executor
|
|
3
|
+
description: Run the execution phase — implement the approved plan with TDD, quality gates, and verification.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Executor
|
|
7
|
+
|
|
8
|
+
Run Phase 2 (Execute) for the current project.
|
|
9
|
+
|
|
10
|
+
## Prerequisites
|
|
11
|
+
|
|
12
|
+
1. Check `.wazir/runs/latest/clarified/execution-plan.md` exists. If not, tell the user to run `/wazir:clarifier` first.
|
|
13
|
+
2. Read the execution plan and task specs from `.wazir/runs/latest/tasks/`.
|
|
14
|
+
3. Read `.wazir/state/config.json` for team_mode and depth settings.
|
|
15
|
+
|
|
16
|
+
## Pre-Execution Validation
|
|
17
|
+
|
|
18
|
+
Run these checks before implementing:
|
|
19
|
+
- `wazir validate manifest` — confirm manifest schema is valid
|
|
20
|
+
- `wazir validate hooks` — confirm hook contracts are intact
|
|
21
|
+
|
|
22
|
+
If either fails, surface the failure and do NOT proceed until resolved.
|
|
23
|
+
|
|
24
|
+
## Execution
|
|
25
|
+
|
|
26
|
+
Implement tasks in the order defined by the execution plan.
|
|
27
|
+
|
|
28
|
+
For each task:
|
|
29
|
+
|
|
30
|
+
1. **Read** the task spec at `.wazir/runs/latest/tasks/task-NNN/spec.md`
|
|
31
|
+
2. **Implement** using TDD (write test first, make it pass, refactor)
|
|
32
|
+
3. **Verify** — run tests, type checks, linting as appropriate
|
|
33
|
+
4. **Review BEFORE commit** (per-task review, NOT final review):
|
|
34
|
+
- Reviewer runs task-review loop with `--mode task-review` using 5 task-execution dimensions (correctness, tests, wiring, drift, quality)
|
|
35
|
+
- Reads the Codex model from config: `CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null); CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}`
|
|
36
|
+
- Uses `codex review -c model="$CODEX_MODEL" --uncommitted` for the current task's changes
|
|
37
|
+
- Codex error handling: if codex exits non-zero, log error, mark pass as `codex-unavailable`, use self-review only for that pass. Do NOT treat a Codex failure as a clean review. Do NOT skip the pass. The next pass still attempts Codex (transient failures may recover).
|
|
38
|
+
- Executor resolves findings, reviewer re-reviews
|
|
39
|
+
- Loop runs for `pass_counts[depth]` passes (quick=3, standard=5, deep=7). No extension.
|
|
40
|
+
- Review logs: `.wazir/runs/latest/reviews/execute-task-<NNN>-review-pass-<N>.md`
|
|
41
|
+
- Loop cap tracking: `wazir capture loop-check --task-id <NNN>` (each task has its own cap counter)
|
|
42
|
+
- See `docs/reference/review-loop-pattern.md` for full protocol
|
|
43
|
+
- NOTE: this is the per-task review (5 dims), not the final scored review (7 dims) which runs later in `/wazir:reviewer --mode final`
|
|
44
|
+
5. **Commit** — only after review passes, commit with conventional commit format: `<type>(<scope>): <description>`
|
|
45
|
+
6. **CHANGELOG** — if the change is user-facing (new feature, behavior change, bug fix visible to users), update `CHANGELOG.md` `[Unreleased]` section. If not user-facing (refactor, internal tooling, tests), skip.
|
|
46
|
+
7. **Record** evidence at `.wazir/runs/latest/artifacts/task-NNN/`
|
|
47
|
+
|
|
48
|
+
Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. Code review scoping: review uncommitted changes before commit. If changes are already committed (subagent workflow), use `codex review -c model="$CODEX_MODEL" --base <pre-task-sha>`.
|
|
49
|
+
|
|
50
|
+
If `team_mode: parallel` in config, spawn Agent Teams for independent tasks. Otherwise, tasks run sequentially.
|
|
51
|
+
|
|
52
|
+
**Standalone mode:** When no `.wazir/runs/latest/` exists, review logs go to `docs/plans/` alongside the artifact.
|
|
53
|
+
|
|
54
|
+
## Context Retrieval
|
|
55
|
+
|
|
56
|
+
- Use `wazir index search-symbols <query>` to locate relevant code before reading
|
|
57
|
+
- Read full files directly when editing or verifying
|
|
58
|
+
- Use `wazir recall file <path> --tier L1` for files you need to understand but not modify
|
|
59
|
+
|
|
60
|
+
## Escalation
|
|
61
|
+
|
|
62
|
+
Pause and ask the user when:
|
|
63
|
+
- The plan is blocked or contradictory
|
|
64
|
+
- Implementation would require unapproved scope change
|
|
65
|
+
- A task's acceptance criteria can't be met
|
|
66
|
+
|
|
67
|
+
## Done
|
|
68
|
+
|
|
69
|
+
When all tasks are complete, present:
|
|
70
|
+
|
|
71
|
+
> **Execution complete.**
|
|
72
|
+
>
|
|
73
|
+
> - Tasks: [completed]/[total] implemented
|
|
74
|
+
> - Artifacts: `.wazir/runs/latest/artifacts/`
|
|
75
|
+
>
|
|
76
|
+
> **Next:** Run `/wazir:reviewer --mode final` to review the changes, or `/wazir` for the full pipeline.
|