mustard-claude 3.1.30 → 3.1.32

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mustard-claude",
3
- "version": "3.1.30",
3
+ "version": "3.1.32",
4
4
  "description": "Framework-agnostic CLI for Claude Code project setup",
5
5
  "type": "module",
6
6
  "bin": {
@@ -23,15 +23,31 @@ Approves the active spec and prepares the implementation phase.
23
23
  - Do NOT proceed to step 2 without running this command
24
24
  2. **Read** `.claude/pipeline-config.md` — agents, model selection
25
25
  3. Locate active spec in `.claude/spec/active/`
26
+
27
+ ### Step 3b: Wave Plan Detection
28
+
29
+ Check if the located spec is a wave plan: look for `.claude/spec/active/{specName}/wave-plan.md`.
30
+
31
+ **If `wave-plan.md` exists:**
32
+
33
+ 1. Read `.claude/.pipeline-states/{specName}.json` — expect `isWavePlan: true`, `totalWaves: N`, `currentWave: 1`, `completedWaves: []`.
34
+ 2. Read `wave-plan.md` and print its ENTIRE contents verbatim inside a fenced markdown block (```` ```markdown ... ``` ````). List each wave spec file path below the block (one line each).
35
+ 3. `AskUserQuestion`:
36
+ - **"Approve wave plan — start with wave 1"** → proceed to step 4 (update header + state for wave 1 dispatch)
37
+ - **"Reject decomposition — use single spec"** → merge all wave specs back into a single spec at `.claude/spec/active/{specName}/spec.md` (concatenate `## Files`, `## Tasks`, `## Boundaries` from each wave), delete `wave-plan.md` and `wave-N-*/` subdirectories, set `scopeOverride: "user-rejected-waves"` and `isWavePlan: false` in pipeline state, proceed to step 4 on the single spec
38
+ - **"Stop — re-plan with guidance"** → stop. Instruct user: `Delete .claude/spec/active/{specName}/ and re-run /feature {name} with explicit guidance (e.g., "keep wave 2 and wave 3 together").`
39
+ 4. If the user approved the wave plan, from step 4 onward operate on the **wave 1 spec** (`.claude/spec/active/{specName}/wave-1-{role}/spec.md`) — update its header, not the `wave-plan.md` header.
40
+
41
+ **If `wave-plan.md` does NOT exist:** proceed as a single spec (original behavior below).
42
+
26
43
  4. **Spec Checkpoint — update spec header:**
27
44
  - `### Status: approved`
28
45
  - `### Phase: PLAN`
29
46
  - `### Checkpoint: {ISO timestamp now}`
30
- 5. **Pipeline State — create `.claude/.pipeline-states/{spec-name}.json`:**
47
+ 5. **Pipeline State — create or update `.claude/.pipeline-states/{spec-name}.json`:**
31
48
  - Extract `spec-name` from the spec directory (e.g. basename of path → `2026-02-26-linked-services-card`)
32
- - Parse Tasks from spec to extract tasks per agent (DB, Backend, Frontend, etc.)
33
- - Create `.claude/.pipeline-states/` directory if it doesn't exist
34
- - Write state file with `specName`, `status: "approved"`, `phaseName: "PLAN"`, `tasks` with names and agents, `model`, `updatedAt`
49
+ - **If wave plan (from Step 3b):** state already exists. Update fields: `status: "approved"`, `currentWave: 1`, `updatedAt`. Parse tasks from **wave-1** spec only (not all waves). Preserve `isWavePlan`, `totalWaves`, `completedWaves`, `failedWaves`.
50
+ - **If single spec:** Parse Tasks from spec to extract tasks per agent (DB, Backend, Frontend, etc.). Create `.claude/.pipeline-states/` directory if it doesn't exist. Write state file with `specName`, `status: "approved"`, `phaseName: "PLAN"`, `tasks` with names and agents, `model`, `updatedAt`.
35
51
  5b. **Memory Persist — record architectural decisions:**
36
52
  - For each significant decision in the spec (technology choices, design patterns, trade-offs):
37
53
  ```bash
@@ -58,6 +58,32 @@ If the diff file is empty or missing, skip the Git State header entirely. Never
58
58
  - Trace callers/callees via Grep in relevant directories (prefer Grep over Read)
59
59
  - Return as soon as root cause is clear — don't exhaustively scan
60
60
  - Return: root cause file(s), line(s), explanation
61
+
62
+ 2b. **Cache root-cause for retry reuse:**
63
+
64
+ After DIAGNOSE returns, compute a cache signature so fix-loop retries can skip re-DIAGNOSE when the affected surface hasn't changed:
65
+
66
+ ```javascript
67
+ // in-memory during bugfix session (also persisted to pipeline-state for Full Path)
68
+ const affectedFiles = [...root-cause file(s) from Explore return, sorted];
69
+ const bugDescription = {user's error description, canonical — trimmed and lowercased};
70
+ const rootCauseHash = sha256(bugDescription + '|' + affectedFiles.join(','));
71
+ const rootCauseSummary = {1-line root cause from Explore, ≤500 chars};
72
+ const affectedFilesHash = sha256(concatenated contents of affectedFiles right now);
73
+ ```
74
+
75
+ Write to pipeline-state if Full Path (`.claude/.pipeline-states/{specName}.json`):
76
+ ```json
77
+ {
78
+ "rootCauseHash": "sha256...",
79
+ "rootCauseSummary": "...",
80
+ "affectedFilesHash": "sha256...",
81
+ "affectedFiles": ["path/a.ts", "path/b.ts"],
82
+ "cachedAt": "{ISO}"
83
+ }
84
+ ```
85
+
86
+ For Fast Path (no spec yet), keep the cache in-memory only — it lives for the duration of the bugfix session, which is sufficient for the retry loop.
61
87
  3. **ASSESS — Decision point:**
62
88
  - Explore returns clear root cause in 1-2 files → **Fast Path** (skip PLAN)
63
89
  - 3+ files, unclear impact, cross-layer → **Full Path** (brief spec via PLAN)
@@ -125,9 +151,19 @@ Before retrying a failed fix attempt, classify the failure:
125
151
 
126
152
  1. **Transient?** — Would re-running succeed without any change? (flaky test, cache, env) → Retry once immediately.
127
153
  2. **Resolvable?** — Is the fix clear and patchable in ≤3 lines without new reads? → Apply patch, retry (counts as retry 1).
128
- 3. **Structural?** — Did the original ANALYZE misidentify the root cause? → Re-analyze: dispatch a focused Explore on the actual failure point, update root cause, re-dispatch bugfix agent. Does NOT count against the 2-retry cap.
129
-
130
- Max 2 retries for Transient + Resolvable. Structural failures trigger a targeted re-ANALYZE, not a blind retry.
154
+ 3. **Structural?** — Did the original ANALYZE misidentify the root cause? → **Before re-Exploring, consult the root-cause cache from Step 2b:**
155
+ - Recompute `affectedFilesHash` for the cached `affectedFiles`.
156
+ - **Cache hit (hash matches) AND failure signal does NOT suggest a different cause** (no keyword in the failure pointing to files outside `affectedFiles`, no REVIEW rationale explicitly naming a different root) → skip re-Explore, inject `rootCauseSummary` verbatim into the retry prompt. Log: `root-cause cached (retry {N}/2), skipping diagnose`.
157
+ - **Cache miss (files changed) OR failure rationale points elsewhere** → invalidate cache, run targeted Explore on the actual failure point, update root cause (including new cache entry via Step 2b), re-dispatch bugfix agent.
158
+ - Re-ANALYZE (with or without cache) does NOT count against the 2-retry cap.
159
+
160
+ Max 2 retries for Transient + Resolvable. Structural failures trigger a targeted re-ANALYZE (cache-gated), not a blind retry.
161
+
162
+ **Cache invalidation signals:**
163
+ - Affected files changed on disk → hash mismatch invalidates
164
+ - Review/build failure rationale mentions files outside `affectedFiles` → invalidate
165
+ - User explicitly overrides (rare) → invalidate
166
+ - After 2 retries exhausted, the cache is naturally flushed when the pipeline aborts or advances
131
167
 
132
168
  ### CLOSE
133
169
 
@@ -121,6 +121,101 @@ Continue to PLAN regardless.
121
121
 
122
122
  ### PLAN Phase
123
123
 
124
+ #### Wave Decomposition Pre-Check (Full scope only)
125
+
126
+ **Skip for Light/Extended Light** — decomposition only makes sense when scope is genuinely large.
127
+
128
+ Before writing the single spec in Full scope, check whether the work should be decomposed into waves:
129
+
130
+ 1. **Compute signals from ANALYZE output:**
131
+ - `fileCount` — files that will go into `## Files`
132
+ - `layerCount` — distinct layers (use role detection derived from paths: schema/api/ui/lib)
133
+ - `newEntityCount` — new entities created by this spec
134
+ - `estimatedTouchPoints` — count of imports/refs from Grep on affected directories (optional)
135
+
136
+ 2. **Read knowledge matches:** Read `.claude/knowledge.json` (if it exists). Extract entries whose `id` starts with `heavy-pipeline` or `high-hook-retry`. Each entry's scope signals represent a historical pipeline that cost a lot.
137
+
138
+ 3. **Run decomposition decision:**
139
+ ```bash
140
+ echo '{"fileCount":{N},"layerCount":{L},"newEntityCount":{E},"knowledgeMatches":[...]}' | node .claude/scripts/scope-decompose.js
141
+ ```
142
+ Output JSON: `{decompose: bool, reason: string, signals: {...}}`
143
+
144
+ 4. **If `decompose: false`** → proceed to `#### Full Scope` below as usual (single spec).
145
+
146
+ 5. **If `decompose: true`** → build wave plan:
147
+ ```bash
148
+ echo '{"files":[...all paths from ANALYZE...],"projectRoot":"."}' | node .claude/scripts/wave-dependency.js
149
+ ```
150
+ Output cases:
151
+ - `{error: "cyclic-dependency", cycle: [...]}` → warn user about cyclic imports (pre-existing architecture issue), fall back to single spec with note in `## Concerns`. Proceed to `#### Full Scope`.
152
+ - `{error: ...}` → fail-open: fall back to single spec.
153
+ - `{waves: [...]}` with only 1 wave → no real DAG depth, fall back to single spec.
154
+ - `{waves: [...]}` with 2+ waves → write **Wave Plan** (step 6).
155
+
156
+ 6. **Write Wave Plan structure:**
157
+ ```
158
+ .claude/spec/active/{date}-{name}/
159
+ ├── wave-plan.md
160
+ ├── wave-1-{role}/spec.md
161
+ ├── wave-2-{role}/spec.md
162
+ └── wave-N-{role}/spec.md
163
+ ```
164
+
165
+ `wave-plan.md` contains:
166
+ ```markdown
167
+ # Wave Plan: {name}
168
+ ### Status: draft | Phase: PLAN | Scope: full | Decomposed: yes
169
+ ### Checkpoint: {ISO now}
170
+ ### Reason: {decompose.reason}
171
+
172
+ ## Summary
173
+ {1-2 lines: what + why}
174
+
175
+ ## Waves
176
+ ### Wave 1 — {roles of wave 1}
177
+ Depends on: none
178
+ Files ({count}): {file1}, {file2}, ...
179
+
180
+ ### Wave 2 — {roles of wave 2}
181
+ Depends on: wave 1
182
+ Files ({count}): {file3}, ...
183
+
184
+ {... for each wave ...}
185
+
186
+ ## Rationale
187
+ {which knowledge entry matched or which threshold triggered; signals from scope-decompose}
188
+ ```
189
+
190
+ Each `wave-N-{role}/spec.md` is a **complete atomic spec** scoped to just that wave's files. Use the same template as Full scope single spec (Summary, Entity Info, Files, Tasks, Dependencies, Boundaries). Reference `../wave-plan.md` at the top as context.
191
+
192
+ 7. **Write pipeline state for wave plan:**
193
+ ```json
194
+ {
195
+ "specName": "{date}-{name}",
196
+ "status": "draft",
197
+ "phase": 2,
198
+ "phaseName": "PLAN",
199
+ "scope": "full",
200
+ "isWavePlan": true,
201
+ "currentWave": 1,
202
+ "totalWaves": N,
203
+ "completedWaves": [],
204
+ "failedWaves": []
205
+ }
206
+ ```
207
+
208
+ 8. **Present wave plan to user:**
209
+ - Read `wave-plan.md` and print its ENTIRE contents verbatim inside a fenced markdown block.
210
+ - Also list each wave's spec file paths (one line each) so the user can open individual wave specs if desired.
211
+ - Then `AskUserQuestion`:
212
+ - **"Approve wave plan and implement now"** → goes to EXECUTE wave 1 inline (same rules as Light inline)
213
+ - **"Approve wave plan for later"** → stop, user runs `/approve` + `/resume`
214
+ - **"Edit decomposition (hint PLAN)"** → user provides a hint (e.g., "merge waves 2 and 3"), and PLAN re-executes with the hint applied to the `estimatedTouchPoints`/manual grouping signals. Re-decompose once.
215
+ - **"Reject decomposition — use single spec"** → discard wave plan files, set `scopeOverride: "user-rejected-waves"` in pipeline state, proceed to `#### Full Scope` as if `decompose: false`.
216
+
217
+ 9. **If user approves the wave plan**, the single-spec `#### Full Scope` flow below is **skipped** — wave-1 becomes the first thing to execute (via `/approve --resume` or `/resume`).
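The decision contract in step 3 (signals in, `{decompose, reason, signals}` out) can be sketched as follows. The threshold values here are purely illustrative assumptions; the real ones live in `.claude/scripts/scope-decompose.js`, which this diff does not show:

```javascript
'use strict';
// Illustrative only: thresholds are assumptions, not the shipped values.
function decideDecompose({ fileCount, layerCount, newEntityCount, knowledgeMatches = [] }) {
  const signals = { fileCount, layerCount, newEntityCount };
  // Historical heavy pipelines (heavy-pipeline / high-hook-retry entries) force waves.
  if (knowledgeMatches.length > 0) {
    return { decompose: true, reason: `knowledge match: ${knowledgeMatches[0]}`, signals };
  }
  // Assumed heuristic: many files spread across several layers suggests waves.
  if (fileCount >= 10 && layerCount >= 3) {
    return { decompose: true, reason: 'large multi-layer scope', signals };
  }
  return { decompose: false, reason: 'within single-spec budget', signals };
}
```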
218
+
124
219
  #### Full Scope
125
220
 
126
221
  1. Create `.claude/spec/active/{date}-{name}/spec.md` with:
@@ -300,5 +395,7 @@ Scope tag: `[LIGHT]` or `[FULL]` after progress line.
300
395
  - ALWAYS go straight to PLAN once you understand the change — more reads ≠ better spec
301
396
  - Light scope inline implement follows same dispatch rules as `/resume` (template, waves, retries)
302
397
  - Context budget: Grep entity-registry (not full read), Grep recipes (not full read), line-by-line checkbox updates
398
+ - Wave decomposition is opt-in via signals (knowledge matches, layer/file/entity counts) — never force waves on small scopes
399
+ - If wave decomposition is approved, single-spec Full Scope flow is skipped — waves execute sequentially via `/resume`
303
400
 
304
401
  ULTRATHINK
@@ -1,29 +1,32 @@
1
1
  ---
2
2
  name: mustard:metrics
3
- description: Show enforcement metrics report — hook hit rates, budget distributions, gate activity. Metrics are recorded automatically; just run this to see them.
3
+ description: Focused view of enforcement hook events and compare-window deltas. For the superset (pipelines + hooks + RTK), use /mustard:stats.
4
4
  ---
5
5
 
6
- # /mustard:metrics - Show Metrics Report
6
+ # /mustard:metrics - Hook Events & Compare
7
7
 
8
8
  ## Trigger
9
- `/mustard:metrics [--since <ISO date>] [--event <type>]`
9
+ `/mustard:metrics [--since <ISO date>] [--event <type>] [--compare <from> <to>]`
10
10
 
11
11
  ## What it does
12
- Runs `.claude/scripts/metrics-report.js` and shows the aggregated report.
12
+ Focused on two use cases:
13
13
 
14
- Metrics are recorded **automatically** by enforcement hooks on every Task dispatch — no activation needed. Just run this command whenever you want to see the current state.
14
+ 1. **Hook-level aggregation** (default) — runs `.claude/scripts/metrics-report.js` and emits a table of events from `.claude/.metrics/*.jsonl`, plus RTK token savings.
15
+ 2. **Compare window** (`--compare`) — delta between two git tags or ISO dates (reference window computed automatically from the delta).
16
+
17
+ For the superset view that also includes per-pipeline metrics, orphans, Pass@1 and Last 7 Days, use **`/mustard:stats`** (cross-reference).
15
18
 
16
19
  ## Action
17
20
  1. Run `rtk node .claude/scripts/metrics-report.js $ARGS` (pass through any flags)
18
21
  2. Display output verbatim
19
22
 
20
- ## Optional flags
23
+ ## Flags
21
24
  - `--since <ISO date>` — filter events after this date
22
25
  - `--event <type>` — filter to one event type (e.g. `budget-check`)
23
- - `--compare <from> <to>` — delta between two windows (git tag or ISO date)
26
+ - `--compare <from> <to>` — delta between two windows (git tag `vX.Y.Z` or ISO date)
24
27
 
25
28
  ## Examples
26
- - `/mustard:metrics` — full report since beginning
29
+ - `/mustard:metrics` — hook event aggregation since beginning
27
30
  - `/mustard:metrics --since 2026-04-09` — only recent events
28
31
  - `/mustard:metrics --event budget-check` — only budget-check events
29
32
  - `/mustard:metrics --compare v3.1.21 v3.1.22` — delta between two releases
@@ -34,3 +37,4 @@ Metrics are recorded **automatically** by enforcement hooks on every Task dispat
34
37
  - Logs auto-rotate at 10MB
35
38
  - To reset: delete files in `.claude/.metrics/` manually
36
39
  - Advanced: override mode via `CONTEXT_BUDGET_MODE` env var (`strict`|`warn`|`observe`). Default is `strict`.
40
+ - `rtk-rewrite` events deliberately show only counts (no `tokens_saved` column) — real RTK numbers come from `rtk gain`, surfaced in the "RTK Token Savings" block.
@@ -29,7 +29,28 @@ Before the normal detect-and-confirm flow, scan the newest pipeline state for a
29
29
  3. After the re-dispatch returns, clear the flag: remove `lastDispatchFailure` from the state object and rewrite the pipeline-state JSON.
30
30
  4. Fall through to Step 1 (normal resume flow continues from the updated state).
31
31
  - **If ageMs > 10 * 60 * 1000** (stale): silently remove `lastDispatchFailure` from the state and rewrite the file, then continue to Step 1.
32
- 4. If `lastDispatchFailure` is absent, skip Step 0 entirely and proceed to Step 1.
32
+ 4. If `lastDispatchFailure` is absent, skip Step 0 entirely and proceed to Step 0.5.
33
+
34
+ ### Step 0.5: Resume Mode (continue vs. reanalyze)
35
+
36
+ Before loading heavy context (sync-registry, diff-context, Explore Gate), ask the user which mode to use. Gating this saves roughly 2-5k tokens per resume.
37
+
38
+ 1. **Skip conditions** — enter `reanalyze` mode automatically without prompting:
39
+ - Step 0 just re-dispatched a failed agent (recovery path → always reanalyze next step)
40
+ - `pipeline-state.lastDispatchFailure` was present and <10min old (already handled in Step 0)
41
+ - Wave plan with `failedWaves.length > 0` (handled in wave failure section below — forces `reanalyze`)
42
+
43
+ 2. **Otherwise, AskUserQuestion:**
44
+ - **"Continue from where it left off (light mode)"** → `mode = "continued"`: skip sync-registry (Step 2 #6), skip diff-context (unless a wave transition forces a refresh), skip Pre-EXECUTE Existence Gate (Step 12b). Trust pipeline-state as source of truth.
45
+ - **"Reanalyze context (full mode)"** → `mode = "reanalyzed"`: run Step 2 fully (default behavior, re-reads everything).
46
+
47
+ 3. **Record mode in pipeline state:** write `resumeMode: "continued" | "reanalyzed"` and `resumeModeAt: {ISO now}` so downstream steps know which path they are in.
48
+
49
+ 4. **Stale-context fallback (safety net):** if a dispatched agent in `continued` mode returns an error indicating stale context (e.g., references a missing file, fails boundary check, or returns `BLOCKED` with reason citing out-of-date registry), escalate automatically:
50
+ - Update pipeline state: `resumeMode: "escalated-to-reanalyze"`, append to `resumeEscalations` array with `{at, reason}`
51
+ - Re-run Step 2 in full (sync-registry + diff-context)
52
+ - Re-dispatch the failed agent with fresh context
53
+ - Fail-open: escalation never blocks, just upgrades to the heavier path
33
54
 
34
55
  ### Step 1: Detect & Confirm
35
56
 
@@ -83,10 +104,14 @@ Before the normal detect-and-confirm flow, scan the newest pipeline state for a
83
104
  ### Step 2: Bootstrap (after confirmation)
84
105
 
85
106
  6. **AUTO-SYNC:** `node .claude/scripts/sync-registry.js`
107
+ - **Skip if `resumeMode === "continued"`** (Step 0.5): registry is reused from prior session.
108
+ - Always run if `resumeMode === "reanalyzed"` or `"escalated-to-reanalyze"`.
86
109
 
87
110
  ### Diff Context (automatic)
88
111
  Run `node .claude/scripts/diff-context.js --subproject {subproject_path}` per subproject to capture the current git state scoped to each subproject. Include the subproject-specific output in the agent prompt as `{diff_context}` so agents see only changes relevant to their scope.
89
112
 
113
+ **Skip if `resumeMode === "continued"`** unless a wave just completed (wave transitions always refresh diff). The prior diff snapshot is reused from `.claude/.pipeline-states/{specName}.diff-{subproject}.md`.
114
+
90
115
  7. **Read** `.claude/pipeline-config.md`. For `entity-registry.json`: use Grep to extract ONLY the relevant entity block (e.g. `"Contract":`), NEVER read the full JSON
91
116
  9. **Update spec header:** `Status: implementing`, `Phase: EXECUTE`, `Checkpoint: {ISO now}`
92
117
  10. **Update/create pipeline state:** `status: "implementing"`, `phaseName: "EXECUTE"`, `specName`
@@ -99,9 +124,26 @@ Run `node .claude/scripts/diff-context.js --subproject {subproject_path}` per su
99
124
  12. **Match recipe by name only:** Grep `{subproject}/.claude/commands/recipes.md` for recipe title matching the task type — do NOT read the full recipes file. Extract only: recipe number, pattern refs, reference modules
100
125
  12b. **Pre-EXECUTE Existence Gate**: Same gate as `feature/SKILL.md § Pre-EXECUTE Existence Gate`. Invoke identically (Full scope only, `## Files` ≤ 8). On retry/resume, the gate naturally handles idempotence: tasks already `[x]` from a prior run are treated as Mixed — the Haiku confirms they stay done and the orchestrator only re-dispatches what remains `[ ]`.
101
126
 
127
+ **Skip entirely if `resumeMode === "continued"`** (Step 0.5). The `continued` mode trusts pipeline-state checkboxes as-is. If the stale-context fallback escalates to `reanalyze`, the gate runs on the re-dispatch.
128
+
102
129
  **Pre-check (same as `feature/SKILL.md § Pre-EXECUTE Existence Gate`):** Before dispatching Haiku, run `rtk git diff --stat HEAD -- <files listed in spec's ## Files>`. Skip gate entirely if output is empty (no changes) or total insertions/deletions <10. Only proceed with Haiku dispatch if ≥10 lines changed.
103
130
 
131
+ 12c. **Wave Plan Scope (conditional — only if `pipeline-state.isWavePlan === true`):**
132
+
133
+ When the pipeline state indicates a wave plan, the orchestrator dispatches only the **current wave**, not the full spec:
134
+
135
+ 1. Read `pipeline-state.currentWave` and `pipeline-state.totalWaves`.
136
+ 2. The spec to work from for this invocation is `.claude/spec/active/{specName}/wave-{currentWave}-*/spec.md`. Replace any prior reference to `spec.md` at the root of the spec dir with the current wave's spec.
137
+ 3. **Between waves** (see Step 17 post-dispatch):
138
+ - On wave completion: run a `/mustard:git commit`-style commit with the message `feat(wave-{N}/{role}): {summary}`. If `/mustard:git commit` does not fit the project, fall back to `git add {files} && git commit -m "..."`.
139
+ - Update state: `completedWaves.push(currentWave)`, `currentWave += 1`, `updatedAt`.
140
+ - Force `resumeMode = "reanalyzed"` for the next wave transition so diff-context refreshes with the just-committed changes.
141
+ - If `currentWave > totalWaves` → skip remaining wave dispatch, go to Step 19 REVIEW + Step 20 CLOSE on the overall wave plan.
142
+ 4. **If a wave fails (REJECTED after 2 fix-loops, or BLOCKED)** — see § Wave Failure Handling below.
143
+
104
144
  13. **Plan waves:** `Depends on: none` → Wave 1; dependencies → later. DB+Backend parallel. Frontend after Backend UNLESS all parallel override conditions met (see `.claude/pipeline-config.md` Parallel Rules). Review agents: ALWAYS dispatch in single parallel message. Skip completed tasks.
145
+
146
+ **Note on wave plans:** when `isWavePlan === true`, this step plans the agent wave structure **within** the current wave's spec only — agents internal to the current wave-spec may still split across DB/Backend/Frontend sub-waves. The outer wave (1..N) is the cross-spec sequence managed by Step 12c.
105
147
  14. **Build agent prompts using template** (`.claude/commands/mustard/templates/agent-prompt/SKILL.md`):
106
148
  - Read template once, then fill placeholders per agent using `.claude/pipeline-config.md` data:
107
149
  - `{subproject}` → from Agents table (Subproject column)
@@ -183,11 +225,46 @@ When REVIEW returns REJECTED (any CRITICAL):
183
225
  8. If review still REJECTED after 2 fix-loops: STOP + report exhausted retries.
184
226
 
185
227
  20. **CLOSE:**
228
+ - **Wave plan gate:** if `pipeline-state.isWavePlan === true`, only CLOSE when `completedWaves.length === totalWaves`. If waves remain (`currentWave <= totalWaves` and wave N-1 just finished), **do not** run CLOSE — instead update state (`currentWave++`, `completedWaves.push`), output `═══ WAVE {N-1} COMPLETE — {role} ═══`, and stop. Next `/mustard:resume` picks up wave N.
186
229
  - `node .claude/scripts/sync-registry.js`
187
- - Spec: `Status: completed`, `Phase: CLOSE`, all `[ ]` → `[x]`
188
- - Move spec to `.claude/spec/completed/`
230
+ - Spec: `Status: completed`, `Phase: CLOSE`, all `[ ]` → `[x]`. For wave plans: mark `wave-plan.md` status `completed`, and mark each `wave-N-{role}/spec.md` completed too.
231
+ - Move spec to `.claude/spec/completed/` (the entire `{specName}/` directory, including wave subdirs if any)
189
232
  - **Delete** `.claude/.pipeline-states/{spec-name}.json`
190
- - Output with agent colors: `═══ PIPELINE COMPLETE — {name} | Agents: {n} ok | Files: {c} created, {m} modified ═══`
233
+ - Output with agent colors: `═══ PIPELINE COMPLETE — {name} | Agents: {n} ok | Files: {c} created, {m} modified ═══` (for wave plans: append `| Waves: {totalWaves}`).
234
+
235
+ ### Wave Failure Handling
236
+
237
+ Applies only when `pipeline-state.isWavePlan === true`.
238
+
239
+ A wave is considered **failed** when:
240
+ - REVIEW returns REJECTED after 2 fix-loops exhausted (see Step 19b), OR
241
+ - An implementation agent returns `BLOCKED` and the user cannot resolve inline, OR
242
+ - Build/type-check fails repeatedly (max 2 retries) after Granular Retry Protocol is exhausted.
243
+
244
+ **On wave failure:**
245
+
246
+ 1. Update pipeline state:
247
+ - `failedWaves.push(currentWave)`
248
+ - `status = "failed"`
249
+ - `updatedAt = {ISO now}`
250
+ 2. Write failure log to `.claude/spec/active/{specName}/wave-{currentWave}-{role}/failure.md`:
251
+ ```markdown
252
+ # Wave {N} Failure — {role}
253
+ ## When: {ISO}
254
+ ## Phase: {EXECUTE | REVIEW | CLOSE}
255
+ ## Reason: {short cause — e.g., "REVIEW REJECTED after 2 fix-loops"}
256
+ ## Findings (verbatim)
257
+ {last review findings OR BLOCKED rationale OR build error}
258
+ ## Files touched
259
+ {list from agent memory}
260
+ ```
261
+ 3. **Do NOT** attempt further automatic recovery. Wave N-1 commits remain in place — they are real progress.
262
+ 4. **Prompt the user via AskUserQuestion:**
263
+ - **"Fix wave {N} manually and resume"** → user fixes by hand; the next `/mustard:resume` clears the `failedWaves` entry and restarts wave N from EXECUTE.
264
+ - **"Rewrite wave {N} (re-PLAN this wave)"** → delete `wave-{N}-{role}/spec.md`, re-enter PLAN for wave N only (run the PLAN sub-flow scoped to wave N's files). User then re-approves wave N via `/mustard:approve`.
265
+ - **"Abort pipeline"** → set `status: "aborted"`, move spec to `.claude/spec/aborted/{specName}/` (create dir if needed), keep waves 1..N-1 commits. Inform user: `Pipeline aborted. Waves 1..{N-1} commits preserved. Waves {N}..{totalWaves} discarded.`
266
+
267
+ **Documented residual risk:** wave N-1 commits may be semantically incomplete without wave N (e.g., schema created but no API yet). The user was warned of this when approving the wave plan. The `failure.md` log states exactly which surface was left exposed.
191
268
 
192
269
  ### Granular Retry Protocol
193
270
 
@@ -2,7 +2,7 @@
2
2
  description: "Show pipeline metrics, token savings, and performance stats — use when user asks for stats, metrics, performance, or token usage"
3
3
  ---
4
4
  <!-- mustard:generated -->
5
- # /stats - Pipeline Metrics
5
+ # /stats - Pipeline Metrics (superset view)
6
6
 
7
7
  ## Trigger
8
8
 
@@ -10,7 +10,7 @@ description: "Show pipeline metrics, token savings, and performance stats — us
10
10
 
11
11
  ## Description
12
12
 
13
- Displays pipeline metrics including duration, API calls, retries, Pass@1 success rate, tool breakdown, RTK token savings, gate saves, wave reentries, and skill hit rate per agent.
13
+ Superset view of pipeline state + enforcement hooks + RTK token economy. This is the primary command; `/mustard:metrics` is a focused view for hook-only events and `--compare` windows.
14
14
 
15
15
  ## Action
16
16
 
@@ -18,21 +18,14 @@ Displays pipeline metrics including duration, API calls, retries, Pass@1 success
18
18
  2. Present the output to the user
19
19
  3. If no metrics found, inform user to run a pipeline first
20
20
 
21
- ## Pass@1 Metrics
21
+ ## Sections emitted
22
22
 
23
- `metrics-collect.js` emits a `## Pass@1 Metrics` section at the end of completed-pipeline output:
24
-
25
- - **Pass@1**: percentage of pipelines completed without any retries (retries === 0)
26
- - **Avg retries**: mean retry count across all completed pipelines
27
-
28
- Example output:
29
- ```
30
- ## Pass@1 Metrics
31
- - Pass@1: 80% (4/5 completed without retries)
32
- - Avg retries per pipeline: 0.4
33
- ```
34
-
35
- This section is omitted automatically when no completed pipelines exist yet.
23
+ - **Summary** — 5–8 lines with ✓/⚠/→ prefixes (pipelines tracked, orphans, Pass@1, RTK savings, top alert)
24
+ - **Active / Orphaned (per spec)** — duration, API calls, retries, top 3 tools, retries by phase, gate saves, wave reentries, skill hits, Pass@1 by agent (heuristic)
25
+ - **Completed Pipelines** — archived runs from `.claude/metrics/`
26
+ - **Last 7 Days** — events per day + current-week vs. prior-week delta
27
+ - **Enforcement Events (hooks)** — table of events from `.claude/.metrics/*.jsonl`
28
+ - **RTK Token Economy** — totals from `rtk gain`
36
29
 
37
30
  ## When to Use
38
31
 
@@ -0,0 +1,50 @@
1
+ 'use strict';
2
+ /**
3
+ * Shared helper: normalize `rtk gain --all --format json` output.
4
+ *
5
+ * rtk emits { summary: { total_saved, avg_savings_pct, total_input,
6
+ * total_output, total_commands }, daily, weekly, monthly }. Different
7
+ * rtk versions (and earlier mustard scripts) assumed top-level
8
+ * `saved_tokens`/`total_saved` — neither is correct on current rtk.
9
+ * This helper is the single source of truth.
10
+ */
11
+
12
+ const { execFileSync } = require('child_process');
13
+
14
+ function getRtkGain(opts) {
15
+ const timeout = (opts && opts.timeout) || 3000;
16
+ let raw;
17
+ try {
18
+ raw = execFileSync('rtk', ['gain', '--all', '--format', 'json'], {
19
+ encoding: 'utf8',
20
+ timeout,
21
+ stdio: ['ignore', 'pipe', 'ignore'],
22
+ windowsHide: true,
23
+ });
24
+ } catch {
25
+ return null;
26
+ }
27
+ let data;
28
+ try {
29
+ data = JSON.parse(raw);
30
+ } catch {
31
+ return null;
32
+ }
33
+ const s = (data && data.summary) || data || {};
34
+ const saved = Number(s.total_saved ?? s.saved_tokens ?? s.savedTokens ?? 0) || 0;
35
+ const original = Number(s.total_input ?? s.total_original ?? 0) || 0;
36
+ const pct = Number(s.avg_savings_pct ?? s.savings_pct ?? s.savingsPct ?? 0) || 0;
37
+ const commands = Number(s.total_commands ?? s.commands ?? 0) || 0;
38
+ if (saved <= 0 && commands <= 0) return null;
39
+ return {
40
+ saved,
41
+ originalTotal: original,
42
+ pct,
43
+ commands,
44
+ byCommand: (data && data.by_command) || null,
45
+ daily: (data && Array.isArray(data.daily)) ? data.daily : [],
46
+ weekly: (data && Array.isArray(data.weekly)) ? data.weekly : [],
47
+ };
48
+ }
49
+
50
+ module.exports = { getRtkGain };
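The fallback chain is the subtle part of this helper: current rtk nests totals under `summary`, while older shapes were flat. A small illustration of that chain for the `saved` field (the payloads are representative examples, not captured rtk output):

```javascript
'use strict';
// Mirrors the helper's fallback chain for the `saved` field.
function normalizeSaved(data) {
  const s = (data && data.summary) || data || {};
  return Number(s.total_saved ?? s.saved_tokens ?? s.savedTokens ?? 0) || 0;
}

// Current rtk: nested summary wins.
console.log(normalizeSaved({ summary: { total_saved: 1234 } })); // 1234
// Older flat shape still resolves.
console.log(normalizeSaved({ saved_tokens: 99 })); // 99
// Missing/unknown payloads degrade to 0, matching the helper's null-return path.
console.log(normalizeSaved(null)); // 0
```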