mustard-claude 3.1.28 → 3.1.30 (npm package version diff, via Mend)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/package.json +1 -1
package/templates/commands/mustard/feature/SKILL.md +1 -1
package/templates/commands/mustard/resume/SKILL.md +45 -9
package/templates/commands/mustard/templates/agent-prompt/SKILL.md +59 -0
package/templates/hooks/__tests__/hooks.test.js +13 -16
package/templates/hooks/_lib/knowledge-extract.js +6 -4
package/templates/scripts/metrics-collect.js +7 -6

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mustard-claude",
-  "version": "3.1.28",
+  "version": "3.1.30",
   "description": "Framework-agnostic CLI for Claude Code project setup",
   "type": "module",
   "bin": {

package/templates/commands/mustard/feature/SKILL.md CHANGED

@@ -266,7 +266,7 @@ After each agent returns, check the return value for an escalation status before
 If two or more agents in the same wave return `CONCERN`, surface all concerns together before starting the next wave. See `.claude/pipeline-config.md` Escalation Statuses and Diagnostic Failure Routing for the full status table.
-9. **REVIEW** — dispatch review agent for each affected subproject (reads guards + relevant skills, runs 7-category checklist: SOLID, Design System, Patterns, i18n, Integration, Build, Elegance). REJECTED → fix + re-review (max 2 loops).
+9. **REVIEW** — dispatch review agent for each affected subproject (reads guards + relevant skills, runs 7-category checklist: SOLID, Design System, Patterns, i18n, Integration, Build, Elegance). REJECTED → see `resume/SKILL.md § Fix Loop Dispatch Protocol` (max 2 loops).
    Re-reviews always dispatch with `model: "sonnet"` (see `review/SKILL.md § Model Selection`).
 10. All passed + APPROVED → CLOSE flow inline (sync registry, move spec, cleanup state)

package/templates/commands/mustard/resume/SKILL.md CHANGED

@@ -110,7 +110,7 @@ Run `node .claude/scripts/diff-context.js --subproject {subproject_path}` per su
       - `{entity_info}` → `_patterns` type, refs, subs from registry
       - `{role}`, `{boundary}`, `{return_sections}` → from Role Rules table in config
       - `{validate_command}`, `{build_command}` → from Agents table in config
-      - `{retry_context}` → empty on first dispatch (see Step 4 for retries)
+      - `{retry_context}` → empty on first dispatch. On retry, fill per `agent-prompt/SKILL.md § Retry Modes`. Granular retries use Step 4 § Granular Retry Protocol. Fix-loops (after REJECTED review) use Step 19b § Fix Loop Dispatch Protocol.
       - `{task_steps}` → checkboxed steps from spec
       - `{recommended_skills}` → from Skill Recommendations in `.claude/pipeline-config.md`:
         1. Glob `{subproject}/.claude/skills/` for generated pattern skills
@@ -152,8 +152,36 @@ If two or more agents in the same wave return `CONCERN`, surface all concerns to
     - Checklist categories: **SOLID, Design System, Patterns, i18n, Integration, Build, Elegance**
     - Each issue classified: CRITICAL (blocks), WARNING (recommended), NOTE (suggestion)
     - APPROVED (zero CRITICAL) → CLOSE
-    - REJECTED (any CRITICAL) → dispatch fix agent with exact issues, then re-review (max 2 fix loops)
+    - REJECTED (any CRITICAL) → see Step 19b § Fix Loop Dispatch Protocol (max 2 loops)
     - **NEVER skip review** — not even for Light scope. Light scope gets same checklist, just fewer files to review
+### Step 19b: Fix Loop Dispatch Protocol
+When REVIEW returns REJECTED (any CRITICAL):
+1. Read `.claude/.agent-memory/_index.json`, find last entry where `agent_type == {review_target_agent_type}` and `pipeline == {spec-name}`. If absent (shouldn't happen but be defensive): fall back to first-dispatch template.
+2. Extract:
+   - `prior_summary` ← `entry.summary`
+   - `files_modified` ← `entry.details.files_modified` (list)
+3. Extract review findings VERBATIM:
+   - All CRITICAL findings (required)
+   - All WARNING findings (optional — include if fix is cheap)
+   - Copy the exact text returned by the review agent; do NOT paraphrase
+4. Compose `{retry_context}` using Mode=fix-loop format (see `agent-prompt/SKILL.md § Retry Modes`). Set K = current loop number (1 or 2; max 2 fix-loops):
+   ```
+   ## RETRY CONTEXT
+   **Mode:** fix-loop ({K}/2)
+   **Prior dispatch:** {prior_summary}
+   **Files modified previously:**
+   {files_modified}
+   **Review findings (verbatim):**
+   {findings_verbatim}
+   ```
+5. Render the **Minimal Retry Template** from `agent-prompt/SKILL.md § Retry Modes` (skips CONTEXT/REFERENCE/ENTITY/SKILLS/WEB VALIDATION/ROLE/RECIPE).
+6. Dispatch the same `subagent_type` + `model` as the original impl agent (do NOT change the role or model).
+7. On return, re-dispatch REVIEW agent (normal dispatch, not retry — review is read-only).
+8. If review still REJECTED after 2 fix-loops: STOP + report exhausted retries.
 20. **CLOSE:**
     - `node .claude/scripts/sync-registry.js`
     - Spec: `Status: completed`, `Phase: CLOSE`, all `[ ]` → `[x]`
@@ -170,13 +198,21 @@ When an agent fails:
    - Build error → retry from build step (don't redo edits)
    - Edit error → retry from that edit step
    - Unknown → retry all remaining unchecked steps
-3. **Re-dispatch with retry context** — fill `{retry_context}` placeholder:
-   ```
-   ## RETRY CONTEXT
-   Steps 1-{N} completed. Resume from step {N+1}.
-   Previous error: {error_message}
-   ```
-   And set `{task_steps}` to only the remaining steps ({N+1} onwards).
+3. **Re-dispatch with retry context** — fill `{retry_context}` using Mode=granular format:
+   - Read `.claude/.agent-memory/_index.json`, find last entry where `agent_type == {failed_agent_type}` and `pipeline == {spec-name}`
+   - Extract `entry.summary` → `prior_summary`; `entry.details.files_modified` → `files_modified` (list)
+   - Fill:
+     ```
+     ## RETRY CONTEXT
+     **Mode:** granular
+     **Prior dispatch:** {prior_summary}
+     **Files modified previously:**
+     {files_modified}
+     **Previous error:** {error_message}
+     **Resume from step:** {N+1}
+     ```
+   - Set `{task_steps}` to only the remaining steps ({N+1} onwards)
+   - Use the **Minimal Retry Template** from `agent-prompt/SKILL.md § Retry Modes` (skips CONTEXT/REFERENCE/ENTITY/SKILLS/WEB VALIDATION/ROLE/RECIPE blocks)
 4. **Spec checkboxes:** steps 1-{N} already `[x]`, remaining continue `[ ]`
 5. **Max 2 retries per agent** — exhausted → STOP + report
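
Reviewer sketch (not part of the package): the lookup and composition described in Step 19b could look roughly like the JavaScript below. It assumes `_index.json` parses to an array of entries in chronological order, which the diff does not confirm. `findLastEntry`, `composeFixLoopContext`, the sample agent type `backend`, and the bullet rendering of `files_modified` are hypothetical. The field names (`agent_type`, `pipeline`, `summary`, `details.files_modified`) are taken from the hunk above.

```javascript
// Hypothetical helpers; the array shape of _index.json is an assumption.

// Return the most recent entry matching agent type and pipeline, or null.
function findLastEntry(entries, agentType, pipeline) {
  for (let i = entries.length - 1; i >= 0; i--) {
    const e = entries[i];
    if (e.agent_type === agentType && e.pipeline === pipeline) return e;
  }
  return null;
}

// Build the fix-loop {retry_context} block. Returns '' when no prior entry
// exists, so the caller can fall back to the first-dispatch template.
function composeFixLoopContext(entries, agentType, pipeline, loop, findingsVerbatim) {
  if (loop < 1 || loop > 2) throw new RangeError('fix-loop number must be 1 or 2');
  const entry = findLastEntry(entries, agentType, pipeline);
  if (!entry) return '';
  const files = (entry.details && entry.details.files_modified) || [];
  return [
    '## RETRY CONTEXT',
    `**Mode:** fix-loop (${loop}/2)`,
    `**Prior dispatch:** ${entry.summary}`,
    '**Files modified previously:**',
    ...files.map(f => `- ${f}`),
    '**Review findings (verbatim):**',
    findingsVerbatim, // copied as returned by the review agent, never paraphrased
  ].join('\n');
}

// Illustrative data only.
const sampleEntries = [
  { agent_type: 'backend', pipeline: 'login-feature', summary: 'first pass',
    details: { files_modified: ['src/a.js'] } },
  { agent_type: 'backend', pipeline: 'login-feature', summary: 'added login route',
    details: { files_modified: ['src/login.js', 'src/routes.js'] } },
];
console.log(composeFixLoopContext(
  sampleEntries, 'backend', 'login-feature', 1, 'CRITICAL: missing input validation'));
```

The granular retry in Step 4 would follow the same lookup and differ only in the trailing fields (`**Previous error:**` and `**Resume from step:**`).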

package/templates/commands/mustard/templates/agent-prompt/SKILL.md CHANGED

@@ -12,6 +12,8 @@ Single unified template for all dispatches:
 ## Dispatch Template
+> **First-dispatch only.** When `{retry_context}` is non-empty (granular or fix-loop retry), use the **Minimal Retry Template** from `§ Retry Modes` instead — omit CONTEXT, REFERENCE, ENTITY, SKILLS, WEB VALIDATION, ROLE, and RECIPE blocks.
 ```
 ## CONTEXT
 1. Read `{subproject}/CLAUDE.md` — guards, stack, paths
@@ -53,6 +55,63 @@ Guards carregados via CLAUDE.md acima — respeite sem exceção.
 ---
+## Retry Modes
+`{retry_context}` has 3 states:
+| Mode | When | `{retry_context}` content |
+|------|------|---------------------------|
+| `empty` | First dispatch | Empty string — full Dispatch Template above is used |
+| `granular` | A step failed (PARTIAL escalation) | Enriched retry header (see below) |
+| `fix-loop` | Review returned REJECTED | Enriched retry header with verbatim findings (see below) |
+`prior_summary` and `files_modified` come from the latest `.agent-memory/_index.json` entry matching `{agent_type, pipeline}`.
+### `granular` format
+```
+## RETRY CONTEXT
+**Mode:** granular
+**Prior dispatch:** {prior_summary}
+**Files modified previously:**
+{files_modified}
+**Previous error:** {error_message}
+**Resume from step:** {N+1}
+```
+### `fix-loop` format
+```
+## RETRY CONTEXT
+**Mode:** fix-loop ({K}/2)
+**Prior dispatch:** {prior_summary}
+**Files modified previously:**
+{files_modified}
+**Review findings (verbatim):**
+{findings_verbatim}
+```
+### Minimal Retry Template
+When `{retry_context}` is non-empty, the orchestrator renders this template instead of the full Dispatch Template. Omits CONTEXT/REFERENCE/ENTITY/SKILLS/WEB VALIDATION/ROLE/RECIPE — prior context is still cached; DON'T re-Read CLAUDE.md/guards/registry unless a modified file changed on disk since last dispatch.
+```
+{retry_context}
+## EFFICIENCY
+- Absolute paths, no cd
+- Read each file once (prior context cached — skip CLAUDE.md/guards/registry re-reads unless file changed on disk)
+- Max 3 build attempts, then STOP + report
+- Return cap: follow pipeline-config.md Max Return limits. Focus on: files changed + non-obvious decisions + blockers only.
+## TASK
+{task_steps}
+Guards carregados via CLAUDE.md acima — respeite sem exceção.
+```
+---
 ## Skill-Based Context Loading
 Skills provide progressive disclosure — agents load only what they need:
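
Reviewer sketch (not part of the package): the three `{retry_context}` states in the Retry Modes table imply a simple template choice. The JavaScript below is one way to express it. `retryMode` and `templateFor` are hypothetical names, and parsing the `**Mode:**` line with a regular expression is an assumption about how an orchestrator might classify the string; the SKILL.md text only describes the behaviour in prose.

```javascript
// Classify a {retry_context} string into one of the three documented states.
function retryMode(retryContext) {
  if (!retryContext || retryContext.trim() === '') return 'empty';
  const m = /^\*\*Mode:\*\*\s*(granular|fix-loop)\b/m.exec(retryContext);
  if (!m) throw new Error('non-empty retry context without a recognised Mode line');
  return m[1];
}

// First dispatch uses the full Dispatch Template; any retry uses the
// Minimal Retry Template (retry context + EFFICIENCY + TASK only).
function templateFor(retryContext) {
  return retryMode(retryContext) === 'empty' ? 'dispatch' : 'minimal-retry';
}

console.log(templateFor(''));
console.log(templateFor('## RETRY CONTEXT\n**Mode:** fix-loop (1/2)\n'));
```

Throwing on an unrecognised mode is a design choice of this sketch: it surfaces a malformed retry header early instead of silently sending the full template.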

package/templates/hooks/__tests__/hooks.test.js CHANGED

@@ -1814,21 +1814,19 @@ describe("knowledge-extract prescriptions", () => {
     }];
     const patterns = extractPatternsFromStates(states);
-    // retries > 2 triggers the high-retry entry
-    const retryEntry = patterns.find(p => p.name === "high-retry-login-feature");
-    assert.ok(retryEntry, "Expected high-retry entry");
+    // retries > 2 triggers the high-hook-retry entry
+    const retryEntry = patterns.find(p => p.name === "high-hook-retry-login-feature");
+    assert.ok(retryEntry, "Expected high-hook-retry entry");
     assert.ok(retryEntry.prescription, "Expected prescription field");
     assert.ok(
       /delegate investigation via Task\(general-purpose\)/.test(retryEntry.prescription),
       "Prescription should instruct delegation via Task(general-purpose)"
     );
     assert.ok(retryEntry.tags.includes("prescriptive"), "Tags should include 'prescriptive'");
-    // Back-compat: original tags preserved
-    assert.ok(retryEntry.tags.includes("retry"));
+    assert.ok(retryEntry.tags.includes("hook-retry"));
     assert.ok(retryEntry.tags.includes("pipeline"));
     assert.ok(retryEntry.tags.includes("lesson"));
-    // Back-compat: description still present
-    assert.ok(retryEntry.description.includes("4 retries"));
+    assert.ok(retryEntry.description.includes("4 hook-level retries"));
   });
   it("should emit fragmentation prescription when apiCalls > 50 AND retries > 3", () => {
@@ -1843,7 +1841,7 @@ describe("knowledge-extract prescriptions", () => {
     }];
     const patterns = extractPatternsFromStates(states);
-    // apiCalls > 50 triggers heavy-pipeline; retries > 2 also triggers high-retry.
+    // apiCalls > 50 triggers heavy-pipeline; retries > 2 also triggers high-hook-retry.
     const heavyEntry = patterns.find(p => p.name === "heavy-pipeline-big-refactor");
     assert.ok(heavyEntry, "Expected heavy-pipeline entry");
     assert.ok(heavyEntry.prescription, "Expected prescription field");
@@ -1858,7 +1856,7 @@ describe("knowledge-extract prescriptions", () => {
   });
   it("should emit reactive-iteration prescription when Edit > 15 and Write < 3", () => {
-    // Edit=20 > 15, Write=1 < 3, retries=3 to trigger the high-retry entry
+    // Edit=20 > 15, Write=1 < 3, retries=3 to trigger the high-hook-retry entry
     // (needs retries > 2 OR apiCalls > 50 to produce any entry at all).
     // Pick retries=3 and small Bash/Agent to avoid L0-violation heuristic dominance
     // but note: the heuristic checks order — L0 fires first if bash+edit>3*agent AND retries>2.
@@ -1873,8 +1871,8 @@ describe("knowledge-extract prescriptions", () => {
     }];
     const patterns = extractPatternsFromStates(states);
-    const retryEntry = patterns.find(p => p.name === "high-retry-tweak-hell");
-    assert.ok(retryEntry, "Expected high-retry entry");
+    const retryEntry = patterns.find(p => p.name === "high-hook-retry-tweak-hell");
+    assert.ok(retryEntry, "Expected high-hook-retry entry");
     assert.ok(retryEntry.prescription, "Expected prescription field");
     assert.ok(
       /investigate with Read\+Grep BEFORE editing/.test(retryEntry.prescription),
@@ -1884,7 +1882,7 @@ describe("knowledge-extract prescriptions", () => {
   });
   it("should NOT add prescription or prescriptive tag when no heuristic matches", () => {
-    // retries=3 to trigger high-retry entry, but balanced tools so none of the
+    // retries=3 to trigger high-hook-retry entry, but balanced tools so none of the
     // heuristics fire (edit<=15, apiCalls<=50, bash+edit not >3*agent).
     const states = [{
       specName: "mild-case",
@@ -1896,13 +1894,12 @@ describe("knowledge-extract prescriptions", () => {
     }];
     const patterns = extractPatternsFromStates(states);
-    const retryEntry = patterns.find(p => p.name === "high-retry-mild-case");
-    assert.ok(retryEntry, "Expected high-retry entry");
+    const retryEntry = patterns.find(p => p.name === "high-hook-retry-mild-case");
+    assert.ok(retryEntry, "Expected high-hook-retry entry");
     assert.equal(retryEntry.prescription, undefined, "No prescription when no heuristic matches");
     assert.ok(!retryEntry.tags.includes("prescriptive"),
       "'prescriptive' tag must NOT be added when no prescription");
-    // Original schema preserved
-    assert.ok(retryEntry.tags.includes("retry"));
+    assert.ok(retryEntry.tags.includes("hook-retry"));
     assert.ok(retryEntry.description);
     assert.equal(retryEntry.source, "session-knowledge");
   });

package/templates/hooks/_lib/knowledge-extract.js CHANGED

@@ -82,15 +82,17 @@ function extractPatternsFromStates(stateObjects) {
     var label = state.specName || state._file || 'unknown';
     var prescription = derivePrescription(metrics);
-    // High retry count → lesson
+    // High hook-retry count → lesson. Counts hook/sandbox events, not agent
+    // redispatches — a clean Pass@1 pipeline can still accumulate dozens.
     if (metrics.retries && metrics.retries > 2) {
       var retryEntry = {
         type: 'convention',
-        name: 'high-retry-' + label,
-        description: 'Pipeline required ' + metrics.retries + ' retries. Tool breakdown: ' +
+        name: 'high-hook-retry-' + label,
+        description: 'Pipeline triggered ' + metrics.retries + ' hook-level retries ' +
+          '(sandbox/stash-pop/re-prompts — not agent redispatches). Tool breakdown: ' +
           JSON.stringify(metrics.toolBreakdown || {}),
         source: 'session-knowledge',
-        tags: ['retry', 'pipeline', 'lesson'],
+        tags: ['hook-retry', 'pipeline', 'lesson'],
       };
       if (prescription) {
         retryEntry.prescription = prescription;
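
Reviewer sketch (not part of the package): a standalone re-creation of the renamed retry branch makes the new entry shape easier to see. `buildHookRetryEntry` is a hypothetical name, `derivePrescription` is not reproduced (the prescription is passed in directly), and pushing the `prescriptive` tag is inferred from the assertions in `hooks.test.js`, since the hunk is cut off before that line.

```javascript
// Hypothetical standalone version of the retries > 2 branch shown in the diff.
function buildHookRetryEntry(label, metrics, prescription) {
  if (!(metrics.retries && metrics.retries > 2)) return null;
  const entry = {
    type: 'convention',
    name: 'high-hook-retry-' + label,
    description: 'Pipeline triggered ' + metrics.retries + ' hook-level retries ' +
      '(sandbox/stash-pop/re-prompts — not agent redispatches). Tool breakdown: ' +
      JSON.stringify(metrics.toolBreakdown || {}),
    source: 'session-knowledge',
    tags: ['hook-retry', 'pipeline', 'lesson'],
  };
  if (prescription) {
    entry.prescription = prescription;
    entry.tags.push('prescriptive'); // assumption: inferred from the tests
  }
  return entry;
}

console.log(buildHookRetryEntry('login-feature', { retries: 4, toolBreakdown: { Edit: 3 } }));
```

Because both the name prefix (`high-retry-` to `high-hook-retry-`) and the tag (`retry` to `hook-retry`) changed, any lookup keyed on the old values would no longer match entries produced by 3.1.30.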

package/templates/scripts/metrics-collect.js CHANGED

@@ -33,7 +33,7 @@ function main() {
   const statesDir = path.join(claudeDir, '.pipeline-states');
   const activeSpecDir = path.join(claudeDir, 'spec', 'active');
   if (fs.existsSync(statesDir)) {
-    const files = fs.readdirSync(statesDir).filter(f => f.endsWith('.json'));
+    const files = fs.readdirSync(statesDir).filter(f => f.endsWith('.json') && !f.endsWith('.metrics.json'));
     const activeBuckets = [];
     const orphanedBuckets = [];
     for (const f of files) {
@@ -49,7 +49,7 @@ function main() {
         lines.push(`## ${isOrphaned ? 'Orphaned' : 'Active'}: ${name}`);
         lines.push(`- Duration: ${duration}`);
         lines.push(`- API calls: ${m.apiCalls || 0}`);
-        lines.push(`- Retries: ${m.retries || 0}`);
+        lines.push(`- Hook retries: ${m.retries || 0}`);
         if (m.toolBreakdown && Object.keys(m.toolBreakdown).length > 0) {
           lines.push('- Tool breakdown:');
           for (const [tool, count] of Object.entries(m.toolBreakdown).sort((a, b) => b[1] - a[1])) {
@@ -102,7 +102,7 @@ function main() {
           parts.push(`### ${name}`);
           parts.push(`- Duration: ${duration}`);
           parts.push(`- API calls: ${m.apiCalls || 0}`);
-          parts.push(`- Retries: ${m.retries || 0}`);
+          parts.push(`- Hook retries: ${m.retries || 0}`);
           if (m.rtkSavings) {
             parts.push(`- RTK savings: ${m.rtkSavings.pct}% (${Math.round((m.rtkSavings.saved || 0) / 1000)}k tokens)`);
           }
@@ -119,7 +119,7 @@ function main() {
         parts.push('## Averages (last ' + count + ' pipelines)');
         parts.push(`- Avg duration: ${formatMs(Math.round(totalDurationMs / count))}`);
         parts.push(`- Avg API calls: ${Math.round(totalCalls / count)}`);
-        parts.push(`- Avg retries: ${Math.round(totalRetries / count)}`);
+        parts.push(`- Avg hook retries: ${Math.round(totalRetries / count)}`);
         parts.push('');
       }
@@ -184,8 +184,9 @@ function main() {
         var pass1Pct = Math.round((pass1Count / totalPipelines) * 100);
         var avgRetries = (totalRetrySum / totalPipelines).toFixed(1);
         parts.push('## Pass@1 Metrics');
-        parts.push('- Pass@1: ' + pass1Pct + '% (' + pass1Count + '/' + totalPipelines + ' completed without retries)');
-        parts.push('- Avg retries per pipeline: ' + avgRetries);
+        parts.push('- Pass@1 (hook-level): ' + pass1Pct + '% (' + pass1Count + '/' + totalPipelines + ' completed with zero hook retries)');
+        parts.push('- Avg hook retries per pipeline: ' + avgRetries);
+        parts.push('- Note: counts hook/sandbox events, not agent redispatches. True agent-level Pass@1 not yet tracked.');
         parts.push('');
       }
     }
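
Reviewer sketch (not part of the package): the two behavioural changes in `metrics-collect.js` can be exercised in isolation. `isStateFile` is the filter predicate from the first hunk, given a hypothetical name. `pass1Summary` is also hypothetical and assumes that a pipeline counts toward Pass@1 when its `retries` value is zero or missing; the diff does not show how `pass1Count` is computed.

```javascript
// State-file filter from the diff: keep *.json but skip *.metrics.json sidecars.
const isStateFile = f => f.endsWith('.json') && !f.endsWith('.metrics.json');

// Assumption: Pass@1 (hook-level) means zero hook retries for that pipeline.
function pass1Summary(retriesPerPipeline) {
  const total = retriesPerPipeline.length;
  if (total === 0) return null; // nothing to report
  const pass1Count = retriesPerPipeline.filter(r => (r || 0) === 0).length;
  const retrySum = retriesPerPipeline.reduce((acc, r) => acc + (r || 0), 0);
  return {
    pass1Pct: Math.round((pass1Count / total) * 100),
    avgRetries: (retrySum / total).toFixed(1),
  };
}

console.log(['login.json', 'login.metrics.json', 'notes.txt'].filter(isStateFile));
console.log(pass1Summary([0, 4, 0, 2]));
```

As the added note line in the diff says, these figures count hook and sandbox events, so a low hook-level Pass@1 does not by itself indicate that agents were redispatched.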