npm - maestro-flow - Versions diffs - 0.4.9 → 0.4.11 - Mend

maestro-flow 0.4.9 → 0.4.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (410)

package/.agents/skills/skill-iter-tune/phases/03-evaluate.md ADDED

@@ -0,0 +1,312 @@
+# Phase 3: Evaluate Quality
+> **COMPACT SENTINEL [Phase 3: Evaluate]**
+> This phase contains 5 execution steps (Step 3.1 -- 3.5).
+> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
+> Recovery: `read_file("phases/03-evaluate.md")`
+Evaluate skill quality using `ccw cli --tool gemini --mode analysis`. Gemini scores the skill across 5 dimensions and provides improvement suggestions.
+## Objective
+- Construct evaluation prompt with skill + artifacts + criteria
+- Execute via ccw cli Gemini
+- Parse multi-dimensional score
+- Write iteration-{N}-eval.md
+- Check termination conditions
+## Execution
+### Step 3.1: Prepare Evaluation Context
+```javascript
+const N = state.current_iteration;
+const iterDir = `${state.work_dir}/iterations/iteration-${N}`;
+// Read evaluation criteria
+// Ref: specs/evaluation-criteria.md
+const evaluationCriteria = read_file('.claude/skills/skill-iter-tune/specs/evaluation-criteria.md');
+// Build skillContent (same pattern as Phase 02 — only executable files)
+const skillContent = state.target_skills.map(skill => {
+  const skillMd = read_file(`${skill.path}/SKILL.md`);
+  const phaseFiles = find_files(`${skill.path}/phases/*.md`).sort().map(f => ({
+    relativePath: f.replace(skill.path + '/', ''),
+    content: read_file(f)
+  }));
+  const specFiles = find_files(`${skill.path}/specs/*.md`).map(f => ({
+    relativePath: f.replace(skill.path + '/', ''),
+    content: read_file(f)
+  }));
+  return `### File: SKILL.md\n${skillMd}\n\n` +
+    phaseFiles.map(f => `### File: ${f.relativePath}\n${f.content}`).join('\n\n') +
+    (specFiles.length > 0 ? '\n\n' + specFiles.map(f => `### File: ${f.relativePath}\n${f.content}`).join('\n\n') : '');
+}).join('\n\n---\n\n');
+// Build artifacts summary
+let artifactsSummary = 'No artifacts produced (execution may have failed)';
+if (state.execution_mode === 'chain') {
+  // Chain mode: group artifacts by skill
+  const chainSummaries = state.chain_order.map(skillName => {
+    const skillArtifactDir = `${iterDir}/artifacts/${skillName}`;
+    const files = find_files(`${skillArtifactDir}/**/*`);
+    if (files.length === 0) return `### ${skillName} (no artifacts)`;
+    const filesSummary = files.map(f => {
+      const relPath = f.replace(`${skillArtifactDir}/`, '');
+      const content = read_file(f, { limit: 200 });
+      return `--- ${relPath} ---\n${content}`;
+    }).join('\n\n');
+    return `### ${skillName} (chain position ${state.chain_order.indexOf(skillName) + 1})\n${filesSummary}`;
+  });
+  artifactsSummary = chainSummaries.join('\n\n---\n\n');
+} else {
+  // Single mode (existing)
+  const artifactFiles = find_files(`${iterDir}/artifacts/**/*`);
+  if (artifactFiles.length > 0) {
+    artifactsSummary = artifactFiles.map(f => {
+      const relPath = f.replace(`${iterDir}/artifacts/`, '');
+      const content = read_file(f, { limit: 200 });
+      return `--- ${relPath} ---\n${content}`;
+    }).join('\n\n');
+  }
+}
+// Build previous evaluation context
+const previousEvalContext = state.iterations.filter(i => i.evaluation).length > 0
+  ? `PREVIOUS ITERATIONS:\n` + state.iterations.filter(i => i.evaluation).map(iter =>
+    `Iteration ${iter.round}: Score ${iter.evaluation.score}\n` +
+    `  Applied: ${iter.improvement?.changes_applied?.map(c => c.summary).join('; ') || 'none'}\n` +
+    `  Weaknesses: ${iter.evaluation.weaknesses?.slice(0, 3).join('; ') || 'none'}`
+  ).join('\n') + '\nIMPORTANT: Focus on NEW issues not yet addressed.'
+  : '';
+```
+### Step 3.2: Construct Evaluation Prompt
+```javascript
+// Ref: templates/eval-prompt.md
+const evalPrompt = `PURPOSE: Evaluate the quality of a workflow skill by examining its definition and produced artifacts.
+SKILL DEFINITION:
+${skillContent}
+TEST SCENARIO:
+${state.test_scenario.description}
+Requirements: ${state.test_scenario.requirements.join('; ')}
+Success Criteria: ${state.test_scenario.success_criteria}
+ARTIFACTS PRODUCED:
+${artifactsSummary}
+EVALUATION CRITERIA:
+${evaluationCriteria}
+${previousEvalContext}
+${state.execution_mode === 'chain' ? `
+CHAIN CONTEXT:
+This skill chain contains ${state.chain_order.length} skills executed in order:
+${state.chain_order.map((s, i) => `${i+1}. ${s}`).join('\n')}
+Current evaluation covers the entire chain output.
+Please provide per-skill quality scores in an additional "chain_scores" field: { "${state.chain_order[0]}": <score>, ... }
+` : ''}
+TASK:
+1. Score each dimension (Clarity 0.20, Completeness 0.25, Correctness 0.25, Effectiveness 0.20, Efficiency 0.10) on 0-100
+2. Calculate weighted composite score
+3. List top 3 strengths
+4. List top 3-5 weaknesses with file:section references
+5. Provide 3-5 prioritized improvement suggestions with concrete changes
+EXPECTED OUTPUT (strict JSON, no markdown):
+{
+  "composite_score": <0-100>,
+  "dimensions": [
+    {"name":"Clarity","id":"clarity","score":<0-100>,"weight":0.20,"feedback":"..."},
+    {"name":"Completeness","id":"completeness","score":<0-100>,"weight":0.25,"feedback":"..."},
+    {"name":"Correctness","id":"correctness","score":<0-100>,"weight":0.25,"feedback":"..."},
+    {"name":"Effectiveness","id":"effectiveness","score":<0-100>,"weight":0.20,"feedback":"..."},
+    {"name":"Efficiency","id":"efficiency","score":<0-100>,"weight":0.10,"feedback":"..."}
+  ],
+  "strengths": ["...", "...", "..."],
+  "weaknesses": ["...with file:section ref...", "..."],
+  "suggestions": [
+    {"priority":"high|medium|low","target_file":"...","description":"...","rationale":"...","code_snippet":"..."}
+  ]
+}
+CONSTRAINTS: Be rigorous, reference exact files, focus on highest-impact changes, output ONLY JSON`;
+```
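Reviewer note: the five weights in the TASK list sum to 1.00, so the composite can be recomputed locally from the returned `dimensions` array as a cross-check on the model's own `composite_score`. A minimal sketch, assuming the JSON shape from EXPECTED OUTPUT; `weightedComposite` is a hypothetical helper and is not part of the package:

```javascript
// Hypothetical helper: recompute the weighted composite from per-dimension scores.
// Assumes each entry has numeric `score` (0-100) and `weight` fields, as in EXPECTED OUTPUT.
function weightedComposite(dimensions) {
  const totalWeight = dimensions.reduce((sum, d) => sum + d.weight, 0);
  if (totalWeight === 0) return null; // nothing to score, e.g. the parse fallback with no dimensions
  const weighted = dimensions.reduce((sum, d) => sum + d.score * d.weight, 0);
  // Normalize by the total weight and round to one decimal place
  return Math.round((weighted / totalWeight) * 10) / 10;
}

const sample = [
  { id: 'clarity', score: 80, weight: 0.20 },
  { id: 'completeness', score: 70, weight: 0.25 },
  { id: 'correctness', score: 90, weight: 0.25 },
  { id: 'effectiveness', score: 60, weight: 0.20 },
  { id: 'efficiency', score: 50, weight: 0.10 }
];
console.log(weightedComposite(sample)); // 16 + 17.5 + 22.5 + 12 + 5 = 73
```

A large gap between this value and the reported `composite_score` could be logged as a warning. Which of the two values to trust is a design choice the source does not specify.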
+### Step 3.3: Execute via ccw cli Gemini
+> **CHECKPOINT**: Verify evaluation prompt is properly constructed before CLI execution.
+```javascript
+// Shell escape utility (same as Phase 02)
+function escapeForShell(str) {
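+  // Escape backslashes first so the escapes added below are not themselves re-escaped
+  return str.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\$/g, '\\$').replace(/`/g, '\\`');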
+}
+const skillPath = state.target_skills[0].path;  // Primary skill for --cd
+const cliCommand = `ccw cli -p "${escapeForShell(evalPrompt)}" --tool gemini --mode analysis --cd "${skillPath}"`;
+// Execute in background
+shell({
+  command: cliCommand,
+  run_in_background: true,
+  timeout: 300000  // 5 minutes
+});
+// STOP -- wait for hook callback
+```
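Reviewer note: the prompt embeds whole skill files, so it can contain any character. An alternative to escaping for double quotes is to wrap the argument in single quotes. A minimal sketch, assuming the `shell` tool runs a POSIX-compatible shell (this does not hold for cmd.exe or PowerShell); `quoteForPosixShell` is a hypothetical helper and is not part of the package:

```javascript
// Hypothetical alternative to escapeForShell: wrap the argument in single quotes.
// Inside single quotes a POSIX shell treats every character literally, so only the
// single quote itself needs handling: close the quote, emit \', then reopen the quote.
function quoteForPosixShell(str) {
  return "'" + str.replace(/'/g, "'\\''") + "'";
}

const examplePrompt = 'Score "clarity" for $SKILL using `ccw`; it\'s strict JSON';
const exampleCommand = `ccw cli -p ${quoteForPosixShell(examplePrompt)} --tool gemini --mode analysis`;
console.log(exampleCommand);
```

With this approach `$`, backticks, double quotes, and backslashes pass through unchanged, so there is no ordering between replacements to get wrong.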
+### Step 3.4: Parse Score and Write Eval File
+After CLI completes:
+```javascript
+// Parse JSON from Gemini output
+// The output may contain markdown wrapping -- extract JSON
+const rawOutput = /* CLI output from callback */;
+const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
+let evaluation;
+if (jsonMatch) {
+  try {
+    evaluation = JSON.parse(jsonMatch[0]);
+    // chain_scores (if present) is stored with the rest of the evaluation in the
+    // state update below; iterations[N - 1].evaluation does not exist yet at this point,
+    // so assigning to it here would throw and trigger the parse-failure fallback
+  } catch (e) {
+    // Fallback: try to extract score heuristically
+    const scoreMatch = rawOutput.match(/"composite_score"\s*:\s*(\d+)/);
+    evaluation = {
+      composite_score: scoreMatch ? parseInt(scoreMatch[1], 10) : 50,
+      dimensions: [],
+      strengths: [],
+      weaknesses: ['Evaluation output parsing failed -- raw output saved'],
+      suggestions: []
+    };
+  }
+} else {
+  evaluation = {
+    composite_score: 50,
+    dimensions: [],
+    strengths: [],
+    weaknesses: ['No structured evaluation output -- defaulting to 50'],
+    suggestions: []
+  };
+}
+// Write iteration-N-eval.md
+const evalReport = `# Iteration ${N} Evaluation
+**Composite Score**: ${evaluation.composite_score}/100
+**Date**: ${new Date().toISOString()}
+## Dimension Scores
+| Dimension | Score | Weight | Feedback |
+|-----------|-------|--------|----------|
+${(evaluation.dimensions || []).map(d =>
+  `| ${d.name} | ${d.score} | ${d.weight} | ${d.feedback} |`
+).join('\n')}
+${(state.execution_mode === 'chain' && evaluation.chain_scores) ? `
+## Chain Scores
+| Skill | Score | Chain Position |
+|-------|-------|----------------|
+${state.chain_order.map((s, i) => `| ${s} | ${evaluation.chain_scores[s] ?? '-'} | ${i + 1} |`).join('\n')}
+` : ''}
+## Strengths
+${(evaluation.strengths || []).map(s => `- ${s}`).join('\n')}
+## Weaknesses
+${(evaluation.weaknesses || []).map(w => `- ${w}`).join('\n')}
+## Improvement Suggestions
+${(evaluation.suggestions || []).map((s, i) =>
+  `### ${i + 1}. [${s.priority}] ${s.description}\n- **Target**: ${s.target_file}\n- **Rationale**: ${s.rationale}\n${s.code_snippet ? `- **Suggested**:\n\`\`\`\n${s.code_snippet}\n\`\`\`` : ''}`
+).join('\n\n')}
+`;
+write_file(`${iterDir}/iteration-${N}-eval.md`, evalReport);
+// Update state
+state.iterations[N - 1].evaluation = {
+  score: evaluation.composite_score,
+  dimensions: evaluation.dimensions || [],
+  strengths: evaluation.strengths || [],
+  weaknesses: evaluation.weaknesses || [],
+  suggestions: evaluation.suggestions || [],
+  chain_scores: evaluation.chain_scores || null,
+  eval_file: `${iterDir}/iteration-${N}-eval.md`
+};
+state.latest_score = evaluation.composite_score;
+state.score_trend.push(evaluation.composite_score);
+write_file(`${state.work_dir}/iteration-state.json`, JSON.stringify(state, null, 2));
+```
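Reviewer note: the parse-and-fallback flow of Step 3.4 can be exercised in isolation. A minimal sketch under the same regexes and default score as the step above; `parseEvaluation` and the sample strings are hypothetical and are not part of the package:

```javascript
// Hypothetical helper mirroring the parse/fallback flow of Step 3.4.
function parseEvaluation(rawOutput) {
  const fallback = (score, reason) => ({
    composite_score: score, dimensions: [], strengths: [], weaknesses: [reason], suggestions: []
  });
  // Greedy match: from the first "{" through the last "}" in the output
  const jsonMatch = rawOutput.match(/\{[\s\S]*\}/);
  if (!jsonMatch) return fallback(50, 'No structured evaluation output -- defaulting to 50');
  try {
    return JSON.parse(jsonMatch[0]);
  } catch (e) {
    // Heuristic: salvage the integer composite score from otherwise invalid JSON
    const scoreMatch = rawOutput.match(/"composite_score"\s*:\s*(\d+)/);
    return fallback(scoreMatch ? parseInt(scoreMatch[1], 10) : 50,
      'Evaluation output parsing failed -- raw output saved');
  }
}

const fence = '`'.repeat(3); // markdown code fence, built here to keep the sample on one line
const wrapped = `Here is the result:\n${fence}json\n{"composite_score": 82, "dimensions": []}\n${fence}`;
const truncated = '{"composite_score": 64, "dimensions": [{"name":"Clarity"}';

console.log(parseEvaluation(wrapped).composite_score);        // 82 (markdown wrapper stripped)
console.log(parseEvaluation(truncated).composite_score);      // 64 (heuristic fallback)
console.log(parseEvaluation('no json here').composite_score); // 50 (default)
```

Output that is cut off before any closing brace never reaches the heuristic branch, because the outer regex needs at least one `}` to match. That case gets the default score of 50.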
+### Step 3.5: Check Termination
+```javascript
+function shouldTerminate(state) {
+  // 1. Quality threshold met
+  if (state.latest_score >= state.quality_threshold) {
+    return { terminate: true, reason: 'quality_threshold_met' };
+  }
+  // 2. Max iterations reached
+  if (state.current_iteration >= state.max_iterations) {
+    return { terminate: true, reason: 'max_iterations_reached' };
+  }
+  // 3. Convergence: score gained 2 points or fewer across the last 3 evaluations
+  if (state.score_trend.length >= 3) {
+    const last3 = state.score_trend.slice(-3);
+    const improvement = last3[2] - last3[0];
+    if (improvement <= 2) {
+      state.converged = true;
+      return { terminate: true, reason: 'convergence_detected' };
+    }
+  }
+  // 4. Error limit
+  if (state.error_count >= state.max_errors) {
+    return { terminate: true, reason: 'error_limit_reached' };
+  }
+  return { terminate: false };
+}
+const termination = shouldTerminate(state);
+if (termination.terminate) {
+  state.termination_reason = termination.reason;
+  write_file(`${state.work_dir}/iteration-state.json`, JSON.stringify(state, null, 2));
+  // Skip Phase 4, go directly to Phase 5 (Report)
+} else {
+  // Continue to Phase 4 (Improve)
+}
+```
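Reviewer note: the convergence rule compares the newest score with the score from two evaluations earlier, so a regression also counts as convergence. A minimal standalone sketch of check 3 only; `hasConverged` and its `minGain` parameter are hypothetical names, with the threshold of 2 taken from the code above:

```javascript
// Hypothetical standalone version of the convergence rule (check 3 in shouldTerminate).
function hasConverged(scoreTrend, minGain = 2) {
  if (scoreTrend.length < 3) return false; // fewer than three scores: rule does not apply
  const [oldest, , newest] = scoreTrend.slice(-3);
  return newest - oldest <= minGain;
}

console.log(hasConverged([60, 72]));         // false: too few scores
console.log(hasConverged([60, 72, 80]));     // false: gained 20 points
console.log(hasConverged([60, 78, 79, 80])); // true: 80 - 78 = 2
console.log(hasConverged([70, 75, 68]));     // true: a regression also stops the loop
```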
+## Error Handling
+| Error | Recovery |
+|-------|----------|
+| CLI timeout | Retry once; if it still fails, use score 50 with a warning |
+| JSON parse failure | Extract score heuristically, save raw output |
+| No output | Default score 50, note in weaknesses |
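Reviewer note: the "CLI timeout" row describes a retry-once policy that the step code does not spell out. A minimal synchronous sketch, assuming a hypothetical `runEvaluation` callback that throws on timeout and returns a parsed evaluation on success. The real flow runs the CLI in the background and waits for a hook callback, so this only illustrates the policy:

```javascript
// Hypothetical wrapper for the "CLI timeout" row: retry once, then fall back to score 50.
// `runEvaluation` stands in for whatever actually invokes the CLI (an assumption).
function evaluateWithRetry(runEvaluation, maxAttempts = 2) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return runEvaluation(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again if attempts remain
    }
  }
  return {
    composite_score: 50,
    dimensions: [],
    strengths: [],
    weaknesses: [`Evaluation failed after ${maxAttempts} attempts: ${lastError.message}`],
    suggestions: []
  };
}

let calls = 0;
const flaky = () => {
  calls += 1;
  if (calls === 1) throw new Error('timeout');
  return { composite_score: 77 };
};
console.log(evaluateWithRetry(flaky).composite_score); // 77: second attempt succeeds
console.log(evaluateWithRetry(() => { throw new Error('timeout'); }).composite_score); // 50
```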
+## Output
+- **Files**: `iteration-{N}-eval.md`
+- **State**: `iterations[N-1].evaluation`, `latest_score`, `score_trend` updated
+- **Decision**: terminate -> Phase 5, continue -> Phase 4
+- **track_tasks**: Update current iteration score display

package/.agents/skills/skill-iter-tune/phases/04-improve.md ADDED

@@ -0,0 +1,186 @@
+# Phase 4: Apply Improvements
+> **COMPACT SENTINEL [Phase 4: Improve]**
+> This phase contains 4 execution steps (Step 4.1 -- 4.4).
+> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
+> Recovery: `read_file("phases/04-improve.md")`
+Apply targeted improvements to skill files based on evaluation suggestions. Uses a general-purpose Agent to make changes, ensuring only suggested modifications are applied.
+## Objective
+- Read evaluation suggestions from current iteration
+- Launch Agent to apply improvements in priority order
+- Document all changes made
+- Update iteration state
+## Execution
+### Step 4.1: Prepare Improvement Context
+```javascript
+const N = state.current_iteration;
+const iterDir = `${state.work_dir}/iterations/iteration-${N}`;
+const evaluation = state.iterations[N - 1].evaluation;
+// Verify we have suggestions to apply
+if (!evaluation.suggestions || evaluation.suggestions.length === 0) {
+  // No suggestions -- skip improvement, mark iteration complete
+  state.iterations[N - 1].improvement = {
+    changes_applied: [],
+    changes_file: null,
+    improvement_rationale: 'No suggestions provided by evaluation'
+  };
+  state.iterations[N - 1].status = 'completed';
+  write_file(`${state.work_dir}/iteration-state.json`, JSON.stringify(state, null, 2));
+  // -> Return to orchestrator for next iteration
+  return;
+}
+// Build file inventory for agent context
+const skillFileInventory = state.target_skills.map(skill => {
+  return `Skill: ${skill.name} (${skill.path})\nFiles:\n` +
+    skill.files.map(f => `  - ${f}`).join('\n');
+}).join('\n\n');
+// Chain mode: add chain relationship context
+const chainContext = state.execution_mode === 'chain'
+  ? `\nChain Order: ${state.chain_order.join(' -> ')}\n` +
+    `Chain Scores: ${state.chain_order.map(s =>
+      `${s}: ${state.iterations[N-1].evaluation?.chain_scores?.[s] ?? 'N/A'}`
+    ).join(', ')}\n` +
+    `Weakest Link: ${state.chain_order.reduce((min, s) => {
+      // Use ?? rather than || so a genuine score of 0 is not replaced by the default
+      const score = state.iterations[N-1].evaluation?.chain_scores?.[s] ?? 100;
+      return score < (state.iterations[N-1].evaluation?.chain_scores?.[min] ?? 100) ? s : min;
+    }, state.chain_order[0])}`
+  : '';
+```
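Reviewer note: the weakest-link selection is easier to follow as a named function. A minimal sketch, assuming `chain_scores` maps skill names to numbers and that a missing score counts as 100, as in the step above; `weakestLink` and the sample skill names are hypothetical:

```javascript
// Hypothetical helper: pick the lowest-scoring skill in the chain.
// A missing score is treated as 100; `??` keeps a genuine score of 0 instead of discarding it.
function weakestLink(chainOrder, chainScores) {
  const scoreOf = name => chainScores?.[name] ?? 100;
  // Strict "<" means the earliest skill wins a tie
  return chainOrder.reduce(
    (min, name) => (scoreOf(name) < scoreOf(min) ? name : min),
    chainOrder[0]
  );
}

const order = ['plan', 'execute', 'review'];
console.log(weakestLink(order, { plan: 80, execute: 0, review: 65 })); // execute
console.log(weakestLink(order, { plan: 80, review: 65 }));             // review
console.log(weakestLink(order, null));                                 // plan (no scores at all)
```

With `||` in place of `??`, the first call would return `review`, because a score of 0 would be replaced by the default of 100.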
+### Step 4.2: Launch Improvement Agent
+> **CHECKPOINT**: Before launching agent, verify:
+> 1. evaluation.suggestions is non-empty
+> 2. All target_file paths in suggestions are valid
+```javascript
+const suggestionsText = evaluation.suggestions.map((s, i) =>
+  `${i + 1}. [${s.priority.toUpperCase()}] ${s.description}\n` +
+  `   Target: ${s.target_file}\n` +
+  `   Rationale: ${s.rationale}\n` +
+  (s.code_snippet ? `   Suggested change:\n   ${s.code_snippet}\n` : '')
+).join('\n');
+delegate_subagent({
+  subagent_type: 'general-purpose',
+  run_in_background: false,
+  description: `Apply skill improvements iteration ${N}`,
+  prompt: `## Task: Apply Targeted Improvements to Skill Files
+You are improving a workflow skill based on evaluation feedback. Apply ONLY the suggested changes -- do not refactor, add features, or "improve" beyond what is explicitly suggested.
+## Current Score: ${evaluation.score}/100
+Dimension breakdown:
+${evaluation.dimensions.map(d => `- ${d.name}: ${d.score}/100`).join('\n')}
+## Skill File Inventory
+${skillFileInventory}
+${chainContext ? `## Chain Context\n${chainContext}\n\nPrioritize improvements on the weakest skill in the chain. Also consider interface compatibility between adjacent skills in the chain.\n` : ''}
+## Improvement Suggestions (apply in priority order)
+${suggestionsText}
+## Rules
+1. Read each target file BEFORE modifying it
+2. Apply ONLY the suggested changes -- no unsolicited modifications
+3. If a suggestion's target_file doesn't exist, skip it and note in summary
+4. If a suggestion conflicts with existing patterns, adapt it to fit (note adaptation)
+5. Preserve existing code style, naming conventions, and structure
+6. After all changes, write a change summary to: ${iterDir}/iteration-${N}-changes.md
+## Changes Summary Format (write to ${iterDir}/iteration-${N}-changes.md)
+# Iteration ${N} Changes
+## Applied Suggestions
+- [high] description: what was changed in which file
+- [medium] description: what was changed in which file
+## Files Modified
+- path/to/file.md: brief description of changes
+## Skipped Suggestions (if any)
+- description: reason for skipping
+## Notes
+- Any adaptations or considerations
+## Success Criteria
+- All high-priority suggestions applied
+- Medium-priority suggestions applied if feasible
+- Low-priority suggestions applied if trivial
+- Changes summary written to ${iterDir}/iteration-${N}-changes.md
+`
+});
+```
+### Step 4.3: Verify Changes
+After agent completes:
+```javascript
+// Verify changes summary was written
+const changesFile = `${iterDir}/iteration-${N}-changes.md`;
+const changesExist = find_files(changesFile).length > 0;
+if (!changesExist) {
+  // Agent didn't write summary -- create a minimal one
+  write_file(changesFile, `# Iteration ${N} Changes\n\n## Notes\nAgent completed but did not produce changes summary.\n`);
+}
+// Read changes summary to extract applied changes
+const changesContent = read_file(changesFile);
+// Parse applied changes (heuristic: lines starting with "- [priority]")
+// Match the whole line so the summary keeps the description, not just the "[priority]" tag
+const appliedMatches = changesContent.match(/^- \[.+?\].*$/gm) || [];
+const changes_applied = appliedMatches.map(m => ({
+  summary: m.replace(/^- /, '').trim(),
+  file: '' // Not parsed from the summary; left empty
+}));
+```
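Reviewer note: a slightly stricter parse of the changes summary can separate the priority tag from the description. A minimal sketch, assuming the agent follows the "Changes Summary Format" from Step 4.2; `parseAppliedChanges` and the sample summary are hypothetical:

```javascript
// Hypothetical helper: extract "- [priority] description" lines from a changes summary.
function parseAppliedChanges(changesContent) {
  const lines = changesContent.match(/^- \[(?:high|medium|low)\] .+$/gim) || [];
  return lines.map(line => {
    const [, priority, rest] = line.match(/^- \[(\w+)\] (.+)$/);
    return { priority: priority.toLowerCase(), summary: rest.trim() };
  });
}

const sampleSummary = [
  '# Iteration 2 Changes',
  '## Applied Suggestions',
  '- [high] Add input validation: edited phases/01-setup.md',
  '- [MEDIUM] Clarify output format: edited SKILL.md',
  '## Skipped Suggestions (if any)',
  '- Rename phases: conflicts with existing pattern'
].join('\n');

console.log(parseAppliedChanges(sampleSummary));
```

Skipped suggestions carry no `[priority]` tag in the prescribed format, so they are not counted as applied.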
+### Step 4.4: Update State
+```javascript
+state.iterations[N - 1].improvement = {
+  changes_applied: changes_applied,
+  changes_file: changesFile,
+  improvement_rationale: `Applied ${changes_applied.length} improvements based on evaluation score ${evaluation.score}`
+};
+state.iterations[N - 1].status = 'completed';
+state.updated_at = new Date().toISOString();
+// Also update the skill files list in case new files were created
+for (const skill of state.target_skills) {
+  skill.files = find_files(`${skill.path}/**/*.md`).map(f => f.replace(skill.path + '/', ''));
+}
+write_file(`${state.work_dir}/iteration-state.json`, JSON.stringify(state, null, 2));
+// -> Return to orchestrator for next iteration (Phase 2) or termination check
+```
+## Error Handling
+| Error | Recovery |
+|-------|----------|
+| Agent fails to complete | Rollback from skill-snapshot: `cp -r "${iterDir}/skill-snapshot/${skill.name}"/* "${skill.path}/"` |
+| Agent corrupts files | Same rollback from snapshot |
+| Changes summary missing | Create minimal summary, continue |
+| target_file not found | Agent skips suggestion, notes in summary |
+## Output
+- **Files**: `iteration-{N}-changes.md`, modified skill files
+- **State**: `iterations[N-1].improvement` and `.status` updated
+- **Next**: Return to orchestrator, begin next iteration (Phase 2) or terminate

package/.agents/skills/skill-iter-tune/phases/05-report.md ADDED

@@ -0,0 +1,166 @@
+# Phase 5: Final Report
+> **COMPACT SENTINEL [Phase 5: Report]**
+> This phase contains 4 execution steps (Step 5.1 -- 5.4).
+> If you can read this sentinel but cannot find the full Step protocol below, context has been compressed.
+> Recovery: `read_file("phases/05-report.md")`
+Generate comprehensive iteration history report and display results to user.
+## Objective
+- Read complete iteration state
+- Generate formatted final report with score progression
+- Write final-report.md
+- Display summary to user
+## Execution
+### Step 5.1: Read Complete State
+```javascript
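+// Refresh the orchestrator's in-memory state from disk. Redeclaring it with
+// `const state = ...` would read `state` inside its own initializer and throw a ReferenceError.
+Object.assign(state, JSON.parse(read_file(`${state.work_dir}/iteration-state.json`)));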
+state.status = 'completed';
+state.updated_at = new Date().toISOString();
+```
+### Step 5.2: Generate Report
+```javascript
+// Determine outcome
+const outcomeMap = {
+  quality_threshold_met: 'PASSED -- Quality threshold reached',
+  max_iterations_reached: 'MAX ITERATIONS -- Threshold not reached',
+  convergence_detected: 'CONVERGED -- Score stopped improving',
+  error_limit_reached: 'FAILED -- Too many errors'
+};
+const outcome = outcomeMap[state.termination_reason] || 'COMPLETED';
+// Build score progression table
+const scoreTable = state.iterations
+  .filter(i => i.evaluation)
+  .map(i => {
+    const dims = i.evaluation.dimensions || [];
+    const dimScores = ['clarity', 'completeness', 'correctness', 'effectiveness', 'efficiency']
+      .map(id => {
+        const dim = dims.find(d => d.id === id);
+        return dim ? dim.score : '-';
+      });
+    return `| ${i.round} | ${i.evaluation.score} | ${dimScores.join(' | ')} |`;
+  }).join('\n');
+// Build iteration details
+const iterationDetails = state.iterations.map(iter => {
+  const evalSection = iter.evaluation
+    ? `**Score**: ${iter.evaluation.score}/100\n` +
+      `**Strengths**: ${iter.evaluation.strengths?.join(', ') || 'N/A'}\n` +
+      `**Weaknesses**: ${iter.evaluation.weaknesses?.slice(0, 3).join(', ') || 'N/A'}`
+    : '**Evaluation**: Skipped or failed';
+  const changesSection = iter.improvement
+    ? `**Changes Applied**: ${iter.improvement.changes_applied?.length || 0}\n` +
+      (iter.improvement.changes_applied?.map(c => `  - ${c.summary}`).join('\n') || '  None')
+    : '**Improvements**: None';
+  return `### Iteration ${iter.round}\n${evalSection}\n${changesSection}`;
+}).join('\n\n');
+const report = `# Skill Iter Tune -- Final Report
+## Summary
+| Field | Value |
+|-------|-------|
+| **Target Skills** | ${state.target_skills.map(s => s.name).join(', ')} |
+| **Execution Mode** | ${state.execution_mode} |${state.execution_mode === 'chain' ? `\n| **Chain Order** | ${state.chain_order.join(' -> ')} |` : ''}
+| **Test Scenario** | ${state.test_scenario.description} |
+| **Iterations** | ${state.iterations.length} |
+| **Initial Score** | ${state.score_trend[0] ?? 'N/A'} |
+| **Final Score** | ${state.latest_score}/100 |
+| **Quality Threshold** | ${state.quality_threshold} |
+| **Outcome** | ${outcome} |
+| **Started** | ${state.started_at} |
+| **Completed** | ${state.updated_at} |
+## Score Progression
+| Iter | Composite | Clarity | Completeness | Correctness | Effectiveness | Efficiency |
+|------|-----------|---------|--------------|-------------|---------------|------------|
+${scoreTable}
+**Trend**: ${state.score_trend.join(' -> ')}
+${state.execution_mode === 'chain' ? `
+## Chain Score Progression
+| Iter | ${state.chain_order.join(' | ')} |
+|------|${state.chain_order.map(() => '------').join('|')}|
+${state.iterations.filter(i => i.evaluation?.chain_scores).map(i => {
+  const scores = state.chain_order.map(s => i.evaluation.chain_scores[s] ?? '-');
+  return `| ${i.round} | ${scores.join(' | ')} |`;
+}).join('\n')}
+` : ''}
+## Iteration Details
+${iterationDetails}
+## Remaining Weaknesses
+${state.iterations.length > 0 && state.iterations[state.iterations.length - 1].evaluation
+  ? state.iterations[state.iterations.length - 1].evaluation.weaknesses?.map(w => `- ${w}`).join('\n') || 'None identified'
+  : 'No evaluation data available'}
+## Artifact Locations
+| Path | Description |
+|------|-------------|
+| \`${state.work_dir}/iteration-state.json\` | Complete state history |
+| \`${state.work_dir}/iterations/iteration-{N}/iteration-{N}-eval.md\` | Per-iteration evaluations |
+| \`${state.work_dir}/iterations/iteration-{N}/iteration-{N}-changes.md\` | Per-iteration change logs |
+| \`${state.work_dir}/final-report.md\` | This report |
+| \`${state.backup_dir}/\` | Original skill backups |
+## Restore Original
+To revert all changes and restore the original skill files:
+\`\`\`bash
+${state.target_skills.map(s => `cp -r "${state.backup_dir}/${s.name}"/* "${s.path}/"`).join('\n')}
+\`\`\`
+`;
+```
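Reviewer note: each row of the score progression table looks up the five dimensions by `id` and prints `-` for any that are missing, which covers the fallback evaluations from Phase 3 that have an empty `dimensions` array. A minimal sketch of one row; `scoreRow` and the sample iterations are hypothetical, and the dimension ids come from Step 5.2:

```javascript
// Dimension ids in the column order used by the Score Progression table
const DIMENSION_IDS = ['clarity', 'completeness', 'correctness', 'effectiveness', 'efficiency'];

// Hypothetical helper: render one markdown table row for an evaluated iteration.
function scoreRow(iteration) {
  const dims = iteration.evaluation.dimensions || [];
  // `?? '-'` prints a dash only when the dimension is absent, so a score of 0 still shows
  const cells = DIMENSION_IDS.map(id => dims.find(d => d.id === id)?.score ?? '-');
  return `| ${iteration.round} | ${iteration.evaluation.score} | ${cells.join(' | ')} |`;
}

console.log(scoreRow({ round: 1, evaluation: { score: 50, dimensions: [] } }));
// | 1 | 50 | - | - | - | - | - |
console.log(scoreRow({
  round: 2,
  evaluation: { score: 73, dimensions: [{ id: 'clarity', score: 80 }, { id: 'efficiency', score: 0 }] }
}));
// | 2 | 73 | 80 | - | - | - | 0 |
```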
+### Step 5.3: Write Report and Update State
+```javascript
+write_file(`${state.work_dir}/final-report.md`, report);
+state.status = 'completed';
+write_file(`${state.work_dir}/iteration-state.json`, JSON.stringify(state, null, 2));
+```
+### Step 5.4: Display Summary to User
+Output to user:
+```
+Skill Iter Tune Complete!
+Target: {skill names}
+Iterations: {count}
+Score: {initial} -> {final} ({outcome})
+Threshold: {threshold}
+Score trend: {score1} -> {score2} -> ... -> {scoreN}
+Full report: {workDir}/final-report.md
+Backups: {backupDir}/
+```
+## Output
+- **Files**: `final-report.md`
+- **State**: `status = completed`
+- **Next**: Workflow complete. Return control to user.