npm - create-merlin-brain - Versions diffs - 3.11.0 → 3.12.0 - Mend

create-merlin-brain 3.11.0 → 3.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (124)

package/bin/install.cjs +146 -22
package/bin/runtime-adapters.cjs +396 -0
package/dist/server/cost/tracker.d.ts +38 -2
package/dist/server/cost/tracker.d.ts.map +1 -1
package/dist/server/cost/tracker.js +87 -15
package/dist/server/cost/tracker.js.map +1 -1
package/dist/server/server.d.ts.map +1 -1
package/dist/server/server.js +74 -30
package/dist/server/server.js.map +1 -1
package/dist/server/tools/adaptive.js +1 -1
package/dist/server/tools/adaptive.js.map +1 -1
package/dist/server/tools/agents-index.js +3 -3
package/dist/server/tools/agents-index.js.map +1 -1
package/dist/server/tools/agents.js +5 -5
package/dist/server/tools/agents.js.map +1 -1
package/dist/server/tools/behaviors.js +4 -4
package/dist/server/tools/behaviors.js.map +1 -1
package/dist/server/tools/context.js +7 -7
package/dist/server/tools/context.js.map +1 -1
package/dist/server/tools/cost.d.ts +3 -1
package/dist/server/tools/cost.d.ts.map +1 -1
package/dist/server/tools/cost.js +66 -13
package/dist/server/tools/cost.js.map +1 -1
package/dist/server/tools/discoveries.js +6 -6
package/dist/server/tools/discoveries.js.map +1 -1
package/dist/server/tools/index.d.ts +4 -0
package/dist/server/tools/index.d.ts.map +1 -1
package/dist/server/tools/index.js +4 -0
package/dist/server/tools/index.js.map +1 -1
package/dist/server/tools/learning.d.ts +12 -0
package/dist/server/tools/learning.d.ts.map +1 -0
package/dist/server/tools/learning.js +269 -0
package/dist/server/tools/learning.js.map +1 -0
package/dist/server/tools/project.js +7 -7
package/dist/server/tools/project.js.map +1 -1
package/dist/server/tools/promote.d.ts +11 -0
package/dist/server/tools/promote.d.ts.map +1 -0
package/dist/server/tools/promote.js +315 -0
package/dist/server/tools/promote.js.map +1 -0
package/dist/server/tools/route.d.ts.map +1 -1
package/dist/server/tools/route.js +65 -24
package/dist/server/tools/route.js.map +1 -1
package/dist/server/tools/session-restore.d.ts +18 -0
package/dist/server/tools/session-restore.d.ts.map +1 -0
package/dist/server/tools/session-restore.js +154 -0
package/dist/server/tools/session-restore.js.map +1 -0
package/dist/server/tools/session-search.d.ts +16 -0
package/dist/server/tools/session-search.d.ts.map +1 -0
package/dist/server/tools/session-search.js +240 -0
package/dist/server/tools/session-search.js.map +1 -0
package/dist/server/tools/sights-index.js +2 -2
package/dist/server/tools/sights-index.js.map +1 -1
package/dist/server/tools/smart-route.d.ts.map +1 -1
package/dist/server/tools/smart-route.js +4 -5
package/dist/server/tools/smart-route.js.map +1 -1
package/dist/server/tools/verification.js +1 -1
package/dist/server/tools/verification.js.map +1 -1
package/files/agents/code-organization-supervisor.md +1 -0
package/files/agents/context-guardian.md +1 -0
package/files/agents/docs-keeper.md +1 -0
package/files/agents/dry-refactor.md +1 -0
package/files/agents/elite-code-refactorer.md +1 -0
package/files/agents/hardening-guard.md +1 -0
package/files/agents/implementation-dev.md +1 -0
package/files/agents/merlin-access-control-reviewer.md +248 -0
package/files/agents/merlin-codebase-mapper.md +1 -1
package/files/agents/merlin-dependency-auditor.md +216 -0
package/files/agents/merlin-executor.md +1 -0
package/files/agents/merlin-input-validator.md +247 -0
package/files/agents/merlin-reviewer.md +1 -0
package/files/agents/merlin-sast-reviewer.md +182 -0
package/files/agents/merlin-secret-scanner.md +203 -0
package/files/agents/tests-qa.md +1 -0
package/files/commands/merlin/execute-phase.md +94 -197
package/files/commands/merlin/execute-plan.md +116 -180
package/files/commands/merlin/health.md +385 -0
package/files/commands/merlin/loop-recipes.md +93 -36
package/files/commands/merlin/optimize-prompts.md +158 -0
package/files/commands/merlin/profiles.md +215 -0
package/files/commands/merlin/promote.md +176 -0
package/files/commands/merlin/quick.md +229 -0
package/files/commands/merlin/resume-work.md +27 -1
package/files/commands/merlin/route.md +43 -1
package/files/commands/merlin/sandbox.md +359 -0
package/files/commands/merlin/usage.md +55 -0
package/files/docker/Dockerfile.merlin +20 -0
package/files/docker/docker-compose.merlin.yml +23 -0
package/files/hook-templates/auto-commit.sh +64 -0
package/files/hook-templates/auto-format.sh +95 -0
package/files/hook-templates/auto-test.sh +117 -0
package/files/hook-templates/branch-protection.sh +72 -0
package/files/hook-templates/changelog-reminder.sh +76 -0
package/files/hook-templates/complexity-check.sh +112 -0
package/files/hook-templates/import-audit.sh +83 -0
package/files/hook-templates/license-header.sh +84 -0
package/files/hook-templates/pr-description.sh +100 -0
package/files/hook-templates/todo-tracker.sh +80 -0
package/files/hooks/check-file-size.sh +17 -4
package/files/hooks/config-change.sh +44 -16
package/files/hooks/instructions-loaded.sh +22 -5
package/files/hooks/notify-desktop.sh +157 -0
package/files/hooks/notify-webhook.sh +141 -0
package/files/hooks/pre-edit-sights-check.sh +76 -9
package/files/hooks/security-scanner.sh +153 -0
package/files/hooks/session-end-memory-sync.sh +97 -0
package/files/hooks/session-end.sh +274 -1
package/files/hooks/session-start.sh +19 -6
package/files/hooks/smart-approve.sh +270 -0
package/files/hooks/teammate-idle-verify.sh +87 -12
package/files/hooks/worktree-create.sh +20 -3
package/files/hooks/worktree-remove.sh +21 -3
package/files/merlin/references/plan-format.md +37 -9
package/files/merlin/sandbox.json +9 -0
package/files/merlin/security.json +11 -0
package/files/merlin/templates/ci/docs-update.yml +81 -0
package/files/merlin/templates/ci/pr-review.yml +50 -0
package/files/merlin/templates/ci/security-audit.yml +74 -0
package/files/merlin/templates/config.json +9 -1
package/files/rules/api-rules.md +30 -0
package/files/rules/frontend-rules.md +25 -0
package/files/rules/hooks-rules.md +36 -0
package/files/rules/mcp-rules.md +30 -0
package/files/rules/worker-rules.md +29 -0
package/package.json +1 -1

package/files/commands/merlin/health.md ADDED

@@ -0,0 +1,385 @@
+---
+name: merlin:health
+description: Validate .planning/ directory integrity and auto-repair common issues
+argument-hint: "[--repair]"
+allowed-tools:
+  - Read
+  - Bash
+  - Glob
+  - Grep
+  - Write
+  - AskUserQuestion
+---
+<objective>
+Check the health of the .planning/ directory and Merlin configuration.
+Runs 10 checks covering file existence, structural validity, phase consistency,
+hook registration, and MCP config. Reports issues with clear status indicators.
+With `--repair`: auto-fixes issues that are safe to fix without user input.
+Confirms before destructive actions (e.g., removing orphaned directories).
+</objective>
+<context>
+Arguments: $ARGUMENTS
+Check for repair mode:
+- `--repair` flag present → run in repair mode after checks complete
+- No flag → read-only check, suggest repair at end
+</context>
+<process>
+## Step 1: Parse Mode
+```bash
+REPAIR_MODE=false
+echo "$ARGUMENTS" | grep -q -e "--repair" && REPAIR_MODE=true
+```
+## Step 2: Run All Checks
+Run each check and collect results. Track pass/fail/warn counts.
+---
+### Check 1: PROJECT.md exists and has required sections
+```bash
+cat .planning/PROJECT.md 2>/dev/null
+```
+- PASS: file exists AND contains "Vision" AND "Stack" AND "Architecture"
+- WARN: file exists but missing one or more sections
+- FAIL: file does not exist
+---
+### Check 2: ROADMAP.md exists and phases are numbered correctly
+```bash
+cat .planning/ROADMAP.md 2>/dev/null
+```
+- PASS: file exists AND phases are numbered sequentially (01, 02, 03... or 1, 2, 3...)
+- WARN: file exists but numbering has gaps (01, 02, 04 — missing 03)
+- FAIL: file does not exist
+Extract:
+- Total phase count
+- Milestone count (count `## Milestone` or `## v` headers)
+- List of phase numbers for cross-referencing in later checks
+---
+### Check 3: REQUIREMENTS.md exists (if roadmap references requirements)
+```bash
+# Check if roadmap mentions requirements
+grep -i "requirement\|REQUIREMENTS" .planning/ROADMAP.md 2>/dev/null | head -3
+cat .planning/REQUIREMENTS.md 2>/dev/null | head -5
+```
+- PASS: REQUIREMENTS.md exists
+- WARN: roadmap mentions requirements but REQUIREMENTS.md is missing
+- SKIP: roadmap does not reference requirements — note "not required"
+---
+### Check 4: STATE.md exists and has valid current position
+```bash
+cat .planning/STATE.md 2>/dev/null
+```
+Check:
+- File exists
+- Contains a current phase reference
+- Note the last-modified date:
+```bash
+# Get last modified date (GNU stat first, then BSD/macOS stat)
+{ stat --format="%y" .planning/STATE.md 2>/dev/null || \
+  stat -f "%Sm" -t "%Y-%m-%d" .planning/STATE.md 2>/dev/null; } | cut -d' ' -f1
+```
+- PASS: exists and was modified within the last 7 days
+- WARN: exists but last modified more than 7 days ago (note how many days)
+- FAIL: file does not exist
+---
+### Check 5: Phase directories match roadmap phases
+```bash
+# List actual phase directories
+ls -d .planning/phases/*/ 2>/dev/null | xargs -I{} basename {} 2>/dev/null
+# Cross-reference with phase numbers extracted in Check 2
+```
+Compare:
+- Phase numbers in ROADMAP.md vs directories in `.planning/phases/`
+- Identify MISSING directories (in roadmap but not on disk)
+- Identify ORPHANED directories (on disk but not in roadmap)
+- PASS: all roadmap phases have directories, no orphans
+- WARN: minor mismatch (1-2 issues)
+- FAIL: 3 or more missing or orphaned directories
+---
+### Check 6: PLAN.md files are valid (have tasks)
+```bash
+# Find all PLAN.md files
+find .planning/phases -name "*-PLAN.md" 2>/dev/null | head -20
+```
+For each PLAN.md found, check it contains at least one `<task>` element or task list item.
+```bash
+# Quick check for task content (find avoids relying on bash globstar for **)
+find .planning/phases -name "*-PLAN.md" -exec grep -lE "<task|## Task|- \[ \]" {} + 2>/dev/null | wc -l
+find .planning/phases -name "*-PLAN.md" -exec grep -LE "<task|## Task|- \[ \]" {} + 2>/dev/null
+```
+- PASS: all PLAN.md files have task content
+- WARN: one or more PLAN.md files appear empty or lack task markers
+- SKIP: no PLAN.md files exist yet (project may be in early planning)
+---
+### Check 7: No orphaned phase directories
+Already computed in Check 5. Report here as a dedicated line item.
+---
+### Check 8: No missing phase directories
+Already computed in Check 5. Report here as a dedicated line item.
+---
+### Check 9: .merlin/config.json exists and has valid API key format
+```bash
+cat .merlin/config.json 2>/dev/null
+```
+Also check global config:
+```bash
+cat ~/.claude/merlin/settings.json 2>/dev/null | grep -i "apiKey\|api_key\|key" | head -3
+```
+- PASS: config exists AND apiKey field matches pattern `mrln_[a-zA-Z0-9]{20,}`
+- WARN: config exists but apiKey format looks unusual
+- FAIL: config does not exist or apiKey is missing/empty
+---
+### Check 10: Hooks registered in Claude settings
+```bash
+# Check local settings first
+cat .claude/settings.local.json 2>/dev/null | grep -o '"hooks"' | head -1
+cat .claude/settings.json 2>/dev/null | grep -o '"hooks"' | head -1
+# Count hook entries
+cat .claude/settings.local.json 2>/dev/null | grep -o '"command"' | wc -l
+cat .claude/settings.json 2>/dev/null | grep -o '"command"' | wc -l
+```
+For each hook file referenced, verify it exists on disk:
+```bash
+# Extract hook commands from both settings files and check each file exists
+cat .claude/settings.local.json .claude/settings.json 2>/dev/null | grep '"command"' | grep -o '"[^"]*\.sh[^"]*"' | tr -d '"' | while IFS= read -r f; do
+  # Expand $HOME
+  expanded="${f/\$HOME/$HOME}"
+  [ -f "$expanded" ] && echo "OK: $f" || echo "MISSING: $f"
+done
+```
+- PASS: hooks key exists AND all referenced files exist
+- WARN: hooks registered but one or more hook files missing on disk
+- FAIL: hooks key not found in either settings file
+## Step 3: Compile Score and Display Results
+Count:
+- PASS checks → contribute 1 point each
+- WARN checks → contribute 0.5 points each
+- FAIL checks → contribute 0 points each
+- SKIP checks → excluded from denominator
+Score = (sum of points / number of applicable checks) × 10, rounded to the nearest integer
+Display:
+```
+════════════════════════════════════════
+Merlin Health Check
+════════════════════════════════════════
+[1]  PROJECT.md         — {PASS/WARN/FAIL} {detail}
+[2]  ROADMAP.md         — {PASS/WARN/FAIL} ({N} phases, {M} milestones)
+[3]  REQUIREMENTS.md    — {PASS/WARN/SKIP} {detail}
+[4]  STATE.md           — {PASS/WARN/FAIL} {detail, e.g., "last updated 3 days ago"}
+[5]  Phase dirs match   — {PASS/WARN/FAIL} {detail}
+[6]  PLAN.md validity   — {PASS/WARN/SKIP} {detail}
+[7]  No orphaned dirs   — {PASS/WARN/FAIL} {list orphans if any}
+[8]  No missing dirs    — {PASS/WARN/FAIL} {list missing if any}
+[9]  Merlin config      — {PASS/WARN/FAIL} {detail}
+[10] Hooks registered   — {PASS/WARN/FAIL} ({N} hooks, all files present)
+────────────────────────────────────────
+Score: {X}/10   ({N} issues found)
+────────────────────────────────────────
+{If issues > 0 and not in repair mode:}
+Run `/merlin:health --repair` to auto-fix safe issues.
+```
+If score is 10/10:
+```
+Everything looks healthy. No action needed.
+```
+## Step 4: Repair (only if --repair flag present)
+If no issues found, skip this step.
+Group issues by repair safety:
+**Auto-fix (no confirmation needed):**
+- Missing STATE.md → create with sensible defaults
+- Missing phase directories (referenced in roadmap) → create empty directories
+**Report only (manual action required):**
+- Missing REQUIREMENTS.md → note it, do not auto-create (content requires user input)
+- Phase numbering gaps in ROADMAP.md → report the gap, do not renumber
+**Confirm before fixing:**
+- Orphaned phase directories → ask before removing
+---
+### Repair: Missing STATE.md
+```bash
+cat > .planning/STATE.md << 'EOF'
+# State
+## Current Position
+- Phase: 1
+- Status: planning
+## Recent Decisions
+(none yet)
+## Blockers
+(none)
+## Notes
+(auto-created by /merlin:health --repair)
+EOF
+```
+Report: "Created .planning/STATE.md with defaults."
+---
+### Repair: Missing phase directories
+For each phase in roadmap that lacks a directory:
+```bash
+mkdir -p ".planning/phases/{phase-dir-name}"
+```
+Report: "Created .planning/phases/{dir}/"
+---
+### Repair: Orphaned phase directories
+For each orphaned directory, ask:
+```
+Found orphaned directory: .planning/phases/{dir}/
+This directory is not referenced in ROADMAP.md.
+Remove it? (y/n)
+```
+Use AskUserQuestion if multiple orphans exist — confirm all at once with a list.
+If confirmed:
+```bash
+rm -rf ".planning/phases/{dir}"
+```
+Report: "Removed .planning/phases/{dir}/"
+---
+### Repair: Phase numbering gaps in ROADMAP.md
+If gaps detected in phase numbering (e.g., 01, 02, 04 — missing 03):
+Report the gap but do NOT auto-rename — renumbering phases can break STATE.md references and directory names. Suggest the user fix manually or use `/merlin:insert-phase`.
+```
+Phase numbering gap detected: phases jump from 02 to 04.
+Manual fix recommended — use /merlin:insert-phase to add the missing phase,
+or edit ROADMAP.md directly and update STATE.md to match.
+```
+---
+### Repair Summary
+After all repairs:
+```
+════════════════════════════════════════
+Repair Complete
+════════════════════════════════════════
+Fixed:
+{list of auto-fixed items}
+Skipped (manual action required):
+{list of items that need manual attention}
+────────────────────────────────────────
+Run `/merlin:health` to verify.
+```
+</process>
+<check_reference>
+| # | Check | PASS | WARN | FAIL | Auto-Repair |
+|---|-------|------|------|------|-------------|
+| 1 | PROJECT.md | exists + 3 sections | exists, missing sections | missing | No |
+| 2 | ROADMAP.md | exists + sequential | exists, gaps in numbering | missing | No |
+| 3 | REQUIREMENTS.md | exists | missing but referenced | n/a | No |
+| 4 | STATE.md | exists, <7 days old | exists, >7 days old | missing | Yes — create with defaults |
+| 5 | Phase dirs match | all match | 1-2 mismatches | many mismatches | Partial |
+| 6 | PLAN.md validity | all have tasks | some empty | n/a | No |
+| 7 | No orphans | no orphans | 1-2 orphans | many orphans | Yes — with confirmation |
+| 8 | No missing dirs | none missing | 1-2 missing | many missing | Yes — mkdir |
+| 9 | Merlin config | key + valid format | key present, odd format | missing | No |
+| 10 | Hooks | registered + files exist | files missing | not registered | No |
+</check_reference>
+<success_criteria>
+- [ ] All 10 checks run and reported
+- [ ] Score calculated and displayed
+- [ ] --repair mode applies safe fixes without confirmation
+- [ ] Orphaned directory removal always confirmed first
+- [ ] Phase renumbering explicitly deferred to manual action
+- [ ] Final output is scannable in under 20 seconds
+</success_criteria>

package/files/commands/merlin/loop-recipes.md CHANGED

@@ -17,57 +17,114 @@ Output ONLY the reference content below. Do NOT add:
 <reference>
 # Merlin Loop Recipes
-Pre-built `/loop` patterns for common Merlin workflows. Copy and use directly.
+Pre-built `/loop` patterns for common Merlin workflows. Copy and paste directly into Claude Code.
-## Development Monitoring
+---
-- `/loop 5m /merlin:progress` — Check project progress every 5 minutes
-- `/loop 2m check build status and report any failures` — CI monitoring
-- `/loop 10m /merlin:standup` — Running standup summary
-- `/loop 15m check if any new errors appeared in the logs` — Log monitoring
+## How to Use
-## Code Quality
+```
+/loop <interval> <command or prompt>
+```
-- `/loop 30m scan for files over 400 lines and list them` — File size monitoring
-- `/loop 1h review recent changes for security issues` — Security patrol
-- `/loop 4h check if documentation is up to date with recent code changes` — Doc freshness
-- `/loop 2h scan for duplicate code or functions that do the same thing` — DRY check
+Intervals: for example `2m`, `5m`, `10m`, `30m`, `1h`, `4h`. Natural-language intervals also work.
-## Sights & Context
+---
-- `/loop 30m merlin_get_context("current work progress")` — Keep Sights context warm
-- `/loop 1h merlin_lite_refresh()` — Refresh local Sights analysis
-- `/loop 6h run /merlin:map-codebase to keep codebase index fresh` — Index freshness
+## Recipes
-## Deployment
+### Progress Check
+**Command:** `/loop 5m /merlin:progress`
+**Interval:** 5 minutes
+**What it does:** Runs `/merlin:progress` on each tick to summarize what has been completed, what is in progress, and what is blocked.
+**When to use:** Active development sessions where you want a live project status dashboard without manually checking in.
-- `/loop 2m check Railway deployment status` — Deploy monitoring
-- `/loop 5m check if PR checks have passed` — PR monitoring
-- `/loop 10m check if the staging environment is healthy` — Staging health
+---
-## Planning & Todos
+### Standup Generator
+**Command:** `/loop 10m /merlin:standup`
+**Interval:** 10 minutes
+**What it does:** Generates a running standup summary — what was done, what is next, any blockers — updated every 10 minutes.
+**When to use:** Long coding sessions, pair programming, or async team work where stakeholders need regular updates.
-- `/loop 1h /merlin:check-todos and summarize what's pending` — Todo awareness
-- `/loop 24h /merlin:standup and create a brief daily summary` — Daily digest
+---
-## Usage
+### Sights Warm
+**Command:** `/loop 30m call merlin_get_context with current task summary to keep Sights context fresh`
+**Interval:** 30 minutes
+**What it does:** Calls `merlin_get_context` with a summary of current work so the Merlin Sights knowledge layer stays warm and fast.
+**When to use:** Long sessions where you want instant context retrieval without cold-start delays throughout the day.
-Just copy any recipe above and paste it into Claude Code. Customize the interval and prompt as needed.
+---
-```
-/loop 5m /merlin:progress
-```
+### CI Monitor
+**Command:** `/loop 2m check gh run list --limit 3 and report status of each run`
+**Interval:** 2 minutes
+**What it does:** Polls the last 3 GitHub Actions runs and reports pass/fail/in-progress status with links to failing jobs.
+**When to use:** After pushing a branch or opening a PR — keep eyes on CI without switching tabs.
+---
+### Security Scan
+**Command:** `/loop 1h scan for any new dependencies added since last check and report known vulnerabilities`
+**Interval:** 1 hour
+**What it does:** Checks for newly added packages in package.json / requirements.txt and flags any with known CVEs or suspicious origins.
+**When to use:** Active feature development where new packages are being pulled in, or any session touching the dependency tree.
+---
+### Doc Freshness
+**Command:** `/loop 4h check if any .md files are outdated relative to recent code changes and list what needs updating`
+**Interval:** 4 hours
+**What it does:** Compares recent code diffs against documentation files and surfaces docs that reference changed APIs, configs, or behaviors.
+**When to use:** Background monitoring during a sprint, so that stale docs do not accumulate unnoticed.
+---
-Native `/loop` features:
-- Supports natural language intervals (e.g., "every 5 minutes", "5m", "2h")
-- Default interval: 10 minutes
-- Auto-expires after 3 days
-- Session-scoped (lives in your active session)
-- Can run any slash command or natural language prompt
+### Test Runner
+**Command:** `/loop 3m run the test suite and report any new failures compared to the last run`
+**Interval:** 3 minutes
+**What it does:** Runs the project test suite on each tick and highlights any tests that newly broke since the previous iteration.
+**When to use:** TDD workflows or refactoring sessions where you want immediate feedback on regressions without running tests manually.
+---
+### Cost Tracker
+**Command:** `/loop 15m call merlin_session_cost and report the current session token spend`
+**Interval:** 15 minutes
+**What it does:** Calls `merlin_session_cost` to report cumulative token usage and estimated cost for the current session.
+**When to use:** Long autonomous sessions or when working with expensive models — prevents bill surprises on unattended runs.
+---
+## More Patterns
+### Development Monitoring
+- `/loop 2m check Railway deployment status and report if unhealthy`
+- `/loop 5m check if PR checks have passed and summarize results`
+- `/loop 15m check if any new errors appeared in the logs`
+### Code Quality
+- `/loop 30m scan for files over 400 lines and list them`
+- `/loop 2h scan for duplicate code or functions that do the same thing`
+### Sights & Context
+- `/loop 1h run /merlin:map-codebase to keep codebase index fresh`
+### Planning
+- `/loop 1h /merlin:check-todos and summarize what is pending`
+- `/loop 24h /merlin:standup and write a brief daily digest`
+---
 ## Tips
-- Combine with `/merlin:progress` for a live project dashboard
-- Use shorter intervals (2-5m) during active deploys, longer (1h+) for background monitoring
-- Chain recipes: let one loop inform what the next one checks
+- Use shorter intervals (2–5m) during active deploys; longer (1h+) for background checks.
+- Chain recipes: let one loop inform what the next checks.
+- Loops auto-expire after 3 days and are session-scoped.
+- Default interval when none is specified: 10 minutes.
 </reference>

package/files/commands/merlin/optimize-prompts.md ADDED

@@ -0,0 +1,158 @@
+---
+name: merlin:optimize-prompts
+description: Analyze agent prompt effectiveness and surface actionable improvement suggestions
+argument-hint: "[agent-name]"
+allowed-tools:
+  - mcp__merlin__merlin_prompt_suggestions
+  - mcp__merlin__merlin_get_behaviors
+  - AskUserQuestion
+---
+<objective>
+Review how well each agent is performing based on session outcome data, then walk through
+improvement suggestions one by one. You never auto-modify any agent file — the user approves
+each suggestion and applies it manually.
+</objective>
+<process>
+## Step 1: Parse Arguments
+Extract from $ARGUMENTS (optional):
+- **agent-name**: Focus the report on one specific agent
+Store as `TARGET_AGENT` (may be empty).
+## Step 2: Fetch Effectiveness Report
+Call `merlin_prompt_suggestions` with `agentFilter: TARGET_AGENT` (omit if empty).
+If the result says "No outcome data yet":
+```
+No outcome data has been collected yet.
+Outcome tracking starts automatically at session end. Run a few sessions,
+then come back to /merlin:optimize-prompts.
+```
+Stop here.
+## Step 3: Display the Report
+Print the full effectiveness report exactly as returned. It includes:
+- A score table (all agents or filtered agent)
+- Suggestions section for low-scoring agents
+- High-confidence behavior hints (if any)
+## Step 4: Check for Suggestions
+Scan the report for the **Suggestions** section.
+If it says "All tracked agents are performing above the threshold":
+```
+All agents are performing well — no prompt changes needed at this time.
+Check back after more sessions to track trends.
+```
+Stop here.
+## Step 5: Walk Through Each Suggestion
+For each suggestion in the report, present it clearly:
+```
+Suggestion for: {agent-name}  (score: {score}%)
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Suggested addition:
+  "{suggestion text}"
+Evidence: {evidence}
+Confidence: {confidence}
+Agent file: ~/.claude/agents/{agent-name}.md
+[A] Apply — show me exactly what to add
+[S] Skip this suggestion
+[Q] Stop reviewing
+```
+Ask the user to choose A, S, or Q.
+## Step 6: Show Exact Text to Add (if Approved)
+If the user chooses **A**:
+Show the exact text block they should paste into the agent's .md file.
+Put it inside a fenced code block so they can copy it easily:
+```
+Add the following to ~/.claude/agents/{agent-name}.md
+(paste into the relevant section of the agent prompt):
+\`\`\`markdown
+## Effectiveness Note
+{suggestion text}
+\`\`\`
+Apply this change manually — Merlin never edits agent files directly.
+After applying, run a few more sessions to see if the score improves.
+```
+Do NOT call any edit or write tool. The user applies the change themselves.
+Then move to the next suggestion (if any).
+## Step 7: Handle Behavior Hints
+If the report includes a "High-Confidence Behaviors Ready for Agent Prompts" section,
+show it after all suggestions are handled:
+```
+High-Confidence Behaviors
+━━━━━━━━━━━━━━━━━━━━━━━━━
+These behaviors have been validated enough to embed into agent prompts:
+{list from report}
+To promote a behavior into a skill first, run /merlin:promote.
+To embed directly, add the action text to the agent .md under a "Learned Patterns" section.
+```
+Ask if the user wants to see the promote workflow: if yes, tell them to run `/merlin:promote`.
+## Step 8: Summary
+After all suggestions are reviewed:
+```
+Optimize-Prompts complete
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Applied: {N} suggestion(s) shown for manual application
+Skipped: {M} suggestion(s)
+Changes to agent .md files take effect in the next session.
+Run /merlin:optimize-prompts again after 5+ sessions to track improvement.
+```
+</process>
+<design_notes>
+- This command is a guided UX wrapper around merlin_prompt_suggestions
+- NEVER auto-edit agent .md files — always show text for manual copy-paste
+- The tool does the analysis; this command handles the UX conversation
+- Outcome data is collected automatically by session-end.sh — users don't need to do anything
+- Scores improve when agents complete sessions that end with file changes and no errors
+</design_notes>
+<error_handling>
+| Condition | Action |
+|-----------|--------|
+| No outcomes.jsonl | Explain collection is automatic, suggest running sessions first |
+| All agents scoring well | Confirm good health, no action needed |
+| Only 1-2 sessions tracked | Note low confidence, suggest waiting for more data |
+| User rejects all suggestions | Exit cleanly with "Nothing applied" message |
+| Behavior hints fetch fails | Skip that section silently, show the score table anyway |
+</error_handling>