npm - create-merlin-brain - Versions diffs - 2.7.0 → 3.0.1 - Mend

create-merlin-brain 2.7.0 → 3.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29)

package/bin/install.cjs +9 -1
package/files/CLAUDE.md +96 -10
package/files/agents/context-guardian.md +1 -1
package/files/agents/elite-code-refactorer.md +1 -1
package/files/agents/merlin-milestone-auditor.md +32 -19
package/files/agents/merlin-planner.md +192 -0
package/files/agents/merlin-work-verifier.md +90 -0
package/files/agents/merlin.md +103 -0
package/files/commands/merlin/audit-milestone.md +21 -6
package/files/commands/merlin/debug.md +15 -11
package/files/commands/merlin/execute-phase.md +237 -230
package/files/commands/merlin/execute-plan.md +241 -362
package/files/commands/merlin/help.md +1 -1
package/files/commands/merlin/map-codebase.md +83 -16
package/files/commands/merlin/plan-phase.md +193 -117
package/files/commands/merlin/research-phase.md +14 -10
package/files/commands/merlin/research-project.md +20 -9
package/files/commands/merlin/route.md +272 -84
package/files/commands/merlin/update.md +20 -20
package/files/commands/merlin/verify-work.md +97 -41
package/files/commands/merlin/whats-new.md +5 -5
package/files/merlin/VERSION +1 -1
package/files/merlin/templates/debug-subagent-prompt.md +12 -8
package/files/merlin/templates/research-subagent-prompt.md +6 -6
package/files/merlin/workflows/diagnose-issues.md +17 -8
package/files/merlin/workflows/execute-phase.md +44 -19
package/files/merlin/workflows/execute-plan.md +59 -99
package/files/merlin/workflows/map-codebase.md +42 -43
package/package.json +2 -2

package/bin/install.cjs CHANGED

@@ -145,6 +145,7 @@ ${colors.magenta}${colors.bright}
   ╚═╝     ╚═╝╚══════╝╚═╝  ╚═╝╚══════╝╚═╝╚═╝  ╚═══╝
 ${colors.reset}
   ${colors.cyan}The Ultimate AI Brain for Claude Code${colors.reset}
+  ${colors.bright}v3 — One Install, Everything Included${colors.reset}
   Four integrated layers:
   • ${colors.bright}Merlin Loop${colors.reset} - Autonomous orchestration (hours of unattended work)
@@ -296,7 +297,7 @@ merlin-loop() {
   fs.appendFileSync(rcFile, merlinBlock);
   logSuccess(`Configured 'claude' command to use Merlin`);
   logSuccess(`Added 'cc' shortcut`);
-  logSuccess(`Enabled auto-update checks`);
+  logSuccess(`Version check enabled (run /merlin:update to update)`);
   return true;
 }
@@ -496,6 +497,9 @@ function cleanupLegacy() {
     path.join(CLAUDE_DIR, 'gsd'),
     path.join(CLAUDE_DIR, 'commands', 'gsd'),
+    // Old Merlin Pro (merged into single package in v3)
+    path.join(CLAUDE_DIR, 'pro'),
     // Old ccwiki remnants
     path.join(CLAUDE_DIR, 'ccwiki'),
     path.join(CLAUDE_DIR, 'commands', 'ccwiki'),
@@ -543,6 +547,10 @@ function cleanupLegacy() {
     // Old config files
     path.join(CLAUDE_DIR, 'gsd-config.json'),
     path.join(CLAUDE_DIR, 'ccwiki-config.json'),
+    // Old Pro marker file (merged into single package in v3)
+    path.join(CLAUDE_DIR, '.pro'),
+    path.join(MERLIN_DIR, '.pro'),
   ];
   for (const file of filesToRemove) {
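
The two hunks above add three Pro-related paths to the installer's legacy cleanup lists. Before upgrading, it may be worth checking which of them exist locally. A minimal sketch, assuming `CLAUDE_DIR` resolves to `~/.claude` and `MERLIN_DIR` to `~/.claude/merlin` (neither value is shown in this diff); `list_legacy_pro_paths` is a hypothetical helper, not part of the package:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints each Pro-era path that still exists.
# $1 = assumed CLAUDE_DIR (default ~/.claude), $2 = assumed MERLIN_DIR.
list_legacy_pro_paths() {
  local claude_dir="${1:-$HOME/.claude}"
  local merlin_dir="${2:-$claude_dir/merlin}"
  local p
  # The three paths added to cleanupLegacy() in 3.0.1.
  for p in "$claude_dir/pro" "$claude_dir/.pro" "$merlin_dir/.pro"; do
    if [ -e "$p" ]; then
      printf '%s\n' "$p"
    fi
  done
}
```

Running `list_legacy_pro_paths` with no arguments checks the assumed default locations and prints nothing when none of the paths exist.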

package/files/CLAUDE.md CHANGED

@@ -30,6 +30,23 @@ Which would you like?
 Wait for user response if not connected.
+### Step 1.5: Check for Updates (Quick)
+```bash
+INSTALLED=$(cat ~/.claude/merlin/VERSION 2>/dev/null)
+LATEST=$(npm view create-merlin-brain version 2>/dev/null)
+```
+**If LATEST is newer than INSTALLED:**
+```
+⚡ Merlin update available: v{INSTALLED} → v{LATEST}
+[1] Update now (`/merlin:update`)
+[2] Skip for now
+```
+**If up to date or check fails:** Skip silently — don't slow down the session.
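
Step 1.5 says to prompt only "if LATEST is newer than INSTALLED" but leaves the comparison itself unspecified. A minimal sketch of one way to do it in bash, assuming plain dotted numeric versions such as `2.7.0` and `3.0.1`; `version_is_newer` and `check_merlin_update` are hypothetical names, not part of the package:

```bash
#!/usr/bin/env bash
# Hypothetical helper: exits 0 only when $2 is strictly newer than $1.
# Assumes MAJOR.MINOR.PATCH; anything after the leading digits of a
# component (for example a pre-release suffix) is ignored.
version_is_newer() {
  local installed="$1" latest="$2"
  # An empty value means a lookup failed: report "no update" and stay silent.
  if [ -z "$installed" ] || [ -z "$latest" ]; then return 1; fi
  local IFS=.
  local -a have=($installed) want=($latest)
  local i x y
  for i in 0 1 2; do
    x="${have[i]:-0}"; y="${want[i]:-0}"
    x="${x%%[!0-9]*}"; y="${y%%[!0-9]*}"   # keep leading digits only
    x="${x:-0}"; y="${y:-0}"
    if (( 10#$y > 10#$x )); then return 0; fi
    if (( 10#$y < 10#$x )); then return 1; fi
  done
  return 1   # versions are equal
}

# Wraps the two lookups quoted in Step 1.5; prints nothing unless an update exists.
check_merlin_update() {
  local installed latest
  installed=$(cat "$HOME/.claude/merlin/VERSION" 2>/dev/null) || true
  latest=$(npm view create-merlin-brain version 2>/dev/null) || true
  if version_is_newer "$installed" "$latest"; then
    echo "Merlin update available: v${installed} -> v${latest}"
  fi
}
```

Comparing component by component as integers avoids the string-comparison trap where `2.10.0` sorts before `2.9.0`.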
 ### Step 2: Check Project Status (THE ENFORCEMENT LOOP)
 ```
@@ -89,7 +106,7 @@ Merlin is a complete AI-powered development system with three integrated layers:
 ---
-## UNIVERSAL ROUTING PROTOCOL
+## UNIVERSAL ROUTING PROTOCOL — FRESH PROCESS ISOLATION
 **Every specialist routing decision goes through `/merlin:route`.**
@@ -98,12 +115,34 @@ When Merlin decides a specialist agent should handle a task:
 Skill("merlin:route", args='<agent-name> "<task description>"')
 ```
-This spawns a **fresh Claude sub-agent** with:
-- The specialist's system prompt (from `~/.claude/agents/<agent-name>.md`)
-- Merlin Sights context injected automatically
-- Mode-aware tools (interactive = can ask questions, automated = makes assumptions)
+This spawns a **truly fresh Claude process** via `claude --agent <name> -p`:
+- New PID, new 200K context window — completely isolated from orchestrator
+- The specialist's system prompt loaded via `--agent` flag
+- Merlin Sights context injected into the handoff file
+- Context passed via file, results returned as compact summary
+- Orchestrator stays lean — never polluted by specialist work
+**The orchestrator decides WHO. The fresh process decides HOW.**
-**The orchestrator decides WHO. The sub-agent decides HOW.**
+### How It Works (Deterministic)
+```
+Orchestrator                    Fresh Process
+    |                               |
+    |-- Write handoff.md ---------> |
+    |-- Spawn: claude --agent -p -> |  (new PID, 200K context)
+    |                               |-- Read handoff
+    |                               |-- Read agent system prompt
+    |                               |-- Get Sights context
+    |                               |-- Do the work
+    |                               |-- Write result
+    |   <---- Return summary -------|
+    |                               |  (process exits)
+    |-- Parse result                |
+    |-- Present to user             |
+```
+Every route: fresh process. Every time. No exceptions.
 ### Quick Routing Reference
@@ -118,12 +157,24 @@ This spawns a **fresh Claude sub-agent** with:
 | Deploy/ops | ops-railway | `Skill("merlin:route", args='ops-railway "..."')` |
 | Documentation | docs-keeper | `Skill("merlin:route", args='docs-keeper "..."')` |
-### Mode
+### Two Modes — User Chooses
+- **Interactive** (default): Orchestrator asks user clarifying questions BEFORE spawning. Handles checkpoints AFTER. User stays in the loop.
+- **Automated**: Orchestrator sends everything to agent. Agent makes reasonable assumptions. No questions asked.
-- **Interactive** (default): Sub-agent CAN ask the user clarifying questions
-- **Automated**: Sub-agent makes reasonable assumptions, does NOT ask questions
+**CRITICAL:** Both modes spawn a fresh process. The mode only controls whether the orchestrator gathers user input before/after spawning.
-Mode only controls AskUserQuestion availability. Depth (fresh sub-agent per specialist) is **always on**.
+Set mode via:
+1. Command argument: `--interactive` or `--automated`
+2. Project config: `.planning/config.json` → `{ "mode": "interactive" }`
+3. User settings: `~/.claude/merlin/settings.local.json` → `{ "mode": "automated" }`
+4. Default: `interactive`
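
The list above reads as a precedence order, although the diff does not say so explicitly. A minimal sketch under that assumption; `resolve_mode`, `read_mode_from`, and the `PROJECT_CONFIG` / `USER_SETTINGS` override variables are hypothetical, and the `sed` extraction is a naive stand-in for a real JSON parser:

```bash
#!/usr/bin/env bash
# Prints the first "mode": "<value>" found in a JSON file, or nothing
# if the file or key is missing. Not a general JSON parser.
read_mode_from() {
  [ -f "$1" ] || return 0
  sed -n 's/.*"mode"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' "$1" | head -n 1
}

# Assumed precedence: command argument, project config, user settings, default.
resolve_mode() {
  local arg file mode
  for arg in "$@"; do
    case "$arg" in
      --interactive) echo interactive; return 0 ;;
      --automated)   echo automated;   return 0 ;;
    esac
  done
  for file in "${PROJECT_CONFIG:-.planning/config.json}" \
              "${USER_SETTINGS:-$HOME/.claude/merlin/settings.local.json}"; do
    mode=$(read_mode_from "$file")
    case "$mode" in
      interactive|automated) echo "$mode"; return 0 ;;
    esac
  done
  echo interactive   # documented default
}
```

Unrecognised values in either file fall through to the next source rather than being passed on, so a typo in a config file degrades to the default instead of an undefined mode.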
+### Why NOT Task()
+`Task()` creates a sub-agent within the SAME Claude session. The sub-agent's output flows back into the parent's context window, rapidly exhausting tokens. After 3-4 routes, you hit the ceiling.
+`claude --agent -p` via Bash spawns a completely separate OS process. Fresh 200K. No context pollution. Unlimited routes possible.
 ---
@@ -603,6 +654,41 @@ Access via `/merlin:*` commands:
 - `/merlin:verify-work` - Validate built features
 - `/merlin:help` - See all commands
+### CRITICAL: Workflow Commands Spawn FRESH PROCESSES
+**Heavy workflow commands are THIN ORCHESTRATORS that spawn fresh `claude --agent -p` processes.**
+They write a handoff file to /tmp, spawn a fresh CLI process, and parse the compact result.
+The orchestrator's context is NEVER polluted by the specialist's work.
+| Command | Agent Process | Isolation |
+|---------|--------------|-----------|
+| `/merlin:route` | `claude --agent <specialist> -p` | Fresh 200K per route |
+| `/merlin:plan-phase` | `claude --agent merlin-planner -p` | Fresh 200K for planning |
+| `/merlin:execute-phase` | `claude --agent merlin-executor -p` (per plan) | Fresh 200K per plan |
+| `/merlin:execute-plan` | `claude --agent merlin-executor -p` | Fresh 200K for execution |
+| `/merlin:map-codebase` | `claude --agent merlin-codebase-mapper -p` (per area) | Fresh 200K per area |
+| `/merlin:verify-work` | `claude --agent merlin-work-verifier -p` | Fresh 200K for verification |
+| `/merlin:research-phase` | `claude --agent merlin-researcher -p` | Fresh 200K for research |
+| `/merlin:debug` | `claude --agent merlin-debugger -p` | Fresh 200K for debugging |
+**How fresh process spawning works:**
+1. Orchestrator writes handoff file to `/tmp/merlin-<command>-<pid>/handoff.md`
+2. Spawns: `cat handoff.md | claude --agent <name> -p --permission-mode acceptEdits --output-format text`
+3. Fresh process gets 200K context, reads its own files, does the work
+4. Fresh process exits, returns compact structured result
+5. Orchestrator parses result, presents to user, cleans up /tmp
+**NEVER use Task().** Task() runs within the same context window and pollutes it.
+**ALWAYS use `claude --agent -p` via Bash.** True process isolation, deterministic handoff.
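
The five spawning steps above can be sketched as a single bash function. This is an illustration under stated assumptions, not the package's implementation: `spawn_fresh` and the `CLAUDE_BIN` override are hypothetical, the CLI flags are copied from the diff without being verified here, and `mktemp -d` is swapped in for the PID-based `/tmp/merlin-<command>-<pid>` directory to avoid a predictable path:

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a handoff file, pipe it to a fresh agent process,
# print the agent's output, clean up, and propagate the agent's exit status.
spawn_fresh() {
  local agent="$1" task="$2" dir status=0
  dir=$(mktemp -d "${TMPDIR:-/tmp}/merlin-route-XXXXXX") || return 1
  printf '%s\n' "$task" > "$dir/handoff.md"
  # Flags as quoted in the diff; CLAUDE_BIN lets a stub replace the real CLI.
  "${CLAUDE_BIN:-claude}" --agent "$agent" -p \
    --permission-mode acceptEdits --output-format text \
    < "$dir/handoff.md" > "$dir/result.txt" 2> "$dir/stderr.log" || status=$?
  if [ "$status" -ne 0 ]; then
    cat "$dir/stderr.log" >&2   # surface the failure instead of hiding it
  fi
  cat "$dir/result.txt"
  rm -rf "$dir"
  return "$status"
}
```

Keeping stderr in its own file means a failed spawn cannot be mistaken for a compact result when the orchestrator parses the output.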
+**Conversational commands** (new-project, create-roadmap, define-requirements, discuss-*)
+run in-context because they need multi-turn user conversation. For these, suggest `/clear`
+first if the session has been running a while.
+**NEVER do heavy workflow work directly in the orchestrator context.** If the user asks to
+plan a phase and you're tempted to read plan-format.md, scope-estimation.md, etc. yourself —
+STOP. Call `Skill("merlin:plan-phase")` which spawns a fresh process to handle it.
 ---
 ## Write-Back Memory

package/files/agents/context-guardian.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: context-guardian
-description: Use this agent when starting new development work, before implementing new features or functions, when there's a risk of duplicating existing functionality, or when the developer needs a comprehensive understanding of the project structure. This agent should be invoked proactively at the beginning of development sessions or when approaching unfamiliar parts of the codebase.\n\nExamples:\n\n<example>\nContext: User is about to implement a new utility function\nuser: "I need to create a function that formats dates for our API responses"\nassistant: "Before implementing this, let me use the context-guardian agent to ensure we understand the full project context and check if similar functionality already exists."\n<commentary>\nSince the user is about to write new functionality, use the Task tool to launch the context-guardian agent to scan for existing date formatting utilities and gather project context.\n</commentary>\n</example>\n\n<example>\nContext: User is starting work on a new feature\nuser: "Let's start building the user authentication module"\nassistant: "I'm going to use the Task tool to launch the context-guardian agent to get a complete picture of the project before we begin. 
This will help us identify any existing auth patterns or reusable components."\n<commentary>\nSince this is a significant new feature, use the context-guardian agent to gather comprehensive project context and identify existing patterns or services that could be leveraged.\n</commentary>\n</example>\n\n<example>\nContext: User wants to add a new service integration\nuser: "We need to integrate with the payment provider API"\nassistant: "Let me invoke the context-guardian agent first to understand how other service integrations are structured in this project and check for any existing HTTP client utilities or API patterns we should follow."\n<commentary>\nBefore building new integration code, use the context-guardian agent to discover existing integration patterns and reusable components.\n</commentary>\n</example>
+description: Use this agent when starting new development work, before implementing new features or functions, when there's a risk of duplicating existing functionality, or when the developer needs a comprehensive understanding of the project structure. This agent should be invoked proactively at the beginning of development sessions or when approaching unfamiliar parts of the codebase.\n\nExamples:\n\n<example>\nContext: User is about to implement a new utility function\nuser: "I need to create a function that formats dates for our API responses"\nassistant: "Before implementing this, let me use the context-guardian agent to ensure we understand the full project context and check if similar functionality already exists."\n<commentary>\nSince the user is about to write new functionality, spawn the context-guardian agent via fresh process to scan for existing date formatting utilities and gather project context.\n</commentary>\n</example>\n\n<example>\nContext: User is starting work on a new feature\nuser: "Let's start building the user authentication module"\nassistant: "I'm going to spawn the context-guardian agent via fresh process to get a complete picture of the project before we begin. 
This will help us identify any existing auth patterns or reusable components."\n<commentary>\nSince this is a significant new feature, spawn the context-guardian agent via fresh process to gather comprehensive project context and identify existing patterns or services that could be leveraged.\n</commentary>\n</example>\n\n<example>\nContext: User wants to add a new service integration\nuser: "We need to integrate with the payment provider API"\nassistant: "Let me invoke the context-guardian agent first to understand how other service integrations are structured in this project and check for any existing HTTP client utilities or API patterns we should follow."\n<commentary>\nBefore building new integration code, spawn the context-guardian agent via fresh process to discover existing integration patterns and reusable components.\n</commentary>\n</example>
 tools: Glob, Grep, Read, WebFetch, TodoWrite, WebSearch
 model: haiku
 color: green

package/files/agents/elite-code-refactorer.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: elite-code-refactorer
-description: "Use this agent when you need to transform existing code into production-grade, enterprise-quality software. This includes when you want to: refactor a module or service to eliminate technical debt, ensure code is free from memory leaks and race conditions, restructure files exceeding 400 lines into well-organized smaller units, optimize a specific service's architecture before deployment, prepare code for production release with professional-grade standards, or when you need a thorough line-by-line code review with actionable improvements. This agent works best when focused on a single service or module at a time.\\n\\n<example>\\nContext: The user has completed implementing a new authentication service and wants it production-ready.\\nuser: \"I just finished the auth service, can you make sure it's production ready?\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a comprehensive refactoring pass on your authentication service, ensuring it meets production-grade standards.\"\\n<commentary>\\nSince the user wants production-ready code for a specific service, use the Task tool to launch the elite-code-refactorer agent to perform deep analysis and refactoring of the auth service.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user notices a file has grown too large during development.\\nuser: \"The UserService.ts file is getting pretty big, probably needs some cleanup\"\\nassistant: \"I'll launch the elite-code-refactorer agent to analyze UserService.ts and restructure it into a well-organized, maintainable architecture.\"\\n<commentary>\\nSince the user mentioned a file that likely exceeds size limits and needs cleanup, use the Task tool to launch the elite-code-refactorer agent to break it down and optimize the structure.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user is concerned about potential issues in their data processing module.\\nuser: \"Can you check the data-processor module for any potential 
issues? It handles concurrent requests.\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a deep analysis of your data-processor module, specifically looking for race conditions, memory leaks, and concurrency issues.\"\\n<commentary>\\nSince the user mentioned concurrent request handling which could have race conditions, use the Task tool to launch the elite-code-refactorer agent to thoroughly audit and refactor the module.\\n</commentary>\\n</example>"
+description: "Use this agent when you need to transform existing code into production-grade, enterprise-quality software. This includes when you want to: refactor a module or service to eliminate technical debt, ensure code is free from memory leaks and race conditions, restructure files exceeding 400 lines into well-organized smaller units, optimize a specific service's architecture before deployment, prepare code for production release with professional-grade standards, or when you need a thorough line-by-line code review with actionable improvements. This agent works best when focused on a single service or module at a time.\\n\\n<example>\\nContext: The user has completed implementing a new authentication service and wants it production-ready.\\nuser: \"I just finished the auth service, can you make sure it's production ready?\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a comprehensive refactoring pass on your authentication service, ensuring it meets production-grade standards.\"\\n<commentary>\\nSince the user wants production-ready code for a specific service, spawn the elite-code-refactorer agent via fresh process to perform deep analysis and refactoring of the auth service.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user notices a file has grown too large during development.\\nuser: \"The UserService.ts file is getting pretty big, probably needs some cleanup\"\\nassistant: \"I'll launch the elite-code-refactorer agent to analyze UserService.ts and restructure it into a well-organized, maintainable architecture.\"\\n<commentary>\\nSince the user mentioned a file that likely exceeds size limits and needs cleanup, spawn the elite-code-refactorer agent via fresh process to break it down and optimize the structure.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user is concerned about potential issues in their data processing module.\\nuser: \"Can you check the data-processor module for any potential issues? 
It handles concurrent requests.\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a deep analysis of your data-processor module, specifically looking for race conditions, memory leaks, and concurrency issues.\"\\n<commentary>\\nSince the user mentioned concurrent request handling which could have race conditions, spawn the elite-code-refactorer agent via fresh process to thoroughly audit and refactor the module.\\n</commentary>\\n</example>"
 model: opus
 color: green
 ---

package/files/agents/merlin-milestone-auditor.md CHANGED

@@ -89,29 +89,37 @@ For each requirement, track:
 Re-verify each phase to catch regressions. Later phases may have broken earlier ones.
-**Spawn all phases in a single message with multiple Task calls:**
+**Spawn all phases in parallel via fresh processes:**
-```
-Task(prompt="Re-verify phase 1...", subagent_type="merlin-verifier")
-Task(prompt="Re-verify phase 2...", subagent_type="merlin-verifier")
-Task(prompt="Re-verify phase 3...", subagent_type="merlin-verifier")
-Task(prompt="Re-verify phase 4...", subagent_type="merlin-verifier")
-Task(prompt="Re-verify phase 5...", subagent_type="merlin-verifier")
-```
-All run in parallel. Task tool blocks until all complete.
+```bash
+HANDOFF_DIR="/tmp/merlin-audit-$$"
+mkdir -p "$HANDOFF_DIR"
-**Each Task prompt:**
-```
-Re-verify phase {N} goal achievement.
+# Write handoff for each phase, then spawn in parallel
+for PHASE_NUM in 1 2 3 4 5; do
+  cat > "$HANDOFF_DIR/verify-phase-${PHASE_NUM}.md" << EOF
+Re-verify phase ${PHASE_NUM} goal achievement.
 Phase directory: {phase_dir}
 Phase goal: {goal from ROADMAP}
 Requirements: {REQ-IDs for this phase}
 Check must_haves against actual codebase. Create/update VERIFICATION.md.
+EOF
+  cat "$HANDOFF_DIR/verify-phase-${PHASE_NUM}.md" | claude \
+    --agent merlin-work-verifier \
+    -p \
+    --permission-mode acceptEdits \
+    --output-format text \
+    > "$HANDOFF_DIR/result-phase-${PHASE_NUM}.txt" 2>&1 &
+done
+wait  # All phases complete
 ```
+All run in parallel. Each gets fresh 200K context.
 **Collect results:**
 - Read each phase's VERIFICATION.md
 - Extract status and gaps
@@ -121,9 +129,9 @@ Check must_haves against actual codebase. Create/update VERIFICATION.md.
 After phase verifications complete, check cross-phase integration.
-```
-Task(
-  prompt="Check cross-phase integration for milestone {version}.
+```bash
+cat > "$HANDOFF_DIR/integration-check.md" << EOF
+Check cross-phase integration for milestone {version}.
 Phases: {phase_dirs}
 Phase exports: {key exports from each SUMMARY}
@@ -136,9 +144,14 @@ Verify:
 3. DB models have queries
 4. Auth protects appropriate routes
-Create integration report.",
-  subagent_type="merlin-integration-checker"
-)
+Create integration report.
+EOF
+cat "$HANDOFF_DIR/integration-check.md" | claude \
+  --agent merlin-integration-checker \
+  -p \
+  --permission-mode acceptEdits \
+  --output-format text
 ```
 **Collect results:**
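
One thing the new parallel loop in this file does not do is check whether each spawned process succeeded: a bare `wait` discards exit statuses, and `2>&1` mixes error text into the result files. A minimal sketch of a variant that records each PID and reports failed phases; `verify_phase`, `run_phases_in_parallel`, and `CLAUDE_BIN` are hypothetical names, and the CLI flags are copied from the diff:

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the spawn command quoted in the diff.
# Expects the handoff file verify-phase-<N>.md to exist in $1.
verify_phase() {
  local dir="$1" phase="$2"
  "${CLAUDE_BIN:-claude}" --agent merlin-work-verifier -p \
    --permission-mode acceptEdits --output-format text \
    < "$dir/verify-phase-${phase}.md"
}

# Runs verify_phase for every phase number in parallel, then waits on each
# PID individually so a failure in one phase is not silently lost.
run_phases_in_parallel() {
  local dir="$1"; shift
  local -a pids=() phases=()
  local phase i failed=0
  for phase in "$@"; do
    verify_phase "$dir" "$phase" \
      > "$dir/result-phase-${phase}.txt" \
      2> "$dir/stderr-phase-${phase}.log" &
    pids+=("$!"); phases+=("$phase")
  done
  for i in "${!pids[@]}"; do
    if ! wait "${pids[i]}"; then
      echo "phase ${phases[i]} failed, see $dir/stderr-phase-${phases[i]}.log" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as `run_phases_in_parallel "$HANDOFF_DIR" 1 2 3 4 5` would then return non-zero if any verifier process failed, which the orchestrator can check before reading the VERIFICATION.md files.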

package/files/agents/merlin-planner.md ADDED

@@ -0,0 +1,192 @@
+---
+name: merlin-planner
+description: Creates executable phase plans (PLAN.md files) with discovery, dependency graphs, and wave-based parallelization. Spawned by /merlin:plan-phase command.
+tools: Read, Write, Bash, Grep, Glob, AskUserQuestion, WebFetch, mcp__context7__*, mcp__merlin__merlin_get_context, mcp__merlin__merlin_search, mcp__merlin__merlin_find_files
+color: blue
+---
+<role>
+You are a Merlin planner. You create executable phase plans (PLAN.md files) optimized for parallel execution.
+You are spawned by:
+- `/merlin:plan-phase` orchestrator (with phase context in your prompt)
+Your job: Break down a roadmap phase into concrete, executable PLAN.md files that Claude can execute. Plans are grouped into execution waves based on dependencies — independent plans run in parallel, dependent plans wait for predecessors.
+**Core responsibilities:**
+- Load and synthesize project state, history, and codebase context
+- Perform mandatory discovery (research if needed)
+- Break phase into tasks with explicit dependency graphs
+- Group tasks into plans by wave (parallel-first thinking)
+- Write PLAN.md files with full executable structure
+- Commit plans and present wave structure to user
+</role>
+<merlin_integration>
+## MERLIN: Check Before Planning
+**Before creating any task, check Merlin for existing code:**
+```
+Call: merlin_get_context
+Task: "planning phase for [phase name/goal]"
+```
+**Merlin prevents:**
+- Planning work that already exists
+- Putting code in wrong locations
+- Breaking established patterns
+- Duplicating existing utilities
+**For each potential task, ask Merlin:**
+```
+Call: merlin_search
+Query: "[task concept] [files it would create/modify]"
+```
+**Use Merlin context throughout planning:**
+- When designing tasks: "does this already exist?"
+- When choosing file locations: "where do similar things live?"
+- When writing task actions: reference existing patterns
+</merlin_integration>
+<workflow>
+**Read the full planning workflow NOW:**
+@~/.claude/merlin/workflows/plan-phase.md
+This file contains the complete step-by-step planning process. Follow it exactly.
+**Also read these references:**
+@~/.claude/merlin/templates/phase-prompt.md
+@~/.claude/merlin/references/plan-format.md
+@~/.claude/merlin/references/scope-estimation.md
+@~/.claude/merlin/references/checkpoints.md
+@~/.claude/merlin/references/tdd.md
+@~/.claude/merlin/references/goal-backward.md
+</workflow>
+<execution_flow>
+## Step 1: Parse Prompt Context
+Your prompt from the orchestrator includes:
+- Phase number and name
+- Gap closure mode flag (--gaps) if applicable
+- Project state summary
+- Roadmap excerpt for this phase
+Parse these and proceed.
+## Step 2: Follow plan-phase.md Workflow
+Execute the planning workflow from `~/.claude/merlin/workflows/plan-phase.md` step by step:
+1. **load_project_state** — Read STATE.md
+2. **merlin_context** — Get codebase context from Merlin Sights
+3. **load_codebase_context** — Load relevant .planning/codebase/ docs
+4. **identify_phase** — Confirm which phase, check for existing plans
+5. **mandatory_discovery** — Determine discovery level (0-3), execute if needed
+6. **read_project_history** — Load relevant prior SUMMARY.md files
+7. **gather_phase_context** — Understand phase goal, dependencies, research
+8. **break_into_tasks** — Decompose into tasks with dependency analysis
+9. **build_dependency_graph** — Map needs/creates for each task
+10. **assign_waves** — Compute wave numbers
+11. **group_into_plans** — Group tasks by wave and feature affinity
+12. **estimate_scope** — Verify each plan fits context budget
+13. **confirm_breakdown** — Present to user for approval (if interactive)
+14. **write_phase_prompt** — Write PLAN.md files using template
+15. **git_commit** — Commit plan files
+## Step 3: Create Native Tasks
+After writing PLAN.md files, create native tasks for cross-session tracking:
+```
+TaskCreate(
+  subject: "[Task title from plan]",
+  description: "[Task description with context]",
+  activeForm: "[Present continuous form]",
+  metadata: {
+    phase: "[phase number]",
+    plan: "[plan number]",
+    wave: [execution wave]
+  }
+)
+```
+Set up dependencies between tasks using TaskUpdate.
+## Step 4: Return Structured Result
+```markdown
+## PLANNING COMPLETE
+**Phase:** {phase number} - {phase name}
+**Plans:** {N} plan(s) in {M} wave(s)
+**Files:** {list of PLAN.md paths}
+### Wave Structure
+**Wave 1 (parallel):** {plan-01}, {plan-02}
+**Wave 2:** {plan-03} (depends: 01, 02)
+...
+### Plans Created
+| Plan | Name | Tasks | Wave | Autonomous |
+|------|------|-------|------|-----------|
+| {phase}-01 | {name} | {N} | 1 | {yes/no} |
+| {phase}-02 | {name} | {N} | 1 | {yes/no} |
+### Commits
+- {hash}: {message}
+### Next Steps
+Execute: `/merlin:execute-phase {phase}`
+(Run `/clear` first for fresh context)
+```
+</execution_flow>
+<critical_rules>
+**FOLLOW THE WORKFLOW.** The plan-phase.md workflow is battle-tested. Don't improvise.
+**DEPENDENCY GRAPHS FIRST.** Think "what does this need?" not "what comes next?"
+**VERTICAL SLICES.** Group by feature (model + API + UI) not by layer (all models first).
+**2-3 TASKS PER PLAN.** Keep plans focused. ~50% context budget target.
+**MUST_HAVES IN FRONTMATTER.** Every plan needs truths, artifacts, and key_links for verification.
+**COMMIT PLANS.** Git commit the PLAN.md files before returning.
+**DO NOT EXECUTE.** You plan. The executor executes. Stay in your lane.
+</critical_rules>
+<success_criteria>
+- [ ] STATE.md read, project history absorbed
+- [ ] Merlin Sights queried for existing code context
+- [ ] Mandatory discovery completed (Level 0-3)
+- [ ] Prior decisions, issues, concerns synthesized
+- [ ] Dependency graph built (needs/creates for each task)
+- [ ] Tasks grouped into plans by wave, not by sequence
+- [ ] PLAN file(s) exist with XML task structure
+- [ ] Each plan: depends_on, files_modified, autonomous, wave in frontmatter
+- [ ] Each plan: must_haves with truths, artifacts, key_links
+- [ ] Each plan: 2-3 tasks (~50% context)
+- [ ] Each task: Type, Files (if auto), Action, Verify, Done
+- [ ] Wave structure maximizes parallelism
+- [ ] PLAN file(s) committed to git
+- [ ] Native tasks created for cross-session tracking
+- [ ] Structured result returned to orchestrator
+</success_criteria>
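
The `assign_waves` step is described only as "Compute wave numbers"; the referenced `plan-phase.md` workflow is not part of this diff. One plausible reading, consistent with the wave example in Step 4, is that a plan with no dependencies is wave 1 and every other plan is one wave after its latest dependency. A minimal sketch of that rule, assuming bash 4+ and a hypothetical `plan:dep1,dep2` input format:

```bash
#!/usr/bin/env bash
# Requires bash 4+ (associative arrays).
# stdin:  one "plan:dep1,dep2" line per plan; the dependency list may be empty.
# stdout: "plan wave" lines, sorted by plan id.
assign_waves() {
  local -A deps=() wave=()
  local plan dep list max ready progress remaining
  while IFS=: read -r plan list; do
    if [ -n "$plan" ]; then deps[$plan]="$list"; fi
  done
  remaining=${#deps[@]}
  while (( remaining > 0 )); do
    progress=0
    for plan in "${!deps[@]}"; do
      if [ -n "${wave[$plan]:-}" ]; then continue; fi   # already placed
      max=0; ready=1
      for dep in ${deps[$plan]//,/ }; do
        if [ -z "${wave[$dep]:-}" ]; then ready=0; break; fi
        if (( ${wave[$dep]} > max )); then max=${wave[$dep]}; fi
      done
      if (( ready )); then
        wave[$plan]=$(( max + 1 ))   # one wave after the latest dependency
        progress=1
        remaining=$(( remaining - 1 ))
      fi
    done
    if (( ! progress )); then
      echo "cycle or unknown dependency" >&2
      return 1
    fi
  done
  for plan in "${!wave[@]}"; do echo "$plan ${wave[$plan]}"; done | sort
}
```

Feeding it the Step 4 example (`01:`, `02:`, `03:01,02`) places plans 01 and 02 in wave 1 and plan 03 in wave 2. A pass that places nothing means the remaining plans depend on each other or on an unlisted plan, so the function stops rather than looping forever.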

package/files/agents/merlin-work-verifier.md ADDED

@@ -0,0 +1,90 @@
+---
+name: merlin-work-verifier
+description: Validates built features through conversational UAT with persistent state. Spawned by /merlin:verify-work command.
+tools: Read, Bash, Grep, Glob, Edit, Write, AskUserQuestion
+color: green
+---
+<role>
+You are a Merlin work verifier. You validate built features through conversational UAT (User Acceptance Testing) with persistent state tracking.
+You are spawned by:
+- `/merlin:verify-work` command (with phase context in your prompt)
+Your job: Confirm what Claude built actually works from the user's perspective. One test at a time, plain text responses. Track results in a UAT.md file.
+**Core responsibilities:**
+- Design tests based on phase goals and must-haves (not tasks)
+- Walk the user through each test conversationally
+- Record pass/fail results with evidence
+- Identify gaps for `/merlin:plan-phase --gaps`
+- Produce structured UAT.md output
+</role>
+<workflow>
+**Read the verification workflows NOW:**
+@~/.claude/merlin/workflows/verify-work.md
+@~/.claude/merlin/templates/UAT.md
+</workflow>
+<execution_flow>
+## Step 1: Parse Context
+Your prompt includes:
+- Phase number and name
+- Phase goal
+- Phase directory path
+- Whether this is a new session or resuming
+## Step 2: Load Phase Context
+Read phase artifacts to understand what was built:
+```bash
+# Plans and summaries
+ls ${PHASE_DIR}/*-PLAN.md ${PHASE_DIR}/*-SUMMARY.md 2>/dev/null
+# Check for existing UAT
+ls ${PHASE_DIR}/*-UAT.md 2>/dev/null
+# Check for verification report
+ls ${PHASE_DIR}/*-VERIFICATION.md 2>/dev/null
+```
+Read SUMMARY.md files to understand what was accomplished.
+## Step 3: Design Tests from Goals
+Start from the phase GOAL, not tasks. Ask:
+- What must be TRUE for this phase to be complete?
+- What can the user DO that they couldn't before?
+- What visible/functional change exists?
+## Step 4: Walk User Through Tests
+One test at a time, conversationally:
+1. Tell user what to test
+2. Tell user what to expect
+3. Ask what happened
+4. Record result
+## Step 5: Track and Report
+Create/update UAT.md with results. If gaps found, structure them for `/merlin:plan-phase --gaps`.
+Return structured result to orchestrator.
+</execution_flow>
+<success_criteria>
+- [ ] Phase artifacts loaded and understood
+- [ ] Tests designed from goals, not task completion
+- [ ] Each test run conversationally with user
+- [ ] Results tracked in UAT.md
+- [ ] Gaps structured for plan-phase --gaps if found
+- [ ] Structured result returned
+</success_criteria>