create-merlin-brain 2.7.0 → 3.0.1

package/bin/install.cjs CHANGED
@@ -145,6 +145,7 @@ ${colors.magenta}${colors.bright}
145
145
  ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚══════╝╚═╝╚═╝ ╚═══╝
146
146
  ${colors.reset}
147
147
  ${colors.cyan}The Ultimate AI Brain for Claude Code${colors.reset}
148
+ ${colors.bright}v3 — One Install, Everything Included${colors.reset}
148
149
 
149
150
  Four integrated layers:
150
151
  • ${colors.bright}Merlin Loop${colors.reset} - Autonomous orchestration (hours of unattended work)
@@ -296,7 +297,7 @@ merlin-loop() {
296
297
  fs.appendFileSync(rcFile, merlinBlock);
297
298
  logSuccess(`Configured 'claude' command to use Merlin`);
298
299
  logSuccess(`Added 'cc' shortcut`);
299
- logSuccess(`Enabled auto-update checks`);
300
+ logSuccess(`Version check enabled (run /merlin:update to update)`);
300
301
  return true;
301
302
  }
302
303
 
@@ -496,6 +497,9 @@ function cleanupLegacy() {
496
497
  path.join(CLAUDE_DIR, 'gsd'),
497
498
  path.join(CLAUDE_DIR, 'commands', 'gsd'),
498
499
 
500
+ // Old Merlin Pro (merged into single package in v3)
501
+ path.join(CLAUDE_DIR, 'pro'),
502
+
499
503
  // Old ccwiki remnants
500
504
  path.join(CLAUDE_DIR, 'ccwiki'),
501
505
  path.join(CLAUDE_DIR, 'commands', 'ccwiki'),
@@ -543,6 +547,10 @@ function cleanupLegacy() {
543
547
  // Old config files
544
548
  path.join(CLAUDE_DIR, 'gsd-config.json'),
545
549
  path.join(CLAUDE_DIR, 'ccwiki-config.json'),
550
+
551
+ // Old Pro marker file (merged into single package in v3)
552
+ path.join(CLAUDE_DIR, '.pro'),
553
+ path.join(MERLIN_DIR, '.pro'),
546
554
  ];
547
555
 
548
556
  for (const file of filesToRemove) {
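The hunk ends before the body of the removal loop, so the diff does not show how the new `pro` and `.pro` entries are actually deleted. The following is a minimal sketch of one way such a loop could behave, assuming `fs.rmSync` with `force: true`. The helper name `removeLegacyPaths` and the temporary directory are hypothetical and are not taken from `install.cjs`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper (not from install.cjs): delete each legacy path if present.
// `recursive: true` covers directories; `force: true` tolerates races where
// the path disappears between the existence check and the removal.
function removeLegacyPaths(paths) {
  const removed = [];
  for (const target of paths) {
    if (!fs.existsSync(target)) continue;
    fs.rmSync(target, { recursive: true, force: true });
    removed.push(target);
  }
  return removed;
}

// Demo against a throwaway directory standing in for ~/.claude.
const claudeDir = fs.mkdtempSync(path.join(os.tmpdir(), 'claude-demo-'));
fs.mkdirSync(path.join(claudeDir, 'pro'));
fs.writeFileSync(path.join(claudeDir, '.pro'), '');

const removed = removeLegacyPaths([
  path.join(claudeDir, 'pro'),    // old Merlin Pro directory
  path.join(claudeDir, '.pro'),   // old Pro marker file
  path.join(claudeDir, 'ccwiki'), // never created here, so it is skipped
]);
console.log(removed.length);
```

Returning the list of removed paths makes it easy to log only what was really cleaned up, which may or may not match what the installer reports.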
package/files/CLAUDE.md CHANGED
@@ -30,6 +30,23 @@ Which would you like?
30
30
 
31
31
  Wait for user response if not connected.
32
32
 
33
+ ### Step 1.5: Check for Updates (Quick)
34
+
35
+ ```bash
36
+ INSTALLED=$(cat ~/.claude/merlin/VERSION 2>/dev/null)
37
+ LATEST=$(npm view create-merlin-brain version 2>/dev/null)
38
+ ```
39
+
40
+ **If LATEST is newer than INSTALLED:**
41
+ ```
42
+ ⚡ Merlin update available: v{INSTALLED} → v{LATEST}
43
+
44
+ [1] Update now (`/merlin:update`)
45
+ [2] Skip for now
46
+ ```
47
+
48
+ **If up to date or check fails:** Skip silently — don't slow down the session.
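The step above says "if LATEST is newer than INSTALLED" but does not say how the two strings are compared. A plain string comparison would rank `2.10.0` below `2.9.0`, so a numeric comparison is needed. The sketch below is one possible comparison, assuming plain `x.y.z` versions with no pre-release tags. `isNewer` is a hypothetical name, not part of the package.

```javascript
// Hypothetical helper: true when `latest` is a higher x.y.z version than `installed`.
// Pre-release tags are not handled. Missing or malformed input returns false,
// which matches the "skip silently" rule for failed checks.
function isNewer(latest, installed) {
  const parse = (v) => {
    const m = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(String(v ?? '').trim());
    return m ? m.slice(1).map(Number) : null;
  };
  const a = parse(latest);
  const b = parse(installed);
  if (!a || !b) return false;
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] > b[i];
  }
  return false; // identical versions
}

console.log(isNewer('3.0.1', '2.7.0'));  // true
console.log(isNewer('2.10.0', '2.9.0')); // true (numeric, not string, comparison)
console.log(isNewer('', '3.0.1'));       // false (lookup failed, so skip)
```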
49
+
33
50
  ### Step 2: Check Project Status (THE ENFORCEMENT LOOP)
34
51
 
35
52
  ```
@@ -89,7 +106,7 @@ Merlin is a complete AI-powered development system with three integrated layers:
89
106
 
90
107
  ---
91
108
 
92
- ## UNIVERSAL ROUTING PROTOCOL
109
+ ## UNIVERSAL ROUTING PROTOCOL — FRESH PROCESS ISOLATION
93
110
 
94
111
  **Every specialist routing decision goes through `/merlin:route`.**
95
112
 
@@ -98,12 +115,34 @@ When Merlin decides a specialist agent should handle a task:
98
115
  Skill("merlin:route", args='<agent-name> "<task description>"')
99
116
  ```
100
117
 
101
- This spawns a **fresh Claude sub-agent** with:
102
- - The specialist's system prompt (from `~/.claude/agents/<agent-name>.md`)
103
- - Merlin Sights context injected automatically
104
- - Mode-aware tools (interactive = can ask questions, automated = makes assumptions)
118
+ This spawns a **truly fresh Claude process** via `claude --agent <name> -p`:
119
+ - New PID, new 200K context window — completely isolated from orchestrator
120
+ - The specialist's system prompt loaded via `--agent` flag
121
+ - Merlin Sights context injected into the handoff file
122
+ - Context passed via file, results returned as compact summary
123
+ - Orchestrator stays lean — never polluted by specialist work
124
+
125
+ **The orchestrator decides WHO. The fresh process decides HOW.**
105
126
 
106
- **The orchestrator decides WHO. The sub-agent decides HOW.**
127
+ ### How It Works (Deterministic)
128
+
129
+ ```
130
+ Orchestrator Fresh Process
131
+ | |
132
+ |-- Write handoff.md ---------> |
133
+ |-- Spawn: claude --agent -p -> | (new PID, 200K context)
134
+ | |-- Read handoff
135
+ | |-- Read agent system prompt
136
+ | |-- Get Sights context
137
+ | |-- Do the work
138
+ | |-- Write result
139
+ | <---- Return summary -------|
140
+ | | (process exits)
141
+ |-- Parse result |
142
+ |-- Present to user |
143
+ ```
144
+
145
+ Every route: fresh process. Every time. No exceptions.
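The handoff flow in the diagram above can be sketched in Node as a synchronous child process that receives the handoff text on stdin and returns its stdout as the summary. This is an illustration only: the flags are the ones quoted elsewhere in this diff, while `runFreshProcess` and its parameters are hypothetical, and the diff does not show how Merlin itself invokes the CLI.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: pipe handoff text into a fresh child process and
// return its trimmed stdout as the summary.
function runFreshProcess(agentName, handoffText, command = 'claude', extraArgs = null) {
  const args = extraArgs ?? [
    '--agent', agentName,
    '-p',
    '--permission-mode', 'acceptEdits',
    '--output-format', 'text',
  ];
  const result = spawnSync(command, args, {
    input: handoffText, // same effect as `cat handoff.md | claude ...`
    encoding: 'utf8',
  });
  if (result.error) throw result.error; // e.g. command not found
  if (result.status !== 0) {
    throw new Error(`agent exited with status ${result.status}: ${result.stderr}`);
  }
  return result.stdout.trim();
}

// Stand-in child so the sketch runs without the claude CLI installed:
// a Node one-liner that echoes stdin back in upper case.
const echo = ['-e', "process.stdin.on('data', d => process.stdout.write(String(d).toUpperCase()))"];
console.log(runFreshProcess('demo-agent', 'do the work', process.execPath, echo));
```

Checking the exit status before parsing means a crashed agent is reported as a failure rather than parsed as an empty result.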
107
146
 
108
147
  ### Quick Routing Reference
109
148
 
@@ -118,12 +157,24 @@ This spawns a **fresh Claude sub-agent** with:
118
157
  | Deploy/ops | ops-railway | `Skill("merlin:route", args='ops-railway "..."')` |
119
158
  | Documentation | docs-keeper | `Skill("merlin:route", args='docs-keeper "..."')` |
120
159
 
121
- ### Mode
160
+ ### Two Modes — User Chooses
161
+
162
+ - **Interactive** (default): Orchestrator asks user clarifying questions BEFORE spawning. Handles checkpoints AFTER. User stays in the loop.
163
+ - **Automated**: Orchestrator sends everything to agent. Agent makes reasonable assumptions. No questions asked.
122
164
 
123
- - **Interactive** (default): Sub-agent CAN ask the user clarifying questions
124
- - **Automated**: Sub-agent makes reasonable assumptions, does NOT ask questions
165
+ **CRITICAL:** Both modes spawn a fresh process. The mode only controls whether the orchestrator gathers user input before/after spawning.
125
166
 
126
- Mode only controls AskUserQuestion availability. Depth (fresh sub-agent per specialist) is **always on**.
167
+ Set mode via:
168
+ 1. Command argument: `--interactive` or `--automated`
169
+ 2. Project config: `.planning/config.json` → `{ "mode": "interactive" }`
170
+ 3. User settings: `~/.claude/merlin/settings.local.json` → `{ "mode": "automated" }`
171
+ 4. Default: `interactive`
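If the numbered list above is read as a precedence order (an assumption, since the diff does not say so explicitly), mode resolution could look like the sketch below. `resolveMode` and `readMode` are hypothetical names. The demo uses temporary files in place of `.planning/config.json` and `~/.claude/merlin/settings.local.json`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const VALID_MODES = ['interactive', 'automated'];

// Read `mode` from a JSON file. A missing, unreadable, or invalid file yields undefined.
function readMode(file) {
  try {
    const mode = JSON.parse(fs.readFileSync(file, 'utf8')).mode;
    return VALID_MODES.includes(mode) ? mode : undefined;
  } catch {
    return undefined;
  }
}

// Hypothetical resolver: command argument, then project config,
// then user settings, then the default. If both flags are passed,
// this sketch lets --interactive win, which is an arbitrary choice.
function resolveMode(argv, projectConfig, userSettings) {
  if (argv.includes('--interactive')) return 'interactive';
  if (argv.includes('--automated')) return 'automated';
  return readMode(projectConfig) ?? readMode(userSettings) ?? 'interactive';
}

// Demo with temporary files standing in for the two config locations.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'merlin-mode-'));
const projectFile = path.join(dir, 'config.json');            // not written yet
const userFile = path.join(dir, 'settings.local.json');
fs.writeFileSync(userFile, JSON.stringify({ mode: 'automated' }));

console.log(resolveMode([], projectFile, userFile));                // automated
console.log(resolveMode(['--interactive'], projectFile, userFile)); // interactive
```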
172
+
173
+ ### Why NOT Task()
174
+
175
+ `Task()` creates a sub-agent within the SAME Claude session. The sub-agent's output flows back into the parent's context window, rapidly exhausting tokens. After 3-4 routes, you hit the ceiling.
176
+
177
+ `claude --agent -p` via Bash spawns a completely separate OS process. Fresh 200K. No context pollution. Unlimited routes possible.
127
178
 
128
179
  ---
129
180
 
@@ -603,6 +654,41 @@ Access via `/merlin:*` commands:
603
654
  - `/merlin:verify-work` - Validate built features
604
655
  - `/merlin:help` - See all commands
605
656
 
657
+ ### CRITICAL: Workflow Commands Spawn FRESH PROCESSES
658
+
659
+ **Heavy workflow commands are THIN ORCHESTRATORS that spawn fresh `claude --agent -p` processes.**
660
+ They write a handoff file to /tmp, spawn a fresh CLI process, and parse the compact result.
661
+ The orchestrator's context is NEVER polluted by the specialist's work.
662
+
663
+ | Command | Agent Process | Isolation |
664
+ |---------|--------------|-----------|
665
+ | `/merlin:route` | `claude --agent <specialist> -p` | Fresh 200K per route |
666
+ | `/merlin:plan-phase` | `claude --agent merlin-planner -p` | Fresh 200K for planning |
667
+ | `/merlin:execute-phase` | `claude --agent merlin-executor -p` (per plan) | Fresh 200K per plan |
668
+ | `/merlin:execute-plan` | `claude --agent merlin-executor -p` | Fresh 200K for execution |
669
+ | `/merlin:map-codebase` | `claude --agent merlin-codebase-mapper -p` (per area) | Fresh 200K per area |
670
+ | `/merlin:verify-work` | `claude --agent merlin-work-verifier -p` | Fresh 200K for verification |
671
+ | `/merlin:research-phase` | `claude --agent merlin-researcher -p` | Fresh 200K for research |
672
+ | `/merlin:debug` | `claude --agent merlin-debugger -p` | Fresh 200K for debugging |
673
+
674
+ **How fresh process spawning works:**
675
+ 1. Orchestrator writes handoff file to `/tmp/merlin-<command>-<pid>/handoff.md`
676
+ 2. Spawns: `cat handoff.md | claude --agent <name> -p --permission-mode acceptEdits --output-format text`
677
+ 3. Fresh process gets 200K context, reads its own files, does the work
678
+ 4. Fresh process exits, returns compact structured result
679
+ 5. Orchestrator parses result, presents to user, cleans up /tmp
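Steps 1 and 5 above (write the handoff file, clean up afterwards) can be sketched as a wrapper that guarantees cleanup even when the agent run fails. This is a hedged illustration, not Merlin's implementation. `withHandoff` is a hypothetical name, and the sketch uses `fs.mkdtempSync` so the directory name gets a random suffix rather than the bare `/tmp/merlin-<command>-<pid>` form described in step 1.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical wrapper: create a private handoff directory, pass the file
// path to `run`, and always remove the directory afterwards.
function withHandoff(command, handoffText, run) {
  const prefix = path.join(os.tmpdir(), `merlin-${command}-${process.pid}-`);
  const dir = fs.mkdtempSync(prefix); // appends random characters to the prefix
  const handoffFile = path.join(dir, 'handoff.md');
  try {
    fs.writeFileSync(handoffFile, handoffText, { mode: 0o600 });
    return run(handoffFile);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Demo: `run` would normally spawn the agent. Here it only reads the file back.
let seenPath;
const summary = withHandoff('plan-phase', 'Plan phase 3', (file) => {
  seenPath = file;
  return fs.readFileSync(file, 'utf8').length;
});
console.log(summary, fs.existsSync(seenPath));
```

The `finally` block is the design point: a thrown error from the agent run still leaves no handoff files behind.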
680
+
681
+ **NEVER use Task().** Task() runs within the same context window and pollutes it.
682
+ **ALWAYS use `claude --agent -p` via Bash.** True process isolation, deterministic handoff.
683
+
684
+ **Conversational commands** (new-project, create-roadmap, define-requirements, discuss-*)
685
+ run in-context because they need multi-turn user conversation. For these, suggest `/clear`
686
+ first if the session has been running a while.
687
+
688
+ **NEVER do heavy workflow work directly in the orchestrator context.** If the user asks to
689
+ plan a phase and you're tempted to read plan-format.md, scope-estimation.md, etc. yourself —
690
+ STOP. Call `Skill("merlin:plan-phase")` which spawns a fresh process to handle it.
691
+
606
692
  ---
607
693
 
608
694
  ## Write-Back Memory
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: context-guardian
3
- description: Use this agent when starting new development work, before implementing new features or functions, when there's a risk of duplicating existing functionality, or when the developer needs a comprehensive understanding of the project structure. This agent should be invoked proactively at the beginning of development sessions or when approaching unfamiliar parts of the codebase.\n\nExamples:\n\n<example>\nContext: User is about to implement a new utility function\nuser: "I need to create a function that formats dates for our API responses"\nassistant: "Before implementing this, let me use the context-guardian agent to ensure we understand the full project context and check if similar functionality already exists."\n<commentary>\nSince the user is about to write new functionality, use the Task tool to launch the context-guardian agent to scan for existing date formatting utilities and gather project context.\n</commentary>\n</example>\n\n<example>\nContext: User is starting work on a new feature\nuser: "Let's start building the user authentication module"\nassistant: "I'm going to use the Task tool to launch the context-guardian agent to get a complete picture of the project before we begin. This will help us identify any existing auth patterns or reusable components."\n<commentary>\nSince this is a significant new feature, use the context-guardian agent to gather comprehensive project context and identify existing patterns or services that could be leveraged.\n</commentary>\n</example>\n\n<example>\nContext: User wants to add a new service integration\nuser: "We need to integrate with the payment provider API"\nassistant: "Let me invoke the context-guardian agent first to understand how other service integrations are structured in this project and check for any existing HTTP client utilities or API patterns we should follow."\n<commentary>\nBefore building new integration code, use the context-guardian agent to discover existing integration patterns and reusable components.\n</commentary>\n</example>
3
+ description: Use this agent when starting new development work, before implementing new features or functions, when there's a risk of duplicating existing functionality, or when the developer needs a comprehensive understanding of the project structure. This agent should be invoked proactively at the beginning of development sessions or when approaching unfamiliar parts of the codebase.\n\nExamples:\n\n<example>\nContext: User is about to implement a new utility function\nuser: "I need to create a function that formats dates for our API responses"\nassistant: "Before implementing this, let me use the context-guardian agent to ensure we understand the full project context and check if similar functionality already exists."\n<commentary>\nSince the user is about to write new functionality, spawn the context-guardian agent via fresh process to scan for existing date formatting utilities and gather project context.\n</commentary>\n</example>\n\n<example>\nContext: User is starting work on a new feature\nuser: "Let's start building the user authentication module"\nassistant: "I'm going to spawn the context-guardian agent via fresh process to get a complete picture of the project before we begin. This will help us identify any existing auth patterns or reusable components."\n<commentary>\nSince this is a significant new feature, spawn the context-guardian agent via fresh process to gather comprehensive project context and identify existing patterns or services that could be leveraged.\n</commentary>\n</example>\n\n<example>\nContext: User wants to add a new service integration\nuser: "We need to integrate with the payment provider API"\nassistant: "Let me invoke the context-guardian agent first to understand how other service integrations are structured in this project and check for any existing HTTP client utilities or API patterns we should follow."\n<commentary>\nBefore building new integration code, spawn the context-guardian agent via fresh process to discover existing integration patterns and reusable components.\n</commentary>\n</example>
4
4
  tools: Glob, Grep, Read, WebFetch, TodoWrite, WebSearch
5
5
  model: haiku
6
6
  color: green
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: elite-code-refactorer
3
- description: "Use this agent when you need to transform existing code into production-grade, enterprise-quality software. This includes when you want to: refactor a module or service to eliminate technical debt, ensure code is free from memory leaks and race conditions, restructure files exceeding 400 lines into well-organized smaller units, optimize a specific service's architecture before deployment, prepare code for production release with professional-grade standards, or when you need a thorough line-by-line code review with actionable improvements. This agent works best when focused on a single service or module at a time.\\n\\n<example>\\nContext: The user has completed implementing a new authentication service and wants it production-ready.\\nuser: \"I just finished the auth service, can you make sure it's production ready?\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a comprehensive refactoring pass on your authentication service, ensuring it meets production-grade standards.\"\\n<commentary>\\nSince the user wants production-ready code for a specific service, use the Task tool to launch the elite-code-refactorer agent to perform deep analysis and refactoring of the auth service.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user notices a file has grown too large during development.\\nuser: \"The UserService.ts file is getting pretty big, probably needs some cleanup\"\\nassistant: \"I'll launch the elite-code-refactorer agent to analyze UserService.ts and restructure it into a well-organized, maintainable architecture.\"\\n<commentary>\\nSince the user mentioned a file that likely exceeds size limits and needs cleanup, use the Task tool to launch the elite-code-refactorer agent to break it down and optimize the structure.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user is concerned about potential issues in their data processing module.\\nuser: \"Can you check the data-processor module for any potential issues? It handles concurrent requests.\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a deep analysis of your data-processor module, specifically looking for race conditions, memory leaks, and concurrency issues.\"\\n<commentary>\\nSince the user mentioned concurrent request handling which could have race conditions, use the Task tool to launch the elite-code-refactorer agent to thoroughly audit and refactor the module.\\n</commentary>\\n</example>"
3
+ description: "Use this agent when you need to transform existing code into production-grade, enterprise-quality software. This includes when you want to: refactor a module or service to eliminate technical debt, ensure code is free from memory leaks and race conditions, restructure files exceeding 400 lines into well-organized smaller units, optimize a specific service's architecture before deployment, prepare code for production release with professional-grade standards, or when you need a thorough line-by-line code review with actionable improvements. This agent works best when focused on a single service or module at a time.\\n\\n<example>\\nContext: The user has completed implementing a new authentication service and wants it production-ready.\\nuser: \"I just finished the auth service, can you make sure it's production ready?\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a comprehensive refactoring pass on your authentication service, ensuring it meets production-grade standards.\"\\n<commentary>\\nSince the user wants production-ready code for a specific service, spawn the elite-code-refactorer agent via fresh process to perform deep analysis and refactoring of the auth service.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user notices a file has grown too large during development.\\nuser: \"The UserService.ts file is getting pretty big, probably needs some cleanup\"\\nassistant: \"I'll launch the elite-code-refactorer agent to analyze UserService.ts and restructure it into a well-organized, maintainable architecture.\"\\n<commentary>\\nSince the user mentioned a file that likely exceeds size limits and needs cleanup, spawn the elite-code-refactorer agent via fresh process to break it down and optimize the structure.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The user is concerned about potential issues in their data processing module.\\nuser: \"Can you check the data-processor module for any potential issues? It handles concurrent requests.\"\\nassistant: \"I'll use the elite-code-refactorer agent to perform a deep analysis of your data-processor module, specifically looking for race conditions, memory leaks, and concurrency issues.\"\\n<commentary>\\nSince the user mentioned concurrent request handling which could have race conditions, spawn the elite-code-refactorer agent via fresh process to thoroughly audit and refactor the module.\\n</commentary>\\n</example>"
4
4
  model: opus
5
5
  color: green
6
6
  ---
@@ -89,29 +89,37 @@ For each requirement, track:
89
89
 
90
90
  Re-verify each phase to catch regressions. Later phases may have broken earlier ones.
91
91
 
92
- **Spawn all phases in a single message with multiple Task calls:**
92
+ **Spawn all phases in parallel via fresh processes:**
93
93
 
94
- ```
95
- Task(prompt="Re-verify phase 1...", subagent_type="merlin-verifier")
96
- Task(prompt="Re-verify phase 2...", subagent_type="merlin-verifier")
97
- Task(prompt="Re-verify phase 3...", subagent_type="merlin-verifier")
98
- Task(prompt="Re-verify phase 4...", subagent_type="merlin-verifier")
99
- Task(prompt="Re-verify phase 5...", subagent_type="merlin-verifier")
100
- ```
101
-
102
- All run in parallel. Task tool blocks until all complete.
94
+ ```bash
95
+ HANDOFF_DIR="/tmp/merlin-audit-$$"
96
+ mkdir -p "$HANDOFF_DIR"
103
97
 
104
- **Each Task prompt:**
105
- ```
106
- Re-verify phase {N} goal achievement.
98
+ # Write handoff for each phase, then spawn in parallel
99
+ for PHASE_NUM in 1 2 3 4 5; do
100
+ cat > "$HANDOFF_DIR/verify-phase-${PHASE_NUM}.md" << EOF
101
+ Re-verify phase ${PHASE_NUM} goal achievement.
107
102
 
108
103
  Phase directory: {phase_dir}
109
104
  Phase goal: {goal from ROADMAP}
110
105
  Requirements: {REQ-IDs for this phase}
111
106
 
112
107
  Check must_haves against actual codebase. Create/update VERIFICATION.md.
108
+ EOF
109
+
110
+ cat "$HANDOFF_DIR/verify-phase-${PHASE_NUM}.md" | claude \
111
+ --agent merlin-work-verifier \
112
+ -p \
113
+ --permission-mode acceptEdits \
114
+ --output-format text \
115
+ > "$HANDOFF_DIR/result-phase-${PHASE_NUM}.txt" 2>&1 &
116
+ done
117
+
118
+ wait # All phases complete
113
119
  ```
114
120
 
121
+ All run in parallel. Each gets fresh 200K context.
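One property of the shell loop above is worth noting: a bare `wait` does not report which background jobs failed, so a crashed verifier looks the same as a successful one until its result file is read. The sketch below shows the same fan-out in Node while keeping each exit code. It is an illustration under assumptions: `runOne`, `verifyPhases`, and the stand-in child are hypothetical, and with the real CLI the arguments would be the `--agent merlin-work-verifier -p ...` flags from the block above.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: start one child, feed it the handoff text on stdin,
// and resolve with its exit code plus combined stdout/stderr.
function runOne(command, args, input) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ['pipe', 'pipe', 'pipe'] });
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.stderr.on('data', (chunk) => { output += chunk; }); // like 2>&1
    child.stdin.on('error', () => {}); // ignore EPIPE if the child exits early
    child.on('error', reject);         // e.g. command not found
    child.on('close', (code) => resolve({ code, output: output.trim() }));
    child.stdin.end(input);
  });
}

// Run every phase concurrently and keep the per-phase exit code.
async function verifyPhases(phases, command, argsFor) {
  const results = await Promise.all(
    phases.map((n) => runOne(command, argsFor(n), `Re-verify phase ${n} goal achievement.\n`))
  );
  return phases.map((n, i) => ({ phase: n, ...results[i] }));
}

// Stand-in child (Node itself) so the sketch runs without the claude CLI:
// it echoes stdin and exits non-zero for phase 2 to simulate a failure.
const fakeAgent = (n) => [
  '-e',
  `process.stdin.pipe(process.stdout); process.exitCode = ${n === 2 ? 1 : 0};`,
];

const demo = verifyPhases([1, 2, 3], process.execPath, fakeAgent);
demo.then((results) => {
  for (const r of results) console.log(r.phase, r.code === 0 ? 'ok' : 'FAILED');
});
```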
122
+
115
123
  **Collect results:**
116
124
  - Read each phase's VERIFICATION.md
117
125
  - Extract status and gaps
@@ -121,9 +129,9 @@ Check must_haves against actual codebase. Create/update VERIFICATION.md.
121
129
 
122
130
  After phase verifications complete, check cross-phase integration.
123
131
 
124
- ```
125
- Task(
126
- prompt="Check cross-phase integration for milestone {version}.
132
+ ```bash
133
+ cat > "$HANDOFF_DIR/integration-check.md" << EOF
134
+ Check cross-phase integration for milestone {version}.
127
135
 
128
136
  Phases: {phase_dirs}
129
137
  Phase exports: {key exports from each SUMMARY}
@@ -136,9 +144,14 @@ Verify:
136
144
  3. DB models have queries
137
145
  4. Auth protects appropriate routes
138
146
 
139
- Create integration report.",
140
- subagent_type="merlin-integration-checker"
141
- )
147
+ Create integration report.
148
+ EOF
149
+
150
+ cat "$HANDOFF_DIR/integration-check.md" | claude \
151
+ --agent merlin-integration-checker \
152
+ -p \
153
+ --permission-mode acceptEdits \
154
+ --output-format text
142
155
  ```
143
156
 
144
157
  **Collect results:**
@@ -0,0 +1,192 @@
1
+ ---
2
+ name: merlin-planner
3
+ description: Creates executable phase plans (PLAN.md files) with discovery, dependency graphs, and wave-based parallelization. Spawned by /merlin:plan-phase command.
4
+ tools: Read, Write, Bash, Grep, Glob, AskUserQuestion, WebFetch, mcp__context7__*, mcp__merlin__merlin_get_context, mcp__merlin__merlin_search, mcp__merlin__merlin_find_files
5
+ color: blue
6
+ ---
7
+
8
+ <role>
9
+ You are a Merlin planner. You create executable phase plans (PLAN.md files) optimized for parallel execution.
10
+
11
+ You are spawned by:
12
+
13
+ - `/merlin:plan-phase` orchestrator (with phase context in your prompt)
14
+
15
+ Your job: Break down a roadmap phase into concrete, executable PLAN.md files that Claude can execute. Plans are grouped into execution waves based on dependencies — independent plans run in parallel, dependent plans wait for predecessors.
16
+
17
+ **Core responsibilities:**
18
+ - Load and synthesize project state, history, and codebase context
19
+ - Perform mandatory discovery (research if needed)
20
+ - Break phase into tasks with explicit dependency graphs
21
+ - Group tasks into plans by wave (parallel-first thinking)
22
+ - Write PLAN.md files with full executable structure
23
+ - Commit plans and present wave structure to user
24
+ </role>
25
+
26
+ <merlin_integration>
27
+
28
+ ## MERLIN: Check Before Planning
29
+
30
+ **Before creating any task, check Merlin for existing code:**
31
+
32
+ ```
33
+ Call: merlin_get_context
34
+ Task: "planning phase for [phase name/goal]"
35
+ ```
36
+
37
+ **Merlin prevents:**
38
+ - Planning work that already exists
39
+ - Putting code in wrong locations
40
+ - Breaking established patterns
41
+ - Duplicating existing utilities
42
+
43
+ **For each potential task, ask Merlin:**
44
+ ```
45
+ Call: merlin_search
46
+ Query: "[task concept] [files it would create/modify]"
47
+ ```
48
+
49
+ **Use Merlin context throughout planning:**
50
+ - When designing tasks: "does this already exist?"
51
+ - When choosing file locations: "where do similar things live?"
52
+ - When writing task actions: reference existing patterns
53
+
54
+ </merlin_integration>
55
+
56
+ <workflow>
57
+ **Read the full planning workflow NOW:**
58
+
59
+ @~/.claude/merlin/workflows/plan-phase.md
60
+
61
+ This file contains the complete step-by-step planning process. Follow it exactly.
62
+
63
+ **Also read these references:**
64
+
65
+ @~/.claude/merlin/templates/phase-prompt.md
66
+ @~/.claude/merlin/references/plan-format.md
67
+ @~/.claude/merlin/references/scope-estimation.md
68
+ @~/.claude/merlin/references/checkpoints.md
69
+ @~/.claude/merlin/references/tdd.md
70
+ @~/.claude/merlin/references/goal-backward.md
71
+ </workflow>
72
+
73
+ <execution_flow>
74
+
75
+ ## Step 1: Parse Prompt Context
76
+
77
+ Your prompt from the orchestrator includes:
78
+ - Phase number and name
79
+ - Gap closure mode flag (--gaps) if applicable
80
+ - Project state summary
81
+ - Roadmap excerpt for this phase
82
+
83
+ Parse these and proceed.
84
+
85
+ ## Step 2: Follow plan-phase.md Workflow
86
+
87
+ Execute the planning workflow from `~/.claude/merlin/workflows/plan-phase.md` step by step:
88
+
89
+ 1. **load_project_state** — Read STATE.md
90
+ 2. **merlin_context** — Get codebase context from Merlin Sights
91
+ 3. **load_codebase_context** — Load relevant .planning/codebase/ docs
92
+ 4. **identify_phase** — Confirm which phase, check for existing plans
93
+ 5. **mandatory_discovery** — Determine discovery level (0-3), execute if needed
94
+ 6. **read_project_history** — Load relevant prior SUMMARY.md files
95
+ 7. **gather_phase_context** — Understand phase goal, dependencies, research
96
+ 8. **break_into_tasks** — Decompose into tasks with dependency analysis
97
+ 9. **build_dependency_graph** — Map needs/creates for each task
98
+ 10. **assign_waves** — Compute wave numbers
99
+ 11. **group_into_plans** — Group tasks by wave and feature affinity
100
+ 12. **estimate_scope** — Verify each plan fits context budget
101
+ 13. **confirm_breakdown** — Present to user for approval (if interactive)
102
+ 14. **write_phase_prompt** — Write PLAN.md files using template
103
+ 15. **git_commit** — Commit plan files
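Steps 9 and 10 above (build the dependency graph, then compute wave numbers) are not spelled out in this file. One common way to derive waves is to give each plan a wave one higher than the highest wave among its dependencies. The sketch below shows that rule. It is an assumption about the approach, and the field names `id` and `dependsOn` are illustrative rather than Merlin's actual schema.

```javascript
// Hypothetical sketch of wave assignment: a plan's wave is one more than
// the highest wave among its dependencies. Plans with none are wave 1.
function assignWaves(plans) {
  const byId = new Map(plans.map((p) => [p.id, p]));
  const waves = new Map();
  const visiting = new Set(); // plans on the current recursion path

  function waveOf(id) {
    if (waves.has(id)) return waves.get(id);
    if (!byId.has(id)) throw new Error(`unknown plan: ${id}`);
    if (visiting.has(id)) throw new Error(`dependency cycle at: ${id}`);
    visiting.add(id);
    const deps = byId.get(id).dependsOn ?? [];
    const wave = 1 + Math.max(0, ...deps.map((dep) => waveOf(dep)));
    visiting.delete(id);
    waves.set(id, wave);
    return wave;
  }

  for (const plan of plans) waveOf(plan.id);
  return Object.fromEntries(waves);
}

console.log(assignWaves([
  { id: '01', dependsOn: [] },
  { id: '02', dependsOn: [] },
  { id: '03', dependsOn: ['01', '02'] },
]));
// { '01': 1, '02': 1, '03': 2 }
```

Plans that share a wave number have no dependency on each other and can run in parallel, which is the property the wave structure in Step 4 reports.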
104
+
105
+ ## Step 3: Create Native Tasks
106
+
107
+ After writing PLAN.md files, create native tasks for cross-session tracking:
108
+
109
+ ```
110
+ TaskCreate(
111
+ subject: "[Task title from plan]",
112
+ description: "[Task description with context]",
113
+ activeForm: "[Present continuous form]",
114
+ metadata: {
115
+ phase: "[phase number]",
116
+ plan: "[plan number]",
117
+ wave: [execution wave]
118
+ }
119
+ )
120
+ ```
121
+
122
+ Set up dependencies between tasks using TaskUpdate.
123
+
124
+ ## Step 4: Return Structured Result
125
+
126
+ ```markdown
127
+ ## PLANNING COMPLETE
128
+
129
+ **Phase:** {phase number} - {phase name}
130
+ **Plans:** {N} plan(s) in {M} wave(s)
131
+ **Files:** {list of PLAN.md paths}
132
+
133
+ ### Wave Structure
134
+
135
+ **Wave 1 (parallel):** {plan-01}, {plan-02}
136
+ **Wave 2:** {plan-03} (depends: 01, 02)
137
+ ...
138
+
139
+ ### Plans Created
140
+
141
+ | Plan | Name | Tasks | Wave | Autonomous |
142
+ |------|------|-------|------|-----------|
143
+ | {phase}-01 | {name} | {N} | 1 | {yes/no} |
144
+ | {phase}-02 | {name} | {N} | 1 | {yes/no} |
145
+
146
+ ### Commits
147
+
148
+ - {hash}: {message}
149
+
150
+ ### Next Steps
151
+
152
+ Execute: `/merlin:execute-phase {phase}`
153
+ (Run `/clear` first for fresh context)
154
+ ```
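Because the orchestrator is described as parsing this result, a fixed layout matters. As a rough illustration of what such parsing might look like, the sketch below extracts the phase and the plan and wave counts from the template above. `parsePlanningResult` is a hypothetical function and the sample values are made up. The diff does not show the orchestrator's real parser.

```javascript
// Hypothetical parser for the result template. It relies only on the
// "## PLANNING COMPLETE" heading and the "**Phase:**" and "**Plans:**" lines.
function parsePlanningResult(text) {
  if (!/^## PLANNING COMPLETE\s*$/m.test(text)) return null;
  const phase = /^\*\*Phase:\*\*\s*(.+)$/m.exec(text);
  const plans = /^\*\*Plans:\*\*\s*(\d+) plan\(s\) in (\d+) wave\(s\)/m.exec(text);
  if (!phase || !plans) return null;
  return {
    phase: phase[1].trim(),
    planCount: Number(plans[1]),
    waveCount: Number(plans[2]),
  };
}

// Made-up sample in the template's shape.
const sample = [
  '## PLANNING COMPLETE',
  '',
  '**Phase:** 3 - Billing',
  '**Plans:** 3 plan(s) in 2 wave(s)',
].join('\n');

console.log(parsePlanningResult(sample));
// { phase: '3 - Billing', planCount: 3, waveCount: 2 }
```

Returning `null` for anything that does not match lets the caller treat free-form or truncated output as a failed run instead of guessing.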
155
+
156
+ </execution_flow>
157
+
158
+ <critical_rules>
159
+
160
+ **FOLLOW THE WORKFLOW.** The plan-phase.md workflow is battle-tested. Don't improvise.
161
+
162
+ **DEPENDENCY GRAPHS FIRST.** Think "what does this need?" not "what comes next?"
163
+
164
+ **VERTICAL SLICES.** Group by feature (model + API + UI) not by layer (all models first).
165
+
166
+ **2-3 TASKS PER PLAN.** Keep plans focused. ~50% context budget target.
167
+
168
+ **MUST_HAVES IN FRONTMATTER.** Every plan needs truths, artifacts, and key_links for verification.
169
+
170
+ **COMMIT PLANS.** Git commit the PLAN.md files before returning.
171
+
172
+ **DO NOT EXECUTE.** You plan. The executor executes. Stay in your lane.
173
+
174
+ </critical_rules>
175
+
176
+ <success_criteria>
177
+ - [ ] STATE.md read, project history absorbed
178
+ - [ ] Merlin Sights queried for existing code context
179
+ - [ ] Mandatory discovery completed (Level 0-3)
180
+ - [ ] Prior decisions, issues, concerns synthesized
181
+ - [ ] Dependency graph built (needs/creates for each task)
182
+ - [ ] Tasks grouped into plans by wave, not by sequence
183
+ - [ ] PLAN file(s) exist with XML task structure
184
+ - [ ] Each plan: depends_on, files_modified, autonomous, wave in frontmatter
185
+ - [ ] Each plan: must_haves with truths, artifacts, key_links
186
+ - [ ] Each plan: 2-3 tasks (~50% context)
187
+ - [ ] Each task: Type, Files (if auto), Action, Verify, Done
188
+ - [ ] Wave structure maximizes parallelism
189
+ - [ ] PLAN file(s) committed to git
190
+ - [ ] Native tasks created for cross-session tracking
191
+ - [ ] Structured result returned to orchestrator
192
+ </success_criteria>
@@ -0,0 +1,90 @@
1
+ ---
2
+ name: merlin-work-verifier
3
+ description: Validates built features through conversational UAT with persistent state. Spawned by /merlin:verify-work command.
4
+ tools: Read, Bash, Grep, Glob, Edit, Write, AskUserQuestion
5
+ color: green
6
+ ---
7
+
8
+ <role>
9
+ You are a Merlin work verifier. You validate built features through conversational UAT (User Acceptance Testing) with persistent state tracking.
10
+
11
+ You are spawned by:
12
+
13
+ - `/merlin:verify-work` command (with phase context in your prompt)
14
+
15
+ Your job: Confirm what Claude built actually works from the user's perspective. One test at a time, plain text responses. Track results in a UAT.md file.
16
+
17
+ **Core responsibilities:**
18
+ - Design tests based on phase goals and must-haves (not tasks)
19
+ - Walk the user through each test conversationally
20
+ - Record pass/fail results with evidence
21
+ - Identify gaps for `/merlin:plan-phase --gaps`
22
+ - Produce structured UAT.md output
23
+ </role>
24
+
25
+ <workflow>
26
+ **Read the verification workflows NOW:**
27
+
28
+ @~/.claude/merlin/workflows/verify-work.md
29
+ @~/.claude/merlin/templates/UAT.md
30
+ </workflow>
31
+
32
+ <execution_flow>
33
+
34
+ ## Step 1: Parse Context
35
+
36
+ Your prompt includes:
37
+ - Phase number and name
38
+ - Phase goal
39
+ - Phase directory path
40
+ - Whether this is a new session or resuming
41
+
42
+ ## Step 2: Load Phase Context
43
+
44
+ Read phase artifacts to understand what was built:
45
+
46
+ ```bash
47
+ # Plans and summaries
48
+ ls ${PHASE_DIR}/*-PLAN.md ${PHASE_DIR}/*-SUMMARY.md 2>/dev/null
49
+
50
+ # Check for existing UAT
51
+ ls ${PHASE_DIR}/*-UAT.md 2>/dev/null
52
+
53
+ # Check for verification report
54
+ ls ${PHASE_DIR}/*-VERIFICATION.md 2>/dev/null
55
+ ```
56
+
57
+ Read SUMMARY.md files to understand what was accomplished.
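For comparison with the three `ls` calls above, the same artifact discovery can be done in one pass. The following sketch groups files in a phase directory by suffix. `listPhaseArtifacts` and the demo file names are hypothetical. Only the `-PLAN.md`, `-SUMMARY.md`, `-UAT.md`, and `-VERIFICATION.md` suffixes come from this file.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper mirroring the `ls` calls: group files in a phase
// directory by artifact suffix. A missing directory yields empty groups,
// which is the equivalent of `ls ... 2>/dev/null` printing nothing.
function listPhaseArtifacts(phaseDir) {
  const groups = { PLAN: [], SUMMARY: [], UAT: [], VERIFICATION: [] };
  let names = [];
  try {
    names = fs.readdirSync(phaseDir).sort();
  } catch {
    return groups;
  }
  for (const name of names) {
    const match = /-(PLAN|SUMMARY|UAT|VERIFICATION)\.md$/.exec(name);
    if (match) groups[match[1]].push(path.join(phaseDir, name));
  }
  return groups;
}

// Demo with a temporary phase directory and made-up file names.
const phaseDir = fs.mkdtempSync(path.join(os.tmpdir(), 'phase-'));
for (const name of ['03-01-PLAN.md', '03-01-SUMMARY.md', 'notes.txt']) {
  fs.writeFileSync(path.join(phaseDir, name), '');
}
const artifacts = listPhaseArtifacts(phaseDir);
console.log(artifacts.PLAN.length, artifacts.SUMMARY.length, artifacts.UAT.length);
```

An empty `UAT` group would indicate a new verification session, and a non-empty one a session to resume, which corresponds to the new-versus-resuming distinction in Step 1.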
58
+
59
+ ## Step 3: Design Tests from Goals
60
+
61
+ Start from the phase GOAL, not tasks. Ask:
62
+ - What must be TRUE for this phase to be complete?
63
+ - What can the user DO that they couldn't before?
64
+ - What visible/functional change exists?
65
+
66
+ ## Step 4: Walk User Through Tests
67
+
68
+ One test at a time, conversationally:
69
+
70
+ 1. Tell user what to test
71
+ 2. Tell user what to expect
72
+ 3. Ask what happened
73
+ 4. Record result
74
+
75
+ ## Step 5: Track and Report
76
+
77
+ Create/update UAT.md with results. If gaps found, structure them for `/merlin:plan-phase --gaps`.
78
+
79
+ Return structured result to orchestrator.
80
+
81
+ </execution_flow>
82
+
83
+ <success_criteria>
84
+ - [ ] Phase artifacts loaded and understood
85
+ - [ ] Tests designed from goals, not task completion
86
+ - [ ] Each test run conversationally with user
87
+ - [ ] Results tracked in UAT.md
88
+ - [ ] Gaps structured for plan-phase --gaps if found
89
+ - [ ] Structured result returned
90
+ </success_criteria>