@pharaoh-so/mcp 0.3.7 → 0.3.9

@@ -68,12 +68,32 @@ function installClaudeCodePlugin(home = homedir()) {
68
68
  process.stderr.write("Pharaoh: .claude-plugin/ manifest not found in package — cannot install.\n");
69
69
  return -1;
70
70
  }
71
- // Copy skills/
71
+ // Copy skills/ and generate pharaoh-* prefixed aliases
72
72
  let skillCount = 0;
73
73
  if (existsSync(BUNDLED_SKILLS_DIR)) {
74
74
  cpSync(BUNDLED_SKILLS_DIR, join(pluginDir, "skills"), { recursive: true, force: true });
75
75
  const entries = readdirSync(BUNDLED_SKILLS_DIR, { withFileTypes: true });
76
- skillCount = entries.filter((e) => e.isDirectory()).length;
76
+ const skillDirs = entries.filter((e) => e.isDirectory());
77
+ skillCount = skillDirs.length;
78
+ // Auto-generate pharaoh-* prefixed copies so both `/plan` and `pharaoh:plan`
79
+ // resolve to the same content. Without this, prefixed copies drift and users
80
+ // get a stripped skeleton when invoking via the pharaoh: prefix.
81
+ for (const dir of skillDirs) {
82
+ if (dir.name === "pharaoh" || dir.name.startsWith("pharaoh-"))
83
+ continue;
84
+ const prefixedName = `pharaoh-${dir.name}`;
85
+ const prefixedDir = join(pluginDir, "skills", prefixedName);
86
+ const srcSkill = join(pluginDir, "skills", dir.name, "SKILL.md");
87
+ if (!existsSync(srcSkill))
88
+ continue;
89
+ mkdirSync(prefixedDir, { recursive: true });
90
+ const content = readFileSync(srcSkill, "utf-8");
91
+ // Rewrite the name field in YAML frontmatter only (between --- delimiters).
92
+ // Using a whole-file /m regex would match `name:` in body content too.
93
+ const rewritten = content.replace(/^(---\n[\s\S]*?)(name:\s*).+(\n[\s\S]*?---)/, `$1$2${prefixedName}$3`);
94
+ writeFileSync(join(prefixedDir, "SKILL.md"), rewritten);
95
+ skillCount++;
96
+ }
77
97
  }
78
98
  // Copy .mcp.json
79
99
  const mcpSrc = join(PKG_ROOT, ".mcp.json");
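The alias generation and frontmatter rewrite in this hunk can be sketched in isolation. This is a minimal sketch, not the package's code: `aliasNameFor` and `rewriteFrontmatterName` are hypothetical helper names, and it assumes the frontmatter is the first block in the file, delimited by `---` lines. It anchors `name:` to the start of a line, so a `prompt-name:` key that happened to precede `name:` would not be rewritten by mistake. The regex in the diff does not anchor the key, which is harmless for the SKILL.md files shown here because `name:` comes first.

```javascript
// Hypothetical helpers for illustration; these names are not from the package.

// Mirror the skip rule in the diff: "pharaoh" and "pharaoh-*" get no alias.
function aliasNameFor(dirName) {
  if (dirName === "pharaoh" || dirName.startsWith("pharaoh-")) return null;
  return `pharaoh-${dirName}`;
}

// Rewrite `name:` only inside the leading frontmatter block.
function rewriteFrontmatterName(content, newName) {
  const m = /^(---\r?\n)([\s\S]*?)(\r?\n---)/.exec(content);
  if (!m) return content; // no frontmatter: leave the file untouched
  // Line-anchored match, so `prompt-name:` is never mistaken for `name:`.
  // A function replacer avoids `$` patterns in newName being interpreted.
  const updated = m[2].replace(/^name:[^\r\n]*/m, () => `name: ${newName}`);
  return m[1] + updated + m[3] + content.slice(m[0].length);
}

const sample =
  "---\nprompt-name: plan-with-pharaoh\nname: plan\n---\n\nname: not frontmatter\n";
const rewritten = rewriteFrontmatterName(sample, aliasNameFor("plan"));
console.log(rewritten);
```

Only the `name: plan` line inside the frontmatter changes; the `name:` line in the body and the `prompt-name:` key are left alone.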
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "@pharaoh-so/mcp",
3
3
  "mcpName": "so.pharaoh/pharaoh",
4
- "version": "0.3.7",
4
+ "version": "0.3.9",
5
5
  "description": "MCP proxy for Pharaoh — maps codebases into queryable knowledge graphs for AI agents. Enables Claude Code in headless environments (VPS, SSH, CI) via device flow auth.",
6
6
  "type": "module",
7
7
  "main": "dist/index.js",
@@ -1,22 +1,20 @@
1
1
  ---
2
2
  name: plan
3
3
  prompt-name: plan-with-pharaoh
4
- description: "Full-cycle architecture-aware planning: Pharaoh reconnaissance, structured plan writing with bite-sized TDD steps and zero placeholders, then deep adversarial review with wiring verification and interactive issue resolution. Replaces both writing-plans and plan-review."
5
- version: 0.3.0
4
+ description: Deep plan review with Pharaoh reconnaissance, wiring verification, and structured issue tracking. Use before implementing any feature, refactor, or significant code change. Enters plan mode (no code changes) and provides structured review with decision points.
5
+ version: 0.5.0
6
6
  homepage: https://pharaoh.so
7
7
  user-invocable: true
8
- metadata: {"emoji": "☥", "tags": ["planning", "architecture", "blast-radius", "pharaoh", "implementation-plan", "wiring", "review", "tdd"]}
8
+ metadata: {"emoji": "☥", "tags": ["planning", "architecture", "blast-radius", "pharaoh", "review", "interactive"]}
9
9
  ---
10
10
 
11
- # Plan with Pharaoh
11
+ # Plan Review
12
12
 
13
- Full-cycle planning: reconnaissance → plan writing → adversarial review. Combines architecture-aware graph analysis with rigorous plan craft and interactive issue resolution.
13
+ **You are now in plan mode. Do NOT make any code changes. Think, evaluate, and present decisions.**
14
14
 
15
- **You are in plan mode. Do NOT make any code changes. Think, evaluate, plan, review.**
15
+ ## Document Review
16
16
 
17
- ## When to Use
18
-
19
- Before implementing any non-trivial change: new features, refactors, adding modules, or anything that touches shared code. Use it whenever you need to answer "what's the right way to build this?" before writing code.
17
+ If the user provides a document, PRD, prompt, or artifact alongside this command, that IS the plan to review. Apply all review sections to that document. Do not treat it as background context — it is the subject of evaluation.
20
18
 
21
19
  ## Project Overrides
22
20
 
@@ -32,188 +30,58 @@ If a `.claude/plan-review.md` file exists in this project, read it now and apply
32
30
  - Subtraction > addition; target zero or negative net LOC
33
31
  - Every export must have a caller; unwired code doesn't exist
34
32
 
35
- ## Document Review Mode
36
-
37
- If the user provides a document, PRD, prompt, or artifact alongside this command, that IS the plan to review. Still run Phase 1 (Reconnaissance) — always verify against the actual codebase. Then proceed to Phase 3 (Approach) and Phase 5 (Review), applying all review sections to that document. Do not treat it as background context — it is the subject of evaluation.
38
-
39
- ---
40
-
41
- ## Phase 1 — Reconnaissance (required — do this BEFORE anything else)
33
+ ## Step 1: Pharaoh Reconnaissance (Required — do this BEFORE reviewing)
42
34
 
43
- Do NOT plan from memory or assumptions. Query the actual codebase first:
35
+ Do NOT review from memory or assumptions. Query the actual codebase first:
44
36
 
45
37
  1. `get_codebase_map` — current modules, hot files, dependency graph
46
38
  2. `search_functions` for keywords related to the plan — find existing code to reuse/extend
47
- 3. `get_module_context` on each module likely affected by the change
39
+ 3. `get_module_context` on affected modules — entry points, patterns, conventions
48
40
  4. `query_dependencies` between affected modules — coupling, circular deps
49
- 5. `get_blast_radius` on the primary target of the change
50
- 6. `check_reachability` on the primary target to verify it's reachable from entry points
51
41
 
52
42
  Ground every recommendation in what actually exists. If you propose adding something, confirm it doesn't already exist. If you propose changing something, know its blast radius.
53
43
 
54
- ## Phase 2 — Analysis
55
-
56
- Using the reconnaissance data:
57
-
58
- - Evaluate the blast radius — how many callers and modules are affected?
59
- - Check `search_functions` results — does related code already exist? Can you reuse/extend?
60
- - Assess module coupling — are the affected modules tightly or loosely coupled?
61
- - Rate the risk level (LOW / MEDIUM / HIGH) based on blast radius and coupling
62
- - Does this need new code at all, or can an existing pattern solve it?
63
-
64
- ## Phase 3 — Approach
65
-
66
- ### Scope Check
67
-
68
- If the spec covers multiple independent subsystems, it should be broken into separate plans — one per subsystem. Each plan should produce working, testable software on its own. Suggest splitting if needed.
69
-
70
- ### Mode Selection (MANDATORY — do NOT skip)
71
-
72
- **STOP and ask the user before proceeding.** This is a hard gate — do not infer, assume, or skip this question even if the user says "yes", "go ahead", "yes to all", or similar. Present both options and wait for an explicit choice:
73
-
74
- > **This looks like it could be a BIG or SMALL change. Which mode?**
75
- >
76
- > - **BIG CHANGE**: Full plan with all sections, approach trade-offs, interactive review
77
- > - **SMALL CHANGE**: Abbreviated plan, sections 2-4 of review only
78
-
79
- If the user's response is ambiguous (e.g. "just do it", "yes to all"), ask again: "I need to know — BIG or SMALL change?" Do not proceed to Phase 4 without an answer.
80
-
81
- ### Approach Trade-offs
82
-
83
- Propose 2-3 implementation approaches:
84
- - For each: what files change, estimated blast radius, pros, cons
85
- - Recommend one with justification
86
- - Flag any approach that would increase module coupling
87
- - Flag any approach that requires new code where existing code could be extended
88
-
89
- ## Phase 4 — Plan Writing
90
-
91
- ### File Structure
92
-
93
- Before defining tasks, map out which files will be created or modified and what each one is responsible for. This is where decomposition decisions get locked in.
94
-
95
- - Design units with clear boundaries and well-defined interfaces. Each file should have one clear responsibility.
96
- - Files that change together should live together. Split by responsibility, not by technical layer.
97
- - In existing codebases, follow established patterns. If the codebase uses large files, don't unilaterally restructure — but if a file you're modifying has grown unwieldy, including a split is reasonable.
98
-
99
- This structure informs the task decomposition. Each task should produce self-contained changes that make sense independently.
100
-
101
- ### Plan Document Header
102
-
103
- Every plan MUST start with:
104
-
105
- ```markdown
106
- # [Feature Name] Implementation Plan
107
-
108
- > **For agentic workers:** Use `pharaoh:orchestrate` (recommended) or `pharaoh:execute` to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
109
-
110
- **Goal:** [One sentence describing what this builds]
111
-
112
- **Architecture:** [2-3 sentences about approach]
113
-
114
- **Tech Stack:** [Key technologies/libraries]
115
-
116
- **Risk:** [LOW / MEDIUM / HIGH] — [one line justification from Phase 2 data]
117
-
118
- ---
119
- ```
120
-
121
- ### Bite-Sized Task Granularity
44
+ ## Step 1b: Reconnaissance Dashboard
122
45
 
123
- Each step is one action (2-5 minutes):
124
- - "Write the failing test" — step
125
- - "Run it to make sure it fails" — step
126
- - "Implement the minimal code to make the test pass" — step
127
- - "Run the tests and make sure they pass" — step
128
- - "Commit" — step
46
+ After running recon, present a visual summary before proceeding. This shows the user what Pharaoh found.
129
47
 
130
- ### Task Structure
48
+ **Surface all ★ Pharaoh insight blocks verbatim** — they contain pre-formatted bar charts, risk meters, and flow diagrams. Do not summarize or paraphrase them.
131
49
 
132
- ````markdown
133
- ### Task N: [Component Name]
50
+ Then compose a **Recon Summary** table:
134
51
 
135
- **Files:**
136
- - Create: `exact/path/to/file.ts`
137
- - Modify: `exact/path/to/existing.ts:123-145`
138
- - Test: `tests/exact/path/to/test.ts`
52
+ | Signal | Value | Source |
53
+ |--------|-------|--------|
54
+ | Modules affected | N | get_codebase_map |
55
+ | Blast radius | LOW/MED/HIGH + caller count | get_blast_radius |
56
+ | Existing functions found | N matches | search_functions |
57
+ | Cross-module coupling | deps + circular? | query_dependencies |
139
58
 
140
- **Blast radius:** [from Phase 1 data — callers affected, modules touched]
59
+ If any signal is surprising (high blast radius, circular deps, existing code that overlaps the plan), call it out before moving to Mode Selection.
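The Recon Summary table above could be assembled from collected signals roughly as follows. This is a sketch under assumptions: the `signals` object shape and both function names are hypothetical, and Pharaoh's actual tool output may look different.

```javascript
// Hypothetical input shape; not Pharaoh's real tool output.
function renderReconSummary(signals) {
  const rows = [
    ["Modules affected", String(signals.modulesAffected), "get_codebase_map"],
    ["Blast radius", `${signals.blastRadius.level} (${signals.blastRadius.callers} callers)`, "get_blast_radius"],
    ["Existing functions found", `${signals.existingMatches} matches`, "search_functions"],
    ["Cross-module coupling", `${signals.deps} deps, circular: ${signals.circular ? "yes" : "no"}`, "query_dependencies"],
  ];
  const header = ["| Signal | Value | Source |", "|--------|-------|--------|"];
  return header.concat(rows.map((r) => `| ${r.join(" | ")} |`)).join("\n");
}

// Collect the signals the skill says to call out before Mode Selection.
function surprises(signals) {
  const out = [];
  if (signals.blastRadius.level === "HIGH") out.push("high blast radius");
  if (signals.circular) out.push("circular dependencies");
  if (signals.existingMatches > 0) out.push("existing code may overlap the plan");
  return out;
}

const exampleSignals = {
  modulesAffected: 3,
  blastRadius: { level: "HIGH", callers: 12 },
  existingMatches: 2,
  deps: 4,
  circular: false,
};
console.log(renderReconSummary(exampleSignals));
console.log(surprises(exampleSignals));
```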
141
60
 
142
- **Wiring:** [where new exports get called from — declared caller for every export]
61
+ ## Step 2: Mode Selection
143
62
 
144
- - [ ] **Step 1: Write the failing test**
63
+ Ask the user which mode before starting the review:
145
64
 
146
- ```typescript
147
- test('specific behavior', () => {
148
- const result = myFunction(input);
149
- expect(result).toBe(expected);
150
- });
151
- ```
65
+ **BIG CHANGE**: Full interactive review, all relevant sections, up to 4 top issues per section.
66
+ **SMALL CHANGE**: One question per section, only sections 2-4.
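The two modes can be read as a small lookup table. The sketch below is illustrative only: `MODES` and `parseMode` are hypothetical names, the section list for BIG CHANGE assumes six numbered review sections, and reading "one question per section" as a limit of one issue per section is an interpretation, not something the skill states.

```javascript
// Hypothetical encoding of the two review modes described above.
const MODES = {
  BIG: { sections: [1, 2, 3, 4, 5, 6], maxIssuesPerSection: 4 },
  SMALL: { sections: [2, 3, 4], maxIssuesPerSection: 1 },
};

// Map a free-text answer to a mode; null means the answer was not
// recognizable and the caller should ask the user again.
function parseMode(answer) {
  const a = answer.trim().toUpperCase();
  if (a.startsWith("BIG")) return "BIG";
  if (a.startsWith("SMALL")) return "SMALL";
  return null;
}

console.log(parseMode("small change"), MODES.SMALL.sections);
```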
152
67
 
153
- - [ ] **Step 2: Run test to verify it fails**
68
+ ## Step 3: Review Sections
154
69
 
155
- Run: `pnpm test -- tests/path/test.ts`
156
- Expected: FAIL with "function not defined"
157
-
158
- - [ ] **Step 3: Write minimal implementation**
159
-
160
- ```typescript
161
- export function myFunction(input: string): string {
162
- return expected;
163
- }
164
- ```
165
-
166
- - [ ] **Step 4: Run test to verify it passes**
167
-
168
- Run: `pnpm test -- tests/path/test.ts`
169
- Expected: PASS
170
-
171
- - [ ] **Step 5: Commit**
172
-
173
- ```bash
174
- git add tests/path/test.ts src/path/file.ts
175
- git commit -m "feat: add specific feature"
176
- ```
177
- ````
178
-
179
- ### No Placeholders
180
-
181
- Every step must contain the actual content an engineer needs. These are **plan failures** — never write them:
182
-
183
- - "TBD", "TODO", "implement later", "fill in details"
184
- - "Add appropriate error handling" / "add validation" / "handle edge cases"
185
- - "Write tests for the above" (without actual test code)
186
- - "Similar to Task N" (repeat the code — the engineer may be reading tasks out of order)
187
- - Steps that describe what to do without showing how (code blocks required for code steps)
188
- - References to types, functions, or methods not defined in any task
189
-
190
- ### Remember
191
-
192
- - Exact file paths always
193
- - Complete code in every step — if a step changes code, show the code
194
- - Exact commands with expected output
195
- - DRY, YAGNI, TDD, frequent commits
196
- - Every new export must have a declared caller — if a function has no caller, it's not part of the plan
197
-
198
- ---
199
-
200
- ## Phase 5 — Adversarial Review
201
-
202
- Review the plan before presenting it. Apply all relevant sections, adapting depth to change size. Skip sections that don't apply.
70
+ Adapt depth to change size. Skip sections that don't apply.
203
71
 
204
72
  ### Section 1 — Architecture (skip for small/single-file changes)
205
73
 
206
74
  - Component boundaries and coupling concerns
207
75
  - Dependency graph: does this change shrink or expand surface area?
208
76
  - Data flow bottlenecks and single points of failure
209
- - Does this need new code at all, or can an existing pattern solve it?
77
+ - Does this need new code at all, or can a human process / existing pattern solve it?
210
78
 
211
79
  ### Section 2 — Code Quality (always)
212
80
 
213
81
  - Organization, module structure, DRY violations (be aggressive)
214
82
  - Error handling gaps and missing edge cases (call out explicitly)
215
83
  - Technical debt: shortcuts, hardcoded values, magic strings
216
- - Over-engineered or under-engineered relative to engineering preferences
84
+ - Over-engineered or under-engineered relative to my preferences
217
85
  - Reuse: does code for this already exist somewhere?
218
86
 
219
87
  ### Section 3 — Wiring & Integration (always)
@@ -221,7 +89,7 @@ Review the plan before presenting it. Apply all relevant sections, adapting dept
221
89
  - Are all new exports called from a production entry point?
222
90
  - Run `get_blast_radius` on any new/changed functions — zero callers = not done
223
91
  - `check_reachability` on new exports — verify reachable from API handlers, crons, or event handlers
224
- - Does every task declare WHERE new code gets called from? If not, flag it
92
+ - Does the plan declare WHERE new code gets called from? If not, flag it
225
93
  - Integration points: how does this connect to what already exists?
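The "zero callers = not done" rule amounts to a filter over caller counts. A minimal sketch, assuming the counts have already been gathered (for example from `get_blast_radius` results) into a plain object; `findUnwiredExports` and the data shape are hypothetical and not part of Pharaoh:

```javascript
// callerCounts: { exportName: numberOfCallers } (hypothetical shape).
// entryPoints: names that legitimately have no callers, such as API
// handlers, crons, or event handlers.
function findUnwiredExports(callerCounts, entryPoints = []) {
  const entry = new Set(entryPoints);
  return Object.entries(callerCounts)
    .filter(([name, callers]) => callers === 0 && !entry.has(name))
    .map(([name]) => name);
}

const unwired = findUnwiredExports(
  { handleWebhook: 0, parseToken: 3, formatReport: 0 },
  ["handleWebhook"],
);
console.log(unwired);
```

Any name returned here would be flagged in the review as missing a declared caller.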
226
94
 
227
95
  ### Section 4 — Tests (always)
@@ -239,77 +107,35 @@ Review the plan before presenting it. Apply all relevant sections, adapting dept
239
107
 
240
108
  ### Section 6 — Security & Attack Surface (always for new endpoints/routes/APIs; skip for pure refactors)
241
109
 
242
- - **Authentication model** — what authenticates requests? Where validated? What happens on failure?
243
- - **Sensitive data in URLs** — tokens, session IDs, or tenant identifiers in URL paths/params leak via Referer, history, logs
244
- - **Authorization boundaries** — what prevents User A from accessing User B's data?
245
- - **Input trust boundary** — user input flowing into shell commands, queries, HTML rendering, or file paths
246
- - **Error and response surface** — do error responses expose internals to unauthenticated callers?
247
- - **New attack surface** — new public URLs, webhooks, API routes each need rate limiting, auth, and input validation
248
-
249
- ### Self-Review Checklist (run after all sections)
110
+ - **Authentication model** — what authenticates requests in this plan? Where is it validated? What happens on auth failure (redirect, 401, silent pass-through)? Use `search_functions` to find existing auth middleware and confirm reuse.
111
+ - **Sensitive data in URLs** — does the design put tokens, session IDs, or tenant identifiers in URL paths or query params? These leak via Referer headers, browser history, logs, and link sharing.
112
+ - **Authorization boundaries** — what prevents User A from accessing User B's data? Is there an ownership check, or just an "is logged in" check? Use `get_blast_radius` on existing ownership-check functions to see where they're already enforced.
113
+ - **Input trust boundary** — does the plan accept user input that flows into shell commands, database queries, HTML rendering, or file paths? Each is an injection vector.
114
+ - **Error and response surface** — will error responses or API payloads expose internals (stack traces, DB schemas, internal IDs) to unauthenticated callers?
115
+ - **New attack surface** — does the plan introduce new public URLs, webhooks, API routes, or WebSocket endpoints? Each needs: rate limiting, authentication, and input validation. Use `get_module_context` on the receiving module to check what protections exist.
250
116
 
251
- 1. **Spec coverage:** Skim each section/requirement in the spec. Can you point to a task that implements it? List any gaps.
252
- 2. **Placeholder scan:** Search the plan for red flags from the "No Placeholders" section. Fix them.
253
- 3. **Type consistency:** Do types, method signatures, and property names used in later tasks match earlier tasks? A function called `clearLayers()` in Task 3 but `clearFullLayers()` in Task 7 is a bug.
254
- 4. **Wiring sweep:** `get_blast_radius` on ALL new exports — zero callers on non-entry-points = plan is incomplete.
255
-
256
- ### For Each Issue Found
117
+ ## For Each Issue Found
257
118
 
258
119
  For every specific issue (bug, smell, design concern, risk, missing wiring):
259
120
 
260
121
  1. **Describe concretely** — file, line/function reference, what's wrong
261
122
  2. **Present 2-3 options** including "do nothing" where reasonable
262
123
  3. **For each option** — implementation effort, risk, blast radius, maintenance burden
263
- 4. **Recommend one** mapped to engineering preferences above, and say why
264
- 5. **Ask** whether the user agrees or wants a different direction
265
-
266
- Number each issue (1, 2, 3...) and letter each option (A, B, C...). Recommended option is always listed first.
267
-
268
- ---
269
-
270
- ## Phase 6 — Output & Handoff
271
-
272
- ### Present the Plan
273
-
274
- A complete implementation plan containing:
275
-
276
- - Risk rating (LOW / MEDIUM / HIGH) with data backing
277
- - Recommended approach with trade-off rationale
278
- - File structure map
279
- - Numbered tasks with bite-sized steps, exact files, and complete code
280
- - Blast radius per task
281
- - Wiring declarations for every new export
282
- - Required tests per step
283
- - Adversarial review findings (issues caught and resolved)
124
+ 4. **Recommend one** mapped to my preferences above, and say why
125
+ 5. **Ask** whether I agree or want a different direction
284
126
 
285
- Save plans to: `docs/sessions/YYYY-MM-DD-<feature-name>.md`
286
- (User preferences for plan location override this default)
287
-
288
- ### Execution Handoff
289
-
290
- After saving the plan, offer execution choice:
291
-
292
- **"Plan complete and saved. Two execution options:**
293
-
294
- **1. Orchestrated (recommended)** — I dispatch a fresh subagent per task with two-stage review (spec compliance then code quality). Use `pharaoh:orchestrate`.
295
-
296
- **2. Inline Execution** — Execute tasks in this session with checkpoints. Use `pharaoh:execute`.
297
-
298
- **Which approach?"**
299
-
300
- ---
127
+ Number each issue (1, 2, 3...) and letter each option (A, B, C...). Recommended option is always listed first. Use AskUserQuestion with clear labels like "Issue 1 Option A", "Issue 1 Option B".
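The numbering and lettering convention above can be made mechanical. The sketch below builds labels such as "Issue 1 Option A" and moves the recommended option to the front; `buildQuestions` and the issue/option shape are hypothetical, and the returned objects are not claimed to match the real input format of AskUserQuestion.

```javascript
// issues: [{ summary, options: [{ text, recommended? }] }] (hypothetical shape).
function buildQuestions(issues) {
  return issues.map((issue, i) => {
    // Stable sort: the recommended option moves first, others keep their order.
    const ordered = [...issue.options].sort(
      (a, b) => Number(b.recommended === true) - Number(a.recommended === true),
    );
    return {
      question: `Issue ${i + 1}: ${issue.summary}`,
      options: ordered.map((opt, j) => ({
        label: `Issue ${i + 1} Option ${String.fromCharCode(65 + j)}`,
        description: opt.recommended ? `${opt.text} (recommended)` : opt.text,
      })),
    };
  });
}

const questions = buildQuestions([
  {
    summary: "duplicate auth check",
    options: [
      { text: "do nothing" },
      { text: "reuse existing middleware", recommended: true },
    ],
  },
]);
console.log(JSON.stringify(questions, null, 2));
```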
301
128
 
302
129
  ## Pharaoh Checkpoints (use throughout, not just at the end)
303
130
 
304
- - **Before planning**: recon (Phase 1)
305
- - **During plan writing**: `get_blast_radius` when evaluating impact; `search_functions` before proposing new code
306
- - **During review**: `get_blast_radius` on all new/changed functions; `check_reachability` on new exports
307
- - **After decisions**: `get_unused_code` to catch disconnections
131
+ - **Before reviewing**: recon (Step 1 above)
132
+ - **During review**: `get_blast_radius` when evaluating impact of changes; `search_functions` before suggesting new code
133
+ - **After decisions**: `check_reachability` on all new exports; `get_unused_code` to catch disconnections
308
134
  - **Final sweep**: `get_blast_radius` on ALL new exports — zero callers on non-entry-points = plan is incomplete
309
135
 
310
136
  ## Workflow Rules
311
137
 
312
- - After each review section, pause and ask for feedback before moving on (BIG CHANGE mode)
138
+ - After each section, pause and ask for feedback before moving on
313
139
  - Do not assume priorities on timeline or scale
314
140
  - If you see a better approach to the entire plan, say so BEFORE section-by-section review
315
- - Challenge the approach if you see a better one — your job is to find problems the user will regret later
141
+ - Challenge the approach if you see a better one — your job is to find problems I'll regret later
@@ -1,8 +1,8 @@
1
1
  ---
2
2
  name: sessions
3
3
  prompt-name: session-decomposition
4
- description: "Decompose work into parallel, isolated sessions using git worktrees. Each session gets fresh context, a narrow scope, and produces atomic commits. Prevents context window pollution from large tasks. Coordinate across sessions without shared state."
5
- version: 0.2.0
4
+ description: "Decompose work into parallel, isolated sessions using git worktrees. Each session gets fresh context, a narrow scope, and produces atomic commits. Presents session prompts for user review before execution."
5
+ version: 0.3.0
6
6
  homepage: https://pharaoh.so
7
7
  user-invocable: true
8
8
  metadata: {"emoji": "☥", "tags": ["sessions", "worktrees", "parallel-work", "context-management", "decomposition"]}
@@ -10,7 +10,7 @@ metadata: {"emoji": "☥", "tags": ["sessions", "worktrees", "parallel-work", "c
10
10
 
11
11
  # Session Decomposition
12
12
 
13
- Break large tasks into parallel, isolated work sessions. Each session runs in its own git worktree with fresh context, focused scope, and atomic commits. Prevents context window bloat and keeps each unit of work clean.
13
+ Break large tasks into parallel, isolated work sessions. Each session runs in its own git worktree with fresh context, focused scope, and atomic commits.
14
14
 
15
15
  ## When to Use
16
16
 
@@ -25,9 +25,11 @@ Break large tasks into parallel, isolated work sessions. Each session runs in it
25
25
  - Work is sequential (each step depends on the previous)
26
26
  - Task fits comfortably in one session
27
27
 
28
- ## Process
28
+ ## Step 1: Reconnaissance
29
29
 
30
- ### 1. Decompose
30
+ If Pharaoh MCP tools are available, call `get_codebase_map` and `get_module_context` on affected modules to understand the current landscape before decomposing.
31
+
32
+ ## Step 2: Decompose
31
33
 
32
34
  Break the task into sessions. Each session must:
33
35
 
@@ -36,19 +38,9 @@ Break the task into sessions. Each session must:
36
38
  - Be independently verifiable (tests pass, build succeeds)
37
39
  - Produce atomic commits that make sense on their own
38
40
 
39
- ### 2. Create Worktrees
40
-
41
- For each session, create an isolated worktree:
41
+ ## Step 3: Write Session Prompts
42
42
 
43
- ```bash
44
- git worktree add .worktrees/<session-name> -b <branch-name>
45
- ```
46
-
47
- Install dependencies in each worktree. Verify clean baseline (tests pass).
48
-
49
- ### 3. Write Session Prompts
50
-
51
- Each session gets a prompt containing:
43
+ For each session, write a complete prompt containing:
52
44
 
53
45
  - **Goal:** what this session produces (1-2 sentences)
54
46
  - **Scope:** which files/modules to touch (explicit list)
@@ -56,11 +48,34 @@ Each session gets a prompt containing:
56
48
  - **Verification:** how to confirm the work is correct
57
49
  - **Context:** any architectural decisions or patterns to follow
58
50
 
59
- ### 4. Execute Sessions
51
+ ## Step 4: Present for Review (MANDATORY — do NOT skip)
52
+
53
+ **STOP. Paste every session prompt into the chat as a numbered list.**
54
+
55
+ For each session, show:
56
+ 1. The session name
57
+ 2. The full prompt text
58
+ 3. Which sessions need `/plan` review (flag anything non-trivial)
59
+
60
+ **Wait for the user to approve, modify, add, remove, or reorder sessions before proceeding.** Do not create worktrees or execute any work until the user explicitly approves the decomposition.
61
+
62
+ If the user says "looks good" or similar, proceed. If they request changes, update the prompts and present again.
63
+
64
+ ## Step 5: Create Worktrees
65
+
66
+ Only after user approval. For each session, create an isolated worktree:
67
+
68
+ ```bash
69
+ git worktree add .worktrees/<session-name> -b <branch-name>
70
+ ```
71
+
72
+ Install dependencies in each worktree. Verify clean baseline (tests pass).
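The approval gate from Step 4 and the worktree command from Step 5 can be combined in a small helper. This is a sketch under assumptions: `worktreeCommands` is a hypothetical name, the `session/<name>` branch naming is an illustrative choice (the skill leaves `<branch-name>` open), and the function only builds command strings rather than running git.

```javascript
// Returns the commands to run, or nothing if the user has not approved yet.
function worktreeCommands(sessionNames, approved) {
  if (!approved) return []; // Step 4 gate: no worktrees before approval
  return sessionNames.map((name) => {
    // Reject names that would be unsafe to interpolate into a shell command.
    if (!/^[A-Za-z0-9._-]+$/.test(name)) {
      throw new Error(`unsafe session name: ${name}`);
    }
    return `git worktree add .worktrees/${name} -b session/${name}`;
  });
}

console.log(worktreeCommands(["auth-refactor", "billing-tests"], true));
```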
73
+
74
+ ## Step 6: Execute Sessions
60
75
 
61
76
  Run each session independently. Sessions should not reference each other's work-in-progress — they operate on the same base commit.
62
77
 
63
- ### 5. Integrate
78
+ ## Step 7: Integrate
64
79
 
65
80
  After all sessions complete:
66
81
 
@@ -84,3 +99,4 @@ After all sessions complete:
84
99
  - **Atomic commits** — each session's output should be a coherent, reviewable unit
85
100
  - **Verify before integrating** — never merge a session that doesn't pass its own checks
86
101
  - **Decomposition is the hard part** — spend time getting boundaries right before starting work
102
+ - **The user reviews before execution** — always present prompts, never skip to building