@pharaoh-so/mcp 0.1.6 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. package/CHANGELOG.md +41 -0
  2. package/LICENSE +21 -0
  3. package/README.md +237 -13
  4. package/dist/helpers.js +1 -1
  5. package/dist/index.js +6 -0
  6. package/dist/install-skills.d.ts +33 -0
  7. package/dist/install-skills.js +121 -0
  8. package/inspect-tools.json +12 -2
  9. package/package.json +64 -32
  10. package/skills/.gitkeep +0 -0
  11. package/skills/pharaoh/SKILL.md +81 -0
  12. package/skills/pharaoh-audit-tests/SKILL.md +88 -0
  13. package/skills/pharaoh-brainstorm/SKILL.md +73 -0
  14. package/skills/pharaoh-debt/SKILL.md +33 -0
  15. package/skills/pharaoh-debug/SKILL.md +69 -0
  16. package/skills/pharaoh-execute/SKILL.md +57 -0
  17. package/skills/pharaoh-explore/SKILL.md +32 -0
  18. package/skills/pharaoh-finish/SKILL.md +79 -0
  19. package/skills/pharaoh-health/SKILL.md +36 -0
  20. package/skills/pharaoh-investigate/SKILL.md +34 -0
  21. package/skills/pharaoh-onboard/SKILL.md +32 -0
  22. package/skills/pharaoh-parallel/SKILL.md +74 -0
  23. package/skills/pharaoh-plan/SKILL.md +74 -0
  24. package/skills/pharaoh-pr/SKILL.md +52 -0
  25. package/skills/pharaoh-refactor/SKILL.md +36 -0
  26. package/skills/pharaoh-review/SKILL.md +61 -0
  27. package/skills/pharaoh-review-codex/SKILL.md +80 -0
  28. package/skills/pharaoh-review-receive/SKILL.md +81 -0
  29. package/skills/pharaoh-sessions/SKILL.md +85 -0
  30. package/skills/pharaoh-tdd/SKILL.md +104 -0
  31. package/skills/pharaoh-verify/SKILL.md +72 -0
  32. package/skills/pharaoh-wiring/SKILL.md +34 -0
  33. package/skills/pharaoh-worktree/SKILL.md +85 -0
  34. package/dist/auth.js.map +0 -1
  35. package/dist/credentials.js.map +0 -1
  36. package/dist/index.js.map +0 -1
  37. package/dist/proxy.js.map +0 -1
@@ -0,0 +1,74 @@
+ ---
+ name: pharaoh-parallel
+ description: "Dispatch 2+ independent subagent tasks that run concurrently. Each agent gets focused scope, clear goal, constraints, and expected output. No shared state between agents. Review and integrate results after all complete."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["parallel", "subagents", "concurrency", "delegation", "efficiency"]}
+ ---
+
+ # Parallel Dispatch
+
+ Delegate independent tasks to specialized agents running concurrently. Each agent gets isolated context and focused scope. Never share your session history — construct exactly what each agent needs.
+
+ ## When to Use
+
+ - 2+ independent tasks with no shared state
+ - Multiple failures across different subsystems
+ - Each problem can be understood without context from others
+ - Agents won't edit the same files
+
+ ## Do Not Use When
+
+ - Failures are related (fixing one might fix others)
+ - Tasks require understanding full system state
+ - Agents would edit the same files or resources
+ - You don't yet know what's broken (investigate first)
+
+ ## The Pattern
+
+ ### 1. Identify Independent Domains
+
+ Group work by what's independent:
+ - Different test files with different root causes
+ - Different modules with unrelated issues
+ - Different features with no shared code
+
+ ### 2. Craft Agent Prompts
+
+ Each agent gets:
+
+ - **Specific scope:** one file, one module, one subsystem
+ - **Clear goal:** what success looks like
+ - **Constraints:** what NOT to change
+ - **Context:** error messages, relevant code paths, architectural notes
+ - **Expected output:** summary of findings and changes
+
+ ### 3. Dispatch
+
+ Launch all agents simultaneously. They run concurrently with no coordination needed.
+
+ ### 4. Review and Integrate
+
+ When agents return:
+
+ 1. Read each summary — understand what changed
+ 2. Check for conflicts — did agents edit overlapping code?
+ 3. Run full test suite — verify all fixes work together
+ 4. Spot-check results — agents can make systematic errors
+
+ ## Prompt Quality
+
+ | Bad | Good |
+ |-----|------|
+ | "Fix all the tests" | "Fix the 3 failures in agent-abort.test.ts" |
+ | "Fix the race condition" | "Fix timing in abort test — here are the error messages: ..." |
+ | No constraints | "Do NOT change production code, fix tests only" |
+ | "Fix it" | "Return: root cause summary + what you changed" |
+
+ ## Iron Rules
+
+ - **One task per agent** — focused agents produce better results than broad ones
+ - **No shared state** — agents must not depend on each other's output
+ - **Never trust agent reports** — verify changes independently before integrating
+ - **Construct context, don't inherit** — give each agent exactly what it needs, nothing more
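The dispatch-and-review pattern above can be sketched with ordinary concurrency primitives. A minimal Python sketch — `run_agent`, the task shape, and the scope-overlap check are illustrative assumptions, not part of the skill:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task):
    # Stand-in for the real dispatch mechanism: each agent sees only its
    # own scope, goal, constraints, and context -- never shared history.
    return {"scope": task["scope"], "summary": "resolved: " + task["goal"]}

def dispatch_parallel(tasks):
    # Refuse overlapping scopes up front: agents must not edit the same files.
    scopes = [t["scope"] for t in tasks]
    if len(scopes) != len(set(scopes)):
        raise ValueError("tasks share a scope; merge them or re-decompose")
    # Launch all agents simultaneously; no coordination between them.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = list(pool.map(run_agent, tasks))
    # Results are reviewed after ALL agents complete, never mid-flight.
    return results
```

The overlap check enforces the "no shared state" iron rule mechanically, before any agent is launched.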
@@ -0,0 +1,74 @@
+ ---
+ name: pharaoh-plan
+ description: "Architecture-aware planning workflow using Pharaoh codebase knowledge graph. Four-phase process: reconnaissance with MCP tools, blast radius analysis, approach selection with trade-offs, and step-by-step implementation plan with wiring declarations. Prevents dead exports and overcoupled designs before a line of code is written."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["planning", "architecture", "blast-radius", "pharaoh", "implementation-plan", "wiring"]}
+ ---
+
+ # Plan with Pharaoh
+
+ Architecture-aware planning before implementation. Uses `plan-with-pharaoh` — a 4-phase workflow (+ adversarial review) that combines reconnaissance, blast radius analysis, approach trade-offs, and a wired step-by-step plan. Iron law: every new export must have a declared caller.
+
+ ## When to Use
+
+ Invoke before implementing any non-trivial change: new features, refactors, adding modules, or anything that touches shared code. Use it whenever you need to answer "what's the right way to build this?" before writing code.
+
+ ## Workflow
+
+ ### Phase 1 — Reconnaissance (do NOT skip)
+
+ 1. Call `get_codebase_map` for the target repository to see the full module landscape.
+ 2. Call `get_module_context` on each module likely affected by the change.
+ 3. Call `search_functions` for terms related to the feature or change description.
+ 4. Call `query_dependencies` between the affected modules to map coupling.
+ 5. Call `get_blast_radius` on the primary target of the change.
+ 6. Call `check_reachability` on the primary target to verify it is reachable from entry points.
+
+ ### Phase 2 — Analysis
+
+ Using the reconnaissance data:
+ - Evaluate the blast radius — how many callers and modules are affected?
+ - Check `search_functions` results — does related code already exist?
+ - Assess module coupling — are the affected modules tightly or loosely coupled?
+ - Rate the risk level (LOW / MEDIUM / HIGH) based on blast radius and coupling.
+
+ ### Phase 3 — Approach
+
+ Propose 2-3 implementation approaches with trade-offs:
+ - For each approach: what files change, estimated blast radius, pros, cons.
+ - Recommend one approach with justification.
+ - Flag any approach that would increase module coupling.
+
+ ### Phase 4 — Plan
+
+ Produce a step-by-step implementation plan:
+ - Exact files and functions to create or modify.
+ - Blast radius per change (from Phase 1 data).
+ - Required tests for each step.
+ - Wiring declarations: every new export must have a declared caller.
+
+ Iron law: "Every new export in the plan must have a declared caller. If a function has no caller, it's not part of the plan — remove it."
+
+ ### Phase 5 — Adversarial Review
+
+ Before presenting the plan, adversarially review it:
+ - Are all new exports connected to declared callers? (If not, remove them.)
+ - Is the blast radius acceptable, or does the approach touch too many callers?
+ - Does the approach minimize coupling, or does it introduce new cross-module dependencies?
+ - Are there simpler alternatives that achieve the same result with fewer file changes?
+ - Would any step create unreachable code paths?
+
+ Only present the plan after it passes this review.
+
+ ## Output
+
+ A complete implementation plan containing:
+ - Risk rating (LOW / MEDIUM / HIGH) with data backing
+ - Recommended approach with trade-off rationale
+ - Numbered steps with exact files and functions
+ - Blast radius per change
+ - Required tests per step
+ - Wiring declarations for every new export
+ - Adversarial review findings (issues caught and resolved)
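Phase 1's call sequence is mechanical enough to sketch. The following assumes a generic MCP client exposing a `call_tool(name, arguments)` method; the client interface and the argument key names are assumptions — only the tool names and their order come from the workflow above:

```python
def reconnaissance(client, repo, target, modules, search_terms):
    """Run the six Phase 1 calls in order and collect their results.

    `client.call_tool(name, arguments)` is a stand-in for whatever MCP
    client is in use; argument keys ("repository", "module", ...) are
    hypothetical.
    """
    data = {}
    data["map"] = client.call_tool("get_codebase_map", {"repository": repo})
    data["modules"] = {
        m: client.call_tool("get_module_context", {"module": m}) for m in modules
    }
    data["search"] = [
        client.call_tool("search_functions", {"query": q}) for q in search_terms
    ]
    data["deps"] = client.call_tool("query_dependencies", {"modules": modules})
    data["blast"] = client.call_tool("get_blast_radius", {"target": target})
    data["reachable"] = client.call_tool("check_reachability", {"target": target})
    return data
```

Collecting all six results before analysis enforces the "do NOT skip" rule: Phase 2 only starts once `data` is complete.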
@@ -0,0 +1,52 @@
+ ---
+ name: pharaoh-pr
+ description: "Pre-pull-request architectural review checklist using Pharaoh codebase knowledge graph. Covers module context, blast radius per touched module, hidden coupling between modules, duplicate logic detection, regression risk scoring, and vision spec alignment. Produces a structured review summary before opening a PR."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["pull-request", "code-review", "architecture", "pharaoh", "pre-pr", "regression-risk"]}
+ ---
+
+ # Pre-PR Review
+
+ Architectural review checklist to run before opening a pull request. Uses `pre-pr-review` — a 6-step workflow covering module context, blast radius, dependency coupling, duplicate logic, regression risk, and spec drift. Catches architectural problems before reviewers see the code.
+
+ ## When to Use
+
+ Invoke before opening a pull request. Use it when changes touch one or more modules and you want a structured architectural assessment before requesting human review.
+
+ ## Workflow
+
+ ### Step 1: Module context
+
+ For each touched module, call `get_module_context` to review its current structure and complexity.
+
+ ### Step 2: Blast radius
+
+ For each touched module, call `get_blast_radius` to identify what else is affected by the changes.
+
+ ### Step 3: Dependency check
+
+ Call `query_dependencies` between each pair of touched modules to find hidden coupling introduced by the PR.
+
+ ### Step 4: Consolidation check
+
+ Call `get_consolidation_opportunities` for the target repository to flag any duplicate logic introduced by the PR.
+
+ ### Step 5: Regression risk
+
+ Call `get_regression_risk` for the target repository to assess overall change risk.
+
+ ### Step 6: Vision alignment
+
+ Call `get_vision_gaps` for the target repository to check if changes align with or drift from specs.
+
+ ## Output
+
+ A review summary containing:
+ - **Architecture impact:** modules affected, dependency changes introduced
+ - **Risk assessment:** blast radius per module, overall regression risk level
+ - **Cleanup opportunities:** consolidation candidates, unused code created
+ - **Spec alignment:** vision gaps introduced or resolved by the PR
+
+ Ready to paste into the PR description or share with reviewers.
@@ -0,0 +1,36 @@
+ ---
+ name: pharaoh-refactor
+ description: "Safe refactoring workflow using Pharaoh codebase knowledge graph. Six-step process: module context, blast radius of downstream callers, reachability verification, dependency mapping, naming conflict detection, and test coverage assessment. Produces a refactoring plan with every caller listed, test files identified, unreachable code flagged, and high-risk paths warned."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["refactoring", "blast-radius", "architecture", "pharaoh", "safe-refactor", "test-coverage"]}
+ ---
+
+ # Safe Refactor
+
+ Step-by-step workflow to safely refactor a function or module with full blast radius awareness. Uses `safe-refactor` — a 6-step process that maps every caller, identifies affected tests, flags unreachable code, and warns about high-risk downstream paths before a single line changes.
+
+ ## When to Use
+
+ Invoke before refactoring any function, module, or file — especially shared utilities, exports used across multiple modules, or anything with an unclear caller graph. Use it whenever the blast radius of a change is unknown.
+
+ ## Workflow
+
+ 1. Call `get_module_context` for the module containing the target to understand its current structure.
+ 2. Call `get_blast_radius` for the target to identify all downstream callers and affected modules.
+ 3. Call `check_reachability` for the target to verify it is actually reachable from entry points.
+ 4. Call `query_dependencies` to map how the containing module connects to its dependents.
+ 5. Call `search_functions` to check if the refactored version's name already exists elsewhere.
+ 6. Call `get_test_coverage` for the module to identify which tests cover the refactored code.
+
+ Do not propose a refactoring plan until all 6 steps are complete.
+
+ ## Output
+
+ A refactoring plan containing:
+ - **Callers to update:** every function and file that calls the target, with update requirements
+ - **Tests to change:** test files covering the refactored code, with required modifications
+ - **Dead code:** unreachable code paths that can be deleted during the refactor
+ - **High-risk paths:** downstream modules with wide blast radius or high complexity scores
+ - **Naming conflicts:** existing functions whose names would conflict with the refactored version
@@ -0,0 +1,61 @@
+ ---
+ name: pharaoh-review
+ description: "Architecture-aware pre-PR code review using Pharaoh codebase knowledge graph. Four-phase workflow: context gathering with module structure and blast radius, risk assessment with regression scoring and wiring checks, spec alignment against vision docs, and a final verdict of SHIP / SHIP WITH CHANGES / BLOCK. Auto-block rules for unreachable exports, circular dependencies, high regression risk, and spec violations."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["code-review", "pull-request", "architecture", "pharaoh", "regression-risk", "spec-alignment"]}
+ ---
+
+ # Review with Pharaoh
+
+ Architecture-aware pre-PR review. Uses `review-with-pharaoh` — a 4-phase workflow that assesses blast radius, regression risk, wiring integrity, duplication, and spec alignment. Produces a final verdict: SHIP, SHIP WITH CHANGES, or BLOCK.
+
+ ## When to Use
+
+ Invoke before merging any pull request. Use it when reviewing changes that touch shared modules, export new functions, modify core data flows, or claim to implement a spec.
+
+ ## Workflow
+
+ ### Phase 1 — Context
+
+ 1. For each touched module, call `get_module_context` to understand its structure.
+ 2. For each touched module, call `get_blast_radius` to identify downstream impact.
+ 3. Call `query_dependencies` between the touched modules to map coupling.
+
+ ### Phase 2 — Risk Assessment
+
+ 4. Call `get_regression_risk` for the target repository to assess overall change risk.
+ 5. Call `check_reachability` for new exports in the touched modules — are they wired?
+ 6. Call `get_consolidation_opportunities` for the repository to check for duplicated logic.
+
+ ### Phase 3 — Spec Alignment
+
+ 7. Call `get_vision_gaps` for the repository to verify changes align with specs.
+
+ ### Phase 4 — Verdict
+
+ Produce a review with:
+ - **Architecture impact:** modules affected, dependency changes, blast radius
+ - **Risk assessment:** regression risk level, volatile modules touched
+ - **Wiring check:** are all new exports reachable from entry points?
+ - **Duplication check:** does new code duplicate existing logic?
+ - **Spec alignment:** do changes match or drift from vision specs?
+
+ Final verdict: **SHIP** / **SHIP WITH CHANGES** / **BLOCK**
+
+ Auto-block triggers (any of these = BLOCK):
+ - Unreachable exports (new code with zero callers)
+ - New circular dependencies between modules
+ - HIGH regression risk without corresponding test coverage
+ - Vision spec violations (building against spec intent)
+
+ ## Output
+
+ A structured review containing:
+ - Architecture impact summary with specific modules and blast radius numbers
+ - Risk level (LOW / MEDIUM / HIGH) with data backing
+ - Wiring status for all new exports
+ - Duplication findings with affected modules
+ - Spec alignment verdict
+ - Final verdict (SHIP / SHIP WITH CHANGES / BLOCK) with specific required changes if not SHIP
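The auto-block triggers are mechanical enough to express as code. A sketch with hypothetical finding fields — the four triggers mirror the Phase 4 list, everything else is illustrative:

```python
def verdict(findings):
    """Map review findings to SHIP / SHIP WITH CHANGES / BLOCK.

    `findings` is a dict with hypothetical field names; the four
    auto-block triggers mirror the Phase 4 list.
    """
    blockers = [
        findings.get("unreachable_exports"),           # new code with zero callers
        findings.get("new_circular_dependencies"),     # new cycles between modules
        findings.get("regression_risk") == "HIGH"
            and not findings.get("covering_tests"),    # HIGH risk without coverage
        findings.get("spec_violations"),               # building against spec intent
    ]
    if any(blockers):
        return "BLOCK"
    if findings.get("required_changes"):
        return "SHIP WITH CHANGES"
    return "SHIP"
```

Note that HIGH regression risk alone does not block — only HIGH risk without corresponding test coverage does, matching the trigger as stated.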
@@ -0,0 +1,80 @@
+ ---
+ name: pharaoh-review-codex
+ description: "Cross-model security review. Dispatch code to a different AI model or subagent for independent second-opinion review. Evaluator applies AGREE, DISAGREE, or CONTEXT verdicts to each finding. Catches blind spots from single-model reasoning. Use for security-sensitive code, auth flows, data access, and architectural decisions."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["security-review", "cross-model", "second-opinion", "code-review", "verification"]}
+ ---
+
+ # Cross-Model Code Review
+
+ Get a second opinion on critical code by dispatching it to an independent reviewer — a different agent, model, or subagent. One model's blind spots are another's obvious catches.
+
+ ## When to Use
+
+ - Security-sensitive changes (auth, encryption, access control, token handling)
+ - Data access patterns (tenant isolation, query construction, input validation)
+ - Architectural decisions with long-term consequences
+ - Code you're not fully confident about
+ - Before shipping changes that affect user data or billing
+
+ ## Do Not Use When
+
+ - Trivial changes (typos, formatting, dependency bumps)
+ - Changes fully covered by existing tests with high mutation scores
+ - Time-critical hotfixes where review delay is worse than risk
+
+ ## Process
+
+ ### 1. Prepare Review Package
+
+ Assemble exactly what the reviewer needs:
+
+ - **Changed files:** full diff or complete file contents
+ - **Context:** what the code does, why it was changed, what it interacts with
+ - **Constraints:** security requirements, isolation rules, performance bounds
+ - **Specific concerns:** what you want the reviewer to focus on
+
+ Do NOT send your session history — construct focused context.
+
+ ### 2. Dispatch to Reviewer
+
+ Send the review package to an independent agent. The reviewer should have no knowledge of your reasoning process — they evaluate the code fresh.
+
+ ### 3. Reviewer Applies Verdicts
+
+ For each finding, the reviewer assigns:
+
+ | Verdict | Meaning |
+ |---------|---------|
+ | **AGREE** | Confirms the implementation is correct for the stated concern |
+ | **DISAGREE** | Identifies a concrete issue with evidence |
+ | **CONTEXT** | Cannot determine correctness — needs more information |
+
+ Each DISAGREE must include: what's wrong, why it matters, and a suggested fix.
+
+ ### 4. Evaluate Findings
+
+ When review returns:
+
+ - **AGREE items:** no action needed
+ - **DISAGREE items:** verify the finding against actual code. If confirmed, fix. If the reviewer misunderstood context, document why the current approach is correct.
+ - **CONTEXT items:** provide the missing information and re-review that item
+
+ ## What to Include in Review
+
+ | Category | Include | Skip |
+ |----------|---------|------|
+ | Auth/access control | Token validation, session management, permission checks | UI styling |
+ | Data access | Query construction, tenant isolation, input sanitization | Logging format |
+ | Cryptography | Key management, encryption/decryption, hashing | String formatting |
+ | Error handling | What's exposed to users, what's logged, what's swallowed | Happy path only |
+
+ ## Key Principles
+
+ - **Independent evaluation** — reviewer must not be primed with your conclusions
+ - **Evidence-based verdicts** — no "looks fine" without specifics
+ - **Verify disagreements** — reviewer may lack context; check before acting
+ - **Don't skip uncomfortable findings** — the point is catching what you missed
+ - **Repeat for high-stakes changes** — one review round may not be enough
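Step 4's triage can be sketched as a partition over the reviewer's verdicts. The finding shape is hypothetical; the three verdicts and the evidence requirement for DISAGREE come from the tables above:

```python
def triage(findings):
    """Sort reviewer findings into next actions by verdict.

    AGREE needs nothing; DISAGREE must be verified against the actual
    code before acting; CONTEXT goes back for re-review with the
    missing information. Finding fields are hypothetical.
    """
    actions = {"no_action": [], "verify_then_fix": [], "re_review": []}
    for f in findings:
        if f["verdict"] == "AGREE":
            actions["no_action"].append(f)
        elif f["verdict"] == "DISAGREE":
            # Each DISAGREE must carry evidence and a suggested fix.
            if not (f.get("evidence") and f.get("suggested_fix")):
                raise ValueError("DISAGREE without evidence and fix: " + f["item"])
            actions["verify_then_fix"].append(f)
        elif f["verdict"] == "CONTEXT":
            actions["re_review"].append(f)
        else:
            raise ValueError("unknown verdict: " + f["verdict"])
    return actions
```

Rejecting a DISAGREE that lacks evidence enforces the "evidence-based verdicts" principle at intake rather than during implementation.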
@@ -0,0 +1,81 @@
+ ---
+ name: pharaoh-review-receive
+ description: "Receive code review feedback with technical rigor. No performative agreement — verify suggestions against codebase reality before implementing. Push back with evidence when feedback is wrong. Clarify all unclear items before implementing any. External feedback is suggestions to evaluate, not orders to follow."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["code-review", "feedback", "technical-rigor", "pushback", "collaboration"]}
+ ---
+
+ # Receiving Code Review
+
+ Code review requires technical evaluation, not emotional performance.
+
+ **Verify before implementing. Ask before assuming. Technical correctness over social comfort.**
+
+ ## When to Use
+
+ When receiving code review feedback — from humans, external reviewers, or automated tools. Especially when feedback seems unclear or technically questionable.
+
+ ## The Response Pattern
+
+ 1. **READ:** complete feedback without reacting
+ 2. **UNDERSTAND:** restate the requirement in your own words (or ask)
+ 3. **VERIFY:** check against codebase reality
+ 4. **EVALUATE:** technically sound for THIS codebase?
+ 5. **RESPOND:** technical acknowledgment or reasoned pushback
+ 6. **IMPLEMENT:** one item at a time, test each
+
+ ## Forbidden Responses
+
+ Never respond with:
+ - "You're absolutely right!"
+ - "Great point!" / "Excellent feedback!"
+ - "Let me implement that now" (before verification)
+ - Any gratitude expression
+
+ Instead: restate the technical requirement, ask clarifying questions, push back with evidence if wrong, or just start fixing.
+
+ ## Handling Unclear Feedback
+
+ If ANY item is unclear: **stop — do not implement anything yet.** Ask for clarification on unclear items before touching code. Items may be related; partial understanding produces wrong implementations.
+
+ ## Evaluating External Feedback
+
+ Before implementing suggestions from external reviewers:
+
+ 1. Is this technically correct for THIS codebase?
+ 2. Does it break existing functionality?
+ 3. Is there a reason for the current implementation?
+ 4. Does the reviewer understand the full context?
+ 5. Does it conflict with prior architectural decisions?
+
+ If a suggestion seems wrong, push back with technical reasoning — reference working tests, actual code, or codebase patterns.
+
+ ## When to Push Back
+
+ - Suggestion breaks existing functionality
+ - Reviewer lacks full context
+ - Feature is unused (YAGNI)
+ - Technically incorrect for this stack
+ - Conflicts with established architectural decisions
+
+ **How:** technical reasoning, specific questions, references to working code. Never defensive — just factual.
+
+ ## Implementation Order
+
+ For multi-item feedback:
+
+ 1. Clarify anything unclear FIRST
+ 2. Implement in priority order: blocking issues, simple fixes, complex fixes
+ 3. Test each fix individually
+ 4. Verify no regressions
+
+ ## Acknowledging Correct Feedback
+
+ When feedback IS correct:
+ - "Fixed. [Brief description of what changed]"
+ - "Good catch — [specific issue]. Fixed in [location]."
+ - Or just fix it silently — the code shows you heard.
+
+ Actions speak. No performative agreement needed.
@@ -0,0 +1,85 @@
+ ---
+ name: pharaoh-sessions
+ description: "Decompose work into parallel, isolated sessions using git worktrees. Each session gets fresh context, a narrow scope, and produces atomic commits. Prevents context window pollution from large tasks. Coordinate across sessions without shared state."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["sessions", "worktrees", "parallel-work", "context-management", "decomposition"]}
+ ---
+
+ # Session Decomposition
+
+ Break large tasks into parallel, isolated work sessions. Each session runs in its own git worktree with fresh context, focused scope, and atomic commits. Prevents context window bloat and keeps each unit of work clean.
+
+ ## When to Use
+
+ - Task is too large for a single context window
+ - Work has 3+ independent sub-tasks that don't touch the same files
+ - You need to preserve context quality across a multi-hour effort
+ - Multiple features or fixes can proceed in parallel
+
+ ## Do Not Use When
+
+ - Sub-tasks share files or state
+ - Work is sequential (each step depends on the previous)
+ - Task fits comfortably in one session
+
+ ## Process
+
+ ### 1. Decompose
+
+ Break the task into sessions. Each session must:
+
+ - Have a clear, narrow goal (one feature, one fix, one module)
+ - Touch a distinct set of files — no overlap between sessions
+ - Be independently verifiable (tests pass, build succeeds)
+ - Produce atomic commits that make sense on their own
+
+ ### 2. Create Worktrees
+
+ For each session, create an isolated worktree:
+
+ ```bash
+ git worktree add .worktrees/<session-name> -b <branch-name>
+ ```
+
+ Install dependencies in each worktree. Verify clean baseline (tests pass).
+
+ ### 3. Write Session Prompts
+
+ Each session gets a prompt containing:
+
+ - **Goal:** what this session produces (1-2 sentences)
+ - **Scope:** which files/modules to touch (explicit list)
+ - **Constraints:** what NOT to change
+ - **Verification:** how to confirm the work is correct
+ - **Context:** any architectural decisions or patterns to follow
+
+ ### 4. Execute Sessions
+
+ Run each session independently. Sessions should not reference each other's work-in-progress — they operate on the same base commit.
+
+ ### 5. Integrate
+
+ After all sessions complete:
+
+ 1. Verify each branch independently (tests pass, build succeeds)
+ 2. Merge branches sequentially into the target branch
+ 3. Resolve any conflicts (rare if decomposition was clean)
+ 4. Run full verification on the integrated result
+
+ ## Decomposition Rules
+
+ | Good decomposition | Bad decomposition |
+ |---|---|
+ | Session A: auth module, Session B: billing module | Session A: backend, Session B: frontend (likely share types) |
+ | Session A: new feature, Session B: unrelated bugfix | Session A: write code, Session B: write tests (coupled) |
+ | Session A: parser, Session B: renderer (clear interface) | Session A: first half of file, Session B: second half |
+
+ ## Key Principles
+
+ - **No shared files** — if two sessions touch the same file, merge them into one
+ - **Fresh context per session** — don't carry state between sessions
+ - **Atomic commits** — each session's output should be a coherent, reviewable unit
+ - **Verify before integrating** — never merge a session that doesn't pass its own checks
+ - **Decomposition is the hard part** — spend time getting boundaries right before starting work
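The "no shared files" rule can be checked mechanically before any worktree is created. A sketch assuming a hypothetical session shape of name plus planned file list:

```python
def check_decomposition(sessions):
    """Verify that no two sessions claim the same file.

    Each session is {"name": ..., "files": [...]}; the shape is
    hypothetical. Returns the overlapping files per session pair --
    an empty result means the decomposition is safe to dispatch.
    """
    overlaps = {}
    for i, a in enumerate(sessions):
        for b in sessions[i + 1:]:
            shared = set(a["files"]) & set(b["files"])
            if shared:
                # Per the key principles: sessions sharing a file
                # must be merged into one session.
                overlaps[(a["name"], b["name"])] = sorted(shared)
    return overlaps
```

Running this on the planned file lists front-loads the hard part — boundary mistakes surface before any work starts, not at merge time.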
@@ -0,0 +1,104 @@
+ ---
+ name: pharaoh-tdd
+ description: "Test-driven development discipline. Write the failing test first, watch it fail, write minimal code to pass, refactor. No production code without a failing test. No exceptions without explicit permission. Covers red-green-refactor cycle, common rationalizations, and when to start over."
+ version: 0.2.0
+ homepage: https://pharaoh.so
+ user-invocable: true
+ metadata: {"emoji": "☥", "tags": ["tdd", "testing", "red-green-refactor", "quality", "discipline"]}
+ ---
+
+ # Test-Driven Development
+
+ Write the test first. Watch it fail. Write minimal code to pass. Refactor.
+
+ **If you didn't watch the test fail, you don't know if it tests the right thing.**
+
+ ## When to Use
+
+ Always — for new features, bug fixes, refactoring, and behavior changes. Exceptions require explicit user permission (throwaway prototypes, generated code, config files).
+
+ ## The Iron Law
+
+ ```
+ NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+ ```
+
+ Wrote code before the test? Delete it. Start over. Don't keep it as "reference." Don't adapt it. Delete means delete.
+
+ ## Red-Green-Refactor
+
+ ### RED — Write Failing Test
+
+ Write one minimal test showing what should happen.
+
+ - One behavior per test
+ - Clear name describing the behavior
+ - Real code, not mocks (unless unavoidable)
+
+ Run the test. Confirm it **fails** (not errors) for the expected reason — the feature is missing, not a typo.
+
+ ### GREEN — Minimal Code
+
+ Write the simplest code that makes the test pass. Nothing more.
+
+ - Don't add features the test doesn't require
+ - Don't refactor other code
+ - Don't "improve" beyond the test
+
+ Run the test. Confirm it passes. Confirm other tests still pass.
+
+ ### REFACTOR — Clean Up
+
+ Only after green:
+ - Remove duplication
+ - Improve names
+ - Extract helpers
+
+ Keep tests green throughout. Don't add behavior during refactor.
+
+ ### Repeat
+
+ Next failing test for next behavior.
+
+ ## Bug Fix Flow
+
+ 1. Write a failing test that reproduces the bug
+ 2. Watch it fail — confirms the test catches the bug
+ 3. Fix the bug with minimal code
+ 4. Watch it pass — confirms the fix works
+ 5. Refactor if needed
+
+ Never fix bugs without a regression test.
+
+ ## Common Rationalizations
+
+ | Excuse | Reality |
+ |--------|---------|
+ | "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+ | "I'll test after" | Tests passing immediately prove nothing. |
+ | "Need to explore first" | Fine. Throw away exploration, then start with TDD. |
+ | "Test hard = skip test" | Hard to test = hard to use. Simplify the interface. |
+ | "TDD will slow me down" | TDD is faster than debugging. Always. |
+ | "Already manually tested" | Ad-hoc is not systematic. No record, can't re-run. |
+
+ ## Red Flags — Start Over
+
+ - Code written before test
+ - Test passes immediately (testing existing behavior)
+ - Can't explain why test failed
+ - "Just this once" rationalization
+ - Keeping pre-TDD code "as reference"
+
+ **All of these mean: delete code, start over with TDD.**
+
+ ## Verification Checklist
+
+ Before marking work complete:
+
+ - Every new function has a test
+ - Watched each test fail before implementing
+ - Each test failed for the expected reason
+ - Wrote minimal code to pass each test
+ - All tests pass with clean output
+ - Tests use real code (mocks only if unavoidable)
+ - Edge cases and errors are covered
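One red-green cycle in miniature — the `slugify` example and its names are hypothetical, chosen only to show the shape of the cycle:

```python
# RED: write this test first and run it. With slugify still a stub that
# returns its input unchanged, it fails for the expected reason -- the
# behavior is missing, not a typo.
def test_slugify_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

# GREEN: the simplest code that makes the test pass. No extras the test
# does not require (no unicode folding, no configurable separator).
def slugify(text):
    return text.strip().lower().replace(" ", "-")

# REFACTOR comes next -- rename, deduplicate, extract helpers -- with
# the test kept green throughout and no new behavior added.
```

The next behavior (say, collapsing repeated spaces) would start the cycle again with its own failing test, never with code.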