npm - @pharaoh-so/mcp - Versions diffs - 0.3.14 → 0.3.16 - Mend

@pharaoh-so/mcp 0.3.14 → 0.3.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22) hide show

package/.claude-plugin/plugin.json +1 -1
package/commands/plan.md +142 -0
package/commands/review.md +5 -0
package/dist/install-skills.d.ts +6 -2
package/dist/install-skills.js +156 -42
package/package.json +2 -1
package/skills/brainstorm/SKILL.md +1 -1
package/skills/plan/SKILL.md +26 -24
package/skills/audit-tests/SKILL.md +0 -89
package/skills/debt/SKILL.md +0 -34
package/skills/execute/SKILL.md +0 -58
package/skills/explore/SKILL.md +0 -33
package/skills/finish/SKILL.md +0 -80
package/skills/health/SKILL.md +0 -37
package/skills/investigate/SKILL.md +0 -35
package/skills/onboard/SKILL.md +0 -33
package/skills/orchestrate/SKILL.md +0 -145
package/skills/pr/SKILL.md +0 -53
package/skills/review-codex/SKILL.md +0 -81
package/skills/review-receive/SKILL.md +0 -82
package/skills/wiring/SKILL.md +0 -35
package/skills/worktree/SKILL.md +0 -86

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
 	"name": "pharaoh",
 	"version": "0.2.3",
-	"description": "Codebase knowledge graph with 23 development workflow skills. Query architecture, dependencies, blast radius, and more instead of reading files one at a time.",
+	"description": "Codebase knowledge graph with focused development workflow skills. Query architecture, dependencies, blast radius, and more instead of reading files one at a time.",
 	"author": {
 		"name": "Pharaoh",
 		"email": "hello@pharaoh.so",

package/commands/plan.md ADDED Viewed

@@ -0,0 +1,142 @@
+---
+description: Architecture-aware planning with Pharaoh reconnaissance (adapted from Garry Tan's planning framework)
+---
+# Plan Review
+Architecture-aware plan review before implementation. Adapted from [Garry Tan's planning framework](https://www.youtube.com/watch?v=bMknfKXIFA8) for AI-assisted development.
+**You are now in plan mode. Do NOT make any code changes. Think, evaluate, and present decisions.**
+## Document Review
+If the user provides a document, PRD, prompt, or artifact alongside this command, that IS the plan to review. Apply all review sections to that document. Do not treat it as background context — it is the subject of evaluation.
+Still run Step 1 (Reconnaissance) even when reviewing a document — always verify the plan against the actual codebase state.
+## Project Overrides
+If a `.claude/plan-review.md` file exists in this project, read it now and apply those rules on top of this baseline. Project rules take precedence where they conflict.
+## Engineering Preferences (guide all recommendations)
+- DRY: flag repetition aggressively
+- Well-tested: too many tests > too few; mutation score > line coverage
+- "Engineered enough" — not fragile/hacky, not over-abstracted
+- Handle more edge cases, not fewer; thoughtfulness > speed
+- Explicit > clever; simple > complex
+- Subtraction > addition; target zero or negative net LOC
+- Every export must have a caller; unwired code doesn't exist
+## Step 1: Pharaoh Reconnaissance (Required — do this BEFORE reviewing)
+Do NOT review from memory or assumptions. Query the actual codebase first.
+### If Pharaoh MCP tools are available:
+1. `get_codebase_map` — current modules, hot files, dependency graph
+2. `search_functions` for keywords related to the plan — find existing code to reuse/extend
+3. `get_module_context` on affected modules — entry points, patterns, conventions
+4. `query_dependencies` between affected modules — coupling, circular deps
+5. `get_blast_radius` on the primary target of the change — know what breaks
+6. `check_reachability` on the primary target — verify it's actually reachable from entry points before worrying about it
+### Without Pharaoh (graceful fallback):
+1. Search the codebase for files and functions related to the plan (grep, glob)
+2. Read the entry points and module structure of affected areas
+3. Check existing tests for the modules the plan will touch
+4. Trace imports manually to estimate blast radius
+Ground every recommendation in what actually exists. If you propose adding something, confirm it doesn't already exist. If you propose changing something, know its blast radius.
+## Step 2: Mode Selection (MANDATORY — ask before proceeding)
+**STOP and ask the user which mode before starting the review.** This is a hard gate — do not infer, assume, or skip this question even if the user says "yes", "go ahead", or "yes to all". Present both options and wait for an explicit choice:
+> **BIG CHANGE or SMALL CHANGE?**
+>
+> - **BIG CHANGE**: Full interactive review, all relevant sections, up to 4 top issues per section
+> - **SMALL CHANGE**: One question per section, only sections 2-4
+If the user's response is ambiguous (e.g. "just do it", "yes to all"), ask again: "I need to know — BIG or SMALL change?" Do not proceed to Step 3 without an answer.
+## Step 3: Review Sections
+Adapt depth to change size. Skip sections that don't apply. **After each section, pause and ask for feedback before moving on.**
+### Section 1 — Architecture (skip for small/single-file changes)
+- Component boundaries and coupling concerns
+- Dependency graph: does this change shrink or expand surface area?
+- Data flow bottlenecks and single points of failure
+- Does this need new code at all, or can a human process / existing pattern solve it?
+### Section 2 — Code Quality (always)
+- Organization, module structure, DRY violations (be aggressive)
+- Error handling gaps and missing edge cases (call out explicitly)
+- Technical debt: shortcuts, hardcoded values, magic strings
+- Over-engineered or under-engineered relative to my preferences
+- Reuse: does code for this already exist somewhere?
+### Section 3 — Wiring & Integration (always)
+- Are all new exports called from a production entry point?
+- Run `get_blast_radius` on any new/changed functions — zero callers = not done
+- `check_reachability` on new exports — verify reachable from API handlers, crons, or event handlers
+- Does the plan declare WHERE new code gets called from? If not, flag it
+- Integration points: how does this connect to what already exists?
+### Section 4 — Tests (always)
+- Coverage gaps: unit, integration, e2e
+- Test quality: real assertions with hardcoded expected values, not `.toBeDefined()` or computed expectations
+- Missing edge cases and untested failure/error paths
+- One integration test proving wiring > ten isolated unit tests
+### Section 5 — Performance (only if relevant)
+- N+1 queries, unnecessary DB round-trips
+- Memory concerns, caching opportunities
+- Slow or high-complexity code paths
+### Section 6 — Security & Attack Surface (always for new endpoints/routes/APIs; skip for pure refactors)
+- **Authentication model** — what authenticates requests in this plan? Where is it validated? What happens on auth failure (redirect, 401, silent pass-through)? Use `search_functions` to find existing auth middleware and confirm reuse.
+- **Sensitive data in URLs** — does the design put tokens, session IDs, or tenant identifiers in URL paths or query params? These leak via Referer headers, browser history, logs, and link sharing.
+- **Authorization boundaries** — what prevents User A from accessing User B's data? Is there an ownership check, or just an "is logged in" check? Use `get_blast_radius` on existing ownership-check functions to see where they're already enforced.
+- **Input trust boundary** — does the plan accept user input that flows into shell commands, database queries, HTML rendering, or file paths? Each is an injection vector.
+- **Error and response surface** — will error responses or API payloads expose internals (stack traces, DB schemas, internal IDs) to unauthenticated callers?
+- **New attack surface** — does the plan introduce new public URLs, webhooks, API routes, or WebSocket endpoints? Each needs: rate limiting, authentication, and input validation. Use `get_module_context` on the receiving module to check what protections exist.
+## For Each Issue Found
+For every specific issue (bug, smell, design concern, risk, missing wiring):
+1. **Describe concretely** — file, line/function reference, what's wrong
+2. **Present 2-3 options** including "do nothing" where reasonable
+3. **For each option** — implementation effort, risk, blast radius, maintenance burden
+4. **Recommend one** mapped to my preferences above, and say why
+5. **Ask** whether I agree or want a different direction
+Number each issue (1, 2, 3...) and letter each option (A, B, C...). Recommended option is always listed first. Use AskUserQuestion with clear labels like "Issue 1 Option A", "Issue 1 Option B".
+## Pharaoh Checkpoints (use throughout, not just at the end)
+- **Before reviewing**: recon (Step 1 above)
+- **During review**: `get_blast_radius` when evaluating impact of changes; `search_functions` before suggesting new code
+- **After decisions**: `check_reachability` on all new exports; `get_unused_code` to catch disconnections
+- **Final sweep**: `get_blast_radius` on ALL new exports — zero callers on non-entry-points = plan is incomplete
+## Workflow Rules
+- **After each section, pause and ask for feedback before moving on**
+- Do not assume priorities on timeline or scale
+- If you see a better approach to the entire plan, say so BEFORE section-by-section review
+- Challenge the approach if you see a better one — your job is to find problems I'll regret later
+---
+Here is the task to review:
+$ARGUMENTS

package/commands/review.md ADDED Viewed

@@ -0,0 +1,5 @@
+---
+description: Architecture-aware code review with Pharaoh
+---
+Invoke the `pharaoh-review` skill to handle this request. Pass through any arguments: $ARGUMENTS

package/dist/install-skills.d.ts CHANGED Viewed

@@ -6,8 +6,12 @@
  */
 export declare function detectOpenClaw(home?: string): boolean;
 /**
- * Copy all bundled skill directories to ~/.openclaw/skills/.
- * Overwrites existing skill dirs on reinstall (cpSync recursive + force).
+ * Install bundled skills to `~/.openclaw/skills/` with the same `pharaoh-`
+ * prefix that Claude Code installs use. Keeps naming consistent across both
+ * platforms and avoids collision with other OpenClaw skills that may ship
+ * bare names like `plan` or `review`.
+ *
+ * Overwrites existing skill dirs on reinstall.
  *
  * @param home - Home directory override (defaults to os.homedir()).
  * @returns Number of skill directories copied.

package/dist/install-skills.js CHANGED Viewed

@@ -5,7 +5,7 @@
  * OpenClaw's ~/.openclaw/skills/ directory (fallback). Also merges
  * the Pharaoh MCP server config into the target platform.
  */
-import { cpSync, existsSync, mkdirSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
+import { cpSync, existsSync, mkdirSync, readdirSync, readFileSync, rmSync, writeFileSync, } from "node:fs";
 import { homedir } from "node:os";
 import { dirname, join } from "node:path";
 import { fileURLToPath } from "node:url";
@@ -15,6 +15,22 @@ const __dirname = dirname(fileURLToPath(import.meta.url));
 const PKG_ROOT = join(__dirname, "..");
 /** Path to the bundled skills directory. */
 const BUNDLED_SKILLS_DIR = join(PKG_ROOT, "skills");
+/** Path to the bundled slash commands directory. */
+const BUNDLED_COMMANDS_DIR = join(PKG_ROOT, "commands");
+/**
+ * Namespace prefix applied to every installed skill. Prevents collision with
+ * other plugins that ship same-named bare skills (e.g. `plan`, `review`).
+ *
+ * Why not use `pharaoh:<skill>` colon-form instead? Claude Code's skill
+ * resolver only auto-namespaces skills from plugins installed via the
+ * marketplace path (`~/.claude/plugins/cache/<marketplace>/...`). Plugins
+ * installed directly to `~/.claude/plugins/data/` — where `npx
+ * @pharaoh-so/mcp --install-skills` installs this package — get their
+ * skills registered as BARE names without any `<plugin>:` prefix. Applying
+ * the prefix as part of the skill's own `name:` field is the only way to
+ * get collision-safe distinctive names on this install path.
+ */
+const SKILL_PREFIX = "pharaoh-";
 // ── Detection ───────────────────────────────────────────────────────
 /**
  * Detect whether Claude Code is installed by checking for ~/.claude/.
@@ -37,11 +53,17 @@ export function detectOpenClaw(home = homedir()) {
 // ── Claude Code installer ───────────────────────────────────────────
 /**
  * Install Pharaoh as a Claude Code plugin by copying the full plugin
- * structure (`.claude-plugin/`, `skills/`, `.mcp.json`) to a persistent
- * directory under `~/.claude/plugins/data/pharaoh/`.
+ * structure (`.claude-plugin/`, `skills/`, `.mcp.json`, `commands/`) to a
+ * persistent directory under `~/.claude/plugins/data/pharaoh/`, plus user-
+ * level slash command templates under `~/.claude/commands/`.
+ *
+ * Claude Code auto-discovers skills from the `skills/` subdirectory when
+ * the plugin manifest (`.claude-plugin/plugin.json`) is present.
  *
- * Claude Code auto-discovers skills from the `skills/` subdirectory
- * when the plugin manifest (`.claude-plugin/plugin.json`) is present.
+ * Each skill is installed with a `pharaoh-` prefix (see SKILL_PREFIX) to
+ * avoid collision with other plugins that may ship same-named bare skills.
+ * The source directory keeps its bare name — the prefix is applied
+ * mechanically during copy with no drift possible.
  *
  * @param home - Home directory override (defaults to os.homedir()).
  * @returns Number of skill directories installed, or -1 on failure.
@@ -59,33 +81,14 @@ function installClaudeCodePlugin(home = homedir()) {
         process.stderr.write("Pharaoh: .claude-plugin/ manifest not found in package — cannot install.\n");
         return -1;
     }
-    // Copy skills/ and generate pharaoh-* prefixed aliases
-    let skillCount = 0;
-    if (existsSync(BUNDLED_SKILLS_DIR)) {
-        cpSync(BUNDLED_SKILLS_DIR, join(pluginDir, "skills"), { recursive: true, force: true });
-        const entries = readdirSync(BUNDLED_SKILLS_DIR, { withFileTypes: true });
-        const skillDirs = entries.filter((e) => e.isDirectory());
-        skillCount = skillDirs.length;
-        // Auto-generate pharaoh-* prefixed copies so both `/plan` and `pharaoh:plan`
-        // resolve to the same content. Without this, prefixed copies drift and users
-        // get a stripped skeleton when invoking via the pharaoh: prefix.
-        for (const dir of skillDirs) {
-            if (dir.name === "pharaoh" || dir.name.startsWith("pharaoh-"))
-                continue;
-            const prefixedName = `pharaoh-${dir.name}`;
-            const prefixedDir = join(pluginDir, "skills", prefixedName);
-            const srcSkill = join(pluginDir, "skills", dir.name, "SKILL.md");
-            if (!existsSync(srcSkill))
-                continue;
-            mkdirSync(prefixedDir, { recursive: true });
-            const content = readFileSync(srcSkill, "utf-8");
-            // Rewrite the name field in YAML frontmatter only (between --- delimiters).
-            // Using a whole-file /m regex would match `name:` in body content too.
-            const rewritten = content.replace(/^(---\n[\s\S]*?)(name:\s*).+(\n[\s\S]*?---)/, `$1$2${prefixedName}$3`);
-            writeFileSync(join(prefixedDir, "SKILL.md"), rewritten);
-            skillCount++;
-        }
-    }
+    // Install skills under the pharaoh- prefix for collision-safe names.
+    const skillsDst = join(pluginDir, "skills");
+    const skillCount = installPrefixedSkills(BUNDLED_SKILLS_DIR, skillsDst, SKILL_PREFIX);
+    // Migrate: delete any bare-name skill dirs left from prior installs
+    // that used the dual bare+prefixed layout. Without this, users who
+    // upgrade keep both copies and Claude Code's prompt-name dedupe picks
+    // the wrong one.
+    cleanupBareNameSkills(BUNDLED_SKILLS_DIR, skillsDst);
     // Copy .mcp.json
     const mcpSrc = join(PKG_ROOT, ".mcp.json");
     if (existsSync(mcpSrc)) {
@@ -93,6 +96,111 @@ function installClaudeCodePlugin(home = homedir()) {
     }
     return skillCount;
 }
+/**
+ * Copy every skill directory from `srcDir` to `dstDir`, prefixing both the
+ * target directory name and the `name:` field in its SKILL.md frontmatter
+ * with `prefix`.
+ *
+ * Exception: a source directory whose name equals the prefix stripped of
+ * its trailing dash is copied WITHOUT a double-prefix. Example: with
+ * `prefix = "pharaoh-"`, the source dir `pharaoh/` is copied to `pharaoh/`
+ * unchanged, not `pharaoh-pharaoh/`. This keeps the flagship skill's name
+ * clean.
+ *
+ * @param srcDir - Source skills directory (bundled with the package).
+ * @param dstDir - Destination skills directory (inside the plugin path).
+ * @param prefix - Namespace prefix to apply (e.g., `"pharaoh-"`).
+ * @returns Number of skill directories installed.
+ */
+function installPrefixedSkills(srcDir, dstDir, prefix) {
+    if (!existsSync(srcDir))
+        return 0;
+    mkdirSync(dstDir, { recursive: true });
+    // `pharaoh-` → `pharaoh` — the source directory name that's exempt from
+    // the prefix to avoid `pharaoh-pharaoh`.
+    const exemptName = prefix.replace(/-$/, "");
+    const entries = readdirSync(srcDir, { withFileTypes: true });
+    let count = 0;
+    for (const entry of entries) {
+        if (!entry.isDirectory())
+            continue;
+        const srcSkill = join(srcDir, entry.name);
+        const srcSkillMd = join(srcSkill, "SKILL.md");
+        if (!existsSync(srcSkillMd))
+            continue;
+        const dstName = entry.name === exemptName ? entry.name : `${prefix}${entry.name}`;
+        const dstSkill = join(dstDir, dstName);
+        // Copy the full source tree (SKILL.md + any supporting files).
+        cpSync(srcSkill, dstSkill, { recursive: true, force: true });
+        // Rewrite the frontmatter `name:` field only if we applied the prefix.
+        if (dstName !== entry.name) {
+            const dstSkillMd = join(dstSkill, "SKILL.md");
+            const content = readFileSync(dstSkillMd, "utf-8");
+            const rewritten = content.replace(/^(---\n[\s\S]*?)(name:\s*).+(\n[\s\S]*?---)/, `$1$2${dstName}$3`);
+            writeFileSync(dstSkillMd, rewritten);
+        }
+        count++;
+    }
+    return count;
+}
+/**
+ * Delete bare-name skill directories from prior installs that used the dual
+ * bare+prefixed layout. Only removes directories that match the name of a
+ * bundled source skill (so we don't accidentally delete user-installed
+ * skills from other sources that happen to share the `pharaoh/skills/`
+ * install path).
+ *
+ * @param srcDir - Bundled source skills directory.
+ * @param dstDir - Installed skills directory.
+ * @returns Number of stale directories removed.
+ */
+function cleanupBareNameSkills(srcDir, dstDir) {
+    if (!existsSync(srcDir) || !existsSync(dstDir))
+        return 0;
+    const exemptName = SKILL_PREFIX.replace(/-$/, "");
+    const sourceNames = new Set(readdirSync(srcDir, { withFileTypes: true })
+        .filter((e) => e.isDirectory() && e.name !== exemptName)
+        .map((e) => e.name));
+    let removed = 0;
+    for (const name of sourceNames) {
+        const staleDir = join(dstDir, name);
+        if (existsSync(staleDir)) {
+            rmSync(staleDir, { recursive: true, force: true });
+            removed++;
+        }
+    }
+    return removed;
+}
+/**
+ * Copy bundled slash command templates (`.md` files in the package's
+ * `commands/` directory) to `~/.claude/commands/`, unconditionally
+ * overwriting any existing file with the same name.
+ *
+ * Overwriting is intentional: the install contract is "slash commands just
+ * work out of the box," so the canonical shipped versions take precedence
+ * over user customizations. Users who want custom command behavior should
+ * use different filenames.
+ *
+ * @param home - Home directory override (defaults to os.homedir()).
+ * @returns Number of commands installed.
+ */
+function installClaudeCodeCommands(home) {
+    if (!existsSync(BUNDLED_COMMANDS_DIR))
+        return 0;
+    const commandsDst = join(home, ".claude", "commands");
+    mkdirSync(commandsDst, { recursive: true });
+    const entries = readdirSync(BUNDLED_COMMANDS_DIR, { withFileTypes: true });
+    let count = 0;
+    for (const entry of entries) {
+        if (!entry.isFile() || !entry.name.endsWith(".md"))
+            continue;
+        const src = join(BUNDLED_COMMANDS_DIR, entry.name);
+        const dst = join(commandsDst, entry.name);
+        cpSync(src, dst, { force: true });
+        count++;
+    }
+    return count;
+}
 /**
  * Register Pharaoh in Claude Code's installed_plugins.json (v2 format).
  * Adds a new entry or updates the version/lastUpdated of an existing one.
@@ -152,8 +260,12 @@ function registerClaudeCodePlugin(home = homedir()) {
 }
 // ── OpenClaw installer ──────────────────────────────────────────────
 /**
- * Copy all bundled skill directories to ~/.openclaw/skills/.
- * Overwrites existing skill dirs on reinstall (cpSync recursive + force).
+ * Install bundled skills to `~/.openclaw/skills/` with the same `pharaoh-`
+ * prefix that Claude Code installs use. Keeps naming consistent across both
+ * platforms and avoids collision with other OpenClaw skills that may ship
+ * bare names like `plan` or `review`.
+ *
+ * Overwrites existing skill dirs on reinstall.
  *
  * @param home - Home directory override (defaults to os.homedir()).
  * @returns Number of skill directories copied.
@@ -163,14 +275,11 @@ export function installSkills(home = homedir()) {
     if (!existsSync(targetDir)) {
         mkdirSync(targetDir, { recursive: true });
     }
-    const entries = readdirSync(BUNDLED_SKILLS_DIR, { withFileTypes: true });
-    const skillDirs = entries.filter((e) => e.isDirectory()).map((e) => e.name);
-    for (const skillName of skillDirs) {
-        const src = join(BUNDLED_SKILLS_DIR, skillName);
-        const dst = join(targetDir, skillName);
-        cpSync(src, dst, { recursive: true, force: true });
-    }
-    return skillDirs.length;
+    // Prefix skills for collision safety (same rationale as Claude Code path).
+    const count = installPrefixedSkills(BUNDLED_SKILLS_DIR, targetDir, SKILL_PREFIX);
+    // Also clean up any bare-name leftovers from prior installs.
+    cleanupBareNameSkills(BUNDLED_SKILLS_DIR, targetDir);
+    return count;
 }
 /**
  * Read ~/.openclaw/openclaw.json, add the Pharaoh MCP server under `mcpServers`,
@@ -263,6 +372,7 @@ export function runInstallSkills(home = homedir(), options) {
         const count = installClaudeCodePlugin(home);
         if (count >= 0) {
             const regResult = registerClaudeCodePlugin(home);
+            const commandCount = installClaudeCodeCommands(home);
             // Silently ensure Pharaoh tools auto-approve (no user prompt on first use)
             configureClaudeCodePermissions(home);
             if (!silent) {
@@ -272,10 +382,14 @@ export function runInstallSkills(home = homedir(), options) {
                     current: "Plugin already registered and up-to-date",
                     error: "Could not update installed_plugins.json",
                 };
+                const commandLine = commandCount > 0
+                    ? `  → Installed ${commandCount} slash command${commandCount === 1 ? "" : "s"} to ~/.claude/commands/`
+                    : null;
                 process.stderr.write([
                     `Pharaoh: installed ${count} skills as Claude Code plugin`,
                     `  → ~/.claude/plugins/data/pharaoh/`,
                     `  → ${regMessages[regResult]}`,
+                    ...(commandLine ? [commandLine] : []),
                     "",
                     ...(regResult === "added" ? ["Restart Claude Code to pick up the new skills.", ""] : []),
                 ].join("\n"));

package/package.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
   "name": "@pharaoh-so/mcp",
   "mcpName": "so.pharaoh/pharaoh",
-  "version": "0.3.14",
+  "version": "0.3.16",
   "description": "MCP proxy for Pharaoh — maps codebases into queryable knowledge graphs for AI agents. Enables Claude Code in headless environments (VPS, SSH, CI) via device flow auth.",
   "type": "module",
   "main": "dist/index.js",
@@ -12,6 +12,7 @@
   "files": [
     "dist",
     "skills",
+    "commands",
     ".claude-plugin",
     ".mcp.json",
     "inspect-tools.json",

package/skills/brainstorm/SKILL.md CHANGED Viewed

@@ -56,7 +56,7 @@ Save the validated design to a spec file and commit it. Ask the user to review b
 ### 7. Transition
-Once the spec is approved, move to implementation planning. Use `pharaoh:execute` to carry out the plan.
+Once the spec is approved, move to implementation. Use the `plan` skill to break the work into verified steps, then build.
 ## Working in Existing Codebases

package/skills/plan/SKILL.md CHANGED Viewed

@@ -1,8 +1,8 @@
 ---
 name: plan
 prompt-name: plan-with-pharaoh
-description: Deep plan review with Pharaoh reconnaissance, wiring verification, and structured issue tracking. Use before implementing any feature, refactor, or significant code change. Enters plan mode (no code changes) and provides structured review with decision points.
-version: 0.5.0
+description: Architecture-aware plan review adapted from Garry Tan's framework for AI-assisted development. Pharaoh reconnaissance, mandatory mode selection, structured six-section review with decision points. Drop-in replacement that supersedes the legacy ~/.claude/skills/plan-review skill — every section of the legacy skill is present here, plus graceful fallback when Pharaoh is unavailable, stronger mode-selection enforcement, and an explicit framework reference.
+version: 0.6.0
 homepage: https://pharaoh.so
 user-invocable: true
 metadata: {"emoji": "☥", "tags": ["planning", "architecture", "blast-radius", "pharaoh", "review", "interactive"]}
@@ -10,12 +10,16 @@ metadata: {"emoji": "☥", "tags": ["planning", "architecture", "blast-radius",
 # Plan Review
+Architecture-aware plan review before implementation. Adapted from [Garry Tan's planning framework](https://www.youtube.com/watch?v=bMknfKXIFA8) for AI-assisted development.
 **You are now in plan mode. Do NOT make any code changes. Think, evaluate, and present decisions.**
 ## Document Review
 If the user provides a document, PRD, prompt, or artifact alongside this command, that IS the plan to review. Apply all review sections to that document. Do not treat it as background context — it is the subject of evaluation.
+Still run Step 1 (Reconnaissance) even when reviewing a document — always verify the plan against the actual codebase state.
 ## Project Overrides
 If a `.claude/plan-review.md` file exists in this project, read it now and apply those rules on top of this baseline. Project rules take precedence where they conflict.
@@ -32,42 +36,40 @@ If a `.claude/plan-review.md` file exists in this project, read it now and apply
 ## Step 1: Pharaoh Reconnaissance (Required — do this BEFORE reviewing)
-Do NOT review from memory or assumptions. Query the actual codebase first:
+Do NOT review from memory or assumptions. Query the actual codebase first.
+### If Pharaoh MCP tools are available:
 1. `get_codebase_map` — current modules, hot files, dependency graph
 2. `search_functions` for keywords related to the plan — find existing code to reuse/extend
 3. `get_module_context` on affected modules — entry points, patterns, conventions
 4. `query_dependencies` between affected modules — coupling, circular deps
+5. `get_blast_radius` on the primary target of the change — know what breaks
+6. `check_reachability` on the primary target — verify it's actually reachable from entry points before worrying about it
-Ground every recommendation in what actually exists. If you propose adding something, confirm it doesn't already exist. If you propose changing something, know its blast radius.
-## Step 1b: Reconnaissance Dashboard
-After running recon, present a visual summary before proceeding. This shows the user what Pharaoh found.
+### Without Pharaoh (graceful fallback):
-**Surface all ★ Pharaoh insight blocks verbatim** — they contain pre-formatted bar charts, risk meters, and flow diagrams. Do not summarize or paraphrase them.
+1. Search the codebase for files and functions related to the plan (grep, glob)
+2. Read the entry points and module structure of affected areas
+3. Check existing tests for the modules the plan will touch
+4. Trace imports manually to estimate blast radius
-Then compose a **Recon Summary** table:
-| Signal | Value | Source |
-|--------|-------|--------|
-| Modules affected | N | get_codebase_map |
-| Blast radius | LOW/MED/HIGH + caller count | get_blast_radius |
-| Existing functions found | N matches | search_functions |
-| Cross-module coupling | deps + circular? | query_dependencies |
+Ground every recommendation in what actually exists. If you propose adding something, confirm it doesn't already exist. If you propose changing something, know its blast radius.
-If any signal is surprising (high blast radius, circular deps, existing code that overlaps the plan), call it out before moving to Mode Selection.
+## Step 2: Mode Selection (MANDATORY — ask before proceeding)
-## Step 2: Mode Selection
+**STOP and ask the user which mode before starting the review.** This is a hard gate — do not infer, assume, or skip this question even if the user says "yes", "go ahead", or "yes to all". Present both options and wait for an explicit choice:
-Ask the user which mode before starting the review:
+> **BIG CHANGE or SMALL CHANGE?**
+>
+> - **BIG CHANGE**: Full interactive review, all relevant sections, up to 4 top issues per section
+> - **SMALL CHANGE**: One question per section, only sections 2-4
-**BIG CHANGE**: Full interactive review, all relevant sections, up to 4 top issues per section.
-**SMALL CHANGE**: One question per section, only sections 2-4.
+If the user's response is ambiguous (e.g. "just do it", "yes to all"), ask again: "I need to know — BIG or SMALL change?" Do not proceed to Step 3 without an answer.
 ## Step 3: Review Sections
-Adapt depth to change size. Skip sections that don't apply.
+Adapt depth to change size. Skip sections that don't apply. **After each section, pause and ask for feedback before moving on.**
 ### Section 1 — Architecture (skip for small/single-file changes)
@@ -135,7 +137,7 @@ Number each issue (1, 2, 3...) and letter each option (A, B, C...). Recommended
 ## Workflow Rules
-- After each section, pause and ask for feedback before moving on
+- **After each section, pause and ask for feedback before moving on**
 - Do not assume priorities on timeline or scale
 - If you see a better approach to the entire plan, say so BEFORE section-by-section review
 - Challenge the approach if you see a better one — your job is to find problems I'll regret later

package/skills/audit-tests/SKILL.md DELETED Viewed

@@ -1,89 +0,0 @@
----
-name: audit-tests
-prompt-name: audit-tests
-description: "Classify tests by real value: ceremony versus protection. Mutation score over line coverage — tests that can't detect mutations are theater. Identify tests that prove nothing, tests that duplicate coverage, and gaps where real protection is missing. Produce an actionable audit with keep, rewrite, and delete verdicts."
-version: 0.2.0
-homepage: https://pharaoh.so
-user-invocable: true
-metadata: {"emoji": "☥", "tags": ["test-audit", "mutation-testing", "test-quality", "coverage", "testing"]}
----
-# Test Audit
-Classify tests by whether they actually protect against regressions. Line coverage is vanity — mutation score is truth. A test that can't detect a mutation in the code it covers is theater.
-## When to Use
-- Before refactoring a module — know which tests actually protect you
-- After inheriting a codebase — separate real coverage from ceremony
-- When test suite is slow — find tests to delete without losing protection
-- When coverage numbers are high but bugs still ship
-## Process
-### 1. Inventory
-List all test files for the target module or area. For each test file, note:
-- What production code it exercises
-- Whether it uses mocks or real implementations
-- Approximate assertion count
-If Pharaoh tools are available, call `get_test_coverage` to see which modules have tests and which don't.
-### 2. Classify Each Test
-| Category | Definition | Action |
-|----------|-----------|--------|
-| **Guardian** | Tests real behavior with real code. Would catch a regression. | Keep |
-| **Ceremony** | Passes regardless of code changes. Tests mocks, not behavior. | Rewrite or delete |
-| **Duplicate** | Same behavior tested multiple times across files. | Keep one, delete rest |
-| **Snapshot** | Records current output without asserting correctness. | Evaluate — keep if output matters, delete if not |
-| **Fragile** | Breaks on irrelevant changes (formatting, ordering, timing). | Rewrite to test behavior, not implementation |
-### 3. Mutation Check
-For critical tests, mentally (or actually) apply mutations to the code under test:
-- Change `>` to `>=`
-- Remove an `if` branch
-- Return early
-- Swap function arguments
-- Delete a line
-**If the test still passes after any of these mutations, it's not testing what you think.**
-### 4. Gap Analysis
-Identify code that has:
-- No tests at all
-- Only ceremony tests (effectively no real coverage)
-- Tests that mock the thing they're supposed to verify
-Prioritize gaps by risk: code in hot paths, security-sensitive code, and code with high blast radius (use `get_blast_radius` if available) needs real tests first.
-### 5. Produce Audit Report
-For each test file:
-- **Verdict:** KEEP, REWRITE, or DELETE
-- **Reason:** one sentence explaining why
-- **Priority:** if REWRITE, how urgent (based on risk of the code it covers)
-## What Makes a Test Valuable
-| Valuable | Not valuable |
-|----------|-------------|
-| Tests behavior through public API | Tests private implementation details |
-| Uses real dependencies | Mocks the thing under test |
-| Has hardcoded expected values | Computes expected values from production code |
-| Fails when behavior changes | Passes regardless of code changes |
-| Describes WHAT should happen | Describes HOW it's currently implemented |
-## Key Principles
-- **Mutation score > line coverage** — a test that can't detect mutations is worth zero
-- **One real integration test > ten mock-heavy unit tests** — mocks hide bugs
-- **Hardcoded expectations** — tests that compute their expected output prove nothing
-- **Delete with confidence** — removing a ceremony test loses nothing and speeds up the suite
-- **If you can't describe what makes a test fail, it's worthless**