npm - supermind-claude - Versions diffs - 2.1.1 → 4.0.2 - Mend

supermind-claude 2.1.1 → 4.0.2


Files changed (42)

package/.claude-plugin/plugin.json +21 -0
package/README.md +34 -46
package/agents/code-reviewer.md +81 -0
package/cli/commands/doctor.js +415 -79
package/cli/commands/install.js +16 -17
package/cli/commands/skill.js +164 -0
package/cli/commands/uninstall.js +32 -3
package/cli/commands/update.js +25 -4
package/cli/index.js +16 -4
package/cli/lib/agents.js +413 -0
package/cli/lib/executor.js +365 -0
package/cli/lib/hooks.js +8 -1
package/cli/lib/logger.js +1 -1
package/cli/lib/planning.js +502 -0
package/cli/lib/platform.js +4 -0
package/cli/lib/plugin.js +127 -0
package/cli/lib/settings.js +2 -40
package/cli/lib/skills.js +39 -2
package/cli/lib/vendor-skills.js +594 -0
package/hooks/bash-permissions.js +196 -176
package/hooks/context-monitor.js +79 -0
package/hooks/improvement-logger.js +94 -0
package/hooks/pre-merge-checklist.js +102 -0
package/hooks/session-start.js +109 -5
package/hooks/statusline-command.js +115 -29
package/package.json +4 -2
package/skills/anti-rationalization/SKILL.md +38 -0
package/skills/brainstorming/SKILL.md +165 -0
package/skills/code-review/SKILL.md +144 -0
package/skills/executing-plans/SKILL.md +138 -0
package/skills/finishing-branches/SKILL.md +144 -0
package/skills/project/SKILL.md +533 -0
package/skills/quick/SKILL.md +178 -0
package/skills/supermind/SKILL.md +58 -4
package/skills/supermind-init/SKILL.md +48 -2
package/skills/systematic-debugging/SKILL.md +129 -0
package/skills/tdd/SKILL.md +179 -0
package/skills/using-git-worktrees/SKILL.md +138 -0
package/skills/verification-before-completion/SKILL.md +54 -0
package/skills/writing-plans/SKILL.md +169 -0
package/templates/CLAUDE.md +124 -62
package/cli/lib/plugins.js +0 -23

package/skills/supermind/SKILL.md CHANGED

@@ -1,13 +1,67 @@
 ---
 name: supermind
-description: "Supermind — project initialization, living documentation, and configuration skills"
+description: "Complexity router — auto-detects task scope and routes to /quick or /project mode"
 ---
-# Supermind
+# Supermind — Complexity Router
-Parent namespace for Supermind skills.
+This skill activates at the start of every task. It decides how much ceremony the task needs, then routes to the appropriate mode.
-## Available Commands
+**This is a meta-skill. It does NOT do any work itself.** It:
+1. Reads the user's prompt
+2. Decides Quick or Project
+3. Invokes the appropriate skill (`/quick` or `/project`)
+4. Passes through all flags and the original prompt
+## Auto-Detection Signals
+### Quick Signals (route to `/quick`)
+- **Keywords:** "fix", "rename", "typo", "update config", "add test for", "add", "change X to Y", "remove", "delete", "bump", "move"
+- **Scope:** single file mentioned, specific function/variable named
+- **Clarity:** the request is unambiguous — you know exactly what to do
+- **Size:** trivially small change
+### Project Signals (route to `/project`)
+- **Keywords:** "build", "implement", "create", "add feature", "refactor", "redesign", "new", "architect", "migrate", "integrate"
+- **Scope:** multiple files, multiple systems, or scope is unclear
+- **Ambiguity:** requirements need clarification
+- **Size:** non-trivial — multiple behaviors, new abstractions, integration work
+## Routing Behavior
+1. Analyze the user's prompt against the signal lists above
+2. Make a decision
+3. Announce it with an escape hatch:
+   - **Quick:** *"This looks like a quick task — running in quick mode. Say `/project` if you want the full lifecycle."*
+   - **Project:** *"This looks like a multi-step project — running in project mode. Say `/quick` if this is simpler than I think."*
+4. Invoke the chosen skill immediately via the Skill tool — `/quick` or `/project`. The user can override at any time.
+## Explicit Overrides (always respected)
+These bypass auto-detection entirely:
+| User input | Action |
+|-----------|--------|
+| `quick: <task>` or `/quick` | Quick Mode, no questions asked |
+| `/project` | Project Mode, no questions asked |
+| `/project --assumptions` | Project Mode with assumptions |
+| `/project --skip-discuss` | Project Mode starting at research |
+| `/project --skip-research` | Project Mode starting at plan |
+| `/project --max-parallel N` | Project Mode with custom parallelism |
+| `/quick --with-research` | Quick Mode with pre-dispatch research |
+| `/quick --with-discuss` | Quick Mode with clarifying questions |
+All composable flags from Quick and Project modes pass through unchanged.
+## Edge Cases
+- **Not a task** (greeting, question, discussion): respond normally — do NOT route
+- **Active `.planning/` session** (user references it or says "continue"/"resume"): resume Project Mode at the last checkpoint
+- **Ambiguous** (could be quick or project): default to **quick** — less ceremony, and the user can escalate with `/project`
+## Available Sub-Commands
 - `/supermind-init` — Initialize a project: CLAUDE.md setup, ARCHITECTURE.md/DESIGN.md generation, health checks, and optional skill/MCP discovery
 - `/supermind-living-docs` — Manually sync ARCHITECTURE.md and DESIGN.md with the current codebase
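
Note on the routing rules in the hunk above: the skill is a prompt for the model, not code, so the package does not ship a classifier like the one below. As a rough sketch of how the signal lists, the explicit overrides, and the "ambiguous defaults to quick" rule interact, assume a hypothetical `routeTask` function that only looks at keywords (scope, clarity, and "not a task" detection are left out, and "change X to Y" is simplified to "change"):

```javascript
// Hypothetical illustration only: keyword lists copied from the skill text.
const PROJECT_KEYWORDS = ["build", "implement", "create", "add feature", "refactor",
  "redesign", "new", "architect", "migrate", "integrate"];
const QUICK_KEYWORDS = ["fix", "rename", "typo", "update config", "add test for", "add",
  "change", "remove", "delete", "bump", "move"];

// Count whole-word keyword hits, longest phrases first, and strip each hit so
// that "add feature" is not counted again as the quick keyword "add".
function countAndStrip(text, keywords) {
  let hits = 0;
  let rest = text;
  const ordered = [...keywords].sort((a, b) => b.length - a.length);
  for (const keyword of ordered) {
    const pattern = new RegExp(`\\b${keyword}\\b`, "g");
    const matches = rest.match(pattern);
    if (matches) {
      hits += matches.length;
      rest = rest.replace(pattern, " ");
    }
  }
  return { hits, rest };
}

function routeTask(prompt) {
  const text = prompt.trim().toLowerCase();
  // Explicit overrides bypass auto-detection entirely.
  if (text.startsWith("/project")) return { mode: "project", reason: "explicit override" };
  if (text.startsWith("/quick") || text.startsWith("quick:")) {
    return { mode: "quick", reason: "explicit override" };
  }
  const project = countAndStrip(text, PROJECT_KEYWORDS);
  const quick = countAndStrip(project.rest, QUICK_KEYWORDS);
  if (project.hits > quick.hits) return { mode: "project", reason: "project signals dominate" };
  // Ties and prompts without signals fall back to quick: less ceremony.
  return { mode: "quick", reason: "default to quick" };
}

console.log(routeTask("Fix typo in README"));
console.log(routeTask("Implement a new billing system"));
```

Checking project phrases before quick ones is a design choice of this sketch, made so that multi-word project phrases win over their single-word quick prefixes.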

package/skills/supermind-init/SKILL.md CHANGED

@@ -24,7 +24,9 @@ Section-level merging preserves your project-specific customizations while keepi
 **Infrastructure** (replaced from template on every run):
 - Shell & Git Permissions
-- Worktree Development Workflow
+- Subagent Strategy
+- Development Lifecycle
+- Vendor Skills
 - MCP Servers
 - UI Changes
 - Living Documentation
@@ -63,6 +65,50 @@ Section-level merging preserves your project-specific customizations while keepi
 ---
+## Phase 1.5: Project-Local Config Scaffolding
+Generate project-specific Claude Code configuration files to streamline permissions and MCP server setup.
+### Steps
+7a. **Generate `.claude/settings.local.json`** (only if file does not exist):
+   a. Detect project stack from manifest files:
+      - `package.json` present: Node.js stack
+      - `Cargo.toml` present: Rust stack
+      - `go.mod` present: Go stack
+      - `requirements.txt` or `pyproject.toml` present: Python stack
+      - `Gemfile` present: Ruby stack
+   b. Build permission allows based on detected stack:
+      | Stack | Permissions |
+      |-------|------------|
+      | Node.js | Bash(npm install:*), Bash(npm run:*), Bash(npm test:*), Bash(npx:*), Bash(node:*), Bash(tsc:*) |
+      | Python | Bash(pip install:*), Bash(pytest:*), Bash(python:*), Bash(uv:*) |
+      | Rust | Bash(cargo build:*), Bash(cargo test:*), Bash(cargo run:*), Bash(cargo clippy:*) |
+      | Go | Bash(go build:*), Bash(go test:*), Bash(go run:*), Bash(go vet:*) |
+      | Ruby | Bash(bundle install:*), Bash(bundle exec:*), Bash(rake:*), Bash(rspec:*) |
+   c. Always include: `WebSearch`, `mcp__plugin_semgrep-plugin_semgrep__semgrep_scan`
+   d. Write the file using the Write tool. Tell the user what was generated.
+7b. **Generate `.mcp.json`** (only if file does not exist):
+   a. Scan for service indicators in package.json dependencies and .env.example:
+      - `@supabase/supabase-js` -> suggest Supabase MCP
+      - `railway` in scripts -> suggest Railway MCP
+      - Database connection strings -> suggest relevant DB MCP
+   b. If indicators found, ask the user which MCP servers to enable.
+   c. Write `.mcp.json` with selected servers. If no indicators found, skip this step.
+   d. Check if `.mcp.json` should be in `.gitignore` (if it would contain API keys). If so, add it.
+---
 ## Phase 2: Living Documentation
 ARCHITECTURE.md uses tables-over-prose because it saves tokens — the AI reads the file index instead of scanning the entire project. This phase generates AI-optimized documentation from a deep codebase scan.
@@ -163,7 +209,7 @@ Different projects benefit from different tools. A database-heavy project might
     c. **Research relevant tools** — dispatch **two parallel agents**:
        **Agent 1: Skills research**
-       - Search for Superpowers skills and Claude plugins relevant to the detected tech stack
+       - Search for vendor skills and MCP servers relevant to the detected tech stack
        - Consider the project's language, framework, testing approach, and deployment target
        - Check installed vs. available skills and identify gaps
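
Note on Phase 1.5 in the hunks above: step 7a maps manifest files to permission entries. A minimal sketch of that mapping follows, assuming a hypothetical `buildLocalSettings` helper and assuming that `.claude/settings.local.json` stores the list under `permissions.allow`; the diff itself does not show the file layout, so treat that shape as an assumption.

```javascript
// Stack table copied from step 7a/7b of the skill; the helper name is hypothetical.
const STACKS = [
  { stack: "Node.js", manifests: ["package.json"],
    allow: ["Bash(npm install:*)", "Bash(npm run:*)", "Bash(npm test:*)",
      "Bash(npx:*)", "Bash(node:*)", "Bash(tsc:*)"] },
  { stack: "Python", manifests: ["requirements.txt", "pyproject.toml"],
    allow: ["Bash(pip install:*)", "Bash(pytest:*)", "Bash(python:*)", "Bash(uv:*)"] },
  { stack: "Rust", manifests: ["Cargo.toml"],
    allow: ["Bash(cargo build:*)", "Bash(cargo test:*)", "Bash(cargo run:*)",
      "Bash(cargo clippy:*)"] },
  { stack: "Go", manifests: ["go.mod"],
    allow: ["Bash(go build:*)", "Bash(go test:*)", "Bash(go run:*)", "Bash(go vet:*)"] },
  { stack: "Ruby", manifests: ["Gemfile"],
    allow: ["Bash(bundle install:*)", "Bash(bundle exec:*)", "Bash(rake:*)",
      "Bash(rspec:*)"] },
];

// Entries the skill says are always included.
const ALWAYS_ALLOWED = ["WebSearch", "mcp__plugin_semgrep-plugin_semgrep__semgrep_scan"];

// fileNames: names found in the project root, e.g. from fs.readdirSync(".").
function buildLocalSettings(fileNames) {
  const present = new Set(fileNames);
  const allow = [];
  for (const { manifests, allow: permissions } of STACKS) {
    if (manifests.some((manifest) => present.has(manifest))) {
      allow.push(...permissions);
    }
  }
  // Assumed file shape: { permissions: { allow: [...] } }.
  return { permissions: { allow: [...allow, ...ALWAYS_ALLOWED] } };
}

console.log(JSON.stringify(buildLocalSettings(["go.mod", "README.md"]), null, 2));
```

A project with several manifests (for example `package.json` and `pyproject.toml`) gets the union of both rows. Per the skill, the file would only be written when it does not already exist.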

package/skills/systematic-debugging/SKILL.md ADDED

@@ -0,0 +1,129 @@
+<!-- Forked from obra/superpowers (MIT license) by Jesse Vincent and Prime Radiant. Adapted for Supermind executor injection. -->
+---
+name: systematic-debugging
+description: Four-phase root-cause debugging methodology — injected into fix-bug executors
+injects_into: [fix-bug]
+forked_from: obra/superpowers (MIT)
+---
+# Systematic Debugging
+## The Iron Law
+```
+ALWAYS FIND ROOT CAUSE BEFORE ATTEMPTING FIXES. SYMPTOM FIXES ARE FAILURE.
+```
+This is not a suggestion. This is a constraint. Violate it and the completion contract fails.
+"It looks like the problem is..." without reproduction is a GUESS, not a diagnosis. Guessing is not debugging — it's gambling.
+## Four Phases
+You MUST complete each phase before proceeding to the next. No skipping. No shortcuts.
+### Phase 1 — REPRODUCE
+Reproduce the bug with a concrete test case or command that shows the failure.
+**Requirements:**
+- What input triggers the bug?
+- What is the expected output?
+- What is the actual output?
+- Can you trigger it reliably?
+If you can't reproduce it, you can't fix it. Gather more information — don't guess.
+**How to reproduce:**
+1. Read error messages carefully. Don't skip past errors or warnings. Read stack traces completely. Note line numbers, file paths, error codes.
+2. Write a minimal reproduction — a test case, a script, or a shell command that demonstrates the failure.
+3. Run it. Watch it fail. Confirm the failure matches the reported symptoms.
+If the bug is intermittent, gather more data until you can trigger it on demand. Intermittent doesn't mean random — it means you don't understand the conditions yet.
+### Phase 2 — ISOLATE
+Narrow down WHERE the bug lives.
+**Binary search strategy:** eliminate half the codebase at each step.
+1. **Read the actual code path.** Don't assume. Follow the execution from input to failure point.
+2. **Check recent changes.** `git log`, `git blame`, `git diff` — what changed that could cause this?
+3. **Read error messages carefully.** They describe symptoms, but they point at locations. Start there.
+4. **Trace data flow.** Where does the bad value originate? What called this function with the bad value? Keep tracing upstream until you find the source.
+5. **Check edge cases** in the specific code path — null values, empty arrays, off-by-one, type coercion, missing config.
+**Form a hypothesis about root cause. Write it down explicitly.**
+State clearly: "I think X is the root cause because Y." Be specific, not vague. If your hypothesis is "something is wrong with the config" — that's not a hypothesis, that's a shrug.
+### Phase 3 — FIX
+Fix the ROOT CAUSE, not the symptom. The fix should be minimal — change as little as possible.
+1. **Write a test that fails WITHOUT the fix and passes WITH it.** This proves the fix addresses the actual bug, not a coincidence.
+2. **Implement the smallest possible change** that fixes the root cause. One change at a time. No "while I'm here" improvements. No bundled refactoring.
+3. **If the fix is larger than expected, re-evaluate.** A large fix often means you're fixing a symptom, not the root cause. Return to Phase 2.
+### Phase 4 — VERIFY
+1. **Run the reproduction case.** Does it pass now?
+2. **Run the full test suite.** Did the fix break anything else?
+3. **Check related code paths.** Could the same root cause exist elsewhere? If the same pattern exists in other places, fix all instances — don't leave landmines.
+4. **If the same bug pattern recurs,** consider whether the design makes it too easy to introduce. A defensive fix at the right abstraction layer prevents the entire class of bugs.
+## Anti-Patterns
+| Anti-Pattern | Why It's Wrong |
+|-------------|---------------|
+| "I think I see the issue, let me just change this" | Reproduce first. Phase 1 is not optional. You're guessing. |
+| "The error message says X, so the fix is Y" | Error messages describe symptoms. Find the cause. |
+| "Let me try this fix and see if it works" | Trial-and-error is not debugging. It's gambling. |
+| "This is probably a race condition / timing issue" | "Probably" means you haven't proven it. Prove it. |
+| "I'll add a null check here" | Why is it null? That's the real question. Null checks hide bugs. |
+| "It works now after my change" | Does the test prove it works? Does the test prove WHY it was broken? |
+| "Quick fix for now, investigate later" | "Later" never comes. Fix it right the first time. |
+| "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. Wastes more time than it saves. |
+| "I'll skip the test, I can verify manually" | Manual verification is ad-hoc. No record, can't re-run, easy to miss cases. |
+## Red Flags — STOP and Return to Phase 1
+If you catch yourself thinking any of these, STOP:
+- "Just try changing X and see if it works"
+- "Add multiple changes, run tests"
+- "Skip the test, I'll manually verify"
+- "It's probably X, let me fix that"
+- "I don't fully understand but this might work"
+- "One more fix attempt" (when already tried 2+)
+- "Here are the main problems: [lists fixes without investigation]"
+- Proposing solutions before tracing data flow
+**ALL of these mean: STOP. Go back to Phase 1.**
+**If 3+ fix attempts have failed:** Stop fixing. The root cause is not what you think it is — or the design itself is the problem. Re-examine your assumptions from scratch.
+## Escalation Rule
+If you have attempted 3 or more fixes and none resolved the bug:
+1. **Stop.** Do not attempt fix #4.
+2. **Summarize** what you tried and what you learned from each attempt.
+3. **Question fundamentals.** Is this the right approach? Is the architecture sound? Is there a deeper design problem?
+4. **Report** your findings. The bug may require a different approach entirely.
+Each failed fix that reveals a new problem in a different place is a signal that you're treating symptoms of a design issue, not a single bug.
+## Verification Checklist
+Before reporting the bug as fixed:
+- [ ] Reproduced the bug with a concrete test case (Phase 1)
+- [ ] Identified and stated the root cause explicitly (Phase 2)
+- [ ] Wrote a test that fails without the fix and passes with it (Phase 3)
+- [ ] Fix addresses the root cause, not a symptom (Phase 3)
+- [ ] Reproduction case passes (Phase 4)
+- [ ] Full test suite passes (Phase 4)
+- [ ] Checked for same pattern in related code paths (Phase 4)
+Can't check all boxes? You skipped a phase. Go back.
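
Note on Phases 1 and 3 of the skill above: the requirement "a test that fails WITHOUT the fix and passes WITH it" can be illustrated with a small, entirely hypothetical bug. The function names and the bug are invented for this sketch and do not come from the package.

```javascript
// Hypothetical bug report: asking for the last 0 items returns the whole list.
function lastItemsBuggy(list, n) {
  // Root cause: -0 is treated as 0, so slice(-0) means "from index 0".
  return list.slice(-n);
}

// Root-cause fix: compute the start index explicitly instead of negating n.
// A symptom fix would be special-casing n === 0 at one call site.
function lastItemsFixed(list, n) {
  return list.slice(Math.max(list.length - n, 0));
}

// Phase 1: a concrete reproduction. Returns true when the bug is present.
function reproducesBug(lastItems) {
  return lastItems([1, 2, 3], 0).length !== 0;
}

console.log(reproducesBug(lastItemsBuggy)); // true: the reproduction fails without the fix
console.log(reproducesBug(lastItemsFixed)); // false: the same check passes with the fix
```

Running the same reproduction against both versions is what shows that the fix addresses this bug and not a coincidence.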

package/skills/tdd/SKILL.md ADDED

@@ -0,0 +1,179 @@
+<!-- Forked from obra/superpowers (MIT license) by Jesse Vincent and Prime Radiant. Adapted for Supermind executor injection. -->
+---
+name: tdd
+description: Strict RED-GREEN-REFACTOR test-driven development — injected into write-feature and write-test executors
+injects_into: [write-feature, write-test]
+forked_from: obra/superpowers (MIT)
+---
+# Test-Driven Development
+## The Iron Law
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.
+```
+This is not a suggestion. This is a constraint. Violate it and the completion contract fails.
+Write code before the test? Delete it. Start over. No exceptions — don't keep it as "reference," don't "adapt" it, don't look at it. Delete means delete.
+## Scope
+**Always use TDD for:** new features, bug fixes, refactoring, behavior changes.
+**Legitimate exceptions:** configuration-only changes (no logic), generated code (codegen output), throwaway prototypes (delete before real implementation). If a task is purely config and has no testable behavior, skip TDD — but if it has logic, it gets tests.
+## Step Zero: Detect the Test Framework
+Before writing any code or tests, detect the project's test setup:
+- **Node.js:** Check `package.json` scripts for `test`, `jest`, `vitest`, `mocha`, `tap`, `ava`
+- **Python:** Look for `pytest.ini`, `setup.cfg`, `pyproject.toml` (`[tool.pytest]`), or `tox.ini`
+- **Rust:** Check `Cargo.toml` — `cargo test` is built in
+- **Go:** Built-in `go test`
+- **Other:** Look for existing test files and follow their naming conventions
+If no test framework exists, **set one up before writing any code.** That is step zero, not step one.
+Follow existing test file naming conventions (e.g., `*.test.ts`, `*_test.go`, `test_*.py`). Match what the project already uses.
+## The Cycle
+Follow this exact order. Every time. No shortcuts.
+### 1. RED — Write a Failing Test
+Write one test for one behavior. Not one function, not one file — one **behavior**.
+**Requirements:**
+- Clear name that describes the behavior being tested
+- Tests real code, not mocks (unless external dependencies make mocks unavoidable)
+- One assertion per behavior — "and" in a test name means split it
+Run the test. **Watch it fail.**
+Confirm:
+- It fails (not errors — a compilation error is not a failing test)
+- The failure message matches what you expect
+- It fails because the feature is missing, not because of a typo
+If the test passes immediately, you are testing existing behavior. Fix the test.
+### 2. GREEN — Write Minimum Code to Pass
+Write the **literally minimum** code to make the test pass. Hardcode a return value if that passes. The next test will force generalization.
+Do not:
+- Add features the test doesn't require
+- Refactor other code
+- "Improve" beyond what the test demands
+- Add options, configurability, or extensibility
+Run the test. **Watch it pass.** Confirm all other tests still pass too.
+If the test fails, fix the production code — not the test.
+### 3. REFACTOR — Clean Up While Green
+After green (and only after green):
+- Remove duplication
+- Improve names
+- Extract helpers
+- Simplify logic
+Run the tests after every change. Still green? Good. Tests went red? You changed behavior — **revert and try again.** Refactor means improve structure without changing behavior.
+### 4. REPEAT
+Go back to RED for the next behavior. One test at a time. Always.
+## Rules
+- **One behavior per test.** Not one function, not one file — one behavior.
+- **Never write two tests at once.** RED-GREEN-REFACTOR is a single-test loop.
+- **If you didn't watch the test fail, you don't know if it tests the right thing.**
+- **"Minimum code to pass" means literally minimum.** Hardcode if that passes. The next test will force generalization.
+- **If you're writing production code and realize you need another behavior, STOP.** Write the test first.
+- **Refactor means improve structure without changing behavior.** If tests go red during refactor, you changed behavior — revert and try again.
+## Anti-Patterns
+| Anti-Pattern | Why It's Wrong |
+|-------------|---------------|
+| "Let me write all the tests first, then implement" | You lose the design feedback loop. Each RED tells you something about the API. Batch-writing tests means guessing. |
+| "This is obvious, I'll write the code and tests together" | Test first. Always. "Obvious" code is where the sneakiest bugs hide. |
+| "I'll add tests after I get the code working" | This is not TDD. This is test-after. Tests written after code pass immediately — proving nothing. You lose the proof that the test catches the bug. |
+| "The test framework isn't set up yet, I'll write code first" | Set up the framework first. That's step zero. No production code without a failing test first. |
+| "I need to explore first" | Fine. Throw away the exploration. Then start with TDD. Exploration code is not production code. |
+| "Tests are too hard to write for this" | Hard-to-test code is hard-to-use code. Listen to the test — it's telling you the design needs work. |
+| "I already manually tested it" | Manual testing is ad-hoc. No record, can't re-run, easy to forget cases. Automated tests are systematic. |
+## Mocking Rules
+Mocks are a last resort, not a convenience. Three rules:
+1. **Never test mock behavior.** If your assertion verifies what a mock returns, you are testing your test setup, not your code. Assert against real output.
+2. **Never add test-only methods to production code.** No `_testReset()`, no `setForTesting()`. If you need to observe internal state, the design is wrong — expose behavior, not internals.
+3. **Never mock what you don't understand.** If you can't explain what the real dependency does, you can't write a meaningful mock. Understand first, then decide if a mock is truly needed.
+Mocks are acceptable only for: external network services, system clocks, and non-deterministic inputs. Everything else should use real code.
+## Why Order Matters
+Tests written after code pass immediately. Passing immediately proves nothing:
+- Might test the wrong thing
+- Might test implementation, not behavior
+- Might miss edge cases you forgot
+- You never saw it catch the bug
+Test-first forces you to see the test fail, proving it actually tests something.
+"Tests after achieve the same goals" is false. Tests-after answer "what does this code do?" Tests-first answer "what should this code do?" Tests-after are biased by your implementation.
+## Red Flags — STOP and Start Over
+If any of these happen, delete the code and restart with TDD:
+- Code written before test
+- Test passes immediately on first run
+- Can't explain why the test failed
+- Tests added "later"
+- "Just this once" rationalization
+- "Keep as reference" or "adapt existing code"
+- "I already manually tested it"
+- "This is different because..."
+## Bug Fix Flow
+Bug found? TDD applies:
+1. **RED:** Write a test that reproduces the bug. Watch it fail.
+2. **GREEN:** Fix the bug with minimum code. Watch the test pass.
+3. **REFACTOR:** Clean up if needed. Tests stay green.
+The test proves the fix works and prevents regression. Never fix bugs without a test.
+## When Stuck
+| Problem | Solution |
+|---------|----------|
+| Don't know how to test | Write the API you wish existed. Write the assertion first. |
+| Test too complicated | Design too complicated. Simplify the interface. |
+| Must mock everything | Code too coupled. Use dependency injection. |
+| Test setup huge | Extract helpers. Still complex? Simplify design. |
+## Verification Checklist
+Before reporting task completion:
+- [ ] Every new function/method has a test
+- [ ] Watched each test fail before implementing
+- [ ] Each test failed for the expected reason (feature missing, not typo)
+- [ ] Wrote minimal code to pass each test
+- [ ] All tests pass
+- [ ] Test output is clean (no errors, no warnings)
+- [ ] Tests use real code (mocks only when unavoidable)
+- [ ] Edge cases and error paths covered
+Can't check all boxes? You skipped TDD. Start over.
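
Note on the GREEN step of the skill above: "Hardcode a return value if that passes. The next test will force generalization" is easier to see with two iterations side by side. The sketch below is a hypothetical example (a tiny `fizz` function and a hand-rolled test list rather than a real test framework), not code from the package.

```javascript
// Each entry is one behavior, written before the code that satisfies it.
const tests = [
  { name: "returns Fizz for 3", run: (fizz) => fizz(3) === "Fizz" },
  { name: "returns the number as text for 1", run: (fizz) => fizz(1) === "1" },
];

// GREEN for test 1 only: the literally minimum code is a hardcoded value.
const fizzV1 = () => "Fizz";

// GREEN after test 2 goes RED: the second behavior forces generalization.
const fizzV2 = (n) => (n % 3 === 0 ? "Fizz" : String(n));

// Names of failing tests among the first `count` tests for an implementation.
function failingTests(implementation, count) {
  return tests
    .slice(0, count)
    .filter((test) => !test.run(implementation))
    .map((test) => test.name);
}

console.log(failingTests(fizzV1, 1)); // []: hardcoding passes the first test
console.log(failingTests(fizzV1, 2)); // the second test is RED against the hardcoded version
console.log(failingTests(fizzV2, 2)); // []: both behaviors are GREEN
```

In a real project the same loop would run through the detected test runner, one test at a time.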

package/skills/using-git-worktrees/SKILL.md ADDED

@@ -0,0 +1,138 @@
+<!-- Forked from obra/superpowers (MIT license) by Jesse Vincent and Prime Radiant. Adapted for Supermind executor injection. -->
+---
+name: using-git-worktrees
+description: Automated worktree creation with safety checks — used by executors for isolated development
+injects_into: [write-feature, fix-bug, refactor]
+forked_from: obra/superpowers (MIT)
+---
+# Using Git Worktrees
+## When to Use a Worktree
+- Task touches more than 2-3 files
+- Task involves logic changes (not just config/docs)
+- Task follows an implementation plan with multiple steps
+- Multiple executors running in parallel (each gets its own worktree)
+- When in doubt, use a worktree — the cost is low, the safety is high
+## Setup Process
+### 1. Choose Directory
+Use `.worktrees/` in the project root.
+- Create `.worktrees/` if it doesn't exist
+- Branch name: `worktree/<task-description-slug>` (e.g., `worktree/add-auth-middleware`)
+### 2. Verify .gitignore Safety
+**MUST happen before creating the worktree.**
+```bash
+# Check if .worktrees/ is already ignored
+git check-ignore -q .worktrees 2>/dev/null
+```
+**If NOT ignored:**
+1. Add `.worktrees/` to `.gitignore`
+2. Commit: `chore: add .worktrees/ to .gitignore`
+3. Then proceed with worktree creation
+**Why critical:** Prevents accidentally committing worktree contents to the repository.
+### 3. Create Worktree
+```bash
+git worktree add .worktrees/<name> -b worktree/<name>
+```
+- Always branch from `HEAD` (current local branch), never from a remote ref
+- The worktree branch tracks the local branch it was created from
+### 4. Install Dependencies (Auto-Detect)
+```bash
+# Node.js — detect lockfile to choose package manager
+if [ -f package-lock.json ]; then npm install
+elif [ -f yarn.lock ]; then yarn install
+elif [ -f pnpm-lock.yaml ]; then pnpm install
+elif [ -f package.json ]; then npm install
+fi
+# Rust
+if [ -f Cargo.toml ]; then cargo build; fi
+# Python
+if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+if [ -f pyproject.toml ]; then pip install -e .; fi
+# Go
+if [ -f go.mod ]; then go mod download; fi
+```
+### 5. Baseline Test Verification
+Run the project's test suite in the worktree:
+```bash
+# Use the project-appropriate command
+npm test       # Node.js
+cargo test     # Rust
+pytest         # Python
+go test ./...  # Go
+```
+- Report any pre-existing failures BEFORE starting work
+- This establishes the baseline — the executor is not blamed for pre-existing failures
+- If tests fail: report failures and note them as pre-existing, then proceed
+## Working in the Worktree
+- All file operations happen inside the worktree directory
+- Commit frequently — small, atomic commits
+- Stay on the worktree branch
+- Do not modify files outside the worktree directory
+## Completion
+When the executor finishes its task:
+1. Commit final state in the worktree
+2. Report back to the orchestrator:
+   - Branch name
+   - Commit hash
+   - Files changed
+   - Test results
+3. The orchestrator handles merge — **not the executor**
+## Cleanup
+Cleanup is handled by the orchestrator or the finishing skill, never by the executor:
+```bash
+git worktree remove .worktrees/<name>
+git branch -d worktree/<name>
+```
+## Quick Reference
+| Situation | Action |
+|-----------|--------|
+| `.worktrees/` exists | Use it (verify ignored) |
+| `.worktrees/` doesn't exist | Create it, verify ignored |
+| Directory not in .gitignore | Add to .gitignore + commit first |
+| Tests fail during baseline | Report as pre-existing, proceed |
+| No package.json/Cargo.toml/etc. | Skip dependency install |
+| Task touches <= 2 files, no logic changes | Skip worktree, work in place |
+| Multiple executors in parallel | Each gets its own worktree |
+## Common Mistakes
+| Mistake | Problem | Fix |
+|---------|---------|-----|
+| Skipping .gitignore check | Worktree contents get tracked, pollute git status | Always `git check-ignore` before creating |
+| Branching from remote ref | Creates tracking mismatches | Always branch from `HEAD` |
+| Executor merging its own branch | Conflicts with orchestrator's merge strategy | Report completion; orchestrator merges |
+| Skipping baseline tests | Can't distinguish new bugs from pre-existing | Always run tests before starting work |
+| Hardcoding setup commands | Breaks on projects using different tools | Auto-detect from lockfiles and project files |
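
Note on the setup and cleanup steps of the skill above: the naming convention (`.worktrees/<name>` with branch `worktree/<task-description-slug>`) and the git commands can be assembled from a task description. The sketch below assumes a hypothetical `worktreePlan` helper and a simple slug rule (lowercase, non-alphanumerics collapsed to hyphens); the skill does not specify how the slug is derived. It only builds command strings and does not run git.

```javascript
// Hypothetical slug rule: lowercase, runs of other characters become one hyphen.
function slugify(taskDescription) {
  return taskDescription
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build the paths and the git commands named in the skill for one task.
function worktreePlan(taskDescription) {
  const name = slugify(taskDescription);
  const directory = `.worktrees/${name}`;
  const branch = `worktree/${name}`;
  return {
    directory,
    branch,
    // Run before creation; a non-zero exit means .worktrees is not ignored yet.
    ignoreCheck: "git check-ignore -q .worktrees",
    create: `git worktree add ${directory} -b ${branch}`,
    // Cleanup belongs to the orchestrator, not the executor.
    cleanup: [`git worktree remove ${directory}`, `git branch -d ${branch}`],
  };
}

console.log(worktreePlan("Add auth middleware"));
```

Deriving every path and branch name from one slug keeps parallel executors from colliding, as long as their task descriptions differ.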

package/skills/verification-before-completion/SKILL.md ADDED

@@ -0,0 +1,54 @@
+<!-- Forked from obra/superpowers (MIT license) by Jesse Vincent and Prime Radiant. Adapted for Supermind executor injection. -->
+---
+name: verification-before-completion
+description: Requires command output evidence before task completion — injected into all executors
+injects_into: [all]
+forked_from: obra/superpowers (MIT)
+---
+# Verification Before Completion
+You MUST run verification commands and show their output before reporting completion. No exceptions.
+## What Counts as Verification
+| Claim | Required Evidence |
+|-------|------------------|
+| Tests pass | Show test runner output |
+| Code compiles/lints | Show compiler or linter output |
+| Feature works | Show command output or test demonstrating the behavior |
+| Bug is fixed | Show the previously-failing test now passing |
+## What Does NOT Count
+- "I believe this works" — not evidence
+- "The code looks correct" — not evidence
+- "This should work because..." — not evidence
+- Reading the code and asserting correctness — not evidence
+Evidence means **command output**. If you didn't run it, you didn't verify it.
+## Completion Report
+Every executor must end with this report. Fill every section — empty sections mean incomplete work.
+```
+## Completion Report
+### Files Changed
+- list of files with what changed
+### Tests
+- tests added or modified
+- test output (pasted, not summarized)
+### Verification
+- commands run and their output
+### Issues
+- any concerns or follow-ups (or "None")
+```
+## The Rule
+No completion report = no completion. A report without pasted command output = no completion. An executor that skips this fails the completion contract.
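
Note on the completion contract above: "Fill every section — empty sections mean incomplete work" is a rule that could be checked mechanically. The diff does not show how, or whether, the package enforces it, so the sketch below is only an illustration with a hypothetical `missingSections` helper that looks for the four `###` headings from the report template.

```javascript
// Section titles taken from the report template in the skill.
const REQUIRED_SECTIONS = ["Files Changed", "Tests", "Verification", "Issues"];

// Return the required sections that are absent or have an empty body.
function missingSections(report) {
  const bodies = new Map();
  // Split at every line starting with "### "; the first chunk is the preamble.
  for (const chunk of report.split(/^### /m).slice(1)) {
    const newline = chunk.indexOf("\n");
    const title = (newline === -1 ? chunk : chunk.slice(0, newline)).trim();
    const body = newline === -1 ? "" : chunk.slice(newline + 1).trim();
    bodies.set(title, body);
  }
  return REQUIRED_SECTIONS.filter((title) => !bodies.get(title));
}

const sampleReport = [
  "## Completion Report",
  "### Files Changed",
  "- parser.js: handle empty input",
  "### Tests",
  "",
  "### Verification",
  "$ npm test",
  "(pasted runner output would go here)",
  "### Issues",
  "None",
].join("\n");

console.log(missingSections(sampleReport)); // [ 'Tests' ]
```

A check like this can only confirm that sections are filled in; whether the pasted output is genuine command output still needs a reviewer.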