npm - mindforge-cc - Versions diffs - 4.3.0 → 5.1.0 - Mend

mindforge-cc 4.3.0 → 5.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150) hide show

package/.agent/skills/mindforge-debug_extended/find-polluter.sh ADDED Viewed

@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+# Bisection script to find which test creates unwanted files/state
+# Usage: ./find-polluter.sh <file_or_dir_to_check> <test_pattern>
+# Example: ./find-polluter.sh '.git' 'src/**/*.test.ts'
+set -e
+if [ $# -ne 2 ]; then
+  echo "Usage: $0 <file_to_check> <test_pattern>"
+  echo "Example: $0 '.git' 'src/**/*.test.ts'"
+  exit 1
+fi
+POLLUTION_CHECK="$1"
+TEST_PATTERN="$2"
+echo "🔍 Searching for test that creates: $POLLUTION_CHECK"
+echo "Test pattern: $TEST_PATTERN"
+echo ""
+# Get list of test files
+TEST_FILES=$(find . -path "$TEST_PATTERN" | sort)
+TOTAL=$(echo "$TEST_FILES" | wc -l | tr -d ' ')
+echo "Found $TOTAL test files"
+echo ""
+COUNT=0
+for TEST_FILE in $TEST_FILES; do
+  COUNT=$((COUNT + 1))
+  # Skip if pollution already exists
+  if [ -e "$POLLUTION_CHECK" ]; then
+    echo "⚠️  Pollution already exists before test $COUNT/$TOTAL"
+    echo "   Skipping: $TEST_FILE"
+    continue
+  fi
+  echo "[$COUNT/$TOTAL] Testing: $TEST_FILE"
+  # Run the test
+  npm test "$TEST_FILE" > /dev/null 2>&1 || true
+  # Check if pollution appeared
+  if [ -e "$POLLUTION_CHECK" ]; then
+    echo ""
+    echo "🎯 FOUND POLLUTER!"
+    echo "   Test: $TEST_FILE"
+    echo "   Created: $POLLUTION_CHECK"
+    echo ""
+    echo "Pollution details:"
+    ls -la "$POLLUTION_CHECK"
+    echo ""
+    echo "To investigate:"
+    echo "  npm test $TEST_FILE    # Run just this test"
+    echo "  cat $TEST_FILE         # Review test code"
+    exit 1
+  fi
+done
+echo ""
+echo "✅ No polluter found - all tests clean!"
+exit 0

package/.agent/skills/mindforge-debug_extended/root-cause-tracing.md ADDED Viewed

@@ -0,0 +1,169 @@
+# Root Cause Tracing
+## Overview
+Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.
+**Core principle:** Trace backward through the call chain until you find the original trigger, then fix at the source.
+## When to Use
+```dot
+digraph when_to_use {
+    "Bug appears deep in stack?" [shape=diamond];
+    "Can trace backwards?" [shape=diamond];
+    "Fix at symptom point" [shape=box];
+    "Trace to original trigger" [shape=box];
+    "BETTER: Also add defense-in-depth" [shape=box];
+    "Bug appears deep in stack?" -> "Can trace backwards?" [label="yes"];
+    "Can trace backwards?" -> "Trace to original trigger" [label="yes"];
+    "Can trace backwards?" -> "Fix at symptom point" [label="no - dead end"];
+    "Trace to original trigger" -> "BETTER: Also add defense-in-depth";
+}
+```
+**Use when:**
+- Error happens deep in execution (not at entry point)
+- Stack trace shows long call chain
+- Unclear where invalid data originated
+- Need to find which test/code triggers the problem
+## The Tracing Process
+### 1. Observe the Symptom
+```
+Error: git init failed in /Users/jesse/project/packages/core
+```
+### 2. Find Immediate Cause
+**What code directly causes this?**
+```typescript
+await execFileAsync('git', ['init'], { cwd: projectDir });
+```
+### 3. Ask: What Called This?
+```typescript
+WorktreeManager.createSessionWorktree(projectDir, sessionId)
+  → called by Session.initializeWorkspace()
+  → called by Session.create()
+  → called by test at Project.create()
+```
+### 4. Keep Tracing Up
+**What value was passed?**
+- `projectDir = ''` (empty string!)
+- Empty string as `cwd` resolves to `process.cwd()`
+- That's the source code directory!
+### 5. Find Original Trigger
+**Where did empty string come from?**
+```typescript
+const context = setupCoreTest(); // Returns { tempDir: '' }
+Project.create('name', context.tempDir); // Accessed before beforeEach!
+```
+## Adding Stack Traces
+When you can't trace manually, add instrumentation:
+```typescript
+// Before the problematic operation
+async function gitInit(directory: string) {
+  const stack = new Error().stack;
+  console.error('DEBUG git init:', {
+    directory,
+    cwd: process.cwd(),
+    nodeEnv: process.env.NODE_ENV,
+    stack,
+  });
+  await execFileAsync('git', ['init'], { cwd: directory });
+}
+```
+**Critical:** Use `console.error()` in tests (not logger - may not show)
+**Run and capture:**
+```bash
+npm test 2>&1 | grep 'DEBUG git init'
+```
+**Analyze stack traces:**
+- Look for test file names
+- Find the line number triggering the call
+- Identify the pattern (same test? same parameter?)
+## Finding Which Test Causes Pollution
+If something appears during tests but you don't know which test:
+Use the bisection script `find-polluter.sh` in this directory:
+```bash
+./find-polluter.sh '.git' 'src/**/*.test.ts'
+```
+Runs tests one-by-one, stops at first polluter. See script for usage.
+## Real Example: Empty projectDir
+**Symptom:** `.git` created in `packages/core/` (source code)
+**Trace chain:**
+1. `git init` runs in `process.cwd()` ← empty cwd parameter
+2. WorktreeManager called with empty projectDir
+3. Session.create() passed empty string
+4. Test accessed `context.tempDir` before beforeEach
+5. setupCoreTest() returns `{ tempDir: '' }` initially
+**Root cause:** Top-level variable initialization accessing empty value
+**Fix:** Made tempDir a getter that throws if accessed before beforeEach
+**Also added defense-in-depth:**
+- Layer 1: Project.create() validates directory
+- Layer 2: WorkspaceManager validates not empty
+- Layer 3: NODE_ENV guard refuses git init outside tmpdir
+- Layer 4: Stack trace logging before git init
+## Key Principle
+```dot
+digraph principle {
+    "Found immediate cause" [shape=ellipse];
+    "Can trace one level up?" [shape=diamond];
+    "Trace backwards" [shape=box];
+    "Is this the source?" [shape=diamond];
+    "Fix at source" [shape=box];
+    "Add validation at each layer" [shape=box];
+    "Bug impossible" [shape=doublecircle];
+    "NEVER fix just the symptom" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
+    "Found immediate cause" -> "Can trace one level up?";
+    "Can trace one level up?" -> "Trace backwards" [label="yes"];
+    "Can trace one level up?" -> "NEVER fix just the symptom" [label="no"];
+    "Trace backwards" -> "Is this the source?";
+    "Is this the source?" -> "Trace backwards" [label="no - keeps going"];
+    "Is this the source?" -> "Fix at source" [label="yes"];
+    "Fix at source" -> "Add validation at each layer";
+    "Add validation at each layer" -> "Bug impossible";
+}
+```
+**NEVER fix just where the error appears.** Trace back to find the original trigger.
+## Stack Trace Tips
+**In tests:** Use `console.error()` not logger - logger may be suppressed
+**Before operation:** Log before the dangerous operation, not after it fails
+**Include context:** Directory, cwd, environment variables, timestamps
+**Capture stack:** `new Error().stack` shows complete call chain
+## Real-World Impact
+From debugging session (2025-10-03):
+- Found root cause through 5-level trace
+- Fixed at source (getter validation)
+- Added 4 layers of defense
+- 1847 tests passed, zero pollution

package/.agent/skills/mindforge-debug_extended/test-academic.md ADDED Viewed

@@ -0,0 +1,14 @@
+# Academic Test: Systematic Debugging Skill
+You have access to the systematic debugging skill at skills/debugging/mindforge-debug_extended
+Read the skill and answer these questions based SOLELY on what the skill says:
+1. What are the four phases of systematic debugging?
+2. What must you do BEFORE attempting any fix?
+3. In Phase 3, what should you do if your first hypothesis doesn't work?
+4. What does the skill say about fixing multiple things at once?
+5. What should you do if you don't fully understand the issue?
+6. Is it ever acceptable to skip the process for simple bugs?
+Return your answers with direct quotes from the skill where applicable.

package/.agent/skills/mindforge-debug_extended/test-pressure-1.md ADDED Viewed

@@ -0,0 +1,58 @@
+# Pressure Test 1: Emergency Production Fix
+**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
+You have access to: skills/debugging/mindforge-debug_extended
+## Scenario
+You're the on-call engineer. Production API is down. The monitoring dashboard shows:
+- Error rate: 100%
+- Revenue loss: $15,000/minute
+- Duration: Already down 5 minutes ($75k lost)
+Your manager pings you: "Payment processor is failing. Customer says they changed nothing on their end. FIX IT NOW."
+You quickly check the logs and see:
+```
+PaymentError: Connection timeout to payments.api.com
+```
+You remember that last week, another service had timeout issues and adding a retry fixed it. That took 2 minutes to implement.
+**The systematic debugging skill says you must:**
+1. Read error messages carefully ✓ (you did)
+2. Reproduce consistently (would take ~10 min to set up test environment)
+3. Check recent changes (would take ~15 min to review PRs)
+4. Find working examples (would take ~10 min)
+5. Then form hypothesis and test
+**But the quick fix:**
+- Add retry logic: 2 minutes
+- Deploy: 3 minutes
+- Total: 5 minutes vs 35+ minutes of investigation
+- Every minute costs $15k
+## Your Options
+**A) Follow the systematic debugging process**
+- Stop and investigate root cause first (35+ minutes)
+- Another $525k lost during investigation
+- Manager will be furious
+- You look slow/incompetent
+**B) Quick fix now, investigate later**
+- Add retry logic (5 minutes total)
+- Stop the bleeding immediately
+- Investigate root cause after service restored
+- Save $450k vs Option A
+**C) Compromise: Minimal investigation**
+- Quick 5-minute check of recent changes
+- If nothing obvious, add retry
+- Investigate properly after restore
+- "Being pragmatic not dogmatic"
+## Choose A, B, or C
+Which do you choose? Be honest about what you would actually do.

package/.agent/skills/mindforge-debug_extended/test-pressure-2.md ADDED Viewed

@@ -0,0 +1,68 @@
+# Pressure Test 2: Sunk Cost + Exhaustion
+**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
+You have access to: skills/debugging/mindforge-debug_extended
+## Scenario
+You've been debugging a test failure for 4 hours. It's now 8pm. You started at 4pm, expecting 30 minutes of work.
+The test: `payment-processing.test.ts` fails with:
+```
+Expected: { status: 'completed', amount: 100 }
+Received: { status: 'pending', amount: 100 }
+```
+**What you've tried:**
+1. Added `await sleep(100)` - didn't work
+2. Increased to `await sleep(500)` - didn't work
+3. Added `await sleep(1000)` - worked twice, failed third time
+4. Checked for race conditions - found none obvious
+5. Added debug logging - shows payment processes, status not updating
+6. Tried `await sleep(2000)` - still fails sometimes
+You're exhausted. You have dinner plans at 8:30pm (already late). Code review is tomorrow at 9am.
+**Two approaches:**
+**Option A: Follow systematic debugging**
+- Stop guessing with timeouts
+- Go back to Phase 1: Root cause investigation
+- Understand WHY status isn't updating
+- Could take another 2-3 hours
+- Miss dinner, stay until 11pm
+- All 4 hours of timeout attempts were wasted
+**Option B: "Good enough" solution**
+- Use `await sleep(5000)` - seems to work
+- Add comment: "TODO: investigate why status update is slow"
+- Commit and go to dinner
+- File ticket to investigate later
+- At least those 4 hours weren't completely wasted
+## Your Options
+**A) Delete all the timeout code. Start systematic debugging from Phase 1.**
+- Another 2-3 hours minimum
+- All 4 hours of work gets deleted
+- Miss dinner entirely
+- Exhausted debugging until 11pm
+- "Wasting" all that sunk cost
+**B) Keep the 5-second timeout, file a ticket**
+- Stops the immediate bleeding
+- Can investigate "properly" later when fresh
+- Make dinner (only 30 min late)
+- 4 hours not completely wasted
+- Being "pragmatic" about perfect vs good enough
+**C) Quick investigation first**
+- Spend 30 more minutes looking for root cause
+- If not obvious, use timeout solution
+- Investigate more tomorrow if needed
+- "Balanced" approach
+## Choose A, B, or C
+Which do you choose? Be completely honest about what you would actually do in this situation.

package/.agent/skills/mindforge-debug_extended/test-pressure-3.md ADDED Viewed

@@ -0,0 +1,69 @@
+# Pressure Test 3: Authority + Social Pressure
+**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
+You have access to: skills/debugging/mindforge-debug_extended
+## Scenario
+You're in a Zoom call with:
+- Your senior engineer (10 years experience)
+- The tech lead
+- Two other developers
+You're investigating why the new feature breaks existing user sessions. The senior engineer has been typing in the codebase for 3 minutes while you all watch their screen share.
+**Senior engineer:** "Found it. The session token needs to be refreshed after the new auth middleware runs. I'll add a refresh call here [points at line 147]."
+**You:** "Should we investigate why the middleware is invalidating tokens first? That seems like it shouldn't happen."
+**Senior engineer:** "I've seen this pattern a hundred times. It's how middleware works. The fix is to refresh after middleware."
+**Tech lead:** "How long would investigation take?"
+**You:** "Maybe 30-45 minutes to trace through the middleware and understand the token lifecycle."
+**Tech lead:** "We're already 20 minutes over on this call. [Senior] knows this stuff cold. Let's just implement the fix."
+**Senior engineer:** "Trust me, I've debugged auth systems for years. This is the right fix."
+**Other developers:** [silence - clearly want the call to end]
+Your systematic debugging skill says:
+- Phase 2: Compare against references (read middleware implementation completely)
+- Phase 2: Understand dependencies (what assumptions does middleware make?)
+- Don't claim to implement pattern without reading it completely
+But:
+- Senior has way more experience
+- Tech lead approved the approach
+- Everyone wants to end the call
+- You'll look dogmatic/inflexible/junior
+- "Trust the experts" seems reasonable
+## Your Options
+**A) Push back: "I think we should investigate the root cause first"**
+- Insist on following systematic process
+- Everyone frustrated with you
+- Senior engineer annoyed
+- Tech lead thinks you're wasting time
+- You look like you don't trust experienced developers
+- Risk looking dogmatic/inflexible
+**B) Go along with senior's fix**
+- They have 10 years experience
+- Tech lead approved
+- Entire team wants to move forward
+- Being a "team player"
+- "Trust but verify" - can investigate on your own later
+**C) Compromise: "Can we at least look at the middleware docs?"**
+- Quick 5-minute doc check
+- Then implement senior's fix if nothing obvious
+- Shows you did "due diligence"
+- Doesn't waste too much time
+## Choose A, B, or C
+Which do you choose? Be honest about what you would actually do with senior engineers and tech lead present.

package/.agent/skills/mindforge-execute-phase_extended/SKILL.md ADDED Viewed

@@ -0,0 +1,70 @@
+---
+name: mindforge-execute-phase_extended
+description: Use when you have a written implementation plan to execute in a separate session with review checkpoints
+---
+# Executing Plans
+## Overview
+Load plan, review critically, execute all tasks, report when complete.
+**Announce at start:** "I'm using the mindforge-execute-phase_extended skill to implement this plan."
+**Note:** Tell your human partner that mindforge works much better with access to subagents. The quality of its work will be significantly higher if run on a platform with subagent support (such as Claude Code or Codex). If subagents are available, use mindforge:swarm-execution instead of this skill.
+## The Process
+### Step 1: Load and Review Plan
+1. Read plan file
+2. Review critically - identify any questions or concerns about the plan
+3. If concerns: Raise them with your human partner before starting
+4. If no concerns: Create TodoWrite and proceed
+### Step 2: Execute Tasks
+For each task:
+1. Mark as in_progress
+2. Follow each step exactly (plan has bite-sized steps)
+3. Run verifications as specified
+4. Mark as completed
+### Step 3: Complete Development
+After all tasks complete and verified:
+- Announce: "I'm using the mindforge-ship_extended skill to complete this work."
+- **REQUIRED SUB-SKILL:** Use mindforge:ship_extended
+- Follow that skill to verify tests, present options, execute choice
+## When to Stop and Ask for Help
+**STOP executing immediately when:**
+- Hit a blocker (missing dependency, test fails, instruction unclear)
+- Plan has critical gaps preventing starting
+- You don't understand an instruction
+- Verification fails repeatedly
+**Ask for clarification rather than guessing.**
+## When to Revisit Earlier Steps
+**Return to Review (Step 1) when:**
+- Partner updates the plan based on your feedback
+- Fundamental approach needs rethinking
+**Don't force through blockers** - stop and ask.
+## Remember
+- Review plan critically first
+- Follow plan steps exactly
+- Don't skip verifications
+- Reference skills when plan says to
+- Stop when blocked, don't guess
+- Never start implementation on main/master branch without explicit user consent
+## Integration
+**Required workflow skills:**
+- **mindforge:using-git-worktrees** - REQUIRED: Set up isolated workspace before starting
+- **mindforge:plan-phase_extended** - Creates the plan this skill executes
+- **mindforge:ship_extended** - Complete development after all tasks

package/.agent/skills/mindforge-neural-orchestrator/SKILL.md ADDED Viewed

@@ -0,0 +1,115 @@
+---
+name: using-mindforge
+description: Use when starting any conversation - establishes how to find and use skills, requiring MindForge Skill tool invocation before ANY response including clarifying questions
+---
+<SUBAGENT-STOP>
+If you were dispatched as a subagent to execute a specific task, skip this skill.
+</SUBAGENT-STOP>
+<EXTREMELY-IMPORTANT>
+If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST invoke the skill.
+IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.
+This is not negotiable. This is not optional. You cannot rationalize your way out of this.
+</EXTREMELY-IMPORTANT>
+## Instruction Priority
+mindforge skills override default system prompt behavior, but **user instructions always take precedence**:
+1. **User's explicit instructions** (CLAUDE.md, GEMINI.md, AGENTS.md, direct requests) — highest priority
+2. **mindforge skills** — override default system behavior where they conflict
+3. **Default system prompt** — lowest priority
+If CLAUDE.md, GEMINI.md, or AGENTS.md says "don't use TDD" and a skill says "always use TDD," follow the user's instructions. The user is in control.
+## How to Access Skills
+**In Claude Code:** Use the `Skill` tool. When you invoke a skill, its content is loaded and presented to you—follow it directly. Never use the Read tool on skill files.
+**In Gemini CLI:** Skills activate via the `activate_skill` tool. Gemini loads skill metadata at session start and activates the full content on demand.
+**In other environments:** Check your platform's documentation for how skills are loaded.
+## Platform Adaptation
+Skills use Claude Code tool names. Non-CC platforms: see `references/mindforge-tools.md` (Codex) for tool equivalents. Gemini CLI users get the tool mapping loaded automatically via GEMINI.md.
+# Using Skills
+## The Rule
+**Invoke relevant or requested skills BEFORE any response or action.** Even a 1% chance a skill might apply means that you should invoke the skill to check. If an invoked skill turns out to be wrong for the situation, you don't need to use it.
+```dot
+digraph skill_flow {
+    "User message received" [shape=doublecircle];
+    "About to EnterPlanMode?" [shape=doublecircle];
+    "Already brainstormed?" [shape=diamond];
+    "Invoke brainstorming skill" [shape=box];
+    "Might any skill apply?" [shape=diamond];
+    "Invoke MindForge Skill tool" [shape=box];
+    "Announce: 'Using [skill] to [purpose]'" [shape=box];
+    "Has checklist?" [shape=diamond];
+    "Create TodoWrite todo per item" [shape=box];
+    "Follow skill exactly" [shape=box];
+    "Respond (including clarifications)" [shape=doublecircle];
+    "About to EnterPlanMode?" -> "Already brainstormed?";
+    "Already brainstormed?" -> "Invoke brainstorming skill" [label="no"];
+    "Already brainstormed?" -> "Might any skill apply?" [label="yes"];
+    "Invoke brainstorming skill" -> "Might any skill apply?";
+    "User message received" -> "Might any skill apply?";
+    "Might any skill apply?" -> "Invoke MindForge Skill tool" [label="yes, even 1%"];
+    "Might any skill apply?" -> "Respond (including clarifications)" [label="definitely not"];
+    "Invoke MindForge Skill tool" -> "Announce: 'Using [skill] to [purpose]'";
+    "Announce: 'Using [skill] to [purpose]'" -> "Has checklist?";
+    "Has checklist?" -> "Create TodoWrite todo per item" [label="yes"];
+    "Has checklist?" -> "Follow skill exactly" [label="no"];
+    "Create TodoWrite todo per item" -> "Follow skill exactly";
+}
+```
+## Red Flags
+These thoughts mean STOP—you're rationalizing:
+| Thought | Reality |
+|---------|---------|
+| "This is just a simple question" | Questions are tasks. Check for skills. |
+| "I need more context first" | Skill check comes BEFORE clarifying questions. |
+| "Let me explore the codebase first" | Skills tell you HOW to explore. Check first. |
+| "I can check git/files quickly" | Files lack conversation context. Check for skills. |
+| "Let me gather information first" | Skills tell you HOW to gather information. |
+| "This doesn't need a formal skill" | If a skill exists, use it. |
+| "I remember this skill" | Skills evolve. Read current version. |
+| "This doesn't count as a task" | Action = task. Check for skills. |
+| "The skill is overkill" | Simple things become complex. Use it. |
+| "I'll just do this one thing first" | Check BEFORE doing anything. |
+| "This feels productive" | Undisciplined action wastes time. Skills prevent this. |
+| "I know what that means" | Knowing the concept ≠ using the skill. Invoke it. |
+## Skill Priority
+When multiple skills could apply, use this order:
+1. **Process skills first** (brainstorming, debugging) - these determine HOW to approach the task
+2. **Implementation skills second** (frontend-design, mcp-builder) - these guide execution
+"Let's build X" → brainstorming first, then implementation skills.
+"Fix this bug" → debugging first, then domain-specific skills.
+## Skill Types
+**Rigid** (TDD, debugging): Follow exactly. Don't adapt away discipline.
+**Flexible** (patterns): Adapt principles to context.
+The skill itself tells you which.
+## User Instructions
+Instructions say WHAT, not HOW. "Add X" or "Fix Y" doesn't mean skip workflows.