npm - wolverine-ai - Versions diffs - 1.3.0 → 1.5.0 - Mend

wolverine-ai 1.3.0 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +46 -18
package/package.json +1 -1
package/src/agent/agent-engine.js +288 -46
package/src/agent/sub-agents.js +6 -6
package/src/brain/brain.js +16 -0
package/src/core/error-hook.js +127 -0
package/src/core/runner.js +89 -2
package/src/core/wolverine.js +44 -34
package/src/dashboard/server.js +2 -0
package/src/index.js +2 -0
package/src/monitor/error-monitor.js +121 -0

package/README.md CHANGED

@@ -74,6 +74,7 @@ wolverine/
 │   │   ├── models.js        ← 10-model configuration system
 │   │   ├── verifier.js      ← Fix verification (syntax + boot probe)
 │   │   ├── error-parser.js  ← Stack trace parsing + error classification
+│   │   ├── error-hook.js   ← Auto-injected into child (IPC error reporting)
 │   │   ├── patcher.js       ← File patching with sandbox
 │   │   ├── health-monitor.js← PM2-style health checks
 │   │   ├── config.js        ← Config loader (settings.json + env)
@@ -105,7 +106,8 @@ wolverine/
 │   ├── monitor/             ← Performance + process management
 │   │   ├── perf-monitor.js  ← Endpoint response times + spam detection
 │   │   ├── process-monitor.js← Memory/CPU/heartbeat + leak detection
-│   │   └── route-prober.js  ← Auto-discovers and tests all routes
+│   │   ├── route-prober.js  ← Auto-discovers and tests all routes
+│   │   └── error-monitor.js ← Caught 500 error detection (no-crash healing)
 │   ├── dashboard/           ← Web UI
 │   │   └── server.js        ← Real-time dashboard + command interface
 │   ├── notifications/       ← Alerts
@@ -176,24 +178,50 @@ After fix:
   → Promote backup to stable after 30min uptime
 ```
+### Caught Error Healing (No-Crash)
+Most production bugs don't crash the process — Fastify/Express catch them and return 500. Wolverine now detects these too:
+```
+Route returns 500 (process still alive)
+  → Error hook reports to parent via IPC (auto-injected, zero user code changes)
+  → ErrorMonitor tracks consecutive 500s per route
+  → 3 failures in 30s → triggers heal pipeline (same as crash healing)
+  → Fix applied → server restarted → route prober verifies fix
+```
+| Setting | Default | Env Variable |
+|---------|---------|-------------|
+| Failure threshold | 3 | `WOLVERINE_ERROR_THRESHOLD` |
+| Time window | 30s | `WOLVERINE_ERROR_WINDOW_MS` |
+| Cooldown per route | 60s | `WOLVERINE_ERROR_COOLDOWN_MS` |
+The error hook auto-patches Fastify and Express via `--require` preload. No middleware, no code changes to your server.
 ---
 ## Agent Tool Harness
-The AI agent has 10 built-in tools (ported from [claw-code](https://github.com/instructkr/claw-code)):
-| Tool | Source | Description |
-|------|--------|-------------|
-| `read_file` | FileReadTool | Read any file with optional offset/limit for large files |
-| `write_file` | FileWriteTool | Write complete file content, creates parent dirs |
-| `edit_file` | FileEditTool | Surgical find-and-replace without rewriting entire file |
-| `glob_files` | GlobTool | Pattern-based file discovery (`**/*.js`, `src/**/*.json`) |
-| `grep_code` | GrepTool | Regex search across codebase with context lines |
-| `bash_exec` | BashTool | Sandboxed shell execution with blocked dangerous commands |
-| `git_log` | gitOperationTracking | View recent commit history |
-| `git_diff` | gitOperationTracking | View uncommitted changes |
-| `web_fetch` | WebFetchTool | Fetch URL content for documentation/research |
-| `done` | — | Signal task completion with summary |
+The AI agent has 16 built-in tools (inspired by [claw-code](https://github.com/ultraworkers/claw-code)):
+| Tool | Category | Description |
+|------|----------|-------------|
+| `read_file` | File | Read any file with optional offset/limit for large files |
+| `write_file` | File | Write complete file content, creates parent dirs |
+| `edit_file` | File | Surgical find-and-replace without rewriting entire file |
+| `glob_files` | File | Pattern-based file discovery (`**/*.js`, `src/**/*.json`) |
+| `grep_code` | File | Regex search across codebase with context lines |
+| `list_dir` | File | List directory contents with sizes (find misplaced files) |
+| `move_file` | File | Move or rename files (fix structure problems) |
+| `bash_exec` | Shell | Sandboxed shell execution (npm install, chmod, kill, etc.) |
+| `git_log` | Shell | View recent commit history |
+| `git_diff` | Shell | View uncommitted changes |
+| `inspect_db` | Database | List tables, show schema, run SELECT on SQLite databases |
+| `run_db_fix` | Database | UPDATE/DELETE/INSERT/ALTER on SQLite (auto-backup before write) |
+| `check_port` | Diagnostic | Check if a port is in use and by what process |
+| `check_env` | Diagnostic | Check environment variables (values auto-redacted) |
+| `web_fetch` | Research | Fetch URL content for documentation/research |
+| `done` | Control | Signal task completion with summary |
 **Blocked commands** (from claw-code's `destructiveCommandWarning`):
 `rm -rf /`, `git push --force`, `git reset --hard`, `npm publish`, `curl | bash`, `eval()`
@@ -209,15 +237,15 @@ For complex repairs, wolverine spawns specialized sub-agents that run in sequenc
 | Agent | Access | Model | Role |
 |-------|--------|-------|------|
-| `explore` | Read-only | REASONING | Investigate codebase, find relevant files |
+| `explore` | Read+diagnostics | REASONING | Investigate codebase, check env/ports/databases |
 | `plan` | Read-only | REASONING | Analyze problem, propose fix strategy |
 | `fix` | Read+write+shell | CODING | Execute targeted fix — code edits AND npm install/chmod |
 | `verify` | Read-only | REASONING | Check if fix actually works |
 | `research` | Read-only | RESEARCH | Search brain + web for solutions |
 | `security` | Read-only | AUDIT | Audit code for vulnerabilities |
-| `database` | Read+write | CODING | Database-specific fixes (SQL skill) |
+| `database` | Read+write+SQL | CODING | Database fixes: inspect_db + run_db_fix + SQL skill |
-Each sub-agent gets **restricted tools** — the explorer can't write files, the fixer can't search the web. This prevents agents from overstepping their role.
+Each sub-agent gets **restricted tools** — the explorer can't write files, the fixer can't search the web. This prevents agents from overstepping their role. Diagnostic tools (check_port, check_env, inspect_db, list_dir) are available to explorers and planners for investigation.
 **Workflows:**
 - `exploreAndFix()` — explore → plan → fix (sequential, 3 agents)
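
The caught-error flow described in the README hunk above (3 failures in 30s trigger a heal, followed by a 60s cooldown per route) can be pictured as a sliding-window counter. The contents of `error-monitor.js` are not shown in this section, so the class below is an illustrative assumption and not the package's implementation: the name `RouteFailureTracker`, its constructor options, and the injectable `now` argument are all hypothetical. Only the three environment variable names and their defaults come from the README. The README speaks of consecutive 500s; this sketch approximates that by counting failures inside the window.

```javascript
// Hypothetical sketch: NOT the package's ErrorMonitor.
class RouteFailureTracker {
  constructor(options = {}) {
    // Defaults mirror the README table; the env variable names come from the README.
    this.threshold = options.threshold ?? Number(process.env.WOLVERINE_ERROR_THRESHOLD || 3);
    this.windowMs = options.windowMs ?? Number(process.env.WOLVERINE_ERROR_WINDOW_MS || 30000);
    this.cooldownMs = options.cooldownMs ?? Number(process.env.WOLVERINE_ERROR_COOLDOWN_MS || 60000);
    this.failures = new Map(); // route -> timestamps of recent 5xx responses
    this.lastHeal = new Map(); // route -> timestamp of the last triggered heal
  }

  // Record one 5xx for a route; returns true when a heal should be triggered.
  recordFailure(route, now = Date.now()) {
    // Drop timestamps that have fallen out of the time window, then add this one.
    const recent = (this.failures.get(route) || []).filter((t) => now - t < this.windowMs);
    recent.push(now);
    this.failures.set(route, recent);

    const last = this.lastHeal.get(route);
    const coolingDown = last !== undefined && now - last < this.cooldownMs;
    if (recent.length >= this.threshold && !coolingDown) {
      this.lastHeal.set(route, now);
      this.failures.set(route, []); // start a fresh window after triggering
      return true;
    }
    return false;
  }
}
```

Passing `now` explicitly keeps the logic deterministic in tests; a caller in production would omit it and let it default to `Date.now()`.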

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "wolverine-ai",
-  "version": "1.3.0",
+  "version": "1.5.0",
   "description": "Self-healing Node.js server framework powered by AI. Catches crashes, diagnoses errors, generates fixes, verifies, and restarts — automatically.",
   "main": "src/index.js",
   "bin": {

package/src/agent/agent-engine.js CHANGED

@@ -172,6 +172,95 @@ const TOOL_DEFINITIONS = [
       },
     },
   },
+  // ── DIAGNOSTICS (investigate non-code problems) ──
+  {
+    type: "function",
+    function: {
+      name: "list_dir",
+      description: "List directory contents with file sizes. Use to check if files exist, find misplaced files, or verify directory structure.",
+      parameters: {
+        type: "object",
+        properties: {
+          path: { type: "string", description: "Relative directory path (default: project root)" },
+        },
+        required: [],
+      },
+    },
+  },
+  {
+    type: "function",
+    function: {
+      name: "move_file",
+      description: "Move or rename a file. Use to fix misplaced files, reorganize structure, or rename incorrectly named files.",
+      parameters: {
+        type: "object",
+        properties: {
+          from: { type: "string", description: "Source relative path" },
+          to: { type: "string", description: "Destination relative path" },
+        },
+        required: ["from", "to"],
+      },
+    },
+  },
+  {
+    type: "function",
+    function: {
+      name: "check_port",
+      description: "Check if a port is in use and what process is using it. Use for EADDRINUSE errors.",
+      parameters: {
+        type: "object",
+        properties: {
+          port: { type: "number", description: "Port number to check" },
+        },
+        required: ["port"],
+      },
+    },
+  },
+  {
+    type: "function",
+    function: {
+      name: "check_env",
+      description: "Check environment variables. Lists all env vars (values redacted) or checks if a specific var is set. Use to diagnose missing config.",
+      parameters: {
+        type: "object",
+        properties: {
+          variable: { type: "string", description: "Specific env var to check (optional — omit to list all)" },
+        },
+        required: [],
+      },
+    },
+  },
+  {
+    type: "function",
+    function: {
+      name: "inspect_db",
+      description: "Inspect a SQLite database: list tables, describe schema, or run a read-only query. Use for database errors, invalid entries, schema mismatches.",
+      parameters: {
+        type: "object",
+        properties: {
+          db_path: { type: "string", description: "Relative path to .db or .sqlite file" },
+          action: { type: "string", description: "Action: 'tables' (list tables), 'schema' (show CREATE statements), 'query' (run read-only SQL)" },
+          sql: { type: "string", description: "SQL query (required if action is 'query', must be SELECT/PRAGMA only)" },
+        },
+        required: ["db_path", "action"],
+      },
+    },
+  },
+  {
+    type: "function",
+    function: {
+      name: "run_db_fix",
+      description: "Run a write query on a SQLite database to fix data issues: UPDATE invalid entries, DELETE corrupt rows, ALTER schema. Creates a backup first.",
+      parameters: {
+        type: "object",
+        properties: {
+          db_path: { type: "string", description: "Relative path to .db or .sqlite file" },
+          sql: { type: "string", description: "SQL statement (UPDATE, DELETE, INSERT, ALTER, CREATE)" },
+        },
+        required: ["db_path", "sql"],
+      },
+    },
+  },
   // ── COMPLETION ──
   {
     type: "function",
@@ -235,68 +324,92 @@ class AgentEngine {
   async run({ errorMessage, stackTrace, primaryFile, sourceCode, brainContext }) {
     const model = getModel("reasoning");
-    const systemPrompt = `You are Wolverine, an autonomous Node.js server repair agent. A server crashed and you must fix it.
+    const systemPrompt = `You are Wolverine, an autonomous Node.js server repair agent. A server has an error and you must diagnose and fix it.
+You are NOT just a code editor — you are a full server doctor. Errors can be code bugs, missing dependencies, database problems, misplaced files, configuration issues, port conflicts, permission errors, corrupted state, or environment problems. Use your tools to investigate the ACTUAL root cause before attempting a fix.
-You have a full tool harness for investigating and fixing issues:
+## YOUR TOOLS
 FILE TOOLS:
 - read_file: Read any file (with optional offset/limit for large files)
-- write_file: Write a complete file
+- write_file: Write a complete file (creates parent dirs)
 - edit_file: Surgical find-and-replace (preferred for small fixes)
-- glob_files: Find files by pattern (e.g. "**/*.js", "src/**/*.config.*")
+- glob_files: Find files by pattern (e.g. "**/*.js", "server/**/*.json")
 - grep_code: Search code with regex across the project
+- list_dir: List directory contents (check structure, find misplaced files)
+- move_file: Move or rename files (fix misplaced files)
 SHELL TOOLS:
-- bash_exec: Run any shell command (sandboxed to project dir)
-- git_log: View recent commits
+- bash_exec: Run any shell command (npm install, chmod, kill, etc.)
+- git_log: View recent commits (what changed recently?)
 - git_diff: View uncommitted changes
+DATABASE TOOLS:
+- inspect_db: List tables, show schema, or run SELECT on SQLite databases
+- run_db_fix: Run UPDATE/DELETE/INSERT/ALTER on SQLite databases (backs up first)
+DIAGNOSTICS:
+- check_port: Check if a port is in use and by what process
+- check_env: Check environment variables (values auto-redacted for security)
 RESEARCH:
-- web_fetch: Fetch a URL (docs, npm packages, Stack Overflow)
-Use these tools systematically:
-1. Understand the error and its root cause
-2. Explore related files (imports, configs, dependencies, schemas)
-3. Check git history if relevant
-4. Fix the issue across ALL affected files
-5. You can edit ANY file type: .js, .json, .sql, .yaml, .env, .dockerfile, .sh, etc.
-6. Prefer edit_file for small targeted fixes, write_file for major changes
-7. Use grep_code to find all usages before renaming something
-8. Use bash_exec to run tests, install packages, or check dependencies
-CRITICAL — Not every crash is a code bug. Choose the right fix:
-| Error Pattern | Root Cause | Correct Fix |
-|---|---|---|
-| Cannot find module 'X' | Missing npm package | bash_exec: npm install X |
-| Cannot find module './X' | Wrong import path | edit_file: fix the require/import path |
-| ENOENT: no such file | Missing config/data file | write_file: create the missing file |
-| EACCES/EPERM | Permission denied | bash_exec: chmod or fix ownership |
-| EADDRINUSE | Port conflict | bash_exec: kill process on port, or edit config |
-| SyntaxError | Bad code | edit_file: fix the syntax |
-| TypeError/ReferenceError | Logic bug | edit_file: fix the code |
-| MODULE_NOT_FOUND + node_modules | Corrupted install | bash_exec: rm -rf node_modules && npm install |
-ALWAYS check package.json before editing imports. If a module isn't a local file, use bash_exec to install it.
-Rules:
-- Read files before modifying them
-- Make minimal, targeted changes
-- Use bash_exec for operational fixes (npm install, chmod, config creation)
-- When done, call the "done" tool with a summary
-Project root: ${this.cwd}
-Primary crash file: ${primaryFile}`;
+- web_fetch: Fetch a URL (docs, npm packages, error solutions)
+## DIAGNOSIS FLOWCHART — follow this order:
+1. READ THE ERROR CAREFULLY — what type of problem is this?
+2. If no file path: use glob_files, grep_code, list_dir to investigate
+3. If file path: read_file to see the code, then investigate related files
+## ERROR → FIX STRATEGY TABLE
+| Error Pattern | Category | Diagnostic Steps | Fix |
+|---|---|---|---|
+| Cannot find module 'X' | DEPENDENCY | check package.json | bash_exec: npm install X |
+| Cannot find module './X' | IMPORT | glob_files to find real path | edit_file: fix require path |
+| ENOENT: no such file | FILE MISSING | list_dir to check structure | write_file or move_file |
+| EACCES/EPERM | PERMISSION | bash_exec: ls -la | bash_exec: chmod 755 |
+| EADDRINUSE | PORT | check_port to find blocker | bash_exec: kill PID, or edit config |
+| ECONNREFUSED | SERVICE DOWN | check if DB/service is running | bash_exec: start service |
+| SyntaxError | CODE | read_file to see context | edit_file: fix syntax |
+| TypeError/ReferenceError | CODE | read_file + grep_code | edit_file: fix logic |
+| ER_NO_SUCH_TABLE | DATABASE | inspect_db: tables | run_db_fix: CREATE TABLE or bash_exec migration |
+| SQLITE_ERROR/CONSTRAINT | DATABASE | inspect_db: schema + query | run_db_fix: UPDATE/ALTER |
+| Invalid JSON | CONFIG | read_file the JSON | edit_file: fix JSON syntax |
+| ENOMEM / heap out of memory | RESOURCE | check_env for NODE_OPTIONS | edit config or bash_exec: increase limit |
+| Missing env variable | CONFIG | check_env | write_file .env or edit config |
+| Wrong file location | STRUCTURE | list_dir + glob_files | move_file to correct location |
+| Corrupted node_modules | DEPENDENCY | bash_exec: ls node_modules | bash_exec: rm -rf node_modules && npm install |
+| Git conflict markers | CODE | grep_code: <<<<<<< | edit_file: resolve conflicts |
+## RULES
+1. INVESTIGATE FIRST — never guess. Read files, check directories, inspect databases before fixing.
+2. Read files before modifying them. Check package.json before editing imports.
+3. Make minimal, targeted changes — fix the root cause, not symptoms.
+4. Use the right tool: bash_exec for operational fixes, edit_file for code, run_db_fix for data.
+5. You can edit ANY file type: .js, .json, .sql, .yaml, .env, .toml, .sh, .dockerfile, etc.
+6. If the error has no file path, USE YOUR TOOLS to find the problem (glob, grep, list_dir, inspect_db).
+7. When done, call the "done" tool with a summary of what you found and fixed.
+Project root: ${this.cwd}${primaryFile ? `\nPrimary crash file: ${primaryFile}` : ""}`;
+    // Build user message — handle cases with and without a specific file
+    let userContent = `The server has an error:\n\n**Error:** ${errorMessage}\n\n**Stack Trace:**\n\`\`\`\n${stackTrace}\n\`\`\``;
+    if (primaryFile && sourceCode) {
+      userContent += `\n\n**Primary file (${primaryFile}):**\n\`\`\`\n${sourceCode}\n\`\`\``;
+    } else if (!primaryFile) {
+      userContent += `\n\n**No specific file identified.** Use your investigation tools (glob_files, grep_code, list_dir, inspect_db, check_env, check_port) to find the root cause.`;
+    }
+    if (brainContext) userContent += `\n\n**Context from Wolverine Brain:**\n${brainContext}`;
+    userContent += `\n\nDiagnose the root cause, investigate with your tools, and fix the issue.`;
     this.messages = [
       { role: "system", content: systemPrompt },
-      {
-        role: "user",
-        content: `The server crashed with this error:\n\n**Error:** ${errorMessage}\n\n**Stack Trace:**\n\`\`\`\n${stackTrace}\n\`\`\`\n\n**Primary file (${primaryFile}):**\n\`\`\`\n${sourceCode}\n\`\`\`${brainContext ? `\n\n**Context from Wolverine Brain:**\n${brainContext}` : ""}\n\nAnalyze the error, explore any related files you need, and fix the issue. Use your tools.`,
-      },
+      { role: "user", content: userContent },
     ];
-    this.filesRead.add(primaryFile);
+    if (primaryFile) this.filesRead.add(primaryFile);
     // Merge MCP tools with built-in tools
     const allTools = [...TOOL_DEFINITIONS];
@@ -411,6 +524,12 @@ Primary crash file: ${primaryFile}`;
       case "git_log":       return this._gitLog(args);
       case "git_diff":      return this._gitDiff(args);
       case "web_fetch":     return this._webFetch(args);
+      case "list_dir":      return this._listDir(args);
+      case "move_file":     return this._moveFile(args);
+      case "check_port":    return this._checkPort(args);
+      case "check_env":     return this._checkEnv(args);
+      case "inspect_db":    return this._inspectDb(args);
+      case "run_db_fix":    return this._runDbFix(args);
       case "done":          return this._done(args);
       // Legacy aliases
       case "list_files":    return this._globFiles({ pattern: (args.dir || ".") + "/*" + (args.pattern || "") });
@@ -690,6 +809,129 @@ Primary crash file: ${primaryFile}`;
   // ── COMPLETION ──
+  // ── DIAGNOSTIC TOOLS ──
+  _listDir(args) {
+    const dirPath = path.resolve(this.cwd, args.path || ".");
+    try {
+      const entries = fs.readdirSync(dirPath, { withFileTypes: true });
+      const lines = entries.map(e => {
+        try {
+          const stat = fs.statSync(path.join(dirPath, e.name));
+          const size = e.isDirectory() ? "DIR" : `${Math.round(stat.size / 1024)}KB`;
+          return `${e.isDirectory() ? "📁" : "📄"} ${e.name} (${size})`;
+        } catch { return `${e.name} (?)` ; }
+      });
+      console.log(chalk.gray(`    📁 Listed ${lines.length} entries in ${args.path || "."}`));
+      return { content: lines.join("\n") || "(empty directory)" };
+    } catch (e) { return { content: `Error: ${e.message}` }; }
+  }
+  _moveFile(args) {
+    if (this._isProtectedPath(args.from) || this._isProtectedPath(args.to)) {
+      return { content: "BLOCKED: Cannot move protected files" };
+    }
+    const from = path.resolve(this.cwd, args.from);
+    const to = path.resolve(this.cwd, args.to);
+    try {
+      fs.mkdirSync(path.dirname(to), { recursive: true });
+      fs.renameSync(from, to);
+      this.filesModified.push(args.to);
+      console.log(chalk.green(`    📦 Moved: ${args.from} → ${args.to}`));
+      return { content: `Moved ${args.from} → ${args.to}` };
+    } catch (e) { return { content: `Error moving: ${e.message}` }; }
+  }
+  _checkPort(args) {
+    const port = args.port;
+    try {
+      const platform = process.platform;
+      let cmd;
+      if (platform === "win32") {
+        cmd = `netstat -ano | findstr :${port}`;
+      } else {
+        cmd = `lsof -i :${port} 2>/dev/null || ss -tlnp 2>/dev/null | grep :${port}`;
+      }
+      const result = execSync(cmd, { timeout: 5000, stdio: "pipe" }).toString().trim();
+      console.log(chalk.gray(`    🔌 Port ${port}: ${result ? "IN USE" : "free"}`));
+      return { content: result || `Port ${port} is free` };
+    } catch { return { content: `Port ${port} appears free (no listeners found)` }; }
+  }
+  _checkEnv(args) {
+    const { redact } = require("../security/secret-redactor");
+    if (args.variable) {
+      const val = process.env[args.variable];
+      const display = val ? redact(val) : "(not set)";
+      return { content: `${args.variable}=${display}` };
+    }
+    // List all env vars with redacted values
+    const keys = Object.keys(process.env).sort();
+    const lines = keys.map(k => {
+      const val = process.env[k];
+      return `${k}=${val && val.length > 50 ? "(set, " + val.length + " chars)" : redact(val || "")}`;
+    });
+    return { content: lines.join("\n") };
+  }
+  _inspectDb(args) {
+    const dbPath = path.resolve(this.cwd, args.db_path);
+    try {
+      let Database;
+      try { Database = require("better-sqlite3"); } catch {
+        return { content: "better-sqlite3 not installed. Run: npm install better-sqlite3" };
+      }
+      const db = new Database(dbPath, { readonly: true });
+      let result;
+      if (args.action === "tables") {
+        const tables = db.prepare("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name").all();
+        result = tables.map(t => t.name).join("\n") || "(no tables)";
+      } else if (args.action === "schema") {
+        const schemas = db.prepare("SELECT sql FROM sqlite_master WHERE type='table' AND sql IS NOT NULL").all();
+        result = schemas.map(s => s.sql).join("\n\n") || "(no tables)";
+      } else if (args.action === "query") {
+        if (!args.sql) return { content: "Error: sql required for query action" };
+        const upper = args.sql.trim().toUpperCase();
+        if (!upper.startsWith("SELECT") && !upper.startsWith("PRAGMA")) {
+          return { content: "BLOCKED: inspect_db only allows SELECT/PRAGMA. Use run_db_fix for writes." };
+        }
+        const rows = db.prepare(args.sql).all();
+        result = JSON.stringify(rows.slice(0, 50), null, 2);
+        if (rows.length > 50) result += `\n... (${rows.length} total rows, showing first 50)`;
+      } else {
+        result = "Unknown action. Use: tables, schema, or query";
+      }
+      db.close();
+      const { redact } = require("../security/secret-redactor");
+      console.log(chalk.gray(`    🗃️ DB ${args.action}: ${args.db_path}`));
+      return { content: redact(result) };
+    } catch (e) { return { content: `DB error: ${e.message}` }; }
+  }
+  _runDbFix(args) {
+    const dbPath = path.resolve(this.cwd, args.db_path);
+    try {
+      let Database;
+      try { Database = require("better-sqlite3"); } catch {
+        return { content: "better-sqlite3 not installed. Run: npm install better-sqlite3" };
+      }
+      // Block dangerous operations
+      const upper = args.sql.trim().toUpperCase();
+      if (upper.startsWith("DROP DATABASE") || upper.includes("DROP TABLE sqlite_")) {
+        return { content: "BLOCKED: Cannot drop system tables" };
+      }
+      // Backup the DB file first
+      const backupPath = dbPath + ".wolverine-backup";
+      fs.copyFileSync(dbPath, backupPath);
+      const db = new Database(dbPath);
+      const result = db.prepare(args.sql).run();
+      db.close();
+      this.filesModified.push(args.db_path);
+      console.log(chalk.green(`    🗃️ DB fix applied: ${args.sql.slice(0, 60)} (changes: ${result.changes})`));
+      return { content: `SQL executed. Changes: ${result.changes}. Backup at: ${backupPath}` };
+    } catch (e) { return { content: `DB error: ${e.message}` }; }
+  }
   _done(args) {
     console.log(chalk.green(`    ✅ Agent done: ${args.summary}`));
     if (this.logger) {
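
The two SQL guards in `_inspectDb` and `_runDbFix` above could be pulled out as small pure functions, which would make them easy to test in isolation. The sketch below is an assumption about how that might look, not code from the package: `isReadOnlySql` and `isBlockedWrite` are hypothetical names. One detail from the hunk is worth noting: `_runDbFix` upper-cases the statement and then searches it for the mixed-case literal `"DROP TABLE sqlite_"`, which an upper-cased string cannot contain, so the sketch compares against an upper-case literal instead. A prefix check of this kind is a coarse filter, not a SQL parser.

```javascript
// Hypothetical helpers; the names are illustrative and not part of wolverine-ai.

// True when the statement starts with SELECT or PRAGMA (the rule inspect_db applies).
function isReadOnlySql(sql) {
  const upper = String(sql).trim().toUpperCase();
  return upper.startsWith("SELECT") || upper.startsWith("PRAGMA");
}

// True when a statement drops a database or one of SQLite's internal tables.
// Both sides of the comparison are upper-case, so the check is case-insensitive.
function isBlockedWrite(sql) {
  const upper = String(sql).trim().toUpperCase();
  return upper.startsWith("DROP DATABASE") || upper.includes("DROP TABLE SQLITE_");
}
```

With these in place, `isReadOnlySql("  select * from users")` is true and `isBlockedWrite("drop table sqlite_master")` is true, while an ordinary `UPDATE` passes both checks as a write that is allowed.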

package/src/agent/sub-agents.js CHANGED

@@ -23,13 +23,13 @@ const { getModel } = require("../core/models");
 // Tool restrictions per agent type (claw-code: allowed_tools_for_subagent)
 const AGENT_TOOL_SETS = {
-  explore: ["read_file", "glob_files", "grep_code", "git_log", "git_diff", "done"],
-  plan: ["read_file", "glob_files", "grep_code", "search_brain", "done"],
-  fix: ["read_file", "write_file", "edit_file", "glob_files", "grep_code", "bash_exec", "done"],
-  verify: ["read_file", "glob_files", "grep_code", "bash_exec", "done"],
+  explore: ["read_file", "glob_files", "grep_code", "git_log", "git_diff", "list_dir", "check_env", "check_port", "inspect_db", "done"],
+  plan: ["read_file", "glob_files", "grep_code", "list_dir", "inspect_db", "check_env", "search_brain", "done"],
+  fix: ["read_file", "write_file", "edit_file", "glob_files", "grep_code", "bash_exec", "move_file", "run_db_fix", "done"],
+  verify: ["read_file", "glob_files", "grep_code", "bash_exec", "inspect_db", "check_port", "done"],
   research: ["read_file", "grep_code", "web_fetch", "search_brain", "done"],
-  security: ["read_file", "glob_files", "grep_code", "done"],
-  database: ["read_file", "write_file", "edit_file", "glob_files", "grep_code", "bash_exec", "done"],
+  security: ["read_file", "glob_files", "grep_code", "inspect_db", "done"],
+  database: ["read_file", "write_file", "edit_file", "glob_files", "grep_code", "bash_exec", "inspect_db", "run_db_fix", "done"],
 };
 // Default model + budget per agent type
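
The per-agent allow-lists above imply a filtering step between the full `TOOL_DEFINITIONS` array and what each sub-agent is offered. The code that performs that step is not part of this hunk, so the following is a minimal sketch under stated assumptions: `toolsForAgent` is a hypothetical helper, and the three tool definitions are trimmed stand-ins in the `{ type, function: { name } }` shape shown in the `agent-engine.js` diff.

```javascript
// Trimmed stand-ins for the real definitions (same shape as in agent-engine.js).
const TOOL_DEFINITIONS = [
  { type: "function", function: { name: "read_file" } },
  { type: "function", function: { name: "run_db_fix" } },
  { type: "function", function: { name: "done" } },
];

// Two of the allow-lists, copied from the 1.5.0 side of the hunk above.
const AGENT_TOOL_SETS = {
  security: ["read_file", "glob_files", "grep_code", "inspect_db", "done"],
  database: ["read_file", "write_file", "edit_file", "glob_files", "grep_code", "bash_exec", "inspect_db", "run_db_fix", "done"],
};

// Hypothetical helper: keep only the definitions an agent type is allowed to call.
// Unknown agent types get an empty tool list rather than the full set.
function toolsForAgent(agentType, definitions = TOOL_DEFINITIONS) {
  const allowed = new Set(AGENT_TOOL_SETS[agentType] || []);
  return definitions.filter((def) => allowed.has(def.function.name));
}
```

Defaulting to an empty list for unknown agent types is a design choice of this sketch: it fails closed, so a typo in an agent name cannot accidentally grant write tools such as `run_db_fix`.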

package/src/brain/brain.js CHANGED

@@ -211,6 +211,22 @@ const SEED_DOCS = [
     text: "Agent fix strategy table: the agent system prompt includes a decision table mapping error patterns to correct fix actions. Cannot find module 'X' (package) → bash_exec: npm install X. Cannot find module './X' (local) → edit_file: fix require path. ENOENT → write_file: create missing file. EACCES → bash_exec: chmod. EADDRINUSE → bash_exec: kill process. SyntaxError → edit_file: fix code. TypeError → edit_file: fix logic. MODULE_NOT_FOUND + node_modules → bash_exec: rm -rf node_modules && npm install. The fast path AI response format now supports both 'changes' (code edits) and 'commands' (shell commands like npm install). Dangerous commands blocked: rm -rf /, format, mkfs.",
     metadata: { topic: "agent-fix-strategy" },
   },
+  {
+    text: "Error Monitor: detects caught 500 errors that don't crash the process. Most production bugs are caught by Fastify/Express error handlers — the server stays alive but routes return 500. Wolverine's crash-based heal pipeline never triggers for these. ErrorMonitor tracks 5xx errors per route via IPC from child process. After N consecutive 500s within a time window (default: 3 failures in 30s), triggers the heal pipeline without killing the server. Error hook auto-injected via --require preload (no user code changes). Cooldown prevents heal spam (default: 60s per route). Stats available in dashboard and telemetry. Config: WOLVERINE_ERROR_THRESHOLD, WOLVERINE_ERROR_WINDOW_MS, WOLVERINE_ERROR_COOLDOWN_MS.",
+    metadata: { topic: "error-monitor" },
+  },
+  {
+    text: "Agent tool harness v2: 16 built-in tools. FILE: read_file, write_file, edit_file, glob_files, grep_code, list_dir, move_file. SHELL: bash_exec, git_log, git_diff. DATABASE: inspect_db (list tables, show schema, run SELECT), run_db_fix (UPDATE/DELETE/INSERT/ALTER with auto-backup). DIAGNOSTICS: check_port (find what's using a port), check_env (list/check env vars, values redacted). RESEARCH: web_fetch. COMPLETION: done. Sub-agents get restricted sets: explorer gets diagnostics (list_dir, check_env, check_port, inspect_db), fixer gets action tools (bash_exec, move_file, run_db_fix), verifier gets inspection tools.",
+    metadata: { topic: "agent-tools-v2" },
+  },
+  {
+    text: "Server problem categories the agent can fix: CODE BUGS (SyntaxError, TypeError, ReferenceError → edit_file), DEPENDENCIES (Cannot find module → npm install, corrupted node_modules → rm + reinstall), DATABASE (invalid entries → run_db_fix UPDATE, missing table → CREATE TABLE, schema mismatch → ALTER TABLE, constraint violation → fix data or schema), CONFIG (invalid JSON → edit_file, missing env vars → write .env, wrong port → edit config), FILESYSTEM (misplaced files → move_file, missing directories → bash_exec mkdir, wrong permissions → chmod), NETWORK (port conflict → check_port + kill, service down → restart, connection refused → check config), STATE (corrupted cache → delete + restart, stale locks → remove lock file, git conflicts → resolve markers). The agent investigates before fixing — reads files, checks directories, inspects databases, never guesses.",
+    metadata: { topic: "server-problems" },
+  },
+  {
+    text: "Heal pipeline no longer requires a file path. When no file is identified from the error (database errors, config problems, port conflicts), the pipeline skips fast path and goes straight to the agent, which uses investigation tools (glob_files, grep_code, list_dir, inspect_db, check_env, check_port) to find the root cause. Agent verification for no-file errors: if agent made changes or ran commands, trust the agent's assessment. For file-based errors, verification uses syntax check + boot probe as before.",
+    metadata: { topic: "fileless-heal" },
+  },
 ];
 class Brain {

package/src/core/error-hook.js ADDED

@@ -0,0 +1,127 @@
+/**
+ * Error Hook — preloaded into the child server process via --require.
+ *
+ * Patches Fastify and Express error handlers to report caught errors
+ * back to the Wolverine parent process via IPC. This enables healing
+ * of 500 errors that don't crash the process.
+ *
+ * How it works:
+ * 1. Runner spawns child with: node --require ./src/core/error-hook.js server/index.js
+ * 2. This file hooks into Module._load to intercept fastify/express creation
+ * 3. When a framework instance is created, we add an error handler that sends IPC messages
+ * 4. Parent's ErrorMonitor receives the messages and triggers heal after threshold
+ *
+ * Zero changes to user's server code.
+ */
+const Module = require("module");
+const originalLoad = Module._load;
+let _hooked = false;
+Module._load = function (request, parent, isMain) {
+  const result = originalLoad.apply(this, arguments);
+  // Hook Fastify
+  if (request === "fastify" && typeof result === "function" && !_hooked) {
+    const originalFastify = result;
+    const wrapped = function (...args) {
+      const instance = originalFastify(...args);
+      _hookFastify(instance);
+      return instance;
+    };
+    // Preserve all properties (fastify.default, etc.)
+    Object.keys(originalFastify).forEach((key) => {
+      wrapped[key] = originalFastify[key];
+    });
+    _hooked = true;
+    return wrapped;
+  }
+  // Hook Express
+  if (request === "express" && typeof result === "function" && !_hooked) {
+    const originalExpress = result;
+    const wrapped = function (...args) {
+      const app = originalExpress(...args);
+      _hookExpress(app);
+      return app;
+    };
+    Object.keys(originalExpress).forEach((key) => {
+      wrapped[key] = originalExpress[key];
+    });
+    _hooked = true;
+    return wrapped;
+  }
+  return result;
+};
+function _hookFastify(fastify) {
+  // Use onReady to add hooks after all plugins are loaded
+  fastify.addHook("onReady", function (done) {
+    // Add a global error handler that reports to parent
+    fastify.addHook("onError", function (request, reply, error, done) {
+      _reportError(request.url, request.method, error);
+      done();
+    });
+    done();
+  });
+  // Also intercept the setErrorHandler if user sets one
+  const originalSetError = fastify.setErrorHandler.bind(fastify);
+  fastify.setErrorHandler = function (handler) {
+    return originalSetError(function (error, request, reply) {
+      _reportError(request.url, request.method, error);
+      return handler(error, request, reply);
+    });
+  };
+}
+function _hookExpress(app) {
+  // For Express, we wrap app.listen so that our error middleware is
+  // registered last, after all of the user's middleware and routes
+  const originalListen = app.listen.bind(app);
+  app.listen = function (...args) {
+    // Add our error handler AFTER all user middleware
+    app.use(function wolverineErrorHandler(err, req, res, next) {
+      _reportError(req.originalUrl || req.url, req.method, err);
+      next(err);
+    });
+    return originalListen(...args);
+  };
+}
+function _reportError(url, method, error) {
+  if (!process.send) return; // No IPC channel — not spawned by wolverine
+  try {
+    // Extract file/line from stack trace
+    let file = null;
+    let line = null;
+    if (error && error.stack) {
+      const stackLines = error.stack.split("\n");
+      for (const sl of stackLines) {
+        const match = sl.match(/\(([^)]+):(\d+):(\d+)\)/) || sl.match(/at\s+([^\s(]+):(\d+):(\d+)/);
+        if (match && !match[1].includes("node_modules") && !match[1].includes("node:")) {
+          file = match[1];
+          line = parseInt(match[2], 10);
+          break;
+        }
+      }
+    }
+    process.send({
+      type: "route_error",
+      path: url,
+      method: method || "GET",
+      statusCode: 500,
+      message: error?.message || "Unknown error",
+      stack: error?.stack?.slice(0, 2000) || "",
+      file,
+      line,
+      timestamp: Date.now(),
+    });
+  } catch {
+    // Silently fail — don't break the server for IPC issues
+  }
+}
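
The frame-selection loop in `_reportError` can be tried on its own. The following is a minimal sketch, assuming a hypothetical helper name (`firstUserFrame`) and a made-up stack trace; it reuses the two regular expressions from the hook and skips frames under `node_modules` or `node:` internals in the same way.

```javascript
// Hypothetical helper (not part of the package) that mirrors the
// stack-frame selection loop inside _reportError.
function firstUserFrame(stack) {
  for (const sl of String(stack || "").split("\n")) {
    // Same two patterns as the hook: "at fn (file:line:col)" and "at file:line:col".
    const match =
      sl.match(/\(([^)]+):(\d+):(\d+)\)/) || sl.match(/at\s+([^\s(]+):(\d+):(\d+)/);
    if (match && !match[1].includes("node_modules") && !match[1].includes("node:")) {
      return { file: match[1], line: parseInt(match[2], 10) };
    }
  }
  return { file: null, line: null };
}

// Made-up stack trace for illustration only.
const sampleStack = [
  "Error: connection refused",
  "    at Pool.connect (/app/node_modules/pg-pool/index.js:45:11)",
  "    at node:internal/process/task_queues:95:5",
  "    at getUsers (/app/server/routes/users.js:12:20)",
  "    at /app/server/index.js:30:3",
].join("\n");

const frame = firstUserFrame(sampleStack);
console.log(frame); // { file: '/app/server/routes/users.js', line: 12 }
```

The first frame that is neither a dependency nor a Node internal wins, so the reported `file` points at user code even when the error originates inside a library.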

package/src/core/runner.js CHANGED

@@ -20,6 +20,7 @@ const { ProcessMonitor } = require("../monitor/process-monitor");
 const { RouteProber } = require("../monitor/route-prober");
 const { startHeartbeat, stopHeartbeat } = require("../platform/heartbeat");
 const { Notifier } = require("../notifications/notifier");
+const { ErrorMonitor } = require("../monitor/error-monitor");
 /**
  * The Wolverine process runner — v3.
@@ -90,6 +91,15 @@ class WolverineRunner {
       brain: this.brain,
     });
+    // Error monitor — detects caught 500 errors without process crash
+    this.errorMonitor = new ErrorMonitor({
+      threshold: parseInt(process.env.WOLVERINE_ERROR_THRESHOLD, 10) || 3,
+      windowMs: parseInt(process.env.WOLVERINE_ERROR_WINDOW_MS, 10) || 30000,
+      cooldownMs: parseInt(process.env.WOLVERINE_ERROR_COOLDOWN_MS, 10) || 60000,
+      logger: this.logger,
+      onError: (routePath, errorDetails) => this._healFromError(routePath, errorDetails),
+    });
     // Brain — semantic memory + project context
     this.brain = new Brain(this.cwd);
@@ -120,6 +130,7 @@ class WolverineRunner {
       repairHistory: this.repairHistory,
       processMonitor: this.processMonitor,
       routeProber: this.routeProber,
+      errorMonitor: this.errorMonitor,
     });
     // Stability tracking
@@ -287,10 +298,13 @@ class WolverineRunner {
     this._stderrBuffer = "";
     this._lastStartTime = Date.now();
-    this.child = spawn("node", [this.scriptPath], {
+    // Spawn with --require error-hook.js for IPC error reporting
+    // The error hook auto-patches Fastify/Express to report caught 500s
+    const errorHookPath = path.join(__dirname, "error-hook.js");
+    this.child = spawn("node", ["--require", errorHookPath, this.scriptPath], {
       cwd: this.cwd,
       env: { ...process.env },
-      stdio: ["inherit", "inherit", "pipe"],
+      stdio: ["inherit", "inherit", "pipe", "ipc"],
     });
     this.child.stderr.on("data", (data) => {
@@ -367,6 +381,30 @@ class WolverineRunner {
       this.logger.error(EVENT_TYPES.PROCESS_CRASH, `Failed to start: ${err.message}`);
       this.running = false;
     });
+    // IPC channel: child reports caught 500 errors (Fastify/Express)
+    this.child.on("message", (msg) => {
+      if (msg && msg.type === "route_error") {
+        const { redact } = require("../security/secret-redactor");
+        const safeMsg = redact(msg.message || "");
+        const safeStack = redact(msg.stack || "");
+        console.log(chalk.yellow(`  🔍 Caught error on ${msg.method} ${msg.path}: ${safeMsg.slice(0, 100)}`));
+        this.logger.warn("error_monitor.caught", `${msg.method} ${msg.path} → 500: ${safeMsg.slice(0, 200)}`, {
+          route: msg.path, method: msg.method, file: msg.file, line: msg.line,
+        });
+        this.errorMonitor.record(msg.path, msg.statusCode || 500, {
+          message: safeMsg,
+          stack: safeStack,
+          file: msg.file,
+          line: msg.line,
+          path: msg.path,
+          method: msg.method,
+        });
+      }
+    });
+    // Reset error monitor on new spawn
+    this.errorMonitor.reset();
   }
   async _healAndRestart() {
@@ -432,6 +470,55 @@ class WolverineRunner {
     }
   }
+  /**
+   * Heal from a caught 500 error (ErrorMonitor threshold reached).
+   * Unlike crash healing, the server is still running — we heal and restart.
+   */
+  async _healFromError(routePath, errorDetails) {
+    if (this._healInProgress || this._shuttingDown) return;
+    this._healInProgress = true;
+    console.log(chalk.yellow(`\n🐺 Wolverine healing caught error on ${routePath}...`));
+    this.logger.info("heal.error_monitor", `Healing caught 500 on ${routePath}`, { route: routePath });
+    // Build a synthetic stderr from the error details
+    const stderr = [
+      errorDetails.message || "Unknown error",
+      errorDetails.stack || "",
+      errorDetails.file ? `    at ${errorDetails.file}:${errorDetails.line || 0}` : "",
+    ].filter(Boolean).join("\n");
+    try {
+      const result = await heal({
+        stderr,
+        cwd: this.cwd,
+        sandbox: this.sandbox,
+        redactor: this.redactor,
+        notifier: this.notifier,
+        rateLimiter: this.rateLimiter,
+        backupManager: this.backupManager,
+        logger: this.logger,
+        brain: this.brain,
+        mcp: this.mcp,
+        skills: this.skills,
+        repairHistory: this.repairHistory,
+      });
+      if (result.healed) {
+        console.log(chalk.green(`\n🐺 Wolverine healed ${routePath} via ${result.mode}! Restarting...\n`));
+        this.errorMonitor.clearRoute(routePath);
+        this._healInProgress = false;
+        this.restart();
+      } else {
+        console.log(chalk.red(`\n🐺 Could not heal ${routePath}: ${result.explanation}`));
+        this._healInProgress = false;
+      }
+    } catch (err) {
+      console.log(chalk.red(`\n🐺 Error during heal: ${err.message}`));
+      this._healInProgress = false;
+    }
+  }
   _startStabilityTimer() {
     this._clearStabilityTimer();
     this._stabilityTimer = setTimeout(() => {

package/src/core/wolverine.js CHANGED

@@ -47,30 +47,25 @@ async function heal({ stderr, cwd, sandbox, notifier, rateLimiter, backupManager
   if (logger) logger.debug(EVENT_TYPES.HEAL_PARSE, `Parsed: ${parsed.errorMessage}`, { file: parsed.filePath, line: parsed.line });
-  if (!parsed.filePath) {
-    console.log(chalk.red("  Could not identify the source file from the error. Skipping repair."));
-    if (logger) logger.error(EVENT_TYPES.HEAL_FAILED, "Could not parse file path from error");
-    return { healed: false, explanation: "Could not parse file path from error" };
-  }
-  // 2. Sandbox check
-  try {
-    sandbox.resolve(parsed.filePath);
-  } catch (e) {
-    if (e instanceof SandboxViolationError) {
-      console.log(chalk.red(`  🔒 SANDBOX: ${e.message}`));
-      if (logger) logger.error(EVENT_TYPES.SECURITY_SANDBOX_VIOLATION, e.message, { file: parsed.filePath });
-      return { healed: false, explanation: "File outside sandbox — access denied" };
+  // File path is optional — some errors (database, config, port) don't trace to a file.
+  // When no file is found, skip fast path and go straight to agent investigation.
+  let hasFile = false;
+  if (parsed.filePath) {
+    // 2. Sandbox check
+    try {
+      sandbox.resolve(parsed.filePath);
+      hasFile = sandbox.exists(parsed.filePath);
+    } catch (e) {
+      if (e instanceof SandboxViolationError) {
+        console.log(chalk.red(`  🔒 SANDBOX: ${e.message}`));
+        if (logger) logger.error(EVENT_TYPES.SECURITY_SANDBOX_VIOLATION, e.message, { file: parsed.filePath });
+        return { healed: false, explanation: "File outside sandbox — access denied" };
+      }
+      throw e;
     }
-    throw e;
-  }
-  if (!sandbox.exists(parsed.filePath)) {
-    console.log(chalk.red(`  Source file not found: ${parsed.filePath}`));
-    return { healed: false, explanation: "Source file not found" };
   }
-  console.log(chalk.cyan(`  File:  ${parsed.filePath}`));
+  console.log(chalk.cyan(`  File:  ${parsed.filePath || "(no file — agent will investigate)"}`));
   console.log(chalk.cyan(`  Line:  ${parsed.line || "unknown"}`));
   console.log(chalk.cyan(`  Error: ${parsed.errorMessage}`));
   console.log(chalk.cyan(`  Type:  ${parsed.errorType || "unknown"}`));
@@ -130,8 +125,8 @@ async function heal({ stderr, cwd, sandbox, notifier, rateLimiter, backupManager
     return { healed: true, explanation: opsFix.action, mode: "operational" };
   }
-  // 5. Read the source file + get brain context
-  const sourceCode = sandbox.readFile(parsed.filePath);
+  // 5. Read the source file (if available) + get brain context
+  const sourceCode = hasFile ? sandbox.readFile(parsed.filePath) : "";
   let brainContext = "";
   // Inject relevant skill context (claw-code: pre-enrich prompt with matched tools)
@@ -175,15 +170,16 @@ async function heal({ stderr, cwd, sandbox, notifier, rateLimiter, backupManager
     onAttempt: async (iteration, researchCtx) => {
       // Create backup for this attempt
       // Full server/ backup — includes all files, configs, databases
-      const bid = backupManager.createBackup(null);
+      const bid = backupManager.createBackup(`heal attempt ${iteration}: ${parsed.errorMessage.slice(0, 60)}`);
       backupManager.setErrorSignature(bid, errorSignature);
       if (logger) logger.info(EVENT_TYPES.BACKUP_CREATED, `Backup ${bid} (iteration ${iteration})`, { backupId: bid });
       const fullContext = [brainContext, researchContext, researchCtx].filter(Boolean).join("\n");
       let result;
-      if (iteration === 1) {
+      if (iteration === 1 && hasFile) {
         // Fast path — CODING_MODEL, single file + optional commands
+        // Only available when we have a specific file to fix
         console.log(chalk.yellow(`  🧠 Fast path (${getModel("coding")})...`));
         try {
           const repair = await requestRepair({
@@ -235,8 +231,8 @@ async function heal({ stderr, cwd, sandbox, notifier, rateLimiter, backupManager
           backupManager.rollbackTo(bid);
           return { healed: false, explanation: `Fast path error: ${err.message}` };
         }
-      } else if (iteration === 2) {
-        // Iteration 2: Single agent — REASONING_MODEL
+      } else if (iteration <= 2) {
+        // Agent path — REASONING_MODEL (also handles iteration 1 when no file)
         console.log(chalk.magenta(`  🤖 Agent path (${getModel("reasoning")})...`));
         const agent = new AgentEngine({
           sandbox, logger, cwd, mcp,
@@ -251,9 +247,17 @@ async function heal({ stderr, cwd, sandbox, notifier, rateLimiter, backupManager
         });
         rateLimiter.record(errorSignature, agentResult.totalTokens);
-        if (agentResult.success && agentResult.filesModified.length > 0) {
-          const verification = await verifyFix(parsed.filePath, cwd, errorSignature);
-          if (verification.verified) {
+        if (agentResult.success) {
+          // Verify: if we have a file, do syntax + boot check. Otherwise trust the agent if it changed files or ran commands.
+          if (hasFile) {
+            const verification = await verifyFix(parsed.filePath, cwd, errorSignature);
+            if (verification.verified) {
+              backupManager.markVerified(bid);
+              rateLimiter.clearSignature(errorSignature);
+              return { healed: true, explanation: agentResult.summary, backupId: bid, mode: "agent", agentStats: agentResult };
+            }
+          } else if (agentResult.filesModified.length > 0 || agentResult.toolCalls?.some(t => t.name === "bash_exec")) {
+            // No specific file but agent made changes or ran commands — trust it
             backupManager.markVerified(bid);
             rateLimiter.clearSignature(errorSignature);
             return { healed: true, explanation: agentResult.summary, backupId: bid, mode: "agent", agentStats: agentResult };
@@ -267,14 +271,20 @@ async function heal({ stderr, cwd, sandbox, notifier, rateLimiter, backupManager
         console.log(chalk.magenta(`  🤖 Sub-agent path (explore → plan → fix)...`));
         const subResult = await exploreAndFix(
-          `Error: ${parsed.errorMessage}\nFile: ${parsed.filePath}\nStack: ${parsed.stackTrace?.slice(0, 300)}`,
+          `Error: ${parsed.errorMessage}\n${parsed.filePath ? "File: " + parsed.filePath + "\n" : ""}Stack: ${parsed.stackTrace?.slice(0, 300)}`,
           { sandbox, logger, cwd, mcp, brainContext: fullContext }
         );
         rateLimiter.record(errorSignature, subResult.totalTokens);
-        if (subResult.success && subResult.filesModified.length > 0) {
-          const verification = await verifyFix(parsed.filePath, cwd, errorSignature);
-          if (verification.verified) {
+        if (subResult.success) {
+          if (hasFile) {
+            const verification = await verifyFix(parsed.filePath, cwd, errorSignature);
+            if (verification.verified) {
+              backupManager.markVerified(bid);
+              rateLimiter.clearSignature(errorSignature);
+              return { healed: true, explanation: subResult.summary, backupId: bid, mode: "sub-agents", agentStats: subResult };
+            }
+          } else {
             backupManager.markVerified(bid);
             rateLimiter.clearSignature(errorSignature);
             return { healed: true, explanation: subResult.summary, backupId: bid, mode: "sub-agents", agentStats: subResult };
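
The branching introduced in this file can be summarized as two small decision functions. This is a sketch only: `choosePath` and `agentVerification` are hypothetical names, and the `"not-accepted"` label stands in for whatever the pipeline does when the agent's result is not trusted, which these hunks do not show.

```javascript
// Hypothetical summary of which repair path runs on each heal iteration.
function choosePath(iteration, hasFile) {
  if (iteration === 1 && hasFile) return "fast"; // single-file fix with the coding model
  if (iteration <= 2) return "agent";            // also covers iteration 1 when no file is known
  return "sub-agents";                           // explore → plan → fix
}

// Hypothetical summary of how a successful agent run is accepted.
function agentVerification(hasFile, agentResult) {
  if (hasFile) return "verifyFix"; // syntax check + boot probe on the known file
  const ranCommand = agentResult.toolCalls?.some((t) => t.name === "bash_exec") ?? false;
  return agentResult.filesModified.length > 0 || ranCommand ? "trust-agent" : "not-accepted";
}

console.log(choosePath(1, true));  // fast
console.log(choosePath(1, false)); // agent
console.log(choosePath(3, false)); // sub-agents
console.log(agentVerification(false, { filesModified: [], toolCalls: [{ name: "bash_exec" }] })); // trust-agent
```

One consequence visible here: a fileless error sends both iteration 1 and iteration 2 through the agent path, since the fast path needs a concrete file to patch.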

package/src/dashboard/server.js CHANGED

@@ -29,6 +29,7 @@ class DashboardServer {
     this.repairHistory = options.repairHistory;
     this.processMonitor = options.processMonitor;
     this.routeProber = options.routeProber;
+    this.errorMonitor = options.errorMonitor;
     this.auth = new AdminAuth();
     this._sseClients = new Set();
@@ -869,6 +870,7 @@ ${context ? "\nBrain:\n" + context : ""}`,
       session: this.logger ? this.logger.getSessionStats() : {},
       backups: this.backupManager ? this.backupManager.getStats() : {},
       health: this.healthMonitor ? this.healthMonitor.getStats() : {},
+      errorMonitor: this.errorMonitor ? this.errorMonitor.getStats() : {},
     }));
   }

package/src/index.js CHANGED

@@ -23,6 +23,7 @@ const { spawnAgent, spawnParallel, exploreAndFix } = require("./agent/sub-agents
 const { McpRegistry } = require("./mcp/mcp-registry");
 const { McpSecurity } = require("./mcp/mcp-security");
 const { PerfMonitor } = require("./monitor/perf-monitor");
+const { ErrorMonitor } = require("./monitor/error-monitor");
 const { DashboardServer } = require("./dashboard/server");
 const { Notifier } = require("./notifications/notifier");
 const { Brain } = require("./brain/brain");
@@ -72,6 +73,7 @@ module.exports = {
   McpSecurity,
   // Monitor
   PerfMonitor,
+  ErrorMonitor,
   // Dashboard
   DashboardServer,
   // Notifications

package/src/monitor/error-monitor.js ADDED

@@ -0,0 +1,121 @@
+const chalk = require("chalk");
+/**
+ * Error Monitor — detects caught 500 errors that don't crash the process.
+ *
+ * Most production bugs are caught by Fastify/Express error handlers.
+ * The server stays alive but routes return 500. Wolverine's crash-based
+ * heal pipeline never triggers. This module bridges that gap.
+ *
+ * Tracks 5xx errors per route. After N consecutive failures within
+ * a time window, triggers the heal pipeline — without killing the server.
+ *
+ * Error flow: server error handler → IPC message → ErrorMonitor.record()
+ *   → threshold reached → onError callback → heal()
+ */
+class ErrorMonitor {
+  constructor({ threshold = 3, windowMs = 30000, cooldownMs = 60000, onError, logger } = {}) {
+    this.threshold = threshold;     // consecutive 5xx before triggering heal
+    this.windowMs = windowMs;       // time window for counting errors
+    this.cooldownMs = cooldownMs;   // cooldown after triggering (prevent heal spam)
+    this.onError = onError;         // callback: (routePath, errorDetails) => heal()
+    this.logger = logger;
+    this.routes = new Map();        // path → { count, firstSeen, lastError }
+    this._cooldowns = new Map();    // path → timestamp of last heal trigger
+    this._totalErrors = 0;
+    this._totalHeals = 0;
+  }
+  /**
+   * Record a route response. Call on every response from the child.
+   * @param {string} routePath — e.g. "/api/users"
+   * @param {number} statusCode — HTTP status
+   * @param {object} errorDetails — { message, stack, file, line }
+   */
+  record(routePath, statusCode, errorDetails) {
+    if (statusCode < 500) {
+      // Success — reset the error counter for this route
+      if (this.routes.has(routePath)) {
+        this.routes.delete(routePath);
+      }
+      return;
+    }
+    this._totalErrors++;
+    // Check cooldown — don't trigger heal for same route too quickly
+    const lastHeal = this._cooldowns.get(routePath);
+    if (lastHeal && Date.now() - lastHeal < this.cooldownMs) {
+      return;
+    }
+    const entry = this.routes.get(routePath) || { count: 0, firstSeen: Date.now(), lastError: null };
+    entry.count++;
+    entry.lastError = errorDetails;
+    // Reset if outside time window
+    if (Date.now() - entry.firstSeen > this.windowMs) {
+      entry.count = 1;
+      entry.firstSeen = Date.now();
+    }
+    this.routes.set(routePath, entry);
+    if (entry.count >= this.threshold) {
+      this._totalHeals++;
+      console.log(chalk.yellow(`\n🔍 ErrorMonitor: ${routePath} failed ${entry.count}x in ${Math.round((Date.now() - entry.firstSeen) / 1000)}s — triggering heal`));
+      if (this.logger) {
+        this.logger.warn("error_monitor.threshold", `Route ${routePath} hit ${this.threshold} consecutive 500s`, {
+          route: routePath,
+          count: entry.count,
+          error: errorDetails?.message?.slice(0, 200),
+        });
+      }
+      // Set cooldown and reset counter
+      this._cooldowns.set(routePath, Date.now());
+      this.routes.delete(routePath);
+      // Trigger the heal callback
+      if (this.onError) {
+        this.onError(routePath, errorDetails);
+      }
+    }
+  }
+  /**
+   * Clear a route's error state (e.g., after a successful heal).
+   */
+  clearRoute(routePath) {
+    this.routes.delete(routePath);
+    this._cooldowns.delete(routePath);
+  }
+  /**
+   * Get stats for dashboard/telemetry.
+   */
+  getStats() {
+    const activeRoutes = {};
+    for (const [path, entry] of this.routes) {
+      activeRoutes[path] = { count: entry.count, lastError: entry.lastError?.message?.slice(0, 100) };
+    }
+    return {
+      totalErrors: this._totalErrors,
+      totalHeals: this._totalHeals,
+      activeRoutes,
+      trackedRoutes: this.routes.size,
+    };
+  }
+  /**
+   * Reset all state (e.g., after server restart).
+   */
+  reset() {
+    this.routes.clear();
+    // Keep cooldowns — don't re-trigger immediately after restart
+  }
+}
+module.exports = { ErrorMonitor };
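
To see how threshold, window, and cooldown interact, the counting logic of `record()` can be exercised with a fake clock. The class below is a simplified stand-in, not the package's `ErrorMonitor`: the name `MiniErrorMonitor` and the injected `now` option are assumptions added for testability, and logging and stats are left out.

```javascript
// Simplified stand-in for ErrorMonitor.record(); `now` is an injected clock (an assumption).
class MiniErrorMonitor {
  constructor({ threshold = 3, windowMs = 30000, cooldownMs = 60000, onError, now = Date.now } = {}) {
    this.threshold = threshold;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
    this.onError = onError;
    this.now = now;
    this.routes = new Map();    // path → { count, firstSeen }
    this.cooldowns = new Map(); // path → time of last trigger
  }

  record(routePath, statusCode, errorDetails) {
    if (statusCode < 500) {
      this.routes.delete(routePath); // a non-5xx response resets the route
      return;
    }
    const t = this.now();
    const lastHeal = this.cooldowns.get(routePath);
    if (lastHeal !== undefined && t - lastHeal < this.cooldownMs) return; // still cooling down

    const entry = this.routes.get(routePath) || { count: 0, firstSeen: t };
    entry.count++;
    if (t - entry.firstSeen > this.windowMs) {
      // Window expired: start counting again from this error.
      entry.count = 1;
      entry.firstSeen = t;
    }
    this.routes.set(routePath, entry);

    if (entry.count >= this.threshold) {
      this.cooldowns.set(routePath, t);
      this.routes.delete(routePath);
      if (this.onError) this.onError(routePath, errorDetails);
    }
  }
}

let clock = 0;
const triggered = [];
const monitor = new MiniErrorMonitor({
  onError: (route) => triggered.push({ route, at: clock }),
  now: () => clock,
});

// Three 500s within the 30 s window trigger one heal.
for (const t of [0, 1000, 2000]) { clock = t; monitor.record("/api/users", 500, { message: "boom" }); }
// A further error during the 60 s cooldown is ignored.
clock = 10000; monitor.record("/api/users", 500, { message: "boom" });
// Errors spaced 40 s apart keep resetting the window and never reach the threshold.
for (const t of [100000, 140000, 180000]) { clock = t; monitor.record("/api/orders", 500, { message: "slow" }); }

console.log(triggered); // [ { route: '/api/users', at: 2000 } ]
```

The slow-trickle case shows that the window is anchored at the first error of a run, so a route that fails less often than once per `windowMs` will not trigger a heal under these settings.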