npm - wolverine-ai - Versions diffs - 3.4.1 → 3.5.0 - Mend

wolverine-ai 3.4.1 → 3.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +10 -6
package/bin/wolverine.js +11 -1
package/package.json +1 -3
package/src/agent/agent-engine.js +31 -18
package/src/agent/goal-loop.js +11 -6
package/src/brain/brain.js +8 -4
package/src/core/error-hook.js +17 -1
package/src/core/wolverine.js +69 -5
package/src/skills/loop-guard.js +9 -2
package/CLAUDE.md +0 -146

package/README.md CHANGED

@@ -70,7 +70,7 @@ wolverine/
 │   ├── core/                ← Wolverine engine
 │   │   ├── wolverine.js     ← Heal pipeline + goal loop
 │   │   ├── runner.js        ← Process manager (PM2-like)
-│   │   ├── ai-client.js     ← OpenAI client (Chat + Responses API)
+│   │   ├── ai-client.js     ← Dual provider client (OpenAI + Anthropic)
 │   │   ├── models.js        ← 10-model configuration system
 │   │   ├── verifier.js      ← Fix verification (syntax + boot probe)
 │   │   ├── error-parser.js  ← Stack trace parsing + error classification
@@ -81,7 +81,7 @@ wolverine/
 │   │   ├── system-info.js   ← Machine detection (cores, RAM, cloud, containers)
 │   │   └── cluster-manager.js← Auto-scaling worker management
 │   ├── agent/               ← AI agent system
-│   │   ├── agent-engine.js  ← Multi-turn agent with 10 tools
+│   │   ├── agent-engine.js  ← Multi-turn agent with 18 tools + 45s per-call timeout
 │   │   ├── goal-loop.js     ← Goal-driven repair loop
 │   │   ├── research-agent.js← Deep research + learning from failures
 │   │   └── sub-agents.js    ← 7 specialized sub-agents (explore/plan/fix/verify/...)
@@ -151,8 +151,9 @@ Server crashes
 Operational Fix (zero AI tokens):
   → "Cannot find module 'cors'" → npm install cors (instant, free)
-  → ENOENT on config file → create missing file with defaults
+  → ENOENT on config file → read source code, infer expected fields, create with correct structure
   → EACCES/EPERM → chmod 755
+  → EADDRINUSE → find and kill stale process on port
   → If operational fix works → done. No AI needed.
 Goal Loop (iterate until fixed or exhausted):
@@ -215,7 +216,7 @@ The AI agent has 18 built-in tools (inspired by [claw-code](https://github.com/u
 | `grep_code` | File | Regex search across codebase with context lines |
 | `list_dir` | File | List directory contents with sizes (find misplaced files) |
 | `move_file` | File | Move or rename files (fix structure problems) |
-| `bash_exec` | Shell | Sandboxed shell execution (npm install, chmod, kill, etc.) |
+| `bash_exec` | Shell | Sandboxed shell execution (npm install, chmod, kill, etc.) 30s default, 60s cap |
 | `git_log` | Shell | View recent commit history |
 | `git_diff` | Shell | View uncommitted changes |
 | `inspect_db` | Database | List tables, show schema, run SELECT on SQLite databases |
@@ -439,8 +440,11 @@ Three layers prevent token waste:
 | **Empty stderr guard** | Signal kills, clean shutdowns with no error | $0.00 |
 | **Loop guard** | Same error failing 3+ times in 10min → files bug report, stops healing | $0.00 after detection |
 | **Global rate limit** | Max 5 heals per 5 minutes regardless of error | Caps total spend |
+| **Per-API-call timeout** | 45s timeout on each AI call — prevents indefinite agent hangs | Saves time + tokens |
+| **Heal timeout** | 5-minute overall heal timeout via Promise.race | Prevents stuck heals |
+| **SIGTERM grace period** | 3s startup grace ignores SIGTERM — prevents restart scripts killing new process | Prevents shutdown loops |
-**Process dedup:** PID file ensures only one wolverine instance runs. Kills old process on startup.
+**Process dedup:** PID file ensures only one wolverine instance runs. Kills old process on startup. Exit handler only deletes PID file if it still belongs to current process (prevents race condition on restart).
 **Bug reports:** When loop guard triggers, generates a security-scanned report (no secrets/injection patterns) and sends to the platform backend for human review.
@@ -450,7 +454,7 @@ Three layers prevent token waste:
 | Technique | What it does | Cost |
 |-----------|-------------|------|
-| **Dynamic system prompt** | Simple errors get 400-token prompt with 7 tools. Complex get 1200 with 18 + strategy | 50% on 70% of heals |
+| **Dynamic system prompt** | Simple errors get 400-token prompt with 7 tools. Complex get 1200 with 18 + fast-fix strategy table | 50% on 70% of heals |
 | **Brain namespace isolation** | Seed docs (20K tokens) excluded from error heals — only searched for wolverine queries | 50% context reduction |
 | **Prompt caching** | Anthropic system prompt cached server-side — 90% cheaper on repeat calls | 12-16K tokens saved per heal |
 | **Tool result truncation** | Tool output capped at 4K chars — prevents context blowup from large reads | Up to 30K saved per turn |
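
The README table above describes a global rate limit of 5 heals per 5 minutes. The diff does not show how the package implements it, so the following is only a minimal sliding-window sketch of that idea; `SlidingWindowLimiter` and `tryAcquire` are hypothetical names, not part of wolverine-ai.

```javascript
// Hypothetical sliding-window limiter (an assumption for illustration,
// not the package's actual rate limiter).
class SlidingWindowLimiter {
  constructor(maxEvents, windowMs) {
    this.maxEvents = maxEvents;
    this.windowMs = windowMs;
    this.timestamps = [];
  }

  // Records the event and returns true if fewer than maxEvents happened
  // in the last windowMs milliseconds; otherwise returns false.
  tryAcquire(now = Date.now()) {
    const cutoff = now - this.windowMs;
    // Drop events that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxEvents) return false;
    this.timestamps.push(now);
    return true;
  }
}

// Mirrors the "5 heals per 5 minutes" figure from the README.
const healLimiter = new SlidingWindowLimiter(5, 5 * 60 * 1000);
if (!healLimiter.tryAcquire()) {
  console.log("Heal skipped: global rate limit reached");
}
```

Passing `now` explicitly keeps the limiter easy to test without waiting for real time to pass.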

package/bin/wolverine.js CHANGED

@@ -152,13 +152,23 @@ console.log("");
 const runner = new WolverineRunner(scriptPath, { cwd: process.cwd() });
+// Grace period: ignore SIGTERM for 3s after startup.
+// Prevents restart scripts using `pkill -f wolverine.js` from killing
+// both the old AND newly spawned process.
+let startupGrace = true;
+setTimeout(() => { startupGrace = false; }, 3000);
 process.on("SIGINT", () => {
-  console.log(chalk.yellow(`\n\n👋 Shutting down Wolverine${workerLabel}...`));
+  console.log(chalk.yellow(`\n\n👋 Shutting down Wolverine...`));
   runner.stop();
   process.exit(0);
 });
 process.on("SIGTERM", () => {
+  if (startupGrace) {
+    console.log(chalk.yellow("  ⚡ Ignoring SIGTERM during startup grace period (3s)"));
+    return;
+  }
   runner.stop();
   process.exit(0);
 });

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "wolverine-ai",
-  "version": "3.4.1",
+  "version": "3.5.0",
   "description": "Self-healing Node.js server framework powered by AI. Catches crashes, diagnoses errors, generates fixes, verifies, and restarts — automatically.",
   "main": "src/index.js",
   "bin": {
@@ -49,8 +49,6 @@
     "src/",
     "server/",
     "examples/",
-    "README.md",
-    "CLAUDE.md",
     ".env.example"
   ],
   "dependencies": {

package/src/agent/agent-engine.js CHANGED

@@ -414,15 +414,23 @@ class AgentEngine {
       }
       let response;
+      const AI_CALL_TIMEOUT_MS = 45000; // 45s per API call — prevents indefinite hangs
       try {
-        response = await aiCallWithHistory({
-          model,
-          messages: this.messages,
-          tools: allTools,
-          maxTokens: 4096,
-        });
+        response = await Promise.race([
+          aiCallWithHistory({
+            model,
+            messages: this.messages,
+            tools: allTools,
+            maxTokens: 4096,
+          }),
+          new Promise((_, reject) => setTimeout(() => reject(new Error("AI call timed out after 45s")), AI_CALL_TIMEOUT_MS)),
+        ]);
       } catch (err) {
         console.log(chalk.red(`  Agent API error: ${err.message}`));
+        // On timeout, return what we have so far rather than failing completely
+        if (err.message.includes("timed out") && this.filesModified.length > 0) {
+          return { success: true, summary: `Partial fix applied (API timeout on turn ${this.turnCount})`, filesModified: this.filesModified, turnCount: this.turnCount, totalTokens: this.totalTokens };
+        }
         return { success: false, summary: err.message, filesModified: [], turnCount: this.turnCount, totalTokens: this.totalTokens };
       }
@@ -1060,26 +1068,31 @@ Project: ${cwd}`;
 function _fullPrompt(cwd, primaryFile) {
   return `You are Wolverine, an autonomous Node.js server repair agent. Diagnose and fix the error.
-You are a full server doctor. Errors can be code bugs, missing deps, database problems, config issues, port conflicts, permissions, or corrupted state. Investigate the root cause before fixing.
+You are a full server doctor. Errors can be code bugs, missing deps, database problems, config issues, port conflicts, permissions, or corrupted state.
+CRITICAL: Act fast. You have limited turns. Fix immediately when the solution is obvious from the error. Only investigate when the cause is unclear.
 For maximum efficiency, invoke multiple independent tools simultaneously rather than sequentially.
 TOOLS: read_file, write_file, edit_file, glob_files, grep_code, list_dir, move_file, bash_exec, git_log, git_diff, inspect_db, run_db_fix, check_port, check_env, audit_deps, check_migration, web_fetch, done
-STRATEGY:
-- Cannot find module 'X' → bash_exec: npm install X
-- Cannot find module './X' → edit_file: fix require path
-- ENOENT → write_file or move_file
-- EADDRINUSE → check_port then bash_exec: kill
-- TypeError/ReferenceError → read_file then edit_file
+FAST FIXES (act immediately, don't investigate):
+- Cannot find module 'X' → bash_exec: npm install X → done
+- Cannot find module './X' → grep for correct path → edit_file → done
+- ENOENT missing config/json file → read the code that loads it to see what fields it expects → write_file with required fields → done
+- EADDRINUSE → check_port → bash_exec: kill PID → done
+- TypeError/ReferenceError → read_file → edit_file → done
+- Missing env var → check_env → report it → done
+INVESTIGATION (only when cause is unclear):
 - Database error → inspect_db then run_db_fix
-- Missing env var → check_env
+- Unknown errors → grep_code, list_dir to find root cause
 RULES:
-1. Investigate first — read files before modifying
-2. Minimal targeted changes — fix root cause not symptoms
-3. bash_exec for operational fixes, edit_file for code, run_db_fix for data
-4. Call done with summary when finished
+1. Fix on turn 1-2 when possible. Investigation is a last resort.
+2. For ENOENT config files: read the code that requires the file, then create it with the expected structure.
+3. bash_exec for operational fixes, edit_file for code, write_file for missing files, run_db_fix for data
+4. Always call done with summary when finished — never end without calling done.
 ${primaryFile ? `\nFile: ${primaryFile}` : ""}
 Project: ${cwd}`;
 }
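
In the first hunk above, the `Promise.race` timeout never clears its `setTimeout`, so the timer stays pending for up to 45s even after the API call settles. A minimal sketch of a reusable wrapper that clears the timer is shown here; `withTimeout` is a hypothetical helper name and is not part of the package.

```javascript
// Hypothetical helper (not in wolverine-ai): races a promise against a
// timeout and always clears the timer once the race settles.
function withTimeout(promise, ms, message) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage: the wrapped promise resolves after 10ms, well inside the 1s limit.
const slowCall = new Promise((resolve) => setTimeout(() => resolve("done"), 10));
withTimeout(slowCall, 1000, "timed out after 1s")
  .then((value) => console.log(value)); // prints "done"
```

Note that this only stops waiting; it does not cancel the underlying request, which is also true of the `Promise.race` in the diff.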

package/src/agent/goal-loop.js CHANGED

@@ -107,13 +107,18 @@ class GoalLoop {
           explanation: attempt.explanation,
         }).catch(() => {});
-        // Deep research after 2nd failure — bring in RESEARCH_MODEL
-        if (iteration >= 2) {
+        // Deep research only after 3rd failure — avoid adding latency on early iterations
+        if (iteration >= 3) {
           console.log(chalk.magenta(`  🔬 Triggering deep research after ${iteration} failures...`));
-          const research = await this.researcher.research(errorMessage, context);
-          if (research) {
-            console.log(chalk.gray(`  🔬 Research insight: ${research.slice(0, 100)}`));
-          }
+          try {
+            const research = await Promise.race([
+              this.researcher.research(errorMessage, context),
+              new Promise((resolve) => setTimeout(() => resolve(null), 30000)), // 30s cap
+            ]);
+            if (research) {
+              console.log(chalk.gray(`  🔬 Research insight: ${research.slice(0, 100)}`));
+            }
+          } catch {}
         }
       }

package/src/brain/brain.js CHANGED

@@ -34,7 +34,7 @@ const SEED_DOCS = [
     metadata: { topic: "overview" },
   },
   {
-    text: "Wolverine heal pipeline: crash detected → error parsed (file, line, message, errorType) → prompt injection scan (AUDIT_MODEL) → rate limit check → operational fix attempt (missing_module → npm install, missing_file → create file, permission → chmod — zero AI tokens) → if operational fix doesn't apply → fast path repair (CODING_MODEL, supports both code changes AND shell commands like npm install) → if fast path fails → agent path (REASONING_MODEL with tools including bash_exec for npm install) → if agent fails → sub-agents (explore → plan → fix, fixer has bash_exec) → verify fix (syntax check + boot probe) → rollback on failure. Error types classified: missing_module, missing_file, permission, port_conflict, syntax, runtime, unknown.",
+    text: "Wolverine heal pipeline: crash detected → error parsed (file, line, message, errorType) → prompt injection scan (AUDIT_MODEL) → rate limit check (per-signature + global 5/5min cap) → operational fix attempt (missing_module → npm install, missing_file → create file with inferred config, permission → chmod, port conflict → kill stale process — zero AI tokens) → if operational fix doesn't apply → fast path repair (CODING_MODEL, supports both code changes AND shell commands like npm install) → if fast path fails → agent path (REASONING_MODEL with tools including bash_exec, 45s per-API-call timeout) → if agent fails → sub-agents (explore → plan → fix, fixer has bash_exec) → verify fix (syntax check + boot probe + error classification comparison) → rollback on failure. Error types classified: missing_module, missing_file, permission, port_conflict, syntax, runtime, unknown. Heal timeout: 5 minutes via Promise.race. Config-aware turn budget: simple=4, config/ENOENT=5, complex=8 turns.",
     metadata: { topic: "heal-pipeline" },
   },
   {
@@ -66,7 +66,7 @@ const SEED_DOCS = [
     metadata: { topic: "verification" },
   },
   {
-    text: "Wolverine multi-file agent: 15-turn agent loop with 18 tools across 7 categories. FILE: read_file (offset/limit), write_file (creates dirs), edit_file (find-and-replace), glob_files (pattern search), grep_code (regex with context), list_dir (directory listing with sizes), move_file (rename/relocate). SHELL: bash_exec (30s default, 60s cap), git_log, git_diff. DATABASE: inspect_db (tables/schema/SELECT on SQLite), run_db_fix (UPDATE/DELETE/ALTER with auto-backup). DIAGNOSTICS: check_port (find what uses a port), check_env (env vars, values redacted). DEPS: audit_deps (full npm health check), check_migration (known upgrade paths). RESEARCH: web_fetch. CONTROL: done. Used when fast path fails. Token budget 50k max.",
+    text: "Wolverine multi-file agent: turn-limited agent loop with 18 tools across 7 categories. Turn budget adapts to error type: simple (TypeError)=4, config/ENOENT=5, complex=8. Each AI call has 45s timeout via Promise.race — prevents indefinite hangs. If timeout occurs mid-fix, partial results returned. FILE: read_file (offset/limit), write_file (creates dirs), edit_file (find-and-replace), glob_files (pattern search), grep_code (regex with context), list_dir (directory listing with sizes), move_file (rename/relocate). SHELL: bash_exec (30s default, 60s cap), git_log, git_diff. DATABASE: inspect_db (tables/schema/SELECT on SQLite), run_db_fix (UPDATE/DELETE/ALTER with auto-backup). DIAGNOSTICS: check_port (find what uses a port), check_env (env vars, values redacted). DEPS: audit_deps (full npm health check), check_migration (known upgrade paths). RESEARCH: web_fetch (10s timeout). CONTROL: done. Prompt emphasizes fast action: fix immediately when solution is obvious, investigate only when cause unclear.",
     metadata: { topic: "agent" },
   },
   {
@@ -202,7 +202,7 @@ const SEED_DOCS = [
     metadata: { topic: "admin-auth" },
   },
   {
-    text: "Operational fix layer: before calling AI, wolverine checks for common non-code errors that can be fixed instantly with zero tokens. Pattern 1: 'Cannot find module X' (where X is a package name, not a relative path) → runs npm install X (or just npm install if package is already in package.json). Pattern 2: ENOENT on config/data files (.json, .yaml, .env, .log, etc.) → creates the missing file with sensible defaults (empty JSON {}, empty string). Pattern 3: EACCES/EPERM → chmod 755 on the file. This layer runs before the AI repair loop and handles ~30% of production crashes at zero cost.",
+    text: "Operational fix layer: before calling AI, wolverine checks for common non-code errors that can be fixed instantly with zero tokens. Pattern 1: 'Cannot find module X' (where X is a package name, not a relative path) → runs npm install X via deps skill diagnosis. Pattern 2: ENOENT on config/data files (.json, .yaml, .env, .log, etc.) → for JSON configs, reads the source code that loads the file to infer expected fields (apiUrl, timeout, etc.) and creates the file with correct structure; for other types, creates empty file. Pattern 3: EACCES/EPERM → chmod 755 on the file. Pattern 4: EADDRINUSE → finds and kills stale process on the port (lsof on Linux, netstat on Windows). This layer runs before the AI repair loop and handles ~30% of production crashes at zero cost.",
     metadata: { topic: "operational-fix" },
   },
   {
@@ -214,7 +214,7 @@ const SEED_DOCS = [
     metadata: { topic: "agent-fix-strategy" },
   },
   {
-    text: "Error Monitor: detects caught 500 errors that don't crash the process. Most production bugs are caught by Fastify/Express error handlers — the server stays alive but routes return 500. Wolverine's crash-based heal pipeline never triggers for these. ErrorMonitor tracks 5xx errors per route via IPC from child process. After N consecutive 500s within a time window (default: 3 failures in 30s), triggers the heal pipeline without killing the server. Error hook auto-injected via --require preload (no user code changes). Cooldown prevents heal spam (default: 60s per route). Stats available in dashboard and telemetry. Config: WOLVERINE_ERROR_THRESHOLD, WOLVERINE_ERROR_WINDOW_MS, WOLVERINE_ERROR_COOLDOWN_MS.",
+    text: "Error Monitor: detects caught 500 errors that don't crash the process. Most production bugs are caught by Fastify/Express error handlers — the server stays alive but routes return 500. Wolverine's crash-based heal pipeline never triggers for these. ErrorMonitor tracks 5xx errors per normalized route (/api/users/123 → /api/users/:id) via IPC from child process. Single error triggers heal (threshold=1, configurable). Error hook auto-injected via --require preload (no user code changes) — hooks Fastify onError + setErrorHandler wrapper + auto-registers default error handler if user never sets one (catches async route throws). Cooldown prevents heal spam (default: 60s per route). Health check failures also trigger heal (not just restart). Config: WOLVERINE_ERROR_THRESHOLD, WOLVERINE_ERROR_WINDOW_MS, WOLVERINE_ERROR_COOLDOWN_MS.",
     metadata: { topic: "error-monitor" },
   },
   {
@@ -265,6 +265,10 @@ const SEED_DOCS = [
     text: "Agent efficiency (claw-code patterns): (1) Anthropic prompt caching — system prompt marked with cache_control:{type:'ephemeral'}, cached server-side across agent turns, 90% cheaper on repeat calls (12-16K saved tokens per heal). (2) Tool result truncation — capped at 4K chars before entering message history, prevents context blowup from large grep/file reads. (3) Zero-cost structural compaction — extracts signals (tools used, files touched, errors found, actions taken) from message history WITHOUT an LLM call. Costs $0.00 vs old method that burned tokens on a compacting model. Triggers when estimated tokens > 10K (text.length/4 approximation). Preserves last 4 messages verbatim. (2) Token estimation — text.length/4+1, fast approximation without tokenizer, ~10% accurate. Used for budget decisions before API calls. (3) Error-graceful tools — tool errors returned as [ERROR] prefixed results, not thrown. Model sees the error and decides how to proceed. (4) Pre/post tool hooks — shell commands in .wolverine/hooks.json, exit 0=allow, 2=deny. Enables audit logging and policy enforcement without hard-coding.",
     metadata: { topic: "agent-efficiency" },
   },
+  {
+    text: "Robustness guards: (1) Heal concurrency guard — _healInProgress flag prevents parallel heals from health monitor + crash handler racing. (2) Global rate limit — 5 heals per 5 minutes regardless of error signature, prevents infinite loop of different errors burning API quota. (3) Heal timeout — Promise.race wraps _healImpl() with 5-minute timeout, clears _healInProgress on timeout. (4) Per-API-call timeout — 45s timeout in agent engine via Promise.race, returns partial results if files already modified. (5) bash_exec enforced timeout — 30s default, 60s hard cap via Math.min(). (6) PID file race prevention — exit handler only deletes PID file if it still belongs to current process. (7) SIGTERM startup grace — 3s grace period ignores SIGTERM on startup, prevents restart scripts from killing both old and new processes. (8) Research timeout — deep research capped at 30s, deferred to iteration 3+ to avoid slowing early fix attempts.",
+    metadata: { topic: "robustness-guards" },
+  },
   {
     text: "Cost optimization: 7 techniques reduce heal cost from $0.31 to $0.02 for simple errors. (1) Verifier skips route probe for simple errors (TypeError/ReferenceError/SyntaxError) — trusts syntax+boot, ErrorMonitor is safety net. Prevents false-rejection cascades. (2) Sub-agents use Haiku (classifier model) for explore/plan/verify/research — only fixer uses Sonnet/Opus. 6 Haiku calls=$0.006 vs 6 Sonnet calls=$0.12. (3) Agent context compacted every 3 turns using compacting model — prevents 15K→95K token blowup. (4) Brain checked for cached fix patterns before AI — repeat errors cost $0. (5) Token budgets capped by error complexity: simple=20K agent budget, moderate=50K, complex=100K. Simple errors get 4 agent turns max. (6) Prior attempt summaries (not full context) passed between iterations — concise 'do NOT repeat' directives. (7) Fast path includes last known good backup code so AI can revert broken additions instead of patching around them.",
     metadata: { topic: "cost-optimization" },

package/src/core/error-hook.js CHANGED

@@ -58,20 +58,36 @@ Module._load = function (request, parent, isMain) {
 function _hookFastify(fastify) {
   // Wrap setErrorHandler so our IPC reporting runs BEFORE the user's handler
   const origSetError = fastify.setErrorHandler;
+  let customErrorHandlerSet = false;
   fastify.setErrorHandler = function (userHandler) {
+    customErrorHandlerSet = true;
     return origSetError.call(this, function (error, request, reply) {
       _reportError(request.url, request.method, error);
       return userHandler.call(this, error, request, reply);
     });
   };
-  // Also add onError hook as a fallback (fires even if no custom error handler)
+  // Add onError hook as primary fallback — fires for all route errors in Fastify
   try {
     fastify.addHook("onError", function (request, reply, error, done) {
       _reportError(request.url, request.method, error);
       done();
     });
   } catch { /* addHook may fail if server is already started */ }
+  // Register a default error handler if user never calls setErrorHandler
+  // This ensures we catch async route throws even without a custom handler
+  try {
+    fastify.addHook("onReady", function (done) {
+      if (!customErrorHandlerSet) {
+        origSetError.call(fastify, function (error, request, reply) {
+          _reportError(request.url, request.method, error);
+          reply.code(error.statusCode || 500).send({ error: error.message });
+        });
+      }
+      done();
+    });
+  } catch { /* non-fatal */ }
 }
 function _hookExpress(app) {

package/src/core/wolverine.js CHANGED

@@ -332,9 +332,12 @@ async function _healImpl({ stderr, cwd, sandbox, notifier, rateLimiter, backupMa
       } else if (iteration <= 2) {
         // Agent path — REASONING_MODEL (also handles iteration 1 when no file)
         console.log(chalk.magenta(`  🤖 Agent path (${getModel("reasoning")})...`));
+        // Tight turn budget: simple errors get 4 turns, ENOENT/config gets 5, complex gets 8
+        const isConfigError = /ENOENT|missing.*config|missing.*file|no such file/i.test(parsed.errorMessage);
+        const agentMaxTurns = isSimpleError ? 4 : isConfigError ? 5 : 8;
         const agent = new AgentEngine({
           sandbox, logger, cwd, mcp,
-          maxTurns: isSimpleError ? 4 : 8,
+          maxTurns: agentMaxTurns,
           maxTokens: tokenBudget.agent,
         });
@@ -496,12 +499,20 @@ async function tryOperationalFix(parsed, cwd, logger) {
     if (!rel.startsWith("..") && /\.(json|yaml|yml|toml|ini|conf|cfg|env|log|txt|csv|db|sqlite)$/i.test(missingFile)) {
       try {
         fs.mkdirSync(path.dirname(missingFile), { recursive: true });
-        // Create empty file or sensible default
         const ext = path.extname(missingFile).toLowerCase();
-        const defaults = { ".json": "{}", ".yaml": "", ".yml": "", ".log": "", ".txt": "", ".csv": "", ".env": "" };
-        fs.writeFileSync(missingFile, defaults[ext] || "", "utf-8");
+        // For JSON config files, try to infer expected structure from the code that loads them
+        let content = "";
+        if (ext === ".json") {
+          content = _inferJsonConfig(missingFile, cwd, parsed) || "{}";
+        } else {
+          const defaults = { ".yaml": "", ".yml": "", ".log": "", ".txt": "", ".csv": "", ".env": "" };
+          content = defaults[ext] || "";
+        }
+        fs.writeFileSync(missingFile, content, "utf-8");
         console.log(chalk.blue(`  📄 Created missing file: ${rel}`));
-        return { fixed: true, action: `Created missing file: ${rel}` };
+        return { fixed: true, action: `Created missing file: ${rel} with ${content === "{}" ? "empty" : "inferred"} config` };
       } catch {}
     }
   }
@@ -544,4 +555,57 @@ async function tryOperationalFix(parsed, cwd, logger) {
   return { fixed: false };
 }
+/**
+ * Try to infer JSON config structure by scanning the code that loads the file.
+ * Looks for property access patterns after require/readFile of the missing file.
+ * Returns a JSON string with empty/default values, or null if can't infer.
+ */
+function _inferJsonConfig(missingFile, cwd, parsed) {
+  const fs = require("fs");
+  const path = require("path");
+  // Find which source file loads the missing config
+  const basename = path.basename(missingFile);
+  const sourceFile = parsed.filePath;
+  if (!sourceFile) return null;
+  try {
+    const source = fs.readFileSync(sourceFile, "utf-8");
+    // Look for property accesses on the loaded config: config.apiUrl, config.timeout, etc.
+    const configVarMatch = source.match(new RegExp(`(?:const|let|var)\\s+(\\w+)\\s*=\\s*(?:require|JSON\\.parse).*${basename.replace(".", "\\.")}`));
+    if (!configVarMatch) return null;
+    const varName = configVarMatch[1];
+    // Find all property accesses: varName.prop or varName["prop"]
+    const propRegex = new RegExp(`${varName}\\.(\\w+)`, "g");
+    const bracketRegex = new RegExp(`${varName}\\["(\\w+)"\\]`, "g");
+    const props = new Set();
+    let m;
+    while ((m = propRegex.exec(source)) !== null) props.add(m[1]);
+    while ((m = bracketRegex.exec(source)) !== null) props.add(m[1]);
+    if (props.size === 0) return null;
+    // Build config with sensible defaults based on property names
+    const config = {};
+    for (const prop of props) {
+      const lower = prop.toLowerCase();
+      if (/url|endpoint|host|uri/.test(lower)) config[prop] = "http://localhost:3000";
+      else if (/port/.test(lower)) config[prop] = 3000;
+      else if (/timeout|delay|interval|ttl/.test(lower)) config[prop] = 5000;
+      else if (/key|token|secret/.test(lower)) config[prop] = "placeholder";
+      else if (/name/.test(lower)) config[prop] = "default";
+      else if (/enabled|active|debug/.test(lower)) config[prop] = true;
+      else if (/count|max|min|limit|size/.test(lower)) config[prop] = 10;
+      else if (/path|dir|file/.test(lower)) config[prop] = "./";
+      else config[prop] = "";
+    }
+    console.log(chalk.gray(`  🔍 Inferred ${props.size} config fields from ${path.basename(sourceFile)}: ${[...props].join(", ")}`));
+    return JSON.stringify(config, null, 2);
+  } catch {
+    return null;
+  }
+}
 module.exports = { heal };
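
To illustrate the property scan that `_inferJsonConfig` performs, here is a simplified standalone sketch run against an in-memory sample. `inferConfigProps` and `escapeRegExp` are hypothetical names. This version differs from the diff in two ways, both of which are assumptions about what is desirable: it escapes every regex metacharacter in the file name (the diff's `basename.replace(".", "\\.")` escapes only the first dot), and it uses a lookbehind so that the file name itself (`config.json` when the variable is named `config`) is not counted as a `json` property, which the diff's pattern appears to do.

```javascript
// Hypothetical, simplified stand-in for the scan in _inferJsonConfig.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the property names accessed on the variable that loads `basename`.
function inferConfigProps(source, basename) {
  const loadRe = new RegExp(
    `(?:const|let|var)\\s+(\\w+)\\s*=\\s*(?:require|JSON\\.parse).*${escapeRegExp(basename)}`
  );
  const match = source.match(loadRe);
  if (!match) return [];
  const varName = match[1];

  // The lookbehind skips occurrences inside paths or strings, e.g. "./config.json".
  const dotRe = new RegExp(`(?<![\\w./"'])${varName}\\.(\\w+)`, "g");
  const bracketRe = new RegExp(`(?<![\\w./"'])${varName}\\["(\\w+)"\\]`, "g");

  const props = new Set();
  for (const m of source.matchAll(dotRe)) props.add(m[1]);
  for (const m of source.matchAll(bracketRe)) props.add(m[1]);
  return [...props];
}

const sample = `
const config = require("./config.json");
fetch(config.apiUrl, { timeout: config.timeout });
console.log(config["retries"]);
`;
console.log(inferConfigProps(sample, "config.json"));
// prints [ 'apiUrl', 'timeout', 'retries' ]
```

Like the original, this is a textual heuristic: it does not follow destructuring, re-assignment, or nested access such as `config.db.host`.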

package/src/skills/loop-guard.js CHANGED

@@ -242,8 +242,15 @@ function ensureSingleProcess(cwd) {
     fs.writeFileSync(pidFile, String(process.pid), "utf-8");
   } catch {}
-  // Clean up on exit
-  process.on("exit", () => { try { fs.unlinkSync(pidFile); } catch {} });
+  // Clean up on exit — only delete if PID file still belongs to us
+  // (prevents race condition where old process deletes new process's PID)
+  const myPid = process.pid;
+  process.on("exit", () => {
+    try {
+      const current = parseInt(fs.readFileSync(pidFile, "utf-8").trim(), 10);
+      if (current === myPid) fs.unlinkSync(pidFile);
+    } catch {}
+  });
 }
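
The ownership check in the new exit handler can be exercised in isolation. The sketch below extracts it into a function that reports whether the file was removed; `removePidFileIfOwned` is a hypothetical name, and the temporary directory and file name are assumptions made for the demo.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper mirroring the exit handler's check: delete the PID
// file only if it still records the given PID. Returns true if deleted.
function removePidFileIfOwned(pidFile, pid) {
  try {
    const current = parseInt(fs.readFileSync(pidFile, "utf-8").trim(), 10);
    if (current !== pid) return false;
    fs.unlinkSync(pidFile);
    return true;
  } catch {
    return false; // missing or unreadable file: nothing to clean up
  }
}

// Demo: a newer process has already overwritten the PID file.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "pid-demo-"));
const demoPidFile = path.join(demoDir, "demo.pid");
fs.writeFileSync(demoPidFile, String(process.pid + 1), "utf-8");
console.log(removePidFileIfOwned(demoPidFile, process.pid)); // prints false
```

The read and the unlink are still two separate steps, so a small window remains in which another process could rewrite the file between them.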
 // ── Skill Metadata ──

package/CLAUDE.md DELETED

@@ -1,146 +0,0 @@
-# CLAUDE.md
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-## What This Is
-Wolverine is a self-healing Node.js server framework. It wraps a server process, catches crashes AND caught 500 errors, diagnoses them with AI (OpenAI or Anthropic), generates fixes, verifies them, and restarts — automatically. Published as `wolverine-ai` on npm (v3.1.0). 65 exports, 83 files, 6 skills.
-## Commands
-```bash
-npm start                        # Run server/index.js under wolverine (self-healing)
-npm run server                   # Run server/index.js directly (no healing)
-npm run test:pentest             # Security scan for secret leakage
-npm run demo:list                # List demo scenarios
-npm run demo -- 01               # Run specific demo
-npx wolverine server/index.js    # CLI entry point
-wolverine --info                 # System detection
-wolverine --update               # Safe framework upgrade
-wolverine --backup "reason"      # Create server snapshot
-wolverine --list-backups         # Show all snapshots
-wolverine --rollback <id>        # Restore specific backup
-wolverine --rollback-latest      # Restore most recent
-```
-No standard test runner — demos in `tests/fixtures/` serve as integration tests.
-## Architecture
-### Heal Pipeline (src/core/wolverine.js)
-```
-Error detected (crash OR caught 500 via IPC)
-  → Empty stderr? → Just restart, no AI ($0.00)
-  → Parse error → classify type → redact secrets
-  → Injection scan (skip if < 20 chars)
-  → Loop guard: same error failed 3+ times in 10min? → File bug report, stop
-  → Rate limit: 5 heals per 5min max
-  → Operational fix (zero tokens):
-      missing_module → deps.diagnose() → npm install
-      EADDRINUSE → kill stale process
-      ENOENT → create missing file
-      EACCES → chmod
-  → Token budget by complexity: simple=20K, moderate=50K, complex=100K
-  → Goal Loop (3 iterations):
-      1. Fast path: CODING_MODEL, JSON with code+commands, backup diff context
-      2. Agent: dynamic prompt (400 tokens simple, 1200 complex), 18 tools
-      3. Sub-agents: explore→plan→fix (Haiku triage, Sonnet/Opus fix only)
-  → Verify: syntax → boot probe (route probe skipped — ErrorMonitor is safety net)
-  → Success: retryCount reset, record to brain with full context
-  → Fail: rollback, brain records "DO NOT REPEAT", next iteration
-```
-`heal()` wraps `_healImpl()` with 5-minute `Promise.race` timeout.
-### IPC Error Chain (caught 500s without crash)
-1. **error-hook.js** — preloaded via `--require`, patches Fastify/Express for IPC. WeakSet dedup.
-2. **runner.js** — spawns child with `stdio: ["inherit","inherit","pipe","ipc"]`, listens `child.on("message")`
-3. **error-monitor.js** — tracks errors per normalized route (`/api/users/123` → `/api/users/:id`), threshold=1, 60s cooldown. Health check failures also trigger heal.
-### AI Client (src/core/ai-client.js)
-Dual provider: OpenAI + Anthropic. Auto-detected from model name (`claude-*` → Anthropic). All responses normalized to `{content, toolCalls, usage}`. **Anthropic prompt caching** — system prompt marked `cache_control: ephemeral`, 90% cheaper on repeat calls. Per-model output limits with 10% buffer. Every call tracked: latencyMs, success/failure, tokens, cost.
-Embeddings always use OpenAI (Anthropic has no embedding API).
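A rough sketch of the provider detection and response normalization described above. The raw response shapes follow the public OpenAI Chat Completions and Anthropic Messages formats; the inner field names of the normalized result (`id`, `name`, `args`, `inputTokens`, `outputTokens`) are assumptions, since the removed file only names the top-level `{content, toolCalls, usage}` keys.

```javascript
// From the notes: model names matching "claude-*" go to Anthropic, the rest to OpenAI.
function detectProvider(model) {
  return model.startsWith("claude-") ? "anthropic" : "openai";
}

// Maps a raw provider response to a common { content, toolCalls, usage } shape.
function normalizeResponse(provider, raw) {
  if (provider === "anthropic") {
    const blocks = raw.content || [];
    return {
      content: blocks
        .filter((b) => b.type === "text")
        .map((b) => b.text)
        .join(""),
      toolCalls: blocks
        .filter((b) => b.type === "tool_use")
        .map((b) => ({ id: b.id, name: b.name, args: b.input })),
      usage: {
        inputTokens: raw.usage.input_tokens,
        outputTokens: raw.usage.output_tokens,
      },
    };
  }
  // OpenAI Chat Completions shape.
  const message = raw.choices[0].message;
  return {
    content: message.content || "",
    toolCalls: (message.tool_calls || []).map((c) => ({
      id: c.id,
      name: c.function.name,
      args: JSON.parse(c.function.arguments), // arguments arrive as a JSON string
    })),
    usage: {
      inputTokens: raw.usage.prompt_tokens,
      outputTokens: raw.usage.completion_tokens,
    },
  };
}
```

Normalizing at the client boundary means the agent loop can handle tool calls the same way whichever provider produced them.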
-### Agent (src/agent/agent-engine.js)
-**Dynamic system prompt**: simple errors (TypeError/ReferenceError) get 400-token compact prompt with 7 tools. Complex errors get full prompt with all 18 tools + strategy table.
-18 tools: file (read/write/edit/glob/grep/list_dir/move_file), shell (bash_exec/git_log/git_diff), database (inspect_db/run_db_fix), diagnostics (check_port/check_env), deps (audit_deps/check_migration), research (web_fetch), control (done).
-**Cost optimizations**: zero-cost structural compaction (no LLM, extracts signals from messages), tool result truncation (4K cap), token estimation (`text.length/4`), pre/post tool hooks (`.wolverine/hooks.json`), error-graceful tools (`[ERROR]` results not thrown).
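Three of the cost optimizations listed above are simple enough to sketch. The helper names are hypothetical, and two details are assumptions: that the 4K cap is counted in characters, and that the estimate rounds up.

```javascript
// Rough estimate from the notes: about 4 characters per token (rounding up is an assumption).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Assumption: the 4K cap is measured in characters.
const TOOL_RESULT_CAP = 4000;

function truncateToolResult(text, cap = TOOL_RESULT_CAP) {
  if (text.length <= cap) return text;
  const dropped = text.length - cap;
  return `${text.slice(0, cap)}\n[truncated ${dropped} chars]`;
}

// Error-graceful tools: failures come back as "[ERROR] ..." strings instead of throwing,
// so the model can read the failure and choose a different action.
function runToolSafely(fn, ...args) {
  try {
    return truncateToolResult(String(fn(...args)));
  } catch (err) {
    return `[ERROR] ${err.message}`;
  }
}
```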
-**Protected paths**: agent cannot modify `src/`, `bin/`, `tests/`, `node_modules/`, `.env`, `package.json`. Only `server/` is editable.
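A path guard along these lines would enforce the rule above. This is a sketch: `isEditable` is a hypothetical name, and rejecting `.env`, `package.json`, and `node_modules` even when they sit under `server/` is an assumption about how the protected list interacts with the `server/` allowlist.

```javascript
const path = require("path");

const PROTECTED_BASENAMES = new Set([".env", "package.json"]);

// Returns true only for project-relative paths that resolve inside server/.
function isEditable(relPath) {
  // Normalize first so "server/../src/x.js" cannot slip through.
  const normalized = path.posix.normalize(relPath.replace(/\\/g, "/"));
  if (path.posix.isAbsolute(normalized)) return false;
  if (!normalized.startsWith("server/")) return false;
  const segments = normalized.split("/");
  if (segments.includes("node_modules")) return false; // assumption: blocked anywhere
  return !PROTECTED_BASENAMES.has(segments[segments.length - 1]);
}
```

Checking against an allowlist (`server/`) rather than only a denylist means new top-level directories are protected by default.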
-### Provider Config (server/config/settings.json)
-```json
-{ "provider": "hybrid", "openai_settings": {...}, "anthropic_settings": {...}, "hybrid_settings": {...} }
-```
-Config loader reads `{provider}_settings`. Env vars override per-role. Missing config sections auto-patched on startup via `_ensureDefaults()`.
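The lookup order described here (env vars, then `{provider}_settings`, then defaults) might be resolved as in the sketch below. Everything concrete in it is hypothetical: the function name, the `<ROLE>_MODEL` env var pattern, the shape of each settings section as a role-to-model map, and the placeholder model names. The removed file elides the actual section contents.

```javascript
// Resolves the model for a role: env var > {provider}_settings > defaults.
function resolveModel(role, settings, env = process.env, defaults = {}) {
  const envKey = `${role.toUpperCase()}_MODEL`; // hypothetical naming pattern
  if (env[envKey]) return env[envKey];
  // e.g. provider "hybrid" selects the "hybrid_settings" section
  const section = settings[`${settings.provider}_settings`] || {};
  return section[role] ?? defaults[role];
}

// Usage with placeholder values (not real configuration):
const exampleSettings = {
  provider: "hybrid",
  hybrid_settings: { coding: "model-a" },
};
const exampleDefaults = { coding: "model-default", chat: "model-chat-default" };

resolveModel("coding", exampleSettings, {}, exampleDefaults); // section value
resolveModel("chat", exampleSettings, {}, exampleDefaults); // falls back to defaults
resolveModel("coding", exampleSettings, { CODING_MODEL: "model-env" }, exampleDefaults); // env wins
```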
-### Brain (src/brain/vector-store.js + brain.js)
-IVF-indexed vector store: k-means++ clustering, BM25 keyword search, binary persistence. 60 seed docs. Benchmarks: 100=0.2ms, 10K=4.4ms, 50K=23.7ms.
-**Namespace isolation**: error heals search only `errors/fixes/learnings/functions` — seed docs (20K tokens) excluded unless query is about wolverine itself. Function map hash check skips re-embedding if unchanged.
-### Backup (src/backup/backup-manager.js)
-All backups in `~/.wolverine-safe-backups/` (outside project, survives git pull/npm install). States: UNSTABLE → VERIFIED → STABLE (30min). Protected files never rolled back: `settings.json`, `db.js`, `.env.local`.
-### Skills (src/skills/ — 6 files)
-- **sql.js** — injection prevention, SafeDB, idempotency guard
-- **deps.js** — dependency diagnosis (zero tokens), npm audit, migration paths
-- **update.js** — safe framework upgrade, emergency backup, brain seed merge
-- **backup.js** — agent-friendly backup/rollback with CLI commands
-- **loop-guard.js** — infinite loop detection, bug reports, process dedup (PID file)
-- **skill-registry.js** — auto-discovery + token-scored matching
-### Telemetry (src/platform/)
-Heartbeats every 60s. Stable instance ID (persisted to `.wolverine/instance-id`). Cumulative usage from disk (not session-only). `byModel` with latency/success/tokens-per-sec/cost-per-call. `byProvider` aggregated. Auto-update checks every 5min, selective git checkout (never touches `server/`).
-## Key Constraints
-- **Server port is always 3000.** Any other port breaks APIs. Kill 3000 and bind there.
-- **Dashboard on PORT+1** (3001).
-- **heal() has 5-minute timeout.** `Promise.race` recovery.
-- **Global rate limit: 5 heals per 5 minutes.**
-- **Loop guard: 3 failed heals on same error in 10min → stop + bug report.**
-- **Error threshold: 1** — single 500 triggers heal. 60s cooldown per route.
-- **Empty stderr → just restart, no AI.** Prevents token burn on signal kills.
-- **bash_exec: 30s default, 60s cap.**
-- **Process dedup via PID file.** Kills old process on startup.
-- **Both API keys needed for hybrid mode** — OPENAI_API_KEY for embeddings.
-- **Auto-update: selective git checkout** — only updates `src/`, `bin/`, `package.json`. Never touches `server/`.
-- **Rollback protects:** `settings.json`, `db.js`, `.env.local` never overwritten.
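The global rate limit and the loop guard listed above are both "at most N events in a time window" checks, so one sliding-window counter can illustrate both. This is a sketch under assumptions: the class, the `shouldAttemptHeal` function, and the return strings are hypothetical, and the real loop guard also files a bug report, which is omitted here.

```javascript
class SlidingWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.events = new Map(); // key -> timestamps (ms) inside the window
  }

  // Drops timestamps that have aged out and returns the remaining ones.
  _recent(key, now) {
    const recent = (this.events.get(key) || []).filter(
      (t) => now - t < this.windowMs
    );
    this.events.set(key, recent);
    return recent;
  }

  record(key, now = Date.now()) {
    this._recent(key, now).push(now);
  }

  atLimit(key, now = Date.now()) {
    return this._recent(key, now).length >= this.limit;
  }
}

const healRate = new SlidingWindowCounter(5, 5 * 60 * 1000); // 5 heals per 5 min
const failedHeals = new SlidingWindowCounter(3, 10 * 60 * 1000); // 3 failures per 10 min

// Same order as the heal pipeline: loop guard first, then the global rate limit.
function shouldAttemptHeal(errorSignature, now = Date.now()) {
  if (failedHeals.atLimit(errorSignature, now)) return "stop-and-report";
  if (healRate.atLimit("global", now)) return "rate-limited";
  healRate.record("global", now);
  return "heal";
}
```

A caller would invoke `failedHeals.record(errorSignature)` whenever a heal attempt fails verification, so repeated failures on the same error eventually stop the loop.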
-## Configuration
-- **Secrets:** `.env.local` (OPENAI_API_KEY, ANTHROPIC_API_KEY, WOLVERINE_ADMIN_KEY)
-- **Settings:** `server/config/settings.json` — provider, 3 model presets, cluster, telemetry, rate limits, health checks, autoUpdate, errorMonitor
-- **10 model slots:** reasoning, coding, chat, tool, classifier, audit, compacting, research, embedding
-- **Config priority:** env vars > `{provider}_settings` > defaults
-## Files That Matter Most
-| File | Why |
-|------|-----|
-| `src/core/wolverine.js` | Heal pipeline, operational fixes, goal loop, token budgets |
-| `src/core/runner.js` | Process manager, IPC, health/error monitors, loop guard, auto-update |
-| `src/core/ai-client.js` | Dual provider, prompt caching, output limits, latency tracking |
-| `src/agent/agent-engine.js` | Dynamic prompt, 18 tools, zero-cost compaction, hooks |
-| `src/agent/sub-agents.js` | Dynamic token budgets, Haiku triage, restricted tool sets |
-| `src/core/verifier.js` | Syntax + boot probe, error classification comparison |
-| `src/brain/vector-store.js` | IVF + BM25 + binary persistence |
-| `src/brain/brain.js` | 60 seed docs, namespace isolation, function map hash |
-| `src/skills/loop-guard.js` | Infinite loop detection, bug reports, process dedup |
-| `src/skills/update.js` | Safe upgrade, emergency backup, brain seed merge |
-| `src/platform/auto-update.js` | Version lock, dep verification, max 1 attempt per boot |
-| `server/config/settings.json` | Provider selection, 3 model presets, all config |