wolverine-ai 3.4.1 → 3.5.0

package/README.md CHANGED
@@ -70,7 +70,7 @@ wolverine/
70
70
  │ ├── core/ ← Wolverine engine
71
71
  │ │ ├── wolverine.js ← Heal pipeline + goal loop
72
72
  │ │ ├── runner.js ← Process manager (PM2-like)
73
- │ │ ├── ai-client.js ← OpenAI client (Chat + Responses API)
73
+ │ │ ├── ai-client.js ← Dual provider client (OpenAI + Anthropic)
74
74
  │ │ ├── models.js ← 10-model configuration system
75
75
  │ │ ├── verifier.js ← Fix verification (syntax + boot probe)
76
76
  │ │ ├── error-parser.js ← Stack trace parsing + error classification
@@ -81,7 +81,7 @@ wolverine/
81
81
  │ │ ├── system-info.js ← Machine detection (cores, RAM, cloud, containers)
82
82
  │ │ └── cluster-manager.js← Auto-scaling worker management
83
83
  │ ├── agent/ ← AI agent system
84
- │ │ ├── agent-engine.js ← Multi-turn agent with 10 tools
84
+ │ │ ├── agent-engine.js ← Multi-turn agent with 18 tools + 45s per-call timeout
85
85
  │ │ ├── goal-loop.js ← Goal-driven repair loop
86
86
  │ │ ├── research-agent.js← Deep research + learning from failures
87
87
  │ │ └── sub-agents.js ← 7 specialized sub-agents (explore/plan/fix/verify/...)
@@ -151,8 +151,9 @@ Server crashes
151
151
 
152
152
  Operational Fix (zero AI tokens):
153
153
  → "Cannot find module 'cors'" → npm install cors (instant, free)
154
- → ENOENT on config file → create missing file with defaults
154
+ → ENOENT on config file → read source code, infer expected fields, create with correct structure
155
155
  → EACCES/EPERM → chmod 755
156
+ → EADDRINUSE → find and kill stale process on port
156
157
  → If operational fix works → done. No AI needed.
157
158
 
158
159
  Goal Loop (iterate until fixed or exhausted):
@@ -215,7 +216,7 @@ The AI agent has 18 built-in tools (inspired by [claw-code](https://github.com/u
215
216
  | `grep_code` | File | Regex search across codebase with context lines |
216
217
  | `list_dir` | File | List directory contents with sizes (find misplaced files) |
217
218
  | `move_file` | File | Move or rename files (fix structure problems) |
218
- | `bash_exec` | Shell | Sandboxed shell execution (npm install, chmod, kill, etc.) |
219
+ | `bash_exec` | Shell | Sandboxed shell execution (npm install, chmod, kill, etc.) 30s default, 60s cap |
219
220
  | `git_log` | Shell | View recent commit history |
220
221
  | `git_diff` | Shell | View uncommitted changes |
221
222
  | `inspect_db` | Database | List tables, show schema, run SELECT on SQLite databases |
@@ -439,8 +440,11 @@ Three layers prevent token waste:
439
440
  | **Empty stderr guard** | Signal kills, clean shutdowns with no error | $0.00 |
440
441
  | **Loop guard** | Same error failing 3+ times in 10min → files bug report, stops healing | $0.00 after detection |
441
442
  | **Global rate limit** | Max 5 heals per 5 minutes regardless of error | Caps total spend |
443
+ | **Per-API-call timeout** | 45s timeout on each AI call — prevents indefinite agent hangs | Saves time + tokens |
444
+ | **Heal timeout** | 5-minute overall heal timeout via Promise.race | Prevents stuck heals |
445
+ | **SIGTERM grace period** | 3s startup grace ignores SIGTERM — prevents restart scripts killing new process | Prevents shutdown loops |
442
446
 
443
- **Process dedup:** PID file ensures only one wolverine instance runs. Kills old process on startup.
447
+ **Process dedup:** PID file ensures only one wolverine instance runs. Kills old process on startup. Exit handler only deletes PID file if it still belongs to current process (prevents race condition on restart).
444
448
 
445
449
  **Bug reports:** When loop guard triggers, generates a security-scanned report (no secrets/injection patterns) and sends to the platform backend for human review.
446
450
 
@@ -450,7 +454,7 @@ Three layers prevent token waste:
450
454
 
451
455
  | Technique | What it does | Cost |
452
456
  |-----------|-------------|------|
453
- | **Dynamic system prompt** | Simple errors get 400-token prompt with 7 tools. Complex get 1200 with 18 + strategy | 50% on 70% of heals |
457
+ | **Dynamic system prompt** | Simple errors get 400-token prompt with 7 tools. Complex get 1200 with 18 + fast-fix strategy table | 50% on 70% of heals |
454
458
  | **Brain namespace isolation** | Seed docs (20K tokens) excluded from error heals — only searched for wolverine queries | 50% context reduction |
455
459
  | **Prompt caching** | Anthropic system prompt cached server-side — 90% cheaper on repeat calls | 12-16K tokens saved per heal |
456
460
  | **Tool result truncation** | Tool output capped at 4K chars — prevents context blowup from large reads | Up to 30K saved per turn |
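The global rate limit named in the README changes above (at most 5 heals per 5 minutes) can be pictured as a sliding-window counter. This is a minimal sketch under stated assumptions, not the package's actual limiter: the class name `SlidingWindowLimiter`, its `tryAcquire` method, and the injectable `now` clock are all hypothetical and exist only to make the behaviour easy to check.

```javascript
// Hypothetical sliding-window limiter: allows at most `max` events per `windowMs`.
// Names and defaults are illustrative; only the 5-per-5-minutes figure comes from the README.
class SlidingWindowLimiter {
  constructor({ max = 5, windowMs = 5 * 60 * 1000, now = Date.now } = {}) {
    this.max = max;
    this.windowMs = windowMs;
    this.now = now;
    this.timestamps = [];
  }

  // Returns true and records the event if under the cap, false otherwise.
  tryAcquire() {
    const t = this.now();
    // Drop events that have aged out of the window.
    this.timestamps = this.timestamps.filter((ts) => t - ts < this.windowMs);
    if (this.timestamps.length >= this.max) return false;
    this.timestamps.push(t);
    return true;
  }
}
```

A caller would check `limiter.tryAcquire()` before starting a heal and skip the heal when it returns `false`, regardless of which error triggered it.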
package/bin/wolverine.js CHANGED
@@ -152,13 +152,23 @@ console.log("");
152
152
 
153
153
  const runner = new WolverineRunner(scriptPath, { cwd: process.cwd() });
154
154
 
155
+ // Grace period: ignore SIGTERM for 3s after startup.
156
+ // Prevents restart scripts using `pkill -f wolverine.js` from killing
157
+ // both the old AND newly spawned process.
158
+ let startupGrace = true;
159
+ setTimeout(() => { startupGrace = false; }, 3000);
160
+
155
161
  process.on("SIGINT", () => {
156
- console.log(chalk.yellow(`\n\n👋 Shutting down Wolverine${workerLabel}...`));
162
+ console.log(chalk.yellow(`\n\n👋 Shutting down Wolverine...`));
157
163
  runner.stop();
158
164
  process.exit(0);
159
165
  });
160
166
 
161
167
  process.on("SIGTERM", () => {
168
+ if (startupGrace) {
169
+ console.log(chalk.yellow(" ⚡ Ignoring SIGTERM during startup grace period (3s)"));
170
+ return;
171
+ }
162
172
  runner.stop();
163
173
  process.exit(0);
164
174
  });
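The startup grace period in the hunk above can also be expressed as a small testable helper. The sketch below is an assumption-laden variant, not the shipped code: `createStartupGrace` and `guardShutdown` are hypothetical names, and it compares timestamps instead of flipping a flag from `setTimeout`, so no timer stays pending.

```javascript
// Hypothetical helper: returns a predicate that is true only during the
// first `graceMs` milliseconds after creation. `now` is injectable for testing.
function createStartupGrace(graceMs = 3000, now = Date.now) {
  const startedAt = now();
  return () => now() - startedAt < graceMs;
}

// Wraps a shutdown callback so it is skipped while the grace period is active.
// Returns false when the signal was ignored and true when shutdown ran.
function guardShutdown(inGrace, shutdown, log = console.log) {
  return () => {
    if (inGrace()) {
      log("Ignoring SIGTERM during startup grace period");
      return false;
    }
    shutdown();
    return true;
  };
}
```

Under these assumptions the handler would be registered as `process.on("SIGTERM", guardShutdown(createStartupGrace(), () => { runner.stop(); process.exit(0); }))`.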
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "wolverine-ai",
3
- "version": "3.4.1",
3
+ "version": "3.5.0",
4
4
  "description": "Self-healing Node.js server framework powered by AI. Catches crashes, diagnoses errors, generates fixes, verifies, and restarts — automatically.",
5
5
  "main": "src/index.js",
6
6
  "bin": {
@@ -49,8 +49,6 @@
49
49
  "src/",
50
50
  "server/",
51
51
  "examples/",
52
- "README.md",
53
- "CLAUDE.md",
54
52
  ".env.example"
55
53
  ],
56
54
  "dependencies": {
@@ -414,15 +414,23 @@ class AgentEngine {
414
414
  }
415
415
 
416
416
  let response;
417
+ const AI_CALL_TIMEOUT_MS = 45000; // 45s per API call — prevents indefinite hangs
417
418
  try {
418
- response = await aiCallWithHistory({
419
- model,
420
- messages: this.messages,
421
- tools: allTools,
422
- maxTokens: 4096,
423
- });
419
+ response = await Promise.race([
420
+ aiCallWithHistory({
421
+ model,
422
+ messages: this.messages,
423
+ tools: allTools,
424
+ maxTokens: 4096,
425
+ }),
426
+ new Promise((_, reject) => setTimeout(() => reject(new Error("AI call timed out after 45s")), AI_CALL_TIMEOUT_MS)),
427
+ ]);
424
428
  } catch (err) {
425
429
  console.log(chalk.red(` Agent API error: ${err.message}`));
430
+ // On timeout, return what we have so far rather than failing completely
431
+ if (err.message.includes("timed out") && this.filesModified.length > 0) {
432
+ return { success: true, summary: `Partial fix applied (API timeout on turn ${this.turnCount})`, filesModified: this.filesModified, turnCount: this.turnCount, totalTokens: this.totalTokens };
433
+ }
426
434
  return { success: false, summary: err.message, filesModified: [], turnCount: this.turnCount, totalTokens: this.totalTokens };
427
435
  }
428
436
 
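The hunk above detects a timeout by matching the error message text and leaves the 45s timer pending when the API call wins the race. The sketch below shows one possible alternative, assuming a dedicated error type and a timer that is always cleared. `TimeoutError`, `withTimeout`, and `runTurn` are hypothetical names, and the result object is simplified from the one in the diff.

```javascript
// Hypothetical error type so callers can use `instanceof` instead of matching message text.
class TimeoutError extends Error {
  constructor(ms) {
    super(`timed out after ${ms}ms`);
    this.name = "TimeoutError";
  }
}

// Races `promise` against a timer and always clears the timer afterwards,
// so a fast result does not leave a pending timeout behind.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Simplified version of the diff's behaviour: a timeout counts as a
// partial success only if some files were already modified.
async function runTurn(call, filesModified, ms = 45000) {
  try {
    return { success: true, response: await withTimeout(call(), ms) };
  } catch (err) {
    const partial = err instanceof TimeoutError && filesModified.length > 0;
    return { success: partial, summary: err.message, filesModified: partial ? filesModified : [] };
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request, which keeps running unless the client supports some form of cancellation.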
@@ -1060,26 +1068,31 @@ Project: ${cwd}`;
1060
1068
  function _fullPrompt(cwd, primaryFile) {
1061
1069
  return `You are Wolverine, an autonomous Node.js server repair agent. Diagnose and fix the error.
1062
1070
 
1063
- You are a full server doctor. Errors can be code bugs, missing deps, database problems, config issues, port conflicts, permissions, or corrupted state. Investigate the root cause before fixing.
1071
+ You are a full server doctor. Errors can be code bugs, missing deps, database problems, config issues, port conflicts, permissions, or corrupted state.
1072
+
1073
+ CRITICAL: Act fast. You have limited turns. Fix immediately when the solution is obvious from the error. Only investigate when the cause is unclear.
1064
1074
 
1065
1075
  For maximum efficiency, invoke multiple independent tools simultaneously rather than sequentially.
1066
1076
 
1067
1077
  TOOLS: read_file, write_file, edit_file, glob_files, grep_code, list_dir, move_file, bash_exec, git_log, git_diff, inspect_db, run_db_fix, check_port, check_env, audit_deps, check_migration, web_fetch, done
1068
1078
 
1069
- STRATEGY:
1070
- - Cannot find module 'X' → bash_exec: npm install X
1071
- - Cannot find module './X' → edit_file: fix require path
1072
- - ENOENT → write_file or move_file
1073
- - EADDRINUSE → check_port then bash_exec: kill
1074
- - TypeError/ReferenceError → read_file then edit_file
1079
+ FAST FIXES (act immediately, don't investigate):
1080
+ - Cannot find module 'X' → bash_exec: npm install X → done
1081
+ - Cannot find module './X' → grep for correct path → edit_file → done
1082
+ - ENOENT missing config/json file → read the code that loads it to see what fields it expects → write_file with required fields → done
1083
+ - EADDRINUSE → check_port → bash_exec: kill PID → done
1084
+ - TypeError/ReferenceError → read_file → edit_file → done
1085
+ - Missing env var → check_env → report it → done
1086
+
1087
+ INVESTIGATION (only when cause is unclear):
1075
1088
  - Database error → inspect_db then run_db_fix
1076
- - Missing env var → check_env
1089
+ - Unknown errors → grep_code, list_dir to find root cause
1077
1090
 
1078
1091
  RULES:
1079
- 1. Investigate first: read files before modifying
1080
- 2. Minimal targeted changes: fix root cause, not symptoms
1081
- 3. bash_exec for operational fixes, edit_file for code, run_db_fix for data
1082
- 4. Call done with summary when finished
1092
+ 1. Fix on turn 1-2 when possible. Investigation is a last resort.
1093
+ 2. For ENOENT config files: read the code that requires the file, then create it with the expected structure.
1094
+ 3. bash_exec for operational fixes, edit_file for code, write_file for missing files, run_db_fix for data
1095
+ 4. Always call done with summary when finished — never end without calling done.
1083
1096
  ${primaryFile ? `\nFile: ${primaryFile}` : ""}
1084
1097
  Project: ${cwd}`;
1085
1098
  }
@@ -107,13 +107,18 @@ class GoalLoop {
107
107
  explanation: attempt.explanation,
108
108
  }).catch(() => {});
109
109
 
110
- // Deep research after 2nd failure — bring in RESEARCH_MODEL
111
- if (iteration >= 2) {
110
+ // Deep research only after 3rd failure — avoid adding latency on early iterations
111
+ if (iteration >= 3) {
112
112
  console.log(chalk.magenta(` 🔬 Triggering deep research after ${iteration} failures...`));
113
- const research = await this.researcher.research(errorMessage, context);
114
- if (research) {
115
- console.log(chalk.gray(` 🔬 Research insight: ${research.slice(0, 100)}`));
116
- }
113
+ try {
114
+ const research = await Promise.race([
115
+ this.researcher.research(errorMessage, context),
116
+ new Promise((resolve) => setTimeout(() => resolve(null), 30000)), // 30s cap
117
+ ]);
118
+ if (research) {
119
+ console.log(chalk.gray(` 🔬 Research insight: ${research.slice(0, 100)}`));
120
+ }
121
+ } catch {}
117
122
  }
118
123
  }
119
124
 
@@ -34,7 +34,7 @@ const SEED_DOCS = [
34
34
  metadata: { topic: "overview" },
35
35
  },
36
36
  {
37
- text: "Wolverine heal pipeline: crash detected → error parsed (file, line, message, errorType) → prompt injection scan (AUDIT_MODEL) → rate limit check → operational fix attempt (missing_module → npm install, missing_file → create file, permission → chmod — zero AI tokens) → if operational fix doesn't apply → fast path repair (CODING_MODEL, supports both code changes AND shell commands like npm install) → if fast path fails → agent path (REASONING_MODEL with tools including bash_exec for npm install) → if agent fails → sub-agents (explore → plan → fix, fixer has bash_exec) → verify fix (syntax check + boot probe) → rollback on failure. Error types classified: missing_module, missing_file, permission, port_conflict, syntax, runtime, unknown.",
37
+ text: "Wolverine heal pipeline: crash detected → error parsed (file, line, message, errorType) → prompt injection scan (AUDIT_MODEL) → rate limit check (per-signature + global 5/5min cap) → operational fix attempt (missing_module → npm install, missing_file → create file with inferred config, permission → chmod, port conflict → kill stale process — zero AI tokens) → if operational fix doesn't apply → fast path repair (CODING_MODEL, supports both code changes AND shell commands like npm install) → if fast path fails → agent path (REASONING_MODEL with tools including bash_exec, 45s per-API-call timeout) → if agent fails → sub-agents (explore → plan → fix, fixer has bash_exec) → verify fix (syntax check + boot probe + error classification comparison) → rollback on failure. Error types classified: missing_module, missing_file, permission, port_conflict, syntax, runtime, unknown. Heal timeout: 5 minutes via Promise.race. Config-aware turn budget: simple=4, config/ENOENT=5, complex=8 turns.",
38
38
  metadata: { topic: "heal-pipeline" },
39
39
  },
40
40
  {
@@ -66,7 +66,7 @@ const SEED_DOCS = [
66
66
  metadata: { topic: "verification" },
67
67
  },
68
68
  {
69
- text: "Wolverine multi-file agent: 15-turn agent loop with 18 tools across 7 categories. FILE: read_file (offset/limit), write_file (creates dirs), edit_file (find-and-replace), glob_files (pattern search), grep_code (regex with context), list_dir (directory listing with sizes), move_file (rename/relocate). SHELL: bash_exec (30s default, 60s cap), git_log, git_diff. DATABASE: inspect_db (tables/schema/SELECT on SQLite), run_db_fix (UPDATE/DELETE/ALTER with auto-backup). DIAGNOSTICS: check_port (find what uses a port), check_env (env vars, values redacted). DEPS: audit_deps (full npm health check), check_migration (known upgrade paths). RESEARCH: web_fetch. CONTROL: done. Used when fast path fails. Token budget 50k max.",
69
+ text: "Wolverine multi-file agent: turn-limited agent loop with 18 tools across 7 categories. Turn budget adapts to error type: simple (TypeError)=4, config/ENOENT=5, complex=8. Each AI call has 45s timeout via Promise.race — prevents indefinite hangs. If timeout occurs mid-fix, partial results returned. FILE: read_file (offset/limit), write_file (creates dirs), edit_file (find-and-replace), glob_files (pattern search), grep_code (regex with context), list_dir (directory listing with sizes), move_file (rename/relocate). SHELL: bash_exec (30s default, 60s cap), git_log, git_diff. DATABASE: inspect_db (tables/schema/SELECT on SQLite), run_db_fix (UPDATE/DELETE/ALTER with auto-backup). DIAGNOSTICS: check_port (find what uses a port), check_env (env vars, values redacted). DEPS: audit_deps (full npm health check), check_migration (known upgrade paths). RESEARCH: web_fetch (10s timeout). CONTROL: done. Prompt emphasizes fast action: fix immediately when solution is obvious, investigate only when cause unclear.",
70
70
  metadata: { topic: "agent" },
71
71
  },
72
72
  {
@@ -202,7 +202,7 @@ const SEED_DOCS = [
202
202
  metadata: { topic: "admin-auth" },
203
203
  },
204
204
  {
205
- text: "Operational fix layer: before calling AI, wolverine checks for common non-code errors that can be fixed instantly with zero tokens. Pattern 1: 'Cannot find module X' (where X is a package name, not a relative path) → runs npm install X (or just npm install if package is already in package.json). Pattern 2: ENOENT on config/data files (.json, .yaml, .env, .log, etc.) → creates the missing file with sensible defaults (empty JSON {}, empty string). Pattern 3: EACCES/EPERM → chmod 755 on the file. This layer runs before the AI repair loop and handles ~30% of production crashes at zero cost.",
205
+ text: "Operational fix layer: before calling AI, wolverine checks for common non-code errors that can be fixed instantly with zero tokens. Pattern 1: 'Cannot find module X' (where X is a package name, not a relative path) → runs npm install X via deps skill diagnosis. Pattern 2: ENOENT on config/data files (.json, .yaml, .env, .log, etc.) → for JSON configs, reads the source code that loads the file to infer expected fields (apiUrl, timeout, etc.) and creates the file with correct structure; for other types, creates empty file. Pattern 3: EACCES/EPERM → chmod 755 on the file. Pattern 4: EADDRINUSE → finds and kills stale process on the port (lsof on Linux, netstat on Windows). This layer runs before the AI repair loop and handles ~30% of production crashes at zero cost.",
206
206
  metadata: { topic: "operational-fix" },
207
207
  },
208
208
  {
@@ -214,7 +214,7 @@ const SEED_DOCS = [
214
214
  metadata: { topic: "agent-fix-strategy" },
215
215
  },
216
216
  {
217
- text: "Error Monitor: detects caught 500 errors that don't crash the process. Most production bugs are caught by Fastify/Express error handlers — the server stays alive but routes return 500. Wolverine's crash-based heal pipeline never triggers for these. ErrorMonitor tracks 5xx errors per route via IPC from child process. After N consecutive 500s within a time window (default: 3 failures in 30s), triggers the heal pipeline without killing the server. Error hook auto-injected via --require preload (no user code changes). Cooldown prevents heal spam (default: 60s per route). Stats available in dashboard and telemetry. Config: WOLVERINE_ERROR_THRESHOLD, WOLVERINE_ERROR_WINDOW_MS, WOLVERINE_ERROR_COOLDOWN_MS.",
217
+ text: "Error Monitor: detects caught 500 errors that don't crash the process. Most production bugs are caught by Fastify/Express error handlers — the server stays alive but routes return 500. Wolverine's crash-based heal pipeline never triggers for these. ErrorMonitor tracks 5xx errors per normalized route (/api/users/123 → /api/users/:id) via IPC from child process. Single error triggers heal (threshold=1, configurable). Error hook auto-injected via --require preload (no user code changes): hooks Fastify onError, wraps setErrorHandler, and auto-registers a default error handler if the user never sets one (catches async route throws). Cooldown prevents heal spam (default: 60s per route). Health check failures also trigger heal (not just restart). Config: WOLVERINE_ERROR_THRESHOLD, WOLVERINE_ERROR_WINDOW_MS, WOLVERINE_ERROR_COOLDOWN_MS.",
218
218
  metadata: { topic: "error-monitor" },
219
219
  },
220
220
  {
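The route normalization mentioned in the Error Monitor seed doc above (`/api/users/123` → `/api/users/:id`) could look roughly like the sketch below. The diff does not show the real implementation, so `normalizeRoute` is a hypothetical function, and treating UUID-shaped segments as IDs is an added assumption.

```javascript
// Hypothetical normalizer: collapses ID-like path segments so that
// /api/users/123 and /api/users/456 are counted as the same route.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function normalizeRoute(url) {
  // Ignore query string and fragment; only the path identifies the route.
  const path = url.split("?")[0].split("#")[0];
  return path
    .split("/")
    .map((seg) => (/^\d+$/.test(seg) || UUID_RE.test(seg) ? ":id" : seg))
    .join("/");
}
```

Grouping by normalized route is what lets a per-route cooldown apply to every request that hits the same handler, rather than to each distinct URL.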
@@ -265,6 +265,10 @@ const SEED_DOCS = [
265
265
  text: "Agent efficiency (claw-code patterns): (1) Anthropic prompt caching — system prompt marked with cache_control:{type:'ephemeral'}, cached server-side across agent turns, 90% cheaper on repeat calls (12-16K saved tokens per heal). (2) Tool result truncation — capped at 4K chars before entering message history, prevents context blowup from large grep/file reads. (3) Zero-cost structural compaction — extracts signals (tools used, files touched, errors found, actions taken) from message history WITHOUT an LLM call. Costs $0.00 vs old method that burned tokens on a compacting model. Triggers when estimated tokens > 10K (text.length/4 approximation). Preserves last 4 messages verbatim. (4) Token estimation — text.length/4+1, fast approximation without tokenizer, ~10% accurate. Used for budget decisions before API calls. (5) Error-graceful tools — tool errors returned as [ERROR] prefixed results, not thrown. Model sees the error and decides how to proceed. (6) Pre/post tool hooks — shell commands in .wolverine/hooks.json, exit 0=allow, 2=deny. Enables audit logging and policy enforcement without hard-coding.",
266
266
  metadata: { topic: "agent-efficiency" },
267
267
  },
268
+ {
269
+ text: "Robustness guards: (1) Heal concurrency guard — _healInProgress flag prevents parallel heals from health monitor + crash handler racing. (2) Global rate limit — 5 heals per 5 minutes regardless of error signature, prevents infinite loop of different errors burning API quota. (3) Heal timeout — Promise.race wraps _healImpl() with 5-minute timeout, clears _healInProgress on timeout. (4) Per-API-call timeout — 45s timeout in agent engine via Promise.race, returns partial results if files already modified. (5) bash_exec enforced timeout — 30s default, 60s hard cap via Math.min(). (6) PID file race prevention — exit handler only deletes PID file if it still belongs to current process. (7) SIGTERM startup grace — 3s grace period ignores SIGTERM on startup, prevents restart scripts from killing both old and new processes. (8) Research timeout — deep research capped at 30s, deferred to iteration 3+ to avoid slowing early fix attempts.",
270
+ metadata: { topic: "robustness-guards" },
271
+ },
268
272
  {
269
273
  text: "Cost optimization: 7 techniques reduce heal cost from $0.31 to $0.02 for simple errors. (1) Verifier skips route probe for simple errors (TypeError/ReferenceError/SyntaxError) — trusts syntax+boot, ErrorMonitor is safety net. Prevents false-rejection cascades. (2) Sub-agents use Haiku (classifier model) for explore/plan/verify/research — only fixer uses Sonnet/Opus. 6 Haiku calls=$0.006 vs 6 Sonnet calls=$0.12. (3) Agent context compacted every 3 turns using compacting model — prevents 15K→95K token blowup. (4) Brain checked for cached fix patterns before AI — repeat errors cost $0. (5) Token budgets capped by error complexity: simple=20K agent budget, moderate=50K, complex=100K. Simple errors get 4 agent turns max. (6) Prior attempt summaries (not full context) passed between iterations — concise 'do NOT repeat' directives. (7) Fast path includes last known good backup code so AI can revert broken additions instead of patching around them.",
270
274
  metadata: { topic: "cost-optimization" },
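The agent-efficiency seed doc above describes three cheap heuristics: a `text.length/4+1` token estimate, a 4K-character cap on tool output, and a compaction trigger at 10K estimated tokens. The sketch below illustrates them under stated assumptions: the function names are hypothetical, "4K" is taken as 4000 characters, the division is rounded down, and the truncation marker text is invented for illustration.

```javascript
// Rough token estimate from the seed doc: text.length / 4 + 1 (no tokenizer).
// Rounding down is an assumption; the source does not say how it rounds.
function estimateTokens(text) {
  return Math.floor(text.length / 4) + 1;
}

// Caps tool output at `maxChars` characters. The 4000 default and the
// marker text are illustrative assumptions.
function truncateToolResult(output, maxChars = 4000) {
  if (output.length <= maxChars) return output;
  return output.slice(0, maxChars) + `\n[truncated ${output.length - maxChars} chars]`;
}

// Compaction trigger from the seed doc: estimated history above 10K tokens.
function shouldCompact(messages, threshold = 10000) {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  return total > threshold;
}
```

Because the estimate needs only string lengths, it can run before every API call at negligible cost, which is why it suits budget decisions better than a real tokenizer would.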
@@ -58,20 +58,36 @@ Module._load = function (request, parent, isMain) {
58
58
  function _hookFastify(fastify) {
59
59
  // Wrap setErrorHandler so our IPC reporting runs BEFORE the user's handler
60
60
  const origSetError = fastify.setErrorHandler;
61
+ let customErrorHandlerSet = false;
61
62
  fastify.setErrorHandler = function (userHandler) {
63
+ customErrorHandlerSet = true;
62
64
  return origSetError.call(this, function (error, request, reply) {
63
65
  _reportError(request.url, request.method, error);
64
66
  return userHandler.call(this, error, request, reply);
65
67
  });
66
68
  };
67
69
 
68
- // Also add onError hook as a fallback (fires even if no custom error handler)
70
+ // Add onError hook as primary fallback; fires for all route errors in Fastify
69
71
  try {
70
72
  fastify.addHook("onError", function (request, reply, error, done) {
71
73
  _reportError(request.url, request.method, error);
72
74
  done();
73
75
  });
74
76
  } catch { /* addHook may fail if server is already started */ }
77
+
78
+ // Register a default error handler if user never calls setErrorHandler
79
+ // This ensures we catch async route throws even without a custom handler
80
+ try {
81
+ fastify.addHook("onReady", function (done) {
82
+ if (!customErrorHandlerSet) {
83
+ origSetError.call(fastify, function (error, request, reply) {
84
+ _reportError(request.url, request.method, error);
85
+ reply.code(error.statusCode || 500).send({ error: error.message });
86
+ });
87
+ }
88
+ done();
89
+ });
90
+ } catch { /* non-fatal */ }
75
91
  }
76
92
 
77
93
  function _hookExpress(app) {
@@ -332,9 +332,12 @@ async function _healImpl({ stderr, cwd, sandbox, notifier, rateLimiter, backupMa
332
332
  } else if (iteration <= 2) {
333
333
  // Agent path — REASONING_MODEL (also handles iteration 1 when no file)
334
334
  console.log(chalk.magenta(` 🤖 Agent path (${getModel("reasoning")})...`));
335
+ // Tight turn budget: simple errors get 4 turns, ENOENT/config gets 5, complex gets 8
336
+ const isConfigError = /ENOENT|missing.*config|missing.*file|no such file/i.test(parsed.errorMessage);
337
+ const agentMaxTurns = isSimpleError ? 4 : isConfigError ? 5 : 8;
335
338
  const agent = new AgentEngine({
336
339
  sandbox, logger, cwd, mcp,
337
- maxTurns: isSimpleError ? 4 : 8,
340
+ maxTurns: agentMaxTurns,
338
341
  maxTokens: tokenBudget.agent,
339
342
  });
340
343
 
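The turn-budget choice in the hunk above can be isolated into a pure function, which makes the precedence explicit: a simple error always gets 4 turns, even if its message also looks like a config error. This is a sketch only; `pickMaxTurns` and `CONFIG_ERROR_RE` are hypothetical names, while the regular expression and the 4/5/8 values are copied from the diff.

```javascript
// Same pattern as the diff's isConfigError test.
const CONFIG_ERROR_RE = /ENOENT|missing.*config|missing.*file|no such file/i;

// Hypothetical extraction of the ternary in the hunk:
// simple errors get 4 turns, ENOENT/config errors 5, everything else 8.
function pickMaxTurns(errorMessage, isSimpleError) {
  if (isSimpleError) return 4;
  return CONFIG_ERROR_RE.test(errorMessage) ? 5 : 8;
}
```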
@@ -496,12 +499,20 @@ async function tryOperationalFix(parsed, cwd, logger) {
496
499
  if (!rel.startsWith("..") && /\.(json|yaml|yml|toml|ini|conf|cfg|env|log|txt|csv|db|sqlite)$/i.test(missingFile)) {
497
500
  try {
498
501
  fs.mkdirSync(path.dirname(missingFile), { recursive: true });
499
- // Create empty file or sensible default
500
502
  const ext = path.extname(missingFile).toLowerCase();
501
- const defaults = { ".json": "{}", ".yaml": "", ".yml": "", ".log": "", ".txt": "", ".csv": "", ".env": "" };
502
- fs.writeFileSync(missingFile, defaults[ext] || "", "utf-8");
503
+
504
+ // For JSON config files, try to infer expected structure from the code that loads them
505
+ let content = "";
506
+ if (ext === ".json") {
507
+ content = _inferJsonConfig(missingFile, cwd, parsed) || "{}";
508
+ } else {
509
+ const defaults = { ".yaml": "", ".yml": "", ".log": "", ".txt": "", ".csv": "", ".env": "" };
510
+ content = defaults[ext] || "";
511
+ }
512
+
513
+ fs.writeFileSync(missingFile, content, "utf-8");
503
514
  console.log(chalk.blue(` 📄 Created missing file: ${rel}`));
504
- return { fixed: true, action: `Created missing file: ${rel}` };
515
+ return { fixed: true, action: `Created missing file: ${rel} with ${content === "{}" ? "empty" : "inferred"} config` };
505
516
  } catch {}
506
517
  }
507
518
  }
@@ -544,4 +555,57 @@ async function tryOperationalFix(parsed, cwd, logger) {
544
555
  return { fixed: false };
545
556
  }
546
557
 
558
+ /**
559
+ * Try to infer JSON config structure by scanning the code that loads the file.
560
+ * Looks for property access patterns after require/readFile of the missing file.
561
+ * Returns a JSON string with empty/default values, or null if can't infer.
562
+ */
563
+ function _inferJsonConfig(missingFile, cwd, parsed) {
564
+ const fs = require("fs");
565
+ const path = require("path");
566
+
567
+ // Find which source file loads the missing config
568
+ const basename = path.basename(missingFile);
569
+ const sourceFile = parsed.filePath;
570
+ if (!sourceFile) return null;
571
+
572
+ try {
573
+ const source = fs.readFileSync(sourceFile, "utf-8");
574
+ // Look for property accesses on the loaded config: config.apiUrl, config.timeout, etc.
575
+ const configVarMatch = source.match(new RegExp(`(?:const|let|var)\\s+(\\w+)\\s*=\\s*(?:require|JSON\\.parse).*${basename.replace(".", "\\.")}`));
576
+ if (!configVarMatch) return null;
577
+
578
+ const varName = configVarMatch[1];
579
+ // Find all property accesses: varName.prop or varName["prop"]
580
+ const propRegex = new RegExp(`${varName}\\.(\\w+)`, "g");
581
+ const bracketRegex = new RegExp(`${varName}\\["(\\w+)"\\]`, "g");
582
+ const props = new Set();
583
+ let m;
584
+ while ((m = propRegex.exec(source)) !== null) props.add(m[1]);
585
+ while ((m = bracketRegex.exec(source)) !== null) props.add(m[1]);
586
+
587
+ if (props.size === 0) return null;
588
+
589
+ // Build config with sensible defaults based on property names
590
+ const config = {};
591
+ for (const prop of props) {
592
+ const lower = prop.toLowerCase();
593
+ if (/url|endpoint|host|uri/.test(lower)) config[prop] = "http://localhost:3000";
594
+ else if (/port/.test(lower)) config[prop] = 3000;
595
+ else if (/timeout|delay|interval|ttl/.test(lower)) config[prop] = 5000;
596
+ else if (/key|token|secret/.test(lower)) config[prop] = "placeholder";
597
+ else if (/name/.test(lower)) config[prop] = "default";
598
+ else if (/enabled|active|debug/.test(lower)) config[prop] = true;
599
+ else if (/count|max|min|limit|size/.test(lower)) config[prop] = 10;
600
+ else if (/path|dir|file/.test(lower)) config[prop] = "./";
601
+ else config[prop] = "";
602
+ }
603
+
604
+ console.log(chalk.gray(` 🔍 Inferred ${props.size} config fields from ${path.basename(sourceFile)}: ${[...props].join(", ")}`));
605
+ return JSON.stringify(config, null, 2);
606
+ } catch {
607
+ return null;
608
+ }
609
+ }
610
+
547
611
  module.exports = { heal };
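One detail of `_inferJsonConfig` above is worth illustrating: `basename.replace(".", "\\.")` escapes only the first dot, because `String.prototype.replace` with a string pattern replaces a single occurrence. The sketch below shows a more defensive variant under stated assumptions. `escapeRegExp`, `findConfigVar`, and `findProps` are hypothetical helpers, not functions from the package, and the `\b` word boundary is an addition that the diff does not have.

```javascript
// Escapes every regex metacharacter in `text` so it matches literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of the variable-name lookup: finds a declaration such as
// `const X = require(".../<basename>")` on one line and returns X, or null.
function findConfigVar(source, basename) {
  const re = new RegExp(
    `(?:const|let|var)\\s+(\\w+)\\s*=\\s*(?:require|JSON\\.parse).*${escapeRegExp(basename)}`
  );
  const match = source.match(re);
  return match ? match[1] : null;
}

// Collects property names accessed as name.prop or name["prop"].
// The leading \b keeps `myconfig.x` from matching when the variable is `config`.
function findProps(source, varName) {
  const name = escapeRegExp(varName);
  const re = new RegExp(`\\b${name}(?:\\.(\\w+)|\\["(\\w+)"\\])`, "g");
  const props = new Set();
  for (const m of source.matchAll(re)) props.add(m[1] || m[2]);
  return [...props];
}
```

Both this sketch and the code in the diff scan raw source text, so a path string such as `"./config.json"` next to a variable named `config` would itself look like a property access (`json`). A stricter version would need to skip string literals.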
@@ -242,8 +242,15 @@ function ensureSingleProcess(cwd) {
242
242
  fs.writeFileSync(pidFile, String(process.pid), "utf-8");
243
243
  } catch {}
244
244
 
245
- // Clean up on exit
246
- process.on("exit", () => { try { fs.unlinkSync(pidFile); } catch {} });
245
+ // Clean up on exit — only delete if PID file still belongs to us
246
+ // (prevents race condition where old process deletes new process's PID)
247
+ const myPid = process.pid;
248
+ process.on("exit", () => {
249
+ try {
250
+ const current = parseInt(fs.readFileSync(pidFile, "utf-8").trim(), 10);
251
+ if (current === myPid) fs.unlinkSync(pidFile);
252
+ } catch {}
253
+ });
247
254
  }
248
255
 
249
256
  // ── Skill Metadata ──
package/CLAUDE.md DELETED
@@ -1,146 +0,0 @@
1
- # CLAUDE.md
2
-
3
- This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4
-
5
- ## What This Is
6
-
7
- Wolverine is a self-healing Node.js server framework. It wraps a server process, catches crashes AND caught 500 errors, diagnoses them with AI (OpenAI or Anthropic), generates fixes, verifies them, and restarts — automatically. Published as `wolverine-ai` on npm (v3.1.0). 65 exports, 83 files, 6 skills.
8
-
9
- ## Commands
10
-
11
- ```bash
12
- npm start # Run server/index.js under wolverine (self-healing)
13
- npm run server # Run server/index.js directly (no healing)
14
- npm run test:pentest # Security scan for secret leakage
15
- npm run demo:list # List demo scenarios
16
- npm run demo -- 01 # Run specific demo
17
- npx wolverine server/index.js # CLI entry point
18
- wolverine --info # System detection
19
- wolverine --update # Safe framework upgrade
20
- wolverine --backup "reason" # Create server snapshot
21
- wolverine --list-backups # Show all snapshots
22
- wolverine --rollback <id> # Restore specific backup
23
- wolverine --rollback-latest # Restore most recent
24
- ```
25
-
26
- No standard test runner — demos in `tests/fixtures/` serve as integration tests.
27
-
28
- ## Architecture
29
-
30
- ### Heal Pipeline (src/core/wolverine.js)
31
-
32
- ```
33
- Error detected (crash OR caught 500 via IPC)
34
- → Empty stderr? → Just restart, no AI ($0.00)
35
- → Parse error → classify type → redact secrets
36
- → Injection scan (skip if < 20 chars)
37
- → Loop guard: same error failed 3+ times in 10min? → File bug report, stop
38
- → Rate limit: 5 heals per 5min max
39
- → Operational fix (zero tokens):
40
- missing_module → deps.diagnose() → npm install
41
- EADDRINUSE → kill stale process
42
- ENOENT → create missing file
43
- EACCES → chmod
44
- → Token budget by complexity: simple=20K, moderate=50K, complex=100K
45
- → Goal Loop (3 iterations):
46
- 1. Fast path: CODING_MODEL, JSON with code+commands, backup diff context
47
- 2. Agent: dynamic prompt (400 tokens simple, 1200 complex), 18 tools
48
- 3. Sub-agents: explore→plan→fix (Haiku triage, Sonnet/Opus fix only)
49
- → Verify: syntax → boot probe (route probe skipped — ErrorMonitor is safety net)
50
- → Success: retryCount reset, record to brain with full context
51
- → Fail: rollback, brain records "DO NOT REPEAT", next iteration
52
- ```
53
-
54
- `heal()` wraps `_healImpl()` with 5-minute `Promise.race` timeout.
55
-
56
- ### IPC Error Chain (caught 500s without crash)
57
-
58
- 1. **error-hook.js** — preloaded via `--require`, patches Fastify/Express for IPC. WeakSet dedup.
59
- 2. **runner.js** — spawns child with `stdio: ["inherit","inherit","pipe","ipc"]`, listens `child.on("message")`
60
- 3. **error-monitor.js** — tracks errors per normalized route (`/api/users/123` → `/api/users/:id`), threshold=1, 60s cooldown. Health check failures also trigger heal.
-
- ### AI Client (src/core/ai-client.js)
-
- Dual provider: OpenAI + Anthropic. The provider is auto-detected from the model name (`claude-*` → Anthropic). All responses are normalized to `{content, toolCalls, usage}`. **Anthropic prompt caching** — system prompt marked `cache_control: ephemeral`, 90% cheaper on repeat calls. Per-model output limits with 10% buffer. Every call is tracked: latencyMs, success/failure, tokens, cost.
-
- Embeddings always use OpenAI (Anthropic has no embedding API).
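As a rough illustration of the detection and normalization described above, the sketch below maps an OpenAI Chat Completions response and an Anthropic Messages response onto the `{content, toolCalls, usage}` shape. The function names and the inner field names (`id`, `name`, `args`, `input`, `output`) are assumptions; the notes only give the top-level keys.

```javascript
// Hypothetical helper: picks a provider from the model name.
function detectProvider(model) {
  return model.startsWith("claude-") ? "anthropic" : "openai";
}

// Hypothetical normalizer. Inner field names are assumed, not taken from the package.
function normalizeResponse(provider, raw) {
  if (provider === "anthropic") {
    // Anthropic Messages API: content is an array of typed blocks.
    const blocks = raw.content || [];
    return {
      content: blocks.filter((b) => b.type === "text").map((b) => b.text).join(""),
      toolCalls: blocks
        .filter((b) => b.type === "tool_use")
        .map((b) => ({ id: b.id, name: b.name, args: b.input })),
      usage: { input: raw.usage.input_tokens, output: raw.usage.output_tokens },
    };
  }
  // OpenAI Chat Completions: tool call arguments arrive as a JSON string.
  const msg = raw.choices[0].message;
  return {
    content: msg.content || "",
    toolCalls: (msg.tool_calls || []).map((c) => ({
      id: c.id,
      name: c.function.name,
      args: JSON.parse(c.function.arguments),
    })),
    usage: { input: raw.usage.prompt_tokens, output: raw.usage.completion_tokens },
  };
}
```

Callers can then treat both providers identically, for example by looping over `toolCalls` without checking which API produced them.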
-
- ### Agent (src/agent/agent-engine.js)
-
- **Dynamic system prompt**: simple errors (TypeError/ReferenceError) get a compact 400-token prompt with 7 tools. Complex errors get the full prompt with all 18 tools + strategy table.
-
- 18 tools: file (read/write/edit/glob/grep/list_dir/move_file), shell (bash_exec/git_log/git_diff), database (inspect_db/run_db_fix), diagnostics (check_port/check_env), deps (audit_deps/check_migration), research (web_fetch), control (done).
-
- **Cost optimizations**: zero-cost structural compaction (no LLM, extracts signals from messages), tool result truncation (4K cap), token estimation (`text.length/4`), pre/post tool hooks (`.wolverine/hooks.json`), error-graceful tools (failures returned as `[ERROR]` results, not thrown).
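Three of the cost optimizations listed above are small enough to sketch directly. The helper names are hypothetical, the 4K cap is assumed to be measured in characters, and rounding the estimate up is also an assumption; the notes only give `text.length/4`.

```javascript
const TOOL_RESULT_CAP = 4000; // assumed to be characters; the notes only say "4K cap"

// Cheap token estimate: roughly four characters per token, rounded up.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keeps the head of an oversized tool result and notes how much was dropped.
function truncateToolResult(text, cap = TOOL_RESULT_CAP) {
  if (text.length <= cap) return text;
  const dropped = text.length - cap;
  return `${text.slice(0, cap)}\n[truncated ${dropped} chars]`;
}

// Error-graceful wrapper: a failing tool yields an "[ERROR]" string instead of throwing,
// so the agent loop can read the failure as an ordinary tool result.
async function runTool(fn, args) {
  try {
    return truncateToolResult(String(await fn(args)));
  } catch (err) {
    return `[ERROR] ${err.message}`;
  }
}
```

Returning errors as text lets the model see what went wrong and try a different tool call, rather than aborting the whole turn.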
-
- **Protected paths**: agent cannot modify `src/`, `bin/`, `tests/`, `node_modules/`, `.env`, `package.json`. Only `server/` is editable.
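One way to enforce the "only `server/` is editable" rule is an allowlist check on the resolved path, sketched below. `isEditable` and `EDITABLE_ROOT` are hypothetical names, and the check does not follow symlinks.

```javascript
const path = require("node:path");

const EDITABLE_ROOT = "server"; // the only writable directory, per the note above

// Hypothetical guard: true only if the target resolves inside <projectRoot>/server.
// Resolving first means "server/../package.json" is rejected rather than trusted.
function isEditable(projectRoot, target) {
  const root = path.resolve(projectRoot, EDITABLE_ROOT);
  const resolved = path.resolve(projectRoot, target);
  return resolved === root || resolved.startsWith(root + path.sep);
}
```

An allowlist is simpler to reason about than listing every protected path: anything outside `server/` is refused by default, including files added later.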
-
- ### Provider Config (server/config/settings.json)
-
- ```json
- { "provider": "hybrid", "openai_settings": {...}, "anthropic_settings": {...}, "hybrid_settings": {...} }
- ```
-
- Config loader reads `{provider}_settings`. Env vars override individual roles. Missing config sections are auto-patched on startup via `_ensureDefaults()`.
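The lookup order (env vars, then `{provider}_settings`, then defaults) could be sketched as below. Everything beyond that order is assumed for illustration: the `DEFAULTS` values are placeholders, the `<ROLE>_MODEL` env naming is a guess generalized from `CODING_MODEL`, and the settings section is assumed to hold one key per role.

```javascript
// Placeholder defaults; the real values are not given in the notes.
const DEFAULTS = { coding: "default-coding-model", chat: "default-chat-model" };

// Hypothetical resolver: env var wins, then the active provider section, then defaults.
function resolveModels(settings, env = process.env) {
  const section = settings[`${settings.provider}_settings`] || {};
  const resolved = {};
  for (const role of Object.keys(DEFAULTS)) {
    const envKey = `${role.toUpperCase()}_MODEL`; // e.g. CODING_MODEL (assumed naming)
    resolved[role] = env[envKey] || section[role] || DEFAULTS[role];
  }
  return resolved;
}
```

For example, with `provider: "hybrid"` the function reads `hybrid_settings`, and setting `CODING_MODEL` in the environment overrides only the coding role.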
-
- ### Brain (src/brain/vector-store.js + brain.js)
-
- IVF-indexed vector store: k-means++ clustering, BM25 keyword search, binary persistence. 60 seed docs. Benchmarks: 100=0.2ms, 10K=4.4ms, 50K=23.7ms.
-
- **Namespace isolation**: error heals search only `errors/fixes/learnings/functions` — seed docs (20K tokens) excluded unless query is about wolverine itself. Function map hash check skips re-embedding if unchanged.
-
- ### Backup (src/backup/backup-manager.js)
-
- All backups in `~/.wolverine-safe-backups/` (outside project, survives git pull/npm install). States: UNSTABLE → VERIFIED → STABLE (30min). Protected files never rolled back: `settings.json`, `db.js`, `.env.local`.
-
- ### Skills (src/skills/ — 6 files)
-
- - **sql.js** — injection prevention, SafeDB, idempotency guard
- - **deps.js** — dependency diagnosis (zero tokens), npm audit, migration paths
- - **update.js** — safe framework upgrade, emergency backup, brain seed merge
- - **backup.js** — agent-friendly backup/rollback with CLI commands
- - **loop-guard.js** — infinite loop detection, bug reports, process dedup (PID file)
- - **skill-registry.js** — auto-discovery + token-scored matching
-
- ### Telemetry (src/platform/)
-
- Heartbeats every 60s. Stable instance ID (persisted to `.wolverine/instance-id`). Cumulative usage from disk (not session-only). `byModel` with latency/success/tokens-per-sec/cost-per-call. `byProvider` aggregated. Auto-update checks every 5min, selective git checkout (never touches `server/`).
-
- ## Key Constraints
-
- - **Server port is always 3000.** Any other port breaks APIs. Kill whatever holds port 3000 and bind there.
- - **Dashboard on PORT+1** (3001).
- - **heal() has 5-minute timeout.** `Promise.race` recovery.
- - **Global rate limit: 5 heals per 5 minutes.**
- - **Loop guard: 3 failed heals on same error in 10min → stop + bug report.**
- - **Error threshold: 1** — single 500 triggers heal. 60s cooldown per route.
- - **Empty stderr → just restart, no AI.** Prevents token burn on signal kills.
- - **bash_exec: 30s default, 60s cap.**
- - **Process dedup via PID file.** Kills old process on startup.
- - **Both API keys needed for hybrid mode** — OPENAI_API_KEY for embeddings.
- - **Auto-update: selective git checkout** — only updates `src/`, `bin/`, `package.json`. Never touches `server/`.
- - **Rollback protects:** `settings.json`, `db.js`, `.env.local` never overwritten.
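The global rate limit and the loop guard from the list above are both "N events in a time window" rules, so one small structure can illustrate both. This is a sketch under assumptions: the class names are hypothetical, and the notes do not say whether the real checks use a sliding or a fixed window.

```javascript
// Sliding-window counter: allows at most `limit` events within any `windowMs` span.
class SlidingWindow {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.stamps = [];
  }

  // Number of recorded events still inside the window at `now`.
  count(now = Date.now()) {
    this.stamps = this.stamps.filter((t) => now - t < this.windowMs);
    return this.stamps.length;
  }

  // Records an event if the window has room; returns false when the limit is hit.
  tryAcquire(now = Date.now()) {
    if (this.count(now) >= this.limit) return false;
    this.stamps.push(now);
    return true;
  }
}

// Global heal limiter: 5 heals per 5 minutes.
const healLimiter = new SlidingWindow(5, 5 * 60 * 1000);

// Loop guard: one failure window per error signature, 3 failures in 10 minutes.
class LoopGuardSketch {
  constructor(maxFailures = 3, windowMs = 10 * 60 * 1000) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.failures = new Map(); // signature -> SlidingWindow
  }

  recordFailure(signature, now = Date.now()) {
    if (!this.failures.has(signature)) {
      // Limit is Infinity because failures are always recorded, never refused.
      this.failures.set(signature, new SlidingWindow(Infinity, this.windowMs));
    }
    this.failures.get(signature).tryAcquire(now);
  }

  // True when the same error has failed often enough that healing should stop.
  shouldStop(signature, now = Date.now()) {
    const w = this.failures.get(signature);
    return w !== undefined && w.count(now) >= this.maxFailures;
  }
}
```

A heal attempt would call `healLimiter.tryAcquire()` before starting and `shouldStop(signature)` before retrying the same error, filing a bug report when the guard trips.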
-
- ## Configuration
-
- - **Secrets:** `.env.local` (OPENAI_API_KEY, ANTHROPIC_API_KEY, WOLVERINE_ADMIN_KEY)
- - **Settings:** `server/config/settings.json` — provider, 3 model presets, cluster, telemetry, rate limits, health checks, autoUpdate, errorMonitor
- - **10 model slots**, including: reasoning, coding, chat, tool, classifier, audit, compacting, research, embedding
- - **Config priority:** env vars > `{provider}_settings` > defaults
-
- ## Files That Matter Most
-
- | File | Why |
- |------|-----|
- | `src/core/wolverine.js` | Heal pipeline, operational fixes, goal loop, token budgets |
- | `src/core/runner.js` | Process manager, IPC, health/error monitors, loop guard, auto-update |
- | `src/core/ai-client.js` | Dual provider, prompt caching, output limits, latency tracking |
- | `src/agent/agent-engine.js` | Dynamic prompt, 18 tools, zero-cost compaction, hooks |
- | `src/agent/sub-agents.js` | Dynamic token budgets, Haiku triage, restricted tool sets |
- | `src/core/verifier.js` | Syntax + boot probe, error classification comparison |
- | `src/brain/vector-store.js` | IVF + BM25 + binary persistence |
- | `src/brain/brain.js` | 60 seed docs, namespace isolation, function map hash |
- | `src/skills/loop-guard.js` | Infinite loop detection, bug reports, process dedup |
- | `src/skills/update.js` | Safe upgrade, emergency backup, brain seed merge |
- | `src/platform/auto-update.js` | Version lock, dep verification, max 1 attempt per boot |
- | `server/config/settings.json` | Provider selection, 3 model presets, all config |