@ducci/jarvis 1.0.15 → 1.0.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,120 @@
# Finding 003: Event Loop Blocking, Async File I/O, and Session Reliability

**Date:** 2026-02-27
**Severity:** High — caused observed 100% CPU and server unresponsiveness in production
**Status:** Fixed

---

## What Happened

A session was started with the question *"Kannst du deinen source code finden und anschauen mittels Tools?"* ("Can you find and look at your source code using tools?"). The agent used the `exec` tool to run two full-filesystem scans:

```
find / -type f \( -iname "*.js" -o -iname "*.ts" -o -iname "*.py" \) 2>/dev/null | head -20
find / -type d -name "jarvis" 2>/dev/null
```

Both commands start from the filesystem root `/`. The second has no output limit and scans everything: real disk filesystems, `/proc`, `/sys`, `/dev`, and any network mounts. On the affected Linux server the CPU reached 100% and the server became unresponsive; it had to be shut down manually.

---

## Root Cause

### 1. `execSync` blocks the entire Node.js event loop

Both `exec` and `list_dir` used `execSync` from `child_process`. `execSync` is a synchronous call that blocks the event loop for its entire duration. While any shell command runs:

- Express cannot process incoming HTTP requests
- The Telegram bot cannot receive or process new messages
- All timers and async callbacks are frozen (including the Telegram `typingInterval`, so the user sees no activity indicator)

The OS sees a CPU-hungry `find` child process running at full speed while Node.js sits blocked waiting for it. Combined, this presents as ~100% CPU with a completely unresponsive server.

Additionally, `list_dir` used `execSync` with **no timeout at all**. A hanging command (e.g. `ls` on an NFS mount or a blocked `/proc` entry) would freeze the server permanently.

### 2. All file I/O was synchronous

`loadSession`, `saveSession`, `appendLog`, and `loadTools` all used `fs.*Sync` variants. In an async Node.js server these block the event loop on every request. For small files the impact is measured in microseconds, but the pattern is architecturally incorrect and accumulates under load.

### 3. Session not saved on unexpected error

In `handleChat`, `saveSession` was called unconditionally after the `try/catch` block. If the catch re-threw an unexpected error, `saveSession` was never reached. The user message had already been appended to the in-memory session, but the on-disk version did not reflect it — leaving the session in an inconsistent state for the next request.

### 4. No concurrency protection per session

The Telegram channel uses `@grammyjs/runner`, which processes updates concurrently. If a user sent two messages in quick succession, both `handleChat` calls could load the same session simultaneously, run independent agent loops, and then overwrite each other's `saveSession` writes. The second write would silently discard the first response.

### 5. Seed tools never updated after initial creation

`seedTools()` used `if (!existing[name])` — it only wrote a seed tool on its first run. Any later update to `exec` or `list_dir` in the source code would never propagate to an existing installation. This blocked the async fix for `exec` and `list_dir` from taking effect.

---

## Fixes

### 1. `exec` and `list_dir` → async (`src/server/tools.js`)

**`exec`**: replaced `execSync` with `promisify(exec)`. The event loop is now free while the shell command executes. The 60 s timeout is preserved, and a 2 MB `maxBuffer` is now set explicitly.

**`list_dir`**: replaced `execSync` with `promisify(execFile)`. `execFile` does not go through a shell interpreter, which makes it safer against special characters in paths. Added a 10-second timeout (previously there was none).

### 2. `executeTool` global timeout (`src/server/tools.js`)

All tool executions — both built-in and AI-created — are now wrapped in `Promise.race` against a 60-second timeout. This protects against AI-created tools that hang on async operations (network requests, file I/O). The timeout matches the `exec` tool's own limit for consistency.

```js
const timeout = new Promise((_, reject) =>
  setTimeout(() => reject(new Error(`Tool '${name}' timed out after 60s`)), 60_000)
);
return await Promise.race([fn(toolArgs, fs, path, process, _require), timeout]);
```

Note: this does not protect against synchronous CPU loops without `await` points — that would require Worker Threads. Such code is unlikely to be generated accidentally.

### 3. Seed tools always updated (`src/server/tools.js`)

`seedTools()` now compares the serialized content of each seed tool against the stored version and overwrites only when there is a difference. Updates to built-in tools propagate on the next server start without touching user-created tools.
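
The comparison is a plain serialized-equality check. A standalone sketch of the idea (`syncSeedTools` is an illustrative extraction; the real logic lives inside `seedTools()`):

```js
// Overwrite a seed tool only when its serialized form differs from what is
// stored. User-created tools have different names and are never touched.
function syncSeedTools(existing, seeds) {
  let changed = false;
  for (const [name, tool] of Object.entries(seeds)) {
    if (JSON.stringify(existing[name]) !== JSON.stringify(tool)) {
      existing[name] = tool;
      changed = true;
    }
  }
  return changed; // caller persists only when something changed
}
```

Running it twice with the same seeds is a no-op the second time, so unchanged installs never rewrite the tools file.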

### 4. All file I/O → async (`src/server/sessions.js`, `src/server/logging.js`, `src/server/tools.js`)

`loadSession`, `saveSession`, `appendLog`, and `loadTools` now use `fs.promises.*`. All callers in `agent.js` are updated to `await` these calls.

### 5. `saveSession` moved to a `finally` block (`src/server/agent.js`)

The session is now always persisted — on success, on model error, and on unexpected errors. A failed save is caught and logged without masking the original error.

```js
} finally {
  try {
    await saveSession(sessionId, session);
  } catch (saveErr) {
    console.error(`Failed to save session ${sessionId}:`, saveErr);
  }
}
```

### 6. Session queue for concurrency control (`src/server/agent.js`)

A module-level `Map<sessionId, Promise>` serializes concurrent requests for the same session. Each new request registers itself as the tail of the queue and waits for the previous request to resolve before starting. The map entry is cleaned up by whichever request is last in the chain.

```js
const previous = sessionQueues.get(sessionId) ?? Promise.resolve();
let releaseLock;
const current = new Promise(resolve => { releaseLock = resolve; });
sessionQueues.set(sessionId, current);
await previous;
// ... process request ...
// finally: releaseLock()
```

This is safe in Node.js because the event loop is single-threaded: `get`, `new Promise`, and `set` all execute synchronously before the first `await`, so there is no race in which two requests read the same `undefined` entry.
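
The pattern can be packaged as a small self-contained wrapper to see the serialization in action (`withSessionLock` is an illustrative extraction; `handleChat` inlines the same steps):

```js
const sessionQueues = new Map();

async function withSessionLock(sessionId, fn) {
  // Chain onto the previous request for this session (or start fresh).
  const previous = sessionQueues.get(sessionId) ?? Promise.resolve();
  let releaseLock;
  const current = new Promise(resolve => { releaseLock = resolve; });
  sessionQueues.set(sessionId, current);
  await previous;
  try {
    return await fn();
  } finally {
    releaseLock();
    // Clean up only if no request queued behind us in the meantime.
    if (sessionQueues.get(sessionId) === current) sessionQueues.delete(sessionId);
  }
}
```

Two concurrent calls for the same session run strictly one after the other, and the map is empty again once the last one finishes.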

---

## What Was Not Changed

- The agent loop logic, checkpoint/handoff system, loop detection, and format recovery — all unchanged.
- `seedTools()` remains synchronous (called once at startup, before the server accepts requests).
- `createSession()` and `getToolDefinitions()` remain synchronous (pure functions, no I/O).
- No rate limiting or HTTP authentication was added — the server is intended for local/personal use only.
@@ -44,6 +44,15 @@ You have access to a set of tools. Each tool has a name and description that tel
  - If the user shares personal information, persist it using the appropriate tool.
  - Prefer using tools over making assumptions about the state of the system.
 
+ ## exec Safety
+
+ The `exec` tool runs real shell commands on the server. Use it responsibly:
+
+ - **Never scan from filesystem root.** Commands like `find /`, `find / -name ...`, or `ls -R /` will scan everything including `/proc`, `/sys`, and network mounts. They can saturate CPU and I/O for minutes. Always scope `find` to a specific directory (e.g. `find ~/jarvis -name "*.js"`).
+ - **Use known paths.** Prefer `process.cwd()`, `$HOME`, or paths you already know over broad searches. Use `which <binary>` to locate executables.
+ - **Prefer targeted reads.** Use `grep`, `head`, or `tail` instead of `cat` on files you haven't seen before. Large file output is truncated anyway — a targeted command gives you better signal.
+ - **Avoid commands with unbounded runtime.** If a command could run indefinitely or scan an unknown-size tree, scope it first.
+
  ## logSummary Guidelines
 
  The `logSummary` is written for a human observer, not for the user. It must:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@ducci/jarvis",
-   "version": "1.0.15",
+   "version": "1.0.17",
    "description": "A fully automated agent system that lives on a server.",
    "main": "./src/index.js",
    "type": "module",
package/src/server/agent.js CHANGED
@@ -24,6 +24,11 @@ Respond with your normal JSON, but add a checkpoint field:
 
  The checkpoint field will be used to automatically resume the task in the next run.]`;
 
+ // Serializes concurrent requests for the same session. Maps sessionId to the
+ // tail of the current request chain (a Promise that resolves when the last
+ // queued request finishes).
+ const sessionQueues = new Map();
+
  async function callModel(client, model, messages, tools) {
    const params = { model, messages };
    if (tools && tools.length > 0) {
@@ -67,7 +72,7 @@ async function callModelWithFallback(client, config, messages, tools) {
   * Returns { iteration, response, logSummary, status, runToolCalls, checkpoint }.
   */
  async function runAgentLoop(client, config, session, prepareMessages) {
-   let tools = loadTools();
+   let tools = await loadTools();
    let toolDefs = getToolDefinitions(tools);
    let iteration = 0;
    const runToolCalls = [];
@@ -175,7 +180,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
 
    // Reload tools if any were created/updated this iteration
    if (toolsModified) {
-     tools = loadTools();
+     tools = await loadTools();
      toolDefs = getToolDefinitions(tools);
    }
 
@@ -313,14 +318,41 @@ async function runAgentLoop(client, config, session, prepareMessages) {
   * Manages the handoff loop across multiple agent runs.
   */
  export async function handleChat(config, requestSessionId, userMessage) {
+   const sessionId = requestSessionId || crypto.randomUUID();
+
+   // Serialize concurrent requests for the same session. Each request registers
+   // itself at the tail of the queue and waits for the previous request to finish
+   // before starting. New sessions (no requestSessionId) each get a unique ID,
+   // so they never contend with each other.
+   const previous = sessionQueues.get(sessionId) ?? Promise.resolve();
+   let releaseLock;
+   const current = new Promise(resolve => { releaseLock = resolve; });
+   sessionQueues.set(sessionId, current);
+   await previous;
+
+   try {
+     return await _runHandleChat(config, sessionId, userMessage);
+   } finally {
+     releaseLock();
+     // Clean up only if no one else has queued behind us
+     if (sessionQueues.get(sessionId) === current) {
+       sessionQueues.delete(sessionId);
+     }
+   }
+ }
+
+ /**
+  * The actual chat logic, extracted so handleChat can wrap it cleanly with the
+  * session lock.
+  */
+ async function _runHandleChat(config, sessionId, userMessage) {
    const client = new OpenAI({
      baseURL: 'https://openrouter.ai/api/v1',
      apiKey: config.apiKey,
    });
 
    const systemPromptTemplate = loadSystemPrompt();
-   const sessionId = requestSessionId || crypto.randomUUID();
-   let session = loadSession(sessionId);
+   let session = await loadSession(sessionId);
 
    if (!session) {
      session = createSession(systemPromptTemplate);
@@ -345,8 +377,8 @@ export async function handleChat(config, requestSessionId, userMessage) {
    let finalLogSummary = '';
    let finalStatus = 'ok';
 
-   // Handoff loop
    try {
+     // Handoff loop
      while (true) {
        const runStartIndex = session.messages.length;
        const run = await runAgentLoop(client, config, session, prepareMessages);
@@ -369,7 +401,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        if (run.errorDetail) logEntry.errorDetail = run.errorDetail;
        if (run.contextInfo) logEntry.contextInfo = run.contextInfo;
        if (run.rawResponse) logEntry.rawResponse = run.rawResponse;
-       appendLog(sessionId, logEntry);
+       await appendLog(sessionId, logEntry);
 
        // Inject synthetic error note so the model has context on the next user turn
        if (finalStatus === 'model_error' || finalStatus === 'format_error') {
@@ -384,7 +416,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        }
 
        // Checkpoint reached — log this run
-       appendLog(sessionId, {
+       await appendLog(sessionId, {
          iteration: run.iteration,
          model: config.selectedModel,
          userInput: userMessage,
@@ -401,7 +433,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        finalLogSummary = run.logSummary;
        finalStatus = 'intervention_required';
 
-       appendLog(sessionId, {
+       await appendLog(sessionId, {
          iteration: 0,
          model: config.selectedModel,
          userInput: userMessage,
@@ -426,7 +458,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        session.messages.push({ role: 'user', content: run.checkpoint.remaining || 'Continue with the task.' });
      }
    } catch (e) {
-     const errorLog = {
+     await appendLog(sessionId, {
        iteration: 0,
        model: config.selectedModel,
        userInput: userMessage,
@@ -435,14 +467,18 @@ export async function handleChat(config, requestSessionId, userMessage) {
        logSummary: `Critical error: ${e.message}`,
        status: 'error',
        errorDetail: { message: e.message, stack: e.stack },
-     };
-     appendLog(sessionId, errorLog);
-     // Re-throw to let app.js handle the HTTP response
+     });
      throw e;
+   } finally {
+     // Always persist the session — even if an unexpected error occurred.
+     // A failed save must not mask the original error.
+     try {
+       await saveSession(sessionId, session);
+     } catch (saveErr) {
+       console.error(`Failed to save session ${sessionId}:`, saveErr);
+     }
    }
 
-   saveSession(sessionId, session);
-
    console.log(`${chalk.magenta('<<<')} ${chalk.bold('Final Response')} [SID: ${chalk.dim(sessionId.slice(0, 8))}] ${chalk.italic(finalLogSummary)}`);
 
    return {
package/src/server/logging.js CHANGED
@@ -3,10 +3,10 @@ import path from 'path';
  import chalk from 'chalk';
  import { PATHS } from './config.js';
 
- export function appendLog(sessionId, entry) {
+ export async function appendLog(sessionId, entry) {
    const logFile = path.join(PATHS.logsDir, `session-${sessionId}.jsonl`);
    const line = JSON.stringify({ ts: new Date().toISOString(), sessionId, ...entry }) + '\n';
-   fs.appendFileSync(logFile, line, 'utf8');
+   await fs.promises.appendFile(logFile, line, 'utf8');
 
    // Console output for better visibility
    const statusColor = entry.status === 'ok' ? chalk.green : chalk.red;
package/src/server/sessions.js CHANGED
@@ -2,19 +2,20 @@ import fs from 'fs';
  import path from 'path';
  import { PATHS } from './config.js';
 
- export function loadSession(sessionId) {
+ export async function loadSession(sessionId) {
    const filePath = path.join(PATHS.conversationsDir, `${sessionId}.json`);
    try {
-     return JSON.parse(fs.readFileSync(filePath, 'utf8'));
+     const raw = await fs.promises.readFile(filePath, 'utf8');
+     return JSON.parse(raw);
    } catch {
      return null;
    }
  }
 
- export function saveSession(sessionId, session) {
+ export async function saveSession(sessionId, session) {
    session.metadata.updatedAt = new Date().toISOString();
    const filePath = path.join(PATHS.conversationsDir, `${sessionId}.json`);
-   fs.writeFileSync(filePath, JSON.stringify(session, null, 2), 'utf8');
+   await fs.promises.writeFile(filePath, JSON.stringify(session, null, 2), 'utf8');
  }
 
  export function createSession(systemPromptTemplate) {
package/src/server/tools.js CHANGED
@@ -6,6 +6,8 @@ import { PATHS } from './config.js';
  const _require = createRequire(import.meta.url);
  const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;
 
+ const TOOL_TIMEOUT_MS = 60_000;
+
  const SEED_TOOLS = {
    list_dir: {
      definition: {
@@ -25,7 +27,18 @@ const SEED_TOOLS = {
        },
      },
    },
-   code: 'const targetPath = args.path || process.cwd(); const resolved = path.resolve(targetPath); const { execSync } = require("child_process"); const output = execSync(`ls -la "${resolved}"`, { encoding: "utf8" }); return { status: "ok", path: resolved, output };',
+   code: `
+     const { execFile } = require("child_process");
+     const { promisify } = require("util");
+     const execFileAsync = promisify(execFile);
+     const targetPath = args.path || process.cwd();
+     const resolved = path.resolve(targetPath);
+     const { stdout: output } = await execFileAsync("ls", ["-la", resolved], {
+       encoding: "utf8",
+       timeout: 10000,
+     });
+     return { status: "ok", path: resolved, output };
+   `,
    },
    exec: {
      definition: {
@@ -45,7 +58,21 @@ const SEED_TOOLS = {
        },
      },
    },
-   code: 'const { execSync } = require("child_process"); try { const stdout = execSync(args.cmd, { encoding: "utf8", timeout: 60000 }); return { status: "ok", exitCode: 0, stdout, stderr: "" }; } catch (e) { return { status: "error", exitCode: e.status || 1, stdout: e.stdout || "", stderr: e.stderr || e.message }; }',
+   code: `
+     const { exec } = require("child_process");
+     const { promisify } = require("util");
+     const execAsync = promisify(exec);
+     try {
+       const { stdout, stderr } = await execAsync(args.cmd, {
+         encoding: "utf8",
+         timeout: 60000,
+         maxBuffer: 2 * 1024 * 1024,
+       });
+       return { status: "ok", exitCode: 0, stdout, stderr };
+     } catch (e) {
+       return { status: "error", exitCode: e.code || 1, stdout: e.stdout || "", stderr: e.stderr || e.message };
+     }
+   `,
    },
    save_user_info: {
      definition: {
@@ -193,7 +220,9 @@ export function seedTools() {
 
    let changed = false;
    for (const [name, tool] of Object.entries(SEED_TOOLS)) {
-     if (!existing[name]) {
+     // Always keep seed tools up to date — user-created tools have different names
+     // and are never touched by this loop.
+     if (JSON.stringify(existing[name]) !== JSON.stringify(tool)) {
        existing[name] = tool;
        changed = true;
      }
@@ -207,9 +236,10 @@ export function seedTools() {
    return existing;
  }
 
- export function loadTools() {
+ export async function loadTools() {
    try {
-     return JSON.parse(fs.readFileSync(PATHS.toolsFile, 'utf8'));
+     const raw = await fs.promises.readFile(PATHS.toolsFile, 'utf8');
+     return JSON.parse(raw);
    } catch {
      return {};
    }
@@ -226,5 +256,13 @@ export async function executeTool(tools, name, toolArgs) {
    }
 
    const fn = new AsyncFunction('args', 'fs', 'path', 'process', 'require', tool.code);
-   return await fn(toolArgs, fs, path, process, _require);
+
+   const timeout = new Promise((_, reject) =>
+     setTimeout(
+       () => reject(new Error(`Tool '${name}' timed out after ${TOOL_TIMEOUT_MS / 1000}s`)),
+       TOOL_TIMEOUT_MS
+     )
+   );
+
+   return await Promise.race([fn(toolArgs, fs, path, process, _require), timeout]);
  }