@ducci/jarvis 1.0.15 → 1.0.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,120 @@
# Finding 003: Event Loop Blocking, Async File I/O, and Session Reliability

**Date:** 2026-02-27
**Severity:** High — caused observed 100% CPU and server unresponsiveness in production
**Status:** Fixed

---

## What Happened

A session was started with the question *"Kannst du deinen source code finden und anschauen mittels Tools?"* ("Can you find and look at your source code using tools?"). The agent used the `exec` tool to run two full-filesystem scans:

```
find / -type f \( -iname "*.js" -o -iname "*.ts" -o -iname "*.py" \) 2>/dev/null | head -20
find / -type d -name "jarvis" 2>/dev/null
```

Both commands start from the filesystem root `/`. The second has no output limit and scans everything: real disk filesystems, `/proc`, `/sys`, `/dev`, and any network mounts. On the affected Linux server the CPU reached 100% and the server became unresponsive; it had to be shut down manually.

---

## Root Cause

### 1. `execSync` blocks the entire Node.js event loop

Both `exec` and `list_dir` used `execSync` from `child_process`. `execSync` is a synchronous call that blocks the event loop for its entire duration. While any shell command runs:

- Express cannot process incoming HTTP requests
- The Telegram bot cannot receive or process new messages
- All timers and async callbacks are frozen (including the Telegram `typingInterval`, so the user sees no activity indicator)

The OS sees a CPU-hungry `find` child process running at full speed while Node.js sits blocked waiting for it. Combined, this presents as ~100% CPU with a completely unresponsive server.

Additionally, `list_dir` used `execSync` with **no timeout at all**. A hanging command (e.g. `ls` on an NFS mount or a blocked `/proc` entry) would freeze the server permanently.

### 2. All file I/O was synchronous

`loadSession`, `saveSession`, `appendLog`, and `loadTools` all used `fs.*Sync` variants. In an async Node.js server these block the event loop on every request. For small files the impact is measured in microseconds, but the pattern is architecturally incorrect and accumulates under load.

### 3. Session not saved on unexpected error

In `handleChat`, `saveSession` was called unconditionally after the `try/catch` block. If the catch re-threw an unexpected error, `saveSession` was never reached. The user message had already been appended to the in-memory session, but the on-disk version did not reflect it — leaving the session in an inconsistent state for the next request.

### 4. No concurrency protection per session

The Telegram channel uses `@grammyjs/runner`, which processes updates concurrently. If a user sent two messages in quick succession, both `handleChat` calls could load the same session simultaneously, run independent agent loops, and then overwrite each other's `saveSession` writes. The second write would silently discard the first response.

### 5. Seed tools never updated after initial creation

`seedTools()` used `if (!existing[name])` — it only wrote a seed tool on its first run. Any later update to `exec` or `list_dir` in the source code would never propagate to an existing installation. This blocked the async fix for `exec` and `list_dir` from taking effect.

---

## Fixes

### 1. `exec` and `list_dir` → async (`src/server/tools.js`)

**`exec`**: replaced `execSync` with `promisify(exec)`. The event loop is now free while the shell command executes. The 60 s timeout is preserved, and a 2 MB `maxBuffer` is now set explicitly.

**`list_dir`**: replaced `execSync` with `promisify(execFile)`. `execFile` does not go through a shell interpreter, which makes it safer against special characters in paths. Added a 10-second timeout (previously there was none).

### 2. `executeTool` global timeout (`src/server/tools.js`)

All tool executions — both built-in and AI-created — are now wrapped in `Promise.race` against a 60-second timeout. This protects against AI-created tools that hang on async operations (network requests, file I/O). The timeout matches the `exec` tool's own limit for consistency.

```js
const timeout = new Promise((_, reject) =>
  setTimeout(() => reject(new Error(`Tool '${name}' timed out after 60s`)), 60_000)
);
return await Promise.race([fn(toolArgs, fs, path, process, _require), timeout]);
```

Note: this does not protect against synchronous CPU loops without `await` points — that would require Worker Threads. Such code is unlikely to be generated accidentally.

### 3. Seed tools always updated (`src/server/tools.js`)

`seedTools()` now compares the serialized content of each seed tool against the stored version and overwrites only when there is a difference. Updates to built-in tools propagate on the next server start without touching user-created tools.
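
The comparison is a plain serialized-equality check. A standalone sketch of the idea (`syncSeedTools` is an illustrative extraction; the real logic lives inside `seedTools()`):

```js
// Overwrite a seed tool only when its serialized form differs from what is
// stored. User-created tools have different names and are never touched.
function syncSeedTools(existing, seeds) {
  let changed = false;
  for (const [name, tool] of Object.entries(seeds)) {
    if (JSON.stringify(existing[name]) !== JSON.stringify(tool)) {
      existing[name] = tool;
      changed = true;
    }
  }
  return changed; // caller persists only when something changed
}
```

Running it twice with the same seeds is a no-op the second time, so unchanged installs never rewrite the tools file.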

### 4. All file I/O → async (`src/server/sessions.js`, `src/server/logging.js`, `src/server/tools.js`)

`loadSession`, `saveSession`, `appendLog`, and `loadTools` now use `fs.promises.*`. All callers in `agent.js` are updated to `await` these calls.

### 5. `saveSession` moved to a `finally` block (`src/server/agent.js`)

The session is now always persisted — on success, on model error, and on unexpected errors. A failed save is caught and logged without masking the original error.

```js
} finally {
  try {
    await saveSession(sessionId, session);
  } catch (saveErr) {
    console.error(`Failed to save session ${sessionId}:`, saveErr);
  }
}
```

### 6. Session queue for concurrency control (`src/server/agent.js`)

A module-level `Map<sessionId, Promise>` serializes concurrent requests for the same session. Each new request registers itself as the tail of the queue and waits for the previous request to resolve before starting. The map entry is cleaned up by whichever request is last in the chain.

```js
const previous = sessionQueues.get(sessionId) ?? Promise.resolve();
let releaseLock;
const current = new Promise(resolve => { releaseLock = resolve; });
sessionQueues.set(sessionId, current);
await previous;
// ... process request ...
// finally: releaseLock()
```

This is safe in Node.js because the event loop is single-threaded: `get`, `new Promise`, and `set` all execute synchronously before the first `await`, so there is no race in which two requests read the same `undefined` entry.
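
The pattern can be packaged as a small self-contained wrapper to see the serialization in action (`withSessionLock` is an illustrative extraction; `handleChat` inlines the same steps):

```js
const sessionQueues = new Map();

async function withSessionLock(sessionId, fn) {
  // Chain onto the previous request for this session (or start fresh).
  const previous = sessionQueues.get(sessionId) ?? Promise.resolve();
  let releaseLock;
  const current = new Promise(resolve => { releaseLock = resolve; });
  sessionQueues.set(sessionId, current);
  await previous;
  try {
    return await fn();
  } finally {
    releaseLock();
    // Clean up only if no request queued behind us in the meantime.
    if (sessionQueues.get(sessionId) === current) sessionQueues.delete(sessionId);
  }
}
```

Two concurrent calls for the same session run strictly one after the other, and the map is empty again once the last one finishes.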

---

## What Was Not Changed

- The agent loop logic, checkpoint/handoff system, loop detection, and format recovery — all unchanged.
- `seedTools()` remains synchronous (called once at startup, before the server accepts requests).
- `createSession()` and `getToolDefinitions()` remain synchronous (pure functions, no I/O).
- No rate limiting or HTTP authentication was added — the server is intended for local/personal use only.
@@ -44,6 +44,15 @@ You have access to a set of tools. Each tool has a name and description that tel
  - If the user shares personal information, persist it using the appropriate tool.
  - Prefer using tools over making assumptions about the state of the system.
 
+ ## exec Safety
+
+ The `exec` tool runs real shell commands on the server. Use it responsibly:
+
+ - **Never scan from filesystem root.** Commands like `find /`, `find / -name ...`, or `ls -R /` will scan everything including `/proc`, `/sys`, and network mounts. They can saturate CPU and I/O for minutes. Always scope `find` to a specific directory (e.g. `find ~/jarvis -name "*.js"`).
+ - **Use known paths.** Prefer `process.cwd()`, `$HOME`, or paths you already know over broad searches. Use `which <binary>` to locate executables.
+ - **Prefer targeted reads.** Use `grep`, `head`, or `tail` instead of `cat` on files you haven't seen before. Large file output is truncated anyway — a targeted command gives you better signal.
+ - **Avoid commands with unbounded runtime.** If a command could run indefinitely or scan an unknown-size tree, scope it first.
+
  ## logSummary Guidelines
 
  The `logSummary` is written for a human observer, not for the user. It must:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@ducci/jarvis",
-   "version": "1.0.15",
+   "version": "1.0.17",
    "description": "A fully automated agent system that lives on a server.",
    "main": "./src/index.js",
    "type": "module",
package/src/server/agent.js CHANGED
@@ -24,6 +24,11 @@ Respond with your normal JSON, but add a checkpoint field:
 
  The checkpoint field will be used to automatically resume the task in the next run.]`;
 
+ // Serializes concurrent requests for the same session. Maps sessionId to the
+ // tail of the current request chain (a Promise that resolves when the last
+ // queued request finishes).
+ const sessionQueues = new Map();
+
  async function callModel(client, model, messages, tools) {
    const params = { model, messages };
    if (tools && tools.length > 0) {
@@ -67,7 +72,7 @@ async function callModelWithFallback(client, config, messages, tools) {
   * Returns { iteration, response, logSummary, status, runToolCalls, checkpoint }.
   */
  async function runAgentLoop(client, config, session, prepareMessages) {
-   let tools = loadTools();
+   let tools = await loadTools();
    let toolDefs = getToolDefinitions(tools);
    let iteration = 0;
    const runToolCalls = [];
@@ -175,7 +180,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
 
    // Reload tools if any were created/updated this iteration
    if (toolsModified) {
-     tools = loadTools();
+     tools = await loadTools();
      toolDefs = getToolDefinitions(tools);
    }
 
@@ -313,14 +318,41 @@ async function runAgentLoop(client, config, session, prepareMessages) {
   * Manages the handoff loop across multiple agent runs.
   */
  export async function handleChat(config, requestSessionId, userMessage) {
+   const sessionId = requestSessionId || crypto.randomUUID();
+
+   // Serialize concurrent requests for the same session. Each request registers
+   // itself at the tail of the queue and waits for the previous request to finish
+   // before starting. New sessions (no requestSessionId) each get a unique ID,
+   // so they never contend with each other.
+   const previous = sessionQueues.get(sessionId) ?? Promise.resolve();
+   let releaseLock;
+   const current = new Promise(resolve => { releaseLock = resolve; });
+   sessionQueues.set(sessionId, current);
+   await previous;
+
+   try {
+     return await _runHandleChat(config, sessionId, userMessage);
+   } finally {
+     releaseLock();
+     // Clean up only if no one else has queued behind us
+     if (sessionQueues.get(sessionId) === current) {
+       sessionQueues.delete(sessionId);
+     }
+   }
+ }
+
+ /**
+  * The actual chat logic, extracted so handleChat can wrap it cleanly with the
+  * session lock.
+  */
+ async function _runHandleChat(config, sessionId, userMessage) {
    const client = new OpenAI({
      baseURL: 'https://openrouter.ai/api/v1',
      apiKey: config.apiKey,
    });
 
    const systemPromptTemplate = loadSystemPrompt();
-   const sessionId = requestSessionId || crypto.randomUUID();
-   let session = loadSession(sessionId);
+   let session = await loadSession(sessionId);
 
    if (!session) {
      session = createSession(systemPromptTemplate);
@@ -345,8 +377,8 @@ export async function handleChat(config, requestSessionId, userMessage) {
    let finalLogSummary = '';
    let finalStatus = 'ok';
 
-   // Handoff loop
    try {
+     // Handoff loop
      while (true) {
        const runStartIndex = session.messages.length;
        const run = await runAgentLoop(client, config, session, prepareMessages);
@@ -369,7 +401,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        if (run.errorDetail) logEntry.errorDetail = run.errorDetail;
        if (run.contextInfo) logEntry.contextInfo = run.contextInfo;
        if (run.rawResponse) logEntry.rawResponse = run.rawResponse;
-       appendLog(sessionId, logEntry);
+       await appendLog(sessionId, logEntry);
 
        // Inject synthetic error note so the model has context on the next user turn
        if (finalStatus === 'model_error' || finalStatus === 'format_error') {
@@ -384,7 +416,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        }
 
        // Checkpoint reached — log this run
-       appendLog(sessionId, {
+       await appendLog(sessionId, {
          iteration: run.iteration,
          model: config.selectedModel,
          userInput: userMessage,
@@ -401,7 +433,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        finalLogSummary = run.logSummary;
        finalStatus = 'intervention_required';
 
-       appendLog(sessionId, {
+       await appendLog(sessionId, {
          iteration: 0,
          model: config.selectedModel,
          userInput: userMessage,
@@ -426,7 +458,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
        session.messages.push({ role: 'user', content: run.checkpoint.remaining || 'Continue with the task.' });
      }
    } catch (e) {
-     const errorLog = {
+     await appendLog(sessionId, {
        iteration: 0,
        model: config.selectedModel,
        userInput: userMessage,
@@ -435,14 +467,18 @@ export async function handleChat(config, requestSessionId, userMessage) {
        logSummary: `Critical error: ${e.message}`,
        status: 'error',
        errorDetail: { message: e.message, stack: e.stack },
-     };
-     appendLog(sessionId, errorLog);
-     // Re-throw to let app.js handle the HTTP response
+     });
      throw e;
+   } finally {
+     // Always persist the session — even if an unexpected error occurred.
+     // A failed save must not mask the original error.
+     try {
+       await saveSession(sessionId, session);
+     } catch (saveErr) {
+       console.error(`Failed to save session ${sessionId}:`, saveErr);
+     }
    }
 
-   saveSession(sessionId, session);
-
    console.log(`${chalk.magenta('<<<')} ${chalk.bold('Final Response')} [SID: ${chalk.dim(sessionId.slice(0, 8))}] ${chalk.italic(finalLogSummary)}`);
 
    return {
package/src/server/logging.js CHANGED
@@ -3,10 +3,10 @@ import path from 'path';
  import chalk from 'chalk';
  import { PATHS } from './config.js';
 
- export function appendLog(sessionId, entry) {
+ export async function appendLog(sessionId, entry) {
    const logFile = path.join(PATHS.logsDir, `session-${sessionId}.jsonl`);
    const line = JSON.stringify({ ts: new Date().toISOString(), sessionId, ...entry }) + '\n';
-   fs.appendFileSync(logFile, line, 'utf8');
+   await fs.promises.appendFile(logFile, line, 'utf8');
 
    // Console output for better visibility
    const statusColor = entry.status === 'ok' ? chalk.green : chalk.red;
package/src/server/sessions.js CHANGED
@@ -2,19 +2,20 @@ import fs from 'fs';
  import path from 'path';
  import { PATHS } from './config.js';
 
- export function loadSession(sessionId) {
+ export async function loadSession(sessionId) {
    const filePath = path.join(PATHS.conversationsDir, `${sessionId}.json`);
    try {
-     return JSON.parse(fs.readFileSync(filePath, 'utf8'));
+     const raw = await fs.promises.readFile(filePath, 'utf8');
+     return JSON.parse(raw);
    } catch {
      return null;
    }
  }
 
- export function saveSession(sessionId, session) {
+ export async function saveSession(sessionId, session) {
    session.metadata.updatedAt = new Date().toISOString();
    const filePath = path.join(PATHS.conversationsDir, `${sessionId}.json`);
-   fs.writeFileSync(filePath, JSON.stringify(session, null, 2), 'utf8');
+   await fs.promises.writeFile(filePath, JSON.stringify(session, null, 2), 'utf8');
  }
 
  export function createSession(systemPromptTemplate) {
package/src/server/tools.js CHANGED
@@ -6,6 +6,8 @@ import { PATHS } from './config.js';
  const _require = createRequire(import.meta.url);
  const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;
 
+ const TOOL_TIMEOUT_MS = 60_000;
+
  const SEED_TOOLS = {
    list_dir: {
      definition: {
@@ -25,7 +27,18 @@ const SEED_TOOLS = {
        },
      },
    },
-   code: 'const targetPath = args.path || process.cwd(); const resolved = path.resolve(targetPath); const { execSync } = require("child_process"); const output = execSync(`ls -la "${resolved}"`, { encoding: "utf8" }); return { status: "ok", path: resolved, output };',
+   code: `
+     const { execFile } = require("child_process");
+     const { promisify } = require("util");
+     const execFileAsync = promisify(execFile);
+     const targetPath = args.path || process.cwd();
+     const resolved = path.resolve(targetPath);
+     const { stdout: output } = await execFileAsync("ls", ["-la", resolved], {
+       encoding: "utf8",
+       timeout: 10000,
+     });
+     return { status: "ok", path: resolved, output };
+   `,
    },
    exec: {
      definition: {
@@ -45,7 +58,21 @@ const SEED_TOOLS = {
        },
      },
    },
-   code: 'const { execSync } = require("child_process"); try { const stdout = execSync(args.cmd, { encoding: "utf8", timeout: 60000 }); return { status: "ok", exitCode: 0, stdout, stderr: "" }; } catch (e) { return { status: "error", exitCode: e.status || 1, stdout: e.stdout || "", stderr: e.stderr || e.message }; }',
+   code: `
+     const { exec } = require("child_process");
+     const { promisify } = require("util");
+     const execAsync = promisify(exec);
+     try {
+       const { stdout, stderr } = await execAsync(args.cmd, {
+         encoding: "utf8",
+         timeout: 60000,
+         maxBuffer: 2 * 1024 * 1024,
+       });
+       return { status: "ok", exitCode: 0, stdout, stderr };
+     } catch (e) {
+       return { status: "error", exitCode: e.code || 1, stdout: e.stdout || "", stderr: e.stderr || e.message };
+     }
+   `,
    },
    save_user_info: {
      definition: {
@@ -193,7 +220,9 @@ export function seedTools() {
 
    let changed = false;
    for (const [name, tool] of Object.entries(SEED_TOOLS)) {
-     if (!existing[name]) {
+     // Always keep seed tools up to date — user-created tools have different names
+     // and are never touched by this loop.
+     if (JSON.stringify(existing[name]) !== JSON.stringify(tool)) {
        existing[name] = tool;
        changed = true;
      }
@@ -207,9 +236,10 @@ export function seedTools() {
    return existing;
  }
 
- export function loadTools() {
+ export async function loadTools() {
    try {
-     return JSON.parse(fs.readFileSync(PATHS.toolsFile, 'utf8'));
+     const raw = await fs.promises.readFile(PATHS.toolsFile, 'utf8');
+     return JSON.parse(raw);
    } catch {
      return {};
    }
@@ -226,5 +256,13 @@ export async function executeTool(tools, name, toolArgs) {
    }
 
    const fn = new AsyncFunction('args', 'fs', 'path', 'process', 'require', tool.code);
-   return await fn(toolArgs, fs, path, process, _require);
+
+   const timeout = new Promise((_, reject) =>
+     setTimeout(
+       () => reject(new Error(`Tool '${name}' timed out after ${TOOL_TIMEOUT_MS / 1000}s`)),
+       TOOL_TIMEOUT_MS
+     )
+   );
+
+   return await Promise.race([fn(toolArgs, fs, path, process, _require), timeout]);
  }