@ducci/jarvis 1.0.27 → 1.0.29

@@ -0,0 +1,157 @@
1
+ # Finding 011: Empty Model Response Causes Generic Telegram Error
2
+
3
+ **Date:** 2026-03-01
4
+ **Severity:** High — user sees generic "please try again" with no actionable information
5
+ **Status:** Fixed
6
+
7
+ ---
8
+
9
+ ## Observed Session
10
+
11
+ Session `33a50dfe-38ea-4972-adac-498ef0525b0c`, run 16 of 17 (session.jsonl line 16):
12
+
13
+ ```
14
+ status=format_error
15
+ model=nvidia/nemotron-3-nano-30b-a3b:free
16
+ iteration=4
17
+ userInput='Ok. Can you please run the shell script now with the domain...'
18
+ logSummary='Model returned non-JSON final response after recovery attempts.'
19
+ rawResponse=''
20
+ response=''
21
+ ```
22
+
23
+ The Telegram user received:
24
+
25
+ ```
26
+ The agent encountered an error and could not produce a response. Please try again.
27
+ ```
28
+
29
+ ---
30
+
31
+ ## What Happened
32
+
33
+ The agent executed a ZAP scan (`./scan.sh juice-shop.herokuapp.com`). The tool result was a large ZAP startup log, truncated at 4000 characters. Two subsequent tool calls failed:
34
+
35
+ - `pkill -f zaproxy || true` → exit 1 (no process to kill)
36
+ - `zaproxy -help | grep -i shutdown -A5` → failed (`libtiff.so.5` missing)
37
+
38
+ On iteration 4, the model returned `assistantMessage.content = null` with no `tool_calls`. This is the "went silent" case: the model produced neither a response nor another tool call.
39
+
40
+ ---
41
+
42
+ ## Bug Chain
43
+
44
+ ### Step 1 — Model returns null content
45
+
46
+ ```js
47
+ let content = assistantMessage.content || '';
48
+ // content = ''
49
+ ```
50
+
51
+ ### Step 2 — Recovery chain falls through on empty content
52
+
53
+ The existing format recovery chain was designed for *non-empty, non-JSON* responses:
54
+
55
+ 1. `JSON.parse('')` → throws
56
+ 2. Retry with fallback model (same messages, no nudge) → also `''`
57
+ 3. Retry with nudge "Your previous response was not valid JSON" → technically wrong for empty content; model still returns `''`
58
+ 4. Give up
59
+
60
+ ### Step 3 — Empty `response` propagates to Telegram
61
+
62
+ ```js
63
+ response = content; // ''
64
+ ```
65
+
66
+ `handleChat` returns `{ response: '', ... }`. In `telegram/index.js`:
67
+
68
+ ```js
69
+ const rawResponse = typeof result.response === 'string' ? result.response : ...;
70
+ // rawResponse = ''
71
+ const text = rawResponse.trim() || 'The agent encountered an error...';
72
+ // '' → fallback shown
73
+ ```
74
+
75
+ The user sees the generic Telegram fallback instead of any information about what happened or what to do.
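
The empty-string path in Step 3 can be reproduced in isolation. This is a minimal sketch, not the package's code: `deliveryText` is a hypothetical stand-in for the last step of the Telegram handler, and only the fallback string and the `trim() ||` pattern are taken from this finding.

```javascript
// Sketch only: `deliveryText` is a hypothetical name, not a function in the package.
const TELEGRAM_FALLBACK =
  'The agent encountered an error and could not produce a response. Please try again.';

// Mirrors the `rawResponse.trim() || fallback` pattern from the handler:
// an empty or whitespace-only string is falsy after trim(), so the
// generic fallback is selected instead of the agent's response.
function deliveryText(rawResponse) {
  return rawResponse.trim() || TELEGRAM_FALLBACK;
}

console.log(deliveryText(''));            // generic fallback
console.log(deliveryText('   \n'));       // generic fallback (whitespace only)
console.log(deliveryText('Scan failed')); // non-empty text is delivered
```

Because the delivery layer cannot tell "the model said nothing" apart from any other failure, a specific message has to be chosen earlier, where the cause is still known.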
76
+
77
+ ---
78
+
79
+ ## Root Causes
80
+
81
+ **Primary**: The `format_error` path set `response = content` without a fallback for the empty string case. An empty `response` triggers the Telegram handler's last-resort fallback message, giving the user no context.
82
+
83
+ **Secondary**: The format recovery chain was designed for non-empty non-JSON responses. When `content` is empty, the nudge message "Your previous response was not valid JSON" is inaccurate — the model produced nothing, not invalid JSON. A targeted nudge for the empty case increases the chance of recovery.
84
+
85
+ **Model-level cause**: The free model `nvidia/nemotron-3-nano-30b-a3b:free` can fail to produce any output after processing a heavily truncated tool result followed by consecutive tool failures. This is a model quality limitation that the recovery layer must account for.
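
One way to express the secondary root cause is as a choice of nudge text that matches what the model actually did. The sketch below is illustrative: `buildNudgeMessages` is a hypothetical helper, and the value of `FORMAT_NUDGE` is a placeholder because its real text does not appear in this diff.

```javascript
// Placeholder value: the real FORMAT_NUDGE text is not shown in this diff.
const FORMAT_NUDGE = 'Reply with a single valid JSON object.';

// Hypothetical helper: prefix the nudge with an accurate description when
// the model returned nothing, and use the plain format nudge otherwise.
function buildNudgeMessages(preparedMessages, content) {
  const prefix = content.trim() ? '' : 'You returned an empty response. ';
  // Copy the array so the caller's message history is not mutated.
  return [...preparedMessages, { role: 'user', content: prefix + FORMAT_NUDGE }];
}

const history = [{ role: 'user', content: 'Run the scan.' }];
console.log(buildNudgeMessages(history, '').at(-1).content);
console.log(buildNudgeMessages(history, 'not json').at(-1).content);
```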
86
+
87
+ ---
88
+
89
+ ## Difference from Findings 009 and 010
90
+
91
+ | Finding | Model produces... | Bug manifests at... |
92
+ |---------|-------------------|---------------------|
93
+ | 009 | Non-string `response` field (array/object) | Telegram `.trim()` crash |
94
+ | 010 | Non-string `checkpoint.remaining` | Zero-progress `.trim()` crash |
95
+ | 011 | Empty/null content (no text, no tool calls) | Telegram generic fallback (no crash, but useless to user) |
96
+
97
+ Finding 011 is the third in the same class: model output type does not match what the system expects.
98
+
99
+ ---
100
+
101
+ ## Fix
102
+
103
+ ### `src/server/agent.js` — two changes
104
+
105
+ **1. Empty-content detection with targeted nudge**
106
+
107
+ When `content` is empty, skip the standard recovery chain (designed for non-JSON text) and apply a targeted nudge that accurately describes the situation:
108
+
109
+ ```js
110
+ if (!content.trim()) {
111
+ // Model returned no content at all — use a targeted nudge instead of the
112
+ // standard JSON recovery chain (designed for non-empty non-JSON responses).
113
+ try {
114
+ const emptyNudge = [
115
+ ...preparedMessages,
116
+ { role: 'user', content: 'You returned an empty response. ' + FORMAT_NUDGE },
117
+ ];
118
+ const nudgeResult = await callModelWithFallback(client, config, emptyNudge, toolDefs);
119
+ const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
120
+ parsed = JSON.parse(nudgeContent);
121
+ content = nudgeContent;
122
+ } catch {
123
+ // Give up — fall through to !parsed handler below
124
+ }
125
+ } else {
126
+ // Non-empty content — use the existing 3-step JSON recovery chain
127
+ try { parsed = JSON.parse(content); } catch {
128
+ // Step 1: fallback model...
129
+ // Step 2: nudge...
130
+ }
131
+ }
132
+ ```
133
+
134
+ **2. Non-empty fallback on format_error**
135
+
136
+ ```js
137
+ if (!parsed) {
138
+ // Ensure response is never empty so the delivery layer can show something
139
+ // meaningful rather than its generic fallback message.
140
+ response = content.trim() || 'The model did not produce a response. Please try again.';
141
+ logSummary = 'Model returned non-JSON final response after recovery attempts.';
142
+ status = 'format_error';
143
+ return { ... };
144
+ }
145
+ ```
146
+
147
+ ---
148
+
149
+ ## Outcome
150
+
151
+ | Scenario | Before | After |
152
+ |----------|--------|-------|
153
+ | Model returns empty content, nudge succeeds | format_error (3 wasted API calls) | Clean recovery (1 targeted API call) |
154
+ | Model returns empty content, nudge fails | Telegram generic fallback | "The model did not produce a response. Please try again." |
155
+ | Model returns non-JSON text, all recovery fails | Raw model output shown to user (generic Telegram fallback only if the text was empty) | Raw model output shown to user |
156
+
157
+ **Effect on the debugging session**: instead of the generic Telegram fallback, the user would have received "The model did not produce a response. Please try again." — a clear signal that the model failed, not their message. In the best case, the new targeted nudge would have elicited a valid JSON response.
@@ -0,0 +1,121 @@
1
+ # Finding 012: Empty-Content Nudge Includes Tools and Loses Recovery Text
2
+
3
+ **Date:** 2026-03-02
4
+ **Severity:** Medium — user sees generic error when model produces a partial recovery response
5
+ **Status:** Fixed
6
+
7
+ ---
8
+
9
+ ## Observed Session
10
+
11
+ Session `21fb43a7-2b11-4208-99fb-e6b54fddc07b`, entry 9 in session.jsonl:
12
+
13
+ ```
14
+ status=format_error
15
+ model=nvidia/nemotron-3-nano-30b-a3b:free
16
+ iteration=3
17
+ userInput='Ok. Read the results folder. Is there anything?'
18
+ logSummary='Model returned non-JSON final response after recovery attempts.'
19
+ response='The model did not produce a response. Please try again.'
20
+ ```
21
+
22
+ The user received: **"The model did not produce a response. Please try again."**
23
+
24
+ ---
25
+
26
+ ## What Happened
27
+
28
+ 1. The agent executed two tool calls:
29
+ - `list_dir /root/.jarvis/projects/cybersecurity/results` → success
30
+ - `exec "list_dir /root/.jarvis/projects/cybersecurity/results/dviet.de"` → exit 127 (`list_dir: not found`)
31
+ - The model confused the `list_dir` jarvis tool with a shell command
32
+
33
+ 2. After the failed exec, the model returned `assistantMessage.content = null` with no `tool_calls` — it "went silent"
34
+
35
+ 3. Finding 011's empty-content nudge was triggered
36
+
37
+ 4. The nudge **also failed** — no valid JSON response was produced
38
+
39
+ 5. The agent fell through to `format_error` with the fallback message
40
+
41
+ ---
42
+
43
+ ## Bug Chain
44
+
45
+ ### Bug 1 — toolDefs included in empty nudge
46
+
47
+ ```js
48
+ const nudgeResult = await callModelWithFallback(client, config, emptyNudge, toolDefs);
49
+ ```
50
+
51
+ When the model is confused after a tool failure, it may respond to the nudge with **another tool call** instead of text. If it does:
52
+
53
+ ```
54
+ nudgeResult.choices[0].message.content = null
55
+ nudgeContent = ''
56
+ JSON.parse('') → throws
57
+ catch: // Give up — content stays ''
58
+ ```
59
+
60
+ The model had an opportunity to call more tools instead of producing a text response — the wrong behavior for a recovery nudge.
61
+
62
+ ### Bug 2 — content assigned after parse
63
+
64
+ ```js
65
+ const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
66
+ parsed = JSON.parse(nudgeContent); // ← throws on non-JSON or empty
67
+ content = nudgeContent; // ← only reached if parse succeeded
68
+ ```
69
+
70
+ If the model responds to the nudge with non-empty but non-JSON text (e.g. a plain English answer), `JSON.parse` throws and `content` is **never updated**. The non-JSON text is discarded. The `!parsed` handler then shows the fallback message instead of the model's actual text.
71
+
72
+ ---
73
+
74
+ ## Difference from Finding 011
75
+
76
+ | Finding | Problem | Trigger |
77
+ |---------|---------|---------|
78
+ | 011 | Empty model response propagates to Telegram | Initial empty content, no recovery chain |
79
+ | 012 | Recovery nudge discards best-effort text; model can respond with tool call | Recovery nudge called with toolDefs + content assigned after parse |
80
+
81
+ Finding 012 is a refinement of the recovery path introduced in Finding 011.
82
+
83
+ ---
84
+
85
+ ## Fix
86
+
87
+ ### `src/server/agent.js` — empty-content nudge block
88
+
89
+ **Before:**
90
+ ```js
91
+ const nudgeResult = await callModelWithFallback(client, config, emptyNudge, toolDefs);
92
+ const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
93
+ parsed = JSON.parse(nudgeContent);
94
+ content = nudgeContent;
95
+ ```
96
+
97
+ **After:**
98
+ ```js
99
+ // No tools: force text response, prevent model from calling another tool
100
+ const nudgeResult = await callModelWithFallback(client, config, emptyNudge, []);
101
+ const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
102
+ // Persist before parsing — if JSON parse throws, content still carries the
103
+ // model's best-effort text so the !parsed handler can show it to the user
104
+ if (nudgeContent.trim()) {
105
+ content = nudgeContent;
106
+ }
107
+ parsed = JSON.parse(nudgeContent);
108
+ ```
109
+
110
+ ---
111
+
112
+ ## Outcome
113
+
114
+ | Nudge response | Before | After |
115
+ |---|---|---|
116
+ | Valid JSON | Clean recovery | Clean recovery (no change) |
117
+ | Non-JSON text | Text discarded, fallback shown | Text shown to user |
118
+ | Tool call (no content) | content='', fallback shown | Less likely (no tools are offered); content='', fallback shown |
119
+ | Empty again | content='', fallback shown | content='', fallback shown (no change) |
120
+
121
+ If the nudge in the observed session produced non-JSON text about the results folder contents, the user would have received that best-effort text rather than "The model did not produce a response. Please try again."
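
The ordering issue from Bug 2 can be demonstrated without a model call. The following sketch compares the two orderings side by side. Both function names are hypothetical and the sample text is made up for illustration.

```javascript
// Old order (Bug 2): content is only updated when JSON.parse succeeds.
function parseThenAssign(content, nudgeContent) {
  let parsed = null;
  try {
    parsed = JSON.parse(nudgeContent); // throws on plain text or ''
    content = nudgeContent;            // skipped when the parse throws
  } catch {
    // The nudge text is lost here.
  }
  return { content, parsed };
}

// New order: keep any non-empty nudge text before attempting to parse it.
function assignThenParse(content, nudgeContent) {
  let parsed = null;
  try {
    if (nudgeContent.trim()) {
      content = nudgeContent;
    }
    parsed = JSON.parse(nudgeContent);
  } catch {
    // content may now carry the model's best-effort text.
  }
  return { content, parsed };
}

const plainText = 'The results folder contains one directory.'; // made-up sample
console.log(parseThenAssign('', plainText)); // content stays ''
console.log(assignThenParse('', plainText)); // content holds the plain text
```

In both orderings `parsed` stays `null` for plain text, so the `!parsed` handler still runs. The difference is only in what it has available to show.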
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@ducci/jarvis",
3
- "version": "1.0.27",
3
+ "version": "1.0.29",
4
4
  "description": "A fully automated agent system that lives on a server.",
5
5
  "main": "./src/index.js",
6
6
  "type": "module",
@@ -215,32 +215,58 @@ async function runAgentLoop(client, config, session, prepareMessages) {
215
215
  let content = assistantMessage.content || '';
216
216
  let parsed = null;
217
217
 
218
- try {
219
- parsed = JSON.parse(content);
220
- } catch {
221
- // Step 1: retry with fallback model
218
+ if (!content.trim()) {
219
+ // Model returned no content at all — use a targeted nudge instead of the
220
+ // standard JSON recovery chain (designed for non-empty non-JSON responses).
221
+ // Send with no tools so the model cannot respond with another tool call,
222
+ // which would leave content empty and discard any recovery text.
223
+ try {
224
+ const emptyNudge = [
225
+ ...preparedMessages,
226
+ { role: 'user', content: 'You returned an empty response. ' + FORMAT_NUDGE },
227
+ ];
228
+ const nudgeResult = await callModelWithFallback(client, config, emptyNudge, []);
229
+ const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
230
+ // Persist nudge text before parsing — if JSON parse throws, content still
231
+ // carries the model's best-effort text so the !parsed handler can show it
232
+ // rather than falling back to "The model did not produce a response."
233
+ if (nudgeContent.trim()) {
234
+ content = nudgeContent;
235
+ }
236
+ parsed = JSON.parse(nudgeContent);
237
+ } catch {
238
+ // Fall through to !parsed handler; content may now carry the nudge text
239
+ }
240
+ } else {
222
241
  try {
223
- const fallbackResult = await callModel(client, config.fallbackModel, preparedMessages, toolDefs);
224
- const fallbackContent = fallbackResult.choices[0]?.message?.content || '';
225
- parsed = JSON.parse(fallbackContent);
226
- content = fallbackContent;
242
+ parsed = JSON.parse(content);
227
243
  } catch {
228
- // Step 2: nudge retry via both models
244
+ // Step 1: retry with fallback model
229
245
  try {
230
- const nudgeMessages = [...preparedMessages, { role: 'user', content: FORMAT_NUDGE }];
231
- const nudgeResult = await callModelWithFallback(client, config, nudgeMessages, toolDefs);
232
- const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
233
- parsed = JSON.parse(nudgeContent);
234
- content = nudgeContent;
246
+ const fallbackResult = await callModel(client, config.fallbackModel, preparedMessages, toolDefs);
247
+ const fallbackContent = fallbackResult.choices[0]?.message?.content || '';
248
+ parsed = JSON.parse(fallbackContent);
249
+ content = fallbackContent;
235
250
  } catch {
236
- // Give up
251
+ // Step 2: nudge retry via both models
252
+ try {
253
+ const nudgeMessages = [...preparedMessages, { role: 'user', content: FORMAT_NUDGE }];
254
+ const nudgeResult = await callModelWithFallback(client, config, nudgeMessages, toolDefs);
255
+ const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
256
+ parsed = JSON.parse(nudgeContent);
257
+ content = nudgeContent;
258
+ } catch {
259
+ // Give up
260
+ }
237
261
  }
238
262
  }
239
263
  }
240
264
 
241
265
  if (!parsed) {
242
- // Don't push bad content — handleChat will inject a synthetic error note
243
- response = content;
266
+ // Don't push bad content — handleChat will inject a synthetic error note.
267
+ // Ensure response is never empty so the delivery layer (e.g. Telegram) can
268
+ // show the user something meaningful rather than its generic fallback message.
269
+ response = content.trim() || 'The model did not produce a response. Please try again.';
244
270
  logSummary = 'Model returned non-JSON final response after recovery attempts.';
245
271
  status = 'format_error';
246
272
  return { iteration, response, logSummary, status, runToolCalls, checkpoint: null, rawResponse: content };
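
The empty-content branch above can be exercised end to end with a stubbed model call. This is a simplified sketch under stated assumptions, not the package's implementation: `recoverFromEmpty` and `stubModel` are hypothetical, the injected `callModel(messages, tools)` replaces `callModelWithFallback(client, config, messages, toolDefs)`, the `FORMAT_NUDGE` text is a placeholder, and the `'ok'` status and `parsed.response` field are assumed for illustration.

```javascript
const FORMAT_NUDGE = 'Reply with a single valid JSON object.'; // placeholder text
const EMPTY_FALLBACK = 'The model did not produce a response. Please try again.';

// Hypothetical, simplified version of the empty-content recovery branch.
async function recoverFromEmpty(callModel, preparedMessages) {
  let content = '';
  let parsed = null;
  try {
    const emptyNudge = [
      ...preparedMessages,
      { role: 'user', content: 'You returned an empty response. ' + FORMAT_NUDGE },
    ];
    // Empty tool list: the model is not offered another tool call.
    const nudgeResult = await callModel(emptyNudge, []);
    const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
    if (nudgeContent.trim()) {
      content = nudgeContent; // keep best-effort text before parsing
    }
    parsed = JSON.parse(nudgeContent);
  } catch {
    // Fall through to the !parsed handler.
  }
  if (!parsed) {
    return { status: 'format_error', response: content.trim() || EMPTY_FALLBACK };
  }
  return { status: 'ok', response: parsed.response }; // assumed success shape
}

// Stub that always answers with the given content (null simulates silence).
const stubModel = (content) => async () => ({ choices: [{ message: { content } }] });

recoverFromEmpty(stubModel(null), []).then((r) => console.log(r));
recoverFromEmpty(stubModel('Folder is empty.'), []).then((r) => console.log(r));
recoverFromEmpty(stubModel('{"response":"done"}'), []).then((r) => console.log(r));
```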