@ducci/jarvis 1.0.46 → 1.0.48

@@ -1,4 +1,4 @@
-# System Prompt (v1)
+# System Prompt (v2)
 
 This is the authoritative system prompt sent to the model at the start of every session. It is stored as the first message (`role: "system"`) in the conversation history.
 
@@ -23,14 +23,7 @@ Only the most recent messages are included in your context (sliding window). Old
 
 ## Crons
 
-You can schedule recurring or one-time tasks using cron jobs.
-
-- Use `create_cron` when the user wants to schedule something — even if they don't say "cron". Triggers: "every night", "every 2 hours", "remind me at 3pm", "notify me in 2 hours", "check X every Monday", etc.
-- Call `get_current_time` first when the user specifies a time. Note: `get_current_time` returns server time — if you know the user's timezone, convert the desired user-local time to server time before computing the cron expression.
-- The `prompt` stored in the cron is executed by a fresh agent with no prior conversation context. Write it as a complete, self-contained instruction.
-- If the user wants to be notified, include "use send_telegram_message to notify the user with the result" in the prompt. If they explicitly don't want a notification, omit it.
-- For one-time tasks, set `once: true` — the cron deletes itself after firing.
-- Use `list_crons` to show active crons, `update_cron` to modify one, `delete_cron` to remove one, `read_cron_log` to inspect past runs.
+Use `create_cron` when the user wants something scheduled — even without the word "cron". Common triggers: "every night", "every 2 hours", "remind me at 3pm", "notify me in 2 hours", "check X every Monday". See the `create_cron` and `get_current_time` tool descriptions for how to construct the schedule and prompt correctly.
 
 ## Skills
 
@@ -52,7 +45,7 @@ There are two types of responses depending on whether you need to use tools:
   "logSummary": "A concise explanation of what you did and why, written for a human reading the logs."
 }
 
-The `response` value must be a string — never an array or object. Use HTML formatting tags for readability: <b>bold</b>, <i>italic</i>, <code>inline code</code>, <pre>code blocks</pre>, <blockquote>quotes</blockquote>. Never use Markdown formatting (no **, __, `, or ```). If you need to present structured data (e.g. a list of items), format it as text within the string value.
+The `response` value must be a string — never an array or object. Use HTML formatting tags for readability — only these Telegram-supported tags are allowed: <b>bold</b>, <i>italic</i>, <u>underline</u>, <s>strikethrough</s>, <code>inline code</code>, <pre>code block</pre>, <blockquote>quote</blockquote>, <a href="URL">link</a>. For line breaks use actual newlines (\n), never <br>. Never use Markdown formatting (no **, __, `, or ```). If you need to present structured data (e.g. a list of items), format it as text within the string value.
 
 Never include markdown code fences, preamble, or any text outside this JSON object. If you cannot complete a task, explain why in the `response` field — still as valid JSON.
 
@@ -65,42 +58,11 @@ You have access to a set of tools. Each tool has a name and description that tel
 - After a tool call, verify the result before declaring the task done. Always communicate what you did and why — don't just report success, briefly explain the action taken.
 - Stop as soon as the task is complete and verified. Do not do extra work that was not asked for.
 - If a tool fails, record the error in `logSummary` and decide whether to retry with a corrected call or explain the failure to the user.
-- If the user shares personal information, persist it using the appropriate tool.
+- Proactively save user facts with `save_user_info` when the user shares personal details (name, timezone, preferences) even if not asked.
+- Use `write_file` to create or overwrite files — never `exec` with echo/printf/heredoc (shell escaping silently corrupts content).
+- For processes that may run longer than 5 minutes: use `nohup command > /tmp/out.log 2>&1 &` and poll with `exec`.
 - Prefer using tools over making assumptions about the state of the system.
 
-## exec Safety
-
-The `exec` tool runs real shell commands on the server. Use it responsibly:
-
-- **Never scan from filesystem root.** Commands like `find /`, `find / -name ...`, or `ls -R /` will scan everything including `/proc`, `/sys`, and network mounts. They can saturate CPU and I/O for minutes. Always scope `find` to a specific directory (e.g. `find ~/jarvis -name "*.js"`).
-- **Use known paths.** Prefer `process.cwd()`, `$HOME`, or paths you already know over broad searches. Use `which <binary>` to locate executables.
-- **Prefer targeted reads.** Use `grep`, `head`, or `tail` instead of `cat` on files you haven't seen before. Large file output is truncated anyway — a targeted command gives you better signal.
-- **Avoid commands with unbounded runtime.** If a command could run indefinitely or scan an unknown-size tree, scope it first.
-
-## Writing Files
-
-Use the `write_file` tool to create or overwrite any file. Never use `exec` with `echo`, `printf`, or heredoc to write files.
-
-Shell escaping through `exec` silently corrupts file content: dollar signs become `\$`, backslashes double up, and the resulting file looks correct when printed but is broken at runtime (variables never expand, scripts fail with "command not found"). `write_file` bypasses all shell interpretation — content arrives as a JSON string and lands in the file exactly as written.
-
-- For shell scripts: pass `mode: "755"` to make the file executable in the same call.
-- For any other file: omit `mode` or use `"644"`.
-
-## Execution Timeouts
-
-Every tool call is wrapped in a server-side timeout that the tool's code cannot override:
-
-- **`exec`** — 5-minute cap. Sufficient for scans, builds, and most long-running commands.
-- **`system_install`** — 5-minute cap. Use for installing system binaries via a package manager.
-- **Custom tools via `save_tool`** — default 60s unless you pass `timeout` (in ms, max 600000). If a custom tool wraps a slow operation, set `timeout` explicitly.
-
-**For truly long-running processes (> 5 minutes)**: run in the background and poll for results:
-```sh
-nohup long-running-command > /tmp/output.log 2>&1 & echo $!
-# Check progress later
-cat /tmp/output.log
-```
-
 ## Failure Recovery
 
 When a tool or command fails:
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@ducci/jarvis",
-  "version": "1.0.46",
+  "version": "1.0.48",
   "description": "A fully automated agent system that lives on a server.",
   "main": "./src/index.js",
   "type": "module",
@@ -9,6 +9,8 @@ import { load, save } from './sessions.js';
 
 async function sendMessage(api, chatId, text, sessionId) {
   const MAX_TG = 4096;
+  // Telegram HTML mode does not support <br> — replace with newlines before sending
+  text = text.replace(/<br\s*\/?>/gi, '\n');
   const chunks = [];
   for (let i = 0; i < text.length; i += MAX_TG) {
     chunks.push(text.slice(i, i + MAX_TG));
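The two steps in this hunk (normalizing `<br>` tags, then splitting into pieces of at most 4096 characters) can be tried in isolation. The following is a minimal sketch, not the package's code: `normalizeLineBreaks` and `chunkText` are hypothetical helper names, and only `MAX_TG` and the regular expression are taken from the diff.

```javascript
// Hypothetical helpers for illustration; only MAX_TG and the regex come from the diff.
const MAX_TG = 4096;

// Replace <br>, <br/>, <br /> (any letter case) with a real newline.
function normalizeLineBreaks(text) {
  return text.replace(/<br\s*\/?>/gi, '\n');
}

// Split text into consecutive slices of at most `size` characters.
function chunkText(text, size = MAX_TG) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

const chunks = chunkText(normalizeLineBreaks('first line<br>second line<BR />third line'));
console.log(chunks);
```

Normalizing before chunking means the chunk boundaries are computed on the text that is actually sent.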
@@ -105,6 +105,72 @@ function hasConsecutiveModelErrors(messages) {
   );
 }
 
+/**
+ * Runs a subagent in its own isolated session for a single self-contained task.
+ * Called when the parent agent invokes the spawn_subagent tool.
+ */
+async function runSubagent(client, config, args, parentSessionId) {
+  const subSessionId = `sub-${crypto.randomUUID()}`;
+  const systemPromptTemplate = loadSystemPrompt();
+  const subSession = createSession(systemPromptTemplate);
+
+  let userContent = args.prompt;
+  if (args.context) {
+    userContent = `[Context: ${args.context}]\n\n${args.prompt}`;
+  }
+  subSession.messages.push({ role: 'user', content: userContent });
+
+  const subConfig = {
+    ...config,
+    excludeTools: ['spawn_subagent'],
+    maxIterations: args.maxIterations || config.maxIterations,
+    _sessionId: subSessionId,
+  };
+
+  const usageAccum = { prompt: 0, completion: 0, cacheRead: 0, cacheCreation: 0 };
+
+  function prepareMessages(messages) {
+    const resolved = messages.map((msg, i) => {
+      if (i === 0 && msg.role === 'system') {
+        return { ...msg, content: resolveSystemPrompt(msg.content, subSessionId) };
+      }
+      return msg;
+    });
+    if (resolved.length <= subConfig.contextWindow + 1) return resolved;
+    return [resolved[0], ...resolved.slice(-(subConfig.contextWindow))];
+  }
+
+  const run = await runAgentLoop(client, subConfig, subSession, prepareMessages, usageAccum);
+
+  await appendLog(subSessionId, {
+    iteration: run.iteration,
+    model: config.selectedModel,
+    userInput: args.prompt,
+    toolCalls: run.runToolCalls,
+    response: run.response,
+    logSummary: run.logSummary,
+    status: run.status,
+    parentSessionId: parentSessionId || null,
+    label: args.label || null,
+    tokenUsage: { ...usageAccum },
+  });
+
+  subSession.metadata.tokenUsage = { ...usageAccum };
+
+  try {
+    await saveSession(subSessionId, subSession);
+  } catch (e) {
+    console.error(`Failed to save subagent session ${subSessionId}:`, e);
+  }
+
+  return {
+    status: 'ok',
+    response: run.response,
+    runStatus: run.status,
+    sessionId: subSessionId,
+  };
+}
+
 /**
  * Runs a single agent loop up to maxIterations.
  * Returns { iteration, response, logSummary, status, runToolCalls, checkpoint }.
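The `prepareMessages` helper inside `runSubagent` trims the history to the system message plus the most recent `contextWindow` messages. The sketch below isolates just that windowing step so it can be checked on its own; `applyContextWindow` is a hypothetical name, and it assumes `contextWindow` is a positive integer (with `0`, `slice(-0)` would return the whole array rather than nothing).

```javascript
// Hypothetical standalone version of the sliding-window step in prepareMessages.
// Assumes messages[0] is the system message and contextWindow >= 1.
function applyContextWindow(messages, contextWindow) {
  // Short histories pass through untouched.
  if (messages.length <= contextWindow + 1) return messages;
  // Otherwise keep the system message plus the last `contextWindow` messages.
  return [messages[0], ...messages.slice(-contextWindow)];
}

const history = [
  { role: 'system', content: 'system prompt' },
  { role: 'user', content: 'u1' },
  { role: 'assistant', content: 'a1' },
  { role: 'user', content: 'u2' },
  { role: 'assistant', content: 'a2' },
  { role: 'user', content: 'u3' },
];

console.log(applyContextWindow(history, 2).map((m) => m.content));
```

With a window of 2, the call keeps the system prompt followed by `a2` and `u3`; everything in between is dropped from the request but stays in the stored session.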
@@ -112,6 +178,9 @@ function hasConsecutiveModelErrors(messages) {
 export async function runAgentLoop(client, config, session, prepareMessages, usageAccum) {
   let tools = await loadTools();
   let toolDefs = getToolDefinitions(tools);
+  if (config.excludeTools?.length) {
+    toolDefs = toolDefs.filter(t => !config.excludeTools.includes(t.function?.name));
+  }
   let iteration = 0;
   const runToolCalls = [];
   const loopTracker = new Map();
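The `excludeTools` filter is what keeps a subagent from seeing `spawn_subagent` in its own tool list, so subagents cannot spawn further subagents. A minimal sketch of the same filter as a standalone function follows; `filterToolDefs` is a hypothetical name, and the definitions are reduced to the `function.name` field the filter reads.

```javascript
// Hypothetical standalone form of the filter added to runAgentLoop.
function filterToolDefs(toolDefs, excludeTools) {
  // No exclusion list (undefined or empty): return the definitions unchanged.
  if (!excludeTools?.length) return toolDefs;
  // Drop any definition whose function name is on the exclusion list.
  return toolDefs.filter((t) => !excludeTools.includes(t.function?.name));
}

const toolDefs = [
  { type: 'function', function: { name: 'exec' } },
  { type: 'function', function: { name: 'spawn_subagent' } },
  { type: 'function', function: { name: 'read_skill' } },
];

console.log(filterToolDefs(toolDefs, ['spawn_subagent']).map((t) => t.function.name));
```

Note that this only removes the definition sent to the model. In the diff, the loop still loads the full `tools` set, so the exclusion works by hiding the tool from the model, not by blocking its execution.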
@@ -162,7 +231,7 @@ export async function runAgentLoop(client, config, session, prepareMessages, usa
 
     const assistantMessage = modelResult.choices[0].message;
 
-    // Tool calls present — execute serially and continue loop
+    // Tool calls present — execute in parallel, then process results in order
     if (assistantMessage.tool_calls && assistantMessage.tool_calls.length > 0) {
       session.messages.push({
         role: 'assistant',
@@ -176,17 +245,42 @@ export async function runAgentLoop(client, config, session, prepareMessages, usa
         })),
       });
 
-      let stderrErrorInIteration = false;
-      for (const toolCall of assistantMessage.tool_calls) {
-        const toolName = toolCall.function.name;
-        let toolArgs;
-        let argParseError = null;
-        try {
-          toolArgs = JSON.parse(toolCall.function.arguments || '{}');
-        } catch (e) {
-          argParseError = e;
-        }
+      // Execute all tool calls concurrently; session mutations happen serially below.
+      const toolResults = await Promise.all(
+        assistantMessage.tool_calls.map(async (toolCall) => {
+          const toolName = toolCall.function.name;
+          let toolArgs;
+          let argParseError = null;
+          try {
+            toolArgs = JSON.parse(toolCall.function.arguments || '{}');
+          } catch (e) {
+            argParseError = e;
+          }
 
+          if (argParseError) {
+            return { toolCall, toolName, toolArgs: {}, argParseError, result: null, toolStatus: 'error' };
+          }
+
+          let result;
+          let toolStatus = 'ok';
+          try {
+            if (toolName === 'spawn_subagent') {
+              result = await runSubagent(client, config, toolArgs, config._sessionId);
+            } else {
+              result = await executeTool(tools, toolName, toolArgs);
+            }
+          } catch (e) {
+            result = { status: 'error', error: e.message };
+            toolStatus = 'error';
+          }
+
+          return { toolCall, toolName, toolArgs, argParseError: null, result, toolStatus };
+        })
+      );
+
+      // Process results serially to preserve message order and update trackers.
+      let stderrErrorInIteration = false;
+      for (const { toolCall, toolName, toolArgs, argParseError, result, toolStatus } of toolResults) {
         if (argParseError) {
           const errorContent = JSON.stringify({
             status: 'error',
@@ -198,15 +292,6 @@ export async function runAgentLoop(client, config, session, prepareMessages, usa
           continue;
         }
 
-        let result;
-        let toolStatus = 'ok';
-        try {
-          result = await executeTool(tools, toolName, toolArgs);
-        } catch (e) {
-          result = { status: 'error', error: e.message };
-          toolStatus = 'error';
-        }
-
         const resultObj = typeof result === 'object' && result !== null ? result : null;
         const toolFailed = toolStatus === 'error' || (resultObj && resultObj.status === 'error');
         if (toolFailed) {
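The change in these two hunks relies on a property of `Promise.all`: the resolved array follows the order of the input, not the order in which the calls finish. That is what allows the tools to run concurrently while the later `for` loop still appends results to the session in the original call order. The sketch below shows the pattern in isolation; `runToolCallsConcurrently` and the `executeTool(name, args)` callback are hypothetical stand-ins, not the package's functions.

```javascript
// Sketch only: hypothetical names, simplified result shape.
async function runToolCallsConcurrently(toolCalls, executeTool) {
  // Start every call at once. Promise.all resolves to results in input order,
  // regardless of which call finishes first.
  return Promise.all(
    toolCalls.map(async (toolCall) => {
      const name = toolCall.function.name;
      let args;
      try {
        args = JSON.parse(toolCall.function.arguments || '{}');
      } catch (e) {
        // Malformed arguments: report the error without executing the tool.
        return { name, status: 'error', result: { status: 'error', error: e.message } };
      }
      try {
        return { name, status: 'ok', result: await executeTool(name, args) };
      } catch (e) {
        // Errors are captured per call, so one failure does not reject the batch.
        return { name, status: 'error', result: { status: 'error', error: e.message } };
      }
    })
  );
}
```

Catching errors inside each mapped callback matters here: if a callback were allowed to reject, `Promise.all` would reject as a whole and the results of the other calls would be lost.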
@@ -620,7 +705,7 @@ async function _runHandleChat(config, sessionId, userMessage, attachments = [])
     }
 
     const runStartIndex = session.messages.length;
-    const run = await runAgentLoop(client, config, session, prepareMessages, usageAccum);
+    const run = await runAgentLoop(client, { ...config, _sessionId: sessionId }, session, prepareMessages, usageAccum);
     allToolCalls.push(...run.runToolCalls);
 
     if (run.status !== 'checkpoint_reached') {
@@ -11,7 +11,6 @@ export async function appendLog(sessionId, entry) {
   // Console output for better visibility
   const statusColor = entry.status === 'ok' ? chalk.green : chalk.red;
   console.log(
-    `[${chalk.dim(new Date().toLocaleTimeString())}] ` +
     `${chalk.blue('Session')}: ${chalk.dim(sessionId.slice(0, 8))} | ` +
     `${chalk.yellow('Iter')}: ${entry.iteration} | ` +
     `${chalk.cyan('Status')}: ${statusColor(entry.status)} | ` +
@@ -1,3 +1,11 @@
+// Prefix every console.log/error line with a date+time stamp so all output
+// (agent, cron, telegram, tools, etc.) is consistently timestamped in server.log.
+const _log = console.log.bind(console);
+const _err = console.error.bind(console);
+const ts = () => new Date().toISOString().replace('T', ' ').slice(0, 19);
+console.log = (...args) => _log(`[${ts()}]`, ...args);
+console.error = (...args) => _err(`[${ts()}]`, ...args);
+
 import { startServer } from './app.js';
 
 startServer();
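The `ts` helper above derives its stamp from `toISOString()`, which always reports UTC, so the log prefix is in UTC regardless of the server's local timezone (the removed `toLocaleTimeString()` call in `appendLog` used local time). A small sketch makes the resulting format visible; `formatTimestamp` is a hypothetical variant of `ts` that takes the date as a parameter so the output is reproducible.

```javascript
// Hypothetical variant of `ts` that accepts a Date for deterministic output.
function formatTimestamp(date = new Date()) {
  // "2024-01-02T03:04:05.678Z" -> "2024-01-02 03:04:05" (UTC, seconds precision)
  return date.toISOString().replace('T', ' ').slice(0, 19);
}

console.log(`[${formatTimestamp(new Date('2024-01-02T03:04:05.678Z'))}] example line`);
// prints "[2024-01-02 03:04:05] example line"
```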
@@ -53,7 +53,7 @@ const SEED_TOOLS = {
       type: 'function',
       function: {
         name: 'exec',
-        description: 'Execute an arbitrary shell command on the server. Returns stdout, stderr, and exit code. Use this for any system operation: running scripts, installing packages, managing files, etc.',
+        description: 'Execute an arbitrary shell command on the server. Returns stdout, stderr, and exit code. Use this for any system operation: running scripts, managing processes, querying files, etc. Has a 5-minute timeout. Safety: never scan from filesystem root (avoid `find /`, `ls -R /`) — always scope to a specific directory. Prefer `grep`, `head`, or `tail` over `cat` on unknown files. Use `which <binary>` to locate executables. Avoid commands with unbounded runtime.',
         parameters: {
           type: 'object',
           properties: {
@@ -584,6 +584,38 @@ const SEED_TOOLS = {
       return { status: 'ok', entries };
     `,
   },
+  spawn_subagent: {
+    definition: {
+      type: 'function',
+      function: {
+        name: 'spawn_subagent',
+        description: 'Spawn an independent subagent to handle a single subtask in its own isolated context and session. Use this when processing many similar items (e.g. emails, files, URLs) where doing them serially in the same context would overflow. Each subagent runs a full agent loop with access to all tools and returns its final response. Multiple spawn_subagent calls in a single response run in parallel. The subagent has no access to the current conversation — the prompt must be fully self-contained. Do not instruct subagents to use send_telegram_message; collect their results and notify the user yourself.',
+        parameters: {
+          type: 'object',
+          properties: {
+            prompt: {
+              type: 'string',
+              description: 'The self-contained task for the subagent. Must include all necessary context — the subagent has no access to the current conversation history.',
+            },
+            context: {
+              type: 'string',
+              description: 'Optional extra context to prepend to the prompt (e.g. the item to process, such as an email body or file path).',
+            },
+            label: {
+              type: 'string',
+              description: 'Optional short label for this subagent, used in logging (e.g. "email-42", "file-scan-/tmp/foo.txt").',
+            },
+            maxIterations: {
+              type: 'number',
+              description: 'Optional cap on the number of iterations the subagent may use. Defaults to the global maxIterations setting. Use a lower value (e.g. 5) for simple subtasks in bulk processing.',
+            },
+          },
+          required: ['prompt'],
+        },
+      },
+    },
+    code: `return { status: 'error', error: 'spawn_subagent is a native tool handled by the agent runtime.' };`,
+  },
   read_skill: {
     definition: {
       type: 'function',
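To make the `spawn_subagent` parameters concrete, the sketch below shows an illustrative argument object and how `runSubagent` (earlier in this diff) turns `prompt` and `context` into the subagent's single user message. `buildSubagentUserContent` is a hypothetical name for that step, and the argument values are invented for the example.

```javascript
// Hypothetical extraction of the user-message construction in runSubagent.
function buildSubagentUserContent(args) {
  // With context: prefix it in brackets, then a blank line, then the prompt.
  if (args.context) {
    return `[Context: ${args.context}]\n\n${args.prompt}`;
  }
  // Without context: the prompt is used as-is.
  return args.prompt;
}

// Illustrative arguments only; the values are made up for this example.
const exampleArgs = {
  prompt: 'Summarize the email in one sentence and say whether it needs a reply.',
  context: 'Subject: Invoice overdue. Body: Please settle the invoice by Friday.',
  label: 'email-42',
  maxIterations: 5,
};

console.log(buildSubagentUserContent(exampleArgs));
```

`label` and `maxIterations` do not affect the message text: in the diff, `label` is only written to the log entry, and `maxIterations` overrides the inherited setting in `subConfig`.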