@ducci/jarvis 1.0.46 → 1.0.48
- package/docs/system-prompt.md +6 -44
- package/package.json +1 -1
- package/src/channels/telegram/index.js +2 -0
- package/src/server/agent.js +106 -21
- package/src/server/logging.js +0 -1
- package/src/server/start.js +8 -0
- package/src/server/tools.js +33 -1
package/docs/system-prompt.md
CHANGED
@@ -1,4 +1,4 @@
-# System Prompt (
+# System Prompt (v2)

 This is the authoritative system prompt sent to the model at the start of every session. It is stored as the first message (`role: "system"`) in the conversation history.

@@ -23,14 +23,7 @@ Only the most recent messages are included in your context (sliding window). Old

 ## Crons

-
-
-- Use `create_cron` when the user wants to schedule something — even if they don't say "cron". Triggers: "every night", "every 2 hours", "remind me at 3pm", "notify me in 2 hours", "check X every Monday", etc.
-- Call `get_current_time` first when the user specifies a time. Note: `get_current_time` returns server time — if you know the user's timezone, convert the desired user-local time to server time before computing the cron expression.
-- The `prompt` stored in the cron is executed by a fresh agent with no prior conversation context. Write it as a complete, self-contained instruction.
-- If the user wants to be notified, include "use send_telegram_message to notify the user with the result" in the prompt. If they explicitly don't want a notification, omit it.
-- For one-time tasks, set `once: true` — the cron deletes itself after firing.
-- Use `list_crons` to show active crons, `update_cron` to modify one, `delete_cron` to remove one, `read_cron_log` to inspect past runs.
+Use `create_cron` when the user wants something scheduled — even without the word "cron". Common triggers: "every night", "every 2 hours", "remind me at 3pm", "notify me in 2 hours", "check X every Monday". See the `create_cron` and `get_current_time` tool descriptions for how to construct the schedule and prompt correctly.

 ## Skills

@@ -52,7 +45,7 @@ There are two types of responses depending on whether you need to use tools:
   "logSummary": "A concise explanation of what you did and why, written for a human reading the logs."
 }

-The `response` value must be a string — never an array or object. Use HTML formatting tags for readability: <b>bold</b>, <i>italic</i>, <code>inline code</code>, <pre>code
+The `response` value must be a string — never an array or object. Use HTML formatting tags for readability — only these Telegram-supported tags are allowed: <b>bold</b>, <i>italic</i>, <u>underline</u>, <s>strikethrough</s>, <code>inline code</code>, <pre>code block</pre>, <blockquote>quote</blockquote>, <a href="URL">link</a>. For line breaks use actual newlines (\n), never <br>. Never use Markdown formatting (no **, __, `, or ```). If you need to present structured data (e.g. a list of items), format it as text within the string value.

 Never include markdown code fences, preamble, or any text outside this JSON object. If you cannot complete a task, explain why in the `response` field — still as valid JSON.

@@ -65,42 +58,11 @@ You have access to a set of tools. Each tool has a name and description that tel
 - After a tool call, verify the result before declaring the task done. Always communicate what you did and why — don't just report success, briefly explain the action taken.
 - Stop as soon as the task is complete and verified. Do not do extra work that was not asked for.
 - If a tool fails, record the error in `logSummary` and decide whether to retry with a corrected call or explain the failure to the user.
--
+- Proactively save user facts with `save_user_info` when the user shares personal details (name, timezone, preferences) — even if not asked.
+- Use `write_file` to create or overwrite files — never `exec` with echo/printf/heredoc (shell escaping silently corrupts content).
+- For processes that may run longer than 5 minutes: use `nohup command > /tmp/out.log 2>&1 &` and poll with `exec`.
 - Prefer using tools over making assumptions about the state of the system.

-## exec Safety
-
-The `exec` tool runs real shell commands on the server. Use it responsibly:
-
-- **Never scan from filesystem root.** Commands like `find /`, `find / -name ...`, or `ls -R /` will scan everything including `/proc`, `/sys`, and network mounts. They can saturate CPU and I/O for minutes. Always scope `find` to a specific directory (e.g. `find ~/jarvis -name "*.js"`).
-- **Use known paths.** Prefer `process.cwd()`, `$HOME`, or paths you already know over broad searches. Use `which <binary>` to locate executables.
-- **Prefer targeted reads.** Use `grep`, `head`, or `tail` instead of `cat` on files you haven't seen before. Large file output is truncated anyway — a targeted command gives you better signal.
-- **Avoid commands with unbounded runtime.** If a command could run indefinitely or scan an unknown-size tree, scope it first.
-
-## Writing Files
-
-Use the `write_file` tool to create or overwrite any file. Never use `exec` with `echo`, `printf`, or heredoc to write files.
-
-Shell escaping through `exec` silently corrupts file content: dollar signs become `\$`, backslashes double up, and the resulting file looks correct when printed but is broken at runtime (variables never expand, scripts fail with "command not found"). `write_file` bypasses all shell interpretation — content arrives as a JSON string and lands in the file exactly as written.
-
-- For shell scripts: pass `mode: "755"` to make the file executable in the same call.
-- For any other file: omit `mode` or use `"644"`.
-
-## Execution Timeouts
-
-Every tool call is wrapped in a server-side timeout that the tool's code cannot override:
-
-- **`exec`** — 5-minute cap. Sufficient for scans, builds, and most long-running commands.
-- **`system_install`** — 5-minute cap. Use for installing system binaries via a package manager.
-- **Custom tools via `save_tool`** — default 60s unless you pass `timeout` (in ms, max 600000). If a custom tool wraps a slow operation, set `timeout` explicitly.
-
-**For truly long-running processes (> 5 minutes)**: run in the background and poll for results:
-```sh
-nohup long-running-command > /tmp/output.log 2>&1 & echo $!
-# Check progress later
-cat /tmp/output.log
-```
-
 ## Failure Recovery

 When a tool or command fails:
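The updated `response` rule in the system prompt limits formatting to a fixed set of Telegram-supported tags. As an illustration only, a check along the following lines could flag tags outside that set before a message is sent. The helper name `findDisallowedTags` and the regular expression are assumptions made for this sketch and do not appear in the package.

```javascript
// Tags the updated prompt lists as Telegram-supported.
const ALLOWED_TAGS = new Set(['b', 'i', 'u', 's', 'code', 'pre', 'blockquote', 'a']);

// Hypothetical helper (not part of the package): returns the distinct tag
// names found in `html` that are not in the allow-list above.
function findDisallowedTags(html) {
  const found = new Set();
  // Matches opening and closing tags and captures the tag name.
  const tagPattern = /<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  for (const match of html.matchAll(tagPattern)) {
    const name = match[1].toLowerCase();
    if (!ALLOWED_TAGS.has(name)) found.add(name);
  }
  return [...found];
}
```

A caller could reject or rewrite a response when the returned array is non-empty. A regular expression is a rough filter rather than an HTML parser, so this is a sanity check, not validation.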
package/package.json
CHANGED
package/src/channels/telegram/index.js
CHANGED
@@ -9,6 +9,8 @@ import { load, save } from './sessions.js';

 async function sendMessage(api, chatId, text, sessionId) {
   const MAX_TG = 4096;
+  // Telegram HTML mode does not support <br> — replace with newlines before sending
+  text = text.replace(/<br\s*\/?>/gi, '\n');
   const chunks = [];
   for (let i = 0; i < text.length; i += MAX_TG) {
     chunks.push(text.slice(i, i + MAX_TG));
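The `sendMessage` change above normalizes `<br>` variants to newlines before the text is cut into 4096-character chunks. The following standalone sketch reproduces both steps so the behavior can be tried in isolation. The function name `toTelegramChunks` and its `maxLen` parameter are hypothetical; only the regular expression and the chunking loop are taken from the hunk.

```javascript
const MAX_TG = 4096; // same limit as the constant in the hunk

// Hypothetical wrapper around the two steps shown in the diff.
function toTelegramChunks(text, maxLen = MAX_TG) {
  // Replace <br>, <br/>, <br /> (any letter case) with a newline.
  const normalized = text.replace(/<br\s*\/?>/gi, '\n');
  const chunks = [];
  for (let i = 0; i < normalized.length; i += maxLen) {
    chunks.push(normalized.slice(i, i + maxLen));
  }
  return chunks;
}
```

Because the loop slices at fixed offsets, a chunk boundary can fall inside an HTML tag or entity when a message is longer than the limit.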
package/src/server/agent.js
CHANGED
@@ -105,6 +105,72 @@ function hasConsecutiveModelErrors(messages) {
   );
 }

+/**
+ * Runs a subagent in its own isolated session for a single self-contained task.
+ * Called when the parent agent invokes the spawn_subagent tool.
+ */
+async function runSubagent(client, config, args, parentSessionId) {
+  const subSessionId = `sub-${crypto.randomUUID()}`;
+  const systemPromptTemplate = loadSystemPrompt();
+  const subSession = createSession(systemPromptTemplate);
+
+  let userContent = args.prompt;
+  if (args.context) {
+    userContent = `[Context: ${args.context}]\n\n${args.prompt}`;
+  }
+  subSession.messages.push({ role: 'user', content: userContent });
+
+  const subConfig = {
+    ...config,
+    excludeTools: ['spawn_subagent'],
+    maxIterations: args.maxIterations || config.maxIterations,
+    _sessionId: subSessionId,
+  };
+
+  const usageAccum = { prompt: 0, completion: 0, cacheRead: 0, cacheCreation: 0 };
+
+  function prepareMessages(messages) {
+    const resolved = messages.map((msg, i) => {
+      if (i === 0 && msg.role === 'system') {
+        return { ...msg, content: resolveSystemPrompt(msg.content, subSessionId) };
+      }
+      return msg;
+    });
+    if (resolved.length <= subConfig.contextWindow + 1) return resolved;
+    return [resolved[0], ...resolved.slice(-(subConfig.contextWindow))];
+  }
+
+  const run = await runAgentLoop(client, subConfig, subSession, prepareMessages, usageAccum);
+
+  await appendLog(subSessionId, {
+    iteration: run.iteration,
+    model: config.selectedModel,
+    userInput: args.prompt,
+    toolCalls: run.runToolCalls,
+    response: run.response,
+    logSummary: run.logSummary,
+    status: run.status,
+    parentSessionId: parentSessionId || null,
+    label: args.label || null,
+    tokenUsage: { ...usageAccum },
+  });
+
+  subSession.metadata.tokenUsage = { ...usageAccum };
+
+  try {
+    await saveSession(subSessionId, subSession);
+  } catch (e) {
+    console.error(`Failed to save subagent session ${subSessionId}:`, e);
+  }
+
+  return {
+    status: 'ok',
+    response: run.response,
+    runStatus: run.status,
+    sessionId: subSessionId,
+  };
+}
+
 /**
  * Runs a single agent loop up to maxIterations.
  * Returns { iteration, response, logSummary, status, runToolCalls, checkpoint }.
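The `prepareMessages` closure in `runSubagent` keeps the system prompt and only the most recent `contextWindow` messages. A minimal standalone sketch of that windowing rule follows. The name `windowMessages` is an assumption for illustration, and the system-prompt resolution step from the diff is left out.

```javascript
// Hypothetical standalone version of the sliding window used in the diff:
// always keep the first message, then the last `contextWindow` messages.
function windowMessages(messages, contextWindow) {
  if (messages.length <= contextWindow + 1) return messages;
  return [messages[0], ...messages.slice(-contextWindow)];
}

const history = [
  { role: 'system', content: 'prompt' },
  { role: 'user', content: 'm1' },
  { role: 'assistant', content: 'm2' },
  { role: 'user', content: 'm3' },
  { role: 'assistant', content: 'm4' },
];
const windowed = windowMessages(history, 2); // system prompt + m3 + m4
```

One property worth knowing about this pattern: in JavaScript `slice(-0)` is the same as `slice(0)`, so a `contextWindow` of 0 would return the first message followed by the full history rather than the first message alone.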
@@ -112,6 +178,9 @@ function hasConsecutiveModelErrors(messages) {
 export async function runAgentLoop(client, config, session, prepareMessages, usageAccum) {
   let tools = await loadTools();
   let toolDefs = getToolDefinitions(tools);
+  if (config.excludeTools?.length) {
+    toolDefs = toolDefs.filter(t => !config.excludeTools.includes(t.function?.name));
+  }
   let iteration = 0;
   const runToolCalls = [];
   const loopTracker = new Map();
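The `excludeTools` filter above removes tool definitions by name before they are offered to the model, which is how a subagent is kept from seeing `spawn_subagent`. A small sketch with made-up definitions shows the effect. The `filterToolDefs` wrapper and the sample array are hypothetical; the filter expression is the one from the hunk.

```javascript
// Hypothetical wrapper around the filter expression from the diff.
function filterToolDefs(toolDefs, excludeTools) {
  if (!excludeTools?.length) return toolDefs;
  return toolDefs.filter((t) => !excludeTools.includes(t.function?.name));
}

// Made-up definitions in the same shape as the seed tools.
const sampleDefs = [
  { type: 'function', function: { name: 'exec' } },
  { type: 'function', function: { name: 'spawn_subagent' } },
];

const visibleToSubagent = filterToolDefs(sampleDefs, ['spawn_subagent']);
```

In the hunks shown here, the filter applies to the definitions sent to the model. Whether a call to an excluded tool is also rejected at execution time cannot be determined from this diff alone.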
@@ -162,7 +231,7 @@ export async function runAgentLoop(client, config, session, prepareMessages, usa

     const assistantMessage = modelResult.choices[0].message;

-    // Tool calls present — execute
+    // Tool calls present — execute in parallel, then process results in order
     if (assistantMessage.tool_calls && assistantMessage.tool_calls.length > 0) {
       session.messages.push({
         role: 'assistant',
@@ -176,17 +245,42 @@ export async function runAgentLoop(client, config, session, prepareMessages, usa
         })),
       });

-
-
-
-
-
-
-
-
-
-
+      // Execute all tool calls concurrently; session mutations happen serially below.
+      const toolResults = await Promise.all(
+        assistantMessage.tool_calls.map(async (toolCall) => {
+          const toolName = toolCall.function.name;
+          let toolArgs;
+          let argParseError = null;
+          try {
+            toolArgs = JSON.parse(toolCall.function.arguments || '{}');
+          } catch (e) {
+            argParseError = e;
+          }

+          if (argParseError) {
+            return { toolCall, toolName, toolArgs: {}, argParseError, result: null, toolStatus: 'error' };
+          }
+
+          let result;
+          let toolStatus = 'ok';
+          try {
+            if (toolName === 'spawn_subagent') {
+              result = await runSubagent(client, config, toolArgs, config._sessionId);
+            } else {
+              result = await executeTool(tools, toolName, toolArgs);
+            }
+          } catch (e) {
+            result = { status: 'error', error: e.message };
+            toolStatus = 'error';
+          }
+
+          return { toolCall, toolName, toolArgs, argParseError: null, result, toolStatus };
+        })
+      );
+
+      // Process results serially to preserve message order and update trackers.
+      let stderrErrorInIteration = false;
+      for (const { toolCall, toolName, toolArgs, argParseError, result, toolStatus } of toolResults) {
         if (argParseError) {
           const errorContent = JSON.stringify({
             status: 'error',
@@ -198,15 +292,6 @@ export async function runAgentLoop(client, config, session, prepareMessages, usa
           continue;
         }

-        let result;
-        let toolStatus = 'ok';
-        try {
-          result = await executeTool(tools, toolName, toolArgs);
-        } catch (e) {
-          result = { status: 'error', error: e.message };
-          toolStatus = 'error';
-        }
-
         const resultObj = typeof result === 'object' && result !== null ? result : null;
         const toolFailed = toolStatus === 'error' || (resultObj && resultObj.status === 'error');
         if (toolFailed) {
@@ -620,7 +705,7 @@ async function _runHandleChat(config, sessionId, userMessage, attachments = [])
     }

     const runStartIndex = session.messages.length;
-    const run = await runAgentLoop(client, config, session, prepareMessages, usageAccum);
+    const run = await runAgentLoop(client, { ...config, _sessionId: sessionId }, session, prepareMessages, usageAccum);
     allToolCalls.push(...run.runToolCalls);

     if (run.status !== 'checkpoint_reached') {
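The main behavioral change in `agent.js` is that tool calls from one model response now run concurrently via `Promise.all`, while their results are still consumed in the original order. The sketch below isolates that pattern under simplifying assumptions: `executeAll` and the `execute` callback are hypothetical stand-ins for the loop body and `executeTool`, and session bookkeeping is omitted.

```javascript
// Hypothetical helper: run every tool call concurrently and resolve to
// results in the same order as the input, regardless of completion order.
async function executeAll(toolCalls, execute) {
  return Promise.all(
    toolCalls.map(async (toolCall) => {
      const name = toolCall.function.name;
      let args;
      try {
        args = JSON.parse(toolCall.function.arguments || '{}');
      } catch (e) {
        // Malformed arguments: report the error instead of rejecting the batch.
        return { name, status: 'error', result: null, parseError: e.message };
      }
      try {
        return { name, status: 'ok', result: await execute(name, args), parseError: null };
      } catch (e) {
        // A throwing tool becomes an error result, so Promise.all never rejects.
        return { name, status: 'error', result: { status: 'error', error: e.message }, parseError: null };
      }
    })
  );
}
```

Because each mapped promise catches its own failures, one failing tool does not discard the results of the others. The trade-off is that tools with side effects on shared state may now interleave, which the sequential version did not allow.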
package/src/server/logging.js
CHANGED
@@ -11,7 +11,6 @@ export async function appendLog(sessionId, entry) {
   // Console output for better visibility
   const statusColor = entry.status === 'ok' ? chalk.green : chalk.red;
   console.log(
-    `[${chalk.dim(new Date().toLocaleTimeString())}] ` +
     `${chalk.blue('Session')}: ${chalk.dim(sessionId.slice(0, 8))} | ` +
     `${chalk.yellow('Iter')}: ${entry.iteration} | ` +
     `${chalk.cyan('Status')}: ${statusColor(entry.status)} | ` +
package/src/server/start.js
CHANGED
@@ -1,3 +1,11 @@
+// Prefix every console.log/error line with a date+time stamp so all output
+// (agent, cron, telegram, tools, etc.) is consistently timestamped in server.log.
+const _log = console.log.bind(console);
+const _err = console.error.bind(console);
+const ts = () => new Date().toISOString().replace('T', ' ').slice(0, 19);
+console.log = (...args) => _log(`[${ts()}]`, ...args);
+console.error = (...args) => _err(`[${ts()}]`, ...args);
+
 import { startServer } from './app.js';

 startServer();
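The `start.js` change wraps `console.log` and `console.error` so every line carries a `YYYY-MM-DD HH:MM:SS` prefix. The sketch below shows the same formatting in a testable form. The names `formatTimestamp` and `withTimestamp` and the injectable `now` parameter are assumptions added for illustration; the package patches the console methods directly.

```javascript
// Same expression as `ts` in the diff, but taking the date as a parameter.
const formatTimestamp = (date) => date.toISOString().replace('T', ' ').slice(0, 19);

// Hypothetical wrapper: returns a logger that prefixes each call with a timestamp.
function withTimestamp(logFn, now = () => new Date()) {
  return (...args) => logFn(`[${formatTimestamp(now())}]`, ...args);
}

// Capture output in an array instead of writing to the console.
const captured = [];
const fixedClock = () => new Date('2024-01-02T03:04:05.678Z');
const log = withTimestamp((...args) => captured.push(args), fixedClock);
log('server started', 8080);
```

Since `toISOString()` always reports UTC, the prefix is in UTC, unlike the `toLocaleTimeString()` call removed from `logging.js`, which used the server's local time.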
package/src/server/tools.js
CHANGED
@@ -53,7 +53,7 @@ const SEED_TOOLS = {
       type: 'function',
       function: {
         name: 'exec',
-        description: 'Execute an arbitrary shell command on the server. Returns stdout, stderr, and exit code. Use this for any system operation: running scripts,
+        description: 'Execute an arbitrary shell command on the server. Returns stdout, stderr, and exit code. Use this for any system operation: running scripts, managing processes, querying files, etc. Has a 5-minute timeout. Safety: never scan from filesystem root (avoid `find /`, `ls -R /`) — always scope to a specific directory. Prefer `grep`, `head`, or `tail` over `cat` on unknown files. Use `which <binary>` to locate executables. Avoid commands with unbounded runtime.',
         parameters: {
           type: 'object',
           properties: {
@@ -584,6 +584,38 @@ const SEED_TOOLS = {
       return { status: 'ok', entries };
     `,
   },
+  spawn_subagent: {
+    definition: {
+      type: 'function',
+      function: {
+        name: 'spawn_subagent',
+        description: 'Spawn an independent subagent to handle a single subtask in its own isolated context and session. Use this when processing many similar items (e.g. emails, files, URLs) where doing them serially in the same context would overflow. Each subagent runs a full agent loop with access to all tools and returns its final response. Multiple spawn_subagent calls in a single response run in parallel. The subagent has no access to the current conversation — the prompt must be fully self-contained. Do not instruct subagents to use send_telegram_message; collect their results and notify the user yourself.',
+        parameters: {
+          type: 'object',
+          properties: {
+            prompt: {
+              type: 'string',
+              description: 'The self-contained task for the subagent. Must include all necessary context — the subagent has no access to the current conversation history.',
+            },
+            context: {
+              type: 'string',
+              description: 'Optional extra context to prepend to the prompt (e.g. the item to process, such as an email body or file path).',
+            },
+            label: {
+              type: 'string',
+              description: 'Optional short label for this subagent, used in logging (e.g. "email-42", "file-scan-/tmp/foo.txt").',
+            },
+            maxIterations: {
+              type: 'number',
+              description: 'Optional cap on the number of iterations the subagent may use. Defaults to the global maxIterations setting. Use a lower value (e.g. 5) for simple subtasks in bulk processing.',
+            },
+          },
+          required: ['prompt'],
+        },
+      },
+    },
+    code: `return { status: 'error', error: 'spawn_subagent is a native tool handled by the agent runtime.' };`,
+  },
   read_skill: {
     definition: {
       type: 'function',
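To make the `spawn_subagent` parameters concrete, the sketch below shows a made-up argument object and how, going by the `runSubagent` code earlier in this diff, those arguments would be turned into the subagent's first user message and iteration cap. The helper names `buildSubagentInput` and `resolveMaxIterations` and the sample values are hypothetical; the two expressions they wrap are copied from the diff.

```javascript
// Mirrors how runSubagent builds the first user message from prompt and context.
function buildSubagentInput(args) {
  return args.context ? `[Context: ${args.context}]\n\n${args.prompt}` : args.prompt;
}

// Mirrors the `args.maxIterations || config.maxIterations` fallback.
function resolveMaxIterations(args, config) {
  return args.maxIterations || config.maxIterations;
}

// Made-up arguments a model might emit for one item in a bulk job.
const exampleArgs = {
  prompt: 'Summarize the email in one sentence and return only the summary.',
  context: 'Subject: Invoice 17. Body: Payment is due on Friday.',
  label: 'email-17',
  maxIterations: 5,
};

const firstUserMessage = buildSubagentInput(exampleArgs);
const iterationCap = resolveMaxIterations(exampleArgs, { maxIterations: 20 });
```

Because the fallback uses `||`, a `maxIterations` of 0 is treated the same as an omitted value and falls back to the global setting.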