@ducci/jarvis 1.0.63 → 1.0.65

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,121 +1,86 @@
  # Jarvis
 
- A fully automated agent system that lives on a server. Will always run, can be started, stopped, restarted. Autorestarts on crash and can be configured via a setup phase. This README is the entry point and links to focused docs for each major topic.
+ A self-hosted AI agent that runs as a background server. Chat with it via a web UI or Telegram, give it tools to run shell commands and manage files, and schedule recurring tasks, all powered by any model on OpenRouter, z.ai, or the Anthropic API.
 
- ## Docs
+ ## Features
+
+ - **Agent loop** — runs tools autonomously, hands off to a fresh context when it hits the iteration limit, and keeps going until the task is done
+ - **Web UI** — built-in chat interface served at `http://localhost:18008`
+ - **Telegram** — optional channel adapter; chat from your phone, send photos, get proactive notifications
+ - **Cron scheduler** — schedule recurring or one-time tasks in plain English; the agent runs them autonomously and can notify you via Telegram
+ - **Skills** — Markdown-defined workflows the agent discovers and follows for specific task types
+ - **Custom tools** — define tools in JSON (name, description, JS code); the agent picks them up without a restart
+ - **Multi-provider** — OpenRouter, z.ai, or Anthropic directly (with prompt caching)
+ - **Persistent sessions** — full conversation history per session, sliding context window
 
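The "Custom tools" bullet added above implies a JSON tool definition carrying a name, description, and JS code. A hypothetical `tools.json` entry might look like the following sketch — the field names are illustrative assumptions, not the package's documented schema:

```json
{
  "name": "disk_usage",
  "description": "Report disk usage for the home directory",
  "code": "const { execSync } = require('child_process'); return execSync('du -sh ~').toString();"
}
```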
- - Onboarding and Configuration: [docs/setup.md](./docs/setup.md)
- - CLI and Server Lifecycle: [docs/cli.md](./docs/cli.md)
- - Agent system details: [docs/agent.md](./docs/agent.md)
- - UI implementation: [docs/ui.md](./docs/ui.md)
- - Evaluation guide: [docs/evaluation.md](./docs/evaluation.md)
-
- ## Principles (early draft)
-
- - Minimal surface area in v1
- - Clear defaults and predictable behavior
- - Simple local data model
- - No hidden automation
-
- ## end goal (how i wish the system will be onced finished)
- - following what jarvis is doing is super easy to understand for a human
- - everything that goes wrong e.g. failed tool calls, or errors from exec calls should be easy to understand for a human
- - transparency is very important, without this we can not easily debug or improve the system
- - it should work autonomously, i.e. it does not need any instructions from me on decicions but instead decide itself how to achieve whatever its doing
- - when working autonomously on a task its given it should know when to stop (task is done in a good quality)
-
- ## Implementation Roadmap
-
- To reach v1, we will follow this order:
-
- 1. **Phase 1: Project Skeleton [x]**
-    - Scaffolding (`package.json`, folder structure).
-    - Basic HTTP server on port `18008`.
- 2. **Phase 2: Onboarding & Config [x]**
-    - `jarvis setup` CLI command.
-    - Persistence for API keys (`.env`) and settings (`settings.json`).
- 3. **Phase 3: Core Agent Loop [x]**
-    - Request/Response flow with OpenRouter.
-    - Serial tool execution logic (`new Function`).
-    - Basic session persistence.
-    - Seed tool: `list_dir` (runs `ls -la`) to verify the full loop end-to-end.
- 4. **Phase 4: Lifecycle Management [x]**
-    - CLI `start/stop/status` using programmatic PM2.
-    - Pre-flight configuration checks.
- 5. **Phase 5: Tools & Refinement [x]**
-    - Implementation of built-in tools (`exec`, `user_info`).
-    - Standardized logging (JSONL).
- 6. **Phase 6: UI [x]**
-    - Vite + React + Tailwind chat interface in `ui/`.
-    - Server serves built UI as static files.
-
- ## Usage
-
- ### First-time setup
+ ## Quick start
 
  ```
- npm install
- npm run setup
+ npm i -g @ducci/jarvis
+ jarvis setup    # configure API key, model, and optionally Telegram
+ jarvis start    # start the background server (auto-restarts on crash)
  ```
 
- This prompts for your OpenRouter API key and model selection.
-
- ### Running in production (background via PM2)
+ Open `http://localhost:18008` to use the chat UI.
 
  ```
- npm start        # start the server in the background (auto-restarts on crash)
- npm run status   # check if it's running (PID, uptime, restarts)
- npm run stop     # stop the background server
+ jarvis stop      # stop the server
+ jarvis status    # show PID, uptime, restart count
  ```
 
- The server runs on port `18008`. Open `http://localhost:18008` to use the chat UI.
+ ## Recommended models
 
- Logs are written to `~/.jarvis/logs/server.log`.
+ Any OpenRouter model works, but here's what's worth trying right now:
 
- ### Running in development (foreground)
+ | Model | Provider | Notes |
+ |---|---|---|
+ | `glm-5` | [z.ai](https://z.ai) directly | Personal pick — strong at coding and tool use, great value |
 
- ```
- npm run dev   # start the server with nodemon (auto-reload on file changes)
- ```
+ **z.ai tip**: z.ai offers a "Coding Plan Pro" subscription that gives you direct, high-rate access to GLM-5. If you do a lot of agentic coding tasks, it's worth it. Run `jarvis setup` and select z.ai as your provider — it will configure the endpoint and model automatically.
 
- To develop the UI with hot-reload:
+ Fallback recommendation: set `fallbackModel` to `openrouter/auto` in `settings.json` so failed requests automatically retry on a capable free model.
 
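The fallback recommendation above can be sketched as a minimal `settings.json` fragment. Only `fallbackModel` and the port are named elsewhere in this README; the other keys are illustrative assumptions about the file's shape:

```json
{
  "model": "glm-5",
  "fallbackModel": "openrouter/auto",
  "port": 18008
}
```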
- ```
- cd ui
- npm install   # first time only
- npm run dev   # starts Vite on port 5173, proxies /api to localhost:18008
- ```
+ ## Docs
 
- You need both the server (`npm run dev` in root) and the UI dev server (`npm run dev` in `ui/`) running at the same time. Open `http://localhost:5173` during UI development.
+ - [Setup and configuration](./docs/setup.md)
+ - [CLI and server lifecycle](./docs/cli.md)
+ - [Agent system](./docs/agent.md)
+ - [Telegram channel](./docs/telegram.md)
+ - [Cron scheduler](./docs/crons.md)
+ - [Skills](./docs/skills.md)
+ - [Identity and persona](./docs/identity.md)
+ - [UI](./docs/ui.md)
 
- ### Building the UI for production
+ ## Development
 
  ```
- cd ui
- npm run build
+ npm run dev   # start server with nodemon (auto-reload)
  ```
 
- This outputs to `ui/dist/`, which the Express server serves as static files automatically.
+ For UI hot-reload, run both the server and the Vite dev server:
+
+ ```
+ npm run dev                           # server on :18008
+ cd ui && npm install && npm run dev   # UI on :5173, proxies /api to :18008
+ ```
 
- ### Global install
+ Build the UI for production:
 
  ```
- npm i -g @ducci/jarvis
- jarvis setup
- jarvis start
- jarvis stop
- jarvis status
+ cd ui && npm run build   # outputs to ui/dist/, served automatically by the server
  ```
 
- ## Security & Local Usage
+ ## Security
+
+ Jarvis is designed for **local or private server use only**. The API has no authentication — do not expose port `18008` to the public internet. The `exec` tool runs shell commands with the same permissions as the server process.
+
+ ## Data
 
- Jarvis is designed for **local use only**. There is no built-in authentication for the API. It is intended to be run on a trusted machine (e.g., your laptop or a private server) where the port is not exposed to the public internet.
+ All runtime data lives in `~/.jarvis/` and is never stored in the repo:
 
- ## Current status, IMPORTANT instructions for LLMs:
- - Phase 1 (Skeleton) is implemented.
- - Phase 2 (Onboarding & Config) is implemented.
- - Phase 3 (Core Agent Loop) is implemented.
- - Phase 4 (Lifecycle Management) is implemented.
- - Phase 5 (Tools & Refinement) is implemented.
- - Phase 6 (UI) is implemented.
- - the scope is only this jarvis folder and each file in it. no parent folders or any other outside of this
+ - `~/.jarvis/.env` API keys
+ - `~/.jarvis/data/config/settings.json` model, port, channel config
+ - `~/.jarvis/data/conversations/` session history
+ - `~/.jarvis/data/tools/tools.json` tool registry
+ - `~/.jarvis/data/skills/` skill definitions
+ - `~/.jarvis/logs/` per-session JSONL logs, cron logs, PM2 stdout
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@ducci/jarvis",
-   "version": "1.0.63",
+   "version": "1.0.65",
    "description": "A fully automated agent system that lives on a server.",
    "main": "./src/index.js",
    "type": "module",
@@ -30,6 +30,10 @@
      "cli",
      "server"
    ],
+   "repository": {
+     "type": "git",
+     "url": "https://github.com/duc-gp/-ducci-jarvis.git"
+   },
    "author": "ducci",
    "engines": {
      "node": ">=18"
@@ -58,6 +58,11 @@ export async function startTelegramChannel(config) {
    const bot = new Bot(token);
    const sessions = load();
 
+   // Tracks chats with an active agent run and buffers messages arriving during that run.
+   // When the run finishes, all buffered messages are merged into one combined run.
+   const isRunning = new Set();
+   const pendingMessages = new Map(); // chatId -> [{text, attachments, ts}]
+
    await bot.api.setMyCommands([
      { command: 'new', description: 'Start a fresh session' },
      { command: 'usage', description: 'Show token usage for the current session' },
@@ -97,6 +102,7 @@ export async function startTelegramChannel(config) {
      if (!allowedUserIds.includes(userId)) return;
 
      const chatId = ctx.chat.id;
+     pendingMessages.delete(chatId);
      if (sessions[chatId]) {
        await appendTelegramChatLog(chatId, sessions[chatId], 'SYSTEM', '--- /new: session reset ---');
        delete sessions[chatId];
@@ -107,22 +113,74 @@ export async function startTelegramChannel(config) {
      await ctx.reply('New session started.');
    });
 
+   // Runs one or more batches until the pending queue is drained.
+   // Each iteration takes all currently pending messages, merges them into a
+   // single user turn, calls handleChat once, and sends one response.
+   async function processQueue(api, chatId, firstBatch) {
+     let batch = firstBatch;
+     while (batch.length > 0) {
+       const sessionId = sessions[chatId] || null;
+       const combinedText = batch.length === 1
+         ? batch[0].text
+         : batch.map(m => m.text).join('\n\n');
+       const allAttachments = batch.flatMap(m => m.attachments);
+
+       let result;
+       try {
+         result = await handleChat(config, sessionId, combinedText, allAttachments);
+       } catch (e) {
+         console.error(`[telegram] agent error chat_id=${chatId}: ${e.message}`);
+         const errText = e.message
+           ? `Sorry, something went wrong: ${e.message}`
+           : 'Sorry, something went wrong. Please try again.';
+         await api.sendMessage(chatId, errText).catch(() => {});
+         batch = pendingMessages.get(chatId) || [];
+         pendingMessages.delete(chatId);
+         continue;
+       }
+
+       if (!sessions[chatId]) {
+         sessions[chatId] = result.sessionId;
+         save(sessions);
+         console.log(`[telegram] session created sessionId=${result.sessionId.slice(0, 8)}`);
+       }
+
+       // Log each original message individually with its own timestamp
+       for (const m of batch) {
+         await appendTelegramChatLog(chatId, result.sessionId, 'USER', m.text || '[photo]', m.ts);
+       }
+
+       try {
+         const rawResponse = typeof result.response === 'string'
+           ? result.response
+           : result.response != null ? JSON.stringify(result.response, null, 2) : '';
+         const text = rawResponse.trim()
+           || 'The agent encountered an error and could not produce a response. Please try again.';
+         await appendTelegramChatLog(chatId, result.sessionId, 'JARVIS', text);
+         await sendMessage(api, chatId, text, result.sessionId);
+         console.log(`[telegram] response sent chat_id=${chatId} length=${text.length}`);
+       } catch (e) {
+         console.error(`[telegram] delivery error chat_id=${chatId}: ${e.message}`);
+         await api.sendMessage(chatId, 'Sorry, something went wrong sending the response. Please try again.').catch(() => {});
+       }
+
+       // Drain any messages that arrived while we were running
+       batch = pendingMessages.get(chatId) || [];
+       pendingMessages.delete(chatId);
+     }
+   }
+
173
  bot.on('message:photo', async (ctx) => {
111
174
  const userId = ctx.from?.id;
112
175
  if (!allowedUserIds.includes(userId)) return;
113
176
 
114
177
  const chatId = ctx.chat.id;
115
- const sessionId = sessions[chatId] || null;
178
+ const ts = new Date().toISOString();
116
179
 
117
180
  console.log(`[telegram] incoming photo chat_id=${chatId}`);
118
181
 
119
- await ctx.api.sendChatAction(chatId, 'typing');
120
- const typingInterval = setInterval(() => {
121
- ctx.api.sendChatAction(chatId, 'typing').catch(() => {});
122
- }, 4000);
123
-
124
- const userTs = new Date().toISOString();
125
- let result;
182
+ // Download the photo first regardless of whether we buffer or run immediately
183
+ let attachment;
126
184
  try {
127
185
  const photo = ctx.message.photo.filter(p => p.width <= 800).at(-1)
128
186
  ?? ctx.message.photo[0];
@@ -131,42 +189,33 @@ export async function startTelegramChannel(config) {
131
189
  const imgResponse = await fetch(fileUrl);
132
190
  const buffer = await imgResponse.arrayBuffer();
133
191
  const base64 = Buffer.from(buffer).toString('base64');
134
- const dataUrl = `data:image/jpeg;base64,${base64}`;
135
- const caption = ctx.message.caption || '';
136
- result = await handleChat(config, sessionId, caption, [{ url: dataUrl }]);
192
+ attachment = { url: `data:image/jpeg;base64,${base64}` };
137
193
  } catch (e) {
138
- console.error(`[telegram] agent error chat_id=${chatId}: ${e.message}`);
139
- const errText = e.message
140
- ? `Sorry, something went wrong: ${e.message}`
141
- : 'Sorry, something went wrong. Please try again.';
142
- await ctx.reply(errText).catch(() => {});
143
- clearInterval(typingInterval);
194
+ console.error(`[telegram] photo download error chat_id=${chatId}: ${e.message}`);
195
+ await ctx.reply('Sorry, could not process the photo.').catch(() => {});
144
196
  return;
145
197
  }
146
198
 
147
- if (!sessions[chatId]) {
148
- sessions[chatId] = result.sessionId;
149
- save(sessions);
150
- console.log(`[telegram] session created sessionId=${result.sessionId.slice(0, 8)}`);
199
+ const entry = { text: ctx.message.caption || '', attachments: [attachment], ts };
200
+
201
+ if (isRunning.has(chatId)) {
202
+ if (!pendingMessages.has(chatId)) pendingMessages.set(chatId, []);
203
+ pendingMessages.get(chatId).push(entry);
204
+ console.log(`[telegram] buffered photo chat_id=${chatId} pending=${pendingMessages.get(chatId).length}`);
205
+ return;
151
206
  }
152
207
 
153
- const captionText = ctx.message.caption || '[photo]';
154
- await appendTelegramChatLog(chatId, result.sessionId, 'USER', `[photo] ${captionText}`, userTs);
208
+ isRunning.add(chatId);
209
+ await ctx.api.sendChatAction(chatId, 'typing');
210
+ const typingInterval = setInterval(() => {
211
+ ctx.api.sendChatAction(chatId, 'typing').catch(() => {});
212
+ }, 4000);
155
213
 
156
214
  try {
157
- const rawResponse = typeof result.response === 'string'
158
- ? result.response
159
- : result.response != null ? JSON.stringify(result.response, null, 2) : '';
160
- const text = rawResponse.trim()
161
- || 'The agent encountered an error and could not produce a response. Please try again.';
162
- await appendTelegramChatLog(chatId, result.sessionId, 'JARVIS', text);
163
- await sendMessage(ctx.api, chatId, text, result.sessionId);
164
- console.log(`[telegram] response sent chat_id=${chatId} length=${text.length}`);
165
- } catch (e) {
166
- console.error(`[telegram] delivery error chat_id=${chatId}: ${e.message}`);
167
- await ctx.api.sendMessage(chatId, 'Sorry, something went wrong sending the response. Please try again.').catch(() => {});
215
+ await processQueue(ctx.api, chatId, [entry]);
168
216
  } finally {
169
217
  clearInterval(typingInterval);
218
+ isRunning.delete(chatId);
170
219
  }
171
220
  });
172
221
 
@@ -177,53 +226,28 @@ export async function startTelegramChannel(config) {
      if (!allowedUserIds.includes(userId)) return;
 
      const chatId = ctx.chat.id;
-     const sessionId = sessions[chatId] || null;
+     const ts = new Date().toISOString();
+     const entry = { text: ctx.message.text, attachments: [], ts };
 
-     console.log(`[telegram] incoming chat_id=${chatId}`);
+     if (isRunning.has(chatId)) {
+       if (!pendingMessages.has(chatId)) pendingMessages.set(chatId, []);
+       pendingMessages.get(chatId).push(entry);
+       console.log(`[telegram] buffered message chat_id=${chatId} pending=${pendingMessages.get(chatId).length}`);
+       return;
+     }
 
+     isRunning.add(chatId);
+     console.log(`[telegram] incoming chat_id=${chatId}`);
      await ctx.api.sendChatAction(chatId, 'typing');
      const typingInterval = setInterval(() => {
        ctx.api.sendChatAction(chatId, 'typing').catch(() => {});
      }, 4000);
 
-     const userTs = new Date().toISOString();
-     let result;
-     try {
-       result = await handleChat(config, sessionId, ctx.message.text);
-     } catch (e) {
-       console.error(`[telegram] agent error chat_id=${chatId}: ${e.message}`);
-       const errText = e.message
-         ? `Sorry, something went wrong: ${e.message}`
-         : 'Sorry, something went wrong. Please try again.';
-       await ctx.reply(errText).catch(() => {});
-       clearInterval(typingInterval);
-       return;
-     }
-
-     // Persist new session mapping on first message
-     if (!sessions[chatId]) {
-       sessions[chatId] = result.sessionId;
-       save(sessions);
-       console.log(`[telegram] session created sessionId=${result.sessionId.slice(0, 8)}`);
-     }
-
-     await appendTelegramChatLog(chatId, result.sessionId, 'USER', ctx.message.text, userTs);
-
      try {
-       // Guard against empty or non-string response (e.g. model returns array instead of string)
-       const rawResponse = typeof result.response === 'string'
-         ? result.response
-         : result.response != null ? JSON.stringify(result.response, null, 2) : '';
-       const text = rawResponse.trim()
-         || 'The agent encountered an error and could not produce a response. Please try again.';
-       await appendTelegramChatLog(chatId, result.sessionId, 'JARVIS', text);
-       await sendMessage(ctx.api, chatId, text, result.sessionId);
-       console.log(`[telegram] response sent chat_id=${chatId} length=${text.length}`);
-     } catch (e) {
-       console.error(`[telegram] delivery error chat_id=${chatId}: ${e.message}`);
-       await ctx.api.sendMessage(chatId, 'Sorry, something went wrong sending the response. Please try again.').catch(() => {});
+       await processQueue(ctx.api, chatId, [entry]);
      } finally {
        clearInterval(typingInterval);
+       isRunning.delete(chatId);
      }
    });