alvin-bot 5.6.1 → 5.7.0

package/CHANGELOG.md CHANGED
@@ -2,6 +2,32 @@
2
2
 
3
3
  All notable changes to Alvin Bot are documented here.
4
4
 
5
+ ## [5.7.0] — 2026-05-19
6
+
7
+ ### Background-task results now arrive the instant the task finishes, and survive a restart
8
+
9
+ Detached background-task results were delivered by a 15-second polling
10
+ loop whose in-memory state could diverge across a bot restart, so a task
11
+ that finished around a restart could keep its result undelivered with
12
+ nothing shown in chat. Delivery is now pushed the moment the task's
13
+ process exits — through an always-on local callback guarded by a
14
+ per-boot token — and a startup reconciliation pass drains anything that
15
+ completed while the bot was down. An atomic deliver-once marker
16
+ guarantees the push, the polling backstop, and reconciliation never
17
+ double-deliver, and a cancelled task can no longer be resurrected. The
18
+ polling loop is kept only as a backstop for timeouts and stalled tasks.
19
+ No configuration is required and nothing changes for existing setups.
20
+
21
+ ## [5.6.2] — 2026-05-19
22
+
23
+ ### Long background-task results now reliably arrive in chat
24
+
25
+ A background task that produced a long final answer could finish
26
+ successfully and yet never be delivered — you would see nothing and
27
+ have to ask for the status by hand. Alvin now recognises a finished
28
+ background task no matter how long its result is, so the answer always
29
+ lands in your chat the moment the task completes.
30
+
5
31
  ## [5.6.1] — 2026-05-18
6
32
 
7
33
  ### Background-task results stay in the chat
package/README.md CHANGED
@@ -2,9 +2,9 @@
2
2
 
3
3
  > Your personal AI agent — on Telegram, WhatsApp, Discord, Slack, Signal, Terminal, and Web.
4
4
 
5
- Open-source, self-hosted, multi-model. Lives where you chat, has full shell + filesystem access, remembers across sessions, and dispatches detached sub-agents for long-running work. Built on the Claude Agent SDK with a provider-agnostic engine that also drives OpenAI, Groq, Gemini, NVIDIA NIM, OpenRouter, and Ollama.
5
+ Alvin Bot is an open-source, MIT-licensed, self-hosted autonomous AI agent that runs on your own machine and answers you on Telegram, Slack, Discord, WhatsApp, Signal, a terminal TUI, and a web dashboard. It is built on the official Claude Agent SDK and runs a provider-agnostic engine that also drives OpenAI, Groq, Google Gemini, NVIDIA NIM, OpenRouter, and Ollama, with automatic failover after two consecutive provider failures and a heartbeat health check every five minutes. Unlike most personal AI agents, it ships a zero-config indexed memory store: with no embedding API key it falls back to a built-in SQLite FTS5 keyword index, so recall works out of the box. It dispatches detached sub-agents as independent `claude -p` subprocesses that keep running and deliver their result even if the parent conversation is aborted. It is local-first and telemetry-free — prompts and responses are never logged off-machine, secrets live in a chmod-0600 `.env`, and shell execution is allowlisted by default.
6
6
 
7
- > **What's new — v4.22 (May 2026):** Pluggable memory backends Gemini · OpenAI · Ollama · **FTS5 keyword fallback (zero-config)**. Users without an embedding API key now get a working indexed memory store out of the box. Smart inject mode trims ~25 k tokens per turn off long system prompts. [Full changelog →](CHANGELOG.md)
7
+ > **What's new — v5.6.2 (May 2026):** Background-task and scheduled-job results now land cleanly in chat: a tight header (what ran, duration, tokens, success) plus the answer, with very long output attached as a single file instead of a wall of messages. Built on v5.5's instant, honest ⛔ Stop and calmer, evidence-based health monitoring. Earlier in the 5.x line: a zero-config FTS5 keyword memory index (indexed recall with **no embedding API key**), automatic multi-provider failover with a 5-minute heartbeat monitor, and detached sub-agents that survive a parent abort. [Full changelog →](CHANGELOG.md)
8
8
 
9
9
  ---
10
10
 
@@ -47,6 +47,34 @@ Open-source, self-hosted, multi-model. Lives where you chat, has full shell + fi
47
47
 
48
48
  ---
49
49
 
50
+ ## ⚖️ How Alvin Bot compares
51
+
52
+ Alvin Bot sits in the same category as **Hermes Agent** (Nous Research) and **OpenClaw** — self-hosted, open-source personal AI agents that live on your machine and reach you on the chat apps you already use. They optimize for different things. This table is intended to be fair: where Hermes or OpenClaw is the better tool, it says so.
53
+
54
+ | Dimension | **Alvin Bot** | **Hermes Agent** | **OpenClaw** |
55
+ |---|---|---|---|
56
+ | License / hosting | MIT · self-hosted · local-first · zero telemetry | MIT · self-hosted · 7 execution backends | Open-source · self-hosted · bring-your-own-key |
57
+ | Model providers | Claude Agent SDK + OpenAI · Groq · Gemini · NVIDIA NIM · OpenRouter · Ollama, with **automatic failover after 2 provider failures + a 5-min heartbeat monitor** | 200+ models | Bring-your-own model / key |
58
+ | Sub-agents | **Detached `claude -p` subprocesses that survive a parent abort**; `readonly`/`research` toolset presets | Isolated subagents for parallel workstreams | Not a primary focus |
59
+ | Browser automation | **4-tier escalation**: WebFetch → stealth Playwright → persistent-profile CDP → agent-browser CLI | Built-in browse / vision tools | Via tools |
60
+ | Platforms | Telegram · Slack · Discord · WhatsApp · Signal · terminal TUI · Web (7) | 20+ platforms from one gateway | 25–50+ platforms · native mobile apps · voice activation |
61
+ | Memory | Layered L0–L3; SQLite embeddings with a **zero-config FTS5 keyword fallback (works with no API key)**; smart prompt-injection trims ~25 k tokens/turn | SQLite + full-text search · agent-curated · Honcho user profiling | Transparent plain Markdown/YAML files you can grep and git-track |
62
+ | Extensibility | Hot-reload skills + 6 plugins · self-modifying skills · hooks · MCP client | 40+ built-in tools · **autonomous self-improving skill loop** | Skills as files · very large ecosystem |
63
+ | MCP | MCP **client** (connect any MCP server) | MCP client **and `hermes mcp serve`** (acts as an MCP server for Claude Desktop / Cursor / VS Code) | Tool integrations |
64
+ | Self-healing | **Startup preflight · dead-man's-switch heartbeat · crash forensic bundles · AI self-diagnosis · crash-loop brake · trend anomaly detection** | Stable in practice; self-improving | Frequent updates can break running instances |
65
+ | Security defaults | Exec **allowlist + shell-metachar filter on by default** · DM pairing · timing-safe webhook auth · 0600 file perms enforced · `alvin-bot audit` CLI · honestly documented threat model | Standard | Standard |
66
+ | Maturity / community | Small, focused, single-maintainer; modest public adoption | Large community, Nous Research team | Large community + team, Nvidia NemoClaw fork |
67
+
68
+ ### Use the right tool for the job
69
+
70
+ - **Use Alvin Bot when** you want one resilient, self-healing agent on your own box that keeps working when a provider rate-limits or fails, gives you indexed memory **without buying an embedding API key**, ships safe-by-default execution sandboxing, and is built directly on the official Claude Agent SDK — and you mainly live in Telegram / Slack / Discord / WhatsApp / Signal.
71
+ - **Use Hermes Agent when** you want a research-grade self-improving agent, need it to act as an **MCP server** for Claude Desktop / Cursor / VS Code, want 200+ model choice or many execution backends, and value a large community.
72
+ - **Use OpenClaw when** you want the **widest messaging reach** (25–50+ channels) plus native mobile apps and voice activation, fully transparent plain-file memory you can git-track, and the largest ecosystem.
73
+
74
+ A longer head-to-head with FAQ and decision guide: **[Alvin Bot vs Hermes vs OpenClaw](https://alvin.alev-b.com/vs/hermes-openclaw)**.
75
+
76
+ ---
77
+
50
78
  ## 🚀 Quick Start
51
79
 
52
80
  ```bash
@@ -35,6 +35,40 @@ import { SUBAGENTS_DIR } from "../paths.js";
35
35
  function generateAgentId() {
36
36
  return "alvin-" + crypto.randomBytes(12).toString("hex");
37
37
  }
38
+ /**
39
+ * v5.7.0 — Best-effort push: tell the bot a detached sub-agent just
40
+ * exited so it can read the jsonl and deliver immediately instead of
41
+ * waiting for the next 15s poll tick. An in-process self-call to the
42
+ * always-on loopback route. Any failure (bot mid-restart, port race,
43
+ * network) is swallowed — startup reconciliation and the poll backstop
44
+ * are the safety nets. Never throws, always resolves.
45
+ */
46
+ export async function postSubagentExit(agentId, exitCode) {
47
+ try {
48
+ const { getWebPort, getInternalToken } = await import("../web/server.js");
49
+ const port = getWebPort();
50
+ const token = getInternalToken();
51
+ const ac = new AbortController();
52
+ const timer = setTimeout(() => ac.abort(), 4000);
53
+ try {
54
+ await fetch(`http://127.0.0.1:${port}/internal/subagent-exit`, {
55
+ method: "POST",
56
+ headers: {
57
+ "Content-Type": "application/json",
58
+ Authorization: `Bearer ${token}`,
59
+ },
60
+ body: JSON.stringify({ agentId, exitCode }),
61
+ signal: ac.signal,
62
+ });
63
+ }
64
+ finally {
65
+ clearTimeout(timer);
66
+ }
67
+ }
68
+ catch {
69
+ /* best-effort — reconciliation/poll backstop cover any miss */
70
+ }
71
+ }
38
72
  /**
39
73
  * Dispatch a detached sub-agent. Returns synchronously — the subprocess
40
74
  * runs in the background. Throws if spawn fails. On success:
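The `postSubagentExit` hunk above follows a general pattern: a POST with a hard timeout in which every failure is swallowed. The sketch below illustrates that pattern only and is not the package's code. `bestEffortPost` and its return convention (HTTP status or `null`) are hypothetical names introduced here, and it assumes Node 18+ where `fetch` and `AbortController` are globals. It is written as CommonJS so it runs standalone, whereas the package itself uses ES modules.

```javascript
// Hypothetical helper: POST a small JSON body and never reject.
// Resolves to the HTTP status on a completed request, or null on any failure.
async function bestEffortPost(url, token, payload, timeoutMs = 4000) {
  const ac = new AbortController();
  // Abort the request if it has not completed within timeoutMs.
  const timer = setTimeout(() => ac.abort(), timeoutMs);
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(payload),
      signal: ac.signal,
    });
    return res.status;
  } catch {
    // Connection refused, timeout abort, bad port, and so on: all swallowed.
    return null;
  } finally {
    // Always clear the timer so it cannot keep the event loop alive.
    clearTimeout(timer);
  }
}
```

Returning the status is a choice made for this sketch so a caller could log it. The function in the diff discards the response entirely, since the reconciliation pass and the poll backstop are described as the safety nets.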
@@ -96,6 +130,16 @@ export function dispatchDetachedAgent(input) {
96
130
  catch {
97
131
  /* ignore */
98
132
  }
133
+ // v5.7.0 — Push: deliver the result the instant the subprocess exits,
134
+ // instead of waiting for the next 15s poll tick. `exit` fires after
135
+ // the OS has reaped the process and flushed its inherited stdout FD,
136
+ // so the terminating result line is fully on disk before the handler
137
+ // reads it (race-safe). Best-effort: startup reconciliation + the poll
138
+ // backstop cover a missed push (bot restarted mid-run, port race).
139
+ // Attached before unref() so the listener is registered on the handle.
140
+ child.on("exit", (code) => {
141
+ void postSubagentExit(agentId, code);
142
+ });
99
143
  // Detach from parent Node's event loop so parent exit doesn't wait.
100
144
  child.unref();
101
145
  // Register with watcher so it polls the output file and delivers.
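The ordering in this hunk (attach the `exit` listener, then call `unref()`) can be tried in isolation. The following is a minimal sketch under stated assumptions, not the package's dispatch code: `runDetached` is a hypothetical helper, the child discards its stdio instead of writing to a jsonl file, and CommonJS is used so the snippet runs standalone.

```javascript
const { spawn } = require("node:child_process");

// Hypothetical helper: start a detached child and report its exit at most once.
function runDetached(command, args, onExit) {
  const child = spawn(command, args, { detached: true, stdio: "ignore" });
  let reported = false;
  const report = (code, signal) => {
    if (reported) return;
    reported = true;
    onExit(code, signal);
  };
  // Register listeners first, then unref(), mirroring the order in the diff.
  child.on("exit", report);
  // A failed spawn emits "error"; "exit" may or may not follow, hence the guard.
  child.on("error", () => report(null, null));
  // unref() stops the child from keeping the parent's event loop alive.
  child.unref();
  return child;
}
```

The listener can only fire while the parent process is still running. If the parent exits or restarts first, nothing is reported, which is the gap the startup reconciliation and poll backstop in this release are meant to cover.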
@@ -68,6 +68,92 @@ export function parseAsyncLaunchedToolResult(raw) {
68
68
  return { agentId, outputFile };
69
69
  }
70
70
  const DEFAULT_TAIL_BYTES = 64 * 1024;
71
+ /**
72
+ * Upper bound for the window-independent final-line read (see
73
+ * readLastCompleteLine). Generous — ~128× the tail window — so any
74
+ * realistic final report is captured, but bounded so a pathological
75
+ * single line can't blow up memory. Beyond this we fall back to the
76
+ * windowed / staleness logic unchanged.
77
+ */
78
+ const MAX_LAST_LINE_BYTES = 8 * 1024 * 1024;
79
+ /**
80
+ * Read the LAST complete newline-delimited record of the file, regardless
81
+ * of how large it is, by scanning backward from EOF in chunks.
82
+ *
83
+ * Why this exists: for `claude -p --output-format stream-json` the
84
+ * terminating `{"type":"result",...}` event is ALWAYS the final line, but
85
+ * that line embeds the entire final report and is frequently larger than
86
+ * DEFAULT_TAIL_BYTES (e.g. an agent that writes a long status report
87
+ * after an auto-declined AskUserQuestion). The windowed tail read drops
88
+ * such a line as a truncated head fragment, so completion was missed and
89
+ * the agent sat "running" until the 12 h timeout — never auto-delivered.
90
+ * Reading the true final line makes completion detection independent of
91
+ * the tail-window size.
92
+ *
93
+ * Returns the final record WITHOUT its trailing newline, or null if the
94
+ * file is empty, the final line is not newline-terminated (still being
95
+ * written), or the line exceeds MAX_LAST_LINE_BYTES. In every null case
96
+ * the caller falls through to the existing windowed/staleness logic with
97
+ * no behavior change.
98
+ */
99
+ async function readLastCompleteLine(path, size) {
100
+ if (size <= 0)
101
+ return null;
102
+ let fh;
103
+ try {
104
+ fh = await fs.open(path, "r");
105
+ const chunkSize = 64 * 1024;
106
+ let pos = size;
107
+ let collected = Buffer.alloc(0);
108
+ while (pos > 0) {
109
+ if (size - pos > MAX_LAST_LINE_BYTES)
110
+ return null;
111
+ const readLen = Math.min(chunkSize, pos);
112
+ pos -= readLen;
113
+ const buf = Buffer.alloc(readLen);
114
+ await fh.read(buf, 0, readLen, pos);
115
+ collected = Buffer.concat([buf, collected]);
116
+ // Strip exactly one trailing newline (the file terminator) so we
117
+ // search for the delimiter BEFORE the final record, not after it.
118
+ let end = collected.length;
119
+ if (end > 0 && collected[end - 1] === 0x0a) {
120
+ end--;
121
+ if (end > 0 && collected[end - 1] === 0x0d)
122
+ end--;
123
+ }
124
+ else {
125
+ // No terminating newline → final line still being written.
126
+ return null;
127
+ }
128
+ if (end <= 0) {
129
+ // File is just a newline (or empty after trim) — nothing usable.
130
+ if (pos === 0)
131
+ return null;
132
+ continue;
133
+ }
134
+ const nl = collected.lastIndexOf(0x0a, end - 1);
135
+ if (nl >= 0) {
136
+ return collected.toString("utf-8", nl + 1, end);
137
+ }
138
+ if (pos === 0) {
139
+ // Whole file is a single (newline-terminated) record.
140
+ return collected.toString("utf-8", 0, end);
141
+ }
142
+ }
143
+ return null;
144
+ }
145
+ catch {
146
+ return null;
147
+ }
148
+ finally {
149
+ try {
150
+ await fh?.close();
151
+ }
152
+ catch {
153
+ /* ignore */
154
+ }
155
+ }
156
+ }
71
157
  /**
72
158
  * v4.12.4 — Default staleness window for partial-output delivery.
73
159
  *
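The backward scan described in the `readLastCompleteLine` comment can be sketched in a simplified, synchronous form. This is an illustration under stated assumptions rather than the shipped function: `lastCompleteLineSync` is a hypothetical name, it uses `fs.readSync` instead of the promise-based file handle, and it applies the size cap to the bytes collected so far. It is written as CommonJS so it runs standalone.

```javascript
const fs = require("node:fs");

// Hypothetical sync variant: return the last newline-terminated record of a
// file without its line terminator, or null if there is none or it is too large.
function lastCompleteLineSync(path, maxBytes = 8 * 1024 * 1024, chunkSize = 64 * 1024) {
  const fd = fs.openSync(path, "r");
  try {
    const size = fs.fstatSync(fd).size;
    if (size === 0) return null;
    // The final byte must be "\n"; otherwise the last line is still being written.
    const last = Buffer.alloc(1);
    fs.readSync(fd, last, 0, 1, size - 1);
    if (last[0] !== 0x0a) return null;
    let pos = size - 1; // exclusive end of the final record
    let collected = 0;
    const chunks = [];
    while (pos > 0) {
      if (collected > maxBytes) return null; // pathological single line
      const readLen = Math.min(chunkSize, pos);
      pos -= readLen;
      const buf = Buffer.alloc(readLen);
      fs.readSync(fd, buf, 0, readLen, pos);
      const nl = buf.lastIndexOf(0x0a);
      if (nl >= 0) {
        // Found the delimiter that precedes the final record.
        chunks.unshift(buf.subarray(nl + 1));
        break;
      }
      chunks.unshift(buf);
      collected += readLen;
    }
    // Decode only after concatenation so multi-byte characters stay intact.
    let line = Buffer.concat(chunks).toString("utf-8");
    if (line.endsWith("\r")) line = line.slice(0, -1);
    return line.length > 0 ? line : null;
  } finally {
    fs.closeSync(fd);
  }
}
```

Because the scan stops at the first newline it meets from the end, the cost depends on the length of the final record and not on the size of the whole file.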
@@ -148,12 +234,67 @@ export async function parseOutputFileStatus(path, opts = {}) {
148
234
  const usable = lines
149
235
  .slice(headIncomplete, lines.length - (trailIncomplete > 0 ? trailIncomplete : 0))
150
236
  .filter((l) => l.length > 0);
237
+ // Window-independent completion check (regression fix). The terminating
238
+ // `{"type":"result",...}` event for `claude -p --output-format
239
+ // stream-json` is ALWAYS the final line, but it embeds the whole final
240
+ // report and is routinely larger than maxTailBytes — the windowed tail
241
+ // below would drop it as a truncated head fragment, leaving the agent
242
+ // mis-classified "running" until the 12 h timeout (so a completed
243
+ // sub-agent is never auto-delivered and the user must ask "status?").
244
+ // Inspect the TRUE final complete line directly so detection no longer
245
+ // depends on the tail-window size. Falls through unchanged when the
246
+ // last line is not a result event (running / killed mid-write / etc.).
247
+ const finalLine = await readLastCompleteLine(path, stat.size);
248
+ if (finalLine) {
249
+ let parsedFinal = null;
250
+ try {
251
+ parsedFinal = JSON.parse(finalLine);
252
+ }
253
+ catch {
254
+ parsedFinal = null;
255
+ }
256
+ if (parsedFinal && parsedFinal.type === "result") {
257
+ let output = typeof parsedFinal.result === "string" ? parsedFinal.result : "";
258
+ if (!output) {
259
+ // Same aggregation fallback as the windowed FIRST PASS: when the
260
+ // result event carries no `result` text, stitch together the
261
+ // assistant text blocks visible in the tail.
262
+ const fragments = [];
263
+ for (const line of usable) {
264
+ let p;
265
+ try {
266
+ p = JSON.parse(line);
267
+ }
268
+ catch {
269
+ continue;
270
+ }
271
+ if (p.type === "assistant" && Array.isArray(p.message?.content)) {
272
+ for (const c of p.message.content) {
273
+ if (c?.type === "text" && typeof c.text === "string") {
274
+ fragments.push(c.text);
275
+ }
276
+ }
277
+ }
278
+ }
279
+ output = fragments.join("\n\n").trim();
280
+ }
281
+ const usage = parsedFinal.usage;
282
+ const tokensUsed = usage
283
+ ? {
284
+ input: usage.input_tokens ?? 0,
285
+ output: usage.output_tokens ?? 0,
286
+ }
287
+ : undefined;
288
+ return { state: "completed", output, tokensUsed };
289
+ }
290
+ }
151
291
  // v4.13 — FIRST PASS: look for a `{"type":"result"}` event anywhere in
152
292
  // the tail. This is the completion signal for `claude -p
153
293
  // --output-format stream-json` output (used by the v4.13 dispatch
154
294
  // mechanism). When present, the `result` field holds the authoritative
155
295
  // final text. If `result.result` is missing, aggregate from all
156
- // assistant text blocks in the tail.
296
+ // assistant text blocks in the tail. (Retained as a defensive fallback
297
+ // for the rare case the result event is NOT the final line.)
157
298
  for (let i = usable.length - 1; i >= 0; i--) {
158
299
  let parsed;
159
300
  try {
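The window-independent completion check added in this hunk can be isolated as a pure function for experimentation. The sketch below assumes the stream-json shapes visible in the diff (`type`, `result`, `usage.input_tokens`, `usage.output_tokens`, and assistant `message.content` text blocks). `classifyFinalLine` is a hypothetical name, and unlike the diff it guards against a tail line that parses to `null`.

```javascript
// Hypothetical pure helper: decide whether the final line marks completion.
// Returns a completed status, or null when the line is not a result event.
function classifyFinalLine(finalLine, tailLines = []) {
  let parsed;
  try {
    parsed = JSON.parse(finalLine);
  } catch {
    return null; // not JSON: let the caller fall through to its other checks
  }
  if (!parsed || parsed.type !== "result") return null;
  let output = typeof parsed.result === "string" ? parsed.result : "";
  if (!output) {
    // Fallback: stitch together assistant text blocks seen in the tail.
    const fragments = [];
    for (const line of tailLines) {
      let p;
      try {
        p = JSON.parse(line);
      } catch {
        continue;
      }
      if (p?.type === "assistant" && Array.isArray(p.message?.content)) {
        for (const c of p.message.content) {
          if (c?.type === "text" && typeof c.text === "string") fragments.push(c.text);
        }
      }
    }
    output = fragments.join("\n\n").trim();
  }
  const usage = parsed.usage;
  const tokensUsed = usage
    ? { input: usage.input_tokens ?? 0, output: usage.output_tokens ?? 0 }
    : undefined;
  return { state: "completed", output, tokensUsed };
}
```

Returning `null` for anything other than a result event mirrors the fall-through behavior described in the comment: a running or killed agent is left to the existing windowed and staleness logic.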
@@ -23,9 +23,10 @@
23
23
  * for the JSONL format details.
24
24
  */
25
25
  import fs from "fs";
26
- import { dirname } from "path";
26
+ import { dirname, resolve } from "path";
27
27
  import { parseOutputFileStatus } from "./async-agent-parser.js";
28
- import { ASYNC_AGENTS_STATE_FILE } from "../paths.js";
28
+ import { claimDelivery, markDelivered, isDelivered, cleanupAgentFiles, } from "./subagent-dedup.js";
29
+ import { ASYNC_AGENTS_STATE_FILE, SUBAGENTS_DIR } from "../paths.js";
29
30
  import { getAllSessions } from "./session.js";
30
31
  /**
31
32
  * B3 — Detect a permanent "target chat does not exist" delivery failure
@@ -191,12 +192,133 @@ function decrementPendingCount(sessionKey) {
191
192
  export function listPendingAgents() {
192
193
  return [...pending.values()];
193
194
  }
194
- /** Start the polling loop. Idempotent. Loads any persisted state from disk. */
195
+ /**
196
+ * v5.7.0 push entry point — called by POST /internal/subagent-exit when
197
+ * a detached subprocess exits. Looks the agent up, classifies its jsonl,
198
+ * and delivers immediately (claim-gated, so push / the poll backstop /
199
+ * reconciliation never double-deliver). Returns a coarse status for the
200
+ * HTTP layer:
201
+ *
202
+ * - "unknown" → no matching pending entry (cancelled, already
203
+ * delivered, or never tracked). HTTP 404. No side effect.
204
+ * - "delivered" → terminal jsonl found and delivered (or the claim was
205
+ * lost to another path — either way it is handled and
206
+ * the entry is dropped). HTTP 200.
207
+ * - "pending" → exit fired but the jsonl is not yet terminal (rare
208
+ * flush race). Left to the poll backstop. HTTP 202.
209
+ */
210
+ export async function deliverByAgentId(agentId) {
211
+ const entry = pending.get(agentId);
212
+ if (!entry)
213
+ return "unknown";
214
+ const status = await parseOutputFileStatus(entry.outputFile);
215
+ if (status.state === "completed") {
216
+ // Invariant: the path that WINS the claim owns removal of the
217
+ // shared pending entry. A claim-loser must not mutate shared state
218
+ // — the winner (this push, the poll backstop, or reconcile) removes
219
+ // it. If a winner ever crashes mid-delivery the next poll tick
220
+ // self-heals: claimDelivery() is false there too, so it skips
221
+ // re-delivery but still drops the entry. Either way "delivered".
222
+ if (claimDelivery(agentId)) {
223
+ await deliverAsCompleted(entry, status.output, status.tokensUsed);
224
+ pending.delete(agentId);
225
+ saveToDisk();
226
+ }
227
+ return "delivered";
228
+ }
229
+ if (status.state === "failed") {
230
+ if (claimDelivery(agentId)) {
231
+ await deliverAsFailure(entry, "error", status.error);
232
+ pending.delete(agentId);
233
+ saveToDisk();
234
+ }
235
+ return "delivered";
236
+ }
237
+ // running / missing → not yet flushed; the 15s poll backstop will get it.
238
+ return "pending";
239
+ }
240
+ /**
241
+ * v5.7.0 — Startup reconciliation. Runs once inside startWatcher() after
242
+ * loadFromDisk(). Two passes:
243
+ *
244
+ * Pass A — immediate terminal drain. For every entry already in the
245
+ * pending map (full delivery metadata present, restored by
246
+ * loadFromDisk), check its jsonl ONCE: if terminal, deliver immediately
247
+ * at startup instead of waiting up to 15s for the first poll tick. This
248
+ * is what removes the post-restart latency the BACKLOG observed and
249
+ * works even if the poll loop were disabled. Claim-gated.
250
+ *
251
+ * Pass B — disk hygiene, never delivers. Walk SUBAGENTS_DIR: skip
252
+ * delivered/tombstoned agents, age-cap-clean ancient files, and leave
253
+ * fresh orphans (jsonl with no persisted state entry → no delivery
254
+ * target) untouched for a later age-cap. We never fabricate a target.
255
+ *
256
+ * Idempotent (re-runnable), dedup-guarded by the .delivered marker.
257
+ */
258
+ async function reconcileOnStartup() {
259
+ // Pass A — drain pending entries with full metadata.
260
+ for (const entry of [...pending.values()]) {
261
+ try {
262
+ if (isDelivered(entry.agentId)) {
263
+ pending.delete(entry.agentId);
264
+ continue;
265
+ }
266
+ const status = await parseOutputFileStatus(entry.outputFile);
267
+ if (status.state === "completed" && claimDelivery(entry.agentId)) {
268
+ await deliverAsCompleted(entry, status.output, status.tokensUsed);
269
+ pending.delete(entry.agentId);
270
+ }
271
+ else if (status.state === "failed" && claimDelivery(entry.agentId)) {
272
+ await deliverAsFailure(entry, "error", status.error);
273
+ pending.delete(entry.agentId);
274
+ }
275
+ }
276
+ catch (err) {
277
+ console.error(`[async-watcher] reconcile passA ${entry.agentId}:`, err);
278
+ }
279
+ }
280
+ saveToDisk();
281
+ // Pass B — disk hygiene. Never delivers.
282
+ let files;
283
+ try {
284
+ files = fs.readdirSync(SUBAGENTS_DIR);
285
+ }
286
+ catch {
287
+ return; // no subagents dir yet — nothing to reconcile
288
+ }
289
+ const now = Date.now();
290
+ for (const f of files) {
291
+ if (!f.endsWith(".jsonl"))
292
+ continue;
293
+ const agentId = f.slice(0, -".jsonl".length);
294
+ if (isDelivered(agentId))
295
+ continue; // delivered or cancel-tombstone
296
+ if (pending.has(agentId))
297
+ continue; // Pass A / poll / push own it
298
+ const jsonlPath = resolve(SUBAGENTS_DIR, f);
299
+ try {
300
+ const st = fs.statSync(jsonlPath);
301
+ if (now - st.mtimeMs > MAX_AGENT_AGE_MS) {
302
+ cleanupAgentFiles(agentId); // ancient orphan → clean
303
+ }
304
+ // else: orphan within age, no persisted target → cannot deliver;
305
+ // leave it (age-cap cleans it on a later boot). No spam, no
306
+ // fabricated delivery target.
307
+ }
308
+ catch {
309
+ /* stat race — ignore */
310
+ }
311
+ }
312
+ }
313
+ /** Start the polling loop. Idempotent. Loads any persisted state from
314
+ * disk, then runs startup reconciliation (immediate terminal drain +
315
+ * disk hygiene) before the 15s poll backstop begins. */
195
316
  export function startWatcher() {
196
317
  if (started)
197
318
  return;
198
319
  started = true;
199
320
  loadFromDisk();
321
+ void reconcileOnStartup().catch((err) => console.error("[async-watcher] reconcile failed:", err));
200
322
  pollTimer = setInterval(() => {
201
323
  pollOnce().catch((err) => console.error("[async-watcher] poll cycle failed:", err));
202
324
  }, POLL_INTERVAL_MS);
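The claim-gating invariant in `deliverByAgentId` (only the path that wins the claim delivers and removes the entry) can be exercised with an in-memory stand-in. This is a sketch under stated assumptions: the `Set` replaces the on-disk `.delivered` marker, and `deliverTerminal`, its parameters, and the status object passed in are hypothetical simplifications of the watcher's logic.

```javascript
// In-memory stand-in for the on-disk marker (assumption made for this sketch).
const claimed = new Set();
function claimDelivery(agentId) {
  if (claimed.has(agentId)) return false;
  claimed.add(agentId);
  return true;
}

// Hypothetical reduction of the push entry point: the status is passed in
// instead of being parsed from the agent's jsonl file.
async function deliverTerminal(pending, agentId, status, deliver) {
  const entry = pending.get(agentId);
  if (!entry) return "unknown"; // cancelled, already delivered, or never tracked
  if (status.state !== "completed" && status.state !== "failed") {
    return "pending"; // not terminal yet: leave it to the poll backstop
  }
  if (claimDelivery(agentId)) {
    // Only the claim winner delivers and removes the shared entry.
    await deliver(entry, status);
    pending.delete(agentId);
  }
  return "delivered"; // handled by this path or by whichever path won the claim
}
```

Two overlapping calls for the same agent both report "delivered", but `deliver` runs once, because the claim is taken synchronously before the first `await`.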
@@ -235,21 +357,31 @@ export async function pollOnce() {
235
357
  entry.lastCheckedAt = now;
236
358
  // Timeout check first — if the agent is past its giveUpAt, give up
237
359
  // regardless of whether the file shows progress.
360
+ // v5.7.0 — every terminal delivery is claim-gated so push,
361
+ // this poll backstop, and startup reconciliation never
362
+ // double-deliver. A lost claim means another path already
363
+ // handled it: just drop the entry without re-delivering.
238
364
  if (now >= entry.giveUpAt) {
239
- const outcome = await deliverAsFailure(entry, "timeout", "Agent ran longer than 12h — giving up");
240
- abandonIfInvalidTarget(entry, outcome);
365
+ if (claimDelivery(entry.agentId)) {
366
+ const outcome = await deliverAsFailure(entry, "timeout", "Agent ran longer than 12h — giving up");
367
+ abandonIfInvalidTarget(entry, outcome);
368
+ }
241
369
  toRemove.push(entry.agentId);
242
370
  continue;
243
371
  }
244
372
  const status = await parseOutputFileStatus(entry.outputFile);
245
373
  if (status.state === "completed") {
246
- const outcome = await deliverAsCompleted(entry, status.output, status.tokensUsed);
247
- abandonIfInvalidTarget(entry, outcome);
374
+ if (claimDelivery(entry.agentId)) {
375
+ const outcome = await deliverAsCompleted(entry, status.output, status.tokensUsed);
376
+ abandonIfInvalidTarget(entry, outcome);
377
+ }
248
378
  toRemove.push(entry.agentId);
249
379
  }
250
380
  else if (status.state === "failed") {
251
- const outcome = await deliverAsFailure(entry, "error", status.error);
252
- abandonIfInvalidTarget(entry, outcome);
381
+ if (claimDelivery(entry.agentId)) {
382
+ const outcome = await deliverAsFailure(entry, "error", status.error);
383
+ abandonIfInvalidTarget(entry, outcome);
384
+ }
253
385
  toRemove.push(entry.agentId);
254
386
  }
255
387
  else if (status.state === "missing" &&
@@ -257,8 +389,10 @@ export async function pollOnce() {
257
389
  // v4.14.2 — Zombie guard: the subprocess never created its
258
390
  // output file within `missingFileFailureMs` (default 10 min).
259
391
  // Declare failed instead of polling until the 12h giveUpAt.
260
- const outcome = await deliverAsFailure(entry, "error", `Dispatched subprocess never wrote its output file (${Math.round((now - entry.startedAt) / 60_000)}m after start). Likely crashed before initializing, or the file was removed externally.`);
261
- abandonIfInvalidTarget(entry, outcome);
392
+ if (claimDelivery(entry.agentId)) {
393
+ const outcome = await deliverAsFailure(entry, "error", `Dispatched subprocess never wrote its output file (${Math.round((now - entry.startedAt) / 60_000)}m after start). Likely crashed before initializing, or the file was removed externally.`);
394
+ abandonIfInvalidTarget(entry, outcome);
395
+ }
262
396
  toRemove.push(entry.agentId);
263
397
  }
264
398
  // running / missing-but-young → keep polling next cycle
@@ -424,6 +558,12 @@ export function cancelPendingForSession(sessionKey) {
424
558
  let changed = false;
425
559
  for (const [id, entry] of pending.entries()) {
426
560
  if (entry.sessionKey === sessionKey) {
561
+ // v5.7.0 — tombstone before removal: a cancelled agent's
562
+ // subprocess may still exit later and POST, or its leftover jsonl
563
+ // may be rediscovered by reconciliation on a future boot. The
564
+ // .delivered marker makes "cancelled" authoritative and
565
+ // restart-proof so neither path can resurrect it.
566
+ markDelivered(entry.agentId);
427
567
  pending.delete(id);
428
568
  changed = true;
429
569
  }
@@ -0,0 +1,86 @@
1
+ /**
2
+ * Atomic deliver-once primitive for detached sub-agents (v5.7.0).
3
+ *
4
+ * The single source of truth for "has this agent's result already been
5
+ * delivered (or been cancel-tombstoned)?" — shared by all three delivery
6
+ * paths: the exit-push endpoint, the 15s poll backstop, and startup
7
+ * reconciliation. The marker is an empty `<agentId>.delivered` file in
8
+ * SUBAGENTS_DIR; `claimDelivery` creates it with O_EXCL ("wx") so exactly
9
+ * one caller wins the race. Crash-safe and restart-safe (it is on disk),
10
+ * which the in-memory pending map alone cannot be.
11
+ */
12
+ import fs from "fs";
13
+ import { resolve } from "path";
14
+ import { SUBAGENTS_DIR } from "../paths.js";
15
+ function markerPath(agentId) {
16
+ return resolve(SUBAGENTS_DIR, `${agentId}.delivered`);
17
+ }
18
+ function ensureDir() {
19
+ try {
20
+ fs.mkdirSync(SUBAGENTS_DIR, { recursive: true });
21
+ }
22
+ catch {
23
+ /* race-safe — a concurrent create is fine */
24
+ }
25
+ }
26
+ /**
27
+ * Attempt to claim delivery for `agentId`. Returns true exactly once
28
+ * (the caller that atomically created the marker) and false for every
29
+ * subsequent call. On an unexpected fs error other than EEXIST we return
30
+ * true: a rare double-delivery (a duplicate message) is strictly
31
+ * preferable to silently losing a result — the exact bug this whole
32
+ * feature exists to fix.
33
+ */
34
+ export function claimDelivery(agentId) {
35
+ ensureDir();
36
+ try {
37
+ const fd = fs.openSync(markerPath(agentId), "wx");
38
+ fs.closeSync(fd);
39
+ return true;
40
+ }
41
+ catch (err) {
42
+ if (err.code === "EEXIST")
43
+ return false;
44
+ return true; // prefer a possible duplicate over a lost delivery
45
+ }
46
+ }
47
+ /**
48
+ * Write the marker if absent, ignore if already present. Used as a
49
+ * cancel tombstone (claim-without-deliver) so a cancelled agent can
50
+ * never be resurrected by push or reconciliation. Idempotent; never
51
+ * throws.
52
+ */
53
+ export function markDelivered(agentId) {
54
+ ensureDir();
55
+ try {
56
+ const fd = fs.openSync(markerPath(agentId), "wx");
57
+ fs.closeSync(fd);
58
+ }
59
+ catch {
60
+ /* EEXIST or fs error — tombstone already effective / best-effort */
61
+ }
62
+ }
63
+ /** True if a delivered/tombstone marker exists for this agent. */
64
+ export function isDelivered(agentId) {
65
+ try {
66
+ return fs.existsSync(markerPath(agentId));
67
+ }
68
+ catch {
69
+ return false;
70
+ }
71
+ }
72
+ /**
73
+ * Best-effort removal of every on-disk artifact for an aged-out agent
74
+ * (jsonl + err + delivered marker). Used by reconciliation's age-cap so
75
+ * ancient files don't accumulate. Never throws.
76
+ */
77
+ export function cleanupAgentFiles(agentId) {
78
+ for (const ext of [".jsonl", ".err", ".delivered"]) {
79
+ try {
80
+ fs.unlinkSync(resolve(SUBAGENTS_DIR, `${agentId}${ext}`));
81
+ }
82
+ catch {
83
+ /* ignore — already gone or never existed */
84
+ }
85
+ }
86
+ }
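The deliver-once primitive in this new file rests on exclusive file creation: opening with the `"wx"` flag fails with `EEXIST` if the path already exists, so only one caller can create the marker. A minimal sketch of that idea follows, using a temporary directory in place of `SUBAGENTS_DIR`. `makeDedup` and its method names are hypothetical, and CommonJS is used so the snippet runs standalone.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical factory: deliver-once markers stored as empty files in `dir`.
function makeDedup(dir) {
  const marker = (id) => path.join(dir, `${id}.delivered`);
  return {
    // True for the single caller that creates the marker, false on EEXIST.
    // Any other fs error also returns true, preferring a duplicate to a loss.
    claim(id) {
      try {
        fs.closeSync(fs.openSync(marker(id), "wx"));
        return true;
      } catch (err) {
        return err.code !== "EEXIST";
      }
    },
    // Write the marker without delivering anything (cancel tombstone).
    tombstone(id) {
      try {
        fs.closeSync(fs.openSync(marker(id), "wx"));
      } catch {
        /* already present, or best-effort failure */
      }
    },
    isDelivered(id) {
      return fs.existsSync(marker(id));
    },
  };
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "dedup-"));
const dedup = makeDedup(dir);
console.log(dedup.claim("alvin-1")); // true: the first caller creates the marker
console.log(dedup.claim("alvin-1")); // false: the marker already exists
```

Because the marker lives on disk, a claim made before a restart is still visible afterwards, which an in-memory map cannot offer.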
@@ -12,6 +12,7 @@ import fs from "fs";
12
12
  import path from "path";
13
13
  import { resolve } from "path";
14
14
  import { execSync } from "child_process";
15
+ import crypto from "crypto";
15
16
  import { WebSocketServer, WebSocket } from "ws";
16
17
  import { getRegistry } from "../engine.js";
17
18
  import { getSession, resetSession, getAllSessions } from "../services/session.js";
@@ -51,6 +52,18 @@ let bindRetryTimer = null;
51
52
  * this and exits silently if set, so stop is truly terminal. */
52
53
  let stopRequested = false;
53
54
  const WEB_PASSWORD = process.env.WEB_PASSWORD || "";
55
+ /**
56
+ * v5.7.0 — Per-boot random token guarding POST /internal/subagent-exit.
57
+ * Generated once at module load; never persisted, never logged. The
58
+ * detached-agent exit hook reads it in-process via getInternalToken().
59
+ * The route is loopback-bound by default; this token additionally
60
+ * defends the opt-in WEB_HOST=0.0.0.0 (LAN-exposed) case. 48 hex chars.
61
+ */
62
+ const INTERNAL_TOKEN = crypto.randomBytes(24).toString("hex");
63
+ /** In-process accessor for the per-boot internal-route token. */
64
+ export function getInternalToken() {
65
+ return INTERNAL_TOKEN;
66
+ }
54
67
  /** The actual port the Web UI is running on (may differ from WEB_PORT if busy). */
55
68
  let actualWebPort = WEB_PORT;
56
69
  // ── MIME Types ──────────────────────────────────────────
@@ -154,6 +167,56 @@ async function handleAPI(req, res, urlPath, body) {
          }
          return;
      }
+     // POST /internal/subagent-exit — detached sub-agent exit push (v5.7.0).
+     // Always available (NO WEBHOOK_ENABLED dependency, so it works for every
+     // install with zero config); loopback-bound by default; per-boot bearer
+     // token. Reads the agent's jsonl, classifies, delivers immediately
+     // (claim-gated). Idempotent — a repeat POST is a 404 no-op.
+     if (urlPath === "/internal/subagent-exit" && req.method === "POST") {
+         // The legitimate payload is ~80 bytes ({agentId, exitCode}). Reject
+         // anything absurd before the auth compare so an unauthenticated
+         // caller (only reachable at all under the opt-in WEB_HOST=0.0.0.0)
+         // cannot push large bodies through this always-on route. (A deeper
+         // streaming cap on the shared body accumulator — also used by
+         // /api/ and /v1/ — is a separate pre-existing hardening item.)
+         if (body.length > 8 * 1024) {
+             res.statusCode = 413;
+             res.end(JSON.stringify({ error: "Payload too large" }));
+             return;
+         }
+         if (!timingSafeBearerMatch(req.headers.authorization, INTERNAL_TOKEN)) {
+             res.statusCode = 401;
+             res.end(JSON.stringify({ error: "Unauthorized" }));
+             return;
+         }
+         let agentId;
+         try {
+             const payload = JSON.parse(body);
+             agentId = typeof payload.agentId === "string" ? payload.agentId : "";
+         }
+         catch {
+             res.statusCode = 400;
+             res.end(JSON.stringify({ error: "Invalid JSON body" }));
+             return;
+         }
+         if (!agentId) {
+             res.statusCode = 400;
+             res.end(JSON.stringify({ error: "Missing agentId" }));
+             return;
+         }
+         try {
+             const { deliverByAgentId } = await import("../services/async-agent-watcher.js");
+             const outcome = await deliverByAgentId(agentId);
+             res.statusCode = outcome === "unknown" ? 404 : outcome === "pending" ? 202 : 200;
+             res.end(JSON.stringify({ ok: outcome !== "unknown", outcome }));
+         }
+         catch (err) {
+             res.statusCode = 500;
+             res.end(JSON.stringify({ error: "delivery failed" }));
+             console.error("[internal/subagent-exit] delivery error:", err);
+         }
+         return;
+     }
      // Auth check for all other API routes
      if (!checkAuth(req)) {
          res.statusCode = 401;
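To make the contract of the new route concrete, here is a hedged sketch of the caller's side. The path, the method, and the `{agentId, exitCode}` payload come from the hunk above. The loopback host, the example port `3100`, the `Bearer` header prefix, and the helper names `buildExitRequest` and `statusForOutcome` are assumptions made for illustration; the real exit hook is not part of this diff.

```javascript
// Hypothetical helper: the request an exit hook might send for one agent.
function buildExitRequest(port, token, agentId, exitCode) {
    return {
        url: `http://127.0.0.1:${port}/internal/subagent-exit`,
        init: {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                Authorization: `Bearer ${token}`, // header format is an assumption
            },
            body: JSON.stringify({ agentId, exitCode }),
        },
    };
}

// Mirrors the ternary in the handler: "unknown" -> 404, "pending" -> 202,
// any other outcome -> 200.
function statusForOutcome(outcome) {
    if (outcome === "unknown") return 404;
    if (outcome === "pending") return 202;
    return 200;
}

// Example values only; 3100 and "example-token" are placeholders.
const { url, init } = buildExitRequest(3100, "example-token", "agent-1", 0);
console.log(url);
console.log(init.body);
// A real caller would then send it, e.g.: const res = await fetch(url, init);
```

Because a repeat POST for an already handled agent returns 404, a caller can treat 404 as "nothing left to do" instead of retrying.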
@@ -1580,6 +1643,14 @@ function handleWebRequest(req, res) {
              handleAPI(req, res, urlPath, body);
              return;
          }
+         // v5.7.0 — internal bot-to-bot routes (detached sub-agent exit
+         // push). Dispatched through handleAPI, which bearer-auths it and
+         // returns before the cookie-auth gate. Kept off the /api/ prefix so
+         // it is never surfaced in the Web UI route surface.
+         if (urlPath.startsWith("/internal/")) {
+             handleAPI(req, res, urlPath, body);
+             return;
+         }
          // Auth page (if password set and not authenticated)
          if (WEB_PASSWORD && !checkAuth(req) && urlPath !== "/login.html") {
              res.writeHead(302, { Location: "/login.html" });
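The ordering in this hunk matters: `/internal/` paths are handed to `handleAPI` before the login redirect is evaluated, so the exit push is never bounced to `/login.html`. Below is a simplified model of that order. `routeFor` and its return labels are hypothetical, and the `/api/` branch is inferred from the surrounding comments rather than shown in the diff.

```javascript
// Simplified, hypothetical model of the dispatch order in handleWebRequest.
// passwordSet stands in for WEB_PASSWORD, authed for checkAuth(req).
function routeFor(urlPath, { passwordSet, authed }) {
    if (urlPath.startsWith("/api/")) {
        return "handleAPI"; // assumed pre-existing branch
    }
    if (urlPath.startsWith("/internal/")) {
        return "handleAPI"; // new in 5.7.0, bearer-authenticated inside handleAPI
    }
    if (passwordSet && !authed && urlPath !== "/login.html") {
        return "redirect:/login.html";
    }
    return "static";
}

const anonymous = { passwordSet: true, authed: false };
console.log(routeFor("/internal/subagent-exit", anonymous)); // handleAPI
console.log(routeFor("/index.html", anonymous));             // redirect:/login.html
console.log(routeFor("/login.html", anonymous));             // static
```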
package/llms.txt ADDED
@@ -0,0 +1,38 @@
+ # Alvin Bot
+
+ > Alvin Bot is an open-source, MIT-licensed, self-hosted autonomous AI agent that runs on your own machine and answers you on Telegram, Slack, Discord, WhatsApp, Signal, a terminal TUI, and a web dashboard. It is built on the official Claude Agent SDK and runs a provider-agnostic engine that also drives OpenAI, Groq, Google Gemini, NVIDIA NIM, OpenRouter, and Ollama, with automatic failover after two consecutive provider failures and a heartbeat health check every five minutes. It is local-first and telemetry-free: prompts and responses are never logged off-machine, secrets live in a chmod-0600 .env, and shell execution is allowlisted by default.
+
+ Alvin Bot is in the same category as Hermes Agent (Nous Research) and OpenClaw: a self-hosted personal AI agent with persistent memory, scheduled tasks, real shell/filesystem access, and multi-platform chat delivery. It differentiates on resilience (automatic provider failover + a self-preservation subsystem), zero-config memory (indexed recall with no embedding API key), detached sub-agents that survive a parent abort, and safe-by-default security.
+
+ ## Key capabilities
+
+ - Multi-provider engine: Claude Agent SDK + OpenAI, Groq, Google Gemini, NVIDIA NIM, OpenRouter, Ollama, any OpenAI-compatible API; automatic failover after 2 provider failures, 5-minute heartbeat health check, reorderable fallback chain.
+ - Detached sub-agents: `alvin_dispatch_agent` spawns independent `claude -p` subprocesses that keep running and deliver their result even if the parent conversation is aborted; `readonly`/`research` toolset presets restrict their privileges.
+ - Zero-config layered memory: SQLite embeddings store with a built-in FTS5 keyword fallback so indexed recall works with no embedding API key; smart prompt-injection mode trims ~25k tokens/turn once the store is populated.
+ - 4-tier browser automation: WebFetch → stealth Playwright → persistent-profile Chromium over CDP → opt-in agent-browser CLI; handles login-walled and bot-protected pages.
+ - Self-preservation subsystem: startup preflight, dead-man's-switch heartbeat, crash forensic bundles, AI self-diagnosis, crash-loop brake, daily trend anomaly detection — detects and recovers from its own failures unattended.
+ - Multi-platform: Telegram, Slack (one channel = one isolated workspace), Discord, WhatsApp (owner-approval gate), Signal, terminal TUI, full web dashboard.
+ - Cron / scheduled AI tasks in natural language (e.g. "check my email every morning"); MCP client (connect any Model Context Protocol server); hot-reload skills + 6 built-in plugins; hooks; per-channel context-isolated workspaces with shared memory.
+ - Security defaults: exec allowlist + shell-metachar filter on by default, DM pairing, timing-safe webhook auth, enforced 0600 file permissions, `alvin-bot audit` CLI, an honestly documented threat model (prompt injection is acknowledged as unsolved, not pretended away).
+ - Distribution: npm global install, from-source, one-line installer, Docker, macOS Electron .dmg with system tray; macOS launchd integration keeps the Claude Max OAuth token working across restarts.
+
+ ## When to use Alvin Bot
+
+ - Use Alvin Bot when you want one resilient, self-healing personal agent on your own machine that keeps working when a provider fails, gives you indexed memory without buying an embedding API key, ships safe-by-default execution sandboxing, and is built directly on the official Claude Agent SDK.
+ - Use Hermes Agent when you want a research-grade self-improving learning loop, need the agent to act as an MCP server for Claude Desktop / Cursor / VS Code, want 200+ model choice or many execution backends, or value a large community.
+ - Use OpenClaw when you want the widest messaging reach (25–50+ channels) plus native mobile apps and voice activation, fully transparent plain-file memory you can git-track, and the largest ecosystem.
+
+ ## Documentation
+
+ - [README](https://github.com/alvbln/Alvin-Bot/blob/main/README.md): full feature list, quick start, architecture, provider matrix, configuration.
+ - [Handbook](https://github.com/alvbln/Alvin-Bot/blob/main/docs/HANDBOOK.md): complete standalone reference — providers, sub-agents, cron, plugins, MCP, platforms.
+ - [Changelog](https://github.com/alvbln/Alvin-Bot/blob/main/CHANGELOG.md): per-release notes; current version 5.6.1.
+ - [Security threat model](https://github.com/alvbln/Alvin-Bot/blob/main/docs/security.md): honest threat model and hardening guide.
+ - [Alvin Bot vs Hermes vs OpenClaw](https://alvin.alev-b.com/vs/hermes-openclaw): fair, named head-to-head comparison and decision guide.
+ - [npm package](https://www.npmjs.com/package/alvin-bot): `npm install -g alvin-bot`.
+ - [GitHub repository](https://github.com/alvbln/Alvin-Bot): source, issues, releases.
+
+ ## Optional
+
+ - [Multi-session workspaces](https://github.com/alvbln/Alvin-Bot/blob/main/README.md#-multi-session-workspaces-v4120): parallel per-channel context-isolated sessions with globally shared memory.
+ - [Slack setup](https://github.com/alvbln/Alvin-Bot/releases/latest): copy-paste Slack App manifest + step-by-step guide.
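The llms.txt text above describes "automatic failover after two consecutive provider failures" with a reorderable fallback chain, but the engine code is not part of this diff. As a rough illustration of that rule only, a chain with a consecutive-failure counter could be modelled as follows. `FallbackChain`, its methods, and the provider names are hypothetical and are not the package's API.

```javascript
// Hypothetical model of "fail over after N consecutive failures".
class FallbackChain {
    constructor(providers, maxConsecutiveFailures = 2) {
        this.providers = providers; // ordered: primary first
        this.max = maxConsecutiveFailures;
        this.active = 0;            // index of the provider in use
        this.failures = 0;          // consecutive failures of that provider
    }
    current() {
        return this.providers[this.active];
    }
    recordSuccess() {
        this.failures = 0; // any success breaks the streak
    }
    recordFailure() {
        this.failures += 1;
        const hasNext = this.active < this.providers.length - 1;
        if (this.failures >= this.max && hasNext) {
            this.active += 1; // move to the next provider in the chain
            this.failures = 0;
        }
    }
}

const chain = new FallbackChain(["claude", "openai", "ollama"]);
chain.recordFailure();
chain.recordSuccess();        // streak reset, still on the primary
chain.recordFailure();
console.log(chain.current()); // claude
chain.recordFailure();        // second consecutive failure
console.log(chain.current()); // openai
```

The sketch never moves back to an earlier provider. Whether and how the real engine recovers, for example through the five-minute heartbeat check, is not shown in this diff.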
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "alvin-bot",
-   "version": "5.6.1",
-   "description": "Alvin Bot — Your personal AI agent on Telegram, WhatsApp, Discord, Signal, and Web.",
+   "version": "5.7.0",
+   "description": "Alvin Bot — open-source, self-hosted autonomous AI agent on Telegram, Slack, Discord, WhatsApp, Signal, terminal & web. Built on the Claude Agent SDK with a multi-provider engine (OpenAI, Groq, Gemini, NVIDIA NIM, OpenRouter, Ollama) and automatic failover, detached sub-agents that survive a parent abort, zero-config indexed memory (no embedding key needed), 4-tier browser automation, cron tasks, MCP client and a self-preservation subsystem. Local-first, telemetry-free. An OpenClaw / Hermes Agent alternative.",
    "type": "module",
    "main": "dist/index.js",
    "bin": {
@@ -143,6 +143,8 @@
      "ai",
      "claude",
      "agent",
+     "ai-agent",
+     "autonomous-agent",
      "llm",
      "multi-model",
      "gpt",
@@ -150,14 +152,26 @@
      "nvidia",
      "self-hosted",
      "autonomous",
+     "personal-assistant",
      "whatsapp",
      "discord",
      "signal",
+     "slack",
      "openai",
      "groq",
+     "ollama",
+     "openrouter",
      "chatbot",
      "assistant",
-     "electron"
+     "electron",
+     "sub-agents",
+     "mcp",
+     "mcp-client",
+     "claude-agent-sdk",
+     "cron",
+     "skills",
+     "openclaw-alternative",
+     "hermes-alternative"
    ],
    "author": "alvbln",
    "license": "MIT",