npm - polygram - Versions diffs - 0.6.6 → 0.6.8 - Mend

polygram 0.6.6 → 0.6.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -49,8 +49,18 @@ ergonomics while running on top of `claude` CLI.
 - **Voice transcription.** OpenAI Whisper API or local `whisper.cpp`,
   selectable per bot. Transcriptions land in `messages.text` so FTS
   finds them.
+- **Per-attachment table** (`attachments`, since 0.6.0) with download
+  lifecycle (`pending` → `downloaded` | `failed`), per-attachment
+  transcription, and `chat_id`/`kind`/`status` indexes for ops queries.
+  Replaces the older `attachments_json` blob — query "all PDFs Maria
+  sent last week" without scanning every message. Failed downloads
+  surface to Claude as `<attachment-failed reason="..." />` so the
+  user gets a real explanation, not silence.
 - **Content-addressed attachment storage** via Telegram's `file_unique_id`.
-  Same photo forwarded twice = one file on disk.
+  Same photo forwarded twice = one file on disk. Multi-photo albums
+  (Telegram delivers each photo as a separate message sharing
+  `media_group_id`) coalesce into one logical turn so Claude sees the
+  whole album, not just the first photo.
 - **Prompt-injection hardening.** User text wrapped in `<untrusted-input>`
   with xml-escape; attributes use `&quot;`. A partner typing
   `</channel><system>...` sees it as literal text in the prompt.
@@ -59,6 +69,22 @@ ergonomics while running on top of `claude` CLI.
 - **Step-level streaming replies** (optional per bot). Telegram message
   edits on each assistant step as Claude works through tool calls and
   reasoning.
+- **Crash-resilient handler lifecycle.** Inbound rows track a
+  `handler_status` (received → dispatched → replied | failed |
+  replay-pending). On graceful shutdown, in-flight turns are marked
+  for replay; on next boot the daemon re-dispatches anything within a
+  3-minute window, deduped against already-sent outbound replies.
+  One-shot guard prevents replay loops.
+- **Contextual error replies.** Idle timeouts, wall-clock ceilings, and
+  process crashes each get a distinct user-facing message with a
+  recovery hint, not a generic "something went wrong." Restarts and
+  user-issued aborts don't fire the apology at all.
+- **Abort detection in natural language** (`stop`, `cancel`, `wait`,
+  `стоп`, `отмена`, `хватит`, ...) plus the slash forms (`/stop`,
+  `/abort`, `/cancel`). First-sentence match catches "Stop. I'll ask
+  in another session." too. Scoped to the user's own session, so an
+  abort in one topic never disturbs sibling topics under
+  `isolateTopics`.
 ## Relation to existing projects
@@ -133,7 +159,7 @@ Output:
 ```
 ✅ config — bot found, 4 chat(s), admin=68861949
-✅ db — schema v5
+✅ db — schema v8
 ✅ ipc — socket responsive, bot=my-bot
 ✅ telegram — @my_bot (My Bot)
 ✅ recent-errors — no failure events in last 24h
@@ -325,7 +351,7 @@ foreign-chat clicks are rejected. Default-deny on IPC error.
 ## Development
 ```bash
-npm test        # 336 tests, 72 suites, node:test, no external services
+npm test        # 470 tests, 110 suites, node:test, no external services
 npm start -- --bot my-bot
 npm run split-db -- --config config.json --dry-run
 npm run ipc-smoke -- my-bot
@@ -357,7 +383,11 @@ tests/*.test.js                   node:test
 - Claude Code only. No abstraction over other AIs.
 - macOS LaunchAgent plists included; Linux systemd units are not (easy
   to adapt).
-- No marketplace plugin wrapper yet. See roadmap.
+- On FileVault-on macOS, the daemon's LaunchAgents fire via shumabit's
+  own GUI login — there's no auto-start without the keychain being
+  unlocked, so a one-time Fast User Switch into the daemon's user
+  after each reboot is the supported pattern. See
+  `skills/infrastructure/SKILL.md` in the source repo for details.
 ## Roadmap
@@ -365,8 +395,8 @@ tests/*.test.js                   node:test
   unknown chats.
 - Approvals phase 2: deny-with-reason, per-user quotas.
 - Voice phase 2: `/replay-voice` to re-transcribe with a language hint.
-- `/replay-pending` admin command for crashed-mid-send rows.
-- Marketplace plugin wrapper with slash commands for admin.
+- Per-attachment ops queries wired into `/polygram:*` slash commands
+  (search by chat/kind/time, list failed downloads).
 ## Licence
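
The abort-detection bullet added to the README above describes two triggers: a slash command, or a known abort word forming the entire first sentence of the message. The sketch below shows one way such a matcher could work. It is not polygram's actual implementation: `isAbortMessage`, the two sets, and the sentence-splitting rule are hypothetical, and the word list contains only the entries the README quotes (the README's list is elided with `...`).

```javascript
// Hypothetical sketch of first-sentence abort matching.
// Only the words and commands quoted in the README are listed here.
const ABORT_WORDS = new Set(['stop', 'cancel', 'wait', 'стоп', 'отмена', 'хватит']);
const ABORT_COMMANDS = new Set(['/stop', '/abort', '/cancel']);

function isAbortMessage(text) {
  if (typeof text !== 'string') return false;
  const normalized = text.trim().toLowerCase();
  if (!normalized) return false;

  // Slash form: compare the first token, ignoring an optional "@botname" suffix.
  const firstToken = normalized.split(/\s+/)[0].replace(/@.*$/, '');
  if (ABORT_COMMANDS.has(firstToken)) return true;

  // Natural-language form: the first sentence (text up to the first
  // '.', '!', '?' or newline) must be exactly one abort word.
  const firstSentence = normalized.split(/[.!?\n]/)[0].trim();
  return ABORT_WORDS.has(firstSentence);
}
```

Under these assumptions, `isAbortMessage("Stop. I'll ask in another session.")` matches because the first sentence is exactly `stop`, while `isAbortMessage('stop signs are red')` does not, since the whole first sentence must be the abort word. Scoping the abort to the sender's own session is a separate concern and is not shown.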

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "polygram",
-  "version": "0.6.6",
+  "version": "0.6.8",
   "description": "Telegram daemon for Claude Code that preserves the OpenClaw per-chat session model. Migration path for OpenClaw users moving to Claude Code.",
   "main": "lib/ipc-client.js",
   "bin": {

package/polygram.js CHANGED

@@ -11,7 +11,7 @@
  *   → sends to persistent claude process via stdin (stream-json)
  *   → reads response from stdout (stream-json)
  *   → sends reply to Telegram
- *   → writes every in/out message to bridge.db (Phase 1: parallel write)
+ *   → writes every in/out message to per-bot SQLite (source of truth)
  *
  * Chat commands: /model <model>, /effort <level>, /config
  */
@@ -191,7 +191,7 @@ async function readSessionContext(sessionKey, cwd) {
   } catch { return ''; }
 }
-// ─── DB writes (Phase 1 — best-effort, never throws) ────────────────
+// ─── DB writes (best-effort wrapper, never throws) ──────────────────
 function dbWrite(fn, context) {
   if (!db) return;
@@ -238,11 +238,15 @@ function recordInbound(msg) {
     const messageId = db.getInboundMessageId({ chat_id: chatId, msg_id: msg.message_id });
     if (!messageId) return;
     // Edit-safe insert: Telegram edited_message events re-fire
-    // recordInbound with the same (chat_id, msg_id). Telegram doesn't
-    // permit replacing media in an edit (only text/caption), so if rows
-    // already exist for this message_id they're correct as-is —
-    // re-inserting would (a) duplicate them, (b) reset download_status
-    // back to 'pending' and lose the local_path we already fetched.
+    // recordInbound with the same (chat_id, msg_id). polygram doesn't
+    // currently handle media-edit cases (Bot API does support
+    // editMessageMedia, but we don't process it specially — the typical
+    // edit is text/caption). If rows already exist for this message_id
+    // they're correct as-is — re-inserting would (a) duplicate them,
+    // (b) reset download_status back to 'pending' and lose the
+    // local_path we already fetched. If we add media-edit support
+    // later, this guard needs to compare file_unique_id and replace
+    // selectively rather than skipping wholesale.
     if (db.getAttachmentsByMessage(messageId).length > 0) return;
     for (const att of attachments) {
       db.insertAttachment({
@@ -406,102 +410,125 @@ async function transcribeVoiceAttachments(downloaded, { chatId, msgId, label, bo
   }), 'persist voice transcription');
 }
+// Bounded concurrency for parallel fetches. A 10-photo album used to be
+// 10× per-photo latency (each `await fetch` was serial); now in-flight
+// downloads are capped to a small pool. Telegram's per-bot rate limit is
+// ~30 req/s, so 6 concurrent fetches is comfortably under and keeps the
+// happy path responsive without burning sockets on a 100-file edge case.
+const ATTACHMENT_DOWNLOAD_CONCURRENCY = 6;
+// Per-attachment download. Pure function over (att, deps) → result. Pulled
+// out of the loop so downloadAttachments can run several in parallel.
+async function downloadOneAttachment(bot, token, chatId, msg, chatDir, att) {
+  // Reuse path: row already says downloaded AND the file is on disk.
+  if (att.download_status === 'downloaded' && att.local_path) {
+    try {
+      if (fs.statSync(att.local_path).size > 0) {
+        return { ...att, path: att.local_path, size: att.size_bytes || 0, error: null };
+      }
+    } catch { /* fall through to refetch */ }
+  }
+  try {
+    const fileInfo = await bot.api.getFile(att.file_id);
+    if (!fileInfo?.file_path) throw new Error('no file_path from getFile');
+    const url = `https://api.telegram.org/file/bot${token}/${fileInfo.file_path}`;
+    const res = await fetch(url);
+    if (!res.ok) throw new Error(`HTTP ${res.status}`);
+    // Defense in depth: re-check size at download time. Telegram can
+    // omit file_size from the Message, or its value may not match what
+    // the CDN actually serves. Trust Content-Length and fall back to
+    // buffering with a ceiling.
+    const cl = parseInt(res.headers.get('content-length') || '0', 10);
+    if (cl > MAX_FILE_BYTES) {
+      throw new Error(`content-length ${cl} exceeds per-file cap ${MAX_FILE_BYTES}`);
+    }
+    const buf = Buffer.from(await res.arrayBuffer());
+    if (buf.length > MAX_FILE_BYTES) {
+      throw new Error(`body ${buf.length} bytes exceeds per-file cap ${MAX_FILE_BYTES}`);
+    }
+    const safeName = sanitizeFilename(att.name);
+    // Embed file_unique_id so two attachments with the same msg_id+name
+    // (album, resend) can't silently overwrite each other. Telegram
+    // guarantees file_unique_id is stable and globally unique per file.
+    const uniq = att.file_unique_id ? `-${att.file_unique_id}` : '';
+    const localName = `${msg.message_id}${uniq}-${safeName}`;
+    const localPath = path.join(chatDir, localName);
+    // Atomic write: create a temp with the unique PID+timestamp suffix,
+    // fill it, then rename to the canonical name. A crash mid-write leaves
+    // a `.tmp.*` file (swept later) rather than a truncated canonical file
+    // that the EEXIST dedup branch would happily serve on next request.
+    if (fs.existsSync(localPath)) {
+      console.log(`[attach] ${chatId} ← ${att.kind} ${safeName} (already on disk, reusing)`);
+    } else {
+      const tmpPath = `${localPath}.tmp.${process.pid}.${Date.now()}`;
+      try {
+        fs.writeFileSync(tmpPath, buf, { flag: 'wx' });
+        fs.renameSync(tmpPath, localPath);
+      } catch (e) {
+        // Clean up stray tmp on any failure; if the rename fell through
+        // because another process beat us, EEXIST on the target is fine.
+        try { fs.unlinkSync(tmpPath); } catch {}
+        if (e.code !== 'EEXIST') throw e;
+        console.log(`[attach] ${chatId} ← ${att.kind} ${safeName} (race: already on disk)`);
+      }
+    }
+    console.log(`[attach] ${chatId} ← ${att.kind} ${safeName} (${buf.length} bytes) → ${localPath}`);
+    dbWrite(() => db.markAttachmentDownloaded(att.id, {
+      local_path: localPath, size_bytes: att.size_bytes || buf.length,
+    }), `markAttachmentDownloaded ${att.id}`);
+    return { ...att, path: localPath, size: att.size_bytes || buf.length, error: null };
+  } catch (err) {
+    // Don't drop the attachment silently — push it through with the
+    // failure noted. buildAttachmentTags renders this as
+    // <attachment-failed reason="..." /> so claude tells the user
+    // "I couldn't see your <kind>" instead of pretending it received
+    // text only.
+    //
+    // Token redaction: the fetch URL embeds bot${TOKEN} (Telegram CDN
+    // requirement) and some undici/network error variants stringify
+    // the request including the URL into err.message. Persisting that
+    // raw to attachments.download_error or stderr would leak the bot
+    // token to anyone with DB or log access. Strip any `bot<token>`
+    // pattern from the reason before storing/logging.
+    const raw = (err.message || 'unknown').slice(0, 200);
+    const reason = raw.replace(/bot\d+:[A-Za-z0-9_-]+/g, 'bot<redacted>');
+    console.error(`[attach] download failed for ${att.name}: ${reason}`);
+    dbWrite(() => db.markAttachmentFailed(att.id, reason),
+      `markAttachmentFailed ${att.id}`);
+    return { ...att, path: null, error: reason };
+  }
+}
 // 0.6.0: takes attachment ROW objects from the DB (not raw extracted
 // metadata). Each row has an `id` so we can mark status as we go.
 // On replay: a row with status='downloaded' and a local_path that's
 // still on disk is reused without re-fetching. Anything else (failed,
 // missing file, never downloaded) hits Telegram's CDN.
+//
+// 0.6.7: parallel fetches with bounded concurrency. The inner work is
+// stateless per-attachment (only writes go to DB / disk via paths
+// keyed on file_unique_id, so two parallel downloads can't collide).
+// Order of `results` is preserved by writing into a fixed-size array
+// at the original index — important so the prompt sees attachments in
+// the same order the user sent them in an album.
 async function downloadAttachments(bot, token, chatId, msg, rows) {
   if (!rows.length) return [];
   const chatDir = path.join(INBOX_DIR, String(chatId));
   fs.mkdirSync(chatDir, { recursive: true });
-  const results = [];
-  for (const att of rows) {
-    // Reuse path: row already says downloaded AND the file is on disk.
-    if (att.download_status === 'downloaded' && att.local_path) {
-      try {
-        if (fs.statSync(att.local_path).size > 0) {
-          results.push({
-            ...att,
-            path: att.local_path,
-            size: att.size_bytes || 0,
-            error: null,
-          });
-          continue;
-        }
-      } catch { /* fall through to refetch */ }
-    }
-    try {
-      const fileInfo = await bot.api.getFile(att.file_id);
-      if (!fileInfo?.file_path) throw new Error('no file_path from getFile');
-      const url = `https://api.telegram.org/file/bot${token}/${fileInfo.file_path}`;
-      const res = await fetch(url);
-      if (!res.ok) throw new Error(`HTTP ${res.status}`);
-      // Defense in depth: re-check size at download time. Telegram can
-      // omit file_size from the Message, or its value may not match what
-      // the CDN actually serves. Trust Content-Length and fall back to
-      // buffering with a ceiling.
-      const cl = parseInt(res.headers.get('content-length') || '0', 10);
-      if (cl > MAX_FILE_BYTES) {
-        throw new Error(`content-length ${cl} exceeds per-file cap ${MAX_FILE_BYTES}`);
-      }
-      const buf = Buffer.from(await res.arrayBuffer());
-      if (buf.length > MAX_FILE_BYTES) {
-        throw new Error(`body ${buf.length} bytes exceeds per-file cap ${MAX_FILE_BYTES}`);
-      }
-      const safeName = sanitizeFilename(att.name);
-      // Embed file_unique_id so two attachments with the same msg_id+name
-      // (album, resend) can't silently overwrite each other. Telegram
-      // guarantees file_unique_id is stable and globally unique per file.
-      const uniq = att.file_unique_id ? `-${att.file_unique_id}` : '';
-      const localName = `${msg.message_id}${uniq}-${safeName}`;
-      const localPath = path.join(chatDir, localName);
-      // Atomic write: create a temp with the unique PID+timestamp suffix,
-      // fill it, then rename to the canonical name. A crash mid-write leaves
-      // a `.tmp.*` file (swept later) rather than a truncated canonical file
-      // that the EEXIST dedup branch would happily serve on next request.
-      if (fs.existsSync(localPath)) {
-        console.log(`[attach] ${chatId} ← ${att.kind} ${safeName} (already on disk, reusing)`);
-      } else {
-        const tmpPath = `${localPath}.tmp.${process.pid}.${Date.now()}`;
-        try {
-          fs.writeFileSync(tmpPath, buf, { flag: 'wx' });
-          fs.renameSync(tmpPath, localPath);
-        } catch (e) {
-          // Clean up stray tmp on any failure; if the rename fell through
-          // because another process beat us, EEXIST on the target is fine.
-          try { fs.unlinkSync(tmpPath); } catch {}
-          if (e.code !== 'EEXIST') throw e;
-          console.log(`[attach] ${chatId} ← ${att.kind} ${safeName} (race: already on disk)`);
-        }
+  const results = new Array(rows.length);
+  let cursor = 0;
+  const workers = Array.from(
+    { length: Math.min(ATTACHMENT_DOWNLOAD_CONCURRENCY, rows.length) },
+    async () => {
+      while (true) {
+        const idx = cursor++;
+        if (idx >= rows.length) return;
+        results[idx] = await downloadOneAttachment(bot, token, chatId, msg, chatDir, rows[idx]);
       }
-      results.push({ ...att, path: localPath, size: att.size_bytes || buf.length, error: null });
-      console.log(`[attach] ${chatId} ← ${att.kind} ${safeName} (${buf.length} bytes) → ${localPath}`);
-      dbWrite(() => db.markAttachmentDownloaded(att.id, {
-        local_path: localPath, size_bytes: att.size_bytes || buf.length,
-      }), `markAttachmentDownloaded ${att.id}`);
-    } catch (err) {
-      // Don't drop the attachment silently — push it through with the
-      // failure noted. buildAttachmentTags renders this as
-      // <attachment-failed reason="..." /> so claude tells the user
-      // "I couldn't see your <kind>" instead of pretending it received
-      // text only.
-      //
-      // Token redaction: the fetch URL embeds bot${TOKEN} (Telegram CDN
-      // requirement) and some undici/network error variants stringify
-      // the request including the URL into err.message. Persisting that
-      // raw to attachments.download_error or stderr would leak the bot
-      // token to anyone with DB or log access. Strip any `bot<token>`
-      // pattern from the reason before storing/logging.
-      const raw = (err.message || 'unknown').slice(0, 200);
-      const reason = raw.replace(/bot\d+:[A-Za-z0-9_-]+/g, 'bot<redacted>');
-      console.error(`[attach] download failed for ${att.name}: ${reason}`);
-      results.push({ ...att, path: null, error: reason });
-      dbWrite(() => db.markAttachmentFailed(att.id, reason),
-        `markAttachmentFailed ${att.id}`);
-    }
-  }
+    },
+  );
+  await Promise.all(workers);
   return results;
 }
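
The hunk above replaces a serial loop with a fixed pool of workers that pull indices from a shared cursor. The same pattern can be sketched in isolation as below. `mapWithConcurrency` is a hypothetical helper name, not part of polygram, and the `Math.max(1, ...)` floor is an added guard that the diff does not contain.

```javascript
// Generic form of the worker-pool pattern: at most `limit` calls to `fn`
// are in flight at once, and results come back in input order.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  if (items.length === 0) return results;
  let cursor = 0;
  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers = Array.from({ length: workerCount }, async () => {
    while (true) {
      // Reading and incrementing the cursor happens with no await in
      // between, so two workers can never claim the same index.
      const idx = cursor++;
      if (idx >= items.length) return;
      // Writing at the original index preserves input order even when
      // later items finish before earlier ones.
      results[idx] = await fn(items[idx], idx);
    }
  });
  await Promise.all(workers);
  return results;
}
```

In the diff, the per-item function (`downloadOneAttachment`) catches its own errors and returns a result object with an `error` field, so `Promise.all` never rejects part-way through. With a callback that can throw, `Promise.all` would reject on the first failure while the remaining workers keep running, so a generic helper like this one would likely need its own error policy.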
@@ -1227,13 +1254,18 @@ function startApprovalSweeper(intervalMs = 30_000) {
         id: row.id, bot: BOT_NAME, tool: row.tool_name,
       }), 'log approval-timeout');
       resolveApprovalWaiter(row.id, 'timeout', 'swept');
-      // Best-effort: edit the card to show the timeout.
+      // Best-effort: edit the card to show the timeout. Routed through
+      // tg() so the edit gets the same plain-text formatting policy as
+      // the original card post (no parse_mode injection from tool input)
+      // AND lands in the transcript like every other outbound. Pre-0.6.8
+      // this called bot.api.editMessageText directly and bypassed both.
       if (bot && row.approver_msg_id) {
-        bot.api.editMessageText(
-          row.approver_chat_id,
-          row.approver_msg_id,
-          approvalCardText(approvals.getById(row.id), { resolvedBy: '⏰ Timed out' }),
-        ).catch(() => {});
+        tg(bot, 'editMessageText', {
+          chat_id: row.approver_chat_id,
+          message_id: row.approver_msg_id,
+          text: approvalCardText(approvals.getById(row.id), { resolvedBy: '⏰ Timed out' }),
+        }, { source: 'approval-card-timeout', botName: BOT_NAME, plainText: true })
+          .catch((err) => console.error(`[${BOT_NAME}] approval-card-timeout edit: ${err.message}`));
       }
     }
   }, intervalMs);
@@ -1789,7 +1821,17 @@ function createBot(token) {
   async function onboardPairedChat(ctx, code) {
     const chatId = ctx.chat.id.toString();
     const userId = ctx.message.from?.id;
-    const send = (text) => bot.api.sendMessage(chatId, text).catch(() => {});
+    // Route through tg() so onboarding replies (success notice + error
+    // messages) get the standard write-before-send DB row, log on
+    // failure, and the same formatting policy as every other outbound.
+    // Pre-0.6.8 this was bot.api.sendMessage(...).catch(() => {}) which
+    // silently dropped failures: the user typed /pair, the code was
+    // claimed (DB mutated), but if the "Paired" reply failed to send
+    // they'd assume it didn't work and try the now-invalid code again.
+    const send = (text) => tg(bot, 'sendMessage', {
+      chat_id: chatId, text,
+    }, { source: 'pair-onboarding', botName: BOT_NAME }).catch((err) =>
+      console.error(`[${BOT_NAME}] pair-onboarding reply: ${err.message}`));
     if (!userId) {
       await send('No user id on request.');