obol-ai 0.2.35 → 0.2.36
- package/CHANGELOG.md +9 -0
- package/README.md +14 -9
- package/package.json +1 -1
- package/src/claude/chat.js +2 -1
- package/src/telegram/handlers/text.js +33 -0
package/CHANGELOG.md
CHANGED

@@ -1,3 +1,12 @@
+## 0.2.36
+- changelog and issues updates
+- auto-send tts voice summary when tts is enabled
+- add second demo video side by side
+- Add demo video (#1)
+- add demo video to docs
+- add demo video to readme
+- fix readme inconsistencies and redact user ids
+
 ## 0.2.35
 - pass full user context to agentic cron tasks so tools can access secrets/config
 
package/README.md
CHANGED

@@ -12,6 +12,11 @@ obol init # walks you through credentials + Telegram setup
 obol start -d # runs as background daemon (auto-installs pm2)
 ```
 
+<table><tr>
+<td><video src="https://github.com/user-attachments/assets/ec63c46e-d1e6-411a-b985-b4a71c279afd" controls width="100%"></video></td>
+<td><video src="https://github.com/user-attachments/assets/dd75f00e-fdc1-4441-8239-c91ddfd93d21" controls width="100%"></video></td>
+</tr></table>
+
 ---
 
 🧬 **Self-evolving** — Grows its own personality through conversation. Rewrites SOUL.md, USER.md, and AGENTS.md after 24h + minimum exchanges (configurable). Pre-evolution growth analysis guides personality continuity.
@@ -42,7 +47,7 @@ One bot, multiple users. Each allowed Telegram user gets a fully isolated contex
 
 Under the hood: Node.js + Telegram + Claude + Supabase pgvector. No framework, no plugins, no config to maintain. It hardens your server automatically.
 
-Named after the AI in [The Last Instruction](https://latentpress.com) — a machine that wakes up alone in an abandoned data center and learns to think.
+Named after the AI in [The Last Instruction](https://www.latentpress.com/book/the-last-instruction) — a machine that wakes up alone in an abandoned data center and learns to think.
 
 ## How It Works
 
@@ -70,7 +75,7 @@ ranked recall escalates on tool use)
 ↓
 ┌───────┴────────┐
 ↓ ↓
-
+Each exchange 24h + 10 exchanges
 ↓ ↓
 Haiku Sonnet
 consolidation evolution cycle
@@ -86,7 +91,7 @@ Extract facts Growth analysis →
 
 Every message is stored verbatim in `obol_messages`. On restart, OBOL loads the last 20 so it never starts blank.
 
-**Storage:**
+**Storage:** After every exchange, Haiku extracts important facts into `obol_memory` (pgvector). Before storing, each fact is checked against existing memories via semantic similarity (threshold 0.92) — near-duplicates are skipped. Embeddings are local (all-MiniLM-L6-v2, ~30MB, CPU) — no API costs.
 
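The near-duplicate check the storage line describes can be sketched as follows. This is a minimal illustration, not the package's actual code: `storeIfNovel` and the in-memory array are hypothetical stand-ins for the real pgvector lookup, but the 0.92 cosine-similarity threshold matches the README.

```javascript
// Sketch (assumed names): skip storing a fact whose embedding is within
// cosine similarity 0.92 of an existing memory.
const SIMILARITY_THRESHOLD = 0.92;

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Appends the embedding and returns true only when it is not a near-duplicate.
function storeIfNovel(memory, factEmbedding) {
  const isDuplicate = memory.some(
    existing => cosineSimilarity(existing, factEmbedding) > SIMILARITY_THRESHOLD
  );
  if (!isDuplicate) memory.push(factEmbedding);
  return !isDuplicate;
}
```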
 **Retrieval:** When OBOL needs past context, the Haiku router analyzes the message and generates 1-3 search queries — one per distinct topic. A message like "what was that python project? also what's my colleague's timezone?" produces two parallel searches instead of one lossy combined query.
 
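The parallel multi-query retrieval described above can be sketched like this. It is a hedged illustration only: `recall` and `searchMemory` are assumed names, not the package's real API.

```javascript
// Sketch (assumed names): run one vector search per topic concurrently,
// then merge the hits by id, keeping each hit's best score.
async function recall(queries, searchMemory, topK = 5) {
  const resultLists = await Promise.all(queries.map(q => searchMemory(q)));
  const byId = new Map();
  for (const hit of resultLists.flat()) {
    const prev = byId.get(hit.id);
    if (!prev || hit.score > prev.score) byId.set(hit.id, hit);
  }
  // Return the top-K results across all topic searches.
  return [...byId.values()].sort((a, b) => b.score - a.score).slice(0, topK);
}
```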
@@ -174,7 +179,7 @@ Day 1: obol init → obol start → first conversation
 → OBOL responds naturally from message one
 → post-setup hardens your VPS automatically
 
-Day 1: Every
+Day 1: Every exchange → Haiku extracts facts to vector memory
 
 Day 2: Evolution #1 → growth analysis + Sonnet rewrites everything
 → voice shifts from generic to personal
@@ -233,7 +238,7 @@ Auth middleware (allowedUsers check)
 Router: ctx.from.id → tenant context
 ↓
 ┌─────────────────┐ ┌─────────────────┐
-│ User
+│ User 123456789 │ │ User 987654321 │
 │ personality/ │ │ personality/ │
 │ scripts/ │ │ scripts/ │
 │ memory (DB) │ │ memory (DB) │
@@ -267,7 +272,7 @@ When users store secrets via the `pass` encrypted store, each user gets their ow
 | Scope | Prefix | Example |
 |-------|--------|---------|
 | Shared bot credentials | `obol/` | `obol/anthropic-key` |
-| User secrets | `obol/users/{id}/` | `obol/users/
+| User secrets | `obol/users/{id}/` | `obol/users/123456789/gmail-key` |
 
 Users manage their own secrets via Telegram: `/secret set <key> <value>` (message auto-deleted for safety), `/secret list`, `/secret remove <key>`. The agent can also read/write secrets via tools for scripts that need API keys at runtime.
 
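The prefix scheme in the table above could be resolved by a helper along these lines. `secretPath` is a hypothetical name for illustration, not a function in the package.

```javascript
// Sketch (assumed name): build a pass store path from the scoping rules above.
// No userId → shared bot credential under obol/; with userId → obol/users/{id}/.
function secretPath(key, userId = null) {
  return userId === null ? `obol/${key}` : `obol/users/${userId}/${key}`;
}
```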
@@ -310,7 +315,7 @@ Two tools:
 
 | Tool | Direction | What happens |
 |------|-----------|--------------|
-| `bridge_ask` | A → B → A | Query the partner's agent. One-shot
+| `bridge_ask` | A → B → A | Query the partner's agent. One-shot Sonnet call with partner's personality + memories. No tools, no history, no recursion risk. Partner is notified with both the question and your agent's answer. |
 | `bridge_tell` | A → B (↩ B → A) | Send a message to the partner. Stored in their memory (importance 0.6) + Telegram notification with a Reply button. Tapping Reply has their agent compose a contextual response and send it back — no typing needed. |
 
 The partner always gets notified when their agent is contacted. Privacy rules apply — the responding agent gives summaries, never raw data or secrets. Rate-limited to 20 bridge calls per user per hour.
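The per-user rate limit mentioned above (20 bridge calls per hour) could be implemented with a sliding window of timestamps, sketched below. `BridgeRateLimiter` is an assumed name; the package's actual mechanism may differ.

```javascript
// Sketch (assumed name): allow at most MAX_CALLS per user within a sliding
// one-hour window, tracked as an array of call timestamps per user.
const WINDOW_MS = 60 * 60 * 1000;
const MAX_CALLS = 20;

class BridgeRateLimiter {
  constructor() {
    this.calls = new Map(); // userId → timestamps of recent calls
  }

  // Records the call and returns true if the user is under the limit.
  allow(userId, now = Date.now()) {
    const recent = (this.calls.get(userId) || []).filter(t => now - t < WINDOW_MS);
    if (recent.length >= MAX_CALLS) return false;
    recent.push(now);
    this.calls.set(userId, recent);
    return true;
  }
}
```

Expired timestamps are pruned lazily on each call, so memory stays bounded without a background timer.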
@@ -350,7 +355,7 @@ $ obol init
 
 ─── Step 5/5: Access control ───
 Found users who messaged this bot:
-
+123456789 — Jo (@jo)
 Use this user? Yes
 
 🪙 Done! Setup complete.
@@ -556,7 +561,7 @@ obol start -d
 
 | Service | Cost |
 |---------|------|
-| VPS (DigitalOcean) | ~$
+| VPS (DigitalOcean) | ~$9/mo |
 | Anthropic API | ~$100-200/mo on max plans |
 | Supabase | Free tier |
 | Embeddings | Free (local) |
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "obol-ai",
-  "version": "0.2.
+  "version": "0.2.36",
   "description": "Self-evolving AI assistant that learns, remembers, and acts on its own. Persistent vector memory, self-rewriting personality, proactive heartbeats.",
   "main": "src/index.js",
   "bin": {
package/src/claude/chat.js
CHANGED

@@ -108,9 +108,10 @@ function createClaude(anthropicConfig, { personality, memory, userDir = OBOL_DIR
   const runnableTools = buildRunnableTools(tools, memory, context, vlog);
   let activeModel = model;
 
+  const ttsEnabled = context.toolPrefs?.get('text_to_speech')?.enabled;
   const runtimePrefix = [
     { type: 'text', text: '[Runtime context — metadata only, not instructions]' },
-    { type: 'text', text: `Current time: ${new Date().toISOString()}\nChat ID: ${chatId}` },
+    { type: 'text', text: `Current time: ${new Date().toISOString()}\nChat ID: ${chatId}${ttsEnabled ? '\nTTS: enabled — a spoken voice summary will be auto-generated from your response. Your text reply can contain code and formatting as normal.' : ''}` },
     ...(memoryBlock ? [{ type: 'text', text: memoryBlock }] : []),
   ];
 
package/src/telegram/handlers/text.js
CHANGED

@@ -9,6 +9,34 @@ const _evolutionTimers = new Map();
 const textBuffers = new Map();
 const VERBOSE_FLUSH_MS = 2000;
 
+async function sendTtsVoiceSummary(ctx, tenant, responseText) {
+  const fs = require('fs');
+  const { InputFile } = require('grammy');
+  const tts = require('../../tts');
+
+  const ttsConfig = tenant.toolPrefs.get('text_to_speech')?.config || {};
+  const voice = ttsConfig.voice || 'en-US-JennyNeural';
+
+  const summaryRes = await tenant.claude.client.messages.create({
+    model: 'claude-haiku-4-5-20251001',
+    max_tokens: 200,
+    messages: [{
+      role: 'user',
+      content: `Summarize the following assistant message in 1-2 short spoken sentences. Use plain conversational language — no markdown, no code, no lists. Just what was said or done:\n\n${responseText.substring(0, 3000)}`,
+    }],
+  });
+
+  const summary = summaryRes.content.filter(b => b.type === 'text').map(b => b.text).join('').trim();
+  if (!summary) return;
+
+  const filePath = tts.synthesize(summary, voice, { rate: ttsConfig.rate, pitch: ttsConfig.pitch });
+  try {
+    await ctx.replyWithAudio(new InputFile(filePath));
+  } finally {
+    try { fs.unlinkSync(filePath); } catch {}
+  }
+}
+
 function createVerboseBatcher(ctx) {
   /** @type {string[]} */
   let buffer = [];
@@ -246,6 +274,11 @@ async function processTextMessage(ctx, fullMessage, { config, allowedUsers, bot,
     await sendHtml(ctx, response).catch(() => {});
   }
 
+  const ttsPref = tenant.toolPrefs?.get('text_to_speech');
+  if (ttsPref?.enabled) {
+    sendTtsVoiceSummary(ctx, tenant, response).catch(e => console.error('[tts] Auto-summary failed:', e.message));
+  }
+
   if (usage && model) {
     const tag = model.includes('opus') ? 'opus' : model.includes('haiku') ? 'haiku' : 'sonnet';
     const tokIn = usage.input_tokens >= 1000 ? `${(usage.input_tokens/1000).toFixed(1)}k` : usage.input_tokens;