alvin-bot 4.5.1 → 4.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43)

package/CHANGELOG.md +278 -0
package/README.md +25 -2
package/bin/cli.js +325 -26
package/dist/handlers/commands.js +505 -63
package/dist/handlers/message.js +209 -14
package/dist/i18n.js +470 -13
package/dist/index.js +45 -5
package/dist/providers/claude-sdk-provider.js +106 -14
package/dist/providers/ollama-provider.js +32 -0
package/dist/providers/openai-compatible.js +10 -1
package/dist/providers/registry.js +112 -17
package/dist/providers/types.js +25 -3
package/dist/services/compaction.js +2 -0
package/dist/services/cron.js +53 -42
package/dist/services/heartbeat.js +41 -7
package/dist/services/language-detect.js +12 -2
package/dist/services/ollama-manager.js +339 -0
package/dist/services/personality.js +20 -14
package/dist/services/session.js +21 -3
package/dist/services/subagent-delivery.js +266 -0
package/dist/services/subagent-stats.js +123 -0
package/dist/services/subagents.js +509 -42
package/dist/services/telegram.js +28 -1
package/dist/services/updater.js +158 -0
package/dist/services/usage-tracker.js +11 -4
package/dist/services/users.js +2 -1
package/docs/HANDBOOK.md +856 -0
package/package.json +7 -2
package/test/claude-sdk-provider.test.ts +69 -0
package/test/i18n.test.ts +108 -0
package/test/registry.test.ts +201 -0
package/test/subagent-delivery.test.ts +273 -0
package/test/subagent-stats.test.ts +119 -0
package/test/subagents-commands.test.ts +64 -0
package/test/subagents-config.test.ts +114 -0
package/test/subagents-depth.test.ts +58 -0
package/test/subagents-inheritance.test.ts +67 -0
package/test/subagents-name-resolver.test.ts +122 -0
package/test/subagents-priority-reject.test.ts +88 -0
package/test/subagents-queue.test.ts +127 -0
package/test/subagents-shutdown.test.ts +126 -0
package/test/subagents-toolset.test.ts +51 -0
package/vitest.config.ts +17 -0

package/CHANGELOG.md CHANGED

@@ -2,6 +2,284 @@
 All notable changes to Alvin Bot are documented here.
+## [4.7.0] — 2026-04-11
+### ✨ Sub-Agents Stufe 2 — live-stream, bounded queue, 24h stats
+Stufe 2 (stage 2) of the sub-agents refinement spec lands alongside the same-day 4.6.0 release. Everything here builds on the Stufe 1 (stage 1) foundation and is fully unit-tested (85 passing tests).
+#### A4 Live-Stream for user-spawns
+`/subagents visibility live` enables a new delivery mode where user-spawned sub-agents stream their text incrementally into a single Telegram message, then post a completion banner as a separate message.
+Implementation in `src/services/subagent-delivery.ts`:
+- `LiveStream` class with `start()` / `update()` / `finalize()`
+- `start()` posts an initial `⏳ <name> thinking…` placeholder and records its `message_id`
+- `update()` is called on every text chunk from the agent's generator; it coalesces rapid updates via a throttle window of **800 ms** so we never exceed Telegram's edit rate limit. Multiple `update()` calls within the window collapse into a single edit with the latest accumulated text.
+- `finalize()` flushes any pending text, replaces the `thinking…` header with the final body, then sends a new banner message so the user gets a completion notification (edits don't trigger push notifications).
+- The live-stream message uses **plain text** (no `parse_mode`) so half-formed markdown during streaming can never cause an edit to be rejected. The final banner does use markdown.
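The coalescing behaviour described for `update()` can be illustrated with a small, timer-free sketch. It is not taken from `subagent-delivery.ts`: `CoalescingEditor`, `EditFn`, and the injectable `now` clock are hypothetical names, and only the 800 ms window comes from the changelog. A real implementation would presumably also schedule a trailing edit with a timer; here pending text is sent on the next `update()` past the window or on an explicit `flush()`.

```typescript
// Hypothetical sketch: coalesce rapid updates into at most one edit per window.
type EditFn = (text: string) => void;

class CoalescingEditor {
  private lastEditAt = Number.NEGATIVE_INFINITY;
  private pending: string | null = null;
  private readonly edit: EditFn;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(edit: EditFn, windowMs = 800, now: () => number = Date.now) {
    this.edit = edit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Remember the latest accumulated text; edit only if the window has elapsed.
  update(text: string): void {
    this.pending = text;
    if (this.now() - this.lastEditAt >= this.windowMs) this.flush();
  }

  // Send whatever is pending (also what a final flush would call).
  flush(): void {
    if (this.pending === null) return;
    this.edit(this.pending);
    this.pending = null;
    this.lastEditAt = this.now();
  }
}
```

Injecting the clock keeps the sketch deterministic, so the throttle can be exercised without real delays.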
+Wiring in `runSubAgent`:
+- Detects `effectiveVisibility === "live"` AND `source === "user"` AND `parentChatId`. Cron and implicit spawns are never live-streamed — cron because there's no interactive watcher, implicit because the parent Claude stream already shows everything inline.
+- Creates the `LiveStream` via `createLiveStream()` before the for-await loop.
+- Calls `liveStream.update(chunk.text)` on every text chunk.
+- Calls `liveStream.finalize(info, result)` after the loop and marks `entry.delivered = true` so `spawnSubAgent.finally()` skips the regular `deliverSubAgentResult` path. If finalize fails, the `delivered` flag stays false and the normal banner delivery fires as a fallback.
+- Falls back to `"banner"` mode transparently if the bot API doesn't support `editMessageText` (e.g. during tests or if `attachBotApi` was never called).
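The wiring conditions above amount to a single predicate. The sketch below is an illustration under assumptions: `shouldLiveStream`, `LiveModeInput`, and `BotApiLike` are hypothetical names, and the capability check is assumed to be a simple `typeof` test on `editMessageText`.

```typescript
// Hypothetical sketch of the live-stream decision; names are not from the package.
type VisibilityMode = "auto" | "banner" | "silent" | "live";
type SpawnSource = "user" | "cron" | "implicit";

interface BotApiLike {
  editMessageText?: unknown; // assumed capability probe
}

interface LiveModeInput {
  effectiveVisibility: VisibilityMode;
  source: SpawnSource;
  parentChatId?: number;
  botApi?: BotApiLike;
}

// True only for user spawns with a parent chat, live visibility, and an API that can edit.
function shouldLiveStream(input: LiveModeInput): boolean {
  const wantsLive =
    input.effectiveVisibility === "live" &&
    input.source === "user" &&
    input.parentChatId !== undefined;
  const canEdit = typeof input.botApi?.editMessageText === "function";
  return wantsLive && canEdit; // otherwise the caller stays on banner delivery
}
```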
+Tests added in `test/subagent-delivery.test.ts`:
+- `start` posts an initial placeholder and stores the message_id
+- `update` coalesces rapid calls into a single throttled edit within the 800 ms window
+- `finalize` posts a banner as a new message
+- `createLiveStream` returns `null` when `editMessageText` is missing
+#### D3 Bounded priority queue
+Previously, hitting `maxParallel` returned a hard reject. Now spawn requests that don't fit go into a **bounded priority queue**:
+- Default cap: **20** slots (configurable via `/subagents queue <n>`, clamped to 0–200)
+- Setting cap to 0 disables the queue entirely and restores the old reject-on-full behavior
+- Priority order on drain: **user > cron > implicit**
+- FIFO within each priority class
+- Drains automatically when a running agent finishes — the `runSubAgent.finally()` now calls `drainQueue()` after cleanup
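A minimal sketch of the queue rules listed above (cap, clamping, `user > cron > implicit`, FIFO within a class). `BoundedPriorityQueue`, `QueuedSpawn`, and `clampQueueCap` are hypothetical names rather than the functions in `subagents.ts`, and the handling of non-finite input is an assumption.

```typescript
// Hypothetical sketch of a bounded queue drained in priority order.
type SpawnSource = "user" | "cron" | "implicit";
const DRAIN_ORDER: SpawnSource[] = ["user", "cron", "implicit"];

interface QueuedSpawn {
  name: string;
  source: SpawnSource;
}

// Clamp to 0..200 and floor fractions; treating NaN/Infinity as 0 is an assumption.
function clampQueueCap(n: number): number {
  if (!Number.isFinite(n)) return 0;
  return Math.min(200, Math.max(0, Math.floor(n)));
}

class BoundedPriorityQueue {
  private items: QueuedSpawn[] = [];
  private readonly cap: number;

  constructor(cap = 20) {
    this.cap = clampQueueCap(cap);
  }

  // Returns false when the queue is full; a cap of 0 therefore always rejects.
  enqueue(item: QueuedSpawn): boolean {
    if (this.items.length >= this.cap) return false;
    this.items.push(item);
    return true;
  }

  // Pop the oldest entry of the highest-priority class that has entries.
  pop(): QueuedSpawn | undefined {
    for (const source of DRAIN_ORDER) {
      const i = this.items.findIndex((q) => q.source === source);
      if (i !== -1) return this.items.splice(i, 1)[0];
    }
    return undefined;
  }
}
```

Keeping a single array in arrival order and scanning per priority class is enough at these sizes (at most 200 entries) and makes the FIFO guarantee fall out of insertion order.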
+New fields:
+- `SubAgentsConfig.queueCap: number` — persisted in `~/.alvin-bot/sub-agents.json`
+- `SubAgentInfo.status: "queued"` — new valid state
+- `SubAgentInfo.queuePosition?: number` — 1-based position in the queue, shown in `/subagents list` as `#N`
+Functions in `subagents.ts`:
+- `getQueueCap()` / `setQueueCap(n)` — public config accessors
+- `drainQueue()` — called from `runSubAgent.finally()`, pops in priority order and transitions entries from `queued` to `running`
+- `popHighestPriorityQueued()` — internal FIFO-per-priority scan
+- `reindexQueue()` — keeps `SubAgentInfo.queuePosition` in sync after pop/cancel
+- `cancelSubAgent()` now handles queued entries by removing them from the queue without starting `runSubAgent` at all
+- `cancelAllSubAgents()` clears the pending queue before cancelling running agents, so shutdown doesn't spawn anything new
+- `spawnSubAgent()` is split: queue decision first (run immediately vs queue vs reject), then `startRun()` helper starts the background loop
+Reject messages stay priority-aware (D4) but now mention queue saturation:
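+- `user` spawn + pool full + cron/implicit in pool + queue full → *"Alle Slots belegt (N/M), davon X cron/implicit im Hintergrund. Queue voll (Q/C). /subagents list für Details …"* (English: "All slots occupied (N/M), X of them cron/implicit in the background. Queue full (Q/C). /subagents list for details …")
+- `user` spawn + pool full + user in pool + queue full → *"Alle Slots belegt (N/M) mit eigenen user-Spawns. Queue voll (Q/C). /subagents cancel <name> oder warten."* (English: "All slots occupied (N/M) by your own user spawns. Queue full (Q/C). /subagents cancel <name> or wait.")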
+- Non-user spawns + pool + queue full → *"Sub-agent limit reached (N running, Q/C queued). Wait for a running agent to finish or cancel one."*
+Tests added in `test/subagents-queue.test.ts`:
+- Default cap is 20
+- Clamping (negative → 0, above 200 → 200, fractional floors)
+- Round-trip through disk
+- Third spawn at full pool lands as `status: "queued"` with `queuePosition: 1`
+- Queue drains automatically when a running agent finishes
+- Priority order: user spawns drain before cron at the same moment
+- `cancelSubAgent` removes a queued entry
+The existing priority-reject tests now explicitly set `queueCap = 0` to test the old reject path, and a new "queue enabled" test fills both pool and queue before asserting the reject message.
+#### H3 24-hour run stats
+New module `src/services/subagent-stats.ts` — a simple append-only JSON ring buffer persisted to `~/.alvin-bot/subagent-stats.json`. Each completed sub-agent run appends one entry:
+```ts
+{
+  completedAt: number;
+  name: string;
+  source: "user" | "cron" | "implicit";
+  status: "completed" | "timeout" | "error" | "cancelled";
+  durationMs: number;
+  inputTokens: number;
+  outputTokens: number;
+}
+```
+On every load or append, entries older than 24 hours are pruned. A hard cap of 5000 entries protects against unbounded growth on high-frequency bots.
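The pruning rule can be sketched as a pure function. This is an illustration, not the code in `subagent-stats.ts`: `pruneRuns` and `RunEntry` are hypothetical names, `completedAt` is assumed to be epoch milliseconds, entries are assumed to be stored oldest-first, and dropping the oldest entries when over the cap is an assumption the changelog does not spell out.

```typescript
// Hypothetical sketch: keep only entries from the last 24 hours, at most 5000 of them.
interface RunEntry {
  completedAt: number; // assumed: epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;
const MAX_ENTRIES = 5000;

function pruneRuns<T extends RunEntry>(entries: T[], now: number): T[] {
  // Drop anything older than 24 hours relative to `now`.
  const fresh = entries.filter((e) => now - e.completedAt <= DAY_MS);
  // Assumption: entries are appended oldest-first, so the tail holds the newest runs.
  return fresh.slice(-MAX_ENTRIES);
}
```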
+Accessors:
+- `recordSubAgentRun(info, result)` — called from `runSubAgent.finally()` as a non-blocking side effect. Errors are logged but don't affect delivery.
+- `getSubAgentStats()` — returns a `StatsSummary` with totals, per-source breakdown, and per-status counts.
+New Telegram command **`/subagents stats`** renders the summary:
+```
+📊 Sub-Agent Stats — last 24h
+Total: 44 runs · 165k in / 89k out · 12m
+By source:
+  👤 user:     12 runs · 45k in / 22k out
+  ⏰ cron:      8 runs · 31k in / 15k out
+  🔗 implicit: 24 runs · 89k in / 52k out
+By status:
+  ✅ completed: 42
+  ⚠️ cancelled: 1
+  ⏱️ timeout:   0
+  ❌ error:     1
+```
+The JSON backing file is a deliberate short-term choice. When the SQLite migration lands (already scoped in a separate memory entry as `project_alvinbot_sqlite_migration.md`), we swap the backend without touching `getSubAgentStats()` or `recordSubAgentRun()` — both are designed as a narrow interface.
+Tests added in `test/subagent-stats.test.ts`:
+- Fresh install returns zeros
+- Recording 3 runs updates totals + per-source breakdown
+- Persistence + reload round-trip
+- Entries older than 24h are pruned on load
+- `byStatus` tracks cancelled/error/timeout separately
+### 🖥 CLI: `alvin-bot start` / `stop` now auto-detect LaunchAgent
+The `start` and `stop` commands previously always went through pm2. That created a conflict after `alvin-bot launchd install`: the LaunchAgent ran the bot, but `alvin-bot start` would happily spawn a second instance via pm2, and `alvin-bot stop` would try to stop a pm2 process that didn't exist.
+Now both commands check for `~/Library/LaunchAgents/com.alvinbot.app.plist` on macOS and switch transparently:
+- **`alvin-bot start`** with a LaunchAgent present → `launchctl kickstart -k gui/$UID/com.alvinbot.app` (or `launchctl load -w` if not loaded yet). No pm2 involvement.
+- **`alvin-bot stop`** with a LaunchAgent present → `launchctl unload -w` (doesn't remove the plist, just stops the daemon).
+- **`alvin-bot start`** on macOS without a LaunchAgent → pm2 path + a helpful tip: *"💡 Tip: on macOS with Claude Code, switch to launchd for automatic Keychain access: alvin-bot launchd install"*.
+Linux and Windows users are unaffected — they always get the pm2 path.
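The backend selection described above reduces to a small decision table. The sketch below is hypothetical (`chooseStartBackend` and `StartDecision` are not names from `bin/cli.js`); it takes the platform string and the plist check as inputs so that the decision itself stays pure.

```typescript
// Hypothetical sketch of the start/stop backend decision.
interface StartDecision {
  backend: "launchd" | "pm2";
  showLaunchdTip: boolean; // the macOS-only hint to switch to launchd
}

function chooseStartBackend(platform: string, plistExists: boolean): StartDecision {
  const isMac = platform === "darwin";
  if (isMac && plistExists) return { backend: "launchd", showLaunchdTip: false };
  // Everything else goes through pm2; only macOS users see the tip.
  return { backend: "pm2", showLaunchdTip: isMac };
}
```

A caller would pass `process.platform` and the result of checking for `~/Library/LaunchAgents/com.alvinbot.app.plist` (for example with `fs.existsSync`).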
+### 🐛 Other
+- `/subagents queue` is registered in the usage string for en/de/es/fr.
+- `/subagents stats` is registered in the usage string for en/de/es/fr.
+- `/subagents visibility` usage now lists `live` as a valid mode.
+- Removed the leftover `alvin-bot-4.5.1.tgz` from the repo root.
+## [4.6.0] — 2026-04-11
+### ✨ Sub-Agents Stufe 1 — context-aware delivery, name-first addressing, shutdown notifications
+**The big one.** Stufe 1 of the SubAgents refinement spec (9 design axes, two-stage rollout) is complete. Everything here is live-validated on a remote test MacBook via `@Alvin_testbot_bot` over Telegram with Claude Agent SDK + Max OAuth.
+#### A4 + I3 — Source-aware delivery router
+New module `src/services/subagent-delivery.ts`. Every completed sub-agent routes through a single entry point that picks its delivery path based on `SubAgentInfo.source`:
+- `implicit` (Main-Claude calling the SDK `Task` tool) → **no-op**, the parent stream already shows the result.
+- `user` (explicit user spawn) → **banner + final** to `parentChatId` in the originating chat.
+- `cron` (scheduled job) → **banner + final** to the `chatId` from the cron job's target.
+The banner format is fixed: `{icon} *{name}* {status} · {duration} · {input_tokens} in / {output_tokens} out` followed by the agent output. Status icons: ✅ completed, ⚠️ cancelled, ⏱️ timeout, ❌ error. Duration is human-formatted (`42s`, `3m 12s`). Token counts collapse at 1000 (`4.2k`).
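As an illustration of the banner format, here is a sketch of the three formatting steps. The function names are hypothetical, and the rounding rules (whole seconds, one decimal for thousands with a trailing `.0` dropped) are assumptions chosen to reproduce the examples `42s`, `3m 12s`, and `4.2k`.

```typescript
// Hypothetical sketch of banner formatting; rounding rules are assumptions.
type RunStatus = "completed" | "cancelled" | "timeout" | "error";

const STATUS_ICON: Record<RunStatus, string> = {
  completed: "✅",
  cancelled: "⚠️",
  timeout: "⏱️",
  error: "❌",
};

// 42000 -> "42s", 192000 -> "3m 12s"
function formatDuration(ms: number): string {
  const totalSec = Math.round(ms / 1000);
  const minutes = Math.floor(totalSec / 60);
  const seconds = totalSec % 60;
  return minutes > 0 ? `${minutes}m ${seconds}s` : `${seconds}s`;
}

// Below 1000 the count is shown as-is; from 1000 on it collapses to "k".
function formatTokens(n: number): string {
  if (n < 1000) return String(n);
  return `${(n / 1000).toFixed(1).replace(/\.0$/, "")}k`;
}

function formatBanner(
  name: string,
  status: RunStatus,
  durationMs: number,
  inputTokens: number,
  outputTokens: number,
): string {
  const tokens = `${formatTokens(inputTokens)} in / ${formatTokens(outputTokens)} out`;
  return `${STATUS_ICON[status]} *${name}* ${status} · ${formatDuration(durationMs)} · ${tokens}`;
}
```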
+Output chunking:
+- ≤3800 chars → single message `banner + body`
+- 3800–20000 chars → banner alone, then body chunks of 3800 chars each
+- \>20000 chars → banner + the body as a `.md` file upload (via `grammy`'s `InputFile`)
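The three size tiers can be sketched as a planning function that decides how a result is delivered before anything is sent. `planDelivery` and `DeliveryPlan` are hypothetical names; the sketch assumes the thresholds apply to the body length alone and that banner and body are joined with a blank line, neither of which the changelog states.

```typescript
// Hypothetical sketch of the output chunking tiers.
const SINGLE_LIMIT = 3800;
const FILE_THRESHOLD = 20000;

type DeliveryPlan =
  | { kind: "single"; text: string }
  | { kind: "chunks"; banner: string; chunks: string[] }
  | { kind: "file"; banner: string; fileBody: string };

function planDelivery(banner: string, body: string): DeliveryPlan {
  if (body.length <= SINGLE_LIMIT) {
    return { kind: "single", text: `${banner}\n\n${body}` };
  }
  if (body.length <= FILE_THRESHOLD) {
    const chunks: string[] = [];
    // Note: slicing by UTF-16 code units can split a surrogate pair at a chunk edge.
    for (let i = 0; i < body.length; i += SINGLE_LIMIT) {
      chunks.push(body.slice(i, i + SINGLE_LIMIT));
    }
    return { kind: "chunks", banner, chunks };
  }
  return { kind: "file", banner, fileBody: body };
}
```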
+The bot API is attached lazily at startup via `attachBotApi()` so `subagent-delivery.ts` stays free of a circular import on `index.ts`. Test hook `__setBotApiForTest()` lets Vitest inject a fake.
+#### New command: `/subagents visibility <auto|banner|silent>`
+Per-install persistent visibility setting, written to `~/.alvin-bot/sub-agents.json`. `silent` suppresses the delivery entirely — the result is still stored in the `activeAgents` map and pullable via `/subagents result <name>`. `auto` is the default and falls through to the source-based routing described above.
+#### B2 — Name-first addressing with automatic `#N` collision suffixes
+`/subagents cancel <name|id>` and `/subagents result <name|id>` now accept names, not just UUIDs. When a new spawn collides with an existing name, the resolver appends `#2`, `#3`, … using the smallest free index. Example: three parallel `review` spawns appear as `review`, `review#2`, `review#3` in `/subagents list`.
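The collision rule ("smallest free index") can be sketched in a few lines. `assignAgentName` is a hypothetical helper, not the resolver in `subagents.ts`.

```typescript
// Hypothetical sketch: pick `base`, or `base#N` with the smallest free N >= 2.
function assignAgentName(base: string, taken: ReadonlySet<string>): string {
  if (!taken.has(base)) return base;
  for (let n = 2; ; n++) {
    const candidate = `${base}#${n}`;
    if (!taken.has(candidate)) return candidate;
  }
}
```

With `review` and `review#3` already running, a new `review` spawn would be named `review#2`, because the gap is filled before a new index is opened.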
+Resolution order:
+1. Explicit `#N` suffix (e.g. `review#2`) → exact match wins, never falls through to ambiguity
+2. Base name with a single sibling → that sibling
+3. Base name with multiple siblings **and** `ambiguousAsList: true` opt-in → disambiguation reply listing all candidates
+4. Base name with multiple siblings, no opt-in → first sibling
+5. No name match → UUID prefix (back-compat)
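The resolution order above could look roughly like the following. This is a sketch under assumptions, not `findSubAgentByName`: `resolveAgent`, `AgentRef`, and `Resolution` are hypothetical names, siblings are assumed to be entries named `base` or `base#<digits>`, and an explicit `#N` query with no exact match is assumed to fall through to the UUID-prefix step.

```typescript
// Hypothetical sketch of name-first resolution with UUID-prefix fallback.
interface AgentRef {
  id: string;
  name: string;
}

type Resolution =
  | { kind: "match"; agent: AgentRef }
  | { kind: "ambiguous"; candidates: AgentRef[] }
  | { kind: "none" };

function resolveAgent(
  query: string,
  agents: AgentRef[],
  opts: { ambiguousAsList?: boolean } = {},
): Resolution {
  if (query === "") return { kind: "none" };

  if (/#\d+$/.test(query)) {
    // 1. Explicit #N suffix: the exact match always wins.
    const exact = agents.find((a) => a.name === query);
    if (exact) return { kind: "match", agent: exact };
  } else {
    // 2-4. Base name: collect `query` and `query#<digits>` siblings.
    const siblings = agents.filter(
      (a) =>
        a.name === query ||
        (a.name.startsWith(`${query}#`) && /^\d+$/.test(a.name.slice(query.length + 1))),
    );
    const first = siblings[0];
    if (siblings.length > 1 && opts.ambiguousAsList) {
      return { kind: "ambiguous", candidates: siblings };
    }
    if (first) return { kind: "match", agent: first };
  }

  // 5. No name match: fall back to a UUID prefix.
  const byId = agents.find((a) => a.id.startsWith(query));
  return byId ? { kind: "match", agent: byId } : { kind: "none" };
}
```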
+#### C3 — Parent inheritance
+Sub-agents now inherit `workingDir` (with `inheritCwd: false` opt-out), `CLAUDE.md` (via `settingSources: ["project"]`), and the registry's provider/model. Conversation history is **not** inherited — the sub-agent reads only its own prompt, which forces clean, self-describing spawn requests and keeps parallel agents from colliding on shared context.
+#### D4 — Priority-aware reject messages
+Pool is still strictly capped (no preemption), but the error message when it's full now depends on who holds the slots:
+- User spawn + background (cron/implicit) hold slots → message points at `/subagents list` so the user knows the pool isn't stuck on another interactive task
+- User spawn + other user spawns → suggests cancel-or-wait with command hints
+- Cron/implicit rejects → generic "limit reached" (those callers handle retry themselves)
+#### E2 — Shutdown notifications
+`cancelAllSubAgents(notify: true)` is now async and fires a delivery for each still-running agent before the process exits. Each notification is a synthetic `cancelled` result with the body `⚠️ Agent wurde durch Bot-Restart unterbrochen. Bitte neu triggern.` (English: "Agent was interrupted by a bot restart. Please trigger it again.") and routes through the normal I3 delivery path. The total delivery phase is capped at 5s so a hanging Telegram send can't block shutdown.
+The shutdown hook in `src/index.ts` now `await`s `cancelAllSubAgents(true)` before stopping the grammy bot and tearing down plugins.
+#### F2 — Depth cap (hard limit = 2)
+`SubAgentConfig.depth` is a new optional field (defaults to 0 = root). `spawnSubAgent` rejects any depth > 2 with a clear error. The depth shows in `/subagents list` as `d0` / `d1` / `d2` with 2-space indentation per level, so nested scatter-gather runs are visually nested.
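A sketch of the depth check and the list indentation. `assertDepthAllowed` and `renderListLine` are hypothetical helpers; rejecting negative or non-integer depths and the exact line layout are assumptions beyond what the changelog states (which only specifies the `> 2` rejection, the `dN` label, and two spaces per level).

```typescript
// Hypothetical sketch of the depth cap and nested list rendering.
const MAX_DEPTH = 2;

// Returns the validated depth (default 0 = root) or throws with a clear message.
function assertDepthAllowed(depth = 0): number {
  // Assumption: negative and non-integer depths are rejected as well.
  if (!Number.isInteger(depth) || depth < 0 || depth > MAX_DEPTH) {
    throw new Error(`Sub-agent depth ${depth} is outside the allowed range 0..${MAX_DEPTH}`);
  }
  return depth;
}

// Two spaces of indentation per level, followed by the d0/d1/d2 label.
function renderListLine(name: string, depth: number): string {
  return `${"  ".repeat(depth)}d${depth} ${name}`;
}
```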
+#### G1 — Toolset preset infrastructure
+New `SubAgentConfig.toolset` field with a single valid value `"full"`. Runtime validation rejects any other string. This is purely infrastructure for future `"readonly"` / `"research"` presets — no behavior change today, but adding a preset later is a one-line diff.
+#### H2 — Per-run token accounting in the banner
+Every completed sub-agent's banner carries the input/output token counts it actually consumed. No aggregation (H3) — that comes later with the SQLite migration. For now, you can see "this agent cost me 4.2k/2.1k" right next to the result.
+#### Tests
+67 passing Vitest tests across 12 files. New test files added for this release:
+- `test/claude-sdk-provider.test.ts` — auth probe + `isAuthErrorOutput` helper
+- `test/subagents-depth.test.ts` — depth cap (F2)
+- `test/subagents-inheritance.test.ts` — cwd inheritance (C3)
+- `test/subagents-toolset.test.ts` — toolset literal (G1)
+- `test/subagents-name-resolver.test.ts` — `findSubAgentByName` including regression for exact-match vs ambiguity
+- `test/subagents-commands.test.ts` — `cancelSubAgentByName`/`getSubAgentResultByName` helpers
+- `test/subagent-delivery.test.ts` — I3 delivery router (all 5 source/visibility paths)
+- `test/subagents-shutdown.test.ts` — E2 notify=true / notify=false + regression for shutdown double-delivery
+- `test/subagents-priority-reject.test.ts` — D4 priority-aware reject messages
+- `test/subagents-config.test.ts` — expanded with visibility config round-trip
+### 🖥 New CLI: `alvin-bot launchd install|uninstall|status` (macOS only)
+**Why this matters.** Claude Code 2.x stores the Max-subscription OAuth token in the macOS Keychain, service `"Claude Code-credentials"`. Accessing the token requires:
+1. A Keychain ACL that permits the `claude` binary (granted via the "Always Allow" dialog on first GUI invocation)
+2. An *unlocked* Keychain in the calling process's security context
+Processes started via SSH, pm2, or `nohup` run in a detached launchd session that does **not** inherit the GUI user's unlocked-Keychain state. Even a manual `security unlock-keychain -p '...'` only unlocks the current SSH session; the pm2 daemon, running in its own context, stays locked out. Result: the bot saw `Not logged in · Please run /login` on every sub-agent query, and the fix in 4.6.0's Phase 0 exposes that as a clean error instead of leaking it as chat text.
+**The fix**: run the bot as a **launchd user agent**. LaunchAgents run inside the GUI login session and inherit the unlocked Keychain automatically. No SSH dance, no pm2 drama, no manual unlocks on every restart.
+```
+alvin-bot launchd install    — Write ~/Library/LaunchAgents/com.alvinbot.app.plist,
+                                unload any existing instance, launchctl load -w.
+alvin-bot launchd uninstall  — Unload and rm the plist.
+alvin-bot launchd status     — Plist existence, PID from `launchctl list`,
+                                tail of ~/.alvin-bot/logs/alvin-bot.{out,err}.log.
+```
+Plist details:
+- `KeepAlive` → auto-restart on crash, not on successful exit
+- `RunAtLoad` → starts on login
+- `ThrottleInterval 10` → prevents rapid restart loops
+- `PATH` covers `~/.local/bin`, `/opt/homebrew/bin` (Apple Silicon), `/usr/local/bin` (Intel Homebrew)
+- stdout → `~/.alvin-bot/logs/alvin-bot.out.log`
+- stderr → `~/.alvin-bot/logs/alvin-bot.err.log`
+macOS users should migrate from `alvin-bot start` (pm2) to `alvin-bot launchd install`. pm2 still works and remains the Linux/Windows default.
+### 🐛 Bug fixes
+- **`ClaudeSDKProvider.isAvailable()` now actually probes authentication.** The old check only ran `claude --version`, which succeeds whether or not the CLI has a valid OAuth token. A locked-out CLI would be reported as available, and the `Not logged in` response would leak into the chat as a normal assistant message. New behavior: `claude --version` for the binary check, then `claude -p "ping"` to verify auth. If the output matches the "Not logged in" pattern, the provider reports `false` and the registry falls through to the next provider.
+- **`ClaudeSDKProvider.query()` surfaces `Not logged in` as an error chunk.** Even in code paths where `isAvailable()` returned stale cache, a runtime failure during the stream would emit `Not logged in · Please run /login` as text. The query loop now detects the auth pattern on the first text chunk and yields a typed `error` chunk with a clear "Run `claude login`" message, instead of pretending it's a normal response.
+- **`/subagents cancel|result <name#N>` now hits the exact entry.** Regression caught during the remote test: asking for `test-ping#2` returned the "Mehrdeutig — welchen meinst du?" ("Ambiguous, which one do you mean?") ambiguity reply instead of the specific `#2` entry, because `findSubAgentByName` checked base-name siblings before the exact-name match when `ambiguousAsList: true` was set. Explicit `#N` queries now always win.
+- **Shutdown double-delivery race fixed.** If the bot received SIGTERM while a sub-agent was mid-stream, Telegram saw two messages: a "completed · (empty output)" banner from `runSubAgent.finally()` (because the test generator exited gracefully after the abort), followed by the "cancelled · Bot-Restart" banner from `cancelAllSubAgents`. Fixed with a `delivered: boolean` flag on each `activeAgents` entry — whoever posts first sets it, the other skips.
+- **`providerKeyMap` alignment in `src/index.ts`.** The pre-flight provider-key warning used `gemini-2.5-flash` as the map key, but the registry registers Google Gemini under `google`. Users who set `PRIMARY_PROVIDER=google` never saw the "GOOGLE_API_KEY missing" warning. Fixed by canonical `google → GOOGLE_API_KEY`; legacy custom-model aliases stay for rollback safety.
+- **`cron.ts` ai-query triple-notification cleanup.** A single failed ai-query cron job was sending three legacy error messages (`slow-fox: cancelled — cancelled`, `AI-Query Error (slow-fox)`, `Cron Error (slow-fox)`) because the failure path fired `notifyCallback` in the inner `if`, the inner `catch`, and the outer `catch`. The I3 delivery router already posts the cancellation banner for ai-query jobs, so all three legacy notify calls are now skipped and ai-query errors propagate via the outer catch for bookkeeping only. Other job types (reminder, shell, http, message) keep the legacy notify path.
+- **`/subagents` now shows up in Telegram's command autocomplete.** The grammy handler was registered from v4.0.0 but `setMyCommands` never listed it, so users had to know the exact spelling. Added.
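The `delivered` flag from the shutdown double-delivery fix above can be illustrated with a small guard. `deliverOnce` and `ActiveEntry` are hypothetical names; resetting the flag when the send throws is an assumption, made so that a fallback path could still deliver.

```typescript
// Hypothetical sketch: whichever path posts first claims the entry, the other skips.
interface ActiveEntry {
  name: string;
  delivered: boolean;
}

// Returns true if this call performed the delivery, false if it was skipped.
function deliverOnce(entry: ActiveEntry, send: () => void): boolean {
  if (entry.delivered) return false;
  entry.delivered = true; // claim before sending so a competing path sees the flag
  try {
    send();
  } catch (err) {
    entry.delivered = false; // assumption: release the claim so a fallback can retry
    throw err;
  }
  return true;
}
```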
+### 📚 Documentation
+- New English-language handbook at `docs/HANDBOOK.md` — covers installation, architecture, all providers, the sub-agents system, cron jobs, platform adapters, security audit, and the web UI. Written to be readable standalone without cross-referencing the README.
+- README.md updated with a pointer to the handbook and the new `launchd` command.
 ## [4.5.1] — 2026-04-09
 ### 🐛 TUI Header Rendering Hotfix

package/README.md CHANGED

@@ -109,13 +109,29 @@ alvin-bot start
 That's it. The setup wizard validates everything:
 - ✅ Tests your AI provider key
-- ✅ Verifies your Telegram bot token
+- ✅ Verifies your Telegram bot token
 - ✅ Confirms the setup works before you start
 **Requires:** Node.js 18+ ([nodejs.org](https://nodejs.org)) · Telegram bot token ([@BotFather](https://t.me/BotFather)) · Your Telegram user ID ([@userinfobot](https://t.me/userinfobot))
 Free AI providers available — no credit card needed.
+### macOS: use `launchd` instead of pm2 (recommended)
+If you're on macOS and using Claude Code (Max subscription) as your provider, run the bot as a **LaunchAgent** — it inherits the GUI login session so the macOS Keychain stays unlocked and the Claude OAuth token just works without any manual `security unlock-keychain` dance:
+```bash
+alvin-bot launchd install    # writes ~/Library/LaunchAgents/com.alvinbot.app.plist and starts the agent
+alvin-bot launchd status     # show PID + recent stdout/stderr logs
+alvin-bot launchd uninstall  # unload + remove the plist
+```
+pm2 still works and remains the default on Linux/Windows, but on macOS with Claude Code, `launchd` is the only path that reliably keeps Keychain access across restarts.
+### 📖 Handbook
+For a full walkthrough of everything Alvin Bot can do — providers, sub-agents, cron jobs, plugins, MCP, security audit, web UI — read **[`docs/HANDBOOK.md`](docs/HANDBOOK.md)**.
 ### AI Providers
 | Provider | Cost | Best for |
@@ -436,7 +452,14 @@ alvin-bot tui       # Terminal chat UI ✨
 alvin-bot chat      # Alias for tui
 alvin-bot doctor    # Health check
 alvin-bot update    # Pull latest & rebuild
-alvin-bot start     # Start the bot
+alvin-bot start     # Start the bot (background via pm2)
+alvin-bot start -f  # Start in foreground
+alvin-bot stop      # Stop the bot
+alvin-bot launchd install    # macOS only: install as LaunchAgent
+alvin-bot launchd status     # macOS only: show LaunchAgent state
+alvin-bot launchd uninstall  # macOS only: remove LaunchAgent
+alvin-bot audit     # Security health check
+alvin-bot search    # Search assets/memories/skills
 alvin-bot version   # Show version
 ```