2ndbrain 2026.1.37 → 2026.2.2

@@ -19,7 +19,8 @@
19
19
  "mcp__dude__get_project_context",
20
20
  "mcp__dude__create_project",
21
21
  "mcp__dude__create_issue",
22
- "mcp__dude__update_issue"
22
+ "mcp__dude__update_issue",
23
+ "Bash(npx --version)"
23
24
  ]
24
25
  }
25
26
  }
@@ -0,0 +1,417 @@
1
+ # Performance Audit Report
2
+
3
+ **Project:** 2ndbrain v0.5.0
4
+ **Date:** 2026-02-01
5
+ **Scope:** Full source code review (~5,600 LOC across 18 JS files + 1 bash script)
6
+
7
+ ---
8
+
9
+ ## Executive Summary
10
+
11
+ 2ndbrain is a single-user personal assistant running on low-power hardware (e.g., Raspberry Pi 5). Performance requirements are modest -- the system processes one message at a time and serves a single web admin user. Most performance issues identified are relevant for long-running uptime (days/weeks), not peak throughput. The most impactful findings are synchronous file I/O blocking the event loop, sequential embedding processing, and missing caching for repeated database queries.
12
+
13
+ **Overall Assessment:** The performance profile is acceptable for the intended use case. The issues below are ordered by impact and should be addressed as the system scales or as uptime requirements increase.
14
+
15
+ | Priority | Count |
16
+ |----------|-------|
17
+ | High | 3 |
18
+ | Medium | 7 |
19
+ | Low | 6 |
20
+
21
+ ---
22
+
23
+ ## High Priority
24
+
25
+ ### PERF-01: Synchronous File I/O Blocks Event Loop
26
+
27
+ **File:** `src/attachments/store.js:102, 105`
28
+
29
+ ```javascript
30
+ fs.mkdirSync(absoluteDir, { recursive: true }); // line 102
31
+ fs.writeFileSync(absolutePath, fileBuffer); // line 105
32
+ ```
33
+
34
+ Attachment saving uses synchronous `mkdirSync` and `writeFileSync`. These block the entire Node.js event loop for the duration of the disk operation. On an SD card (common for Raspberry Pi), a large file write could block for hundreds of milliseconds, during which:
35
+
36
+ - Telegram long-polling cannot process new updates
37
+ - The web admin panel becomes unresponsive
38
+ - Typing indicator refreshes are delayed
39
+ - Rate limiter drain timers are delayed
40
+
41
+ Additional synchronous file operations that block during startup (acceptable but worth noting):
42
+ - `src/index.js:61-93` -- `setupRuntimeFiles()` uses `mkdirSync`, `copyFileSync`, `chmodSync`
43
+ - `src/config.js:15-23` -- `.env` migration and directory creation
44
+ - `src/mcp/config.js:26-27, 49, 69` -- MCP config file writes
45
+
46
+ **Impact:** Event loop stalls proportional to file size and disk speed.
47
+
48
+ **Recommendation:** Replace `mkdirSync`/`writeFileSync` with `fs.promises.mkdir`/`fs.promises.writeFile` in the attachment store. Startup file operations can remain synchronous since they run before the event loop serves requests.
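The recommendation above could look roughly like the following. This is a minimal sketch, not the store's actual code: the function name `saveAttachment` and its `dataDir`/`relativePath` parameters are hypothetical, and only the `fs.promises` calls are taken from the recommendation itself.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Hypothetical helper: writes an attachment without blocking the event loop.
// `dataDir` and `relativePath` are illustrative names, not the store's real API.
async function saveAttachment(dataDir, relativePath, fileBuffer) {
  const absolutePath = path.join(dataDir, relativePath);
  // With { recursive: true }, mkdir succeeds even if the directory already exists.
  await fs.mkdir(path.dirname(absolutePath), { recursive: true });
  await fs.writeFile(absolutePath, fileBuffer);
  return absolutePath;
}
```

While the two awaited calls are pending, the event loop stays free to handle Telegram updates, admin requests, and timers.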
49
+
50
+ ### PERF-02: Sequential Embedding Processing
51
+
52
+ **File:** `src/embeddings/worker.js:168-178`
53
+
54
+ ```javascript
55
+ for (const row of result.rows) {
56
+ try {
57
+ await this._processRow(row); // sequential
58
+ } catch (err) { ... }
59
+ }
60
+ ```
61
+
62
+ Each embedding is processed sequentially: fetch source text from DB, call OpenAI API, write vector back to DB. With OpenAI API latency of ~200-500ms per call and a batch size of 10, processing takes 2-5 seconds per batch with a 5-second poll interval.
63
+
64
+ For a backlog of 1,000 messages, embedding takes ~8-17 minutes. For 10,000 messages (e.g., after a model switch that nullifies all vectors), it takes ~1.5-3 hours.
65
+
66
+ **Impact:** Slow embedding generation after initial setup or model changes.
67
+
68
+ **Recommendation:** Process embeddings concurrently within each batch using `Promise.allSettled()` with a concurrency limit of 3-5. This would reduce per-batch time to ~400-1000ms:
69
+
70
+ ```javascript
71
+ const CONCURRENCY = 5;
72
+ for (let i = 0; i < result.rows.length; i += CONCURRENCY) {
73
+ const batch = result.rows.slice(i, i + CONCURRENCY);
74
+ await Promise.allSettled(batch.map(row => this._processRow(row)));
75
+ }
76
+ ```
77
+
78
+ ### PERF-03: Per-Log Database INSERT
79
+
80
+ **File:** `src/logging.js:42-52`
81
+
82
+ ```javascript
83
+ if (this._pool) {
84
+ try {
85
+ await this._pool.query(
86
+ 'INSERT INTO system_logs (level, source, content) VALUES ($1, $2, $3)',
87
+ [level, source, content],
88
+ );
89
+ } catch (err) { ... }
90
+ }
91
+ ```
92
+
93
+ Every log statement issues a separate `INSERT` query to PostgreSQL. The logger methods (`debug`, `info`, `warn`, `error`) return the promise from `_log`, but callers do not await them -- meaning log writes are fire-and-forget but still consume database connections from the pool.
94
+
95
+ During heavy logging (e.g., debug level with embedding worker processing), this could saturate the connection pool (default 10 connections in pg) and delay actual application queries.
96
+
97
+ **Impact:** Database connection pool contention under heavy logging. Each log adds ~1-5ms of database overhead.
98
+
99
+ **Recommendation:** Implement batched log writing -- queue log entries in memory and flush to the database in a single multi-row INSERT every N seconds or every M entries:
100
+
101
+ ```sql
102
+ -- Example: batch insert every 5 seconds or 50 entries
103
+ INSERT INTO system_logs (level, source, content) VALUES
104
+ ($1, $2, $3), ($4, $5, $6), ...
105
+ ```
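One way to produce the multi-row statement above is sketched here. It is an illustration under assumptions: `buildBatchInsert` and `LogBuffer` are hypothetical names, the `query` callback stands in for something like `(text, values) => pool.query(text, values)`, and the timer-based flush is left to the caller.

```javascript
// Builds one multi-row INSERT for queued log entries.
// Each entry is { level, source, content }, matching the columns shown above.
function buildBatchInsert(entries) {
  const values = [];
  const tuples = entries.map((entry, i) => {
    values.push(entry.level, entry.source, entry.content);
    const base = i * 3; // three placeholders per row
    return `($${base + 1}, $${base + 2}, $${base + 3})`;
  });
  return {
    text: `INSERT INTO system_logs (level, source, content) VALUES ${tuples.join(', ')}`,
    values,
  };
}

// Hypothetical buffer: flushes once `maxEntries` is reached.
// A periodic timer could call flush() as well.
class LogBuffer {
  constructor(query, maxEntries = 50) {
    this._query = query; // e.g. (text, values) => pool.query(text, values)
    this._maxEntries = maxEntries;
    this._entries = [];
  }

  add(entry) {
    this._entries.push(entry);
    return this._entries.length >= this._maxEntries ? this.flush() : Promise.resolve();
  }

  flush() {
    if (this._entries.length === 0) return Promise.resolve();
    const batch = this._entries;
    this._entries = []; // swap first so entries added during the query are kept
    const { text, values } = buildBatchInsert(batch);
    return Promise.resolve(this._query(text, values));
  }
}
```

With this shape, 50 queued log lines cost one pool connection and one round trip instead of 50.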
106
+
107
+ ---
108
+
109
+ ## Medium Priority
110
+
111
+ ### PERF-04: Array.reverse() on Every History Fetch
112
+
113
+ **Files:** `src/claude/conversation.js:67`, `src/hooks/lifecycle.js:256`
114
+
115
+ ```javascript
116
+ // conversation.js:58-67
117
+ const result = await this.db.query(
118
+ `SELECT ... FROM conversation_messages ORDER BY created_at DESC LIMIT $1`,
119
+ [effectiveLimit],
120
+ );
121
+ return result.rows.reverse();
122
+ ```
123
+
124
+ The query sorts `DESC` to get the N most recent rows, then reverses the array in JavaScript to restore chronological order. `Array.prototype.reverse()` works in place, so no copy is allocated, but with the default threshold of 100 messages it still makes an O(n) pass over 100 elements on every call.
125
+
126
+ The same pattern appears in `lifecycle.js:250-256` where 20 rows are fetched DESC and reversed.
127
+
128
+ **Impact:** Minor -- one O(n) in-place reversal per call. Negligible at current sizes but avoidable.
129
+
130
+ **Recommendation:** Use a subquery to get the correct order from SQL:
131
+
132
+ ```sql
133
+ SELECT * FROM (
134
+ SELECT ... FROM conversation_messages ORDER BY created_at DESC LIMIT $1
135
+ ) sub ORDER BY created_at ASC
136
+ ```
137
+
138
+ ### PERF-05: No Caching for Dashboard and Health Queries
139
+
140
+ **File:** `src/web/server.js:172-228, 289-319`
141
+
142
+ The dashboard handler issues 4 database queries per page load:
143
+ 1. `COUNT(*) FROM conversation_messages` (line 185)
144
+ 2. `SELECT ... FROM conversation_messages ORDER BY ... LIMIT 10` (line 197)
145
+ 3. `SELECT session_id ... LIMIT 1` (line 208)
146
+ 4. `SELECT ... FROM system_logs WHERE level = 'error' LIMIT 5` (line 217)
147
+
148
+ The health endpoint issues `SELECT 1` on every request (line 302).
149
+
150
+ The database page handler issues 4 queries including `pg_total_relation_size` (lines 358-369), which scans system catalogs.
151
+
152
+ **Impact:** Unnecessary database load if the dashboard is auto-refreshed or monitored.
153
+
154
+ **Recommendation:** Add simple in-memory TTL caching (30-60 seconds) for dashboard stats and health checks. Example:
155
+
156
+ ```javascript
157
+ class Cache {
158
+ constructor(ttlMs = 30000) { ... }
159
+ get(key) { ... }
160
+ set(key, value) { ... }
161
+ }
162
+ ```
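The stub above could be filled in along these lines. This is a sketch under assumptions: the class name `TtlCache`, the injectable `now` clock, and the `getOrLoad` helper are illustrative additions and not part of the audited code.

```javascript
// Minimal in-memory TTL cache. `now` is injectable so expiry can be tested
// without waiting; it defaults to Date.now.
class TtlCache {
  constructor(ttlMs = 30000, now = Date.now) {
    this._ttlMs = ttlMs;
    this._now = now;
    this._entries = new Map();
  }

  get(key) {
    const entry = this._entries.get(key);
    if (!entry) return undefined;
    if (this._now() >= entry.expiresAt) {
      this._entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this._entries.set(key, { value, expiresAt: this._now() + this._ttlMs });
  }

  // Returns the cached value, or awaits `load()` and caches its result.
  async getOrLoad(key, load) {
    const cached = this.get(key);
    if (cached !== undefined) return cached;
    const value = await load();
    this.set(key, value);
    return value;
  }
}
```

A dashboard handler could then wrap its count query as `cache.getOrLoad('messageCount', () => db.query(...))`, where `cache` and `db` are whatever instances the handler already holds.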
163
+
164
+ ### PERF-06: Array.shift() in Rate Limiter Hot Path
165
+
166
+ **File:** `src/rate-limiter.js:30-32`
167
+
168
+ ```javascript
169
+ while (this._timestamps.length > 0 && this._timestamps[0] <= cutoff) {
170
+ this._timestamps.shift(); // O(n) per call
171
+ }
172
+ ```
173
+
174
+ `Array.shift()` is O(n) because it copies all remaining elements forward. With `maxPerMinute` of 10-30, the array is small and the cost is negligible. However, if rate limits are increased significantly, this becomes a hot path.
175
+
176
+ **Impact:** Negligible at current scale. O(n^2) total cost over a sliding window cycle.
177
+
178
+ **Recommendation:** Track the window start index instead of shifting, and reset the array when the index passes halfway:
179
+
180
+ ```javascript
181
+ _prune() {
182
+ const cutoff = Date.now() - WINDOW_MS;
183
+ while (this._startIdx < this._timestamps.length && this._timestamps[this._startIdx] <= cutoff) {
184
+ this._startIdx++;
185
+ }
186
+ if (this._startIdx > this._timestamps.length / 2) {
187
+ this._timestamps = this._timestamps.slice(this._startIdx);
188
+ this._startIdx = 0;
189
+ }
190
+ }
191
+ ```
192
+
193
+ ### PERF-07: No Connection Pool Configuration
194
+
195
+ **File:** `src/db/pool.js:6-8`
196
+
197
+ ```javascript
198
+ const pool = new Pool({
199
+ connectionString: config.DATABASE_URL,
200
+ });
201
+ ```
202
+
203
+ The pg pool uses default settings: `max: 10` connections, `idleTimeoutMillis: 10000`, `connectionTimeoutMillis: 0` (infinite). For a single-user bot on a Raspberry Pi:
204
+
205
+ - 10 connections is likely excessive for PostgreSQL's memory overhead (~10MB each)
206
+ - Infinite connection timeout means a query will wait forever if the pool is exhausted
207
+ - No `statement_timeout` to catch runaway queries
208
+
209
+ **Impact:** Excessive memory usage on constrained hardware; potential hangs on pool exhaustion.
210
+
211
+ **Recommendation:** Configure the pool explicitly:
212
+
213
+ ```javascript
214
+ const pool = new Pool({
215
+ connectionString: config.DATABASE_URL,
216
+ max: 5,
217
+ idleTimeoutMillis: 30000,
218
+ connectionTimeoutMillis: 5000,
219
+ statement_timeout: 30000,
220
+ });
221
+ ```
222
+
223
+ ### PERF-08: `execSync` Blocks Event Loop During Startup
224
+
225
+ **File:** `src/index.js:103`
226
+
227
+ ```javascript
228
+ const version = execSync('claude --version', {
229
+ timeout: 10_000,
230
+ encoding: 'utf-8',
231
+ }).trim();
232
+ ```
233
+
234
+ `execSync` blocks the entire event loop for up to 10 seconds. During startup this is less critical since no requests are being served, but if `claude` is slow to respond (e.g., network lookup, first-time `npx` download), the web admin panel won't start until this completes.
235
+
236
+ **Impact:** Startup delay of up to 10 seconds if `claude --version` is slow.
237
+
238
+ **Recommendation:** Use async `execFile` or `spawn` with a promise wrapper. This allows the web server to start serving the settings page in parallel.
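A promise-based version check might look like the sketch below. The helper name `getCliVersion` and its choice to return `null` on failure are assumptions for illustration; `execFile` and `promisify` are standard Node.js APIs.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Resolves with the trimmed stdout of `<command> --version`,
// or null if the command is missing, fails, or exceeds the timeout.
async function getCliVersion(command, timeoutMs = 10_000) {
  try {
    const { stdout } = await execFileAsync(command, ['--version'], {
      timeout: timeoutMs,
      encoding: 'utf-8',
    });
    return stdout.trim();
  } catch (err) {
    return null; // callers decide how to report a missing or slow CLI
  }
}
```

Because the call is awaited rather than blocking, startup code can begin listening on the web port first and await `getCliVersion('claude')` afterwards.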
239
+
240
+ ### PERF-09: String Concatenation in HTTP Response Handlers
241
+
242
+ **File:** `src/mcp/embed-server.js:48-49`
243
+
244
+ ```javascript
245
+ let data = '';
246
+ res.on('data', (chunk) => { data += chunk; });
247
+ ```
248
+
249
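+ Appending each chunk to a string creates intermediate strings that must be garbage collected. For typical embedding API responses (~5-20KB), this is negligible. For larger responses, collecting the chunks in an array and combining them once at the end would be more efficient. If no encoding is set on the response stream, each chunk is also decoded separately, so a multi-byte UTF-8 character split across two chunks can be corrupted.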
250
+
251
+ `src/telegram/bot.js` handles the equivalent API call response correctly, using `Buffer.concat` (line 420).
252
+
253
+ **Impact:** Minor -- only affects embedding API responses.
254
+
255
+ **Recommendation:** Collect the chunks in an array and decode them once with `Buffer.concat`, as `bot.js` does:
256
+
257
+ ```javascript
258
+ const chunks = [];
259
+ res.on('data', (chunk) => chunks.push(chunk));
260
+ res.on('end', () => {
261
+ const data = Buffer.concat(chunks).toString('utf-8');
262
+ ...
263
+ });
264
+ ```
265
+
266
+ ### PERF-10: Unbounded Rate Limiter Queue
267
+
268
+ **File:** `src/rate-limiter.js:87-89`
269
+
270
+ ```javascript
271
+ return new Promise((resolve) => {
272
+ this._queue.push({ resolve });
273
+ this._scheduleDrain();
274
+ });
275
+ ```
276
+
277
+ The rate limiter queue grows without bound. If messages arrive faster than the rate limit allows, the queue accumulates promises. Each queued promise holds a reference to its closure, preventing garbage collection.
278
+
279
+ At 10 calls/minute for Claude, a burst of 100 messages would queue 90 promises, with the last one waiting about 9 minutes. For Telegram at 30/minute, the queue drains faster but is still unbounded.
280
+
281
+ **Impact:** Memory growth during sustained bursts. Each queued item is small (~100 bytes), so 100 items is ~10KB -- negligible. But in degenerate cases (e.g., bot added to a group chat receiving hundreds of messages), the queue could grow significantly.
282
+
283
+ **Recommendation:** Add a maximum queue depth with a rejection behavior:
284
+
285
+ ```javascript
286
+ if (this._queue.length >= MAX_QUEUE_DEPTH) {
287
+ return Promise.reject(new Error('Rate limit queue full'));
288
+ }
289
+ ```
290
+
291
+ ---
292
+
293
+ ## Low Priority
294
+
295
+ ### PERF-11: nextCronDate Scans Up to 527,040 Minutes
296
+
297
+ **File:** `src/scheduler/worker.js:26-48`
298
+
299
+ ```javascript
300
+ for (let i = 0; i < MAX_SCAN_MINUTES; i++) { // MAX_SCAN_MINUTES = 527,040
301
+ if (matcher.match(candidate)) {
302
+ return new Date(candidate.getTime());
303
+ }
304
+ candidate.setMinutes(candidate.getMinutes() + 1);
305
+ }
306
+ ```
307
+
308
+ For each scheduled task, the worst case scans ~366 days of minutes (527,040 iterations) to find the next match. The loop runs once per minute until the next match, so an expression that fires at least hourly finishes within 60 iterations and a daily job within 1,440. A pathological expression like `0 0 31 2 *` (Feb 31), however, scans all 527,040 minutes before returning null.
309
+
310
+ Additionally, a new `cron.schedule()` task object is created just to access the matcher (lines 28-31), which is wasteful.
311
+
312
+ **Impact:** Occasional CPU spike during scheduler initialization if tasks have unusual expressions.
313
+
314
+ **Recommendation:** Consider using a dedicated cron-parsing library that computes next-run analytically rather than by minute-scanning.
315
+
316
+ ### PERF-12: Dashboard Queries Not Parallelized
317
+
318
+ **File:** `src/web/server.js:184-226`
319
+
320
+ The dashboard handler runs 4 sequential database queries with `await` between each. These queries are independent and could run concurrently:
321
+
322
+ ```javascript
323
+ const [countRes, recent, session, errors] = await Promise.all([
324
+ this._db.query('SELECT COUNT(*)::int ...'),
325
+ this._db.query('SELECT ... FROM conversation_messages ... LIMIT 10'),
326
+ this._db.query('SELECT session_id ... LIMIT 1'),
327
+ this._db.query('SELECT ... FROM system_logs ... LIMIT 5'),
328
+ ]);
329
+ ```
330
+
331
+ **Impact:** Dashboard page load takes ~4x the single-query latency instead of ~1x.
332
+
333
+ **Recommendation:** Use `Promise.all` or `Promise.allSettled` for independent queries.
334
+
335
+ ### PERF-13: Lifecycle Hook Conversation History Fetch is Redundant
336
+
337
+ **File:** `src/hooks/lifecycle.js:248-264`
338
+
339
+ The `on_pre_claude` hook fetches the 20 most recent conversation messages and attaches them to the context as `conversationContext`. However, this context doesn't appear to be used by the Claude bridge invocation at `src/index.js:198-202` -- the bridge only receives `text`, `sessionId`, and `systemPrompt`. The conversation context is fetched and then discarded.
340
+
341
+ **Impact:** Unnecessary database query and memory allocation on every message.
342
+
343
+ **Recommendation:** Remove the conversation history fetch from the hook if it's not consumed downstream. If it's intended for future use, add a feature flag to skip it.
344
+
345
+ ### PERF-14: Compaction Loads All Old Messages Into Memory
346
+
347
+ **File:** `src/claude/conversation.js:146-161`
348
+
349
+ ```javascript
350
+ const oldMessages = await this.db.query(
351
+ `SELECT id, created_at, role, content FROM conversation_messages
352
+ ORDER BY created_at ASC LIMIT $1`,
353
+ [removeCount],
354
+ );
355
+ ```
356
+
357
+ With a threshold of 100 and `keepRecent` of 20, compaction loads up to 80 messages into memory. With an average message size of ~500 bytes, this is ~40KB -- negligible. However, if the threshold is increased to 1000 or messages are very long, this could consume significant memory.
358
+
359
+ The messages are then formatted into a single string and sent to Claude for summarization (line 159-161), which could produce a very large prompt.
360
+
361
+ **Impact:** Memory spike proportional to `HISTORY_COMPACT_THRESHOLD * avg_message_size`.
362
+
363
+ **Recommendation:** For large thresholds, consider streaming or chunked summarization.
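Chunked summarization could be sketched as follows. Both function names and the `summarize(previousSummary, chunk)` callback are hypothetical; the sketch bounds prompt size by character count, which is only a rough proxy for tokens.

```javascript
// Splits messages into chunks whose combined content length stays within `maxChars`.
// A single message longer than `maxChars` gets a chunk of its own.
function chunkMessages(messages, maxChars) {
  const chunks = [];
  let current = [];
  let size = 0;
  for (const message of messages) {
    const length = message.content.length;
    if (current.length > 0 && size + length > maxChars) {
      chunks.push(current);
      current = [];
      size = 0;
    }
    current.push(message);
    size += length;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Summarizes chunk by chunk, feeding each partial summary into the next call.
// `summarize(previousSummary, chunk)` is a hypothetical async callback.
async function summarizeInChunks(messages, maxChars, summarize) {
  let summary = '';
  for (const chunk of chunkMessages(messages, maxChars)) {
    summary = await summarize(summary, chunk);
  }
  return summary;
}
```

Each call to the summarizer then sees at most one chunk plus the running summary, so prompt size no longer grows with `HISTORY_COMPACT_THRESHOLD`.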
364
+
365
+ ### PERF-15: Typing Indicator Interval Overhead
366
+
367
+ **File:** `src/telegram/bot.js:331-344`
368
+
369
+ Each active typing indicator creates a `setInterval` that fires every 4 seconds, making an HTTPS request to Telegram. While only one conversation is active at a time (single user), the interval continues even if the Claude response is arriving quickly.
370
+
371
+ **Impact:** Unnecessary network traffic (~1 request/4 seconds during processing). Negligible bandwidth but adds noise to logs and consumes a connection slot.
372
+
373
+ **Recommendation:** Consider using a single pending-typing flag checked by the poll loop instead of per-chat intervals.
374
+
375
+ ### PERF-16: Web Admin HTML Templates Regenerated Per Request
376
+
377
+ **File:** `src/web/server.js:595-1255`
378
+
379
+ All HTML templates (`layoutHTML`, `dashboardHTML`, `settingsHTML`, `logsHTML`, `databaseHTML`) are generated from scratch on every request via string concatenation. The CSS (~280 lines) is inlined in every page response.
380
+
381
+ For a single-user admin panel, this is acceptable. The CSS is ~5KB and the templates are simple string concatenation.
382
+
383
+ **Impact:** Negligible for single-user access. ~5KB overhead per page from repeated CSS.
384
+
385
+ **Recommendation:** Optional: Extract CSS to a static file served with cache headers. This would also enable browser caching.
386
+
387
+ ---
388
+
389
+ ## Architecture Notes
390
+
391
+ ### What Works Well
392
+
393
+ 1. **Single-threaded simplicity** -- The application avoids concurrency complexity by processing one message at a time through the Claude bridge. This eliminates most race condition classes.
394
+
395
+ 2. **Event-driven lifecycle hooks** -- The hook pipeline cleanly separates concerns without adding overhead. Sequential handler execution prevents ordering bugs.
396
+
397
+ 3. **Rate limiter design** -- The sliding-window rate limiter with queuing is an effective pattern that backpressures callers without dropping requests.
398
+
399
+ 4. **Minimal dependencies** -- Only 5 runtime dependencies (express, pg, node-cron, dotenv, open), which minimizes supply chain risk and keeps the bundle small.
400
+
401
+ 5. **Background workers with overlap guards** -- Both the embedding worker and scheduler worker use `_processing` flags to prevent overlapping iterations, which is appropriate for the polling pattern.
402
+
403
+ ### Scaling Considerations
404
+
405
+ If the application were to scale beyond single-user:
406
+
407
+ 1. **Database queries should be indexed** -- The `conversation_messages` table is queried by `created_at DESC` frequently. Ensure a B-tree index exists on `created_at`.
408
+
409
+ 2. **Connection pooling** would need to be properly sized per concurrent user.
410
+
411
+ 3. **The Claude bridge is single-process** -- Only one Claude subprocess runs at a time. Multiple users would need a queue or multiple bridge instances.
412
+
413
+ 4. **The scheduler checks `claudeBridge.isActive()`** before running tasks (line 183). This means scheduled tasks are delayed while any user message is being processed. For multi-user, the scheduler would need its own bridge instance.
414
+
415
+ ---
416
+
417
+ *End of Performance Audit Report*
package/README.md CHANGED
@@ -1,70 +1,123 @@
1
1
  # 2ndbrain
2
2
 
3
- An always-on Node.js npx service that bridges Telegram messges to Claude with
4
- * persistent conversation history (logs)
5
- * receive text messages w/ attachments
6
- * slash commands
7
- * send text message responses w/ "Typing" indicator
8
- * whitelist users that it will interact with (multi-layered)
9
- * can run local commands, access local postgres (mcp) (whitelisted)
3
+ A personal, always-on AI assistant that lives on your local network. **2ndbrain** bridges Telegram to Claude via a Node.js service, giving you a private conversational AI with persistent memory, a knowledge platform, and full access to local tools — all from your phone.
4
+
5
+ You set it up on a device on your LAN (e.g. a Raspberry Pi 5), and you — and only you — interact with it by chatting over Telegram.
6
+
7
+
8
+ ## How It Works
9
+
10
+ ```
11
+ Telegram ──long-polling──▸ 2ndbrain ──subprocess──▸ Claude CLI
12
+ │ │
13
+ │ MCP tools
14
+ │ (postgres, embeddings,
15
+ │ shell commands)
16
+
17
+ PostgreSQL
18
+ (history, knowledge,
19
+ projects, journal,
20
+ embeddings)
21
+ ```
22
+
23
+ 1. Messages arrive from Telegram via long-polling (no public URL required)
24
+ 2. Slash commands are routed to built-in handlers; everything else goes to Claude
25
+ 3. Claude is spawned as a subprocess with access to MCP tools (database, semantic search, whitelisted shell commands)
26
+ 4. Responses stream back through Telegram with a typing indicator
27
+ 5. All conversations are persisted in PostgreSQL for recall and search
28
+
29
+
30
+ ## Integrations
31
+
32
+ | Integration | Role |
33
+ |---|---|
34
+ | **Telegram Bot API** | Messaging interface — long-polling, attachments (photos, docs, audio, video, voice), typing indicators |
35
+ | **Claude CLI** | Conversational AI — spawned as subprocess with streaming JSON, thinking mode, session continuity |
36
+ | **PostgreSQL + pgvector** | Persistent storage — conversation history, knowledge graph, projects, journal, vector embeddings with HNSW indexing |
37
+ | **Model Context Protocol (MCP)** | Tool framework — gives Claude direct access to the database (`pg` server) and a custom `embed_query` tool for semantic search |
38
+ | **OpenAI Embeddings API** | Vector embeddings — optional provider for semantic search (configurable model and dimensions) |
39
+ | **Express** | Web admin dashboard — settings, environment config, activity logs (LAN-only) |
40
+
41
+
42
+ ## Features
43
+
44
+ ### Conversation
45
+ - Persistent conversation history with session tracking
46
+ - Auto-compaction when history exceeds a configurable threshold
47
+ - Rate limiting for both Claude calls and Telegram sends
48
+ - Attachment storage (photos, documents, audio, video, voice) in `~/data`
49
+
50
+ ### Skills (Claude-managed via MCP)
51
+
52
+ | Skill | Description |
53
+ |---|---|
54
+ | **Knowledge Graph** | Entities and relationships with full-text search and embedding queue |
55
+ | **Journal** | Timestamped personal notes with semantic search |
56
+ | **Project Management** | Projects with specifications and issues, completion tracking |
57
+ | **Scheduler** | Recurring tasks via cron expressions with timezone support |
58
+ | **Recall** | Unified semantic search across journal, knowledge, projects, and history |
59
+ | **System Ops** | Read-only diagnostics — memory, disk, uptime, database status, logs |
60
+
61
+ ### Slash Commands
62
+
63
+ | Command | Action |
64
+ |---|---|
65
+ | `/status` | Current system status |
66
+ | `/health` | Health check across all subsystems |
67
+ | `/restart` | Restart the service |
68
+ | `/reboot` | Reboot the host |
69
+ | `/stop` | Graceful shutdown |
70
+ | `/new` | Start a new conversation session |
71
+ | `/help` | List available commands |
72
+
73
+ ### Security
74
+ - Whitelisted Telegram users (multi-layered)
75
+ - Whitelisted MCP tools and shell commands
76
+ - Configurable file-edit path restrictions
77
+ - LAN-only web admin interface
78
+
79
+ ### Lifecycle Hooks
80
+ Custom scripts that run at startup, shutdown, pre/post Claude invocation, and on errors.
10
81
 
11
82
 
12
83
  ## Setup
13
84
 
14
- * Start the `npx ...` runner on boot
15
- * Ensure that local postgres & MCP are ready
16
- * Ensure that claude-cli is ready
85
+ 1. Ensure **PostgreSQL** is running with the `pgvector` extension
86
+ 2. Ensure **claude-cli** is installed and configured
87
+ 3. Create a `.env` file at `~/.2ndbrain/.env` (see Configuration below)
88
+ 4. Start the service: `npx 2ndbrain`
89
+ 5. (Optional) Configure to start on boot via systemd or similar
17
90
 
18
91
 
19
- ## Vision
92
+ ## Configuration
20
93
 
21
- * You setup `2ndbrain` on a computer on your LAN (e.g. rapsberry pi 5)
22
- * You, and only you, can access with it by chatting over Telegram
23
- * **2ndbrain** uses Claude + local MCP tools to do stuff and respond to you
24
- * Web server interface
25
- * Setup wizard
26
- * Adjust settings & environment variables
27
- * View activity logs
28
- * GPIO interaction
29
- * Auto-compact history
30
- * Errors get pushed to the user
31
- * Graceful shutdown/restart
32
- * Rate-limiting of Claude and Telegram
33
- * Store attachments in `~/data`
34
- * Vector embeddings of db records
94
+ All configuration lives in `~/.2ndbrain/.env`:
35
95
 
96
+ | Category | Key Variables |
97
+ |---|---|
98
+ | **Required** | `TELEGRAM_BOT_TOKEN`, `TELEGRAM_ALLOWED_USERS`, `DATABASE_URL` |
99
+ | **Claude** | `CLAUDE_MODEL`, `CLAUDE_THINKING`, `CLAUDE_TIMEOUT`, `CLAUDE_MAX_BUDGET` |
100
+ | **MCP** | `MCP_CONFIG_PATH`, `MCP_TOOLS_WHITELIST`, `COMMANDS_WHITELIST` |
101
+ | **Embeddings** | `EMBEDDING_PROVIDER`, `EMBEDDING_API_KEY`, `EMBEDDING_MODEL`, `EMBEDDING_DIMENSIONS` |
102
+ | **Rate Limits** | `RATE_LIMIT_CLAUDE` (default 10/min), `RATE_LIMIT_TELEGRAM` (default 30/min) |
103
+ | **Web Admin** | `WEB_PORT`, `WEB_BIND`, `AUTO_OPEN_BROWSER` |
104
+ | **Storage** | `DATA_DIR` (default `~/data`) |
105
+ | **Conversation** | `HISTORY_COMPACT_THRESHOLD` (default 100) |
106
+ | **Security** | `FILE_EDIT_PATHS` |
107
+ | **Logging** | `LOG_LEVEL` |
36
108
 
37
- ## Slashes
38
-
39
- Enter slash commands in Telegram messages to perform tasks
40
-
41
- `/status`
42
- `/health`
43
- `/restart`
44
- `/reboot`
45
- `/stop`
46
-
47
109
 
48
110
  ## Data Schema
49
111
 
50
- * Projects(id, created, updated, name)
51
- * Specifications(id, created, updated, project_id, note)
52
- * Issues(id, created, updated, note, completed)
53
- * _knowledge_graph
54
- * Nodes(id, created, updated, name, note)
55
- * Edges(id, created, updated, node1_id, node2_id, name)
56
- * Journal(id, created, updated, note)
57
- * History(id, created, updated, user_id, message_id, content)
58
- * Logs(id, timestamp, content, level)
59
- * Embeddings(id, created, updated, entity_type, vector)
60
-
61
-
62
- ## Claude Stuff
63
-
64
- Skills <TBD>
65
- Hooks <TBD>
66
-
67
-
68
- ## Caveats
69
-
70
- * Run Claude w/ top model, thinking, ?accept edits?
112
+ - **Projects** (id, created, updated, name)
113
+ - Specifications (id, created, updated, project_id, note)
114
+ - Issues (id, created, updated, note, completed)
115
+ - **Knowledge Graph**
116
+ - Nodes (id, created, updated, name, note)
117
+ - Edges (id, created, updated, node1_id, node2_id, name)
118
+ - **Journal** (id, created, updated, note)
119
+ - **Conversation Messages** (id, created, updated, user_id, message_id, content, session_id)
120
+ - **System Logs** (id, timestamp, content, level)
121
+ - **Attachments** (id, created, updated, file_path, metadata)
122
+ - **Scheduled Tasks** (id, cron_expression, timezone, next_run, error tracking)
123
+ - **Embeddings** (id, created, updated, entity_type, vector)
@@ -0,0 +1,413 @@
1
+ # Security & Reliability Audit Report
2
+
3
+ **Project:** 2ndbrain v0.5.0
4
+ **Date:** 2026-02-01
5
+ **Scope:** Full source code review (~5,600 LOC across 18 JS files + 1 bash script)
6
+
7
+ ---
8
+
9
+ ## Executive Summary
10
+
11
+ 2ndbrain is a Node.js service bridging Telegram to Claude CLI, with a PostgreSQL backend and an Express-based admin panel. The architecture follows a defense-in-depth approach with Telegram user whitelisting, command whitelisting, and rate limiting. However, several gaps remain -- the most critical being the unauthenticated web admin panel that can modify all credentials and system configuration.
12
+
13
+ | Severity | Count |
14
+ |----------|-------|
15
+ | Critical | 2 |
16
+ | High | 6 |
17
+ | Medium | 8 |
18
+ | Low | 6 |
19
+
20
+ ---
21
+
22
+ ## Critical
23
+
24
+ ### SEC-01: Unauthenticated Web Admin Panel
25
+
26
+ **Files:** `src/web/server.js:124-131`
27
+
28
+ All web admin routes are served without any authentication:
29
+
30
+ ```
31
+ app.get('/', ... _handleDashboard)
32
+ app.get('/settings', ... _handleSettings)
33
+ app.post('/settings', ... _handleSaveSettings)
34
+ app.post('/database/migrate', ... _handleRunMigrations)
35
+ ```
36
+
37
+ The settings page allows reading masked versions of, and writing new values for: `TELEGRAM_BOT_TOKEN`, `DATABASE_URL`, `EMBEDDING_API_KEY`, and all other configuration. The database page allows running arbitrary schema migrations.
38
+
39
+ While the default bind address is `127.0.0.1`, nothing prevents a user from setting `WEB_BIND=0.0.0.0` (there is even a UI field for it at `src/web/server.js:76`), which exposes the entire admin panel to the network.
40
+
41
+ **Impact:** Full account takeover. Any client that can reach the panel -- a local process by default, or anyone on the network when `WEB_BIND` is `0.0.0.0` -- can replace the Telegram bot token, database URL, or embedding API key with attacker-controlled values.
42
+
43
+ **Recommendation:** Add authentication to the web admin panel (token-based, password, or at minimum an admin secret in the `.env`). If the panel must remain open, hard-enforce `127.0.0.1` binding and do not expose it as a configurable option.
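A token check along the lines of the recommendation could be sketched as below. It is an illustration only: the `requireAdminToken` name, an `ADMIN_TOKEN` setting, and the bearer-header scheme are assumptions, and a browser-facing panel would more likely need a login form or session cookie on top of this.

```javascript
const { createHash, timingSafeEqual } = require('node:crypto');

// Express-style middleware factory. The admin token and the bearer scheme are
// assumptions for illustration; the project defines no such setting today.
function requireAdminToken(expectedToken) {
  const expected = createHash('sha256').update(expectedToken).digest();
  return function adminAuth(req, res, next) {
    const header = req.headers.authorization || '';
    const presented = header.startsWith('Bearer ') ? header.slice(7) : '';
    // Hash both sides so timingSafeEqual always compares equal-length buffers.
    const actual = createHash('sha256').update(presented).digest();
    if (presented && timingSafeEqual(actual, expected)) {
      return next();
    }
    res.statusCode = 401;
    res.end('Unauthorized');
  };
}
```

Registered ahead of the routes, for example `app.use(requireAdminToken(process.env.ADMIN_TOKEN))`, it would reject every request that lacks the token, including the `POST` endpoints.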
44
+
45
+ ### SEC-02: No CSRF Protection on State-Changing Endpoints
46
+
47
+ **Files:** `src/web/server.js:127, 131`
48
+
49
+ `POST /settings` and `POST /database/migrate` have no CSRF token validation. Since the admin panel has no authentication, any page a local user visits can submit a form to `http://localhost:3000/settings` and overwrite credentials.
50
+
51
+ **Impact:** A malicious website visited in the same browser can silently reconfigure the entire application.
52
+
53
+ **Recommendation:** Add CSRF tokens to all POST forms. Even with authentication, CSRF protection is necessary.
54
+
55
+ ---
56
+
57
+ ## High
58
+
59
+ ### SEC-03: Database Credentials Visible in Process Arguments
60
+
61
+ **File:** `src/mcp/config.js:35`
62
+
63
+ ```javascript
64
+ args: ['-y', '@modelcontextprotocol/server-postgres', config.DATABASE_URL],
65
+ ```
66
+
67
+ The full `DATABASE_URL` (including username and password) is passed as a command-line argument to the MCP postgres server spawned by `npx`. Command-line arguments are visible to all users on the system via `ps aux`.
68
+
69
+ **Impact:** Any local user can read database credentials from the process listing.
70
+
71
+ **Recommendation:** Pass the connection string via an environment variable in the child process `env` option, not via `args`.
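In MCP config terms, the change might look like the sketch below. It rests on a loudly stated assumption: that the spawned server, or a small wrapper script around it, reads the URL from a `DATABASE_URL` environment variable. The stock `@modelcontextprotocol/server-postgres` package may only accept the URL as an argument, in which case a wrapper would be needed. The function name is hypothetical.

```javascript
// Builds an MCP server entry that keeps the connection string out of argv.
// ASSUMPTION: the spawned server (or a wrapper script around it) reads the URL
// from the DATABASE_URL environment variable; the stock server may not.
function buildPostgresServerConfig(databaseUrl) {
  return {
    command: 'npx',
    args: ['-y', '@modelcontextprotocol/server-postgres'],
    env: { DATABASE_URL: databaseUrl },
  };
}
```

On Linux, a process's environment is readable only by the same user and root, unlike its argument list, which any local user can see.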
72
+
73
+ ### SEC-04: Error Messages Leak Internal Details to Telegram Users
74
+
75
+ **File:** `src/index.js:261-263`
76
+
77
+ ```javascript
78
+ const userMessage = isTimeout
79
+ ? 'Response timed out, please try again.'
80
+ : `Sorry, an error occurred: ${err.message}`;
81
+ ```
82
+
83
+ Non-timeout error messages are forwarded verbatim to the Telegram user. `err.message` can contain database connection strings, file paths, stack traces from child process stderr, or other internal details.
84
+
85
+ **Impact:** Information disclosure. Even though the Telegram user is whitelisted, the messages traverse Telegram's servers.
86
+
87
+ **Recommendation:** Send a generic error message to users and log the full error internally. If the detail is useful, provide a reference ID that can be looked up in the logs.
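
The reference-ID approach recommended above could look roughly like the sketch below. The helper name `toUserMessage`, the injectable `makeId`, and the ID format are all hypothetical; the logger is only assumed to expose `error(scope, text)` as in the snippets quoted in this report.

```javascript
// Hypothetical helper: returns a safe user-facing message and logs the detail.
// `logger` is assumed to expose error(scope, text); `makeId` is injectable for tests.
function toUserMessage(err, isTimeout, logger, makeId = defaultId) {
  if (isTimeout) {
    return 'Response timed out, please try again.';
  }
  const refId = makeId();
  // Full detail stays in the logs, keyed by the reference ID.
  logger.error('claude', `[${refId}] ${err.stack || err.message}`);
  return `Sorry, an error occurred. Reference: ${refId}`;
}

// Short, non-secret identifier; not cryptographically strong, only used for log lookup.
function defaultId() {
  return Date.now().toString(36) + Math.random().toString(36).slice(2, 8);
}
```

With this shape, the Telegram reply carries only the reference, and the connection string or file path inside `err.message` never leaves the host.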
88
+
89
+ ### SEC-05: `sudo reboot` Execution After Single-Factor Confirmation
90
+
91
+ **File:** `src/telegram/commands.js:233`
92
+
93
+ ```javascript
94
+ execSync('sudo reboot', { timeout: 10_000 });
95
+ ```
96
+
97
+ The `/reboot` command executes `sudo reboot` after a single "YES" reply within 60 seconds. The confirmation flow relies solely on the Telegram user whitelist -- if the bot token is compromised (e.g., via SEC-01), an attacker can reboot the host.
98
+
99
+ **Impact:** Denial of service / physical disruption of the host system.
100
+
101
+ **Recommendation:** Consider removing the reboot command entirely, or require a secondary authentication factor (e.g., a passphrase, TOTP code, or physical button press).
102
+
103
+ ### SEC-06: Validate Command Hook Can Be Bypassed via Whitelist Patterns
104
+
105
+ **File:** `hooks/validate-command.sh:278-280`
106
+
107
+ Whitelisted commands bypass all subsequent security checks, including dangerous-command blocking and write-target inspection. If `COMMANDS_WHITELIST` contains an overly broad pattern (e.g., `*`), all commands including `sudo`, `rm -rf /`, and arbitrary writes become allowed.
108
+
109
+ Additionally, the glob matching at lines 134-140 checks the command prefix, but a compound command such as `echo hello; rm -rf /` is matched against the whitelist as one full string rather than as individual subcommands. The dangerous-command check at lines 287-333 does inspect for embedded dangerous commands using grep patterns, but the whitelist check (Rule 1) runs first and exits 0 before those checks are reached.
110
+
111
+ **Impact:** A permissive whitelist pattern bypasses all safety checks.
112
+
113
+ **Recommendation:** Always run the dangerous-command checks (Rule 2) regardless of whitelist match. The whitelist should only skip Rule 4/5 (read-only and default allow), not the unconditional block rules.
114
+
115
+ ### SEC-07: Missing Security Headers on Web Admin
116
+
117
+ **File:** `src/web/server.js:116-148`
118
+
119
+ The Express server sets no security headers:
120
+
121
+ - No `Content-Security-Policy` (allows inline scripts, external resource loading)
122
+ - No `X-Frame-Options` (clickjacking possible)
123
+ - No `X-Content-Type-Options: nosniff`
124
+ - No `Strict-Transport-Security`
125
+ - No `Referrer-Policy`
126
+
127
+ The admin panel contains inline `onclick` handlers (line 1032) which would need CSP allowances, but the absence of CSP entirely is worse.
128
+
129
+ **Impact:** The admin panel is vulnerable to clickjacking and content injection attacks.
130
+
131
+ **Recommendation:** Add a security headers middleware. At minimum: `X-Frame-Options: DENY`, `X-Content-Type-Options: nosniff`, and a restrictive `Content-Security-Policy`.
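
A minimal sketch of such a middleware follows, written against the standard Express `(req, res, next)` signature. The function name `securityHeaders` and the exact header values are assumptions, not taken from the project; the CSP shown would block the inline `onclick` handlers mentioned above until they are moved to external scripts.

```javascript
// Express-style middleware. Values are a conservative starting point only.
// Strict-Transport-Security is left out because it only applies over HTTPS.
function securityHeaders(req, res, next) {
  res.setHeader('X-Frame-Options', 'DENY');
  res.setHeader('X-Content-Type-Options', 'nosniff');
  res.setHeader('Referrer-Policy', 'no-referrer');
  res.setHeader(
    'Content-Security-Policy',
    "default-src 'self'; frame-ancestors 'none'",
  );
  next();
}
```

Assuming `app` is the Express application in `src/web/server.js`, it would be registered with `app.use(securityHeaders)` before the route handlers.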
132
+
133
+ ### SEC-08: Full Process Environment Passed to Claude Subprocess
134
+
135
+ **File:** `src/claude/bridge.js:47`
136
+
137
+ ```javascript
138
+ env: { ...process.env },
139
+ ```
140
+
141
+ The entire process environment -- including `DATABASE_URL`, `TELEGRAM_BOT_TOKEN`, `EMBEDDING_API_KEY`, and any other secrets -- is passed to the Claude CLI subprocess. Claude CLI can access these via its MCP tools or tool-use capabilities.
142
+
143
+ **Impact:** If Claude's sandboxing is incomplete or a tool allows environment variable access, all application secrets are exposed.
144
+
145
+ **Recommendation:** Construct a minimal environment for the Claude subprocess containing only required variables (PATH, HOME, etc.).
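
One way to build that minimal environment is an explicit allowlist, sketched below. The names `CLAUDE_ENV_ALLOWLIST` and `buildChildEnv` are hypothetical, and the list of variables the Claude CLI actually needs is an assumption that would have to be confirmed by testing.

```javascript
// Assumed allowlist; the real set required by the Claude CLI is not confirmed here.
const CLAUDE_ENV_ALLOWLIST = ['PATH', 'HOME', 'LANG', 'TERM'];

// Copy only allowlisted variables, skipping any that are unset in the parent.
function buildChildEnv(sourceEnv, allowlist = CLAUDE_ENV_ALLOWLIST) {
  const env = {};
  for (const key of allowlist) {
    if (sourceEnv[key] !== undefined) {
      env[key] = sourceEnv[key];
    }
  }
  return env;
}
```

The spawn call would then use `env: buildChildEnv(process.env)` in place of `env: { ...process.env }`.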
146
+
147
+ ---
148
+
149
+ ## Medium
150
+
151
+ ### SEC-09: Unvalidated Query Parameter Rendered in HTML
152
+
153
+ **File:** `src/web/server.js:339-340`
154
+
155
+ ```javascript
156
+ } else if (req.query.error) {
157
+ data.message = { type: 'error', text: req.query.error };
158
+ }
159
+ ```
160
+
161
+ The `error` query parameter from `/database?error=...` is set as the message text. It is later rendered through `esc()` at line 1114, so XSS is prevented. However, this pattern of reflecting user-controlled input is fragile -- if any template path omits the `esc()` call, it becomes an XSS vector.
162
+
163
+ **Recommendation:** Validate and sanitize the error parameter, or use a flash message stored server-side.
164
+
165
+ ### SEC-10: No Rate Limiting on Web Admin Endpoints
166
+
167
+ **File:** `src/web/server.js:116-148`
168
+
169
+ While Telegram and Claude rate limiters exist, the web admin endpoints have none. An attacker could:
170
+ - Rapidly POST to `/settings` to cause disk I/O (`.env` writes)
171
+ - Repeatedly POST to `/database/migrate` to trigger migration attempts
172
+ - Flood `/health` which issues a `SELECT 1` on every request
173
+
174
+ **Recommendation:** Add basic rate limiting to web admin routes.
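
As an illustration of what "basic" could mean here, the sketch below is a fixed-window in-memory limiter. `createRateLimiter` and its options are hypothetical names, and an existing middleware package could be used instead; this only shows the counting logic.

```javascript
// Fixed-window limiter: at most `limit` hits per `windowMs` for each key.
// `now` is injectable so the window logic can be tested without real time.
function createRateLimiter({ limit = 30, windowMs = 60000, now = Date.now } = {}) {
  const hits = new Map(); // key -> { count, windowStart }
  return function allow(key) {
    const t = now();
    const entry = hits.get(key);
    if (!entry || t - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: t });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

A route guard would call `allow(req.ip)` and respond with HTTP 429 when it returns `false`.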
175
+
176
+ ### SEC-11: Database CREATE Statement Uses String Interpolation
177
+
178
+ **File:** `src/embeddings/engine.js:224, 271`
179
+
180
+ ```javascript
181
+ await this.db.query(`CREATE TABLE IF NOT EXISTS embeddings (
182
+ ...
183
+ vector VECTOR(${dimensions}),
184
+ ...
185
+ )`);
186
+ ```
187
+
188
+ The `dimensions` value is interpolated directly into SQL DDL. The code validates that it is a positive integer at `src/embeddings/engine.js:169-174`, but that validation lives in the same class and is the only safeguard. With the current code, an `EMBEDDING_DIMENSIONS` value such as `"100; DROP TABLE users--"` is harmless because `parseInt` returns `100` and the trailing text is discarded. The pattern is still inherently risky, since the DDL is only as safe as whatever produced `dimensions`.
189
+
190
+ **Impact:** Low given current validation, but defense-in-depth is missing.
191
+
192
+ **Recommendation:** Add an explicit integer range check (e.g., `dim > 0 && dim <= 10000`) before interpolation into DDL.
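
A stricter check than `parseInt` alone might look like the sketch below. `parseDimensions` is a hypothetical helper and the upper bound of 10000 simply mirrors the example in the recommendation.

```javascript
// Accept only a plain decimal integer within (0, max]; reject anything else.
// Unlike parseInt, this does not silently ignore trailing text.
function parseDimensions(raw, max = 10000) {
  const text = String(raw).trim();
  if (!/^[0-9]+$/.test(text)) {
    throw new Error(`EMBEDDING_DIMENSIONS must be an integer, got: ${text}`);
  }
  const dim = Number(text);
  if (dim <= 0 || dim > max) {
    throw new Error(`EMBEDDING_DIMENSIONS out of range: ${dim}`);
  }
  return dim;
}
```

Calling this immediately before building the `CREATE TABLE` string keeps the guard next to the interpolation it protects.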
193
+
194
+ ### SEC-12: `ensureDatabase` Uses Unsanitized Database Name in DDL
195
+
196
+ **File:** `src/db/pool.js:39`
197
+
198
+ ```javascript
199
+ await client.query(`CREATE DATABASE "${dbName}"`);
200
+ ```
201
+
202
+ The database name is extracted from the URL pathname and interpolated into a `CREATE DATABASE` statement wrapped in double quotes. If the URL contains a database name with embedded double quotes (e.g., `postgresql://.../"test"--drop`), the name could break out of the quoted identifier. In practice this is unlikely since the user controls the `.env` file.
203
+
204
+ **Impact:** Low -- self-inflicted SQL injection via config file.
205
+
206
+ **Recommendation:** Use `pg_catalog.quote_ident()` or validate the database name against `[a-zA-Z0-9_-]+`.
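
Both options can be sketched in a few lines. `assertSafeDbName` and `quoteIdent` are hypothetical helpers; `quoteIdent` follows the standard SQL rule of doubling embedded double quotes inside a quoted identifier.

```javascript
// Option 1: reject anything outside a conservative character set.
const SAFE_DB_NAME = /^[a-zA-Z0-9_-]+$/;

function assertSafeDbName(name) {
  if (!SAFE_DB_NAME.test(name)) {
    throw new Error(`Unsafe database name: ${name}`);
  }
  return name;
}

// Option 2: quote the identifier, doubling any embedded double quotes.
function quoteIdent(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}
```

The DDL would then be built as `` `CREATE DATABASE ${quoteIdent(assertSafeDbName(dbName))}` ``, applying both checks.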
207
+
208
+ ### SEC-13: Telegram Bot Token in URLs and Logs
209
+
210
+ **Files:** `src/telegram/bot.js:401, 372`
211
+
212
+ ```javascript
213
+ const url = new URL(`/bot${this._token}/${method}`, TELEGRAM_API_BASE);
214
+ ```
215
+
216
+ The bot token is embedded in every API URL. If an error occurs during an HTTP request and the URL is logged, the token is exposed. The `_getFileUrl` method at line 372 also constructs download URLs containing the token. While the logger appears to not log full URLs directly, any unexpected error that includes the request URL would leak the token.
217
+
218
+ **Recommendation:** Never log full Telegram API URLs. Mask the token portion in error messages.
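
Masking can be done with a single replacement applied to any string before it reaches the logger. The sketch below assumes the usual `<digits>:<alphanumeric>` bot token shape; `maskBotToken` is a hypothetical helper name.

```javascript
// Replace the token segment of any Telegram API or file URL with a placeholder.
// Matches both /bot<token>/method and /file/bot<token>/path forms.
function maskBotToken(text) {
  return String(text).replace(/\/bot[0-9]+:[A-Za-z0-9_-]+/g, '/bot<redacted>');
}
```

Wrapping error messages as `maskBotToken(err.message)` at the logging boundary covers unexpected errors that embed the request URL.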
219
+
220
+ ### SEC-14: Sensitive Data Stored in Logs Table
221
+
222
+ **File:** `src/logging.js:44-47`
223
+
224
+ All log entries, including those containing user messages and error details, are persisted to the `system_logs` database table. The web admin logs page (`/logs`) displays these without any redaction. User messages may contain personal information, and error logs may contain credentials or tokens.
225
+
226
+ **Recommendation:** Implement log-level content filtering, redact known secret patterns, and consider adding access controls to the logs page.
227
+
228
+ ### SEC-15: Command Validation Script Has Sed-Based JSON Parsing Fallback
229
+
230
+ **File:** `hooks/validate-command.sh:38-42`
231
+
232
+ When `jq` is not installed, command extraction falls back to `sed`:
233
+
234
+ ```bash
235
+ COMMAND=$(printf '%s' "$INPUT" \
236
+ | tr '\n' ' ' \
237
+ | sed 's/.*"command"[[:space:]]*:[[:space:]]*"//' \
238
+ | sed 's/"[[:space:]]*[,}].*//' \
239
+ | sed 's/\\"/"/g; s/\\\\/\\/g')
240
+ ```
241
+
242
+ This fallback cannot correctly handle all JSON edge cases (e.g., nested quotes, unicode escapes, multi-line commands). A specially crafted command string could cause incorrect extraction, potentially allowing the wrong string to be validated.
243
+
244
+ **Recommendation:** Require `jq` as a dependency, or use Node.js for JSON parsing instead of bash.
245
+
246
+ ### SEC-16: Relative Path Assumption in Command Validator
247
+
248
+ **File:** `hooks/validate-command.sh:176`
249
+
250
+ ```bash
251
+ "."*|[^/]*) return 0 ;; # Relative paths resolve under cwd (within home)
252
+ ```
253
+
254
+ The validator assumes relative paths resolve within the home directory. However, Claude CLI's working directory is configurable via the `--cwd` flag or by the runtime directory. If the working directory is set to `/`, relative paths like `../etc/passwd` would resolve outside home.
255
+
256
+ **Impact:** Depends on Claude CLI's working directory configuration.
257
+
258
+ **Recommendation:** Resolve relative paths to absolute before validation, using the actual working directory.
259
+
260
+ ---
261
+
262
+ ## Low
263
+
264
+ ### SEC-17: No Input Length Validation
265
+
266
+ **File:** `src/index.js:152`
267
+
268
+ User messages from Telegram are saved to the database and forwarded to Claude without any length validation. The Telegram Bot API limits text messages to 4096 characters and captions to 1024, but the application enforces no limit of its own and relies entirely on that upstream cap. Extremely large messages could cause:
269
+ - Database storage bloat
270
+ - Claude CLI buffer overflow or timeout
271
+ - Memory pressure during compaction (all messages loaded into memory)
272
+
273
+ **Recommendation:** Enforce a maximum message length (e.g., 10,000 chars) before processing.
274
+
275
+ ### SEC-18: `_executeConfirmed` Not Awaited
276
+
277
+ **File:** `src/telegram/commands.js:200`
278
+
279
+ ```javascript
280
+ this._executeConfirmed(chatId, command);
281
+ return true;
282
+ ```
283
+
284
+ The async `_executeConfirmed` method is called without `await`, making it fire-and-forget. If it rejects after the `return true`, the result is an unhandled promise rejection. The method has its own try/catch (lines 212-248), but any error thrown by `this._sendPlain` inside the catch block would still be unhandled.
285
+
286
+ **Recommendation:** Await the call, or add `.catch()` to handle edge cases.
287
+
288
+ ### SEC-19: Attachment MIME Type Derived from Untrusted Source
289
+
290
+ **File:** `src/attachments/store.js:43-45`
291
+
292
+ ```javascript
293
+ function extFromMime(mimeType) {
294
+ if (!mimeType) return 'bin';
295
+ return MIME_TO_EXT[mimeType] || mimeType.split('/').pop() || 'bin';
296
+ }
297
+ ```
298
+
299
+ The MIME type comes from Telegram's message data (client-provided). The fallback `mimeType.split('/').pop()` could produce unexpected extensions from crafted MIME types. While files are stored with UUID names (mitigating path-based attacks), the extension could confuse downstream consumers.
300
+
301
+ **Recommendation:** Use a strict whitelist of allowed MIME types. Reject or default unknown types.
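
A stricter variant of `extFromMime` could drop the `split('/').pop()` fallback entirely, as sketched below. The entries in `MIME_TO_EXT` here are illustrative; the project's real table is not shown in this report.

```javascript
// Illustrative allowlist; the actual table in store.js may differ.
const MIME_TO_EXT = {
  'image/jpeg': 'jpg',
  'image/png': 'png',
  'application/pdf': 'pdf',
  'audio/ogg': 'ogg',
};

// Unknown or malformed types always map to 'bin'; nothing is derived from input.
function extFromMime(mimeType) {
  if (typeof mimeType !== 'string') return 'bin';
  const key = mimeType.toLowerCase().split(';')[0].trim();
  return Object.prototype.hasOwnProperty.call(MIME_TO_EXT, key)
    ? MIME_TO_EXT[key]
    : 'bin';
}
```

The `hasOwnProperty` check avoids accidental matches on inherited keys such as `constructor`.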
302
+
303
+ ### SEC-20: Unhandled Rejection Handler Only Logs
304
+
305
+ **File:** `src/index.js:552-554`
306
+
307
+ ```javascript
308
+ process.on('unhandledRejection', (reason) => {
309
+ logger.error('process', `Unhandled rejection: ${reason}`);
310
+ });
311
+ ```
312
+
313
+ Unhandled promise rejections are logged but do not trigger the `on_error` hook, shutdown, or user notification. In Node.js 15+, unhandled rejections terminate the process by default, but this handler prevents that. Silent failures accumulate.
314
+
315
+ **Recommendation:** Either call `shutdown()` on unhandled rejections (as done for uncaught exceptions) or at minimum emit the `on_error` hook.
316
+
317
+ ### SEC-21: Conversation Compaction Is Not Transactional
318
+
319
+ **File:** `src/claude/conversation.js:145-203`
320
+
321
+ Compaction performs sequential database operations (INSERT summary, then DELETE old messages) without wrapping them in a transaction. If the process crashes between the INSERT and the DELETE, the summary and the original messages both remain, so duplicate data accumulates. If the DELETE ever takes effect without a committed INSERT, messages are lost.
322
+
323
+ **Recommendation:** Wrap the INSERT and DELETE in a database transaction.
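
A minimal sketch of the transactional version is shown below. It assumes `client` is a single checked-out connection exposing `query(sql, params)` (for example from `pool.connect()` in `pg`), since a transaction must run on one connection. The function name and both table names are hypothetical.

```javascript
// Hypothetical table names; the real schema is not shown in this report.
async function compactInTransaction(client, sessionId, summary) {
  await client.query('BEGIN');
  try {
    await client.query(
      'INSERT INTO conversation_summaries (session_id, summary) VALUES ($1, $2)',
      [sessionId, summary],
    );
    await client.query(
      'DELETE FROM conversation_messages WHERE session_id = $1',
      [sessionId],
    );
    await client.query('COMMIT');
  } catch (err) {
    // Undo the partial work so neither duplicates nor gaps are left behind.
    await client.query('ROLLBACK');
    throw err;
  }
}
```

Either both statements take effect or neither does, which removes both failure modes described above.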
324
+
325
+ ### SEC-22: Web Admin `.env` File Write Has No Locking
326
+
327
+ **File:** `src/web/server.js:416-442`
328
+
329
+ The `_writeEnvFile` method reads and rewrites the `.env` file without file locking. Concurrent POST requests to `/settings` could produce corrupted output. While unlikely with a single-user admin panel, it's a correctness issue.
330
+
331
+ **Recommendation:** Use a file lock or serialize writes through an in-memory queue.
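
The in-memory queue option needs very little code: each write is chained onto the previous one. `createSerialQueue` is a hypothetical helper, and this only serializes writes within one process, not across processes.

```javascript
// Returns an enqueue(task) function; tasks run strictly one after another.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const run = tail.then(() => task());
    // Keep the chain alive even if a task fails, so later writes still run.
    tail = run.catch(() => {});
    return run;
  };
}
```

Assuming `_writeEnvFile` returns a promise, the POST handler would call `enqueue(() => this._writeEnvFile(values))` so that concurrent requests cannot interleave their read-modify-write cycles.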
332
+
333
+ ---
334
+
335
+ ## Positive Findings
336
+
337
+ The following security practices are well-implemented:
338
+
339
+ - **Parameterized SQL queries** throughout -- SQL injection risk is minimal (`$1, $2, $3` pattern used consistently)
340
+ - **HTML escaping** via `esc()` function applied consistently in all template outputs
341
+ - **UUID-based attachment filenames** prevent path traversal and name collision
342
+ - **Telegram user whitelist** provides a strong first layer of access control
343
+ - **Secrets masked in UI** with `maskValue()` / `maskDatabaseUrl()`
344
+ - **Bot token validated** before starting Telegram polling
345
+ - **Dangerous command blocking** is comprehensive (sudo, rm -rf, shutdown, kill, network config, package managers)
346
+ - **File write path validation** blocks writes to system directories unconditionally
347
+ - **Rate limiting** on both Claude and Telegram prevents resource exhaustion
348
+ - **Embed MCP server** binds to `127.0.0.1` only
349
+ - **Signal handling** with graceful shutdown on SIGTERM/SIGINT
350
+
351
+ ---
352
+
353
+ ## Failure Points & Reliability
354
+
355
+ ### FP-01: No Retry Logic for Telegram API Calls
356
+
357
+ **File:** `src/telegram/bot.js:398-453`
358
+
359
+ `_apiCall` makes a single HTTPS request with no retry on transient failures (network timeouts, 429 rate limits, 500 server errors). The polling loop at lines 131-133 waits a fixed 5 seconds after a failure rather than backing off exponentially.
360
+
361
+ **Impact:** Temporary Telegram API outages cause message loss.
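
A generic retry wrapper with exponential backoff could be sketched as follows. `retryWithBackoff` and its options are hypothetical, and a real version would retry only on transient errors (timeouts, 429, 5xx) rather than on every failure as this sketch does.

```javascript
// Retries `fn` up to `retries` times, doubling the delay after each failure.
// `sleep` is injectable so tests do not need to wait for real time.
async function retryWithBackoff(fn, { retries = 3, baseDelayMs = 500, sleep = defaultSleep } = {}) {
  let attempt = 0;
  for (;;) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
      attempt += 1;
    }
  }
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

With the defaults, a call is attempted up to four times with waits of 500 ms, 1000 ms, and 2000 ms between attempts.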
362
+
363
+ ### FP-02: No Timeout on File Downloads
364
+
365
+ **File:** `src/telegram/bot.js:463-484`
366
+
367
+ `_httpsGet` has no timeout. A stalled download from Telegram's file servers blocks the message handler indefinitely, preventing all other message processing.
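
One way to bound the wait is to race the download against a timer, as in the sketch below. `withTimeout` is a hypothetical helper; it only stops the caller from waiting and does not abort the underlying request, so the socket would still need to be destroyed separately.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A call such as `withTimeout(this._httpsGet(url), 30000, 'file download')` would let the message handler recover from a stalled transfer.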
368
+
369
+ ### FP-03: Session ID Race Condition on Concurrent Messages
370
+
371
+ **File:** `src/claude/conversation.js:20-21, 109-111`
372
+
373
+ `currentSessionId` is a mutable instance variable with no synchronization. If two Telegram messages arrive in rapid succession, the first may start a Claude invocation (which takes seconds to minutes), and the second may overwrite `currentSessionId` before the first completes.
374
+
375
+ **Impact:** Messages saved with incorrect session IDs, corrupted conversation threading.
376
+
377
+ ### FP-04: Embedding Worker Duplicate Processing
378
+
379
+ **File:** `src/embeddings/worker.js:150-157`
380
+
381
+ The worker SELECTs rows with `vector IS NULL` and then updates them after processing. Between the SELECT and UPDATE, no row lock is held. If two workers were running (e.g., after a hot restart), both could process the same row.
382
+
383
+ **Impact:** Wasted API calls and potential database constraint violations.
384
+
385
+ ### FP-05: Claude Subprocess Zombie After SIGTERM
386
+
387
+ **File:** `src/claude/bridge.js:281-287`
388
+
389
+ The `kill()` method sends `SIGTERM` and immediately sets `activeProcess = null`. If the child process ignores SIGTERM, no SIGKILL follow-up occurs, and the process keeps running with no remaining reference to it.
390
+
391
+ **Impact:** Resource leak, potential blocking of future invocations.
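
A SIGTERM-then-SIGKILL escalation could be sketched as below. `killWithEscalation` and the 5-second grace period are assumptions; `proc` is expected to behave like a Node.js `ChildProcess` (a `kill(signal)` method and an `exit` event).

```javascript
// Ask the process to stop, then force it if it is still alive after graceMs.
function killWithEscalation(proc, graceMs = 5000) {
  proc.kill('SIGTERM');
  const timer = setTimeout(() => {
    // Still running after the grace period: force termination.
    proc.kill('SIGKILL');
  }, graceMs);
  // A clean exit cancels the pending SIGKILL.
  proc.once('exit', () => clearTimeout(timer));
}
```

Clearing `activeProcess` inside the `exit` handler, rather than immediately after sending SIGTERM, would also keep the reference valid until the child has actually gone.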
392
+
393
+ ### FP-06: Scheduler Task Has No Execution Timeout
394
+
395
+ **File:** `src/scheduler/worker.js:243-316`
396
+
397
+ `_executeTask` calls `claudeBridge.invoke()` which has a configurable timeout (default 120s). However, the scheduler worker itself has no per-task timeout. If the Claude timeout fails to trigger (e.g., due to a hung process that partially responds), the task blocks the scheduler indefinitely.
398
+
399
+ ### FP-07: Database Connection Loss Not Detected
400
+
401
+ **File:** `src/db/pool.js:11-13`
402
+
403
+ The pool's error handler only logs to console. There is no mechanism to notify the application that the database has become unavailable. The health endpoint checks with `SELECT 1` on each request, but background workers (embedding worker, scheduler) will fail silently and retry every poll interval without alerting the user.
404
+
405
+ ### FP-08: Compaction During Active Processing Can Lose Context
406
+
407
+ **File:** `src/claude/conversation.js:124-129`
408
+
409
+ Compaction checks `claudeBridge.isActive()` before starting, but the compaction itself takes significant time (it invokes Claude for summarization). A new user message could arrive and start processing while compaction is running, causing both to use Claude simultaneously.
410
+
411
+ ---
412
+
413
+ *End of Security & Reliability Audit Report*
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "2ndbrain",
3
- "version": "2026.1.37",
3
+ "version": "2026.2.2",
4
4
  "description": "Always-on Node.js service bridging Telegram messaging to Claude AI with knowledge graph, journal, project management, and semantic search.",
5
5
  "main": "src/index.js",
6
6
  "bin": {
@@ -1,5 +1,6 @@
1
1
  import { spawn } from 'node:child_process';
2
2
  import { EventEmitter } from 'node:events';
3
+ import path from 'node:path';
3
4
 
4
5
  /**
5
6
  * Claude CLI subprocess bridge (spec section 5).
@@ -41,9 +42,14 @@ class ClaudeBridge extends EventEmitter {
41
42
  const startTime = Date.now();
42
43
  const args = this._buildArgs(sessionId, systemPrompt);
43
44
 
45
+ this.logger.info('claude', `Spawning: claude ${args.join(' ')}`);
46
+
47
+ const runtimeDir = path.join(this.config.DATA_DIR, 'claude-runtime');
48
+
44
49
  return new Promise((resolve, reject) => {
45
50
  const proc = spawn('claude', args, {
46
51
  stdio: ['pipe', 'pipe', 'pipe'],
52
+ cwd: runtimeDir,
47
53
  env: { ...process.env },
48
54
  });
49
55
 
@@ -55,6 +61,19 @@ class ClaudeBridge extends EventEmitter {
55
61
  const toolCalls = [];
56
62
  let resultData = null;
57
63
  let timedOut = false;
64
+ let receivedFirstOutput = false;
65
+
66
+ // Startup watchdog: warn if no stdout arrives within 30s
67
+ const startupTimeout = setTimeout(() => {
68
+ if (!receivedFirstOutput) {
69
+ this.logger.warn(
70
+ 'claude',
71
+ 'No output received from Claude CLI within 30s of spawn -- ' +
72
+ 'subprocess may be stuck during MCP server initialization or permission prompt. ' +
73
+ `stderr so far: ${stderrBuffer.trim() || '(empty)'}`,
74
+ );
75
+ }
76
+ }, 30_000);
58
77
 
59
78
  // Set up the timeout guard
60
79
  const timeout = setTimeout(() => {
@@ -69,6 +88,12 @@ class ClaudeBridge extends EventEmitter {
69
88
 
70
89
  // Collect and parse stdout stream-json chunks
71
90
  proc.stdout.on('data', (chunk) => {
91
+ if (!receivedFirstOutput) {
92
+ receivedFirstOutput = true;
93
+ clearTimeout(startupTimeout);
94
+ this.logger.debug('claude', `First output received after ${Date.now() - startTime}ms`);
95
+ }
96
+
72
97
  stdoutBuffer += chunk.toString();
73
98
 
74
99
  // Process complete lines (NDJSON: one JSON object per line)
@@ -94,13 +119,21 @@ class ClaudeBridge extends EventEmitter {
94
119
  }
95
120
  });
96
121
 
97
- // Monitor stderr for errors
122
+ // Monitor stderr for errors (log in real time for diagnostics)
98
123
  proc.stderr.on('data', (chunk) => {
99
- stderrBuffer += chunk.toString();
124
+ const text = chunk.toString();
125
+ stderrBuffer += text;
126
+ for (const line of text.split('\n')) {
127
+ const trimmed = line.trim();
128
+ if (trimmed) {
129
+ this.logger.debug('claude-stderr', trimmed);
130
+ }
131
+ }
100
132
  });
101
133
 
102
134
  proc.on('close', (code) => {
103
135
  clearTimeout(timeout);
136
+ clearTimeout(startupTimeout);
104
137
  this.activeProcess = null;
105
138
 
106
139
  // Process any remaining data in the stdout buffer
@@ -172,7 +205,7 @@ class ClaudeBridge extends EventEmitter {
172
205
  * @private
173
206
  */
174
207
  _buildArgs(sessionId, systemPrompt) {
175
- const args = ['-p', '--output-format', 'stream-json', '--verbose'];
208
+ const args = ['-p', '--output-format', 'stream-json', '--verbose', '--permission-mode', 'bypassPermissions'];
176
209
 
177
210
  if (sessionId) {
178
211
  // Continuation: resume an existing session
@@ -188,6 +221,11 @@ class ClaudeBridge extends EventEmitter {
188
221
  args.push('--mcp-config', this.config.MCP_CONFIG_PATH);
189
222
  args.push('--allowed-tools', this.config.MCP_TOOLS_WHITELIST);
190
223
 
224
+ const settingsPath = path.join(
225
+ this.config.DATA_DIR, 'claude-runtime', '.claude', 'settings.json',
226
+ );
227
+ args.push('--settings', settingsPath);
228
+
191
229
  if (this.config.CLAUDE_MAX_BUDGET) {
192
230
  args.push('--max-budget-usd', this.config.CLAUDE_MAX_BUDGET);
193
231
  }
package/src/index.js CHANGED
@@ -504,6 +504,11 @@ async function main() {
504
504
  embeddingsEngine,
505
505
  };
506
506
 
507
+ // Catch emitted errors so they don't throw (Node.js EventEmitter behaviour)
508
+ bot.on('error', (err) => {
509
+ logger.error('telegram', `Bot error: ${err.message}`);
510
+ });
511
+
507
512
  // Wire message handler
508
513
  bot.on('message', (msg) => {
509
514
  handleMessage(msg, deps).catch((err) => {
@@ -261,7 +261,7 @@ class TelegramBot extends EventEmitter {
261
261
  * @returns {Promise<object[]>} Array of sent message results
262
262
  */
263
263
  async sendMessage(chatId, text, options = {}) {
264
- const parseMode = options.parse_mode ?? 'MarkdownV2';
264
+ const parseMode = 'parse_mode' in options ? options.parse_mode : 'MarkdownV2';
265
265
  const chunks = this._chunkText(text, MAX_MESSAGE_LENGTH);
266
266
  const results = [];
267
267