alvin-bot 4.10.0 → 4.11.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,95 @@
 
 All notable changes to Alvin Bot are documented here.
 
+ ## [4.11.0] — 2026-04-13
+
+ ### 🧠 Memory Persistence + Smart Loading — sessions survive restart, memory is layered
+
+ A colleague asked the same day v4.10.0 shipped: *"Memory after session restart is also a bit fiddly. I installed mempalace as a workaround — maybe build something like that natively."* He was right. Alvin had a hand-curated `MEMORY.md`, a 128 MB embeddings vector index, and an AI-powered compaction service — but **the in-memory `sessions` Map was wiped on every bot restart**. On the next user message the Claude SDK then started a fresh conversation, behaving like a goldfish despite all that memory infrastructure on disk.
+
+ This release fixes that with **five complementary tasks**, all bundled into v4.11.0: three core fixes (P0) plus two structural improvements (P1) inspired by mempalace's L0–L3 stack and Mem0's auto-extraction pattern.
+
+ #### P0 #1 — Session Persistence (`src/services/session-persistence.ts`, NEW)
+
+ The core fix. The `sessions` Map in `src/services/session.ts` was in-memory only; every `launchctl kickstart` wiped every user's `sessionId`, history, language, effort, voiceReply, and tracking counters.
+
+ - **Debounced flush** (1.5 s coalesce window) writes a sanitized snapshot of `getAllSessions()` to `~/.alvin-bot/state/sessions.json` via atomic tmp+rename.
+ - **`loadPersistedSessions()`** rehydrates the Map at bot startup; `flushSessions()` flushes synchronously on graceful shutdown (SIGINT/SIGTERM).
+ - **`attachPersistHook()` / `markSessionDirty()`** in `session.ts` give handlers a callback that triggers a persist after direct mutations (`/lang`, `/effort`, `/voice`). `addToHistory()` and `trackProviderUsage()` trigger it automatically.
+ - History is capped at `MAX_PERSISTED_HISTORY = 50` entries per session so the file stays small.
+ - Runtime-only fields (`abortController`, `isProcessing`, `messageQueue`) are stripped before persisting.
+ - Schema drift is handled: missing fields fall back to defaults, corrupt JSON loads zero sessions, and a null root is rejected gracefully.
+ - **9 unit tests** + **18 stress tests** covering a 100-session burst, 1000-mutation debounce coalescing, unicode (RTL/ZWJ/astral plane), atomic-write recovery from a stale `.tmp`, schema drift, hostile JSON, a read-only filesystem, and a simulated bot restart.
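The new `src/services/session-persistence.ts` itself is not shown in this diff (only its `dist/` consumers are). A minimal sketch of the debounce + tmp+rename pattern the bullets describe might look like the following — `STATE_FILE`, the `Session` shape, and the scratch directory are illustrative stand-ins, not the real module:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

// Illustrative location — the real bot writes ~/.alvin-bot/state/sessions.json.
const STATE_FILE = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "alvin-demo-")), "sessions.json");
const DEBOUNCE_MS = 1500; // the 1.5 s coalesce window described above

type Session = Record<string, unknown>;
const sessions = new Map<number, Session>();
let timer: ReturnType<typeof setTimeout> | null = null;

// Strip runtime-only fields and cap persisted history, per the bullets above.
function sanitize(s: Session): Session {
  const { abortController, isProcessing, messageQueue, ...rest } = s as Record<string, any>;
  if (Array.isArray(rest.history)) rest.history = rest.history.slice(-50); // MAX_PERSISTED_HISTORY
  return rest;
}

// Atomic flush: write a tmp sibling, then rename over the real file, so a
// crash mid-write can never leave a half-written sessions.json behind.
export function flushSessions(): void {
  if (timer) { clearTimeout(timer); timer = null; }
  const snapshot = Object.fromEntries([...sessions].map(([id, s]) => [id, sanitize(s)]));
  const tmp = STATE_FILE + ".tmp";
  fs.writeFileSync(tmp, JSON.stringify(snapshot));
  fs.renameSync(tmp, STATE_FILE);
}

// Debounced persist: rapid mutations coalesce into a single disk write.
export function schedulePersist(): void {
  if (timer) clearTimeout(timer);
  timer = setTimeout(flushSessions, DEBOUNCE_MS);
}

// Rehydrate at startup; corrupt or missing JSON just yields zero sessions.
export function loadPersistedSessions(): number {
  try {
    const raw = JSON.parse(fs.readFileSync(STATE_FILE, "utf-8"));
    if (!raw || typeof raw !== "object") return 0;
    for (const [id, s] of Object.entries(raw)) sessions.set(Number(id), s as Session);
    return sessions.size;
  } catch {
    return 0;
  }
}
```

Handlers would call `schedulePersist()` after mutations, while shutdown calls `flushSessions()` directly so a pending debounce window is never lost.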
+
+ #### P0 #2 — MEMORY.md Auto-Inject for SDK (`src/services/personality.ts`)
+
+ Before v4.11.0, only non-SDK providers (Groq, Gemini, NVIDIA) got `buildMemoryContext()` injected into their system prompt. The Claude SDK was *expected* to read memory files via tools, but in practice it rarely did unless the user's first message specifically prompted it.
+
+ - Drops the `!isSDK` guard around `buildMemoryContext()` and asset-index injection.
+ - The SDK now gets the same compact memory context (MEMORY.md plus today's and yesterday's daily logs) on every turn — the context non-SDK providers have had since 4.0.
+ - **3 unit tests** verifying that the SDK prompt includes the memory section, a non-SDK regression check, and graceful behavior when MEMORY.md is missing.
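The `personality.ts` change itself is not part of this diff; conceptually it is just removing a guard. A toy sketch, where the function names and strings are stand-ins rather than the real API:

```typescript
// Before v4.11.0 (simplified): the memory context was guarded by `!isSDK`,
// so Claude SDK prompts shipped without it.
function buildPromptOld(isSDK: boolean, base: string, memory: string): string {
  return !isSDK ? base + "\n" + memory : base;
}

// After: the guard is gone — every provider type gets the same memory section.
function buildPromptNew(_isSDK: boolean, base: string, memory: string): string {
  return base + "\n" + memory;
}
```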
+
+ #### P0 #3 — Semantic Recall on SDK First Turn (`src/services/personality.ts`, `src/handlers/message.ts`, `src/handlers/platform-message.ts`)
+
+ `buildSmartSystemPrompt()` now accepts an `isFirstTurn` flag. For SDK providers it runs the embeddings-based `searchMemory()` only on the first turn (`session.sessionId === null`, meaning Claude hasn't handed us a resume token yet for this session). After the first turn Claude carries the recalled context inside the SDK session via resume, so hitting the embeddings API on every subsequent turn would be wasted work. Non-SDK providers still run the search on every turn (they have no resume mechanism).
+
+ - `handlers/message.ts` and `handlers/platform-message.ts` now compute `isFirstSDKTurn = isSDK && session.sessionId === null` and pass it through.
+ - The bare `buildSystemPrompt` calls on the SDK paths are gone — `buildSmartSystemPrompt` is the single entry point.
+ - **5 mocked-search tests** covering call-count semantics for SDK first/later turns, non-SDK every-turn behavior, the missing-`userMessage` skip, and graceful failure when `searchMemory` throws.
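Since `personality.ts` is not included in the diff, here is a rough sketch of the gating logic this section describes — `searchMemory`, the prompt strings, and the signatures are stand-ins, not the real implementation:

```typescript
// Stand-in for the embeddings-based recall; the real searchMemory lives in the
// bot's memory service and is not shown in this diff.
let searchCalls = 0;
async function searchMemory(query: string): Promise<string> {
  searchCalls++;
  return `recalled context for: ${query}`;
}

// Sketch of the gating described above: SDK providers run recall only while
// there is no resume token yet; non-SDK providers run it on every turn.
async function buildSmartSystemPrompt(
  isSDK: boolean,
  userMessage: string,
  isFirstTurn: boolean,
): Promise<string> {
  const base = "You are Alvin."; // placeholder for the real personality prompt
  const shouldSearch = isSDK ? isFirstTurn : true;
  if (!shouldSearch || !userMessage) return base;
  try {
    return base + "\n" + (await searchMemory(userMessage));
  } catch {
    return base; // recall failure must never break the reply path
  }
}
```

Callers would pass `isSDK && session.sessionId === null` as the flag, matching the handler wiring described above.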
+
+ #### P1 #4 — Layered Memory Loader (`src/services/memory-layers.ts`, NEW)
+
+ Inspired by mempalace's L0–L3 stack. Replaces the monolithic `MEMORY.md → system prompt` injection with a structured, token-budgeted layered loader:
+
+ - **L0** `~/.alvin-bot/memory/identity.md` — always loaded, ~200 tokens (core user facts: name, location, family, contact)
+ - **L1** `~/.alvin-bot/memory/preferences.md` — always loaded (communication style, do's and don'ts)
+ - **L1** `~/.alvin-bot/memory/MEMORY.md` — backwards-compat: existing curated knowledge (full content if no split files exist; truncated to 1500 chars when split files coexist)
+ - **L2** `~/.alvin-bot/memory/projects/*.md` — loaded only when the user's incoming query mentions the project topic (substring match or keyword overlap with the first 200 chars)
+ - **L3** daily logs — still handled by the `embeddings.ts` vector search (unchanged)
+
+ The split is **opt-in**: if `identity.md` and `preferences.md` don't exist, the loader falls back to the monolithic MEMORY.md exactly as before, so existing users need no migration. Users who want the cleaner layout can split MEMORY.md manually and the loader picks it up automatically. Token budget: L0+L1 capped at 5000 chars (~1300 tokens); L2 capped at 3000 chars total (~750 tokens, max 1500 per matched project file). A new `query` parameter on `buildSystemPrompt()` and `buildMemoryContext()` propagates the user message all the way through. **9 unit tests** + 2 layered-context stress tests.
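Assuming the default data-dir layout listed above, the opt-in split is just three files. A scratch-dir illustration (file contents are placeholders, not real memory):

```shell
# Illustrative layout only — shown in a scratch dir, not the real ~/.alvin-bot.
MEMDIR="$(mktemp -d)/memory"
mkdir -p "$MEMDIR/projects"

# L0: always-loaded core facts (~200 tokens).
printf '# Identity\n- Name: (placeholder)\n' > "$MEMDIR/identity.md"
# L1: always-loaded communication preferences.
printf '# Preferences\n- Short answers\n' > "$MEMDIR/preferences.md"
# L2: one file per project; the filename doubles as the match topic.
printf '# alvin-bot notes\n' > "$MEMDIR/projects/alvin-bot.md"

ls "$MEMDIR"
```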
+
+ #### P1 #5 — Auto-Fact-Extraction in Compaction (`src/services/memory-extractor.ts`, NEW)
+
+ Inspired by Mem0's auto-extraction. When `compactSession()` archives old messages, it now runs an additional extraction pass that pulls structured facts (`user_facts`, `preferences`, `decisions`) out of the archived chunk via the active AI provider and appends them to MEMORY.md.
+
+ - **`parseExtractedFacts(text)`** — tolerates JSON wrapped in markdown code fences, surrounding prose, null/undefined fields, and non-string entries.
+ - **`appendFactsToMemoryFile(facts)`** — exact-string dedup against existing MEMORY.md content; output is structured under an `## Auto-extracted (YYYY-MM-DD)` header with `### User Facts` / `### Preferences` / `### Decisions` sub-sections.
+ - **`extractAndStoreFacts(chunk)`** — safe wrapper, never throws. Opt-out via the `MEMORY_EXTRACTION_DISABLED=1` env var. Uses effort=low to minimize cost. Skips short input (<50 chars). Provider failures are swallowed; compaction always continues.
+ - Wired into `compactSession()` after the daily-log flush, before the AI summary generation.
+ - Marked **experimental** in v4.11.0. Semantic dedup (vs. the current exact-string match) is deferred to v4.12+.
+ - **11 unit tests** covering JSON-parsing edge cases, dedup, opt-out, the short-input skip, garbage input, non-string filtering, and graceful provider-failure handling.
+
+ #### Architecture decisions
+
+ - **mempalace as MCP server: rejected.** Considered installing mempalace as a Python MCP service. Rejected because (1) Alvin is all-TypeScript, and adding a second Python service to launchd is operational complexity; (2) Alvin already has an embeddings vector index — mempalace would be a parallel duplicate; (3) mempalace's MCP tools are only consumed by the SDK, so cron jobs, sub-agents, and non-SDK providers wouldn't see them. Conclusion: **adopt the patterns natively** (L0–L3 layering, AAAK-style structured extraction) rather than running a second service.
+ - **SQLite migration deferred.** The 128 MB JSON embeddings index is a known performance issue and is already noted in `~/.claude/projects/-Users-alvin-de/memory/project_alvinbot_sqlite_migration.md` for v4.12+. Orthogonal to the "fiddly after restart" UX problem this release targets.
+ - **Multi-user isolation deferred.** Memories are still global per data dir. Single-user use case, so not a privacy concern for Ali's setup.
+ - **Decay/aging deferred.** Daily logs grow monotonically. Will be addressed alongside the SQLite migration.
+
+ #### Testing
+
+ **292 tests total** (237 baseline + 55 new). All green. TSC clean.
+
+ - 9 session-persistence unit tests
+ - 8 SDK memory-injection tests (3 base + 5 smart-prompt mocked-search)
+ - 9 memory-layers tests (loader + topic match + token budget)
+ - 11 memory-extractor tests (parse + append + extract pipeline)
+ - 18 stress tests (100 sessions, schema drift, unicode, atomic recovery, hostile JSON, simulated restart)
+
+ **Live verification:**
+ - `tmp/live-stress-memory.mjs` — 50 fake sessions against the built `dist/`, with the real `~/.alvin-bot/memory/MEMORY.md` as the L1 source and a simulated restart via Map clear + reload. Result: 215 KB state file, 1 ms flush, 1 ms reload, 50/50 perfect round-trip.
+ - `tmp/live-edge-cases.mjs` — 7 hostile scenarios: all-null fields, a 1000-burst debounce (2 ms), 20 concurrent flushes, extreme unicode (RTL + ZWJ + astral plane), 4-layer memory with a project topic match, atomic-write recovery from a stale `.tmp`, and empty-project-file skipping. All passed.
+
+ #### Files changed
+
+ - **NEW:** `src/services/session-persistence.ts`, `src/services/memory-layers.ts`, `src/services/memory-extractor.ts`
+ - **NEW tests:** `test/session-persistence.test.ts`, `test/memory-sdk-injection.test.ts`, `test/memory-layers.test.ts`, `test/memory-extractor.test.ts`, `test/memory-stress-restart.test.ts`
+ - **Modified:** `src/services/session.ts` (persist hook), `src/services/personality.ts` (SDK injection + isFirstTurn), `src/services/memory.ts` (use layered loader), `src/services/compaction.ts` (extractor hook), `src/handlers/message.ts` + `src/handlers/platform-message.ts` (smart-prompt wiring), `src/handlers/commands.ts` (`markSessionDirty` calls), `src/index.ts` (load + flush wiring), `src/paths.ts` (4 new constants)
+ - **Plan:** `docs/superpowers/plans/2026-04-13-memory-persistence.md`
+
+ ---
+
 ## [4.10.0] — 2026-04-13
 
 ### 🚀 Async sub-agents — main session no longer blocks during long tasks
package/dist/handlers/commands.js CHANGED
@@ -2,7 +2,7 @@ import { InlineKeyboard, InputFile } from "grammy";
 import fs from "fs";
 import path, { resolve } from "path";
 import os from "os";
- import { getSession, resetSession } from "../services/session.js";
+ import { getSession, resetSession, markSessionDirty } from "../services/session.js";
 import { getRegistry } from "../engine.js";
 import { reloadSoul } from "../services/personality.js";
 import { parseDuration, createReminder, listReminders, cancelReminder } from "../services/reminders.js";
@@ -399,6 +399,7 @@ export function registerCommands(bot) {
 const userId = ctx.from.id;
 const session = getSession(userId);
 session.voiceReply = !session.voiceReply;
+ markSessionDirty(userId);
 await ctx.reply(session.voiceReply
 ? "Voice replies enabled. Responses will also be sent as voice messages."
 : "Voice replies disabled. Text-only responses.");
@@ -421,6 +422,7 @@ export function registerCommands(bot) {
 return;
 }
 session.effort = level;
+ markSessionDirty(userId);
 await ctx.reply(`✅ Effort: ${EFFORT_LABELS[session.effort]}`);
 });
 // Inline keyboard callback for effort switching
@@ -433,6 +435,7 @@ export function registerCommands(bot) {
 const userId = ctx.from.id;
 const session = getSession(userId);
 session.effort = level;
+ markSessionDirty(userId);
 const keyboard = new InlineKeyboard();
 for (const [key, label] of Object.entries(EFFORT_LABELS)) {
 const marker = key === session.effort ? "✅ " : "";
@@ -827,6 +830,7 @@ export function registerCommands(bot) {
 }
 else if (arg === "en" || arg === "de" || arg === "es" || arg === "fr") {
 session.language = arg;
+ markSessionDirty(userId);
 const { setExplicitLanguage } = await import("../services/language-detect.js");
 setExplicitLanguage(userId, arg);
 await ctx.reply(t("bot.lang.setFixed", arg, { name: LOCALE_NAMES[arg] }));
@@ -851,6 +855,7 @@ export function registerCommands(bot) {
 }
 const newLang = choice;
 session.language = newLang;
+ markSessionDirty(userId);
 const { setExplicitLanguage } = await import("../services/language-detect.js");
 setExplicitLanguage(userId, newLang);
 const currentName = `${LOCALE_FLAGS[newLang]} ${LOCALE_NAMES[newLang]}`;
package/dist/handlers/message.js CHANGED
@@ -4,7 +4,7 @@ import { getSession, addToHistory, trackProviderUsage, buildSessionKey } from ".
 import { TelegramStreamer } from "../services/telegram.js";
 import { getRegistry } from "../engine.js";
 import { textToSpeech } from "../services/voice.js";
- import { buildSystemPrompt, buildSmartSystemPrompt } from "../services/personality.js";
+ import { buildSmartSystemPrompt } from "../services/personality.js";
 import { buildSkillContext } from "../services/skills.js";
 import { isForwardingAllowed } from "../services/access.js";
 import { touchProfile } from "../services/users.js";
@@ -219,12 +219,17 @@ export async function handleMessage(ctx) {
 if (adaptedLang !== session.language) {
 session.language = adaptedLang;
 }
- // Build query options (with semantic memory search for non-SDK + skill injection)
+ // Build query options (with semantic memory search for non-SDK + skill injection).
+ // v4.11.0 P0 #3: SDK now also gets semantic recall on first-turn. The signal
+ // is `session.sessionId === null` — meaning Claude SDK hasn't given us a
+ // resume token yet for this session. True for: brand-new users, post-/new,
+ // and rehydrated sessions where the persisted snapshot lacked a sessionId.
+ // After the first SDK turn, Claude resumes via SDK session_id and already
+ // carries the recalled context — no need for another search per turn.
 const chatIdStr = String(ctx.chat.id);
 const skillContext = buildSkillContext(text);
- const systemPrompt = (isSDK
- ? buildSystemPrompt(isSDK, session.language, chatIdStr)
- : await buildSmartSystemPrompt(isSDK, session.language, text, chatIdStr)) + skillContext;
+ const isFirstSDKTurn = isSDK && session.sessionId === null;
+ const systemPrompt = (await buildSmartSystemPrompt(isSDK, session.language, text, chatIdStr, isFirstSDKTurn)) + skillContext;
 // Track the user turn in history regardless of provider type. This keeps
 // the fallback path (Ollama etc.) aware of what was said on SDK turns.
 addToHistory(userId, { role: "user", content: text });
package/dist/handlers/platform-message.js CHANGED
@@ -9,7 +9,7 @@
 import fs from "fs";
 import { getSession, addToHistory, trackProviderUsage } from "../services/session.js";
 import { getRegistry } from "../engine.js";
- import { buildSystemPrompt, buildSmartSystemPrompt } from "../services/personality.js";
+ import { buildSmartSystemPrompt } from "../services/personality.js";
 import { buildSkillContext } from "../services/skills.js";
 import { touchProfile } from "../services/users.js";
 import { trackAndAdapt } from "../services/language-detect.js";
@@ -129,9 +129,9 @@ export async function handlePlatformMessage(msg, adapter) {
 const activeProvider = registry.getActive();
 const isSDK = activeProvider.config.type === "claude-sdk";
 const skillContext = buildSkillContext(fullText);
- const systemPrompt = (isSDK
- ? buildSystemPrompt(isSDK, session.language, msg.chatId)
- : await buildSmartSystemPrompt(isSDK, session.language, fullText, msg.chatId)) + skillContext;
+ // v4.11.0 P0 #3 — SDK gets semantic recall on first turn (when no resume token yet).
+ const isFirstSDKTurn = isSDK && session.sessionId === null;
+ const systemPrompt = (await buildSmartSystemPrompt(isSDK, session.language, fullText, msg.chatId, isFirstSDKTurn)) + skillContext;
 const queryOpts = {
 prompt: fullText,
 systemPrompt,
package/dist/index.js CHANGED
@@ -79,7 +79,8 @@ import { initMCP, disconnectMCP, hasMCPConfig } from "./services/mcp.js";
 import { startWebServer, stopWebServer } from "./web/server.js";
 import { startScheduler, stopScheduler, setNotifyCallback } from "./services/cron.js";
 import { startWatcher as startAsyncAgentWatcher, stopWatcher as stopAsyncAgentWatcher } from "./services/async-agent-watcher.js";
- import { startSessionCleanup, stopSessionCleanup } from "./services/session.js";
+ import { startSessionCleanup, stopSessionCleanup, attachPersistHook } from "./services/session.js";
+ import { loadPersistedSessions, flushSessions, schedulePersist, } from "./services/session-persistence.js";
 import { processQueue, cleanupQueue, setSenders, enqueue } from "./services/delivery-queue.js";
 import { discoverTools } from "./services/tool-discovery.js";
 import { startHeartbeat } from "./services/heartbeat.js";
@@ -257,6 +258,10 @@ const shutdown = async () => {
 stopScheduler();
 stopAsyncAgentWatcher();
 stopSessionCleanup();
+ // v4.11.0 — Final immediate flush of in-memory sessions to disk before exit.
+ // The debounced timer might be pending; flushSessions() cancels it and writes
+ // synchronously so the next boot can rehydrate the latest state.
+ await flushSessions().catch((err) => console.warn("[shutdown] flushSessions failed:", err));
 if (queueInterval)
 clearInterval(queueInterval);
 if (queueCleanupInterval)
@@ -430,6 +435,13 @@ startAsyncAgentWatcher();
 // Session memory hygiene: purge sessions idle > 7 days (configurable via
 // ALVIN_SESSION_TTL_DAYS). Never touches active sessions — see session.ts.
 startSessionCleanup();
+ // Session persistence (v4.11.0): wire the debounced persist hook BEFORE we
+ // load the snapshot, then rehydrate the in-memory Map from disk so users'
+ // Claude SDK session_id, conversation history, language and effort all
+ // survive bot restarts. Without this, every launchctl restart turns the
+ // bot into a goldfish for every active conversation.
+ attachPersistHook(schedulePersist);
+ loadPersistedSessions();
 // Wire delivery queue senders
 setSenders({
 telegram: async (chatId, content) => {
package/dist/paths.js CHANGED
@@ -41,8 +41,15 @@ export const TOOLS_EXAMPLE_JSON = resolve(BOT_ROOT, "docs", "tools.example.json"
 export const ENV_FILE = resolve(DATA_DIR, ".env");
 /** memory/ — Daily logs and embeddings */
 export const MEMORY_DIR = resolve(DATA_DIR, "memory");
- /** memory/MEMORY.md — Long-term curated memory */
+ /** memory/MEMORY.md — Long-term curated memory (legacy monolithic, still loaded) */
 export const MEMORY_FILE = resolve(DATA_DIR, "memory", "MEMORY.md");
+ /** memory/identity.md — L0 layer (v4.11.0): core user facts, always loaded.
+ * Optional. If missing, MEMORY.md acts as the L0+L1 fallback. */
+ export const IDENTITY_FILE = resolve(DATA_DIR, "memory", "identity.md");
+ /** memory/preferences.md — L1 layer (v4.11.0): communication style + don'ts. */
+ export const PREFERENCES_FILE = resolve(DATA_DIR, "memory", "preferences.md");
+ /** memory/projects/ — L2 layer (v4.11.0): per-project context loaded on topic match. */
+ export const PROJECTS_MEMORY_DIR = resolve(DATA_DIR, "memory", "projects");
 /** memory/.embeddings.json — Vector index */
 export const EMBEDDINGS_IDX = resolve(DATA_DIR, "memory", ".embeddings.json");
 /** users/ — User profiles and per-user memory */
@@ -66,6 +73,12 @@ export const BACKUP_DIR = resolve(DATA_DIR, "backups");
 * See src/services/async-agent-watcher.ts for the watcher that polls and
 * delivers these. Survives bot restarts. */
 export const ASYNC_AGENTS_STATE_FILE = resolve(DATA_DIR, "state", "async-agents.json");
+ /** state/sessions.json — Persisted user sessions across bot restarts (v4.11.0).
+ * Includes: sessionId (Claude SDK resume token), language, effort, voiceReply,
+ * workingDir, lastActivity, lastSdkHistoryIndex, history (capped). Atomic write
+ * via tmp+rename. Loaded on startup, debounce-flushed on mutations.
+ * See src/services/session-persistence.ts for the loader/flusher. */
+ export const SESSIONS_STATE_FILE = resolve(DATA_DIR, "state", "sessions.json");
 /** soul.md — Bot personality */
 export const SOUL_FILE = resolve(DATA_DIR, "soul.md");
 /** tools.md — Custom tool definitions (Markdown) */
package/dist/services/compaction.js CHANGED
@@ -65,6 +65,19 @@ export async function compactSession(session) {
 catch (err) {
 console.error("Compaction: failed to flush to memory:", err);
 }
+ // v4.11.0 P1 #5 — Auto-extract structured facts from the archived chunk
+ // and persist them to MEMORY.md. Experimental feature, opt-out via
+ // MEMORY_EXTRACTION_DISABLED=1. Safe wrapper — never throws.
+ try {
+ const { extractAndStoreFacts } = await import("./memory-extractor.js");
+ const result = await extractAndStoreFacts(summaryInput);
+ if (result.factsStored > 0) {
+ console.log(`🧠 memory-extractor: stored ${result.factsStored} new fact(s) in MEMORY.md`);
+ }
+ }
+ catch (err) {
+ console.warn("memory-extractor failed (non-fatal):", err instanceof Error ? err.message : err);
+ }
 // Try AI-powered summary
 let summaryText = null;
 try {
package/dist/services/memory-extractor.js CHANGED
@@ -0,0 +1,178 @@
+ /**
+ * Memory Extractor (v4.11.0, experimental)
+ *
+ * When the compaction service archives old conversation chunks, it normally
+ * dumps prose into the daily log. This extractor adds a structured pass that
+ * pulls user_facts, preferences, and decisions out of the chunk and appends
+ * them to MEMORY.md (de-duplicated by exact-string match).
+ *
+ * Pattern inspired by Mem0's auto-extraction. Designed to be safe:
+ * - Opt-out via MEMORY_EXTRACTION_DISABLED=1
+ * - Uses the active provider with effort=low
+ * - Failures are swallowed; compaction continues regardless
+ * - Dedup is exact-string only (no embedding-based semantic dedup yet)
+ */
+ import fs from "fs";
+ import { dirname } from "path";
+ import { MEMORY_FILE } from "../paths.js";
+ const EMPTY_FACTS = {
+ user_facts: [],
+ preferences: [],
+ decisions: [],
+ };
+ const EXTRACTION_PROMPT = `Extract structured facts from this conversation chunk. Return ONLY a JSON object with these keys:
+
+ {
+ "user_facts": ["concrete facts about the user that should persist forever"],
+ "preferences": ["communication style or workflow preferences the user expressed"],
+ "decisions": ["explicit decisions made (e.g., 'use X instead of Y')"]
+ }
+
+ Rules:
+ - Each entry must be ONE short, declarative sentence (max 100 chars).
+ - Skip transient conversation details (questions, todos, ephemeral state).
+ - Skip facts that are obvious from context (e.g., "user asked a question").
+ - Empty arrays are fine — don't invent facts.
+ - Output ONLY the JSON, no commentary.
+
+ Conversation chunk:
+ `;
+ /**
+ * Parse the JSON output from the AI extractor. Tolerates markdown code-fence
+ * wrapping and surrounding prose. Returns empty arrays on any parse failure.
+ */
+ export function parseExtractedFacts(text) {
+ if (!text || typeof text !== "string")
+ return { ...EMPTY_FACTS };
+ // Strip markdown code fences if present
+ let cleaned = text.trim();
+ const fenceMatch = cleaned.match(/^```(?:json)?\s*\n?([\s\S]*?)\n?```\s*$/);
+ if (fenceMatch)
+ cleaned = fenceMatch[1].trim();
+ // Try to find the first { ... } block if there's surrounding prose
+ const braceMatch = cleaned.match(/\{[\s\S]*\}/);
+ if (braceMatch)
+ cleaned = braceMatch[0];
+ try {
+ const parsed = JSON.parse(cleaned);
+ return {
+ user_facts: Array.isArray(parsed.user_facts)
+ ? parsed.user_facts.filter((s) => typeof s === "string")
+ : [],
+ preferences: Array.isArray(parsed.preferences)
+ ? parsed.preferences.filter((s) => typeof s === "string")
+ : [],
+ decisions: Array.isArray(parsed.decisions)
+ ? parsed.decisions.filter((s) => typeof s === "string")
+ : [],
+ };
+ }
+ catch {
+ return { ...EMPTY_FACTS };
+ }
+ }
+ /**
+ * Append extracted facts to MEMORY.md under structured headers, deduplicated
+ * by exact-string match against existing content.
+ */
+ export async function appendFactsToMemoryFile(facts) {
+ const total = facts.user_facts.length + facts.preferences.length + facts.decisions.length;
+ if (total === 0)
+ return 0;
+ // Read existing content for dedup
+ let existing = "";
+ try {
+ existing = fs.readFileSync(MEMORY_FILE, "utf-8");
+ }
+ catch {
+ // File doesn't exist yet — that's fine, mkdir parent
+ fs.mkdirSync(dirname(MEMORY_FILE), { recursive: true });
+ }
+ const isDuplicate = (line) => existing.includes(line);
+ const newLines = [];
+ const todayIso = new Date().toISOString().slice(0, 10);
+ const sectionHeader = `\n\n## Auto-extracted (${todayIso})\n`;
+ let stored = 0;
+ if (facts.user_facts.length > 0) {
+ const newOnes = facts.user_facts.filter(f => !isDuplicate(f));
+ if (newOnes.length > 0) {
+ newLines.push("\n### User Facts");
+ for (const f of newOnes) {
+ newLines.push(`- ${f}`);
+ stored++;
+ }
+ }
+ }
+ if (facts.preferences.length > 0) {
+ const newOnes = facts.preferences.filter(p => !isDuplicate(p));
+ if (newOnes.length > 0) {
+ newLines.push("\n### Preferences");
+ for (const p of newOnes) {
+ newLines.push(`- ${p}`);
+ stored++;
+ }
+ }
+ }
+ if (facts.decisions.length > 0) {
+ const newOnes = facts.decisions.filter(d => !isDuplicate(d));
+ if (newOnes.length > 0) {
+ newLines.push("\n### Decisions");
+ for (const d of newOnes) {
+ newLines.push(`- ${d}`);
+ stored++;
+ }
+ }
+ }
+ if (stored > 0) {
+ const block = sectionHeader + newLines.join("\n") + "\n";
+ fs.appendFileSync(MEMORY_FILE, block, "utf-8");
+ }
+ return stored;
+ }
+ /**
+ * Extract facts from a conversation chunk and store them in MEMORY.md.
+ * Safe wrapper — never throws, always returns an ExtractionResult.
+ */
+ export async function extractAndStoreFacts(conversationText) {
+ if (process.env.MEMORY_EXTRACTION_DISABLED === "1") {
+ return { disabled: true, factsStored: 0 };
+ }
+ if (!conversationText || conversationText.trim().length < 50) {
+ return { disabled: false, factsStored: 0 };
+ }
+ let extractedText = "";
+ try {
+ // Lazy-import the registry so test environments without an engine init
+ // don't crash on module load.
+ const { getRegistry } = await import("../engine.js");
+ const registry = getRegistry();
+ const opts = {
+ prompt: EXTRACTION_PROMPT + conversationText.slice(0, 8000),
+ systemPrompt: "You are a fact extractor. Output only valid JSON, no commentary.",
+ effort: "low",
+ };
+ for await (const chunk of registry.queryWithFallback(opts)) {
+ if (chunk.type === "text" && chunk.text) {
+ extractedText = chunk.text;
+ }
+ if (chunk.type === "error") {
+ // Provider failed — silent fallback
+ return { disabled: false, factsStored: 0 };
+ }
+ }
+ }
+ catch {
+ return { disabled: false, factsStored: 0 };
+ }
+ if (!extractedText)
+ return { disabled: false, factsStored: 0 };
+ const facts = parseExtractedFacts(extractedText);
+ let stored = 0;
+ try {
+ stored = await appendFactsToMemoryFile(facts);
+ }
+ catch {
+ // appendFactsToMemoryFile failed — non-fatal
+ }
+ return { disabled: false, factsStored: stored };
+ }
package/dist/services/memory-layers.js CHANGED
@@ -0,0 +1,147 @@
+ /**
+ * Memory Layers Service (v4.11.0)
+ *
+ * Layered memory loader inspired by mempalace's L0–L3 stack:
+ *
+ * L0 identity.md always loaded, ~200 tokens (core user facts)
+ * L1 preferences.md always loaded (communication style)
+ * L1 MEMORY.md backwards-compat: monolithic curated knowledge
+ * L2 projects/*.md loaded on topic match against the user's query
+ * L3 daily logs only via vector search (handled by embeddings.ts)
+ *
+ * If neither identity.md nor preferences.md exists, this loader still works
+ * via the monolithic MEMORY.md fallback, so existing setups need no migration.
+ *
+ * Token budget: capped at ~5000 chars for L0+L1, +~3000 chars for matched L2.
+ */
+ import fs from "fs";
+ import path from "path";
+ import { IDENTITY_FILE, PREFERENCES_FILE, PROJECTS_MEMORY_DIR, MEMORY_FILE, } from "../paths.js";
+ const MAX_L0_L1_CHARS = 5000;
+ const MAX_L2_PROJECT_CHARS = 1500;
+ const MAX_L2_TOTAL_CHARS = 3000;
+ function readSafe(file) {
+ try {
+ return fs.readFileSync(file, "utf-8");
+ }
+ catch {
+ return "";
+ }
+ }
+ /**
+ * Load all memory layers from disk. Cheap — no API calls, just file reads.
+ */
+ export function loadMemoryLayers() {
+ const identity = readSafe(IDENTITY_FILE);
+ const preferences = readSafe(PREFERENCES_FILE);
+ const longTerm = readSafe(MEMORY_FILE);
+ const projects = [];
+ try {
+ if (fs.existsSync(PROJECTS_MEMORY_DIR)) {
+ const entries = fs.readdirSync(PROJECTS_MEMORY_DIR);
+ for (const entry of entries) {
+ if (!entry.endsWith(".md") || entry.startsWith("."))
+ continue;
+ const fullPath = path.resolve(PROJECTS_MEMORY_DIR, entry);
+ const content = readSafe(fullPath);
+ if (content.trim()) {
+ projects.push({
+ topic: entry.replace(/\.md$/, ""),
+ content,
+ });
+ }
+ }
+ }
+ }
+ catch {
+ // projects dir missing or unreadable — fine
+ }
+ return { identity, preferences, longTerm, projects };
+ }
+ /**
+ * Match L2 projects against the user query.
+ * Topic match is naive substring (case-insensitive) on filename + first 200 chars
+ * of the project content. For v4.11.0 this is intentionally simple — vector
+ * search via embeddings.ts handles the deep cases.
+ */
+ function matchProjectsToQuery(projects, query) {
+ if (!query)
+ return [];
+ const q = query.toLowerCase();
+ const matched = [];
+ for (const p of projects) {
+ const topicLower = p.topic.toLowerCase();
+ if (q.includes(topicLower)) {
+ matched.push(p);
+ continue;
+ }
+ // Also check the first 200 chars of project content — this catches cases
+ // where the user mentions a project's headline term that isn't the
+ // filename (e.g., "VPS" matching alev-b.md which mentions "VPS:" upfront).
+ const head = p.content.slice(0, 200).toLowerCase();
+ const headWords = head.split(/[\s\W]+/).filter(w => w.length >= 4);
+ if (headWords.some(w => q.includes(w))) {
+ matched.push(p);
+ }
+ }
+ return matched;
+ }
+ /**
+ * Build a token-budgeted layered context string suitable for system prompt injection.
+ *
+ * @param query Optional user query. If provided, L2 projects matching the query
+ * get included. If omitted, only L0+L1 are loaded (boot-up brief).
+ */
+ export function buildLayeredContext(query) {
+ const layers = loadMemoryLayers();
+ const parts = [];
+ let l0l1Chars = 0;
+ if (layers.identity) {
+ const truncated = layers.identity.length > MAX_L0_L1_CHARS
+ ? layers.identity.slice(0, MAX_L0_L1_CHARS) + "\n[...truncated]"
+ : layers.identity;
+ parts.push("## Identity (L0)\n" + truncated);
+ l0l1Chars += truncated.length;
+ }
+ if (layers.preferences && l0l1Chars < MAX_L0_L1_CHARS) {
+ const remaining = MAX_L0_L1_CHARS - l0l1Chars;
+ const truncated = layers.preferences.length > remaining
+ ? layers.preferences.slice(0, remaining) + "\n[...truncated]"
+ : layers.preferences;
+ parts.push("## Preferences (L1)\n" + truncated);
+ l0l1Chars += truncated.length;
+ }
+ // Backwards-compat: if no identity AND no preferences, use the monolithic
+ // MEMORY.md as L1 fully (existing user setups). If split files exist,
+ // include MEMORY.md as a secondary L1 with tighter truncation.
+ if (!layers.identity && !layers.preferences && layers.longTerm) {
+ const truncated = layers.longTerm.length > MAX_L0_L1_CHARS
+ ? layers.longTerm.slice(0, MAX_L0_L1_CHARS) + "\n[...truncated]"
+ : layers.longTerm;
+ parts.push("## Long-term Memory (L1, monolithic)\n" + truncated);
+ }
+ else if (layers.longTerm) {
+ const SECONDARY_CAP = 1500;
+ const truncated = layers.longTerm.length > SECONDARY_CAP
+ ? layers.longTerm.slice(0, SECONDARY_CAP) + "\n[...truncated]"
+ : layers.longTerm;
+ parts.push("## Long-term Memory (L1, legacy MEMORY.md)\n" + truncated);
+ }
+ // L2: project-specific, only when a query is provided
131
+ if (query && layers.projects.length > 0) {
132
+ const matched = matchProjectsToQuery(layers.projects, query);
133
+ let l2TotalChars = 0;
134
+ for (const p of matched) {
135
+ if (l2TotalChars >= MAX_L2_TOTAL_CHARS)
136
+ break;
137
+ const remaining = MAX_L2_TOTAL_CHARS - l2TotalChars;
138
+ const cap = Math.min(MAX_L2_PROJECT_CHARS, remaining);
139
+ const content = p.content.length > cap
140
+ ? p.content.slice(0, cap) + "\n[...truncated]"
141
+ : p.content;
142
+ parts.push(`## Project: ${p.topic} (L2)\n${content}`);
143
+ l2TotalChars += content.length;
144
+ }
145
+ }
146
+ return parts.join("\n\n");
147
+ }
@@ -11,6 +11,7 @@ import fs from "fs";
  import { resolve } from "path";
  import { MEMORY_DIR, MEMORY_FILE } from "../paths.js";
  import { reindexMemory } from "./embeddings.js";
+ import { buildLayeredContext } from "./memory-layers.js";
  // Ensure dirs exist
  if (!fs.existsSync(MEMORY_DIR))
      fs.mkdirSync(MEMORY_DIR, { recursive: true });
@@ -65,16 +66,22 @@ export function appendDailyLog(entry) {
      reindexMemory().catch(() => { });
  }
  /**
-  * Build memory context for injection into non-SDK prompts.
-  * Returns relevant memory as a compact string.
+  * Build memory context for injection into prompts.
+  *
+  * v4.11.0 — Now uses the layered memory loader (memory-layers.ts) which
+  * combines L0 (identity.md), L1 (preferences.md + legacy MEMORY.md), and
+  * optional L2 (projects/*.md matched against the query). Falls back to the
+  * monolithic MEMORY.md alone if the split files don't exist.
+  *
+  * @param query Optional user query — when provided, L2 projects matching
+  * the query get included. When omitted, only L0+L1 are loaded.
   */
- export function buildMemoryContext() {
+ export function buildMemoryContext(query) {
      const parts = [];
-     // Long-term memory (truncate if too long)
-     const ltm = loadLongTermMemory();
-     if (ltm) {
-         const truncated = ltm.length > 2000 ? ltm.slice(0, 2000) + "\n[...truncated]" : ltm;
-         parts.push(`## Long-term Memory\n${truncated}`);
+     // L0+L1 (+ matched L2 if query) via layered loader
+     const layered = buildLayeredContext(query);
+     if (layered) {
+         parts.push(layered);
      }
      // Today's log
      const todayLog = loadDailyLog();