agent-memory-store 0.0.5 → 0.0.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.MD CHANGED
@@ -1,48 +1,49 @@
  # agent-memory-store
 
- > Local-first MCP memory server for multi-agent systems.
+ > High-performance MCP memory server for multi-agent systems — SQLite-backed with hybrid search.
 
  [![npm version](https://img.shields.io/npm/v/agent-memory-store.svg)](https://www.npmjs.com/package/agent-memory-store)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
- [![Node.js](https://img.shields.io/badge/node-%3E%3D18-green.svg)](https://nodejs.org)
+ [![Node.js](https://img.shields.io/badge/node-%3E%3D22.5-green.svg)](https://nodejs.org)
 
- `agent-memory-store` gives your AI agents a shared, searchable, persistent memory — running entirely on your local filesystem. No vector database, no embedding APIs, no cloud services required.
+ `agent-memory-store` gives your AI agents a shared, searchable, persistent memory — powered by SQLite with native FTS5 full-text search and optional semantic embeddings. No external services required.
 
- Agents read and write **chunks** (markdown files with YAML frontmatter) through a set of MCP tools. Search is powered by **BM25**, the same ranking algorithm used by Elasticsearch, implemented in pure JavaScript with zero runtime dependencies.
+ Agents read and write **chunks** through MCP tools. Search combines **BM25 ranking** (via SQLite FTS5) with **semantic vector similarity** (via local embeddings), merged through Reciprocal Rank Fusion for best-of-both-worlds retrieval.
 
  ```
  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
  │   Agent A   │     │   Agent B   │     │   Agent C   │
  └──────┬──────┘     └──────┬──────┘     └──────┬──────┘
         │                   │                   │
-        └───────────────────┬──────────────────-┘
+        └───────────────────┬───────────────────┘
                             │ MCP tools
                  ┌──────────▼──────────┐
-                 │ agent-memory-store  │
-                 │   search · write    │
-                 │ read · state · list │
+                 │ agent-memory-store  │
+                 │    hybrid search    │
+                 │   BM25 + semantic   │
                  └──────────┬──────────┘
 
                  ┌──────────▼──────────┐
                  │ .agent-memory-store/│
-                 │  ├── chunks/        │
-                 │  └── state/         │
-                 └──────────────────────┘
+                 │  └── store.db       │
+                 └─────────────────────┘
  ```
 
  ## Features
 
  - **Zero-install usage** via `npx`
- - **BM25 full-text search** — relevance ranking without embeddings or APIs
+ - **Hybrid search** — BM25 full-text (FTS5) + semantic vector similarity + Reciprocal Rank Fusion
+ - **SQLite-backed** — single `store.db` file, WAL mode, native performance
+ - **Local embeddings** — 384-dim vectors via `all-MiniLM-L6-v2`, no API keys needed
  - **Tag and agent filtering** — find chunks by who wrote them or what they cover
  - **TTL-based expiry** — chunks auto-delete after a configurable number of days
  - **Session state** — key/value store for pipeline progress, flags, and counters
- - **Plain files** — chunks are `.md` files, readable and editable by humans and git
- - **MCP-native** — works with Claude Code, opencode, and any MCP-compatible client
+ - **MCP-native** — works with Claude Code, opencode, Cursor, and any MCP-compatible client
+ - **Zero external database dependencies** — uses Node.js built-in SQLite (`node:sqlite`)
 
  ## Requirements
 
- - Node.js 18
+ - Node.js >= 22.5 (required for native `node:sqlite` with FTS5 support)
 
  ## Quick start
 
@@ -52,7 +53,7 @@ No installation needed:
  npx agent-memory-store
  ```
 
- By default, memory is stored in `.agent-memory-store/` inside the directory where the server starts — so each project gets its own isolated store automatically.
+ By default, memory is stored in `.agent-memory-store/store.db` inside the directory where the server starts — so each project gets its own isolated store automatically.
 
  To use a custom path:
@@ -60,6 +61,18 @@ To use a custom path:
  AGENT_STORE_PATH=/your/project/.agent-memory-store npx agent-memory-store
  ```
 
+ ## Performance
+
+ Benchmarked on Apple Silicon (Node v25, darwin arm64):
+
+ | Operation         | 100 chunks | 1K chunks | 5K chunks | 10K chunks |
+ |-------------------|------------|-----------|-----------|------------|
+ | **write**         | 2.16 ms    | 0.15 ms   | 0.15 ms   | 0.15 ms    |
+ | **read**          | 0.02 ms    | 0.02 ms   | 0.02 ms   | 0.02 ms    |
+ | **search (BM25)** | 0.4 ms     | 1.2 ms    | 5.3 ms    | 9.9 ms     |
+ | **list**          | 0.2 ms     | 1.4 ms    | 9.9 ms    | 14.7 ms    |
+ | **state get/set** | 0.03 ms    | 0.03 ms   | 0.03 ms   | 0.03 ms    |
+
  ## Configuration
 
  ### Claude Code
@@ -155,32 +168,41 @@ If you need to store memory outside the project directory, set `AGENT_STORE_PATH
 
  ### Environment variables
 
- | Variable | Default | Description |
- | ------------------ | ----------------------- | ------------------------------------------------------------------ |
+ | Variable | Default | Description |
+ |---|---|---|
  | `AGENT_STORE_PATH` | `./.agent-memory-store` | Custom path to the storage directory. Omit to use project default. |
 
  ## Tools
 
- | Tool | When to use |
- | ---------------- | ------------------------------------------------------------------------- |
+ | Tool | When to use |
+ |---|---|
  | `search_context` | **Start of every task** — retrieve relevant prior knowledge before acting |
- | `write_context`  | After decisions, discoveries, or outputs that other agents will need |
- | `read_context`   | Read a specific chunk by ID |
- | `list_context`   | Inventory the memory store (metadata only, no body) |
- | `delete_context` | Remove outdated or incorrect chunks |
- | `get_state`      | Read a pipeline variable (progress, flags, counters) |
- | `set_state`      | Write a pipeline variable |
+ | `write_context`  | After decisions, discoveries, or outputs that other agents will need |
+ | `read_context`   | Read a specific chunk by ID |
+ | `list_context`   | Inventory the memory store (metadata only, no body) |
+ | `delete_context` | Remove outdated or incorrect chunks |
+ | `get_state`      | Read a pipeline variable (progress, flags, counters) |
+ | `set_state`      | Write a pipeline variable |
 
  ### `search_context`
 
  ```
- query      string              Search query. Use specific, canonical terms.
- tags       string[] (optional) Narrow to chunks matching any of these tags.
- agent      string   (optional) Narrow to chunks written by a specific agent.
- top_k      number   (optional) Max results to return. Default: 6.
- min_score  number   (optional) Minimum BM25 score. Default: 0.1.
+ query        string              Search query. Use specific, canonical terms.
+ tags         string[] (optional) Narrow to chunks matching any of these tags.
+ agent        string   (optional) Narrow to chunks written by a specific agent.
+ top_k        number   (optional) Max results to return. Default: 6.
+ min_score    number   (optional) Minimum relevance score. Default: 0.1.
+ search_mode  string   (optional) "hybrid" (default), "bm25", or "semantic".
  ```
 
+ **Search modes:**
+
+ | Mode | How it works | Best for |
+ |---|---|---|
+ | `hybrid` | BM25 + semantic similarity merged via Reciprocal Rank Fusion | General use (default) |
+ | `bm25` | FTS5 keyword matching only | Exact term lookups, canonical tags |
+ | `semantic` | Vector cosine similarity only | Finding conceptually related chunks |
+
  ### `write_context`
 
  ```
@@ -199,33 +221,42 @@ key    string  State variable name.
  value  any     (set_state only) Any JSON-serializable value.
  ```
 
- ## Storage format
+ ## Architecture
 
- Each chunk is a plain `.md` file under `.agent-memory-store/chunks/`:
-
- ```markdown
- ---
- id: a3f9c12b40
- topic: "Auth service — chose JWT over sessions"
- agent: architect-agent
- tags: [auth, architecture, decision]
- importance: high
- updated: 2025-06-01T14:32:00.000Z
- ---
-
- Chose stateless JWT over server-side sessions.
-
- **Rationale:** No shared session store needed across services.
- Refresh tokens stored in Redis with 7-day TTL.
- Access tokens expire in 15 minutes.
-
- **Trade-offs:** Cannot invalidate individual tokens before expiry.
- Acceptable for our threat model.
  ```
+ src/
+   index.js       MCP server — tool registration and transport
+   store.js       Public API — searchChunks, writeChunk, readChunk, etc.
+   db.js          SQLite layer — node:sqlite with FTS5, WAL mode
+   search.js      Hybrid search — FTS5 BM25 + vector similarity + RRF
+   embeddings.js  Local embeddings — @huggingface/transformers (all-MiniLM-L6-v2)
+   bm25.js        Pure JS BM25 — kept as fallback reference
+   migrate.js     Filesystem → SQLite migration (automatic, one-time)
+ ```
+
+ ### Storage format
+
+ All data lives in a single SQLite database at `.agent-memory-store/store.db`:
+
+ - **chunks table** — id, topic, agent, tags (JSON), importance, content, embedding (BLOB), timestamps, expiry
+ - **chunks_fts** — FTS5 virtual table synced via triggers for full-text search
+ - **state table** — key/value pairs for pipeline variables
+
+ WAL mode is enabled for concurrent read performance. No manual flush needed.
+
+ ### How hybrid search works
+
+ 1. **BM25 (FTS5)** — SQLite's native full-text search ranks chunks by term frequency and inverse document frequency. Fast, deterministic, great for exact keyword matches.
+
+ 2. **Semantic similarity** — Query and chunks are embedded into 384-dimensional vectors using `all-MiniLM-L6-v2` (runs locally via ONNX Runtime). Cosine similarity finds conceptually related chunks even when exact terms don't match.
+
+ 3. **Reciprocal Rank Fusion** — Both ranked lists are merged using RRF with weights (BM25: 0.4, semantic: 0.6). Documents appearing in both lists get boosted.
 
- Session state lives in `.agent-memory-store/state/<key>.json`.
+ The embedding model (~23MB) is downloaded automatically on first use and cached in `~/.cache/huggingface/`. If the model fails to load, the system falls back to BM25-only search transparently.
 
- Both directories are human-readable, diffable with git, and can be committed to version control if you want shared team memory.
+ ### Migration from filesystem
+
+ If you're upgrading from a previous version that used `.md` files, the migration happens automatically on first startup. Your existing chunks and state are imported into SQLite, and the old directories are renamed to `chunks_backup/` and `state_backup/`.
 
  ## Agent system prompt
 
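The Reciprocal Rank Fusion step described under "How hybrid search works" can be sketched in a few lines of JavaScript. This is an illustration only, not the package's actual `search.js` (which this diff does not include): the 0.4/0.6 weights come from the README above, while the constant `k = 60` and every function name here are assumptions.

```javascript
// Hedged sketch of weighted Reciprocal Rank Fusion over two ranked lists.
// Each input list is [{ id, score }] sorted best-first; only rank positions
// matter for RRF, not the raw scores.
function reciprocalRankFusion(
  bm25Results,
  semanticResults,
  { k = 60, bm25Weight = 0.4, semanticWeight = 0.6 } = {},
) {
  const fused = new Map();

  const accumulate = (results, weight) => {
    results.forEach(({ id }, rank) => {
      // RRF contribution for a 0-based rank: weight / (k + rank + 1)
      fused.set(id, (fused.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };

  accumulate(bm25Results, bm25Weight);
  accumulate(semanticResults, semanticWeight);

  return [...fused.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document appearing in both lists accumulates contributions from both, which is the "boosted" behavior the README describes.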
@@ -238,8 +269,8 @@ You have access to a persistent local memory store via agent-memory-store MCP to
 
  **At the start of each task:**
 
- 1. Call `search_context` with 2–3 specific queries related to what you are about to do.
- 2. Incorporate retrieved chunks (score > 1.0) into your reasoning.
+ 1. Call `search_context` with 2-3 specific queries related to what you are about to do.
+ 2. Incorporate retrieved chunks into your reasoning.
  3. Call `get_state` to check pipeline status if relevant.
 
  **After completing a subtask:**
@@ -254,33 +285,13 @@ You have access to a persistent local memory store via agent-memory-store MCP to
  **Best practices:**
 
  - Specific topics: "ZAP scraper — stack decision" > "decision"
- - Consistent tags: always use the same term (`auth`, not `authentication` or `autenticação`)
+ - Consistent tags: always use the same term (`auth`, not `authentication`)
  - Check before writing: search first to avoid duplicate chunks
  - Temporary context: use `ttl_days: 7` for session-scoped information
+ - Use `search_mode: "semantic"` when looking for conceptually related chunks
+ - Use `search_mode: "bm25"` for exact tag/keyword lookups
  ```
 
- ## How BM25 search works
-
- BM25 ranks documents by term frequency and inverse document frequency, normalized by document length. It is the ranking algorithm behind Elasticsearch and Apache Lucene.
-
- **Strengths:**
-
- - Works well for short, labeled text chunks
- - Instant — no network calls, no GPU, no warm-up
- - Deterministic and explainable
-
- **Limitations:**
-
- - No semantic understanding (`car` ≠ `automobile`)
- - Mitigated by using canonical tags and consistent terminology across agents
-
- **Score interpretation:**
-
- - `> 3.0` — strong match, highly relevant
- - `1.0 – 3.0` — good match, likely relevant
- - `0.1 – 1.0` — weak match, may be tangentially related
- - `< 0.1` — filtered out by default
-
  ## Development
 
  ```bash
@@ -296,22 +307,20 @@ Run tests:
  npm test
  ```
 
- See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
-
- ## Project structure
+ Run benchmark:
 
+ ```bash
+ node benchmark.js
  ```
- src/
-   bm25.js   BM25 ranking engine — pure JS, zero dependencies
-   store.js  File-based persistence (chunks + session state)
-   index.js  MCP server and tool definitions
- ```
+
+ See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
 
  ## Roadmap
 
  - [ ] `summarize_context` tool — LLM-powered chunk consolidation
  - [ ] `prune_context` tool — remove chunks by age, agent, or importance
- - [ ] Hybrid scoring: BM25 + optional local embedding reranking (ollama)
+ - [x] ~~Hybrid scoring: BM25 + local embedding reranking~~ — shipped in v0.0.7
+ - [x] ~~SQLite-backed storage~~ — shipped in v0.0.7
  - [ ] Web UI for browsing the memory store
  - [ ] Multi-project workspace support
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "agent-memory-store",
-   "version": "0.0.5",
-   "description": "Local-first MCP memory server for multi-agent systems. BM25 search, zero external dependencies, file-based persistence.",
+   "version": "0.0.7",
+   "description": "Local-first MCP memory server for multi-agent systems. Hybrid search (BM25 + semantic embeddings), SQLite-backed, zero-config.",
    "type": "module",
    "exports": "./src/index.js",
    "bin": {
@@ -10,7 +10,7 @@
    "scripts": {
      "start": "node src/index.js",
      "test": "node --test src/__tests__/store.test.js",
-     "lint": "node --check src/bm25.js src/store.js src/index.js"
+     "lint": "node --check src/bm25.js src/store.js src/index.js src/db.js src/embeddings.js src/search.js src/migrate.js"
    },
    "keywords": [
      "mcp",
@@ -20,6 +20,11 @@
      "memory",
      "rag",
      "bm25",
+     "embeddings",
+     "semantic-search",
+     "sqlite",
+     "vector",
+     "kv-store",
      "context",
      "opencode",
      "claude",
@@ -36,7 +41,7 @@
    },
    "homepage": "https://github.com/vbfs/agent-memory-store#readme",
    "engines": {
-     "node": ">=18.0.0"
+     "node": ">=22.5.0"
    },
    "files": [
      "src/",
@@ -44,6 +49,7 @@
      "LICENSE"
    ],
    "dependencies": {
+     "@huggingface/transformers": "^3.0.0",
      "@modelcontextprotocol/sdk": "^1.28.0",
      "gray-matter": "^4.0.3",
      "zod": "^4.3.6"
package/src/db.js ADDED
@@ -0,0 +1,354 @@
+ /**
+  * SQLite database layer powered by node:sqlite (built-in).
+  *
+  * Single-file database at <STORE_PATH>/store.db with WAL mode.
+  * FTS5 for full-text BM25 search, BLOB columns for vector embeddings.
+  * Zero external dependencies — uses Node.js native SQLite (>=22.5).
+  */
+
+ import { DatabaseSync } from "node:sqlite";
+ import { mkdirSync } from "fs";
+ import path from "path";
+
+ const STORE_PATH = process.env.AGENT_STORE_PATH
+   ? path.resolve(process.env.AGENT_STORE_PATH)
+   : path.join(process.cwd(), ".agent-memory-store");
+
+ const DB_PATH = path.join(STORE_PATH, "store.db");
+
+ let db = null;
+
+ // ─── Schema ─────────────────────────────────────────────────────────────────
+
+ const SCHEMA_TABLES = `
+ CREATE TABLE IF NOT EXISTS chunks (
+   id TEXT PRIMARY KEY,
+   topic TEXT NOT NULL,
+   agent TEXT NOT NULL DEFAULT 'global',
+   tags TEXT NOT NULL DEFAULT '[]',
+   importance TEXT NOT NULL DEFAULT 'medium',
+   content TEXT NOT NULL,
+   embedding BLOB,
+   created_at TEXT NOT NULL,
+   updated_at TEXT NOT NULL,
+   expires_at TEXT
+ );
+
+ CREATE INDEX IF NOT EXISTS idx_chunks_agent ON chunks(agent);
+ CREATE INDEX IF NOT EXISTS idx_chunks_updated ON chunks(updated_at);
+ CREATE INDEX IF NOT EXISTS idx_chunks_expires ON chunks(expires_at);
+
+ CREATE TABLE IF NOT EXISTS state (
+   key TEXT PRIMARY KEY,
+   value TEXT NOT NULL,
+   updated_at TEXT NOT NULL
+ );
+ `;
+
+ const SCHEMA_FTS = `
+ CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
+   id UNINDEXED,
+   topic,
+   tags,
+   agent,
+   content,
+   content='chunks',
+   content_rowid='rowid'
+ );
+ `;
+
+ const SCHEMA_TRIGGERS = `
+ CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
+   INSERT INTO chunks_fts(rowid, id, topic, tags, agent, content)
+   VALUES (new.rowid, new.id, new.topic, new.tags, new.agent, new.content);
+ END;
+
+ CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
+   INSERT INTO chunks_fts(chunks_fts, rowid, id, topic, tags, agent, content)
+   VALUES ('delete', old.rowid, old.id, old.topic, old.tags, old.agent, old.content);
+ END;
+
+ CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
+   INSERT INTO chunks_fts(chunks_fts, rowid, id, topic, tags, agent, content)
+   VALUES ('delete', old.rowid, old.id, old.topic, old.tags, old.agent, old.content);
+   INSERT INTO chunks_fts(rowid, id, topic, tags, agent, content)
+   VALUES (new.rowid, new.id, new.topic, new.tags, new.agent, new.content);
+ END;
+ `;
+
+ // ─── Initialization ─────────────────────────────────────────────────────────
+
+ /**
+  * Returns the database instance. Creates it on first call.
+  * Synchronous — node:sqlite DatabaseSync is synchronous by design.
+  */
+ export function getDb() {
+   if (db) return db;
+
+   mkdirSync(STORE_PATH, { recursive: true });
+
+   db = new DatabaseSync(DB_PATH);
+
+   // WAL mode for better concurrent read performance
+   db.exec("PRAGMA journal_mode = WAL");
+
+   // INSERT OR REPLACE only fires the AFTER DELETE trigger (which keeps
+   // chunks_fts in sync) when recursive triggers are enabled.
+   db.exec("PRAGMA recursive_triggers = ON");
+
+   // Run schema
+   db.exec(SCHEMA_TABLES);
+   db.exec(SCHEMA_FTS);
+   db.exec(SCHEMA_TRIGGERS);
+
+   // Purge expired chunks. expires_at holds ISO-8601 strings
+   // (e.g. 2025-06-01T14:32:00.000Z), so render 'now' in the same format —
+   // datetime('now') uses a space separator and would compare incorrectly.
+   db.prepare(
+     `DELETE FROM chunks
+      WHERE expires_at IS NOT NULL
+        AND expires_at < strftime('%Y-%m-%dT%H:%M:%fZ', 'now')`,
+   ).run();
+
+   // Graceful shutdown
+   const shutdown = () => {
+     if (db) db.close();
+     process.exit(0);
+   };
+   process.on("SIGINT", shutdown);
+   process.on("SIGTERM", shutdown);
+
+   return db;
+ }
115
+
116
+ // ─── CRUD Operations ────────────────────────────────────────────────────────
117
+
118
+ /**
119
+ * Inserts or replaces a chunk in the database.
120
+ */
121
+ export function insertChunk({
122
+ id,
123
+ topic,
124
+ agent,
125
+ tags,
126
+ importance,
127
+ content,
128
+ embedding,
129
+ createdAt,
130
+ updatedAt,
131
+ expiresAt,
132
+ }) {
133
+ const d = getDb();
134
+ d.prepare(
135
+ `INSERT OR REPLACE INTO chunks (id, topic, agent, tags, importance, content, embedding, created_at, updated_at, expires_at)
136
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
137
+ ).run(
138
+ id,
139
+ topic,
140
+ agent,
141
+ JSON.stringify(tags),
142
+ importance,
143
+ content,
144
+ embedding ? Buffer.from(embedding.buffer) : null,
145
+ createdAt,
146
+ updatedAt,
147
+ expiresAt,
148
+ );
149
+ }
150
+
151
+ /**
152
+ * Retrieves a single chunk by ID.
153
+ * @returns {object|null}
154
+ */
155
+ export function getChunk(id) {
156
+ const d = getDb();
157
+ const row = d.prepare(`SELECT * FROM chunks WHERE id = ?`).get(id);
158
+ if (!row) return null;
159
+ return parseChunkRow(row);
160
+ }
161
+
162
+ /**
163
+ * Deletes a chunk by ID.
164
+ * @returns {boolean} true if a row was deleted
165
+ */
166
+ export function deleteChunkById(id) {
167
+ const d = getDb();
168
+ const result = d.prepare(`DELETE FROM chunks WHERE id = ?`).run(id);
169
+ return result.changes > 0;
170
+ }
171
+
172
+ /**
173
+ * Lists chunk metadata, with optional agent/tags filters.
174
+ * Sorted by updated_at descending.
175
+ */
176
+ export function listChunksDb({ agent, tags = [] } = {}) {
177
+ const d = getDb();
178
+ let sql = `SELECT id, topic, agent, tags, importance, updated_at FROM chunks`;
179
+ const conditions = [];
180
+ const params = [];
181
+
182
+ if (agent) {
183
+ conditions.push(`agent = ?`);
184
+ params.push(agent);
185
+ }
186
+
187
+ if (tags.length > 0) {
188
+ const tagConditions = tags.map(() => `tags LIKE ?`);
189
+ conditions.push(`(${tagConditions.join(" OR ")})`);
190
+ params.push(...tags.map((t) => `%"${t}"%`));
191
+ }
192
+
193
+ if (conditions.length) sql += ` WHERE ${conditions.join(" AND ")}`;
194
+ sql += ` ORDER BY updated_at DESC`;
195
+
196
+ const rows = d.prepare(sql).all(...params);
197
+ return rows.map((r) => ({
198
+ id: r.id,
199
+ topic: r.topic,
200
+ agent: r.agent,
201
+ tags: JSON.parse(r.tags),
202
+ importance: r.importance,
203
+ updated: r.updated_at,
204
+ }));
205
+ }
206
+
207
+ /**
208
+ * Full-text search via FTS5 (BM25).
209
+ * Returns ranked results with scores.
210
+ */
211
+ export function searchFTS({ query, agent, tags = [], topK = 18 }) {
212
+ const d = getDb();
213
+
214
+ // Escape FTS5 special chars and build query
215
+ const ftsQuery = query
216
+ .replace(/["*^:(){}[\]]/g, " ")
217
+ .split(/\s+/)
218
+ .filter((t) => t.length > 1)
219
+ .join(" OR ");
220
+
221
+ if (!ftsQuery) return [];
222
+
223
+ let sql = `
224
+ SELECT chunks_fts.id, rank
225
+ FROM chunks_fts
226
+ JOIN chunks ON chunks.id = chunks_fts.id
227
+ WHERE chunks_fts MATCH ?`;
228
+ const params = [ftsQuery];
229
+
230
+ if (agent) {
231
+ sql += ` AND chunks.agent = ?`;
232
+ params.push(agent);
233
+ }
234
+
235
+ if (tags.length > 0) {
236
+ const tagConditions = tags.map(() => `chunks.tags LIKE ?`);
237
+ sql += ` AND (${tagConditions.join(" OR ")})`;
238
+ params.push(...tags.map((t) => `%"${t}"%`));
239
+ }
240
+
241
+ sql += ` ORDER BY rank LIMIT ?`;
242
+ params.push(topK);
243
+
244
+ const rows = d.prepare(sql).all(...params);
245
+ return rows.map((r) => ({
246
+ id: r.id,
247
+ score: -r.rank, // FTS5 rank is negative (lower = better), invert
248
+ }));
249
+ }
250
+
251
+ /**
252
+ * Retrieves all embeddings for vector search.
253
+ * @returns {Array<{ id: string, embedding: Float32Array }>}
254
+ */
255
+ export function getAllEmbeddings({ agent, tags = [] } = {}) {
256
+ const d = getDb();
257
+ let sql = `SELECT id, embedding FROM chunks WHERE embedding IS NOT NULL`;
258
+ const params = [];
259
+
260
+ if (agent) {
261
+ sql += ` AND agent = ?`;
262
+ params.push(agent);
263
+ }
264
+
265
+ if (tags.length > 0) {
266
+ const tagConditions = tags.map(() => `tags LIKE ?`);
267
+ sql += ` AND (${tagConditions.join(" OR ")})`;
268
+ params.push(...tags.map((t) => `%"${t}"%`));
269
+ }
270
+
271
+ const rows = d.prepare(sql).all(...params);
272
+ return rows
273
+ .filter((r) => r.embedding !== null)
274
+ .map((r) => ({
275
+ id: r.id,
276
+ embedding: new Float32Array(
277
+ r.embedding.buffer,
278
+ r.embedding.byteOffset,
279
+ r.embedding.byteLength / 4,
280
+ ),
281
+ }));
282
+ }
283
+
284
+ /**
285
+ * Updates only the embedding for a chunk.
286
+ */
287
+ export function updateEmbedding(id, embedding) {
288
+ const d = getDb();
289
+ d.prepare(`UPDATE chunks SET embedding = ? WHERE id = ?`).run(
290
+ Buffer.from(embedding.buffer),
291
+ id,
292
+ );
293
+ }
294
+
295
+ /**
296
+ * Returns chunks that have no embedding yet.
297
+ */
298
+ export function getChunksWithoutEmbedding() {
299
+ const d = getDb();
300
+ return d
301
+ .prepare(
302
+ `SELECT id, topic, tags, content FROM chunks WHERE embedding IS NULL`,
303
+ )
304
+ .all()
305
+ .map((r) => ({
306
+ id: r.id,
307
+ topic: r.topic,
308
+ tags: r.tags,
309
+ content: r.content,
310
+ }));
311
+ }
312
+
313
+ // ─── State Operations ───────────────────────────────────────────────────────
314
+
315
+ export function getStateDb(key) {
316
+ const d = getDb();
317
+ const row = d.prepare(`SELECT value FROM state WHERE key = ?`).get(key);
318
+ if (!row) return null;
319
+ return JSON.parse(row.value);
320
+ }
321
+
322
+ export function setStateDb(key, value) {
323
+ const d = getDb();
324
+ const updatedAt = new Date().toISOString();
325
+ d.prepare(
326
+ `INSERT OR REPLACE INTO state (key, value, updated_at) VALUES (?, ?, ?)`,
327
+ ).run(key, JSON.stringify(value), updatedAt);
328
+ return { key, updated: updatedAt };
329
+ }
330
+
331
+ // ─── Helpers ────────────────────────────────────────────────────────────────
332
+
333
+ function parseChunkRow(row) {
334
+ return {
335
+ id: row.id,
336
+ topic: row.topic,
337
+ agent: row.agent,
338
+ tags: JSON.parse(row.tags),
339
+ importance: row.importance,
340
+ content: row.content,
341
+ embedding: row.embedding
342
+ ? new Float32Array(
343
+ row.embedding.buffer,
344
+ row.embedding.byteOffset,
345
+ row.embedding.byteLength / 4,
346
+ )
347
+ : null,
348
+ createdAt: row.created_at,
349
+ updatedAt: row.updated_at,
350
+ expiresAt: row.expires_at,
351
+ };
352
+ }
353
+
354
+ export { STORE_PATH };
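`getAllEmbeddings` returns `Float32Array` vectors, but their consumer (`search.js` in the README's architecture listing) is not included in this diff. As a hedged sketch only, ranking those vectors by cosine similarity — the "semantic" mode the README describes — might look like the following; all function names here are illustrative assumptions, not the package's API.

```javascript
// Cosine similarity between two equal-length Float32Array vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank rows shaped like getAllEmbeddings() output against a query vector.
function rankBySimilarity(queryEmbedding, rows, topK = 6) {
  return rows
    .map(({ id, embedding }) => ({
      id,
      score: cosineSimilarity(queryEmbedding, embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A full scan like this is linear in the number of chunks, which is consistent with the store's design: at the ~10K-chunk scale shown in the Performance table, brute-force similarity over 384-dim vectors stays fast without an ANN index.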