agent-memory-store 0.0.5 → 0.0.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.MD CHANGED
@@ -1,48 +1,49 @@
  # agent-memory-store
 
- > Local-first MCP memory server for multi-agent systems.
+ > High-performance MCP memory server for multi-agent systems — SQLite-backed with hybrid search.
 
  [![npm version](https://img.shields.io/npm/v/agent-memory-store.svg)](https://www.npmjs.com/package/agent-memory-store)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
- [![Node.js](https://img.shields.io/badge/node-%3E%3D18-green.svg)](https://nodejs.org)
+ [![Node.js](https://img.shields.io/badge/node-%3E%3D22.5-green.svg)](https://nodejs.org)
 
- `agent-memory-store` gives your AI agents a shared, searchable, persistent memory — running entirely on your local filesystem. No vector database, no embedding APIs, no cloud services required.
+ `agent-memory-store` gives your AI agents a shared, searchable, persistent memory — powered by SQLite with native FTS5 full-text search and optional semantic embeddings. No external services required.
 
- Agents read and write **chunks** (markdown files with YAML frontmatter) through a set of MCP tools. Search is powered by **BM25**, the same ranking algorithm used by Elasticsearch, implemented in pure JavaScript with zero runtime dependencies.
+ Agents read and write **chunks** through MCP tools. Search combines **BM25 ranking** (via SQLite FTS5) with **semantic vector similarity** (via local embeddings), merged through Reciprocal Rank Fusion for best-of-both-worlds retrieval.
 
  ```
  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
  │   Agent A   │     │   Agent B   │     │   Agent C   │
  └──────┬──────┘     └──────┬──────┘     └──────┬──────┘
         │                   │                   │
-        └───────────────────┬──────────────────-┘
+        └───────────────────┬───────────────────┘
                             │ MCP tools
                  ┌──────────▼──────────┐
-                 │ agent-memory-store  │
-                 │   search · write    │
-                 │ read · state · list │
+                 │ agent-memory-store  │
+                 │    hybrid search    │
+                 │   BM25 + semantic   │
                  └──────────┬──────────┘
 
                  ┌──────────▼──────────┐
                  │ .agent-memory-store/│
-                 │  ├── chunks/        │
-                 │  └── state/         │
-                 └──────────────────────┘
+                 │  └── store.db       │
+                 └─────────────────────┘
  ```
 
  ## Features
 
  - **Zero-install usage** via `npx`
- - **BM25 full-text search** — relevance ranking without embeddings or APIs
+ - **Hybrid search** — BM25 full-text (FTS5) + semantic vector similarity + Reciprocal Rank Fusion
+ - **SQLite-backed** — single `store.db` file, WAL mode, native performance
+ - **Local embeddings** — 384-dim vectors via `all-MiniLM-L6-v2`, no API keys needed
  - **Tag and agent filtering** — find chunks by who wrote them or what they cover
  - **TTL-based expiry** — chunks auto-delete after a configurable number of days
  - **Session state** — key/value store for pipeline progress, flags, and counters
- - **Plain files** — chunks are `.md` files, readable and editable by humans and git
- - **MCP-native** — works with Claude Code, opencode, and any MCP-compatible client
+ - **MCP-native** — works with Claude Code, opencode, Cursor, and any MCP-compatible client
+ - **Zero external database dependencies** — uses Node.js built-in SQLite (`node:sqlite`)
 
  ## Requirements
 
- - Node.js 18
+ - Node.js >= 22.5 (required for native `node:sqlite` with FTS5 support)
 
  ## Quick start
 
@@ -52,7 +53,7 @@ No installation needed:
  npx agent-memory-store
  ```
 
- By default, memory is stored in `.agent-memory-store/` inside the directory where the server starts — so each project gets its own isolated store automatically.
+ By default, memory is stored in `.agent-memory-store/store.db` inside the directory where the server starts — so each project gets its own isolated store automatically.
 
  To use a custom path:
@@ -60,6 +61,18 @@ To use a custom path:
  AGENT_STORE_PATH=/your/project/.agent-memory-store npx agent-memory-store
  ```
 
+ ## Performance
+
+ Benchmarked on Apple Silicon (Node v25, darwin arm64):
+
+ | Operation         | 100 chunks | 1K chunks | 5K chunks | 10K chunks |
+ |-------------------|------------|-----------|-----------|------------|
+ | **write**         | 2.16 ms    | 0.15 ms   | 0.15 ms   | 0.15 ms    |
+ | **read**          | 0.02 ms    | 0.02 ms   | 0.02 ms   | 0.02 ms    |
+ | **search (BM25)** | 0.4 ms     | 1.2 ms    | 5.3 ms    | 9.9 ms     |
+ | **list**          | 0.2 ms     | 1.4 ms    | 9.9 ms    | 14.7 ms    |
+ | **state get/set** | 0.03 ms    | 0.03 ms   | 0.03 ms   | 0.03 ms    |
+
  ## Configuration
 
  ### Claude Code
@@ -155,32 +168,41 @@ If you need to store memory outside the project directory, set `AGENT_STORE_PATH
 
  ### Environment variables
 
- | Variable | Default | Description |
- | ------------------ | ----------------------- | ------------------------------------------------------------------ |
+ | Variable | Default | Description |
+ |---|---|---|
  | `AGENT_STORE_PATH` | `./.agent-memory-store` | Custom path to the storage directory. Omit to use project default. |
 
  ## Tools
 
- | Tool | When to use |
- | ---------------- | ------------------------------------------------------------------------- |
+ | Tool | When to use |
+ |---|---|
  | `search_context` | **Start of every task** — retrieve relevant prior knowledge before acting |
- | `write_context`  | After decisions, discoveries, or outputs that other agents will need |
- | `read_context`   | Read a specific chunk by ID |
- | `list_context`   | Inventory the memory store (metadata only, no body) |
- | `delete_context` | Remove outdated or incorrect chunks |
- | `get_state`      | Read a pipeline variable (progress, flags, counters) |
- | `set_state`      | Write a pipeline variable |
+ | `write_context`  | After decisions, discoveries, or outputs that other agents will need |
+ | `read_context`   | Read a specific chunk by ID |
+ | `list_context`   | Inventory the memory store (metadata only, no body) |
+ | `delete_context` | Remove outdated or incorrect chunks |
+ | `get_state`      | Read a pipeline variable (progress, flags, counters) |
+ | `set_state`      | Write a pipeline variable |
 
  ### `search_context`
 
  ```
- query      string              Search query. Use specific, canonical terms.
- tags       string[] (optional) Narrow to chunks matching any of these tags.
- agent      string   (optional) Narrow to chunks written by a specific agent.
- top_k      number   (optional) Max results to return. Default: 6.
- min_score  number   (optional) Minimum BM25 score. Default: 0.1.
+ query        string              Search query. Use specific, canonical terms.
+ tags         string[] (optional) Narrow to chunks matching any of these tags.
+ agent        string   (optional) Narrow to chunks written by a specific agent.
+ top_k        number   (optional) Max results to return. Default: 6.
+ min_score    number   (optional) Minimum relevance score. Default: 0.1.
+ search_mode  string   (optional) "hybrid" (default), "bm25", or "semantic".
  ```
 
+ **Search modes:**
+
+ | Mode | How it works | Best for |
+ |---|---|---|
+ | `hybrid` | BM25 + semantic similarity merged via Reciprocal Rank Fusion | General use (default) |
+ | `bm25` | FTS5 keyword matching only | Exact term lookups, canonical tags |
+ | `semantic` | Vector cosine similarity only | Finding conceptually related chunks |
+
  ### `write_context`
 
  ```
@@ -199,33 +221,42 @@ key    string  State variable name.
  value  any     (set_state only) Any JSON-serializable value.
  ```
 
- ## Storage format
+ ## Architecture
 
- Each chunk is a plain `.md` file under `.agent-memory-store/chunks/`:
-
- ```markdown
- ---
- id: a3f9c12b40
- topic: "Auth service — chose JWT over sessions"
- agent: architect-agent
- tags: [auth, architecture, decision]
- importance: high
- updated: 2025-06-01T14:32:00.000Z
- ---
-
- Chose stateless JWT over server-side sessions.
-
- **Rationale:** No shared session store needed across services.
- Refresh tokens stored in Redis with 7-day TTL.
- Access tokens expire in 15 minutes.
-
- **Trade-offs:** Cannot invalidate individual tokens before expiry.
- Acceptable for our threat model.
  ```
+ src/
+   index.js       MCP server — tool registration and transport
+   store.js       Public API — searchChunks, writeChunk, readChunk, etc.
+   db.js          SQLite layer — node:sqlite with FTS5, WAL mode
+   search.js      Hybrid search — FTS5 BM25 + vector similarity + RRF
+   embeddings.js  Local embeddings — @huggingface/transformers (all-MiniLM-L6-v2)
+   bm25.js        Pure JS BM25 — kept as fallback reference
+   migrate.js     Filesystem → SQLite migration (automatic, one-time)
+ ```
+
+ ### Storage format
+
+ All data lives in a single SQLite database at `.agent-memory-store/store.db`:
+
+ - **chunks table** — id, topic, agent, tags (JSON), importance, content, embedding (BLOB), timestamps, expiry
+ - **chunks_fts** — FTS5 virtual table synced via triggers for full-text search
+ - **state table** — key/value pairs for pipeline variables
+
+ WAL mode is enabled for concurrent read performance. No manual flush needed.
+
+ ### How hybrid search works
+
+ 1. **BM25 (FTS5)** — SQLite's native full-text search ranks chunks by term frequency and inverse document frequency. Fast, deterministic, great for exact keyword matches.
+
+ 2. **Semantic similarity** — Query and chunks are embedded into 384-dimensional vectors using `all-MiniLM-L6-v2` (runs locally via ONNX Runtime). Cosine similarity finds conceptually related chunks even when exact terms don't match.
+
+ 3. **Reciprocal Rank Fusion** — Both ranked lists are merged using RRF with weights (BM25: 0.4, semantic: 0.6). Documents appearing in both lists get boosted.
 
- Session state lives in `.agent-memory-store/state/<key>.json`.
+ The embedding model (~23MB) is downloaded automatically on first use and cached in `~/.cache/huggingface/`. If the model fails to load, the system falls back to BM25-only search transparently.
 
- Both directories are human-readable, diffable with git, and can be committed to version control if you want shared team memory.
+ ### Migration from filesystem
+
+ If you're upgrading from a previous version that used `.md` files, the migration happens automatically on first startup. Your existing chunks and state are imported into SQLite, and the old directories are renamed to `chunks_backup/` and `state_backup/`.
 
  ## Agent system prompt
 
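The Reciprocal Rank Fusion step described under "How hybrid search works" can be sketched in a few lines of JavaScript. This is an illustration only, not the package's actual `search.js` (which this diff does not include): the 0.4/0.6 weights come from the README above, while the constant `k = 60` and every function name here are assumptions.

```javascript
// Hedged sketch of weighted Reciprocal Rank Fusion over two ranked lists.
// Each input list is [{ id, score }] sorted best-first; only rank positions
// matter for RRF, not the raw scores.
function reciprocalRankFusion(
  bm25Results,
  semanticResults,
  { k = 60, bm25Weight = 0.4, semanticWeight = 0.6 } = {},
) {
  const fused = new Map();

  const accumulate = (results, weight) => {
    results.forEach(({ id }, rank) => {
      // RRF contribution for a 0-based rank: weight / (k + rank + 1)
      fused.set(id, (fused.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };

  accumulate(bm25Results, bm25Weight);
  accumulate(semanticResults, semanticWeight);

  return [...fused.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document appearing in both lists accumulates contributions from both, which is the "boosted" behavior the README describes.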
@@ -238,8 +269,8 @@ You have access to a persistent local memory store via agent-memory-store MCP to
 
  **At the start of each task:**
 
- 1. Call `search_context` with 2–3 specific queries related to what you are about to do.
- 2. Incorporate retrieved chunks (score > 1.0) into your reasoning.
+ 1. Call `search_context` with 2-3 specific queries related to what you are about to do.
+ 2. Incorporate retrieved chunks into your reasoning.
  3. Call `get_state` to check pipeline status if relevant.
 
  **After completing a subtask:**
@@ -254,33 +285,13 @@ You have access to a persistent local memory store via agent-memory-store MCP to
  **Best practices:**
 
  - Specific topics: "ZAP scraper — stack decision" > "decision"
- - Consistent tags: always use the same term (`auth`, not `authentication` or `autenticação`)
+ - Consistent tags: always use the same term (`auth`, not `authentication`)
  - Check before writing: search first to avoid duplicate chunks
  - Temporary context: use `ttl_days: 7` for session-scoped information
+ - Use `search_mode: "semantic"` when looking for conceptually related chunks
+ - Use `search_mode: "bm25"` for exact tag/keyword lookups
  ```
 
- ## How BM25 search works
-
- BM25 ranks documents by term frequency and inverse document frequency, normalized by document length. It is the ranking algorithm behind Elasticsearch and Apache Lucene.
-
- **Strengths:**
-
- - Works well for short, labeled text chunks
- - Instant — no network calls, no GPU, no warm-up
- - Deterministic and explainable
-
- **Limitations:**
-
- - No semantic understanding (`car` ≠ `automobile`)
- - Mitigated by using canonical tags and consistent terminology across agents
-
- **Score interpretation:**
-
- - `> 3.0` — strong match, highly relevant
- - `1.0 – 3.0` — good match, likely relevant
- - `0.1 – 1.0` — weak match, may be tangentially related
- - `< 0.1` — filtered out by default
-
  ## Development
 
  ```bash
@@ -296,22 +307,20 @@ Run tests:
  npm test
  ```
 
- See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
-
- ## Project structure
+ Run benchmark:
 
+ ```bash
+ node benchmark.js
  ```
- src/
-   bm25.js   BM25 ranking engine — pure JS, zero dependencies
-   store.js  File-based persistence (chunks + session state)
-   index.js  MCP server and tool definitions
- ```
+
+ See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
 
  ## Roadmap
 
  - [ ] `summarize_context` tool — LLM-powered chunk consolidation
  - [ ] `prune_context` tool — remove chunks by age, agent, or importance
- - [ ] Hybrid scoring: BM25 + optional local embedding reranking (ollama)
+ - [x] ~~Hybrid scoring: BM25 + local embedding reranking~~ — shipped in v0.0.7
+ - [x] ~~SQLite-backed storage~~ — shipped in v0.0.7
  - [ ] Web UI for browsing the memory store
  - [ ] Multi-project workspace support
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "agent-memory-store",
-   "version": "0.0.5",
-   "description": "Local-first MCP memory server for multi-agent systems. BM25 search, zero external dependencies, file-based persistence.",
+   "version": "0.0.7",
+   "description": "Local-first MCP memory server for multi-agent systems. Hybrid search (BM25 + semantic embeddings), SQLite-backed, zero-config.",
    "type": "module",
    "exports": "./src/index.js",
    "bin": {
@@ -10,7 +10,7 @@
    "scripts": {
      "start": "node src/index.js",
      "test": "node --test src/__tests__/store.test.js",
-     "lint": "node --check src/bm25.js src/store.js src/index.js"
+     "lint": "node --check src/bm25.js src/store.js src/index.js src/db.js src/embeddings.js src/search.js src/migrate.js"
    },
    "keywords": [
      "mcp",
@@ -20,6 +20,11 @@
      "memory",
      "rag",
      "bm25",
+     "embeddings",
+     "semantic-search",
+     "sqlite",
+     "vector",
+     "kv-store",
      "context",
      "opencode",
      "claude",
@@ -36,7 +41,7 @@
    },
    "homepage": "https://github.com/vbfs/agent-memory-store#readme",
    "engines": {
-     "node": ">=18.0.0"
+     "node": ">=22.5.0"
    },
    "files": [
      "src/",
@@ -44,6 +49,7 @@
      "LICENSE"
    ],
    "dependencies": {
+     "@huggingface/transformers": "^3.0.0",
      "@modelcontextprotocol/sdk": "^1.28.0",
      "gray-matter": "^4.0.3",
      "zod": "^4.3.6"
package/src/db.js ADDED
@@ -0,0 +1,354 @@
+ /**
+  * SQLite database layer powered by node:sqlite (built-in).
+  *
+  * Single-file database at <STORE_PATH>/store.db with WAL mode.
+  * FTS5 for full-text BM25 search, BLOB columns for vector embeddings.
+  * Zero external dependencies — uses Node.js native SQLite (>=22.5).
+  */
+
+ import { DatabaseSync } from "node:sqlite";
+ import { mkdirSync } from "fs";
+ import path from "path";
+
+ const STORE_PATH = process.env.AGENT_STORE_PATH
+   ? path.resolve(process.env.AGENT_STORE_PATH)
+   : path.join(process.cwd(), ".agent-memory-store");
+
+ const DB_PATH = path.join(STORE_PATH, "store.db");
+
+ let db = null;
+
+ // ─── Schema ─────────────────────────────────────────────────────────────────
+
+ const SCHEMA_TABLES = `
+ CREATE TABLE IF NOT EXISTS chunks (
+   id TEXT PRIMARY KEY,
+   topic TEXT NOT NULL,
+   agent TEXT NOT NULL DEFAULT 'global',
+   tags TEXT NOT NULL DEFAULT '[]',
+   importance TEXT NOT NULL DEFAULT 'medium',
+   content TEXT NOT NULL,
+   embedding BLOB,
+   created_at TEXT NOT NULL,
+   updated_at TEXT NOT NULL,
+   expires_at TEXT
+ );
+
+ CREATE INDEX IF NOT EXISTS idx_chunks_agent ON chunks(agent);
+ CREATE INDEX IF NOT EXISTS idx_chunks_updated ON chunks(updated_at);
+ CREATE INDEX IF NOT EXISTS idx_chunks_expires ON chunks(expires_at);
+
+ CREATE TABLE IF NOT EXISTS state (
+   key TEXT PRIMARY KEY,
+   value TEXT NOT NULL,
+   updated_at TEXT NOT NULL
+ );
+ `;
+
+ const SCHEMA_FTS = `
+ CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
+   id UNINDEXED,
+   topic,
+   tags,
+   agent,
+   content,
+   content='chunks',
+   content_rowid='rowid'
+ );
+ `;
+
+ const SCHEMA_TRIGGERS = `
+ CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
+   INSERT INTO chunks_fts(rowid, id, topic, tags, agent, content)
+   VALUES (new.rowid, new.id, new.topic, new.tags, new.agent, new.content);
+ END;
+
+ CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
+   INSERT INTO chunks_fts(chunks_fts, rowid, id, topic, tags, agent, content)
+   VALUES ('delete', old.rowid, old.id, old.topic, old.tags, old.agent, old.content);
+ END;
+
+ CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
+   INSERT INTO chunks_fts(chunks_fts, rowid, id, topic, tags, agent, content)
+   VALUES ('delete', old.rowid, old.id, old.topic, old.tags, old.agent, old.content);
+   INSERT INTO chunks_fts(rowid, id, topic, tags, agent, content)
+   VALUES (new.rowid, new.id, new.topic, new.tags, new.agent, new.content);
+ END;
+ `;
+
+ // ─── Initialization ─────────────────────────────────────────────────────────
+
+ /**
+  * Returns the database instance. Creates it on first call.
+  * Synchronous — node:sqlite DatabaseSync is synchronous by design.
+  */
+ export function getDb() {
+   if (db) return db;
+
+   mkdirSync(STORE_PATH, { recursive: true });
+
+   db = new DatabaseSync(DB_PATH);
+
+   // WAL mode for better concurrent read performance
+   db.exec("PRAGMA journal_mode = WAL");
+
+   // INSERT OR REPLACE only fires the AFTER DELETE trigger (which keeps
+   // chunks_fts in sync) when recursive triggers are enabled.
+   db.exec("PRAGMA recursive_triggers = ON");
+
+   // Run schema
+   db.exec(SCHEMA_TABLES);
+   db.exec(SCHEMA_FTS);
+   db.exec(SCHEMA_TRIGGERS);
+
+   // Purge expired chunks. expires_at holds ISO-8601 strings
+   // (e.g. 2025-06-01T14:32:00.000Z), so render 'now' in the same format —
+   // datetime('now') uses a space separator and would compare incorrectly.
+   db.prepare(
+     `DELETE FROM chunks
+      WHERE expires_at IS NOT NULL
+        AND expires_at < strftime('%Y-%m-%dT%H:%M:%fZ', 'now')`,
+   ).run();
+
+   // Graceful shutdown
+   const shutdown = () => {
+     if (db) db.close();
+     process.exit(0);
+   };
+   process.on("SIGINT", shutdown);
+   process.on("SIGTERM", shutdown);
+
+   return db;
+ }
115
+
116
+ // ─── CRUD Operations ────────────────────────────────────────────────────────
117
+
118
+ /**
119
+ * Inserts or replaces a chunk in the database.
120
+ */
121
+ export function insertChunk({
122
+ id,
123
+ topic,
124
+ agent,
125
+ tags,
126
+ importance,
127
+ content,
128
+ embedding,
129
+ createdAt,
130
+ updatedAt,
131
+ expiresAt,
132
+ }) {
133
+ const d = getDb();
134
+ d.prepare(
135
+ `INSERT OR REPLACE INTO chunks (id, topic, agent, tags, importance, content, embedding, created_at, updated_at, expires_at)
136
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
137
+ ).run(
138
+ id,
139
+ topic,
140
+ agent,
141
+ JSON.stringify(tags),
142
+ importance,
143
+ content,
144
+ embedding ? Buffer.from(embedding.buffer) : null,
145
+ createdAt,
146
+ updatedAt,
147
+ expiresAt,
148
+ );
149
+ }
150
+
151
+ /**
152
+ * Retrieves a single chunk by ID.
153
+ * @returns {object|null}
154
+ */
155
+ export function getChunk(id) {
156
+ const d = getDb();
157
+ const row = d.prepare(`SELECT * FROM chunks WHERE id = ?`).get(id);
158
+ if (!row) return null;
159
+ return parseChunkRow(row);
160
+ }
161
+
162
+ /**
163
+ * Deletes a chunk by ID.
164
+ * @returns {boolean} true if a row was deleted
165
+ */
166
+ export function deleteChunkById(id) {
167
+ const d = getDb();
168
+ const result = d.prepare(`DELETE FROM chunks WHERE id = ?`).run(id);
169
+ return result.changes > 0;
170
+ }
171
+
172
+ /**
173
+ * Lists chunk metadata, with optional agent/tags filters.
174
+ * Sorted by updated_at descending.
175
+ */
176
+ export function listChunksDb({ agent, tags = [] } = {}) {
177
+ const d = getDb();
178
+ let sql = `SELECT id, topic, agent, tags, importance, updated_at FROM chunks`;
179
+ const conditions = [];
180
+ const params = [];
181
+
182
+ if (agent) {
183
+ conditions.push(`agent = ?`);
184
+ params.push(agent);
185
+ }
186
+
187
+ if (tags.length > 0) {
188
+ const tagConditions = tags.map(() => `tags LIKE ?`);
189
+ conditions.push(`(${tagConditions.join(" OR ")})`);
190
+ params.push(...tags.map((t) => `%"${t}"%`));
191
+ }
192
+
193
+ if (conditions.length) sql += ` WHERE ${conditions.join(" AND ")}`;
194
+ sql += ` ORDER BY updated_at DESC`;
195
+
196
+ const rows = d.prepare(sql).all(...params);
197
+ return rows.map((r) => ({
198
+ id: r.id,
199
+ topic: r.topic,
200
+ agent: r.agent,
201
+ tags: JSON.parse(r.tags),
202
+ importance: r.importance,
203
+ updated: r.updated_at,
204
+ }));
205
+ }
206
+
207
+ /**
208
+ * Full-text search via FTS5 (BM25).
209
+ * Returns ranked results with scores.
210
+ */
211
+ export function searchFTS({ query, agent, tags = [], topK = 18 }) {
212
+ const d = getDb();
213
+
214
+ // Escape FTS5 special chars and build query
215
+ const ftsQuery = query
216
+ .replace(/["*^:(){}[\]]/g, " ")
217
+ .split(/\s+/)
218
+ .filter((t) => t.length > 1)
219
+ .join(" OR ");
220
+
221
+ if (!ftsQuery) return [];
222
+
223
+ let sql = `
224
+ SELECT chunks_fts.id, rank
225
+ FROM chunks_fts
226
+ JOIN chunks ON chunks.id = chunks_fts.id
227
+ WHERE chunks_fts MATCH ?`;
228
+ const params = [ftsQuery];
229
+
230
+ if (agent) {
231
+ sql += ` AND chunks.agent = ?`;
232
+ params.push(agent);
233
+ }
234
+
235
+ if (tags.length > 0) {
236
+ const tagConditions = tags.map(() => `chunks.tags LIKE ?`);
237
+ sql += ` AND (${tagConditions.join(" OR ")})`;
238
+ params.push(...tags.map((t) => `%"${t}"%`));
239
+ }
240
+
241
+ sql += ` ORDER BY rank LIMIT ?`;
242
+ params.push(topK);
243
+
244
+ const rows = d.prepare(sql).all(...params);
245
+ return rows.map((r) => ({
246
+ id: r.id,
247
+ score: -r.rank, // FTS5 rank is negative (lower = better), invert
248
+ }));
249
+ }
250
+
251
+ /**
252
+ * Retrieves all embeddings for vector search.
253
+ * @returns {Array<{ id: string, embedding: Float32Array }>}
254
+ */
255
+ export function getAllEmbeddings({ agent, tags = [] } = {}) {
256
+ const d = getDb();
257
+ let sql = `SELECT id, embedding FROM chunks WHERE embedding IS NOT NULL`;
258
+ const params = [];
259
+
260
+ if (agent) {
261
+ sql += ` AND agent = ?`;
262
+ params.push(agent);
263
+ }
264
+
265
+ if (tags.length > 0) {
266
+ const tagConditions = tags.map(() => `tags LIKE ?`);
267
+ sql += ` AND (${tagConditions.join(" OR ")})`;
268
+ params.push(...tags.map((t) => `%"${t}"%`));
269
+ }
270
+
271
+ const rows = d.prepare(sql).all(...params);
272
+ return rows
273
+ .filter((r) => r.embedding !== null)
274
+ .map((r) => ({
275
+ id: r.id,
276
+ embedding: new Float32Array(
277
+ r.embedding.buffer,
278
+ r.embedding.byteOffset,
279
+ r.embedding.byteLength / 4,
280
+ ),
281
+ }));
282
+ }
283
+
284
+ /**
285
+ * Updates only the embedding for a chunk.
286
+ */
287
+ export function updateEmbedding(id, embedding) {
288
+ const d = getDb();
289
+ d.prepare(`UPDATE chunks SET embedding = ? WHERE id = ?`).run(
290
+ Buffer.from(embedding.buffer),
291
+ id,
292
+ );
293
+ }
294
+
295
+ /**
296
+ * Returns chunks that have no embedding yet.
297
+ */
298
+ export function getChunksWithoutEmbedding() {
299
+ const d = getDb();
300
+ return d
301
+ .prepare(
302
+ `SELECT id, topic, tags, content FROM chunks WHERE embedding IS NULL`,
303
+ )
304
+ .all()
305
+ .map((r) => ({
306
+ id: r.id,
307
+ topic: r.topic,
308
+ tags: r.tags,
309
+ content: r.content,
310
+ }));
311
+ }
312
+
313
+ // ─── State Operations ───────────────────────────────────────────────────────
314
+
315
+ export function getStateDb(key) {
316
+ const d = getDb();
317
+ const row = d.prepare(`SELECT value FROM state WHERE key = ?`).get(key);
318
+ if (!row) return null;
319
+ return JSON.parse(row.value);
320
+ }
321
+
322
+ export function setStateDb(key, value) {
323
+ const d = getDb();
324
+ const updatedAt = new Date().toISOString();
325
+ d.prepare(
326
+ `INSERT OR REPLACE INTO state (key, value, updated_at) VALUES (?, ?, ?)`,
327
+ ).run(key, JSON.stringify(value), updatedAt);
328
+ return { key, updated: updatedAt };
329
+ }
330
+
331
+ // ─── Helpers ────────────────────────────────────────────────────────────────
332
+
333
+ function parseChunkRow(row) {
334
+ return {
335
+ id: row.id,
336
+ topic: row.topic,
337
+ agent: row.agent,
338
+ tags: JSON.parse(row.tags),
339
+ importance: row.importance,
340
+ content: row.content,
341
+ embedding: row.embedding
342
+ ? new Float32Array(
343
+ row.embedding.buffer,
344
+ row.embedding.byteOffset,
345
+ row.embedding.byteLength / 4,
346
+ )
347
+ : null,
348
+ createdAt: row.created_at,
349
+ updatedAt: row.updated_at,
350
+ expiresAt: row.expires_at,
351
+ };
352
+ }
353
+
354
+ export { STORE_PATH };
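`getAllEmbeddings` returns `Float32Array` vectors, but their consumer (`search.js` in the README's architecture listing) is not included in this diff. As a hedged sketch only, ranking those vectors by cosine similarity — the "semantic" mode the README describes — might look like the following; all function names here are illustrative assumptions, not the package's API.

```javascript
// Cosine similarity between two equal-length Float32Array vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank rows shaped like getAllEmbeddings() output against a query vector.
function rankBySimilarity(queryEmbedding, rows, topK = 6) {
  return rows
    .map(({ id, embedding }) => ({
      id,
      score: cosineSimilarity(queryEmbedding, embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A full scan like this is linear in the number of chunks, which is consistent with the store's design: at the ~10K-chunk scale shown in the Performance table, brute-force similarity over 384-dim vectors stays fast without an ANN index.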