agent-memory-store 0.0.7 → 0.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/README.MD CHANGED

@@ -8,6 +8,14 @@
 `agent-memory-store` gives your AI agents a shared, searchable, persistent memory — powered by SQLite with native FTS5 full-text search and optional semantic embeddings. No external services required.
+## Why this exists
+Every time you start a new session with Claude Code, Cursor, or any MCP-compatible agent, it starts from zero. It doesn't know your project uses Fastify instead of Express. It doesn't know you decided on JWT two weeks ago. It doesn't know the staging deploy is on ECS.
+`agent-memory-store` gives agents a shared, searchable memory that survives across sessions. Agents write what they learn, search what they need, and build on each other's work — just like a team with good documentation, except it happens automatically.
+---
 Agents read and write **chunks** through MCP tools. Search combines **BM25 ranking** (via SQLite FTS5) with **semantic vector similarity** (via local embeddings), merged through Reciprocal Rank Fusion for best-of-both-worlds retrieval.
 ```
@@ -61,18 +69,6 @@ To use a custom path:
 AGENT_STORE_PATH=/your/project/.agent-memory-store npx agent-memory-store
 ```
-## Performance
-Benchmarked on Apple Silicon (Node v25, darwin arm64):
-| Operation | 100 chunks | 1K chunks | 5K chunks | 10K chunks |
-|-----------|-----------|-----------|-----------|------------|
-| **write** | 2.16 ms | 0.15 ms | 0.15 ms | 0.15 ms |
-| **read** | 0.02 ms | 0.02 ms | 0.02 ms | 0.02 ms |
-| **search (BM25)** | 0.4 ms | 1.2 ms | 5.3 ms | 9.9 ms |
-| **list** | 0.2 ms | 1.4 ms | 9.9 ms | 14.7 ms |
-| **state get/set** | 0.03 ms | 0.03 ms | 0.03 ms | 0.03 ms |
 ## Configuration
 ### Claude Code
@@ -168,21 +164,57 @@ If you need to store memory outside the project directory, set `AGENT_STORE_PATH
 ### Environment variables
-| Variable | Default | Description |
-|---|---|---|
+| Variable           | Default                 | Description                                                        |
+| ------------------ | ----------------------- | ------------------------------------------------------------------ |
 | `AGENT_STORE_PATH` | `./.agent-memory-store` | Custom path to the storage directory. Omit to use project default. |
+## Teach your agent to use memory
+Add this to your agent's system prompt (or `CLAUDE.md` / `AGENTS.md`):
+```markdown
+## Memory
+You have persistent memory via agent-memory-store MCP tools.
+**Before acting on any task:**
+1. `search_context` with 2–3 queries related to the task. Check for prior decisions, conventions, and relevant outputs.
+2. `get_state("project_tags")` to load the tag vocabulary. If empty, this is a new project — ask the user about stack, conventions, and structure, then persist them with `write_context` and `set_state`.
+**After completing work:**
+1. `write_context` to persist decisions (with rationale), outputs (with file paths), and discoveries (with impact).
+2. Use short, lowercase tags consistent with the vocabulary: `auth`, `config`, `decision`, `output`, `discovery`.
+3. Set `importance: "critical"` for decisions other agents depend on, `"high"` for outputs, `"medium"` for background context.
+**Before every write:**
+1. `search_context` for the same topic first. If a chunk exists, `delete_context` it, then write the updated version. One chunk per topic.
+**Rules:**
+- Never guess a fact that might be in memory — search first, it costs <10ms.
+- Never store secrets — write references to where they live, not the values.
+- `set_state` is for mutable values (current phase, counters). `write_context` is for searchable knowledge (decisions, outputs). Don't mix them.
+- Use `search_mode: "semantic"` when exact terms don't match (e.g., searching "autenticação" when the chunk says "auth").
+```
+Copy, paste, done. This is enough for any agent to use memory effectively.
+> **Want to go deeper?** The [`skills/SKILL.md`](./skills/SKILL.md) file is a comprehensive skill that teaches agents advanced patterns: cold start bootstrap for new projects, multi-agent pipeline handoffs, tag vocabulary management, deduplication workflows, and when to use each search mode. Install it in your project's skill directory if your agents run multi-step pipelines or need to coordinate across sessions.
 ## Tools
-| Tool | When to use |
-|---|---|
+| Tool             | When to use                                                               |
+| ---------------- | ------------------------------------------------------------------------- |
 | `search_context` | **Start of every task** — retrieve relevant prior knowledge before acting |
-| `write_context` | After decisions, discoveries, or outputs that other agents will need |
-| `read_context` | Read a specific chunk by ID |
-| `list_context` | Inventory the memory store (metadata only, no body) |
-| `delete_context` | Remove outdated or incorrect chunks |
-| `get_state` | Read a pipeline variable (progress, flags, counters) |
-| `set_state` | Write a pipeline variable |
+| `write_context`  | After decisions, discoveries, or outputs that other agents will need      |
+| `read_context`   | Read a specific chunk by ID                                               |
+| `list_context`   | Inventory the memory store (metadata only, no body)                       |
+| `delete_context` | Remove outdated or incorrect chunks                                       |
+| `get_state`      | Read a pipeline variable (progress, flags, counters)                      |
+| `set_state`      | Write a pipeline variable                                                 |
 ### `search_context`
@@ -197,11 +229,11 @@ search_mode  string    (optional) "hybrid" (default), "bm25", or "semantic".
 **Search modes:**
-| Mode | How it works | Best for |
-|---|---|---|
-| `hybrid` | BM25 + semantic similarity merged via Reciprocal Rank Fusion | General use (default) |
-| `bm25` | FTS5 keyword matching only | Exact term lookups, canonical tags |
-| `semantic` | Vector cosine similarity only | Finding conceptually related chunks |
+| Mode       | How it works                                                 | Best for                            |
+| ---------- | ------------------------------------------------------------ | ----------------------------------- |
+| `hybrid`   | BM25 + semantic similarity merged via Reciprocal Rank Fusion | General use (default)               |
+| `bm25`     | FTS5 keyword matching only                                   | Exact term lookups, canonical tags  |
+| `semantic` | Vector cosine similarity only                                | Finding conceptually related chunks |
 ### `write_context`
@@ -254,43 +286,27 @@ WAL mode is enabled for concurrent read performance. No manual flush needed.
 The embedding model (~23MB) is downloaded automatically on first use and cached in `~/.cache/huggingface/`. If the model fails to load, the system falls back to BM25-only search transparently.
-### Migration from filesystem
-If you're upgrading from a previous version that used `.md` files, the migration happens automatically on first startup. Your existing chunks and state are imported into SQLite, and the old directories are renamed to `chunks_backup/` and `state_backup/`.
-## Agent system prompt
-Paste this into the system prompt of every agent that should use the memory store:
-```markdown
-## Memory usage
-You have access to a persistent local memory store via agent-memory-store MCP tools.
-**At the start of each task:**
+## Performance
-1. Call `search_context` with 2-3 specific queries related to what you are about to do.
-2. Incorporate retrieved chunks into your reasoning.
-3. Call `get_state` to check pipeline status if relevant.
+Benchmarked on Apple Silicon (Node v25, darwin arm64, BM25 mode):
-**After completing a subtask:**
+| Operation         | 1K chunks | 10K chunks | 50K chunks | 100K chunks | 250K chunks |
+| ----------------- | --------- | ---------- | ---------- | ----------- | ----------- |
+| **write**         | 0.17 ms   | 0.19 ms    | 0.23 ms    | 0.21 ms     | 0.25 ms     |
+| **read**          | 0.01 ms   | 0.05 ms    | 0.21 ms    | 0.22 ms     | 0.85 ms     |
+| **search (BM25)** | ~5 ms†    | ~10 ms†    | ~60 ms†    | ~110 ms†    | ~390 ms†    |
+| **list**          | 0.2 ms    | 0.3 ms     | 0.3 ms     | 0.3 ms      | 1.1 ms      |
+| **state get/set** | 0.03 ms   | 0.03 ms    | 0.07 ms    | 0.05 ms     | 0.03 ms     |
-1. Call `write_context` to persist:
-   - Decisions made and their rationale
-   - Key discoveries or findings
-   - Structured outputs intended for downstream agents
-2. Use canonical tags consistent with the rest of the team.
-3. Set `importance: high` or `critical` for information other agents will need.
+† Search times from isolated run (no model loading interference). During warmup, first queries may be slower.
-**Best practices:**
+**Key insights:**
-- Specific topics: "ZAP scraper — stack decision" > "decision"
-- Consistent tags: always use the same term (`auth`, not `authentication`)
-- Check before writing: search first to avoid duplicate chunks
-- Temporary context: use `ttl_days: 7` for session-scoped information
-- Use `search_mode: "semantic"` when looking for conceptually related chunks
-- Use `search_mode: "bm25"` for exact tag/keyword lookups
-```
+- **list is O(1) in practice** — pagination caps results at 100 rows by default, so list time stays flat regardless of corpus size (0.2–1.1 ms at any scale)
+- **write is stable at ~0.2 ms/op** — FTS5 triggers and embedding backfill are non-blocking; inserts stay constant
+- **read is a single index lookup** — sub-millisecond up to 50K chunks, still <1 ms at 250K
+- **search scales linearly with FTS5 corpus** — this is inherent to BM25 full-text scan; for typical agent memory usage (≤25K chunks), search stays under 30 ms
+- **state ops are O(1)** — key/value store backed by a B-tree primary key, constant at all scales
 ## Development
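Note on the README hunks above: the README says BM25 and semantic rankings are "merged through Reciprocal Rank Fusion". The package's own `reciprocalRankFusion` is not part of this diff, so the following is only a minimal sketch of the general technique. The function name `rrfSketch` and the constant `k = 60` are assumptions chosen for illustration, not values taken from the package.

```javascript
// Sketch of Reciprocal Rank Fusion (RRF). `rrfSketch` and k = 60 are
// assumptions for illustration; the package's implementation may differ.
function rrfSketch(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((hit, index) => {
      const rank = index + 1; // ranks are 1-based
      // Each list contributes 1 / (k + rank) to the document's fused score.
      scores.set(hit.id, (scores.get(hit.id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Two hypothetical rankings of the same corpus.
const bm25Ranking = [{ id: "a" }, { id: "b" }, { id: "c" }];
const vectorRanking = [{ id: "b" }, { id: "d" }, { id: "a" }];

const fusedRanking = rrfSketch([bm25Ranking, vectorRanking]);
console.log(fusedRanking.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Chunks that appear in both rankings (`a`, `b`) rise above chunks found by only one (`c`, `d`), which is the "best-of-both-worlds" behavior the README describes.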

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agent-memory-store",
-  "version": "0.0.7",
+  "version": "0.0.9",
   "description": "Local-first MCP memory server for multi-agent systems. Hybrid search (BM25 + semantic embeddings), SQLite-backed, zero-config.",
   "type": "module",
   "exports": "./src/index.js",

package/src/db.js CHANGED

@@ -17,6 +17,20 @@ const STORE_PATH = process.env.AGENT_STORE_PATH
 const DB_PATH = path.join(STORE_PATH, "store.db");
 let db = null;
+const stmtCache = new Map();
+/**
+ * Returns a cached prepared statement for static SQL.
+ * Avoids re-preparing the same SQL on every call.
+ */
+function stmt(sql) {
+  let s = stmtCache.get(sql);
+  if (!s) {
+    s = getDb().prepare(sql);
+    stmtCache.set(sql, s);
+  }
+  return s;
+}
 // ─── Schema ─────────────────────────────────────────────────────────────────
@@ -98,9 +112,9 @@ export function getDb() {
   db.exec(SCHEMA_TRIGGERS);
   // Purge expired chunks
-  db.prepare(
+  db.exec(
     `DELETE FROM chunks WHERE expires_at IS NOT NULL AND expires_at < datetime('now')`,
-  ).run();
+  );
   // Graceful shutdown
   const shutdown = () => {
@@ -130,8 +144,7 @@ export function insertChunk({
   updatedAt,
   expiresAt,
 }) {
-  const d = getDb();
-  d.prepare(
+  stmt(
     `INSERT OR REPLACE INTO chunks (id, topic, agent, tags, importance, content, embedding, created_at, updated_at, expires_at)
      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
   ).run(
@@ -153,8 +166,7 @@ export function insertChunk({
  * @returns {object|null}
  */
 export function getChunk(id) {
-  const d = getDb();
-  const row = d.prepare(`SELECT * FROM chunks WHERE id = ?`).get(id);
+  const row = stmt(`SELECT * FROM chunks WHERE id = ?`).get(id);
   if (!row) return null;
   return parseChunkRow(row);
 }
@@ -164,8 +176,7 @@ export function getChunk(id) {
  * @returns {boolean} true if a row was deleted
  */
 export function deleteChunkById(id) {
-  const d = getDb();
-  const result = d.prepare(`DELETE FROM chunks WHERE id = ?`).run(id);
+  const result = stmt(`DELETE FROM chunks WHERE id = ?`).run(id);
   return result.changes > 0;
 }
@@ -173,7 +184,7 @@ export function deleteChunkById(id) {
  * Lists chunk metadata, with optional agent/tags filters.
  * Sorted by updated_at descending.
  */
-export function listChunksDb({ agent, tags = [] } = {}) {
+export function listChunksDb({ agent, tags = [], limit = 100, offset = 0 } = {}) {
   const d = getDb();
   let sql = `SELECT id, topic, agent, tags, importance, updated_at FROM chunks`;
   const conditions = [];
@@ -191,7 +202,8 @@ export function listChunksDb({ agent, tags = [] } = {}) {
   }
   if (conditions.length) sql += ` WHERE ${conditions.join(" AND ")}`;
-  sql += ` ORDER BY updated_at DESC`;
+  sql += ` ORDER BY updated_at DESC LIMIT ? OFFSET ?`;
+  params.push(limit, offset);
   const rows = d.prepare(sql).all(...params);
   return rows.map((r) => ({
@@ -206,7 +218,7 @@ export function listChunksDb({ agent, tags = [] } = {}) {
 /**
  * Full-text search via FTS5 (BM25).
- * Returns ranked results with scores.
+ * Returns ranked results with full chunk data (avoids separate lookups).
  */
 export function searchFTS({ query, agent, tags = [], topK = 18 }) {
   const d = getDb();
@@ -221,19 +233,19 @@ export function searchFTS({ query, agent, tags = [], topK = 18 }) {
   if (!ftsQuery) return [];
   let sql = `
-    SELECT chunks_fts.id, rank
+    SELECT c.id, c.topic, c.agent, c.tags, c.importance, c.content, c.updated_at, rank
     FROM chunks_fts
-    JOIN chunks ON chunks.id = chunks_fts.id
+    JOIN chunks c ON c.id = chunks_fts.id
     WHERE chunks_fts MATCH ?`;
   const params = [ftsQuery];
   if (agent) {
-    sql += ` AND chunks.agent = ?`;
+    sql += ` AND c.agent = ?`;
     params.push(agent);
   }
   if (tags.length > 0) {
-    const tagConditions = tags.map(() => `chunks.tags LIKE ?`);
+    const tagConditions = tags.map(() => `c.tags LIKE ?`);
     sql += ` AND (${tagConditions.join(" OR ")})`;
     params.push(...tags.map((t) => `%"${t}"%`));
   }
@@ -244,7 +256,13 @@ export function searchFTS({ query, agent, tags = [], topK = 18 }) {
   const rows = d.prepare(sql).all(...params);
   return rows.map((r) => ({
     id: r.id,
-    score: -r.rank, // FTS5 rank is negative (lower = better), invert
+    topic: r.topic,
+    agent: r.agent,
+    tags: JSON.parse(r.tags),
+    importance: r.importance,
+    content: r.content,
+    updated: r.updated_at,
+    score: -r.rank,
   }));
 }
@@ -285,8 +303,7 @@ export function getAllEmbeddings({ agent, tags = [] } = {}) {
  * Updates only the embedding for a chunk.
  */
 export function updateEmbedding(id, embedding) {
-  const d = getDb();
-  d.prepare(`UPDATE chunks SET embedding = ? WHERE id = ?`).run(
+  stmt(`UPDATE chunks SET embedding = ? WHERE id = ?`).run(
     Buffer.from(embedding.buffer),
     id,
   );
@@ -296,11 +313,9 @@ export function updateEmbedding(id, embedding) {
  * Returns chunks that have no embedding yet.
  */
 export function getChunksWithoutEmbedding() {
-  const d = getDb();
-  return d
-    .prepare(
-      `SELECT id, topic, tags, content FROM chunks WHERE embedding IS NULL`,
-    )
+  return stmt(
+    `SELECT id, topic, tags, content FROM chunks WHERE embedding IS NULL`,
+  )
     .all()
     .map((r) => ({
       id: r.id,
@@ -313,16 +328,14 @@ export function getChunksWithoutEmbedding() {
 // ─── State Operations ───────────────────────────────────────────────────────
 export function getStateDb(key) {
-  const d = getDb();
-  const row = d.prepare(`SELECT value FROM state WHERE key = ?`).get(key);
+  const row = stmt(`SELECT value FROM state WHERE key = ?`).get(key);
   if (!row) return null;
   return JSON.parse(row.value);
 }
 export function setStateDb(key, value) {
-  const d = getDb();
   const updatedAt = new Date().toISOString();
-  d.prepare(
+  stmt(
     `INSERT OR REPLACE INTO state (key, value, updated_at) VALUES (?, ?, ?)`,
   ).run(key, JSON.stringify(value), updatedAt);
   return { key, updated: updatedAt };
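Note on the `stmt()` helper introduced in this file: it memoizes prepared statements by their SQL text, so each static query is prepared once and reused. The sketch below isolates that pattern. `fakeDb` and `cachedStmt` are hypothetical stand-ins that only count `prepare()` calls; they are not the package's SQLite handle or API.

```javascript
// Hypothetical stand-in for a database handle; it only counts prepare() calls.
const fakeDb = {
  prepareCalls: 0,
  prepare(sql) {
    this.prepareCalls += 1;
    return { sql, run: (...params) => ({ sql, params }) };
  },
};

const statementCache = new Map();

// Returns the cached statement for this SQL text, preparing it on first use.
function cachedStmt(db, sql) {
  let s = statementCache.get(sql);
  if (!s) {
    s = db.prepare(sql);
    statementCache.set(sql, s);
  }
  return s;
}

const first = cachedStmt(fakeDb, "SELECT * FROM chunks WHERE id = ?");
const second = cachedStmt(fakeDb, "SELECT * FROM chunks WHERE id = ?");
cachedStmt(fakeDb, "DELETE FROM chunks WHERE id = ?");

console.log(first === second); // true: same statement object reused
console.log(fakeDb.prepareCalls); // 2: one prepare per distinct SQL string
```

Because the cache key is the SQL string, the pattern only pays off for static SQL. That matches the diff: `listChunksDb` and `searchFTS` build their SQL dynamically and still call `d.prepare(sql)` directly.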

package/src/index.js CHANGED

@@ -223,9 +223,27 @@ server.tool(
   {
     agent: z.string().optional().describe("Filter by agent ID."),
     tags: z.array(z.string()).optional().describe("Filter by tags."),
+    limit: z
+      .number()
+      .int()
+      .min(1)
+      .max(500)
+      .optional()
+      .describe("Maximum number of results to return (default: 100)."),
+    offset: z
+      .number()
+      .int()
+      .min(0)
+      .optional()
+      .describe("Number of results to skip for pagination (default: 0)."),
   },
-  async ({ agent, tags }) => {
-    const chunks = await listChunks({ agent, tags: tags ?? [] });
+  async ({ agent, tags, limit, offset }) => {
+    const chunks = await listChunks({
+      agent,
+      tags: tags ?? [],
+      limit: limit ?? 100,
+      offset: offset ?? 0,
+    });
     if (chunks.length === 0) {
       return { content: [{ type: "text", text: "Memory store is empty." }] };
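Note on the new `limit` and `offset` parameters: a caller that wants more than one page has to loop until a short page comes back. The sketch below shows one way to do that. `listStub` is a hypothetical in-memory stand-in and is synchronous to keep the example short; the real `listChunks` in this package is async and is reached through the MCP tool.

```javascript
// Hypothetical in-memory corpus standing in for the chunk store.
const allChunks = Array.from({ length: 250 }, (_, i) => ({ id: `chunk-${i}` }));

// Stand-in for a paginated list call with the same defaults as the diff.
function listStub({ limit = 100, offset = 0 } = {}) {
  return allChunks.slice(offset, offset + limit);
}

// Collects every row by advancing the offset one page at a time.
function listAll(listFn, pageSize = 100) {
  const out = [];
  let offset = 0;
  while (true) {
    const page = listFn({ limit: pageSize, offset });
    out.push(...page);
    if (page.length < pageSize) break; // a short page means no more rows
    offset += pageSize;
  }
  return out;
}

const everything = listAll(listStub);
console.log(everything.length); // 250
```

One caveat that applies to offset pagination in general: rows are ordered by `updated_at DESC`, so writes that land between two page requests can shift rows across page boundaries and cause a repeat or a skip.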

package/src/search.js CHANGED

@@ -12,6 +12,22 @@
 import { searchFTS, getAllEmbeddings, getChunk } from "./db.js";
 import { embed, isEmbeddingAvailable } from "./embeddings.js";
+/**
+ * Converts a full FTS result into the enriched output format.
+ */
+function ftsResultToEnriched(r) {
+  return {
+    id: r.id,
+    topic: r.topic,
+    agent: r.agent,
+    tags: r.tags,
+    importance: r.importance,
+    score: Math.round(r.score * 100) / 100,
+    content: r.content,
+    updated: r.updated,
+  };
+}
 // ─── Vector Search ──────────────────────────────────────────────────────────
 /**
@@ -94,47 +110,80 @@ export async function hybridSearch({
     effectiveMode = "bm25";
   }
-  let fusedResults;
+  // BM25-only: searchFTS already returns full chunk data, no enrichment needed
   if (effectiveMode === "bm25") {
-    fusedResults = searchFTS({ query, agent, tags, topK: candidateK });
-  } else if (effectiveMode === "semantic") {
+    const results = searchFTS({ query, agent, tags, topK: candidateK });
+    return results.slice(0, topK).map(ftsResultToEnriched);
+  }
+  // Semantic-only
+  if (effectiveMode === "semantic") {
     const queryEmbedding = await embed(query);
     if (!queryEmbedding) {
-      fusedResults = searchFTS({ query, agent, tags, topK: candidateK });
-    } else {
-      fusedResults = vectorSearch(queryEmbedding, {
-        agent,
-        tags,
-        topK: candidateK,
-      });
+      const results = searchFTS({ query, agent, tags, topK: candidateK });
+      return results.slice(0, topK).map(ftsResultToEnriched);
     }
-  } else {
-    // Hybrid: run FTS5 (sync) and embed query (async) in parallel
-    const queryEmbeddingPromise = embed(query);
-    const bm25Hits = searchFTS({ query, agent, tags, topK: candidateK });
-    const queryEmbedding = await queryEmbeddingPromise;
+    const vecHits = vectorSearch(queryEmbedding, { agent, tags, topK: candidateK });
+    return enrichVectorResults(vecHits.slice(0, topK));
+  }
-    if (!queryEmbedding) {
-      fusedResults = bm25Hits;
+  // Hybrid: run FTS5 (sync) and embed query (async) in parallel
+  const queryEmbeddingPromise = embed(query);
+  const bm25Hits = searchFTS({ query, agent, tags, topK: candidateK });
+  const queryEmbedding = await queryEmbeddingPromise;
+  if (!queryEmbedding) {
+    return bm25Hits.slice(0, topK).map(ftsResultToEnriched);
+  }
+  const vecHits = vectorSearch(queryEmbedding, { agent, tags, topK: candidateK });
+  const fused = reciprocalRankFusion(bm25Hits, vecHits);
+  // Enrich fused results: build lookup from BM25 data, only fetch missing from DB
+  const bm25Map = new Map(bm25Hits.map((r) => [r.id, r]));
+  const topResults = fused.slice(0, topK);
+  const enriched = [];
+  for (const { id, score } of topResults) {
+    const cached = bm25Map.get(id);
+    if (cached) {
+      enriched.push({
+        id: cached.id,
+        topic: cached.topic,
+        agent: cached.agent,
+        tags: cached.tags,
+        importance: cached.importance,
+        score: Math.round(score * 100) / 100,
+        content: cached.content,
+        updated: cached.updated,
+      });
     } else {
-      const vecHits = vectorSearch(queryEmbedding, {
-        agent,
-        tags,
-        topK: candidateK,
+      const chunk = getChunk(id);
+      if (!chunk) continue;
+      enriched.push({
+        id: chunk.id,
+        topic: chunk.topic,
+        agent: chunk.agent,
+        tags: chunk.tags,
+        importance: chunk.importance,
+        score: Math.round(score * 100) / 100,
+        content: chunk.content,
+        updated: chunk.updatedAt,
       });
-      fusedResults = reciprocalRankFusion(bm25Hits, vecHits);
     }
   }
-  // Take topK and enrich with full chunk data
-  const topResults = fusedResults.slice(0, topK);
-  const enriched = [];
+  return enriched;
+}
-  for (const { id, score } of topResults) {
+/**
+ * Enriches vector-only results by fetching full chunk data from DB.
+ */
+function enrichVectorResults(vecHits) {
+  const enriched = [];
+  for (const { id, score } of vecHits) {
     const chunk = getChunk(id);
     if (!chunk) continue;
     enriched.push({
       id: chunk.id,
       topic: chunk.topic,
@@ -146,6 +195,5 @@ export async function hybridSearch({
       updated: chunk.updatedAt,
     });
   }
   return enriched;
 }
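Note on the hybrid branch above: fused results are enriched from the BM25 rows already in memory, and the database is consulted only for ids that came from the vector side alone. The sketch below reproduces that lookup-with-fallback idea in isolation. `getChunkStub`, `enrichSketch`, and the sample data are hypothetical, and only a subset of the fields is carried through.

```javascript
// Hypothetical stand-in for getChunk(); counts how often the "DB" is hit.
const chunkTable = new Map([
  ["a", { id: "a", topic: "auth decision", updatedAt: "2024-01-01" }],
  ["d", { id: "d", topic: "deploy notes", updatedAt: "2024-01-02" }],
]);
let dbLookups = 0;
function getChunkStub(id) {
  dbLookups += 1;
  return chunkTable.get(id) ?? null;
}

// Enriches fused hits, preferring BM25 rows already in memory.
function enrichSketch(fused, bm25Hits, topK) {
  const bm25Map = new Map(bm25Hits.map((r) => [r.id, r]));
  const enriched = [];
  for (const { id, score } of fused.slice(0, topK)) {
    const cached = bm25Map.get(id);
    const source = cached ?? getChunkStub(id);
    if (!source) continue; // id no longer exists in the store
    enriched.push({
      id: source.id,
      topic: source.topic,
      score,
      // BM25 rows expose `updated`; rows fetched by id expose `updatedAt`.
      updated: cached ? cached.updated : source.updatedAt,
    });
  }
  return enriched;
}

const bm25Rows = [{ id: "a", topic: "auth decision", updated: "2024-01-01" }];
const fusedHits = [
  { id: "a", score: 0.0325 },
  { id: "d", score: 0.0161 },
  { id: "x", score: 0.0159 }, // not in the store, so it is skipped
];

const enrichedHits = enrichSketch(fusedHits, bm25Rows, 3);
console.log(enrichedHits.map((r) => r.id)); // [ 'a', 'd' ]
console.log(dbLookups); // 2: only 'd' and 'x' needed a lookup
```

The saving is one database read per fused hit that BM25 also returned, which is why `searchFTS` in this version selects the full row instead of only `id` and `rank`.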

package/src/store.js CHANGED

@@ -193,8 +193,8 @@ export async function deleteChunk(id) {
  * @param {string[]} [opts.tags]
  * @returns {Promise<Array>}
  */
-export async function listChunks({ agent, tags = [] } = {}) {
-  return listChunksDb({ agent, tags });
+export async function listChunks({ agent, tags = [], limit = 100, offset = 0 } = {}) {
+  return listChunksDb({ agent, tags, limit, offset });
 }
 /**