clawmem 0.1.1 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -8,7 +8,7 @@
 
 ClawMem fuses recent research into a retrieval-augmented memory layer that agents actually use. The hybrid architecture combines [QMD](https://github.com/tobi/qmd)-derived multi-signal retrieval (BM25 + vector search + reciprocal rank fusion + query expansion + cross-encoder reranking), [SAME](https://github.com/sgx-labs/statelessagent)-inspired composite scoring (recency decay, confidence, content-type half-lives, co-activation reinforcement), [MAGMA](https://arxiv.org/abs/2501.13956)-style intent classification with multi-graph traversal (semantic, temporal, and causal beam search), and [A-MEM](https://arxiv.org/abs/2510.02178) self-evolving memory notes that enrich documents with keywords, tags, and causal links between entries. Pattern extraction from [Engram](https://github.com/Gentleman-Programming/engram) adds deduplication windows, frequency-based durability scoring, and temporal navigation.
 
-Two integration paths: Claude Code hooks paired with an MCP server, or a native OpenClaw ContextEngine plugin. Both write to the same local SQLite vault. A decision captured during a Claude Code session shows up immediately when an OpenClaw agent picks up the same project.
+Integrates via Claude Code hooks, an MCP server (works with any MCP-compatible client including OpenClaw), or a native OpenClaw ContextEngine plugin. All paths write to the same local SQLite vault. A decision captured during a Claude Code session shows up immediately when an OpenClaw agent picks up the same project.
 
 TypeScript on Bun. MIT License.
 
@@ -61,8 +61,21 @@ Runs fully local with no API keys and no cloud services. Integrates via Claude C
 
 ### Prerequisites
 
-- [Bun](https://bun.sh) v1.0+
-- SQLite with FTS5 support (included with Bun)
+**Required:**
+
+- [Bun](https://bun.sh) v1.0+ — runtime for ClawMem. On Linux, install via `curl -fsSL https://bun.sh/install | bash` (not snap — snap Bun cannot read stdin, which breaks hooks).
+- SQLite with FTS5 — included with Bun. On macOS, run `brew install sqlite` for extension loading support (ClawMem detects and uses Homebrew SQLite automatically).
+
+**Optional (for better performance):**
+
+- [llama.cpp](https://github.com/ggml-org/llama.cpp) (`llama-server`) — for dedicated GPU inference. Without it, `node-llama-cpp` runs models in-process (auto-downloads on first use). GPU servers give better throughput and prevent silent CPU fallback.
+- systemd (Linux) or launchd (macOS) — for persistent background services (watcher, embed timer, GPU servers). ClawMem ships systemd unit templates; macOS users can create equivalent launchd plists. See [systemd services](docs/guides/systemd-services.md).
+
+**Optional integrations:**
+
+- [Claude Code](https://docs.anthropic.com/en/docs/claude-code) — for hooks + MCP integration
+- [OpenClaw](https://github.com/openclawai/openclaw) — for ContextEngine plugin integration
+- [bd CLI](https://github.com/dolthub/dolt) v0.58.0+ — for Beads issue tracker sync (only if using Beads)
 
 ### Install from npm (recommended)
 
@@ -100,6 +113,8 @@ After installing, here's the full journey from zero to working memory:
 
 **Fastest path:** Step 1 alone gets you a working system with in-process CPU/GPU inference and default models — no manual model downloads or service configuration needed. Steps 2-4 are optional upgrades for better performance. Steps 5-6 are where you customize what gets indexed and how your agent connects.
 
+**Customize what gets indexed:** Each collection has a `pattern` field in `~/.config/clawmem/config.yaml` (default: `**/*.md`). Tailor it per collection — index project docs, research notes, decision records, Obsidian vaults, or anything else your agents should know about. The more relevant content in the vault, the better retrieval works. See the [quickstart](docs/quickstart.md#customize-index-patterns) for config examples.
+
 ### Quick start commands
 
 ```bash
@@ -130,7 +145,7 @@ ClawMem integrates via hooks (`settings.json`) and an MCP stdio server. Hooks ha
 
 ```bash
 clawmem setup hooks   # Install lifecycle hooks (SessionStart, UserPromptSubmit, Stop, PreCompact)
-clawmem setup mcp     # Register MCP server in ~/.claude.json (20+ agent tools)
+clawmem setup mcp     # Register MCP server in ~/.claude.json (28 tools)
 ```
 
 **Automatic (90%):** `context-surfacing` injects relevant memory on every prompt. `postcompact-inject` re-injects state after compaction. `decision-extractor`, `handoff-generator`, `feedback-loop` capture session state on stop.
@@ -157,7 +172,7 @@ Disable OpenClaw's native memory and `memory-lancedb` auto-recall/capture to avo
 openclaw config set agents.defaults.memorySearch.extraPaths "[]"
 ```
 
-**Alternative:** You can also use the Claude Code-style hooks + MCP approach with OpenClaw (`clawmem setup hooks && clawmem setup mcp`). This works but bypasses OpenClaw's ContextEngine lifecycle - you lose token budget awareness, native compaction orchestration, and the `afterTurn()` message pipeline. The ContextEngine plugin is recommended for new OpenClaw setups.
+**Alternative:** OpenClaw agents can also use ClawMem's MCP server directly (`clawmem setup mcp`), with or without hooks. This gives full access to all 28 MCP tools but bypasses OpenClaw's ContextEngine lifecycle, so you lose token budget awareness, native compaction orchestration, and the `afterTurn()` message pipeline. The ContextEngine plugin is recommended for new OpenClaw setups; MCP is available as an additional or standalone integration.
 
 #### Dual-Mode Operation
 
@@ -388,7 +403,7 @@ llama-server -m Qwen3-Reranker-0.6B-Q8_0.gguf \
 
 ### MCP Server
 
-ClawMem exposes 26 MCP tools via the [Model Context Protocol](https://modelcontextprotocol.io) and an optional HTTP REST API. Any MCP-compatible client or HTTP client can use it.
+ClawMem exposes 28 MCP tools via the [Model Context Protocol](https://modelcontextprotocol.io) and an optional HTTP REST API. Any MCP-compatible client or HTTP client can use it.
 
 **Claude Code (automatic):**
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "clawmem",
-  "version": "0.1.1",
+  "version": "0.1.3",
   "description": "On-device context engine and memory for AI agents. Claude Code and OpenClaw. Hooks + MCP server + hybrid RAG search.",
   "type": "module",
   "bin": {
package/src/config.ts CHANGED
@@ -71,12 +71,16 @@ export interface ProfileConfig {
   useVector: boolean;
   vectorTimeout: number;
   minScore: number;
+  /** Budget-aware escalation: if fast path finishes early, spend remaining time on expansion + reranking */
+  deepEscalation: boolean;
+  /** Max time (ms) allowed for the fast path before escalation is considered */
+  escalationBudgetMs: number;
 }
 
 export const PROFILES: Record<PerformanceProfile, ProfileConfig> = {
-  speed: { tokenBudget: 400, maxResults: 5, useVector: false, vectorTimeout: 0, minScore: 0.55 },
-  balanced: { tokenBudget: 800, maxResults: 10, useVector: true, vectorTimeout: 900, minScore: 0.45 },
-  deep: { tokenBudget: 1200, maxResults: 15, useVector: true, vectorTimeout: 2000, minScore: 0.35 },
+  speed: { tokenBudget: 400, maxResults: 5, useVector: false, vectorTimeout: 0, minScore: 0.55, deepEscalation: false, escalationBudgetMs: 0 },
+  balanced: { tokenBudget: 800, maxResults: 10, useVector: true, vectorTimeout: 900, minScore: 0.45, deepEscalation: false, escalationBudgetMs: 0 },
+  deep: { tokenBudget: 1200, maxResults: 15, useVector: true, vectorTimeout: 2000, minScore: 0.25, deepEscalation: true, escalationBudgetMs: 4000 },
 };
 
 export function getActiveProfile(): ProfileConfig {
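To make the profile table concrete, here is a minimal sketch of how the `CLAWMEM_PROFILE` override mentioned in the code comments could resolve to one of these profiles. The table is copied verbatim from the diff; the `resolveProfile` helper and its fallback-to-`balanced` behavior are assumptions, since the real `getActiveProfile` body is not shown in this diff.

```typescript
type PerformanceProfile = "speed" | "balanced" | "deep";

interface ProfileConfig {
  tokenBudget: number;
  maxResults: number;
  useVector: boolean;
  vectorTimeout: number;
  minScore: number;
  deepEscalation: boolean;
  escalationBudgetMs: number;
}

// Profile table copied from the diff above
const PROFILES: Record<PerformanceProfile, ProfileConfig> = {
  speed: { tokenBudget: 400, maxResults: 5, useVector: false, vectorTimeout: 0, minScore: 0.55, deepEscalation: false, escalationBudgetMs: 0 },
  balanced: { tokenBudget: 800, maxResults: 10, useVector: true, vectorTimeout: 900, minScore: 0.45, deepEscalation: false, escalationBudgetMs: 0 },
  deep: { tokenBudget: 1200, maxResults: 15, useVector: true, vectorTimeout: 2000, minScore: 0.25, deepEscalation: true, escalationBudgetMs: 4000 },
};

// Hypothetical resolver: unknown or unset names fall back to "balanced"
function resolveProfile(name?: string): ProfileConfig {
  return PROFILES[(name ?? "balanced") as PerformanceProfile] ?? PROFILES.balanced;
}
```

Note that only the `deep` profile enables escalation; `speed` and `balanced` keep `escalationBudgetMs` at 0, so the escalation block added later in this diff never fires for them.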
@@ -7,7 +7,8 @@
  */
 
 import type { Store, SearchResult } from "../store.ts";
-import { DEFAULT_EMBED_MODEL, extractSnippet } from "../store.ts";
+import { DEFAULT_EMBED_MODEL, DEFAULT_QUERY_MODEL, DEFAULT_RERANK_MODEL, extractSnippet, resolveStore } from "../store.ts";
+import { getVaultPath, getActiveProfile } from "../config.ts";
 import type { HookInput, HookOutput } from "../hooks.ts";
 import {
   makeContextOutput,
@@ -29,13 +30,12 @@ import { enrichResults } from "../search-utils.ts";
 import { sanitizeSnippet } from "../promptguard.ts";
 import { shouldSkipRetrieval, isRetrievedNoise } from "../retrieval-gate.ts";
 import { MAX_QUERY_LENGTH } from "../limits.ts";
-import { getActiveProfile } from "../config.ts";
 
 // =============================================================================
 // Config
 // =============================================================================
 
-// Profile-driven defaults (overridden by CLAWMEM_PROFILE env var)
+// Profile-driven defaults (overridden by CLAWMEM_PROFILE env var via E14)
 const DEFAULT_TOKEN_BUDGET = 800;
 const DEFAULT_MAX_RESULTS = 10;
 const DEFAULT_MIN_SCORE = 0.45;
@@ -52,7 +52,7 @@ function getTierConfig(score: number): { snippetLen: number; showMeta: boolean;
 // Directories to never surface
 const FILTERED_PATHS = ["_PRIVATE/", "experiments/", "_clawmem/"];
 
-// File path patterns to extract from prompts (E13: file-aware UserPromptSubmit)
+// File path patterns to extract from prompts (E13 replacement: file-aware UserPromptSubmit)
 const FILE_PATH_RE = /(?:^|\s)((?:\/[\w.@-]+)+(?:\.\w+)?|[\w.@-]+\.(?:ts|js|py|md|sh|yaml|yml|json|toml|rs|go|tsx|jsx|css|html))\b/g;
 
 // =============================================================================
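To see what `FILE_PATH_RE` actually extracts: only absolute paths and bare filenames match, because the leading `(?:^|\s)` anchor requires the candidate to begin at a whitespace boundary, so a relative path like `src/config.ts` is skipped entirely (neither `src` nor `/config.ts` qualifies). A quick check, using the regex and the same extraction expression the hook uses:

```typescript
// Regex copied verbatim from the diff above
const FILE_PATH_RE = /(?:^|\s)((?:\/[\w.@-]+)+(?:\.\w+)?|[\w.@-]+\.(?:ts|js|py|md|sh|yaml|yml|json|toml|rs|go|tsx|jsx|css|html))\b/g;

const samplePrompt = "check /etc/hosts and config.ts please";
// Same matchAll + capture-group extraction as contextSurfacing
const fileMatches = [...samplePrompt.matchAll(FILE_PATH_RE)].map(m => m[1]!.trim());
// fileMatches → ["/etc/hosts", "config.ts"]
```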
@@ -81,10 +81,11 @@ export async function contextSurfacing(
     return makeEmptyOutput("context-surfacing");
   }
 
-  // Load active performance profile
+  // Load active performance profile (E14)
   const profile = getActiveProfile();
   const maxResults = profile.maxResults;
   const tokenBudget = profile.tokenBudget;
+  const startTime = Date.now();
 
   const isRecency = hasRecencyIntent(prompt);
   const minScore = isRecency ? MIN_COMPOSITE_SCORE_RECENCY : profile.minScore;
@@ -118,7 +119,22 @@
     }
   }
 
-  // File-aware supplemental search (E13): extract file paths/names from prompt
+  // Dual-query: also search skill vault if configured (secondary source)
+  if (getVaultPath("skill")) {
+    try {
+      const skillStore = resolveStore("skill");
+      const skillResults = skillStore.searchFTS(prompt, 5);
+      // Tag skill vault results for identification in output
+      for (const r of skillResults) {
+        (r as any)._fromVault = "skill";
+      }
+      results = [...results, ...skillResults];
+    } catch {
+      // Skill vault unavailable — continue with general results only
+    }
+  }
+
+  // File-aware supplemental search (E13 replacement): extract file paths/names from prompt
   // and run targeted FTS queries to surface file-specific vault context
   const fileMatches = [...prompt.matchAll(FILE_PATH_RE)].map(m => m[1]!.trim()).filter(Boolean);
   if (fileMatches.length > 0) {
@@ -138,6 +154,54 @@
 
   if (results.length === 0) return makeEmptyOutput("context-surfacing");
 
+  // Budget-aware deep escalation (deep profile only):
+  // If the fast path finished quickly and found results, spend remaining time budget
+  // on query expansion (discovers new candidates) and cross-encoder reranking (reorders).
+  if (profile.deepEscalation && results.length >= 2) {
+    const elapsed = Date.now() - startTime;
+    if (elapsed < profile.escalationBudgetMs) {
+      try {
+        // Phase 1: Query expansion — discover candidates BM25+vector missed
+        const expanded = await store.expandQuery(prompt, DEFAULT_QUERY_MODEL);
+        if (expanded.length > 0) {
+          const seen = new Set(results.map(r => r.filepath));
+          for (const eq of expanded.slice(0, 3)) {
+            if (Date.now() - startTime > 6000) break; // hard stop at 6s
+            const ftsExp = store.searchFTS(eq, 5);
+            for (const r of ftsExp) {
+              if (!seen.has(r.filepath)) {
+                seen.add(r.filepath);
+                results.push(r);
+              }
+            }
+          }
+        }
+
+        // Phase 2: Cross-encoder reranking — reorder with deeper relevance signal
+        if (Date.now() - startTime < 6000 && results.length >= 3) {
+          const toRerank = results.slice(0, 15).map(r => ({
+            file: r.filepath,
+            text: (r.body || "").slice(0, 2000),
+          }));
+          const reranked = await store.rerank(prompt, toRerank, DEFAULT_RERANK_MODEL);
+          if (reranked.length > 0) {
+            const rerankedMap = new Map(reranked.map(r => [r.file, r.score]));
+            // Blend: 60% original score + 40% reranker score for stability
+            for (const r of results) {
+              const rerankScore = rerankedMap.get(r.filepath);
+              if (rerankScore !== undefined) {
+                r.score = 0.6 * r.score + 0.4 * rerankScore;
+              }
+            }
+            results.sort((a, b) => b.score - a.score);
+          }
+        }
+      } catch {
+        // Escalation failed (GPU down, timeout, etc.) — continue with fast-path results
+      }
+    }
+  }
+
   // Filter out private/excluded paths
   results = results.filter(r =>
     !FILTERED_PATHS.some(p => r.displayPath.includes(p))
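The Phase 2 blend, isolated as arithmetic: a hit with fast-path score 0.5 that the reranker scores 1.0 lands at 0.6 × 0.5 + 0.4 × 1.0 = 0.7, so reranking can reorder results without one noisy reranker score swamping the original ranking, and hits the reranker never saw keep their original score. A minimal standalone sketch of the same Map-based blend (the `Hit` shape here is a stand-in for the real `SearchResult`):

```typescript
interface Hit { filepath: string; score: number; }

// Same blend as the diff: 60% original score + 40% reranker score, then re-sort
function blendRerank(hits: Hit[], reranked: { file: string; score: number }[]): Hit[] {
  const rerankedMap = new Map(reranked.map((r): [string, number] => [r.file, r.score]));
  for (const h of hits) {
    const rs = rerankedMap.get(h.filepath);
    if (rs !== undefined) h.score = 0.6 * h.score + 0.4 * rs;
  }
  return hits.sort((a, b) => b.score - a.score);
}
```

With the example above, `a.md` at 0.5 plus a reranker score of 1.0 overtakes an untouched `b.md` at 0.6.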
@@ -148,8 +212,12 @@
 
   // Filter out snoozed documents
   const now = new Date();
   results = results.filter(r => {
+    // filepath is a virtual path (clawmem://collection/path) but findActiveDocument
+    // expects the collection-relative path, not the full virtual path
     const parsed = r.filepath.startsWith('clawmem://') ? r.filepath.replace(/^clawmem:\/\/[^/]+\/?/, '') : r.filepath;
-    const doc = store.findActiveDocument(r.collectionName, parsed);
+    // Use the correct store for skill-vault results
+    const targetStore = (r as any)._fromVault === "skill" ? (() => { try { return resolveStore("skill"); } catch { return store; } })() : store;
+    const doc = targetStore.findActiveDocument(r.collectionName, parsed);
     if (!doc) return true;
     if (doc.snoozed_until && new Date(doc.snoozed_until) > now) return false;
     return true;
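The prefix-stripping in that filter, in isolation: `clawmem://<collection>/<rest>` becomes the collection-relative `<rest>`, while paths without the scheme pass through unchanged. Using the same regex as the diff:

```typescript
// Drop the clawmem:// scheme plus the collection segment (regex from the diff)
const toCollectionRelative = (p: string): string =>
  p.startsWith("clawmem://") ? p.replace(/^clawmem:\/\/[^/]+\/?/, "") : p;

// toCollectionRelative("clawmem://notes/daily/standup.md") → "daily/standup.md"
// toCollectionRelative("already/relative.md") → "already/relative.md"
```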
@@ -170,8 +238,19 @@
   // Filter out noise results (agent denials, too-short snippets) before enrichment
   results = results.filter(r => !r.body || !isRetrievedNoise(r.body));
 
-  // Enrich with SAME metadata
-  const enriched = enrichResults(store, results, prompt);
+  // Enrich with SAME metadata — route skill-vault results through their own store
+  const generalResults = results.filter(r => !(r as any)._fromVault);
+  const skillResults = results.filter(r => (r as any)._fromVault === "skill");
+  let enriched = enrichResults(store, generalResults, prompt);
+  if (skillResults.length > 0) {
+    try {
+      const skillStore = resolveStore("skill");
+      enriched = [...enriched, ...enrichResults(skillStore, skillResults, prompt)];
+    } catch {
+      // Skill store unavailable — enrich with general store as fallback
+      enriched = [...enriched, ...enrichResults(store, skillResults, prompt)];
+    }
+  }
 
   // Apply composite scoring
   const scored = applyCompositeScoring(enriched, prompt)
@@ -191,6 +270,7 @@
   for (const ca of coActs) {
     const existing = scored.find(r => r.displayPath === ca.path);
     if (existing && existing.compositeScore <= 0.8) {
+      // Boost by 0.1 per co-activation count, capped at +0.2
      existing.compositeScore += Math.min(0.2, 0.1 * Math.min(ca.count, 2));
     }
   }
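The boost expression works out to +0.1 for a single co-activation and a hard cap of +0.2 from two co-activations onward; combined with the `compositeScore <= 0.8` guard above, boosted results cannot leapfrog genuinely strong hits. Isolated:

```typescript
// 0.1 per co-activation, count clamped to 2 (expression from the diff)
const coActivationBoost = (count: number): number =>
  Math.min(0.2, 0.1 * Math.min(count, 2));

// coActivationBoost(1) → 0.1
// coActivationBoost(2) → 0.2
// coActivationBoost(7) → 0.2 (capped)
```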
@@ -202,12 +282,14 @@
   }
 
   // Memory type diversification (E10): ensure procedural results aren't crowded out
+  // If top results are all semantic, promote the best procedural result
   if (scored.length > 3) {
     const top3Types = scored.slice(0, 3).map(r => inferMemoryType(r.displayPath, r.contentType, r.body));
     const hasProc = top3Types.includes("procedural");
     if (!hasProc) {
       const procIdx = scored.findIndex(r => inferMemoryType(r.displayPath, r.contentType, r.body) === "procedural");
       if (procIdx > 3) {
+        // Move the best procedural result to position 3
         const [proc] = scored.splice(procIdx, 1);
         scored.splice(3, 0, proc!);
       }
@@ -225,6 +307,7 @@
   }
 
   // Routing hint: detect query intent signals and prepend a tool routing directive
+  // This makes routing instructions salient at the moment of tool selection (per research)
   const routingHint = detectRoutingHint(prompt);
 
   return makeContextOutput(
@@ -247,14 +330,17 @@
 function detectRoutingHint(prompt: string): string | null {
   const q = prompt.toLowerCase();
 
+  // Timeline/session signals
   if (/\b(last session|yesterday|prior session|previous session|last time we|handoff|what happened last|what did we do|cross.session|earlier today|what we discussed|when we last)\b/i.test(q)) {
     return "If searching memory for this: use session_log or memory_retrieve, NOT query.";
   }
 
+  // Causal signals
   if (/\b(why did|why was|why were|what caused|what led to|reason for|decided to|decision about|trade.?off|instead of|chose to)\b/i.test(q) || /^why\b/i.test(q)) {
     return "If searching memory for this: use intent_search or memory_retrieve, NOT query.";
   }
 
+  // Discovery signals
   if (/\b(similar to|related to|what else|what other|reminds? me of|like this)\b/i.test(q)) {
     return "If searching memory for this: use find_similar or memory_retrieve, NOT query.";
   }
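Condensed, the routing hint behaves like a small keyword classifier over prompt phrasing. The three branches below use the regexes and return strings from the diff verbatim; the trailing `return null` for unmatched prompts is an assumption, since the hunk is truncated before the function's end:

```typescript
function detectRoutingHint(prompt: string): string | null {
  const q = prompt.toLowerCase();
  // Timeline/session signals
  if (/\b(last session|yesterday|prior session|previous session|last time we|handoff|what happened last|what did we do|cross.session|earlier today|what we discussed|when we last)\b/i.test(q)) {
    return "If searching memory for this: use session_log or memory_retrieve, NOT query.";
  }
  // Causal signals
  if (/\b(why did|why was|why were|what caused|what led to|reason for|decided to|decision about|trade.?off|instead of|chose to)\b/i.test(q) || /^why\b/i.test(q)) {
    return "If searching memory for this: use intent_search or memory_retrieve, NOT query.";
  }
  // Discovery signals
  if (/\b(similar to|related to|what else|what other|reminds? me of|like this)\b/i.test(q)) {
    return "If searching memory for this: use find_similar or memory_retrieve, NOT query.";
  }
  // Assumed fallback: no routing hint for unmatched prompts
  return null;
}

// detectRoutingHint("why did we choose SQLite?")     → intent_search directive
// detectRoutingHint("what happened last session?")   → session_log directive
```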
@@ -9,6 +9,7 @@
 import { resolve as pathResolve } from "path";
 import { existsSync, readFileSync } from "fs";
 import type { Store } from "../store.ts";
+import { getIndexHealth } from "../store.ts";
 import type { HookInput, HookOutput } from "../hooks.ts";
 import {
   makeContextOutput,
@@ -64,11 +65,23 @@
     return makeEmptyOutput("curator-nudge");
   }
 
+  // Override embedding backlog with live data (report value goes stale after embed timer runs)
+  let actions = [...report.actions];
+  try {
+    const health = getIndexHealth(store.db);
+    actions = actions.filter(a => !/documents? need embedding/i.test(a));
+    if (health.needsEmbedding > 0) {
+      actions.unshift(`${health.needsEmbedding} documents need embedding`);
+    }
+  } catch { /* fail-open: use report actions as-is */ }
+
+  if (actions.length === 0) return makeEmptyOutput("curator-nudge");
+
   // Build compact action summary within budget
   const lines = [`**Curator (${report.timestamp.slice(0, 10)}):**`];
   let tokens = estimateTokens(lines[0]!);
 
-  for (const action of report.actions) {
+  for (const action of actions) {
     const line = `- ${action}`;
     const lineTokens = estimateTokens(line);
     if (tokens + lineTokens > MAX_TOKEN_BUDGET && lines.length > 1) break;
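The backlog-override logic in this hunk can be exercised in isolation. The helper below is a stand-in that takes the report's action list and a live embedding count in place of `report.actions` and `getIndexHealth`; the filter and unshift mirror the diff:

```typescript
// Drop the stale "N documents need embedding" line and re-add the live count
function refreshBacklogActions(reportActions: string[], liveNeedsEmbedding: number): string[] {
  const actions = reportActions.filter(a => !/documents? need embedding/i.test(a));
  if (liveNeedsEmbedding > 0) {
    actions.unshift(`${liveNeedsEmbedding} documents need embedding`);
  }
  return actions;
}

// refreshBacklogActions(["5 documents need embedding", "merge duplicate notes"], 12)
//   → ["12 documents need embedding", "merge duplicate notes"]
// refreshBacklogActions(["3 documents need embedding"], 0)
//   → [] (empty list, so the nudge is suppressed by the early return above)
```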