clawmem 0.1.0

Files changed (50)
  1. package/AGENTS.md +660 -0
  2. package/CLAUDE.md +660 -0
  3. package/LICENSE +21 -0
  4. package/README.md +993 -0
  5. package/SKILL.md +717 -0
  6. package/bin/clawmem +75 -0
  7. package/package.json +72 -0
  8. package/src/amem.ts +797 -0
  9. package/src/beads.ts +263 -0
  10. package/src/clawmem.ts +1849 -0
  11. package/src/collections.ts +405 -0
  12. package/src/config.ts +178 -0
  13. package/src/consolidation.ts +123 -0
  14. package/src/directory-context.ts +248 -0
  15. package/src/errors.ts +41 -0
  16. package/src/formatter.ts +427 -0
  17. package/src/graph-traversal.ts +247 -0
  18. package/src/hooks/context-surfacing.ts +317 -0
  19. package/src/hooks/curator-nudge.ts +89 -0
  20. package/src/hooks/decision-extractor.ts +639 -0
  21. package/src/hooks/feedback-loop.ts +214 -0
  22. package/src/hooks/handoff-generator.ts +345 -0
  23. package/src/hooks/postcompact-inject.ts +226 -0
  24. package/src/hooks/precompact-extract.ts +314 -0
  25. package/src/hooks/pretool-inject.ts +79 -0
  26. package/src/hooks/session-bootstrap.ts +324 -0
  27. package/src/hooks/staleness-check.ts +130 -0
  28. package/src/hooks.ts +367 -0
  29. package/src/indexer.ts +327 -0
  30. package/src/intent.ts +294 -0
  31. package/src/limits.ts +26 -0
  32. package/src/llm.ts +1175 -0
  33. package/src/mcp.ts +2138 -0
  34. package/src/memory.ts +336 -0
  35. package/src/mmr.ts +93 -0
  36. package/src/observer.ts +269 -0
  37. package/src/openclaw/engine.ts +283 -0
  38. package/src/openclaw/index.ts +221 -0
  39. package/src/openclaw/plugin.json +83 -0
  40. package/src/openclaw/shell.ts +207 -0
  41. package/src/openclaw/tools.ts +304 -0
  42. package/src/profile.ts +346 -0
  43. package/src/promptguard.ts +218 -0
  44. package/src/retrieval-gate.ts +106 -0
  45. package/src/search-utils.ts +127 -0
  46. package/src/server.ts +783 -0
  47. package/src/splitter.ts +325 -0
  48. package/src/store.ts +4062 -0
  49. package/src/validation.ts +67 -0
  50. package/src/watcher.ts +58 -0
package/SKILL.md ADDED
@@ -0,0 +1,717 @@
---
name: clawmem
description: |
  ClawMem agent reference — detailed operational guidance for the on-device hybrid memory system. Use when: setting up collections/indexing/embedding, troubleshooting retrieval, tuning query optimization (4 levers), understanding pipeline behavior, managing memory lifecycle (pin/snooze/forget), building graphs, or any ClawMem operation beyond basic tool routing.
allowed-tools: "mcp__clawmem__*"
metadata:
  author: yoloshii
  version: 1.0.0
---

# ClawMem Agent Reference

## Architecture

Two tiers: **hooks** handle automatic context flow (surfacing, extraction, compaction survival). **MCP tools** handle explicit recall, write, and lifecycle operations.

---

## Inference Services

Three `llama-server` instances provide neural inference. The `bin/clawmem` wrapper defaults to `localhost:8088/8089/8090`.

**Default (QMD native combo, any GPU or in-process):**

| Service | Port | Model | VRAM | Protocol |
|---|---|---|---|---|
| Embedding | 8088 | EmbeddingGemma-300M-Q8_0 | ~400MB | `/v1/embeddings` |
| LLM | 8089 | qmd-query-expansion-1.7B-q4_k_m | ~2.2GB | `/v1/chat/completions` |
| Reranker | 8090 | qwen3-reranker-0.6B-Q8_0 | ~1.3GB | `/v1/rerank` |

All three models auto-download via `node-llama-cpp` if no server is running (Metal on Apple Silicon, Vulkan where available, CPU as last resort). Fast with GPU acceleration; significantly slower on CPU only.

**SOTA upgrade (12GB+ GPU):** zembed-1-Q4_K_M (embedding, 2560d, ~4.4GB) + zerank-2-Q4_K_M (reranker, ~3.3GB). Total ~10GB with LLM. Distillation-paired via zELO. `-ub` must match `-b` for both. **CC-BY-NC-4.0** — non-commercial only.

**Remote option:** Set `CLAWMEM_EMBED_URL`, `CLAWMEM_LLM_URL`, and `CLAWMEM_RERANK_URL` to the remote host. Set `CLAWMEM_NO_LOCAL_MODELS=true` to prevent fallback downloads.

**Cloud embedding:** Set `CLAWMEM_EMBED_API_KEY` + `CLAWMEM_EMBED_URL` + `CLAWMEM_EMBED_MODEL` for cloud providers. Supported: Jina AI (`jina-embeddings-v5-text-small`, 1024d), OpenAI, Voyage, Cohere. Batch embedding, TPM-aware pacing, and provider-specific params are auto-detected.

### Server Setup

```bash
# === Default (QMD native combo) ===

# Embedding (--embeddings flag required)
llama-server -m embeddinggemma-300M-Q8_0.gguf \
  --embeddings --port 8088 --host 0.0.0.0 -ngl 99 -c 2048 --batch-size 2048

# LLM (auto-downloads via node-llama-cpp if no server)
llama-server -m qmd-query-expansion-1.7B-q4_k_m.gguf \
  --port 8089 --host 0.0.0.0 -ngl 99 -c 4096 --batch-size 512

# Reranker (auto-downloads via node-llama-cpp if no server)
llama-server -m Qwen3-Reranker-0.6B-Q8_0.gguf \
  --reranking --port 8090 --host 0.0.0.0 -ngl 99 -c 2048 --batch-size 512

# === SOTA upgrade (12GB+ GPU) — -ub must match -b ===

# Embedding (zembed-1)
llama-server -m zembed-1-Q4_K_M.gguf \
  --embeddings --port 8088 --host 0.0.0.0 -ngl 99 -c 8192 -b 2048 -ub 2048

# Reranker (zerank-2)
llama-server -m zerank-2-Q4_K_M.gguf \
  --reranking --port 8090 --host 0.0.0.0 -ngl 99 -c 2048 -b 2048 -ub 2048
```

### Verify Endpoints

```bash
curl http://host:8088/v1/embeddings -d '{"input":"test","model":"embedding"}' -H 'Content-Type: application/json'
curl http://host:8089/v1/models
curl http://host:8090/v1/models
```

## Environment Variables

| Variable | Default (via wrapper) | Effect |
|---|---|---|
| `CLAWMEM_EMBED_URL` | `http://localhost:8088` | Embedding server. Falls back to in-process `node-llama-cpp` if unset. |
| `CLAWMEM_EMBED_API_KEY` | (none) | API key for cloud embedding. Enables cloud mode: batch embedding, provider-specific params, TPM-aware pacing. |
| `CLAWMEM_EMBED_MODEL` | `embedding` | Model name for embedding requests. Override for cloud providers (e.g. `jina-embeddings-v5-text-small`). |
| `CLAWMEM_EMBED_TPM_LIMIT` | `100000` | Tokens-per-minute limit for cloud embedding pacing. Match to your provider tier. |
| `CLAWMEM_EMBED_DIMENSIONS` | (none) | Output dimensions for OpenAI `text-embedding-3-*` Matryoshka models. |
| `CLAWMEM_LLM_URL` | `http://localhost:8089` | LLM server. Falls back to `node-llama-cpp` if unset and `CLAWMEM_NO_LOCAL_MODELS=false`. |
| `CLAWMEM_RERANK_URL` | `http://localhost:8090` | Reranker server. Falls back to `node-llama-cpp` if unset and `CLAWMEM_NO_LOCAL_MODELS=false`. |
| `CLAWMEM_NO_LOCAL_MODELS` | `false` | Blocks `node-llama-cpp` auto-downloads. Set `true` for remote-only. |
| `CLAWMEM_ENABLE_AMEM` | enabled | A-MEM note construction + link generation during indexing. |
| `CLAWMEM_ENABLE_CONSOLIDATION` | disabled | Background worker backfills unenriched docs. Needs a long-lived MCP process. |
| `CLAWMEM_CONSOLIDATION_INTERVAL` | `300000` | Worker interval in ms (min 15000). |

**Note:** The `bin/clawmem` wrapper sets all endpoint defaults. Always use the wrapper — never `bun run src/clawmem.ts` directly.
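TPM-aware pacing is easy to picture: track tokens sent in the trailing 60-second window and delay any batch that would push past `CLAWMEM_EMBED_TPM_LIMIT`. A minimal Python sketch of the idea — the `TpmPacer` class and its methods are hypothetical, not ClawMem's actual implementation:

```python
class TpmPacer:
    """Illustrative tokens-per-minute pacer (not ClawMem's real code)."""

    def __init__(self, tpm_limit: int = 100_000):
        self.tpm_limit = tpm_limit
        self.window: list[tuple[float, int]] = []  # (timestamp, tokens)

    def _used(self, now: float) -> int:
        # Drop entries older than the 60-second window, sum the rest.
        self.window = [(t, n) for t, n in self.window if now - t < 60]
        return sum(n for _, n in self.window)

    def delay_for(self, tokens: int, now: float) -> float:
        """Seconds to wait so a batch of `tokens` stays under the limit."""
        used = self._used(now)
        if used + tokens <= self.tpm_limit:
            return 0.0
        if not self.window:
            return 0.0  # single batch exceeds the limit; caller must split it
        # Wait until the oldest entry ages out of the window.
        oldest = min(t for t, _ in self.window)
        return max(0.0, 60 - (now - oldest))

    def record(self, tokens: int, now: float) -> None:
        self.window.append((now, tokens))
```

Under this sketch, a provider tier of 1M TPM would be expressed as `TpmPacer(tpm_limit=1_000_000)`; the real pacing presumably lives behind `CLAWMEM_EMBED_TPM_LIMIT`.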

---

## Quick Setup

```bash
# Install via npm
bun add -g clawmem   # or: npm install -g clawmem

# Or from source
git clone https://github.com/yoloshii/clawmem.git ~/clawmem
cd ~/clawmem && bun install
ln -sf ~/clawmem/bin/clawmem ~/.bun/bin/clawmem

# Bootstrap a vault (init + index + embed + hooks + MCP)
clawmem bootstrap ~/notes --name notes

# Or step by step:
./bin/clawmem init
./bin/clawmem collection add ~/notes --name notes
./bin/clawmem update --embed
./bin/clawmem setup hooks
./bin/clawmem setup mcp

# Verify
./bin/clawmem doctor   # Full health check
./bin/clawmem status   # Quick index status
```

### Background Services (systemd user units)

```bash
mkdir -p ~/.config/systemd/user

# clawmem-watcher.service — auto-indexes on .md changes
cat > ~/.config/systemd/user/clawmem-watcher.service << 'EOF'
[Unit]
Description=ClawMem file watcher — auto-indexes on .md changes
After=default.target

[Service]
Type=simple
ExecStart=%h/clawmem/bin/clawmem watch
Restart=on-failure
RestartSec=10

[Install]
WantedBy=default.target
EOF

# clawmem-embed.service — oneshot embedding sweep
cat > ~/.config/systemd/user/clawmem-embed.service << 'EOF'
[Unit]
Description=ClawMem embedding sweep

[Service]
Type=oneshot
ExecStart=%h/clawmem/bin/clawmem embed
EOF

# clawmem-embed.timer — daily at 04:00
cat > ~/.config/systemd/user/clawmem-embed.timer << 'EOF'
[Unit]
Description=ClawMem daily embedding sweep

[Timer]
OnCalendar=*-*-* 04:00:00
Persistent=true
RandomizedDelaySec=300

[Install]
WantedBy=timers.target
EOF

# Enable and start
systemctl --user daemon-reload
systemctl --user enable --now clawmem-watcher.service clawmem-embed.timer
loginctl enable-linger $(whoami)
```

**Note:** Service files use `%h` (home dir). For a remote GPU, add `Environment=CLAWMEM_EMBED_URL=http://host:8088` etc. to both service files.

---

## Tier 2 — Automatic Retrieval (Hooks)

Hooks handle ~90% of retrieval with zero agent effort.

| Hook | Trigger | Budget | Content |
|------|---------|--------|---------|
| `context-surfacing` | UserPromptSubmit | profile-driven (default 800) | retrieval gate -> profile-driven hybrid search (vector if `useVector`, timeout from profile) -> FTS supplement -> file-aware search (E13) -> snooze filter -> noise filter -> spreading activation (E11) -> memory type diversification (E10) -> tiered injection (HOT/WARM/COLD) -> `<vault-context>` + optional `<vault-routing>` hint. Budget, max results, vector timeout, and min score are all driven by `CLAWMEM_PROFILE`. |
| `postcompact-inject` | SessionStart (compact) | 1200 tokens | re-injects authoritative context after compaction: precompact state (600) + decisions (400) + antipatterns (150) + vault context (200) -> `<vault-postcompact>` |
| `curator-nudge` | SessionStart | 200 tokens | surfaces curator report actions, nudges when the report is stale (>7 days) |
| `precompact-extract` | PreCompact | — | extracts decisions, file paths, open questions -> writes `precompact-state.md`. Query-aware ranking. Reindexes auto-memory. |
| `decision-extractor` | Stop | — | LLM extracts observations -> `_clawmem/agent/observations/`, infers causal links, detects contradictions |
| `handoff-generator` | Stop | — | LLM summarizes the session -> `_clawmem/agent/handoffs/` |
| `feedback-loop` | Stop | — | tracks referenced notes -> boosts confidence, records usage relations + co-activations, tracks utility signals (surfaced vs referenced ratio) |

**Default behavior:** Read the injected `<vault-context>` first. If sufficient, answer immediately.

**Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap the token budget. Absence from `<vault-context>` does NOT mean absence from memory. Escalate to Tier 3 if an expected memory wasn't surfaced.

---


## Tier 3 — Agent-Initiated Retrieval (MCP Tools)

### 3-Rule Escalation Gate

Escalate to MCP tools ONLY when one of these fires:

1. **Low-specificity injection** — `<vault-context>` is empty or lacks the specific fact the task requires. Hooks surface top-k by relevance; if the needed memory wasn't in the top-k, escalate.
2. **Cross-session question** — the task explicitly references prior sessions or decisions: "why did we decide X", "what changed since last time".
3. **Pre-irreversible check** — about to make a destructive or hard-to-reverse change. Check the vault for prior decisions.

All other retrieval is handled by Tier 2 hooks. Do NOT call MCP tools speculatively.

### Tool Routing

Once escalated, route by query type:

**PREFERRED:** `memory_retrieve(query)` — auto-classifies and routes to the optimal backend (query, intent_search, session_log, find_similar, or query_plan). Use this instead of manually choosing a tool below.

```
1a. General recall -> query(query, compact=true, limit=20)
    Full hybrid: BM25 + vector + query expansion + deep reranking.
    Supports compact, collection filter, intent, candidateLimit.
    Optional: intent="domain hint" for ambiguous queries.
    Optional: candidateLimit=N (default 30).
    BM25 strong-signal bypass: skips expansion when top BM25 >= 0.85 with gap >= 0.15
    (disabled when intent is provided).

1b. Causal/why/when/entity -> intent_search(query, enable_graph_traversal=true)
    MAGMA intent classification + intent-weighted RRF + multi-hop graph traversal.
    Use DIRECTLY when the question is "why", "when", "how did X lead to Y",
    or needs entity-relationship traversal.
    Override auto-detection: force_intent="WHY"|"WHEN"|"ENTITY"|"WHAT"

    Choose 1a or 1b based on query type. Parallel options, not sequential.

1c. Multi-topic/complex -> query_plan(query, compact=true)
    Decomposes the query into 2-4 typed clauses (bm25/vector/graph), executes them
    in parallel, merges via RRF.
    Use when the query spans multiple topics or needs both keyword and semantic
    recall simultaneously.
    Falls back to single-query behavior for simple queries.

2. Progressive disclosure -> multi_get("path1,path2") for full content of top hits

3. Spot checks -> search(query) (BM25, 0 GPU) or vsearch(query) (vector, 1 GPU)

4. Chain tracing -> find_causal_links(docid, direction="both", depth=5)
   Traverses causal edges between _clawmem/agent/observations/ docs.

5. Memory debugging -> memory_evolution_status(docid)

6. Temporal context -> timeline(docid, before=5, after=5, same_collection=false)
   Shows what was created/modified before and after a document.
   Use after search to understand the chronological neighborhood.
```

### All MCP Tools

| Tool | Purpose |
|------|---------|
| `memory_retrieve` | **Preferred.** Auto-classifies the query and routes to the optimal backend. Use instead of choosing manually. |
| `query` | Full hybrid (BM25 + vector + rerank). General-purpose when the query type is unclear. WRONG for "why" (use `intent_search`) or cross-session (use `session_log`). Prefer `memory_retrieve`. |
| `intent_search` | USE THIS for "why did we decide X", "what caused Y", "who worked on Z". Classifies intent, traverses graph edges. Returns decision chains `query()` cannot find. |
| `query_plan` | USE THIS for multi-topic queries ("X and also Y", "compare A with B"). `query()` searches as one blob — this splits and routes each part optimally. |
| `search` | BM25 keyword — for exact terms, config names, error codes. Fast, 0 GPU. Prefer `memory_retrieve`. |
| `vsearch` | Vector semantic — for conceptual/fuzzy queries when keywords are unknown. ~100ms, 1 GPU. Prefer `memory_retrieve`. |
| `get` | Retrieve a single doc by path or `#docid`. |
| `multi_get` | Retrieve multiple docs by glob or comma-separated list. |
| `find_similar` | USE THIS for "what else relates to X". k-NN vector neighbors — discovers connections beyond keyword overlap. |
| `find_causal_links` | Trace decision chains: "what led to X". Follow up with `intent_search` on a top result to walk the full causal chain. |
| `session_log` | USE THIS for "last time", "yesterday", "what did we do". DO NOT use `query()` for cross-session questions. |
| `profile` | User profile (static facts + dynamic context). |
| `memory_forget` | Deactivate a memory by closest match. |
| `memory_pin` | +0.3 composite boost. USE PROACTIVELY for constraints, architecture decisions, corrections. Don't wait for the curator. |
| `memory_snooze` | USE PROACTIVELY when `<vault-context>` surfaces noise — snooze 30 days instead of ignoring. |
| `build_graphs` | Build the temporal backbone + semantic graph after bulk ingestion. |
| `beads_sync` | Sync Beads issues from the Dolt backend into memory. |
| `index_stats` | Doc counts, embedding coverage, content type distribution. |
| `status` | Quick index health. |
| `reindex` | Force re-index (BM25 only, does NOT embed). |
| `memory_evolution_status` | Track how a doc's A-MEM metadata evolved over time. |
| `timeline` | Temporal neighborhood around a document — what was modified before/after. Progressive disclosure: search -> timeline -> get. Supports same-collection scoping and session correlation. |
| `list_vaults` | Show configured vault names and paths. Empty in single-vault mode. |
| `vault_sync` | Index markdown from a directory into a named vault. Restricted-path validation rejects sensitive directories. |
| `lifecycle_status` | Document lifecycle statistics: active, archived, forgotten, pinned, snoozed counts and policy summary. |
| `lifecycle_sweep` | Run lifecycle policies: archive stale docs. Defaults to dry_run (preview only). |
| `lifecycle_restore` | Restore auto-archived documents. Filter by query, collection, or all. Does NOT restore manually forgotten docs. |

**Multi-vault:** All tools accept an optional `vault` parameter. Omit it for the default vault (single-vault mode). Named vaults are configured in `~/.config/clawmem/config.yaml` under `vaults:` or via the `CLAWMEM_VAULTS` env var. Vault paths support `~` expansion.
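A hypothetical `config.yaml` sketch of the multi-vault shape described above — only the `vaults:` key and `~` expansion are documented here, so the exact nesting is an assumption to verify against your installed version:

```yaml
# Hypothetical shape — only the `vaults:` key and ~ expansion are documented.
vaults:
  notes: ~/notes        # vault name -> path
  work: ~/work/vault
```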

**Progressive disclosure:** ALWAYS `compact=true` first -> review snippets/scores -> `get(docid)` or `multi_get(pattern)` for full content.

---

## Query Optimization

ClawMem's pipeline autonomously generates lex/vec/hyde variants, fuses BM25 + vector via RRF, and reranks with a cross-encoder. The agent does NOT choose search types — the pipeline handles fusion internally. The agent's optimization levers are: **tool selection**, **query string quality**, **intent**, and **candidateLimit**.

### Lever 1: Tool Selection (highest impact)

Pick the lightest tool that satisfies the need. Heavier tools are slower and consume more GPU.

| Tool | Cost | When |
|------|------|------|
| `search(q, compact=true)` | BM25 only, 0 GPU | Know exact terms, spot-check, fast keyword lookup |
| `vsearch(q, compact=true)` | Vector only, 1 GPU call | Conceptual/fuzzy, don't know the vocabulary |
| `query(q, compact=true)` | Full hybrid, 3+ GPU calls | General recall, unsure which signal matters, need the best results |
| `intent_search(q)` | Hybrid + graph traversal | Why/entity chains (graph traversal), when queries (BM25-biased) |
| `query_plan(q, compact=true)` | Hybrid + decomposition | Complex multi-topic queries needing parallel typed retrieval |

Use `search` for quick keyword spot-checks. Use `query` for general recall (the default Tier 3 workhorse). Use `intent_search` directly (not as a fallback) when the question is causal or relational.

### Lever 2: Query String Quality

The query string directly feeds BM25 (which probes first and can short-circuit the entire pipeline) and anchors the 2x-weighted original signal in RRF. A good query string is the single biggest determinant of result quality.

**For keyword recall (BM25 path):**
- 2-5 precise terms, no filler words
- Code identifiers work: `handleError async`
- BM25 tokenizes on whitespace and ANDs all terms as prefix matches (`perf` matches "performance")
- No phrase search or negation syntax — all terms are positive prefix matches
- A strong keyword hit (score >= 0.85 with gap >= 0.15) skips expansion entirely — faster results
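The strong-signal bypass condition above reduces to a small predicate. An illustrative sketch (Python for readability; not the package's TypeScript code):

```python
def strong_signal(bm25_scores: list[float], intent_provided: bool = False) -> bool:
    """True when the BM25 probe is confident enough to skip query expansion:
    top score >= 0.85 and a gap of >= 0.15 to the runner-up.
    Providing an intent hint disables the bypass entirely."""
    if intent_provided or not bm25_scores:
        return False
    ranked = sorted(bm25_scores, reverse=True)
    top = ranked[0]
    # Single-result case treated as maximal gap (an assumption of this sketch).
    gap = top - ranked[1] if len(ranked) > 1 else top
    return top >= 0.85 and gap >= 0.15
```

So `[0.9, 0.6]` bypasses expansion, while `[0.9, 0.8]` (gap too small) or `[0.8, 0.3]` (top too low) run the full pipeline.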

**For semantic recall (vector path):**
- Ask a full natural-language question; be specific
- Include context: `"in the payment service, how are refunds processed"` > `"refunds"`
- The expansion LLM generates complementary variants — don't try to do its job

**Do NOT write hypothetical-answer-style queries.** The expansion LLM already generates hyde variants internally. Writing a 50-word hypothetical dilutes BM25 scoring and is redundant with what the pipeline does autonomously.

### Lever 3: Intent (Disambiguation)

Steers 5 autonomous stages: expansion, reranking, chunk selection, snippet extraction, and strong-signal bypass (disabled when intent is provided, forcing the full pipeline).

```
query("performance", intent="web page load times and Core Web Vitals")
```

**When to provide:**
- The query term has multiple meanings in the vault ("performance", "pipeline", "model")
- You know the domain but the query alone is ambiguous
- Cross-domain search where the same terms appear in different contexts

**When NOT to provide:**
- The query is already specific enough
- Single-domain vault with no ambiguity
- Using `search` or `vsearch` (intent only affects the `query` tool)

**Note:** Intent disables the BM25 strong-signal bypass, forcing full expansion + reranking even on strong keyword hits. This is correct behavior — intent means the query is ambiguous, so keyword confidence alone is insufficient.

### Lever 4: candidateLimit

Controls how many RRF candidates reach the cross-encoder reranker (default 30).

```
query("architecture decisions", candidateLimit=15)  # Faster, more precise
query("architecture decisions", candidateLimit=50)  # Broader recall, slower
```

Lower it when: high-confidence keyword query, speed matters, vault is small.
Raise it when: broad topic, vault is large, recall matters more than speed.

---

## Pipeline Details

### `query` (default Tier 3 workhorse)

```
User Query + optional intent hint
-> BM25 Probe -> Strong Signal Check (skip expansion if top hit >= 0.85 with gap >= 0.15; disabled when intent provided)
-> Query Expansion (LLM generates text variants; intent steers the expansion prompt)
-> Parallel: BM25(original) + Vector(original) + BM25(each expanded) + Vector(each expanded)
-> Original query lists get positional 2x weight in RRF; expanded get 1x
-> Reciprocal Rank Fusion (k=60, top candidateLimit)
-> Intent-Aware Chunk Selection (intent terms at 0.5x weight alongside query terms at 1.0x)
-> Cross-Encoder Reranking (4000 char context; intent prepended to rerank query; chunk dedup; batch cap=4)
-> Position-Aware Blending (alpha=0.75 top3, 0.60 mid, 0.40 tail)
-> Composite Scoring
-> MMR Diversity Filter (Jaccard bigram similarity > 0.6 -> demoted, not removed)
```
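The fusion stage above can be sketched in a few lines: each ranked list contributes `weight / (k + rank)` per document, with original-query lists weighted 2x and expanded variants 1x. An illustrative Python sketch, not the package's implementation:

```python
def rrf_fuse(ranked_lists: list[tuple[list[str], float]], k: int = 60) -> list[str]:
    """Weighted Reciprocal Rank Fusion: each (docs, weight) list adds
    weight / (k + rank) to every doc it ranks; higher totals win."""
    scores: dict[str, float] = {}
    for docs, weight in ranked_lists:
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Original BM25/vector lists (weight 2.0) fused with one expanded variant (1.0):
fused = rrf_fuse([
    (["a", "b", "c"], 2.0),   # BM25(original)
    (["b", "a", "d"], 2.0),   # Vector(original)
    (["d", "e", "a"], 1.0),   # BM25(expanded variant)
])
```

Here "a" wins: it appears in all three lists, and twice with the 2x original-query weight.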

### `intent_search` (specialist for causal chains)

```
User Query -> Intent Classification (WHY/WHEN/ENTITY/WHAT)
-> BM25 + Vector (intent-weighted RRF: boost BM25 for WHEN, vector for WHY)
-> Graph Traversal (WHY/ENTITY only; multi-hop beam search over memory_relations)
   Outbound: all edge types (semantic, supporting, contradicts, causal, temporal)
   Inbound: semantic and entity only
   Scores normalized to [0,1] before merging with search results
-> Cross-Encoder Reranking (200 char context per doc; file-keyed score join)
-> Composite Scoring (uses stored confidence from contradiction detection + feedback)
```

### Key Differences

| Aspect | `query` | `intent_search` |
|--------|---------|-----------------|
| Query expansion | Yes (skipped on strong BM25 signal) | No |
| Intent hint | Yes (`intent` param steers 5 stages) | Auto-detected (WHY/WHEN/ENTITY/WHAT) |
| Rerank context | 4000 chars/doc (intent-aware chunk selection) | 200 chars/doc |
| Chunk dedup | Yes (identical texts share a single rerank call) | No |
| Graph traversal | No | Yes (WHY/ENTITY, multi-hop) |
| MMR diversity | Yes (`diverse=true` default) | No |
| `compact` param | Yes | No |
| `collection` filter | Yes | No |
| `candidateLimit` | Yes (default 30) | No |
| Best for | Most queries, progressive disclosure | Causal chains spanning multiple docs |

### `intent_search` force_intent Guide

| Override | Triggers |
|----------|----------|
| `WHY` | "why", "what led to", "rationale", "tradeoff", "decision behind" |
| `ENTITY` | Named component/person/service needing cross-doc linkage |
| `WHEN` | Timelines, first/last occurrence, "when did this change/regress" |

**WHEN note:** Start with `enable_graph_traversal=false` (BM25-biased); fall back to `query()` if recall drifts.

---


## Composite Scoring

Applied automatically to all search tool results.

```
compositeScore = (0.50 x searchScore + 0.25 x recencyScore + 0.25 x confidenceScore) x qualityMultiplier x coActivationBoost
```

Where `qualityMultiplier = 0.7 + 0.6 x qualityScore` (range: 0.7x penalty to 1.3x boost) and `coActivationBoost = 1 + min(coCount/10, 0.15)` — documents frequently surfaced together get up to a 15% boost.

Length normalization: `1/(1 + 0.5 x log2(max(bodyLength/500, 1)))` — penalizes verbose entries, floored at 30%.

Frequency boost: `freqSignal = (revisions-1)x2 + (duplicates-1)`, `freqBoost = min(0.10, log1p(freqSignal)x0.03)`. Revision count is weighted 2x vs duplicate count; capped at 10%.
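The two adjustments above, transcribed directly from the formulas (Python for readability; the package itself is TypeScript, and how these fold into the composite score isn't spelled out here):

```python
import math

def length_norm(body_length: int) -> float:
    """1 / (1 + 0.5 * log2(max(bodyLength/500, 1))), floored at 0.3."""
    norm = 1 / (1 + 0.5 * math.log2(max(body_length / 500, 1)))
    return max(norm, 0.3)

def freq_boost(revisions: int, duplicates: int) -> float:
    """Revisions weighted 2x vs duplicates; boost capped at +0.10."""
    freq_signal = (revisions - 1) * 2 + (duplicates - 1)
    return min(0.10, math.log1p(freq_signal) * 0.03)
```

A 500-char body is unpenalized (`length_norm(500) == 1.0`); a 2000-char body halves to 0.5; anything enormous bottoms out at the 0.3 floor. A doc seen once with no duplicates gets zero frequency boost.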

Pinned documents get a +0.3 additive boost (capped at 1.0).

### Recency Intent Detected ("latest", "recent", "last session")

```
compositeScore = (0.10 x searchScore + 0.70 x recencyScore + 0.20 x confidenceScore) x qualityMultiplier x coActivationBoost
```
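Putting the two weight profiles together — a sketch of the composite formula as documented (illustrative Python; the length-normalization and frequency-boost terms are omitted because their exact placement isn't specified here):

```python
def composite_score(search: float, recency: float, confidence: float,
                    quality: float = 0.5, co_count: int = 0,
                    pinned: bool = False, recency_intent: bool = False) -> float:
    """Weighted blend x quality multiplier x co-activation boost,
    +0.3 if pinned, capped at 1.0."""
    # Recency intent reweights toward recencyScore (0.70 vs 0.25).
    w = (0.10, 0.70, 0.20) if recency_intent else (0.50, 0.25, 0.25)
    base = w[0] * search + w[1] * recency + w[2] * confidence
    quality_multiplier = 0.7 + 0.6 * quality       # 0.7x penalty .. 1.3x boost
    co_boost = 1 + min(co_count / 10, 0.15)         # up to +15%
    score = base * quality_multiplier * co_boost
    if pinned:
        score += 0.3
    return min(score, 1.0)
```

Note how the default `quality=0.5` yields a neutral 1.0x multiplier, and how a pinned, high-quality doc saturates at the 1.0 cap.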

### Content Type Half-Lives

| Content Type | Half-Life | Effect |
|--------------|-----------|--------|
| decision, hub | infinity | Never decay |
| antipattern | infinity | Never decay — accumulated negative patterns persist |
| project | 120 days | Slow decay |
| research | 90 days | Moderate decay |
| note | 60 days | Default |
| progress | 45 days | Faster decay |
| handoff | 30 days | Fast — recent matters most |

Half-lives extend up to 3x for frequently accessed memories (access reinforcement decays over 90 days).

Attention decay: non-durable types (handoff, progress, note, project) lose 5% confidence per week without access. Decision, hub, research, and antipattern are exempt.

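The half-life table and attention decay are both exponential decays. The exact recencyScore formula isn't given in this file, so standard half-life decay (`0.5 ** (age / halfLife)`) is assumed in this sketch, and the 3x extension for frequently accessed memories is omitted:

```python
import math

HALF_LIVES = {  # days, from the Content Type Half-Lives table
    "decision": math.inf, "hub": math.inf, "antipattern": math.inf,
    "project": 120, "research": 90, "note": 60, "progress": 45, "handoff": 30,
}

def recency_score(age_days: float, content_type: str = "note") -> float:
    """Assumed exponential half-life decay; infinite half-lives never decay."""
    half_life = HALF_LIVES[content_type]
    if math.isinf(half_life):
        return 1.0
    return 0.5 ** (age_days / half_life)

def attention_decay(confidence: float, weeks_without_access: float,
                    content_type: str = "note") -> float:
    """Non-durable types lose 5% confidence per week without access."""
    exempt = {"decision", "hub", "research", "antipattern"}
    if content_type in exempt:
        return confidence
    return confidence * 0.95 ** weeks_without_access
```

Under these assumptions a 60-day-old note scores 0.5, a 60-day-old handoff only 0.25, and a decision never decays regardless of age.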
---

## Indexing & Graph Building

### What Gets Indexed (per collection in config.yaml)

- `**/MEMORY.md` — any depth
- `**/memory/**/*.md`, `**/memory/**/*.txt` — session logs
- `**/docs/**/*.md`, `**/docs/**/*.txt` — documentation
- `**/research/**/*.md`, `**/research/**/*.txt` — research dumps
- `**/YYYY-MM-DD*.md`, `**/YYYY-MM-DD*.txt` — date-format records

### Excluded (even if pattern matches)

- `gits/`, `scraped/`, `.git/`, `node_modules/`, `dist/`, `build/`, `vendor/`

### Indexing vs Embedding

**Infrastructure (Tier 1, no agent action):**
- **`clawmem-watcher`** — keeps the index + A-MEM fresh (continuous, on `.md` change). Watches `.beads/` too. Does NOT embed.
- **`clawmem-embed` timer** — keeps embeddings fresh (daily). Idempotent, skips already-embedded fragments.

**Quality scoring:** Each document gets a `quality_score` (0.0-1.0) during indexing based on length, structure (headings, lists), decision keywords, correction keywords, and frontmatter richness. Applied as a multiplier in composite scoring.

**Impact of missing embeddings:** `vsearch`, `query` (vector component), `context-surfacing` (vector component), and `generateMemoryLinks()` all depend on embeddings. BM25 still works, but vector recall and inter-doc link quality suffer.

**Agent escape hatches (rare):**
- `clawmem embed` via CLI for immediate vector recall after writing a doc.
- Manual `reindex` only when the watcher hasn't caught up.

### Adding New Collections

```bash
# 1. Edit config
$EDITOR ~/.config/clawmem/config.yaml

# 2. Reindex (BM25 only)
mcp__clawmem__reindex()

# 3. Embed (vectors, CLI only)
CLAWMEM_PATH=~/clawmem ~/clawmem/bin/clawmem embed

# 4. Verify
mcp__clawmem__search(query, collection="name", compact=true)   # BM25
mcp__clawmem__vsearch(query, collection="name", compact=true)  # vector
```

**Gotcha:** `reindex` shows an `added` count but does NOT embed. `needsEmbedding` in `index_stats` shows what's pending. Run CLI `embed` separately.

### Graph Population (memory_relations)

| Source | Edge Types | Trigger | Notes |
|--------|-----------|---------|-------|
| A-MEM `generateMemoryLinks()` | semantic, supporting, contradicts | Indexing (new docs) | LLM-assessed confidence. Requires embeddings. |
| A-MEM `inferCausalLinks()` | causal | Post-response (decision-extractor) | Links `_clawmem/agent/observations/` docs only. |
| Beads `syncBeadsIssues()` | causal, supporting, semantic | `beads_sync` MCP or watcher | Queries `bd` CLI (Dolt backend). |
| `buildTemporalBackbone()` | temporal | `build_graphs` MCP (manual) | Creation-order edges. |
| `buildSemanticGraph()` | semantic | `build_graphs` MCP (manual) | Pure cosine similarity. A-MEM edges take precedence (first-writer wins). |

**Graph traversal asymmetry:** `adaptiveTraversal()` traverses all edge types outbound (source -> target) but only `semantic` and `entity` inbound.
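The asymmetry reduces to a one-line predicate. A sketch of the rule (illustrative; not the `adaptiveTraversal()` source):

```python
INBOUND_ALLOWED = {"semantic", "entity"}
EDGE_TYPES = {"semantic", "supporting", "contradicts", "causal", "temporal", "entity"}

def follows_edge(edge_type: str, direction: str) -> bool:
    """Outbound (source -> target): every edge type is followed.
    Inbound: only semantic and entity edges are followed."""
    if direction == "outbound":
        return edge_type in EDGE_TYPES
    return edge_type in INBOUND_ALLOWED
```

Practical consequence: a causal edge A -> B is reachable from A but will never pull A into results when traversal starts from B.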

### When to Run `build_graphs`

- After **bulk ingestion** — adds the temporal backbone + semantic gap filling.
- When `intent_search` for WHY/ENTITY returns weak results and you suspect graph sparsity.
- Do NOT run after every reindex (A-MEM handles per-doc links automatically).

---

+ ## Memory Lifecycle
525
+
526
+ Pin, snooze, and forget are **manual MCP tools**.
527
+
528
+ ### Pin (`memory_pin`)
529
+
530
+ +0.3 composite boost, ensures persistent surfacing.
531
+
532
+ **Proactive triggers:**
533
+ - User says "remember this" / "don't forget" / "this is important"
534
+ - Architecture or critical design decision just made
535
+ - User-stated preference or constraint that should persist across sessions
536
+
537
+ **Do NOT pin:** routine decisions, session-specific context, or observations that naturally surface via recency.
538
+
539
+ ### Snooze (`memory_snooze`)
540
+
541
+ Temporarily hides from context surfacing until a date.
542
+
543
+ **Proactive triggers:**
544
+ - A memory keeps surfacing but isn't relevant to current work
545
+ - User says "not now" / "later" / "ignore this for now"
546
+ - Seasonal or time-boxed content

### Forget (`memory_forget`)

Permanently deactivates a memory. Use sparingly — only when a memory is genuinely wrong or permanently obsolete. Prefer snooze for temporary suppression.

### Contradiction Auto-Resolution

When `decision-extractor` detects a new decision contradicting an old one, the old decision's confidence is lowered automatically. No manual intervention needed.
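
The demote-instead-of-delete rule can be sketched as follows (the numeric penalty and field names are assumptions; the real logic lives in `src/hooks/decision-extractor.ts`):

```typescript
interface Decision { id: string; text: string; confidence: number }

// On contradiction, lower the old decision's confidence rather than deleting it,
// flooring at zero (hypothetical penalty value).
function resolveContradiction(oldDecision: Decision, penalty = 0.3): Decision {
  return { ...oldDecision, confidence: Math.max(0, oldDecision.confidence - penalty) };
}
```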

---

## Anti-Patterns

- Do NOT manually pick query/intent_search/search when `memory_retrieve` can auto-route.
- Do NOT call MCP tools every turn — the 3-rule escalation gate is the only trigger.
- Do NOT re-search what's already in `<vault-context>`.
- Do NOT run `status` routinely — only when retrieval feels broken or after large ingestion.
- Do NOT pin everything — pin is for persistent high-priority items.
- Do NOT forget memories to "clean up" — let confidence decay and contradiction detection handle it.
- Do NOT run `build_graphs` after every reindex — A-MEM creates per-doc links automatically.

---

## OpenClaw Integration

### Option 1: ClawMem Exclusive (Recommended)

ClawMem handles 100% of memory. No redundancy.

```bash
# Disable OpenClaw's native memory
openclaw config set agents.defaults.memorySearch.extraPaths "[]"
```

**Distribution:** Hooks 90%, MCP tools 10%.

### Option 2: Hybrid

Run both ClawMem and OpenClaw native memory.

```bash
openclaw config set agents.defaults.memorySearch.extraPaths '["~/documents", "~/notes"]'
```

**Tradeoffs:** Redundant recall, at the cost of 10-15% context-window waste from duplicate facts.

---

## Troubleshooting

```
Symptom: "Local model download blocked" error
-> llama-server endpoint unreachable while CLAWMEM_NO_LOCAL_MODELS=true.
-> Fix: Start llama-server, or set CLAWMEM_NO_LOCAL_MODELS=false for in-process fallback.

Symptom: Query expansion always fails / returns garbage
-> On CPU-only systems, in-process inference is significantly slower and less reliable.
   Systems with GPU acceleration (Metal/Vulkan) handle these models well in-process.
-> Fix: Run llama-server on a GPU.

Symptom: Vector search returns no results but BM25 works
-> Missing embeddings. The watcher indexes but does NOT embed.
-> Fix: Run `clawmem embed` or wait for the daily embed timer.

Symptom: context-surfacing hook returns empty
-> Prompt too short (<20 chars), starts with `/`, or no docs above threshold.
-> Fix: Check `clawmem status` for doc counts and `clawmem embed` for embedding coverage.

Symptom: intent_search returns weak results for WHY/ENTITY
-> Graph may be sparse (few A-MEM edges).
-> Fix: Run `build_graphs` to add temporal backbone + semantic edges.

Symptom: Watcher fires but collections show 0 docs
-> Bun.Glob does not support brace expansion {a,b,c}.
-> Fixed: indexer.ts splits brace patterns into individual Glob scans.

Symptom: Watcher fires but the wrong collection processes events
-> Collection prefix matching returns the first match, so parent paths match before children.
-> Fixed: cmdWatch() sorts collections by path length descending (most specific first).

Symptom: reindex --force crashes with a UNIQUE constraint error
-> Force deactivates rows, but UNIQUE(collection, path) doesn't discriminate by the active flag.
-> Fixed: indexer.ts reactivates inactive rows instead of inserting.

Symptom: CLI reindex/update falls back to node-llama-cpp
-> GPU env vars were only set in the systemd drop-in, not in the wrapper script.
-> Fixed: the bin/clawmem wrapper exports CLAWMEM_EMBED_URL/LLM_URL/RERANK_URL defaults.
```
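
The most-specific-first fix from the watcher entry above can be sketched as follows (hypothetical types; the real logic lives in `cmdWatch()`):

```typescript
interface Collection { name: string; path: string }

// Sort by path length descending so "/vault/projects/api" wins over "/vault",
// then take the first collection whose path prefixes the file path.
function matchCollection(filePath: string, collections: Collection[]): Collection | undefined {
  return [...collections]
    .sort((a, b) => b.path.length - a.path.length)
    .find(c => filePath.startsWith(c.path));
}
```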

---

## CLI Reference

Run `clawmem --help` for the full command listing.

### IO6 Surface Commands (daemon/`--print` mode)

```bash
# IO6a: per-prompt context injection (pipe prompt on stdin)
echo "user query" | clawmem surface --context --stdin

# IO6b: per-session bootstrap injection (pipe session ID on stdin)
echo "session-id" | clawmem surface --bootstrap --stdin
```

### Analysis Commands

```bash
clawmem reflect [N]             # Cross-session reflection (last N days, default 14):
                                # recurring themes, antipatterns, co-activation clusters
clawmem consolidate [--dry-run] # Find and archive duplicate low-confidence documents
                                # (Jaccard similarity within the same collection)
```
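
The Jaccard-similarity dedup that `consolidate` performs can be sketched as a token-set comparison (a simplified version with an assumed threshold, not ClawMem's exact implementation):

```typescript
// Jaccard similarity: |A ∩ B| / |A ∪ B| over whitespace-tokenized sets.
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  const inter = [...ta].filter(t => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}

// Two docs become consolidation candidates above a similarity threshold.
const isDuplicate = (a: string, b: string, threshold = 0.8) => jaccard(a, b) >= threshold;
```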

---

## Operational Issue Tracking

When encountering tool failures, instruction contradictions, retrieval gaps, or workflow friction, write to `docs/issues/YYYY-MM-DD-<slug>.md`:

```
# <title>
- Category: tool-failure | instruction-gap | workflow-friction | retrieval-gap | inconsistency
- Severity: critical | high | medium
- Status: open | resolved

## Observed
## Expected
## Context
## Suggested Fix
```

**Triggers:** a repeated tool error, an instruction contradicting observed behavior, retrieval consistently missing known content.

**Do NOT log:** one-off transient errors, user-caused issues, already-recorded issues.
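
A minimal sketch of deriving the issue filename from a title (the `docs/issues/YYYY-MM-DD-<slug>.md` path comes from the convention above; the slug rule itself is an assumption):

```typescript
// Build "docs/issues/YYYY-MM-DD-<slug>.md" from a title and date.
function issuePath(title: string, date: Date): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")   // non-alphanumeric runs become single hyphens
    .replace(/^-+|-+$/g, "");      // trim leading/trailing hyphens
  const ymd = date.toISOString().slice(0, 10);
  return `docs/issues/${ymd}-${slug}.md`;
}
```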

---

## Integration Notes

- QMD retrieval (BM25, vector, RRF, rerank, query expansion) is forked into ClawMem. Do not call standalone QMD tools.
- SAME (composite scoring), MAGMA (intent + graph), and A-MEM (self-evolving notes) layer on top of the QMD substrate.
- Three `llama-server` instances run on a local or remote GPU. The wrapper defaults to `localhost:8088/8089/8090`.
- `CLAWMEM_NO_LOCAL_MODELS=false` (default) allows in-process fallback. Set it to `true` for remote-only operation that fails fast.
- The consolidation worker (`CLAWMEM_ENABLE_CONSOLIDATION=true`) backfills unenriched docs. It runs every 5 minutes, so it only makes progress if the MCP process stays alive that long.
- Beads integration: `syncBeadsIssues()` queries the `bd` CLI (Dolt backend, v0.58.0+), creates markdown docs, and maps dependency edges into `memory_relations`. The watcher auto-triggers on `.beads/` changes; use the `beads_sync` MCP tool for manual sync.
- HTTP REST API: `clawmem serve [--port 7438]` — an optional REST server on localhost covering search, retrieval, lifecycle, and graph traversal. `POST /retrieve` mirrors `memory_retrieve` with auto-routing (keyword/semantic/causal/timeline/hybrid); `POST /search` provides direct mode selection. Bearer-token auth via the `CLAWMEM_API_TOKEN` env var (disabled if unset).
- OpenClaw ContextEngine plugin: `clawmem setup openclaw` — registers ClawMem as a native OpenClaw context engine. Dual-mode: shares the vault with Claude Code hooks. Uses `before_prompt_build` for retrieval, `afterTurn()` for extraction, and `compact()` for pre-compaction.

## Tool Selection (one-liner)

```
ClawMem escalation: memory_retrieve(query) | query(compact=true) | intent_search(why/when/entity) | query_plan(multi-topic) -> multi_get -> search/vsearch (spot checks)
```

## Curator Agent

A maintenance agent for the Tier 3 operations the main agent typically neglects. Install with `clawmem setup curator`.

**Invoke:** "curate memory", "run curator", or "memory maintenance"

**6 phases:**
1. Health snapshot — status, index_stats, lifecycle_status, doctor
2. Lifecycle triage — pin high-value unpinned memories, snooze stale content, propose forget candidates (never auto-confirms)
3. Retrieval health check — 5 probes (BM25, vector, hybrid, intent/graph, lifecycle)
4. Maintenance — reflect (cross-session patterns), consolidate --dry-run (dedup candidates)
5. Graph rebuild — conditional on probe results and embedding state
6. Collection hygiene — orphan detection, content-type distribution

**Safety rails:** Never auto-confirms forget. Never runs embed (that's the timer's job). Never modifies config.yaml. All destructive proposals require user approval.