@psiclawops/hypermem 0.8.0 → 0.8.2

package/INSTALL.md ADDED
# hypermem — Installation Guide

## Prerequisites

- **Node.js 22+** (uses built-in `node:sqlite`)
- **OpenClaw** must already be installed, onboarded, and running. The plugin install assumes a working OpenClaw home with a valid `openclaw.json` and a gateway that can restart.
- **Disk space:** allow at least 2 GB free. Plugin builds pull OpenClaw as a dev dependency.

**Verify before starting:**

```bash
openclaw gateway status   # should show "running" or "ready"
openclaw config get gateway   # should return gateway config, not an error
```

If `gateway status` shows "disabled" or "not configured", complete OpenClaw onboarding first. `openclaw gateway restart` only works when the gateway service is already set up. On a brand-new OpenClaw install that has never been started, you need `openclaw gateway start` (or the full onboarding flow) before installing plugins.
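
The decision above can be sketched as a tiny pre-flight helper. This is a hedged illustration, not part of hypermem: `next_step` is a hypothetical name, and the status strings are assumptions about what `openclaw gateway status` prints.

```shell
# Hypothetical pre-flight helper mapping gateway status to the next action.
# The status strings are assumptions about `openclaw gateway status` output.
next_step() {
  case "$1" in
    running|ready)              echo "install plugins" ;;
    disabled|"not configured")  echo "complete onboarding" ;;
    *)                          echo "run: openclaw gateway start" ;;
  esac
}
next_step running
next_step disabled
```

In practice you would feed it `"$(openclaw gateway status)"` and gate the rest of the install on the result.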

## Quick Start

> **Disk space:** plugin installs pull OpenClaw as a dev dependency. Allow at least 2 GB free before starting.
>
> **Prerequisites:** OpenClaw must be installed and onboarded before this step. Run `openclaw gateway status` to confirm. If the gateway is not configured, complete OpenClaw setup first.
>
> **Production runtime path:** install the built runtime payload into `~/.openclaw/plugins/hypermem`. Do not point production at `/tmp` or at your development repo clone.
>
> **Config merge warning:** if you already have values in `plugins.load.paths` or `plugins.allow`, merge them instead of overwriting them blindly.

```bash
git clone https://github.com/PsiClawOps/hypermem.git
cd hypermem
npm install && npm run build
npm --prefix plugin install && npm --prefix plugin run build   # ~1 min on a clean machine
npm --prefix memory-plugin install && npm --prefix memory-plugin run build
npm run install:runtime
mkdir -p ~/.openclaw/hypermem
cat > ~/.openclaw/hypermem/config.json <<'JSON'
{
  "embedding": {
    "provider": "none"
  }
}
JSON
```

`install:runtime` stages the built plugin files into `~/.openclaw/plugins/hypermem`. It does **not** modify your OpenClaw config. The commands below wire the plugins manually.
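
Before wiring and restarting, it can be worth confirming the file you just wrote is strict JSON. A sketch — the temp-file indirection is only there so the snippet is self-contained; in practice point `node` at `~/.openclaw/hypermem/config.json`:

```shell
# Validate a config file as strict JSON before the gateway loads it.
# Writes a temp file so this sketch is self-contained; substitute the real
# ~/.openclaw/hypermem/config.json path in practice.
cfg=$(mktemp)
printf '{"embedding":{"provider":"none"}}' > "$cfg"
node -e '
  const fs = require("fs");
  JSON.parse(fs.readFileSync(process.argv[1], "utf8")); // throws on invalid JSON
  console.log("config ok");
' "$cfg"
```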

Wire both plugins into OpenClaw:

```bash
openclaw config set plugins.load.paths "[\"$HOME/.openclaw/plugins/hypermem/plugin\",\"$HOME/.openclaw/plugins/hypermem/memory-plugin\"]" --strict-json
openclaw config set plugins.slots.contextEngine hypercompositor
openclaw config set plugins.slots.memory hypermem
openclaw config set plugins.allow '["hypercompositor","hypermem"]' --strict-json
openclaw gateway restart
```

The repo clone is for build and release work. OpenClaw should load the installed runtime payload from `~/.openclaw/plugins/hypermem/`.

### Verification checkpoints

1. **Build verified**
   - root build succeeds
   - `plugin` build succeeds
   - `memory-plugin` build succeeds

2. **Wiring verified**
   - OpenClaw accepts `plugins.load.paths`
   - slots are set to `hypercompositor` and `hypermem`
   - gateway restart succeeds

3. **Runtime verified active**

Send a message to any agent, then verify:

```bash
openclaw logs --limit 100 | grep -E 'hypermem|context-engine'
```

Expected lightweight-mode lines:
- `[hypermem] hypermem initialized`
- `[hypermem] Embedding provider: none — semantic search disabled, using FTS5 fallback`
- `[hypermem:compose]`

If you see a fallback like `falling back to default engine "legacy"`, the install is **not** fully active yet even if the build and wiring steps succeeded.
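
That check can be scripted. `check_active` is a hypothetical helper, shown here against a captured log line so the sketch is self-contained — in practice pipe `openclaw logs --limit 100` into it:

```shell
# Hypothetical helper: report whether the legacy-engine fallback appears
# in log text read from stdin.
check_active() {
  if grep -q 'falling back to default engine "legacy"'; then
    echo "inactive"
  else
    echo "active"
  fi
}
# Fed with a captured line here, not live gateway output:
printf '%s\n' '[gateway] falling back to default engine "legacy"' | check_active
```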

---

## What hypermem Does

hypermem replaces OpenClaw's default context assembly with a four-layer SQLite-backed memory system. Every turn, it queries all layers in parallel and composes context within a fixed token budget. No transcript accumulates. No lossy summarization. Content that doesn't fit this turn stays in storage instead of being destroyed.

| Layer | Storage | What it holds | Speed |
|---|---|---|---|
| **L1** | SQLite in-memory | Session cache: identity, recent history, active state | 0.08ms |
| **L2** | Per-agent SQLite | Conversation history, survives restarts, rotates at 100MB | 0.13ms |
| **L3** | Per-agent SQLite + sqlite-vec | Semantic search via embeddings | 0.29ms |
| **L4** | Shared SQLite | Structured knowledge: facts, episodes, preferences, fleet registry | 0.09ms |

Everything runs in-process on SQLite databases. No external database services required.

---

## Requirements

| Dependency | Required | Notes |
|---|---|---|
| Node.js 22+ | **Yes** | Uses built-in `node:sqlite`. No standalone SQLite install needed. |
| OpenClaw | **Yes** | Any version with context engine plugin support |
| Ollama | Local embeddings only | [ollama.ai](https://ollama.ai) — pull `nomic-embed-text` |
| OpenRouter API key | Hosted embeddings only | Alternative to local: [openrouter.ai](https://openrouter.ai) |
| Gemini API key | Gemini embeddings only | Alternative: [aistudio.google.com](https://aistudio.google.com/apikey) |

`sqlite-vec` is the only native dependency and installs automatically via npm.

> **Package versions:** the root package (`hypermem`) and the two plugins (`hypercompositor`, `hypermem-memory`) are versioned independently. Plugin versions trail the core by one minor version when no plugin-facing API changes ship in a release — this is expected.

The **embedding layer** (L3 semantic search) requires a configured provider. Without one, hypermem falls back to FTS5 keyword matching. This is functional but degrades recall quality. See [Setup Styles](#setup-styles) below.

---

## Setup Styles

Pick a style based on your hardware and cost tolerance. All styles support full history, fact recall, and session continuity — the differences are in semantic search quality and local resource requirements.

| Style | Embedding | Reranker | Semantic recall | Cost | Hardware |
|---|---|---|---|---|---|
| **Lightweight** | None (FTS5 only) | None | Keyword match only | Free | Any |
| **Local** | Ollama nomic-embed-text | None (RRF) or Ollama Qwen3-Reranker (GPU only) | Good | Free | ~1GB RAM + GPU for reranker |
| **High** | OpenRouter Qwen3-8B | OpenRouter Cohere Rerank 4 | Best (MTEB #1) | ~pennies/day | API key |

The reranker is **optional at every tier**. Without one, results are ordered by RRF fusion score (FTS5 + vector) — a solid default. The reranker improves precision but requires a GPU for the local option; CPU-only systems should leave it as None.

---

## Embedding Providers

Pick a tier based on your hardware:

| Tier | Provider | Quality | Cost | Setup |
|---|---|---|---|---|
| **Minimal** | None (FTS5 keyword only) | Keyword-only, no semantic recall | Free | None |
| **Local** | Ollama + nomic-embed-text (768d) | Good | Free | Ollama required |
| **Hosted** | OpenRouter + Qwen3 Embedding 8B (4096d) | Best (MTEB #1) | ~pennies/day | API key |
| **Gemini** | Google Gemini Embedding (768d) | Good | Free tier available | API key |

### Minimal (no embedder)

No Ollama, no API key. This config must exist **before gateway restart and runtime verification** so the clean install validates the intended lightweight behavior. Set `provider: 'none'` explicitly in `~/.openclaw/hypermem/config.json` to disable embedding entirely:

```json
{
  "embedding": {
    "provider": "none"
  }
}
```

Without a config file, the default provider is `ollama` — if Ollama isn't running, the vector store initialization fails non-fatally and hypermem falls back to FTS5. Using `provider: 'none'` makes the intent explicit and avoids the init attempt.

You'll see in the logs:

```
[hypermem] Embedding provider: none — semantic search disabled, using FTS5 fallback
```

Upgrade to a higher tier later without losing stored data.

### Troubleshooting clean installs

**Symptom:** `Context engine "hypercompositor" ... falling back to default engine "legacy"`
- The plugin was found, but the context engine did not activate correctly.
- Treat the install as failed at runtime, not successful.
- Check for release artifact mismatch, stale plugin build output, or config collisions with existing plugin paths.

**Symptom:** HyperMem logs never appear after restart
- Re-check `plugins.load.paths` for exact absolute paths.
- Confirm the clone directory still exists and was not created in a temp location.
- Confirm existing `plugins.allow` and `plugins.load.paths` values were merged correctly instead of overwritten.

**Symptom:** build succeeds, but behavior is not lightweight mode
- Confirm `~/.openclaw/hypermem/config.json` existed before restart.
- Confirm it contains:

```json
{
  "embedding": {
    "provider": "none"
  }
}
```

### Local — Ollama + nomic-embed-text

```bash
ollama pull nomic-embed-text
```

No config file needed. Ollama on `localhost:11434` with `nomic-embed-text` is the default. Requires ~1GB RAM for the model.

### Hosted — OpenRouter + Qwen3 Embedding 8B (Recommended)

Best quality, no local compute. Embedding is async and doesn't affect agent response time.

Create or edit `~/.openclaw/hypermem/config.json`:

```json
{
  "embedding": {
    "provider": "openai",
    "openaiApiKey": "sk-or-YOUR_OPENROUTER_KEY",
    "openaiBaseUrl": "https://openrouter.ai/api/v1",
    "model": "qwen/qwen3-embedding-8b",
    "dimensions": 4096,
    "batchSize": 128
  }
}
```

Get a key at [openrouter.ai](https://openrouter.ai). Cost at typical agent volumes: under a cent per day.

### Gemini

Create or edit `~/.openclaw/hypermem/config.json`:

```json
{
  "embedding": {
    "provider": "gemini",
    "geminiApiKey": "YOUR_GEMINI_API_KEY",
    "model": "text-embedding-004",
    "dimensions": 768,
    "batchSize": 128
  }
}
```

Get a key at [aistudio.google.com](https://aistudio.google.com/apikey). The free tier covers typical agent usage.

Optional Gemini-specific settings:
- `geminiBaseUrl`: defaults to `https://generativelanguage.googleapis.com`
- `geminiIndexTaskType`: defaults to `RETRIEVAL_DOCUMENT`
- `geminiQueryTaskType`: defaults to `RETRIEVAL_QUERY`

### Switching providers

Changing providers after vectors are built requires a full re-index (dimensions are incompatible):

```bash
node scripts/embed-existing.mjs
```

Fresh installs don't need this.

---

## Reranker (Optional)

The reranker re-orders semantic search candidates by relevance before injection. Without it, results are ordered by RRF fusion score (FTS5 + KNN). The reranker is optional — the system degrades gracefully to original order on any failure.

| Provider | Model | Cost | Hardware | Notes |
|---|---|---|---|---|
| **None** | — | Free | Any | Default — RRF fusion ordering |
| **Ollama (local)** | Qwen3-Reranker-0.6B | Free | GPU recommended | CPU-only: too slow for >5 candidates |
| **OpenRouter** | cohere/rerank-4-pro | ~pennies/day | Any | Best quality, uses existing key |
| **ZeroEntropy** | zerank-2 | ~pennies/day | Any | Dedicated reranking service |

**CPU-only systems:** skip the local reranker. Sequential inference makes it 2–10 seconds per document on CPU — unusable at any reasonable candidate depth. RRF fusion (`provider: "none"`) is the right default for CPU-only setups and is meaningfully better than raw vector ordering alone.

### No reranker (default)

No config needed. RRF fusion of FTS5 + vector results is the default ordering. For most conversational memory workloads, this is sufficient and runs on any hardware.
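
For intuition: reciprocal rank fusion scores each document by summing `1/(k + rank)` across the input rankings, so documents ranked well in both lists rise to the top. A minimal sketch — the constant `k = 60` is the conventional RRF damping value, and the document IDs and rankings are invented, not hypermem's actual code:

```shell
# Reciprocal rank fusion over two made-up rankings (illustrative only).
node -e '
  const k = 60;                  // conventional RRF damping constant
  const fts = ["a", "b", "c"];   // FTS5 keyword ranking
  const vec = ["c", "a", "d"];   // vector KNN ranking
  const score = {};
  for (const list of [fts, vec])
    list.forEach((id, i) => { score[id] = (score[id] || 0) + 1 / (k + i + 1); });
  console.log(Object.keys(score).sort((x, y) => score[y] - score[x]).join(","));
'
```

Here `a` wins because it places highly in both lists, even though neither list ranks it first.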

### Local — Ollama Qwen3-Reranker-0.6B

Best option for air-gapped or GPU-equipped setups. Slower than hosted due to sequential inference (one model call per candidate document) — requires a GPU for practical use.

```bash
ollama pull dengcao/Qwen3-Reranker-0.6B:Q5_K_M
```

Add to `~/.openclaw/hypermem/config.json`:

```json
{
  "reranker": {
    "provider": "local",
    "ollamaUrl": "http://localhost:11434",
    "ollamaModel": "dengcao/Qwen3-Reranker-0.6B:Q5_K_M",
    "topK": 10,
    "minCandidates": 5
  }
}
```

### Hosted — OpenRouter (Cohere Rerank 4)

Fastest, highest quality. Uses the same OpenRouter key as hosted embeddings if you already have one.

Put the key in your environment, not the config file:

```bash
export OPENROUTER_API_KEY="sk-or-YOUR_OPENROUTER_KEY"
```

Then in `~/.openclaw/hypermem/config.json`:

```json
{
  "reranker": {
    "provider": "openrouter",
    "openrouterModel": "cohere/rerank-4-pro",
    "topK": 10,
    "minCandidates": 5
  }
}
```

`openrouterApiKey` in the config file is still honored as a fallback for compatibility, but keeping the key in the environment keeps credentials out of any config file that ends up under version control.

### Hosted — ZeroEntropy (zerank-2)

Alternative hosted option, specialized reranking service.

```bash
export ZEROENTROPY_API_KEY="YOUR_ZEROENTROPY_KEY"
```

Then:

```json
{
  "reranker": {
    "provider": "zeroentropy",
    "zeroEntropyModel": "zerank-2",
    "topK": 10,
    "minCandidates": 5
  }
}
```

`zeroEntropyApiKey` in the config file is still honored as a fallback. Get a key at [zeroentropy.dev](https://zeroentropy.dev).

---

## Installation Steps

### Step 1 — Clone and build

```bash
git clone https://github.com/PsiClawOps/hypermem.git
cd hypermem
npm install
npm run build
```

Build both plugins, then install the runtime payload into OpenClaw's durable plugin directory:

```bash
npm --prefix plugin install && npm --prefix plugin run build
npm --prefix memory-plugin install && npm --prefix memory-plugin run build
npm run install:runtime
```

Verify:

```bash
npm test
```

The full suite takes 30–60 seconds. When complete, output ends with `ALL N TESTS PASSED ✅`. If you see `ENOSPC`, free up disk space and retry.

### Step 2 — Wire the plugins

Use the OpenClaw CLI. **Do not edit `openclaw.json` directly.**

```bash
# Add plugin load paths
openclaw config set plugins.load.paths "[\"$HOME/.openclaw/plugins/hypermem/plugin\",\"$HOME/.openclaw/plugins/hypermem/memory-plugin\"]" --strict-json

# Set the context engine slot
openclaw config set plugins.slots.contextEngine hypercompositor

# Set the memory slot
openclaw config set plugins.slots.memory hypermem

# Allow both plugins
openclaw config set plugins.allow '["hypercompositor","hypermem"]' --strict-json
```

If you already have entries in `plugins.allow` or `plugins.load.paths`, merge rather than replace. Check current values:

```bash
openclaw config get plugins.allow
openclaw config get plugins.load.paths
```
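
One way to do that merge safely — a sketch, assuming `node` is on your PATH; `some-existing-plugin` is a placeholder for whatever `config get` actually returned:

```shell
# Merge existing plugins.allow entries with the two new ones, de-duplicated.
# The existing value is a placeholder here; capture it from
# `openclaw config get plugins.allow` in practice.
existing='["some-existing-plugin"]'
merged=$(node -e '
  const current = JSON.parse(process.argv[1]);
  const added = ["hypercompositor", "hypermem"];
  // Set preserves first-seen order and drops duplicates
  console.log(JSON.stringify([...new Set([...current, ...added])]));
' "$existing")
echo "$merged"
# openclaw config set plugins.allow "$merged" --strict-json
```

The same pattern works for `plugins.load.paths`.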

### Step 3 — Choose embedding provider

See [Embedding Providers](#embedding-providers) above.

- **Lightweight (no embedder):** create `~/.openclaw/hypermem/config.json` with `{"embedding":{"provider":"none"}}`. The Quick Start block above already does this. Without this file, the default provider is `ollama` and you'll see a non-fatal init warning if Ollama isn't running.
- **Local:** `ollama pull nomic-embed-text`. No config file needed (Ollama is the default).
- **Hosted/Gemini:** create `~/.openclaw/hypermem/config.json` with the provider config block from the relevant section above.

### Step 4 — Restart and verify

```bash
openclaw gateway restart
```

> **If restart reports the gateway is disabled or not configured:** you need to complete OpenClaw onboarding before this step. See [Prerequisites](#prerequisites). `gateway restart` only works on an already-running gateway.

Send a message to any agent, then check:

```bash
openclaw logs --limit 50 | grep hypermem
```

> **If `openclaw logs` fails with an auth or token error:** the gateway API requires authentication. Run `openclaw gateway status` to confirm the gateway is running and accessible. If the gateway is running but logs fail, check `openclaw config get gateway.token` and ensure your shell session has the correct auth context.

Expected:

```
[hypermem] hypermem initialized — dataDir=/home/.../.openclaw/hypermem
[hypermem:compose] agent=main triggers=0 fallback=true facts=3 semantic=2 ...
```

Full health check (run from the repo clone directory):

```bash
node bin/hypermem-status.mjs            # full dashboard
node bin/hypermem-status.mjs --health   # health checks only (exit 1 on failure)
```

> **Note:** The health check requires the data directory to exist. It is created on first gateway restart with the plugin active. Run the `openclaw logs` check first to confirm initialization, then run the health check.

If the plugin didn't load:

```bash
openclaw config get plugins.slots.contextEngine   # should be: hypercompositor
openclaw config get plugins.slots.memory          # should be: hypermem
openclaw status                                   # look for hypermem in plugins
```

### Step 5 — Configure your fleet

hypermem works out of the box for both single-agent and multi-agent installs. The source ships with generic placeholder agent names (`agent1`, `agent2`, `director1`, etc.) in two files that define fleet topology:

| File | What it defines |
|---|---|
| `src/cross-agent.ts` | Org membership, agent tiers, visibility scoping |
| `src/background-indexer.ts` | Agent-to-domain mapping for fact classification |

#### Single-agent installs

No code changes needed. hypermem resolves your agent ID from your OpenClaw config at runtime. The placeholder names are never used.

Verify it's working after Step 4:

```bash
openclaw logs --limit 20 | grep hypermem
```

You should see your agent ID (not a placeholder) in the compose logs:

```
[hypermem:compose] agent=my-agent triggers=0 fallback=true facts=3 semantic=2 ...
```

Facts, episodes, and topics are all scoped to your agent ID automatically. Cross-agent features (org visibility, shared facts) are dormant with a single agent and activate only when additional agents are configured.

#### Multi-agent installs

Replace the placeholder names in the two fleet topology files listed above with your fleet:

**1. Edit `src/cross-agent.ts`** — replace the `agents` map and `orgs` map in `defaultOrgRegistry()` with your fleet:

```typescript
// Before (placeholder):
agent1: { agentId: 'agent1', tier: 'council' },

// After (your fleet):
architect: { agentId: 'architect', tier: 'council' },
```

**2. Edit `src/background-indexer.ts`** — update `AGENT_DOMAIN_MAP` with your agent IDs and their domains:

```typescript
// Before (placeholder):
agent1: 'infrastructure',

// After (your fleet):
architect: 'infrastructure',
```

**3. Rebuild, reinstall the runtime payload, and restart:**

```bash
npm run build
npm --prefix plugin run build
npm run install:runtime
openclaw gateway restart
```

Agents not listed in `AGENT_DOMAIN_MAP` default to domain `'general'`, which is fine for most setups. The org registry only matters if you use cross-agent memory visibility (org-scoped or council-scoped facts). If all your facts are agent-private or fleet-wide, you can skip the org structure entirely.

**Test fixtures use the placeholder names by design.** Don't rename them in `test/` — the tests validate the cross-agent logic, not your specific fleet topology.

---

## Upgrading from 0.5.x

```bash
cd <path-to-hypermem>
git pull
npm install
npm run build
npm --prefix plugin install && npm --prefix plugin run build
npm --prefix memory-plugin install && npm --prefix memory-plugin run build
npm run install:runtime
openclaw gateway restart
```

What changed on the path from 0.5.x to current:
- **0.6.0**: SQLite `:memory:` became the only hot layer. Redis was fully removed and the runtime no longer depends on any external cache service.
- **0.7.0**: Temporal validity, expertise storage, contradiction detection, and maintenance APIs landed.
- **0.8.0**: Phase C correctness guards, tool-artifact store, schema v10/v19, BLAKE3 dedup, RRF fusion, and fleet registry seeding shipped.
- **0.8.1**: Documentation fixes — install instructions rewritten for clean first-run, `$HOME` replaces `~` in shell-interpolated paths, Lightweight mode config clarified.
- **Upgrade impact**: current releases use `messages.db` schema v10 and `library.db` schema v19. If you are upgrading from older 0.5.x installs, expect both schema and runtime-behavior changes.

What changed in 0.5.x releases:
- **0.5.5**: Plugin config schema, tuning knobs moved into `openclaw.json`. Manual `config.json` edits for compositor settings may be superseded by the plugin schema.
- **0.5.6**: Content fingerprint dedup, indexer circuit breaker, SQL parameterization hardening.
- **Exports field** added to `package.json`: if you import from `@psiclawops/hypermem`, verify your import paths still resolve.

If you switch embedding providers during the upgrade, re-index:

```bash
node scripts/embed-existing.mjs
```

Check [CHANGELOG.md](CHANGELOG.md) for the full list of changes per version.

**Build errors after upgrade:** Clean `dist/` directories and rebuild:

```bash
rm -rf dist plugin/dist memory-plugin/dist
npm run build
npm --prefix plugin run build
npm --prefix memory-plugin run build
```

---

## OpenClaw Settings (Optional Tuning)

These are optional. hypermem works with OpenClaw defaults, but these changes reduce unnecessary overhead.

### Lower OpenClaw's compaction threshold

hypermem owns compaction. OpenClaw's default fires at 24K reserved tokens, which races hypermem's budget management:

```bash
openclaw config set agents.defaults.compaction.reserveTokens 1000 --strict-json
```

This makes OpenClaw's compaction a last-resort safety net that never fires in normal operation.

### Tighter session store retention

With hypermem active, SQLite is the durable record. JSONL transcripts provide no memory benefit:

```bash
openclaw config set sessions.maintenance.pruneAfter "14d"
openclaw config set sessions.maintenance.maxEntries 200 --strict-json
```

OpenClaw defaults: `pruneAfter: 30d`, `maxEntries: 500`. If you browse conversation history older than 14 days via the session list, keep the higher value.

### Session max-age (fleet installs only)

Prevents idle sessions from accumulating indefinitely:

```bash
openclaw config set sessions.maxAgeHours 168 --strict-json   # 7 days
```

Solo installs can skip this.

---

## Token Budget Tuning

These settings live in `~/.openclaw/hypermem/config.json` under the `compositor` key. All fields are optional — omit any knob to get the code-level default. Gateway restart required after changes.

The recommended starting config for a standard single-agent deployment is intentionally lean on turn-1 warming. Semantic recall and fact triggers fire against each incoming message, so topic-relevant context surfaces as the conversation takes shape. This produces a steadier pressure profile than aggressive pre-loading and avoids the warm→trim→compact cycling you see when every session starts near the top of the budget.

```json
{
  "compositor": {
    "budgetFraction": 0.55,
    "contextWindowReserve": 0.25,
    "targetBudgetFraction": 0.50,
    "warmHistoryBudgetFraction": 0.27,
    "maxFacts": 25,
    "maxHistoryMessages": 500,
    "maxCrossSessionContext": 4000,
    "maxRecentToolPairs": 3,
    "maxProseToolPairs": 10,
    "keystoneHistoryFraction": 0.15,
    "keystoneMaxMessages": 12,
    "wikiTokenCap": 500
  }
}
```

| Knob | Recommended | What it controls | Notes |
|---|---|---|---|
| `budgetFraction` | 0.55 | Fraction of the detected context window used as input budget | Raise to 0.65 for agents that use tools aggressively. Autodetect only handles known model families — see *Context window overrides* below for custom/local/finetuned models |
| `contextWindowReserve` | 0.25 | Reserve left for output and tool results | Below 0.20 on large-context models invites late-turn overflow |
| `targetBudgetFraction` | 0.50 | Split between context assembly and history | Higher = richer facts/wiki; lower = more conversation headroom |
| `warmHistoryBudgetFraction` | 0.27 | History's share of first-turn warming | The key lever against tight trim cycles; don't push below 0.20 |
| `maxFacts` | 25 | Structured facts injected per turn | Recall surfaces more as topics emerge; 35 is fine for long-memory seats |
| `maxHistoryMessages` | 500 | Candidate pool for history ranking | Pool size, not load size. 300 is fine for short-session agents |
| `maxCrossSessionContext` | 4000 | Cross-session context tokens | Solo agents with one session: set to 0 |
| `maxRecentToolPairs` | 3 | Verbatim tool pairs kept | Raise to 5 for code agents with heavy tool output |
| `maxProseToolPairs` | 10 | Compressed tool pairs before stubbing | |
| `keystoneHistoryFraction` | 0.15 | Older significant turns reserved within history slot | |
| `keystoneMaxMessages` | 12 | Max keystone candidates per turn | Raise to 18 if the agent loses track of older decisions |
| `wikiTokenCap` | 500 | Cap on wiki/knowledge injection | Raise if your agent uses heavy doc content |

**Lean profile** (~35–45% fewer tokens per turn) — for constrained hosts, small models, or cost-sensitive deployments:

```json
{
  "compositor": {
    "budgetFraction": 0.55,
    "contextWindowReserve": 0.30,
    "warmHistoryBudgetFraction": 0.20,
    "maxFacts": 10,
    "maxHistoryMessages": 150,
    "maxCrossSessionContext": 0,
    "maxRecentToolPairs": 2,
    "maxProseToolPairs": 6,
    "keystoneHistoryFraction": 0.10,
    "keystoneMaxMessages": 5,
    "wikiTokenCap": 300,
    "hyperformProfile": "light"
  }
}
```
649
+
650
+ ---
651
+
652
+ ### Context window overrides (custom, local, or finetuned models)
653
+
654
+ HyperMem sizes the token budget from the model string using an internal pattern table covering known families (Claude, GPT, Gemini, GLM, Qwen, DeepSeek). If your model string doesn't match a known pattern, resolution silently falls through to `defaultTokenBudget` (90k), and **every downstream dial in this section becomes wrong**, because they're all fractions of the context window:
655
+
656
+ - `budgetFraction` × *wrong window* → wrong input budget
657
+ - `warmHistoryBudgetFraction` × *wrong budget* → wrong warm load on first turn
658
+ - Trim tiers and compaction thresholds fire against the wrong ceiling
659
+
660
+ The two symptoms that indicate window-detection failure:
661
+
662
+ 1. **Undersized window detected** (you have a 200k model, HyperMem thinks it's 90k): every turn warms near the top of the misdetected budget, trim fires constantly, semantic recall and facts get starved. You see continuous `warm→trim→compact` cycling even on short sessions.
663
+ 2. **Oversized window detected** (you have a 32k local model, HyperMem thinks it's larger): warm loads overshoot the real context window, turns land mid-response with truncated output or provider-side 400s on token overflow.
664
+
665
+ **Check what HyperMem is using.** Enable `verboseLogging: true` in the compositor config and look for the `budget source:` log line on each turn:
666
+
667
+ ```
668
+ [hypermem-plugin] budget source: runtime tokenBudget=163840 model=provider/my-model
669
+ [hypermem-plugin] budget source: contextWindowOverrides[provider/my-model]=131072, reserve=0.25, effective=98304
670
+ [hypermem-plugin] budget source: fallback contextWindowSize=90000, reserve=0.25, effective=67500 model=provider/my-model
671
+ ```
672
+
673
+ If you see `fallback contextWindowSize` for your model, detection failed and you need an override.
674
+
675
+ **Apply an override.** Add a `contextWindowOverrides` block to `~/.openclaw/hypermem/config.json`. The key is `"provider/model"` as it appears in your agent's model string (lowercase, exact match):
676
+
677
+ ```json
678
+ {
679
+ "compositor": {
680
+ "budgetFraction": 0.55,
681
+ "contextWindowReserve": 0.25,
682
+ "warmHistoryBudgetFraction": 0.27,
683
+ "contextWindowOverrides": {
684
+ "ollama/llama-3.3-70b": { "contextTokens": 131072 },
685
+ "copilot-local/custom-sft": { "contextTokens": 32768 },
686
+ "vllm/qwen3-coder-ft": { "contextTokens": 262144 }
687
+ }
688
+ }
689
+ }
690
+ ```
691
+
692
+ Resolution order, highest-to-lowest priority:
693
+
694
+ 1. Runtime `tokenBudget` passed by OpenClaw (always wins if present)
695
+ 2. `contextWindowOverrides["provider/model"]` from this config
696
+ 3. Internal pattern-table match against the model string
697
+ 4. `defaultTokenBudget` fallback (90k) — **you do not want to end up here**
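
The priority chain above can be pictured as a small resolver. This is a sketch only — the override entry, pattern table, and model strings are invented for illustration, not HyperMem's internal table:

```shell
# Sketch of the four-step resolution order; every table entry is invented.
node -e '
  const overrides = { "ollama/llama-3.3-70b": 131072 };    // 2. explicit config override
  const patterns = [[/claude/, 200000], [/gpt/, 128000]];  // 3. pattern table (illustrative)
  function resolveWindow(model, runtimeBudget) {
    if (runtimeBudget) return runtimeBudget;               // 1. runtime tokenBudget wins
    if (overrides[model]) return overrides[model];
    const hit = patterns.find(([re]) => re.test(model));
    return hit ? hit[1] : 90000;                           // 4. defaultTokenBudget fallback
  }
  console.log(resolveWindow("ollama/llama-3.3-70b"));  // override hit
  console.log(resolveWindow("vllm/mystery-model"));    // falls through to the 90k default
'
```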

Gateway restart required after editing overrides. Invalid override entries (malformed keys, impossible ranges, empty values) are dropped on load with a warning; the sanitizer will not let a bad override poison the resolver.

**Interaction with warming and trimming.** Once the correct window is in place:

- First-turn warm load = `detectedWindow × budgetFraction × (1 - contextWindowReserve) × warmHistoryBudgetFraction`
- Trim pressure zones are computed from the same `detectedWindow × budgetFraction × (1 - reserve)` effective budget, so trim fires at the right proportions of the real window, not a wrong one
- Compaction thresholds (85% nuclear, 80% afterTurn trim) are also computed against the effective budget, not the raw window
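
Plugging the recommended fractions into that formula for an assumed 131072-token window, the arithmetic works out as:

```shell
# Worked example of the budget formula above, for a hypothetical 131072-token window.
awk 'BEGIN {
  window = 131072
  budget = window * 0.55 * (1 - 0.25)   # effective input budget
  warm   = budget * 0.27                # first-turn warm history load
  printf "effective=%d warm=%d\n", budget, warm
}'
```

So a misdetected 90k window would shrink both numbers by roughly a third, which is why the override matters before any other tuning.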

TL;DR for operators running custom/local/finetuned models: **set `contextWindowOverrides` before tuning anything else in this section**. Every other knob here assumes the detected window is right.

---

## Data Directory

Created automatically on first run at `~/.openclaw/hypermem/`:

```
~/.openclaw/hypermem/
├── config.json       ← embedding and compositor tuning (user-created)
├── library.db        ← L4: facts, episodes, knowledge, fleet registry (shared)
└── agents/
    └── {agentId}/
        ├── messages.db   ← L2: conversation history (per-agent)
        └── vectors.db    ← L3: semantic search index (per-agent)
```

**Backup:** `library.db` and per-agent `messages.db` files are persistent memory. Back them up before major upgrades.

**Rotation:** `messages.db` rotates at 100MB or 90 days. Archives (`messages_2026Q1.db` etc.) remain searchable.
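
A hedged backup sketch — plain file copies are safe only on a stopped gateway (the SQLite CLI's `.backup` command is the safer route for live databases); the throwaway directories below stand in for `~/.openclaw/hypermem` so the example is self-contained:

```shell
# Self-contained sketch: mirror every *.db under a hypermem-style tree.
# $src stands in for ~/.openclaw/hypermem; stop the gateway (or use
# sqlite3's .backup) before copying live databases.
src=$(mktemp -d); dest=$(mktemp -d)
mkdir -p "$src/agents/main"
touch "$src/library.db" "$src/agents/main/messages.db"
# Copy while preserving the relative directory layout
(cd "$src" && find . -name '*.db' | tar cf - -T -) | (cd "$dest" && tar xf -)
(cd "$dest" && find . -name '*.db' | sort)
```

Rotated archives (`messages_*.db`) match the same glob, so they ride along automatically.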

---

## Troubleshooting

**Semantic search not working / no vector results**

Check your embedding tier:
- **Local (Ollama):** Confirm Ollama is running with `ollama list`. If `nomic-embed-text` is missing, `ollama pull nomic-embed-text` and restart the gateway.
- **Hosted (OpenRouter):** Verify `openaiApiKey` and `openaiBaseUrl` in `~/.openclaw/hypermem/config.json`.
- **Gemini:** Verify `geminiApiKey` in config.
- **Minimal:** Semantic search is intentionally disabled. FTS5 keyword fallback is active.

The background indexer runs on a 5-minute interval. After the first cycle, check `openclaw logs | grep embed`.

**`facts=0 semantic=0` every turn**

Expected on fresh installs. Facts and episodes accumulate over real conversations. After a few sessions these numbers grow. Workspace files can be seeded manually via the seeder API.

**Plugin not found**

Confirm the installed runtime artifacts exist:

```bash
ls ~/.openclaw/plugins/hypermem/plugin/dist/index.js
ls ~/.openclaw/plugins/hypermem/memory-plugin/dist/index.js
ls ~/.openclaw/plugins/hypermem/dist/index.js
```

If missing, rebuild and reinstall the runtime payload:

```bash
cd <path-to-hypermem>
npm run build
npm --prefix plugin run build
npm --prefix memory-plugin run build
npm run install:runtime
```

Then restart the gateway.

**Build errors after upgrade**

Clean `dist/` and rebuild:

```bash
rm -rf dist plugin/dist memory-plugin/dist
npm run build
npm --prefix plugin run build
npm --prefix memory-plugin run build
```

**Agent not resuming context after restart**

Check that `~/.openclaw/hypermem/agents/{agentId}/messages.db` exists. If missing, the agent hasn't bootstrapped yet and will create it on first session.

---

## Uninstalling

To return to OpenClaw's default context engine:

```bash
openclaw config set plugins.slots.contextEngine legacy
openclaw config set plugins.slots.memory none
openclaw gateway restart
```

Data in `~/.openclaw/hypermem/` is untouched. Re-enable by switching back.

---

_Questions or issues: file against [the repo](https://github.com/PsiClawOps/hypermem) or ask in `#hypermem`._