@psiclawops/hypermem 0.8.4 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (99)
  1. package/CHANGELOG.md +33 -0
  2. package/INSTALL.md +203 -23
  3. package/README.md +139 -216
  4. package/bench/README.md +42 -0
  5. package/bench/data-access-bench.mjs +380 -0
  6. package/bin/hypermem-bench.mjs +2 -0
  7. package/bin/hypermem-doctor.mjs +412 -0
  8. package/bin/hypermem-model-audit.mjs +339 -0
  9. package/bin/hypermem-status.mjs +491 -70
  10. package/dist/adaptive-lifecycle.d.ts +81 -0
  11. package/dist/adaptive-lifecycle.d.ts.map +1 -0
  12. package/dist/adaptive-lifecycle.js +190 -0
  13. package/dist/background-indexer.js +9 -9
  14. package/dist/budget-policy.d.ts +1 -1
  15. package/dist/budget-policy.d.ts.map +1 -1
  16. package/dist/budget-policy.js +10 -5
  17. package/dist/cache.d.ts +4 -0
  18. package/dist/cache.d.ts.map +1 -1
  19. package/dist/cache.js +2 -0
  20. package/dist/composition-snapshot-integrity.d.ts +36 -0
  21. package/dist/composition-snapshot-integrity.d.ts.map +1 -0
  22. package/dist/composition-snapshot-integrity.js +131 -0
  23. package/dist/composition-snapshot-runtime.d.ts +59 -0
  24. package/dist/composition-snapshot-runtime.d.ts.map +1 -0
  25. package/dist/composition-snapshot-runtime.js +250 -0
  26. package/dist/composition-snapshot-store.d.ts +44 -0
  27. package/dist/composition-snapshot-store.d.ts.map +1 -0
  28. package/dist/composition-snapshot-store.js +117 -0
  29. package/dist/compositor.d.ts +125 -1
  30. package/dist/compositor.d.ts.map +1 -1
  31. package/dist/compositor.js +692 -44
  32. package/dist/cross-agent.d.ts +1 -1
  33. package/dist/cross-agent.js +17 -17
  34. package/dist/doc-chunk-store.d.ts +19 -0
  35. package/dist/doc-chunk-store.d.ts.map +1 -1
  36. package/dist/doc-chunk-store.js +56 -6
  37. package/dist/dreaming-promoter.d.ts +1 -1
  38. package/dist/dreaming-promoter.js +2 -2
  39. package/dist/hybrid-retrieval.d.ts +38 -0
  40. package/dist/hybrid-retrieval.d.ts.map +1 -1
  41. package/dist/hybrid-retrieval.js +86 -1
  42. package/dist/index.d.ts +15 -6
  43. package/dist/index.d.ts.map +1 -1
  44. package/dist/index.js +33 -7
  45. package/dist/knowledge-store.d.ts +4 -1
  46. package/dist/knowledge-store.d.ts.map +1 -1
  47. package/dist/knowledge-store.js +27 -4
  48. package/dist/library-schema.d.ts +12 -8
  49. package/dist/library-schema.d.ts.map +1 -1
  50. package/dist/library-schema.js +22 -8
  51. package/dist/message-store.d.ts.map +1 -1
  52. package/dist/message-store.js +7 -3
  53. package/dist/metrics-dashboard.d.ts +18 -1
  54. package/dist/metrics-dashboard.d.ts.map +1 -1
  55. package/dist/metrics-dashboard.js +52 -14
  56. package/dist/reranker.d.ts +1 -1
  57. package/dist/reranker.js +2 -2
  58. package/dist/schema.d.ts +1 -1
  59. package/dist/schema.d.ts.map +1 -1
  60. package/dist/schema.js +28 -1
  61. package/dist/seed.d.ts +1 -1
  62. package/dist/seed.d.ts.map +1 -1
  63. package/dist/seed.js +3 -1
  64. package/dist/session-flusher.d.ts +2 -2
  65. package/dist/session-flusher.js +2 -2
  66. package/dist/spawn-context.d.ts +1 -1
  67. package/dist/spawn-context.js +1 -1
  68. package/dist/topic-store.js +5 -5
  69. package/dist/topic-synthesizer.d.ts +20 -0
  70. package/dist/topic-synthesizer.d.ts.map +1 -1
  71. package/dist/topic-synthesizer.js +114 -4
  72. package/dist/trigger-registry.d.ts +1 -1
  73. package/dist/trigger-registry.d.ts.map +1 -1
  74. package/dist/trigger-registry.js +14 -6
  75. package/dist/types.d.ts +273 -3
  76. package/dist/types.d.ts.map +1 -1
  77. package/dist/version.d.ts +7 -7
  78. package/dist/version.d.ts.map +1 -1
  79. package/dist/version.js +17 -7
  80. package/docs/DIAGNOSTICS.md +205 -0
  81. package/docs/INTEGRATION_VALIDATION.md +186 -0
  82. package/docs/MIGRATION.md +9 -6
  83. package/docs/MIGRATION_GUIDE.md +125 -101
  84. package/docs/ROADMAP.md +238 -20
  85. package/docs/TUNING.md +30 -6
  86. package/install.sh +159 -408
  87. package/memory-plugin/LICENSE +190 -0
  88. package/memory-plugin/README.md +20 -0
  89. package/memory-plugin/dist/index.js +50 -0
  90. package/memory-plugin/package.json +2 -2
  91. package/package.json +18 -4
  92. package/plugin/LICENSE +190 -0
  93. package/plugin/README.md +20 -0
  94. package/plugin/dist/index.d.ts +55 -0
  95. package/plugin/dist/index.d.ts.map +1 -1
  96. package/plugin/dist/index.js +362 -42
  97. package/plugin/dist/index.js.map +1 -1
  98. package/plugin/package.json +2 -2
  99. package/scripts/install-runtime.mjs +13 -3
package/docs/ROADMAP.md CHANGED
@@ -1,39 +1,257 @@
1
- # HyperMem Roadmap — Post-0.8.0
1
+ # HyperMem Roadmap
2
+
3
+ This is the single planning source of truth for HyperMem.
4
+
5
+ If a future-work item is not tracked here, it is not in the active improvement plan.
6
+ If an older spec disagrees with this file, this file wins.
2
7
 
3
- Items that are designed but not yet implemented, or explicitly deferred for future releases.
4
8
  For shipped capabilities, see [CHANGELOG.md](../CHANGELOG.md) and [ARCHITECTURE.md](../ARCHITECTURE.md).
5
9
 
6
10
  ---
7
11
 
8
- ## Open Items
12
+ ## Current state
9
13
 
10
- | Item | WQ | Status | Notes |
11
- |---|---|---|---|
12
- | Cross-session context boundary markers | WQ-20260402-001 | 🟡 OPEN | `buildCrossSessionContext()` renders flat previews, no per-message boundaries or sender identity. Incident 6. |
13
- | Cursor durability (SQLite dual-write) | — | 🟡 DEFERRED | Cursor TTL = 24h. Dual-write to SQLite required before background indexer reads cursor reliably across restarts. |
14
- | Plugin type unification | — | 🟡 DEFERRED | Plugin uses dynamic imports; can't use TS types from core. Shims are intentional. Structural change needed. |
15
- | Strict topic mode: legacy NULL backfill | — | 🟡 DEFERRED | After ≥2 weeks of topic detection in production, run backfill to assign `topic_id` to legacy NULL messages, then narrow `getRecentMessagesByTopic()` to exclude NULL. Gate: topic detection stable, coverage >80% of new messages. Tracked in `specs/DEFERRED.md`. |
16
- | ACA Step 4 — retrieval stubs replace static files | — | 🔲 PENDING | `systemPromptAddition` carries governance doc chunks instead of embedding full workspace files. Blocked on Step 3 ✅ |
17
- | ACA Step 5 — governance context assembly | — | 🔲 PENDING | Full on-demand assembly replaces static prompt injection. Requires Step 4. |
14
+ ### Released
15
+ - Current release: **0.8.6**
16
+
17
+ ### Landed on `main` after 0.8.6
18
+ In the order work actually landed:
19
+ 1. `47c1962` wire reranker into fused retrieval
20
+ 2. `157bca6` Sprint 1 observability telemetry
21
+ 3. `1b1cf51` Sprint 2 config-surface gaps
22
+ 4. `a62143d` Sprint 3 and Sprint 4 context engineering
23
+ 5. `be0457c` ZeroEntropy reranker endpoint fix
24
+ 6. `2af624f` sqlite-vec runtime installer native packaging fix
25
+ 7. `27046b7` composition snapshot integrity helpers
26
+ 8. `748c418` composition snapshot store and schema
27
+ 9. `1bf4785` repaired warm restore snapshot path
28
+ 10. `99b2e61` CI commit-data review for warm-restore gate closeout
29
+ 11. `7acec79` stable-prefix CI regression test stabilization
30
+ 12. `ef37137` missing doc-chunk source pruning during seeding
31
+ 13. `2c0fd7a` legacy keystone preservation under active context
32
+ 14. `e931524` warm-restore repair gates
33
+ 15. `87a4be9` snapshot slot integrity verification
34
+ 16. `31d07a6` warm-restore auto-rollout parity gates
35
+ 17. `eeedccf` cross-provider warm-restore policy
36
+ 18. `a03dc01` HyperMem governance trigger coverage
37
+ 19. `c94def0` adaptive lifecycle policy kernel
38
+ 20. `0d33286` doctrine-first retrieval over stale memory folklore
39
+ 21. `322a416` adaptive lifecycle diagnostics wiring
40
+
41
+ ### What changed in planning
42
+ We had overlapping planning streams for:
43
+ - near-term compositor fixes
44
+ - 0.9.0 adaptive context lifecycle
45
+ - warm restore
46
+ - long-tail memory-quality improvements
47
+
48
+ That produced split priorities and made it too easy to treat multiple drafts as active at once.
49
+
50
+ This roadmap consolidates those streams into one ordered list.
18
51
 
19
52
  ---
20
53
 
21
- ## Cross-Agent Registry — Live Load
54
+ ## Canonical execution order
55
+
56
+ ## 0. Already landed
57
+ These are complete enough to stop planning around as future work:
58
+ - reranker integration
59
+ - Sprint 1 observability telemetry
60
+ - Sprint 2 config-surface gaps
61
+ - Sprint 3 and Sprint 4 context engineering
62
+ - ZeroEntropy reranker endpoint fix
63
+ - sqlite-vec runtime installer packaging fix
64
+ - warm-restore foundation work: integrity helpers, snapshot store/schema, repaired restore path
65
+
66
+ Warm restore moved ahead of the earlier draft order and is now a partially landed capability, not a hypothetical future-only item.
67
+
68
+ ## 1. Warm-restore gate closeout
69
+ Status: **DONE for the tracked gate-closeout scope.**
70
+
71
+ The gate-closeout stream is no longer the highest-priority unfinished work.
72
+
73
+ Closed in warm-restore gate closeout:
74
+ - repair notice placement and non-suppressibility: repair notices are emitted as system context above restored/history content even when budget is exhausted.
75
+ - repair-depth cap enforcement: repaired snapshots are capped at depth 1 and cannot become restore sources.
76
+ - `slots_json` integrity-hash verification end to end: composed snapshots are verified after write, persisted hash mismatches are rejected, and restore resolution falls back to the previous valid snapshot.
77
+ - parity telemetry and rollout gates for automatic restore: restore diagnostics now surface rollout-gate pass/fail state and automatic warm restore falls back to cold rewarm when measurement gates fail.
78
+ - explicit zero-tolerance checks for required-slot loss, stable-prefix violations, and tool-pair parity: all three conditions are rollout blockers.
79
+ - cross-provider assistant-turn policy: foreign-provider assistant turns are explicitly counted and block automatic warm restore with a zero-tolerance rollout gate. User turns may restore cross-provider only when all measurement gates pass.
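The integrity-hash and repair-depth rules in the list above can be sketched as a restore-source resolver. This is a minimal illustration, not the package's implementation: the `Snapshot` shape, `hashSlots`, and `resolveRestoreSource` are hypothetical names, and the FNV-1a hash is a dependency-free stand-in for whatever hash the snapshot store actually uses.

```typescript
// Hypothetical snapshot shape; field names are illustrative, not the package schema.
interface Snapshot {
  id: number;
  slotsJson: string;     // serialized slots as persisted
  integrityHash: string; // hash recorded when the snapshot was written
  repairDepth: number;   // 0 = composed normally, 1 = repaired
}

// Stand-in 32-bit FNV-1a hash so the sketch has no dependencies.
function hashSlots(slotsJson: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < slotsJson.length; i++) {
    h ^= slotsJson.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Walk snapshots newest-first and return the first one allowed to act as a restore source.
function resolveRestoreSource(newestFirst: Snapshot[]): Snapshot | null {
  for (const snap of newestFirst) {
    // Repaired snapshots are capped at depth 1 and cannot become restore sources.
    if (snap.repairDepth > 0) continue;
    // A persisted hash mismatch is rejected; resolution moves on to the previous snapshot.
    if (hashSlots(snap.slotsJson) !== snap.integrityHash) continue;
    return snap;
  }
  // Assumption: with no valid snapshot left, the caller does a cold rewarm.
  return null;
}
```

Returning `null` rather than throwing keeps the cold path an ordinary outcome rather than an error, which matches the roadmap's framing of cold rewarm as the fallback.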
80
+
81
+ Final closeout work now complete:
82
+ - CI commit-data review for every failing warm-restore-related GitHub Actions run before sprint scope is finalized. The review maps failing workflow run, head commit SHA, commit title, failing step, and failing assertion back to a roadmap gate or triage item.
83
+
84
+ Rule going forward: do not reopen warm restore from historical planning notes. New warm-restore work needs a fresh defect, measurement gap, or roadmap item.
85
+
86
+ ## 2. HyperMem 0.9.0 adaptive context lifecycle
87
+ Status: **OPEN, release-candidate pending tag validation.**
88
+
89
+ The core runtime slices have landed: the pure adaptive lifecycle policy kernel, compose diagnostics wiring, afterTurn Redis gradient-cap wiring, adaptive recall breadth, adaptive eviction ordering, lifecycle telemetry, report tooling, forked-context lifecycle integration, and metadata-only topic-signal report classification. The first live telemetry baseline is populated; it shows steady/warmup behavior with zero lifecycle-band divergence, so no threshold tuning is warranted from the current evidence.
90
+
91
+ The lifecycle policy makes compose, afterTurn, recall, trim, compaction, and eviction share one pressure-band decision source instead of growing independent heuristics:
92
+ - tiered warming — policy bands: bootstrap, warmup, steady, elevated, high, critical
93
+ - T0 `/new` breadcrumb package — bootstrap policy emits the package trigger
94
+ - smart-recall surge — `/new` and confident topic shifts widen recall; high/critical pressure gates it down
95
+ - adaptive trim and compaction bands — trim and compaction targets resolve from the same lifecycle band
96
+ - topic-centroid-guided eviction — enabled only once pressure reaches elevated or worse
97
+ - telemetry and tuning pass — policy returns stable band, pressure, and reason fields for later runtime instrumentation
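One way to picture a single pressure-band decision source is a pure function that every caller consults. The sketch below is an assumption-heavy illustration: the band names and the gating rules for recall surge and centroid eviction come from the list above, but the numeric thresholds, the `turnIndex < 3` warmup rule, and all type and function names are hypothetical placeholders.

```typescript
type Band = "bootstrap" | "warmup" | "steady" | "elevated" | "high" | "critical";

interface LifecycleInput {
  turnIndex: number;     // 0-based turn within the session
  pressure: number;      // fraction of budget in use, 0..1
  isNewSession: boolean; // e.g. the turn follows `/new`
  topicShift: boolean;   // a confident topic shift was detected
}

interface LifecycleDecision {
  band: Band;
  pressure: number;
  reason: string;
  recallSurge: boolean;
  centroidEviction: boolean;
}

// Thresholds below are placeholders, not the shipped values.
function resolveBand(input: LifecycleInput): { band: Band; reason: string } {
  if (input.pressure >= 0.9) return { band: "critical", reason: "pressure>=0.90" };
  if (input.pressure >= 0.8) return { band: "high", reason: "pressure>=0.80" };
  if (input.pressure >= 0.65) return { band: "elevated", reason: "pressure>=0.65" };
  if (input.isNewSession && input.turnIndex === 0) return { band: "bootstrap", reason: "new-session" };
  if (input.turnIndex < 3) return { band: "warmup", reason: "early-turns" };
  return { band: "steady", reason: "default" };
}

function decideLifecycle(input: LifecycleInput): LifecycleDecision {
  const { band, reason } = resolveBand(input);
  const heavy = band === "high" || band === "critical";
  return {
    band,
    pressure: input.pressure,
    reason,
    // `/new` and topic shifts widen recall unless pressure is high or critical.
    recallSurge: (input.isNewSession || input.topicShift) && !heavy,
    // Centroid-guided eviction only applies at elevated pressure or worse.
    centroidEviction: band === "elevated" || heavy,
  };
}
```

Because the function is pure, compose, afterTurn, and eviction can each call it with the same inputs and get the same band.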
98
+
99
+ Done in this stream:
100
+ - adaptive lifecycle policy kernel — `c94def0`, CI `24879881852`
101
+ - compose diagnostics wiring + afterTurn Redis gradient-cap wiring — `322a416`
102
+ - adaptive recall breadth adjustment — `5e47fce`, CI `24918184839`
103
+ - adaptive eviction ordering — `a0f6780`, CI `24918940291`
104
+ - adaptive lifecycle telemetry — `61f9b9e`, CI `24919418833`
105
+ - telemetry report tooling — `a923987`, CI `24920282389`
106
+ - forked-context lifecycle integration — `85b5e3c`, CI `24921417908`
107
+
108
+ Remaining slices:
109
+ - runtime tuning only after evidence shows a specific threshold or behavior change is warranted; live topic-bearing samples are now future tuning evidence, not a 0.9.0 release gate
110
+
111
+ Closed release-readiness gates:
112
+ - vector coverage repair: `scripts/embed-existing.mjs` now supports active `knowledge` backfill, eligibility-aware coverage reporting, and a regression covering knowledge coverage. Production backfill reached 100% eligible coverage for facts, knowledge, and episodes on 2026-04-24.
113
+ - lifecycle telemetry baseline: the 2026-04-25 live one-hour window reported 222 lifecycle-policy records across `compose.preRecall`, `compose.eviction`, and `afterTurn.gradient`; bands were steady/warmup only, lifecycle divergence was zero, pressure p95 was 18%, and no threshold tuning was indicated.
114
+ - topic-signal interpretation path: compose/assemble telemetry now exposes metadata-only topic-state fields, and `trim-report.mjs`/`compose-report.mjs` classify `present`, `absent-no-active-topic`, `absent-stamping-incomplete`, and `intentionally-suppressed` without topic names, prompt text, document text, or user content. This closes the reporting ambiguity from the first baseline.
115
+ - topic-bearing compose evidence gate: the 0.9.0 release gate is now **replaced by a safer deterministic evidence gate**. `compose-report.mjs` seeds deterministic topic-bearing history in a temp workspace, `trim-report.mjs`/`compose-report.mjs` both emit `replaced-by-deterministic-evidence` only from metadata-only topic-state observations, and targeted tests cover the gate without live DB mutation or content-bearing telemetry. Live topic-bearing samples remain desirable for future tuning claims, but they are no longer required before tagging 0.9.0.
116
+
117
+ Release-candidate next steps before tagging 0.9.0:
118
+ - run targeted lifecycle evidence checks (`node test/trim-telemetry.mjs` and the existing adaptive lifecycle regression set)
119
+ - validate release docs/version surface (`npm run validate:docs`, `npm run validate:version-parity`, changelog review)
120
+ - complete the normal release checklist: final smoke/tests, tag notes, and publish/tag verification
121
+
122
+ Do not confuse this with the shipped governance-retrieval work. Governance trigger retrieval is closed unless a new regression appears.
123
+
124
+ ## 2a. Runtime diagnostics API allowlist defect
125
+ Status: **CLOSED, upstream verified.** Verified 2026-04-24 against the installed OpenClaw runtime.
126
+
127
+ `openclaw doctor --non-interactive` no longer reports the public-surface allowlist blocker, and a direct memory-core runtime facade probe can load `memory-core/runtime-api.js` through the installed OpenClaw public-surface loader.
22
128
 
23
- **Current state:** `visibilityFilter()` in `cross-agent.ts` uses a hardcoded `defaultOrgRegistry()` to resolve agent tiers, orgs, and capabilities.
129
+ This remains an upstream OpenClaw surface, not HyperMem-owned code. If the blocker reappears, classify it as `upstream-required` unless HyperMem is failing to expose its own memory plugin diagnostics.
24
130
 
25
- **Known limitation:** This duplicates fleet structure that lives authoritatively in `fleet_agents` + `fleet_orgs` in library.db.
131
+ ## 2b. Topic synthesis bridge defect
132
+ Status: **CLOSED.** Fixed in `8b9f928`; CI `24917765384` passed.
26
133
 
27
- **Planned:** Replace with live-loaded registry from library.db on gateway startup, with the hardcoded version as cold-start fallback only. This eliminates the need to maintain two copies of fleet topology.
134
+ Health stats on 2026-04-24 showed `knowledge: 0 active` despite eligible topics, indexed facts, and indexed episodes. Investigation found that `TopicSynthesizer` still assumed `library.db.topics.id` matched `messages.db.messages.topic_id`. That invariant broke when per-session `SessionTopicMap` introduced UUID topic ids in messages DBs while library topics kept integer ids. The result was silent topic-wiki synthesis loss: eligible global topics could not resolve their source messages.
135
+
136
+ Closed fix summary:
137
+ - bridges library topics to per-agent message topics where names align and falls back to the same content detector that created library topics;
138
+ - preserves legacy direct-id fallback for older integer-linked data;
139
+ - emits diagnostics when eligible topics cannot resolve message topic ids or source messages;
140
+ - refreshes unchanged-content upsert metadata so source-ref watermarks do not silently suppress regenerated pages;
141
+ - repairs long-lived `knowledge.visibility` schema drift;
142
+ - covers UUID topic ids, duplicate same-name session topic fragments, content-detector fallback, no-match skips, legacy direct-id fallback, schema-drift repair, unchanged-content watermark refresh, and idempotent upsert.
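The id mismatch and the name-based bridge described above can be sketched as follows. This is a simplified illustration with hypothetical types and function names. It covers name alignment and the legacy direct-id fallback only, and leaves out the content-detector fallback the fix summary also mentions.

```typescript
// Hypothetical shapes: library topics keep integer ids, message topics use per-session ids.
interface LibraryTopic { id: number; name: string }
interface MessageTopic { id: string; name: string } // UUIDs, or legacy integer ids as strings

function normalizeName(name: string): string {
  return name.trim().toLowerCase();
}

// Returns the message-topic ids that a library topic should read its source messages from.
function bridgeTopic(lib: LibraryTopic, messageTopics: MessageTopic[]): string[] {
  const wanted = normalizeName(lib.name);
  // Name alignment may match several same-name session fragments; keep all of them.
  const byName = messageTopics.filter((t) => normalizeName(t.name) === wanted).map((t) => t.id);
  if (byName.length > 0) return byName;
  // Legacy fallback: older data linked topics by the same integer id.
  const legacy = messageTopics.filter((t) => t.id === String(lib.id)).map((t) => t.id);
  // An empty result is the case where a diagnostic should be emitted instead of failing silently.
  return legacy;
}
```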
143
+
144
+ ## 3. Contradiction-aware decay
145
+ After 0.9.0 lifecycle work:
146
+ - accelerate decay for superseded facts
147
+ - reduce stale architectural facts surviving long after pivots
148
+ - prevent repeated ghost-fact failures after deletes, renames, and interface changes
149
+
150
+ This is a quality and correctness improvement, not a release blocker.
151
+
152
+ ## 4. Turn DAG Phase 5 storage and performance
153
+ After the above correctness and continuity work:
154
+ - content-addressed blob store for repeated large payloads
155
+ - zstd compression for large message bodies
156
+ - cached token estimates on insert
157
+ - optional garbage collection
158
+ - active-only FTS index maintenance
159
+
160
+ Phase 5 stays important, but it is not the next sprint until the higher-priority continuity and lifecycle work is settled.
161
+
162
+ ---
163
+
164
+ ## Open items
165
+
166
+ ### High priority
167
+ | Item | Status | Notes |
168
+ |---|---|---|
169
+ | Runtime diagnostics API allowlist defect | ✅ CLOSED | Verified installed OpenClaw runtime can reach `memory-core/runtime-api.js`; re-open only with a fresh public-surface failure trace. |
170
+ | Topic synthesis bridge defect | ✅ CLOSED | Fixed in `8b9f928`; CI `24917765384` passed. |
171
+ | Adaptive context lifecycle (0.9.0) | 🟡 OPEN | Kernel, compose diagnostics, afterTurn gradient cap, recall breadth, eviction order, lifecycle telemetry, report tooling, forked-context integration, and topic-signal report classification are landed; vector coverage, first live lifecycle baseline, and the 0.9.0 topic-bearing compose evidence gate are closed; threshold tuning remains deferred until future live evidence warrants it. |
172
+ | Vector coverage repair gate | ✅ CLOSED | `embed-existing` now supports knowledge and eligibility-aware coverage reporting; production vectors reached facts 113/113, knowledge 85/85, episodes 30,121/30,121 eligible coverage on 2026-04-24. |
173
+ | Contradiction-aware decay | 🟡 OPEN | Prevents stale-fact poisoning after architectural pivots. |
174
+ | Turn DAG Phase 5 storage/perf | 🟡 OPEN | Important, but later than the items above. |
175
+ | Warm-restore gate closeout | ✅ DONE | Tracked gate-closeout scope is complete; reopen only for a new concrete defect or measurement gap. |
176
+
177
+ ### Medium priority
178
+ | Item | Status | Notes |
179
+ |---|---|---|
180
+ | Cross-session context boundary markers | 🟡 OPEN | `buildCrossSessionContext()` still renders flat previews without strong per-message boundaries or sender identity. |
181
+ | Cursor durability (SQLite dual-write) | 🟡 DEFERRED | Needed before background indexer can rely on cursor state across restarts. |
182
+ | Cross-agent registry live load | 🟡 DEFERRED | Replace hardcoded org registry with library.db-backed load on startup. |
183
+ | Write authorization for global-scope facts | 🟡 DEFERRED | Add designated-writer policy for `scope='global'`. |
184
+
185
+ ### Lower priority / deferred
186
+ | Item | Status | Notes |
187
+ |---|---|---|
188
+ | Plugin type unification | 🟡 DEFERRED | Structural cleanup, not urgent product work. |
189
+ | Strict topic mode legacy NULL backfill | 🟡 DEFERRED | Wait for stable topic coverage before running the migration/backfill. |
190
+ | ACA Step 4 retrieval stubs replace static files | 🔲 PENDING | Still relevant, but downstream of lifecycle/diagnostics stability. Do not start from older ACA notes. |
191
+ | ACA Step 5 governance context assembly | 🔲 PENDING | Still relevant, but depends on Step 4. Do not start until Step 4 has an accepted implementation contract. |
192
+ | Codex harness compatibility for HyperMem hooks | 🔲 TRIAGE | OpenClaw is generalizing the embedded executor (PI stays default; Codex is a registered plugin harness via `agents.defaults.embeddedHarness`). When a turn runs through the Codex harness, OpenClaw owns mirror transcript + tool dispatch but Codex owns the agent loop and native compaction. Investigate before any agent moves off PI: (1) whether HyperMem's `before_prompt_build`, `before_compaction`, `after_compaction`, `before_message_write`, `llm_input`, `llm_output`, `agent_end`, `after_tool_call` hooks fire with the same timing on Codex turns; (2) collision between HyperCompositor compaction and Codex native app-server compaction (need an explicit disable on one side or a coordination contract); (3) whether the locally mirrored transcript loses any content the indexer assumes is present; (4) `subagent` `context: "fork"` behavior when parent runs on Codex (parent JSONL is mirror, not source of truth); (5) keep `openai-codex/*` provider routes (PI under the hood) distinct from `codex/*` runtime selection (Codex harness). Pilot on one non-critical agent with `runtime: "codex"`, `fallback: "pi"` once the parity gap closes; do not adopt fleet-wide before then. Surfaced 2026-04-24 by ragesaq. |
28
193
 
29
194
  ---
30
195
 
31
- ## Write Authorization for Global-Scope Facts
196
+ ## Working rules for future planning
32
197
 
33
- **Current state:** Facts written with `scope='global'` are readable fleet-wide. The write path has no authorization gate — any agent with HyperMem API access can write a global-scope fact.
198
+ 1. Add future-work items here first.
199
+ 2. Do not create a second roadmap doc for the same workstream.
200
+ 3. If a feature needs a design spec, that spec should support implementation details only and must point back here for priority/order.
201
+ 4. If code lands out of the planned order, update this file in the same work session.
202
+ 5. Historical phase briefs are not roadmap authority.
34
203
 
35
- **Impact:** Acceptable for trusted single-operator deployments. All agents share the same trust boundary.
204
+ ---
205
+
206
+ ## Retired planning documents
207
+
208
+ The following overlapping roadmap/spec files were consolidated into this roadmap and removed to stop split-planning:
209
+ - `specs/ROADMAP_RESEQUENCING_2026-04-21.md`
210
+ - `specs/ADAPTIVE_CONTEXT_LIFECYCLE_0.9.0.md`
211
+ - `specs/COMPOSITION_SNAPSHOT_WARM_RESTORE_PLAN.md`
212
+ - `specs/CONTRADICTION_AWARE_DECAY.md`
213
+
214
+ Their useful content is now represented here.
215
+
216
+ ---
217
+
218
+ ## Historical triage appendix
219
+
220
+ This appendix is the cleanup pass for older improvement lists that were still floating around in the repo and workspace.
221
+
222
+ ### Triage legend
223
+ - **SHIPPED**: landed in release code or on `main`
224
+ - **OPEN**: still active in the canonical roadmap above
225
+ - **BACKLOG**: valid idea, but not in the current active execution order
226
+ - **SUPERSEDED**: replaced by a later implementation or a narrower canonical item above
227
+
228
+ ### Historical improvement triage
229
+
230
+ | Historical item | First appeared in | Disposition | Canonical location / note |
231
+ |---|---|---|---|
232
+ | End-to-end integration verification | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SHIPPED** | Closed by the 0.8.0 release-hardening verification work. |
233
+ | Real gateway integration test | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SHIPPED** | Closed by the 0.8.0 release-hardening verification work. |
234
+ | Fact contradiction handling | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SHIPPED** | Landed as V19 tiered contradiction resolution in 0.8.0. |
235
+ | Post-retrieval reranking | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SHIPPED** | Already landed on `main`; no longer roadmap future work. |
236
+ | Oversized-payload artifact handling | workspace `active/hypermem-prioritized-improvements-2026-04-14.md`, workspace `specs/HYPERMEM_PLAN_2026-04-16.md` | **SHIPPED** | Landed in the 0.8.0 correctness cluster. |
237
+ | Model-aware compositor budgeting | workspace `active/hypermem-prioritized-improvements-2026-04-14.md`, workspace `specs/HYPERMEM_PLAN_2026-04-16.md` | **SHIPPED** | Landed as B4 model-aware budgeting in 0.8.0. |
238
+ | Benchmark suite completion | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **BACKLOG** | Still valid, but not part of the current active execution order. |
239
+ | Trust-aware composition / prompt-boundary hygiene | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SUPERSEDED** | Survives only as narrower work inside warm-restore gates and future lifecycle tuning. |
240
+ | Fleet agent seeding on startup | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SHIPPED** | Closed in 0.8.0 startup completeness work. |
241
+ | Governance and workspace doc ingestion | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SUPERSEDED** | Broad ingestion work was replaced by scoped governance trigger retrieval, doctrine-first ranking, and later ACA Step 4/5 items. Do not reopen the broad item. |
242
+ | Prompt-cache validation | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SUPERSEDED** | Replaced by shipped cache-boundary work and current warm-restore parity gates. |
243
+ | Cross-seat and org-visible fact sharing | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **BACKLOG** | Still deferred pending stronger write-auth and provenance rules. |
244
+ | Knowledge extraction expansion | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **BACKLOG** | Valid future work, not in the current active sequence. |
245
+ | memory-core deprecation path | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **BACKLOG** | Deferred until the current roadmap blocks are complete. |
246
+ | ACA kernel reduction | workspace `active/hypermem-prioritized-improvements-2026-04-14.md` | **SUPERSEDED** | Broad kernel-reduction framing is replaced by adaptive lifecycle 0.9.0 and later ACA Step 4/5 work. |
247
+ | Phase A stability block (duplicate compose, trim ownership, rescue-trim loop, history depth) | workspace `specs/HYPERMEM_PLAN_2026-04-16.md` | **SUPERSEDED** | Historical restructuring plan. Remaining active work is tracked only through the canonical sections above. |
248
+ | Phase B compositor restructure | workspace `specs/HYPERMEM_PLAN_2026-04-16.md` | **SUPERSEDED** | Do not treat as a second roadmap. Re-open only by adding a new item above. |
249
+ | Phase C correctness guards | workspace `specs/HYPERMEM_PLAN_2026-04-16.md`, `specs/RELEASE_HARDENING_0.8.0.md` | **SHIPPED** | Landed in 0.8.0. |
250
+ | Phase D graph semantics follow-on | workspace `specs/HYPERMEM_PLAN_2026-04-16.md` | **SUPERSEDED** | Turn DAG follow-on now lives only as item 4 in the canonical roadmap. |
251
+ | Turn DAG Phase 5 storage/perf | `specs/TURN_DAG_MIGRATION_SPEC.md`, `specs/HYPERBUILDER_PHASE4_BRIEF.md` | **OPEN** | Canonical roadmap item 4. |
252
+ | ACA governance trigger retrieval | direct roadmap follow-up, commits `a03dc01` and `0d33286` | **SHIPPED** | Governance trigger coverage and doctrine-first retrieval are done. Reopen only for a new failing query or regression test. |
253
+ | Adaptive lifecycle diagnostics | direct roadmap follow-up, commits `c94def0` and `322a416` | **SHIPPED** | Kernel and compose diagnostics wiring landed. Remaining lifecycle behavior stays under roadmap item 2, not a separate historical activity. |
36
254
 
37
- **Planned:** Write-authority model that gates global-scope writes to designated agents (council seats or explicitly allowlisted agent IDs).
255
+ ### Rule after this cleanup
38
256
 
39
- **Workaround:** Restrict HyperMem API access to trusted agents only.
257
+ If an older workspace note or historical spec lists future work that does not also appear in the main roadmap sections above, treat it as historical context only, not an active plan.
package/docs/TUNING.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # hypermem Tuning Guide
2
2
 
3
- Configuration reference for operators and agents. All settings are optional, but the installer now writes a fully-expanded `config.json` so operators can see every default in one place.
3
+ Configuration reference for operators and agents. All settings are optional. The recommended install path writes a starter `config.json` with an explicit lightweight embedding choice and a declarative baseline compositor profile. Tune from that verified baseline, not from guesswork.
4
4
 
5
5
  Config lives in `~/.openclaw/hypermem/config.json` (takes effect on gateway restart) or is passed programmatically via `HyperMem.create()`:
6
6
 
@@ -23,6 +23,16 @@ Resolution order is:
23
23
  2. `~/.openclaw/hypermem/config.json`
24
24
  3. code defaults
25
25
 
26
+ ## Before you tune
27
+
28
+ Do not tune an install that is only staged. Verify these first:
29
+
30
+ 1. `openclaw plugins list` shows `hypercompositor` and `hypermem` loaded
31
+ 2. `openclaw logs --limit 50 | grep hypermem` shows `hypermem initialized`
32
+ 3. after a real agent message, logs show `[hypermem:compose]`
33
+
34
+ If any of those is missing, go back to `INSTALL.md`. That is an install-path problem, not a tuning problem.
35
+
26
36
  ---
27
37
 
28
38
  ## Token Cost Philosophy
@@ -393,6 +403,10 @@ effective budget × (1 - targetBudgetFraction) = history budget
393
403
 
394
404
  The autodetect pattern table in step 2 covers known model families (`claude-*`, `gpt-*`, `gemini-*`, `glm-*`, `qwen-*`, `deepseek-*`). If your model string doesn't match any pattern — custom finetunes, local models behind unusual provider prefixes, experimental Ollama/vLLM/LM Studio names — resolution silently falls through to `defaultTokenBudget` (90k). **Every dial in this section is a fraction of the detected window, so wrong detection propagates everywhere**: `budgetFraction`, `warmHistoryBudgetFraction`, trim tier thresholds (50% / 65% / 85%), and compaction gates (80% afterTurn, 85% nuclear) all end up sized against the wrong ceiling.
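As a worked example of that propagation, the sketch below sizes the documented trim tiers and compaction gates against both the 90k fallback and a real 200k window. The percentages come from the paragraph above; the constant and function names are hypothetical, and the real resolver may round or clamp differently.

```typescript
// Fractions quoted in this guide; names are illustrative only.
const TRIM_TIERS = [0.5, 0.65, 0.85];
const AFTER_TURN_COMPACTION = 0.8;
const NUCLEAR_COMPACTION = 0.85;

function thresholdsFor(windowTokens: number) {
  return {
    trimTiers: TRIM_TIERS.map((f) => Math.round(windowTokens * f)),
    afterTurnCompaction: Math.round(windowTokens * AFTER_TURN_COMPACTION),
    nuclearCompaction: Math.round(windowTokens * NUCLEAR_COMPACTION),
  };
}

const detected = thresholdsFor(90000); // fallback defaultTokenBudget
const actual = thresholdsFor(200000);  // the window the model really has

// The first trim tier fires at 45,000 tokens instead of 100,000,
// and afterTurn compaction starts at 72,000 instead of 160,000.
console.log(detected.trimTiers[0], actual.trimTiers[0]);
console.log(detected.afterTurnCompaction, actual.afterTurnCompaction);
```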
395
405
 
406
+ This is especially important on OpenAI-compatible surfaces. In real deployments, `openai/*`, `openai-codex/*`, OpenRouter-backed models, and custom OpenAI-compatible gateways often do **not** provide enough trustworthy runtime metadata to infer the usable context budget correctly. If you do not see a `runtime tokenBudget=...` log for the exact model you're running, assume you need a manual override.
407
+
408
+ When you know both numbers, declare both: `contextTokens` for the usable prompt budget and `contextWindow` for the full advertised window. HyperMem uses `contextTokens` first, then `contextWindow`, and the config validator enforces `contextTokens <= contextWindow`.
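The precedence and the validator rule can be sketched in a few lines. The `WindowOverride` type and `resolveBudget` function are hypothetical names, and throwing on an inverted pair is an assumption; the real validator may report the problem differently.

```typescript
interface WindowOverride {
  contextTokens?: number; // usable prompt budget
  contextWindow?: number; // full advertised window
}

function resolveBudget(o: WindowOverride): number | undefined {
  // Validator rule: contextTokens must not exceed contextWindow when both are set.
  if (o.contextTokens !== undefined && o.contextWindow !== undefined && o.contextTokens > o.contextWindow) {
    throw new RangeError("contextTokens must be <= contextWindow");
  }
  // contextTokens wins; contextWindow is used only when contextTokens is absent.
  return o.contextTokens ?? o.contextWindow;
}
```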
409
+
396
410
  Two failure signatures:
397
411
 
398
412
  - **Undersized detection** (real 200k model detected as 90k): continuous warm→trim→compact cycling, starved facts/wiki slots, tight first-turn budgets. The agent feels "boxed in" even in short sessions.
@@ -400,23 +414,33 @@ Two failure signatures:
400
414
 
401
415
  Verify what's being used by enabling `verboseLogging: true` and watching for the `budget source:` log line each turn. `runtime tokenBudget=...` or `contextWindowOverrides[...]` means HyperMem has the right number. `fallback contextWindowSize=...` with your model in the tail means detection failed.
402
416
 
417
+ You can also run the packaged audit helper:
418
+
419
+ ```bash
420
+ hypermem-model-audit
421
+ hypermem-model-audit --strict
422
+ ```
423
+
424
+ It inspects configured agent models plus your existing `contextWindowOverrides` and flags models that still rely on weak autodetect paths.
425
+
403
426
  Fix by adding `contextWindowOverrides` in the `compositor` block of `~/.openclaw/hypermem/config.json`:
404
427
 
405
428
  ```json
406
429
  {
407
430
  "compositor": {
408
431
  "contextWindowOverrides": {
409
- "ollama/llama-3.3-70b": { "contextTokens": 131072 },
410
- "copilot-local/custom-sft": { "contextTokens": 32768 },
411
- "vllm/qwen3-coder-ft": { "contextTokens": 262144 }
432
+ "ollama/llama-3.3-70b": { "contextTokens": 131072, "contextWindow": 131072 },
433
+ "openai-codex/gpt-5.4": { "contextTokens": 200000, "contextWindow": 200000 },
434
+ "copilot-local/custom-sft": { "contextTokens": 32768, "contextWindow": 32768 },
435
+ "vllm/qwen3-coder-ft": { "contextTokens": 262144, "contextWindow": 262144 }
412
436
  }
413
437
  }
414
438
  }
415
439
  ```
416
440
 
417
- Key format: `"provider/model"`, lowercase, exact match against the model identifier your agent runs on. Values accept either `contextTokens` or `contextWindow` (same effect). Malformed keys, impossible ranges, and empty entries are dropped by the sanitizer on load with a warning; the override system is designed to be safe to edit without risking the resolver.
441
+ Key format: `"provider/model"`, lowercase, exact match against the model identifier your agent runs on. Values accept either `contextTokens` or `contextWindow`, but for production installs you should prefer setting both. Malformed keys, impossible ranges, and empty entries are dropped by the sanitizer on load with a warning; the override system is designed to be safe to edit without risking the resolver.
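To make the drop-with-warning behavior concrete, here is a minimal sketch of a sanitizer over an override map. Everything in it is an assumption made for illustration: the key pattern, the warning text, the positive-integer range check, and the `sanitizeOverrides` name are not taken from the package.

```typescript
type Override = { contextTokens?: number; contextWindow?: number };

// Hypothetical key rule: lowercase "provider/model".
const KEY_PATTERN = /^[a-z0-9._-]+\/[a-z0-9._:-]+$/;

function sanitizeOverrides(raw: Record<string, Override>): {
  kept: Record<string, Override>;
  warnings: string[];
} {
  const kept: Record<string, Override> = {};
  const warnings: string[] = [];
  for (const key of Object.keys(raw)) {
    const value = raw[key];
    const nums = [value.contextTokens, value.contextWindow].filter(
      (n): n is number => n !== undefined,
    );
    if (!KEY_PATTERN.test(key)) {
      warnings.push(`dropped "${key}": malformed key`);
    } else if (nums.length === 0) {
      warnings.push(`dropped "${key}": empty entry`);
    } else if (nums.some((n) => Math.floor(n) !== n || n <= 0)) {
      warnings.push(`dropped "${key}": impossible range`);
    } else {
      kept[key] = value;
    }
  }
  return { kept, warnings };
}
```

Dropping a bad entry instead of failing the whole load means one typo falls back to autodetect for that model only, while the remaining overrides still apply.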
418
442
 
419
- Gateway restart required after changes. Overrides interact with warming and trimming exactly as the autodetect path does — once the correct window is in place, every other knob here behaves as documented. Set `contextWindowOverrides` **before** tuning `budgetFraction`, `warmHistoryBudgetFraction`, or any trim-zone dials, otherwise you're tuning against the wrong window and the numbers won't behave.
443
+ Gateway restart required after changes. Overrides interact with warming and trimming exactly as the autodetect path does — once the correct window is in place, every other knob here behaves as documented. Set `contextWindowOverrides` **before** tuning `budgetFraction`, `warmHistoryBudgetFraction`, or any trim-zone dials, otherwise you're tuning against the wrong window and the numbers won't behave. For OpenAI-family models, make log verification part of bring-up: no `runtime tokenBudget=...` log, no trust.
420
444
 
421
445
  ### How the budget fills
422
446