@psiclawops/hypermem 0.8.2 → 0.8.3

@@ -0,0 +1,132 @@
1
+ # HyperMem Phase 1 Validation Runbook
2
+
3
+ Operator-facing guide for running and interpreting the Phase 1 validation suite.
4
+
5
+ ---
6
+
7
+ ## Quick Start
8
+
9
+ Run the full Phase 1 validation flow:
10
+
11
+ ```bash
12
+ npm run build && node scripts/validate-compose.mjs && node scripts/validate-config-surface.mjs
13
+ ```
14
+
15
+ Or run individual validations:
16
+
17
+ ```bash
18
+ # Compose validation (facts, library, budget pressure)
19
+ node scripts/validate-compose.mjs
20
+
21
+ # Config surface parity (install.sh, docs, README)
22
+ node scripts/validate-config-surface.mjs
23
+
24
+ # Config key resolution tests
25
+ node test/config-validation.mjs
26
+
27
+ # Retrieval regression harness
28
+ node test/retrieval-regression.mjs
29
+
30
+ # Compositor integration (includes budget-pressure fixture)
31
+ node test/compositor.mjs
32
+
33
+ # Plugin pipeline (requires plugin build)
34
+ npm run validate:plugin-pipeline
35
+
36
+ # Startup fleet seeding on cold boot
37
+ node test/fleet-startup-seeding.mjs
38
+
39
+ # Release path hardening harness (builds core + plugin)
40
+ npm run validate:release-path
41
+
42
+ # Compose report (operator-readable diagnostics)
43
+ node scripts/compose-report.mjs
44
+ ```
45
+
46
+ ---
47
+
48
+ ## What Each Validation Covers
49
+
50
+ | Validation | What it proves |
51
+ |---|---|
52
+ | `validate-compose.mjs` | End-to-end compose with seeded facts, knowledge retrieval, and budget pressure |
53
+ | `validate-config-surface.mjs` | Config keys present in install.sh, INSTALL.md, TUNING.md, README |
54
+ | `config-validation.mjs` | contextWindowOverrides sanitization, budget resolution, maintenance defaults |
55
+ | `retrieval-regression.mjs` | Scope isolation, superseded-fact filtering, budget pressure, knowledge retrieval |
56
+ | `compositor.mjs` | Four-layer compose, trigger routing, keystone injection, budget-pressure filtering |
57
+ | `plugin-pipeline.mjs` | Real plugin assemble() path with seeded L4 memory, tight-budget proof |
58
+ | `fleet-startup-seeding.mjs` | Cold-start fleet population from workspace identity files plus idempotent repeat-boot behavior |
59
+ | `release-gateway-path.mjs` | Real plugin release-path proof: tool-chain ejection counters, ArtifactRef, replay marker, and degradation telemetry |
60
+ | `compose-report.mjs` | Operator-readable diagnostics showing layer counts and budget decisions |
61
+
62
+ ---
63
+
64
+ ## Interpreting Healthy Output
65
+
66
+ A passing run shows all checks green:
67
+
68
+ ```
69
+ ALL 12 CHECKS PASSED ✅
70
+ ```
71
+
72
+ Key diagnostics in the compose report:
73
+
74
+ - **factsIncluded > 0**: facts were retrieved for the prompt
75
+ - **tokenCount <= budget**: compositor respected the token ceiling
76
+ - **retrievalMode**: `trigger`, `fallback_knn`, or `fts_only` — shows which retrieval path fired
77
+ - **scopeFiltered >= 0**: count of cross-session facts removed by scope filtering; any non-negative value is healthy
78
+
79
+ Maintenance diagnostics (when `verboseLogging` is enabled):
80
+
81
+ ```
82
+ [indexer] Maintenance: considered=5 skipped=2 scanned=3 mutated=0 duration=12ms exit=complete
83
+ ```
84
+
85
+ - **considered**: conversations examined
86
+ - **skipped**: conversations within cooldown window
87
+ - **scanned**: conversations where sweeps ran
88
+ - **mutated**: total messages deleted or truncated
89
+ - **exit**: `complete`, `cap-reached`, `cooldown`, or `no-conversations`
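
The maintenance line is plain `key=value` text, so it can be split into fields for dashboards or alerting. The following is a minimal sketch that assumes the log format matches the sample line exactly; `parseMaintenanceLine` and `MaintenanceStats` are illustrative names, not HyperMem exports.

```typescript
// Hypothetical helper: parses one "[indexer] Maintenance: ..." log line.
// Field names come from the sample line in this runbook; the function and
// interface names are illustrative and not part of HyperMem.
interface MaintenanceStats {
  considered: number;
  skipped: number;
  scanned: number;
  mutated: number;
  durationMs: number;
  exit: string;
}

function parseMaintenanceLine(line: string): MaintenanceStats | null {
  const marker = "[indexer] Maintenance:";
  const start = line.indexOf(marker);
  if (start === -1) return null;

  // Collect the key=value pairs that follow the marker.
  const fields = new Map<string, string>();
  for (const token of line.slice(start + marker.length).trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) fields.set(token.slice(0, eq), token.slice(eq + 1));
  }

  const num = (key: string): number => Number.parseInt(fields.get(key) ?? "", 10);
  const stats: MaintenanceStats = {
    considered: num("considered"),
    skipped: num("skipped"),
    scanned: num("scanned"),
    mutated: num("mutated"),
    durationMs: num("duration"), // parseInt stops at the "ms" suffix
    exit: fields.get("exit") ?? "",
  };

  // Reject lines where any numeric field is missing or malformed.
  const numbers = [stats.considered, stats.skipped, stats.scanned, stats.mutated, stats.durationMs];
  return numbers.some(Number.isNaN) || stats.exit === "" ? null : stats;
}
```

A monitoring script could, for example, alert when `exit` stays at `cap-reached` across many consecutive ticks.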
90
+
91
+ ---
92
+
93
+ ## Common Cases
94
+
95
+ ### Empty context (no facts/knowledge seeded)
96
+
97
+ Expected: `factsIncluded=0`, `contextBlock` may be empty or contain only history. This is normal for fresh installs or agents with no indexed conversations.
98
+
99
+ ### Missing fact in context
100
+
101
+ Check: is the fact's `superseded_by` column NULL? Superseded facts are filtered. Is the fact's `agent_id` correct for the composing agent? Cross-agent facts are scope-filtered.
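
The two checks above can be written as a small predicate when triaging rows by hand. This is a sketch only: the `FactRow` shape is assumed from the column names mentioned in these docs (`superseded_by`, `agent_id`, `scope`), and treating `scope = 'global'` as visible to every agent is an assumption; the real filter lives in the compositor and may apply further rules.

```typescript
// Assumed row shape; column types are guesses, not the actual schema.
interface FactRow {
  agent_id: string;
  scope: string;
  superseded_by: number | string | null;
}

// Hypothetical triage helper, not a HyperMem API: mirrors the two questions
// in this runbook (superseded? wrong agent?).
function isFactEligible(fact: FactRow, composingAgentId: string): boolean {
  const notSuperseded = fact.superseded_by === null;
  const inScope = fact.agent_id === composingAgentId || fact.scope === "global";
  return notSuperseded && inScope;
}
```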
102
+
103
+ ### Over-budget compose
104
+
105
+ The compositor respects token budgets with a small tolerance (5-15%). If `tokenCount` significantly exceeds `tokenBudget`, check `compositor.budgetFraction` and `contextWindowReserve` in config.
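
To make "significantly exceeds" concrete, a check like the one below flags a compose result only when it passes the upper end of the stated tolerance. It is a sketch, not a HyperMem API: `isOverBudget` is a hypothetical helper, and the 15% default is taken from the top of the 5-15% range quoted above.

```typescript
// Hypothetical check: true when tokenCount exceeds the budget by more than
// the allowed tolerance (default 0.15, the upper bound quoted in this runbook).
function isOverBudget(tokenCount: number, tokenBudget: number, tolerance = 0.15): boolean {
  if (tokenBudget <= 0) throw new RangeError("tokenBudget must be positive");
  return tokenCount > tokenBudget * (1 + tolerance);
}
```

With a budget of 1000 tokens, a result of 1100 tokens stays inside the tolerance, while 1200 tokens is worth investigating.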
106
+
107
+ ### Maintenance not running
108
+
109
+ Check that `maintenance.periodicInterval` is set in config.json. Default is 300000ms (5 min). If `verboseLogging` is enabled, you should see maintenance diagnostics every tick.
110
+
111
+ ---
112
+
113
+ ## What Remains Unverified After Phase 1
114
+
115
+ - **Vector/semantic retrieval**: all Phase 1 tests run FTS-only (no Ollama dependency)
116
+ - **Full hot-cache plugin path inside Phase 1 itself**: these checks use the direct compositor, not the full cache-layer assemble lifecycle. Use `npm run validate:release-path` for that proof.
117
+ - **Multi-agent fleet interactions**: scope isolation is tested, but fleet-wide maintenance behavior is not
118
+ - **Provider-specific formatting**: tests use `provider: 'anthropic'` only
119
+ - **Real model token counting**: tests use the char/4 heuristic estimator, not tiktoken
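
For reference, a char/4 estimator of the kind mentioned in the last item can be as small as the sketch below. Rounding up and the function name are assumptions; the estimator used by the tests may round differently.

```typescript
// Illustrative char/4 estimator (not a HyperMem export).
// Note: string length counts UTF-16 code units, so non-ASCII text is
// approximated even more loosely than ASCII text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

Because a real tokenizer can diverge from this estimate, budget checks that pass in Phase 1 are approximate for any specific model.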
120
+
121
+ ---
122
+
123
+ ## Maintenance Tuning Reference
124
+
125
+ See [TUNING.md](./TUNING.md#background-maintenance) for the full knob reference. Key defaults:
126
+
127
+ | Setting | Default | Effect |
128
+ |---|---|---|
129
+ | `maintenance.periodicInterval` | 300000ms | Background tick cadence |
130
+ | `maintenance.maxActiveConversations` | 5 | Conversations processed per agent per tick |
131
+ | `maintenance.recentConversationCooldownMs` | 30000ms | Skip recently processed conversations |
132
+ | `maintenance.maxCandidatesPerPass` | 200 | Cap on mutations per tick |
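
As a starting point for overrides, the defaults above can be written out as a config object. The nesting under a `maintenance` key is an assumption based on the dotted setting names; confirm the actual layout of config.json in TUNING.md before copying it.

```typescript
// Defaults copied from the table above. The nesting under "maintenance" is
// assumed from the dotted setting names.
const maintenanceDefaults = {
  maintenance: {
    periodicInterval: 300000,            // ms between background ticks
    maxActiveConversations: 5,           // conversations processed per agent per tick
    recentConversationCooldownMs: 30000, // skip conversations processed within this window
    maxCandidatesPerPass: 200,           // cap on mutations per tick
  },
};

// Example override: slow the tick to 10 minutes and keep the other defaults.
const tuned = {
  maintenance: { ...maintenanceDefaults.maintenance, periodicInterval: 600000 },
};
```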
@@ -0,0 +1,70 @@
1
+ # HyperMem 0.8.0 Release Path Validation
2
+
3
+ This is the operator runbook for the release hardening harness.
4
+
5
+ ## Command
6
+
7
+ ```bash
8
+ npm run validate:release-path
9
+ ```
10
+
11
+ That command builds core + plugin, then runs `test/release-gateway-path.mjs` against the built plugin dist in an isolated temporary HOME.
12
+
13
+ ## What it proves
14
+
15
+ The harness exercises the real context-engine plugin path, not just direct compositor helpers.
16
+
17
+ It verifies four release-path behaviors in one run:
18
+
19
+ 1. **normal compose path** returns assembled context through `engine.assemble()`
20
+ 2. **artifact degradation** emits a canonical `[artifact:...]` reference in system context
21
+ 3. **tool-chain ejection** is recorded through real compose-path co-ejection counters in telemetry
22
+ 4. **tool-loop replay recovery** emits the canonical `[replay state=entering ...]` marker when runtime history is hot and the hot cache is cold
23
+
24
+ ## Telemetry contract
25
+
26
+ When `HYPERMEM_TELEMETRY=1`, the plugin now emits a `degradation` JSONL event alongside the existing `assemble`, `trim`, and `trim-guard` events.
27
+
28
+ Per event fields:
29
+
30
+ - `agentId`
31
+ - `sessionKey`
32
+ - `turnId`
33
+ - `path` (`compose` or `toolLoop`)
34
+ - `toolChainCoEjections`
35
+ - `toolChainStubReplacements`
36
+ - `artifactDegradations`
37
+ - `artifactOversizeThresholdTokens`
38
+ - `replayState`
39
+ - `replayReason` (legacy machine reason strings may still contain `redis` for compatibility)
40
+
41
+ The release harness asserts those counters against prompt-visible behavior, so the telemetry is verified, not merely emitted.
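
When inspecting a preserved telemetry file by hand, a small reader like the sketch below can total the counters per path. The field names follow the list above, but the value types are assumptions, and so is the detection rule: this runbook does not name the field that identifies the event type, so the sketch keeps any record that carries a numeric `toolChainCoEjections`.

```typescript
// Field names follow the telemetry contract above; value types are assumed.
interface DegradationFields {
  agentId: string;
  sessionKey: string;
  turnId: string;
  path: "compose" | "toolLoop";
  toolChainCoEjections: number;
  toolChainStubReplacements: number;
  artifactDegradations: number;
  artifactOversizeThresholdTokens: number;
  replayState: string;
  replayReason: string;
}

// Parses JSONL text and keeps records that carry the degradation counters.
// Matching on `toolChainCoEjections` is a stand-in for the real event tag.
function readDegradationEvents(jsonl: string): DegradationFields[] {
  const events: DegradationFields[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    const record = JSON.parse(trimmed) as Partial<DegradationFields>;
    if (typeof record.toolChainCoEjections === "number") {
      events.push(record as DegradationFields);
    }
  }
  return events;
}

// Sums co-ejections per path for a quick manual comparison.
function coEjectionsByPath(events: DegradationFields[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const e of events) {
    totals[e.path] = (totals[e.path] ?? 0) + e.toolChainCoEjections;
  }
  return totals;
}
```

The input string would typically come from reading `<tmp>/release-telemetry.jsonl` after a run with `HYPERMEM_KEEP_RELEASE_TMP=1`.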
42
+
43
+ ## Inspecting artifacts manually
44
+
45
+ By default the harness deletes its temp HOME on success.
46
+
47
+ To keep the temp workspace and telemetry file:
48
+
49
+ ```bash
50
+ HYPERMEM_KEEP_RELEASE_TMP=1 npm run validate:release-path
51
+ ```
52
+
53
+ The script prints the preserved temp path. The telemetry file lives at:
54
+
55
+ ```text
56
+ <tmp>/release-telemetry.jsonl
57
+ ```
58
+
59
+ ## Healthy result
60
+
61
+ ```text
62
+ ALL 12 CHECKS PASSED ✅
63
+ ```
64
+
65
+ A healthy run means the built plugin can prove the Phase C prompt-path contracts that matter for `0.8.0`:
66
+
67
+ - degraded content uses the canonical visible shapes where it is prompt-visible
68
+ - degradation counters line up with what entered the model path
69
+ - replay recovery is visible at the plugin boundary
70
+ - the proof runs against the real assemble lifecycle, not just a mocked helper
@@ -0,0 +1,10 @@
1
+ # HyperMem Release Process
2
+
3
+ **Canonical source:** [PsiClawOps/publication_process_internal](https://github.com/PsiClawOps/publication_process_internal)
4
+
5
+ - **Universal process:** `PROCESS.md` (5-stage publication pipeline)
6
+ - **HyperMem-specific:** `repos/hypermem.md` (scrub lists, stubs, build checks)
7
+ - **Install verification:** `INSTALL_VERIFICATION.md` (independent vetting gate)
8
+ - **Style:** `STYLE_GUIDE.md` + `FLEET_OUTPUT.md`
9
+
10
+ Do not duplicate process content here. If something is missing from the canonical repo, add it there.
@@ -0,0 +1,39 @@
1
+ # HyperMem Roadmap — Post-0.8.0
2
+
3
+ Items that are designed but not yet implemented, or explicitly deferred for future releases.
4
+ For shipped capabilities, see [CHANGELOG.md](../CHANGELOG.md) and [ARCHITECTURE.md](../ARCHITECTURE.md).
5
+
6
+ ---
7
+
8
+ ## Open Items
9
+
10
+ | Item | WQ | Status | Notes |
11
+ |---|---|---|---|
12
+ | Cross-session context boundary markers | WQ-20260402-001 | 🟡 OPEN | `buildCrossSessionContext()` renders flat previews with no per-message boundaries or sender identity. Incident 6. |
13
+ | Cursor durability (SQLite dual-write) | — | 🟡 DEFERRED | Cursor TTL = 24h. Dual-write to SQLite required before background indexer reads cursor reliably across restarts. |
14
+ | Plugin type unification | — | 🟡 DEFERRED | Plugin uses dynamic imports; can't use TS types from core. Shims are intentional. Structural change needed. |
15
+ | Strict topic mode: legacy NULL backfill | — | 🟡 DEFERRED | After ≥2 weeks of topic detection in production, run backfill to assign `topic_id` to legacy NULL messages, then narrow `getRecentMessagesByTopic()` to exclude NULL. Gate: topic detection stable, coverage >80% of new messages. Tracked in `specs/DEFERRED.md`. |
16
+ | ACA Step 4 — retrieval stubs replace static files | — | 🔲 PENDING | `systemPromptAddition` carries governance doc chunks instead of embedding full workspace files. Blocked on Step 3 ✅ |
17
+ | ACA Step 5 — governance context assembly | — | 🔲 PENDING | Full on-demand assembly replaces static prompt injection. Requires Step 4. |
18
+
19
+ ---
20
+
21
+ ## Cross-Agent Registry — Live Load
22
+
23
+ **Current state:** `visibilityFilter()` in `cross-agent.ts` uses a hardcoded `defaultOrgRegistry()` to resolve agent tiers, orgs, and capabilities.
24
+
25
+ **Known limitation:** This duplicates fleet structure that lives authoritatively in `fleet_agents` + `fleet_orgs` in library.db.
26
+
27
+ **Planned:** Replace with live-loaded registry from library.db on gateway startup, with the hardcoded version as cold-start fallback only. This eliminates the need to maintain two copies of fleet topology.
28
+
29
+ ---
30
+
31
+ ## Write Authorization for Global-Scope Facts
32
+
33
+ **Current state:** Facts written with `scope='global'` are readable fleet-wide. The write path has no authorization gate — any agent with HyperMem API access can write a global-scope fact.
34
+
35
+ **Impact:** Acceptable for trusted single-operator deployments. All agents share the same trust boundary.
36
+
37
+ **Planned:** Write-authority model that gates global-scope writes to designated agents (council seats or explicitly allowlisted agent IDs).
38
+
39
+ **Workaround:** Restrict HyperMem API access to trusted agents only.
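
Until the planned model exists, a deployment could approximate the gate in its own wrapper around fact writes. The sketch below is purely hypothetical and is not part of HyperMem: `canWriteFact` and the allowlist parameter are invented for illustration, and only the `'global'` scope value comes from this document.

```typescript
// Hypothetical write gate (not implemented in HyperMem): global-scope writes
// are allowed only for agent IDs on an explicit allowlist.
function canWriteFact(
  agentId: string,
  scope: string,
  globalWriters: ReadonlySet<string>,
): boolean {
  return scope !== "global" || globalWriters.has(agentId);
}
```

A wrapper would call this before forwarding a write and reject the request when it returns `false`.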
@@ -0,0 +1,93 @@
1
+ # Slash Commands
2
+
3
+ hypermem supports operator-defined slash commands that hook into session lifecycle management. These are not built into the core runtime — they are wiring points you implement in your OpenClaw plugin or agent config.
4
+
5
+ ---
6
+
7
+ ## `/fresh` — Start an Unwarmed Session
8
+
9
+ Flushes the current session's hot cache and starts fresh. Long-term memory (facts, vectors, episodes, knowledge graph) is preserved and will re-warm naturally on the next bootstrap.
10
+
11
+ **Use case:** A user wants to start a new conversation without any session warmth bleeding in from a previous context.
12
+
13
+ ### What gets cleared
14
+
15
+ | Slot | Cleared |
16
+ |---|---|
17
+ | `system` | ✅ |
18
+ | `identity` | ✅ |
19
+ | `history` | ✅ |
20
+ | `window` | ✅ |
21
+ | `cursor` | ✅ |
22
+ | `context` | ✅ |
23
+ | `facts` | ✅ |
24
+ | `tools` | ✅ |
25
+ | `meta` | ✅ |
26
+ | Active sessions set | ✅ |
27
+ | SQLite facts / knowledge | ❌ preserved |
28
+ | Vector store | ❌ preserved |
29
+ | Episodes | ❌ preserved |
30
+ | Knowledge graph | ❌ preserved |
31
+
32
+ ### Wiring it in (OpenClaw plugin)
33
+
34
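+ ```typescript
+ import { flushSession } from 'hypermem';
+ import type { CacheLayer } from 'hypermem';
+
+ // Example handler wrapper; the function name and signature are illustrative.
+ // Returns a reply string for /fresh, or undefined for any other input.
+ async function handleSlashCommand(
+   cache: CacheLayer,
+   agentId: string,
+   sessionKey: string,
+   input: string,
+ ): Promise<string | undefined> {
+   if (input.trim() !== '/fresh') return undefined;
+
+   // Flush the hot cache for this session; long-term memory is untouched.
+   const result = await flushSession(cache, agentId, sessionKey);
+
+   if (result.success) {
+     return `Session cache cleared. Starting fresh — long-term memory is preserved.\nFlushed at: ${result.flushedAt}`;
+   }
+   return `Failed to flush session: ${result.error}`;
+ }
+ ```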
49
+
50
+ ### Using the `SessionFlusher` class
51
+
52
+ If you need a bound helper (e.g. inside an agent that always operates as a fixed agentId):
53
+
54
+ ```typescript
55
+ import { SessionFlusher } from 'hypermem';
56
+
57
+ const flusher = new SessionFlusher(cache, 'my-agent');
58
+
59
+ // Later, when /fresh is received:
60
+ const result = await flusher.flush(sessionKey);
61
+ ```
62
+
63
+ ### Aliases
64
+
65
+ You may want to register multiple names for the same command:
66
+
67
+ ```
68
+ /fresh
69
+ /newsession
70
+ /clearcache
71
+ /restart
72
+ ```
73
+
74
+ All of these are just convention. hypermem does not register slash command names — that is up to your OpenClaw plugin config.
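
If several aliases are registered, the handler can match against a set instead of a single string. A minimal sketch, assuming the alias list above; `isFreshCommand` is an illustrative name, and the case-insensitive match is an added choice (the wiring example earlier compares the trimmed input exactly).

```typescript
// Alias list taken from the convention above; the helper name is illustrative.
const FRESH_ALIASES: ReadonlySet<string> = new Set([
  "/fresh",
  "/newsession",
  "/clearcache",
  "/restart",
]);

// True when the trimmed, lower-cased input is one of the registered aliases.
function isFreshCommand(input: string): boolean {
  return FRESH_ALIASES.has(input.trim().toLowerCase());
}
```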
75
+
76
+ ---
77
+
78
+ ## Planned Commands
79
+
80
+ | Command | Status | Description |
81
+ |---|---|---|
82
+ | `/fresh` | ✅ Available | Flush hot cache, preserve long-term memory |
83
+ | `/memory` | planned | Show what hypermem currently has in context |
84
+ | `/forget <topic>` | planned | Suppress a topic from future context injection |
85
+ | `/recall <query>` | planned | Manually trigger a vector search and display results |
86
+
87
+ ---
88
+
89
+ ## See Also
90
+
91
+ - [TUNING.md](./TUNING.md) — full operator knobs reference
92
+ - [MIGRATION.md](./MIGRATION.md) — schema version compatibility table
93
+ - `SessionFlusher` and `flushSession` exports in `src/session-flusher.ts`