@psiclawops/hypermem 0.8.2 → 0.8.3
- package/INSTALL.md +129 -42
- package/README.md +29 -4
- package/docs/API_STABILITY.md +33 -0
- package/docs/KNOWN_LIMITATIONS.md +35 -0
- package/docs/MEMORY_MD_AUTHORING.md +243 -0
- package/docs/MIGRATION.md +56 -0
- package/docs/MIGRATION_GUIDE.md +1083 -0
- package/docs/PHASE1-VALIDATION.md +132 -0
- package/docs/RELEASE_0.8.0_VALIDATION.md +70 -0
- package/docs/RELEASE_PROCESS.md +10 -0
- package/docs/ROADMAP.md +39 -0
- package/docs/SLASH_COMMANDS.md +93 -0
- package/docs/TUNING.md +866 -0
- package/install.sh +516 -0
- package/memory-plugin/dist/index.d.ts +24 -0
- package/memory-plugin/dist/index.d.ts.map +1 -0
- package/memory-plugin/dist/index.js +300 -0
- package/memory-plugin/dist/index.js.map +1 -0
- package/memory-plugin/openclaw.plugin.json +13 -0
- package/memory-plugin/package.json +64 -0
- package/package.json +13 -2
- package/plugin/dist/index.d.ts +153 -0
- package/plugin/dist/index.d.ts.map +1 -0
- package/plugin/dist/index.js +3127 -0
- package/plugin/dist/index.js.map +1 -0
- package/plugin/openclaw.plugin.json +13 -0
- package/plugin/package.json +65 -0
- package/scripts/install-runtime.mjs +81 -0
package/docs/PHASE1-VALIDATION.md
ADDED
@@ -0,0 +1,132 @@
# HyperMem Phase 1 Validation Runbook

Operator-facing guide for running and interpreting the Phase 1 validation suite.

---

## Quick Start

Run the full Phase 1 validation flow:

```bash
npm run build && node scripts/validate-compose.mjs && node scripts/validate-config-surface.mjs
```

Or run individual validations:

```bash
# Compose validation (facts, library, budget pressure)
node scripts/validate-compose.mjs

# Config surface parity (install.sh, docs, README)
node scripts/validate-config-surface.mjs

# Config key resolution tests
node test/config-validation.mjs

# Retrieval regression harness
node test/retrieval-regression.mjs

# Compositor integration (includes budget-pressure fixture)
node test/compositor.mjs

# Plugin pipeline (requires plugin build)
npm run validate:plugin-pipeline

# Startup fleet seeding on cold boot
node test/fleet-startup-seeding.mjs

# Release path hardening harness (builds core + plugin)
npm run validate:release-path

# Compose report (operator-readable diagnostics)
node scripts/compose-report.mjs
```

---

## What Each Validation Covers

| Validation | What it proves |
|---|---|
| `validate-compose.mjs` | End-to-end compose with seeded facts, knowledge retrieval, and budget pressure |
| `validate-config-surface.mjs` | Config keys present in install.sh, INSTALL.md, TUNING.md, README |
| `config-validation.mjs` | contextWindowOverrides sanitization, budget resolution, maintenance defaults |
| `retrieval-regression.mjs` | Scope isolation, superseded-fact filtering, budget pressure, knowledge retrieval |
| `compositor.mjs` | Four-layer compose, trigger routing, keystone injection, budget-pressure filtering |
| `plugin-pipeline.mjs` | Real plugin assemble() path with seeded L4 memory, tight-budget proof |
| `fleet-startup-seeding.mjs` | Cold-start fleet population from workspace identity files plus idempotent repeat-boot behavior |
| `release-gateway-path.mjs` | Real plugin release-path proof: tool-chain ejection counters, ArtifactRef, replay marker, and degradation telemetry |
| `compose-report.mjs` | Operator-readable diagnostics showing layer counts and budget decisions |

---

## Interpreting Healthy Output

A passing run shows all checks green:

```
ALL 12 CHECKS PASSED ✅
```

Key diagnostics in the compose report:

- **factsIncluded > 0**: facts were retrieved for the prompt
- **tokenCount <= budget**: compositor respected the token ceiling
- **retrievalMode**: `trigger`, `fallback_knn`, or `fts_only`, showing which retrieval path fired
- **scopeFiltered >= 0**: cross-session facts correctly filtered
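
The diagnostics above can be sketched as a small checking helper. This is a minimal sketch, assuming the compose report exposes these values on a single object; the `ComposeDiagnostics` shape, the `budget` field name, and `checkDiagnostics` itself are hypothetical and not part of HyperMem's API.

```typescript
// Hypothetical report shape: the field names follow the bullet list above,
// but the object layout and the `budget` field are assumptions.
interface ComposeDiagnostics {
  factsIncluded: number;
  tokenCount: number;
  budget: number;
  retrievalMode: string;
  scopeFiltered: number;
}

// The three retrieval paths named in the runbook.
const KNOWN_RETRIEVAL_MODES: readonly string[] = ['trigger', 'fallback_knn', 'fts_only'];

// Returns human-readable problems; an empty array means the report looks healthy.
function checkDiagnostics(d: ComposeDiagnostics): string[] {
  const problems: string[] = [];
  if (d.factsIncluded <= 0) {
    problems.push('no facts were retrieved');
  }
  if (d.tokenCount > d.budget) {
    problems.push(`tokenCount ${d.tokenCount} exceeds budget ${d.budget}`);
  }
  if (!KNOWN_RETRIEVAL_MODES.includes(d.retrievalMode)) {
    problems.push(`unknown retrievalMode "${d.retrievalMode}"`);
  }
  if (d.scopeFiltered < 0) {
    problems.push('scopeFiltered is negative');
  }
  return problems;
}
```

On a fresh install `factsIncluded=0` is expected, so the first check is better treated as a warning than a failure in that situation.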

Maintenance diagnostics (when `verboseLogging` is enabled):

```
[indexer] Maintenance: considered=5 skipped=2 scanned=3 mutated=0 duration=12ms exit=complete
```

- **considered**: conversations examined
- **skipped**: conversations within cooldown window
- **scanned**: conversations where sweeps ran
- **mutated**: total messages deleted or truncated
- **exit**: `complete`, `cap-reached`, `cooldown`, or `no-conversations`
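
For scripted log checks, the maintenance line can be split into its `key=value` fields. This is a minimal sketch that assumes the line always has the shape shown above; `parseMaintenanceLine` and `MaintenanceStats` are hypothetical names, not HyperMem exports.

```typescript
// Hypothetical parsed form of the "[indexer] Maintenance: ..." log line.
interface MaintenanceStats {
  considered: number;
  skipped: number;
  scanned: number;
  mutated: number;
  durationMs: number;
  exit: string;
}

// Returns null when the line is not a maintenance line or a count is missing.
function parseMaintenanceLine(line: string): MaintenanceStats | null {
  const match = /^\[indexer\] Maintenance: (.*)$/.exec(line.trim());
  if (!match) return null;

  // Collect every key=value pair after the prefix.
  const fields = new Map<string, string>();
  const body = match[1] ?? '';
  for (const pair of body.split(/\s+/)) {
    const eq = pair.indexOf('=');
    if (eq > 0) fields.set(pair.slice(0, eq), pair.slice(eq + 1));
  }

  // parseInt stops at the first non-digit, so "12ms" yields 12.
  const num = (key: string): number => Number.parseInt(fields.get(key) ?? '', 10);

  const stats: MaintenanceStats = {
    considered: num('considered'),
    skipped: num('skipped'),
    scanned: num('scanned'),
    mutated: num('mutated'),
    durationMs: num('duration'),
    exit: fields.get('exit') ?? '',
  };

  const counts = [stats.considered, stats.skipped, stats.scanned, stats.mutated, stats.durationMs];
  return counts.some((n) => Number.isNaN(n)) ? null : stats;
}
```

A monitoring script could alert when `exit` is `cap-reached` on many consecutive ticks, which would suggest the per-tick caps are too low for the workload.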

---

## Common Cases

### Empty context (no facts/knowledge seeded)

Expected: `factsIncluded=0`, `contextBlock` may be empty or contain only history. This is normal for fresh installs or agents with no indexed conversations.

### Missing fact in context

Check: is the fact's `superseded_by` column NULL? Superseded facts are filtered. Is the fact's `agent_id` correct for the composing agent? Cross-agent facts are scope-filtered.
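
The two checks above can be expressed as a small triage function. This is a sketch under stated assumptions: the `FactRow` shape is hypothetical (only `superseded_by` and `agent_id` are named in this runbook), and it ignores any other scope rules the real filter may apply.

```typescript
// Hypothetical row shape; only `superseded_by` and `agent_id` come from the runbook.
interface FactRow {
  id: number;
  agent_id: string;
  superseded_by: number | null;
}

type FactVisibility = 'visible' | 'superseded' | 'other-agent';

// Explains why a fact would or would not reach the composing agent's context.
function explainVisibility(fact: FactRow, composingAgentId: string): FactVisibility {
  // A non-NULL superseded_by means a newer fact replaced this one.
  if (fact.superseded_by !== null) return 'superseded';
  // Facts owned by a different agent are scope-filtered.
  if (fact.agent_id !== composingAgentId) return 'other-agent';
  return 'visible';
}
```

Running this over the rows for a missing fact narrows the cause to one of the two filters before digging into retrieval itself.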

### Over-budget compose

The compositor respects token budgets with a small tolerance (5-15%). If `tokenCount` significantly exceeds `tokenBudget`, check `compositor.budgetFraction` and `contextWindowReserve` in config.
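
One way to make "significantly exceeds" concrete is to compare the overshoot against the upper end of the stated tolerance. This is a minimal sketch; `overBudgetRatio`, `needsConfigReview`, and the choice of 0.15 as the default threshold are assumptions for illustration.

```typescript
// Fraction by which tokenCount overshoots tokenBudget (negative when under budget).
function overBudgetRatio(tokenCount: number, tokenBudget: number): number {
  if (tokenBudget <= 0) throw new RangeError('tokenBudget must be positive');
  return (tokenCount - tokenBudget) / tokenBudget;
}

// True when the overshoot is larger than the tolerance, which is assumed
// here to be the 15% upper bound mentioned in the runbook.
function needsConfigReview(tokenCount: number, tokenBudget: number, tolerance = 0.15): boolean {
  return overBudgetRatio(tokenCount, tokenBudget) > tolerance;
}
```

For example, 1100 tokens against a 1000-token budget is a 10% overshoot and stays within tolerance, while 1300 tokens is a 30% overshoot and warrants a look at the config keys named above.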

### Maintenance not running

Check that `maintenance.periodicInterval` is set in config.json. Default is 300000ms (5 min). If `verboseLogging` is enabled, you should see maintenance diagnostics every tick.

---

## What Remains Unverified After Phase 1

- **Vector/semantic retrieval**: all Phase 1 tests run FTS-only (no Ollama dependency)
- **Full hot-cache plugin path inside Phase 1 itself**: these checks use the direct compositor, not the full cache-layer assemble lifecycle. Use `npm run validate:release-path` for that proof.
- **Multi-agent fleet interactions**: scope isolation is tested, but fleet-wide maintenance behavior is not
- **Provider-specific formatting**: tests use `provider: 'anthropic'` only
- **Real model token counting**: tests use the char/4 heuristic estimator, not tiktoken
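
To illustrate the last point, a char/4 estimator can be as small as the sketch below. The function name and the choice to round up are assumptions; HyperMem's own estimator may round differently.

```typescript
// Heuristic token estimate: roughly one token per four characters.
// Rounding up is an assumption made for this sketch.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

Because the estimate depends only on string length, it can drift from a real tokenizer's count for code, non-English text, or long identifiers, which is why real token counting is listed as unverified.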

---

## Maintenance Tuning Reference

See [TUNING.md](./TUNING.md#background-maintenance) for the full knob reference. Key defaults:

| Setting | Default | Effect |
|---|---|---|
| `maintenance.periodicInterval` | 300000ms | Background tick cadence |
| `maintenance.maxActiveConversations` | 5 | Conversations processed per agent per tick |
| `maintenance.recentConversationCooldownMs` | 30000ms | Skip recently processed conversations |
| `maintenance.maxCandidatesPerPass` | 200 | Cap on mutations per tick |
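
The defaults in the table can be captured in a typed constant, with the cooldown rule sketched as a helper. The values are copied from the table; the object layout, `inCooldown`, and the strict less-than comparison are assumptions, not HyperMem's implementation.

```typescript
// Default values copied from the table above; the object layout is hypothetical.
const MAINTENANCE_DEFAULTS = {
  periodicInterval: 300_000,            // ms between background ticks (5 min)
  maxActiveConversations: 5,            // conversations processed per agent per tick
  recentConversationCooldownMs: 30_000, // skip conversations processed this recently
  maxCandidatesPerPass: 200,            // cap on mutations per tick
} as const;

// Hypothetical helper: would a conversation last processed at
// `lastProcessedAtMs` still be skipped at `nowMs`?
function inCooldown(
  lastProcessedAtMs: number,
  nowMs: number,
  cooldownMs: number = MAINTENANCE_DEFAULTS.recentConversationCooldownMs,
): boolean {
  return nowMs - lastProcessedAtMs < cooldownMs;
}
```

With these defaults the cooldown is much shorter than the tick interval, so a conversation skipped on one tick would normally be eligible again by the next.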
package/docs/RELEASE_0.8.0_VALIDATION.md
ADDED
@@ -0,0 +1,70 @@
# HyperMem 0.8.0 Release Path Validation

This is the operator runbook for the release hardening harness.

## Command

```bash
npm run validate:release-path
```

That command builds core + plugin, then runs `test/release-gateway-path.mjs` against the built plugin dist in an isolated temporary HOME.

## What it proves

The harness exercises the real context-engine plugin path, not just direct compositor helpers.

It verifies four release-path behaviors in one run:

1. **normal compose path** returns assembled context through `engine.assemble()`
2. **artifact degradation** emits a canonical `[artifact:...]` reference in system context
3. **tool-chain ejection** is recorded through real compose-path co-ejection counters in telemetry
4. **tool-loop replay recovery** emits the canonical `[replay state=entering ...]` marker when runtime history is hot and the hot cache is cold
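
When inspecting assembled context by hand, the two canonical markers can be detected with prefix-only patterns. This is a minimal sketch: the marker bodies are elided in this runbook, so the patterns assume nothing beyond the opening prefix and a closing bracket, and `detectMarkers` is a hypothetical helper.

```typescript
// Prefix-only patterns: the runbook elides the marker bodies, so nothing
// past the prefix is assumed except that the marker ends with "]".
const ARTIFACT_REF = /\[artifact:[^\]]*\]/;
const REPLAY_ENTERING = /\[replay state=entering[^\]]*\]/;

interface MarkerReport {
  artifactRef: boolean;
  replayEntering: boolean;
}

// Reports which canonical markers appear anywhere in the given context text.
function detectMarkers(context: string): MarkerReport {
  return {
    artifactRef: ARTIFACT_REF.test(context),
    replayEntering: REPLAY_ENTERING.test(context),
  };
}
```

Neither pattern uses the global flag, so repeated calls to `detectMarkers` do not carry regex state between inputs.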

## Telemetry contract

When `HYPERMEM_TELEMETRY=1`, the plugin now emits a `degradation` JSONL event alongside the existing `assemble`, `trim`, and `trim-guard` events.

Per event fields:

- `agentId`
- `sessionKey`
- `turnId`
- `path` (`compose` or `toolLoop`)
- `toolChainCoEjections`
- `toolChainStubReplacements`
- `artifactDegradations`
- `artifactOversizeThresholdTokens`
- `replayState`
- `replayReason` (legacy machine reason strings may still contain `redis` for compatibility)

The release harness asserts those counters against prompt-visible behavior, so the telemetry is verified, not merely emitted.
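
For ad hoc inspection of a telemetry file, the field list above can be mirrored in a type and a JSONL filter. This is a sketch under stated assumptions: the field names come from the list, but every type and the optionality of the replay fields are guesses, and events are recognized by shape because the name of the event-kind field is not documented here.

```typescript
// Field names follow the list above; all types and optionality are assumptions.
interface DegradationEvent {
  agentId: string;
  sessionKey: string;
  turnId: string;
  path: 'compose' | 'toolLoop';
  toolChainCoEjections: number;
  toolChainStubReplacements: number;
  artifactDegradations: number;
  artifactOversizeThresholdTokens: number;
  replayState?: string;
  replayReason?: string;
}

// Shape-based guard: checks only a few distinguishing fields.
function isDegradationEvent(value: unknown): value is DegradationEvent {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (v['path'] === 'compose' || v['path'] === 'toolLoop') &&
    typeof v['toolChainCoEjections'] === 'number' &&
    typeof v['artifactDegradations'] === 'number'
  );
}

// Parses JSONL text, skipping blank or malformed lines and other event kinds.
function parseDegradationEvents(jsonl: string): DegradationEvent[] {
  const events: DegradationEvent[] = [];
  for (const line of jsonl.split('\n')) {
    if (line.trim() === '') continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // tolerate partially written lines
    }
    if (isDegradationEvent(parsed)) events.push(parsed);
  }
  return events;
}
```

Summing `toolChainCoEjections` over the returned events gives a quick per-run total to compare against what the harness reports.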

## Inspecting artifacts manually

By default the harness deletes its temp HOME on success.

To keep the temp workspace and telemetry file:

```bash
HYPERMEM_KEEP_RELEASE_TMP=1 npm run validate:release-path
```

The script will print the preserved temp path. The telemetry file lives at:

```text
<tmp>/release-telemetry.jsonl
```

## Healthy result

```text
ALL 12 CHECKS PASSED ✅
```

A healthy run means the built plugin can prove the Phase C prompt-path contracts that matter for `0.8.0`:

- degraded content uses the canonical visible shapes where it is prompt-visible
- degradation counters line up with what entered the model path
- replay recovery is visible at the plugin boundary
- the proof runs against the real assemble lifecycle, not only a mocked helper
package/docs/RELEASE_PROCESS.md
ADDED
@@ -0,0 +1,10 @@
# HyperMem Release Process

**Canonical source:** [PsiClawOps/publication_process_internal](https://github.com/PsiClawOps/publication_process_internal)

- **Universal process:** `PROCESS.md` (5-stage publication pipeline)
- **HyperMem-specific:** `repos/hypermem.md` (scrub lists, stubs, build checks)
- **Install verification:** `INSTALL_VERIFICATION.md` (independent vetting gate)
- **Style:** `STYLE_GUIDE.md` + `FLEET_OUTPUT.md`

Do not duplicate process content here. If something is missing from the canonical repo, add it there.
package/docs/ROADMAP.md
ADDED
@@ -0,0 +1,39 @@
# HyperMem Roadmap: Post-0.8.0

Items that are designed but not yet implemented, or explicitly deferred for future releases.
For shipped capabilities, see [CHANGELOG.md](../CHANGELOG.md) and [ARCHITECTURE.md](../ARCHITECTURE.md).

---

## Open Items

| Item | WQ | Status | Notes |
|---|---|---|---|
| Cross-session context boundary markers | WQ-20260402-001 | 🟡 OPEN | `buildCrossSessionContext()` renders flat previews, no per-message boundaries or sender identity. Incident 6. |
| Cursor durability (SQLite dual-write) | - | 🟡 DEFERRED | Cursor TTL = 24h. Dual-write to SQLite required before background indexer reads cursor reliably across restarts. |
| Plugin type unification | - | 🟡 DEFERRED | Plugin uses dynamic imports; can't use TS types from core. Shims are intentional. Structural change needed. |
| Strict topic mode: legacy NULL backfill | - | 🟡 DEFERRED | After ≥2 weeks of topic detection in production, run backfill to assign `topic_id` to legacy NULL messages, then narrow `getRecentMessagesByTopic()` to exclude NULL. Gate: topic detection stable, coverage >80% of new messages. Tracked in `specs/DEFERRED.md`. |
| ACA Step 4: retrieval stubs replace static files | - | 🔲 PENDING | `systemPromptAddition` carries governance doc chunks instead of embedding full workspace files. Blocked on Step 3 ✅ |
| ACA Step 5: governance context assembly | - | 🔲 PENDING | Full on-demand assembly replaces static prompt injection. Requires Step 4. |

---

## Cross-Agent Registry: Live Load

**Current state:** `visibilityFilter()` in `cross-agent.ts` uses a hardcoded `defaultOrgRegistry()` to resolve agent tiers, orgs, and capabilities.

**Known limitation:** This duplicates fleet structure that lives authoritatively in `fleet_agents` + `fleet_orgs` in library.db.

**Planned:** Replace with live-loaded registry from library.db on gateway startup, with the hardcoded version as cold-start fallback only. This eliminates the need to maintain two copies of fleet topology.

---

## Write Authorization for Global-Scope Facts

**Current state:** Facts written with `scope='global'` are readable fleet-wide. The write path has no authorization gate: any agent with HyperMem API access can write a global-scope fact.

**Impact:** Acceptable for trusted single-operator deployments. All agents share the same trust boundary.

**Planned:** Write-authority model that gates global-scope writes to designated agents (council seats or explicitly allowlisted agent IDs).

**Workaround:** Restrict HyperMem API access to trusted agents only.
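
Until the planned model ships, an operator could approximate the gate in their own wrapper around fact writes. The sketch below is purely illustrative: `canWriteFact` and the allowlist do not exist in HyperMem, and the only detail taken from this roadmap is that `scope='global'` is the scope to guard.

```typescript
// Hypothetical operator-side gate; nothing here is a HyperMem API.
// Non-global scopes are allowed unconditionally in this sketch.
function canWriteFact(
  agentId: string,
  scope: string,
  globalWriters: ReadonlySet<string>,
): boolean {
  if (scope !== 'global') return true;
  // Only explicitly allowlisted agent IDs may write global-scope facts.
  return globalWriters.has(agentId);
}
```

Calling this check before every write in the wrapper keeps the allowlist in one place, which should make it easier to swap in the built-in write-authority model later.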
package/docs/SLASH_COMMANDS.md
ADDED
@@ -0,0 +1,93 @@
# Slash Commands

HyperMem supports operator-defined slash commands that hook into session lifecycle management. These are not built into the core runtime; they are wiring points you implement in your OpenClaw plugin or agent config.

---

## `/fresh`: Start an Unwarmed Session

Flushes the current session's hot cache and starts fresh. Long-term memory (facts, vectors, episodes, knowledge graph) is preserved and will re-warm naturally on the next bootstrap.

**Use case:** A user wants to start a new conversation without any session warmth bleeding in from a previous context.

### What gets cleared

| Slot | Cleared |
|---|---|
| `system` | ✅ |
| `identity` | ✅ |
| `history` | ✅ |
| `window` | ✅ |
| `cursor` | ✅ |
| `context` | ✅ |
| `facts` | ✅ |
| `tools` | ✅ |
| `meta` | ✅ |
| Active sessions set | ✅ |
| SQLite facts / knowledge | ❌ preserved |
| Vector store | ❌ preserved |
| Episodes | ❌ preserved |
| Knowledge graph | ❌ preserved |

### Wiring it in (OpenClaw plugin)

```typescript
import { flushSession } from 'hypermem';
import type { CacheLayer } from 'hypermem';

// Slash command handler: returns the reply text for /fresh, or undefined
// when the input is some other command. String IDs are assumed here.
async function handleFresh(
  input: string,
  cache: CacheLayer,
  agentId: string,
  sessionKey: string,
): Promise<string | undefined> {
  if (input.trim() !== '/fresh') return undefined;

  const result = await flushSession(cache, agentId, sessionKey);
  if (result.success) {
    return `Session cache cleared. Starting fresh; long-term memory is preserved.\nFlushed at: ${result.flushedAt}`;
  }
  return `Failed to flush session: ${result.error}`;
}
```

### Using the `SessionFlusher` class

If you need a bound helper (e.g. inside an agent that always operates as a fixed agentId):

```typescript
import { SessionFlusher } from 'hypermem';

const flusher = new SessionFlusher(cache, 'my-agent');

// Later, when /fresh is received:
const result = await flusher.flush(sessionKey);
```

### Aliases

You may want to register multiple names for the same command:

```
/fresh
/newsession
/clearcache
/restart
```

These names are only a convention. HyperMem does not register slash command names; that is up to your OpenClaw plugin config.
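
Alias handling can live in the same place that recognizes `/fresh`. This is a minimal sketch: `FRESH_ALIASES` and `isFreshCommand` are hypothetical names, and the case-insensitive match is a design choice of the sketch rather than documented behavior.

```typescript
// Alias names taken from the list above; the set itself is a hypothetical helper.
const FRESH_ALIASES: ReadonlySet<string> = new Set([
  '/fresh',
  '/newsession',
  '/clearcache',
  '/restart',
]);

// True when the trimmed, lowercased input is one of the registered aliases.
function isFreshCommand(input: string): boolean {
  return FRESH_ALIASES.has(input.trim().toLowerCase());
}
```

A handler can then branch on `isFreshCommand(input)` instead of comparing against `'/fresh'` directly, so adding an alias is a one-line change.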

---

## Planned Commands

| Command | Status | Description |
|---|---|---|
| `/fresh` | ✅ Available | Flush hot cache, preserve long-term memory |
| `/memory` | planned | Show what HyperMem currently has in context |
| `/forget <topic>` | planned | Suppress a topic from future context injection |
| `/recall <query>` | planned | Manually trigger a vector search and display results |

---

## See Also

- [TUNING.md](./TUNING.md): full operator knobs reference
- [MIGRATION.md](./MIGRATION.md): schema version compatibility table
- `SessionFlusher` and `flushSession` exports in `src/session-flusher.ts`