@icex-labs/openclaw-memory-engine 3.3.1 → 3.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,142 +1,152 @@
  # @icex-labs/openclaw-memory-engine

- > Give your AI agent a brain that survives restarts.
+ > Persistent, structured memory for AI agents inspired by MemGPT.

  [![npm](https://img.shields.io/npm/v/@icex-labs/openclaw-memory-engine)](https://www.npmjs.com/package/@icex-labs/openclaw-memory-engine)
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

- A [MemGPT](https://github.com/cpacker/MemGPT)-inspired memory plugin for [OpenClaw](https://openclaw.ai). Your agent gets 12 tools to manage its own memory — what to remember, what to recall, what to forget.
+ An [OpenClaw](https://openclaw.ai) plugin that gives your agent 19 tools to manage its own memory — what to remember, what to recall, what to forget, and what patterns to notice.

- **The problem:** OpenClaw agents wake up fresh every session. Without persistent memory, they forget who you are.
+ ---
+
+ ## The Problem
+
+ OpenClaw agents wake up fresh every session. Without persistent memory, they forget who you are, what you discussed, and what matters to you. Stuffing everything into a system prompt bloats the context window and degrades quality.
+
+ ## The Solution
+
+ Two-tier memory inspired by [MemGPT/Letta](https://github.com/cpacker/MemGPT):

- **The fix:** Two-tier memory architecture:
- - **Core Memory** (~500 tokens) — identity, relationship, preferences. Always loaded.
+ - **Core Memory** (~500 tokens) — user identity, relationship, preferences. Always loaded.
  - **Archival Memory** (unlimited) — facts, decisions, events. Retrieved on demand via hybrid semantic search.

- The agent manages both tiers autonomously using purpose-built tools.
+ Plus: knowledge graph, episodic memory, behavioral reflection, importance scoring with forgetting curves, deduplication, SQLite backend, and a browsable HTML dashboard.
+
+ The agent manages all of this autonomously.

  ---

  ## Install

  ```bash
- git clone git@github.com:icex-labs/openclaw-memory-engine.git ~/.openclaw/extensions/memory-engine
+ openclaw plugins install @icex-labs/openclaw-memory-engine
  bash ~/.openclaw/extensions/memory-engine/setup.sh
- nano ~/.openclaw/workspace/memory/core.json  # fill in your info
  openclaw gateway restart
  ```

- Or from npm:
- ```bash
- npm install -g @icex-labs/openclaw-memory-engine
- ```
-
- `setup.sh` handles everything: enables the plugin in `openclaw.json`, creates template files, installs the daily maintenance cron, and patches your agent's instructions.
+ `setup.sh` handles everything:
+ - Interactive core memory setup (prompts for your name, location, role, etc.)
+ - Configures `openclaw.json`
+ - Installs the daily maintenance scheduler (macOS LaunchAgent / Linux systemd / Windows schtasks)
+ - Patches agent instructions (AGENTS.md)
+ - Registers 4 automated cron jobs (reflection, consolidation, dedup, dashboard)
+ - `--non-interactive` flag available for scripted installs

  ---

- ## How It Works
+ ## Architecture

  ```
- ┌──────────────────────────────────────────────────────┐
-                  Agent Context Window
-
-   Session start → core_memory_read()
-                    └─→ core.json (~500 tokens)
-
-   "Where does Alice's doctor work?"
-    → archival_search("doctor")
-      └─→ keyword match + embedding similarity
-          + recency boost + access frequency
-      → "Dr. Smith, City Medical..."
-
-   Alice says something new
-    → archival_insert(fact, entity, tags)
-      └─→ archival.jsonl + background embedding
-
-   End of conversation
-    → memory_consolidate(summary)
-      └─→ split sentences → infer entities → dedup
-          → batch insert
- └──────────────────────────────────────────────────────┘
+ ┌──────────────────────────────────────────────────────────┐
+                   Agent Context Window
+
+   core_memory_read()       ──→ core.json (~500 tokens)
+
+   archival_search("query") ──→ archival.jsonl (unlimited)
+     keyword (2×) + embedding (5×) + recency + decay
+
+   graph_query("entity")    ──→ graph.jsonl (relations)
+     "who is my doctor?" → traverse knowledge graph
+
+   episode_recall("topic")  ──→ episodes.jsonl
+     "what did we discuss last time?" → conversation recall
+
+   memory_reflect()         ──→ behavioral pattern analysis
+   memory_dashboard()       ──→ browsable HTML report
+   memory_export()          ──→ full backup for migration
+ └──────────────────────────────────────────────────────────┘
  ```

+ ### Multi-Agent Support
+
+ Each agent automatically gets its own memory, based on its configured workspace. The plugin uses OpenClaw's session key to resolve the correct workspace at tool registration time — zero configuration needed.
+
+ Privacy flag: set `"sharing": false` in the plugin config for multi-user setups.
+
  ---

- ## Tools (12)
+ ## Tools (19)

- ### Core Memory — Your Identity
+ ### Core Memory — Identity

- | Tool | What it does |
+ | Tool | Description |
  |------|-------------|
- | `core_memory_read` | Load the identity block. Call every session start. |
- | `core_memory_replace` | Update a field by dot-path (`user.location`, `current_focus`). Auto-parses JSON strings. 3KB hard limit. |
- | `core_memory_append` | Append to an array field (`current_focus`). Creates array if needed. |
+ | `core_memory_read` | Load identity block. Call every session start. |
+ | `core_memory_replace` | Update a field by dot-path (e.g., `user.location`). Auto-parses JSON strings. 3KB hard limit. |
+ | `core_memory_append` | Append to an array field (e.g., `current_focus`). |
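To make the dot-path semantics concrete, here is a minimal, hypothetical sketch of how a dot-path update with JSON auto-parsing could work. The helper name and exact behavior are assumptions for illustration, not the plugin's actual implementation:

```javascript
// Hypothetical sketch of core_memory_replace's dot-path + auto-parse
// semantics. Not the plugin's real code.
function setByDotPath(obj, path, value) {
  // Strings that look like JSON are parsed, so arrays/objects round-trip.
  if (typeof value === "string") {
    try { value = JSON.parse(value); } catch { /* keep as plain string */ }
  }
  const keys = path.split(".");
  let node = obj;
  for (const key of keys.slice(0, -1)) {
    // Create intermediate objects as needed.
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
  return obj;
}

const core = { user: { name: "Alice" } };
setByDotPath(core, "user.location", "New York");          // plain string stays a string
setByDotPath(core, "current_focus", '["quant trading"]'); // JSON string auto-parsed to an array
console.log(JSON.stringify(core));
```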

- Core memory lives in `memory/core.json`:
+ ### Archival Memory — Facts

- ```json
- {
-   "user": { "name": "Alice", "location": "New York", "language": "bilingual" },
-   "relationship": { "dynamic": "intimate companion", "trust": "deep" },
-   "preferences": { "config_rule": "don't touch openclaw.json" },
-   "current_focus": ["quant trading", "immigration case"]
- }
- ```
+ | Tool | Description |
+ |------|-------------|
+ | `archival_insert` | Store a fact with entity, tags, and importance (1-10). Auto-extracts knowledge graph triples. |
+ | `archival_search` | Hybrid search: keyword + semantic + recency + access decay + importance. |
+ | `archival_update` | Correct an existing record by ID. |
+ | `archival_delete` | Remove an outdated record. |
+ | `archival_stats` | Record count, entity/tag distribution, embedding coverage, storage size. |
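For reference, a record in `memory/archival.jsonl` looks roughly like this — the shape is reconstructed from the v3.3 README example plus the fields used by the bundled migration script, so the exact schema in v3.4 may differ:

```json
{"id":"arch-17120-abc","ts":"2026-04-01T00:00:00.000Z","last_accessed":null,"access_count":3,"importance":7,"content":"Alice's doctor is Dr. Smith","entity":"Alice","tags":["health"],"source":"agent"}
```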

- ### Archival Memory — Your Long-Term Storage
+ ### Knowledge Graph — Relations

- | Tool | What it does |
+ | Tool | Description |
  |------|-------------|
- | `archival_insert` | Store a fact. Tags it with `entity` + `tags`. Computes embedding in background. |
- | `archival_search` | Hybrid search: keyword (2×) + semantic similarity (5×) + recency (0-1) + access decay (0-0.5). |
- | `archival_update` | Correct an existing record by ID. Re-indexes embedding. |
- | `archival_delete` | Remove an outdated record. Cleans up embedding cache. |
- | `archival_stats` | Dashboard: record count, embedding coverage, entity/tag distribution, storage size. |
+ | `graph_query` | Traverse from an entity with depth control. |
+ | `graph_add` | Manually add a relation triple. |
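As an illustration only, depth-limited traversal over relation triples could be sketched like this — the triple shape and function name are assumptions, not the plugin's actual `graph.js` API:

```javascript
// Illustrative sketch of depth-limited traversal over (subject,
// relation, object) triples. Not the plugin's real implementation.
const triples = [
  { s: "Alice", r: "has_doctor", o: "Dr. Smith" },
  { s: "Dr. Smith", r: "works_at", o: "City Medical" },
  { s: "Alice", r: "lives_in", o: "New York" },
];

function graphQuery(start, maxDepth) {
  const results = [];
  const seen = new Set([start]);
  let frontier = [start];
  for (let depth = 1; depth <= maxDepth; depth++) {
    const next = [];
    for (const node of frontier) {
      for (const t of triples) {
        if (t.s === node && !seen.has(t.o)) {
          results.push({ ...t, depth }); // record the hop and how far out it is
          seen.add(t.o);
          next.push(t.o);
        }
      }
    }
    frontier = next; // breadth-first: expand one ring at a time
  }
  return results;
}

const hops = graphQuery("Alice", 2);
console.log(hops.map((t) => `${t.s} -${t.r}-> ${t.o}`).join("\n"));
```

With depth 2, "who is my doctor?" style questions can follow `Alice → Dr. Smith → City Medical` in two hops.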

- Each record in `memory/archival.jsonl`:
+ ### Episodic Memory — Conversations

- ```json
- {"id":"arch-17120-abc","ts":"2026-04-01","content":"Alice's doctor is Dr. Smith","entity":"Alice","tags":["health"],"access_count":3}
- ```
+ | Tool | Description |
+ |------|-------------|
+ | `episode_save` | Save conversation summary, decisions, mood, topics. |
+ | `episode_recall` | Search past conversations by topic, or get the most recent N. |

- ### Maintenance — Keep It Clean
+ ### Intelligence

- | Tool | What it does |
+ | Tool | Description |
  |------|-------------|
- | `archival_deduplicate` | Find near-duplicates via embedding cosine similarity (≥0.92). Preview or auto-remove. |
- | `memory_consolidate` | Extract facts from text blocks. Splits by sentence (Chinese/English), infers entity, deduplicates, batch inserts. |
+ | `memory_reflect` | Analyze behavioral patterns: topic trends, time distribution, mood shifts, forgetting candidates. |
+ | `archival_deduplicate` | Find and remove near-duplicates via embedding cosine similarity. |
+ | `memory_consolidate` | Extract structured facts from text. Sentence-level splitting (Chinese + English), entity inference, dedup. |

- ### Backup — Never Lose Your Memory
+ ### Backup & Admin

- | Tool | What it does |
+ | Tool | Description |
  |------|-------------|
- | `memory_export` | Snapshot core + archival + embeddings → single JSON file. |
- | `memory_import` | Restore from snapshot. `merge` (add missing) or `replace` (overwrite all). |
+ | `memory_export` | Full snapshot: core + archival + embeddings → JSON file. |
+ | `memory_import` | Restore from snapshot. Merge or replace mode. |
+ | `memory_migrate` | Migrate from JSONL to SQLite with FTS5 full-text search. |
+ | `memory_dashboard` | Generate a self-contained HTML dashboard. |

  ---

- ## Search Quality
+ ## Search Scoring

- `archival_search` uses four signals:
+ `archival_search` combines five signals:

- | Signal | Weight | How |
- |--------|--------|-----|
+ | Signal | Weight | Description |
+ |--------|--------|-------------|
  | Keyword | 2× per term | Term presence in content + entity + tags |
  | Semantic | 5× | Cosine similarity via OpenAI `text-embedding-3-small` (512d) |
  | Recency | 0–1 | Linear decay over 1 year |
  | Access | 0–0.5 | Boost for recently accessed records |
+ | Importance | 0.5× | Weighted by forgetting curve: `importance × e^(-0.01 × days)` |

- Embeddings are computed on insert and cached in `archival.embeddings.json`. If no OpenAI key is available, search falls back to keyword-only — no errors, just lower quality.
-
- **Cost:** ~$0.02 per 1M tokens with `text-embedding-3-small`. A typical session with 10 inserts + 5 searches costs < $0.001.
+ Falls back to keyword-only search if no OpenAI key is configured. Cost with embeddings: ~$0.001/session.
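A rough sketch of how the five signals in the table could combine into one score. The weights come from the table above; the access-decay window (30 days) and the importance normalization (dividing by the 1-10 scale's maximum) are my assumptions — the real `search.js` may combine the terms differently:

```javascript
// Sketch only: combining the five search signals from the table.
function score(record, queryTerms, cosine, now = Date.now()) {
  const haystack = [record.content, record.entity, ...record.tags].join(" ").toLowerCase();
  // Keyword: 2x per matching term across content + entity + tags.
  const keyword = 2 * queryTerms.filter((t) => haystack.includes(t.toLowerCase())).length;
  // Semantic: 5x the cosine similarity of query vs record embedding.
  const semantic = 5 * cosine;
  // Recency: linear decay from 1 to 0 over one year.
  const days = (now - Date.parse(record.ts)) / 86400000;
  const recency = Math.max(0, 1 - days / 365);
  // Access: up to 0.5 for recently accessed records (30-day window assumed).
  const daysSinceAccess = record.last_accessed
    ? (now - Date.parse(record.last_accessed)) / 86400000
    : Infinity;
  const access = daysSinceAccess < 30 ? 0.5 * (1 - daysSinceAccess / 30) : 0;
  // Importance: 0.5x the forgetting-curve value, normalized by the 1-10 scale.
  const importance = 0.5 * ((record.importance ?? 5) * Math.exp(-0.01 * days)) / 10;
  return keyword + semantic + recency + access + importance;
}

const rec = {
  content: "Alice's doctor is Dr. Smith",
  entity: "Alice",
  tags: ["health"],
  ts: new Date(Date.now() - 30 * 86400000).toISOString(), // 30 days old
  last_accessed: null,
  importance: 7,
};
const s = score(rec, ["doctor"], 0.8);
console.log(s.toFixed(2));
```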

  ---

  ## Configuration

  ```json
- // openclaw.json
  {
    "plugins": {
      "allow": ["memory-engine"],
@@ -144,8 +154,8 @@ Embeddings are computed on insert and cached in `archival.embeddings.json`. If n
      "memory-engine": {
        "enabled": true,
        "config": {
-         "workspace": "/path/to/workspace",
-         "coreSizeLimit": 3072
+         "coreSizeLimit": 3072,
+         "sharing": false
        }
      }
    }
@@ -155,49 +165,23 @@ Embeddings are computed on insert and cached in `archival.embeddings.json`. If n

  | Option | Default | Description |
  |--------|---------|-------------|
- | `workspace` | Auto-resolved | Workspace directory path |
- | `coreSizeLimit` | `3072` (3KB) | Max bytes for core.json |
-
- **Requires:** `OPENAI_API_KEY` in environment for semantic search. Without it, keyword search still works.
-
- ---
-
- ## Agent Instructions
+ | `workspace` | Auto-resolved | Override workspace directory |
+ | `coreSizeLimit` | `3072` (3KB) | Max core.json size |
+ | `sharing` | `true` | Cross-agent memory sharing. Set `false` for multi-user privacy. |

- Add to your `AGENTS.md` or system prompt (done automatically by `setup.sh`):
-
- ```markdown
- ## Every Session
- 1. Call `core_memory_read` — load your identity
- 2. When you learn something important → `archival_insert`
- 3. When you need details → `archival_search` before guessing
- 4. When facts change → `core_memory_replace`
- 5. End of conversation → `memory_consolidate` with key points
- ```
+ Semantic search requires `OPENAI_API_KEY` in the environment; without it, keyword search still works.

  ---

- ## Daily Maintenance
-
- `extras/memory-maintenance.sh` runs daily at 3am (installed as a LaunchAgent by `setup.sh`):
+ ## Automated Maintenance

- - Checks core.json size (warns >4KB, critical >5KB)
- - Merges 7-day-old daily logs into weekly summaries
- - Archives 60-day-old weekly summaries
- - Alerts written to `memory/maintenance-alerts.json`
-
- ---
-
- ## Backup & Migration
-
- ```bash
- # Export
- openclaw agent -m "memory_export"
- # → memory/export-2026-04-01.json
-
- # Import on new machine
- openclaw agent -m "memory_import input_path='path/to/export.json' mode='replace'"
- ```
+ | Schedule | Job | What it does |
+ |----------|-----|-------------|
+ | Daily 9:00am | Reflection | Analyze memory patterns, store observations |
+ | Every 6h | Consolidation | Extract missed facts from daily logs |
+ | Weekly, Sunday | Deduplication | Clean near-duplicate records |
+ | Daily 9:30am | Dashboard | Refresh browsable HTML report |
+ | Daily 3:00am | File cleanup | Merge old logs into weekly summaries, archive old summaries |

  ---

@@ -205,50 +189,41 @@ openclaw agent -m "memory_import input_path='path/to/export.json' mode='replace'

  ```
  memory-engine/
- ├── index.js            # Plugin entry — tool registration only (250 lines)
+ ├── index.js            # Plugin entry — 19 tools (factory pattern)
  ├── lib/
- │   ├── paths.js        # Constants + path resolution
- │   ├── core.js         # Core memory CRUD + dot-path + auto-parse
- │   ├── archival.js     # Archival JSONL CRUD + in-memory cache
- │   ├── embedding.js    # OpenAI embedding API + file cache
- │   ├── search.js       # Hybrid four-signal search
- │   ├── consolidate.js  # Text → structured facts extraction
- │   ├── dedup.js        # Embedding similarity dedup
- │   └── backup.js       # Export/import
+ │   ├── paths.js        # Constants, workspace resolution
+ │   ├── core.js         # Core memory CRUD + auto-parse
+ │   ├── archival.js     # JSONL storage + in-memory cache
+ │   ├── embedding.js    # OpenAI embedding API + cache
+ │   ├── search.js       # Hybrid 5-signal search
+ │   ├── graph.js        # Knowledge graph: triples + traversal
+ │   ├── episodes.js     # Episodic memory: save + recall
+ │   ├── reflection.js   # Statistical pattern analysis
+ │   ├── consolidate.js  # Text → facts extraction
+ │   ├── dedup.js        # Embedding similarity dedup
+ │   ├── backup.js       # Export / import
+ │   ├── store-sqlite.js # SQLite backend (FTS5)
+ │   └── dashboard.js    # HTML dashboard generator
  ├── extras/
- │   └── memory-maintenance.sh
- ├── setup.sh            # One-command install
- ├── .claude/CLAUDE.md   # Dev guide for Claude Code
- ├── package.json
+ │   ├── memory-maintenance.sh
+ │   └── auto-consolidation-crons.json
+ ├── setup.sh            # One-command install
+ ├── .claude/CLAUDE.md   # Dev guide
+ ├── ROADMAP.md
  ├── openclaw.plugin.json
- └── README.md
+ └── package.json
  ```

- ---
+ ## Platforms

- ## Roadmap
-
- - [x] Core memory with size guard and auto-parse
- - [x] Archival CRUD with in-memory index
- - [x] Hybrid search (keyword + embedding + recency + access decay)
- - [x] Auto-extract facts from text
- - [x] Embedding-based deduplication
- - [x] Full backup/restore
- - [x] Modular codebase (8 focused modules)
- - [ ] LanceDB / SQLite backend for 50K+ records
- - [ ] Cross-agent memory sharing
- - [ ] Scheduled auto-consolidation via OpenClaw cron
- - [ ] Memory importance scoring (agent rates memories 1-10)
- - [ ] Forgetting curve — auto-archive unaccessed memories after N days
- - [ ] ClawHub publishing
- - [ ] Web dashboard for memory browsing
+ | Platform | Scheduler | Status |
+ |----------|-----------|--------|
+ | macOS | LaunchAgent | Full support |
+ | Linux | systemd timer | Full support |
+ | Windows | schtasks | Guided setup |

  ---

  ## License

- MIT
-
- ---
-
- Built for [OpenClaw](https://openclaw.ai). Inspired by [MemGPT/Letta](https://github.com/cpacker/MemGPT).
+ MIT — Built for [OpenClaw](https://openclaw.ai). Inspired by [MemGPT/Letta](https://github.com/cpacker/MemGPT).
package/extras/migrate-legacy.mjs ADDED
@@ -0,0 +1,165 @@
+ #!/usr/bin/env node
+ /**
+  * migrate-legacy.mjs — Import existing file-based memory into archival.jsonl
+  *
+  * Scans workspace for: MEMORY.md, memory/*.md, memory/weekly/*.md, memory/topics/*.md
+  * Extracts facts, deduplicates, and appends to memory/archival.jsonl.
+  *
+  * Usage: node migrate-legacy.mjs [workspace_path]
+  */
+
+ import { readFileSync, appendFileSync, existsSync, readdirSync } from "node:fs";
+ import { join, basename } from "node:path";
+
+ const WS = process.argv[2] || process.env.OPENCLAW_WORKSPACE || join(process.env.HOME || "/tmp", ".openclaw", "workspace");
+ const ARCHIVAL = join(WS, "memory", "archival.jsonl");
+
+ console.log(`🧠 Legacy memory migration`);
+ console.log(`   Workspace: ${WS}`);
+ console.log(`   Archival:  ${ARCHIVAL}`);
+ console.log(``);
+
+ // Load existing archival for dedup (skip records without content so the
+ // overlap check below never iterates over undefined)
+ const existingContent = new Set();
+ if (existsSync(ARCHIVAL)) {
+   for (const line of readFileSync(ARCHIVAL, "utf-8").trim().split("\n").filter(Boolean)) {
+     try { const c = JSON.parse(line).content; if (c) existingContent.add(c.toLowerCase()); } catch {}
+   }
+ }
+ console.log(`Existing archival: ${existingContent.size} records`);
+
+ // Generic entity inference (no personal data)
+ const ENTITY_PATTERNS = [
+   [/\b(IBKR|Interactive Brokers|NAV|portfolio|投资|HELOC|mortgage|finance)/i, "finance"],
+   [/\b(immigration|PR|IRCC|CBSA|visa|律师|lawyer|petition)/i, "immigration"],
+   [/\b(quant|trading|backtest|signal|portfolio|Sharpe)/i, "trading"],
+   [/\b(doctor|医生|hospital|health|medication|药|体检|clinic)/i, "health"],
+   [/\b(car|vehicle|SUV|sedan|truck)\b/i, "vehicles"],
+   [/\b(k3d|ArgoCD|Helm|kubectl|GitOps|cluster|deploy|CI|CD)/i, "infrastructure"],
+   [/\b(OpenClaw|gateway|plugin|session|agent|memory|compaction)/i, "openclaw"],
+   [/\b(Discord|Telegram|Slack|bot|channel)/i, "messaging"],
+   [/\b(school|university|college|学校|education)/i, "education"],
+   [/\b(house|home|property|rent|房)/i, "property"],
+   [/\b(lawyer|legal|court|lawsuit|案|诉)/i, "legal"],
+ ];
+
+ function inferEntity(text) {
+   for (const [pat, name] of ENTITY_PATTERNS) {
+     if (pat.test(text)) return name;
+   }
+   return "general";
+ }
+
+ function extractFacts(text) {
+   const facts = [];
+   for (const line of text.split(/\n/).map((l) => l.trim()).filter(Boolean)) {
+     if (line.startsWith("#") || line.length < 15) continue;
+     if (/^(##|===|---|\*\*\*|```|>|\|)/.test(line)) continue;
+     const sentences = line.split(/(?<=[。.!!??;;])\s*/).filter(Boolean);
+     for (const s of sentences) {
+       const clean = s.replace(/^[-*•]\s*/, "").replace(/^\d+\.\s*/, "").trim();
+       if (clean.length >= 15 && clean.length <= 500) facts.push(clean);
+     }
+   }
+   return facts;
+ }
+
+ // Collect all legacy files
+ const files = [];
+
+ // MEMORY.md
+ const memoryMd = join(WS, "MEMORY.md");
+ if (existsSync(memoryMd)) files.push({ path: memoryMd, tag: "long-term" });
+
+ // memory/*.md (daily logs)
+ const memDir = join(WS, "memory");
+ if (existsSync(memDir)) {
+   for (const f of readdirSync(memDir).filter((f) => /\.md$/.test(f) && f !== ".abstract")) {
+     files.push({ path: join(memDir, f), tag: "daily" });
+   }
+ }
+
+ // memory/weekly/*.md
+ const weeklyDir = join(WS, "memory", "weekly");
+ if (existsSync(weeklyDir)) {
+   for (const f of readdirSync(weeklyDir).filter((f) => f.endsWith(".md"))) {
+     files.push({ path: join(weeklyDir, f), tag: "weekly" });
+   }
+ }
+
+ // memory/topics/*.md
+ const topicDir = join(WS, "memory", "topics");
+ if (existsSync(topicDir)) {
+   for (const f of readdirSync(topicDir).filter((f) => f.endsWith(".md"))) {
+     files.push({ path: join(topicDir, f), tag: "topic" });
+   }
+ }
+
+ if (files.length === 0) {
+   console.log("\nNo legacy memory files found. Nothing to migrate.");
+   process.exit(0);
+ }
+
+ console.log(`Found ${files.length} files to scan\n`);
+
+ let inserted = 0;
+ let skipped = 0;
+
+ for (const { path, tag } of files) {
+   const content = readFileSync(path, "utf-8");
+   const facts = extractFacts(content);
+   let fileInserted = 0;
+
+   for (const fact of facts) {
+     const factLower = fact.toLowerCase();
+
+     // Exact dedup
+     if (existingContent.has(factLower)) {
+       skipped++;
+       continue;
+     }
+
+     // Keyword overlap dedup (>75% overlap = skip)
+     let isDupe = false;
+     const factWords = new Set(factLower.split(/\s+/).filter((w) => w.length > 2));
+     if (factWords.size > 0) {
+       for (const ex of existingContent) {
+         const exWords = new Set(ex.split(/\s+/).filter((w) => w.length > 2));
+         let overlap = 0;
+         for (const w of factWords) {
+           if (exWords.has(w)) overlap++;
+         }
+         if (overlap / factWords.size > 0.75) {
+           isDupe = true;
+           break;
+         }
+       }
+     }
+     if (isDupe) {
+       skipped++;
+       continue;
+     }
+
+     const record = {
+       id: `arch-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
+       ts: new Date().toISOString(),
+       last_accessed: null,
+       access_count: 0,
+       importance: 5,
+       content: fact,
+       entity: inferEntity(fact),
+       tags: [tag],
+       source: "migration",
+     };
+
+     appendFileSync(ARCHIVAL, JSON.stringify(record) + "\n", "utf-8");
+     existingContent.add(factLower);
+     inserted++;
+     fileInserted++;
+   }
+
+   if (fileInserted > 0) console.log(`   ${basename(path)}: +${fileInserted} facts`);
+ }
+
+ console.log(`\n✅ Migration complete: ${inserted} facts imported, ${skipped} skipped (duplicates)`);
+ console.log(`Total archival: ${existingContent.size} records`);
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@icex-labs/openclaw-memory-engine",
-   "version": "3.3.1",
+   "version": "3.4.0",
    "description": "MemGPT-style hierarchical memory plugin for OpenClaw — core memory block + archival storage with semantic search",
    "type": "module",
    "main": "index.js",
package/setup.sh CHANGED
@@ -106,6 +106,40 @@ else
    echo "⏭️  archival.jsonl already exists ($lines records)"
  fi

+ # --- 3b. Migrate legacy memory files into archival ---
+ if command -v node &>/dev/null && [ -f "$PLUGIN_DIR/extras/migrate-legacy.mjs" ]; then
+   # Check if there are legacy files to migrate
+   legacy_count=0
+   [ -f "$WORKSPACE/MEMORY.md" ] && legacy_count=$((legacy_count + 1))
+   legacy_count=$((legacy_count + $(ls "$MEMORY_DIR"/*.md 2>/dev/null | wc -l | tr -d ' ')))
+   legacy_count=$((legacy_count + $(ls "$MEMORY_DIR"/weekly/*.md 2>/dev/null | wc -l | tr -d ' ')))
+   legacy_count=$((legacy_count + $(ls "$MEMORY_DIR"/topics/*.md 2>/dev/null | wc -l | tr -d ' ')))
+
+   archival_count=$({ wc -l < "$MEMORY_DIR/archival.jsonl" 2>/dev/null || echo "0"; } | tr -d ' ')
+
+   if [ "$legacy_count" -gt 0 ] && [ "$archival_count" -lt 10 ]; then
+     echo ""
+     echo "📦 Found $legacy_count legacy memory files (MEMORY.md, daily logs, weekly summaries, topics)."
+     if $NON_INTERACTIVE; then
+       echo "   Migrating automatically..."
+       node "$PLUGIN_DIR/extras/migrate-legacy.mjs" "$WORKSPACE" 2>&1 | tail -3
+     else
+       printf "   Migrate into archival memory? [Y/n]: "
+       read -r migrate_answer
+       if [ "${migrate_answer:-Y}" != "n" ] && [ "${migrate_answer:-Y}" != "N" ]; then
+         node "$PLUGIN_DIR/extras/migrate-legacy.mjs" "$WORKSPACE" 2>&1 | tail -5
+       else
+         echo "⏭️  Skipping migration. Run manually later: node $PLUGIN_DIR/extras/migrate-legacy.mjs $WORKSPACE"
+       fi
+     fi
+     echo ""
+   else
+     if [ "$archival_count" -gt 10 ]; then
+       echo "⏭️  Archival already has $archival_count records, skipping migration"
+     fi
+   fi
+ fi
+
  # --- 4. Install memory-maintenance.sh ---
  SCRIPTS_DIR="$WORKSPACE/scripts"
  mkdir -p "$SCRIPTS_DIR"