@icex-labs/openclaw-memory-engine 3.3.1 → 3.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +129 -154
- package/extras/migrate-legacy.mjs +165 -0
- package/package.json +1 -1
- package/setup.sh +34 -0
package/README.md
CHANGED
@@ -1,142 +1,152 @@
 # @icex-labs/openclaw-memory-engine
 
->
+> Persistent, structured memory for AI agents — inspired by MemGPT.
 
 [](https://www.npmjs.com/package/@icex-labs/openclaw-memory-engine)
 [](LICENSE)
 
-
+An [OpenClaw](https://openclaw.ai) plugin that gives your agent 19 tools to manage its own memory — what to remember, what to recall, what to forget, and what patterns to notice.
 
-
+---
+
+## The Problem
+
+OpenClaw agents wake up fresh every session. Without persistent memory, they forget who you are, what you discussed, and what matters to you. Stuffing everything into a system prompt bloats the context window and degrades quality.
+
+## The Solution
+
+Two-tier memory inspired by [MemGPT/Letta](https://github.com/cpacker/MemGPT):
 
-**
-- **Core Memory** (~500 tokens) — identity, relationship, preferences. Always loaded.
+- **Core Memory** (~500 tokens) — user identity, relationship, preferences. Always loaded.
 - **Archival Memory** (unlimited) — facts, decisions, events. Retrieved on demand via hybrid semantic search.
 
-
+Plus: knowledge graph, episodic memory, behavioral reflection, importance scoring with forgetting curves, deduplication, SQLite backend, and a browsable HTML dashboard.
+
+The agent manages all of this autonomously.
 
 ---
 
 ## Install
 
 ```bash
-
+openclaw plugins install @icex-labs/openclaw-memory-engine
 bash ~/.openclaw/extensions/memory-engine/setup.sh
-nano ~/.openclaw/workspace/memory/core.json   # fill in your info
 openclaw gateway restart
 ```
 
-
-
-
-
-
-
+`setup.sh` handles everything:
+- Interactive core memory setup (prompts for your name, location, role, etc.)
+- Configures `openclaw.json`
+- Installs daily maintenance scheduler (macOS LaunchAgent / Linux systemd / Windows schtasks)
+- Patches agent instructions (AGENTS.md)
+- Registers 4 automated cron jobs (reflection, consolidation, dedup, dashboard)
+- `--non-interactive` flag available for scripted installs
 
 ---
 
-##
+## Architecture
 
 ```
-
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-│
-
-│   └─→ split sentences → infer entities → dedup │
-│       → batch insert │
-└──────────────────────────────────────────────────────┘
+┌──────────────────────────────────────────────────────────┐
+│                   Agent Context Window                   │
+│                                                          │
+│  core_memory_read() ──→ core.json (~500 tokens)          │
+│                                                          │
+│  archival_search("query") ──→ archival.jsonl (unlimited) │
+│    keyword (2×) + embedding (5×) + recency + decay       │
+│                                                          │
+│  graph_query("entity") ──→ graph.jsonl (relations)       │
+│    "who is my doctor?" → traverse knowledge graph        │
+│                                                          │
+│  episode_recall("topic") ──→ episodes.jsonl              │
+│    "what did we discuss last time?" → conversation recall│
+│                                                          │
+│  memory_reflect() ──→ behavioral pattern analysis        │
+│  memory_dashboard() ──→ browsable HTML report            │
+│  memory_export() ──→ full backup for migration           │
+└──────────────────────────────────────────────────────────┘
 ```
 
+### Multi-Agent Support
+
+Each agent automatically gets its own memory based on its configured workspace. Uses OpenClaw's session key to resolve the correct workspace at tool registration time — zero configuration needed.
+
+Privacy flag: `"sharing": false` in plugin config for multi-user setups.
+
 ---
 
-## Tools (
+## Tools (19)
 
-### Core Memory —
+### Core Memory — Identity
 
-| Tool |
+| Tool | Description |
 |------|-------------|
-| `core_memory_read` | Load
-| `core_memory_replace` | Update a field by dot-path (`user.location
-| `core_memory_append` | Append to an array field (`current_focus`).
+| `core_memory_read` | Load identity block. Call every session start. |
+| `core_memory_replace` | Update a field by dot-path (e.g., `user.location`). Auto-parses JSON strings. 3KB hard limit. |
+| `core_memory_append` | Append to an array field (e.g., `current_focus`). |
 
-
+### Archival Memory — Facts
 
-
-
-
-
-
-
-
-```
+| Tool | Description |
+|------|-------------|
+| `archival_insert` | Store a fact with entity, tags, and importance (1-10). Auto-extracts knowledge graph triples. |
+| `archival_search` | Hybrid search: keyword + semantic + recency + access decay + importance. |
+| `archival_update` | Correct an existing record by ID. |
+| `archival_delete` | Remove an outdated record. |
+| `archival_stats` | Record count, entity/tag distribution, embedding coverage, storage size. |
 
-###
+### Knowledge Graph — Relations
 
-| Tool |
+| Tool | Description |
 |------|-------------|
-| `
-| `
-| `archival_update` | Correct an existing record by ID. Re-indexes embedding. |
-| `archival_delete` | Remove an outdated record. Cleans up embedding cache. |
-| `archival_stats` | Dashboard: record count, embedding coverage, entity/tag distribution, storage size. |
+| `graph_query` | Traverse from entity with depth control. |
+| `graph_add` | Manually add a relation triple. |
 
-
+### Episodic Memory — Conversations
 
-
-
-
+| Tool | Description |
+|------|-------------|
+| `episode_save` | Save conversation summary, decisions, mood, topics. |
+| `episode_recall` | Search past conversations by topic or get recent N. |
 
-###
+### Intelligence
 
-| Tool |
+| Tool | Description |
 |------|-------------|
-| `
-| `
+| `memory_reflect` | Analyze behavioral patterns: topic trends, time distribution, mood shifts, forgetting candidates. |
+| `archival_deduplicate` | Find and remove near-duplicates via embedding cosine similarity. |
+| `memory_consolidate` | Extract structured facts from text. Sentence-level splitting (Chinese + English), entity inference, dedup. |
 
-### Backup
+### Backup & Admin
 
-| Tool |
+| Tool | Description |
 |------|-------------|
-| `memory_export` |
-| `memory_import` | Restore from snapshot.
+| `memory_export` | Full snapshot: core + archival + embeddings → JSON file. |
+| `memory_import` | Restore from snapshot. Merge or replace mode. |
+| `memory_migrate` | Migrate from JSONL to SQLite with FTS5 full-text search. |
+| `memory_dashboard` | Generate self-contained HTML dashboard. |
 
 ---
 
-## Search
+## Search Scoring
 
-`archival_search`
+`archival_search` combines five signals:
 
-| Signal | Weight |
-
+| Signal | Weight | Description |
+|--------|--------|-------------|
 | Keyword | 2× per term | Term presence in content + entity + tags |
 | Semantic | 5× | Cosine similarity via OpenAI `text-embedding-3-small` (512d) |
 | Recency | 0–1 | Linear decay over 1 year |
 | Access | 0–0.5 | Boost for recently accessed records |
+| Importance | 0.5× | Weighted by forgetting curve: `importance × e^(-0.01 × days)` |
 
-
-
-**Cost:** ~$0.02 per 1M tokens with `text-embedding-3-small`. A typical session with 10 inserts + 5 searches costs < $0.001.
+Falls back to keyword-only if no OpenAI key is configured. Cost with embeddings: ~$0.001/session.
 
 ---
 
 ## Configuration
 
 ```json
-// openclaw.json
 {
   "plugins": {
     "allow": ["memory-engine"],
@@ -144,8 +154,8 @@ Embeddings are computed on insert and cached in `archival.embeddings.json`. If n
     "memory-engine": {
       "enabled": true,
       "config": {
-        "
-        "
+        "coreSizeLimit": 3072,
+        "sharing": false
      }
    }
  }
@@ -155,49 +165,23 @@ Embeddings are computed on insert and cached in `archival.embeddings.json`. If n
 
 | Option | Default | Description |
 |--------|---------|-------------|
-| `workspace` | Auto-resolved |
-| `coreSizeLimit` | `3072` (3KB) | Max
-
-**Requires:** `OPENAI_API_KEY` in environment for semantic search. Without it, keyword search still works.
-
----
-
-## Agent Instructions
+| `workspace` | Auto-resolved | Override workspace directory |
+| `coreSizeLimit` | `3072` (3KB) | Max core.json size |
+| `sharing` | `true` | Cross-agent memory sharing. Set `false` for multi-user privacy. |
 
-
-
-```markdown
-## Every Session
-1. Call `core_memory_read` — load your identity
-2. When you learn something important → `archival_insert`
-3. When you need details → `archival_search` before guessing
-4. When facts change → `core_memory_replace`
-5. End of conversation → `memory_consolidate` with key points
-```
+Semantic search requires `OPENAI_API_KEY` in environment (optional).
 
 ---
 
-##
-
-`extras/memory-maintenance.sh` runs daily at 3am (installed as a LaunchAgent by `setup.sh`):
+## Automated Maintenance
 
-
-
-
-
-
-
-
-## Backup & Migration
-
-```bash
-# Export
-openclaw agent -m "memory_export"
-# → memory/export-2026-04-01.json
-
-# Import on new machine
-openclaw agent -m "memory_import input_path='path/to/export.json' mode='replace'"
-```
+| Schedule | Job | What it does |
+|----------|-----|-------------|
+| Daily 9:00am | Reflection | Analyze memory patterns, store observations |
+| Every 6h | Consolidation | Extract missed facts from daily logs |
+| Weekly Sunday | Deduplication | Clean near-duplicate records |
+| Daily 9:30am | Dashboard | Refresh browsable HTML report |
+| Daily 3:00am | File cleanup | Merge old logs into weekly summaries, archive old summaries |
 
 ---
 
@@ -205,50 +189,41 @@ openclaw agent -m "memory_import input_path='path/to/export.json' mode='replace'
 
 ```
 memory-engine/
-├── index.js
+├── index.js                     # Plugin entry — 19 tools (factory pattern)
 ├── lib/
-│   ├── paths.js
-│   ├── core.js
-│   ├── archival.js
-│   ├── embedding.js
-│   ├── search.js
-│   ├──
-│   ├──
-│
+│   ├── paths.js                 # Constants, workspace resolution
+│   ├── core.js                  # Core memory CRUD + auto-parse
+│   ├── archival.js              # JSONL storage + in-memory cache
+│   ├── embedding.js             # OpenAI embedding API + cache
+│   ├── search.js                # Hybrid 5-signal search
+│   ├── graph.js                 # Knowledge graph: triples + traversal
+│   ├── episodes.js              # Episodic memory: save + recall
+│   ├── reflection.js            # Statistical pattern analysis
+│   ├── consolidate.js           # Text → facts extraction
+│   ├── dedup.js                 # Embedding similarity dedup
+│   ├── backup.js                # Export / import
+│   ├── store-sqlite.js          # SQLite backend (FTS5)
+│   └── dashboard.js             # HTML dashboard generator
 ├── extras/
-│
-
-├── .
-├──
+│   ├── memory-maintenance.sh
+│   └── auto-consolidation-crons.json
+├── setup.sh                     # One-command install
+├── .claude/CLAUDE.md            # Dev guide
+├── ROADMAP.md
 ├── openclaw.plugin.json
-└──
+└── package.json
 ```
 
-
+## Platforms
 
-
-
-
-
-
-- [x] Auto-extract facts from text
-- [x] Embedding-based deduplication
-- [x] Full backup/restore
-- [x] Modular codebase (8 focused modules)
-- [ ] LanceDB / SQLite backend for 50K+ records
-- [ ] Cross-agent memory sharing
-- [ ] Scheduled auto-consolidation via OpenClaw cron
-- [ ] Memory importance scoring (agent rates memories 1-10)
-- [ ] Forgetting curve — auto-archive unaccessed memories after N days
-- [ ] ClawHub publishing
-- [ ] Web dashboard for memory browsing
+| Platform | Scheduler | Status |
+|----------|-----------|--------|
+| macOS | LaunchAgent | Full support |
+| Linux | systemd timer | Full support |
+| Windows | schtasks | Guided setup |
 
 ---
 
 ## License
 
-MIT
-
----
-
-Built for [OpenClaw](https://openclaw.ai). Inspired by [MemGPT/Letta](https://github.com/cpacker/MemGPT).
+MIT — Built for [OpenClaw](https://openclaw.ai). Inspired by [MemGPT/Letta](https://github.com/cpacker/MemGPT).
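For reference, the five-signal ranking in the README's "Search Scoring" table could be sketched roughly like this. This is illustrative only — the function names, the 30-day access window, and the importance normalization are our assumptions, not the plugin's actual internals:

```javascript
// Minimal sketch of the five-signal ranking (keyword, semantic, recency,
// access, importance) described in the Search Scoring table above.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function scoreRecord(record, queryTerms, queryEmbedding, now = Date.now()) {
  const haystack =
    `${record.content} ${record.entity} ${(record.tags || []).join(" ")}`.toLowerCase();
  // Keyword: 2x per query term found in content + entity + tags.
  let score = 2 * queryTerms.filter((t) => haystack.includes(t.toLowerCase())).length;
  // Semantic: 5x cosine similarity, only when embeddings are available.
  if (record.embedding && queryEmbedding) {
    score += 5 * cosine(record.embedding, queryEmbedding);
  }
  // Recency: linear decay from 1 to 0 over one year.
  const ageDays = (now - Date.parse(record.ts)) / 86_400_000;
  score += Math.max(0, 1 - ageDays / 365);
  // Access: up to 0.5 boost for recently accessed records (30-day window assumed).
  if (record.last_accessed) {
    const accessDays = (now - Date.parse(record.last_accessed)) / 86_400_000;
    score += Math.max(0, 0.5 * (1 - accessDays / 30));
  }
  // Importance: forgetting curve importance × e^(-0.01 × days), weighted 0.5x
  // (normalizing importance from 1-10 down to 0-1 is an assumption).
  score += 0.5 * ((record.importance || 5) / 10) * Math.exp(-0.01 * ageDays);
  return score;
}
```

With weights like these, the semantic signal dominates when embeddings exist, while keyword matches still rank usefully in the keyword-only fallback mode the README describes.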
package/extras/migrate-legacy.mjs
ADDED

@@ -0,0 +1,165 @@
+#!/usr/bin/env node
+/**
+ * migrate-legacy.mjs — Import existing file-based memory into archival.jsonl
+ *
+ * Scans workspace for: MEMORY.md, memory/*.md, memory/weekly/*.md, memory/topics/*.md
+ * Extracts facts, deduplicates, and appends to memory/archival.jsonl.
+ *
+ * Usage: node migrate-legacy.mjs [workspace_path]
+ */
+
+import { readFileSync, appendFileSync, existsSync, readdirSync } from "node:fs";
+import { join, basename } from "node:path";
+
+const WS = process.argv[2] || process.env.OPENCLAW_WORKSPACE || join(process.env.HOME || "/tmp", ".openclaw", "workspace");
+const ARCHIVAL = join(WS, "memory", "archival.jsonl");
+
+console.log(`🧠 Legacy memory migration`);
+console.log(`   Workspace: ${WS}`);
+console.log(`   Archival:  ${ARCHIVAL}`);
+console.log(``);
+
+// Load existing archival for dedup
+const existingContent = new Set();
+if (existsSync(ARCHIVAL)) {
+  for (const line of readFileSync(ARCHIVAL, "utf-8").trim().split("\n").filter(Boolean)) {
+    try { existingContent.add(JSON.parse(line).content?.toLowerCase()); } catch {}
+  }
+}
+console.log(`Existing archival: ${existingContent.size} records`);
+
+// Generic entity inference (no personal data)
+const ENTITY_PATTERNS = [
+  [/\b(IBKR|Interactive Brokers|NAV|portfolio|投资|HELOC|mortgage|finance)/i, "finance"],
+  [/\b(immigration|PR|IRCC|CBSA|visa|律师|lawyer|petition)/i, "immigration"],
+  [/\b(quant|trading|backtest|signal|portfolio|Sharpe)/i, "trading"],
+  [/\b(doctor|医生|hospital|health|medication|药|体检|clinic)/i, "health"],
+  [/\b(car|vehicle|SUV|sedan|truck)\b/i, "vehicles"],
+  [/\b(k3d|ArgoCD|Helm|kubectl|GitOps|cluster|deploy|CI|CD)/i, "infrastructure"],
+  [/\b(OpenClaw|gateway|plugin|session|agent|memory|compaction)/i, "openclaw"],
+  [/\b(Discord|Telegram|Slack|bot|channel)/i, "messaging"],
+  [/\b(school|university|college|学校|education)/i, "education"],
+  [/\b(house|home|property|rent|房)/i, "property"],
+  [/\b(lawyer|legal|court|lawsuit|案|诉)/i, "legal"],
+];
+
+function inferEntity(text) {
+  for (const [pat, name] of ENTITY_PATTERNS) {
+    if (pat.test(text)) return name;
+  }
+  return "general";
+}
+
+function extractFacts(text) {
+  const facts = [];
+  for (const line of text.split(/\n/).map((l) => l.trim()).filter(Boolean)) {
+    if (line.startsWith("#") || line.length < 15) continue;
+    if (/^(##|===|---|\*\*\*|```|>|\|)/.test(line)) continue;
+    const sentences = line.split(/(?<=[。.!!??;;])\s*/).filter(Boolean);
+    for (const s of sentences) {
+      const clean = s.replace(/^[-*•]\s*/, "").replace(/^\d+\.\s*/, "").trim();
+      if (clean.length >= 15 && clean.length <= 500) facts.push(clean);
+    }
+  }
+  return facts;
+}
+
+// Collect all legacy files
+const files = [];
+
+// MEMORY.md
+const memoryMd = join(WS, "MEMORY.md");
+if (existsSync(memoryMd)) files.push({ path: memoryMd, tag: "long-term" });
+
+// memory/*.md (daily logs)
+const memDir = join(WS, "memory");
+if (existsSync(memDir)) {
+  for (const f of readdirSync(memDir).filter((f) => /\.md$/.test(f) && f !== ".abstract")) {
+    files.push({ path: join(memDir, f), tag: "daily" });
+  }
+}
+
+// memory/weekly/*.md
+const weeklyDir = join(WS, "memory", "weekly");
+if (existsSync(weeklyDir)) {
+  for (const f of readdirSync(weeklyDir).filter((f) => f.endsWith(".md"))) {
+    files.push({ path: join(weeklyDir, f), tag: "weekly" });
+  }
+}
+
+// memory/topics/*.md
+const topicDir = join(WS, "memory", "topics");
+if (existsSync(topicDir)) {
+  for (const f of readdirSync(topicDir).filter((f) => f.endsWith(".md"))) {
+    files.push({ path: join(topicDir, f), tag: "topic" });
+  }
+}
+
+if (files.length === 0) {
+  console.log("\nNo legacy memory files found. Nothing to migrate.");
+  process.exit(0);
+}
+
+console.log(`Found ${files.length} files to scan\n`);
+
+let inserted = 0;
+let skipped = 0;
+
+for (const { path, tag } of files) {
+  const content = readFileSync(path, "utf-8");
+  const facts = extractFacts(content);
+  let fileInserted = 0;
+
+  for (const fact of facts) {
+    const factLower = fact.toLowerCase();
+
+    // Exact dedup
+    if (existingContent.has(factLower)) {
+      skipped++;
+      continue;
+    }
+
+    // Keyword overlap dedup (>75% overlap = skip)
+    let isDupe = false;
+    const factWords = new Set(factLower.split(/\s+/).filter((w) => w.length > 2));
+    if (factWords.size > 0) {
+      for (const ex of existingContent) {
+        const exWords = new Set(ex.split(/\s+/).filter((w) => w.length > 2));
+        let overlap = 0;
+        for (const w of factWords) {
+          if (exWords.has(w)) overlap++;
+        }
+        if (overlap / factWords.size > 0.75) {
+          isDupe = true;
+          break;
+        }
+      }
+    }
+    if (isDupe) {
+      skipped++;
+      continue;
+    }
+
+    const record = {
+      id: `arch-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
+      ts: new Date().toISOString(),
+      last_accessed: null,
+      access_count: 0,
+      importance: 5,
+      content: fact,
+      entity: inferEntity(fact),
+      tags: [tag],
+      source: "migration",
+    };
+
+    appendFileSync(ARCHIVAL, JSON.stringify(record) + "\n", "utf-8");
+    existingContent.add(factLower);
+    inserted++;
+    fileInserted++;
+  }
+
+  if (fileInserted > 0) console.log(`   ${basename(path)}: +${fileInserted} facts`);
+}
+
+console.log(`\n✅ Migration complete: ${inserted} facts imported, ${skipped} skipped (duplicates)`);
+console.log(`Total archival: ${existingContent.size} records`);
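The >75% keyword-overlap dedup inside `migrate-legacy.mjs` can be read in isolation as roughly this helper. A sketch only — the function name `isNearDuplicate` is ours; the script keeps this logic inline in its main loop:

```javascript
// Flags a candidate fact as a near-duplicate when more than `threshold`
// of its words (length > 2) already appear in some stored record.
function isNearDuplicate(fact, existing, threshold = 0.75) {
  const words = (s) =>
    new Set(s.toLowerCase().split(/\s+/).filter((w) => w.length > 2));
  const factWords = words(fact);
  if (factWords.size === 0) return false;
  for (const ex of existing) {
    const exWords = words(ex);
    let overlap = 0;
    for (const w of factWords) {
      if (exWords.has(w)) overlap++;
    }
    if (overlap / factWords.size > threshold) return true;
  }
  return false;
}
```

Note the ratio is measured against the candidate's word count, so a short fact fully contained in a longer stored record is treated as a duplicate, but not vice versa.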
package/package.json
CHANGED
package/setup.sh
CHANGED
@@ -106,6 +106,40 @@ else
   echo "⏭️  archival.jsonl already exists ($lines records)"
 fi
 
+# --- 3b. Migrate legacy memory files into archival ---
+if command -v node &>/dev/null && [ -f "$PLUGIN_DIR/extras/migrate-legacy.mjs" ]; then
+  # Check if there are legacy files to migrate
+  legacy_count=0
+  [ -f "$WORKSPACE/MEMORY.md" ] && legacy_count=$((legacy_count + 1))
+  legacy_count=$((legacy_count + $(ls "$MEMORY_DIR"/*.md 2>/dev/null | wc -l | tr -d ' ')))
+  legacy_count=$((legacy_count + $(ls "$MEMORY_DIR"/weekly/*.md 2>/dev/null | wc -l | tr -d ' ')))
+  legacy_count=$((legacy_count + $(ls "$MEMORY_DIR"/topics/*.md 2>/dev/null | wc -l | tr -d ' ')))
+
+  archival_count=$(wc -l < "$MEMORY_DIR/archival.jsonl" 2>/dev/null | tr -d ' ' || echo "0")
+
+  if [ "$legacy_count" -gt 0 ] && [ "$archival_count" -lt 10 ]; then
+    echo ""
+    echo "📦 Found $legacy_count legacy memory files (MEMORY.md, daily logs, weekly summaries, topics)."
+    if $NON_INTERACTIVE; then
+      echo "   Migrating automatically..."
+      node "$PLUGIN_DIR/extras/migrate-legacy.mjs" "$WORKSPACE" 2>&1 | tail -3
+    else
+      printf "   Migrate into archival memory? [Y/n]: "
+      read -r migrate_answer
+      if [ "${migrate_answer:-Y}" != "n" ] && [ "${migrate_answer:-Y}" != "N" ]; then
+        node "$PLUGIN_DIR/extras/migrate-legacy.mjs" "$WORKSPACE" 2>&1 | tail -5
+      else
+        echo "⏭️  Skipping migration. Run manually later: node $PLUGIN_DIR/extras/migrate-legacy.mjs $WORKSPACE"
+      fi
+    fi
+    echo ""
+  else
+    if [ "$archival_count" -gt 10 ]; then
+      echo "⏭️  Archival already has $archival_count records, skipping migration"
+    fi
+  fi
+fi
+
 # --- 4. Install memory-maintenance.sh ---
 SCRIPTS_DIR="$WORKSPACE/scripts"
 mkdir -p "$SCRIPTS_DIR"