@levalicious/server-memory 0.0.13 → 0.0.15
- package/README.md +79 -28
- package/dist/scripts/delete-document.js +91 -0
- package/dist/scripts/textrank-experiment.js +618 -0
- package/dist/server.js +127 -59
- package/dist/src/graphfile.js +118 -4
- package/dist/src/kb_load.js +396 -0
- package/dist/src/memoryfile.js +17 -0
- package/dist/src/merw.js +160 -0
- package/dist/src/stringtable.js +24 -6
- package/dist/tests/memory-server.test.js +129 -0
- package/dist/tests/test-utils.js +6 -0
- package/package.json +6 -2
package/README.md CHANGED
@@ -1,6 +1,25 @@
 # Knowledge Graph Memory Server
 
-A
+A persistent knowledge graph with binary storage, PageRank-based ranking, and Maximum Entropy Random Walk (MERW) exploration. Designed as an MCP server for use with LLM agents.
+
+## Storage Format
+
+The knowledge graph is stored in two binary files using a custom mmap-backed arena allocator:
+
+- **`<base>.graph`** — Entity records (72 bytes each), adjacency blocks, and node log
+- **`<base>.strings`** — Interned, refcounted string table
+
+This replaces the original JSONL format. The binary format supports O(1) entity lookup, POSIX file locking for concurrent access, and in-place mutation without rewriting the entire file.
+
+> [!NOTE]
+> **Migrating from JSONL:** If you have an existing `.json` knowledge graph, use the migration script:
+> ```sh
+> npx tsx scripts/migrate-jsonl.ts [path/to/memory.json]
+> ```
+> The original `.json` file is preserved. See also `scripts/verify-migration.ts` to validate the result. The `MEMORY_FILE_PATH` does not need to change.
+
+> [!NOTE]
+> **Automatic v1→v2 migration:** Graph files using the v1 format (64-byte entity records) are automatically migrated to v2 (72-byte records with MERW ψ field) on first open. The old file is preserved as `<name>.graph.v1`.
 
 ## Core Concepts
 
@@ -53,6 +72,27 @@ Example:
 }
 ```
 
+### Ranking
+
+Two ranking systems are maintained and updated after every graph mutation:
+
+- **PageRank (`pagerank`)** — Structural importance via Monte Carlo random walks on graph topology (Avrachenkov et al. Algorithm 4). Each mutation triggers a full sampling pass.
+- **LLM Rank (`llmrank`)** — Walker visit counts that track which nodes the LLM actually opens/searches. Primary sort for `llmrank` is walker visits, with PageRank as tiebreaker.
+
+### Maximum Entropy Random Walk (MERW)
+
+The `random_walk` tool uses MERW rather than a standard uniform random walk. MERW maximizes the global entropy rate by sampling uniformly among all paths in the graph, rather than locally maximizing entropy at each vertex.
+
+Transition probabilities are computed from the dominant eigenvector ψ of the (damped) adjacency matrix:
+
+```
+S_ij = (A_ij / λ) · (ψ_j / ψ_i)
+```
+
+The eigenvector is computed via sparse power iteration with teleportation damping (α=0.85), warm-started from the previously stored ψ values. After a small graph mutation, convergence typically requires only 2–5 iterations rather than a full cold start.
+
+**Practical effect:** Walks gravitate toward structurally rich regions of the graph rather than wandering down linear chains, making serendipitous exploration more productive.
+
 ## API
 
 ### Tools
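Note on the MERW hunk above: the formula `S_ij = (A_ij / λ) · (ψ_j / ψ_i)` can be checked on a tiny graph. The sketch below is an illustration only and is not taken from the package's `merw.js`. It assumes a small, connected, undirected graph stored as a dense matrix, and it iterates with `A + I` (a shift chosen for this sketch so that bipartite graphs also converge) instead of the teleportation damping and warm start the README describes. The function names `dominantEigen` and `merwTransitions` are hypothetical.

```javascript
// Dominant eigenpair of a symmetric adjacency matrix via shifted power iteration.
// Iterating with (A + I) is a choice made for this sketch; it is NOT the
// teleportation damping (alpha = 0.85) mentioned in the README.
function dominantEigen(A, iterations = 300) {
  const n = A.length;
  let psi = new Array(n).fill(1 / Math.sqrt(n));
  let lambda = 0;
  for (let it = 0; it < iterations; it++) {
    // next = (A + I) * psi
    const next = psi.map((x, i) => x + A[i].reduce((s, a, j) => s + a * psi[j], 0));
    const norm = Math.hypot(...next);
    lambda = norm - 1; // psi has unit norm, so ||(A + I) psi|| approaches lambda + 1
    psi = next.map((x) => x / norm);
  }
  return { lambda, psi };
}

// MERW transition matrix: S_ij = (A_ij / lambda) * (psi_j / psi_i)
function merwTransitions(A) {
  const { lambda, psi } = dominantEigen(A);
  return A.map((row, i) => row.map((a, j) => (a / lambda) * (psi[j] / psi[i])));
}

// Triangle 0-1-2 with a pendant node 3 attached to node 2.
const A = [
  [0, 1, 1, 0],
  [1, 0, 1, 0],
  [1, 1, 0, 1],
  [0, 0, 1, 0],
];
const S = merwTransitions(A);
```

Each row of `S` sums to 1 because `A ψ = λ ψ`. In this example a step from node 2 to the pendant node 3 receives less probability than a step back into the triangle, which is the "structurally rich regions" effect the README mentions.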
@@ -110,51 +150,51 @@ Example:
 - Search for nodes using a regex pattern
 - Input:
 - `query` (string): Regex pattern to search
-- `sortBy` (string, optional): Sort field (
-- `sortDir` (string, optional): Sort direction (
-- `
+- `sortBy` (string, optional): Sort field (`mtime`, `obsMtime`, `name`, `pagerank`, `llmrank`). Default: `llmrank`
+- `sortDir` (string, optional): Sort direction (`asc` or `desc`)
+- `direction` (string, optional): Edge direction filter (`forward`, `backward`, `any`). Default: `forward`
+- `entityCursor`, `relationCursor` (number, optional): Pagination cursors
 - Searches across entity names, types, and observation content
 - Returns matching entities and their relations (paginated)
 
-- **open_nodes_filtered**
-- Retrieve specific nodes by name with filtered relations
-- Input: `names` (string[]), `entityCursor` (number, optional), `relationCursor` (number, optional)
-- Returns:
-- Requested entities
-- Only relations where both endpoints are in the requested set
-- Silently skips non-existent nodes (paginated)
-
 - **open_nodes**
 - Retrieve specific nodes by name
-- Input:
-
--
--
--
+- Input:
+- `names` (string[]): Entity names to retrieve
+- `direction` (string, optional): Edge direction filter (`forward`, `backward`, `any`). Default: `forward`
+- `entityCursor`, `relationCursor` (number, optional): Pagination cursors
+- Returns requested entities and relations originating from them (paginated)
+- Silently skips non-existent nodes
 
 - **get_neighbors**
 - Get names of neighboring entities connected to a specific entity within a given depth
 - Input:
 - `entityName` (string): The entity to find neighbors for
 - `depth` (number, default: 1): Maximum traversal depth
-- `sortBy` (string, optional): Sort field (
-- `sortDir` (string, optional): Sort direction (
+- `sortBy` (string, optional): Sort field (`mtime`, `obsMtime`, `name`, `pagerank`, `llmrank`). Default: `llmrank`
+- `sortDir` (string, optional): Sort direction (`asc` or `desc`)
+- `direction` (string, optional): Edge direction filter (`forward`, `backward`, `any`). Default: `forward`
 - `cursor` (number, optional): Pagination cursor
 - Returns neighbor names with timestamps (paginated)
 - Use `open_nodes` to get full entity data for neighbors
 
 - **find_path**
 - Find a path between two entities in the knowledge graph
-- Input:
+- Input:
+- `fromEntity` (string): Starting entity
+- `toEntity` (string): Target entity
+- `maxDepth` (number, default: 5): Maximum search depth
+- `direction` (string, optional): Edge direction filter (`forward`, `backward`, `any`). Default: `forward`
+- `cursor` (number, optional): Pagination cursor
 - Returns path between entities if one exists (paginated)
 
 - **get_entities_by_type**
 - Get all entities of a specific type
 - Input:
 - `entityType` (string): Type to filter by
-- `sortBy` (string, optional): Sort field (
-- `sortDir` (string, optional): Sort direction (
-- `cursor` (number, optional)
+- `sortBy` (string, optional): Sort field (`mtime`, `obsMtime`, `name`, `pagerank`, `llmrank`). Default: `llmrank`
+- `sortDir` (string, optional): Sort direction (`asc` or `desc`)
+- `cursor` (number, optional): Pagination cursor
 - Returns all entities matching the specified type (paginated)
 
 - **get_entity_types**
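Note on the new `direction` parameter in the hunk above: the README says the default `forward` returns relations originating from the requested entities. A minimal sketch of such a filter follows. Only the `forward` meaning comes from the README; the readings of `backward` (relations pointing at a requested entity) and `any` (either) are assumptions, as are the function name `filterByDirection` and the `from` field on relations (the diffed script only shows `to` and `relationType`).

```javascript
// Hypothetical helper: keep only relations that match the requested edge direction
// relative to a set of entity names. Not the package's implementation.
function filterByDirection(relations, names, direction = 'forward') {
  if (!['forward', 'backward', 'any'].includes(direction)) {
    throw new Error(`Unknown direction: ${direction}`);
  }
  const wanted = new Set(names);
  return relations.filter((r) => {
    const outgoing = wanted.has(r.from); // relation starts at a requested entity
    const incoming = wanted.has(r.to);   // relation ends at a requested entity
    if (direction === 'forward') return outgoing;
    if (direction === 'backward') return incoming;
    return outgoing || incoming;         // 'any'
  });
}

const sampleRelations = [
  { from: 'A', to: 'B', relationType: 'knows' },
  { from: 'B', to: 'C', relationType: 'knows' },
  { from: 'C', to: 'A', relationType: 'knows' },
];
const forwardOnly = filterByDirection(sampleRelations, ['A']);
const backwardOnly = filterByDirection(sampleRelations, ['A'], 'backward');
const either = filterByDirection(sampleRelations, ['A'], 'any');
```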
@@ -176,9 +216,9 @@ Example:
 - Get entities that have no relations (orphaned entities)
 - Input:
 - `strict` (boolean, default: false): If true, returns entities not connected to 'Self' entity
-- `sortBy` (string, optional): Sort field (
-- `sortDir` (string, optional): Sort direction (
-- `cursor` (number, optional)
+- `sortBy` (string, optional): Sort field (`mtime`, `obsMtime`, `name`, `pagerank`, `llmrank`). Default: `llmrank`
+- `sortDir` (string, optional): Sort direction (`asc` or `desc`)
+- `cursor` (number, optional): Pagination cursor
 - Returns entities with no connections (paginated)
 
 - **validate_graph**
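Note on the `llmrank` default in the `sortBy` lines above: the README states that `llmrank` sorts primarily by walker visits, with PageRank as the tiebreaker. A comparator along those lines might look like the sketch below. The field names `walkerVisits` and `pagerank`, the helper names, and the `desc` default are assumptions for illustration; the diff does not show how the server stores or orders these values.

```javascript
// Hypothetical comparator: walker visits first, PageRank breaks ties.
function compareLlmRank(a, b) {
  if (a.walkerVisits !== b.walkerVisits) return a.walkerVisits - b.walkerVisits;
  return a.pagerank - b.pagerank;
}

// Returns a sorted copy; 'desc' as the default direction is an assumption.
function sortByLlmRank(entities, sortDir = 'desc') {
  const sign = sortDir === 'asc' ? 1 : -1;
  return [...entities].sort((a, b) => sign * compareLlmRank(a, b));
}

const ranked = sortByLlmRank([
  { name: 'a', walkerVisits: 2, pagerank: 0.1 },
  { name: 'b', walkerVisits: 5, pagerank: 0.05 },
  { name: 'c', walkerVisits: 2, pagerank: 0.3 },
]);
const rankedNames = ranked.map((e) => e.name);
```

Here `b` comes first because it has the most visits, and `c` precedes `a` because their visit counts tie and `c` has the higher PageRank.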
@@ -195,13 +235,15 @@ Example:
 - Useful for interpreting `mtime`/`obsMtime` values from entities
 
 - **random_walk**
-- Perform a random walk from a starting entity
+- Perform a MERW-weighted random walk from a starting entity
 - Input:
 - `start` (string): Name of the entity to start the walk from
 - `depth` (number, default: 3): Number of hops to take
 - `seed` (string, optional): Seed for reproducible walks
+- `direction` (string, optional): Edge direction filter (`forward`, `backward`, `any`). Default: `forward`
+- Neighbors are selected proportional to their MERW eigenvector component ψ
+- Falls back to uniform sampling if ψ has not been computed
 - Returns the terminal entity name and the path taken
-- Useful for serendipitous exploration of the knowledge graph
 
 - **sequentialthinking**
 - Record a thought in the knowledge graph
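Note on the `random_walk` changes above: the three behaviors listed (seeded reproducibility, neighbor choice proportional to ψ, uniform fallback when ψ is missing) can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the graph is a plain `Map` of neighbor lists, ψ is a `Map` of numbers, the string seed is hashed with FNV-1a and fed to the mulberry32 generator, and all function names are hypothetical.

```javascript
// FNV-1a hash: turns a string seed into a 32-bit integer.
function hashSeed(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(a) {
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Pick a neighbor with probability proportional to psi; uniform if psi is unusable.
function pickNeighbor(neighbors, psi, rand) {
  const weights = neighbors.map((n) => psi.get(n) ?? 0);
  const total = weights.reduce((s, w) => s + w, 0);
  if (!(total > 0)) return neighbors[Math.floor(rand() * neighbors.length)];
  let r = rand() * total;
  for (let i = 0; i < neighbors.length; i++) {
    r -= weights[i];
    if (r < 0) return neighbors[i];
  }
  return neighbors[neighbors.length - 1]; // guard against rounding error
}

function randomWalk(adjacency, psi, start, depth = 3, seed = 'default') {
  const rand = mulberry32(hashSeed(seed));
  const path = [start];
  let current = start;
  for (let hop = 0; hop < depth; hop++) {
    const neighbors = adjacency.get(current) ?? [];
    if (neighbors.length === 0) break; // dead end: stop early
    current = pickNeighbor(neighbors, psi, rand);
    path.push(current);
  }
  return { terminal: current, path };
}

const adjacency = new Map([
  ['A', ['B', 'C']],
  ['B', ['A', 'C']],
  ['C', ['A', 'B']],
]);
const psiValues = new Map([['A', 0.5], ['B', 0.5], ['C', 0]]);
const walk = randomWalk(adjacency, psiValues, 'A', 4, 'demo-seed');
```

With ψ set to 0 for `C`, the walk never enters `C`. Passing an empty `Map` for ψ exercises the uniform fallback.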
@@ -209,6 +251,15 @@ Example:
 - Creates a Thought entity and links it to the previous thought if provided
 - Returns the new thought's context ID for chaining
 
+- **kb_load**
+- Load a plaintext document into the knowledge graph
+- Input:
+- `filePath` (string): Absolute path to a plaintext file (`.txt`, `.md`, `.tex`, source code, etc.)
+- `title` (string, optional): Document title. Defaults to filename without extension
+- `topK` (number, optional): Number of top TextRank sentences to highlight in the index. Default: 15
+- Creates a doubly-linked chain of TextChunk entities, a Document entity, and a DocumentIndex with TextRank-selected entry points
+- For PDFs, convert to text first (e.g., `pdftotext`)
+
 # Usage with Claude Desktop
 
 ### Setup
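Note on the `kb_load` entry above: the graph shape it describes (a Document entity, a DocumentIndex, and a doubly-linked chain of TextChunk entities) can be pictured with the sketch below. The relation types `starts_with` and `follows`, and the `__index` suffix, are taken from the `delete-document.js` script in this diff. Everything else is an assumption made for illustration: fixed-size chunking, the `__chunk_N` naming, the reverse relation name `preceded_by`, the `from` and `observations` fields, and the function name. TextRank selection is left out entirely.

```javascript
// Hypothetical builder for the entity/relation layout that kb_load is described as creating.
function buildDocumentGraph(title, text, chunkSize = 200) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  const chunkName = (i) => `${title}__chunk_${i}`; // naming scheme is an assumption

  const entities = [
    { name: title, entityType: 'Document', observations: [] },
    { name: `${title}__index`, entityType: 'DocumentIndex', observations: [] },
    ...chunks.map((c, i) => ({ name: chunkName(i), entityType: 'TextChunk', observations: [c] })),
  ];

  const relations = [];
  if (chunks.length > 0) {
    relations.push({ from: title, to: chunkName(0), relationType: 'starts_with' });
  }
  for (let i = 0; i + 1 < chunks.length; i++) {
    // Forward link, as traversed by delete-document.js
    relations.push({ from: chunkName(i), to: chunkName(i + 1), relationType: 'follows' });
    // Backward link; the relation name here is hypothetical
    relations.push({ from: chunkName(i + 1), to: chunkName(i), relationType: 'preceded_by' });
  }
  return { entities, relations };
}

const docGraphSketch = buildDocumentGraph('notes', 'x'.repeat(450));
```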
package/dist/scripts/delete-document.js ADDED
@@ -0,0 +1,91 @@
+#!/usr/bin/env node
+/**
+ * delete-document.ts — Remove a kb_load-style document and its TextChunk chain
+ * from the binary knowledge graph.
+ *
+ * Usage:
+ * MEMORY_FILE_PATH=~/.local/share/memory/vscode.json npx tsx scripts/delete-document.ts <document-entity-name> [--live]
+ *
+ * Without --live, runs in dry-run mode: walks the chain, counts chunks, prints
+ * what would be deleted, but does not mutate anything.
+ *
+ * With --live, actually deletes the document entity, the index entity (if any),
+ * and every TextChunk in the chain.
+ */
+import { KnowledgeGraphManager } from '../server.js';
+const DOC_NAME = process.argv[2];
+const LIVE = process.argv.includes('--live');
+const BATCH_SIZE = 200;
+if (!DOC_NAME) {
+    console.error('Usage: npx tsx scripts/delete-document.ts <document-entity-name> [--live]');
+    process.exit(1);
+}
+const memoryFilePath = process.env.MEMORY_FILE_PATH ?? `${process.env.HOME}/.local/share/memory/vscode.json`;
+console.log(`Opening graph at: ${memoryFilePath}`);
+console.log(`Mode: ${LIVE ? '🔴 LIVE — will delete' : '🟢 DRY RUN — read only'}`);
+console.log();
+const mgr = new KnowledgeGraphManager(memoryFilePath);
+// ── Step 1: Open the document node, find starts_with target ──────────
+const docGraph = await mgr.openNodes([DOC_NAME], 'forward');
+const docEntity = docGraph.entities.find(e => e.name === DOC_NAME);
+if (!docEntity) {
+    console.error(`Entity "${DOC_NAME}" not found.`);
+    process.exit(1);
+}
+console.log(`Found document: "${DOC_NAME}" (type: ${docEntity.entityType})`);
+const startsWithRel = docGraph.relations.find(r => r.relationType === 'starts_with');
+if (!startsWithRel) {
+    console.error(`No "starts_with" relation found on "${DOC_NAME}". Is this a kb_load document?`);
+    process.exit(1);
+}
+const headChunkName = startsWithRel.to;
+console.log(`Head chunk: ${headChunkName}`);
+// ── Step 2: Walk the chain via "follows" relations ───────────────────
+const toDelete = [];
+let currentName = headChunkName;
+let visited = 0;
+while (currentName) {
+    toDelete.push(currentName);
+    visited++;
+    if (visited % 500 === 0) {
+        process.stdout.write(` … walked ${visited} chunks\r`);
+    }
+    // Find the "follows" relation from this chunk
+    const chunkGraph = await mgr.openNodes([currentName], 'forward');
+    const followsRel = chunkGraph.relations.find(r => r.relationType === 'follows');
+    currentName = followsRel ? followsRel.to : '';
+}
+console.log(`\nChain walk complete: ${toDelete.length} TextChunks found.`);
+// ── Step 3: Check for an index entity ────────────────────────────────
+const indexName = `${DOC_NAME}__index`;
+const indexGraph = await mgr.openNodes([indexName], 'forward');
+const indexEntity = indexGraph.entities.find(e => e.name === indexName);
+const extraDeletes = [DOC_NAME];
+if (indexEntity) {
+    extraDeletes.push(indexName);
+    console.log(`Index entity found: "${indexName}"`);
+}
+else {
+    console.log(`No index entity "${indexName}" found (old-style import).`);
+}
+const totalDeletes = extraDeletes.length + toDelete.length;
+console.log(`\nTotal entities to delete: ${totalDeletes} (${extraDeletes.length} header + ${toDelete.length} chunks)`);
+if (!LIVE) {
+    console.log('\n✅ Dry run complete. Re-run with --live to actually delete.');
+    process.exit(0);
+}
+// ── Step 4: Delete in batches ────────────────────────────────────────
+console.log(`\nDeleting ${totalDeletes} entities in batches of ${BATCH_SIZE}...`);
+// Delete chunks first (the bulk), then the header entities
+let deleted = 0;
+for (let i = 0; i < toDelete.length; i += BATCH_SIZE) {
+    const batch = toDelete.slice(i, i + BATCH_SIZE);
+    await mgr.deleteEntities(batch);
+    deleted += batch.length;
+    process.stdout.write(` Deleted ${deleted}/${toDelete.length} chunks\r`);
+}
+console.log(`\n Chunks done.`);
+// Delete document + index
+await mgr.deleteEntities(extraDeletes);
+console.log(` Deleted document header${indexEntity ? ' + index' : ''}.`);
+console.log(`\n🔴 Done. Removed ${totalDeletes} entities from the graph.`);
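Note on the script above: the Step 2 loop follows `follows` relations until none is found and keeps no record of visited names, so a chain that loops back on itself would never terminate. The sketch below shows one way to guard the walk and to slice the result into batches. It is a suggestion rather than part of the package. It works on an in-memory relation list instead of calling `mgr.openNodes`, and the names `collectChain` and `batches`, along with the `from` field, are assumptions.

```javascript
// Walk a "follows" chain from a head chunk, stopping at the end or at the first repeat.
function collectChain(relations, head) {
  const next = new Map();
  for (const r of relations) {
    // Keep the first "follows" edge per source, mirroring Array.prototype.find in the script
    if (r.relationType === 'follows' && !next.has(r.from)) next.set(r.from, r.to);
  }
  const seen = new Set();
  const chain = [];
  let current = head;
  while (current && !seen.has(current)) {
    seen.add(current);
    chain.push(current);
    current = next.get(current);
  }
  return chain;
}

// Yield consecutive slices of at most `size` items.
function* batches(items, size) {
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

// A deliberately cyclic chain: c0 -> c1 -> c2 -> c0
const cyclic = [
  { from: 'c0', to: 'c1', relationType: 'follows' },
  { from: 'c1', to: 'c2', relationType: 'follows' },
  { from: 'c2', to: 'c0', relationType: 'follows' },
];
const chain = collectChain(cyclic, 'c0');
const groups = [...batches(['a', 'b', 'c', 'd', 'e'], 2)];
```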