superlocalmemory 2.3.0
- package/ATTRIBUTION.md +140 -0
- package/CHANGELOG.md +1749 -0
- package/LICENSE +21 -0
- package/README.md +600 -0
- package/bin/aider-smart +72 -0
- package/bin/slm +202 -0
- package/bin/slm-npm +73 -0
- package/bin/slm.bat +195 -0
- package/bin/slm.cmd +10 -0
- package/bin/superlocalmemoryv2:list +3 -0
- package/bin/superlocalmemoryv2:profile +3 -0
- package/bin/superlocalmemoryv2:recall +3 -0
- package/bin/superlocalmemoryv2:remember +3 -0
- package/bin/superlocalmemoryv2:reset +3 -0
- package/bin/superlocalmemoryv2:status +3 -0
- package/completions/slm.bash +58 -0
- package/completions/slm.zsh +76 -0
- package/configs/antigravity-mcp.json +13 -0
- package/configs/chatgpt-desktop-mcp.json +7 -0
- package/configs/claude-desktop-mcp.json +15 -0
- package/configs/codex-mcp.toml +13 -0
- package/configs/cody-commands.json +29 -0
- package/configs/continue-mcp.yaml +14 -0
- package/configs/continue-skills.yaml +26 -0
- package/configs/cursor-mcp.json +15 -0
- package/configs/gemini-cli-mcp.json +11 -0
- package/configs/jetbrains-mcp.json +11 -0
- package/configs/opencode-mcp.json +12 -0
- package/configs/perplexity-mcp.json +9 -0
- package/configs/vscode-copilot-mcp.json +12 -0
- package/configs/windsurf-mcp.json +16 -0
- package/configs/zed-mcp.json +12 -0
- package/docs/ARCHITECTURE.md +877 -0
- package/docs/CLI-COMMANDS-REFERENCE.md +425 -0
- package/docs/COMPETITIVE-ANALYSIS.md +210 -0
- package/docs/COMPRESSION-README.md +390 -0
- package/docs/GRAPH-ENGINE.md +503 -0
- package/docs/MCP-MANUAL-SETUP.md +720 -0
- package/docs/MCP-TROUBLESHOOTING.md +787 -0
- package/docs/PATTERN-LEARNING.md +363 -0
- package/docs/PROFILES-GUIDE.md +453 -0
- package/docs/RESET-GUIDE.md +353 -0
- package/docs/SEARCH-ENGINE-V2.2.0.md +748 -0
- package/docs/SEARCH-INTEGRATION-GUIDE.md +502 -0
- package/docs/UI-SERVER.md +254 -0
- package/docs/UNIVERSAL-INTEGRATION.md +432 -0
- package/docs/V2.2.0-OPTIONAL-SEARCH.md +666 -0
- package/docs/WINDOWS-INSTALL-README.txt +34 -0
- package/docs/WINDOWS-POST-INSTALL.txt +45 -0
- package/docs/example_graph_usage.py +148 -0
- package/hooks/memory-list-skill.js +130 -0
- package/hooks/memory-profile-skill.js +284 -0
- package/hooks/memory-recall-skill.js +109 -0
- package/hooks/memory-remember-skill.js +127 -0
- package/hooks/memory-reset-skill.js +274 -0
- package/install-skills.sh +436 -0
- package/install.ps1 +417 -0
- package/install.sh +755 -0
- package/mcp_server.py +585 -0
- package/package.json +94 -0
- package/requirements-core.txt +24 -0
- package/requirements.txt +10 -0
- package/scripts/postinstall.js +126 -0
- package/scripts/preuninstall.js +57 -0
- package/skills/slm-build-graph/SKILL.md +423 -0
- package/skills/slm-list-recent/SKILL.md +348 -0
- package/skills/slm-recall/SKILL.md +325 -0
- package/skills/slm-remember/SKILL.md +194 -0
- package/skills/slm-status/SKILL.md +363 -0
- package/skills/slm-switch-profile/SKILL.md +442 -0
- package/src/__pycache__/cache_manager.cpython-312.pyc +0 -0
- package/src/__pycache__/embedding_engine.cpython-312.pyc +0 -0
- package/src/__pycache__/graph_engine.cpython-312.pyc +0 -0
- package/src/__pycache__/hnsw_index.cpython-312.pyc +0 -0
- package/src/__pycache__/hybrid_search.cpython-312.pyc +0 -0
- package/src/__pycache__/memory-profiles.cpython-312.pyc +0 -0
- package/src/__pycache__/memory-reset.cpython-312.pyc +0 -0
- package/src/__pycache__/memory_compression.cpython-312.pyc +0 -0
- package/src/__pycache__/memory_store_v2.cpython-312.pyc +0 -0
- package/src/__pycache__/migrate_v1_to_v2.cpython-312.pyc +0 -0
- package/src/__pycache__/pattern_learner.cpython-312.pyc +0 -0
- package/src/__pycache__/query_optimizer.cpython-312.pyc +0 -0
- package/src/__pycache__/search_engine_v2.cpython-312.pyc +0 -0
- package/src/__pycache__/setup_validator.cpython-312.pyc +0 -0
- package/src/__pycache__/tree_manager.cpython-312.pyc +0 -0
- package/src/cache_manager.py +520 -0
- package/src/embedding_engine.py +671 -0
- package/src/graph_engine.py +970 -0
- package/src/hnsw_index.py +626 -0
- package/src/hybrid_search.py +693 -0
- package/src/memory-profiles.py +518 -0
- package/src/memory-reset.py +485 -0
- package/src/memory_compression.py +999 -0
- package/src/memory_store_v2.py +1088 -0
- package/src/migrate_v1_to_v2.py +638 -0
- package/src/pattern_learner.py +898 -0
- package/src/query_optimizer.py +513 -0
- package/src/search_engine_v2.py +403 -0
- package/src/setup_validator.py +479 -0
- package/src/tree_manager.py +720 -0
@@ -0,0 +1,503 @@

# GraphEngine - Knowledge Graph Clustering for SuperLocalMemory

Complete implementation of GraphRAG with Leiden community detection for automatic memory clustering and relationship discovery.

## Overview

The GraphEngine implements Layer 3 of the memory architecture, building a knowledge graph that auto-discovers relationships between memories across projects using:

- **TF-IDF Entity Extraction** - Local keyword extraction (top 20 per memory)
- **Cosine Similarity Edges** - Relationship building (default similarity threshold 0.3)
- **Leiden Clustering** - Community detection for thematic grouping
- **Auto-naming** - TF-IDF-based cluster name generation

**All processing is local** - no external APIs, all data stays on your machine.

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│              GraphEngine (graph_engine.py)              │
│                                                         │
│     ┌──────────────────┐       ┌──────────────────┐     │
│     │ EntityExtractor  │       │   EdgeBuilder    │     │
│     │    (TF-IDF)      │       │  (Cosine sim)    │     │
│     └──────────────────┘       └──────────────────┘     │
│                                                         │
│     ┌──────────────────┐       ┌──────────────────┐     │
│     │ ClusterBuilder   │       │  ClusterNamer    │     │
│     │    (Leiden)      │       │    (TF-IDF)      │     │
│     └──────────────────┘       └──────────────────┘     │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │  SQLite Tables   │
                    │ - graph_nodes    │
                    │ - graph_edges    │
                    │ - graph_clusters │
                    └──────────────────┘
```

## Components

### 1. EntityExtractor
Extracts key concepts from memory content using TF-IDF vectorization.

**Features:**
- Top 20 keywords per memory
- Unigrams + bigrams (e.g., "authentication", "jwt tokens")
- English stop words filtering
- Minimum score threshold (0.05)

**Example:**
```python
# Input memory
memory = "Next.js authentication using NextAuth.js with JWT tokens..."

# Extracted entities
entities = ["nextjs", "authentication", "nextauth", "jwt", "tokens",
            "oauth", "session", "credentials", "callback", "api"]
```
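
The extraction step can be illustrated with a minimal, standard-library-only sketch. It is not the code in `graph_engine.py`: the names `tokenize`, `extract_entities`, and `STOP_WORDS` are hypothetical, the stop-word list is deliberately tiny, and the smoothed IDF formula is assumed to match the scikit-learn default.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); a real extractor uses a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "using", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stop words, then add bigrams."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def extract_entities(docs, top_k=20, min_score=0.05):
    """Return up to top_k terms per document, ranked by L2-normalized TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
        scores = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(((s / norm, t) for t, s in scores.items()),
                        key=lambda pair: (-pair[0], pair[1]))
        results.append([t for s, t in ranked if s >= min_score][:top_k])
    return results


docs = [
    "JWT tokens for authentication",
    "React component architecture",
    "JWT refresh tokens",
]
print(extract_entities(docs, top_k=3)[0])
```

Terms that appear in fewer documents score higher, which is why `authentication` outranks `jwt` for the first document in this toy corpus.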

### 2. EdgeBuilder
Creates weighted edges between similar memories based on entity overlap.

**Algorithm:**
1. Compute pairwise cosine similarity of TF-IDF vectors
2. Create edge if similarity >= threshold (default 0.3)
3. Classify relationship type:
   - `similar` (sim > 0.7) - Strong match
   - `depends_on` - Contains dependency keywords
   - `related_to` - Moderate/weak match

**Example:**
```python
# Entities of memory #42 and memory #15
memory_42 = ["nextjs", "authentication", "jwt"]
memory_15 = ["jwt", "tokens", "authentication", "python"]

# Similarity: 0.72
# Shared entities: ["authentication", "jwt"]
# Edge type: "similar"
```
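
The three algorithm steps can be sketched as follows. This is an illustration under stated assumptions, not the package's `EdgeBuilder`: vectors are plain `{term: weight}` dicts, the `DEPENDENCY_KEYWORDS` list is invented for the example, and the classification order (similar, then depends_on, then related_to) simply follows the order of the list above.

```python
import math

# Hypothetical keyword list; the real dependency keywords are not documented here.
DEPENDENCY_KEYWORDS = ("depends on", "requires", "needs")


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(similarity, text_a, text_b):
    """Pick a relationship type for an edge that already passed the threshold."""
    if similarity > 0.7:
        return "similar"
    combined = f"{text_a} {text_b}".lower()
    if any(keyword in combined for keyword in DEPENDENCY_KEYWORDS):
        return "depends_on"
    return "related_to"


def build_edges(vectors, texts, min_similarity=0.3):
    """vectors: {memory_id: {term: weight}}, texts: {memory_id: str}."""
    ids = sorted(vectors)
    edges = []
    for i, src in enumerate(ids):
        for dst in ids[i + 1:]:  # each unordered pair is visited once
            sim = cosine_similarity(vectors[src], vectors[dst])
            if sim >= min_similarity:
                edges.append({
                    "source": src,
                    "target": dst,
                    "weight": round(sim, 3),
                    "type": classify(sim, texts[src], texts[dst]),
                    "shared": sorted(set(vectors[src]) & set(vectors[dst])),
                })
    return edges


vectors = {
    7: {"react": 1.0},
    15: {"jwt": 1.0, "tokens": 1.0, "authentication": 1.0, "python": 1.0},
    42: {"nextjs": 1.0, "authentication": 1.0, "jwt": 1.0},
}
texts = {7: "React hooks", 15: "JWT tokens in Python", 42: "Next.js auth with JWT"}
print(build_edges(vectors, texts))
```

With unweighted vectors the pair (15, 42) scores about 0.577, lower than the 0.72 in the example above, because real TF-IDF weights emphasize the rarer shared terms.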

### 3. ClusterBuilder
Groups related memories into thematic clusters using the Leiden algorithm.

**Why Leiden?**
- Higher-quality partitions than the Louvain algorithm, with well-connected communities
- Reproducible (deterministic when run with a fixed seed)
- Scalable (handles 1000+ nodes)
- Production-ready (used by Scanpy, 10k+ citations)

**Performance:**
- 50 memories: ~500ms
- 100 memories: ~2s
- 500 memories: ~15s

**Output:**
```
Cluster #1: 8 memories (avg importance: 7.2)
  Theme: Authentication & JWT
  Members: [12, 15, 23, 33, 42, 52, 67, 71]

Cluster #2: 12 memories (avg importance: 6.8)
  Theme: React & Architecture
  Members: [5, 8, 14, 19, 28, 35, 41, 46, 53, 60, 65, 70]
```
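
Community detection algorithms such as Leiden search for a partition that maximizes a quality function. The sketch below does not run Leiden itself; it only scores a given partition with weighted modularity, which is assumed here as the quality function (this document does not say which one `ClusterBuilder` uses). The function name `modularity` and the toy graph are illustrative.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Weighted modularity of a partition.

    edges: list of (u, v, weight) tuples for an undirected graph.
    membership: {node: cluster_id}.
    """
    total = sum(w for _, _, w in edges)  # m, the total edge weight
    if total == 0:
        return 0.0
    internal = defaultdict(float)  # edge weight fully inside each cluster
    degree = defaultdict(float)    # summed weighted degree of each cluster
    for u, v, w in edges:
        degree[membership[u]] += w
        degree[membership[v]] += w
        if membership[u] == membership[v]:
            internal[membership[u]] += w
    # Q = sum over clusters of (L_c / m) - (d_c / 2m)^2
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}

print(round(modularity(edges, two_clusters), 4))
print(round(modularity(edges, one_cluster), 4))
```

Splitting the two triangles scores higher than lumping every node together, which is the kind of preference a modularity-based clustering pass encodes.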

### 4. ClusterNamer
Auto-generates human-readable cluster names from member entities.

**Strategy:**
1. Collect all entities from cluster members
2. Count entity frequencies
3. Use top 2-3 entities for name

**Examples:**
- `"Authentication & JWT"` (from ["authentication", "jwt", "oauth"])
- `"React & Architecture"` (from ["react", "component", "architecture"])
- `"Performance & Optimization"` (from ["performance", "optimize", "speed"])
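
The naming strategy above can be sketched with a frequency counter. The helper `name_cluster` is hypothetical, and the capitalization rule (terms of three characters or fewer are treated as acronyms) is an assumption made so the output resembles the examples.

```python
from collections import Counter


def name_cluster(member_entities, max_terms=2):
    """Join the most frequent entities across cluster members into a display name."""
    counts = Counter(entity for entities in member_entities for entity in entities)
    top = [term for term, _ in counts.most_common(max_terms)]
    # Assumption: very short terms are acronyms and are upper-cased.
    pretty = [term.upper() if len(term) <= 3 else term.title() for term in top]
    return " & ".join(pretty) if pretty else "Unnamed Cluster"


members = [
    ["authentication", "jwt", "oauth"],
    ["authentication", "jwt"],
    ["authentication"],
]
print(name_cluster(members))
```

Ties are resolved by first appearance, so the order of members can influence the name when several entities are equally frequent.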

## Database Schema

### graph_nodes
Stores extracted entities and embedding vectors for each memory.

```sql
CREATE TABLE graph_nodes (
    id INTEGER PRIMARY KEY,
    memory_id INTEGER UNIQUE NOT NULL,
    entities TEXT,              -- JSON: ["auth", "jwt", ...]
    embedding_vector TEXT,      -- JSON: TF-IDF vector
    created_at TIMESTAMP,
    FOREIGN KEY (memory_id) REFERENCES memories(id)
);
```

### graph_edges
Stores relationships between memories.

```sql
CREATE TABLE graph_edges (
    id INTEGER PRIMARY KEY,
    source_memory_id INTEGER NOT NULL,
    target_memory_id INTEGER NOT NULL,
    relationship_type TEXT,     -- 'similar', 'depends_on', 'related_to'
    weight REAL,                -- Similarity score (0-1)
    shared_entities TEXT,       -- JSON: ["auth", "jwt"]
    similarity_score REAL,
    created_at TIMESTAMP,
    UNIQUE(source_memory_id, target_memory_id)
);
```

### graph_clusters
Stores detected communities.

```sql
CREATE TABLE graph_clusters (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,         -- "Authentication & JWT"
    description TEXT,
    member_count INTEGER,
    avg_importance REAL,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);
```
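
To show how the `graph_edges` table might be queried, here is a small sketch against an in-memory SQLite database. It assumes each pair of memories is stored once (suggested by the `UNIQUE` constraint, but not confirmed), so a lookup must match the memory ID on either side. The `neighbors` helper and the sample rows are illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE graph_edges (
        id INTEGER PRIMARY KEY,
        source_memory_id INTEGER NOT NULL,
        target_memory_id INTEGER NOT NULL,
        relationship_type TEXT,
        weight REAL,
        shared_entities TEXT,
        similarity_score REAL,
        created_at TIMESTAMP,
        UNIQUE(source_memory_id, target_memory_id)
    )
""")

# Sample rows for illustration only.
rows = [
    (1, 4, "similar", 0.875, json.dumps(["authentication", "jwt", "oauth"]), 0.875),
    (1, 2, "related_to", 0.709, json.dumps(["security", "api", "tokens"]), 0.709),
    (2, 7, "related_to", 0.4, json.dumps(["api"]), 0.4),
]
conn.executemany(
    "INSERT INTO graph_edges (source_memory_id, target_memory_id, relationship_type,"
    " weight, shared_entities, similarity_score) VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def neighbors(conn, memory_id):
    """Return (other_id, relationship, weight, shared_entities), strongest first."""
    cursor = conn.execute(
        """
        SELECT CASE WHEN source_memory_id = ? THEN target_memory_id
                    ELSE source_memory_id END AS other_id,
               relationship_type, weight, shared_entities
        FROM graph_edges
        WHERE source_memory_id = ? OR target_memory_id = ?
        ORDER BY weight DESC
        """,
        (memory_id, memory_id, memory_id),
    )
    return [(other, rel, weight, json.loads(shared))
            for other, rel, weight, shared in cursor]


for row in neighbors(conn, 1):
    print(row)
```

Storing `shared_entities` as JSON text keeps the schema simple; the trade-off is that entity-level filtering happens in Python rather than in SQL.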

## Installation

### Dependencies
```bash
pip install scikit-learn python-igraph leidenalg
```

**Note:** All dependencies are already installed in the virtual environment.

## Usage

### CLI Commands

#### 1. Build Complete Graph
```bash
python graph_engine.py build [--min-similarity 0.3]
```

**Output:**
```json
{
  "success": true,
  "memories": 18,
  "nodes": 18,
  "edges": 40,
  "clusters": 4,
  "time_seconds": 0.03
}
```

#### 2. View Statistics
```bash
python graph_engine.py stats
```

**Output:**
```json
{
  "nodes": 18,
  "edges": 40,
  "clusters": 4,
  "top_clusters": [
    {
      "name": "Authentication & Tokens",
      "members": 4,
      "avg_importance": 6.2
    },
    {
      "name": "Performance & Code",
      "members": 4,
      "avg_importance": 5.0
    }
  ]
}
```

#### 3. Find Related Memories
```bash
python graph_engine.py related --memory-id 1 [--hops 2]
```

**Output:**
```
1. Memory #4 (1-hop, weight=0.875)
   Relationship: similar
   Summary: Authentication implementation...
   Shared: authentication, jwt, oauth

2. Memory #2 (1-hop, weight=0.709)
   Relationship: related_to
   Summary: API security patterns...
   Shared: security, api, tokens
```

#### 4. View Cluster Members
```bash
python graph_engine.py cluster --cluster-id 1
```

**Output:**
```
Cluster #1 members:

1. Memory #5 (importance=7)
   JWT authentication implementation...

2. Memory #8 (importance=6)
   OAuth2 flow setup...

3. Memory #10 (importance=8)
   Security best practices...
```

### Programmatic Usage

```python
from graph_engine import GraphEngine

# Initialize engine
engine = GraphEngine()

# Build complete graph
stats = engine.build_graph(min_similarity=0.3)
print(f"Built graph: {stats['nodes']} nodes, {stats['edges']} edges")

# Find related memories
related = engine.get_related(memory_id=1, max_hops=2)
for mem in related:
    print(f"Related: #{mem['id']} ({mem['relationship']}, weight={mem['weight']:.3f})")

# Query clusters
stats = engine.get_stats()
for cluster in stats['top_clusters']:
    print(f"Cluster: {cluster['name']} ({cluster['members']} members)")

# Get cluster details
cluster_id = 1  # placeholder: ID of an existing cluster
members = engine.get_cluster_members(cluster_id)
for mem in members:
    print(f" - Memory #{mem['id']}: {mem['summary'][:50]}...")

# Add memory incrementally
new_memory_id = 42  # placeholder: ID of a memory that is already stored
success = engine.add_memory_incremental(new_memory_id)
if success:
    print("Memory added to graph successfully")
```

### Example Script

Run `example_graph_usage.py` to see all features in action:

```bash
python example_graph_usage.py
```

This demonstrates:
1. Building a complete graph
2. Finding related memories
3. Querying clusters
4. Extracting entities
5. Incremental memory addition

## Performance

### Build Time (Full Graph)
- 10 memories: 0.02s
- 50 memories: 0.5s
- 100 memories: 2s
- 500 memories: ~15s

### Query Time
- Find related (1-hop): <5ms
- Find related (2-hop): <10ms
- Get cluster members: <2ms
- Graph stats: <5ms

### Space Complexity
- **Sparse storage** - Only edges above the similarity threshold are stored
- **Example (50 memories):** a full similarity matrix has 2,500 entries
- Sparse graph: ~150 edges (94% reduction)
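
The arithmetic behind that reduction figure can be checked with a short sketch. The helper name `sparsity` is illustrative, and the figure of 150 edges is taken from the example above rather than measured.

```python
def sparsity(n_memories, n_edges):
    """Return (full matrix size, fraction of entries avoided by sparse storage)."""
    full = n_memories * n_memories
    return full, 1 - n_edges / full


full, reduction = sparsity(50, 150)
print(f"{full} matrix entries, {reduction:.0%} fewer stored values")
```

Counting only unordered pairs (n(n-1)/2 = 1,225 for 50 memories) would give a smaller reduction, so the 94% figure is relative to the full square matrix.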

## Graph Operations

### Full Rebuild
Recommended when:
- First-time setup
- Major changes (10+ new memories)
- Weekly maintenance (cron job)

```python
stats = engine.build_graph(min_similarity=0.3)
```

### Incremental Update
Recommended when:
- Adding a single memory
- Real-time graph updates
- After each new memory addition

```python
success = engine.add_memory_incremental(memory_id)
# Re-cluster if > 5 new edges added
```
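
One way to picture an incremental update is sketched below: compare the new memory's vector with the existing ones, collect the edges that pass the threshold, and flag a re-cluster when more than five were added. This is not the body of `add_memory_incremental`; the function `add_incremental`, its parameters, and the assumption that the new vector shares the term space of the existing vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two {term: weight} dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def add_incremental(new_id, new_vec, vectors, min_similarity=0.3, recluster_after=5):
    """Link one new memory to existing ones; return (new_edges, needs_recluster)."""
    new_edges = []
    for other_id, vec in vectors.items():
        sim = cosine(new_vec, vec)
        if sim >= min_similarity:
            new_edges.append((new_id, other_id, round(sim, 3)))
    vectors[new_id] = new_vec  # register the node after comparing, to avoid a self-edge
    return new_edges, len(new_edges) > recluster_after


vectors = {i: {"jwt": 1.0, "auth": 0.5} for i in range(6)}
edges, needs_recluster = add_incremental(99, {"jwt": 1.0}, vectors)
print(len(edges), needs_recluster)
```

Only one row of the similarity matrix is computed here, which is why an incremental update is cheaper than a full rebuild.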

### Graph Traversal
Find related memories via BFS traversal:

```python
# 1-hop: Direct neighbors only
related = engine.get_related(memory_id, max_hops=1)

# 2-hop: Neighbors + neighbors of neighbors
related = engine.get_related(memory_id, max_hops=2)
```
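
The hop limit in a breadth-first traversal can be illustrated with a small sketch over an in-memory adjacency map. The function `related_within` and the sample graph are hypothetical; the actual `get_related` presumably reads edges from SQLite and returns richer records.

```python
from collections import deque


def related_within(adjacency, start, max_hops=2):
    """Breadth-first search returning {memory_id: hop_count} within max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # the start memory is not related to itself
    return hops


adjacency = {1: [4, 2], 4: [1, 9], 2: [1], 9: [4, 12], 12: [9]}
print(related_within(adjacency, 1, max_hops=1))
print(related_within(adjacency, 1, max_hops=2))
```

Because BFS visits nodes in order of distance, each memory is recorded with its shortest hop count, matching the `1-hop` and `2-hop` labels in the CLI output.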

## Configuration

### Similarity Threshold
Controls edge creation sensitivity:

```python
# Strict (fewer, stronger connections)
engine.build_graph(min_similarity=0.5)

# Balanced (default)
engine.build_graph(min_similarity=0.3)

# Loose (more connections)
engine.build_graph(min_similarity=0.2)
```

**Recommendations:**
- Small corpus (<50 memories): 0.2-0.3
- Medium corpus (50-200): 0.3-0.4
- Large corpus (>200): 0.4-0.5
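
These recommendations can be turned into a small lookup. The helper `recommended_threshold` is hypothetical and not part of the package; it only encodes the ranges listed above.

```python
def recommended_threshold(n_memories):
    """Return the (low, high) min_similarity range suggested for a corpus size."""
    if n_memories < 50:
        return (0.2, 0.3)
    if n_memories <= 200:
        return (0.3, 0.4)
    return (0.4, 0.5)


low, high = recommended_threshold(120)
print(f"Try min_similarity between {low} and {high}")
```

A caller could start at the low end of the range and raise the value if the graph turns out too dense.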

### Entity Extraction
Adjust entity count in `EntityExtractor`:

```python
extractor = EntityExtractor(max_features=20)  # Default
extractor = EntityExtractor(max_features=30)  # More granular
```

## Integration with Memory System

### Hook Integration (Optional - for Claude CLI)
If using Claude CLI integration, add to `hooks/remember-hook.js`:

```javascript
const { execFile } = require('child_process');

// After storing memory (execFile arguments must be strings)
execFile('python', ['graph_engine.py', 'add-memory', String(memoryId)], (err) => {
  if (err) console.error('Graph update failed:', err);
});
```

**Note:** SuperLocalMemory V2 works standalone. Hooks are an optional Claude CLI integration.

### Automated Rebuild
Add to crontab for weekly rebuild:

```bash
# Run every Sunday at 2 AM
0 2 * * 0 cd ~/.claude-memory && ./venv/bin/python graph_engine.py build
```

### Search Enhancement
Use the graph for context expansion in search:

```python
# `store` (the memory store), `engine` (a GraphEngine), and `query` are created elsewhere

# Find memory via search
memory = store.search(query)[0]

# Expand context with related memories
related = engine.get_related(memory['id'], max_hops=2)

# Include related in context window
context = memory['content']
for rel in related[:3]:  # Top 3
    context += f"\n\nRelated: {rel['summary']}"
```

## Troubleshooting

### No Clusters Detected
**Cause:** Not enough edges or isolated memories
**Solution:**
- Lower similarity threshold: `--min-similarity 0.2`
- Add more memories (need 10+ for good clustering)
- Check if memories are diverse enough

### Slow Build Time
**Cause:** Large corpus (>500 memories)
**Solution:**
- Use incremental updates instead of full rebuild
- Increase similarity threshold to reduce edges
- Run as background job (cron)

### Import Errors (Python 3.14)
**Cause:** Name conflict with the `compression` package that Python 3.14 adds to the standard library
**Solution:**
- Already handled via lazy imports in code
- Imports happen inside methods, not at module level

### Memory Not Found
**Cause:** Memory ID doesn't exist or graph not built
**Solution:**
- Verify memory exists: `SELECT id FROM memories WHERE id = ?`
- Rebuild graph: `python graph_engine.py build`

## Future Enhancements

### Optional LLM Naming
Use a local LLM (Ollama) for better cluster names:

```python
# Prerequisite (run in a shell, not in Python): install Ollama and pull a model
#   ollama pull llama3.2

# Enable LLM naming (future feature)
engine.build_graph(use_llm_naming=True)
```

### Temporal Clustering
Group memories by time + content:

```python
# Future feature: time-aware clustering
engine.build_graph(temporal_weight=0.3)
```

### Interactive Visualization
Web-based D3.js graph viewer (see `docs/architecture/03-ui-architecture.md`)

## References

- [GraphRAG (Microsoft)](https://microsoft.github.io/graphrag/) - Knowledge graph clustering
- [Leiden Algorithm](https://www.nature.com/articles/s41598-019-41695-z) - Community detection
- [PageIndex](https://pageindex.ai/) - Hierarchical RAG patterns
- [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) - Text vectorization

## Files

- `graph_engine.py` - Main implementation (31KB)
- `example_graph_usage.py` - Usage examples
- `docs/architecture/04-graph-engine.md` - Architecture documentation
- `docs/ARCHITECTURE.md` - Full system design

## License

See the `LICENSE` file in the package root. Processing is local-only: no external services are called, and all data stays on your machine.

---

**Implementation complete.** Ready for production use with SuperLocalMemory V2.