superlocalmemory 2.3.0

Files changed (100)
  1. package/ATTRIBUTION.md +140 -0
  2. package/CHANGELOG.md +1749 -0
  3. package/LICENSE +21 -0
  4. package/README.md +600 -0
  5. package/bin/aider-smart +72 -0
  6. package/bin/slm +202 -0
  7. package/bin/slm-npm +73 -0
  8. package/bin/slm.bat +195 -0
  9. package/bin/slm.cmd +10 -0
  10. package/bin/superlocalmemoryv2:list +3 -0
  11. package/bin/superlocalmemoryv2:profile +3 -0
  12. package/bin/superlocalmemoryv2:recall +3 -0
  13. package/bin/superlocalmemoryv2:remember +3 -0
  14. package/bin/superlocalmemoryv2:reset +3 -0
  15. package/bin/superlocalmemoryv2:status +3 -0
  16. package/completions/slm.bash +58 -0
  17. package/completions/slm.zsh +76 -0
  18. package/configs/antigravity-mcp.json +13 -0
  19. package/configs/chatgpt-desktop-mcp.json +7 -0
  20. package/configs/claude-desktop-mcp.json +15 -0
  21. package/configs/codex-mcp.toml +13 -0
  22. package/configs/cody-commands.json +29 -0
  23. package/configs/continue-mcp.yaml +14 -0
  24. package/configs/continue-skills.yaml +26 -0
  25. package/configs/cursor-mcp.json +15 -0
  26. package/configs/gemini-cli-mcp.json +11 -0
  27. package/configs/jetbrains-mcp.json +11 -0
  28. package/configs/opencode-mcp.json +12 -0
  29. package/configs/perplexity-mcp.json +9 -0
  30. package/configs/vscode-copilot-mcp.json +12 -0
  31. package/configs/windsurf-mcp.json +16 -0
  32. package/configs/zed-mcp.json +12 -0
  33. package/docs/ARCHITECTURE.md +877 -0
  34. package/docs/CLI-COMMANDS-REFERENCE.md +425 -0
  35. package/docs/COMPETITIVE-ANALYSIS.md +210 -0
  36. package/docs/COMPRESSION-README.md +390 -0
  37. package/docs/GRAPH-ENGINE.md +503 -0
  38. package/docs/MCP-MANUAL-SETUP.md +720 -0
  39. package/docs/MCP-TROUBLESHOOTING.md +787 -0
  40. package/docs/PATTERN-LEARNING.md +363 -0
  41. package/docs/PROFILES-GUIDE.md +453 -0
  42. package/docs/RESET-GUIDE.md +353 -0
  43. package/docs/SEARCH-ENGINE-V2.2.0.md +748 -0
  44. package/docs/SEARCH-INTEGRATION-GUIDE.md +502 -0
  45. package/docs/UI-SERVER.md +254 -0
  46. package/docs/UNIVERSAL-INTEGRATION.md +432 -0
  47. package/docs/V2.2.0-OPTIONAL-SEARCH.md +666 -0
  48. package/docs/WINDOWS-INSTALL-README.txt +34 -0
  49. package/docs/WINDOWS-POST-INSTALL.txt +45 -0
  50. package/docs/example_graph_usage.py +148 -0
  51. package/hooks/memory-list-skill.js +130 -0
  52. package/hooks/memory-profile-skill.js +284 -0
  53. package/hooks/memory-recall-skill.js +109 -0
  54. package/hooks/memory-remember-skill.js +127 -0
  55. package/hooks/memory-reset-skill.js +274 -0
  56. package/install-skills.sh +436 -0
  57. package/install.ps1 +417 -0
  58. package/install.sh +755 -0
  59. package/mcp_server.py +585 -0
  60. package/package.json +94 -0
  61. package/requirements-core.txt +24 -0
  62. package/requirements.txt +10 -0
  63. package/scripts/postinstall.js +126 -0
  64. package/scripts/preuninstall.js +57 -0
  65. package/skills/slm-build-graph/SKILL.md +423 -0
  66. package/skills/slm-list-recent/SKILL.md +348 -0
  67. package/skills/slm-recall/SKILL.md +325 -0
  68. package/skills/slm-remember/SKILL.md +194 -0
  69. package/skills/slm-status/SKILL.md +363 -0
  70. package/skills/slm-switch-profile/SKILL.md +442 -0
  71. package/src/__pycache__/cache_manager.cpython-312.pyc +0 -0
  72. package/src/__pycache__/embedding_engine.cpython-312.pyc +0 -0
  73. package/src/__pycache__/graph_engine.cpython-312.pyc +0 -0
  74. package/src/__pycache__/hnsw_index.cpython-312.pyc +0 -0
  75. package/src/__pycache__/hybrid_search.cpython-312.pyc +0 -0
  76. package/src/__pycache__/memory-profiles.cpython-312.pyc +0 -0
  77. package/src/__pycache__/memory-reset.cpython-312.pyc +0 -0
  78. package/src/__pycache__/memory_compression.cpython-312.pyc +0 -0
  79. package/src/__pycache__/memory_store_v2.cpython-312.pyc +0 -0
  80. package/src/__pycache__/migrate_v1_to_v2.cpython-312.pyc +0 -0
  81. package/src/__pycache__/pattern_learner.cpython-312.pyc +0 -0
  82. package/src/__pycache__/query_optimizer.cpython-312.pyc +0 -0
  83. package/src/__pycache__/search_engine_v2.cpython-312.pyc +0 -0
  84. package/src/__pycache__/setup_validator.cpython-312.pyc +0 -0
  85. package/src/__pycache__/tree_manager.cpython-312.pyc +0 -0
  86. package/src/cache_manager.py +520 -0
  87. package/src/embedding_engine.py +671 -0
  88. package/src/graph_engine.py +970 -0
  89. package/src/hnsw_index.py +626 -0
  90. package/src/hybrid_search.py +693 -0
  91. package/src/memory-profiles.py +518 -0
  92. package/src/memory-reset.py +485 -0
  93. package/src/memory_compression.py +999 -0
  94. package/src/memory_store_v2.py +1088 -0
  95. package/src/migrate_v1_to_v2.py +638 -0
  96. package/src/pattern_learner.py +898 -0
  97. package/src/query_optimizer.py +513 -0
  98. package/src/search_engine_v2.py +403 -0
  99. package/src/setup_validator.py +479 -0
  100. package/src/tree_manager.py +720 -0
@@ -0,0 +1,503 @@
1
+ # GraphEngine - Knowledge Graph Clustering for SuperLocalMemory
2
+
3
+ Complete implementation of GraphRAG with Leiden community detection for automatic memory clustering and relationship discovery.
4
+
5
+ ## Overview
6
+
7
+ The GraphEngine implements Layer 3 of the memory architecture, building a knowledge graph that auto-discovers relationships between memories across projects using:
8
+
9
+ - **TF-IDF Entity Extraction** - Local keyword extraction (top 20 per memory)
10
+ - **Cosine Similarity Edges** - Relationship building (default threshold: similarity >= 0.3)
11
+ - **Leiden Clustering** - Community detection for thematic grouping
12
+ - **Auto-naming** - TF-IDF-based cluster name generation
13
+
14
+ **All processing is local** - no external APIs, all data stays on your machine.
15
+
16
+ ## Architecture
17
+
18
+ ```
19
+ ┌─────────────────────────────────────────────────────────┐
20
+ │ GraphEngine (graph_engine.py) │
21
+ │ │
22
+ │ ┌──────────────────┐ ┌──────────────────┐ │
23
+ │ │ EntityExtractor │ │ EdgeBuilder │ │
24
+ │ │ (TF-IDF) │ │ (Cosine sim) │ │
25
+ │ └──────────────────┘ └──────────────────┘ │
26
+ │ │
27
+ │ ┌──────────────────┐ ┌──────────────────┐ │
28
+ │ │ ClusterBuilder │ │ ClusterNamer │ │
29
+ │ │ (Leiden) │ │ (TF-IDF) │ │
30
+ │ └──────────────────┘ └──────────────────┘ │
31
+ └─────────────────────────────────────────────────────────┘
32
+
33
+
34
+ ┌───────────────┐
35
+ │ SQLite Tables │
36
+ │ - graph_nodes │
37
+ │ - graph_edges │
38
+ │ - graph_clusters │
39
+ └───────────────┘
40
+ ```
41
+
42
+ ## Components
43
+
44
+ ### 1. EntityExtractor
45
+ Extracts key concepts from memory content using TF-IDF vectorization.
46
+
47
+ **Features:**
48
+ - Top 20 keywords per memory
49
+ - Unigrams + bigrams (e.g., "authentication", "jwt tokens")
50
+ - English stop words filtering
51
+ - Minimum score threshold (0.05)
52
+
53
+ **Example:**
54
+ ```python
55
+ memory = "Next.js authentication using NextAuth.js with JWT tokens..."
56
+
57
+ # Entities extracted from the memory above
58
+ entities = ["nextjs", "authentication", "nextauth", "jwt", "tokens",
59
+             "oauth", "session", "credentials", "callback", "api"]
60
+ ```
61
+
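The extraction step can be sketched without any dependencies. This is a minimal illustration, not the package's code: `terms`, `extract_entities`, the tiny stop-word list, and the smoothed IDF formula are assumptions made for the example. Only the top-20 limit, the unigram + bigram terms, and the 0.05 score threshold come from the description above.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; a real list is much longer).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "using", "with"}


def terms(text):
    """Lowercase, drop stop words, and return unigrams plus bigrams."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def extract_entities(docs, top_k=20, min_score=0.05):
    """Return, for each document, its top_k terms by L2-normalised TF-IDF score."""
    doc_terms = [Counter(terms(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(t for counts in doc_terms for t in counts)
    results = []
    for counts in doc_terms:
        # Smoothed IDF: ln((1 + N) / (1 + df)) + 1
        scores = {t: tf * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
                  for t, tf in counts.items()}
        norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(((s / norm, t) for t, s in scores.items()),
                        key=lambda pair: (-pair[0], pair[1]))
        results.append([t for s, t in ranked if s >= min_score][:top_k])
    return results


docs = [
    "Next.js authentication using NextAuth.js with JWT tokens",
    "Python API security with JWT tokens and OAuth",
]
entities = extract_entities(docs)
print(entities[0])
```

Terms that occur in every document (here `jwt` and `tokens`) receive a lower IDF weight, so terms specific to one memory rank first.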
62
+ ### 2. EdgeBuilder
63
+ Creates weighted edges between memories whose TF-IDF vectors are sufficiently similar, and records the entities they share.
64
+
65
+ **Algorithm:**
66
+ 1. Compute pairwise cosine similarity of TF-IDF vectors
67
+ 2. Create edge if similarity >= threshold (default 0.3)
68
+ 3. Classify relationship type:
69
+ - `similar` (sim > 0.7) - Strong match
70
+ - `depends_on` - Contains dependency keywords
71
+ - `related_to` - Moderate/weak match
72
+
73
+ **Example:**
74
+ ```python
75
+ memory_42 = ["nextjs", "authentication", "jwt"]
76
+ memory_15 = ["jwt", "tokens", "authentication", "python"]
77
+
78
+ similarity = 0.72  # cosine similarity of the two TF-IDF vectors
79
+ shared_entities = ["authentication", "jwt"]
80
+ edge_type = "similar"  # because similarity > 0.7
81
+ ```
82
+
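The three algorithm steps can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: the function names, the sparse `{term: weight}` vector format, the order of the classification checks, and the `DEPENDENCY_KEYWORDS` set are hypothetical, since the text does not list the actual dependency keywords.

```python
import math

# Hypothetical dependency keywords; the real list is not given in the text.
DEPENDENCY_KEYWORDS = {"depends", "requires", "dependency"}


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(similarity, shared):
    """Map a similarity score and the shared terms to a relationship type."""
    if similarity > 0.7:
        return "similar"
    if DEPENDENCY_KEYWORDS & set(shared):
        return "depends_on"
    return "related_to"


def build_edges(vectors, min_similarity=0.3):
    """Return one edge per memory pair whose similarity reaches the threshold."""
    edges = []
    ids = sorted(vectors)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            sim = cosine_similarity(vectors[a], vectors[b])
            if sim >= min_similarity:
                shared = sorted(set(vectors[a]) & set(vectors[b]))
                edges.append({"source": a, "target": b, "weight": round(sim, 3),
                              "shared_entities": shared,
                              "relationship_type": classify(sim, shared)})
    return edges


vectors = {
    42: {"nextjs": 0.3, "authentication": 0.7, "jwt": 0.6},
    15: {"jwt": 0.6, "tokens": 0.3, "authentication": 0.7, "python": 0.2},
    7: {"docker": 0.9, "compose": 0.4},
}
edges = build_edges(vectors)
print(edges)
```

Memory 7 shares no terms with the other two, so its similarity is 0 and it stays isolated. Only the pair (15, 42) produces an edge.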
83
+ ### 3. ClusterBuilder
84
+ Groups related memories into thematic clusters using the Leiden algorithm.
85
+
86
+ **Why Leiden?**
87
+ - Guarantees well-connected communities, which the Louvain algorithm does not
88
+ - Reproducible when a fixed random seed is used
89
+ - Scalable (handles 1000+ nodes)
90
+ - Production-ready (used by Scanpy, 10k+ citations)
91
+
92
+ **Performance:**
93
+ - 50 memories: ~500ms
94
+ - 100 memories: ~2s
95
+ - 500 memories: ~15s
96
+
97
+ **Output:**
98
+ ```
99
+ Cluster #1: 8 memories (avg importance: 7.2)
100
+ Theme: Authentication & JWT
101
+ Members: [12, 15, 23, 33, 42, 52, 67, 71]
102
+
103
+ Cluster #2: 12 memories (avg importance: 6.8)
104
+ Theme: React & Architecture
105
+ Members: [5, 8, 14, 19, 28, 35, 41, 46, 53, 60, 65, 70]
106
+ ```
107
+
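Once a clustering step has assigned each memory a cluster label, the per-cluster figures shown in the output (member list, size, average importance) are a simple aggregation. The sketch below assumes a hypothetical `membership` mapping of memory ID to cluster ID and an `importance` mapping of memory ID to score. It does not run Leiden itself.

```python
from collections import defaultdict


def summarize_clusters(membership, importance):
    """Group memory IDs by cluster label and compute size and mean importance."""
    groups = defaultdict(list)
    for memory_id, cluster_id in membership.items():
        groups[cluster_id].append(memory_id)
    summaries = []
    for cluster_id, members in sorted(groups.items()):
        avg = sum(importance[m] for m in members) / len(members)
        summaries.append({"cluster_id": cluster_id,
                          "members": sorted(members),
                          "member_count": len(members),
                          "avg_importance": round(avg, 1)})
    return summaries


# Hypothetical clustering result and importance scores.
membership = {12: 1, 15: 1, 23: 1, 5: 2, 8: 2}
importance = {12: 8, 15: 7, 23: 6, 5: 7, 8: 6}
summaries = summarize_clusters(membership, importance)
for s in summaries:
    print(f"Cluster #{s['cluster_id']}: {s['member_count']} memories "
          f"(avg importance: {s['avg_importance']})")
```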
108
+ ### 4. ClusterNamer
109
+ Auto-generates human-readable cluster names from member entities.
110
+
111
+ **Strategy:**
112
+ 1. Collect all entities from cluster members
113
+ 2. Count entity frequencies
114
+ 3. Use top 2-3 entities for name
115
+
116
+ **Examples:**
117
+ - `"Authentication & JWT"` (from ["authentication", "jwt", "oauth"])
118
+ - `"React & Architecture"` (from ["react", "component", "architecture"])
119
+ - `"Performance & Optimization"` (from ["performance", "optimize", "speed"])
120
+
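The naming strategy can be sketched with a frequency counter. The helper names and the rule of upper-casing tokens of three characters or fewer (so that `jwt` becomes `JWT`) are assumptions for illustration. The text only states that the top 2-3 entities by frequency form the name.

```python
from collections import Counter


def format_term(term):
    """Upper-case short tokens (assumed acronyms), capitalise everything else."""
    return term.upper() if len(term) <= 3 else term.capitalize()


def name_cluster(member_entities, max_terms=2):
    """Join the most frequent entities across all members into a display name."""
    counts = Counter(e for entities in member_entities for e in entities)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return " & ".join(format_term(term) for term, _ in ranked[:max_terms])


cluster = [
    ["authentication", "jwt", "oauth"],
    ["authentication", "jwt", "session"],
    ["jwt", "authentication"],
]
name = name_cluster(cluster)
print(name)
```

Breaking ties alphabetically keeps names stable between rebuilds when two entities are equally frequent.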
121
+ ## Database Schema
122
+
123
+ ### graph_nodes
124
+ Stores extracted entities and embedding vectors for each memory.
125
+
126
+ ```sql
127
+ CREATE TABLE graph_nodes (
128
+ id INTEGER PRIMARY KEY,
129
+ memory_id INTEGER UNIQUE NOT NULL,
130
+ entities TEXT, -- JSON: ["auth", "jwt", ...]
131
+ embedding_vector TEXT, -- JSON: TF-IDF vector
132
+ created_at TIMESTAMP,
133
+ FOREIGN KEY (memory_id) REFERENCES memories(id)
134
+ );
135
+ ```
136
+
137
+ ### graph_edges
138
+ Stores relationships between memories.
139
+
140
+ ```sql
141
+ CREATE TABLE graph_edges (
142
+ id INTEGER PRIMARY KEY,
143
+ source_memory_id INTEGER NOT NULL,
144
+ target_memory_id INTEGER NOT NULL,
145
+ relationship_type TEXT, -- 'similar', 'depends_on', 'related_to'
146
+ weight REAL, -- Similarity score (0-1)
147
+ shared_entities TEXT, -- JSON: ["auth", "jwt"]
148
+ similarity_score REAL,
149
+ created_at TIMESTAMP,
150
+ UNIQUE(source_memory_id, target_memory_id)
151
+ );
152
+ ```
153
+
154
+ ### graph_clusters
155
+ Stores detected communities.
156
+
157
+ ```sql
158
+ CREATE TABLE graph_clusters (
159
+ id INTEGER PRIMARY KEY,
160
+ name TEXT NOT NULL, -- "Authentication & JWT"
161
+ description TEXT,
162
+ member_count INTEGER,
163
+ avg_importance REAL,
164
+ created_at TIMESTAMP,
165
+ updated_at TIMESTAMP
166
+ );
167
+ ```
168
+
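To show how the edge table can be queried, the sketch below creates `graph_edges` in an in-memory SQLite database and looks up the neighbors of one memory. The `memories` table here is a minimal stand-in, the sample rows are invented, and treating edges as undirected in the lookup is an assumption rather than documented behavior.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT);  -- minimal stand-in
CREATE TABLE graph_edges (
    id INTEGER PRIMARY KEY,
    source_memory_id INTEGER NOT NULL,
    target_memory_id INTEGER NOT NULL,
    relationship_type TEXT,
    weight REAL,
    shared_entities TEXT,
    similarity_score REAL,
    created_at TIMESTAMP,
    UNIQUE(source_memory_id, target_memory_id)
);
""")

conn.executemany("INSERT INTO memories (id, content) VALUES (?, ?)",
                 [(1, "JWT auth"), (2, "API security"), (4, "OAuth flow")])
conn.executemany(
    "INSERT INTO graph_edges (source_memory_id, target_memory_id, relationship_type,"
    " weight, shared_entities, similarity_score) VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 4, "similar", 0.875, json.dumps(["authentication", "jwt", "oauth"]), 0.875),
     (2, 1, "related_to", 0.709, json.dumps(["security", "api", "tokens"]), 0.709)],
)


def neighbors(conn, memory_id):
    """Return (neighbor_id, relationship, weight, shared) rows, strongest first."""
    rows = conn.execute(
        """
        SELECT CASE WHEN source_memory_id = ? THEN target_memory_id
                    ELSE source_memory_id END AS neighbor,
               relationship_type, weight, shared_entities
        FROM graph_edges
        WHERE source_memory_id = ? OR target_memory_id = ?
        ORDER BY weight DESC
        """,
        (memory_id, memory_id, memory_id),
    ).fetchall()
    return [(n, rel, w, json.loads(shared)) for n, rel, w, shared in rows]


result = neighbors(conn, 1)
print(result)
```

The `UNIQUE(source_memory_id, target_memory_id)` constraint rejects a second row for the same ordered pair, so a rebuild has to delete or replace existing edges rather than insert duplicates.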
169
+ ## Installation
170
+
171
+ ### Dependencies
172
+ ```bash
173
+ pip install scikit-learn python-igraph leidenalg
174
+ ```
175
+
176
+ **Note:** All dependencies are already installed in the virtual environment.
177
+
178
+ ## Usage
179
+
180
+ ### CLI Commands
181
+
182
+ #### 1. Build Complete Graph
183
+ ```bash
184
+ python graph_engine.py build [--min-similarity 0.3]
185
+ ```
186
+
187
+ **Output:**
188
+ ```json
189
+ {
190
+ "success": true,
191
+ "memories": 18,
192
+ "nodes": 18,
193
+ "edges": 40,
194
+ "clusters": 4,
195
+ "time_seconds": 0.03
196
+ }
197
+ ```
198
+
199
+ #### 2. View Statistics
200
+ ```bash
201
+ python graph_engine.py stats
202
+ ```
203
+
204
+ **Output:**
205
+ ```json
206
+ {
207
+ "nodes": 18,
208
+ "edges": 40,
209
+ "clusters": 4,
210
+ "top_clusters": [
211
+ {
212
+ "name": "Authentication & Tokens",
213
+ "members": 4,
214
+ "avg_importance": 6.2
215
+ },
216
+ {
217
+ "name": "Performance & Code",
218
+ "members": 4,
219
+ "avg_importance": 5.0
220
+ }
221
+ ]
222
+ }
223
+ ```
224
+
225
+ #### 3. Find Related Memories
226
+ ```bash
227
+ python graph_engine.py related --memory-id 1 [--hops 2]
228
+ ```
229
+
230
+ **Output:**
231
+ ```
232
+ 1. Memory #4 (1-hop, weight=0.875)
233
+ Relationship: similar
234
+ Summary: Authentication implementation...
235
+ Shared: authentication, jwt, oauth
236
+
237
+ 2. Memory #2 (1-hop, weight=0.709)
238
+ Relationship: related_to
239
+ Summary: API security patterns...
240
+ Shared: security, api, tokens
241
+ ```
242
+
243
+ #### 4. View Cluster Members
244
+ ```bash
245
+ python graph_engine.py cluster --cluster-id 1
246
+ ```
247
+
248
+ **Output:**
249
+ ```
250
+ Cluster #1 members:
251
+
252
+ 1. Memory #5 (importance=7)
253
+ JWT authentication implementation...
254
+
255
+ 2. Memory #8 (importance=6)
256
+ OAuth2 flow setup...
257
+
258
+ 3. Memory #10 (importance=8)
259
+ Security best practices...
260
+ ```
261
+
262
+ ### Programmatic Usage
263
+
264
+ ```python
265
+ from graph_engine import GraphEngine
266
+
267
+ # Initialize engine
268
+ engine = GraphEngine()
269
+
270
+ # Build complete graph
271
+ stats = engine.build_graph(min_similarity=0.3)
272
+ print(f"Built graph: {stats['nodes']} nodes, {stats['edges']} edges")
273
+
274
+ # Find related memories
275
+ related = engine.get_related(memory_id=1, max_hops=2)
276
+ for mem in related:
277
+ print(f"Related: #{mem['id']} ({mem['relationship']}, weight={mem['weight']:.3f})")
278
+
279
+ # Query clusters
280
+ stats = engine.get_stats()
281
+ for cluster in stats['top_clusters']:
282
+ print(f"Cluster: {cluster['name']} ({cluster['members']} members)")
283
+
284
+ # Get cluster details
285
+ members = engine.get_cluster_members(cluster_id)
286
+ for mem in members:
287
+ print(f" - Memory #{mem['id']}: {mem['summary'][:50]}...")
288
+
289
+ # Add memory incrementally
290
+ success = engine.add_memory_incremental(new_memory_id)
291
+ if success:
292
+ print("Memory added to graph successfully")
293
+ ```
294
+
295
+ ### Example Script
296
+
297
+ Run `example_graph_usage.py` to see all features in action:
298
+
299
+ ```bash
300
+ python example_graph_usage.py
301
+ ```
302
+
303
+ This demonstrates:
304
+ 1. Building a complete graph
305
+ 2. Finding related memories
306
+ 3. Querying clusters
307
+ 4. Extracting entities
308
+ 5. Incremental memory addition
309
+
310
+ ## Performance
311
+
312
+ ### Build Time (Full Graph)
313
+ - 10 memories: 0.02s
314
+ - 50 memories: 0.5s
315
+ - 100 memories: 2s
316
+ - 500 memories: ~15s
317
+
318
+ ### Query Time
319
+ - Find related (1-hop): <5ms
320
+ - Find related (2-hop): <10ms
321
+ - Get cluster members: <2ms
322
+ - Graph stats: <5ms
323
+
324
+ ### Space Complexity
325
+ - **Sparse storage** - Only edges at or above the similarity threshold are stored
326
+ - **Example:** 50 memories
327
+ - Full matrix: 2,500 entries
328
+ - Sparse graph: ~150 edges (94% reduction)
329
+
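The arithmetic behind the 94% figure can be checked directly. `storage_savings` is a hypothetical helper written for this example. It also reports the saving relative to the number of unique memory pairs, which is the tighter baseline when each pair is stored once.

```python
def storage_savings(n_memories, n_edges):
    """Compare stored edges against the dense n x n matrix and the unique pairs."""
    full_matrix = n_memories * n_memories
    unique_pairs = n_memories * (n_memories - 1) // 2
    return {
        "full_matrix": full_matrix,
        "unique_pairs": unique_pairs,
        "reduction_vs_matrix": 1 - n_edges / full_matrix,
        "reduction_vs_pairs": 1 - n_edges / unique_pairs,
    }


savings = storage_savings(50, 150)
print(f"{savings['full_matrix']} entries, "
      f"{savings['reduction_vs_matrix']:.0%} reduction")
```

For 50 memories and 150 edges this gives 2,500 matrix entries and a 94% reduction. Measured against the 1,225 unique pairs, the reduction is about 88%.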
330
+ ## Graph Operations
331
+
332
+ ### Full Rebuild
333
+ Recommended when:
334
+ - First-time setup
335
+ - Major changes (10+ new memories)
336
+ - Weekly maintenance (cron job)
337
+
338
+ ```python
339
+ stats = engine.build_graph(min_similarity=0.3)
340
+ ```
341
+
342
+ ### Incremental Update
343
+ Recommended when:
344
+ - Adding a single memory
345
+ - Real-time graph updates
346
+ - After each new memory addition
347
+
348
+ ```python
349
+ success = engine.add_memory_incremental(memory_id)
350
+ # Re-cluster if > 5 new edges added
351
+ ```
352
+
353
+ ### Graph Traversal
354
+ Find related memories via BFS traversal:
355
+
356
+ ```python
357
+ # 1-hop: Direct neighbors only
358
+ related = engine.get_related(memory_id, max_hops=1)
359
+
360
+ # 2-hop: Neighbors + neighbors of neighbors
361
+ related = engine.get_related(memory_id, max_hops=2)
362
+ ```
363
+
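A hop-limited breadth-first search can be sketched as below. `related_within_hops` is a hypothetical stand-alone function, not the `get_related` method itself. It assumes edges are undirected `(source, target, weight)` tuples and visits stronger neighbors first.

```python
from collections import deque


def related_within_hops(edges, start, max_hops=2):
    """Breadth-first search over undirected weighted edges, up to max_hops away."""
    adjacency = {}
    for source, target, weight in edges:
        adjacency.setdefault(source, []).append((target, weight))
        adjacency.setdefault(target, []).append((source, weight))

    seen = {start}
    queue = deque([(start, 0)])
    related = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        # Visit the strongest connections of this node first.
        for neighbor, weight in sorted(adjacency.get(node, []), key=lambda nw: -nw[1]):
            if neighbor not in seen:
                seen.add(neighbor)
                related.append({"id": neighbor, "hops": hops + 1, "weight": weight})
                queue.append((neighbor, hops + 1))
    return related


edges = [(1, 4, 0.875), (1, 2, 0.709), (4, 9, 0.41), (9, 11, 0.35)]
one_hop = related_within_hops(edges, 1, max_hops=1)
two_hop = related_within_hops(edges, 1, max_hops=2)
print([r["id"] for r in one_hop], [r["id"] for r in two_hop])
```

The `seen` set ensures each memory is reported once, at its shortest hop distance. Memory 11 is three hops from memory 1, so it appears in neither result.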
364
+ ## Configuration
365
+
366
+ ### Similarity Threshold
367
+ Controls edge creation sensitivity:
368
+
369
+ ```python
370
+ # Strict (fewer, stronger connections)
371
+ engine.build_graph(min_similarity=0.5)
372
+
373
+ # Balanced (default)
374
+ engine.build_graph(min_similarity=0.3)
375
+
376
+ # Loose (more connections)
377
+ engine.build_graph(min_similarity=0.2)
378
+ ```
379
+
380
+ **Recommendations:**
381
+ - Small corpus (<50 memories): 0.2-0.3
382
+ - Medium corpus (50-200): 0.3-0.4
383
+ - Large corpus (>200): 0.4-0.5
384
+
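The recommendations can be encoded as a small lookup. `suggested_threshold_range` is a hypothetical helper, not part of the package, and placing the boundary values 50 and 200 in the medium band is an assumption.

```python
def suggested_threshold_range(n_memories):
    """Return the (low, high) similarity threshold range for a corpus size."""
    if n_memories < 50:
        return (0.2, 0.3)   # small corpus
    if n_memories <= 200:
        return (0.3, 0.4)   # medium corpus
    return (0.4, 0.5)       # large corpus


low, high = suggested_threshold_range(120)
print(f"Try min_similarity between {low} and {high}")
# The lower bound could then be passed on, e.g. engine.build_graph(min_similarity=low)
```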
385
+ ### Entity Extraction
386
+ Adjust entity count in `EntityExtractor`:
387
+
388
+ ```python
389
+ extractor = EntityExtractor(max_features=20) # Default
390
+ extractor = EntityExtractor(max_features=30) # More granular
391
+ ```
392
+
393
+ ## Integration with Memory System
394
+
395
+ ### Hook Integration (Optional - for Claude CLI)
396
+ If using Claude CLI integration, add to `hooks/remember-hook.js`:
397
+
398
+ ```javascript
399
+ const { execFile } = require('child_process'); // call after storing a memory
400
+ execFile('python', ['graph_engine.py', 'add-memory', memoryId], (err) => {
401
+ if (err) console.error('Graph update failed:', err);
402
+ });
403
+ ```
404
+
405
+ **Note:** SuperLocalMemory V2 works standalone. Hooks are optional Claude CLI integration.
406
+
407
+ ### Automated Rebuild
408
+ Add to crontab for weekly rebuild:
409
+
410
+ ```bash
411
+ # Run every Sunday at 2 AM
412
+ 0 2 * * 0 cd ~/.claude-memory && ./venv/bin/python graph_engine.py build
413
+ ```
414
+
415
+ ### Search Enhancement
416
+ Use graph for context expansion in search:
417
+
418
+ ```python
419
+ # Find memory via search
420
+ memory = store.search(query)[0]
421
+
422
+ # Expand context with related memories
423
+ related = engine.get_related(memory['id'], max_hops=2)
424
+
425
+ # Include related in context window
426
+ context = memory['content']
427
+ for rel in related[:3]: # Top 3
428
+ context += f"\n\nRelated: {rel['summary']}"
429
+ ```
430
+
431
+ ## Troubleshooting
432
+
433
+ ### No Clusters Detected
434
+ **Cause:** Not enough edges or isolated memories
435
+ **Solution:**
436
+ - Lower similarity threshold: `--min-similarity 0.2`
437
+ - Add more memories (need 10+ for good clustering)
438
+ - Check if memories are diverse enough
439
+
440
+ ### Slow Build Time
441
+ **Cause:** Large corpus (>500 memories)
442
+ **Solution:**
443
+ - Use incremental updates instead of full rebuild
444
+ - Increase similarity threshold to reduce edges
445
+ - Run as background job (cron)
446
+
447
+ ### Import Errors (Python 3.14)
448
+ **Cause:** Name conflict with the standard-library `compression` package added in Python 3.14
449
+ **Solution:**
450
+ - Already handled via lazy imports in code
451
+ - Imports happen inside methods, not at module level
452
+
453
+ ### Memory Not Found
454
+ **Cause:** Memory ID doesn't exist or graph not built
455
+ **Solution:**
456
+ - Verify memory exists: `SELECT id FROM memories WHERE id = ?`
457
+ - Rebuild graph: `python graph_engine.py build`
458
+
459
+ ## Future Enhancements
460
+
461
+ ### Optional LLM Naming
462
+ Use a local LLM (Ollama) for better cluster names:
463
+
464
+ ```python
465
+ # Install Ollama and pull a model first (shell command, run outside Python):
466
+ #   ollama pull llama3.2
467
+
468
+ # Enable LLM naming (future feature)
469
+ engine.build_graph(use_llm_naming=True)
470
+ ```
471
+
472
+ ### Temporal Clustering
473
+ Group memories by time + content:
474
+
475
+ ```python
476
+ # Future feature: time-aware clustering
477
+ engine.build_graph(temporal_weight=0.3)
478
+ ```
479
+
480
+ ### Interactive Visualization
481
+ Web-based D3.js graph viewer (see `docs/architecture/03-ui-architecture.md`)
482
+
483
+ ## References
484
+
485
+ - [GraphRAG (Microsoft)](https://microsoft.github.io/graphrag/) - Knowledge graph clustering
486
+ - [Leiden Algorithm](https://www.nature.com/articles/s41598-019-41695-z) - Community detection
487
+ - [PageIndex](https://pageindex.ai/) - Hierarchical RAG patterns
488
+ - [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) - Text vectorization
489
+
490
+ ## Files
491
+
492
+ - `graph_engine.py` - Main implementation (31KB)
493
+ - `example_graph_usage.py` - Usage examples
494
+ - `docs/architecture/04-graph-engine.md` - Architecture documentation
495
+ - `docs/ARCHITECTURE.md` - Full system design
496
+
497
+ ## License
498
+
499
+ See the `LICENSE` file in the package root. All processing is local with no external services, and all data stays on your machine.
500
+
501
+ ---
502
+
503
+ **Implementation complete.** Ready for production use with SuperLocalMemory V2.