superlocalmemory 2.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/ATTRIBUTION.md +140 -0
- package/CHANGELOG.md +1749 -0
- package/LICENSE +21 -0
- package/README.md +600 -0
- package/bin/aider-smart +72 -0
- package/bin/slm +202 -0
- package/bin/slm-npm +73 -0
- package/bin/slm.bat +195 -0
- package/bin/slm.cmd +10 -0
- package/bin/superlocalmemoryv2:list +3 -0
- package/bin/superlocalmemoryv2:profile +3 -0
- package/bin/superlocalmemoryv2:recall +3 -0
- package/bin/superlocalmemoryv2:remember +3 -0
- package/bin/superlocalmemoryv2:reset +3 -0
- package/bin/superlocalmemoryv2:status +3 -0
- package/completions/slm.bash +58 -0
- package/completions/slm.zsh +76 -0
- package/configs/antigravity-mcp.json +13 -0
- package/configs/chatgpt-desktop-mcp.json +7 -0
- package/configs/claude-desktop-mcp.json +15 -0
- package/configs/codex-mcp.toml +13 -0
- package/configs/cody-commands.json +29 -0
- package/configs/continue-mcp.yaml +14 -0
- package/configs/continue-skills.yaml +26 -0
- package/configs/cursor-mcp.json +15 -0
- package/configs/gemini-cli-mcp.json +11 -0
- package/configs/jetbrains-mcp.json +11 -0
- package/configs/opencode-mcp.json +12 -0
- package/configs/perplexity-mcp.json +9 -0
- package/configs/vscode-copilot-mcp.json +12 -0
- package/configs/windsurf-mcp.json +16 -0
- package/configs/zed-mcp.json +12 -0
- package/docs/ARCHITECTURE.md +877 -0
- package/docs/CLI-COMMANDS-REFERENCE.md +425 -0
- package/docs/COMPETITIVE-ANALYSIS.md +210 -0
- package/docs/COMPRESSION-README.md +390 -0
- package/docs/GRAPH-ENGINE.md +503 -0
- package/docs/MCP-MANUAL-SETUP.md +720 -0
- package/docs/MCP-TROUBLESHOOTING.md +787 -0
- package/docs/PATTERN-LEARNING.md +363 -0
- package/docs/PROFILES-GUIDE.md +453 -0
- package/docs/RESET-GUIDE.md +353 -0
- package/docs/SEARCH-ENGINE-V2.2.0.md +748 -0
- package/docs/SEARCH-INTEGRATION-GUIDE.md +502 -0
- package/docs/UI-SERVER.md +254 -0
- package/docs/UNIVERSAL-INTEGRATION.md +432 -0
- package/docs/V2.2.0-OPTIONAL-SEARCH.md +666 -0
- package/docs/WINDOWS-INSTALL-README.txt +34 -0
- package/docs/WINDOWS-POST-INSTALL.txt +45 -0
- package/docs/example_graph_usage.py +148 -0
- package/hooks/memory-list-skill.js +130 -0
- package/hooks/memory-profile-skill.js +284 -0
- package/hooks/memory-recall-skill.js +109 -0
- package/hooks/memory-remember-skill.js +127 -0
- package/hooks/memory-reset-skill.js +274 -0
- package/install-skills.sh +436 -0
- package/install.ps1 +417 -0
- package/install.sh +755 -0
- package/mcp_server.py +585 -0
- package/package.json +94 -0
- package/requirements-core.txt +24 -0
- package/requirements.txt +10 -0
- package/scripts/postinstall.js +126 -0
- package/scripts/preuninstall.js +57 -0
- package/skills/slm-build-graph/SKILL.md +423 -0
- package/skills/slm-list-recent/SKILL.md +348 -0
- package/skills/slm-recall/SKILL.md +325 -0
- package/skills/slm-remember/SKILL.md +194 -0
- package/skills/slm-status/SKILL.md +363 -0
- package/skills/slm-switch-profile/SKILL.md +442 -0
- package/src/__pycache__/cache_manager.cpython-312.pyc +0 -0
- package/src/__pycache__/embedding_engine.cpython-312.pyc +0 -0
- package/src/__pycache__/graph_engine.cpython-312.pyc +0 -0
- package/src/__pycache__/hnsw_index.cpython-312.pyc +0 -0
- package/src/__pycache__/hybrid_search.cpython-312.pyc +0 -0
- package/src/__pycache__/memory-profiles.cpython-312.pyc +0 -0
- package/src/__pycache__/memory-reset.cpython-312.pyc +0 -0
- package/src/__pycache__/memory_compression.cpython-312.pyc +0 -0
- package/src/__pycache__/memory_store_v2.cpython-312.pyc +0 -0
- package/src/__pycache__/migrate_v1_to_v2.cpython-312.pyc +0 -0
- package/src/__pycache__/pattern_learner.cpython-312.pyc +0 -0
- package/src/__pycache__/query_optimizer.cpython-312.pyc +0 -0
- package/src/__pycache__/search_engine_v2.cpython-312.pyc +0 -0
- package/src/__pycache__/setup_validator.cpython-312.pyc +0 -0
- package/src/__pycache__/tree_manager.cpython-312.pyc +0 -0
- package/src/cache_manager.py +520 -0
- package/src/embedding_engine.py +671 -0
- package/src/graph_engine.py +970 -0
- package/src/hnsw_index.py +626 -0
- package/src/hybrid_search.py +693 -0
- package/src/memory-profiles.py +518 -0
- package/src/memory-reset.py +485 -0
- package/src/memory_compression.py +999 -0
- package/src/memory_store_v2.py +1088 -0
- package/src/migrate_v1_to_v2.py +638 -0
- package/src/pattern_learner.py +898 -0
- package/src/query_optimizer.py +513 -0
- package/src/search_engine_v2.py +403 -0
- package/src/setup_validator.py +479 -0
- package/src/tree_manager.py +720 -0
@@ -0,0 +1,666 @@
# SuperLocalMemory V2.2.0 - Optional Search Components

**Created by:** Varun Pratap Bhardwaj
**Version:** 2.2.0
**Date:** February 7, 2026

---

## Overview

SuperLocalMemory V2.2.0 introduces two powerful optional search components that significantly enhance memory retrieval performance and accuracy:

1. **HNSW Index** (`src/hnsw_index.py`) - Fast approximate nearest neighbor search
2. **Embedding Engine** (`src/embedding_engine.py`) - Local semantic embedding generation

Both components are **completely optional** with graceful fallback to existing TF-IDF methods if dependencies are not installed.

---
## Component #1: HNSW Index

### What is HNSW?

HNSW (Hierarchical Navigable Small World) is a state-of-the-art algorithm for approximate nearest neighbor search. It provides:

- **Sub-10ms search** for 10,000 memories
- **Sub-50ms search** for 100,000 memories
- **Incremental updates** without full rebuild
- **Disk persistence** for instant startup

### Installation

```bash
# Optional - only install if you want HNSW acceleration
pip install hnswlib
```
### Usage

```python
from hnsw_index import HNSWIndex
import numpy as np

# Initialize index
index = HNSWIndex(dimension=384)  # Match your embedding dimension

# Build from vectors
vectors = np.random.randn(1000, 384).astype('float32')
memory_ids = list(range(1000))
index.build(vectors, memory_ids)

# Search for similar vectors
query = np.random.randn(384).astype('float32')
results = index.search(query, k=5)
# Returns: [(memory_id, similarity_score), ...]

# Add single vector (incremental)
new_vector = np.random.randn(384).astype('float32')
index.add(new_vector, memory_id=1001)

# Get statistics
stats = index.get_stats()
print(f"Indexed: {stats['indexed_count']} vectors")
print(f"Method: {'HNSW' if stats['use_hnsw'] else 'Linear fallback'}")
```
### Fallback Behavior

If `hnswlib` is not installed:

- Automatically falls back to sklearn-based linear search
- Performance: ~100ms for 10K vectors (vs <10ms with HNSW)
- No feature degradation; all functionality remains available
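The linear fallback is conceptually a brute-force cosine scan over all stored vectors. A minimal numpy sketch of the idea (an illustration of the fallback strategy, not the module's actual code; `linear_search` is a hypothetical helper name):

```python
import numpy as np

def linear_search(query: np.ndarray, vectors: np.ndarray,
                  memory_ids: list, k: int = 5) -> list:
    """Brute-force cosine similarity over every stored vector (O(n*d))."""
    # Normalize so dot products become cosine similarities
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return [(memory_ids[i], float(scores[i])) for i in top]

vectors = np.random.randn(100, 384).astype('float32')
ids = list(range(100))
# Querying with a stored vector should rank that vector first
results = linear_search(vectors[7], vectors, ids, k=3)
```

This is exactly the work HNSW's graph traversal avoids, which is where the speedup in the table below comes from.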
### CLI Commands

```bash
# Show index statistics
python src/hnsw_index.py stats

# Rebuild from database
python src/hnsw_index.py rebuild

# Run performance test
python src/hnsw_index.py test
```
### Performance Benchmarks

| Dataset Size | HNSW Search | Linear Search | Speedup |
|--------------|-------------|---------------|---------|
| 1K vectors   | <1ms        | ~10ms         | 10x     |
| 10K vectors  | <10ms       | ~100ms        | 10x     |
| 100K vectors | <50ms       | ~1000ms       | 20x     |
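Numbers like these are hardware-dependent; the linear-search column can be reproduced approximately with a small timing harness (a sketch of the measurement, not the project's benchmark script; timing HNSW itself would additionally require `hnswlib`):

```python
import time
import numpy as np

def time_linear_search(n: int, dim: int = 384, queries: int = 10) -> float:
    """Average per-query latency in ms for a brute-force scan over n vectors."""
    vectors = np.random.randn(n, dim).astype('float32')
    qs = np.random.randn(queries, dim).astype('float32')
    start = time.perf_counter()
    for q in qs:
        scores = vectors @ q            # dot-product scan, O(n*dim)
        _ = scores.argsort()[::-1][:5]  # select top-5
    return (time.perf_counter() - start) / queries * 1000

ms = time_linear_search(1000)
print(f"linear search, 1K vectors: {ms:.2f} ms/query")
```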
### Configuration

```python
index = HNSWIndex(
    dimension=384,           # Vector dimension
    max_elements=100_000,    # Maximum vectors to index
    m=16,                    # HNSW M parameter (connections)
    ef_construction=200,     # Build-time quality
    ef_search=50             # Search-time quality
)
```

**Parameters explained:**

- `m`: Higher = better accuracy, more memory (typical: 16)
- `ef_construction`: Higher = better index quality, slower build (typical: 200)
- `ef_search`: Higher = better accuracy, slower search (typical: 50)
### Security & Limits

```python
MAX_MEMORIES_FOR_HNSW = 100_000  # Prevents memory exhaustion
MAX_DIMENSION = 5000             # Typical: 384 for embeddings
```

---
## Component #2: Embedding Engine

### What is the Embedding Engine?

The Embedding Engine generates semantic embeddings locally using sentence-transformers, enabling:

- **True semantic search** (understands meaning, not just keywords)
- **100% local processing** (no API calls, no cloud dependencies)
- **GPU acceleration** (CUDA/Apple Silicon MPS support)
- **Smart caching** (LRU cache for 10K embeddings)

### Installation

```bash
# Optional - only install if you want semantic embeddings
pip install sentence-transformers

# For GPU acceleration (optional)
pip install torch  # Will auto-detect CUDA/MPS
```
### Usage

```python
from embedding_engine import EmbeddingEngine

# Initialize engine
engine = EmbeddingEngine()  # Auto-detects GPU, downloads model

# Generate single embedding
text = "SuperLocalMemory is a local memory system for AI assistants"
embedding = engine.encode(text)
# Returns: numpy array of shape (384,)

# Batch processing
texts = [
    "First memory about authentication",
    "Second memory about database design",
    "Third memory about API endpoints"
]
embeddings = engine.encode(texts, batch_size=32)
# Returns: numpy array of shape (3, 384)

# Compute similarity
emb1 = engine.encode("JWT authentication")
emb2 = engine.encode("Token-based auth")
similarity = engine.similarity(emb1, emb2)
# Returns: 0.85 (high similarity)

# Get statistics
stats = engine.get_stats()
print(f"Device: {stats['device']}")  # cuda / mps / cpu
print(f"Cache size: {stats['cache_size']}")
```
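For sentence-transformer embeddings, similarity scores of this kind are conventionally cosine similarity; a minimal numpy sketch of that computation (an assumption about what `similarity` does internally, shown for illustration):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 0.0, 1.0])
c = np.array([0.0, 1.0, 0.0])
same = cosine_similarity(a, b)        # identical direction -> 1.0
orthogonal = cosine_similarity(a, c)  # no shared direction -> 0.0
```

Because the score depends only on direction, not magnitude, two texts phrased differently but meaning the same thing can still score highly.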
### Models Available

| Model | Dimension | Size | Speed | Quality | Use Case |
|-------|-----------|------|-------|---------|----------|
| **all-MiniLM-L6-v2** (default) | 384 | 80MB | Fast | Good | General use |
| all-mpnet-base-v2 | 768 | 420MB | Slower | Better | High accuracy |
| paraphrase-multilingual | 384 | 420MB | Medium | Good | Multiple languages |

```python
# Use different model
engine = EmbeddingEngine(model_name="all-mpnet-base-v2")
```
### Fallback Behavior

If `sentence-transformers` is not installed:

- Automatically falls back to TF-IDF vectorization
- Performance: still fast, but less semantic understanding
- Dimension: 384 (padded/truncated to match)
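The pad/truncate behavior described above can be sketched with scikit-learn's `TfidfVectorizer` (an illustration of the fallback idea under the assumption the module works roughly this way; `tfidf_embed` is a hypothetical helper, not the engine's API):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def tfidf_embed(corpus, dim=384) -> np.ndarray:
    """Keyword-frequency vectors padded or truncated to a fixed dimension."""
    matrix = TfidfVectorizer().fit_transform(corpus).toarray()
    out = np.zeros((len(corpus), dim), dtype='float32')
    width = min(dim, matrix.shape[1])
    out[:, :width] = matrix[:, :width]  # truncate, or leave zero-padded
    return out

vecs = tfidf_embed(["jwt token auth", "database schema design"])
```

Unlike the semantic model, TF-IDF only matches shared vocabulary, which is why the fallback is described as having less semantic understanding.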
### CLI Commands

```bash
# Show engine statistics
python src/embedding_engine.py stats

# Generate embeddings for all memories
python src/embedding_engine.py generate

# Clear embedding cache
python src/embedding_engine.py clear-cache

# Run performance test
python src/embedding_engine.py test
```
### Performance Benchmarks

| Device | Speed (texts/sec) | GPU Memory |
|--------|-------------------|------------|
| CPU (Intel i7) | ~100 | N/A |
| CUDA (RTX 3090) | ~1000 | ~2GB |
| Apple M1/M2 (MPS) | ~500 | Unified |
| Cache hit | ~1,000,000 | N/A |
### GPU Acceleration

```python
# Auto-detect (recommended)
engine = EmbeddingEngine()  # Tries: CUDA -> MPS -> CPU

# Force specific device
engine = EmbeddingEngine(device='cuda')  # NVIDIA GPU
engine = EmbeddingEngine(device='mps')   # Apple Silicon
engine = EmbeddingEngine(device='cpu')   # CPU only
```
### Caching

The engine uses an LRU cache for repeated queries:

```python
# First call: generates embedding (~10-100ms)
emb1 = engine.encode("Same text")

# Second call: cache hit (~0.001ms)
emb2 = engine.encode("Same text")

# Save cache to disk
engine.save_cache()

# Clear cache
engine.clear_cache()
```

Cache settings:

- **Max size:** 10,000 entries
- **Eviction:** LRU (Least Recently Used)
- **Persistence:** Saved to `~/.claude-memory/embedding_cache.json`
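The LRU eviction policy described above can be sketched with `collections.OrderedDict` (illustrative only; the engine's actual cache additionally persists to disk as noted):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once max_size is exceeded."""
    def __init__(self, max_size: int = 10_000):
        self.max_size = max_size
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # drop the oldest entry

cache = LRUCache(max_size=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # refreshes "a"
cache.put("c", 3)  # evicts "b", the least recently used
```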
### Security & Limits

```python
MAX_BATCH_SIZE = 128      # Prevents memory exhaustion
MAX_TEXT_LENGTH = 10_000  # Characters per input
CACHE_MAX_SIZE = 10_000   # Cache entries
```

---
## Integration with MemoryStoreV2

### Adding Embeddings to Database

```python
from embedding_engine import EmbeddingEngine
from pathlib import Path

engine = EmbeddingEngine()

# Add embeddings to all memories
db_path = Path.home() / ".claude-memory" / "memory.db"
engine.add_to_database(db_path)
```
This will:

1. Add an `embedding` column to the database (if it does not already exist)
2. Generate embeddings for all memories
3. Store them as JSON arrays in the database
4. Use batch processing for efficiency
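The four steps above amount to an `ALTER TABLE` plus a batch encode-and-update pass. A self-contained sketch using a stub encoder and an assumed `memories(id, content)` schema (illustrative; the table layout and helper name are assumptions, not the module's code):

```python
import json
import sqlite3
import numpy as np

def add_embeddings(conn: sqlite3.Connection, encode) -> None:
    try:
        conn.execute("ALTER TABLE memories ADD COLUMN embedding TEXT")
    except sqlite3.OperationalError:
        pass  # column already exists
    rows = conn.execute("SELECT id, content FROM memories").fetchall()
    vectors = encode([content for _, content in rows])  # one batch encode
    for (mem_id, _), vec in zip(rows, vectors):
        conn.execute("UPDATE memories SET embedding = ? WHERE id = ?",
                     (json.dumps(vec.tolist()), mem_id))
    conn.commit()

# Demo with an in-memory DB and a deterministic stub encoder
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany("INSERT INTO memories (content) VALUES (?)",
                 [("fixed auth bug",), ("db schema notes",)])
add_embeddings(conn, lambda texts: np.zeros((len(texts), 384), dtype='float32'))
stored = conn.execute("SELECT embedding FROM memories").fetchall()
```

Storing vectors as JSON keeps the schema portable at the cost of parse overhead, which is why the HNSW index is rebuilt from these columns rather than queried through SQL.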
### Building HNSW Index from Database

```python
from hnsw_index import HNSWIndex
from pathlib import Path

index = HNSWIndex(dimension=384)

# Build from database embeddings
db_path = Path.home() / ".claude-memory" / "memory.db"
index.rebuild_from_db(db_path, embedding_column='embedding')
```
### Complete Search Pipeline

```python
from memory_store_v2 import MemoryStoreV2
from embedding_engine import EmbeddingEngine
from hnsw_index import HNSWIndex

# Initialize components
store = MemoryStoreV2()
engine = EmbeddingEngine()
index = HNSWIndex(dimension=384)

# Add memory with embedding
content = "Fixed authentication bug using JWT tokens"
mem_id = store.add_memory(content)

# Generate and store embedding
embedding = engine.encode(content)
# Store embedding in database...

# Add to HNSW index
index.add(embedding, mem_id)

# Search
query = "auth token issue"
query_embedding = engine.encode(query)
similar_ids = index.search(query_embedding, k=5)

# Get full memory details
for mem_id, score in similar_ids:
    memory = store.get_by_id(mem_id)
    print(f"[{mem_id}] Score: {score:.3f}")
    print(f"  {memory['content'][:100]}...")
```

---
## Installation Guide

### Minimal Installation (TF-IDF only)

```bash
# Already included in base requirements
pip install scikit-learn numpy
```

### Standard Installation (Recommended)

```bash
# Add semantic search
pip install sentence-transformers

# This automatically installs:
# - torch (PyTorch)
# - transformers (Hugging Face)
# - tqdm (progress bars)
```

### Full Installation (Maximum Performance)

```bash
# Standard + HNSW acceleration
pip install sentence-transformers hnswlib

# For GPU acceleration (if you have an NVIDIA GPU)
pip install torch --index-url https://download.pytorch.org/whl/cu118
```

### Apple Silicon (M1/M2/M3)

```bash
# MPS (Metal Performance Shaders) support included
pip install sentence-transformers hnswlib

# torch with MPS support (usually auto-installed)
pip install torch torchvision torchaudio
```

---
## Verification

Check what's available:

```bash
# Test HNSW
python src/hnsw_index.py stats

# Test Embeddings
python src/embedding_engine.py stats

# Run comprehensive test
python test_new_modules.py
```

---
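From Python, you can also check which optional dependencies are importable before relying on them, using only the standard library (a small sketch; `optional_components` is a hypothetical helper, not part of the package):

```python
import importlib.util

def optional_components() -> dict:
    """Report which optional search dependencies are importable."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("hnswlib", "sentence_transformers", "sklearn")
    }

status = optional_components()
for name, available in status.items():
    print(f"{name}: {'available' if available else 'missing (fallback used)'}")
```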
## Troubleshooting

### Issue: "No module named 'hnswlib'"

**Solution:**
```bash
pip install hnswlib
```

Or just use the fallback - the system will automatically use linear search.

### Issue: "No module named 'sentence_transformers'"

**Solution:**
```bash
pip install sentence-transformers
```

Or just use the fallback - the system will automatically use TF-IDF.

### Issue: "CUDA out of memory"

**Solution:**
```python
# Reduce batch size
engine = EmbeddingEngine()
embeddings = engine.encode(texts, batch_size=16)  # Default: 32

# Or force CPU
engine = EmbeddingEngine(device='cpu')
```

### Issue: "Model download is slow"

**Solution:**
Models are cached locally after the first download:

- Location: `~/.claude-memory/models/`
- Size: 80MB (all-MiniLM-L6-v2)
- Subsequent loads: <1 second

### Issue: "Search is slow even with HNSW"

**Possible causes:**

1. Fallback mode active (check `index.get_stats()`)
2. `ef_search` parameter needs adjustment
3. Index not built yet

**Solution:**
```python
# Rebuild index
index.rebuild_from_db(db_path)

# Increase search quality (slower but more accurate)
index.ef_search = 100  # Default: 50
```

---
## API Reference

### HNSWIndex Class

```python
class HNSWIndex:
    def __init__(
        dimension: int = 384,
        max_elements: int = 100_000,
        m: int = 16,
        ef_construction: int = 200,
        ef_search: int = 50
    )

    def build(vectors: np.ndarray, memory_ids: List[int])
    def add(vector: np.ndarray, memory_id: int)
    def search(query_vector: np.ndarray, k: int = 5) -> List[Tuple[int, float]]
    def update(memory_id: int, vector: np.ndarray)
    def delete(memory_id: int)
    def rebuild_from_db(db_path: Path, embedding_column: str = 'embedding')
    def get_stats() -> Dict[str, Any]
```

### EmbeddingEngine Class

```python
class EmbeddingEngine:
    def __init__(
        model_name: str = "all-MiniLM-L6-v2",
        device: Optional[str] = None,
        use_cache: bool = True
    )

    def encode(texts: Union[str, List[str]], batch_size: int = 32) -> np.ndarray
    def encode_batch(texts: List[str], batch_size: int = 32) -> np.ndarray
    def similarity(embedding1: np.ndarray, embedding2: np.ndarray) -> float
    def add_to_database(db_path: Path, embedding_column: str = 'embedding')
    def save_cache()
    def clear_cache()
    def get_stats() -> Dict[str, Any]
```

---
## Best Practices

### 1. Generate Embeddings Once

```python
# Do this once after installation
engine = EmbeddingEngine()
engine.add_to_database(db_path)
```

### 2. Rebuild HNSW Periodically

```python
# After adding many new memories
index = HNSWIndex()
index.rebuild_from_db(db_path)
```

### 3. Use Batch Processing

```python
# Good: process in batches
embeddings = engine.encode(texts, batch_size=32)

# Bad: process one at a time (slow)
embeddings = [engine.encode(text) for text in texts]
```

### 4. Monitor Cache Hit Rate

```python
stats = engine.get_stats()
print(f"Cache size: {stats['cache_size']}/{stats['cache_max_size']}")

# Save cache periodically
engine.save_cache()
```

### 5. Choose the Right Model for Your Use Case

- **General use:** all-MiniLM-L6-v2 (default, fast)
- **High accuracy:** all-mpnet-base-v2 (slower, better)
- **Multilingual:** paraphrase-multilingual

---
## Performance Tuning

### HNSW Parameters

**For speed priority:**
```python
index = HNSWIndex(
    m=8,                  # Fewer connections
    ef_construction=100,  # Faster build
    ef_search=25          # Faster search
)
```

**For accuracy priority:**
```python
index = HNSWIndex(
    m=32,                 # More connections
    ef_construction=400,  # Better build
    ef_search=100         # More thorough search
)
```

**Balanced (recommended):**
```python
index = HNSWIndex(
    m=16,
    ef_construction=200,
    ef_search=50
)
```

### Embedding Batch Sizes

| Use Case | Batch Size | Memory | Speed |
|----------|------------|--------|-------|
| Interactive (real-time) | 1-8 | Low | Fast response |
| Standard | 32 | Medium | Balanced |
| Bulk processing | 64-128 | High | Maximum throughput |

---
## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│         SuperLocalMemory V2.2.0 Search Architecture         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  User Query: "authentication token bug"                     │
│       │                                                     │
│       ├──> [1] EmbeddingEngine.encode(query)                │
│       │        ├─> Cache check (0.001ms if hit)             │
│       │        └─> Model inference (10-100ms if miss)       │
│       │                                                     │
│       ├──> [2] HNSWIndex.search(embedding, k=5)             │
│       │        ├─> HNSW search (<10ms for 10K vectors)      │
│       │        └─> Linear fallback (~100ms if no HNSW)      │
│       │                                                     │
│       └──> [3] MemoryStoreV2.get_by_id(memory_ids)          │
│                └─> SQLite lookup (<1ms per memory)          │
│                                                             │
│  Total latency: 10-150ms (depending on cache/GPU/HNSW)      │
│                                                             │
│  Fallback path (no optional deps):                          │
│    MemoryStoreV2.search() -> TF-IDF -> Full-text search     │
│    Total latency: ~100ms (still acceptable)                 │
└─────────────────────────────────────────────────────────────┘
```

---
## Version History

### V2.2.0 (February 7, 2026)

**Added:**

- `src/hnsw_index.py` - Fast approximate nearest neighbor search
- `src/embedding_engine.py` - Local semantic embedding generation
- Complete test suite (`test_new_modules.py`)
- Comprehensive documentation (this file)

**Features:**

- Optional dependencies with graceful fallback
- Sub-10ms search for 10K memories (with HNSW)
- GPU acceleration (CUDA/MPS)
- LRU caching for embeddings
- Disk persistence for the index
- Security limits and input validation
- CLI interfaces for both components

**Performance:**

- 10-20x speedup with HNSW vs linear search
- 100-1000 texts/sec embedding generation (GPU)
- <1ms cache hit latency

---
## Credits

**Architecture & Implementation:** Varun Pratap Bhardwaj (Solution Architect)

**Technologies:**

- hnswlib: Fast approximate nearest neighbor search
- sentence-transformers: Local embedding generation
- scikit-learn: TF-IDF fallback
- PyTorch: GPU acceleration

**License:** MIT License

---

## Support

For issues, questions, or contributions:

- **GitHub Issues:** https://github.com/varun369/SuperLocalMemoryV2/issues
- **Documentation:** https://github.com/varun369/SuperLocalMemoryV2/wiki

---

**SuperLocalMemory V2.2.0** - Making AI memory faster and smarter, locally.

*Created by Varun Pratap Bhardwaj, February 2026*
@@ -0,0 +1,34 @@
SuperLocalMemory V2.1.0 - Windows Installation
==============================================

Thank you for installing SuperLocalMemory V2!

This installer will:
1. Copy all necessary files to your system
2. Install Python modules to %USERPROFILE%\.claude-memory\
3. Configure MCP integration for supported IDEs
4. Install universal skills for AI tools
5. Add 'slm' command to your system PATH

System Requirements:
--------------------
• Windows 10 or higher (64-bit)
• Python 3.8 or higher
• 100 MB disk space
• Internet connection (for Python packages)

After Installation:
-------------------
1. Open Command Prompt or PowerShell
2. Run: slm status
3. Test: slm remember "Test memory"
4. Search: slm recall "test"

Documentation:
--------------
• GitHub: https://github.com/varun369/SuperLocalMemoryV2
• Wiki: https://github.com/varun369/SuperLocalMemoryV2/wiki
• Issues: https://github.com/varun369/SuperLocalMemoryV2/issues

Copyright (c) 2026 Varun Pratap Bhardwaj
Licensed under MIT License