@199-bio/engram 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/PLAN.md ADDED
@@ -0,0 +1,514 @@
1
+ # Engram Implementation Plan
2
+
3
+ ## Overview
4
+
5
+ Build a local-first MCP memory server with SOTA retrieval quality using ColBERT + BM25 hybrid search and a lightweight knowledge graph.
6
+
7
+ ## Core Insight
8
+
9
+ **The 80/20**: ColBERT (via RAGatouille) gives us embedding + reranking in one model, with better out-of-domain generalization than single-vector dense embeddings. Combined with BM25 for exact matches, this beats most API-based solutions while running entirely locally.
10
+
11
+ ---
12
+
13
+ ## Phase 1: Foundation
14
+
15
+ ### 1.1 Project Setup
16
+ - TypeScript + Node.js (MCP standard)
17
+ - ESM modules
18
+ - Directory structure:
19
+ ```
20
+ engram/
21
+ ├── src/
22
+ │ ├── index.ts # MCP server entry
23
+ │ ├── mcp/ # Tool definitions
24
+ │ ├── retrieval/ # ColBERT + BM25
25
+ │ ├── graph/ # Knowledge graph
26
+ │ ├── storage/ # SQLite operations
27
+ │ └── utils/ # Helpers
28
+ ├── models/ # Downloaded models
29
+ ├── tests/
30
+ └── scripts/
31
+ ```
32
+
33
+ ### 1.2 Storage Layer (SQLite)
34
+ Single SQLite database with tables:
35
+
36
+ ```sql
37
+ -- Memories: raw content
38
+ CREATE TABLE memories (
39
+ id TEXT PRIMARY KEY,
40
+ content TEXT NOT NULL,
41
+ source TEXT, -- 'conversation', 'import', etc.
42
+ timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
43
+ importance REAL DEFAULT 0.5, -- 0-1 score
44
+ access_count INTEGER DEFAULT 0,
45
+ last_accessed DATETIME
46
+ );
47
+
48
+ -- FTS5 for BM25 search
49
+ CREATE VIRTUAL TABLE memories_fts USING fts5(
50
+ content,
51
+ content='memories',
52
+ content_rowid='rowid'
53
+ );
54
+
55
+ -- Entities: nodes in knowledge graph
56
+ CREATE TABLE entities (
57
+ id TEXT PRIMARY KEY,
58
+ name TEXT NOT NULL,
59
+ type TEXT NOT NULL, -- 'person', 'place', 'concept', 'event', 'date'
60
+ created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
61
+ metadata JSON
62
+ );
63
+
64
+ -- Observations: facts about entities
65
+ CREATE TABLE observations (
66
+ id TEXT PRIMARY KEY,
67
+ entity_id TEXT NOT NULL REFERENCES entities(id),
68
+ content TEXT NOT NULL,
69
+ source_memory_id TEXT REFERENCES memories(id),
70
+ confidence REAL DEFAULT 1.0,
71
+ valid_from DATETIME DEFAULT CURRENT_TIMESTAMP,
72
+ valid_until DATETIME, -- NULL = still valid
73
+ UNIQUE(entity_id, content)
74
+ );
75
+
76
+ -- Relations: edges between entities
77
+ CREATE TABLE relations (
78
+ id TEXT PRIMARY KEY,
79
+ from_entity TEXT NOT NULL REFERENCES entities(id),
80
+ to_entity TEXT NOT NULL REFERENCES entities(id),
81
+ type TEXT NOT NULL, -- 'sibling', 'knows', 'works_at', etc.
82
+ properties JSON,
83
+ created_at DATETIME DEFAULT CURRENT_TIMESTAMP
84
+ );
85
+ ```
86
+
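+ Because `memories_fts` is declared as an external-content table, it only stays in sync with `memories` if the standard FTS5 sync triggers are created (the `remember` flow in Phase 4 relies on this). A minimal setup sketch, assuming `better-sqlite3` from the dependency list (the database path and module name are placeholders):
+
+ ```typescript
+ // src/storage/db.ts -- sketch; file path and module layout are assumptions
+ import Database from 'better-sqlite3';
+
+ export const db = new Database('engram.db');
+
+ // WAL mode: concurrent readers with a single writer (see Risks & Mitigations)
+ db.pragma('journal_mode = WAL');
+
+ // Keep the external-content FTS5 table in sync with the memories table
+ db.exec(`
+   CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
+     INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
+   END;
+   CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
+     INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.rowid, old.content);
+   END;
+   CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
+     INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.rowid, old.content);
+     INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
+   END;
+ `);
+ ```
+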
87
+ ### 1.3 MCP Server Skeleton
88
+ - Implement MCP protocol handlers
89
+ - Tool registration
90
+ - Error handling
91
+ - Logging
92
+
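+ A minimal stdio entry point might look like the sketch below, assuming the low-level `Server` API of `@modelcontextprotocol/sdk`; `tools` and `toolHandlers` stand in for the definitions and implementations from Phase 4, and the import path for them is an assumption:
+
+ ```typescript
+ // src/index.ts -- minimal sketch
+ import { Server } from '@modelcontextprotocol/sdk/server/index.js';
+ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
+ import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
+ import { tools, toolHandlers } from './mcp/tools.js';
+
+ const server = new Server(
+   { name: 'engram', version: '0.1.0' },
+   { capabilities: { tools: {} } }
+ );
+
+ // Advertise the tool definitions
+ server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools }));
+
+ // Dispatch tool calls to their implementations and wrap results as text content
+ server.setRequestHandler(CallToolRequestSchema, async (request) => {
+   const handler = toolHandlers[request.params.name];
+   if (!handler) throw new Error(`Unknown tool: ${request.params.name}`);
+   const result = await handler(request.params.arguments ?? {});
+   return { content: [{ type: 'text', text: JSON.stringify(result) }] };
+ });
+
+ await server.connect(new StdioServerTransport());
+ ```
+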
93
+ ---
94
+
95
+ ## Phase 2: Retrieval Engine
96
+
97
+ ### 2.1 ColBERT Integration
98
+
99
+ **Option A: RAGatouille (Python)**
100
+ - Proven, well-maintained
101
+ - Requires a Python subprocess or microservice
102
+ - ~500MB model size
103
+
104
+ **Option B: FastEmbed + ColBERT (Node.js)**
105
+ - Native Node.js via ONNX
106
+ - fastembed-js has ColBERT support
107
+ - May be less mature
108
+
109
+ **Decision**: Start with a Python subprocess calling RAGatouille. If latency becomes an issue, migrate to ONNX later.
110
+
111
+ ```typescript
112
+ // src/retrieval/colbert.ts
113
+ class ColBERTRetriever {
114
+ private pythonProcess: ChildProcess;
115
+
116
+ async index(documents: Document[]): Promise<void>;
117
+ async search(query: string, k: number): Promise<SearchResult[]>;
118
+ async rerank(query: string, docs: Document[], k?: number): Promise<RankedResult[]>;
119
+ }
120
+ ```
121
+
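+ One way to wire this interface to RAGatouille is a long-lived worker process speaking newline-delimited JSON over stdio (matching the "start once, keep alive" mitigation below). A sketch follows; `scripts/colbert_worker.py`, the message shapes, and the assumption that responses arrive in request order are all placeholders:
+
+ ```typescript
+ // src/retrieval/colbert-bridge.ts -- sketch of the Python bridge
+ import { spawn } from 'node:child_process';
+ import { createInterface } from 'node:readline';
+
+ export class ColBERTBridge {
+   private proc = spawn('python', ['scripts/colbert_worker.py']); // spawned once, kept alive
+   private pending: Array<(response: unknown) => void> = [];
+
+   constructor() {
+     // One JSON response per line on stdout; resolve callers in FIFO order
+     createInterface({ input: this.proc.stdout }).on('line', (line) => {
+       this.pending.shift()?.(JSON.parse(line));
+     });
+     this.proc.stderr.pipe(process.stderr);
+   }
+
+   private request(payload: object): Promise<unknown> {
+     return new Promise((resolve) => {
+       this.pending.push(resolve);
+       this.proc.stdin.write(JSON.stringify(payload) + '\n');
+     });
+   }
+
+   search(query: string, k: number) {
+     return this.request({ op: 'search', query, k });
+   }
+
+   rerank(query: string, docs: { id: string; text: string }[], k?: number) {
+     return this.request({ op: 'rerank', query, docs, k });
+   }
+ }
+ ```
+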
122
+ ### 2.2 BM25 via SQLite FTS5
123
+
124
+ ```typescript
125
+ // src/retrieval/bm25.ts
126
+ class BM25Retriever {
127
+ async search(query: string, k: number): Promise<SearchResult[]> {
128
+ return db.all(`
129
+ SELECT m.*, bm25(memories_fts) as score
130
+ FROM memories_fts
131
+ JOIN memories m ON memories_fts.rowid = m.rowid
132
+ WHERE memories_fts MATCH ?
133
+ ORDER BY score -- bm25() returns lower-is-better scores; ascending puts best matches first
134
+ LIMIT ?
135
+ `, [query, k]);
136
+ }
137
+ }
138
+ ```
139
+
140
+ ### 2.3 Hybrid Search with RRF
141
+
142
+ Reciprocal Rank Fusion combines rankings:
143
+
144
+ ```typescript
145
+ // src/retrieval/hybrid.ts
146
+ function reciprocalRankFusion(
147
+ rankings: SearchResult[][],
148
+ k: number = 60
149
+ ): SearchResult[] {
150
+ const scores = new Map<string, number>();
151
+
152
+ for (const ranking of rankings) {
153
+ for (let i = 0; i < ranking.length; i++) {
154
+ const docId = ranking[i].id;
155
+ const rrf = 1 / (k + i + 1);
156
+ scores.set(docId, (scores.get(docId) || 0) + rrf);
157
+ }
158
+ }
159
+
160
+ return Array.from(scores.entries())
161
+ .sort((a, b) => b[1] - a[1])
162
+ .map(([id, score]) => ({ id, score }));
163
+ }
164
+ ```
165
+
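+ The wrapper that later phases call as `hybridSearch` can then be a thin layer over the two retrievers (sketch; `bm25` and `colbert` are the retriever instances from 2.1 and 2.2, and the over-fetch factor is arbitrary):
+
+ ```typescript
+ // src/retrieval/hybrid.ts -- sketch; over-fetch from each retriever, then fuse
+ async function hybridSearch(query: string, k: number): Promise<SearchResult[]> {
+   const [bm25Results, colbertResults] = await Promise.all([
+     bm25.search(query, k * 3),
+     colbert.search(query, k * 3),
+   ]);
+   return reciprocalRankFusion([bm25Results, colbertResults]).slice(0, k);
+ }
+ ```
+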
166
+ ---
167
+
168
+ ## Phase 3: Knowledge Graph
169
+
170
+ ### 3.1 Entity Extraction
171
+
172
+ **Option A: Local NER model**
173
+ - GLiNER or similar
174
+ - Runs locally
175
+ - Generic entities
176
+
177
+ **Option B: LLM-based (using the calling model)**
178
+ - More accurate for personal context
179
+ - Already in conversation
180
+ - Prompt engineering needed
181
+
182
+ **Decision**: Start with regex/heuristics for names (capitalized words), dates, etc. Add GLiNER later if needed.
183
+
184
+ ```typescript
185
+ // src/graph/extractor.ts
+ import * as chrono from 'chrono-node';
+
186
+ class EntityExtractor {
187
+ extractPersons(text: string): string[] {
188
+ // Heuristic: capitalized words not at a sentence start
+ // (common name patterns; swap in GLiNER later if needed)
+ const matches = text.match(/(?<![.!?]\s|^)\b[A-Z][a-z]+\b/g) ?? [];
+ return [...new Set(matches)];
190
+ }
191
+
192
+ extractDates(text: string): Date[] {
193
+ // chrono-node for natural-language date parsing
+ return chrono.parse(text).map(result => result.start.date());
194
+ }
195
+
196
+ extractAll(text: string): Entity[] {
197
+ return [
198
+ ...this.extractPersons(text).map(p => ({ name: p, type: 'person' })),
199
+ ...this.extractDates(text).map(d => ({ name: d.toISOString(), type: 'date' })),
200
+ ];
201
+ }
202
+ }
203
+ ```
204
+
205
+ ### 3.2 Graph Operations
206
+
207
+ ```typescript
208
+ // src/graph/knowledge-graph.ts
209
+ class KnowledgeGraph {
210
+ async addEntity(name: string, type: EntityType): Promise<Entity>;
211
+ async addObservation(entityId: string, content: string, sourceMemoryId?: string): Promise<Observation>;
212
+ async addRelation(from: string, to: string, type: string): Promise<Relation>;
213
+
214
+ async getEntity(id: string): Promise<EntityWithObservations>;
215
+ async findEntities(query: string): Promise<Entity[]>;
216
+
217
+ async traverse(
218
+ startEntity: string,
219
+ depth: number,
220
+ relationTypes?: string[]
221
+ ): Promise<GraphTraversal>;
222
+ }
223
+ ```
224
+
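+ A sketch of `traverse()` as a breadth-first walk over the `relations` table, followed by a fetch of the still-valid observations for every visited entity. It assumes the `better-sqlite3` handle `db` from the storage layer and row types (`Relation`, `Observation`, `GraphTraversal`) matching the 1.2 schema:
+
+ ```typescript
+ // src/graph/traverse.ts -- sketch; db is the better-sqlite3 handle from the storage layer
+ async function traverse(
+   startEntity: string,
+   depth: number,
+   relationTypes?: string[]
+ ): Promise<GraphTraversal> {
+   const visited = new Set<string>([startEntity]);
+   const relations: Relation[] = [];
+   let frontier = [startEntity];
+
+   for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
+     const marks = frontier.map(() => '?').join(',');
+     const rows = db.prepare(
+       `SELECT * FROM relations WHERE from_entity IN (${marks}) OR to_entity IN (${marks})`
+     ).all(...frontier, ...frontier) as Relation[];
+
+     const next: string[] = [];
+     for (const rel of rows) {
+       if (relationTypes && !relationTypes.includes(rel.type)) continue;
+       relations.push(rel);
+       for (const id of [rel.from_entity, rel.to_entity]) {
+         if (!visited.has(id)) {
+           visited.add(id);
+           next.push(id);
+         }
+       }
+     }
+     frontier = next;
+   }
+
+   // Only observations that are still valid (valid_until IS NULL)
+   const marks = [...visited].map(() => '?').join(',');
+   const observations = db.prepare(
+     `SELECT * FROM observations WHERE entity_id IN (${marks}) AND valid_until IS NULL`
+   ).all(...visited) as Observation[];
+
+   return { entities: [...visited], relations, observations };
+ }
+ ```
+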
225
+ ### 3.3 Graph-Enhanced Retrieval
226
+
227
+ When recalling, expand search with graph context:
228
+
229
+ ```typescript
230
+ async function recallWithGraph(query: string, k: number): Promise<Memory[]> {
231
+ // 1. Hybrid search
232
+ const hybridResults = await hybridSearch(query, k * 2);
233
+
234
+ // 2. Extract entities from query
235
+ const queryEntities = entityExtractor.extractAll(query);
236
+
237
+ // 3. Get related observations
238
+ const relatedObs = [];
239
+ for (const entity of queryEntities) {
240
+ const e = await graph.findEntities(entity.name);
241
+ if (e.length > 0) {
242
+ const traversal = await graph.traverse(e[0].id, 2);
243
+ relatedObs.push(...traversal.observations);
244
+ }
245
+ }
246
+
247
+ // 4. Add source memories from observations to candidate pool
248
+ const candidateIds = new Set([
249
+ ...hybridResults.map(r => r.id),
250
+ ...relatedObs.map(o => o.source_memory_id).filter(Boolean)
251
+ ]);
252
+
253
+ // 5. ColBERT rerank all candidates
254
+ const candidates = await getMemoriesById([...candidateIds]);
255
+ return await colbert.rerank(query, candidates, k);
256
+ }
257
+ ```
258
+
259
+ ---
260
+
261
+ ## Phase 4: MCP Tools
262
+
263
+ ### 4.1 Core Tools
264
+
265
+ ```typescript
266
+ const tools = [
267
+ {
268
+ name: 'remember',
269
+ description: 'Store a new memory',
270
+ inputSchema: {
271
+ type: 'object',
272
+ properties: {
273
+ content: { type: 'string', description: 'The memory content' },
274
+ source: { type: 'string', description: 'Source of the memory' },
275
+ importance: { type: 'number', description: '0-1 importance score' }
276
+ },
277
+ required: ['content']
278
+ }
279
+ },
280
+ {
281
+ name: 'recall',
282
+ description: 'Retrieve relevant memories',
283
+ inputSchema: {
284
+ type: 'object',
285
+ properties: {
286
+ query: { type: 'string', description: 'What to search for' },
287
+ limit: { type: 'number', default: 5 },
288
+ include_graph: { type: 'boolean', default: true }
289
+ },
290
+ required: ['query']
291
+ }
292
+ },
293
+ // ... other tools
294
+ ];
295
+ ```
296
+
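+ Since `zod` is already in the dependency list, the input schemas above can be mirrored as runtime validators so arguments are checked before dispatch. A sketch (schema and type names are placeholders):
+
+ ```typescript
+ // src/mcp/schemas.ts -- sketch; mirrors the JSON Schemas above with zod
+ import { z } from 'zod';
+
+ export const RememberParamsSchema = z.object({
+   content: z.string(),
+   source: z.string().optional(),
+   importance: z.number().min(0).max(1).optional(),
+ });
+
+ export const RecallParamsSchema = z.object({
+   query: z.string(),
+   limit: z.number().int().positive().default(5),
+   include_graph: z.boolean().default(true),
+ });
+
+ export type RememberParams = z.infer<typeof RememberParamsSchema>;
+ export type RecallParams = z.infer<typeof RecallParamsSchema>;
+
+ // In the CallTool handler: throws (or use .safeParse) if the arguments are malformed
+ // const params = RecallParamsSchema.parse(request.params.arguments);
+ ```
+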
297
+ ### 4.2 Tool Implementations
298
+
299
+ ```typescript
300
+ // src/mcp/tools/remember.ts
301
+ async function remember(params: RememberParams): Promise<RememberResult> {
302
+ // 1. Store memory
303
+ const memory = await storage.createMemory({
304
+ content: params.content,
305
+ source: params.source || 'conversation',
306
+ importance: params.importance || 0.5
307
+ });
308
+
309
+ // 2. Index for ColBERT
310
+ await colbert.index([{ id: memory.id, text: memory.content }]);
311
+
312
+ // 3. Index for BM25 (automatic via the FTS5 sync triggers from 1.2)
313
+
314
+ // 4. Extract and store entities
315
+ const entities = entityExtractor.extractAll(params.content);
316
+ for (const entity of entities) {
317
+ const e = await graph.addEntity(entity.name, entity.type);
318
+ // Create observation linking entity to this memory
319
+ await graph.addObservation(e.id, params.content, memory.id);
320
+ }
321
+
322
+ return { id: memory.id, entities: entities.map(e => e.name) };
323
+ }
324
+ ```
325
+
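+ The `recall` counterpart is mostly glue around the retrieval from Phases 2–3. A sketch; the `RecallParams`/`RecallResult` shapes and `storage.touchMemories` (a helper that bumps `access_count`/`last_accessed`) are assumptions:
+
+ ```typescript
+ // src/mcp/tools/recall.ts -- sketch
+ async function recall(params: RecallParams): Promise<RecallResult> {
+   const limit = params.limit ?? 5;
+
+   // 1. Graph-enhanced recall by default (3.3), plain hybrid search otherwise (2.3)
+   const memories = params.include_graph !== false
+     ? await recallWithGraph(params.query, limit)
+     : await getMemoriesById((await hybridSearch(params.query, limit)).map(r => r.id));
+
+   // 2. Bump access counters so combinedScore() in 5.1 can reward frequently recalled memories
+   await storage.touchMemories(memories.map(m => m.id));
+
+   return { memories };
+ }
+ ```
+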
326
+ ---
327
+
328
+ ## Phase 5: Optimizations
329
+
330
+ ### 5.1 Temporal Decay
331
+
332
+ Recent memories are weighted higher:
333
+
334
+ ```typescript
335
+ function temporalScore(memory: Memory, now: Date): number {
336
+ const ageInDays = (now.getTime() - memory.timestamp.getTime()) / (1000 * 60 * 60 * 24);
337
+ // Exponential decay with half-life of 30 days
338
+ return Math.exp(-0.693 * ageInDays / 30);
339
+ }
340
+
341
+ function combinedScore(memory: Memory, retrievalScore: number): number {
342
+ const temporal = temporalScore(memory, new Date());
343
+ const importance = memory.importance;
344
+ const access = Math.log(1 + memory.access_count) / 10;
345
+
346
+ return retrievalScore * (0.6 + 0.2 * temporal + 0.1 * importance + 0.1 * access);
347
+ }
348
+ ```
349
+
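+ The final ranking step then blends each candidate's retrieval score with the decay terms before truncating to `k` (sketch):
+
+ ```typescript
+ // Re-rank candidates by the blended score, then keep the top k
+ function applyScoring(candidates: Array<{ memory: Memory; score: number }>, k: number): Memory[] {
+   return candidates
+     .map(c => ({ memory: c.memory, score: combinedScore(c.memory, c.score) }))
+     .sort((a, b) => b.score - a.score)
+     .slice(0, k)
+     .map(c => c.memory);
+ }
+ ```
+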
350
+ ### 5.2 Memory Consolidation
351
+
352
+ Merge similar memories over time:
353
+
354
+ ```typescript
355
+ async function consolidate(): Promise<void> {
356
+ // Find similar memories (high ColBERT similarity)
357
+ const clusters = await findSimilarClusters({ threshold: 0.9 });
358
+
359
+ for (const cluster of clusters) {
360
+ if (cluster.length > 1) {
361
+ // Merge into single consolidated memory
362
+ const merged = mergeMemories(cluster);
363
+ await storage.createMemory(merged);
364
+ await storage.archiveMemories(cluster.map(m => m.id));
365
+ }
366
+ }
367
+ }
368
+ ```
369
+
370
+ ### 5.3 Lazy Loading
371
+
372
+ Don't load ColBERT until first use:
373
+
374
+ ```typescript
375
+ class LazyColBERT {
376
+ private instance: ColBERTRetriever | null = null;
377
+
378
+ async get(): Promise<ColBERTRetriever> {
379
+ if (!this.instance) {
380
+ this.instance = await ColBERTRetriever.initialize();
381
+ }
382
+ return this.instance;
383
+ }
384
+ }
385
+ ```
386
+
387
+ ---
388
+
389
+ ## Phase 6: Optional Cloud Enhancement
390
+
391
+ ### 6.1 Graceful Degradation
392
+
393
+ ```typescript
394
+ class HybridEmbedder {
395
+ async embed(texts: string[]): Promise<number[][]> {
396
+ if (process.env.GEMINI_API_KEY) {
397
+ try {
398
+ return await geminiEmbed(texts);
399
+ } catch (e) {
400
+ console.warn('Gemini unavailable, falling back to local');
401
+ }
402
+ }
403
+ return await qwen3Embed(texts);
404
+ }
405
+ }
406
+ ```
407
+
408
+ ### 6.2 Smart API Usage
409
+
410
+ Only use APIs when they add value:
411
+
412
+ ```typescript
413
+ async function recall(query: string): Promise<Memory[]> {
414
+ // Local ColBERT for retrieval (always)
415
+ const candidates = await localRetrieval(query);
416
+
417
+ // Use Cohere rerank only for ambiguous queries
418
+ if (process.env.COHERE_API_KEY && candidates.length > 10) {
419
+ const topScores = candidates.slice(0, 5).map(c => c.score);
420
+ const variance = calculateVariance(topScores);
421
+
422
+ if (variance < 0.1) {
423
+ // Scores too close - worth API rerank
424
+ return await cohereRerank(query, candidates);
425
+ }
426
+ }
427
+
428
+ return candidates.slice(0, 5);
429
+ }
430
+ ```
431
+
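+ Here `calculateVariance` is just the population variance of the top scores:
+
+ ```typescript
+ // Population variance of the top candidate scores
+ function calculateVariance(values: number[]): number {
+   if (values.length === 0) return 0;
+   const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
+   return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
+ }
+ ```
+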
432
+ ---
433
+
434
+ ## Implementation Order
435
+
436
+ ### Week 1: Core
437
+ 1. Project setup (TypeScript, deps, structure)
438
+ 2. SQLite schema + storage layer
439
+ 3. MCP server skeleton with remember/recall stubs
440
+
441
+ ### Week 2: Retrieval
442
+ 4. BM25 via FTS5
443
+ 5. RAGatouille Python bridge
444
+ 6. Hybrid search with RRF
445
+ 7. Basic remember/recall working
446
+
447
+ ### Week 3: Knowledge Graph
448
+ 8. Entity extraction (regex/heuristics)
449
+ 9. Graph schema + operations
450
+ 10. Graph-enhanced retrieval
451
+
452
+ ### Week 4: Polish
453
+ 11. Temporal decay
454
+ 12. Export/import
455
+ 13. Testing
456
+ 14. Documentation
457
+
458
+ ---
459
+
460
+ ## Dependencies
461
+
462
+ ```json
463
+ {
464
+ "dependencies": {
465
+ "@modelcontextprotocol/sdk": "latest",
466
+ "better-sqlite3": "^9.0.0",
467
+ "chrono-node": "^2.7.0",
468
+ "uuid": "^9.0.0",
469
+ "zod": "^3.22.0"
470
+ },
471
+ "devDependencies": {
472
+ "@types/better-sqlite3": "^7.6.0",
473
+ "@types/node": "^20.0.0",
474
+ "typescript": "^5.0.0",
475
+ "vitest": "^1.0.0"
476
+ }
477
+ }
478
+ ```
479
+
480
+ Python dependencies (for ColBERT):
481
+ ```
482
+ ragatouille>=0.0.8
483
+ torch>=2.0.0
484
+ ```
485
+
486
+ ---
487
+
488
+ ## Success Metrics
489
+
490
+ 1. **Retrieval Quality**: Manually test with 50 queries; measure whether the correct memory appears in the top 3 results
491
+ 2. **Latency**: recall < 100ms, remember < 200ms
492
+ 3. **Storage**: < 100MB for 10,000 memories (excluding model)
493
+ 4. **Reliability**: Zero crashes over one week of daily use
494
+
495
+ ---
496
+
497
+ ## Risks & Mitigations
498
+
499
+ | Risk | Mitigation |
500
+ |------|------------|
501
+ | RAGatouille Python bridge adds latency | Start Python process once, keep alive |
502
+ | ColBERT model too large | Use quantized version, lazy load |
503
+ | Entity extraction inaccurate | Start simple, add GLiNER if needed |
504
+ | SQLite concurrent access | Use WAL mode, single writer |
505
+
506
+ ---
507
+
508
+ ## Future Extensions
509
+
510
+ - **Voice memos**: Whisper transcription → memory
511
+ - **Image memories**: CLIP embeddings
512
+ - **Calendar integration**: Auto-import events
513
+ - **Journaling mode**: Daily summary generation
514
+ - **Multi-user**: Shared knowledge graphs