claude_memory 0.4.0 → 0.5.0

Files changed (50)
  1. checksums.yaml +4 -4
  2. data/.claude/CLAUDE.md +1 -1
  3. data/.claude/rules/claude_memory.generated.md +14 -1
  4. data/.claude/skills/check-memory/SKILL.md +10 -0
  5. data/.claude/skills/improve/SKILL.md +12 -1
  6. data/.claude-plugin/plugin.json +1 -1
  7. data/CHANGELOG.md +70 -0
  8. data/db/migrations/008_add_provenance_line_range.rb +21 -0
  9. data/db/migrations/009_add_docid.rb +39 -0
  10. data/db/migrations/010_add_llm_cache.rb +30 -0
  11. data/docs/improvements.md +72 -1084
  12. data/docs/influence/claude-supermemory.md +498 -0
  13. data/docs/influence/qmd.md +424 -2022
  14. data/docs/quality_review.md +64 -705
  15. data/lib/claude_memory/commands/doctor_command.rb +45 -4
  16. data/lib/claude_memory/commands/explain_command.rb +11 -6
  17. data/lib/claude_memory/commands/stats_command.rb +1 -1
  18. data/lib/claude_memory/core/fact_graph.rb +122 -0
  19. data/lib/claude_memory/core/fact_query_builder.rb +34 -14
  20. data/lib/claude_memory/core/fact_ranker.rb +3 -20
  21. data/lib/claude_memory/core/relative_time.rb +45 -0
  22. data/lib/claude_memory/core/result_sorter.rb +2 -2
  23. data/lib/claude_memory/core/rr_fusion.rb +57 -0
  24. data/lib/claude_memory/core/snippet_extractor.rb +97 -0
  25. data/lib/claude_memory/domain/fact.rb +3 -1
  26. data/lib/claude_memory/index/index_query.rb +2 -0
  27. data/lib/claude_memory/index/lexical_fts.rb +18 -0
  28. data/lib/claude_memory/infrastructure/operation_tracker.rb +7 -21
  29. data/lib/claude_memory/infrastructure/schema_validator.rb +30 -25
  30. data/lib/claude_memory/ingest/content_sanitizer.rb +8 -1
  31. data/lib/claude_memory/ingest/ingester.rb +67 -56
  32. data/lib/claude_memory/ingest/tool_extractor.rb +1 -1
  33. data/lib/claude_memory/ingest/tool_filter.rb +55 -0
  34. data/lib/claude_memory/logging/logger.rb +112 -0
  35. data/lib/claude_memory/mcp/query_guide.rb +96 -0
  36. data/lib/claude_memory/mcp/response_formatter.rb +86 -23
  37. data/lib/claude_memory/mcp/server.rb +34 -4
  38. data/lib/claude_memory/mcp/text_summary.rb +257 -0
  39. data/lib/claude_memory/mcp/tool_definitions.rb +20 -4
  40. data/lib/claude_memory/mcp/tools.rb +133 -120
  41. data/lib/claude_memory/publish.rb +12 -2
  42. data/lib/claude_memory/recall/expansion_detector.rb +44 -0
  43. data/lib/claude_memory/recall.rb +93 -41
  44. data/lib/claude_memory/resolve/resolver.rb +72 -40
  45. data/lib/claude_memory/store/sqlite_store.rb +99 -24
  46. data/lib/claude_memory/sweep/sweeper.rb +6 -0
  47. data/lib/claude_memory/version.rb +1 -1
  48. data/lib/claude_memory.rb +21 -0
  49. metadata +14 -2
  50. data/docs/remaining_improvements.md +0 -330
data/docs/improvements.md CHANGED
@@ -1,1139 +1,127 @@
1
1
  # Improvements to Consider
2
2
 
3
- *Updated: 2026-01-29*
3
+ *Updated: 2026-02-03 - Removed Docid Short Hash System, LLM Response Caching, Structured Logging (implemented)*
4
4
  *Sources:*
5
5
  - *[thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - Memory compression system*
6
6
  - *[obra/episodic-memory](https://github.com/obra/episodic-memory) - Semantic conversation search*
7
7
  - *[yoanbernabeu/grepai](https://github.com/yoanbernabeu/grepai) - Semantic code search with vector embeddings*
8
+ - *[supermemoryai/claude-supermemory](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed persistent memory plugin*
9
+ - *[tobi/qmd](https://github.com/tobi/qmd) - On-device hybrid search engine (updated 2026-02-02)*
8
10
 
9
- This document identifies design patterns and features from claude-mem and episodic-memory that could improve claude_memory. Implemented improvements have been removed from this document.
11
+ This document contains only unimplemented improvements. Completed items are removed.
10
12
 
11
13
  ---
12
14
 
13
- ## Implemented Improvements
15
+ ## High Priority (QMD-Inspired)
14
16
 
15
- The following improvements from the original analysis have been successfully implemented:
16
-
17
- 1. **Progressive Disclosure Pattern** - `memory.recall_index` and `memory.recall_details` MCP tools with token estimation
18
- 2. **Privacy Tag System** - ContentSanitizer with `<private>`, `<no-memory>`, and `<secret>` tag stripping
19
- 3. **Slim Orchestrator Pattern** - CLI refactored to thin router with extracted command classes
20
- 4. **Semantic Shortcuts** - `memory.decisions`, `memory.conventions`, and `memory.architecture` MCP tools
21
- 5. **Exit Code Strategy** - Hook::ExitCodes module with SUCCESS/WARNING/ERROR constants
22
- 6. **WAL Mode for Concurrency** - SQLite Write-Ahead Logging enabled for better concurrent access
23
- 7. **Enhanced Statistics** - Comprehensive stats command showing facts, entities, provenance, conflicts
24
- 8. **Session Metadata Tracking** - Captures git_branch, cwd, claude_version, thinking_level from transcripts
25
- 9. **Tool Usage Tracking** - Dedicated tool_calls table tracking tool names, inputs, timestamps
26
- 10. **Semantic Search with Local Embeddings** - FastEmbed (BAAI/bge-small-en-v1.5, 384-dim), hybrid vector + text search
27
- 11. **Multi-Concept AND Search** - Query facts matching all of 2-5 concepts simultaneously
28
- 12. **Incremental Sync** - mtime-based change detection to skip unchanged transcript files
29
- 13. **Context-Aware Queries** - Filter facts by git branch, directory, or tools used
30
- 14. **ROI Metrics Tracking** - ingestion_metrics table tracking token economics for distillation efficiency (2026-01-26)
31
-
32
- ---
33
-
34
- ## grepai Study (2026-01-29)
35
-
36
- Source: docs/influence/grepai.md
37
-
38
- ### High Priority Recommendations
39
-
40
- - [ ] **Incremental Indexing with File Watching**: Auto-update memory index during coding sessions
41
- - Value: Eliminates manual `claude-memory ingest` calls, huge UX win
42
- - Evidence: watcher/watcher.go:44 - `fsnotify` with debouncing (300ms default), gitignore respect
43
- - Implementation: Add `Listen` gem (Ruby fsnotify), watch `.claude/projects/*/transcripts/*.jsonl`, debounce 500ms, trigger IngestCommand automatically
44
- - Effort: 2-3 days (watcher class, integration, testing)
45
- - Trade-off: Background process ~10MB memory overhead, may complicate testing
46
-
47
- - [ ] **Compact Response Format for MCP Tools**: Reduce token usage by ~60% in MCP responses
48
- - Value: Critical for scaling to large fact databases (1000+ facts)
49
- - Evidence: mcp/server.go:219 - `SearchResultCompact` omits content field, returns only metadata
50
- - Implementation: Add `compact: true` parameter to all recall tools, omit provenance/context excerpts by default, user can override with `compact: false`
51
- - Effort: 4-6 hours (add parameter, update formatters, tests)
52
- - Trade-off: User needs follow-up `memory.explain <fact_id>` for full context (two-step interaction)
53
-
54
- - [ ] **Fact Dependency Graph Visualization**: Show supersession chains and conflict relationships
55
- - Value: Invaluable for understanding why facts were superseded or conflicted
56
- - Evidence: trace/trace.go:95 - `CallGraph` struct with nodes and edges for function dependencies
57
- - Implementation: Create `memory.fact_graph <fact_id> --depth 2` tool, query `fact_links` table with BFS traversal, return JSON with nodes (facts) and edges (supersedes/conflicts/supports)
58
- - Effort: 2-3 days (graph builder, MCP tool, tests)
59
- - Trade-off: Adds complexity for feature used mainly for debugging/exploration
60
-
61
- - [x] **Hybrid Search (Vector + Text)**: Better relevance combining semantic and keyword matching
62
- - Value: 173% improvement in Recall@5 over FTS-only (0.266 → 0.727 in benchmarks)
63
- - Implementation: FastEmbed adapter (BAAI/bge-small-en-v1.5), embeddings stored in `embedding_json` column, `Recall#query_semantic(mode: :both)` merges vector + FTS results
64
- - No API calls -- fastembed-rb runs ONNX model locally (~67MB, downloaded once)
65
- - RRF-style fusion still a potential optimization (current: naive merge with deduplication)
66
-
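The hybrid search item above mentions RRF-style fusion as a possible replacement for the naive merge. As a rough illustration only, reciprocal rank fusion could look like the sketch below. The method name `rr_fuse`, the constant `k = 60`, and the sample fact ids are assumptions made for this example, not taken from the gem.

```ruby
# Reciprocal rank fusion: each list contributes 1 / (k + rank) per id,
# with rank starting at 1. k = 60 is a commonly used default (assumed here).
def rr_fuse(ranked_lists, k: 60)
  scores = Hash.new(0.0)
  ranked_lists.each do |ids|
    ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  # Highest fused score first; ties broken by id for deterministic output.
  scores.sort_by { |id, score| [-score, id] }.map(&:first)
end

vector_hits = [3, 1, 2] # hypothetical fact ids ranked by vector similarity
fts_hits = [1, 4, 3]    # hypothetical fact ids ranked by FTS score
fused = rr_fuse([vector_hits, fts_hits])
```

Ids that rank well in both lists rise to the top, while an id found by only one search still appears in the fused list.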
67
- ---
68
-
69
- ## Design Decisions
70
-
71
- ### No Tag Count Limit (2026-01-23)
72
-
73
- **Decision**: Removed MAX_TAG_COUNT limit from ContentSanitizer.
74
-
75
- **Rationale**:
76
- - The regex pattern `/<tag>.*?<\/tag>/m` is provably safe from ReDoS attacks
77
- - Non-greedy matching (`.*?`) with clear delimiters
78
- - No nested quantifiers or alternation that could cause catastrophic backtracking
79
- - Performance is O(n) and predictable
80
- - Performance benchmarks show excellent speed even at scale:
81
- - 100 tags: 0.07ms
82
- - 200 tags: 0.13ms
83
- - 1,000 tags: 0.64ms
84
- - Real-world usage legitimately produces 100-200+ tags in long sessions
85
- - System tags like `<claude-memory-context>` accumulate
86
- - Users mark multiple sections with `<private>` tags
87
- - The limit created false alarms and blocked legitimate ingestion
88
- - No other similar tool (claude-mem, episodic-memory) enforces tag count limits
89
-
90
- **Do not reintroduce**: Tag count validation is unnecessary and harmful. If extreme input causes issues, investigate the actual root cause rather than adding arbitrary limits.
91
-
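For reference, the non-greedy pattern discussed in the rationale can be exercised with a small sketch. This is not the gem's `ContentSanitizer`; the method name `strip_privacy_tags` and the sample string are hypothetical, and only the tag names come from this document.

```ruby
# Tag names as listed for the privacy tag system in this document.
PRIVACY_TAGS = %w[private no-memory secret].freeze

# Removes every <tag>...</tag> span for the listed tags, with no count limit.
def strip_privacy_tags(text)
  PRIVACY_TAGS.reduce(text) do |result, tag|
    escaped = Regexp.escape(tag)
    # Non-greedy match; /m lets a tagged span cross newlines.
    result.gsub(/<#{escaped}>.*?<\/#{escaped}>/m, "")
  end
end

sample = "keep <private>token=abc</private> this <secret>x\ny</secret> too"
cleaned = strip_privacy_tags(sample)
```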
92
- ---
93
-
94
- ## Executive Summary
95
-
96
- This document analyzes two complementary memory systems:
97
-
98
- **Claude-mem** (TypeScript/Node.js, v9.0.5) - Memory compression system with 6+ months of production usage:
99
- - ROI Metrics tracking token costs
100
- - Health monitoring and process management
101
- - Configuration-driven context injection
102
-
103
- **Episodic-memory** (TypeScript/Node.js, v1.0.15) - Semantic conversation search for Claude Code:
104
- - Local vector embeddings (Transformers.js)
105
- - Multi-concept AND search
106
- - Automatic conversation summarization
107
- - Tool usage tracking
108
- - Session metadata capture
109
- - Background sync with incremental updates
110
-
111
- **Our Current Advantages**:
112
- - Ruby ecosystem (simpler dependencies)
113
- - Dual-database architecture (global + project scope)
114
- - Fact-based knowledge graph (vs observation blobs or conversation exchanges)
115
- - Truth maintenance system (conflict resolution)
116
- - Predicate policies (single vs multi-value)
117
- - Progressive disclosure already implemented
118
- - Privacy tag stripping already implemented
119
-
120
- **High-Value Opportunities from Episodic-Memory**:
121
- - Vector embeddings for semantic search alongside FTS5
122
- - Tool usage tracking during fact discovery
123
- - Session metadata capture (git branch, working directory)
124
- - Multi-concept AND search
125
- - Background sync with incremental updates
126
- - Enhanced statistics and reporting
127
-
128
- ---
129
-
130
- ## Episodic-Memory Comparison
131
-
132
- ### Architecture Overview
133
-
134
- **Episodic-memory** focuses on **conversation-level semantic search** rather than fact extraction. Key differences:
135
-
136
- | Feature | Episodic-Memory | ClaudeMemory |
137
- |---------|----------------|--------------|
138
- | **Data Model** | Conversation exchanges (user-assistant pairs) | Facts (subject-predicate-object triples) |
139
- | **Search Method** | Vector embeddings + text search | Hybrid vector + FTS5 search |
140
- | **Embeddings** | Local Transformers.js (Xenova/all-MiniLM-L6-v2) | Local FastEmbed (BAAI/bge-small-en-v1.5) |
141
- | **Vector Storage** | sqlite-vec virtual table | JSON column in facts table |
142
- | **Scope** | Single database with project field | Dual database (global + project) |
143
- | **Truth Maintenance** | None (keeps all conversations) | Supersession + conflict resolution |
144
- | **Summarization** | Claude API generates summaries | N/A |
145
- | **Tool Tracking** | Explicit tool_calls table | Mentioned in provenance text |
146
- | **Session Metadata** | sessionId, cwd, gitBranch, claudeVersion, thinking metadata | Limited (session_id in content_items) |
147
- | **Multi-Concept Search** | Array-based AND queries (2-5 concepts) | Single query only |
148
- | **Incremental Sync** | Timestamp-based mtime checks | Re-processes all content |
149
- | **Background Processing** | Async hook with --background flag | Synchronous hook execution |
150
- | **Statistics** | Rich stats with project breakdown | Basic status command |
151
- | **Exclusion** | Content-based markers (`<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX`) | Tag stripping (`<private>`, `<no-memory>`) |
152
- | **Line References** | Stores line_start and line_end for each exchange | No line tracking |
153
- | **WAL Mode** | Enabled for concurrency | Not enabled |
154
-
155
- ### What Episodic-Memory Does Well
156
-
157
- 1. **Semantic Search with Local Embeddings**
158
- - Uses Transformers.js to run embedding model locally (offline-capable)
159
- - 384-dimensional vectors from `Xenova/all-MiniLM-L6-v2`
160
- - Hybrid vector + text search for best recall
161
- - sqlite-vec virtual table for fast similarity queries
162
-
163
- 2. **Multi-Concept AND Search**
164
- - Array of 2-5 concepts that must all be present in results
165
- - Searches each concept independently then intersects results
166
- - Ranks by average similarity across all concepts
167
- - Example: `["React Router", "authentication", "JWT"]`
168
-
169
- 3. **Tool Usage Tracking**
170
- - Dedicated `tool_calls` table with foreign key to exchanges
171
- - Captures tool_name, tool_input, tool_result, is_error
172
- - Tool names included in embeddings for tool-based searches
173
- - Search results show tool usage summary
174
-
175
- 4. **Rich Session Metadata**
176
- - Captures: sessionId, cwd, gitBranch, claudeVersion
177
- - Thinking metadata: level, disabled, triggers
178
- - Conversation structure: parentUuid, isSidechain
179
- - Enables filtering by branch, project context
180
-
181
- 5. **Incremental Sync**
182
- - Atomic file operations (temp file + rename)
183
- - mtime-based change detection (only copies modified files)
184
- - Fast subsequent syncs (seconds vs minutes)
185
- - Safe concurrent execution
186
-
187
- 6. **Automatic Conversation Summarization**
188
- - Uses Claude API to generate concise summaries
189
- - Summaries stored as `.txt` files alongside conversations
190
- - Concurrency-limited batch processing
191
- - Summary limit (default 10 per sync) to control API costs
192
-
193
- 7. **Background Sync**
194
- - `--background` flag for async processing
195
- - SessionStart hook runs sync without blocking
196
- - User continues working while indexing happens
197
- - Output logged to file for debugging
198
-
199
- 8. **Line-Range References**
200
- - Stores line_start and line_end for each exchange
201
- - Enables precise source linking in search results
202
- - Supports pagination: read specific line ranges from large conversations
203
- - Example: "Lines 10-25 in conversation.jsonl (295KB, 1247 lines)"
204
-
205
- 9. **Statistics and Reporting**
206
- - Total conversations, exchanges, date range
207
- - Summary coverage tracking
208
- - Project breakdown with top 10 projects
209
- - Database size reporting
210
-
211
- 10. **Exclusion Markers**
212
- - Content-based opt-out: `<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX THIS CHAT</INSTRUCTIONS-TO-EPISODIC-MEMORY>`
213
- - Files archived but excluded from search index
214
- - Prevents meta-conversations from polluting index
215
- - Use case: sensitive work, test sessions, agent conversations
216
-
217
- 11. **WAL Mode for Concurrency**
218
- - SQLite Write-Ahead Logging enabled
219
- - Better concurrency for multiple readers
220
- - Safe for concurrent sync operations
221
-
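The mtime-based change detection described under Incremental Sync can be sketched in a few lines. This is an illustrative assumption about the general technique, not episodic-memory's code; `changed_files` and the `seen` hash are hypothetical names.

```ruby
require "tmpdir"

# Returns the paths whose mtime differs from the recorded one,
# and records the current mtime for the next run.
def changed_files(paths, seen_mtimes)
  paths.select do |path|
    mtime = File.mtime(path)
    changed = seen_mtimes[path] != mtime
    seen_mtimes[path] = mtime
    changed
  end
end

first_pass = nil
second_pass = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "session.jsonl")
  File.write(path, "{}\n")
  seen = {} # a real tool would persist this between runs
  first_pass = changed_files([path], seen)  # new file, so it is reported
  second_pass = changed_files([path], seen) # unchanged, so it is skipped
end
```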
222
- ### Design Patterns Worth Adopting
223
-
224
- 1. **Local Vector Embeddings** ✅ IMPLEMENTED
225
- - **Value**: Semantic search finds conceptually similar content even with different terminology
226
- - **Implementation**: `FastembedAdapter` wrapping fastembed-rb (BAAI/bge-small-en-v1.5, ONNX runtime)
227
- - Embeddings stored as JSON in `embedding_json` column on facts table
228
- - Asymmetric query/passage encoding for better retrieval accuracy
229
- - Benchmark: Recall@5=0.696 on semantic paraphrase queries (medium difficulty)
230
-
231
- 2. **Multi-Concept AND Search**
232
- - **Value**: Precise queries like "find conversations about React AND authentication AND JWT"
233
- - **Implementation**: Run multiple searches and intersect results, rank by average similarity
234
- - **Application to facts**: Find facts matching multiple predicates or entities
235
- - **MCP tool**: `memory.search_concepts(concepts: ["auth", "API", "security"])`
236
-
237
- 3. **Tool Usage Tracking**
238
- - **Value**: Know which tools were used during fact discovery (Read, Edit, Bash, etc.)
239
- - **Implementation**: Add `tool_calls` table or JSON column in content_items
240
- - **Schema**: `{ tool_name, tool_input, tool_result, timestamp }`
241
- - **Use case**: "Which facts were discovered using the Bash tool?"
242
-
243
- 4. **Session Metadata Capture**
244
- - **Value**: Context about where/when facts were learned
245
- - **Implementation**: Extend content_items with git_branch, cwd, claude_version columns
246
- - **Use case**: "Show facts learned while on feature/auth branch"
247
-
248
- 5. **Incremental Sync**
249
- - **Value**: Faster subsequent ingestions (seconds vs minutes)
250
- - **Implementation**: Store mtime for each content_item, skip unchanged files
251
- - **Hook optimization**: Only process delta since last ingest
252
-
253
- 6. **Background Processing**
254
- - **Value**: Don't block user while processing large transcripts
255
- - **Implementation**: Fork process or use Ruby's async/await
256
- - **Hook flag**: `claude-memory hook ingest --async`
257
-
258
- 7. **Line-Range References in Provenance**
259
- - **Value**: Precise source linking for fact verification
260
- - **Implementation**: Store line_start and line_end in provenance table
261
- - **Display**: "Fact from lines 42-56 in transcript.jsonl"
262
-
263
- 8. **Statistics Command**
264
- - **Value**: Visibility into memory system health
265
- - **Implementation**: Enhance `claude-memory status` with more metrics
266
- - **Metrics**: Facts by predicate, entities by type, provenance coverage, scope breakdown
267
-
268
- 9. **WAL Mode**
269
- - **Value**: Better concurrency, safer concurrent operations
270
- - **Implementation**: `db.pragma('journal_mode = WAL')` in store initialization
271
- - **Benefit**: Multiple readers don't block each other
272
-
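The multi-concept AND search in item 2 (run one search per concept, intersect the results, rank by average similarity) might be sketched as follows. The method name `intersect_concepts`, the input shape, and the sample scores are assumptions for illustration only.

```ruby
# results_by_concept: one { fact_id => similarity } hash per concept.
# Keeps only ids present for every concept, ranked by average similarity.
def intersect_concepts(results_by_concept)
  return [] if results_by_concept.empty?

  common_ids = results_by_concept.map(&:keys).reduce(:&)
  ranked = common_ids.map do |id|
    total = results_by_concept.sum { |hits| hits[id] }
    [id, total / results_by_concept.size.to_f]
  end
  # Highest average first; ties broken by id.
  ranked.sort_by { |id, average| [-average, id] }
end

auth_hits = { 1 => 0.9, 2 => 0.5, 3 => 0.7 } # hypothetical scores
jwt_hits = { 2 => 0.7, 3 => 0.3, 4 => 0.8 }  # hypothetical scores
ranked = intersect_concepts([auth_hits, jwt_hits])
```

Facts 1 and 4 drop out because each matches only one concept.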
273
- ---
274
-
275
- ## 1. Health Monitoring and Process Management
276
-
277
- ### What claude-mem Does
278
-
279
- **Worker Service Management**:
280
-
281
- ```typescript
282
- // Health check endpoint
283
- app.get('/health', (req, res) => {
284
- res.json({
285
- status: 'ok',
286
- uptime: process.uptime(),
287
- port: WORKER_PORT,
288
- memory: process.memoryUsage(),
289
- version: packageJson.version
290
- });
291
- });
292
-
293
- // Smart startup
294
- async function ensureWorkerHealthy(timeout = 10000) {
295
- const healthy = await checkHealth();
296
- if (!healthy) {
297
- await startWorker();
298
- await waitForHealth(timeout);
299
- }
300
- }
301
- ```
302
-
303
- **Process Management**:
304
- - PID file tracking (`~/.claude-mem/worker.pid`)
305
- - Port conflict detection
306
- - Version mismatch warnings
307
- - Graceful shutdown handlers
308
- - Platform-aware timeouts (Windows vs Unix)
309
-
310
- **File**: `src/infrastructure/ProcessManager.ts`
311
-
312
- ### What We Should Do
313
-
314
- **Priority**: LOW (we use MCP server, not background worker)
315
-
316
- **Implementation** (if we add background worker):
317
-
318
- 1. **Health endpoint in MCP server**:
319
- ```ruby
320
- # lib/claude_memory/mcp/server.rb
321
- def handle_ping
322
- {
323
- status: "ok",
324
- version: ClaudeMemory::VERSION,
325
- databases: {
326
- global: File.exist?(global_db_path),
327
- project: File.exist?(project_db_path)
328
- },
329
- uptime: Process.clock_gettime(Process::CLOCK_MONOTONIC) - @start_time
330
- }
331
- end
332
- ```
333
-
334
- 2. **PID file management**:
335
- ```ruby
336
- # lib/claude_memory/daemon.rb
337
- class Daemon
338
- PID_FILE = File.expand_path("~/.claude/memory_server.pid")
339
-
340
- def start
341
- check_existing_process
342
- fork_and_daemonize
343
- write_pid_file
344
- setup_signal_handlers
345
- run_server
346
- end
347
-
348
- def stop
349
- pid = read_pid_file
350
- Process.kill("TERM", pid)
351
- wait_for_shutdown
352
- remove_pid_file
353
- end
354
- end
355
- ```
356
-
357
- **Benefits**:
358
- - Reliable server lifecycle
359
- - Easy debugging (health checks)
360
- - Prevents duplicate processes
361
-
362
- **Trade-offs**:
363
- - Complexity we may not need
364
- - Ruby daemons are tricky on Windows
365
- - MCP stdio transport doesn't need health checks
366
-
367
- **Verdict**: Skip unless we switch to HTTP-based MCP transport.
368
-
369
- ---
370
-
371
- ## 3. Web-Based Viewer UI
372
-
373
- ### What claude-mem Does
374
-
375
- **Real-Time Memory Viewer** at `http://localhost:37777`:
376
-
377
- - React-based web UI
378
- - Server-Sent Events (SSE) for real-time updates
379
- - Infinite scroll pagination
380
- - Project filtering
381
- - Settings persistence (sidebar state, theme)
382
- - Auto-reconnection with exponential backoff
383
- - Single-file HTML bundle (esbuild)
384
-
385
- **File**: `src/ui/viewer/` (React components)
386
-
387
- **Features**:
388
- - See observations as they're captured
389
- - Search historical observations
390
- - Filter by project
391
- - Export/share observations
392
- - Theme toggle (light/dark)
393
-
394
- **Build**:
395
- ```typescript
396
- esbuild.build({
397
- entryPoints: ['src/ui/viewer/index.tsx'],
398
- bundle: true,
399
- outfile: 'plugin/ui/viewer.html',
400
- loader: { '.tsx': 'tsx', '.woff2': 'dataurl' },
401
- });
402
- ```
403
-
404
- ### What We Should Do
405
-
406
- **Priority**: LOW (nice-to-have)
407
-
408
- **Implementation** (if we want it):
409
-
410
- 1. **Add Sinatra web server**:
411
- ```ruby
412
- # lib/claude_memory/web/server.rb
413
- require 'sinatra/base'
414
- require 'json'
415
-
416
- module ClaudeMemory
417
- module Web
418
- class Server < Sinatra::Base
419
- get '/' do
420
- erb :index
421
- end
422
-
423
- get '/api/facts' do
424
- facts = Recall.search(params[:query], limit: 100)
425
- json facts
426
- end
427
-
428
- get '/api/stream' do
429
- stream :keep_open do |out|
430
- # SSE for real-time updates
431
- EventMachine.add_periodic_timer(1) do
432
- out << "data: #{recent_facts.to_json}\n\n"
433
- end
434
- end
435
- end
436
- end
437
- end
438
- end
439
- ```
440
-
441
- 2. **Add to MCP server** (optional HTTP endpoint):
442
- ```ruby
443
- # claude-memory serve --web
444
- def serve_with_web
445
- Thread.new { Web::Server.run!(port: 37778) }
446
- serve_mcp # Main MCP server
447
- end
448
- ```
449
-
450
- 3. **Simple HTML viewer**:
451
- ```html
452
- <!-- lib/claude_memory/web/views/index.erb -->
453
- <!DOCTYPE html>
454
- <html>
455
- <head>
456
- <title>ClaudeMemory Viewer</title>
457
- <style>/* Minimal CSS */</style>
458
- </head>
459
- <body>
460
- <div id="facts-list"></div>
461
- <script>
462
- // Fetch and display facts
463
- fetch('/api/facts')
464
- .then(r => r.json())
465
- .then(facts => render(facts));
466
- </script>
467
- </body>
468
- </html>
469
- ```
470
-
471
- **Benefits**:
472
- - Visibility into memory system
473
- - Debugging tool
474
- - User trust (transparency)
475
-
476
- **Trade-offs**:
477
- - Significant development effort
478
- - Need to bundle web assets
479
- - Another dependency (web server)
480
- - Maintenance burden
481
-
482
- **Verdict**: Skip for MVP. Consider if users request it.
483
-
484
- ---
485
-
486
- ## 4. Dual-Integration Strategy
487
-
488
- ### What claude-mem Does
489
-
490
- **Plugin + MCP Server Hybrid**:
491
-
492
- 1. **Claude Code Plugin** (primary):
493
- - Hooks for lifecycle events
494
- - Worker service for AI processing
495
- - Installed via marketplace
496
-
497
- 2. **MCP Server** (secondary):
498
- - Thin wrapper delegating to worker HTTP API
499
- - Enables Claude Desktop integration
500
- - Same backend, different frontend
501
-
502
- **File**: `src/servers/mcp-server.ts` (thin wrapper)
503
-
504
- ```typescript
505
- // MCP server delegates to worker HTTP API
506
- const mcpServer = new McpServer({
507
- name: "claude-mem",
508
- version: packageJson.version
509
- });
510
-
511
- mcpServer.setRequestHandler(ListToolsRequestSchema, async () => {
512
- // Fetch tools from worker
513
- const tools = await fetch('http://localhost:37777/api/mcp/tools');
514
- return tools.json();
515
- });
516
-
517
- mcpServer.setRequestHandler(CallToolRequestSchema, async (request) => {
518
- // Forward to worker
519
- const result = await fetch('http://localhost:37777/api/mcp/call', {
520
- method: 'POST',
521
- body: JSON.stringify(request.params)
522
- });
523
- return result.json();
524
- });
525
- ```
526
-
527
- **Benefit**: One backend, multiple frontends.
528
-
529
- ### What We Should Do
530
-
531
- **Priority**: LOW
532
-
533
- **Current State**: We only have MCP server (no plugin hooks yet).
534
-
535
- **Implementation** (if we add Claude Code hooks):
536
-
537
- 1. **Keep MCP server as primary**:
538
- ```ruby
539
- # lib/claude_memory/mcp/server.rb
540
- # Current implementation - keep as-is
541
- ```
542
-
543
- 2. **Add hook handlers**:
544
- ```ruby
545
- # lib/claude_memory/hook/handler.rb
546
- # Delegate to same store manager
547
- def ingest_hook
548
- store_manager = Store::StoreManager.new
549
- ingester = Ingest::Ingester.new(store_manager)
550
- ingester.ingest(read_stdin[:transcript_delta])
551
- end
552
- ```
553
-
554
- 3. **Shared backend**:
555
- ```
556
- MCP Server (stdio) ──┐
557
- ├──> Store::StoreManager ──> SQLite
558
- Hook Handler (stdin) ─┘
559
- ```
560
-
561
- **Benefits**:
562
- - Works with both Claude Code and Claude Desktop
563
- - No duplicate logic
564
- - Clean separation of transport vs business logic
565
-
566
- **Trade-offs**:
567
- - More integration points to maintain
568
- - Hook contract is Claude Code-specific
569
-
570
- **Verdict**: Consider if we add Claude Code hooks (not urgent).
571
-
572
- ---
573
-
574
- ## 5. Configuration-Driven Context Injection
575
-
576
- ### What claude-mem Does
577
-
578
- **Context Config File**: `~/.claude-mem/settings.json`
579
-
580
- ```json
581
- {
582
- "context": {
583
- "mode": "reader", // reader | chat | inference
584
- "observations": {
585
- "enabled": true,
586
- "limit": 10,
587
- "types": ["decision", "gotcha", "trade-off"]
588
- },
589
- "summaries": {
590
- "enabled": true,
591
- "fields": ["request", "learned", "completed"]
592
- },
593
- "timeline": {
594
- "depth": 5
595
- }
596
- }
597
- }
598
- ```
599
-
600
- **File**: `src/services/context/ContextConfigLoader.ts`
601
-
602
- **Benefit**: Users can fine-tune what gets injected.
603
-
604
- ### What We Should Do
605
-
606
- **Priority**: LOW
607
-
608
- **Implementation**:
609
-
610
- 1. **Add config file**:
611
- ```ruby
612
- # ~/.claude/memory_config.yml
613
- publish:
614
- mode: shared # shared | local | home
615
- facts:
616
- limit: 50
617
- scopes: [global, project]
618
- predicates: [uses_*, depends_on, has_constraint]
619
- entities:
620
- limit: 20
621
- conflicts:
622
- show: true
623
- ```
624
-
625
- 2. **Load in publisher**:
626
- ```ruby
627
- # lib/claude_memory/publish.rb
628
- class Publisher
629
- def initialize
630
- @config = load_config
631
- end
632
-
633
- def load_config
634
- path = File.expand_path("~/.claude/memory_config.yml")
635
- YAML.load_file(path) if File.exist?(path)
636
- rescue
637
- default_config
638
- end
639
- end
640
- ```
641
-
642
- 3. **Apply during publish**:
643
- ```ruby
644
- def build_snapshot
645
- config = @config[:publish]
646
-
647
- facts = store.facts(
648
- limit: config[:facts][:limit],
649
- scopes: config[:facts][:scopes]
650
- )
651
-
652
- format_snapshot(facts, config)
653
- end
654
- ```
655
-
656
- **Benefits**:
657
- - User control over published content
658
- - Environment-specific configs
659
- - Reduces noise in generated files
660
-
661
- **Trade-offs**:
662
- - Another config file to document
663
- - May confuse users
664
- - Publish should be opinionated by default
665
-
666
- **Verdict**: Skip for MVP. Default config is sufficient.
667
-
668
- ---
669
-
670
- ## Features We're Already Doing Better
671
-
672
- ### 1. Dual-Database Architecture (Global + Project)
673
-
674
- **Our Advantage**: `Store::StoreManager` with global + project scopes.
675
-
676
- Claude-mem has a single database with project filtering. Our approach is cleaner:
677
-
678
- ```ruby
679
- # We separate global vs project knowledge
680
- @global_store = Store::SqliteStore.new(global_db_path)
681
- @project_store = Store::SqliteStore.new(project_db_path)
682
-
683
- # Claude-mem filters post-query
684
- SELECT * FROM observations WHERE project = ?
685
- ```
686
-
687
- **Keep this.** It's a better design.
688
-
689
- ### 2. Fact-Based Knowledge Graph
690
-
691
- **Our Advantage**: Subject-predicate-object triples with provenance.
692
-
693
- Claude-mem stores blob observations. We store structured facts:
694
-
695
- ```ruby
696
- # Ours (structured)
697
- { subject: "project", predicate: "uses_database", object: "PostgreSQL" }
698
-
699
- # Theirs (blob)
700
- { title: "Uses PostgreSQL", narrative: "The project uses..." }
701
- ```
702
-
703
- **Keep this.** Enables richer queries and inference.
704
-
705
- ### 3. Truth Maintenance System
706
-
707
- **Our Advantage**: `Resolve::Resolver` with supersession and conflicts.
708
-
709
- Claude-mem doesn't resolve contradictions. We do:
710
-
711
- ```ruby
712
- # We detect when facts supersede each other
713
- old: { subject: "api", predicate: "uses_auth", object: "JWT" }
714
- new: { subject: "api", predicate: "uses_auth", object: "OAuth2" }
715
- # → Creates supersession link
716
-
717
- # We detect conflicts
718
- fact1: { subject: "api", predicate: "rate_limit", object: "100/min" }
719
- fact2: { subject: "api", predicate: "rate_limit", object: "1000/min" }
720
- # → Creates conflict record
721
- ```
722
-
723
- **Keep this.** It's a core differentiator.
724
-
725
- ### 4. Predicate Policies
726
-
727
- **Our Advantage**: `Resolve::PredicatePolicy` for single vs multi-value.
728
-
729
- Claude-mem doesn't distinguish. We do:
730
-
731
- ```ruby
732
- # Single-value (supersedes)
733
- "uses_database" → only one database at a time
734
-
735
- # Multi-value (accumulates)
736
- "depends_on" → many dependencies
737
- ```
738
-
739
- **Keep this.** Prevents false conflicts.
740
-
741
- ### 5. Ruby Ecosystem (Simpler)
742
-
743
- **Our Advantage**: Fewer dependencies, easier install.
744
-
745
- ```ruby
746
- # Ours
747
- gem install claude_memory # Done
748
-
749
- # Theirs
750
- npm install # Needs Node.js
751
- npm install chromadb # Needs Python + pip
752
- npm install better-sqlite3 # Needs node-gyp + build tools
753
- ```
754
-
755
- **Keep this.** Ruby's stdlib is excellent.
756
-
757
- ---
758
-
759
- ## Features to Avoid
760
-
761
- ### 1. Chroma Vector Database
762
-
763
- **Their Approach**: Hybrid SQLite FTS5 + Chroma vector search.
764
-
765
- **Our Take**: **Skip it.** Adds significant complexity:
766
-
767
- - Python dependency
768
- - ChromaDB server
769
- - Embedding generation
770
- - Sync overhead
771
-
772
- **Alternative**: We use fastembed-rb with a local ONNX model (BAAI/bge-small-en-v1.5) -- no Python, no server, no API calls.
773
-
774
- ### 2. Claude Agent SDK for Distillation
775
-
776
- **Their Approach**: Use `@anthropic-ai/claude-agent-sdk` for observation compression.
777
-
778
- **Our Take**: **Skip it.** We already have `Distill::Distiller` interface. SDK adds:
779
-
780
- - Node.js dependency
781
- - Subprocess management
782
- - Complex event loop
783
-
784
- **Alternative**: Direct API calls via `anthropic-rb` gem (if we implement distiller).
785
-
786
- ### 3. Worker Service Background Process
787
-
788
- **Their Approach**: Long-running worker with HTTP API + MCP wrapper.
789
-
790
- **Our Take**: **Skip it.** We use MCP server directly:
791
-
792
- - No background process to manage
793
- - No port conflicts
794
- - No PID files
795
- - Simpler deployment
796
-
797
- **Alternative**: Keep stdio-based MCP server. Add HTTP transport only if needed.
798
-
799
- ### 4. Web Viewer UI
800
-
801
- **Their Approach**: React-based web UI at `http://localhost:37777`.
802
-
803
- **Our Take**: **Skip for MVP.** Significant effort for uncertain value:
804
-
805
- - React + esbuild
806
- - SSE implementation
807
- - State management
808
- - CSS/theming
809
-
810
- **Alternative**: CLI output is sufficient. Add web UI if users request it.
811
- **Alternative**: CLI output is sufficient. Add web UI if users request it.
812
-
813
- ---
814
-
815
- ## Remaining Improvements
816
-
817
- The following sections (6-12 from the original analysis) have been implemented and moved to the "Implemented Improvements" section above:
818
-
819
- - ✅ Section 6: Local Vector Embeddings for Semantic Search
820
- - ✅ Section 7: Multi-Concept AND Search
821
- - ✅ Section 8: Tool Usage Tracking
822
- - ✅ Section 9: Enhanced Session Metadata
823
- - ✅ Section 10: Incremental Sync (mtime-based)
824
- - ✅ Section 11: Enhanced Statistics and Reporting
825
- - ✅ Section 12: WAL Mode for Better Concurrency
826
-
827
- **For remaining unimplemented improvements, see:** [remaining_improvements.md](./remaining_improvements.md)
828
-
829
- Key remaining items:
830
- - Background processing for hooks (--async flag)
831
- - ROI metrics and token economics tracking
832
- - Structured logging
833
- - Embed command for backfilling embeddings
834
-
835
- ---
836
-
837
- ## QMD-Inspired Improvements (2026-01-26)
838
-
839
- Analysis of **QMD (Quick Markdown Search)** reveals several high-value optimizations for search quality and performance. QMD is an on-device markdown search engine with hybrid BM25 + vector + LLM reranking, achieving 50%+ Hit@3 improvement over BM25-only search.
840
-
841
- **See detailed analysis**: [docs/influence/qmd.md](./influence/qmd.md)
842
-
843
- ### High Priority ⭐
844
-
845
- #### 1. **Native Vector Storage (sqlite-vec)** ⭐ CRITICAL
17
+ ### 1. Native Vector Storage (sqlite-vec) CRITICAL
846
18
 
847
19
  - **Value**: 10-100x faster KNN queries, enables larger fact databases
848
20
  - **QMD Proof**: Handles 10,000+ documents with sub-second vector queries
849
21
  - **Current Issue**: JSON embedding storage requires loading all facts, O(n) Ruby similarity calculation
850
22
  - **Solution**: sqlite-vec extension with native C KNN queries
851
23
  - **Implementation**:
852
- - Schema migration v7: Create `facts_vec` virtual table using `vec0`
24
+ - Schema migration v11: Create `facts_vec` virtual table using `vec0`
853
25
  - Two-step query pattern (avoid JOINs - they hang with vec tables!)
854
26
  - Update `Embeddings::Similarity` class
855
27
  - Backfill existing embeddings
856
28
  - **Trade-off**: Adds native dependency (acceptable, well-maintained, cross-platform)
857
- - **Recommendation**: **ADOPT IMMEDIATELY** - This is foundational
858
-
859
- #### 2. **Reciprocal Rank Fusion (RRF) Algorithm** ⭐ HIGH VALUE
860
-
861
- - **Value**: 50% improvement in Hit@3 for medium-difficulty queries (QMD evaluation)
862
- - **QMD Proof**: Evaluation suite shows consistent improvements across all query types
863
- - **Current Issue**: Naive deduplication doesn't properly fuse ranking signals
864
- - **Solution**: Mathematical fusion of FTS + vector ranked lists with position-aware scoring
865
- - **Formula**: `score = Σ(weight / (k + rank + 1))` with top-rank bonus
866
- - **Implementation**:
867
- - Create `Recall::RRFusion` class
868
- - Update `Recall#query_semantic_dual` to use RRF
869
- - Apply weights: original query ×2, expanded queries ×1
870
- - Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
871
- - **Trade-off**: Slightly more complex than naive merging (acceptable, well-tested)
872
- - **Recommendation**: **ADOPT IMMEDIATELY** - Pure algorithmic improvement
873
-
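The fusion formula above can be sketched as below. This is an illustration under stated assumptions, not the gem's code: each ranked list is an array of fact IDs ordered best-first, ranks are 0-based, and the top-rank bonus is applied once per ID using its best position in any list. The `rr_fuse` name and the input shape are hypothetical.

```ruby
# Hypothetical helper: fuse ranked lists with Reciprocal Rank Fusion.
# lists: array of { ids: [best, ..., worst], weight: Numeric }
def rr_fuse(lists, k: 60)
  scores = Hash.new(0.0)
  best_rank = {}

  lists.each do |list|
    list[:ids].each_with_index do |id, rank| # rank is 0-based
      scores[id] += list[:weight] / (k + rank + 1).to_f
      best_rank[id] = [best_rank.fetch(id, rank), rank].min
    end
  end

  # Top-rank bonus: +0.05 for position #1, +0.02 for positions #2-3.
  best_rank.each do |id, rank|
    scores[id] += 0.05 if rank.zero?
    scores[id] += 0.02 if rank == 1 || rank == 2
  end

  # Highest score first; ties broken by ID for a stable order.
  scores.sort_by { |id, score| [-score, id.to_s] }.map(&:first)
end

fts    = { ids: [:a, :b, :c], weight: 2.0 } # original query, weight x2
vector = { ids: [:b, :d, :a], weight: 1.0 } # expanded query, weight x1
fused  = rr_fuse([fts, vector])
```

With `k = 60` the rank term changes slowly, so an ID that appears in both lists tends to outrank one that appears high in only one.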
874
- #### 3. **Docid Short Hash System** ⭐ MEDIUM VALUE
875
-
876
- - **Value**: Better UX, cross-database fact references
877
- - **QMD Proof**: Used in all output, enables `qmd get #abc123`
878
- - **Current Issue**: Integer IDs are database-specific, not user-friendly
879
- - **Solution**: 8-character hash IDs for facts (e.g., `#abc123de`)
880
- - **Implementation**:
881
- - Schema migration v8: Add `docid` column (indexed, unique)
882
- - Backfill existing facts with SHA256-based docids
883
- - Update CLI commands (`explain`, `recall`) to accept docids
884
- - Update MCP tools to accept docids
885
- - Update output formatting to show docids
886
- - **Trade-off**: Hash collisions possible (8 chars = 1 in 4.3 billion, very rare)
887
- - **Recommendation**: **ADOPT IN PHASE 2** - Clear UX improvement
888
-
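A short sketch of how such a docid might be derived. The choice of hashing the subject, predicate, and object joined by `|` is an assumption for illustration; the text only states that docids are SHA256-based and 8 characters long. Both helper names are hypothetical.

```ruby
require "digest"

# Hypothetical: derive an 8-character hex docid from a fact's fields.
def docid_for(subject, predicate, object)
  Digest::SHA256.hexdigest([subject, predicate, object].join("|"))[0, 8]
end

# Hypothetical: accept either "#abc123de" or "abc123de" from the CLI.
def normalize_docid(input)
  input.to_s.delete_prefix("#").downcase
end

docid = docid_for("api", "uses_auth", "OAuth2")
```

Eight hex characters give `16**8` (about 4.3 billion) possible values, which is where the collision figure above comes from.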
889
- #### 4. **Smart Expansion Detection** ⭐ MEDIUM VALUE
890
-
891
- - **Value**: Skip unnecessary vector search when FTS finds exact match
892
- - **QMD Proof**: Saves 2-3 seconds on 60% of queries (exact keyword matches)
893
- - **Current Issue**: Always runs both FTS and vector search, even for exact matches
894
- - **Solution**: Heuristic detection of strong FTS signal
895
- - **Thresholds**: `top_score >= 0.85` AND `gap >= 0.15`
896
- - **Implementation**:
897
- - Create `Recall::ExpansionDetector` class
898
- - Update `Recall#query_semantic_dual` to check before vector search
899
- - Add optional metrics tracking (skip rate, latency saved)
900
- - **Trade-off**: May miss semantic results for exact matches (acceptable)
901
- - **Recommendation**: **ADOPT IN PHASE 3** - Clear performance win
902
-
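The threshold check can be sketched as follows. It assumes FTS scores are normalized to the 0.0..1.0 range and sorted best-first, and that "gap" means the difference between the top score and the runner-up; neither detail is confirmed by the text above. The method name is hypothetical.

```ruby
# Hypothetical check: is the FTS signal strong enough to skip vector search?
# scores: normalized FTS scores (0.0..1.0), sorted best first.
def strong_fts_signal?(scores, min_top: 0.85, min_gap: 0.15)
  return false if scores.empty?

  top = scores[0]
  runner_up = scores[1] || 0.0 # a single hit has no competitor
  top >= min_top && (top - runner_up) >= min_gap
end
```

For example, `strong_fts_signal?([0.92, 0.60])` passes both thresholds, while `strong_fts_signal?([0.92, 0.85])` fails on the gap and would still run the vector search.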
903
- ### Medium Priority
904
-
905
- #### 5. **Document Chunking for Long Transcripts**
906
-
907
- - **Value**: Better embeddings for long content (>3000 chars)
908
- - **QMD Approach**: 800 tokens, 15% overlap, semantic boundary detection
909
- - **Break Priority**: paragraph > sentence > line > word
910
- - **Implementation**: Modify ingestion to chunk long content_items before embedding
911
- - **Consideration**: Only if users report issues with long transcripts
912
- - **Recommendation**: **DEFER** - Not urgent, FastEmbed handles shorter content well
913
-
914
- #### 6. **LLM Response Caching**
915
-
916
- - **Value**: Reduce API costs for repeated distillation
917
- - **QMD Proof**: Hash-based caching with 80% hit rate
918
- - **Implementation**:
919
- - Add `llm_cache` table (hash, result, created_at)
920
- - Cache key: `SHA256(operation + model + input)`
921
- - Probabilistic cleanup: 1% chance per operation, keep latest 1000
922
- - **Consideration**: Most valuable when distiller is fully implemented
923
- - **Recommendation**: **ADOPT WHEN DISTILLER ACTIVE** - Cost savings
924
-
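A minimal in-memory sketch of the caching scheme, standing in for the `llm_cache` table. The `LlmCache` class and its methods are hypothetical; the key derivation and cleanup numbers follow the bullets above.

```ruby
require "digest"

# Hypothetical in-memory stand-in for the `llm_cache` table.
class LlmCache
  MAX_ENTRIES = 1000

  def initialize
    @rows = {} # key => result; Ruby hashes preserve insertion order
  end

  # Cache key: SHA256(operation + model + input)
  def key_for(operation, model, input)
    Digest::SHA256.hexdigest(operation + model + input)
  end

  # Return the cached result, or compute it with the block and store it.
  def fetch(operation, model, input)
    key = key_for(operation, model, input)
    return @rows[key] if @rows.key?(key)

    @rows[key] = yield
    cleanup if rand < 0.01 # probabilistic cleanup: roughly 1% of writes
    @rows[key]
  end

  # Drop the oldest entries until only the latest MAX_ENTRIES remain.
  def cleanup
    @rows.shift while @rows.size > MAX_ENTRIES
  end
end

cache = LlmCache.new
calls = 0
first  = cache.fetch("distill", "model-x", "transcript text") { calls += 1; "facts" }
second = cache.fetch("distill", "model-x", "transcript text") { calls += 1; "facts" }
```

Plain concatenation can produce the same key for different field splits (for example `"ab" + "c"` and `"a" + "bc"`), so a real implementation may want a delimiter between the fields.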
925
- #### 7. **Enhanced Snippet Extraction**
926
-
927
- - **Value**: Better search result previews with query term highlighting
928
- - **QMD Approach**: Find line with most query term matches, extract 1 line before + 2 after
929
- - **Implementation**: Add to `Recall` output formatting
930
- - **Consideration**: Improves UX but not critical
931
- - **Recommendation**: **CONSIDER** - Nice-to-have
932
-
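The QMD-style approach described above can be sketched as below. This is an illustrative version only, with a hypothetical `extract_snippet` helper and simple case-insensitive substring matching; it is not the gem's snippet extractor.

```ruby
# Hypothetical snippet extractor: pick the line with the most query-term
# matches and return it with 1 line before and 2 lines after.
def extract_snippet(text, query)
  lines = text.lines.map(&:chomp)
  return "" if lines.empty?

  terms = query.downcase.split
  best_index = lines.each_index.max_by do |i|
    line = lines[i].downcase
    terms.count { |term| line.include?(term) }
  end

  first = [best_index - 1, 0].max
  last  = [best_index + 2, lines.size - 1].min
  lines[first..last].join("\n")
end

text = "alpha\nbeta\nuses OAuth2 for api auth\ngamma\ndelta\nepsilon"
snippet = extract_snippet(text, "api auth")
```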
933
- ### Low Priority / Not Recommended
934
-
935
- #### 8. **Neural Embeddings (EmbeddingGemma)** (SUPERSEDED)
936
-
937
- - **QMD Model**: 300M params, 300MB download, 384 dimensions
938
- - **Value**: Better semantic search quality (+40% Hit@3 over TF-IDF)
939
- - **Cost**: 300MB download, 300MB VRAM, 2s cold start, complex dependency
940
- - **Decision**: **SUPERSEDED** by FastEmbed integration (BAAI/bge-small-en-v1.5, 67MB, via fastembed-rb). Benchmark Recall@5=0.786 aggregate, no API key needed.
941
-
942
- #### 9. **Cross-Encoder Reranking** (REJECT)
943
-
944
- - **QMD Model**: Qwen3-Reranker-0.6B (640MB)
945
- - **Value**: Better ranking precision via LLM scoring
946
- - **Cost**: 640MB model, 400ms latency per query, complex dependency
947
- - **Decision**: **REJECT** - Over-engineering for fact retrieval
948
-
949
- #### 10. **Query Expansion (LLM)** (REJECT)
950
-
951
- - **QMD Model**: Qwen3-1.7B (2.2GB)
952
- - **Value**: Generate alternative query phrasings for better recall
953
- - **Cost**: 2.2GB model, 800ms latency per query
954
- - **Decision**: **REJECT** - No LLM in recall path, too heavy
955
-
956
- #### 11. **YAML Collection System** (REJECT)
957
-
958
- - **QMD Use**: Multi-directory indexing with per-path contexts
959
- - **Our Use**: Dual-database (global + project) already provides clean separation
960
- - **Decision**: **REJECT** - Our approach is cleaner for our use case
961
-
962
- #### 12. **Content-Addressable Storage** (REJECT)
963
-
964
- - **QMD Use**: Deduplicates documents by SHA256 hash
965
- - **Our Use**: Facts deduplicated by signature, not content hash
966
- - **Decision**: **REJECT** - Different data model
967
-
968
- #### 13. **Virtual Path System** (REJECT)
969
-
970
- - **QMD Use**: `qmd://collection/path` unified namespace
971
- - **Our Use**: Dual-database provides clear namespace
972
- - **Decision**: **REJECT** - Unnecessary complexity
973
29
 
974
30
  ---
975
31
 
976
- ## Implementation Priorities
977
-
978
- ### High Priority (QMD-Inspired)
32
+ ## High Priority (Study-Inspired)
979
33
 
980
- 1. **Native Vector Storage (sqlite-vec)** - 10-100x faster KNN, foundational improvement
981
- 2. **Reciprocal Rank Fusion (RRF)** ⭐ - 50% better search quality, pure algorithm
982
- 3. **Docid Short Hashes** - Better UX for fact references
983
- 4. **Smart Expansion Detection** - Skip unnecessary vector search when FTS is confident
34
+ ### 2. SessionStart Context Injection via Hook
984
35
 
985
- ### Medium Priority
36
+ Source: claude-supermemory study
986
37
 
987
- 5. **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
988
- 6. **ROI Metrics** - Track token economics for distillation (from claude-mem)
989
- 7. **LLM Response Caching** - Reduce API costs (from QMD)
990
- 8. **Document Chunking** - Better embeddings for long transcripts (from QMD, if needed)
38
+ - **Value**: Guarantees Claude sees memory context immediately, supplements existing `.claude/rules/` publish
39
+ - **Implementation**: Inject recalled facts into Claude's context at session start using `hookSpecificOutput.additionalContext`
40
+ - **Evidence**: `context-hook.js:72-74` uses hook response to inject `<supermemory-context>` XML
41
+ - **Effort**: 1-2 days (hook handler, context formatter, settings)
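A rough sketch of the hook response shape, assuming the hook prints JSON to stdout and that `hookSpecificOutput` carries `hookEventName` alongside `additionalContext`. The `session_start_response` helper, the fact hash shape, and the `<memory-context>` wrapper tag are hypothetical.

```ruby
require "json"

# Hypothetical formatter: build the JSON a SessionStart hook would print.
def session_start_response(facts)
  context = facts.map { |f| "- #{f[:subject]} #{f[:predicate]} #{f[:object]}" }.join("\n")
  {
    hookSpecificOutput: {
      hookEventName: "SessionStart",
      additionalContext: "<memory-context>\n#{context}\n</memory-context>"
    }
  }
end

response = session_start_response(
  [{ subject: "api", predicate: "uses_auth", object: "OAuth2" }]
)
puts JSON.generate(response)
```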
991
42
 
992
- ### Low Priority
43
+ ### 3. Tool-Specific Observation Compression ⭐
993
44
 
994
- 9. **Structured Logging** - Better debugging with JSON logs
995
- 10. **Embed Command** - Backfill embeddings for existing facts
996
- 11. **Enhanced Snippet Extraction** - Query-aware snippet preview (from QMD)
997
- 12. **Health Monitoring** - Only if we add background worker
998
- 13. **Web Viewer UI** - Only if users request visualization
999
- 14. **Configuration-Driven Context** - Only if users request snapshot customization
45
+ Source: claude-supermemory study
1000
46
 
1001
- ---
1002
-
1003
- ## Migration Path
47
+ - **Value**: ~70% token reduction vs raw tool I/O in provenance descriptions
48
+ - **Implementation**: Compact per-tool summarization for provenance (e.g., `Edited auth.js: "login()" → "async login()"`)
49
+ - **Evidence**: `compress.js:13-75` — 10 tool handlers with human-readable output
50
+ - **Effort**: 4-6 hours (class + tests + ingest integration)
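A small sketch of per-tool summarization along these lines. The `compress_observation` helper and the three handled tools are illustrative only, and the input keys (`file_path`, `old_string`, `new_string`, `command`) are assumed to mirror the tool inputs found in transcripts.

```ruby
# Hypothetical per-tool summarizer for provenance descriptions.
def compress_observation(tool, input)
  case tool
  when "Edit"
    %(Edited #{File.basename(input[:file_path])}: "#{input[:old_string]}" → "#{input[:new_string]}")
  when "Read"
    "Read #{File.basename(input[:file_path])}"
  when "Bash"
    "Ran: #{input[:command].to_s[0, 60]}" # truncate long commands
  else
    "Used #{tool}"
  end
end

summary = compress_observation(
  "Edit",
  { file_path: "/src/auth.js", old_string: "login()", new_string: "async login()" }
)
```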
1004
51
 
1005
- ### Completed
52
+ ### 4. Claude Code Plugin Distribution Format ⭐
1006
53
 
1007
- - [x] WAL mode for better concurrency
1008
- - [x] Enhanced statistics command
1009
- - [x] Session metadata tracking
1010
- - [x] Tool usage tracking
1011
- - [x] Semantic search with local embeddings (FastEmbed bge-small-en-v1.5)
1012
- - [x] Multi-concept AND search
1013
- - [x] Incremental sync with mtime tracking
1014
- - [x] Context-aware queries
54
+ Source: QMD study
1015
55
 
1016
- ### Phase 1: Vector Storage Upgrade (from QMD) - IMMEDIATE
56
+ - **Value**: 10x easier installation (one command vs multi-step gem + MCP + hook config)
57
+ - **Implementation**: Package ClaudeMemory as marketplace plugin for single-command installation
58
+ - **Evidence**: `.claude-plugin/marketplace.json` — complete plugin spec with MCP server bundling and skill definitions
59
+ - **Effort**: 2-3 days
1017
60
 
1018
- - [ ] Add sqlite-vec extension support (gem or FFI)
1019
- - [ ] Create schema migration v7: `facts_vec` virtual table using `vec0`
1020
- - [ ] Backfill existing embeddings from JSON to native vectors
1021
- - [ ] Update `Embeddings::Similarity` class for native KNN (two-step query pattern)
1022
- - [ ] Test migration on existing databases
1023
- - [ ] Document extension installation in README
1024
- - [ ] Benchmark: Measure KNN query improvement (expect 10-100x)
61
+ ---
1025
62
 
1026
- ### Phase 2: RRF Fusion (from QMD) - IMMEDIATE
63
+ ## Medium Priority
1027
64
 
1028
- - [ ] Implement `Recall::RRFusion` class with k=60 parameter
1029
- - [ ] Update `Recall#query_semantic_dual` to use RRF fusion
1030
- - [ ] Apply weights: original query ×2, expanded queries ×1
1031
- - [ ] Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
1032
- - [ ] Test with synthetic ranked lists (unit tests)
1033
- - [ ] Validate improvements with real queries
65
+ ### 5. Incremental Indexing with File Watching
1034
66
 
1035
- ### Phase 3: UX Improvements (from QMD) - NEAR-TERM
67
+ Source: grepai study
1036
68
 
1037
- - [ ] Schema migration v8: Add `docid` column (8-char hash, indexed, unique)
1038
- - [ ] Backfill existing facts with SHA256-based docids
1039
- - [ ] Update CLI commands to accept/display docids (`ExplainCommand`, `RecallCommand`)
1040
- - [ ] Update MCP tools for docid support (`memory.explain`, `memory.recall`)
1041
- - [ ] Test cross-database docid lookups
69
+ - **Value**: Eliminates manual `claude-memory ingest` calls
70
+ - **Implementation**: Add `Listen` gem, watch `.claude/projects/*/transcripts/*.jsonl`, debounce 500ms, trigger IngestCommand automatically
71
+ - **Evidence**: `watcher/watcher.go:44` `fsnotify` with debouncing (300ms default), gitignore respect
72
+ - **Effort**: 2-3 days
73
+ - **Trade-off**: Background process ~10MB memory overhead
1042
74
 
1043
- ### Phase 4: Performance Optimizations (from QMD) - NEAR-TERM
75
+ ### 6. Background Processing for Hooks
1044
76
 
1045
- - [ ] Implement `Recall::ExpansionDetector` class
1046
- - [ ] Update `Recall#query_semantic_dual` to check before vector search
1047
- - [ ] Add metrics tracking (skip rate, avg latency saved)
1048
- - [ ] Tune thresholds based on usage patterns
77
+ Source: episodic-memory study
1049
78
 
1050
- ### Remaining Tasks
79
+ - **Value**: Non-blocking hooks for better UX
80
+ - **Implementation**: `--async` flag on hook commands, fork and detach
81
+ - **Trade-off**: Background process management complexity, potential race conditions
1051
82
 
1052
- - [ ] Background processing (--async flag for hooks)
1053
- - [ ] LLM response caching (from QMD, when distiller is active)
1054
- - [ ] Structured logging implementation
1055
- - [ ] Embed command for backfilling embeddings
83
+ ### 7. Document Chunking for Long Transcripts
1056
84
 
1057
- ### Future (If Requested)
85
+ Source: QMD study
1058
86
 
1059
- - [ ] Document chunking for long transcripts (from QMD, if users report issues)
1060
- - [ ] Enhanced snippet extraction (from QMD, for better search result previews)
1061
- - [ ] Build web viewer (if users request visualization)
1062
- - [ ] Add HTTP-based health checks (if background worker is added)
1063
- - [ ] Configuration-driven snapshot generation (if users request customization)
87
+ - **Value**: Better embeddings for long content (>3000 chars)
88
+ - **Implementation**: 800 tokens, 15% overlap, semantic boundary detection
89
+ - **Consideration**: Only if users report issues with long transcripts
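The windowing part of this can be sketched as follows. It approximates tokens with whitespace-separated words and leaves out semantic boundary detection entirely, so it illustrates only the size and overlap arithmetic. The `chunk_tokens` helper is hypothetical.

```ruby
# Hypothetical chunker: fixed-size windows with fractional overlap.
# Tokens are approximated by whitespace-separated words.
def chunk_tokens(text, size: 800, overlap_ratio: 0.15)
  tokens = text.split
  step = size - (size * overlap_ratio).floor # 800 - 120 = 680 by default
  chunks = []
  start = 0
  while start < tokens.size
    chunks << tokens[start, size].join(" ")
    break if start + size >= tokens.size # last window reached the end
    start += step
  end
  chunks
end

words = (0...2000).map { |i| "w#{i}" }.join(" ")
chunks = chunk_tokens(words)
```

With the defaults, consecutive chunks share 120 tokens, so a sentence cut at one chunk boundary is likely to appear whole in the neighbouring chunk.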
1064
90
 
1065
91
  ---
1066
92
 
1067
- ## Key Takeaways
1068
-
1069
- ### Successfully Adopted from claude-mem ✓
1070
-
1071
- 1. Progressive disclosure (token-efficient retrieval)
1072
- 2. Privacy controls (tag-based content exclusion)
1073
- 3. Clean architecture (command pattern, slim CLI)
1074
- 4. Semantic shortcuts (decisions, conventions, architecture)
1075
- 5. Exit code strategy (hook error handling)
1076
- 6. ROI metrics tracking (token economics for distillation efficiency)
1077
-
1078
- ### Successfully Adopted from Episodic-Memory ✓
1079
-
1080
- 1. **WAL Mode** - Better concurrency with Write-Ahead Logging
1081
- 2. **Tool Usage Tracking** - Dedicated table tracking which tools discovered facts
1082
- 3. **Incremental Sync** - mtime-based change detection for fast re-ingestion
1083
- 4. **Session Metadata** - Context capture (git branch, cwd, Claude version)
1084
- 5. **Local Vector Embeddings** - FastEmbed (BAAI/bge-small-en-v1.5) semantic search alongside FTS5
1085
- 6. **Multi-Concept AND Search** - Precise queries matching 2-5 concepts simultaneously
1086
- 7. **Enhanced Statistics** - Comprehensive reporting on facts, entities, provenance
1087
- 8. **Context-Aware Queries** - Filter by branch, directory, or tools used
1088
-
1089
- ### Our Unique Advantages
1090
-
1091
- 1. **Dual-database architecture** - Global + project scopes
1092
- 2. **Fact-based knowledge graph** - Structured vs blob observations or conversation exchanges
1093
- 3. **Truth maintenance** - Conflict resolution and supersession
1094
- 4. **Predicate policies** - Single vs multi-value semantics
1095
- 5. **Ruby ecosystem** - Simpler dependencies, easier install
1096
- 6. **Local embeddings** - ONNX model via fastembed-rb, no API key (vs Transformers.js)
93
+ ---
1097
94
 
1098
- ### Remaining Opportunities
95
+ ## Features to Avoid
1099
96
 
1100
- - **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
1101
- - **ROI Metrics** - Track token economics for distillation (from claude-mem)
1102
- - **Structured Logging** - JSON-formatted logs for debugging
1103
- - **Embed Command** - Backfill embeddings for existing facts
1104
- - **Health Monitoring** - Only if we add background worker
1105
- - **Web Viewer UI** - Only if users request visualization
1106
- - **Configuration-Driven Context** - Only if users request snapshot customization
97
+ - **Chroma Vector Database**: We use fastembed-rb with a local ONNX model instead
98
+ - **Claude Agent SDK for Distillation**: Use direct API calls via the `anthropic-rb` gem instead
99
+ - **Worker Service Background Process**: Keep the stdio-based MCP server
100
+ - **Web Viewer UI**: CLI output is sufficient. Add if users request it
101
+ - **Configuration-Driven Context**: Default config is sufficient. Add if users request it
102
+ - **Neural Embeddings (EmbeddingGemma)**: Superseded by FastEmbed (BAAI/bge-small-en-v1.5)
103
+ - **Cross-Encoder Reranking (Qwen3-Reranker-0.6B)**: Over-engineering for fact retrieval
104
+ - **Query Expansion (LLM, Qwen3-1.7B)**: No LLM in recall path, too heavy
105
+ - **Custom Fine-Tuned Query Expansion**: 1.7B model too heavy for fact retrieval
106
+ - **YAML Collection System**: Our dual-database approach is cleaner
107
+ - **Content-Addressable Storage**: Facts deduplicated by signature, not content hash
108
+ - **Virtual Path System**: Dual-database provides clear namespace
1107
109
 
1108
110
  ---
1109
111
 
1110
- ## Comparison Summary
112
+ ## Design Decisions
1111
113
 
1112
- **Episodic-memory** and **claude_memory** serve complementary but different needs:
114
+ ### No Tag Count Limit (2026-01-23)
1113
115
 
1114
- **Episodic-memory** excels at:
1115
- - Semantic conversation search with local embeddings
1116
- - Preserving complete conversation context
1117
- - Multi-concept AND queries
1118
- - Fast incremental sync
1119
- - Tool usage tracking
1120
- - Rich session metadata
116
+ **Decision**: Removed MAX_TAG_COUNT limit from ContentSanitizer.
1121
117
 
1122
- **ClaudeMemory** excels at:
1123
- - Structured fact extraction and storage
1124
- - Truth maintenance and conflict resolution
1125
- - Dual-scope architecture (global vs project)
1126
- - Knowledge graph with provenance
1127
- - Semantic shortcuts for common queries
118
+ **Rationale**:
119
+ - The regex pattern `/<tag>.*?<\/tag>/m` is provably safe from ReDoS attacks
120
+ - Performance is O(n) and excellent even with 1000+ tags (~0.6ms)
121
+ - Real-world usage legitimately produces 100-200+ tags in long sessions
122
+ - No other similar tool enforces tag count limits
1128
123
 
1129
- **Best of both worlds (achieved)**:
1130
- - ✅ Added vector embeddings for semantic search (FastEmbed BAAI/bge-small-en-v1.5, local ONNX)
1131
- - ✅ Kept fact-based knowledge graph for structured queries
1132
- - ✅ Adopted incremental sync and tool tracking from episodic-memory
1133
- - ✅ Maintained truth maintenance and conflict resolution
1134
- - ✅ Added session metadata for richer context
1135
- - ✅ Implemented multi-concept AND search
1136
- - ✅ Enhanced statistics and reporting
124
+ **Do not reintroduce**: Tag count validation is unnecessary and harmful.
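A small sketch of the lazy tag-stripping pattern discussed above, with no cap on the number of tags. The tag name `private` and the `strip_tagged` helper are hypothetical stand-ins, not the actual `ContentSanitizer` code.

```ruby
# Hypothetical tag name; the pattern mirrors /<tag>.*?<\/tag>/m.
TAG = "private"
PATTERN = /<#{TAG}>.*?<\/#{TAG}>/m

# Remove every tagged span, however many there are.
def strip_tagged(text)
  text.gsub(PATTERN, "")
end

# 1000 tagged spans, each spanning a newline.
input = "keep <private>secret\nlines</private> this " * 1000
cleaned = strip_tagged(input)
```

The `m` flag lets `.` match newlines in Ruby, so a tagged span that crosses lines is removed as a unit.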
1137
125
 
1138
126
  ---
1139
127
 
@@ -1141,10 +129,10 @@ Analysis of **QMD (Quick Markdown Search)** reveals several high-value optimizat
1141
129
 
1142
130
  - [episodic-memory GitHub](https://github.com/obra/episodic-memory) - Semantic conversation search
1143
131
  - [claude-mem GitHub](https://github.com/thedotmack/claude-mem) - Memory compression system
1144
- - [ClaudeMemory Updated Plan](updated_plan.md) - Original improvement plan
132
+ - [grepai GitHub](https://github.com/yoanbernabeu/grepai) - Semantic code search
133
+ - [claude-supermemory GitHub](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed memory
134
+ - [QMD GitHub](https://github.com/tobi/qmd) - On-device hybrid search engine
1145
135
 
1146
136
  ---
1147
137
 
1148
- *This document has been updated to reflect completed implementations. Fourteen major improvements have been successfully integrated: 6 from claude-mem and 8 from episodic-memory. ClaudeMemory now combines the best of both systems while maintaining its unique advantages in fact-based knowledge representation and truth maintenance.*
1149
-
1150
- *Last updated: 2026-01-26 - Added ROI metrics tracking for distillation token economics*
138
+ *Last updated: 2026-02-03 - Removed Docid, LLM Cache, Structured Logging (implemented). Renumbered items.*