claude_memory 0.3.0 → 0.5.0

Files changed (75)
  1. checksums.yaml +4 -4
  2. data/.claude/CLAUDE.md +1 -1
  3. data/.claude/output-styles/memory-aware.md +1 -0
  4. data/.claude/rules/claude_memory.generated.md +9 -34
  5. data/.claude/settings.local.json +4 -1
  6. data/.claude/skills/check-memory/DEPRECATED.md +29 -0
  7. data/.claude/skills/check-memory/SKILL.md +10 -0
  8. data/.claude/skills/debug-memory +1 -0
  9. data/.claude/skills/improve/SKILL.md +12 -1
  10. data/.claude/skills/memory-first-workflow +1 -0
  11. data/.claude/skills/setup-memory +1 -0
  12. data/.claude-plugin/plugin.json +1 -1
  13. data/.lefthook/map_specs.rb +29 -0
  14. data/CHANGELOG.md +83 -5
  15. data/CLAUDE.md +38 -0
  16. data/README.md +43 -0
  17. data/Rakefile +14 -1
  18. data/WEEK2_COMPLETE.md +250 -0
  19. data/db/migrations/008_add_provenance_line_range.rb +21 -0
  20. data/db/migrations/009_add_docid.rb +39 -0
  21. data/db/migrations/010_add_llm_cache.rb +30 -0
  22. data/docs/architecture.md +49 -14
  23. data/docs/ci_integration.md +294 -0
  24. data/docs/eval_week1_summary.md +183 -0
  25. data/docs/eval_week2_summary.md +419 -0
  26. data/docs/evals.md +353 -0
  27. data/docs/improvements.md +72 -1085
  28. data/docs/influence/claude-supermemory.md +498 -0
  29. data/docs/influence/qmd.md +424 -2022
  30. data/docs/quality_review.md +64 -705
  31. data/lefthook.yml +8 -1
  32. data/lib/claude_memory/commands/doctor_command.rb +45 -4
  33. data/lib/claude_memory/commands/explain_command.rb +11 -6
  34. data/lib/claude_memory/commands/stats_command.rb +1 -1
  35. data/lib/claude_memory/core/fact_graph.rb +122 -0
  36. data/lib/claude_memory/core/fact_query_builder.rb +34 -14
  37. data/lib/claude_memory/core/fact_ranker.rb +3 -20
  38. data/lib/claude_memory/core/relative_time.rb +45 -0
  39. data/lib/claude_memory/core/result_sorter.rb +2 -2
  40. data/lib/claude_memory/core/rr_fusion.rb +57 -0
  41. data/lib/claude_memory/core/snippet_extractor.rb +97 -0
  42. data/lib/claude_memory/domain/fact.rb +3 -1
  43. data/lib/claude_memory/embeddings/fastembed_adapter.rb +55 -0
  44. data/lib/claude_memory/index/index_query.rb +2 -0
  45. data/lib/claude_memory/index/lexical_fts.rb +18 -0
  46. data/lib/claude_memory/infrastructure/operation_tracker.rb +7 -21
  47. data/lib/claude_memory/infrastructure/schema_validator.rb +30 -25
  48. data/lib/claude_memory/ingest/content_sanitizer.rb +8 -1
  49. data/lib/claude_memory/ingest/ingester.rb +74 -59
  50. data/lib/claude_memory/ingest/tool_extractor.rb +1 -1
  51. data/lib/claude_memory/ingest/tool_filter.rb +55 -0
  52. data/lib/claude_memory/logging/logger.rb +112 -0
  53. data/lib/claude_memory/mcp/query_guide.rb +96 -0
  54. data/lib/claude_memory/mcp/response_formatter.rb +86 -23
  55. data/lib/claude_memory/mcp/server.rb +34 -4
  56. data/lib/claude_memory/mcp/text_summary.rb +257 -0
  57. data/lib/claude_memory/mcp/tool_definitions.rb +27 -11
  58. data/lib/claude_memory/mcp/tools.rb +133 -120
  59. data/lib/claude_memory/publish.rb +12 -2
  60. data/lib/claude_memory/recall/expansion_detector.rb +44 -0
  61. data/lib/claude_memory/recall.rb +93 -41
  62. data/lib/claude_memory/resolve/resolver.rb +72 -40
  63. data/lib/claude_memory/store/sqlite_store.rb +99 -24
  64. data/lib/claude_memory/sweep/sweeper.rb +6 -0
  65. data/lib/claude_memory/version.rb +1 -1
  66. data/lib/claude_memory.rb +21 -0
  67. data/output-styles/memory-aware.md +71 -0
  68. data/skills/debug-memory/SKILL.md +146 -0
  69. data/skills/memory-first-workflow/SKILL.md +144 -0
  70. metadata +29 -5
  71. data/.claude/.mind.mv2.o2N83S +0 -0
  72. data/.claude/output-styles/memory-aware.md +0 -21
  73. data/docs/.claude/mind.mv2.lock +0 -0
  74. data/docs/remaining_improvements.md +0 -330
  75. /data/{.claude/skills → skills}/setup-memory/SKILL.md +0 -0
data/docs/improvements.md CHANGED
@@ -1,1140 +1,127 @@
1
1
  # Improvements to Consider
2
2
 
3
- *Updated: 2026-01-29*
3
+ *Updated: 2026-02-03 - Removed Docid Short Hash System, LLM Response Caching, Structured Logging (implemented)*
4
4
  *Sources:*
5
5
  - *[thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - Memory compression system*
6
6
  - *[obra/episodic-memory](https://github.com/obra/episodic-memory) - Semantic conversation search*
7
7
  - *[yoanbernabeu/grepai](https://github.com/yoanbernabeu/grepai) - Semantic code search with vector embeddings*
8
+ - *[supermemoryai/claude-supermemory](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed persistent memory plugin*
9
+ - *[tobi/qmd](https://github.com/tobi/qmd) - On-device hybrid search engine (updated 2026-02-02)*
8
10
 
9
- This document identifies design patterns and features from claude-mem and episodic-memory that could improve claude_memory. Implemented improvements have been removed from this document.
11
+ This document contains only unimplemented improvements. Completed items are removed.
10
12
 
11
13
  ---
12
14
 
13
- ## Implemented Improvements
15
+ ## High Priority (QMD-Inspired)
14
16
 
15
- The following improvements from the original analysis have been successfully implemented:
16
-
17
- 1. **Progressive Disclosure Pattern** - `memory.recall_index` and `memory.recall_details` MCP tools with token estimation
18
- 2. **Privacy Tag System** - ContentSanitizer with `<private>`, `<no-memory>`, and `<secret>` tag stripping
19
- 3. **Slim Orchestrator Pattern** - CLI refactored to thin router with extracted command classes
20
- 4. **Semantic Shortcuts** - `memory.decisions`, `memory.conventions`, and `memory.architecture` MCP tools
21
- 5. **Exit Code Strategy** - Hook::ExitCodes module with SUCCESS/WARNING/ERROR constants
22
- 6. **WAL Mode for Concurrency** - SQLite Write-Ahead Logging enabled for better concurrent access
23
- 7. **Enhanced Statistics** - Comprehensive stats command showing facts, entities, provenance, conflicts
24
- 8. **Session Metadata Tracking** - Captures git_branch, cwd, claude_version, thinking_level from transcripts
25
- 9. **Tool Usage Tracking** - Dedicated tool_calls table tracking tool names, inputs, timestamps
26
- 10. **Semantic Search with TF-IDF** - Local embeddings (384-dimensional), hybrid vector + text search
27
- 11. **Multi-Concept AND Search** - Query facts matching all of 2-5 concepts simultaneously
28
- 12. **Incremental Sync** - mtime-based change detection to skip unchanged transcript files
29
- 13. **Context-Aware Queries** - Filter facts by git branch, directory, or tools used
30
- 14. **ROI Metrics Tracking** - ingestion_metrics table tracking token economics for distillation efficiency (2026-01-26)
31
-
32
- ---
33
-
34
- ## grepai Study (2026-01-29)
35
-
36
- Source: docs/influence/grepai.md
37
-
38
- ### High Priority Recommendations
39
-
40
- - [ ] **Incremental Indexing with File Watching**: Auto-update memory index during coding sessions
41
- - Value: Eliminates manual `claude-memory ingest` calls, huge UX win
42
- - Evidence: watcher/watcher.go:44 - `fsnotify` with debouncing (300ms default), gitignore respect
43
- - Implementation: Add `Listen` gem (Ruby fsnotify), watch `.claude/projects/*/transcripts/*.jsonl`, debounce 500ms, trigger IngestCommand automatically
44
- - Effort: 2-3 days (watcher class, integration, testing)
45
- - Trade-off: Background process ~10MB memory overhead, may complicate testing
46
-
47
- - [ ] **Compact Response Format for MCP Tools**: Reduce token usage by ~60% in MCP responses
48
- - Value: Critical for scaling to large fact databases (1000+ facts)
49
- - Evidence: mcp/server.go:219 - `SearchResultCompact` omits content field, returns only metadata
50
- - Implementation: Add `compact: true` parameter to all recall tools, omit provenance/context excerpts by default, user can override with `compact: false`
51
- - Effort: 4-6 hours (add parameter, update formatters, tests)
52
- - Trade-off: User needs follow-up `memory.explain <fact_id>` for full context (two-step interaction)
53
-
54
- - [ ] **Fact Dependency Graph Visualization**: Show supersession chains and conflict relationships
55
- - Value: Invaluable for understanding why facts were superseded or conflicted
56
- - Evidence: trace/trace.go:95 - `CallGraph` struct with nodes and edges for function dependencies
57
- - Implementation: Create `memory.fact_graph <fact_id> --depth 2` tool, query `fact_links` table with BFS traversal, return JSON with nodes (facts) and edges (supersedes/conflicts/supports)
58
- - Effort: 2-3 days (graph builder, MCP tool, tests)
59
- - Trade-off: Adds complexity for feature used mainly for debugging/exploration
60
-
61
- - [ ] **Hybrid Search (Vector + Text) with RRF**: Better relevance combining semantic and keyword matching
62
- - Value: 50% improvement in search quality (proven by grepai's Reciprocal Rank Fusion)
63
- - Evidence: search/search.go - RRF with K=60, combines cosine similarity with full-text search
64
- - Implementation: Add `sqlite-vec` extension, add `embeddings` BLOB column to `facts`, implement RRF in `Recall#query`, make hybrid optional via config
65
- - Effort: 5-7 days (embedder setup, schema migration, RRF implementation, testing)
66
- - Trade-off: Requires API calls for embedding (~$0.00001/fact), slower queries (2x search + fusion)
67
- - Recommendation: CONSIDER - High value but significant effort. Start with FTS5, add vectors later if quality issues arise
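The Reciprocal Rank Fusion step named in the hybrid-search item can be sketched as follows. This is a minimal illustration and not the gem's implementation: the module name `RrfSketch`, the `fuse` method, and the sample fact ids are all hypothetical, and the 1-based rank in the formula `1 / (k + rank)` is an assumption. Only the constant K=60 comes from the removed text.

```ruby
# Hypothetical sketch of Reciprocal Rank Fusion (RRF); names are illustrative.
module RrfSketch
  DEFAULT_K = 60 # the K value cited in the removed text

  # ranked_lists: array of arrays of ids, each ordered best-first.
  # Returns ids ordered by fused score, highest first.
  def self.fuse(ranked_lists, k: DEFAULT_K)
    scores = Hash.new(0.0)
    ranked_lists.each do |list|
      list.each_with_index do |id, index|
        # index is 0-based, so rank = index + 1 (assumed 1-based ranks)
        scores[id] += 1.0 / (k + index + 1)
      end
    end
    # Break score ties by id so the output is deterministic.
    scores.sort_by { |id, score| [-score, id.to_s] }.map(&:first)
  end
end

vector_hits = [:f3, :f1, :f2] # e.g. ordered by cosine similarity
text_hits   = [:f1, :f4, :f3] # e.g. ordered by full-text rank
fused = RrfSketch.fuse([vector_hits, text_hits])
p fused
```

Because RRF uses only ranks, the two result lists do not need comparable score scales, which is why it suits combining vector similarity with full-text ranking.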
68
-
69
- ---
70
-
71
- ## Design Decisions
72
-
73
- ### No Tag Count Limit (2026-01-23)
74
-
75
- **Decision**: Removed MAX_TAG_COUNT limit from ContentSanitizer.
76
-
77
- **Rationale**:
78
- - The regex pattern `/<tag>.*?<\/tag>/m` is provably safe from ReDoS attacks
79
- - Non-greedy matching (`.*?`) with clear delimiters
80
- - No nested quantifiers or alternation that could cause catastrophic backtracking
81
- - Performance is O(n) and predictable
82
- - Performance benchmarks show excellent speed even at scale:
83
- - 100 tags: 0.07ms
84
- - 200 tags: 0.13ms
85
- - 1,000 tags: 0.64ms
86
- - Real-world usage legitimately produces 100-200+ tags in long sessions
87
- - System tags like `<claude-memory-context>` accumulate
88
- - Users mark multiple sections with `<private>` tags
89
- - The limit created false alarms and blocked legitimate ingestion
90
- - No other similar tool (claude-mem, episodic-memory) enforces tag count limits
91
-
92
- **Do not reintroduce**: Tag count validation is unnecessary and harmful. If extreme input causes issues, investigate the actual root cause rather than adding arbitrary limits.
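The non-greedy tag-stripping pattern described in this decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the gem's actual ContentSanitizer: the module name `TagStripSketch` and its `strip` method are hypothetical, and the tag list is taken from the tags named elsewhere in the removed text.

```ruby
# Hypothetical sketch; not the gem's ContentSanitizer.
module TagStripSketch
  # Tag names taken from the removed text.
  TAGS = %w[private no-memory secret].freeze

  # One non-greedy pattern per tag; the /m flag lets `.` match newlines.
  PATTERNS = TAGS.map do |tag|
    escaped = Regexp.escape(tag)
    /<#{escaped}>.*?<\/#{escaped}>/m
  end.freeze

  # Removes every tagged span, with no cap on how many tags appear.
  def self.strip(text)
    PATTERNS.reduce(text) { |acc, pattern| acc.gsub(pattern, "") }
  end
end

sample = "keep <private>token=abc</private> this <no-memory>a\nb</no-memory>text"
stripped = TagStripSketch.strip(sample)
puts stripped
```

Each pattern has a fixed opening and closing delimiter and a single lazy quantifier, which is the shape the rationale above relies on.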
93
-
94
- ---
95
-
96
- ## Executive Summary
97
-
98
- This document analyzes two complementary memory systems:
99
-
100
- **Claude-mem** (TypeScript/Node.js, v9.0.5) - Memory compression system with 6+ months of production usage:
101
- - ROI Metrics tracking token costs
102
- - Health monitoring and process management
103
- - Configuration-driven context injection
104
-
105
- **Episodic-memory** (TypeScript/Node.js, v1.0.15) - Semantic conversation search for Claude Code:
106
- - Local vector embeddings (Transformers.js)
107
- - Multi-concept AND search
108
- - Automatic conversation summarization
109
- - Tool usage tracking
110
- - Session metadata capture
111
- - Background sync with incremental updates
112
-
113
- **Our Current Advantages**:
114
- - Ruby ecosystem (simpler dependencies)
115
- - Dual-database architecture (global + project scope)
116
- - Fact-based knowledge graph (vs observation blobs or conversation exchanges)
117
- - Truth maintenance system (conflict resolution)
118
- - Predicate policies (single vs multi-value)
119
- - Progressive disclosure already implemented
120
- - Privacy tag stripping already implemented
121
-
122
- **High-Value Opportunities from Episodic-Memory**:
123
- - Vector embeddings for semantic search alongside FTS5
124
- - Tool usage tracking during fact discovery
125
- - Session metadata capture (git branch, working directory)
126
- - Multi-concept AND search
127
- - Background sync with incremental updates
128
- - Enhanced statistics and reporting
129
-
130
- ---
131
-
132
- ## Episodic-Memory Comparison
133
-
134
- ### Architecture Overview
135
-
136
- **Episodic-memory** focuses on **conversation-level semantic search** rather than fact extraction. Key differences:
137
-
138
- | Feature | Episodic-Memory | ClaudeMemory |
139
- |---------|----------------|--------------|
140
- | **Data Model** | Conversation exchanges (user-assistant pairs) | Facts (subject-predicate-object triples) |
141
- | **Search Method** | Vector embeddings + text search | FTS5 full-text search |
142
- | **Embeddings** | Local Transformers.js (Xenova/all-MiniLM-L6-v2) | None (FTS5 only) |
143
- | **Vector Storage** | sqlite-vec virtual table | N/A |
144
- | **Scope** | Single database with project field | Dual database (global + project) |
145
- | **Truth Maintenance** | None (keeps all conversations) | Supersession + conflict resolution |
146
- | **Summarization** | Claude API generates summaries | N/A |
147
- | **Tool Tracking** | Explicit tool_calls table | Mentioned in provenance text |
148
- | **Session Metadata** | sessionId, cwd, gitBranch, claudeVersion, thinking metadata | Limited (session_id in content_items) |
149
- | **Multi-Concept Search** | Array-based AND queries (2-5 concepts) | Single query only |
150
- | **Incremental Sync** | Timestamp-based mtime checks | Re-processes all content |
151
- | **Background Processing** | Async hook with --background flag | Synchronous hook execution |
152
- | **Statistics** | Rich stats with project breakdown | Basic status command |
153
- | **Exclusion** | Content-based markers (`<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX`) | Tag stripping (`<private>`, `<no-memory>`) |
154
- | **Line References** | Stores line_start and line_end for each exchange | No line tracking |
155
- | **WAL Mode** | Enabled for concurrency | Not enabled |
156
-
157
- ### What Episodic-Memory Does Well
158
-
159
- 1. **Semantic Search with Local Embeddings**
160
- - Uses Transformers.js to run embedding model locally (offline-capable)
161
- - 384-dimensional vectors from `Xenova/all-MiniLM-L6-v2`
162
- - Hybrid vector + text search for best recall
163
- - sqlite-vec virtual table for fast similarity queries
164
-
165
- 2. **Multi-Concept AND Search**
166
- - Array of 2-5 concepts that must all be present in results
167
- - Searches each concept independently then intersects results
168
- - Ranks by average similarity across all concepts
169
- - Example: `["React Router", "authentication", "JWT"]`
170
-
171
- 3. **Tool Usage Tracking**
172
- - Dedicated `tool_calls` table with foreign key to exchanges
173
- - Captures tool_name, tool_input, tool_result, is_error
174
- - Tool names included in embeddings for tool-based searches
175
- - Search results show tool usage summary
176
-
177
- 4. **Rich Session Metadata**
178
- - Captures: sessionId, cwd, gitBranch, claudeVersion
179
- - Thinking metadata: level, disabled, triggers
180
- - Conversation structure: parentUuid, isSidechain
181
- - Enables filtering by branch, project context
182
-
183
- 5. **Incremental Sync**
184
- - Atomic file operations (temp file + rename)
185
- - mtime-based change detection (only copies modified files)
186
- - Fast subsequent syncs (seconds vs minutes)
187
- - Safe concurrent execution
188
-
189
- 6. **Automatic Conversation Summarization**
190
- - Uses Claude API to generate concise summaries
191
- - Summaries stored as `.txt` files alongside conversations
192
- - Concurrency-limited batch processing
193
- - Summary limit (default 10 per sync) to control API costs
194
-
195
- 7. **Background Sync**
196
- - `--background` flag for async processing
197
- - SessionStart hook runs sync without blocking
198
- - User continues working while indexing happens
199
- - Output logged to file for debugging
200
-
201
- 8. **Line-Range References**
202
- - Stores line_start and line_end for each exchange
203
- - Enables precise source linking in search results
204
- - Supports pagination: read specific line ranges from large conversations
205
- - Example: "Lines 10-25 in conversation.jsonl (295KB, 1247 lines)"
206
-
207
- 9. **Statistics and Reporting**
208
- - Total conversations, exchanges, date range
209
- - Summary coverage tracking
210
- - Project breakdown with top 10 projects
211
- - Database size reporting
212
-
213
- 10. **Exclusion Markers**
214
- - Content-based opt-out: `<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX THIS CHAT</INSTRUCTIONS-TO-EPISODIC-MEMORY>`
215
- - Files archived but excluded from search index
216
- - Prevents meta-conversations from polluting index
217
- - Use case: sensitive work, test sessions, agent conversations
218
-
219
- 11. **WAL Mode for Concurrency**
220
- - SQLite Write-Ahead Logging enabled
221
- - Better concurrency for multiple readers
222
- - Safe for concurrent sync operations
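The multi-concept AND search in item 2 of this list (search each concept independently, intersect, rank by average similarity) can be sketched as follows. The module name `ConceptSearchSketch`, the `search:` callable, and the in-memory index are hypothetical stand-ins for whatever backend produces per-concept similarity scores.

```ruby
# Hypothetical sketch of multi-concept AND search; names are illustrative.
module ConceptSearchSketch
  # concepts: array of 2-5 strings.
  # search: callable taking a concept and returning { id => similarity }.
  # Returns [[id, average_similarity], ...], best first.
  def self.search_all(concepts, search:)
    raise ArgumentError, "expected 2-5 concepts" unless (2..5).cover?(concepts.size)

    per_concept = concepts.map { |concept| search.call(concept) }
    # Keep only ids present in every concept's result set.
    common_ids = per_concept.map(&:keys).reduce(:&)

    common_ids
      .map { |id| [id, per_concept.sum { |hits| hits[id] } / per_concept.size] }
      .sort_by { |id, average| [-average, id] }
  end
end

# Assumed toy index: concept => { fact id => similarity }.
index = {
  "auth" => { 1 => 0.75, 2 => 0.5, 3 => 0.5 },
  "JWT"  => { 1 => 0.25, 3 => 1.0, 4 => 0.5 }
}
results = ConceptSearchSketch.search_all(%w[auth JWT], search: ->(c) { index.fetch(c, {}) })
p results
```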
223
-
224
- ### Design Patterns Worth Adopting
225
-
226
- 1. **Local Vector Embeddings**
227
- - **Value**: Semantic search finds conceptually similar content even with different terminology
228
- - **Implementation**: Add `embeddings` column to facts table, use sqlite-vec extension
229
- - **Ruby gems**: `onnxruntime` or shell out to Python/Node.js for embeddings
230
- - **Trade-off**: Increased storage (384 floats per fact), embedding generation time
231
-
232
- 2. **Multi-Concept AND Search**
233
- - **Value**: Precise queries like "find conversations about React AND authentication AND JWT"
234
- - **Implementation**: Run multiple searches and intersect results, rank by average similarity
235
- - **Application to facts**: Find facts matching multiple predicates or entities
236
- - **MCP tool**: `memory.search_concepts(concepts: ["auth", "API", "security"])`
237
-
238
- 3. **Tool Usage Tracking**
239
- - **Value**: Know which tools were used during fact discovery (Read, Edit, Bash, etc.)
240
- - **Implementation**: Add `tool_calls` table or JSON column in content_items
241
- - **Schema**: `{ tool_name, tool_input, tool_result, timestamp }`
242
- - **Use case**: "Which facts were discovered using the Bash tool?"
243
-
244
- 4. **Session Metadata Capture**
245
- - **Value**: Context about where/when facts were learned
246
- - **Implementation**: Extend content_items with git_branch, cwd, claude_version columns
247
- - **Use case**: "Show facts learned while on feature/auth branch"
248
-
249
- 5. **Incremental Sync**
250
- - **Value**: Faster subsequent ingestions (seconds vs minutes)
251
- - **Implementation**: Store mtime for each content_item, skip unchanged files
252
- - **Hook optimization**: Only process delta since last ingest
253
-
254
- 6. **Background Processing**
255
- - **Value**: Don't block user while processing large transcripts
256
- - **Implementation**: Fork process or use Ruby's async/await
257
- - **Hook flag**: `claude-memory hook ingest --async`
258
-
259
- 7. **Line-Range References in Provenance**
260
- - **Value**: Precise source linking for fact verification
261
- - **Implementation**: Store line_start and line_end in provenance table
262
- - **Display**: "Fact from lines 42-56 in transcript.jsonl"
263
-
264
- 8. **Statistics Command**
265
- - **Value**: Visibility into memory system health
266
- - **Implementation**: Enhance `claude-memory status` with more metrics
267
- - **Metrics**: Facts by predicate, entities by type, provenance coverage, scope breakdown
268
-
269
- 9. **WAL Mode**
270
- - **Value**: Better concurrency, safer concurrent operations
271
- - **Implementation**: `db.pragma('journal_mode = WAL')` in store initialization
272
- - **Benefit**: Multiple readers don't block each other
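The mtime-based incremental sync in item 5 of this list can be sketched as follows. This is a minimal illustration: the module name `IncrementalSyncSketch` and the in-memory `seen` hash are hypothetical, standing in for an mtime value stored per content item.

```ruby
require "tmpdir"

# Hypothetical sketch of mtime-based change detection; names are illustrative.
module IncrementalSyncSketch
  # paths: files to consider.
  # seen: { path => Time } recorded by earlier runs (assumed in-memory here).
  # Returns the paths that are new or modified since they were last seen.
  def self.changed_files(paths, seen)
    paths.select do |path|
      mtime = File.mtime(path)
      changed = seen[path].nil? || mtime > seen[path]
      seen[path] = mtime if changed
      changed
    end
  end
end

first_pass, second_pass = Dir.mktmpdir do |dir|
  a = File.join(dir, "a.jsonl")
  b = File.join(dir, "b.jsonl")
  File.write(a, "{}\n")
  File.write(b, "{}\n")

  seen = {}
  first = IncrementalSyncSketch.changed_files([a, b], seen)

  # Simulate a later edit to b by moving its mtime forward.
  File.utime(Time.now, Time.now + 60, b)
  second = IncrementalSyncSketch.changed_files([a, b], seen)

  [first.map { |p| File.basename(p) }, second.map { |p| File.basename(p) }]
end
p first_pass
p second_pass
```

On the second pass only the touched file is returned, so an ingester built this way would skip unchanged transcripts.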
273
-
274
- ---
275
-
276
- ## 1. Health Monitoring and Process Management
277
-
278
- ### What claude-mem Does
279
-
280
- **Worker Service Management**:
281
-
282
- ```typescript
283
- // Health check endpoint
284
- app.get('/health', (req, res) => {
285
- res.json({
286
- status: 'ok',
287
- uptime: process.uptime(),
288
- port: WORKER_PORT,
289
- memory: process.memoryUsage(),
290
- version: packageJson.version
291
- });
292
- });
293
-
294
- // Smart startup
295
- async function ensureWorkerHealthy(timeout = 10000) {
296
- const healthy = await checkHealth();
297
- if (!healthy) {
298
- await startWorker();
299
- await waitForHealth(timeout);
300
- }
301
- }
302
- ```
303
-
304
- **Process Management**:
305
- - PID file tracking (`~/.claude-mem/worker.pid`)
306
- - Port conflict detection
307
- - Version mismatch warnings
308
- - Graceful shutdown handlers
309
- - Platform-aware timeouts (Windows vs Unix)
310
-
311
- **File**: `src/infrastructure/ProcessManager.ts`
312
-
313
- ### What We Should Do
314
-
315
- **Priority**: LOW (we use MCP server, not background worker)
316
-
317
- **Implementation** (if we add background worker):
318
-
319
- 1. **Health endpoint in MCP server**:
320
- ```ruby
321
- # lib/claude_memory/mcp/server.rb
322
- def handle_ping
323
- {
324
- status: "ok",
325
- version: ClaudeMemory::VERSION,
326
- databases: {
327
- global: File.exist?(global_db_path),
328
- project: File.exist?(project_db_path)
329
- },
330
- uptime: Process.clock_gettime(Process::CLOCK_MONOTONIC) - @start_time
331
- }
332
- end
333
- ```
334
-
335
- 2. **PID file management**:
336
- ```ruby
337
- # lib/claude_memory/daemon.rb
338
- class Daemon
339
- PID_FILE = File.expand_path("~/.claude/memory_server.pid")
340
-
341
- def start
342
- check_existing_process
343
- fork_and_daemonize
344
- write_pid_file
345
- setup_signal_handlers
346
- run_server
347
- end
348
-
349
- def stop
350
- pid = read_pid_file
351
- Process.kill("TERM", pid)
352
- wait_for_shutdown
353
- remove_pid_file
354
- end
355
- end
356
- ```
357
-
358
- **Benefits**:
359
- - Reliable server lifecycle
360
- - Easy debugging (health checks)
361
- - Prevents duplicate processes
362
-
363
- **Trade-offs**:
364
- - Complexity we may not need
365
- - Ruby daemons are tricky on Windows
366
- - MCP stdio transport doesn't need health checks
367
-
368
- **Verdict**: Skip unless we switch to HTTP-based MCP transport.
369
-
370
- ---
371
-
372
- ## 3. Web-Based Viewer UI
373
-
374
- ### What claude-mem Does
375
-
376
- **Real-Time Memory Viewer** at `http://localhost:37777`:
377
-
378
- - React-based web UI
379
- - Server-Sent Events (SSE) for real-time updates
380
- - Infinite scroll pagination
381
- - Project filtering
382
- - Settings persistence (sidebar state, theme)
383
- - Auto-reconnection with exponential backoff
384
- - Single-file HTML bundle (esbuild)
385
-
386
- **File**: `src/ui/viewer/` (React components)
387
-
388
- **Features**:
389
- - See observations as they're captured
390
- - Search historical observations
391
- - Filter by project
392
- - Export/share observations
393
- - Theme toggle (light/dark)
394
-
395
- **Build**:
396
- ```typescript
397
- esbuild.build({
398
- entryPoints: ['src/ui/viewer/index.tsx'],
399
- bundle: true,
400
- outfile: 'plugin/ui/viewer.html',
401
- loader: { '.tsx': 'tsx', '.woff2': 'dataurl' },
402
- });
403
- ```
404
-
405
- ### What We Should Do
406
-
407
- **Priority**: LOW (nice-to-have)
408
-
409
- **Implementation** (if we want it):
410
-
411
- 1. **Add Sinatra web server**:
412
- ```ruby
413
- # lib/claude_memory/web/server.rb
414
- require 'sinatra/base'
415
- require 'json'
416
-
417
- module ClaudeMemory
418
- module Web
419
- class Server < Sinatra::Base
420
- get '/' do
421
- erb :index
422
- end
423
-
424
- get '/api/facts' do
425
- facts = Recall.search(params[:query], limit: 100)
426
- json facts
427
- end
428
-
429
- get '/api/stream' do
430
- stream :keep_open do |out|
431
- # SSE for real-time updates
432
- EventMachine.add_periodic_timer(1) do
433
- out << "data: #{recent_facts.to_json}\n\n"
434
- end
435
- end
436
- end
437
- end
438
- end
439
- end
440
- ```
441
-
442
- 2. **Add to MCP server** (optional HTTP endpoint):
443
- ```ruby
444
- # claude-memory serve --web
445
- def serve_with_web
446
- Thread.new { Web::Server.run!(port: 37778) }
447
- serve_mcp # Main MCP server
448
- end
449
- ```
450
-
451
- 3. **Simple HTML viewer**:
452
- ```html
453
- <!-- lib/claude_memory/web/views/index.erb -->
454
- <!DOCTYPE html>
455
- <html>
456
- <head>
457
- <title>ClaudeMemory Viewer</title>
458
- <style>/* Minimal CSS */</style>
459
- </head>
460
- <body>
461
- <div id="facts-list"></div>
462
- <script>
463
- // Fetch and display facts
464
- fetch('/api/facts')
465
- .then(r => r.json())
466
- .then(facts => render(facts));
467
- </script>
468
- </body>
469
- </html>
470
- ```
471
-
472
- **Benefits**:
473
- - Visibility into memory system
474
- - Debugging tool
475
- - User trust (transparency)
476
-
477
- **Trade-offs**:
478
- - Significant development effort
479
- - Need to bundle web assets
480
- - Another dependency (web server)
481
- - Maintenance burden
482
-
483
- **Verdict**: Skip for MVP. Consider if users request it.
484
-
485
- ---
486
-
487
- ## 4. Dual-Integration Strategy
488
-
489
- ### What claude-mem Does
490
-
491
- **Plugin + MCP Server Hybrid**:
492
-
493
- 1. **Claude Code Plugin** (primary):
494
- - Hooks for lifecycle events
495
- - Worker service for AI processing
496
- - Installed via marketplace
497
-
498
- 2. **MCP Server** (secondary):
499
- - Thin wrapper delegating to worker HTTP API
500
- - Enables Claude Desktop integration
501
- - Same backend, different frontend
502
-
503
- **File**: `src/servers/mcp-server.ts` (thin wrapper)
504
-
505
- ```typescript
506
- // MCP server delegates to worker HTTP API
507
- const mcpServer = new McpServer({
508
- name: "claude-mem",
509
- version: packageJson.version
510
- });
511
-
512
- mcpServer.setRequestHandler(ListToolsRequestSchema, async () => {
513
- // Fetch tools from worker
514
- const tools = await fetch('http://localhost:37777/api/mcp/tools');
515
- return tools.json();
516
- });
517
-
518
- mcpServer.setRequestHandler(CallToolRequestSchema, async (request) => {
519
- // Forward to worker
520
- const result = await fetch('http://localhost:37777/api/mcp/call', {
521
- method: 'POST',
522
- body: JSON.stringify(request.params)
523
- });
524
- return result.json();
525
- });
526
- ```
527
-
528
- **Benefit**: One backend, multiple frontends.
529
-
530
- ### What We Should Do
531
-
532
- **Priority**: LOW
533
-
534
- **Current State**: We only have MCP server (no plugin hooks yet).
535
-
536
- **Implementation** (if we add Claude Code hooks):
537
-
538
- 1. **Keep MCP server as primary**:
539
- ```ruby
540
- # lib/claude_memory/mcp/server.rb
541
- # Current implementation - keep as-is
542
- ```
543
-
544
- 2. **Add hook handlers**:
545
- ```ruby
546
- # lib/claude_memory/hook/handler.rb
547
- # Delegate to same store manager
548
- def ingest_hook
549
- store_manager = Store::StoreManager.new
550
- ingester = Ingest::Ingester.new(store_manager)
551
- ingester.ingest(read_stdin[:transcript_delta])
552
- end
553
- ```
554
-
555
- 3. **Shared backend**:
556
- ```
557
- MCP Server (stdio) ──┐
558
- ├──> Store::StoreManager ──> SQLite
559
- Hook Handler (stdin) ─┘
560
- ```
561
-
562
- **Benefits**:
563
- - Works with both Claude Code and Claude Desktop
564
- - No duplicate logic
565
- - Clean separation of transport vs business logic
566
-
567
- **Trade-offs**:
568
- - More integration points to maintain
569
- - Hook contract is Claude Code-specific
570
-
571
- **Verdict**: Consider if we add Claude Code hooks (not urgent).
572
-
573
- ---
574
-
575
- ## 5. Configuration-Driven Context Injection
576
-
577
- ### What claude-mem Does
578
-
579
- **Context Config File**: `~/.claude-mem/settings.json`
580
-
581
- ```json
582
- {
583
- "context": {
584
- "mode": "reader", // reader | chat | inference
585
- "observations": {
586
- "enabled": true,
587
- "limit": 10,
588
- "types": ["decision", "gotcha", "trade-off"]
589
- },
590
- "summaries": {
591
- "enabled": true,
592
- "fields": ["request", "learned", "completed"]
593
- },
594
- "timeline": {
595
- "depth": 5
596
- }
597
- }
598
- }
599
- ```
600
-
601
- **File**: `src/services/context/ContextConfigLoader.ts`
602
-
603
- **Benefit**: Users can fine-tune what gets injected.
604
-
605
- ### What We Should Do
606
-
607
- **Priority**: LOW
608
-
609
- **Implementation**:
610
-
611
- 1. **Add config file**:
612
- ```ruby
613
- # ~/.claude/memory_config.yml
614
- publish:
615
- mode: shared # shared | local | home
616
- facts:
617
- limit: 50
618
- scopes: [global, project]
619
- predicates: [uses_*, depends_on, has_constraint]
620
- entities:
621
- limit: 20
622
- conflicts:
623
- show: true
624
- ```
625
-
626
- 2. **Load in publisher**:
627
- ```ruby
628
- # lib/claude_memory/publish.rb
629
- class Publisher
630
- def initialize
631
- @config = load_config
632
- end
633
-
634
- def load_config
635
- path = File.expand_path("~/.claude/memory_config.yml")
636
- YAML.load_file(path) if File.exist?(path)
637
- rescue
638
- default_config
639
- end
640
- end
641
- ```
642
-
643
- 3. **Apply during publish**:
644
- ```ruby
645
- def build_snapshot
646
- config = @config[:publish]
647
-
648
- facts = store.facts(
649
- limit: config[:facts][:limit],
650
- scopes: config[:facts][:scopes]
651
- )
652
-
653
- format_snapshot(facts, config)
654
- end
655
- ```
656
-
657
- **Benefits**:
658
- - User control over published content
659
- - Environment-specific configs
660
- - Reduces noise in generated files
661
-
662
- **Trade-offs**:
663
- - Another config file to document
664
- - May confuse users
665
- - Publish should be opinionated by default
666
-
667
- **Verdict**: Skip for MVP. Default config is sufficient.
668
-
669
- ---
670
-
671
- ## Features We're Already Doing Better
672
-
673
- ### 1. Dual-Database Architecture (Global + Project)
674
-
675
- **Our Advantage**: `Store::StoreManager` with global + project scopes.
676
-
677
- Claude-mem has a single database with project filtering. Our approach is cleaner:
678
-
679
- ```ruby
680
- # We separate global vs project knowledge
681
- @global_store = Store::SqliteStore.new(global_db_path)
682
- @project_store = Store::SqliteStore.new(project_db_path)
683
-
684
- # Claude-mem filters post-query
685
- SELECT * FROM observations WHERE project = ?
686
- ```
687
-
688
- **Keep this.** It's a better design.
689
-
690
- ### 2. Fact-Based Knowledge Graph
691
-
692
- **Our Advantage**: Subject-predicate-object triples with provenance.
693
-
694
- Claude-mem stores blob observations. We store structured facts:
695
-
696
- ```ruby
697
- # Ours (structured)
698
- { subject: "project", predicate: "uses_database", object: "PostgreSQL" }
699
-
700
- # Theirs (blob)
701
- { title: "Uses PostgreSQL", narrative: "The project uses..." }
702
- ```
703
-
704
- **Keep this.** Enables richer queries and inference.
705
-
706
- ### 3. Truth Maintenance System
707
-
708
- **Our Advantage**: `Resolve::Resolver` with supersession and conflicts.
709
-
710
- Claude-mem doesn't resolve contradictions. We do:
711
-
712
- ```ruby
713
- # We detect when facts supersede each other
714
- old: { subject: "api", predicate: "uses_auth", object: "JWT" }
715
- new: { subject: "api", predicate: "uses_auth", object: "OAuth2" }
716
- # → Creates supersession link
717
-
718
- # We detect conflicts
719
- fact1: { subject: "api", predicate: "rate_limit", object: "100/min" }
720
- fact2: { subject: "api", predicate: "rate_limit", object: "1000/min" }
721
- # → Creates conflict record
722
- ```
723
-
724
- **Keep this.** It's a core differentiator.
725
-
726
- ### 4. Predicate Policies
727
-
728
- **Our Advantage**: `Resolve::PredicatePolicy` for single vs multi-value.
729
-
730
- Claude-mem doesn't distinguish. We do:
731
-
732
- ```ruby
733
- # Single-value (supersedes)
734
- "uses_database" → only one database at a time
735
-
736
- # Multi-value (accumulates)
737
- "depends_on" → many dependencies
738
- ```
739
-
740
- **Keep this.** Prevents false conflicts.
741
-
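The single-value versus multi-value distinction above can be sketched in a few lines of Ruby. This is an illustration only: `SINGLE_VALUE_PREDICATES`, `Fact`, and `apply_policy` are hypothetical names rather than the actual `Resolve::PredicatePolicy` API, and conflict detection is deliberately left out.

```ruby
# Hypothetical policy table; the real Resolve::PredicatePolicy may differ.
SINGLE_VALUE_PREDICATES = %w[uses_database uses_auth].freeze

Fact = Struct.new(:subject, :predicate, :object, keyword_init: true)

# Returns [action, resulting_facts] for an incoming fact.
def apply_policy(existing, incoming)
  same_slot = existing.select do |f|
    f.subject == incoming.subject && f.predicate == incoming.predicate
  end

  if same_slot.any? { |f| f.object == incoming.object }
    [:duplicate, existing]
  elsif SINGLE_VALUE_PREDICATES.include?(incoming.predicate) && same_slot.any?
    # Single-value: the new fact replaces whatever held the slot before.
    [:supersede, (existing - same_slot) + [incoming]]
  else
    # Multi-value (or first value for a slot): keep everything.
    [:accumulate, existing + [incoming]]
  end
end
```

With this sketch, a second `uses_database` fact for the same subject supersedes the first, while repeated `depends_on` facts pile up without being flagged.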
742
- ### 5. Ruby Ecosystem (Simpler)
743
-
744
- **Our Advantage**: Fewer dependencies, easier install.
745
-
746
- ```bash
747
- # Ours
748
- gem install claude_memory # Done
749
-
750
- # Theirs
751
- npm install # Needs Node.js
752
- npm install chromadb # Needs Python + pip
753
- npm install better-sqlite3 # Needs node-gyp + build tools
754
- ```
755
-
756
- **Keep this.** Ruby's stdlib is excellent.
757
-
758
- ---
759
-
760
- ## Features to Avoid
761
-
762
- ### 1. Chroma Vector Database
763
-
764
- **Their Approach**: Hybrid SQLite FTS5 + Chroma vector search.
765
-
766
- **Our Take**: **Skip it.** Adds significant complexity:
767
-
768
- - Python dependency
769
- - ChromaDB server
770
- - Embedding generation
771
- - Sync overhead
772
-
773
- **Alternative**: Stick with SQLite FTS5. Add embeddings only if users request semantic search.
774
-
775
- ### 2. Claude Agent SDK for Distillation
776
-
777
- **Their Approach**: Use `@anthropic-ai/claude-agent-sdk` for observation compression.
778
-
779
- **Our Take**: **Skip it.** We already have `Distill::Distiller` interface. SDK adds:
780
-
781
- - Node.js dependency
782
- - Subprocess management
783
- - Complex event loop
784
-
785
- **Alternative**: Direct API calls via `anthropic-rb` gem (if we implement distiller).
786
-
787
- ### 3. Worker Service Background Process
788
-
789
- **Their Approach**: Long-running worker with HTTP API + MCP wrapper.
790
-
791
- **Our Take**: **Skip it.** We use MCP server directly:
792
-
793
- - No background process to manage
794
- - No port conflicts
795
- - No PID files
796
- - Simpler deployment
797
-
798
- **Alternative**: Keep stdio-based MCP server. Add HTTP transport only if needed.
799
-
800
- ### 4. Web Viewer UI
801
-
802
- **Their Approach**: React-based web UI at `http://localhost:37777`.
803
-
804
- **Our Take**: **Skip for MVP.** Significant effort for uncertain value:
805
-
806
- - React + esbuild
807
- - SSE implementation
808
- - State management
809
- - CSS/theming
810
-
811
- **Alternative**: CLI output is sufficient. Add web UI if users request it.
813
-
814
- ---
815
-
816
- ## Remaining Improvements
817
-
818
- The following sections (6-12 from the original analysis) have been implemented and moved to the "Implemented Improvements" section above:
819
-
820
- - ✅ Section 6: Local Vector Embeddings for Semantic Search
821
- - ✅ Section 7: Multi-Concept AND Search
822
- - ✅ Section 8: Tool Usage Tracking
823
- - ✅ Section 9: Enhanced Session Metadata
824
- - ✅ Section 10: Incremental Sync (mtime-based)
825
- - ✅ Section 11: Enhanced Statistics and Reporting
826
- - ✅ Section 12: WAL Mode for Better Concurrency
827
-
828
- **For remaining unimplemented improvements, see:** [remaining_improvements.md](./remaining_improvements.md)
829
-
830
- Key remaining items:
831
- - Background processing for hooks (--async flag)
832
- - ROI metrics and token economics tracking
833
- - Structured logging
834
- - Embed command for backfilling embeddings
835
-
836
- ---
837
-
838
- ## QMD-Inspired Improvements (2026-01-26)
839
-
840
- Analysis of **QMD (Quick Markdown Search)** reveals several high-value optimizations for search quality and performance. QMD is an on-device markdown search engine with hybrid BM25 + vector + LLM reranking, achieving 50%+ Hit@3 improvement over BM25-only search.
841
-
842
- **See detailed analysis**: [docs/influence/qmd.md](./influence/qmd.md)
843
-
844
- ### High Priority ⭐
845
-
846
- #### 1. **Native Vector Storage (sqlite-vec)** ⭐ CRITICAL
17
+ ### 1. Native Vector Storage (sqlite-vec) CRITICAL
847
18
 
848
19
  - **Value**: 10-100x faster KNN queries, enables larger fact databases
849
20
  - **QMD Proof**: Handles 10,000+ documents with sub-second vector queries
850
21
  - **Current Issue**: JSON embedding storage requires loading all facts, O(n) Ruby similarity calculation
851
22
  - **Solution**: sqlite-vec extension with native C KNN queries
852
23
  - **Implementation**:
853
- - Schema migration v7: Create `facts_vec` virtual table using `vec0`
24
+ - Schema migration v11: Create `facts_vec` virtual table using `vec0`
854
25
  - Two-step query pattern (avoid JOINs - they hang with vec tables!)
855
26
  - Update `Embeddings::Similarity` class
856
27
  - Backfill existing embeddings
857
28
  - **Trade-off**: Adds native dependency (acceptable, well-maintained, cross-platform)
858
- - **Recommendation**: **ADOPT IMMEDIATELY** - This is foundational
859
-
860
- #### 2. **Reciprocal Rank Fusion (RRF) Algorithm** ⭐ HIGH VALUE
861
-
862
- - **Value**: 50% improvement in Hit@3 for medium-difficulty queries (QMD evaluation)
863
- - **QMD Proof**: Evaluation suite shows consistent improvements across all query types
864
- - **Current Issue**: Naive deduplication doesn't properly fuse ranking signals
865
- - **Solution**: Mathematical fusion of FTS + vector ranked lists with position-aware scoring
866
- - **Formula**: `score = Σ(weight / (k + rank + 1))` with top-rank bonus
867
- - **Implementation**:
868
- - Create `Recall::RRFusion` class
869
- - Update `Recall#query_semantic_dual` to use RRF
870
- - Apply weights: original query ×2, expanded queries ×1
871
- - Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
872
- - **Trade-off**: Slightly more complex than naive merging (acceptable, well-tested)
873
- - **Recommendation**: **ADOPT IMMEDIATELY** - Pure algorithmic improvement
874
-
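The fusion described above can be sketched as follows. It is a minimal sketch, assuming 0-based ranks, `k = 60` (the value named in the Phase 2 checklist), and a bonus keyed on each fact's best rank across all lists. `rrf_fuse` and the input shape are hypothetical, not the `Recall::RRFusion` interface.

```ruby
RRF_K = 60

# ranked_lists: array of { ids: [...], weight: Numeric }, ids ordered best-first.
# Returns ids ordered by fused score, best first.
def rrf_fuse(ranked_lists, k: RRF_K)
  scores = Hash.new(0.0)
  best_rank = {}

  ranked_lists.each do |list|
    list[:ids].each_with_index do |id, rank| # rank is 0-based
      scores[id] += list[:weight].to_f / (k + rank + 1)
      best_rank[id] = [best_rank.fetch(id, rank), rank].min
    end
  end

  # Assumption: the top-rank bonus uses the best position in any list.
  best_rank.each do |id, rank|
    scores[id] += 0.05 if rank.zero?          # ranked #1 somewhere
    scores[id] += 0.02 if rank.between?(1, 2) # ranked #2-3 somewhere
  end

  scores.sort_by { |id, score| [-score, id.to_s] }.map(&:first)
end

fts    = { ids: %i[a b c], weight: 2 } # original query, weight x2
vector = { ids: %i[b d a], weight: 1 } # expanded query, weight x1
fused  = rrf_fuse([fts, vector])
```

A fact that appears in both lists accumulates score from each, which is what naive deduplication loses.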
875
- #### 3. **Docid Short Hash System** ⭐ MEDIUM VALUE
876
-
877
- - **Value**: Better UX, cross-database fact references
878
- - **QMD Proof**: Used in all output, enables `qmd get #abc123`
879
- - **Current Issue**: Integer IDs are database-specific, not user-friendly
880
- - **Solution**: 8-character hash IDs for facts (e.g., `#abc123de`)
881
- - **Implementation**:
882
- - Schema migration v8: Add `docid` column (indexed, unique)
883
- - Backfill existing facts with SHA256-based docids
884
- - Update CLI commands (`explain`, `recall`) to accept docids
885
- - Update MCP tools to accept docids
886
- - Update output formatting to show docids
887
- - **Trade-off**: Hash collisions possible (8 chars = 1 in 4.3 billion, very rare)
888
- - **Recommendation**: **ADOPT IN PHASE 2** - Clear UX improvement
889
-
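A docid of this kind could be derived as sketched below. The source does not say which fields feed the hash, so hashing the subject, predicate, and object is an assumption, and `docid_for` and `parse_docid` are hypothetical helper names.

```ruby
require "digest"

# Assumption: the docid is the first 8 hex characters of a SHA256 over the
# fact's subject, predicate, and object, joined with a NUL separator.
def docid_for(subject, predicate, object)
  Digest::SHA256.hexdigest([subject, predicate, object].join("\u0000"))[0, 8]
end

# Accepts "#abc123de" or "abc123de"; returns the bare docid, or nil if the
# input does not look like one (for example, a plain integer ID).
def parse_docid(input)
  token = input.delete_prefix("#")
  token.match?(/\A[0-9a-f]{8}\z/) ? token : nil
end
```

Because the hash depends only on the fact's content, the same fact would get the same docid in the global and project databases, which is what makes cross-database references workable.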
890
- #### 4. **Smart Expansion Detection** ⭐ MEDIUM VALUE
891
-
892
- - **Value**: Skip unnecessary vector search when FTS finds exact match
893
- - **QMD Proof**: Saves 2-3 seconds on 60% of queries (exact keyword matches)
894
- - **Current Issue**: Always runs both FTS and vector search, even for exact matches
895
- - **Solution**: Heuristic detection of strong FTS signal
896
- - **Thresholds**: `top_score >= 0.85` AND `gap >= 0.15`
897
- - **Implementation**:
898
- - Create `Recall::ExpansionDetector` class
899
- - Update `Recall#query_semantic_dual` to check before vector search
900
- - Add optional metrics tracking (skip rate, latency saved)
901
- - **Trade-off**: May miss semantic results for exact matches (acceptable)
902
- - **Recommendation**: **ADOPT IN PHASE 3** - Clear performance win
903
-
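The threshold check above might look like the following sketch. It assumes FTS scores are already normalized to the 0..1 range and that "gap" means the difference between the top two scores; `skip_vector_search?` is a hypothetical name, not the `Recall::ExpansionDetector` API.

```ruby
TOP_SCORE_MIN = 0.85
GAP_MIN = 0.15

# fts_scores: normalized relevance scores (0..1) from the FTS pass.
# Returns true when the FTS signal is strong enough to skip vector search.
def skip_vector_search?(fts_scores)
  return false if fts_scores.empty?

  top, second = fts_scores.sort.reverse.first(2)
  second ||= 0.0 # a single result has no runner-up
  top >= TOP_SCORE_MIN && (top - second) >= GAP_MIN
end
```

Requiring both conditions avoids skipping the vector pass when several results score highly and the ranking between them is still ambiguous.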
904
- ### Medium Priority
905
-
906
- #### 5. **Document Chunking for Long Transcripts**
907
-
908
- - **Value**: Better embeddings for long content (>3000 chars)
909
- - **QMD Approach**: 800 tokens, 15% overlap, semantic boundary detection
910
- - **Break Priority**: paragraph > sentence > line > word
911
- - **Implementation**: Modify ingestion to chunk long content_items before embedding
912
- - **Consideration**: Only if users report issues with long transcripts
913
- - **Recommendation**: **DEFER** - Not urgent, TF-IDF handles shorter content well
914
-
915
- #### 6. **LLM Response Caching**
916
-
917
- - **Value**: Reduce API costs for repeated distillation
918
- - **QMD Proof**: Hash-based caching with 80% hit rate
919
- - **Implementation**:
920
- - Add `llm_cache` table (hash, result, created_at)
921
- - Cache key: `SHA256(operation + model + input)`
922
- - Probabilistic cleanup: 1% chance per operation, keep latest 1000
923
- - **Consideration**: Most valuable when distiller is fully implemented
924
- - **Recommendation**: **ADOPT WHEN DISTILLER ACTIVE** - Cost savings
925
-
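The caching scheme above can be illustrated with an in-memory sketch. The real design uses an `llm_cache` table; the `LlmCache` class here is a hypothetical stand-in that only demonstrates the key derivation and the probabilistic cleanup, with a Hash taking the place of the table.

```ruby
require "digest"

class LlmCache
  MAX_ENTRIES = 1000
  CLEANUP_CHANCE = 0.01

  def initialize(rng: Random.new)
    @entries = {} # key => result; Ruby hashes keep insertion order
    @rng = rng
  end

  # Cache key as described: SHA256(operation + model + input).
  def key_for(operation, model, input)
    Digest::SHA256.hexdigest(operation + model + input)
  end

  # Returns the cached result, or runs the block and stores its value.
  def fetch(operation, model, input)
    key = key_for(operation, model, input)
    return @entries[key] if @entries.key?(key)

    result = yield
    @entries[key] = result
    cleanup if @rng.rand < CLEANUP_CHANCE # roughly 1% of stores
    result
  end

  def size
    @entries.size
  end

  private

  # Drop the oldest entries until only the latest MAX_ENTRIES remain.
  def cleanup
    @entries.shift while @entries.size > MAX_ENTRIES
  end
end
```

Running cleanup on a small random fraction of writes keeps the cache bounded without paying the trimming cost on every call.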
926
- #### 7. **Enhanced Snippet Extraction**
927
-
928
- - **Value**: Better search result previews with query term highlighting
929
- - **QMD Approach**: Find line with most query term matches, extract 1 line before + 2 after
930
- - **Implementation**: Add to `Recall` output formatting
931
- - **Consideration**: Improves UX but not critical
932
- - **Recommendation**: **CONSIDER** - Nice-to-have
933
-
934
- ### Low Priority / Not Recommended
935
-
936
- #### 8. **Neural Embeddings (EmbeddingGemma)** (DEFER)
937
-
938
- - **QMD Model**: 300M params, 300MB download, 384 dimensions
939
- - **Value**: Better semantic search quality (+40% Hit@3 over TF-IDF)
940
- - **Cost**: 300MB download, 300MB VRAM, 2s cold start, complex dependency
941
- - **Decision**: **DEFER** - TF-IDF sufficient for now, revisit if users report poor quality
942
-
943
- #### 9. **Cross-Encoder Reranking** (REJECT)
944
-
945
- - **QMD Model**: Qwen3-Reranker-0.6B (640MB)
946
- - **Value**: Better ranking precision via LLM scoring
947
- - **Cost**: 640MB model, 400ms latency per query, complex dependency
948
- - **Decision**: **REJECT** - Over-engineering for fact retrieval
949
-
950
- #### 10. **Query Expansion (LLM)** (REJECT)
951
-
952
- - **QMD Model**: Qwen3-1.7B (2.2GB)
953
- - **Value**: Generate alternative query phrasings for better recall
954
- - **Cost**: 2.2GB model, 800ms latency per query
955
- - **Decision**: **REJECT** - No LLM in recall path, too heavy
956
-
957
- #### 11. **YAML Collection System** (REJECT)
958
-
959
- - **QMD Use**: Multi-directory indexing with per-path contexts
960
- - **Our Use**: Dual-database (global + project) already provides clean separation
961
- - **Decision**: **REJECT** - Our approach is cleaner for our use case
962
-
963
- #### 12. **Content-Addressable Storage** (REJECT)
964
-
965
- - **QMD Use**: Deduplicates documents by SHA256 hash
966
- - **Our Use**: Facts deduplicated by signature, not content hash
967
- - **Decision**: **REJECT** - Different data model
968
-
969
- #### 13. **Virtual Path System** (REJECT)
970
-
971
- - **QMD Use**: `qmd://collection/path` unified namespace
972
- - **Our Use**: Dual-database provides clear namespace
973
- - **Decision**: **REJECT** - Unnecessary complexity
974
29
 
975
30
  ---
976
31
 
977
- ## Implementation Priorities
978
-
979
- ### High Priority (QMD-Inspired)
32
+ ## High Priority (Study-Inspired)
980
33
 
981
- 1. **Native Vector Storage (sqlite-vec)** - 10-100x faster KNN, foundational improvement
982
- 2. **Reciprocal Rank Fusion (RRF)** ⭐ - 50% better search quality, pure algorithm
983
- 3. **Docid Short Hashes** - Better UX for fact references
984
- 4. **Smart Expansion Detection** - Skip unnecessary vector search when FTS is confident
34
+ ### 2. SessionStart Context Injection via Hook
985
35
 
986
- ### Medium Priority
36
+ Source: claude-supermemory study
987
37
 
988
- 5. **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
989
- 6. **ROI Metrics** - Track token economics for distillation (from claude-mem)
990
- 7. **LLM Response Caching** - Reduce API costs (from QMD)
991
- 8. **Document Chunking** - Better embeddings for long transcripts (from QMD, if needed)
38
+ - **Value**: Guarantees Claude sees memory context immediately, supplements existing `.claude/rules/` publish
39
+ - **Implementation**: Inject recalled facts into Claude's context at session start using `hookSpecificOutput.additionalContext`
40
+ - **Evidence**: `context-hook.js:72-74` uses hook response to inject `<supermemory-context>` XML
41
+ - **Effort**: 1-2 days (hook handler, context formatter, settings)
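A SessionStart hook handler could print the recalled facts as JSON on stdout. The sketch below assumes the hook output carries `hookSpecificOutput.additionalContext` as named above, together with a `hookEventName` field; the `<memory-context>` wrapper and the `session_start_payload` helper are hypothetical.

```ruby
require "json"

# facts: array of hashes with :subject, :predicate, :object keys.
# Returns the hash a SessionStart hook would serialize to stdout.
def session_start_payload(facts)
  lines = facts.map { |f| "- #{f[:subject]} #{f[:predicate]} #{f[:object]}" }
  context = "<memory-context>\n#{lines.join("\n")}\n</memory-context>"

  {
    hookSpecificOutput: {
      hookEventName: "SessionStart",
      additionalContext: context
    }
  }
end

facts = [{ subject: "project", predicate: "uses_database", object: "PostgreSQL" }]
puts JSON.generate(session_start_payload(facts))
```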
992
42
 
993
- ### Low Priority
43
+ ### 3. Tool-Specific Observation Compression ⭐
994
44
 
995
- 9. **Structured Logging** - Better debugging with JSON logs
996
- 10. **Embed Command** - Backfill embeddings for existing facts
997
- 11. **Enhanced Snippet Extraction** - Query-aware snippet preview (from QMD)
998
- 12. **Health Monitoring** - Only if we add background worker
999
- 13. **Web Viewer UI** - Only if users request visualization
1000
- 14. **Configuration-Driven Context** - Only if users request snapshot customization
45
+ Source: claude-supermemory study
1001
46
 
1002
- ---
1003
-
1004
- ## Migration Path
47
+ - **Value**: ~70% token reduction vs raw tool I/O in provenance descriptions
48
+ - **Implementation**: Compact per-tool summarization for provenance (e.g., `Edited auth.js: "login()" → "async login()"`)
49
+ - **Evidence**: `compress.js:13-75` — 10 tool handlers with human-readable output
50
+ - **Effort**: 4-6 hours (class + tests + ingest integration)
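Per-tool summarization could be sketched as below. The tool names and input keys (`file_path`, `old_string`, `new_string`, `command`) are assumptions about the shape of the tool input, and `compress_observation` is a hypothetical helper covering three tools rather than the ten handlers cited in the evidence.

```ruby
# Trim long strings so a summary stays on one short line.
def shorten(text, max = 40)
  text.length > max ? "#{text[0, max]}..." : text
end

# tool_name: String; input: Hash of the tool's input fields (assumed keys).
def compress_observation(tool_name, input)
  case tool_name
  when "Edit"
    file = File.basename(input.fetch("file_path"))
    before = shorten(input.fetch("old_string"))
    after = shorten(input.fetch("new_string"))
    "Edited #{file}: \"#{before}\" → \"#{after}\""
  when "Read"
    "Read #{File.basename(input.fetch("file_path"))}"
  when "Bash"
    "Ran: #{shorten(input.fetch("command"), 60)}"
  else
    "Used #{tool_name}" # fallback for tools without a dedicated handler
  end
end
```

The fallback branch keeps provenance readable for unknown tools instead of storing their raw input.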
1005
51
 
1006
- ### Completed
52
+ ### 4. Claude Code Plugin Distribution Format ⭐
1007
53
 
1008
- - [x] WAL mode for better concurrency
1009
- - [x] Enhanced statistics command
1010
- - [x] Session metadata tracking
1011
- - [x] Tool usage tracking
1012
- - [x] Semantic search with TF-IDF embeddings
1013
- - [x] Multi-concept AND search
1014
- - [x] Incremental sync with mtime tracking
1015
- - [x] Context-aware queries
54
+ Source: QMD study
1016
55
 
1017
- ### Phase 1: Vector Storage Upgrade (from QMD) - IMMEDIATE
56
+ - **Value**: 10x easier installation (one command vs multi-step gem + MCP + hook config)
57
+ - **Implementation**: Package ClaudeMemory as marketplace plugin for single-command installation
58
+ - **Evidence**: `.claude-plugin/marketplace.json` — complete plugin spec with MCP server bundling and skill definitions
59
+ - **Effort**: 2-3 days
1018
60
 
1019
- - [ ] Add sqlite-vec extension support (gem or FFI)
1020
- - [ ] Create schema migration v7: `facts_vec` virtual table using `vec0`
1021
- - [ ] Backfill existing embeddings from JSON to native vectors
1022
- - [ ] Update `Embeddings::Similarity` class for native KNN (two-step query pattern)
1023
- - [ ] Test migration on existing databases
1024
- - [ ] Document extension installation in README
1025
- - [ ] Benchmark: Measure KNN query improvement (expect 10-100x)
61
+ ---
1026
62
 
1027
- ### Phase 2: RRF Fusion (from QMD) - IMMEDIATE
63
+ ## Medium Priority
1028
64
 
1029
- - [ ] Implement `Recall::RRFusion` class with k=60 parameter
1030
- - [ ] Update `Recall#query_semantic_dual` to use RRF fusion
1031
- - [ ] Apply weights: original query ×2, expanded queries ×1
1032
- - [ ] Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
1033
- - [ ] Test with synthetic ranked lists (unit tests)
1034
- - [ ] Validate improvements with real queries
65
+ ### 5. Incremental Indexing with File Watching
1035
66
 
1036
- ### Phase 3: UX Improvements (from QMD) - NEAR-TERM
67
+ Source: grepai study
1037
68
 
1038
- - [ ] Schema migration v8: Add `docid` column (8-char hash, indexed, unique)
1039
- - [ ] Backfill existing facts with SHA256-based docids
1040
- - [ ] Update CLI commands to accept/display docids (`ExplainCommand`, `RecallCommand`)
1041
- - [ ] Update MCP tools for docid support (`memory.explain`, `memory.recall`)
1042
- - [ ] Test cross-database docid lookups
69
+ - **Value**: Eliminates manual `claude-memory ingest` calls
70
+ - **Implementation**: Add `Listen` gem, watch `.claude/projects/*/transcripts/*.jsonl`, debounce 500ms, trigger IngestCommand automatically
71
+ - **Evidence**: `watcher/watcher.go:44` uses `fsnotify` with debouncing (300ms default) and respects gitignore
72
+ - **Effort**: 2-3 days
73
+ - **Trade-off**: Background process ~10MB memory overhead
1043
74
 
1044
- ### Phase 4: Performance Optimizations (from QMD) - NEAR-TERM
75
+ ### 6. Background Processing for Hooks
1045
76
 
1046
- - [ ] Implement `Recall::ExpansionDetector` class
1047
- - [ ] Update `Recall#query_semantic_dual` to check before vector search
1048
- - [ ] Add metrics tracking (skip rate, avg latency saved)
1049
- - [ ] Tune thresholds based on usage patterns
77
+ Source: episodic-memory study
1050
78
 
1051
- ### Remaining Tasks
79
+ - **Value**: Non-blocking hooks for better UX
80
+ - **Implementation**: `--async` flag on hook commands, fork and detach
81
+ - **Trade-off**: Background process management complexity, potential race conditions
1052
82
 
1053
- - [ ] Background processing (--async flag for hooks)
1054
- - [ ] LLM response caching (from QMD, when distiller is active)
1055
- - [ ] Structured logging implementation
1056
- - [ ] Embed command for backfilling embeddings
83
+ ### 7. Document Chunking for Long Transcripts
1057
84
 
1058
- ### Future (If Requested)
85
+ Source: QMD study
1059
86
 
1060
- - [ ] Document chunking for long transcripts (from QMD, if users report issues)
1061
- - [ ] Enhanced snippet extraction (from QMD, for better search result previews)
1062
- - [ ] Build web viewer (if users request visualization)
1063
- - [ ] Add HTTP-based health checks (if background worker is added)
1064
- - [ ] Configuration-driven snapshot generation (if users request customization)
87
+ - **Value**: Better embeddings for long content (>3000 chars)
88
+ - **Implementation**: 800 tokens, 15% overlap, semantic boundary detection
89
+ - **Consideration**: Only if users report issues with long transcripts
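The window sizes above can be illustrated with a simplified sketch. It uses fixed-size overlapping windows over whitespace-separated words as a stand-in for tokens and does not attempt semantic boundary detection; `chunk_words` is a hypothetical helper.

```ruby
# Splits text into overlapping windows of `size` words.
# With the defaults, consecutive chunks share 120 words (15% of 800).
def chunk_words(text, size: 800, overlap_ratio: 0.15)
  words = text.split
  return [] if words.empty?

  step = size - (size * overlap_ratio).round
  raise ArgumentError, "overlap leaves no forward progress" unless step.positive?

  chunks = []
  start = 0
  loop do
    chunks << words[start, size].join(" ")
    break if start + size >= words.length

    start += step
  end
  chunks
end
```

A real implementation would move each cut to the nearest paragraph, sentence, or line break before falling back to a word boundary.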
1065
90
 
1066
91
  ---
1067
92
 
1068
- ## Key Takeaways
1069
-
1070
- ### Successfully Adopted from claude-mem ✓
1071
-
1072
- 1. Progressive disclosure (token-efficient retrieval)
1073
- 2. Privacy controls (tag-based content exclusion)
1074
- 3. Clean architecture (command pattern, slim CLI)
1075
- 4. Semantic shortcuts (decisions, conventions, architecture)
1076
- 5. Exit code strategy (hook error handling)
1077
- 6. ROI metrics tracking (token economics for distillation efficiency)
1078
-
1079
- ### Successfully Adopted from Episodic-Memory ✓
1080
-
1081
- 1. **WAL Mode** - Better concurrency with Write-Ahead Logging
1082
- 2. **Tool Usage Tracking** - Dedicated table tracking which tools discovered facts
1083
- 3. **Incremental Sync** - mtime-based change detection for fast re-ingestion
1084
- 4. **Session Metadata** - Context capture (git branch, cwd, Claude version)
1085
- 5. **Local Vector Embeddings** - TF-IDF semantic search alongside FTS5
1086
- 6. **Multi-Concept AND Search** - Precise queries matching 2-5 concepts simultaneously
1087
- 7. **Enhanced Statistics** - Comprehensive reporting on facts, entities, provenance
1088
- 8. **Context-Aware Queries** - Filter by branch, directory, or tools used
1089
-
1090
- ### Our Unique Advantages
1091
-
1092
- 1. **Dual-database architecture** - Global + project scopes
1093
- 2. **Fact-based knowledge graph** - Structured vs blob observations or conversation exchanges
1094
- 3. **Truth maintenance** - Conflict resolution and supersession
1095
- 4. **Predicate policies** - Single vs multi-value semantics
1096
- 5. **Ruby ecosystem** - Simpler dependencies, easier install
1097
- 6. **Lightweight embeddings** - No external dependencies (TF-IDF vs Transformers.js)
93
+ ---
1098
94
 
1099
- ### Remaining Opportunities
95
+ ## Features to Avoid
1100
96
 
1101
- - **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
1102
- - **ROI Metrics** - Track token economics for distillation (from claude-mem)
1103
- - **Structured Logging** - JSON-formatted logs for debugging
1104
- - **Embed Command** - Backfill embeddings for existing facts
1105
- - **Health Monitoring** - Only if we add background worker
1106
- - **Web Viewer UI** - Only if users request visualization
1107
- - **Configuration-Driven Context** - Only if users request snapshot customization
97
+ - **Chroma Vector Database** — We use fastembed-rb with local ONNX model instead
98
+ - **Claude Agent SDK for Distillation** — Direct API calls via `anthropic-rb` gem
99
+ - **Worker Service Background Process** — Keep stdio-based MCP server
100
+ - **Web Viewer UI** — CLI output is sufficient. Add if users request it
101
+ - **Configuration-Driven Context** — Default config is sufficient. Add if users request it
102
+ - **Neural Embeddings (EmbeddingGemma)** — Superseded by FastEmbed (BAAI/bge-small-en-v1.5)
103
+ - **Cross-Encoder Reranking (Qwen3-Reranker-0.6B)** — Over-engineering for fact retrieval
104
+ - **Query Expansion (LLM, Qwen3-1.7B)** — No LLM in recall path, too heavy
105
+ - **Custom Fine-Tuned Query Expansion** — 1.7B model too heavy for fact retrieval
106
+ - **YAML Collection System** — Our dual-database approach is cleaner
107
+ - **Content-Addressable Storage** — Facts deduplicated by signature, not content hash
108
+ - **Virtual Path System** — Dual-database provides clear namespace
1108
109
 
1109
110
  ---
1110
111
 
1111
- ## Comparison Summary
112
+ ## Design Decisions
1112
113
 
1113
- **Episodic-memory** and **claude_memory** serve complementary but different needs:
114
+ ### No Tag Count Limit (2026-01-23)
1114
115
 
1115
- **Episodic-memory** excels at:
1116
- - Semantic conversation search with local embeddings
1117
- - Preserving complete conversation context
1118
- - Multi-concept AND queries
1119
- - Fast incremental sync
1120
- - Tool usage tracking
1121
- - Rich session metadata
116
+ **Decision**: Removed MAX_TAG_COUNT limit from ContentSanitizer.
1122
117
 
1123
- **ClaudeMemory** excels at:
1124
- - Structured fact extraction and storage
1125
- - Truth maintenance and conflict resolution
1126
- - Dual-scope architecture (global vs project)
1127
- - Knowledge graph with provenance
1128
- - Semantic shortcuts for common queries
118
+ **Rationale**:
119
+ - The regex pattern `/<tag>.*?<\/tag>/m` is provably safe from ReDoS attacks
120
+ - Performance is O(n) and excellent even with 1000+ tags (~0.6ms)
121
+ - Real-world usage legitimately produces 100-200+ tags in long sessions
122
+ - No other similar tool enforces tag count limits
1129
123
 
1130
- **Best of both worlds (achieved)**:
1131
- - ✅ Added vector embeddings for semantic search (TF-IDF based)
1132
- - ✅ Kept fact-based knowledge graph for structured queries
1133
- - ✅ Adopted incremental sync and tool tracking from episodic-memory
1134
- - ✅ Maintained truth maintenance and conflict resolution
1135
- - ✅ Added session metadata for richer context
1136
- - ✅ Implemented multi-concept AND search
1137
- - ✅ Enhanced statistics and reporting
124
+ **Do not reintroduce**: Tag count validation is unnecessary and harmful.
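The rationale can be illustrated with a small sketch of tag stripping using the same lazy pattern. The `private` tag name and the `strip_private` helper are assumptions for illustration, not the actual `ContentSanitizer` code.

```ruby
# Same shape as the pattern in the rationale, with a concrete tag name.
# /m lets "." match newlines so multi-line tag bodies are removed too.
PRIVATE_TAG = /<private>.*?<\/private>/m

def strip_private(text)
  text.gsub(PRIVATE_TAG, "")
end

# A long session with many tagged spans is handled in a single pass,
# with no cap on how many tags appear.
session = "keep " + ("<private>secret</private>" * 1000) + "this"
cleaned = strip_private(session)
```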
1138
125
 
1139
126
  ---
1140
127
 
@@ -1142,10 +129,10 @@ Analysis of **QMD (Quick Markdown Search)** reveals several high-value optimizat
1142
129
 
1143
130
  - [episodic-memory GitHub](https://github.com/obra/episodic-memory) - Semantic conversation search
1144
131
  - [claude-mem GitHub](https://github.com/thedotmack/claude-mem) - Memory compression system
1145
- - [ClaudeMemory Updated Plan](updated_plan.md) - Original improvement plan
132
+ - [grepai GitHub](https://github.com/yoanbernabeu/grepai) - Semantic code search
133
+ - [claude-supermemory GitHub](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed memory
134
+ - [QMD GitHub](https://github.com/tobi/qmd) - On-device hybrid search engine
1146
135
 
1147
136
  ---
1148
137
 
1149
- *This document has been updated to reflect completed implementations. Fourteen major improvements have been successfully integrated: 6 from claude-mem and 8 from episodic-memory. ClaudeMemory now combines the best of both systems while maintaining its unique advantages in fact-based knowledge representation and truth maintenance.*
1150
-
1151
- *Last updated: 2026-01-26 - Added ROI metrics tracking for distillation token economics*
138
+ *Last updated: 2026-02-03 - Removed Docid, LLM Cache, Structured Logging (implemented). Renumbered items.*