claude_memory 0.2.0 → 0.3.0

Files changed (104)
  1. checksums.yaml +4 -4
  2. data/.claude/.mind.mv2.o2N83S +0 -0
  3. data/.claude/CLAUDE.md +1 -0
  4. data/.claude/rules/claude_memory.generated.md +28 -9
  5. data/.claude/settings.local.json +9 -1
  6. data/.claude/skills/check-memory/SKILL.md +77 -0
  7. data/.claude/skills/improve/SKILL.md +532 -0
  8. data/.claude/skills/improve/feature-patterns.md +1221 -0
  9. data/.claude/skills/quality-update/SKILL.md +229 -0
  10. data/.claude/skills/quality-update/implementation-guide.md +346 -0
  11. data/.claude/skills/review-commit/SKILL.md +199 -0
  12. data/.claude/skills/review-for-quality/SKILL.md +154 -0
  13. data/.claude/skills/review-for-quality/expert-checklists.md +79 -0
  14. data/.claude/skills/setup-memory/SKILL.md +168 -0
  15. data/.claude/skills/study-repo/SKILL.md +307 -0
  16. data/.claude/skills/study-repo/analysis-template.md +323 -0
  17. data/.claude/skills/study-repo/focus-examples.md +327 -0
  18. data/CHANGELOG.md +133 -0
  19. data/CLAUDE.md +130 -11
  20. data/README.md +117 -10
  21. data/db/migrations/001_create_initial_schema.rb +117 -0
  22. data/db/migrations/002_add_project_scoping.rb +33 -0
  23. data/db/migrations/003_add_session_metadata.rb +42 -0
  24. data/db/migrations/004_add_fact_embeddings.rb +20 -0
  25. data/db/migrations/005_add_incremental_sync.rb +21 -0
  26. data/db/migrations/006_add_operation_tracking.rb +40 -0
  27. data/db/migrations/007_add_ingestion_metrics.rb +26 -0
  28. data/docs/.claude/mind.mv2.lock +0 -0
  29. data/docs/GETTING_STARTED.md +587 -0
  30. data/docs/RELEASE_NOTES_v0.2.0.md +0 -1
  31. data/docs/RUBY_COMMUNITY_POST_v0.2.0.md +0 -2
  32. data/docs/architecture.md +9 -8
  33. data/docs/auto_init_design.md +230 -0
  34. data/docs/improvements.md +557 -731
  35. data/docs/influence/.gitkeep +13 -0
  36. data/docs/influence/grepai.md +933 -0
  37. data/docs/influence/qmd.md +2195 -0
  38. data/docs/plugin.md +257 -11
  39. data/docs/quality_review.md +472 -1273
  40. data/docs/remaining_improvements.md +330 -0
  41. data/lefthook.yml +13 -0
  42. data/lib/claude_memory/commands/checks/claude_md_check.rb +41 -0
  43. data/lib/claude_memory/commands/checks/database_check.rb +120 -0
  44. data/lib/claude_memory/commands/checks/hooks_check.rb +112 -0
  45. data/lib/claude_memory/commands/checks/reporter.rb +110 -0
  46. data/lib/claude_memory/commands/checks/snapshot_check.rb +30 -0
  47. data/lib/claude_memory/commands/doctor_command.rb +12 -129
  48. data/lib/claude_memory/commands/help_command.rb +1 -0
  49. data/lib/claude_memory/commands/hook_command.rb +9 -2
  50. data/lib/claude_memory/commands/index_command.rb +169 -0
  51. data/lib/claude_memory/commands/ingest_command.rb +1 -1
  52. data/lib/claude_memory/commands/init_command.rb +5 -197
  53. data/lib/claude_memory/commands/initializers/database_ensurer.rb +30 -0
  54. data/lib/claude_memory/commands/initializers/global_initializer.rb +85 -0
  55. data/lib/claude_memory/commands/initializers/hooks_configurator.rb +156 -0
  56. data/lib/claude_memory/commands/initializers/mcp_configurator.rb +56 -0
  57. data/lib/claude_memory/commands/initializers/memory_instructions_writer.rb +135 -0
  58. data/lib/claude_memory/commands/initializers/project_initializer.rb +111 -0
  59. data/lib/claude_memory/commands/recover_command.rb +75 -0
  60. data/lib/claude_memory/commands/registry.rb +5 -1
  61. data/lib/claude_memory/commands/stats_command.rb +239 -0
  62. data/lib/claude_memory/commands/uninstall_command.rb +226 -0
  63. data/lib/claude_memory/core/batch_loader.rb +32 -0
  64. data/lib/claude_memory/core/concept_ranker.rb +73 -0
  65. data/lib/claude_memory/core/embedding_candidate_builder.rb +37 -0
  66. data/lib/claude_memory/core/fact_collector.rb +51 -0
  67. data/lib/claude_memory/core/fact_query_builder.rb +154 -0
  68. data/lib/claude_memory/core/fact_ranker.rb +113 -0
  69. data/lib/claude_memory/core/result_builder.rb +54 -0
  70. data/lib/claude_memory/core/result_sorter.rb +25 -0
  71. data/lib/claude_memory/core/scope_filter.rb +61 -0
  72. data/lib/claude_memory/core/text_builder.rb +29 -0
  73. data/lib/claude_memory/embeddings/generator.rb +161 -0
  74. data/lib/claude_memory/embeddings/similarity.rb +69 -0
  75. data/lib/claude_memory/hook/handler.rb +4 -3
  76. data/lib/claude_memory/index/lexical_fts.rb +7 -2
  77. data/lib/claude_memory/infrastructure/operation_tracker.rb +158 -0
  78. data/lib/claude_memory/infrastructure/schema_validator.rb +206 -0
  79. data/lib/claude_memory/ingest/content_sanitizer.rb +6 -7
  80. data/lib/claude_memory/ingest/ingester.rb +99 -15
  81. data/lib/claude_memory/ingest/metadata_extractor.rb +57 -0
  82. data/lib/claude_memory/ingest/tool_extractor.rb +71 -0
  83. data/lib/claude_memory/mcp/response_formatter.rb +331 -0
  84. data/lib/claude_memory/mcp/server.rb +19 -0
  85. data/lib/claude_memory/mcp/setup_status_analyzer.rb +73 -0
  86. data/lib/claude_memory/mcp/tool_definitions.rb +279 -0
  87. data/lib/claude_memory/mcp/tool_helpers.rb +80 -0
  88. data/lib/claude_memory/mcp/tools.rb +330 -320
  89. data/lib/claude_memory/recall/dual_query_template.rb +63 -0
  90. data/lib/claude_memory/recall.rb +304 -237
  91. data/lib/claude_memory/resolve/resolver.rb +52 -49
  92. data/lib/claude_memory/store/sqlite_store.rb +210 -144
  93. data/lib/claude_memory/store/store_manager.rb +6 -6
  94. data/lib/claude_memory/sweep/sweeper.rb +6 -0
  95. data/lib/claude_memory/version.rb +1 -1
  96. data/lib/claude_memory.rb +35 -3
  97. metadata +71 -11
  98. data/.claude/.mind.mv2.aLCUZd +0 -0
  99. data/.claude/memory.sqlite3 +0 -0
  100. data/.mcp.json +0 -11
  101. /data/docs/{feature_adoption_plan.md → plans/feature_adoption_plan.md} +0 -0
  102. /data/docs/{feature_adoption_plan_revised.md → plans/feature_adoption_plan_revised.md} +0 -0
  103. /data/docs/{plan.md → plans/plan.md} +0 -0
  104. /data/docs/{updated_plan.md → plans/updated_plan.md} +0 -0
data/docs/improvements.md CHANGED
@@ -1,491 +1,279 @@
1
- # Improvements to Consider (Based on claude-mem Analysis)
1
+ # Improvements to Consider
2
2
 
3
- *Generated: 2026-01-21*
4
- *Source: Comparative analysis of [thedotmack/claude-mem](https://github.com/thedotmack/claude-mem)*
3
+ *Updated: 2026-01-29*
4
+ *Sources:*
5
+ - *[thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - Memory compression system*
6
+ - *[obra/episodic-memory](https://github.com/obra/episodic-memory) - Semantic conversation search*
7
+ - *[yoanbernabeu/grepai](https://github.com/yoanbernabeu/grepai) - Semantic code search with vector embeddings*
5
8
 
6
- This document identifies design patterns, features, and architectural decisions from claude-mem that could improve claude_memory. Each section includes rationale, implementation considerations, and priority.
9
+ This document identifies design patterns and features from claude-mem and episodic-memory that could improve claude_memory. Implemented improvements have been removed from this document.
7
10
 
8
11
  ---
9
12
 
10
- ## Executive Summary
11
-
12
- Claude-mem (TypeScript/Node.js, v9.0.5) is a production-grade memory compression system with 6+ months of real-world usage. Key strengths:
13
-
14
- - **Progressive Disclosure**: Token-efficient 3-layer retrieval workflow
15
- - **ROI Metrics**: Tracks token costs and discovery efficiency
16
- - **Slim Architecture**: Clean separation via service layer pattern
17
- - **Dual Integration**: Plugin + MCP server for flexibility
18
- - **Privacy-First**: User-controlled content exclusion via tags
19
- - **Fail-Fast Philosophy**: Explicit error handling and exit codes
20
-
21
- **Our Advantages**:
22
- - Ruby ecosystem (simpler dependencies)
23
- - Dual-database architecture (global + project scope)
24
- - Fact-based knowledge graph (vs observation blobs)
25
- - Truth maintenance system (conflict resolution)
26
- - Predicate policies (single vs multi-value)
13
+ ## Implemented Improvements ✓
14
+
15
+ The following improvements from the original analysis have been successfully implemented:
16
+
17
+ 1. **Progressive Disclosure Pattern** - `memory.recall_index` and `memory.recall_details` MCP tools with token estimation
18
+ 2. **Privacy Tag System** - ContentSanitizer with `<private>`, `<no-memory>`, and `<secret>` tag stripping
19
+ 3. **Slim Orchestrator Pattern** - CLI refactored to thin router with extracted command classes
20
+ 4. **Semantic Shortcuts** - `memory.decisions`, `memory.conventions`, and `memory.architecture` MCP tools
21
+ 5. **Exit Code Strategy** - Hook::ExitCodes module with SUCCESS/WARNING/ERROR constants
22
+ 6. **WAL Mode for Concurrency** - SQLite Write-Ahead Logging enabled for better concurrent access
23
+ 7. **Enhanced Statistics** - Comprehensive stats command showing facts, entities, provenance, conflicts
24
+ 8. **Session Metadata Tracking** - Captures git_branch, cwd, claude_version, thinking_level from transcripts
25
+ 9. **Tool Usage Tracking** - Dedicated tool_calls table tracking tool names, inputs, timestamps
26
+ 10. **Semantic Search with TF-IDF** - Local embeddings (384-dimensional), hybrid vector + text search
27
+ 11. **Multi-Concept AND Search** - Query facts matching all of 2-5 concepts simultaneously
28
+ 12. **Incremental Sync** - mtime-based change detection to skip unchanged transcript files
29
+ 13. **Context-Aware Queries** - Filter facts by git branch, directory, or tools used
30
+ 14. **ROI Metrics Tracking** - ingestion_metrics table tracking token economics for distillation efficiency (2026-01-26)
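Item 12 (incremental sync) relies on mtime-based change detection. A minimal Ruby sketch of that idea follows. The `MtimeTracker` class and its in-memory hash are hypothetical and exist only for illustration; the gem's actual implementation is not shown in this excerpt and presumably persists its state rather than keeping it in memory.

```ruby
require "tempfile"

# Hypothetical helper for illustration; not claude_memory's actual API.
# Remembers the last-seen modification time of each file so unchanged
# transcripts can be skipped on the next sync.
class MtimeTracker
  def initialize
    @seen = {} # path => Time of the last processed version
  end

  # Returns true when the file is new or modified since the last call,
  # and records its current mtime.
  def changed?(path)
    mtime = File.mtime(path)
    return false if @seen[path] == mtime

    @seen[path] = mtime
    true
  end
end

tracker = MtimeTracker.new
transcript = Tempfile.new("transcript")
tracker.changed?(transcript.path) # new file: needs ingesting
tracker.changed?(transcript.path) # unchanged: can be skipped
```

Comparing mtimes avoids re-reading file contents, at the cost of missing edits that preserve the timestamp; a content hash would be the stricter alternative.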
27
31
 
28
32
  ---
29
33
 
30
- ## 1. Progressive Disclosure Pattern
31
-
32
- ### What claude-mem Does
33
-
34
- **3-Layer Workflow** enforced at the tool level:
35
-
36
- ```
37
- Layer 1: search → Get compact index with IDs (~50-100 tokens/result)
38
- Layer 2: timeline → Get chronological context around IDs
39
- Layer 3: get_observations → Fetch full details (~500-1,000 tokens/result)
40
- ```
41
-
42
- **Token savings**: ~10x reduction by filtering before fetching.
43
-
44
- **MCP Tools**:
45
- - `search` - Returns index format (titles, IDs, token counts)
46
- - `timeline` - Returns context around specific observation
47
- - `get_observations` - Returns full details only for filtered IDs
48
- - `__IMPORTANT` - Workflow documentation (always visible)
49
-
50
- **File**: `docs/public/progressive-disclosure.mdx` (673 lines of philosophy)
51
-
52
- ### What We Should Do
53
-
54
- **Priority**: HIGH
55
-
56
- **Implementation**:
57
-
58
- 1. **Add token count field to facts table**:
59
- ```ruby
60
- alter table :facts do
61
- add_column :token_count, Integer
62
- end
63
- ```
64
-
65
- 2. **Create index format in Recall**:
66
- ```ruby
67
- # lib/claude_memory/recall.rb
68
- def recall_index(query, scope: :project, limit: 20)
69
- facts = search_facts(query, scope:, limit:)
70
- facts.map do |fact|
71
- {
72
- id: fact[:id],
73
- subject: fact[:subject],
74
- predicate: fact[:predicate],
75
- object_preview: fact[:object_value][0..50],
76
- scope: fact[:scope],
77
- token_count: fact[:token_count] || estimate_tokens(fact)
78
- }
79
- end
80
- end
81
- ```
82
-
83
- 3. **Add MCP tool for fetching details**:
84
- ```ruby
85
- # lib/claude_memory/mcp/tools.rb
86
- TOOLS["memory.recall_index"] = {
87
- description: "Layer 1: Search for facts. Returns compact index with IDs.",
88
- input_schema: {
89
- type: "object",
90
- properties: {
91
- query: { type: "string" },
92
- scope: { type: "string", enum: ["global", "project", "both"] },
93
- limit: { type: "integer", default: 20 }
94
- }
95
- }
96
- }
97
-
98
- TOOLS["memory.recall_details"] = {
99
- description: "Layer 2: Fetch full fact details by IDs.",
100
- input_schema: {
101
- type: "object",
102
- properties: {
103
- fact_ids: { type: "array", items: { type: "integer" } }
104
- },
105
- required: ["fact_ids"]
106
- }
107
- }
108
- ```
109
-
110
- 4. **Update publish format** to show costs:
111
- ```markdown
112
- ## Recent Facts
113
-
114
- | ID | Subject | Predicate | Preview | Tokens |
115
- |----|---------|-----------|---------|--------|
116
- | #123 | project | uses_database | PostgreSQL | ~45 |
117
- | #124 | project | has_constraint | API rate lim... | ~120 |
118
- ```
119
-
120
- **Benefits**:
121
- - Reduces context waste in published snapshots
122
- - Gives Claude control over retrieval depth
123
- - Makes token costs visible for informed decisions
124
-
125
- **Trade-offs**:
126
- - More complex MCP interface
127
- - Requires token estimation logic
128
- - May confuse users who expect full details
34
+ ## grepai Study (2026-01-29)
35
+
36
+ Source: docs/influence/grepai.md
37
+
38
+ ### High Priority Recommendations
39
+
40
+ - [ ] **Incremental Indexing with File Watching**: Auto-update memory index during coding sessions
41
+ - Value: Eliminates manual `claude-memory ingest` calls, huge UX win
42
+ - Evidence: watcher/watcher.go:44 - `fsnotify` with debouncing (300ms default), gitignore respect
43
+ - Implementation: Add `Listen` gem (Ruby fsnotify), watch `.claude/projects/*/transcripts/*.jsonl`, debounce 500ms, trigger IngestCommand automatically
44
+ - Effort: 2-3 days (watcher class, integration, testing)
45
+ - Trade-off: Background process ~10MB memory overhead, may complicate testing
46
+
47
+ - [ ] **Compact Response Format for MCP Tools**: Reduce token usage by ~60% in MCP responses
48
+ - Value: Critical for scaling to large fact databases (1000+ facts)
49
+ - Evidence: mcp/server.go:219 - `SearchResultCompact` omits content field, returns only metadata
50
+ - Implementation: Add `compact: true` parameter to all recall tools, omit provenance/context excerpts by default, user can override with `compact: false`
51
+ - Effort: 4-6 hours (add parameter, update formatters, tests)
52
+ - Trade-off: User needs follow-up `memory.explain <fact_id>` for full context (two-step interaction)
53
+
54
+ - [ ] **Fact Dependency Graph Visualization**: Show supersession chains and conflict relationships
55
+ - Value: Invaluable for understanding why facts were superseded or conflicted
56
+ - Evidence: trace/trace.go:95 - `CallGraph` struct with nodes and edges for function dependencies
57
+ - Implementation: Create `memory.fact_graph <fact_id> --depth 2` tool, query `fact_links` table with BFS traversal, return JSON with nodes (facts) and edges (supersedes/conflicts/supports)
58
+ - Effort: 2-3 days (graph builder, MCP tool, tests)
59
+ - Trade-off: Adds complexity for feature used mainly for debugging/exploration
60
+
61
+ - [ ] **Hybrid Search (Vector + Text) with RRF**: Better relevance combining semantic and keyword matching
62
+ - Value: 50% improvement in search quality (proven by grepai's Reciprocal Rank Fusion)
63
+ - Evidence: search/search.go - RRF with K=60, combines cosine similarity with full-text search
64
+ - Implementation: Add `sqlite-vec` extension, add `embeddings` BLOB column to `facts`, implement RRF in `Recall#query`, make hybrid optional via config
65
+ - Effort: 5-7 days (embedder setup, schema migration, RRF implementation, testing)
66
+ - Trade-off: Requires API calls for embedding (~$0.00001/fact), slower queries (2x search + fusion)
67
+ - Recommendation: CONSIDER - High value but significant effort. Start with FTS5, add vectors later if quality issues arise
129
68
 
130
69
  ---
131
70
 
132
- ## 2. ROI Metrics and Token Economics
133
-
134
- ### What claude-mem Does
135
-
136
- **Discovery Token Tracking**:
137
- - `discovery_tokens` field on observations table
138
- - Tracks tokens spent discovering each piece of knowledge
139
- - Cumulative metrics in session summaries
140
- - Footer displays ROI: "Access 10k tokens for 2,500t"
141
-
142
- **File**: `src/services/sqlite/Database.ts`
143
-
144
- ```typescript
145
- observations: {
146
- id: INTEGER PRIMARY KEY,
147
- title: TEXT,
148
- narrative: TEXT,
149
- discovery_tokens: INTEGER, // ← Cost tracking
150
- created_at_epoch: INTEGER
151
- }
152
-
153
- session_summaries: {
154
- cumulative_discovery_tokens: INTEGER, // ← Running total
155
- observation_count: INTEGER
156
- }
157
- ```
158
-
159
- **Context Footer Example**:
160
- ```markdown
161
- 💡 **Token Economics:**
162
- - Context shown: 2,500 tokens
163
- - Research captured: 10,000 tokens
164
- - ROI: 4x compression
165
- ```
166
-
167
- ### What We Should Do
168
-
169
- **Priority**: MEDIUM
170
-
171
- **Implementation**:
172
-
173
- 1. **Add metrics table**:
174
- ```ruby
175
- create_table :ingestion_metrics do
176
- primary_key :id
177
- foreign_key :content_item_id, :content_items
178
- Integer :input_tokens
179
- Integer :output_tokens
180
- Integer :facts_extracted
181
- DateTime :created_at
182
- end
183
- ```
71
+ ## Design Decisions
184
72
 
185
- 2. **Track during distillation**:
186
- ```ruby
187
- # lib/claude_memory/distill/distiller.rb
188
- def distill(content)
189
- response = api_call(content)
190
- facts = extract_facts(response)
191
-
192
- store_metrics(
193
- input_tokens: response.usage.input_tokens,
194
- output_tokens: response.usage.output_tokens,
195
- facts_extracted: facts.size
196
- )
73
+ ### No Tag Count Limit (2026-01-23)
197
74
 
198
- facts
199
- end
200
- ```
201
-
202
- 3. **Display in CLI**:
203
- ```ruby
204
- # claude-memory stats
205
- def stats_cmd
206
- metrics = store.aggregate_metrics
207
- puts "Token Economics:"
208
- puts " Input: #{metrics[:input_tokens]} tokens"
209
- puts " Output: #{metrics[:output_tokens]} tokens"
210
- puts " Facts: #{metrics[:facts_extracted]}"
211
- puts " Efficiency: #{metrics[:facts_extracted] / metrics[:input_tokens].to_f} facts/token"
212
- end
213
- ```
214
-
215
- 4. **Add to published snapshot**:
216
- ```markdown
217
- <!-- At bottom of .claude/rules/claude_memory.generated.md -->
218
-
219
- ---
220
-
221
- *Memory stats: 145 facts from 12,500 ingested tokens (86 facts/1k tokens)*
222
- ```
75
+ **Decision**: Removed MAX_TAG_COUNT limit from ContentSanitizer.
223
76
 
224
- **Benefits**:
225
- - Visibility into memory system efficiency
226
- - Justifies API costs (shows compression ratio)
227
- - Helps tune distillation prompts for better extraction
77
+ **Rationale**:
78
+ - The regex pattern `/<tag>.*?<\/tag>/m` is provably safe from ReDoS attacks
79
+ - Non-greedy matching (`.*?`) with clear delimiters
80
+ - No nested quantifiers or alternation that could cause catastrophic backtracking
81
+ - Performance is O(n) and predictable
82
+ - Performance benchmarks show excellent speed even at scale:
83
+ - 100 tags: 0.07ms
84
+ - 200 tags: 0.13ms
85
+ - 1,000 tags: 0.64ms
86
+ - Real-world usage legitimately produces 100-200+ tags in long sessions
87
+ - System tags like `<claude-memory-context>` accumulate
88
+ - Users mark multiple sections with `<private>` tags
89
+ - The limit created false alarms and blocked legitimate ingestion
90
+ - No other similar tool (claude-mem, episodic-memory) enforces tag count limits
228
91
 
229
- **Trade-offs**:
230
- - Requires API usage tracking
231
- - Adds database complexity
232
- - May not be meaningful for all distiller implementations
92
+ **Do not reintroduce**: Tag count validation is unnecessary and harmful. If extreme input causes issues, investigate the actual root cause rather than adding arbitrary limits.
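For reference, the non-greedy pattern described in this decision can be sketched as follows. This is a minimal illustration, not the actual ContentSanitizer code: the `STRIPPED_TAGS` constant and `strip_tags` method are hypothetical names, and the tag list is assembled from the tags mentioned elsewhere in this document.

```ruby
# Hypothetical sketch; constant and method names are assumptions.
STRIPPED_TAGS = %w[private no-memory secret claude-memory-context].freeze

def strip_tags(text)
  STRIPPED_TAGS.reduce(text) do |result, tag|
    escaped = Regexp.escape(tag)
    # Non-greedy match up to the first closing tag; /m lets . span newlines
    result.gsub(/<#{escaped}>.*?<\/#{escaped}>/m, "")
  end
end

strip_tags("API key: <private>sk-abc123</private> end")
# => "API key:  end"
```

One consequence of the non-greedy match is that nested tags of the same name are not handled as a unit: the match stops at the first closing tag, so the outer closing tag is left in the output.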
233
93
 
234
94
  ---
235
95
 
236
- ## 3. Privacy Tag System
237
-
238
- ### What claude-mem Does
239
-
240
- **Dual-Tag Architecture** for content exclusion:
241
-
242
- 1. **`<claude-mem-context>`** (system tag):
243
- - Prevents recursive storage when context is auto-injected
244
- - Strips at hook layer before worker sees it
245
-
246
- 2. **`<private>`** (user tag):
247
- - Manual privacy control
248
- - Users wrap sensitive content to exclude from storage
249
- - Example: `API key: <private>sk-abc123</private>`
250
-
251
- **File**: `src/utils/tag-stripping.ts`
252
-
253
- ```typescript
254
- export function stripPrivateTags(text: string): string {
255
- const MAX_TAG_COUNT = 100; // ReDoS protection
256
-
257
- return text
258
- .replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
259
- .replace(/<private>[\s\S]*?<\/private>/g, '');
260
- }
261
- ```
262
-
263
- **Edge Processing Philosophy**: Stripping happens at hook layer (before data reaches database).
264
-
265
- ### What We Should Do
266
-
267
- **Priority**: HIGH
268
-
269
- **Implementation**:
270
-
271
- 1. **Add tag stripping to ingester**:
272
- ```ruby
273
- # lib/claude_memory/ingest/transcript_reader.rb
274
- class TranscriptReader
275
- SYSTEM_TAGS = ['claude-memory-context'].freeze
276
- USER_TAGS = ['private', 'no-memory'].freeze
277
- MAX_TAG_COUNT = 100
278
-
279
- def strip_tags(text)
280
- validate_tag_count(text)
281
-
282
- ALL_TAGS = SYSTEM_TAGS + USER_TAGS
283
- ALL_TAGS.each do |tag|
284
- text = text.gsub(/<#{tag}>.*?<\/#{tag}>/m, '')
285
- end
286
-
287
- text
288
- end
289
-
290
- def validate_tag_count(text)
291
- count = text.scan(/<(?:#{ALL_TAGS.join('|')})>/).size
292
- raise "Too many tags (#{count}), possible ReDoS" if count > MAX_TAG_COUNT
293
- end
294
- end
295
- ```
296
-
297
- 2. **Document in README**:
298
- ```markdown
299
- ## Privacy Control
300
-
301
- Wrap sensitive content in `<private>` tags to exclude from storage:
302
-
303
- ```
304
- API endpoint: https://api.example.com
305
- API key: <private>sk-abc123def456</private>
306
- ```
307
-
308
- System tags (auto-stripped):
309
- - `<claude-memory-context>` - Prevents recursive storage of published memory
310
- ```
311
-
312
- 3. **Add to hook handler**:
313
- ```ruby
314
- # lib/claude_memory/hook/handler.rb
315
- def ingest_hook
316
- input = read_stdin
317
- transcript = input[:transcript_delta]
96
+ ## Executive Summary
318
97
 
319
- # Strip tags before processing
320
- transcript = strip_privacy_tags(transcript)
98
+ This document analyzes two complementary memory systems:
321
99
 
322
- ingester.ingest(transcript)
323
- end
324
- ```
325
-
326
- 4. **Test edge cases**:
327
- ```ruby
328
- # spec/claude_memory/ingest/transcript_reader_spec.rb
329
- it "strips nested private tags" do
330
- text = "Public <private>Secret <private>Nested</private></private> Public"
331
- expect(strip_tags(text)).to eq("Public Public")
332
- end
100
+ **Claude-mem** (TypeScript/Node.js, v9.0.5) - Memory compression system with 6+ months of production usage:
101
+ - ROI Metrics tracking token costs
102
+ - Health monitoring and process management
103
+ - Configuration-driven context injection
333
104
 
334
- it "prevents ReDoS with many tags" do
335
- text = "<private>" * 101
336
- expect { strip_tags(text) }.to raise_error(/Too many tags/)
337
- end
338
- ```
105
+ **Episodic-memory** (TypeScript/Node.js, v1.0.15) - Semantic conversation search for Claude Code:
106
+ - Local vector embeddings (Transformers.js)
107
+ - Multi-concept AND search
108
+ - Automatic conversation summarization
109
+ - Tool usage tracking
110
+ - Session metadata capture
111
+ - Background sync with incremental updates
339
112
 
340
- **Benefits**:
341
- - User control over sensitive data
342
- - Prevents credential leakage
343
- - Protects recursive context injection
344
- - Security-conscious design
113
+ **Our Current Advantages**:
114
+ - Ruby ecosystem (simpler dependencies)
115
+ - Dual-database architecture (global + project scope)
116
+ - Fact-based knowledge graph (vs observation blobs or conversation exchanges)
117
+ - Truth maintenance system (conflict resolution)
118
+ - Predicate policies (single vs multi-value)
119
+ - Progressive disclosure already implemented
120
+ - Privacy tag stripping already implemented
345
121
 
346
- **Trade-offs**:
347
- - Users must remember to tag sensitive content
348
- - May create false sense of security
349
- - Regex-based (could miss edge cases)
122
+ **High-Value Opportunities from Episodic-Memory**:
123
+ - Vector embeddings for semantic search alongside FTS5
124
+ - Tool usage tracking during fact discovery
125
+ - Session metadata capture (git branch, working directory)
126
+ - Multi-concept AND search
127
+ - Background sync with incremental updates
128
+ - Enhanced statistics and reporting
350
129
 
351
130
  ---
352
131
 
353
- ## 4. Slim Orchestrator Pattern
354
-
355
- ### What claude-mem Does
356
-
357
- **Worker Service Evolution**: Refactored from 2,000 lines → 300 lines orchestrator.
358
-
359
- **File Structure**:
360
- ```
361
- src/services/worker-service.ts (300 lines - orchestrator)
362
- delegates to
363
- src/server/Server.ts (Express setup)
364
- src/services/sqlite/Database.ts (data layer)
365
- src/services/worker/ (business logic)
366
- ├── SDKAgent.ts (agent management)
367
- ├── SessionManager.ts (session lifecycle)
368
- └── search/SearchOrchestrator.ts (search strategies)
369
- src/infrastructure/ (process management)
370
- ```
371
-
372
- **Benefit**: Testability, readability, separation of concerns.
373
-
374
- ### What We Should Do
375
-
376
- **Priority**: MEDIUM
377
-
378
- **Current State**:
379
- - `lib/claude_memory/cli.rb`: 800+ lines (all commands)
380
- - Logic mixed with CLI parsing
381
- - Hard to test individual commands
382
-
383
- **Implementation**:
384
-
385
- 1. **Extract command classes**:
386
- ```ruby
387
- # lib/claude_memory/commands/
388
- ├── base_command.rb
389
- ├── ingest_command.rb
390
- ├── recall_command.rb
391
- ├── publish_command.rb
392
- ├── promote_command.rb
393
- └── sweep_command.rb
394
- ```
395
-
396
- 2. **Slim CLI to routing**:
397
- ```ruby
398
- # lib/claude_memory/cli.rb (150 lines)
399
- module ClaudeMemory
400
- class CLI
401
- def run(args)
402
- command_name = args[0]
403
- command = command_for(command_name)
404
- command.run(args[1..])
405
- end
406
-
407
- private
408
-
409
- def command_for(name)
410
- case name
411
- when "ingest" then Commands::IngestCommand.new
412
- when "recall" then Commands::RecallCommand.new
413
- # ...
414
- end
415
- end
416
- end
417
- end
418
- ```
419
-
420
- 3. **Command base class**:
421
- ```ruby
422
- # lib/claude_memory/commands/base_command.rb
423
- module ClaudeMemory
424
- module Commands
425
- class BaseCommand
426
- def initialize
427
- @store_manager = Store::StoreManager.new
428
- end
429
-
430
- def run(args)
431
- options = parse_options(args)
432
- validate_options(options)
433
- execute(options)
434
- end
435
-
436
- private
437
-
438
- def parse_options(args)
439
- raise NotImplementedError
440
- end
441
-
442
- def execute(options)
443
- raise NotImplementedError
444
- end
445
- end
446
- end
447
- end
448
- ```
449
-
450
- 4. **Example command**:
451
- ```ruby
452
- # lib/claude_memory/commands/recall_command.rb
453
- module ClaudeMemory
454
- module Commands
455
- class RecallCommand < BaseCommand
456
- def parse_options(args)
457
- OptionParser.new do |opts|
458
- opts.on("--query QUERY") { |q| options[:query] = q }
459
- opts.on("--scope SCOPE") { |s| options[:scope] = s }
460
- end.parse!(args)
461
- end
462
-
463
- def execute(options)
464
- results = Recall.search(
465
- options[:query],
466
- scope: options[:scope]
467
- )
468
- puts format_results(results)
469
- end
470
- end
471
- end
472
- end
473
- ```
474
-
475
- **Benefits**:
476
- - Each command is independently testable
477
- - CLI.rb becomes simple router
478
- - Easier to add new commands
479
- - Clear separation of parsing vs execution
480
-
481
- **Trade-offs**:
482
- - More files to navigate
483
- - Slightly more boilerplate
484
- - May be overkill for small CLI
132
+ ## Episodic-Memory Comparison
133
+
134
+ ### Architecture Overview
135
+
136
+ **Episodic-memory** focuses on **conversation-level semantic search** rather than fact extraction. Key differences:
137
+
138
+ | Feature | Episodic-Memory | ClaudeMemory |
139
+ |---------|----------------|--------------|
140
+ | **Data Model** | Conversation exchanges (user-assistant pairs) | Facts (subject-predicate-object triples) |
141
+ | **Search Method** | Vector embeddings + text search | FTS5 full-text search |
142
+ | **Embeddings** | Local Transformers.js (Xenova/all-MiniLM-L6-v2) | None (FTS5 only) |
143
+ | **Vector Storage** | sqlite-vec virtual table | N/A |
144
+ | **Scope** | Single database with project field | Dual database (global + project) |
145
+ | **Truth Maintenance** | None (keeps all conversations) | Supersession + conflict resolution |
146
+ | **Summarization** | Claude API generates summaries | N/A |
147
+ | **Tool Tracking** | Explicit tool_calls table | Mentioned in provenance text |
148
+ | **Session Metadata** | sessionId, cwd, gitBranch, claudeVersion, thinking metadata | Limited (session_id in content_items) |
149
+ | **Multi-Concept Search** | Array-based AND queries (2-5 concepts) | Single query only |
150
+ | **Incremental Sync** | Timestamp-based mtime checks | Re-processes all content |
151
+ | **Background Processing** | Async hook with --background flag | Synchronous hook execution |
152
+ | **Statistics** | Rich stats with project breakdown | Basic status command |
153
+ | **Exclusion** | Content-based markers (`<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX`) | Tag stripping (`<private>`, `<no-memory>`) |
154
+ | **Line References** | Stores line_start and line_end for each exchange | No line tracking |
155
+ | **WAL Mode** | Enabled for concurrency | Not enabled |
156
+
157
+ ### What Episodic-Memory Does Well
158
+
159
+ 1. **Semantic Search with Local Embeddings**
160
+ - Uses Transformers.js to run embedding model locally (offline-capable)
161
+ - 384-dimensional vectors from `Xenova/all-MiniLM-L6-v2`
162
+ - Hybrid vector + text search for best recall
163
+ - sqlite-vec virtual table for fast similarity queries
164
+
165
+ 2. **Multi-Concept AND Search**
166
+ - Array of 2-5 concepts that must all be present in results
167
+ - Searches each concept independently then intersects results
168
+ - Ranks by average similarity across all concepts
169
+ - Example: `["React Router", "authentication", "JWT"]`
170
+
171
+ 3. **Tool Usage Tracking**
172
+ - Dedicated `tool_calls` table with foreign key to exchanges
173
+ - Captures tool_name, tool_input, tool_result, is_error
174
+ - Tool names included in embeddings for tool-based searches
175
+ - Search results show tool usage summary
176
+
177
+ 4. **Rich Session Metadata**
178
+ - Captures: sessionId, cwd, gitBranch, claudeVersion
179
+ - Thinking metadata: level, disabled, triggers
180
+ - Conversation structure: parentUuid, isSidechain
181
+ - Enables filtering by branch, project context
182
+
183
+ 5. **Incremental Sync**
184
+ - Atomic file operations (temp file + rename)
185
+ - mtime-based change detection (only copies modified files)
186
+ - Fast subsequent syncs (seconds vs minutes)
187
+ - Safe concurrent execution
188
+
189
+ 6. **Automatic Conversation Summarization**
190
+ - Uses Claude API to generate concise summaries
191
+ - Summaries stored as `.txt` files alongside conversations
192
+ - Concurrency-limited batch processing
193
+ - Summary limit (default 10 per sync) to control API costs
194
+
195
+ 7. **Background Sync**
196
+ - `--background` flag for async processing
197
+ - SessionStart hook runs sync without blocking
198
+ - User continues working while indexing happens
199
+ - Output logged to file for debugging
200
+
201
+ 8. **Line-Range References**
202
+ - Stores line_start and line_end for each exchange
203
+ - Enables precise source linking in search results
204
+ - Supports pagination: read specific line ranges from large conversations
205
+ - Example: "Lines 10-25 in conversation.jsonl (295KB, 1247 lines)"
206
+
207
+ 9. **Statistics and Reporting**
208
+ - Total conversations, exchanges, date range
209
+ - Summary coverage tracking
210
+ - Project breakdown with top 10 projects
211
+ - Database size reporting
212
+
213
+ 10. **Exclusion Markers**
214
+ - Content-based opt-out: `<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX THIS CHAT</INSTRUCTIONS-TO-EPISODIC-MEMORY>`
215
+ - Files archived but excluded from search index
216
+ - Prevents meta-conversations from polluting index
217
+ - Use case: sensitive work, test sessions, agent conversations
218
+
219
+ 11. **WAL Mode for Concurrency**
220
+ - SQLite Write-Ahead Logging enabled
221
+ - Better concurrency for multiple readers
222
+ - Safe for concurrent sync operations
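The incremental sync in item 5 pairs two ideas: skip files whose mtime has not advanced, and write through a temp file plus rename so a reader never sees a partial copy. A minimal Ruby sketch of that combination follows. `IncrementalSyncSketch.sync_file` is a hypothetical helper written for illustration; it is not code from episodic-memory or claude_memory.

```ruby
require "fileutils"

# Hypothetical helper for illustration; not part of episodic-memory or claude_memory.
module IncrementalSyncSketch
  # Copies src to dest only if dest is missing or older than src.
  # Returns true when a copy happened, false when the file was skipped.
  def self.sync_file(src, dest)
    return false if File.exist?(dest) && File.mtime(dest) >= File.mtime(src)

    FileUtils.mkdir_p(File.dirname(dest))
    tmp = "#{dest}.tmp.#{Process.pid}"
    begin
      # preserve: true carries the source mtime over to the copy,
      # so the next run sees equal mtimes and skips the file.
      FileUtils.cp(src, tmp, preserve: true)
      # rename is atomic when tmp and dest live on the same filesystem.
      File.rename(tmp, dest)
    ensure
      FileUtils.rm_f(tmp) # no-op after a successful rename
    end
    true
  end
end
```

Calling `sync_file` twice in a row on an unchanged file should copy once and then skip, which is where the "seconds vs minutes" difference on later syncs would come from.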
223
+
224
+ ### Design Patterns Worth Adopting
225
+
226
+ 1. **Local Vector Embeddings**
227
+ - **Value**: Semantic search finds conceptually similar content even with different terminology
228
+ - **Implementation**: Add `embeddings` column to facts table, use sqlite-vec extension
229
+ - **Ruby gems**: `onnxruntime` or shell out to Python/Node.js for embeddings
230
+ - **Trade-off**: Increased storage (384 floats per fact), embedding generation time
231
+
232
+ 2. **Multi-Concept AND Search**
233
+ - **Value**: Precise queries like "find conversations about React AND authentication AND JWT"
234
+ - **Implementation**: Run multiple searches and intersect results, rank by average similarity
235
+ - **Application to facts**: Find facts matching multiple predicates or entities
236
+ - **MCP tool**: `memory.search_concepts(concepts: ["auth", "API", "security"])`
237
+
238
+ 3. **Tool Usage Tracking**
239
+ - **Value**: Know which tools were used during fact discovery (Read, Edit, Bash, etc.)
240
+ - **Implementation**: Add `tool_calls` table or JSON column in content_items
241
+ - **Schema**: `{ tool_name, tool_input, tool_result, timestamp }`
242
+ - **Use case**: "Which facts were discovered using the Bash tool?"
243
+
244
+ 4. **Session Metadata Capture**
245
+ - **Value**: Context about where/when facts were learned
246
+ - **Implementation**: Extend content_items with git_branch, cwd, claude_version columns
247
+ - **Use case**: "Show facts learned while on feature/auth branch"
248
+
249
+ 5. **Incremental Sync**
250
+ - **Value**: Faster subsequent ingestions (seconds vs minutes)
251
+ - **Implementation**: Store mtime for each content_item, skip unchanged files
252
+ - **Hook optimization**: Only process delta since last ingest
253
+
254
+ 6. **Background Processing**
255
+ - **Value**: Don't block user while processing large transcripts
256
+ - **Implementation**: Fork a process or use a background `Thread`; Ruby has no built-in async/await, so that style would need a gem such as `async`
257
+ - **Hook flag**: `claude-memory hook ingest --async`
258
+
259
+ 7. **Line-Range References in Provenance**
260
+ - **Value**: Precise source linking for fact verification
261
+ - **Implementation**: Store line_start and line_end in provenance table
262
+ - **Display**: "Fact from lines 42-56 in transcript.jsonl"
263
+
264
+ 8. **Statistics Command**
265
+ - **Value**: Visibility into memory system health
266
+ - **Implementation**: Enhance `claude-memory status` with more metrics
267
+ - **Metrics**: Facts by predicate, entities by type, provenance coverage, scope breakdown
268
+
269
+ 9. **WAL Mode**
270
+ - **Value**: Better concurrency, safer concurrent operations
271
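+ - **Implementation**: Run `PRAGMA journal_mode = WAL` in store initialization (`db.pragma('journal_mode = WAL')` is better-sqlite3 syntax; with Ruby's sqlite3 gem this would be `db.execute("PRAGMA journal_mode = WAL")`)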
272
+ - **Benefit**: Readers and the writer no longer block each other (writes are still serialized to one writer at a time)
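The multi-concept AND search described in pattern 2 (search each concept independently, intersect, rank by average similarity) can be sketched in a few lines of Ruby. This is an illustration under assumptions: `ConceptSearchSketch` and the `search:` callable, which returns a `{ id => similarity }` hash per concept, are hypothetical stand-ins for whatever search backend is in use.

```ruby
# Hypothetical sketch; the search: callable stands in for a real search backend.
module ConceptSearchSketch
  # concepts: 2-5 strings that must all match.
  # search:   callable mapping a concept to { id => similarity }.
  # Returns [[id, average_similarity], ...], best first.
  def self.search_concepts(concepts, search:)
    raise ArgumentError, "expected 2-5 concepts" unless (2..5).cover?(concepts.size)

    per_concept = concepts.map { |concept| search.call(concept) }
    # Keep only ids that appear in every per-concept result set.
    common_ids = per_concept.map(&:keys).reduce(:&)

    common_ids
      .map { |id| [id, per_concept.sum { |hits| hits[id] } / per_concept.size.to_f] }
      .sort_by { |id, avg| [-avg, id] }
  end
end

# Usage with a toy in-memory index (made-up similarity values):
index = {
  "React Router"   => { 1 => 0.9, 2 => 0.5, 3 => 0.4 },
  "authentication" => { 1 => 0.7, 2 => 0.9 },
  "JWT"            => { 1 => 0.8, 2 => 0.7, 4 => 0.9 }
}
results = ConceptSearchSketch.search_concepts(index.keys, search: ->(c) { index.fetch(c, {}) })
```

Ids 3 and 4 drop out because each is missing from at least one concept's results; only ids present in all three lists are ranked.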
485
273
 
486
274
  ---
487
275
 
488
- ## 5. Health Monitoring and Process Management
276
+ ## 1. Health Monitoring and Process Management
489
277
 
490
278
  ### What claude-mem Does
491
279
 
@@ -581,143 +369,7 @@ async function ensureWorkerHealthy(timeout = 10000) {
581
369
 
582
370
  ---
583
371
 
584
- ## 6. Semantic Shortcuts and Search Strategies
585
-
586
- ### What claude-mem Does
587
-
588
- **Semantic Shortcuts** (pre-configured queries):
589
-
590
- ```typescript
591
- // File: src/services/worker/http/routes/SearchRoutes.ts
592
- app.get('/api/decisions', (req, res) => {
593
- const results = await search({ type: 'decision' });
594
- res.json(results);
595
- });
596
-
597
- app.get('/api/changes', (req, res) => {
598
- const results = await search({ type: ['feature', 'change'] });
599
- res.json(results);
600
- });
601
-
602
- app.get('/api/how-it-works', (req, res) => {
603
- const results = await search({ type: 'how-it-works' });
604
- res.json(results);
605
- });
606
- ```
607
-
608
- **Search Strategy Pattern**:
609
-
610
- ```typescript
611
- // File: src/services/worker/search/SearchOrchestrator.ts
612
- class SearchOrchestrator {
613
- strategies: [
614
- ChromaSearchStrategy, // Vector search (if available)
615
- SQLiteSearchStrategy, // FTS5 fallback
616
- HybridSearchStrategy // Combine both
617
- ]
618
-
619
- async search(query, options) {
620
- const strategy = selectStrategy(options);
621
- return strategy.execute(query);
622
- }
623
- }
624
- ```
625
-
626
- **Fallback Logic**:
627
- 1. Try Chroma vector search (semantic)
628
- 2. Fall back to SQLite FTS5 (keyword)
629
- 3. Merge and re-rank results if both available
630
-
631
- ### What We Should Do
632
-
633
- **Priority**: MEDIUM
634
-
635
- **Implementation**:
636
-
637
- 1. **Add shortcut methods to Recall**:
638
- ```ruby
639
- # lib/claude_memory/recall.rb
640
- module ClaudeMemory
641
- class Recall
642
- class << self
643
- def recent_decisions(limit: 10)
644
- search("decision constraint rule", limit:)
645
- end
646
-
647
- def architecture_choices(limit: 10)
648
- search("uses framework implements architecture", limit:)
649
- end
650
-
651
- def conventions(limit: 20)
652
- search("convention style format pattern", scope: :global, limit:)
653
- end
654
-
655
- def project_config(limit: 10)
656
- search("uses requires depends_on", scope: :project, limit:)
657
- end
658
- end
659
- end
660
- end
661
- ```
662
-
663
- 2. **Add MCP tools for shortcuts**:
664
- ```ruby
665
- # lib/claude_memory/mcp/tools.rb
666
- TOOLS["memory.decisions"] = {
667
- description: "Quick access to architectural decisions and constraints",
668
- input_schema: { type: "object", properties: { limit: { type: "integer" } } }
669
- }
670
-
671
- TOOLS["memory.conventions"] = {
672
- description: "Quick access to coding conventions and preferences",
673
- input_schema: { type: "object", properties: { limit: { type: "integer" } } }
674
- }
675
- ```
676
-
677
- 3. **Search strategy pattern** (future: if we add vector search):
678
- ```ruby
679
- # lib/claude_memory/index/search_strategy.rb
680
- module ClaudeMemory
681
- module Index
682
- class SearchStrategy
683
- def self.select(options)
684
- if options[:semantic] && vector_db_available?
685
- VectorSearchStrategy.new
686
- else
687
- LexicalSearchStrategy.new
688
- end
689
- end
690
- end
691
-
692
- class LexicalSearchStrategy < SearchStrategy
693
- def search(query)
694
- LexicalFTS.search(query)
695
- end
696
- end
697
-
698
- class VectorSearchStrategy < SearchStrategy
699
- def search(query)
700
- # Future: vector embeddings
701
- end
702
- end
703
- end
704
- end
705
- ```
706
-
707
- **Benefits**:
708
- - Common queries are one command
709
- - Reduces cognitive load
710
- - Pre-optimized for specific use cases
711
- - Strategy pattern enables future enhancements
712
-
713
- **Trade-offs**:
714
- - Need to pick right shortcuts (user research)
715
- - May not cover all use cases
716
- - Shortcuts can become stale
717
-
718
- ---
719
-
720
- ## 7. Web-Based Viewer UI
372
+ ## 3. Web-Based Viewer UI
721
373
 
722
374
  ### What claude-mem Does
723
375
 
@@ -832,7 +484,7 @@ esbuild.build({
832
484
 
833
485
  ---
834
486
 
835
- ## 8. Dual-Integration Strategy
487
+ ## 4. Dual-Integration Strategy
836
488
 
837
489
  ### What claude-mem Does
838
490
 
@@ -920,85 +572,7 @@ mcpServer.setRequestHandler(CallToolRequestSchema, async (request) => {
920
572
 
921
573
  ---
922
574
 
923
- ## 9. Exit Code Strategy for Hooks
924
-
925
- ### What claude-mem Does
926
-
927
- **Hook Exit Code Contract**:
928
-
929
- ```typescript
930
- // Success or graceful shutdown
931
- process.exit(0); // Windows Terminal closes tab
932
-
933
- // Non-blocking error (show to user, continue)
934
- console.error("Warning: ...");
935
- process.exit(1);
936
-
937
- // Blocking error (feed to Claude for processing)
938
- console.error("ERROR: ...");
939
- process.exit(2);
940
- ```
941
-
942
- **Philosophy**: Worker/hook errors exit with 0 to prevent Windows Terminal tab accumulation.
943
-
944
- **File**: `docs/context/claude-code/exit-codes.md`
945
-
946
- ### What We Should Do
947
-
948
- **Priority**: MEDIUM (if we add hooks)
949
-
950
- **Implementation**:
951
-
952
- 1. **Define exit code constants**:
953
- ```ruby
954
- # lib/claude_memory/hook/exit_codes.rb
955
- module ClaudeMemory
956
- module Hook
957
- module ExitCodes
958
- SUCCESS = 0
959
- WARNING = 1 # Non-blocking error
960
- ERROR = 2 # Blocking error
961
- end
962
- end
963
- end
964
- ```
965
-
966
- 2. **Use in hook handler**:
967
- ```ruby
968
- # lib/claude_memory/hook/handler.rb
969
- def run
970
- handle_hook(ARGV[0])
971
- exit ExitCodes::SUCCESS
972
- rescue NonBlockingError => e
973
- warn e.message
974
- exit ExitCodes::WARNING
975
- rescue => e
976
- $stderr.puts "ERROR: #{e.message}"
977
- exit ExitCodes::ERROR
978
- end
979
- ```
980
-
981
- 3. **Document in CLAUDE.md**:
982
- ```markdown
983
- ## Hook Exit Codes
984
-
985
- - **0**: Success or graceful shutdown
986
- - **1**: Non-blocking error (shown to user, session continues)
987
- - **2**: Blocking error (fed to Claude for processing)
988
- ```
989
-
990
- **Benefits**:
991
- - Clear contract with Claude Code
992
- - Predictable behavior
993
- - Better error handling
994
-
995
- **Trade-offs**:
996
- - Hook-specific pattern
997
- - Not applicable to MCP server
998
-
999
- ---
1000
-
1001
- ## 10. Configuration-Driven Context Injection
575
+ ## 5. Configuration-Driven Context Injection
1002
576
 
1003
577
  ### What claude-mem Does
1004
578
 
@@ -1235,91 +809,343 @@ npm install better-sqlite3 # Needs node-gyp + build tools
1235
809
  - CSS/theming
1236
810
 
1237
811
  **Alternative**: CLI output is sufficient. Add web UI if users request it.
812
+ **Alternative**: CLI output is sufficient. Add web UI if users request it.
813
+
814
+ ---
815
+
816
+ ## Remaining Improvements
817
+
818
+ The following sections (6-12 from the original analysis) have been implemented and moved to the "Implemented Improvements" section above:
819
+
820
+ - ✅ Section 6: Local Vector Embeddings for Semantic Search
821
+ - ✅ Section 7: Multi-Concept AND Search
822
+ - ✅ Section 8: Tool Usage Tracking
823
+ - ✅ Section 9: Enhanced Session Metadata
824
+ - ✅ Section 10: Incremental Sync (mtime-based)
825
+ - ✅ Section 11: Enhanced Statistics and Reporting
826
+ - ✅ Section 12: WAL Mode for Better Concurrency
827
+
828
+ **For remaining unimplemented improvements, see:** [remaining_improvements.md](./remaining_improvements.md)
829
+
830
+ Key remaining items:
831
+ - Background processing for hooks (--async flag)
832
+ - ROI metrics and token economics tracking
833
+ - Structured logging
834
+ - Embed command for backfilling embeddings
835
+
836
+ ---
837
+
838
+ ## QMD-Inspired Improvements (2026-01-26)
839
+
840
+ Analysis of **QMD (Quick Markdown Search)** reveals several high-value optimizations for search quality and performance. QMD is an on-device markdown search engine with hybrid BM25 + vector + LLM reranking, achieving 50%+ Hit@3 improvement over BM25-only search.
841
+
842
+ **See detailed analysis**: [docs/influence/qmd.md](./influence/qmd.md)
843
+
844
+ ### High Priority ⭐
845
+
846
+ #### 1. **Native Vector Storage (sqlite-vec)** ⭐ CRITICAL
847
+
848
+ - **Value**: 10-100x faster KNN queries, enables larger fact databases
849
+ - **QMD Proof**: Handles 10,000+ documents with sub-second vector queries
850
+ - **Current Issue**: JSON embedding storage requires loading all facts, O(n) Ruby similarity calculation
851
+ - **Solution**: sqlite-vec extension with native C KNN queries
852
+ - **Implementation**:
853
+ - Schema migration v7: Create `facts_vec` virtual table using `vec0`
854
+ - Two-step query pattern (avoid JOINs - they hang with vec tables!)
855
+ - Update `Embeddings::Similarity` class
856
+ - Backfill existing embeddings
857
+ - **Trade-off**: Adds native dependency (acceptable, well-maintained, cross-platform)
858
+ - **Recommendation**: **ADOPT IMMEDIATELY** - This is foundational
859
+
860
+ #### 2. **Reciprocal Rank Fusion (RRF) Algorithm** ⭐ HIGH VALUE
861
+
862
+ - **Value**: 50% improvement in Hit@3 for medium-difficulty queries (QMD evaluation)
863
+ - **QMD Proof**: Evaluation suite shows consistent improvements across all query types
864
+ - **Current Issue**: Naive deduplication doesn't properly fuse ranking signals
865
+ - **Solution**: Mathematical fusion of FTS + vector ranked lists with position-aware scoring
866
+ - **Formula**: `score = Σ(weight / (k + rank + 1))` summed over every ranked list a result appears in (0-based `rank`), plus a top-rank bonus
867
+ - **Implementation**:
868
+ - Create `Recall::RRFusion` class
869
+ - Update `Recall#query_semantic_dual` to use RRF
870
+ - Apply weights: original query ×2, expanded queries ×1
871
+ - Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
872
+ - **Trade-off**: Slightly more complex than naive merging (acceptable, well-tested)
873
+ - **Recommendation**: **ADOPT IMMEDIATELY** - Pure algorithmic improvement
874
+
875
+ #### 3. **Docid Short Hash System** ⭐ MEDIUM VALUE
876
+
877
+ - **Value**: Better UX, cross-database fact references
878
+ - **QMD Proof**: Used in all output, enables `qmd get #abc123`
879
+ - **Current Issue**: Integer IDs are database-specific, not user-friendly
880
+ - **Solution**: 8-character hash IDs for facts (e.g., `#abc123de`)
881
+ - **Implementation**:
882
+ - Schema migration v8: Add `docid` column (indexed, unique)
883
+ - Backfill existing facts with SHA256-based docids
884
+ - Update CLI commands (`explain`, `recall`) to accept docids
885
+ - Update MCP tools to accept docids
886
+ - Update output formatting to show docids
887
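+ - **Trade-off**: Hash collisions possible (8 hex chars = 32 bits, so any two facts collide with probability 1 in 4.3 billion; the chance of at least one collision grows with fact count, to roughly 1% at 10,000 facts)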
888
+ - **Recommendation**: **ADOPT IN PHASE 2** - Clear UX improvement
889
+
890
+ #### 4. **Smart Expansion Detection** ⭐ MEDIUM VALUE
891
+
892
+ - **Value**: Skip unnecessary vector search when FTS finds exact match
893
+ - **QMD Proof**: Saves 2-3 seconds on 60% of queries (exact keyword matches)
894
+ - **Current Issue**: Always runs both FTS and vector search, even for exact matches
895
+ - **Solution**: Heuristic detection of strong FTS signal
896
+ - **Thresholds**: `top_score >= 0.85` AND `gap >= 0.15`
897
+ - **Implementation**:
898
+ - Create `Recall::ExpansionDetector` class
899
+ - Update `Recall#query_semantic_dual` to check before vector search
900
+ - Add optional metrics tracking (skip rate, latency saved)
901
+ - **Trade-off**: May miss semantic results for exact matches (acceptable)
902
+ - **Recommendation**: **ADOPT IN PHASE 3** - Clear performance win
903
+
904
+ ### Medium Priority
905
+
906
+ #### 5. **Document Chunking for Long Transcripts**
907
+
908
+ - **Value**: Better embeddings for long content (>3000 chars)
909
+ - **QMD Approach**: 800 tokens, 15% overlap, semantic boundary detection
910
+ - **Break Priority**: paragraph > sentence > line > word
911
+ - **Implementation**: Modify ingestion to chunk long content_items before embedding
912
+ - **Consideration**: Only if users report issues with long transcripts
913
+ - **Recommendation**: **DEFER** - Not urgent, TF-IDF handles shorter content well
914
+
915
+ #### 6. **LLM Response Caching**
916
+
917
+ - **Value**: Reduce API costs for repeated distillation
918
+ - **QMD Proof**: Hash-based caching with 80% hit rate
919
+ - **Implementation**:
920
+ - Add `llm_cache` table (hash, result, created_at)
921
+ - Cache key: `SHA256(operation + model + input)`
922
+ - Probabilistic cleanup: 1% chance per operation, keep latest 1000
923
+ - **Consideration**: Most valuable when distiller is fully implemented
924
+ - **Recommendation**: **ADOPT WHEN DISTILLER ACTIVE** - Cost savings
925
+
926
+ #### 7. **Enhanced Snippet Extraction**
927
+
928
+ - **Value**: Better search result previews with query term highlighting
929
+ - **QMD Approach**: Find line with most query term matches, extract 1 line before + 2 after
930
+ - **Implementation**: Add to `Recall` output formatting
931
+ - **Consideration**: Improves UX but not critical
932
+ - **Recommendation**: **CONSIDER** - Nice-to-have
933
+
934
+ ### Low Priority / Not Recommended
935
+
936
+ #### 8. **Neural Embeddings (EmbeddingGemma)** (DEFER)
937
+
938
+ - **QMD Model**: 300M params, 300MB download, 384 dimensions
939
+ - **Value**: Better semantic search quality (+40% Hit@3 over TF-IDF)
940
+ - **Cost**: 300MB download, 300MB VRAM, 2s cold start, complex dependency
941
+ - **Decision**: **DEFER** - TF-IDF sufficient for now, revisit if users report poor quality
942
+
943
+ #### 9. **Cross-Encoder Reranking** (REJECT)
944
+
945
+ - **QMD Model**: Qwen3-Reranker-0.6B (640MB)
946
+ - **Value**: Better ranking precision via LLM scoring
947
+ - **Cost**: 640MB model, 400ms latency per query, complex dependency
948
+ - **Decision**: **REJECT** - Over-engineering for fact retrieval
949
+
950
+ #### 10. **Query Expansion (LLM)** (REJECT)
951
+
952
+ - **QMD Model**: Qwen3-1.7B (2.2GB)
953
+ - **Value**: Generate alternative query phrasings for better recall
954
+ - **Cost**: 2.2GB model, 800ms latency per query
955
+ - **Decision**: **REJECT** - No LLM in recall path, too heavy
956
+
957
+ #### 11. **YAML Collection System** (REJECT)
958
+
959
+ - **QMD Use**: Multi-directory indexing with per-path contexts
960
+ - **Our Use**: Dual-database (global + project) already provides clean separation
961
+ - **Decision**: **REJECT** - Our approach is cleaner for our use case
962
+
963
+ #### 12. **Content-Addressable Storage** (REJECT)
964
+
965
+ - **QMD Use**: Deduplicates documents by SHA256 hash
966
+ - **Our Use**: Facts deduplicated by signature, not content hash
967
+ - **Decision**: **REJECT** - Different data model
968
+
969
+ #### 13. **Virtual Path System** (REJECT)
970
+
971
+ - **QMD Use**: `qmd://collection/path` unified namespace
972
+ - **Our Use**: Dual-database provides clear namespace
973
+ - **Decision**: **REJECT** - Unnecessary complexity
1238
974
 
1239
975
  ---
1240
976
 
1241
977
  ## Implementation Priorities
1242
978
 
1243
- ### High Priority (Next Sprint)
979
+ ### High Priority (QMD-Inspired)
1244
980
 
1245
- 1. **Progressive Disclosure Pattern** - Add index format to Recall, update MCP tools
1246
- 2. **Privacy Tag System** - Implement `<private>` tag stripping
1247
- 3. **Exit Code Strategy** - Define exit codes for future hooks
981
+ 1. **Native Vector Storage (sqlite-vec)** - 10-100x faster KNN, foundational improvement
982
+ 2. **Reciprocal Rank Fusion (RRF)** - 50% better search quality, pure algorithm
983
+ 3. **Docid Short Hashes** - Better UX for fact references
984
+ 4. **Smart Expansion Detection** - Skip unnecessary vector search when FTS is confident
1248
985
 
1249
- ### Medium Priority (Next Quarter)
986
+ ### Medium Priority
1250
987
 
1251
- 4. **ROI Metrics** - Track token economics
1252
- 5. **Slim Orchestrator Pattern** - Extract commands from CLI
1253
- 6. **Semantic Shortcuts** - Add convenience methods to Recall
1254
- 7. **Search Strategies** - Prepare for future vector search
988
+ 5. **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
989
+ 6. **ROI Metrics** - Track token economics for distillation (from claude-mem)
990
+ 7. **LLM Response Caching** - Reduce API costs (from QMD)
991
+ 8. **Document Chunking** - Better embeddings for long transcripts (from QMD, if needed)
1255
992
 
1256
- ### Low Priority (Future)
993
+ ### Low Priority
1257
994
 
1258
- 8. **Health Monitoring** - Only if we add background worker
1259
- 9. **Dual Integration** - Only if we add Claude Code hooks
1260
- 10. **Config-Driven Context** - Only if users request customization
1261
- 11. **Web Viewer UI** - Only if users request visualization
995
+ 9. **Structured Logging** - Better debugging with JSON logs
996
+ 10. **Embed Command** - Backfill embeddings for existing facts
997
+ 11. **Enhanced Snippet Extraction** - Query-aware snippet preview (from QMD)
998
+ 12. **Health Monitoring** - Only if we add background worker
999
+ 13. **Web Viewer UI** - Only if users request visualization
1000
+ 14. **Configuration-Driven Context** - Only if users request snapshot customization
1262
1001
 
1263
1002
  ---
1264
1003
 
1265
1004
  ## Migration Path
1266
1005
 
1267
- ### Phase 1: Quick Wins (1-2 weeks)
1006
+ ### Completed
1007
+
1008
+ - [x] WAL mode for better concurrency
1009
+ - [x] Enhanced statistics command
1010
+ - [x] Session metadata tracking
1011
+ - [x] Tool usage tracking
1012
+ - [x] Semantic search with TF-IDF embeddings
1013
+ - [x] Multi-concept AND search
1014
+ - [x] Incremental sync with mtime tracking
1015
+ - [x] Context-aware queries
1016
+
1017
+ ### Phase 1: Vector Storage Upgrade (from QMD) - IMMEDIATE
1018
+
1019
+ - [ ] Add sqlite-vec extension support (gem or FFI)
1020
+ - [ ] Create schema migration v7: `facts_vec` virtual table using `vec0`
1021
+ - [ ] Backfill existing embeddings from JSON to native vectors
1022
+ - [ ] Update `Embeddings::Similarity` class for native KNN (two-step query pattern)
1023
+ - [ ] Test migration on existing databases
1024
+ - [ ] Document extension installation in README
1025
+ - [ ] Benchmark: Measure KNN query improvement (expect 10-100x)
1268
1026
 
1269
- - [ ] Implement `<private>` tag stripping in ingester
1270
- - [ ] Add token count estimation to facts
1271
- - [ ] Create index format in Recall
1272
- - [ ] Add `memory.recall_index` MCP tool
1273
- - [ ] Document progressive disclosure pattern
1027
+ ### Phase 2: RRF Fusion (from QMD) - IMMEDIATE
1274
1028
 
1275
- ### Phase 2: Structural (1 month)
1029
+ - [ ] Implement `Recall::RRFusion` class with k=60 parameter
1030
+ - [ ] Update `Recall#query_semantic_dual` to use RRF fusion
1031
+ - [ ] Apply weights: original query ×2, expanded queries ×1
1032
+ - [ ] Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
1033
+ - [ ] Test with synthetic ranked lists (unit tests)
1034
+ - [ ] Validate improvements with real queries
1276
1035
 
1277
- - [ ] Extract command classes from CLI
1278
- - [ ] Add metrics table for token tracking
1279
- - [ ] Implement semantic shortcuts
1280
- - [ ] Add search strategy pattern (prep for vector search)
1036
+ ### Phase 3: UX Improvements (from QMD) - NEAR-TERM
1281
1037
 
1282
- ### Phase 3: Advanced (3+ months)
1038
+ - [ ] Schema migration v8: Add `docid` column (8-char hash, indexed, unique)
1039
+ - [ ] Backfill existing facts with SHA256-based docids
1040
+ - [ ] Update CLI commands to accept/display docids (`ExplainCommand`, `RecallCommand`)
1041
+ - [ ] Update MCP tools for docid support (`memory.explain`, `memory.recall`)
1042
+ - [ ] Test cross-database docid lookups
1283
1043
 
1284
- - [ ] Add vector embeddings (if requested)
1285
- - [ ] Build web viewer (if requested)
1286
- - [ ] Add Claude Code hooks (if requested)
1287
- - [ ] Implement background worker (if needed)
1044
+ ### Phase 4: Performance Optimizations (from QMD) - NEAR-TERM
1045
+
1046
+ - [ ] Implement `Recall::ExpansionDetector` class
1047
+ - [ ] Update `Recall#query_semantic_dual` to check before vector search
1048
+ - [ ] Add metrics tracking (skip rate, avg latency saved)
1049
+ - [ ] Tune thresholds based on usage patterns
1050
+
1051
+ ### Remaining Tasks
1052
+
1053
+ - [ ] Background processing (--async flag for hooks)
1054
+ - [ ] LLM response caching (from QMD, when distiller is active)
1055
+ - [ ] Structured logging implementation
1056
+ - [ ] Embed command for backfilling embeddings
1057
+
1058
+ ### Future (If Requested)
1059
+
1060
+ - [ ] Document chunking for long transcripts (from QMD, if users report issues)
1061
+ - [ ] Enhanced snippet extraction (from QMD, for better search result previews)
1062
+ - [ ] Build web viewer (if users request visualization)
1063
+ - [ ] Add HTTP-based health checks (if background worker is added)
1064
+ - [ ] Configuration-driven snapshot generation (if users request customization)
1288
1065
 
1289
1066
  ---
1290
1067
 
1291
1068
  ## Key Takeaways
1292
1069
 
1293
- **What claude-mem does exceptionally well**:
1294
- 1. Progressive disclosure (token efficiency)
1295
- 2. ROI metrics (visibility)
1296
- 3. Privacy controls (user trust)
1297
- 4. Clean architecture (maintainability)
1298
- 5. Production polish (error handling, logging, health checks)
1299
-
1300
- **What we do better**:
1301
- 1. Dual-database architecture (global + project)
1302
- 2. Fact-based knowledge graph (structured)
1303
- 3. Truth maintenance (conflict resolution)
1304
- 4. Predicate policies (semantic understanding)
1305
- 5. Simpler dependencies (Ruby ecosystem)
1306
-
1307
- **Our path forward**:
1308
- - Adopt their token efficiency patterns
1309
- - Keep our knowledge graph architecture
1310
- - Add privacy controls
1311
- - Improve observability (metrics)
1312
- - Maintain simplicity (avoid over-engineering)
1070
+ ### Successfully Adopted from claude-mem
1071
+
1072
+ 1. Progressive disclosure (token-efficient retrieval)
1073
+ 2. Privacy controls (tag-based content exclusion)
1074
+ 3. Clean architecture (command pattern, slim CLI)
1075
+ 4. Semantic shortcuts (decisions, conventions, architecture)
1076
+ 5. Exit code strategy (hook error handling)
1077
+ 6. ROI metrics tracking (token economics for distillation efficiency)
1078
+
1079
+ ### Successfully Adopted from Episodic-Memory ✓
1080
+
1081
+ 1. **WAL Mode** - Better concurrency with Write-Ahead Logging
1082
+ 2. **Tool Usage Tracking** - Dedicated table tracking which tools discovered facts
1083
+ 3. **Incremental Sync** - mtime-based change detection for fast re-ingestion
1084
+ 4. **Session Metadata** - Context capture (git branch, cwd, Claude version)
1085
+ 5. **Local Vector Embeddings** - TF-IDF semantic search alongside FTS5
1086
+ 6. **Multi-Concept AND Search** - Precise queries matching 2-5 concepts simultaneously
1087
+ 7. **Enhanced Statistics** - Comprehensive reporting on facts, entities, provenance
1088
+ 8. **Context-Aware Queries** - Filter by branch, directory, or tools used
1089
+
1090
+ ### Our Unique Advantages
1091
+
1092
+ 1. **Dual-database architecture** - Global + project scopes
1093
+ 2. **Fact-based knowledge graph** - Structured facts rather than blob observations or raw conversation exchanges
1094
+ 3. **Truth maintenance** - Conflict resolution and supersession
1095
+ 4. **Predicate policies** - Single vs multi-value semantics
1096
+ 5. **Ruby ecosystem** - Simpler dependencies, easier install
1097
+ 6. **Lightweight embeddings** - No external dependencies (TF-IDF vs Transformers.js)
1098
+
1099
+ ### Remaining Opportunities
1100
+
1101
+ - **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
1102
+ - **ROI Metrics** - Track token economics for distillation (from claude-mem)
1103
+ - **Structured Logging** - JSON-formatted logs for debugging
1104
+ - **Embed Command** - Backfill embeddings for existing facts
1105
+ - **Health Monitoring** - Only if we add background worker
1106
+ - **Web Viewer UI** - Only if users request visualization
1107
+ - **Configuration-Driven Context** - Only if users request snapshot customization
1108
+
1109
+ ---
1110
+
1111
+ ## Comparison Summary
1112
+
1113
+ **Episodic-memory** and **claude_memory** serve complementary but different needs:
1114
+
1115
+ **Episodic-memory** excels at:
1116
+ - Semantic conversation search with local embeddings
1117
+ - Preserving complete conversation context
1118
+ - Multi-concept AND queries
1119
+ - Fast incremental sync
1120
+ - Tool usage tracking
1121
+ - Rich session metadata
1122
+
1123
+ **ClaudeMemory** excels at:
1124
+ - Structured fact extraction and storage
1125
+ - Truth maintenance and conflict resolution
1126
+ - Dual-scope architecture (global vs project)
1127
+ - Knowledge graph with provenance
1128
+ - Semantic shortcuts for common queries
1129
+
1130
+ **Best of both worlds (achieved)**:
1131
+ - ✅ Added vector embeddings for semantic search (TF-IDF based)
1132
+ - ✅ Kept fact-based knowledge graph for structured queries
1133
+ - ✅ Adopted incremental sync and tool tracking from episodic-memory
1134
+ - ✅ Maintained truth maintenance and conflict resolution
1135
+ - ✅ Added session metadata for richer context
1136
+ - ✅ Implemented multi-concept AND search
1137
+ - ✅ Enhanced statistics and reporting
1313
1138
 
1314
1139
  ---
1315
1140
 
1316
1141
  ## References
1317
1142
 
1318
- - [claude-mem GitHub](https://github.com/thedotmack/claude-mem)
1319
- - [Architecture Evolution](../claude-mem/docs/public/architecture-evolution.mdx)
1320
- - [Progressive Disclosure Philosophy](../claude-mem/docs/public/progressive-disclosure.mdx)
1321
- - [ClaudeMemory Updated Plan](updated_plan.md)
1143
+ - [episodic-memory GitHub](https://github.com/obra/episodic-memory) - Semantic conversation search
1144
+ - [claude-mem GitHub](https://github.com/thedotmack/claude-mem) - Memory compression system
1145
+ - [ClaudeMemory Updated Plan](updated_plan.md) - Original improvement plan
1322
1146
 
1323
1147
  ---
1324
1148
 
1325
- *This analysis represents a critical review of production-grade patterns that have proven effective in real-world usage. Our goal is to learn from claude-mem's strengths while preserving the unique advantages of our fact-based approach.*
1149
+ *This document has been updated to reflect completed implementations. Fourteen major improvements have been successfully integrated: 6 from claude-mem and 8 from episodic-memory. ClaudeMemory now combines the best of both systems while maintaining its unique advantages in fact-based knowledge representation and truth maintenance.*
1150
+
1151
+ *Last updated: 2026-01-26 - Added ROI metrics tracking for distillation token economics*