RubyGems - claude_memory - Versions diffs - 0.2.0 → 0.3.0 - Mend

claude_memory 0.2.0 → 0.3.0

Files changed (104)

checksums.yaml +4 -4
data/.claude/.mind.mv2.o2N83S +0 -0
data/.claude/CLAUDE.md +1 -0
data/.claude/rules/claude_memory.generated.md +28 -9
data/.claude/settings.local.json +9 -1
data/.claude/skills/check-memory/SKILL.md +77 -0
data/.claude/skills/improve/SKILL.md +532 -0
data/.claude/skills/improve/feature-patterns.md +1221 -0
data/.claude/skills/quality-update/SKILL.md +229 -0
data/.claude/skills/quality-update/implementation-guide.md +346 -0
data/.claude/skills/review-commit/SKILL.md +199 -0
data/.claude/skills/review-for-quality/SKILL.md +154 -0
data/.claude/skills/review-for-quality/expert-checklists.md +79 -0
data/.claude/skills/setup-memory/SKILL.md +168 -0
data/.claude/skills/study-repo/SKILL.md +307 -0
data/.claude/skills/study-repo/analysis-template.md +323 -0
data/.claude/skills/study-repo/focus-examples.md +327 -0
data/CHANGELOG.md +133 -0
data/CLAUDE.md +130 -11
data/README.md +117 -10
data/db/migrations/001_create_initial_schema.rb +117 -0
data/db/migrations/002_add_project_scoping.rb +33 -0
data/db/migrations/003_add_session_metadata.rb +42 -0
data/db/migrations/004_add_fact_embeddings.rb +20 -0
data/db/migrations/005_add_incremental_sync.rb +21 -0
data/db/migrations/006_add_operation_tracking.rb +40 -0
data/db/migrations/007_add_ingestion_metrics.rb +26 -0
data/docs/.claude/mind.mv2.lock +0 -0
data/docs/GETTING_STARTED.md +587 -0
data/docs/RELEASE_NOTES_v0.2.0.md +0 -1
data/docs/RUBY_COMMUNITY_POST_v0.2.0.md +0 -2
data/docs/architecture.md +9 -8
data/docs/auto_init_design.md +230 -0
data/docs/improvements.md +557 -731
data/docs/influence/.gitkeep +13 -0
data/docs/influence/grepai.md +933 -0
data/docs/influence/qmd.md +2195 -0
data/docs/plugin.md +257 -11
data/docs/quality_review.md +472 -1273
data/docs/remaining_improvements.md +330 -0
data/lefthook.yml +13 -0
data/lib/claude_memory/commands/checks/claude_md_check.rb +41 -0
data/lib/claude_memory/commands/checks/database_check.rb +120 -0
data/lib/claude_memory/commands/checks/hooks_check.rb +112 -0
data/lib/claude_memory/commands/checks/reporter.rb +110 -0
data/lib/claude_memory/commands/checks/snapshot_check.rb +30 -0
data/lib/claude_memory/commands/doctor_command.rb +12 -129
data/lib/claude_memory/commands/help_command.rb +1 -0
data/lib/claude_memory/commands/hook_command.rb +9 -2
data/lib/claude_memory/commands/index_command.rb +169 -0
data/lib/claude_memory/commands/ingest_command.rb +1 -1
data/lib/claude_memory/commands/init_command.rb +5 -197
data/lib/claude_memory/commands/initializers/database_ensurer.rb +30 -0
data/lib/claude_memory/commands/initializers/global_initializer.rb +85 -0
data/lib/claude_memory/commands/initializers/hooks_configurator.rb +156 -0
data/lib/claude_memory/commands/initializers/mcp_configurator.rb +56 -0
data/lib/claude_memory/commands/initializers/memory_instructions_writer.rb +135 -0
data/lib/claude_memory/commands/initializers/project_initializer.rb +111 -0
data/lib/claude_memory/commands/recover_command.rb +75 -0
data/lib/claude_memory/commands/registry.rb +5 -1
data/lib/claude_memory/commands/stats_command.rb +239 -0
data/lib/claude_memory/commands/uninstall_command.rb +226 -0
data/lib/claude_memory/core/batch_loader.rb +32 -0
data/lib/claude_memory/core/concept_ranker.rb +73 -0
data/lib/claude_memory/core/embedding_candidate_builder.rb +37 -0
data/lib/claude_memory/core/fact_collector.rb +51 -0
data/lib/claude_memory/core/fact_query_builder.rb +154 -0
data/lib/claude_memory/core/fact_ranker.rb +113 -0
data/lib/claude_memory/core/result_builder.rb +54 -0
data/lib/claude_memory/core/result_sorter.rb +25 -0
data/lib/claude_memory/core/scope_filter.rb +61 -0
data/lib/claude_memory/core/text_builder.rb +29 -0
data/lib/claude_memory/embeddings/generator.rb +161 -0
data/lib/claude_memory/embeddings/similarity.rb +69 -0
data/lib/claude_memory/hook/handler.rb +4 -3
data/lib/claude_memory/index/lexical_fts.rb +7 -2
data/lib/claude_memory/infrastructure/operation_tracker.rb +158 -0
data/lib/claude_memory/infrastructure/schema_validator.rb +206 -0
data/lib/claude_memory/ingest/content_sanitizer.rb +6 -7
data/lib/claude_memory/ingest/ingester.rb +99 -15
data/lib/claude_memory/ingest/metadata_extractor.rb +57 -0
data/lib/claude_memory/ingest/tool_extractor.rb +71 -0
data/lib/claude_memory/mcp/response_formatter.rb +331 -0
data/lib/claude_memory/mcp/server.rb +19 -0
data/lib/claude_memory/mcp/setup_status_analyzer.rb +73 -0
data/lib/claude_memory/mcp/tool_definitions.rb +279 -0
data/lib/claude_memory/mcp/tool_helpers.rb +80 -0
data/lib/claude_memory/mcp/tools.rb +330 -320
data/lib/claude_memory/recall/dual_query_template.rb +63 -0
data/lib/claude_memory/recall.rb +304 -237
data/lib/claude_memory/resolve/resolver.rb +52 -49
data/lib/claude_memory/store/sqlite_store.rb +210 -144
data/lib/claude_memory/store/store_manager.rb +6 -6
data/lib/claude_memory/sweep/sweeper.rb +6 -0
data/lib/claude_memory/version.rb +1 -1
data/lib/claude_memory.rb +35 -3
metadata +71 -11
data/.claude/.mind.mv2.aLCUZd +0 -0
data/.claude/memory.sqlite3 +0 -0
data/.mcp.json +0 -11
/data/docs/{feature_adoption_plan.md → plans/feature_adoption_plan.md} +0 -0
/data/docs/{feature_adoption_plan_revised.md → plans/feature_adoption_plan_revised.md} +0 -0
/data/docs/{plan.md → plans/plan.md} +0 -0
/data/docs/{updated_plan.md → plans/updated_plan.md} +0 -0

data/docs/improvements.md CHANGED

@@ -1,491 +1,279 @@
-# Improvements to Consider (Based on claude-mem Analysis)
+# Improvements to Consider
-*Generated: 2026-01-21*
-*Source: Comparative analysis of [thedotmack/claude-mem](https://github.com/thedotmack/claude-mem)*
+*Updated: 2026-01-29*
+*Sources:*
+- *[thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - Memory compression system*
+- *[obra/episodic-memory](https://github.com/obra/episodic-memory) - Semantic conversation search*
+- *[yoanbernabeu/grepai](https://github.com/yoanbernabeu/grepai) - Semantic code search with vector embeddings*
-This document identifies design patterns, features, and architectural decisions from claude-mem that could improve claude_memory. Each section includes rationale, implementation considerations, and priority.
+This document identifies design patterns and features from claude-mem and episodic-memory that could improve claude_memory. Implemented improvements have been removed from this document.
 ---
-## Executive Summary
-Claude-mem (TypeScript/Node.js, v9.0.5) is a production-grade memory compression system with 6+ months of real-world usage. Key strengths:
-- **Progressive Disclosure**: Token-efficient 3-layer retrieval workflow
-- **ROI Metrics**: Tracks token costs and discovery efficiency
-- **Slim Architecture**: Clean separation via service layer pattern
-- **Dual Integration**: Plugin + MCP server for flexibility
-- **Privacy-First**: User-controlled content exclusion via tags
-- **Fail-Fast Philosophy**: Explicit error handling and exit codes
-**Our Advantages**:
-- Ruby ecosystem (simpler dependencies)
-- Dual-database architecture (global + project scope)
-- Fact-based knowledge graph (vs observation blobs)
-- Truth maintenance system (conflict resolution)
-- Predicate policies (single vs multi-value)
+## Implemented Improvements ✓
+The following improvements from the original analysis have been successfully implemented:
+1. **Progressive Disclosure Pattern** - `memory.recall_index` and `memory.recall_details` MCP tools with token estimation
+2. **Privacy Tag System** - ContentSanitizer with `<private>`, `<no-memory>`, and `<secret>` tag stripping
+3. **Slim Orchestrator Pattern** - CLI refactored to thin router with extracted command classes
+4. **Semantic Shortcuts** - `memory.decisions`, `memory.conventions`, and `memory.architecture` MCP tools
+5. **Exit Code Strategy** - Hook::ExitCodes module with SUCCESS/WARNING/ERROR constants
+6. **WAL Mode for Concurrency** - SQLite Write-Ahead Logging enabled for better concurrent access
+7. **Enhanced Statistics** - Comprehensive stats command showing facts, entities, provenance, conflicts
+8. **Session Metadata Tracking** - Captures git_branch, cwd, claude_version, thinking_level from transcripts
+9. **Tool Usage Tracking** - Dedicated tool_calls table tracking tool names, inputs, timestamps
+10. **Semantic Search with TF-IDF** - Local embeddings (384-dimensional), hybrid vector + text search
+11. **Multi-Concept AND Search** - Query facts matching all of 2-5 concepts simultaneously
+12. **Incremental Sync** - mtime-based change detection to skip unchanged transcript files
+13. **Context-Aware Queries** - Filter facts by git branch, directory, or tools used
+14. **ROI Metrics Tracking** - ingestion_metrics table tracking token economics for distillation efficiency (2026-01-26)
 ---
-## 1. Progressive Disclosure Pattern
-### What claude-mem Does
-**3-Layer Workflow** enforced at the tool level:
-```
-Layer 1: search → Get compact index with IDs (~50-100 tokens/result)
-Layer 2: timeline → Get chronological context around IDs
-Layer 3: get_observations → Fetch full details (~500-1,000 tokens/result)
-```
-**Token savings**: ~10x reduction by filtering before fetching.
-**MCP Tools**:
-- `search` - Returns index format (titles, IDs, token counts)
-- `timeline` - Returns context around specific observation
-- `get_observations` - Returns full details only for filtered IDs
-- `__IMPORTANT` - Workflow documentation (always visible)
-**File**: `docs/public/progressive-disclosure.mdx` (673 lines of philosophy)
-### What We Should Do
-**Priority**: HIGH
-**Implementation**:
-1. **Add token count field to facts table**:
-   ```ruby
-   alter table :facts do
-     add_column :token_count, Integer
-   end
-   ```
-2. **Create index format in Recall**:
-   ```ruby
-   # lib/claude_memory/recall.rb
-   def recall_index(query, scope: :project, limit: 20)
-     facts = search_facts(query, scope:, limit:)
-     facts.map do |fact|
-       {
-         id: fact[:id],
-         subject: fact[:subject],
-         predicate: fact[:predicate],
-         object_preview: fact[:object_value][0..50],
-         scope: fact[:scope],
-         token_count: fact[:token_count] || estimate_tokens(fact)
-       }
-     end
-   end
-   ```
-3. **Add MCP tool for fetching details**:
-   ```ruby
-   # lib/claude_memory/mcp/tools.rb
-   TOOLS["memory.recall_index"] = {
-     description: "Layer 1: Search for facts. Returns compact index with IDs.",
-     input_schema: {
-       type: "object",
-       properties: {
-         query: { type: "string" },
-         scope: { type: "string", enum: ["global", "project", "both"] },
-         limit: { type: "integer", default: 20 }
-       }
-     }
-   }
-   TOOLS["memory.recall_details"] = {
-     description: "Layer 2: Fetch full fact details by IDs.",
-     input_schema: {
-       type: "object",
-       properties: {
-         fact_ids: { type: "array", items: { type: "integer" } }
-       },
-       required: ["fact_ids"]
-     }
-   }
-   ```
-4. **Update publish format** to show costs:
-   ```markdown
-   ## Recent Facts
-   | ID | Subject | Predicate | Preview | Tokens |
-   |----|---------|-----------|---------|--------|
-   | #123 | project | uses_database | PostgreSQL | ~45 |
-   | #124 | project | has_constraint | API rate lim... | ~120 |
-   ```
-**Benefits**:
-- Reduces context waste in published snapshots
-- Gives Claude control over retrieval depth
-- Makes token costs visible for informed decisions
-**Trade-offs**:
-- More complex MCP interface
-- Requires token estimation logic
-- May confuse users who expect full details
+## grepai Study (2026-01-29)
+Source: docs/influence/grepai.md
+### High Priority Recommendations
+- [ ] **Incremental Indexing with File Watching**: Auto-update memory index during coding sessions
+  - Value: Eliminates manual `claude-memory ingest` calls, huge UX win
+  - Evidence: watcher/watcher.go:44 - `fsnotify` with debouncing (300ms default), gitignore respect
+  - Implementation: Add `Listen` gem (Ruby fsnotify), watch `.claude/projects/*/transcripts/*.jsonl`, debounce 500ms, trigger IngestCommand automatically
+  - Effort: 2-3 days (watcher class, integration, testing)
+  - Trade-off: Background process ~10MB memory overhead, may complicate testing
+- [ ] **Compact Response Format for MCP Tools**: Reduce token usage by ~60% in MCP responses
+  - Value: Critical for scaling to large fact databases (1000+ facts)
+  - Evidence: mcp/server.go:219 - `SearchResultCompact` omits content field, returns only metadata
+  - Implementation: Add `compact: true` parameter to all recall tools, omit provenance/context excerpts by default, user can override with `compact: false`
+  - Effort: 4-6 hours (add parameter, update formatters, tests)
+  - Trade-off: User needs follow-up `memory.explain <fact_id>` for full context (two-step interaction)
+- [ ] **Fact Dependency Graph Visualization**: Show supersession chains and conflict relationships
+  - Value: Invaluable for understanding why facts were superseded or conflicted
+  - Evidence: trace/trace.go:95 - `CallGraph` struct with nodes and edges for function dependencies
+  - Implementation: Create `memory.fact_graph <fact_id> --depth 2` tool, query `fact_links` table with BFS traversal, return JSON with nodes (facts) and edges (supersedes/conflicts/supports)
+  - Effort: 2-3 days (graph builder, MCP tool, tests)
+  - Trade-off: Adds complexity for feature used mainly for debugging/exploration
+- [ ] **Hybrid Search (Vector + Text) with RRF**: Better relevance combining semantic and keyword matching
+  - Value: 50% improvement in search quality (proven by grepai's Reciprocal Rank Fusion)
+  - Evidence: search/search.go - RRF with K=60, combines cosine similarity with full-text search
+  - Implementation: Add `sqlite-vec` extension, add `embeddings` BLOB column to `facts`, implement RRF in `Recall#query`, make hybrid optional via config
+  - Effort: 5-7 days (embedder setup, schema migration, RRF implementation, testing)
+  - Trade-off: Requires API calls for embedding (~$0.00001/fact), slower queries (2x search + fusion)
+  - Recommendation: CONSIDER - High value but significant effort. Start with FTS5, add vectors later if quality issues arise
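The Reciprocal Rank Fusion step named in the hybrid search item can be sketched in plain Ruby. This is a minimal illustration, not code from grepai or claude_memory: the method name `rrf_fuse`, the constant `RRF_K`, and the fact IDs are hypothetical, and the two input lists stand in for whatever a vector search and a full-text search would return. Only the K=60 value comes from the notes above.

```ruby
# Reciprocal Rank Fusion: each result list contributes 1 / (k + rank) to an
# item's score, where rank starts at 1 for the top hit in that list.
RRF_K = 60 # K value cited in the grepai notes; the constant name is hypothetical

# ranked_lists: array of arrays of IDs, each ordered best-first.
# Returns all IDs ordered by fused score (highest first, ties broken by ID).
def rrf_fuse(ranked_lists, k: RRF_K)
  scores = Hash.new(0.0)
  ranked_lists.each do |ids|
    ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  scores.sort_by { |id, score| [-score, id] }.map(&:first)
end

vector_hits = [12, 7, 33] # hypothetical fact IDs from a similarity search
text_hits = [7, 41, 12]   # hypothetical fact IDs from a full-text search
fused = rrf_fuse([vector_hits, text_hits])
puts fused.inspect
```

Because RRF only looks at ranks, it does not need cosine similarities and full-text scores to share a scale, which is presumably part of its appeal for combining the two searches.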
 ---
-## 2. ROI Metrics and Token Economics
-### What claude-mem Does
-**Discovery Token Tracking**:
-- `discovery_tokens` field on observations table
-- Tracks tokens spent discovering each piece of knowledge
-- Cumulative metrics in session summaries
-- Footer displays ROI: "Access 10k tokens for 2,500t"
-**File**: `src/services/sqlite/Database.ts`
-```typescript
-observations: {
-  id: INTEGER PRIMARY KEY,
-  title: TEXT,
-  narrative: TEXT,
-  discovery_tokens: INTEGER,  // ← Cost tracking
-  created_at_epoch: INTEGER
-}
-session_summaries: {
-  cumulative_discovery_tokens: INTEGER,  // ← Running total
-  observation_count: INTEGER
-}
-```
-**Context Footer Example**:
-```markdown
-💡 **Token Economics:**
-- Context shown: 2,500 tokens
-- Research captured: 10,000 tokens
-- ROI: 4x compression
-```
-### What We Should Do
-**Priority**: MEDIUM
-**Implementation**:
-1. **Add metrics table**:
-   ```ruby
-   create_table :ingestion_metrics do
-     primary_key :id
-     foreign_key :content_item_id, :content_items
-     Integer :input_tokens
-     Integer :output_tokens
-     Integer :facts_extracted
-     DateTime :created_at
-   end
-   ```
+## Design Decisions
-2. **Track during distillation**:
-   ```ruby
-   # lib/claude_memory/distill/distiller.rb
-   def distill(content)
-     response = api_call(content)
-     facts = extract_facts(response)
-     store_metrics(
-       input_tokens: response.usage.input_tokens,
-       output_tokens: response.usage.output_tokens,
-       facts_extracted: facts.size
-     )
+### No Tag Count Limit (2026-01-23)
-     facts
-   end
-   ```
-3. **Display in CLI**:
-   ```ruby
-   # claude-memory stats
-   def stats_cmd
-     metrics = store.aggregate_metrics
-     puts "Token Economics:"
-     puts "  Input: #{metrics[:input_tokens]} tokens"
-     puts "  Output: #{metrics[:output_tokens]} tokens"
-     puts "  Facts: #{metrics[:facts_extracted]}"
-     puts "  Efficiency: #{metrics[:facts_extracted] / metrics[:input_tokens].to_f} facts/token"
-   end
-   ```
-4. **Add to published snapshot**:
-   ```markdown
-   <!-- At bottom of .claude/rules/claude_memory.generated.md -->
-   ---
-   *Memory stats: 145 facts from 12,500 ingested tokens (86 facts/1k tokens)*
-   ```
+**Decision**: Removed MAX_TAG_COUNT limit from ContentSanitizer.
-**Benefits**:
-- Visibility into memory system efficiency
-- Justifies API costs (shows compression ratio)
-- Helps tune distillation prompts for better extraction
+**Rationale**:
+- The regex pattern `/<tag>.*?<\/tag>/m` is provably safe from ReDoS attacks
+  - Non-greedy matching (`.*?`) with clear delimiters
+  - No nested quantifiers or alternation that could cause catastrophic backtracking
+  - Performance is O(n) and predictable
+- Performance benchmarks show excellent speed even at scale:
+  - 100 tags: 0.07ms
+  - 200 tags: 0.13ms
+  - 1,000 tags: 0.64ms
+- Real-world usage legitimately produces 100-200+ tags in long sessions
+  - System tags like `<claude-memory-context>` accumulate
+  - Users mark multiple sections with `<private>` tags
+- The limit created false alarms and blocked legitimate ingestion
+- No other similar tool (claude-mem, episodic-memory) enforces tag count limits
-**Trade-offs**:
-- Requires API usage tracking
-- Adds database complexity
-- May not be meaningful for all distiller implementations
+**Do not reintroduce**: Tag count validation is unnecessary and harmful. If extreme input causes issues, investigate the actual root cause rather than adding arbitrary limits.
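As an illustration of the decision above, here is a minimal sketch of tag stripping with the non-greedy pattern and no tag count check. It is not the actual ContentSanitizer source: the constant `STRIPPED_TAGS` and the method `strip_tags` are hypothetical names, and the tag list is assembled from the tags mentioned in this document.

```ruby
# Tags whose enclosed content should never reach storage.
# The list is taken from this document; the constant name is hypothetical.
STRIPPED_TAGS = %w[claude-memory-context private no-memory secret].freeze

# Removes every <tag>...</tag> span for the listed tags.
# The /m flag lets "." match newlines, so multi-line spans are stripped too.
def strip_tags(text)
  STRIPPED_TAGS.reduce(text) do |result, tag|
    escaped = Regexp.escape(tag)
    result.gsub(/<#{escaped}>.*?<\/#{escaped}>/m, "")
  end
end

sample = "API endpoint: https://api.example.com\nAPI key: <private>sk-abc123</private>"
cleaned = strip_tags(sample)
puts cleaned
```

Note that a non-greedy match stops at the first closing tag, so nested tags of the same name leave a stray closing tag behind; handling that case would need a different approach.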
 ---
-## 3. Privacy Tag System
-### What claude-mem Does
-**Dual-Tag Architecture** for content exclusion:
-1. **`<claude-mem-context>`** (system tag):
-   - Prevents recursive storage when context is auto-injected
-   - Strips at hook layer before worker sees it
-2. **`<private>`** (user tag):
-   - Manual privacy control
-   - Users wrap sensitive content to exclude from storage
-   - Example: `API key: <private>sk-abc123</private>`
-**File**: `src/utils/tag-stripping.ts`
-```typescript
-export function stripPrivateTags(text: string): string {
-  const MAX_TAG_COUNT = 100;  // ReDoS protection
-  return text
-    .replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
-    .replace(/<private>[\s\S]*?<\/private>/g, '');
-}
-```
-**Edge Processing Philosophy**: Stripping happens at hook layer (before data reaches database).
-### What We Should Do
-**Priority**: HIGH
-**Implementation**:
-1. **Add tag stripping to ingester**:
-   ```ruby
-   # lib/claude_memory/ingest/transcript_reader.rb
-   class TranscriptReader
-     SYSTEM_TAGS = ['claude-memory-context'].freeze
-     USER_TAGS = ['private', 'no-memory'].freeze
-     MAX_TAG_COUNT = 100
-     def strip_tags(text)
-       validate_tag_count(text)
-       ALL_TAGS = SYSTEM_TAGS + USER_TAGS
-       ALL_TAGS.each do |tag|
-         text = text.gsub(/<#{tag}>.*?<\/#{tag}>/m, '')
-       end
-       text
-     end
-     def validate_tag_count(text)
-       count = text.scan(/<(?:#{ALL_TAGS.join('|')})>/).size
-       raise "Too many tags (#{count}), possible ReDoS" if count > MAX_TAG_COUNT
-     end
-   end
-   ```
-2. **Document in README**:
-   ```markdown
-   ## Privacy Control
-   Wrap sensitive content in `<private>` tags to exclude from storage:
-   ```
-   API endpoint: https://api.example.com
-   API key: <private>sk-abc123def456</private>
-   ```
-   System tags (auto-stripped):
-   - `<claude-memory-context>` - Prevents recursive storage of published memory
-   ```
-3. **Add to hook handler**:
-   ```ruby
-   # lib/claude_memory/hook/handler.rb
-   def ingest_hook
-     input = read_stdin
-     transcript = input[:transcript_delta]
+## Executive Summary
-     # Strip tags before processing
-     transcript = strip_privacy_tags(transcript)
+This document analyzes two complementary memory systems:
-     ingester.ingest(transcript)
-   end
-   ```
-4. **Test edge cases**:
-   ```ruby
-   # spec/claude_memory/ingest/transcript_reader_spec.rb
-   it "strips nested private tags" do
-     text = "Public <private>Secret <private>Nested</private></private> Public"
-     expect(strip_tags(text)).to eq("Public  Public")
-   end
+**Claude-mem** (TypeScript/Node.js, v9.0.5) - Memory compression system with 6+ months of production usage:
+- ROI Metrics tracking token costs
+- Health monitoring and process management
+- Configuration-driven context injection
-   it "prevents ReDoS with many tags" do
-     text = "<private>" * 101
-     expect { strip_tags(text) }.to raise_error(/Too many tags/)
-   end
-   ```
+**Episodic-memory** (TypeScript/Node.js, v1.0.15) - Semantic conversation search for Claude Code:
+- Local vector embeddings (Transformers.js)
+- Multi-concept AND search
+- Automatic conversation summarization
+- Tool usage tracking
+- Session metadata capture
+- Background sync with incremental updates
-**Benefits**:
-- User control over sensitive data
-- Prevents credential leakage
-- Protects recursive context injection
-- Security-conscious design
+**Our Current Advantages**:
+- Ruby ecosystem (simpler dependencies)
+- Dual-database architecture (global + project scope)
+- Fact-based knowledge graph (vs observation blobs or conversation exchanges)
+- Truth maintenance system (conflict resolution)
+- Predicate policies (single vs multi-value)
+- Progressive disclosure already implemented
+- Privacy tag stripping already implemented
-**Trade-offs**:
-- Users must remember to tag sensitive content
-- May create false sense of security
-- Regex-based (could miss edge cases)
+**High-Value Opportunities from Episodic-Memory**:
+- Vector embeddings for semantic search alongside FTS5
+- Tool usage tracking during fact discovery
+- Session metadata capture (git branch, working directory)
+- Multi-concept AND search
+- Background sync with incremental updates
+- Enhanced statistics and reporting
 ---
-## 4. Slim Orchestrator Pattern
-### What claude-mem Does
-**Worker Service Evolution**: Refactored from 2,000 lines → 300 lines orchestrator.
-**File Structure**:
-```
-src/services/worker-service.ts (300 lines - orchestrator)
-  ↓ delegates to
-src/server/Server.ts (Express setup)
-src/services/sqlite/Database.ts (data layer)
-src/services/worker/ (business logic)
-  ├── SDKAgent.ts (agent management)
-  ├── SessionManager.ts (session lifecycle)
-  └── search/SearchOrchestrator.ts (search strategies)
-src/infrastructure/ (process management)
-```
-**Benefit**: Testability, readability, separation of concerns.
-### What We Should Do
-**Priority**: MEDIUM
-**Current State**:
-- `lib/claude_memory/cli.rb`: 800+ lines (all commands)
-- Logic mixed with CLI parsing
-- Hard to test individual commands
-**Implementation**:
-1. **Extract command classes**:
-   ```ruby
-   # lib/claude_memory/commands/
-   ├── base_command.rb
-   ├── ingest_command.rb
-   ├── recall_command.rb
-   ├── publish_command.rb
-   ├── promote_command.rb
-   └── sweep_command.rb
-   ```
-2. **Slim CLI to routing**:
-   ```ruby
-   # lib/claude_memory/cli.rb (150 lines)
-   module ClaudeMemory
-     class CLI
-       def run(args)
-         command_name = args[0]
-         command = command_for(command_name)
-         command.run(args[1..])
-       end
-       private
-       def command_for(name)
-         case name
-         when "ingest" then Commands::IngestCommand.new
-         when "recall" then Commands::RecallCommand.new
-         # ...
-         end
-       end
-     end
-   end
-   ```
-3. **Command base class**:
-   ```ruby
-   # lib/claude_memory/commands/base_command.rb
-   module ClaudeMemory
-     module Commands
-       class BaseCommand
-         def initialize
-           @store_manager = Store::StoreManager.new
-         end
-         def run(args)
-           options = parse_options(args)
-           validate_options(options)
-           execute(options)
-         end
-         private
-         def parse_options(args)
-           raise NotImplementedError
-         end
-         def execute(options)
-           raise NotImplementedError
-         end
-       end
-     end
-   end
-   ```
-4. **Example command**:
-   ```ruby
-   # lib/claude_memory/commands/recall_command.rb
-   module ClaudeMemory
-     module Commands
-       class RecallCommand < BaseCommand
-         def parse_options(args)
-           OptionParser.new do |opts|
-             opts.on("--query QUERY") { |q| options[:query] = q }
-             opts.on("--scope SCOPE") { |s| options[:scope] = s }
-           end.parse!(args)
-         end
-         def execute(options)
-           results = Recall.search(
-             options[:query],
-             scope: options[:scope]
-           )
-           puts format_results(results)
-         end
-       end
-     end
-   end
-   ```
-**Benefits**:
-- Each command is independently testable
-- CLI.rb becomes simple router
-- Easier to add new commands
-- Clear separation of parsing vs execution
-**Trade-offs**:
-- More files to navigate
-- Slightly more boilerplate
-- May be overkill for small CLI
+## Episodic-Memory Comparison
+### Architecture Overview
+**Episodic-memory** focuses on **conversation-level semantic search** rather than fact extraction. Key differences:
+| Feature | Episodic-Memory | ClaudeMemory |
+|---------|----------------|--------------|
+| **Data Model** | Conversation exchanges (user-assistant pairs) | Facts (subject-predicate-object triples) |
+| **Search Method** | Vector embeddings + text search | FTS5 full-text search |
+| **Embeddings** | Local Transformers.js (Xenova/all-MiniLM-L6-v2) | None (FTS5 only) |
+| **Vector Storage** | sqlite-vec virtual table | N/A |
+| **Scope** | Single database with project field | Dual database (global + project) |
+| **Truth Maintenance** | None (keeps all conversations) | Supersession + conflict resolution |
+| **Summarization** | Claude API generates summaries | N/A |
+| **Tool Tracking** | Explicit tool_calls table | Mentioned in provenance text |
+| **Session Metadata** | sessionId, cwd, gitBranch, claudeVersion, thinking metadata | Limited (session_id in content_items) |
+| **Multi-Concept Search** | Array-based AND queries (2-5 concepts) | Single query only |
+| **Incremental Sync** | Timestamp-based mtime checks | Re-processes all content |
+| **Background Processing** | Async hook with --background flag | Synchronous hook execution |
+| **Statistics** | Rich stats with project breakdown | Basic status command |
+| **Exclusion** | Content-based markers (`<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX`) | Tag stripping (`<private>`, `<no-memory>`) |
+| **Line References** | Stores line_start and line_end for each exchange | No line tracking |
+| **WAL Mode** | Enabled for concurrency | Not enabled |
+### What Episodic-Memory Does Well
+1. **Semantic Search with Local Embeddings**
+   - Uses Transformers.js to run embedding model locally (offline-capable)
+   - 384-dimensional vectors from `Xenova/all-MiniLM-L6-v2`
+   - Hybrid vector + text search for best recall
+   - sqlite-vec virtual table for fast similarity queries
+2. **Multi-Concept AND Search**
+   - Array of 2-5 concepts that must all be present in results
+   - Searches each concept independently then intersects results
+   - Ranks by average similarity across all concepts
+   - Example: `["React Router", "authentication", "JWT"]`
+3. **Tool Usage Tracking**
+   - Dedicated `tool_calls` table with foreign key to exchanges
+   - Captures tool_name, tool_input, tool_result, is_error
+   - Tool names included in embeddings for tool-based searches
+   - Search results show tool usage summary
+4. **Rich Session Metadata**
+   - Captures: sessionId, cwd, gitBranch, claudeVersion
+   - Thinking metadata: level, disabled, triggers
+   - Conversation structure: parentUuid, isSidechain
+   - Enables filtering by branch, project context
+5. **Incremental Sync**
+   - Atomic file operations (temp file + rename)
+   - mtime-based change detection (only copies modified files)
+   - Fast subsequent syncs (seconds vs minutes)
+   - Safe concurrent execution
+6. **Automatic Conversation Summarization**
+   - Uses Claude API to generate concise summaries
+   - Summaries stored as `.txt` files alongside conversations
+   - Concurrency-limited batch processing
+   - Summary limit (default 10 per sync) to control API costs
+7. **Background Sync**
+   - `--background` flag for async processing
+   - SessionStart hook runs sync without blocking
+   - User continues working while indexing happens
+   - Output logged to file for debugging
+8. **Line-Range References**
+   - Stores line_start and line_end for each exchange
+   - Enables precise source linking in search results
+   - Supports pagination: read specific line ranges from large conversations
+   - Example: "Lines 10-25 in conversation.jsonl (295KB, 1247 lines)"
+9. **Statistics and Reporting**
+   - Total conversations, exchanges, date range
+   - Summary coverage tracking
+   - Project breakdown with top 10 projects
+   - Database size reporting
+10. **Exclusion Markers**
+    - Content-based opt-out: `<INSTRUCTIONS-TO-EPISODIC-MEMORY>DO NOT INDEX THIS CHAT</INSTRUCTIONS-TO-EPISODIC-MEMORY>`
+    - Files archived but excluded from search index
+    - Prevents meta-conversations from polluting index
+    - Use case: sensitive work, test sessions, agent conversations
+11. **WAL Mode for Concurrency**
+    - SQLite Write-Ahead Logging enabled
+    - Better concurrency for multiple readers
+    - Safe for concurrent sync operations
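The multi-concept AND search described in item 2 (search each concept independently, intersect, rank by average similarity) can be sketched as follows. This is an illustrative Ruby version under stated assumptions, not episodic-memory's TypeScript implementation: the method name `multi_concept_search`, the input shape, and the similarity values are hypothetical.

```ruby
# per_concept_hits maps each concept to { result_id => similarity }, i.e. the
# output of one independent search per concept (hypothetical input shape).
# Returns [id, average_similarity] pairs for IDs present in every concept's
# hits, ordered by average similarity, highest first.
def multi_concept_search(per_concept_hits)
  count = per_concept_hits.size
  raise ArgumentError, "expected 2-5 concepts" unless (2..5).cover?(count)

  hit_sets = per_concept_hits.values
  common_ids = hit_sets.map(&:keys).reduce(:&) # intersection across concepts
  common_ids
    .map { |id| [id, hit_sets.sum { |hits| hits[id] } / count] }
    .sort_by { |id, average| [-average, id] }
end

hits = {
  "React Router" => { 1 => 0.9, 2 => 0.6, 3 => 0.4 },
  "authentication" => { 1 => 0.7, 3 => 0.8 },
  "JWT" => { 1 => 0.5, 3 => 0.6, 4 => 0.95 }
}
results = multi_concept_search(hits)
results.each { |id, average| puts "#{id}: #{average.round(2)}" }
```

Result 4 scores highest for a single concept but is dropped because it does not appear in the other two lists, which is the point of the AND semantics.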
+### Design Patterns Worth Adopting
+1. **Local Vector Embeddings**
+   - **Value**: Semantic search finds conceptually similar content even with different terminology
+   - **Implementation**: Add `embeddings` column to facts table, use sqlite-vec extension
+   - **Ruby gems**: `onnxruntime` or shell out to Python/Node.js for embeddings
+   - **Trade-off**: Increased storage (384 floats per fact), embedding generation time
+2. **Multi-Concept AND Search**
+   - **Value**: Precise queries like "find conversations about React AND authentication AND JWT"
+   - **Implementation**: Run multiple searches and intersect results, rank by average similarity
+   - **Application to facts**: Find facts matching multiple predicates or entities
+   - **MCP tool**: `memory.search_concepts(concepts: ["auth", "API", "security"])`
+3. **Tool Usage Tracking**
+   - **Value**: Know which tools were used during fact discovery (Read, Edit, Bash, etc.)
+   - **Implementation**: Add `tool_calls` table or JSON column in content_items
+   - **Schema**: `{ tool_name, tool_input, tool_result, timestamp }`
+   - **Use case**: "Which facts were discovered using the Bash tool?"
+4. **Session Metadata Capture**
+   - **Value**: Context about where/when facts were learned
+   - **Implementation**: Extend content_items with git_branch, cwd, claude_version columns
+   - **Use case**: "Show facts learned while on feature/auth branch"
+5. **Incremental Sync**
+   - **Value**: Faster subsequent ingestions (seconds vs minutes)
+   - **Implementation**: Store mtime for each content_item, skip unchanged files
+   - **Hook optimization**: Only process delta since last ingest
+6. **Background Processing**
+   - **Value**: Don't block user while processing large transcripts
+   - **Implementation**: Fork process or use Ruby's async/await
+   - **Hook flag**: `claude-memory hook ingest --async`
+7. **Line-Range References in Provenance**
+   - **Value**: Precise source linking for fact verification
+   - **Implementation**: Store line_start and line_end in provenance table
+   - **Display**: "Fact from lines 42-56 in transcript.jsonl"
+8. **Statistics Command**
+   - **Value**: Visibility into memory system health
+   - **Implementation**: Enhance `claude-memory status` with more metrics
+   - **Metrics**: Facts by predicate, entities by type, provenance coverage, scope breakdown
+9. **WAL Mode**
+   - **Value**: Better concurrency, safer concurrent operations
+   - **Implementation**: Run `PRAGMA journal_mode = WAL` on the connection during store initialization
+   - **Benefit**: Readers don't block the writer, and the writer doesn't block readers
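As a rough illustration of pattern 2 (Multi-Concept AND Search), the "intersect results, rank by average similarity" step might look like the sketch below. The method name `search_concepts` and the input shape (one hash of fact id to similarity score per concept) are assumptions made for this example, not the gem's actual API.

```ruby
# Hypothetical helper: intersect per-concept result sets and rank the
# survivors by their mean similarity across all concepts.
# results_by_concept maps a concept name to { fact_id => similarity }.
def search_concepts(results_by_concept)
  return [] if results_by_concept.empty?

  per_concept = results_by_concept.values

  # AND step: keep only fact ids present in every concept's result set.
  common_ids = per_concept.map(&:keys).reduce(:&)

  ranked = common_ids.map do |fact_id|
    scores = per_concept.map { |hits| hits.fetch(fact_id) }
    { fact_id: fact_id, similarity: scores.sum.to_f / scores.size }
  end

  # Highest average similarity first.
  ranked.sort_by { |hit| -hit[:similarity] }
end
```

A fact that matches only some of the concepts is dropped entirely, which is what makes this an AND query rather than a ranked OR.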
 ---
-## 5. Health Monitoring and Process Management
+## 1. Health Monitoring and Process Management
 ### What claude-mem Does
@@ -581,143 +369,7 @@ async function ensureWorkerHealthy(timeout = 10000) {
 ---
-## 6. Semantic Shortcuts and Search Strategies
-### What claude-mem Does
-**Semantic Shortcuts** (pre-configured queries):
-```typescript
-// File: src/services/worker/http/routes/SearchRoutes.ts
-app.get('/api/decisions', (req, res) => {
-  const results = await search({ type: 'decision' });
-  res.json(results);
-});
-app.get('/api/changes', (req, res) => {
-  const results = await search({ type: ['feature', 'change'] });
-  res.json(results);
-});
-app.get('/api/how-it-works', (req, res) => {
-  const results = await search({ type: 'how-it-works' });
-  res.json(results);
-});
-```
-**Search Strategy Pattern**:
-```typescript
-// File: src/services/worker/search/SearchOrchestrator.ts
-class SearchOrchestrator {
-  strategies: [
-    ChromaSearchStrategy,    // Vector search (if available)
-    SQLiteSearchStrategy,    // FTS5 fallback
-    HybridSearchStrategy     // Combine both
-  ]
-  async search(query, options) {
-    const strategy = selectStrategy(options);
-    return strategy.execute(query);
-  }
-}
-```
-**Fallback Logic**:
-1. Try Chroma vector search (semantic)
-2. Fall back to SQLite FTS5 (keyword)
-3. Merge and re-rank results if both available
-### What We Should Do
-**Priority**: MEDIUM
-**Implementation**:
-1. **Add shortcut methods to Recall**:
-   ```ruby
-   # lib/claude_memory/recall.rb
-   module ClaudeMemory
-     class Recall
-       class << self
-         def recent_decisions(limit: 10)
-           search("decision constraint rule", limit:)
-         end
-         def architecture_choices(limit: 10)
-           search("uses framework implements architecture", limit:)
-         end
-         def conventions(limit: 20)
-           search("convention style format pattern", scope: :global, limit:)
-         end
-         def project_config(limit: 10)
-           search("uses requires depends_on", scope: :project, limit:)
-         end
-       end
-     end
-   end
-   ```
-2. **Add MCP tools for shortcuts**:
-   ```ruby
-   # lib/claude_memory/mcp/tools.rb
-   TOOLS["memory.decisions"] = {
-     description: "Quick access to architectural decisions and constraints",
-     input_schema: { type: "object", properties: { limit: { type: "integer" } } }
-   }
-   TOOLS["memory.conventions"] = {
-     description: "Quick access to coding conventions and preferences",
-     input_schema: { type: "object", properties: { limit: { type: "integer" } } }
-   }
-   ```
-3. **Search strategy pattern** (future: if we add vector search):
-   ```ruby
-   # lib/claude_memory/index/search_strategy.rb
-   module ClaudeMemory
-     module Index
-       class SearchStrategy
-         def self.select(options)
-           if options[:semantic] && vector_db_available?
-             VectorSearchStrategy.new
-           else
-             LexicalSearchStrategy.new
-           end
-         end
-       end
-       class LexicalSearchStrategy < SearchStrategy
-         def search(query)
-           LexicalFTS.search(query)
-         end
-       end
-       class VectorSearchStrategy < SearchStrategy
-         def search(query)
-           # Future: vector embeddings
-         end
-       end
-     end
-   end
-   ```
-**Benefits**:
-- Common queries are one command
-- Reduces cognitive load
-- Pre-optimized for specific use cases
-- Strategy pattern enables future enhancements
-**Trade-offs**:
-- Need to pick right shortcuts (user research)
-- May not cover all use cases
-- Shortcuts can become stale
----
-## 7. Web-Based Viewer UI
+## 3. Web-Based Viewer UI
 ### What claude-mem Does
@@ -832,7 +484,7 @@ esbuild.build({
 ---
-## 8. Dual-Integration Strategy
+## 4. Dual-Integration Strategy
 ### What claude-mem Does
@@ -920,85 +572,7 @@ mcpServer.setRequestHandler(CallToolRequestSchema, async (request) => {
 ---
-## 9. Exit Code Strategy for Hooks
-### What claude-mem Does
-**Hook Exit Code Contract**:
-```typescript
-// Success or graceful shutdown
-process.exit(0);  // Windows Terminal closes tab
-// Non-blocking error (show to user, continue)
-console.error("Warning: ...");
-process.exit(1);
-// Blocking error (feed to Claude for processing)
-console.error("ERROR: ...");
-process.exit(2);
-```
-**Philosophy**: Worker/hook errors exit with 0 to prevent Windows Terminal tab accumulation.
-**File**: `docs/context/claude-code/exit-codes.md`
-### What We Should Do
-**Priority**: MEDIUM (if we add hooks)
-**Implementation**:
-1. **Define exit code constants**:
-   ```ruby
-   # lib/claude_memory/hook/exit_codes.rb
-   module ClaudeMemory
-     module Hook
-       module ExitCodes
-         SUCCESS = 0
-         WARNING = 1  # Non-blocking error
-         ERROR = 2    # Blocking error
-       end
-     end
-   end
-   ```
-2. **Use in hook handler**:
-   ```ruby
-   # lib/claude_memory/hook/handler.rb
-   def run
-     handle_hook(ARGV[0])
-     exit ExitCodes::SUCCESS
-   rescue NonBlockingError => e
-     warn e.message
-     exit ExitCodes::WARNING
-   rescue => e
-     $stderr.puts "ERROR: #{e.message}"
-     exit ExitCodes::ERROR
-   end
-   ```
-3. **Document in CLAUDE.md**:
-   ```markdown
-   ## Hook Exit Codes
-   - **0**: Success or graceful shutdown
-   - **1**: Non-blocking error (shown to user, session continues)
-   - **2**: Blocking error (fed to Claude for processing)
-   ```
-**Benefits**:
-- Clear contract with Claude Code
-- Predictable behavior
-- Better error handling
-**Trade-offs**:
-- Hook-specific pattern
-- Not applicable to MCP server
----
-## 10. Configuration-Driven Context Injection
+## 5. Configuration-Driven Context Injection
 ### What claude-mem Does
@@ -1235,91 +809,343 @@ npm install better-sqlite3  # Needs node-gyp + build tools
 - CSS/theming
 **Alternative**: CLI output is sufficient. Add web UI if users request it.
+---
+## Remaining Improvements
+The following sections (6-12 from the original analysis) have been implemented and moved to the "Implemented Improvements" section above:
+- ✅ Section 6: Local Vector Embeddings for Semantic Search
+- ✅ Section 7: Multi-Concept AND Search
+- ✅ Section 8: Tool Usage Tracking
+- ✅ Section 9: Enhanced Session Metadata
+- ✅ Section 10: Incremental Sync (mtime-based)
+- ✅ Section 11: Enhanced Statistics and Reporting
+- ✅ Section 12: WAL Mode for Better Concurrency
+**For remaining unimplemented improvements, see:** [remaining_improvements.md](./remaining_improvements.md)
+Key remaining items:
+- Background processing for hooks (--async flag)
+- ROI metrics and token economics tracking
+- Structured logging
+- Embed command for backfilling embeddings
+---
+## QMD-Inspired Improvements (2026-01-26)
+Analysis of **QMD (Quick Markdown Search)** reveals several high-value optimizations for search quality and performance. QMD is an on-device markdown search engine with hybrid BM25 + vector + LLM reranking, achieving 50%+ Hit@3 improvement over BM25-only search.
+**See detailed analysis**: [docs/influence/qmd.md](./influence/qmd.md)
+### High Priority ⭐
+#### 1. **Native Vector Storage (sqlite-vec)** ⭐ CRITICAL
+- **Value**: 10-100x faster KNN queries, enables larger fact databases
+- **QMD Proof**: Handles 10,000+ documents with sub-second vector queries
+- **Current Issue**: JSON embedding storage requires loading all facts, O(n) Ruby similarity calculation
+- **Solution**: sqlite-vec extension with native C KNN queries
+- **Implementation**:
+  - Schema migration v7: Create `facts_vec` virtual table using `vec0`
+  - Two-step query pattern (avoid JOINs - they hang with vec tables!)
+  - Update `Embeddings::Similarity` class
+  - Backfill existing embeddings
+- **Trade-off**: Adds native dependency (acceptable, well-maintained, cross-platform)
+- **Recommendation**: **ADOPT IMMEDIATELY** - This is foundational
+#### 2. **Reciprocal Rank Fusion (RRF) Algorithm** ⭐ HIGH VALUE
+- **Value**: 50% improvement in Hit@3 for medium-difficulty queries (QMD evaluation)
+- **QMD Proof**: Evaluation suite shows consistent improvements across all query types
+- **Current Issue**: Naive deduplication doesn't properly fuse ranking signals
+- **Solution**: Mathematical fusion of FTS + vector ranked lists with position-aware scoring
+- **Formula**: `score = Σ(weight / (k + rank + 1))` with top-rank bonus
+- **Implementation**:
+  - Create `Recall::RRFusion` class
+  - Update `Recall#query_semantic_dual` to use RRF
+  - Apply weights: original query ×2, expanded queries ×1
+  - Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
+- **Trade-off**: Slightly more complex than naive merging (acceptable, well-tested)
+- **Recommendation**: **ADOPT IMMEDIATELY** - Pure algorithmic improvement
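The fusion formula above can be sketched in a few lines of Ruby. This is a minimal illustration, not the planned `Recall::RRFusion` class: the module name `RRFusionSketch` and the input shape are hypothetical, `k = 60` is taken from the migration checklist, and applying the top-rank bonus once per id (based on its best position in any list) is an assumption.

```ruby
# Minimal Reciprocal Rank Fusion sketch (hypothetical module, not the gem's API).
module RRFusionSketch
  K = 60 # smoothing constant
  # Bonus by zero-based rank: +0.05 for #1, +0.02 for #2-3.
  TOP_RANK_BONUS = { 0 => 0.05, 1 => 0.02, 2 => 0.02 }.freeze

  # ranked_lists: array of { ids: [...], weight: Numeric }, best match first.
  def self.fuse(ranked_lists, k: K)
    scores = Hash.new(0.0)
    best_rank = {}

    ranked_lists.each do |list|
      list[:ids].each_with_index do |id, rank|
        # score = sum of weight / (k + rank + 1) over every list containing id
        scores[id] += list[:weight].to_f / (k + rank + 1)
        best_rank[id] = [best_rank.fetch(id, rank), rank].min
      end
    end

    # Assumption: the bonus is added once, using the id's best rank in any list.
    best_rank.each { |id, rank| scores[id] += TOP_RANK_BONUS.fetch(rank, 0.0) }

    scores.sort_by { |_id, score| -score }.map { |id, score| { id: id, score: score } }
  end
end
```

Usage would pass the FTS list with weight 2 and each expanded or vector list with weight 1, for example `RRFusionSketch.fuse([{ ids: fts_ids, weight: 2 }, { ids: vec_ids, weight: 1 }])`, where `fts_ids` and `vec_ids` are placeholder names for the two ranked id lists.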
+#### 3. **Docid Short Hash System** ⭐ MEDIUM VALUE
+- **Value**: Better UX, cross-database fact references
+- **QMD Proof**: Used in all output, enables `qmd get #abc123`
+- **Current Issue**: Integer IDs are database-specific, not user-friendly
+- **Solution**: 8-character hash IDs for facts (e.g., `#abc123de`)
+- **Implementation**:
+  - Schema migration v8: Add `docid` column (indexed, unique)
+  - Backfill existing facts with SHA256-based docids
+  - Update CLI commands (`explain`, `recall`) to accept docids
+  - Update MCP tools to accept docids
+  - Update output formatting to show docids
+- **Trade-off**: Hash collisions possible (8 hex chars give about 4.3 billion possible values, so collisions are rare at small fact counts)
+- **Recommendation**: **ADOPT IN PHASE 2** - Clear UX improvement
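A SHA256-based docid could be derived roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and hashing the fact's signature string (rather than some other stable field) is a guess at what the backfill would use.

```ruby
require "digest"

# Hypothetical helper: derive a short, stable docid from a fact's signature.
# The signature is assumed to be any string that uniquely describes the fact.
def docid_for(signature, length: 8)
  Digest::SHA256.hexdigest(signature)[0, length]
end

# Display form used in CLI output, e.g. "#abc123de".
def format_docid(docid)
  "#" + docid
end

# Birthday-bound estimate of the chance that at least two of fact_count
# docids collide, given 16**length possible values.
def collision_probability(fact_count, length: 8)
  space = 16.0**length
  1.0 - Math.exp(-fact_count * (fact_count - 1) / (2.0 * space))
end
```

Because the docid depends only on the input string, the same fact gets the same docid in the global and project databases, which is what makes cross-database references possible. The estimate helper is a quick way to check whether 8 characters remain comfortable as the fact count grows.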
+#### 4. **Smart Expansion Detection** ⭐ MEDIUM VALUE
+- **Value**: Skip unnecessary vector search when FTS finds exact match
+- **QMD Proof**: Saves 2-3 seconds on 60% of queries (exact keyword matches)
+- **Current Issue**: Always runs both FTS and vector search, even for exact matches
+- **Solution**: Heuristic detection of strong FTS signal
+- **Thresholds**: `top_score >= 0.85` AND `gap >= 0.15`
+- **Implementation**:
+  - Create `Recall::ExpansionDetector` class
+  - Update `Recall#query_semantic_dual` to check before vector search
+  - Add optional metrics tracking (skip rate, latency saved)
+- **Trade-off**: May miss semantic results for exact matches (acceptable)
+- **Recommendation**: **ADOPT IN PHASE 3** - Clear performance win
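The threshold check could be as small as the sketch below. It assumes FTS scores are already normalized to the 0.0..1.0 range and sorted best-first, and that "gap" means the difference between the top two scores; the module name is hypothetical and stands in for the planned `Recall::ExpansionDetector`.

```ruby
# Hypothetical sketch of the strong-signal heuristic.
module ExpansionDetectorSketch
  MIN_TOP_SCORE = 0.85
  MIN_GAP = 0.15

  # scores: normalized FTS scores (0.0..1.0), sorted descending.
  # Returns true when the top hit is strong and clearly ahead of the runner-up,
  # meaning the vector search can be skipped.
  def self.skip_vector_search?(scores)
    return false if scores.empty?

    top = scores[0]
    runner_up = scores[1] || 0.0 # assumption: a lone result is compared against 0
    top >= MIN_TOP_SCORE && (top - runner_up) >= MIN_GAP
  end
end
```

Keeping both thresholds as constants makes them easy to tune later from the skip-rate metrics.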
+### Medium Priority
+#### 5. **Document Chunking for Long Transcripts**
+- **Value**: Better embeddings for long content (>3000 chars)
+- **QMD Approach**: 800 tokens, 15% overlap, semantic boundary detection
+- **Break Priority**: paragraph > sentence > line > word
+- **Implementation**: Modify ingestion to chunk long content_items before embedding
+- **Consideration**: Only if users report issues with long transcripts
+- **Recommendation**: **DEFER** - Not urgent, TF-IDF handles shorter content well
+#### 6. **LLM Response Caching**
+- **Value**: Reduce API costs for repeated distillation
+- **QMD Proof**: Hash-based caching with 80% hit rate
+- **Implementation**:
+  - Add `llm_cache` table (hash, result, created_at)
+  - Cache key: `SHA256(operation + model + input)`
+  - Probabilistic cleanup: 1% chance per operation, keep latest 1000
+- **Consideration**: Most valuable when distiller is fully implemented
+- **Recommendation**: **ADOPT WHEN DISTILLER ACTIVE** - Cost savings
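The caching scheme can be illustrated with an in-memory stand-in for the `llm_cache` table. Everything named here (`LlmCacheSketch`, `fetch`, `cleanup`) is hypothetical, and joining the key parts with a NUL separator is an added assumption so that different (operation, model, input) triples cannot concatenate to the same string.

```ruby
require "digest"

# Hypothetical in-memory sketch of hash-based LLM response caching.
class LlmCacheSketch
  CLEANUP_PROBABILITY = 0.01 # 1% chance per write
  MAX_ENTRIES = 1000

  def initialize(random: Random.new)
    @entries = {} # key => { result:, created_at: }; stands in for llm_cache rows
    @random = random
  end

  def self.cache_key(operation, model, input)
    Digest::SHA256.hexdigest([operation, model, input].join("\0"))
  end

  # Returns the cached result, or runs the block once and stores its value.
  def fetch(operation, model, input)
    key = self.class.cache_key(operation, model, input)
    hit = @entries[key]
    return hit[:result] if hit

    result = yield
    @entries[key] = { result: result, created_at: Time.now }
    cleanup if @random.rand < CLEANUP_PROBABILITY
    result
  end

  def size
    @entries.size
  end

  # Drops the oldest entries beyond max_entries.
  # Ruby hashes preserve insertion order, so the oldest keys come first.
  def cleanup(max_entries: MAX_ENTRIES)
    excess = @entries.size - max_entries
    @entries.keys.first(excess).each { |key| @entries.delete(key) } if excess.positive?
  end
end
```

Running cleanup probabilistically keeps the common path cheap while still bounding the table size over time.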
+#### 7. **Enhanced Snippet Extraction**
+- **Value**: Better search result previews with query term highlighting
+- **QMD Approach**: Find line with most query term matches, extract 1 line before + 2 after
+- **Implementation**: Add to `Recall` output formatting
+- **Consideration**: Improves UX but not critical
+- **Recommendation**: **CONSIDER** - Nice-to-have
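The line-selection part of this approach might be sketched as below. The method name `extract_snippet` is hypothetical, matching is a simple case-insensitive substring check, and term highlighting is left out to keep the example short.

```ruby
# Hypothetical sketch: pick the line with the most query-term matches and
# return it with 1 line of context before and 2 after.
def extract_snippet(text, query, before: 1, after: 2)
  lines = text.lines(chomp: true)
  return "" if lines.empty?

  terms = query.downcase.split

  # Score each line by how many distinct query terms it contains.
  best_index = lines.each_index.max_by do |i|
    line = lines[i].downcase
    terms.count { |term| line.include?(term) }
  end

  # Clamp the window to the bounds of the text.
  first = [best_index - before, 0].max
  last = [best_index + after, lines.size - 1].min
  lines[first..last].join("\n")
end
```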
+### Low Priority / Not Recommended
+#### 8. **Neural Embeddings (EmbeddingGemma)** (DEFER)
+- **QMD Model**: 300M params, 300MB download, 384 dimensions
+- **Value**: Better semantic search quality (+40% Hit@3 over TF-IDF)
+- **Cost**: 300MB download, 300MB VRAM, 2s cold start, complex dependency
+- **Decision**: **DEFER** - TF-IDF sufficient for now, revisit if users report poor quality
+#### 9. **Cross-Encoder Reranking** (REJECT)
+- **QMD Model**: Qwen3-Reranker-0.6B (640MB)
+- **Value**: Better ranking precision via LLM scoring
+- **Cost**: 640MB model, 400ms latency per query, complex dependency
+- **Decision**: **REJECT** - Over-engineering for fact retrieval
+#### 10. **Query Expansion (LLM)** (REJECT)
+- **QMD Model**: Qwen3-1.7B (2.2GB)
+- **Value**: Generate alternative query phrasings for better recall
+- **Cost**: 2.2GB model, 800ms latency per query
+- **Decision**: **REJECT** - No LLM in recall path, too heavy
+#### 11. **YAML Collection System** (REJECT)
+- **QMD Use**: Multi-directory indexing with per-path contexts
+- **Our Use**: Dual-database (global + project) already provides clean separation
+- **Decision**: **REJECT** - Our approach is cleaner for our use case
+#### 12. **Content-Addressable Storage** (REJECT)
+- **QMD Use**: Deduplicates documents by SHA256 hash
+- **Our Use**: Facts deduplicated by signature, not content hash
+- **Decision**: **REJECT** - Different data model
+#### 13. **Virtual Path System** (REJECT)
+- **QMD Use**: `qmd://collection/path` unified namespace
+- **Our Use**: Dual-database provides clear namespace
+- **Decision**: **REJECT** - Unnecessary complexity
 ---
 ## Implementation Priorities
-### High Priority (Next Sprint)
+### High Priority (QMD-Inspired)
-1. **Progressive Disclosure Pattern** - Add index format to Recall, update MCP tools
-2. **Privacy Tag System** - Implement `<private>` tag stripping
-3. **Exit Code Strategy** - Define exit codes for future hooks
+1. **Native Vector Storage (sqlite-vec)** ⭐ - 10-100x faster KNN, foundational improvement
+2. **Reciprocal Rank Fusion (RRF)** ⭐ - 50% better search quality, pure algorithm
+3. **Docid Short Hashes** - Better UX for fact references
+4. **Smart Expansion Detection** - Skip unnecessary vector search when FTS is confident
-### Medium Priority (Next Quarter)
+### Medium Priority
-4. **ROI Metrics** - Track token economics
-5. **Slim Orchestrator Pattern** - Extract commands from CLI
-6. **Semantic Shortcuts** - Add convenience methods to Recall
-7. **Search Strategies** - Prepare for future vector search
+5. **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
+6. **ROI Metrics** - Track token economics for distillation (from claude-mem)
+7. **LLM Response Caching** - Reduce API costs (from QMD)
+8. **Document Chunking** - Better embeddings for long transcripts (from QMD, if needed)
-### Low Priority (Future)
+### Low Priority
-8. **Health Monitoring** - Only if we add background worker
-9. **Dual Integration** - Only if we add Claude Code hooks
-10. **Config-Driven Context** - Only if users request customization
-11. **Web Viewer UI** - Only if users request visualization
+9. **Structured Logging** - Better debugging with JSON logs
+10. **Embed Command** - Backfill embeddings for existing facts
+11. **Enhanced Snippet Extraction** - Query-aware snippet preview (from QMD)
+12. **Health Monitoring** - Only if we add background worker
+13. **Web Viewer UI** - Only if users request visualization
+14. **Configuration-Driven Context** - Only if users request snapshot customization
 ---
 ## Migration Path
-### Phase 1: Quick Wins (1-2 weeks)
+### Completed ✓
+- [x] WAL mode for better concurrency
+- [x] Enhanced statistics command
+- [x] Session metadata tracking
+- [x] Tool usage tracking
+- [x] Semantic search with TF-IDF embeddings
+- [x] Multi-concept AND search
+- [x] Incremental sync with mtime tracking
+- [x] Context-aware queries
+### Phase 1: Vector Storage Upgrade (from QMD) - IMMEDIATE
+- [ ] Add sqlite-vec extension support (gem or FFI)
+- [ ] Create schema migration v7: `facts_vec` virtual table using `vec0`
+- [ ] Backfill existing embeddings from JSON to native vectors
+- [ ] Update `Embeddings::Similarity` class for native KNN (two-step query pattern)
+- [ ] Test migration on existing databases
+- [ ] Document extension installation in README
+- [ ] Benchmark: Measure KNN query improvement (expect 10-100x)
-- [ ] Implement `<private>` tag stripping in ingester
-- [ ] Add token count estimation to facts
-- [ ] Create index format in Recall
-- [ ] Add `memory.recall_index` MCP tool
-- [ ] Document progressive disclosure pattern
+### Phase 2: RRF Fusion (from QMD) - IMMEDIATE
-### Phase 2: Structural (1 month)
+- [ ] Implement `Recall::RRFusion` class with k=60 parameter
+- [ ] Update `Recall#query_semantic_dual` to use RRF fusion
+- [ ] Apply weights: original query ×2, expanded queries ×1
+- [ ] Add top-rank bonus: +0.05 for #1, +0.02 for #2-3
+- [ ] Test with synthetic ranked lists (unit tests)
+- [ ] Validate improvements with real queries
-- [ ] Extract command classes from CLI
-- [ ] Add metrics table for token tracking
-- [ ] Implement semantic shortcuts
-- [ ] Add search strategy pattern (prep for vector search)
+### Phase 3: UX Improvements (from QMD) - NEAR-TERM
-### Phase 3: Advanced (3+ months)
+- [ ] Schema migration v8: Add `docid` column (8-char hash, indexed, unique)
+- [ ] Backfill existing facts with SHA256-based docids
+- [ ] Update CLI commands to accept/display docids (`ExplainCommand`, `RecallCommand`)
+- [ ] Update MCP tools for docid support (`memory.explain`, `memory.recall`)
+- [ ] Test cross-database docid lookups
-- [ ] Add vector embeddings (if requested)
-- [ ] Build web viewer (if requested)
-- [ ] Add Claude Code hooks (if requested)
-- [ ] Implement background worker (if needed)
+### Phase 4: Performance Optimizations (from QMD) - NEAR-TERM
+- [ ] Implement `Recall::ExpansionDetector` class
+- [ ] Update `Recall#query_semantic_dual` to check before vector search
+- [ ] Add metrics tracking (skip rate, avg latency saved)
+- [ ] Tune thresholds based on usage patterns
+### Remaining Tasks
+- [ ] Background processing (--async flag for hooks)
+- [ ] LLM response caching (from QMD, when distiller is active)
+- [ ] Structured logging implementation
+- [ ] Embed command for backfilling embeddings
+### Future (If Requested)
+- [ ] Document chunking for long transcripts (from QMD, if users report issues)
+- [ ] Enhanced snippet extraction (from QMD, for better search result previews)
+- [ ] Build web viewer (if users request visualization)
+- [ ] Add HTTP-based health checks (if background worker is added)
+- [ ] Configuration-driven snapshot generation (if users request customization)
 ---
 ## Key Takeaways
-**What claude-mem does exceptionally well**:
-1. Progressive disclosure (token efficiency)
-2. ROI metrics (visibility)
-3. Privacy controls (user trust)
-4. Clean architecture (maintainability)
-5. Production polish (error handling, logging, health checks)
-**What we do better**:
-1. Dual-database architecture (global + project)
-2. Fact-based knowledge graph (structured)
-3. Truth maintenance (conflict resolution)
-4. Predicate policies (semantic understanding)
-5. Simpler dependencies (Ruby ecosystem)
-**Our path forward**:
-- Adopt their token efficiency patterns
-- Keep our knowledge graph architecture
-- Add privacy controls
-- Improve observability (metrics)
-- Maintain simplicity (avoid over-engineering)
+### Successfully Adopted from claude-mem ✓
+1. Progressive disclosure (token-efficient retrieval)
+2. Privacy controls (tag-based content exclusion)
+3. Clean architecture (command pattern, slim CLI)
+4. Semantic shortcuts (decisions, conventions, architecture)
+5. Exit code strategy (hook error handling)
+6. ROI metrics tracking (token economics for distillation efficiency)
+### Successfully Adopted from Episodic-Memory ✓
+1. **WAL Mode** - Better concurrency with Write-Ahead Logging
+2. **Tool Usage Tracking** - Dedicated table tracking which tools discovered facts
+3. **Incremental Sync** - mtime-based change detection for fast re-ingestion
+4. **Session Metadata** - Context capture (git branch, cwd, Claude version)
+5. **Local Vector Embeddings** - TF-IDF semantic search alongside FTS5
+6. **Multi-Concept AND Search** - Precise queries matching 2-5 concepts simultaneously
+7. **Enhanced Statistics** - Comprehensive reporting on facts, entities, provenance
+8. **Context-Aware Queries** - Filter by branch, directory, or tools used
+### Our Unique Advantages
+1. **Dual-database architecture** - Global + project scopes
+2. **Fact-based knowledge graph** - Structured facts rather than blob observations or raw conversation exchanges
+3. **Truth maintenance** - Conflict resolution and supersession
+4. **Predicate policies** - Single vs multi-value semantics
+5. **Ruby ecosystem** - Simpler dependencies, easier install
+6. **Lightweight embeddings** - No external dependencies (TF-IDF vs Transformers.js)
+### Remaining Opportunities
+- **Background Processing** - Non-blocking hooks for better UX (from episodic-memory)
+- **ROI Metrics** - Track token economics for distillation (from claude-mem)
+- **Structured Logging** - JSON-formatted logs for debugging
+- **Embed Command** - Backfill embeddings for existing facts
+- **Health Monitoring** - Only if we add background worker
+- **Web Viewer UI** - Only if users request visualization
+- **Configuration-Driven Context** - Only if users request snapshot customization
+---
+## Comparison Summary
+**Episodic-memory** and **claude_memory** serve complementary but different needs:
+**Episodic-memory** excels at:
+- Semantic conversation search with local embeddings
+- Preserving complete conversation context
+- Multi-concept AND queries
+- Fast incremental sync
+- Tool usage tracking
+- Rich session metadata
+**ClaudeMemory** excels at:
+- Structured fact extraction and storage
+- Truth maintenance and conflict resolution
+- Dual-scope architecture (global vs project)
+- Knowledge graph with provenance
+- Semantic shortcuts for common queries
+**Best of both worlds (achieved)**:
+- ✅ Added vector embeddings for semantic search (TF-IDF based)
+- ✅ Kept fact-based knowledge graph for structured queries
+- ✅ Adopted incremental sync and tool tracking from episodic-memory
+- ✅ Maintained truth maintenance and conflict resolution
+- ✅ Added session metadata for richer context
+- ✅ Implemented multi-concept AND search
+- ✅ Enhanced statistics and reporting
 ---
 ## References
-- [claude-mem GitHub](https://github.com/thedotmack/claude-mem)
-- [Architecture Evolution](../claude-mem/docs/public/architecture-evolution.mdx)
-- [Progressive Disclosure Philosophy](../claude-mem/docs/public/progressive-disclosure.mdx)
-- [ClaudeMemory Updated Plan](updated_plan.md)
+- [episodic-memory GitHub](https://github.com/obra/episodic-memory) - Semantic conversation search
+- [claude-mem GitHub](https://github.com/thedotmack/claude-mem) - Memory compression system
+- [ClaudeMemory Updated Plan](updated_plan.md) - Original improvement plan
 ---
-*This analysis represents a critical review of production-grade patterns that have proven effective in real-world usage. Our goal is to learn from claude-mem's strengths while preserving the unique advantages of our fact-based approach.*
+*This document has been updated to reflect completed implementations. Fourteen major improvements have been successfully integrated: 6 from claude-mem and 8 from episodic-memory. ClaudeMemory now combines the best of both systems while maintaining its unique advantages in fact-based knowledge representation and truth maintenance.*
+*Last updated: 2026-01-26 - Added ROI metrics tracking for distillation token economics*