claude_memory 0.5.1 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54)
  1. checksums.yaml +4 -4
  2. data/.claude/CLAUDE.md +1 -1
  3. data/.claude/rules/claude_memory.generated.md +1 -1
  4. data/.claude/settings.json +5 -0
  5. data/.claude/settings.local.json +9 -1
  6. data/.claude-plugin/marketplace.json +5 -2
  7. data/.claude-plugin/plugin.json +16 -3
  8. data/CHANGELOG.md +55 -0
  9. data/CLAUDE.md +27 -13
  10. data/README.md +6 -2
  11. data/Rakefile +22 -0
  12. data/db/migrations/011_add_tool_call_summaries.rb +18 -0
  13. data/db/migrations/012_add_vec_indexing_support.rb +19 -0
  14. data/docs/improvements.md +86 -66
  15. data/docs/influence/claude-mem.md +253 -0
  16. data/docs/influence/claude-supermemory.md +158 -430
  17. data/docs/influence/episodic-memory.md +217 -0
  18. data/docs/influence/grepai.md +163 -839
  19. data/docs/influence/kbs.md +437 -0
  20. data/docs/influence/qmd.md +139 -481
  21. data/hooks/hooks.json +19 -15
  22. data/lefthook.yml +4 -0
  23. data/lib/claude_memory/commands/checks/vec_check.rb +73 -0
  24. data/lib/claude_memory/commands/compact_command.rb +94 -0
  25. data/lib/claude_memory/commands/doctor_command.rb +1 -0
  26. data/lib/claude_memory/commands/export_command.rb +108 -0
  27. data/lib/claude_memory/commands/help_command.rb +2 -0
  28. data/lib/claude_memory/commands/hook_command.rb +110 -9
  29. data/lib/claude_memory/commands/index_command.rb +63 -8
  30. data/lib/claude_memory/commands/initializers/global_initializer.rb +26 -7
  31. data/lib/claude_memory/commands/initializers/project_initializer.rb +35 -12
  32. data/lib/claude_memory/commands/registry.rb +3 -1
  33. data/lib/claude_memory/hook/context_injector.rb +75 -0
  34. data/lib/claude_memory/hook/error_classifier.rb +67 -0
  35. data/lib/claude_memory/hook/handler.rb +21 -1
  36. data/lib/claude_memory/index/vector_index.rb +171 -0
  37. data/lib/claude_memory/infrastructure/schema_validator.rb +5 -1
  38. data/lib/claude_memory/ingest/ingester.rb +26 -1
  39. data/lib/claude_memory/ingest/observation_compressor.rb +177 -0
  40. data/lib/claude_memory/mcp/instructions_builder.rb +76 -0
  41. data/lib/claude_memory/mcp/server.rb +3 -1
  42. data/lib/claude_memory/mcp/tool_definitions.rb +15 -7
  43. data/lib/claude_memory/mcp/tools.rb +125 -2
  44. data/lib/claude_memory/publish.rb +28 -27
  45. data/lib/claude_memory/recall/dual_query_template.rb +1 -12
  46. data/lib/claude_memory/recall.rb +71 -17
  47. data/lib/claude_memory/store/sqlite_store.rb +17 -1
  48. data/lib/claude_memory/sweep/sweeper.rb +30 -0
  49. data/lib/claude_memory/version.rb +1 -1
  50. data/lib/claude_memory.rb +8 -0
  51. data/scripts/hook-runner.sh +14 -0
  52. data/scripts/serve-mcp.sh +14 -0
  53. data/skills/setup-memory/SKILL.md +6 -0
  54. metadata +31 -2
@@ -1,9 +1,9 @@
1
- # QMD Analysis: Quick Markdown Search (Updated)
1
+ # QMD Analysis (Updated)
2
2
 
3
- *Analysis Date: 2026-02-02*
4
- *Previous Analysis: 2026-01-26*
3
+ *Analysis Date: 2026-03-02*
4
+ *Previous Analysis: 2026-02-02*
5
5
  *Repository: https://github.com/tobi/qmd*
6
- *Version/Commit: 63028fd (latest main)*
6
+ *Version: 1.1.0 (commit 40610c3)*
7
7
 
8
8
  ---
9
9
 
@@ -11,37 +11,38 @@
11
11
 
12
12
  ### Project Purpose
13
13
 
14
- QMD (Quick Markdown Search) is an **on-device search engine** for markdown knowledge bases, notes, meeting transcripts, and documentation. It combines BM25 full-text search, vector semantic search, and LLM re-ranking — all running locally via node-llama-cpp with GGUF models.
14
+ QMD (Query Markup Documents) is an **on-device search engine** for markdown knowledge bases. It combines BM25 full-text search, vector semantic search, and LLM re-ranking — all running locally via node-llama-cpp with GGUF models.
15
15
 
16
- ### Key Innovation
16
+ ### Key Innovation (What's New Since Last Study)
17
17
 
18
- QMD's standout innovations since last analysis:
18
+ 1. **Query Document Format** (`docs/SYNTAX.md`): Structured multi-line queries with typed sub-queries (`lex:`, `vec:`, `hyde:`) that route to different search backends. First sub-query gets 2x weight in Reciprocal Rank Fusion. This replaces the separate `search`/`vsearch`/`query` commands with a unified `query` tool.
19
19
 
20
- 1. **Custom fine-tuned query expansion model** (`qmd-query-expansion-1.7B`): A Qwen3-1.7B model trained with SFT + GRPO (reinforcement learning) specifically for structured search query expansion. Produces typed outputs (`lex:`, `vec:`, `hyde:`) that route to different search backends.
20
+ 2. **Lex Query Syntax**: Full BM25 operator support: `"exact phrase"` matching, `-term` exclusions, and `-"phrase"` exclusions. Enables intent-aware disambiguation (e.g., `performance -sports -athlete`).
21
21
 
22
- 2. **Claude Code plugin ecosystem**: QMD ships as a Claude Code marketplace plugin (`.claude-plugin/marketplace.json`) with skills, MCP server integration, and inline status checks.
22
+ 3. **HTTP MCP Transport** (`src/mcp.ts:10-16`): Stateless HTTP server alongside stdio. Models stay loaded in VRAM across requests. Embedding/reranking contexts disposed after 5 min idle.
23
23
 
24
- 3. **Session-scoped LLM management** (`ILLMSession`): Structured lifecycle for LLM resources with abort signals, timeout management, and clean disposal.
24
+ 4. **Unified MCP `query` tool**: Removed separate `search`, `vector_search`, `deep_search` tools. Single `query` tool handles all modes via the query document format.
25
+
26
+ 5. **Collection Management Enhancements**: `include`/`exclude` collections from default queries, `update-cmd` for pre-update shell commands, multiple `-c` flags.
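
The lex syntax in item 2 can be illustrated with a small tokenizer. This is a minimal sketch, not QMD's parser: the `LexQuery` shape and the `parseLexQuery` name are hypothetical, and it only assumes the three forms named above (`"exact phrase"`, `-term`, `-"phrase"`) plus bare terms.

```typescript
// Hypothetical result shape; QMD's real parser may structure this differently.
interface LexQuery {
  terms: string[];           // bare terms that should match
  phrases: string[];         // "exact phrase" matches
  excludedTerms: string[];   // -term exclusions
  excludedPhrases: string[]; // -"phrase" exclusions
}

function parseLexQuery(input: string): LexQuery {
  const result: LexQuery = { terms: [], phrases: [], excludedTerms: [], excludedPhrases: [] };
  // A token is an optional leading "-", then either a quoted phrase or a bare word.
  const tokenPattern = /(-?)(?:"([^"]*)"|(\S+))/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(input)) !== null) {
    const negated = match[1] === "-";
    const phrase = match[2];
    const word = match[3];
    if (phrase !== undefined) {
      if (phrase.trim() === "") continue; // ignore empty quotes
      (negated ? result.excludedPhrases : result.phrases).push(phrase);
    } else if (word !== undefined) {
      (negated ? result.excludedTerms : result.terms).push(word);
    }
  }
  return result;
}

// The disambiguation example from item 2:
const sample = parseLexQuery("performance -sports -athlete");
console.log(sample.terms, sample.excludedTerms);
```

Keeping exclusions in separate lists leaves the translation into a backend query (for example an FTS5 `NOT` clause) as a later step.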
25
27
 
26
28
  ### Technology Stack
27
29
 
28
- - **Runtime**: Bun >= 1.0.0 (TypeScript)
29
- - **Database**: SQLite with sqlite-vec extension (cosine distance)
30
+ - **Runtime**: Node.js >= 22 / Bun (dual runtime, `src/db.ts:9-24`)
31
+ - **Database**: SQLite with better-sqlite3 + sqlite-vec extension v0.1.7-alpha.2
30
32
  - **Full-Text Search**: SQLite FTS5 with Porter tokenization
31
- - **Embeddings**: EmbeddingGemma-300M (GGUF, ~300MB)
32
- - **Reranking**: Qwen3-Reranker-0.6B (GGUF, ~640MB)
33
- - **Query Expansion**: qmd-query-expansion-1.7B (custom fine-tuned, ~1.1GB)
34
- - **MCP**: @modelcontextprotocol/sdk with stdio transport
35
- - **Validation**: Zod v4 for MCP tool input schemas
36
- - **Config**: YAML-based collection management (`~/.config/qmd/index.yml`)
33
+ - **Embeddings**: EmbeddingGemma (~300MB GGUF)
34
+ - **Reranking**: Qwen3-Reranker-0.6B (~640MB GGUF)
35
+ - **Query Expansion**: Qwen3-1.7B (custom fine-tuned, ~1.1GB)
36
+ - **MCP**: @modelcontextprotocol/sdk v1.25.1
37
+ - **Validation**: Zod v4
38
+ - **Plugin**: Claude Code marketplace format
37
39
 
38
40
  ### Production Readiness
39
41
 
40
- - **Maturity**: Beta, actively developed, 5,700+ GitHub stars
41
- - **Test Coverage**: Unit tests (store.test.ts, mcp.test.ts), eval harness (18 queries across 3 difficulty levels)
42
- - **Documentation**: Comprehensive README, CLAUDE.md, inline code docs
43
- - **Community**: 257 forks, 29 issues, 17 PRs, active maintainer (Tobi Lütke)
44
- - **Plugin Distribution**: Available via Claude Code marketplace
42
+ - **Maturity**: Stable (v1.1.0), 5,700+ GitHub stars
43
+ - **Test Coverage**: vitest suite (store, mcp, collections, formatter, cli, eval)
44
+ - **Plugin Distribution**: Claude Code marketplace
45
+ - **Community**: Active (256 PRs merged, external contributors)
45
46
 
46
47
  ---
47
48
 
@@ -49,8 +50,6 @@ QMD's standout innovations since last analysis:
49
50
 
50
51
  ### Data Model
51
52
 
52
- QMD uses content-addressable storage with a virtual filesystem layer:
53
-
54
53
  ```
55
54
  content table (SHA256 hash → document body, deduplication)
56
55
 
@@ -65,11 +64,11 @@ vectors_vec (sqlite-vec native KNN index, cosine distance)
65
64
  llm_cache (hash-keyed deterministic response cache)
66
65
  ```
67
66
 
68
- ### Key Design Patterns
67
+ ### Key Design Patterns (New)
69
68
 
70
- 1. **Content-Addressable Storage**: `content` table deduplicates by SHA256 hash; multiple documents with identical content share one row (`store.ts:440-450`)
69
+ 1. **Query Document Format** (`docs/SYNTAX.md:1-100`): EBNF grammar for structured queries. Lines typed as `lex:`, `vec:`, or `hyde:` route to different backends. Plain text defaults to `expand:` (LLM-generated variants).
71
70
 
72
- 2. **Two-Step Vector Query**: JOINs with sqlite-vec virtual tables hang indefinitely. QMD enforces separate queries for vec lookup and metadata join (`store.ts:1912-1915`):
71
+ 2. **Two-Step Vector Query** (`store.ts:1912-1915`): JOINs with sqlite-vec virtual tables hang indefinitely. QMD uses separate queries:
73
72
  ```typescript
74
73
  // Step 1: KNN from vec table
75
74
  const vecResults = db.prepare(
@@ -78,272 +77,84 @@ llm_cache (hash-keyed deterministic response cache)
78
77
  // Step 2: Join with documents separately
79
78
  ```
80
79
 
81
- 3. **YAML-Based Collection Config**: Collections migrated from SQLite foreign keys to `~/.config/qmd/index.yml` for easier user management. Schema migration in `migrate-schema.ts` handled the transition.
82
-
83
- 4. **Hierarchical Context System**: Context descriptions inherit along path hierarchy — a file at `/work/projects/api.md` gets global context + `/` context + `/work` context concatenated (`collections.ts:94-113`)
80
+ 3. **Smart Chunking** (`store.ts:53-219`): 900 tokens/chunk, 15% overlap, markdown-aware break points with scored pattern matching (h1=100, h2=90, paragraph=20). Distance decay prevents splitting inside code fences.
84
81
 
85
- 5. **Probabilistic Cache Cleanup**: 1% chance per query to prune LLM cache to latest 1000 entries (`store.ts:804-807`)
82
+ 4. **Dynamic MCP Instructions** (`mcp.ts:91-98`): `buildInstructions()` generates context-aware server instructions from actual index state, injected into LLM system prompt.
86
83
 
87
- 6. **Lazy Model Singleton**: LLM models lazy-load on first use, keep in memory, and unload contexts after 2-minute idle (`llm.ts:920-951`)
88
-
89
- ### Module Organization
90
-
91
- ```
92
- qmd/
93
- ├── src/
94
- │ ├── qmd.ts # CLI entry point (~750 lines, lazy-loaded store)
95
- │ ├── store.ts # Core store: schema, search, indexing (~2400 lines)
96
- │ ├── mcp.ts # MCP server: 6 tools + resource + prompt (~626 lines)
97
- │ ├── llm.ts # LLM abstraction: embed, rerank, expand (~1208 lines)
98
- │ ├── collections.ts # YAML config management (~390 lines)
99
- │ ├── store.test.ts # Comprehensive store unit tests
100
- │ └── mcp.test.ts # MCP integration tests
101
- ├── finetune/ # Query expansion model training pipeline
102
- │ ├── reward.py # Multi-dimensional reward function (5 dimensions, 120 pts)
103
- │ ├── train.py # Unified SFT + GRPO training
104
- │ ├── eval.py # Model evaluation with scoring
105
- │ └── jobs/ # HuggingFace Jobs wrappers
106
- ├── test/
107
- │ └── eval-harness.ts # Search quality evaluation (18 queries)
108
- ├── skills/qmd/ # Claude Code plugin skill definition
109
- └── .claude-plugin/ # Marketplace distribution metadata
110
- ```
84
+ 5. **Dual Runtime Compatibility** (`db.ts:9-24`): Cross-runtime SQLite layer that works under both Bun (bun:sqlite) and Node.js (better-sqlite3).
111
85
 
112
86
  ### Comparison with ClaudeMemory
113
87
 
114
- | Aspect | QMD | ClaudeMemory | Notes |
115
- |--------|-----|--------------|-------|
116
- | **Data Model** | Full markdown documents | Structured fact triples | Different paradigms: recall vs extraction |
117
- | **Storage** | SQLite + sqlite-vec (native vectors) | SQLite + JSON embeddings | QMD has 10-100x faster KNN |
118
- | **Search** | BM25 + Vector + RRF + Reranking | BM25 + Vector (hybrid) | QMD adds reranking + query expansion |
119
- | **MCP** | 6 tools + resource + prompt | 18 tools | ClaudeMemory has richer tool surface |
120
- | **Distribution** | Bun global install + plugin | Ruby gem + MCP + hooks | QMD has smoother install via plugin |
121
- | **LLM Dependency** | 3 local GGUF models (~2GB total) | None (local ONNX only) | ClaudeMemory is dramatically lighter |
122
- | **Query Expansion** | Custom fine-tuned model (1.7B) | None | QMD has ML-powered query improvement |
123
- | **Truth Maintenance** | None (all docs valid) | Supersession + conflicts | ClaudeMemory handles contradictions |
124
- | **Scope System** | YAML collections | Dual-database (global/project) | Both approaches valid for their use case |
125
- | **Testing** | Unit + eval harness | Unit + evals + benchmarks (DevMemBench) | ClaudeMemory has more comprehensive benchmarks |
88
+ | Aspect | QMD (1.1.0) | ClaudeMemory | Notes |
89
+ |--------|-------------|--------------|-------|
90
+ | **Data Model** | Content-addressable chunks | Subject-predicate-object facts | QMD stores documents; we store knowledge |
91
+ | **Storage** | SQLite + sqlite-vec | SQLite + Sequel + fastembed-rb | Both use FTS5 |
92
+ | **Vector Search** | sqlite-vec (native C) | JSON embeddings (Ruby) | QMD 10-100x faster |
93
+ | **Query Language** | Typed sub-queries (lex/vec/hyde) | Free-text search | QMD more expressive |
94
+ | **Chunking** | Smart (900 tok, markdown-aware) | None (fact-level) | Different granularity |
95
+ | **Plugin Format** | marketplace.json | Ruby gem + MCP + hooks | QMD easier to install |
96
+ | **MCP Transport** | stdio + HTTP | stdio only | HTTP enables shared server |
126
97
 
127
98
  ---
128
99
 
129
100
  ## Key Components Deep-Dive
130
101
 
131
- ### Component 1: Fine-Tuned Query Expansion
132
-
133
- **Purpose**: Generate structured query variations (lex/vec/hyde) to improve search recall by routing different query types to appropriate backends.
134
-
135
- **Location**: `finetune/`, `src/llm.ts:637-679`
136
-
137
- **Implementation** (from `finetune/README.md`):
138
-
139
- The custom model `qmd-query-expansion-1.7B` is trained in two stages:
140
-
141
- 1. **SFT (Supervised Fine-Tuning)**: Teaches format compliance
142
- - Base model: Qwen3-1.7B
143
- - LoRA rank 16, alpha 32 (all projection layers)
144
- - ~2,290 training examples, 5 epochs
145
- - Loss: train 0.472, val 0.304
102
+ ### Component 1: Query Document Parser
146
103
 
147
- 2. **GRPO (Group Relative Policy Optimization)**: Refines quality
148
- - LoRA rank 4, alpha 8 (q_proj, v_proj only)
149
- - KL beta 0.04 (prevents drift from SFT)
150
- - 200 steps, mean reward 0.757
104
+ **Purpose**: Parse structured multi-line queries into typed sub-queries for routing to appropriate search backends.
151
105
 
152
- **Reward Function** (from `finetune/reward.py`):
153
- 5 dimensions totaling 120 points (140 with hyde):
154
- - Format (0-30): Valid lex/vec/hyde lines
155
- - Diversity (0-30): Multiple types, no echoing query
156
- - HyDE (0-20): Presence, length, quality
157
- - Quality (0-20): Lex < vec length, preserved terms
158
- - Entity (-45 to +20): Named entity preservation
159
- - Think penalty: No `<think>` blocks (uses `/no_think` directive)
160
-
161
- **Output Format**:
162
- ```
163
- lex: authentication configuration
164
- lex: auth settings setup
165
- vec: how to configure authentication settings
166
- hyde: Authentication can be configured by setting the AUTH_SECRET environment variable.
167
- ```
106
+ **Location**: `docs/SYNTAX.md`, `src/store.ts`
168
107
 
169
108
  **Design Decisions**:
170
- - Structured output types (`lex:`, `vec:`, `hyde:`) route to different backends instead of generic rewrites
171
- - `/no_think` Qwen3 directive suppresses chain-of-thought for direct output
172
- - Grammar-constrained generation ensures format compliance at inference time
173
- - Per-query caching avoids redundant expansion (80% hit rate)
174
-
175
- **Relevance to ClaudeMemory**: The structured lex/vec/hyde output pattern is interesting — if we ever add query expansion to our recall pipeline, this type-routed approach is more sophisticated than simple query rewriting. The reward function design (multi-dimensional scoring with entity preservation) is also a good reference for evaluating any future distiller quality.
176
-
177
- ---
178
-
179
- ### Component 2: Claude Code Plugin System
180
-
181
- **Purpose**: Package QMD for frictionless installation via Claude Code marketplace.
182
-
183
- **Location**: `.claude-plugin/marketplace.json`, `skills/qmd/SKILL.md`
184
-
185
- **Plugin Structure** (from `marketplace.json:1-29`):
186
- ```json
187
- {
188
- "name": "qmd",
189
- "plugins": [{
190
- "name": "qmd",
191
- "skills": ["./skills/"],
192
- "mcpServers": {
193
- "qmd": { "command": "qmd", "args": ["mcp"] }
194
- }
195
- }]
196
- }
197
- ```
109
+ - Typed lines (`lex:`, `vec:`, `hyde:`) enable precise control over search routing
110
+ - First sub-query gets 2x weight in RRF fusion
111
+ - Plain text auto-expands via LLM to generate all three types
112
+ - Lex supports phrase matching and negation for disambiguation
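
The routing and weighting decisions above can be sketched roughly as follows. This is an illustration under stated assumptions, not QMD's implementation: `parseQueryDocument` and `fuseRankings` are hypothetical names, `k = 60` is the constant conventionally used for Reciprocal Rank Fusion rather than a value taken from QMD, and each ranked list stands in for the results of one sub-query.

```typescript
type SubQueryType = "lex" | "vec" | "hyde" | "expand";

interface SubQuery {
  type: SubQueryType;
  text: string;
}

// Typed lines route to a backend; plain text falls back to "expand".
function parseQueryDocument(doc: string): SubQuery[] {
  const subQueries: SubQuery[] = [];
  for (const rawLine of doc.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    const typed = /^(lex|vec|hyde):\s*(.*)$/.exec(line);
    if (typed) {
      subQueries.push({ type: typed[1] as SubQueryType, text: typed[2] ?? "" });
    } else {
      subQueries.push({ type: "expand", text: line });
    }
  }
  return subQueries;
}

// RRF: score(doc) = sum over lists of weight / (k + rank), with 1-based rank.
// The first list (first sub-query) gets double weight, as described above.
function fuseRankings(rankedLists: string[][], k = 60, firstWeight = 2): Map<string, number> {
  const scores = new Map<string, number>();
  rankedLists.forEach((list, listIndex) => {
    const weight = listIndex === 0 ? firstWeight : 1;
    list.forEach((docId, position) => {
      const contribution = weight / (k + position + 1);
      scores.set(docId, (scores.get(docId) ?? 0) + contribution);
    });
  });
  return scores;
}
```

With two lists that disagree on order, the document ranked first by the first sub-query wins the tie, which is the practical effect of the 2x weight.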
198
113
 
199
- **Skill Definition** (from `skills/qmd/SKILL.md:1-10`):
200
- ```yaml
201
- ---
202
- name: qmd
203
- description: Search personal markdown knowledge bases...
204
- metadata:
205
- author: tobi
206
- version: "1.1.1"
207
- allowed-tools: Bash(qmd:*), mcp__qmd__*
208
- ---
209
- ```
210
-
211
- Key features:
212
- - **Inline status check**: `!` prefix runs command during skill load (`SKILL.md:18`)
213
- - **Trigger phrases**: "search my notes", "find in docs", "what did I write about"
214
- - **Tool permissions**: Scoped to `qmd:*` bash commands and `mcp__qmd__*` tools
215
- - **Score interpretation guide**: Embedded in skill for LLM consumption
216
- - **Recommended workflow**: status → search → vsearch → query → get
217
-
218
- **Relevance to ClaudeMemory**: This is the clearest example of how to package a memory/search tool as a Claude Code plugin. The skill definition format, tool permissions scoping, inline status checks, and MCP server bundling are all patterns we should adopt when ready to ship as a plugin. The `allowed-tools` pattern (`Bash(qmd:*)`) is particularly useful for security scoping.
219
-
220
- ---
221
-
222
- ### Component 3: MCP Server with Structured Content
223
-
224
- **Purpose**: Expose QMD search as MCP tools with both human-readable text and machine-parseable structured content.
225
-
226
- **Location**: `src/mcp.ts`
227
-
228
- **Implementation** (from `mcp.ts:258-292`):
229
- ```typescript
230
- server.registerTool("search", {
231
- title: "Search (BM25)",
232
- inputSchema: {
233
- query: z.string().describe("Search query"),
234
- limit: z.number().optional().default(10),
235
- minScore: z.number().optional().default(0),
236
- collection: z.string().optional(),
237
- },
238
- }, async ({ query, limit, minScore, collection }) => {
239
- // ... search logic ...
240
- return {
241
- content: [{ type: "text", text: formatSearchSummary(filtered, query) }],
242
- structuredContent: { results: filtered },
243
- };
244
- });
245
- ```
114
+ ### Component 2: HTTP MCP Transport
246
115
 
247
- **Key patterns**:
248
- 1. **Dual output**: Both `content` (human-readable text) and `structuredContent` (JSON) returned from every tool
249
- 2. **Zod validation**: Input schemas use Zod v4 with `.describe()` for auto-documentation
250
- 3. **Resource template**: Documents accessible via `qmd://{+path}` URI pattern with suffix matching fallback (`mcp.ts:105-166`)
251
- 4. **Query guide prompt**: Registered prompt explaining search strategy to LLMs (`mcp.ts:172-252`)
252
- 5. **Line numbers**: Default in resource output for precise references
253
- 6. **Error handling**: `isError: true` flag for clear error signaling, fuzzy file suggestions on not-found
254
-
255
- **Relevance to ClaudeMemory**: We already have 18 MCP tools, but QMD's dual `content`/`structuredContent` pattern is worth adopting — it ensures both human (text summary) and machine (JSON) consumers get optimal formats. The registered prompt for query guidance is also a good pattern for improving Claude's tool usage.
256
-
257
- ---
116
+ **Purpose**: Long-lived MCP server that avoids repeated model loading.
258
117
 
259
- ### Component 4: Session-Scoped LLM Lifecycle
118
+ **Location**: `src/mcp.ts:119-137`
260
119
 
261
- **Purpose**: Manage LLM model loading, context creation, and cleanup with structured lifecycle guarantees.
120
+ **Design Decisions**:
121
+ - WebStandardStreamableHTTPServerTransport for stateless HTTP
122
+ - Models stay loaded in VRAM across requests
123
+ - Idle disposal after 5 min (transparent recreation ~1s)
124
+ - Health endpoint for liveness checks
125
+ - Daemon mode with PID file management
262
126
 
263
- **Location**: `src/llm.ts:126-146`
127
+ ### Component 3: Smart Chunking
264
128
 
265
- **Session Interface** (from `llm.ts:137-146`):
266
- ```typescript
267
- export interface ILLMSession {
268
- embed(text: string, options?: EmbedOptions): Promise<EmbeddingResult | null>;
269
- embedBatch(texts: string[]): Promise<(EmbeddingResult | null)[]>;
270
- expandQuery(query: string, options?): Promise<Queryable[]>;
271
- rerank(query: string, documents: RerankDocument[]): Promise<RerankResult>;
272
- readonly isValid: boolean;
273
- readonly signal: AbortSignal;
274
- }
275
- ```
129
+ **Purpose**: Split documents at natural boundaries for better embeddings.
276
130
 
277
- **Key patterns**:
278
- - Sessions have `isValid` flag and `signal` (AbortSignal) for lifecycle tracking
279
- - Maximum duration timeout prevents runaway sessions
280
- - Models lazy-load but stay resident; contexts dispose after 2-min idle
281
- - Singleton pattern ensures only one LLM instance (memory management)
131
+ **Location**: `src/store.ts:68-219`
282
132
 
283
- **Relevance to ClaudeMemory**: If we ever integrate local LLMs for distillation, this session-scoped lifecycle pattern is the right approach. Clean abort propagation via AbortSignal is a good practice for any long-running operation.
133
+ **Design Decisions**:
134
+ - Scored break points (h1=100 → newline=1) with distance decay
135
+ - Code fence detection prevents splitting mid-block
136
+ - 200-token search window for finding optimal cut points
137
+ - Squared distance decay for gentle early, steep late penalties
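
A rough sketch of how scored break points and squared distance decay could combine when choosing a cut. The base scores and the 200-token window come from the notes above; the `BreakPoint` shape, the `chooseCutPoint` name, the 0.7 decay factor, and measuring positions in tokens are assumptions made for illustration only.

```typescript
interface BreakPoint {
  position: number;      // offset of the candidate cut (assumed to be in tokens)
  score: number;         // base score, e.g. h1=100, h2=90, paragraph=20, newline=1
  inCodeFence?: boolean; // candidates inside a code fence are never used
}

const WINDOW_TOKENS = 200; // search window before the target cut
const DECAY_FACTOR = 0.7;  // assumed strength of the distance penalty

// Pick the best cut at or before `target`; fall back to a hard cut at `target`.
function chooseCutPoint(breakPoints: BreakPoint[], target: number): number {
  let best = target;
  let bestScore = 0;
  for (const bp of breakPoints) {
    if (bp.inCodeFence) continue;
    const distance = target - bp.position;
    if (distance < 0 || distance > WINDOW_TOKENS) continue;
    // Squared decay: small penalty near the target, steep penalty near the window edge.
    const normalized = distance / WINDOW_TOKENS;
    const adjusted = bp.score * (1 - normalized * normalized * DECAY_FACTOR);
    if (adjusted > bestScore) {
      bestScore = adjusted;
      best = bp.position;
    }
  }
  return best;
}
```

Under these assumptions an h2 heading 150 tokens before the target still beats a paragraph break 10 tokens away, because the heading's higher base score outweighs the penalty.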
284
138
 
285
139
  ---
286
140
 
287
141
  ## Comparative Analysis
288
142
 
289
- ### What QMD Does Well (New Findings)
290
-
291
- #### 1. Custom Fine-Tuned Model Pipeline
292
- - **Description**: Full training pipeline (SFT → GRPO → GGUF conversion) for search-specific model
293
- - **Evidence**: `finetune/reward.py` — multi-dimensional reward function; `finetune/train.py` — unified training script
294
- - **Why It Works**: Domain-specific models outperform general-purpose LLMs for structured tasks. The two-stage approach (format learning via SFT, quality refinement via GRPO) is state-of-the-art.
295
- - **Metric**: Min 92% average score required before deployment
296
-
297
- #### 2. Plugin Distribution
298
- - **Description**: Ships as a Claude Code marketplace plugin with zero-config MCP + skills
299
- - **Evidence**: `.claude-plugin/marketplace.json`, `skills/qmd/SKILL.md`
300
- - **Why It Works**: `claude marketplace add tobi/qmd` is dramatically simpler than manual gem install + MCP config + hook setup
301
- - **Impact**: Massive UX improvement for installation
302
-
303
- #### 3. Typed Query Routing
304
- - **Description**: Query expansion produces typed outputs (`lex:`, `vec:`, `hyde:`) routed to appropriate backends
305
- - **Evidence**: `llm.ts:637-679` — structured prompt; `llm.ts:1006-1013` — grammar constraint
306
- - **Why It Works**: Different search backends have different strengths. Routing keyword queries to BM25 and semantic queries to vector search maximizes recall.
143
+ ### What They Do Well
307
144
 
308
- #### 4. Dual Content/StructuredContent MCP Responses
309
- - **Description**: Every MCP tool returns both human-readable text summary and machine-parseable JSON
310
- - **Evidence**: `mcp.ts:288-291`: `return { content: [...], structuredContent: {...} }`
311
- - **Why It Works**: LLMs can parse both formats, but text summaries are more token-efficient for simple consumption
145
+ 1. **Native Vector Queries**: sqlite-vec provides sub-millisecond KNN with C-level performance
146
+ 2. **Typed Query Language**: Explicit control over search routing reduces ambiguity
147
+ 3. **Smart Chunking**: Markdown-aware splitting produces better embeddings
148
+ 4. **HTTP MCP Transport**: Shared server avoids repeated model loading
149
+ 5. **Dynamic Instructions**: Index-aware MCP instructions give LLM immediate context
312
150
 
313
151
  ### What We Do Well
314
152
 
315
- #### 1. Fact-Based Knowledge Graph
316
- - Our subject-predicate-object triples enable structured queries and inference
317
- - Truth maintenance resolves contradictions automatically
318
- - Far richer than document-level retrieval for knowledge extraction
319
-
320
- #### 2. Dual-Database Architecture
321
- - Clean global/project separation without YAML collections
322
- - Simpler queries, clearer data ownership
323
-
324
- #### 3. Comprehensive MCP Surface
325
- - 18 tools vs QMD's 6 — we cover recall, explain, manage, monitor
326
- - Progressive disclosure (recall_index → recall_details) for token efficiency
327
-
328
- #### 4. Lightweight Dependencies
329
- - ~5MB gem vs ~2GB+ with GGUF models
330
- - fastembed-rb (67MB ONNX) vs EmbeddingGemma (300MB GGUF)
331
- - No runtime LLM dependency
332
-
333
- #### 5. Robust Benchmarking
334
- - DevMemBench: 155 queries, Recall@k, MRR, nDCG@10
335
- - 100 truth maintenance test cases
336
- - 31 end-to-end scenarios with real Claude
337
- - QMD has 18 eval queries — our evaluation is more comprehensive
338
-
339
- ### Trade-offs
340
-
341
- | Approach | Pros | Cons | Best For |
342
- |----------|------|------|----------|
343
- | **QMD's LLM-powered search** | Better semantic recall, typed query routing | 2GB+ models, 2-3s cold start, complex deps | Large document collections, conceptual search |
344
- | **Our FastEmbed search** | Lightweight (67MB), fast (<100ms), no LLM | Lower semantic quality for vague queries | Structured fact retrieval, quick lookups |
345
- | **QMD's plugin distribution** | Zero-config install, marketplace discovery | Requires plugin ecosystem maturity | Wide user adoption |
346
- | **Our gem + MCP + hooks** | Fine-grained control, works today | Complex setup, multiple config files | Power users, custom integrations |
153
+ 1. **Knowledge Representation**: Facts with provenance > raw document chunks
154
+ 2. **Truth Maintenance**: Supersession and conflict resolution
155
+ 3. **Dual-Database System**: Project/global scope separation
156
+ 4. **Distillation Pipeline**: Extract structured knowledge from transcripts
157
+ 5. **Temporal Validity**: Facts have valid_from/valid_to windows
347
158
 
348
159
  ---
349
160
 
@@ -351,247 +162,94 @@ export interface ILLMSession {
351
162
 
352
163
  ### High Priority ⭐
353
164
 
354
- #### 1. Claude Code Plugin Distribution Format ⭐ NEW
355
- - **Value**: 10x easier installation (single command vs multi-step gem + MCP + hook config)
356
- - **Evidence**: `.claude-plugin/marketplace.json`: complete plugin spec; `skills/qmd/SKILL.md`: skill definition with tool scoping
357
- - **Implementation**: Create `.claude-plugin/marketplace.json` with `mcpServers` pointing to `claude-memory serve-mcp`, skill definition from existing MCP tools, and `allowed-tools: mcp__claude-memory__*`
358
- - **Effort**: 2-3 days (plugin metadata, skill definition, testing, documentation)
359
- - **Trade-off**: Depends on Claude Code plugin ecosystem maturity; current hooks integration may still be needed
360
- - **Recommendation**: **ADOPT** — QMD proves the format works. Start with plugin skeleton, iterate as ecosystem matures
361
- - **Integration Points**: New `.claude-plugin/` directory, `skills/` directory, update installation docs
362
-
363
- #### 2. MCP Structured Content Pattern NEW
364
- - **Value**: Better MCP response quality: dual human-readable + machine-parseable output
365
- - **Evidence**: `mcp.ts:288-291`: `{ content: [{ type: "text", text: summary }], structuredContent: { results } }`
366
- - **Implementation**: Update all 18 MCP tool handlers to return both `content` (text summary) and `structuredContent` (JSON). Text content would be a concise summary; structured content preserves full data.
367
- - **Effort**: 1-2 days (update tool handlers, update tests)
368
- - **Trade-off**: Slightly more code per tool handler; may need to verify Claude Code MCP client supports `structuredContent`
369
- - **Recommendation**: **ADOPT** — Pure improvement, no downside if client supports it
370
- - **Integration Points**: `lib/claude_memory/mcp/server.rb`, all tool handler methods
371
-
372
- #### 3. MCP Registered Prompt for Query Guidance ⭐ NEW
373
- - **Value**: Claude uses memory tools more effectively with embedded search strategy
374
- - **Evidence**: `mcp.ts:172-252` — registered prompt explaining when to use recall vs recall_semantic vs search_concepts
375
- - **Implementation**: Register a `memory_guide` prompt in our MCP server explaining tool selection strategy (recall for keywords, recall_semantic for concepts, search_concepts for multi-faceted queries, explain for provenance)
376
- - **Effort**: 4-6 hours (write prompt, register in server, test)
377
- - **Trade-off**: Minimal; prompt is only loaded on request
378
- - **Recommendation**: **ADOPT** — Simple way to improve tool usage quality
379
- - **Integration Points**: `lib/claude_memory/mcp/server.rb`
380
-
381
- #### 4. Inline Status Check in Skills ⭐ NEW
382
- - **Value**: Immediate feedback on memory system health when skill loads
383
- - **Evidence**: `SKILL.md:18` — `!` prefix runs `qmd status 2>/dev/null || echo "Not installed"`
384
- - **Implementation**: Add inline check to our skill definition: `!claude-memory doctor --brief 2>/dev/null || echo "Not configured. Run: gem install claude_memory"`
385
- - **Effort**: 1-2 hours
386
- - **Trade-off**: None
387
- - **Recommendation**: **ADOPT** — Trivial improvement with clear benefit
388
- - **Integration Points**: Skill definition file
389
-
390
- ### Previously Identified (Carried Forward)
391
-
392
- These items from the 2026-01-26 analysis remain relevant:
393
-
394
- #### 5. ⭐ Native Vector Storage (sqlite-vec) — STILL CRITICAL
395
- - **Value**: 10-100x faster KNN queries
396
- - **Status**: Not yet implemented in ClaudeMemory
397
- - **Updated Evidence**: QMD now handles 10,000+ documents in production (5,700+ star project)
398
- - **Recommendation**: **ADOPT IMMEDIATELY** — Foundational improvement
399
-
400
- #### 6. ⭐ Reciprocal Rank Fusion (RRF) Algorithm — STILL HIGH VALUE
401
- - **Value**: 50% improvement in Hit@3 for medium-difficulty queries
402
- - **Status**: Not yet implemented in ClaudeMemory
403
- - **Recommendation**: **ADOPT IMMEDIATELY** — Pure algorithmic improvement
404
-
405
- #### 7. ⭐ Docid Short Hash System — STILL MEDIUM VALUE
406
- - **Value**: Better UX, cross-database fact references
407
- - **Status**: Not yet implemented
408
- - **Recommendation**: **ADOPT IN PHASE 2**
409
-
410
- #### 8. ⭐ Smart Expansion Detection — STILL MEDIUM VALUE
411
- - **Value**: Skip unnecessary vector search when FTS has strong signal
412
- - **Status**: Not yet implemented
413
- - **Recommendation**: **ADOPT IN PHASE 3**
165
+ #### 1. Native Vector Storage (sqlite-vec)
166
+ - **Value**: 10-100x faster KNN queries, eliminates O(n) Ruby similarity
167
+ - **Evidence**: `db.ts:52-54` — single function call to load extension
168
+ - **Implementation**: Add sqlite-vec gem, create `facts_vec` virtual table, two-step query pattern
169
+ - **Effort**: 3-5 days
170
+ - **Trade-off**: Native dependency (but well-maintained, cross-platform)
171
+ - **Recommendation**: **ADOPT** — Critical for scaling beyond 1000 facts
172
+
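For context on the "O(n) Ruby similarity" that item 1 wants to eliminate, here is a minimal sketch of a brute-force nearest-neighbour scan. It is an illustration only: the method names and the `{ id:, embedding: }` fact shape are assumptions, not code from ClaudeMemory or QMD. Every stored embedding is scored against the query, so the cost grows linearly with the number of facts.

```ruby
# Brute-force KNN baseline (hypothetical helper names).
# Each query scores every stored embedding, so the work is O(n) per query.

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# facts: array of { id:, embedding: } hashes (assumed shape).
# Returns the k best [id, score] pairs, highest score first.
def brute_force_knn(facts, query, k)
  facts
    .map { |fact| [fact[:id], cosine_similarity(fact[:embedding], query)] }
    .max_by(k) { |_id, score| score }
end

facts = [
  { id: 1, embedding: [1.0, 0.0] },
  { id: 2, embedding: [0.0, 1.0] },
  { id: 3, embedding: [0.7, 0.7] }
]

brute_force_knn(facts, [1.0, 0.1], 2).each do |id, score|
  puts format("fact %d  score %.3f", id, score)
end
```

A native vector index would replace the `map` over all facts with a single indexed query, which is where the claimed speedup would come from.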
173
+ #### 2. Smart Chunking for Long Content
174
+ - **Value**: Better embeddings for transcripts > 3000 chars
175
+ - **Evidence**: `store.ts:53-219`: scored break points, code fence awareness
176
+ - **Implementation**: Port chunking algorithm to Ruby for transcript ingestion
177
+ - **Effort**: 2-3 days
178
+ - **Trade-off**: Complexity; only needed for long content
179
+ - **Recommendation**: **CONSIDER**; adopt if users report problems with long transcripts
180
+
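The idea in item 2 (scored break points plus code fence awareness) can be sketched roughly as below. This is not a port of QMD's `store.ts`. The scores, the method name `chunk_text`, and the line-based strategy are all assumptions chosen to keep the example short. Candidate cut points are recorded after each line, paragraph breaks score higher than plain line breaks, and no cut point is recorded while inside a fenced code block.

```ruby
# Hypothetical scores: a higher value marks a better place to cut.
BREAK_SCORES = { paragraph: 3, line: 2 }.freeze
FENCE = "`" * 3 # code fence marker, built this way to keep the example fence-safe

# Splits text into chunks of roughly max_chars, cutting only at recorded
# break points and never inside a fenced code block.
def chunk_text(text, max_chars: 3000)
  chunks = []
  buffer = +""
  cuts = [] # candidate cut points as [offset_in_buffer, score]
  in_fence = false

  text.each_line do |line|
    # Flush at the best-scoring cut (latest one on ties) before overflowing.
    while buffer.length + line.length > max_chars && !cuts.empty?
      offset, _score = cuts.max_by { |off, score| [score, off] }
      chunks << buffer[0...offset]
      buffer = buffer[offset..]
      cuts = cuts.select { |off, _| off > offset }.map { |off, s| [off - offset, s] }
    end

    buffer << line
    in_fence = !in_fence if line.lstrip.start_with?(FENCE)
    next if in_fence # no cut points inside a code block

    score = line.strip.empty? ? BREAK_SCORES[:paragraph] : BREAK_SCORES[:line]
    cuts << [buffer.length, score]
  end

  chunks << buffer unless buffer.empty?
  chunks
end

sample = "intro line one\nintro line two\n\n#{FENCE}ruby\nputs 1\n\nputs 2\n#{FENCE}\n\nclosing paragraph\n"
chunk_text(sample, max_chars: 40).each_with_index do |chunk, i|
  puts "chunk #{i + 1}: #{chunk.length} chars"
end
```

A single line or code block longer than `max_chars` is kept whole in this sketch, so chunks can exceed the limit. A real implementation would need a fallback for that case.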
181
+ #### 3. HTTP MCP Transport
182
+ - **Value**: Shared server, models stay loaded, faster subsequent queries
183
+ - **Evidence**: `mcp.ts:119-137`: WebStandardStreamableHTTPServerTransport
184
+ - **Implementation**: Add HTTP transport option alongside stdio
185
+ - **Effort**: 2-3 days
186
+ - **Trade-off**: Process management complexity
187
+ - **Recommendation**: **CONSIDER**; useful if MCP startup latency becomes an issue
414
188
 
415
189
  ### Medium Priority
416
190
 
417
- #### 9. Skill Definition with Tool Scoping
418
- - **Value**: Security and UX: limit tool access to memory-related commands
419
- - **Evidence**: `SKILL.md:9` — `allowed-tools: Bash(qmd:*), mcp__qmd__*`
420
- - **Implementation**: Define skill with `allowed-tools: Bash(claude-memory:*), mcp__claude-memory__*`
421
- - **Effort**: Included in plugin distribution work
422
- - **Recommendation**: **CONSIDER** — Good practice for plugin security
423
- - **Integration Points**: Skills directory
424
-
425
- #### 10. Evaluation Harness Improvements
426
- - **Value**: QMD's eval structure with difficulty levels and Hit@K metrics is cleaner
427
- - **Evidence**: `test/eval-harness.ts:11-16` — typed queries with difficulty + description
428
- - **Implementation**: Already have DevMemBench (more comprehensive). Could adopt difficulty classification.
429
- - **Recommendation**: **CONSIDER** — Our evals are already better; could add difficulty labels
430
-
431
- ### Low Priority
432
-
433
- #### 11. YAML-Based Collection Configuration
434
- - **Value**: User-editable config for what gets indexed
435
- - **Evidence**: `collections.ts`, `example-index.yml`
436
- - **Recommendation**: **REJECT** — Our dual-database provides cleaner separation
437
-
438
- #### 12. Custom Query Expansion Model
439
- - **Value**: Better search recall via ML-powered query rewriting
440
- - **Evidence**: `finetune/` — complete training pipeline
441
- - **Recommendation**: **REJECT** — Too heavy (1.7B model) for our fact retrieval use case. If we need expansion, we can leverage Claude's own capabilities during recall.
442
-
443
- #### 13. LLM-Based Reranking
444
- - **Value**: Better ranking precision
445
- - **Recommendation**: **REJECT** — Over-engineering for structured fact retrieval
191
+ #### 4. Dynamic MCP Server Instructions
192
+ - **Value**: Give LLM immediate context about database state without extra tool call
193
+ - **Evidence**: `mcp.ts:91-98` — builds instructions from actual index state
194
+ - **Implementation**: Generate instructions showing fact counts, recent decisions, active conflicts
195
+ - **Effort**: 1 day
196
+ - **Trade-off**: Minimal
197
+ - **Recommendation**: **ADOPT**
198
+
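A rough sketch of what item 4 describes: building the server's instruction text from the current database state instead of a static string. The `stats` hash and its keys (`:fact_count`, `:recent_decisions`, `:open_conflicts`) are hypothetical; a real implementation would fill them from the fact store and pass the result to whatever instructions field the MCP server library exposes.

```ruby
# Builds instruction text from a snapshot of database state.
# stats is an assumed hash shape, not an existing ClaudeMemory structure.
def build_instructions(stats)
  lines = ["Memory database state:"]
  lines << "- Facts stored: #{stats.fetch(:fact_count, 0)}"

  decisions = stats.fetch(:recent_decisions, [])
  lines << "- Recent decisions: #{decisions.join('; ')}" unless decisions.empty?

  conflicts = stats.fetch(:open_conflicts, 0)
  lines << "- Open conflicts: #{conflicts}" if conflicts.positive?

  lines.join("\n")
end

puts build_instructions(
  fact_count: 128,
  recent_decisions: ["Use sqlite-vec for KNN", "Keep stdio transport"],
  open_conflicts: 2
)
```

Empty sections are omitted so that a fresh database produces a short instruction block rather than a list of zeroes.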
199
+ #### 5. Query Document Format
200
+ - **Value**: More expressive queries with explicit search routing
201
+ - **Evidence**: `docs/SYNTAX.md:1-100` — formal EBNF grammar
202
+ - **Implementation**: Support typed queries in recall (e.g., `lex: exact term` vs `vec: semantic query`)
203
+ - **Effort**: 3-5 days
204
+ - **Trade-off**: Complexity; current free-text may be sufficient
205
+ - **Recommendation**: **DEFER** — Over-engineering for fact retrieval
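To make the trade-off in item 5 concrete, here is a minimal sketch of prefix-based routing for the `lex:` and `vec:` examples mentioned above. It does not implement QMD's EBNF grammar; the `ROUTES` table, the `route_query` name, and the returned hash shape are assumptions for illustration. Anything without a recognised prefix falls back to the current free-text behaviour.

```ruby
# Hypothetical mapping from query prefix to search backend.
ROUTES = { "lex" => :full_text, "vec" => :vector }.freeze

# Returns { mode:, text: } for a single query line.
def route_query(line)
  prefix, rest = line.split(":", 2)
  mode = ROUTES[prefix.to_s.strip.downcase]
  return { mode: :free_text, text: line.strip } if mode.nil? || rest.nil?

  { mode: mode, text: rest.strip }
end

["lex: exact term", "vec: semantic query", "how do we handle auth?"].each do |q|
  p route_query(q)
end
```

Even this small router adds a syntax users must learn, which is consistent with the recommendation to defer it while free-text recall is sufficient.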
446
206
 
447
207
  ### Features to Avoid
448
208
 
449
- #### 1. Heavy Local LLM Dependencies
450
- - **What It Is**: Three GGUF models totaling ~2GB for search operations
451
- - **Why Avoid**: ClaudeMemory targets lightweight, instant search. A 2-3s cold start and 3GB of memory are inappropriate for a fact lookup tool.
452
- - **Our Alternative**: FastEmbed (67MB ONNX, <100ms) provides adequate semantic search for structured facts.
453
-
454
- #### 2. Content-Addressable Document Storage
455
- - **What It Is**: SHA256 hash-based deduplication of full documents
456
- - **Why Avoid**: We store facts, not documents. Our deduplication is by fact signature.
457
- - **Our Alternative**: Existing fact signature-based deduplication.
458
-
459
- ---
460
-
461
- ## Implementation Recommendations
462
-
463
- ### Phase 1: Plugin Foundation (NEW)
464
-
465
- **Goals**: Establish ClaudeMemory as a Claude Code plugin with improved MCP output
466
-
467
- **Tasks**:
468
- - [ ] Create `.claude-plugin/marketplace.json` with plugin metadata
469
- - [ ] Create skill definition with tool scoping and inline health check
470
- - [ ] Add MCP structured content pattern to all 18 tool handlers
471
- - [ ] Register query guidance prompt in MCP server
472
- - [ ] Test plugin installation workflow
473
- - [ ] Update installation docs
474
-
475
- **Success Criteria**:
476
- - ClaudeMemory installable via `claude plugin add`
477
- - MCP tools return both text summaries and structured JSON
478
- - Query guide prompt available via MCP
479
-
480
- **Risks**: Plugin ecosystem may change; maintain backward compatibility with manual setup
481
-
482
- ---
483
-
484
- ### Phase 2: Vector Storage Upgrade (CARRIED FORWARD)
485
-
486
- **Goals**: Adopt sqlite-vec for native KNN and RRF fusion for search quality
487
-
488
- **Tasks**:
489
- - [ ] Add sqlite-vec extension support
490
- - [ ] Schema migration for `facts_vec` virtual table (two-step query pattern)
491
- - [ ] Implement `Recall::RRFusion` class
492
- - [ ] Backfill existing embeddings
493
- - [ ] Benchmark: target 10x KNN improvement
494
-
495
- **Success Criteria**:
496
- - Vector search uses native sqlite-vec
497
- - RRF fusion active for hybrid queries
498
- - DevMemBench shows improved retrieval metrics
499
-
500
- ---
501
-
502
- ### Phase 3: UX Polish (CARRIED FORWARD)
503
-
504
- **Goals**: Docid hashes and smart expansion detection
505
-
506
- **Tasks**:
507
- - [ ] Schema migration for `docid` column (8-char hash)
508
- - [ ] Implement `Recall::ExpansionDetector`
509
- - [ ] Update CLI and MCP tools for docid support
209
+ - **Custom Fine-Tuned Query Expansion (Qwen3-1.7B)**: Too heavy for fact retrieval
210
+ - **EmbeddingGemma**: We use fastembed-rb (BAAI/bge-small-en-v1.5) which is lighter
211
+ - **Content-Addressable Storage**: Our facts are deduplicated by signature, not content hash
212
+ - **LLM Reranking**: Cross-encoder reranking is over-engineering for our use case
510
213
 
511
214
  ---
512
215
 
513
216
  ## Architecture Decisions
514
217
 
515
218
  ### What to Preserve
219
+ - **Fact-based knowledge model**: More valuable than raw document chunks
220
+ - **Dual-database system**: Clean project/global separation
221
+ - **Ruby + Sequel**: Mature, stable, well-tested
516
222
 
517
- - **Fact-Based Knowledge Graph**: Our structured triples are fundamentally different from (and better suited for knowledge extraction than) QMD's document storage
518
- - **Truth Maintenance**: Supersession + conflict resolution is a core differentiator
519
- - **Dual-Database Architecture**: Cleaner than YAML collections for our use case
520
- - **Lightweight Dependencies**: Ruby gem + ONNX embeddings vs 2GB+ GGUF models
521
-
522
- ### What to Adopt (NEW)
523
-
524
- - **Plugin Distribution Format**: `.claude-plugin/marketplace.json` + skills for frictionless installation
525
- - **Structured MCP Content**: Dual `content`/`structuredContent` responses for all tools
526
- - **MCP Query Guide Prompt**: Registered prompt teaching Claude how to use memory tools effectively
527
- - **Inline Status Checks**: Skill-level health verification on load
528
-
529
- ### What to Adopt (CARRIED FORWARD)
530
-
531
- - **sqlite-vec Native Vectors**: 10-100x faster KNN (critical)
532
- - **RRF Fusion**: 50% search quality improvement (critical)
533
- - **Docid Short Hashes**: Better UX for fact references
534
- - **Smart Expansion Detection**: Skip vector search when FTS is confident
223
+ ### What to Adopt
224
+ - **sqlite-vec**: Critical for vector query performance
225
+ - **Two-step vector query pattern**: Avoid JOIN hangs
226
+ - **Dynamic MCP instructions**: Free context for LLMs
535
227
 
536
228
  ### What to Reject
537
-
538
- - **Local LLM Models for Search**: Too heavy (2GB+, 3s cold start)
539
- - **Custom Fine-Tuned Models**: Training pipeline is impressive but overkill for fact retrieval
540
- - **YAML Collection System**: Our dual-DB is better for our use case
541
- - **Content-Addressable Storage**: Different data model
542
- - **Virtual Path System**: Unnecessary for fact-based storage
229
+ - **YAML collection system**: Our dual-database is cleaner
230
+ - **Custom fine-tuned models**: Too heavy for our use case
231
+ - **Query document format**: Over-engineering for fact retrieval
543
232
 
544
233
  ---
545
234
 
546
235
  ## Key Takeaways
547
236
 
548
237
  ### Main Learnings
238
+ 1. sqlite-vec is production-ready (v0.1.7-alpha.2) and used by multiple projects
239
+ 2. Two-step query pattern is mandatory (JOINs hang with vec tables)
240
+ 3. Query document format is elegant but over-engineering for fact retrieval
241
+ 4. HTTP MCP transport enables shared server mode
549
242
 
550
- 1. **Plugin distribution is the future**: QMD's marketplace plugin reduces installation from "read docs, install gem, configure MCP, set up hooks, restart Claude" to one command. This is the single most impactful UX improvement we should adopt.
551
-
552
- 2. **Structured MCP responses matter**: Returning both text summary and structured JSON is a simple pattern that significantly improves how Claude consumes tool output.
553
-
554
- 3. **Fine-tuned models for specific tasks work**: QMD's two-stage SFT→GRPO pipeline for query expansion is state-of-the-art. While we shouldn't adopt the models themselves (too heavy), the reward function design and structured output routing are good reference patterns.
555
-
556
- 4. **Eval methodology with difficulty levels**: QMD's easy/medium/hard query classification provides clearer signal about where improvements matter. Our DevMemBench is more comprehensive but could benefit from this labeling.
557
-
558
- 5. **The previous QMD analysis recommendations remain valid**: sqlite-vec, RRF, docids, and smart expansion are still unimplemented and still valuable.
559
-
560
- ### Recommended Adoption Order
561
-
562
- 1. **First**: Plugin distribution format — highest UX impact, unblocks ecosystem adoption
563
- 2. **Second**: MCP structured content + query guide prompt — low effort, immediate quality gain
564
- 3. **Third**: sqlite-vec + RRF fusion — foundational performance and quality
565
- 4. **Fourth**: Docids + smart expansion — polish and optimization
566
-
567
- ### Expected Impact
568
-
569
- - **Installation**: 10x easier (single command vs multi-step)
570
- - **MCP Quality**: Better Claude tool usage with structured responses + query guidance
571
- - **Search Performance**: 10-100x faster KNN (sqlite-vec), 50% better Hit@3 (RRF)
572
- - **UX**: Human-friendly fact references (#abc123de), smarter search skipping
573
-
574
- ### Next Actions
575
-
576
- - [ ] Review plugin distribution feasibility (check Claude Code plugin spec)
577
- - [ ] Implement MCP structured content pattern (quick win)
578
- - [ ] Register query guide MCP prompt (quick win)
579
- - [ ] Continue with sqlite-vec + RRF adoption plan from previous analysis
580
- - [ ] Store analysis findings in memory
581
-
582
- ---
583
-
584
- ## References
585
-
586
- - **Repository**: https://github.com/tobi/qmd
587
- - **Previous Analysis**: docs/influence/qmd.md (2026-01-26)
588
- - **Claude Code Plugins**: https://code.claude.com/docs/en/plugins.md
589
- - **MCP Spec**: https://modelcontextprotocol.io
590
- - **sqlite-vec**: https://github.com/asg017/sqlite-vec
591
- - **RRF Paper**: Cormack et al., "Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods" (2009)
243
+ ### Changes Since Last Analysis (2026-02-02)
244
+ - v1.1.0 released with query document format
245
+ - Lex syntax with phrase matching and negation
246
+ - Unified `query` MCP tool replacing 3 separate tools
247
+ - HTTP MCP transport with daemon mode
248
+ - Dual Node.js/Bun runtime support
249
+ - Collection include/exclude management
592
250
 
593
251
  ---
594
252
 
595
- *Analysis completed: 2026-02-02*
253
+ *Analysis completed: 2026-03-02*
596
254
  *Analyst: Claude Code*
597
- *Review Status: Draft — Updated from 2026-01-26 analysis with new findings on plugin distribution, fine-tuned models, and MCP patterns*
255
+ *Review Status: Draft*