claude_memory 0.6.0 → 0.7.0

@@ -1,9 +1,9 @@
1
1
  # QMD Analysis (Updated)
2
2
 
3
- *Analysis Date: 2026-03-02*
4
- *Previous Analysis: 2026-02-02*
3
+ *Analysis Date: 2026-03-10*
4
+ *Previous Analysis: 2026-03-02, 2026-02-02*
5
5
  *Repository: https://github.com/tobi/qmd*
6
- *Version: 1.1.0 (commit 40610c3)*
6
+ *Version: 2.0.1 (commit ae3604c)*
7
7
 
8
8
  ---
9
9
 
@@ -13,36 +13,40 @@
13
13
 
14
14
  QMD (Query Markup Documents) is an **on-device search engine** for markdown knowledge bases. It combines BM25 full-text search, vector semantic search, and LLM re-ranking — all running locally via node-llama-cpp with GGUF models.
15
15
 
16
- ### Key Innovation (What's New Since Last Study)
16
+ ### Key Innovation (What's New Since v1.1.5 Study)
17
17
 
18
- 1. **Query Document Format** (`docs/SYNTAX.md`): Structured multi-line queries with typed sub-queries (`lex:`, `vec:`, `hyde:`) that route to different search backends. First sub-query gets 2x weight in Reciprocal Rank Fusion. This replaces the separate `search`/`vsearch`/`query` commands with a unified `query` tool.
18
+ 1. **Stable SDK API** (`src/index.ts:1-524`): QMD 2.0 declares a stable library API via `createStore()` returning a `QMDStore` interface. Clean separation between SDK (public), CLI (consumer), and MCP (consumer). The SDK owns all search, retrieval, collection management, context management, indexing, and lifecycle operations.
19
19
 
20
- 2. **Lex Query Syntax**: Full BM25 operator support: `"exact phrase"` matching, `-term` exclusions, `-"phrase"` exclusions. Enables intent-aware disambiguation (e.g., `performance -sports -athlete`).
20
+ 2. **Unified `search()` Method** (`src/index.ts:145-164`): Replaces the old `query()`/`search()`/`structuredSearch()` split. Accepts either a simple `query` string (auto-expanded) or pre-expanded `queries` array. Clean polymorphic design.
21
21
 
22
- 3. **HTTP MCP Transport** (`src/mcp.ts:10-16`): Stateless HTTP server alongside stdio. Models stay loaded in VRAM across requests. Embedding/reranking contexts disposed after 5 min idle.
22
+ 3. **MCP Server as SDK Consumer** (`src/mcp/server.ts:1-808`): MCP server completely rewritten to consume the SDK, with zero internal store access. Uses `QMDStore` interface exclusively. Multi-session HTTP transport with session map.
23
23
 
24
- 4. **Unified MCP `query` tool**: Removed separate `search`, `vector_search`, `deep_search` tools. Single `query` tool handles all modes via the query document format.
24
+ 4. **Self-Contained Database** (`src/store.ts:699-767`): New `store_collections` and `store_config` SQLite tables make the DB self-contained. No external YAML config needed: the SDK creates stores with inline config or in DB-only mode.
25
25
 
26
- 5. **Collection Management Enhancements**: `include`/`exclude` collections from default queries, `update-cmd` for pre-update shell commands, multiple `-c` flags.
26
+ 5. **Maintenance Class** (`src/maintenance.ts:1-54`): Dedicated maintenance wrapper for cleanup operations — vacuum, orphaned content/vectors cleanup, LLM cache clearing, inactive doc deletion, embedding reset.
27
+
28
+ 6. **Embedded Skills** (`src/embedded-skills.ts`): Skill definitions (SKILL.md + references) embedded as base64 in source. `qmd skill install` copies packaged skill to `~/.claude/commands/`.
29
+
30
+ 7. **REST API** (`src/mcp/server.ts:626-675`): POST `/query` (alias `/search`) endpoint alongside MCP — structured search without MCP protocol overhead.
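
Item 2 above describes a `search()` method that accepts either a plain `query` string or a pre-expanded `queries` array. A minimal sketch of how such an input could be normalized is shown here. The `SearchInput`, `SubQuery`, `Expander`, and `normalizeSearchInput` names are hypothetical and are not taken from the QMD source; the expander stands in for LLM query expansion.

```typescript
// Hypothetical types for illustration only; not QMD's actual definitions.
type SubQueryType = "lex" | "vec" | "hyde";

interface SubQuery {
  type: SubQueryType;
  text: string;
}

// Either a simple string (to be expanded) or pre-expanded sub-queries.
type SearchInput =
  | { query: string; queries?: undefined }
  | { queries: SubQuery[]; query?: undefined };

// Stand-in for LLM query expansion; assumed to return typed sub-queries.
type Expander = (query: string) => SubQuery[];

function normalizeSearchInput(input: SearchInput, expand: Expander): SubQuery[] {
  if (input.queries !== undefined) {
    // Pre-expanded input is passed through untouched.
    return input.queries;
  }
  // A plain string is expanded into typed sub-queries.
  return expand(input.query);
}

// A trivial expander used only for demonstration.
const demoExpand: Expander = (q) => [
  { type: "lex", text: q },
  { type: "vec", text: q },
];

const fromString = normalizeSearchInput({ query: "auth flow" }, demoExpand);
const fromArray = normalizeSearchInput(
  { queries: [{ type: "hyde", text: "A document describing login" }] },
  demoExpand,
);
```

Modeling the two shapes as a union keeps a single entry point while still letting callers skip expansion when they already have typed sub-queries.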
27
31
 
28
32
  ### Technology Stack
29
33
 
30
34
  - **Runtime**: Node.js >= 22 / Bun (dual runtime, `src/db.ts:9-24`)
31
- - **Database**: SQLite with better-sqlite3 + sqlite-vec extension v0.1.7-alpha.2
35
+ - **Database**: SQLite with better-sqlite3 ^12.4.5 + sqlite-vec v0.1.7-alpha.2
32
36
  - **Full-Text Search**: SQLite FTS5 with Porter tokenization
33
- - **Embeddings**: EmbeddingGemma (~300MB GGUF)
37
+ - **Embeddings**: Qwen3-Embedding (configurable via `QMD_EMBED_MODEL`)
34
38
  - **Reranking**: Qwen3-Reranker-0.6B (~640MB GGUF)
35
39
  - **Query Expansion**: Qwen3-1.7B (custom fine-tuned, ~1.1GB)
36
40
  - **MCP**: @modelcontextprotocol/sdk v1.25.1
37
- - **Validation**: Zod v4
38
- - **Plugin**: Claude Code marketplace format
41
+ - **Validation**: Zod v4.2.1
42
+ - **Plugin**: Claude Code marketplace format + embedded skills
39
43
 
40
44
  ### Production Readiness
41
45
 
42
- - **Maturity**: Stable (v1.1.0), 5,700+ GitHub stars
43
- - **Test Coverage**: vitest suite (store, mcp, collections, formatter, cli, eval)
44
- - **Plugin Distribution**: Claude Code marketplace
45
- - **Community**: Active (256 PRs merged, external contributors)
46
+ - **Maturity**: Stable (v2.0.1), 5,700+ GitHub stars
47
+ - **Test Coverage**: vitest suite with 1,286-line SDK test (store, mcp, collections, formatter, cli, sdk, eval)
48
+ - **Plugin Distribution**: Claude Code marketplace + `qmd skill install`
49
+ - **Community**: Active (362+ PRs, external contributors)
46
50
 
47
51
  ---
48
52
 
@@ -62,79 +66,111 @@ content_vectors (chunk metadata: hash, seq, pos, model)
62
66
  vectors_vec (sqlite-vec native KNN index, cosine distance)
63
67
 
64
68
  llm_cache (hash-keyed deterministic response cache)
69
+
70
+ store_collections (self-contained collection config in DB) [NEW in v2.0]
71
+
72
+ store_config (key-value metadata, e.g. config_hash) [NEW in v2.0]
65
73
  ```
66
74
 
67
- ### Key Design Patterns (New)
75
+ ### Key Design Patterns
76
+
77
+ 1. **SDK-First Architecture** (`src/index.ts`): Public `QMDStore` interface is the contract. CLI and MCP are consumers, not peers. Internal store exposed via `.internal` for advanced use only.
68
78
 
69
- 1. **Query Document Format** (`docs/SYNTAX.md:1-100`): EBNF grammar for structured queries. Lines typed as `lex:`, `vec:`, or `hyde:` route to different backends. Plain text defaults to `expand:` (LLM-generated variants).
79
+ 2. **Per-Store LLM Instance** (`src/index.ts:361-364`): Each SDK store creates its own `LlamaCpp` instance with lazy loading and 5-min inactivity timeout. No global singletons, which enables concurrent stores.
70
80
 
71
- 2. **Two-Step Vector Query** (`store.ts:1912-1915`): JOINs with sqlite-vec virtual tables hang indefinitely. QMD uses separate queries:
72
- ```typescript
73
- // Step 1: KNN from vec table
74
- const vecResults = db.prepare(
75
- `SELECT hash_seq, distance FROM vectors_vec WHERE embedding MATCH ? AND k = ?`
76
- ).all(embedding, limit * 3);
77
- // Step 2: Join with documents separately
78
- ```
81
+ 3. **Write-Through Config** (`src/index.ts:417-465`): Collection/context mutations write to both SQLite and YAML/inline config if configured. DB is source of truth; YAML is optional persistence layer.
79
82
 
80
- 3. **Smart Chunking** (`store.ts:53-219`): 900 tokens/chunk, 15% overlap, markdown-aware break points with scored pattern matching (h1=100, h2=90, paragraph=20). Distance decay prevents splitting inside code fences.
83
+ 4. **Two-Step Vector Query** (`store.ts`): JOINs with sqlite-vec virtual tables hang, so QMD runs a separate KNN query and then hydrates the matching rows in a batch.
81
84
 
82
- 4. **Dynamic MCP Instructions** (`mcp.ts:91-98`): `buildInstructions()` generates context-aware server instructions from actual index state, injected into LLM system prompt.
85
+ 5. **Smart Chunking** (`store.ts:53-219`): 900 tokens/chunk, 15% overlap, markdown-aware break points with scored pattern matching.
83
86
 
84
- 5. **Dual Runtime Compatibility** (`db.ts:9-24`): Cross-runtime SQLite layer that works under both Bun (bun:sqlite) and Node.js (better-sqlite3).
87
+ 6. **Dynamic MCP Instructions** (`src/mcp/server.ts:92-152`): `buildInstructions()` generates context-aware instructions from actual index state including collection names, document counts, capability gaps, search examples, and retrieval workflow.
88
+
89
+ 7. **MCP Resource Templates** (`src/mcp/server.ts:172-207`): Documents accessible via `qmd://{+path}` URI scheme. Resources return structured content with context annotations.
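
The two-step vector query in pattern 4 can be illustrated without a database. The sketch below uses in-memory stand-ins for the vector index and the documents table; `VecRow`, `DocRow`, `knn`, and `hydrate` are hypothetical names, and the real implementation presumably issues SQL against sqlite-vec rather than scanning arrays.

```typescript
// In-memory stand-ins for the vector index and the documents table (assumed shapes).
interface VecRow { hashSeq: string; embedding: number[] }
interface DocRow { hashSeq: string; title: string }
interface Hit { hashSeq: string; distance: number }
interface HydratedHit extends DocRow { distance: number }

function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Step 1: the nearest-neighbour lookup touches only the vector index.
function knn(index: VecRow[], query: number[], k: number): Hit[] {
  return index
    .map((row) => ({ hashSeq: row.hashSeq, distance: cosineDistance(row.embedding, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Step 2: hydrate the matched keys in one batch, with no JOIN against the index.
function hydrate(hits: Hit[], docs: Map<string, DocRow>): HydratedHit[] {
  const out: HydratedHit[] = [];
  for (const hit of hits) {
    const doc = docs.get(hit.hashSeq);
    if (doc) out.push({ ...doc, distance: hit.distance });
  }
  return out;
}

const index: VecRow[] = [
  { hashSeq: "a_0", embedding: [1, 0] },
  { hashSeq: "b_0", embedding: [0, 1] },
  { hashSeq: "c_0", embedding: [0.9, 0.1] },
];
const docs = new Map<string, DocRow>([
  ["a_0", { hashSeq: "a_0", title: "Alpha" }],
  ["b_0", { hashSeq: "b_0", title: "Beta" }],
  ["c_0", { hashSeq: "c_0", title: "Gamma" }],
]);

const results = hydrate(knn(index, [1, 0], 2), docs);
```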
85
90
 
86
91
  ### Comparison with ClaudeMemory
87
92
 
88
- | Aspect | QMD (1.1.0) | ClaudeMemory | Notes |
93
+ | Aspect | QMD (2.0.1) | ClaudeMemory | Notes |
89
94
  |--------|-------------|--------------|-------|
95
+ | **API** | SDK-first (`QMDStore` interface) | CLI-first + MCP tools | QMD more composable |
90
96
  | **Data Model** | Content-addressable chunks | Subject-predicate-object facts | QMD stores documents; we store knowledge |
91
- | **Storage** | SQLite + sqlite-vec | SQLite + Sequel + fastembed-rb | Both use FTS5 |
92
- | **Vector Search** | sqlite-vec (native C) | JSON embeddings (Ruby) | QMD 10-100x faster |
93
- | **Query Language** | Typed sub-queries (lex/vec/hyde) | Free-text search | QMD more expressive |
94
- | **Chunking** | Smart (900 tok, markdown-aware) | None (fact-level) | Different granularity |
95
- | **Plugin Format** | marketplace.json | Ruby gem + MCP + hooks | QMD easier to install |
96
- | **MCP Transport** | stdio + HTTP | stdio only | HTTP enables shared server |
97
+ | **Storage** | SQLite + sqlite-vec | SQLite + Sequel + sqlite-vec | Both use FTS5 + vec0 |
98
+ | **Config** | Self-contained DB + optional YAML | Dual-database (global + project) | Different scoping models |
99
+ | **Query** | Typed sub-queries (lex/vec/hyde) | Free-text + hybrid search | QMD more expressive |
100
+ | **Maintenance** | Dedicated `Maintenance` class | `Sweep` module + compact command | Similar approach |
101
+ | **Plugin Format** | marketplace.json + embedded skills | Ruby gem + MCP + hooks | QMD adds skill install |
102
+ | **MCP Transport** | stdio + HTTP + REST | stdio only | QMD more flexible |
103
+ | **Tests** | 1,286-line SDK test suite | RSpec suite + evals + benchmarks | Both comprehensive |
97
104
 
98
105
  ---
99
106
 
100
107
  ## Key Components Deep-Dive
101
108
 
102
- ### Component 1: Query Document Parser
103
-
104
- **Purpose**: Parse structured multi-line queries into typed sub-queries for routing to appropriate search backends.
105
-
106
- **Location**: `docs/SYNTAX.md`, `src/store.ts`
107
-
108
- **Design Decisions**:
109
- - Typed lines (`lex:`, `vec:`, `hyde:`) enable precise control over search routing
110
- - First sub-query gets 2x weight in RRF fusion
111
- - Plain text auto-expands via LLM to generate all three types
112
- - Lex supports phrase matching and negation for disambiguation
113
-
114
- ### Component 2: HTTP MCP Transport
109
+ ### Component 1: SDK API (`src/index.ts`)
110
+
111
+ **Purpose**: Stable programmatic interface for all QMD operations.
112
+
113
+ **Key Design Decisions**:
114
+ - `QMDStore` interface with 20+ methods across 6 categories (Search, Retrieval, Collections, Context, Indexing, Lifecycle)
115
+ - `createStore()` factory with three modes: YAML config, inline config, DB-only
116
+ - Per-store `LlamaCpp` with lazy loading and auto-disposal
117
+ - Write-through config pattern for collection mutations
118
+ - Re-exports types for SDK consumers
119
+
120
+ **Code Example** (`src/index.ts:331-524`):
121
+ ```typescript
122
+ export async function createStore(options: StoreOptions): Promise<QMDStore> {
123
+   const internal = createStoreInternal(options.dbPath);
124
+   const llm = new LlamaCpp({
125
+     inactivityTimeoutMs: 5 * 60 * 1000,
126
+     disposeModelsOnInactivity: true,
127
+   });
128
+   internal.llm = llm;
129
+   // ... builds QMDStore with all methods delegating to internal
130
+ }
131
+ ```
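
The lazy loading and auto-disposal behaviour described for the per-store `LlamaCpp` instance can be sketched generically. The `IdleResource` class below is a hypothetical helper, not part of QMD or node-llama-cpp, and it replaces a real timer with an explicit `sweep()` call and an injectable clock so the behaviour stays deterministic.

```typescript
// Hypothetical helper: not part of QMD or node-llama-cpp.
class IdleResource<T> {
  private value: T | undefined = undefined;
  private lastUsed = 0;
  private readonly load: () => T;
  private readonly dispose: (value: T) => void;
  private readonly timeoutMs: number;
  private readonly now: () => number;

  constructor(
    load: () => T,
    dispose: (value: T) => void,
    timeoutMs: number,
    now: () => number = Date.now,
  ) {
    this.load = load;
    this.dispose = dispose;
    this.timeoutMs = timeoutMs;
    this.now = now;
  }

  // Lazily create the resource on first use and record the access time.
  get(): T {
    if (this.value === undefined) {
      this.value = this.load();
    }
    this.lastUsed = this.now();
    return this.value;
  }

  // Dispose the resource if it has been idle for at least timeoutMs.
  // Returns true when something was actually disposed.
  sweep(): boolean {
    if (this.value === undefined) return false;
    if (this.now() - this.lastUsed < this.timeoutMs) return false;
    this.dispose(this.value);
    this.value = undefined;
    return true;
  }
}

// Demonstration with a fake clock instead of real time.
let clock = 0;
let loads = 0;
let disposals = 0;
const model = new IdleResource(
  () => { loads += 1; return { name: "demo-model" }; },
  () => { disposals += 1; },
  5 * 60 * 1000,
  () => clock,
);

model.get();
model.get(); // second access reuses the loaded value
clock = 4 * 60 * 1000;
const sweptEarly = model.sweep(); // still within the 5-minute window
clock = 10 * 60 * 1000;
const sweptLate = model.sweep(); // idle long enough to dispose
model.get(); // transparently reloads
```

Because each store would own its own instance of such a wrapper, two stores never share or dispose each other's models.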
115
132
 
116
- **Purpose**: Long-lived MCP server that avoids repeated model loading.
133
+ ### Component 2: MCP Server as SDK Consumer (`src/mcp/server.ts`)
134
+
135
+ **Purpose**: Expose QMD search via MCP protocol, consuming only the public SDK.
136
+
137
+ **Key Design Decisions**:
138
+ - Zero internal store access — uses `QMDStore` interface exclusively
139
+ - `createMcpServer()` shared by both stdio and HTTP transports
140
+ - Multi-session HTTP with session map (`Map<string, Transport>`)
141
+ - REST `/query` endpoint alongside MCP for non-MCP clients
142
+ - Rich `buildInstructions()` with collection stats, context, capability gaps, search examples
143
+
144
+ **Code Example** (`src/mcp/server.ts:158-166`):
145
+ ```typescript
146
+ async function createMcpServer(store: QMDStore): Promise<McpServer> {
147
+   const server = new McpServer(
148
+     { name: "qmd", version: "0.9.9" },
149
+     { instructions: await buildInstructions(store) },
150
+   );
151
+   // ... registers tools using only store.search(), store.get(), etc.
152
+ }
153
+ ```
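
The REST `/query` endpoint (with `/search` as an alias) listed among the design decisions could validate requests roughly as sketched below. This covers routing and input validation only; the `parseQueryRequest` name, the `{ query, limit }` body shape, and the status codes are assumptions, since the source does not show the actual request schema.

```typescript
// Hypothetical result type: either a validated request or an HTTP-style error.
type ParsedRequest =
  | { ok: true; query: string; limit: number | undefined }
  | { ok: false; status: number; error: string };

function parseQueryRequest(method: string, path: string, rawBody: string): ParsedRequest {
  // `/search` is treated as an alias of `/query`.
  if (path !== "/query" && path !== "/search") {
    return { ok: false, status: 404, error: "not found" };
  }
  if (method !== "POST") {
    return { ok: false, status: 405, error: "method not allowed" };
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, status: 400, error: "invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, status: 400, error: "body must be a JSON object" };
  }

  const body = parsed as { query?: unknown; limit?: unknown };
  if (typeof body.query !== "string" || body.query.length === 0) {
    return { ok: false, status: 400, error: "query must be a non-empty string" };
  }
  // An absent or non-numeric limit falls back to the store's default.
  const limit = typeof body.limit === "number" ? body.limit : undefined;
  return { ok: true, query: body.query, limit };
}

const good = parseQueryRequest("POST", "/search", '{"query":"auth","limit":5}');
const wrongMethod = parseQueryRequest("GET", "/query", "");
const badJson = parseQueryRequest("POST", "/query", "{not json");
const wrongPath = parseQueryRequest("POST", "/other", "{}");
```

A validated request would then be handed to the same `store.search()` call the MCP tools use, which is what keeps the REST path a thin layer over the SDK.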
117
154
 
118
- **Location**: `src/mcp.ts:119-137`
155
+ ### Component 3: Self-Contained Database (`src/store.ts:699-767`)
119
156
 
120
- **Design Decisions**:
121
- - WebStandardStreamableHTTPServerTransport for stateless HTTP
122
- - Models stay loaded in VRAM across requests
123
- - Idle disposal after 5 min (transparent recreation ~1s)
124
- - Health endpoint for liveness checks
125
- - Daemon mode with PID file management
157
+ **Purpose**: Make the database self-contained so SDK consumers don't need external config files.
126
158
 
127
- ### Component 3: Smart Chunking
159
+ **Key Design Decisions**:
160
+ - `store_collections` table replaces YAML as the primary config store
161
+ - `store_config` key-value table for metadata (e.g., `config_hash` for sync optimization)
162
+ - `syncConfigToDb()` function syncs external config into SQLite on store creation
163
+ - Collection accessor functions (`getStoreCollections`, `upsertStoreCollection`, etc.) replace direct YAML reads
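
The `config_hash` sync optimization mentioned above can be sketched as "hash the external config, skip the write when the stored hash matches". The code below is an assumption-laden illustration: `FakeDb` stands in for the `store_collections` and `store_config` tables, the `CollectionConfig` fields are invented, `syncConfigSketch` is not the real `syncConfigToDb()`, and FNV-1a is used only to avoid dependencies because the source does not say which hash QMD uses.

```typescript
// Assumed collection shape; the real columns are not shown in the source.
interface CollectionConfig {
  name: string;
  path: string;
  pattern: string;
}

// In-memory stand-in for the store_collections and store_config tables.
class FakeDb {
  collections = new Map<string, CollectionConfig>();
  config = new Map<string, string>();
}

// FNV-1a 32-bit hash, chosen only to keep the sketch dependency-free.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Returns true when the DB was rewritten, false when the config was unchanged.
function syncConfigSketch(db: FakeDb, collections: CollectionConfig[]): boolean {
  const hash = fnv1a(JSON.stringify(collections));
  if (db.config.get("config_hash") === hash) {
    return false; // external config matches what is already stored
  }
  db.collections.clear();
  for (const c of collections) {
    db.collections.set(c.name, c);
  }
  db.config.set("config_hash", hash);
  return true;
}

const db = new FakeDb();
const notes: CollectionConfig = { name: "notes", path: "/docs/notes", pattern: "**/*.md" };
const wiki: CollectionConfig = { name: "wiki", path: "/docs/wiki", pattern: "**/*.md" };

const firstSync = syncConfigSketch(db, [notes]);
const repeatSync = syncConfigSketch(db, [notes]);
const changedSync = syncConfigSketch(db, [notes, wiki]);
```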
128
164
 
129
- **Purpose**: Split documents at natural boundaries for better embeddings.
165
+ ### Component 4: Maintenance Class (`src/maintenance.ts`)
130
166
 
131
- **Location**: `src/store.ts:68-219`
167
+ **Purpose**: Wrap low-level store cleanup operations for CLI housekeeping.
132
168
 
133
- **Design Decisions**:
134
- - Scored break points (from h1=100 down to newline=1) with distance decay
135
- - Code fence detection prevents splitting mid-block
136
- - 200-token search window for finding optimal cut points
137
- - Squared distance decay for gentle early, steep late penalties
169
+ **Key Design Decisions**:
170
+ - Takes internal `Store` in constructor (direct DB access is allowed here)
171
+ - 6 operations: vacuum, orphaned content, orphaned vectors, LLM cache, inactive docs, clear embeddings
172
+ - Each method returns count of affected rows for reporting
173
+ - Used by CLI's `clean` subcommands
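
The "each method returns a count" convention can be sketched against an in-memory store. `MemoryStore` and `MaintenanceSketch` below are hypothetical and cover only three of the six operations; the real `Maintenance` class works on the internal SQLite-backed `Store`, whose API is not shown in the source.

```typescript
// Hypothetical in-memory store; the real Store is SQLite-backed.
interface MemoryStore {
  documents: { hash: string; active: boolean }[];
  content: Map<string, string>; // content hash -> body
  llmCache: Map<string, string>;
}

class MaintenanceSketch {
  private readonly store: MemoryStore;

  constructor(store: MemoryStore) {
    this.store = store;
  }

  // Remove inactive documents and report how many were deleted.
  deleteInactiveDocuments(): number {
    const before = this.store.documents.length;
    this.store.documents = this.store.documents.filter((d) => d.active);
    return before - this.store.documents.length;
  }

  // Remove content rows that no remaining document references.
  cleanupOrphanedContent(): number {
    const referenced = new Set(this.store.documents.map((d) => d.hash));
    let removed = 0;
    for (const hash of Array.from(this.store.content.keys())) {
      if (!referenced.has(hash)) {
        this.store.content.delete(hash);
        removed += 1;
      }
    }
    return removed;
  }

  // Clear the LLM response cache and report how many entries were dropped.
  clearLlmCache(): number {
    const count = this.store.llmCache.size;
    this.store.llmCache.clear();
    return count;
  }
}

const store: MemoryStore = {
  documents: [
    { hash: "a", active: true },
    { hash: "b", active: false },
  ],
  content: new Map([["a", "alpha"], ["b", "beta"], ["c", "gamma"]]),
  llmCache: new Map([["k1", "v1"], ["k2", "v2"]]),
};

const maintenance = new MaintenanceSketch(store);
const deletedDocs = maintenance.deleteInactiveDocuments();
const orphanedContent = maintenance.cleanupOrphanedContent();
const clearedCache = maintenance.clearLlmCache();
```

Returning counts lets a CLI command print a one-line summary per operation without re-querying the store.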
138
174
 
139
175
  ---
140
176
 
@@ -142,19 +178,23 @@ llm_cache (hash-keyed deterministic response cache)
142
178
 
143
179
  ### What They Do Well
144
180
 
145
- 1. **Native Vector Queries**: sqlite-vec provides sub-millisecond KNN with C-level performance
146
- 2. **Typed Query Language**: Explicit control over search routing reduces ambiguity
147
- 3. **Smart Chunking**: Markdown-aware splitting produces better embeddings
148
- 4. **HTTP MCP Transport**: Shared server avoids repeated model loading
149
- 5. **Dynamic Instructions**: Index-aware MCP instructions give LLM immediate context
181
+ 1. **SDK-First Architecture**: Clean separation between API contract and consumers (CLI, MCP). Makes QMD embeddable in other tools.
182
+ 2. **Self-Contained Database**: DB stores its own config, so no external files are needed to reopen a store.
183
+ 3. **MCP as Pure Consumer**: MCP server has zero internal knowledge, proving the SDK is complete.
184
+ 4. **REST API Alongside MCP**: POST `/query` for non-MCP clients (curl, scripts, other tools).
185
+ 5. **Dynamic MCP Instructions**: Rich context about collections, doc counts, capability gaps — eliminates discovery calls.
186
+ 6. **Embedded Skill Distribution**: `qmd skill install` copies skill files — no manual setup.
187
+ 7. **Comprehensive SDK Tests**: 1,286-line test file covers constructor, search, collections, contexts, indexing, health.
150
188
 
151
189
  ### What We Do Well
152
190
 
153
191
  1. **Knowledge Representation**: Facts with provenance > raw document chunks
154
192
  2. **Truth Maintenance**: Supersession and conflict resolution
155
- 3. **Dual-Database System**: Project/global scope separation
193
+ 3. **Dual-Database System**: Project/global scope separation is cleaner than single-DB
156
194
  4. **Distillation Pipeline**: Extract structured knowledge from transcripts
157
195
  5. **Temporal Validity**: Facts have valid_from/valid_to windows
196
+ 6. **Hook Integration**: Deep integration with Claude Code lifecycle events
197
+ 7. **21 MCP Tools**: More granular tool surface than QMD's 4 tools
158
198
 
159
199
  ---
160
200
 
@@ -162,54 +202,73 @@ llm_cache (hash-keyed deterministic response cache)
162
202
 
163
203
  ### High Priority ⭐
164
204
 
165
- #### 1. Native Vector Storage (sqlite-vec)
166
- - **Value**: 10-100x faster KNN queries, eliminates O(n) Ruby similarity
167
- - **Evidence**: `db.ts:52-54` — single function call to load extension
168
- - **Implementation**: Add sqlite-vec gem, create `facts_vec` virtual table, two-step query pattern
169
- - **Effort**: 3-5 days
170
- - **Trade-off**: Native dependency (but well-maintained, cross-platform)
171
- - **Recommendation**: **ADOPT** — Critical for scaling beyond 1000 facts
172
-
173
- #### 2. Smart Chunking for Long Content
174
- - **Value**: Better embeddings for transcripts > 3000 chars
175
- - **Evidence**: `store.ts:53-219` — scored break points, code fence awareness
176
- - **Implementation**: Port chunking algorithm to Ruby for transcript ingestion
177
- - **Effort**: 2-3 days
178
- - **Trade-off**: Complexity; only needed for long content
179
- - **Recommendation**: **CONSIDER** — Adopt if users report long transcript issues
180
-
181
- #### 3. HTTP MCP Transport
182
- - **Value**: Shared server, models stay loaded, faster subsequent queries
183
- - **Evidence**: `mcp.ts:119-137` — WebStandardStreamableHTTPServerTransport
184
- - **Implementation**: Add HTTP transport option alongside stdio
185
- - **Effort**: 2-3 days
186
- - **Trade-off**: Process management complexity
187
- - **Recommendation**: **CONSIDER** — Useful if MCP startup latency becomes an issue
188
-
189
- ### Medium Priority
205
+ #### 1. Dedicated Maintenance Class
206
+ - **Value**: Clean separation of maintenance operations from main store. QMD's `Maintenance` class wraps 6 cleanup operations with return counts.
207
+ - **Evidence**: `src/maintenance.ts:1-54` — constructor takes internal store, each method returns affected count
208
+ - **Implementation**: Extract our `Sweep` module operations into a `Maintenance` class. Methods: `vacuum`, `cleanup_orphaned_content`, `cleanup_orphaned_vectors`, `cleanup_expired_facts`, `cleanup_superseded_facts`, `compact`. Return affected counts for reporting.
209
+ - **Effort**: 1 day
210
+ - **Trade-off**: Minor refactoring
211
+ - **Recommendation**: **ADOPT** — Our sweep is already similar, just needs cleaner wrapping
190
212
 
191
- #### 4. Dynamic MCP Server Instructions
192
- - **Value**: Give LLM immediate context about database state without extra tool call
193
- - **Evidence**: `mcp.ts:91-98` — builds instructions from actual index state
194
- - **Implementation**: Generate instructions showing fact counts, recent decisions, active conflicts
213
+ #### 2. Dynamic MCP Instructions Enhancement ⭐
214
+ - **Value**: QMD v2.0 builds rich instructions including collection stats, document counts, capability gaps, search examples, and retrieval workflow tips. Our MCP server has a static query guide prompt but no dynamic instructions.
215
+ - **Evidence**: `src/mcp/server.ts:92-152` — `buildInstructions()` with collections, counts, gaps, examples, tips
216
+ - **Implementation**: Add `buildInstructions()` to our MCP server that generates dynamic instructions with: fact counts (global/project), active conflict count, recent decision count, convention count, database health, and usage tips.
195
217
  - **Effort**: 1 day
196
- - **Trade-off**: Minimal
197
- - **Recommendation**: **ADOPT**
218
+ - **Trade-off**: Minimal — enhances existing MCP server instructions
219
+ - **Recommendation**: **ADOPT** — Free context for LLMs, eliminates need for `memory.status` call
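
The shape of the proposed dynamic instructions can be sketched as a pure function from database stats to text. ClaudeMemory is a Ruby gem, so this TypeScript version only illustrates the idea in the same language as the QMD excerpts; the `MemoryStats` fields, the `buildInstructionsSketch` name, and the wording of each line are assumptions.

```typescript
// Assumed stats shape; field names are illustrative only.
interface MemoryStats {
  globalFacts: number;
  projectFacts: number;
  openConflicts: number;
  recentDecisions: number;
}

function buildInstructionsSketch(stats: MemoryStats): string {
  const lines: string[] = [
    `Memory holds ${stats.globalFacts} global and ${stats.projectFacts} project facts.`,
  ];
  // Only mention sections that have something to report.
  if (stats.openConflicts > 0) {
    lines.push(`${stats.openConflicts} open conflict(s) await resolution.`);
  }
  if (stats.recentDecisions > 0) {
    lines.push(`${stats.recentDecisions} recent decision(s) are available for recall.`);
  }
  if (stats.globalFacts + stats.projectFacts === 0) {
    lines.push("No facts are stored yet; recall will return nothing until ingestion runs.");
  }
  return lines.join("\n");
}

const busy = buildInstructionsSketch({
  globalFacts: 3,
  projectFacts: 12,
  openConflicts: 1,
  recentDecisions: 0,
});
const empty = buildInstructionsSketch({
  globalFacts: 0,
  projectFacts: 0,
  openConflicts: 0,
  recentDecisions: 0,
});
```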
220
+
221
+ #### 3. Embedded Skill Distribution ⭐
222
+ - **Value**: `qmd skill install` copies packaged skill files to `~/.claude/commands/` — zero-config setup. Skills are embedded as base64 in source code and extracted at install time.
223
+ - **Evidence**: `src/embedded-skills.ts:1-22` — base64-encoded SKILL.md + references; `src/cli/qmd.ts` — `skill install` command
224
+ - **Implementation**: Add `claude-memory install-skill` command that writes our memory recall agent (`agents/memory-recall.md`) to `~/.claude/commands/memory-recall.md`. Embed skill content in a Ruby constant.
225
+ - **Effort**: 1-2 days
226
+ - **Trade-off**: Adds a constant with skill content to codebase
227
+ - **Recommendation**: **ADOPT** — Pairs with Search Agent Delegation Pattern (#8 in improvements.md)
228
+
229
+ #### 4. REST API Endpoint ⭐
230
+ - **Value**: QMD v2.0 adds POST `/query` alongside MCP — enables search from curl, scripts, CI, and non-MCP clients without the full MCP protocol handshake.
231
+ - **Evidence**: `src/mcp/server.ts:626-675` — `/query` and `/search` endpoints with structured JSON request/response
232
+ - **Implementation**: Add optional HTTP server mode to `claude-memory serve-mcp --http` with POST `/recall` endpoint. Accept `{ query, scope, limit }`, return JSON facts.
233
+ - **Effort**: 2 days
234
+ - **Trade-off**: Requires WEBrick or similar Ruby HTTP server dependency
235
+ - **Recommendation**: **CONSIDER** — Useful for CI/scripting, but MCP covers primary use case
236
+
237
+ ### Medium Priority
198
238
 
199
- #### 5. Query Document Format
200
- - **Value**: More expressive queries with explicit search routing
201
- - **Evidence**: `docs/SYNTAX.md:1-100` — formal EBNF grammar
202
- - **Implementation**: Support typed queries in recall (e.g., `lex: exact term` vs `vec: semantic query`)
239
+ #### 5. SDK-First Architecture Pattern
240
+ - **Value**: QMD's `QMDStore` interface proves the core API is complete by having MCP consume only the public surface. Our MCP server directly accesses `Store`, `Recall`, `Sweep` internals.
241
+ - **Evidence**: `src/index.ts:212-304` — full `QMDStore` interface; `src/mcp/server.ts` — zero internal imports
242
+ - **Implementation**: Define a `ClaudeMemory::API` module that wraps all public operations. Refactor MCP server to consume only this API.
203
243
  - **Effort**: 3-5 days
204
- - **Trade-off**: Complexity; current free-text may be sufficient
205
- - **Recommendation**: **DEFER** — Over-engineering for fact retrieval
244
+ - **Trade-off**: Significant refactoring; current approach works fine
245
+ - **Recommendation**: **CONSIDER** — Good engineering but not urgent
246
+
247
+ #### 6. Self-Contained Database Config
248
+ - **Value**: QMD v2.0 stores collection config in SQLite tables (`store_collections`, `store_config`). Database is reopenable without external files.
249
+ - **Evidence**: `src/store.ts:699-727` — `store_collections` and `store_config` tables
250
+ - **Implementation**: Store configuration metadata (last ingest, publish mode, active scope) in a `config` table in our SQLite databases.
251
+ - **Effort**: 1-2 days
252
+ - **Trade-off**: Some config is better in ENV (paths, flags)
253
+ - **Recommendation**: **CONSIDER** — Useful for metadata like last_ingest_at, schema_version already tracked
254
+
255
+ #### 7. Per-Instance Resource Management
256
+ - **Value**: QMD creates per-store `LlamaCpp` instances with lazy loading and 5-min inactivity timeout. No global singletons. Enables concurrent stores safely.
257
+ - **Evidence**: `src/index.ts:361-364` — per-store LLM with `disposeModelsOnInactivity: true`
258
+ - **Implementation**: Ensure our `StoreManager` properly manages per-instance resources. Currently it's a singleton — consider making it closeable.
259
+ - **Effort**: 1 day
260
+ - **Trade-off**: Minimal for our use case (single process)
261
+ - **Recommendation**: **CONSIDER** — Good hygiene, low effort
206
262
 
207
263
  ### Features to Avoid
208
264
 
209
265
  - **Custom Fine-Tuned Query Expansion (Qwen3-1.7B)**: Too heavy for fact retrieval
210
- - **EmbeddingGemma**: We use fastembed-rb (BAAI/bge-small-en-v1.5) which is lighter
266
+ - **EmbeddingGemma / Qwen3-Embedding**: We use fastembed-rb (BAAI/bge-small-en-v1.5) which is lighter and requires no GPU
211
267
  - **Content-Addressable Storage**: Our facts are deduplicated by signature, not content hash
212
- - **LLM Reranking**: Cross-encoder reranking is over-engineering for our use case
268
+ - **LLM Reranking (Qwen3-Reranker-0.6B)**: Cross-encoder reranking is over-engineering for our use case
269
+ - **Query Document Format**: Over-engineering for fact retrieval (lex/vec/hyde routing unnecessary for SPO facts)
270
+ - **Write-Through YAML Config**: We don't use YAML config files; dual-database is our config model
271
+ - **Multi-Session HTTP Transport**: Our MCP server is lightweight enough for stdio; no model loading latency
213
272
 
214
273
  ---
215
274
 
@@ -219,37 +278,49 @@ llm_cache (hash-keyed deterministic response cache)
219
278
  - **Fact-based knowledge model**: More valuable than raw document chunks
220
279
  - **Dual-database system**: Clean project/global separation
221
280
  - **Ruby + Sequel**: Mature, stable, well-tested
281
+ - **21 MCP tools**: More granular than QMD's 4 tools
222
282
 
223
283
  ### What to Adopt
224
- - **sqlite-vec**: Critical for vector query performance
225
- - **Two-step vector query pattern**: Avoid JOIN hangs
226
- - **Dynamic MCP instructions**: Free context for LLMs
284
+ - **Dedicated maintenance class**: Clean operation wrapping with counts
285
+ - **Dynamic MCP instructions**: Rich context about database state
286
+ - **Embedded skill distribution**: `install-skill` command for zero-config setup
227
287
 
228
288
  ### What to Reject
229
- - **YAML collection system**: Our dual-database is cleaner
230
- - **Custom fine-tuned models**: Too heavy for our use case
231
- - **Query document format**: Over-engineering for fact retrieval
289
+ - **SDK-first refactor**: Over-engineering for a gem that's already well-structured
290
+ - **Self-contained DB config**: Our dual-database + ENV is already clean
291
+ - **REST API**: MCP covers our use case; REST adds complexity
232
292
 
233
293
  ---
234
294
 
235
295
  ## Key Takeaways
236
296
 
237
297
  ### Main Learnings
238
- 1. sqlite-vec is production-ready (v0.1.7-alpha.2) and used by multiple projects
239
- 2. Two-step query pattern is mandatory (JOINs hang with vec tables)
240
- 3. Query document format is elegant but over-engineering for fact retrieval
241
- 4. HTTP MCP transport enables shared server mode
242
-
243
- ### Changes Since Last Analysis (2026-02-02)
244
- - v1.1.0 released with query document format
245
- - Lex syntax with phrase matching and negation
246
- - Unified `query` MCP tool replacing 3 separate tools
247
- - HTTP MCP transport with daemon mode
248
- - Dual Node.js/Bun runtime support
249
- - Collection include/exclude management
298
+ 1. SDK-first architecture proves API completeness by having consumers use only the public surface
299
+ 2. Dynamic MCP instructions with database stats eliminate discovery tool calls
300
+ 3. Embedded skill distribution is a clean pattern for zero-config plugin setup
301
+ 4. Dedicated maintenance class improves separation of concerns
302
+ 5. REST endpoint alongside MCP is pragmatic for non-MCP consumers
303
+
304
+ ### Changes Since Last Analysis (2026-03-02)
305
+ - v2.0.0 and v2.0.1 released
306
+ - Stable SDK API with `QMDStore` interface and `createStore()` factory
307
+ - MCP server rewritten as pure SDK consumer (zero internal access)
308
+ - CLI and MCP organized into `src/cli/` and `src/mcp/` subdirectories
309
+ - Self-contained database with `store_collections` and `store_config` tables
310
+ - `Maintenance` class wrapping cleanup operations
311
+ - `embedded-skills.ts` with base64-encoded skill files
312
+ - `qmd skill install` command
313
+ - REST `/query` endpoint alongside MCP
314
+ - Runtime-aware `bin/qmd` wrapper for Bun/Node compatibility
315
+ - `better-sqlite3` bumped to ^12.4.5 for Node 25
316
+ - Comprehensive 1,286-line SDK test suite
317
+ - GPU init replaced with node-llama-cpp `autoAttempt`
318
+ - Collection ignore patterns
319
+ - Configurable `candidateLimit` for reranker
320
+ - Multi-session HTTP transport
250
321
 
251
322
  ---
252
323
 
253
- *Analysis completed: 2026-03-02*
324
+ *Analysis completed: 2026-03-10*
254
325
  *Analyst: Claude Code*
255
326
  *Review Status: Draft*