@joycodetech/qmd-ja 2.5.3-ja.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (48)
  1. package/CHANGELOG.md +821 -0
  2. package/LICENSE +21 -0
  3. package/README.md +1143 -0
  4. package/bin/qmd-ja +162 -0
  5. package/dist/ast.d.ts +65 -0
  6. package/dist/ast.js +334 -0
  7. package/dist/bench/bench.d.ts +23 -0
  8. package/dist/bench/bench.js +280 -0
  9. package/dist/bench/score.d.ts +33 -0
  10. package/dist/bench/score.js +88 -0
  11. package/dist/bench/types.d.ts +80 -0
  12. package/dist/bench/types.js +8 -0
  13. package/dist/cli/formatter.d.ts +120 -0
  14. package/dist/cli/formatter.js +355 -0
  15. package/dist/cli/qmd.d.ts +43 -0
  16. package/dist/cli/qmd.js +4179 -0
  17. package/dist/collections.d.ts +166 -0
  18. package/dist/collections.js +410 -0
  19. package/dist/db.d.ts +44 -0
  20. package/dist/db.js +75 -0
  21. package/dist/index.d.ts +230 -0
  22. package/dist/index.js +242 -0
  23. package/dist/llm.d.ts +500 -0
  24. package/dist/llm.js +1615 -0
  25. package/dist/maintenance.d.ts +23 -0
  26. package/dist/maintenance.js +37 -0
  27. package/dist/mcp/server.d.ts +24 -0
  28. package/dist/mcp/server.js +702 -0
  29. package/dist/paths.d.ts +1 -0
  30. package/dist/paths.js +4 -0
  31. package/dist/store.d.ts +1002 -0
  32. package/dist/store.js +4208 -0
  33. package/models/vaporetto-bccwj.model +0 -0
  34. package/package.json +130 -0
  35. package/scripts/build.mjs +30 -0
  36. package/scripts/check-package-grammars.mjs +29 -0
  37. package/scripts/package-smoke.mjs +65 -0
  38. package/scripts/test-all.mjs +38 -0
  39. package/skills/qmd/SKILL.md +295 -0
  40. package/skills/qmd/references/mcp-setup.md +102 -0
  41. package/skills/release/SKILL.md +139 -0
  42. package/skills/release/scripts/install-hooks.sh +38 -0
  43. package/vendor/vaporetto-node-wasm/LICENSE +22 -0
  44. package/vendor/vaporetto-node-wasm/package.json +11 -0
  45. package/vendor/vaporetto-node-wasm/vaporetto_node_wasm.d.ts +19 -0
  46. package/vendor/vaporetto-node-wasm/vaporetto_node_wasm.js +202 -0
  47. package/vendor/vaporetto-node-wasm/vaporetto_node_wasm_bg.wasm +0 -0
  48. package/vendor/vaporetto-node-wasm/vaporetto_node_wasm_bg.wasm.d.ts +13 -0
package/CHANGELOG.md ADDED
@@ -0,0 +1,821 @@
1
+ # Changelog
2
+
3
+ ## [Unreleased]
4
+
5
+ ## [2.5.4] - 2026-06-22
6
+
7
+ ### Documentation
8
+
9
+ - README: documented collection filtering (`-c` semantics), the `collection
10
+ show`/`include`/`exclude`/`update-cmd` subcommands, the `--intent`/`--no-rerank`/
11
+ `-C`/`--full-path` search flags, the `--format <kind>` output selector (with the
12
+ legacy `--json`/`--csv`/`--md`/`--xml`/`--files` booleans noted as aliases),
13
+ `vector-search`/`deep-search` aliases, embed
14
+ memory flags (`--max-docs-per-batch`/`--max-batch-mb`), a sample `--explain`
15
+ score trace, the `qmd doctor`/`qmd init` commands, the `get` `:from:count`
16
+ suffix and `--no-line-numbers`, an MCP tool parameter reference, and a
17
+ Benchmarking section for `qmd bench`.
18
+ - docs/SYNTAX.md: removed the non-existent `q` MCP parameter example (the `query`
19
+ tool and REST endpoint accept only the `searches` array) and added a Scoping
20
+ section.
21
+ - README: removed the misleading `qmd update --pull` example. The `--pull` flag is
22
+ parsed but never consumed (`updateCollections()` ignores it); the real mechanism
23
+ for running `git pull` before re-indexing is a per-collection `update` command,
24
+ set via `qmd collection update-cmd`.
25
+
26
+ ### Fixed
27
+
28
+ - MCP server instructions now tell agents to scope with the plural `collections`
29
+ parameter (matching the schema). The previous singular `collection` hint led
30
+ agents to pass a parameter that Zod silently strips, producing unscoped results.
31
+ The `get` instruction line also now documents the full `file.md:from:count`
32
+ range suffix instead of only the single-line `file.md:100` offset.
33
+
34
+ - Filesystem paths with special characters (`#`, `&`, spaces, `[]`, `()`, etc.)
35
+ now round-trip correctly through index → search → get. Previously
36
+ `reindexCollection` called `handelize()` on relative paths before storing
37
+ them, turning `# Meeting - 234232 3432 __ 5.md` into
38
+ `Meeting-234232-3432-5.md` and making `qmd get <actual-path>`,
39
+ `qmd get --full-path`, and `qmd ls` return dead or garbled paths. Paths are
40
+ now stored verbatim. Existing indexes auto-migrate on the next `qmd update`.
41
+
42
+ - FTS5 search now correctly matches dotted version strings like `2026.4.10`. The
43
+ `porter unicode61` tokenizer splits on dots (storing `2026`, `4`, `10` as
44
+ separate tokens), but the query sanitizer was stripping dots and producing
45
+ `2026410` which never matched. Dotted terms are now split and ANDed together
46
+ so version-string searches work as expected (#563).
47
+ - HTTP REST endpoints `/query` and `/search` now return `qmd://collection/path`
48
+ URIs in the `file` field, matching the output format used by the CLI and MCP
49
+ resource URIs. Previously the raw `displayPath` (`collection/path`) was
50
+ returned without the scheme prefix (#576).
51
+ - The embed session `maxDuration` is now env-configurable via
52
+ `QMD_EMBED_MAX_DURATION_MS` (default: 30 min). This prevents large-corpus
53
+ embeddings from being aborted by the hardcoded 30-minute ceiling (#673).
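Reviewer note: the `QMD_EMBED_MAX_DURATION_MS` override described above might be read roughly as follows. This is a minimal sketch, not QMD's actual code. Only the variable name and the 30-minute default come from the changelog; the helper name `resolveEmbedMaxDurationMs` and the rule that invalid or non-positive values fall back to the default are assumptions.

```typescript
// Hypothetical helper: only the env var name and the 30 min default
// are taken from the changelog entry.
const DEFAULT_EMBED_MAX_DURATION_MS = 30 * 60 * 1000;

function resolveEmbedMaxDurationMs(
  env: Record<string, string | undefined>,
): number {
  const raw = env["QMD_EMBED_MAX_DURATION_MS"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_EMBED_MAX_DURATION_MS;
  }
  const parsed = Number(raw);
  // Assumption: non-numeric or non-positive values are ignored.
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return DEFAULT_EMBED_MAX_DURATION_MS;
  }
  return Math.floor(parsed);
}
```

Passing the environment as a plain record keeps the helper testable; in a Node process one would presumably pass `process.env`.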
54
+
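Reviewer note: the dotted-version fix in the Fixed list above (split on dots, then AND the pieces) can be illustrated with a small sketch. The function name `expandDottedTerm` is hypothetical, and the sketch assumes the term contains no double quotes; the real sanitizer in `store.js` may differ.

```typescript
// Hypothetical sanitizer step: split a dotted term into the tokens the
// tokenizer would have stored, quote each one, and AND them together
// using FTS5 query syntax.
function expandDottedTerm(term: string): string {
  return term
    .split(".")
    .filter((part) => part.length > 0)
    .map((part) => `"${part}"`)
    .join(" AND ");
}
```

Under these assumptions `2026.4.10` becomes three quoted tokens joined by `AND`, which lines up with how the tokenizer stored `2026`, `4`, and `10` separately.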
55
+ ## [2.5.3] - 2026-05-28
56
+
57
+ ### Features
58
+
59
+ - `qmd get` now accepts a `:from:count` suffix on a path or docid (e.g.
60
+ `qmd get "#abc123:120:40"` reads 40 lines starting at line 120). Explicit
61
+ `--from`/`-l` flags still override the suffix. The MCP `get` tool accepts the
62
+ same suffix.
63
+ - `qmd get` and `qmd multi-get` are now **line-numbered by default** and print
64
+ the document's `#docid` and `qmd://` path in the output header. Disable line
65
+ numbers with `--no-line-numbers`. The MCP `get`/`multi_get` tools default
66
+ `lineNumbers` to `true` to match.
67
+ - `qmd multi-get` now includes the `#docid` in every output format
68
+ (`--md`, `--json`, `--csv`, `--xml`, `--files`, and the default CLI view),
69
+ consistent with `qmd search`.
70
+ - `qmd get` and `qmd multi-get` accept `--full-path`, which replaces the
71
+ `qmd://` path + `#docid` with the document's on-disk filesystem path (handy for
72
+ piping into `Read`/`Edit`/an editor). Falls back to the canonical `qmd://` +
73
+ docid header when the file no longer exists on disk.
74
+ - `qmd search` / `qmd query` now show a clearer hit identifier: the default CLI
75
+ view (and the new `**file:**` line in `--md` output) always prints the full
76
+ `qmd://collection/path` URI so you can pipe it straight back into `qmd get`.
77
+ - `qmd search` / `qmd query` accept `--full-path` with the same semantics as
78
+ `qmd get`: the result label becomes the file's on-disk path — `./`-prefixed
79
+ relative path when the file lives in a subfolder of `$PWD`, absolute realpath
80
+ otherwise — and the per-result `#docid` is dropped because the path is the
81
+ identifier. The leading `./` is intentional so the output is unambiguously a
82
+ filesystem path. Applies to all output formats.
83
+ - `qmd get` and `qmd multi-get` now also use the `./`-prefixed convention when
84
+ `--full-path` renders a path under `$PWD`, matching `search`/`query`.
85
+ - New `--format <kind>` flag selects the output format (`cli` | `json` | `csv` |
86
+ `md` | `xml` | `files`) for `search`, `query`, and `multi-get`. The legacy
87
+ boolean aliases (`--json`/`--csv`/`--md`/`--xml`/`--files`) still work but are
88
+ no longer in `--help`; prefer `--format`.
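Reviewer note: the `:from:count` suffix from the first Features entry above could be parsed along these lines. This is a sketch under stated assumptions: `parseGetTarget` and the `GetTarget` shape are hypothetical names, the suffix is taken from the end of the string, and the `--from`/`-l` overrides mentioned in the changelog are not modeled.

```typescript
interface GetTarget {
  target: string;       // path or #docid without the suffix
  from: number | null;  // first line to read, if given
  count: number | null; // number of lines to read, if given
}

// Hypothetical parser for "file.md:from" and "file.md:from:count".
function parseGetTarget(input: string): GetTarget {
  const match = /^(.*?):(\d+)(?::(\d+))?$/.exec(input);
  if (match === null) {
    return { target: input, from: null, count: null };
  }
  const [, target, from, count] = match;
  return {
    target: target ?? input,
    from: from === undefined ? null : Number(from),
    count: count === undefined ? null : Number(count),
  };
}
```

With this sketch, `#abc123:120:40` yields target `#abc123`, `from` 120, and `count` 40, matching the changelog's example of reading 40 lines starting at line 120.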
89
+
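Reviewer note: the `--full-path` labelling rule in the Features list above (a `./`-prefixed relative path under `$PWD`, an absolute path otherwise) can be sketched as plain string logic. The name `fullPathLabel` is hypothetical, and the sketch assumes POSIX-style paths that are already absolute and resolved.

```typescript
// Hypothetical labelling helper. Assumes absPath and cwd are absolute,
// already-resolved POSIX paths.
function fullPathLabel(absPath: string, cwd: string): string {
  const base = cwd.endsWith("/") ? cwd : cwd + "/";
  if (absPath.startsWith(base) && absPath.length > base.length) {
    // Keep the leading "./" so the label is unambiguously a filesystem path.
    return "./" + absPath.slice(base.length);
  }
  return absPath;
}
```

Appending the separator before the prefix check stops a sibling directory such as `/home/u/notes2` from being treated as living under `/home/u/notes`.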
90
+ ### Fixes
91
+
92
+ - Launcher: source-mode runner selection now prefers Node + tsx over Bun when
93
+ both `package-lock.json` and `bun.lock` are present in the package root,
94
+ mirroring the dist-mode "npm priority" rule. Fixes pnpm-global installs that
95
+ copy the entire working tree (including `.git` and `bun.lock`) into the
96
+ install dir and previously routed through Bun, causing ABI mismatches with
97
+ the Node-built `better-sqlite3` / `sqlite-vec` native modules.
98
+ - Darwin Metal: llama-using commands (`query`, `vsearch`, `embed`) no longer
99
+ dump a multi-kB GGML/Metal backtrace at process exit even when output
100
+ succeeded. The libggml-metal static `ggml_metal_device` destructor asserts
101
+ `[rsets->data count] == 0` during `__cxa_finalize_ranges`, but the
102
+ buffer-free path never calls the symmetric `ggml_metal_device_rsets_rm`
103
+ to remove released rsets from the device collection (upstream
104
+ ggml-org/llama.cpp#22593, one-line fix open as PR #22595). The assertion
105
+ only fires when `process.exit()` skips Node's `beforeExit` hook, which is
106
+ what node-llama-cpp uses to auto-dispose Metal contexts. Primary fix:
107
+ `finishSuccessfulCliCommand` now sets `process.exitCode = 0` and returns
108
+ instead of calling `process.exit(0)`, so `beforeExit` fires and the native
109
+ binding cleans up before libc's static destructor runs. Defense-in-depth:
110
+ the launcher (`bin/qmd`) and the npm test driver (`scripts/test-all.mjs`
111
+ + the `test:bun` / `test:unit` package.json scripts) also set
112
+ `GGML_METAL_NO_RESIDENCY=1` on darwin before spawning node/bun, covering
113
+ error paths and tests that still terminate via `process.exit()`. The env
114
+ var must be set before node/bun start — libggml-metal reads it via libc
115
+ `getenv` at module-load time, and Bun does not propagate `process.env`
116
+ mutations to libc `setenv` — so it lives in the launcher rather than in
117
+ test-preload. Residency sets give no measurable speedup for QMD's
118
+ short-lived CLI workflow (benchmarked on M3 Pro). Opt back in with
119
+ `QMD_METAL_KEEP_RESIDENCY=1` for long-lived qmd processes (e.g. the MCP
120
+ daemon may benefit on hot reload) or to triage the upstream fix.
121
+ `qmd doctor` reports the mitigation state. Minimal reproduction:
122
+ `scripts/repro-metal-rsets-crash.mjs`.
123
+
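Reviewer note: the primary fix described above swaps an explicit `process.exit(0)` for setting `process.exitCode` and returning, so that Node's `beforeExit` event still fires. The sketch below shows the two patterns against a minimal process-like interface. `ProcessLike`, `finishSuccessfully`, and `finishImmediately` are hypothetical names, not the package's `finishSuccessfulCliCommand`.

```typescript
// Minimal stand-in for the parts of Node's process object used here.
interface ProcessLike {
  exitCode: number | undefined;
  exit(code: number): void;
}

// Pattern after the fix: record the status and return. The event loop
// drains naturally, so 'beforeExit' listeners get a chance to run.
function finishSuccessfully(proc: ProcessLike): void {
  proc.exitCode = 0;
}

// Pattern before the fix: terminate right away. In Node, an explicit
// exit() call does not emit 'beforeExit'.
function finishImmediately(proc: ProcessLike): void {
  proc.exit(0);
}
```

The trade-off is that the first pattern only ends the process once nothing else keeps the event loop alive, which is presumably why the changelog keeps the `GGML_METAL_NO_RESIDENCY=1` launcher setting as a second line of defense.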
124
+ ### Docs
125
+
126
+ - qmd skill: emphasize reading line ranges with `get`'s built-in
127
+ `:from:count` suffix / `--from`/`-l` flags instead of piping through
128
+ `sed`/`head`/`tail`; cite the docid and line numbers now present in retrieval
129
+ output; and author structured `intent:`/`lex:`/`vec:`/`hyde:` queries yourself
130
+ rather than relying on built-in query expansion.
131
+
132
+ ## [2.5.2] - 2026-05-22
133
+
134
+ ### Fixes
135
+
136
+ - Launcher: rewrite `bin/qmd` as a Node-based shebang polyglot to fix execution failures of global npm installs on Windows (#668 / #452), while still falling back to Bun in environments without Node.
137
+
138
+
139
+ ## [2.5.1] - 2026-05-20
140
+
141
+ ### Changes
142
+
143
+ - Release: publish from GitHub Actions via npm Trusted Publishing/OIDC instead of a long-lived `NPM_TOKEN` secret.
144
+
145
+ ## [2.5.0] - 2026-05-19
146
+
147
+ ### Changes
148
+
149
+ - Dependencies: update core SQLite/config/chunking packages (`better-sqlite3`, `yaml`, `web-tree-sitter`, `tree-sitter-go`, and `tree-sitter-python`) while keeping incompatible `zod`, `tsx`, and `vitest` majors pinned.
150
+ - Agent skills: add `qmd skills list|get|path` to serve version-matched runtime skill instructions from the installed CLI, and make `qmd skill install` write a stable discovery stub so installed agent skills do not go stale after QMD upgrades.
151
+ - CLI: add `qmd doctor` for index/runtime diagnostics, including SQLite/sqlite-vec versions, embedding fingerprint freshness, mixed-fingerprint detection, safe legacy fingerprint adoption, and content-hash sampling.
152
+
153
+ ### Fixes
154
+
155
+ - Launcher: prefer runnable TypeScript source in git checkouts even when ignored `dist/` artifacts exist, while packaged installs continue to run `dist/`.
156
+ - GPU: keep node-llama-cpp's documented `gpu: "auto"` initialization as the primary path, then perform no-build packaged CUDA/Vulkan/Metal probes only if auto falls back to CPU.
157
+ - CLI: move GPU/CPU runtime diagnostics out of `qmd status`; use `qmd doctor` for device probing and related environment guidance.
158
+ - CLI: point unexpected command/setup failures toward `qmd doctor` so diagnostics are the default next step when QMD behaves incorrectly.
159
+ - Doctor: explicitly warn when `content_vectors` contains multiple non-empty embedding fingerprint names, with the per-fingerprint document/chunk breakdown.
160
+ - Embed: make the TTY progress line label byte-based input progress explicitly, show embedded chunks as a count, and shorten the displayed model name.
161
+ - Embed: retain per-chunk failure details, retry failed chunks after later successful embeds and again when no other chunks remain, clear recovered errors, and cap retries to avoid endless loops.
162
+ - Tests: expand the container smoke harness to cover npm-global, npx-style, and Bun-global install scenarios, always checking auto and `QMD_FORCE_CPU=1` doctor modes, with opt-in tiny `qmd embed` and GPU probe runs for supported container runtimes.
163
+ - Embedding: fingerprint vector metadata using the active embedding model and formatting/chunking parameters so stale vectors are treated as pending after search semantics change. Legacy `content_vectors` columns are migrated lazily on first vector-health/write use to preserve fast QMD startup.
164
+
165
+ - Skill: expand the packaged QMD skill with retrieval-first workflows, structured query examples, wiki/source collection guidance, and safe fallbacks when model-backed search is unavailable.
166
+ - Tests: make `bun run test` execute the local unit suite under both Node/Vitest and Bun (`test:node` + `test:bun`) so runtime-specific regressions are caught before CI.
167
+ - Model config: centralize embedding/rerank/generation model resolution so `qmd embed`, `status`, `query`, `vsearch`, `pull`, SDK vector search, and `bench` use the same active `.qmd/index.yaml` model hints and environment fallbacks.
168
+ - GPU/status: `qmd status` now uses the same embedding model identity as `qmd embed` when computing pending embeddings, so URI-backed embeddings are not incorrectly reported as pending under the legacy `embeddinggemma` alias.
169
+ - GPU status: `qmd status` now always shows GPU mode/configuration without unsafe native probing, and CPU-fallback warnings point to `QMD_STATUS_DEVICE_PROBE=1 qmd status` for an actual backend probe. The no-GPU warning is emitted once per process instead of once per LLM instance during benchmarks.
170
+ - GPU: add `QMD_FORCE_CPU=1` / `--no-gpu` to bypass CUDA/Vulkan/Metal probing entirely, and route native llama.cpp stdout noise to stderr so JSON output stays parseable during search/query commands.
171
+ - Snippet line numbers: `qmd_query` (MCP), HTTP `/query`, and `qmd query`
172
+ (CLI JSON output and snippet headers) now return absolute source-file
173
+ line numbers instead of chunk-local ones, so the `line` field can be
174
+ passed back to `qmd_get` as `fromLine` without a separate lookup.
175
+ Snippet selection remains scoped to the best matching chunk
176
+ (preserves #149).
177
+ - CLI: `qmd query --full` now emits the full document body in all output
178
+ formats (json, csv, md, xml), restoring the documented behavior of the
179
+ flag. Previously it returned only the best matching chunk (~3.6KB max
180
+ per result). Output payload for `--full` queries is now proportional
181
+ to total document size.
182
+ - macOS Metal: `qmd query --json` now flushes successful JSON output and uses a safe immediate-exit path on Darwin to avoid ggml Metal finalizer aborts; other commands still dispose LLM contexts/models before the llama runtime. #368
183
+ - Embedding: require complete chunk coverage before treating a document as
184
+ embedded, remove partial vectors when chunk/session failures leave a
185
+ document incomplete, and keep `qmd status` pending counts honest after
186
+ interrupted long embed runs. #637 #378
187
+ - Embedding: `qmd embed -c <collection>` now scopes pending-doc selection
188
+ to the requested collection instead of embedding global pending work.
189
+ Scoped `--force` clears only collection-owned vectors, preserves shared
190
+ hashes referenced by sibling collections, and drops `vectors_vec` only
191
+ when the scoped clear empties all vectors.
192
+ - Hybrid search: weight RRF lists by query type so original FTS and original vector evidence get the intended 2x boost, instead of accidentally boosting the first lexical expansion. #591
193
+ - MCP: seed llama.cpp/GGML quiet env vars before launching `qmd mcp` so native logs cannot pollute stdio JSON-RPC framing. #593
194
+ - CLI: remove CommonJS `require()` calls from ESM index path normalization so `qmd --index <path>` no longer crashes with `ERR_AMBIGUOUS_MODULE_SYNTAX` on Node 22+. #634
195
+ - Windows CUDA: serialize llama.cpp embedding/reranking contexts by default to avoid intermittent `ggml-cuda.cu:98` crashes in `qmd query`; set `QMD_EMBED_PARALLELISM` to opt back into parallel contexts if your driver is stable. #519
196
+ - MCP: make `qmd mcp --index <name>` use the selected index for both foreground and daemon HTTP servers instead of falling back to the default store. #343
197
+ - Embedding: respect `QMD_EMBED_MODEL` consistently for vector indexing and vector-backed search, with default-model fallback when unset.
198
+ - Config: use one home-directory resolver for YAML config and the default SQLite cache path, avoiding Windows CLI/MCP split-brain when `HOME` is unset.
199
+ - GPU: respect explicit `QMD_LLAMA_GPU=metal|vulkan|cuda` backend overrides instead of always using auto GPU selection. #529
200
+ - Fix: preserve original filename case in `handelize()`. The previous
201
+ `.toLowerCase()` call made indexed paths unreachable on case-sensitive
202
+ filesystems (Linux). `qmd update` automatically migrates legacy
203
+ lowercase paths without re-embedding.
204
+ - CLI: make `qmd status` skip native `node-llama-cpp` device probing by
205
+ default so status stays safe on machines with broken or unsupported GPU
206
+ drivers. Set `QMD_STATUS_DEVICE_PROBE=1` to opt in.
207
+ - CLI: lazy-load `node-llama-cpp` so lightweight commands such as
208
+ `qmd status` do not import native ML dependencies or trigger llama.cpp
209
+ builds on ARM/no-GPU machines. #491
210
+ - Store: keep content rows referenced by inactive documents during orphan
211
+ cleanup so `qmd update` preserves soft-deleted tombstones for removed
212
+ files. #585
213
+ - Packaging: install AST grammar WASM packages as required dependencies so
214
+ Bun global installs include TypeScript/TSX/JavaScript grammars, and add a
215
+ `smoke:package-grammars` verification command. #595
216
+ - Launcher: add wrapper smoke coverage for scoped package, npm/npx,
217
+ Homebrew/Linuxbrew, Bun global symlink layouts, and `$BUN_INSTALL`
218
+ false-positive runtime selection regressions. #351 #353 #354 #356 #358 #359
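Reviewer note: the snippet line-number entry in the list above amounts to converting a chunk-local line into a source-file line. A minimal sketch, assuming both values are 1-based (the changelog does not say); `toAbsoluteLine` is a hypothetical name.

```typescript
// Assumption: both inputs are 1-based, so the first line of a chunk
// maps to the chunk's own starting line in the source file.
function toAbsoluteLine(chunkStartLine: number, chunkLocalLine: number): number {
  return chunkStartLine + chunkLocalLine - 1;
}
```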
219
+
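Reviewer note: the hybrid-search entry above (#591) concerns per-list weights in reciprocal rank fusion (RRF). The sketch below shows weighted RRF in its textbook form, where each list contributes `weight / (k + rank)` per document. The names `RankedList` and `weightedRrf` are hypothetical, and `k = 60` is the conventional constant, not a value confirmed by the changelog.

```typescript
interface RankedList {
  weight: number; // e.g. 2 for an original query list, 1 for an expansion
  ids: string[];  // document ids, best match first
}

// Weighted reciprocal rank fusion: sum weight / (k + rank) per document,
// where rank is 1-based within each list.
function weightedRrf(lists: RankedList[], k: number = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.ids.forEach((id, index) => {
      const contribution = list.weight / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  return scores;
}
```

Attaching the weight to the list rather than to its position in the array is one way to avoid the kind of bug the entry describes, where the boost landed on whichever list happened to come first.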
220
+ ## [2.1.0] - 2026-04-05
221
+
222
+ Code files now chunk at function and class boundaries via tree-sitter,
223
+ clickable editor links land you at the right line from search results,
224
+ and per-collection model configuration means you can point different
225
+ collections at different embedding models. 25+ community PRs fix
226
+ embedding stability, BM25 accuracy, and cross-platform launcher issues.
227
+
228
+ ### Changes
229
+
230
+ - AST-aware chunking for code files via `web-tree-sitter`. Supported
231
+ languages: TypeScript/JavaScript, Python, Go, and Rust. Code files
232
+ are chunked at function, class, and import boundaries instead of
233
+ arbitrary text positions. Markdown and unknown file types are unchanged.
234
+ `--chunk-strategy <auto|regex>` flag on `qmd embed` and `qmd query`
235
+ (default `regex`). SDK: `chunkStrategy` option on `embed()` and
236
+ `search()`. `qmd status` shows grammar availability.
237
+ - `qmd bench <fixture.json>` command for search quality benchmarks.
238
+ Measures precision@k, recall, MRR, and F1 across BM25, vector, hybrid,
239
+ and full pipeline backends. Ships with an example fixture against
240
+ the eval-docs test collection. #470 (thanks @jmilinovich)
241
+ - `models:` section in `index.yml` lets you configure `embed`, `rerank`,
242
+ and `generate` model URIs per collection. Resolution order is
243
+ config > env var (`QMD_EMBED_MODEL`, `QMD_RERANK_MODEL`,
244
+ `QMD_GENERATE_MODEL`) > built-in default. #502
245
+ (thanks @JohnRichardEnders)
246
+ - CLI search output now emits clickable OSC 8 terminal hyperlinks when
247
+ stdout is a TTY. Links resolve `qmd://` paths to absolute filesystem
248
+ paths and open in editors via URI templates (default:
249
+ `vscode://file/{path}:{line}:{col}`). Configure with `QMD_EDITOR_URI`
250
+ or `editor_uri` in the YAML config. #508 (thanks @danmackinlay)
251
+ - `--no-rerank` flag skips the reranking step in `qmd query` — useful
252
+ when you want fast results or don't have a GPU. Also exposed as
253
+ `rerank: false` on the MCP `query` tool. #370 (thanks @mvanhorn),
254
+ #478 (thanks @zestyboy)
255
+ - ONNX conversion script for deploying embedding models via
256
+ Transformers.js. #399 (thanks @shreyaskarnik)
257
+ - GitHub Actions workflow to build the Nix flake on Linux and macOS.
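Reviewer note: the OSC 8 hyperlink entry in the list above combines two steps, filling an editor URI template and wrapping text in the OSC 8 escape sequence. The sketch below illustrates both. The function names are hypothetical, the path is substituted without URI encoding, and only the default template string comes from the changelog.

```typescript
// Fill {path}, {line}, and {col} in an editor URI template. Function
// replacers are used so "$" in a path is not treated as a pattern.
function expandEditorUri(
  template: string,
  path: string,
  line: number,
  col: number,
): string {
  return template
    .replace("{path}", () => path)
    .replace("{line}", () => String(line))
    .replace("{col}", () => String(col));
}

// Wrap text in an OSC 8 hyperlink: ESC ] 8 ; ; URI ST text ESC ] 8 ; ; ST,
// where ST is the string terminator ESC followed by a backslash.
function osc8Link(uri: string, text: string): string {
  const ESC = "\u001b";
  const ST = ESC + "\\";
  return `${ESC}]8;;${uri}${ST}${text}${ESC}]8;;${ST}`;
}
```

A caller would presumably emit the wrapped text only when stdout is a TTY, as the changelog entry states, and print the plain text otherwise.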
258
+
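Reviewer note: the resolution order stated in the `models:` entry above (config, then environment variable, then built-in default) can be sketched as a chain of fallbacks. `resolveModel` and its parameters are hypothetical; only the three environment variable names and the order come from the changelog. The sketch treats only missing values as absent, so an empty string would still win.

```typescript
type ModelRole = "embed" | "rerank" | "generate";

// Environment variable names as listed in the changelog entry.
const MODEL_ENV_VAR: Record<ModelRole, string> = {
  embed: "QMD_EMBED_MODEL",
  rerank: "QMD_RERANK_MODEL",
  generate: "QMD_GENERATE_MODEL",
};

// Hypothetical resolver: config > env var > built-in default.
function resolveModel(
  role: ModelRole,
  configModels: Partial<Record<ModelRole, string>>,
  env: Record<string, string | undefined>,
  builtInDefault: string,
): string {
  return configModels[role] ?? env[MODEL_ENV_VAR[role]] ?? builtInDefault;
}
```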
259
+ ### Fixes
260
+
261
+ - Embedding: prevent `qmd embed` from running indefinitely when the
262
+ embedding loop stalls. #458 (thanks @ccc-fff)
263
+ - Embedding: truncate oversized text before embedding to prevent GGML
264
+ crash, and bound memory usage during batch embedding. #393
265
+ (thanks @lskun), #395 (thanks @ProgramCaiCai)
266
+ - Embedding: set explicit embed context size (default 2048, configurable
267
+ via `QMD_EMBED_CONTEXT_SIZE`) instead of using the model's full
268
+ window. #500
269
+ - Embedding: error on dimension mismatch instead of silently rebuilding
270
+ the vec0 table. #501
271
+ - Embedding: handle vec0 `OR REPLACE` limitation in `insertEmbedding`.
272
+ #456 (thanks @antonio-mello-ai)
273
+ - Embedding: fix model selection when multiple models are configured.
274
+ #494
275
+ - BM25: correct field weights to include all 3 FTS columns — title,
276
+ body, and path were not weighted correctly. #462 (thanks @goldsr09)
277
+ - BM25: handle hyphenated tokens in FTS5 lex queries so terms like
278
+ "real-time" match correctly. #463 (thanks @goldsr09)
279
+ - BM25: preserve underscores in search terms instead of stripping them.
280
+ #404
281
+ - BM25: use CTE in `searchFTS` to prevent query planner regression with
282
+ collection filter.
283
+ - Reranker: increase default context size 2048→4096 and make
284
+ configurable via `QMD_RERANK_CONTEXT_SIZE`. Fix template overhead
285
+ underestimate 200→512. #453 (thanks @builderjarvis)
286
+ - GPU: catch initialization failures and fall back to CPU instead of
287
+ crashing.
288
+ - MCP: read version from `package.json` instead of hardcoding. #431
289
+ - MCP: include collection name in status output. #416
290
+ - Multi-get: support brace expansion patterns in glob matching. #424
291
+ - Launcher: prioritize `package-lock.json` to prevent Bun false
292
+ positive. #385 (thanks @rymalia)
293
+ - Launcher: remove `$BUN_INSTALL` check that caused false Bun detection.
294
+ #362 (thanks @syedair)
295
+ - Launcher: skip Git Bash path detection on WSL. #371
296
+ (thanks @oysteinkrog)
297
+ - Model cache: respect `XDG_CACHE_HOME` for model cache directory. #457
298
+ (thanks @antonio-mello-ai)
299
+ - SQLite: add macOS Homebrew SQLite support for Bun and restore
300
+ actionable errors. #377 (thanks @serhii12)
301
+ - Pin zod to exact 4.2.1 to fix `tsc` build failure. #382
302
+ (thanks @rymalia)
303
+ - Preserve dots and original case in `handelize()` — filenames like
304
+ `MEMORY.md` no longer become `memory-md`. #475 (thanks @alexei-led)
305
+ - Include `line` in `--json` search output so editor integrations can
306
+ jump directly to `file:line`. #506 (thanks @danmackinlay)
307
+ - Nix: fix paths in flake and make Bun dependency a fixed-output
308
+ derivation so sandboxed Linux builds work offline. #479
309
+ (thanks @surma-dump)
310
+ - Sync stale `bun.lock` (`better-sqlite3` 11.x → 12.x). CI and release
311
+ script now use `--frozen-lockfile` to prevent recurrence. #386
312
+ (thanks @Mic92)
313
+ - Approve native build scripts in pnpm so `better-sqlite3` and
314
+ tree-sitter modules compile correctly. Update vitest ^3.0.0 → ^3.2.4.
315
+
316
+ ## [2.0.1] - 2026-03-10
317
+
318
+ ### Changes
319
+
320
+ - `qmd skill install` copies the packaged QMD skill into
321
+ `~/.claude/commands/` for one-command setup. #355 (thanks @nibzard)
322
+
323
+ ### Fixes
324
+
325
+ - Fix Qwen3-Embedding GGUF filename case — HuggingFace filenames are
326
+ case-sensitive, so the lowercase variant returned 404. #349 (thanks @byheaven)
327
+ - Resolve symlinked global launcher path so `qmd` works correctly when
328
+ installed via `npm i -g`. #352 (thanks @nibzard)
329
+
330
+ ## [2.0.0] - 2026-03-10
331
+
332
+ QMD 2.0 declares a stable library API. The SDK is now the primary interface —
333
+ the MCP server is a clean consumer of it, and the source is organized into
334
+ `src/cli/` and `src/mcp/`. Also: Node 25 support and a runtime-aware bin wrapper
335
+ for bun installs.
336
+
337
+ ### Changes
338
+
339
+ - Stable SDK API with `QMDStore` interface — search, retrieval, collection/context
340
+ management, indexing, lifecycle
341
+ - Unified `search()`: pass `query` for auto-expansion or `queries` for
342
+ pre-expanded lex/vec/hyde — replaces the old query/search/structuredSearch split
343
+ - New `getDocumentBody()`, `getDefaultCollectionNames()`, `Maintenance` class
344
+ - MCP server rewritten as a clean SDK consumer — zero internal store access
345
+ - CLI and MCP organized into `src/cli/` and `src/mcp/` subdirectories
346
+ - Runtime-aware `bin/qmd` wrapper detects bun vs node to avoid ABI mismatches.
347
+ Closes #319
348
+ - `better-sqlite3` bumped to ^12.4.5 for Node 25 support. Closes #257
349
+ - Utility exports: `extractSnippet`, `addLineNumbers`, `DEFAULT_MULTI_GET_MAX_BYTES`
350
+
351
+ ### Fixes
352
+
353
+ - Remove unused `import { resolve }` in store.ts that shadowed local export
354
+
355
+ ## [1.1.6] - 2026-03-09
356
+
357
+ QMD can now be used as a library. `import { createStore } from '@tobilu/qmd'`
358
+ gives you the full search and indexing API — hybrid query, BM25, structured
359
+ search, collection/context management — without shelling out to the CLI.
360
+
361
+ ### Changes
362
+
363
+ - **SDK / library mode**: `createStore({ dbPath, config })` returns a
364
+ `QMDStore` with `query()`, `search()`, `structuredSearch()`, `get()`,
365
+ `multiGet()`, and collection/context management methods. Supports inline
366
+ config (no files needed) or a YAML config path.
367
+ - **Package exports**: `package.json` now declares `main`, `types`, and
368
+ `exports` so bundlers and TypeScript resolve `@tobilu/qmd` correctly.
369
+
370
+ ## [1.1.5] - 2026-03-07
371
+
372
+ Ambiguous queries like "performance" now produce dramatically better results
373
+ when the caller knows what they mean. The new `intent` parameter steers all
374
+ five pipeline stages — expansion, strong-signal bypass, chunk selection,
375
+ reranking, and snippet extraction — without searching on its own. Design and
376
+ original implementation by Ilya Grigorik (@vyalamar) in #180.
377
+
378
+ ### Changes
379
+
380
+ - **Intent parameter**: optional `intent` string disambiguates queries across
381
+ the entire search pipeline. Available via CLI (`--intent` flag or `intent:`
382
+ line in query documents), MCP (`intent` field on the query tool), and
383
+ programmatic API. Adapted from PR #180 (thanks @vyalamar).
384
+ - **Query expansion**: when intent is provided, the expansion LLM prompt
385
+ includes `Query intent: {intent}`, matching the finetune training data
386
+ format for better-aligned expansions.
387
+ - **Reranking**: intent is prepended to the rerank query so Qwen3-Reranker
388
+ scores with domain context.
389
+ - **Chunk selection**: intent terms scored at 0.5× weight alongside query
390
+ terms (1.0×) when selecting the best chunk per document for reranking.
391
+ - **Snippet extraction**: intent terms scored at 0.3× weight to nudge
392
+ snippets toward intent-relevant lines without overriding query anchoring.
393
+ - **Strong-signal bypass disabled with intent**: when intent is provided, the
394
+ BM25 strong-signal shortcut is skipped — the obvious keyword match may not
395
+ be what the caller wants.
396
+ - **MCP instructions**: callers are now guided to provide `intent` on every
397
+ search call for disambiguation.
398
+ - **Query document syntax**: `intent:` recognized as a line type. At most one
399
+ per document, cannot appear alone. Grammar updated in `docs/SYNTAX.md`.
400
+
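Reviewer note: the chunk-selection entry above weights intent terms at 0.5× and query terms at 1.0×. The sketch below shows one way such a score could be computed. The substring matching, the tie-breaking rule (first chunk wins), and the names `scoreChunk` and `pickBestChunk` are assumptions; only the two weights come from the changelog.

```typescript
const QUERY_TERM_WEIGHT = 1.0;
const INTENT_TERM_WEIGHT = 0.5;

// Count how many terms occur in the text (case-insensitive substring match).
function countHits(text: string, terms: string[]): number {
  const haystack = text.toLowerCase();
  return terms.filter((term) => haystack.includes(term.toLowerCase())).length;
}

function scoreChunk(chunk: string, queryTerms: string[], intentTerms: string[]): number {
  return (
    countHits(chunk, queryTerms) * QUERY_TERM_WEIGHT +
    countHits(chunk, intentTerms) * INTENT_TERM_WEIGHT
  );
}

// Return the index of the highest-scoring chunk, or -1 for an empty list.
function pickBestChunk(chunks: string[], queryTerms: string[], intentTerms: string[]): number {
  let bestIndex = -1;
  let bestScore = -Infinity;
  chunks.forEach((chunk, index) => {
    const score = scoreChunk(chunk, queryTerms, intentTerms);
    if (score > bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestIndex;
}
```

In the "performance" example from the release notes, two chunks that both match the query term tie at 1.0 until an intent term such as "database" nudges one of them ahead.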
401
+ ## [1.1.2] - 2026-03-07
402
+
403
+ 13 community PRs merged. GPU initialization replaced with node-llama-cpp's
404
+ built-in `autoAttempt` — deleting ~220 lines of manual fallback code and
405
+ fixing GPU issues reported across 10+ PRs in one shot. Reranking is faster
406
+ through chunk deduplication and a parallelism cap that prevents VRAM
407
+ exhaustion.
408
+
409
+ ### Changes
410
+
411
+ - **GPU init**: use node-llama-cpp's `build: "autoAttempt"` instead of manual
412
+ GPU backend detection. Automatically tries Metal/CUDA/Vulkan and falls back
413
+ gracefully. #310 (thanks @giladgd — the node-llama-cpp author)
414
+ - **Query `--explain`**: `qmd query --explain` exposes retrieval score traces
415
+ — backend scores, per-list RRF contributions, top-rank bonus, reranker
416
+ score, and final blended score. Works in JSON and CLI output. #242
417
+ (thanks @vyalamar)
418
+ - **Collection ignore patterns**: `ignore: ["Sessions/**", "*.tmp"]` in
419
+ collection config to exclude files from indexing. #304 (thanks @sebkouba)
420
+ - **Multilingual embeddings**: `QMD_EMBED_MODEL` env var lets you swap in
421
+ models like Qwen3-Embedding for non-English collections. #273 (thanks
422
+ @daocoding)
423
+ - **Configurable expansion context**: `QMD_EXPAND_CONTEXT_SIZE` env var
424
+ (default 2048) — previously used the model's full 40960-token window,
425
+ wasting VRAM. #313 (thanks @0xble)
426
+ - **`candidateLimit` exposed**: `-C` / `--candidate-limit` flag and MCP
427
+ parameter to tune how many candidates reach the reranker. #255 (thanks
428
+ @pandysp)
429
+ - **MCP multi-session**: HTTP transport now supports multiple concurrent
430
+ client sessions, each with its own server instance. #286 (thanks @joelev)
431
+
432
+ ### Fixes
433
+
434
+ - **Reranking performance**: cap parallel rerank contexts at 4 to prevent
435
+ VRAM exhaustion on high-core machines. Deduplicate identical chunk texts
436
+ before reranking — same content from different files now shares a single
437
+ reranker call. Cache scores by content hash instead of file path.
438
+ - Deactivate stale docs when all files are removed from a collection and
439
+ `qmd update` is run. #312 (thanks @0xble)
440
+ - Handle emoji-only filenames (`🐘.md` → `1f418.md`) instead of crashing.
441
+ #308 (thanks @debugerman)
442
+ - Skip unreadable files during indexing (e.g. iCloud-evicted files returning
443
+ EAGAIN) instead of crashing. #253 (thanks @jimmynail)
444
+ - Suppress progress bar escape sequences when stderr is not a TTY. #230
445
+ (thanks @dgilperez)
446
+ - Emit format-appropriate empty output (`[]` for JSON, CSV header for CSV,
447
+ etc.) instead of plain text "No results." #228 (thanks @amsminn)
448
+ - Correct Windows sqlite-vec package name (`sqlite-vec-windows-x64`) and add
449
+ `sqlite-vec-linux-arm64`. #225 (thanks @ilepn)
450
+ - Fix Claude plugin setup CLI commands in README. #311 (thanks @gi11es)
451
+
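The reranking fix above (score identical chunk texts once, cache by content rather than file path) can be illustrated with a minimal sketch. The `Scorer` type, the `Chunk` shape, and `rerankDeduplicated` are hypothetical names, not QMD's API, and the cache is keyed on the text itself as a stand-in for a content hash.

```typescript
// Hypothetical scorer type: returns a relevance score for one chunk of text.
type Scorer = (query: string, text: string) => number;

interface Chunk {
  file: string;
  text: string;
}

// Score every chunk, but call the scorer only once per distinct text.
// The cache key is the text itself, standing in for a content hash.
function rerankDeduplicated(query: string, chunks: Chunk[], score: Scorer): number[] {
  const cache = new Map<string, number>();
  return chunks.map((chunk) => {
    let cached = cache.get(chunk.text);
    if (cached === undefined) {
      cached = score(query, chunk.text);
      cache.set(chunk.text, cached);
    }
    return cached;
  });
}

// Demo: a.md and b.md hold the same text, so the scorer runs twice, not three times.
let scorerCalls = 0;
const demoScorer: Scorer = (_query, text) => {
  scorerCalls += 1;
  return text.length / 100;
};
const demoScores = rerankDeduplicated(
  "elephants",
  [
    { file: "a.md", text: "Elephants are large." },
    { file: "b.md", text: "Elephants are large." },
    { file: "c.md", text: "Cats are small." },
  ],
  demoScorer,
);
```

Keying on content instead of path means a renamed or duplicated file reuses the earlier score.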
452
+ ## [1.1.1] - 2026-03-06
453
+
454
+ ### Fixes
455
+
456
+ - Reranker: truncate documents exceeding the 2048-token context window
457
+ instead of silently producing garbage scores. Long chunks (e.g. from
458
+ PDF ingestion) now get a fair ranking.
459
+ - Nix: add python3 and cctools to build dependencies. #214 (thanks
460
+ @pcasaretto)
461
+
462
+ ## [1.1.0] - 2026-02-20
463
+
464
+ QMD now speaks in **query documents** — structured multi-line queries where every line is typed (`lex:`, `vec:`, `hyde:`), combining keyword precision with semantic recall. A single plain query still works exactly as before (it's treated as an implicit `expand:` and auto-expanded by the LLM). Lex now supports quoted phrases and negation (`"C++ performance" -sports -athlete`), making intent-aware disambiguation practical. The formal query grammar is documented in `docs/SYNTAX.md`.
465
+
466
+ The npm package now uses the standard `#!/usr/bin/env node` bin convention, replacing the custom bash wrapper. This fixes native module ABI mismatches when installed via bun and works on any platform with node >= 22 on PATH.
467
+
468
+ ### Changes
469
+
470
+ - **Query document format**: multi-line queries with typed sub-queries (`lex:`, `vec:`, `hyde:`). Plain queries remain the default (`expand:` implicit, but not written inside the document). First sub-query gets 2× fusion weight — put your strongest signal first. Formal grammar in `docs/SYNTAX.md`.
471
+ - **Lex syntax**: full BM25 operator support. `"exact phrase"` for verbatim matching; `-term` and `-"phrase"` for exclusions. Essential for disambiguation when a term is overloaded across domains (e.g. `performance -sports -athlete`).
472
+ - **`expand:` shortcut**: send a single plain query (or start the document with `expand:` on its only line) to auto-expand via the local LLM. Query documents themselves are limited to `lex`, `vec`, and `hyde` lines.
473
+ - **MCP `query` tool** (renamed from `structured_search`): rewrote the tool description to fully teach AI agents the query document format, lex syntax, and combination strategy. Includes worked examples with intent-aware lex.
474
+ - **HTTP `/query` endpoint** (renamed from `/search`; `/search` kept as silent alias).
475
+ - **`collections` array filter**: filter by multiple collections in a single query (`collections: ["notes", "brain"]`). Removed the single `collection` string param — array only.
476
+ - **Collection `include`/`exclude`**: `includeByDefault: false` hides a collection from all queries unless explicitly named via `collections`. CLI: `qmd collection exclude <name>` / `qmd collection include <name>`.
477
+ - **Collection `update-cmd`**: attach a shell command that runs before every `qmd update` (e.g. `git stash && git pull --rebase --ff-only && git stash pop`). CLI: `qmd collection update-cmd <name> '<cmd>'`.
478
+ - **`qmd status` tips**: shows actionable tips when collections lack context descriptions or update commands.
479
+ - **`qmd collection` subcommands**: `show`, `update-cmd`, `include`, `exclude`. Bare `qmd collection` now prints help.
480
+ - **Packaging**: replaced custom bash wrapper with standard `#!/usr/bin/env node` shebang on `dist/qmd.js`. Fixes native module ABI mismatches when installed via bun, and works on any platform where node >= 22 is on PATH.
481
+ - **Removed MCP tools** `search`, `vector_search`, `deep_search` — all superseded by `query`.
482
+ - **Removed** `qmd context check` command.
483
+ - **CLI timing**: each LLM step (expand, embed, rerank) prints elapsed time inline (`Expanding query... (4.2s)`).
484
+
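As an illustration of the query document format described above, here is a rough parser sketch. It is not the grammar from `docs/SYNTAX.md`: the `SubQuery` shape, `parseQueryDocument`, and the choice to reject untyped lines are assumptions made for the example. Only the three prefixes and the 2× weight on the first sub-query come from the changelog.

```typescript
type SubQueryType = "lex" | "vec" | "hyde";

interface SubQuery {
  type: SubQueryType;
  text: string;
  weight: number; // first sub-query gets 2x, per the changelog entry
}

// Parse a multi-line query document into typed sub-queries.
// Lines without a known prefix are rejected here; the real grammar may differ.
function parseQueryDocument(doc: string): SubQuery[] {
  const result: SubQuery[] = [];
  for (const rawLine of doc.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    const match = /^(lex|vec|hyde):\s*(.*)$/.exec(line);
    if (match === null) {
      throw new Error(`untyped line in query document: ${line}`);
    }
    result.push({
      type: match[1] as SubQueryType,
      text: match[2] ?? "",
      weight: result.length === 0 ? 2 : 1,
    });
  }
  return result;
}

// Demo: a lex line with a quoted phrase and exclusions, followed by a vec line.
const parsedDemo = parseQueryDocument(
  'lex: "C++ performance" -sports -athlete\nvec: how to make C++ code faster',
);
```

The lex text is passed through untouched; interpreting quoted phrases and `-term` exclusions would be the search backend's job in this sketch.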
485
+ ### Fixes
486
+
487
+ - `qmd collection list` shows `[excluded]` tag for collections with `includeByDefault: false`.
488
+ - Default searches now respect `includeByDefault` — excluded collections are skipped unless explicitly named.
489
+ - Fix main module detection when installed globally via npm/bun (symlink resolution).
490
+
491
+ ## [1.0.7] - 2026-02-18
492
+
493
+ ### Changes
494
+
495
+ - LLM: add LiquidAI LFM2-1.2B as an alternative base model for query
496
+ expansion fine-tuning. LFM2's hybrid architecture (convolutions + attention)
497
+ is 2x faster at decode/prefill vs standard transformers — good fit for
498
+ on-device inference.
499
+ - CLI: support multiple `-c` flags to search across several collections at
500
+ once (e.g. `qmd search -c notes -c journals "query"`). #191 (thanks
501
+ @openclaw)
502
+
503
+ ### Fixes
504
+
505
+ - Return empty JSON array `[]` instead of no output when `--json` search
506
+ finds no results.
507
+ - Resolve relative paths passed to `--index` so they don't produce malformed
508
+ config entries.
509
+ - Respect `XDG_CONFIG_HOME` for collection config path instead of always
510
+ using `~/.config`. #190 (thanks @openclaw)
511
+ - CLI: empty-collection hint now shows the correct `collection add` command.
512
+ #200 (thanks @vincentkoc)
513
+
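The `XDG_CONFIG_HOME` fix above amounts to a small path-resolution rule. A minimal sketch, assuming the config file lives at `qmd/index.yml` under the config directory (the 0.5.0 entry mentions `~/.config/qmd/index.yml`); `configPath` and its parameters are hypothetical names:

```typescript
// Resolve the collection config path. `env` and `home` are passed in so the
// function stays pure; a real CLI would read them from the process environment.
function configPath(env: Record<string, string | undefined>, home: string): string {
  const xdg = env["XDG_CONFIG_HOME"];
  // Per the XDG convention, an unset or empty value falls back to ~/.config.
  const base = xdg !== undefined && xdg !== "" ? xdg : `${home}/.config`;
  return `${base}/qmd/index.yml`;
}

const defaultPath = configPath({}, "/home/alice");
const customPath = configPath({ XDG_CONFIG_HOME: "/srv/cfg" }, "/home/alice");
```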
514
+ ## [1.0.6] - 2026-02-16
515
+
516
+ ### Changes
517
+
518
+ - CLI: `qmd status` now shows models with full HuggingFace links instead of
519
+ static names in `--help`. Model info is derived from the actual configured
520
+ URIs so it stays accurate if models change.
521
+ - Release tooling: pre-push hook handles non-interactive shells (CI, editors)
522
+ gracefully — warnings auto-proceed instead of hanging on a tty prompt.
523
+ Annotated tags now resolve correctly for CI checks.
524
+
525
+ ## [1.0.5] - 2026-02-16
526
+
527
+ The npm package now ships compiled JavaScript instead of raw TypeScript,
528
+ removing the `tsx` runtime dependency. A new `/release` skill automates the
529
+ full release workflow with changelog validation and git hook enforcement.
530
+
531
+ ### Changes
532
+
533
+ - Build: compile TypeScript to `dist/` via `tsc` so the npm package no longer
534
+ requires `tsx` at runtime. The `qmd` shell wrapper now runs `dist/qmd.js`
535
+ directly.
536
+ - Release tooling: new `/release` skill that manages the full release
537
+ lifecycle — validates changelog, installs git hooks, previews release notes,
538
+ and cuts the release. Auto-populates `[Unreleased]` from git history when
539
+ empty.
540
+ - Release tooling: `scripts/extract-changelog.sh` extracts cumulative notes
541
+ for the full minor series (e.g. 1.0.0 through 1.0.5) for GitHub releases.
542
+ Includes `[Unreleased]` content in previews.
543
+ - Release tooling: `scripts/release.sh` renames `[Unreleased]` to a versioned
544
+ heading and inserts a fresh empty `[Unreleased]` section automatically.
545
+ - Release tooling: pre-push git hook blocks `v*` tag pushes unless
546
+ `package.json` version matches the tag, a changelog entry exists, and CI
547
+ passed on GitHub.
548
+ - Publish workflow: GitHub Actions now builds TypeScript, creates a GitHub
549
+ release with cumulative notes extracted from the changelog, and publishes
550
+ to npm with provenance.
551
+
552
+ ## [1.0.0] - 2026-02-15
553
+
554
+ QMD now runs on both Node.js and Bun, with up to 2.7x faster reranking
555
+ through parallel GPU contexts. GPU auto-detection replaces the unreliable
556
+ `gpu: "auto"` with explicit CUDA/Metal/Vulkan probing.
557
+
558
+ ### Changes
559
+
560
+ - Runtime: support Node.js (>=22) alongside Bun via a cross-runtime SQLite
561
+ abstraction layer (`src/db.ts`). `bun:sqlite` on Bun, `better-sqlite3` on
562
+ Node. The `qmd` wrapper auto-detects a suitable Node.js install via PATH,
563
+ then falls back to mise, asdf, nvm, and Homebrew locations.
564
+ - Performance: parallel embedding & reranking via multiple LlamaContext
565
+ instances — up to 2.7x faster on multi-core machines.
566
+ - Performance: flash attention for ~20% less VRAM per reranking context,
567
+ enabling more parallel contexts on GPU.
568
+ - Performance: right-sized reranker context (40960 → 2048 tokens, 17x less
569
+ memory) since chunks are capped at ~900 tokens.
570
+ - Performance: adaptive parallelism — context count computed from available
571
+ VRAM (GPU) or CPU math cores rather than hardcoded.
572
+ - GPU: probe for CUDA, Metal, Vulkan explicitly at startup instead of
573
+ relying on node-llama-cpp's `gpu: "auto"`. `qmd status` shows device info.
574
+ - Tests: reorganized into flat `test/` directory with vitest for Node.js and
575
+ bun test for Bun. New `eval-bm25` and `store.helpers.unit` suites.
576
+
577
+ ### Fixes
578
+
579
+ - Prevent VRAM waste from duplicate context creation during concurrent
580
+ `embedBatch` calls — initialization lock now covers the full path.
581
+ - Collection-aware FTS filtering so scoped keyword search actually restricts
582
+ results to the requested collection.
583
+
584
+ ## [0.9.0] - 2026-02-15
585
+
586
+ First published release on npm as `@tobilu/qmd`. MCP HTTP transport with
587
+ daemon mode cuts warm query latency from ~16s to ~10s by keeping models
588
+ loaded between requests.
589
+
590
+ ### Changes
591
+
592
+ - MCP: HTTP transport with daemon lifecycle — `qmd mcp --http --daemon`
593
+ starts a background server, `qmd mcp stop` shuts it down. Models stay warm
594
+ in VRAM between queries. #149 (thanks @igrigorik)
595
+ - Search: type-routed query expansion preserves lex/vec/hyde type info and
596
+ routes to the appropriate backend. Eliminates ~4 wasted backend calls per
597
+ query (10.0 → 6.0 calls, 1278ms → 549ms). #149 (thanks @igrigorik)
598
+ - Search: unified pipeline — extracted `hybridQuery()` and
599
+ `vectorSearchQuery()` to `store.ts` so CLI and MCP share identical logic.
600
+ Fixes a class of bugs where results differed between the two. #149 (thanks
601
+ @igrigorik)
602
+ - MCP: dynamic instructions generated at startup from actual index state —
603
+ LLMs see collection names, doc counts, and content descriptions. #149
604
+ (thanks @igrigorik)
605
+ - MCP: tool renames (vsearch → vector_search, query → deep_search) with
606
+ rewritten descriptions for better tool selection. #149 (thanks @igrigorik)
607
+ - Integration: Claude Code plugin with inline status checks and MCP
608
+ integration. #99 (thanks @galligan)
609
+
610
+ ### Fixes
611
+
612
+ - BM25 score normalization — formula was inverted (`1/(1+|x|)` instead of
613
+ `|x|/(1+|x|)`), so strong matches scored *lowest*. Broke `--min-score`
614
+ filtering and made the "strong signal" short-circuit dead code. #76 (thanks
615
+ @dgilperez)
616
+ - Normalize Unicode paths to NFC for macOS compatibility. #82 (thanks
617
+ @c-stoeckl)
618
+ - Handle dense content (code) that tokenizes beyond expected chunk size.
619
+ - Proper cleanup of Metal GPU resources on process exit.
620
+ - SQLite-vec readiness verification after extension load.
621
+ - Reactivate deactivated documents on re-index instead of creating duplicates.
622
+ - Bun UTF-8 path corruption workaround for non-ASCII filenames.
623
+ - Disable following symlinks in glob.scan to avoid infinite loops.
624
+
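To make the BM25 normalization fix concrete, the two formulas can be compared side by side. This is a sketch, not the code from `store.ts`; the function names are made up. It relies on SQLite FTS5's `bm25()` returning negative values where a more negative score means a better match, which is why the absolute value is taken.

```typescript
// Corrected form: |x| / (1 + |x|). Larger magnitude (stronger match) maps closer to 1.
function normalizeBm25(x: number): number {
  const magnitude = Math.abs(x);
  return magnitude / (1 + magnitude);
}

// Inverted form described in the fix: 1 / (1 + |x|). Strong matches score lowest.
function invertedNormalizeBm25(x: number): number {
  return 1 / (1 + Math.abs(x));
}

const strongMatch = -9; // raw FTS5 score for a strong match
const weakMatch = -1;   // raw FTS5 score for a weak match
```

With the corrected form, a `--min-score` threshold keeps the strong match and drops the weak one; with the inverted form the opposite happens.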
625
+ ## [0.8.0] - 2026-01-28
626
+
627
+ Fine-tuned query expansion model trained with GRPO replaces the stock Qwen3
628
+ 0.6B. The training pipeline scores expansions on named entity preservation,
629
+ format compliance, and diversity — producing noticeably better lexical
630
+ variations and HyDE documents.
631
+
632
+ ### Changes
633
+
634
+ - LLM: deploy GRPO-trained (Group Relative Policy Optimization) query
635
+ expansion model, hosted on HuggingFace and auto-downloaded on first use.
636
+ Better preservation of proper nouns and technical terms in expansions.
637
+ - LLM: `/only:lex` mode for single-type expansions — useful when you know
638
+ which search backend will help.
639
+ - LLM: HyDE output moved to first position so vector search can start
640
+ embedding while other expansions generate.
641
+ - LLM: session lifecycle management via `withLLMSession()` pattern — ensures
642
+ cleanup even on failure, similar to database transactions.
643
+ - Integration: org-mode title extraction support. #50 (thanks @sh54)
644
+ - Integration: SQLite extension loading in Nix devshell. #48 (thanks @sh54)
645
+ - Integration: AI agent discovery via skills.sh. #64 (thanks @Algiras)
646
+
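The session lifecycle pattern mentioned above (cleanup even on failure) is essentially `try`/`finally` around an async callback. The sketch below shows the general shape only; the `Session` interface, `withSessionSketch`, and its signature are assumptions and do not reflect the actual `withLLMSession()` implementation.

```typescript
// Hypothetical session shape; the real session type is not shown in the changelog.
interface Session {
  dispose(): Promise<void>;
}

// Run `work` with a fresh session and always dispose it afterwards,
// whether `work` resolves or throws.
async function withSessionSketch<S extends Session, T>(
  open: () => Promise<S>,
  work: (session: S) => Promise<T>,
): Promise<T> {
  const session = await open();
  try {
    return await work(session);
  } finally {
    await session.dispose();
  }
}

// Demo opener that counts how many sessions have been disposed.
let disposedCount = 0;
const openDemoSession = async (): Promise<Session> => ({
  dispose: async () => {
    disposedCount += 1;
  },
});
```

As with a database transaction, the caller never holds a session outside the callback, so there is no code path that forgets to release it.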
647
+ ### Fixes
648
+
649
+ - Use sequential embedding on CPU-only systems. Parallel contexts contended
650
+ for the same CPU cores, making things slower.
651
+ #54 (thanks @freeman-jiang)
652
+ - Fix `collectionName` column in vector search SQL (was still using old
653
+ `collectionId` from before YAML migration). #61 (thanks @jdvmi00)
654
+ - Fix Qwen3 sampling params to prevent repetition loops — stock
655
+ temperature/top-p caused occasional infinite repeat patterns.
656
+ - Add `--index` option to CLI argument parser (was documented but not wired
657
+ up). #84 (thanks @Tritlo)
658
+ - Fix DisposedError during slow batch embedding. #41 (thanks @wuhup)
659
+
660
+ ## [0.7.0] - 2026-01-09
661
+
662
+ First community contributions. The project gained external contributors,
663
+ surfacing bugs that only appear in diverse environments — Homebrew sqlite-vec
664
+ paths, case-sensitive model filenames, and sqlite-vec JOIN incompatibilities.
665
+
666
+ ### Changes
667
+
668
+ - Indexing: native `realpathSync()` replaces `readlink -f` subprocess spawn
669
+ per file. On a 5000-file collection this eliminates 5000 shell spawns,
670
+ ~15% faster. #8 (thanks @burke)
671
+ - Indexing: single-pass tokenization — chunking algorithm tokenized each
672
+ document twice (count then split); now tokenizes once and reuses. #9
673
+ (thanks @burke)
674
+
675
+ ### Fixes
676
+
677
+ - Fix `vsearch` and `query` hanging — sqlite-vec's virtual table doesn't
678
+ support the JOIN pattern used; rewrote to subquery. #23 (thanks @mbrendan)
679
+ - Fix MCP server exiting immediately after startup — process had no active
680
+ handles keeping the event loop alive. #29 (thanks @mostlydev)
681
+ - Fix collection filter SQL to properly restrict vector search results.
682
+ - Support non-ASCII filenames in collection filter.
683
+ - Skip empty files during indexing instead of crashing on zero-length content.
684
+ - Fix case sensitivity in Qwen3 model filename resolution. #15 (thanks
685
+ @gavrix)
686
+ - Fix sqlite-vec loading on macOS with Homebrew (`BREW_PREFIX` detection).
687
+ #42 (thanks @komsit37)
688
+ - Fix Nix flake to use correct `src/qmd.ts` path. #7 (thanks @burke)
689
+ - Fix docid lookup with quotes support in get command. #36 (thanks
690
+ @JoshuaLelon)
691
+ - Fix query expansion model size in documentation. #38 (thanks @odysseus0)
692
+
693
+ ## [0.6.0] - 2025-12-28
694
+
695
+ Replaced Ollama HTTP API with node-llama-cpp for all LLM operations. Ollama
696
+ adds convenience but also a running server dependency. node-llama-cpp loads
697
+ GGUF models directly in-process — zero external dependencies. Models
698
+ auto-download from HuggingFace on first use.
699
+
700
+ ### Changes
701
+
702
+ - LLM: structured query expansion via JSON schema grammar constraints.
703
+ Model produces typed expansions — **lexical** (BM25 keywords), **vector**
704
+ (semantic rephrasings), **HyDE** (hypothetical document excerpts) — so each
705
+ routes to the right backend instead of sending everything everywhere.
706
+ - LLM: lazy model loading with 2-minute inactivity auto-unload. Keeps memory
707
+ low when idle while avoiding ~3s model load on every query.
708
+ - Search: conditional query expansion — when BM25 returns strong results, the
709
+ expensive LLM expansion is skipped entirely.
710
+ - Search: multi-chunk reranking — documents with multiple relevant chunks
711
+ scored by aggregating across all chunks rather than best single chunk.
712
+ - Search: cosine distance for vector search (was L2).
713
+ - Search: embeddinggemma nomic-style prompt formatting.
714
+ - Testing: evaluation harness with synthetic test documents and Hit@K metrics
715
+ for BM25, vector, and hybrid RRF.
716
+
717
+ ## [0.5.0] - 2025-12-13
718
+
719
+ Collections and contexts moved from SQLite tables to YAML at
720
+ `~/.config/qmd/index.yml`. SQLite was overkill for config — you can't share
721
+ it, and it's opaque. YAML is human-readable and version-controllable. The
722
+ migration was extensive (35+ commits) because every part of the system that
723
+ touched collections or contexts had to be updated.
724
+
725
+ ### Changes
726
+
727
+ - Config: YAML-based collections and contexts replace SQLite tables.
728
+ `collections` and `path_contexts` tables dropped from schema. Collections
729
+ support an optional `update:` command (e.g., `git pull`) before re-index.
730
+ - CLI: `qmd collection add/list/remove/rename` commands with `--name` and
731
+ `--mask` glob pattern support.
732
+ - CLI: `qmd ls` virtual file tree — list collections, files in a collection,
733
+ or files under a path prefix.
734
+ - CLI: `qmd context add/list/check/rm` with hierarchical context inheritance.
735
+ A query to `qmd://notes/2024/jan/` inherits context from `notes/`,
736
+ `notes/2024/`, and `notes/2024/jan/`.
737
+ - CLI: `qmd context add / "text"` for global context across all collections.
738
+ - CLI: `qmd context check` audit command to find paths without context.
739
+ - Paths: `qmd://` virtual URI scheme for portable document references.
740
+ `qmd://notes/ideas.md` works regardless of where the collection lives on
741
+ disk. Works in `get`, `multi-get`, `ls`, and context commands.
742
+ - CLI: document IDs (docid) — first 6 chars of content hash for stable
743
+ references. Shown as `#abc123` in search results, usable with `get` and
744
+ `multi-get`.
745
+ - CLI: `--line-numbers` flag for get command output.
746
+
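The hierarchical context inheritance described above can be sketched as enumerating every path prefix of a `qmd://` URI. `contextPrefixes` is a hypothetical helper, it handles directory-style URIs only, and it leaves out the global `/` context.

```typescript
// List every path prefix whose context applies to a qmd:// URI,
// from the collection root down to the full path.
function contextPrefixes(uri: string): string[] {
  const scheme = "qmd://";
  if (!uri.startsWith(scheme)) {
    throw new Error(`not a qmd:// URI: ${uri}`);
  }
  const segments = uri
    .slice(scheme.length)
    .split("/")
    .filter((segment) => segment !== "");
  const prefixes: string[] = [];
  let current = "";
  for (const segment of segments) {
    current += `${segment}/`;
    prefixes.push(current);
  }
  return prefixes;
}

const janPrefixes = contextPrefixes("qmd://notes/2024/jan/");
```

For `qmd://notes/2024/jan/` this yields `notes/`, `notes/2024/`, and `notes/2024/jan/`, matching the example in the entry.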
747
+ ## [0.4.0] - 2025-12-10
748
+
749
+ MCP server for AI agent integration. Without it, agents had to shell out to
750
+ `qmd search` and parse CLI output. The monolithic `qmd.ts` (1840 lines) was
751
+ split into focused modules with the project's first test suite (215 tests).
752
+
753
+ ### Changes
754
+
755
+ - MCP: stdio server with tools for search, vector search, hybrid query,
756
+ document retrieval, and status. Runs over stdio transport for Claude
757
+ Desktop and MCP clients.
758
+ - MCP: spec-compliant with June 2025 MCP specification — removed non-spec
759
+ `mimeType`, added `isError: true` to errors, `structuredContent` for
760
+ machine-readable results, proper URI encoding.
761
+ - MCP: simplified tool naming (`qmd_search` → `search`) since MCP already
762
+ namespaces by server.
763
+ - Architecture: extract `store.ts` (1221 LOC), `llm.ts` (539 LOC),
764
+ `formatter.ts` (359 LOC), `mcp.ts` (503 LOC) from monolithic `qmd.ts`.
765
+ - Testing: 215 tests (store: 96, llm: 60, mcp: 59) with mocked Ollama for
766
+ fast, deterministic runs. Before this: zero tests.
767
+
768
+ ## [0.3.0] - 2025-12-08
769
+
770
+ Document chunking for vector search. A 5000-word document about many topics
771
+ gets a single embedding that averages everything together, matching poorly for
772
+ specific queries. Chunking produces one embedding per ~900-token section with
773
+ focused semantic signal.
774
+
775
+ ### Changes
776
+
777
+ - Search: markdown-aware chunking — prefers heading boundaries, then paragraph
778
+ breaks, then sentence boundaries. 15% overlap between chunks ensures
779
+ cross-boundary queries still match.
780
+ - Search: multi-chunk scoring bonus (+0.02 per additional chunk, capped at
781
+ +0.1 for 5+ chunks). Documents relevant in multiple sections rank higher.
782
+ - CLI: display paths show collection-relative paths and extracted titles
783
+ (from H1 headings or YAML frontmatter) instead of raw filesystem paths.
784
+ - CLI: `--all` flag returns all matches (use with `--min-score` to filter).
785
+ - CLI: byte-based progress bar with ETA for `embed` command.
786
+ - CLI: human-readable time formatting ("15m 4s" instead of "904.2s").
787
+ - CLI: documents >64KB truncated with warning during embedding.
788
+
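The multi-chunk scoring bonus above is a small capped linear function. The sketch below reads "additional chunk" as "each matching chunk beyond the first", which is an assumption; under the other reading, where the first chunk also counts, the cap would be reached at five chunks instead of six. `multiChunkBonus` is a hypothetical name.

```typescript
// Bonus for documents with several matching chunks: +0.02 per chunk beyond
// the first, capped at +0.1. Treating the first chunk as earning no bonus
// is an assumption of this sketch.
function multiChunkBonus(matchingChunks: number): number {
  if (matchingChunks <= 1) return 0;
  return Math.min(0.02 * (matchingChunks - 1), 0.1);
}

const bonusForThree = multiChunkBonus(3);
const bonusForTen = multiChunkBonus(10);
```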
789
+ ## [0.2.0] - 2025-12-08
790
+
791
+ ### Changes
792
+
793
+ - CLI: `--json`, `--csv`, `--files`, `--md`, `--xml` output format flags.
794
+ `--json` for programmatic access, `--files` for piping, `--md`/`--xml` for
795
+ LLM consumption, `--csv` for spreadsheets.
796
+ - CLI: `qmd status` shows index health — document count, size, embedding
797
+ coverage, time since last update.
798
+ - Search: weighted RRF — original query gets 2x weight relative to expanded
799
+ queries since the user's actual words are a more reliable signal.
800
+
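Weighted RRF, as described above, can be sketched in a few lines. The `RankedList` shape and `weightedRrf` are illustrative names, and the constant `k = 60` is the value commonly used with Reciprocal Rank Fusion; the changelog does not say which value QMD uses.

```typescript
// One ranked result list plus the weight its ranks contribute.
interface RankedList {
  ids: string[]; // document ids, best first
  weight: number; // e.g. 2 for the original query, 1 for expanded queries
}

// Weighted Reciprocal Rank Fusion: score(d) = sum over lists of weight / (k + rank).
// k = 60 is an assumed default, not a value taken from QMD.
function weightedRrf(lists: RankedList[], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.ids.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + list.weight / (k + rank));
    });
  }
  return scores;
}

// Demo: the original query ranks "a" first, an expanded query ranks "b" first.
const fusedScores = weightedRrf([
  { ids: ["a", "b"], weight: 2 }, // original query
  { ids: ["b", "a"], weight: 1 }, // expanded query
]);
```

Because the original query carries twice the weight, "a" ends up ahead of "b" even though the two lists disagree.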
801
+ ## [0.1.0] - 2025-12-07
802
+
803
+ Initial implementation. Built in a single day for searching personal markdown
804
+ notes, journals, and meeting transcripts.
805
+
806
+ ### Changes
807
+
808
+ - Search: SQLite FTS5 with BM25 ranking. Chose SQLite over Elasticsearch
809
+ because QMD is a personal tool — single binary, no server dependencies.
810
+ - Search: sqlite-vec for vector similarity. Same rationale: in-process, no
811
+ external vector database.
812
+ - Search: Reciprocal Rank Fusion to combine BM25 and vector results. RRF is
813
+ parameter-free and handles missing signals gracefully.
814
+ - LLM: Ollama for embeddings, reranking, and query expansion. Later replaced
815
+ with node-llama-cpp in 0.6.0.
816
+ - CLI: `qmd add`, `qmd embed`, `qmd search`, `qmd vsearch`, `qmd query`,
817
+ `qmd get`. ~1800 lines of TypeScript in a single `qmd.ts` file.
818
+
819
+ [Unreleased]: https://github.com/tobi/qmd/compare/v1.0.0...HEAD
820
+ [1.0.0]: https://github.com/tobi/qmd/releases/tag/v1.0.0
821
+ [0.9.0]: https://github.com/tobi/qmd/compare/v0.8.0...v0.9.0