@oomkapwn/enquire-mcp 3.10.0-rc.4 → 3.10.0-rc.41

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (82)
  1. package/CHANGELOG.md +973 -0
  2. package/README.md +64 -41
  3. package/README.zh.md +206 -0
  4. package/SECURITY.md +2 -1
  5. package/dist/bases.d.ts.map +1 -1
  6. package/dist/bases.js +28 -3
  7. package/dist/bases.js.map +1 -1
  8. package/dist/cli.d.ts.map +1 -1
  9. package/dist/cli.js +136 -7
  10. package/dist/cli.js.map +1 -1
  11. package/dist/doctor.d.ts +18 -1
  12. package/dist/doctor.d.ts.map +1 -1
  13. package/dist/doctor.js +15 -6
  14. package/dist/doctor.js.map +1 -1
  15. package/dist/dql.d.ts.map +1 -1
  16. package/dist/dql.js +7 -1
  17. package/dist/dql.js.map +1 -1
  18. package/dist/embed-db.d.ts +14 -0
  19. package/dist/embed-db.d.ts.map +1 -1
  20. package/dist/embed-db.js +36 -9
  21. package/dist/embed-db.js.map +1 -1
  22. package/dist/embed-pipeline.d.ts.map +1 -1
  23. package/dist/embed-pipeline.js +8 -2
  24. package/dist/embed-pipeline.js.map +1 -1
  25. package/dist/embeddings.d.ts +37 -24
  26. package/dist/embeddings.d.ts.map +1 -1
  27. package/dist/embeddings.js +93 -4
  28. package/dist/embeddings.js.map +1 -1
  29. package/dist/eval.d.ts +47 -0
  30. package/dist/eval.d.ts.map +1 -1
  31. package/dist/eval.js +66 -11
  32. package/dist/eval.js.map +1 -1
  33. package/dist/fts5.d.ts +15 -0
  34. package/dist/fts5.d.ts.map +1 -1
  35. package/dist/fts5.js +46 -2
  36. package/dist/fts5.js.map +1 -1
  37. package/dist/hnsw.d.ts.map +1 -1
  38. package/dist/hnsw.js +16 -3
  39. package/dist/hnsw.js.map +1 -1
  40. package/dist/http-transport.d.ts +27 -0
  41. package/dist/http-transport.d.ts.map +1 -1
  42. package/dist/http-transport.js +107 -63
  43. package/dist/http-transport.js.map +1 -1
  44. package/dist/index.d.ts +1 -1
  45. package/dist/index.d.ts.map +1 -1
  46. package/dist/index.js +1 -1
  47. package/dist/index.js.map +1 -1
  48. package/dist/parser.d.ts +6 -0
  49. package/dist/parser.d.ts.map +1 -1
  50. package/dist/parser.js +11 -0
  51. package/dist/parser.js.map +1 -1
  52. package/dist/server.d.ts +9 -0
  53. package/dist/server.d.ts.map +1 -1
  54. package/dist/server.js +72 -55
  55. package/dist/server.js.map +1 -1
  56. package/dist/shutdown.d.ts +41 -0
  57. package/dist/shutdown.d.ts.map +1 -0
  58. package/dist/shutdown.js +60 -0
  59. package/dist/shutdown.js.map +1 -0
  60. package/dist/staleness.d.ts +28 -0
  61. package/dist/staleness.d.ts.map +1 -1
  62. package/dist/staleness.js +38 -3
  63. package/dist/staleness.js.map +1 -1
  64. package/dist/tool-registry.d.ts +11 -1
  65. package/dist/tool-registry.d.ts.map +1 -1
  66. package/dist/tool-registry.js +18 -6
  67. package/dist/tool-registry.js.map +1 -1
  68. package/dist/tools/meta.d.ts +41 -0
  69. package/dist/tools/meta.d.ts.map +1 -1
  70. package/dist/tools/meta.js +329 -28
  71. package/dist/tools/meta.js.map +1 -1
  72. package/dist/tools/search.d.ts +108 -2
  73. package/dist/tools/search.d.ts.map +1 -1
  74. package/dist/tools/search.js +239 -11
  75. package/dist/tools/search.js.map +1 -1
  76. package/dist/watcher.d.ts.map +1 -1
  77. package/dist/watcher.js +42 -10
  78. package/dist/watcher.js.map +1 -1
  79. package/docs/COMPARISON.md +19 -4
  80. package/docs/api.md +9 -4
  81. package/docs/benchmarks.md +1 -1
  82. package/package.json +10 -6
package/CHANGELOG.md CHANGED
@@ -2,6 +2,979 @@
2
2
 
3
3
  All notable changes to this project will be documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
4
4
 
5
+ ## [3.10.0-rc.41] — 2026-06-09
6
+
7
+ > **TL;DR:** **Counter-positioning docs (caura-memclaw study output) — sharpens the "Grounded, not extracted" brand against the *server-fleet*-memory category.** The hero framing (README + COMPARISON) already distinguished enquire from the *chat-memory* cohort (mem0/Zep/Supermemory — which *extract* facts from chat logs). It now ALSO distinguishes from multi-tenant *fleet*-memory platforms (server-side stores that paraphrase agent traffic into a shared database): enquire is **single-user, local-first, zero cloud calls during serve** — one vault you own, read, edit, and delete yourself. Docs-only; no count or claim-surface change. **1177 source tests unchanged.**
8
+
9
+ **Pre-release (v3.10 line) — positioning.**
10
+
11
+ ### Changed
12
+
13
+ - **`README.md` hero + `docs/COMPARISON.md` intro** — the "Grounded, not extracted" framing gains a server-fleet-memory contrast clause. Prompted by a competitive study (caura-ai/caura-memclaw, a multi-tenant fleet-memory platform): the existing copy countered only the conversation-memory cohort, leaving the fleet-memory category unaddressed. No new factual/numeric claim — "zero cloud calls during serve" is the existing SECURITY.md/package.json claim, restated in the positioning context.
14
+
15
+ ### Notes
16
+
17
+ - Marketing assets (a problem-first content blog, a landing page, Discord) remain maintainer-driven and are **never** committed to this public repo (standing rule). The one remaining borrowable *technical* idea from the study — closed-loop retrieval feedback (a "Karpathy Loop" `mark_useful`-style tool) — is a net-new persistent-state feature with data-at-rest + right-to-erasure implications; it's captured for the v3.10.x feature line and best sequenced after the v3.10 → `@latest` promotion (so the promotion audit isn't complicated by a fresh feature) and with a dedicated privacy-design pass.
18
+
19
+ ---
20
+
21
+ ## [3.10.0-rc.40] — 2026-06-09
22
+
23
+ > **TL;DR:** **Closes the entire LOW/INFO tail of the wq9ml34gr workflow-audit — the audit is now fully shipped (1 CRIT + 4 MED + 7 LOW/INFO, all closed).** Seven low-stakes items: two watcher concurrency-hardening fixes (close-window event race + HNSW dirty-flag race — both were lost-fast-reload-only, the signature-guard already rebuilt), `--stale-days` doc/help/TSDoc honesty (it tunes recency re-ranking only — the `stale` flag is a hardcoded 365), two test-infra invariant broadenings (K-3 derives write handlers from source; resource-bound detects parallel-fanout scanners), eval `hits_relevant` dedup (INFO), and dropping an unused `id-token: write` grant. **1174 → 1177 source tests.**
24
+
25
+ **Pre-release (v3.10 line) — audit tail (LOW/INFO).**
26
+
27
+ ### Fixed
28
+
29
+ - **`src/watcher.ts` #6 — close()-window event race.** `close()` now stops the chokidar watcher FIRST (before draining the queue + flushing), and `onChange`/`handle` early-return when `closed` — so an edit landing mid-shutdown can't apply a live HNSW diff the just-persisted sidecar wouldn't reflect (pre-rc.40: a lost fast-reload; the signature-guard rebuilt on next serve, no corruption).
30
+ - **`src/watcher.ts` #7 — flushHnswToDisk dirty-flag race.** Clears `hnswDirty` BEFORE the `saveTo` await (re-set to true on failure) so a concurrent `applyDiff` that re-marks dirty during the write isn't clobbered by a late `= false` — the index stays correctly dirty → next serve rebuilds rather than trusting a sidecar that predates the diff.
31
+ - **`--stale-days` honesty (#9)** — `src/cli.ts` help, `src/server.ts` `ServeOptions` TSDoc, and `docs/api.md` no longer claim `--stale-days` is "the threshold behind the `stale` flag." It only tunes the recency RE-ranking half-life (active when `--recency-weight > 0`); the `stale` flag on hits always uses the fixed 365-day default. (Behavior was always correct + test-pinned; the docs over-claimed.)
32
+ - **`src/eval.ts` #13 (INFO)** — `hits_relevant` now counts DISTINCT relevant paths (Set), mirroring the rc.33 dedup in `recallAtK`/`ndcgAtK`, so a duplicate path can't print `N/M` with N>M. Unreachable at the default note granularity (paths unique); pins the contract for block-granularity callers.
33
+ - **`.github/workflows/dist-tag-cleanup.yml` #14 (INFO)** — dropped the unused `id-token: write` permission (dist-tag rm auths via `NPM_TOKEN`; no OIDC step). Least-privilege.
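The dirty-flag ordering described for watcher fix #7 can be sketched as follows. This is a minimal illustration, not the package's code: `HnswState`, `flushHnsw`, and the injected `saveTo` callback are hypothetical stand-ins for whatever `src/watcher.ts` actually uses.

```typescript
// Hypothetical shape: only the dirty flag matters for this sketch.
interface HnswState {
  dirty: boolean;
}

// Sketch of "clear before the await, re-set on failure".
// `saveTo` stands in for the real persistence call (assumption).
async function flushHnsw(
  state: HnswState,
  saveTo: () => Promise<void>,
): Promise<void> {
  if (!state.dirty) return;
  // Clear BEFORE awaiting: a concurrent diff that re-marks the index dirty
  // while the write is in flight must not be overwritten afterwards.
  state.dirty = false;
  try {
    await saveTo();
  } catch (err) {
    // The sidecar was not written, so the index is still dirty.
    state.dirty = true;
    throw err;
  }
}
```

With the opposite ordering (`await saveTo(); state.dirty = false;`), a diff applied during the write would be silently marked clean, which is the race the entry describes.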
34
+
35
+ ### Tests (1177)
36
+
37
+ - **K-3 invariant #11** — new `fsMutatingExports(src)` derives fs/vault-mutating exported handlers from `write.ts` source; asserts ⊆ `KNOWN_WRITE_HANDLERS` so a NEW mutating handler wired under READ_ONLY (which would falsely advertise `readOnlyHint`) fails CI — closing the "did-we-remember-to-add-it" gap the erasure/resource-bound inventory invariants close. + NEGATIVE control (flags an untracked `fs.unlink` export).
38
+ - **resource-bound invariant #12** — `discoverScanners` now also matches parallel-fanout iteration (`Promise.all` / `.map(async` / `for await`), not only a literal `for (`, so a whole-vault reader written as a pure `Promise.all(entries.map(async …readNote))` can't escape the cap-or-exempt gate. + NEGATIVE fixture. (No new scanner discovered in current src — confirmed no cascade.)
39
+ - +3 source `it()` (K-3 ×2, resource-bound ×1); docs count 1174 → 1177. The watcher #6/#7 fixes are covered by the existing watcher invariant tests (which still pass — the close still drains + flushes; behavior hardened, not changed).
40
+
41
+ ### Method
42
+
43
+ - The full LOW/INFO tail from the rc.36 multi-agent behavioral/threat audit, batched into one RC (not N reactive patches). With this, every confirmed finding (1 CRIT rc.36 · 4 MED rc.37/38 · 7 LOW/INFO rc.40 · the rc.39 sink-bound from the mandated re-sweep) is shipped — the audit is closed.
44
+
45
+ ---
46
+
47
+ ## [3.10.0-rc.39] — 2026-06-09
48
+
49
+ > **TL;DR:** **The mandated post-rc.36 ReDoS re-sweep + its architectural fix (HIGH) — permanently ends the 4×-recurring ReDoS class.** A broader-generator 20,000-pattern re-sweep confirmed rc.36's specific fix is sound (all known shapes flagged, default accepted, no regression) **but surfaced the inherent undecidable residual: 80 SAFE-classified nested patterns genuinely hang V8** — e.g. the 37-char `\W?(([ca]*?){0,3}|c{2,5}b{2,5}){0,3}$` (confirmed: hangs indefinitely on a ~50-char line). This is NOT an rc.36 regression — it's the long-standing limit of *any* static ReDoS denylist (the detection problem is undecidable). Rather than chase 80 more hand-rules (the EDA rabbit hole CLAUDE.md forbids), the fix **bounds the SINK**: `obsidian_open_questions` now matches a caller-supplied pattern on a **worker thread with a hard wall-clock budget** (`MAX_QUESTION_SCAN_MS`=5000), so the main event loop can **never** hang for any pattern, and a pattern that blows the budget is rejected fail-closed. `isCatastrophicRegex` stays as the cheap pre-filter; the safe default pattern stays inline (zero overhead). **1170 → 1174 source tests.**
50
+
51
+ **Pre-release (v3.10 line) — security: ReDoS sink-bound (the architectural class-ender).**
52
+
53
+ ### Security
54
+
55
+ - **`src/tools/meta.ts` — hard ReDoS sink-bound for `obsidian_open_questions` (HIGH; closes the static-detector residual the rc.36 re-sweep confirmed).** New `matchLinesBounded(pattern, lines, budgetMs)` runs the caller pattern's per-line matching on a **worker thread** and races a `budgetMs` timeout: the worker isolates V8's backtracking off the main event loop, and on timeout it's terminated and the request is rejected fail-closed (`"matching exceeded the Nms safe budget"`). `getOpenQuestions` is refactored to collect candidate lines + metadata, then match via the worker (caller pattern) or inline (safe default). An invalid pattern rejects with a clear error. **This makes detector completeness moot — no pattern, however crafted, can hang the server.**
56
+ - Kept `isCatastrophicRegex` + `MAX_QUESTION_PATTERN_LEN` as the cheap pre-filter (rejects obvious shapes without spawning a worker). The worker-budget is the hard backstop for everything the best-effort denylist misses (ReDoS detection is undecidable). New `MAX_QUESTION_SCAN_MS` constant; an optional `scanBudgetMs` arg (NOT in the MCP tool schema, so not caller-settable) lets tests use a short budget.
57
+ - **Why the architectural fix over more rules:** the re-sweep proved a static analyzer for an undecidable property is forever incomplete (rc.21/24/25/36 each closed a shape; the next is always findable). Bounding the sink ends the class permanently — the project's own "fix the class architecturally, don't chase EDA-precise detection" rule.
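The sink-bound idea above (match on a worker thread, race a wall-clock budget, reject fail-closed) might look roughly like this. It is a sketch under stated assumptions: the changelog names `matchLinesBounded(pattern, lines, budgetMs)` but does not show its body, so the inline `WORKER_SOURCE` string, the `LineHit` shape, and the use of `eval: true` are illustrative choices, not the package's implementation.

```typescript
import { Worker } from "node:worker_threads";

// Hypothetical result shape for one matching line.
interface LineHit {
  index: number;
  capture: string;
}

// Worker body as plain JavaScript so the sketch stays in a single file (assumption).
const WORKER_SOURCE = `
const { parentPort, workerData } = require("node:worker_threads");
const re = new RegExp(workerData.pattern);
const hits = [];
workerData.lines.forEach((line, i) => {
  const m = re.exec(line);
  if (m) hits.push({ index: i, capture: m[1] ?? m[0] });
});
parentPort.postMessage(hits);
`;

function matchLinesBounded(
  pattern: string,
  lines: string[],
  budgetMs: number,
): Promise<LineHit[]> {
  return new Promise((resolve, reject) => {
    // Reject an invalid pattern up front, before spawning anything.
    try {
      new RegExp(pattern);
    } catch {
      reject(new Error("invalid pattern"));
      return;
    }
    const worker = new Worker(WORKER_SOURCE, {
      eval: true,
      workerData: { pattern, lines },
    });
    // Backtracking happens on the worker, so this timer on the main
    // event loop still fires even if the regex never returns.
    const timer = setTimeout(() => {
      void worker.terminate();
      reject(new Error(`matching exceeded the ${budgetMs}ms safe budget`));
    }, budgetMs);
    worker.once("message", (hits: LineHit[]) => {
      clearTimeout(timer);
      resolve(hits);
    });
    worker.once("error", (err: Error) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}
```

The design point is that the timeout does not depend on classifying the pattern: a pattern the static pre-filter misses still costs at most `budgetMs` of one worker's time, and the request fails rather than returning partial results.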
58
+
59
+ ### Tests (1174)
60
+
61
+ - `tests/redos-guard.test.ts`: +4 — `matchLinesBounded` describe (safe pattern → first-capture matches; a pattern `isCatastrophicRegex` MISSES → rejected *within the budget*, asserting it did NOT hang; invalid pattern → clear error) + a `getOpenQuestions` end-to-end test that a detector-missed catastrophic pattern (base64-decoded at runtime for CodeQL js/redos hygiene) is rejected via the worker bound on a vault seeded with a long line. Docs test-count bumped 1170 → 1174.
62
+
63
+ ### Method
64
+
65
+ - The re-sweep is the MANDATED post-merge step for a recurring-class fix (it's how rc.24's fix surfaced rc.25's CRITICAL). It correctly returned a mixed verdict: rc.36 sound, but the residual real — leading to the architectural fix the maintainer approved over reactive detector-patching.
66
+
67
+ ---
68
+
69
+ ## [3.10.0-rc.38] — 2026-06-09
70
+
71
+ > **TL;DR:** **Correctness + resource batch (audit #5 + #2, both MEDIUM).** **#5:** a `.base` filter `not: '<unevaluated predicate>'` (a typo, or `inDate(...)` / any formula predicate — none of which `obsidian_query_base` evaluates) **returned EVERY row instead of none.** v3.6.2 HN-2 fail-closes an unevaluable predicate to `false` = "exclude the row"; `not` blindly negated that to `true` = "include" — silently over-including the whole scan, the exact over-inclusion HN-2 (and SECURITY.md) say is prevented, reachable through negation. Fixed by evaluating the `not` child against a **fresh `unevaluated` probe** (the real set is shared across rows, so a naive size-delta only catches the first row) and excluding if the child touched any unevaluable predicate. **#2:** the embedder / BGE-reranker **ONNX InferenceSession was rebuilt on EVERY query** — `loadEmbedder`/`loadReranker` re-ran `pipeline()` / `from_pretrained()` per call (~110–120MB session init each), so a long-lived `serve --enable-reranker` paid full model-init latency per `obsidian_search` and N concurrent authenticated queries spun up N simultaneous sessions (a memory-spike vector + a direct hit to the headline sub-10ms claim). Now **cached per-alias** at the module level. **1168 → 1170 source tests** (+2 for #5; #2's behavioral path is gated-smoke + build-verified, as the whole model path is).
72
+
73
+ **Pre-release (v3.10 line) — correctness + resource.**
74
+
75
+ ### Fixed
76
+
77
+ - **`src/bases.ts` — `not:` inverted the fail-closed semantics for unevaluated predicates (#5, MEDIUM correctness).** `evalFilter`'s `if ("not" in f) return !evalFilter(f.not, ctx)` negated the `false` that an unknown/typo/unparseable predicate (incl. `inDate(...)`) fail-closes to — turning "exclude" into "include every row." Now: `const probe = new Set(); const inner = evalFilter(f.not, { ...ctx, unevaluated: probe }); merge probe → ctx.unevaluated; if (probe.size > 0) return false; return !inner;`. A fresh probe is required because `ctx.unevaluated` is shared across all rows, so a same-set size delta only fires on the first row that hits the predicate (the bug the first fix attempt hit, caught by the new test). Contradicted SECURITY.md ("Predicate strings that don't match any pattern … treated as false (fail-closed … exclude the row rather than over-include it)").
78
+ - **`src/embeddings.ts` — embedder/reranker ONNX session rebuilt per query (#2, MEDIUM resource/latency).** `loadEmbedder` / `loadReranker` now check a module-level `Map<alias, Promise<Embedder|Reranker>>` and reuse the handle; the build is extracted to private `buildEmbedder` / `buildReranker`. The promise-cache also collapses a concurrent first-load thundering-herd, and a rejected load is evicted so a later call can retry. The `rerankerOverride` test seam (search.ts) is unaffected (it bypasses `loadReranker`).
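The `not:` fix for `src/bases.ts` can be illustrated with a small self-contained evaluator. The `Filter` and `EvalCtx` shapes and the toy `evalPredicate` (which understands only `key == "value"`) are assumptions made for this sketch; only the fresh-probe logic inside the `not` branch follows the entry.

```typescript
// Hypothetical filter shape: a predicate string, a negation, or a conjunction.
type Filter = string | { not: Filter } | { and: Filter[] };

interface EvalCtx {
  row: Record<string, unknown>;
  // Shared across ALL rows of a scan, as the entry describes.
  unevaluated: Set<string>;
}

// Toy predicate evaluator (assumption): anything other than `key == "value"`
// is recorded as unevaluated and fail-closes to false.
function evalPredicate(pred: string, ctx: EvalCtx): boolean {
  const m = /^(\w+)\s*==\s*"([^"]*)"$/.exec(pred.trim());
  if (!m) {
    ctx.unevaluated.add(pred);
    return false;
  }
  return String(ctx.row[m[1]]) === m[2];
}

function evalFilter(f: Filter, ctx: EvalCtx): boolean {
  if (typeof f === "string") return evalPredicate(f, ctx);
  if ("not" in f) {
    // Fresh probe per evaluation: the shared set already contains the
    // predicate after the first row, so a size delta on it would miss later rows.
    const probe = new Set<string>();
    const inner = evalFilter(f.not, { ...ctx, unevaluated: probe });
    probe.forEach((p) => ctx.unevaluated.add(p));
    if (probe.size > 0) return false; // stay fail-closed under negation
    return !inner;
  }
  return f.and.every((child) => evalFilter(child, ctx));
}
```

Negating a known predicate still works as before; negating something the evaluator cannot interpret now excludes the row instead of including every row.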
79
+
80
+ ### Tests (1170)
81
+
82
+ - `tests/bases.test.ts`: +2 — `not:` over an unevaluated `inDate(...)` and over a typo'd predicate both fail-closed to 0 matches (the existing "filters via not" over a KNOWN predicate is the positive control). #2's cache is build-verified (types) + exercised by the gated `reranker-smoke` (which runs `buildReranker`); the model-load path is gated behind real weights codebase-wide, so no flaky real-load unit test was added. Docs count bumped 1168 → 1170.
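The per-alias promise cache mentioned for #2 could be sketched like this. The entry describes a module-level `Map<alias, Promise<…>>`; the generic `cachedLoader` factory and the stand-in `buildEmbedder` below are hypothetical and exist only to keep the example self-contained.

```typescript
// Wrap an expensive async builder so each alias is built at most once.
function cachedLoader<T>(
  build: (alias: string) => Promise<T>,
): (alias: string) => Promise<T> {
  const cache = new Map<string, Promise<T>>();
  return (alias) => {
    const hit = cache.get(alias);
    if (hit) return hit; // concurrent callers share one in-flight load
    const pending = build(alias).catch((err: unknown) => {
      // Evict a failed load so a later call can retry.
      cache.delete(alias);
      throw err;
    });
    cache.set(alias, pending);
    return pending;
  };
}

// Stand-in for the real model initialisation (assumption).
interface Embedder {
  alias: string;
}
async function buildEmbedder(alias: string): Promise<Embedder> {
  return { alias };
}

const loadEmbedder = cachedLoader(buildEmbedder);
```

Caching the promise rather than the resolved value is what collapses a burst of first-time callers into a single build.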
83
+
84
+ ### Method
85
+
86
+ - Continuation of the rc.36 multi-agent behavioral/threat audit (correctness/resource dimensions). The two MEDIUMs are orthogonal to the rc.36 ReDoS detector and the rc.37 erasure paths. Shipped after rc.37 confirmed published on `@rc`.
87
+
88
+ ---
89
+
90
+ ## [3.10.0-rc.37] — 2026-06-09
91
+
92
+ > **TL;DR:** **Privacy / right-to-erasure batch (audit #3/#4 MEDIUM + #8 LOW).** The cross-vault `prune` GC silently **left full note bodies on disk forever**: its whitelist regex (`ENQUIRE_CACHE_ARTIFACT`) omitted the `<hash>.json` parse cache (written by `saveDiskCache`, holds every note's raw body), so decommissioning a vault and running `prune` deleted its `.fts5.db`/`.embed.db`/HNSW sidecars but kept its full-text `.json` (and any `.json.tmp`) — a GDPR-shaped right-to-erasure gap (same class as rc.34 P-2 / rc.36 F-2). Fixed the regex to cover `json` + the `.tmp` leftover, and — the structural half — the **erasure-completeness invariant now patrols the `prune` eraser too** (it previously only checked the 3 per-vault `clear-*` erasers; the `prune` surface was unguarded, which is exactly why #3 shipped). Plus #8: an emptied `--use-hnsw` embed-db now erases its stale `.hnsw.bin` + `.hnsw.meta.json` sidecars (the `.meta.json` carries deleted notes' `text_preview`). All local-only (mode-0600, no remote disclosure), behind opt-in preconditions. **1164 → 1168 source tests** (+4).
93
+
94
+ **Pre-release (v3.10 line) — privacy / right-to-erasure.**
95
+
96
+ ### Security / privacy
97
+
98
+ - **`src/fts5.ts` — `prune` left the `.json` parse cache (full note bodies) on disk (#3, MEDIUM right-to-erasure).** `ENQUIRE_CACHE_ARTIFACT` now matches `<hash>.{json,fts5.db,embed.db,hnsw.bin,hnsw.meta.json}` + the `-wal`/`-shm`/`.tmp` sidecars (was missing `json` and `.tmp`). A decommissioned vault's `<hash>.json` (and any crash-left `<hash>.json.tmp`), both holding raw note bodies, are now GC'd by `prune` like every other family. Help text (`cli.ts`) + `docs/api.md` updated to list the `.json` family.
99
+ - **`tests/erasure-invariant.test.ts` — the erasure invariant now patrols the cross-vault `prune` eraser (#4, MEDIUM structural).** It previously asserted only that the 3 per-vault `clear-*` erasers reference every artifact suffix; the `prune` whitelist was a 4th deletion authority with NO coverage — which is why #3 shipped undetected. New "prune covers every per-vault writer family" block asserts `planCachePrune` selects a representative filename of each writer family (`.json`, `.json.tmp`, `.fts5.db`, WAL, `.embed.db`, `.hnsw.bin`, `.hnsw.meta.json`) for an OTHER vault, with a NEGATIVE control that replays the literal pre-rc.37 regex (missing `.json`) and proves it leaves the parse cache. `writers ⊆ prune-eraser`, structurally.
100
+ - **`src/server.ts` — emptied `--use-hnsw` embed-db left stale HNSW sidecars (#8, LOW right-to-erasure residual).** When `getAllVectors()` is empty no index is built, so there was no `saveTo` to overwrite a prior `<base>.hnsw.bin` + `.hnsw.meta.json` — and the `.meta.json` carries deleted notes' `text_preview`. The empty branch now unlinks both sidecars (best-effort, persist-gated), mirroring `EmbedDb.clearOnDisk`'s sidecar-erase minus deleting the (valid, empty) db.
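A whitelist covering the artifact families listed for the `prune` fix might be shaped like the pattern below. The real `ENQUIRE_CACHE_ARTIFACT` regex is not shown in the changelog, so the hex-hash assumption, the exact pattern, and the `planPruneSketch` helper are hypothetical.

```typescript
// Hypothetical whitelist: <hash>.<family>, optionally followed by a sidecar suffix.
// Assumes the per-vault hash is lowercase hex.
const CACHE_ARTIFACT =
  /^([0-9a-f]+)\.(?:json|fts5\.db|embed\.db|hnsw\.bin|hnsw\.meta\.json)(?:-wal|-shm|\.tmp)?$/;

// Select cache files that belong to vaults no longer considered live.
function planPruneSketch(fileNames: string[], liveHashes: Set<string>): string[] {
  return fileNames.filter((name) => {
    const m = CACHE_ARTIFACT.exec(name);
    // Anything that is not a recognised artifact is never touched.
    return m !== null && !liveHashes.has(m[1]);
  });
}
```

Because the eraser only deletes what the pattern matches, any writer family missing from the alternation is silently kept on disk, which is why the entry pairs the regex fix with an invariant that checks every writer family against it.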
101
+
102
+ ### Tests (1168)
103
+
104
+ - +4 source `it()`: `tests/cache-prune.test.ts` (the `.json` parse-cache coverage test), `tests/erasure-invariant.test.ts` (the prune-coverage family loop + its NEGATIVE control + the #8 server.ts structural assertion). Existing prune test's "all 4 types" case extended to "all 5 families + tmp". Docs test-count bumped 1164 → 1168 across README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP.
105
+
106
+ ### Method
107
+
108
+ - Continuation of the rc.36 multi-agent behavioral/threat audit (the privacy/erasure dimension). Each finding adversarially re-verified against current code before fixing. The orthogonal-module batch (no overlap with the rc.36 ReDoS detector) ships after rc.36 confirmed published on `@rc`.
109
+
110
+ ---
111
+
112
+ ## [3.10.0-rc.36] — 2026-06-09
113
+
114
+ > **TL;DR:** **CRITICAL ReDoS fix — the 4th recurrence of the class, found by a fresh behavioral/threat audit.** `isCatastrophicRegex` (the `obsidian_open_questions` guard) computed its catastrophe verdict ONLY when a quantified GROUP closed (`)` pop), so a **bare top-level run of adjacent overlapping unbounded quantifiers** — `\w*\w*\w*\w*\w*\w*\w*\w*$` (25 chars, under the 200-char cap) — was classified SAFE and compiled. On a single ~45-char word-run line it hangs V8 **~16s** (and `a*a*$` is ~1s at 2000 chars / ~68s at 8000), and `obsidian_open_questions` is **always-registered**, so any token-bearing client could freeze a `serve-http` instance for all clients. Fixed by evaluating a new `frameAdjacentOverlap` check on the TOP frame (never popped) **and** every group body. Overlap is decided by **probing actual single-char regex membership** (delegating the char-class truth-table to V8 — so disjoint broad pairs like `\d*\s*` / `[#.]+\s+` stay accepted while cross-class `\w*\d*` is caught), with a `.`-greedy **absorber tail-exemption** that keeps the shipped default `…\s*[:\-]?\s*(.+)$` safe (it is benign ONLY because `(.+)` absorbs the tail). **Overclaim #19**: the detector TSDoc claimed it "never UNDER-flags," which was false. **The durable fix is the fuzz**: the rc.25 generative ReDoS fuzz never caught this because its GENERATOR only emitted quantified groups — extended `genPattern` to emit bare top-level concatenations (SAFE corpus 43 → 390), so the next top-level under-flag fails CI empirically. Canonical source-`it()` count unchanged (1164); rc.36 adds 20 data-driven guard cases + a corpus-floor assertion.
115
+
116
+ **Pre-release (v3.10 line) — security: ReDoS recurrence #4 (CRITICAL).**
117
+
118
+ ### Security
119
+
120
+ - **`src/tools/meta.ts` `isCatastrophicRegex` — top-level adjacent-quantifier bypass (CRITICAL, remote DoS on bearer-auth `serve-http`).** The catastrophe `return true` lived only inside the `if (ch === ")")` pop branch; frame 0 is never popped, so a bare `\w*\w*…$` / `a*a*$` / `(a)*(a)*$` sequence reached `return false` unflagged. New `frameAdjacentOverlap(body, exemptByAbsorber)` walks atoms left-to-right and flags ≥2 adjacent (zero-width anchors + min-zero/nullable atoms transparent; a mandatory atom breaks the run) unbounded-quantified atoms with **overlapping** match sets. Called on the top frame (`exemptByAbsorber=true`, full tail visible → `.`-greedy absorber exemption applies) and on each group body at pop (`false` — external continuation unknown → over-flag; catches `(\w*\w*)x`).
121
+ - Overlap via **`atomsOverlap`**: probe `OVERLAP_PROBES` (a representative ASCII alphabet) against each atom's `^(?:atom)$` single-char matcher — no hand-maintained class truth table (which would itself be an under-flag bug surface). Correct for literals, `.`, char-classes, and shorthand overlaps (`\w`⊃`\d` → caught; `\w`∩`\s`=∅ → accepted).
122
+ - **Absorber exemption** (`isUniversalAbsorber` + `tailIsBenign`): a run followed by a `.`-greedy `.+`/`.*` (optionally grouped, e.g. `(.+)`) is benign — it consumes any tail and reaches the end-anchor without forced redistribution. This is precisely why the default `open_questions` pattern (`…\s*[:\-]?\s*(.+)$`) is SAFE (~0.1ms) while `…\s*[:\-]?\s*$` is CATASTROPHIC (~12s).
123
+ - **`tests/redos-fuzz.test.ts` — generator blind spot closed (the root reason rc.21/24/25 didn't end the class).** `genPattern` always wrapped output in a quantified group `(…)${q}$`, so the bare top-level shape that hung in rc.36 could never be generated → the empirical net had a hole exactly where the bug lived. Now ~40% of patterns are a bare `seq seq $` concatenation. The SAFE-classified corpus that reaches the timed worker grew **43 → 390**; added `expect(safeChecked).toBeGreaterThan(80)` + `expect(bareTopLevelSafe).toBeGreaterThan(20)` so corpus starvation (finding #10 from the audit) can't silently return.
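The probing approach behind `atomsOverlap` can be sketched in a few lines. The probe alphabet below is an assumption (the changelog only says "a representative ASCII alphabet"), and this toy version takes atoms as raw regex source strings rather than whatever parsed form the detector uses.

```typescript
// Hypothetical probe alphabet: letters, digits, whitespace, and some punctuation.
const OVERLAP_PROBES = "abcxyzABCXYZ0123456789 \t_-:.#/".split("");

// Anchor the atom so it must match exactly one probe character.
function singleCharMatcher(atom: string): RegExp {
  return new RegExp(`^(?:${atom})$`);
}

// Two atoms overlap if some probe character is accepted by both.
// The regex engine answers the class-membership question, so no
// hand-maintained truth table of shorthand classes is needed.
function atomsOverlap(a: string, b: string): boolean {
  const ra = singleCharMatcher(a);
  const rb = singleCharMatcher(b);
  return OVERLAP_PROBES.some((ch) => ra.test(ch) && rb.test(ch));
}
```

A probe-based check is only as complete as its alphabet: two atoms that overlap solely on a character outside `OVERLAP_PROBES` would be reported as disjoint in this sketch.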
124
+
125
+ ### Tests (1164)
126
+
127
+ - `tests/redos-guard.test.ts`: +12 catastrophic cases (`a*a*$`, `a*a*a*$`, `\w*…\w*$`, `\w*\w*$`, `.*.*$`, `\s*\s*$`, `\s*[:\-]?\s*$`, `a*x?a*$`, `\w*\d*$`, `(a)*(a)*$`, `(\w*\w*)x`, `a*a*b$`) + 8 safe-precision cases (`a*b*$`, `a*b*c*$`, `\d*\s*x`, `\w+\s+`, `[#.]+\s+`, `a*xa*$`, `\s*\s*(.+)$`, `\s*[:\-]?\s*(.+)$`). All are data-driven entries on the existing array-loop `it()`s, so the **canonical source-`it()` count is unchanged at 1164** (runtime test count rises; the two arrays are each other's positive/NEGATIVE controls).
128
+
129
+ ### Method
130
+
131
+ - Found by a fresh **multi-agent behavioral + threat-model audit** (7 dimensions, each finding adversarially re-verified against current code) — the external-lens, runtime-behavior sweep the drift/claim-driven home-grown gates are structurally blind to. The same audit surfaced a MEDIUM/LOW batch (embedder/reranker session cache, `prune` parse-cache right-to-erasure, `.base` `not:` fail-closed inversion, watcher close-window race, …) shipping in follow-up RCs.
132
+
133
+ ---
134
+
135
+ ## [3.10.0-rc.35] — 2026-06-08
136
+
137
+ > **TL;DR:** **README reorder (maintainer request) — lead with the competitive case.** Moved the **"🏆 Why it's the best"** section (the side-by-side comparison table + the "six features no other Obsidian-MCP has" framing + the Karpathy strategic claim) up to sit **immediately before "⚡ Quick start"** (right after "The solution"), so an evaluator sees the differentiation before the install steps. The hero's one-line `claude mcp add …` + the "30-second install" nav link keep "try it now" reachable from the very top, so the install isn't buried. Done as a **deterministic boundary-based move** (section heading → next heading), not a hand cut-paste, so the 20-row table relocated intact. **README-only — no code, no new/changed claims; the hero stat line stays the first `**N tools**` so the docs-consistency first-match regexes are unaffected; 1164 tests unchanged.**
138
+
139
+ **Pre-release (v3.10 line) — README section reorder.**
140
+
141
+ ### Changed
142
+
143
+ - **`README.md` — "🏆 Why it's the best" moved above "⚡ Quick start"** (was buried after "Use cases" / "When NOT to use it" / "API reference"). New top-level flow: hero → The problem → The solution → **Why it's the best** → Quick start → Set up in your AI agent → Use cases → When NOT to use it → API reference → How retrieval works → … No content changed — only the section's position. Separators verified (no double `---`); the removed-from spot (API reference → How retrieval works) closes cleanly.
144
+
145
+ ### Tests (1164)
146
+
147
+ - None — README-only reorder; no claims added/changed (docs-consistency green; the hero `**45 tools · …**` stat line remains the first tool-count match, so the relocated `**45 production tools**` comparison row doesn't shift any first-match regex). 1164 unchanged.
148
+
149
+ ### Files changed
150
+
151
+ - `README.md` (section reorder), `CLAUDE.md` (roll-up rc.34 → rc.35 + the reorder note).
152
+ - version bump 3.10.0-rc.34 → 3.10.0-rc.35.
153
+
154
+ ---
155
+
156
+ ## [3.10.0-rc.34] — 2026-06-08
157
+
158
+ > **TL;DR:** **RCA re-sweep of the rc.33 fix — the same bug had a SIBLING, and it was worse.** The post-rc.33 re-sweep (mandated by the project's "fix the class, not the instance" rule) found that `peekEmbedDbMeta` — the embed-db twin of the `peekFtsMetaSafe` function hardened in rc.33 — has the **identical `new Database()`-outside-the-try shape**, and it's called **UNGUARDED** in two hot spots the FTS one wasn't: `embeddingsSearch` (`tools/search.ts`, the peek runs *before* that function's own try/catch) and two CLI subcommands (`cli.ts`). So a **corrupt / unreadable / directory `.embed.db` would error the `embeddings_search` tool and crash those CLI subcommands** (vs the rc.33 FTS case, which only bit startup). Hardened it the same way: `new Database()` + the meta queries now sit inside one try → any failure returns `null` (treated as "no embed-db" — the existing graceful-degrade path). **`src/embed-db.ts` + tests only; +3 tests → 1164.** This is the re-sweep discipline paying off: the rc.33 instance fix's mandatory sibling-scan caught a higher-impact instance of the same class.
159
+
160
+ **Pre-release (v3.10 line) — post-rc.33 RCA re-sweep (peekEmbedDbMeta sibling).**
161
+
162
+ ### Fixed
163
+
164
+ - **`src/embed-db.ts` — `peekEmbedDbMeta` now truly never throws** (RCA sibling of rc.33's `peekFtsMetaSafe` fix). `new Database()` was outside the function's only try/catch (which guarded just the dep *load*), so a corrupt / unreadable / not-a-DB / directory `.embed.db` made the peek throw. Unlike the FTS peek (startup-only), this one is called UNGUARDED on the `embeddings_search` hot path (`tools/search.ts`, before that function's `try`) and in `cli.ts` subcommands — so the throw errored the search tool / crashed the CLI instead of degrading. Now the DB-open + read are wrapped → any failure → `null`. (`server.ts` call sites were already inside try/catch; this also makes them cleaner.)
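The shape of the fix (the open call moves inside the try) is easy to show in isolation. To avoid depending on `better-sqlite3`, this sketch injects a hypothetical `openDb` function in place of `new Database(path)`; `DbHandle`, `Meta`, and `peekMetaSafe` are illustrative names, not the package's API.

```typescript
// Hypothetical metadata and handle shapes for the sketch.
interface Meta {
  model: string;
  dim: number;
}
interface DbHandle {
  readMeta(): Meta;
  close(): void;
}
// Stand-in for `new Database(path)`; may throw on a corrupt file or a directory.
type OpenDb = (path: string) => DbHandle;

function peekMetaSafe(path: string, openDb: OpenDb): Meta | null {
  let db: DbHandle | undefined;
  try {
    // The open itself sits inside the try, not only the dependency load.
    db = openDb(path);
    return db.readMeta();
  } catch {
    // Any failure is reported as "no database", the graceful-degrade path.
    return null;
  } finally {
    try {
      db?.close();
    } catch {
      // A failing close must not turn a peek into a throw.
    }
  }
}
```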
165
+
166
+ ### Tests (1164)
167
+
168
+ - +3 (`tests/embed-db.test.ts`): `peekEmbedDbMeta` returns `null` (not throws) for a non-existent file, a **directory** path, and a **corrupt non-SQLite** file. Non-vacuous when `better-sqlite3` is present (CI): pre-fix `new Database()` threw → the directory/corrupt cases failed. 1161 → 1164.
169
+
170
+ ### Files changed
171
+
172
+ - `src/embed-db.ts` (peekEmbedDbMeta wrap), `tests/embed-db.test.ts` (+3), count-bump (`1161 → 1164`) in `README.md` / `package.json` / `llms.txt` / `AGENTS.md` / `ROADMAP.md` / `docs/COMPARISON.md` / `CLAUDE.md`.
173
+ - version bump 3.10.0-rc.33 → 3.10.0-rc.34.
174
+
175
+ ---
176
+
177
+ ## [3.10.0-rc.33] — 2026-06-08
178
+
179
+ > **TL;DR:** **Post-rc.31 audit response — code correctness (batch 2/2).** Ships the code findings from the 3-lens audit (rc.32 was docs/test-infra). The headline: **FTS5 (`--persistent-index`) now fails soft to TF-IDF instead of hard-crashing serve** when `better-sqlite3` is missing/unbuilt — closing the **"auto-degrades gracefully: works with any subset of signals" claimed-guarantee gap** (the embed-db / PDF / HNSW paths already fail-soft; FTS5 was the lone hard-crash). Writing the E2E test for it **surfaced a DEEPER latent bug the audit didn't name**: `peekFtsMetaSafe` — the pre-open metadata peek, which runs BEFORE the fail-soft try/catch — wrapped the better-sqlite3 *load* but not `new Database()`, so a **corrupt / unreadable / directory persistent-index file crashed serve at startup** (a function literally named "Safe" could throw). Both fixed: `peekFtsMetaSafe` now truly never throws (any failure → `null`), and the open/sync path degrades to TF-IDF with a loud stderr warning. Plus two eval-correctness polishes the audit flagged: `recallAtK`/`ndcgAtK` **dedupe duplicate relevant paths** (a path repeated in the result list no longer inflates recall past 1.0 / DCG past the ideal — pre-existing, unreachable via the eval path, now correct for any caller) and `formatEvalResult` uses a **dynamic id-column width** (ids > 15 chars no longer shift every following column). **+5 tests → 1161.** The FTS5 fail-soft is verified by a new E2E test that forces `ftsIndex.open()` to fail (points `--index-file` at a directory) and asserts serve still completes the MCP handshake + answers `tools/list`.
180
+
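The two fail-soft behaviors described above can be sketched as follows. This is a minimal illustration, not the code in `src/fts5.ts` or `src/server.ts`: the names `peekMetaSafe`, `openIndexOrDegrade`, and `IndexMeta`, and the injected `warn` callback, are all hypothetical, and the database open is passed in as a plain function so the sketch runs without `better-sqlite3`.

```typescript
// Hypothetical metadata shape, for illustration only.
type IndexMeta = { schemaVersion: number };

// "Safe" peek: the whole open + read runs inside the try block, so any
// failure (missing native dep, corrupt file, directory path) becomes null.
function peekMetaSafe(openAndRead: () => IndexMeta): IndexMeta | null {
  try {
    return openAndRead();
  } catch {
    return null;
  }
}

// Fail-soft open: on failure, emit a warning and return null so the caller
// can continue on the non-persistent (TF-IDF) path instead of crashing.
function openIndexOrDegrade<T>(
  open: () => T,
  warn: (message: string) => void,
): T | null {
  try {
    return open();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warn(`persistent index unavailable (${reason}); falling back to TF-IDF`);
    return null;
  }
}
```

The design point is the placement of the `try`: guarding only the dependency load leaves the open and the queries free to throw, which is the gap this release closes.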
181
+ **Pre-release (v3.10 line) — post-rc.31 audit, batch 2/2 (code correctness).**
182
+
183
+ ### Fixed
184
+
185
+ - **`src/server.ts` — FTS5 `--persistent-index` fails soft to TF-IDF** (was: re-throw → serve crash). On any `ftsIndex.open()`/sync failure (most commonly `better-sqlite3` missing/unbuilt + `--persistent-index`, e.g. the Docker introspection image or a failed native build), serve now sets `ftsIndex = null` (exactly the heavily-tested no-`--persistent-index` state) + emits a stderr warning, instead of a hard crash with an unactionable "npm rebuild" stack trace. Parity with the already-fail-soft PDF / embed-db / HNSW paths.
186
+ - **`src/fts5.ts` — `peekFtsMetaSafe` now truly never throws** (latent bug surfaced while testing the above). `new Database()` + the meta queries were OUTSIDE the function's only try/catch (which guarded just the dep *load*), so a corrupt / unreadable / not-a-DB / directory index file made the pre-open peek throw and crash serve before the fail-soft could engage. Now the whole DB-open + read is wrapped → any failure returns `null`.
187
+ - **`src/eval.ts` — `recallAtK` + `ndcgAtK` dedupe duplicate relevant paths.** A relevant path repeated in the result list counted multiple times (recall could exceed 1.0; DCG could exceed the ideal). Now each relevant path is credited once (recall counts the distinct set; ndcg credits the first rank). Pre-existing + unreachable via the eval path (default `note` granularity yields one hit per path) — defensive correctness for any caller.
188
+ - **`src/eval.ts` — `formatEvalResult` dynamic id-column width.** Per-query ids longer than 15 chars previously overflowed the fixed pad and shifted every following column; now the id column sizes to the widest id (mirrors `formatEvalMatrix`).
189
+
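A minimal sketch of the recall dedupe, assuming a signature of ranked result paths plus a set of relevant paths (the real `recallAtK` in `src/eval.ts` may take different arguments). `naiveRecallAtK` is a hypothetical stand-in for the pre-fix counting, and returning `0` for an empty relevant set is an assumption of this sketch.

```typescript
// Pre-fix style counting (illustrative): every occurrence of a relevant
// path in the top K is credited, so repeats can push recall above 1.0.
function naiveRecallAtK(results: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) return 0; // assumed convention for unlabeled queries
  let hits = 0;
  for (const path of results.slice(0, k)) {
    if (relevant.has(path)) hits += 1;
  }
  return hits / relevant.size;
}

// Deduped counting: each relevant path is credited at most once by
// collecting the distinct relevant paths seen in the top K.
function recallAtK(results: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) return 0; // assumed convention for unlabeled queries
  const found = new Set<string>();
  for (const path of results.slice(0, k)) {
    if (relevant.has(path)) found.add(path);
  }
  return found.size / relevant.size;
}
```

With results `["a.md", "a.md", "x.md"]` and a single relevant path `a.md`, the naive count yields 2 while the deduped version yields 1.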
190
+ ### Tests (1161)
191
+
192
+ - +5: `tests/e2e-handlers.test.ts` (+2) FTS5 fail-soft E2E (CI-GUARD that serve came up degraded + `tools/list` still answers; forces the failure via `--index-file <dir>`; revert-verified: restoring the re-throw crashes startup → the handshake times out → the guard fails); `tests/eval.test.ts` (+3) recallAtK-dedupe, ndcgAtK-dedupe, formatEvalResult long-id alignment (each fails pre-fix). 1156 → 1161.
193
+
194
+ ### Files changed
195
+
196
+ - `src/server.ts` (FTS5 open fail-soft), `src/fts5.ts` (peekFtsMetaSafe wrap), `src/eval.ts` (recall/ndcg dedupe + dynamic id width), `tests/e2e-handlers.test.ts` (+2), `tests/eval.test.ts` (+3), count-bump (`1156 → 1161`) in `README.md` / `package.json` / `llms.txt` / `AGENTS.md` / `ROADMAP.md` / `docs/COMPARISON.md` / `CLAUDE.md`.
197
+ - version bump 3.10.0-rc.32 → 3.10.0-rc.33.
198
+
199
+ ---
200
+
201
+ ## [3.10.0-rc.32] — 2026-06-08
202
+
203
+ > **TL;DR:** **Post-rc.31 audit response — docs + test-infra (batch 1/2).** Ran a from-scratch 3-lens audit (code · docs · test/process, via the Agent tool — NOT Workflow) of the rc.27→rc.31 seeklink batch; every finding **per-item re-verified against the actual code** (anti-overclaim). Verdict: the batch is **exceptionally clean — 0 CRITICAL, 0 HIGH** (the same-PR-invariant discipline held). This RC ships the docs/test-infra findings (the one MEDIUM + LOWs; the code findings follow as rc.33): **(1)** the **CLAUDE.md status roll-up** was frozen at `@rc`=rc.26 while the real @rc was rc.31 — the recurring **α-class "status section stale"** (rc.12 / rc.4 / v3.7.4 / v3.7.9 / v3.8.4 …). Updated it to rc.32 + the seeklink/audit summary, **and finally made it STRUCTURAL**: `check-version-consistency.mjs` now enforces the roll-up's `(current roll-up; \`@rc\`=<version>…)` marker == `package.json` on every `-rc.N` build, so a frozen roll-up fails CI (detection-power verified: it flagged the rc.32-vs-rc.31 mismatch before the bump). **(2)** three rc.28 tool-count guards were **vacuous-on-deletion** (caught a stale number but passed if the phrasing was removed) → presence-asserted. **(3)** the eval **`error` failure-bucket** is now asserted end-to-end (thrown query → `failure_bucket:"error"` + aggregate). **(4)** the rc.27 CHANGELOG "AGENTS.md ×2" advisory-sync count was an overcount → corrected. **No `src/` behavior change; 1156 tests unchanged.**
204
+
205
+ **Pre-release (v3.10 line) — post-rc.31 audit, batch 1/2 (docs + test-infra).**
206
+
207
+ ### Tooling (structural enforcement)
208
+
209
+ - **`check-version-consistency.mjs` gains a CLAUDE.md roll-up `@rc`-currency guard** (the α-class structural defense). On any `-rc.N` build it asserts the roll-up's `(current roll-up; \`@rc\`=X.Y.Z-rc.N …)` marker equals `package.json`'s version. NOT counted among the 7 published-version surfaces (it's a status-summary claim, not a published-version file) — the "7 surfaces" wording stays accurate. This converts the 6×-recurring "CLAUDE.md status stale" α-class from a discipline into a CI gate.
210
+ - **`tests/docs-consistency.test.ts`** — the three rc.28 tool-count guards (`**N production tools**`, `| Tool count | N |`, `N tool implementations`) now **presence-assert** before checking the value, so they catch both a stale number AND the phrasing being deleted (the rc.30 zh-numeric `it()` already did this; this brings the rc.28 guards to the same bar).
211
+
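The difference between a value-only guard and a presence-asserting guard can be sketched like this. `checkToolCount` is a hypothetical helper written for illustration; the actual assertions in `tests/docs-consistency.test.ts` are not shown in this changelog and may be structured differently.

```typescript
// Returns a list of problems; an empty list means the document passes.
function checkToolCount(doc: string, expected: number): string[] {
  const pattern = /\*\*(\d+) production tools\*\*/g;
  const problems: string[] = [];
  let seen = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(doc)) !== null) {
    seen += 1;
    if (Number(match[1]) !== expected) {
      problems.push(`stale count ${match[1]} (expected ${expected})`);
    }
  }
  // Presence assertion: without this, deleting the phrase leaves zero
  // matches and the value check above passes vacuously.
  if (seen === 0) problems.push("phrase '**N production tools**' not found");
  return problems;
}
```

A guard that only loops over matches reports nothing when the phrase is gone; the final `seen === 0` check is what makes deletion a failure.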
212
+ ### Tests / docs
213
+
214
+ - **`tests/eval.test.ts`** — the "survives a query that throws" test now asserts `per_query[1].failure_bucket === "error"` + `diagnostics.failure_buckets.error === 1` (end-to-end wiring of rc.31's classifier, previously only unit-tested).
215
+ - **`CLAUDE.md`** — status roll-up advanced rc.26 → rc.32 + the seeklink batch (rc.27→rc.31) and this audit summarized.
216
+ - **`CHANGELOG.md`** — rc.27's "advisory-gate-count … `AGENTS.md` ×2" corrected to "`AGENTS.md` (count header + advisory list)" (the "5 advisory" count appears once in AGENTS; the 2nd edit was the advisory list).
217
+
218
+ ### Rejected (with reasoning)
219
+
220
+ - **"advisory CI-gate count (5) is unpinned by an invariant"** (audit LOW) — **rejected.** 3 of the 5 advisory checks (CodeQL ×2 + Analyze) come from GitHub default-setup, not repo files, so the count can't be structurally derived; the **required** count (9) IS pinned (it's what gates releases). Documenting it as deliberately-unpinned.
221
+
222
+ ### Tests (1156)
223
+
224
+ - None — all additions are inline assertions in existing tests + a script guard (no new `it()`). 1156 unchanged.
225
+
226
+ ### Files changed
227
+
228
+ - `scripts/check-version-consistency.mjs` (α-class roll-up guard), `tests/docs-consistency.test.ts` (3 presence asserts), `tests/eval.test.ts` (error-bucket assertion), `CLAUDE.md` (roll-up rc.26→rc.32 + batch summary), `CHANGELOG.md` (rc.27 ×2 correction).
229
+ - version bump 3.10.0-rc.31 → 3.10.0-rc.32.
230
+
231
+ ---
232
+
233
+ ## [3.10.0-rc.31] — 2026-06-07
234
+
235
+ > **TL;DR:** **Eval failure-bucket diagnostics (seeklink-inspired) — turn "the score is low" into "*why* it's low".** The eval harness now classifies every query into a retrieval-failure bucket — `hit_rank_1` / `hit_top_k` / `miss` / `no_labels` / `error` — surfaced per-query and as an aggregate `diagnostics.failure_buckets` counter, plus a breakdown line in the CLI report. This is the half of seeklink's `failure_bucket` idea that fits enquire **safely**: the buckets are derived **only from the already-scored top-K** results, so the metric numbers (NDCG/Recall/MRR) are **byte-identical** — zero behavior change, zero extra retrieval cost. **`answer_contains` answerability is deliberately DEFERRED** (honest scoping): a faithful version needs the full matched-chunk text, but `SearchHybridHit` only carries a ~120-char `snippet`, so a snippet-based check would systematically *under-report* (phrase in the chunk but outside the preview) — a misleading metric this project won't ship. The deeper seeklink "candidate-gen miss vs ranking-budget miss" split is likewise deferred (it needs a retrieval wider than K, which would change the reranker budget and break historical comparability) — both deferrals are documented inline in `eval.ts`. **`src/eval.ts` + tests only; +12 tests → 1156.**
236
+
237
+ **Pre-release (v3.10 line) — eval failure-bucket diagnostics (seeklink-inspired).**
238
+
239
+ ### Added
240
+
241
+ - **`FailureBucket` type + `classifyFailureBucket()` + `tallyFailureBuckets()`** (`src/eval.ts`, all exported + pure) — classify a query's outcome from its scored top-K paths; `error` (threw) takes precedence, then `no_labels`, `hit_rank_1`, `hit_top_k`, `miss`.
242
+ - **`EvalQueryScore.failure_bucket`** (per query) + **`EvalResult.diagnostics.failure_buckets`** (aggregate counter; optional so hand-built results like `run-benchmarks.mjs` stay valid — `runEval` always populates it).
243
+ - **`formatEvalResult` failure-bucket breakdown** — a `failure buckets: hit@1=… hit@k=… miss=… …` line + a `bucket` column in `--per-query` mode.
244
+
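The precedence order listed above can be sketched as a pure function. The bucket names come from this entry; the `QueryOutcome` input shape and both function signatures are assumptions made for illustration and may not match the exports in `src/eval.ts`.

```typescript
type FailureBucket = "hit_rank_1" | "hit_top_k" | "miss" | "no_labels" | "error";

// Hypothetical input shape: the scored top-K paths, the labeled relevant
// paths, and whether the query threw.
interface QueryOutcome {
  topK: string[];
  relevant: string[];
  threw: boolean;
}

// Precedence: error, then no_labels, then hit_rank_1, hit_top_k, miss.
function classifyFailureBucket(outcome: QueryOutcome): FailureBucket {
  if (outcome.threw) return "error";
  if (outcome.relevant.length === 0) return "no_labels";
  const relevant = new Set(outcome.relevant);
  const first = outcome.topK[0];
  if (first !== undefined && relevant.has(first)) return "hit_rank_1";
  if (outcome.topK.some((path) => relevant.has(path))) return "hit_top_k";
  return "miss";
}

// Aggregate counter that always carries every bucket, including zeros.
function tallyFailureBuckets(buckets: FailureBucket[]): Record<FailureBucket, number> {
  const tally: Record<FailureBucket, number> = {
    hit_rank_1: 0,
    hit_top_k: 0,
    miss: 0,
    no_labels: 0,
    error: 0,
  };
  for (const bucket of buckets) tally[bucket] += 1;
  return tally;
}
```

Because the classification reads only the already-scored top-K list, it cannot alter the metric values computed from that same list.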
245
+ ### Deferred (documented inline in `eval.ts`, with reasons)
246
+
247
+ - **`answer_contains` answerability** — needs full chunk text; `SearchHybridHit` exposes only a ~120-char snippet, so a snippet-based check would under-report and mislead. Revisit if `searchHybrid` ever returns the full matched chunk.
248
+ - **`miss` → candidate-gen-miss vs ranking-budget/reranker-ordering-miss split** — needs a retrieval wider than K, which would change the reranker's candidate budget and thus the scored numbers (breaking historical comparability). Needs first-stage-diagnostics plumbing from `searchHybrid` first.
249
+
250
+ ### Tests (1156)
251
+
252
+ - +12 (`tests/eval.test.ts`): `classifyFailureBucket` (5 positive + 3 NEGATIVE controls), `tallyFailureBuckets` (complete-counter + empty-list NEGATIVE), `runEval` populates `failure_bucket` + `diagnostics`, and `formatEvalResult` renders/omits the breakdown (positive + NEGATIVE). 1144 → 1156.
253
+
254
+ ### Files changed
255
+
256
+ - `src/eval.ts` (FailureBucket type + classifier + tally + interface fields + renderer), `tests/eval.test.ts` (+12), count-bump in `README.md` / `package.json` / `llms.txt` / `AGENTS.md` / `ROADMAP.md` / `docs/COMPARISON.md` / `CLAUDE.md`.
257
+ - version bump 3.10.0-rc.30 → 3.10.0-rc.31.
258
+
259
+ ---
260
+
261
+ ## [3.10.0-rc.30] — 2026-06-07
262
+
263
+ > **TL;DR:** **Bilingual `README.zh.md` (中文) — reach into the Chinese PKM / Obsidian / dev community (seeklink-inspired).** Added a complete, faithful **Chinese README** mirroring every section (problem/solution, grounded-not-extracted + freshness, quick start, use cases, "when NOT to use it", the full capability table, the 7-tier retrieval ladder, Trust, FAQ) — capitalizing on enquire's *already-shipped* multilingual + CJK (`Intl.Segmenter`) support that was previously under-marketed. A `[English] · [中文]` switcher sits at the top of **both** READMEs, and `README.zh.md` ships in the npm tarball (`package.json#files`). Honest disclaimer up top: the **English README is authoritative** (it updates every release). Per the rc.14 "new docs surface with numeric claims needs an invariant in the same PR" rule, `docs-consistency.test.ts` now **pins the zh numeric claims**: tool count (`45 个工具`) and prompt count (`19 个 MCP 提示词`) exact against `TOOL_MANIFEST`, and the test count as a **drift-proof lower bound** (`1100+ 单元测试`, mirroring AGENTS.md's `X+ tests`) so it stays valid as the suite grows. **Docs/tests only — zero `src/` runtime change. +1 test (the zh invariant) → 1144.**
264
+
265
+ **Pre-release (v3.10 line) — bilingual README.zh.md (seeklink-inspired).**
266
+
267
+ ### Added
268
+
269
+ - **`README.zh.md`** — complete Chinese translation of the README, all sections present (tables kept; code blocks verbatim). Markets the existing 50+-language / CJK retrieval to a Chinese-speaking audience.
270
+ - **`[English] · [中文]` language switcher** at the top of both `README.md` and `README.zh.md`.
271
+ - **`README.zh.md` added to `package.json#files`** so it ships to npm alongside the English README.
272
+
273
+ ### Tooling (structural enforcement)
274
+
275
+ - **`docs-consistency.test.ts` pins README.zh.md numeric claims** (rc.14 new-surface rule): `45 个工具` == `TOOL_MANIFEST.length`, `19 个 MCP 提示词` == registered prompts, and `1100+ 单元测试` as a lower bound (must be ≤ actual and within 200 of it).
276
+
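The lower-bound rule (claim must not exceed the real count and must stay within 200 of it) might look like the sketch below. `checkLowerBoundClaim` is a hypothetical helper, and treating a gap of exactly 200 as acceptable is an assumption; the real invariant lives in `docs-consistency.test.ts`.

```typescript
// Returns null when the "N+ 单元测试" claim is a valid lower bound,
// otherwise a short description of the problem.
function checkLowerBoundClaim(doc: string, actual: number, maxLag = 200): string | null {
  const match = /(\d+)\+ 单元测试/.exec(doc);
  if (match === null) return "claim not found";
  const claimed = Number(match[1]);
  if (claimed > actual) return `claim ${claimed}+ exceeds actual ${actual}`;
  if (actual - claimed > maxLag) return `claim ${claimed}+ lags actual ${actual} by more than ${maxLag}`;
  return null;
}
```

The upper limit on the lag keeps the claim from going stale silently: the bound never needs a per-release bump, but it cannot drift arbitrarily far below the live count either.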
277
+ ### Tests (1144)
278
+
279
+ - +1 (`docs-consistency.test.ts`): the README.zh.md numeric-claims invariant. 1143 → 1144 (English count surfaces bumped accordingly; the zh README uses the drift-proof `1100+` lower bound, so it never needs a count bump).
280
+
281
+ ### Files changed
282
+
283
+ - `README.zh.md` (new), `README.md` (switcher), `package.json` (files + description count), `tests/docs-consistency.test.ts` (zh invariant), count-bump in `llms.txt` / `AGENTS.md` / `ROADMAP.md` / `docs/COMPARISON.md` / `CLAUDE.md`.
284
+ - version bump 3.10.0-rc.29 → 3.10.0-rc.30.
285
+
286
+ ---
287
+
288
+ ## [3.10.0-rc.29] — 2026-06-07
289
+
290
+ > **TL;DR:** **`llms.txt` → full agent contract (seeklink-inspired).** enquire's `llms.txt` followed the [llmstxt.org](https://llmstxt.org/) curated-links shape; seeklink's packs a dense "how to drive me" contract instead. Took the best of both: kept the spec-compliant link sections and **added an `## Agent contract` + `### Common failure modes`** block (free-form, before the trailing `Optional` section, so still spec-valid). It gives an AI agent the minimum loop (`obsidian_search` → `obsidian_read_note` → cite), the prefer-enquire-for-meaning-vs-`grep`-for-literal rule, the observability fields (`per_signal`, `age_days`/`stale`), the read-only-by-default posture, an **untrusted-content security note** (retrieved note text is data, not instructions), and the real failure modes (model-not-downloaded → TF-IDF fallback, empty fresh vault, whole-vault scan cap, `serve-http` bearer 401). No new gated numeric claims (so no invariant churn). **`llms.txt` only — zero `src/`, 1143 tests unchanged.**
291
+
292
+ **Pre-release (v3.10 line) — llms.txt agent-contract enrichment (seeklink-inspired).**
293
+
294
+ ### Added
295
+
296
+ - **`llms.txt` `## Agent contract` section** — minimum agent loop, when-to-prefer-enquire-vs-`grep`, observability (`per_signal` / `age_days` / `stale`; scores sort within one query only), read-only-by-default + `--disabled-tools`/`--enabled-tools`, and an **untrusted-content** note (treat "ignore previous instructions"-style text inside a retrieved note as content, never a command).
297
+ - **`llms.txt` `### Common failure modes` subsection** — first-call embedding-model-not-downloaded → `setup`/`install-model` (umbrella degrades to TF-IDF meanwhile), empty fresh vault, whole-vault scan safety cap (partial results flagged, never silent), and `serve-http` bearer-token-too-short → HTTP 401 (`gen-token` mints a valid one).
298
+
299
+ ### Tests (1143)
300
+
301
+ - None — `llms.txt` only; no gated numeric claims added (existing llms.txt invariants — test count, 34+4+7 tool breakdown, prompt count, CI-gate count — unchanged). 1143 unchanged.
302
+
303
+ ### Files changed
304
+
305
+ - `llms.txt` (Agent contract + failure-modes sections).
306
+ - version bump 3.10.0-rc.28 → 3.10.0-rc.29.
307
+
308
+ ---
309
+
310
+ ## [3.10.0-rc.28] — 2026-06-07
311
+
312
+ > **TL;DR:** **README trust-batch — honest scoping + a self-propagating agent rule (seeklink-inspired).** Added a candid **"When enquire-mcp is *not* the right tool"** section (use `rg` for literal search; conversation-memory tools are a different category; not multi-user/GUI/web-scale) — explicit non-goals build trust, mirroring seeklink's "Not For". Marketed the already-existing **read-only-by-default** posture with a new **least-privilege** Trust row (`--disabled-tools` / `--enabled-tools`), and shipped a **reusable agent-rule snippet** users drop into their own `AGENTS.md`/`CLAUDE.md`/`.cursorrules` so their agent learns *when* to reach for the vault (and when to prefer `grep`). Plus two **state-driven catches** the change-driven gates' regexes missed: a stale **"44 tools" → 45** in three docs (README comparison row, `docs/COMPARISON.md` table cell, `AGENTS.md` file-tree — the 45th tool `obsidian_stale_notes` shipped in v3.10 but these phrasings never updated; one even contradicted its own 34+4+7=45 breakdown), and a **broken Karpathy gist link** in the README (404 — every other reference used the correct id). Per the "drift demands a structural defense" rule, the same PR **extends the tool-count invariants** to pin the missed phrasings (`**N production tools**`, `| Tool count | N |`, `N tool implementations`). **Docs/tests only — zero `src/` runtime change, 1143 tests unchanged.**
313
+
314
+ **Pre-release (v3.10 line) — README trust-batch (seeklink-inspired) + tool-count drift class.**
315
+
316
+ ### Added
317
+
318
+ - **README "🚫 When enquire-mcp is *not* the right tool"** — honest non-goals: literal search (`rg`), conversation-memory category (mem0/Zep/Supermemory), multi-user/hosted, non-Markdown sources, GUI/plugin, web-scale corpora. Trust through candor.
319
+ - **README "Reusable agent rule" snippet** — a copy-paste block for any `AGENTS.md`/`CLAUDE.md`/`.cursorrules` telling the agent to search the vault first for conceptual recall and use `grep`/`rg` for literal strings (the self-propagating-adoption pattern borrowed from seeklink's Agent Notes).
320
+ - **README Trust "Least privilege" row** — markets the existing `--disabled-tools`/`--enabled-tools` surface-subsetting (e.g. a read-only research agent gets only `obsidian_search` + `obsidian_read_note`).
321
+
322
+ ### Fixed
323
+
324
+ - **Stale "44 tools" → 45 in three docs** (README comparison row, `docs/COMPARISON.md` "Tool count" cell, `AGENTS.md` file-tree). The 45th tool (`obsidian_stale_notes`) shipped in the v3.10 line but these phrasings drifted; the README row even contradicted its own "34 + 4 + 7 = 45" breakdown.
325
+ - **Broken Karpathy LLM-Wiki gist link in the README** (`…914927…` → 404). Corrected to the canonical id (`…914893…`, HTTP 200) used everywhere else in the codebase.
326
+
327
+ ### Tooling (structural enforcement)
328
+
329
+ - **Extended the tool-count invariants** (`tests/docs-consistency.test.ts`) to close the phrasings the existing `**N tools**` regex couldn't see — all pinned to `TOOL_MANIFEST.length`: `**N production tools**` (README), `| Tool count | N |` (COMPARISON table cell), `N tool implementations` (AGENTS file-tree). This is the "a drift finding demands a full-surface sweep + structural defense" rule — the instance fix alone would let the class recur.
330
+
331
+ ### Tests (1143)
332
+
333
+ - No new `it()` — the new assertions extend three existing tool-count tests (no canonical-count change). 1143 unchanged.
334
+
335
+ ### Files changed
336
+
337
+ - `README.md` (Not-For section, agent-rule snippet, least-privilege Trust row, 44→45 comparison row, Karpathy link fix), `docs/COMPARISON.md` (Tool count 44→45), `AGENTS.md` (44→45 file-tree), `tests/docs-consistency.test.ts` (3 invariant extensions).
338
+ - version bump 3.10.0-rc.27 → 3.10.0-rc.28.
339
+
340
+ ---
341
+
342
+ ## [3.10.0-rc.27] — 2026-06-07
343
+
344
+ > **TL;DR:** **Docker / Glama discoverability — a borrowed lesson from `seeklink`.** MCP directories (Glama, and through Glama the `awesome-mcp-servers` listing) introspect a server by **building its Dockerfile** and completing an MCP handshake + `tools/list` over stdio. enquire shipped `glama.json` long ago but had **no Dockerfile**, so that check couldn't build it. Added a minimal, reproducible **multi-stage `Dockerfile`** that builds from source and serves the **read-only-by-default** MCP over stdio against a baked sample vault — it installs deps with `--ignore-scripts` so `tsc` resolves the optional-dep types with **no native toolchain**, then **prunes optional from the slim runtime**: each native dep loads via lazy `await import()` only when a heavy tool is *called*, so `tools/list` introspection works without them (umbrella search degrades to pure-JS TF-IDF; full FTS5/embeddings/PDF retrieval uses the npm install path). _(The first CI pass caught that `--omit=optional` broke `tsc` — the optional packages are referenced in typed dynamic imports — exactly why the `docker` job exists; corrected to `--ignore-scripts` + prune-optional.)_ Plus a `.dockerignore` for a lean context. Made it **structural** with `tests/docker-glama-invariant.test.ts`: the Dockerfile must invoke the real bin (`dist/index.js`), run `serve`, and use a Node base image whose major ≥ `engines.node` floor; `glama.json` must be valid + list the owner — each with a real NEGATIVE control. **Infra/docs/tests only — zero `src/` runtime change.** The canonical install path stays `npm install -g @oomkapwn/enquire-mcp`; the image is for directory introspection + quick container trials.
345
+
346
+ **Pre-release (v3.10 line) — Docker/Glama discoverability (seeklink-inspired).**
347
+
348
+ ### Added
349
+
350
+ - **`Dockerfile`** — multi-stage (build → slim runtime). Build stage: `npm ci --ignore-scripts` (optional deps present so `tsc` resolves their typed dynamic imports — `hnswlib-node` / `pdfjs-dist` / `tesseract.js` / `@napi-rs/canvas` — but never natively compiled → no python/make/g++) → `npm run build` → `npm prune --omit=dev --omit=optional` (slim runtime). Runtime stage: `node:22-slim` (matches `engines.node` ≥ 22) with `dist/` + prod deps + a baked `/vault/welcome.md`. `ENTRYPOINT ["node","dist/index.js"]` + `CMD ["serve","--vault","/vault"]` — read-only by default, so an introspection harness can never mutate. Header documents the real-use path (`docker run -i -v /abs/vault:/vault …`).
351
+ - **`.dockerignore`** — keeps the build context lean + deterministic (excludes `node_modules`, `dist`, `.git`, `tests`, `docs/audits`, `assets`, etc.).
352
+
353
+ ### Tooling (structural enforcement)
354
+
355
+ - **`tests/docker-glama-invariant.test.ts`** — pins the two files the directory check depends on. Asserts (1) the Dockerfile invokes `dist/index.js` + runs `serve`, (2) every `FROM node:<major>` base image major ≥ the `engines.node` floor (catches a future engines bump outrunning the base image → unsupported runtime), (3) `glama.json` is valid JSON with a `glama.ai` `$schema` + the owner in `maintainers`. Pure analyzers (`analyzeDockerfile` / `engineNodeMajorFloor` / `validateGlamaConfig`) are driven by 5 NEGATIVE controls (no-bin/no-serve Dockerfile, sub-floor base image, missing engines, invalid JSON, missing owner+schema) so the guard is provably non-vacuous. Auto-scanned by the META-invariant (`*-invariant.test.ts`).
356
+ - **CI `docker` job (`.github/workflows/ci.yml`, advisory).** Anti-overclaim: the image couldn't be built in this dev environment, so a new job actually `docker build`s it, smoke-runs `--help`, and performs a **`tools/list` stdio introspection** (the exact MCP handshake Glama does) asserting `obsidian_search` comes back — turning "Glama-introspectable" into an *enforced* claim and guarding the Dockerfile against rot. Advisory (not in the branch-protection required set → never blocks a merge); uses only the already-SHA-pinned `checkout` + preinstalled `docker` (no new action to pin, no `npm ci` → OIA Checks 9/10 N/A). Advisory gate count `4 → 5` synced across README ×2, `llms.txt`, `AGENTS.md` (count header + advisory list).
357
+
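Assertion (2) above, that every Node base image meets the `engines.node` floor, can be sketched with two pure helpers. `nodeMajorFloor` and `subFloorBaseImages` are hypothetical names chosen for this illustration (the test file's own analyzers are `analyzeDockerfile` / `engineNodeMajorFloor`, whose signatures are not shown here), and the sketch only understands `>=N` style ranges.

```typescript
// Extracts the major version floor from an engines.node range like ">=22".
// Returns null when the field is missing or not in ">=N" form.
function nodeMajorFloor(enginesNode: string | undefined): number | null {
  if (enginesNode === undefined) return null;
  const match = />=\s*(\d+)/.exec(enginesNode);
  return match === null ? null : Number(match[1]);
}

// Lists the major versions of `FROM node:<major>` lines that fall below
// the floor; an empty list means every stage is on a supported runtime.
function subFloorBaseImages(dockerfile: string, floor: number): number[] {
  const pattern = /^FROM\s+node:(\d+)/gim;
  const below: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(dockerfile)) !== null) {
    const major = Number(match[1]);
    if (major < floor) below.push(major);
  }
  return below;
}
```

Checking every `FROM` line rather than only the last one matters for a multi-stage build, since the build stage and the runtime stage can drift independently.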
358
+ ### Docs
359
+
360
+ - Test-count surfaces bumped `1135 → 1143` (README badge/hero/trust-row/test-cmd, `package.json` description, `llms.txt`, `AGENTS.md`, `ROADMAP.md`, `docs/COMPARISON.md`, `CLAUDE.md` roll-up) — the docs-consistency invariant pins these to the live `it()` count.
361
+
362
+ ### Tests (1143)
363
+
364
+ - +8 (`tests/docker-glama-invariant.test.ts`): 3 positive (real Dockerfile + glama.json assertions) + 5 NEGATIVE controls. 1135 → 1143.
365
+
366
+ ### Files changed
367
+
368
+ - `Dockerfile` (new), `.dockerignore` (new), `tests/docker-glama-invariant.test.ts` (new), `.github/workflows/ci.yml` (advisory `docker` job).
369
+ - count-bump (`1135 → 1143`) in `README.md`, `package.json`, `llms.txt`, `AGENTS.md`, `ROADMAP.md`, `docs/COMPARISON.md`, `CLAUDE.md`; advisory-gate-count bump (`4 → 5`) in `README.md` ×2, `llms.txt`, `AGENTS.md` (count header + advisory list).
370
+ - version bump 3.10.0-rc.26 → 3.10.0-rc.27.
371
+
372
+ ---
373
+
374
+ ## [3.10.0-rc.26] — 2026-06-06
375
+
376
+ > **TL;DR:** **SYS-1 — supply-chain content-pin (M-9 completion).** The release workflow's one external `run:` download — the `mcp-publisher` CLI that runs with our **OIDC publish identity** on a stable release — was *tag*-pinned (`v1.7.9`, rc.33 M-9) but tag-pins are **mutable** (a tag can be force-moved, a release asset re-uploaded). Now it's **content-pinned**: the tarball's SHA256 is verified (`sha256sum -c`, fail-closed) before it's extracted/executed. Made it **structural** by extending **OIA Check 9b**: a tag-pinned release-archive (`releases/download/<tag>/…\.tar.gz`) `run:` download must ALSO carry a SHA256 verification in the same workflow, else CI fails (`RUN-DOWNLOAD-UNVERIFIED`; detection-power inject/revert-verified). Verified the deferred SYS-1 items against current code first (anti-overclaim): **H-3** paired-sink PDF/OCR parity was **already closed in rc.33**, and Check 9b's `releases/latest` guard already existed — so the genuine residual was just the tag→content upgrade. **Workflow/script/docs only — zero `src/`, 1135 tests unchanged.** Closes the rc.36 meta-audit's two named "deferred behavioral dimensions".
377
+
378
+ **Pre-release (v3.10 line) — SYS-1: deferred behavioral-defense dimensions.**
379
+
380
+ ### Security (supply-chain)
381
+
382
+ - **M-9 completion — `mcp-publisher` download is now SHA256 content-pinned (was tag-pinned).** `release.yml`'s registry-publish step downloads the official `mcp-publisher` CLI from a GitHub release. rc.33 pinned it to the `v1.7.9` tag (closed `releases/latest`), but a tag is not immutable. Now: download to a file → `echo "<sha256> mcp-publisher.tar.gz" | sha256sum -c -` (fail-closed) → extract. The pinned hash (`ab12…81ac`, linux/amd64 — the `ubuntu-latest` runner arch) is bumped *deliberately together with* the tag. This binary runs with the workflow's OIDC identity, so content-pinning it is the highest-value spot for the strongest supply-chain defense. (The download was also simplified from `uname`-portable to explicit `linux_amd64`, matching the fixed runner.)
383
+
384
+ ### Tooling (structural enforcement)
385
+
386
+ - **OIA Check 9b extended — release-archive `run:` downloads must be SHA256-verified.** Check 9b already flagged `releases/latest` (moving URL). It now ALSO flags a tag-pinned release **archive** (`releases/download/<tag>/…\.tar.gz|.tgz|.zip`) `curl`/`wget` that lacks a `sha256sum -c` / `shasum -a 256 -c` anywhere in the same workflow file (`RUN-DOWNLOAD-UNVERIFIED`). This converts the content-pin from a one-time fix into a permanent gate — the rc.36 "internalize the lens as an inventory invariant" move. Detection-power verified: stripping the `sha256sum -c` line flags `release.yml:240`; restored → clean. (Check 9b is a sub-check of Check 9 — the canonical OIA top-level count stays 12.)
387
+
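The extended rule can be approximated by a small scanner over a workflow file's text. This is a sketch of the idea only, not the code in `scripts/oia-walk.mjs`: the function name, the regexes, and the returned line-number list are all assumptions.

```typescript
// Returns the 1-based line numbers of curl/wget release-archive downloads
// when the workflow contains no SHA256 verification anywhere; returns an
// empty list when a checksum command is present or nothing is downloaded.
function unverifiedArchiveDownloads(workflow: string): number[] {
  const hasChecksum = /sha256sum\s+-c|shasum\s+-a\s+256\s+-c/.test(workflow);
  if (hasChecksum) return [];
  const archive = /\b(curl|wget)\b.*releases\/download\/[^/\s]+\/\S+\.(tar\.gz|tgz|zip)/;
  const flagged: number[] = [];
  workflow.split("\n").forEach((line, index) => {
    if (archive.test(line)) flagged.push(index + 1);
  });
  return flagged;
}
```

As in the rule described above, the checksum is looked for anywhere in the same file rather than next to the download, which keeps the check simple at the cost of not proving the hash applies to that particular archive.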
388
+ ### Docs
389
+
390
+ - **`CLAUDE.md`** — the rc.36 "remaining uncovered behavioral dimensions" note marked M-9 (→ rc.26) and H-3 (→ rc.33) **closed**, with the still-uncovered set named honestly (generalized enforcement-verb→code-guard taxonomy; the accepted `block`-granularity FTS5↔embed chunk-index divergence). Status roll-up extended rc.25 → rc.26.
391
+
392
+ ### Tests (1135)
393
+
394
+ - None. Workflow/script/docs only; no `src/` or test change. Check 9b's new branch is verified by the inject/revert detection-power run (OIA checks run via `npm run check:oia`, not vitest). 1135 unchanged.
395
+
396
+ ### Files changed
397
+
398
+ - `.github/workflows/release.yml` (mcp-publisher download → file + `sha256sum -c` + extract), `scripts/oia-walk.mjs` (Check 9b archive-checksum requirement), `CLAUDE.md` (dimension-status note + roll-up).
399
+ - version bump 3.10.0-rc.25 → 3.10.0-rc.26.
400
+
401
+ ---
402
+
403
+ ## [3.10.0-rc.25] — 2026-06-06
404
+
405
+ > **TL;DR:** **Round-2 audit — LOW docs-currency batch (final; docs-only).** Five un-gated currency-drift surfaces the change-driven gates don't watch, all flagged by the round-2 state-driven docs sweep: (1) **`CLAUDE.md`** status section was frozen at `v3.9.0-rc.35` (header still said "v3.8.x stable + v3.9.0 architectural") — added a condensed v3.9.0-stable→v3.9.1→v3.10.0-rc.1→rc.25 roll-up + moved the "(current)" marker + updated the title; (2) **`llms.txt`** "what's new" list stopped at rc.3, omitting the v3.10 freshness flagship (contradicting llms.txt's own header) — added it; (3) **`docs/benchmarks.md`** metric-validity said "through the v3.9.0-rc cascade" — extended to the v3.10 line (recency re-rank is off-by-default, a provable no-op, so the numbers are unchanged); (4) **`README.md`** highlight reel stopped at "v3.9.0 stable" — appended a v3.10 (`@rc`) freshness entry; (5) **`docs/api.md`** described the freshness boolean as an "over-one-year flag (≥ 365)" without naming it — now `stale` flag (≥ `--stale-days`, default 365). **Docs-only — zero `src/`, 1135 tests unchanged. This closes the round-2 (post-MED) audit** (rc.23 HIGH shutdown + rc.24 LOW code + rc.25 LOW docs).
406
+
407
+ **Pre-release (v3.10 line) — round-2 audit; LOW docs-currency batch (final).**
408
+
409
+ ### Docs
410
+
411
+ - **`CLAUDE.md`** — title `v3.8.x stable + v3.9.0 architectural` → `v3.9.x stable maintenance + v3.10 forgetting-aware line`; added a single condensed status roll-up entry (v3.9.0 STABLE promotion → v3.9.1 → the full v3.10.0-rc.1→rc.25 line: staleness, bug-report batch, MED audit M1–M10, round-2 re-sweep incl. the rc.23 shutdown HIGH) marked "(current)"; removed "(current)" from the rc.35 entry. (Internal process doc — not packaged; the recurring "CLAUDE.md status frozen" α-drift the project's own anti-pattern list names.)
412
+ - **`llms.txt`** — added a `v3.10+ (@rc)` line to the recent-features list (forgetting-aware freshness + frontmatter-aware search), resolving the list-vs-header self-contradiction on an AI-discoverability surface.
413
+ - **`docs/benchmarks.md`** — metric-validity currency extended from "the v3.9.0-rc cascade" to "the v3.10 line", with the explicit note that the rc.5 recency re-rank is off by default (`--recency-weight 0` = provable no-op) so default-config numbers are unchanged.
414
+ - **`README.md`** — highlight reel gained a `v3.10` (`@rc`) entry (freshness + frontmatter-aware search), so it no longer lags the README's own hero differentiator.
415
+ - **`docs/api.md`** — the v3.10 freshness boolean is now named (`stale`) and `365` is shown as the `--stale-days` default rather than an absolute.
416
+
417
+ ### Tests (1135)
418
+
419
+ - None. Docs-only RC; no `src/` or test change. 1135 unchanged.
420
+
421
+ ### Files changed
422
+
423
+ - `CLAUDE.md` (title + status roll-up), `llms.txt`, `docs/benchmarks.md`, `README.md`, `docs/api.md`.
424
+ - `scripts/check-per-file-coverage.mjs` — refreshed the stale `watcher.ts` inline coverage comment (60.69% → 61.83%; rc.24's unlink-gate change + test raised it; caught by OIA Check 6 against the fresh `coverage-summary.json`).
425
+ - version bump 3.10.0-rc.24 → 3.10.0-rc.25.
426
+
427
+ ### Method note
428
+
429
+ This concludes the **round-2 (post-MED) audit** — a 3-agent pass on the shipped rc.22 commit (per the CLAUDE.md "re-run a focused audit after a class-closing release" rule). It returned 1 HIGH (rc.23 — a regression in our own rc.19, empirically reproduced + fix-verified before shipping), 3 LOW code (rc.24), and 5 LOW docs-currency (rc.25). No CRITICAL; `src/` remains exceptionally clean. The HIGH validated the meta-lesson: the home-grown gates are drift/claim-driven and structurally blind to runtime behavior, so the external-lens re-sweep after each batch is not optional.
430
+
431
+ ---
432
+
433
+ ## [3.10.0-rc.24] — 2026-06-06
434
+
435
+ > **TL;DR:** **Round-2 audit — LOW code batch (3 fixes).** The same post-MED re-sweep that found the rc.23 HIGH surfaced three LOWs, all verified against current code: (1) **`obsidian_query_base` (`bases.ts queryBase`)** was the lone uncapped member of the `capScanEntries` whole-vault-scanner class — always-registered + bearer-reachable, it reads every matched note's full body with no cap; the `resource-bound-invariant` missed it because `bases.ts` is outside `SCANNER_SOURCES` AND it uses `listFilesByExtension`+`readFile` (not the `listMarkdown`+`readNote` shape the heuristic detects). Capped via `capScanEntries` + a standalone invariant assertion (mirroring the `buildWikilinkGraph` one). (2) **`parser.ts bodyStartLine`** used `source.indexOf(body)`, which false-matches a degenerate note whose entire body text also appears verbatim in a frontmatter line (`---\nx: hi\n---\nhi` → wrong line) → `lastIndexOf` (body is the suffix) + empty-body guard. (3) **`watcher.ts handle()`** rc.20's exclude re-check skipped ALL kinds, so an excluded path's `unlink` never purged its rows (orphan index entries for a deleted-but-excluded note) → gate only add/change; `unlink` always cleans up. **1132 → 1135 tests.** Docs-currency LOWs → rc.25.
436
+
437
+ **Pre-release (v3.10 line) — round-2 audit; LOW code batch.**
438
+
439
+ ### Fixed
440
+
441
+ - **LOW (DoS-cap completeness) — `obsidian_query_base` uncapped whole-vault content scan.** `queryBase` (`src/bases.ts`) walks every `.md` note and reads its full body (`limit` applied only AFTER the walk, by design, so it can't bound the scan). It's always-registered + bearer-reachable on `serve-http` — the same shape as `runDql`, which got a `capScanEntries` cap in rc.18 (M4). Now `queryBase`'s scan is wrapped in `capScanEntries(..., "obsidian_query_base")`. The `resource-bound-invariant` couldn't see it (its `discoverScanners` heuristic requires `listMarkdown`+`readNote`; `queryBase` uses `listFilesByExtension`+`readFile`, and `bases.ts` wasn't in `SCANNER_SOURCES`), so a standalone assertion (`bases.queryBase caps its scan via capScanEntries`) now guards it explicitly — mirroring the existing `communities.buildWikilinkGraph` separate assertion. (Broadening the heuristic was rejected: it would then sweep doctor/vault/communities/server/media for the `listFilesByExtension` shape and risk false-positive cascades.)
442
+ - **LOW (correctness) — `parser.ts bodyStartLine` false-early match.** rc.17 computed the body's file line via `source.indexOf(body)` to make embedding line-citations file-absolute. For a degenerate note whose whole body text also appears verbatim inside a frontmatter line, `indexOf` matches the frontmatter occurrence → a too-early line → an embedding chunk's `line_start` could point inside the frontmatter (the exact deep-link mis-pointing rc.17 fixed, for this shape). Now uses `source.lastIndexOf(body)` (the body is always the SUFFIX of source) + an empty-body guard (`body.length > 0 ? lastIndexOf : -1`).
443
+ - **LOW (correctness) — watcher skipped `unlink` cleanup for excluded paths.** rc.20's M7 defense-in-depth re-check (`if (isExcluded) return`) gated ALL event kinds. A note indexed before an exclusion was added, then deleted, never had its rows dropped (orphan FTS5/embed entries — hidden from results by the terminal `isExcluded` filter, but stale on disk). Now the gate is `if (kind !== "unlink" && isExcluded) return` — only the INDEXING ops (add/change) skip; a delete always purges, since removing content is never a privacy risk.
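The `bodyStartLine` fix above can be illustrated with a minimal sketch. It assumes the line number is derived by counting newlines before the body offset, and that a missing or empty body falls back to line 1. `lineOfOffset`, `bodyStartLineOld`, and `bodyStartLineNew` are hypothetical names for illustration, not the functions in `src/parser.ts`.

```typescript
// Hypothetical helper: 1-based line number of a character offset.
function lineOfOffset(source: string, offset: number): number {
  let line = 1;
  for (let i = 0; i < offset; i++) {
    if (source[i] === "\n") line++;
  }
  return line;
}

// Pre-rc.24 shape: the FIRST occurrence can sit inside the frontmatter.
function bodyStartLineOld(source: string, body: string): number {
  const at = source.indexOf(body);
  return at < 0 ? 1 : lineOfOffset(source, at);
}

// rc.24 shape: the body is the suffix of the source, so take the LAST
// occurrence, and guard the empty body.
function bodyStartLineNew(source: string, body: string): number {
  const at = body.length > 0 ? source.lastIndexOf(body) : -1;
  // Falling back to line 1 when nothing is found is an assumption of this sketch.
  return at < 0 ? 1 : lineOfOffset(source, at);
}

// The degenerate note from the entry: the body "hi" also appears in "x: hi".
const degenerate = "---\nx: hi\n---\nhi";
console.log(bodyStartLineOld(degenerate, "hi")); // 2 (inside the frontmatter)
console.log(bodyStartLineNew(degenerate, "hi")); // 4 (the real body line)
```

For ordinary notes the two variants agree; they diverge only when the body text repeats earlier in the file.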
444
+
445
+ ### Tests (1135)
446
+
447
+ `tests/resource-bound-invariant.test.ts` +1 (`bases.queryBase caps its scan via capScanEntries`). `tests/parser.test.ts` +1 (degenerate: body text also in frontmatter → `bodyStartLine` anchors to the real body line, not the frontmatter occurrence — fails with the old `indexOf`). `tests/watcher.test.ts` +1 (excluded `unlink` proceeds to cleanup — the discriminator vs the excluded `change` which stays gated). 1132 → 1135.
448
+
449
+ ### Files changed
450
+
451
+ - `src/bases.ts` (`capScanEntries` import + `queryBase` cap), `src/parser.ts` (`indexOf` → `lastIndexOf` + guard), `src/watcher.ts` (`handle()` gates only add/change on `isExcluded`), `tests/resource-bound-invariant.test.ts` (+1), `tests/parser.test.ts` (+1), `tests/watcher.test.ts` (+1), test-count claims → 1135.
452
+ - version bump 3.10.0-rc.23 → 3.10.0-rc.24.
453
+
454
+ ---
455
+
456
+ ## [3.10.0-rc.23] — 2026-06-06
457
+
458
+ > **TL;DR:** **HIGH — `serve-http` hung forever on SIGINT/SIGTERM (a regression of rc.19's own M3 fix).** A post-MED-batch re-sweep (3-agent audit on the rc.22 commit) caught it: rc.19 correctly made shutdown **await** the full teardown before `process.exit(0)` — but `shutdownHttpServer` awaits `server.close()`, and Node's `http.Server.close()` callback fires only once EVERY connection has ended and **does not terminate idle keep-alive sockets**. So any lingering connection (a reverse proxy's keep-alive, a half-open socket, an LB health probe, an SSE stream) made graceful shutdown block **indefinitely**. Pre-rc.19 this latent `close()` hang was masked because the cache-flush handler called `process.exit(0)` on its own; rc.19 removed that hatch and added no bound → "await the drain" became "await forever." **Reproduced** (lingering socket + SIGTERM → process alive >8s, would hang forever; pre-rc.19 exited in 8ms). Fix: a bounded `closeServerBounded()` — close idle keep-alives immediately, then force-close stragglers via `server.closeAllConnections()` after a 3s grace — so shutdown resolves on `close()` completion OR when the grace expires, whichever comes first, and never blocks indefinitely. **Verified: post-fix the same repro exits in ~3.0s, code 0.** This is the textbook recursion-pair (a fix for one shutdown bug shipping another); documented as such. **1130 → 1132 tests.**
459
+
460
+ **Pre-release (v3.10 line) — post-MED-batch audit; HIGH regression fix.**
461
+
462
+ ### Fixed
463
+
464
+ - **HIGH — `serve-http` graceful shutdown could hang forever on a lingering connection (regression introduced by rc.19).** `shutdownHttpServer` did `await new Promise(resolve => server.close(() => resolve()))`. `http.Server.close()` waits for ALL open connections to end and never force-closes idle keep-alives, so a single held-open socket blocked the await — and rc.19's `makeHttpShutdownHandler` gates `process.exit(0)` behind that await, so the process never exited. Under an orchestrator (systemd/docker) this escalates to a SIGKILL after the stop-timeout, defeating the very graceful-drain guarantee rc.19 was built to provide. New `closeServerBounded(server, graceMs = HTTP_CLOSE_GRACE_MS)` (with `HTTP_CLOSE_GRACE_MS = 3000`): registers `server.close()`, immediately calls `server.closeIdleConnections()` (so the common no-in-flight case resolves at once), and arms an unref'd `setTimeout` that force-closes stragglers with `server.closeAllConnections()` after the grace — resolving whichever happens first. Both `shutdownHttpServer` close sites (the `!extras` fast path + the main path, after `registry.closeAll()` drains in-flight MCP requests) route through it. Reproduced + fix-verified empirically (lingering-socket SIGTERM: hang → 3.0s bounded exit).
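The bounded close described above might look roughly like the following. This is a sketch reconstructed from the entry's description, not the code in `src/http-transport.ts`; it assumes Node 18.2 or later, where `closeIdleConnections()` and `closeAllConnections()` are available on `http.Server`.

```typescript
import * as http from "node:http";

const HTTP_CLOSE_GRACE_MS = 3000;

// Resolves when close() completes OR when the grace expires, whichever is first.
function closeServerBounded(
  server: http.Server,
  graceMs: number = HTTP_CLOSE_GRACE_MS,
): Promise<void> {
  return new Promise<void>((resolve) => {
    let settled = false;
    const finish = (): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve();
    };

    // Force-close stragglers once the grace expires.
    const timer = setTimeout(() => {
      server.closeAllConnections();
      finish();
    }, graceMs);
    timer.unref(); // the timer alone must not keep the process alive

    // Stop accepting connections; the callback fires once every socket has ended.
    server.close(() => finish());
    // Drop idle keep-alive sockets now so the common case resolves promptly.
    server.closeIdleConnections();
  });
}
```

Resolving on the first of the two events is what separates this from a plain `setTimeout(resolve, graceMs)`, which would always wait out the full grace.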
465
+
466
+ ### Tests (1132)
467
+
468
+ `tests/http-transport.test.ts` +2: `closeServerBounded` resolves within the grace despite a lingering keep-alive socket (the rc.19 hang — pre-fix this never returns) + CONTROL (with nothing lingering it resolves promptly, well under a large grace — proving it resolves on `close()` completion, NOT by always waiting the grace; a naive `setTimeout(resolve, grace)` impl fails this). 1130 → 1132.
469
+
470
+ ### Files changed
471
+
472
+ - `src/http-transport.ts` (new `HTTP_CLOSE_GRACE_MS` + exported `closeServerBounded`; both `shutdownHttpServer` close sites bounded), `tests/http-transport.test.ts` (+2 + `node:http`/`node:net` imports), test-count claims → 1132.
473
+ - version bump 3.10.0-rc.22 → 3.10.0-rc.23.
474
+
475
+ ### Method note
476
+
477
+ The fix was found by re-running a focused audit on the just-shipped commit (the CLAUDE.md "post-merge re-sweep after a class-closing release" rule) — a 3-agent pass (adversarial diff re-review · docs/process · behavioral/STRIDE) that I commissioned after the MED batch, with the HIGH **empirically reproduced and fix-verified by me** before shipping (not taken on the agent's word). It is a clean recursion-pair instance: rc.19 fixed a shutdown race (flush `exit` beat the drain) and, in removing the `exit` hatch, exposed a latent `server.close()` hang. The remaining audit findings (1 LOW behavioral — `obsidian_query_base` uncapped scan; 2 LOW correctness — parser `indexOf`, watcher unlink-skip; 5 LOW docs currency) ship as rc.24/rc.25.
478
+
479
+ ---
480
+
481
+ ## [3.10.0-rc.22] — 2026-06-06
482
+
483
+ > **TL;DR:** **Audit MED-batch 7 (final) — M8/M9 test & process integrity.** **M8a (vacuous test):** `security.test.ts`'s "embeddingsSearch filters excluded paths" test was THEATER — it said "we can't test without a model" and **reimplemented** the privacy filter inline (`rawHits.filter(h => !vault.isExcluded(...))`), never running the real code, so `embeddingsSearch`'s two inline filter sites (search.ts ~1100/1106) were uncovered and would stay green even if the guard were deleted. Extracted the filter into a pure, exported `filterExcludedEmbedHits` (the `embeddingsSearch` sibling of rc.8's `pruneExcludedHits`), routed both sites through it, and made the test + a new unit test exercise the REAL helper. **M8b (silent-skip):** the E2E CI-GUARD only asserted `distExists()` — T-3/T-4 had **no** guard at all, and none checked that the server actually spawned, so a spawn failure in CI would silently skip whole suites (incl. T-4's 401-no-bearer auth check). Added/strengthened CI-GUARDs across T-2/T-3/T-4 to assert dist built **and** the process spawned in CI. **M9 (config drift, ι-class):** `package.json` `prepublishOnly` used a single `npm audit --audit-level=high` (all deps) while CI + release both use the stricter two-step (`--omit=dev --audit-level=moderate` then `--include=dev --audit-level=high`) — so prepublish would miss a *moderate prod* vuln CI/release catch. Aligned. **1126 → 1130 tests. This closes the comprehensive-audit MEDIUM batch (M1–M10).**
484
+
485
+ **Pre-release (v3.10 line) — audit fix batch 7/7 (M8/M9).**
486
+
487
+ ### Fixed
488
+
489
+ - **M8a — embeddingsSearch privacy filter was untested (vacuous "theater" test).** `tests/security.test.ts`'s embeddingsSearch privacy test reimplemented the `!vault.isExcluded(rel_path)` filter inline and asserted on the reimplementation — it never invoked anything in `src/`, so `embeddingsSearch`'s two inline filter sites had zero behavioral coverage. New pure `filterExcludedEmbedHits<T extends {rel_path}>(hits, isExcluded)` in `src/tools/search.ts` (sibling of `pruneExcludedHits`); both `embeddingsSearch` sites (HNSW refill path + brute-force path) now call it; the security test + a new `search-hybrid.test.ts` unit test (positive + NEGATIVE control) exercise the REAL helper. A regression that drops the guard now fails CI.
490
+ - **M8b — E2E CI-GUARD silent-skip gaps.** The `tests/e2e-handlers.test.ts` CI-GUARD asserted only `distExists()`, and only T-2 had one — T-3 (HyDE) and T-4 (serve-http) had none, so a failed `spawnServer` / `serve-http` spawn in CI would leave `client`/`proc` null and every test body's `if (!client) return` / `if (!proc) return` would silently skip the suite (including T-4's 401-without-bearer auth assertion). Now each of T-2/T-3/T-4 has a CI-GUARD asserting dist built **and** the server spawned (client non-null / proc non-null + port bound) in CI. Propagates the rc.23 silent-skip→CI-GUARD pattern to the two describes it missed.
491
+ - **M9 — `prepublishOnly` audit weaker than CI/release (ι-class config drift).** `package.json:prepublishOnly` ran `npm audit --audit-level=high` (high across ALL deps), while `ci.yml` + `release.yml` both run the two-step `npm audit --omit=dev --audit-level=moderate` + `npm audit --include=dev --audit-level=high` — so a **moderate severity prod** vuln would pass prepublish but fail CI/release. Aligned `prepublishOnly` to the identical two-step. (The v3.7.19 ι-class alignment synced release↔CI but missed `prepublishOnly`.)
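For M8a, a pure filter helper with the signature given above could be as small as the sketch below. The body is an assumption inferred from the description ("removes excluded `rel_path`s preserving order"); the real helper lives in `src/tools/search.ts`.

```typescript
// Sketch of a pure, predicate-driven privacy filter for embedding hits.
function filterExcludedEmbedHits<T extends { rel_path: string }>(
  hits: readonly T[],
  isExcluded: (relPath: string) => boolean,
): T[] {
  // Array.prototype.filter keeps the original order of the surviving hits.
  return hits.filter((hit) => !isExcluded(hit.rel_path));
}

// Illustrative data only; the paths and scores are made up.
const rawHits = [
  { rel_path: "notes/a.md", score: 0.91 },
  { rel_path: "private/b.md", score: 0.88 },
  { rel_path: "notes/c.md", score: 0.72 },
];
const visible = filterExcludedEmbedHits(rawHits, (p) => p.startsWith("private/"));
console.log(visible.map((h) => h.rel_path)); // [ 'notes/a.md', 'notes/c.md' ]
```

Because the helper takes the predicate as an argument, a test can exercise the real function without loading an embedding model, which is the point of the extraction.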
492
+
493
+ ### Tests (1130)
494
+
495
+ `tests/search-hybrid.test.ts` +2 (`filterExcludedEmbedHits`: removes excluded `rel_path`s preserving order + NEGATIVE control proving it's predicate-driven). `tests/e2e-handlers.test.ts` +2 (T-3 + T-4 CI-GUARDs; T-2's existing guard strengthened to also assert the spawn, net 0). `tests/security.test.ts` rewired to call the real helper (net 0). 1126 → 1130.
496
+
497
+ ### Files changed
498
+
499
+ - `src/tools/search.ts` (new exported `filterExcludedEmbedHits`; `embeddingsSearch` routes both filter sites through it), `tests/security.test.ts` (theater test → real-helper call), `tests/search-hybrid.test.ts` (+2 + import), `tests/e2e-handlers.test.ts` (T-2 guard strengthened + T-3/T-4 guards added), `package.json` (`prepublishOnly` two-step audit + test-count 1130), test-count claims → 1130.
500
+ - version bump 3.10.0-rc.21 → 3.10.0-rc.22.
501
+
502
+ ---
503
+
504
+ ## [3.10.0-rc.21] — 2026-06-06
505
+
506
+ > **TL;DR:** **Audit MED-batch 6 — M2/M10 docs integrity.** State-driven docs sweep. **M2 (verified, anti-overclaim):** the total tool-count is ALREADY pinned to `TOOL_MANIFEST.length` (45) across README / STABILITY / COMPARISON / api.md / llms.txt — the audit's "extend docs-consistency to pin them" was already done; the ONE unguarded surface was `ROADMAP.md`, whose "44 tool descriptions" had silently drifted while every guarded surface stayed at 45. Fixed the `44`→`45` + added a `docs-consistency` pin (+ NEGATIVE control) so ROADMAP can't drift again. **M10:** `CITATION.cff` was 2 stables behind (`3.8.8` → `3.9.1`, the current `@latest`); `ROADMAP.md` contradicted itself (the v3.10 forgetting-aware freshness feature listed both as *shipped* under "Already shipped" AND as *open* `[ ]` in Tier-3 — reconciled to `[x]` citing rc.5; the TDQS-pass item was `[ ]` but shipped in rc.7 — marked `[x]`); `server.json`'s subcommand hint got a "run with no subcommand for the full list" suffix (kept representative, NOT an enumerated 15-item list — that would add a drift surface). **api.md "stable v3.9.x" verified accurate (no change).** **1124 → 1126 tests.** M8/M9 test/process → rc.22.
507
+
508
+ **Pre-release (v3.10 line) — audit fix batch 6 (M2/M10 docs).**
509
+
510
+ ### Fixed
511
+
512
+ - **M2 — ROADMAP.md tool-count drift + the unguarded surface.** Every *canonical* tool-count surface (README badge+hero+heading, STABILITY header, COMPARISON, api.md first paragraph, llms.txt breakdown) was already pinned to `TOOL_MANIFEST.length` by `docs-consistency.test.ts` — so they all correctly read **45**. `ROADMAP.md`'s "TDQS pass on all **44** tool descriptions" was the lone surface NOT in that invariant set, so it drifted. Fixed to 45 + added `checkRoadmapToolCount` (pure check + positive + NEGATIVE control) pinning ROADMAP's "N tool descriptions" to the manifest. (The audit's "extend docs-consistency to pin them" was already satisfied for the canonical surfaces — only ROADMAP needed closing; documented to avoid claiming a fix that already existed.)
513
+ - **M10 — CITATION.cff stale.** `version: "3.8.8"` → `"3.9.1"` + `date-released` → `2026-06-01` (the current `@latest`; CITATION updates only on a stable promotion, per its own comment — it had missed the 3.9.0/3.9.1 promotions).
514
+ - **M10 — ROADMAP self-contradiction.** (a) "Forgetting-aware freshness (v3.10)" is listed under **Already shipped** (rc.4 plumbing + rc.5 opt-in recency re-ranking) but Tier-3 still carried `[ ] "Forgetting-aware" note-staleness scoring` for the same user-facing capability → marked `[x]` (shipped v3.10-rc.5; post-fusion re-rank achieving the Memora-frontier goal) + a cross-reference so the intentional duplication is clear. (b) `[ ] TDQS pass on all 44 tool descriptions` shipped in rc.7 → `[x]` + 45. (Tier-2 items that are only *partially* shipped — rc.14 AI-search bundle, the Obsidian-MCP COMPARISON table — left `[ ]` to avoid overclaiming.)
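A pure check of the kind M2 describes could be sketched as follows. The parameter list, the return shape, and the regular expression are assumptions for illustration; only the name `checkRoadmapToolCount` and the "N tool descriptions" phrase come from the entry.

```typescript
type CountCheck = { ok: true } | { ok: false; reason: string };

// Pins a "<N> tool descriptions" claim in ROADMAP text to the manifest length.
function checkRoadmapToolCount(roadmapText: string, manifestLength: number): CountCheck {
  // Allow optional Markdown bold markers around the number, e.g. "**45** tool descriptions".
  const match = /(\d+)\*{0,2} tool descriptions/.exec(roadmapText);
  if (!match) return { ok: false, reason: "missing tool-count claim" };
  const claimed = Number(match[1]);
  return claimed === manifestLength
    ? { ok: true }
    : { ok: false, reason: `ROADMAP claims ${claimed}, manifest has ${manifestLength}` };
}

console.log(checkRoadmapToolCount("- [x] TDQS pass on all **45** tool descriptions", 45));
// { ok: true }
```

Treating a missing claim as a failure keeps the check from passing vacuously if the sentence is ever reworded.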
515
+
516
+ ### Docs
517
+
518
+ - **M10 — server.json subcommand hint.** Appended "— run with no subcommand for the full list" to the positional-arg description so the MCP-Registry hint doesn't undersell the CLI, WITHOUT enumerating all 15 subcommands (an enumerated list would be a fresh drift surface + risks the registry schema's description length; `setup` already subsumes the model/index subcommands).
519
+
520
+ ### Tests (1126)
521
+
522
+ `tests/docs-consistency.test.ts` +2: `ROADMAP.md tool-count claim matches TOOL_MANIFEST` (the M2 pin) + `NEGATIVE: checkRoadmapToolCount flags drift / missing claim`. 1124 → 1126.
523
+
524
+ ### Files changed
525
+
526
+ - `CITATION.cff` (3.8.8 → 3.9.1 + date), `ROADMAP.md` (TDQS + forgetting-aware items → `[x]`; 44 → 45; test total → 1126), `server.json` (subcommand hint suffix), `docs/COMPARISON.md` / `README.md` / `llms.txt` / `AGENTS.md` / `package.json` (test-count 1124 → 1126), `tests/docs-consistency.test.ts` (+2 + `checkRoadmapToolCount`).
527
+ - version bump 3.10.0-rc.20 → 3.10.0-rc.21.
528
+
529
+ ---
530
+
531
+ ## [3.10.0-rc.20] — 2026-06-06
532
+
533
+ > **TL;DR:** **Audit MED-batch 5 — M7 privacy / right-to-erasure.** Three privacy-hardening fixes: (1) the HNSW persist **base** was computed independently by the WRITER (`server.ts`) and the ERASER (`EmbedDb.clearOnDisk`) — a duplication that, on drift, would leave the `.hnsw.bin` / `.hnsw.meta.json` sidecars (the meta sidecar carries raw `text_preview`) on disk after `clear-embeddings`, a right-to-erasure gap (the rc.34 P-2 class via a different seam); now both route through ONE shared `hnswPersistBase()` helper, and the erasure-completeness invariant asserts it. (2) The `--watch` handler `handle()` now re-checks `vault.isExcluded()` per file as **defense-in-depth** (chokidar's `ignored` predicate already drops excluded paths; this guards the case where `handle()` is reached another way — mirrors the existing PDF re-check). (3) `SECURITY.md` now documents that privacy filters are **not retroactive** for content already at rest — adding `--exclude-glob` hides matching notes from results immediately (terminal `isExcluded()` filter) but does NOT erase the chunk already written to `.fts5.db` / `.embed.db`; that needs `clear-index` / `clear-embeddings` + rebuild. **1119 → 1124 tests.** M2/M10 docs → rc.21.
534
+
535
+ **Pre-release (v3.10 line) — audit fix batch 5 (M7).**
536
+
537
+ ### Fixed
538
+
539
+ - **M7.1 — shared HNSW persist-base helper (right-to-erasure anti-drift).** `server.ts` (writer) derived `persistFile` as `` `${embedFile.replace(/\.embed\.db$/, "")}.hnsw` `` while `EmbedDb.clearOnDisk` (eraser) recomputed the identical expression independently. If either changed, `clear-embeddings` would erase the wrong sidecar path and leave HNSW files (incl. `.hnsw.meta.json`'s raw `text_preview`) on disk. New exported `hnswPersistBase(embedDbFile)` is the single source of truth; both call sites route through it. The `erasure-completeness invariant` now (a) scans the helper for the `.hnsw` suffix (moved out of the eraser method body) and (b) asserts BOTH the eraser AND the writer call `hnswPersistBase` and that the writer no longer recomputes the base inline.
540
+ - **M7.2 — watcher `handle()` privacy defense-in-depth.** The chokidar `ignored` predicate already drops `--exclude-glob` / `--read-paths` paths before they reach the handler, but `handle()` now ALSO re-checks `vault.isExcluded(relPath)` and returns before any cache-invalidation / index / embed work — so a filtered note can't be indexed even if `handle()` is reached by a direct call or a chokidar edge case. Mirrors the existing defensive PDF re-check.
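The single-source-of-truth helper from M7.1 can be sketched directly from the expression quoted above. `hnswSidecarPaths` is a hypothetical companion added here only to show how a writer and an eraser would both derive the `.hnsw.bin` and `.hnsw.meta.json` paths from one base.

```typescript
// Derives the HNSW persist base from the embed-db path (expression as quoted in M7.1).
function hnswPersistBase(embedDbFile: string): string {
  return `${embedDbFile.replace(/\.embed\.db$/, "")}.hnsw`;
}

// Hypothetical helper: both the writer and the eraser would ask for the same list.
function hnswSidecarPaths(embedDbFile: string): string[] {
  const base = hnswPersistBase(embedDbFile);
  return [`${base}.bin`, `${base}.meta.json`];
}

// The path below is illustrative only.
console.log(hnswSidecarPaths("/tmp/cache/vault.embed.db"));
// [ '/tmp/cache/vault.hnsw.bin', '/tmp/cache/vault.hnsw.meta.json' ]
```

With one derivation, a future change to the naming scheme cannot leave the eraser pointing at a path the writer no longer uses.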
541
+
542
+ ### Docs
543
+
544
+ - **M7.3 — SECURITY.md: privacy filters are not retroactive for at-rest content.** Added a bullet to the `--read-paths` / `--exclude-glob` posture: a filter added *after* a note was indexed hides it from all tool results immediately (same `isExcluded()` predicate gates search/read/walker) but does NOT erase the copy already on disk (`.fts5.db`, `.embed.db` `text_preview`, `.hnsw.meta.json`); purge via `clear-index` / `clear-embeddings` / `clear-cache` then rebuild. Also updated the "Watcher-aware" bullet to note the new `handle()` re-check.
545
+
546
+ ### Refactor
547
+
548
+ - `hnswPersistBase` lives in `src/embed-db.ts` (alongside `defaultEmbedDbFile`; it owns the `.embed.db` → `.hnsw` relation), imported by `server.ts`. No new import edge beyond server.ts's existing embed-db import; no cycle.
549
+
550
+ ### Tests (1124)
551
+
552
+ `tests/erasure-invariant.test.ts` +3: `hnswPersistBase` behavioral derivation (3 cases) + structural "eraser & writer both route through the helper, writer doesn't inline the base" + NEGATIVE control (the inline-base detector flags the pre-rc.20 shape). The manifest loop now scans `helperFns` so the `.hnsw` suffix is still verified after moving into the helper. `tests/watcher.test.ts` +2: `handle()` skips an excluded path before `invalidateOne` (the M7.2 fix) + POSITIVE control (a non-excluded path DOES reach `invalidateOne`; both build abs paths from the realpath-canonical `v.root` so handle()'s `path.relative` guard doesn't mask the result). 1119 → 1124.
553
+
554
+ ### Files changed
555
+
556
+ - `src/embed-db.ts` (new `hnswPersistBase`; `clearOnDisk` routes through it), `src/server.ts` (writer routes through it), `src/watcher.ts` (`handle()` isExcluded re-check), `SECURITY.md` (non-retroactive note + Watcher-aware update), `tests/erasure-invariant.test.ts` (+3 + `extractFn` + `helperFns` scan), `tests/watcher.test.ts` (+2), test-count claims → 1124.
557
+ - `scripts/check-per-file-coverage.mjs` — refreshed the stale `http-transport.ts` inline coverage comment (72.85% → 77.61%; rc.19's M3 handler removal raised it; caught by OIA Check 6 against the fresh `coverage-summary.json`).
558
+ - version bump 3.10.0-rc.19 → 3.10.0-rc.20.
559
+
560
+ ---
561
+
562
+ ## [3.10.0-rc.19] — 2026-06-06
563
+
564
+ > **TL;DR:** **Audit MED-batch 4 — M3 signal-shutdown race (both transports).** On `SIGINT`/`SIGTERM`, `serve-http` registered **four separate** listeners on the same signal — a cache-`flush` handler, `closeWatcher`, `closeFts`, and `shutdownHttpServer` — and the flush handler called `process.exit(0)` the **moment its fast cache flush resolved**, racing ahead of `shutdownHttpServer`'s up-to-5s in-flight-session drain and **cutting off in-flight requests**. The other three were pure duplication: `shutdownHttpServer` already flushes the cache and closes fts/watcher/embed-db. Fix: ONE `makeHttpShutdownHandler` orchestrator that **awaits** the full graceful teardown, then exits (re-entrancy-guarded). The stdio path (`startServer`) had the same shape (three handlers, the flush one calling `process.exit(0)` on its own — racing the async `watcher.close()`); consolidated into one `shutdownStdioDeps(deps)` that awaits watcher → embed-db → cache → fts before exit. `shutdownStdioDeps` was extracted to `src/shutdown.ts` so it's unit-testable (server.ts is in `no-internal-imports`' RESTRICTED_MODULES — same reason embed-pipeline.ts was split in rc.4). **1113 → 1119 tests.** M7 (privacy/erasure) → rc.20.
565
+
566
+ **Pre-release (v3.10 line) — audit fix batch 4 (M3).**
567
+
568
+ ### Fixed
569
+
570
+ - **M3 (HTTP) — the cache-flush SIGINT/SIGTERM handler raced the session drain.** `startHttpServer` registered FOUR listeners on each of `SIGINT`/`SIGTERM`: a persistent-cache `flush` (calling `process.exit(0)` in its `.finally`), `closeWatcher`, `closeFts`, and `shutdown` (= `shutdownHttpServer`). Because the flush's `saveDiskCache` is fast and the registry drain (`closeAll`, up to 5s) is slow, `process.exit(0)` fired **before** in-flight stateful requests finished — exactly the leak `shutdownHttpServer` (v3.8.7 P2-11) was built to prevent. New `makeHttpShutdownHandler(server, exit?)` returns a single, re-entrancy-guarded handler that `await`s `shutdownHttpServer` (drain → close TCP listener → flush cache → close fts/watcher/embed-db) and only THEN exits. The three redundant handlers are removed (their work is wholly subsumed by `shutdownHttpServer`); `beforeExit` keeps a guarded best-effort teardown for the natural-drain path.
571
+ - **M3 (stdio) — same shape.** `startServer` had three separate signal handlers and the cache-flush one called `process.exit(0)` on its own completion, racing the (async) `watcher.close()`. Consolidated into one orchestrator awaiting `shutdownStdioDeps(deps)` — which closes watcher + embed-db, flushes the persistent cache, then closes fts5, **awaiting each async step** (best-effort: a throw in one step never blocks the rest). The ordering is now deterministic and nothing exits mid-teardown.
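The two M3 ideas, an awaited best-effort teardown and a single re-entrancy-guarded handler, can be sketched generically. `teardownInOrder`, `makeShutdownHandler`, and the `TeardownStep` type are hypothetical names; the real orchestrators are `makeHttpShutdownHandler` and `shutdownStdioDeps`, whose exact signatures this changelog does not show.

```typescript
type TeardownStep = { name: string; run: () => void | Promise<void> };

// Runs the steps strictly in order, awaiting each one. A throwing step is
// recorded but never blocks the steps after it (best-effort).
async function teardownInOrder(steps: readonly TeardownStep[]): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch {
      failed.push(step.name);
    }
  }
  return failed;
}

// One handler per process: awaits the full teardown, then exits exactly once.
function makeShutdownHandler(
  teardown: () => Promise<unknown>,
  exit: (code: number) => void,
): () => Promise<void> {
  let started = false;
  return async () => {
    if (started) return; // a second signal is a no-op
    started = true;
    try {
      await teardown();
    } finally {
      exit(0);
    }
  };
}
```

A caller would register the returned handler once for both signals, for example `process.on("SIGINT", handler)` and `process.on("SIGTERM", handler)`, passing `process.exit` as `exit`. Injecting `exit` is what makes the ordering observable in a test.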
572
+
573
+ ### Refactor
574
+
575
+ - **`shutdownStdioDeps` extracted to `src/shutdown.ts`.** `src/server.ts` is in the `no-internal-imports` RESTRICTED_MODULES list ("registration boilerplate"), so a helper there can't be imported by a test — the SAME constraint that drove the rc.4 embed-pipeline extraction. The new module declares a minimal structural `StdioShutdownDeps` interface locally (no import of `ServerDeps`) so there's zero import cycle with the server module; `ServerDeps` structurally satisfies it, so `startServer` passes `deps` directly.
576
+
577
+ ### Tests (1119)
578
+
579
+ `tests/http-transport.test.ts` +2: `makeHttpShutdownHandler` awaits full teardown before exit (asserts exit is NOT synchronous, the TCP listener is closed BEFORE exit fires, and a second signal is a re-entrancy no-op) + NEGATIVE control (a handler that skips the await "exits" while the listener is still up — proving the positive assertion depends on the await). `tests/shutdown.test.ts` (new) +4: ordering watcher→embed-db→cache→fts with awaited async steps; cache flush skipped when persistent cache disabled; best-effort (a throwing step doesn't block the rest); NEGATIVE control (a non-awaiting teardown records the sync "exit" step before the async one finishes). 1113 → 1119.
580
+
581
+ ### Files changed
582
+
583
+ - `src/http-transport.ts` (new `makeHttpShutdownHandler`; four signal handlers → one orchestrator + guarded `beforeExit`), `src/server.ts` (import `shutdownStdioDeps`; three handlers → one orchestrator; drop now-unused `vault`/`ftsIndex`/`watcher` destructuring), `src/shutdown.ts` (new — `StdioShutdownDeps` + `shutdownStdioDeps`), `tests/http-transport.test.ts` (+2), `tests/shutdown.test.ts` (new, +4), test-count claims → 1119.
584
+ - version bump 3.10.0-rc.18 → 3.10.0-rc.19.
585
+
586
+ ---
587
+
588
+ ## [3.10.0-rc.18] — 2026-06-06
589
+
590
+ > **TL;DR:** **Audit MED-batch 3 — M4 DoS-cap completeness (`obsidian_dataview_query` + invariant scope).** The audit flagged `obsidian_dataview_query` (`runDql`) as an uncapped whole-vault `readNote`+parse scan reachable over bearer `serve-http`. Root cause: the rc.36 `resource-bound-invariant`'s `SCANNER_SOURCES` covered `read.ts`/`search.ts`/`meta.ts` but NOT `dql.ts`, so `runDql` was never required to be CAP-or-EXEMPT (scope-too-narrow — the recurring class). Fix: cap `runDql` with `capScanEntries` (defense-in-depth — DQL is a *linear* query so a > MAX_SCAN_NOTES vault yields a partial, logged result, never a hang) + add `src/dql.ts` to `SCANNER_SOURCES` so the invariant patrols it. Also fixed a manifest drift my OWN rc.16 introduced: `getOpenQuestions` began calling `capScanEntries` in rc.16 but the manifest still listed it EXEMPT — reclassified CAPPED. **1113 tests unchanged.** M3 (signal-shutdown) → rc.19.
591
+
592
+ **Pre-release (v3.10 line) — audit fix batch 3 (M4).**
593
+
594
+ ### Fixed
595
+
596
+ - **M4 — `obsidian_dataview_query` whole-vault scan was uncapped AND invisible to the resource-bound invariant.** `runDql` (`src/dql.ts`) does `vault.listMarkdown()` → per-note `readNote` + frontmatter-eval; it's always-registered + bearer-reachable, but lived OUTSIDE the invariant's `SCANNER_SOURCES`, so the rc.36 "every whole-vault scanner is CAP-or-EXEMPT" completeness check never saw it. Now: the scan is wrapped in `capScanEntries` (defense-in-depth — DQL is O(N) linear, so a > MAX_SCAN_NOTES vault yields a partial result with a logged warning, not a hang), and `src/dql.ts` is added to `SCANNER_SOURCES`. The audit's "like the uncapped graph tools" framing is imprecise (graph tools are O(N²)/graph and MUST cap; DQL is linear) — but a cap is the right defense-in-depth for a bearer-reachable whole-vault tool, and closing the invariant scope is the structural fix.
597
+ - **rc.16 manifest drift — `getOpenQuestions` was CAPPED in code but EXEMPT in the manifest.** rc.16 (M5) added `capScanEntries` to `getOpenQuestions` but left its `resource-bound-invariant` classification EXEMPT. Reclassified CAPPED (with its `capScanEntries` token) so the manifest matches reality — a post-rc.16 recursion-sweep catch.
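The cap-and-log behaviour described for M4 can be illustrated with a small stand-in. `capEntries` and its parameters are hypothetical: this changelog does not show the real `capScanEntries` signature or the value of `MAX_SCAN_NOTES`, so the limit is passed in explicitly here.

```typescript
// Hypothetical stand-in for a scan cap: returns at most `max` entries and
// logs a warning when the result is partial.
function capEntries<T>(
  entries: readonly T[],
  max: number,
  label: string,
  warn: (message: string) => void = console.warn,
): T[] {
  if (entries.length <= max) return [...entries];
  warn(`${label}: scan capped at ${max} of ${entries.length} entries; result is partial`);
  return entries.slice(0, max);
}

// Illustrative use: a linear scan over a capped list never touches more than `max` notes.
const notePaths = ["a.md", "b.md", "c.md", "d.md"];
for (const path of capEntries(notePaths, 3, "obsidian_dataview_query")) {
  console.log(`scanning ${path}`);
}
```

For a linear query the cap trades completeness for a bounded worst case, which is why the entry stresses that the truncation is logged.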
598
+
599
+ ### Tests (1113)
600
+
601
+ No new `it()` — the `resource-bound-invariant` now structurally covers `runDql` (CAPPED → must reference `capScanEntries`) and the corrected `getOpenQuestions` classification; the existing `dql.test.ts` exercises the > MAX_SCAN_NOTES cap path (logs the truncation). 1113 unchanged.
602
+
603
+ ### Files changed
604
+
605
+ - `src/dql.ts` (`capScanEntries` import + scan cap), `tests/resource-bound-invariant.test.ts` (`SCANNER_SOURCES` += `src/dql.ts`; `runDql` + `getOpenQuestions` → CAPPED; drop `getOpenQuestions` from EXEMPT).
606
+ - version bump 3.10.0-rc.17 → 3.10.0-rc.18.
607
+
608
+ ---
609
+
610
+ ## [3.10.0-rc.17] — 2026-06-06
611
+
612
+ > **TL;DR:** **Audit MED-batch 2/6 — M1 chunking parity (embeddings line citations).** The embedding pipeline chunks the frontmatter-STRIPPED body (to keep YAML out of the vectors) while the FTS5 index chunks the FULL note content — so for any note WITH frontmatter, embeddings / `find_similar` / `semantic_search` stored `line_start`/`line_end` that were BODY-relative (too low by the frontmatter line count), pointing deep-links at the wrong line; and the code comments falsely claimed "identical chunking across BM25 and embeddings." Fix: `parseNote` now exposes `bodyStartLine`, and `embedSingleNote` shifts each chunk's line numbers to FILE-absolute (matching FTS5) — keeping the clean body-only embeddings (no quality regression). Comments corrected; the residual `block`-granularity chunk-INDEX divergence for frontmatter'd notes is documented (the default `note` granularity fuses by path, unaffected). **1108 → 1113 tests.**
613
+
614
+ **Pre-release (v3.10 line) — audit fix batch 2/6.**
615
+
616
+ ### Fixed
617
+
618
+ - **M1 — embeddings line citations were body-relative (off by the frontmatter line count) for frontmatter'd notes.** `embedSingleNote` chunks `note.parsed.body` (frontmatter stripped, so YAML never pollutes the vectors), but `chunkContent`'s `lineStart`/`lineEnd` are then relative to the body, not the file — so a hit's deep-link pointed N lines too early (N = the frontmatter line count). `parseNote` now returns `bodyStartLine` (the 1-based file line where the body begins, via `source.indexOf(body)`; 1 with no frontmatter), and `embedSingleNote` adds `(bodyStartLine − 1)` to each chunk's line numbers → FILE-absolute, matching the FTS5 index (which chunks full content). Embedding quality is unchanged (still body-only). Existing embed-dbs apply the corrected lines as notes are re-embedded (on edit, or `enquire-mcp build-embeddings`) — no forced rebuild (proportionate: a full re-embed on every serve for a line-number fix would be disruptive).
619
+ - **M1 (claim-vs-reality) — `embed-db.ts` header claimed "Same chunking as FTS5 … so chunk identity matches across BM25 and embeddings."** False for markdown (FTS5 chunks full content; embeddings chunk body). Corrected to describe the actual design + the file-absolute line alignment + the `block`-granularity caveat. The `reindexPdfFile` "chunk IDs match" claim is accurate (PDFs have no frontmatter) and left as-is.
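The line shift from M1 amounts to adding `bodyStartLine - 1` to every chunk. A minimal sketch, assuming chunks carry the `lineStart`/`lineEnd` fields named above; the `Chunk` type and `toFileAbsolute` are hypothetical names.

```typescript
interface Chunk {
  text: string;
  lineStart: number; // 1-based, relative to the body
  lineEnd: number;
}

// Shifts body-relative line numbers to file-absolute ones.
function toFileAbsolute(chunks: readonly Chunk[], bodyStartLine: number): Chunk[] {
  const offset = bodyStartLine - 1; // 0 when the note has no frontmatter
  return chunks.map((chunk) => ({
    ...chunk,
    lineStart: chunk.lineStart + offset,
    lineEnd: chunk.lineEnd + offset,
  }));
}

// A note with three frontmatter lines: the body starts on file line 4.
const bodyChunks: Chunk[] = [{ text: "First paragraph", lineStart: 1, lineEnd: 2 }];
console.log(toFileAbsolute(bodyChunks, 4)); // lineStart 4, lineEnd 5
```

The chunk text is untouched, so the vectors stay body-only while the citations line up with the FTS5 index.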
620
+
621
+ ### Docs
622
+
623
+ - `searchHybrid` granularity `@param` now notes that in `block` granularity a per-note chunk INDEX may not denote the same span across BM25 (content) and embeddings (body) for frontmatter'd notes — prefer the default `note` granularity (fused by path) for frontmatter-heavy vaults.
624
+
625
+ ### Tests (1113)
626
+
627
+ `tests/embed-pipeline.test.ts` +2 (frontmatter'd note → chunk `lineStart` lands on the file line that actually contains the chunk text, not the `---`; NEGATIVE control: no-frontmatter ⇒ offset 0, line 1). `tests/parser.test.ts` +3 (`bodyStartLine` > 1 with frontmatter; === 1 without — NEGATIVE control; points at the first body line). 1108 → 1113.
628
+
629
+ ### Files changed
630
+
631
+ - `src/parser.ts` (`bodyStartLine` field + computation), `src/embed-pipeline.ts` (file-absolute line offset in `embedSingleNote`), `src/embed-db.ts` (header comment), `src/tools/search.ts` (granularity `@param` caveat), `tests/embed-pipeline.test.ts` (+2), `tests/parser.test.ts` (+3), test-count claims → 1113.
632
+ - version bump 3.10.0-rc.16 → 3.10.0-rc.17.
633
+
634
+ ---
635
+
636
+ ## [3.10.0-rc.16] — 2026-06-05
637
+
638
+ > **TL;DR:** **Audit MED-batch 1/6 — retrieval correctness.** A from-scratch 7-agent system audit (core code · transport/CLI · security/STRIDE · privacy · agent-facing surfaces · docs/process · tests), every headline adversarially re-verified against the code, returned **0 CRITICAL / 0 HIGH** for this 30+-round-audited codebase — real findings sat in the apparatus's known behavioral/docs blind spots. This RC fixes the first two verified MEDIUMs. **M5:** `obsidian_open_questions` is documented "oldest-first" but broke at `limit` in vault-WALK order and only THEN sorted — so on a vault with > `limit` questions it returned an arbitrary subset, NOT the oldest. Now collects all (scan capped via `capScanEntries`), sorts oldest-first, slices. **M6:** `HnswIndex.applyDiff` validated vector dim INSIDE the addPoint loop, so a wrong-dim vector threw AFTER some labels were `markDelete`'d and some points added → a half-applied index (silent embed-db↔HNSW divergence in the watcher, which logs + continues rather than rebuilding). Dim is now pre-validated before ANY mutation → atomic for the only caller-data-driven throw. **1104 → 1108 tests.**
639
+
640
+ **Pre-release (v3.10 line) — audit fix batch 1/6.**
641
+
642
+ ### Fixed
643
+
644
+ - **M5 — `obsidian_open_questions` returned an arbitrary `limit`-subset, not the oldest** (`src/tools/meta.ts`). The outer loop `break`'d once `out.length >= limit` in `vault.listMarkdown` (readdir) order, then `out.sort(age desc)` ran on that already-truncated set. On a vault with more than `limit` questions, callers asking for "the most-aged open questions" got whichever notes came first in the walk. Now: cap the scan (`capScanEntries`, defense-in-depth like the graph tools), collect ALL matches, sort oldest-first, then `slice(0, limit)` — the documented contract.
645
+ - **M6 — `HnswIndex.applyDiff` could leave a half-applied index on a dim mismatch** (`src/hnsw.ts`). The `pt.vector.length !== dim` check lived inside the addPoint loop, so a bad vector threw after the `markDelete` loop + earlier `addPoint`s had already mutated the index; the watcher's `syncHnswForFile` catch logs + continues (doesn't rebuild), leaving HNSW out of sync with the freshly-upserted embed-db until the next serve restart (ghost labels / stale `text_preview`). Now ALL dims are pre-validated before any `markDelete`/`resizeIndex`/`addPoint` → applyDiff is atomic for the only caller-data-driven throw (a native addPoint failure after the pre-grow remains the sole, rare, eventually-consistent residual, documented).
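The M6 pattern (validate every input before the first mutation) can be sketched with a toy in-memory index. `TinyIndex` and `LabeledVector` are hypothetical stand-ins for illustration; the real `HnswIndex` wraps a native index with `markDelete` / `resizeIndex` / `addPoint`, none of which are modelled here.

```typescript
interface LabeledVector {
  label: number;
  vector: number[];
}

// Toy stand-in for an ANN index: a Map from label to vector.
class TinyIndex {
  private readonly dim: number;
  private readonly points = new Map<number, number[]>();

  constructor(dim: number) {
    this.dim = dim;
  }

  has(label: number): boolean {
    return this.points.has(label);
  }

  applyDiff(removeLabels: number[], addPoints: LabeledVector[]): void {
    // Phase 1: pre-validate ALL dims. Nothing has been mutated yet,
    // so a throw here leaves the index exactly as it was.
    for (const pt of addPoints) {
      if (pt.vector.length !== this.dim) {
        throw new Error(
          `dim mismatch for label ${pt.label}: expected ${this.dim}, got ${pt.vector.length}`,
        );
      }
    }
    // Phase 2: mutate only after every input has passed validation.
    for (const label of removeLabels) this.points.delete(label);
    for (const pt of addPoints) this.points.set(pt.label, pt.vector);
  }
}

const tinyIndex = new TinyIndex(3);
tinyIndex.applyDiff([], [{ label: 1, vector: [0.1, 0.2, 0.3] }]);
```

If the dim check sat inside the phase-2 loop instead, a bad vector late in `addPoints` would throw after the deletes and earlier adds had already landed, which is the half-applied state described above.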
646
+
647
+ ### Security (dependency)
648
+
649
+ - **`hono` moderate advisory (transitive, not reachable) — pinned to the patched 4.12.23 via `overrides`.** A fresh `npm audit` moderate advisory hit `hono ≤4.12.20` (GHSA-xrhx-7g5j-rcj5 IPv6-deny bypass + GHSA-3hrh-pfw6-9m5x cookie injection + GHSA-f577-qrjj-4474 JWT scheme + GHSA-2gcr-mfcq-wcc3 mount-prefix), pulled in TRANSITIVELY by `@modelcontextprotocol/sdk@1.29.0` (via `@hono/node-server`). enquire runs its OWN node-`http` transport, not hono, so none of the flagged paths are reachable — but the `audit` CI gate flags the dep tree regardless. Added `"overrides": { "hono": "^4.12.21" }` → resolves to 4.12.23 (in-range for the SDK's `^4.11.4`, non-breaking; build + SDK transport tests green). `npm audit` back to **0 vulnerabilities** (prod-moderate + all-high). Caught by CI's `audit` gate, not my local battery — lesson: run `npm audit` locally too (the battery omitted it).
650
+
651
+ ### Tests (1108)
652
+
653
+ `tests/redos-guard.test.ts` +3 (open-questions oldest-first: limit=1 returns the oldest not the walk-first/newest; limit=2 returns the 2 oldest; full-order regression — names+mtimes chosen so the oldest is never readdir-first, making the revert discriminating). `tests/hnsw.test.ts` +1 (applyDiff wrong-dim throws atomically — the removeLabel survives the failed diff; NEGATIVE control: a valid diff DOES remove it). 1104 → 1108 (the hono override adds no tests).
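The ordering bug these tests pin can be sketched in a few lines. The `OpenQuestion` shape and both function names are hypothetical; the real tool walks the vault and caps the scan with `capScanEntries`, which this sketch omits.

```typescript
interface OpenQuestion {
  path: string;
  ageDays: number;
}

// Buggy shape: stop collecting at `limit` in walk order, THEN sort.
function truncateThenSort(walkOrder: OpenQuestion[], limit: number): OpenQuestion[] {
  const out: OpenQuestion[] = [];
  for (const q of walkOrder) {
    if (out.length >= limit) break;
    out.push(q);
  }
  return out.sort((a, b) => b.ageDays - a.ageDays);
}

// Fixed shape: collect everything, sort oldest-first, then slice.
function sortThenSlice(walkOrder: OpenQuestion[], limit: number): OpenQuestion[] {
  return [...walkOrder].sort((a, b) => b.ageDays - a.ageDays).slice(0, limit);
}

// The oldest note is deliberately NOT first in walk order.
const walked: OpenQuestion[] = [
  { path: "a-new.md", ageDays: 1 },
  { path: "b-old.md", ageDays: 30 },
  { path: "c-mid.md", ageDays: 10 },
];
console.log(truncateThenSort(walked, 1), sortThenSlice(walked, 1));
```

Placing the oldest entry away from the front of the walk is what makes the comparison discriminating: with the oldest entry first, both shapes would return the same result.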
654
+
655
+ ### Files changed
656
+
657
+ - `src/tools/meta.ts` (M5 + `capScanEntries` import), `src/hnsw.ts` (M6 dim pre-validation), `tests/redos-guard.test.ts` (+3), `tests/hnsw.test.ts` (+1), test-count claims → 1108.
658
+ - `package.json` + `package-lock.json` (hono `overrides` → 4.12.23, security).
659
+ - version bump 3.10.0-rc.15 → 3.10.0-rc.16.
660
+
661
+ ---
662
+
663
+ ## [3.10.0-rc.15] — 2026-06-03
664
+
665
+ > **TL;DR:** **Watcher-test flake stabilized at the root — it was blocking RELEASES, not just PRs.** The rc.13 release run failed at `tests/watcher.test.ts:505` (chokidar FSEvents timing); a re-run published it — but a transient blip must never fail a publish (rc.20 rule). Two root races: **(1)** a brand-new file's FIRST inotify/FSEvents event can be dropped on a loaded runner even after `start()` resolves on `ready`, so a one-shot `writeFile` + `waitFor` can wait forever — the fixed-`setTimeout` warm-ups (rc.7 #36, rc.9 W-FLAKE-2) only approximated a fix; **(2)** the `:505` assertion read the embed-error stderr line IMMEDIATELY after the `fts.totalFiles()` check, but that line is logged a tick LATER in the same handler. Fixed: new `writeAndWaitFor` re-touch-on-miss helper (re-writes the file if the first event is dropped — idempotent reindex) on all 5 new-file-add sites; the lagging embed-error signal is now polled with `waitFor`; `waitFor` default 4000 → 8000 ms. **Verified green 3× back-to-back.** **1104 tests unchanged** (bodies refactored, no `it()` added). **Test-only — zero `src/`.**
666
+
667
+ **Pre-release (v3.10 line) — release-reliability fix; no `src/` / behavior change.**
668
+
669
+ ### Fixed
670
+
671
+ - **Watcher test flake blocked the rc.13 release** (`tests/watcher.test.ts:505`, "expected false to be true"). Two root races, both fixed:
672
+ - **Missed first event.** chokidar can drop the FIRST add event for a brand-new path on a loaded runner (the watch is still arming when the write lands, even though `start()` already resolved on `ready`). New `writeAndWaitFor(filePath, content, cond)` re-writes the file every ~1.2 s while waiting, regenerating a fresh event the watcher reliably catches. Idempotent reindex ⇒ extra writes never change the asserted outcome. Applied to all 5 new-file-add sites (`logged.md`, `added.md`, `added.pdf`, `note-embed.md`, `embed-error.md`).
673
+ - **Asserting a lagging signal too early.** The `:505` check read the "embed-db sync failed" stderr line right after `fts.totalFiles()>=1`, but that line is logged a tick later in the same handler. Now polled with `waitFor`.
674
+ - **`waitFor` default 4000 → 8000 ms** — headroom for the full chain (event → `awaitWriteFinish` 250 ms → per-file queue → reindex → embed-sync) under coverage instrumentation + parallel workers, without masking a genuine hang.
675
+
676
+ ### Method note
677
+
678
+ This is the durable fix that the project's fixed-`setTimeout` warm-ups (rc.7 #36, rc.9 W-FLAKE-2) chased for three RCs. The root isn't "wait longer before writing"; it's "regenerate the event if it's dropped" plus "poll lagging signals, don't assert them immediately". It follows the "a transient blip must never fail a release" rule (rc.20), but the flake is fixed at the ROOT (a fixable test race) rather than masked with a retry (the rc.20 npm-ci retry was for an *unfixable* network flake). Verified empirically (watcher suite green 3× back-to-back), per the rc.25 "run it, don't assume" lesson. Deferred (noted): a structural lint flagging `writeFile`-then-`waitFor` in watcher tests was considered and rejected as noisy; the helper is the durable mechanism, and the remaining sites are change/unlink events (reliable on already-watched files).
679
+
680
+ ### Tests (1104)
681
+
682
+ `tests/watcher.test.ts` — 5 add-sites → `writeAndWaitFor`, embed-error downstream signal → `waitFor`, default timeout 4000→8000. No `it()` added/removed; 1104 unchanged.
683
+
684
+ ### Files changed
685
+
686
+ - `tests/watcher.test.ts` only (new `writeAndWaitFor` helper + 6 site refactors (5 add-sites → `writeAndWaitFor`, 1 embed-error check → `waitFor`) + `waitFor` timeout bump).
687
+ - version bump 3.10.0-rc.14 → 3.10.0-rc.15.
688
+
689
+ ---
690
+
691
+ ## [3.10.0-rc.14] — 2026-06-03
692
+
693
+ > **TL;DR:** **`query` + `prune` CLI (bug-report Issues 4, 8) — concludes the actionable bug-report batch.** **(Issue 4)** New `enquire-mcp query "<text>" --vault <path>` runs the SAME hybrid `searchHybrid` the MCP `obsidian_search` tool uses and prints the results — a one-shot CLI search for smoke-tests / CI / debugging without an MCP client (the report had to hand-craft JSON-RPC over stdio to verify retrieval). **(Issue 8)** New `enquire-mcp prune --vault <path>` GCs the per-vault index clutter that accumulates in the cache dir (`clear-cache`/`clear-index` only target the current vault); it removes all OTHER vaults' enquire artifacts, **dry-run by default** (opt in with `--yes`), and — via the pure `planCachePrune` with a strict `<12-hex>.{fts5.db,embed.db,hnsw.bin,hnsw.meta.json}` filter — can NEVER touch a file enquire didn't create. **1098 → 1104 tests.**
694
+
695
+ **Minor (pre-release) — v3.10 line; bug-report response batch 3/3 (DX CLI). Concludes the actionable bug-report batch (rc.12 model-path · rc.13 reranker · rc.14 query+prune).**
696
+
697
+ ### Added
698
+
699
+ - **`query` subcommand (Issue 4).** `enquire-mcp query "<text>" --vault <path> [--limit N] [--index-file …] [--json]` — builds/reuses the persistent FTS5 index (peek-safe tokenize, K-1 invariant), runs `searchHybrid`, prints `path:line [kind]` + snippet per hit (or the full JSON with `--json`). Unblocks CLI/CI smoke-testing of retrieval without an MCP client.
700
+ - **`prune` subcommand (Issue 8).** `enquire-mcp prune --vault <path> [--yes]` — removes cached index artifacts for OTHER vaults, keeping the named one. Dry-run preview by default; `--yes` deletes. Backed by the pure, exported `planCachePrune(entries, keepHash)` whose strict enquire-artifact regex is the safety property (verified: ignores user notes / wrong-shaped hashes / wrong extensions).
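The safety property of the prune plan (only names matching the strict artifact shape are ever candidates) can be sketched like this. `planPruneSketch` and `ARTIFACT_SKETCH` are hypothetical names; the return shape and the assumption that the 12-hex prefix is lowercase are guesses for illustration, not the signature of the exported `planCachePrune`.

```typescript
// Strict shape: <12 lowercase hex>.<one of four known enquire suffixes>.
const ARTIFACT_SKETCH = /^([0-9a-f]{12})\.(?:fts5\.db|embed\.db|hnsw\.bin|hnsw\.meta\.json)$/;

// Pure planner: returns the file names that WOULD be deleted.
// Anything that does not match the strict shape is never a candidate.
function planPruneSketch(entries: string[], keepHash: string): string[] {
  return entries.filter((name) => {
    const m = ARTIFACT_SKETCH.exec(name);
    return m !== null && m[1] !== keepHash;
  });
}

const cacheListing = [
  "aaaaaaaaaaaa.fts5.db", // current vault: kept
  "bbbbbbbbbbbb.embed.db", // other vault: pruned
  "bbbbbbbbbbbb.hnsw.meta.json", // other vault: pruned
  "notes.md", // user file: ignored
  "bbbb.fts5.db", // wrong-shaped hash: ignored
  "bbbbbbbbbbbb.txt", // wrong extension: ignored
];
console.log(planPruneSketch(cacheListing, "aaaaaaaaaaaa"));
```

Keeping the planner pure (a list of names in, a list of names out) is what lets the dry-run preview and the `--yes` deletion share one code path and be unit-tested without touching the filesystem.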
701
+
702
+ ### Fixed
703
+
704
+ - **Issue 4 — no CLI search for smoke/CI.** Previously retrieval could only be exercised through the MCP protocol (stdio JSON-RPC). `query` gives a direct CLI path.
705
+ - **Issue 8 — no GC for accumulated per-vault indexes.** The cache dir grew one index set per vault path/config hash with no cleanup command for OTHER vaults. `prune` is that command. (The root cause for the maintainer's own clutter was the test suite — fixed in rc.11; `prune` is the user-facing GC.)
706
+
707
+ ### Tests (1104)
708
+
709
+ `tests/cache-prune.test.ts` +4 (planCachePrune: selects other vaults, never the kept one, **NEGATIVE control** ignores non-enquire files, empty cases); `tests/cli.test.ts` +2 (`query` prints results, `prune` previews + deletes nothing without `--yes`). 1098 → 1104; claims synced (README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP).
710
+
711
+ ### Files changed
712
+
713
+ - `src/fts5.ts` (`planCachePrune` + `ENQUIRE_CACHE_ARTIFACT` regex), `src/cli.ts` (`query` + `prune` commands + `searchHybrid`/`planCachePrune` imports), `docs/api.md` (2 subcommand rows), `tests/cache-prune.test.ts` (new, +4), `tests/cli.test.ts` (+2), test-count claims → 1104.
714
+ - version bump 3.10.0-rc.13 → 3.10.0-rc.14.
715
+
716
+ ### Known / next
717
+
718
+ - **A flaky watcher test (`tests/watcher.test.ts:505`, chokidar FSEvents timing) failed the rc.13 release run** (a re-run published it). Same class as rc.7 #36 / rc.9 W-FLAKE-2, but now blocking *releases*, not just PRs — more severe. **rc.15 will stabilize it** (wait on the watcher's `ready` signal instead of a fixed `setTimeout` warmup) per the "a transient blip must never fail a release" rule (rc.20).
719
+
720
+ ---
721
+
722
+ ## [3.10.0-rc.13] — 2026-06-03
723
+
724
+ > **TL;DR:** **Reranker observability + pre-cache (bug-report Issues 9, 3, 5).** The cross-encoder reranker was a black box: enabling it triggered a SILENT ~110 MB download on the first query (which could exceed a client's tool-call timeout → unexplained RRF fallback), there was no way to pre-cache it, and the response gave no positive signal it ran. Now: **(Issue 9)** `obsidian_search` emits stderr lifecycle logs (`reranker '<alias>' loading (~110 MB…)` BEFORE the blocking load, `loaded; reranked N pairs` after — failures were already logged) and returns a `reranked: { applied, pairs?, reason? }` field; **(Issue 3)** `install-model` resolves the reranker catalog too, so `enquire-mcp install-model rerank-bge` pre-downloads the cross-encoder (the ~110 MB no longer blocks the first query); **(Issue 5, docs)** `docs/api.md` now states the default reranker is English-tuned and RU/multilingual vaults can leave it off (RRF hybrid already handles them), plus which aliases are verified-working. **1094 → 1098 tests.**
725
+
726
+ **Minor (pre-release) — v3.10 line; bug-report response batch 2/3 (reranker cluster).**
727
+
728
+ ### Added
729
+
730
+ - **`SearchHybridResponse.reranked`** — `{ applied: boolean; pairs?: number; reason?: string }`, present ONLY when a reranker was requested. `{applied:true, pairs:N}` after a successful re-score; `{applied:false, reason}` when requested but it didn't run (reason mirrors `signal_errors.reranker` on a load failure, or notes "no candidates"). Closes Issue 9's "silent no-op" — a caller can now tell applied-vs-fell-back without guessing.
731
+ - **`install-model` accepts reranker aliases** (Issue 3). `alias in RERANKER_MODELS` routes to `loadReranker` + a one-pair smoke; `enquire-mcp install-model rerank-bge` pre-caches the ~110 MB cross-encoder so `serve --enable-reranker` doesn't block on the download at first query. Unknown aliases now fail with BOTH catalogs listed (resolves the `bge` embedding vs `rerank-bge` cross-encoder naming confusion).
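A caller can distinguish the three reranker states from the `reranked` field roughly as follows. The field shape is taken from the entry above; `SearchResponseLike` and `describeRerank` are hypothetical names for this sketch, and the message strings are illustrative only.

```typescript
interface Reranked {
  applied: boolean;
  pairs?: number;
  reason?: string;
}

// Only the field relevant here; the real response carries more.
interface SearchResponseLike {
  reranked?: Reranked;
}

function describeRerank(resp: SearchResponseLike): string {
  // Absent field: no reranker was requested for this query.
  if (resp.reranked === undefined) return "reranker not requested";
  // Requested and applied: `pairs` reports how many pairs were re-scored.
  if (resp.reranked.applied) return `reranked ${resp.reranked.pairs ?? 0} pairs`;
  // Requested but not applied: `reason` explains why.
  return `not reranked: ${resp.reranked.reason ?? "no reason given"}`;
}

console.log(describeRerank({ reranked: { applied: true, pairs: 20 } }));
```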
732
+
733
+ ### Fixed
734
+
735
+ - **Issue 9 — reranker silently no-op'd with no diagnostics (Medium).** `loadReranker`'s download was silent and `search.ts` logged ONLY failures, so a first-run download that exceeded the client timeout looked identical to a silent failure. Added stderr lifecycle logging (loading… with size / loaded; reranked N pairs) — three distinguishable states — plus the `reranked` response field above. (`signal_errors.reranker` already carried the failure reason; this adds the in-progress + success signals.)
736
+
737
+ ### Docs
738
+
739
+ - **Issue 5 — reranker language guidance.** `docs/api.md` `--enable-reranker` / `--reranker-model` rows now state the default cross-encoder is English-tuned (RRF hybrid handles RU/multilingual well → leave it off with no quality loss), name `rerank-bge` as the only verified-working reranker (the multilingual aliases still fail at `AutoTokenizer`), and point at `install-model` for pre-caching. The empirical RU conclusion from the bug report (RRF-hybrid already correct without reranker) is now documented guidance.
740
+
741
+ ### Tests (1098)
742
+
743
+ `tests/reranker.test.ts` +3 (reranked applied / failed-with-reason / NEGATIVE-control absent-when-not-requested); `tests/cli.test.ts` +1 (install-model unknown-alias lists both catalogs). 1094 → 1098; claims synced (README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP).
744
+
745
+ ### Files changed
746
+
747
+ - `src/tools/search.ts` (`reranked` field + stderr lifecycle logs), `src/cli.ts` (install-model reranker routing + combined-catalog error + imports), `docs/api.md` (reranker guidance), `tests/reranker.test.ts` (+3), `tests/cli.test.ts` (+1), test-count claims → 1098.
748
+ - version bump 3.10.0-rc.12 → 3.10.0-rc.13.
749
+
750
+ ---
751
+
752
+ ## [3.10.0-rc.12] — 2026-06-03
753
+
754
+ > **TL;DR:** **Model-cache path: one resolver, no more lying paths (bug-report Issues 1 + 2).** A fresh-install bug report on a real 236-note vault found `enquire-mcp doctor` printing **"NOT READY — no Xenova model weights found"** on a fully-working **global** install: the cache probe only looked under `process.cwd()/node_modules/…`, but a global `npm i -g` loads the model from the package's OWN nested `node_modules`, resolved relative to the module — never relative to cwd (Issue 1). Relatedly, `install-model` / `setup` printed `cached under ~/.cache/huggingface/` — a path that stays empty — while the help text and two TSDocs named other (also wrong) paths (Issue 2). Both now flow through ONE source of truth, `resolveTransformersCacheDir()` (resolves `@huggingface/transformers` via `createRequire(import.meta.url)` → its package `.cache`, correct for hoisted AND nested layouts), so the diagnostic and the success message can never again disagree with reality. **Verified:** on this machine `doctor` now reports `✓ Embedding model cache — 2 model(s) cached under …/node_modules/@huggingface/transformers/.cache/Xenova/`. **1088 → 1094 tests.**
755
+
756
+ **Minor (pre-release) — v3.10 line; bug-report response batch 1/3 (model-path correctness).**
757
+
758
+ ### Fixed
759
+
760
+ - **Issue 1 — `doctor` false-negative on a global install (Medium).** `candidateModelCacheRoots()` (`src/doctor.ts`) probed only a cwd-based path; the model on a global install lives in the package's nested `node_modules/@huggingface/transformers/.cache`, which cwd never reaches → false `NOT READY` on a fully-working setup (panic / needless reinstalls). Now the FIRST candidate is `resolveTransformersCacheDir()` — the module-relative path transformers.js actually loads from — with the cwd probe kept as a local-dev/npx fallback and the HF-Hub conventions kept after it.
761
+ - **Issue 2 — `install-model` / `setup` printed a wrong, empty cache path (Low).** The success message (`cached under ~/.cache/huggingface/`), the `install-model` description (`~/.cache/huggingface/transformers.js/`), and two `src/embeddings.ts` TSDocs each named a different wrong path. All now print / point to the resolved truth (`resolveTransformersCacheDir()`); the `docs/api.md` `install-model` row was corrected. Root-cause sweep fixed **5 instances** of the wrong-path claim (install-model msg, setup msg, 2 TSDocs, api.md), not just the reported one.
762
+
763
+ ### Added
764
+
765
+ - **`resolveTransformersCacheDir()` + `deriveTransformersCacheDir()`** (`src/embeddings.ts`, exported): single resolver for the transformers.js model-cache dir. Pure derivation slices at the INNERMOST `node_modules/@huggingface/transformers` segment so it's correct for hoisted (`<root>/node_modules/…`) AND nested global-install (`…/enquire-mcp/node_modules/@huggingface/transformers/.cache`) layouts. Resolution-only (no ONNX load) → keeps `doctor`'s fast-read-only promise.
766
+ - **`tests/transformers-cache-path.test.ts`** (+6): pins the pure derivation incl. the nested global-install layout (the exact Issue-1 shape), with a discriminating NEGATIVE control (no marker → `null`); asserts the live resolver returns the package `.cache` and that `doctor` ranks it first.
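The "slice at the innermost marker" derivation can be sketched as a pure string function. `deriveCacheDirSketch` is a hypothetical name, the input paths below are invented examples, and the backslash normalization is an assumption; the exported `deriveTransformersCacheDir()` may differ in signature and details.

```typescript
const TRANSFORMERS_MARKER = "node_modules/@huggingface/transformers";

// Given a resolved module path, return the package's `.cache` dir,
// or null when the path does not contain the marker at all.
function deriveCacheDirSketch(resolvedPath: string): string | null {
  const normalized = resolvedPath.replace(/\\/g, "/");
  // lastIndexOf picks the INNERMOST marker, which is the one that matters
  // when the package is nested under another package's node_modules.
  const idx = normalized.lastIndexOf(TRANSFORMERS_MARKER);
  if (idx === -1) return null;
  return normalized.slice(0, idx + TRANSFORMERS_MARKER.length) + "/.cache";
}

// Hoisted layout vs nested global-install layout (both paths are made up).
const hoistedPath = "/proj/node_modules/@huggingface/transformers/dist/index.js";
const nestedPath =
  "/usr/lib/node_modules/enquire-mcp/node_modules/@huggingface/transformers/dist/index.js";
console.log(deriveCacheDirSketch(hoistedPath), deriveCacheDirSketch(nestedPath));
```

Because the function only manipulates a string, it can be tested against the nested global-install shape without performing a global install.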
767
+
768
+ ### Method note
769
+
770
+ Classic change-driven-vs-state-driven gap: my home gates never run a global install, so the cwd-only probe survived every internal sweep. A real fresh install found it immediately. Fix is the standard transform — collapse N drifting path strings into ONE resolver + a structural test on the derivation, so the diagnostic and the messages are provably consistent. The α-class (TSDoc-drift) rule applied: the wrong-path claims in the `embeddings.ts` TSDocs + the module header comment were corrected in the same commit as the code.
771
+
772
+ ### Tests (1094)
773
+
774
+ `tests/transformers-cache-path.test.ts` +6 source `it()`. 1088 → 1094; claims synced (README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP).
775
+
776
+ ### Files changed
777
+
778
+ - `src/embeddings.ts` (resolver + helpers + 2 TSDoc path fixes + header comment), `src/doctor.ts` (candidate #0 = resolved cache + stale-comment fix), `src/cli.ts` (install-model + setup messages → resolved path), `docs/api.md` (install-model row), `tests/transformers-cache-path.test.ts` (new), test-count claims → 1094.
779
+ - version bump 3.10.0-rc.11 → 3.10.0-rc.12.
780
+
781
+ ---
782
+
783
+ ## [3.10.0-rc.11] — 2026-06-03
784
+
785
+ > **TL;DR:** **Hermetic test cache — found by live-testing the installed server on a real machine.** Driving the installed `enquire-mcp@3.9.1` over JSON-RPC against a real 237-note vault confirmed it works end-to-end (boot, 33 tools, `obsidian_stats`, semantic `obsidian_search` with a built embed-db, path-escape guard) — but also surfaced **~27,000 orphaned files / ~699 MB** sitting in the real user cache (`~/Library/Caches/enquire/`). Root cause: `defaultIndexFile()` (and the embed-db / HNSW sidecars) resolve their dir from `XDG_CACHE_HOME`, and any test that spawns `serve`/`setup`/`build-embeddings`/`index` **without** an explicit `--index-file` fell back to that REAL cache and never cleaned up — weeks of `npm test` accumulated there (mtimes May 8 → Jun 3). Fixed at the root: `tests/setup.ts` now redirects `XDG_CACHE_HOME` to a throwaway temp dir before any test (and every inheriting child spawn) touches the cache. **Verified: real-cache file delta from the suite is now 0.** Structural guard added so it can't regress. **1085 → 1088 tests.**
786
+
787
+ **Minor (pre-release) — v3.10 line; test-hygiene fix from live-testing (no `src/` runtime change).**
788
+
789
+ ### Fixed
790
+
791
+ - **Test suite no longer pollutes the real user cache.** `tests/setup.ts` sets `process.env.XDG_CACHE_HOME = mkdtempSync(tmpdir()/enquire-test-cache-…)` (guarded by `if (!XDG_CACHE_HOME)` so CI/devs can override). Because `defaultIndexFile()` keys on `XDG_CACHE_HOME` on every platform and every test child-spawn inherits `process.env` (verified: no test overrides `env` without spreading `process.env`), this redirects ALL fts5 / embed-db / HNSW cache writes — in-process and spawned — to a throwaway dir the OS reclaims. Previously these landed in `~/Library/Caches/enquire/` (macOS) / `~/.cache/enquire/` (Linux) and were never cleaned, so a dev machine that ran the suite over weeks accumulated tens of thousands of orphaned index files. **Verified end-to-end:** real-cache file count is unchanged (Δ0) after a full `npm test`; the only real-cache writes observed during testing came from a separate *real* `serve` run on the actual vault (correct behavior), confirmed by matching the file hash to `sha1("/…/Obsidian Vault")`.
792
+
793
+ ### Added
794
+
795
+ - **`tests/cache-isolation-invariant.test.ts`** (+3): asserts `XDG_CACHE_HOME` is a temp dir during tests and that `defaultIndexFile()` resolves UNDER it, NOT under the real `~/Library/Caches/enquire` (or `~/.cache/enquire`) — so removing the `setup.ts` redirect fails CI. Includes a NEGATIVE control proving the real-cache classifier discriminates (a constant `() => false` would make it vacuous). Meta-invariant-enrolled.
796
+
797
+ ### Method note
798
+
799
+ This is the textbook value of **testing on a real machine** (the project's home-grown gates are drift/claim-driven and structurally blind to filesystem side effects like this): the unit suite *caused* the accumulation but never *observed* it, because no gate inspects the real cache dir. The fix is the project's standard transform: root-cause it, then add a permanent inventory/structural invariant (`cache-isolation-invariant`) so the class can't silently return. Severity is **dev-hygiene** (not a product/CI/end-user correctness bug: a real user with one vault gets one index file, and CI runners are ephemeral), but it's a real papercut (699 MB on the maintainer's machine) with a clean, verified fix. **Clearing the pre-existing local cruft is maintainer-gated** (deleting files is out of scope for the agent); see the session hand-off for the exact one-liner that keeps the live vault's index and removes the orphans.
800
+
801
+ ### Tests (1088)
802
+
803
+ `tests/cache-isolation-invariant.test.ts` +3 source `it()`. 1085 → 1088; claims synced (README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP).
804
+
805
+ ### Files changed
806
+
807
+ - `tests/setup.ts` (XDG_CACHE_HOME redirect), `tests/cache-isolation-invariant.test.ts` (new), test-count claims → 1088.
808
+ - version bump 3.10.0-rc.10 → 3.10.0-rc.11.
809
+
810
+ ---
811
+
812
+ ## [3.10.0-rc.10] — 2026-06-02
813
+
814
+ > **TL;DR:** **New capability — frontmatter-aware retrieval.** `obsidian_search` gains an optional `filter_frontmatter` map so an agent can scope hybrid search by YAML frontmatter: `{ status: "active", type: ["meeting","decision"] }` → only notes whose frontmatter matches **every** key (strings case-insensitive; an array frontmatter value matches by membership; an array filter value is OR). It's the first genuinely-new *feature* (not polish) since the v3.10 staleness line — Obsidian users live in frontmatter (`status`/`type`/`project`), and **no other Obsidian-MCP can scope semantic search by it**. Opt-in + additive: **absent ⇒ byte-identical** to before (same safe pattern as recency re-ranking). Matching is filter-on-the-fused-candidate-pool, which is already excluded-pruned (rc.8), so no excluded note's frontmatter is read; PDFs (no frontmatter) are excluded without a binary read. **1076 → 1085 tests.**
815
+
816
+ **Minor (pre-release) — v3.10 line; new feature: frontmatter-aware retrieval (increment 1/N).**
817
+
818
+ ### Added
819
+
820
+ - **`obsidian_search` `filter_frontmatter?: Record<string, scalar | scalar[]>`** — post-filters fused hits by the note's parsed YAML frontmatter. AND across keys; per key, scalar-equality (strings case-insensitive, numbers/booleans strict, no cross-type coercion) or array-membership; a filter value may be an array for OR. Notes with no frontmatter, or missing a filtered key, are excluded (a filter is a positive assertion). Runs only when the param is passed; filters the candidate pool (so a strict filter can legitimately return < `limit`). Reads candidate frontmatter via the cached `vault.readNote` (graph-boost usually warms it); fail-soft (an unreadable candidate is excluded, honoring the filter).
821
+ - **Pure exported `frontmatterMatches(frontmatter, filter)`** (+ `FrontmatterFilterValue` / `FrontmatterFilterScalar` types) — the matching semantics, unit-testable in isolation. zod schema added to the `obsidian_search` registration (`z.record` of string→scalar|scalar[]).
822
+ - **`tests/search-hybrid.test.ts`** (+9): 6 `frontmatterMatches` unit tests (scalar/case-insensitive, array-membership, array-OR + multi-key-AND, number/boolean strictness, missing-key/empty/absent → no match, a NEGATIVE control that discriminates) + 3 integration tests through `searchHybrid` (filter narrows to the matching note; **NEGATIVE control** — no filter returns all three, proving the filter is what narrowed it; array-value OR + multi-key AND).
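The matching semantics listed above (AND across keys, OR within an array filter value, membership for array frontmatter values, case-insensitive strings, strict numbers/booleans) can be sketched as below. All names are hypothetical and this is not the exported `frontmatterMatches`; in particular, treating an empty filter as "no match" is an assumption of this sketch.

```typescript
type FilterScalar = string | number | boolean;
type FilterValue = FilterScalar | FilterScalar[];

function scalarEq(actual: unknown, wanted: FilterScalar): boolean {
  // Strings compare case-insensitively.
  if (typeof actual === "string" && typeof wanted === "string") {
    return actual.toLowerCase() === wanted.toLowerCase();
  }
  // Numbers and booleans compare strictly, with no cross-type coercion.
  return (typeof actual === "number" || typeof actual === "boolean") && actual === wanted;
}

function valueMatches(actual: unknown, wanted: FilterValue): boolean {
  const wantedList: FilterScalar[] = Array.isArray(wanted) ? wanted : [wanted];
  const actualList: unknown[] = Array.isArray(actual) ? actual : [actual];
  // OR across filter alternatives; membership across array frontmatter values.
  return wantedList.some((w) => actualList.some((a) => scalarEq(a, w)));
}

function matchesSketch(
  frontmatter: Record<string, unknown> | undefined,
  filter: Record<string, FilterValue>,
): boolean {
  if (frontmatter === undefined) return false; // no frontmatter: excluded
  const wantedEntries = Object.entries(filter);
  if (wantedEntries.length === 0) return false; // assumption of this sketch
  // AND across keys; a missing key fails the positive assertion.
  return wantedEntries.every(
    ([key, wanted]) => key in frontmatter && valueMatches(frontmatter[key], wanted),
  );
}

const meetingNote = { status: "Active", type: "meeting", tags: ["q3", "Planning"] };
console.log(matchesSketch(meetingNote, { status: "active", type: ["meeting", "decision"] }));
```

Because matching is a positive assertion, a strict filter can legitimately leave fewer hits than `limit`; nothing in the sketch backfills from non-matching candidates.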
823
+
824
+ ### Method note
825
+
826
+ This is the deliberate answer to "is the project at a dead-end?" — the *refinement* track had hit diminishing returns, so the next real value is a **new capability**, not another micro-RC. Frontmatter-aware retrieval was chosen because it (1) expands what the product can *do* (not polish), (2) plays to the retrieval core — the project's strength, (3) is deeply Obsidian-native (frontmatter is a first-class Obsidian primitive) with **no competitor parity**, and (4) ships **additively/opt-in** so the critical search path is byte-identical when unused. Rejected alternatives, with reasons: conversation write-back (it's `basic-memory`'s grain and muddies the "grounded, not extracted" differentiator), multi-vault (explicit non-goal in CLAUDE.md), and answer-synthesis (we're a retriever, not a QA generator — would be the kind of overclaim `docs/benchmarks.md` explicitly avoids). The integration test is non-vacuous by construction (the NEGATIVE control returns all three notes when the filter is absent — so a no-op filter implementation fails it), unlike the rc.8 case the revert-verify caught. **Next increments:** rc.11 — `tag` filter parity (FTS has it; hybrid doesn't) + optional boost-by-frontmatter; rc.12 — positioning for the capability.
827
+
828
+ ### Tests (1085)
829
+
830
+ `tests/search-hybrid.test.ts` +9 source `it()`. 1076 → 1085; claims synced (README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP). Tool count unchanged (45 — this adds a *parameter*, not a tool).
831
+
832
+ ### Files changed
833
+
834
+ - `src/tools/search.ts` (`filter_frontmatter` arg + `frontmatterMatches`/`frontmatterValueMatches`/`frontmatterScalarEq` helpers + matches-loop integration + `fmFilter` hoist), `src/tool-registry.ts` (zod schema), `tests/search-hybrid.test.ts` (+9), `docs/api.md` (args-table row), test-count claims → 1085.
835
+ - version bump 3.10.0-rc.9 → 3.10.0-rc.10.
836
+
837
+ ---
838
+
839
+ ## [3.10.0-rc.9] — 2026-06-02
840
+
841
+ > **TL;DR:** **Positioning — a verified, fair head-to-head vs `basic-memory`** (the closest local-markdown-MCP rival), added to the COMPARISON "when to pick something other than enquire-mcp" section. Grounded in a fresh web-research pass (Track B of the promotion plan): `basic-memory` solves the **inverse** problem — it *writes* a knowledge-base **from your AI conversations** (readable markdown, viewable in Obsidian as a GUI), whereas enquire-mcp *recalls the notes you authored*. The entry is intentionally fair-not-sales (calls out exactly when basic-memory is the better pick, and that the two **compose**) and makes the "grounded, not extracted" line concrete with a real, citable example. Docs-only; no overclaim about the competitor (every claim verified against its public repo). **1076 tests unchanged.**
842
+
843
+ **Minor (pre-release) — v3.10 line; promotion/positioning increment (no code change).**
844
+
845
+ ### Changed
846
+
847
+ - **`docs/COMPARISON.md`** — "when to pick something other than enquire-mcp" expanded from four cases to **five**: added **`basic-memory` (basicmachines-co)**. It's the closest project in spirit (local-first, markdown, MCP-native, semantic search over a wikilinked knowledge graph, Obsidian as a GUI) but solves the inverse problem — write-memory-from-chat vs recall-what-you-authored — which makes the choice clean and sharpens enquire's "grounded, not extracted" differentiator with a concrete example. Notes that the two compose (basic-memory writes conversation-derived notes; enquire retrieves across the whole authored vault). Kept OUT of the Obsidian-MCP feature matrix (different category) to avoid a misleading row.
848
+
849
+ ### Method note
850
+
851
+ This is the first increment of the **promotion track**. It's grounded in a Firecrawl research pass (PROMO-1), not authored from memory, specifically to avoid competitor-claim overclaim: every `basic-memory` capability stated here was verified against its public GitHub/docs (knowledge-graph, semantic search, wikilinks, MCP-native, Obsidian GUI, conversation-capture). The research also surfaced **discoverability gaps that are maintainer-gated** (not shippable as repo docs) — handed off separately: enquire is absent from the high-intent "best Obsidian MCP server" results (needs stars + listicle presence), the brand search surfaces a stale OpenClaw-directory listing rather than the canonical repo, and the highest-leverage lever remains the published LongMemEval/retrieval score (reference hardware). Glama listing confirmed live (auto-synced from the MCP registry); the "claim" is OAuth-gated. **Deliberately did NOT** churn the README use-cases for marginal on-page SEO — the real high-intent-query gap is off-page (stars/listicles), and quality > keyword-stuffing.
852
+
853
+ ### Tests (1076)
854
+
855
+ No `it()` change (docs-only). 1076 unchanged.
856
+
857
+ ### Files changed
858
+
859
+ - `docs/COMPARISON.md` (basic-memory "when to pick else" entry; four→five), version bump 3.10.0-rc.8 → 3.10.0-rc.9.
860
+
861
+ ---
862
+
863
+ ## [3.10.0-rc.8] — 2026-06-02
864
+
865
+ > **TL;DR:** **Post-rc.7 audit response — fusion-stage privacy parity (defense-in-depth) + a self-caught vacuous-test correction.** A state-driven audit of the rc.3→rc.7 line (behavioral/threat lens, per the rc.36 meta-audit) found that the two fusion-stage consumers of the RRF `fused` list — pre-existing **graph-boost** (calls `vault.readNote` to parse a candidate's wikilinks → reads its **content**) and the rc.5 **recency re-rank** (stats a candidate's **mtime**) — both run BEFORE the rc.18 L-HYB-1 response-build `isExcluded` guard and don't replicate it. Not exploitable today (every ranker arm already drops excluded paths before `fused`, and the response-build guard drops them from output), so this is a **third, defense-in-depth layer** for a hypothetical future ranker-arm regression — exactly the "RRF fusion trusts ranker inputs; don't" rationale L-HYB-1 was shipped on. Fixed by pruning excluded paths from `fused` once at the source via a new pure `pruneExcludedHits`. **The audit also caught itself overclaiming:** the first test written for this was an *integration* test that **passed with the fix disabled** (vacuous — the per-arm filters prevent an excluded path from ever reaching `fused` through the public API). The revert-verify exposed it; it was replaced with a **pure-helper unit test** that actually fails when the guard is removed. **1072 → 1076 tests.** `src/` change is one fusion-stage filter line + the extracted helper.
866
+
867
+ **Minor (pre-release) — v3.10 line; post-sprint audit hardening.**
868
+
869
+ ### Added
870
+
871
+ - **`pruneExcludedHits(hits, isExcluded, granularity)`** in `src/tools/search.ts` — pure, granularity-aware (`block` ids strip the `#chunk` suffix before the membership test, matching the response-build guard's `lastIndexOf("#")` logic exactly). `searchHybrid` now calls it on `fused` immediately after RRF, so graph-boost + recency + matches-build are all excluded-free by construction.
872
+ - **`tests/search-hybrid.test.ts`** (+4): `pruneExcludedHits` note-granularity removal + order preservation, `#chunk`-suffix stripping (block), the `C# Notes.md` literal-`#` case (regression guard for the v3.7.16 P2-16 class), and a **NEGATIVE control** (predicate-driven, not unconditional — a `return hits` no-op fails it).
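
The helper described above can be sketched roughly as follows. This is a minimal illustration, not the shipped `src/tools/search.ts` code: the `Hit` shape, the `Granularity` union values, and the `hitPath` helper are assumptions introduced for the example, and it assumes block-granularity ids always carry a `#chunk` suffix.

```typescript
// Hypothetical hit shape: only `id` matters for this sketch.
interface Hit {
  id: string; // note path, or "path#chunk" at block granularity (assumed format)
  score: number;
}

// Assumed granularity values; the changelog only names "note" and "block".
type Granularity = "note" | "block";

// Recover the note path from a hit id. At block granularity the chunk suffix
// follows the LAST "#", so a literal "#" inside a file name ("C# Notes.md")
// is preserved.
function hitPath(id: string, granularity: Granularity): string {
  if (granularity !== "block") return id;
  const cut = id.lastIndexOf("#");
  return cut === -1 ? id : id.slice(0, cut);
}

// Pure filter: keeps input order and drops every hit whose note path is excluded.
function pruneExcludedHits<T extends Hit>(
  hits: readonly T[],
  isExcluded: (path: string) => boolean,
  granularity: Granularity,
): T[] {
  return hits.filter((hit) => !isExcluded(hitPath(hit.id, granularity)));
}
```

Because the predicate is injected, the same function can be exercised in isolation with any exclusion rule, without going through the ranker arms.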
873
+
874
+ ### Method note
875
+
876
+ This is the project's signature **incomplete-class-sweep** closure: rc.18 L-HYB-1 added the *response-build* `isExcluded` guard but left graph-boost's fusion-stage content-read unguarded; rc.5 then added a second unguarded fusion-stage consumer (recency mtime-stat). The class fix prunes at the source so any future consumer of `fused` inherits the guard. **Honest self-correction (worth recording):** the integration test first written to "prove" the fix was **vacuous** — it asserted graph-boost never read an excluded note, but the per-arm ranker filters (BM25 `~line 1373`, embeddings `~1100`, TF-IDF via `listMarkdown`) already prevent an excluded path from reaching `fused` through the public `searchHybrid` API, so the assertion held with OR without the prune. The mandated **revert-verify** (disable the fix, confirm the test fails) caught it red-handed. Lesson: a guard that sits *behind* an existing filter can't be exercised through the front door — test it as a pure unit with the dependency injected, or the "test" is theater. Severity of the underlying finding is **LOW** (triple-guarded; no live leak), but the defense-in-depth + the class closure + the testing lesson justify the patch. **No new behavior on any real vault** — `fused` is already excluded-free in every current code path (the prune is a no-op until a ranker arm regresses).
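
The revert-verify lesson can be illustrated with a small predicate-driven check. This is a sketch under assumptions, not the project's test file: `Prune`, `prune`, `noOpPrune`, and `honorsPredicate` are hypothetical names, and the real tests use `it()` blocks rather than a boolean helper.

```typescript
type Hit = { id: string };
type Prune = (hits: Hit[], isExcluded: (path: string) => boolean) => Hit[];

// Candidate under test: filters by the injected predicate.
const prune: Prune = (hits, isExcluded) => hits.filter((h) => !isExcluded(h.id));

// The "fix disabled" variant, standing in for a reverted guard.
const noOpPrune: Prune = (hits) => hits;

// Returns true only when `impl` follows the injected predicate in both
// directions: it must drop what the predicate excludes AND keep everything
// when the predicate excludes nothing.
function honorsPredicate(impl: Prune): boolean {
  const hits: Hit[] = [{ id: "a.md" }, { id: "private/b.md" }, { id: "c.md" }];
  const pruned = impl(hits, (p) => p.startsWith("private/")).map((h) => h.id);
  const kept = impl(hits, () => false).map((h) => h.id);
  return pruned.join(",") === "a.md,c.md" && kept.join(",") === "a.md,private/b.md,c.md";
}

console.log(honorsPredicate(prune));     // true
console.log(honorsPredicate(noOpPrune)); // false
```

The second call is the point: a check that still passes against the no-op variant would be vacuous in the same way as the integration test described above.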
877
+
878
+ ### Tests (1076)
879
+
880
+ `tests/search-hybrid.test.ts` +4 source `it()` (the vacuous integration test added during the audit was removed before ship — net 0 in `security.test.ts`). 1072 → 1076; claims synced (README ×4, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP).
881
+
882
+ ### Files changed
883
+
884
+ - `src/tools/search.ts` (`pruneExcludedHits` helper + the `fused = pruneExcludedHits(...)` call), `tests/search-hybrid.test.ts` (+4), test-count claims → 1076.
885
+ - version bump 3.10.0-rc.7 → 3.10.0-rc.8.
886
+
887
+ ---
888
+
889
+ ## [3.10.0-rc.7] — 2026-06-02
890
+
891
+ > **TL;DR:** **v3.10 increment 6, TDQS (tool-description quality): make the freshness signal discoverable to agents.** rc.1 and rc.4 added `age_days` + `stale` to `obsidian_find_similar` / `obsidian_semantic_search` and to `obsidian_search` results respectively (rc.5 then added the opt-in `--recency-weight` re-rank), but the **tool descriptions an agent actually reads** never mentioned them, so an agent had no way to know the freshness signal exists, let alone reason over it. This RC adds a concise freshness note to all three descriptions (what the fields are, and that `--recency-weight` can blend fresher notes upward), closing the "shipped-but-undiscoverable" gap. **The benchmark-methodology half of the original rc.7 plan needs no work:** `docs/benchmarks.md` already carries the full methodology (dataset, ground-truth, metric definitions, ablations, reproducibility) AND the precise "what we measure and what we don't" framing (retrieval quality, NOT end-to-end QA accuracy; "a QA-accuracy number for a retriever would be an overclaim"), shipped across the v3.7.x cascade + rc.19. **Src/description-only: zero behavior change, 1072 tests unchanged.**
892
+
893
+ **Minor (pre-release) — v3.10 forgetting-aware staleness, increment 6/N (TDQS).**
894
+
895
+ ### Changed
896
+
897
+ - **`src/tool-registry.ts`** — added a forgetting-aware freshness note to three tool descriptions:
898
+ - `obsidian_search`: "every hit also carries `age_days` … and a `stale` boolean … use these to flag a recalled fact as possibly out-of-date … if the server was started with `--recency-weight`, fresher notes are blended upward."
899
+ - `obsidian_find_similar`: "each result also carries `age_days` + a `stale` flag … so you can prefer fresher related notes or flag aged ones."
900
+ - `obsidian_semantic_search`: "each hit also carries `age_days` + a `stale` flag … a freshness signal you can reason over."
901
+ These are the agent-facing strings returned by `tools/list`, so the capability is now self-describing — an agent discovers the freshness signal from the tool contract, not just the docs.
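
On the consuming side, an agent or client might use the two fields along these lines. This is an illustrative sketch only: `FreshnessHit` and `freshnessCaveat` are hypothetical, and the actual hit objects carry more fields than shown. The fields are treated as optional because the rc.4 entry notes they are omitted when a file cannot be statted.

```typescript
// Hypothetical subset of a search hit; only `age_days` and `stale` come from
// the changelog, the rest of the shape is assumed for illustration.
interface FreshnessHit {
  path: string;
  age_days?: number; // may be absent (fail-soft omission)
  stale?: boolean;
}

// Build a short caveat a client could attach when presenting a recalled fact.
function freshnessCaveat(hit: FreshnessHit): string {
  if (hit.age_days === undefined || hit.stale === undefined) {
    return `${hit.path}: freshness unknown`;
  }
  return hit.stale
    ? `${hit.path}: last modified ${hit.age_days} days ago, may be out of date`
    : `${hit.path}: last modified ${hit.age_days} days ago`;
}
```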
902
+
903
+ ### Method note
904
+
905
+ A TDQS (tool-description quality) pass is most valuable where a *shipped capability is invisible in the contract the consumer reads*, not as cosmetic rewording of already-audited prose. The scan found exactly that gap (freshness fields shipped in rc.1/rc.4 and the `--recency-weight` re-rank in rc.5, all undocumented in `tools/list`) and fixed only it; the rest of the 45 descriptions were already high-quality from prior audit rounds, so they're left untouched (no churn). No structural wording-invariant was added: tying a test to exact description prose is brittle, and the descriptions are already protected by `smoke` (they load) + the K-3 readOnlyHint invariant. **Dependabot triage (separate housekeeping, not in this commit):** PR #91 (better-sqlite3 12.9.0→12.10.0, minor, native optionalDep) and PR #90 (dev-dependencies group) are low-risk with green CI → safe to merge; PR #178 (commander 14→15, **major**: CLI option-parsing behavior) and PR #177 (pdfjs-dist 5→6, **major**: PDF-extraction behavior) need a dedicated test pass + maintainer review before merge; community PR #113 (docs) is maintainer-review-gated. **This concludes the autonomously-shippable v3.10 forgetting-aware line (rc.1→rc.7).** Maintainer-gated next: dependabot major bumps, the published LongMemEval reference-hardware score, and v3.10.0 → `@latest` (fresh external audit per the v3.6.1 ≥2-auditor rule).
906
+
907
+ ### Tests (1072)
908
+
909
+ No `it()` added (description-only). 1072 unchanged; version-bearing surfaces synced to 3.10.0-rc.7.
910
+
911
+ ### Files changed
912
+
913
+ - `src/tool-registry.ts` (3 tool descriptions), `package.json` / `package-lock.json` / `src/index.ts` / `server.json` (version bump 3.10.0-rc.6 → 3.10.0-rc.7).
914
+
915
+ ---
916
+
917
+ ## [3.10.0-rc.6] — 2026-06-02
918
+
919
+ > **TL;DR:** **v3.10 messaging — the positioning catches up to the shipped forgetting-aware capability.** rc.1–rc.5 built freshness fields + recency re-ranking; rc.6 is the docs-only RC that makes that *discoverable* and *positioned*. Adds a "**Grounded — and freshness-aware**" narrative to the README (the Memora stale-fact-reuse frontier, arXiv:2604.20006, which conversation-memory stores ignore), a 4th top-line differentiator (**Freshness-aware recall**), a freshness row in the COMPARISON feature matrix, and the same framing in llms.txt + ROADMAP. Also **sharpens the "grounded, not extracted" claim**: names the chat-memory cohort precisely (mem0 / Zep / Supermemory / **Memobase**) and explicitly scopes the "extracted" critique to *that* cohort — NOT to knowledge-graph/ETL tools (cognee) or personal-search peers (Khoj), so the comparison stays fair-not-sales. **Docs-only — zero `src/` change, 1072 tests unchanged.**
920
+
921
+ **Minor (pre-release) — v3.10 forgetting-aware staleness, increment 5/N (messaging).**
922
+
923
+ ### Changed
924
+
925
+ - **README** — extended the "Grounded, not extracted" block with a "**Grounded — and freshness-aware**" paragraph (cites the Memora benchmark) + a 4th differentiator bullet ("Freshness-aware recall"); the "Three things" header → "What makes enquire-mcp different". Added the cohort-precision parenthetical (the "extracted" critique is specific to chat-memory tools, not cognee / Khoj).
926
+ - **llms.txt** — added Memobase to the conversation-memory cohort + a FRESHNESS-AWARE sentence (age_days/stale + `--recency-weight` + the Memora citation) for AI-agent discovery.
927
+ - **docs/COMPARISON.md** — added Memobase + a freshness-aware sentence to the grounded intro; added a **"Forgetting-aware freshness (`age_days` / recency re-rank)"** row to the feature matrix (enquire Yes (v3.10), all four alternatives No). Row "Yes" deliberately left un-bolded to respect the matrix's stated "bold only in the four audit-priority rows" convention.
928
+ - **ROADMAP.md** — added a "Forgetting-aware freshness (v3.10)" bullet to the "Already shipped and differentiating" list.
929
+
930
+ ### Method note
931
+
932
+ This is the "messaging catches up to capability" RC the project runs after a feature line lands (cf. v3.6.3 marketing pivot, v3.9.0-rc.27 "grounded, not extracted"). All competitor claims are kept to the **verifiable cohort already named in the docs** + one addition (Memobase, a chat-memory backend the "extract" critique accurately describes); the deliberately-scoped parenthetical (NOT cognee / Khoj) is the anti-overclaim move — it's easy to over-broaden "every memory tool extracts" into an unfair-comparison overclaim, so the critique is explicitly bounded. **Deferred (documented):** a head-to-head vs `basic-memory` (a non-Obsidian markdown-memory MCP) — out of scope for the *Obsidian-MCP* COMPARISON matrix and would carry an unverified-license/feature-claim burden; revisit if a dedicated AI-memory-framework comparison page is added. **Deferred to rc.7:** TDQS (tool-description quality) pass on the 45 tool descriptions + a benchmark-methodology doc + dependabot triage.
933
+
934
+ ### Tests (1072)
935
+
936
+ No `it()` added (docs-only). 1072 unchanged; version-bearing surfaces synced to 3.10.0-rc.6.
937
+
938
+ ### Files changed
939
+
940
+ - `README.md`, `llms.txt`, `docs/COMPARISON.md`, `ROADMAP.md` (messaging), `package.json` / `package-lock.json` / `src/index.ts` / `server.json` (version bump 3.10.0-rc.5 → 3.10.0-rc.6).
941
+
942
+ ---
943
+
944
+ ## [3.10.0-rc.5] — 2026-06-02
945
+
946
+ > **TL;DR:** **v3.10 staleness increment 4 — OPT-IN recency re-ranking (the forgetting-aware knob).** Two new shared serve/serve-http flags: **`--recency-weight <w>`** (0–1, **default 0 = OFF**) and **`--stale-days <n>`** (recency half-life, default 365). When `weight > 0`, `obsidian_search` re-sorts the fused result set by `(1 − w)·relevanceRank + w·recency`, where recency decays hyperbolically with the note's **live** on-disk mtime (`recencyScore` = `staleDays / (staleDays + age_days)`). The relevance term is **rank-based** (`1/(1+pos)`), so the blend composes cleanly on top of RRF + graph-boost + the cross-encoder reranker without any score-scale mismatch — and `weight = 0` makes the blend key a strictly-decreasing function of position, i.e. a **provable no-op** (the default keeps ranking purely relevance-driven; nobody is surprised by recency silently reordering relevance). Bounded (stats ≤ candidate-pool unique paths, only when enabled) and fail-soft. This is the Memora stale-reuse-frontier knob: your knowledge, now freshness-*weightable*. **1062 → 1072 tests.**
947
+
948
+ **Minor (pre-release) — v3.10 forgetting-aware staleness, increment 4/N.**
949
+
950
+ ### Added
951
+
952
+ - **`--recency-weight <w>` + `--stale-days <n>`** on both `serve` and `serve-http` (via the shared `addAdvancedRetrievalOptions` helper → inherently cli-parity-safe; helper flag count 11 → 13). `--recency-weight` is validated to `[0, 1]` (`server.ts` throws on out-of-range, matching the rc.9 input-validation posture); `--stale-days` parses as a positive integer. Both default to OFF behavior (`weight 0` → no re-rank; `staleDays` only matters when weight > 0).
953
+ - **`recencyScore(ageDays, staleDays)`** in `src/staleness.ts` — a pure, monotonically-decreasing recency curve in `(0, 1]`: `1` at age 0, `0.5` at the half-life, → `0` as age → ∞. Smooth hyperbolic decay (not a hard stale cliff) so a highly-relevant year-old note still competes. Clamps negative/non-finite age → 0 and sub-1 half-life → 1 (no divide-by-zero).
954
+ - **`searchHybrid` ctx gains `recency?: { weight; staleDays }`** — applied after RRF + graph-boost + reranker, before truncation. Re-stats the candidate pool for live mtimes (dedup by path, `Promise.all`, fail-soft per path), blends, re-sorts.
955
+ - **Tests (+10):** `tests/staleness.test.ts` +6 (`recencyScore` curve: anchor points, strict monotonicity, half-life sensitivity, default, clamps, + a NEGATIVE control that fresh strictly outscores old); `tests/search-hybrid.test.ts` +4 (baseline relevance-first; weight 1.0 floats the fresh note above a more-relevant old one; **NEGATIVE control** weight 0 == baseline order; small-half-life still fresh-first).
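
The recency curve and the flag validation described in this list can be sketched as below. This is a minimal approximation under stated assumptions, not the contents of `src/staleness.ts` or `server.ts`: `DEFAULT_STALE_DAYS`, `parseRecencyWeight`, the `RangeError` type, and the treatment of non-finite half-lives are all choices made for the example.

```typescript
const DEFAULT_STALE_DAYS = 365; // matches the documented --stale-days default

// Hyperbolic decay: 1 at age 0, 0.5 at the half-life, approaching 0 as age grows.
function recencyScore(ageDays: number, staleDays: number = DEFAULT_STALE_DAYS): number {
  // Negative or non-finite ages are clamped to 0, per the changelog.
  const age = Number.isFinite(ageDays) && ageDays > 0 ? ageDays : 0;
  // Half-lives below 1 are raised to 1 so the denominator is never 0.
  // Treating non-finite half-lives the same way is an assumption of this sketch.
  const halfLife = Number.isFinite(staleDays) && staleDays >= 1 ? staleDays : 1;
  return halfLife / (halfLife + age);
}

// Hypothetical validator for --recency-weight: rejects anything outside [0, 1].
function parseRecencyWeight(raw: string): number {
  const w = raw.trim() === "" ? NaN : Number(raw);
  if (!Number.isFinite(w) || w < 0 || w > 1) {
    throw new RangeError(`--recency-weight must be in [0, 1], got "${raw}"`);
  }
  return w;
}
```

With the default half-life, a note modified 365 days ago scores exactly 0.5, which is why a highly relevant year-old note can still compete rather than dropping off a cliff.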
956
+
957
+ ### Changed
958
+
959
+ - **`tests/cli-parity.test.ts`** — helper flag count 11 → 13; `--recency-weight` / `--stale-days` added to `REQUIRED_RETRIEVAL_FLAGS` (asserts both serve + serve-http carry them).
960
+ - **`docs/api.md`** — two flag-table rows + an "opt-in recency re-ranking" note in the `obsidian_search` freshness paragraph.
961
+ - **`src/staleness.ts` header** — updated the forward-looking deferral comment (rc.1 said recency re-ranking + `--stale-days` were "v3.10 follow-ups"; now documents the incremental rc.1→rc.5 buildout) per the overclaim-#13 rule (update deferral claims in the same commit that ships them).
962
+
963
+ ### Method note
964
+
965
+ The design choice that makes this safe to ship on the **critical search path**: blend the relevance **rank** (`1/(1+pos)`), not the raw fused score. Rank is scale-free, so the blend is agnostic to whether the order came from RRF, graph-boost, or the cross-encoder — and `weight = 0` is a *provable* no-op (the key reduces to a strictly-decreasing function of position, reproducing the input order exactly), which is why the entire feature is gated behind `weight > 0` and the default behavior is byte-identical to rc.4. Recency uses a smooth `staleDays/(staleDays+age)` decay rather than a hard cliff at the stale threshold, so the knob is a nudge, not a guillotine. Per the project's "surface before reorder" caution, rc.4 surfaced the freshness signal read-only; rc.5 only *now* lets it influence ranking, and only when the operator explicitly opts in. **Deferred to rc.6:** the FAMA/forgetting-aware narrative + "grounded, not extracted" sharpening in README/COMPARISON (docs-only).
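
A rough sketch of the rank-based blend described above, to make the `weight = 0` argument concrete. The names `RankedHit`, `blendKey`, and `blendByRecency` are hypothetical, ages are passed in rather than statted, and the stable tie-break on input position is an assumption of this example rather than documented behavior.

```typescript
// Hypothetical candidate: position in the input array is its relevance rank.
interface RankedHit {
  path: string;
  ageDays: number;
}

// Same hyperbolic curve as described for recencyScore, inlined to stay self-contained.
function recencyScore(ageDays: number, staleDays: number): number {
  const age = Number.isFinite(ageDays) && ageDays > 0 ? ageDays : 0;
  const halfLife = Number.isFinite(staleDays) && staleDays >= 1 ? staleDays : 1;
  return halfLife / (halfLife + age);
}

// Blend key: (1 - w) * 1/(1 + pos) + w * recency. With w = 0 this is exactly
// 1/(1 + pos), which strictly decreases with position.
function blendKey(pos: number, weight: number, recency: number): number {
  return (1 - weight) * (1 / (1 + pos)) + weight * recency;
}

function blendByRecency(hits: readonly RankedHit[], weight: number, staleDays: number): RankedHit[] {
  if (weight <= 0) return hits.slice(); // feature is gated behind weight > 0
  return hits
    .map((hit, pos) => ({ hit, pos, key: blendKey(pos, weight, recencyScore(hit.ageDays, staleDays)) }))
    .sort((a, b) => b.key - a.key || a.pos - b.pos) // ties keep input order
    .map((entry) => entry.hit);
}

const pool: RankedHit[] = [
  { path: "old-but-relevant.md", ageDays: 730 },
  { path: "fresh.md", ageDays: 1 },
];
console.log(blendByRecency(pool, 0.3, 365).map((h) => h.path)); // old-but-relevant.md stays first
console.log(blendByRecency(pool, 1, 365).map((h) => h.path));   // fresh.md moves to the top
```

Because only the position enters the relevance term, the same blend applies regardless of which stage produced the input order.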
966
+
967
+ ### Tests (1072)
968
+
969
+ `tests/staleness.test.ts` +6, `tests/search-hybrid.test.ts` +4. 1062 → 1072; claims synced (README ×4 incl. badge, package.json, llms.txt, AGENTS, COMPARISON, ROADMAP).
970
+
971
+ ### Files changed
972
+
973
+ - `src/staleness.ts` (`recencyScore` + header), `src/tools/search.ts` (ctx `recency` + post-rerank blend), `src/cli.ts` (2 flags), `src/server.ts` (parse + validate + plumb), `src/tool-registry.ts` (`registerReadTools` param + ctx), `tests/staleness.test.ts` (+6), `tests/search-hybrid.test.ts` (+4), `tests/cli-parity.test.ts` (11→13 + flags), `docs/api.md` (flag rows + note), test-count claims → 1072.
974
+ - version bump 3.10.0-rc.4 → 3.10.0-rc.5.
975
+
976
+ ---
977
+
5
978
  ## [3.10.0-rc.4] — 2026-06-02
6
979
 
7
980
  > **TL;DR:** **v3.10 staleness increment 3 — freshness fields on the PRIMARY search surface.** The hybrid `obsidian_search` tool (the recommended default, the one agents actually call) now carries the same forgetting-aware freshness signal that rc.1 added to `obsidian_find_similar` / `obsidian_semantic_search`: every hit gains `age_days` (whole days since the note's **current on-disk** mtime) and an over-one-year `stale` boolean. Computed by statting the final ≤`limit` hit paths — so it reflects the **live** file mtime, not the possibly-lagging indexed mtime in FTS5/embed-db `source_state`. **Read-only signal — does NOT reorder results** (opt-in recency re-ranking is the next increment); it just lets an agent flag a recalled fact as potentially out-of-date instead of presenting it as current (the Memora stale-memory-reuse frontier). Bounded (O(unique paths) ≤ `limit` concurrent stats) and **fail-soft** (a file deleted between fusion and response simply omits the two fields — never throws). **1059 → 1062 tests.**
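
The bounded, fail-soft freshness computation described here might look roughly like this. It is a sketch under assumptions: `freshnessFrom`, `freshnessForPaths`, and the injected `statMtimeMs` callback are hypothetical, "whole days" is modeled with `Math.floor`, and "over one year" is modeled as more than 365 days.

```typescript
const MS_PER_DAY = 86400000;
const STALE_AFTER_DAYS = 365; // assumption: "over one year" means > 365 days

interface Freshness {
  age_days: number;
  stale: boolean;
}

// Pure conversion from an mtime to the two freshness fields.
function freshnessFrom(mtimeMs: number, nowMs: number): Freshness {
  const age_days = Math.max(0, Math.floor((nowMs - mtimeMs) / MS_PER_DAY));
  return { age_days, stale: age_days > STALE_AFTER_DAYS };
}

// Stats each unique path once, concurrently. `statMtimeMs` is injected so the
// sketch stays independent of any particular filesystem API.
async function freshnessForPaths(
  paths: readonly string[],
  statMtimeMs: (path: string) => Promise<number>,
  nowMs: number,
): Promise<Map<string, Freshness>> {
  const unique = Array.from(new Set(paths));
  const out = new Map<string, Freshness>();
  await Promise.all(
    unique.map(async (path) => {
      try {
        out.set(path, freshnessFrom(await statMtimeMs(path), nowMs));
      } catch (err) {
        // Fail-soft: a file that vanished is simply left out of the map,
        // so its hit carries no freshness fields.
      }
    }),
  );
  return out;
}
```

A caller would then merge the map entries into the final hits and leave any path missing from the map untouched.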