@oomkapwn/enquire-mcp 2.0.0-beta.1 → 2.0.0-beta.3

package/CHANGELOG.md CHANGED
@@ -2,6 +2,171 @@
2
2
 
3
3
  All notable changes to this project will be documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
4
4
 
5
+ ## [2.0.0-beta.3] — 2026-05-08
6
+
7
+ **Backlog cleanup + tool-surface consolidation.** All audit-driven P0/P1 work landed in beta.2; this release closes the long tail of P2/P3 backlog items the same audits surfaced. No new features and no breaking changes for default users, but the default tool list is now narrower (21 read tools instead of 24): the four single-ranker search tools now sit behind a new opt-in flag. Three of them were previously always-on; the fourth, `obsidian_full_text_search`, already required `--persistent-index`.
8
+
9
+ ### Changed — `obsidian_search` is the headline; single-ranker tools moved behind `--diagnostic-search-tools`
10
+
11
+ The audit's recurring observation: agents routinely picked the wrong search tool from the five options (`search_text`, `full_text_search`, `semantic_search`, `embeddings_search`, `search`). The umbrella `obsidian_search` (added v2.0.0-beta.0) auto-detects available signals and produces consistent recall, so a five-tool search surface is now bloat.
12
+
13
+ - **Default surface (v2.0.0-beta.3+):** 21 always-on read tools. The single search tool is `obsidian_search`. Hybrid retrieval auto-detects what's available (BM25 if `--persistent-index`, ML embeddings if `build-embeddings` ran) and falls back gracefully.
14
+ - **Diagnostic surface:** add `--diagnostic-search-tools` to register `obsidian_search_text`, `obsidian_semantic_search`, `obsidian_embeddings_search` (and `obsidian_full_text_search` if `--persistent-index` is also set). Use these for A/B benchmarking or when you specifically need single-ranker output.
15
+
16
+ This is **not breaking** for clients calling `obsidian_search` (the v2.0 default). It IS a change for clients hard-coded to call `obsidian_search_text` / `obsidian_semantic_search` / `obsidian_embeddings_search` / `obsidian_full_text_search` — they need to either switch to `obsidian_search` (recommended) or add the flag.
17
+
18
+ ### Added — Cross-platform CI: macOS advisory job
19
+
20
+ The CI test matrix was Linux-only. `Vault` does cross-platform path work (`vault.ts:631` has a Windows separator normalization), symlink handling, and `chmod` operations, all of which behave differently on non-Linux platforms. Pre-fix, regressions surfaced only through user reports.
21
+
22
+ New `test-macos` job runs the same suite on `macos-latest` × Node 22. **Advisory only** (`continue-on-error: true`) so it doesn't block merges, but failures appear in the PR check list. Required CI gate stays Linux × {Node 20, 22, 24} for ruleset stability.
23
+
24
+ ### Added — Coverage threshold gates in vitest
25
+
26
+ Pre-fix: the `coverage` CI job uploaded an HTML report and exited 0 regardless of the numbers. A regression that dropped coverage 90% → 40% would ship green. New `vitest.config.ts` thresholds:
27
+
28
+ - lines: ≥86%
29
+ - statements: ≥82%
30
+ - functions: ≥75%
31
+ - branches: ≥73%
32
+
33
+ All thresholds sit ~5pp below current coverage. The measurement excludes `src/index.ts` (registration boilerplate; line-count doesn't reflect quality) and test files. CI fails if any metric drops below its threshold.
34
+
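For illustration, the four thresholds above could be expressed in a Vitest config roughly as follows. This is a minimal sketch, not the package's actual `vitest.config.ts`: it assumes a Vitest version that accepts `coverage.thresholds`, and the `tests/**` glob is a guess at where the test files live.

```typescript
// vitest.config.ts: illustrative sketch only, not taken from the package
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      // Assumed globs: registration boilerplate and test files are not measured
      exclude: ["src/index.ts", "tests/**"],
      // The coverage run fails when any metric falls below its floor
      thresholds: {
        lines: 86,
        statements: 82,
        functions: 75,
        branches: 73,
      },
    },
  },
});
```

Keeping the floors a few points below the measured numbers leaves room for small fluctuations while still catching a large drop.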
35
+ ### Changed — `npm audit` elevated to `moderate` for production deps
36
+
37
+ Pre-fix: `--audit-level=high` everywhere. The recently-resolved `ip-address` advisory (CVE-2026-42338, moderate severity) sat undetected between Dependabot scans because no audit gate caught it. Now production deps gate at `moderate`, dev deps stay at `high` (more noise, less surface).
38
+
39
+ ### Process — branch-protection ruleset bypass mode hardened
40
+
41
+ `bypass_actors` for the admin role was `bypass_mode: always`. Changed to `bypass_mode: pull_request`. The maintainer's own pushes now go through PR (auto-mergeable), creating an audit trail. Combined with the v2.0.0-beta.2 release-pipeline integrity check, this means every change shipped to npm has a reviewable diff.
42
+
43
+ ### Docs
44
+
45
+ - README "Configure your AI client" tool count: `24 read + 1 opt-in` → `21 read + 4 opt-in` (3 diagnostic + 1 FTS) reflecting the consolidation above.
46
+ - `docs/api.md` header updated with the new tool-count math + opt-in flag breakdown.
47
+ - README footer ENQUIRE paragraph deduplicated (was repeated near-verbatim at lines 59 and 484; footer now just references the inline note).
48
+ - GitHub repo About description shortened from 340 → 195 chars to fit OpenGraph truncation.
49
+
50
+ ### Tests
51
+
52
+ 408 unit tests pass (was 408 in beta.2, so no test-count delta). The tests exercise the same surfaces; `tests/docs-consistency.test.ts` now reflects the new gating by counting diagnostic-gated tools as opt-in rather than always-on.
53
+
54
+ `scripts/smoke.mjs` adds `--diagnostic-search-tools` to its server invocation so smoke continues to exercise all 5 search tools (was: 4, post-consolidation default surface is 1).
55
+
56
+ ### Migration from v2.0.0-beta.2
57
+
58
+ **No-op for clients of `obsidian_search`** (the v2.0 hybrid default). This is the recommended path forward.
59
+
60
+ **Clients calling per-ranker tools directly:**
61
+ - Either switch to `obsidian_search` (preferred — auto-fuses signals)
62
+ - Or pass `--diagnostic-search-tools` to your `enquire-mcp serve` invocation
63
+
64
+ **Programmatic API surface unchanged.** The 4 gated tools have identical schemas + behavior when registered.
65
+
66
+ ## [2.0.0-beta.2] — 2026-05-06
67
+
68
+ **Audit-driven patch.** A second deep audit (5 parallel agents covering architecture, tests, docs, CI/CD, security threat model) surfaced one P0 privacy bypass of the same shape as the writeNote bug from beta.1, three release-pipeline P0s, and a long tail of P1 hardening. This release closes 16 findings and adds new architectural invariants to prevent recurrence.
69
+
70
+ ### Fixed — P0: persistent search indexes ignored `isExcluded` after config flip
71
+
72
+ **Same architectural debt as the writeNote miss in v2.0.0-beta.0.** The audit's root-cause analysis: `Vault.listMarkdown()` is the privacy chokepoint, but new persistent layers (FTS5 db, embed db) introduced their own search paths that bypassed it. Result: if a user built `.fts5.db` / `.embed.db` once, then added `--exclude-glob` later, excluded chunks leaked through:
73
+
74
+ - `obsidian_full_text_search` — BM25 hits from stale entries
75
+ - `obsidian_embeddings_search` — cosine hits from stale entries
76
+ - `obsidian_search` (the v2.0 default) — both BM25 + embed branches inherited
77
+ - `obsidian://chunk/{n}/{path}` resource — direct chunk fetch ignored exclusion
78
+
79
+ **Fix:** five new `isExcluded` filters, applied at the right layer:
80
+ 1. `embeddingsSearch` post-filters `db.search()` results, with 2× over-fetch to keep top-K stable
81
+ 2. `searchHybrid` BM25 branch post-filters `ftsIndex.search()` results
82
+ 3. `searchHybrid` embed branch — automatically protected since `embeddingsSearch` now filters
83
+ 4. `obsidian_full_text_search` handler post-filters with 2× over-fetch
84
+ 5. `vault-chunk` resource refuses with "not found" framing (matches FTS5 search post-filter, so the attacker can't distinguish "doesn't exist" from "exists but excluded")
85
+
86
+ Architecturally, the indexes themselves can keep stale entries — content filtering happens at search time, mirroring how `Vault.readNote` filters at read time even when the parse cache has the path.
87
+
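The search-time filtering described above can be sketched as a small wrapper. This is a hedged illustration, not the package's code: `searchWithExclusion` is a hypothetical name, `search` stands in for an index lookup such as `ftsIndex.search()`, and `isExcluded` stands in for the vault's privacy predicate. The `rel_path` field and the default limit of 25 follow the handler shown elsewhere in this diff.

```typescript
interface Hit {
  rel_path: string;
  score: number;
}

// Hypothetical helper: over-fetch from the index, then drop excluded paths.
function searchWithExclusion(
  search: (limit: number) => Hit[],
  isExcluded: (relPath: string) => boolean,
  userLimit = 25,
): Hit[] {
  // Ask for twice as many rows so that dropping excluded hits is less
  // likely to leave fewer than `userLimit` results.
  const raw = search(userLimit * 2);
  return raw.filter((hit) => !isExcluded(hit.rel_path)).slice(0, userLimit);
}

// Usage: a stale index still holds a chunk from a folder excluded later.
const staleIndex: Hit[] = [
  { rel_path: "Personal/diary.md", score: 0.9 },
  { rel_path: "Public/a.md", score: 0.8 },
  { rel_path: "Public/b.md", score: 0.7 },
];

const visible = searchWithExclusion(
  (limit) => staleIndex.slice(0, limit),
  (relPath) => relPath.startsWith("Personal/"),
  2,
);

console.log(visible.map((hit) => hit.rel_path)); // [ 'Public/a.md', 'Public/b.md' ]
```

A fixed 2× over-fetch is a heuristic: if more than half of the fetched rows are excluded, the caller still receives fewer than `userLimit` results.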
88
+ ### Fixed — P0: release-pipeline integrity
89
+
90
+ **`release.yml`** previously trusted any tag pointing at any commit. An attacker who got commit access could `git tag v9.9.9 <evil-sha> && git push --tags` and ship malware bypassing main protections — the workflow re-ran lint/test/audit on the tag's SHA and would happily green-light it. Now release.yml:
91
+
92
+ 1. Asserts the tagged SHA is reachable from `main` (`git merge-base --is-ancestor`)
93
+ 2. Polls GitHub's check-runs API to verify all 8 required CI checks (`lint`, `test (20/22/24)`, `smoke`, `audit`, `coverage`, `version-consistency`) reported `success` on this exact SHA, with up to 5-minute tolerance for tag-vs-CI race conditions
94
+ 3. Refuses to publish if either check fails
95
+
96
+ **dist-tag regex** was hand-rolled `/-([a-z]+)\.[0-9]+$/`, which misrouted three valid SemVer prereleases to `latest`:
97
+
98
+ - `2.0.0-rc` (no `.N` suffix) → previously latest, now `rc`
99
+ - `2.0.0-rc.0+build.1` (build metadata) → previously latest, now `rc`
100
+ - `2.0.0-alpha-3` (dash separator) → previously latest, now `alpha-3`
101
+
102
+ Replaced with a Node-side parser that extracts the prerelease channel by SemVer rules. Verified against 8-case matrix.
103
+
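One way to extract the prerelease channel by SemVer rules is sketched below. The package's actual parser is not shown in this diff, so `distTagFor` is a hypothetical name and the logic is an assumption based on the three cases listed above; a numeric-only first identifier (for example `1.0.0-0.3.7`) is not handled specially here.

```typescript
// Hypothetical helper, not the package's parser.
function distTagFor(version: string): string {
  // Build metadata ("+build.1") plays no part in choosing the channel,
  // and may itself contain hyphens, so strip it first.
  const withoutBuild = version.split("+")[0];
  // The version core (MAJOR.MINOR.PATCH) contains no hyphen, so the first
  // hyphen marks the start of the prerelease part.
  const dash = withoutBuild.indexOf("-");
  if (dash === -1) return "latest";
  const prerelease = withoutBuild.slice(dash + 1);
  // The channel is the first dot-separated prerelease identifier.
  return prerelease.split(".")[0];
}

const samples = ["2.0.0", "2.0.0-beta.3", "2.0.0-rc", "2.0.0-rc.0+build.1", "2.0.0-alpha-3"];
for (const version of samples) {
  console.log(`${version} -> ${distTagFor(version)}`);
}
```

Under these assumptions the five sample versions map to `latest`, `beta`, `rc`, `rc`, and `alpha-3` respectively, matching the routing described in the list above.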
104
+ ### Fixed — P1 sec DiD: `.obsidian/` plugin config bypassed `--read-paths`
105
+
106
+ **Defense in depth.** `loadPeriodicConfig()` read `.obsidian/daily-notes.json` and `.obsidian/plugins/periodic-notes/data.json` directly via `fs.readFile`, bypassing the user's privacy filter. Not a content leak (downstream `vault.stat` rejected paths), but the contract `--read-paths "Public/**"` = "ONLY Public/ visible" was technically violated. Now `loadPeriodicConfig` accepts an optional `isExcluded` predicate; when the user's allowlist excludes `.obsidian/**`, we silently fall back to v0.11 hard-coded defaults.
107
+
108
+ ### Fixed — P1 sec DiD: empty exclusion patterns silent-disable
109
+
110
+ **Privacy fail-closed.** Pre-fix, `--read-paths ""` (for example, empty after shell interpolation of an unset variable) survived as `[""]`. `globToRegex("")` produces `^$`, which matches no real path, so the readPaths predicate matched nothing and every path was treated as excluded. The opposite mistake, a whitespace-only pattern, silently disabled the filter. Now the Vault constructor strips empty/whitespace-only patterns and throws if the cleaned list is empty but the user explicitly passed flags, so privacy is fail-closed.
111
+
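The strip-then-throw behavior can be sketched as below. This is an illustration under stated assumptions, not the Vault constructor itself: `cleanPatterns` is a hypothetical helper, and treating an omitted or zero-length list as "flag not passed" is a guess.

```typescript
// Hypothetical helper; the name and signature are illustrative.
function cleanPatterns(flag: string, patterns: string[] | undefined): string[] {
  // Assumption: an omitted or zero-length list means the flag was not passed.
  if (patterns === undefined || patterns.length === 0) return [];
  // Drop entries that are empty or whitespace-only; keep the rest untouched.
  const cleaned = patterns.filter((pattern) => pattern.trim().length > 0);
  // The flag was passed but nothing usable survived: refuse to start rather
  // than run with a filter that does not do what the user intended.
  if (cleaned.length === 0) {
    throw new Error(`${flag} was given but contains no non-empty patterns`);
  }
  return cleaned;
}

console.log(cleanPatterns("--read-paths", ["Public/**", "  "])); // [ 'Public/**' ]

let failedClosed = false;
try {
  cleanPatterns("--read-paths", [""]);
} catch {
  failedClosed = true;
}
console.log(failedClosed); // true
```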
112
+ ### Fixed — P1 architecture: searchHybrid silently swallowed ranker errors
113
+
114
+ `searchHybrid` wrapped each ranker in `try/catch` with stderr-only logging. The MCP response just showed `signals_used: []` with `matches: []` — a caller couldn't tell "no hits" from "all rankers crashed." New optional `signal_errors: { bm25?, tfidf?, embeddings? }` field surfaces per-signal failures so agents can reason about reliability.
115
+
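A caller might use the new field to tell an honest empty result from a degraded one, roughly as sketched here. The field names come from the entry above; the value types (plain string messages) and the `classify` helper are assumptions made for illustration.

```typescript
// Assumed response shape: only the field names are taken from the changelog.
interface HybridResponse {
  signals_used: string[];
  matches: unknown[];
  signal_errors?: { bm25?: string; tfidf?: string; embeddings?: string };
}

type Outcome = "hits" | "no-hits" | "degraded";

// Hypothetical helper: any reported ranker failure marks the result as
// degraded, whether or not the surviving rankers produced matches.
function classify(resp: HybridResponse): Outcome {
  const failures = Object.values(resp.signal_errors ?? {}).filter(
    (message) => message !== undefined,
  );
  if (failures.length > 0) return "degraded";
  return resp.matches.length > 0 ? "hits" : "no-hits";
}

console.log(classify({ signals_used: [], matches: [] })); // no-hits
console.log(
  classify({ signals_used: [], matches: [], signal_errors: { bm25: "index unavailable" } }),
); // degraded
```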
116
+ ### Fixed — P1 architecture: `replaceInNotes` partial-state on mid-loop write failure
117
+
118
+ Pre-fix, a throw on file 5 of 20 lost the response: files 1-4 were silently committed, and the agent had no way to discover that. Now per-file errors are collected; the response includes a `partial: true` flag and an `errors: [{path, message}]` array. Systemic failures (read-only vault) still throw fast, since they are config errors, not per-file failures.
119
+
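The collect-and-continue pattern can be sketched as follows. This is not `replaceInNotes` itself: `writeAll` and the `written` field are hypothetical, `write` stands in for the per-file write, and the fast-throw path for systemic failures is left out.

```typescript
interface FileError {
  path: string;
  message: string;
}

interface BatchResult {
  written: string[]; // hypothetical field, added here for illustration
  partial: boolean;
  errors?: FileError[];
}

// Hypothetical batch writer: one failing file no longer hides the others.
function writeAll(paths: string[], write: (path: string) => void): BatchResult {
  const written: string[] = [];
  const errors: FileError[] = [];
  for (const path of paths) {
    try {
      write(path);
      written.push(path);
    } catch (err) {
      // Record the failure and keep going, so the caller learns which
      // files were committed and which were not.
      errors.push({ path, message: err instanceof Error ? err.message : String(err) });
    }
  }
  return errors.length > 0
    ? { written, partial: true, errors }
    : { written, partial: false };
}

const result = writeAll(["a.md", "b.md", "c.md"], (path) => {
  if (path === "b.md") throw new Error("EACCES");
});
// partial is true; a.md and c.md were written; errors has one entry for b.md
console.log(result.partial, result.written, result.errors);
```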
120
+ ### Fixed — P1 architecture: `resolveTarget` periodic-alias fallthrough leaked content via basename collision
121
+
122
+ Pre-fix, when `vault.stat()` returned ENOENT for the configured periodic path (e.g., `Daily Notes/2026-05-08.md` doesn't exist yet), `resolveTarget` fell through to a basename match across the whole vault. With `--exclude-glob 'Daily Notes/**'` AND a `Public/2026-05-08.md`, the basename match silently redirected "today" to the unrelated public note. Now we only fall through if the periodic config produces a folder-less stem (i.e., user keeps periodic notes at vault root); configured-folder cases must hit the configured folder or fail clean.
123
+
124
+ ### Fixed — P1: `renameNote` and `Vault.renameFile` error messages now distinguish allowlist vs denylist
125
+
126
+ Pre-fix, both always blamed `--exclude-glob` even when `--read-paths` was the reason. New `Vault.exclusionReason()` helper exposes the same logic that writeNote already used; renameNote and renameFile both adopt it.
127
+
128
+ ### Fixed — P1: `replaceInNotes` accepted excluded `folder=` argument
129
+
130
+ Pre-fix, `replaceInNotes(folder: "Personal")` with `--exclude-glob "Personal/**"` returned `files_scanned: 0, scope: "Personal/"` — confirming the folder name existed in the user's layout. Now the function refuses early: `folder is excluded by privacy filter`. Same pattern applies to other tools that take `folder` arguments — listed as P2 backlog for v2.0.0-beta.3.
131
+
132
+ ### Fixed — P1 docs
133
+
134
+ - README + SECURITY.md "v2.0 alpha" → "v2.0" (already shipped beta).
135
+ - README "Configure your AI client" section: now shows BOTH `@latest` (v1.x) AND `@beta` (v2.0) install snippets explicitly. Pre-fix, copying the snippet pulled v1.11.1 while the section below described v2.0 features.
136
+ - README source-line-count claim: `~3500 lines` → `~7500 lines` (verified `wc -l src/*.ts`).
137
+ - README test-count claim: `388+` → `405+` (will be `408+` after this release).
138
+ - CHANGELOG v1.11.1 entry: removed phantom `obsidian_resolve_periodic_alias` reference (replaced with `obsidian_read_note({title:"today"})`, the actual MCP-exposed entry-point).
139
+
140
+ ### Added — Architecture invariant: docs-consistency tests for numeric drift
141
+
142
+ `tests/docs-consistency.test.ts` previously checked tool-name parity. Extended to:
143
+
144
+ - **Tool-count parity:** README's "N read tools (always on)" must match the actual count of `registerTool()` calls outside `registerWriteTools` and `registerFtsTools`.
145
+ - **`docs/api.md` math:** "M MCP tools (X always-on read + Y opt-in read + Z opt-in write)" must satisfy M = X + Y + Z.
146
+ - **CLI subcommand parity:** every `program.command()` registered must appear in the docs/api.md Subcommands table.
147
+
148
+ These checks catch, at CI time, the kind of drift the audit previously had to catch manually.
149
+
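As an illustration of the `docs/api.md` math invariant, a check along these lines would flag a header whose parts no longer sum to the total. The regex and the `toolCountMathHolds` name are assumptions; the real test in `tests/docs-consistency.test.ts` is not shown in this diff. The sample numbers follow the README heading elsewhere in this diff (21 + 4 + 5 = 30).

```typescript
// Hypothetical check: verifies M = X + Y + Z in a header of the form
// "M MCP tools (X always-on read + Y opt-in read + Z opt-in write)".
function toolCountMathHolds(header: string): boolean {
  const m = header.match(
    /(\d+) MCP tools \((\d+) always-on read \+ (\d+) opt-in read \+ (\d+) opt-in write\)/,
  );
  // A missing or reworded header is treated as drift too.
  if (m === null) return false;
  const [total, alwaysOn, optInRead, optInWrite] = m.slice(1).map(Number);
  return total === alwaysOn + optInRead + optInWrite;
}

console.log(
  toolCountMathHolds("30 MCP tools (21 always-on read + 4 opt-in read + 5 opt-in write)"),
); // true
console.log(
  toolCountMathHolds("30 MCP tools (24 always-on read + 4 opt-in read + 5 opt-in write)"),
); // false
```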
150
+ ### Tests
151
+
152
+ 408 unit tests pass (was 393, +15 new):
153
+ - 5 privacy-regression tests for `appendToNote`, `archiveNote`, `renameNote` (source + dest with allowlist), `replaceInNotes` (denylist)
154
+ - 2 search-time isExcluded filter tests (`searchHybrid` BM25 path with stale FTS5 db; `embeddingsSearch` filter post-search)
155
+ - 3 fail-closed Vault constructor tests (empty `--read-paths` / `--exclude-glob` rejection)
156
+ - 3 docs-consistency invariant tests
157
+ - 1 updated periodic-alias test (now expects "No note found" silent fallback instead of "excluded" leak)
158
+ - 1 architecture refactoring (security.test.ts test reordering after lint:fix)
159
+
160
+ ### Migration from v2.0.0-beta.1
161
+
162
+ **No breaking changes for end users.** All v2.0.0-beta.1 tools and CLI flags continue to work.
163
+
164
+ **Programmatic callers (rare):** `Vault` now throws on empty `excludeGlobs: [""]` / `readPaths: [""]`. Filter empty strings in the caller before constructing.
165
+
166
+ **`searchHybrid` response shape:** new optional `signal_errors` field. Existing parsers that ignore unknown fields are unaffected.
167
+
168
+ **`replaceInNotes` response shape:** new `partial: boolean` field (always present) and `errors?: Array` (only when partial). Existing parsers ignoring unknown fields are unaffected.
169
+
5
170
  ## [2.0.0-beta.1] — 2026-05-06
6
171
 
7
172
  **Audit-driven patch.** An independent external audit of v2.0.0-beta.0 surfaced one P0 privacy/security bug, several P1 doc/correctness drifts, and a handful of P2 hardening opportunities. This release closes all 17 findings (1 P0 + 7 P1 + 7 P2 + 2 P3). No new features.
@@ -241,7 +406,7 @@ Regression test: `tests/security.test.ts` adds two cases — one for `--exclude-
241
406
 
242
407
  `scripts/synthetic-vault.mjs` (CI smoke) didn't write `.obsidian/daily-notes.json`, so smoke fell back to the v0.11 hard-coded defaults — leaving `loadPeriodicConfig()` + `formatMoment()` regression-free in CI even when the actual code broke.
243
408
 
244
- Added a 3-line config (`folder: "99_Daily"`, `format: "YYYY-MM-DD"`) so `obsidian_resolve_periodic_alias today` now exercises the lazy-load → cache → format codepath in every CI run.
409
+ Added a 3-line config (`folder: "99_Daily"`, `format: "YYYY-MM-DD"`) so `obsidian_read_note({ title: "today" })` now exercises the lazy-load → cache → format codepath in every CI run.
245
410
 
246
411
  ### Docs
247
412
 
package/README.md CHANGED
@@ -96,10 +96,10 @@ There are several Obsidian-MCP servers out there. enquire differentiates on thre
96
96
  | **Strict path allowlist** (`--read-paths '01_Projects/**'` — only paths matching one of these globs are visible; complement to `--exclude-glob` denylist) | ❌ | ✅ |
97
97
  | **Canvas (`.canvas`) read tools** (`obsidian_list_canvases` + `obsidian_read_canvas` — typed nodes + edges, broken-ref detection) | ❌ rare / partial | ✅ first-class |
98
98
  | **Semantic search** (`obsidian_semantic_search` — TF-IDF cosine, free / offline / no model download) | ❌ usually paywalled (Smart Connections) | ✅ in-tree |
99
- | **ML embeddings search** (`obsidian_embeddings_search` — paraphrase-multilingual-MiniLM-L12-v2, 50+ languages, persistent SQLite vector index) | ❌ usually paywalled (Smart Connections) | ✅ free + offline-capable (v2.0 alpha) |
100
- | TypeScript strict + Biome lint + 388+ unit tests | varies | ✅ |
99
+ | **ML embeddings search** (`obsidian_embeddings_search` — paraphrase-multilingual-MiniLM-L12-v2, 50+ languages, persistent SQLite vector index) | ❌ usually paywalled (Smart Connections) | ✅ free + offline-capable (v2.0 beta) |
100
+ | TypeScript strict + Biome lint + 405+ unit tests | varies | ✅ |
101
101
 
102
- That's the gap. enquire closes it in ~3500 lines of TypeScript with five mandatory runtime dependencies (`@modelcontextprotocol/sdk`, `chokidar`, `commander`, `gray-matter`, `zod`) plus two optional (`better-sqlite3` for `--persistent-index` and `--build-embeddings`; `@huggingface/transformers` for ML embeddings — both are no-ops when not invoked).
102
+ That's the gap. enquire closes it in ~7500 lines of TypeScript with five mandatory runtime dependencies (`@modelcontextprotocol/sdk`, `chokidar`, `commander`, `gray-matter`, `zod`) plus two optional (`better-sqlite3` for `--persistent-index` and the `build-embeddings` subcommand; `@huggingface/transformers` for ML embeddings — both are no-ops when not invoked).
103
103
 
104
104
  > **Not affiliated with Obsidian.md.** Obsidian and the Obsidian logo are trademarks of Dynalist Inc. enquire-mcp is an independent open-source project that reads Obsidian-format vaults. The name «enquire» is a tribute to Tim Berners-Lee's 1980 hypertext system, not a trademark claim against any party.
105
105
 
@@ -115,9 +115,12 @@ That's the gap. enquire closes it in ~3500 lines of TypeScript with five mandato
115
115
 
116
116
  ## Configure your AI client
117
117
 
118
- **Recommended: zero-install via `npx` — no clone, no build.** Drop this into your MCP client's config:
118
+ **Recommended: zero-install via `npx` — no clone, no build.** Drop this into your MCP client's config.
119
+
120
+ > **Pick a channel.** `@oomkapwn/enquire-mcp` (no `@beta`) → stable **v1.11.1** with 28 tools — no `obsidian_search` umbrella, no ML embeddings. Add `@beta` for the v2.0 surface (30 tools, hybrid retrieval). Stable `@latest` will move to v2.0 once beta proves out.
119
121
 
120
122
  ```json
123
+ // Stable v1.x — 28 tools, no hybrid search
121
124
  {
122
125
  "mcpServers": {
123
126
  "obsidian": {
@@ -126,6 +129,16 @@ That's the gap. enquire closes it in ~3500 lines of TypeScript with five mandato
126
129
  }
127
130
  }
128
131
  }
132
+
133
+ // Beta v2.0 — adds obsidian_search (BM25 + TF-IDF + ML embeddings via RRF)
134
+ {
135
+ "mcpServers": {
136
+ "obsidian": {
137
+ "command": "npx",
138
+ "args": ["-y", "@oomkapwn/enquire-mcp@beta", "serve", "--vault", "/Users/you/Documents/Obsidian Vault"]
139
+ }
140
+ }
141
+ }
129
142
  ```
130
143
 
131
144
  **Where to drop that JSON, by client:**
@@ -171,7 +184,7 @@ Restart your client. The server logs `enquire <version> ready (read-only, vault=
171
184
 
172
185
  ## What you get
173
186
 
174
- ### 24 read tools (always on) + 1 opt-in (`--persistent-index`) — **30 total** with 5 write tools
187
+ ### 21 read tools (always on) + 4 opt-in (`--persistent-index` adds 1 BM25 / `--diagnostic-search-tools` adds 3 single-ranker) — **30 total** with 5 write tools
175
188
 
176
189
  | Tool | What it does |
177
190
  |---|---|
@@ -468,4 +481,4 @@ Other ways to help:
468
481
 
469
482
  [MIT](./LICENSE). Built by Alex — [GitHub `@oomkapwn`](https://github.com/oomkapwn) · [X `@OomkaBear`](https://x.com/OomkaBear). Powered by [Model Context Protocol](https://modelcontextprotocol.io/), [`gray-matter`](https://github.com/jonschlinkert/gray-matter), [`commander`](https://github.com/tj/commander.js), and the patience of one specific Obsidian vault that didn't deserve to be parsed by hand.
470
483
 
471
- Named after [ENQUIRE](https://en.wikipedia.org/wiki/ENQUIRE) — the program Tim Berners-Lee wrote at CERN in 1980 to track «the complex web of relationships between people, programs, machines and ideas». ENQUIRE was the direct prototype of the World Wide Web. enquire-mcp brings the same idea to your AI: hyperlinked notes, structured access, no plugin required.
484
+ Named after [ENQUIRE](https://en.wikipedia.org/wiki/ENQUIRE) — the 1980 hypertext prototype of the World Wide Web (see the inline note above for the longer story).
package/SECURITY.md CHANGED
@@ -132,7 +132,7 @@ Posture:
132
132
  - **Write-tool gating composes with `--enable-write`.** Disabling `obsidian_create_note` while leaving `obsidian_replace_in_notes` enabled is a valid configuration; the gate is independent of the global write flag.
133
133
  - **Posture is "fail closed".** Tools blocked at registration time never appear in `tools/list` and a `tools/call` against a gated name returns a clean MCP-protocol error from the SDK — there's no codepath where a disabled tool can still execute.
134
134
 
135
- ## ML embeddings (v2.0 alpha): networked-download + cache posture
135
+ ## ML embeddings (v2.0): networked-download + cache posture
136
136
 
137
137
  The `obsidian_embeddings_search` tool plus the `install-model` and `build-embeddings` subcommands (added v2.0.0-alpha.0) introduce two new surfaces with networked / on-disk implications:
138
138
 
package/dist/index.d.ts CHANGED
@@ -14,6 +14,7 @@ interface ServeOptions {
14
14
  watch?: boolean;
15
15
  disabledTools?: string[];
16
16
  enabledTools?: string[];
17
+ diagnosticSearchTools?: boolean;
17
18
  }
18
19
  declare function main(): Promise<void>;
19
20
  declare function startServer(opts: ServeOptions): Promise<void>;
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAsDA,UAAU,YAAY;IACpB,KAAK,EAAE,MAAM,CAAC;IACd,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,QAAQ,CAAC,EAAE,WAAW,GAAG,SAAS,CAAC;IACnC,WAAW,CAAC,EAAE,MAAM,EAAE,CAAC;IACvB,SAAS,CAAC,EAAE,MAAM,EAAE,CAAC;IACrB,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,aAAa,CAAC,EAAE,MAAM,EAAE,CAAC;IACzB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;CACzB;AAED,iBAAe,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,CAuMnC;AAED,iBAAe,WAAW,CAAC,IAAI,EAAE,YAAY,GAAG,OAAO,CAAC,IAAI,CAAC,CA2K5D;AA+qCD,iBAAS,gBAAgB,CAAC,GAAG,EAAE,MAAM,EAAE,IAAI,EAAE,MAAM,GAAG,MAAM,CAM3D;AAsCD,OAAO,EAAE,IAAI,EAAE,gBAAgB,EAAE,WAAW,EAAE,CAAC"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAsDA,UAAU,YAAY;IACpB,KAAK,EAAE,MAAM,CAAC;IACd,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,QAAQ,CAAC,EAAE,WAAW,GAAG,SAAS,CAAC;IACnC,WAAW,CAAC,EAAE,MAAM,EAAE,CAAC;IACvB,SAAS,CAAC,EAAE,MAAM,EAAE,CAAC;IACrB,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,aAAa,CAAC,EAAE,MAAM,EAAE,CAAC;IACzB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,qBAAqB,CAAC,EAAE,OAAO,CAAC;CACjC;AAED,iBAAe,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,CA2MnC;AAED,iBAAe,WAAW,CAAC,IAAI,EAAE,YAAY,GAAG,OAAO,CAAC,IAAI,CAAC,CA2K5D;AAgtCD,iBAAS,gBAAgB,CAAC,GAAG,EAAE,MAAM,EAAE,IAAI,EAAE,MAAM,GAAG,MAAM,CAM3D;AAsCD,OAAO,EAAE,IAAI,EAAE,gBAAgB,EAAE,WAAW,EAAE,CAAC"}
package/dist/index.js CHANGED
@@ -12,7 +12,7 @@ import { chunkContent, defaultIndexFile, FtsIndex } from "./fts5.js";
12
12
  import { appendToNote, archiveNote, createNote, dataviewQuery, embeddingsSearch, findPath, findSimilar, getBacklinks, getNoteNeighbors, getOpenQuestions, getOutboundLinks, getRecentEdits, getUnresolvedWikilinks, getVaultStats, lintWiki, listCanvases, listNotes, listTags, openInUi, paperAudit, readCanvas, readNote, renameNote, replaceInNotes, resolveWikilink, searchHybrid, searchText, semanticSearch, validateNoteProposal } from "./tools.js";
13
13
  import { Vault } from "./vault.js";
14
14
  import { VaultWatcher } from "./watcher.js";
15
- const VERSION = "2.0.0-beta.1";
15
+ const VERSION = "2.0.0-beta.3";
16
16
  /** Default location for the persistent embedding index, alongside .fts5.db. */
17
17
  function embedDbPath(vaultRoot) {
18
18
  // Match the FTS5 location convention by stripping the .fts5.db extension
@@ -42,6 +42,7 @@ async function main() {
42
42
  .option("--watch", "Watch the vault for .md add/change/unlink events and incrementally invalidate the parsed-note cache (and refresh the FTS5 index when --persistent-index is also enabled). Off by default. Use this for long-running servers where you keep editing in Obsidian and want search to stay fresh without restarting.")
43
43
  .option("--disabled-tools <name...>", "Skip registration of specific tools by exact name. Useful when you want to expose a smaller surface to a particular agent (e.g. read-only research agent gets only obsidian_search_text + obsidian_read_note). Repeatable. Names are the same as in `tools/list` — `obsidian_*`. Example: `--disabled-tools obsidian_dataview_query obsidian_full_text_search`.")
44
44
  .option("--enabled-tools <name...>", "Strict allowlist — when set, ONLY listed tools register. Complement to --disabled-tools (denylist). If both are set: a tool must be in the allowlist AND not in the denylist. Repeatable. Example: `--enabled-tools obsidian_search_text obsidian_read_note obsidian_get_recent_edits`.")
45
+ .option("--diagnostic-search-tools", "Register the four single-ranker search tools (obsidian_search_text, obsidian_full_text_search, obsidian_semantic_search, obsidian_embeddings_search) IN ADDITION to the default obsidian_search hybrid tool. Off by default in v2.0+ — the umbrella obsidian_search auto-detects available signals and produces consistent recall. Enable when you need single-ranker output for diagnostics or A/B benchmarking.")
45
46
  .action(async (opts) => {
46
47
  await startServer(opts);
47
48
  });
@@ -242,14 +243,14 @@ async function startServer(opts) {
242
243
  return origRegisterTool(name, ...rest);
243
244
  };
244
245
  }
245
- registerReadTools(server, vault, ftsIndex);
246
+ registerReadTools(server, vault, ftsIndex, opts.diagnosticSearchTools ?? false);
246
247
  if (vault.writeEnabled)
247
248
  registerWriteTools(server, vault);
248
- if (ftsIndex)
249
- registerFtsTools(server, ftsIndex);
249
+ if (ftsIndex && opts.diagnosticSearchTools)
250
+ registerFtsTools(server, ftsIndex, vault);
250
251
  registerResources(server, vault);
251
252
  if (ftsIndex)
252
- registerChunkResource(server, ftsIndex);
253
+ registerChunkResource(server, ftsIndex, vault);
253
254
  registerPrompts(server);
254
255
  // v2.0.0-beta.1: warn on unknown names AFTER all tools are registered.
255
256
  // We can't validate at parse time because the canonical list depends on
@@ -428,7 +429,7 @@ async function syncFtsIndex(vault, idx) {
428
429
  total_chunks: idx.totalChunks()
429
430
  };
430
431
  }
431
- function registerFtsTools(server, idx) {
432
+ function registerFtsTools(server, idx, vault) {
432
433
  const READ_ONLY = { readOnlyHint: true, idempotentHint: true, openWorldHint: false };
433
434
  server.registerTool("obsidian_full_text_search", {
434
435
  title: "Full-text search (BM25, FTS5 index)",
@@ -459,6 +460,20 @@ function registerFtsTools(server, idx) {
459
460
  else
460
461
  throw new Error(`Invalid 'since' value (expected ISO date): ${args.since}`);
461
462
  }
463
+ // v2.0.0-beta.2 P0 fix: filter excluded paths from FTS5 hits before
464
+ // returning. The .fts5.db can contain entries from when the index was
465
+ // built without exclusion flags. Pre-fix, BM25 search leaked excluded
466
+ // chunks through `rel_path` and `snippet` (which contains the matched
467
+ // chunk text bracketed with «…»).
468
+ const userLimit = args.limit ?? 25;
469
+ const overFetch = userLimit * 2;
470
+ const rawMatches = idx.search(args.query, {
471
+ limit: overFetch,
472
+ folder: args.folder,
473
+ tag: args.tag,
474
+ sinceMtimeMs
475
+ });
476
+ const matches = rawMatches.filter((m) => !vault.isExcluded(m.rel_path)).slice(0, userLimit);
462
477
  return textResult({
463
478
  query: args.query,
464
479
  total_chunks: idx.totalChunks(),
@@ -468,16 +483,11 @@ function registerFtsTools(server, idx) {
468
483
  tag: args.tag ?? null,
469
484
  since: args.since ?? null
470
485
  },
471
- matches: idx.search(args.query, {
472
- limit: args.limit,
473
- folder: args.folder,
474
- tag: args.tag,
475
- sinceMtimeMs
476
- })
486
+ matches
477
487
  });
478
488
  });
479
489
  }
480
- function registerReadTools(server, vault, ftsIndex) {
490
+ function registerReadTools(server, vault, ftsIndex, diagnosticSearchTools) {
481
491
  const READ_ONLY = { readOnlyHint: true, idempotentHint: true, openWorldHint: false };
482
492
  server.registerTool("obsidian_list_notes", {
483
493
  title: "List notes",
@@ -519,23 +529,29 @@ function registerReadTools(server, vault, ftsIndex) {
519
529
  include_content: z.boolean().optional().describe("Include resolved file's body (default true)")
520
530
  }
521
531
  }, async (args) => textResult(await resolveWikilink(vault, args)));
522
- server.registerTool("obsidian_search_text", {
523
- title: "Search text",
524
- description: "Case-insensitive token search across all notes. Default mode `all` requires every whitespace-separated token to appear in a note (AND-tokenizer); `any` requires at least one (OR); `phrase` does the old contiguous-substring match. Returns a structured response with `query`, `mode`, `scanned_notes`, and ranked `matches` (each with snippet, line, score, matched_terms) — empty matches are explicit, not ambiguous with a broken call.",
525
- annotations: { ...READ_ONLY, title: "Search text" },
526
- inputSchema: {
527
- query: z
528
- .string()
529
- .min(1)
530
- .describe('Search string. With mode=all/any, whitespace tokenizes ("foo bar" ["foo","bar"]).'),
531
- folder: z.string().optional().describe("Restrict to a subfolder"),
532
- limit: z.number().int().positive().max(200).optional().describe("Max results (default 25)"),
533
- mode: z
534
- .enum(["all", "any", "phrase"])
535
- .optional()
536
- .describe('"all" (default, AND), "any" (OR), or "phrase" (literal substring pre-v0.9 behavior)')
537
- }
538
- }, async (args) => textResult(await searchText(vault, args)));
532
+ // v2.0.0-beta.3: obsidian_search_text is now a DIAGNOSTIC tool — gated
533
+ // behind --diagnostic-search-tools. Default search surface is the umbrella
534
+ // obsidian_search which auto-detects + fuses signals. Pre-fix, agents
535
+ // routinely picked the wrong single-ranker tool; consolidation reduces
536
+ // tool-list bloat and produces consistent recall.
537
+ if (diagnosticSearchTools)
538
+ server.registerTool("obsidian_search_text", {
539
+ title: "Search text",
540
+ description: "Case-insensitive token search across all notes. Default mode `all` requires every whitespace-separated token to appear in a note (AND-tokenizer); `any` requires at least one (OR); `phrase` does the old contiguous-substring match. Returns a structured response with `query`, `mode`, `scanned_notes`, and ranked `matches` (each with snippet, line, score, matched_terms) — empty matches are explicit, not ambiguous with a broken call.",
541
+ annotations: { ...READ_ONLY, title: "Search text" },
542
+ inputSchema: {
543
+ query: z
544
+ .string()
545
+ .min(1)
546
+ .describe('Search string. With mode=all/any, whitespace tokenizes ("foo bar" → ["foo","bar"]).'),
547
+ folder: z.string().optional().describe("Restrict to a subfolder"),
548
+ limit: z.number().int().positive().max(200).optional().describe("Max results (default 25)"),
549
+ mode: z
550
+ .enum(["all", "any", "phrase"])
551
+ .optional()
552
+ .describe('"all" (default, AND), "any" (OR), or "phrase" (literal substring — pre-v0.9 behavior)')
553
+ }
554
+ }, async (args) => textResult(await searchText(vault, args)));
539
555
  server.registerTool("obsidian_get_recent_edits", {
540
556
  title: "Get recent edits",
541
557
  description: "List notes ordered by most recent modification. Useful for picking up where work was left off.",
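Note on the hunk above: the gating pattern can be sketched in isolation. This is a minimal illustration, not the package's code. `makeStubServer` and `registerSearchTools` are hypothetical names, the option keys merely mirror the `--diagnostic-search-tools` and `--persistent-index` flags, and the `obsidian_full_text_search` condition follows the changelog entry for this release.

```javascript
// Hypothetical stand-in for the MCP server: it only records tool names.
function makeStubServer() {
  const tools = [];
  return {
    tools,
    registerTool(name, _config, _handler) {
      tools.push(name);
    },
  };
}

// Illustrative flag-gated registration (assumed shape, not the real function).
function registerSearchTools(server, options = {}) {
  const { diagnosticSearchTools = false, persistentIndex = false } = options;
  const noop = async () => ({ content: [] });

  // The umbrella tool is always registered.
  server.registerTool("obsidian_search", { title: "Search" }, noop);

  // Single-ranker tools appear only when the diagnostic flag is set.
  if (!diagnosticSearchTools) return;
  server.registerTool("obsidian_search_text", { title: "Search text" }, noop);
  server.registerTool("obsidian_semantic_search", { title: "Semantic search" }, noop);
  server.registerTool("obsidian_embeddings_search", { title: "Embeddings search" }, noop);

  // Per the changelog, the BM25 tool additionally needs the persistent index.
  if (persistentIndex) {
    server.registerTool("obsidian_full_text_search", { title: "Full-text search" }, noop);
  }
}

const defaultServer = makeStubServer();
registerSearchTools(defaultServer);
console.log(defaultServer.tools);
```

With no options the stub records only `obsidian_search`; passing `{ diagnosticSearchTools: true, persistentIndex: true }` adds the four single-ranker tools.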
@@ -761,44 +777,48 @@ function registerReadTools(server, vault, ftsIndex) {
761
777
  path: z.string().describe("Vault-relative path of the .canvas file (with or without .canvas)")
762
778
  }
763
779
  }, async (args) => textResult(await readCanvas(vault, args)));
764
- server.registerTool("obsidian_semantic_search", {
765
- title: "Semantic search (TF-IDF cosine)",
766
- description: "Pure-JS lexical-semantic retrieval. Tokenizes + TF-IDFs + L2-normalizes every note's body once per session, then ranks notes by cosine similarity to the query. Free / offline / no model download — closes the gap to Smart Connections without paywall, ML deps, or HTTP. Use this when `obsidian_search_text` (substring) and `obsidian_full_text_search` (BM25) miss synonyms or related-term matches. For best results pair with `--persistent-index` so BM25 + semantic both run cheap. Returns ranked hits with snippet + matched terms (highest-IDF first).",
767
- annotations: { ...READ_ONLY, title: "Semantic search" },
768
- inputSchema: {
769
- query: z.string().min(1).describe("Free-form query — multi-word, natural language is fine"),
770
- folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
771
- limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
772
- min_score: z
773
- .number()
774
- .min(0)
775
- .max(1)
776
- .optional()
777
- .describe("Drop hits below this cosine score (default 0.05). Cosine ranges 0–1.")
778
- }
779
- }, async (args) => textResult(await semanticSearch(vault, args)));
780
+ // v2.0.0-beta.3: gated — see comment on obsidian_search_text above.
781
+ if (diagnosticSearchTools)
782
+ server.registerTool("obsidian_semantic_search", {
783
+ title: "Semantic search (TF-IDF cosine)",
784
+ description: "Pure-JS lexical-semantic retrieval. Tokenizes + TF-IDFs + L2-normalizes every note's body once per session, then ranks notes by cosine similarity to the query. Free / offline / no model download — closes the gap to Smart Connections without paywall, ML deps, or HTTP. Use this when `obsidian_search_text` (substring) and `obsidian_full_text_search` (BM25) miss synonyms or related-term matches. For best results pair with `--persistent-index` so BM25 + semantic both run cheap. Returns ranked hits with snippet + matched terms (highest-IDF first).",
785
+ annotations: { ...READ_ONLY, title: "Semantic search" },
786
+ inputSchema: {
787
+ query: z.string().min(1).describe("Free-form query — multi-word, natural language is fine"),
788
+ folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
789
+ limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
790
+ min_score: z
791
+ .number()
792
+ .min(0)
793
+ .max(1)
794
+ .optional()
795
+ .describe("Drop hits below this cosine score (default 0.05). Cosine ranges 0–1.")
796
+ }
797
+ }, async (args) => textResult(await semanticSearch(vault, args)));
780
798
  // v2.0 alpha — ML-embeddings retrieval. Reads a persistent vector index
781
799
  // built by `enquire-mcp build-embeddings`. Returns clean error if the index
782
800
  // doesn't exist (rather than silently downloading a model).
783
- server.registerTool("obsidian_embeddings_search", {
784
- title: "Embeddings search (ML, paraphrase-multilingual)",
785
- description: "ML-embedding retrieval via @huggingface/transformers + paraphrase-multilingual-MiniLM-L12-v2 (50+ languages, 384-dim, runs on CPU). Higher-quality than `obsidian_semantic_search` for paraphrases / synonyms / cross-language queries, but requires a one-time setup: (1) `enquire-mcp install-model multilingual` downloads the ONNX weights (~120MB) and (2) `enquire-mcp build-embeddings --vault <path>` writes the persistent vector index (~1ms/chunk on M1). Subsequent queries are sub-100ms top-10. If the index is missing, the tool returns a clean error with the exact command to run — it does NOT silently kick off a model download.",
786
- annotations: { ...READ_ONLY, title: "Embeddings search" },
787
- inputSchema: {
788
- query: z.string().min(1).describe("Free-form query — multi-word, natural language, any supported language"),
789
- folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
790
- limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
791
- min_score: z
792
- .number()
793
- .min(0)
794
- .max(1)
795
- .optional()
796
- .describe("Drop hits below this cosine score (default 0.3). Cosine ranges -1 to 1; embeddings cluster ~0.4-0.9.")
797
- }
798
- }, async (args) => {
799
- const embedFile = embedDbPath(vault.root);
800
- return textResult(await embeddingsSearch(vault, args, embedFile));
801
- });
801
+ // v2.0.0-beta.3: gated — see comment on obsidian_search_text above.
802
+ if (diagnosticSearchTools)
803
+ server.registerTool("obsidian_embeddings_search", {
804
+ title: "Embeddings search (ML, paraphrase-multilingual)",
805
+ description: "ML-embedding retrieval via @huggingface/transformers + paraphrase-multilingual-MiniLM-L12-v2 (50+ languages, 384-dim, runs on CPU). Higher-quality than `obsidian_semantic_search` for paraphrases / synonyms / cross-language queries, but requires a one-time setup: (1) `enquire-mcp install-model multilingual` downloads the ONNX weights (~120MB) and (2) `enquire-mcp build-embeddings --vault <path>` writes the persistent vector index (~1ms/chunk on M1). Subsequent queries are sub-100ms top-10. If the index is missing, the tool returns a clean error with the exact command to run — it does NOT silently kick off a model download.",
806
+ annotations: { ...READ_ONLY, title: "Embeddings search" },
807
+ inputSchema: {
808
+ query: z.string().min(1).describe("Free-form query — multi-word, natural language, any supported language"),
809
+ folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
810
+ limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
811
+ min_score: z
812
+ .number()
813
+ .min(0)
814
+ .max(1)
815
+ .optional()
816
+ .describe("Drop hits below this cosine score (default 0.3). Cosine ranges -1 to 1; embeddings cluster ~0.4-0.9.")
817
+ }
818
+ }, async (args) => {
819
+ const embedFile = embedDbPath(vault.root);
820
+ return textResult(await embeddingsSearch(vault, args, embedFile));
821
+ });
802
822
  // v2.0 beta — hybrid RRF over BM25 + TF-IDF + embeddings. Single umbrella
803
823
  // tool that auto-detects which signals are available and gracefully
804
824
  // degrades. Equal weights, k=60 (Cormack et al's recommendation). Note-
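Note on the comment above: reciprocal rank fusion with equal weights and k=60 can be sketched as follows. This is a generic illustration of the technique under stated assumptions, not the package's implementation. `rrfFuse` is a hypothetical name, each ranker is assumed to return an ordered list of note paths, and a missing signal is modeled as `null` so the fusion degrades to whatever rankers remain.

```javascript
// Reciprocal rank fusion: score(d) = sum over rankers of 1 / (k + rank),
// where rank is the 1-based position of d in that ranker's list.
function rrfFuse(rankings, k = 60) {
  const scores = new Map();
  // Skip rankers that are unavailable (null/undefined) or returned nothing.
  const available = rankings.filter((r) => Array.isArray(r) && r.length > 0);
  for (const ranked of available) {
    ranked.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Example: BM25 and TF-IDF are present, the embeddings index is missing.
const bm25 = ["a.md", "b.md", "c.md"];
const tfidf = ["b.md", "c.md", "a.md"];
const embeddings = null;
const fused = rrfFuse([bm25, tfidf, embeddings]);
console.log(fused.map((hit) => hit.id)); // ["b.md", "a.md", "c.md"]
```

Because only ranks enter the formula, the rankers' raw scores (BM25 values, cosine similarities) never need to be normalized against each other.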
@@ -913,7 +933,7 @@ function registerWriteTools(server, vault) {
913
933
  }
914
934
  }, async (args) => textResult(await archiveNote(vault, args)));
915
935
  }
916
- function registerChunkResource(server, idx) {
936
+ function registerChunkResource(server, idx, vault) {
917
937
  // Chunk-level addressing — closes the v0.10 roadmap item from issue #10
918
938
  // suggestion 1. URI shape: obsidian://chunk/{chunkIndex}/{+notePath}.
919
939
  // Index FIRST so the {+notePath} can greedily eat slash-bearing paths.
@@ -937,6 +957,15 @@ function registerChunkResource(server, idx) {
937
957
  }
938
958
  const notePathRaw = Array.isArray(params.notePath) ? params.notePath.join("/") : params.notePath;
939
959
  const decoded = decodeNotePath(notePathRaw);
960
+ // v2.0.0-beta.2 P0 fix: enforce --read-paths / --exclude-glob on the
961
+ // chunk resource. The .fts5.db can contain entries from before the user
962
+ // added a privacy filter, so a stale URI returned earlier in the
963
+ // session would otherwise serve excluded content. We refuse with the
964
+ // same "not found" framing the FTS5 search uses post-filter, so the
965
+ // attacker can't distinguish "doesn't exist" from "exists but excluded".
966
+ if (vault.isExcluded(decoded)) {
967
+ throw new Error(`Chunk not found: ${decoded}#${chunkIndex}`);
968
+ }
940
969
  const chunk = idx.getChunk(decoded, chunkIndex);
941
970
  if (!chunk)
942
971
  throw new Error(`Chunk not found: ${decoded}#${chunkIndex}`);