@oomkapwn/enquire-mcp (npm) 2.0.0-beta.1 → 2.0.0-beta.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/CHANGELOG.md CHANGED

@@ -2,6 +2,171 @@
 All notable changes to this project will be documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [2.0.0-beta.3] — 2026-05-08
+**Backlog cleanup + tool-surface consolidation.** All audit-driven P0/P1 work landed in beta.2; this release closes the long tail of P2/P3 backlog items the same audits surfaced. No new features and no breaking changes for default users, but the default tool list is now narrower (21 read tools instead of 24): three single-ranker search tools moved from always-on to a new opt-in flag, and the fourth (`obsidian_full_text_search`, already gated by `--persistent-index`) now requires that flag as well.
+### Changed — `obsidian_search` is the headline; single-ranker tools moved behind `--diagnostic-search-tools`
+The audit's recurring observation: agents routinely picked the wrong single-ranker search tool from the five options (`search_text`, `full_text_search`, `semantic_search`, `embeddings_search`, `search`). The umbrella `obsidian_search` (added v2.0.0-beta.0) auto-detects available signals and produces consistent recall — five-tool surface is now bloat.
+- **Default surface (v2.0.0-beta.3+):** 21 always-on read tools. The single search tool is `obsidian_search`. Hybrid retrieval auto-detects what's available (BM25 if `--persistent-index`, ML embeddings if `build-embeddings` ran) and falls back gracefully.
+- **Diagnostic surface:** add `--diagnostic-search-tools` to register `obsidian_search_text`, `obsidian_semantic_search`, `obsidian_embeddings_search` (and `obsidian_full_text_search` if `--persistent-index` is also set). Use these for A/B benchmarking or when you specifically need single-ranker output.
+This is **not breaking** for clients calling `obsidian_search` (the v2.0 default). It IS a change for clients hard-coded to call `obsidian_search_text` / `obsidian_semantic_search` / `obsidian_embeddings_search` / `obsidian_full_text_search` — they need to either switch to `obsidian_search` (recommended) or add the flag.
+### Added — Cross-platform CI: macOS advisory job
+CI test matrix was Linux-only. `Vault` does cross-platform path work (`vault.ts:631` has a Windows separator normalization), symlink handling, and `chmod` operations — all of which behave differently on non-Linux platforms. Pre-fix, regressions only surfaced on user reports.
+New `test-macos` job runs the same suite on `macos-latest` × Node 22. **Advisory only** (`continue-on-error: true`) so it doesn't block merges, but failures appear in the PR check list. Required CI gate stays Linux × {Node 20, 22, 24} for ruleset stability.
+### Added — Coverage threshold gates in vitest
+Pre-fix: the `coverage` CI job uploaded an HTML report and exited 0 regardless of the numbers. A regression that dropped coverage 90% → 40% would ship green. New `vitest.config.ts` thresholds:
+- lines: ≥86%
+- statements: ≥82%
+- functions: ≥75%
+- branches: ≥73%
+All thresholds sit ~5pp below current coverage. Coverage excludes `src/index.ts` (registration boilerplate; line-count doesn't reflect quality) and test files. CI fails if any metric drops below its threshold.
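
For illustration, thresholds like these could be expressed in a `vitest.config.ts` along the following lines. This is a sketch, not the project's actual file: the `v8` provider and the `tests/**` glob are assumptions, and only the four percentages and the `src/index.ts` exclusion come from the entry above.

```typescript
// vitest.config.ts (sketch; provider and test glob are assumptions)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      // Registration boilerplate and test files are left out of the numbers.
      exclude: ["src/index.ts", "tests/**"],
      // The coverage run fails when any metric falls below its threshold.
      thresholds: { lines: 86, statements: 82, functions: 75, branches: 73 },
    },
  },
});
```

Depending on the Vitest version, setting `exclude` may replace the built-in default exclusions rather than extend them, so a real config might need to spread the defaults back in.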
+### Changed — `npm audit` elevated to `moderate` for production deps
+Pre-fix: `--audit-level=high` everywhere. The recently-resolved `ip-address` advisory (CVE-2026-42338, moderate severity) sat undetected between Dependabot scans because no audit gate caught it. Now production deps gate at `moderate`, dev deps stay at `high` (more noise, less surface).
+### Process — branch-protection ruleset bypass mode hardened
+`bypass_actors` for the admin role was `bypass_mode: always`. Changed to `bypass_mode: pull_request`. The maintainer's own pushes now go through PR (auto-mergeable), creating an audit trail. Combined with the v2.0.0-beta.2 release-pipeline integrity check, this means every change shipped to npm has a reviewable diff.
+### Docs
+- README "Configure your AI client" tool count: `24 read + 1 opt-in` → `21 read + 4 opt-in` (3 diagnostic + 1 FTS) reflecting the consolidation above.
+- `docs/api.md` header updated with the new tool-count math + opt-in flag breakdown.
+- README footer ENQUIRE paragraph deduplicated (was repeated near-verbatim at lines 59 and 484; footer now just references the inline note).
+- GitHub repo About description shortened from 340 → 195 chars to fit OpenGraph truncation.
+### Tests
+408 unit tests pass (unchanged from beta.2). The tests exercise the same surfaces; `tests/docs-consistency.test.ts` now counts diagnostic-gated tools as opt-in rather than always-on.
+`scripts/smoke.mjs` adds `--diagnostic-search-tools` to its server invocation so smoke continues to exercise all 5 search tools (was: 4, post-consolidation default surface is 1).
+### Migration from v2.0.0-beta.2
+**No-op for clients of `obsidian_search`** (the v2.0 hybrid default). Recommended path forward.
+**Clients calling per-ranker tools directly:**
+- Either switch to `obsidian_search` (preferred — auto-fuses signals)
+- Or pass `--diagnostic-search-tools` to your `enquire-mcp serve` invocation
+**Programmatic API surface unchanged.** The 4 gated tools have identical schemas + behavior when registered.
+## [2.0.0-beta.2] — 2026-05-06
+**Audit-driven patch.** A second deep audit (5 parallel agents covering architecture, tests, docs, CI/CD, security threat model) surfaced one P0 privacy bypass of the same shape as the writeNote bug from beta.1, three release-pipeline P0s, and a long tail of P1 hardening. This release closes 16 findings and adds new architectural invariants to prevent recurrence.
+### Fixed — P0: persistent search indexes ignored `isExcluded` after config flip
+**Same architectural debt as the writeNote miss in v2.0.0-beta.0.** The audit's root-cause analysis: `Vault.listMarkdown()` is the privacy chokepoint, but new persistent layers (FTS5 db, embed db) introduced their own search paths that bypassed it. Result: if a user built `.fts5.db` / `.embed.db` once, then added `--exclude-glob` later, excluded chunks leaked through:
+- `obsidian_full_text_search` — BM25 hits from stale entries
+- `obsidian_embeddings_search` — cosine hits from stale entries
+- `obsidian_search` (the v2.0 default) — both BM25 + embed branches inherited
+- `obsidian://chunk/{n}/{path}` resource — direct chunk fetch ignored exclusion
+**Fix:** five new `isExcluded` filters, applied at the right layer:
+1. `embeddingsSearch` post-filters `db.search()` results, with 2× over-fetch to keep top-K stable
+2. `searchHybrid` BM25 branch post-filters `ftsIndex.search()` results
+3. `searchHybrid` embed branch — automatically protected since `embeddingsSearch` now filters
+4. `obsidian_full_text_search` handler post-filters with 2× over-fetch
+5. `vault-chunk` resource refuses with "not found" framing (matches FTS5 search post-filter, so the attacker can't distinguish "doesn't exist" from "exists but excluded")
+Architecturally, the indexes themselves can keep stale entries — content filtering happens at search time, mirroring how `Vault.readNote` filters at read time even when the parse cache has the path.
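
The over-fetch and post-filter pattern described above can be sketched as follows. The `Hit` shape and the `searchWithExclusion` helper are hypothetical names for illustration; only the idea (fetch 2× the limit, drop excluded paths, trim to the limit) comes from the entry.

```typescript
// Hypothetical shape for illustration; the real index hit type may differ.
interface Hit {
  rel_path: string;
  score: number;
}

/**
 * Over-fetch from a ranked index, drop excluded paths, then trim to the
 * caller's limit. `search` must return hits already sorted best-first.
 */
function searchWithExclusion(
  search: (limit: number) => Hit[],
  isExcluded: (relPath: string) => boolean,
  limit: number,
  overFetchFactor = 2
): Hit[] {
  const raw = search(limit * overFetchFactor);
  return raw.filter((hit) => !isExcluded(hit.rel_path)).slice(0, limit);
}

// Usage with an in-memory stand-in for a stale index that still holds
// an entry from a folder the user excluded later.
const staleIndex: Hit[] = [
  { rel_path: "Personal/diary.md", score: 0.9 },
  { rel_path: "Public/a.md", score: 0.8 },
  { rel_path: "Public/b.md", score: 0.7 },
  { rel_path: "Public/c.md", score: 0.6 },
];
const visible = searchWithExclusion(
  (n) => staleIndex.slice(0, n),
  (p) => p.startsWith("Personal/"),
  2
);
```

A fixed over-fetch factor keeps top-K stable only while fewer than half of the fetched hits are excluded; beyond that, the result can come back shorter than the requested limit.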
+### Fixed — P0: release-pipeline integrity
+**`release.yml`** previously trusted any tag pointing at any commit. An attacker who got commit access could `git tag v9.9.9 <evil-sha> && git push --tags` and ship malware bypassing main protections — the workflow re-ran lint/test/audit on the tag's SHA and would happily green-light it. Now release.yml:
+1. Asserts the tagged SHA is reachable from `main` (`git merge-base --is-ancestor`)
+2. Polls GitHub's check-runs API to verify all 8 required CI checks (`lint`, `test (20/22/24)`, `smoke`, `audit`, `coverage`, `version-consistency`) reported `success` on this exact SHA, with up to 5-minute tolerance for tag-vs-CI race conditions
+3. Refuses to publish if either check fails
+**dist-tag regex** was hand-rolled `/-([a-z]+)\.[0-9]+$/`, which misrouted three valid SemVer prereleases to `latest`:
+- `2.0.0-rc` (no `.N` suffix) → previously latest, now `rc`
+- `2.0.0-rc.0+build.1` (build metadata) → previously latest, now `rc`
+- `2.0.0-alpha-3` (dash separator) → previously latest, now `alpha-3`
+Replaced with a Node-side parser that extracts the prerelease channel by SemVer rules. Verified against 8-case matrix.
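
A minimal sketch of channel extraction by SemVer rules, assuming the channel is the first dot-separated prerelease identifier. The function name `distTagFor` is hypothetical and this is not the project's parser; it only reproduces the three cases listed above plus the plain-release case.

```typescript
/**
 * Derive an npm dist-tag from a version string: "latest" for a release,
 * otherwise the first dot-separated prerelease identifier.
 */
function distTagFor(version: string): string {
  // Build metadata starts at the first "+" and plays no part in the channel.
  const withoutBuild = version.replace(/^v/, "").split("+")[0] ?? "";
  // The prerelease starts at the first "-" after the numeric core.
  const dash = withoutBuild.indexOf("-");
  if (dash === -1) return "latest";
  const prerelease = withoutBuild.slice(dash + 1);
  // Identifiers are dot-separated and may themselves contain hyphens.
  const channel = prerelease.split(".")[0] ?? "";
  if (channel === "") throw new Error(`Empty prerelease in "${version}"`);
  return channel;
}
```

Stripping build metadata before looking for the hyphen matters: a version such as `2.0.0+build-1` has a hyphen but no prerelease. A purely numeric first identifier would yield a numeric tag, which this sketch does not guard against.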
+### Fixed — P1 sec DiD: `.obsidian/` plugin config bypassed `--read-paths`
+**Defense in depth.** `loadPeriodicConfig()` read `.obsidian/daily-notes.json` and `.obsidian/plugins/periodic-notes/data.json` directly via `fs.readFile`, bypassing the user's privacy filter. Not a content leak (downstream `vault.stat` rejected paths), but the contract `--read-paths "Public/**"` = "ONLY Public/ visible" was technically violated. Now `loadPeriodicConfig` accepts an optional `isExcluded` predicate; when the user's allowlist excludes `.obsidian/**`, we silently fall back to v0.11 hard-coded defaults.
+### Fixed — P1 sec DiD: empty exclusion patterns silent-disable
+**Privacy fail-closed.** Pre-fix, `--read-paths ""` (empty after shell interpolation of an unset variable) survived as `[""]`. `globToRegex("")` produces `^$`, which matches no real path, so the readPaths allowlist matched nothing and every path was treated as excluded. The opposite mistake (a whitespace-only pattern) silently disabled the filter. Now the Vault constructor strips empty and whitespace-only patterns and throws if the cleaned list is empty but the user explicitly passed flags, so privacy is fail-closed.
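
The strip-then-throw behavior can be sketched as below. `cleanPatterns` is a hypothetical helper, not the Vault constructor's actual code, and treating `undefined` as "flag not passed" is an assumption.

```typescript
/**
 * Strip empty and whitespace-only glob patterns. If the caller passed
 * patterns but none survive, throw rather than silently widen or
 * narrow what is visible.
 */
function cleanPatterns(raw: string[] | undefined, flag: string): string[] {
  if (raw === undefined) return []; // flag not passed: no filter requested
  const cleaned = raw.map((p) => p.trim()).filter((p) => p.length > 0);
  if (raw.length > 0 && cleaned.length === 0) {
    throw new Error(`${flag} was passed but contained no usable patterns`);
  }
  return cleaned;
}
```

Throwing here surfaces the unset-shell-variable case at startup instead of leaving the server running with a filter the user did not intend.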
+### Fixed — P1 architecture: searchHybrid silently swallowed ranker errors
+`searchHybrid` wrapped each ranker in `try/catch` with stderr-only logging. The MCP response just showed `signals_used: []` with `matches: []` — a caller couldn't tell "no hits" from "all rankers crashed." New optional `signal_errors: { bm25?, tfidf?, embeddings? }` field surfaces per-signal failures so agents can reason about reliability.
+### Fixed — P1 architecture: `replaceInNotes` partial-state on mid-loop write failure
+Pre-fix, a throw on file 5 of 20 lost the response: files 1-4 were silently committed and the agent had no way to discover that. Now per-file errors are collected; the response includes a `partial: true` flag and an `errors: [{path, message}]` array. Systemic failures (read-only vault) still throw fast because they are config errors, not per-file failures.
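
A sketch of per-file error collection, assuming a synchronous writer callback for brevity. `applyToFiles` and the `files_changed` field are hypothetical; `partial` and `errors` follow the response shape described in this changelog.

```typescript
interface ReplaceResult {
  files_changed: number; // hypothetical field, for illustration only
  partial: boolean; // always present
  errors?: Array<{ path: string; message: string }>; // only when partial
}

/** Apply `writeFile` to each path, recording failures instead of aborting. */
function applyToFiles(paths: string[], writeFile: (path: string) => void): ReplaceResult {
  const errors: Array<{ path: string; message: string }> = [];
  let filesChanged = 0;
  for (const path of paths) {
    try {
      writeFile(path);
      filesChanged += 1;
    } catch (err) {
      errors.push({ path, message: err instanceof Error ? err.message : String(err) });
    }
  }
  return errors.length > 0
    ? { files_changed: filesChanged, partial: true, errors }
    : { files_changed: filesChanged, partial: false };
}
```

A systemic precondition such as a read-only vault would be checked before the loop and thrown immediately, so the collected errors only ever describe individual files.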
+### Fixed — P1 architecture: `resolveTarget` periodic-alias fallthrough leaked content via basename collision
+Pre-fix, when `vault.stat()` returned ENOENT for the configured periodic path (e.g., `Daily Notes/2026-05-08.md` doesn't exist yet), `resolveTarget` fell through to a basename match across the whole vault. With `--exclude-glob 'Daily Notes/**'` AND a `Public/2026-05-08.md`, the basename match silently redirected "today" to the unrelated public note. Now we only fall through if the periodic config produces a folder-less stem (i.e., user keeps periodic notes at vault root); configured-folder cases must hit the configured folder or fail clean.
+### Fixed — P1: `renameNote` and `Vault.renameFile` error messages now distinguish allowlist vs denylist
+Pre-fix, both always blamed `--exclude-glob` even when `--read-paths` was the reason. New `Vault.exclusionReason()` helper exposes the same logic that writeNote already used; renameNote and renameFile both adopt it.
+### Fixed — P1: `replaceInNotes` accepted excluded `folder=` argument
+Pre-fix, `replaceInNotes(folder: "Personal")` with `--exclude-glob "Personal/**"` returned `files_scanned: 0, scope: "Personal/"` — confirming the folder name existed in the user's layout. Now the function refuses early: `folder is excluded by privacy filter`. Same pattern applies to other tools that take `folder` arguments — listed as P2 backlog for v2.0.0-beta.3.
+### Fixed — P1 docs
+- README + SECURITY.md "v2.0 alpha" → "v2.0" (already shipped beta).
+- README "Configure your AI client" section: now shows BOTH `@latest` (v1.x) AND `@beta` (v2.0) install snippets explicitly. Pre-fix, copying the snippet pulled v1.11.1 while the section below described v2.0 features.
+- README source-line-count claim: `~3500 lines` → `~7500 lines` (verified `wc -l src/*.ts`).
+- README test-count claim: `388+` → `405+` (will be `408+` after this release).
+- CHANGELOG v1.11.1 entry: removed phantom `obsidian_resolve_periodic_alias` reference (replaced with `obsidian_read_note({title:"today"})`, the actual MCP-exposed entry-point).
+### Added — Architecture invariant: docs-consistency tests for numeric drift
+`tests/docs-consistency.test.ts` previously checked tool-name parity. Extended to:
+- **Tool-count parity:** README's "N read tools (always on)" must match the actual count of `registerTool()` calls outside `registerWriteTools` and `registerFtsTools`.
+- **`docs/api.md` math:** "M MCP tools (X always-on read + Y opt-in read + Z opt-in write)" must satisfy M = X + Y + Z.
+- **CLI subcommand parity:** every `program.command()` registered must appear in the docs/api.md Subcommands table.
+These prevent the kind of drift the audit caught manually. Now caught at CI time.
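
The `docs/api.md` arithmetic check could look roughly like this. The header wording is taken from the bullet above; the function name `checkToolCountMath` and the regex are assumptions, not the project's test code.

```typescript
/** Return true when the header's total equals the sum of its three parts. */
function checkToolCountMath(header: string): boolean {
  const m = header.match(
    /(\d+) MCP tools \((\d+) always-on read \+ (\d+) opt-in read \+ (\d+) opt-in write\)/
  );
  if (!m) throw new Error("tool-count header not found");
  const [total, alwaysOn, optInRead, optInWrite] = m.slice(1).map(Number);
  return total === (alwaysOn ?? 0) + (optInRead ?? 0) + (optInWrite ?? 0);
}
```

Throwing when the header is missing, rather than returning `false`, keeps a reworded heading from passing as a silent skip.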
+### Tests
+408 unit tests pass (was 393, +15 new):
+- 5 privacy-regression tests for `appendToNote`, `archiveNote`, `renameNote` (source + dest with allowlist), `replaceInNotes` (denylist)
+- 2 search-time isExcluded filter tests (`searchHybrid` BM25 path with stale FTS5 db; `embeddingsSearch` filter post-search)
+- 3 fail-closed Vault constructor tests (empty `--read-paths` / `--exclude-glob` rejection)
+- 3 docs-consistency invariant tests
+- 1 updated periodic-alias test (now expects "No note found" silent fallback instead of "excluded" leak)
+- 1 architecture refactoring (security.test.ts test reordering after lint:fix)
+### Migration from v2.0.0-beta.1
+**No breaking changes for end users.** All v2.0.0-beta.1 tools and CLI flags continue to work.
+**Programmatic callers (rare):** `Vault` now throws on empty `excludeGlobs: [""]` / `readPaths: [""]`. Filter empty strings in the caller before constructing.
+**`searchHybrid` response shape:** new optional `signal_errors` field. Existing parsers that ignore unknown fields are unaffected.
+**`replaceInNotes` response shape:** new `partial: boolean` field (always present) and `errors?: Array` (only when partial). Existing parsers ignoring unknown fields are unaffected.
 ## [2.0.0-beta.1] — 2026-05-06
 **Audit-driven patch.** An independent external audit of v2.0.0-beta.0 surfaced one P0 privacy/security bug, several P1 doc/correctness drifts, and a handful of P2 hardening opportunities. This release closes all 17 findings (1 P0 + 7 P1 + 7 P2 + 2 P3). No new features.
@@ -241,7 +406,7 @@ Regression test: `tests/security.test.ts` adds two cases — one for `--exclude-
 `scripts/synthetic-vault.mjs` (CI smoke) didn't write `.obsidian/daily-notes.json`, so smoke fell back to the v0.11 hard-coded defaults — leaving `loadPeriodicConfig()` + `formatMoment()` regression-free in CI even when the actual code broke.
-Added a 3-line config (`folder: "99_Daily"`, `format: "YYYY-MM-DD"`) so `obsidian_resolve_periodic_alias today` now exercises the lazy-load → cache → format codepath in every CI run.
+Added a 3-line config (`folder: "99_Daily"`, `format: "YYYY-MM-DD"`) so `obsidian_read_note({ title: "today" })` now exercises the lazy-load → cache → format codepath in every CI run.
 ### Docs

package/README.md CHANGED

@@ -96,10 +96,10 @@ There are several Obsidian-MCP servers out there. enquire differentiates on thre
 | **Strict path allowlist** (`--read-paths '01_Projects/**'` — only paths matching one of these globs are visible; complement to `--exclude-glob` denylist) | ❌ | ✅ |
 | **Canvas (`.canvas`) read tools** (`obsidian_list_canvases` + `obsidian_read_canvas` — typed nodes + edges, broken-ref detection) | ❌ rare / partial | ✅ first-class |
 | **Semantic search** (`obsidian_semantic_search` — TF-IDF cosine, free / offline / no model download) | ❌ usually paywalled (Smart Connections) | ✅ in-tree |
-| **ML embeddings search** (`obsidian_embeddings_search` — paraphrase-multilingual-MiniLM-L12-v2, 50+ languages, persistent SQLite vector index) | ❌ usually paywalled (Smart Connections) | ✅ free + offline-capable (v2.0 alpha) |
-| TypeScript strict + Biome lint + 388+ unit tests | varies | ✅ |
+| **ML embeddings search** (`obsidian_embeddings_search` — paraphrase-multilingual-MiniLM-L12-v2, 50+ languages, persistent SQLite vector index) | ❌ usually paywalled (Smart Connections) | ✅ free + offline-capable (v2.0 beta) |
+| TypeScript strict + Biome lint + 405+ unit tests | varies | ✅ |
-That's the gap. enquire closes it in ~3500 lines of TypeScript with five mandatory runtime dependencies (`@modelcontextprotocol/sdk`, `chokidar`, `commander`, `gray-matter`, `zod`) plus two optional (`better-sqlite3` for `--persistent-index` and `--build-embeddings`; `@huggingface/transformers` for ML embeddings — both are no-ops when not invoked).
+That's the gap. enquire closes it in ~7500 lines of TypeScript with five mandatory runtime dependencies (`@modelcontextprotocol/sdk`, `chokidar`, `commander`, `gray-matter`, `zod`) plus two optional (`better-sqlite3` for `--persistent-index` and the `build-embeddings` subcommand; `@huggingface/transformers` for ML embeddings — both are no-ops when not invoked).
 > **Not affiliated with Obsidian.md.** Obsidian and the Obsidian logo are trademarks of Dynalist Inc. enquire-mcp is an independent open-source project that reads Obsidian-format vaults. The name «enquire» is a tribute to Tim Berners-Lee's 1980 hypertext system, not a trademark claim against any party.
@@ -115,9 +115,12 @@ That's the gap. enquire closes it in ~3500 lines of TypeScript with five mandato
 ## Configure your AI client
-**Recommended: zero-install via `npx` — no clone, no build.** Drop this into your MCP client's config:
+**Recommended: zero-install via `npx` — no clone, no build.** Drop this into your MCP client's config.
+> **Pick a channel.** `@oomkapwn/enquire-mcp` (no `@beta`) → stable **v1.11.1** with 28 tools — no `obsidian_search` umbrella, no ML embeddings. Add `@beta` for the v2.0 surface (30 tools, hybrid retrieval). Stable `@latest` will move to v2.0 once beta proves out.
 ```json
+// Stable v1.x — 28 tools, no hybrid search
 {
   "mcpServers": {
     "obsidian": {
@@ -126,6 +129,16 @@ That's the gap. enquire closes it in ~3500 lines of TypeScript with five mandato
     }
   }
 }
+// Beta v2.0 — adds obsidian_search (BM25 + TF-IDF + ML embeddings via RRF)
+{
+  "mcpServers": {
+    "obsidian": {
+      "command": "npx",
+      "args": ["-y", "@oomkapwn/enquire-mcp@beta", "serve", "--vault", "/Users/you/Documents/Obsidian Vault"]
+    }
+  }
+}
 ```
 **Where to drop that JSON, by client:**
@@ -171,7 +184,7 @@ Restart your client. The server logs `enquire <version> ready (read-only, vault=
 ## What you get
-### 24 read tools (always on) + 1 opt-in (`--persistent-index`) — **30 total** with 5 write tools
+### 21 read tools (always on) + 4 opt-in (`--persistent-index` adds 1 BM25 / `--diagnostic-search-tools` adds 3 single-ranker) — **30 total** with 5 write tools
 | Tool | What it does |
 |---|---|
@@ -468,4 +481,4 @@ Other ways to help:
 [MIT](./LICENSE). Built by Alex — [GitHub `@oomkapwn`](https://github.com/oomkapwn) · [X `@OomkaBear`](https://x.com/OomkaBear). Powered by [Model Context Protocol](https://modelcontextprotocol.io/), [`gray-matter`](https://github.com/jonschlinkert/gray-matter), [`commander`](https://github.com/tj/commander.js), and the patience of one specific Obsidian vault that didn't deserve to be parsed by hand.
-Named after [ENQUIRE](https://en.wikipedia.org/wiki/ENQUIRE) — the program Tim Berners-Lee wrote at CERN in 1980 to track «the complex web of relationships between people, programs, machines and ideas». ENQUIRE was the direct prototype of the World Wide Web. enquire-mcp brings the same idea to your AI: hyperlinked notes, structured access, no plugin required.
+Named after [ENQUIRE](https://en.wikipedia.org/wiki/ENQUIRE) — the 1980 hypertext prototype of the World Wide Web (see the inline note above for the longer story).

package/SECURITY.md CHANGED

@@ -132,7 +132,7 @@ Posture:
 - **Write-tool gating composes with `--enable-write`.** Disabling `obsidian_create_note` while leaving `obsidian_replace_in_notes` enabled is a valid configuration; the gate is independent of the global write flag.
 - **Posture is "fail closed".** Tools blocked at registration time never appear in `tools/list` and a `tools/call` against a gated name returns a clean MCP-protocol error from the SDK — there's no codepath where a disabled tool can still execute.
-## ML embeddings (v2.0 alpha): networked-download + cache posture
+## ML embeddings (v2.0): networked-download + cache posture
 The `obsidian_embeddings_search` tool plus the `install-model` and `build-embeddings` subcommands (added v2.0.0-alpha.0) introduce two new surfaces with networked / on-disk implications:

package/dist/index.d.ts CHANGED

@@ -14,6 +14,7 @@ interface ServeOptions {
     watch?: boolean;
     disabledTools?: string[];
     enabledTools?: string[];
+    diagnosticSearchTools?: boolean;
 }
 declare function main(): Promise<void>;
 declare function startServer(opts: ServeOptions): Promise<void>;

package/dist/index.d.ts.map CHANGED

@@ -1 +1 @@
-{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAsDA,UAAU,YAAY;IACpB,KAAK,EAAE,MAAM,CAAC;IACd,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,QAAQ,CAAC,EAAE,WAAW,GAAG,SAAS,CAAC;IACnC,WAAW,CAAC,EAAE,MAAM,EAAE,CAAC;IACvB,SAAS,CAAC,EAAE,MAAM,EAAE,CAAC;IACrB,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,aAAa,CAAC,EAAE,MAAM,EAAE,CAAC;IACzB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;CACzB;AAED,iBAAe,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,CAuMnC;AAED,iBAAe,WAAW,CAAC,IAAI,EAAE,YAAY,GAAG,OAAO,CAAC,IAAI,CAAC,CA2K5D;AA+qCD,iBAAS,gBAAgB,CAAC,GAAG,EAAE,MAAM,EAAE,IAAI,EAAE,MAAM,GAAG,MAAM,CAM3D;AAsCD,OAAO,EAAE,IAAI,EAAE,gBAAgB,EAAE,WAAW,EAAE,CAAC"}
+{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAsDA,UAAU,YAAY;IACpB,KAAK,EAAE,MAAM,CAAC;IACd,WAAW,CAAC,EAAE,OAAO,CAAC;IACtB,YAAY,CAAC,EAAE,MAAM,CAAC;IACtB,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,eAAe,CAAC,EAAE,OAAO,CAAC;IAC1B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,QAAQ,CAAC,EAAE,WAAW,GAAG,SAAS,CAAC;IACnC,WAAW,CAAC,EAAE,MAAM,EAAE,CAAC;IACvB,SAAS,CAAC,EAAE,MAAM,EAAE,CAAC;IACrB,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB,aAAa,CAAC,EAAE,MAAM,EAAE,CAAC;IACzB,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,qBAAqB,CAAC,EAAE,OAAO,CAAC;CACjC;AAED,iBAAe,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,CA2MnC;AAED,iBAAe,WAAW,CAAC,IAAI,EAAE,YAAY,GAAG,OAAO,CAAC,IAAI,CAAC,CA2K5D;AAgtCD,iBAAS,gBAAgB,CAAC,GAAG,EAAE,MAAM,EAAE,IAAI,EAAE,MAAM,GAAG,MAAM,CAM3D;AAsCD,OAAO,EAAE,IAAI,EAAE,gBAAgB,EAAE,WAAW,EAAE,CAAC"}

package/dist/index.js CHANGED

@@ -12,7 +12,7 @@ import { chunkContent, defaultIndexFile, FtsIndex } from "./fts5.js";
 import { appendToNote, archiveNote, createNote, dataviewQuery, embeddingsSearch, findPath, findSimilar, getBacklinks, getNoteNeighbors, getOpenQuestions, getOutboundLinks, getRecentEdits, getUnresolvedWikilinks, getVaultStats, lintWiki, listCanvases, listNotes, listTags, openInUi, paperAudit, readCanvas, readNote, renameNote, replaceInNotes, resolveWikilink, searchHybrid, searchText, semanticSearch, validateNoteProposal } from "./tools.js";
 import { Vault } from "./vault.js";
 import { VaultWatcher } from "./watcher.js";
-const VERSION = "2.0.0-beta.1";
+const VERSION = "2.0.0-beta.3";
 /** Default location for the persistent embedding index, alongside .fts5.db. */
 function embedDbPath(vaultRoot) {
     // Match the FTS5 location convention by stripping the .fts5.db extension
@@ -42,6 +42,7 @@ async function main() {
         .option("--watch", "Watch the vault for .md add/change/unlink events and incrementally invalidate the parsed-note cache (and refresh the FTS5 index when --persistent-index is also enabled). Off by default. Use this for long-running servers where you keep editing in Obsidian and want search to stay fresh without restarting.")
         .option("--disabled-tools <name...>", "Skip registration of specific tools by exact name. Useful when you want to expose a smaller surface to a particular agent (e.g. read-only research agent gets only obsidian_search_text + obsidian_read_note). Repeatable. Names are the same as in `tools/list` — `obsidian_*`. Example: `--disabled-tools obsidian_dataview_query obsidian_full_text_search`.")
         .option("--enabled-tools <name...>", "Strict allowlist — when set, ONLY listed tools register. Complement to --disabled-tools (denylist). If both are set: a tool must be in the allowlist AND not in the denylist. Repeatable. Example: `--enabled-tools obsidian_search_text obsidian_read_note obsidian_get_recent_edits`.")
+        .option("--diagnostic-search-tools", "Register the four single-ranker search tools (obsidian_search_text, obsidian_full_text_search, obsidian_semantic_search, obsidian_embeddings_search) IN ADDITION to the default obsidian_search hybrid tool. Off by default in v2.0+ — the umbrella obsidian_search auto-detects available signals and produces consistent recall. Enable when you need single-ranker output for diagnostics or A/B benchmarking.")
         .action(async (opts) => {
         await startServer(opts);
     });
@@ -242,14 +243,14 @@ async function startServer(opts) {
             return origRegisterTool(name, ...rest);
         };
     }
-    registerReadTools(server, vault, ftsIndex);
+    registerReadTools(server, vault, ftsIndex, opts.diagnosticSearchTools ?? false);
     if (vault.writeEnabled)
         registerWriteTools(server, vault);
-    if (ftsIndex)
-        registerFtsTools(server, ftsIndex);
+    if (ftsIndex && opts.diagnosticSearchTools)
+        registerFtsTools(server, ftsIndex, vault);
     registerResources(server, vault);
     if (ftsIndex)
-        registerChunkResource(server, ftsIndex);
+        registerChunkResource(server, ftsIndex, vault);
     registerPrompts(server);
     // v2.0.0-beta.1: warn on unknown names AFTER all tools are registered.
     // We can't validate at parse time because the canonical list depends on
@@ -428,7 +429,7 @@ async function syncFtsIndex(vault, idx) {
         total_chunks: idx.totalChunks()
     };
 }
-function registerFtsTools(server, idx) {
+function registerFtsTools(server, idx, vault) {
     const READ_ONLY = { readOnlyHint: true, idempotentHint: true, openWorldHint: false };
     server.registerTool("obsidian_full_text_search", {
         title: "Full-text search (BM25, FTS5 index)",
@@ -459,6 +460,20 @@ function registerFtsTools(server, idx) {
             else
                 throw new Error(`Invalid 'since' value (expected ISO date): ${args.since}`);
         }
+        // v2.0.0-beta.2 P0 fix: filter excluded paths from FTS5 hits before
+        // returning. The .fts5.db can contain entries from when the index was
+        // built without exclusion flags. Pre-fix, BM25 search leaked excluded
+        // chunks through `rel_path` and `snippet` (which contains the matched
+        // chunk text bracketed with «…»).
+        const userLimit = args.limit ?? 25;
+        const overFetch = userLimit * 2;
+        const rawMatches = idx.search(args.query, {
+            limit: overFetch,
+            folder: args.folder,
+            tag: args.tag,
+            sinceMtimeMs
+        });
+        const matches = rawMatches.filter((m) => !vault.isExcluded(m.rel_path)).slice(0, userLimit);
         return textResult({
             query: args.query,
             total_chunks: idx.totalChunks(),
@@ -468,16 +483,11 @@ function registerFtsTools(server, idx) {
                 tag: args.tag ?? null,
                 since: args.since ?? null
             },
-            matches: idx.search(args.query, {
-                limit: args.limit,
-                folder: args.folder,
-                tag: args.tag,
-                sinceMtimeMs
-            })
+            matches
         });
     });
 }
-function registerReadTools(server, vault, ftsIndex) {
+function registerReadTools(server, vault, ftsIndex, diagnosticSearchTools) {
     const READ_ONLY = { readOnlyHint: true, idempotentHint: true, openWorldHint: false };
     server.registerTool("obsidian_list_notes", {
         title: "List notes",
@@ -519,23 +529,29 @@ function registerReadTools(server, vault, ftsIndex) {
             include_content: z.boolean().optional().describe("Include resolved file's body (default true)")
         }
     }, async (args) => textResult(await resolveWikilink(vault, args)));
-    server.registerTool("obsidian_search_text", {
-        title: "Search text",
-        description: "Case-insensitive token search across all notes. Default mode `all` requires every whitespace-separated token to appear in a note (AND-tokenizer); `any` requires at least one (OR); `phrase` does the old contiguous-substring match. Returns a structured response with `query`, `mode`, `scanned_notes`, and ranked `matches` (each with snippet, line, score, matched_terms) — empty matches are explicit, not ambiguous with a broken call.",
-        annotations: { ...READ_ONLY, title: "Search text" },
-        inputSchema: {
-            query: z
-                .string()
-                .min(1)
-                .describe('Search string. With mode=all/any, whitespace tokenizes ("foo bar" → ["foo","bar"]).'),
-            folder: z.string().optional().describe("Restrict to a subfolder"),
-            limit: z.number().int().positive().max(200).optional().describe("Max results (default 25)"),
-            mode: z
-                .enum(["all", "any", "phrase"])
-                .optional()
-                .describe('"all" (default, AND), "any" (OR), or "phrase" (literal substring — pre-v0.9 behavior)')
-        }
-    }, async (args) => textResult(await searchText(vault, args)));
+    // v2.0.0-beta.3: obsidian_search_text is now a DIAGNOSTIC tool — gated
+    // behind --diagnostic-search-tools. Default search surface is the umbrella
+    // obsidian_search which auto-detects + fuses signals. Pre-fix, agents
+    // routinely picked the wrong single-ranker tool; consolidation reduces
+    // tool-list bloat and produces consistent recall.
+    if (diagnosticSearchTools)
+        server.registerTool("obsidian_search_text", {
+            title: "Search text",
+            description: "Case-insensitive token search across all notes. Default mode `all` requires every whitespace-separated token to appear in a note (AND-tokenizer); `any` requires at least one (OR); `phrase` does the old contiguous-substring match. Returns a structured response with `query`, `mode`, `scanned_notes`, and ranked `matches` (each with snippet, line, score, matched_terms) — empty matches are explicit, not ambiguous with a broken call.",
+            annotations: { ...READ_ONLY, title: "Search text" },
+            inputSchema: {
+                query: z
+                    .string()
+                    .min(1)
+                    .describe('Search string. With mode=all/any, whitespace tokenizes ("foo bar" → ["foo","bar"]).'),
+                folder: z.string().optional().describe("Restrict to a subfolder"),
+                limit: z.number().int().positive().max(200).optional().describe("Max results (default 25)"),
+                mode: z
+                    .enum(["all", "any", "phrase"])
+                    .optional()
+                    .describe('"all" (default, AND), "any" (OR), or "phrase" (literal substring — pre-v0.9 behavior)')
+            }
+        }, async (args) => textResult(await searchText(vault, args)));
     server.registerTool("obsidian_get_recent_edits", {
         title: "Get recent edits",
         description: "List notes ordered by most recent modification. Useful for picking up where work was left off.",
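The gating pattern in the hunk above can be sketched in isolation. This is a minimal illustration, not the package's code: `FakeServer`, `registerSearchTools`, the titles, and the handler bodies are hypothetical stand-ins. Only the tool names and the `diagnosticSearchTools` boolean come from the diff.

```javascript
// Hypothetical stand-in for an MCP server: it only records what was registered.
class FakeServer {
    constructor() {
        this.tools = new Map();
    }
    registerTool(name, config, handler) {
        this.tools.set(name, { config, handler });
    }
}

// Sketch of flag-gated registration. The umbrella tool is always registered;
// the single-ranker tools appear only when the diagnostic flag is set.
function registerSearchTools(server, { diagnosticSearchTools = false } = {}) {
    server.registerTool("obsidian_search", { title: "Search" }, async () => "hybrid");
    if (diagnosticSearchTools) {
        server.registerTool("obsidian_search_text", { title: "Search text" }, async () => "text");
        server.registerTool("obsidian_semantic_search", { title: "Semantic search" }, async () => "tfidf");
    }
    // Return the registered names in insertion order.
    return [...server.tools.keys()];
}

const defaultTools = registerSearchTools(new FakeServer());
const diagnosticTools = registerSearchTools(new FakeServer(), { diagnosticSearchTools: true });
console.log(defaultTools);    // → ["obsidian_search"]
console.log(diagnosticTools); // → ["obsidian_search", "obsidian_search_text", "obsidian_semantic_search"]
```

Gating at registration time, rather than rejecting calls at run time, keeps the hidden tools out of the tool list an agent sees, which matches the stated goal of reducing tool-list bloat.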
@@ -761,44 +777,48 @@ function registerReadTools(server, vault, ftsIndex) {
             path: z.string().describe("Vault-relative path of the .canvas file (with or without .canvas)")
         }
     }, async (args) => textResult(await readCanvas(vault, args)));
-    server.registerTool("obsidian_semantic_search", {
-        title: "Semantic search (TF-IDF cosine)",
-        description: "Pure-JS lexical-semantic retrieval. Tokenizes + TF-IDFs + L2-normalizes every note's body once per session, then ranks notes by cosine similarity to the query. Free / offline / no model download — closes the gap to Smart Connections without paywall, ML deps, or HTTP. Use this when `obsidian_search_text` (substring) and `obsidian_full_text_search` (BM25) miss synonyms or related-term matches. For best results pair with `--persistent-index` so BM25 + semantic both run cheap. Returns ranked hits with snippet + matched terms (highest-IDF first).",
-        annotations: { ...READ_ONLY, title: "Semantic search" },
-        inputSchema: {
-            query: z.string().min(1).describe("Free-form query — multi-word, natural language is fine"),
-            folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
-            limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
-            min_score: z
-                .number()
-                .min(0)
-                .max(1)
-                .optional()
-                .describe("Drop hits below this cosine score (default 0.05). Cosine ranges 0–1.")
-        }
-    }, async (args) => textResult(await semanticSearch(vault, args)));
+    // v2.0.0-beta.3: gated — see comment on obsidian_search_text above.
+    if (diagnosticSearchTools)
+        server.registerTool("obsidian_semantic_search", {
+            title: "Semantic search (TF-IDF cosine)",
+            description: "Pure-JS lexical-semantic retrieval. Tokenizes + TF-IDFs + L2-normalizes every note's body once per session, then ranks notes by cosine similarity to the query. Free / offline / no model download — closes the gap to Smart Connections without paywall, ML deps, or HTTP. Use this when `obsidian_search_text` (substring) and `obsidian_full_text_search` (BM25) miss synonyms or related-term matches. For best results pair with `--persistent-index` so BM25 + semantic both run cheap. Returns ranked hits with snippet + matched terms (highest-IDF first).",
+            annotations: { ...READ_ONLY, title: "Semantic search" },
+            inputSchema: {
+                query: z.string().min(1).describe("Free-form query — multi-word, natural language is fine"),
+                folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
+                limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
+                min_score: z
+                    .number()
+                    .min(0)
+                    .max(1)
+                    .optional()
+                    .describe("Drop hits below this cosine score (default 0.05). Cosine ranges 0–1.")
+            }
+        }, async (args) => textResult(await semanticSearch(vault, args)));
     // v2.0 alpha — ML-embeddings retrieval. Reads a persistent vector index
     // built by `enquire-mcp build-embeddings`. Returns clean error if the index
     // doesn't exist (rather than silently downloading a model).
-    server.registerTool("obsidian_embeddings_search", {
-        title: "Embeddings search (ML, paraphrase-multilingual)",
-        description: "ML-embedding retrieval via @huggingface/transformers + paraphrase-multilingual-MiniLM-L12-v2 (50+ languages, 384-dim, runs on CPU). Higher-quality than `obsidian_semantic_search` for paraphrases / synonyms / cross-language queries, but requires a one-time setup: (1) `enquire-mcp install-model multilingual` downloads the ONNX weights (~120MB) and (2) `enquire-mcp build-embeddings --vault <path>` writes the persistent vector index (~1ms/chunk on M1). Subsequent queries are sub-100ms top-10. If the index is missing, the tool returns a clean error with the exact command to run — it does NOT silently kick off a model download.",
-        annotations: { ...READ_ONLY, title: "Embeddings search" },
-        inputSchema: {
-            query: z.string().min(1).describe("Free-form query — multi-word, natural language, any supported language"),
-            folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
-            limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
-            min_score: z
-                .number()
-                .min(0)
-                .max(1)
-                .optional()
-                .describe("Drop hits below this cosine score (default 0.3). Cosine ranges -1 to 1; embeddings cluster ~0.4-0.9.")
-        }
-    }, async (args) => {
-        const embedFile = embedDbPath(vault.root);
-        return textResult(await embeddingsSearch(vault, args, embedFile));
-    });
+    // v2.0.0-beta.3: gated — see comment on obsidian_search_text above.
+    if (diagnosticSearchTools)
+        server.registerTool("obsidian_embeddings_search", {
+            title: "Embeddings search (ML, paraphrase-multilingual)",
+            description: "ML-embedding retrieval via @huggingface/transformers + paraphrase-multilingual-MiniLM-L12-v2 (50+ languages, 384-dim, runs on CPU). Higher-quality than `obsidian_semantic_search` for paraphrases / synonyms / cross-language queries, but requires a one-time setup: (1) `enquire-mcp install-model multilingual` downloads the ONNX weights (~120MB) and (2) `enquire-mcp build-embeddings --vault <path>` writes the persistent vector index (~1ms/chunk on M1). Subsequent queries are sub-100ms top-10. If the index is missing, the tool returns a clean error with the exact command to run — it does NOT silently kick off a model download.",
+            annotations: { ...READ_ONLY, title: "Embeddings search" },
+            inputSchema: {
+                query: z.string().min(1).describe("Free-form query — multi-word, natural language, any supported language"),
+                folder: z.string().optional().describe("Restrict to a subfolder (vault-relative)"),
+                limit: z.number().int().positive().max(100).optional().describe("Max hits (default 10)"),
+                min_score: z
+                    .number()
+                    .min(0)
+                    .max(1)
+                    .optional()
+                    .describe("Drop hits below this cosine score (default 0.3). Cosine ranges -1 to 1; embeddings cluster ~0.4-0.9.")
+            }
+        }, async (args) => {
+            const embedFile = embedDbPath(vault.root);
+            return textResult(await embeddingsSearch(vault, args, embedFile));
+        });
     // v2.0 beta — hybrid RRF over BM25 + TF-IDF + embeddings. Single umbrella
     // tool that auto-detects which signals are available and gracefully
     // degrades. Equal weights, k=60 (Cormack et al's recommendation). Note-
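The fusion step named in the comment above (reciprocal rank fusion with equal weights and k=60) can be sketched as follows. This is an illustration of the general technique under stated assumptions, not the package's implementation: `reciprocalRankFusion`, its option names, and the sample rankings are hypothetical.

```javascript
// Reciprocal Rank Fusion: score(d) = sum over rankers of weight / (k + rank),
// where rank starts at 1. k = 60 and equal weights follow the comment above.
function reciprocalRankFusion(rankings, { k = 60, weights } = {}) {
    const scores = new Map();
    rankings.forEach((ranking, i) => {
        const weight = weights ? weights[i] : 1;
        ranking.forEach((id, index) => {
            const rank = index + 1;
            scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
        });
    });
    // Highest fused score first.
    return [...scores.entries()]
        .map(([id, score]) => ({ id, score }))
        .sort((a, b) => b.score - a.score);
}

// Hypothetical ranked note paths from three rankers.
const bm25 = ["a.md", "b.md", "c.md"];
const tfidf = ["b.md", "a.md", "d.md"];
const embeddings = ["b.md", "c.md", "a.md"];

const fused = reciprocalRankFusion([bm25, tfidf, embeddings]);
console.log(fused.map((hit) => hit.id)); // → ["b.md", "a.md", "c.md", "d.md"]

// Degradation: if only one signal is available, pass only that ranking.
const bm25Only = reciprocalRankFusion([bm25]);
console.log(bm25Only.map((hit) => hit.id)); // → ["a.md", "b.md", "c.md"]
```

Because the formula uses ranks rather than raw scores, the rankers do not need comparable score scales, and an unavailable signal can be left out of the input without any other change.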
@@ -913,7 +933,7 @@ function registerWriteTools(server, vault) {
         }
     }, async (args) => textResult(await archiveNote(vault, args)));
 }
-function registerChunkResource(server, idx) {
+function registerChunkResource(server, idx, vault) {
     // Chunk-level addressing — closes the v0.10 roadmap item from issue #10
     // suggestion 1. URI shape: obsidian://chunk/{chunkIndex}/{+notePath}.
     // Index FIRST so the {+notePath} can greedily eat slash-bearing paths.
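To illustrate why the index comes first in the URI shape described above, here is a small parsing sketch. It is an assumption-laden illustration: `parseChunkUri` is a hypothetical helper written with a plain regular expression, and it does not reproduce the package's resource-template handling or its `decodeNotePath` function.

```javascript
// Hypothetical parser for obsidian://chunk/{chunkIndex}/{+notePath}.
// With the numeric index first, everything after the second slash belongs to
// the note path, so slashes inside the path need no special handling.
function parseChunkUri(uri) {
    const match = /^obsidian:\/\/chunk\/(\d+)\/(.+)$/.exec(uri);
    if (!match) return null;
    try {
        const chunkIndex = Number.parseInt(match[1], 10);
        // Decode each segment separately and rejoin with "/".
        const notePath = match[2].split("/").map(decodeURIComponent).join("/");
        return { chunkIndex, notePath };
    } catch {
        // decodeURIComponent throws URIError on malformed percent-encoding.
        return null;
    }
}

const parsed = parseChunkUri("obsidian://chunk/3/Projects/2026/plan%20v2.md");
console.log(parsed); // → { chunkIndex: 3, notePath: "Projects/2026/plan v2.md" }
```

If the path came first instead, a greedy match would have to decide whether a trailing numeric segment is the chunk index or part of a note path such as `Projects/2026`, which the index-first ordering avoids.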
@@ -937,6 +957,15 @@ function registerChunkResource(server, idx) {
         }
         const notePathRaw = Array.isArray(params.notePath) ? params.notePath.join("/") : params.notePath;
         const decoded = decodeNotePath(notePathRaw);
+        // v2.0.0-beta.2 P0 fix: enforce --read-paths / --exclude-glob on the
+        // chunk resource. The .fts5.db can contain entries from before the user
+        // added a privacy filter, so a stale URI returned earlier in the
+        // session would otherwise serve excluded content. We refuse with the
+        // same "not found" framing the FTS5 search uses post-filter, so the
+        // attacker can't distinguish "doesn't exist" from "exists but excluded".
+        if (vault.isExcluded(decoded)) {
+            throw new Error(`Chunk not found: ${decoded}#${chunkIndex}`);
+        }
         const chunk = idx.getChunk(decoded, chunkIndex);
         if (!chunk)
             throw new Error(`Chunk not found: ${decoded}#${chunkIndex}`);
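The privacy check in the hunk above can be sketched with stand-in objects. This is a minimal illustration, assuming a vault that exposes `isExcluded(path)` and an index that exposes `getChunk(path, index)` as in the diff. `makeChunkReader`, the sample vault, and the sample index are hypothetical.

```javascript
// Sketch: excluded paths and missing chunks fail with the same error shape,
// so a caller cannot tell "excluded" apart from "does not exist".
function makeChunkReader(vault, idx) {
    const notFound = (notePath, chunkIndex) =>
        new Error(`Chunk not found: ${notePath}#${chunkIndex}`);
    return function readChunk(notePath, chunkIndex) {
        // Check the filter BEFORE touching the index: the index may be stale
        // and still hold chunks from notes that were excluded later.
        if (vault.isExcluded(notePath)) throw notFound(notePath, chunkIndex);
        const chunk = idx.getChunk(notePath, chunkIndex);
        if (!chunk) throw notFound(notePath, chunkIndex);
        return chunk;
    };
}

// Hypothetical vault filter and stale index contents.
const vault = { isExcluded: (p) => p.startsWith("private/") };
const idx = {
    chunks: new Map([
        ["private/diary.md#0", "secret"],
        ["notes/todo.md#0", "buy milk"]
    ]),
    getChunk(p, i) {
        return this.chunks.get(`${p}#${i}`) ?? null;
    }
};
const readChunk = makeChunkReader(vault, idx);

// Helper that captures the error message of a failing call.
function messageOf(fn) {
    try {
        fn();
        return null;
    } catch (err) {
        return err.message;
    }
}

const excludedMsg = messageOf(() => readChunk("private/diary.md", 0));
const missingMsg = messageOf(() => readChunk("notes/missing.md", 0));
console.log(excludedMsg); // → "Chunk not found: private/diary.md#0"
console.log(missingMsg);  // → "Chunk not found: notes/missing.md#0"
console.log(readChunk("notes/todo.md", 0)); // → "buy milk"
```

Running the filter before the index lookup is the key ordering: the excluded chunk is still present in the sample index, yet it is never read.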