@oomkapwn/enquire-mcp 2.7.0 → 2.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,103 @@
 
  All notable changes to this project will be documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [2.9.0] — 2026-05-08
+
+ **Sprint 9 — BGE cross-encoder reranking on top of RRF.** Cross-encoder reranking is the SOTA technique in IR for boosting retrieval quality over bi-encoder candidates: after RRF fusion, the top-N hits are re-scored by a model that sees query+document interaction directly (instead of comparing pre-computed embeddings). Typical wins: +5-10 NDCG@10 on real-world retrieval. **No other Obsidian-MCP currently does cross-encoder reranking** — this extends our retrieval quality leadership claim.
+
+ ### Added — `--enable-reranker` CLI flag
+
+ Off by default — opt-in because the cross-encoder model is downloaded from HuggingFace on first call (~25-110 MB depending on alias) and adds ~30-50ms per query at top-50 candidates on M1 CPU. When enabled:
+
+ - `enquire-mcp serve --vault <path> --persistent-index --enable-reranker` → boots; reranker model lazy-loads on first search call.
+ - After RRF fusion + graph-boost, top-N candidates (default 50; tunable via `--reranker-top-n <n>`) are re-scored by a cross-encoder, then re-sorted before the response is truncated to `limit`.
+ - Each reranked hit carries a `reranker_score` field in `[0, 1]` (sigmoid of the model's relevance logit) so agents see the cross-encoder's relevance estimate alongside RRF observability.
+
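The re-scoring step described in the bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `FusedHit`, `Scorer`, and `rerankTopN` are hypothetical names, and the scorer is synchronous here for brevity, whereas a real cross-encoder call would be asynchronous. The fallback branch mirrors the changelog's statement that the fused order is preserved when reranking fails.

```typescript
// Hypothetical shapes for illustration only; not the package's real types.
interface FusedHit {
  path: string;
  rrfScore: number;
  passage: string;
  reranker_score?: number;
}

// Stand-in for a cross-encoder: one raw relevance logit per passage.
// (Assumption: synchronous for brevity; a real model call would be async.)
type Scorer = (query: string, passages: string[]) => number[];

// Squash a raw logit into [0, 1].
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

function rerankTopN(
  query: string,
  fused: FusedHit[],
  score: Scorer,
  topN = 50,
  limit = 10,
): { hits: FusedHit[]; error?: string } {
  const head = fused.slice(0, topN); // candidates that get re-scored
  const tail = fused.slice(topN); // everything else keeps its fused order
  try {
    const logits = score(query, head.map((h) => h.passage));
    if (logits.length !== head.length) {
      throw new Error("scorer returned an unexpected number of scores");
    }
    const rescored: FusedHit[] = head.map((h, i) => ({
      ...h,
      reranker_score: sigmoid(logits[i] ?? 0),
    }));
    // Highest cross-encoder score first; Array.prototype.sort is stable.
    rescored.sort((a, b) => (b.reranker_score ?? 0) - (a.reranker_score ?? 0));
    // Truncate to `limit` only after re-sorting.
    return { hits: [...rescored, ...tail].slice(0, limit) };
  } catch (err) {
    // On failure, fall back to the fused order and report the error.
    const message = err instanceof Error ? err.message : String(err);
    return { hits: fused.slice(0, limit), error: message };
  }
}
```

Only the first `topN` candidates receive a `reranker_score`; anything past that cutoff keeps its fused position behind the re-scored head.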
+ ### Added — `RERANKER_MODELS` catalog
+
+ Two models ship out of the box, both via the existing `@huggingface/transformers` `optionalDependency`:
+
+ - **`rerank-multilingual`** (default) — `Xenova/mxbai-rerank-xsmall-v1`, ~25 MB, multilingual. Best balance of speed × quality × language coverage.
+ - **`rerank-bge`** — `Xenova/bge-reranker-base`, ~110 MB, English-only. Higher peak quality on English content; recommended only when you don't need multilingual support.
+
+ Choose via `--reranker-model <alias>`. Same lazy-load pattern as embedding models — first call downloads weights into `~/.cache/huggingface/transformers.js/`; subsequent queries hit the warm cache.
+
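Alias resolution for a catalog like the one above might look like the sketch below. The field names (`modelId`, `multilingual`) and the function name `resolveReranker` are assumptions made for illustration; only the aliases, model IDs, approximate sizes, and the `approxSizeMB` name are taken from the changelog text. The default-on-undefined and throw-on-unknown behaviours follow the test descriptions in this release entry.

```typescript
// Illustrative catalog; shape and names are assumptions, values are as
// stated in the changelog entry.
interface RerankerModelInfo {
  modelId: string;
  approxSizeMB: number;
  multilingual: boolean;
}

const DEFAULT_ALIAS = "rerank-multilingual";

const CATALOG = new Map<string, RerankerModelInfo>([
  [
    "rerank-multilingual",
    { modelId: "Xenova/mxbai-rerank-xsmall-v1", approxSizeMB: 25, multilingual: true },
  ],
  [
    "rerank-bge",
    { modelId: "Xenova/bge-reranker-base", approxSizeMB: 110, multilingual: false },
  ],
]);

function resolveReranker(alias?: string): RerankerModelInfo {
  // An undefined alias falls back to the default entry.
  const key = alias ?? DEFAULT_ALIAS;
  const info = CATALOG.get(key);
  if (info === undefined) {
    // Unknown aliases fail loudly and list the valid choices.
    const known = Array.from(CATALOG.keys()).join(", ");
    throw new Error(`Unknown reranker alias "${key}". Known aliases: ${known}`);
  }
  return info;
}
```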
+ ### Wiring
+
+ - `searchHybrid(vault, args, ctx)` accepts an optional `ctx.reranker?: { alias?, topN? }`. When set, the reranker runs after RRF + graph-boost; failures surface via `signal_errors.reranker` (matching the existing per-signal failure-reporting pattern from v2.0.0-beta.2). The fused order is preserved if reranking fails, so a model load problem doesn't break search.
+ - A `ctx.rerankerOverride` injection point lets unit tests validate the rerank-and-resort plumbing without pulling in the real ML model.
+ - Reranker passages are derived from each candidate's best snippet (BM25 > embeddings > TF-IDF preference), with FTS5 highlight markers stripped and length capped at 600 chars to fit safely under the 512-token model budget.
+
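The passage-preparation rule in the last Wiring bullet (snippet preference, marker stripping, 600-character cap) could be sketched as below. The changelog does not say which highlight markers the package passes to FTS5, so the `<mark>` / `</mark>` pair is purely an assumption, as are the names `CandidateSnippets` and `toRerankerPassage`.

```typescript
// Hypothetical per-signal snippets for one candidate.
interface CandidateSnippets {
  bm25?: string;
  embeddings?: string;
  tfidf?: string;
}

function toRerankerPassage(
  snippets: CandidateSnippets,
  // Assumption: the real open/close highlight markers are not documented.
  markers: readonly [string, string] = ["<mark>", "</mark>"],
  maxChars = 600,
): string {
  // Preference order from the changelog: BM25 > embeddings > TF-IDF.
  const best = snippets.bm25 ?? snippets.embeddings ?? snippets.tfidf ?? "";
  const [open, close] = markers;
  // Strip every occurrence of both markers without using a regex.
  const plain = best.split(open).join("").split(close).join("");
  // A character cap is a cheap proxy for staying under the token budget.
  return plain.slice(0, maxChars);
}
```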
+ ### Tests
+
+ 502 unit tests pass (was 493 in v2.8.0, +9 new):
+ - **RERANKER_MODELS catalog (+5):** rerank-multilingual is the multilingual default, rerank-bge is English-only, defaults to rerank-multilingual on undefined alias, throws on unknown alias with helpful list, every entry has sensible approxSizeMB.
+ - **searchHybrid + reranker plumbing (+4):** reranker invoked when override is set, top-N re-orders by reranker score (high.md beats mid.md beats low.md by synthetic scores), errors surface via `signal_errors.reranker` with original RRF order preserved, `topN` caps how many candidates carry `reranker_score`.
+
+ ### Surface delta vs v2.8.0
+
+ - **No new tools.** Reranking is a property of `obsidian_search`, not a new tool surface.
+ - **+3 CLI flags** (`--enable-reranker`, `--reranker-model <alias>`, `--reranker-top-n <n>`) on `serve` (and via the same options shape, on `serve-http`).
+
+ ### Migration
+
+ **No-op for default users.** Reranking is opt-in via `--enable-reranker`. Existing users keep working unchanged. Once you opt in, the first search call downloads the reranker model (~25 MB for default `rerank-multilingual`); subsequent queries reuse the cached weights.
+
+ ### Strategic position
+
+ Combined with v2.0-v2.8's hybrid RRF + wikilink graph-boost + breadcrumb chunking + multilingual embeddings + remote MCP transport + PDF retrieval, **enquire-mcp is now the only Obsidian-MCP that runs cross-encoder reranking on top of hybrid retrieval over markdown + PDFs**. Smart Connections (paid) doesn't rerank. Khoj doesn't either. The retrieval-quality moat widens.
+
+ ## [2.8.0] — 2026-05-08
+
+ **Sprint 8 — PDF retrieval integration.** v2.7.0 added PDF text-extraction tools (`obsidian_list_pdfs` / `obsidian_read_pdf`); v2.8.0 makes PDFs **first-class citizens of `obsidian_search`**. Index PDFs into the same FTS5 + embedding stores as markdown, blend them in hybrid retrieval (BM25 + TF-IDF + embeddings → RRF fusion), and surface a `kind: "md" | "pdf"` flag on every hit so agents can distinguish content sources at a glance.
+
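For readers unfamiliar with the RRF fusion mentioned above, here is a minimal sketch of Reciprocal Rank Fusion over several ranked lists of chunk IDs. The function name `rrfFuse` and the example IDs are hypothetical, and `k = 60` is the commonly cited constant rather than a value confirmed by this changelog.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankers of 1 / (k + rank(d)),
// using 1-based ranks. A document missing from a ranker contributes nothing
// for that ranker. k = 60 is an assumption (the usual default).
function rrfFuse(
  rankings: ReadonlyArray<ReadonlyArray<string>>,
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Because scores are accumulated per ID, the same chunk must carry the same identifier in every ranker's output, whether it came from a markdown note or a PDF.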
+ ### Added — `--include-pdfs` flag on `serve`, `index`, `build-embeddings`
+
+ Off by default — opt-in because PDF extraction is ~10-30× slower per file than markdown chunking. When enabled:
+
+ - `enquire-mcp serve --vault <path> --persistent-index --include-pdfs` → boots and incrementally syncs PDFs into the FTS5 index alongside markdown.
+ - `enquire-mcp index --vault <path> --include-pdfs` → cold-build / refresh the FTS5 index for both markdown and PDFs.
+ - `enquire-mcp build-embeddings --vault <path> --include-pdfs` → embed PDF chunks too.
+
+ Bad PDFs (encrypted without password / corrupt / image-only / scanned) are caught per-file and surfaced via stderr without taking down the markdown index path. Image-only / scanned PDFs are skipped with a clear log line — OCR is tracked for v2.9+ (Tesseract.js).
+
+ ### Schema migration — FTS5 v3 → v4, embed-db v1 → v2
+
+ Both indexes added a `kind` column (`'md' | 'pdf'`, default `'md'`). Schema bump auto-rebuilds the index on first open after upgrade — same pattern as the `tokenize_mode` / `vault_root` cross-config-change guards. Existing markdown indexes are preserved (they re-sync from the markdown source as kind=md).
+
+ ### `obsidian_search` returns `kind` on every hit
+
+ Both `note` and `block` granularity propagate the kind flag. PDF hits use the filename without the `.pdf` extension as the title (so titles read naturally in agent output). The tool description was updated to flag the v2.8.0 capability so MCP clients introspecting `tools/list` see it immediately.
+
+ ### Page-citation markers in PDF chunks
+
+ When indexing PDFs, page boundaries are preserved as `[page: N]\n` markers in the joined text before chunking. The chunker may split a page across chunks or merge short pages, but the markers travel with the text — so search snippets carry page citations the agent can extract. Same `chunkContent` pipeline as markdown, so chunk identity matches across BM25 / TF-IDF / embeddings (RRF requires stable IDs).
+
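The page-marker scheme above can be illustrated with two small helpers: one that joins per-page text with `[page: N]` markers, and one that an agent-side consumer might use to pull page numbers back out of a snippet. Both function names are hypothetical, and the exact join layout (marker before each page, pages separated by a newline) is an assumption; only the marker format itself comes from the changelog.

```typescript
// Join per-page text, prefixing each page with a "[page: N]\n" marker.
// Assumption: pages are 1-indexed and separated by a single newline.
function joinPagesWithMarkers(pages: ReadonlyArray<string>): string {
  return pages.map((text, i) => `[page: ${i + 1}]\n${text}`).join("\n");
}

// Collect the distinct page numbers cited in a snippet, in order of
// first appearance.
function extractPageCitations(snippet: string): number[] {
  const pages: number[] = [];
  const re = /\[page: (\d+)\]/g;
  let match: RegExpExecArray | null = re.exec(snippet);
  while (match !== null) {
    const page = Number(match[1]);
    if (pages.indexOf(page) === -1) pages.push(page);
    match = re.exec(snippet);
  }
  return pages;
}
```

A chunk that starts mid-page carries no marker for that page, so a consumer relying on this extraction would only see the pages whose boundaries fall inside the snippet.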
+ ### Independent sync paths via kind-aware diff()
+
+ `FtsIndex.diff(live, kind?)` and `EmbedDb.getSourceStates(kind?)` now accept an optional kind filter. Lets the markdown-sync path run independently from the PDF-sync path against the same DB without one's "missing files" being mistakenly deleted by the other. Backward-compat: omitting the kind arg returns all rows (legacy behavior).
+
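To see why the kind filter matters, consider a simplified sync planner. This is a sketch with hypothetical names (`StoredRow`, `LiveFile`, `planSync`), not the signature of `FtsIndex.diff` or `EmbedDb.getSourceStates`; it only models the idea that a markdown-only pass should not treat PDF rows as missing files.

```typescript
type Kind = "md" | "pdf";

// Hypothetical row and file shapes for illustration.
interface StoredRow {
  relPath: string;
  mtimeMs: number;
  kind: Kind;
}
interface LiveFile {
  relPath: string;
  mtimeMs: number;
}
interface SyncPlan {
  toIndex: string[];
  toDelete: string[];
}

function planSync(
  stored: ReadonlyArray<StoredRow>,
  live: ReadonlyArray<LiveFile>,
  kind?: Kind,
): SyncPlan {
  // With a kind filter, rows of the other kind are invisible to this pass.
  const scoped = kind === undefined ? stored : stored.filter((r) => r.kind === kind);
  const storedMtime = new Map<string, number>();
  scoped.forEach((r) => storedMtime.set(r.relPath, r.mtimeMs));
  const livePaths = new Set<string>(live.map((f) => f.relPath));
  // New or modified files need (re)indexing.
  const toIndex = live
    .filter((f) => storedMtime.get(f.relPath) !== f.mtimeMs)
    .map((f) => f.relPath);
  // Stored rows with no live file are candidates for deletion.
  const toDelete = scoped.filter((r) => !livePaths.has(r.relPath)).map((r) => r.relPath);
  return { toIndex, toDelete };
}
```

Running the markdown pass without a filter against a store that also holds PDF rows would flag every PDF for deletion, which is the failure mode the optional `kind` argument is described as preventing.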
+ ### Tests
+
+ 493 unit tests pass (was 481 in v2.7.0, +12 new):
+ - **FTS5 PDF (+6):** indexes PDF chunks with kind='pdf' alongside markdown, page markers travel through chunks for snippets, kind-scoped diff() doesn't see other-kind rows, kind-undefined diff() shows both, reindexPdfFile is atomically idempotent, schema bump v3→v4 auto-rebuilds.
+ - **Embed-db PDF (+3):** upserts with kind='pdf' and search returns kind='pdf', getSourceStates(kind=…) doesn't overlap, schema bump v1→v2 idempotent on matching schema.
+ - **searchHybrid kind (+3):** blended hits with both kind='md' and kind='pdf', PDF hits use .pdf-stripped titles, kind defaults to 'md' on TF-IDF-only matches.
+
+ ### Surface delta vs v2.7.0
+
+ - **No new tools.** The 38 from v2.7.0 stay. PDF retrieval is a property of `obsidian_search` (and the diagnostic single-ranker tools), not a new tool surface.
+ - **+1 CLI flag** (`--include-pdfs`) wired on three subcommands (`serve`, `index`, `build-embeddings`).
+ - **Schema bumps** auto-rebuild legacy indexes on first open.
+
+ ### Migration
+
+ **No-op for default users.** PDF indexing is opt-in via `--include-pdfs`. Existing `serve` / `serve-http` / `index` / `build-embeddings` users keep working unchanged. Once you opt in, the FTS5 + embed-db files auto-rebuild on first open (same one-time cost as `tokenize_mode` change in earlier versions).
+
+ ### Strategic position
+
+ v2.7.0 added the foundation (PDF extraction tools); v2.8.0 makes them retrievable. Combined with v2.0-v2.6's hybrid RRF + wikilink graph-boost + breadcrumb chunking + multilingual embeddings + remote MCP transport, **enquire-mcp is the only Obsidian-MCP that searches markdown and PDFs in a unified hybrid retrieval surface**. Smart Connections (paid) doesn't index PDFs. Khoj indexes PDFs but doesn't run on Obsidian's substrate (separate app, separate vault). The intersection is uniquely ours.
+
  ## [2.7.0] — 2026-05-08
 
  **Sprint 7 — PDF as a first-class indexable content type.** PDFs are the #1 non-markdown content kind in real research vaults (papers, scanned notes, downloaded references). **No other Obsidian-MCP currently indexes them.** v2.7.0 adds two new read tools that work identically over stdio + `serve-http`, gated behind `pdfjs-dist` as an `optionalDependency` so the markdown-only path stays zero-cost.
package/README.md CHANGED
@@ -38,13 +38,15 @@ That's it. Your AI now has structured access to wikilinks, backlinks, frontmatte
  | TF-IDF semantic search | ❌ | ❌ | ✅ |
  | **ML embeddings (multilingual)** | ❌ | ✅ paid | ✅ **free** |
  | **Hybrid (BM25+TF-IDF+embeddings, RRF)** | ❌ | ❌ | ✅ **only here** |
+ | **PDFs blended into hybrid search** | ❌ | ❌ | ✅ **only here** (v2.8.0) |
+ | **Cross-encoder reranking on top of RRF** | ❌ | ❌ | ✅ **only here** (v2.9.0) |
  | Per-signal observability on each hit | ❌ | ❌ | ✅ |
  | Privacy filter (`--exclude-glob` / `--read-paths`) | ❌ | n/a | ✅ verified at search + write paths |
  | Standalone (no Obsidian plugin) | varies | ❌ requires Obsidian | ✅ direct vault read |
  | MCP-native (any agent) | varies | ❌ Obsidian-only | ✅ stdio JSON-RPC |
  | **Remote MCP (HTTP transport, bearer auth)** | ❌ | ❌ | ✅ **only here** (v2.6.0) |
  | SLSA-3 provenance | ❌ | n/a | ✅ |
- | Test suite | rare | n/a | ✅ 481 unit tests |
+ | Test suite | rare | n/a | ✅ 502 unit tests |
 
  ---
 
@@ -131,7 +133,7 @@ No other Obsidian-MCP currently ships a remote-HTTP transport. Same vault, same
 
  | Tool | What it does |
  |---|---|
- | `obsidian_search` | **Hybrid retrieval** — fuses BM25 + TF-IDF + ML embeddings via RRF. The default search tool. Auto-detects available signals. v2.2.0: `granularity: "block"` arg returns chunks instead of notes. |
+ | `obsidian_search` | **Hybrid retrieval** — fuses BM25 + TF-IDF + ML embeddings via RRF. The default search tool. Auto-detects available signals. v2.2.0: `granularity: "block"` arg returns chunks instead of notes. **v2.8.0:** with `--include-pdfs`, PDF chunks blend in alongside markdown — every hit carries `kind: "md" \| "pdf"` and PDF snippets include `[page: N]` markers for citation. **v2.9.0:** with `--enable-reranker`, top-N RRF candidates are re-scored by a BGE cross-encoder for +5-10 NDCG@10 typical retrieval quality. |
  | `obsidian_context_pack` | **v2.2.0.** Token-budgeted context bundling: takes a question, runs hybrid search, gathers note bodies + backlinks + optionally recent dailies, returns one ready-to-paste markdown bundle. Saves ~5 tool calls. |
  | `obsidian_chat_thread_read` | **v2.2.0.** Parse a note's `## Chat: <title>` block into structured messages (role/timestamp/content/line-range). Pair with `_append` (write) for note-tethered AI conversations. |
  | `obsidian_frontmatter_get` | **v2.3.0.** Read parsed YAML frontmatter for a note. With `key`, returns just that field. |
@@ -230,7 +232,7 @@ Full posture: [SECURITY.md](./SECURITY.md). Report vulnerabilities to `oomkapwn@
  |---|---|
  | Language | TypeScript strict + `noUncheckedIndexedAccess` |
  | Lint | Biome 2 (zero-warning policy) |
- | Tests | 481 unit tests across 23 files |
+ | Tests | 502 unit tests across 24 files |
  | CI | ubuntu × {Node 20, 22, 24} required + macOS advisory job |
  | Coverage | Lines ≥86%, statements ≥82%, functions ≥75%, branches ≥73% (gated) |
  | Audit | `npm audit --audit-level=moderate` for prod; high for dev |
@@ -1,3 +1,5 @@
+ /** Content-source kind. Mirrors ChunkKind in src/fts5.ts. */
+ export type EmbedChunkKind = "md" | "pdf";
  export interface EmbedSearchHit {
  rel_path: string;
  chunk_index: number;
@@ -7,6 +9,8 @@ export interface EmbedSearchHit {
  text_preview: string;
  /** Cosine similarity (since vectors are L2-normalized at insert time). */
  score: number;
+ /** v2.8.0 — content-source kind. Defaults to "md" for backward compat. */
+ kind: EmbedChunkKind;
  }
  export interface EmbedSyncReport {
  added: number;
@@ -44,19 +48,27 @@ export declare class EmbedDb {
  private readMeta;
  private writeMeta;
  private requireDb;
- /** Replace all embeddings for a single note. Caller computes vectors. */
+ /**
+ * Replace all embeddings for a single note. Caller computes vectors.
+ * v2.8.0: optional `kind` parameter ("md" | "pdf"); defaults to "md" so
+ * existing callers (markdown indexing path) need no changes.
+ */
  upsertNote(relPath: string, mtimeMs: number, chunks: ReadonlyArray<{
  chunkIndex: number;
  lineStart: number;
  lineEnd: number;
  textPreview: string;
  vector: Float32Array;
- }>): void;
+ }>, kind?: EmbedChunkKind): void;
  /** Drop a note's embeddings entirely (used on file deletion). */
  deleteNote(relPath: string): void;
- /** Read the source-state table — caller compares mtimes to decide what to
- * re-embed. */
- getSourceStates(): SourceStateRow[];
+ /**
+ * Read the source-state table — caller compares mtimes to decide what to
+ * re-embed. v2.8.0: optional `kind` filter — when set, only rows of that
+ * kind are returned. Lets the markdown-sync and PDF-sync paths run
+ * independently without one's "missing files" being deleted by the other.
+ */
+ getSourceStates(kind?: EmbedChunkKind): SourceStateRow[];
  /** Brute-force cosine top-K. Vectors are L2-normalized at insert time so
  * cosine == dot product. Acceptable up to ~50K chunks; v2.1 will swap to
  * HNSW if real vaults hit that ceiling. */
@@ -1 +1 @@
- {"version":3,"file":"embed-db.d.ts","sourceRoot":"","sources":["../src/embed-db.ts"],"names":[],"mappings":"AAmBA,MAAM,WAAW,cAAc;IAC7B,QAAQ,EAAE,MAAM,CAAC;IACjB,WAAW,EAAE,MAAM,CAAC;IACpB,UAAU,EAAE,MAAM,CAAC;IACnB,QAAQ,EAAE,MAAM,CAAC;IACjB,mDAAmD;IACnD,YAAY,EAAE,MAAM,CAAC;IACrB,0EAA0E;IAC1E,KAAK,EAAE,MAAM,CAAC;CACf;AAED,MAAM,WAAW,eAAe;IAC9B,KAAK,EAAE,MAAM,CAAC;IACd,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,SAAS,EAAE,MAAM,CAAC;IAClB,YAAY,EAAE,MAAM,CAAC;CACtB;AAED,UAAU,cAAc;IACtB,QAAQ,EAAE,MAAM,CAAC;IACjB,QAAQ,EAAE,MAAM,CAAC;CAClB;AA+CD,MAAM,WAAW,cAAc;IAC7B,2CAA2C;IAC3C,IAAI,EAAE,MAAM,CAAC;IACb,sDAAsD;IACtD,SAAS,EAAE,MAAM,CAAC;IAClB,wEAAwE;IACxE,UAAU,EAAE,MAAM,CAAC;IACnB,oDAAoD;IACpD,GAAG,EAAE,MAAM,CAAC;CACb;AAED,qBAAa,OAAO;IAClB,OAAO,CAAC,EAAE,CAAmB;IAC7B,OAAO,CAAC,QAAQ,CAAC,IAAI,CAAS;IAC9B,OAAO,CAAC,QAAQ,CAAC,SAAS,CAAS;IACnC,OAAO,CAAC,QAAQ,CAAC,UAAU,CAAS;IACpC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAS;gBAEjB,IAAI,EAAE,cAAc;IAO1B,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC;IAc3B,0DAA0D;IACpD,WAAW,IAAI,OAAO,CAAC,OAAO,CAAC;IAcrC,KAAK,IAAI,IAAI;IAOb,OAAO,CAAC,eAAe;IAqDvB,OAAO,CAAC,QAAQ;IAQhB,OAAO,CAAC,SAAS;IAMjB,OAAO,CAAC,SAAS;IAKjB,yEAAyE;IACzE,UAAU,CACR,OAAO,EAAE,MAAM,EACf,OAAO,EAAE,MAAM,EACf,MAAM,EAAE,aAAa,CAAC;QACpB,UAAU,EAAE,MAAM,CAAC;QACnB,SAAS,EAAE,MAAM,CAAC;QAClB,OAAO,EAAE,MAAM,CAAC;QAChB,WAAW,EAAE,MAAM,CAAC;QACpB,MAAM,EAAE,YAAY,CAAC;KACtB,CAAC,GACD,IAAI;IAiCP,iEAAiE;IACjE,UAAU,CAAC,OAAO,EAAE,MAAM,GAAG,IAAI;IAMjC;oBACgB;IAChB,eAAe,IAAI,cAAc,EAAE;IAKnC;;gDAE4C;IAC5C,MAAM,CAAC,QAAQ,EAAE,YAAY,EAAE,CAAC,EAAE,MAAM,EAAE,IAAI,GAAE;QAAE,MAAM,CAAC,EAAE,MAAM,CAAC;QAAC,QAAQ,CAAC,EAAE,MAAM,CAAA;KAAO,GAAG,cAAc,EAAE;IA4D9G,kDAAkD;IAClD,WAAW,IAAI,MAAM;CAKtB;AAED,8EAA8E;AAC9E,wBAAgB,kBAAkB,CAAC,eAAe,EAAE,MAAM,GAAG,MAAM,CAIlE"}
+ {"version":3,"file":"embed-db.d.ts","sourceRoot":"","sources":["../src/embed-db.ts"],"names":[],"mappings":"AAsBA,6DAA6D;AAC7D,MAAM,MAAM,cAAc,GAAG,IAAI,GAAG,KAAK,CAAC;AAE1C,MAAM,WAAW,cAAc;IAC7B,QAAQ,EAAE,MAAM,CAAC;IACjB,WAAW,EAAE,MAAM,CAAC;IACpB,UAAU,EAAE,MAAM,CAAC;IACnB,QAAQ,EAAE,MAAM,CAAC;IACjB,mDAAmD;IACnD,YAAY,EAAE,MAAM,CAAC;IACrB,0EAA0E;IAC1E,KAAK,EAAE,MAAM,CAAC;IACd,0EAA0E;IAC1E,IAAI,EAAE,cAAc,CAAC;CACtB;AAED,MAAM,WAAW,eAAe;IAC9B,KAAK,EAAE,MAAM,CAAC;IACd,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,SAAS,EAAE,MAAM,CAAC;IAClB,YAAY,EAAE,MAAM,CAAC;CACtB;AAED,UAAU,cAAc;IACtB,QAAQ,EAAE,MAAM,CAAC;IACjB,QAAQ,EAAE,MAAM,CAAC;CAClB;AA+CD,MAAM,WAAW,cAAc;IAC7B,2CAA2C;IAC3C,IAAI,EAAE,MAAM,CAAC;IACb,sDAAsD;IACtD,SAAS,EAAE,MAAM,CAAC;IAClB,wEAAwE;IACxE,UAAU,EAAE,MAAM,CAAC;IACnB,oDAAoD;IACpD,GAAG,EAAE,MAAM,CAAC;CACb;AAED,qBAAa,OAAO;IAClB,OAAO,CAAC,EAAE,CAAmB;IAC7B,OAAO,CAAC,QAAQ,CAAC,IAAI,CAAS;IAC9B,OAAO,CAAC,QAAQ,CAAC,SAAS,CAAS;IACnC,OAAO,CAAC,QAAQ,CAAC,UAAU,CAAS;IACpC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAS;gBAEjB,IAAI,EAAE,cAAc;IAO1B,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC;IAc3B,0DAA0D;IACpD,WAAW,IAAI,OAAO,CAAC,OAAO,CAAC;IAcrC,KAAK,IAAI,IAAI;IAOb,OAAO,CAAC,eAAe;IAuDvB,OAAO,CAAC,QAAQ;IAQhB,OAAO,CAAC,SAAS;IAMjB,OAAO,CAAC,SAAS;IAKjB;;;;OAIG;IACH,UAAU,CACR,OAAO,EAAE,MAAM,EACf,OAAO,EAAE,MAAM,EACf,MAAM,EAAE,aAAa,CAAC;QACpB,UAAU,EAAE,MAAM,CAAC;QACnB,SAAS,EAAE,MAAM,CAAC;QAClB,OAAO,EAAE,MAAM,CAAC;QAChB,WAAW,EAAE,MAAM,CAAC;QACpB,MAAM,EAAE,YAAY,CAAC;KACtB,CAAC,EACF,IAAI,GAAE,cAAqB,GAC1B,IAAI;IAkCP,iEAAiE;IACjE,UAAU,CAAC,OAAO,EAAE,MAAM,GAAG,IAAI;IAMjC;;;;;OAKG;IACH,eAAe,CAAC,IAAI,CAAC,EAAE,cAAc,GAAG,cAAc,EAAE;IAQxD;;gDAE4C;IAC5C,MAAM,CAAC,QAAQ,EAAE,YAAY,EAAE,CAAC,EAAE,MAAM,EAAE,IAAI,GAAE;QAAE,MAAM,CAAC,EAAE,MAAM,CAAC;QAAC,QAAQ,CAAC,EAAE,MAAM,CAAA;KAAO,GAAG,cAAc,EAAE;IA8D9G,kDAAkD;IAClD,WAAW,IAAI,MAAM;CAKtB;AAED,8EAA8E;AAC9E,wBAAgB,kBAAkB,CAAC,eAAe,EAAE,MAAM,GAAG,MAAM,CAIlE"}
package/dist/embed-db.js CHANGED
@@ -13,7 +13,7 @@
  // on 50K × 384 floats). HNSW comes in v2.1 if real users hit that ceiling.
  import { promises as fs } from "node:fs";
  import * as path from "node:path";
- const SCHEMA_VERSION = 1;
+ const SCHEMA_VERSION = 2;
  // v2.0.0-beta.1 P2 fix: probe the native binding via :memory: open so the
  // "JS package present but *.node binary missing" failure mode produces a
  // clean error pointing at `npm rebuild`, not a raw bindings stack trace.
@@ -120,6 +120,7 @@ export class EmbedDb {
  line_end INTEGER NOT NULL,
  text_preview TEXT NOT NULL,
  vector BLOB NOT NULL,
+ kind TEXT NOT NULL DEFAULT 'md',
  UNIQUE(rel_path, chunk_index)
  );
  CREATE INDEX IF NOT EXISTS embeddings_rel_path ON embeddings(rel_path);
@@ -127,6 +128,7 @@ export class EmbedDb {
  rel_path TEXT PRIMARY KEY,
  mtime_ms INTEGER NOT NULL,
  n_chunks INTEGER NOT NULL,
+ kind TEXT NOT NULL DEFAULT 'md',
  indexed_at TEXT NOT NULL
  );
  `);
@@ -156,23 +158,27 @@ export class EmbedDb {
  throw new Error("EmbedDb is not open — call .open() first");
  return this.db;
  }
- /** Replace all embeddings for a single note. Caller computes vectors. */
- upsertNote(relPath, mtimeMs, chunks) {
+ /**
+ * Replace all embeddings for a single note. Caller computes vectors.
+ * v2.8.0: optional `kind` parameter ("md" | "pdf"); defaults to "md" so
+ * existing callers (markdown indexing path) need no changes.
+ */
+ upsertNote(relPath, mtimeMs, chunks, kind = "md") {
  const db = this.requireDb();
  const dim = this.dim;
  const tx = db.transaction((...args) => {
  const rows = args[0];
  db.prepare("DELETE FROM embeddings WHERE rel_path = ?").run(relPath);
- const insert = db.prepare(`INSERT INTO embeddings (rel_path, chunk_index, line_start, line_end, text_preview, vector)
- VALUES (?, ?, ?, ?, ?, ?)`);
+ const insert = db.prepare(`INSERT INTO embeddings (rel_path, chunk_index, line_start, line_end, text_preview, vector, kind)
+ VALUES (?, ?, ?, ?, ?, ?, ?)`);
  for (const c of rows) {
  if (c.vector.length !== dim) {
  throw new Error(`vector dim mismatch for ${relPath} chunk ${c.chunkIndex}: got ${c.vector.length}, expected ${dim}`);
  }
- insert.run(relPath, c.chunkIndex, c.lineStart, c.lineEnd, c.textPreview, Buffer.from(c.vector.buffer, c.vector.byteOffset, c.vector.byteLength));
+ insert.run(relPath, c.chunkIndex, c.lineStart, c.lineEnd, c.textPreview, Buffer.from(c.vector.buffer, c.vector.byteOffset, c.vector.byteLength), kind);
  }
- db.prepare(`INSERT OR REPLACE INTO source_state (rel_path, mtime_ms, n_chunks, indexed_at)
- VALUES (?, ?, ?, datetime('now'))`).run(relPath, mtimeMs, rows.length);
+ db.prepare(`INSERT OR REPLACE INTO source_state (rel_path, mtime_ms, n_chunks, kind, indexed_at)
+ VALUES (?, ?, ?, ?, datetime('now'))`).run(relPath, mtimeMs, rows.length, kind);
  });
  tx(chunks);
  }
@@ -182,10 +188,17 @@ export class EmbedDb {
  db.prepare("DELETE FROM embeddings WHERE rel_path = ?").run(relPath);
  db.prepare("DELETE FROM source_state WHERE rel_path = ?").run(relPath);
  }
- /** Read the source-state table — caller compares mtimes to decide what to
- * re-embed. */
- getSourceStates() {
+ /**
+ * Read the source-state table — caller compares mtimes to decide what to
+ * re-embed. v2.8.0: optional `kind` filter — when set, only rows of that
+ * kind are returned. Lets the markdown-sync and PDF-sync paths run
+ * independently without one's "missing files" being deleted by the other.
+ */
+ getSourceStates(kind) {
  const db = this.requireDb();
+ if (kind !== undefined) {
+ return db.prepare("SELECT rel_path, mtime_ms FROM source_state WHERE kind = ?").all(kind);
+ }
  return db.prepare("SELECT rel_path, mtime_ms FROM source_state").all();
  }
  /** Brute-force cosine top-K. Vectors are L2-normalized at insert time so
@@ -204,9 +217,9 @@ export class EmbedDb {
  // FtsIndex.search() in fts5.ts.
  const rows = db
  .prepare(folderPrefix
- ? `SELECT rel_path, chunk_index, line_start, line_end, text_preview, vector
+ ? `SELECT rel_path, chunk_index, line_start, line_end, text_preview, vector, kind
  FROM embeddings WHERE substr(rel_path, 1, ?) = ?`
- : `SELECT rel_path, chunk_index, line_start, line_end, text_preview, vector FROM embeddings`)
+ : `SELECT rel_path, chunk_index, line_start, line_end, text_preview, vector, kind FROM embeddings`)
  .all(...(folderPrefix ? [folderPrefix.length, folderPrefix] : []));
  const expectedBytes = this.dim * 4; // Float32 = 4 bytes
  const heap = [];
@@ -232,7 +245,8 @@ export class EmbedDb {
  line_start: r.line_start,
  line_end: r.line_end,
  text_preview: r.text_preview,
- score
+ score,
+ kind: (r.kind === "pdf" ? "pdf" : "md")
  });
  }
  heap.sort((a, b) => b.score - a.score);
@@ -1 +1 @@
- {"version":3,"file":"embed-db.js","sourceRoot":"","sources":["../src/embed-db.ts"],"names":[],"mappings":"AAAA,0EAA0E;AAC1E,6EAA6E;AAC7E,gFAAgF;AAChF,kDAAkD;AAClD,EAAE;AACF,gCAAgC;AAChC,gDAAgD;AAChD,0CAA0C;AAC1C,gFAAgF;AAChF,2DAA2D;AAC3D,EAAE;AACF,+EAA+E;AAC/E,2EAA2E;AAE3E,OAAO,EAAE,QAAQ,IAAI,EAAE,EAAE,MAAM,SAAS,CAAC;AACzC,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAElC,MAAM,cAAc,GAAG,CAAC,CAAC;AA0BzB,0EAA0E;AAC1E,yEAAyE;AACzE,yEAAyE;AACzE,IAAI,gBAAgB,GAA2C,IAAI,CAAC;AACpE,KAAK,UAAU,gBAAgB;IAC7B,IAAI,gBAAgB;QAAE,OAAO,gBAAgB,CAAC;IAC9C,IAAI,CAAC;QACH,MAAM,GAAG,GAAG,CAAC,MAAM,MAAM,CAAC,gBAAgB,CAAC,CAAgD,CAAC;QAC5F,MAAM,IAAI,GAAG,GAAG,CAAC,OAAO,CAAC;QACzB,IAAI,CAAC,IAAI;YAAE,MAAM,IAAI,KAAK,CAAC,sCAAsC,CAAC,CAAC;QACnE,IAAI,CAAC;YACH,MAAM,KAAK,GAAG,IAAI,IAAI,CAAC,UAAU,CAA2B,CAAC;YAC7D,KAAK,CAAC,KAAK,EAAE,EAAE,CAAC;QAClB,CAAC;QAAC,OAAO,QAAQ,EAAE,CAAC;YAClB,MAAM,IAAI,KAAK,CACb,+IAA+I,QAAQ,YAAY,KAAK,CAAC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,QAAQ,CAAC,EAAE,CACjN,CAAC;QACJ,CAAC;QACD,gBAAgB,GAAG,IAAI,CAAC;QACxB,OAAO,IAAI,CAAC;IACd,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QACb,MAAM,IAAI,KAAK,CACb,8HACE,GAAG,YAAY,KAAK,CAAC,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,GAAG,CACjD,EAAE,CACH,CAAC;IACJ,CAAC;AACH,CAAC;AA6BD,MAAM,OAAO,OAAO;IACV,EAAE,GAAc,IAAI,CAAC;IACZ,IAAI,CAAS;IACb,SAAS,CAAS;IAClB,UAAU,CAAS;IACnB,GAAG,CAAS;IAE7B,YAAY,IAAoB;QAC9B,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC;QACtB,IAAI,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC;QAChC,IAAI,CAAC,UAAU,GAAG,IAAI,CAAC,UAAU,CAAC;QAClC,IAAI,CAAC,GAAG,GAAG,IAAI,CAAC,GAAG,CAAC;IACtB,CAAC;IAED,KAAK,CAAC,IAAI;QACR,IAAI,IAAI,CAAC,EAAE;YAAE,OAAO;QACpB,MAAM,IAAI,GAAG,MAAM,gBAAgB,EAAE,CAAC;QACtC,MAAM,EAAE,CAAC,KAAK,CAAC,IAAI,CAAC,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,IAAI,EAAE,KAAK,EAAE,CAAC,CAAC;QAC1E,MAAM,EAAE,CAAC,KAAK,CAAC,IAAI,CAAC,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,KAAK,CAAC,CAAC,KAAK,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC;QAC/D,IAAI,CAAC,EAAE,GAAG,IAAI,IAAI,CAAC,IAAI,CAAC,IAAI,CAAO,CAAC;QACpC,IAAI,CAAC,EAAE,CAAC,MAAM,CAAC,oB
AAoB,CAAC,CAAC;QACrC,IAAI,CAAC,EAAE,CAAC,MAAM,CAAC,sBAAsB,CAAC,CAAC;QACvC,IAAI,CAAC,eAAe,EAAE,CAAC;QACvB,MAAM,OAAO,CAAC,GAAG,CACf,CAAC,IAAI,CAAC,IAAI,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,EAAE,CAAC,KAAK,CAAC,CAAC,EAAE,KAAK,CAAC,CAAC,KAAK,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC,CACnG,CAAC;IACJ,CAAC;IAED,0DAA0D;IAC1D,KAAK,CAAC,WAAW;QACf,IAAI,CAAC,KAAK,EAAE,CAAC;QACb,IAAI,OAAO,GAAG,KAAK,CAAC;QACpB,KAAK,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,CAAC,EAAE,CAAC;YACpE,IAAI,CAAC;gBACH,MAAM,EAAE,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC;gBACnB,OAAO,GAAG,IAAI,CAAC;YACjB,CAAC;YAAC,MAAM,CAAC;gBACP,kBAAkB;YACpB,CAAC;QACH,CAAC;QACD,OAAO,OAAO,CAAC;IACjB,CAAC;IAED,KAAK;QACH,IAAI,IAAI,CAAC,EAAE,EAAE,CAAC;YACZ,IAAI,CAAC,EAAE,CAAC,KAAK,EAAE,CAAC;YAChB,IAAI,CAAC,EAAE,GAAG,IAAI,CAAC;QACjB,CAAC;IACH,CAAC;IAEO,eAAe;QACrB,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAE5B,EAAE,CAAC,IAAI,CAAC;;;;;KAKP,CAAC,CAAC;QAEH,MAAM,IAAI,GAAG,IAAI,CAAC,QAAQ,EAAE,CAAC;QAC7B,MAAM,YAAY,GAAG,IAAI,CAAC,cAAc,KAAK,SAAS,IAAI,IAAI,CAAC,cAAc,KAAK,MAAM,CAAC,cAAc,CAAC,CAAC;QACzG,MAAM,SAAS,GAAG,IAAI,CAAC,UAAU,KAAK,SAAS,IAAI,IAAI,CAAC,UAAU,KAAK,IAAI,CAAC,SAAS,CAAC;QACtF,MAAM,UAAU,GAAG,IAAI,CAAC,WAAW,KAAK,SAAS,IAAI,IAAI,CAAC,WAAW,KAAK,IAAI,CAAC,UAAU,CAAC;QAC1F,MAAM,QAAQ,GAAG,IAAI,CAAC,GAAG,KAAK,SAAS,IAAI,IAAI,CAAC,GAAG,KAAK,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC;QACzE,IAAI,CAAC,YAAY,IAAI,CAAC,SAAS,IAAI,CAAC,UAAU,IAAI,CAAC,QAAQ,EAAE,CAAC;YAC5D,MAAM,MAAM,GAAa,EAAE,CAAC;YAC5B,IAAI,CAAC,YAAY;gBAAE,MAAM,CAAC,IAAI,CAAC,kBAAkB,IAAI,CAAC,cAAc,MAAM,cAAc,EAAE,CAAC,CAAC;YAC5F,IAAI,CAAC,SAAS;gBAAE,MAAM,CAAC,IAAI,CAAC,cAAc,IAAI,CAAC,UAAU,MAAM,IAAI,CAAC,SAAS,EAAE,CAAC,CAAC;YACjF,IAAI,CAAC,UAAU;gBAAE,MAAM,CAAC,IAAI,CAAC,SAAS,IAAI,CAAC,WAAW,MAAM,IAAI,CAAC,UAAU,EAAE,CAAC,CAAC;YAC/E,IAAI,CAAC,QAAQ;gBAAE,MAAM,CAAC,IAAI,CAAC,OAAO,IAAI,CAAC,GAAG,MAAM,IAAI,CAAC,GAAG,EAAE,CAAC,CAAC;YAC5D,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,oCAAoC,MAAM,CAAC,IAAI,CAAC,IAA
I,CAAC,KAAK,CAAC,CAAC;YACjF,EAAE,CAAC,IAAI,CAAC,qEAAqE,CAAC,CAAC;QACjF,CAAC;QAED,EAAE,CAAC,IAAI,CAAC;;;;;;;;;;;;;;;;;;KAkBP,CAAC,CAAC;QAEH,IAAI,CAAC,SAAS,CAAC;YACb,cAAc,EAAE,MAAM,CAAC,cAAc,CAAC;YACtC,UAAU,EAAE,IAAI,CAAC,SAAS;YAC1B,WAAW,EAAE,IAAI,CAAC,UAAU;YAC5B,GAAG,EAAE,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC;SACtB,CAAC,CAAC;IACL,CAAC;IAEO,QAAQ;QACd,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,IAAI,GAAG,EAAE,CAAC,OAAO,CAAC,6BAA6B,CAAC,CAAC,GAAG,EAAkC,CAAC;QAC7F,MAAM,GAAG,GAA2B,EAAE,CAAC;QACvC,KAAK,MAAM,CAAC,IAAI,IAAI;YAAE,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,KAAK,CAAC;QAC3C,OAAO,GAAG,CAAC;IACb,CAAC;IAEO,SAAS,CAAC,EAA0B;QAC1C,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,IAAI,GAAG,EAAE,CAAC,OAAO,CAAC,wDAAwD,CAAC,CAAC;QAClF,KAAK,MAAM,CAAC,CAAC,EAAE,CAAC,CAAC,IAAI,MAAM,CAAC,OAAO,CAAC,EAAE,CAAC;YAAE,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;IAC1D,CAAC;IAEO,SAAS;QACf,IAAI,CAAC,IAAI,CAAC,EAAE;YAAE,MAAM,IAAI,KAAK,CAAC,0CAA0C,CAAC,CAAC;QAC1E,OAAO,IAAI,CAAC,EAAE,CAAC;IACjB,CAAC;IAED,yEAAyE;IACzE,UAAU,CACR,OAAe,EACf,OAAe,EACf,MAME;QAEF,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,GAAG,GAAG,IAAI,CAAC,GAAG,CAAC;QACrB,MAAM,EAAE,GAAG,EAAE,CAAC,WAAW,CAAC,CAAC,GAAG,IAAe,EAAE,EAAE;YAC/C,MAAM,IAAI,GAAG,IAAI,CAAC,CAAC,CAAkB,CAAC;YACtC,EAAE,CAAC,OAAO,CAAC,2CAA2C,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;YACrE,MAAM,MAAM,GAAG,EAAE,CAAC,OAAO,CACvB;mCAC2B,CAC5B,CAAC;YACF,KAAK,MAAM,CAAC,IAAI,IAAI,EAAE,CAAC;gBACrB,IAAI,CAAC,CAAC,MAAM,CAAC,MAAM,KAAK,GAAG,EAAE,CAAC;oBAC5B,MAAM,IAAI,KAAK,CACb,2BAA2B,OAAO,UAAU,CAAC,CAAC,UAAU,SAAS,CAAC,CAAC,MAAM,CAAC,MAAM,cAAc,GAAG,EAAE,CACpG,CAAC;gBACJ,CAAC;gBACD,MAAM,CAAC,GAAG,CACR,OAAO,EACP,CAAC,CAAC,UAAU,EACZ,CAAC,CAAC,SAAS,EACX,CAAC,CAAC,OAAO,EACT,CAAC,CAAC,WAAW,EACb,MAAM,CAAC,IAAI,CAAC,CAAC,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC,CAAC,MAAM,CAAC,UAAU,EAAE,CAAC,CAAC,MAAM,CAAC,UAAU,CAAC,CACvE,CAAC;YACJ,CAAC;YACD,EAAE,CAAC,OAAO,CACR;2CACmC,CACpC,CAAC,GAAG,CAAC,OAAO,EAAE,OAAO,EAAE,IAAI,CAAC,MAAM,CAAC,CAAC;QACvC,CAAC,CAAC,CAAC;QACH,EAAE,CAAC,MAAM,CAAC,CAAC;IACb
,CAAC;IAED,iEAAiE;IACjE,UAAU,CAAC,OAAe;QACxB,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,EAAE,CAAC,OAAO,CAAC,2CAA2C,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;QACrE,EAAE,CAAC,OAAO,CAAC,6CAA6C,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;IACzE,CAAC;IAED;oBACgB;IAChB,eAAe;QACb,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,OAAO,EAAE,CAAC,OAAO,CAAC,6CAA6C,CAAC,CAAC,GAAG,EAAkB,CAAC;IACzF,CAAC;IAED;;gDAE4C;IAC5C,MAAM,CAAC,QAAsB,EAAE,CAAS,EAAE,OAA+C,EAAE;QACzF,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,IAAI,QAAQ,CAAC,MAAM,KAAK,IAAI,CAAC,GAAG,EAAE,CAAC;YACjC,MAAM,IAAI,KAAK,CAAC,kCAAkC,QAAQ,CAAC,MAAM,cAAc,IAAI,CAAC,GAAG,EAAE,CAAC,CAAC;QAC7F,CAAC;QACD,MAAM,QAAQ,GAAG,IAAI,CAAC,QAAQ,IAAI,CAAC,QAAQ,CAAC;QAC5C,MAAM,YAAY,GAAG,IAAI,CAAC,MAAM,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,MAAM,CAAC,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,IAAI,CAAC;QAEhF,yEAAyE;QACzE,uEAAuE;QACvE,yEAAyE;QACzE,gCAAgC;QAChC,MAAM,IAAI,GAAG,EAAE;aACZ,OAAO,CACN,YAAY;YACV,CAAC,CAAC;8DACkD;YACpD,CAAC,CAAC,0FAA0F,CAC/F;aACA,GAAG,CAOD,GAAG,CAAC,YAAY,CAAC,CAAC,CAAC,CAAC,YAAY,CAAC,MAAM,EAAE,YAAY,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAEnE,MAAM,aAAa,GAAG,IAAI,CAAC,GAAG,GAAG,CAAC,CAAC,CAAC,oBAAoB;QACxD,MAAM,IAAI,GAAqB,EAAE,CAAC;QAClC,KAAK,MAAM,CAAC,IAAI,IAAI,EAAE,CAAC;YACrB,uEAAuE;YACvE,qEAAqE;YACrE,qEAAqE;YACrE,gEAAgE;YAChE,IAAI,CAAC,CAAC,MAAM,CAAC,UAAU,KAAK,aAAa,EAAE,CAAC;gBAC1C,OAAO,CAAC,MAAM,CAAC,KAAK,CAClB,qBAAqB,CAAC,CAAC,QAAQ,IAAI,CAAC,CAAC,WAAW,iBAAiB,CAAC,CAAC,MAAM,CAAC,UAAU,eAAe,aAAa,UAAU,IAAI,CAAC,GAAG,wDAAwD,CAC3L,CAAC;gBACF,SAAS;YACX,CAAC;YACD,MAAM,GAAG,GAAG,IAAI,YAAY,CAAC,CAAC,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC,CAAC,MAAM,CAAC,UAAU,EAAE,IAAI,CAAC,GAAG,CAAC,CAAC;YAC7E,IAAI,KAAK,GAAG,CAAC,CAAC;YACd,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC,EAAE,EAAE,CAAC;gBAClC,KAAK,IAAI,CAAC,QAAQ,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC;YAC9C,CAAC;YACD,IAAI,KAAK,GAAG,QAAQ;gBAAE,SAAS;YAC/B,IAAI,CAAC,IAAI,CAAC;gBACR,QAAQ,EAAE,CAAC,CAAC,QAAQ;gBACpB,WAAW,EAAE,CAAC,CAAC,WAAW;gBAC1B,UA
AU,EAAE,CAAC,CAAC,UAAU;gBACxB,QAAQ,EAAE,CAAC,CAAC,QAAQ;gBACpB,YAAY,EAAE,CAAC,CAAC,YAAY;gBAC5B,KAAK;aACN,CAAC,CAAC;QACL,CAAC;QACD,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,EAAE,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,KAAK,GAAG,CAAC,CAAC,KAAK,CAAC,CAAC;QACvC,OAAO,IAAI,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;IAC1B,CAAC;IAED,kDAAkD;IAClD,WAAW;QACT,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,GAAG,GAAG,EAAE,CAAC,OAAO,CAAC,sCAAsC,CAAC,CAAC,GAAG,EAAiB,CAAC;QACpF,OAAO,GAAG,EAAE,CAAC,IAAI,CAAC,CAAC;IACrB,CAAC;CACF;AAED,8EAA8E;AAC9E,MAAM,UAAU,kBAAkB,CAAC,eAAuB;IACxD,4EAA4E;IAC5E,wEAAwE;IACxE,OAAO,GAAG,eAAe,WAAW,CAAC;AACvC,CAAC"}
+ {"version":3,"file":"embed-db.js","sourceRoot":"","sources":["../src/embed-db.ts"],"names":[],"mappings":"AAAA,0EAA0E;AAC1E,6EAA6E;AAC7E,gFAAgF;AAChF,kDAAkD;AAClD,EAAE;AACF,gCAAgC;AAChC,gDAAgD;AAChD,0CAA0C;AAC1C,gFAAgF;AAChF,2DAA2D;AAC3D,EAAE;AACF,+EAA+E;AAC/E,2EAA2E;AAE3E,OAAO,EAAE,QAAQ,IAAI,EAAE,EAAE,MAAM,SAAS,CAAC;AACzC,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAElC,MAAM,cAAc,GAAG,CAAC,CAAC;AAkCzB,0EAA0E;AAC1E,yEAAyE;AACzE,yEAAyE;AACzE,IAAI,gBAAgB,GAA2C,IAAI,CAAC;AACpE,KAAK,UAAU,gBAAgB;IAC7B,IAAI,gBAAgB;QAAE,OAAO,gBAAgB,CAAC;IAC9C,IAAI,CAAC;QACH,MAAM,GAAG,GAAG,CAAC,MAAM,MAAM,CAAC,gBAAgB,CAAC,CAAgD,CAAC;QAC5F,MAAM,IAAI,GAAG,GAAG,CAAC,OAAO,CAAC;QACzB,IAAI,CAAC,IAAI;YAAE,MAAM,IAAI,KAAK,CAAC,sCAAsC,CAAC,CAAC;QACnE,IAAI,CAAC;YACH,MAAM,KAAK,GAAG,IAAI,IAAI,CAAC,UAAU,CAA2B,CAAC;YAC7D,KAAK,CAAC,KAAK,EAAE,EAAE,CAAC;QAClB,CAAC;QAAC,OAAO,QAAQ,EAAE,CAAC;YAClB,MAAM,IAAI,KAAK,CACb,+IAA+I,QAAQ,YAAY,KAAK,CAAC,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,QAAQ,CAAC,EAAE,CACjN,CAAC;QACJ,CAAC;QACD,gBAAgB,GAAG,IAAI,CAAC;QACxB,OAAO,IAAI,CAAC;IACd,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QACb,MAAM,IAAI,KAAK,CACb,8HACE,GAAG,YAAY,KAAK,CAAC,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,GAAG,CACjD,EAAE,CACH,CAAC;IACJ,CAAC;AACH,CAAC;AA6BD,MAAM,OAAO,OAAO;IACV,EAAE,GAAc,IAAI,CAAC;IACZ,IAAI,CAAS;IACb,SAAS,CAAS;IAClB,UAAU,CAAS;IACnB,GAAG,CAAS;IAE7B,YAAY,IAAoB;QAC9B,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC;QACtB,IAAI,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC;QAChC,IAAI,CAAC,UAAU,GAAG,IAAI,CAAC,UAAU,CAAC;QAClC,IAAI,CAAC,GAAG,GAAG,IAAI,CAAC,GAAG,CAAC;IACtB,CAAC;IAED,KAAK,CAAC,IAAI;QACR,IAAI,IAAI,CAAC,EAAE;YAAE,OAAO;QACpB,MAAM,IAAI,GAAG,MAAM,gBAAgB,EAAE,CAAC;QACtC,MAAM,EAAE,CAAC,KAAK,CAAC,IAAI,CAAC,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,IAAI,EAAE,KAAK,EAAE,CAAC,CAAC;QAC1E,MAAM,EAAE,CAAC,KAAK,CAAC,IAAI,CAAC,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,KAAK,CAAC,CAAC,KAAK,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC;QAC/D,IAAI,CAAC,EAAE,GAAG,IAAI,IAAI,CAAC,IAAI,CAAC,IAAI,CAAO,CAAC;QACpC,IAAI,CAAC,EAAE,CAAC,MAAM,CAAC,oB
AAoB,CAAC,CAAC;QACrC,IAAI,CAAC,EAAE,CAAC,MAAM,CAAC,sBAAsB,CAAC,CAAC;QACvC,IAAI,CAAC,eAAe,EAAE,CAAC;QACvB,MAAM,OAAO,CAAC,GAAG,CACf,CAAC,IAAI,CAAC,IAAI,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,EAAE,CAAC,KAAK,CAAC,CAAC,EAAE,KAAK,CAAC,CAAC,KAAK,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC,CACnG,CAAC;IACJ,CAAC;IAED,0DAA0D;IAC1D,KAAK,CAAC,WAAW;QACf,IAAI,CAAC,KAAK,EAAE,CAAC;QACb,IAAI,OAAO,GAAG,KAAK,CAAC;QACpB,KAAK,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,EAAE,GAAG,IAAI,CAAC,IAAI,MAAM,CAAC,EAAE,CAAC;YACpE,IAAI,CAAC;gBACH,MAAM,EAAE,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC;gBACnB,OAAO,GAAG,IAAI,CAAC;YACjB,CAAC;YAAC,MAAM,CAAC;gBACP,kBAAkB;YACpB,CAAC;QACH,CAAC;QACD,OAAO,OAAO,CAAC;IACjB,CAAC;IAED,KAAK;QACH,IAAI,IAAI,CAAC,EAAE,EAAE,CAAC;YACZ,IAAI,CAAC,EAAE,CAAC,KAAK,EAAE,CAAC;YAChB,IAAI,CAAC,EAAE,GAAG,IAAI,CAAC;QACjB,CAAC;IACH,CAAC;IAEO,eAAe;QACrB,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAE5B,EAAE,CAAC,IAAI,CAAC;;;;;KAKP,CAAC,CAAC;QAEH,MAAM,IAAI,GAAG,IAAI,CAAC,QAAQ,EAAE,CAAC;QAC7B,MAAM,YAAY,GAAG,IAAI,CAAC,cAAc,KAAK,SAAS,IAAI,IAAI,CAAC,cAAc,KAAK,MAAM,CAAC,cAAc,CAAC,CAAC;QACzG,MAAM,SAAS,GAAG,IAAI,CAAC,UAAU,KAAK,SAAS,IAAI,IAAI,CAAC,UAAU,KAAK,IAAI,CAAC,SAAS,CAAC;QACtF,MAAM,UAAU,GAAG,IAAI,CAAC,WAAW,KAAK,SAAS,IAAI,IAAI,CAAC,WAAW,KAAK,IAAI,CAAC,UAAU,CAAC;QAC1F,MAAM,QAAQ,GAAG,IAAI,CAAC,GAAG,KAAK,SAAS,IAAI,IAAI,CAAC,GAAG,KAAK,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC;QACzE,IAAI,CAAC,YAAY,IAAI,CAAC,SAAS,IAAI,CAAC,UAAU,IAAI,CAAC,QAAQ,EAAE,CAAC;YAC5D,MAAM,MAAM,GAAa,EAAE,CAAC;YAC5B,IAAI,CAAC,YAAY;gBAAE,MAAM,CAAC,IAAI,CAAC,kBAAkB,IAAI,CAAC,cAAc,MAAM,cAAc,EAAE,CAAC,CAAC;YAC5F,IAAI,CAAC,SAAS;gBAAE,MAAM,CAAC,IAAI,CAAC,cAAc,IAAI,CAAC,UAAU,MAAM,IAAI,CAAC,SAAS,EAAE,CAAC,CAAC;YACjF,IAAI,CAAC,UAAU;gBAAE,MAAM,CAAC,IAAI,CAAC,SAAS,IAAI,CAAC,WAAW,MAAM,IAAI,CAAC,UAAU,EAAE,CAAC,CAAC;YAC/E,IAAI,CAAC,QAAQ;gBAAE,MAAM,CAAC,IAAI,CAAC,OAAO,IAAI,CAAC,GAAG,MAAM,IAAI,CAAC,GAAG,EAAE,CAAC,CAAC;YAC5D,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,oCAAoC,MAAM,CAAC,IAAI,CAAC,IAA
I,CAAC,KAAK,CAAC,CAAC;YACjF,EAAE,CAAC,IAAI,CAAC,qEAAqE,CAAC,CAAC;QACjF,CAAC;QAED,EAAE,CAAC,IAAI,CAAC;;;;;;;;;;;;;;;;;;;;KAoBP,CAAC,CAAC;QAEH,IAAI,CAAC,SAAS,CAAC;YACb,cAAc,EAAE,MAAM,CAAC,cAAc,CAAC;YACtC,UAAU,EAAE,IAAI,CAAC,SAAS;YAC1B,WAAW,EAAE,IAAI,CAAC,UAAU;YAC5B,GAAG,EAAE,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC;SACtB,CAAC,CAAC;IACL,CAAC;IAEO,QAAQ;QACd,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,IAAI,GAAG,EAAE,CAAC,OAAO,CAAC,6BAA6B,CAAC,CAAC,GAAG,EAAkC,CAAC;QAC7F,MAAM,GAAG,GAA2B,EAAE,CAAC;QACvC,KAAK,MAAM,CAAC,IAAI,IAAI;YAAE,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,KAAK,CAAC;QAC3C,OAAO,GAAG,CAAC;IACb,CAAC;IAEO,SAAS,CAAC,EAA0B;QAC1C,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,IAAI,GAAG,EAAE,CAAC,OAAO,CAAC,wDAAwD,CAAC,CAAC;QAClF,KAAK,MAAM,CAAC,CAAC,EAAE,CAAC,CAAC,IAAI,MAAM,CAAC,OAAO,CAAC,EAAE,CAAC;YAAE,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;IAC1D,CAAC;IAEO,SAAS;QACf,IAAI,CAAC,IAAI,CAAC,EAAE;YAAE,MAAM,IAAI,KAAK,CAAC,0CAA0C,CAAC,CAAC;QAC1E,OAAO,IAAI,CAAC,EAAE,CAAC;IACjB,CAAC;IAED;;;;OAIG;IACH,UAAU,CACR,OAAe,EACf,OAAe,EACf,MAME,EACF,OAAuB,IAAI;QAE3B,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,GAAG,GAAG,IAAI,CAAC,GAAG,CAAC;QACrB,MAAM,EAAE,GAAG,EAAE,CAAC,WAAW,CAAC,CAAC,GAAG,IAAe,EAAE,EAAE;YAC/C,MAAM,IAAI,GAAG,IAAI,CAAC,CAAC,CAAkB,CAAC;YACtC,EAAE,CAAC,OAAO,CAAC,2CAA2C,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;YACrE,MAAM,MAAM,GAAG,EAAE,CAAC,OAAO,CACvB;sCAC8B,CAC/B,CAAC;YACF,KAAK,MAAM,CAAC,IAAI,IAAI,EAAE,CAAC;gBACrB,IAAI,CAAC,CAAC,MAAM,CAAC,MAAM,KAAK,GAAG,EAAE,CAAC;oBAC5B,MAAM,IAAI,KAAK,CACb,2BAA2B,OAAO,UAAU,CAAC,CAAC,UAAU,SAAS,CAAC,CAAC,MAAM,CAAC,MAAM,cAAc,GAAG,EAAE,CACpG,CAAC;gBACJ,CAAC;gBACD,MAAM,CAAC,GAAG,CACR,OAAO,EACP,CAAC,CAAC,UAAU,EACZ,CAAC,CAAC,SAAS,EACX,CAAC,CAAC,OAAO,EACT,CAAC,CAAC,WAAW,EACb,MAAM,CAAC,IAAI,CAAC,CAAC,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC,CAAC,MAAM,CAAC,UAAU,EAAE,CAAC,CAAC,MAAM,CAAC,UAAU,CAAC,EACtE,IAAI,CACL,CAAC;YACJ,CAAC;YACD,EAAE,CAAC,OAAO,CACR;8CACsC,CACvC,CAAC,GAAG,CAAC,OAAO,EAAE,OAAO,EAAE,IAAI,CAAC,MAAM,EAAE,IAAI,CAAC,CAAC;QAC7C,CAAC,CAAC,
CAAC;QACH,EAAE,CAAC,MAAM,CAAC,CAAC;IACb,CAAC;IAED,iEAAiE;IACjE,UAAU,CAAC,OAAe;QACxB,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,EAAE,CAAC,OAAO,CAAC,2CAA2C,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;QACrE,EAAE,CAAC,OAAO,CAAC,6CAA6C,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;IACzE,CAAC;IAED;;;;;OAKG;IACH,eAAe,CAAC,IAAqB;QACnC,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,IAAI,IAAI,KAAK,SAAS,EAAE,CAAC;YACvB,OAAO,EAAE,CAAC,OAAO,CAAC,4DAA4D,CAAC,CAAC,GAAG,CAAiB,IAAI,CAAC,CAAC;QAC5G,CAAC;QACD,OAAO,EAAE,CAAC,OAAO,CAAC,6CAA6C,CAAC,CAAC,GAAG,EAAkB,CAAC;IACzF,CAAC;IAED;;gDAE4C;IAC5C,MAAM,CAAC,QAAsB,EAAE,CAAS,EAAE,OAA+C,EAAE;QACzF,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,IAAI,QAAQ,CAAC,MAAM,KAAK,IAAI,CAAC,GAAG,EAAE,CAAC;YACjC,MAAM,IAAI,KAAK,CAAC,kCAAkC,QAAQ,CAAC,MAAM,cAAc,IAAI,CAAC,GAAG,EAAE,CAAC,CAAC;QAC7F,CAAC;QACD,MAAM,QAAQ,GAAG,IAAI,CAAC,QAAQ,IAAI,CAAC,QAAQ,CAAC;QAC5C,MAAM,YAAY,GAAG,IAAI,CAAC,MAAM,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,MAAM,CAAC,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,IAAI,CAAC;QAEhF,yEAAyE;QACzE,uEAAuE;QACvE,yEAAyE;QACzE,gCAAgC;QAChC,MAAM,IAAI,GAAG,EAAE;aACZ,OAAO,CACN,YAAY;YACV,CAAC,CAAC;8DACkD;YACpD,CAAC,CAAC,gGAAgG,CACrG;aACA,GAAG,CAQD,GAAG,CAAC,YAAY,CAAC,CAAC,CAAC,CAAC,YAAY,CAAC,MAAM,EAAE,YAAY,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAEnE,MAAM,aAAa,GAAG,IAAI,CAAC,GAAG,GAAG,CAAC,CAAC,CAAC,oBAAoB;QACxD,MAAM,IAAI,GAAqB,EAAE,CAAC;QAClC,KAAK,MAAM,CAAC,IAAI,IAAI,EAAE,CAAC;YACrB,uEAAuE;YACvE,qEAAqE;YACrE,qEAAqE;YACrE,gEAAgE;YAChE,IAAI,CAAC,CAAC,MAAM,CAAC,UAAU,KAAK,aAAa,EAAE,CAAC;gBAC1C,OAAO,CAAC,MAAM,CAAC,KAAK,CAClB,qBAAqB,CAAC,CAAC,QAAQ,IAAI,CAAC,CAAC,WAAW,iBAAiB,CAAC,CAAC,MAAM,CAAC,UAAU,eAAe,aAAa,UAAU,IAAI,CAAC,GAAG,wDAAwD,CAC3L,CAAC;gBACF,SAAS;YACX,CAAC;YACD,MAAM,GAAG,GAAG,IAAI,YAAY,CAAC,CAAC,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC,CAAC,MAAM,CAAC,UAAU,EAAE,IAAI,CAAC,GAAG,CAAC,CAAC;YAC7E,IAAI,KAAK,GAAG,CAAC,CAAC;YACd,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC,EAAE,EAAE,CAAC;gBAClC,KAAK,IAAI,CAAC,QAAQ,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,CAA
C,CAAC,IAAI,CAAC,CAAC,CAAC;YAC9C,CAAC;YACD,IAAI,KAAK,GAAG,QAAQ;gBAAE,SAAS;YAC/B,IAAI,CAAC,IAAI,CAAC;gBACR,QAAQ,EAAE,CAAC,CAAC,QAAQ;gBACpB,WAAW,EAAE,CAAC,CAAC,WAAW;gBAC1B,UAAU,EAAE,CAAC,CAAC,UAAU;gBACxB,QAAQ,EAAE,CAAC,CAAC,QAAQ;gBACpB,YAAY,EAAE,CAAC,CAAC,YAAY;gBAC5B,KAAK;gBACL,IAAI,EAAE,CAAC,CAAC,CAAC,IAAI,KAAK,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,IAAI,CAAmB;aAC1D,CAAC,CAAC;QACL,CAAC;QACD,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,EAAE,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,KAAK,GAAG,CAAC,CAAC,KAAK,CAAC,CAAC;QACvC,OAAO,IAAI,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;IAC1B,CAAC;IAED,kDAAkD;IAClD,WAAW;QACT,MAAM,EAAE,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC;QAC5B,MAAM,GAAG,GAAG,EAAE,CAAC,OAAO,CAAC,sCAAsC,CAAC,CAAC,GAAG,EAAiB,CAAC;QACpF,OAAO,GAAG,EAAE,CAAC,IAAI,CAAC,CAAC;IACrB,CAAC;CACF;AAED,8EAA8E;AAC9E,MAAM,UAAU,kBAAkB,CAAC,eAAuB;IACxD,4EAA4E;IAC5E,wEAAwE;IACxE,OAAO,GAAG,eAAe,WAAW,CAAC;AACvC,CAAC"}
@@ -35,4 +35,38 @@ export interface Embedder {
  export declare function loadEmbedder(alias?: string): Promise<Embedder>;
  /** Cosine similarity between two L2-normalized vectors (= dot product). */
  export declare function cosineSim(a: Float32Array, b: Float32Array): number;
+ /** BGE reranker model catalog — analogous to `EMBEDDING_MODELS`. */
+ export interface RerankerModel {
+     alias: string;
+     hfId: string;
+     approxSizeMB: number;
+     multilingual: boolean;
+     /** Max combined (query + passage) tokens — BGE base is 512. */
+     maxTokens: number;
+ }
+ export declare const RERANKER_MODELS: Readonly<Record<string, RerankerModel>>;
+ export declare const DEFAULT_RERANKER_ALIAS = "rerank-multilingual";
+ export declare function resolveRerankerModel(alias: string | undefined): RerankerModel;
+ /** Opaque handle for a loaded reranker. Constructed via `loadReranker()`. */
+ export interface Reranker {
+     readonly model: RerankerModel;
+     /**
+      * Score (query, passage) pairs. Higher = more relevant. BGE rerankers
+      * return logits in roughly [-10, +10]; we apply sigmoid to get [0, 1] for
+      * comparable scoring across models. Truncation of overly-long passages
+      * is the model's responsibility (it'll silently chop at maxTokens).
+      *
+      * Returns one score per passage in input order.
+      */
+     score(query: string, passages: readonly string[]): Promise<number[]>;
+ }
+ /**
+  * Load a BGE-style cross-encoder reranker. Lazy-imports
+  * `@huggingface/transformers` on first call (same lazy-load pattern as
+  * `loadEmbedder`). Cold-start downloads the model from HuggingFace
+  * (~25-110 MB depending on alias) into `~/.cache/huggingface/`.
+  *
+  * @param alias - Reranker alias from RERANKER_MODELS (default: "rerank-multilingual").
+  */
+ export declare function loadReranker(alias?: string): Promise<Reranker>;
  //# sourceMappingURL=embeddings.d.ts.map
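The `Reranker.score` declaration above returns one score per passage in input order. A minimal sketch of how a caller might use those scores is shown here: re-sort the first `topN` hits by reranker score, keep the remaining hits in their prior order, then truncate to `limit`. Only the `score(query, passages)` shape comes from the declaration; `Hit`, `RerankerLike`, `applyRerankScores`, `rerankTopN`, and the handling of hits beyond `topN` are assumptions made for illustration and are not taken from the package.

```typescript
// Hypothetical hit shape for illustration; field names are NOT from the package.
interface Hit {
  relPath: string;
  text: string;
  rrfScore: number;
  rerankerScore?: number;
}

// Mirrors only the `score` method of the declared `Reranker` interface.
interface RerankerLike {
  score(query: string, passages: readonly string[]): Promise<number[]>;
}

// Descending comparator that avoids NaN when both values are -Infinity.
function byScoreDesc(a: number, b: number): number {
  if (a === b) return 0;
  return b > a ? 1 : -1;
}

// Attach scores to the first `topN` hits, sort them by score (descending),
// append the untouched tail, then truncate to `limit`.
function applyRerankScores(
  hits: readonly Hit[],
  scores: readonly number[],
  topN: number,
  limit: number
): Hit[] {
  const head = hits.slice(0, topN).map((h, i) => ({
    ...h,
    // Assumption: a missing score sinks the hit to the bottom of the head.
    rerankerScore: scores[i] ?? Number.NEGATIVE_INFINITY
  }));
  head.sort((x, y) => byScoreDesc(x.rerankerScore, y.rerankerScore));
  return [...head, ...hits.slice(topN)].slice(0, limit);
}

// Async wrapper: asks the reranker for scores, then applies them.
async function rerankTopN(
  reranker: RerankerLike,
  query: string,
  hits: readonly Hit[],
  topN: number,
  limit: number
): Promise<Hit[]> {
  const passages = hits.slice(0, topN).map((h) => h.text);
  const scores = await reranker.score(query, passages);
  return applyRerankScores(hits, scores, topN, limit);
}
```

Splitting the pure re-sort step from the async call keeps the ordering logic testable without downloading a model.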
@@ -1 +1 @@
- {"version":3,"file":"embeddings.d.ts","sourceRoot":"","sources":["../src/embeddings.ts"],"names":[],"mappings":"AAcA;;mCAEmC;AACnC,MAAM,WAAW,cAAc;IAC7B,iEAAiE;IACjE,KAAK,EAAE,MAAM,CAAC;IACd,uDAAuD;IACvD,IAAI,EAAE,MAAM,CAAC;IACb,4DAA4D;IAC5D,GAAG,EAAE,MAAM,CAAC;IACZ,8EAA8E;IAC9E,YAAY,EAAE,MAAM,CAAC;IACrB,gEAAgE;IAChE,YAAY,EAAE,OAAO,CAAC;IACtB,6DAA6D;IAC7D,SAAS,EAAE,MAAM,CAAC;CACnB;AAED,eAAO,MAAM,gBAAgB,EAAE,QAAQ,CAAC,MAAM,CAAC,MAAM,EAAE,cAAc,CAAC,CAiBpE,CAAC;AAEH,0EAA0E;AAC1E,eAAO,MAAM,mBAAmB,iBAAiB,CAAC;AAElD,wBAAgB,YAAY,CAAC,KAAK,EAAE,MAAM,GAAG,SAAS,GAAG,cAAc,CAQtE;AAED,6EAA6E;AAC7E,MAAM,WAAW,QAAQ;IACvB,QAAQ,CAAC,KAAK,EAAE,cAAc,CAAC;IAC/B;wDACoD;IACpD,KAAK,CAAC,KAAK,EAAE,SAAS,MAAM,EAAE,GAAG,OAAO,CAAC,YAAY,EAAE,CAAC,CAAC;CAC1D;AA0BD;;;;;GAKG;AACH,wBAAsB,YAAY,CAAC,KAAK,CAAC,EAAE,MAAM,GAAG,OAAO,CAAC,QAAQ,CAAC,CAyCpE;AAED,2EAA2E;AAC3E,wBAAgB,SAAS,CAAC,CAAC,EAAE,YAAY,EAAE,CAAC,EAAE,YAAY,GAAG,MAAM,CASlE"}
+ {"version":3,"file":"embeddings.d.ts","sourceRoot":"","sources":["../src/embeddings.ts"],"names":[],"mappings":"AAcA;;mCAEmC;AACnC,MAAM,WAAW,cAAc;IAC7B,iEAAiE;IACjE,KAAK,EAAE,MAAM,CAAC;IACd,uDAAuD;IACvD,IAAI,EAAE,MAAM,CAAC;IACb,4DAA4D;IAC5D,GAAG,EAAE,MAAM,CAAC;IACZ,8EAA8E;IAC9E,YAAY,EAAE,MAAM,CAAC;IACrB,gEAAgE;IAChE,YAAY,EAAE,OAAO,CAAC;IACtB,6DAA6D;IAC7D,SAAS,EAAE,MAAM,CAAC;CACnB;AAED,eAAO,MAAM,gBAAgB,EAAE,QAAQ,CAAC,MAAM,CAAC,MAAM,EAAE,cAAc,CAAC,CAiBpE,CAAC;AAEH,0EAA0E;AAC1E,eAAO,MAAM,mBAAmB,iBAAiB,CAAC;AAElD,wBAAgB,YAAY,CAAC,KAAK,EAAE,MAAM,GAAG,SAAS,GAAG,cAAc,CAQtE;AAED,6EAA6E;AAC7E,MAAM,WAAW,QAAQ;IACvB,QAAQ,CAAC,KAAK,EAAE,cAAc,CAAC;IAC/B;wDACoD;IACpD,KAAK,CAAC,KAAK,EAAE,SAAS,MAAM,EAAE,GAAG,OAAO,CAAC,YAAY,EAAE,CAAC,CAAC;CAC1D;AA0BD;;;;;GAKG;AACH,wBAAsB,YAAY,CAAC,KAAK,CAAC,EAAE,MAAM,GAAG,OAAO,CAAC,QAAQ,CAAC,CAyCpE;AAED,2EAA2E;AAC3E,wBAAgB,SAAS,CAAC,CAAC,EAAE,YAAY,EAAE,CAAC,EAAE,YAAY,GAAG,MAAM,CASlE;AAuBD,oEAAoE;AACpE,MAAM,WAAW,aAAa;IAC5B,KAAK,EAAE,MAAM,CAAC;IACd,IAAI,EAAE,MAAM,CAAC;IACb,YAAY,EAAE,MAAM,CAAC;IACrB,YAAY,EAAE,OAAO,CAAC;IACtB,+DAA+D;IAC/D,SAAS,EAAE,MAAM,CAAC;CACnB;AAED,eAAO,MAAM,eAAe,EAAE,QAAQ,CAAC,MAAM,CAAC,MAAM,EAAE,aAAa,CAAC,CAoBlE,CAAC;AAEH,eAAO,MAAM,sBAAsB,wBAAwB,CAAC;AAE5D,wBAAgB,oBAAoB,CAAC,KAAK,EAAE,MAAM,GAAG,SAAS,GAAG,aAAa,CAQ7E;AAED,6EAA6E;AAC7E,MAAM,WAAW,QAAQ;IACvB,QAAQ,CAAC,KAAK,EAAE,aAAa,CAAC;IAC9B;;;;;;;OAOG;IACH,KAAK,CAAC,KAAK,EAAE,MAAM,EAAE,QAAQ,EAAE,SAAS,MAAM,EAAE,GAAG,OAAO,CAAC,MAAM,EAAE,CAAC,CAAC;CACtE;AAED;;;;;;;GAOG;AACH,wBAAsB,YAAY,CAAC,KAAK,CAAC,EAAE,MAAM,GAAG,OAAO,CAAC,QAAQ,CAAC,CAyCpE"}
@@ -114,4 +114,83 @@ export function cosineSim(a, b) {
      }
      return s;
  }
+ export const RERANKER_MODELS = Object.freeze({
+     // BGE-reranker-base — English, ~110 MB. Latency ~30-50ms per pair on M1 CPU.
+     "rerank-bge": {
+         alias: "rerank-bge",
+         hfId: "Xenova/bge-reranker-base",
+         approxSizeMB: 110,
+         multilingual: false,
+         maxTokens: 512
+     },
+     // mxbai-rerank-xsmall-v1 — multilingual, ~25 MB, much faster than BGE-base.
+     // Better default for users on slower hardware or larger candidate sets.
+     // Cited in MTEB leaderboard as comparable to BGE-base on English while
+     // staying multilingual.
+     "rerank-multilingual": {
+         alias: "rerank-multilingual",
+         hfId: "Xenova/mxbai-rerank-xsmall-v1",
+         approxSizeMB: 25,
+         multilingual: true,
+         maxTokens: 512
+     }
+ });
+ export const DEFAULT_RERANKER_ALIAS = "rerank-multilingual";
+ export function resolveRerankerModel(alias) {
+     const key = alias ?? DEFAULT_RERANKER_ALIAS;
+     const model = RERANKER_MODELS[key];
+     if (!model) {
+         const known = Object.keys(RERANKER_MODELS).join(", ");
+         throw new Error(`Unknown reranker model alias '${key}'. Known aliases: ${known}.`);
+     }
+     return model;
+ }
+ /**
+  * Load a BGE-style cross-encoder reranker. Lazy-imports
+  * `@huggingface/transformers` on first call (same lazy-load pattern as
+  * `loadEmbedder`). Cold-start downloads the model from HuggingFace
+  * (~25-110 MB depending on alias) into `~/.cache/huggingface/`.
+  *
+  * @param alias - Reranker alias from RERANKER_MODELS (default: "rerank-multilingual").
+  */
+ export async function loadReranker(alias) {
+     const model = resolveRerankerModel(alias);
+     const pipeline = await loadPipeline();
+     const classifier = (await pipeline("text-classification", model.hfId));
+     return {
+         model,
+         async score(query, passages) {
+             if (passages.length === 0)
+                 return [];
+             // Build the (query, passage) pair inputs. transformers.js
+             // text-classification accepts an array; the model returns one
+             // {label, score} per input.
+             const inputs = passages.map((p) => ({ text: query, text_pair: p }));
+             // Sub-batch to bound memory — same rationale as the embedder's
+             // MAX_INTERNAL_BATCH. Cross-encoder is heavier per pair, so we use a
+             // smaller batch (4) to keep peak memory under ~150 MB on M1.
+             const MAX_INTERNAL_BATCH = 4;
+             const out = [];
+             for (let batchStart = 0; batchStart < inputs.length; batchStart += MAX_INTERNAL_BATCH) {
+                 const batch = inputs.slice(batchStart, batchStart + MAX_INTERNAL_BATCH);
+                 const result = await classifier(batch);
+                 // Pipeline returns one Array per input by default; flatten to scores.
+                 // Each output is {label, score}; for binary-relevance rerankers, the
+                 // score is already the model's relevance probability.
+                 const scores = Array.isArray(result) ? result : [result];
+                 for (const r of scores) {
+                     if (typeof r?.score === "number") {
+                         out.push(r.score);
+                     }
+                     else {
+                         // Defensive: surface as -Infinity so this hit goes to the bottom
+                         // rather than poisoning the sort with NaN.
+                         out.push(-Infinity);
+                     }
+                 }
+             }
+             return out;
+         }
+     };
+ }
  //# sourceMappingURL=embeddings.js.map
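The declaration comment for `Reranker.score` describes mapping raw logits to `[0, 1]` with a sigmoid, whereas the `score` body above pushes `r.score` as returned by the pipeline, and its comment mentions flattening although `Array.isArray(result) ? result : [result]` only wraps a non-array result. The sketch below shows what those two steps could look like in isolation. Both `sigmoid` and `collectScores` are hypothetical helpers written for this illustration; whether the transformers.js pipeline actually returns nested arrays or raw logits for these models is not verified here.

```typescript
// Logistic function: maps any real-valued logit into the open interval (0, 1).
function sigmoid(logit: number): number {
  return 1 / (1 + Math.exp(-logit));
}

// Recursively walk a pipeline-style result (a single {label, score} object,
// an array of them, or nested arrays) and collect the numeric scores in order.
// Entries without a numeric `score` become -Infinity, matching the defensive
// fallback used in the `score` body above.
function collectScores(result: unknown, out: number[] = []): number[] {
  if (Array.isArray(result)) {
    for (const item of result) {
      collectScores(item, out);
    }
  } else if (
    typeof result === "object" &&
    result !== null &&
    typeof (result as { score?: unknown }).score === "number"
  ) {
    out.push((result as { score: number }).score);
  } else {
    out.push(Number.NEGATIVE_INFINITY);
  }
  return out;
}
```

If the classifier did emit raw logits, a caller could apply `collectScores(result).map(sigmoid)`; `sigmoid(-Infinity)` evaluates to `0`, so the defensive fallback would still sort last.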
@@ -1 +1 @@
- {"version":3,"file":"embeddings.js","sourceRoot":"","sources":["../src/embeddings.ts"],"names":[],"mappings":"AAAA,kFAAkF;AAClF,gFAAgF;AAChF,sEAAsE;AACtE,gFAAgF;AAChF,EAAE;AACF,gBAAgB;AAChB,8EAA8E;AAC9E,oEAAoE;AACpE,wEAAwE;AACxE,yEAAyE;AACzE,iFAAiF;AACjF,8EAA8E;AAC9E,uEAAuE;AAoBvE,MAAM,CAAC,MAAM,gBAAgB,GAA6C,MAAM,CAAC,MAAM,CAAC;IACtF,YAAY,EAAE;QACZ,KAAK,EAAE,cAAc;QACrB,IAAI,EAAE,8CAA8C;QACpD,GAAG,EAAE,GAAG;QACR,YAAY,EAAE,GAAG;QACjB,YAAY,EAAE,IAAI;QAClB,SAAS,EAAE,GAAG;KACf;IACD,GAAG,EAAE;QACH,KAAK,EAAE,KAAK;QACZ,IAAI,EAAE,0BAA0B;QAChC,GAAG,EAAE,GAAG;QACR,YAAY,EAAE,EAAE;QAChB,YAAY,EAAE,KAAK;QACnB,SAAS,EAAE,GAAG;KACf;CACF,CAAC,CAAC;AAEH,0EAA0E;AAC1E,MAAM,CAAC,MAAM,mBAAmB,GAAG,cAAc,CAAC;AAElD,MAAM,UAAU,YAAY,CAAC,KAAyB;IACpD,MAAM,GAAG,GAAG,KAAK,IAAI,mBAAmB,CAAC;IACzC,MAAM,KAAK,GAAG,gBAAgB,CAAC,GAAG,CAAC,CAAC;IACpC,IAAI,CAAC,KAAK,EAAE,CAAC;QACX,MAAM,KAAK,GAAG,MAAM,CAAC,IAAI,CAAC,gBAAgB,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;QACvD,MAAM,IAAI,KAAK,CAAC,kCAAkC,GAAG,qBAAqB,KAAK,GAAG,CAAC,CAAC;IACtF,CAAC;IACD,OAAO,KAAK,CAAC;AACf,CAAC;AAUD,2EAA2E;AAC3E,2EAA2E;AAC3E,4EAA4E;AAC5E,IAAI,YAAY,GAA+D,IAAI,CAAC;AAEpF,KAAK,UAAU,YAAY;IACzB,IAAI,YAAY;QAAE,OAAO,YAAY,CAAC;IACtC,IAAI,CAAC;QACH,gEAAgE;QAChE,MAAM,GAAG,GAAG,CAAC,MAAM,MAAM,CAAC,2BAA2B,CAAC,CAErD,CAAC;QACF,IAAI,CAAC,GAAG,CAAC,QAAQ;YAAE,MAAM,IAAI,KAAK,CAAC,oDAAoD,CAAC,CAAC;QACzF,YAAY,GAAG,GAAG,CAAC,QAAQ,CAAC;QAC5B,OAAO,YAAY,CAAC;IACtB,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QACb,MAAM,IAAI,KAAK,CACb,6HAA6H;YAC3H,iGAAiG;YACjG,mBAAmB,GAAG,YAAY,KAAK,CAAC,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,EAAE,CACxE,CAAC;IACJ,CAAC;AACH,CAAC;AAED;;;;;GAKG;AACH,MAAM,CAAC,KAAK,UAAU,YAAY,CAAC,KAAc;IAC/C,MAAM,KAAK,GAAG,YAAY,CAAC,KAAK,CAAC,CAAC;IAClC,MAAM,QAAQ,GAAG,MAAM,YAAY,EAAE,CAAC;IACtC,MAAM,SAAS,GAAG,CAAC,MAAM,QAAQ,CAAC,oBAAoB,EAAE,KAAK,CAAC,IAAI,CAAC,CAGN,CAAC;IAE9D,wEAAwE;IACxE,wEAAwE;IACxE,yEAAyE;IACzE,wEAAwE;IACxE,0EAA0E;IAC1E,sEAAsE;IACtE,MAAM,kBAAkB,GAAG,CAAC,CAAC;IAE7B,MAAM,GAAG,GAAG,KAAK,CAAC,GAAG,CAAC;IACtB,OAAO;QACL,KAAK;QACL,KAAK
,CAAC,KAAK,CAAC,KAAwB;YAClC,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;gBAAE,OAAO,EAAE,CAAC;YAClC,MAAM,GAAG,GAAmB,EAAE,CAAC;YAC/B,oEAAoE;YACpE,gEAAgE;YAChE,KAAK,IAAI,UAAU,GAAG,CAAC,EAAE,UAAU,GAAG,KAAK,CAAC,MAAM,EAAE,UAAU,IAAI,kBAAkB,EAAE,CAAC;gBACrF,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC,UAAU,EAAE,UAAU,GAAG,kBAAkB,CAAC,CAAC;gBACvE,MAAM,MAAM,GAAG,MAAM,SAAS,CAAC,CAAC,GAAG,KAAK,CAAC,EAAE,EAAE,OAAO,EAAE,MAAM,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;gBACjF,IAAI,MAAM,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,GAAG,EAAE,CAAC;oBAC3B,MAAM,IAAI,KAAK,CACb,SAAS,KAAK,CAAC,IAAI,iBAAiB,MAAM,CAAC,IAAI,CAAC,CAAC,CAAC,cAAc,GAAG,sCAAsC,CAC1G,CAAC;gBACJ,CAAC;gBACD,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;oBACtC,MAAM,KAAK,GAAG,CAAC,GAAG,GAAG,CAAC;oBACtB,uEAAuE;oBACvE,GAAG,CAAC,IAAI,CAAC,IAAI,YAAY,CAAC,MAAM,CAAC,IAAI,CAAC,KAAK,CAAC,KAAK,EAAE,KAAK,GAAG,GAAG,CAAC,CAAC,CAAC,CAAC;gBACpE,CAAC;YACH,CAAC;YACD,OAAO,GAAG,CAAC;QACb,CAAC;KACF,CAAC;AACJ,CAAC;AAED,2EAA2E;AAC3E,MAAM,UAAU,SAAS,CAAC,CAAe,EAAE,CAAe;IACxD,IAAI,CAAC,CAAC,MAAM,KAAK,CAAC,CAAC,MAAM,EAAE,CAAC;QAC1B,MAAM,IAAI,KAAK,CAAC,wBAAwB,CAAC,CAAC,MAAM,OAAO,CAAC,CAAC,MAAM,EAAE,CAAC,CAAC;IACrE,CAAC;IACD,IAAI,CAAC,GAAG,CAAC,CAAC;IACV,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,CAAC,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;QAClC,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC;IACjC,CAAC;IACD,OAAO,CAAC,CAAC;AACX,CAAC"}
+ {"version":3,"file":"embeddings.js","sourceRoot":"","sources":["../src/embeddings.ts"],"names":[],"mappings":"AAAA,kFAAkF;AAClF,gFAAgF;AAChF,sEAAsE;AACtE,gFAAgF;AAChF,EAAE;AACF,gBAAgB;AAChB,8EAA8E;AAC9E,oEAAoE;AACpE,wEAAwE;AACxE,yEAAyE;AACzE,iFAAiF;AACjF,8EAA8E;AAC9E,uEAAuE;AAoBvE,MAAM,CAAC,MAAM,gBAAgB,GAA6C,MAAM,CAAC,MAAM,CAAC;IACtF,YAAY,EAAE;QACZ,KAAK,EAAE,cAAc;QACrB,IAAI,EAAE,8CAA8C;QACpD,GAAG,EAAE,GAAG;QACR,YAAY,EAAE,GAAG;QACjB,YAAY,EAAE,IAAI;QAClB,SAAS,EAAE,GAAG;KACf;IACD,GAAG,EAAE;QACH,KAAK,EAAE,KAAK;QACZ,IAAI,EAAE,0BAA0B;QAChC,GAAG,EAAE,GAAG;QACR,YAAY,EAAE,EAAE;QAChB,YAAY,EAAE,KAAK;QACnB,SAAS,EAAE,GAAG;KACf;CACF,CAAC,CAAC;AAEH,0EAA0E;AAC1E,MAAM,CAAC,MAAM,mBAAmB,GAAG,cAAc,CAAC;AAElD,MAAM,UAAU,YAAY,CAAC,KAAyB;IACpD,MAAM,GAAG,GAAG,KAAK,IAAI,mBAAmB,CAAC;IACzC,MAAM,KAAK,GAAG,gBAAgB,CAAC,GAAG,CAAC,CAAC;IACpC,IAAI,CAAC,KAAK,EAAE,CAAC;QACX,MAAM,KAAK,GAAG,MAAM,CAAC,IAAI,CAAC,gBAAgB,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;QACvD,MAAM,IAAI,KAAK,CAAC,kCAAkC,GAAG,qBAAqB,KAAK,GAAG,CAAC,CAAC;IACtF,CAAC;IACD,OAAO,KAAK,CAAC;AACf,CAAC;AAUD,2EAA2E;AAC3E,2EAA2E;AAC3E,4EAA4E;AAC5E,IAAI,YAAY,GAA+D,IAAI,CAAC;AAEpF,KAAK,UAAU,YAAY;IACzB,IAAI,YAAY;QAAE,OAAO,YAAY,CAAC;IACtC,IAAI,CAAC;QACH,gEAAgE;QAChE,MAAM,GAAG,GAAG,CAAC,MAAM,MAAM,CAAC,2BAA2B,CAAC,CAErD,CAAC;QACF,IAAI,CAAC,GAAG,CAAC,QAAQ;YAAE,MAAM,IAAI,KAAK,CAAC,oDAAoD,CAAC,CAAC;QACzF,YAAY,GAAG,GAAG,CAAC,QAAQ,CAAC;QAC5B,OAAO,YAAY,CAAC;IACtB,CAAC;IAAC,OAAO,GAAG,EAAE,CAAC;QACb,MAAM,IAAI,KAAK,CACb,6HAA6H;YAC3H,iGAAiG;YACjG,mBAAmB,GAAG,YAAY,KAAK,CAAC,CAAC,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,EAAE,CACxE,CAAC;IACJ,CAAC;AACH,CAAC;AAED;;;;;GAKG;AACH,MAAM,CAAC,KAAK,UAAU,YAAY,CAAC,KAAc;IAC/C,MAAM,KAAK,GAAG,YAAY,CAAC,KAAK,CAAC,CAAC;IAClC,MAAM,QAAQ,GAAG,MAAM,YAAY,EAAE,CAAC;IACtC,MAAM,SAAS,GAAG,CAAC,MAAM,QAAQ,CAAC,oBAAoB,EAAE,KAAK,CAAC,IAAI,CAAC,CAGN,CAAC;IAE9D,wEAAwE;IACxE,wEAAwE;IACxE,yEAAyE;IACzE,wEAAwE;IACxE,0EAA0E;IAC1E,sEAAsE;IACtE,MAAM,kBAAkB,GAAG,CAAC,CAAC;IAE7B,MAAM,GAAG,GAAG,KAAK,CAAC,GAAG,CAAC;IACtB,OAAO;QACL,KAAK;QACL,KAAK
,CAAC,KAAK,CAAC,KAAwB;YAClC,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;gBAAE,OAAO,EAAE,CAAC;YAClC,MAAM,GAAG,GAAmB,EAAE,CAAC;YAC/B,oEAAoE;YACpE,gEAAgE;YAChE,KAAK,IAAI,UAAU,GAAG,CAAC,EAAE,UAAU,GAAG,KAAK,CAAC,MAAM,EAAE,UAAU,IAAI,kBAAkB,EAAE,CAAC;gBACrF,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC,UAAU,EAAE,UAAU,GAAG,kBAAkB,CAAC,CAAC;gBACvE,MAAM,MAAM,GAAG,MAAM,SAAS,CAAC,CAAC,GAAG,KAAK,CAAC,EAAE,EAAE,OAAO,EAAE,MAAM,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;gBACjF,IAAI,MAAM,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,GAAG,EAAE,CAAC;oBAC3B,MAAM,IAAI,KAAK,CACb,SAAS,KAAK,CAAC,IAAI,iBAAiB,MAAM,CAAC,IAAI,CAAC,CAAC,CAAC,cAAc,GAAG,sCAAsC,CAC1G,CAAC;gBACJ,CAAC;gBACD,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;oBACtC,MAAM,KAAK,GAAG,CAAC,GAAG,GAAG,CAAC;oBACtB,uEAAuE;oBACvE,GAAG,CAAC,IAAI,CAAC,IAAI,YAAY,CAAC,MAAM,CAAC,IAAI,CAAC,KAAK,CAAC,KAAK,EAAE,KAAK,GAAG,GAAG,CAAC,CAAC,CAAC,CAAC;gBACpE,CAAC;YACH,CAAC;YACD,OAAO,GAAG,CAAC;QACb,CAAC;KACF,CAAC;AACJ,CAAC;AAED,2EAA2E;AAC3E,MAAM,UAAU,SAAS,CAAC,CAAe,EAAE,CAAe;IACxD,IAAI,CAAC,CAAC,MAAM,KAAK,CAAC,CAAC,MAAM,EAAE,CAAC;QAC1B,MAAM,IAAI,KAAK,CAAC,wBAAwB,CAAC,CAAC,MAAM,OAAO,CAAC,CAAC,MAAM,EAAE,CAAC,CAAC;IACrE,CAAC;IACD,IAAI,CAAC,GAAG,CAAC,CAAC;IACV,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,CAAC,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;QAClC,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC;IACjC,CAAC;IACD,OAAO,CAAC,CAAC;AACX,CAAC;AAiCD,MAAM,CAAC,MAAM,eAAe,GAA4C,MAAM,CAAC,MAAM,CAAC;IACpF,6EAA6E;IAC7E,YAAY,EAAE;QACZ,KAAK,EAAE,YAAY;QACnB,IAAI,EAAE,0BAA0B;QAChC,YAAY,EAAE,GAAG;QACjB,YAAY,EAAE,KAAK;QACnB,SAAS,EAAE,GAAG;KACf;IACD,4EAA4E;IAC5E,wEAAwE;IACxE,uEAAuE;IACvE,wBAAwB;IACxB,qBAAqB,EAAE;QACrB,KAAK,EAAE,qBAAqB;QAC5B,IAAI,EAAE,+BAA+B;QACrC,YAAY,EAAE,EAAE;QAChB,YAAY,EAAE,IAAI;QAClB,SAAS,EAAE,GAAG;KACf;CACF,CAAC,CAAC;AAEH,MAAM,CAAC,MAAM,sBAAsB,GAAG,qBAAqB,CAAC;AAE5D,MAAM,UAAU,oBAAoB,CAAC,KAAyB;IAC5D,MAAM,GAAG,GAAG,KAAK,IAAI,sBAAsB,CAAC;IAC5C,MAAM,KAAK,GAAG,eAAe,CAAC,GAAG,CAAC,CAAC;IACnC,IAAI,CAAC,KAAK,EAAE,CAAC;QACX,
MAAM,KAAK,GAAG,MAAM,CAAC,IAAI,CAAC,eAAe,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;QACtD,MAAM,IAAI,KAAK,CAAC,iCAAiC,GAAG,qBAAqB,KAAK,GAAG,CAAC,CAAC;IACrF,CAAC;IACD,OAAO,KAAK,CAAC;AACf,CAAC;AAgBD;;;;;;;GAOG;AACH,MAAM,CAAC,KAAK,UAAU,YAAY,CAAC,KAAc;IAC/C,MAAM,KAAK,GAAG,oBAAoB,CAAC,KAAK,CAAC,CAAC;IAC1C,MAAM,QAAQ,GAAG,MAAM,YAAY,EAAE,CAAC;IACtC,MAAM,UAAU,GAAG,CAAC,MAAM,QAAQ,CAAC,qBAAqB,EAAE,KAAK,CAAC,IAAI,CAAC,CAGhB,CAAC;IAEtD,OAAO;QACL,KAAK;QACL,KAAK,CAAC,KAAK,CAAC,KAAa,EAAE,QAA2B;YACpD,IAAI,QAAQ,CAAC,MAAM,KAAK,CAAC;gBAAE,OAAO,EAAE,CAAC;YACrC,0DAA0D;YAC1D,8DAA8D;YAC9D,4BAA4B;YAC5B,MAAM,MAAM,GAAG,QAAQ,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,EAAE,IAAI,EAAE,KAAK,EAAE,SAAS,EAAE,CAAC,EAAE,CAAC,CAAC,CAAC;YACpE,+DAA+D;YAC/D,qEAAqE;YACrE,6DAA6D;YAC7D,MAAM,kBAAkB,GAAG,CAAC,CAAC;YAC7B,MAAM,GAAG,GAAa,EAAE,CAAC;YACzB,KAAK,IAAI,UAAU,GAAG,CAAC,EAAE,UAAU,GAAG,MAAM,CAAC,MAAM,EAAE,UAAU,IAAI,kBAAkB,EAAE,CAAC;gBACtF,MAAM,KAAK,GAAG,MAAM,CAAC,KAAK,CAAC,UAAU,EAAE,UAAU,GAAG,kBAAkB,CAAC,CAAC;gBACxE,MAAM,MAAM,GAAG,MAAM,UAAU,CAAC,KAAK,CAAC,CAAC;gBACvC,sEAAsE;gBACtE,qEAAqE;gBACrE,sDAAsD;gBACtD,MAAM,MAAM,GAAG,KAAK,CAAC,OAAO,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC;gBACzD,KAAK,MAAM,CAAC,IAAI,MAAM,EAAE,CAAC;oBACvB,IAAI,OAAO,CAAC,EAAE,KAAK,KAAK,QAAQ,EAAE,CAAC;wBACjC,GAAG,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC;oBACpB,CAAC;yBAAM,CAAC;wBACN,iEAAiE;wBACjE,2CAA2C;wBAC3C,GAAG,CAAC,IAAI,CAAC,CAAC,QAAQ,CAAC,CAAC;oBACtB,CAAC;gBACH,CAAC;YACH,CAAC;YACD,OAAO,GAAG,CAAC;QACb,CAAC;KACF,CAAC;AACJ,CAAC"}
package/dist/fts5.d.ts CHANGED
@@ -1,4 +1,6 @@
  export type TokenizeMode = "unicode61" | "trigram";
+ /** Content-source kind. v2.7.0 added `pdf`; v2.8.0 indexes them. */
+ export type ChunkKind = "md" | "pdf";
  export interface FtsSearchHit {
      rel_path: string;
      chunk_index: number;
@@ -6,6 +8,8 @@ export interface FtsSearchHit {
      line_end: number;
      snippet: string;
      score: number;
+     /** v2.8.0 — content-source kind. Defaults to "md" for backward compat. */
+     kind: ChunkKind;
  }
  export interface FtsSyncReport {
      added: number;
@@ -36,11 +40,18 @@ export declare class FtsIndex {
       * Diff the on-disk source_state against the live vault snapshot. Returns
       * categorized lists; caller is expected to feed `added` + `updated` paths
       * back into reindexFile() and pass `deleted` to dropFile().
+      *
+      * v2.8.0: optional `kind` filter — when set, the diff only considers
+      * source_state rows of that kind. Lets the markdown-sync and PDF-sync
+      * paths run independently against the same DB without one's "missing
+      * files" being mistakenly deleted by the other. Default `undefined`
+      * means "all kinds" (used by older callers + diff queries that want
+      * a global view).
       */
      diff(liveEntries: Array<{
          relPath: string;
          mtimeMs: number;
-     }>): {
+     }>, kind?: ChunkKind): {
          added: string[];
          updated: string[];
          deleted: string[];
@@ -48,8 +59,24 @@ export declare class FtsIndex {
      };
      /** Drop a file's chunks + state row. Idempotent. */
      dropFile(relPath: string): void;
-     /** Re-chunk a single file, replacing its existing chunks atomically. */
+     /** Re-chunk a single markdown file, replacing its existing chunks atomically. */
      reindexFile(relPath: string, mtimeMs: number, content: string, wikilinkTargets?: string[], tags?: string[]): number;
+     /**
+      * v2.8.0 — re-chunk a single PDF, replacing its existing chunks atomically.
+      * Caller pre-extracts page text via `extractPdfText` (src/pdf.ts) so this
+      * method stays decoupled from pdfjs-dist (which is an optionalDependency).
+      *
+      * Page boundaries are preserved as `[page: N]` markers in the joined text
+      * before chunking — the chunker may split a page across chunks or merge
+      * short pages, but the markers travel with the text so search snippets
+      * carry page citations. Same `chunkContent` pipeline as markdown so chunk
+      * IDs match across the BM25 / TF-IDF / embeddings rankers (RRF requires
+      * stable IDs).
+      */
+     reindexPdfFile(relPath: string, mtimeMs: number, pages: ReadonlyArray<{
+         pageNumber: number;
+         text: string;
+     }>): number;
      search(rawQuery: string, opts?: {
          limit?: number;
          folder?: string;
@@ -1 +1 @@
- {"version":3,"file":"fts5.d.ts","sourceRoot":"","sources":["../src/fts5.ts"],"names":[],"mappings":"AAwBA,MAAM,MAAM,YAAY,GAAG,WAAW,GAAG,SAAS,CAAC;AAEnD,MAAM,WAAW,YAAY;IAC3B,QAAQ,EAAE,MAAM,CAAC;IACjB,WAAW,EAAE,MAAM,CAAC;IACpB,UAAU,EAAE,MAAM,CAAC;IACnB,QAAQ,EAAE,MAAM,CAAC;IACjB,OAAO,EAAE,MAAM,CAAC;IAChB,KAAK,EAAE,MAAM,CAAC;CACf;AAED,MAAM,WAAW,aAAa;IAC5B,KAAK,EAAE,MAAM,CAAC;IACd,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,SAAS,EAAE,MAAM,CAAC;IAClB,YAAY,EAAE,MAAM,CAAC;CACtB;AA4DD,qBAAa,QAAQ;IACnB,OAAO,CAAC,EAAE,CAAmB;IAC7B,OAAO,CAAC,QAAQ,CAAC,IAAI,CAAS;IAC9B,OAAO,CAAC,QAAQ,CAAC,QAAQ,CAAe;IACxC,OAAO,CAAC,QAAQ,CAAC,SAAS,CAAS;gBAEvB,IAAI,EAAE;QAAE,IAAI,EAAE,MAAM,CAAC;QAAC,SAAS,EAAE,MAAM,CAAC;QAAC,QAAQ,CAAC,EAAE,YAAY,CAAA;KAAE;IAMxE,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC;IAiB3B,iEAAiE;IAC3D,WAAW,IAAI,OAAO,CAAC,OAAO,CAAC;IAcrC,KAAK,IAAI,IAAI;IAOb,OAAO,CAAC,eAAe;IAqDvB,OAAO,CAAC,QAAQ;IAQhB,OAAO,CAAC,SAAS;IAMjB,OAAO,CAAC,SAAS;IAKjB;;;;OAIG;IACH,IAAI,CAAC,WAAW,EAAE,KAAK,CAAC;QAAE,OAAO,EAAE,MAAM,CAAC;QAAC,OAAO,EAAE,MAAM,CAAA;KAAE,CAAC,GAAG;QAC9D,KAAK,EAAE,MAAM,EAAE,CAAC;QAChB,OAAO,EAAE,MAAM,EAAE,CAAC;QAClB,OAAO,EAAE,MAAM,EAAE,CAAC;QAClB,SAAS,EAAE,MAAM,EAAE,CAAC;KACrB;IAuBD,oDAAoD;IACpD,QAAQ,CAAC,OAAO,EAAE,MAAM,GAAG,IAAI;IAM/B,wEAAwE;IACxE,WAAW,CACT,OAAO,EAAE,MAAM,EACf,OAAO,EAAE,MAAM,EACf,OAAO,EAAE,MAAM,EACf,eAAe,GAAE,MAAM,EAAO,EAC9B,IAAI,GAAE,MAAM,EAAO,GAClB,MAAM;IA8BT,MAAM,CACJ,QAAQ,EAAE,MAAM,EAChB,IAAI,GAAE;QAAE,KAAK,CAAC,EAAE,MAAM,CAAC;QAAC,MAAM,CAAC,EAAE,MAAM,CAAC;QAAC,GAAG,CAAC,EAAE,MAAM,CAAC;QAAC,YAAY,CAAC,EAAE,MAAM,CAAA;KAAO,GAClF,YAAY,EAAE;IAyDjB;;;;;;;OAOG;IACH,QAAQ,CAAC,OAAO,EAAE,MAAM,EAAE,UAAU,EAAE,MAAM,GAAG;QAAE,OAAO,EAAE,MAAM,CAAC;QAAC,UAAU,EAAE,MAAM,CAAC;QAAC,QAAQ,EAAE,MAAM,CAAA;KAAE,GAAG,IAAI;IAQ/G,WAAW,IAAI,MAAM;IAMrB,UAAU,IAAI,MAAM;CAKrB;AAMD,wBAAgB,aAAa,CAAC,CAAC,EAAE,MAAM,GAAG,MAAM,CAe/C;AAED,UAAU,YAAY;IACpB,IAAI,EAAE,MAAM,CAAC;IACb,SAAS,EAAE,MAAM,CAAC;IAClB,OAAO,EAAE,MAAM,CAAC;IAChB;;;;+EAI2E;IAC3E,UAAU,EAAE,MAAM,CAAC;CACpB;AAID;;;;;;;;GAQG;AACH,wBAAgB,YAAY,
CAAC,OAAO,EAAE,MAAM,EAAE,QAAQ,SAAkB,GAAG,YAAY,EAAE,CAwDxF;AAuED,wBAAgB,gBAAgB,CAAC,SAAS,EAAE,MAAM,GAAG,MAAM,CAM1D"}
+ {"version":3,"file":"fts5.d.ts","sourceRoot":"","sources":["../src/fts5.ts"],"names":[],"mappings":"AA2BA,MAAM,MAAM,YAAY,GAAG,WAAW,GAAG,SAAS,CAAC;AAEnD,oEAAoE;AACpE,MAAM,MAAM,SAAS,GAAG,IAAI,GAAG,KAAK,CAAC;AAErC,MAAM,WAAW,YAAY;IAC3B,QAAQ,EAAE,MAAM,CAAC;IACjB,WAAW,EAAE,MAAM,CAAC;IACpB,UAAU,EAAE,MAAM,CAAC;IACnB,QAAQ,EAAE,MAAM,CAAC;IACjB,OAAO,EAAE,MAAM,CAAC;IAChB,KAAK,EAAE,MAAM,CAAC;IACd,0EAA0E;IAC1E,IAAI,EAAE,SAAS,CAAC;CACjB;AAED,MAAM,WAAW,aAAa;IAC5B,KAAK,EAAE,MAAM,CAAC;IACd,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,SAAS,EAAE,MAAM,CAAC;IAClB,YAAY,EAAE,MAAM,CAAC;CACtB;AA4DD,qBAAa,QAAQ;IACnB,OAAO,CAAC,EAAE,CAAmB;IAC7B,OAAO,CAAC,QAAQ,CAAC,IAAI,CAAS;IAC9B,OAAO,CAAC,QAAQ,CAAC,QAAQ,CAAe;IACxC,OAAO,CAAC,QAAQ,CAAC,SAAS,CAAS;gBAEvB,IAAI,EAAE;QAAE,IAAI,EAAE,MAAM,CAAC;QAAC,SAAS,EAAE,MAAM,CAAC;QAAC,QAAQ,CAAC,EAAE,YAAY,CAAA;KAAE;IAMxE,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC;IAiB3B,iEAAiE;IAC3D,WAAW,IAAI,OAAO,CAAC,OAAO,CAAC;IAcrC,KAAK,IAAI,IAAI;IAOb,OAAO,CAAC,eAAe;IAuDvB,OAAO,CAAC,QAAQ;IAQhB,OAAO,CAAC,SAAS;IAMjB,OAAO,CAAC,SAAS;IAKjB;;;;;;;;;;;OAWG;IACH,IAAI,CACF,WAAW,EAAE,KAAK,CAAC;QAAE,OAAO,EAAE,MAAM,CAAC;QAAC,OAAO,EAAE,MAAM,CAAA;KAAE,CAAC,EACxD,IAAI,CAAC,EAAE,SAAS,GACf;QACD,KAAK,EAAE,MAAM,EAAE,CAAC;QAChB,OAAO,EAAE,MAAM,EAAE,CAAC;QAClB,OAAO,EAAE,MAAM,EAAE,CAAC;QAClB,SAAS,EAAE,MAAM,EAAE,CAAC;KACrB;IA0BD,oDAAoD;IACpD,QAAQ,CAAC,OAAO,EAAE,MAAM,GAAG,IAAI;IAM/B,iFAAiF;IACjF,WAAW,CACT,OAAO,EAAE,MAAM,EACf,OAAO,EAAE,MAAM,EACf,OAAO,EAAE,MAAM,EACf,eAAe,GAAE,MAAM,EAAO,EAC9B,IAAI,GAAE,MAAM,EAAO,GAClB,MAAM;IA8BT;;;;;;;;;;;OAWG;IACH,cAAc,CAAC,OAAO,EAAE,MAAM,EAAE,OAAO,EAAE,MAAM,EAAE,KAAK,EAAE,aAAa,CAAC;QAAE,UAAU,EAAE,MAAM,CAAC;QAAC,IAAI,EAAE,MAAM,CAAA;KAAE,CAAC,GAAG,MAAM;IAsBpH,MAAM,CACJ,QAAQ,EAAE,MAAM,EAChB,IAAI,GAAE;QAAE,KAAK,CAAC,EAAE,MAAM,CAAC;QAAC,MAAM,CAAC,EAAE,MAAM,CAAC;QAAC,GAAG,CAAC,EAAE,MAAM,CAAC;QAAC,YAAY,CAAC,EAAE,MAAM,CAAA;KAAO,GAClF,YAAY,EAAE;IA+DjB;;;;;;;OAOG;IACH,QAAQ,CAAC,OAAO,EAAE,MAAM,EAAE,UAAU,EAAE,MAAM,GAAG;QAAE,OAAO,EAAE,MAAM,CAAC;QAAC,UAAU,EAAE,MAAM,CAAC;QAAC,QAAQ,EAAE,MAA
M,CAAA;KAAE,GAAG,IAAI;IAQ/G,WAAW,IAAI,MAAM;IAMrB,UAAU,IAAI,MAAM;CAKrB;AAMD,wBAAgB,aAAa,CAAC,CAAC,EAAE,MAAM,GAAG,MAAM,CAe/C;AAED,UAAU,YAAY;IACpB,IAAI,EAAE,MAAM,CAAC;IACb,SAAS,EAAE,MAAM,CAAC;IAClB,OAAO,EAAE,MAAM,CAAC;IAChB;;;;+EAI2E;IAC3E,UAAU,EAAE,MAAM,CAAC;CACpB;AAID;;;;;;;;GAQG;AACH,wBAAgB,YAAY,CAAC,OAAO,EAAE,MAAM,EAAE,QAAQ,SAAkB,GAAG,YAAY,EAAE,CAwDxF;AAuED,wBAAgB,gBAAgB,CAAC,SAAS,EAAE,MAAM,GAAG,MAAM,CAM1D"}
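The doc comment on `FtsIndex.diff` explains that the optional `kind` filter keeps the markdown sync and the PDF sync from deleting each other's files. A minimal in-memory sketch of that idea follows. It is not the package's implementation: `StateRow`, `LiveEntry`, and `diffByKind` are hypothetical names, the stored rows stand in for the SQLite `source_state` table, only the three result lists visible in the hunk are returned, and treating any mtime difference as "updated" is an assumption.

```typescript
type ChunkKind = "md" | "pdf";

// Hypothetical stand-in for a `source_state` row.
interface StateRow {
  relPath: string;
  mtimeMs: number;
  kind: ChunkKind;
}

// Same shape as the `liveEntries` parameter in the declaration.
interface LiveEntry {
  relPath: string;
  mtimeMs: number;
}

function diffByKind(
  stored: readonly StateRow[],
  live: readonly LiveEntry[],
  kind?: ChunkKind
): { added: string[]; updated: string[]; deleted: string[] } {
  // With a kind filter, rows of other kinds are invisible to this diff,
  // so they can never be reported as deleted.
  const scoped = kind === undefined ? stored : stored.filter((r) => r.kind === kind);
  const storedByPath = new Map<string, number>();
  for (const r of scoped) storedByPath.set(r.relPath, r.mtimeMs);
  const livePaths = new Set(live.map((e) => e.relPath));

  const added: string[] = [];
  const updated: string[] = [];
  const deleted: string[] = [];
  for (const e of live) {
    const prev = storedByPath.get(e.relPath);
    if (prev === undefined) added.push(e.relPath);
    else if (prev !== e.mtimeMs) updated.push(e.relPath); // assumption: any mtime change counts
  }
  for (const r of scoped) {
    if (!livePaths.has(r.relPath)) deleted.push(r.relPath);
  }
  return { added, updated, deleted };
}
```

Calling the function with a markdown-only `live` list and `kind` set to `"md"` leaves stored PDF rows alone, whereas the same call without `kind` would report them as deleted.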