nlm-memory 0.5.0 → 0.5.2

Files changed (257)
  1. package/README.md +89 -34
  2. package/dist/cli/digest.d.ts +20 -0
  3. package/dist/cli/digest.js +142 -0
  4. package/dist/cli/digest.js.map +1 -0
  5. package/dist/cli/nlm.d.ts +1 -0
  6. package/dist/cli/nlm.js +25 -1
  7. package/dist/cli/nlm.js.map +1 -1
  8. package/dist/core/digest/compose.d.ts +38 -0
  9. package/dist/core/digest/compose.js +93 -0
  10. package/dist/core/digest/compose.js.map +1 -0
  11. package/dist/core/digest/hook-liveness.d.ts +32 -0
  12. package/dist/core/digest/hook-liveness.js +54 -0
  13. package/dist/core/digest/hook-liveness.js.map +1 -0
  14. package/dist/http/app.js +2 -1
  15. package/dist/http/app.js.map +1 -1
  16. package/dist/mcp/server.js +20 -1
  17. package/dist/mcp/server.js.map +1 -1
  18. package/dist/ui/assets/{index-C8cpwbYJ.css → index-Beo8psd-.css} +1 -1
  19. package/dist/ui/assets/{index-CB50QnL-.js → index-CSPTTeeM.js} +8 -8
  20. package/dist/ui/index.html +2 -2
  21. package/package.json +26 -1
  22. package/.agents/plugins/marketplace.json +0 -20
  23. package/.github/workflows/ci.yml +0 -30
  24. package/docs/methodology/re-derivation-rate.md +0 -112
  25. package/docs/methodology/useful-hit-rate.md +0 -79
  26. package/docs/plans/2026-05-20-fts5-lexical-recall.md +0 -1088
  27. package/docs/plans/2026-05-20-recall-daemon-wedge-fix.md +0 -662
  28. package/docs/plans/2026-05-20-recall-hook-design.md +0 -131
  29. package/docs/plans/2026-05-20-recall-hook-implementation.md +0 -1222
  30. package/docs/plans/desktop-product.md +0 -69
  31. package/docs/plans/factstore-design.md +0 -236
  32. package/logs/CHANGELOG/CHANGELOG-2026.md +0 -1575
  33. package/logs/CHANGELOG/CHANGELOG.md +0 -209
  34. package/migrations/000_initial_schema.sql +0 -174
  35. package/migrations/001_entity_type_rename.sql +0 -17
  36. package/migrations/002_adapter_state_extend.sql +0 -12
  37. package/migrations/003_session_embeddings.sql +0 -11
  38. package/migrations/004_facts.sql +0 -46
  39. package/migrations/005_sources.sql +0 -31
  40. package/migrations/006_providers.sql +0 -33
  41. package/migrations/007_source_tokens.sql +0 -17
  42. package/migrations/008_fts_rebuild.sql +0 -9
  43. package/migrations/009_session_embedding_chunks.sql +0 -46
  44. package/migrations/010_sources_opencode.sql +0 -30
  45. package/migrations/011_sources_hermes_agent.sql +0 -30
  46. package/migrations/012_sources_aider.sql +0 -30
  47. package/migrations/013_adapter_state_failure_count.sql +0 -12
  48. package/migrations/014_sources_cursor.sql +0 -30
  49. package/migrations/015_sources_windsurf.sql +0 -30
  50. package/plugin-hermes-agent/README.md +0 -49
  51. package/plugin-hermes-agent/__init__.py +0 -75
  52. package/plugin-hermes-agent/plugin.yaml +0 -15
  53. package/scripts/backfill-citations.mjs +0 -0
  54. package/scripts/build-codex-plugin.mjs +0 -61
  55. package/scripts/deepseek-probe.mjs +0 -67
  56. package/scripts/extract-triples.mjs +0 -207
  57. package/scripts/longmemeval/embedding-cache.ts +0 -77
  58. package/scripts/longmemeval/fetch-dataset.sh +0 -25
  59. package/scripts/longmemeval/run-harness.ts +0 -315
  60. package/scripts/longmemeval/scorer.ts +0 -99
  61. package/scripts/longmemeval/tsconfig.json +0 -9
  62. package/scripts/longmemeval/types.ts +0 -35
  63. package/scripts/nlm-daily-digest.py +0 -239
  64. package/scripts/nlm-daily-digest.sh +0 -28
  65. package/src/cli/classify-parity.ts +0 -257
  66. package/src/cli/launchctl-helpers.ts +0 -49
  67. package/src/cli/nlm.ts +0 -1078
  68. package/src/core/actions/actions-log.ts +0 -118
  69. package/src/core/actions/overlay.ts +0 -117
  70. package/src/core/adapters/aider.ts +0 -205
  71. package/src/core/adapters/claude-code.ts +0 -293
  72. package/src/core/adapters/common.ts +0 -54
  73. package/src/core/adapters/cursor.ts +0 -486
  74. package/src/core/adapters/from-source.ts +0 -67
  75. package/src/core/adapters/hermes-agent.ts +0 -240
  76. package/src/core/adapters/hermes.ts +0 -277
  77. package/src/core/adapters/jsonl-generic.ts +0 -208
  78. package/src/core/adapters/opencode.ts +0 -281
  79. package/src/core/adapters/pi.ts +0 -264
  80. package/src/core/adapters/windsurf.ts +0 -386
  81. package/src/core/classifier/prompt.ts +0 -200
  82. package/src/core/dataset/build-dataset.ts +0 -463
  83. package/src/core/embedding/chunk-body.ts +0 -76
  84. package/src/core/embedding/embed-backfill.ts +0 -210
  85. package/src/core/embedding/embed-normalize.ts +0 -135
  86. package/src/core/facts/backfill-facts.ts +0 -254
  87. package/src/core/facts/extract-facts.ts +0 -50
  88. package/src/core/hook/citation-detect.ts +0 -124
  89. package/src/core/hook/cite-memo.ts +0 -68
  90. package/src/core/hook/claude-settings.ts +0 -187
  91. package/src/core/hook/gate.ts +0 -25
  92. package/src/core/hook/hook-log.ts +0 -41
  93. package/src/core/hook/memo-sweep.ts +0 -164
  94. package/src/core/hook/memo.ts +0 -67
  95. package/src/core/hook/pointer-block.ts +0 -26
  96. package/src/core/hook/select.ts +0 -32
  97. package/src/core/hook/transcript.ts +0 -121
  98. package/src/core/ingest/ingest-session.ts +0 -111
  99. package/src/core/providers/provider-models.ts +0 -100
  100. package/src/core/providers/provider-registry.ts +0 -196
  101. package/src/core/recall/citation-log.ts +0 -108
  102. package/src/core/recall/filter.ts +0 -27
  103. package/src/core/recall/index.ts +0 -6
  104. package/src/core/recall/match-fields.ts +0 -40
  105. package/src/core/recall/query-log.ts +0 -149
  106. package/src/core/recall/query-shape.ts +0 -66
  107. package/src/core/recall/recall-service.ts +0 -320
  108. package/src/core/recall/recent-log.ts +0 -59
  109. package/src/core/recall/tokenize.ts +0 -18
  110. package/src/core/recall/useful-scan.ts +0 -336
  111. package/src/core/recall-facts/fact-query-log.ts +0 -150
  112. package/src/core/recall-facts/fact-recall-service.ts +0 -327
  113. package/src/core/scheduler/scan-once.ts +0 -142
  114. package/src/core/scheduler/scheduler.ts +0 -225
  115. package/src/core/sources/source-registry.ts +0 -278
  116. package/src/core/storage/db-restore.ts +0 -133
  117. package/src/core/storage/live-status.ts +0 -45
  118. package/src/core/storage/migrate.ts +0 -72
  119. package/src/core/storage/sqlite-fact-store.ts +0 -304
  120. package/src/core/storage/sqlite-session-store.ts +0 -810
  121. package/src/hook/hook-auth.ts +0 -18
  122. package/src/hook/prompt-recall-hook.ts +0 -180
  123. package/src/hook/session-end-hook.ts +0 -81
  124. package/src/hook/session-start-hook.ts +0 -168
  125. package/src/hook/stop-hook.ts +0 -239
  126. package/src/http/app.ts +0 -1215
  127. package/src/install/claude-code.ts +0 -128
  128. package/src/install/codex.ts +0 -367
  129. package/src/install/cursor.ts +0 -68
  130. package/src/install/hermes-agent.ts +0 -76
  131. package/src/install/hermes.ts +0 -78
  132. package/src/install/nlm-dir-perms.ts +0 -55
  133. package/src/install/ollama.ts +0 -284
  134. package/src/install/setup.ts +0 -489
  135. package/src/install/windsurf.ts +0 -68
  136. package/src/llm/classifier-box.ts +0 -64
  137. package/src/llm/deepseek-client.ts +0 -150
  138. package/src/llm/env-autoload.ts +0 -55
  139. package/src/llm/ollama-client.ts +0 -189
  140. package/src/mcp/server.ts +0 -534
  141. package/src/ports/fact-store.ts +0 -102
  142. package/src/ports/llm-client.ts +0 -52
  143. package/src/ports/logger.ts +0 -16
  144. package/src/ports/session-store.ts +0 -45
  145. package/src/ports/transcript-adapter.ts +0 -55
  146. package/src/shared/types.ts +0 -149
  147. package/src/ui/App.tsx +0 -58
  148. package/src/ui/components/PromoteOpenButton.tsx +0 -65
  149. package/src/ui/components/SessionDrawer.tsx +0 -199
  150. package/src/ui/components/SideNav.tsx +0 -162
  151. package/src/ui/components/Skeleton.tsx +0 -107
  152. package/src/ui/index.html +0 -13
  153. package/src/ui/lib/actions.ts +0 -30
  154. package/src/ui/lib/api.ts +0 -92
  155. package/src/ui/lib/dataset.ts +0 -141
  156. package/src/ui/lib/registries.ts +0 -155
  157. package/src/ui/lib/view-settings.ts +0 -41
  158. package/src/ui/main.tsx +0 -15
  159. package/src/ui/pages/Live.tsx +0 -229
  160. package/src/ui/pages/Pulse.tsx +0 -415
  161. package/src/ui/pages/Recall.tsx +0 -190
  162. package/src/ui/pages/River.tsx +0 -354
  163. package/src/ui/pages/Search.tsx +0 -386
  164. package/src/ui/pages/Stub.tsx +0 -9
  165. package/src/ui/pages/Thread.tsx +0 -473
  166. package/src/ui/pages/settings/Classifier.tsx +0 -227
  167. package/src/ui/pages/settings/Data.tsx +0 -190
  168. package/src/ui/pages/settings/Index.tsx +0 -65
  169. package/src/ui/pages/settings/Labels.tsx +0 -224
  170. package/src/ui/pages/settings/Providers.tsx +0 -305
  171. package/src/ui/pages/settings/SettingsSubnav.tsx +0 -28
  172. package/src/ui/pages/settings/Sources.tsx +0 -326
  173. package/src/ui/pages/settings/Views.tsx +0 -96
  174. package/src/ui/styles.css +0 -1890
  175. package/src/ui/tsconfig.json +0 -21
  176. package/src/ui/vite.config.ts +0 -19
  177. package/tests/fixtures/claude_code/short_session.jsonl +0 -2
  178. package/tests/fixtures/claude_code/standard_iso.jsonl +0 -4
  179. package/tests/fixtures/claude_code/tool_heavy.jsonl +0 -8
  180. package/tests/fixtures/claude_code/with_subagent.jsonl +0 -7
  181. package/tests/fixtures/facts.ts +0 -17
  182. package/tests/fixtures/golden-corpus.ts +0 -85
  183. package/tests/fixtures/hermes/paired_request_dump.json +0 -24
  184. package/tests/fixtures/hermes/paired_session.json +0 -23
  185. package/tests/fixtures/hermes/request_dump.json +0 -28
  186. package/tests/fixtures/hermes/session_iso.json +0 -38
  187. package/tests/fixtures/hermes/session_unix.json +0 -38
  188. package/tests/fixtures/hermes/system_only.json +0 -18
  189. package/tests/fixtures/pi/error-connection-abort.jsonl +0 -8
  190. package/tests/fixtures/pi/short-successful.jsonl +0 -5
  191. package/tests/fixtures/pi/with-custom-message.jsonl +0 -6
  192. package/tests/fixtures/sessions.ts +0 -22
  193. package/tests/integration/backfill-facts.test.ts +0 -362
  194. package/tests/integration/citation-explicit.test.ts +0 -111
  195. package/tests/integration/cite-event.test.ts +0 -169
  196. package/tests/integration/cite-memo.test.ts +0 -87
  197. package/tests/integration/db-restore.test.ts +0 -153
  198. package/tests/integration/embed-backfill.test.ts +0 -176
  199. package/tests/integration/fact-supersedence.test.ts +0 -313
  200. package/tests/integration/fts-index.test.ts +0 -60
  201. package/tests/integration/getbyids-sqlite.test.ts +0 -100
  202. package/tests/integration/hermes-agent-hooks.test.ts +0 -248
  203. package/tests/integration/hook-claude-settings.test.ts +0 -218
  204. package/tests/integration/hook-log.test.ts +0 -54
  205. package/tests/integration/hook-memo.test.ts +0 -68
  206. package/tests/integration/hook-pre-compact.test.ts +0 -105
  207. package/tests/integration/hook-subagent-start.test.ts +0 -102
  208. package/tests/integration/http.test.ts +0 -401
  209. package/tests/integration/keyword-search-fts.test.ts +0 -66
  210. package/tests/integration/mcp-recall-logging.test.ts +0 -88
  211. package/tests/integration/mcp.test.ts +0 -260
  212. package/tests/integration/memo-sweep.test.ts +0 -91
  213. package/tests/integration/prompt-recall-hook.test.ts +0 -88
  214. package/tests/integration/provider-registry.test.ts +0 -107
  215. package/tests/integration/recall-golden.test.ts +0 -59
  216. package/tests/integration/recall-sqlite.test.ts +0 -169
  217. package/tests/integration/scheduler.test.ts +0 -391
  218. package/tests/integration/session-end-hook.test.ts +0 -48
  219. package/tests/integration/session-start-hook.test.ts +0 -126
  220. package/tests/integration/source-registry.test.ts +0 -122
  221. package/tests/integration/sqlite-fact-store.test.ts +0 -346
  222. package/tests/integration/stop-hook.test.ts +0 -560
  223. package/tests/integration/wal-checkpoint.test.ts +0 -49
  224. package/tests/unit/cli/launchctl-helpers.test.ts +0 -60
  225. package/tests/unit/core/adapters/aider.test.ts +0 -230
  226. package/tests/unit/core/adapters/claude-code.test.ts +0 -118
  227. package/tests/unit/core/adapters/cursor.test.ts +0 -485
  228. package/tests/unit/core/adapters/hermes-agent.test.ts +0 -329
  229. package/tests/unit/core/adapters/hermes.test.ts +0 -81
  230. package/tests/unit/core/adapters/jsonl-generic.test.ts +0 -142
  231. package/tests/unit/core/adapters/opencode.test.ts +0 -354
  232. package/tests/unit/core/adapters/pi.test.ts +0 -110
  233. package/tests/unit/core/adapters/windsurf.test.ts +0 -416
  234. package/tests/unit/core/classifier/prompt.test.ts +0 -126
  235. package/tests/unit/core/embedding/chunk-body.test.ts +0 -100
  236. package/tests/unit/core/facts/extract-facts.test.ts +0 -117
  237. package/tests/unit/core/filter.test.ts +0 -40
  238. package/tests/unit/core/hook/citation-detect-cite-session.test.ts +0 -96
  239. package/tests/unit/core/hook/citation-detect.test.ts +0 -124
  240. package/tests/unit/core/hook/gate.test.ts +0 -29
  241. package/tests/unit/core/hook/pointer-block.test.ts +0 -22
  242. package/tests/unit/core/hook/select.test.ts +0 -66
  243. package/tests/unit/core/match-fields.test.ts +0 -39
  244. package/tests/unit/core/mcp-cite-session.test.ts +0 -51
  245. package/tests/unit/core/providers/provider-models.test.ts +0 -101
  246. package/tests/unit/core/query-shape.test.ts +0 -92
  247. package/tests/unit/core/recall-facts/fact-recall-service.test.ts +0 -258
  248. package/tests/unit/core/recall-service.test.ts +0 -200
  249. package/tests/unit/core/storage/live-status.test.ts +0 -54
  250. package/tests/unit/core/tokenize.test.ts +0 -32
  251. package/tests/unit/core/useful-scan.test.ts +0 -537
  252. package/tests/unit/llm/embed.test.ts +0 -93
  253. package/tests/unit/llm/ollama-client.test.ts +0 -124
  254. package/tests/unit/scripts/longmemeval-scorer.test.ts +0 -114
  255. package/tsconfig.json +0 -31
  256. package/tsconfig.test.json +0 -11
  257. package/vitest.config.ts +0 -22
@@ -1,1088 +0,0 @@
1
- # FTS5 Lexical Recall Upgrade — Implementation Plan
2
-
3
- > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
-
5
- **Goal:** Replace NLM's in-memory token-intersection keyword scorer with a SQLite FTS5 BM25-ranked lexical search, behind the existing `SessionStore` port, without regressing recall quality.
6
-
7
- **Architecture:** The keyword leg of recall moves from a pure core function (`scoreKeyword`, loads every session into memory and scores by token overlap) to a new `SessionStore.keywordSearch` port method backed by the FTS5 virtual table `sessions_fts`. This mirrors how the semantic leg already works (`semanticSearch` → sqlite-vec). `RecallService` keeps orchestrating filter + merge; it just sources keyword hits from the store instead of computing them. The byte-for-byte parity test suite (pinned to a retired Python scorer) is replaced — *before any production code changes* — by a tolerant golden-set recall-quality test that must stay green through the swap. That golden test is the regression net.
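To make the contrast concrete, here is a minimal sketch of the kind of in-memory token-overlap scoring the Architecture paragraph describes as being retired. The `MiniSession` shape, the `tokenize` rule, and all function names are illustrative assumptions; this is not NLM's actual `scoreKeyword` code.

```typescript
// Hypothetical session shape, reduced to the text fields that matter here.
interface MiniSession {
  id: string;
  label: string;
  summary: string;
  body: string;
}

// Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Token-overlap score: the number of distinct query tokens found in the session.
function overlapScore(queryTokens: string[], session: MiniSession): number {
  const haystack = new Set(tokenize(`${session.label} ${session.summary} ${session.body}`));
  const distinctQueryTokens = Array.from(new Set(queryTokens));
  return distinctQueryTokens.filter((token) => haystack.has(token)).length;
}

// The in-memory approach: score every session, drop misses, sort best first.
function scoreAllInMemory(
  query: string,
  sessions: MiniSession[],
): Array<{ id: string; score: number }> {
  const queryTokens = tokenize(query);
  return sessions
    .map((session) => ({ id: session.id, score: overlapScore(queryTokens, session) }))
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score);
}
```

Every query in this sketch walks the full session list, which is the cost the plan moves out of the core and into an index lookup behind `SessionStore.keywordSearch`.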
8
-
9
- **Key design decision (documented tradeoff):** `sessions_fts` already exists in migration 000 with columns `(label, summary, body)` and sync triggers — it is created and maintained, just never queried. We reuse it as-is rather than adding dedicated `decisions`/`open` FTS columns. Decision and open-question text already lives inside `body` (markers are extracted *from* the body markdown), so BM25 over `(label, summary, body)` covers the same text the old scorer covered. What changes: decision/open lines get `body` column weight rather than the old explicit 2x. BM25's IDF term-rarity weighting compensates. The `matchedIn` badges stay accurate — they are computed in `RecallService` from the resolved `Session` object (which has `decisions[]`/`open[]` from the `markers` table), not from FTS. No FTS schema change, no `MatchField` type change.
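The claim that IDF weighting compensates for the lost 2x boost can be illustrated with the textbook Okapi BM25 term score. This is a sketch under assumptions: the constants `K1 = 1.2` and `B = 0.75` are the commonly cited defaults, and the formula is the standard published form, which may differ in detail from what SQLite computes internally.

```typescript
// Commonly cited BM25 defaults (assumed here, not read from NLM or SQLite).
const K1 = 1.2;
const B = 0.75;

// Inverse document frequency: terms found in few documents get a larger weight.
function idf(totalDocs: number, docsWithTerm: number): number {
  return Math.log(1 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
}

// Score contribution of one query term inside one document.
function termScore(
  termFreq: number,
  docLength: number,
  avgDocLength: number,
  totalDocs: number,
  docsWithTerm: number,
): number {
  const lengthNorm = 1 - B + B * (docLength / avgDocLength);
  return (idf(totalDocs, docsWithTerm) * (termFreq * (K1 + 1))) / (termFreq + K1 * lengthNorm);
}

// Same term frequency and document length; only the rarity differs.
const rareTerm = termScore(1, 100, 100, 1000, 3);
const commonTerm = termScore(1, 100, 100, 1000, 600);
console.log(rareTerm > commonTerm); // true
```

A distinctive word in a decision line therefore still contributes strongly to the score even though the line no longer carries an explicit multiplier.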
10
-
11
- **Tech Stack:** TypeScript, better-sqlite3, SQLite FTS5 (built in), vitest. Hexagonal architecture — `core/` depends on the `SessionStore` port; `SqliteSessionStore` is the adapter.
12
-
13
- **Branch:** Create and work on `feat/fts5-lexical-recall` off `main`.
14
-
15
- **Out of scope:** pgvector (stays the optional power-tier swap, open task #96). The vector leg (`semanticSearch` / sqlite-vec) is untouched.
16
-
17
- ---
18
-
19
- ## File Structure
20
-
21
- **Created:**
22
- - `tests/fixtures/golden-corpus.ts` — fixed 8-session corpus with realistic `body`, used by the golden recall test.
23
- - `tests/integration/recall-golden.test.ts` — the regression gate: query → expected top-3 assertions, run through `RecallService` + real `SqliteSessionStore`.
24
- - `migrations/008_fts_rebuild.sql` — one-time `INSERT INTO sessions_fts(sessions_fts) VALUES('rebuild')` safety rebuild.
25
- - `tests/integration/fts-index.test.ts` — asserts `sessions_fts` is populated and synced after migrations + inserts.
26
- - `tests/integration/keyword-search-fts.test.ts` — exercises `SqliteSessionStore.keywordSearch` directly (ranking + FTS-syntax safety).
27
- - `src/core/recall/match-fields.ts` — pure helper computing `matchedIn` fields for a session against query tokens (the `matchedIn` half of the old `scoreKeyword`).
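The "FTS-syntax safety" concern in the list above arises because FTS5 treats characters such as `"`, `*`, `(`, and words like `NEAR` or `OR` as query syntax. One common mitigation, sketched below, is to wrap each user token in double quotes and double any embedded quote, which is how FTS5 string literals are escaped. The function name `toFtsMatchExpression` and the choice to join tokens with `OR` are assumptions for illustration; the plan's actual `keywordSearch` implementation is not shown in this excerpt.

```typescript
// Turn free-form user input into an FTS5 MATCH expression made only of
// quoted string literals, so no user character is parsed as an operator.
// Returns null when the input has no usable tokens.
function toFtsMatchExpression(userQuery: string): string | null {
  const tokens = userQuery.split(/\s+/).filter((token) => token.length > 0);
  if (tokens.length === 0) return null;
  return tokens
    .map((token) => `"${token.replace(/"/g, '""')}"`) // double embedded quotes
    .join(" OR "); // assumed: any-token match, leaving ranking to BM25
}

console.log(toFtsMatchExpression("Hono router")); // "Hono" OR "router"
```

Returning `null` for blank input lets a caller skip the query entirely instead of sending FTS5 an empty expression.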
28
-
29
- **Modified:**
30
- - `src/ports/session-store.ts` — add `KeywordNeighbor` interface + `keywordSearch` method.
31
- - `src/core/storage/sqlite-session-store.ts` — implement `keywordSearch`.
32
- - `src/core/recall/recall-service.ts` — keyword + hybrid legs call `store.keywordSearch`; add `runKeyword`; delete `scoreAll`.
33
- - `src/core/recall/index.ts` — drop `scoreKeyword` export; add `keywordMatchFields` export.
34
- - `tests/unit/core/recall-service.test.ts` — `InMemoryStore` fake implements `keywordSearch`; keyword/hybrid tests feed pre-baked hits.
35
- - `tests/integration/recall-sqlite.test.ts` — populate `body` on seed sessions so FTS keyword recall works.
36
-
37
- **Deleted:**
38
- - `src/core/recall/score-keyword.ts` — ranking moves to FTS; `matchedIn` logic moves to `match-fields.ts`.
39
- - `tests/unit/core/score-keyword.test.ts` — byte-parity suite, replaced by the golden test.
40
-
41
- **Kept (do not delete):**
42
- - `src/core/recall/tokenize.ts` — still imported by `src/core/recall-facts/fact-recall-service.ts` and reused by `keywordSearch` + `match-fields.ts`.
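The split described in this file structure keeps ranking in the store and `matchedIn` computation in a pure helper. A minimal sketch of such a helper follows. The `MatchedField` union, the `ResolvedSession` shape, and the substring-based matching are assumptions for illustration; the real `match-fields.ts` and `MatchField` type are not shown in this excerpt.

```typescript
// Hypothetical field names and session shape, for illustration only.
type MatchedField = "label" | "summary" | "decisions" | "open";

interface ResolvedSession {
  label: string;
  summary: string;
  decisions: string[];
  open: string[];
}

// True when any non-empty token occurs in the text (case-insensitive).
// Simplification: substring match rather than whole-token match.
function containsAnyToken(text: string, tokens: string[]): boolean {
  const lower = text.toLowerCase();
  return tokens.some((token) => token.length > 0 && lower.includes(token.toLowerCase()));
}

// Report which fields of an already-resolved session mention the query tokens.
function matchedFields(session: ResolvedSession, queryTokens: string[]): MatchedField[] {
  const fields: MatchedField[] = [];
  if (containsAnyToken(session.label, queryTokens)) fields.push("label");
  if (containsAnyToken(session.summary, queryTokens)) fields.push("summary");
  if (containsAnyToken(session.decisions.join(" "), queryTokens)) fields.push("decisions");
  if (containsAnyToken(session.open.join(" "), queryTokens)) fields.push("open");
  return fields;
}
```

Because the helper reads only the resolved session object, it works the same whether the hit came from FTS5 or from a test fake.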
43
-
44
- ---
45
-
46
- ## Task 1: Golden-set recall regression test (the gate)
47
-
48
- Write the regression net **first**, against the current (unchanged) in-memory scorer. It must pass now and stay green through every later task. Assertions are tolerant — "expected session in top 3" — so they survive the ranking-algorithm change from token-overlap to BM25.
49
-
50
- **Files:**
51
- - Create: `tests/fixtures/golden-corpus.ts`
52
- - Create: `tests/integration/recall-golden.test.ts`
53
-
54
- - [ ] **Step 1: Create the golden corpus fixture**
55
-
56
- Create `tests/fixtures/golden-corpus.ts`:
57
-
58
- ```typescript
59
- import type { Session } from "../../src/shared/types.js";
60
- import { makeSession } from "./sessions.js";
61
-
62
- /**
63
- * A fixed, realistic corpus for recall-quality regression testing.
64
- * `body` is populated on every session because FTS5 keyword search indexes
65
- * label + summary + body. Decision/open text is also present in `body`
66
- * (mirrors production: markers are extracted from the body markdown).
67
- */
68
- export const GOLDEN_CORPUS: ReadonlyArray<Session> = [
69
- makeSession({
70
- id: "g_fts",
71
- label: "FTS5 vs pgvector for recall search backend",
72
- summary: "Compared SQLite FTS5 lexical search against pgvector for the recall layer",
73
- body: "Evaluated FTS5 versus pgvector. FTS5 ships with SQLite and stays zero-config. pgvector needs Postgres running which breaks the five-minute install.",
74
- decisions: ["Use FTS5 for the lexical recall leg"],
75
- entities: ["NLM"],
76
- }),
77
- makeSession({
78
- id: "g_hono",
79
- label: "Hono router setup on port 3940",
80
- summary: "Wired the Hono HTTP router and mounted the recall API",
81
- body: "Set up Hono as the HTTP framework. Mounted routes for recall, sessions, and the live dashboard on port 3940.",
82
- decisions: ["Chose Hono over Express for HTTP routing"],
83
- entities: ["NLM"],
84
- }),
85
- makeSession({
86
- id: "g_pgvector",
87
- label: "pgvector migration plan for the power tier",
88
- summary: "Sketched the Postgres mirror behind the SessionStore port",
89
- body: "Planned a PostgresSessionStore satisfying the same port as SqliteSessionStore. pgvector handles the vector index for users already running Postgres.",
90
- open: ["Timing of the SQLite to Postgres cutover"],
91
- entities: ["NLM", "Postgres"],
92
- }),
93
- makeSession({
94
- id: "g_tauri",
95
- label: "Tauri desktop packaging for v2 distribution",
96
- summary: "Plan to wrap the server and SPA in Tauri for signed installers",
97
- body: "Tauri hosts the Vite SPA in a webview and runs the Node server as a sidecar. Produces dmg, exe, and deb installers.",
98
- open: ["Whether to rewrite the server in Rust later"],
99
- entities: ["NLM"],
100
- }),
101
- makeSession({
102
- id: "g_classifier",
103
- label: "Ollama classifier latency during backfill",
104
- summary: "The Ollama classifier runs about one session per second",
105
- body: "Backfilling a year of history is slow because the Ollama classifier processes roughly one session per second. Considered parallelizing the calls.",
106
- open: ["Parallelize classifier calls or document the DeepSeek path"],
107
- entities: ["NLM", "Ollama"],
108
- }),
109
- makeSession({
110
- id: "g_supersede",
111
- label: "Fact supersedence policy on subject predicate collision",
112
- summary: "Deterministic supersedence when a newer fact collides with an older one",
113
- body: "When a new fact shares the same subject and predicate as a current fact, the older row is marked superseded by the new one. Always supersede, even on same value.",
114
- decisions: ["Always supersede on subject predicate collision"],
115
- entities: ["NLM"],
116
- }),
117
- makeSession({
118
- id: "g_toon",
119
- label: "TOON encoding for MCP tool responses",
120
- summary: "Encode MCP responses as TOON to cut token usage",
121
- body: "The MCP server encodes tool responses as TOON when NLM_FORMAT is set to toon. Falls back to JSON when toonEncode throws.",
122
- decisions: ["TOON-encode MCP responses behind the NLM_FORMAT env flag"],
123
- entities: ["NLM"],
124
- }),
125
- makeSession({
126
- id: "g_camofox",
127
- label: "Camofox audit of the search page",
128
- summary: "Ran a Camofox browser audit against the recall search UI",
129
- body: "Camofox audit found the search page returned zero results because the static build ignored query strings. Fixed with client-side hydration.",
130
- open: ["Should entity facet links filter within search"],
131
- entities: ["NLM", "Camofox"],
132
- }),
133
- ];
134
-
135
- /** query → session id expected to appear in the top 3 keyword results. */
136
- export const GOLDEN_QUERIES: ReadonlyArray<{ query: string; expectTop3: string }> = [
137
- { query: "FTS5 pgvector search backend", expectTop3: "g_fts" },
138
- { query: "Hono router", expectTop3: "g_hono" },
139
- { query: "Tauri packaging installers", expectTop3: "g_tauri" },
140
- { query: "Ollama classifier latency", expectTop3: "g_classifier" },
141
- { query: "fact supersedence collision", expectTop3: "g_supersede" },
142
- { query: "TOON encoding MCP", expectTop3: "g_toon" },
143
- ];
144
- ```
145
-
146
- - [ ] **Step 2: Write the golden recall test**
147
-
148
- Create `tests/integration/recall-golden.test.ts`:
149
-
150
- ```typescript
151
- /**
152
- * Recall-quality regression gate. A fixed corpus + query/expectation pairs,
153
- * run through RecallService against a real SqliteSessionStore. Assertions are
154
- * tolerant (expected session within top 3) so they survive the swap from the
155
- * token-overlap scorer to FTS5 BM25 ranking. This test must stay green from
156
- * the current code through every task in this plan.
157
- */
158
-
159
- import { mkdtempSync, rmSync } from "node:fs";
160
- import { tmpdir } from "node:os";
161
- import { join, resolve } from "node:path";
162
- import { afterEach, beforeEach, describe, expect, it } from "vitest";
163
- import { RecallService } from "../../src/core/recall/recall-service.js";
164
- import { SqliteSessionStore } from "../../src/core/storage/sqlite-session-store.js";
165
- import type { EmbedResult, LLMClient } from "../../src/ports/llm-client.js";
166
- import { LLMUnreachableError } from "../../src/ports/llm-client.js";
167
- import { GOLDEN_CORPUS, GOLDEN_QUERIES } from "../fixtures/golden-corpus.js";
168
-
169
- const MIGRATIONS_DIR = resolve(__dirname, "../../migrations");
170
-
171
- // Keyword-only recall must never touch the embedder; this stub proves it.
172
- class UnreachableEmbedder implements LLMClient {
173
- async embed(): Promise<EmbedResult> {
174
- throw new LLMUnreachableError("ollama");
175
- }
176
- async classify(): Promise<never> {
177
- throw new Error("not used");
178
- }
179
- }
180
-
181
- describe("golden recall regression gate", () => {
182
- let tmp: string;
183
- let store: SqliteSessionStore;
184
-
185
- beforeEach(() => {
186
- tmp = mkdtempSync(join(tmpdir(), "nlm-golden-"));
187
- store = new SqliteSessionStore({
188
- dbPath: join(tmp, "canonical.sqlite"),
189
- migrationsDir: MIGRATIONS_DIR,
190
- });
191
- for (const session of GOLDEN_CORPUS) {
192
- store.insertSessionForTest(session);
193
- }
194
- });
195
-
196
- afterEach(() => {
197
- store.close();
198
- rmSync(tmp, { recursive: true, force: true });
199
- });
200
-
201
- for (const { query, expectTop3 } of GOLDEN_QUERIES) {
202
- it(`keyword recall surfaces "${expectTop3}" in the top 3 for "${query}"`, async () => {
203
- const svc = new RecallService({ store, llm: new UnreachableEmbedder() });
204
- const result = await svc.search({ query, mode: "keyword", limit: 10 });
205
- const top3 = result.results.slice(0, 3).map((r) => r.id);
206
- expect(top3).toContain(expectTop3);
207
- });
208
- }
209
- });
210
- ```
211
-
212
- - [ ] **Step 3: Run the golden test against current code**
213
-
214
- Run: `npm test -- tests/integration/recall-golden.test.ts`
215
- Expected: PASS — all 6 cases. This proves the current in-memory scorer satisfies the golden set; the same test will guard the FTS swap.
216
-
217
- - [ ] **Step 4: Commit**
218
-
219
- ```bash
220
- git checkout -b feat/fts5-lexical-recall
221
- git add tests/fixtures/golden-corpus.ts tests/integration/recall-golden.test.ts
222
- git commit -m "test: add golden-set recall regression gate before FTS5 swap"
223
- ```
224
-
225
- ---
226
-
227
- ## Task 2: FTS index rebuild migration
228
-
229
- `sessions_fts` and its `sessions_ai`/`sessions_au`/`sessions_ad` triggers were declared in migration `000` and have fired on every insert since — the index is normally in sync. Add a one-time `rebuild` as a safety net so the recall path can depend on FTS being complete for all pre-existing rows.
230
-
231
- **Files:**
232
- - Create: `migrations/008_fts_rebuild.sql`
233
- - Create: `tests/integration/fts-index.test.ts`
234
-
235
- - [ ] **Step 1: Write the failing test**
236
-
237
- Create `tests/integration/fts-index.test.ts`:
238
-
239
- ```typescript
240
- /**
241
- * Verifies the sessions_fts FTS5 index is present and kept in sync with the
242
- * sessions table after migrations run and rows are inserted.
243
- */
244
-
245
- import { mkdtempSync, rmSync } from "node:fs";
246
- import { tmpdir } from "node:os";
247
- import { join, resolve } from "node:path";
248
- import { afterEach, beforeEach, describe, expect, it } from "vitest";
249
- import { SqliteSessionStore } from "../../src/core/storage/sqlite-session-store.js";
250
- import { makeSession } from "../fixtures/sessions.js";
251
-
252
- const MIGRATIONS_DIR = resolve(__dirname, "../../migrations");
253
-
254
- describe("sessions_fts index", () => {
255
- let tmp: string;
256
- let store: SqliteSessionStore;
257
-
258
- beforeEach(() => {
259
- tmp = mkdtempSync(join(tmpdir(), "nlm-fts-"));
260
- store = new SqliteSessionStore({
261
- dbPath: join(tmp, "canonical.sqlite"),
262
- migrationsDir: MIGRATIONS_DIR,
263
- });
264
- });
265
-
266
- afterEach(() => {
267
- store.close();
268
- rmSync(tmp, { recursive: true, force: true });
269
- });
270
-
271
- it("populates sessions_fts via triggers on insert", () => {
272
- store.insertSessionForTest(makeSession({ id: "s1", label: "alpha", body: "beta" }));
273
- store.insertSessionForTest(makeSession({ id: "s2", label: "gamma", body: "delta" }));
274
- const db = store.rawDb();
275
- const fts = db.prepare<[], { n: number }>("SELECT count(*) AS n FROM sessions_fts").get();
276
- const rows = db.prepare<[], { n: number }>("SELECT count(*) AS n FROM sessions").get();
277
- expect(fts?.n).toBe(rows?.n);
278
- expect(fts?.n).toBe(2);
279
- });
280
-
281
- it("records the 008 fts_rebuild migration as applied", () => {
282
- const db = store.rawDb();
283
- const row = db
284
- .prepare<[number], { name: string }>("SELECT name FROM schema_migrations WHERE version = ?")
285
- .get(8);
286
- expect(row?.name).toBe("fts_rebuild");
287
- });
288
-
289
- it("answers a raw FTS5 MATCH query", () => {
290
- store.insertSessionForTest(makeSession({ id: "s1", label: "pgvector plan", body: "" }));
291
- const db = store.rawDb();
292
- const hit = db
293
- .prepare<[string], { id: string }>(
294
- "SELECT s.id FROM sessions_fts JOIN sessions s ON s.rowid = sessions_fts.rowid WHERE sessions_fts MATCH ?",
295
- )
296
- .get('"pgvector"');
297
- expect(hit?.id).toBe("s1");
298
- });
299
- });
300
- ```
301
-
302
- - [ ] **Step 2: Run the test to verify it fails**
303
-
304
- Run: `npm test -- tests/integration/fts-index.test.ts`
305
- Expected: FAIL on the "records the 008 fts_rebuild migration" case — version 8 is not in `schema_migrations` because the migration file does not exist yet. (The other two cases may already pass — the triggers exist in migration 000.)
306
-
307
- - [ ] **Step 3: Write the migration**
308
-
309
- Create `migrations/008_fts_rebuild.sql`:
310
-
311
- ```sql
312
- -- One-time safety rebuild of the sessions_fts external-content FTS5 index.
313
- -- The virtual table and its sync triggers (sessions_ai / sessions_au /
314
- -- sessions_ad) were declared in migration 000 and have fired on every write
315
- -- since, so the index is normally already in sync. This rebuild guarantees
316
- -- the index matches every existing sessions row before the recall path
317
- -- starts depending on FTS5 for keyword search. Safe and idempotent.
318
- INSERT INTO sessions_fts(sessions_fts) VALUES('rebuild');
319
-
320
- INSERT OR IGNORE INTO schema_migrations (version, name) VALUES (8, 'fts_rebuild');
321
- ```
322
-
323
- - [ ] **Step 4: Run the test to verify it passes**
324
-
325
- Run: `npm test -- tests/integration/fts-index.test.ts`
326
- Expected: PASS — all 3 cases.
327
-
328
- - [ ] **Step 5: Commit**
329
-
330
- ```bash
331
- git add migrations/008_fts_rebuild.sql tests/integration/fts-index.test.ts
332
- git commit -m "feat: add FTS5 index rebuild migration with sync verification"
333
- ```
334
-
335
- ---
336
-
337
- ## Task 3: Add `keywordSearch` to the SessionStore port
338
-
339
- Add the port method and its `SqliteSessionStore` implementation. The unit-test `InMemoryStore` fake (in `recall-service.test.ts`) also `implements SessionStore`, so it must gain a `keywordSearch` stub in this task or `typecheck` breaks — `RecallService` does not call it until Task 4, so a minimal stub is correct here.
340
-
341
- **Files:**
342
- - Modify: `src/ports/session-store.ts`
343
- - Modify: `src/core/storage/sqlite-session-store.ts`
344
- - Modify: `tests/unit/core/recall-service.test.ts:12-27` (add stub method to `InMemoryStore`)
345
- - Test: `tests/integration/keyword-search-fts.test.ts`
346
-
347
- - [ ] **Step 1: Write the failing test**
348
-
349
- Create `tests/integration/keyword-search-fts.test.ts`:
350
-
351
- ```typescript
352
- /**
353
- * Direct coverage of SqliteSessionStore.keywordSearch — FTS5 BM25 ranking
354
- * and resilience to FTS5 query-syntax metacharacters in user input.
355
- */
356
-
357
- import { mkdtempSync, rmSync } from "node:fs";
358
- import { tmpdir } from "node:os";
359
- import { join, resolve } from "node:path";
360
- import { afterEach, beforeEach, describe, expect, it } from "vitest";
361
- import { SqliteSessionStore } from "../../src/core/storage/sqlite-session-store.js";
362
- import { makeSession } from "../fixtures/sessions.js";
363
-
364
- const MIGRATIONS_DIR = resolve(__dirname, "../../migrations");
365
-
366
- describe("SqliteSessionStore.keywordSearch", () => {
367
- let tmp: string;
368
- let store: SqliteSessionStore;
369
-
370
- beforeEach(() => {
371
- tmp = mkdtempSync(join(tmpdir(), "nlm-kw-"));
372
- store = new SqliteSessionStore({
373
- dbPath: join(tmp, "canonical.sqlite"),
374
- migrationsDir: MIGRATIONS_DIR,
375
- });
376
- store.insertSessionForTest(
377
- makeSession({ id: "s_pg", label: "pgvector migration plan", body: "postgres mirror" }),
378
- );
379
- store.insertSessionForTest(
380
- makeSession({ id: "s_hono", label: "Hono router", body: "http framework setup" }),
381
- );
382
- store.insertSessionForTest(
383
- makeSession({ id: "s_misc", label: "unrelated work", body: "nothing in common" }),
384
- );
385
- });
386
-
387
- afterEach(() => {
388
- store.close();
389
- rmSync(tmp, { recursive: true, force: true });
390
- });
391
-
392
- it("ranks the matching session first and returns a positive score", async () => {
393
- const hits = await store.keywordSearch("pgvector", 10);
394
- expect(hits[0]?.sessionId).toBe("s_pg");
395
- expect(hits[0]?.score).toBeGreaterThan(0);
396
- });
397
-
398
- it("matches body text, not just the label", async () => {
399
- const hits = await store.keywordSearch("framework", 10);
400
- expect(hits.map((h) => h.sessionId)).toContain("s_hono");
401
- });
402
-
403
- it("returns an empty array for a query with no indexable tokens", async () => {
404
- const hits = await store.keywordSearch("---", 10);
405
- expect(hits).toEqual([]);
406
- });
407
-
408
- it("does not throw on FTS5 metacharacters in the query", async () => {
409
- const hits = await store.keywordSearch('pgvector OR (qdrant) NEAR "x"', 10);
410
- expect(hits.map((h) => h.sessionId)).toContain("s_pg");
411
- });
412
-
413
- it("respects the limit", async () => {
414
- const hits = await store.keywordSearch("plan router work", 1);
415
- expect(hits.length).toBeLessThanOrEqual(1);
416
- });
417
- });
418
- ```
419
-
420
- - [ ] **Step 2: Run the test to verify it fails**
421
-
422
- Run: `npm test -- tests/integration/keyword-search-fts.test.ts`
423
- Expected: FAIL — `store.keywordSearch is not a function`.
424
-
425
- - [ ] **Step 3: Add the port interface members**
426
-
427
- In `src/ports/session-store.ts`, add the `KeywordNeighbor` interface immediately after the existing `SemanticNeighbor` interface (after line 20):
428
-
429
- ```typescript
430
- export interface KeywordNeighbor {
431
- readonly sessionId: string;
432
- readonly score: number;
433
- }
434
- ```
435
-
436
- Then add the method to the `SessionStore` interface, immediately after the `semanticSearch` declaration (after line 30):
437
-
438
- ```typescript
439
- keywordSearch(
440
- query: string,
441
- limit: number,
442
- ): Promise<ReadonlyArray<KeywordNeighbor>>;
443
- ```
444
-
445
- - [ ] **Step 4: Implement `keywordSearch` in `SqliteSessionStore`**
446
-
447
- In `src/core/storage/sqlite-session-store.ts`:
448
-
449
- Add to the import from `@ports/session-store.js` (currently `SemanticNeighbor, SessionFilter, SessionStore`) the new `KeywordNeighbor` type:
450
-
451
- ```typescript
452
- import type {
453
- KeywordNeighbor,
454
- SemanticNeighbor,
455
- SessionFilter,
456
- SessionStore,
457
- } from "@ports/session-store.js";
458
- ```
459
-
460
- Add a `tokenize` import below the existing `runMigrations` import (after line 32):
461
-
462
- ```typescript
463
- import { tokenize } from "@core/recall/tokenize.js";
464
- ```
465
-
466
- Add the method immediately after `semanticSearch` (after line 461, before `updateStatus`):
467
-
468
- ```typescript
469
- /**
470
- * Lexical recall via the sessions_fts FTS5 index. BM25 column weights
471
- * favour label over summary over body. Returns sessions ranked best-first
472
- * with a positive score (the negated bm25() value — bm25 is more negative
473
- * for better matches). User input is tokenized and rebuilt into a quoted
474
- * OR query so FTS5 metacharacters cannot reach the MATCH parser.
475
- */
476
- async keywordSearch(
477
- query: string,
478
- limit: number,
479
- ): Promise<ReadonlyArray<KeywordNeighbor>> {
480
- const matchExpr = toMatchExpression(query);
481
- if (!matchExpr) return [];
482
- const k = Math.max(1, Math.trunc(limit));
483
- const rows = this.db
484
- .prepare<[string, number], { sessionId: string; score: number }>(`
485
- SELECT s.id AS sessionId,
486
- -bm25(sessions_fts, 10.0, 4.0, 1.0) AS score
487
- FROM sessions_fts
488
- JOIN sessions s ON s.rowid = sessions_fts.rowid
489
- WHERE sessions_fts MATCH ?
490
- ORDER BY score DESC
491
- LIMIT ?
492
- `)
493
- .all(matchExpr, k);
494
- return rows.map((r) => ({ sessionId: r.sessionId, score: r.score }));
495
- }
496
- ```
497
-
498
- Add this module-level helper at the end of the file, after the closing brace of the `SqliteSessionStore` class:
499
-
500
- ```typescript
501
- /**
502
- * Builds a safe FTS5 MATCH expression from raw user input. Each indexable
503
- * token becomes a double-quoted string literal; literals are OR-joined.
504
- * Quoting neutralizes FTS5 operators (AND, OR, NEAR, *, parentheses, colon).
505
- * Returns null when the query has no indexable tokens.
506
- */
507
- function toMatchExpression(query: string): string | null {
508
- const terms = tokenize(query);
509
- if (terms.length === 0) return null;
510
- return terms.map((t) => `"${t.replace(/"/g, '""')}"`).join(" OR ");
511
- }
512
- ```
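To make the quoting concrete, here is a minimal sketch of the same transformation with a stand-in tokenizer. `tokenizeSketch` and `toMatchExpressionSketch` are hypothetical names; the real `tokenize` in `@core/recall/tokenize.js` may split or filter differently, so the exact token list is an assumption.

```typescript
// Hypothetical stand-in for the repo's tokenizer (assumption: lowercases the
// input and splits on anything that is not an ASCII letter or digit).
function tokenizeSketch(input: string): string[] {
  return input
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Same shape as the helper above: quote each token, OR-join the literals,
// and return null when nothing indexable is left.
function toMatchExpressionSketch(query: string): string | null {
  const terms = tokenizeSketch(query);
  if (terms.length === 0) return null;
  return terms.map((t) => `"${t.replace(/"/g, '""')}"`).join(" OR ");
}

const hostile = toMatchExpressionSketch('pgvector OR (qdrant) NEAR "x"');
const empty = toMatchExpressionSketch("---");
console.log(hostile);
console.log(empty);
```

Under this assumed tokenizer, words such as `OR` and `NEAR` survive only as quoted terms, so FTS5 reads them as ordinary strings rather than as query syntax.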
513
-
514
- - [ ] **Step 5: Add a `keywordSearch` stub to the `InMemoryStore` test fake**
515
-
516
- In `tests/unit/core/recall-service.test.ts`, the `InMemoryStore` class (lines 12-27) `implements SessionStore` and will no longer compile without the new method. Add the import and a minimal stub — Task 4 replaces this stub with a real implementation.
517
-
518
- Change the import block (lines 5-8) to include `KeywordNeighbor`:
519
-
520
- ```typescript
521
- import type {
522
- KeywordNeighbor,
523
- SessionStore,
524
- SemanticNeighbor,
525
- } from "../../../src/ports/session-store.js";
526
- ```
527
-
528
- Add this method inside `InMemoryStore`, after `semanticSearch` (after line 25):
529
-
530
- ```typescript
531
- async keywordSearch(): Promise<ReadonlyArray<KeywordNeighbor>> {
532
- return [];
533
- }
534
- ```
535
-
536
- - [ ] **Step 6: Run the tests**
537
-
538
- Run: `npm test -- tests/integration/keyword-search-fts.test.ts && npm run typecheck`
539
- Expected: PASS — all 5 `keywordSearch` cases; `typecheck` clean.
540
-
541
- Run: `npm test -- tests/integration/recall-golden.test.ts`
542
- Expected: PASS — golden gate still green (`RecallService` unchanged, still uses the in-memory scorer).
543
-
544
- - [ ] **Step 7: Commit**
545
-
546
- ```bash
547
- git add src/ports/session-store.ts src/core/storage/sqlite-session-store.ts tests/unit/core/recall-service.test.ts tests/integration/keyword-search-fts.test.ts
548
- git commit -m "feat: add FTS5-backed keywordSearch to the SessionStore port"
549
- ```
550
-
551
- ---
552
-
553
- ## Task 4: Rewire `RecallService` to use `keywordSearch`
554
-
555
- Switch the keyword and hybrid legs from the in-memory `scoreAll`/`scoreKeyword` path to `store.keywordSearch`. `matchedIn` badges are computed in core from the resolved `Session` (which carries `decisions`/`open` from the `markers` table) via a new pure helper, so the `MatchField` type is unchanged and decision/open badges stay accurate.
556
-
557
- **Hybrid weighting note:** `mergeHybrid` normalizes each leg by its own max (`score / maxKw`). That normalization absorbs the scale change from token-overlap counts to negated-BM25 values, so the 0.6 semantic / 0.4 keyword split is *deliberately retained* — this is the re-tuning conclusion, verified by the hybrid test below, not an oversight.
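The normalization argument can be illustrated with a small sketch. `blendSketch` is a hypothetical stand-in for `mergeHybrid`, not the repo's implementation, and the sample scores are invented for illustration. It shows only that dividing each leg by its own maximum removes a uniform change of scale; BM25 also changes the relative spacing between scores, which this sketch does not model.

```typescript
interface LegScore {
  readonly id: string;
  readonly score: number;
}

// Divide every score in a leg by that leg's maximum, so the best hit is 1.
function normalizeByMax(leg: ReadonlyArray<LegScore>): Map<string, number> {
  let max = 0;
  for (const hit of leg) max = Math.max(max, hit.score);
  const out = new Map<string, number>();
  for (const hit of leg) out.set(hit.id, max > 0 ? hit.score / max : 0);
  return out;
}

// Hypothetical blend: 0.4 * normalized keyword + 0.6 * normalized semantic.
function blendSketch(
  keyword: ReadonlyArray<LegScore>,
  semantic: ReadonlyArray<LegScore>,
): Map<string, number> {
  const kw = normalizeByMax(keyword);
  const sem = normalizeByMax(semantic);
  const blended = new Map<string, number>();
  kw.forEach((score, id) => blended.set(id, 0.4 * score));
  sem.forEach((score, id) => blended.set(id, (blended.get(id) ?? 0) + 0.6 * score));
  return blended;
}

const semanticLeg: LegScore[] = [
  { id: "a", score: 0.9 },
  { id: "b", score: 0.3 },
];
const overlapCounts: LegScore[] = [
  { id: "a", score: 1 },
  { id: "b", score: 3 },
];
// The same keyword ranking on a different scale (every score multiplied by 4).
const rescaled: LegScore[] = overlapCounts.map((h) => ({ id: h.id, score: h.score * 4 }));

const before = blendSketch(overlapCounts, semanticLeg);
const after = blendSketch(rescaled, semanticLeg);
console.log(before.get("a"), after.get("a"));
console.log(before.get("b"), after.get("b"));
```

Because both keyword legs normalize to the same values, the blended scores are unchanged by the rescale, which is the property the note relies on.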
558
-
559
- **Files:**
560
- - Create: `src/core/recall/match-fields.ts`
561
- - Modify: `src/core/recall/recall-service.ts`
562
- - Modify: `src/core/recall/index.ts`
563
- - Modify: `tests/unit/core/recall-service.test.ts`
564
- - Modify: `tests/integration/recall-sqlite.test.ts`
565
-
566
- - [ ] **Step 1: Write the `match-fields` helper with its test**
567
-
568
- Create `src/core/recall/match-fields.ts`:
569
-
570
- ```typescript
571
- /**
572
- * Computes which session fields a keyword query matched, for the `matchedIn`
573
- * badge on a RecallHit. Pure function — no DB, no I/O. FTS5 BM25 ranks the
574
- * whole row; this recovers per-field attribution from the resolved Session,
575
- * including decisions/open which live in the markers table (not in FTS).
576
- */
577
-
578
- import type { MatchField, Session } from "@shared/types.js";
579
- import { tokenSet } from "./tokenize.js";
580
-
581
- type SessionFields = Pick<Session, "label" | "summary" | "decisions" | "open">;
582
-
583
- export function keywordMatchFields(
584
- session: SessionFields,
585
- queryTokens: ReadonlySet<string>,
586
- ): ReadonlyArray<MatchField> {
587
- if (queryTokens.size === 0) return [];
588
- const fields: MatchField[] = [];
589
-
590
- if (overlaps(queryTokens, tokenSet(session.label))) fields.push("label");
591
- if (overlaps(queryTokens, joinedTokens(session.decisions))) fields.push("decisions");
592
- if (overlaps(queryTokens, joinedTokens(session.open))) fields.push("open");
593
- if (overlaps(queryTokens, tokenSet(session.summary))) fields.push("summary");
594
-
595
- return fields;
596
- }
597
-
598
- function joinedTokens(values: ReadonlyArray<string>): Set<string> {
599
- const out = new Set<string>();
600
- for (const v of values) {
601
- for (const t of tokenSet(v)) out.add(t);
602
- }
603
- return out;
604
- }
605
-
606
- function overlaps(a: ReadonlySet<string>, b: ReadonlySet<string>): boolean {
607
- const [small, large] = a.size <= b.size ? [a, b] : [b, a];
608
- for (const item of small) if (large.has(item)) return true;
609
- return false;
610
- }
611
- ```
612
-
613
- Create `tests/unit/core/match-fields.test.ts`:
614
-
615
- ```typescript
616
- import { describe, expect, it } from "vitest";
617
- import { keywordMatchFields } from "../../../src/core/recall/match-fields.js";
618
- import { tokenSet } from "../../../src/core/recall/tokenize.js";
619
- import { makeSession } from "../../fixtures/sessions.js";
620
-
621
- describe("keywordMatchFields", () => {
622
- it("returns no fields for empty query tokens", () => {
623
- expect(keywordMatchFields(makeSession({ label: "anything" }), new Set())).toEqual([]);
624
- });
625
-
626
- it("reports the label field on a label match", () => {
627
- const session = makeSession({ label: "pgvector migration plan" });
628
- expect(keywordMatchFields(session, tokenSet("pgvector"))).toEqual(["label"]);
629
- });
630
-
631
- it("reports decisions and open from marker text", () => {
632
- const session = makeSession({
633
- decisions: ["picked Hono for HTTP"],
634
- open: ["whether to use Tauri later"],
635
- });
636
- expect(keywordMatchFields(session, tokenSet("Hono"))).toEqual(["decisions"]);
637
- expect(keywordMatchFields(session, tokenSet("Tauri"))).toEqual(["open"]);
638
- });
639
-
640
- it("reports every matching field in label, decisions, open, summary order", () => {
641
- const session = makeSession({
642
- label: "recall port",
643
- summary: "ported recall to TypeScript",
644
- decisions: ["use sqlite-vec for semantic recall"],
645
- open: ["recall stats endpoint"],
646
- });
647
- expect(keywordMatchFields(session, tokenSet("recall"))).toEqual([
648
- "label",
649
- "decisions",
650
- "open",
651
- "summary",
652
- ]);
653
- });
654
- });
655
- ```
656
-
657
- - [ ] **Step 2: Run the helper test**
658
-
659
- Run: `npm test -- tests/unit/core/match-fields.test.ts`
660
- Expected: PASS — all 4 cases.
661
-
662
- - [ ] **Step 3: Rewire `RecallService`**
663
-
664
- In `src/core/recall/recall-service.ts`:
665
-
666
- Replace the two import lines for `score-keyword` and `tokenize` (lines 20-22) with:
667
-
668
- ```typescript
669
- import { applyFilter } from "./filter.js";
670
- import { keywordMatchFields } from "./match-fields.js";
671
- import { tokenSet } from "./tokenize.js";
672
- ```
673
-
674
- Add a keyword overfetch constant next to `SEMANTIC_OVERFETCH` (after line 28):
675
-
676
- ```typescript
677
- const KEYWORD_OVERFETCH = 3;
678
- ```
679
-
680
- Replace the keyword-hits block (lines 65-68):
681
-
682
- ```typescript
683
- const kwHits =
684
- mode === "keyword" || mode === "hybrid"
685
- ? scoreAll(filtered, queryTokens)
686
- : [];
687
- ```
688
-
689
- with:
690
-
691
- ```typescript
692
- const kwHits =
693
- (mode === "keyword" || mode === "hybrid") && input.query
694
- ? await this.runKeyword(
695
- input.query,
696
- byId,
697
- queryTokens,
698
- limit * KEYWORD_OVERFETCH,
699
- )
700
- : [];
701
- ```
702
-
703
- Add a `runKeyword` private method immediately after the `runSemantic` method (after line 116, before the closing brace of the class):
704
-
705
- ```typescript
706
- private async runKeyword(
707
- query: string,
708
- byId: ReadonlyMap<string, Session>,
709
- queryTokens: ReadonlySet<string>,
710
- fetchLimit: number,
711
- ): Promise<ReadonlyArray<KeywordHit>> {
712
- const neighbors = await this.deps.store.keywordSearch(query, fetchLimit);
713
- const hits: KeywordHit[] = [];
714
- for (const n of neighbors) {
715
- const session = byId.get(n.sessionId);
716
- if (!session) continue;
717
- hits.push({
718
- session,
719
- score: n.score,
720
- matchedIn: keywordMatchFields(session, queryTokens),
721
- });
722
- }
723
- return hits;
724
- }
725
- ```
726
-
727
- Delete the now-unused `scoreAll` function (lines 130-142):
728
-
729
- ```typescript
730
- function scoreAll(
731
- sessions: ReadonlyArray<Session>,
732
- queryTokens: ReadonlySet<string>,
733
- ): ReadonlyArray<KeywordHit> {
734
- if (queryTokens.size === 0) return [];
735
- const hits: KeywordHit[] = [];
736
- for (const s of sessions) {
737
- const { score, matchedIn } = scoreKeyword(s, queryTokens);
738
- if (score > 0) hits.push({ session: s, score, matchedIn });
739
- }
740
- hits.sort((a, b) => b.score - a.score);
741
- return hits;
742
- }
743
- ```
744
-
745
- Note: `filtered` is still used (it builds `byId`), and `queryTokens` is still used (passed to `runKeyword` for `matchedIn`). Leave both. `byId` is built from the entity/kind-filtered set, so `runKeyword` resolving through it naturally drops filtered-out sessions — same pattern as `runSemantic`.
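The filtering behaviour described in this note can be sketched in isolation. All names below (`SessionSketch`, `NeighborSketch`, `resolveNeighbors`) are hypothetical simplifications of the types in the plan; the point is only that a lookup map built from the filtered set silently discards hits for sessions outside it.

```typescript
interface SessionSketch {
  readonly id: string;
  readonly entities: ReadonlyArray<string>;
}

interface NeighborSketch {
  readonly sessionId: string;
  readonly score: number;
}

// Resolve store hits through a map that contains only the filtered sessions.
// Hits whose session is absent from the map are skipped.
function resolveNeighbors(
  neighbors: ReadonlyArray<NeighborSketch>,
  byId: ReadonlyMap<string, SessionSketch>,
): Array<{ session: SessionSketch; score: number }> {
  const hits: Array<{ session: SessionSketch; score: number }> = [];
  for (const n of neighbors) {
    const session = byId.get(n.sessionId);
    if (!session) continue; // filtered out upstream, so dropped here
    hits.push({ session, score: n.score });
  }
  return hits;
}

const allSessions: SessionSketch[] = [
  { id: "b", entities: ["NLM", "Postgres"] },
  { id: "c", entities: ["Other"] },
];

// Entity filter applied before the lookup map is built.
const byId = new Map<string, SessionSketch>();
for (const s of allSessions) {
  if (s.entities.indexOf("NLM") !== -1) byId.set(s.id, s);
}

const resolved = resolveNeighbors(
  [
    { sessionId: "b", score: 5 },
    { sessionId: "c", score: 4 },
  ],
  byId,
);
console.log(resolved.map((h) => h.session.id));
```

Since some store hits are discarded at this step, fetching more than `limit` rows from the store (the `KEYWORD_OVERFETCH` multiplier) presumably leaves headroom for the results that survive.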
746
-
747
- - [ ] **Step 4: Update the recall barrel export**
748
-
749
- In `src/core/recall/index.ts`, remove the `scoreKeyword` export (lines 3-4) and add the `keywordMatchFields` export. The file becomes:
750
-
751
- ```typescript
752
- export { RecallService } from "./recall-service.js";
753
- export type { RecallServiceDeps } from "./recall-service.js";
754
- export { keywordMatchFields } from "./match-fields.js";
755
- export { applyFilter } from "./filter.js";
756
- export type { RecallFilter } from "./filter.js";
757
- export { tokenize, tokenSet } from "./tokenize.js";
758
- ```
759
-
760
- - [ ] **Step 5: Update the `InMemoryStore` fake and keyword/hybrid unit tests**
761
-
762
- Replace the entire contents of `tests/unit/core/recall-service.test.ts` with:
763
-
764
- ```typescript
765
- import { describe, expect, it } from "vitest";
766
- import { RecallService } from "../../../src/core/recall/recall-service.js";
767
- import type { LLMClient, EmbedResult } from "../../../src/ports/llm-client.js";
768
- import { LLMUnreachableError } from "../../../src/ports/llm-client.js";
769
- import type {
770
- KeywordNeighbor,
771
- SessionStore,
772
- SemanticNeighbor,
773
- } from "../../../src/ports/session-store.js";
774
- import type { Session } from "../../../src/shared/types.js";
775
- import { makeSession } from "../../fixtures/sessions.js";
776
-
777
- // Fake store: keyword and semantic hits are pre-baked. Unit tests here cover
778
- // RecallService orchestration (filter, merge, limit, error handling) — not
779
- // keyword ranking quality, which is covered by the FTS integration tests.
780
- class InMemoryStore implements SessionStore {
781
- constructor(
782
- private readonly sessions: Session[],
783
- private readonly neighbors: SemanticNeighbor[] = [],
784
- private readonly keywordHits: KeywordNeighbor[] = [],
785
- ) {}
786
- async list(): Promise<ReadonlyArray<Session>> {
787
- return this.sessions;
788
- }
789
- async getById(id: string): Promise<Session | null> {
790
- return this.sessions.find((s) => s.id === id) ?? null;
791
- }
792
- async semanticSearch(): Promise<ReadonlyArray<SemanticNeighbor>> {
793
- return this.neighbors;
794
- }
795
- async keywordSearch(): Promise<ReadonlyArray<KeywordNeighbor>> {
796
- return this.keywordHits;
797
- }
798
- async updateStatus(): Promise<void> {}
799
- }
800
-
801
- class StubEmbedder implements LLMClient {
802
- constructor(private readonly fail: boolean = false) {}
803
- async embed(): Promise<EmbedResult> {
804
- if (this.fail) throw new LLMUnreachableError("ollama");
805
- return { vector: new Float32Array([1, 0, 0]), model: "stub" };
806
- }
807
- async classify(): Promise<never> {
808
- throw new Error("not used");
809
- }
810
- }
811
-
812
- const corpus: Session[] = [
813
- makeSession({
814
- id: "a",
815
- label: "Hono router setup",
816
- entities: ["NLM"],
817
- decisions: ["chose Hono over Express"],
818
- }),
819
- makeSession({
820
- id: "b",
821
- label: "pgvector migration plan",
822
- entities: ["NLM", "Postgres"],
823
- open: ["timing of cutover"],
824
- }),
825
- makeSession({
826
- id: "c",
827
- label: "unrelated session",
828
- entities: ["Other"],
829
- }),
830
- ];
831
-
832
- describe("RecallService.search", () => {
833
- it("returns empty result when query and filters are all blank", async () => {
834
- const svc = new RecallService({
835
- store: new InMemoryStore(corpus),
836
- llm: new StubEmbedder(),
837
- });
838
- const result = await svc.search({ query: "" });
839
- expect(result.total).toBe(0);
840
- expect(result.results).toEqual([]);
841
- });
842
-
843
- it("keyword mode surfaces store keyword hits ranked by store score", async () => {
844
- const store = new InMemoryStore(corpus, [], [
845
- { sessionId: "b", score: 9.2 },
846
- { sessionId: "a", score: 2.1 },
847
- ]);
848
- const svc = new RecallService({ store, llm: new StubEmbedder() });
849
- const result = await svc.search({ query: "pgvector", mode: "keyword" });
850
- expect(result.results.map((r) => r.id)).toEqual(["b", "a"]);
851
- expect(result.results[0]?.matchScore).toBe(9.2);
852
- });
853
-
854
- it("keyword mode populates matchedIn from the resolved session", async () => {
855
- const store = new InMemoryStore(corpus, [], [{ sessionId: "b", score: 5 }]);
856
- const svc = new RecallService({ store, llm: new StubEmbedder() });
857
- const result = await svc.search({ query: "pgvector", mode: "keyword" });
858
- expect(result.results[0]?.matchedIn).toEqual(["label"]);
859
- });
860
-
861
- it("entity filter restricts the keyword corpus", async () => {
862
- const store = new InMemoryStore(corpus, [], [
863
- { sessionId: "b", score: 5 },
864
- { sessionId: "c", score: 4 },
865
- ]);
866
- const svc = new RecallService({ store, llm: new StubEmbedder() });
867
- const result = await svc.search({ query: "session", mode: "keyword", entity: "NLM" });
868
- expect(result.results.every((r) => r.entities.includes("NLM"))).toBe(true);
869
- expect(result.results.map((r) => r.id)).not.toContain("c");
870
- });
871
-
872
- it("semantic mode returns ollama_unreachable when the embedder fails", async () => {
873
- const svc = new RecallService({
874
- store: new InMemoryStore(corpus),
875
- llm: new StubEmbedder(true),
876
- });
877
- const result = await svc.search({ query: "anything", mode: "semantic" });
878
- expect(result.modeUnavailable).toBe("ollama_unreachable");
879
- expect(result.results).toEqual([]);
880
- });
881
-
882
- it("hybrid mode degrades to keyword scores when semantic is unavailable", async () => {
883
- const store = new InMemoryStore(corpus, [], [{ sessionId: "b", score: 7 }]);
884
- const svc = new RecallService({ store, llm: new StubEmbedder(true) });
885
- const result = await svc.search({ query: "pgvector", mode: "hybrid" });
886
- expect(result.modeUnavailable).toBe("ollama_unreachable");
887
- expect(result.results).toHaveLength(1);
888
- expect(result.results[0]?.id).toBe("b");
889
- });
890
-
891
- it("semantic mode reports cosine similarity computed from L2 distance of unit vectors", async () => {
892
- const store = new InMemoryStore(corpus, [{ sessionId: "a", distance: 0 }]);
893
- const svc = new RecallService({ store, llm: new StubEmbedder() });
894
- const result = await svc.search({ query: "anything", mode: "semantic" });
895
- expect(result.results[0]?.matchScore).toBe(1);
896
- });
897
-
898
- it("hybrid mode blends 0.4 * kw + 0.6 * sem after per-leg normalization", async () => {
899
- const store = new InMemoryStore(
900
- corpus,
901
- [{ sessionId: "b", distance: 0 }],
902
- [{ sessionId: "b", score: 9.2 }],
903
- );
904
- const svc = new RecallService({ store, llm: new StubEmbedder() });
905
- const result = await svc.search({ query: "pgvector", mode: "hybrid" });
906
- const top = result.results[0];
907
- expect(top?.id).toBe("b");
908
- // kwNorm = 1 (only hit / its own max), semNorm = 1 (distance 0) => 0.4 + 0.6 = 1
909
- expect(top?.matchScore).toBeCloseTo(1, 4);
910
- expect(top?.keywordScore).toBe(1);
911
- expect(top?.semanticScore).toBe(1);
912
- });
913
-
914
- it("clamps limit to MAX_LIMIT (100) and at least 1", async () => {
915
- const store = new InMemoryStore(corpus, [], [{ sessionId: "b", score: 5 }]);
916
- const svc = new RecallService({ store, llm: new StubEmbedder() });
917
- const big = await svc.search({ query: "session", mode: "keyword", limit: 9999 });
918
- expect(big.limit).toBe(100);
919
- const small = await svc.search({ query: "session", mode: "keyword", limit: 0 });
920
- expect(small.limit).toBe(1);
921
- });
922
- });
923
- ```
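The test named "semantic mode reports cosine similarity computed from L2 distance of unit vectors" relies on a standard identity for unit vectors. A minimal sketch of that conversion follows. `cosineFromUnitL2` is a hypothetical name, and whether `RecallService` applies exactly this formula (rather than, for example, working from a squared distance) is an assumption.

```typescript
// For unit vectors a and b: |a - b|^2 = 2 - 2 * cos(a, b),
// so cos(a, b) = 1 - d^2 / 2, where d is the L2 distance between them.
function cosineFromUnitL2(distance: number): number {
  return 1 - (distance * distance) / 2;
}

const identical = cosineFromUnitL2(0); // same direction
const orthogonal = cosineFromUnitL2(Math.SQRT2); // 90 degrees apart
const opposite = cosineFromUnitL2(2); // opposite directions
console.log(identical, orthogonal, opposite);
```

A distance of 0 maps to a similarity of 1, which matches the `matchScore` asserted in the test for the neighbor at `distance: 0`.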
924
-
925
- - [ ] **Step 6: Update the integration test seed to populate `body`**
926
-
927
- In `tests/integration/recall-sqlite.test.ts`, the seed sessions (lines 42-72) set `label`/`summary` but not `body`. FTS5 keyword search now drives recall. `label` and `summary` are already indexed, but add `body` to each seed session as well so the corpus is realistic and the keyword cases exercise body matching. Replace the `seed` array (lines 42-72) with:
928
-
929
- ```typescript
930
- const seed: ReadonlyArray<{ session: Session; embedding: Float32Array }> = [
931
- {
932
- session: makeSession({
933
- id: "sess_a",
934
- label: "Hono router setup",
935
- summary: "Wired Hono onto port 3940 with sqlite session store",
936
- body: "Chose Hono over Express for routing. Mounted the recall API on port 3940.",
937
- entities: ["NLM"],
938
- decisions: ["chose Hono over Express for routing"],
939
- }),
940
- embedding: unit([1, 0, 0]),
941
- },
942
- {
943
- session: makeSession({
944
- id: "sess_b",
945
- label: "pgvector migration plan",
946
- summary: "Sketched eventual Postgres mirror via PostgresSessionStore port",
947
- body: "Planned the pgvector power tier. Open question: timing of cutover from SQLite to Postgres.",
948
- entities: ["NLM", "Postgres"],
949
- open: ["timing of cutover from SQLite to Postgres"],
950
- }),
951
- embedding: unit([0, 1, 0]),
952
- },
953
- {
954
- session: makeSession({
955
- id: "sess_c",
956
- label: "TX Tax county scraper",
957
- summary: "Unrelated work on Texas tax exemption directory",
958
- body: "Built the Texas tax exemption county scraper and directory pipeline.",
959
- entities: ["TX Tax Exemptions"],
960
- }),
961
- embedding: unit([0, 0, 1]),
962
- },
963
- ];
964
- ```
965
-
966
- The existing assertions in that file (keyword finds `sess_b` for "pgvector", entity filter on "scraper" excludes non-NLM, hybrid blends) remain valid — `sess_b`'s label still contains "pgvector" and `sess_c`'s body contains "scraper".
967
-
968
- - [ ] **Step 7: Run the full recall test set**
969
-
970
- Run: `npm test -- tests/integration/recall-golden.test.ts tests/integration/recall-sqlite.test.ts tests/unit/core/recall-service.test.ts tests/unit/core/match-fields.test.ts && npm run typecheck`
971
- Expected: PASS — golden gate green (proves no recall-quality regression through the FTS swap), integration green, unit green, typecheck clean.
972
-
973
- - [ ] **Step 8: Commit**
974
-
975
- ```bash
976
- git add src/core/recall/match-fields.ts src/core/recall/recall-service.ts src/core/recall/index.ts tests/unit/core/match-fields.test.ts tests/unit/core/recall-service.test.ts tests/integration/recall-sqlite.test.ts
977
- git commit -m "feat: route keyword recall through FTS5 keywordSearch"
978
- ```
979
-
980
- ---
981
-
982
- ## Task 5: Delete the dead token-overlap scorer
983
-
984
- The FTS swap is complete and green. Remove the byte-parity scorer and its test. `tokenize.ts` stays — it is still imported by `src/core/recall-facts/fact-recall-service.ts` and reused by `keywordSearch` and `match-fields.ts`.
985
-
986
- **Files:**
987
- - Delete: `src/core/recall/score-keyword.ts`
988
- - Delete: `tests/unit/core/score-keyword.test.ts`
989
-
990
- - [ ] **Step 1: Confirm `scoreKeyword` has no remaining references**
991
-
992
- Run: `grep -rn "scoreKeyword\|score-keyword" src tests`
993
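- Expected: matches only inside the two files about to be deleted (`src/core/recall/score-keyword.ts` and `tests/unit/core/score-keyword.test.ts`). If any other file prints, fix that reference before deleting. It should already be gone after Task 4, since `recall-service.ts` and `index.ts` were the only importers.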
994
-
995
- - [ ] **Step 2: Delete the files**
996
-
997
- ```bash
998
- git rm src/core/recall/score-keyword.ts tests/unit/core/score-keyword.test.ts
999
- ```
1000
-
1001
- - [ ] **Step 3: Confirm `tokenize.ts` is still wired**
1002
-
1003
- Run: `grep -rn "tokenize" src/core/recall-facts src/core/storage src/core/recall`
1004
- Expected: references in `fact-recall-service.ts`, `sqlite-session-store.ts`, `match-fields.ts`, `recall-service.ts`, `index.ts` (the barrel re-export), and `tokenize.ts` itself. `tokenize.ts` must NOT be deleted.
1005
-
1006
- - [ ] **Step 4: Run the full suite**
1007
-
1008
- Run: `npm test && npm run typecheck && npm run lint`
1009
- Expected: PASS — entire suite green, typecheck clean, lint clean. No reference to the deleted scorer.
1010
-
1011
- - [ ] **Step 5: Commit**
1012
-
1013
- ```bash
1014
- git add -A
1015
- git commit -m "refactor: remove token-overlap scorer superseded by FTS5"
1016
- ```
1017
-
1018
- ---
1019
-
1020
- ## Task 6: Rebuild `dist/` and update the CHANGELOG
1021
-
1022
- Per the repo protocol, `dist/` is committed (the GitHub install is a pure copy — see the 2026-05-20 CHANGELOG entry) and every session ends with a CHANGELOG append.
1023
-
1024
- **Files:**
1025
- - Modify: `dist/` (regenerated)
1026
- - Modify: `logs/CHANGELOG/CHANGELOG.md`
1027
-
1028
- - [ ] **Step 1: Rebuild `dist/`**
1029
-
1030
- Run: `npm run build`
1031
- Expected: `build:server` and `build:ui` both succeed.
1032
-
1033
- - [ ] **Step 2: Add the CHANGELOG entry**
1034
-
1035
- Prepend a new entry below the title line in `logs/CHANGELOG/CHANGELOG.md` (newest first), matching the existing entry style:
1036
-
1037
- ```markdown
1038
- ## 2026-05-20 — FTS5 lexical recall: keywordSearch replaces the token-overlap scorer
1039
-
1040
- The keyword leg of recall moved from an in-memory token-intersection scorer to a SQLite FTS5 BM25 query behind a new `SessionStore.keywordSearch` port method — symmetric with the existing `semanticSearch` sqlite-vec leg.
1041
-
1042
- **Changes**
1043
- - `migrations/008_fts_rebuild.sql` — one-time safety rebuild of the `sessions_fts` index (table + sync triggers already existed in migration 000, just unqueried).
1044
- - `SessionStore.keywordSearch(query, limit)` — FTS5 MATCH with BM25 column weights 10/4/1 for label/summary/body; user input tokenized into a quoted OR query so FTS5 metacharacters cannot reach the parser.
1045
- - `RecallService` keyword + hybrid legs call `keywordSearch`; `matchedIn` badges computed in core via `match-fields.ts` from the resolved session (keeps decision/open attribution accurate — those live in `markers`, not FTS).
1046
- - Byte-parity test suite (pinned to the retired Python scorer) replaced by a tolerant golden-set recall regression test written before the swap and green throughout.
1047
- - Deleted `score-keyword.ts`; `tokenize.ts` retained (used by fact recall).
1048
-
1049
- **Decisions**
1050
- - Reused `sessions_fts(label, summary, body)` rather than adding `decisions`/`open` FTS columns — decision/open text already lives in `body`. Tradeoff: those lines get `body` weight, not an explicit 2x; BM25 IDF compensates.
1051
- - Hybrid 0.6/0.4 split retained — `mergeHybrid` normalizes each leg by its own max, which absorbs the token-count → BM25 scale change.
1052
-
1053
- **State:** v0.3.0. pgvector remains the optional power-tier swap (open task #96), untouched.
1054
- ```
1055
-
1056
- If the CHANGELOG now exceeds 10 `##` date headings, move the oldest entries to `logs/CHANGELOG/CHANGELOG-2026.md` per the session protocol.
1057
-
1058
- - [ ] **Step 3: Commit**
1059
-
1060
- ```bash
1061
- git add dist logs/CHANGELOG/CHANGELOG.md
1062
- git commit -m "build: rebuild dist for FTS5 recall + CHANGELOG"
1063
- ```
1064
-
1065
- ---
1066
-
1067
- ## Self-Review
1068
-
1069
- **Spec coverage:**
1070
- - Consensus requirement 1 — replace byte-parity tests with golden-set recall tests → Task 1 (golden gate written first), Task 5 (delete `score-keyword.test.ts`). ✓
1071
- - Consensus requirement 2 — wire `sessions_fts` with sync triggers → triggers already existed in migration 000; Task 2 adds the safety rebuild, Task 3 wires the *query* path (`keywordSearch`). ✓
1072
- - Consensus requirement 3 — re-tune the 0.6/0.4 hybrid weights → Task 4 documents and verifies that normalize-by-max absorbs the BM25 scale change; the split is deliberately retained, covered by the hybrid unit + integration tests. ✓
1073
-
1074
- **Placeholder scan:** No TBDs, no "add error handling", no "similar to Task N" — every step has complete code or an exact command. ✓
1075
-
1076
- **Type consistency:** `KeywordNeighbor { sessionId, score }` defined in Task 3, consumed unchanged in Task 4 (`runKeyword`, `InMemoryStore.keywordSearch`). `keywordMatchFields` signature defined in Task 4 Step 1, called identically in `runKeyword`. `keywordSearch(query, limit)` signature identical across the port and `SqliteSessionStore`; both fakes implement it without parameters, which still satisfies the interface. `MatchField` is unchanged: `keywordMatchFields` returns only existing members (`label`/`decisions`/`open`/`summary`). ✓
1077
-
1078
- ---
1079
-
1080
- ## Execution Handoff
1081
-
1082
- **Plan complete and saved to `docs/plans/2026-05-20-fts5-lexical-recall.md`. Two execution options:**
1083
-
1084
- **1. Subagent-Driven (recommended)** — dispatch a fresh subagent per task, review between tasks, fast iteration.
1085
-
1086
- **2. Inline Execution** — execute tasks in this session using executing-plans, batch execution with checkpoints.
1087
-
1088
- **Which approach?**