nlm-memory 0.5.0 → 0.5.1

Files changed (247)
  1. package/README.md +72 -34
  2. package/dist/cli/nlm.js +2 -1
  3. package/dist/cli/nlm.js.map +1 -1
  4. package/dist/http/app.js +2 -1
  5. package/dist/http/app.js.map +1 -1
  6. package/dist/mcp/server.js +20 -1
  7. package/dist/mcp/server.js.map +1 -1
  8. package/dist/ui/assets/{index-C8cpwbYJ.css → index-Beo8psd-.css} +1 -1
  9. package/dist/ui/assets/{index-CB50QnL-.js → index-CSPTTeeM.js} +8 -8
  10. package/dist/ui/index.html +2 -2
  11. package/package.json +26 -1
  12. package/.agents/plugins/marketplace.json +0 -20
  13. package/.github/workflows/ci.yml +0 -30
  14. package/docs/methodology/re-derivation-rate.md +0 -112
  15. package/docs/methodology/useful-hit-rate.md +0 -79
  16. package/docs/plans/2026-05-20-fts5-lexical-recall.md +0 -1088
  17. package/docs/plans/2026-05-20-recall-daemon-wedge-fix.md +0 -662
  18. package/docs/plans/2026-05-20-recall-hook-design.md +0 -131
  19. package/docs/plans/2026-05-20-recall-hook-implementation.md +0 -1222
  20. package/docs/plans/desktop-product.md +0 -69
  21. package/docs/plans/factstore-design.md +0 -236
  22. package/logs/CHANGELOG/CHANGELOG-2026.md +0 -1575
  23. package/logs/CHANGELOG/CHANGELOG.md +0 -209
  24. package/migrations/000_initial_schema.sql +0 -174
  25. package/migrations/001_entity_type_rename.sql +0 -17
  26. package/migrations/002_adapter_state_extend.sql +0 -12
  27. package/migrations/003_session_embeddings.sql +0 -11
  28. package/migrations/004_facts.sql +0 -46
  29. package/migrations/005_sources.sql +0 -31
  30. package/migrations/006_providers.sql +0 -33
  31. package/migrations/007_source_tokens.sql +0 -17
  32. package/migrations/008_fts_rebuild.sql +0 -9
  33. package/migrations/009_session_embedding_chunks.sql +0 -46
  34. package/migrations/010_sources_opencode.sql +0 -30
  35. package/migrations/011_sources_hermes_agent.sql +0 -30
  36. package/migrations/012_sources_aider.sql +0 -30
  37. package/migrations/013_adapter_state_failure_count.sql +0 -12
  38. package/migrations/014_sources_cursor.sql +0 -30
  39. package/migrations/015_sources_windsurf.sql +0 -30
  40. package/plugin-hermes-agent/README.md +0 -49
  41. package/plugin-hermes-agent/__init__.py +0 -75
  42. package/plugin-hermes-agent/plugin.yaml +0 -15
  43. package/scripts/backfill-citations.mjs +0 -0
  44. package/scripts/build-codex-plugin.mjs +0 -61
  45. package/scripts/deepseek-probe.mjs +0 -67
  46. package/scripts/extract-triples.mjs +0 -207
  47. package/scripts/longmemeval/embedding-cache.ts +0 -77
  48. package/scripts/longmemeval/fetch-dataset.sh +0 -25
  49. package/scripts/longmemeval/run-harness.ts +0 -315
  50. package/scripts/longmemeval/scorer.ts +0 -99
  51. package/scripts/longmemeval/tsconfig.json +0 -9
  52. package/scripts/longmemeval/types.ts +0 -35
  53. package/scripts/nlm-daily-digest.py +0 -239
  54. package/scripts/nlm-daily-digest.sh +0 -28
  55. package/src/cli/classify-parity.ts +0 -257
  56. package/src/cli/launchctl-helpers.ts +0 -49
  57. package/src/cli/nlm.ts +0 -1078
  58. package/src/core/actions/actions-log.ts +0 -118
  59. package/src/core/actions/overlay.ts +0 -117
  60. package/src/core/adapters/aider.ts +0 -205
  61. package/src/core/adapters/claude-code.ts +0 -293
  62. package/src/core/adapters/common.ts +0 -54
  63. package/src/core/adapters/cursor.ts +0 -486
  64. package/src/core/adapters/from-source.ts +0 -67
  65. package/src/core/adapters/hermes-agent.ts +0 -240
  66. package/src/core/adapters/hermes.ts +0 -277
  67. package/src/core/adapters/jsonl-generic.ts +0 -208
  68. package/src/core/adapters/opencode.ts +0 -281
  69. package/src/core/adapters/pi.ts +0 -264
  70. package/src/core/adapters/windsurf.ts +0 -386
  71. package/src/core/classifier/prompt.ts +0 -200
  72. package/src/core/dataset/build-dataset.ts +0 -463
  73. package/src/core/embedding/chunk-body.ts +0 -76
  74. package/src/core/embedding/embed-backfill.ts +0 -210
  75. package/src/core/embedding/embed-normalize.ts +0 -135
  76. package/src/core/facts/backfill-facts.ts +0 -254
  77. package/src/core/facts/extract-facts.ts +0 -50
  78. package/src/core/hook/citation-detect.ts +0 -124
  79. package/src/core/hook/cite-memo.ts +0 -68
  80. package/src/core/hook/claude-settings.ts +0 -187
  81. package/src/core/hook/gate.ts +0 -25
  82. package/src/core/hook/hook-log.ts +0 -41
  83. package/src/core/hook/memo-sweep.ts +0 -164
  84. package/src/core/hook/memo.ts +0 -67
  85. package/src/core/hook/pointer-block.ts +0 -26
  86. package/src/core/hook/select.ts +0 -32
  87. package/src/core/hook/transcript.ts +0 -121
  88. package/src/core/ingest/ingest-session.ts +0 -111
  89. package/src/core/providers/provider-models.ts +0 -100
  90. package/src/core/providers/provider-registry.ts +0 -196
  91. package/src/core/recall/citation-log.ts +0 -108
  92. package/src/core/recall/filter.ts +0 -27
  93. package/src/core/recall/index.ts +0 -6
  94. package/src/core/recall/match-fields.ts +0 -40
  95. package/src/core/recall/query-log.ts +0 -149
  96. package/src/core/recall/query-shape.ts +0 -66
  97. package/src/core/recall/recall-service.ts +0 -320
  98. package/src/core/recall/recent-log.ts +0 -59
  99. package/src/core/recall/tokenize.ts +0 -18
  100. package/src/core/recall/useful-scan.ts +0 -336
  101. package/src/core/recall-facts/fact-query-log.ts +0 -150
  102. package/src/core/recall-facts/fact-recall-service.ts +0 -327
  103. package/src/core/scheduler/scan-once.ts +0 -142
  104. package/src/core/scheduler/scheduler.ts +0 -225
  105. package/src/core/sources/source-registry.ts +0 -278
  106. package/src/core/storage/db-restore.ts +0 -133
  107. package/src/core/storage/live-status.ts +0 -45
  108. package/src/core/storage/migrate.ts +0 -72
  109. package/src/core/storage/sqlite-fact-store.ts +0 -304
  110. package/src/core/storage/sqlite-session-store.ts +0 -810
  111. package/src/hook/hook-auth.ts +0 -18
  112. package/src/hook/prompt-recall-hook.ts +0 -180
  113. package/src/hook/session-end-hook.ts +0 -81
  114. package/src/hook/session-start-hook.ts +0 -168
  115. package/src/hook/stop-hook.ts +0 -239
  116. package/src/http/app.ts +0 -1215
  117. package/src/install/claude-code.ts +0 -128
  118. package/src/install/codex.ts +0 -367
  119. package/src/install/cursor.ts +0 -68
  120. package/src/install/hermes-agent.ts +0 -76
  121. package/src/install/hermes.ts +0 -78
  122. package/src/install/nlm-dir-perms.ts +0 -55
  123. package/src/install/ollama.ts +0 -284
  124. package/src/install/setup.ts +0 -489
  125. package/src/install/windsurf.ts +0 -68
  126. package/src/llm/classifier-box.ts +0 -64
  127. package/src/llm/deepseek-client.ts +0 -150
  128. package/src/llm/env-autoload.ts +0 -55
  129. package/src/llm/ollama-client.ts +0 -189
  130. package/src/mcp/server.ts +0 -534
  131. package/src/ports/fact-store.ts +0 -102
  132. package/src/ports/llm-client.ts +0 -52
  133. package/src/ports/logger.ts +0 -16
  134. package/src/ports/session-store.ts +0 -45
  135. package/src/ports/transcript-adapter.ts +0 -55
  136. package/src/shared/types.ts +0 -149
  137. package/src/ui/App.tsx +0 -58
  138. package/src/ui/components/PromoteOpenButton.tsx +0 -65
  139. package/src/ui/components/SessionDrawer.tsx +0 -199
  140. package/src/ui/components/SideNav.tsx +0 -162
  141. package/src/ui/components/Skeleton.tsx +0 -107
  142. package/src/ui/index.html +0 -13
  143. package/src/ui/lib/actions.ts +0 -30
  144. package/src/ui/lib/api.ts +0 -92
  145. package/src/ui/lib/dataset.ts +0 -141
  146. package/src/ui/lib/registries.ts +0 -155
  147. package/src/ui/lib/view-settings.ts +0 -41
  148. package/src/ui/main.tsx +0 -15
  149. package/src/ui/pages/Live.tsx +0 -229
  150. package/src/ui/pages/Pulse.tsx +0 -415
  151. package/src/ui/pages/Recall.tsx +0 -190
  152. package/src/ui/pages/River.tsx +0 -354
  153. package/src/ui/pages/Search.tsx +0 -386
  154. package/src/ui/pages/Stub.tsx +0 -9
  155. package/src/ui/pages/Thread.tsx +0 -473
  156. package/src/ui/pages/settings/Classifier.tsx +0 -227
  157. package/src/ui/pages/settings/Data.tsx +0 -190
  158. package/src/ui/pages/settings/Index.tsx +0 -65
  159. package/src/ui/pages/settings/Labels.tsx +0 -224
  160. package/src/ui/pages/settings/Providers.tsx +0 -305
  161. package/src/ui/pages/settings/SettingsSubnav.tsx +0 -28
  162. package/src/ui/pages/settings/Sources.tsx +0 -326
  163. package/src/ui/pages/settings/Views.tsx +0 -96
  164. package/src/ui/styles.css +0 -1890
  165. package/src/ui/tsconfig.json +0 -21
  166. package/src/ui/vite.config.ts +0 -19
  167. package/tests/fixtures/claude_code/short_session.jsonl +0 -2
  168. package/tests/fixtures/claude_code/standard_iso.jsonl +0 -4
  169. package/tests/fixtures/claude_code/tool_heavy.jsonl +0 -8
  170. package/tests/fixtures/claude_code/with_subagent.jsonl +0 -7
  171. package/tests/fixtures/facts.ts +0 -17
  172. package/tests/fixtures/golden-corpus.ts +0 -85
  173. package/tests/fixtures/hermes/paired_request_dump.json +0 -24
  174. package/tests/fixtures/hermes/paired_session.json +0 -23
  175. package/tests/fixtures/hermes/request_dump.json +0 -28
  176. package/tests/fixtures/hermes/session_iso.json +0 -38
  177. package/tests/fixtures/hermes/session_unix.json +0 -38
  178. package/tests/fixtures/hermes/system_only.json +0 -18
  179. package/tests/fixtures/pi/error-connection-abort.jsonl +0 -8
  180. package/tests/fixtures/pi/short-successful.jsonl +0 -5
  181. package/tests/fixtures/pi/with-custom-message.jsonl +0 -6
  182. package/tests/fixtures/sessions.ts +0 -22
  183. package/tests/integration/backfill-facts.test.ts +0 -362
  184. package/tests/integration/citation-explicit.test.ts +0 -111
  185. package/tests/integration/cite-event.test.ts +0 -169
  186. package/tests/integration/cite-memo.test.ts +0 -87
  187. package/tests/integration/db-restore.test.ts +0 -153
  188. package/tests/integration/embed-backfill.test.ts +0 -176
  189. package/tests/integration/fact-supersedence.test.ts +0 -313
  190. package/tests/integration/fts-index.test.ts +0 -60
  191. package/tests/integration/getbyids-sqlite.test.ts +0 -100
  192. package/tests/integration/hermes-agent-hooks.test.ts +0 -248
  193. package/tests/integration/hook-claude-settings.test.ts +0 -218
  194. package/tests/integration/hook-log.test.ts +0 -54
  195. package/tests/integration/hook-memo.test.ts +0 -68
  196. package/tests/integration/hook-pre-compact.test.ts +0 -105
  197. package/tests/integration/hook-subagent-start.test.ts +0 -102
  198. package/tests/integration/http.test.ts +0 -401
  199. package/tests/integration/keyword-search-fts.test.ts +0 -66
  200. package/tests/integration/mcp-recall-logging.test.ts +0 -88
  201. package/tests/integration/mcp.test.ts +0 -260
  202. package/tests/integration/memo-sweep.test.ts +0 -91
  203. package/tests/integration/prompt-recall-hook.test.ts +0 -88
  204. package/tests/integration/provider-registry.test.ts +0 -107
  205. package/tests/integration/recall-golden.test.ts +0 -59
  206. package/tests/integration/recall-sqlite.test.ts +0 -169
  207. package/tests/integration/scheduler.test.ts +0 -391
  208. package/tests/integration/session-end-hook.test.ts +0 -48
  209. package/tests/integration/session-start-hook.test.ts +0 -126
  210. package/tests/integration/source-registry.test.ts +0 -122
  211. package/tests/integration/sqlite-fact-store.test.ts +0 -346
  212. package/tests/integration/stop-hook.test.ts +0 -560
  213. package/tests/integration/wal-checkpoint.test.ts +0 -49
  214. package/tests/unit/cli/launchctl-helpers.test.ts +0 -60
  215. package/tests/unit/core/adapters/aider.test.ts +0 -230
  216. package/tests/unit/core/adapters/claude-code.test.ts +0 -118
  217. package/tests/unit/core/adapters/cursor.test.ts +0 -485
  218. package/tests/unit/core/adapters/hermes-agent.test.ts +0 -329
  219. package/tests/unit/core/adapters/hermes.test.ts +0 -81
  220. package/tests/unit/core/adapters/jsonl-generic.test.ts +0 -142
  221. package/tests/unit/core/adapters/opencode.test.ts +0 -354
  222. package/tests/unit/core/adapters/pi.test.ts +0 -110
  223. package/tests/unit/core/adapters/windsurf.test.ts +0 -416
  224. package/tests/unit/core/classifier/prompt.test.ts +0 -126
  225. package/tests/unit/core/embedding/chunk-body.test.ts +0 -100
  226. package/tests/unit/core/facts/extract-facts.test.ts +0 -117
  227. package/tests/unit/core/filter.test.ts +0 -40
  228. package/tests/unit/core/hook/citation-detect-cite-session.test.ts +0 -96
  229. package/tests/unit/core/hook/citation-detect.test.ts +0 -124
  230. package/tests/unit/core/hook/gate.test.ts +0 -29
  231. package/tests/unit/core/hook/pointer-block.test.ts +0 -22
  232. package/tests/unit/core/hook/select.test.ts +0 -66
  233. package/tests/unit/core/match-fields.test.ts +0 -39
  234. package/tests/unit/core/mcp-cite-session.test.ts +0 -51
  235. package/tests/unit/core/providers/provider-models.test.ts +0 -101
  236. package/tests/unit/core/query-shape.test.ts +0 -92
  237. package/tests/unit/core/recall-facts/fact-recall-service.test.ts +0 -258
  238. package/tests/unit/core/recall-service.test.ts +0 -200
  239. package/tests/unit/core/storage/live-status.test.ts +0 -54
  240. package/tests/unit/core/tokenize.test.ts +0 -32
  241. package/tests/unit/core/useful-scan.test.ts +0 -537
  242. package/tests/unit/llm/embed.test.ts +0 -93
  243. package/tests/unit/llm/ollama-client.test.ts +0 -124
  244. package/tests/unit/scripts/longmemeval-scorer.test.ts +0 -114
  245. package/tsconfig.json +0 -31
  246. package/tsconfig.test.json +0 -11
  247. package/vitest.config.ts +0 -22
@@ -1,1088 +0,0 @@
1
- # FTS5 Lexical Recall Upgrade — Implementation Plan
2
-
3
- > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
-
5
- **Goal:** Replace NLM's in-memory token-intersection keyword scorer with a SQLite FTS5 BM25-ranked lexical search, behind the existing `SessionStore` port, without regressing recall quality.
6
-
7
- **Architecture:** The keyword leg of recall moves from a pure core function (`scoreKeyword`, loads every session into memory and scores by token overlap) to a new `SessionStore.keywordSearch` port method backed by the FTS5 virtual table `sessions_fts`. This mirrors how the semantic leg already works (`semanticSearch` → sqlite-vec). `RecallService` keeps orchestrating filter + merge; it just sources keyword hits from the store instead of computing them. The byte-for-byte parity test suite (pinned to a retired Python scorer) is replaced — *before any production code changes* — by a tolerant golden-set recall-quality test that must stay green through the swap. That golden test is the regression net.
8
-
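Editor's sketch (not part of the removed file): the architecture note above has `keywordSearch` return hits sourced from FTS5 BM25 ranking. SQLite's `bm25()` assigns numerically lower (more negative) values to better matches, so an adapter would presumably sort ascending by rank and flip the sign if callers expect a positive score. The `FtsRow` shape and the `toKeywordNeighbors` helper are assumptions for illustration; only `KeywordNeighbor`, `sessionId`, and `score` are named in the plan.

```typescript
// Hypothetical shapes: the plan names KeywordNeighbor, sessionId, and score;
// FtsRow and toKeywordNeighbors are illustrative assumptions.
interface KeywordNeighbor {
  sessionId: string;
  score: number;
}

interface FtsRow {
  id: string;
  rank: number; // raw bm25() value: more negative means a better match
}

// Order rows best-first and expose a positive, higher-is-better score.
function toKeywordNeighbors(rows: ReadonlyArray<FtsRow>): KeywordNeighbor[] {
  return [...rows]
    .sort((a, b) => a.rank - b.rank)
    .map((row) => ({ sessionId: row.id, score: -row.rank }));
}
```

Keeping the sign flip in the adapter would let `RecallService` merge keyword and semantic hits without knowing which engine produced them.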
9
- **Key design decision (documented tradeoff):** `sessions_fts` already exists in migration 000 with columns `(label, summary, body)` and sync triggers — it is created and maintained, just never queried. We reuse it as-is rather than adding dedicated `decisions`/`open` FTS columns. Decision and open-question text already lives inside `body` (markers are extracted *from* the body markdown), so BM25 over `(label, summary, body)` covers the same text the old scorer covered. What changes: decision/open lines get `body` column weight rather than the old explicit 2x. BM25's IDF term-rarity weighting compensates. The `matchedIn` badges stay accurate — they are computed in `RecallService` from the resolved `Session` object (which has `decisions[]`/`open[]` from the `markers` table), not from FTS. No FTS schema change, no `MatchField` type change.
10
-
11
- **Tech Stack:** TypeScript, better-sqlite3, SQLite FTS5 (built in), vitest. Hexagonal architecture — `core/` depends on the `SessionStore` port; `SqliteSessionStore` is the adapter.
12
-
13
- **Branch:** Create and work on `feat/fts5-lexical-recall` off `main`.
14
-
15
- **Out of scope:** pgvector (stays the optional power-tier swap, open task #96). The vector leg (`semanticSearch` / sqlite-vec) is untouched.
16
-
17
- ---
18
-
19
- ## File Structure
20
-
21
- **Created:**
22
- - `tests/fixtures/golden-corpus.ts` — fixed 8-session corpus with realistic `body`, used by the golden recall test.
23
- - `tests/integration/recall-golden.test.ts` — the regression gate: query → expected top-3 assertions, run through `RecallService` + real `SqliteSessionStore`.
24
- - `migrations/008_fts_rebuild.sql` — one-time `INSERT INTO sessions_fts(sessions_fts) VALUES('rebuild')` safety rebuild.
25
- - `tests/integration/fts-index.test.ts` — asserts `sessions_fts` is populated and synced after migrations + inserts.
26
- - `tests/integration/keyword-search-fts.test.ts` — exercises `SqliteSessionStore.keywordSearch` directly (ranking + FTS-syntax safety).
27
- - `src/core/recall/match-fields.ts` — pure helper computing `matchedIn` fields for a session against query tokens (the `matchedIn` half of the old `scoreKeyword`).
28
-
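Editor's sketch (not part of the removed file): the last entry above describes `match-fields.ts` as a pure helper that computes `matchedIn` fields for a session against query tokens. One way such a helper might look is shown below. The `MatchField` values, the `SessionLike` shape, the inline `tokenize`, and the exact signature of `keywordMatchFields` are all assumptions; the plan only gives the export name.

```typescript
// Assumed field names; the real MatchField type is not shown in the plan.
type MatchField = "label" | "summary" | "decisions" | "open";

// Assumed minimal session shape for illustration.
interface SessionLike {
  label: string;
  summary: string;
  decisions: ReadonlyArray<string>;
  open: ReadonlyArray<string>;
}

// Simplified stand-in for the project's tokenizer: lowercase alphanumeric runs.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Report which fields contain at least one query token.
function keywordMatchFields(
  session: SessionLike,
  queryTokens: ReadonlyArray<string>,
): MatchField[] {
  const wanted = new Set(queryTokens.map((t) => t.toLowerCase()));
  const hits = (text: string): boolean => tokenize(text).some((t) => wanted.has(t));
  const fields: MatchField[] = [];
  if (hits(session.label)) fields.push("label");
  if (hits(session.summary)) fields.push("summary");
  if (session.decisions.some((d) => hits(d))) fields.push("decisions");
  if (session.open.some((o) => hits(o))) fields.push("open");
  return fields;
}
```

Because the helper reads the resolved session rather than the FTS index, the badges stay independent of how ranking is done.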
29
- **Modified:**
30
- - `src/ports/session-store.ts` — add `KeywordNeighbor` interface + `keywordSearch` method.
31
- - `src/core/storage/sqlite-session-store.ts` — implement `keywordSearch`.
32
- - `src/core/recall/recall-service.ts` — keyword + hybrid legs call `store.keywordSearch`; add `runKeyword`; delete `scoreAll`.
33
- - `src/core/recall/index.ts` — drop `scoreKeyword` export; add `keywordMatchFields` export.
34
- - `tests/unit/core/recall-service.test.ts` — `InMemoryStore` fake implements `keywordSearch`; keyword/hybrid tests feed pre-baked hits.
35
- - `tests/integration/recall-sqlite.test.ts` — populate `body` on seed sessions so FTS keyword recall works.
36
-
37
- **Deleted:**
38
- - `src/core/recall/score-keyword.ts` — ranking moves to FTS; `matchedIn` logic moves to `match-fields.ts`.
39
- - `tests/unit/core/score-keyword.test.ts` — byte-parity suite, replaced by the golden test.
40
-
41
- **Kept (do not delete):**
42
- - `src/core/recall/tokenize.ts` — still imported by `src/core/recall-facts/fact-recall-service.ts` and reused by `keywordSearch` + `match-fields.ts`.
43
-
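Editor's sketch (not part of the removed file): `keyword-search-fts.test.ts` is listed above as covering FTS-syntax safety. A common way to keep raw user input from being parsed as FTS5 query syntax is to reduce it to plain tokens and wrap each one in double quotes, which makes words such as `AND` or `NEAR` literal search terms. The helper name `toFtsMatchExpression`, the `OR` join, and the tokenization rule are assumptions; the plan does not show the real query builder.

```typescript
// Hypothetical helper: turn free-form user input into a MATCH expression
// made only of quoted alphanumeric tokens joined with OR.
function toFtsMatchExpression(userInput: string): string {
  const tokens = userInput
    .toLowerCase()
    .split(/[^a-z0-9]+/) // drops quotes, *, :, parentheses, and other operators
    .filter((t) => t.length > 0);
  // Quoting makes FTS5 keywords (and, or, not, near) plain terms.
  return tokens.map((t) => `"${t}"`).join(" OR ");
}
```

An empty result means the input held no searchable tokens, so a caller would likely return no hits rather than run `MATCH` with an empty string.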
44
- ---
45
-
46
- ## Task 1: Golden-set recall regression test (the gate)
47
-
48
- Write the regression net **first**, against the current (unchanged) in-memory scorer. It must pass now and stay green through every later task. Assertions are tolerant — "expected session in top 3" — so they survive the ranking-algorithm change from token-overlap to BM25.
49
-
50
- **Files:**
51
- - Create: `tests/fixtures/golden-corpus.ts`
52
- - Create: `tests/integration/recall-golden.test.ts`
53
-
54
- - [ ] **Step 1: Create the golden corpus fixture**
55
-
56
- Create `tests/fixtures/golden-corpus.ts`:
57
-
58
- ```typescript
59
- import type { Session } from "../../src/shared/types.js";
60
- import { makeSession } from "./sessions.js";
61
-
62
- /**
63
- * A fixed, realistic corpus for recall-quality regression testing.
64
- * `body` is populated on every session because FTS5 keyword search indexes
65
- * label + summary + body. Decision/open text is also present in `body`
66
- * (mirrors production: markers are extracted from the body markdown).
67
- */
68
- export const GOLDEN_CORPUS: ReadonlyArray<Session> = [
69
- makeSession({
70
- id: "g_fts",
71
- label: "FTS5 vs pgvector for recall search backend",
72
- summary: "Compared SQLite FTS5 lexical search against pgvector for the recall layer",
73
- body: "Evaluated FTS5 versus pgvector. FTS5 ships with SQLite and stays zero-config. pgvector needs Postgres running which breaks the five-minute install.",
74
- decisions: ["Use FTS5 for the lexical recall leg"],
75
- entities: ["NLM"],
76
- }),
77
- makeSession({
78
- id: "g_hono",
79
- label: "Hono router setup on port 3940",
80
- summary: "Wired the Hono HTTP router and mounted the recall API",
81
- body: "Set up Hono as the HTTP framework. Mounted routes for recall, sessions, and the live dashboard on port 3940.",
82
- decisions: ["Chose Hono over Express for HTTP routing"],
83
- entities: ["NLM"],
84
- }),
85
- makeSession({
86
- id: "g_pgvector",
87
- label: "pgvector migration plan for the power tier",
88
- summary: "Sketched the Postgres mirror behind the SessionStore port",
89
- body: "Planned a PostgresSessionStore satisfying the same port as SqliteSessionStore. pgvector handles the vector index for users already running Postgres.",
90
- open: ["Timing of the SQLite to Postgres cutover"],
91
- entities: ["NLM", "Postgres"],
92
- }),
93
- makeSession({
94
- id: "g_tauri",
95
- label: "Tauri desktop packaging for v2 distribution",
96
- summary: "Plan to wrap the server and SPA in Tauri for signed installers",
97
- body: "Tauri hosts the Vite SPA in a webview and runs the Node server as a sidecar. Produces dmg, exe, and deb installers.",
98
- open: ["Whether to rewrite the server in Rust later"],
99
- entities: ["NLM"],
100
- }),
101
- makeSession({
102
- id: "g_classifier",
103
- label: "Ollama classifier latency during backfill",
104
- summary: "The Ollama classifier runs about one session per second",
105
- body: "Backfilling a year of history is slow because the Ollama classifier processes roughly one session per second. Considered parallelizing the calls.",
106
- open: ["Parallelize classifier calls or document the DeepSeek path"],
107
- entities: ["NLM", "Ollama"],
108
- }),
109
- makeSession({
110
- id: "g_supersede",
111
- label: "Fact supersedence policy on subject predicate collision",
112
- summary: "Deterministic supersedence when a newer fact collides with an older one",
113
- body: "When a new fact shares the same subject and predicate as a current fact, the older row is marked superseded by the new one. Always supersede, even on same value.",
114
- decisions: ["Always supersede on subject predicate collision"],
115
- entities: ["NLM"],
116
- }),
117
- makeSession({
118
- id: "g_toon",
119
- label: "TOON encoding for MCP tool responses",
120
- summary: "Encode MCP responses as TOON to cut token usage",
121
- body: "The MCP server encodes tool responses as TOON when NLM_FORMAT is set to toon. Falls back to JSON when toonEncode throws.",
122
- decisions: ["TOON-encode MCP responses behind the NLM_FORMAT env flag"],
123
- entities: ["NLM"],
124
- }),
125
- makeSession({
126
- id: "g_camofox",
127
- label: "Camofox audit of the search page",
128
- summary: "Ran a Camofox browser audit against the recall search UI",
129
- body: "Camofox audit found the search page returned zero results because the static build ignored query strings. Fixed with client-side hydration.",
130
- open: ["Should entity facet links filter within search"],
131
- entities: ["NLM", "Camofox"],
132
- }),
133
- ];
134
-
135
- /** query → session id expected to appear in the top 3 keyword results. */
136
- export const GOLDEN_QUERIES: ReadonlyArray<{ query: string; expectTop3: string }> = [
137
- { query: "FTS5 pgvector search backend", expectTop3: "g_fts" },
138
- { query: "Hono router", expectTop3: "g_hono" },
139
- { query: "Tauri packaging installers", expectTop3: "g_tauri" },
140
- { query: "Ollama classifier latency", expectTop3: "g_classifier" },
141
- { query: "fact supersedence collision", expectTop3: "g_supersede" },
142
- { query: "TOON encoding MCP", expectTop3: "g_toon" },
143
- ];
144
- ```
145
-
146
- - [ ] **Step 2: Write the golden recall test**
147
-
148
- Create `tests/integration/recall-golden.test.ts`:
149
-
150
- ```typescript
151
- /**
152
- * Recall-quality regression gate. A fixed corpus + query/expectation pairs,
153
- * run through RecallService against a real SqliteSessionStore. Assertions are
154
- * tolerant (expected session within top 3) so they survive the swap from the
155
- * token-overlap scorer to FTS5 BM25 ranking. This test must stay green from
156
- * the current code through every task in this plan.
157
- */
158
-
159
- import { mkdtempSync, rmSync } from "node:fs";
160
- import { tmpdir } from "node:os";
161
- import { join, resolve } from "node:path";
162
- import { afterEach, beforeEach, describe, expect, it } from "vitest";
163
- import { RecallService } from "../../src/core/recall/recall-service.js";
164
- import { SqliteSessionStore } from "../../src/core/storage/sqlite-session-store.js";
165
- import type { EmbedResult, LLMClient } from "../../src/ports/llm-client.js";
166
- import { LLMUnreachableError } from "../../src/ports/llm-client.js";
167
- import { GOLDEN_CORPUS, GOLDEN_QUERIES } from "../fixtures/golden-corpus.js";
168
-
169
- const MIGRATIONS_DIR = resolve(__dirname, "../../migrations");
170
-
171
- // Keyword-only recall must never touch the embedder; this stub proves it.
172
- class UnreachableEmbedder implements LLMClient {
173
- async embed(): Promise<EmbedResult> {
174
- throw new LLMUnreachableError("ollama");
175
- }
176
- async classify(): Promise<never> {
177
- throw new Error("not used");
178
- }
179
- }
180
-
181
- describe("golden recall regression gate", () => {
182
- let tmp: string;
183
- let store: SqliteSessionStore;
184
-
185
- beforeEach(() => {
186
- tmp = mkdtempSync(join(tmpdir(), "nlm-golden-"));
187
- store = new SqliteSessionStore({
188
- dbPath: join(tmp, "canonical.sqlite"),
189
- migrationsDir: MIGRATIONS_DIR,
190
- });
191
- for (const session of GOLDEN_CORPUS) {
192
- store.insertSessionForTest(session);
193
- }
194
- });
195
-
196
- afterEach(() => {
197
- store.close();
198
- rmSync(tmp, { recursive: true, force: true });
199
- });
200
-
201
- for (const { query, expectTop3 } of GOLDEN_QUERIES) {
202
- it(`keyword recall surfaces "${expectTop3}" in the top 3 for "${query}"`, async () => {
203
- const svc = new RecallService({ store, llm: new UnreachableEmbedder() });
204
- const result = await svc.search({ query, mode: "keyword", limit: 10 });
205
- const top3 = result.results.slice(0, 3).map((r) => r.id);
206
- expect(top3).toContain(expectTop3);
207
- });
208
- }
209
- });
210
- ```
211
-
212
- - [ ] **Step 3: Run the golden test against current code**
213
-
214
- Run: `npm test -- tests/integration/recall-golden.test.ts`
215
- Expected: PASS — all 6 cases. This proves the current in-memory scorer satisfies the golden set; the same test will guard the FTS swap.
216
-
217
- - [ ] **Step 4: Commit**
218
-
219
- ```bash
220
- git checkout -b feat/fts5-lexical-recall
221
- git add tests/fixtures/golden-corpus.ts tests/integration/recall-golden.test.ts
222
- git commit -m "test: add golden-set recall regression gate before FTS5 swap"
223
- ```
224
-
225
- ---
226
-
227
- ## Task 2: FTS index rebuild migration
228
-
229
- `sessions_fts` and its `sessions_ai`/`sessions_au`/`sessions_ad` triggers were declared in migration `000` and have fired on every insert since — the index is normally in sync. Add a one-time `rebuild` as a safety net so the recall path can depend on FTS being complete for all pre-existing rows.
230
-
231
- **Files:**
232
- - Create: `migrations/008_fts_rebuild.sql`
233
- - Create: `tests/integration/fts-index.test.ts`
234
-
235
- - [ ] **Step 1: Write the failing test**
236
-
237
- Create `tests/integration/fts-index.test.ts`:
238
-
239
- ```typescript
240
- /**
241
- * Verifies the sessions_fts FTS5 index is present and kept in sync with the
242
- * sessions table after migrations run and rows are inserted.
243
- */
244
-
245
- import { mkdtempSync, rmSync } from "node:fs";
246
- import { tmpdir } from "node:os";
247
- import { join, resolve } from "node:path";
248
- import { afterEach, beforeEach, describe, expect, it } from "vitest";
249
- import { SqliteSessionStore } from "../../src/core/storage/sqlite-session-store.js";
250
- import { makeSession } from "../fixtures/sessions.js";
251
-
252
- const MIGRATIONS_DIR = resolve(__dirname, "../../migrations");
253
-
254
- describe("sessions_fts index", () => {
255
- let tmp: string;
256
- let store: SqliteSessionStore;
257
-
258
- beforeEach(() => {
259
- tmp = mkdtempSync(join(tmpdir(), "nlm-fts-"));
260
- store = new SqliteSessionStore({
261
- dbPath: join(tmp, "canonical.sqlite"),
262
- migrationsDir: MIGRATIONS_DIR,
263
- });
264
- });
265
-
266
- afterEach(() => {
267
- store.close();
268
- rmSync(tmp, { recursive: true, force: true });
269
- });
270
-
271
- it("populates sessions_fts via triggers on insert", () => {
272
- store.insertSessionForTest(makeSession({ id: "s1", label: "alpha", body: "beta" }));
273
- store.insertSessionForTest(makeSession({ id: "s2", label: "gamma", body: "delta" }));
274
- const db = store.rawDb();
275
- const fts = db.prepare<[], { n: number }>("SELECT count(*) AS n FROM sessions_fts").get();
276
- const rows = db.prepare<[], { n: number }>("SELECT count(*) AS n FROM sessions").get();
277
- expect(fts?.n).toBe(rows?.n);
278
- expect(fts?.n).toBe(2);
279
- });
280
-
281
- it("records the 008 fts_rebuild migration as applied", () => {
282
- const db = store.rawDb();
283
- const row = db
284
- .prepare<[number], { name: string }>("SELECT name FROM schema_migrations WHERE version = ?")
285
- .get(8);
286
- expect(row?.name).toBe("fts_rebuild");
287
- });
288
-
289
- it("answers a raw FTS5 MATCH query", () => {
290
- store.insertSessionForTest(makeSession({ id: "s1", label: "pgvector plan", body: "" }));
291
- const db = store.rawDb();
292
- const hit = db
293
- .prepare<[string], { id: string }>(
294
- "SELECT s.id FROM sessions_fts JOIN sessions s ON s.rowid = sessions_fts.rowid WHERE sessions_fts MATCH ?",
295
- )
296
- .get('"pgvector"');
297
- expect(hit?.id).toBe("s1");
298
- });
299
- });
300
- ```
301
-
302
- - [ ] **Step 2: Run the test to verify it fails**
303
-
304
- Run: `npm test -- tests/integration/fts-index.test.ts`
305
- Expected: FAIL on the "records the 008 fts_rebuild migration" case — version 8 is not in `schema_migrations` because the migration file does not exist yet. (The other two cases may already pass — the triggers exist in migration 000.)
306
-
307
- - [ ] **Step 3: Write the migration**
308
-
309
- Create `migrations/008_fts_rebuild.sql`:
310
-
311
- ```sql
312
- -- One-time safety rebuild of the sessions_fts external-content FTS5 index.
313
- -- The virtual table and its sync triggers (sessions_ai / sessions_au /
314
- -- sessions_ad) were declared in migration 000 and have fired on every write
315
- -- since, so the index is normally already in sync. This rebuild guarantees
316
- -- the index matches every existing sessions row before the recall path
317
- -- starts depending on FTS5 for keyword search. Safe and idempotent.
318
- INSERT INTO sessions_fts(sessions_fts) VALUES('rebuild');
319
-
320
- INSERT OR IGNORE INTO schema_migrations (version, name) VALUES (8, 'fts_rebuild');
321
- ```
322
-
323
- - [ ] **Step 4: Run the test to verify it passes**
324
-
325
- Run: `npm test -- tests/integration/fts-index.test.ts`
326
- Expected: PASS — all 3 cases.
327
-
328
- - [ ] **Step 5: Commit**
329
-
330
- ```bash
331
- git add migrations/008_fts_rebuild.sql tests/integration/fts-index.test.ts
332
- git commit -m "feat: add FTS5 index rebuild migration with sync verification"
333
- ```
334
-
335
- ---
336
-
337
- ## Task 3: Add `keywordSearch` to the SessionStore port
338
-
339
- Add the port method and its `SqliteSessionStore` implementation. The unit-test `InMemoryStore` fake (in `recall-service.test.ts`) also `implements SessionStore`, so it must gain a `keywordSearch` stub in this task or `typecheck` breaks — `RecallService` does not call it until Task 4, so a minimal stub is correct here.
340
-
341
- **Files:**
342
- - Modify: `src/ports/session-store.ts`
343
- - Modify: `src/core/storage/sqlite-session-store.ts`
344
- - Modify: `tests/unit/core/recall-service.test.ts:12-27` (add stub method to `InMemoryStore`)
345
- - Test: `tests/integration/keyword-search-fts.test.ts`
346
-
347
- - [ ] **Step 1: Write the failing test**
348
-
349
- Create `tests/integration/keyword-search-fts.test.ts`:
350
-
351
- ```typescript
352
- /**
353
- * Direct coverage of SqliteSessionStore.keywordSearch — FTS5 BM25 ranking
354
- * and resilience to FTS5 query-syntax metacharacters in user input.
355
- */
356
-
357
- import { mkdtempSync, rmSync } from "node:fs";
358
- import { tmpdir } from "node:os";
359
- import { join, resolve } from "node:path";
360
- import { afterEach, beforeEach, describe, expect, it } from "vitest";
361
- import { SqliteSessionStore } from "../../src/core/storage/sqlite-session-store.js";
362
- import { makeSession } from "../fixtures/sessions.js";
363
-
364
- const MIGRATIONS_DIR = resolve(__dirname, "../../migrations");
365
-
366
- describe("SqliteSessionStore.keywordSearch", () => {
367
- let tmp: string;
368
- let store: SqliteSessionStore;
369
-
370
- beforeEach(() => {
371
- tmp = mkdtempSync(join(tmpdir(), "nlm-kw-"));
372
- store = new SqliteSessionStore({
373
- dbPath: join(tmp, "canonical.sqlite"),
374
- migrationsDir: MIGRATIONS_DIR,
375
- });
376
- store.insertSessionForTest(
377
- makeSession({ id: "s_pg", label: "pgvector migration plan", body: "postgres mirror" }),
378
- );
379
- store.insertSessionForTest(
380
- makeSession({ id: "s_hono", label: "Hono router", body: "http framework setup" }),
381
- );
382
- store.insertSessionForTest(
383
- makeSession({ id: "s_misc", label: "unrelated work", body: "nothing in common" }),
384
- );
385
- });
386
-
387
- afterEach(() => {
388
- store.close();
389
- rmSync(tmp, { recursive: true, force: true });
390
- });
391
-
392
- it("ranks the matching session first and returns a positive score", async () => {
393
- const hits = await store.keywordSearch("pgvector", 10);
394
- expect(hits[0]?.sessionId).toBe("s_pg");
395
- expect(hits[0]?.score).toBeGreaterThan(0);
396
- });
397
-
398
- it("matches body text, not just the label", async () => {
399
- const hits = await store.keywordSearch("framework", 10);
400
- expect(hits.map((h) => h.sessionId)).toContain("s_hono");
401
- });
402
-
403
- it("returns an empty array for a query with no indexable tokens", async () => {
404
- const hits = await store.keywordSearch("---", 10);
405
- expect(hits).toEqual([]);
406
- });
407
-
408
- it("does not throw on FTS5 metacharacters in the query", async () => {
409
- const hits = await store.keywordSearch('pgvector OR (qdrant) NEAR "x"', 10);
410
- expect(hits.map((h) => h.sessionId)).toContain("s_pg");
411
- });
412
-
413
- it("respects the limit", async () => {
414
- const hits = await store.keywordSearch("plan router work", 1);
415
- expect(hits.length).toBeLessThanOrEqual(1);
416
- });
417
- });
418
- ```
419
-
420
- - [ ] **Step 2: Run the test to verify it fails**
421
-
422
- Run: `npm test -- tests/integration/keyword-search-fts.test.ts`
423
- Expected: FAIL — `store.keywordSearch is not a function`.
424
-
425
- - [ ] **Step 3: Add the port interface members**
426
-
427
- In `src/ports/session-store.ts`, add the `KeywordNeighbor` interface immediately after the existing `SemanticNeighbor` interface (after line 20):
428
-
429
- ```typescript
430
- export interface KeywordNeighbor {
431
- readonly sessionId: string;
432
- readonly score: number;
433
- }
434
- ```
435
-
436
- Then add the method to the `SessionStore` interface, immediately after the `semanticSearch` declaration (after line 30):
437
-
438
- ```typescript
439
- keywordSearch(
440
- query: string,
441
- limit: number,
442
- ): Promise<ReadonlyArray<KeywordNeighbor>>;
443
- ```
444
-
445
- - [ ] **Step 4: Implement `keywordSearch` in `SqliteSessionStore`**
446
-
447
- In `src/core/storage/sqlite-session-store.ts`:
448
-
449
- Add to the import from `@ports/session-store.js` (currently `SemanticNeighbor, SessionFilter, SessionStore`) the new `KeywordNeighbor` type:
450
-
451
- ```typescript
452
- import type {
453
- KeywordNeighbor,
454
- SemanticNeighbor,
455
- SessionFilter,
456
- SessionStore,
457
- } from "@ports/session-store.js";
458
- ```
459
-
460
- Add a `tokenize` import below the existing `runMigrations` import (after line 32):
461
-
462
- ```typescript
463
- import { tokenize } from "@core/recall/tokenize.js";
464
- ```
465
-
466
- Add the method immediately after `semanticSearch` (after line 461, before `updateStatus`):
467
-
468
- ```typescript
469
- /**
470
- * Lexical recall via the sessions_fts FTS5 index. BM25 column weights
471
- * favour label over summary over body. Returns sessions ranked best-first
472
- * with a positive score (the negated bm25() value — bm25 is more negative
473
- * for better matches). User input is tokenized and rebuilt into a quoted
474
- * OR query so FTS5 metacharacters cannot reach the MATCH parser.
475
- */
476
- async keywordSearch(
477
- query: string,
478
- limit: number,
479
- ): Promise<ReadonlyArray<KeywordNeighbor>> {
480
- const matchExpr = toMatchExpression(query);
481
- if (!matchExpr) return [];
482
- const k = Math.max(1, Math.trunc(limit));
483
- const rows = this.db
484
- .prepare<[string, number], { sessionId: string; score: number }>(`
485
- SELECT s.id AS sessionId,
486
- -bm25(sessions_fts, 10.0, 4.0, 1.0) AS score
487
- FROM sessions_fts
488
- JOIN sessions s ON s.rowid = sessions_fts.rowid
489
- WHERE sessions_fts MATCH ?
490
- ORDER BY score DESC
491
- LIMIT ?
492
- `)
493
- .all(matchExpr, k);
494
- return rows.map((r) => ({ sessionId: r.sessionId, score: r.score }));
495
- }
496
- ```
497
-
498
- Add this module-level helper at the end of the file, after the closing brace of the `SqliteSessionStore` class:
499
-
500
- ```typescript
501
- /**
502
- * Builds a safe FTS5 MATCH expression from raw user input. Each indexable
503
- * token becomes a double-quoted string literal; literals are OR-joined.
504
- * Quoting neutralizes FTS5 operators (AND, OR, NEAR, *, parentheses, colon).
505
- * Returns null when the query has no indexable tokens.
506
- */
507
- function toMatchExpression(query: string): string | null {
508
- const terms = tokenize(query);
509
- if (terms.length === 0) return null;
510
- return terms.map((t) => `"${t.replace(/"/g, '""')}"`).join(" OR ");
511
- }
512
- ```
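To make the quoting step concrete, here is a minimal, self-contained sketch of the same idea. The real `tokenize` from `@core/recall/tokenize.js` is not shown in this plan, so `tokenizeSketch` below is a hypothetical stand-in that lowercases the input and splits on non-alphanumeric characters. The actual tokenizer may behave differently, for example around stop words or minimum token length.

```typescript
// Hypothetical stand-in for the project's tokenizer (an assumption, not the real one):
// lowercase the input and keep runs of ASCII letters and digits.
function tokenizeSketch(query: string): string[] {
  return query.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Same shape as toMatchExpression above: quote each token, double any embedded
// quote characters, and OR-join the literals. Returns null when nothing is indexable.
function toMatchExpressionSketch(query: string): string | null {
  const terms = tokenizeSketch(query);
  if (terms.length === 0) return null;
  return terms.map((t) => `"${t.replace(/"/g, '""')}"`).join(" OR ");
}

const riskyExpr = toMatchExpressionSketch('pgvector OR (qdrant) NEAR "x"');
const emptyExpr = toMatchExpressionSketch("---");

console.log(riskyExpr); // "pgvector" OR "or" OR "qdrant" OR "near" OR "x"
console.log(emptyExpr); // null
```

Under this stand-in tokenizer, the operator words `OR` and `NEAR` from the user's input end up as quoted string literals, so FTS5 parses them as search terms. The only operators left in the expression are the ` OR ` joins that the helper inserts itself.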
513
-
514
- - [ ] **Step 5: Add a `keywordSearch` stub to the `InMemoryStore` test fake**
515
-
516
- In `tests/unit/core/recall-service.test.ts`, the `InMemoryStore` class (lines 12-27) `implements SessionStore` and will no longer compile without the new method. Add the import and a minimal stub — Task 4 replaces this stub with a real implementation.
517
-
518
- Change the import block (lines 5-8) to include `KeywordNeighbor`:
519
-
520
- ```typescript
521
- import type {
522
- KeywordNeighbor,
523
- SessionStore,
524
- SemanticNeighbor,
525
- } from "../../../src/ports/session-store.js";
526
- ```
527
-
528
- Add this method inside `InMemoryStore`, after `semanticSearch` (after line 25):
529
-
530
- ```typescript
531
- async keywordSearch(): Promise<ReadonlyArray<KeywordNeighbor>> {
532
- return [];
533
- }
534
- ```
535
-
536
- - [ ] **Step 6: Run the tests**
537
-
538
- Run: `npm test -- tests/integration/keyword-search-fts.test.ts && npm run typecheck`
539
- Expected: PASS — all 5 `keywordSearch` cases; `typecheck` clean.
540
-
541
- Run: `npm test -- tests/integration/recall-golden.test.ts`
542
- Expected: PASS — golden gate still green (`RecallService` unchanged, still uses the in-memory scorer).
543
-
544
- - [ ] **Step 7: Commit**
545
-
546
- ```bash
547
- git add src/ports/session-store.ts src/core/storage/sqlite-session-store.ts tests/unit/core/recall-service.test.ts tests/integration/keyword-search-fts.test.ts
548
- git commit -m "feat: add FTS5-backed keywordSearch to the SessionStore port"
549
- ```
550
-
551
- ---
552
-
553
- ## Task 4: Rewire `RecallService` to use `keywordSearch`
554
-
555
- Switch the keyword and hybrid legs from the in-memory `scoreAll`/`scoreKeyword` path to `store.keywordSearch`. `matchedIn` badges are computed in core from the resolved `Session` (which carries `decisions`/`open` from the `markers` table) via a new pure helper, so the `MatchField` type is unchanged and decision/open badges stay accurate.
556
-
557
- **Hybrid weighting note:** `mergeHybrid` normalizes each leg by its own max (`score / maxKw`). That normalization absorbs the scale change from token-overlap counts to negated-BM25 values, so the 0.6 semantic / 0.4 keyword split is *deliberately retained* — this is the re-tuning conclusion, verified by the hybrid test below, not an oversight.
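The scale argument can be checked with a small standalone sketch. `blendSketch` below is a hypothetical illustration of normalize-by-max blending, not the project's `mergeHybrid` (whose body is not shown in this plan). It assumes each leg is divided by its own maximum and then combined with the 0.4 keyword / 0.6 semantic weights.

```typescript
interface LegHit {
  readonly id: string;
  readonly score: number;
}

// Hypothetical blend: divide each leg by its own max, then weight and sum.
function blendSketch(
  kw: ReadonlyArray<LegHit>,
  sem: ReadonlyArray<LegHit>,
  wKw = 0.4,
  wSem = 0.6,
): Map<string, number> {
  const maxKw = Math.max(0, ...kw.map((h) => h.score));
  const maxSem = Math.max(0, ...sem.map((h) => h.score));
  const blended = new Map<string, number>();
  const add = (id: string, value: number): void => {
    blended.set(id, (blended.get(id) ?? 0) + value);
  };
  for (const h of kw) if (maxKw > 0) add(h.id, wKw * (h.score / maxKw));
  for (const h of sem) if (maxSem > 0) add(h.id, wSem * (h.score / maxSem));
  return blended;
}

const semanticLeg: LegHit[] = [
  { id: "a", score: 0.5 },
  { id: "b", score: 1.0 },
];
// The same relative ranking on two different scales:
// token-overlap counts versus BM25-like magnitudes.
const overlapCounts: LegHit[] = [
  { id: "a", score: 4 },
  { id: "b", score: 2 },
];
const bm25Like: LegHit[] = [
  { id: "a", score: 10 },
  { id: "b", score: 5 },
];

const fromCounts = blendSketch(overlapCounts, semanticLeg);
const fromBm25 = blendSketch(bm25Like, semanticLeg);
console.log(fromCounts.get("a"), fromBm25.get("a"));
console.log(fromCounts.get("b"), fromBm25.get("b"));
```

Both inputs blend to roughly 0.7 for `a` and 0.8 for `b`, because dividing by the leg's own maximum cancels any uniform rescale. The invariance is exact only when the two scorers differ by a constant factor. BM25 can also change the ratios between hits, and that part is what the hybrid tests in this task are meant to cover.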
558
-
559
- **Files:**
560
- - Create: `src/core/recall/match-fields.ts`
561
- - Modify: `src/core/recall/recall-service.ts`
562
- - Modify: `src/core/recall/index.ts`
563
- - Modify: `tests/unit/core/recall-service.test.ts`
564
- - Modify: `tests/integration/recall-sqlite.test.ts`
565
-
566
- - [ ] **Step 1: Write the `match-fields` helper with its test**
567
-
568
- Create `src/core/recall/match-fields.ts`:
569
-
570
- ```typescript
571
- /**
572
- * Computes which session fields a keyword query matched, for the `matchedIn`
573
- * badge on a RecallHit. Pure function — no DB, no I/O. FTS5 BM25 ranks the
574
- * whole row; this recovers per-field attribution from the resolved Session,
575
- * including decisions/open which live in the markers table (not in FTS).
576
- */
577
-
578
- import type { MatchField, Session } from "@shared/types.js";
579
- import { tokenSet } from "./tokenize.js";
580
-
581
- type SessionFields = Pick<Session, "label" | "summary" | "decisions" | "open">;
582
-
583
- export function keywordMatchFields(
584
- session: SessionFields,
585
- queryTokens: ReadonlySet<string>,
586
- ): ReadonlyArray<MatchField> {
587
- if (queryTokens.size === 0) return [];
588
- const fields: MatchField[] = [];
589
-
590
- if (overlaps(queryTokens, tokenSet(session.label))) fields.push("label");
591
- if (overlaps(queryTokens, joinedTokens(session.decisions))) fields.push("decisions");
592
- if (overlaps(queryTokens, joinedTokens(session.open))) fields.push("open");
593
- if (overlaps(queryTokens, tokenSet(session.summary))) fields.push("summary");
594
-
595
- return fields;
596
- }
597
-
598
- function joinedTokens(values: ReadonlyArray<string>): Set<string> {
599
- const out = new Set<string>();
600
- for (const v of values) {
601
- for (const t of tokenSet(v)) out.add(t);
602
- }
603
- return out;
604
- }
605
-
606
- function overlaps(a: ReadonlySet<string>, b: ReadonlySet<string>): boolean {
607
- const [small, large] = a.size <= b.size ? [a, b] : [b, a];
608
- for (const item of small) if (large.has(item)) return true;
609
- return false;
610
- }
611
- ```
612
-
613
- Create `tests/unit/core/match-fields.test.ts`:
614
-
615
- ```typescript
616
- import { describe, expect, it } from "vitest";
617
- import { keywordMatchFields } from "../../../src/core/recall/match-fields.js";
618
- import { tokenSet } from "../../../src/core/recall/tokenize.js";
619
- import { makeSession } from "../../fixtures/sessions.js";
620
-
621
- describe("keywordMatchFields", () => {
622
- it("returns no fields for empty query tokens", () => {
623
- expect(keywordMatchFields(makeSession({ label: "anything" }), new Set())).toEqual([]);
624
- });
625
-
626
- it("reports the label field on a label match", () => {
627
- const session = makeSession({ label: "pgvector migration plan" });
628
- expect(keywordMatchFields(session, tokenSet("pgvector"))).toEqual(["label"]);
629
- });
630
-
631
- it("reports decisions and open from marker text", () => {
632
- const session = makeSession({
633
- decisions: ["picked Hono for HTTP"],
634
- open: ["whether to use Tauri later"],
635
- });
636
- expect(keywordMatchFields(session, tokenSet("Hono"))).toEqual(["decisions"]);
637
- expect(keywordMatchFields(session, tokenSet("Tauri"))).toEqual(["open"]);
638
- });
639
-
640
- it("reports every matching field in label, decisions, open, summary order", () => {
641
- const session = makeSession({
642
- label: "recall port",
643
- summary: "ported recall to TypeScript",
644
- decisions: ["use sqlite-vec for semantic recall"],
645
- open: ["recall stats endpoint"],
646
- });
647
- expect(keywordMatchFields(session, tokenSet("recall"))).toEqual([
648
- "label",
649
- "decisions",
650
- "open",
651
- "summary",
652
- ]);
653
- });
654
- });
655
- ```
656
-
657
- - [ ] **Step 2: Run the helper test**
658
-
659
- Run: `npm test -- tests/unit/core/match-fields.test.ts`
660
- Expected: PASS — all 4 cases.
661
-
662
- - [ ] **Step 3: Rewire `RecallService`**
663
-
664
- In `src/core/recall/recall-service.ts`:
665
-
666
- Replace the two import lines for `score-keyword` and `tokenize` (lines 20-22) with:
667
-
668
- ```typescript
669
- import { applyFilter } from "./filter.js";
670
- import { keywordMatchFields } from "./match-fields.js";
671
- import { tokenSet } from "./tokenize.js";
672
- ```
673
-
674
- Add a keyword overfetch constant next to `SEMANTIC_OVERFETCH` (after line 28):
675
-
676
- ```typescript
677
- const KEYWORD_OVERFETCH = 3;
678
- ```
679
-
680
- Replace the keyword-hits block (lines 65-68):
681
-
682
- ```typescript
683
- const kwHits =
684
- mode === "keyword" || mode === "hybrid"
685
- ? scoreAll(filtered, queryTokens)
686
- : [];
687
- ```
688
-
689
- with:
690
-
691
- ```typescript
692
- const kwHits =
693
- (mode === "keyword" || mode === "hybrid") && input.query
694
- ? await this.runKeyword(
695
- input.query,
696
- byId,
697
- queryTokens,
698
- limit * KEYWORD_OVERFETCH,
699
- )
700
- : [];
701
- ```
702
-
703
- Add a `runKeyword` private method immediately after the `runSemantic` method (after line 116, before the closing brace of the class):
704
-
705
- ```typescript
706
- private async runKeyword(
707
- query: string,
708
- byId: ReadonlyMap<string, Session>,
709
- queryTokens: ReadonlySet<string>,
710
- fetchLimit: number,
711
- ): Promise<ReadonlyArray<KeywordHit>> {
712
- const neighbors = await this.deps.store.keywordSearch(query, fetchLimit);
713
- const hits: KeywordHit[] = [];
714
- for (const n of neighbors) {
715
- const session = byId.get(n.sessionId);
716
- if (!session) continue;
717
- hits.push({
718
- session,
719
- score: n.score,
720
- matchedIn: keywordMatchFields(session, queryTokens),
721
- });
722
- }
723
- return hits;
724
- }
725
- ```
726
-
727
- Delete the now-unused `scoreAll` function (lines 130-142):
728
-
729
- ```typescript
730
- function scoreAll(
731
- sessions: ReadonlyArray<Session>,
732
- queryTokens: ReadonlySet<string>,
733
- ): ReadonlyArray<KeywordHit> {
734
- if (queryTokens.size === 0) return [];
735
- const hits: KeywordHit[] = [];
736
- for (const s of sessions) {
737
- const { score, matchedIn } = scoreKeyword(s, queryTokens);
738
- if (score > 0) hits.push({ session: s, score, matchedIn });
739
- }
740
- hits.sort((a, b) => b.score - a.score);
741
- return hits;
742
- }
743
- ```
744
-
745
- Note: `filtered` is still used (it builds `byId`), and `queryTokens` is still used (passed to `runKeyword` for `matchedIn`). Leave both. `byId` is built from the entity/kind-filtered set, so `runKeyword` resolving through it naturally drops filtered-out sessions — same pattern as `runSemantic`.
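The resolve-through-the-filtered-map pattern in that note can be sketched in isolation. The types and the `resolveThroughFilter` name below are hypothetical simplifications. The sketch shows only that a store hit whose id is absent from the filtered map is skipped.

```typescript
interface SessionSketch {
  readonly id: string;
  readonly entities: ReadonlyArray<string>;
}

interface NeighborSketch {
  readonly sessionId: string;
  readonly score: number;
}

// Resolve store hits against a map that already reflects the entity/kind filter.
function resolveThroughFilter(
  neighbors: ReadonlyArray<NeighborSketch>,
  byId: ReadonlyMap<string, SessionSketch>,
): Array<{ session: SessionSketch; score: number }> {
  const hits: Array<{ session: SessionSketch; score: number }> = [];
  for (const n of neighbors) {
    const session = byId.get(n.sessionId);
    if (!session) continue; // filtered out upstream, so the store hit is dropped
    hits.push({ session, score: n.score });
  }
  return hits;
}

const allSessions: SessionSketch[] = [
  { id: "b", entities: ["NLM", "Postgres"] },
  { id: "c", entities: ["Other"] },
];
// Build the lookup only from sessions that pass the entity filter.
const filteredById = new Map(
  allSessions.filter((s) => s.entities.includes("NLM")).map((s) => [s.id, s] as const),
);
const resolvedHits = resolveThroughFilter(
  [
    { sessionId: "b", score: 5 },
    { sessionId: "c", score: 4 },
  ],
  filteredById,
);
console.log(resolvedHits.map((h) => h.session.id));
```

Because the store ranks against the whole index and the filter is applied afterwards, some fetched rows can be discarded here. That is presumably the reason the keyword leg requests `limit * KEYWORD_OVERFETCH` rows, although the plan does not state the rationale explicitly.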
746
-
747
- - [ ] **Step 4: Update the recall barrel export**
748
-
749
- In `src/core/recall/index.ts`, remove the `scoreKeyword` export (lines 3-4) and add the `keywordMatchFields` export. The file becomes:
750
-
751
- ```typescript
752
- export { RecallService } from "./recall-service.js";
753
- export type { RecallServiceDeps } from "./recall-service.js";
754
- export { keywordMatchFields } from "./match-fields.js";
755
- export { applyFilter } from "./filter.js";
756
- export type { RecallFilter } from "./filter.js";
757
- export { tokenize, tokenSet } from "./tokenize.js";
758
- ```
759
-
760
- - [ ] **Step 5: Update the `InMemoryStore` fake and keyword/hybrid unit tests**
761
-
762
- Replace the entire contents of `tests/unit/core/recall-service.test.ts` with:
763
-
764
- ```typescript
765
- import { describe, expect, it } from "vitest";
766
- import { RecallService } from "../../../src/core/recall/recall-service.js";
767
- import type { LLMClient, EmbedResult } from "../../../src/ports/llm-client.js";
768
- import { LLMUnreachableError } from "../../../src/ports/llm-client.js";
769
- import type {
770
- KeywordNeighbor,
771
- SessionStore,
772
- SemanticNeighbor,
773
- } from "../../../src/ports/session-store.js";
774
- import type { Session } from "../../../src/shared/types.js";
775
- import { makeSession } from "../../fixtures/sessions.js";
776
-
777
- // Fake store: keyword and semantic hits are pre-baked. Unit tests here cover
778
- // RecallService orchestration (filter, merge, limit, error handling) — not
779
- // keyword ranking quality, which is covered by the FTS integration tests.
780
- class InMemoryStore implements SessionStore {
781
- constructor(
782
- private readonly sessions: Session[],
783
- private readonly neighbors: SemanticNeighbor[] = [],
784
- private readonly keywordHits: KeywordNeighbor[] = [],
785
- ) {}
786
- async list(): Promise<ReadonlyArray<Session>> {
787
- return this.sessions;
788
- }
789
- async getById(id: string): Promise<Session | null> {
790
- return this.sessions.find((s) => s.id === id) ?? null;
791
- }
792
- async semanticSearch(): Promise<ReadonlyArray<SemanticNeighbor>> {
793
- return this.neighbors;
794
- }
795
- async keywordSearch(): Promise<ReadonlyArray<KeywordNeighbor>> {
796
- return this.keywordHits;
797
- }
798
- async updateStatus(): Promise<void> {}
799
- }
800
-
801
- class StubEmbedder implements LLMClient {
802
- constructor(private readonly fail: boolean = false) {}
803
- async embed(): Promise<EmbedResult> {
804
- if (this.fail) throw new LLMUnreachableError("ollama");
805
- return { vector: new Float32Array([1, 0, 0]), model: "stub" };
806
- }
807
- async classify(): Promise<never> {
808
- throw new Error("not used");
809
- }
810
- }
811
-
812
- const corpus: Session[] = [
813
- makeSession({
814
- id: "a",
815
- label: "Hono router setup",
816
- entities: ["NLM"],
817
- decisions: ["chose Hono over Express"],
818
- }),
819
- makeSession({
820
- id: "b",
821
- label: "pgvector migration plan",
822
- entities: ["NLM", "Postgres"],
823
- open: ["timing of cutover"],
824
- }),
825
- makeSession({
826
- id: "c",
827
- label: "unrelated session",
828
- entities: ["Other"],
829
- }),
830
- ];
831
-
832
- describe("RecallService.search", () => {
833
- it("returns empty result when query and filters are all blank", async () => {
834
- const svc = new RecallService({
835
- store: new InMemoryStore(corpus),
836
- llm: new StubEmbedder(),
837
- });
838
- const result = await svc.search({ query: "" });
839
- expect(result.total).toBe(0);
840
- expect(result.results).toEqual([]);
841
- });
842
-
843
- it("keyword mode surfaces store keyword hits ranked by store score", async () => {
844
- const store = new InMemoryStore(corpus, [], [
845
- { sessionId: "b", score: 9.2 },
846
- { sessionId: "a", score: 2.1 },
847
- ]);
848
- const svc = new RecallService({ store, llm: new StubEmbedder() });
849
- const result = await svc.search({ query: "pgvector", mode: "keyword" });
850
- expect(result.results.map((r) => r.id)).toEqual(["b", "a"]);
851
- expect(result.results[0]?.matchScore).toBe(9.2);
852
- });
853
-
854
- it("keyword mode populates matchedIn from the resolved session", async () => {
855
- const store = new InMemoryStore(corpus, [], [{ sessionId: "b", score: 5 }]);
856
- const svc = new RecallService({ store, llm: new StubEmbedder() });
857
- const result = await svc.search({ query: "pgvector", mode: "keyword" });
858
- expect(result.results[0]?.matchedIn).toEqual(["label"]);
859
- });
860
-
861
- it("entity filter restricts the keyword corpus", async () => {
862
- const store = new InMemoryStore(corpus, [], [
863
- { sessionId: "b", score: 5 },
864
- { sessionId: "c", score: 4 },
865
- ]);
866
- const svc = new RecallService({ store, llm: new StubEmbedder() });
867
- const result = await svc.search({ query: "session", mode: "keyword", entity: "NLM" });
868
- expect(result.results.every((r) => r.entities.includes("NLM"))).toBe(true);
869
- expect(result.results.map((r) => r.id)).not.toContain("c");
870
- });
871
-
872
- it("semantic mode returns ollama_unreachable when the embedder fails", async () => {
873
- const svc = new RecallService({
874
- store: new InMemoryStore(corpus),
875
- llm: new StubEmbedder(true),
876
- });
877
- const result = await svc.search({ query: "anything", mode: "semantic" });
878
- expect(result.modeUnavailable).toBe("ollama_unreachable");
879
- expect(result.results).toEqual([]);
880
- });
881
-
882
- it("hybrid mode degrades to keyword scores when semantic is unavailable", async () => {
883
- const store = new InMemoryStore(corpus, [], [{ sessionId: "b", score: 7 }]);
884
- const svc = new RecallService({ store, llm: new StubEmbedder(true) });
885
- const result = await svc.search({ query: "pgvector", mode: "hybrid" });
886
- expect(result.modeUnavailable).toBe("ollama_unreachable");
887
- expect(result.results).toHaveLength(1);
888
- expect(result.results[0]?.id).toBe("b");
889
- });
890
-
891
- it("semantic mode reports cosine similarity computed from L2 distance of unit vectors", async () => {
892
- const store = new InMemoryStore(corpus, [{ sessionId: "a", distance: 0 }]);
893
- const svc = new RecallService({ store, llm: new StubEmbedder() });
894
- const result = await svc.search({ query: "anything", mode: "semantic" });
895
- expect(result.results[0]?.matchScore).toBe(1);
896
- });
897
-
898
- it("hybrid mode blends 0.4 * kw + 0.6 * sem after per-leg normalization", async () => {
899
- const store = new InMemoryStore(
900
- corpus,
901
- [{ sessionId: "b", distance: 0 }],
902
- [{ sessionId: "b", score: 9.2 }],
903
- );
904
- const svc = new RecallService({ store, llm: new StubEmbedder() });
905
- const result = await svc.search({ query: "pgvector", mode: "hybrid" });
906
- const top = result.results[0];
907
- expect(top?.id).toBe("b");
908
- // kwNorm = 1 (only hit / its own max), semNorm = 1 (distance 0) => 0.4 + 0.6 = 1
909
- expect(top?.matchScore).toBeCloseTo(1, 4);
910
- expect(top?.keywordScore).toBe(1);
911
- expect(top?.semanticScore).toBe(1);
912
- });
913
-
914
- it("clamps limit to MAX_LIMIT (100) and at least 1", async () => {
915
- const store = new InMemoryStore(corpus, [], [{ sessionId: "b", score: 5 }]);
916
- const svc = new RecallService({ store, llm: new StubEmbedder() });
917
- const big = await svc.search({ query: "session", mode: "keyword", limit: 9999 });
918
- expect(big.limit).toBe(100);
919
- const small = await svc.search({ query: "session", mode: "keyword", limit: 0 });
920
- expect(small.limit).toBe(1);
921
- });
922
- });
923
- ```
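The semantic-mode test above expects a distance of 0 to map to a score of 1. For unit-length vectors, Euclidean distance d and cosine similarity are related by d² = 2 − 2·cos θ, so cos θ = 1 − d²/2. The sketch below illustrates that identity only. The function names are hypothetical, and the conversion actually used inside `RecallService` is not shown in this plan (it may clamp the result or use a different distance convention).

```typescript
// For unit vectors a and b: |a - b|^2 = |a|^2 + |b|^2 - 2(a . b) = 2 - 2 cos(theta).
function cosineFromUnitL2(distance: number): number {
  return 1 - (distance * distance) / 2;
}

// Plain Euclidean distance between two equal-length vectors.
function l2Distance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - (b[i] ?? 0)) ** 2, 0));
}

const sameDirection = cosineFromUnitL2(l2Distance([1, 0, 0], [1, 0, 0]));
const orthogonalPair = cosineFromUnitL2(l2Distance([1, 0, 0], [0, 1, 0]));
const oppositePair = cosineFromUnitL2(l2Distance([1, 0, 0], [-1, 0, 0]));
console.log(sameDirection, orthogonalPair, oppositePair);
```

Identical vectors give 1, orthogonal vectors give approximately 0 (up to floating-point rounding), and opposite vectors give -1. The first case matches the `distance: 0` fixture in the unit test.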
924
-
925
- - [ ] **Step 6: Update the integration test seed to populate `body`**
926
-
927
- In `tests/integration/recall-sqlite.test.ts`, the seed sessions (lines 42-72) set `label`/`summary` but not `body`. Keyword recall now runs through FTS5, which indexes `label`, `summary`, and `body`. The existing seeds would still match on `label`/`summary`, but add `body` to each seed session so the corpus is realistic and the keyword cases exercise body matching. Replace the `seed` array (lines 42-72) with:
928
-
929
- ```typescript
930
- const seed: ReadonlyArray<{ session: Session; embedding: Float32Array }> = [
931
- {
932
- session: makeSession({
933
- id: "sess_a",
934
- label: "Hono router setup",
935
- summary: "Wired Hono onto port 3940 with sqlite session store",
936
- body: "Chose Hono over Express for routing. Mounted the recall API on port 3940.",
937
- entities: ["NLM"],
938
- decisions: ["chose Hono over Express for routing"],
939
- }),
940
- embedding: unit([1, 0, 0]),
941
- },
942
- {
943
- session: makeSession({
944
- id: "sess_b",
945
- label: "pgvector migration plan",
946
- summary: "Sketched eventual Postgres mirror via PostgresSessionStore port",
947
- body: "Planned the pgvector power tier. Open question: timing of cutover from SQLite to Postgres.",
948
- entities: ["NLM", "Postgres"],
949
- open: ["timing of cutover from SQLite to Postgres"],
950
- }),
951
- embedding: unit([0, 1, 0]),
952
- },
953
- {
954
- session: makeSession({
955
- id: "sess_c",
956
- label: "TX Tax county scraper",
957
- summary: "Unrelated work on Texas tax exemption directory",
958
- body: "Built the Texas tax exemption county scraper and directory pipeline.",
959
- entities: ["TX Tax Exemptions"],
960
- }),
961
- embedding: unit([0, 0, 1]),
962
- },
963
- ];
964
- ```
965
-
966
- The existing assertions in that file (keyword finds `sess_b` for "pgvector", entity filter on "scraper" excludes non-NLM, hybrid blends) remain valid — `sess_b`'s label still contains "pgvector" and `sess_c`'s body contains "scraper".
967
-
968
- - [ ] **Step 7: Run the full recall test set**
969
-
970
- Run: `npm test -- tests/integration/recall-golden.test.ts tests/integration/recall-sqlite.test.ts tests/unit/core/recall-service.test.ts tests/unit/core/match-fields.test.ts && npm run typecheck`
971
- Expected: PASS — golden gate green (proves no recall-quality regression through the FTS swap), integration green, unit green, typecheck clean.
972
-
973
- - [ ] **Step 8: Commit**
974
-
975
- ```bash
976
- git add src/core/recall/match-fields.ts src/core/recall/recall-service.ts src/core/recall/index.ts tests/unit/core/match-fields.test.ts tests/unit/core/recall-service.test.ts tests/integration/recall-sqlite.test.ts
977
- git commit -m "feat: route keyword recall through FTS5 keywordSearch"
978
- ```
979
-
980
- ---
981
-
982
- ## Task 5: Delete the dead token-overlap scorer
983
-
984
- The FTS swap is complete and green. Remove the byte-parity scorer and its test. `tokenize.ts` stays — it is still imported by `src/core/recall-facts/fact-recall-service.ts` and reused by `keywordSearch` and `match-fields.ts`.
985
-
986
- **Files:**
987
- - Delete: `src/core/recall/score-keyword.ts`
988
- - Delete: `tests/unit/core/score-keyword.test.ts`
989
-
990
- - [ ] **Step 1: Confirm `scoreKeyword` has no remaining references**
991
-
992
- Run: `grep -rn "scoreKeyword\|score-keyword" src tests`
993
- Expected: no output. If anything prints, fix that reference before deleting (it should already be gone after Task 4 — `recall-service.ts` and `index.ts` were the only importers).
994
-
995
- - [ ] **Step 2: Delete the files**
996
-
997
- ```bash
998
- git rm src/core/recall/score-keyword.ts tests/unit/core/score-keyword.test.ts
999
- ```
1000
-
1001
- - [ ] **Step 3: Confirm `tokenize.ts` is still wired**
1002
-
1003
- Run: `grep -rn "tokenize" src/core/recall-facts src/core/storage src/core/recall`
1004
- Expected: references in `fact-recall-service.ts`, `sqlite-session-store.ts`, `match-fields.ts`, `recall-service.ts`, `tokenize.ts` itself. `tokenize.ts` must NOT be deleted.
1005
-
1006
- - [ ] **Step 4: Run the full suite**
1007
-
1008
- Run: `npm test && npm run typecheck && npm run lint`
1009
- Expected: PASS — entire suite green, typecheck clean, lint clean. No reference to the deleted scorer.
1010
-
1011
- - [ ] **Step 5: Commit**
1012
-
1013
- ```bash
1014
- git add -A
1015
- git commit -m "refactor: remove token-overlap scorer superseded by FTS5"
1016
- ```
1017
-
1018
- ---
1019
-
1020
- ## Task 6: Rebuild `dist/` and update the CHANGELOG
1021
-
1022
- Per the repo protocol, `dist/` is committed (the GitHub install is a pure copy — see the 2026-05-20 CHANGELOG entry) and every session ends with a CHANGELOG append.
1023
-
1024
- **Files:**
1025
- - Modify: `dist/` (regenerated)
1026
- - Modify: `logs/CHANGELOG/CHANGELOG.md`
1027
-
1028
- - [ ] **Step 1: Rebuild `dist/`**
1029
-
1030
- Run: `npm run build`
1031
- Expected: `build:server` and `build:ui` both succeed.
1032
-
1033
- - [ ] **Step 2: Add the CHANGELOG entry**
1034
-
1035
- Insert a new entry directly below the title line in `logs/CHANGELOG/CHANGELOG.md` (the file is ordered newest first, so the entry goes at the top), matching the existing entry style:
1036
-
1037
- ```markdown
1038
- ## 2026-05-20 — FTS5 lexical recall: keywordSearch replaces the token-overlap scorer
1039
-
1040
- The keyword leg of recall moved from an in-memory token-intersection scorer to a SQLite FTS5 BM25 query behind a new `SessionStore.keywordSearch` port method — symmetric with the existing `semanticSearch` sqlite-vec leg.
1041
-
1042
- **Changes**
1043
- - `migrations/008_fts_rebuild.sql` — one-time safety rebuild of the `sessions_fts` index (table + sync triggers already existed in migration 000, just unqueried).
1044
- - `SessionStore.keywordSearch(query, limit)` — FTS5 MATCH with BM25 column weights 10/4/1 for label/summary/body; user input tokenized into a quoted OR query so FTS5 metacharacters cannot reach the parser.
1045
- - `RecallService` keyword + hybrid legs call `keywordSearch`; `matchedIn` badges computed in core via `match-fields.ts` from the resolved session (keeps decision/open attribution accurate — those live in `markers`, not FTS).
1046
- - Byte-parity test suite (pinned to the retired Python scorer) replaced by a tolerant golden-set recall regression test written before the swap and green throughout.
1047
- - Deleted `score-keyword.ts`; `tokenize.ts` retained (used by fact recall).
1048
-
1049
- **Decisions**
1050
- - Reused `sessions_fts(label, summary, body)` rather than adding `decisions`/`open` FTS columns — decision/open text already lives in `body`. Tradeoff: those lines get `body` weight, not an explicit 2x; BM25 IDF compensates.
1051
- - Hybrid 0.6/0.4 split retained — `mergeHybrid` normalizes each leg by its own max, which absorbs the token-count → BM25 scale change.
1052
-
1053
- **State:** v0.3.0. pgvector remains the optional power-tier swap (open task #96), untouched.
1054
- ```
1055
-
1056
- If the CHANGELOG now exceeds 10 `##` date headings, move the oldest entries to `logs/CHANGELOG/CHANGELOG-2026.md` per the session protocol.
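The rotation check can be done by eye, but a small helper makes the threshold explicit. The sketch below is hypothetical and not part of the repo. It assumes every entry heading starts with `## ` followed by an ISO date, as the sample entry above does.

```typescript
// Count level-2 headings that begin with an ISO date (assumed entry format).
function countDateHeadings(markdown: string): number {
  return (markdown.match(/^## \d{4}-\d{2}-\d{2}\b/gm) ?? []).length;
}

// True when the log holds more entries than the session protocol allows.
function needsRotation(markdown: string, maxEntries = 10): boolean {
  return countDateHeadings(markdown) > maxEntries;
}

const sampleLog = [
  "# CHANGELOG",
  "",
  "## 2026-05-20 FTS5 lexical recall",
  "Entry text.",
  "### not a date heading",
  "## 2026-05-19 earlier entry",
].join("\n");

console.log(countDateHeadings(sampleLog)); // 2
console.log(needsRotation(sampleLog)); // false
```

When `needsRotation` returns true, the oldest entries would be moved to `logs/CHANGELOG/CHANGELOG-2026.md` by hand, as the step describes.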
1057
-
1058
- - [ ] **Step 3: Commit**
1059
-
1060
- ```bash
1061
- git add dist logs/CHANGELOG/CHANGELOG.md
1062
- git commit -m "build: rebuild dist for FTS5 recall + CHANGELOG"
1063
- ```
1064
-
1065
- ---
1066
-
1067
- ## Self-Review
1068
-
1069
- **Spec coverage:**
1070
- - Consensus requirement 1 — replace byte-parity tests with golden-set recall tests → Task 1 (golden gate written first), Task 5 (delete `score-keyword.test.ts`). ✓
1071
- - Consensus requirement 2 — wire `sessions_fts` with sync triggers → triggers already existed in migration 000; Task 2 adds the safety rebuild, Task 3 wires the *query* path (`keywordSearch`). ✓
1072
- - Consensus requirement 3 — re-tune the 0.6/0.4 hybrid weights → Task 4 documents and verifies that normalize-by-max absorbs the BM25 scale change; the split is deliberately retained, covered by the hybrid unit + integration tests. ✓
1073
-
1074
- **Placeholder scan:** No TBDs, no "add error handling", no "similar to Task N" — every step has complete code or an exact command. ✓
1075
-
1076
- **Type consistency:** `KeywordNeighbor { sessionId, score }` defined in Task 3, consumed unchanged in Task 4 (`runKeyword`, `InMemoryStore.keywordSearch`). `keywordMatchFields` signature defined in Task 4 Step 1, called identically in `runKeyword`. `keywordSearch(query, limit)` signature identical across the port and `SqliteSessionStore`; the `InMemoryStore` fake (Task 3 stub and Task 4 rewrite) declares a parameterless `keywordSearch()`, which still satisfies the port because TypeScript allows an implementation to take fewer parameters. `MatchField` unchanged: `keywordMatchFields` returns only existing members (`label`/`decisions`/`open`/`summary`). ✓
1077
-
1078
- ---
1079
-
1080
- ## Execution Handoff
1081
-
1082
- **Plan complete and saved to `docs/plans/2026-05-20-fts5-lexical-recall.md`. Two execution options:**
1083
-
1084
- **1. Subagent-Driven (recommended)** — dispatch a fresh subagent per task, review between tasks, fast iteration.
1085
-
1086
- **2. Inline Execution** — execute tasks in this session using executing-plans, batch execution with checkpoints.
1087
-
1088
- **Which approach?**