nlm-memory 0.5.0 → 0.5.1

Files changed (247)
  1. package/README.md +72 -34
  2. package/dist/cli/nlm.js +2 -1
  3. package/dist/cli/nlm.js.map +1 -1
  4. package/dist/http/app.js +2 -1
  5. package/dist/http/app.js.map +1 -1
  6. package/dist/mcp/server.js +20 -1
  7. package/dist/mcp/server.js.map +1 -1
  8. package/dist/ui/assets/{index-C8cpwbYJ.css → index-Beo8psd-.css} +1 -1
  9. package/dist/ui/assets/{index-CB50QnL-.js → index-CSPTTeeM.js} +8 -8
  10. package/dist/ui/index.html +2 -2
  11. package/package.json +26 -1
  12. package/.agents/plugins/marketplace.json +0 -20
  13. package/.github/workflows/ci.yml +0 -30
  14. package/docs/methodology/re-derivation-rate.md +0 -112
  15. package/docs/methodology/useful-hit-rate.md +0 -79
  16. package/docs/plans/2026-05-20-fts5-lexical-recall.md +0 -1088
  17. package/docs/plans/2026-05-20-recall-daemon-wedge-fix.md +0 -662
  18. package/docs/plans/2026-05-20-recall-hook-design.md +0 -131
  19. package/docs/plans/2026-05-20-recall-hook-implementation.md +0 -1222
  20. package/docs/plans/desktop-product.md +0 -69
  21. package/docs/plans/factstore-design.md +0 -236
  22. package/logs/CHANGELOG/CHANGELOG-2026.md +0 -1575
  23. package/logs/CHANGELOG/CHANGELOG.md +0 -209
  24. package/migrations/000_initial_schema.sql +0 -174
  25. package/migrations/001_entity_type_rename.sql +0 -17
  26. package/migrations/002_adapter_state_extend.sql +0 -12
  27. package/migrations/003_session_embeddings.sql +0 -11
  28. package/migrations/004_facts.sql +0 -46
  29. package/migrations/005_sources.sql +0 -31
  30. package/migrations/006_providers.sql +0 -33
  31. package/migrations/007_source_tokens.sql +0 -17
  32. package/migrations/008_fts_rebuild.sql +0 -9
  33. package/migrations/009_session_embedding_chunks.sql +0 -46
  34. package/migrations/010_sources_opencode.sql +0 -30
  35. package/migrations/011_sources_hermes_agent.sql +0 -30
  36. package/migrations/012_sources_aider.sql +0 -30
  37. package/migrations/013_adapter_state_failure_count.sql +0 -12
  38. package/migrations/014_sources_cursor.sql +0 -30
  39. package/migrations/015_sources_windsurf.sql +0 -30
  40. package/plugin-hermes-agent/README.md +0 -49
  41. package/plugin-hermes-agent/__init__.py +0 -75
  42. package/plugin-hermes-agent/plugin.yaml +0 -15
  43. package/scripts/backfill-citations.mjs +0 -0
  44. package/scripts/build-codex-plugin.mjs +0 -61
  45. package/scripts/deepseek-probe.mjs +0 -67
  46. package/scripts/extract-triples.mjs +0 -207
  47. package/scripts/longmemeval/embedding-cache.ts +0 -77
  48. package/scripts/longmemeval/fetch-dataset.sh +0 -25
  49. package/scripts/longmemeval/run-harness.ts +0 -315
  50. package/scripts/longmemeval/scorer.ts +0 -99
  51. package/scripts/longmemeval/tsconfig.json +0 -9
  52. package/scripts/longmemeval/types.ts +0 -35
  53. package/scripts/nlm-daily-digest.py +0 -239
  54. package/scripts/nlm-daily-digest.sh +0 -28
  55. package/src/cli/classify-parity.ts +0 -257
  56. package/src/cli/launchctl-helpers.ts +0 -49
  57. package/src/cli/nlm.ts +0 -1078
  58. package/src/core/actions/actions-log.ts +0 -118
  59. package/src/core/actions/overlay.ts +0 -117
  60. package/src/core/adapters/aider.ts +0 -205
  61. package/src/core/adapters/claude-code.ts +0 -293
  62. package/src/core/adapters/common.ts +0 -54
  63. package/src/core/adapters/cursor.ts +0 -486
  64. package/src/core/adapters/from-source.ts +0 -67
  65. package/src/core/adapters/hermes-agent.ts +0 -240
  66. package/src/core/adapters/hermes.ts +0 -277
  67. package/src/core/adapters/jsonl-generic.ts +0 -208
  68. package/src/core/adapters/opencode.ts +0 -281
  69. package/src/core/adapters/pi.ts +0 -264
  70. package/src/core/adapters/windsurf.ts +0 -386
  71. package/src/core/classifier/prompt.ts +0 -200
  72. package/src/core/dataset/build-dataset.ts +0 -463
  73. package/src/core/embedding/chunk-body.ts +0 -76
  74. package/src/core/embedding/embed-backfill.ts +0 -210
  75. package/src/core/embedding/embed-normalize.ts +0 -135
  76. package/src/core/facts/backfill-facts.ts +0 -254
  77. package/src/core/facts/extract-facts.ts +0 -50
  78. package/src/core/hook/citation-detect.ts +0 -124
  79. package/src/core/hook/cite-memo.ts +0 -68
  80. package/src/core/hook/claude-settings.ts +0 -187
  81. package/src/core/hook/gate.ts +0 -25
  82. package/src/core/hook/hook-log.ts +0 -41
  83. package/src/core/hook/memo-sweep.ts +0 -164
  84. package/src/core/hook/memo.ts +0 -67
  85. package/src/core/hook/pointer-block.ts +0 -26
  86. package/src/core/hook/select.ts +0 -32
  87. package/src/core/hook/transcript.ts +0 -121
  88. package/src/core/ingest/ingest-session.ts +0 -111
  89. package/src/core/providers/provider-models.ts +0 -100
  90. package/src/core/providers/provider-registry.ts +0 -196
  91. package/src/core/recall/citation-log.ts +0 -108
  92. package/src/core/recall/filter.ts +0 -27
  93. package/src/core/recall/index.ts +0 -6
  94. package/src/core/recall/match-fields.ts +0 -40
  95. package/src/core/recall/query-log.ts +0 -149
  96. package/src/core/recall/query-shape.ts +0 -66
  97. package/src/core/recall/recall-service.ts +0 -320
  98. package/src/core/recall/recent-log.ts +0 -59
  99. package/src/core/recall/tokenize.ts +0 -18
  100. package/src/core/recall/useful-scan.ts +0 -336
  101. package/src/core/recall-facts/fact-query-log.ts +0 -150
  102. package/src/core/recall-facts/fact-recall-service.ts +0 -327
  103. package/src/core/scheduler/scan-once.ts +0 -142
  104. package/src/core/scheduler/scheduler.ts +0 -225
  105. package/src/core/sources/source-registry.ts +0 -278
  106. package/src/core/storage/db-restore.ts +0 -133
  107. package/src/core/storage/live-status.ts +0 -45
  108. package/src/core/storage/migrate.ts +0 -72
  109. package/src/core/storage/sqlite-fact-store.ts +0 -304
  110. package/src/core/storage/sqlite-session-store.ts +0 -810
  111. package/src/hook/hook-auth.ts +0 -18
  112. package/src/hook/prompt-recall-hook.ts +0 -180
  113. package/src/hook/session-end-hook.ts +0 -81
  114. package/src/hook/session-start-hook.ts +0 -168
  115. package/src/hook/stop-hook.ts +0 -239
  116. package/src/http/app.ts +0 -1215
  117. package/src/install/claude-code.ts +0 -128
  118. package/src/install/codex.ts +0 -367
  119. package/src/install/cursor.ts +0 -68
  120. package/src/install/hermes-agent.ts +0 -76
  121. package/src/install/hermes.ts +0 -78
  122. package/src/install/nlm-dir-perms.ts +0 -55
  123. package/src/install/ollama.ts +0 -284
  124. package/src/install/setup.ts +0 -489
  125. package/src/install/windsurf.ts +0 -68
  126. package/src/llm/classifier-box.ts +0 -64
  127. package/src/llm/deepseek-client.ts +0 -150
  128. package/src/llm/env-autoload.ts +0 -55
  129. package/src/llm/ollama-client.ts +0 -189
  130. package/src/mcp/server.ts +0 -534
  131. package/src/ports/fact-store.ts +0 -102
  132. package/src/ports/llm-client.ts +0 -52
  133. package/src/ports/logger.ts +0 -16
  134. package/src/ports/session-store.ts +0 -45
  135. package/src/ports/transcript-adapter.ts +0 -55
  136. package/src/shared/types.ts +0 -149
  137. package/src/ui/App.tsx +0 -58
  138. package/src/ui/components/PromoteOpenButton.tsx +0 -65
  139. package/src/ui/components/SessionDrawer.tsx +0 -199
  140. package/src/ui/components/SideNav.tsx +0 -162
  141. package/src/ui/components/Skeleton.tsx +0 -107
  142. package/src/ui/index.html +0 -13
  143. package/src/ui/lib/actions.ts +0 -30
  144. package/src/ui/lib/api.ts +0 -92
  145. package/src/ui/lib/dataset.ts +0 -141
  146. package/src/ui/lib/registries.ts +0 -155
  147. package/src/ui/lib/view-settings.ts +0 -41
  148. package/src/ui/main.tsx +0 -15
  149. package/src/ui/pages/Live.tsx +0 -229
  150. package/src/ui/pages/Pulse.tsx +0 -415
  151. package/src/ui/pages/Recall.tsx +0 -190
  152. package/src/ui/pages/River.tsx +0 -354
  153. package/src/ui/pages/Search.tsx +0 -386
  154. package/src/ui/pages/Stub.tsx +0 -9
  155. package/src/ui/pages/Thread.tsx +0 -473
  156. package/src/ui/pages/settings/Classifier.tsx +0 -227
  157. package/src/ui/pages/settings/Data.tsx +0 -190
  158. package/src/ui/pages/settings/Index.tsx +0 -65
  159. package/src/ui/pages/settings/Labels.tsx +0 -224
  160. package/src/ui/pages/settings/Providers.tsx +0 -305
  161. package/src/ui/pages/settings/SettingsSubnav.tsx +0 -28
  162. package/src/ui/pages/settings/Sources.tsx +0 -326
  163. package/src/ui/pages/settings/Views.tsx +0 -96
  164. package/src/ui/styles.css +0 -1890
  165. package/src/ui/tsconfig.json +0 -21
  166. package/src/ui/vite.config.ts +0 -19
  167. package/tests/fixtures/claude_code/short_session.jsonl +0 -2
  168. package/tests/fixtures/claude_code/standard_iso.jsonl +0 -4
  169. package/tests/fixtures/claude_code/tool_heavy.jsonl +0 -8
  170. package/tests/fixtures/claude_code/with_subagent.jsonl +0 -7
  171. package/tests/fixtures/facts.ts +0 -17
  172. package/tests/fixtures/golden-corpus.ts +0 -85
  173. package/tests/fixtures/hermes/paired_request_dump.json +0 -24
  174. package/tests/fixtures/hermes/paired_session.json +0 -23
  175. package/tests/fixtures/hermes/request_dump.json +0 -28
  176. package/tests/fixtures/hermes/session_iso.json +0 -38
  177. package/tests/fixtures/hermes/session_unix.json +0 -38
  178. package/tests/fixtures/hermes/system_only.json +0 -18
  179. package/tests/fixtures/pi/error-connection-abort.jsonl +0 -8
  180. package/tests/fixtures/pi/short-successful.jsonl +0 -5
  181. package/tests/fixtures/pi/with-custom-message.jsonl +0 -6
  182. package/tests/fixtures/sessions.ts +0 -22
  183. package/tests/integration/backfill-facts.test.ts +0 -362
  184. package/tests/integration/citation-explicit.test.ts +0 -111
  185. package/tests/integration/cite-event.test.ts +0 -169
  186. package/tests/integration/cite-memo.test.ts +0 -87
  187. package/tests/integration/db-restore.test.ts +0 -153
  188. package/tests/integration/embed-backfill.test.ts +0 -176
  189. package/tests/integration/fact-supersedence.test.ts +0 -313
  190. package/tests/integration/fts-index.test.ts +0 -60
  191. package/tests/integration/getbyids-sqlite.test.ts +0 -100
  192. package/tests/integration/hermes-agent-hooks.test.ts +0 -248
  193. package/tests/integration/hook-claude-settings.test.ts +0 -218
  194. package/tests/integration/hook-log.test.ts +0 -54
  195. package/tests/integration/hook-memo.test.ts +0 -68
  196. package/tests/integration/hook-pre-compact.test.ts +0 -105
  197. package/tests/integration/hook-subagent-start.test.ts +0 -102
  198. package/tests/integration/http.test.ts +0 -401
  199. package/tests/integration/keyword-search-fts.test.ts +0 -66
  200. package/tests/integration/mcp-recall-logging.test.ts +0 -88
  201. package/tests/integration/mcp.test.ts +0 -260
  202. package/tests/integration/memo-sweep.test.ts +0 -91
  203. package/tests/integration/prompt-recall-hook.test.ts +0 -88
  204. package/tests/integration/provider-registry.test.ts +0 -107
  205. package/tests/integration/recall-golden.test.ts +0 -59
  206. package/tests/integration/recall-sqlite.test.ts +0 -169
  207. package/tests/integration/scheduler.test.ts +0 -391
  208. package/tests/integration/session-end-hook.test.ts +0 -48
  209. package/tests/integration/session-start-hook.test.ts +0 -126
  210. package/tests/integration/source-registry.test.ts +0 -122
  211. package/tests/integration/sqlite-fact-store.test.ts +0 -346
  212. package/tests/integration/stop-hook.test.ts +0 -560
  213. package/tests/integration/wal-checkpoint.test.ts +0 -49
  214. package/tests/unit/cli/launchctl-helpers.test.ts +0 -60
  215. package/tests/unit/core/adapters/aider.test.ts +0 -230
  216. package/tests/unit/core/adapters/claude-code.test.ts +0 -118
  217. package/tests/unit/core/adapters/cursor.test.ts +0 -485
  218. package/tests/unit/core/adapters/hermes-agent.test.ts +0 -329
  219. package/tests/unit/core/adapters/hermes.test.ts +0 -81
  220. package/tests/unit/core/adapters/jsonl-generic.test.ts +0 -142
  221. package/tests/unit/core/adapters/opencode.test.ts +0 -354
  222. package/tests/unit/core/adapters/pi.test.ts +0 -110
  223. package/tests/unit/core/adapters/windsurf.test.ts +0 -416
  224. package/tests/unit/core/classifier/prompt.test.ts +0 -126
  225. package/tests/unit/core/embedding/chunk-body.test.ts +0 -100
  226. package/tests/unit/core/facts/extract-facts.test.ts +0 -117
  227. package/tests/unit/core/filter.test.ts +0 -40
  228. package/tests/unit/core/hook/citation-detect-cite-session.test.ts +0 -96
  229. package/tests/unit/core/hook/citation-detect.test.ts +0 -124
  230. package/tests/unit/core/hook/gate.test.ts +0 -29
  231. package/tests/unit/core/hook/pointer-block.test.ts +0 -22
  232. package/tests/unit/core/hook/select.test.ts +0 -66
  233. package/tests/unit/core/match-fields.test.ts +0 -39
  234. package/tests/unit/core/mcp-cite-session.test.ts +0 -51
  235. package/tests/unit/core/providers/provider-models.test.ts +0 -101
  236. package/tests/unit/core/query-shape.test.ts +0 -92
  237. package/tests/unit/core/recall-facts/fact-recall-service.test.ts +0 -258
  238. package/tests/unit/core/recall-service.test.ts +0 -200
  239. package/tests/unit/core/storage/live-status.test.ts +0 -54
  240. package/tests/unit/core/tokenize.test.ts +0 -32
  241. package/tests/unit/core/useful-scan.test.ts +0 -537
  242. package/tests/unit/llm/embed.test.ts +0 -93
  243. package/tests/unit/llm/ollama-client.test.ts +0 -124
  244. package/tests/unit/scripts/longmemeval-scorer.test.ts +0 -114
  245. package/tsconfig.json +0 -31
  246. package/tsconfig.test.json +0 -11
  247. package/vitest.config.ts +0 -22
@@ -4,8 +4,8 @@
4
4
  <meta charset="UTF-8" />
5
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
6
  <title>nlm memory</title>
7
- <script type="module" crossorigin src="/ui/assets/index-CB50QnL-.js"></script>
8
- <link rel="stylesheet" crossorigin href="/ui/assets/index-C8cpwbYJ.css">
7
+ <script type="module" crossorigin src="/ui/assets/index-CSPTTeeM.js"></script>
8
+ <link rel="stylesheet" crossorigin href="/ui/assets/index-Beo8psd-.css">
9
9
  </head>
10
10
  <body>
11
11
  <div id="root"></div>
package/package.json CHANGED
@@ -1,12 +1,37 @@
1
1
  {
2
2
  "name": "nlm-memory",
3
- "version": "0.5.0",
3
+ "version": "0.5.1",
4
4
  "description": "Local-first non-linear memory operating system for AI operators.",
5
5
  "type": "module",
6
6
  "license": "Apache-2.0",
7
7
  "engines": {
8
8
  "node": ">=20.0.0"
9
9
  },
10
+ "repository": {
11
+ "type": "git",
12
+ "url": "git+https://github.com/pbmagnet4/nlm-memory-ts.git"
13
+ },
14
+ "homepage": "https://github.com/pbmagnet4/nlm-memory-ts#readme",
15
+ "bugs": {
16
+ "url": "https://github.com/pbmagnet4/nlm-memory-ts/issues"
17
+ },
18
+ "keywords": [
19
+ "ai",
20
+ "memory",
21
+ "mcp",
22
+ "claude-code",
23
+ "codex",
24
+ "hermes",
25
+ "local-first",
26
+ "recall",
27
+ "session-memory"
28
+ ],
29
+ "files": [
30
+ "dist",
31
+ "plugin",
32
+ "LICENSE",
33
+ "README.md"
34
+ ],
10
35
  "bin": {
11
36
  "nlm": "dist/cli/nlm.js"
12
37
  },
@@ -1,20 +0,0 @@
1
- {
2
- "name": "nlm-memory-ts",
3
- "interface": {
4
- "displayName": "nlm-memory"
5
- },
6
- "plugins": [
7
- {
8
- "name": "nlm-memory",
9
- "source": {
10
- "source": "local",
11
- "path": "./plugin"
12
- },
13
- "policy": {
14
- "installation": "AVAILABLE",
15
- "authentication": "ON_USE"
16
- },
17
- "category": "Coding"
18
- }
19
- ]
20
- }
@@ -1,30 +0,0 @@
1
- name: CI
2
-
3
- on:
4
-   push:
5
-     branches: [main]
6
-   pull_request:
7
-     branches: [main]
8
-
9
- jobs:
10
-   test:
11
-     runs-on: ubuntu-latest
12
-     steps:
13
-       - uses: actions/checkout@v4
14
-
15
-       - uses: actions/setup-node@v4
16
-         with:
17
-           node-version: "20"
18
-           cache: npm
19
-
20
-       - name: Install dependencies
21
-         run: npm ci
22
-
23
-       - name: Typecheck
24
-         run: npm run typecheck
25
-
26
-       - name: Test
27
-         run: npm test
28
-
29
-       - name: Build (server)
30
-         run: npm run build
@@ -1,112 +0,0 @@
1
- # re-derivation_rate — design
2
-
3
- ## Why
4
-
5
- `re_derivation_rate` is NLM's strategic metric — the operator-outcome number that competitors (mem0, agentmemory, Letta) cannot match because their destructive lifecycle (decay, auto-forget) erases the data needed to compute it. It is the headline number for Pulse, the cron digest, and any public marketing scorecard. Detection rule, methodology, and a reproducible script live here so the metric is auditable.
6
-
7
- ## Plain-language definition
8
-
9
- A *re-derivation* is when an operator (you, in any AI runtime) solves the same problem twice across multiple sessions without recall of the prior solution. It is the tax NLM exists to eliminate: every re-derivation is a session where memory could have helped but didn't.
10
-
11
- `re_derivation_rate` over a window = (re-derivation events) / (decision events) in that window.
12
-
13
- `re_derivations_prevented` = recall events whose `useful` flag is true (see useful-hit-rate.md) AND whose returned session contained the matching decision. Inverse of re-derivation: the events where memory *did* help.
14
-
15
- ## Detection rule (V1)
16
-
17
- A pair of sessions `(A, B)` is a re-derivation iff all of the following hold:
18
-
19
- 1. **Same entity.** A and B share at least one entity in their respective `entities` arrays.
20
- 2. **Same decision normalized.** A `decision` marker in A and a `decision` marker in B normalize to overlapping content. Normalization: lowercase, strip stopwords, tokenize, Jaccard similarity ≥ 0.6.
21
- 3. **Temporal gap.** `B.started_at - A.started_at >= 7 days`.
22
- 4. **No supersedence link.** No `session_edges` row of kind `supersedes` connects A and B in either direction.
23
- 5. **No continues link.** No `session_edges` row of kind `continues` connects A and B.
24
- 6. **No intervening recall.** Between A.started_at and B.started_at, no recall event in `query-log.jsonl` or `hook-log.jsonl` returned A's id (would mean B's operator was aware of A and chose not to link).
25
-
26
- When all six are true, `B` is a re-derivation of `A`. Count B (not A) — the metric measures fresh re-derivations, not the original.
27
-
28
- ## Edge cases and resolutions
29
-
30
- - **Three sessions A, B, C** where B re-derives A and C re-derives B: count B and C, not A.
31
- - **Trivial decisions.** Decisions under N tokens (default 6) are excluded — "yes ship it" is not a meaningful decision to track.
32
- - **High-frequency entities.** If an entity has >50 sessions in the window, scale the Jaccard threshold up to 0.75 to reduce false positives (common topics will inevitably overlap in keyword-trivial ways).
33
- - **Probe / test entities.** Sessions whose label matches probe patterns (see useful-hit-rate.md) are excluded from both sides.
34
-
35
- ## Computation algorithm
36
-
37
- ```python
38
- def find_re_derivations(sessions, edges, recalls, window_days):
39
-     pairs = []
40
-     decisions = collect_decisions(sessions) # one row per (session_id, normalized_decision_tokens, entities)
41
-     for ent in distinct_entities(decisions):
42
-         ent_decisions = sorted(by_session_start([d for d in decisions if ent in d.entities]))
43
-         for i, a in enumerate(ent_decisions):
44
-             for b in ent_decisions[i+1:]:
45
-                 if days_between(a, b) < 7: continue
46
-                 if days_between(a, b) > window_days: break
47
-                 if jaccard(a.tokens, b.tokens) < threshold(ent): continue
48
-                 if has_edge(edges, a, b, ("supersedes", "continues")): continue
49
-                 if recall_returned_a_between(recalls, a, b): continue
50
-                 pairs.append((a, b))
51
-     return pairs
52
- ```
53
-
54
- Runs over the existing canonical sqlite (sessions + session_edges) and the recall log jsonl files. No new schema, no migration. Computed in a single pass; results cached by `(window_start, window_end)` in a new `re_derivation_log` table.
55
-
56
- ## Storage
57
-
58
- - New table `re_derivation_log`: `(window_start, window_end, computed_at, session_a_id, session_b_id, entity, jaccard, decision_a, decision_b)`. One row per detected pair. Re-computable; deletable; not source of truth.
59
- - New endpoint field on `/api/recall/stats`: `re_derivation_count_7d`, `re_derivations_prevented_7d`.
60
- - Pulse: new headline tile showing both numbers and the weekly trend.
61
-
62
- ## CLI
63
-
64
- - `nlm re-derivation scan` — recomputes the log for a window. Default last 30 days.
65
- - `nlm re-derivation list --since 7d` — lists detected pairs with the matched decisions for human review (false-positive triage).
66
- - `nlm re-derivation explain <session-b-id>` — for one B, show why it was flagged (matched A, decision overlap, why no recall covered it).
67
-
68
- ## Calibration loop
69
-
70
- Re-derivation detection is heuristic. False positives waste reader trust; false negatives undersell the metric. Calibration weekly for the first month after V1:
71
-
72
- 1. Run `nlm re-derivation list --since 7d`
73
- 2. Edward reviews each flagged pair
74
- 3. Mark `true_re_derivation: true|false` in a `re_derivation_feedback` table
75
- 4. Adjust Jaccard threshold + minimum decision length until precision/recall both > 70% on Edward's review
76
-
77
- After 4 weeks of calibration, freeze the parameters and publish them in `docs/methodology/re-derivation-rate.md` for external use.
78
-
79
- ## Public scorecard format
80
-
81
- For external publication (gated on the marketing-readiness checklist):
82
-
83
- ```
84
- Edward's corpus, week of YYYY-MM-DD:
85
- Sessions in window: N
86
- Decisions in window: M
87
- Re-derivations detected: X
88
- Re-derivations prevented: Y (recall returned the matching prior session)
89
- Re-derivation rate: X / M = Z.Z%
90
- Methodology: docs/methodology/re-derivation-rate.md
91
- Calibration set: docs/calibration/re-derivation-2026-MM.md
92
- ```
93
-
94
- Publish weekly to the repo. The trend (rate falling over time as NLM gets more useful) is the marketing story.
95
-
96
- ## Why competitors cannot match this
97
-
98
- agentmemory's 4-tier lifecycle decays old observations and auto-forgets stale facts. Without the historical session record intact, there is no Session A to detect a re-derivation against — the data is gone. mem0 uses passive extraction and accretion, with no native concept of session identity that would let you pair A and B. Letta's core memory is in-context, not historical.
99
-
100
- NLM's supersedence + full-session retention is the prerequisite for this metric. It is the strategic moat made measurable.
101
-
102
- ## Out of scope (V1)
103
-
104
- - Cross-runtime re-derivation (decision in Claude Code, re-derived in Hermes). Requires reliable entity normalization across adapters; defer to V2.
105
- - Semantic similarity instead of Jaccard (would catch paraphrased decisions but requires embedding every decision). Defer.
106
- - Automatic supersedence link suggestion from detected re-derivations. The metric should measure, not act, until calibrated.
107
-
108
- ## Implementation phasing
109
-
110
- 1. **Phase 1 (after #152, #153, #154 ship):** implement detection algorithm + CLI + scan command. No UI changes. Validate on Edward's corpus.
111
- 2. **Phase 2 (after 2 weeks of calibration):** wire `re_derivation_count_7d` into `/api/recall/stats` and the daily digest. Pulse tile.
112
- 3. **Phase 3 (gated on marketing readiness):** publish first weekly scorecard publicly. Repo README. Landing site.
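Note on the removed re-derivation-rate.md: the detection rule it describes (shared entity, normalized decision overlap with Jaccard ≥ 0.6, a gap of at least 7 days, no linking edge, no intervening recall) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: the `Decision` dataclass, the `STOPWORDS` list, the regex tokenizer, and the `linked` / `recalled` flags (standing in for the `session_edges` and recall-log lookups) are all hypothetical, and the minimum-decision-length rule is left out.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed stopword list; the design doc does not specify one.
STOPWORDS = frozenset({"a", "an", "the", "to", "of", "for", "and", "or", "in", "on", "is", "it"})


@dataclass(frozen=True)
class Decision:
    """Hypothetical row shape: one decision marker from one session."""
    session_id: str
    started_at: datetime
    text: str
    entities: frozenset


def normalize(text: str) -> frozenset:
    """Lowercase, tokenize on non-alphanumerics, strip stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return frozenset(t for t in tokens if t not in STOPWORDS)


def jaccard(a: frozenset, b: frozenset) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def threshold(entity_session_count: int) -> float:
    """0.6 by default; 0.75 for entities with more than 50 sessions in the window."""
    return 0.75 if entity_session_count > 50 else 0.6


def is_re_derivation(a: Decision, b: Decision, *, entity_session_count: int = 0,
                     linked: bool = False, recalled: bool = False) -> bool:
    """True when the later decision b counts as a re-derivation of a."""
    if not (a.entities & b.entities):                       # rule 1: same entity
        return False
    if b.started_at - a.started_at < timedelta(days=7):     # rule 3: temporal gap
        return False
    if linked or recalled:                                  # rules 4-6: edges / recall
        return False
    overlap = jaccard(normalize(a.text), normalize(b.text)) # rule 2: same decision
    return overlap >= threshold(entity_session_count)


def re_derivation_rate(re_derivations: int, decisions: int) -> float:
    """(re-derivation events) / (decision events); 0.0 for an empty window."""
    return re_derivations / decisions if decisions else 0.0
```

Keeping the edge and recall checks as boolean inputs keeps the sketch independent of any storage layer; a real scan would resolve them from the session store and the recall logs before calling `is_re_derivation`.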
@@ -1,79 +0,0 @@
1
- # useful_hit_rate — design
2
-
3
- ## Why
4
-
5
- `hit_rate` reports the fraction of recall calls that returned ≥1 row. With the MCP default now hybrid, that number is structurally close to 100% — semantic always returns *something*. `hit_rate` no longer separates "found stuff" from "found stuff that mattered." `useful_hit_rate` is the metric we actually want: the fraction of recall calls whose returned results were referenced in the next assistant turn.
6
-
7
- This is the signal that lets us answer "is NLM serving its intended purpose" with evidence instead of opinion, and it's an input to the headline re-derivation rate metric (see [re-derivation-rate.md](re-derivation-rate.md) — pending).
8
-
9
- ## Definitions
10
-
11
- **A recall event** is one of:
12
- - A hook fire (logged in `~/.nlm/hook-log.jsonl` with `wouldInject` ids)
13
- - An MCP `recall_sessions` / `recall_facts` call (logged in `~/.nlm/query-log.jsonl`)
14
- - An HTTP `/api/recall` call (logged in `~/.nlm/query-log.jsonl`)
15
-
16
- **A useful recall** is a recall event where:
17
- - At least one of the returned session ids OR session labels appears in the next assistant message in the same conversation transcript, AND
18
- - The match occurs within 3 assistant turns of the recall, AND
19
- - The recall is not a probe (excluded query patterns: `concurrency probe`, `test probe`, `path test`, `recall test`, smoke/cutover patterns)
20
-
21
- **`useful_hit_rate`** = (useful recalls) / (real recalls) over the reporting window.
22
-
23
- ## Detection algorithm
24
-
25
- ```
26
- for each real recall event in window:
27
-     transcript = find_transcript(event.conversationId)
28
-     if transcript is None:
29
-         mark useful = null (unmeasurable)
30
-         continue
31
-     next_assistant_msgs = transcript.messages_after(event.ts, role="assistant", limit=3)
32
-     haystack = " ".join(m.content for m in next_assistant_msgs)
33
-     for hit_id in event.returnedIds:
34
-         if hit_id in haystack or session_label(hit_id) in haystack:
35
-             mark useful = true; break
36
-     else:
37
-         mark useful = false
38
- ```
39
-
40
- ## Data flow
41
-
42
- 1. **Hook recalls** have `conversationId` directly. Transcript path: `~/.claude/projects/<sanitized-project>/<conversationId>.jsonl`.
43
- 2. **MCP recalls** currently have no conversation context in `query-log.jsonl`. Adding `x-claude-session-id` capture to the MCP server is a prerequisite for measuring MCP useful_hit_rate.
44
- 3. **HTTP recalls** are operator-driven (UI browsing) and excluded from this metric — `useful_hit_rate` measures agent recall usefulness, not UI search satisfaction.
45
-
46
- ## Storage
47
-
48
- - New log file `~/.nlm/useful-hit-log.jsonl`, one entry per scanned recall:
49
- ```json
50
- {"ts": "...", "source": "hook|mcp", "conversationId": "...", "returnedIds": [...], "useful": true|false|null, "matchedId": "...", "scannedAt": "..."}
51
- ```
52
- - New CLI: `nlm useful-scan` — scans the last 24h of recalls, joins against transcripts, appends to the log
53
- - New endpoint field: `/api/recall/stats` includes `useful_hit_rate` and `useful_hit_count` over the same window as `hit_rate`
54
-
55
- ## Out of scope (V1)
56
-
57
- - MCP useful_hit_rate (blocked on conversation-id capture; track as follow-up)
58
- - Real-time useful-hit detection (V1 is batch-scan, run on the daily digest cron)
59
- - Distinguishing "agent quoted the recall" vs "agent acted on it" (the former is a proxy for the latter; V2 could refine)
60
- - HTTP UI click-through (different metric — would live under a separate `ui_click_rate`)
61
-
62
- ## V1 scope (shipping now)
63
-
64
- - Ship the daily digest cron consuming existing `hit_rate` (this doc justifies the upgrade path)
65
- - Add stub field `useful_hit_rate: null` to `/api/recall/stats` so the digest schema is forward-compatible
66
- - Implement the scanner + CLI in a follow-up commit (target: within 7 days)
67
-
68
- ## Why batch-scan vs hook-vs-hook real-time
69
-
70
- A second Claude Code hook (`Stop` or `PostToolUse`) could compute usefulness in real time. Rejected because:
71
- - Doubles installation surface (two hooks per agent runtime)
72
- - Adds per-turn latency for a metric the user reads once/day
73
- - Doesn't generalize to Hermes, pi, Codex, Gemini, Aider (no equivalent post-turn hook on most)
74
- - Batch-scan reads the same transcript files the daemon already polls
75
-
76
- ## Open questions
77
-
78
- - Hit-label heuristic: substring match is cheap but noisy. Worth fuzzy matching session label tokens? Defer until V1 data shows the false-positive rate.
79
- - Window for scan: hour-bucket vs day-bucket? Daily-bucket for now to match the digest cadence; revisit if cron interval changes.
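Note on the removed useful-hit-rate.md: its detection pseudocode (substring match of returned session ids or labels against the next three assistant messages) can be written as runnable Python roughly as follows. This is a sketch under assumptions, not the shipped scanner: the function names, the `labels` mapping, and the plain-string message list are hypothetical, and excluding unmeasurable (`None`) recalls from the denominator is an assumption the design doc does not state.

```python
from typing import Mapping, Optional, Sequence


def is_useful(returned_ids: Sequence[str],
              labels: Mapping[str, str],
              next_assistant_msgs: Sequence[str],
              max_turns: int = 3) -> bool:
    """True if any returned id or its label appears in the next assistant turns."""
    # Only the first `max_turns` assistant messages after the recall are searched.
    haystack = " ".join(next_assistant_msgs[:max_turns])
    for hit_id in returned_ids:
        label = labels.get(hit_id)
        # Plain substring match: cheap but noisy, as the design doc notes.
        if hit_id in haystack or (label and label in haystack):
            return True
    return False


def useful_hit_rate(outcomes: Sequence[Optional[bool]]) -> Optional[float]:
    """(useful recalls) / (measured recalls); None when nothing was measurable."""
    # Assumption: recalls with no transcript (None) are dropped from the denominator.
    measured = [o for o in outcomes if o is not None]
    if not measured:
        return None
    return sum(measured) / len(measured)
```

Returning `None` rather than `0.0` for an unmeasurable window mirrors the `useful_hit_rate: null` stub field the doc proposes for `/api/recall/stats`.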