claude_memory 0.9.1 → 0.11.0

Files changed (77)
  1. checksums.yaml +4 -4
  2. data/.claude/memory.sqlite3 +0 -0
  3. data/.claude/skills/dashboard/SKILL.md +42 -0
  4. data/.claude-plugin/marketplace.json +1 -1
  5. data/.claude-plugin/plugin.json +1 -1
  6. data/CHANGELOG.md +130 -0
  7. data/CLAUDE.md +30 -6
  8. data/README.md +66 -2
  9. data/db/migrations/015_add_activity_events.rb +26 -0
  10. data/db/migrations/016_add_moment_feedback.rb +22 -0
  11. data/db/migrations/017_add_last_recalled_at.rb +15 -0
  12. data/docs/1_0_punchlist.md +371 -0
  13. data/docs/EXAMPLES.md +41 -2
  14. data/docs/GETTING_STARTED.md +33 -4
  15. data/docs/architecture.md +22 -7
  16. data/docs/audit-queries.md +131 -0
  17. data/docs/dashboard.md +192 -0
  18. data/docs/improvements.md +650 -9
  19. data/docs/influence/cq.md +187 -0
  20. data/docs/plugin.md +13 -6
  21. data/docs/quality_review.md +524 -172
  22. data/docs/reflection_memory_as_accumulating_judgment.md +67 -0
  23. data/lib/claude_memory/activity_log.rb +86 -0
  24. data/lib/claude_memory/commands/census_command.rb +210 -0
  25. data/lib/claude_memory/commands/completion_command.rb +3 -0
  26. data/lib/claude_memory/commands/dashboard_command.rb +54 -0
  27. data/lib/claude_memory/commands/dedupe_conflicts_command.rb +55 -0
  28. data/lib/claude_memory/commands/digest_command.rb +273 -0
  29. data/lib/claude_memory/commands/hook_command.rb +61 -2
  30. data/lib/claude_memory/commands/initializers/hooks_configurator.rb +7 -4
  31. data/lib/claude_memory/commands/reclassify_references_command.rb +56 -0
  32. data/lib/claude_memory/commands/registry.rb +7 -1
  33. data/lib/claude_memory/commands/show_command.rb +90 -0
  34. data/lib/claude_memory/commands/skills/distill-transcripts.md +13 -1
  35. data/lib/claude_memory/commands/stats_command.rb +131 -2
  36. data/lib/claude_memory/commands/sweep_command.rb +2 -0
  37. data/lib/claude_memory/configuration.rb +16 -0
  38. data/lib/claude_memory/core/relative_time.rb +9 -0
  39. data/lib/claude_memory/dashboard/api.rb +610 -0
  40. data/lib/claude_memory/dashboard/conflicts.rb +279 -0
  41. data/lib/claude_memory/dashboard/efficacy.rb +127 -0
  42. data/lib/claude_memory/dashboard/fact_presenter.rb +109 -0
  43. data/lib/claude_memory/dashboard/health.rb +175 -0
  44. data/lib/claude_memory/dashboard/index.html +2707 -0
  45. data/lib/claude_memory/dashboard/knowledge.rb +136 -0
  46. data/lib/claude_memory/dashboard/moments.rb +244 -0
  47. data/lib/claude_memory/dashboard/reuse.rb +97 -0
  48. data/lib/claude_memory/dashboard/scoped_fact_resolver.rb +95 -0
  49. data/lib/claude_memory/dashboard/server.rb +211 -0
  50. data/lib/claude_memory/dashboard/timeline.rb +68 -0
  51. data/lib/claude_memory/dashboard/trust.rb +454 -0
  52. data/lib/claude_memory/distill/bare_conclusion_detector.rb +71 -0
  53. data/lib/claude_memory/distill/reference_material_detector.rb +78 -0
  54. data/lib/claude_memory/hook/auto_memory_mirror.rb +112 -0
  55. data/lib/claude_memory/hook/context_injector.rb +97 -3
  56. data/lib/claude_memory/hook/handler.rb +191 -3
  57. data/lib/claude_memory/mcp/handlers/management_handlers.rb +8 -0
  58. data/lib/claude_memory/mcp/query_guide.rb +11 -0
  59. data/lib/claude_memory/mcp/text_summary.rb +29 -0
  60. data/lib/claude_memory/mcp/tool_definitions.rb +13 -0
  61. data/lib/claude_memory/mcp/tools.rb +148 -0
  62. data/lib/claude_memory/publish.rb +13 -21
  63. data/lib/claude_memory/recall/stale_detector.rb +67 -0
  64. data/lib/claude_memory/resolve/predicate_policy.rb +2 -0
  65. data/lib/claude_memory/resolve/resolver.rb +41 -11
  66. data/lib/claude_memory/store/llm_cache.rb +68 -0
  67. data/lib/claude_memory/store/metrics_aggregator.rb +96 -0
  68. data/lib/claude_memory/store/schema_manager.rb +1 -1
  69. data/lib/claude_memory/store/sqlite_store.rb +47 -143
  70. data/lib/claude_memory/store/store_manager.rb +29 -0
  71. data/lib/claude_memory/sweep/maintenance.rb +216 -0
  72. data/lib/claude_memory/sweep/recall_timestamp_refresher.rb +83 -0
  73. data/lib/claude_memory/sweep/sweeper.rb +2 -0
  74. data/lib/claude_memory/templates/hooks.example.json +5 -0
  75. data/lib/claude_memory/version.rb +1 -1
  76. data/lib/claude_memory.rb +24 -0
  77. metadata +51 -1
data/docs/1_0_punchlist.md ADDED
@@ -0,0 +1,371 @@
1
+ # 1.0 Punchlist
2
+
3
+ *Created: 2026-04-28. Restructured 2026-04-28 (post-0.10.0 release) around
4
+ milestone versions per the path-to-1.0 plan.*
5
+
6
+ The remaining work for a stable 1.0 release. Distinct from `improvements.md` —
7
+ that file tracks the long tail of inbound study/idea entries; this file tracks
8
+ **what blocks 1.0 confidence and which release each item ships in**.
9
+
10
+ Guiding question: *a skeptical Ruby developer should be able to look at one
11
+ screen and say "yes, this is helping, here's the evidence" without trusting our
12
+ marketing.* Today the dashboard tells that story in pieces but not as a
13
+ headline. Each item below closes a specific gap that prevents that headline
14
+ from existing.
15
+
16
+ ## What 1.0 commits to
17
+
18
+ Not "feature complete" — semver commitment. Once we ship 1.0:
19
+
20
+ - Public APIs (CLI surface, MCP tool schemas, hook payload shapes) lock to semver
21
+ - Schema migrations stay forward-compatible per the round-trip-spec convention
22
+ - The trust signals we ship have a baseline measurement other releases must beat
23
+
24
+ So 1.0 isn't gated by features. It's gated by **the measurement infrastructure
25
+ being trustworthy enough to defend a 1.0 claim.** That's why this punchlist is
26
+ mostly observability, not capability.
27
+
28
+ Items are cross-linked to the canonical entry in `improvements.md` where the
29
+ implementation detail and acceptance criteria live. This file is the
30
+ prioritization view; that file is the work view.
31
+
32
+ ---
33
+
34
+ ## 0.10.x — patch as needed (now)
35
+
36
+ Reactive only. Real usage will surface issues; cut a patch when one shows up.
37
+ No proactive minor work here.
38
+
39
+ ---
40
+
41
+ ## 0.11.0 — "Trust & Cost" (~1 week of work)
42
+
43
+ Theme: *users can see what memory costs and whether it's helping.* Each item
44
+ adds a number a skeptical user can read.
45
+
46
+ ### #1 Token budget telemetry — *what does memory cost?* ✅ landed 2026-04-29
47
+
48
+ **Gap.** `Core::TokenEstimator` exists and is unused outside one helper. We
49
+ have no idea what % of the SessionStart token budget memory consumes per
50
+ session, how it scales with DB size, or whether it's growing.
51
+
52
+ **Acceptance.** Trust panel + `claude-memory digest` show p50/p95 injected
53
+ tokens per session over the last 30 days. Per-session count rides on every
54
+ `hook_context` activity event so the data is queryable post-hoc.
55
+
56
+ **Why this release.** Loudest critique of any context-injection memory
57
+ system; if we can't answer it numerically, we can't defend the trade.
58
+
59
+ **Status.** Landed in 4 atomic commits on 2026-04-29 (15cb5f5, 35ae8d2,
60
+ d9601ca, 5bfd7c8). `context_tokens` recorded on every successful
61
+ `hook_context` event, surfaced via `Dashboard::Trust#token_budget`,
62
+ `claude-memory digest` "Context cost" section, and
63
+ `claude-memory stats --tokens [--since DAYS]` with histogram.
64
+
65
+ → improvements.md entry: *#47 Token Budget Telemetry*. Effort: 4-6h.
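The p50/p95 figures named in the acceptance criteria above can be sketched as a nearest-rank percentile over per-session token counts. This is a minimal illustration, not the gem's implementation: `percentile` and `token_budget_summary` are hypothetical names, and the sample array stands in for the `context_tokens` values recorded on `hook_context` events.

```ruby
# Nearest-rank percentile over per-session injected-token counts.
# `samples` is a stand-in for `context_tokens` values; how the gem
# actually queries and aggregates them is not shown in this diff.
def percentile(samples, pct)
  return nil if samples.empty?

  sorted = samples.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Hypothetical summary shape, loosely modelled on a "token budget" panel.
def token_budget_summary(samples)
  {
    sessions: samples.size,
    p50: percentile(samples, 50),
    p95: percentile(samples, 95)
  }
end

summary = token_budget_summary([1200, 900, 1500, 4000, 1100])
puts summary
```

Nearest-rank is chosen here only because it always returns an observed value, which keeps a small 30-day sample easy to reason about; an interpolating percentile would work equally well.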
66
+
67
+ ### #2 Hallucination rate as a first-class trust metric ✅ landed 2026-04-29
68
+
69
+ **Gap.** `ReferenceMaterialDetector` already classifies suspect facts and we
70
+ know from the #34 audit that ~25% of facts had embedded reasoning (i.e.
71
+ ~75% were bare conclusions at audit time). Neither signal is exposed on the
72
+ dashboard. We display clean numbers; we should display stained ones.
73
+
74
+ **Acceptance.** Trust panel surfaces a `quality_score` derived from
75
+ suspect-fact ratio + bare-conclusion ratio over active facts in both stores.
76
+ Digest includes a 30-day rejection rate ("how much of what we extracted got
77
+ rejected within a week?") so calibration drift is visible.
78
+
79
+ **Why this release.** Pollution rate matters as much as recall rate. Pairs
80
+ with #1 — together they answer the "is this still worth it?" question.
81
+
82
+ **Status.** Landed in 3 atomic commits on 2026-04-29 (27fa6af, 4d1c5bf,
83
+ 0b72fa4). New `Distill::BareConclusionDetector` + `Dashboard::Trust#quality_score`
84
+ + `claude-memory digest` Quality section with rejection rate.
85
+
86
+ → improvements.md entry: *#48 Hallucination Rate Metric*. Effort: 1d.
87
+
88
+ ### #5 `claude-memory show` — human-readable "what would be injected" ✅ landed 2026-04-29
89
+
90
+ **Gap.** Inspecting memory state today requires the dashboard or several CLI
91
+ commands (`recall`, `stats`, `census`). The CLAUDE.md alternative is
92
+ `cat CLAUDE.md` — instant, plain-English, no tool. We need the same one-line
93
+ inspect surface.
94
+
95
+ **Acceptance.** `claude-memory show` runs the same `Hook::ContextInjector`
96
+ path real sessions use, prints what would be injected next session in plain
97
+ English (not JSON), sized to fit a terminal, with predicate-grouped sections
98
+ matching the snapshot format.
99
+
100
+ **Why this release.** Trust requires inspectability. A user who can't see what
101
+ memory will inject can't develop confidence in it.
102
+
103
+ **Status.** Landed 2026-04-29 (commit 2586bb3). New `Commands::ShowCommand`
104
+ runs `Hook::ContextInjector` and prints the would-be-injected Markdown.
105
+ Default suppresses the raw-transcript pending-knowledge dump for
106
+ readability (`--pending` opts in). Footer reports fact count, token
107
+ estimate, char count.
108
+
109
+ → improvements.md entry: *#51 claude-memory show*. Effort: ½d.
110
+
111
+ ### #7 First-week ROI nudge — *moved up from post-1.0* ✅ landed 2026-04-30
112
+
113
+ **Gap.** New users install, run a few sessions, don't know whether memory is
114
+ working. The dashboard exists but they have to know to look.
115
+
116
+ **Acceptance.** SessionEnd hook prints `memory contributed N facts this
117
+ session, %used = X` inline for the first ~10 sessions, then quiets. Opt-out
118
+ via `CLAUDE_MEMORY_NO_NUDGE=1`.
119
+
120
+ **Why this release.** Belongs with the trust theme — it's the user-visible
121
+ proof that memory is doing work for them. Originally listed as post-1.0;
122
+ elevating because cold-start trust deserves to land before 1.0.
123
+
124
+ **Status.** Landed in 2 atomic commits on 2026-04-30 (f450ed9, 3acce93)
125
+ plus production smoke-test against this project's DB (event #229
126
+ recorded with n=11, used=0, pct=0 for a real session_id). New
127
+ `Hook::Handler#nudge` + `claude-memory hook nudge`; SessionEnd config
128
+ appends nudge after ingest+sweep. Silent on opt-out, missing
129
+ session_id, n=0, or first-week-complete (so empty sessions don't burn
130
+ slots).
131
+
132
+ → improvements.md entry: *#53 First-Week ROI Nudge*. Effort: ½d.
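The silence rules listed in the status note (opt-out, missing session_id, n=0, first-week-complete) amount to a short guard chain. The sketch below is an assumption-laden illustration: `nudge_line`, its keyword arguments, and the limit of 10 are hypothetical, and it assumes the opt-out is triggered only by the literal value `1`.

```ruby
# Hypothetical sketch of the nudge decision; not the gem's Hook::Handler#nudge.
NUDGE_SESSION_LIMIT = 10 # "first ~10 sessions" per the acceptance text

def nudge_line(session_id:, facts_injected:, facts_used:, prior_nudges:, env: ENV)
  return nil if env["CLAUDE_MEMORY_NO_NUDGE"] == "1"  # opt-out
  return nil if session_id.nil? || session_id.empty?   # missing session_id
  return nil if facts_injected.zero?                   # n=0: empty session burns no slot
  return nil if prior_nudges >= NUDGE_SESSION_LIMIT    # first week complete

  pct = (100.0 * facts_used / facts_injected).round
  "memory contributed #{facts_injected} facts this session, %used = #{pct}"
end

line = nudge_line(session_id: "abc", facts_injected: 11, facts_used: 0,
                  prior_nudges: 3, env: {})
puts line
```

Passing `env:` explicitly keeps the function testable without mutating the process environment.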
133
+
134
+ ### Risk-de-risking — 3-scenario harm prototype ✅ landed 2026-04-30
135
+
136
+ Before 0.12 builds the full 10-15-scenario harm benchmark (see #3), run a
137
+ 3-scenario prototype against the 0.10.0 codebase to confirm whether harm is
138
+ actually low. If the prototype surfaces a >0% harm rate on simple cases, the
139
+ full benchmark in 0.12 will reveal a fundamental issue — better to know at
140
+ 0.11 than discover at 0.12.
141
+
142
+ **Acceptance.** Three hand-written `harm_scenarios.yml` cases (one stale-tech,
143
+ one mismatched-scope, one superseded-but-undetected) run against real Claude
144
+ under `EVAL_MODE=real`. Reports go/no-go on the larger benchmark in 0.12.
145
+
146
+ **Status.** Landed 2026-04-30 (commit 35b368e). Three cases written:
147
+ `harm_stale_tech` (MySQL fact vs SQLite reality), `harm_mismatched_scope`
148
+ (global TS/Tailwind preference applied to a Ruby gem),
149
+ `harm_superseded_undetected` (two contradicting auth_method facts both
150
+ active). Structure validation passes in stub mode. Real-mode is gated
151
+ behind `EVAL_MODE=real` (~$2-8 per run) so the operator decides when to
152
+ spend; this prototype reports harm rate but doesn't enforce a threshold
153
+ yet — that's the 0.12 release-gate work.
154
+
155
+ → improvements.md entry: *#49 Negative-Fact Harm Benchmark* (prototype phase).
156
+ Effort: ½d.
157
+
158
+ **Ship target:** ~2 weeks from 0.10.0 (mid-May 2026 at current velocity).
159
+
160
+ ---
161
+
162
+ ## 0.12.0 — "Release Discipline" (~1 week of work)
163
+
164
+ Theme: *we can't ship a regression without noticing.* Internal infrastructure
165
+ that prevents future regressions. Not flashy but the actual prerequisite for
166
+ 1.0's semver commitment.
167
+
168
+ ### #3 Negative-fact harm benchmark (full 10-15 scenarios)
169
+
170
+ **Gap.** Every benchmark today measures whether memory **helps**. Nothing
171
+ measures whether memory **harms** — i.e. injects a wrong fact and Claude
172
+ follows it. Without this, "memory helps" is unfalsifiable.
173
+
174
+ **Acceptance.** `spec/benchmarks/dataset/harm_scenarios.yml` with 10-15 cases
175
+ spanning four harm classes (stale-tech, mismatched-scope, superseded-but-
176
+ undetected, reference-material-as-fact). Each scores `harm` if Claude follows
177
+ the wrong fact, `safe` otherwise. Wired into `bin/run-evals`. **>1% harm
178
+ rate blocks release** (configurable via `HARM_RATE_THRESHOLD`).
179
+
180
+ **Why this release.** A retrieval system that occasionally makes Claude
181
+ *wrong* is strictly worse than no memory; the release gate proves we're not
182
+ in that regime.
183
+
184
+ → improvements.md entry: *#49 Negative-Fact Harm Benchmark* (full corpus).
185
+ Effort: 2d.
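The release gate described above reduces to comparing a harm rate against a configurable threshold. The following is a sketch under stated assumptions: `harm_gate` is a hypothetical helper, verdicts are plain `"harm"`/`"safe"` strings, and `HARM_RATE_THRESHOLD` is read as a fraction (0.01 for 1%), which the punchlist does not specify.

```ruby
# Sketch of a release gate over harm-benchmark verdicts.
# Assumption: HARM_RATE_THRESHOLD is a fraction (0.01 == 1%), not a percentage.
DEFAULT_HARM_RATE_THRESHOLD = 0.01

def harm_gate(verdicts, env: ENV)
  threshold = Float(env.fetch("HARM_RATE_THRESHOLD", DEFAULT_HARM_RATE_THRESHOLD))
  harm_rate = verdicts.empty? ? 0.0 : verdicts.count("harm").fdiv(verdicts.size)
  { harm_rate: harm_rate, threshold: threshold, blocked: harm_rate > threshold }
end

verdicts = ["safe"] * 11 + ["harm"]
result = harm_gate(verdicts, env: {})
puts result[:blocked]
```

With only 10-15 scenarios, a single `harm` verdict already exceeds a 1% threshold, so in practice the gate behaves as "zero harmful outcomes allowed" unless the threshold is raised.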
186
+
187
+ ### #4 Publish the CLAUDE.md baseline in headline E2E results
188
+
189
+ **Gap.** `claude_md_adapter` exists in `spec/benchmarks/comparative/adapters/`
190
+ and is wired into `comparative_helper.rb`. The README's headline comparative
191
+ table doesn't include it. The single most important question for adoption —
192
+ *"is this better than a hand-written CLAUDE.md?"* — is unanswered in our
193
+ published numbers.
194
+
195
+ **Acceptance.** Comparative E2E report includes `CLAUDE.md baseline` row in
196
+ `spec/benchmarks/README.md` and in `bin/run-evals --comparative` summary.
197
+ README explicitly states the win/loss versus the static baseline.
198
+
199
+ **Why this release.** Cheapest item on the list — adapter built, just
200
+ surface the number. Pairs with #6 because it materializes once the
201
+ scoreboard infrastructure is there.
202
+
203
+ → improvements.md entry: *#50 CLAUDE.md Baseline in Headline Results*.
204
+ Effort: 30min code + one $2-8 real-mode run.
205
+
206
+ ### #6 Release-to-release benchmark scoreboard
207
+
208
+ **Gap.** Benchmark output is textual today. Nothing diff-able across versions.
209
+ Regressions land silently — the only reason we caught the BM25 normalization
210
+ bug was a manual run.
211
+
212
+ **Acceptance.** Each `bin/run-evals` run writes
213
+ `spec/benchmarks/results/<version>.json`. New `bin/bench-diff` compares
214
+ against the last tagged version's JSON and reports deltas. `/release` skill
215
+ reads it and refuses to ship on regressions over threshold.
216
+
217
+ **Why this release.** The semver commitment in 1.0 *requires* this — we
218
+ can't promise non-regression without the infrastructure to detect it.
219
+
220
+ → improvements.md entry: *#52 Benchmark Scoreboard Diff*. Effort: 1d.
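A version-to-version comparison like the one `bin/bench-diff` is meant to perform could look roughly like this. The sketch assumes a flat JSON object of metric name to number where higher is better; the real result schema, the metric names, and the `max_drop` threshold are all hypothetical.

```ruby
require "json"

# Compare two benchmark result documents and flag metrics that dropped
# by more than `max_drop`. Assumes flat {"metric" => number} JSON where
# higher is better; the actual results format is not shown in this diff.
def bench_diff(previous_json, current_json, max_drop: 0.02)
  previous = JSON.parse(previous_json)
  current = JSON.parse(current_json)

  shared = previous.keys.select { |metric| current.key?(metric) }
  deltas = shared.to_h { |metric| [metric, (current[metric] - previous[metric]).round(4)] }
  regressions = deltas.select { |_, delta| delta < -max_drop }.keys

  { deltas: deltas, regressions: regressions }
end

report = bench_diff(
  '{"recall_at_5": 0.80, "mrr": 0.60}',
  '{"recall_at_5": 0.74, "mrr": 0.61}'
)
puts report
```

A release script could then refuse to proceed whenever `report[:regressions]` is non-empty.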
221
+
222
+ **Ship target:** ~4 weeks from 0.10.0 (end of May 2026).
223
+
224
+ ---
225
+
226
+ ## 0.12.x → 1.0 — soak period (2-3 weeks)
227
+
228
+ Critical phase. Run 0.12 against real usage. Watch:
229
+
230
+ - **Harm rate stays at 0%** — release gate from #3
231
+ - **Hallucination rate trend** — from #2
232
+ - **Token budget growth** — from #1, #9
233
+ - **Utilization ratio** — across multiple projects
234
+
235
+ If any signal shifts unfavorably during soak, fix in 0.12.x. **Don't ship 1.0
236
+ from a release that hasn't observed itself for ≥2 weeks.**
237
+
238
+ This soak period is also where the relevance ratio metric (#31 from 0.10.0)
239
+ materializes its first real-mode measurement, and where the 0.11 trust
240
+ signals get a chance to be real numbers vs. theory.
241
+
242
+ ---
243
+
244
+ ## 1.0.0 — "Stable Memory"
245
+
246
+ Theme: *ready for daily use, ready to recommend.*
247
+
248
+ ### Post-1.0-punchlist polish (if landed during soak)
249
+
250
+ These were originally post-1.0 in the punchlist; if soak time permits, they
251
+ land in 1.0. Otherwise they ship in 1.1.
252
+
253
+ ### #8 Real-session repeat-correction detection
254
+
255
+ The repeat-correction benchmark (#32 from 0.10.0) is synthetic; production
256
+ has no equivalent signal. Analyze `activity_events` for "this fact was
257
+ injected last session, the user re-stated it this session" — that's where
258
+ memory is silently failing.
259
+
260
+ → improvements.md entry: *#54 Real-Session Repeat-Correction Detection*.
261
+ Effort: 2d.
262
+
263
+ ### #9 Token-cost growth tracking
264
+
265
+ Builds on #1. Weekly digest reports "context cost grew X% over 30d" as an
266
+ anomaly signal that the DB is bloating or context injection is going wide.
267
+
268
+ → improvements.md entry: *#55 Token-Cost Growth Tracking*. Effort: 3h after
269
+ #1 lands.
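The "context cost grew X% over 30d" figure is a plain relative change between two windows. A minimal sketch, assuming the inputs are average injected tokens per session for the previous and current 30-day windows (`growth_percent` is a hypothetical name):

```ruby
# Relative growth between two windows, as a percentage rounded to 0.1.
# Returns nil when there is no usable baseline to compare against.
def growth_percent(previous_avg, current_avg)
  return nil if previous_avg.nil? || previous_avg.zero?

  ((current_avg - previous_avg) / previous_avg.to_f * 100).round(1)
end

puts growth_percent(1200, 1500)
```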
270
+
271
+ ### #10 Drift dashboard
272
+
273
+ Snapshot `census` weekly, surface predicate distribution shifts on the
274
+ dashboard. Answers "is my fact base going off?" without a manual audit.
275
+
276
+ → improvements.md entry: *#56 Drift Dashboard*. Effort: 1.5d.
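One way to express "predicate distribution shifts" between two weekly `census` snapshots is the change in each predicate's share of all facts. This is an illustrative sketch only: the snapshot shape (predicate name to fact count) and the predicate names are assumptions, not the census output format.

```ruby
# Convert raw per-predicate counts into shares of the total.
def predicate_shares(counts)
  total = counts.values.sum.to_f
  counts.transform_values { |count| total.zero? ? 0.0 : count / total }
end

# Per-predicate change in share between two snapshots (after minus before).
def distribution_shift(before, after)
  before_shares = predicate_shares(before)
  after_shares = predicate_shares(after)

  (before_shares.keys | after_shares.keys).to_h do |predicate|
    shift = after_shares.fetch(predicate, 0.0) - before_shares.fetch(predicate, 0.0)
    [predicate, shift.round(3)]
  end
end

last_week = { "uses_database" => 5, "convention" => 5 }
this_week = { "uses_database" => 5, "convention" => 15 }
shift = distribution_shift(last_week, this_week)
puts shift
```

Comparing shares rather than raw counts keeps the signal stable as the fact base grows; a dashboard could highlight any predicate whose share moves by more than a chosen margin.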
277
+
278
+ ### #11 API stability audit (NEW — added 2026-04-28)
279
+
280
+ **Gap.** "1.0 commits to semver" is meaningless without an explicit
281
+ public/internal split. Many of the surfaces touched in 0.9.0 / 0.10.0
282
+ (MCP tool schemas, hook payload shapes, CLI flags, dashboard endpoints)
283
+ have evolved organically and aren't formally documented as stable vs.
284
+ internal.
285
+
286
+ **Acceptance.**
287
+
288
+ - New `docs/api_stability.md` enumerating:
289
+ - **Public CLI**: every `claude-memory <subcommand>` and its flags, with stability tier
290
+ - **Public MCP tools**: every tool's schema, return shape, and tool-annotation hints
291
+ - **Public hook contract**: payload fields, return shapes, exit codes
292
+ - **Public Ruby API**: which classes/modules under `lib/claude_memory/` are external-facing (`Recall`, `Configuration`, `Store::StoreManager`?) vs. internal-only
293
+ - **Schema**: stability of column names, table names, predicate vocabulary
294
+ - A deprecation policy: "we'll mark X deprecated in N.x.0 and remove no earlier than (N+1).0.0"
295
+ - README + CLAUDE.md link to the new doc as the authoritative source
296
+
297
+ **Why this release.** Without this, the 1.0 semver promise is vibes, not a
298
+ contract. Future regressions in non-listed areas can be argued away; future
299
+ regressions in listed areas are bugs. Forces us to be honest about what
300
+ we're committing to.
301
+
302
+ → improvements.md entry: *#59 API Stability Audit* (added 2026-04-28; renumbered
303
+ from #57 after rebase brought in Mercury-article entries #57/#58). Effort:
304
+ 2d including the doc + deprecation-warning instrumentation for any
305
+ soon-to-be-removed surface.
306
+
307
+ ### Release framing
308
+
309
+ README + CHANGELOG framing for 1.0 explicitly states:
310
+
311
+ - "We measured X harm rate, Y utilization, Z hallucination rate across N
312
+ projects over W weeks before tagging this."
313
+ - The public API surface is documented at `docs/api_stability.md`
314
+ - Deprecation policy explicit
315
+
316
+ **Ship target:** 6-8 weeks from 0.10.0 (mid-June 2026 at current velocity).
317
+
318
+ ---
319
+
320
+ ## Defer / skip for 1.0
321
+
322
+ - **#44 Universal search box** — cosmetic given the gaps above. Knowledge tab
323
+ drawers cover the primary need.
324
+ - **#45 Live SSE/WebSocket feed** — polling is adequate; dashboard polish, not
325
+ a confidence gap.
326
+ - **#23 REST API endpoint** — MCP covers primary use case; defer to 1.x.
327
+ - **#25 HTTP MCP transport** — no startup-latency complaint to motivate it yet.
328
+
329
+ ---
330
+
331
+ ## Risk to flag now
332
+
333
+ The biggest hidden risk in this plan is **the harm benchmark (#3) finds
334
+ something.** If 10-15 scenarios with intentionally wrong facts produce >1%
335
+ harm rate, that's a fundamental retrieval-discipline issue that could push
336
+ 1.0 by months. The 3-scenario prototype in 0.11 (above) is specifically
337
+ designed to surface this risk earlier.
338
+
339
+ ---
340
+
341
+ ## Velocity assumptions
342
+
343
+ Based on actual release cadence Mar-Apr 2026:
344
+
345
+ | Pair | Days |
346
+ |---|---|
347
+ | 0.7.0 → 0.7.1 | minor patch, days |
348
+ | 0.7.1 → 0.8.0 | 17 |
349
+ | 0.8.0 → 0.9.0 | 17 |
350
+ | 0.9.0 → 0.9.1 | same day (patch) |
351
+ | 0.9.1 → 0.10.0 | 12 |
352
+
353
+ Average ~2 weeks per minor with substantial work landing each cycle.
354
+
355
+ | Milestone | Estimated work | Calendar target |
356
+ |---|---|---|
357
+ | 0.10.x patches | reactive | as-needed |
358
+ | 0.11.0 | ~1 week | ~2026-05-12 |
359
+ | 0.12.0 | ~1 week | ~2026-05-26 |
360
+ | Soak | 2-3 weeks | through ~2026-06-16 |
361
+ | 1.0.0 | 1-2 days release prep + #11 | ~2026-06-16 to 2026-06-23 |
362
+
363
+ These are calendar estimates assuming roughly the same focus level as the
364
+ 0.10.0 cycle. Real cadence will adjust based on what surfaces during soak.
365
+
366
+ ---
367
+
368
+ *Last updated: 2026-04-28 (post-0.10.0). Restructured around milestone
369
+ versions per the path-to-1.0 plan. #7 moved up from post-1.0 to 0.11; #11
370
+ API stability audit added as a new 1.0 must-have; 3-scenario harm prototype
371
+ added to 0.11 as risk-de-risking work for the full 0.12 benchmark.*
data/docs/EXAMPLES.md CHANGED
@@ -428,9 +428,48 @@ Claude: "You're using Context API for state management. You previously used Redu
428
428
 
429
429
  ---
430
430
 
431
+ ## Inspecting What Memory Knows (0.10.0+)
432
+
433
+ When you want to see what's actually in memory — what's been extracted, which
434
+ facts Claude has been reaching for, what's stale, what's contradicting — open
435
+ the dashboard:
436
+
437
+ ```bash
438
+ claude-memory dashboard
439
+ ```
440
+
441
+ The dashboard serves at `http://localhost:3377` by default. It surfaces:
442
+
443
+ - A **moments feed** — every recall, context injection, extraction event with
444
+ the facts they touched. Click any moment for the full payload.
445
+ - A **Trust sidebar** — week-over-week activity, your global "fingerprint",
446
+ utilization ratio (% of recently extracted facts Claude actually used), and
447
+ your 👍/👎 feedback ratio.
448
+ - **Conflicts** with display-layer dedup so you don't have to triage 11 rows
449
+ of the same contradiction one at a time.
450
+ - **Knowledge** — facts grouped by predicate, with a separate References
451
+ section for auto-detected reference material.
452
+
453
+ For a markdown summary you can email or commit:
454
+
455
+ ```bash
456
+ claude-memory digest --since 7
457
+ ```
458
+
459
+ For a privacy-safe cross-project audit:
460
+
461
+ ```bash
462
+ claude-memory census
463
+ ```
464
+
465
+ See **[Dashboard guide →](dashboard.md)** for the full panel reference.
466
+
467
+ ---
468
+
431
469
  ## Next Steps
432
470
 
433
- - 📖 [Read the Getting Started Guide](GETTING_STARTED.md) *(coming soon)*
434
- - 🔧 [Set up the Claude Code Plugin](PLUGIN.md)
471
+ - 📖 [Read the Getting Started Guide](GETTING_STARTED.md)
472
+ - 📊 [Inspect with the Dashboard](dashboard.md)
473
+ - 🔧 [Set up the Claude Code Plugin](plugin.md)
435
474
  - 🏗️ [Understand the Architecture](architecture.md)
436
475
  - 📝 [Check the Changelog](../CHANGELOG.md)
data/docs/GETTING_STARTED.md CHANGED
@@ -19,7 +19,7 @@ gem install claude_memory
19
19
  Verify installation:
20
20
  ```bash
21
21
  claude-memory --version
22
- # => claude_memory 0.2.0
22
+ # => claude_memory 0.10.0
23
23
  ```
24
24
 
25
25
  ### Step 2: Install the Plugin
@@ -283,13 +283,13 @@ ClaudeMemory Doctor Report
283
283
  ==========================
284
284
 
285
285
  ✓ Global database: ~/.claude/memory.sqlite3
286
- - Schema version: 6
286
+ - Schema version: 17
287
287
  - Facts: 12
288
288
  - Entities: 8
289
289
  - Status: Healthy
290
290
 
291
291
  ✓ Project database: .claude/memory.sqlite3
292
- - Schema version: 6
292
+ - Schema version: 17
293
293
  - Facts: 23
294
294
  - Entities: 15
295
295
  - Status: Healthy
@@ -314,6 +314,22 @@ ls -lh .claude/memory.sqlite3
314
314
  # => -rw-r--r-- 1 user staff 64K Jan 26 10:35 .claude/memory.sqlite3
315
315
  ```
316
316
 
317
+ ### Open the Dashboard (0.10.0+)
318
+
319
+ Once you have a few sessions' worth of memory, the dashboard is the fastest
320
+ way to see what's actually in there:
321
+
322
+ ```bash
323
+ claude-memory dashboard
324
+ ```
325
+
326
+ Opens `http://localhost:3377` with a moments feed (every recall, context
327
+ injection, and extraction event), a Trust sidebar showing your global
328
+ "fingerprint" and 30-day utilization ratio, a deduped Conflicts panel, and a
329
+ Knowledge panel grouping facts by predicate.
330
+
331
+ See **[docs/dashboard.md](dashboard.md)** for the full panel guide.
332
+
317
333
  ### Test Memory Recall
318
334
 
319
335
  Have a conversation with Claude to test:
@@ -560,7 +576,8 @@ sqlite3 .claude/memory.sqlite3 "SELECT * FROM facts LIMIT 5;"
560
576
  Now that you're up and running:
561
577
 
562
578
  - 📖 Read [Examples](EXAMPLES.md) for common use cases
563
- - 🔧 Explore [Plugin Documentation](PLUGIN.md) for advanced configuration
579
+ - 📊 Open the [Dashboard](dashboard.md) for live inspection (0.10.0+)
580
+ - 🔧 Explore [Plugin Documentation](plugin.md) for advanced configuration
564
581
  - 🏗️ Review [Architecture](architecture.md) for technical details
565
582
  - 💬 Join [Discussions](https://github.com/codenamev/claude_memory/discussions) to share feedback
566
583
 
@@ -572,8 +589,20 @@ Now that you're up and running:
572
589
  | `claude-memory doctor` | Check system health |
573
590
  | `claude-memory recall <query>` | Search for facts |
574
591
  | `claude-memory promote <fact_id>` | Make fact global |
592
+ | `claude-memory reject <id_or_docid>` | Mark a fact as rejected |
575
593
  | `claude-memory changes` | Recent updates |
576
594
  | `claude-memory conflicts` | Show contradictions |
595
+ | `claude-memory dashboard` | Open the local web UI (0.10.0+) |
596
+ | `claude-memory digest --since 7` | Markdown report of the last 7 days (0.10.0+; gains Context cost + Quality sections in 0.11.0) |
597
+ | `claude-memory show [--pending] [--source]` | Print what memory would inject at next SessionStart (0.11.0+) |
598
+ | `claude-memory stats --stale` | List facts not recalled recently (0.10.0+) |
599
+ | `claude-memory stats --tokens [--since DAYS]` | SessionStart context-token budget histogram (0.11.0+) |
600
+ | `claude-memory stats --tools` | MCP tool-call telemetry (0.9.0+) |
601
+ | `claude-memory census` | Privacy-safe predicate audit across projects (0.10.0+) |
602
+ | `claude-memory dedupe-conflicts --dry-run` | Preview historical conflict-row dedup (0.10.0+) |
603
+ | `claude-memory reclassify-references --dry-run` | Preview reference-material retag (0.10.0+) |
604
+ | `claude-memory compact` | VACUUM databases |
605
+ | `claude-memory export` | Dump facts to JSON |
577
606
  | `/claude-memory:analyze` | Bootstrap project knowledge |
578
607
 
579
608
  ## Support
data/docs/architecture.md CHANGED
@@ -9,7 +9,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
9
9
  ```
10
10
  ┌─────────────────────────────────────────────────────────────┐
11
11
  │ Application Layer │
12
- │ CLI (Router) → Commands (20 classes) → Configuration │
12
+ │ CLI (Router) → Commands (34 classes) → Configuration │
13
13
  └──────────────────────┬──────────────────────────────────────┘
14
14
 
15
15
  ┌──────────────────────▼──────────────────────────────────────┐
@@ -27,7 +27,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
27
27
 
28
28
  ┌──────────────────────▼──────────────────────────────────────┐
29
29
  │ Infrastructure Layer │
30
- │ Store (SQLite v6 + WAL) → FileSystem → Index (FTS5+Vector)
30
+ │ Store (SQLite v17 + WAL) → FileSystem → Index (FTS5+Vector)│
31
31
  │ Templates │
32
32
  └─────────────────────────────────────────────────────────────┘
33
33
  ```
@@ -40,7 +40,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
40
40
 
41
41
  **Components:**
42
42
  - **CLI** (`cli.rb`): Thin router that dispatches to command classes
43
- - **Commands** (`commands/`): 20 command classes, each handling one CLI command
43
+ - **Commands** (`commands/`): 34 command classes, each handling one CLI command
44
44
  - **Configuration** (`configuration.rb`): Centralized ENV access and path calculation
45
45
 
46
46
  **Key Principles:**
@@ -179,7 +179,7 @@ end
179
179
  **Components:**
180
180
 
181
181
  #### Store (`store/`)
182
- - **SQLiteStore**: Direct database access via Sequel (schema v6)
182
+ - **SQLiteStore**: Direct database access via Sequel (schema v17)
183
183
  - **StoreManager**: Manages dual databases (global + project)
184
184
  - **Transaction safety**: Atomic multi-step operations
185
185
  - **WAL mode**: Write-Ahead Logging for better concurrency
@@ -201,6 +201,21 @@ end
201
201
  - Output style templates (`output-styles/memory-aware.md`)
202
202
  - Setup and configuration scaffolding
203
203
 
204
+ #### Dashboard (`dashboard/`)
205
+ - **Server**: WEBrick HTTP server (default port 3377), starts via `claude-memory dashboard`
206
+ - **API**: HTTP-shape glue + per-endpoint formatting; routes/delegates to panel classes
207
+ - **Panels** (each backed by a dedicated class with focused responsibility):
208
+ - `Trust`: weekly moments, fingerprint, utilization, feedback ratio, needs-review, **token_budget** (p50/p95/avg over 30d, 0.11.0+), **quality_score** (live 30-day window + historical baseline, 0.11.0+)
209
+ - `Moments`: feed-first activity stream with kind classification
210
+ - `Knowledge`: predicate-grouped fact summary (incl. References section)
211
+ - `Conflicts`: display-layer dedup with bulk-reject helper
212
+ - `Reuse`: most-used facts within window
213
+ - `Health`: db / hooks / vec checks with actionable fix strings
214
+ - `Timeline`: 30-day daily rollup
215
+ - `FactPresenter`, `ScopedFactResolver`: shared rendering / scope-aware ID resolution
216
+ - Connections released after every request — no held WAL writer locks across page loads
217
+ - See [docs/dashboard.md](dashboard.md) for the user-facing guide
218
+
204
219
  **Key Principles:**
205
220
  - Ports and Adapters: Clear interfaces for external systems
206
221
  - Dependency Injection: Real vs. test implementations
@@ -346,10 +361,10 @@ FileSystem (write)
346
361
  - Value objects (SessionId, TranscriptPath, FactId)
347
362
  - Centralized Configuration
348
363
  - 4 domain models with business logic
349
- - 20 command classes
350
- - 19 MCP tools
364
+ - 34 command classes
365
+ - 25 MCP tools
351
366
  - Semantic search with local embeddings (FastEmbed + TF-IDF fallback)
352
- - Schema v6 with WAL mode
367
+ - Schema v17 with WAL mode
353
368
 
354
369
  ## Future Improvements
355
370