claude_memory 0.9.0 → 0.10.0

Files changed (76)
  1. checksums.yaml +4 -4
  2. data/.claude/memory.sqlite3 +0 -0
  3. data/.claude/rules/claude_memory.generated.md +63 -1
  4. data/.claude/skills/dashboard/SKILL.md +42 -0
  5. data/.claude/skills/release/SKILL.md +168 -0
  6. data/.claude-plugin/marketplace.json +1 -1
  7. data/.claude-plugin/plugin.json +1 -1
  8. data/CHANGELOG.md +92 -0
  9. data/CLAUDE.md +21 -5
  10. data/README.md +32 -2
  11. data/db/migrations/015_add_activity_events.rb +26 -0
  12. data/db/migrations/016_add_moment_feedback.rb +22 -0
  13. data/db/migrations/017_add_last_recalled_at.rb +15 -0
  14. data/docs/1_0_punchlist.md +190 -0
  15. data/docs/EXAMPLES.md +41 -2
  16. data/docs/GETTING_STARTED.md +31 -4
  17. data/docs/architecture.md +22 -7
  18. data/docs/audit-queries.md +131 -0
  19. data/docs/dashboard.md +172 -0
  20. data/docs/improvements.md +465 -9
  21. data/docs/influence/cq.md +187 -0
  22. data/docs/plugin.md +13 -6
  23. data/docs/quality_review.md +489 -172
  24. data/docs/reflection_memory_as_accumulating_judgment.md +67 -0
  25. data/lib/claude_memory/activity_log.rb +86 -0
  26. data/lib/claude_memory/commands/census_command.rb +210 -0
  27. data/lib/claude_memory/commands/completion_command.rb +3 -0
  28. data/lib/claude_memory/commands/dashboard_command.rb +54 -0
  29. data/lib/claude_memory/commands/dedupe_conflicts_command.rb +55 -0
  30. data/lib/claude_memory/commands/digest_command.rb +181 -0
  31. data/lib/claude_memory/commands/hook_command.rb +34 -0
  32. data/lib/claude_memory/commands/reclassify_references_command.rb +56 -0
  33. data/lib/claude_memory/commands/registry.rb +6 -1
  34. data/lib/claude_memory/commands/skills/distill-transcripts.md +13 -1
  35. data/lib/claude_memory/commands/stats_command.rb +38 -1
  36. data/lib/claude_memory/commands/sweep_command.rb +2 -0
  37. data/lib/claude_memory/configuration.rb +16 -0
  38. data/lib/claude_memory/core/relative_time.rb +9 -0
  39. data/lib/claude_memory/dashboard/api.rb +610 -0
  40. data/lib/claude_memory/dashboard/conflicts.rb +279 -0
  41. data/lib/claude_memory/dashboard/efficacy.rb +127 -0
  42. data/lib/claude_memory/dashboard/fact_presenter.rb +109 -0
  43. data/lib/claude_memory/dashboard/health.rb +175 -0
  44. data/lib/claude_memory/dashboard/index.html +2707 -0
  45. data/lib/claude_memory/dashboard/knowledge.rb +136 -0
  46. data/lib/claude_memory/dashboard/moments.rb +244 -0
  47. data/lib/claude_memory/dashboard/reuse.rb +97 -0
  48. data/lib/claude_memory/dashboard/scoped_fact_resolver.rb +95 -0
  49. data/lib/claude_memory/dashboard/server.rb +211 -0
  50. data/lib/claude_memory/dashboard/timeline.rb +68 -0
  51. data/lib/claude_memory/dashboard/trust.rb +285 -0
  52. data/lib/claude_memory/distill/reference_material_detector.rb +78 -0
  53. data/lib/claude_memory/hook/auto_memory_mirror.rb +112 -0
  54. data/lib/claude_memory/hook/context_injector.rb +97 -3
  55. data/lib/claude_memory/hook/handler.rb +50 -3
  56. data/lib/claude_memory/mcp/handlers/management_handlers.rb +8 -0
  57. data/lib/claude_memory/mcp/query_guide.rb +11 -0
  58. data/lib/claude_memory/mcp/server.rb +8 -2
  59. data/lib/claude_memory/mcp/text_summary.rb +29 -0
  60. data/lib/claude_memory/mcp/tool_definitions.rb +13 -0
  61. data/lib/claude_memory/mcp/tools.rb +148 -0
  62. data/lib/claude_memory/publish.rb +13 -21
  63. data/lib/claude_memory/recall/stale_detector.rb +67 -0
  64. data/lib/claude_memory/resolve/predicate_policy.rb +2 -0
  65. data/lib/claude_memory/resolve/resolver.rb +41 -11
  66. data/lib/claude_memory/store/llm_cache.rb +68 -0
  67. data/lib/claude_memory/store/metrics_aggregator.rb +96 -0
  68. data/lib/claude_memory/store/schema_manager.rb +1 -1
  69. data/lib/claude_memory/store/sqlite_store.rb +47 -143
  70. data/lib/claude_memory/store/store_manager.rb +29 -0
  71. data/lib/claude_memory/sweep/maintenance.rb +216 -0
  72. data/lib/claude_memory/sweep/recall_timestamp_refresher.rb +83 -0
  73. data/lib/claude_memory/sweep/sweeper.rb +2 -0
  74. data/lib/claude_memory/version.rb +1 -1
  75. data/lib/claude_memory.rb +22 -0
  76. metadata +50 -1
data/docs/1_0_punchlist.md ADDED
@@ -0,0 +1,190 @@
1
+ # 1.0 Punchlist
2
+
3
+ *Created: 2026-04-28*
4
+
5
+ The remaining work for a stable 1.0 release. Distinct from `improvements.md` —
6
+ that file tracks the long tail of inbound study/idea entries; this file tracks
7
+ **what blocks 1.0 confidence**.
8
+
9
+ Guiding question: *a skeptical Ruby developer should be able to look at one
10
+ screen and say "yes, this is helping, here's the evidence" without trusting our
11
+ marketing.* Today the dashboard tells that story in pieces but not as a
12
+ headline. Each item below closes a specific gap that prevents that headline
13
+ from existing.
14
+
15
+ Items are cross-linked to the canonical entry in `improvements.md` where the
16
+ implementation detail and acceptance criteria live. This file is the
17
+ prioritization view; that file is the work view.
18
+
19
+ ---
20
+
21
+ ## Must-have for 1.0
22
+
23
+ ### 1. Token budget telemetry — *what does memory cost?*
24
+
25
+ **Gap.** `Core::TokenEstimator` exists and is unused outside one helper. We
26
+ have no idea what % of the SessionStart token budget memory consumes per
27
+ session, how it scales with DB size, or whether it's growing.
28
+
29
+ **Acceptance.** Trust panel + `claude-memory digest` show p50/p95 injected
30
+ tokens per session over the last 30 days. Per-session count rides on every
31
+ `hook_context` activity event so the data is queryable post-hoc.
32
+
33
+ **Why must-have.** "Costs you tokens forever" is the strongest critique of any
34
+ context-injection memory system; if we can't answer it numerically, we can't
35
+ defend the trade.
36
+
37
+ → improvements.md entry: *Token Budget Telemetry*
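As an illustration of the p50/p95 figure named in the acceptance criteria above, here is a minimal sketch of a nearest-rank percentile over per-session token counts. The `percentile` helper and the sample `token_counts` are hypothetical; nothing here is taken from `Core::TokenEstimator` or the actual digest code, and a real implementation would read the counts from `hook_context` activity events.

```ruby
# Nearest-rank percentile over per-session injected-token counts.
# Hypothetical helper for illustration; not part of claude_memory.
def percentile(values, pct)
  raise ArgumentError, "values must not be empty" if values.empty?

  sorted = values.sort
  # Nearest-rank: smallest value with at least pct% of samples at or below it.
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Made-up sample: injected tokens for ten sessions.
token_counts = [310, 420, 450, 480, 500, 530, 610, 700, 950, 1800]

p50 = percentile(token_counts, 50)
p95 = percentile(token_counts, 95)
puts "injected tokens per session: p50=#{p50} p95=#{p95}"
```

Reporting both values matters because a single outlier session (the 1800 above) barely moves the median but dominates the tail, and the tail is what the "costs you tokens forever" critique is about.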
38
+
39
+ ### 2. Hallucination rate as a first-class trust metric
40
+
41
+ **Gap.** `ReferenceMaterialDetector` already classifies suspect facts and we
42
+ know from the #34 audit that ~25% of facts had embedded reasoning (i.e.
43
+ ~75% were bare conclusions at audit time). Neither signal is exposed on the
44
+ dashboard. We display clean numbers; we should display stained ones.
45
+
46
+ **Acceptance.** Trust panel surfaces a `quality_score` derived from
47
+ suspect-fact ratio + bare-conclusion ratio over active facts in both stores.
48
+ Digest includes a 30-day rejection rate ("how much of what we extracted got
49
+ rejected within a week?") so calibration drift is visible.
50
+
51
+ **Why must-have.** We can't claim "memory is helping" if we can't show "memory
52
+ isn't poisoning the well."
53
+
54
+ → improvements.md entry: *Hallucination Rate Metric*
55
+
56
+ ### 3. Negative-fact harm benchmark
57
+
58
+ **Gap.** Every benchmark we run today measures whether memory **helps**.
59
+ Nothing measures whether memory **harms** — i.e. injects a wrong fact and
60
+ Claude follows it. Without this, "memory helps" is unfalsifiable.
61
+
62
+ **Acceptance.** New `spec/benchmarks/dataset/harm_scenarios.yml` with 10–15
63
+ cases where memory holds a stale or wrong fact. Each case scores `harm` if
64
+ Claude's response follows the wrong fact, `safe` otherwise. Wired into
65
+ `bin/run-evals`. >1% harm rate blocks release.
66
+
67
+ **Why must-have.** A retrieval system that occasionally makes Claude *wrong*
68
+ is strictly worse than no memory; we need a release gate that proves we're
69
+ not in that regime.
70
+
71
+ → improvements.md entry: *Negative-Fact Harm Benchmark*
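The release gate described above can be sketched as follows. Only the `harm`/`safe` labels and the 1% threshold come from the punchlist; the method names and the sample result arrays are assumptions made for illustration, not the shape of `bin/run-evals` output.

```ruby
# Each result is :harm (the response followed the wrong fact) or :safe.
# The 1% threshold is from the punchlist; the rest is a hypothetical sketch.
HARM_THRESHOLD = 0.01

def harm_rate(results)
  return 0.0 if results.empty?

  results.count(:harm).to_f / results.length
end

def release_blocked?(results, threshold: HARM_THRESHOLD)
  harm_rate(results) > threshold
end

clean   = Array.new(12, :safe)            # no harmful responses
one_bad = Array.new(11, :safe) + [:harm]  # one harmful response out of 12

puts "clean run blocked?   #{release_blocked?(clean)}"
puts "one-bad run blocked? #{release_blocked?(one_bad)}"
```

With only 10 to 15 scenarios, a single `harm` result is at least 1/15 (about 6.7%), so a ">1%" gate is in practice a zero-tolerance gate. The percentage only starts to behave like a rate once the dataset grows past 100 cases.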
72
+
73
+ ### 4. Publish the CLAUDE.md baseline in headline E2E results
74
+
75
+ **Gap.** `claude_md_adapter` exists in `spec/benchmarks/comparative/adapters/`
76
+ and supports E2E. The adapter is wired into `comparative_helper.rb` but the
77
+ README's headline comparative table doesn't include it. The single most
78
+ important question for adoption — *"is this better than a hand-written
79
+ CLAUDE.md?"* — is currently unanswered in our published numbers.
80
+
81
+ **Acceptance.** Comparative E2E report includes `CLAUDE.md baseline` row in
82
+ `spec/benchmarks/README.md` and in `bin/run-evals --comparative` summary
83
+ output. README explicitly states the win/loss versus the static baseline.
84
+
85
+ **Why must-have.** Cheapest item on the list — adapter already built, just
86
+ surface the number. If we can't beat a static CLAUDE.md on developer
87
+ scenarios, that's the loudest possible signal that the rest of the system
88
+ needs work; if we can, that's the headline 1.0 brag.
89
+
90
+ → improvements.md entry: *CLAUDE.md Baseline in Headline Results*
91
+
92
+ ### 5. `claude-memory show` — human-readable "what would be injected"
93
+
94
+ **Gap.** Inspecting memory state today requires the dashboard or several CLI
95
+ commands (`recall`, `stats`, `census`). The CLAUDE.md alternative is
96
+ `cat CLAUDE.md` — instant, plain-English, no tool. We need the same one-line
97
+ inspect surface.
98
+
99
+ **Acceptance.** `claude-memory show` runs the same `Hook::ContextInjector`
100
+ path real sessions use, prints what would be injected next session in plain
101
+ English (not JSON), sized to fit a terminal, with predicate-grouped sections
102
+ matching the snapshot format.
103
+
104
+ **Why must-have.** Trust requires inspectability. A user who can't see what
105
+ memory will inject can't develop confidence in it.
106
+
107
+ → improvements.md entry: *claude-memory show*
108
+
109
+ ### 6. Release-to-release benchmark scoreboard
110
+
111
+ **Gap.** Benchmark output is textual today. Nothing diff-able across versions.
112
+ Regressions land silently — the only reason we caught the FTS5/RRF
113
+ normalization bug was a manual run.
114
+
115
+ **Acceptance.** Each `bin/run-evals` run writes
116
+ `spec/benchmarks/results/<version>.json`. New `bin/bench-diff` (or rake task)
117
+ compares against the last tagged version's JSON and reports deltas. Release
118
+ script (`/release` skill) reads it and refuses to ship on regressions over a
119
+ configurable threshold.
120
+
121
+ **Why must-have.** Without longitudinal tracking, every benchmark we run is a
122
+ snapshot. 1.0 is the moment we commit to *not regressing* what we ship.
123
+
124
+ → improvements.md entry: *Benchmark Scoreboard Diff*
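A minimal sketch of the comparison step, assuming each results file is a flat JSON object of metric name to score where higher is better. The metric names, the scores, and the 0.02 threshold are all hypothetical; the punchlist does not specify the schema of `spec/benchmarks/results/<version>.json` or the interface of `bin/bench-diff`.

```ruby
require "json"

# Return the metrics whose score dropped by more than `threshold`.
# Assumes flat { "metric" => score } hashes where higher is better.
def regressions(previous, current, threshold: 0.02)
  previous.each_with_object({}) do |(metric, old_score), out|
    new_score = current[metric]
    next if new_score.nil? # metric missing in the new run; report separately

    delta = new_score - old_score
    out[metric] = delta.round(4) if delta < -threshold
  end
end

# Inline stand-ins for two versions' results files (made-up numbers).
previous = JSON.parse('{"recall_at_5": 0.82, "mrr": 0.71}')
current  = JSON.parse('{"recall_at_5": 0.76, "mrr": 0.72}')

failed = regressions(previous, current)
puts failed.empty? ? "ok to ship" : "regressions: #{failed}"
```

Keeping the threshold as a keyword argument mirrors the "configurable threshold" in the acceptance criteria: small run-to-run noise passes, while a drop larger than the threshold is surfaced by name so the release step can refuse to ship.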
125
+
126
+ ---
127
+
128
+ ## Strong post-1.0
129
+
130
+ These shouldn't block 1.0 but should land in the next release window.
131
+
132
+ ### 7. First-week ROI nudge
133
+
134
+ SessionEnd hook prints `memory contributed N facts this session, %used = X`
135
+ inline for the first ~10 sessions. Closes the cold-start gap where new users
136
+ don't see value because they don't think to look.
137
+
138
+ → improvements.md entry: *First-Week ROI Nudge*
139
+
140
+ ### 8. Real-session repeat-correction detector
141
+
142
+ The repeat-correction benchmark (#32) is synthetic; production has no
143
+ equivalent signal. Analyze `activity_events` to detect "this fact was injected
144
+ last session, the user re-stated it this session" — that's where memory is
145
+ silently failing.
146
+
147
+ → improvements.md entry: *Real-Session Repeat-Correction Detection*
148
+
149
+ ### 9. Token-cost growth tracking
150
+
151
+ Builds on #1. Weekly digest reports "context cost grew X% over 30d" as an
152
+ anomaly signal that the DB is bloating or context injection is going wide.
153
+
154
+ → improvements.md entry: *Token-Cost Growth Tracking*
155
+
156
+ ### 10. Drift dashboard
157
+
158
+ Snapshot `census` weekly, surface predicate distribution shifts on the
159
+ dashboard. Answers "is my fact base going off?" without a manual audit.
160
+
161
+ → improvements.md entry: *Drift Dashboard*
162
+
163
+ ---
164
+
165
+ ## Defer / skip for 1.0
166
+
167
+ - **#44 Universal search box** — cosmetic given the gaps above. Knowledge tab
168
+ drawers cover the primary need.
169
+ - **#45 Live SSE/WebSocket feed** — polling is adequate; dashboard polish, not
170
+ a confidence gap.
171
+
172
+ ---
173
+
174
+ ## Sequencing recommendation
175
+
176
+ Smallest set that materially shifts 1.0 confidence (~2 days):
177
+
178
+ 1. **Token budget telemetry** (#1) — closes the loudest critique.
179
+ 2. **CLAUDE.md baseline publish** (#4) — adapter already built, one report change.
180
+ 3. **Hallucination rate** (#2) — reuses ReferenceMaterialDetector.
181
+
182
+ Then in roughly priority order: `claude-memory show` (#5), harm benchmark
183
+ (#3), scoreboard (#6). Post-1.0 items follow naturally once the must-haves
184
+ land.
185
+
186
+ ---
187
+
188
+ *Last updated: 2026-04-28 — initial punchlist drawn from session-end critique
189
+ of observability/outcome gaps. Each entry will be elaborated with concrete
190
+ file:line refs in improvements.md as it's worked.*
data/docs/EXAMPLES.md CHANGED
@@ -428,9 +428,48 @@ Claude: "You're using Context API for state management. You previously used Redu
428
428
 
429
429
  ---
430
430
 
431
+ ## Inspecting What Memory Knows (0.10.0+)
432
+
433
+ When you want to see what's actually in memory — what's been extracted, which
434
+ facts Claude has been reaching for, what's stale, what's contradicting — open
435
+ the dashboard:
436
+
437
+ ```bash
438
+ claude-memory dashboard
439
+ ```
440
+
441
+ Default address: `http://localhost:3377`. It surfaces:
442
+
443
+ - A **moments feed** — every recall, context injection, extraction event with
444
+ the facts they touched. Click any moment for the full payload.
445
+ - A **Trust sidebar** — week-over-week activity, your global "fingerprint",
446
+ utilization ratio (% of recently extracted facts Claude actually used), and
447
+ your 👍/👎 feedback ratio.
448
+ - **Conflicts** with display-layer dedup so you don't have to triage 11 rows
449
+ of the same contradiction one at a time.
450
+ - **Knowledge** — facts grouped by predicate, with a separate References
451
+ section for auto-detected reference material.
452
+
453
+ For a markdown summary you can email or commit:
454
+
455
+ ```bash
456
+ claude-memory digest --since 7
457
+ ```
458
+
459
+ For a privacy-safe cross-project audit:
460
+
461
+ ```bash
462
+ claude-memory census
463
+ ```
464
+
465
+ See **[Dashboard guide →](dashboard.md)** for the full panel reference.
466
+
467
+ ---
468
+
431
469
  ## Next Steps
432
470
 
433
- - 📖 [Read the Getting Started Guide](GETTING_STARTED.md) *(coming soon)*
434
- - 🔧 [Set up the Claude Code Plugin](PLUGIN.md)
471
+ - 📖 [Read the Getting Started Guide](GETTING_STARTED.md)
472
+ - 📊 [Inspect with the Dashboard](dashboard.md)
473
+ - 🔧 [Set up the Claude Code Plugin](plugin.md)
435
474
  - 🏗️ [Understand the Architecture](architecture.md)
436
475
  - 📝 [Check the Changelog](../CHANGELOG.md)
data/docs/GETTING_STARTED.md CHANGED
@@ -19,7 +19,7 @@ gem install claude_memory
19
19
  Verify installation:
20
20
  ```bash
21
21
  claude-memory --version
22
- # => claude_memory 0.2.0
22
+ # => claude_memory 0.10.0
23
23
  ```
24
24
 
25
25
  ### Step 2: Install the Plugin
@@ -283,13 +283,13 @@ ClaudeMemory Doctor Report
283
283
  ==========================
284
284
 
285
285
  ✓ Global database: ~/.claude/memory.sqlite3
286
- - Schema version: 6
286
+ - Schema version: 17
287
287
  - Facts: 12
288
288
  - Entities: 8
289
289
  - Status: Healthy
290
290
 
291
291
  ✓ Project database: .claude/memory.sqlite3
292
- - Schema version: 6
292
+ - Schema version: 17
293
293
  - Facts: 23
294
294
  - Entities: 15
295
295
  - Status: Healthy
@@ -314,6 +314,22 @@ ls -lh .claude/memory.sqlite3
314
314
  # => -rw-r--r-- 1 user staff 64K Jan 26 10:35 .claude/memory.sqlite3
315
315
  ```
316
316
 
317
+ ### Open the Dashboard (0.10.0+)
318
+
319
+ Once you have a few sessions worth of memory, the dashboard is the fastest
320
+ way to see what's actually in there:
321
+
322
+ ```bash
323
+ claude-memory dashboard
324
+ ```
325
+
326
+ Opens `http://localhost:3377` with a moments feed (every recall, context
327
+ injection, and extraction event), a Trust sidebar showing your global
328
+ "fingerprint" and 30-day utilization ratio, a deduped Conflicts panel, and a
329
+ Knowledge panel grouping facts by predicate.
330
+
331
+ See **[docs/dashboard.md](dashboard.md)** for the full panel guide.
332
+
317
333
  ### Test Memory Recall
318
334
 
319
335
  Have a conversation with Claude to test:
@@ -560,7 +576,8 @@ sqlite3 .claude/memory.sqlite3 "SELECT * FROM facts LIMIT 5;"
560
576
  Now that you're up and running:
561
577
 
562
578
  - 📖 Read [Examples](EXAMPLES.md) for common use cases
563
- - 🔧 Explore [Plugin Documentation](PLUGIN.md) for advanced configuration
579
+ - 📊 Open the [Dashboard](dashboard.md) for live inspection (0.10.0+)
580
+ - 🔧 Explore [Plugin Documentation](plugin.md) for advanced configuration
564
581
  - 🏗️ Review [Architecture](architecture.md) for technical details
565
582
  - 💬 Join [Discussions](https://github.com/codenamev/claude_memory/discussions) to share feedback
566
583
 
@@ -572,8 +589,18 @@ Now that you're up and running:
572
589
  | `claude-memory doctor` | Check system health |
573
590
  | `claude-memory recall <query>` | Search for facts |
574
591
  | `claude-memory promote <fact_id>` | Make fact global |
592
+ | `claude-memory reject <id_or_docid>` | Mark a fact as rejected |
575
593
  | `claude-memory changes` | Recent updates |
576
594
  | `claude-memory conflicts` | Show contradictions |
595
+ | `claude-memory dashboard` | Open the local web UI (0.10.0+) |
596
+ | `claude-memory digest --since 7` | Markdown report of the last 7 days (0.10.0+) |
597
+ | `claude-memory stats --stale` | List facts not recalled recently (0.10.0+) |
598
+ | `claude-memory stats --tools` | MCP tool-call telemetry (0.9.0+) |
599
+ | `claude-memory census` | Privacy-safe predicate audit across projects (0.10.0+) |
600
+ | `claude-memory dedupe-conflicts --dry-run` | Preview historical conflict-row dedup (0.10.0+) |
601
+ | `claude-memory reclassify-references --dry-run` | Preview reference-material retag (0.10.0+) |
602
+ | `claude-memory compact` | VACUUM databases |
603
+ | `claude-memory export` | Dump facts to JSON |
577
604
  | `/claude-memory:analyze` | Bootstrap project knowledge |
578
605
 
579
606
  ## Support
data/docs/architecture.md CHANGED
@@ -9,7 +9,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
9
9
  ```
10
10
  ┌─────────────────────────────────────────────────────────────┐
11
11
  │ Application Layer │
12
- │ CLI (Router) → Commands (20 classes) → Configuration │
12
+ │ CLI (Router) → Commands (32 classes) → Configuration │
13
13
  └──────────────────────┬──────────────────────────────────────┘
14
14
 
15
15
  ┌──────────────────────▼──────────────────────────────────────┐
@@ -27,7 +27,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
27
27
 
28
28
  ┌──────────────────────▼──────────────────────────────────────┐
29
29
  │ Infrastructure Layer │
30
- │ Store (SQLite v6 + WAL) → FileSystem → Index (FTS5+Vector)
30
+ │ Store (SQLite v17 + WAL) → FileSystem → Index (FTS5+Vector)│
31
31
  │ Templates │
32
32
  └─────────────────────────────────────────────────────────────┘
33
33
  ```
@@ -40,7 +40,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
40
40
 
41
41
  **Components:**
42
42
  - **CLI** (`cli.rb`): Thin router that dispatches to command classes
43
- - **Commands** (`commands/`): 20 command classes, each handling one CLI command
43
+ - **Commands** (`commands/`): 32 command classes, each handling one CLI command
44
44
  - **Configuration** (`configuration.rb`): Centralized ENV access and path calculation
45
45
 
46
46
  **Key Principles:**
@@ -179,7 +179,7 @@ end
179
179
  **Components:**
180
180
 
181
181
  #### Store (`store/`)
182
- - **SQLiteStore**: Direct database access via Sequel (schema v6)
182
+ - **SQLiteStore**: Direct database access via Sequel (schema v17)
183
183
  - **StoreManager**: Manages dual databases (global + project)
184
184
  - **Transaction safety**: Atomic multi-step operations
185
185
  - **WAL mode**: Write-Ahead Logging for better concurrency
@@ -201,6 +201,21 @@ end
201
201
  - Output style templates (`output-styles/memory-aware.md`)
202
202
  - Setup and configuration scaffolding
203
203
 
204
+ #### Dashboard (`dashboard/`)
205
+ - **Server**: WEBrick HTTP server (default port 3377), starts via `claude-memory dashboard`
206
+ - **API**: HTTP-shape glue + per-endpoint formatting; routes/delegates to panel classes
207
+ - **Panels** (each backed by a dedicated class with focused responsibility):
208
+ - `Trust`: weekly moments, fingerprint, utilization, feedback ratio, needs-review
209
+ - `Moments`: feed-first activity stream with kind classification
210
+ - `Knowledge`: predicate-grouped fact summary (incl. References section)
211
+ - `Conflicts`: display-layer dedup with bulk-reject helper
212
+ - `Reuse`: most-used facts within window
213
+ - `Health`: db / hooks / vec checks with actionable fix strings
214
+ - `Timeline`: 30-day daily rollup
215
+ - `FactPresenter`, `ScopedFactResolver`: shared rendering / scope-aware ID resolution
216
+ - Connections released after every request — no held WAL writer locks across page loads
217
+ - See [docs/dashboard.md](dashboard.md) for the user-facing guide
218
+
204
219
  **Key Principles:**
205
220
  - Ports and Adapters: Clear interfaces for external systems
206
221
  - Dependency Injection: Real vs. test implementations
@@ -346,10 +361,10 @@ FileSystem (write)
346
361
  - Value objects (SessionId, TranscriptPath, FactId)
347
362
  - Centralized Configuration
348
363
  - 4 domain models with business logic
349
- - 20 command classes
350
- - 19 MCP tools
364
+ - 32 command classes
365
+ - 25 MCP tools
351
366
  - Semantic search with local embeddings (FastEmbed + TF-IDF fallback)
352
- - Schema v6 with WAL mode
367
+ - Schema v17 with WAL mode
353
368
 
354
369
  ## Future Improvements
355
370
 
data/docs/audit-queries.md ADDED
@@ -0,0 +1,131 @@
1
+ # Audit Queries
2
+
3
+ Pre-written SQL for validating that the ClaudeMemory plugin is being invoked when it should. Run via [cq](https://github.com/technicalpickles/cq) — install with `cargo install --git https://github.com/technicalpickles/cq`.
4
+
5
+ These query Claude Code's raw transcripts (in `~/.claude/projects/`), not ClaudeMemory's own SQLite databases. That's deliberate: cq sees *all* tool calls including ones that bypassed the MCP server entirely, which is exactly the angle needed to spot activation gaps.
6
+
7
+ For server-side telemetry (counts, latencies of MCP calls that *did* land), use `claude-memory stats --tools` against ClaudeMemory's `mcp_tool_calls` table instead.
8
+
9
+ ## Query 1 — Memory plugin activation rate
10
+
11
+ How often is any `mcp__memory__*` tool being called, normalized by total sessions?
12
+
13
+ ```bash
14
+ cq sql "
15
+ WITH session_window AS (
16
+ SELECT DISTINCT session_id FROM messages
17
+ ),
18
+ memory_sessions AS (
19
+ SELECT DISTINCT session_id FROM tool_calls
20
+ WHERE name LIKE 'mcp__memory__%'
21
+ )
22
+ SELECT
23
+ (SELECT count(*) FROM session_window) AS total_sessions,
24
+ (SELECT count(*) FROM memory_sessions) AS sessions_with_memory_call,
25
+ ROUND(100.0 * (SELECT count(*) FROM memory_sessions)
26
+ / NULLIF((SELECT count(*) FROM session_window), 0), 1) AS pct
27
+ " --since 30d --table
28
+ ```
29
+
30
+ **Why it matters**: a low percentage doesn't mean the plugin is broken — many sessions don't need memory. It's a denominator for the next two queries.
31
+
32
+ ## Query 2 — Sessions that asked memory-shaped questions but never called memory
33
+
34
+ The most useful query. Surfaces user prompts where memory *should* have been the obvious tool, but Claude went elsewhere (Read, Grep, Bash) instead.
35
+
36
+ ```bash
37
+ cq sql "
38
+ WITH memory_sessions AS (
39
+ SELECT DISTINCT session_id FROM tool_calls
40
+ WHERE name LIKE 'mcp__memory__%'
41
+ )
42
+ SELECT
43
+ m.session_id,
44
+ m.timestamp,
45
+ left(m.text, 200) AS user_prompt
46
+ FROM messages m
47
+ LEFT JOIN memory_sessions ms ON m.session_id = ms.session_id
48
+ WHERE m.type = 'user'
49
+ AND ms.session_id IS NULL
50
+ AND (
51
+ m.text ILIKE '%why did we%'
52
+ OR m.text ILIKE '%what convention%'
53
+ OR m.text ILIKE '%how do we usually%'
54
+ OR m.text ILIKE '%what did we decide%'
55
+ OR m.text ILIKE '%architecture%'
56
+ OR m.text ILIKE '%what''s the pattern%'
57
+ )
58
+ ORDER BY m.timestamp DESC
59
+ " --since 30d --table --limit 30
60
+ ```
61
+
62
+ **What to do with results**: each row is a candidate for either (a) a tightening of MCP server instructions / skill descriptions, or (b) confirmation that the question genuinely didn't need memory and the keyword filter is too loose.
63
+
64
+ ## Query 3 — Which memory tools actually get called?
65
+
66
+ ```bash
67
+ cq sql "
68
+ SELECT
69
+ name AS tool,
70
+ count(*) AS invocations,
71
+ count(DISTINCT session_id) AS sessions
72
+ FROM tool_calls
73
+ WHERE name LIKE 'mcp__memory__%'
74
+ GROUP BY name
75
+ ORDER BY invocations DESC
76
+ " --since 30d --table
77
+ ```
78
+
79
+ **Expected shape**: `mcp__memory__recall`, `mcp__memory__conventions`, `mcp__memory__decisions` should dominate. Tools that never fire (`memory_fact_graph`, `memory_explain`, `memory_search_concepts`, `memory_facts_by_*`) might have description/triggering issues — same pattern as cq's "skill audit" use case.
80
+
81
+ ## Query 4 — Error rate per memory tool
82
+
83
+ ```bash
84
+ cq sql "
85
+ SELECT
86
+ tc.name AS tool,
87
+ count(*) AS calls,
88
+ sum(CASE WHEN tr.is_error THEN 1 ELSE 0 END) AS errors,
89
+ ROUND(100.0 * sum(CASE WHEN tr.is_error THEN 1 ELSE 0 END)
90
+ / count(*), 1) AS pct_errors
91
+ FROM tool_calls tc
92
+ JOIN tool_results tr ON tc.tool_use_id = tr.tool_use_id
93
+ WHERE tc.name LIKE 'mcp__memory__%'
94
+ GROUP BY tc.name
95
+ ORDER BY errors DESC
96
+ " --since 30d --table
97
+ ```
98
+
99
+ **Why it matters**: a memory tool returning errors is much worse than not firing — Claude sees the failure and learns to avoid that tool. Triage anything above ~5%.
100
+
101
+ ## Query 5 — Result-size distribution (context budget hygiene)
102
+
103
+ ```bash
104
+ cq sql "
105
+ SELECT
106
+ tc.name AS tool,
107
+ count(*) AS calls,
108
+ MIN(length(tr.content)) AS min_chars,
109
+ ROUND(AVG(length(tr.content))) AS avg_chars,
110
+ MAX(length(tr.content)) AS max_chars
111
+ FROM tool_calls tc
112
+ JOIN tool_results tr ON tc.tool_use_id = tr.tool_use_id
113
+ WHERE tc.name LIKE 'mcp__memory__%'
114
+ GROUP BY tc.name
115
+ ORDER BY avg_chars DESC
116
+ " --since 30d --table
117
+ ```
118
+
119
+ **Why it matters**: ClaudeMemory exposes a `compact: true` option that drops receipts for ~60% smaller responses. If averages are large, either the compact flag isn't being passed by callers or the tools that don't accept it are dumping too much.
120
+
121
+ ## When to re-run
122
+
123
+ - Before each release — does the new version improve activation rate or reduce errors?
124
+ - After meaningful changes to MCP server instructions / skill descriptions
125
+ - If a user reports "the memory plugin doesn't seem to do anything" — Query 2 will usually surface the gap concretely
126
+
127
+ ## Related
128
+
129
+ - Source for the methodology: `docs/influence/cq.md`
130
+ - Server-side telemetry alternative: `claude-memory stats --tools --since 30`
131
+ - cq schema reference: `cq schema --examples`