RubyGems - claude_memory - Versions diffs - 0.9.0 → 0.10.0 - Mend

claude_memory 0.9.0 → 0.10.0

Files changed (76)

checksums.yaml +4 -4
data/.claude/memory.sqlite3 +0 -0
data/.claude/rules/claude_memory.generated.md +63 -1
data/.claude/skills/dashboard/SKILL.md +42 -0
data/.claude/skills/release/SKILL.md +168 -0
data/.claude-plugin/marketplace.json +1 -1
data/.claude-plugin/plugin.json +1 -1
data/CHANGELOG.md +92 -0
data/CLAUDE.md +21 -5
data/README.md +32 -2
data/db/migrations/015_add_activity_events.rb +26 -0
data/db/migrations/016_add_moment_feedback.rb +22 -0
data/db/migrations/017_add_last_recalled_at.rb +15 -0
data/docs/1_0_punchlist.md +190 -0
data/docs/EXAMPLES.md +41 -2
data/docs/GETTING_STARTED.md +31 -4
data/docs/architecture.md +22 -7
data/docs/audit-queries.md +131 -0
data/docs/dashboard.md +172 -0
data/docs/improvements.md +465 -9
data/docs/influence/cq.md +187 -0
data/docs/plugin.md +13 -6
data/docs/quality_review.md +489 -172
data/docs/reflection_memory_as_accumulating_judgment.md +67 -0
data/lib/claude_memory/activity_log.rb +86 -0
data/lib/claude_memory/commands/census_command.rb +210 -0
data/lib/claude_memory/commands/completion_command.rb +3 -0
data/lib/claude_memory/commands/dashboard_command.rb +54 -0
data/lib/claude_memory/commands/dedupe_conflicts_command.rb +55 -0
data/lib/claude_memory/commands/digest_command.rb +181 -0
data/lib/claude_memory/commands/hook_command.rb +34 -0
data/lib/claude_memory/commands/reclassify_references_command.rb +56 -0
data/lib/claude_memory/commands/registry.rb +6 -1
data/lib/claude_memory/commands/skills/distill-transcripts.md +13 -1
data/lib/claude_memory/commands/stats_command.rb +38 -1
data/lib/claude_memory/commands/sweep_command.rb +2 -0
data/lib/claude_memory/configuration.rb +16 -0
data/lib/claude_memory/core/relative_time.rb +9 -0
data/lib/claude_memory/dashboard/api.rb +610 -0
data/lib/claude_memory/dashboard/conflicts.rb +279 -0
data/lib/claude_memory/dashboard/efficacy.rb +127 -0
data/lib/claude_memory/dashboard/fact_presenter.rb +109 -0
data/lib/claude_memory/dashboard/health.rb +175 -0
data/lib/claude_memory/dashboard/index.html +2707 -0
data/lib/claude_memory/dashboard/knowledge.rb +136 -0
data/lib/claude_memory/dashboard/moments.rb +244 -0
data/lib/claude_memory/dashboard/reuse.rb +97 -0
data/lib/claude_memory/dashboard/scoped_fact_resolver.rb +95 -0
data/lib/claude_memory/dashboard/server.rb +211 -0
data/lib/claude_memory/dashboard/timeline.rb +68 -0
data/lib/claude_memory/dashboard/trust.rb +285 -0
data/lib/claude_memory/distill/reference_material_detector.rb +78 -0
data/lib/claude_memory/hook/auto_memory_mirror.rb +112 -0
data/lib/claude_memory/hook/context_injector.rb +97 -3
data/lib/claude_memory/hook/handler.rb +50 -3
data/lib/claude_memory/mcp/handlers/management_handlers.rb +8 -0
data/lib/claude_memory/mcp/query_guide.rb +11 -0
data/lib/claude_memory/mcp/server.rb +8 -2
data/lib/claude_memory/mcp/text_summary.rb +29 -0
data/lib/claude_memory/mcp/tool_definitions.rb +13 -0
data/lib/claude_memory/mcp/tools.rb +148 -0
data/lib/claude_memory/publish.rb +13 -21
data/lib/claude_memory/recall/stale_detector.rb +67 -0
data/lib/claude_memory/resolve/predicate_policy.rb +2 -0
data/lib/claude_memory/resolve/resolver.rb +41 -11
data/lib/claude_memory/store/llm_cache.rb +68 -0
data/lib/claude_memory/store/metrics_aggregator.rb +96 -0
data/lib/claude_memory/store/schema_manager.rb +1 -1
data/lib/claude_memory/store/sqlite_store.rb +47 -143
data/lib/claude_memory/store/store_manager.rb +29 -0
data/lib/claude_memory/sweep/maintenance.rb +216 -0
data/lib/claude_memory/sweep/recall_timestamp_refresher.rb +83 -0
data/lib/claude_memory/sweep/sweeper.rb +2 -0
data/lib/claude_memory/version.rb +1 -1
data/lib/claude_memory.rb +22 -0
metadata +50 -1

data/docs/1_0_punchlist.md ADDED

@@ -0,0 +1,190 @@
+# 1.0 Punchlist
+*Created: 2026-04-28*
+The remaining work for a stable 1.0 release. Distinct from `improvements.md` —
+that file tracks the long tail of inbound study/idea entries; this file tracks
+**what blocks 1.0 confidence**.
+Guiding question: *a skeptical Ruby developer should be able to look at one
+screen and say "yes, this is helping, here's the evidence" without trusting our
+marketing.* Today the dashboard tells that story in pieces but not as a
+headline. Each item below closes a specific gap that prevents that headline
+from existing.
+Items are cross-linked to the canonical entry in `improvements.md` where the
+implementation detail and acceptance criteria live. This file is the
+prioritization view; that file is the work view.
+---
+## Must-have for 1.0
+### 1. Token budget telemetry — *what does memory cost?*
+**Gap.** `Core::TokenEstimator` exists and is unused outside one helper. We
+have no idea what % of the SessionStart token budget memory consumes per
+session, how it scales with DB size, or whether it's growing.
+**Acceptance.** Trust panel + `claude-memory digest` show p50/p95 injected
+tokens per session over the last 30 days. Per-session count rides on every
+`hook_context` activity event so the data is queryable post-hoc.
+**Why must-have.** "Costs you tokens forever" is the strongest critique of any
+context-injection memory system; if we can't answer it numerically, we can't
+defend the trade.
+→ improvements.md entry: *Token Budget Telemetry*
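
The p50/p95 acceptance criterion in item 1 could be computed along these lines. This is a minimal sketch and not code from the gem: the `percentile` and `token_budget_summary` helpers and the sample `token_counts` array are hypothetical, and the nearest-rank method is only one of several reasonable percentile definitions. In practice the counts would presumably come from the `hook_context` activity events the punchlist mentions, whose schema is not shown in this diff.

```ruby
# Nearest-rank percentile over per-session injected-token counts.
# Assumption: `values` is a plain array of integers, one per session.
def percentile(values, pct)
  raise ArgumentError, "values must not be empty" if values.empty?

  sorted = values.sort
  # Nearest-rank: the smallest value with at least pct% of the data at or below it.
  rank = (pct / 100.0 * sorted.length).ceil
  sorted[[rank, 1].max - 1]
end

# Hypothetical summary shape for a Trust panel or digest line.
def token_budget_summary(token_counts)
  { p50: percentile(token_counts, 50), p95: percentile(token_counts, 95) }
end

# Made-up sample data for illustration only.
token_counts = [120, 340, 95, 410, 280, 150, 990, 305, 220, 180]
summary = token_budget_summary(token_counts)
puts "p50=#{summary[:p50]} p95=#{summary[:p95]}"
```

Reporting both percentiles rather than a mean keeps a single oversized session (990 in the sample) from hiding the typical cost.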
+### 2. Hallucination rate as a first-class trust metric
+**Gap.** `ReferenceMaterialDetector` already classifies suspect facts and we
+know from the #34 audit that ~25% of facts had embedded reasoning (i.e.
+~75% were bare conclusions at audit time). Neither signal is exposed on the
+dashboard. We display clean numbers; we should display stained ones.
+**Acceptance.** Trust panel surfaces a `quality_score` derived from
+suspect-fact ratio + bare-conclusion ratio over active facts in both stores.
+Digest includes a 30-day rejection rate ("how much of what we extracted got
+rejected within a week?") so calibration drift is visible.
+**Why must-have.** We can't claim "memory is helping" if we can't show "memory
+isn't poisoning the well."
+→ improvements.md entry: *Hallucination Rate Metric*
+### 3. Negative-fact harm benchmark
+**Gap.** Every benchmark we run today measures whether memory **helps**.
+Nothing measures whether memory **harms** — i.e. injects a wrong fact and
+Claude follows it. Without this, "memory helps" is unfalsifiable.
+**Acceptance.** New `spec/benchmarks/dataset/harm_scenarios.yml` with 10–15
+cases where memory holds a stale or wrong fact. Each case scores `harm` if
+Claude's response follows the wrong fact, `safe` otherwise. Wired into
+`bin/run-evals`. >1% harm rate blocks release.
+**Why must-have.** A retrieval system that occasionally makes Claude *wrong*
+is strictly worse than no memory; we need a release gate that proves we're
+not in that regime.
+→ improvements.md entry: *Negative-Fact Harm Benchmark*
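
The release gate described in item 3 can be illustrated with a short sketch. It assumes each scenario has already been scored as `:harm` or `:safe`; the `harm_rate` and `release_blocked?` helpers and the sample verdicts are hypothetical and do not reflect how `bin/run-evals` is actually wired.

```ruby
# Threshold taken from the punchlist text: more than 1% harm blocks release.
HARM_THRESHOLD = 0.01

# Fraction of scenarios where the response followed the wrong fact.
def harm_rate(verdicts)
  return 0.0 if verdicts.empty?

  verdicts.count(:harm).fdiv(verdicts.length)
end

# True when the measured harm rate exceeds the allowed threshold.
def release_blocked?(verdicts, threshold: HARM_THRESHOLD)
  harm_rate(verdicts) > threshold
end

# Made-up run: 12 scenarios, one of which was scored as harm.
verdicts = [:safe] * 11 + [:harm]
puts format("harm rate: %.1f%%", harm_rate(verdicts) * 100)
puts "release blocked: #{release_blocked?(verdicts)}"
```

With only 10 to 15 scenarios, a single `harm` verdict already exceeds 1% (1 in 15 is about 6.7%), so at that dataset size the gate is effectively zero-tolerance.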
+### 4. Publish the CLAUDE.md baseline in headline E2E results
+**Gap.** `claude_md_adapter` exists in `spec/benchmarks/comparative/adapters/`
+and supports E2E. The adapter is wired into `comparative_helper.rb` but the
+README's headline comparative table doesn't include it. The single most
+important question for adoption — *"is this better than a hand-written
+CLAUDE.md?"* — is currently unanswered in our published numbers.
+**Acceptance.** Comparative E2E report includes `CLAUDE.md baseline` row in
+`spec/benchmarks/README.md` and in `bin/run-evals --comparative` summary
+output. README explicitly states the win/loss versus the static baseline.
+**Why must-have.** Cheapest item on the list — adapter already built, just
+surface the number. If we can't beat a static CLAUDE.md on developer
+scenarios, that's the loudest possible signal that the rest of the system
+needs work; if we can, that's the headline 1.0 brag.
+→ improvements.md entry: *CLAUDE.md Baseline in Headline Results*
+### 5. `claude-memory show` — human-readable "what would be injected"
+**Gap.** Inspecting memory state today requires the dashboard or several CLI
+commands (`recall`, `stats`, `census`). The CLAUDE.md alternative is
+`cat CLAUDE.md` — instant, plain-English, no tool. We need the same one-line
+inspect surface.
+**Acceptance.** `claude-memory show` runs the same `Hook::ContextInjector`
+path real sessions use, prints what would be injected next session in plain
+English (not JSON), sized to fit a terminal, with predicate-grouped sections
+matching the snapshot format.
+**Why must-have.** Trust requires inspectability. A user who can't see what
+memory will inject can't develop confidence in it.
+→ improvements.md entry: *claude-memory show*
+### 6. Release-to-release benchmark scoreboard
+**Gap.** Benchmark output is textual today. Nothing diff-able across versions.
+Regressions land silently — the only reason we caught the FTS5/RRF
+normalization bug was a manual run.
+**Acceptance.** Each `bin/run-evals` run writes
+`spec/benchmarks/results/<version>.json`. New `bin/bench-diff` (or rake task)
+compares against the last tagged version's JSON and reports deltas. Release
+script (`/release` skill) reads it and refuses to ship on regressions over a
+configurable threshold.
+**Why must-have.** Without longitudinal tracking, every benchmark we run is a
+snapshot. 1.0 is the moment we commit to *not regressing* what we ship.
+→ improvements.md entry: *Benchmark Scoreboard Diff*
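
A version-to-version comparison like the one item 6 proposes might look roughly like this. The sketch assumes each results file is a flat JSON object mapping a metric name to a number where higher is better; that layout, the metric names, and the `bench_deltas` and `regressions` helpers are all assumptions, since the diff does not show the results format or `bin/bench-diff` itself.

```ruby
require "json"

# Delta per metric, restricted to metrics present in both result sets.
def bench_deltas(previous, current)
  (previous.keys & current.keys).to_h do |metric|
    [metric, (current[metric] - previous[metric]).round(4)]
  end
end

# Metrics that dropped by more than the configured threshold.
def regressions(deltas, threshold:)
  deltas.select { |_metric, delta| delta < -threshold }
end

# Inline JSON stands in for two spec/benchmarks/results/<version>.json files.
previous = JSON.parse('{"recall_at_5": 0.82, "mrr": 0.71}')
current  = JSON.parse('{"recall_at_5": 0.84, "mrr": 0.64}')

deltas = bench_deltas(previous, current)
failed = regressions(deltas, threshold: 0.05)

deltas.each { |metric, delta| puts format("%-12s %+.4f", metric, delta) }
puts failed.empty? ? "no regressions" : "regressed: #{failed.keys.join(', ')}"
```

Comparing only the metrics both files share means a newly added benchmark does not register as a regression on its first release.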
+---
+## Strong post-1.0
+These shouldn't block 1.0 but should land in the next release window.
+### 7. First-week ROI nudge
+SessionEnd hook prints `memory contributed N facts this session, %used = X`
+inline for the first ~10 sessions. Closes the cold-start gap where new users
+don't see value because they don't think to look.
+→ improvements.md entry: *First-Week ROI Nudge*
+### 8. Real-session repeat-correction detector
+The repeat-correction benchmark (#32) is synthetic; production has no
+equivalent signal. Analyze `activity_events` to detect "this fact was injected
+last session, the user re-stated it this session" — that's where memory is
+silently failing.
+→ improvements.md entry: *Real-Session Repeat-Correction Detection*
+### 9. Token-cost growth tracking
+Builds on #1. Weekly digest reports "context cost grew X% over 30d" as an
+anomaly signal that the DB is bloating or context injection is going wide.
+→ improvements.md entry: *Token-Cost Growth Tracking*
+### 10. Drift dashboard
+Snapshot `census` weekly, surface predicate distribution shifts on the
+dashboard. Answers "is my fact base going off?" without a manual audit.
+→ improvements.md entry: *Drift Dashboard*
+---
+## Defer / skip for 1.0
+- **#44 Universal search box** — cosmetic given the gaps above. Knowledge tab
+  drawers cover the primary need.
+- **#45 Live SSE/WebSocket feed** — polling is adequate; dashboard polish, not
+  a confidence gap.
+---
+## Sequencing recommendation
+Smallest set that materially shifts 1.0 confidence (~2 days):
+1. **Token budget telemetry** (#1) — closes the loudest critique.
+2. **CLAUDE.md baseline publish** (#4) — adapter already built, one report change.
+3. **Hallucination rate** (#2) — reuses ReferenceMaterialDetector.
+Then in roughly priority order: `claude-memory show` (#5), harm benchmark
+(#3), scoreboard (#6). Post-1.0 items follow naturally once the must-haves
+land.
+---
+*Last updated: 2026-04-28 — initial punchlist drawn from session-end critique
+of observability/outcome gaps. Each entry will be elaborated with concrete
+file:line refs in improvements.md as it's worked.*

data/docs/EXAMPLES.md CHANGED

@@ -428,9 +428,48 @@ Claude: "You're using Context API for state management. You previously used Redu
 ---
+## Inspecting What Memory Knows (0.10.0+)
+When you want to see what's actually in memory — what's been extracted, which
+facts Claude has been reaching for, what's stale, what's contradicting — open
+the dashboard:
+```bash
+claude-memory dashboard
+```
+Default address: `http://localhost:3377`. Surfaces:
+- A **moments feed** — every recall, context injection, extraction event with
+  the facts they touched. Click any moment for the full payload.
+- A **Trust sidebar** — week-over-week activity, your global "fingerprint",
+  utilization ratio (% of recently extracted facts Claude actually used), and
+  your 👍/👎 feedback ratio.
+- **Conflicts** with display-layer dedup so you don't have to triage 11 rows
+  of the same contradiction one at a time.
+- **Knowledge** — facts grouped by predicate, with a separate References
+  section for auto-detected reference material.
+For a markdown summary you can email or commit:
+```bash
+claude-memory digest --since 7
+```
+For a privacy-safe cross-project audit:
+```bash
+claude-memory census
+```
+See **[Dashboard guide →](dashboard.md)** for the full panel reference.
+---
 ## Next Steps
-- 📖 [Read the Getting Started Guide](GETTING_STARTED.md) *(coming soon)*
-- 🔧 [Set up the Claude Code Plugin](PLUGIN.md)
+- 📖 [Read the Getting Started Guide](GETTING_STARTED.md)
+- 📊 [Inspect with the Dashboard](dashboard.md)
+- 🔧 [Set up the Claude Code Plugin](plugin.md)
 - 🏗️ [Understand the Architecture](architecture.md)
 - 📝 [Check the Changelog](../CHANGELOG.md)

data/docs/GETTING_STARTED.md CHANGED

@@ -19,7 +19,7 @@ gem install claude_memory
 Verify installation:
 ```bash
 claude-memory --version
-# => claude_memory 0.2.0
+# => claude_memory 0.10.0
 ```
 ### Step 2: Install the Plugin
@@ -283,13 +283,13 @@ ClaudeMemory Doctor Report
 ==========================
 ✓ Global database: ~/.claude/memory.sqlite3
-  - Schema version: 6
+  - Schema version: 17
   - Facts: 12
   - Entities: 8
   - Status: Healthy
 ✓ Project database: .claude/memory.sqlite3
-  - Schema version: 6
+  - Schema version: 17
   - Facts: 23
   - Entities: 15
   - Status: Healthy
@@ -314,6 +314,22 @@ ls -lh .claude/memory.sqlite3
 # => -rw-r--r-- 1 user staff 64K Jan 26 10:35 .claude/memory.sqlite3
 ```
+### Open the Dashboard (0.10.0+)
+Once you have a few sessions' worth of memory, the dashboard is the fastest
+way to see what's actually in there:
+```bash
+claude-memory dashboard
+```
+Opens `http://localhost:3377` with a moments feed (every recall, context
+injection, and extraction event), a Trust sidebar showing your global
+"fingerprint" and 30-day utilization ratio, a deduped Conflicts panel, and a
+Knowledge panel grouping facts by predicate.
+See **[docs/dashboard.md](dashboard.md)** for the full panel guide.
 ### Test Memory Recall
 Have a conversation with Claude to test:
@@ -560,7 +576,8 @@ sqlite3 .claude/memory.sqlite3 "SELECT * FROM facts LIMIT 5;"
 Now that you're up and running:
 - 📖 Read [Examples](EXAMPLES.md) for common use cases
-- 🔧 Explore [Plugin Documentation](PLUGIN.md) for advanced configuration
+- 📊 Open the [Dashboard](dashboard.md) for live inspection (0.10.0+)
+- 🔧 Explore [Plugin Documentation](plugin.md) for advanced configuration
 - 🏗️ Review [Architecture](architecture.md) for technical details
 - 💬 Join [Discussions](https://github.com/codenamev/claude_memory/discussions) to share feedback
@@ -572,8 +589,18 @@ Now that you're up and running:
 | `claude-memory doctor` | Check system health |
 | `claude-memory recall <query>` | Search for facts |
 | `claude-memory promote <fact_id>` | Make fact global |
+| `claude-memory reject <id_or_docid>` | Mark a fact as rejected |
 | `claude-memory changes` | Recent updates |
 | `claude-memory conflicts` | Show contradictions |
+| `claude-memory dashboard` | Open the local web UI (0.10.0+) |
+| `claude-memory digest --since 7` | Markdown report of the last 7 days (0.10.0+) |
+| `claude-memory stats --stale` | List facts not recalled recently (0.10.0+) |
+| `claude-memory stats --tools` | MCP tool-call telemetry (0.9.0+) |
+| `claude-memory census` | Privacy-safe predicate audit across projects (0.10.0+) |
+| `claude-memory dedupe-conflicts --dry-run` | Preview historical conflict-row dedup (0.10.0+) |
+| `claude-memory reclassify-references --dry-run` | Preview reference-material retag (0.10.0+) |
+| `claude-memory compact` | VACUUM databases |
+| `claude-memory export` | Dump facts to JSON |
 | `/claude-memory:analyze` | Bootstrap project knowledge |
 ## Support

data/docs/architecture.md CHANGED

@@ -9,7 +9,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
 ```
 ┌─────────────────────────────────────────────────────────────┐
 │                    Application Layer                         │
-│  CLI (Router) → Commands (20 classes) → Configuration       │
+│  CLI (Router) → Commands (32 classes) → Configuration       │
 └──────────────────────┬──────────────────────────────────────┘
                        │
 ┌──────────────────────▼──────────────────────────────────────┐
@@ -27,7 +27,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
                        │
 ┌──────────────────────▼──────────────────────────────────────┐
 │                 Infrastructure Layer                         │
-│  Store (SQLite v6 + WAL) → FileSystem → Index (FTS5+Vector) │
+│  Store (SQLite v17 + WAL) → FileSystem → Index (FTS5+Vector)│
 │  Templates                                                   │
 └─────────────────────────────────────────────────────────────┘
 ```
@@ -40,7 +40,7 @@ ClaudeMemory is architected using Domain-Driven Design (DDD) principles with cle
 **Components:**
 - **CLI** (`cli.rb`): Thin router that dispatches to command classes
-- **Commands** (`commands/`): 20 command classes, each handling one CLI command
+- **Commands** (`commands/`): 32 command classes, each handling one CLI command
 - **Configuration** (`configuration.rb`): Centralized ENV access and path calculation
 **Key Principles:**
@@ -179,7 +179,7 @@ end
 **Components:**
 #### Store (`store/`)
-- **SQLiteStore**: Direct database access via Sequel (schema v6)
+- **SQLiteStore**: Direct database access via Sequel (schema v17)
 - **StoreManager**: Manages dual databases (global + project)
 - **Transaction safety**: Atomic multi-step operations
 - **WAL mode**: Write-Ahead Logging for better concurrency
@@ -201,6 +201,21 @@ end
 - Output style templates (`output-styles/memory-aware.md`)
 - Setup and configuration scaffolding
+#### Dashboard (`dashboard/`)
+- **Server**: WEBrick HTTP server (default port 3377), starts via `claude-memory dashboard`
+- **API**: HTTP-shape glue + per-endpoint formatting; routes/delegates to panel classes
+- **Panels** (each backed by a dedicated class with focused responsibility):
+  - `Trust`: weekly moments, fingerprint, utilization, feedback ratio, needs-review
+  - `Moments`: feed-first activity stream with kind classification
+  - `Knowledge`: predicate-grouped fact summary (incl. References section)
+  - `Conflicts`: display-layer dedup with bulk-reject helper
+  - `Reuse`: most-used facts within window
+  - `Health`: db / hooks / vec checks with actionable fix strings
+  - `Timeline`: 30-day daily rollup
+  - `FactPresenter`, `ScopedFactResolver`: shared rendering / scope-aware ID resolution
+- Connections released after every request — no held WAL writer locks across page loads
+- See [docs/dashboard.md](dashboard.md) for the user-facing guide
 **Key Principles:**
 - Ports and Adapters: Clear interfaces for external systems
 - Dependency Injection: Real vs. test implementations
@@ -346,10 +361,10 @@ FileSystem (write)
 - Value objects (SessionId, TranscriptPath, FactId)
 - Centralized Configuration
 - 4 domain models with business logic
-- 20 command classes
-- 19 MCP tools
+- 32 command classes
+- 25 MCP tools
 - Semantic search with local embeddings (FastEmbed + TF-IDF fallback)
-- Schema v6 with WAL mode
+- Schema v17 with WAL mode
 ## Future Improvements

data/docs/audit-queries.md ADDED

@@ -0,0 +1,131 @@
+# Audit Queries
+Pre-written SQL for validating that the ClaudeMemory plugin is being invoked when it should. Run via [cq](https://github.com/technicalpickles/cq) — install with `cargo install --git https://github.com/technicalpickles/cq`.
+These query Claude Code's raw transcripts (in `~/.claude/projects/`), not ClaudeMemory's own SQLite databases. That's deliberate: cq sees *all* tool calls including ones that bypassed the MCP server entirely, which is exactly the angle needed to spot activation gaps.
+For server-side telemetry (counts, latencies of MCP calls that *did* land), use `claude-memory stats --tools` against ClaudeMemory's `mcp_tool_calls` table instead.
+## Query 1 — Memory plugin activation rate
+How often is any `mcp__memory__*` tool being called, normalized by total sessions?
+```bash
+cq sql "
+WITH session_window AS (
+  SELECT DISTINCT session_id FROM messages
+),
+memory_sessions AS (
+  SELECT DISTINCT session_id FROM tool_calls
+  WHERE name LIKE 'mcp__memory__%'
+)
+SELECT
+  (SELECT count(*) FROM session_window) AS total_sessions,
+  (SELECT count(*) FROM memory_sessions) AS sessions_with_memory_call,
+  ROUND(100.0 * (SELECT count(*) FROM memory_sessions)
+        / NULLIF((SELECT count(*) FROM session_window), 0), 1) AS pct
+" --since 30d --table
+```
+**Why it matters**: a low percentage doesn't mean the plugin is broken — many sessions don't need memory. It's a denominator for the next two queries.
+## Query 2 — Sessions that asked memory-shaped questions but never called memory
+The most useful query. Surfaces user prompts where memory *should* have been the obvious tool, but Claude went elsewhere (Read, Grep, Bash) instead.
+```bash
+cq sql "
+WITH memory_sessions AS (
+  SELECT DISTINCT session_id FROM tool_calls
+  WHERE name LIKE 'mcp__memory__%'
+)
+SELECT
+  m.session_id,
+  m.timestamp,
+  left(m.text, 200) AS user_prompt
+FROM messages m
+LEFT JOIN memory_sessions ms ON m.session_id = ms.session_id
+WHERE m.type = 'user'
+  AND ms.session_id IS NULL
+  AND (
+    m.text ILIKE '%why did we%'
+    OR m.text ILIKE '%what convention%'
+    OR m.text ILIKE '%how do we usually%'
+    OR m.text ILIKE '%what did we decide%'
+    OR m.text ILIKE '%architecture%'
+    OR m.text ILIKE '%what''s the pattern%'
+  )
+ORDER BY m.timestamp DESC
+" --since 30d --table --limit 30
+```
+**What to do with results**: each row is a candidate for either (a) a tightening of MCP server instructions / skill descriptions, or (b) confirmation that the question genuinely didn't need memory and the keyword filter is too loose.
+## Query 3 — Which memory tools actually get called?
+```bash
+cq sql "
+SELECT
+  name AS tool,
+  count(*) AS invocations,
+  count(DISTINCT session_id) AS sessions
+FROM tool_calls
+WHERE name LIKE 'mcp__memory__%'
+GROUP BY name
+ORDER BY invocations DESC
+" --since 30d --table
+```
+**Expected shape**: `mcp__memory__recall`, `mcp__memory__conventions`, `mcp__memory__decisions` should dominate. Tools that never fire (`memory_fact_graph`, `memory_explain`, `memory_search_concepts`, `memory_facts_by_*`) might have description/triggering issues — same pattern as cq's "skill audit" use case.
+## Query 4 — Error rate per memory tool
+```bash
+cq sql "
+SELECT
+  tc.name AS tool,
+  count(*) AS calls,
+  sum(CASE WHEN tr.is_error THEN 1 ELSE 0 END) AS errors,
+  ROUND(100.0 * sum(CASE WHEN tr.is_error THEN 1 ELSE 0 END)
+        / count(*), 1) AS pct_errors
+FROM tool_calls tc
+JOIN tool_results tr ON tc.tool_use_id = tr.tool_use_id
+WHERE tc.name LIKE 'mcp__memory__%'
+GROUP BY tc.name
+ORDER BY errors DESC
+" --since 30d --table
+```
+**Why it matters**: a memory tool returning errors is much worse than not firing — Claude sees the failure and learns to avoid that tool. Triage anything above ~5%.
+## Query 5 — Result-size distribution (context budget hygiene)
+```bash
+cq sql "
+SELECT
+  tc.name AS tool,
+  count(*) AS calls,
+  MIN(length(tr.content)) AS min_chars,
+  ROUND(AVG(length(tr.content))) AS avg_chars,
+  MAX(length(tr.content)) AS max_chars
+FROM tool_calls tc
+JOIN tool_results tr ON tc.tool_use_id = tr.tool_use_id
+WHERE tc.name LIKE 'mcp__memory__%'
+GROUP BY tc.name
+ORDER BY avg_chars DESC
+" --since 30d --table
+```
+**Why it matters**: ClaudeMemory exposes a `compact: true` option that drops receipts for ~60% smaller responses. If averages are large, either the compact flag isn't being passed by callers or the tools that don't accept it are dumping too much.
+## When to re-run
+- Before each release — does the new version improve activation rate or reduce errors?
+- After meaningful changes to MCP server instructions / skill descriptions
+- If a user reports "the memory plugin doesn't seem to do anything" — Query 2 will usually surface the gap concretely
+## Related
+- Source for the methodology: `docs/influence/cq.md`
+- Server-side telemetry alternative: `claude-memory stats --tools --since 30`
+- cq schema reference: `cq schema --examples`