RubyGems - claude_memory - Versions diffs - 0.8.0 → 0.9.0 - Mend

claude_memory 0.8.0 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (64)

checksums.yaml +4 -4
data/.claude/memory.sqlite3 +0 -0
data/.claude/rules/claude_memory.generated.md +32 -2
data/.claude/settings.json +30 -52
data/.claude/settings.local.json +3 -1
data/.claude/skills/upgrade-dependencies/SKILL.md +154 -0
data/.claude-plugin/marketplace.json +2 -2
data/.claude-plugin/plugin.json +3 -3
data/.claude-plugin/scripts/hook-runner.sh +14 -0
data/.claude-plugin/scripts/serve-mcp.sh +14 -0
data/.ruby-version +1 -1
data/CHANGELOG.md +41 -0
data/CLAUDE.md +31 -17
data/README.md +35 -0
data/db/migrations/013_add_mcp_tool_calls.rb +26 -0
data/db/migrations/014_canonicalize_predicates.rb +30 -0
data/docs/improvements.md +58 -20
data/docs/influence/claude-mem.md +1 -0
data/docs/influence/claude-supermemory.md +1 -0
data/docs/influence/episodic-memory.md +1 -0
data/docs/influence/grepai.md +1 -0
data/docs/influence/kbs.md +1 -0
data/docs/influence/lossless-claw.md +1 -0
data/docs/influence/qmd.md +1 -0
data/lib/claude_memory/commands/completion_command.rb +1 -31
data/lib/claude_memory/commands/embeddings_command.rb +198 -0
data/lib/claude_memory/commands/help_command.rb +8 -1
data/lib/claude_memory/commands/registry.rb +47 -34
data/lib/claude_memory/commands/reject_command.rb +62 -0
data/lib/claude_memory/commands/restore_command.rb +77 -0
data/lib/claude_memory/commands/skills/distill-transcripts.md +5 -1
data/lib/claude_memory/commands/stats_command.rb +98 -2
data/lib/claude_memory/configuration.rb +14 -1
data/lib/claude_memory/distill/json_schema.md +8 -4
data/lib/claude_memory/distill/null_distiller.rb +2 -0
data/lib/claude_memory/domain/entity.rb +13 -1
data/lib/claude_memory/domain/fact.rb +26 -2
data/lib/claude_memory/embeddings/api_adapter.rb +5 -4
data/lib/claude_memory/embeddings/fastembed_adapter.rb +43 -13
data/lib/claude_memory/embeddings/inspector.rb +91 -0
data/lib/claude_memory/embeddings/model_registry.rb +210 -0
data/lib/claude_memory/embeddings/resolver.rb +32 -6
data/lib/claude_memory/ingest/ingester.rb +17 -0
data/lib/claude_memory/mcp/handlers/management_handlers.rb +24 -0
data/lib/claude_memory/mcp/handlers/stats_handlers.rb +5 -2
data/lib/claude_memory/mcp/instructions_builder.rb +17 -0
data/lib/claude_memory/mcp/server.rb +22 -1
data/lib/claude_memory/mcp/telemetry.rb +86 -0
data/lib/claude_memory/mcp/tool_definitions.rb +86 -3
data/lib/claude_memory/mcp/tools.rb +10 -0
data/lib/claude_memory/publish.rb +40 -5
data/lib/claude_memory/recall.rb +81 -0
data/lib/claude_memory/resolve/predicate_policy.rb +63 -3
data/lib/claude_memory/resolve/resolver.rb +43 -0
data/lib/claude_memory/store/schema_manager.rb +1 -1
data/lib/claude_memory/store/sqlite_store.rb +250 -1
data/lib/claude_memory/store/store_manager.rb +50 -1
data/lib/claude_memory/sweep/maintenance.rb +115 -1
data/lib/claude_memory/sweep/sweeper.rb +3 -0
data/lib/claude_memory/version.rb +1 -1
data/lib/claude_memory.rb +5 -0
metadata +26 -8
data/.claude/memory.sqlite3-shm +0 -0
data/.claude/memory.sqlite3-wal +0 -0

data/README.md CHANGED

@@ -83,6 +83,41 @@ Claude: "Based on my memory, you're using Rails with PostgreSQL..."
 👉 **[See Getting Started Guide →](docs/GETTING_STARTED.md)**
 👉 **[View Example Conversations →](docs/EXAMPLES.md)**
+## Why It Matters — Real A/B Test Results
+We tested identical prompts with and without ClaudeMemory to measure the actual impact. Here's what we found:
+### Architecture Recall Without File Traversal
+> **Prompt:** "Explain the conflict detection and resolution system. Answer from knowledge only — do not read any files."
+| | Without Memory | With Memory |
+|---|---|---|
+| **Response** | 16 lines: "I don't know this codebase — let me read the files" | 76 lines: correct 4-role PredicatePolicy explanation, resolution pipeline, specific examples |
+| **Outcome** | Honest refusal — zero architectural understanding | Deep understanding without touching the filesystem |
+### Correct File Paths vs Hallucinated Guesses
+> **Prompt:** "I want to add a new predicate. Walk me through every file I need to update."
+| | Without Memory | With Memory |
+|---|---|---|
+| **Response** | 6 steps targeting 3 **non-existent files** (`predicate.rb`, `predicate_synonyms.rb`, `json_schema.rb`) | 8 steps, all targeting **real files** with correct paths |
+| **Outcome** | Plausible but wrong — would waste developer time | Actionable, correct, references actual commits |
+### Cross-Project Preferences
+> **Prompt:** "What are my standard development environment preferences across all my projects?"
+| | Without Memory | With Memory |
+|---|---|---|
+| **Response** | "I don't have stored knowledge of your preferences" | Lists 7 real preferences, including iTerm2, tmux, VS Code, PostgreSQL, Redis, and Docker |
+| **Outcome** | Blank slate every session | Personalized from day one |
+### When Memory Doesn't Help
+File-searchable questions ("what version is this?") and one-shot code generation without explicit recall don't benefit — `grep` is equally effective. Memory shines when the answer **isn't in any single file**: architecture spanning dozens of classes, conventions from past sessions, decisions with rationale, and user preferences.
 ## How It Works
 1. **You chat with Claude** - Tell it about your project

data/db/migrations/013_add_mcp_tool_calls.rb ADDED

@@ -0,0 +1,26 @@
+# frozen_string_literal: true
+# Migration v13: Add mcp_tool_calls telemetry table
+# Records every MCP server tool invocation for usage stats and ROI tracking.
+# Distinct from `tool_calls` (v3), which stores Claude Code tool observations
+# extracted from transcripts.
+Sequel.migration do
+  up do
+    create_table?(:mcp_tool_calls) do
+      primary_key :id
+      String :tool_name, null: false
+      String :called_at, null: false
+      Integer :duration_ms, null: false
+      Integer :result_count
+      String :scope
+      String :error_class
+    end
+    run "CREATE INDEX IF NOT EXISTS idx_mcp_tool_calls_name_time ON mcp_tool_calls(tool_name, called_at)"
+    run "CREATE INDEX IF NOT EXISTS idx_mcp_tool_calls_called_at ON mcp_tool_calls(called_at)"
+  end
+  down do
+    drop_table?(:mcp_tool_calls)
+  end
+end
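
The columns in this table are enough to derive the per-tool breakdown that the changelog notes describe (calls, average latency, p95 latency, error rate). The following is a minimal sketch of that aggregation in plain Ruby, not the gem's implementation: `ToolCall`, `percentile`, and `summarize_tool_calls` are hypothetical names, the nearest-rank percentile is an assumed definition, and the sample rows are invented for illustration.

```ruby
# Hypothetical row shape mirroring a subset of the mcp_tool_calls columns.
ToolCall = Struct.new(:tool_name, :duration_ms, :error_class, keyword_init: true)

# Nearest-rank percentile over an array of numbers (assumed definition;
# the gem's own p95 calculation may differ).
def percentile(values, pct)
  return nil if values.empty?

  sorted = values.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Groups calls by tool and summarizes count, average latency, p95 latency,
# and error rate (share of calls with a non-nil error_class).
def summarize_tool_calls(calls)
  calls.group_by(&:tool_name).to_h do |name, rows|
    durations = rows.map(&:duration_ms)
    errors = rows.count { |row| row.error_class }
    [name, {
      calls: rows.size,
      avg_ms: durations.sum.to_f / rows.size,
      p95_ms: percentile(durations, 95),
      error_rate: errors.to_f / rows.size
    }]
  end
end

# Invented sample data; tool names are taken from the docs in this diff.
SAMPLE_CALLS = [
  ToolCall.new(tool_name: "memory.undistilled", duration_ms: 10, error_class: nil),
  ToolCall.new(tool_name: "memory.undistilled", duration_ms: 30, error_class: nil),
  ToolCall.new(tool_name: "memory.undistilled", duration_ms: 20, error_class: "Timeout"),
  ToolCall.new(tool_name: "memory.mark_distilled", duration_ms: 5, error_class: nil)
].freeze

TOOL_SUMMARY = summarize_tool_calls(SAMPLE_CALLS)
```

Storing `error_class` as a nullable column means the error rate falls out of a simple non-nil count, with no separate success flag needed.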

data/db/migrations/014_canonicalize_predicates.rb ADDED

@@ -0,0 +1,30 @@
+# frozen_string_literal: true
+# Migration v14: Canonicalize stale predicate names in existing facts.
+#
+# The predicate vocabulary was curated in 0.9.0 — synonym canonicalization
+# now runs at insert time (Resolver), but existing facts with stale
+# predicate names need a one-time rewrite so they appear in the correct
+# snapshot sections and query results.
+#
+# This migration applies the synonym map (inlined below from
+# PredicatePolicy::SYNONYMS) to every fact row that uses a stale name.
+# Not reversible: the down migration is a no-op because we can't know the
+# original predicate name after rewriting.
+Sequel.migration do
+  up do
+    # Inline the synonym map so the migration is self-contained and
+    # doesn't break if PredicatePolicy::SYNONYMS changes later.
+    synonyms = {
+      "has_convention" => "convention",
+      "primary_language" => "uses_language"
+    }
+    synonyms.each do |from, to|
+      self[:facts].where(predicate: from).update(predicate: to)
+    end
+  end
+  down do
+    # No-op: can't reverse a predicate rename without storing the original.
+  end
+end
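
The rewrite above is a many-to-one mapping, which is why it cannot be undone: once `primary_language` becomes `uses_language`, it is indistinguishable from facts that were always stored as `uses_language`. A small in-memory sketch of the same canonicalization follows. The synonym pairs are copied from the migration; `canonicalize_predicate` and the sample facts are hypothetical and only illustrate the behavior.

```ruby
# Synonym map copied from the migration above.
PREDICATE_SYNONYMS = {
  "has_convention" => "convention",
  "primary_language" => "uses_language"
}.freeze

# Returns the canonical predicate name, leaving unknown names unchanged.
def canonicalize_predicate(name)
  PREDICATE_SYNONYMS.fetch(name, name)
end

# Hypothetical in-memory facts standing in for rows of the facts table.
FACTS = [
  {id: 1, predicate: "has_convention"},
  {id: 2, predicate: "uses_language"},
  {id: 3, predicate: "primary_language"}
].freeze

REWRITTEN = FACTS.map do |fact|
  fact.merge(predicate: canonicalize_predicate(fact[:predicate]))
end
```

Because canonical names map to themselves, applying the function a second time changes nothing, so rerunning the rewrite is harmless.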

data/docs/improvements.md CHANGED

@@ -1,14 +1,14 @@
 # Improvements to Consider
-*Updated: 2026-03-24 - Implemented automatic distillation pipeline: NullDistiller wired into ingest hooks (Stop/SessionStart/PreCompact/SessionEnd/TaskCompleted/TeammateIdle), context hook injection for LLM extraction, `memory.undistilled` and `memory.mark_distilled` MCP tools, `/distill-transcripts` skill. Previously: Intent Parameter for Recall (#3), Retrieval Score Traces (#5), Search Agent Delegation (#8), Embedded Skill Distribution (#12), Shell Completion (#18), and 12 earlier features. Studied lossless-claw (v0.3.0). Other 6 repos unchanged since 2026-03-10.*
+*Updated: 2026-03-30 - Re-studied all 7 influencer repos. New recommendations: CLAUDE_CONFIG_DIR support (#26, from episodic-memory), Usage Stats / ROI Tracking (#27, from grepai v0.35.0). New Features to Avoid: AST-Aware Code Chunking (QMD), Custom Instructions via Env Var (lossless-claw v0.5.2), OpenClaw Context Injection (claude-mem v10.6.0). Repos with no changes: kbs (v0.2.1), claude-supermemory (v2.0.1), episodic-memory (v1.0.15). Previously: 14 features implemented through 2026-03-24.*
 *Sources:*
-- *[thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - Memory compression system (v10.5.5, studied 2026-03-09)*
-- *[obra/episodic-memory](https://github.com/obra/episodic-memory) - Semantic conversation search (v1.0.15, studied 2026-03-09)*
-- *[yoanbernabeu/grepai](https://github.com/yoanbernabeu/grepai) - Semantic code search (latest, studied 2026-03-09)*
-- *[supermemoryai/claude-supermemory](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed persistent memory (v2.0.1, studied 2026-03-09)*
-- *[tobi/qmd](https://github.com/tobi/qmd) - On-device hybrid search engine (v2.0.1, studied 2026-03-10)*
-- *[MadBomber/kbs](https://github.com/MadBomber/kbs) - Knowledge-Based System with RETE inference (v0.2.1, studied 2026-03-09 — no changes)*
-- *[martian-engineering/lossless-claw](https://github.com/martian-engineering/lossless-claw) - DAG-based lossless context management (v0.3.0, studied 2026-03-16)*
+- *[thedotmack/claude-mem](https://github.com/thedotmack/claude-mem) - Memory compression system (v10.6.3, re-studied 2026-03-30)*
+- *[obra/episodic-memory](https://github.com/obra/episodic-memory) - Semantic conversation search (v1.0.15, re-studied 2026-03-30 — no changes)*
+- *[yoanbernabeu/grepai](https://github.com/yoanbernabeu/grepai) - Semantic code search (v0.35.0, re-studied 2026-03-30)*
+- *[supermemoryai/claude-supermemory](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed persistent memory (v2.0.1, re-studied 2026-03-30 — no changes)*
+- *[tobi/qmd](https://github.com/tobi/qmd) - On-device hybrid search engine (v2.0.1+unreleased, re-studied 2026-03-30)*
+- *[MadBomber/kbs](https://github.com/MadBomber/kbs) - Knowledge-Based System with RETE inference (v0.2.1, studied 2026-03-30 — no changes)*
+- *[martian-engineering/lossless-claw](https://github.com/martian-engineering/lossless-claw) - DAG-based lossless context management (v0.5.2, re-studied 2026-03-30)*
 This document contains only unimplemented improvements. Completed items are removed.
@@ -128,10 +128,43 @@ Source: QMD study (updated 2026-03-02)
 `--async` flag on hook ingest/sweep/publish subcommands. Fork+detach for non-blocking execution, fallback to sync when fork unavailable.
+### ~~26. CLAUDE_CONFIG_DIR Support~~ ✅ Implemented 2026-04-13
+`Configuration#claude_config_dir` reads `CLAUDE_CONFIG_DIR` env var before falling back to `~/.claude`. `global_db_path` routes through it, so users with non-standard Claude Code config locations (or multiple profiles) can point the global memory DB anywhere without touching project DB resolution.
+### 28. Code-Aware Transcript Chunking
+Source: QMD v2.0.1+unreleased re-study (2026-03-30)
+- **Value**: Better embeddings for transcripts containing code — detect fenced code blocks and apply AST-aware break points (function/class/import boundaries) rather than naive text splitting
+- **Implementation**: Detect ` ```language ` fences in transcript content, parse code blocks with tree-sitter (via ruby_tree_sitter gem or shelling out), score break points (class=100, func=90, type=80, import=60), merge with markdown break points from #22
+- **Evidence**: QMD `src/ast.ts` (392 lines) — web-tree-sitter with WASM grammars, `mergeBreakPoints()` combining AST + regex scores, graceful degradation on parse failure
+- **Consideration**: Only useful in combination with #22 (Document Chunking). Transcripts often contain significant code in tool_use results and assistant responses
+- **Effort**: 2-3 days (after #22)
+- **Trade-off**: Adds tree-sitter dependency; graceful fallback to regex-only chunking when grammar unavailable
+### 30. Predicate Census Command
+Source: predicate retrospective (2026-04-15)
+- **Value**: Aggregate predicate usage data across many project databases for informed vocabulary decisions — without exposing content. Enables data-sharing across machines (work/personal) via a privacy-safe JSON report.
+- **Implementation**: `claude-memory census [--root ~/src]`. Finds all `.claude/memory.sqlite3` files under root, opens each read-only, collects per-DB predicate × status counts, entity type counts, schema version, novel predicates, synonym canonicalization candidates. Outputs aggregated JSON with **no object_literal, no entity names, no project paths, no quotes** — only schema-level signal.
+- **Evidence**: The multi-project survey that caught the `uses_framework` cardinality bug (commit `29818c2`) was a manual bash loop. Productizing it means any user can contribute usage data for vocabulary curation without privacy risk.
+- **Effort**: 0.5 days
+- **Trade-off**: None — purely additive, read-only, privacy-safe by design
+### ~~27. Usage Stats / ROI Tracking~~ ✅ Implemented 2026-04-15
+Schema migration v13 adds `mcp_tool_calls` telemetry table (tool_name, called_at, duration_ms, result_count, scope, error_class). `MCP::Telemetry` wraps `Server#handle_tools_call` with monotonic-clock timing, captures errors, and records to the project DB; DB errors are swallowed so telemetry never fails a real tool call. `StatsCommand` gains `--tools` and `--since DAYS` flags showing total calls, error rate, and per-tool breakdown (calls, avg ms, p95 ms, error rate). `Sweep::Maintenance#prune_old_mcp_tool_calls` enforces a 90-day retention window, wired into `Sweeper#run!`. Rejected NDJSON in favor of SQLite for schema/query consistency with the rest of the gem. Dropped query-text capture (YAGNI — the dedup insight the hash would enable also needs raw text). Also fixed a latent bug where `StatsCommand` opened the DB via `Sequel.sqlite` (requiring the unlisted `sqlite3` gem); now uses the extralite adapter consistently.
 ---
 ## Low Priority / Defer
+### ~~29. Derive CompletionCommand Descriptions from Registry~~ ✅ Implemented 2026-04-15
+`Registry::COMMANDS` now stores `{class:, description:}` entries as the single source of truth. New `Registry.description` and `Registry.descriptions` accessors. `CompletionCommand` reads descriptions via `Registry.descriptions` instead of maintaining its own parallel hash. `Registry.find` also simplified — class references stored directly since command files are required before the Registry, eliminating `const_get` string indirection. Drift between the command list and completion output is now impossible without a deliberate edit to a single file.
 ### 23. REST API Endpoint
 Source: QMD v2.0.1 study (2026-03-10)
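
Item #27 in the hunk above describes a wrapper that times each tool call with a monotonic clock, captures the error class, and swallows recording failures. The sketch below shows one way that pattern could look. It is not the gem's `MCP::Telemetry`: `with_telemetry` and `InMemoryRecorder` are hypothetical names, and the recorder stands in for the project DB.

```ruby
# Hypothetical recorder; the real telemetry writes to the project DB.
class InMemoryRecorder
  attr_reader :rows

  def initialize
    @rows = []
  end

  def record(row)
    @rows << row
  end
end

# Times a block with the monotonic clock, records the outcome, and
# re-raises any error from the block. Recorder failures are swallowed so
# telemetry can never fail the call it is measuring.
def with_telemetry(recorder, tool_name)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
  error_class = nil
  begin
    yield
  rescue => e
    error_class = e.class.name
    raise
  ensure
    duration_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond) - started
    begin
      recorder.record(tool_name: tool_name, duration_ms: duration_ms, error_class: error_class)
    rescue StandardError
      # Deliberately ignored: a telemetry write failure must not surface.
    end
  end
end
```

The monotonic clock is used because wall-clock time can jump (NTP adjustments, manual changes), which would produce negative or inflated durations.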
@@ -219,6 +252,11 @@ Added `claude-memory export` command. Dumps facts with entities and provenance t
 - **Sub-Agent Delegation for Deep Recall** — lossless-claw spawns sub-agents for DAG traversal. Adds latency and complexity; our direct MCP tool responses are simpler and faster
 - **Message Parts Polymorphism** — lossless-claw's 10-column message_parts for tool calls, reasoning, patches. We don't store raw messages, so irrelevant
 - **OpenClaw ContextEngine Interface** — Tight framework coupling. Our MCP + hooks approach is more portable
+- **Chunk Strategy Option** — QMD's `--chunk-strategy auto` for code files. ClaudeMemory has no standalone chunking pipeline to configure (QMD v2.0.1+unreleased)
+- **Custom Instructions via Env Var** — lossless-claw's `LCM_CUSTOM_INSTRUCTIONS` config stub exists but is never wired to summarization prompts. Incomplete pattern; our skill-based prompts are better (lossless-claw v0.5.2)
+- **OpenClaw Context Injection** — claude-mem v10.6.0's `appendSystemContext` with 60s cache replaces MEMORY.md writes. Our SessionStart hook context injection already does this (claude-mem v10.6.0)
+- **Message Parts Polymorphism** — lossless-claw's 10-column message_parts for tool calls, reasoning, patches. We don't store raw messages, so irrelevant
+- **OpenClaw ContextEngine Interface** — Tight framework coupling. Our MCP + hooks approach is more portable
 ---
@@ -241,22 +279,22 @@ Added `claude-memory export` command. Dumps facts with entities and provenance t
 ## References
 - [episodic-memory GitHub](https://github.com/obra/episodic-memory) - Semantic conversation search (v1.0.15)
-- [claude-mem GitHub](https://github.com/thedotmack/claude-mem) - Memory compression system (v10.5.5)
-- [grepai GitHub](https://github.com/yoanbernabeu/grepai) - Semantic code search (latest)
+- [claude-mem GitHub](https://github.com/thedotmack/claude-mem) - Memory compression system (v10.6.3)
+- [grepai GitHub](https://github.com/yoanbernabeu/grepai) - Semantic code search (v0.35.0)
 - [claude-supermemory GitHub](https://github.com/supermemoryai/claude-supermemory) - Cloud-backed memory (v2.0.1)
-- [QMD GitHub](https://github.com/tobi/qmd) - On-device hybrid search engine (v2.0.1)
+- [QMD GitHub](https://github.com/tobi/qmd) - On-device hybrid search engine (v2.0.1+unreleased)
 - [KBS GitHub](https://github.com/MadBomber/kbs) - Knowledge-Based System with RETE inference (v0.2.1)
-- [lossless-claw GitHub](https://github.com/martian-engineering/lossless-claw) - DAG-based lossless context management (v0.3.0)
+- [lossless-claw GitHub](https://github.com/martian-engineering/lossless-claw) - DAG-based lossless context management (v0.5.2)
 Influence documents:
-- [docs/influence/qmd.md](influence/qmd.md) - Updated 2026-03-10
-- [docs/influence/episodic-memory.md](influence/episodic-memory.md) - Updated 2026-03-09
-- [docs/influence/claude-mem.md](influence/claude-mem.md) - Updated 2026-03-09
-- [docs/influence/grepai.md](influence/grepai.md) - Updated 2026-03-09
-- [docs/influence/claude-supermemory.md](influence/claude-supermemory.md) - Updated 2026-03-09
-- [docs/influence/kbs.md](influence/kbs.md) - Updated 2026-03-09 (no changes)
-- [docs/influence/lossless-claw.md](influence/lossless-claw.md) - Updated 2026-03-16
+- [docs/influence/qmd.md](influence/qmd.md) - Re-studied 2026-03-30
+- [docs/influence/episodic-memory.md](influence/episodic-memory.md) - Re-studied 2026-03-30
+- [docs/influence/claude-mem.md](influence/claude-mem.md) - Re-studied 2026-03-30
+- [docs/influence/grepai.md](influence/grepai.md) - Re-studied 2026-03-30
+- [docs/influence/claude-supermemory.md](influence/claude-supermemory.md) - Re-studied 2026-03-30
+- [docs/influence/kbs.md](influence/kbs.md) - Re-studied 2026-03-30 (no changes)
+- [docs/influence/lossless-claw.md](influence/lossless-claw.md) - Re-studied 2026-03-30
 ---
-*Last updated: 2026-03-24 - Implemented 14 features total. Latest: Automatic Distillation Pipeline (#13 partial, #17), Intent Parameter for Recall (#3). Previously: Retrieval Score Traces (#5), Search Agent Delegation (#8), Embedded Skill Distribution (#12), Shell Completion (#18), Dynamic MCP Instructions (#11), Structured Error Classification (#16), Content-Addressed Dedup (#19), Dedup Before Vector Scoring (#20), Dedicated Maintenance Class (#10), Three-Level Escalation (#14), Tool Escalation Workflow (#15). Studied lossless-claw (v0.3.0). Implemented earlier: MCP Tool Annotations, MCP Stdout Protection, Worktree-Aware Git Root, Self-Excluding Conversations, Plugin Distribution, sqlite-vec, Database Compact, Fact Export, Background Processing, MCP Discovery Tools.*
+*Last updated: 2026-04-15 - Predicate retrospective: fixed uses_framework cardinality bug, curated vocabulary to 8 predicates, added synonym canonicalization + novel-predicate warnings. Also: reject/restore commands, #26 CLAUDE_CONFIG_DIR, #27 telemetry, #29 Registry descriptions.*
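
The `CLAUDE_CONFIG_DIR` support (#26) described earlier in this file amounts to an environment-variable lookup with a fallback to `~/.claude`. A minimal sketch follows, under several assumptions: the method signatures are hypothetical (the gem exposes this as `Configuration#claude_config_dir`), treating an empty value as unset is a choice made here, and the global DB file name is assumed because this diff does not show it.

```ruby
# Resolves the Claude config directory: CLAUDE_CONFIG_DIR when set and
# non-empty, otherwise ~/.claude. The env hash is injectable for testing.
def claude_config_dir(env = ENV, home: Dir.home)
  override = env["CLAUDE_CONFIG_DIR"]
  return File.expand_path(override) unless override.nil? || override.empty?

  File.join(home, ".claude")
end

# Assumed file name for the global DB; the gem's actual name is not shown
# in this diff.
def global_db_path(env = ENV, home: Dir.home)
  File.join(claude_config_dir(env, home: home), "memory.sqlite3")
end
```

Routing the global DB path through a single resolver is what lets multiple profiles coexist: changing one variable relocates the global DB without affecting per-project resolution.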

data/docs/influence/claude-mem.md CHANGED

@@ -3,6 +3,7 @@
 *Analysis Date: 2026-03-02*
 *Repository: https://github.com/thedotmack/claude-mem*
 *Version: 10.5.2 (commit ecb09df)*
+*Re-studied: 2026-03-30 — v10.6.3 (commit d068821). 5 releases since last study (v10.5.6, v10.6.0, v10.6.1, v10.6.2, v10.6.3). Key changes: OpenClaw system prompt context injection replacing MEMORY.md writes (v10.6.0), compressed context output ~53% smaller (v10.6.1), timeline report skill (v10.6.1), process supervisor hardening with PID 0 fix and signal race condition fix (v10.5.6), activity spinner orphan session fix (v10.6.2), Gemini CLI integration (v10.6.3), 7 critical cross-platform bug fixes (v10.6.3). Context injection pattern (appendSystemContext with 60s cache) aligns with our SessionStart hook approach. Compressed context format worth studying. No new adoptable patterns beyond what we already implement.*
 ---

data/docs/influence/claude-supermemory.md CHANGED

@@ -2,6 +2,7 @@
 *Analysis Date: 2026-03-02*
 *Previous Analysis: 2026-02-02*
+*Re-studied: 2026-03-30 — No meaningful code changes since v2.0.1. marketplace.json bumped to 0.0.3, added claude-code-review GitHub Action (anthropics/claude-code-action@v1). All findings remain current.*
 *Repository: https://github.com/supermemoryai/claude-supermemory*
 *Version: 2.0.0 (commit de39413)*

data/docs/influence/episodic-memory.md CHANGED

@@ -3,6 +3,7 @@
 *Analysis Date: 2026-03-02*
 *Repository: https://github.com/obra/episodic-memory*
 *Version: 1.0.15 (commit 6feaa5b)*
+*Re-studied: 2026-03-30 — No changes since v1.0.15. Repo dormant. One adoptable pattern identified: CLAUDE_CONFIG_DIR env var support (`src/paths.ts:20-22`) for configurable Claude config directory. Orphaned MCP process prevention (SIGHUP handler in wrapper) not applicable — ClaudeMemory runs as single Ruby process, no wrapper/child architecture.*
 ---

data/docs/influence/grepai.md CHANGED

@@ -4,6 +4,7 @@
 *Previous Analysis: 2026-01-29*
 *Repository: https://github.com/yoanbernabeu/grepai*
 *Version: 0.34.0 (commit 1c7aba9)*
+*Re-studied: 2026-03-30 — v0.35.0. One release since last study (2026-03-16). Key addition: privacy-first usage stats tracking (`stats/` package) recording every search/trace to NDJSON file (`.grepai/stats.json`), computing output tokens vs grep-equivalent tokens with savings percentages and optional USD cost savings. Fire-and-forget recording via goroutine with 100ms timeout, file-locking for cross-process safety. Shell completion also added (we already have this via #18). `.grepaiignore` support not relevant.*
 ---

data/docs/influence/kbs.md CHANGED

@@ -1,6 +1,7 @@
 # KBS (Knowledge-Based System) Analysis
 *Analysis Date: 2026-03-02*
+*Re-studied: 2026-03-30 — no changes since v0.2.1*
 *Repository: https://github.com/MadBomber/kbs*
 *Version: v0.2.1 (commit c04561d)*

data/docs/influence/lossless-claw.md CHANGED

@@ -3,6 +3,7 @@
 *Analysis Date: 2026-03-16*
 *Repository: https://github.com/martian-engineering/lossless-claw*
 *Version: 0.3.0 (commit 49949fb)*
+*Re-studied: 2026-03-30 — v0.5.2. 5 releases since last study. Core DAG architecture unchanged; changes are operational hardening. Custom Instructions (`LCM_CUSTOM_INSTRUCTIONS`) — config stub exists but never wired to summarization prompts, do not adopt. Session exclusion patterns (ignore + stateless) — clean implementation but low priority (we already have ContentSanitizer exclusion tags). Prompt Slot Pattern — does not exist in codebase, not applicable. New: CJK-aware FTS5 fallback with `icu` tokenizer detection worth considering. Also: provider auth error surfacing, summarizer timeouts, bootstrap checkpoints, Docker support, TUI doctor command.*
 ---

data/docs/influence/qmd.md CHANGED

@@ -4,6 +4,7 @@
 *Previous Analysis: 2026-03-02, 2026-02-02*
 *Repository: https://github.com/tobi/qmd*
 *Version: 2.0.1 (commit ae3604c)*
+*Re-studied: 2026-03-30 — v2.0.1+unreleased. One significant addition: AST-aware chunking (`src/ast.ts`, 392 lines) using web-tree-sitter with WASM grammars for TS/JS/Python/Go/Rust. Detects language from extension, parses AST, extracts break points at function/class/import boundaries (class=100, func=90, type=80, import=60). Merged with regex break points via `mergeBreakPoints()`. Opt-in via `--chunk-strategy auto`. While we ingest transcripts rather than source code, transcripts frequently contain embedded code in tool results and assistant responses. AST-aware break points could improve embedding quality for code-heavy transcripts when combined with Document Chunking (#22). Added as improvement #28 (Code-Aware Transcript Chunking).*
 ---

data/lib/claude_memory/commands/completion_command.rb CHANGED

@@ -37,7 +37,7 @@ module ClaudeMemory
       end
       def zsh_completion
-        commands_with_desc = command_descriptions.map { |name, desc|
+        commands_with_desc = Registry.descriptions.sort.map { |name, desc|
           "    '#{name}:#{desc}'"
         }.join("\n")
@@ -141,36 +141,6 @@ module ClaudeMemory
         BASH
       end
-      def command_descriptions
-        {
-          "changes" => "Show recent fact changes",
-          "compact" => "Compact databases",
-          "completion" => "Generate shell completions",
-          "conflicts" => "Show open conflicts",
-          "db:init" => "Initialize database",
-          "doctor" => "Check system health",
-          "explain" => "Explain a fact with receipts",
-          "export" => "Export facts to JSON",
-          "git-lfs" => "Git LFS integration",
-          "help" => "Show help message",
-          "hook" => "Run hook entrypoints",
-          "index" => "Index content",
-          "ingest" => "Ingest transcript delta",
-          "init" => "Initialize ClaudeMemory",
-          "install-skill" => "Install agent skills",
-          "promote" => "Promote fact to global",
-          "publish" => "Publish snapshot",
-          "recall" => "Recall facts matching query",
-          "recover" => "Recover database",
-          "search" => "Search indexed content",
-          "serve-mcp" => "Start MCP server",
-          "stats" => "Show statistics",
-          "sweep" => "Run maintenance",
-          "uninstall" => "Remove configuration",
-          "version" => "Show version"
-        }
-      end
       def skill_names
         InstallSkillCommand::AVAILABLE_SKILLS.keys
       end
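
This change removes the parallel description hash and reads from `Registry.descriptions` instead. The registry itself is not shown in this part of the diff, so the following is only a sketch of the shape the improvement notes describe (`{class:, description:}` entries with `find`, `description`, and `descriptions` accessors). The `Sketch` module and its stand-in command classes are hypothetical.

```ruby
module Sketch
  # Stand-in command classes; the real ones live under ClaudeMemory::Commands.
  class StatsCommand; end
  class VersionCommand; end

  module Registry
    # Single source of truth: each entry carries its class and description.
    COMMANDS = {
      "version" => {class: VersionCommand, description: "Show version"},
      "stats" => {class: StatsCommand, description: "Show statistics"}
    }.freeze

    # Returns the command class, or nil for an unknown name.
    def self.find(name)
      COMMANDS.dig(name, :class)
    end

    # Returns the description, or nil for an unknown name.
    def self.description(name)
      COMMANDS.dig(name, :description)
    end

    # Returns a name => description hash for help and completion output.
    def self.descriptions
      COMMANDS.transform_values { |entry| entry[:description] }
    end
  end
end

# Mirrors the zsh formatting in the diff: sorted "name:description" lines.
ZSH_LINES = Sketch::Registry.descriptions.sort.map { |name, desc|
  "    '#{name}:#{desc}'"
}.join("\n")
```

Sorting at the call site keeps completion output stable regardless of the order in which commands are registered.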

data/lib/claude_memory/commands/embeddings_command.rb ADDED

@@ -0,0 +1,198 @@
+# frozen_string_literal: true
+module ClaudeMemory
+  module Commands
+    # Shows embedding configuration, lists available models, and validates setup.
+    #
+    # Subcommands:
+    #   claude-memory embeddings          # Show current config
+    #   claude-memory embeddings list     # List available models
+    #   claude-memory embeddings check    # Validate current setup
+    #
+    class EmbeddingsCommand < BaseCommand
+      def call(args)
+        opts = parse_options(args, {}) do |o|
+          OptionParser.new do |parser|
+            parser.banner = "Usage: claude-memory embeddings [list|check]"
+          end
+        end
+        return 1 if opts.nil?
+        subcommand = args.first
+        case subcommand
+        when "list" then list_models
+        when "check" then check_setup
+        when nil then show_config
+        else
+          failure("Unknown subcommand: #{subcommand}. Use: list, check")
+        end
+      end
+      private
+      def inspector
+        @inspector ||= Embeddings::Inspector.new
+      end
+      def show_config
+        provider = ENV["CLAUDE_MEMORY_EMBEDDING_PROVIDER"] || "tfidf"
+        model = ENV["CLAUDE_MEMORY_EMBEDDING_MODEL"]
+        api_url = ENV["CLAUDE_MEMORY_EMBEDDING_API_URL"]
+        stdout.puts "Embedding Configuration"
+        stdout.puts "======================"
+        stdout.puts "Provider:  #{provider}"
+        stdout.puts "Model:     #{model || "(default)"}"
+        if model
+          info = Embeddings::ModelRegistry.find(model)
+          if info
+            stdout.puts "Dimensions: #{info.dimensions}"
+            stdout.puts "Description: #{info.description}"
+          else
+            stdout.puts "Dimensions: (unknown - will be discovered at runtime)"
+          end
+        else
+          info = Embeddings::ModelRegistry.default_for_provider(provider)
+          if info
+            stdout.puts "Default model: #{info.name}"
+            stdout.puts "Dimensions: #{info.dimensions}"
+          end
+        end
+        stdout.puts "API URL:   #{api_url}" if api_url && provider == "api"
+        inspector.database_states.each do |state|
+          stdout.puts ""
+          stdout.puts "#{state.label.capitalize} DB: provider=#{state.provider || "unknown"}, dimensions=#{state.dimensions || "unknown"}"
+        end
+        stdout.puts ""
+        stdout.puts "ENV variables:"
+        stdout.puts "  CLAUDE_MEMORY_EMBEDDING_PROVIDER  Provider (tfidf, fastembed, api)"
+        stdout.puts "  CLAUDE_MEMORY_EMBEDDING_MODEL     Model name"
+        stdout.puts "  CLAUDE_MEMORY_EMBEDDING_API_KEY   API key (for api provider)"
+        stdout.puts "  CLAUDE_MEMORY_EMBEDDING_API_URL   API endpoint (for api provider)"
+        0
+      end
+      def list_models
+        Embeddings::ModelRegistry.providers.each do |provider|
+          stdout.puts ""
+          stdout.puts "#{provider_label(provider)}:"
+          stdout.puts "-" * 40
+          Embeddings::ModelRegistry.models_for_provider(provider).each do |model|
+            size = model.size_mb ? "#{model.size_mb}MB" : "cloud"
+            tokens = model.max_tokens ? "#{model.max_tokens} tokens" : ""
+            stdout.puts "  #{model.name}"
+            stdout.puts "    #{model.dimensions}-dim | #{size} | #{tokens}"
+            stdout.puts "    #{model.description}"
+          end
+        end
+        stdout.puts ""
+        stdout.puts "Custom models: Set CLAUDE_MEMORY_EMBEDDING_MODEL to any model"
+        stdout.puts "supported by your provider. Dimensions are auto-detected."
+        0
+      end
+      def check_setup
+        provider_name = ENV["CLAUDE_MEMORY_EMBEDDING_PROVIDER"] || "tfidf"
+        model_name = ENV["CLAUDE_MEMORY_EMBEDDING_MODEL"]
+        stdout.puts "Checking embedding setup..."
+        stdout.puts ""
+        ok = true
+        ok &= check_provider(provider_name)
+        ok &= check_model(provider_name, model_name) if model_name
+        ok &= render_dimension_checks(provider_name, model_name)
+        stdout.puts ""
+        stdout.puts ok ? "All checks passed." : "Some checks failed. See above."
+        ok ? 0 : 1
+      end
+      def check_provider(name)
+        case name
+        when "fastembed"
+          check_fastembed
+        when "api"
+          check_api_config
+        when "tfidf"
+          stdout.puts "  [OK] tfidf provider (built-in, always available)"
+          true
+        else
+          stdout.puts "  [FAIL] Unknown provider: #{name}"
+          false
+        end
+      end
+      def check_model(provider_name, model_name)
+        info = Embeddings::ModelRegistry.find(model_name)
+        if info
+          if info.provider != provider_name
+            stdout.puts "  [WARN] Model '#{model_name}' is for '#{info.provider}' provider, but '#{provider_name}' is selected"
+            stdout.puts "         Set CLAUDE_MEMORY_EMBEDDING_PROVIDER=#{info.provider}"
+          else
+            stdout.puts "  [OK] Model '#{model_name}' (#{info.dimensions}-dim)"
+          end
+        else
+          stdout.puts "  [INFO] Model '#{model_name}' not in registry (dimensions will be auto-detected)"
+        end
+        true
+      end
+      def render_dimension_checks(provider_name, model_name)
+        ok = true
+        inspector.dimension_checks(provider_name, model_name).each do |check|
+          case check.status
+          when :mismatch
+            stdout.puts "  [WARN] #{check.label}: Dimension mismatch (stored: #{check.stored_dims}, current: #{check.current_dims})"
+            stdout.puts "         Re-index with: claude-memory index --force --scope #{check.label}"
+            ok = false
+          when :match
+            stdout.puts "  [OK] #{check.label}: #{check.stored_dims}-dim (provider: #{check.stored_provider || "unknown"})"
+          when :fresh
+            stdout.puts "  [INFO] #{check.label}: No embeddings indexed yet"
+          end
+        end
+        ok
+      end
+      def check_fastembed
+        require "fastembed"
+        stdout.puts "  [OK] fastembed gem available"
+        true
+      rescue LoadError
+        stdout.puts "  [FAIL] fastembed gem not installed"
+        stdout.puts "         Add `gem 'fastembed'` to your Gemfile"
+        false
+      end
+      def check_api_config
+        key = ENV["CLAUDE_MEMORY_EMBEDDING_API_KEY"] || ENV["OPENAI_API_KEY"]
+        if key
+          stdout.puts "  [OK] API key configured"
+          true
+        else
+          stdout.puts "  [FAIL] No API key found"
+          stdout.puts "         Set CLAUDE_MEMORY_EMBEDDING_API_KEY or OPENAI_API_KEY"
+          false
+        end
+      end
+      def provider_label(provider)
+        case provider
+        when "fastembed" then "fastembed (local ONNX, no API key)"
+        when "api" then "api (OpenAI-compatible endpoints, requires API key)"
+        when "tfidf" then "tfidf (built-in, no dependencies)"
+        end
+      end
+    end
+  end
+end
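
`render_dimension_checks` above branches on three statuses (`:mismatch`, `:match`, `:fresh`) produced by `Embeddings::Inspector`, which is not shown in this part of the diff. The sketch below illustrates one plausible classification rule. `DimensionCheck` and `classify_dimensions` are hypothetical, and the actual inspector may decide differently, for example by also comparing providers.

```ruby
# Hypothetical result object with a subset of the fields the command reads.
DimensionCheck = Struct.new(:label, :status, :stored_dims, :current_dims, keyword_init: true)

# Classifies one database: :fresh when nothing is stored yet, :match when the
# stored and current dimensions agree, :mismatch otherwise.
def classify_dimensions(label, stored_dims, current_dims)
  status =
    if stored_dims.nil?
      :fresh
    elsif stored_dims == current_dims
      :match
    else
      :mismatch
    end

  DimensionCheck.new(label: label, status: status, stored_dims: stored_dims, current_dims: current_dims)
end
```

Only `:mismatch` makes the command report failure, which fits the re-index hint it prints: vectors of different dimensions cannot be compared, so stored embeddings must be rebuilt after the model changes.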

data/lib/claude_memory/commands/help_command.rb CHANGED

@@ -19,20 +19,27 @@ module ClaudeMemory
             explain    Explain a fact with receipts
             export     Export facts to JSON for backup
             help       Show this help message
-            hook       Run hook entrypoints (ingest|sweep|publish)
+            hook       Run hook entrypoints (ingest|sweep|publish|context)
             init       Initialize ClaudeMemory in a project
             ingest     Ingest transcript delta
             promote    Promote a project fact to global memory
             publish    Publish snapshot to Claude Code memory
             recall     Recall facts matching a query
+            recover    Recover stuck operations
+            reject     Mark a fact as rejected (e.g. hallucination)
+            restore    Restore superseded facts from reclassified predicates
             search     Search indexed content
             serve-mcp  Start MCP server
+            stats      Show statistics (--tools for MCP telemetry)
             sweep      Run maintenance/pruning
             uninstall  Remove ClaudeMemory configuration
             version    Show version number
           Utilities:
             completion     Generate shell completions (bash/zsh)
+            embeddings     Inspect embedding backend
+            git-lfs        Git LFS integration for memory DB
+            index          Build or rebuild content indexes
             install-skill  Install agent skills to ~/.claude/commands/
           Run 'claude-memory <command> --help' for more information on a command.