npm - clawmem - Versions diffs - 0.1.3 → 0.1.5 - Mend

clawmem 0.1.3 → 0.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/AGENTS.md +4 -2
package/CLAUDE.md +4 -2
package/README.md +24 -2
package/SKILL.md +4 -2
package/package.json +1 -1
package/src/config.ts +12 -3
package/src/hooks/context-surfacing.ts +23 -2

package/AGENTS.md CHANGED Viewed

@@ -257,6 +257,8 @@ ClawMem hooks handle ~90% of retrieval automatically. Agent-initiated MCP calls
 **Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap token budget. Absence in `<vault-context>` does NOT mean absence in memory. If you expect a memory to exist but it wasn't surfaced, escalate to Tier 3.
+**Adaptive thresholds:** Context-surfacing uses ratio-based scoring that adapts to vault characteristics (size, document quality, content age, embedding model). Results are kept within a percentage of the best result's composite score rather than a fixed absolute threshold. An activation floor prevents surfacing when all results are weak. Profiles control the ratio: `speed` (65%), `balanced` (55%), `deep` (45% + query expansion + reranking). `CLAWMEM_PROFILE=deep` is recommended for vaults with older content or lower-quality documents. MCP tools use fixed absolute thresholds, not adaptive.
 ### Tier 3 — Agent-Initiated (one targeted MCP call)
 **Escalate ONLY when one of these three rules fires:**
@@ -631,8 +633,8 @@ Symptom: "UserPromptSubmit hook error" on context-surfacing hook (intermittent)
     (including non-.md files like session .jsonl transcripts) and holds brief write locks. If the
     hook fires during a lock, it can exceed its timeout. More likely during active conversations
     when the watcher is processing rapid transcript changes.
-  → Fix: Bump the hook timeout from 5s to 8s in ~/.claude/settings.json. If persistent, restart
-    the watcher to clear memory bloat: `systemctl --user restart clawmem-watcher.service`.
+  → Fix: The default timeout is 8s (since v0.1.1). If you have an older install, re-run
+    `clawmem setup hooks` to update. If persistent, restart the watcher: `systemctl --user restart clawmem-watcher.service`.
 ```
 ## CLI Reference

package/CLAUDE.md CHANGED Viewed

@@ -257,6 +257,8 @@ ClawMem hooks handle ~90% of retrieval automatically. Agent-initiated MCP calls
 **Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap token budget. Absence in `<vault-context>` does NOT mean absence in memory. If you expect a memory to exist but it wasn't surfaced, escalate to Tier 3.
+**Adaptive thresholds:** Context-surfacing uses ratio-based scoring that adapts to vault characteristics (size, document quality, content age, embedding model). Results are kept within a percentage of the best result's composite score rather than a fixed absolute threshold. An activation floor prevents surfacing when all results are weak. Profiles control the ratio: `speed` (65%), `balanced` (55%), `deep` (45% + query expansion + reranking). `CLAWMEM_PROFILE=deep` is recommended for vaults with older content or lower-quality documents. MCP tools use fixed absolute thresholds, not adaptive.
 ### Tier 3 — Agent-Initiated (one targeted MCP call)
 **Escalate ONLY when one of these three rules fires:**
@@ -631,8 +633,8 @@ Symptom: "UserPromptSubmit hook error" on context-surfacing hook (intermittent)
     (including non-.md files like session .jsonl transcripts) and holds brief write locks. If the
     hook fires during a lock, it can exceed its timeout. More likely during active conversations
     when the watcher is processing rapid transcript changes.
-  → Fix: Bump the hook timeout from 5s to 8s in ~/.claude/settings.json. If persistent, restart
-    the watcher to clear memory bloat: `systemctl --user restart clawmem-watcher.service`.
+  → Fix: The default timeout is 8s (since v0.1.1). If you have an older install, re-run
+    `clawmem setup hooks` to update. If persistent, restart the watcher: `systemctl --user restart clawmem-watcher.service`.
 ```
 ## CLI Reference

package/README.md CHANGED Viewed

@@ -606,7 +606,7 @@ clawmem doctor                                  Full health check
 clawmem status                                  Quick index status
 ```
-## MCP Tools (25)
+## MCP Tools (28)
 Registered by `clawmem setup mcp`. Available to any MCP-compatible client.
@@ -652,6 +652,13 @@ Registered by `clawmem setup mcp`. Available to any MCP-compatible client.
 |---|---|
 | `beads_sync` | Sync Beads issues from Dolt backend (`bd` CLI) into memory: creates docs, bridges all dep types to `memory_relations`, runs A-MEM enrichment |
+### Vault Management
+| Tool | Description |
+|---|---|
+| `list_vaults` | Show configured vault names and paths. Empty in single-vault mode. |
+| `vault_sync` | Index markdown from a directory into a named vault. Restricted-path validation rejects sensitive directories. |
 ### Memory Management & Lifecycle
 | Tool | Description |
@@ -998,7 +1005,7 @@ Manual layers benefit from periodic re-indexing — a cron job running `clawmem
 Three-tier retrieval architecture: infrastructure (watcher + embed timer) → hooks (~90%) → agent MCP (~10%). Works out of the box without a dedicated GPU (all models auto-download via `node-llama-cpp`, uses Metal on Apple Silicon). For best performance, run three `llama-server` instances — see [GPU Services](#gpu-services) for model tiers (SOTA vs QMD native) and [Cloud Embedding](#option-c-cloud-embedding-api) for cloud embedding alternatives.
-Key services: `clawmem-watcher` (auto-index on file change + beads sync), `clawmem-embed` timer (daily embedding sweep), 9 Claude Code hooks (context injection, session bootstrap, decision extraction, handoffs, feedback, compaction support). Optional `clawmem-curator` agent for on-demand lifecycle triage, retrieval health checks, and maintenance (`clawmem setup curator`).
+Key services: `clawmem-watcher` (auto-index on file change + beads sync), `clawmem-embed` timer (daily embedding sweep), 7 Claude Code hooks installed by default (context injection, curator nudge, compaction support, decision extraction, handoffs, feedback). Optional `clawmem-curator` agent for on-demand lifecycle triage, retrieval health checks, and maintenance (`clawmem setup curator`).
 ## Acknowledgments
@@ -1015,6 +1022,21 @@ Built on the shoulders of:
 - [SAME](https://github.com/sgx-labs/statelessagent) — agent memory concepts (recency decay, confidence scoring, session tracking)
 - [supermemory](https://github.com/supermemoryai/clawdbot-supermemory) — hook patterns and context surfacing ideas
+## Roadmap
+| Status | Feature | Description |
+|--------|---------|-------------|
+| :white_check_mark: | Adaptive thresholds | Ratio-based filtering that adapts to vault characteristics (v0.1.3) |
+| :white_check_mark: | Deep escalation | Budget-aware query expansion + cross-encoder reranking for `deep` profile |
+| :white_check_mark: | Cloud embedding providers | Jina, OpenAI, Voyage, Cohere with batch embedding + TPM pacing |
+| :construction: | Calibration probes | One-time `clawmem calibrate` command that measures your vault's score distribution and tunes thresholds automatically |
+| :construction: | Rolling threshold learning | Learns optimal activation floor from actual usage patterns (which surfaced content gets referenced vs ignored) |
+| :memo: | Multi-vault namespacing | Isolated vaults with independent calibration and lifecycle policies |
+| :memo: | REST API authentication | Token-scoped access for multi-agent deployments |
+| :memo: | Streaming rerank | Cross-encoder reranking within the hook timeout budget for larger candidate sets |
+:white_check_mark: Shipped&ensp; :construction: Planned&ensp; :memo: Exploring
 ## License
 MIT

package/SKILL.md CHANGED Viewed

@@ -191,6 +191,8 @@ Hooks handle ~90% of retrieval. Zero agent effort.
 **Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap token budget. Absence in `<vault-context>` does NOT mean absence in memory. Escalate to Tier 3 if expected memory wasn't surfaced.
+**Adaptive thresholds:** Context-surfacing uses ratio-based scoring that adapts to vault characteristics. Results are kept within a percentage of the best result's composite score (speed: 65%, balanced: 55%, deep: 45%). An activation floor prevents surfacing when all results are weak. `CLAWMEM_PROFILE=deep` adds query expansion + reranking. MCP tools use fixed absolute thresholds, not adaptive.
 ---
 ## Tier 3 — Agent-Initiated Retrieval (MCP Tools)
@@ -635,8 +637,8 @@ Symptom: "UserPromptSubmit hook error" on context-surfacing hook (intermittent)
   -> SQLite contention between watcher and hook. Watcher processes filesystem events (including
      non-.md files like session .jsonl transcripts) and holds brief write locks. If the hook fires
      during a lock, it can exceed its timeout. More likely during active conversations.
-  -> Fix: Bump hook timeout from 5s to 8s in ~/.claude/settings.json. If persistent, restart
-     the watcher: `systemctl --user restart clawmem-watcher.service`.
+  -> Fix: Default timeout is 8s (since v0.1.1). If you have an older install, re-run
+     `clawmem setup hooks`. If persistent, restart the watcher: `systemctl --user restart clawmem-watcher.service`.
 ```
 ---

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "clawmem",
-  "version": "0.1.3",
+  "version": "0.1.5",
   "description": "On-device context engine and memory for AI agents. Claude Code and OpenClaw. Hooks + MCP server + hybrid RAG search.",
   "type": "module",
   "bin": {

package/src/config.ts CHANGED Viewed

@@ -70,7 +70,16 @@ export interface ProfileConfig {
   maxResults: number;
   useVector: boolean;
   vectorTimeout: number;
+  /** Legacy absolute threshold — used by MCP tools and as fallback when thresholdMode="absolute" */
   minScore: number;
+  /** Adaptive: keep results within this ratio of best score (e.g., 0.55 = top 55%) */
+  minScoreRatio: number;
+  /** Adaptive: never surface below this regardless of ratio */
+  absoluteFloor: number;
+  /** Adaptive: if best result is below this, return empty (prevents all-weak surfacing) */
+  activationFloor: number;
+  /** "adaptive" uses ratio-based filtering; "absolute" uses legacy minScore */
+  thresholdMode: "adaptive" | "absolute";
   /** Budget-aware escalation: if fast path finishes early, spend remaining time on expansion + reranking */
   deepEscalation: boolean;
   /** Max time (ms) allowed for the fast path before escalation is considered */
@@ -78,9 +87,9 @@ export interface ProfileConfig {
 }
 export const PROFILES: Record<PerformanceProfile, ProfileConfig> = {
-  speed:    { tokenBudget: 400,  maxResults: 5,  useVector: false, vectorTimeout: 0,    minScore: 0.55, deepEscalation: false, escalationBudgetMs: 0 },
-  balanced: { tokenBudget: 800,  maxResults: 10, useVector: true,  vectorTimeout: 900,  minScore: 0.45, deepEscalation: false, escalationBudgetMs: 0 },
-  deep:     { tokenBudget: 1200, maxResults: 15, useVector: true,  vectorTimeout: 2000, minScore: 0.25, deepEscalation: true,  escalationBudgetMs: 4000 },
+  speed:    { tokenBudget: 400,  maxResults: 5,  useVector: false, vectorTimeout: 0,    minScore: 0.55, minScoreRatio: 0.65, absoluteFloor: 0.18, activationFloor: 0.24, thresholdMode: "adaptive", deepEscalation: false, escalationBudgetMs: 0 },
+  balanced: { tokenBudget: 800,  maxResults: 10, useVector: true,  vectorTimeout: 900,  minScore: 0.45, minScoreRatio: 0.55, absoluteFloor: 0.15, activationFloor: 0.20, thresholdMode: "adaptive", deepEscalation: false, escalationBudgetMs: 0 },
+  deep:     { tokenBudget: 1200, maxResults: 15, useVector: true,  vectorTimeout: 2000, minScore: 0.25, minScoreRatio: 0.45, absoluteFloor: 0.12, activationFloor: 0.16, thresholdMode: "adaptive", deepEscalation: true,  escalationBudgetMs: 4000 },
 };
 export function getActiveProfile(): ProfileConfig {

package/src/hooks/context-surfacing.ts CHANGED Viewed

@@ -178,7 +178,10 @@ export async function contextSurfacing(
         }
         // Phase 2: Cross-encoder reranking — reorder with deeper relevance signal
+        // Sort by score first so reranking covers the best candidates, not just
+        // the first-inserted (expansion hits appended later would otherwise be missed)
         if (Date.now() - startTime < 6000 && results.length >= 3) {
+          results.sort((a, b) => b.score - a.score);
           const toRerank = results.slice(0, 15).map(r => ({
             file: r.filepath,
             text: (r.body || "").slice(0, 2000),
@@ -253,8 +256,26 @@ export async function contextSurfacing(
   }
   // Apply composite scoring
-  const scored = applyCompositeScoring(enriched, prompt)
-    .filter(r => r.compositeScore >= minScore);
+  const allScored = applyCompositeScoring(enriched, prompt);
+  // Threshold filtering — adaptive (ratio-based) or absolute (legacy)
+  let scored: typeof allScored;
+  if (profile.thresholdMode === "adaptive") {
+    // Use max composite score across the set (not positional [0], which may be
+    // reordered by recency-intent sorting in applyCompositeScoring)
+    const bestScore = allScored.length > 0
+      ? Math.max(...allScored.map(r => r.compositeScore))
+      : 0;
+    // Activation floor: if even the best result is too weak, bail entirely
+    if (bestScore < profile.activationFloor) return makeEmptyOutput("context-surfacing");
+    const adaptiveMin = Math.max(bestScore * profile.minScoreRatio, profile.absoluteFloor);
+    scored = allScored.filter(r => r.compositeScore >= adaptiveMin);
+  } else {
+    // Legacy absolute threshold (backward compat)
+    scored = allScored.filter(r => r.compositeScore >= minScore);
+  }
   if (scored.length === 0) return makeEmptyOutput("context-surfacing");