npm - clawmem - Versions diffs - 0.1.3 → 0.1.4 - Mend

clawmem 0.1.3 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/AGENTS.md +2 -0
package/CLAUDE.md +2 -0
package/README.md +15 -0
package/SKILL.md +2 -0
package/package.json +1 -1
package/src/config.ts +12 -3
package/src/hooks/context-surfacing.ts +23 -2

package/AGENTS.md CHANGED Viewed

@@ -257,6 +257,8 @@ ClawMem hooks handle ~90% of retrieval automatically. Agent-initiated MCP calls
 **Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap token budget. Absence in `<vault-context>` does NOT mean absence in memory. If you expect a memory to exist but it wasn't surfaced, escalate to Tier 3.
+**Adaptive thresholds:** Context-surfacing uses ratio-based scoring that adapts to vault characteristics (size, document quality, content age, embedding model). Results are kept within a percentage of the best result's composite score rather than a fixed absolute threshold. An activation floor prevents surfacing when all results are weak. Profiles control the ratio: `speed` (65%), `balanced` (55%), `deep` (45% + query expansion + reranking). `CLAWMEM_PROFILE=deep` is recommended for vaults with older content or lower-quality documents. MCP tools use fixed absolute thresholds, not adaptive.
 ### Tier 3 — Agent-Initiated (one targeted MCP call)
 **Escalate ONLY when one of these three rules fires:**

package/CLAUDE.md CHANGED Viewed

@@ -257,6 +257,8 @@ ClawMem hooks handle ~90% of retrieval automatically. Agent-initiated MCP calls
 **Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap token budget. Absence in `<vault-context>` does NOT mean absence in memory. If you expect a memory to exist but it wasn't surfaced, escalate to Tier 3.
+**Adaptive thresholds:** Context-surfacing uses ratio-based scoring that adapts to vault characteristics (size, document quality, content age, embedding model). Results are kept within a percentage of the best result's composite score rather than a fixed absolute threshold. An activation floor prevents surfacing when all results are weak. Profiles control the ratio: `speed` (65%), `balanced` (55%), `deep` (45% + query expansion + reranking). `CLAWMEM_PROFILE=deep` is recommended for vaults with older content or lower-quality documents. MCP tools use fixed absolute thresholds, not adaptive.
 ### Tier 3 — Agent-Initiated (one targeted MCP call)
 **Escalate ONLY when one of these three rules fires:**

package/README.md CHANGED Viewed

@@ -1015,6 +1015,21 @@ Built on the shoulders of:
 - [SAME](https://github.com/sgx-labs/statelessagent) — agent memory concepts (recency decay, confidence scoring, session tracking)
 - [supermemory](https://github.com/supermemoryai/clawdbot-supermemory) — hook patterns and context surfacing ideas
+## Roadmap
+| Status | Feature | Description |
+|--------|---------|-------------|
+| :white_check_mark: | Adaptive thresholds | Ratio-based filtering that adapts to vault characteristics (v0.1.3) |
+| :white_check_mark: | Deep escalation | Budget-aware query expansion + cross-encoder reranking for `deep` profile |
+| :white_check_mark: | Cloud embedding providers | Jina, OpenAI, Voyage, Cohere with batch embedding + TPM pacing |
+| :construction: | Calibration probes | One-time `clawmem calibrate` command that measures your vault's score distribution and tunes thresholds automatically |
+| :construction: | Rolling threshold learning | Learns optimal activation floor from actual usage patterns (which surfaced content gets referenced vs ignored) |
+| :memo: | Multi-vault namespacing | Isolated vaults with independent calibration and lifecycle policies |
+| :memo: | REST API authentication | Token-scoped access for multi-agent deployments |
+| :memo: | Streaming rerank | Cross-encoder reranking within the hook timeout budget for larger candidate sets |
+:white_check_mark: Shipped&ensp; :construction: Planned&ensp; :memo: Exploring
 ## License
 MIT

package/SKILL.md CHANGED Viewed

@@ -191,6 +191,8 @@ Hooks handle ~90% of retrieval. Zero agent effort.
 **Hook blind spots (by design):** Hooks filter out `_clawmem/` system artifacts, enforce score thresholds, and cap token budget. Absence in `<vault-context>` does NOT mean absence in memory. Escalate to Tier 3 if expected memory wasn't surfaced.
+**Adaptive thresholds:** Context-surfacing uses ratio-based scoring that adapts to vault characteristics. Results are kept within a percentage of the best result's composite score (speed: 65%, balanced: 55%, deep: 45%). An activation floor prevents surfacing when all results are weak. `CLAWMEM_PROFILE=deep` adds query expansion + reranking. MCP tools use fixed absolute thresholds, not adaptive.
 ---
 ## Tier 3 — Agent-Initiated Retrieval (MCP Tools)

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "clawmem",
-  "version": "0.1.3",
+  "version": "0.1.4",
   "description": "On-device context engine and memory for AI agents. Claude Code and OpenClaw. Hooks + MCP server + hybrid RAG search.",
   "type": "module",
   "bin": {

package/src/config.ts CHANGED Viewed

@@ -70,7 +70,16 @@ export interface ProfileConfig {
   maxResults: number;
   useVector: boolean;
   vectorTimeout: number;
+  /** Legacy absolute threshold — used by MCP tools and as fallback when thresholdMode="absolute" */
   minScore: number;
+  /** Adaptive: keep results within this ratio of best score (e.g., 0.55 = top 55%) */
+  minScoreRatio: number;
+  /** Adaptive: never surface below this regardless of ratio */
+  absoluteFloor: number;
+  /** Adaptive: if best result is below this, return empty (prevents all-weak surfacing) */
+  activationFloor: number;
+  /** "adaptive" uses ratio-based filtering; "absolute" uses legacy minScore */
+  thresholdMode: "adaptive" | "absolute";
   /** Budget-aware escalation: if fast path finishes early, spend remaining time on expansion + reranking */
   deepEscalation: boolean;
   /** Max time (ms) allowed for the fast path before escalation is considered */
@@ -78,9 +87,9 @@ export interface ProfileConfig {
 }
 export const PROFILES: Record<PerformanceProfile, ProfileConfig> = {
-  speed:    { tokenBudget: 400,  maxResults: 5,  useVector: false, vectorTimeout: 0,    minScore: 0.55, deepEscalation: false, escalationBudgetMs: 0 },
-  balanced: { tokenBudget: 800,  maxResults: 10, useVector: true,  vectorTimeout: 900,  minScore: 0.45, deepEscalation: false, escalationBudgetMs: 0 },
-  deep:     { tokenBudget: 1200, maxResults: 15, useVector: true,  vectorTimeout: 2000, minScore: 0.25, deepEscalation: true,  escalationBudgetMs: 4000 },
+  speed:    { tokenBudget: 400,  maxResults: 5,  useVector: false, vectorTimeout: 0,    minScore: 0.55, minScoreRatio: 0.65, absoluteFloor: 0.18, activationFloor: 0.24, thresholdMode: "adaptive", deepEscalation: false, escalationBudgetMs: 0 },
+  balanced: { tokenBudget: 800,  maxResults: 10, useVector: true,  vectorTimeout: 900,  minScore: 0.45, minScoreRatio: 0.55, absoluteFloor: 0.15, activationFloor: 0.20, thresholdMode: "adaptive", deepEscalation: false, escalationBudgetMs: 0 },
+  deep:     { tokenBudget: 1200, maxResults: 15, useVector: true,  vectorTimeout: 2000, minScore: 0.25, minScoreRatio: 0.45, absoluteFloor: 0.12, activationFloor: 0.16, thresholdMode: "adaptive", deepEscalation: true,  escalationBudgetMs: 4000 },
 };
 export function getActiveProfile(): ProfileConfig {

package/src/hooks/context-surfacing.ts CHANGED Viewed

@@ -178,7 +178,10 @@ export async function contextSurfacing(
         }
         // Phase 2: Cross-encoder reranking — reorder with deeper relevance signal
+        // Sort by score first so reranking covers the best candidates, not just
+        // the first-inserted (expansion hits appended later would otherwise be missed)
         if (Date.now() - startTime < 6000 && results.length >= 3) {
+          results.sort((a, b) => b.score - a.score);
           const toRerank = results.slice(0, 15).map(r => ({
             file: r.filepath,
             text: (r.body || "").slice(0, 2000),
@@ -253,8 +256,26 @@ export async function contextSurfacing(
   }
   // Apply composite scoring
-  const scored = applyCompositeScoring(enriched, prompt)
-    .filter(r => r.compositeScore >= minScore);
+  const allScored = applyCompositeScoring(enriched, prompt);
+  // Threshold filtering — adaptive (ratio-based) or absolute (legacy)
+  let scored: typeof allScored;
+  if (profile.thresholdMode === "adaptive") {
+    // Use max composite score across the set (not positional [0], which may be
+    // reordered by recency-intent sorting in applyCompositeScoring)
+    const bestScore = allScored.length > 0
+      ? Math.max(...allScored.map(r => r.compositeScore))
+      : 0;
+    // Activation floor: if even the best result is too weak, bail entirely
+    if (bestScore < profile.activationFloor) return makeEmptyOutput("context-surfacing");
+    const adaptiveMin = Math.max(bestScore * profile.minScoreRatio, profile.absoluteFloor);
+    scored = allScored.filter(r => r.compositeScore >= adaptiveMin);
+  } else {
+    // Legacy absolute threshold (backward compat)
+    scored = allScored.filter(r => r.compositeScore >= minScore);
+  }
   if (scored.length === 0) return makeEmptyOutput("context-surfacing");