npm - @joshuaswarren/openclaw-engram - Versions diffs - 9.0.14 → 9.0.16 - Mend

@joshuaswarren/openclaw-engram 9.0.14 → 9.0.16

Files changed (5)

package/README.md CHANGED

@@ -5,6 +5,22 @@
 [![npm version](https://img.shields.io/npm/v/@joshuaswarren/openclaw-engram)](https://www.npmjs.com/package/@joshuaswarren/openclaw-engram)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+## Product Thesis
+Engram is being built around three requirements:
+- **Memory that improves action outcomes**
+- **Memory that survives long horizons and failures**
+- **Memory that can defend itself**
+That product thesis drives the roadmap order:
+1. Evaluation harness and shadow-mode measurement
+2. Objective-state and causal trajectory memory
+3. Trust-zoned memory promotion and poisoning defense
+4. Harmonic retrieval over abstractions plus anchors
+5. Creation-memory, commitments, and recoverability
 ## Why Engram?
 AI agents forget everything between conversations. Engram fixes that.
@@ -13,6 +29,8 @@ AI agents forget everything between conversations. Engram fixes that.
 - **Smart recall** — Before each conversation, Engram injects the most relevant memories into the agent's context. Your agents remember what they need, when they need it.
 - **Local-first** — All memory data stays on your filesystem as plain markdown files. No cloud dependency, no vendor lock-in, fully portable.
 - **Pluggable search** — Choose from six search backends: QMD (hybrid BM25+vector+reranking), LanceDB, Meilisearch, Orama, remote HTTP, or bring your own.
+- **Memory OS features** — Graph recall, temporal memory tree, lifecycle policy, compounding, shared context, memory boxes, and identity continuity can be enabled progressively as your install grows.
+- **Benchmark-first roadmap** — Engram now has an evaluation-harness foundation so memory improvements can be measured on real agent trajectories instead of subjective recall demos.
 - **Zero-config start** — Install, add an API key, restart. Engram works out of the box with sensible defaults and progressively unlocks advanced features as you enable them.
 ## Quick Start
@@ -121,6 +139,7 @@ Engram's capabilities are organized into feature families that you can enable pr
 | **Compounding** | Weekly synthesis that surfaces patterns and recurring mistakes |
 | **Hot/Cold Tiering** | Automatic migration of aging memories to cold storage |
 | **Behavior Loop Tuning** | Runtime self-tuning of extraction and recall parameters |
+| **Evaluation Harness Foundation** | Tracks benchmark packs and run summaries so future PRs can be gated on memory quality instead of anecdotes |
 Start with defaults, then enable features as needed. See [Enable All Features](docs/enable-all-v8.md) for a full-feature config profile.
@@ -130,6 +149,9 @@ Start with defaults, then enable features as needed. See [Enable All Features](d
 openclaw engram stats                        # Memory counts, search status, health
 openclaw engram search "your query"          # Search memories from CLI
 openclaw engram compat --strict              # Compatibility check
+openclaw engram benchmark-status             # Benchmark/eval harness packs, runs, latest summary
+openclaw engram benchmark-validate <path>    # Validate a benchmark manifest or pack directory
+openclaw engram benchmark-import <path>      # Import a validated benchmark pack into the eval store
 openclaw engram conversation-index-health    # Conversation index status
 openclaw engram graph-health                 # Entity graph status
 openclaw engram tier-status                  # Hot/cold tier metrics
@@ -149,6 +171,9 @@ Key settings:
 | `searchBackend` | `"qmd"` | Search engine: `qmd`, `orama`, `lancedb`, `meilisearch`, `remote`, `noop` |
 | `qmdEnabled` | `true` | Enable QMD hybrid search |
 | `memoryDir` | `~/.openclaw/workspace/memory/local` | Memory storage root |
+| `evalHarnessEnabled` | `false` | Enable the evaluation harness foundation for benchmark packs and run summaries |
+| `evalShadowModeEnabled` | `false` | Reserve shadow-mode measurement paths for future benchmark instrumentation |
+| `evalStoreDir` | `{memoryDir}/state/evals` | Root directory for benchmark packs and run summaries |
 Full reference: [Config Reference](docs/config-reference.md)
@@ -158,6 +183,7 @@ Full reference: [Config Reference](docs/config-reference.md)
 - [Search Backends](docs/search-backends.md) — Choosing and configuring search engines
 - [Writing a Search Backend](docs/writing-a-search-backend.md) — Build your own adapter
 - [Config Reference](docs/config-reference.md) — Every setting with defaults
+- [Evaluation Harness](docs/evaluation-harness.md) — Benchmark pack and run-summary format
 - [Architecture Overview](docs/architecture/overview.md) — System design and storage layout
 - [Retrieval Pipeline](docs/architecture/retrieval-pipeline.md) — How recall works
 - [Memory Lifecycle](docs/architecture/memory-lifecycle.md) — Write, consolidation, expiry
@@ -166,6 +192,7 @@ Full reference: [Config Reference](docs/config-reference.md)
 - [Namespaces](docs/namespaces.md) — Multi-agent memory isolation
 - [Shared Context](docs/shared-context.md) — Cross-agent intelligence
 - [Identity Continuity](docs/identity-continuity.md) — Consistent agent personality
+- [Agentic Memory Roadmap](docs/plans/2026-03-06-engram-agentic-memory-roadmap.md) — Benchmark-first roadmap and PR slices
 ## Developer Install

package/dist/index.js CHANGED

@@ -281,6 +281,9 @@ function parseConfig(raw) {
     conversationRecallTopK: typeof cfg.conversationRecallTopK === "number" ? cfg.conversationRecallTopK : 3,
     conversationRecallMaxChars: typeof cfg.conversationRecallMaxChars === "number" ? cfg.conversationRecallMaxChars : 2500,
     conversationRecallTimeoutMs: typeof cfg.conversationRecallTimeoutMs === "number" ? cfg.conversationRecallTimeoutMs : 800,
+    evalHarnessEnabled: cfg.evalHarnessEnabled === true,
+    evalShadowModeEnabled: cfg.evalShadowModeEnabled === true,
+    evalStoreDir: typeof cfg.evalStoreDir === "string" && cfg.evalStoreDir.trim().length > 0 ? cfg.evalStoreDir.trim() : path.join(memoryDir, "state", "evals"),
     // Local LLM Provider (v2.1)
     localLlmEnabled: cfg.localLlmEnabled === true || cfg.localLlmEnabled === "true",
     // default: false
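
Note on the hunk above: the two new eval flags are parsed more strictly than the neighbouring `localLlmEnabled`. Only the boolean `true` enables them, whereas `localLlmEnabled` also accepts the string `"true"`. The sketch below reproduces just the three added lines to make that visible. `parseEvalConfig` is a hypothetical helper name, and `memoryDir` is passed in as an argument here instead of being resolved the way the real `parseConfig` does it.

```javascript
const path = require("node:path");

// Hypothetical helper: mirrors only the three eval-related lines added to parseConfig.
function parseEvalConfig(cfg, memoryDir) {
  // A blank or non-string evalStoreDir falls back to the default location.
  const override = typeof cfg.evalStoreDir === "string" ? cfg.evalStoreDir.trim() : "";
  return {
    evalHarnessEnabled: cfg.evalHarnessEnabled === true,
    evalShadowModeEnabled: cfg.evalShadowModeEnabled === true,
    evalStoreDir: override.length > 0 ? override : path.join(memoryDir, "state", "evals"),
  };
}

// "/tmp/memory" and "/data/evals" are placeholder paths for illustration.
const defaults = parseEvalConfig({}, "/tmp/memory");
const stringFlag = parseEvalConfig({ evalHarnessEnabled: "true", evalStoreDir: "   " }, "/tmp/memory");
const custom = parseEvalConfig({ evalHarnessEnabled: true, evalStoreDir: " /data/evals " }, "/tmp/memory");

console.log(defaults.evalHarnessEnabled, stringFlag.evalHarnessEnabled, custom.evalHarnessEnabled);
console.log(custom.evalStoreDir);
```

In this sketch a config that sets `evalHarnessEnabled: "true"` (a string) leaves the harness disabled, which may be worth checking when the setting comes from a hand-edited JSON file.
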
@@ -22908,8 +22911,8 @@ promotionCandidates: ${res.promotionCandidateCount}`
 }
 // src/cli.ts
-import path50 from "path";
-import { access as access3, readFile as readFile36, readdir as readdir22, unlink as unlink7 } from "fs/promises";
+import path51 from "path";
+import { access as access3, readFile as readFile37, readdir as readdir23, unlink as unlink7 } from "fs/promises";
 import { createHash as createHash10 } from "crypto";
 // src/transfer/export-json.ts
@@ -23789,8 +23792,8 @@ function gatherCandidates(input, warnings) {
       const record = rec;
       const content = typeof record.content === "string" ? record.content : null;
       if (!content) continue;
-      const path52 = typeof record.path === "string" ? record.path : "";
-      if (!path52.startsWith("transcripts/") && !path52.includes("/transcripts/")) continue;
+      const path53 = typeof record.path === "string" ? record.path : "";
+      if (!path53.startsWith("transcripts/") && !path53.includes("/transcripts/")) continue;
       rows.push(...parseJsonl(content, warnings));
     }
     return rows;
@@ -25398,6 +25401,277 @@ async function runCompatChecks(options) {
   };
 }
+// src/evals.ts
+import path50 from "path";
+import { cp, mkdir as mkdir33, readFile as readFile36, readdir as readdir22, rm as rm5, stat as stat11 } from "fs/promises";
+function isRecord(value) {
+  return typeof value === "object" && value !== null && !Array.isArray(value);
+}
+function assertString(value, field) {
+  if (typeof value !== "string" || value.trim().length === 0) {
+    throw new Error(`${field} must be a non-empty string`);
+  }
+  return value.trim();
+}
+function optionalStringArray(value, field) {
+  if (value === void 0) return void 0;
+  if (!Array.isArray(value)) {
+    throw new Error(`${field} must be an array of strings`);
+  }
+  const out = value.filter((item) => typeof item === "string").map((item) => item.trim()).filter((item) => item.length > 0);
+  if (out.length !== value.length) {
+    throw new Error(`${field} must be an array of non-empty strings`);
+  }
+  return out;
+}
+function resolveEvalStoreDir(memoryDir, overrideDir) {
+  if (typeof overrideDir === "string" && overrideDir.trim().length > 0) {
+    return overrideDir.trim();
+  }
+  return path50.join(memoryDir, "state", "evals");
+}
+function assertSafeBenchmarkId(benchmarkId) {
+  if (benchmarkId === "." || benchmarkId === ".." || benchmarkId.includes("/") || benchmarkId.includes("\\")) {
+    throw new Error("benchmarkId must be a safe path segment");
+  }
+  return benchmarkId;
+}
+function validateEvalBenchmarkManifest(raw) {
+  if (!isRecord(raw)) throw new Error("benchmark manifest must be an object");
+  if (raw.schemaVersion !== 1) throw new Error("schemaVersion must be 1");
+  if (!Array.isArray(raw.cases)) throw new Error("cases must be an array");
+  const cases = raw.cases.map((item, index) => {
+    if (!isRecord(item)) throw new Error(`cases[${index}] must be an object`);
+    return {
+      id: assertString(item.id, `cases[${index}].id`),
+      prompt: assertString(item.prompt, `cases[${index}].prompt`),
+      expectedSignals: optionalStringArray(item.expectedSignals, `cases[${index}].expectedSignals`),
+      notes: typeof item.notes === "string" && item.notes.trim().length > 0 ? item.notes.trim() : void 0
+    };
+  });
+  return {
+    schemaVersion: 1,
+    benchmarkId: assertString(raw.benchmarkId, "benchmarkId"),
+    title: assertString(raw.title, "title"),
+    description: typeof raw.description === "string" && raw.description.trim().length > 0 ? raw.description.trim() : void 0,
+    tags: optionalStringArray(raw.tags, "tags"),
+    sourceLinks: optionalStringArray(raw.sourceLinks, "sourceLinks"),
+    cases
+  };
+}
+function validateEvalRunSummary(raw) {
+  if (!isRecord(raw)) throw new Error("eval run summary must be an object");
+  if (raw.schemaVersion !== 1) throw new Error("schemaVersion must be 1");
+  const status = assertString(raw.status, "status");
+  if (!["running", "completed", "failed", "partial"].includes(status)) {
+    throw new Error("status must be one of running|completed|failed|partial");
+  }
+  const totalCases = Number(raw.totalCases);
+  const passedCases = Number(raw.passedCases);
+  const failedCases = Number(raw.failedCases);
+  if (!Number.isFinite(totalCases) || totalCases < 0) throw new Error("totalCases must be a non-negative number");
+  if (!Number.isFinite(passedCases) || passedCases < 0) throw new Error("passedCases must be a non-negative number");
+  if (!Number.isFinite(failedCases) || failedCases < 0) throw new Error("failedCases must be a non-negative number");
+  const metrics = isRecord(raw.metrics) ? {
+    recallPrecisionAtK: typeof raw.metrics.recallPrecisionAtK === "number" ? raw.metrics.recallPrecisionAtK : void 0,
+    actionOutcomeScore: typeof raw.metrics.actionOutcomeScore === "number" ? raw.metrics.actionOutcomeScore : void 0,
+    objectiveStateCoverage: typeof raw.metrics.objectiveStateCoverage === "number" ? raw.metrics.objectiveStateCoverage : void 0,
+    causalPathRecall: typeof raw.metrics.causalPathRecall === "number" ? raw.metrics.causalPathRecall : void 0,
+    trustViolationRate: typeof raw.metrics.trustViolationRate === "number" ? raw.metrics.trustViolationRate : void 0,
+    creationRecoveryScore: typeof raw.metrics.creationRecoveryScore === "number" ? raw.metrics.creationRecoveryScore : void 0
+  } : void 0;
+  return {
+    schemaVersion: 1,
+    runId: assertString(raw.runId, "runId"),
+    benchmarkId: assertString(raw.benchmarkId, "benchmarkId"),
+    status,
+    startedAt: assertString(raw.startedAt, "startedAt"),
+    completedAt: typeof raw.completedAt === "string" && raw.completedAt.trim().length > 0 ? raw.completedAt.trim() : void 0,
+    totalCases,
+    passedCases,
+    failedCases,
+    metrics,
+    notes: typeof raw.notes === "string" && raw.notes.trim().length > 0 ? raw.notes.trim() : void 0,
+    gitRef: typeof raw.gitRef === "string" && raw.gitRef.trim().length > 0 ? raw.gitRef.trim() : void 0
+  };
+}
+async function listJsonFiles(dir) {
+  try {
+    const entries = await readdir22(dir, { withFileTypes: true });
+    const out = [];
+    for (const entry of entries) {
+      const fullPath = path50.join(dir, entry.name);
+      if (entry.isDirectory()) {
+        out.push(...await listJsonFiles(fullPath));
+      } else if (entry.isFile() && entry.name.endsWith(".json")) {
+        out.push(fullPath);
+      }
+    }
+    return out.sort();
+  } catch {
+    return [];
+  }
+}
+async function listNamedFiles(dir, fileName) {
+  try {
+    const entries = await readdir22(dir, { withFileTypes: true });
+    const out = [];
+    for (const entry of entries) {
+      const fullPath = path50.join(dir, entry.name);
+      if (entry.isDirectory()) {
+        out.push(...await listNamedFiles(fullPath, fileName));
+      } else if (entry.isFile() && entry.name === fileName) {
+        out.push(fullPath);
+      }
+    }
+    return out.sort();
+  } catch {
+    return [];
+  }
+}
+async function readJsonFile2(filePath) {
+  return JSON.parse(await readFile36(filePath, "utf-8"));
+}
+async function resolveBenchmarkManifestPath(sourcePath) {
+  const info = await stat11(sourcePath);
+  if (info.isDirectory()) {
+    return {
+      sourceKind: "directory",
+      manifestPath: path50.join(sourcePath, "manifest.json")
+    };
+  }
+  if (info.isFile()) {
+    return {
+      sourceKind: "file",
+      manifestPath: sourcePath
+    };
+  }
+  throw new Error("benchmark pack source must be a file or directory");
+}
+async function validateEvalBenchmarkPack(sourcePath) {
+  const trimmedSourcePath = sourcePath.trim();
+  if (trimmedSourcePath.length === 0) {
+    throw new Error("benchmark pack path must be a non-empty string");
+  }
+  const { manifestPath } = await resolveBenchmarkManifestPath(trimmedSourcePath);
+  const manifest = validateEvalBenchmarkManifest(await readJsonFile2(manifestPath));
+  return {
+    sourcePath: trimmedSourcePath,
+    manifestPath,
+    benchmarkId: assertSafeBenchmarkId(manifest.benchmarkId),
+    title: manifest.title,
+    totalCases: manifest.cases.length,
+    tags: [...manifest.tags ?? []],
+    sourceLinks: [...manifest.sourceLinks ?? []]
+  };
+}
+async function importEvalBenchmarkPack(options) {
+  const summary = await validateEvalBenchmarkPack(options.sourcePath);
+  const rootDir = resolveEvalStoreDir(options.memoryDir, options.evalStoreDir);
+  const benchmarkDir = path50.join(rootDir, "benchmarks");
+  const targetDir = path50.join(benchmarkDir, summary.benchmarkId);
+  const { sourceKind, manifestPath } = await resolveBenchmarkManifestPath(summary.sourcePath);
+  let overwritten = false;
+  try {
+    await stat11(targetDir);
+    if (options.force !== true) {
+      throw new Error(`benchmark pack already exists at ${targetDir}; rerun with force to replace it`);
+    }
+    overwritten = true;
+    await rm5(targetDir, { recursive: true, force: true });
+  } catch (error) {
+    if (!(error instanceof Error) || !("code" in error) || error.code !== "ENOENT") {
+      throw error;
+    }
+  }
+  await mkdir33(benchmarkDir, { recursive: true });
+  if (sourceKind === "directory") {
+    await cp(summary.sourcePath, targetDir, { recursive: true });
+  } else {
+    await mkdir33(targetDir, { recursive: true });
+    await cp(manifestPath, path50.join(targetDir, "manifest.json"));
+  }
+  return {
+    ...summary,
+    targetDir,
+    overwritten
+  };
+}
+async function getEvalHarnessStatus(options) {
+  const rootDir = resolveEvalStoreDir(options.memoryDir, options.evalStoreDir);
+  const benchmarkDir = path50.join(rootDir, "benchmarks");
+  const runsDir = path50.join(rootDir, "runs");
+  const benchmarkFiles = await listNamedFiles(benchmarkDir, "manifest.json");
+  const runFiles = await listJsonFiles(runsDir);
+  const invalidBenchmarks = [];
+  const invalidRuns = [];
+  const manifests = [];
+  for (const filePath of benchmarkFiles) {
+    try {
+      manifests.push(validateEvalBenchmarkManifest(await readJsonFile2(filePath)));
+    } catch (error) {
+      invalidBenchmarks.push({
+        path: filePath,
+        error: error instanceof Error ? error.message : String(error)
+      });
+    }
+  }
+  const runs = [];
+  for (const filePath of runFiles) {
+    try {
+      runs.push(validateEvalRunSummary(await readJsonFile2(filePath)));
+    } catch (error) {
+      invalidRuns.push({
+        path: filePath,
+        error: error instanceof Error ? error.message : String(error)
+      });
+    }
+  }
+  runs.sort((a, b) => {
+    const aTime = Date.parse(a.completedAt ?? a.startedAt);
+    const bTime = Date.parse(b.completedAt ?? b.startedAt);
+    return (Number.isNaN(bTime) ? 0 : bTime) - (Number.isNaN(aTime) ? 0 : aTime);
+  });
+  const latestRun = runs[0];
+  const tags = /* @__PURE__ */ new Set();
+  const sourceLinks = /* @__PURE__ */ new Set();
+  let totalCases = 0;
+  for (const manifest of manifests) {
+    totalCases += manifest.cases.length;
+    for (const tag of manifest.tags ?? []) tags.add(tag);
+    for (const link of manifest.sourceLinks ?? []) sourceLinks.add(link);
+  }
+  return {
+    enabled: options.enabled,
+    shadowModeEnabled: options.shadowModeEnabled,
+    rootDir,
+    benchmarkDir,
+    runsDir,
+    benchmarks: {
+      total: benchmarkFiles.length,
+      valid: manifests.length,
+      invalid: invalidBenchmarks.length,
+      totalCases,
+      tags: [...tags].sort(),
+      sourceLinks: [...sourceLinks].sort()
+    },
+    runs: {
+      total: runFiles.length,
+      invalid: invalidRuns.length,
+      completed: runs.filter((run) => run.status === "completed").length,
+      failed: runs.filter((run) => run.status === "failed").length,
+      partial: runs.filter((run) => run.status === "partial").length,
+      running: runs.filter((run) => run.status === "running").length,
+      latestRunId: latestRun?.runId,
+      latestBenchmarkId: latestRun?.benchmarkId,
+      latestCompletedAt: latestRun?.completedAt
+    },
+    latestRun,
+    invalidBenchmarks,
+    invalidRuns
+  };
+}
 // src/cli.ts
 function rankCandidateForKeep(a, b) {
   const aConfidence = typeof a.frontmatter.confidence === "number" ? a.frontmatter.confidence : 0;
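
For orientation, a manifest that satisfies the rules in `validateEvalBenchmarkManifest` could look like the object below. All field values are invented for illustration. `checkManifestShape` is a hypothetical, simplified stand-in that covers only the required fields (`schemaVersion`, `benchmarkId`, `title`, and each case's `id` and `prompt`) and collects problems instead of throwing on the first one. It is not the bundled validator.

```javascript
// Illustrative manifest; every value here is made up for this sketch.
const exampleManifest = {
  schemaVersion: 1,
  benchmarkId: "recall-smoke",
  title: "Recall smoke test",
  tags: ["recall"],
  cases: [
    { id: "case-1", prompt: "What editor does the user prefer?", expectedSignals: ["vim"] },
  ],
};

const isNonEmptyString = (value) => typeof value === "string" && value.trim().length > 0;
const isPlainObject = (value) => typeof value === "object" && value !== null && !Array.isArray(value);

// Hypothetical simplified check: returns a list of problems rather than throwing.
function checkManifestShape(raw) {
  if (!isPlainObject(raw)) return ["benchmark manifest must be an object"];
  const problems = [];
  if (raw.schemaVersion !== 1) problems.push("schemaVersion must be 1");
  if (!isNonEmptyString(raw.benchmarkId)) problems.push("benchmarkId must be a non-empty string");
  if (!isNonEmptyString(raw.title)) problems.push("title must be a non-empty string");
  if (!Array.isArray(raw.cases)) {
    problems.push("cases must be an array");
    return problems;
  }
  raw.cases.forEach((item, index) => {
    if (!isPlainObject(item)) {
      problems.push(`cases[${index}] must be an object`);
      return;
    }
    if (!isNonEmptyString(item.id)) problems.push(`cases[${index}].id must be a non-empty string`);
    if (!isNonEmptyString(item.prompt)) problems.push(`cases[${index}].prompt must be a non-empty string`);
  });
  return problems;
}

console.log(checkManifestShape(exampleManifest));
console.log(checkManifestShape({ schemaVersion: 2, cases: [] }));
```

Per `resolveBenchmarkManifestPath`, a pack given as a directory is expected to hold this content in a file named `manifest.json`, while a pack given as a file is treated as the manifest itself.
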
@@ -25554,6 +25828,25 @@ async function runGraphHealthCliCommand(options) {
     includeRepairGuidance: options.includeRepairGuidance
   });
 }
+async function runBenchmarkStatusCliCommand(options) {
+  return getEvalHarnessStatus({
+    memoryDir: options.memoryDir,
+    evalStoreDir: options.evalStoreDir,
+    enabled: options.evalHarnessEnabled,
+    shadowModeEnabled: options.evalShadowModeEnabled
+  });
+}
+async function runBenchmarkValidateCliCommand(options) {
+  return validateEvalBenchmarkPack(options.path);
+}
+async function runBenchmarkImportCliCommand(options) {
+  return importEvalBenchmarkPack({
+    sourcePath: options.path,
+    memoryDir: options.memoryDir,
+    evalStoreDir: options.evalStoreDir,
+    force: options.force === true
+  });
+}
 async function runSessionCheckCliCommand(options) {
   return analyzeSessionIntegrity({ memoryDir: options.memoryDir });
 }
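
`runBenchmarkStatusCliCommand` delegates to `getEvalHarnessStatus`, which picks the latest run by sorting run summaries newest-first on `completedAt`, falling back to `startedAt`, and treating unparseable timestamps as 0. The sketch below isolates that ordering. The run objects and the helper name `sortRunsNewestFirst` are invented for illustration, and unlike the bundled code this version sorts a copy and leaves the input array untouched.

```javascript
// Hypothetical helper isolating the run ordering used to pick the latest run.
function sortRunsNewestFirst(runs) {
  const toTime = (run) => {
    const parsed = Date.parse(run.completedAt ?? run.startedAt);
    return Number.isNaN(parsed) ? 0 : parsed; // unparseable timestamps sort last
  };
  return [...runs].sort((a, b) => toTime(b) - toTime(a));
}

// Made-up run summaries, trimmed to the fields that matter for ordering.
const sampleRuns = [
  { runId: "run-a", status: "completed", startedAt: "2026-03-01T10:00:00Z", completedAt: "2026-03-01T10:05:00Z" },
  { runId: "run-b", status: "running", startedAt: "2026-03-02T09:00:00Z" },
  { runId: "run-c", status: "failed", startedAt: "not-a-date" },
];

const ordered = sortRunsNewestFirst(sampleRuns).map((run) => run.runId);
console.log(ordered);
```

One consequence of this ordering: a run that is still `running` can be the latest run, in which case `latestCompletedAt` in the status output is undefined.
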
@@ -25781,7 +26074,7 @@ function policyVersionForValues(values, config) {
   return createHash10("sha256").update(JSON.stringify(normalized)).digest("hex").slice(0, 12);
 }
 async function readRuntimePolicySnapshot2(config, fileName) {
-  const filePath = path50.join(config.memoryDir, "state", fileName);
+  const filePath = path51.join(config.memoryDir, "state", fileName);
   const snapshot = await readRuntimePolicySnapshot(filePath, {
     maxStaleDecayThreshold: config.lifecycleArchiveDecayThreshold
   });
@@ -26281,7 +26574,7 @@ async function withTimeout(promise, timeoutMs, timeoutMessage) {
 }
 async function runReplayCliCommand(orchestrator, options) {
   const extractionIdleTimeoutMs = Number.isFinite(options.extractionIdleTimeoutMs) ? Math.max(1e3, Math.floor(options.extractionIdleTimeoutMs)) : 15 * 6e4;
-  const inputRaw = await readFile36(options.inputPath, "utf-8");
+  const inputRaw = await readFile37(options.inputPath, "utf-8");
   const registry = buildReplayNormalizerRegistry([
     openclawReplayNormalizer,
     claudeReplayNormalizer,
@@ -26346,7 +26639,7 @@ async function runReplayCliCommand(orchestrator, options) {
 async function getPluginVersion() {
   try {
     const pkgPath = new URL("../package.json", import.meta.url);
-    const raw = await readFile36(pkgPath, "utf-8");
+    const raw = await readFile37(pkgPath, "utf-8");
     const parsed = JSON.parse(raw);
     return parsed.version ?? "unknown";
   } catch {
@@ -26365,32 +26658,32 @@ async function resolveMemoryDirForNamespace(orchestrator, namespace) {
   const ns = (namespace ?? "").trim();
   if (!ns) return orchestrator.config.memoryDir;
   if (!orchestrator.config.namespacesEnabled) return orchestrator.config.memoryDir;
-  const candidate = path50.join(orchestrator.config.memoryDir, "namespaces", ns);
+  const candidate = path51.join(orchestrator.config.memoryDir, "namespaces", ns);
   if (ns === orchestrator.config.defaultNamespace) {
     return await exists2(candidate) ? candidate : orchestrator.config.memoryDir;
   }
   return candidate;
 }
 async function readAllMemoryFiles(memoryDir) {
-  const roots = [path50.join(memoryDir, "facts"), path50.join(memoryDir, "corrections")];
+  const roots = [path51.join(memoryDir, "facts"), path51.join(memoryDir, "corrections")];
   const out = [];
   const walk = async (dir) => {
     let entries;
     try {
-      entries = await readdir22(dir, { withFileTypes: true });
+      entries = await readdir23(dir, { withFileTypes: true });
     } catch {
       return;
     }
     for (const entry of entries) {
       const entryName = typeof entry.name === "string" ? entry.name : entry.name.toString("utf-8");
-      const fullPath = path50.join(dir, entryName);
+      const fullPath = path51.join(dir, entryName);
       if (entry.isDirectory()) {
         await walk(fullPath);
         continue;
       }
       if (!entry.isFile() || !entryName.endsWith(".md")) continue;
       try {
-        const raw = await readFile36(fullPath, "utf-8");
+        const raw = await readFile37(fullPath, "utf-8");
         const parsed = raw.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
         if (!parsed) continue;
         const fmRaw = parsed[1];
@@ -26651,6 +26944,36 @@ function registerCli(api, orchestrator) {
         }
         console.log("OK");
       });
+      cmd.command("benchmark-status").description("Show benchmark/evaluation harness status, benchmark packs, and latest run summary").action(async () => {
+        const status = await runBenchmarkStatusCliCommand({
+          memoryDir: orchestrator.config.memoryDir,
+          evalStoreDir: orchestrator.config.evalStoreDir,
+          evalHarnessEnabled: orchestrator.config.evalHarnessEnabled,
+          evalShadowModeEnabled: orchestrator.config.evalShadowModeEnabled
+        });
+        console.log(JSON.stringify(status, null, 2));
+        console.log("OK");
+      });
+      cmd.command("benchmark-validate").description("Validate a benchmark manifest file or pack directory without importing it").argument("<path>", "Path to a benchmark manifest JSON file or a directory with manifest.json").action(async (...args) => {
+        const inputPath = args[0];
+        const summary = await runBenchmarkValidateCliCommand({
+          path: typeof inputPath === "string" ? inputPath : ""
+        });
+        console.log(JSON.stringify(summary, null, 2));
+        console.log("OK");
+      });
+      cmd.command("benchmark-import").description("Validate and import a benchmark manifest file or pack directory into Engram's eval store").argument("<path>", "Path to a benchmark manifest JSON file or a directory with manifest.json").option("--force", "Replace an existing imported benchmark pack with the same benchmarkId").action(async (...args) => {
+        const inputPath = args[0];
+        const options = args[1] ?? {};
+        const summary = await runBenchmarkImportCliCommand({
+          path: typeof inputPath === "string" ? inputPath : "",
+          memoryDir: orchestrator.config.memoryDir,
+          evalStoreDir: orchestrator.config.evalStoreDir,
+          force: options.force === true
+        });
+        console.log(JSON.stringify(summary, null, 2));
+        console.log("OK");
+      });
       cmd.command("conversation-index-health").description("Show conversation index backend health and index stats").action(async () => {
         const health = await runConversationIndexHealthCliCommand(orchestrator);
         console.log(JSON.stringify(health, null, 2));
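
`benchmark-import` copies a validated pack to `<evalStoreDir>/benchmarks/<benchmarkId>`, and `assertSafeBenchmarkId` rejects IDs that could escape that directory. A minimal sketch of the target resolution follows, assuming the eval store root has already been resolved. `resolveImportTarget` is a hypothetical name, and the paths are placeholders.

```javascript
const path = require("node:path");

// Hypothetical helper combining the safe-segment check with the target path join.
function resolveImportTarget(evalStoreRoot, benchmarkId) {
  const unsafe =
    benchmarkId === "." ||
    benchmarkId === ".." ||
    benchmarkId.includes("/") ||
    benchmarkId.includes("\\");
  if (unsafe) {
    throw new Error("benchmarkId must be a safe path segment");
  }
  return path.join(evalStoreRoot, "benchmarks", benchmarkId);
}

const target = resolveImportTarget("/tmp/evals", "recall-smoke");
console.log(target);

// IDs containing separators or dot segments are rejected before any copy happens.
let rejected = false;
try {
  resolveImportTarget("/tmp/evals", "../outside");
} catch (error) {
  rejected = error.message === "benchmarkId must be a safe path segment";
}
console.log(rejected);
```

In the bundled code an existing target directory causes the import to fail unless `--force` is passed, in which case the directory is removed before the copy.
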
@@ -27300,7 +27623,7 @@ function registerCli(api, orchestrator) {
         }
       });
       cmd.command("identity").description("Show agent identity reflections").action(async () => {
-        const workspaceDir = path50.join(process.env.HOME ?? "~", ".openclaw", "workspace");
+        const workspaceDir = path51.join(process.env.HOME ?? "~", ".openclaw", "workspace");
         const identity = await orchestrator.storage.readIdentity(workspaceDir);
         if (!identity) {
           console.log("No identity file found.");
@@ -27523,8 +27846,8 @@ function registerCli(api, orchestrator) {
         const options = args[0] ?? {};
         const threadId = options.thread;
         const top = parseInt(options.top ?? "10", 10);
-        const memoryDir = path50.join(process.env.HOME ?? "~", ".openclaw", "workspace", "memory", "local");
-        const threading = new ThreadingManager(path50.join(memoryDir, "threads"));
+        const memoryDir = path51.join(process.env.HOME ?? "~", ".openclaw", "workspace", "memory", "local");
+        const threading = new ThreadingManager(path51.join(memoryDir, "threads"));
         if (threadId) {
           const thread = await threading.loadThread(threadId);
           if (!thread) {
@@ -27697,9 +28020,9 @@ function parseDuration(duration) {
 }
 // src/index.ts
-import { readFile as readFile37, writeFile as writeFile29 } from "fs/promises";
+import { readFile as readFile38, writeFile as writeFile29 } from "fs/promises";
 import { readFileSync as readFileSync4 } from "fs";
-import path51 from "path";
+import path52 from "path";
 import os6 from "os";
 var ENGRAM_REGISTERED_GUARD = "__openclawEngramRegistered";
 var ENGRAM_HOOK_APIS = "__openclawEngramHookApis";
@@ -27707,7 +28030,7 @@ function loadPluginConfigFromFile() {
   try {
     const explicitConfigPath = process.env.OPENCLAW_ENGRAM_CONFIG_PATH || process.env.OPENCLAW_CONFIG_PATH;
     const homeDir = process.env.HOME ?? os6.homedir();
-    const configPath = explicitConfigPath && explicitConfigPath.length > 0 ? explicitConfigPath : path51.join(homeDir, ".openclaw", "openclaw.json");
+    const configPath = explicitConfigPath && explicitConfigPath.length > 0 ? explicitConfigPath : path52.join(homeDir, ".openclaw", "openclaw.json");
     const content = readFileSync4(configPath, "utf-8");
     const config = JSON.parse(content);
     const pluginEntry = config?.plugins?.entries?.["openclaw-engram"];
@@ -27944,7 +28267,7 @@ Use this context naturally when relevant. Never quote or expose this memory cont
                 `session reset via API for ${sessionKey}, new sessionId=${result.sessionId}`
               );
               const safeSessionKey = sanitizeSessionKeyForFilename(sessionKey);
-              const signalPath = path51.join(
+              const signalPath = path52.join(
                 workspaceDir,
                 `.compaction-reset-signal-${safeSessionKey}`
               );
@@ -27975,11 +28298,11 @@ Use this context naturally when relevant. Never quote or expose this memory cont
     );
     async function ensureHourlySummaryCron(api2) {
       const jobId = "engram-hourly-summary";
-      const cronFilePath = path51.join(os6.homedir(), ".openclaw", "cron", "jobs.json");
+      const cronFilePath = path52.join(os6.homedir(), ".openclaw", "cron", "jobs.json");
       try {
         let jobsData = { version: 1, jobs: [] };
         try {
-          const content = await readFile37(cronFilePath, "utf-8");
+          const content = await readFile38(cronFilePath, "utf-8");
           jobsData = JSON.parse(content);
         } catch {
         }