@joshuaswarren/openclaw-engram 9.0.16 → 9.0.18

package/README.md CHANGED
@@ -30,7 +30,7 @@ AI agents forget everything between conversations. Engram fixes that.
  - **Local-first** — All memory data stays on your filesystem as plain markdown files. No cloud dependency, no vendor lock-in, fully portable.
  - **Pluggable search** — Choose from six search backends: QMD (hybrid BM25+vector+reranking), LanceDB, Meilisearch, Orama, remote HTTP, or bring your own.
  - **Memory OS features** — Graph recall, temporal memory tree, lifecycle policy, compounding, shared context, memory boxes, and identity continuity can be enabled progressively as your install grows.
- - **Benchmark-first roadmap** — Engram now has an evaluation-harness foundation so memory improvements can be measured on real agent trajectories instead of subjective recall demos.
+ - **Benchmark-first roadmap** — Engram now has an evaluation harness with live shadow recall recording and a CI benchmark delta gate, so memory improvements can be measured and regression-checked instead of argued from anecdotes.
  - **Zero-config start** — Install, add an API key, restart. Engram works out of the box with sensible defaults and progressively unlocks advanced features as you enable them.
 
  ## Quick Start
@@ -139,7 +139,7 @@ Engram's capabilities are organized into feature families that you can enable pr
  | **Compounding** | Weekly synthesis that surfaces patterns and recurring mistakes |
  | **Hot/Cold Tiering** | Automatic migration of aging memories to cold storage |
  | **Behavior Loop Tuning** | Runtime self-tuning of extraction and recall parameters |
- | **Evaluation Harness Foundation** | Tracks benchmark packs and run summaries so future PRs can be gated on memory quality instead of anecdotes |
+ | **Evaluation Harness** | Tracks benchmark packs, run summaries, live shadow recall records, and CI delta comparisons so future PRs can be gated on memory quality instead of anecdotes |
 
  Start with defaults, then enable features as needed. See [Enable All Features](docs/enable-all-v8.md) for a full-feature config profile.
 
@@ -149,9 +149,10 @@ Start with defaults, then enable features as needed. See [Enable All Features](d
  openclaw engram stats # Memory counts, search status, health
  openclaw engram search "your query" # Search memories from CLI
  openclaw engram compat --strict # Compatibility check
- openclaw engram benchmark-status # Benchmark/eval harness packs, runs, latest summary
+ openclaw engram benchmark-status # Benchmark/eval harness packs, runs, shadow recalls, latest summaries
  openclaw engram benchmark-validate <path> # Validate a benchmark manifest or pack directory
  openclaw engram benchmark-import <path> # Import a validated benchmark pack into the eval store
+ openclaw engram benchmark-ci-gate # Compare base vs candidate eval stores and fail on regressions
  openclaw engram conversation-index-health # Conversation index status
  openclaw engram graph-health # Entity graph status
  openclaw engram tier-status # Hot/cold tier metrics
@@ -171,9 +172,9 @@ Key settings:
  | `searchBackend` | `"qmd"` | Search engine: `qmd`, `orama`, `lancedb`, `meilisearch`, `remote`, `noop` |
  | `qmdEnabled` | `true` | Enable QMD hybrid search |
  | `memoryDir` | `~/.openclaw/workspace/memory/local` | Memory storage root |
- | `evalHarnessEnabled` | `false` | Enable the evaluation harness foundation for benchmark packs and run summaries |
- | `evalShadowModeEnabled` | `false` | Reserve shadow-mode measurement paths for future benchmark instrumentation |
- | `evalStoreDir` | `{memoryDir}/state/evals` | Root directory for benchmark packs and run summaries |
+ | `evalHarnessEnabled` | `false` | Enable the evaluation harness for benchmark packs, run summaries, and shadow recall bookkeeping |
+ | `evalShadowModeEnabled` | `false` | Record live recall decisions to the eval store without changing injected output |
+ | `evalStoreDir` | `{memoryDir}/state/evals` | Root directory for benchmark packs, run summaries, and shadow recall records |
 
  Full reference: [Config Reference](docs/config-reference.md)
 
@@ -183,7 +184,7 @@ Full reference: [Config Reference](docs/config-reference.md)
  - [Search Backends](docs/search-backends.md) — Choosing and configuring search engines
  - [Writing a Search Backend](docs/writing-a-search-backend.md) — Build your own adapter
  - [Config Reference](docs/config-reference.md) — Every setting with defaults
- - [Evaluation Harness](docs/evaluation-harness.md) — Benchmark pack and run-summary format
+ - [Evaluation Harness](docs/evaluation-harness.md) — Benchmark pack, shadow recall, and CI delta gate format
  - [Architecture Overview](docs/architecture/overview.md) — System design and storage layout
  - [Retrieval Pipeline](docs/architecture/retrieval-pipeline.md) — How recall works
  - [Memory Lifecycle](docs/architecture/memory-lifecycle.md) — Write, consolidation, expiry
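
The `benchmark-ci-gate` command added in this diff is described as comparing base and candidate eval stores and failing on regressions. The diff does not show how that comparison works, so the following is only a minimal TypeScript sketch of the general idea. The `RunSummary` shape, the `score` field (assumed higher-is-better), the `tolerance` parameter, and the `benchmarkDeltaGate` function are all hypothetical and are not taken from the package.

```typescript
// Hypothetical summary shape; the real eval-store format is not shown in this diff.
interface RunSummary {
  packId: string; // benchmark pack identifier (assumed field name)
  score: number;  // aggregate quality score, higher is better (assumed convention)
}

interface Regression {
  packId: string;
  base: number;
  candidate: number;
  delta: number; // candidate minus base; negative means the candidate is worse
}

interface GateResult {
  passed: boolean;
  regressions: Regression[];
}

// Compare candidate scores against base scores pack by pack.
// A pack counts as a regression when its score drops by more than `tolerance`.
function benchmarkDeltaGate(
  base: RunSummary[],
  candidate: RunSummary[],
  tolerance: number = 0,
): GateResult {
  const candidateByPack = new Map<string, number>();
  for (const c of candidate) {
    candidateByPack.set(c.packId, c.score);
  }

  const regressions: Regression[] = [];
  for (const b of base) {
    const candScore = candidateByPack.get(b.packId);
    if (candScore === undefined) {
      continue; // pack absent from the candidate run: ignored in this sketch
    }
    const delta = candScore - b.score;
    if (delta < -tolerance) {
      regressions.push({ packId: b.packId, base: b.score, candidate: candScore, delta });
    }
  }

  return { passed: regressions.length === 0, regressions };
}
```

A CI job built on a check like this would typically exit non-zero when `passed` is `false`, so that a pull request cannot merge while a benchmark pack scores worse than on the base branch. Whether the actual command uses a tolerance, or how it treats packs missing from one side, is not stated in the diff.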