ai-memory-layer 2.0.1 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (186)
  1. package/CHANGELOG.md +19 -12
  2. package/README.md +435 -320
  3. package/bin/memory-server.mjs +0 -0
  4. package/dist/adapters/memory/embeddings.d.ts.map +1 -1
  5. package/dist/adapters/memory/embeddings.js +12 -1
  6. package/dist/adapters/memory/embeddings.js.map +1 -1
  7. package/dist/adapters/memory/index.d.ts.map +1 -1
  8. package/dist/adapters/memory/index.js +1281 -48
  9. package/dist/adapters/memory/index.js.map +1 -1
  10. package/dist/adapters/postgres/index.d.ts +1 -0
  11. package/dist/adapters/postgres/index.d.ts.map +1 -1
  12. package/dist/adapters/postgres/index.js +1770 -42
  13. package/dist/adapters/postgres/index.js.map +1 -1
  14. package/dist/adapters/sqlite/embeddings.d.ts.map +1 -1
  15. package/dist/adapters/sqlite/embeddings.js +49 -12
  16. package/dist/adapters/sqlite/embeddings.js.map +1 -1
  17. package/dist/adapters/sqlite/index.d.ts.map +1 -1
  18. package/dist/adapters/sqlite/index.js +1720 -38
  19. package/dist/adapters/sqlite/index.js.map +1 -1
  20. package/dist/adapters/sqlite/mappers.d.ts +39 -4
  21. package/dist/adapters/sqlite/mappers.d.ts.map +1 -1
  22. package/dist/adapters/sqlite/mappers.js +87 -0
  23. package/dist/adapters/sqlite/mappers.js.map +1 -1
  24. package/dist/adapters/sqlite/schema.d.ts +1 -1
  25. package/dist/adapters/sqlite/schema.d.ts.map +1 -1
  26. package/dist/adapters/sqlite/schema.js +297 -1
  27. package/dist/adapters/sqlite/schema.js.map +1 -1
  28. package/dist/adapters/sync-to-async.d.ts.map +1 -1
  29. package/dist/adapters/sync-to-async.js +54 -0
  30. package/dist/adapters/sync-to-async.js.map +1 -1
  31. package/dist/contracts/async-storage.d.ts +61 -1
  32. package/dist/contracts/async-storage.d.ts.map +1 -1
  33. package/dist/contracts/cognitive.d.ts +37 -0
  34. package/dist/contracts/cognitive.d.ts.map +1 -0
  35. package/dist/contracts/cognitive.js +24 -0
  36. package/dist/contracts/cognitive.js.map +1 -0
  37. package/dist/contracts/coordination.d.ts +101 -0
  38. package/dist/contracts/coordination.d.ts.map +1 -0
  39. package/dist/contracts/coordination.js +26 -0
  40. package/dist/contracts/coordination.js.map +1 -0
  41. package/dist/contracts/embedding.d.ts +1 -1
  42. package/dist/contracts/embedding.d.ts.map +1 -1
  43. package/dist/contracts/errors.d.ts +28 -0
  44. package/dist/contracts/errors.d.ts.map +1 -0
  45. package/dist/contracts/errors.js +41 -0
  46. package/dist/contracts/errors.js.map +1 -0
  47. package/dist/contracts/identity.d.ts +2 -0
  48. package/dist/contracts/identity.d.ts.map +1 -1
  49. package/dist/contracts/identity.js +26 -1
  50. package/dist/contracts/identity.js.map +1 -1
  51. package/dist/contracts/observability.d.ts +2 -1
  52. package/dist/contracts/observability.d.ts.map +1 -1
  53. package/dist/contracts/observability.js +11 -0
  54. package/dist/contracts/observability.js.map +1 -1
  55. package/dist/contracts/profile.d.ts +29 -0
  56. package/dist/contracts/profile.d.ts.map +1 -0
  57. package/dist/contracts/profile.js +2 -0
  58. package/dist/contracts/profile.js.map +1 -0
  59. package/dist/contracts/session-state.d.ts +10 -0
  60. package/dist/contracts/session-state.d.ts.map +1 -0
  61. package/dist/contracts/session-state.js +2 -0
  62. package/dist/contracts/session-state.js.map +1 -0
  63. package/dist/contracts/storage.d.ts +73 -1
  64. package/dist/contracts/storage.d.ts.map +1 -1
  65. package/dist/contracts/storage.js +16 -1
  66. package/dist/contracts/storage.js.map +1 -1
  67. package/dist/contracts/temporal.d.ts +112 -0
  68. package/dist/contracts/temporal.d.ts.map +1 -0
  69. package/dist/contracts/temporal.js +31 -0
  70. package/dist/contracts/temporal.js.map +1 -0
  71. package/dist/contracts/types.d.ts +135 -0
  72. package/dist/contracts/types.d.ts.map +1 -1
  73. package/dist/contracts/types.js +27 -0
  74. package/dist/contracts/types.js.map +1 -1
  75. package/dist/core/associations.d.ts +18 -0
  76. package/dist/core/associations.d.ts.map +1 -0
  77. package/dist/core/associations.js +185 -0
  78. package/dist/core/associations.js.map +1 -0
  79. package/dist/core/circuit-breaker.d.ts +9 -0
  80. package/dist/core/circuit-breaker.d.ts.map +1 -1
  81. package/dist/core/circuit-breaker.js +13 -1
  82. package/dist/core/circuit-breaker.js.map +1 -1
  83. package/dist/core/cognitive.d.ts +5 -0
  84. package/dist/core/cognitive.d.ts.map +1 -0
  85. package/dist/core/cognitive.js +120 -0
  86. package/dist/core/cognitive.js.map +1 -0
  87. package/dist/core/context.d.ts +72 -1
  88. package/dist/core/context.d.ts.map +1 -1
  89. package/dist/core/context.js +471 -45
  90. package/dist/core/context.js.map +1 -1
  91. package/dist/core/episodic.d.ts +28 -0
  92. package/dist/core/episodic.d.ts.map +1 -0
  93. package/dist/core/episodic.js +371 -0
  94. package/dist/core/episodic.js.map +1 -0
  95. package/dist/core/formatter.d.ts +4 -0
  96. package/dist/core/formatter.d.ts.map +1 -1
  97. package/dist/core/formatter.js +103 -0
  98. package/dist/core/formatter.js.map +1 -1
  99. package/dist/core/maintenance.d.ts +1 -0
  100. package/dist/core/maintenance.d.ts.map +1 -1
  101. package/dist/core/maintenance.js +75 -0
  102. package/dist/core/maintenance.js.map +1 -1
  103. package/dist/core/manager.d.ts +159 -7
  104. package/dist/core/manager.d.ts.map +1 -1
  105. package/dist/core/manager.js +740 -31
  106. package/dist/core/manager.js.map +1 -1
  107. package/dist/core/orchestrator.d.ts.map +1 -1
  108. package/dist/core/orchestrator.js +210 -178
  109. package/dist/core/orchestrator.js.map +1 -1
  110. package/dist/core/playbook.d.ts +35 -0
  111. package/dist/core/playbook.d.ts.map +1 -0
  112. package/dist/core/playbook.js +184 -0
  113. package/dist/core/playbook.js.map +1 -0
  114. package/dist/core/profile.d.ts +8 -0
  115. package/dist/core/profile.d.ts.map +1 -0
  116. package/dist/core/profile.js +103 -0
  117. package/dist/core/profile.js.map +1 -0
  118. package/dist/core/quick.d.ts +5 -0
  119. package/dist/core/quick.d.ts.map +1 -1
  120. package/dist/core/quick.js +10 -1
  121. package/dist/core/quick.js.map +1 -1
  122. package/dist/core/runtime.d.ts +17 -1
  123. package/dist/core/runtime.d.ts.map +1 -1
  124. package/dist/core/runtime.js +88 -5
  125. package/dist/core/runtime.js.map +1 -1
  126. package/dist/core/streaming.d.ts +1 -1
  127. package/dist/core/streaming.d.ts.map +1 -1
  128. package/dist/core/temporal.d.ts +29 -0
  129. package/dist/core/temporal.d.ts.map +1 -0
  130. package/dist/core/temporal.js +447 -0
  131. package/dist/core/temporal.js.map +1 -0
  132. package/dist/core/validation.d.ts +3 -0
  133. package/dist/core/validation.d.ts.map +1 -1
  134. package/dist/core/validation.js +25 -10
  135. package/dist/core/validation.js.map +1 -1
  136. package/dist/core/workspace-detect.d.ts +17 -0
  137. package/dist/core/workspace-detect.d.ts.map +1 -0
  138. package/dist/core/workspace-detect.js +55 -0
  139. package/dist/core/workspace-detect.js.map +1 -0
  140. package/dist/embeddings/resilience.d.ts.map +1 -1
  141. package/dist/embeddings/resilience.js +19 -8
  142. package/dist/embeddings/resilience.js.map +1 -1
  143. package/dist/index.d.ts +21 -4
  144. package/dist/index.d.ts.map +1 -1
  145. package/dist/index.js +9 -0
  146. package/dist/index.js.map +1 -1
  147. package/dist/integrations/claude-agent.d.ts +6 -0
  148. package/dist/integrations/claude-agent.d.ts.map +1 -1
  149. package/dist/integrations/claude-agent.js +5 -1
  150. package/dist/integrations/claude-agent.js.map +1 -1
  151. package/dist/integrations/claude-tools.d.ts +5 -4
  152. package/dist/integrations/claude-tools.d.ts.map +1 -1
  153. package/dist/integrations/claude-tools.js +155 -2
  154. package/dist/integrations/claude-tools.js.map +1 -1
  155. package/dist/integrations/middleware.d.ts +6 -0
  156. package/dist/integrations/middleware.d.ts.map +1 -1
  157. package/dist/integrations/middleware.js +11 -1
  158. package/dist/integrations/middleware.js.map +1 -1
  159. package/dist/integrations/openai-tools.d.ts +5 -4
  160. package/dist/integrations/openai-tools.d.ts.map +1 -1
  161. package/dist/integrations/openai-tools.js +170 -2
  162. package/dist/integrations/openai-tools.js.map +1 -1
  163. package/dist/integrations/vercel-ai.d.ts +6 -0
  164. package/dist/integrations/vercel-ai.d.ts.map +1 -1
  165. package/dist/integrations/vercel-ai.js +4 -0
  166. package/dist/integrations/vercel-ai.js.map +1 -1
  167. package/dist/server/http-server.d.ts +8 -0
  168. package/dist/server/http-server.d.ts.map +1 -1
  169. package/dist/server/http-server.js +976 -58
  170. package/dist/server/http-server.js.map +1 -1
  171. package/dist/server/mcp-server.d.ts +8 -0
  172. package/dist/server/mcp-server.d.ts.map +1 -1
  173. package/dist/server/mcp-server.js +1157 -37
  174. package/dist/server/mcp-server.js.map +1 -1
  175. package/dist/server/parsing.d.ts +12 -0
  176. package/dist/server/parsing.d.ts.map +1 -0
  177. package/dist/server/parsing.js +42 -0
  178. package/dist/server/parsing.js.map +1 -0
  179. package/dist/summarizers/prompts.d.ts +4 -0
  180. package/dist/summarizers/prompts.d.ts.map +1 -1
  181. package/dist/summarizers/prompts.js +42 -0
  182. package/dist/summarizers/prompts.js.map +1 -1
  183. package/docs/ULTIMATE_MEMORY_LAYER_ROADMAP.md +291 -0
  184. package/docs/prd.json +1498 -0
  185. package/openapi.yaml +1945 -112
  186. package/package.json +4 -2
package/README.md CHANGED
@@ -1,31 +1,29 @@
  <p align="center">
  <h1 align="center">memory-layer</h1>
  <p align="center">
- Persistent memory for AI systems.<br/>
- Drop it into any agent, IDE, or autonomous loop.<br/>
- Two lines to remember. Zero lines to forget.
+ A cognitive memory architecture for AI systems.<br/>
+ Interactions become summaries. Summaries become knowledge. Knowledge becomes context.<br/>
+ Two lines to start. Enough architecture to scale.
  </p>
  </p>

  <p align="center">
+ <a href="#why-it-stands-out">Why</a> &nbsp;&bull;&nbsp;
  <a href="#quick-start">Quick Start</a> &nbsp;&bull;&nbsp;
  <a href="#how-it-works">How It Works</a> &nbsp;&bull;&nbsp;
- <a href="#integration-patterns">Integrations</a> &nbsp;&bull;&nbsp;
- <a href="#python">Python</a> &nbsp;&bull;&nbsp;
+ <a href="#integrations">Integrations</a> &nbsp;&bull;&nbsp;
+ <a href="#temporal-intelligence">Temporal</a> &nbsp;&bull;&nbsp;
+ <a href="#multi-agent-coordination">Coordination</a> &nbsp;&bull;&nbsp;
+ <a href="#python-client">Python</a> &nbsp;&bull;&nbsp;
  <a href="#api-reference">API</a> &nbsp;&bull;&nbsp;
- <a href="#configuration">Config</a> &nbsp;&bull;&nbsp;
- <a href="docs/DEPLOYMENT.md">Deploy</a>
+ <a href="#configuration">Config</a>
  </p>

  ---

- ## The Problem
+ Every AI system built today has the same blind spot: it forgets everything between sessions. Context vanishes. Learned preferences disappear. Mistakes repeat. The model is perpetually starting over.

- AI systems have no memory. Every session starts cold. Context vanishes. Learned preferences disappear. Mistakes repeat.
-
- If you're building an autonomous agent, a coding assistant, or a dark-factory loop, the model forgets everything the moment the conversation ends. Bolting on memory means building compaction, extraction, trust scoring, retrieval, multi-tenant scoping, and lifecycle management from scratch.
-
- **memory-layer** is that entire stack as a drop-in package.
+ **memory-layer** is a complete cognitive memory architecture: not a vector store, not a chat log, but a tiered system where conversations compress into summaries, summaries crystallize into trust-scored knowledge, and the most relevant memory is assembled into a token-budgeted context window on every call. It handles compaction, extraction, evidence grounding, contradiction detection, hybrid retrieval, multi-tenant scoping, temporal replay, and lifecycle management so you don't have to build any of it.

  ```typescript
  import { createMemory } from 'ai-memory-layer';
@@ -37,6 +35,30 @@ That's a working memory system. No API keys. No infrastructure. No configuration

  ---

+ ## Why It Stands Out
+
+ - **Starts as a package, not an infrastructure project.** `createMemory()` works offline out of the box, then grows into SQLite, PostgreSQL, HTTP, or MCP without changing the core mental model.
+ - **Treats memory as evolving state, not just search results.** Turns compact into summaries, summaries promote into evidence-backed knowledge, and context assembly respects trust, scope, and token budget.
+ - **Keeps history, not just the latest projection.** Replay, diffs, streaming, and snapshots let you ask what changed and what the system knew at a specific time.
+ - **Built for real agent operations.** Multi-scope routing, work claims, handoffs, playbooks, profiles, and association graphs are first-class instead of bolted on later.
+
+ ## When To Use It
+
+ Use `memory-layer` when:
+
+ - your agent needs durable preferences, constraints, and decisions across sessions
+ - you need temporal replay, auditability, or change streams
+ - multiple agents share workspace memory or coordinate on work
+ - you want one memory abstraction that can start embedded and later move behind HTTP or MCP
+
+ It is probably overkill when:
+
+ - you only need vector search over a document corpus
+ - a single chat transcript is enough and nothing needs to persist
+ - you do not need trust lifecycles, temporal semantics, or multi-agent coordination
+
+ ---
+
  ## Quick Start

  ### Install
@@ -52,17 +74,18 @@ import { createMemory } from 'ai-memory-layer';

  const memory = createMemory();

- await memory.processExchange(
+ await memory.learnFact(
  'Always use TypeScript strict mode in this project.',
- 'Got it — TypeScript strict mode is now a stored constraint.',
+ 'constraint',
  );

- // Later, in a new session or turn:
  const ctx = await memory.getContext('typescript config');
  // ctx.relevantKnowledge → [{ fact: "Use TypeScript strict mode", knowledge_class: "constraint", ... }]
  ```

- No API keys required. Uses a pure-JS extractive summarizer, heuristic fact extractor, and local embedding fallback. Good enough to start. Upgrades automatically when provider credentials appear.
+ For direct durable memory in one call, use `learnFact(...)`. Conversation-driven extraction also works, but durable knowledge appears after compaction rather than after a single exchange.
+
+ No API keys required. Uses a pure-JS extractive summarizer, heuristic fact extractor, and local embedding fallback out of the box.

  ### Persistent (SQLite)

@@ -97,7 +120,7 @@ const memory = createMemory({
  });
  ```

- When `OPENAI_API_KEY` or `VOYAGE_API_KEY` is present, `createMemory()` auto-upgrades to provider-backed embeddings. You don't have to change anything; the quality tier shifts silently.
+ When `OPENAI_API_KEY` or `VOYAGE_API_KEY` is present, `createMemory()` upgrades the embedding tier automatically. You keep the same integration code and can opt into provider summarization or extraction when you want higher-fidelity memory formation.

  ---

@@ -107,23 +130,27 @@ When `OPENAI_API_KEY` or `VOYAGE_API_KEY` is present, `createMemory()` auto-upgr
  User Input ──> Turn Storage ──> Compaction ──> Working Memory

  Extraction ──> Knowledge Memory
+ │ │
+ Association ──> Knowledge Graph

  Retrieval ──> Prompt-Ready Context
  ```

- ### Three-Tier Memory
+ ### Three-Tier Memory Architecture

- Memory flows through three tiers, each optimized for a different time horizon:
+ Memory flows through three tiers, each optimized for a different time horizon — modeled on how durable memory actually forms:

- | Tier | What It Stores | How Long It Lives |
- |------|---------------|-------------------|
- | **Short-term** (Turns) | Raw conversation exchanges | Until compacted |
- | **Medium-term** (Working Memory) | Summaries with entities and topic tags | Days to weeks (TTL) |
- | **Long-term** (Knowledge) | Extracted facts with trust scores and evidence | Weeks to years (lifecycle) |
+ | Tier | What It Stores | Lifecycle | Analogy |
+ |------|---------------|-----------|---------|
+ | **Short-term** (Turns) | Raw conversation exchanges | Until compacted | Working memory |
+ | **Medium-term** (Working Memory) | Summaries with entities and topic tags | Days to weeks | Episodic memory |
+ | **Long-term** (Knowledge) | Extracted facts with trust scores, evidence chains, and association graphs | Weeks to years | Semantic memory |
+
+ Turns accumulate until the compaction monitor fires (configurable thresholds for turn count, token budget, session gaps, and topic drift). The summarizer compresses them into a working memory summary. The extractor identifies durable facts — preferences, constraints, decisions, entities — and promotes them through a trust lifecycle before they become long-term knowledge.
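The compaction trigger described above can be sketched as a simple threshold check. The interface and field names here (`maxTurns`, `maxTokens`, `sessionGapMs`) are illustrative placeholders, not the package's actual configuration keys:

```typescript
// Hypothetical threshold names for illustration only; the real package's
// compaction config keys may differ.
interface CompactionThresholds {
  maxTurns: number;     // compact once this many raw turns accumulate
  maxTokens: number;    // ...or the turn buffer exceeds this token budget
  sessionGapMs: number; // ...or this much idle time passes between turns
}

interface TurnBufferState {
  turnCount: number;
  tokenCount: number;
  msSinceLastTurn: number;
}

// Any single threshold crossing is enough to fire the compaction monitor.
function shouldCompact(state: TurnBufferState, t: CompactionThresholds): boolean {
  return (
    state.turnCount >= t.maxTurns ||
    state.tokenCount >= t.maxTokens ||
    state.msSinceLastTurn >= t.sessionGapMs
  );
}
```

Topic-drift detection would add a fourth disjunct on a similarity score between recent turns; it is omitted here because it depends on the embedding tier in use.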

  ### Knowledge Trust Lifecycle

- Extracted facts aren't blindly trusted. Every fact has a state:
+ Extracted facts aren't blindly trusted. Every fact earns its place through evidence:

  ```
  candidate ──> provisional ──> trusted
@@ -131,28 +158,28 @@ candidate ──> provisional ──> trusted
  └── disputed └── superseded ──> retired
  ```

- Promotion requires evidence. A fact needs grounding in source turns, explicit user statements, tool verification, or repeated corroboration before it reaches `trusted`. Contradictions are detected and facts are marked `disputed`, not silently overwritten.
+ Promotion requires grounding in source turns, corroboration across sessions, or explicit user statements. Contradictions are detected automatically — conflicting facts are marked `disputed`, not silently overwritten. Every decision is audited: you can inspect why any fact was promoted, demoted, or retired via the evidence chain and knowledge audit log.

- Every decision is audited. You can inspect why any fact was promoted, demoted, or retired.
+ Facts are classified by type (`preference`, `constraint`, `entity`, `decision`, `reference`) and by knowledge class (`identity`, `preference`, `constraint`, `procedure`, `strategy`, `project_fact`, `anti_pattern`). Classification drives retrieval ranking, maintenance retention, and profile assembly.
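The lifecycle above reads as a small state machine. The states come from the diagram; the exact transition edges below are an editorial assumption for illustration, not the package's internal representation:

```typescript
type KnowledgeState =
  | 'candidate' | 'provisional' | 'trusted'
  | 'disputed' | 'superseded' | 'retired';

// Assumed edge set: the promotion path plus dispute/supersede branches.
const transitions: Record<KnowledgeState, KnowledgeState[]> = {
  candidate:   ['provisional', 'disputed'],
  provisional: ['trusted', 'disputed', 'superseded'],
  trusted:     ['disputed', 'superseded'],
  disputed:    ['provisional', 'retired'],  // a dispute can resolve or retire
  superseded:  ['retired'],
  retired:     [],                          // terminal
};

function canTransition(from: KnowledgeState, to: KnowledgeState): boolean {
  return transitions[from].includes(to);
}
```

Modeling it as an explicit table is what makes the audit log cheap: each promotion or demotion is just a recorded edge traversal.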

  ### Hybrid Retrieval

- When you call `getContext()`, the engine scores every candidate fact across multiple dimensions:
+ When you call `getContext()`, the engine scores every candidate fact across eight dimensions:

- - **Lexical** — full-text search relevance
- - **Semantic** — vector similarity (when embeddings are available)
+ - **Lexical** — full-text search relevance (FTS5 on SQLite, tsvector on Postgres)
+ - **Semantic** — vector similarity via embeddings (pgvector ANN, local cosine, or provider-backed)
  - **Recency** — when the fact was last accessed
  - **Trust** — knowledge state and confidence score
- - **Class importance** — identity facts rank higher than episodic ones
+ - **Class importance** — identity and constraint facts outrank episodic ones
  - **Evidence density** — better-grounded facts rank higher
- - **Scope relation** — local facts rank higher than cross-scope ones
- - **Diversity** — penalizes clustering of same-type results
+ - **Scope affinity** — local facts rank higher than cross-scope; lineage scoring for branched scopes
+ - **Diversity** — penalizes clustering of same-type or same-slot results

- The result is a `MemoryContext` object ready to inject into any model call.
+ Selected knowledge then seeds a **single-hop association expansion**: the top seeds' `supports` and `related_to` edges are traversed, and connected facts are pulled in ranked by confidence. The result is token-trimmed to budget (turns → summaries → playbooks → associated knowledge → core knowledge, in priority order) and returned as a structured `MemoryContext` ready to inject into any model call.
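A minimal sketch of that budget trim: sections are kept whole, highest priority first, until the budget runs out. The 4-characters-per-token estimate is a placeholder rather than the package's tokenizer, and dropping a section wholesale (instead of partially trimming it) is an assumption of this sketch:

```typescript
interface ContextSection {
  name: string; // e.g. 'turns', 'summaries', 'playbooks', ...
  text: string;
}

// Crude stand-in for a real tokenizer: ~4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Sections arrive highest-priority first; once the budget is exhausted,
// the remaining lower-priority sections are dropped entirely.
function trimToBudget(sections: ContextSection[], budgetTokens: number): ContextSection[] {
  const kept: ContextSection[] = [];
  let used = 0;
  for (const section of sections) {
    const cost = estimateTokens(section.text);
    if (used + cost > budgetTokens) break;
    kept.push(section);
    used += cost;
  }
  return kept;
}
```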

  ---

- ## Integration Patterns
+ ## Integrations

  ### Before/After Hooks (recommended)

@@ -162,7 +189,7 @@ import { createMemory, createMemoryRuntime } from 'ai-memory-layer';
  const manager = createMemory({ adapter: 'sqlite', path: './memory.db' });
  const runtime = createMemoryRuntime(manager);

- // Before the model call — get context
+ // Before the model call — assemble context
  const { prompt, messages } = await runtime.beforeModelCall(userInput);

  // Call your model with enriched context
@@ -181,7 +208,7 @@ const { result } = await runtime.wrapModelCall(
  );
  ```

- `wrapModelCall` handles the full cycle: context assembly, model call, turn storage, compaction, extraction, and work item tracking.
+ One call handles the full cycle: context assembly, model call, turn storage, compaction, extraction, and work item tracking.

  ### Claude Agent SDK

@@ -231,75 +258,218 @@ const handler = wrapWithMemory(
  import { createMemoryMcpAdapter } from 'ai-memory-layer';

  const mcp = createMemoryMcpAdapter(runtime);
- // mcp.tools — tool definitions
- // mcp.callTool(name, args) dispatcher
+ // mcp.tools — tool definitions for memory_store_turn, memory_get_context,
+ // memory_search, memory_learn_fact, memory_stream_changes, memory_snapshot, ...
  ```

- Or run as a standalone MCP server:
+ Or run standalone:

  ```bash
  npx memory-layer serve --transport mcp --db ./memory.db
  ```

- ### HTTP Service
+ The tool surface spans turns, context, search, episodes, cognitive retrieval, playbooks, associations, coordination, temporal queries, maintenance, and snapshots.

- For polyglot deployments, run memory-layer as a standalone HTTP service:
+ ### HTTP Service

  ```bash
  npx memory-layer serve --transport http --db ./memory.db --port 3100
  ```

- Full REST API documented in [`openapi.yaml`](openapi.yaml). Supports multi-tenant routing via scope headers, event streaming via SSE, and API key authentication.
+ Full REST API documented in [`openapi.yaml`](openapi.yaml), with multi-tenant routing via scope headers, SSE event streaming, bearer auth, and admin key separation.
+
+ Important trust-model note: the built-in HTTP server treats a valid API key as one trust domain. Scope headers and query/body scope overrides are routing inputs, not tenant authorization boundaries. If you expose the server beyond localhost or a private service mesh, put tenant-aware auth in front of it.

  ---

- ## Python
+ ## Surfaces

- ```bash
- pip install memory-layer-client
+ One memory engine, every access pattern:
+
+ | Surface | Best For | Start |
+ |---------|---------|-------|
+ | **Node package** | In-process agents, IDEs, autonomous loops | `import { createMemory } from 'ai-memory-layer'` |
+ | **HTTP API** | Polyglot services, hosted deployments | `npx memory-layer serve --transport http` |
+ | **MCP server** | Tool ecosystems, Claude Desktop, agent frameworks | `npx memory-layer serve --transport mcp` |
+ | **CLI** | Inspection, debugging, admin | `npx memory-layer inspect` |
+ | **Python client** | Python agents consuming the HTTP API | `pip install memory-layer-client` |
+
+ ---
+
+ ## Multi-Tenancy & Scoping
+
+ Every record belongs to a five-tuple scope that enables isolation and selective sharing across agents, projects, and organizations:
+
+ ```typescript
+ const memory = createMemory({
+   scope: {
+     tenant_id: 'acme-corp',         // Organization boundary
+     system_id: 'code-assistant',    // Which agent or system
+     workspace_id: 'backend-repo',   // Shared project context
+     collaboration_id: 'incident-7', // Cross-system collaboration boundary
+     scope_id: 'task-refactor-auth', // This specific task or thread
+   },
+   crossScopeLevel: 'workspace',     // Read workspace-wide knowledge
+ });
  ```

- The Python client mirrors the HTTP API surface. It's an HTTP client, not a second engine — run the Node service and point Python at it.
+ ### Cross-Scope Retrieval

- ```python
- from memory_layer_client import MemoryClient, MemoryRuntimeClient, MemoryScope
+ ```typescript
+ // Search across the workspace
+ const results = await memory.searchCrossScope('rate limiting', 'workspace');

- client = MemoryClient(
-     "http://localhost:3100",
-     default_scope=MemoryScope(
-         tenant_id="acme",
-         system_id="research-agent",
-         scope_id="session-1",
-     ),
- )
+ // Poll for knowledge changes from other agents
+ const changes = await memory.pollForChanges(lastSyncTimestamp);
+ ```

- runtime = MemoryRuntimeClient(client)
+ Retrieval levels: `scope` → `workspace` → `system` → `tenant`. Each level widens the knowledge pool while preserving ranking preference for local, high-trust facts.

- # Full before/after cycle with your model
- result = runtime.run_turn(
-     "What constraints apply to this project?",
-     lambda prepared: call_model(prepared.context),
- )
+ ### Visibility Classes

- # Direct operations
- client.learn_fact("Deployment target is AWS us-east-1", "constraint")
- results = client.search("deployment")
- context = client.get_context("deployment constraints")
+ Items carry a visibility class (`private`, `shared_collaboration`, `workspace`, `tenant`) that controls what surfaces under each context view policy:
+
+ | View | Sees |
+ |------|------|
+ | `local_only` | Private items in the exact scope |
+ | `local_plus_shared_collaboration` | Private + collaboration-shared items |
+ | `workspace_shared` | Private + collaboration + workspace items |
+ | `operator_supervisor` | Everything including tenant-wide items |
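The view table translates directly into a lookup from view policy to visible classes. The mapping below is read straight from the table; the function wrapping it is just an editorial illustration:

```typescript
type Visibility = 'private' | 'shared_collaboration' | 'workspace' | 'tenant';
type ContextView =
  | 'local_only'
  | 'local_plus_shared_collaboration'
  | 'workspace_shared'
  | 'operator_supervisor';

// Each view strictly widens the previous one, mirroring the table rows.
const visibleClasses: Record<ContextView, Visibility[]> = {
  local_only: ['private'],
  local_plus_shared_collaboration: ['private', 'shared_collaboration'],
  workspace_shared: ['private', 'shared_collaboration', 'workspace'],
  operator_supervisor: ['private', 'shared_collaboration', 'workspace', 'tenant'],
};

function isVisible(item: Visibility, view: ContextView): boolean {
  return visibleClasses[view].includes(item);
}
```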
+
+ ---
+
+ ## Temporal Intelligence
+
+ An append-only event log records every state change — turn created, knowledge promoted, work item claimed, fact disputed — with before/after payloads. This powers capabilities most memory systems can't offer:
+
+ ### Point-in-Time Replay
+
+ ```typescript
+ // What did the system know at 2pm yesterday?
+ const context = await memory.getContextAt(asOfTimestamp, 'deployment status');
+
+ // Full state snapshot at a past time
+ const state = await memory.getStateAt(asOfTimestamp);
+ // state.turns, state.knowledge, state.workItems, state.workClaims, state.handoffs, ...
  ```

- Async support included:
+ Historical replay is exact after the replay cutover and best-effort before it. For exactness-sensitive flows, `getStateAt(...)` exposes whether the reconstruction is exact.

- ```python
- from memory_layer_client import AsyncMemoryClient, AsyncMemoryRuntimeClient
+ ### Temporal Diffs

- async with AsyncMemoryClient("http://localhost:3100") as client:
-     runtime = AsyncMemoryRuntimeClient(client)
-     result = await runtime.run_turn(user_input, model_call)
+ ```typescript
+ // What changed between two timestamps?
+ const diff = await memory.diffState(fromTimestamp, toTimestamp);
+ // diff.summary.byEntityKind → { knowledge_memory: 3, work_item: 1 }
+ // diff.events → full event records
+ ```
+
+ ### Change Streaming
+
+ ```typescript
+ // Real-time SSE stream of memory events
+ for await (const event of memory.streamChanges({ signal: controller.signal })) {
+   console.log(event.event_type, event.entity_kind, event.entity_id);
+ }
+ ```
+
+ ### Consistent Snapshots
+
+ Snapshots pin a watermark event id before assembling context, then filter events by that watermark. Same-second writes that land during capture are excluded — the snapshot is a consistent cut of the event log.
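The watermark mechanic can be sketched as: pin the highest event id first, then filter every subsequent read by it, so events appended mid-capture never leak into the snapshot. The event shape and field names here are illustrative, not the package's actual schema:

```typescript
interface MemoryEvent {
  id: number;        // monotonically increasing event id
  entity_id: string;
}

// readLog may observe new events between calls; the pinned watermark
// still guarantees the snapshot is a consistent cut of the log.
function captureSnapshot(readLog: () => MemoryEvent[]): MemoryEvent[] {
  const watermark = Math.max(0, ...readLog().map((e) => e.id)); // pin first
  return readLog().filter((e) => e.id <= watermark);
}
```

Filtering by event id rather than timestamp is what makes same-second writes safe to exclude: ids are totally ordered even when wall-clock times collide.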
379
+
380
+ ---
381
+
382
+ ## Multi-Agent Coordination
383
+
384
+ Built-in primitives for multi-agent workflows where agents need to share work, claim tasks, and hand off context:
385
+
386
+ ### Work Items & Claims
387
+
388
+ ```typescript
389
+ // Track work
390
+ const item = await memory.trackWorkItem('Deploy canary to us-east-1', 'objective', 'open');
391
+
392
+ // Claim it (lease-based, with automatic expiry)
393
+ const claim = await memory.claimWorkItem({
394
+ workItemId: item.id,
395
+ actor: { actor_kind: 'agent', actor_id: 'deployer', system_id: null, display_name: 'Deploy Bot', metadata: null },
396
+ leaseSeconds: 600,
397
+ });
398
+
399
+ // Renew or release
400
+ await memory.renewWorkClaim(claim.id, claim.actor, 300);
+ await memory.releaseWorkClaim(claim.id, claim.actor, 'deployment complete');
+ ```
+
+ ### Handoffs
+
+ ```typescript
+ // Hand off work between agents with context
+ const handoff = await memory.handoffWorkItem({
+   workItemId: item.id,
+   fromActor: deployBot,
+   toActor: monitorBot,
+   summary: 'Canary deployed. Monitor for 30min then promote.',
+ });
+
+ // Receiving agent accepts
+ await memory.acceptHandoff(handoff.id, monitorBot);
+ ```
+
+ ### Episodic & Cognitive Retrieval
+
+ ```typescript
+ // Search across episodes (requires a structuredClient)
+ const episodes = await memory.searchEpisodes({ query: 'deployment failures', limit: 5 });
+
+ // Summarize a specific session
+ const recap = await memory.summarizeEpisode('session-xyz', { detailLevel: 'detailed' });
+
+ // Reflect across memory types
+ const reflection = await memory.reflect({ query: 'What patterns emerge from our deployments?' });
+
+ // Cognitive search (grouped by memory type)
+ const cognitive = await memory.searchCognitive({ query: 'rate limiting', limit: 10 });
+ ```
+
+ ### Playbooks
+
+ ```typescript
+ // Create reusable procedures from experience
+ const playbook = await memory.createPlaybook({
+   title: 'Canary Deployment Runbook',
+   description: 'Step-by-step canary deployment with rollback gates',
+   instructions: '1. Deploy to canary. 2. Monitor error rate. 3. ...',
+   tags: ['deployment', 'canary'],
+ });
+
+ // Search for relevant playbooks during context assembly
+ const matches = await memory.searchPlaybooks('deployment procedure');
+
+ // Playbooks surface automatically in getContext() when relevant
+ ```
+
+ ### Profiles
+
+ ```typescript
+ // Aggregate knowledge into a structured profile
+ const profile = await memory.getProfile({ view: 'user' });
+ // profile.sections → { identity: [...], preferences: [...], constraints: [...], ... }
+ ```
+
+ ### Association Graphs
+
+ ```typescript
+ // Traverse the knowledge graph
+ const graph = await memory.traverseAssociations('knowledge', factId, { maxDepth: 2, maxNodes: 20 });
+ // graph.nodes, graph.edges — typed association graph with supports/contradicts/supersedes/related_to edges
  ```

  ---

- ## Presets
+ ## Configuration
+
+ ### Presets

  Start with a preset. Override only when you need to.

@@ -323,16 +493,7 @@ Orthogonal to presets. Controls how aggressively the system trusts and retains k
  | `balanced_memory` | 0.70 | 365-day core | Production default |
  | `high_fidelity_memory` | 0.82 | 730-day core | Safety-critical, long-running systems |

- ```typescript
- const memory = createMemory({
-   preset: 'autonomous_agent',
-   qualityMode: 'high_fidelity_memory',
- });
- ```
-
- ---
-
- ## Quality Tiers
+ ### Quality Tiers

  `createMemory()` auto-detects your environment and resolves to the best available tier:

@@ -342,57 +503,89 @@ const memory = createMemory({
  | **Local semantic** | Composite heuristic | Lexical + local TF-IDF embeddings | Nothing |
  | **Provider-backed** | Claude/OpenAI LLM | Lexical + provider embeddings | API key |

- Pass `onEvent` to see which tier resolved at startup:
+ The local path is fully functional offline. Provider-backed is the highest-quality tier when API access is available.

- ```typescript
- const memory = createMemory({
-   onEvent: (event) => {
-     if (event.type === 'capability') {
-       console.log(event.meta);
-       // { qualityMode: 'balanced_memory', extractorTier: 'local_heuristic',
-       //   embeddingTier: 'local_semantic', providerBacked: false }
-     }
-   },
- });
- ```
+ <details>
+ <summary><strong>MonitorPolicy</strong> — when compaction triggers</summary>

- The local path is an honest fallback — functional, not aspirational. Provider-backed is the gold standard for extraction and retrieval quality.
+ | Field | Default | Description |
+ |-------|---------|-------------|
+ | `softTurnThreshold` | 15 | Turns before soft compaction is considered |
+ | `hardTurnThreshold` | 30 | Turns that force compaction |
+ | `softTokenThreshold` | 3000 | Token estimate for soft trigger |
+ | `hardTokenThreshold` | 6000 | Token estimate that forces compaction |
+ | `softRetainTurns` | 12 | Turns to keep after soft compaction |
+ | `hardRetainTurns` | 8 | Turns to keep after hard compaction |
+ | `intraSessionGapSeconds` | 1800 | Idle gap that triggers session_gap compaction |

- ---
+ </details>

- ## Scoping & Multi-Tenancy
+ <details>
+ <summary><strong>ExtractionPolicy</strong> — how facts are extracted and promoted</summary>

- Every record belongs to a scope. Scopes enable isolation and selective sharing across agents, workspaces, and tenants.
+ | Field | Default | Description |
+ |-------|---------|-------------|
+ | `autoExtractAfterCompaction` | true | Run extraction after each compaction |
+ | `maxFactsPerExtraction` | 10 | Max facts per compaction cycle |
+ | `deduplicateFacts` | true | Deduplicate against existing knowledge |
+ | `minConfidenceForPromotion` | `'medium'` | Minimum confidence for storage |
+ | `trustPromotionThreshold` | 0.7 | Score required for `trusted` state |
+ | `contradictionDisputeThreshold` | 0.35 | Score that marks facts `disputed` |
+ | `requireGroundingForTrusted` | true | Require evidence in source turns |
+ | `conflictStrategy` | `'supersede'` | How to handle conflicting facts |

- ```typescript
- const memory = createMemory({
-   scope: {
-     tenant_id: 'acme-corp', // Organization boundary
-     system_id: 'code-assistant', // Which agent
-     workspace_id: 'backend-repo', // Shared project context
-     scope_id: 'task-refactor-auth', // This specific task
-   },
-   crossScopeLevel: 'workspace', // Can read workspace-wide knowledge
- });
- ```
+ </details>

- ### Cross-Scope Retrieval
+ <details>
+ <summary><strong>ContextPolicy</strong> — how knowledge is selected for prompts</summary>

- ```typescript
- // Search across the workspace
- const results = await memory.searchCrossScope('rate limiting', 'workspace');
+ | Field | Default | Description |
+ |-------|---------|-------------|
+ | `mode` | `'chat'` | Scoring profile: `chat`, `coding`, `autonomous_agent`, `review` |
+ | `maxKnowledgeItems` | 20 | Max facts in assembled context |
+ | `maxRecentSummaries` | 3 | Max summaries in context |
+ | `tokenBudget` | unlimited | Token cap for context |
+ | `lexicalWeight` | 1.0 | Full-text search weight |
+ | `semanticWeight` | 1.0 | Embedding similarity weight |
+ | `recencyWeight` | 1.0 | Access recency weight |
+ | `trustWeight` | 1.3 | Knowledge confidence weight |
+ | `importanceWeight` | 0.25 | Access frequency weight |
+ | `diversityPenalty` | 0.2 | Same-type clustering penalty |

- // Poll for knowledge changes from other agents
- const changes = await memory.pollForChanges(lastSyncTimestamp);
- ```
+ </details>

- Retrieval levels: `scope` (exact match) → `workspace` → `system` → `tenant`
+ <details>
+ <summary><strong>MaintenancePolicy</strong> — data lifecycle and cleanup</summary>
+
+ | Field | Default | Description |
+ |-------|---------|-------------|
+ | `workingMemoryTtlSeconds` | 30 days | Summary expiry |
+ | `completedWorkItemTtlSeconds` | 14 days | Completed work item cleanup |
+ | `knowledgeStaleAfterSeconds` | 60 days | Knowledge staleness threshold |
+ | `minKnowledgeAccessCount` | 1 | Minimum accesses to avoid retirement |
+ | `maxActiveKnowledgeItems` | 500 | Hard cap on active knowledge |
+ | `reverificationCadenceDays` | 30 | Days between reverification checks |
+ | `trustedCoreRetentionDays` | 365 | Retention for identity/preference/constraint |
+ | `provisionalRetentionDays` | 7 | Retention for provisional facts |
+
+ </details>

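+ The tables above are defaults. A partial override merges with the preset and quality-mode baselines. A minimal sketch, assuming the `policies` option accepted by `createMemory()` in earlier releases still applies (field values here are illustrative, not recommendations):
+
+ ```typescript
+ const memory = createMemory({
+   preset: 'autonomous_agent',
+   policies: {
+     monitor: { softTurnThreshold: 10 },  // consider compaction a little earlier
+     context: { maxKnowledgeItems: 30 },  // allow more facts into assembled context
+   },
+ });
+ ```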
  ---

  ## API Reference

- ### MemoryManager
+ Most integrations only need a small slice of the surface:
+
+ - `createMemory()` or `createMemoryRuntime(manager)`
+ - `processExchange(...)` or `wrapModelCall(...)`
+ - `getContext(...)` or `getSessionBootstrap(...)`
+ - `learnFact(...)` for explicit durable memory
+ - `forceCompact()`, `runMaintenance()`, and `getRuntimeDiagnostics()` for operations
+
+ For the full transport contract, see [openapi.yaml](openapi.yaml). For the TypeScript surface, the package exports the types shown below.
+
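+ As a sketch, that minimal slice looks like this (method names and signatures as listed in the interface below; the fact and query strings are illustrative):
+
+ ```typescript
+ import { createMemory } from 'ai-memory-layer';
+
+ const memory = createMemory();
+
+ // Pin a durable fact, then retrieve prompt-ready context
+ await memory.learnFact('Deployment target is AWS us-east-1', 'constraint');
+ const context = await memory.getContext('deployment constraints');
+
+ await memory.close();
+ ```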
+ <details>
+ <summary><strong>MemoryManager</strong> — full manager surface</summary>

  Returned by `createMemory()`, `createMemoryManager()`, and provider factories.

@@ -408,31 +601,70 @@ interface MemoryManager {
    getContext(relevanceQuery?): Promise<MemoryContext>
    getContextAt(asOf, relevanceQuery?): Promise<MemoryContext>
    getSessionBootstrap(relevanceQuery?): Promise<SessionBootstrap>
+   getSessionBootstrapAt(asOf, relevanceQuery?): Promise<SessionBootstrap>
+   captureSnapshot(relevanceQuery?): Promise<SnapshotData>
    search(query, options?): Promise<{ turns, knowledge }>
    searchCrossScope(query, level, options?): Promise<{ knowledge }>
    recall(timeRange): Promise<{ turns, workingMemory, knowledge, workItems }>
    pollForChanges(since, options?): Promise<KnowledgeMemory[]>

+   // --- Temporal ---
+   getStateAt(asOf, options?): Promise<TemporalStateSnapshot>
+   getTimeline(options?): Promise<TimelineResult>
+   diffState(from, to, options?): Promise<TemporalStateDiff>
+   listMemoryEvents(options?): Promise<TimelineResult>
+   streamChanges(options?): AsyncIterable<MemoryEventRecord>
+
    // --- Knowledge ---
    learnFact(fact, factType, confidence?): Promise<KnowledgeMemory>
-   trackWorkItem(title, kind?, status?, detail?): Promise<WorkItem>
    inspectKnowledge(id): Promise<{ knowledge, evidence, audits }>
    listKnowledge(options?): Promise<PaginatedResult<KnowledgeMemory>>
+   reverifyKnowledge(id): Promise<TrustAssessment>
+
+   // --- Coordination ---
+   trackWorkItem(title, kind?, status?, detail?): Promise<WorkItem>
+   updateWorkItem(id, patch): Promise<WorkItem | null>
+   claimWorkItem(input): Promise<WorkClaim>
+   renewWorkClaim(claimId, actor, leaseSeconds?): Promise<WorkClaim | null>
+   releaseWorkClaim(claimId, actor, reason?): Promise<WorkClaim | null>
+   listWorkClaims(options?): Promise<WorkClaim[]>
+   handoffWorkItem(input): Promise<HandoffRecord>
+   acceptHandoff(id, actor): Promise<HandoffRecord | null>
+
+   // --- Episodic & Cognitive ---
+   searchEpisodes(options): Promise<EpisodeSummary[]>
+   summarizeEpisode(sessionId, options?): Promise<EpisodeSummary>
+   reflect(options): Promise<ReflectResult>
+   searchCognitive(options): Promise<CognitiveSearchResult>
+
+   // --- Playbooks ---
+   createPlaybook(input): Promise<Playbook>
+   searchPlaybooks(query): Promise<SearchResult<Playbook>[]>
+   revisePlaybook(id, instructions, reason): Promise<{ playbook, revision }>
+
+   // --- Profiles & Associations ---
+   getProfile(options?): Promise<Profile>
+   traverseAssociations(kind, id, options?): Promise<AssociationGraph>
+   addAssociation(input): Promise<Association>
+   removeAssociation(id): Promise<void>

    // --- System ---
    forceCompact(): Promise<CompactionResult | null>
    runMaintenance(policy?): Promise<MaintenanceReport>
-   runReverification(options?): Promise<{ reverifiedIds, demotedIds }>
+   getRuntimeDiagnostics(): Promise<DiagnosticsReport>
    close(): Promise<void>
  }
  ```
+ </details>

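+ The temporal surface above can be consumed incrementally: `streamChanges()` returns an `AsyncIterable`, so a follower process can tail memory events as they land. A minimal sketch (no options passed; the record shape is whatever `MemoryEventRecord` defines):
+
+ ```typescript
+ for await (const event of memory.streamChanges()) {
+   // One MemoryEventRecord per change; forward to your own log or metrics sink
+   console.log(event);
+ }
+ ```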
 
430
- ### MemoryRuntime
660
+ <details>
661
+ <summary><strong>MemoryRuntime</strong> — model-call integration hooks</summary>
431
662
 
432
663
  Returned by `createMemoryRuntime(manager)`. Higher-level hooks for model call integration.
433
664
 
434
665
  ```typescript
435
666
  interface MemoryRuntime {
667
+ manager: MemoryManager;
436
668
  startSession(relevanceQuery?): Promise<{ bootstrap, bootstrapPrompt }>
437
669
  resumeSession(relevanceQuery?): Promise<{ bootstrap, bootstrapPrompt }>
438
670
  beforeModelCall(input): Promise<{
@@ -442,10 +674,14 @@ interface MemoryRuntime {
442
674
  wrapModelCall(modelFn, input, actors?): Promise<{
443
675
  result, runtime, exchange, trackedWorkItems
444
676
  }>
677
+ refreshSnapshot(): Promise<SessionSnapshot | null>
678
+ getSnapshot(): SessionSnapshot | null
445
679
  }
446
680
  ```
681
+ </details>
447
682
 
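+ A usage sketch for these hooks (only the method names and return shapes above come from the interface; the argument passed to your model function is an assumption):
+
+ ```typescript
+ const runtime = createMemoryRuntime(memory);
+ const { bootstrapPrompt } = await runtime.startSession();
+
+ // wrapModelCall prepares context, invokes your model, then records the exchange
+ const { result } = await runtime.wrapModelCall(
+   (prepared) => callYourModel(prepared), // hypothetical model function
+   userInput,
+ );
+ ```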
- ### MemoryContext
+ <details>
+ <summary><strong>MemoryContext</strong> — prompt-ready retrieval output</summary>

  The structured object returned by `getContext()`. Ready for prompt injection.

@@ -454,137 +690,72 @@ interface MemoryContext {
    mode: 'chat' | 'coding' | 'autonomous_agent' | 'review';
    activeTurns: Turn[];
    workingMemory: WorkingMemory | null;
-   trustedCoreMemory: KnowledgeMemory[]; // High-confidence, durable facts
-   taskRelevantKnowledge: KnowledgeMemory[]; // Matched to current query
-   provisionalKnowledge: KnowledgeMemory[]; // Not yet fully trusted
-   disputedKnowledge: KnowledgeMemory[]; // Contradicted facts
-   relevantKnowledge: KnowledgeMemory[]; // All selected facts
+   trustedCoreMemory: KnowledgeMemory[];
+   taskRelevantKnowledge: KnowledgeMemory[];
+   provisionalKnowledge: KnowledgeMemory[];
+   disputedKnowledge: KnowledgeMemory[];
+   relevantKnowledge: KnowledgeMemory[];
+   associatedKnowledge: KnowledgeMemory[];
    recentSummaries: WorkingMemory[];
    currentObjective: string | null;
+   sessionState: SessionState;
    activeObjectives: WorkItem[];
    unresolvedWork: string[];
+   coordinationState: CoordinationState | null;
+   relevantPlaybooks: Playbook[];
    knowledgeSelectionReasons: KnowledgeSelectionReason[];
+   debugTrace: ContextDebugTrace;
    tokenEstimate: number;
  }
  ```

- ---
-
- ## Configuration
-
- ### createMemory() Options
-
- ```typescript
- createMemory({
-   // Storage
-   adapter?: 'sqlite' | 'memory' | StorageAdapter,
-   path?: string,
-
-   // Identity
-   scope?: string | MemoryScope,
-   sessionId?: string,
-
-   // Behavior
-   preset?: 'ai_ide' | 'chat_agent' | 'autonomous_agent',
-   qualityMode?: 'fast_adoption' | 'balanced_memory' | 'high_fidelity_memory',
-
-   // Components (auto-resolved if omitted)
-   summarizer?: 'extractive' | 'claude' | 'openai' | Summarizer,
-   extractor?: 'regex' | 'heuristic' | 'claude' | 'openai' | Extractor | false,
-   embeddingGenerator?: 'local' | EmbeddingGenerator | false,
-
-   // Fine-tuning (partial overrides merge with preset/quality defaults)
-   policies?: {
-     monitor?: Partial<MonitorPolicy>,
-     extraction?: Partial<ExtractionPolicy>,
-     context?: Partial<ContextPolicy>,
-     maintenance?: Partial<MaintenancePolicy>,
-   },
+ </details>

-   // Automation
-   autoCompact?: boolean, // default: true
-   autoExtract?: boolean, // default: true (when extractor present)
-   crossScopeLevel?: ScopeLevel,
+ ---

-   // Observability
-   logger?: Logger,
-   onEvent?: EventHook,
-   redactText?: (input: { kind: string; text: string }) => string,
+ ## Python Client

-   // Resilience
-   failurePolicy?: {
-     summarizer?: 'throw' | 'retry_once' | 'log_and_continue',
-     extractor?: 'throw' | 'retry_once' | 'log_and_continue' | 'disable_auto_extract',
-   },
- })
+ ```bash
+ pip install memory-layer-client
  ```

- ### Policy Reference
-
- <details>
- <summary><strong>MonitorPolicy</strong> — when compaction triggers</summary>
-
- | Field | Default | Description |
- |-------|---------|-------------|
- | `softTurnThreshold` | 15 | Turns before soft compaction is considered |
- | `hardTurnThreshold` | 30 | Turns that force compaction |
- | `softTokenThreshold` | 3000 | Token estimate for soft trigger |
- | `hardTokenThreshold` | 6000 | Token estimate that forces compaction |
- | `softRetainTurns` | 12 | Turns to keep after soft compaction |
- | `hardRetainTurns` | 8 | Turns to keep after hard compaction |
- | `intraSessionGapSeconds` | 1800 | Idle gap that triggers session_gap compaction |
-
- </details>
-
- <details>
- <summary><strong>ExtractionPolicy</strong> — how facts are extracted and promoted</summary>
+ The Python client mirrors the full HTTP API surface. Run the Node service and point Python at it.

- | Field | Default | Description |
- |-------|---------|-------------|
- | `autoExtractAfterCompaction` | true | Run extraction after each compaction |
- | `maxFactsPerExtraction` | 10 | Max facts per compaction cycle |
- | `deduplicateFacts` | true | Deduplicate against existing knowledge |
- | `minConfidenceForPromotion` | `'medium'` | Minimum confidence for storage |
- | `trustPromotionThreshold` | 0.7 | Score required for `trusted` state |
- | `contradictionDisputeThreshold` | 0.35 | Score that marks facts `disputed` |
- | `requireGroundingForTrusted` | true | Require evidence in source turns |
- | `conflictStrategy` | `'supersede'` | How to handle conflicting facts |
+ ```python
+ from memory_layer_client import MemoryClient, MemoryRuntimeClient, MemoryScope

- </details>
+ client = MemoryClient(
+     "http://localhost:3100",
+     default_scope=MemoryScope(
+         tenant_id="acme",
+         system_id="research-agent",
+         scope_id="session-1",
+     ),
+ )

- <details>
- <summary><strong>ContextPolicy</strong> — how knowledge is selected for prompts</summary>
+ runtime = MemoryRuntimeClient(client)

- | Field | Default | Description |
- |-------|---------|-------------|
- | `mode` | `'chat'` | Scoring profile: `chat`, `coding`, `autonomous_agent`, `review` |
- | `maxKnowledgeItems` | 20 | Max facts in assembled context |
- | `maxRecentSummaries` | 3 | Max summaries in context |
- | `tokenBudget` | unlimited | Token cap for context |
- | `lexicalWeight` | 1.0 | Full-text search weight |
- | `semanticWeight` | 1.0 | Embedding similarity weight |
- | `recencyWeight` | 1.0 | Access recency weight |
- | `trustWeight` | 1.3 | Knowledge confidence weight |
- | `importanceWeight` | 0.25 | Access frequency weight |
- | `diversityPenalty` | 0.2 | Same-type clustering penalty |
+ # Full before/after cycle with your model
+ result = runtime.run_turn(
+     "What constraints apply to this project?",
+     lambda prepared: call_model(prepared.context),
+ )

- </details>
+ # Direct operations
+ client.learn_fact("Deployment target is AWS us-east-1", "constraint")
+ results = client.search("deployment")
+ context = client.get_context("deployment constraints")
+ ```

- <details>
- <summary><strong>MaintenancePolicy</strong> — data lifecycle and cleanup</summary>
+ Async support:

- | Field | Default | Description |
- |-------|---------|-------------|
- | `workingMemoryTtlSeconds` | 30 days | Summary expiry |
- | `completedWorkItemTtlSeconds` | 14 days | Completed work item cleanup |
- | `knowledgeStaleAfterSeconds` | 60 days | Knowledge staleness threshold |
- | `minKnowledgeAccessCount` | 1 | Minimum accesses to avoid retirement |
- | `maxActiveKnowledgeItems` | 500 | Hard cap on active knowledge |
- | `reverificationCadenceDays` | 30 | Days between reverification checks |
- | `trustedCoreRetentionDays` | 365 | Retention for identity/preference/constraint |
- | `provisionalRetentionDays` | 7 | Retention for provisional facts |
+ ```python
+ from memory_layer_client import AsyncMemoryClient, AsyncMemoryRuntimeClient

- </details>
+ async with AsyncMemoryClient("http://localhost:3100") as client:
+     runtime = AsyncMemoryRuntimeClient(client)
+     result = await runtime.run_turn(user_input, model_call)
+ ```

  ---

@@ -596,7 +767,7 @@ createMemory({
  const memory = createMemory({
    onEvent: (event) => {
      // event.type: 'compaction' | 'extraction' | 'promotion' | 'retrieval' |
-     //   'search' | 'maintenance' | 'capability' | 'knowledge_change'
+     //   'search' | 'maintenance' | 'capability' | 'knowledge_change' | 'context_assembly'
      console.log(`[${event.type}] scope=${event.scope.scope_id} duration=${event.durationMs}ms`);
    },
  });
@@ -610,6 +781,7 @@ import { createMemoryEventEmitter } from 'ai-memory-layer';
  const emitter = createMemoryEventEmitter();
  emitter.on('compaction', (e) => metrics.track('compaction', e.durationMs));
  emitter.on('extraction', (e) => metrics.track('facts_extracted', e.meta.factCount));
+ emitter.on('knowledge_change', (e) => audit.log(e.meta.action, e.meta.knowledgeId));

  const memory = createMemory({ eventEmitter: emitter });
  ```
@@ -623,99 +795,41 @@ const memory = createMemory({
  });
  ```

- ---
-
- ## Surfaces
-
- One memory engine, multiple access patterns:
-
- | Surface | Best For | How to Start |
- |---------|---------|-------------|
- | **Node package** | In-process agents, IDEs | `import { createMemory } from 'ai-memory-layer'` |
- | **HTTP API** | Polyglot services, hosted deployments | `npx memory-layer serve --transport http` |
- | **MCP server** | Tool ecosystems that speak MCP | `npx memory-layer serve --transport mcp` |
- | **CLI** | Inspection, admin, debugging | `npx memory-layer inspect` |
- | **Python client** | Python agents consuming the HTTP API | `pip install memory-layer-client` |
-
- The Node package is the source of truth. HTTP mirrors it over REST ([`openapi.yaml`](openapi.yaml)). MCP and CLI are operational wrappers. The Python client follows the HTTP contract.
-
- ---
-
- ## Storage
-
- | Backend | Best For | Install |
- |---------|---------|---------|
- | **In-memory** | Tests, prototypes, zero-friction | Built-in |
- | **SQLite** | Single-process production, local agents | `npm install better-sqlite3` |
- | **PostgreSQL + pgvector** | Multi-writer, hosted, high-volume | `npm install pg` |
-
- SQLite is the low-friction path. Postgres + pgvector is the strongest scaling path with ANN indexing for semantic retrieval.
+ ### Circuit Breakers

- ---
-
- ## Embeddings
+ Summarizer, extractor, and embedding subsystems each have independent circuit breakers. When a provider goes down, the system degrades gracefully — retrieval falls back to lexical-only, extraction disables auto-extract, and telemetry emits `degraded_mode` events so you can alert on it.

  ```typescript
- // Auto-resolved: local heuristic if no API key, OpenAI/Voyage if key present
- const memory = createMemory(); // just works
-
- // Explicit local (offline, pure-JS)
- const memory = createMemory({ embeddingGenerator: 'local' });
-
- // Explicit provider
- import { createOpenAIEmbeddingGenerator } from 'ai-memory-layer';
- const memory = createMemory({
-   embeddingGenerator: createOpenAIEmbeddingGenerator({ apiKey: process.env.OPENAI_API_KEY }),
- });
-
- // Custom
- const memory = createMemory({
-   embeddingGenerator: async (texts) => texts.map(t => new Float32Array(/* your vectors */)),
- });
+ const diagnostics = await memory.getRuntimeDiagnostics();
+ // diagnostics.circuitBreakers → { summarizer: { state, failures, ... }, extractor: ..., embeddings: ... }
  ```

- Built-in resilience for provider embeddings: `withRetry()`, `batchedGenerate()`, `createCachedEmbeddingGenerator()`.
-
  ---

- ## Testing & Evals
-
- ### Unit Tests
-
- ```bash
- npm test # 257 test cases across 30+ files
- npm run test:coverage # with coverage reporting
- ```
+ ## Storage Backends

- ### Memory Quality Gate
+ | Backend | Best For | Install | Search |
+ |---------|---------|---------|--------|
+ | **In-memory** | Tests, prototypes | Built-in | Exact match |
+ | **SQLite** | Single-process production | `npm install better-sqlite3` | FTS5 + local embeddings |
+ | **PostgreSQL** | Multi-writer, hosted | `npm install pg` | tsvector + pgvector ANN |

- A 14-metric behavioral eval suite that acts as a hard release gate:
+ SQLite is the zero-friction production path. PostgreSQL + pgvector is the scaling path with ANN indexing for high-volume semantic retrieval.

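+ Backend choice is a construction-time option. A sketch, assuming the `adapter` and `path` options from earlier releases carry over unchanged:
+
+ ```typescript
+ // Ephemeral store for tests and prototypes
+ const scratch = createMemory({ adapter: 'memory' });
+
+ // Durable local store (requires better-sqlite3)
+ const durable = createMemory({ adapter: 'sqlite', path: './data/memory.db' });
+ ```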
- ```bash
- npm run eval:memory-quality:enforce # all 14 metrics must pass
- npm run eval:memory-quality:delta:enforce # must not regress from baseline
- ```
-
- Metrics include: constraint retention, preference retention, identity retention, update correctness, false memory rate, contradiction resolution, trusted memory precision/recall, scope isolation, compaction fidelity, and maintenance fidelity.
+ ---

- Current baseline: **100/100** on all 14 metrics.
+ ## Testing & Quality

- ### Full Release Gate
+ Broad automated test coverage plus a behavioral eval gate keep the package honest before release.

  ```bash
- npm run release:check
+ npm test # full test suite
+ npm run eval:memory-quality:enforce # all 14 metrics must pass
+ npm run eval:memory-quality:delta:enforce # must not regress from baseline
+ npm run release:check # full release gate

  ```
- Runs: lint, test coverage, retrieval eval, scenario eval, memory quality gate, delta regression check, Python client checks, platform quality proof (HTTP + Node CLI + Python CLI), and package validation.
-
-
- ---
-
- ## Docker
-
- ```bash
- docker build -t memory-layer .
- docker run --rm -p 3100:3100 -v "$(pwd)/data:/data" memory-layer
+ Release checks cover unit and integration tests, behavioral memory-quality evals, transport parity, platform proofs, and packaging validation.
  ```

  ---

@@ -738,11 +852,11 @@ docker run --rm -p 3100:3100 -v "$(pwd)/data:/data" memory-layer

  ---

- ## Export / Import
+ ## Docker

  ```bash
- node scripts/export-memory.mjs ./data/memory.db ./backup.json
- node scripts/import-memory.mjs ./data/restored.db ./backup.json
+ docker build -t memory-layer .
+ docker run --rm -p 3100:3100 -v "$(pwd)/data:/data" memory-layer
  ```

  ---
@@ -751,15 +865,16 @@ node scripts/import-memory.mjs ./data/restored.db ./backup.json

  - [Deployment Guide](docs/DEPLOYMENT.md) — embedded, HTTP, MCP, Docker
  - [Integration Patterns](docs/INTEGRATIONS.md) — AI IDE, hosted service, autonomous agent, framework adapters
+ - [Operations Guide](docs/OPERATIONS.md) — monitoring, maintenance, scaling
+ - [Security Guide](docs/SECURITY.md) — trust model, auth boundaries, PII handling
  - [Memory Quality Rubric](docs/MEMORY_QUALITY_RUBRIC.md) — the 14-metric eval framework
- - [Release Gate](docs/MEMORY_QUALITY_RELEASE_GATE.md) — how quality gates enforce the baseline
  - [OpenAPI Spec](openapi.yaml) — full HTTP API contract
- - [Security Guide](docs/SECURITY.md)
+ - [Changelog](CHANGELOG.md)

  ---

  ## Requirements

- - Node 20+
- - MIT licensed
- - Optional provider SDKs are dynamically imported — no hard dependencies on `@anthropic-ai/sdk`, `openai`, `better-sqlite3`, or `pg`
+ - **Node 20+**
+ - **MIT licensed**
+ - Optional provider SDKs (`@anthropic-ai/sdk`, `openai`, `better-sqlite3`, `pg`) are dynamically imported — zero hard dependencies beyond Node