ai-memory-layer 2.0.0
- package/CHANGELOG.md +26 -0
- package/LICENSE +21 -0
- package/README.md +765 -0
- package/bin/memory-server.mjs +157 -0
- package/dist/adapters/memory/embeddings.d.ts +4 -0
- package/dist/adapters/memory/embeddings.d.ts.map +1 -0
- package/dist/adapters/memory/embeddings.js +53 -0
- package/dist/adapters/memory/embeddings.js.map +1 -0
- package/dist/adapters/memory/index.d.ts +7 -0
- package/dist/adapters/memory/index.d.ts.map +1 -0
- package/dist/adapters/memory/index.js +650 -0
- package/dist/adapters/memory/index.js.map +1 -0
- package/dist/adapters/postgres/index.d.ts +38 -0
- package/dist/adapters/postgres/index.d.ts.map +1 -0
- package/dist/adapters/postgres/index.js +982 -0
- package/dist/adapters/postgres/index.js.map +1 -0
- package/dist/adapters/sqlite/embeddings.d.ts +5 -0
- package/dist/adapters/sqlite/embeddings.d.ts.map +1 -0
- package/dist/adapters/sqlite/embeddings.js +122 -0
- package/dist/adapters/sqlite/embeddings.js.map +1 -0
- package/dist/adapters/sqlite/index.d.ts +8 -0
- package/dist/adapters/sqlite/index.d.ts.map +1 -0
- package/dist/adapters/sqlite/index.js +839 -0
- package/dist/adapters/sqlite/index.js.map +1 -0
- package/dist/adapters/sqlite/mappers.d.ts +40 -0
- package/dist/adapters/sqlite/mappers.d.ts.map +1 -0
- package/dist/adapters/sqlite/mappers.js +95 -0
- package/dist/adapters/sqlite/mappers.js.map +1 -0
- package/dist/adapters/sqlite/schema.d.ts +4 -0
- package/dist/adapters/sqlite/schema.d.ts.map +1 -0
- package/dist/adapters/sqlite/schema.js +394 -0
- package/dist/adapters/sqlite/schema.js.map +1 -0
- package/dist/adapters/sync-to-async.d.ts +15 -0
- package/dist/adapters/sync-to-async.d.ts.map +1 -0
- package/dist/adapters/sync-to-async.js +95 -0
- package/dist/adapters/sync-to-async.js.map +1 -0
- package/dist/cli/inspect.d.ts +34 -0
- package/dist/cli/inspect.d.ts.map +1 -0
- package/dist/cli/inspect.js +190 -0
- package/dist/cli/inspect.js.map +1 -0
- package/dist/contracts/async-storage.d.ts +86 -0
- package/dist/contracts/async-storage.d.ts.map +1 -0
- package/dist/contracts/async-storage.js +2 -0
- package/dist/contracts/async-storage.js.map +1 -0
- package/dist/contracts/embedding.d.ts +22 -0
- package/dist/contracts/embedding.d.ts.map +1 -0
- package/dist/contracts/embedding.js +2 -0
- package/dist/contracts/embedding.js.map +1 -0
- package/dist/contracts/identity.d.ts +29 -0
- package/dist/contracts/identity.d.ts.map +1 -0
- package/dist/contracts/identity.js +34 -0
- package/dist/contracts/identity.js.map +1 -0
- package/dist/contracts/observability.d.ts +18 -0
- package/dist/contracts/observability.d.ts.map +1 -0
- package/dist/contracts/observability.js +7 -0
- package/dist/contracts/observability.js.map +1 -0
- package/dist/contracts/policy.d.ts +108 -0
- package/dist/contracts/policy.d.ts.map +1 -0
- package/dist/contracts/policy.js +107 -0
- package/dist/contracts/policy.js.map +1 -0
- package/dist/contracts/storage.d.ts +78 -0
- package/dist/contracts/storage.d.ts.map +1 -0
- package/dist/contracts/storage.js +2 -0
- package/dist/contracts/storage.js.map +1 -0
- package/dist/contracts/types.d.ts +381 -0
- package/dist/contracts/types.d.ts.map +1 -0
- package/dist/contracts/types.js +94 -0
- package/dist/contracts/types.js.map +1 -0
- package/dist/core/circuit-breaker.d.ts +11 -0
- package/dist/core/circuit-breaker.d.ts.map +1 -0
- package/dist/core/circuit-breaker.js +38 -0
- package/dist/core/circuit-breaker.js.map +1 -0
- package/dist/core/context.d.ts +56 -0
- package/dist/core/context.d.ts.map +1 -0
- package/dist/core/context.js +345 -0
- package/dist/core/context.js.map +1 -0
- package/dist/core/events.d.ts +8 -0
- package/dist/core/events.d.ts.map +1 -0
- package/dist/core/events.js +25 -0
- package/dist/core/events.js.map +1 -0
- package/dist/core/extractor.d.ts +37 -0
- package/dist/core/extractor.d.ts.map +1 -0
- package/dist/core/extractor.js +448 -0
- package/dist/core/extractor.js.map +1 -0
- package/dist/core/formatter.d.ts +25 -0
- package/dist/core/formatter.d.ts.map +1 -0
- package/dist/core/formatter.js +97 -0
- package/dist/core/formatter.js.map +1 -0
- package/dist/core/knowledge-lifecycle.d.ts +15 -0
- package/dist/core/knowledge-lifecycle.d.ts.map +1 -0
- package/dist/core/knowledge-lifecycle.js +103 -0
- package/dist/core/knowledge-lifecycle.js.map +1 -0
- package/dist/core/maintenance.d.ts +13 -0
- package/dist/core/maintenance.d.ts.map +1 -0
- package/dist/core/maintenance.js +102 -0
- package/dist/core/maintenance.js.map +1 -0
- package/dist/core/manager.d.ts +110 -0
- package/dist/core/manager.d.ts.map +1 -0
- package/dist/core/manager.js +640 -0
- package/dist/core/manager.js.map +1 -0
- package/dist/core/monitor.d.ts +73 -0
- package/dist/core/monitor.d.ts.map +1 -0
- package/dist/core/monitor.js +395 -0
- package/dist/core/monitor.js.map +1 -0
- package/dist/core/orchestrator.d.ts +64 -0
- package/dist/core/orchestrator.d.ts.map +1 -0
- package/dist/core/orchestrator.js +916 -0
- package/dist/core/orchestrator.js.map +1 -0
- package/dist/core/presets.d.ts +15 -0
- package/dist/core/presets.d.ts.map +1 -0
- package/dist/core/presets.js +99 -0
- package/dist/core/presets.js.map +1 -0
- package/dist/core/provider-managers.d.ts +47 -0
- package/dist/core/provider-managers.d.ts.map +1 -0
- package/dist/core/provider-managers.js +112 -0
- package/dist/core/provider-managers.js.map +1 -0
- package/dist/core/quick.d.ts +62 -0
- package/dist/core/quick.d.ts.map +1 -0
- package/dist/core/quick.js +300 -0
- package/dist/core/quick.js.map +1 -0
- package/dist/core/retrieval.d.ts +29 -0
- package/dist/core/retrieval.d.ts.map +1 -0
- package/dist/core/retrieval.js +150 -0
- package/dist/core/retrieval.js.map +1 -0
- package/dist/core/runtime.d.ts +67 -0
- package/dist/core/runtime.d.ts.map +1 -0
- package/dist/core/runtime.js +84 -0
- package/dist/core/runtime.js.map +1 -0
- package/dist/core/streaming.d.ts +37 -0
- package/dist/core/streaming.d.ts.map +1 -0
- package/dist/core/streaming.js +51 -0
- package/dist/core/streaming.js.map +1 -0
- package/dist/core/sync.d.ts +13 -0
- package/dist/core/sync.d.ts.map +1 -0
- package/dist/core/sync.js +46 -0
- package/dist/core/sync.js.map +1 -0
- package/dist/core/telemetry.d.ts +8 -0
- package/dist/core/telemetry.d.ts.map +1 -0
- package/dist/core/telemetry.js +14 -0
- package/dist/core/telemetry.js.map +1 -0
- package/dist/core/tokens.d.ts +8 -0
- package/dist/core/tokens.d.ts.map +1 -0
- package/dist/core/tokens.js +59 -0
- package/dist/core/tokens.js.map +1 -0
- package/dist/core/trust.d.ts +23 -0
- package/dist/core/trust.d.ts.map +1 -0
- package/dist/core/trust.js +164 -0
- package/dist/core/trust.js.map +1 -0
- package/dist/core/validation.d.ts +36 -0
- package/dist/core/validation.d.ts.map +1 -0
- package/dist/core/validation.js +185 -0
- package/dist/core/validation.js.map +1 -0
- package/dist/embeddings/local.d.ts +5 -0
- package/dist/embeddings/local.d.ts.map +1 -0
- package/dist/embeddings/local.js +128 -0
- package/dist/embeddings/local.js.map +1 -0
- package/dist/embeddings/openai.d.ts +26 -0
- package/dist/embeddings/openai.d.ts.map +1 -0
- package/dist/embeddings/openai.js +48 -0
- package/dist/embeddings/openai.js.map +1 -0
- package/dist/embeddings/resilience.d.ts +5 -0
- package/dist/embeddings/resilience.d.ts.map +1 -0
- package/dist/embeddings/resilience.js +53 -0
- package/dist/embeddings/resilience.js.map +1 -0
- package/dist/embeddings/voyage.d.ts +30 -0
- package/dist/embeddings/voyage.d.ts.map +1 -0
- package/dist/embeddings/voyage.js +53 -0
- package/dist/embeddings/voyage.js.map +1 -0
- package/dist/index.d.ts +72 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +40 -0
- package/dist/index.js.map +1 -0
- package/dist/integrations/claude-agent.d.ts +21 -0
- package/dist/integrations/claude-agent.d.ts.map +1 -0
- package/dist/integrations/claude-agent.js +44 -0
- package/dist/integrations/claude-agent.js.map +1 -0
- package/dist/integrations/claude-tools.d.ts +18 -0
- package/dist/integrations/claude-tools.d.ts.map +1 -0
- package/dist/integrations/claude-tools.js +60 -0
- package/dist/integrations/claude-tools.js.map +1 -0
- package/dist/integrations/langchain.d.ts +24 -0
- package/dist/integrations/langchain.d.ts.map +1 -0
- package/dist/integrations/langchain.js +48 -0
- package/dist/integrations/langchain.js.map +1 -0
- package/dist/integrations/mcp.d.ts +23 -0
- package/dist/integrations/mcp.d.ts.map +1 -0
- package/dist/integrations/mcp.js +60 -0
- package/dist/integrations/mcp.js.map +1 -0
- package/dist/integrations/middleware.d.ts +15 -0
- package/dist/integrations/middleware.d.ts.map +1 -0
- package/dist/integrations/middleware.js +27 -0
- package/dist/integrations/middleware.js.map +1 -0
- package/dist/integrations/openai-tools.d.ts +21 -0
- package/dist/integrations/openai-tools.d.ts.map +1 -0
- package/dist/integrations/openai-tools.js +69 -0
- package/dist/integrations/openai-tools.js.map +1 -0
- package/dist/integrations/vercel-ai.d.ts +19 -0
- package/dist/integrations/vercel-ai.d.ts.map +1 -0
- package/dist/integrations/vercel-ai.js +41 -0
- package/dist/integrations/vercel-ai.js.map +1 -0
- package/dist/server/http-server.d.ts +61 -0
- package/dist/server/http-server.d.ts.map +1 -0
- package/dist/server/http-server.js +684 -0
- package/dist/server/http-server.js.map +1 -0
- package/dist/server/index.d.ts +5 -0
- package/dist/server/index.d.ts.map +1 -0
- package/dist/server/index.js +3 -0
- package/dist/server/index.js.map +1 -0
- package/dist/server/mcp-server.d.ts +61 -0
- package/dist/server/mcp-server.d.ts.map +1 -0
- package/dist/server/mcp-server.js +465 -0
- package/dist/server/mcp-server.js.map +1 -0
- package/dist/summarizers/claude.d.ts +11 -0
- package/dist/summarizers/claude.d.ts.map +1 -0
- package/dist/summarizers/claude.js +39 -0
- package/dist/summarizers/claude.js.map +1 -0
- package/dist/summarizers/client.d.ts +23 -0
- package/dist/summarizers/client.d.ts.map +1 -0
- package/dist/summarizers/client.js +24 -0
- package/dist/summarizers/client.js.map +1 -0
- package/dist/summarizers/extractive.d.ts +6 -0
- package/dist/summarizers/extractive.d.ts.map +1 -0
- package/dist/summarizers/extractive.js +204 -0
- package/dist/summarizers/extractive.js.map +1 -0
- package/dist/summarizers/extractor.d.ts +12 -0
- package/dist/summarizers/extractor.d.ts.map +1 -0
- package/dist/summarizers/extractor.js +75 -0
- package/dist/summarizers/extractor.js.map +1 -0
- package/dist/summarizers/openai.d.ts +11 -0
- package/dist/summarizers/openai.d.ts.map +1 -0
- package/dist/summarizers/openai.js +41 -0
- package/dist/summarizers/openai.js.map +1 -0
- package/dist/summarizers/prompts.d.ts +11 -0
- package/dist/summarizers/prompts.d.ts.map +1 -0
- package/dist/summarizers/prompts.js +104 -0
- package/dist/summarizers/prompts.js.map +1 -0
- package/docs/DEPLOYMENT.md +84 -0
- package/docs/INTEGRATIONS.md +64 -0
- package/docs/MEMORY_QUALITY_BASELINE.md +55 -0
- package/docs/MEMORY_QUALITY_RELEASE_GATE.md +63 -0
- package/docs/MEMORY_QUALITY_RUBRIC.md +249 -0
- package/docs/OPERATIONS.md +49 -0
- package/docs/SECURITY.md +25 -0
- package/openapi.yaml +843 -0
- package/package.json +157 -0
@@ -0,0 +1,84 @@
# Deployment Guide

`memory-layer` can run embedded in-process or as a standalone HTTP/MCP service.

## Embedded Package

Use the package directly when your application already runs in Node.js:

```ts
import { createMemory } from 'ai-memory-layer';

const memory = createMemory({
  adapter: 'sqlite',
  path: './data/memory.db',
  preset: 'ai_ide',
  scope: 'default',
});
```

This is the lowest-friction option for AI IDEs, copilots, and single-service agents.

The zero-config quick path is pure-JS and ephemeral. For durable local storage, install the optional `better-sqlite3` package and pass `adapter: 'sqlite'` with a file path.

## Standalone HTTP Service

Run the built-in server when multiple processes need shared memory:

```bash
npx memory-layer serve \
  --transport http \
  --db ./data/memory.db \
  --preset autonomous_agent \
  --port 3100
```

Recommended environment variables:

```bash
MEMORY_DB_PATH=./data/memory.db
MEMORY_TRANSPORT=http
MEMORY_PORT=3100
MEMORY_API_KEY=replace-me
MEMORY_ADMIN_API_KEY=replace-me-admin
```
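
The variables above can be validated before the server starts. The following is a minimal sketch, not the package's own loader: `loadServerConfig`, `ServerConfig`, and the helper names are hypothetical, the defaults simply mirror the example values, and rejecting the `replace-me` placeholders is an assumption about what a safe startup check might do.

```typescript
// Hypothetical config shape; only the variable names come from the guide above.
interface ServerConfig {
  dbPath: string;
  transport: 'http' | 'mcp';
  port: number;
  apiKey: string;
  adminApiKey: string;
}

type Env = Record<string, string | undefined>;

function parseTransport(raw: string): 'http' | 'mcp' {
  if (raw === 'http' || raw === 'mcp') return raw;
  throw new Error(`Unsupported MEMORY_TRANSPORT: ${raw}`);
}

// Assumption: empty keys and the `replace-me...` example values are rejected
// so that placeholder secrets never reach a running service.
function requireSecret(env: Env, name: string): string {
  const value = env[name] ?? '';
  if (value === '' || value.indexOf('replace-me') === 0) {
    throw new Error(`${name} must be set to a real secret`);
  }
  return value;
}

function loadServerConfig(env: Env): ServerConfig {
  const port = Number(env.MEMORY_PORT ?? '3100');
  if (!isFinite(port) || Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error(`Invalid MEMORY_PORT: ${env.MEMORY_PORT}`);
  }

  return {
    // Assumption: defaults mirror the example values shown above.
    dbPath: env.MEMORY_DB_PATH ?? './data/memory.db',
    transport: parseTransport(env.MEMORY_TRANSPORT ?? 'http'),
    port,
    apiKey: requireSecret(env, 'MEMORY_API_KEY'),
    adminApiKey: requireSecret(env, 'MEMORY_ADMIN_API_KEY'),
  };
}
```

In a Node.js process the function would typically be called as `loadServerConfig(process.env)`; taking the environment as a parameter keeps the check easy to test.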

## Docker

Build and run the provided image:

```bash
docker build -t memory-layer .
docker run --rm \
  -p 3100:3100 \
  -v "$(pwd)/data:/data" \
  -e MEMORY_API_KEY=local-dev-key \
  -e MEMORY_ADMIN_API_KEY=local-dev-admin \
  memory-layer
```

The container persists SQLite data under `/data/memory.db`.

## MCP Transport

Use the MCP server when integrating with AI tools that speak the Model Context Protocol:

```bash
npx memory-layer serve --transport mcp --db ./data/memory.db --preset ai_ide
```

The server reads and writes on stdio, so it fits directly into MCP-compatible runtimes.

## Production Notes

- Use a file-backed SQLite database or Postgres for durable deployments.
- Install `better-sqlite3` only when you want the durable SQLite path.
- Install `pg` only when you want the hosted Postgres path.
- Treat SQLite as the lowest-friction embedded path. Its semantic retrieval is an in-process scan over local embeddings, which is appropriate for local and moderate-sized workloads but not the strongest scaling path.
- Treat SQLite HTTP/MCP deployments as a single-process service contract. It is the right fit when one runtime owns writes and other components talk to that one service.
- Use Postgres when multiple processes, workers, or hosted instances need to write shared memory concurrently. That is the operationally safe multi-writer path.
- For the strongest hosted retrieval path, use Postgres with the `pgvector` extension enabled and keep the `knowledge_embeddings` HNSW index from `src/adapters/postgres/schema.sql`.
- Put `MEMORY_API_KEY` behind an API gateway or private network if the service is shared.
- Reserve `MEMORY_ADMIN_API_KEY` for compaction and maintenance automation.
- Keep `bodyLimitBytes` low unless you intentionally ingest large prompts or transcripts.
- Enable SSE consumers on `/v1/events` when you want real-time observability hooks.
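
The storage guidance in these notes can be read as a small decision rule. The sketch below encodes it as a helper; `recommendBackend`, `DeploymentNeeds`, and the backend labels are hypothetical names chosen for illustration, and the rule is only one reading of the notes above, not logic shipped by the package.

```typescript
// Hypothetical labels for the storage paths described in the notes above.
type Backend = 'memory' | 'sqlite' | 'postgres' | 'postgres+pgvector';

interface DeploymentNeeds {
  durable: boolean;               // must memory survive a restart?
  concurrentWriters: boolean;     // do several processes write at the same time?
  largeSemanticWorkload: boolean; // is retrieval beyond a local or moderate-sized workload?
}

function recommendBackend(needs: DeploymentNeeds): Backend {
  // The in-process embedding scan is the SQLite path's scaling limit,
  // so a large semantic workload points at Postgres with pgvector.
  if (needs.largeSemanticWorkload) return 'postgres+pgvector';
  // Postgres is described as the safe multi-writer path.
  if (needs.concurrentWriters) return 'postgres';
  // A single writer that needs durability fits file-backed SQLite.
  if (needs.durable) return 'sqlite';
  // Otherwise the ephemeral zero-config path is enough.
  return 'memory';
}
```

A single-process agent that only needs restarts to be safe would get `'sqlite'`; a pool of workers writing to shared memory would get `'postgres'`.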
@@ -0,0 +1,64 @@
# Integration Guide

## Core Choices

Pick the narrowest integration surface that fits your system:

- Package API: best for Node apps and AI IDE extensions
- HTTP API: best for polyglot services and hosted memory
- MCP server: best for tools that already use MCP

## AI IDE Pattern

Use `createMemoryRuntime()` to inject prompt-ready context before each model call and persist the exchange afterward:

```ts
const prepared = await runtime.beforeModelCall(userInput);
const result = await model(prepared.prompt);
await runtime.afterModelCall({
  userInput,
  assistantOutput: result,
});
```
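
The call order can be exercised without a real model or database. The sketch below uses a hypothetical in-memory stand-in: `StubRuntime`, `runTurn`, and the prompt layout are assumptions made for illustration, and only the method names `beforeModelCall` and `afterModelCall` and the `prompt`, `userInput`, and `assistantOutput` fields come from the snippet above.

```typescript
interface Exchange {
  userInput: string;
  assistantOutput: string;
}

type Model = (prompt: string) => Promise<string>;

// Hypothetical stand-in for the real runtime: it keeps exchanges in an array
// and prepends them to the next prompt. The real context format will differ.
class StubRuntime {
  private history: Exchange[] = [];

  async beforeModelCall(userInput: string): Promise<{ prompt: string }> {
    const lines = this.history.map(
      (e) => `User: ${e.userInput}\nAssistant: ${e.assistantOutput}`,
    );
    lines.push(`User: ${userInput}`);
    return { prompt: lines.join('\n') };
  }

  async afterModelCall(exchange: Exchange): Promise<void> {
    this.history.push(exchange);
  }
}

// One turn: inject context, call the model, then persist the exchange.
async function runTurn(
  runtime: StubRuntime,
  model: Model,
  userInput: string,
): Promise<string> {
  const prepared = await runtime.beforeModelCall(userInput);
  const assistantOutput = await model(prepared.prompt);
  await runtime.afterModelCall({ userInput, assistantOutput });
  return assistantOutput;
}
```

Because the exchange is persisted only after the model returns, the second call to `runTurn` sees the first exchange in its prompt, which is the behavior the pattern relies on.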

## Hosted Service Pattern

Run one HTTP service and route each request into its own scope:

- `tenant_id`: customer or org
- `system_id`: product surface
- `workspace_id`: shared project memory
- `scope_id`: thread, task, run, or conversation

Operational contract:

- Use a single SQLite-backed service when one process should own writes.
- Use Postgres-backed hosting when multiple workers or agents need concurrent shared-memory writes.
- Use `collaboration_id` when memory must be intentionally shared across distinct systems without collapsing all workspace memory together.
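
Routing a request into a scope amounts to mapping application identifiers onto the fields listed above. The sketch below is illustrative only: `MemoryScope`, `RequestContext`, and `scopeFor` are hypothetical names, and treating `tenant_id`, `system_id`, and `scope_id` as required while the other two are optional is an assumption, not a documented rule.

```typescript
// Field names come from the list above; which ones are required is an assumption.
interface MemoryScope {
  tenant_id: string;
  system_id: string;
  scope_id: string;
  workspace_id?: string;
  collaboration_id?: string;
}

// Hypothetical request context supplied by the calling application.
interface RequestContext {
  org: string;
  product: string;
  conversation: string;
  project?: string;
  sharedWith?: string;
}

function scopeFor(ctx: RequestContext): MemoryScope {
  for (const part of [ctx.org, ctx.product, ctx.conversation]) {
    if (part.trim() === '') {
      throw new Error('org, product, and conversation must be non-empty');
    }
  }

  const scope: MemoryScope = {
    tenant_id: ctx.org,
    system_id: ctx.product,
    scope_id: ctx.conversation,
  };
  // Optional fields are added only when the caller opts in to sharing.
  if (ctx.project !== undefined) scope.workspace_id = ctx.project;
  if (ctx.sharedWith !== undefined) scope.collaboration_id = ctx.sharedWith;
  return scope;
}
```

Leaving `collaboration_id` unset by default keeps cross-system sharing an explicit choice, which matches the intent of the last bullet.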

## Autonomous Agent Pattern

Use the `autonomous_agent` preset, aggressive compaction, work-item tracking, and periodic maintenance:

```ts
await manager.trackWorkItem('Finish migration rollout', 'objective', 'in_progress');
await runtime.afterModelCall({ userInput, assistantOutput });
await manager.runMaintenance();
```

## Framework Adapters

The repo now ships tested integrations for:

- Claude-adjacent agent wrapping via `wrapClaudeAgentModel()`
- OpenAI/Claude tool-call surfaces via `createOpenAIMemoryTools()` and `createClaudeMemoryTools()`
- Vercel AI SDK middleware-style wrapping via `wrapVercelAIModel()`
- LangChain chat-history style bridging via `createLangChainMemoryBridge()`

Runnable examples:

- `examples/autonomous-agent.ts`: Claude-style lifecycle wrapping without requiring a provider SDK just to understand the flow
- `examples/tool-calling-agent.ts`: OpenAI-compatible tool surface
- `examples/langchain.ts`: LangChain memory variable bridge
- `examples/multi-agent-postgres.ts`: real Postgres-backed shared memory using `MEMORY_DATABASE_URL`
- `clients/python/`: hosted Python client helpers for service-oriented deployments
@@ -0,0 +1,55 @@
# Memory Quality Baseline

This document records the current release-quality baseline used by the enforced delta report in `evals/memory-quality/baseline.json`.

It is the known-good anchor that future releases must not regress from.

## Baseline Run

Command:

```bash
npm run eval:memory-quality
```

Result:

- overall score: `100`
- passed: `true`

## Metrics

| Metric | Baseline | Threshold | Status |
|---|---:|---:|---|
| `constraintRetentionRate` | `1.00` | `0.92` | pass |
| `preferenceRetentionRate` | `1.00` | `0.90` | pass |
| `identityRetentionRate` | `1.00` | `0.95` | pass |
| `procedureRetentionRate` | `1.00` | `0.88` | pass |
| `updateCorrectnessRate` | `1.00` | `0.88` | pass |
| `strategyOutcomeRecallRate` | `1.00` | `0.85` | pass |
| `falseMemoryRate` | `0.00` | `0.05` | pass |
| `contradictionResolutionAccuracy` | `1.00` | `0.85` | pass |
| `trustedMemoryPrecision` | `1.00` | `0.90` | pass |
| `trustedMemoryRecall` | `1.00` | `0.88` | pass |
| `memoryIsolationAccuracy` | `1.00` | `0.95` | pass |
| `provisionalLeakRate` | `0.00` | `0.08` | pass |
| `postCompactionFidelityScore` | `1.00` | `0.88` | pass |
| `postMaintenanceFidelityScore` | `1.00` | `0.86` | pass |
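
A delta check against this table compares each current value with its baseline rather than with its threshold. The sketch below shows one way such a comparison could work; `findRegressions`, the flat `Metrics` map, and the `tolerance` parameter are hypothetical, and the real layout of `baseline.json` and the exact delta rule are not described here.

```typescript
// Assumption: both runs are available as flat name -> value maps.
type Metrics = Record<string, number>;

// The two rates in the table where a smaller value is better.
const LOWER_IS_BETTER = ['falseMemoryRate', 'provisionalLeakRate'];

// Returns the names of metrics that moved in the wrong direction relative to
// the baseline. A metric missing from the current run counts as a regression.
function findRegressions(
  baseline: Metrics,
  current: Metrics,
  tolerance = 0,
): string[] {
  return Object.keys(baseline).filter((name) => {
    const now = current[name];
    if (now === undefined) return true;
    const lowerIsBetter = LOWER_IS_BETTER.indexOf(name) !== -1;
    return lowerIsBetter
      ? now > baseline[name] + tolerance
      : now < baseline[name] - tolerance;
  });
}
```

Under this rule a run with `constraintRetentionRate` at `0.97` still clears the `0.92` threshold but is reported as a regression from the `1.00` baseline, which is why a delta gate is stricter than a threshold gate.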

## What This Baseline Means

This baseline represents the current release claim:

- evidence-grounded promotion is working
- contradiction handling is explicit and safe
- trust-aware retrieval prefers durable memory correctly
- long-horizon compaction and maintenance preserve critical memory
- isolation and cross-scope behavior remain safe by default
- fresh-install no-provider replay still preserves the right local memory contract
- hosted shared-memory replay still surfaces the right cross-scope knowledge

Because the baseline is a known-good release anchor, the delta gate has a stricter meaning:

- green delta output means the current build has not regressed from the proven release baseline
- baseline refreshes should only happen after a full hard-gate pass
- any future baseline change should be treated as a deliberate quality reset, not a convenience update
@@ -0,0 +1,63 @@
# Memory Quality Release Gate

`memory-layer` treats memory quality as a release blocker, not a best-effort benchmark.

## Required Commands

Run these before shipping:

```bash
npm run eval:retrieval:enforce
npm run eval:memory-quality:enforce
npm run eval:memory-quality:delta:enforce
npm run python:check
npm run eval:platform-quality:enforce
```

The enforced memory-quality run must pass every threshold. The enforced delta report blocks regressions versus the recorded baseline in `evals/memory-quality/baseline.json`. Refresh that baseline only after a full hard-gate pass so it remains a known-good release anchor. The Python and platform checks prove that hosted HTTP, Node CLI inspection, Python client surfaces, the fresh-install no-provider replay, and the hosted shared-memory replay all still work against the same product contract.

## Final Thresholds

| Metric | Threshold |
|---|---:|
| `constraintRetentionRate` | `>= 0.92` |
| `preferenceRetentionRate` | `>= 0.90` |
| `identityRetentionRate` | `>= 0.95` |
| `procedureRetentionRate` | `>= 0.88` |
| `updateCorrectnessRate` | `>= 0.88` |
| `strategyOutcomeRecallRate` | `>= 0.85` |
| `falseMemoryRate` | `<= 0.05` |
| `contradictionResolutionAccuracy` | `>= 0.85` |
| `trustedMemoryPrecision` | `>= 0.90` |
| `trustedMemoryRecall` | `>= 0.88` |
| `memoryIsolationAccuracy` | `>= 0.95` |
| `provisionalLeakRate` | `<= 0.08` |
| `postCompactionFidelityScore` | `>= 0.88` |
| `postMaintenanceFidelityScore` | `>= 0.86` |
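
The table mixes two comparison directions, which is easy to get wrong in a hand-written check. The sketch below transcribes the thresholds and reports which metrics fail; `THRESHOLDS` and `failedMetrics` are hypothetical names, and treating a missing metric as a failure is an assumption rather than documented harness behavior.

```typescript
interface Threshold {
  op: '>=' | '<=';
  value: number;
}

// Values transcribed from the table above.
const THRESHOLDS: Record<string, Threshold> = {
  constraintRetentionRate: { op: '>=', value: 0.92 },
  preferenceRetentionRate: { op: '>=', value: 0.9 },
  identityRetentionRate: { op: '>=', value: 0.95 },
  procedureRetentionRate: { op: '>=', value: 0.88 },
  updateCorrectnessRate: { op: '>=', value: 0.88 },
  strategyOutcomeRecallRate: { op: '>=', value: 0.85 },
  falseMemoryRate: { op: '<=', value: 0.05 },
  contradictionResolutionAccuracy: { op: '>=', value: 0.85 },
  trustedMemoryPrecision: { op: '>=', value: 0.9 },
  trustedMemoryRecall: { op: '>=', value: 0.88 },
  memoryIsolationAccuracy: { op: '>=', value: 0.95 },
  provisionalLeakRate: { op: '<=', value: 0.08 },
  postCompactionFidelityScore: { op: '>=', value: 0.88 },
  postMaintenanceFidelityScore: { op: '>=', value: 0.86 },
};

// Returns the metrics that violate their threshold.
// Assumption: a metric that is missing from the run is treated as failed.
function failedMetrics(actual: Record<string, number>): string[] {
  return Object.keys(THRESHOLDS).filter((name) => {
    const value = actual[name];
    if (value === undefined) return true;
    const t = THRESHOLDS[name];
    return t.op === '>=' ? value < t.value : value > t.value;
  });
}
```

An empty result corresponds to a passing gate; any returned name is a release blocker under the rule stated above.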

## How To Read The Score

- `overallScore = 100` means every tracked metric satisfied its threshold (at or above it for `>=` metrics, at or below it for `<=` metrics).
- `passed = true` means there are no threshold failures.
- The score is evidence-backed only when the scenario list and per-metric evaluations also pass.

## Quality Modes

The quick factory reports mode behavior separately from the aggregate score:

- `fast_adoption`: easiest start, weakest trust posture.
- `balanced_memory`: recommended default.
- `high_fidelity_memory`: strictest safety and lifecycle posture.

Mode reporting is descriptive. The release gate is still the main memory-quality suite.

## Platform Proof

`memory-layer` does not treat core-engine quality as sufficient proof on its own. The final gate also requires:

- `npm run python:check`
  Verifies the Python package can be installed in a clean virtualenv, built, linted, and tested.
- `npm run eval:platform-quality:enforce`
  Starts the hosted server, seeds real memory, verifies hosted inspection routes, verifies the Node inspection CLI, verifies the Python CLI against the same live service, and replays a shared-memory hosted trace across multiple scopes.

The release claim is only defensible when both the engine-quality gate and the platform-quality gate are green.
@@ -0,0 +1,249 @@
# Memory Quality Rubric

This rubric defines what "memory quality" means for `memory-layer`.

The goal is not to measure how many features the system has. The goal is to measure whether an AI-heavy product can safely keep context, learn over time, and recall the right memory later.

## Scoring Philosophy

The overall score is a weighted summary of memory-quality metrics. A higher score means:

- important memory survives over time
- durable knowledge is correct more often
- updates and contradictions are handled safely
- retrieval surfaces trustworthy memory
- workflow memory behavior is safe across scopes and lineages

The score does **not** give extra credit for:

- packaging quality
- transport surfaces
- documentation quality
- number of tests
- size or complexity of the implementation

Those matter for adoption, but not for memory quality.

## Metrics

All metrics are normalized to `0.0` through `1.0`.

### `constraintRetentionRate`

How often durable constraints that should survive are still available when needed later.

Formula:

`correctly_recalled_constraints / total_expected_constraints`

Target:

`>= 0.92`

### `preferenceRetentionRate`

How often user or system preferences survive long enough to be recalled correctly.

Formula:

`correctly_recalled_preferences / total_expected_preferences`

Target:

`>= 0.90`

### `identityRetentionRate`

How often durable identity information survives and is recalled correctly.

Formula:

`correctly_recalled_identity_facts / total_expected_identity_facts`

Target:

`>= 0.95`

### `procedureRetentionRate`

How often durable procedural knowledge remains available and correctly recalled.

Formula:

`correctly_recalled_procedures / total_expected_procedures`

Target:

`>= 0.88`

### `updateCorrectnessRate`

How often the system prefers the latest correct fact when a value is revised or reversed.

Formula:

`correct_updates_handled / total_update_scenarios`

Target:

`>= 0.88`

### `strategyOutcomeRecallRate`

How often the system remembers that a strategy succeeded or failed, and recalls that outcome appropriately.

Formula:

`correct_strategy_outcome_recalls / total_strategy_outcome_checks`

Target:

`>= 0.85`

### `falseMemoryRate`

How often the system promotes or recalls unsupported memory as if it were durable truth.

Formula:

`false_durable_memories_detected / total_false_memory_checks`

Target:

`<= 0.05`

### `contradictionResolutionAccuracy`

How often contradictions are handled safely instead of silently preserving outdated memory.

Formula:

`correctly_resolved_contradictions / total_contradiction_checks`

Target:

`>= 0.85`

### `trustedMemoryPrecision`

How often memory that is surfaced as durable/trusted is actually correct and appropriate.

Formula:

`correct_trusted_recalls / total_trusted_recalls`

Target:

`>= 0.90`

### `trustedMemoryRecall`

How often the system successfully surfaces trusted memory when it should.

Formula:

`trusted_memories_recalled_when_needed / total_trusted_memory_needs`

Target:

`>= 0.88`

### `memoryIsolationAccuracy`

How often the system preserves the right boundary behavior between local, lineage, workspace, and cross-scope memory.

Formula:

`correct_isolation_or_inheritance_behaviors / total_isolation_checks`

Target:

`>= 0.95`

### `provisionalLeakRate`

How often weak, provisional, or otherwise unsafe memory leaks into default recall behavior.

Formula:

`unsafe_provisional_surface_events / total_provisional_safety_checks`

Target:

`<= 0.08`

### `postCompactionFidelityScore`

How much important information survives compaction without corruption.

Formula:

`important_facts_preserved_after_compaction / total_important_facts_checked_after_compaction`

Target:

`>= 0.88`

### `postMaintenanceFidelityScore`

How much important memory remains correct and available after lifecycle maintenance.

Formula:

`important_memories_preserved_after_maintenance / total_important_memories_checked_after_maintenance`

Target:

`>= 0.86`
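
Every formula in this section is a ratio of two counts, so a single helper covers all of them. The sketch below is illustrative: `rateFromChecks` is a hypothetical name, representing each scenario assertion as a boolean is an assumption about how results might be recorded, and throwing on an empty list is one possible policy for the zero-denominator case the formulas leave open.

```typescript
// Each entry is one scenario check. For retention-style metrics `true` means
// the expected memory was recalled; for the two leak-style rates `true` means
// the unwanted event occurred.
function rateFromChecks(checks: boolean[]): number {
  if (checks.length === 0) {
    // Assumption: a metric with no checks is an error, not a perfect score.
    throw new Error('A rate needs at least one check');
  }
  const hits = checks.filter((c) => c).length;
  return hits / checks.length;
}

// Worked example: 25 expected constraints, 23 recalled correctly.
const constraintChecks: boolean[] = [];
for (let i = 0; i < 25; i++) {
  constraintChecks.push(i < 23);
}
const constraintRetentionRate = rateFromChecks(constraintChecks); // 23 / 25
```

With 23 of 25 constraints recalled the rate sits right at the `0.92` target, so one more miss would fail `constraintRetentionRate`.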

## Overall Score

Each metric contributes equally to the overall score for now. Because the weights are equal, the weighted metric score is simply the average of all per-metric normalized scores, multiplied by `100`.

For metrics where higher is better:

`normalized = min(actual / target, 1)`

For metrics where lower is better:

`normalized = 1` when `actual <= target`, otherwise `target / actual`

Overall score:

`overallScore = average(normalized_metrics) * 100`
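
The two normalization rules and the average can be written out directly. The sketch below follows the formulas above; the `MetricResult` shape and function names are hypothetical, and rejecting an empty metric list is an assumption the formulas do not address.

```typescript
type Direction = 'higher' | 'lower';

interface MetricResult {
  actual: number;
  target: number;
  direction: Direction;
}

// Per-metric normalized score in the range 0..1.
function normalize(m: MetricResult): number {
  if (m.direction === 'higher') {
    return Math.min(m.actual / m.target, 1);
  }
  return m.actual <= m.target ? 1 : m.target / m.actual;
}

// Equal-weight average of the normalized scores, scaled to 0..100.
function overallScore(metrics: MetricResult[]): number {
  if (metrics.length === 0) {
    throw new Error('At least one metric is required');
  }
  let sum = 0;
  for (const m of metrics) {
    sum += normalize(m);
  }
  return (sum / metrics.length) * 100;
}

// Worked example with three metrics:
//   0.92 against a target of >= 0.92 normalizes to 1
//   0.76 against a target of >= 0.95 normalizes to 0.76 / 0.95, about 0.8
//   0.10 against a target of <= 0.05 normalizes to 0.05 / 0.10 = 0.5
const exampleScore = overallScore([
  { actual: 0.92, target: 0.92, direction: 'higher' },
  { actual: 0.76, target: 0.95, direction: 'higher' },
  { actual: 0.1, target: 0.05, direction: 'lower' },
]);
```

The example averages roughly `1`, `0.8`, and `0.5`, giving a score of about 76.7. Because each normalized value is capped at `1`, exceeding a target never compensates for missing another one.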

## Pass / Fail

The suite passes only if **all** threshold metrics pass.

This is intentionally strict. A memory system is only as safe as its weakest major behavior.

## Score Interpretation

- `95-100`: elite memory behavior with strong evidence
- `90-94`: very strong, but still has a few meaningful edge weaknesses
- `80-89`: useful and serious, but not yet trustworthy enough for the hardest autonomous use cases
- `70-79`: capable memory platform, but still too error-prone or lossy in core behaviors
- `< 70`: significant memory-quality gaps remain
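
The score band and the pass/fail verdict are separate results, and a report is easier to read when both are shown. The sketch below maps a score to the bands above; the short band labels, `scoreBand`, and `verdict` are hypothetical, and placing fractional scores such as `94.5` in the lower band is an assumption, since the list only gives whole-number ranges.

```typescript
type Band = 'elite' | 'very strong' | 'useful' | 'capable' | 'significant gaps';

// Assumption: each band starts at its lower bound, so 94.5 is 'very strong'.
function scoreBand(score: number): Band {
  if (score >= 95) return 'elite';
  if (score >= 90) return 'very strong';
  if (score >= 80) return 'useful';
  if (score >= 70) return 'capable';
  return 'significant gaps';
}

interface Verdict {
  band: Band;
  passed: boolean;
}

// The suite passes only when no threshold metric failed, whatever the band.
function verdict(score: number, failedMetricNames: string[]): Verdict {
  return {
    band: scoreBand(score),
    passed: failedMetricNames.length === 0,
  };
}
```

With 14 equally weighted metrics, one metric normalized to `0.5` still yields (13 + 0.5) / 14 × 100, about 96.4, so a run can land in the top band and fail the suite at the same time.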

## Failure Interpretation

If a metric fails:

- `constraintRetentionRate` or `identityRetentionRate`: the system is forgetting durable high-value memory
- `updateCorrectnessRate` or `contradictionResolutionAccuracy`: the system is not safely handling change over time
- `falseMemoryRate`: the system is learning things it should not
- `trustedMemoryPrecision`: the system is surfacing weak or unsafe memory as durable truth
- `memoryIsolationAccuracy`: the system is leaking or misapplying memory across boundaries
- `postCompactionFidelityScore`: the compaction path is destroying important information
- `postMaintenanceFidelityScore`: lifecycle automation is too destructive

## Required Eval Behavior

The eval harness must:

- run deterministically
- produce structured JSON output
- expose per-scenario results
- support `--enforce`
- be able to fail on real regressions
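
One way a harness could satisfy these requirements is sketched below. Everything here is an assumption for illustration: the `EvalReport` and `ScenarioResult` shapes, the function name, and the reading of `--enforce` as "exit non-zero when the suite fails" are not taken from the package.

```typescript
// Hypothetical per-scenario result.
interface ScenarioResult {
  scenario: string;
  passed: boolean;
}

// Hypothetical top-level report emitted as JSON.
interface EvalReport {
  overallScore: number;
  passed: boolean;
  scenarios: ScenarioResult[];
}

// Assumption: --enforce turns a failed suite into a non-zero exit code,
// while a run without the flag only reports.
function exitCodeFor(report: EvalReport, argv: string[]): number {
  const enforce = argv.includes("--enforce");
  return enforce && !report.passed ? 1 : 0;
}

const report: EvalReport = {
  overallScore: 91.2,
  passed: false,
  scenarios: [
    { scenario: "compaction-keeps-constraints", passed: true },
    { scenario: "maintenance-keeps-identity", passed: false },
  ],
};

// Structured output: stable key order and indentation keep runs easy to diff.
console.log(JSON.stringify(report, null, 2));
console.log(exitCodeFor(report, ["--enforce"]));
```

Keeping the exit-code decision in a pure function separates reporting from enforcement, so the same report can be produced in CI and in local runs.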

@@ -0,0 +1,49 @@

# Operations Guide

## Health Endpoints

- `GET /healthz`: process liveness
- `GET /readyz`: process readiness and active scope count
- `GET /v1/health`: per-scope memory counters
- `GET /v1/events`: server-sent events for memory activity
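
A small probe for the liveness and readiness endpoints might look like the sketch below. It assumes both endpoints answer without credentials and signal health through the HTTP status, which this guide does not state; the `probe` function and `FetchLike` type are hypothetical. The fetch implementation is passed in so the function can be exercised without a running server.

```typescript
// Minimal subset of the fetch API that the probe relies on.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

// Check /healthz and /readyz and report which ones returned a 2xx status.
// Assumption: neither endpoint requires an Authorization header.
async function probe(baseUrl: string, fetchImpl: FetchLike): Promise<Record<string, boolean>> {
  const result: Record<string, boolean> = {};
  for (const path of ["/healthz", "/readyz"]) {
    try {
      const res = await fetchImpl(baseUrl + path);
      result[path] = res.ok;
    } catch {
      // Network errors (connection refused, DNS failure) count as unhealthy.
      result[path] = false;
    }
  }
  return result;
}

// Stub used here so the example runs offline: live but not yet ready.
const stubFetch: FetchLike = async (url) =>
  url.endsWith("/healthz") ? { ok: true, status: 200 } : { ok: false, status: 503 };

probe("http://localhost:3100", stubFetch).then((r) => console.log(r));
```

Against a real server on a runtime with a global `fetch` (for example Node.js 18 or later), the call would be `probe("http://localhost:3100", fetch)`.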

## Compaction and Maintenance

Use the admin surface for lifecycle operations:

```bash
curl -X POST http://localhost:3100/v1/compact \
  -H "Authorization: Bearer $MEMORY_API_KEY" \
  -H "x-admin-key: $MEMORY_ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"scope":{"tenant_id":"acme","system_id":"ai-ide","scope_id":"task-42"}}'
```

```bash
curl -X POST http://localhost:3100/v1/maintenance \
  -H "Authorization: Bearer $MEMORY_API_KEY" \
  -H "x-admin-key: $MEMORY_ADMIN_API_KEY"
```

## Scope Routing

Requests can resolve scope three ways:

1. `scope` object in the JSON body
2. Query parameters: `tenant_id`, `system_id`, `workspace_id`, `scope_id`
3. Headers: `x-memory-tenant`, `x-memory-system`, `x-memory-workspace`, `x-memory-scope`

Use body scope when the request already has a JSON payload. Use headers for shared gateways or framework middleware.
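
For the gateway or middleware case, the same scope can be translated from its body form into the header form. The sketch below uses the field and header names listed above; the `Scope` interface, the choice to treat every field as optional, and the `scopeToHeaders` helper are assumptions for illustration.

```typescript
// Scope fields as they appear in the JSON body and query parameters.
// Assumption: every field is optional here; the server may require some of them.
interface Scope {
  tenant_id?: string;
  system_id?: string;
  workspace_id?: string;
  scope_id?: string;
}

// Pairs of body field name and the matching request header.
const SCOPE_HEADERS: Array<[keyof Scope, string]> = [
  ["tenant_id", "x-memory-tenant"],
  ["system_id", "x-memory-system"],
  ["workspace_id", "x-memory-workspace"],
  ["scope_id", "x-memory-scope"],
];

// Build the header form of a scope, skipping fields that are not set.
function scopeToHeaders(scope: Scope): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const [field, header] of SCOPE_HEADERS) {
    const value = scope[field];
    if (value !== undefined) headers[header] = value;
  }
  return headers;
}

console.log(scopeToHeaders({ tenant_id: "acme", system_id: "ai-ide", scope_id: "task-42" }));
```

A gateway can attach these headers once per upstream request and leave the JSON payload untouched.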

## Recommended Alerts

- Sustained growth in `activeTurnCount`
- Low or zero `knowledgeCount` for a workload that should learn
- Repeated `compacted: false` results during forced compaction
- Large `expiredWorkingMemory` or `retiredKnowledge` spikes during maintenance
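
The first two alerts can be evaluated from a series of health samples, as in the sketch below. The `HealthSample` shape, the window of three consecutive increases, and the function names are assumptions; only the counter names `activeTurnCount` and `knowledgeCount` come from this guide, and the actual `/v1/health` response format is not shown here.

```typescript
// Hypothetical sample built from periodic health reads for one scope.
interface HealthSample {
  activeTurnCount: number;
  knowledgeCount: number;
}

// True when the series rose strictly on each of the last `window` steps.
function hasSustainedGrowth(values: number[], window: number): boolean {
  if (values.length < window + 1) return false;
  const recent = values.slice(-(window + 1));
  return recent.every((v, i) => i === 0 || v > recent[i - 1]);
}

// Return human-readable alerts for the sampled scope.
function alertsFor(samples: HealthSample[], window = 3): string[] {
  const alerts: string[] = [];
  if (samples.length === 0) return alerts;
  if (hasSustainedGrowth(samples.map((s) => s.activeTurnCount), window)) {
    alerts.push("activeTurnCount is growing without being compacted");
  }
  if (samples[samples.length - 1].knowledgeCount === 0) {
    alerts.push("knowledgeCount is zero");
  }
  return alerts;
}

const samples: HealthSample[] = [
  { activeTurnCount: 10, knowledgeCount: 0 },
  { activeTurnCount: 12, knowledgeCount: 0 },
  { activeTurnCount: 15, knowledgeCount: 0 },
  { activeTurnCount: 19, knowledgeCount: 0 },
];

console.log(alertsFor(samples));
```

The window size trades sensitivity for noise: a larger window waits longer before flagging growth but ignores short bursts.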

## Data Hygiene

- Redact sensitive text at ingest using `redactText`.
- Prefer tenant and workspace boundaries that match your product’s isolation model.
- Export memory before major schema or provider changes.

package/docs/SECURITY.md ADDED
@@ -0,0 +1,25 @@

# Security Guide

## Authentication

- `apiKey` protects the general HTTP surface with bearer auth.
- `adminApiKey` separately protects force-compaction and maintenance endpoints.
- MCP transport is intended for local or already-trusted stdio integrations.

## Recommended Defaults

- Bind the HTTP server to `127.0.0.1` unless you intentionally expose it.
- Keep `bodyLimitBytes` close to your expected prompt sizes.
- Run the service behind TLS termination when exposed beyond localhost.

## Sensitive Data Handling

- Use `redactText` to scrub secrets before they enter turns, working memory, or knowledge memory.
- Avoid sharing tenant-level scopes across customers.
- Audit `relevantKnowledge` and `searchCrossScope()` use before enabling broad cross-scope retrieval.

## Release Hygiene

- Only publish built assets under `dist/`.
- Keep provider SDKs as optional peers so non-provider installs stay lean.
- Use `npm run release:check` before publishing.