@shadowforge0/aquifer-memory 1.5.12 → 1.7.0

Files changed (60)
  1. package/.env.example +23 -0
  2. package/README.md +84 -73
  3. package/README_CN.md +676 -0
  4. package/README_TW.md +684 -0
  5. package/aquifer.config.example.json +34 -0
  6. package/consumers/claude-code.js +11 -11
  7. package/consumers/cli.js +421 -53
  8. package/consumers/codex-handoff.js +258 -0
  9. package/consumers/codex.js +1676 -0
  10. package/consumers/default/daily-entries.js +23 -4
  11. package/consumers/default/index.js +2 -2
  12. package/consumers/default/prompts/summary.js +6 -6
  13. package/consumers/mcp.js +96 -5
  14. package/consumers/openclaw-ext/index.js +0 -1
  15. package/consumers/openclaw-plugin.js +1 -1
  16. package/consumers/shared/config.js +8 -0
  17. package/consumers/shared/factory.js +1 -0
  18. package/consumers/shared/ingest.js +1 -1
  19. package/consumers/shared/normalize.js +14 -3
  20. package/consumers/shared/recall-format.js +27 -0
  21. package/consumers/shared/summary-parser.js +151 -0
  22. package/core/aquifer.js +380 -18
  23. package/core/finalization-review.js +319 -0
  24. package/core/mcp-manifest.js +52 -2
  25. package/core/memory-bootstrap.js +200 -0
  26. package/core/memory-consolidation.js +1590 -0
  27. package/core/memory-promotion.js +544 -0
  28. package/core/memory-recall.js +247 -0
  29. package/core/memory-records.js +797 -0
  30. package/core/memory-safety-gate.js +224 -0
  31. package/core/session-finalization.js +365 -0
  32. package/core/storage.js +385 -2
  33. package/docs/getting-started.md +105 -0
  34. package/docs/postprocess-contract.md +2 -2
  35. package/docs/setup.md +92 -2
  36. package/package.json +25 -11
  37. package/pipeline/normalize/adapters/codex.js +106 -0
  38. package/pipeline/normalize/detect.js +3 -2
  39. package/schema/001-base.sql +3 -0
  40. package/schema/007-v1-foundation.sql +273 -0
  41. package/schema/008-session-finalizations.sql +50 -0
  42. package/schema/009-v1-assertion-plane.sql +193 -0
  43. package/schema/010-v1-finalization-review.sql +160 -0
  44. package/schema/011-v1-compaction-claim.sql +46 -0
  45. package/schema/012-v1-compaction-lease.sql +39 -0
  46. package/schema/013-v1-compaction-lineage.sql +193 -0
  47. package/scripts/codex-recovery.js +672 -0
  48. package/consumers/miranda/context-inject.js +0 -120
  49. package/consumers/miranda/daily-entries.js +0 -224
  50. package/consumers/miranda/index.js +0 -364
  51. package/consumers/miranda/instance.js +0 -55
  52. package/consumers/miranda/llm.js +0 -99
  53. package/consumers/miranda/profile.json +0 -145
  54. package/consumers/miranda/prompts/summary.js +0 -303
  55. package/consumers/miranda/recall-format.js +0 -76
  56. package/consumers/miranda/render-daily-md.js +0 -186
  57. package/consumers/miranda/workspace-files.js +0 -91
  58. package/scripts/drop-entity-state-history.sql +0 -17
  59. package/scripts/drop-insights.sql +0 -12
  60. package/scripts/install-openclaw.sh +0 -59
package/.env.example ADDED
@@ -0,0 +1,23 @@
+ DATABASE_URL=postgresql://aquifer:aquifer@localhost:5432/aquifer
+ AQUIFER_SCHEMA=aquifer
+ AQUIFER_TENANT_ID=default
+
+ # Legacy is the default for backward compatibility. Use curated only after
+ # finalization and scoped serving have been verified for your host.
+ AQUIFER_MEMORY_SERVING_MODE=legacy
+ # AQUIFER_MEMORY_SERVING_MODE=curated
+ # AQUIFER_MEMORY_ACTIVE_SCOPE_KEY=project:example
+ # AQUIFER_MEMORY_ACTIVE_SCOPE_PATH=global,project:example
+
+ AQUIFER_EMBED_BASE_URL=http://localhost:11434/v1
+ AQUIFER_EMBED_MODEL=bge-m3
+ # EMBED_PROVIDER=ollama
+ # EMBED_PROVIDER=openai
+ # OPENAI_API_KEY=sk-...
+
+ # Optional built-in summarization.
+ # AQUIFER_LLM_BASE_URL=http://localhost:11434/v1
+ # AQUIFER_LLM_MODEL=llama3.1
+
+ # Startup migration behavior: apply | check | off.
+ AQUIFER_MIGRATIONS_MODE=apply
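Reviewer note: the new `.env.example` introduces two enumerated settings (`AQUIFER_MEMORY_SERVING_MODE` and `AQUIFER_MIGRATIONS_MODE`) and a comma-separated scope path. The sketch below shows one way a host application might read and validate these variables before starting Aquifer. It is illustrative only: `readAquiferEnv` and `pickEnum` are hypothetical names, not part of the package, and the package's actual mapping lives in `consumers/shared/config.js`, which may behave differently. The defaults used here (`legacy`, `apply`, tenant `default`) are the ones stated in this diff.

```javascript
// Hypothetical host-side helper; NOT part of @shadowforge0/aquifer-memory.
const SERVING_MODES = ['legacy', 'curated'];
const MIGRATION_MODES = ['apply', 'check', 'off'];

// Return the value if it is allowed, the fallback if unset, and throw otherwise.
function pickEnum(value, allowed, fallback, name) {
  if (value === undefined || value === '') return fallback;
  if (!allowed.includes(value)) {
    throw new Error(`${name} must be one of ${allowed.join(' | ')}, got "${value}"`);
  }
  return value;
}

function readAquiferEnv(env = process.env) {
  // "global,project:example" -> ['global', 'project:example']
  const activeScopePath = (env.AQUIFER_MEMORY_ACTIVE_SCOPE_PATH || '')
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean);

  return {
    databaseUrl: env.DATABASE_URL,
    schema: env.AQUIFER_SCHEMA, // left undefined when unset; no default assumed
    tenantId: env.AQUIFER_TENANT_ID || 'default',
    servingMode: pickEnum(
      env.AQUIFER_MEMORY_SERVING_MODE, SERVING_MODES, 'legacy',
      'AQUIFER_MEMORY_SERVING_MODE'),
    activeScopeKey: env.AQUIFER_MEMORY_ACTIVE_SCOPE_KEY || null,
    activeScopePath,
    migrationsMode: pickEnum(
      env.AQUIFER_MIGRATIONS_MODE, MIGRATION_MODES, 'apply',
      'AQUIFER_MIGRATIONS_MODE'),
  };
}

// Usage with an explicit object instead of process.env:
const example = readAquiferEnv({
  DATABASE_URL: 'postgresql://aquifer:aquifer@localhost:5432/aquifer',
  AQUIFER_MEMORY_SERVING_MODE: 'curated',
  AQUIFER_MEMORY_ACTIVE_SCOPE_PATH: 'global,project:example',
});
```

Failing fast on an unknown mode is a design choice of this sketch; whether the package itself rejects or ignores unrecognized values is not shown in the diff.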
package/README.md CHANGED
@@ -2,9 +2,9 @@
  
  # 🌊 Aquifer
  
- **PG-native long-term memory for AI agents**
+ **Long-term memory for AI agents, backed by PostgreSQL.**
  
- *Turn-level embedding, hybrid RRF ranking, trust scoring, entity intersection, knowledge graph, entity scoping, all on PostgreSQL + pgvector.*
+ *Store sessions, enrich them, and recall the exact turn where a decision happened without adding a separate vector database.*
  
  [![npm version](https://img.shields.io/npm/v/@shadowforge0/aquifer-memory)](https://www.npmjs.com/package/@shadowforge0/aquifer-memory)
  [![PostgreSQL 15+](https://img.shields.io/badge/PostgreSQL-15%2B-336791)](https://www.postgresql.org/)
@@ -17,6 +17,79 @@
  
  ---
  
+ ## Start Here
+
+ Aquifer is designed to have a short default path: start PostgreSQL + embeddings, run `quickstart`, then point your MCP client at `aquifer mcp`.
+
+ For library API usage, skip to [API Reference](#api-reference). For a slightly more guided first run, see [docs/getting-started.md](docs/getting-started.md).
+
+ ### 1. Start the local stack
+
+ ```bash
+ docker compose up -d
+ # PostgreSQL 16 + pgvector and Ollama with bge-m3 (auto-pulled).
+ # First run pulls the model — `docker compose logs -f ollama-pull` to watch.
+ ```
+
+ Already running PostgreSQL + pgvector and an embedding endpoint? Skip this step. `quickstart` picks up `DATABASE_URL` and embed settings from your environment if you already have them.
+
+ ### 2. Verify end-to-end
+
+ ```bash
+ npx --yes @shadowforge0/aquifer-memory quickstart
+ ```
+
+ `quickstart` autodetects `localhost:5432` PostgreSQL and `localhost:11434` Ollama (from step 1 or your own), runs migrations, embeds a test session, recalls it, and cleans up. If it prints `✓ Aquifer is working`, you're done.
+
+ For ongoing use, install it into your project so you skip the `npx` resolution cost: `npm install @shadowforge0/aquifer-memory` then `npx aquifer quickstart`.
+
+ Using OpenAI instead of Ollama? `export EMBED_PROVIDER=openai` + `OPENAI_API_KEY=sk-...` before `quickstart` — model defaults to `text-embedding-3-small`.
+
+ ### 3. Connect your MCP client
+
+ Claude Code, Claude Desktop, or any MCP-capable client — drop this into `.mcp.json` (project-level) or `claude_desktop_config.json`:
+
+ ```jsonc
+ {
+   "mcpServers": {
+     "aquifer": {
+       "command": "npx",
+       "args": ["--yes", "@shadowforge0/aquifer-memory", "mcp"],
+       "env": {
+         "DATABASE_URL": "postgresql://aquifer:aquifer@localhost:5432/aquifer",
+         "EMBED_PROVIDER": "ollama",
+         "AQUIFER_MEMORY_SERVING_MODE": "legacy"
+       }
+     }
+   }
+ }
+ ```
+
+ Or run it directly: `DATABASE_URL=... EMBED_PROVIDER=ollama npx aquifer mcp`. The MCP server itself stays strict about env; `quickstart` autodetect is the try-it path, not the production one.
+
+ Keep `AQUIFER_MEMORY_SERVING_MODE=legacy` for first rollout. Switch to `curated` only when you want `session_recall` and `session_bootstrap` to serve active curated memory; `evidence_recall` stays the explicit audit/debug lane. Rollback is just flipping env or config back to `legacy`.
+
+ ### Common commands
+
+ | Goal | Command |
+ |---|---|
+ | Verify setup | `npx aquifer quickstart` |
+ | Start the MCP server | `npx aquifer mcp` |
+ | Search memory manually | `npx aquifer recall "auth middleware"` |
+ | Plan curated memory compaction | `npx aquifer compact --cadence daily --period-start 2026-04-27T00:00:00Z --period-end 2026-04-28T00:00:00Z` |
+ | Generate a timer synthesis prompt | `npx aquifer operator compaction daily --include-synthesis-prompt --json` |
+ | Apply reviewed timer synthesis candidates | `npx aquifer operator compaction daily --synthesis-summary-file /tmp/timer-summary.json --apply --promote-candidates --json` |
+ | Inspect storage health | `npx aquifer stats` |
+ | Enrich pending sessions | `npx aquifer backfill` |
+
+ Timer synthesis output is candidate material until an operator applies it with
+ `--promote-candidates`; it does not become active curated memory from the
+ prompt or summary file alone.
+
+ Need LLM summarization, the knowledge graph, OpenAI embeddings, reranking, or operations details? See [docs/setup.md](docs/setup.md) and [Environment Variables](#environment-variables).
+
+ ---
+
  ## Why Aquifer?
  
  Most AI memory systems bolt a vector DB on the side. Aquifer takes a different approach: **PostgreSQL is the memory**.
@@ -61,57 +134,6 @@ Sessions, summaries, turn-level embeddings, entity graph — all live in one dat
  
  ---
  
- ## Quick Start (MCP Server)
-
- Two commands from zero to a working MCP memory server — no env vars to set. For library API usage, see [API Reference](#api-reference) below.
-
- ### 1. Start the stack
-
- ```bash
- docker compose up -d
- # PostgreSQL 16 + pgvector and Ollama with bge-m3 (auto-pulled).
- # First run pulls the model — `docker compose logs -f ollama-pull` to watch.
- ```
-
- Already running PostgreSQL + pgvector and an embedding endpoint? Skip this step — `quickstart` picks up `DATABASE_URL` / `EMBED_PROVIDER` from your environment if you've set them.
-
- ### 2. Verify
-
- ```bash
- npx --yes @shadowforge0/aquifer-memory quickstart
- ```
-
- That's it. `quickstart` autodetects `localhost:5432` PostgreSQL and `localhost:11434` Ollama (from step 1 or your own), runs migrations, embeds a test session, recalls it, and cleans up. If it prints `✓ Aquifer is working`, you're done.
-
- For ongoing use, install it into your project so you skip the `npx` resolution cost: `npm install @shadowforge0/aquifer-memory` then `npx aquifer quickstart`.
-
- Using OpenAI instead of Ollama? `export EMBED_PROVIDER=openai` + `OPENAI_API_KEY=sk-...` before `quickstart` — model defaults to `text-embedding-3-small`.
-
- ### 3. Wire into your MCP client
-
- Claude Code, Claude Desktop, or any MCP-capable client — drop this into `.mcp.json` (project-level) or `claude_desktop_config.json`:
-
- ```jsonc
- {
-   "mcpServers": {
-     "aquifer": {
-       "command": "npx",
-       "args": ["--yes", "@shadowforge0/aquifer-memory", "mcp"],
-       "env": {
-         "DATABASE_URL": "postgresql://aquifer:aquifer@localhost:5432/aquifer",
-         "EMBED_PROVIDER": "ollama"
-       }
-     }
-   }
- }
- ```
-
- Or run it directly: `DATABASE_URL=... EMBED_PROVIDER=ollama npx aquifer mcp`. (MCP server itself stays strict about env — `quickstart`'s autodetect is the try-it path, not the production one.)
-
- Need LLM summarization, the knowledge graph, OpenAI embeddings, or the reranker? See [Environment Variables](#environment-variables) below and [docs/setup.md](docs/setup.md).
-
- ---
-
  ## Environment Variables
  
  | Variable | Required? | Purpose | Example |
@@ -132,6 +154,9 @@ Need LLM summarization, the knowledge graph, OpenAI embeddings, or the reranker?
  | `AQUIFER_RERANK_PROVIDER` | No | Reranker provider: `tei`, `jina`, `openrouter` | `tei` |
  | `AQUIFER_RERANK_BASE_URL` | No | Reranker endpoint | `http://localhost:8080` |
  | `AQUIFER_AGENT_ID` | No | Default agent ID | `main` |
+ | `AQUIFER_MEMORY_SERVING_MODE` | No | Public serving mode: `legacy` default, or opt-in `curated` | `curated` |
+ | `AQUIFER_MEMORY_ACTIVE_SCOPE_KEY` | No | Default active curated scope for recall/bootstrap | `project:aquifer` |
+ | `AQUIFER_MEMORY_ACTIVE_SCOPE_PATH` | No | Ordered curated scope path for inheritance | `global,project:aquifer` |
  | `AQUIFER_MIGRATIONS_MODE` | No | Startup handshake mode: `apply` (default), `check`, `off` | `apply` |
  | `AQUIFER_MIGRATION_LOCK_TIMEOUT_MS` | No | Advisory-lock wait before `AQ_MIGRATION_LOCK_TIMEOUT` (default 30000) | `30000` |
  | `AQUIFER_INSIGHTS_DEDUP_MODE` | No | Insights semantic dedup mode: `off` (default), `shadow`, `enforce` — env wins over code for this field only, so operators can kill-switch without redeploy | `shadow` |
@@ -140,6 +165,8 @@ Need LLM summarization, the knowledge graph, OpenAI embeddings, or the reranker?
  
  Full env-to-config mapping is in [consumers/shared/config.js](consumers/shared/config.js).
  
+ Curated serving is opt-in. If a host needs rollback during rollout, set `AQUIFER_MEMORY_SERVING_MODE=legacy` and restart the MCP/CLI process; no destructive DB rollback is required.
+
  ### Insights semantic dedup (1.5.10)
  
  When a cron extractor (`scripts/extract-insights-from-recent-sessions.js`) or any other caller writes insights via `commitInsight`, the canonical-key layer (1.5.3+) dedupes rows whose `canonicalClaim + entities` hash to the same value. But LLMs don't always produce the same `canonicalClaim` across runs, so 1.5.10 adds a second tier: `title + body` are embedded, matched against `(tenant, agent, type)`-scoped active rows, and a top cosine above `AQUIFER_INSIGHTS_DEDUP_COSINE` triggers supersede (enforce) or metadata-only would-merge logging (shadow). Close-band hits (`closeBandFrom ≤ cos < threshold`) write `metadata.dedupNear` without supersede so operators can tune thresholds without committing.
@@ -159,7 +186,7 @@ The script is idempotent (`WHERE canonical_key_v2 IS NULL` guard) and race-safe
  
  ## Host Integration
  
- MCP is the primary integration surface. Agent hosts connect to the Aquifer MCP server, which exposes six tools: `session_recall`, `session_feedback`, `feedback_stats`, `session_bootstrap`, `memory_stats`, `memory_pending`.
+ MCP is the primary integration surface. Agent hosts connect to the Aquifer MCP server, which exposes eight tools: `session_recall`, `evidence_recall`, `session_feedback`, `memory_feedback`, `feedback_stats`, `session_bootstrap`, `memory_stats`, `memory_pending`.
  
  | Integration | Route | Status | When to use |
  |-------------|-------|--------|-------------|
@@ -190,7 +217,7 @@ Add to your project's `.claude.json` or user-level MCP config:
  }
  ```
  
- Tools appear as `mcp__aquifer__session_recall`, `mcp__aquifer__session_feedback`, `mcp__aquifer__session_bootstrap`, etc.
+ Tools appear as `mcp__aquifer__session_recall`, `mcp__aquifer__evidence_recall`, `mcp__aquifer__session_bootstrap`, `mcp__aquifer__session_feedback`, `mcp__aquifer__memory_feedback`, `mcp__aquifer__feedback_stats`, `mcp__aquifer__memory_stats`, `mcp__aquifer__memory_pending`.
  
  ### OpenClaw
  
@@ -214,7 +241,7 @@ Add to `openclaw.json` under `mcp.servers`:
  }
  ```
  
- Tools materialize as `aquifer__session_recall`, `aquifer__session_feedback`, `aquifer__feedback_stats`, `aquifer__session_bootstrap`, `aquifer__memory_stats`, `aquifer__memory_pending` (server name prefix added by the host).
+ Tools materialize as `aquifer__session_recall`, `aquifer__evidence_recall`, `aquifer__session_feedback`, `aquifer__memory_feedback`, `aquifer__feedback_stats`, `aquifer__session_bootstrap`, `aquifer__memory_stats`, `aquifer__memory_pending` (server name prefix added by the host).
  
  The OpenClaw plugin (`consumers/openclaw-plugin.js`) is retained for session capture via `before_reset` but is **not** the recommended tool delivery path. Use MCP.
  
@@ -349,22 +376,6 @@ Built-in entity extraction and relationship tracking:
  - **Entity-session mapping**: which entities appear in which sessions
  - **Entity boost in ranking**: sessions with relevant entities score higher
  
- ---
-
- ## Benchmark: LongMemEval
-
- We tested Aquifer's retrieval pipeline on [LongMemEval_S](https://github.com/xiaowu0162/LongMemEval) — 470 questions across 19,195 sessions with 98,795 turn embeddings. Per-question haystack scoping (matching the official protocol), bge-m3 embeddings via OpenRouter.
-
- | Pipeline | R@1 | R@3 | R@5 | R@10 |
- |----------|-----|-----|-----|------|
- | Turn-only (cosine) | 89.5% | 96.6% | 98.1% | 98.9% |
- | Three-way hybrid (FTS + session_emb + turn_emb → RRF) | 79.2% | 94.0% | 97.7% | 98.9% |
- | **Hybrid + Cohere Rerank v3.5 (top-30)** | **96.0%** | **98.5%** | **99.3%** | **99.8%** |
-
- Measured 2026-04-19 on Aquifer 1.2.1.
-
- **Key findings.** Turn-level embedding alone beats session-level (26.8% → 89.5% R@1, a 3× improvement). Hybrid fusion adds robustness at R@3-R@10 but trades R@1 because FTS + session-level signals spread the top candidate across adjacent sessions. Re-ranking the hybrid top-30 with a cross-encoder (Cohere Rerank v3.5) wins back the top-1 precision and then some — +16.9pt R@1 over hybrid baseline, and 6.5pt above pure turn-level cosine. That's the production pipeline Aquifer ships by default when a reranker is configured.
-
  ### Multi-Tenant
  
  Every table includes `tenant_id` (default: `'default'`). Isolation is enforced at the query level — no cross-tenant data leakage by design.
@@ -418,7 +429,7 @@ The MCP consumer (`consumers/mcp.js`) already wires `aquifer.init()` before `ser
  
  #### `aquifer.listPendingMigrations()` / `aquifer.getMigrationStatus()`
  
- Returns `{ required, applied, pending, lastRunAt }` via a `pg_tables` signature probe. No DDL runs. Use it from a health check or from a consumer that wants to surface drift before calling `init()`.
+ Returns `{ required, applied, pending, lastRunAt }` via table and column signature probes (`pg_tables` plus `information_schema.columns` for alter-only migrations). No DDL runs. Use it from a health check or from a consumer that wants to surface drift before calling `init()`.
  
  #### `aquifer.migrate()`