@shadowforge0/aquifer-memory 1.5.12 → 1.7.0
- package/.env.example +23 -0
- package/README.md +84 -73
- package/README_CN.md +676 -0
- package/README_TW.md +684 -0
- package/aquifer.config.example.json +34 -0
- package/consumers/claude-code.js +11 -11
- package/consumers/cli.js +421 -53
- package/consumers/codex-handoff.js +258 -0
- package/consumers/codex.js +1676 -0
- package/consumers/default/daily-entries.js +23 -4
- package/consumers/default/index.js +2 -2
- package/consumers/default/prompts/summary.js +6 -6
- package/consumers/mcp.js +96 -5
- package/consumers/openclaw-ext/index.js +0 -1
- package/consumers/openclaw-plugin.js +1 -1
- package/consumers/shared/config.js +8 -0
- package/consumers/shared/factory.js +1 -0
- package/consumers/shared/ingest.js +1 -1
- package/consumers/shared/normalize.js +14 -3
- package/consumers/shared/recall-format.js +27 -0
- package/consumers/shared/summary-parser.js +151 -0
- package/core/aquifer.js +380 -18
- package/core/finalization-review.js +319 -0
- package/core/mcp-manifest.js +52 -2
- package/core/memory-bootstrap.js +200 -0
- package/core/memory-consolidation.js +1590 -0
- package/core/memory-promotion.js +544 -0
- package/core/memory-recall.js +247 -0
- package/core/memory-records.js +797 -0
- package/core/memory-safety-gate.js +224 -0
- package/core/session-finalization.js +365 -0
- package/core/storage.js +385 -2
- package/docs/getting-started.md +105 -0
- package/docs/postprocess-contract.md +2 -2
- package/docs/setup.md +92 -2
- package/package.json +25 -11
- package/pipeline/normalize/adapters/codex.js +106 -0
- package/pipeline/normalize/detect.js +3 -2
- package/schema/001-base.sql +3 -0
- package/schema/007-v1-foundation.sql +273 -0
- package/schema/008-session-finalizations.sql +50 -0
- package/schema/009-v1-assertion-plane.sql +193 -0
- package/schema/010-v1-finalization-review.sql +160 -0
- package/schema/011-v1-compaction-claim.sql +46 -0
- package/schema/012-v1-compaction-lease.sql +39 -0
- package/schema/013-v1-compaction-lineage.sql +193 -0
- package/scripts/codex-recovery.js +672 -0
- package/consumers/miranda/context-inject.js +0 -120
- package/consumers/miranda/daily-entries.js +0 -224
- package/consumers/miranda/index.js +0 -364
- package/consumers/miranda/instance.js +0 -55
- package/consumers/miranda/llm.js +0 -99
- package/consumers/miranda/profile.json +0 -145
- package/consumers/miranda/prompts/summary.js +0 -303
- package/consumers/miranda/recall-format.js +0 -76
- package/consumers/miranda/render-daily-md.js +0 -186
- package/consumers/miranda/workspace-files.js +0 -91
- package/scripts/drop-entity-state-history.sql +0 -17
- package/scripts/drop-insights.sql +0 -12
- package/scripts/install-openclaw.sh +0 -59
package/.env.example
ADDED
@@ -0,0 +1,23 @@
+DATABASE_URL=postgresql://aquifer:aquifer@localhost:5432/aquifer
+AQUIFER_SCHEMA=aquifer
+AQUIFER_TENANT_ID=default
+
+# Legacy is the default for backward compatibility. Use curated only after
+# finalization and scoped serving have been verified for your host.
+AQUIFER_MEMORY_SERVING_MODE=legacy
+# AQUIFER_MEMORY_SERVING_MODE=curated
+# AQUIFER_MEMORY_ACTIVE_SCOPE_KEY=project:example
+# AQUIFER_MEMORY_ACTIVE_SCOPE_PATH=global,project:example
+
+AQUIFER_EMBED_BASE_URL=http://localhost:11434/v1
+AQUIFER_EMBED_MODEL=bge-m3
+# EMBED_PROVIDER=ollama
+# EMBED_PROVIDER=openai
+# OPENAI_API_KEY=sk-...
+
+# Optional built-in summarization.
+# AQUIFER_LLM_BASE_URL=http://localhost:11434/v1
+# AQUIFER_LLM_MODEL=llama3.1
+
+# Startup migration behavior: apply | check | off.
+AQUIFER_MIGRATIONS_MODE=apply
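Reviewer note: the new `.env.example` introduces a serving mode plus an ordered, comma-separated scope path. The sketch below shows one way a host might read those three variables. The helper name `resolveServingConfig` is hypothetical, and the trimming, lowercasing, and rejection of unknown modes are assumptions rather than the package's confirmed behavior.

```javascript
// Hypothetical helper for illustration; not taken from the package source.
const SERVING_MODES = new Set(['legacy', 'curated']);

function resolveServingConfig(env = process.env) {
  // `legacy` is the documented default when the variable is unset.
  const mode = (env.AQUIFER_MEMORY_SERVING_MODE || 'legacy').trim().toLowerCase();
  if (!SERVING_MODES.has(mode)) {
    // Assumption: unknown values are rejected instead of silently ignored.
    throw new Error(`Unsupported AQUIFER_MEMORY_SERVING_MODE: ${mode}`);
  }
  // The scope path is an ordered list such as "global,project:example".
  const scopePath = (env.AQUIFER_MEMORY_ACTIVE_SCOPE_PATH || '')
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean);
  const scopeKey = (env.AQUIFER_MEMORY_ACTIVE_SCOPE_KEY || '').trim() || null;
  return { mode, scopeKey, scopePath };
}
```

Calling `resolveServingConfig({})` falls back to `legacy` with an empty scope path, which matches the rollback story in the README hunks that follow.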
package/README.md
CHANGED
@@ -2,9 +2,9 @@
 
 # 🌊 Aquifer
 
-**
+**Long-term memory for AI agents, backed by PostgreSQL.**
 
-*
+*Store sessions, enrich them, and recall the exact turn where a decision happened — without adding a separate vector database.*
 
 [](https://www.npmjs.com/package/@shadowforge0/aquifer-memory)
 [](https://www.postgresql.org/)
@@ -17,6 +17,79 @@
 
 ---
 
+## Start Here
+
+Aquifer is designed to have a short default path: start PostgreSQL + embeddings, run `quickstart`, then point your MCP client at `aquifer mcp`.
+
+For library API usage, skip to [API Reference](#api-reference). For a slightly more guided first run, see [docs/getting-started.md](docs/getting-started.md).
+
+### 1. Start the local stack
+
+```bash
+docker compose up -d
+# PostgreSQL 16 + pgvector and Ollama with bge-m3 (auto-pulled).
+# First run pulls the model — `docker compose logs -f ollama-pull` to watch.
+```
+
+Already running PostgreSQL + pgvector and an embedding endpoint? Skip this step. `quickstart` picks up `DATABASE_URL` and embed settings from your environment if you already have them.
+
+### 2. Verify end-to-end
+
+```bash
+npx --yes @shadowforge0/aquifer-memory quickstart
+```
+
+`quickstart` autodetects `localhost:5432` PostgreSQL and `localhost:11434` Ollama (from step 1 or your own), runs migrations, embeds a test session, recalls it, and cleans up. If it prints `✓ Aquifer is working`, you're done.
+
+For ongoing use, install it into your project so you skip the `npx` resolution cost: `npm install @shadowforge0/aquifer-memory` then `npx aquifer quickstart`.
+
+Using OpenAI instead of Ollama? `export EMBED_PROVIDER=openai` + `OPENAI_API_KEY=sk-...` before `quickstart` — model defaults to `text-embedding-3-small`.
+
+### 3. Connect your MCP client
+
+Claude Code, Claude Desktop, or any MCP-capable client — drop this into `.mcp.json` (project-level) or `claude_desktop_config.json`:
+
+```jsonc
+{
+  "mcpServers": {
+    "aquifer": {
+      "command": "npx",
+      "args": ["--yes", "@shadowforge0/aquifer-memory", "mcp"],
+      "env": {
+        "DATABASE_URL": "postgresql://aquifer:aquifer@localhost:5432/aquifer",
+        "EMBED_PROVIDER": "ollama",
+        "AQUIFER_MEMORY_SERVING_MODE": "legacy"
+      }
+    }
+  }
+}
+```
+
+Or run it directly: `DATABASE_URL=... EMBED_PROVIDER=ollama npx aquifer mcp`. The MCP server itself stays strict about env; `quickstart` autodetect is the try-it path, not the production one.
+
+Keep `AQUIFER_MEMORY_SERVING_MODE=legacy` for first rollout. Switch to `curated` only when you want `session_recall` and `session_bootstrap` to serve active curated memory; `evidence_recall` stays the explicit audit/debug lane. Rollback is just flipping env or config back to `legacy`.
+
+### Common commands
+
+| Goal | Command |
+|---|---|
+| Verify setup | `npx aquifer quickstart` |
+| Start the MCP server | `npx aquifer mcp` |
+| Search memory manually | `npx aquifer recall "auth middleware"` |
+| Plan curated memory compaction | `npx aquifer compact --cadence daily --period-start 2026-04-27T00:00:00Z --period-end 2026-04-28T00:00:00Z` |
+| Generate a timer synthesis prompt | `npx aquifer operator compaction daily --include-synthesis-prompt --json` |
+| Apply reviewed timer synthesis candidates | `npx aquifer operator compaction daily --synthesis-summary-file /tmp/timer-summary.json --apply --promote-candidates --json` |
+| Inspect storage health | `npx aquifer stats` |
+| Enrich pending sessions | `npx aquifer backfill` |
+
+Timer synthesis output is candidate material until an operator applies it with
+`--promote-candidates`; it does not become active curated memory from the
+prompt or summary file alone.
+
+Need LLM summarization, the knowledge graph, OpenAI embeddings, reranking, or operations details? See [docs/setup.md](docs/setup.md) and [Environment Variables](#environment-variables).
+
+---
+
 ## Why Aquifer?
 
 Most AI memory systems bolt a vector DB on the side. Aquifer takes a different approach: **PostgreSQL is the memory**.
@@ -61,57 +134,6 @@ Sessions, summaries, turn-level embeddings, entity graph — all live in one dat
 
 ---
 
-## Quick Start (MCP Server)
-
-Two commands from zero to a working MCP memory server — no env vars to set. For library API usage, see [API Reference](#api-reference) below.
-
-### 1. Start the stack
-
-```bash
-docker compose up -d
-# PostgreSQL 16 + pgvector and Ollama with bge-m3 (auto-pulled).
-# First run pulls the model — `docker compose logs -f ollama-pull` to watch.
-```
-
-Already running PostgreSQL + pgvector and an embedding endpoint? Skip this step — `quickstart` picks up `DATABASE_URL` / `EMBED_PROVIDER` from your environment if you've set them.
-
-### 2. Verify
-
-```bash
-npx --yes @shadowforge0/aquifer-memory quickstart
-```
-
-That's it. `quickstart` autodetects `localhost:5432` PostgreSQL and `localhost:11434` Ollama (from step 1 or your own), runs migrations, embeds a test session, recalls it, and cleans up. If it prints `✓ Aquifer is working`, you're done.
-
-For ongoing use, install it into your project so you skip the `npx` resolution cost: `npm install @shadowforge0/aquifer-memory` then `npx aquifer quickstart`.
-
-Using OpenAI instead of Ollama? `export EMBED_PROVIDER=openai` + `OPENAI_API_KEY=sk-...` before `quickstart` — model defaults to `text-embedding-3-small`.
-
-### 3. Wire into your MCP client
-
-Claude Code, Claude Desktop, or any MCP-capable client — drop this into `.mcp.json` (project-level) or `claude_desktop_config.json`:
-
-```jsonc
-{
-  "mcpServers": {
-    "aquifer": {
-      "command": "npx",
-      "args": ["--yes", "@shadowforge0/aquifer-memory", "mcp"],
-      "env": {
-        "DATABASE_URL": "postgresql://aquifer:aquifer@localhost:5432/aquifer",
-        "EMBED_PROVIDER": "ollama"
-      }
-    }
-  }
-}
-```
-
-Or run it directly: `DATABASE_URL=... EMBED_PROVIDER=ollama npx aquifer mcp`. (MCP server itself stays strict about env — `quickstart`'s autodetect is the try-it path, not the production one.)
-
-Need LLM summarization, the knowledge graph, OpenAI embeddings, or the reranker? See [Environment Variables](#environment-variables) below and [docs/setup.md](docs/setup.md).
-
----
-
 ## Environment Variables
 
 | Variable | Required? | Purpose | Example |
@@ -132,6 +154,9 @@ Need LLM summarization, the knowledge graph, OpenAI embeddings, or the reranker?
 | `AQUIFER_RERANK_PROVIDER` | No | Reranker provider: `tei`, `jina`, `openrouter` | `tei` |
 | `AQUIFER_RERANK_BASE_URL` | No | Reranker endpoint | `http://localhost:8080` |
 | `AQUIFER_AGENT_ID` | No | Default agent ID | `main` |
+| `AQUIFER_MEMORY_SERVING_MODE` | No | Public serving mode: `legacy` default, or opt-in `curated` | `curated` |
+| `AQUIFER_MEMORY_ACTIVE_SCOPE_KEY` | No | Default active curated scope for recall/bootstrap | `project:aquifer` |
+| `AQUIFER_MEMORY_ACTIVE_SCOPE_PATH` | No | Ordered curated scope path for inheritance | `global,project:aquifer` |
 | `AQUIFER_MIGRATIONS_MODE` | No | Startup handshake mode: `apply` (default), `check`, `off` | `apply` |
 | `AQUIFER_MIGRATION_LOCK_TIMEOUT_MS` | No | Advisory-lock wait before `AQ_MIGRATION_LOCK_TIMEOUT` (default 30000) | `30000` |
 | `AQUIFER_INSIGHTS_DEDUP_MODE` | No | Insights semantic dedup mode: `off` (default), `shadow`, `enforce` — env wins over code for this field only, so operators can kill-switch without redeploy | `shadow` |
@@ -140,6 +165,8 @@ Need LLM summarization, the knowledge graph, OpenAI embeddings, or the reranker?
 
 Full env-to-config mapping is in [consumers/shared/config.js](consumers/shared/config.js).
 
+Curated serving is opt-in. If a host needs rollback during rollout, set `AQUIFER_MEMORY_SERVING_MODE=legacy` and restart the MCP/CLI process; no destructive DB rollback is required.
+
 ### Insights semantic dedup (1.5.10)
 
 When a cron extractor (`scripts/extract-insights-from-recent-sessions.js`) or any other caller writes insights via `commitInsight`, the canonical-key layer (1.5.3+) dedupes rows whose `canonicalClaim + entities` hash to the same value. But LLMs don't always produce the same `canonicalClaim` across runs, so 1.5.10 adds a second tier: `title + body` are embedded, matched against `(tenant, agent, type)`-scoped active rows, and a top cosine above `AQUIFER_INSIGHTS_DEDUP_COSINE` triggers supersede (enforce) or metadata-only would-merge logging (shadow). Close-band hits (`closeBandFrom ≤ cos < threshold`) write `metadata.dedupNear` without supersede so operators can tune thresholds without committing.
@@ -159,7 +186,7 @@ The script is idempotent (`WHERE canonical_key_v2 IS NULL` guard) and race-safe
 
 ## Host Integration
 
-MCP is the primary integration surface. Agent hosts connect to the Aquifer MCP server, which exposes
+MCP is the primary integration surface. Agent hosts connect to the Aquifer MCP server, which exposes eight tools: `session_recall`, `evidence_recall`, `session_feedback`, `memory_feedback`, `feedback_stats`, `session_bootstrap`, `memory_stats`, `memory_pending`.
 
 | Integration | Route | Status | When to use |
 |-------------|-------|--------|-------------|
@@ -190,7 +217,7 @@ Add to your project's `.claude.json` or user-level MCP config:
 }
 ```
 
-Tools appear as `mcp__aquifer__session_recall`, `
+Tools appear as `mcp__aquifer__session_recall`, `mcp__aquifer__evidence_recall`, `mcp__aquifer__session_bootstrap`, `mcp__aquifer__session_feedback`, `mcp__aquifer__memory_feedback`, `mcp__aquifer__feedback_stats`, `mcp__aquifer__memory_stats`, `mcp__aquifer__memory_pending`.
 
 ### OpenClaw
 
@@ -214,7 +241,7 @@ Add to `openclaw.json` under `mcp.servers`:
 }
 ```
 
-Tools materialize as `aquifer__session_recall`, `aquifer__session_feedback`, `aquifer__feedback_stats`, `aquifer__session_bootstrap`, `aquifer__memory_stats`, `aquifer__memory_pending` (server name prefix added by the host).
+Tools materialize as `aquifer__session_recall`, `aquifer__evidence_recall`, `aquifer__session_feedback`, `aquifer__memory_feedback`, `aquifer__feedback_stats`, `aquifer__session_bootstrap`, `aquifer__memory_stats`, `aquifer__memory_pending` (server name prefix added by the host).
 
 The OpenClaw plugin (`consumers/openclaw-plugin.js`) is retained for session capture via `before_reset` but is **not** the recommended tool delivery path. Use MCP.
 
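Reviewer note: the Claude Code and OpenClaw hunks list the same eight tools and differ only in the host-added prefix (`mcp__aquifer__` versus `aquifer__`). A small sketch of deriving both lists, for example when building a tool allow-list, is shown below. The helper `hostToolNames` is hypothetical; the hosts add the prefix themselves, so nothing here is part of the package API.

```javascript
// The eight tool names listed in the 1.7.0 README.
const AQUIFER_TOOLS = [
  'session_recall',
  'evidence_recall',
  'session_feedback',
  'memory_feedback',
  'feedback_stats',
  'session_bootstrap',
  'memory_stats',
  'memory_pending',
];

// Hypothetical helper: prepend the prefix a given host applies to MCP tools.
function hostToolNames(prefix, tools = AQUIFER_TOOLS) {
  return tools.map((tool) => `${prefix}${tool}`);
}

const claudeCodeNames = hostToolNames('mcp__aquifer__');
const openClawNames = hostToolNames('aquifer__');
```

Hosts upgrading from 1.5.12 would need to add `evidence_recall` and `memory_feedback` to any existing allow-list, since both are new in this diff.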
@@ -349,22 +376,6 @@ Built-in entity extraction and relationship tracking:
 - **Entity-session mapping**: which entities appear in which sessions
 - **Entity boost in ranking**: sessions with relevant entities score higher
 
----
-
-## Benchmark: LongMemEval
-
-We tested Aquifer's retrieval pipeline on [LongMemEval_S](https://github.com/xiaowu0162/LongMemEval) — 470 questions across 19,195 sessions with 98,795 turn embeddings. Per-question haystack scoping (matching the official protocol), bge-m3 embeddings via OpenRouter.
-
-| Pipeline | R@1 | R@3 | R@5 | R@10 |
-|----------|-----|-----|-----|------|
-| Turn-only (cosine) | 89.5% | 96.6% | 98.1% | 98.9% |
-| Three-way hybrid (FTS + session_emb + turn_emb → RRF) | 79.2% | 94.0% | 97.7% | 98.9% |
-| **Hybrid + Cohere Rerank v3.5 (top-30)** | **96.0%** | **98.5%** | **99.3%** | **99.8%** |
-
-Measured 2026-04-19 on Aquifer 1.2.1.
-
-**Key findings.** Turn-level embedding alone beats session-level (26.8% → 89.5% R@1, a 3× improvement). Hybrid fusion adds robustness at R@3-R@10 but trades R@1 because FTS + session-level signals spread the top candidate across adjacent sessions. Re-ranking the hybrid top-30 with a cross-encoder (Cohere Rerank v3.5) wins back the top-1 precision and then some — +16.9pt R@1 over hybrid baseline, and 6.5pt above pure turn-level cosine. That's the production pipeline Aquifer ships by default when a reranker is configured.
-
 ### Multi-Tenant
 
 Every table includes `tenant_id` (default: `'default'`). Isolation is enforced at the query level — no cross-tenant data leakage by design.
@@ -418,7 +429,7 @@ The MCP consumer (`consumers/mcp.js`) already wires `aquifer.init()` before `ser
 
 #### `aquifer.listPendingMigrations()` / `aquifer.getMigrationStatus()`
 
-Returns `{ required, applied, pending, lastRunAt }` via
+Returns `{ required, applied, pending, lastRunAt }` via table and column signature probes (`pg_tables` plus `information_schema.columns` for alter-only migrations). No DDL runs. Use it from a health check or from a consumer that wants to surface drift before calling `init()`.
 
 #### `aquifer.migrate()`
 
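Reviewer note: the added line suggests calling `getMigrationStatus()` from a health check. A minimal sketch of that pattern follows. The wrapper `migrationHealth` is hypothetical, and two points are assumptions because the diff does not confirm them: that the method may return a promise (awaiting handles both cases), and that `pending` is either an array of migration names or a count.

```javascript
// Hypothetical health-check wrapper around the documented status shape
// { required, applied, pending, lastRunAt }.
async function migrationHealth(aquifer) {
  const status = await aquifer.getMigrationStatus();
  const pending = status.pending;
  // Assumption: `pending` is an array of migrations or a plain count.
  const pendingCount = Array.isArray(pending) ? pending.length : Number(pending) || 0;
  return {
    ok: pendingCount === 0,
    pendingCount,
    lastRunAt: status.lastRunAt ?? null,
  };
}
```

Because the status call runs no DDL, a wrapper like this can sit on a readiness endpoint and report drift before the host decides whether to call `init()` or `migrate()`.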