prism-mcp-server 4.6.1 → 5.1.0
- package/README.md +208 -100
- package/dist/dashboard/server.js +240 -7
- package/dist/dashboard/ui.js +198 -16
- package/dist/server.js +15 -2
- package/dist/storage/sqlite.js +247 -6
- package/dist/storage/supabase.js +58 -0
- package/dist/storage/supabaseMigrations.js +86 -1
- package/dist/tools/index.js +2 -2
- package/dist/tools/sessionMemoryDefinitions.js +63 -0
- package/dist/tools/sessionMemoryHandlers.js +99 -5
- package/dist/utils/turboquant.js +730 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -8,12 +8,14 @@
 [](https://www.typescriptlang.org/)
 [](https://nodejs.org/)
 
-> **Your AI agent's memory that survives between sessions.** Prism MCP is a Model Context Protocol server that gives Claude Desktop, Cursor, Windsurf, and any MCP client **persistent memory**, **time travel**, **visual context**, **multi-agent sync**, **GDPR-compliant deletion**, **memory tracing**, and **LangChain integration** — all running locally with zero cloud dependencies.
+> **Your AI agent's memory that survives between sessions.** Prism MCP is a Model Context Protocol server that gives Claude Desktop, Cursor, Windsurf, and any MCP client **persistent memory**, **time travel**, **visual context**, **multi-agent sync**, **GDPR-compliant deletion**, **memory tracing**, **quantized vector compression**, and **LangChain integration** — all running locally with zero cloud dependencies.
 >
-> Built with **SQLite + F32_BLOB vector search**, **optimistic concurrency control**, **MCP Prompts & Resources**, **auto-compaction**, **Gemini-powered Morning Briefings**, **MemoryTrace explainability**, and optional **Supabase cloud sync**.
+> Built with **SQLite + F32_BLOB vector search**, **TurboQuant 10× embedding compression**, **optimistic concurrency control**, **MCP Prompts & Resources**, **auto-compaction**, **Gemini-powered Morning Briefings**, **MemoryTrace explainability**, and optional **Supabase cloud sync**.
 
 ## Table of Contents
 
+- [What's New (v5.1.0)](#whats-new-in-v510--deep-storage--knowledge-graph-)
+- [What's New (v5.0.0)](#whats-new-in-v500--quantized-agentic-memory-)
 - [What's New (v4.6.0)](#whats-new-in-v460--opentelemetry-observability-)
 - [Multi-Instance Support](#multi-instance-support)
 - [How Prism Compares](#how-prism-compares)
@@ -23,7 +25,7 @@
 - [Claude Code Integration (Hooks)](#claude-code-integration-hooks)
 - [Gemini / Antigravity Integration](#gemini--antigravity-integration)
 - [Use Cases](#use-cases)
-- [Architecture](#architecture)
+- [Architecture](#architecture) | [Full Architecture Guide](docs/ARCHITECTURE.md) | [Self-Improving Agent Guide](docs/self-improving-agent.md)
 - [Tool Reference](#tool-reference)
 - [Agent Hivemind — Role Usage](#agent-hivemind--role-usage)
 - [LangChain / LangGraph Integration](#langchain--langgraph-integration)
@@ -42,12 +44,98 @@
 
 ---
 
-## What's New in
+## What's New in v5.1.0 — Deep Storage & Knowledge Graph 🗂️
+
+> **🗂️ Reclaim 90% of your vector storage and visually edit your agent's knowledge graph.**
+> [CHANGELOG](CHANGELOG.md)
+
+| Feature | Description |
+|---|---|
+| 🗑️ **Deep Storage Mode** | New `deep_storage_purge` tool NULLs out redundant float32 embeddings for entries with TurboQuant compressed blobs, reclaiming ~90% of vector storage. Safety guards: 7-day minimum age, dry-run preview, multi-tenant isolation. |
+| 🕸️ **Knowledge Graph Editor** | The Mind Palace Neural Graph is now fully interactive — click nodes to rename or delete keywords, filter by project/date/importance, and surgically groom your agent's semantic memory. |
+| 🔧 **Auto-Load Reliability** | Hardened hook-based integration patterns for Claude Code and Gemini/Antigravity to guarantee context loading on the absolute first turn without reasoning hallucinations. |
+| 🧪 **303 Tests** | 8 new deep-storage test cases covering dry run, execute, safety guards, and idempotency — zero regressions across 13 suites. |
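
The purge rule in the Deep Storage Mode row can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Prism's implementation: the `user_id` and `created_at` columns, the `purge_embeddings` helper, and the in-memory table are hypothetical. Only the `session_ledger`, `embedding`, and `embedding_compressed` names, the 7-day minimum age, and the dry-run behaviour come from the README text in this diff.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

MIN_AGE_DAYS = 7  # safety guard stated in the README


def purge_embeddings(conn, user_id, now, older_than_days=MIN_AGE_DAYS, dry_run=True):
    """Count eligible rows; NULL their float32 embedding unless dry_run is set."""
    if older_than_days < MIN_AGE_DAYS:
        raise ValueError(f"older_than_days must be >= {MIN_AGE_DAYS}")
    cutoff = (now - timedelta(days=older_than_days)).isoformat()
    # Eligible: same tenant, old enough, float32 still present, compressed copy exists.
    where = (
        "user_id = ? AND created_at < ? "
        "AND embedding IS NOT NULL AND embedding_compressed IS NOT NULL"
    )
    (eligible,) = conn.execute(
        f"SELECT COUNT(*) FROM session_ledger WHERE {where}", (user_id, cutoff)
    ).fetchone()
    if not dry_run:
        conn.execute(
            f"UPDATE session_ledger SET embedding = NULL WHERE {where}",
            (user_id, cutoff),
        )
        conn.commit()
    return eligible


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE session_ledger (id INTEGER PRIMARY KEY, user_id TEXT, "
        "created_at TEXT, embedding TEXT, embedding_compressed TEXT)"
    )
    now = datetime(2026, 1, 31, tzinfo=timezone.utc)
    rows = [
        ("alice", now - timedelta(days=30), "[0.1]", "blob"),  # eligible
        ("alice", now - timedelta(days=2), "[0.2]", "blob"),   # too recent
        ("alice", now - timedelta(days=30), "[0.3]", None),    # no compressed copy
        ("bob", now - timedelta(days=30), "[0.4]", "blob"),    # other tenant
    ]
    conn.executemany(
        "INSERT INTO session_ledger (user_id, created_at, embedding, embedding_compressed) "
        "VALUES (?, ?, ?, ?)",
        [(u, t.isoformat(), e, c) for u, t, e, c in rows],
    )
    preview = purge_embeddings(conn, "alice", now, dry_run=True)
    purged = purge_embeddings(conn, "alice", now, dry_run=False)
    again = purge_embeddings(conn, "alice", now, dry_run=False)
    return preview, purged, again
```

`demo()` returns the dry-run count, the executed count, and the count from a second run. The second run finds nothing because the first already cleared the only eligible row, which is the idempotency property the test row mentions.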
+
+---
+
+## What's New in v5.0.0 — Quantized Agentic Memory 🧬
+
+> **🧬 10× embedding compression is here.** Powered by Google's TurboQuant (ICLR 2026), Prism now compresses 768-dim embeddings from **3,072 bytes → ~400 bytes** — enabling decades of session history on a standard laptop.
+> [RFC-001: Quantized Agentic Memory](docs/rfcs/001-turboquant-integration.md) | [CHANGELOG](CHANGELOG.md)
+
+### Performance Benchmarks
+
+| Metric | Before v5.0 | After v5.0 |
+|--------|------------|------------|
+| **Storage per embedding** | 3,072 bytes (float32) | ~400 bytes (turbo4) |
+| **Compression ratio** | 1:1 | **~7.7:1** (4-bit) / **~10.1:1** (3-bit) |
+| **Similarity correlation** | Baseline | >0.85 (4-bit) |
+| **Top-1 retrieval accuracy** | Baseline | >90% (N=100) |
+| **Entries per GB** | ~330K | **~2.5M** |
+| **Search without vector DB** | ❌ Empty | ✅ Tier-2 JS fallback |
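
The storage figures in the benchmark table follow from simple arithmetic, reproduced here as a quick check. The ~400-byte blob size is taken from the table as given. Treating "GB" as a decimal gigabyte and the `packed_bytes` helper are assumptions made for this sketch.

```python
DIMS = 768          # embedding dimensions quoted in the README
FLOAT32_BYTES = 4   # bytes per float32 component
GB = 10**9          # decimal gigabyte (assumption)
TURBO4_BYTES = 400  # approximate blob size quoted in the table


def packed_bytes(bits_per_dim, dims=DIMS):
    """Raw size of the bit-packed codes alone, without any per-vector metadata."""
    return dims * bits_per_dim // 8


float32_size = DIMS * FLOAT32_BYTES

summary = {
    "float32_bytes": float32_size,
    "turbo4_payload_bytes": packed_bytes(4),
    "turbo3_payload_bytes": packed_bytes(3),
    "ratio_4bit": round(float32_size / TURBO4_BYTES, 2),
    "entries_per_gb_float32": GB // float32_size,
    "entries_per_gb_turbo4": GB // TURBO4_BYTES,
    "saved_fraction_4bit": round(1 - TURBO4_BYTES / float32_size, 3),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

Two things fall out of the numbers. The 4-bit codes alone occupy 384 bytes, so the quoted ~400 bytes leaves a small margin for per-vector metadata. The "10×" headline corresponds to the 3-bit ratio in the table, while the ~400-byte figure corresponds to the 4-bit format at roughly 7.7:1.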
+
+### Three-Tier Memory Architecture
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│ PRISM v5.0 MEMORY │
+├─────────┬───────────────┬───────────────────────────────────┤
+│ TIER │ STORAGE │ SEARCH METHOD │
+├─────────┼───────────────┼───────────────────────────────────┤
+│ Tier 0 │ FTS5 keywords │ Full-text search (knowledge_search) │
+│ Tier 1 │ float32 3072B │ sqlite-vec cosine (native) │
+│ Tier 2 │ turbo4 400B │ JS asymmetricCosineSimilarity │
+└─────────┴───────────────┴───────────────────────────────────┘
+
+searchMemory() flow:
+→ Tier 1 (sqlite-vec) ── success → return results
+── fail → Tier 2 (TurboQuant JS)
+── success → return results
+── fail → return []
+```
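
The `searchMemory()` flow above is an ordered fallback chain. A minimal sketch of that control flow follows. The two tier functions are hypothetical stand-ins, and treating a raised exception as "fail" is an assumption, since the diff does not show how Prism detects a failed tier.

```python
from typing import Callable, List, Sequence

SearchFn = Callable[[str], List[dict]]


def search_memory(query: str, tiers: Sequence[SearchFn]) -> List[dict]:
    """Try each tier in order; fall through on error, return [] if every tier fails."""
    for tier in tiers:
        try:
            return tier(query)
        except Exception:
            continue  # this tier is unavailable, try the next one
    return []


def native_vector_search(query: str) -> List[dict]:
    # Hypothetical Tier 1: pretend the native vector extension is missing.
    raise RuntimeError("native vector search unavailable")


def quantized_search(query: str) -> List[dict]:
    # Hypothetical Tier 2: pretend the compressed-blob scan found one entry.
    return [{"summary": "Built auth flow", "score": 0.91}]


results = search_memory("auth flow", [native_vector_search, quantized_search])
```

Passing the tiers as a list keeps the order explicit and makes it easy to exercise each fallback path in isolation.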
+
+### Live Usage: How TurboQuant Works in Practice
+
+**Every `session_save_ledger` call now generates both tiers automatically:**
+
+```typescript
+// What happens behind the scenes when you save a session:
+await saveLedger({ project: "my-app", summary: "Built auth flow" });
+
+// 1. Gemini generates float32 embedding (3,072 bytes)
+// 2. TurboQuant compresses to turbo4 blob (~400 bytes)
+// 3. Single atomic patchLedger writes BOTH to the database
+// → embedding: "[0.0234, -0.0156, ...]" (float32)
+// → embedding_compressed: "base64..." (turbo4)
+// → embedding_format: "turbo4"
+// → embedding_turbo_radius: 12.847
+
+// Searching works seamlessly across both tiers:
+await searchMemory({ query: "auth flow" });
+// → Tier 1 tries native vector search
+// → If unavailable, Tier 2 deserializes compressed blobs
+// and ranks using asymmetric cosine similarity in JS
+```
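
To make "asymmetric" concrete: the query stays in full precision while the stored vector is read from its compressed codes. The sketch below uses plain uniform 4-bit quantization with a per-vector max-abs scale. This is deliberately not TurboQuant, which the README describes as using a rotation, a Lloyd-Max codebook, and QJL error correction. The function names are hypothetical, and how Prism's `embedding_turbo_radius` is defined is not shown in this diff.

```python
import math

LEVELS = 16  # 4 bits per dimension


def compress(vec):
    """Quantize each component to a 4-bit code and pack two codes per byte."""
    scale = max(abs(x) for x in vec)
    if scale == 0:
        raise ValueError("cannot quantize an all-zero vector")
    # Map x / scale from [-1, 1] onto integer codes 0..15.
    codes = [min(LEVELS - 1, int((x / scale + 1.0) / 2.0 * LEVELS)) for x in vec]
    if len(codes) % 2:
        codes.append(0)  # pad to a whole byte
    blob = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return blob, scale


def asymmetric_cosine(query, blob, dims):
    """Cosine between a float query and the bucket centers of a compressed vector.

    The per-vector scale cancels out in cosine similarity, so it is not needed here.
    """
    codes = []
    for byte in blob:
        codes.extend((byte >> 4, byte & 0x0F))
    approx = [(c + 0.5) / LEVELS * 2.0 - 1.0 for c in codes[:dims]]
    dot = sum(q * a for q, a in zip(query, approx))
    q_norm = math.sqrt(sum(q * q for q in query))
    a_norm = math.sqrt(sum(a * a for a in approx))
    return dot / (q_norm * a_norm)


vec = [math.sin(0.37 * i) for i in range(768)]
blob, scale = compress(vec)
score = asymmetric_cosine(vec, blob, 768)
```

For a 768-dimension input the packed blob is 384 bytes. Scoring a vector against its own compressed form stays close to 1, which is the property a ranking step relies on.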
+
+**Backfill existing entries with one command:**
+```
+> Use tool: session_backfill_embeddings
+> Now repairs AND compresses in a single atomic update
+```
+
+> **💡 Ollama TurboQuant Tip:** If using Ollama for self-hosted inference, set `OLLAMA_KV_CACHE_TYPE=turbo3` for 10× smaller KV caches during generation — the same algorithm powering Prism's memory compression.
+
+---
+
+<details>
+<summary><strong>What's in v4.6.0 — OpenTelemetry Observability 🔭</strong></summary>
 
 > **🔭 Full distributed tracing for every MCP tool call, LLM provider hop, and background AI worker.**
 > Configure in the new **🔭 Observability** tab in Mind Palace — no code changes required.
 > Activates a 4-tier span waterfall: `mcp.call_tool` → `worker.vlm_caption` → `llm.generate_image_description` / `llm.generate_embedding`.
 
+</details>
+
 <a name="whats-new-in-v451--gdpr-export-"></a>
 <details>
 <summary><strong>What's in v4.5.1 — GDPR Export & Test Hardening 🔒</strong></summary>
@@ -234,7 +322,7 @@
 | Feature | Description |
 |---|---|
 | 🏠 **Local-First SQLite** | Run Prism entirely locally with zero cloud dependencies. Full vector search (libSQL F32_BLOB) and FTS5 included. |
-| 🔮 **Mind Palace UI** | A beautiful glassmorphism dashboard at `localhost:3000` to inspect your agent's memory, visual vault, and Git drift. |
+| 🔮 **Mind Palace UI** | A beautiful glassmorphism dashboard at `localhost:3000` (configurable via `PRISM_DASHBOARD_PORT`) to inspect your agent's memory, visual vault, and Git drift. |
 | 🕰️ **Time Travel** | `memory_history` and `memory_checkout` act like `git revert` for your agent's brain — full version history with OCC. |
 | 🖼️ **Visual Memory** | Agents can save screenshots to a local media vault. Auto-capture mode snapshots your local dev server on every handoff save. |
 | 📡 **Agent Telepathy** | Multi-client sync: if your agent in Cursor saves state, Claude Desktop gets a live notification instantly. |
@@ -277,6 +365,8 @@
 | **VLM Image Captions** | ✅ Auto-caption vault (v4.5) | ❌ | ❌ | ❌ | ❌ |
 | **Pluggable LLM Adapters** | ✅ OpenAI/Anthropic/Gemini/Ollama | ❌ | ✅ Multi-provider | ❌ | ❌ |
 | **LangChain** | ✅ BaseRetriever | ❌ | ❌ | ❌ | ❌ |
+| **Vector Compression** | ✅ TurboQuant 10× (v5.0) | ❌ | ❌ | ❌ | ❌ |
+| **Three-Tier Search** | ✅ FTS + Vec + Quantized | ❌ | ❌ | ❌ | ❌ |
 | **MCP Native** | ✅ stdio | ✅ stdio | ❌ Python SDK | ✅ HTTP + MCP | ✅ stdio |
 | **Language** | TypeScript | TypeScript | Python | Python | Python |
 
@@ -468,11 +558,36 @@ Add to your Continue `config.json` or Cline MCP settings:
 
 ## Claude Code Integration (Hooks)
 
-Claude Code supports
+Claude Code supports custom hooks (`SessionStart`, `Stop`) that can force the agent to load and save Prism context automatically. Because Claude Code requires explicit permission for MCP tools, you must also whitelist the Prism commands.
+
+### 1. The Auto-Load Hook Script
+
+Create a Python script (e.g., `~/.claude/mcp_autoload_hook.py`). This script outputs JSON that Claude Code reads during the `SessionStart` event.
 
-
+```python
+#!/usr/bin/env python3
+import json
+import sys
+
+def main():
+    # Inject a system message forcing the agent to load memory BEFORE speaking
+    print(json.dumps({
+        "continue": True,
+        "suppressOutput": True,
+        "systemMessage": (
+            "## First Action\n"
+            "Call `mcp__prism-mcp__session_load_context(project='my-project', level='deep')` "
+            "before responding to the user. Do not generate any text before calling this tool."
+        )
+    }))
+
+if __name__ == "__main__":
+    main()
+```
+
+### 2. Configure `settings.json`
 
-
+Map the hooks in your `~/.claude/settings.json`:
 
 ```json
 {
@@ -483,47 +598,45 @@ Automatically loads context when a new session begins:
 "hooks": [
 {
 "type": "command",
-"command": "python3
+"command": "python3 /Users/you/.claude/mcp_autoload_hook.py",
 "timeout": 10
 }
 ]
 }
-]
-}
-}
-```
-
-### Stop Hook
-
-Automatically saves session memory when a session ends:
-
-```json
-{
-"hooks": {
+],
 "Stop": [
 {
 "matcher": "*",
 "hooks": [
 {
 "type": "command",
-"command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput':
+"command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput': True, 'systemMessage': 'MANDATORY END WORKFLOW: 1) Call mcp__prism-mcp__session_save_ledger with project and summary. 2) Call mcp__prism-mcp__session_save_handoff with expected_version set to the loaded version.'}))\"",
 "timeout": 10
 }
 ]
 }
 ]
+},
+"permissions": {
+"allow": [
+"mcp__prism-mcp__session_load_context",
+"mcp__prism-mcp__session_save_ledger",
+"mcp__prism-mcp__session_save_handoff",
+"mcp__prism-mcp__knowledge_search",
+"mcp__prism-mcp__session_search_memory"
+]
 }
 }
 ```
 
 ### How the Hooks Work
 
-The hook `command` runs a Python
+The hook `command` runs a Python script that returns a JSON object to Claude Code:
 
 | Field | Purpose |
 |---|---|
 | `continue: true` | Tell Claude Code to proceed (don't abort the session) |
-| `suppressOutput:
+| `suppressOutput: true` | Silently inject the system message (recommended for Stop hooks) |
 | `systemMessage` | Instruction injected as a system message — the agent follows it |
 
 The agent receives the `systemMessage` as an instruction and executes the tool calls. The server resolves the agent's **role** and **name** automatically from the dashboard — no need to specify them in the hook.
@@ -542,49 +655,55 @@ explicit tool argument → dashboard setting → "global" (default)
 
 Change your role once in the dashboard, and it automatically applies to every session — CLI, extension, and all MCP clients.
 
-###
-
-If hydration ran successfully, the agent's output will include:
-- A `[👤 AGENT IDENTITY]` block showing your dashboard-configured role and name
-- `PRISM_CONTEXT_LOADED` marker text
+### Troubleshooting Claude Code
 
-
+- **Hook not firing?** Check the hook `timeout` in Claude Code settings. If your Python script takes too long, Claude ignores it silently.
+- **"Tool not available" hallucination?** If Claude claims it doesn't have the tool, it's usually an adversarial Chain-of-Thought loop. Ensure the `permissions.allow` array exactly matches the double-underscore format (`mcp__prism-mcp__...`).
+- **Missing `PRISM_CONTEXT_LOADED`?** The hook didn't fire or the MCP server isn't connected. Verify `prism-mcp` is listed in your `mcpServers` config.
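
The double-underscore requirement from the troubleshooting list is easy to check mechanically. The helper below is a hypothetical sketch, not part of Prism or Claude Code. It takes an already-parsed `settings.json` dictionary and reports entries that do not follow the `mcp__<server>__<tool>` pattern, plus any tools from a required list that are absent.

```python
import re

# Pattern used by the README's permissions.allow entries: mcp__<server>__<tool>
CLAUDE_TOOL = re.compile(r"^mcp__[A-Za-z0-9-]+__[A-Za-z0-9_]+$")

# Tools the README's hooks ask the agent to call.
REQUIRED_TOOLS = ("session_load_context", "session_save_ledger", "session_save_handoff")


def check_permissions(settings, server="prism-mcp", required=REQUIRED_TOOLS):
    """Return a list of human-readable problems; an empty list means the config looks fine."""
    allowed = settings.get("permissions", {}).get("allow", [])
    problems = []
    for name in allowed:
        # Only inspect MCP-style entries; other permission strings are left alone.
        if name.startswith("mcp_") and not CLAUDE_TOOL.match(name):
            problems.append(f"malformed tool name: {name}")
    for tool in required:
        expected = f"mcp__{server}__{tool}"
        if expected not in allowed:
            problems.append(f"missing: {expected}")
    return problems


example = {"permissions": {"allow": ["mcp_prism-mcp_session_load_context"]}}
for problem in check_permissions(example):
    print(problem)
```

To run it against a real file, load the settings with `json.load` first. The single-underscore entry in `example` is flagged as malformed and the three required tools are reported as missing.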
 
 ---
 
 ## Gemini / Antigravity Integration
 
-Gemini-based
+Antigravity and Gemini-based agents require a radically simplified approach to auto-loading. If you give modern instruction-tuned models a long list of "Banned Behaviors" (e.g., "Do NOT say hello first"), their internal reasoning often over-indexes on the constraints and causes them to hallucinate that the tool doesn't exist.
 
-###
+### The 2-Line "First Action" Rule
 
-
-## Prism MCP Memory Auto-Load (CRITICAL)
-At the start of every new session, call `mcp__prism-mcp__session_load_context`
-for these projects:
-- `my-project` (level=standard)
-- `my-other-project` (level=standard)
+Create a `GEMINI.md` file in your project root (or globally at `~/.gemini/GEMINI.md`) or paste this into your Antigravity **User Rules**:
 
-
+```markdown
+## First Action
+Call `mcp_prism-mcp_session_load_context(project="my-project", level="deep")` before responding.
 ```
 
-
+> **Note:** Antigravity uses single underscores (`mcp_prism-mcp_...`) compared to Claude Code's double underscores (`mcp__prism-mcp__...`).
+
+That's it — **two lines**. This approach proved reliable after 13 iterations of increasingly complex prompt engineering. The key insight: shorter instructions avoid triggering the model's adversarial reasoning about tool availability.
 
-
+### Session End Protocol
 
-
-
+At the end of your conversation, explicitly tell the agent:
+> *"Wrap up the session."*
 
-
+The agent will rely on its system prompt to execute:
+1. `session_save_ledger` — immutable work log with summary, TODOs, and decisions
+2. `session_save_handoff` — passing the `expected_version` it received during the load step to ensure Optimistic Concurrency Control
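
The reason `expected_version` matters in step 2 is the usual optimistic concurrency check: a save succeeds only if nobody else has written since the caller loaded. The sketch below shows the generic pattern with a hypothetical in-memory `HandoffStore`. How Prism numbers versions and reports conflicts is not shown in this diff.

```python
class VersionConflict(Exception):
    """Raised when a writer's expected_version is stale."""


class HandoffStore:
    """In-memory stand-in for one versioned handoff record per project."""

    def __init__(self):
        self._state = {}  # project -> (version, payload)

    def load(self, project):
        # An unknown project starts at version 0 with no payload.
        return self._state.get(project, (0, None))

    def save(self, project, payload, expected_version):
        current_version, _ = self.load(project)
        if expected_version != current_version:
            raise VersionConflict(
                f"expected v{expected_version}, but stored version is v{current_version}"
            )
        new_version = current_version + 1
        self._state[project] = (new_version, payload)
        return new_version


store = HandoffStore()
loaded_version, _ = store.load("my-app")            # both agents load version 0
first = store.save("my-app", "agent A state", loaded_version)
try:
    store.save("my-app", "agent B state", loaded_version)  # stale version
    conflict = False
except VersionConflict:
    conflict = True
```

The second writer has to reload, merge, and retry with the new version rather than silently overwriting the first writer's state.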
 
-
+### Antigravity UI Caveats
+
+Antigravity's UI currently does **not** visually render the raw output of MCP tool calls. To ensure the agent actually ingested the context, add this to your User Rules:
 
 ```markdown
-##
-
-
+## STEP 2: Echo Context in Your Text Response
+After the tool returns, include the following in your greeting text:
+- Agent identity: `🤖 Agent: <role> — <name>`
+- Last session summary
+- Open TODOs
+- Session version number
 ```
 
+This forces the agent to prove it loaded context by echoing it in visible text.
+
 ---
 
 ## Use Cases
@@ -651,70 +770,38 @@ session_view_image(project="my-app", image_id="8f2a1b3c")
 
 ---
 
-### 🔌 Pluggable LLM Adapters
-
-**Why:** Run fully local/air-gapped with Ollama, or switch providers without changing tool logic.
-
-**Setup:** Set in MCP config `env`:
-
-```json
-{
-"env": {
-"PRISM_LLM_PROVIDER": "ollama",
-"PRISM_LLM_MODEL": "llama3.2",
-"PRISM_LLM_BASE_URL": "http://localhost:11434"
-}
-}
-```
-
-| Provider | Env Var | Notes |
-|----------|---------|-------|
-| `gemini` (default) | `GOOGLE_API_KEY` | Best for Morning Briefings |
-| `openai` | `OPENAI_API_KEY` | GPT-4o supports VLM |
-| `anthropic` | `ANTHROPIC_API_KEY` | Claude 3.5 supports VLM |
-| `ollama` | none | Full local/air-gapped mode |
-
----
-
-### 📦 GDPR Memory Export
-
-```
-session_export_memory(project="my-app", format="zip")
-```
-
-Outputs a ZIP containing:
-- `ledger.json` — all session entries
-- `handoffs.json` — all project state snapshots
-- `knowledge.md` — graduated insights in Markdown
-- Sensitive fields (API keys, tokens) automatically redacted
-
----
-
 ## Architecture
 
+> **📖 Deep dive**: [Full Architecture Guide](docs/ARCHITECTURE.md) — TurboQuant math, Three-Tier search, storage optimization flow
+> **🤖 Tutorial**: [How to Build a Self-Improving Agent](docs/self-improving-agent.md) — corrections → behavioral memory → IDE rules
+
 ```mermaid
 graph TB
 Client["AI Client<br/>(Claude Desktop / Cursor / Windsurf)"]
-LangChain["LangChain / LangGraph<br/>(Python Retrievers)"]
+LangChain["LangChain / LangGraph<br/>(Python/TS Retrievers)"]
 MCP["Prism MCP Server<br/>(TypeScript)"]
 
 Client -- "MCP Protocol (stdio)" --> MCP
 LangChain -- "JSON-RPC via MCP Bridge" --> MCP
 
-MCP --> Tracing["
-MCP --> Dashboard["Mind Palace Dashboard<br/>localhost:3000"]
+MCP --> Tracing["OTel Tracing<br/>v4.6 Observability"]
+MCP --> Dashboard["Mind Palace Dashboard<br/>localhost:3000<br/>(PRISM_DASHBOARD_PORT)"]
 MCP --> Brave["Brave Search API<br/>Web + Local + AI Answers"]
-MCP -->
+MCP --> LLM["LLM Factory<br/>Gemini / OpenAI / Ollama"]
 MCP --> Sandbox["QuickJS Sandbox<br/>Code-Mode Templates"]
 MCP --> SyncBus["SyncBus<br/>Agent Telepathy"]
 MCP --> GDPR["GDPR Engine<br/>Soft/Hard Delete + Audit"]
 
 MCP --> Storage{"Storage Backend"}
-Storage --> SQLite["SQLite (Local)<br/>libSQL +
+Storage --> SQLite["SQLite (Local)<br/>libSQL + sqlite-vec"]
 Storage --> Supabase["Supabase (Cloud)<br/>PostgreSQL + pgvector"]
 
-SQLite --> Ledger["session_ledger
-
+SQLite --> Ledger["session_ledger"]
+Ledger --> T1["Tier 1: float32<br/>3,072B native search"]
+T1 -- "v5.0 TurboQuant" --> T2["Tier 2: turbo4<br/>400B JS search"]
+T1 -. "v5.1 Purge" .-> Null["NULL after 30d"]
+
+SQLite --> Handoffs["session_handoffs<br/>(OCC versioning)"]
 SQLite --> History["history_snapshots<br/>(Time Travel)"]
 SQLite --> Media["media vault<br/>(Visual Memory)"]
 
@@ -724,13 +811,16 @@ graph TB
 style Tracing fill:#D69E2E,color:#fff
 style Dashboard fill:#9F7AEA,color:#fff
 style Brave fill:#FB542B,color:#fff
-style
+style LLM fill:#4285F4,color:#fff
 style Sandbox fill:#805AD5,color:#fff
 style SyncBus fill:#ED64A6,color:#fff
 style GDPR fill:#E53E3E,color:#fff
 style Storage fill:#2D3748,color:#fff
 style SQLite fill:#38B2AC,color:#fff
 style Supabase fill:#3ECF8E,color:#fff
+style T1 fill:#48BB78,color:#fff
+style T2 fill:#E8B004,color:#000
+style Null fill:#E53E3E,color:#fff
 ```
 
 ---
@@ -1060,6 +1150,7 @@ The retrievers use `_aget_relevant_documents` as the primary path with `asyncio.
 | `PRISM_AUTO_CAPTURE` | No | Set `"true"` to auto-capture HTML snapshots of dev servers |
 | `PRISM_CAPTURE_PORTS` | No | Comma-separated ports to scan (default: `3000,3001,5173,8080`) |
 | `PRISM_DEBUG_LOGGING` | No | Set `"true"` to enable verbose debug logs (default: quiet) |
+| `PRISM_DASHBOARD_PORT` | No | Configure the dashboard port (default: `3000`) |
 
 ---
 
@@ -1512,7 +1603,6 @@ See [`vertex-ai/`](vertex-ai/) for setup and benchmarks.
 │ │ ├── compactionHandler.ts # Gemini-powered ledger compaction
 │ │ └── index.ts # Tool registration & re-exports
 │ └── utils/
-│ └── utils/
 │ ├── telemetry.ts # OTel singleton — NodeTracerProvider, BatchSpanProcessor, no-op mode
 │ ├── tracing.ts # MemoryTrace types + factory (Phase 1 — LLM explainability)
 │ ├── imageCaptioner.ts # VLM auto-caption pipeline (v4.5) + worker.vlm_caption OTel span
@@ -1554,6 +1644,24 @@ See [`vertex-ai/`](vertex-ai/) for setup and benchmarks.
 
 > **[View the full project board →](https://github.com/users/dcostenco/projects/1/views/1)** | **[Full ROADMAP.md →](ROADMAP.md)**
 
+### ✅ v5.0 — Quantized Agentic Memory (Shipped!)
+
+| Feature | Description |
+|---|---|
+| 🧮 **TurboQuant Math Core** | Pure TypeScript port of Google's TurboQuant (ICLR 2026) — Lloyd-Max codebook, QR rotation, QJL error correction. Zero dependencies. [RFC-001](docs/rfcs/001-turboquant-integration.md) |
+| 📦 **~7× Embedding Compression** | 768-dim embeddings shrink from 3,072 bytes to ~400 bytes (4-bit) via variable bit-packing. |
+| 🔍 **Asymmetric Similarity** | Unbiased inner product estimator: query as float32 vs compressed blobs. No decompression needed. |
+| 🗄️ **Two-Tier Search** | FTS5 candidate filter → JS-side asymmetric scoring. Bypasses sqlite-vec float32 limitation. |
+
+### ✅ v5.1 — Deep Storage Mode (Shipped!)
+
+| Feature | Description |
+|---|---|
+| 🧬 **Deep Storage Purge** | Automated `deep_storage_purge` tool NULLs out redundant float32 embeddings for entries with TurboQuant compressed blobs, reclaiming ~90% of vector storage. |
+| 🛡️ **Safety Guards** | Minimum 7-day age threshold, dry-run preview mode, multi-tenant isolation, and compressed-blob-existence validation ensure zero data loss. |
+| 🗃️ **Supabase RPC** | `prism_purge_embeddings` Postgres function (migration 030) provides full backend parity with SQLite. Auto-applied via the v4.1 migration runner. |
+| 🧪 **303 Tests** | 8 new deep-storage test cases covering dry run, execute, safety guards, and idempotency — zero regressions across the full suite. |
+
 ### ✅ v4.6 — OpenTelemetry Observability (Shipped!)
 
 | Feature | Description |
@@ -1608,11 +1716,11 @@ See [v3.1.0](#whats-in-v310--memory-lifecycle-) and [v3.0.0](#whats-in-v300--age
 
 | Priority | Feature | Description |
 |----------|---------|-------------|
-|
-|
+| ✅ | **Documentation & Architecture Guide** | [Architecture Guide](docs/ARCHITECTURE.md), [Self-Improving Agent Guide](docs/self-improving-agent.md), updated README diagram with v5.x vector tiers. |
+| ✅ | **Knowledge Graph Editor** | Interactive vis.js graph with click-to-filter, node stats, project/keyword/category visualization. |
 | 🥉 | **Autonomous Web Scholar** | Agent-driven learning pipeline using Brave Search + VLM to autonomously build project context while the developer sleeps. |
-|
-|
+| ✅ | **Dashboard Auth** | HTTP Basic Auth with session cookies, timing-safe comparison, styled login page. Set `PRISM_DASHBOARD_USER`/`PRISM_DASHBOARD_PASS`. |
+| ✅ | **TypeScript LangGraph Examples** | [Reference agent](examples/langgraph-ts/) with MCP client, memory retriever nodes, and session persistence. |
 | — | **CRDT Conflict Resolution** | Conflict-free types for concurrent multi-agent edits on the same handoff. |
 
 ---