prism-mcp-server 4.6.0 → 5.1.0
- package/README.md +271 -71
- package/dist/dashboard/server.js +240 -7
- package/dist/dashboard/ui.js +198 -16
- package/dist/server.js +22 -3
- package/dist/storage/sqlite.js +247 -6
- package/dist/storage/supabase.js +58 -0
- package/dist/storage/supabaseMigrations.js +86 -1
- package/dist/tools/index.js +2 -2
- package/dist/tools/sessionMemoryDefinitions.js +63 -0
- package/dist/tools/sessionMemoryHandlers.js +99 -5
- package/dist/utils/turboquant.js +730 -0
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -8,12 +8,14 @@
|
|
|
8
8
|
[](https://www.typescriptlang.org/)
|
|
9
9
|
[](https://nodejs.org/)
|
|
10
10
|
|
|
11
|
-
> **Your AI agent's memory that survives between sessions.** Prism MCP is a Model Context Protocol server that gives Claude Desktop, Cursor, Windsurf, and any MCP client **persistent memory**, **time travel**, **visual context**, **multi-agent sync**, **GDPR-compliant deletion**, **memory tracing**, and **LangChain integration** — all running locally with zero cloud dependencies.
|
|
11
|
+
> **Your AI agent's memory that survives between sessions.** Prism MCP is a Model Context Protocol server that gives Claude Desktop, Cursor, Windsurf, and any MCP client **persistent memory**, **time travel**, **visual context**, **multi-agent sync**, **GDPR-compliant deletion**, **memory tracing**, **quantized vector compression**, and **LangChain integration** — all running locally with zero cloud dependencies.
|
|
12
12
|
>
|
|
13
|
-
> Built with **SQLite + F32_BLOB vector search**, **optimistic concurrency control**, **MCP Prompts & Resources**, **auto-compaction**, **Gemini-powered Morning Briefings**, **MemoryTrace explainability**, and optional **Supabase cloud sync**.
|
|
13
|
+
> Built with **SQLite + F32_BLOB vector search**, **TurboQuant 10× embedding compression**, **optimistic concurrency control**, **MCP Prompts & Resources**, **auto-compaction**, **Gemini-powered Morning Briefings**, **MemoryTrace explainability**, and optional **Supabase cloud sync**.
|
|
14
14
|
|
|
15
15
|
## Table of Contents
|
|
16
16
|
|
|
17
|
+
- [What's New (v5.1.0)](#whats-new-in-v510--deep-storage--knowledge-graph-)
|
|
18
|
+
- [What's New (v5.0.0)](#whats-new-in-v500--quantized-agentic-memory-)
|
|
17
19
|
- [What's New (v4.6.0)](#whats-new-in-v460--opentelemetry-observability-)
|
|
18
20
|
- [Multi-Instance Support](#multi-instance-support)
|
|
19
21
|
- [How Prism Compares](#how-prism-compares)
|
|
@@ -23,7 +25,7 @@
|
|
|
23
25
|
- [Claude Code Integration (Hooks)](#claude-code-integration-hooks)
|
|
24
26
|
- [Gemini / Antigravity Integration](#gemini--antigravity-integration)
|
|
25
27
|
- [Use Cases](#use-cases)
|
|
26
|
-
- [Architecture](#architecture)
|
|
28
|
+
- [Architecture](#architecture) | [Full Architecture Guide](docs/ARCHITECTURE.md) | [Self-Improving Agent Guide](docs/self-improving-agent.md)
|
|
27
29
|
- [Tool Reference](#tool-reference)
|
|
28
30
|
- [Agent Hivemind — Role Usage](#agent-hivemind--role-usage)
|
|
29
31
|
- [LangChain / LangGraph Integration](#langchain--langgraph-integration)
|
|
@@ -42,12 +44,98 @@
|
|
|
42
44
|
|
|
43
45
|
---
|
|
44
46
|
|
|
45
|
-
## What's New in
|
|
47
|
+
## What's New in v5.1.0 — Deep Storage & Knowledge Graph 🗂️
|
|
48
|
+
|
|
49
|
+
> **🗂️ Reclaim 90% of your vector storage and visually edit your agent's knowledge graph.**
|
|
50
|
+
> [CHANGELOG](CHANGELOG.md)
|
|
51
|
+
|
|
52
|
+
| Feature | Description |
|
|
53
|
+
|---|---|
|
|
54
|
+
| 🗑️ **Deep Storage Mode** | New `deep_storage_purge` tool NULLs out redundant float32 embeddings for entries with TurboQuant compressed blobs, reclaiming ~90% of vector storage. Safety guards: 7-day minimum age, dry-run preview, multi-tenant isolation. |
|
|
55
|
+
| 🕸️ **Knowledge Graph Editor** | The Mind Palace Neural Graph is now fully interactive — click nodes to rename or delete keywords, filter by project/date/importance, and surgically groom your agent's semantic memory. |
|
|
56
|
+
| 🔧 **Auto-Load Reliability** | Hardened hook-based integration patterns for Claude Code and Gemini/Antigravity to guarantee context loading on the absolute first turn without reasoning hallucinations. |
|
|
57
|
+
| 🧪 **303 Tests** | 8 new deep-storage test cases covering dry run, execute, safety guards, and idempotency — zero regressions across 13 suites. |
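The safety guards listed in the Deep Storage row can be read as a simple eligibility check. The sketch below is an illustration under stated assumptions, not Prism's implementation: the `LedgerRow` shape, the `userId` field, and the `isPurgeable` and `dryRun` helpers are hypothetical names. Only the 7-day minimum age, the requirement that a compressed blob exists, the dry-run preview, and tenant isolation come from the table above.

```typescript
// Hypothetical row shape; field names are illustrative, not Prism's schema.
interface LedgerRow {
  userId: string;
  createdAt: Date;
  embedding: string | null;           // float32 vector (Tier 1)
  embeddingCompressed: string | null; // TurboQuant blob (Tier 2)
}

const MIN_AGE_DAYS = 7; // minimum age stated in the README
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True only when nulling the float32 embedding would lose no information.
function isPurgeable(row: LedgerRow, userId: string, olderThanDays: number, now: Date): boolean {
  const days = Math.max(olderThanDays, MIN_AGE_DAYS); // never go below the floor
  const ageMs = now.getTime() - row.createdAt.getTime();
  return (
    row.userId === userId &&            // multi-tenant isolation
    row.embedding !== null &&           // there is something left to reclaim
    row.embeddingCompressed !== null && // a compressed copy must already exist
    ageMs >= days * MS_PER_DAY
  );
}

// Dry run: count what would be purged without modifying anything.
function dryRun(rows: LedgerRow[], userId: string, olderThanDays: number, now: Date): number {
  return rows.filter((r) => isPurgeable(r, userId, olderThanDays, now)).length;
}
```

Because a purged row no longer has a float32 embedding, running the same check twice selects nothing the second time, which is one way to get the idempotency the test row mentions.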
|
|
58
|
+
|
|
59
|
+
---
|
|
60
|
+
|
|
61
|
+
## What's New in v5.0.0 — Quantized Agentic Memory 🧬
|
|
62
|
+
|
|
63
|
+
> **🧬 10× embedding compression is here.** Powered by Google's TurboQuant (ICLR 2026), Prism now compresses 768-dim embeddings from **3,072 bytes → ~400 bytes** — enabling decades of session history on a standard laptop.
|
|
64
|
+
> [RFC-001: Quantized Agentic Memory](docs/rfcs/001-turboquant-integration.md) | [CHANGELOG](CHANGELOG.md)
|
|
65
|
+
|
|
66
|
+
### Performance Benchmarks
|
|
67
|
+
|
|
68
|
+
| Metric | Before v5.0 | After v5.0 |
|
|
69
|
+
|--------|------------|------------|
|
|
70
|
+
| **Storage per embedding** | 3,072 bytes (float32) | ~400 bytes (turbo4) |
|
|
71
|
+
| **Compression ratio** | 1:1 | **~7.7:1** (4-bit) / **~10.1:1** (3-bit) |
|
|
72
|
+
| **Similarity correlation** | Baseline | >0.85 (4-bit) |
|
|
73
|
+
| **Top-1 retrieval accuracy** | Baseline | >90% (N=100) |
|
|
74
|
+
| **Entries per GB** | ~330K | **~2.5M** |
|
|
75
|
+
| **Search without vector DB** | ❌ Empty | ✅ Tier-2 JS fallback |
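The storage figures in this table follow from simple arithmetic. The sketch below reproduces them under two assumptions that the README does not state: a decimal gigabyte (10^9 bytes) and counting embedding bytes only, with no row or index overhead. `storageStats` is a hypothetical helper. Note that the ~400-byte figure corresponds to the 4-bit ratio of about 7.7:1, while the 10× headline matches the 3-bit row.

```typescript
// Back-of-the-envelope storage math for one embedding.
function storageStats(dims: number, compressedBytes: number) {
  const rawBytes = dims * 4; // float32 = 4 bytes per dimension
  const GB = 1e9;            // decimal gigabyte (assumption)
  return {
    rawBytes,
    ratio: rawBytes / compressedBytes,
    entriesPerGbRaw: Math.floor(GB / rawBytes),
    entriesPerGbCompressed: Math.floor(GB / compressedBytes),
  };
}

const stats = storageStats(768, 400);
console.log(stats.rawBytes);               // 3072
console.log(stats.ratio.toFixed(2));       // "7.68"
console.log(stats.entriesPerGbRaw);        // 325520
console.log(stats.entriesPerGbCompressed); // 2500000
```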
|
|
76
|
+
|
|
77
|
+
### Three-Tier Memory Architecture
|
|
78
|
+
|
|
79
|
+
```
|
|
80
|
+
┌─────────────────────────────────────────────────────────────┐
|
|
81
|
+
│ PRISM v5.0 MEMORY │
|
|
82
|
+
├─────────┬───────────────┬───────────────────────────────────┤
|
|
83
|
+
│ TIER │ STORAGE │ SEARCH METHOD │
|
|
84
|
+
├─────────┼───────────────┼───────────────────────────────────┤
|
|
85
|
+
│ Tier 0 │ FTS5 keywords │ Full-text search (knowledge_search) │
|
|
86
|
+
│ Tier 1 │ float32 3072B │ sqlite-vec cosine (native) │
|
|
87
|
+
│ Tier 2 │ turbo4 400B │ JS asymmetricCosineSimilarity │
|
|
88
|
+
└─────────┴───────────────┴───────────────────────────────────┘
|
|
89
|
+
|
|
90
|
+
searchMemory() flow:
|
|
91
|
+
→ Tier 1 (sqlite-vec) ── success → return results
|
|
92
|
+
── fail → Tier 2 (TurboQuant JS)
|
|
93
|
+
── success → return results
|
|
94
|
+
── fail → return []
|
|
95
|
+
```
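The `searchMemory()` flow above is a fallback chain: try each tier in order and return the first result set that a tier produces. A minimal sketch of that control flow, assuming a tier signals failure by throwing (the README does not say how failure is detected); `Hit`, `Tier`, and `searchWithFallback` are hypothetical names:

```typescript
type Hit = { id: string; score: number };
type Tier = (query: string) => Hit[]; // throws when the tier is unavailable

// Try each tier in order; the first one that succeeds wins.
function searchWithFallback(query: string, tiers: Tier[]): Hit[] {
  for (const tier of tiers) {
    try {
      return tier(query);
    } catch {
      // This tier failed; fall through to the next one.
    }
  }
  return []; // every tier failed
}
```

In this sketch a tier that succeeds with zero hits still ends the search. Whether Prism treats an empty Tier 1 result as a failure is not stated in the diagram.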
|
|
96
|
+
|
|
97
|
+
### Live Usage: How TurboQuant Works in Practice
|
|
98
|
+
|
|
99
|
+
**Every `session_save_ledger` call now generates both tiers automatically:**
|
|
100
|
+
|
|
101
|
+
```typescript
|
|
102
|
+
// What happens behind the scenes when you save a session:
|
|
103
|
+
await saveLedger({ project: "my-app", summary: "Built auth flow" });
|
|
104
|
+
|
|
105
|
+
// 1. Gemini generates float32 embedding (3,072 bytes)
|
|
106
|
+
// 2. TurboQuant compresses to turbo4 blob (~400 bytes)
|
|
107
|
+
// 3. Single atomic patchLedger writes BOTH to the database
|
|
108
|
+
// → embedding: "[0.0234, -0.0156, ...]" (float32)
|
|
109
|
+
// → embedding_compressed: "base64..." (turbo4)
|
|
110
|
+
// → embedding_format: "turbo4"
|
|
111
|
+
// → embedding_turbo_radius: 12.847
|
|
112
|
+
|
|
113
|
+
// Searching works seamlessly across both tiers:
|
|
114
|
+
await searchMemory({ query: "auth flow" });
|
|
115
|
+
// → Tier 1 tries native vector search
|
|
116
|
+
// → If unavailable, Tier 2 deserializes compressed blobs
|
|
117
|
+
// and ranks using asymmetric cosine similarity in JS
|
|
118
|
+
```
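To make the ~400-byte figure concrete: at 4 bits per dimension, 768 dimensions pack into 384 bytes of codes, and a small header accounts for the rest. The sketch below uses plain uniform scalar quantization as a simplified stand-in. It is not TurboQuant, which the README describes as using a Lloyd-Max codebook, QR rotation, and QJL error correction. `Packed`, `quantize4bit`, and `asymmetricDot` are hypothetical names.

```typescript
// Uniform 4-bit scalar quantization: a simplified stand-in, NOT TurboQuant.
interface Packed {
  min: number;       // lower bound of the value range
  step: number;      // width of one quantization bucket
  codes: Uint8Array; // two 4-bit codes per byte
}

function quantize4bit(vec: number[]): Packed {
  const min = Math.min(...vec);
  const max = Math.max(...vec);
  const step = (max - min) / 15 || 1; // 16 levels; avoid a zero step for constant vectors
  const codes = new Uint8Array(Math.ceil(vec.length / 2));
  vec.forEach((v, i) => {
    const code = Math.round((v - min) / step); // integer in 0..15
    codes[i >> 1] |= i % 2 === 0 ? code : code << 4; // low nibble first, then high
  });
  return { min, step, codes };
}

// Score a float32 query against packed codes, decoding one value at a time
// instead of materializing the full decompressed vector.
function asymmetricDot(query: number[], p: Packed): number {
  let sum = 0;
  for (let i = 0; i < query.length; i++) {
    const byte = p.codes[i >> 1];
    const code = i % 2 === 0 ? byte & 0x0f : byte >> 4;
    sum += query[i] * (p.min + code * p.step);
  }
  return sum;
}
```

The "asymmetric" part is that only the stored vector is quantized; the query stays in full precision, so quantization error enters the score from one side only.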
|
|
119
|
+
|
|
120
|
+
**Backfill existing entries with one command:**
|
|
121
|
+
```
|
|
122
|
+
> Use tool: session_backfill_embeddings
|
|
123
|
+
> Now repairs AND compresses in a single atomic update
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
> **💡 Ollama TurboQuant Tip:** If using Ollama for self-hosted inference, set `OLLAMA_KV_CACHE_TYPE=turbo3` for 10× smaller KV caches during generation — the same algorithm powering Prism's memory compression.
|
|
127
|
+
|
|
128
|
+
---
|
|
129
|
+
|
|
130
|
+
<details>
|
|
131
|
+
<summary><strong>What's in v4.6.0 — OpenTelemetry Observability 🔭</strong></summary>
|
|
46
132
|
|
|
47
133
|
> **🔭 Full distributed tracing for every MCP tool call, LLM provider hop, and background AI worker.**
|
|
48
134
|
> Configure in the new **🔭 Observability** tab in Mind Palace — no code changes required.
|
|
49
135
|
> Activates a 4-tier span waterfall: `mcp.call_tool` → `worker.vlm_caption` → `llm.generate_image_description` / `llm.generate_embedding`.
|
|
50
136
|
|
|
137
|
+
</details>
|
|
138
|
+
|
|
51
139
|
<a name="whats-new-in-v451--gdpr-export-"></a>
|
|
52
140
|
<details>
|
|
53
141
|
<summary><strong>What's in v4.5.1 — GDPR Export & Test Hardening 🔒</strong></summary>
|
|
@@ -234,7 +322,7 @@
|
|
|
234
322
|
| Feature | Description |
|
|
235
323
|
|---|---|
|
|
236
324
|
| 🏠 **Local-First SQLite** | Run Prism entirely locally with zero cloud dependencies. Full vector search (libSQL F32_BLOB) and FTS5 included. |
|
|
237
|
-
| 🔮 **Mind Palace UI** | A beautiful glassmorphism dashboard at `localhost:3000` to inspect your agent's memory, visual vault, and Git drift. |
|
|
325
|
+
| 🔮 **Mind Palace UI** | A beautiful glassmorphism dashboard at `localhost:3000` (configurable via `PRISM_DASHBOARD_PORT`) to inspect your agent's memory, visual vault, and Git drift. |
|
|
238
326
|
| 🕰️ **Time Travel** | `memory_history` and `memory_checkout` act like `git revert` for your agent's brain — full version history with OCC. |
|
|
239
327
|
| 🖼️ **Visual Memory** | Agents can save screenshots to a local media vault. Auto-capture mode snapshots your local dev server on every handoff save. |
|
|
240
328
|
| 📡 **Agent Telepathy** | Multi-client sync: if your agent in Cursor saves state, Claude Desktop gets a live notification instantly. |
|
|
@@ -271,9 +359,14 @@
|
|
|
271
359
|
| **Auto-Compaction** | ✅ Gemini rollups | ❌ | ❌ | ❌ | ❌ |
|
|
272
360
|
| **Morning Briefing** | ✅ Gemini synthesis | ❌ | ❌ | ❌ | ❌ |
|
|
273
361
|
| **OCC (Concurrency)** | ✅ Version-based | ❌ | ❌ | ❌ | ❌ |
|
|
274
|
-
| **GDPR Compliance** | ✅ Soft/hard delete | ❌ | ❌ | ❌ | ❌ |
|
|
362
|
+
| **GDPR Compliance** | ✅ Soft/hard delete + ZIP export | ❌ | ❌ | ❌ | ❌ |
|
|
275
363
|
| **Memory Tracing** | ✅ Latency breakdown | ❌ | ❌ | ❌ | ❌ |
|
|
364
|
+
| **OpenTelemetry** | ✅ OTLP spans (v4.6) | ❌ | ❌ | ❌ | ❌ |
|
|
365
|
+
| **VLM Image Captions** | ✅ Auto-caption vault (v4.5) | ❌ | ❌ | ❌ | ❌ |
|
|
366
|
+
| **Pluggable LLM Adapters** | ✅ OpenAI/Anthropic/Gemini/Ollama | ❌ | ✅ Multi-provider | ❌ | ❌ |
|
|
276
367
|
| **LangChain** | ✅ BaseRetriever | ❌ | ❌ | ❌ | ❌ |
|
|
368
|
+
| **Vector Compression** | ✅ TurboQuant 10× (v5.0) | ❌ | ❌ | ❌ | ❌ |
|
|
369
|
+
| **Three-Tier Search** | ✅ FTS + Vec + Quantized | ❌ | ❌ | ❌ | ❌ |
|
|
277
370
|
| **MCP Native** | ✅ stdio | ✅ stdio | ❌ Python SDK | ✅ HTTP + MCP | ✅ stdio |
|
|
278
371
|
| **Language** | TypeScript | TypeScript | Python | Python | Python |
|
|
279
372
|
|
|
@@ -465,11 +558,36 @@ Add to your Continue `config.json` or Cline MCP settings:
|
|
|
465
558
|
|
|
466
559
|
## Claude Code Integration (Hooks)
|
|
467
560
|
|
|
468
|
-
Claude Code supports
|
|
561
|
+
Claude Code supports custom hooks (`SessionStart`, `Stop`) that can force the agent to load and save Prism context automatically. Because Claude Code requires explicit permission for MCP tools, you must also whitelist the Prism commands.
|
|
562
|
+
|
|
563
|
+
### 1. The Auto-Load Hook Script
|
|
564
|
+
|
|
565
|
+
Create a Python script (e.g., `~/.claude/mcp_autoload_hook.py`). This script outputs JSON that Claude Code reads during the `SessionStart` event.
|
|
469
566
|
|
|
470
|
-
|
|
567
|
+
```python
|
|
568
|
+
#!/usr/bin/env python3
|
|
569
|
+
import json
|
|
570
|
+
import sys
|
|
571
|
+
|
|
572
|
+
def main():
|
|
573
|
+
# Inject a system message forcing the agent to load memory BEFORE speaking
|
|
574
|
+
print(json.dumps({
|
|
575
|
+
"continue": True,
|
|
576
|
+
"suppressOutput": True,
|
|
577
|
+
"systemMessage": (
|
|
578
|
+
"## First Action\n"
|
|
579
|
+
"Call `mcp__prism-mcp__session_load_context(project='my-project', level='deep')` "
|
|
580
|
+
"before responding to the user. Do not generate any text before calling this tool."
|
|
581
|
+
)
|
|
582
|
+
}))
|
|
583
|
+
|
|
584
|
+
if __name__ == "__main__":
|
|
585
|
+
main()
|
|
586
|
+
```
|
|
471
587
|
|
|
472
|
-
|
|
588
|
+
### 2. Configure `settings.json`
|
|
589
|
+
|
|
590
|
+
Map the hooks in your `~/.claude/settings.json`:
|
|
473
591
|
|
|
474
592
|
```json
|
|
475
593
|
{
|
|
@@ -480,47 +598,45 @@ Automatically loads context when a new session begins:
|
|
|
480
598
|
"hooks": [
|
|
481
599
|
{
|
|
482
600
|
"type": "command",
|
|
483
|
-
"command": "python3
|
|
601
|
+
"command": "python3 /Users/you/.claude/mcp_autoload_hook.py",
|
|
484
602
|
"timeout": 10
|
|
485
603
|
}
|
|
486
604
|
]
|
|
487
605
|
}
|
|
488
|
-
]
|
|
489
|
-
}
|
|
490
|
-
}
|
|
491
|
-
```
|
|
492
|
-
|
|
493
|
-
### Stop Hook
|
|
494
|
-
|
|
495
|
-
Automatically saves session memory when a session ends:
|
|
496
|
-
|
|
497
|
-
```json
|
|
498
|
-
{
|
|
499
|
-
"hooks": {
|
|
606
|
+
],
|
|
500
607
|
"Stop": [
|
|
501
608
|
{
|
|
502
609
|
"matcher": "*",
|
|
503
610
|
"hooks": [
|
|
504
611
|
{
|
|
505
612
|
"type": "command",
|
|
506
|
-
"command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput':
|
|
613
|
+
"command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput': True, 'systemMessage': 'MANDATORY END WORKFLOW: 1) Call mcp__prism-mcp__session_save_ledger with project and summary. 2) Call mcp__prism-mcp__session_save_handoff with expected_version set to the loaded version.'}))\"",
|
|
507
614
|
"timeout": 10
|
|
508
615
|
}
|
|
509
616
|
]
|
|
510
617
|
}
|
|
511
618
|
]
|
|
619
|
+
},
|
|
620
|
+
"permissions": {
|
|
621
|
+
"allow": [
|
|
622
|
+
"mcp__prism-mcp__session_load_context",
|
|
623
|
+
"mcp__prism-mcp__session_save_ledger",
|
|
624
|
+
"mcp__prism-mcp__session_save_handoff",
|
|
625
|
+
"mcp__prism-mcp__knowledge_search",
|
|
626
|
+
"mcp__prism-mcp__session_search_memory"
|
|
627
|
+
]
|
|
512
628
|
}
|
|
513
629
|
}
|
|
514
630
|
```
|
|
515
631
|
|
|
516
632
|
### How the Hooks Work
|
|
517
633
|
|
|
518
|
-
The hook `command` runs a Python
|
|
634
|
+
The hook `command` runs a Python script that returns a JSON object to Claude Code:
|
|
519
635
|
|
|
520
636
|
| Field | Purpose |
|
|
521
637
|
|---|---|
|
|
522
638
|
| `continue: true` | Tell Claude Code to proceed (don't abort the session) |
|
|
523
|
-
| `suppressOutput:
|
|
639
|
+
| `suppressOutput: true` | Silently inject the system message (recommended for Stop hooks) |
|
|
524
640
|
| `systemMessage` | Instruction injected as a system message — the agent follows it |
|
|
525
641
|
|
|
526
642
|
The agent receives the `systemMessage` as an instruction and executes the tool calls. The server resolves the agent's **role** and **name** automatically from the dashboard — no need to specify them in the hook.
|
|
@@ -539,90 +655,153 @@ explicit tool argument → dashboard setting → "global" (default)
|
|
|
539
655
|
|
|
540
656
|
Change your role once in the dashboard, and it automatically applies to every session — CLI, extension, and all MCP clients.
|
|
541
657
|
|
|
542
|
-
###
|
|
543
|
-
|
|
544
|
-
If hydration ran successfully, the agent's output will include:
|
|
545
|
-
- A `[👤 AGENT IDENTITY]` block showing your dashboard-configured role and name
|
|
546
|
-
- `PRISM_CONTEXT_LOADED` marker text
|
|
658
|
+
### Troubleshooting Claude Code
|
|
547
659
|
|
|
548
|
-
|
|
660
|
+
- **Hook not firing?** Check the hook `timeout` in Claude Code settings. If your Python script takes too long, Claude ignores it silently.
|
|
661
|
+
- **"Tool not available" hallucination?** If Claude claims it doesn't have the tool, it's usually an adversarial Chain-of-Thought loop. Ensure the `permissions.allow` array exactly matches the double-underscore format (`mcp__prism-mcp__...`).
|
|
662
|
+
- **Missing `PRISM_CONTEXT_LOADED`?** The hook didn't fire or the MCP server isn't connected. Verify `prism-mcp` is listed in your `mcpServers` config.
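The permission-format problem in the second bullet is easy to check mechanically. A small sketch, assuming Prism tool names are lowercase snake case (true of every tool named in this README, but still an assumption); `findMalformedPrismPermissions` is a hypothetical helper, not part of Prism or Claude Code:

```typescript
// Flag entries that mention prism-mcp but do not use the exact
// double-underscore format mcp__prism-mcp__<tool_name>.
function findMalformedPrismPermissions(allow: string[]): string[] {
  const wellFormed = /^mcp__prism-mcp__[a-z_]+$/;
  return allow.filter((entry) => entry.includes("prism-mcp") && !wellFormed.test(entry));
}

const suspicious = findMalformedPrismPermissions([
  "mcp__prism-mcp__session_load_context",
  "mcp_prism-mcp_session_save_ledger", // single underscores: Antigravity style
  "Bash(ls)",                          // unrelated entry, ignored
]);
console.log(suspicious); // [ 'mcp_prism-mcp_session_save_ledger' ]
```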
|
|
549
663
|
|
|
550
664
|
---
|
|
551
665
|
|
|
552
666
|
## Gemini / Antigravity Integration
|
|
553
667
|
|
|
554
|
-
Gemini-based
|
|
668
|
+
Antigravity and Gemini-based agents require a radically simplified approach to auto-loading. If you give modern instruction-tuned models a long list of "Banned Behaviors" (e.g., "Do NOT say hello first"), their internal reasoning often over-indexes on the constraints and causes them to hallucinate that the tool doesn't exist.
|
|
555
669
|
|
|
556
|
-
###
|
|
670
|
+
### The 2-Line "First Action" Rule
|
|
557
671
|
|
|
558
|
-
|
|
559
|
-
## Prism MCP Memory Auto-Load (CRITICAL)
|
|
560
|
-
At the start of every new session, call `mcp__prism-mcp__session_load_context`
|
|
561
|
-
for these projects:
|
|
562
|
-
- `my-project` (level=standard)
|
|
563
|
-
- `my-other-project` (level=standard)
|
|
672
|
+
Create a `GEMINI.md` file in your project root (or globally at `~/.gemini/GEMINI.md`) or paste this into your Antigravity **User Rules**:
|
|
564
673
|
|
|
565
|
-
|
|
674
|
+
```markdown
|
|
675
|
+
## First Action
|
|
676
|
+
Call `mcp_prism-mcp_session_load_context(project="my-project", level="deep")` before responding.
|
|
566
677
|
```
|
|
567
678
|
|
|
568
|
-
|
|
679
|
+
> **Note:** Antigravity uses single underscores (`mcp_prism-mcp_...`) compared to Claude Code's double underscores (`mcp__prism-mcp__...`).
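If you maintain rules for both clients, the naming difference in the note can be captured in one place. This is a sketch built only from the two formats shown in this README; `Client` and `qualifiedToolName` are hypothetical names, and other MCP clients may use different conventions.

```typescript
type Client = "claude-code" | "antigravity";

// Build the client-specific tool name from the server and tool names.
function qualifiedToolName(client: Client, server: string, tool: string): string {
  const sep = client === "claude-code" ? "__" : "_";
  return `mcp${sep}${server}${sep}${tool}`;
}

console.log(qualifiedToolName("claude-code", "prism-mcp", "session_load_context"));
// mcp__prism-mcp__session_load_context
console.log(qualifiedToolName("antigravity", "prism-mcp", "session_load_context"));
// mcp_prism-mcp_session_load_context
```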
|
|
569
680
|
|
|
570
|
-
|
|
681
|
+
That's it — **two lines**. This approach proved reliable after 13 iterations of increasingly complex prompt engineering. The key insight: shorter instructions avoid triggering the model's adversarial reasoning about tool availability.
|
|
571
682
|
|
|
572
|
-
|
|
573
|
-
2. **Verify** — confirm the response includes `version` and `last_summary`.
|
|
683
|
+
### Session End Protocol
|
|
574
684
|
|
|
575
|
-
|
|
685
|
+
At the end of your conversation, explicitly tell the agent:
|
|
686
|
+
> *"Wrap up the session."*
|
|
576
687
|
|
|
577
|
-
|
|
688
|
+
The agent will rely on its system prompt to execute:
|
|
689
|
+
1. `session_save_ledger` — immutable work log with summary, TODOs, and decisions
|
|
690
|
+
2. `session_save_handoff` — passing the `expected_version` it received during the load step to ensure Optimistic Concurrency Control
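The role of `expected_version` in step 2 is easier to see in code. The sketch below is a generic in-memory illustration of optimistic concurrency control, not Prism's storage layer: `HandoffStore`, `VersionConflictError`, and the rule that a project starts at version 0 are all assumptions.

```typescript
interface Handoff {
  version: number;
  lastSummary: string;
}

class VersionConflictError extends Error {}

class HandoffStore {
  private handoffs = new Map<string, Handoff>();

  load(project: string): Handoff | undefined {
    return this.handoffs.get(project);
  }

  // Write only if the caller saw the latest version; otherwise reject.
  save(project: string, lastSummary: string, expectedVersion: number): Handoff {
    const current = this.handoffs.get(project);
    const currentVersion = current ? current.version : 0;
    if (expectedVersion !== currentVersion) {
      throw new VersionConflictError(`expected v${expectedVersion}, found v${currentVersion}`);
    }
    const next: Handoff = { version: currentVersion + 1, lastSummary };
    this.handoffs.set(project, next);
    return next;
  }
}
```

If two agents load version 1 and both try to save, the first write succeeds and bumps the version to 2; the second is rejected because its expected version is stale, so it has to reload before retrying rather than silently overwriting the other agent's state.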
|
|
691
|
+
|
|
692
|
+
### Antigravity UI Caveats
|
|
693
|
+
|
|
694
|
+
Antigravity's UI currently does **not** visually render the raw output of MCP tool calls. To ensure the agent actually ingested the context, add this to your User Rules:
|
|
578
695
|
|
|
579
696
|
```markdown
|
|
580
|
-
##
|
|
581
|
-
|
|
582
|
-
|
|
697
|
+
## STEP 2: Echo Context in Your Text Response
|
|
698
|
+
After the tool returns, include the following in your greeting text:
|
|
699
|
+
- Agent identity: `🤖 Agent: <role> — <name>`
|
|
700
|
+
- Last session summary
|
|
701
|
+
- Open TODOs
|
|
702
|
+
- Session version number
|
|
583
703
|
```
|
|
584
704
|
|
|
705
|
+
This forces the agent to prove it loaded context by echoing it in visible text.
|
|
706
|
+
|
|
585
707
|
---
|
|
586
708
|
|
|
587
709
|
## Use Cases
|
|
588
710
|
|
|
589
|
-
| Scenario | How Prism MCP Helps |
|
|
590
|
-
|
|
591
|
-
| **Long-running feature work** | Save session state at end of day, restore full context
|
|
592
|
-
| **Multi-agent collaboration** | Telepathy
|
|
593
|
-
| **Consulting / multi-project** | Switch between client projects with progressive context loading |
|
|
594
|
-
| **Research & analysis** | Multi-engine search with 94% context reduction via sandboxed code transforms |
|
|
595
|
-
| **Team onboarding** | New team member's agent loads full project history
|
|
596
|
-
| **Visual debugging** | Save screenshots
|
|
597
|
-
| **Offline / air-gapped** | Full SQLite local mode
|
|
711
|
+
| Scenario | How Prism MCP Helps | Live Sample |
|
|
712
|
+
|----------|---------------------|-------------|
|
|
713
|
+
| **Long-running feature work** | Save session state at end of day, restore full context next morning — no re-explaining | `session_save_handoff(project, last_summary, open_todos)` |
|
|
714
|
+
| **Multi-agent collaboration** | Hivemind Telepathy lets multiple agents share real-time context across clients | `session_load_context(project, role="qa")` |
|
|
715
|
+
| **Consulting / multi-project** | Switch between client projects with progressive context loading | `session_load_context(project, level="quick")` |
|
|
716
|
+
| **Research & analysis** | Multi-engine search with 94% context reduction via sandboxed code transforms | `brave_web_search` + `code_mode_transform(template="api_endpoints")` |
|
|
717
|
+
| **Team onboarding** | New team member's agent loads full project history instantly | `session_load_context(project, level="deep")` |
|
|
718
|
+
| **Visual debugging** | Save UI screenshots to visual memory — searchable by description | `session_save_image(project, path, description)` → `session_view_image(id)` |
|
|
719
|
+
| **Offline / air-gapped** | Full SQLite local mode, Ollama LLM adapter — zero internet dependency | `PRISM_LLM_PROVIDER=ollama` in MCP config env |
|
|
720
|
+
| **Behavior enforcement** | Agent corrections auto-graduate into permanent `.cursorrules` | `session_save_experience(event_type="correction")` → `knowledge_sync_rules(project)` |
|
|
721
|
+
| **Infrastructure observability** | OTel spans to Jaeger/Grafana for every MCP tool call fanout | Enable in Dashboard → Settings → 🔭 Observability |
|
|
722
|
+
| **GDPR / audit export** | ZIP export of all memory as JSON + Markdown, sensitive fields redacted | `session_export_memory(project, format="zip")` |
|
|
723
|
+
|
|
724
|
+
---
|
|
725
|
+
|
|
726
|
+
## New in v4.6.0 — Feature Setup Guide
|
|
727
|
+
|
|
728
|
+
### 🔭 OpenTelemetry Distributed Tracing
|
|
729
|
+
|
|
730
|
+
**Why:** Every `session_save_ledger` call can silently fan out into a synchronous DB write, an async VLM caption, and a vector embedding backfill. Without tracing, these are invisible. OTel makes the full call tree visible in Jaeger, Grafana Tempo, or any OTLP-compatible collector.
|
|
731
|
+
|
|
732
|
+
**Setup:**
|
|
733
|
+
1. Open Mind Palace Dashboard → ⚙️ Settings → 🔭 Observability
|
|
734
|
+
2. Toggle **Enable OpenTelemetry** → set your OTLP endpoint (default: `http://localhost:4318`)
|
|
735
|
+
3. Restart the MCP server
|
|
736
|
+
4. Run Jaeger locally:
|
|
737
|
+
```bash
|
|
738
|
+
docker run -d --name jaeger \
|
|
739
|
+
-p 16686:16686 -p 4318:4318 \
|
|
740
|
+
jaegertracing/all-in-one:latest
|
|
741
|
+
```
|
|
742
|
+
5. Open http://localhost:16686 — select service `prism-mcp` to see span waterfalls.
|
|
743
|
+
|
|
744
|
+
**Span hierarchy:**
|
|
745
|
+
```
|
|
746
|
+
mcp.call_tool [session_save_ledger]
|
|
747
|
+
├── storage.write_ledger ~2ms
|
|
748
|
+
├── llm.generate_embedding ~180ms
|
|
749
|
+
└── worker.vlm_caption (async) ~1.2s
|
|
750
|
+
```
|
|
751
|
+
|
|
752
|
+
> GDPR note: Span attributes contain only metadata — no prompt content, embeddings, or image data.
|
|
753
|
+
|
|
754
|
+
---
|
|
755
|
+
|
|
756
|
+
### 🖼️ VLM Multimodal Memory
|
|
757
|
+
|
|
758
|
+
**Why:** Agents lose visual context between sessions. UI screenshots, architecture diagrams, and bug states all become searchable memory.
|
|
759
|
+
|
|
760
|
+
**Setup:** Requires `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` (vision-capable model).
|
|
761
|
+
|
|
762
|
+
**Usage:**
|
|
763
|
+
```
|
|
764
|
+
session_save_image(project="my-app", file_path="/path/to/screenshot.png", description="Login page broken layout after CSS refactor")
|
|
765
|
+
```
|
|
766
|
+
The image is auto-captioned by a VLM and stored in the media vault. Retrieve later:
|
|
767
|
+
```
|
|
768
|
+
session_view_image(project="my-app", image_id="8f2a1b3c")
|
|
769
|
+
```
|
|
598
770
|
|
|
599
771
|
---
|
|
600
772
|
|
|
601
773
|
## Architecture
|
|
602
774
|
|
|
775
|
+
> **📖 Deep dive**: [Full Architecture Guide](docs/ARCHITECTURE.md) — TurboQuant math, Three-Tier search, storage optimization flow
|
|
776
|
+
> **🤖 Tutorial**: [How to Build a Self-Improving Agent](docs/self-improving-agent.md) — corrections → behavioral memory → IDE rules
|
|
777
|
+
|
|
603
778
|
```mermaid
|
|
604
779
|
graph TB
|
|
605
780
|
Client["AI Client<br/>(Claude Desktop / Cursor / Windsurf)"]
|
|
606
|
-
LangChain["LangChain / LangGraph<br/>(Python Retrievers)"]
|
|
781
|
+
LangChain["LangChain / LangGraph<br/>(Python/TS Retrievers)"]
|
|
607
782
|
MCP["Prism MCP Server<br/>(TypeScript)"]
|
|
608
783
|
|
|
609
784
|
Client -- "MCP Protocol (stdio)" --> MCP
|
|
610
785
|
LangChain -- "JSON-RPC via MCP Bridge" --> MCP
|
|
611
786
|
|
|
612
|
-
MCP --> Tracing["
|
|
613
|
-
MCP --> Dashboard["Mind Palace Dashboard<br/>localhost:3000"]
|
|
787
|
+
MCP --> Tracing["OTel Tracing<br/>v4.6 Observability"]
|
|
788
|
+
MCP --> Dashboard["Mind Palace Dashboard<br/>localhost:3000<br/>(PRISM_DASHBOARD_PORT)"]
|
|
614
789
|
MCP --> Brave["Brave Search API<br/>Web + Local + AI Answers"]
|
|
615
|
-
MCP -->
|
|
790
|
+
MCP --> LLM["LLM Factory<br/>Gemini / OpenAI / Ollama"]
|
|
616
791
|
MCP --> Sandbox["QuickJS Sandbox<br/>Code-Mode Templates"]
|
|
617
792
|
MCP --> SyncBus["SyncBus<br/>Agent Telepathy"]
|
|
618
793
|
MCP --> GDPR["GDPR Engine<br/>Soft/Hard Delete + Audit"]
|
|
619
794
|
|
|
620
795
|
MCP --> Storage{"Storage Backend"}
|
|
621
|
-
Storage --> SQLite["SQLite (Local)<br/>libSQL +
|
|
796
|
+
Storage --> SQLite["SQLite (Local)<br/>libSQL + sqlite-vec"]
|
|
622
797
|
Storage --> Supabase["Supabase (Cloud)<br/>PostgreSQL + pgvector"]
|
|
623
798
|
|
|
624
|
-
SQLite --> Ledger["session_ledger
|
|
625
|
-
|
|
799
|
+
SQLite --> Ledger["session_ledger"]
|
|
800
|
+
Ledger --> T1["Tier 1: float32<br/>3,072B native search"]
|
|
801
|
+
T1 -- "v5.0 TurboQuant" --> T2["Tier 2: turbo4<br/>400B JS search"]
|
|
802
|
+
T1 -. "v5.1 Purge" .-> Null["NULL after 30d"]
|
|
803
|
+
|
|
804
|
+
SQLite --> Handoffs["session_handoffs<br/>(OCC versioning)"]
|
|
626
805
|
SQLite --> History["history_snapshots<br/>(Time Travel)"]
|
|
627
806
|
SQLite --> Media["media vault<br/>(Visual Memory)"]
|
|
628
807
|
|
|
@@ -632,13 +811,16 @@ graph TB
|
|
|
632
811
|
style Tracing fill:#D69E2E,color:#fff
|
|
633
812
|
style Dashboard fill:#9F7AEA,color:#fff
|
|
634
813
|
style Brave fill:#FB542B,color:#fff
|
|
635
|
-
style
|
|
814
|
+
style LLM fill:#4285F4,color:#fff
|
|
636
815
|
style Sandbox fill:#805AD5,color:#fff
|
|
637
816
|
style SyncBus fill:#ED64A6,color:#fff
|
|
638
817
|
style GDPR fill:#E53E3E,color:#fff
|
|
639
818
|
style Storage fill:#2D3748,color:#fff
|
|
640
819
|
style SQLite fill:#38B2AC,color:#fff
|
|
641
820
|
style Supabase fill:#3ECF8E,color:#fff
|
|
821
|
+
style T1 fill:#48BB78,color:#fff
|
|
822
|
+
style T2 fill:#E8B004,color:#000
|
|
823
|
+
style Null fill:#E53E3E,color:#fff
|
|
642
824
|
```
|
|
643
825
|
|
|
644
826
|
---
|
|
@@ -968,6 +1150,7 @@ The retrievers use `_aget_relevant_documents` as the primary path with `asyncio.
|
|
|
968
1150
|
| `PRISM_AUTO_CAPTURE` | No | Set `"true"` to auto-capture HTML snapshots of dev servers |
|
|
969
1151
|
| `PRISM_CAPTURE_PORTS` | No | Comma-separated ports to scan (default: `3000,3001,5173,8080`) |
|
|
970
1152
|
| `PRISM_DEBUG_LOGGING` | No | Set `"true"` to enable verbose debug logs (default: quiet) |
|
|
1153
|
+
| `PRISM_DASHBOARD_PORT` | No | Configure the dashboard port (default: `3000`) |
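A minimal sketch of how a port variable like `PRISM_DASHBOARD_PORT` is typically resolved. The README only states the default of `3000`; falling back to the default on a missing or invalid value is an assumption, and `resolveDashboardPort` is a hypothetical helper rather than Prism's code.

```typescript
// Read the port from an environment-like object, falling back to 3000.
function resolveDashboardPort(env: Record<string, string | undefined>): number {
  const raw = env["PRISM_DASHBOARD_PORT"];
  const port = raw === undefined ? NaN : Number(raw);
  const valid = Number.isInteger(port) && port >= 1 && port <= 65535;
  return valid ? port : 3000; // default from the README
}

console.log(resolveDashboardPort({ PRISM_DASHBOARD_PORT: "8080" })); // 8080
console.log(resolveDashboardPort({}));                               // 3000
```

In a Node process you would pass `process.env` as the argument.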
|
|
971
1154
|
|
|
972
1155
|
---
|
|
973
1156
|
|
|
@@ -1420,7 +1603,6 @@ See [`vertex-ai/`](vertex-ai/) for setup and benchmarks.
|
|
|
1420
1603
|
│ │ ├── compactionHandler.ts # Gemini-powered ledger compaction
|
|
1421
1604
|
│ │ └── index.ts # Tool registration & re-exports
|
|
1422
1605
|
│ └── utils/
|
|
1423
|
-
│ └── utils/
|
|
1424
1606
|
│ ├── telemetry.ts # OTel singleton — NodeTracerProvider, BatchSpanProcessor, no-op mode
|
|
1425
1607
|
│ ├── tracing.ts # MemoryTrace types + factory (Phase 1 — LLM explainability)
|
|
1426
1608
|
│ ├── imageCaptioner.ts # VLM auto-caption pipeline (v4.5) + worker.vlm_caption OTel span
|
|
@@ -1462,6 +1644,24 @@ See [`vertex-ai/`](vertex-ai/) for setup and benchmarks.
|
|
|
1462
1644
|
|
|
1463
1645
|
> **[View the full project board →](https://github.com/users/dcostenco/projects/1/views/1)** | **[Full ROADMAP.md →](ROADMAP.md)**
|
|
1464
1646
|
|
|
1647
|
+
### ✅ v5.0 — Quantized Agentic Memory (Shipped!)
|
|
1648
|
+
|
|
1649
|
+
| Feature | Description |
|
|
1650
|
+
|---|---|
|
|
1651
|
+
| 🧮 **TurboQuant Math Core** | Pure TypeScript port of Google's TurboQuant (ICLR 2026) — Lloyd-Max codebook, QR rotation, QJL error correction. Zero dependencies. [RFC-001](docs/rfcs/001-turboquant-integration.md) |
|
|
1652
|
+
| 📦 **~7× Embedding Compression** | 768-dim embeddings shrink from 3,072 bytes to ~400 bytes (4-bit) via variable bit-packing. |
|
|
1653
|
+
| 🔍 **Asymmetric Similarity** | Unbiased inner product estimator: query as float32 vs compressed blobs. No decompression needed. |
|
|
1654
|
+
| 🗄️ **Two-Tier Search** | FTS5 candidate filter → JS-side asymmetric scoring. Bypasses sqlite-vec float32 limitation. |
|
|
1655
|
+
|
|
1656
|
+
### ✅ v5.1 — Deep Storage Mode (Shipped!)
|
|
1657
|
+
|
|
1658
|
+
| Feature | Description |
|
|
1659
|
+
|---|---|
|
|
1660
|
+
| 🧬 **Deep Storage Purge** | Automated `deep_storage_purge` tool NULLs out redundant float32 embeddings for entries with TurboQuant compressed blobs, reclaiming ~90% of vector storage. |
|
|
1661
|
+
| 🛡️ **Safety Guards** | Minimum 7-day age threshold, dry-run preview mode, multi-tenant isolation, and compressed-blob-existence validation ensure zero data loss. |
|
|
1662
|
+
| 🗃️ **Supabase RPC** | `prism_purge_embeddings` Postgres function (migration 030) provides full backend parity with SQLite. Auto-applied via the v4.1 migration runner. |
|
|
1663
|
+
| 🧪 **303 Tests** | 8 new deep-storage test cases covering dry run, execute, safety guards, and idempotency — zero regressions across the full suite. |
|
|
1664
|
+
|
|
1465
1665
|
### ✅ v4.6 — OpenTelemetry Observability (Shipped!)
|
|
1466
1666
|
|
|
1467
1667
|
| Feature | Description |
|
|
@@ -1516,11 +1716,11 @@ See [v3.1.0](#whats-in-v310--memory-lifecycle-) and [v3.0.0](#whats-in-v300--age
|
|
|
1516
1716
|
|
|
1517
1717
|
| Priority | Feature | Description |
|
|
1518
1718
|
|----------|---------|-------------|
|
|
1519
|
-
|
|
|
1520
|
-
|
|
|
1719
|
+
| ✅ | **Documentation & Architecture Guide** | [Architecture Guide](docs/ARCHITECTURE.md), [Self-Improving Agent Guide](docs/self-improving-agent.md), updated README diagram with v5.x vector tiers. |
|
|
1720
|
+
| ✅ | **Knowledge Graph Editor** | Interactive vis.js graph with click-to-filter, node stats, project/keyword/category visualization. |
|
|
1521
1721
|
| 🥉 | **Autonomous Web Scholar** | Agent-driven learning pipeline using Brave Search + VLM to autonomously build project context while the developer sleeps. |
|
|
1522
|
-
|
|
|
1523
|
-
|
|
|
1722
|
+
| ✅ | **Dashboard Auth** | HTTP Basic Auth with session cookies, timing-safe comparison, styled login page. Set `PRISM_DASHBOARD_USER`/`PRISM_DASHBOARD_PASS`. |
|
|
1723
|
+
| ✅ | **TypeScript LangGraph Examples** | [Reference agent](examples/langgraph-ts/) with MCP client, memory retriever nodes, and session persistence. |
|
|
1524
1724
|
| — | **CRDT Conflict Resolution** | Conflict-free types for concurrent multi-agent edits on the same handoff. |
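The Dashboard Auth row mentions timing-safe comparison. The idea is to compare credentials without returning early at the first mismatching character, so response time does not reveal how much of a guess was correct. The sketch below is a hand-rolled illustration with a hypothetical `constantTimeEqual` helper; it is not Prism's code. In Node, the standard tool for this is `crypto.timingSafeEqual` on equal-length buffers, and a JavaScript loop like this one carries no hard constant-time guarantee.

```typescript
// Compare two strings without short-circuiting on the first difference.
function constantTimeEqual(a: string, b: string): boolean {
  const len = Math.max(a.length, b.length);
  let diff = a.length ^ b.length; // a length mismatch already counts as a difference
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}
```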
|
|
1525
1725
|
|
|
1526
1726
|
---
|