prism-mcp-server 4.6.1 → 5.1.0

package/README.md CHANGED
@@ -8,12 +8,14 @@
8
8
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-3178C6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
9
9
  [![Node.js](https://img.shields.io/badge/Node.js-18+-339933?logo=node.js&logoColor=white)](https://nodejs.org/)
10
10
 
11
- > **Your AI agent's memory that survives between sessions.** Prism MCP is a Model Context Protocol server that gives Claude Desktop, Cursor, Windsurf, and any MCP client **persistent memory**, **time travel**, **visual context**, **multi-agent sync**, **GDPR-compliant deletion**, **memory tracing**, and **LangChain integration** — all running locally with zero cloud dependencies.
11
+ > **Your AI agent's memory that survives between sessions.** Prism MCP is a Model Context Protocol server that gives Claude Desktop, Cursor, Windsurf, and any MCP client **persistent memory**, **time travel**, **visual context**, **multi-agent sync**, **GDPR-compliant deletion**, **memory tracing**, **quantized vector compression**, and **LangChain integration** — all running locally with zero cloud dependencies.
12
12
  >
13
- > Built with **SQLite + F32_BLOB vector search**, **optimistic concurrency control**, **MCP Prompts & Resources**, **auto-compaction**, **Gemini-powered Morning Briefings**, **MemoryTrace explainability**, and optional **Supabase cloud sync**.
13
+ > Built with **SQLite + F32_BLOB vector search**, **TurboQuant 10× embedding compression**, **optimistic concurrency control**, **MCP Prompts & Resources**, **auto-compaction**, **Gemini-powered Morning Briefings**, **MemoryTrace explainability**, and optional **Supabase cloud sync**.
14
14
 
15
15
  ## Table of Contents
16
16
 
17
+ - [What's New (v5.1.0)](#whats-new-in-v510--deep-storage--knowledge-graph-)
18
+ - [What's New (v5.0.0)](#whats-new-in-v500--quantized-agentic-memory-)
17
19
  - [What's New (v4.6.0)](#whats-new-in-v460--opentelemetry-observability-)
18
20
  - [Multi-Instance Support](#multi-instance-support)
19
21
  - [How Prism Compares](#how-prism-compares)
@@ -23,7 +25,7 @@
23
25
  - [Claude Code Integration (Hooks)](#claude-code-integration-hooks)
24
26
  - [Gemini / Antigravity Integration](#gemini--antigravity-integration)
25
27
  - [Use Cases](#use-cases)
26
- - [Architecture](#architecture)
28
+ - [Architecture](#architecture) | [Full Architecture Guide](docs/ARCHITECTURE.md) | [Self-Improving Agent Guide](docs/self-improving-agent.md)
27
29
  - [Tool Reference](#tool-reference)
28
30
  - [Agent Hivemind — Role Usage](#agent-hivemind--role-usage)
29
31
  - [LangChain / LangGraph Integration](#langchain--langgraph-integration)
@@ -42,12 +44,98 @@
42
44
 
43
45
  ---
44
46
 
45
- ## What's New in v4.6.0 — OpenTelemetry Observability 🔭
47
+ ## What's New in v5.1.0 — Deep Storage & Knowledge Graph 🗂️
48
+
49
+ > **🗂️ Reclaim 90% of your vector storage and visually edit your agent's knowledge graph.**
50
+ > [CHANGELOG](CHANGELOG.md)
51
+
52
+ | Feature | Description |
53
+ |---|---|
54
+ | 🗑️ **Deep Storage Mode** | New `deep_storage_purge` tool NULLs out redundant float32 embeddings for entries with TurboQuant compressed blobs, reclaiming ~90% of vector storage. Safety guards: 7-day minimum age, dry-run preview, multi-tenant isolation. |
55
+ | 🕸️ **Knowledge Graph Editor** | The Mind Palace Neural Graph is now fully interactive — click nodes to rename or delete keywords, filter by project/date/importance, and surgically groom your agent's semantic memory. |
56
+ | 🔧 **Auto-Load Reliability** | Hardened hook-based integration patterns for Claude Code and Gemini/Antigravity to guarantee context loading on the absolute first turn without reasoning hallucinations. |
57
+ | 🧪 **303 Tests** | 8 new deep-storage test cases covering dry run, execute, safety guards, and idempotency — zero regressions across 13 suites. |
58
+
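A minimal sketch of the guards listed for `deep_storage_purge`, using Python's built-in `sqlite3`. The `session_ledger` table and the `embedding` / `embedding_compressed` columns appear in this README; the `created_at` column, the function name, and its parameters are assumptions made for illustration, not Prism's actual implementation.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

MIN_AGE_DAYS = 7  # the README states a 7-day minimum age


def purge_float32(conn, project, older_than_days=MIN_AGE_DAYS, dry_run=True):
    """Count (and optionally NULL) float32 embeddings that are safe to drop.

    A row is eligible only if it belongs to `project` (tenant isolation),
    is older than the cutoff, and already has a compressed blob.
    """
    if older_than_days < MIN_AGE_DAYS:
        raise ValueError(f"older_than_days must be >= {MIN_AGE_DAYS}")
    cutoff = (datetime.now(timezone.utc) - timedelta(days=older_than_days)).isoformat()
    where = (
        "project = ? AND created_at < ? "
        "AND embedding IS NOT NULL AND embedding_compressed IS NOT NULL"
    )
    params = (project, cutoff)
    eligible = conn.execute(
        f"SELECT COUNT(*) FROM session_ledger WHERE {where}", params
    ).fetchone()[0]
    if not dry_run:
        # Only the redundant float32 column is cleared; the blob stays.
        conn.execute(f"UPDATE session_ledger SET embedding = NULL WHERE {where}", params)
        conn.commit()
    return eligible
```

Calling it with `dry_run=True` (the default here) only reports how many rows would be affected. A second non-dry run finds nothing left to purge, which is the idempotency property the new tests are described as covering.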
59
+ ---
60
+
61
+ ## What's New in v5.0.0 — Quantized Agentic Memory 🧬
62
+
63
+ > **🧬 Up to 10× embedding compression is here.** Powered by Google's TurboQuant (ICLR 2026), Prism now compresses 768-dim embeddings from **3,072 bytes → ~400 bytes** at 4-bit (~7.7:1, or ~10:1 at 3-bit), enough to keep decades of session history on a standard laptop.
64
+ > [RFC-001: Quantized Agentic Memory](docs/rfcs/001-turboquant-integration.md) | [CHANGELOG](CHANGELOG.md)
65
+
66
+ ### Performance Benchmarks
67
+
68
+ | Metric | Before v5.0 | After v5.0 |
69
+ |--------|------------|------------|
70
+ | **Storage per embedding** | 3,072 bytes (float32) | ~400 bytes (turbo4) |
71
+ | **Compression ratio** | 1:1 | **~7.7:1** (4-bit) / **~10.1:1** (3-bit) |
72
+ | **Similarity correlation** | Baseline | >0.85 (4-bit) |
73
+ | **Top-1 retrieval accuracy** | Baseline | >90% (N=100) |
74
+ | **Entries per GB** | ~330K | **~2.5M** |
75
+ | **Search without vector DB** | ❌ Empty | ✅ Tier-2 JS fallback |
76
+
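The storage figures in the table can be checked with a little arithmetic. The sketch below assumes a fixed 16-byte per-blob overhead (`HEADER_BYTES`); that value is a guess chosen because it reproduces both quoted ratios, not a documented constant, and the entries-per-GB figure counts vector bytes only, in decimal gigabytes.

```python
# Back-of-envelope storage math for a bit-packed 768-dim embedding.
DIMS = 768
FLOAT32_BYTES = DIMS * 4   # 3,072 bytes uncompressed
HEADER_BYTES = 16          # ASSUMPTION: radius, format tag, padding


def compressed_bytes(bits_per_dim: int) -> int:
    """Bytes to bit-pack DIMS values at the given width, plus the header."""
    packed = (DIMS * bits_per_dim + 7) // 8  # round up to whole bytes
    return packed + HEADER_BYTES


def ratio(bits_per_dim: int) -> float:
    """Compression ratio relative to float32 storage."""
    return FLOAT32_BYTES / compressed_bytes(bits_per_dim)


def entries_per_gb(bytes_per_entry: int) -> int:
    """How many vectors fit in one decimal gigabyte (vectors only)."""
    return 10**9 // bytes_per_entry


for bits in (4, 3):
    print(f"{bits}-bit: {compressed_bytes(bits)} B, {ratio(bits):.1f}:1")
print(entries_per_gb(FLOAT32_BYTES), entries_per_gb(compressed_bytes(4)))
```

Under that assumption the 4-bit format comes to 400 bytes (about 7.7:1) and the 3-bit format to 304 bytes (about 10.1:1), so the "10×" headline corresponds to the 3-bit setting rather than the 400-byte `turbo4` blobs.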
77
+ ### Three-Tier Memory Architecture
78
+
79
+ ```
80
+ ┌─────────────────────────────────────────────────────────────┐
81
+ │ PRISM v5.0 MEMORY │
82
+ ├─────────┬───────────────┬───────────────────────────────────┤
83
+ │ TIER │ STORAGE │ SEARCH METHOD │
84
+ ├─────────┼───────────────┼───────────────────────────────────┤
85
+ │ Tier 0 │ FTS5 keywords │ Full-text search (knowledge_search) │
86
+ │ Tier 1 │ float32 3072B │ sqlite-vec cosine (native) │
87
+ │ Tier 2 │ turbo4 400B │ JS asymmetricCosineSimilarity │
88
+ └─────────┴───────────────┴───────────────────────────────────┘
89
+
90
+ searchMemory() flow:
91
+ → Tier 1 (sqlite-vec) ── success → return results
92
+ ── fail → Tier 2 (TurboQuant JS)
93
+ ── success → return results
94
+ ── fail → return []
95
+ ```
96
+
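The `searchMemory()` flow above is a plain ordered fallback. A minimal sketch of that control flow follows, in Python for brevity; the function names are hypothetical stand-ins, and treating "fail" as "the tier raised an exception" is an assumption, since the README does not say how failure is detected.

```python
from typing import Callable, Dict, List, Sequence

Result = Dict[str, object]
SearchFn = Callable[[str], Sequence[Result]]


def search_memory(query: str, tiers: Sequence[SearchFn]) -> List[Result]:
    """Try each search tier in order and return the first one that succeeds."""
    for tier in tiers:
        try:
            return list(tier(query))
        except Exception:
            continue  # this tier is unavailable, fall through to the next
    return []  # every tier failed


def native_vector_search(query: str) -> Sequence[Result]:
    # Stand-in for Tier 1: pretend the native vector extension is missing.
    raise RuntimeError("native vector search unavailable")


def quantized_search(query: str) -> Sequence[Result]:
    # Stand-in for Tier 2: pretend the compressed blobs were ranked.
    return [{"id": 1, "summary": "Built auth flow"}]


print(search_memory("auth flow", [native_vector_search, quantized_search]))
```

Passing the tiers as a list keeps the ordering explicit and makes it easy to exercise the "no backend available" path, which returns an empty list instead of raising.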
97
+ ### Live Usage: How TurboQuant Works in Practice
98
+
99
+ **Every `session_save_ledger` call now generates both tiers automatically:**
100
+
101
+ ```typescript
102
+ // What happens behind the scenes when you save a session:
103
+ await saveLedger({ project: "my-app", summary: "Built auth flow" });
104
+
105
+ // 1. Gemini generates float32 embedding (3,072 bytes)
106
+ // 2. TurboQuant compresses to turbo4 blob (~400 bytes)
107
+ // 3. Single atomic patchLedger writes BOTH to the database
108
+ // → embedding: "[0.0234, -0.0156, ...]" (float32)
109
+ // → embedding_compressed: "base64..." (turbo4)
110
+ // → embedding_format: "turbo4"
111
+ // → embedding_turbo_radius: 12.847
112
+
113
+ // Searching works seamlessly across both tiers:
114
+ await searchMemory({ query: "auth flow" });
115
+ // → Tier 1 tries native vector search
116
+ // → If unavailable, Tier 2 deserializes compressed blobs
117
+ // and ranks using asymmetric cosine similarity in JS
118
+ ```
119
+
120
+ **Backfill existing entries with one command:**
121
+ ```
122
+ > Use tool: session_backfill_embeddings
123
+ > Now repairs AND compresses in a single atomic update
124
+ ```
125
+
126
+ > **💡 Ollama TurboQuant Tip:** If using Ollama for self-hosted inference, set `OLLAMA_KV_CACHE_TYPE=turbo3` for 10× smaller KV caches during generation — the same algorithm powering Prism's memory compression.
127
+
128
+ ---
129
+
130
+ <details>
131
+ <summary><strong>What's in v4.6.0 — OpenTelemetry Observability 🔭</strong></summary>
46
132
 
47
133
  > **🔭 Full distributed tracing for every MCP tool call, LLM provider hop, and background AI worker.**
48
134
  > Configure in the new **🔭 Observability** tab in Mind Palace — no code changes required.
49
135
  > Activates a 4-tier span waterfall: `mcp.call_tool` → `worker.vlm_caption` → `llm.generate_image_description` / `llm.generate_embedding`.
50
136
 
137
+ </details>
138
+
51
139
  <a name="whats-new-in-v451--gdpr-export-"></a>
52
140
  <details>
53
141
  <summary><strong>What's in v4.5.1 — GDPR Export & Test Hardening 🔒</strong></summary>
@@ -234,7 +322,7 @@
234
322
  | Feature | Description |
235
323
  |---|---|
236
324
  | 🏠 **Local-First SQLite** | Run Prism entirely locally with zero cloud dependencies. Full vector search (libSQL F32_BLOB) and FTS5 included. |
237
- | 🔮 **Mind Palace UI** | A beautiful glassmorphism dashboard at `localhost:3000` to inspect your agent's memory, visual vault, and Git drift. |
325
+ | 🔮 **Mind Palace UI** | A beautiful glassmorphism dashboard at `localhost:3000` (configurable via `PRISM_DASHBOARD_PORT`) to inspect your agent's memory, visual vault, and Git drift. |
238
326
  | 🕰️ **Time Travel** | `memory_history` and `memory_checkout` act like `git revert` for your agent's brain — full version history with OCC. |
239
327
  | 🖼️ **Visual Memory** | Agents can save screenshots to a local media vault. Auto-capture mode snapshots your local dev server on every handoff save. |
240
328
  | 📡 **Agent Telepathy** | Multi-client sync: if your agent in Cursor saves state, Claude Desktop gets a live notification instantly. |
@@ -277,6 +365,8 @@
277
365
  | **VLM Image Captions** | ✅ Auto-caption vault (v4.5) | ❌ | ❌ | ❌ | ❌ |
278
366
  | **Pluggable LLM Adapters** | ✅ OpenAI/Anthropic/Gemini/Ollama | ❌ | ✅ Multi-provider | ❌ | ❌ |
279
367
  | **LangChain** | ✅ BaseRetriever | ❌ | ❌ | ❌ | ❌ |
368
+ | **Vector Compression** | ✅ TurboQuant 10× (v5.0) | ❌ | ❌ | ❌ | ❌ |
369
+ | **Three-Tier Search** | ✅ FTS + Vec + Quantized | ❌ | ❌ | ❌ | ❌ |
280
370
  | **MCP Native** | ✅ stdio | ✅ stdio | ❌ Python SDK | ✅ HTTP + MCP | ✅ stdio |
281
371
  | **Language** | TypeScript | TypeScript | Python | Python | Python |
282
372
 
@@ -468,11 +558,36 @@ Add to your Continue `config.json` or Cline MCP settings:
468
558
 
469
559
  ## Claude Code Integration (Hooks)
470
560
 
471
- Claude Code supports **lifecycle hooks** in `~/.claude/settings.json` that fire automatically at session start and end. Use these to auto-hydrate and persist Prism memory without manual prompting.
561
+ Claude Code supports custom hooks (`SessionStart`, `Stop`) that can force the agent to load and save Prism context automatically. Because Claude Code requires explicit permission for MCP tools, you must also list the Prism tools under `permissions.allow` in the same settings file.
562
+
563
+ ### 1. The Auto-Load Hook Script
564
+
565
+ Create a Python script (e.g., `~/.claude/mcp_autoload_hook.py`). This script outputs JSON that Claude Code reads during the `SessionStart` event.
472
566
 
473
- ### SessionStart Hook
567
+ ```python
568
+ #!/usr/bin/env python3
569
+ import json
570
+ import sys
571
+
572
+ def main():
573
+     # Inject a system message forcing the agent to load memory BEFORE speaking
574
+     print(json.dumps({
575
+         "continue": True,
576
+         "suppressOutput": True,
577
+         "systemMessage": (
578
+             "## First Action\n"
579
+             "Call `mcp__prism-mcp__session_load_context(project='my-project', level='deep')` "
580
+             "before responding to the user. Do not generate any text before calling this tool."
581
+         )
582
+     }))
583
+
584
+ if __name__ == "__main__":
585
+     main()
586
+ ```
587
+
588
+ ### 2. Configure `settings.json`
474
589
 
475
- Automatically loads context when a new session begins:
590
+ Map the hooks in your `~/.claude/settings.json`:
476
591
 
477
592
  ```json
478
593
  {
@@ -483,47 +598,45 @@ Automatically loads context when a new session begins:
483
598
  "hooks": [
484
599
  {
485
600
  "type": "command",
486
- "command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput': False, 'systemMessage': 'You MUST call mcp__prism-mcp__session_load_context twice before responding to the user: first with project=my-project level=standard, then with project=my-other-project level=standard. Do not skip this.'}))\"",
601
+ "command": "python3 /Users/you/.claude/mcp_autoload_hook.py",
487
602
  "timeout": 10
488
603
  }
489
604
  ]
490
605
  }
491
- ]
492
- }
493
- }
494
- ```
495
-
496
- ### Stop Hook
497
-
498
- Automatically saves session memory when a session ends:
499
-
500
- ```json
501
- {
502
- "hooks": {
606
+ ],
503
607
  "Stop": [
504
608
  {
505
609
  "matcher": "*",
506
610
  "hooks": [
507
611
  {
508
612
  "type": "command",
509
- "command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput': False, 'systemMessage': 'MANDATORY END WORKFLOW: 1) Call mcp__prism-mcp__session_save_ledger with project and summary. 2) Call mcp__prism-mcp__session_save_handoff with expected_version set to the loaded version.'}))\"",
613
+ "command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput': True, 'systemMessage': 'MANDATORY END WORKFLOW: 1) Call mcp__prism-mcp__session_save_ledger with project and summary. 2) Call mcp__prism-mcp__session_save_handoff with expected_version set to the loaded version.'}))\"",
510
614
  "timeout": 10
511
615
  }
512
616
  ]
513
617
  }
514
618
  ]
619
+ },
620
+ "permissions": {
621
+ "allow": [
622
+ "mcp__prism-mcp__session_load_context",
623
+ "mcp__prism-mcp__session_save_ledger",
624
+ "mcp__prism-mcp__session_save_handoff",
625
+ "mcp__prism-mcp__knowledge_search",
626
+ "mcp__prism-mcp__session_search_memory"
627
+ ]
515
628
  }
516
629
  }
517
630
  ```
518
631
 
519
632
  ### How the Hooks Work
520
633
 
521
- The hook `command` runs a Python one-liner that returns a JSON object to Claude Code:
634
+ The hook `command` runs a Python script that returns a JSON object to Claude Code:
522
635
 
523
636
  | Field | Purpose |
524
637
  |---|---|
525
638
  | `continue: true` | Tell Claude Code to proceed (don't abort the session) |
526
- | `suppressOutput: false` | Show the hook result to the agent |
639
+ | `suppressOutput: true` | Silently inject the system message (recommended for Stop hooks) |
527
640
  | `systemMessage` | Instruction injected as a system message — the agent follows it |
528
641
 
529
642
  The agent receives the `systemMessage` as an instruction and executes the tool calls. The server resolves the agent's **role** and **name** automatically from the dashboard — no need to specify them in the hook.
@@ -542,49 +655,55 @@ explicit tool argument → dashboard setting → "global" (default)
542
655
 
543
656
  Change your role once in the dashboard, and it automatically applies to every session — CLI, extension, and all MCP clients.
544
657
 
545
- ### Verification
546
-
547
- If hydration ran successfully, the agent's output will include:
548
- - A `[👤 AGENT IDENTITY]` block showing your dashboard-configured role and name
549
- - `PRISM_CONTEXT_LOADED` marker text
658
+ ### Troubleshooting Claude Code
550
659
 
551
- If the marker is missing, the hook did not fire or the MCP server is not connected.
660
+ - **Hook not firing?** Check the hook `timeout` in Claude Code settings. If your Python script takes too long, Claude ignores it silently.
661
+ - **"Tool not available" hallucination?** If Claude claims it doesn't have the tool, the cause is usually the model talking itself out of the call in its own reasoning (an adversarial Chain-of-Thought loop), not a missing server. Ensure every entry in the `permissions.allow` array exactly matches the double-underscore format (`mcp__prism-mcp__...`).
662
+ - **Missing `PRISM_CONTEXT_LOADED`?** The hook didn't fire or the MCP server isn't connected. Verify `prism-mcp` is listed in your `mcpServers` config.
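Tool-name typos are easy to miss by eye, so a small checker can help. The sketch below is a hypothetical helper, not part of Prism or Claude Code: it flags `mcp`-prefixed entries in `permissions.allow` that do not follow the `mcp__<server>__<tool>` shape, and converts a valid name to the single-underscore form that the Gemini / Antigravity section of this README describes. The regular expression is an assumption about what valid server and tool names look like.

```python
import re

# ASSUMPTION: server names use letters, digits and hyphens;
# tool names use lowercase letters, digits and underscores.
CLAUDE_TOOL = re.compile(r"^mcp__(?P<server>[A-Za-z0-9-]+)__(?P<tool>[a-z0-9_]+)$")


def malformed_permissions(settings: dict) -> list:
    """Return MCP-looking entries in permissions.allow that are misformatted."""
    allow = settings.get("permissions", {}).get("allow", [])
    return [
        name for name in allow
        if name.startswith("mcp") and not CLAUDE_TOOL.match(name)
    ]


def to_antigravity(name: str) -> str:
    """Convert mcp__server__tool to the single-underscore mcp_server_tool form."""
    match = CLAUDE_TOOL.match(name)
    if match is None:
        raise ValueError(f"not a double-underscore MCP tool name: {name!r}")
    return f"mcp_{match['server']}_{match['tool']}"


settings = {
    "permissions": {
        "allow": [
            "mcp__prism-mcp__session_load_context",
            "mcp_prism-mcp_session_save_ledger",  # single underscores: wrong here
        ]
    }
}
print(malformed_permissions(settings))
print(to_antigravity("mcp__prism-mcp__session_load_context"))
```

In practice you would load `~/.claude/settings.json` with `json.load` and pass the resulting dictionary to `malformed_permissions`.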
552
663
 
553
664
  ---
554
665
 
555
666
  ## Gemini / Antigravity Integration
556
667
 
557
- Gemini-based clients (like Antigravity) use `GEMINI.md` global rules or user rules for startup behavior. The server resolves the role from the dashboard automatically.
668
+ Antigravity and Gemini-based agents require a radically simplified approach to auto-loading. If you give modern instruction-tuned models a long list of "Banned Behaviors" (e.g., "Do NOT say hello first"), their internal reasoning often over-indexes on the constraints and causes them to hallucinate that the tool doesn't exist.
558
669
 
559
- ### Global Rules (`~/.gemini/GEMINI.md`)
670
+ ### The 2-Line "First Action" Rule
560
671
 
561
- ```markdown
562
- ## Prism MCP Memory Auto-Load (CRITICAL)
563
- At the start of every new session, call `mcp__prism-mcp__session_load_context`
564
- for these projects:
565
- - `my-project` (level=standard)
566
- - `my-other-project` (level=standard)
672
+ Create a `GEMINI.md` file in your project root (or globally at `~/.gemini/GEMINI.md`), or paste the following into your Antigravity **User Rules**:
567
673
 
568
- After both succeed, print PRISM_CONTEXT_LOADED.
674
+ ```markdown
675
+ ## First Action
676
+ Call `mcp_prism-mcp_session_load_context(project="my-project", level="deep")` before responding.
569
677
  ```
570
678
 
571
- ### User Rules (Antigravity Settings)
679
+ > **Note:** Antigravity uses single underscores (`mcp_prism-mcp_...`) compared to Claude Code's double underscores (`mcp__prism-mcp__...`).
680
+
681
+ That's it — **two lines**. This approach proved reliable after 13 iterations of increasingly complex prompt engineering. The key insight: shorter instructions avoid triggering the model's adversarial reasoning about tool availability.
572
682
 
573
- If your Gemini client supports user rules, add the same instructions there. The key points:
683
+ ### Session End Protocol
574
684
 
575
- 1. **Call `session_load_context` as a tool** not `read_resource`. Only the tool returns the `[👤 AGENT IDENTITY]` block.
576
- 2. **Verify** confirm the response includes `version` and `last_summary`.
685
+ At the end of your conversation, explicitly tell the agent:
686
+ > *"Wrap up the session."*
577
687
 
578
- ### Session End
688
+ The agent will rely on its system prompt to execute:
689
+ 1. `session_save_ledger` — immutable work log with summary, TODOs, and decisions
690
+ 2. `session_save_handoff` — passing the `expected_version` it received during the load step to ensure Optimistic Concurrency Control
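To illustrate why `expected_version` matters, here is a minimal in-memory sketch of optimistic concurrency control. The `HandoffStore` class, its method names, and the choice to start versions at 0 are assumptions for illustration; Prism's real storage layer is not shown in this README.

```python
class VersionConflict(Exception):
    """Raised when a save is based on a stale version."""


class HandoffStore:
    """Toy handoff store: each project holds (version, payload)."""

    def __init__(self):
        self._state = {}

    def load(self, project):
        # Unknown projects start at version 0 with no payload.
        return self._state.get(project, (0, None))

    def save(self, project, payload, expected_version):
        current, _ = self.load(project)
        if expected_version != current:
            # Someone else saved since this caller loaded: refuse to overwrite.
            raise VersionConflict(
                f"expected v{expected_version}, but {project!r} is at v{current}"
            )
        self._state[project] = (current + 1, payload)
        return current + 1


store = HandoffStore()
version, _ = store.load("my-project")
print(store.save("my-project", "Built auth flow", expected_version=version))
```

A second agent that loaded the same version and saves later gets a `VersionConflict` instead of silently overwriting the first agent's handoff; it has to reload and retry with the new version.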
579
691
 
580
- At the end of each session, save state:
692
+ ### Antigravity UI Caveats
693
+
694
+ Antigravity's UI currently does **not** visually render the raw output of MCP tool calls. To ensure the agent actually ingested the context, add this to your User Rules:
581
695
 
582
696
  ```markdown
583
- ## Session End Protocol
584
- 1) Call `mcp__prism-mcp__session_save_ledger` with project and summary.
585
- 2) Call `mcp__prism-mcp__session_save_handoff` with expected_version from the loaded version.
697
+ ## STEP 2: Echo Context in Your Text Response
698
+ After the tool returns, include the following in your greeting text:
699
+ - Agent identity: `🤖 Agent: <role> <name>`
700
+ - Last session summary
701
+ - Open TODOs
702
+ - Session version number
586
703
  ```
587
704
 
705
+ This forces the agent to prove it loaded context by echoing it in visible text.
706
+
588
707
  ---
589
708
 
590
709
  ## Use Cases
@@ -651,70 +770,38 @@ session_view_image(project="my-app", image_id="8f2a1b3c")
651
770
 
652
771
  ---
653
772
 
654
- ### 🔌 Pluggable LLM Adapters
655
-
656
- **Why:** Run fully local/air-gapped with Ollama, or switch providers without changing tool logic.
657
-
658
- **Setup:** Set in MCP config `env`:
659
-
660
- ```json
661
- {
662
- "env": {
663
- "PRISM_LLM_PROVIDER": "ollama",
664
- "PRISM_LLM_MODEL": "llama3.2",
665
- "PRISM_LLM_BASE_URL": "http://localhost:11434"
666
- }
667
- }
668
- ```
669
-
670
- | Provider | Env Var | Notes |
671
- |----------|---------|-------|
672
- | `gemini` (default) | `GOOGLE_API_KEY` | Best for Morning Briefings |
673
- | `openai` | `OPENAI_API_KEY` | GPT-4o supports VLM |
674
- | `anthropic` | `ANTHROPIC_API_KEY` | Claude 3.5 supports VLM |
675
- | `ollama` | none | Full local/air-gapped mode |
676
-
677
- ---
678
-
679
- ### 📦 GDPR Memory Export
680
-
681
- ```
682
- session_export_memory(project="my-app", format="zip")
683
- ```
684
-
685
- Outputs a ZIP containing:
686
- - `ledger.json` — all session entries
687
- - `handoffs.json` — all project state snapshots
688
- - `knowledge.md` — graduated insights in Markdown
689
- - Sensitive fields (API keys, tokens) automatically redacted
690
-
691
- ---
692
-
693
773
  ## Architecture
694
774
 
775
+ > **📖 Deep dive**: [Full Architecture Guide](docs/ARCHITECTURE.md) — TurboQuant math, Three-Tier search, storage optimization flow
776
+ > **🤖 Tutorial**: [How to Build a Self-Improving Agent](docs/self-improving-agent.md) — corrections → behavioral memory → IDE rules
777
+
695
778
  ```mermaid
696
779
  graph TB
697
780
  Client["AI Client<br/>(Claude Desktop / Cursor / Windsurf)"]
698
- LangChain["LangChain / LangGraph<br/>(Python Retrievers)"]
781
+ LangChain["LangChain / LangGraph<br/>(Python/TS Retrievers)"]
699
782
  MCP["Prism MCP Server<br/>(TypeScript)"]
700
783
 
701
784
  Client -- "MCP Protocol (stdio)" --> MCP
702
785
  LangChain -- "JSON-RPC via MCP Bridge" --> MCP
703
786
 
704
- MCP --> Tracing["MemoryTrace Engine<br/>Latency + Strategy + Scoring"]
705
- MCP --> Dashboard["Mind Palace Dashboard<br/>localhost:3000"]
787
+ MCP --> Tracing["OTel Tracing<br/>v4.6 Observability"]
788
+ MCP --> Dashboard["Mind Palace Dashboard<br/>localhost:3000<br/>(PRISM_DASHBOARD_PORT)"]
706
789
  MCP --> Brave["Brave Search API<br/>Web + Local + AI Answers"]
707
- MCP --> Gemini["Google Gemini API<br/>Analysis + Briefings"]
790
+ MCP --> LLM["LLM Factory<br/>Gemini / OpenAI / Ollama"]
708
791
  MCP --> Sandbox["QuickJS Sandbox<br/>Code-Mode Templates"]
709
792
  MCP --> SyncBus["SyncBus<br/>Agent Telepathy"]
710
793
  MCP --> GDPR["GDPR Engine<br/>Soft/Hard Delete + Audit"]
711
794
 
712
795
  MCP --> Storage{"Storage Backend"}
713
- Storage --> SQLite["SQLite (Local)<br/>libSQL + F32_BLOB vectors"]
796
+ Storage --> SQLite["SQLite (Local)<br/>libSQL + sqlite-vec"]
714
797
  Storage --> Supabase["Supabase (Cloud)<br/>PostgreSQL + pgvector"]
715
798
 
716
- SQLite --> Ledger["session_ledger<br/>(+ deleted_at tombstoning)"]
717
- SQLite --> Handoffs["session_handoffs"]
799
+ SQLite --> Ledger["session_ledger"]
800
+ Ledger --> T1["Tier 1: float32<br/>3,072B native search"]
801
+ T1 -- "v5.0 TurboQuant" --> T2["Tier 2: turbo4<br/>400B JS search"]
802
+ T1 -. "v5.1 Purge" .-> Null["NULL after 30d"]
803
+
804
+ SQLite --> Handoffs["session_handoffs<br/>(OCC versioning)"]
718
805
  SQLite --> History["history_snapshots<br/>(Time Travel)"]
719
806
  SQLite --> Media["media vault<br/>(Visual Memory)"]
720
807
 
@@ -724,13 +811,16 @@ graph TB
724
811
  style Tracing fill:#D69E2E,color:#fff
725
812
  style Dashboard fill:#9F7AEA,color:#fff
726
813
  style Brave fill:#FB542B,color:#fff
727
- style Gemini fill:#4285F4,color:#fff
814
+ style LLM fill:#4285F4,color:#fff
728
815
  style Sandbox fill:#805AD5,color:#fff
729
816
  style SyncBus fill:#ED64A6,color:#fff
730
817
  style GDPR fill:#E53E3E,color:#fff
731
818
  style Storage fill:#2D3748,color:#fff
732
819
  style SQLite fill:#38B2AC,color:#fff
733
820
  style Supabase fill:#3ECF8E,color:#fff
821
+ style T1 fill:#48BB78,color:#fff
822
+ style T2 fill:#E8B004,color:#000
823
+ style Null fill:#E53E3E,color:#fff
734
824
  ```
735
825
 
736
826
  ---
@@ -1060,6 +1150,7 @@ The retrievers use `_aget_relevant_documents` as the primary path with `asyncio.
1060
1150
  | `PRISM_AUTO_CAPTURE` | No | Set `"true"` to auto-capture HTML snapshots of dev servers |
1061
1151
  | `PRISM_CAPTURE_PORTS` | No | Comma-separated ports to scan (default: `3000,3001,5173,8080`) |
1062
1152
  | `PRISM_DEBUG_LOGGING` | No | Set `"true"` to enable verbose debug logs (default: quiet) |
1153
+ | `PRISM_DASHBOARD_PORT` | No | Configure the dashboard port (default: `3000`) |
1063
1154
 
1064
1155
  ---
1065
1156
 
@@ -1512,7 +1603,6 @@ See [`vertex-ai/`](vertex-ai/) for setup and benchmarks.
1512
1603
  │ │ ├── compactionHandler.ts # Gemini-powered ledger compaction
1513
1604
  │ │ └── index.ts # Tool registration & re-exports
1514
1605
  │ └── utils/
1515
- │ └── utils/
1516
1606
  │ ├── telemetry.ts # OTel singleton — NodeTracerProvider, BatchSpanProcessor, no-op mode
1517
1607
  │ ├── tracing.ts # MemoryTrace types + factory (Phase 1 — LLM explainability)
1518
1608
  │ ├── imageCaptioner.ts # VLM auto-caption pipeline (v4.5) + worker.vlm_caption OTel span
@@ -1554,6 +1644,24 @@ See [`vertex-ai/`](vertex-ai/) for setup and benchmarks.
1554
1644
 
1555
1645
  > **[View the full project board →](https://github.com/users/dcostenco/projects/1/views/1)** | **[Full ROADMAP.md →](ROADMAP.md)**
1556
1646
 
1647
+ ### ✅ v5.0 — Quantized Agentic Memory (Shipped!)
1648
+
1649
+ | Feature | Description |
1650
+ |---|---|
1651
+ | 🧮 **TurboQuant Math Core** | Pure TypeScript port of Google's TurboQuant (ICLR 2026) — Lloyd-Max codebook, QR rotation, QJL error correction. Zero dependencies. [RFC-001](docs/rfcs/001-turboquant-integration.md) |
1652
+ | 📦 **~7× Embedding Compression** | 768-dim embeddings shrink from 3,072 bytes to ~400 bytes (4-bit) via variable bit-packing. |
1653
+ | 🔍 **Asymmetric Similarity** | Unbiased inner product estimator: query as float32 vs compressed blobs. No decompression needed. |
1654
+ | 🗄️ **Two-Tier Search** | FTS5 candidate filter → JS-side asymmetric scoring. Bypasses sqlite-vec float32 limitation. |
1655
+
1656
+ ### ✅ v5.1 — Deep Storage Mode (Shipped!)
1657
+
1658
+ | Feature | Description |
1659
+ |---|---|
1660
+ | 🧬 **Deep Storage Purge** | Automated `deep_storage_purge` tool NULLs out redundant float32 embeddings for entries with TurboQuant compressed blobs, reclaiming ~90% of vector storage. |
1661
+ | 🛡️ **Safety Guards** | Minimum 7-day age threshold, dry-run preview mode, multi-tenant isolation, and compressed-blob-existence validation ensure zero data loss. |
1662
+ | 🗃️ **Supabase RPC** | `prism_purge_embeddings` Postgres function (migration 030) provides full backend parity with SQLite. Auto-applied via the v4.1 migration runner. |
1663
+ | 🧪 **303 Tests** | 8 new deep-storage test cases covering dry run, execute, safety guards, and idempotency — zero regressions across the full suite. |
1664
+
1557
1665
  ### ✅ v4.6 — OpenTelemetry Observability (Shipped!)
1558
1666
 
1559
1667
  | Feature | Description |
@@ -1608,11 +1716,11 @@ See [v3.1.0](#whats-in-v310--memory-lifecycle-) and [v3.0.0](#whats-in-v300--age
1608
1716
 
1609
1717
  | Priority | Feature | Description |
1610
1718
  |----------|---------|-------------|
1611
- | 🥇 | **Documentation & Architecture Guide** | Full README overhaul with architecture diagrams, "How to build a self-improving agent" walkthrough, and v4.x feature matrix. |
1612
- | 🥈 | **Knowledge Graph Editor** | Visual graph in Mind Palace showing nodes for projects, agents, sessions, and graduated rules. |
1719
+ | | **Documentation & Architecture Guide** | [Architecture Guide](docs/ARCHITECTURE.md), [Self-Improving Agent Guide](docs/self-improving-agent.md), updated README diagram with v5.x vector tiers. |
1720
+ | | **Knowledge Graph Editor** | Interactive vis.js graph with click-to-filter, node stats, project/keyword/category visualization. |
1613
1721
  | 🥉 | **Autonomous Web Scholar** | Agent-driven learning pipeline using Brave Search + VLM to autonomously build project context while the developer sleeps. |
1614
- | | **Dashboard Auth** | Optional basic auth for remote Mind Palace access. |
1615
- | | **TypeScript LangGraph Examples** | Reference implementations alongside the existing Python agent. |
1722
+ | | **Dashboard Auth** | HTTP Basic Auth with session cookies, timing-safe comparison, styled login page. Set `PRISM_DASHBOARD_USER`/`PRISM_DASHBOARD_PASS`. |
1723
+ | | **TypeScript LangGraph Examples** | [Reference agent](examples/langgraph-ts/) with MCP client, memory retriever nodes, and session persistence. |
1616
1724
  | — | **CRDT Conflict Resolution** | Conflict-free types for concurrent multi-agent edits on the same handoff. |
1617
1725
 
1618
1726
  ---