prism-mcp-server 18.0.2 → 19.0.1

package/README.md CHANGED
@@ -1,280 +1,206 @@
1
- # 🧠 Prism Coder
1
+ # Prism Coder
2
2
 
3
- 🌐 **Read in your language:** 🇬🇧 English · [🇪🇸 Español](docs/i18n/README_es.md) · [🇫🇷 Français](docs/i18n/README_fr.md) · [🇵🇹 Português](docs/i18n/README_pt.md) · [🇷🇴 Română](docs/i18n/README_ro.md) · [🇺🇦 Українська](docs/i18n/README_uk.md) · [🇷🇺 Русский](docs/i18n/README_ru.md) · [🇩🇪 Deutsch](docs/i18n/README_de.md) · [🇯🇵 日本語](docs/i18n/README_ja.md) · [🇰🇷 한국어](docs/i18n/README_ko.md) · [🇨🇳 中文](docs/i18n/README_zh.md) · [🇸🇦 العربية](docs/i18n/README_ar.md)
3
+ **Give your AI agent memory that lasts.** Persistent sessions, knowledge graphs, and offline tool-routing, fully local and free.
4
4
 
5
- **Persistent memory + tool-calling intelligence for AI agents.** *(formerly Prism MCP)*
6
-
7
- A Model Context Protocol server that gives Claude, Cursor, and other AI tools a Mind Palace: long-term memory that survives across sessions, with semantic search, cognitive routing, a visual dashboard, and the `prism-coder:1b7` / `prism-coder:8b` / `prism-coder:14b` / `prism-coder:32b` LLM fleet for offline tool-calling.
8
-
9
- [![npm](https://img.shields.io/npm/v/prism-mcp-server?color=cb0000&label=npm%20%E2%80%94%20prism-mcp-server)](https://www.npmjs.com/package/prism-mcp-server)
10
- [![VS Marketplace](https://img.shields.io/visual-studio-marketplace/v/synalux-ai.synalux?label=VS%20Code&color=007ACC)](https://marketplace.visualstudio.com/items?itemName=synalux-ai.synalux)
11
- [![Website](https://img.shields.io/badge/website-synalux.ai%2Fprism--mcp-6B4FBB)](https://synalux.ai/prism-mcp)
5
+ [![npm](https://img.shields.io/npm/v/prism-mcp-server?color=cb0000&label=npm)](https://www.npmjs.com/package/prism-mcp-server)
12
6
  [![MCP Registry](https://img.shields.io/badge/MCP_Registry-listed-00ADD8)](https://github.com/modelcontextprotocol/servers)
13
- [![Smithery](https://img.shields.io/badge/Smithery-listed-6B4FBB)](https://smithery.ai/server/@dcostenco/prism-mcp)
14
7
  [![License: AGPL-3.0](https://img.shields.io/badge/License-AGPL--3.0-blue.svg)](LICENSE)
8
+ [![Models on HuggingFace](https://img.shields.io/badge/🤗-prism--coder-yellow)](https://huggingface.co/dcostenco)
9
+
10
+ <p align="center">
11
+ <img src="docs/v11_hivemind_multi_agent_dashboard.jpg" alt="Prism Coder: Mind Palace Dashboard with Knowledge Graph and Multi-Agent Hivemind" width="700" />
12
+ </p>
13
+
14
+ Prism Coder is an [MCP server](https://modelcontextprotocol.io) that gives Claude, Cursor, and other AI tools long-term memory that survives across sessions. It ships with the open-weight `prism-coder` model fleet (2B–32B) for fast, offline tool-routing, with no cloud required.
15
15
 
16
- > **Renamed in v14.0.0:** the project is now **Prism Coder** to cover both the Mind Palace memory server *and* the `prism-coder:1b7` / `prism-coder:8b` / `prism-coder:14b` / `prism-coder:32b` LLM fleet on HuggingFace + Ollama. The npm package stays `prism-mcp-server` so existing install URLs and `mcp.json` entries keep working; the `prism-coder` binary has been the canonical entry point since v12.
16
+ **No account needed. No API keys. Runs on your machine.**
17
+ A paid subscription adds cloud sync, higher model tiers, and team features through the [Synalux portal](https://synalux.ai).
17
18
 
18
19
  ---
19
20
 
20
- ## What Prism Coder does
21
+ ## Quickstart
21
22
 
22
- ### 💾 Your AI remembers across sessions
23
- Every conversation feeds the Mind Palace. Next session, your AI agent loads the right context automatically, with no re-explaining.
23
+ The free tier needs no account, no API key, and no cloud. Add the server to your MCP client:
24
24
 
25
- ### 🔍 Semantic search over your history
26
- Ask "what did I decide about the auth flow last month?" and get the answer with citations. Vector search + keyword + graph traversal.
25
+ ```json
26
+ {
27
+ "mcpServers": {
28
+ "prism": {
29
+ "command": "npx",
30
+ "args": ["-y", "prism-mcp-server"]
31
+ }
32
+ }
33
+ }
34
+ ```
27
35
 
28
- ### 🧬 Cognitive routing
29
- Different memory types live in different stores: episodic (what happened), semantic (what's true), procedural (how to do X). The router picks where to store and where to retrieve.
36
+ Open Claude Desktop or Cursor and your agent now has memory backed by a local SQLite database (`~/.prism-mcp/data.db`).
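The config fragment above can also be merged into an existing client config programmatically. The following is a minimal sketch, not part of the package: `add_prism_server` is a hypothetical helper, and the location of your client's config file is left to you because it varies by client.

```python
import copy
import json

# Server entry copied from the README's Quickstart snippet.
PRISM_ENTRY = {"command": "npx", "args": ["-y", "prism-mcp-server"]}


def add_prism_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the `prism` server added.

    Other servers are preserved; an existing `prism` entry is overwritten.
    The input dict is not modified.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["prism"] = copy.deepcopy(PRISM_ENTRY)
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server.
    existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_prism_server(existing), indent=2))
```

Returning a new dict rather than mutating the input makes it easy to diff the result against the original before writing anything to disk.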
30
37
 
31
- ### 🔄 Proactive session drift detection *(new in v15, HRR-powered in v17)*
32
- Your AI agent can now detect when it has drifted from your original goals (mid-session, automatically) and self-correct before you notice the problem.
38
+ **Optional: local model fleet** for offline tool-routing. Pull whichever fits your hardware:
33
39
 
34
- Three direct Prism calls:
35
- 1. **`session_save_ledger`**: snapshot current state
36
- 2. **`session_detect_drift`**: HRR-powered semantic comparison of current work vs original goals, returns `on_track / minor_drift / major_drift` with domain-specific signals (BCBA/Coding/AAC)
37
- 3. **`session_compact_ledger`**: if drifted, compress and reload only what matters
40
+ ```bash
41
+ ollama pull dcostenco/prism-coder:2b # 2.3 GB · mobile / lightweight (99.1% routing accuracy)
42
+ ollama pull dcostenco/prism-coder:4b # 3.4 GB · verifier (100% accuracy)
43
+ ollama pull dcostenco/prism-coder:9b # 5.8 GB · default router (100% accuracy, Qwen3.5)
44
+ ollama pull dcostenco/prism-coder:32b # 19 GB · complex tasks (100% accuracy)
45
+ ```
38
46
 
39
- When major drift is detected, the alert routes to the **Synalux portal** so it's visible across sessions and devices, not just in the current conversation.
47
+ Prism detects both the namespaced (`dcostenco/prism-coder:9b`) and bare (`prism-coder:9b`) Ollama tags automatically.
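To illustrate what accepting both tag forms involves, here is a minimal sketch. It is an assumption about the matching logic, not the package's actual code; `parse_prism_tag` and `installed_sizes` are hypothetical names.

```python
import re

# Matches "prism-coder:<size>" with an optional "<namespace>/" prefix.
_TAG_RE = re.compile(r"^(?:(?P<namespace>[\w.-]+)/)?prism-coder:(?P<size>[\w.]+)$")


def parse_prism_tag(tag: str):
    """Return (namespace, size) for a prism-coder tag, or None for any other tag.

    The namespace is None for the bare form.
    """
    match = _TAG_RE.match(tag.strip())
    if match is None:
        return None
    return match.group("namespace"), match.group("size")


def installed_sizes(tags):
    """Collect the distinct model sizes available under either tag form."""
    parsed = (parse_prism_tag(tag) for tag in tags)
    return sorted({item[1] for item in parsed if item is not None})
```

With this shape, `dcostenco/prism-coder:9b` and `prism-coder:9b` both resolve to the size `9b`, so a caller does not need to care which form was pulled.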
40
48
 
41
- **Real example it caught:** A training session promised BFCL ≥90% for three AI models. The agent spent 3 hours debugging audio bugs instead. The drift check surfaced: "Training goal unmet. Layer3 corpus missing from all training sets. 0 BFCL scores measured." The session immediately re-aligned.
49
+ ---
42
50
 
43
- No scripts. No cron. No hooks. Three tool calls, Prism handles the rest.
51
+ ## What it does
44
52
 
45
- ### 🛡 PHI Guard *(v17+)*
46
- Automatic Protected Health Information detection and redaction in the memory pipeline. Every `session_save_ledger` and `session_save_handoff` call passes through the PHI guard before storage.
53
+ Your AI agent forgets everything between sessions. Prism fixes that, and adds verification, drift detection, and multi-agent coordination on top.
47
54
 
48
- **What it catches:** DOBs, SSNs, MRNs, phone numbers, email addresses, and other structured HIPAA identifiers (18 categories). Redaction is deterministic (regex + pattern matching, no LLM), with zero false negatives on format-constrained identifiers (SSN, MRN, phone, email). Names require NER for reliable detection and are best-effort.
55
+ ### Mind Palace: persistent memory that survives across sessions
49
56
 
50
- **Fail-closed:** PHI detection errors log to stderr (never suppressed) and block the save. Metric: `phi_guard.detected` count per category is always emitted for audit compliance.
57
+ Every conversation feeds a persistent store. The next session loads the right context automatically, with no re-explaining.
51
58
 
52
- ### ⚡ Prompt-based skill routing *(v17+)*
53
- 114 agent skills auto-load based on prompt keywords. No manual skill selection needed: the MCP server scans the user's prompt and injects the relevant skill instructions into the session context before the AI responds.
59
+ <p align="center">
60
+ <img src="docs/mind-palace-dashboard.png" alt="Mind Palace Dashboard: project state, neural graph, pending TODOs" width="700" />
61
+ </p>
54
62
 
55
- ### 💰 Tier enforcement *(v17.1+)*
56
- `prism_infer` now enforces subscription-tier gates: model ceiling, max tokens, daily limits, and cloud fallback are all gated by your plan. Free users get local-only inference up to 4b; paid tiers unlock higher models, more tokens, and cloud fallback. Flat-rate seat caps via `max_seats` per plan.
63
+ The dashboard shows your current project state, pending TODOs, intent health, and a neural knowledge graph, all built automatically from your agent sessions.
57
64
 
58
- ### 🛡 Local-first: security + speed
59
- Free tier runs entirely on your machine: SQLite, local embedding model, no API keys, no cloud. Paid tier adds cloud sync via Synalux portal.
65
+ ### Knowledge Graph: semantic + keyword + graph search
60
66
 
61
- **Why local models matter:**
67
+ Ask "what did I decide about the auth flow last month?" and get an answer with citations, combining vector similarity, full-text search, and graph traversal.
62
68
 
63
- | | Cloud LLM | Local `prism-coder` |
64
- |--|---|---|
65
- | Tool-call latency | 200ms–3s | **~1.6s (1.7B) / ~1.1s (14B)** |
66
- | API key required | Yes | **No** |
67
- | Data sent externally | Every prompt | **Nothing (free tier)** |
68
- | Works offline | ❌ | ✅ |
69
- | Cost at scale | $0.002–0.06/call | **$0** |
70
- | HIPAA | Requires BAA | **On-prem = no BAA** |
69
+ <p align="center">
70
+ <img src="docs/knowledge-graph.jpg" alt="Knowledge Graph: 190 keywords, 47 edges, 12 projects visualized" width="500" />
71
+ </p>
71
72
 
72
- Install in one command, with no config, no keys, no vendor agreements:
73
- ```bash
74
- ollama pull dcostenco/prism-coder:14b # 9 GB · default router · Mac M2+ / iPad Pro
75
- ollama pull dcostenco/prism-coder:4b # 2.5 GB · verifier · iPhone 15/16 Pro
76
- ollama pull dcostenco/prism-coder:1b7 # 2.2 GB · ultra-low RAM / Apple Watch
77
- ollama pull dcostenco/prism-coder:32b # 19 GB · complex tasks · Mac M2 Ultra+
78
- ollama pull dcostenco/prism-coder:8b # 4.7 GB · balanced · iPhone/iPad 8GB
79
- ```
73
+ ### Session History: immutable audit trail
80
74
 
81
- Prism MCP detects both the namespaced (`dcostenco/prism-coder:14b`) and bare (`prism-coder:14b`) Ollama tag forms automatically; nothing else to configure. If you want the bare tags as aliases for direct `ollama run prism-coder:14b` use, run:
75
+ Every session is logged with files changed, decisions made, and TODOs. Search, filter, and replay any past session.
82
76
 
83
- ```bash
84
- prism register-models # aliases */prism-coder:* → prism-coder:* via `ollama cp`
85
- prism register-models --dry-run # preview what would be aliased
86
- ```
77
+ <p align="center">
78
+ <img src="docs/session-ledger.jpg" alt="Session Ledger: 93 sessions, 847 decisions logged across 12 projects" width="700" />
79
+ </p>
87
80
 
88
- ### Cascade architecture
81
+ ### Session Drift Detection
89
82
 
90
- Three-tier local cascade with cloud fallback:
83
+ Long agent sessions can wander from their original goal. `session_detect_drift` compares current work against the stated goal and returns `on_track / minor_drift / major_drift` so the agent can self-correct.
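One way a client might act on the three verdicts is sketched below. It assumes the tool result exposes the verdict as one of the plain strings named above; the follow-up action names are hypothetical and are not defined by Prism.

```python
from enum import Enum


class DriftVerdict(str, Enum):
    """The three verdict strings named in the README."""

    ON_TRACK = "on_track"
    MINOR_DRIFT = "minor_drift"
    MAJOR_DRIFT = "major_drift"


# Hypothetical follow-up actions; the README does not prescribe these names.
_NEXT_STEP = {
    DriftVerdict.ON_TRACK: "continue",
    DriftVerdict.MINOR_DRIFT: "restate_goal",
    DriftVerdict.MAJOR_DRIFT: "pause_and_realign",
}


def next_step(verdict: str) -> str:
    """Map a verdict string to a follow-up action.

    Unknown strings raise ValueError, so a typo can never be read as on_track.
    """
    return _NEXT_STEP[DriftVerdict(verdict)]
```

Rejecting unknown strings instead of defaulting to `continue` keeps the handler conservative if the verdict vocabulary ever changes.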
91
84
 
92
- ```
93
- Query arrives
94
-     │
95
-     ▼
96
- prism-coder:14b ── routes (100% eval_300) ──▶ serve (~3s, 9GB, FREE)
97
-     │                    │
98
-     │              knowledge_search (RAG context)
99
-     │                    │
100
-     ▼                    ▼
101
- prism-coder:4b ── verifies claims ──────────▶ grounded response
102
-     │              (2.5GB, <1s)
103
-     │
104
-     ▼ (complex tasks only, explicit ceiling="32b")
105
- prism-coder:32b ── deep reasoning ──────────▶ serve (~8s, 19GB, FREE)
106
-     │
107
-     ▼ (cloud fallback when local insufficient)
108
- Claude Sonnet 4 ────────────────────────────▶ serve (cloud, ~$0.01/req)
109
- ```
85
+ ### Behavioral Verification: catch bad edits before they happen
110
86
 
111
- | Tier | Model | Role | RAM | Latency | Cost |
112
- |------|-------|------|-----|---------|------|
113
- | **Default** | prism-coder:14b | Router + general inference | 9 GB | ~3s | $0 |
114
- | **Verifier** | prism-coder:4b | Grounding claims check | 2.5 GB | <1s | $0 |
115
- | **Complex** | prism-coder:32b | Deep reasoning (on-demand) | 19 GB | ~8s | $0 |
116
- | **Cloud** | Claude Sonnet 4 | Fallback for max quality | – | ~5-10s | ~$0.01 |
87
+ AI agents apply patterns from checklists without understanding the real-world impact. The `verify_behavior` tool challenges the agent with a scenario it must answer **before** editing, forcing it to think through what the end user will experience.
117
88
 
118
- **Mobile / offline cascade** (Prism AAC iOS):
119
89
  ```
120
- prism-coder:14b (iPad Pro 16GB) → prism-coder:4b (iPhone 8GB)
121
- → prism-coder:1.7b (any device, always fits)
90
+ Agent: "I'll revert this kitchen display change"
91
+ Prism: "⚠️ Scenario: A cook sees a 3-item ticket. One item is voided.
92
+ What should the cook see after the void?"
93
+ Agent: "The ticket stays visible with the remaining 2 items."
94
+ Prism: "Correct. Your revert would hide the ticket entirely."
122
95
  ```
123
96
 
124
- ### Knowledge ingestion: teach Prism your codebase
125
-
126
- Your code knowledge lives in the knowledge graph, not in model weights. Routing stays at 100%.
127
-
128
- ```bash
129
- bash scripts/knowledge-ingest/setup.sh # one-time setup
130
- # Then every git commit auto-indexes changed files into the knowledge graph
131
- ```
97
+ 17 built-in domains (billing, auth, ordering, clinical, HR, and more). Custom domains per workspace on Enterprise. No hooks needed; it works in any MCP client.
132
98
 
133
- Three entry points:
134
- - **MCP tool**: `knowledge_ingest` (AI says "learn this code")
135
- - **GitHub webhook**: `POST /api/github/webhook` (auto on push)
136
- - **REST API**: `POST /api/v1/prism/ingest` (open interface)
99
+ ### Time Travel
137
100
 
138
- See [KNOWLEDGE_INGESTION.md](docs/KNOWLEDGE_INGESTION.md) for full setup guide.
101
+ Roll back to any previous session state. Compare diffs between versions. Restore a known-good state with one click.
139
102
 
140
- ### Routing accuracy
103
+ <p align="center">
104
+ <img src="docs/time-travel-timeline.jpg" alt="Time Travel: version timeline with diff view and one-click restore" width="500" />
105
+ </p>
141
106
 
142
- **Head-to-head: prism-coder:14b vs Claude Opus** (25-case benchmark, production system prompt, May 2026):
107
+ ### Cognitive Routing
143
108
 
144
- | Metric | prism-coder:14b | Claude Opus 4 |
145
- |---|---|---|
146
- | **Overall accuracy** | **96% (24/25)** | 88% (22/25) |
147
- | **Tool routing** (15 tests) | **93% (14/15)** | 80% (12/15) |
148
- | **Abstention** (10 tests) | **100% (10/10)** | **100% (10/10)** |
149
- | **Avg latency** | **0.8s** | 5.5s |
150
- | **Cost per query** | **$0** | ~$0.017 |
151
- | **Annual @ 1K/day** | **$0** | ~$6,100 |
109
+ Three memory types, automatically sorted: **episodic** (what happened: session logs, decisions), **semantic** (what's true: facts, architecture), and **procedural** (how to do X: workflows, patterns). When you search, the router picks the right store instead of dumping everything.
152
110
 
153
- prism-coder:14b beats Opus on tool routing: 7x faster, free, runs offline.
111
+ ### Multi-Agent Hivemind
154
112
 
155
- **eval_300** (300 cases, 17 tools + NO_TOOL, 9 categories, 3-seed validated):
113
+ Coordinate multiple AI agents working on the same project. Each agent has its own session, but they share memory through the knowledge graph. The Hivemind Radar shows real-time agent status, tasks, and activity.
156
114
 
157
- | Model | eval_300 strict | Size | Latency |
158
- |---|---|---|---|
159
- | **prism-coder:32b** | **300/300 (100%)** | 19 GB | ~1.4s |
160
- | **prism-coder:14b** | **299/300 (99.7%)** | 9 GB | ~0.8s |
161
- | **prism-coder:4b** | **300/300 (100%)** | 2.5 GB | ~0.5s |
162
- | **prism-coder:1.7b** | **300/300 (100%)** | 2.2 GB | ~1.6s |
115
+ <p align="center">
116
+ <img src="docs/hivemind-radar.jpg" alt="Hivemind Radar: 5 agents with real-time status, tasks, and activity feed" width="500" />
117
+ </p>
163
118
 
164
- Categories: abstention, adversarial traps, cascade, disambiguation, edge cases, multi-intent, natural phrasing, parameter extraction, verifier prompts.
119
+ ### Neural Search
165
120
 
166
- **What this means**: a child in a hospital without WiFi, a nonverbal adult on an airplane, or a family on a budget gets Claude-grade routing accuracy with zero cloud dependency: the AAC path routes correctly **100% of the time across all tiers**.
121
+ Search across all memories with highlighted results, knowledge graph editing, and memory density metrics.
167
122
 
168
- **What it does NOT mean**: these scores measure routing precision on a 17-tool taxonomy, not general intelligence. Claude outperforms on everything outside this task. The value is **offline reliability at zero cost**, not replacing Claude. Code and clinical knowledge come from RAG via `knowledge_search`.
123
+ <p align="center">
124
+ <img src="docs/v6_cognitive_load_dashboard.jpg" alt="Neural Search with Knowledge Graph Editor and Memory Density" width="500" />
125
+ </p>
169
126
 
170
- ### 🔍 L3 Grounding Verifier
127
+ ---
171
128
 
172
- Fail-closed fact-checking layer. When `prism_infer` receives an `evidence` payload, a separate verifier model (default: `prism-coder:4b`) checks every factual claim in the draft against the evidence before serving it. This is the third layer (L3) of the cascade, after tool routing (L1) and confidence gating (L2).
129
+ ## Local-first and privacy
173
130
 
174
- **Three-tier pre-check:**
131
+ The free tier runs entirely on your machine. Paid tiers add cloud sync through the Synalux portal, which is what enables cross-device memory and team sharing.
175
132
 
176
- | Tier | Condition | Action |
133
+ | | Local tier (free) | Cloud tier (paid) |
177
134
  |---|---|---|
178
- | **0: Conversational** | Draft has no numbers, dates, names, codes, or $ amounts | Serve without verification |
179
- | **0a: No evidence** | Assertive draft + zero evidence snippets | Refuse (fail-closed) |
180
- | **2: NLI** | Assertive draft + evidence provided | Verify each claim against evidence |
135
+ | Memory storage | Local SQLite | Synalux portal (Supabase-backed) |
136
+ | Inference | Local Ollama models | Local models + cloud fallback |
137
+ | API keys required | None | Synalux subscription key |
138
+ | Web search / scrape | Not included | Via Synalux portal (provider keys server-side) |
139
+ | What leaves your machine | Nothing | Memory text + file paths + search queries, sent to the portal over TLS (PHI-redacted before transit) |
140
+ | Works offline | ✅ | Local features yes; sync/cloud no |
181
141
 
182
- **Per-claim verdicts:**
183
- - `ENTAILED`: claim matches evidence (including arithmetic identity: "3" ≈ "three")
184
- - `CONTRADICTED`: evidence states a different value for the same fact → **refuse**
185
- - `NEUTRAL`: claim not covered by evidence → **refuse** (fail-closed default)
142
+ **Handling sensitive data.** All cloud writes pass through automatic redaction (SSNs, dates of birth, medical record numbers, phone numbers, emails, and clinical identifiers are stripped before transit). For regulated workloads, run the **local tier** for full air-gap, or use **Enterprise** which includes a HIPAA Business Associate Agreement.
186
143
 
187
- **Fail-closed guarantees:** HTTP errors, malformed JSON, timeouts → all treated as refusal. The caller gets the specific claim that failed and can retry with more evidence or fall back to cloud.
188
-
189
- **Usage with `prism_infer`:**
190
- ```json
191
- {
192
- "prompt": "What was the patient's last A1C?",
193
- "evidence": [
194
- { "source": "lab_2026-05-01", "content": "HbA1c: 6.8% (ref <7.0)" }
195
- ]
196
- }
197
- ```
144
+ ---
198
145
 
199
- **Structured output:**
200
- ```json
201
- {
202
- "output": "The patient's last A1C was 6.8%.",
203
- "verification": {
204
- "action": "served",
205
- "claims": [{ "text": "A1C was 6.8%", "verdict": "ENTAILED" }],
206
- "verifierChain": [{ "model": "prism-coder:4b", "verdict": "ENTAILED", "latencyMs": 340 }]
207
- }
208
- }
209
- ```
146
+ ## Models
210
147
 
211
- When a claim is contradicted or unsupported:
212
- ```json
213
- {
214
- "output": "⚠ Verification failed: claim 'A1C was 7.2%' is CONTRADICTED by evidence.",
215
- "verification": {
216
- "action": "refused_fabricated",
217
- "refusalClaim": "A1C was 7.2%"
218
- }
219
- }
220
- ```
148
+ The `prism-coder` fleet uses Qwen3.5 for MCP tool-routing. The 9B is fine-tuned with LoRA (r=128, all 64 layers including DeltaNet); the 2B and 4B use stock Qwen3.5-4B at different quantization levels. They are **not** general-purpose chat models: they route reliably and run offline, while Claude and other frontier models remain better at reasoning, coding, and open-domain work. The intended pattern is local routing with an optional cloud fallback for hard cases.
221
149
 
222
- The verifier model (`prism-coder:4b`) is intentionally different from the inference model, satisfying the independent-reviewer principle. Requires a paid plan (see [Plans](#plans)). Set `verify: false` to explicitly skip verification even when evidence is provided.
150
+ | Model | Ollama tag | Size | [BFCL](https://gorilla.cs.berkeley.edu/blogs/12_bfcl_v3_multi_turn.html) Accuracy | Role | Tier |
151
+ |---|---|---|---|---|---|
152
+ | Qwen3.5-4B Q3_K_M | `prism-coder:2b` | 2.3 GB | 99.1% × 3 seeds | iPhone / mobile first gate | Free |
153
+ | Qwen3.5-4B Q4_K_M | `prism-coder:4b` | 3.4 GB | 100% × 3 seeds | Verifier | Free |
154
+ | Qwen3.5-9B (LoRA) | `prism-coder:9b` | 5.8 GB | 100% × 3 seeds | Default router | Standard+ |
155
+ | prism-coder:32b | `prism-coder:32b` | 19 GB | 100% × 3 seeds | Complex tasks | Advanced+ |
223
156
 
224
- ### 🧠 HRR Semantic Drift Detection (v17.0)
225
- Detects when long AI agent sessions drift from their original goal, using Holographic Reduced Representations for temporal trajectory encoding and anomaly detection.
157
+ Weights: [huggingface.co/dcostenco](https://huggingface.co/dcostenco) (public GGUF). Latency depends on model size and hardware; see [Benchmarks](#benchmarks) to measure it on your own machine rather than trusting a printed number.
226
158
 
227
- **Three domains, one detector:**
228
- | Domain | Signals | Safety |
229
- |---|---|---|
230
- | **BCBA/Clinical** | Client specificity decay, function-intervention alignment (4 functions), contraindication detection (epilepsy/pica/dysphagia/diabetes) | PHI-safe, deterministic |
231
- | **Coding** | File scope entropy, summary vagueness, test coverage ratio, trajectory HRR divergence | Adaptive threshold for refactors |
232
- | **AAC** | Prediction accuracy, vocabulary stagnation, topic divergence | Emergency phrases always ≥ 0.95 |
159
+ ### Cascade
233
160
 
234
- **Research-backed:** trajectory association (Frady et al. 2018), HDAD anomaly detection (Wang et al. 2021), unit-modulus projection (Ganesan et al. NeurIPS 2021). 306 tests across 8 files, zero failures. Use `session_detect_drift` with optional `domain` parameter.
161
+ ```
162
+ query → prism-coder:9b (local router, default)
163
+ → prism-coder:4b (grounding verifier)
164
+ → prism-coder:2b (iPhone / mobile, auto-selected by RAM)
165
+ → prism-coder:32b (complex tasks, on demand)
166
+ → cloud fallback (paid tiers, for max quality)
167
+ ```
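The README says the mobile model is "auto-selected by RAM" but does not state the thresholds. The sketch below shows one plausible selection rule under loud assumptions: it uses the download sizes from the Models table as a proxy for memory need, and both the 1.5x headroom factor and the `pick_router` name are hypothetical.

```python
from typing import Optional

# (tag, download size in GB) for the two first-gate routers named in the
# cascade, largest first. Sizes come from the README's Models table.
ROUTER_CANDIDATES = [("prism-coder:9b", 5.8), ("prism-coder:2b", 2.3)]

# Assumed safety margin over the raw weight size; not taken from the README.
HEADROOM = 1.5


def pick_router(available_gb: float) -> Optional[str]:
    """Return the largest router whose padded size fits in `available_gb`.

    Returns None when even the smallest candidate does not fit.
    """
    for tag, size_gb in ROUTER_CANDIDATES:
        if size_gb * HEADROOM <= available_gb:
            return tag
    return None
```

Under these assumptions a machine with 16 GB free would get the 9b router, and a device with 4 GB free would fall back to the 2b model.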
235
168
 
236
- ### ⚡ Zero-search retrieval *(new in v15.8)*
237
- Holographic Reduced Representations (HRR) via Rust WASM for instant memory retrieval without a database query.
169
+ ### Multi-Layer Verification
238
170
 
239
- **Three adaptive strategies:**
240
- - **GloVe embeddings** (offline, 50K words): 87% Top-1 accuracy, stable at 200+ concepts
241
- - **API embeddings** (Gemini/Voyage): 90%+ accuracy when online
242
- - **NeurIPS 2021 projection**: unit-modulus normalization for numerical stability
171
+ Every tool-grounded answer on paid tiers passes through deterministic L3 routing rules and an NLI grounding verifier before reaching the user. Free-tier users get the deterministic gates (L1, L3-Tool, L3-Tier0) without the model-based NLI check.
243
172
 
244
- **Retrieval cascade:** HRR (~0.2ms) → FTS5 (~50ms) → Supabase (~200ms)
173
+ | Layer | What | Model | Cost |
174
+ |---|---|---|---|
175
+ | **L1** | Crisis/medical safety gate | None (regex) | 0 ms |
176
+ | **L3-Tool** | Tool name remap + false-positive rejection | None (deterministic) | 0 ms |
177
+ | **L3-Tier0** | Integer grounding (set membership) | None (deterministic) | 0 ms |
178
+ | **L3-Tier2** | NLI verifier (claim → ENTAILED/NEUTRAL/CONTRADICTED) | prism-coder:2b | ~200 ms |
179
+ | **L4** | Hallucination judge (opt-out for clinical) | prism-coder:4b | ~500 ms |
245
180
 
246
- | Metric | HRR (WASM) | FTS5 | Supabase Vector |
247
- |--------|-----------|------|-----------------|
248
- | Latency | **0.2ms** | 50ms | 200ms |
249
- | Speedup | **1x** | 250x slower | 1000x slower |
250
- | Offline | **Yes** | Yes | No |
251
- | Accuracy (GloVe) | **87% Top-1** | 95%+ | 95%+ |
252
- | Hologram size | **8KB** | Index varies | Cloud |
181
+ Fail-closed on the verified path: when the grounding verifier runs (Standard tier and up), timeout, ambiguity, or missing evidence yields a refusal, not pass-through. Free-tier users get the deterministic L1/L3-Tool gates but not the NLI verifier.
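The fail-closed rule can be stated compactly in code. This is a minimal sketch, not the package's implementation: it assumes the verifier yields one verdict string per claim for a draft that makes factual claims, and that a timeout or error is represented as `None`. The `decide` name and the `"served"` / `"refused"` labels are illustrative.

```python
from typing import Iterable, Optional

SERVE = "served"
REFUSE = "refused"


def decide(verdicts: Optional[Iterable[str]]) -> str:
    """Serve only when every claim was checked and every verdict is ENTAILED.

    Anything else (verifier failure, no verdicts, NEUTRAL, CONTRADICTED,
    or an unrecognised string) results in a refusal.
    """
    if verdicts is None:  # verifier timed out or errored
        return REFUSE
    checked = list(verdicts)
    if not checked:  # nothing came back; all([]) would wrongly pass
        return REFUSE
    return SERVE if all(v == "ENTAILED" for v in checked) else REFUSE
```

The explicit empty-list check matters: Python's `all()` returns `True` for an empty sequence, which would silently turn a missing verification into a pass.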
253
182
 
254
- HRR acts as Tier 0: if confidence is high, FTS5 is skipped entirely. Falls through gracefully when HRR has no match. 97 dedicated tests (72 system + 25 API/client). Built with Rust + `rustfft` + `wasm-bindgen` (229KB binary).
183
+ ---
255
184
 
256
- **HRR AAC prediction benchmark**: real-world impact on Prism AAC word prediction (10 scenarios, 54 integration tests):
185
+ ## Benchmarks
257
186
 
258
- | Scenario | Baseline Top-1 | +HRR Top-1 | Top-1 Lift | MRR Lift |
259
- |----------|---------------|------------|-----------|----------|
260
- | Core AAC phrases | 36.7% | 46.7% | **+27.3%** | +6.0% |
261
- | Personal vocabulary | 70.4% | 81.5% | **+15.8%** | +9.2% |
262
- | Mixed (all phrases) | 47.2% | 56.9% | **+20.6%** | +5.7% |
263
- | Cross-session recall | 80.0% | 80.0% | +0.0% | +0.0% |
187
+ **Reproduce every number yourself.** All evals are open-source and self-contained:
264
188
 
265
- Top-1 = correct word is tile #1. MRR = Mean Reciprocal Rank. Zero Top-5 regressions in any scenario. HRR encodes bigrams + trigrams from every spoken phrase; probes take ~0.2ms β€” safe on every keystroke. All Synalux apps (clinical, AAC, PrismCoach) share HRR via the portal `/api/v1/hrr` endpoint.
189
+ ```bash
190
+ git clone https://github.com/dcostenco/prism-coder && cd prism-coder
191
+ pip install anthropic requests
192
+ python3 tests/benchmarks/prism-routing-100/benchmark.py --models 2b 4b 9b 32b
193
+ ```
266
194
 
267
- **Memory retrieval comparison:**
195
+ **Routing eval (115 cases, 12 categories, 3-seed mean).** Routing accuracy includes the deterministic L3 correction layer, the same rules that run in production. On this narrow tool-routing task all fleet models achieve near-perfect accuracy. Be honest with yourself about what that means: the eval is **near-saturated** for this taxonomy. It measures whether the right one of a small set of MCP tools is selected, not general capability. The useful takeaway is **offline routing reliability at zero cost**, not that a 2.3 GB model rivals a frontier model in general.
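As a quick sanity check on the reported figures, a single miss on a 115-case eval works out to the 99.1% quoted for the 2b model. The helper below is only illustrative arithmetic, not part of the benchmark harness.

```python
def accuracy_pct(passed: int, total: int) -> float:
    """Accuracy as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 1)


CASES = 115  # eval size stated in the README

print(accuracy_pct(CASES - 1, CASES))  # one miss  -> 99.1
print(accuracy_pct(CASES, CASES))      # no misses -> 100.0
```

This also shows how coarse the scale is: on 115 cases each additional miss moves the score by roughly 0.9 percentage points.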
268
196
 
269
- | System | Retrieval | Offline | Cost | Latency |
270
- |--------|-----------|---------|------|---------|
271
- | **Prism Coder** | **HRR + FTS5 + Supabase cascade** | **Yes** | **$0** | **0.2ms** |
272
- | Mem0 | Vector DB (Qdrant/Pinecone) | No | $249/mo | ~100ms |
273
- | Zep | Vector DB + temporal graph | No | $99/mo | ~80ms |
274
- | Hermes (NousResearch) | HRR + SQLite | Yes | Free | ~5ms |
197
+ | Model | Routing accuracy | Notes |
198
+ |---|---|---|
199
+ | prism-coder:2b (Q3_K_M) | 99.1% × 3 seeds | 1 failure: regex→knowledge_search |
200
+ | prism-coder:4b / 9b / 32b | 100% × 3 seeds | Perfect on all 115 cases |
201
+ | Claude (frontier, same eval) | ~98% | Stronger everywhere outside this narrow task |
275
202
 
276
- ### 🌐 Multi-agent Hivemind
277
- Multiple AI agents share the same Mind Palace. Each agent has a role (dev / qa / pm / etc.) and sees scoped context. Heartbeat + roster for coordination.
203
+ **Memory uplift (LoCoMo-Plus, self-published).** A separate long-context dialogue benchmark ([dcostenco/Locomo-Plus](https://github.com/dcostenco/Locomo-Plus)) measures how much structured memory helps a base model retain multi-day context. Results show large gains when a model is paired with Prism memory versus running raw. Note this benchmark is authored, run, and LLM-judged by this project; treat it as a reproducible demonstration, not an independent third-party result, and run it yourself with the commands in that repo.
278
204
 
279
205
  ---
280
206
 
@@ -282,658 +208,264 @@ Multiple AI agents share the same Mind Palace. Each agent has a role (dev / qa /
282
208
 
283
209
  ### vs AI coding assistants
284
210
 
285
- | Feature | Prism Coder | GitHub Copilot | Cursor | Windsurf | Amazon Q | Tabnine | Devin |
286
- |---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
287
- | Local inference (1.7B–32B) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
288
- | Works offline (local-only mode) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
289
- | Open-weight models (HuggingFace) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
290
- | Data stays on machine (local tier) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
291
- | Persistent cross-session memory | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
292
- | Cognitive routing (episodic/semantic) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
293
- | Session drift detection (HRR) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
294
- | L3 grounding verifier | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
295
- | Multi-agent hivemind | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
296
- | MCP server (tools + memory for agents) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
297
- | Cloud fallback (14b → 32b → Sonnet) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
298
- | Web IDE | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ |
299
- | VS Code extension | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ |
300
- | HIPAA / air-gapped ready | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
301
- | Flat-rate pricing (not per-seat) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
302
-
303
- ### vs local AI tools
304
-
305
- | Feature | Prism Coder | Ollama | LM Studio | Jan.ai | Mem0 | Zep |
211
+ These tables are the maintainer's assessment as of June 2026. Verify claims that matter to you; products change fast.
212
+
213
+ | Feature | Prism Coder | GitHub Copilot | Cursor | Windsurf | Amazon Q | Devin |
306
214
  |---|:---:|:---:|:---:|:---:|:---:|:---:|
307
- | Local inference (1.7B–32B cascade) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
308
- | Automatic cloud fallback | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
309
- | Persistent cross-session memory | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ |
310
- | Knowledge ingestion (MCP + webhook + REST) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
311
- | Cognitive routing (3-store) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
312
- | L3 grounding verifier | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
215
+ | Local inference (open-weight) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
216
+ | Works fully offline | ✅ (free tier) | ❌ | ❌ | ❌ | ❌ | ❌ |
217
+ | Persistent cross-session memory | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
313
218
  | Session drift detection | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
314
- | Native MCP server | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
315
- | Web IDE + VS Code extension | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
316
- | Analytics dashboard | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ |
219
+ | L3 grounding verifier | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
220
+ | Behavioral verification (pre-edit) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
221
+ | MCP server (tools + memory) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
222
+ | Web IDE | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
223
+ | VS Code extension | ✅ | ✅ | – | – | ✅ | ❌ |
224
+ | Flat-rate team pricing | ✅ | ❌ (per-seat) | ❌ (per-seat) | ❌ | ❌ | ❌ |
225
+ | HIPAA BAA available | ✅ (Enterprise) | ❌ | ❌ | ❌ | ❌ | ❌ |
226
+
227
+ ### vs local AI / memory tools
228
+
229
+ | Feature | Prism Coder | Ollama | LM Studio | Mem0 | Zep |
230
+ |---|:---:|:---:|:---:|:---:|:---:|
231
+ | Local inference cascade | ✅ | ✅ | ✅ | ❌ | ❌ |
232
+ | Cloud fallback | ✅ | ❌ | ❌ | ❌ | ❌ |
233
+ | Persistent cross-session memory | ✅ | ❌ | ❌ | ✅ | ✅ |
234
+ | Knowledge ingestion (MCP + webhook) | ✅ | ❌ | ❌ | ❌ | ❌ |
235
+ | Cognitive routing (3-store) | ✅ | ❌ | ❌ | ❌ | ❌ |
236
+ | Session drift detection | ✅ | ❌ | ❌ | ❌ | ❌ |
237
+ | Native MCP server | ✅ | ❌ | ❌ | ❌ | ❌ |
238
+ | Web IDE + VS Code extension | ✅ | ❌ | ❌ | ❌ | ❌ |
317
239
 
318
240
  ### Pricing: flat-rate, not per-seat
319
241
 
320
- | | **Prism Coder** | GitHub Copilot | Cursor | Windsurf | Amazon Q | Tabnine |
321
- |---|:---:|:---:|:---:|:---:|:---:|:---:|
322
- | **Individual** | **$19/mo** | $10/mo | $20/mo | $15–20/mo | $19/mo | $39/mo |
323
- | **Team (5 devs)** | **$49/mo flat** | $95/mo | $200/mo | $200/mo | $95/mo | $295/mo |
324
- | **Enterprise (25 devs)** | **$99/mo flat** | $195/mo | $1,000/mo | Custom | Custom | Custom |
325
- | **Cost per dev (team)** | **$9.80** | $19 | $40 | $40 | $19 | $59 |
326
- | **Annual savings (5 devs)** | β€” | **$552** | **$1,812** | **$1,812** | **$552** | **$2,952** |
242
+ | | **Prism Coder** | GitHub Copilot | Cursor | Amazon Q |
243
+ |---|:---:|:---:|:---:|:---:|
244
+ | **Individual** | **$19/mo** | $10/mo | $20/mo | $19/mo |
245
+ | **Team (5 devs)** | **$49/mo flat** | $95/mo | $200/mo | $95/mo |
246
+ | **Enterprise (25 devs)** | **$99/mo flat** | $195/mo | $1,000/mo | Custom |
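
The flat-rate comparison is easiest to check with arithmetic. The following is a minimal Python sketch that uses only the Team (5 devs) prices from the table above; the dictionary and function names are illustrative and not part of Prism.

```python
# Team-tier monthly prices (USD) for a 5-developer team, copied from the table above.
TEAM_SIZE = 5
TEAM_MONTHLY = {
    "Prism Coder": 49,      # flat rate for up to 5 seats
    "GitHub Copilot": 95,
    "Cursor": 200,
    "Amazon Q": 95,
}


def cost_per_dev(monthly_total: float, devs: int = TEAM_SIZE) -> float:
    """Effective monthly cost per developer for a given team price."""
    return round(monthly_total / devs, 2)


def annual_savings(other_monthly: float,
                   prism_monthly: float = TEAM_MONTHLY["Prism Coder"]) -> float:
    """Yearly difference between another tool's team price and the flat rate."""
    return (other_monthly - prism_monthly) * 12


for name, price in TEAM_MONTHLY.items():
    print(f"{name}: ${cost_per_dev(price):.2f} per dev per month, "
          f"${annual_savings(price):,.0f} per year above the flat rate")
```

The same two functions can be rerun with current list prices, since the figures in the table are a snapshot.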
327
247
 
328
248
  ---
329
249
 
330
250
  ## Plans
331
251
 
332
- | | **Free** | **Standard $19/mo** | **Advanced $49/mo** | **Enterprise $99/mo** |
252
+ All on-device models are free to run locally via Ollama on every tier. A subscription gates **cloud** features, higher model ceilings, and increased limits. Local model ceilings are advisory: on-device models run on your own Ollama install regardless of plan, and the ceiling only gates cloud inference and `prism_infer` routing.
253
+
254
+ | | **Free** | **Standard** $19/mo | **Advanced** $49/mo | **Enterprise** $99/mo |
333
255
  |---|---|---|---|---|
334
- | **Seats included** | 1 | 1 | up to 5 | up to 25 |
335
- | **Local model ceiling** | up to 4b | up to 14b | up to 32b | up to 32b |
336
- | **Daily inference limit** | 50 | 200 | 2,000 | 100,000 |
337
- | **Max output tokens** | 512 | 1,024 | 2,048 | 4,096 |
338
- | **Cloud fallback** | -- | Claude Sonnet 4 | Claude Sonnet 4 | Priority + Sonnet 4 |
339
- | **L3 grounding verifier** | -- | ✅ | ✅ | ✅ |
340
- | **Knowledge search** | limited | unlimited | unlimited | unlimited |
341
- | **Session memory** | limited | unlimited | unlimited | unlimited |
342
- | **Analytics dashboard** | -- | ✅ | ✅ | ✅ |
343
- | **HIPAA BAA** | -- | -- | -- | ✅ |
344
-
345
- All on-device models are open-weight and free to run locally via Ollama. The subscription gates cloud features, higher model tiers, and increased limits. Need 25+ seats? [Contact sales](https://synalux.ai/contact). 14-day free trial on all paid plans. [Subscribe →](https://synalux.ai/pricing)
256
+ | Seats | 1 | 1 | up to 5 | up to 25 |
257
+ | Local model ceiling | up to 4b | up to 9b | up to 32b | up to 32b |
258
+ | Daily cloud inference | -- | 200 | 2,000 | 100,000 |
259
+ | Cloud Coder (Web IDE) | -- | 100/day | 1,000/day | 100,000/day |
260
+ | Cloud search | -- | 50/day | 500/day | 100,000/day |
261
+ | Max output tokens | 512 | 1,024 | 2,048 | 4,096 |
262
+ | Cloud fallback | -- | Claude Sonnet 4 | Claude Sonnet 4 | Priority + Sonnet 4 |
263
+ | Grounding verifier (fact-check AI output) | -- | ✅ | ✅ | ✅ |
264
+ | Memory sync (cloud) | -- | ✅ | ✅ | ✅ |
265
+ | Knowledge / session memory | limited | unlimited | unlimited | unlimited |
266
+ | Analytics dashboard | -- | ✅ | ✅ | ✅ |
267
+ | HIPAA BAA | -- | -- | -- | ✅ |
268
+
269
+ 14-day free trial on paid plans. 25+ seats: [contact sales](https://synalux.ai/support).
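
To make the gating rules concrete, here is a small Python sketch of how a client might encode the plan limits. The numbers are copied from the plans table; the `Plan` class, the `PLANS` mapping, and both helper functions are hypothetical and do not describe how Prism enforces limits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    seats: int
    daily_cloud_inference: Optional[int]  # None: no cloud inference on this tier
    max_output_tokens: int


# Values from the plans table; the structure itself is an assumption.
PLANS = {
    "free": Plan(seats=1, daily_cloud_inference=None, max_output_tokens=512),
    "standard": Plan(seats=1, daily_cloud_inference=200, max_output_tokens=1024),
    "advanced": Plan(seats=5, daily_cloud_inference=2000, max_output_tokens=2048),
    "enterprise": Plan(seats=25, daily_cloud_inference=100000, max_output_tokens=4096),
}


def can_use_cloud(plan_name: str, used_today: int) -> bool:
    """True if one more cloud inference call fits in today's quota."""
    plan = PLANS[plan_name]
    if plan.daily_cloud_inference is None:
        return False
    return used_today < plan.daily_cloud_inference


def clamp_output_tokens(plan_name: str, requested: int) -> int:
    """Cap a requested output length at the plan's maximum."""
    return min(requested, PLANS[plan_name].max_output_tokens)


print(can_use_cloud("standard", 199), clamp_output_tokens("advanced", 5000))
```

Local Ollama inference is deliberately absent from the sketch, because the table states that on-device models run regardless of plan.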
346
270
 
347
271
  ---
348
272
 
349
- ## Get started
350
-
351
- ```bash
352
- # Install globally
353
- npm install -g prism-mcp-server
354
-
355
- # Or use npx (no install)
356
- npx prism-mcp-server
357
- ```
358
-
359
- Add to Claude Desktop / Cursor config:
360
-
361
- ```json
362
- {
363
- "mcpServers": {
364
- "prism": {
365
- "command": "npx",
366
- "args": ["-y", "prism-mcp-server"]
367
- }
368
- }
369
- }
370
- ```
371
-
372
- That's it. Open Claude / Cursor and your AI now has memory.
273
+ ## How agents use it
373
274
 
374
- More setup details in [`docs/SETUP_GEMINI.md`](docs/SETUP_GEMINI.md).
375
-
376
- ### Monitoring & Observability *(new in v16.2)*
377
-
378
- Built-in Datadog integration β€” every tool call is logged with tool name, project, and latency. Zero config for self-hosted users (logs to stdout); set `DD_API_KEY` to send structured logs to Datadog HTTP intake.
379
-
380
- ```bash
381
- # Enable Datadog logging (optional)
382
- export DD_API_KEY=your_datadog_api_key
383
-
384
- # Enable OpenTelemetry tracing (optional β€” works with Jaeger, Zipkin, Datadog, Grafana Tempo)
385
- export PRISM_OTEL_ENABLED=true
386
- export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
387
- ```
388
-
389
- **What's tracked automatically:**
390
- - `mcp.tool.success` β€” tool name, project, duration (ms) on every successful call
391
- - `mcp.tool.error` β€” tool name, error message, stack trace on failures
392
- - OpenTelemetry spans with `tool.name` and `project` attributes on all 50 tool handlers
393
-
394
- | Dashboard | What it tracks |
395
- |-----------|---------------|
396
- | [Prism MCP β€” Server Analytics](https://app.datadoghq.com/dashboard/tdm-92f-myh/prism-mcp--server-analytics) | Tool call volume, latency per tool (avg/p95), errors by tool, project activity, knowledge search/ingest, session memory ops |
397
-
398
- ### In-app analytics for paid users *(new in v16.2)*
399
-
400
- Paid Synalux subscribers get a built-in analytics dashboard at `/app/memory-analytics`:
401
-
402
- ```
403
- β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
404
- β”‚ Analytics [standard] plan β”‚
405
- β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
406
- β”‚ πŸ“ Sessions: 147 πŸ”„ Handoffs: 23 πŸ“š Knowledge: 89 β”‚
407
- β”‚ πŸ“ Projects: 5 πŸ’Ύ Memory: 42 KB β”‚
408
- β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
409
- β”‚ Today's Usage 🧠 47/200 πŸ”Ž 12/50 πŸ’¬ 85/200 β”‚
410
- β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
411
- β”‚ 30-Day Trend β–‚β–ƒβ–…β–‡β–†β–„β–ƒβ–…β–†β–‡β–ˆβ–‡β–…β–ƒβ–‚β–ƒβ–…β–†β–‡β–…β–ƒβ–‚β–β–‚β–ƒβ–…β–‡β–†β–…β–ƒ β”‚
412
- β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
413
- β”‚ Top Projects prism-mcp (45) Β· portal (32) Β· ... β”‚
414
- β”‚ Compaction 3 entries > 5KB β€” run compact_ledger β”‚
415
- β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
416
- ```
417
-
418
- - **Free tier**: paywall with upgrade CTA
419
- - **Standard+**: session counts, handoffs, knowledge entries, daily quotas with tier limits, 30-day activity trend, project breakdown, compaction candidates
420
-
421
- ---
422
-
423
- ## How AI agents use it
275
+ Prism exposes 40+ MCP tools. The core memory loop:
424
276
 
425
277
  | Tool | What it does |
426
278
  |---|---|
427
- | `session_load_context` | Recover prior session's state on boot |
428
- | `session_save_ledger` | Append immutable session log entry |
279
+ | `session_load_context` | Recover the prior session's state on boot |
280
+ | `session_save_ledger` | Append an immutable session log entry |
429
281
  | `session_save_handoff` | Save live state for the next session |
430
282
  | `knowledge_search` | Semantic + keyword search over all memories |
431
- | `query_memory_natural` | Natural-language Q&A over your Mind Palace |
432
- | `extract_entities` | Pull people / projects / decisions from text |
433
- | `session_detect_drift` | HRR-powered semantic drift detection (BCBA/Coding/AAC) |
434
- | `session_synthesize_edges` | Auto-link related memories into a graph |
283
+ | `query_memory_natural` | Natural-language Q&A over the memory store |
284
+ | `session_detect_drift` | Detect when a session has drifted from its goal |
285
+ | `verify_behavior` | Pre-edit scenario challenge that catches bad changes before they happen |
286
+ | `knowledge_ingest` | Teach Prism a codebase or document |
435
287
 
436
- (35+ tools total β€” full TypeScript signatures in `src/tools/`. Architecture overview in [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md).)
288
+ Full TypeScript signatures live in [`src/tools/`](src/tools/); architecture in [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md).
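
The core loop in the table (load on boot, append to the ledger, save a handoff) can be illustrated with a toy in-memory stand-in. This Python sketch only borrows the tool names; the `ToyMemoryStore` class, its arguments, and its return shapes are assumptions for illustration, not the MCP tool signatures (those live in `src/tools/`).

```python
from collections import defaultdict


class ToyMemoryStore:
    """In-memory stand-in; method names mirror the tool names in the table."""

    def __init__(self):
        self._ledger = defaultdict(list)  # project -> append-only log entries
        self._handoff = {}                # project -> latest live state

    def session_load_context(self, project):
        # Return the last handoff plus a few recent ledger entries.
        return {
            "handoff": self._handoff.get(project),
            "recent_ledger": list(self._ledger[project][-3:]),
        }

    def session_save_ledger(self, project, summary):
        # Entries are only ever appended, never edited.
        self._ledger[project].append(summary)

    def session_save_handoff(self, project, state):
        # The handoff is overwritten with the newest live state.
        self._handoff[project] = dict(state)


store = ToyMemoryStore()

# Session 1: boot, do some work, persist it.
ctx = store.session_load_context("demo")
store.session_save_ledger("demo", "Chose SQLite for local storage")
store.session_save_handoff("demo", {"todo": ["write migration"]})

# Session 2: a fresh context window recovers the state.
ctx = store.session_load_context("demo")
print(ctx["handoff"], ctx["recent_ledger"])
```

The split between an append-only ledger and an overwritable handoff is the point of the example: one records history, the other records where to resume.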
437
289
 
438
290
  <details>
439
- <summary>πŸ”„ How Prism handles context compaction and context loss</summary>
440
-
441
- The LLM context window is treated as ephemeral scratch space. All durable state lives in Prism's persistent store (SQLite / Supabase). Context compaction is a non-event.
442
-
443
- **Boot protocol** β€” every session (including post-compaction) begins with a mandatory `session_load_context` call, enforced via `CLAUDE.md`. The agent is fully oriented before writing a single byte of response.
444
-
445
- **Two persistent stores:**
446
- - `session_save_ledger` β€” immutable append-only work log (decisions, files changed, summaries)
447
- - `session_save_handoff` β€” versioned live-state snapshot (current task, TODOs, open context)
448
-
449
- **Ledger compaction** (`session_compact_ledger`) β€” when a project exceeds a threshold (default: 50 entries), Prism summarizes old entries via LLM into a rollup row, soft-archives originals, and links them via `spawned_from` graph edges. Runs on a 12-hour background scheduler.
450
-
451
- β†’ Full details: [`docs/COMPACTION.md`](docs/COMPACTION.md)
291
+ <summary>How Prism survives context compaction</summary>
452
292
 
293
+ The LLM context window is treated as ephemeral scratch space; durable state lives in the persistent store (SQLite locally, the portal in the cloud). Every session begins with a mandatory `session_load_context` call, so the agent is oriented before it writes a response. When a project exceeds a threshold (default 50 entries), `session_compact_ledger` summarizes old entries into a rollup, soft-archives the originals, and links them in the graph. See [`docs/COMPACTION.md`](docs/COMPACTION.md).
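
A rough Python sketch of the compaction step described above may help. It assumes a threshold of 50 (the stated default), a hypothetical `keep_recent` parameter that leaves the newest entries untouched, a placeholder string in place of the LLM-written summary, and a `spawned_from` list as the graph link (the name comes from the previous README; the link direction here is a guess). It is not the actual `session_compact_ledger` implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    id: int
    text: str
    archived: bool = False
    spawned_from: List[int] = field(default_factory=list)  # ids a rollup summarizes


def compact_ledger(entries: List[Entry], threshold: int = 50,
                   keep_recent: int = 10) -> Optional[Entry]:
    """Roll up old entries once the active count exceeds `threshold`."""
    active = [e for e in entries if not e.archived]
    if len(active) <= threshold:
        return None  # below the threshold: nothing to compact
    old = active[:len(active) - keep_recent]
    # Placeholder summary; the real tool summarizes the entries with an LLM.
    summary = f"Rollup of {len(old)} entries, starting with: {old[0].text}"
    rollup = Entry(id=max(e.id for e in entries) + 1, text=summary,
                   spawned_from=[e.id for e in old])
    for e in old:
        e.archived = True  # soft-archive: the row stays, but leaves the active set
    entries.append(rollup)
    return rollup


entries = [Entry(i, f"entry {i}") for i in range(60)]
rollup = compact_ledger(entries)
print(len(rollup.spawned_from), sum(not e.archived for e in entries))
```

Because originals are only flagged rather than deleted, a rollup can always be traced back to the entries it replaced.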
453
294
  </details>
454
295
 
455
296
  ---
456
297
 
457
- ## Models
458
-
459
- Prism Coder inference cascades through fine-tuned models first, with Claude as a quality-gate fallback. Models route through the Synalux router (authentication + subscription required). Cascade: Cloud (OpenRouter) → Ollama local → Claude fallback.
460
-
461
- | Model | Ollama tag | Where | Tier | Latency |
462
- |---|---|---|---|---|
463
- | **prism-coder:1.7b** | `prism-coder:1b7` (v42) | On-device (Mac/local) · iOS via llama.cpp | Free | ~1.6s |
464
- | **prism-coder:8b** | `prism-coder:8b` (v36) | On-device iPhone/iPad 8GB+ · local Mac | Free | ~0.8s |
465
- | **prism-coder:14b** | `prism-coder:14b` (v36) | On-device Mac 24GB+ · iPad Pro · Cloud A100 | Standard+ | ~1.1s |
466
- | **prism-coder:32b** | `prism-coder:32b` (v7 MoE) | Cloud (OpenRouter) A100 80GB via Synalux | Pro/Enterprise | ~0.8s |
467
-
468
- Models use the Synalux SFT corpus (AAC + Prism MCP tool taxonomy + clinical workflows). **Internal quality gate: ≥ 90% on the Prism 102-case eval before production promotion.**
469
-
470
- > **Training note**: Base Qwen3 models are strong tool-routers out of the box. Heavy fine-tuning regresses tool-vs-plain-text decisions; light-touch polish recipes (small corpus, balanced tool/plain-text split) are the published path. Production adapter selection and retrain methodology are managed in the Synalux portal.
471
-
472
- **Per-category breakdown β€” [Prism 102-case eval](tests/benchmarks/prism-routing-100/README.md) (3-seed mean, v36/v7 system prompt, May 2026):**
473
-
474
- | Model | Overall | Load ctx | Save | Srch mem | Handoff | Compact | Know srch | AAC | Translate | No-tool | Info | Edge | Avg lat | Inv |
475
- |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
476
- | **prism-coder:32b** v7 | **100.0%** | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | **100%** | 0.8s | 0 |
477
- | **prism-coder:8b** v36 | **100.0%** | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | **100%** | 0.8s | 0 |
478
- | **prism-coder:14b** v36 | **100.0%** | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | **100%** | 1.1s | 0 |
479
- | **Claude Opus 4.7** | **98.3%** | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 83% | 3.0s | 0 |
480
- | **prism-coder:1.7b** v42 | **100.0%** | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | **100%** | 1.6s | 0 |
481
-
482
- > **Methodology**: 102-case pool across 12 categories. Scores are 3-seed mean (seeds 2027/2028/2029, zero variance across all seeds). All fine-tuned models use the Qwen3 nothink template with keyword-trigger routing prompts and `-> respond directly (no tool)` for the no-tool class. Full runner: [`tests/benchmarks/prism-routing-100/benchmark.py`](tests/benchmarks/prism-routing-100/benchmark.py) Β· Cascade runner: [`tests/benchmarks/cascade-14b-32b-opus/cascade_eval.py`](tests/benchmarks/cascade-14b-32b-opus/cascade_eval.py).
483
- >
484
- > **These are NOT general-purpose LLM benchmarks.** This eval measures routing precision on 6 specific MCP tools. The prism-coder models are specialists trained on this exact task β€” they match or exceed Claude on routing while Claude dominates on general reasoning, coding, and open-domain QA. The value is **offline reliability at zero cost**, not replacing cloud AI.
485
-
486
- **iOS deployment:** On-device inference via **llama.cpp Swift SPM**. Auto-selects by device RAM: 14B on iPad Pro 16GB (100% routing), 8B on iPhone/iPad 8GB (100%, OOM fallback to 1.7B at 100%). CoreML is not viable because coremltools doesn't support Qwen3 attention ops. Integration: `LLMEngine.swift` → `prismNativeBridge.askAI()` → token stream. WiFi fallback: Mac Ollama (`OLLAMA_HOST=0.0.0.0`).
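
The RAM-based selection with an out-of-memory fallback can be sketched as follows. The shipped logic lives in Swift (`LLMEngine.swift`); this Python version is only an illustration. The thresholds are the ones stated above, while the function names and the `try_load` callback are hypothetical.

```python
from typing import Callable

# Largest to smallest, using the Ollama tags from the models table.
CASCADE = ["prism-coder:14b", "prism-coder:8b", "prism-coder:1b7"]


def select_model(device_ram_gb: float) -> str:
    """Pick the largest model the stated RAM tiers allow."""
    if device_ram_gb >= 16:
        return "prism-coder:14b"
    if device_ram_gb >= 8:
        return "prism-coder:8b"
    return "prism-coder:1b7"


def load_with_fallback(device_ram_gb: float, try_load: Callable[[str], str]) -> str:
    """Try the selected model; on MemoryError step down to the next smaller one."""
    start = CASCADE.index(select_model(device_ram_gb))
    for tag in CASCADE[start:]:
        try:
            return try_load(tag)
        except MemoryError:
            continue  # out of memory: fall through to a smaller model
    raise RuntimeError("no model in the cascade could be loaded")


print(select_model(16), select_model(8), select_model(6))
```

Here `try_load` stands in for whatever actually loads the weights; it is expected to raise `MemoryError` when the model does not fit.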
487
-
488
- ### Benchmarks β€” run them yourself
489
-
490
- All benchmarks are open-source. Reproduce every number in this README:
491
-
492
- ```bash
493
- git clone https://github.com/dcostenco/prism-coder
494
- cd prism-coder
495
- pip install anthropic requests
496
-
497
- # Per-model solo eval (102 cases, 3 seeds)
498
- python3 tests/benchmarks/prism-routing-100/benchmark.py --models 14b 8b 32b 1b7 opus
499
-
500
- # Cascade eval β€” 14B β†’ 32B β†’ Opus (Claude Opus as etalon)
501
- export ANTHROPIC_API_KEY=sk-ant-...
502
- ollama pull dcostenco/prism-coder:14b dcostenco/prism-coder:32b
503
- python3 tests/benchmarks/cascade-14b-32b-opus/cascade_eval.py
504
- ```
505
-
506
- **Not a general function-calling benchmark.** This measures routing precision on 6 specific MCP tools. We don't claim to beat Claude on general capabilities. We match or exceed Claude on the ONE task that matters for offline AAC: correct tool routing, every time, under 2 seconds, with zero cloud.
507
-
508
- | Benchmark | Source | What it measures |
509
- |---|---|---|
510
- | Per-model BFCL | [`tests/benchmarks/prism-routing-100/`](tests/benchmarks/prism-routing-100/) | Solo accuracy per model, 12 categories |
511
- | Cascade vs Opus | [`tests/benchmarks/cascade-14b-32b-opus/`](tests/benchmarks/cascade-14b-32b-opus/) | Tier distribution, Opus engagement rate, cascade accuracy |
512
- | LoCoMo-Plus (Cognitive) | [`dcostenco/Locomo-Plus`](https://github.com/dcostenco/Locomo-Plus) | Long-context dialogue coherence and historical memory retention |
513
-
514
- ### Cognitive Dialogue Memory (LoCoMo-Plus Benchmark)
515
-
516
- LoCoMo-Plus is a long-context, multi-day dialogue benchmark designed to test an AI agent's memory retention, context awareness, and ability to coherently reference historical dialogue evidence.
517
-
518
- The **Cognitive** subset (401 multi-day dialogue scenarios) was evaluated head-to-head comparing raw baseline models against the **Prism-MCP** framework (using local SQLite semantic memory). Graded by a neutral `gemini-2.5-flash` model acting as judge (scoring on coherence, continuity, and fact accuracy):
519
-
520
- | Configuration | Samples | Total Score | Average Score | Absolute Delta | Relative Error Reduction |
521
- | :--- | :---: | :---: | :---: | :---: | :---: |
522
- | **Gemini-2.5-flash (Baseline)** | 401 | 278.0 / 401 | **69.33%** | n/a | n/a |
523
- | **Prism-MCP (Gemini-2.5-flash + Memory)** | 401 | 361.0 / 401 | **90.02%** | **+20.69pp** | **67.5%** |
524
- | **Gemini-3.1-pro-preview (Baseline)** | 401 | 272.0 / 401 | **67.83%** | n/a | n/a |
525
- | **Prism-MCP (Gemini-3.1-pro + Memory)** | 401 | 382.0 / 401 | **95.26%** | **+27.43pp** | **85.3%** |
526
- | **Gemini-3.5-flash (Baseline)** | 401 | 237.0 / 401 | **59.10%** | n/a | n/a |
527
- | **Prism-MCP (Gemini-3.5-flash + Memory)** | 401 | 388.0 / 401 | **96.76%** | **+37.66pp** | **92.1%** |
528
- | **Claude Sonnet 4.6 (Baseline)** | 401 | 290.0 / 401 | **72.32%** | n/a | n/a |
529
- | **Prism-MCP (Claude Sonnet 4.6 + Memory)** | 401 | 357.0 / 401 | **89.03%** | **+16.71pp** | **60.4%** |
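
The "Relative Error Reduction" column is not defined in the table. One reading that reproduces the published figures is the share of the baseline's missed points that the memory-backed run recovers. The Python sketch below uses the totals from the table; the function names are illustrative.

```python
SAMPLES = 401

# (baseline total, Prism-MCP total) out of 401, copied from the table above.
RUNS = {
    "Gemini-2.5-flash": (278, 361),
    "Gemini-3.1-pro": (272, 382),
    "Gemini-3.5-flash": (237, 388),
    "Claude Sonnet 4.6": (290, 357),
}


def average_score(total: float, samples: int = SAMPLES) -> float:
    """Average score as a percentage, rounded as in the table."""
    return round(100 * total / samples, 2)


def relative_error_reduction(baseline: float, with_memory: float,
                             samples: int = SAMPLES) -> float:
    """Percent of the baseline's missed points recovered with memory."""
    baseline_errors = samples - baseline
    recovered = with_memory - baseline
    return round(100 * recovered / baseline_errors, 1)


for model, (base, mem) in RUNS.items():
    print(f"{model}: {average_score(base)}% -> {average_score(mem)}%, "
          f"error reduction {relative_error_reduction(base, mem)}%")
```

Under this definition a model with a high baseline has fewer errors left to remove, which is why the absolute delta and the relative reduction rank the configurations differently.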
530
-
531
- **Key Takeaways**:
532
- * **Pure attention limits**: Even the strongest frontier model tested, Claude Sonnet 4.6 at **72.32%**, misses over a quarter of cognitive memory cues without external memory. The Gemini 3.5 Flash baseline sits at **59.10%**. Both suffer from attention dilution when parsing massive multi-day transcripts directly in active context.
533
- * **Prism lifts every model**: Prism-MCP yields large gains regardless of base model, from +16.71pp (Claude) to +37.66pp (Gemini 3.5 Flash). Even Claude's stronger native recall benefits from structured retrieval, jumping from 72.32% to **89.03%**.
534
- * **Best overall**: Prism-MCP + Gemini 3.5 Flash achieves the highest score (**96.76%**), eliminating 92.1% of baseline errors. This makes the cheapest model + Prism more accurate than the most expensive model alone.
535
- * **Claude vs Gemini (raw)**: Claude Sonnet 4.6 outperforms every Gemini baseline (+13.22pp over 3.5 Flash, +4.49pp over 3.1 Pro, +2.99pp over 2.5 Flash), consistent with stronger native long-context recall.
536
-
537
- <details>
538
- <summary>πŸ” View Test Case Schema & Sample</summary>
539
-
540
- A representative test sample from the `unified_cognitive_only.json` ([GitHub source](https://github.com/dcostenco/Locomo-Plus/blob/main/data/unified_cognitive_only.json)) dataset contains a multi-turn chat history with a memory "needle" placed days prior, followed by a cued dialogue prompt:
541
-
542
- ```json
543
- {
544
- "category": "Cognitive",
545
- "input_prompt": "Caroline said, \"...\"\nMelanie said, \"...\"",
546
- "trigger": "Melanie said, \"Hey, Caroline! Nice to hear from you! Love the necklace, any special meaning to it?\"",
547
- "evidence": "Swedish grandmother's necklace was gifted to Caroline",
548
- "answer": "Yes, this necklace was a gift from my grandmother in my home country, Sweden."
549
- }
550
- ```
551
-
552
- When evaluated:
553
- * **Baseline models** without memory frequently output a generic guess (e.g., "Thanks, it was a gift from a friend") or fail to reference the Sweden/grandmother relationship.
554
- * **Prism-MCP** automatically embeds the prior turns, stores them in SQLite, and when cued, retrieves the precise "Swedish grandmother" evidence turn via semantic vectors to inject it into active context.
555
- </details>
556
-
557
- <details>
558
- <summary>πŸ’» View How to Reproduce Publicly (Test Source & Guide)</summary>
559
-
560
- To run and review the evaluation suite on your local setup using the benchmark runner scripts (`evaluate_qa.py` and `llm_as_judge.py`):
561
-
562
- ```bash
563
- # 1. Clone the LoCoMo-Plus evaluation codebase
564
- git clone https://github.com/dcostenco/Locomo-Plus /tmp/Locomo-Plus
565
- cd /tmp/Locomo-Plus
566
-
567
- # 2. Run Baseline Gemini 3.1 Pro Evaluation (concurrency 5)
568
- export GOOGLE_API_KEY="your-api-key"
569
- PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/evaluate_qa.py \
570
- --data-file data/unified_cognitive_only.json \
571
- --out-file output/gemini_3.1_pro_pred.json \
572
- --model gemini-3.1-pro-preview \
573
- --backend call_gemini \
574
- --concurrency 5
575
-
576
- # 3. Run Prism-MCP powered by Gemini 3.1 Pro Evaluation (concurrency 1 to guard SQLite locks)
577
- export PRISM_TEXT_MODEL=gemini-3.1-pro-preview
578
- PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/evaluate_qa.py \
579
- --data-file data/unified_cognitive_only.json \
580
- --out-file output/prism_gemini_3.1_pro_pred.json \
581
- --model gemini-3.1-pro-preview \
582
- --backend call_prism \
583
- --concurrency 1
584
-
585
- # 4. Run Claude Sonnet 4.6 Baseline Evaluation (concurrency 3, rate-limit safe)
586
- export ANTHROPIC_API_KEY="your-api-key"
587
- PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/evaluate_qa.py \
588
- --data-file data/unified_cognitive_only.json \
589
- --out-file output/claude_sonnet46_pred.json \
590
- --model claude-sonnet-4-6 \
591
- --backend call_claude \
592
- --concurrency 3
593
-
594
- # 5. Grade results using the LLM-as-a-Judge script
595
- PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/llm_as_judge.py \
596
- --input-file output/prism_gemini_3.1_pro_pred.json \
597
- --out-file output/prism_gemini_3.1_pro_judged.json \
598
- --model gemini-2.5-flash \
599
- --backend call_gemini \
600
- --concurrency 5 \
601
- --summary-file output/prism_gemini_3.1_pro_summary.json
602
- ```
603
- </details>
604
-
605
- ### Models on HuggingFace
606
-
607
- | Model | HuggingFace | Solo BFCL | Cascade role | Size |
608
- |---|---|---|---|---|
609
- | prism-coder:32b | [dcostenco/prism-coder-32b](https://huggingface.co/dcostenco/prism-coder-32b) | **100.0%** routing (v7 MoE) | Tier 2 (catches ~1% 14B misses) | 16 GB |
610
- | prism-coder:8b | [dcostenco/prism-coder-8b](https://huggingface.co/dcostenco/prism-coder-8b) | **100.0%** routing (v36) | Mobile tier | 4.7 GB |
611
- | prism-coder:14b | [dcostenco/prism-coder-14b](https://huggingface.co/dcostenco/prism-coder-14b) | **100.0%** routing (v36) | Tier 1 (serves ~99% of traffic) | 8.4 GB |
612
- | prism-coder:1.7b | [dcostenco/prism-coder-1.7b](https://huggingface.co/dcostenco/prism-coder-1.7b) | **100.0%** routing (v42) | On-device / always-fits fallback | 1.1 GB |
613
- | prism-ide:14b | [dcostenco/prism-ide](https://huggingface.co/dcostenco/prism-ide) | **22/22** TypeScript eval (v1) | Code generation tier 1 (~1.1s) | 8.4 GB |
614
- | prism-ide:32b | [dcostenco/prism-ide](https://huggingface.co/dcostenco/prism-ide) | Complex code + multi-file (v3) | Code generation tier 2 (~0.8s MoE) | 16 GB |
615
-
616
- ## Self-hosted / Local AI (Enterprise)
617
-
618
- Run the full Prism model stack on your own hardware β€” zero cloud, zero latency, full data sovereignty.
619
-
620
- **Requirements:** Mac M2 Pro+ (48GB recommended) or Linux with NVIDIA GPU Β· [Ollama](https://ollama.com)
298
+ ## CLI
621
299
 
622
300
  ```bash
623
- # On-device tier β€” 1.1 GB (any machine, iPhone) β€” 100% routing
624
- ollama pull dcostenco/prism-coder:1b7
625
-
626
- # Mobile tier β€” 4.7 GB (iPhone/iPad 8GB, Mac M1+) β€” 100% routing
627
- ollama pull dcostenco/prism-coder:8b
628
-
629
- # Standard tier β€” 8.4 GB (Mac 24GB+, iPad Pro 16GB) β€” 100% routing
630
- ollama pull dcostenco/prism-coder:14b
631
-
632
- # Reasoning tier β€” 16 GB (Mac M2 Ultra+, 30B-A3B MoE) β€” 100% routing
633
- ollama pull dcostenco/prism-coder:32b
301
+ prism load <project> # load session context
302
+ prism save # save ledger + handoff
303
+ prism search <query> # search code across repos (exact / regex / symbol / semantic)
304
+ prism review <files...> # AI code review: security, performance, style
305
+ prism scan <files...> # security scan: secrets, licenses, Dockerfile
306
+ prism push # push local SQLite to the cloud backend
307
+ prism register-models # alias dcostenco/prism-coder:* -> prism-coder:*
634
308
  ```
635
309
 
636
- Set `LOCAL_LLM_URL=http://localhost:11434` in your portal config. Routing is automatic:
310
+ ### `prism search`: semantic code search
637
311
 
638
- **Desktop/server**: 14B → 32B → Claude Sonnet 4 fallback · **Mobile/offline**: 14B → 8B → 1.7B
312
+ <p align="center">
313
+ <img src="docs/scm_search_cli.jpg" alt="prism search: semantic code search with relevance scores" width="500" />
314
+ </p>
639
315
 
640
- iOS/mobile on same WiFi: `OLLAMA_HOST=0.0.0.0 ollama serve` on the Mac, then point `LOCAL_LLM_URL` at the Mac's IP.
641
- Routing accuracy (May 2026, v36/v7 system prompt, 3-seed mean): 32B v7 = **100.0%** · 8B v36 = **100.0%** · 14B v36 = **100.0%** · 1.7B v42 = **100.0%**
642
- Cascade (14B→32B): **100.0%** · Opus solo: 98.3% · Opus engaged: **0% of requests** → [Full results](tests/benchmarks/cascade-14b-32b-opus/README.md)
316
+ ### `prism review`: AI code review with HIPAA checks
643
317
 
644
- ---
318
+ <p align="center">
319
+ <img src="docs/scm_review_cli.jpg" alt="prism review: AI code review with security and HIPAA findings" width="400" />
320
+ </p>
645
321
 
646
- ## What you can build with it
322
+ ### `prism scan`: security scanner for secrets, Dockerfiles, licenses
647
323
 
648
- - **Persistent coding assistant** that remembers your codebase, your decisions, your team's conventions
649
- - **Research agent** that builds knowledge over time β€” Auto-Scholar pipeline ingests papers / docs and synthesizes
650
- - **Clinical scribe** that retains patient context across visits (HIPAA-compliant cloud + local)
651
- - **Customer support agent** that learns from every ticket
652
- - **Writing assistant** that knows your voice, your prior drafts, and what you've already published
324
+ <p align="center">
325
+ <img src="docs/scm_scan_cli.jpg" alt="prism scan: security scan finding secrets and container issues" width="400" />
326
+ </p>
653
327
 
654
328
  ---
655
329
 
656
330
  ## Companions
657
331
 
658
- ### 🌐 Website & Docs
332
+ Prism works alongside these tools; use whichever fits your workflow.
659
333
 
660
- **[synalux.ai/prism-mcp](https://synalux.ai/prism-mcp)** β€” full documentation, dashboard, subscription plans, and model downloads.
334
+ ### Web IDE: Prism Coder
661
335
 
662
- ### 💻 Web IDE: Prism Coder
336
+ A browser-based IDE at [synalux.ai/coder](https://synalux.ai/coder). Import any GitHub repo and get:
663
337
 
664
- Use Prism Coder directly in your browser β€” no install, no desktop app required. Standalone coding IDE with the prism-coder agent built in. Works with any Prism plan (no Synalux health subscription needed).
338
+ - **Monaco editor** with multi-tab, split view, syntax highlighting, and VS Code keybindings
339
+ - **In-browser Node.js** via WebContainer (your code runs in the browser sandbox, not on a server)
340
+ - **Integrated terminal**: WebContainer shell in-browser; optional server PTY via WebSocket when connected to a dev server
341
+ - **AI Agent Mode**: describe a task and the agent creates files, runs type-checks, and verifies the result
342
+ - **Source control**: commit, branch, push/pull, stash, blame, tag management
343
+ - **Live Share**: real-time collaborative editing with session links
344
+ - **Node.js debugger** via Chrome DevTools Protocol
345
+ - **Tasks runner** (VS Code `tasks.json` compatible), **Problems panel** (Monaco diagnostics)
346
+ - **12-language i18n**: full UI localization
665
347
 
666
- **[synalux.ai/coder](https://synalux.ai/coder)**
348
+ <p align="center">
349
+ <img src="docs/screenshots/agent-mode.png" alt="Prism Coder IDE: Agent Mode creating a component with auto-fix and type-checking" width="500" />
350
+ </p>
667
351
 
668
- | Feature | Detail |
669
- |---|---|
670
- | Agent | prism-coder:8b offline Β· Claude Sonnet 4 (Standard+) |
671
- | Integrations | GitHub repos Β· same Prism account, no separate sign-up |
672
- | Plans | Free (4b) Β· Standard $19/mo (14b) Β· Advanced $49/mo (32b) Β· Enterprise $99/mo |
352
+ <p align="center">
353
+ <img src="docs/screenshots/collaboration.png" alt="Prism Coder IDE: Live Share with team members and real-time cursor tracking" width="500" />
354
+ </p>
673
355
 
674
- ### 🧩 VS Code Extension: Synalux
356
+ Standard+ plans get cloud AI and higher rate limits. Free tier works with local Ollama. Code execution uses the in-browser WebContainer by default; Live Share and the optional PTY terminal connect to external servers when explicitly enabled.
675
357
 
676
- Memory-augmented AI inside VS Code, powered by Prism. 20 multimodal tools, multi-agent orchestration, 12-language support. Works offline (Ollama) or cloud (OpenRouter). HIPAA-compliant healthcare workflows.
358
+ ### VS Code Extension: Synalux
677
359
 
678
- [![VS Marketplace](https://img.shields.io/visual-studio-marketplace/v/synalux-ai.synalux?label=VS%20Marketplace&color=007ACC)](https://marketplace.visualstudio.com/items?itemName=synalux-ai.synalux)
360
+ Memory-augmented AI inside VS Code with clinical practice management features. Install from the marketplace:
679
361
 
680
362
  ```bash
681
- # Install from terminal
682
363
  code --install-extension synalux-ai.synalux
683
364
  ```
684
365
 
685
- Or open VS Code → Extensions (⇧⌘X) → search **"Synalux"** → Install.
366
+ [![VS Marketplace](https://img.shields.io/visual-studio-marketplace/v/synalux-ai.synalux?label=VS%20Marketplace&color=007ACC)](https://marketplace.visualstudio.com/items?itemName=synalux-ai.synalux)
686
367
 
687
- ### πŸ“¦ npm / npx
368
+ AI chat, voice input, SOAP note generator, team collaboration, and video calls, all inside VS Code. Routes through local Ollama by default; cloud on paid tiers.
688
369
 
689
- ```bash
690
- # Run without installing (always latest version)
691
- npx prism-mcp-server
370
+ <details>
371
+ <summary>Feature details</summary>
692
372
 
693
- # Or install globally
694
- npm install -g prism-mcp-server
695
- prism load my-project
696
- ```
373
+ - **AI**: Chat participant (`@synalux`), multi-agent pipeline, voice input, model switching, 10 tones
374
+ - **Clinical**: SOAP note generator, role-based access, document signing, patient board
375
+ - **Collaboration**: Team chat, DMs, video calls, customer board, visual builder, DevContainers
376
+ - **Privacy**: Local Ollama by default. `preferLocal=true` tries local first. Enterprise BAA available.
377
+ </details>
697
378
 
698
- Package: [`prism-mcp-server` on npm](https://www.npmjs.com/package/prism-mcp-server)
379
+ ### Prism AAC
699
380
 
700
- ### PrismAAC
381
+ Communication app for non-speaking users, powered by the on-device prism-coder fleet for phrase prediction. macOS / iOS / web.
701
382
 
702
- AAC communication app for non-speaking users. Powered by Prism's spreading-activation phrase ranking + on-device 7B model. macOS / iOS / Android via web. → [github.com/dcostenco/prism-aac](https://github.com/dcostenco/prism-aac)
383
+ See [github.com/dcostenco/prism-aac](https://github.com/dcostenco/prism-aac)
703
384
 
704
385
  ---
705
386
 
706
- ## 🆕 Prism as Foundation (v14.0.0)
707
-
708
- As of v14.0.0, Prism's algorithm exports are a **stable public contract** under SemVer. External systems can port `actrActivation.ts` (ACT-R cognitive decay), `spreadingActivation.ts` (the 0.7 similarity + 0.3 activation hybrid score), `routerExperience.ts` (experience bias with `MIN_SAMPLES=5` cold-start gate), `compactionHandler.ts` (the 25KB prompt-budget cap), and `graphMetrics.ts` (warning ratios) with citations and pin a Prism version.
709
-
710
- ### Reference consumers
711
-
712
- | Consumer | What it uses from Prism |
713
- |---|---|
714
- | [Audit hooks framework](https://github.com/dcostenco/prism-coder/blob/main/docs/WOW_FEATURES.md#7-the-recipe-combining-all-of-the-above) | ACT-R decay (`d=0.25` lesson rate), spreading activation hybrid score (0.7/0.3), experience bias (`MIN_SAMPLES=5`, `MAX_BIAS_CAP=0.15`), graph-metrics warning ratios (0.20 / 0.30 / 0.40), compaction's 25KB prompt-budget. **327 tests pin every constant**; CI catches divergence automatically. |
715
- | [PrismAAC](https://github.com/dcostenco/prism-aac) | Spreading-activation phrase ranking (recency × frequency × per-user history). Caregiver corrections auto-harvest into the personalization corpus via the audit-hooks postflight harvester. The on-device 7B model + this algorithm stack is what makes PrismAAC defensible. |
716
- | Synalux portal | Tier-aware model routing using experience bias on prior outcomes per fingerprint. HIPAA-compliant clinical scribe with on-device-first privacy guarantees. |
717
-
718
- ## CLI Reference
719
-
720
- Prism Coder includes a CLI for session management, code review, and sync operations.
721
-
722
- ```bash
723
- prism load <project> # Load session context (same as session_load_context MCP tool)
724
- prism save # Save session state (ledger + handoff)
725
- prism ledger <project> # Save a session log entry (same as session_save_ledger)
726
- prism handoff <project> # Update live project state for next session
727
- prism push # Push local SQLite data to Supabase cloud
728
- prism sync # Cross-backend data synchronization
729
- prism search <query> # Search code across repos (exact, regex, symbol, semantic)
730
- prism review <files...> # AI code review: security, performance, style
731
- prism scan <files...> # Security scan: secrets, licenses, Dockerfile
732
- prism dora # Show DORA metrics for current project
733
- prism scm # Source control, AI review, security scanning
734
- prism verify # Manage the verification harness
735
- prism status # Check verification state and config drift
736
- prism generate # Bless current rubric as canonical
737
- prism register-models # Alias dcostenco/prism-coder:* → prism-coder:*
738
- ```
739
-
740
- ## Testing
741
-
742
- ```bash
743
- npm test # 2,676 test cases across 89 files (vitest)
744
- npm test -- --coverage # coverage report
745
- python3 tests/benchmarks/prism-routing-100/benchmark.py --models 1b7 14b 32b
746
- ```
747
-
748
- **Pinned in CI**: 327 tests enforce every constant: ACT-R decay `d=0.25`, spreading-activation hybrid score `0.7/0.3`, experience bias `MIN_SAMPLES=5` / `MAX_BIAS_CAP=0.15`, graph-metrics warning ratios `0.20 / 0.30 / 0.40`, compaction's 25KB prompt-budget. CI catches divergence automatically.
749
-
750
- **Coverage areas**:
751
- - HRR zero-search retrieval (97 tests: 3 embedding strategies, edge cases, persistence, adaptive cascade, API client, chat integration)
752
- - Knowledge ingestion (32 tests: chunker, Q&A gen, webhook, security, storage round-trip)
753
- - Prism infer cascade (110 tests: tier selection, cloud fallback, grounding verifier)
754
- - Compaction handler (rollup creation, concurrency guard, LLM failure)
755
- - Model picker (20 tests: 14b default ceiling, 4b verifier, RAM gating)
756
- - Storage round-trip (12 architectural guard tests preventing bypass)
757
- - BCBA skill integration
758
- - Deep storage tier
759
- - Dashboard rendering
760
- - Routing benchmarks (eval_300: 300 cases, 17 tools)
387
+ ## Git Hooks (Portable)
761
388
 
762
- ## Migration
763
-
764
- ### Local SQLite → Synalux portal
765
-
766
- If you've been running Prism on the free tier and want to move historical session data into the paid-tier portal, use the migration script:
389
+ Pre-commit and pre-push security hooks that work with any editor, any AI tool, and direct CLI. No Claude Code dependency.
767
390
 
768
391
  ```bash
769
- # dry run first: prints what would be migrated, hits no network
770
- node scripts/migrate-local-to-portal.mjs --dry-run
771
-
772
- # real run: pushes ledger + handoff entries through POST /api/v1/prism/memory
773
- PRISM_SYNALUX_API_KEY=synalux_sk_... \
774
- node scripts/migrate-local-to-portal.mjs
775
-
776
- # scope to one project
777
- node scripts/migrate-local-to-portal.mjs --project=my-project
392
+ # Install in all repos (one-time)
393
+ bash synalux-private/scripts/install-git-hooks.sh
778
394
 
779
- # include scholar entries (excluded by default; usually large + low-value)
780
- node scripts/migrate-local-to-portal.mjs --include-scholar
395
+ # Or install manually in a single repo
396
+ cp hooks/pre-commit .git/hooks/pre-commit && chmod +x .git/hooks/pre-commit
397
+ cp hooks/pre-push .git/hooks/pre-push && chmod +x .git/hooks/pre-push
781
398
  ```
782
399
 
783
- **What it does**: reads `~/.prism-mcp/data.db` via `@libsql/client` (already a runtime dep, so no extra install), exchanges the refresh token for a JWT (cached + auto-refreshed before expiry), and POSTs each ledger entry and handoff to the portal. Failures are logged with the source row id; successes are counted at the end.
400
+ | Hook | What it checks | Mode |
401
+ |------|----------------|------|
402
+ | `pre-commit` | Dead code, orphan services, scaffold code, missing auth | `PRECOMMIT_MODE=advisory\|block\|off` |
403
+ | `pre-push` | 19-rule security audit (SSRF, SQL injection, secrets, IDOR, etc.) | `PREPUSH_MODE=advisory\|block\|off` |
784
404
 
785
- **Credentials**: `PRISM_SYNALUX_API_KEY` from env. If unset, the script also checks `~/prism/.env` for `PRISM_SYNALUX_API_KEY=...` as a convenience for dev workflows.
405
+ Default mode is `advisory` (warn but allow). Set `*_MODE=block` for hard enforcement. Hooks look for full audit scripts in the repo first (`hooks/lib/`), then `~/.claude/hooks/` fallback, then minimal inline checks.
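The three modes described above can be sketched as a small shell helper. This is only an illustration of the documented behavior, not the hooks' actual source: `hook_exit_code` and its two arguments (mode, number of findings) are hypothetical names introduced here.

```bash
# Hypothetical helper: maps a hook mode and a findings count to an exit code.
# It mirrors the documented semantics only; the real hooks may differ.
#   $1 = mode (advisory | block | off), defaults to advisory
#   $2 = number of findings reported by the audit, defaults to 0
hook_exit_code() {
  hook_mode="${1:-advisory}"
  hook_findings="${2:-0}"
  case "$hook_mode" in
    off)
      echo 0                      # checks disabled, always allow
      ;;
    block)
      if [ "$hook_findings" -gt 0 ]; then
        echo 1                    # hard enforcement: reject on any finding
      else
        echo 0
      fi
      ;;
    *)
      if [ "$hook_findings" -gt 0 ]; then
        echo "warning: $hook_findings finding(s), allowing anyway" >&2
      fi
      echo 0                      # advisory: warn but allow
      ;;
  esac
}
```

Assuming the hooks read these variables from the environment at run time, a single push could be hardened with `PREPUSH_MODE=block git push`, and a single commit could skip the checks with `PRECOMMIT_MODE=off git commit`.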
786
406
 
787
- **Idempotency**: handoffs are written with the portal's CRDT merge (last-write-wins per project+role); ledger entries are append-only and de-duped server-side by `(project, conversation_id, summary)`. Re-running on the same DB is safe.
407
+ ---
788
408
 
789
- **One-shot only**: this script is a migration tool, not a sync daemon. Once you've moved, set `PRISM_STORAGE=synalux` (or leave it on `auto` and let the resolver pick synalux when credentials are present) and the MCP server writes directly to the portal going forward.
409
+ ## Self-hosting (Enterprise)
790
410
 
791
- ## Production Infrastructure
411
+ Run the full model stack on your own hardware: no cloud, full data sovereignty.
792
412
 
793
- ### Architecture
413
+ **Requirements:** Mac M2 Pro+ (48 GB recommended) or Linux + NVIDIA GPU, plus [Ollama](https://ollama.com).
794
414
 
795
- ```
796
-                           CLIENTS
797
- ┌─────────────────────┐     ┌─────────────────────────────┐
798
- │  prism-aac (iOS/web)│     │  Claude Code · Cursor · IDE │
799
- │  Vercel             │     │  MCP config → Railway URL   │
800
- └──────────┬──────────┘     └─────────────┬───────────────┘
801
-            │ inference                    │ memory
802
-            ▼                              ▼
803
- ┌──────────────────────┐    ┌─────────────────────────────┐
804
- │  SYNALUX ROUTER      │    │  prism-mcp SERVER           │
805
- │  Vercel              │    │                             │
806
- │  • JWT auth          │    │  Primary - Railway          │
807
- │  • tier enforcement  │    │  Standby - Fly.io           │
808
- │  • complexity route  │    │  Fallback - Supabase REST   │
809
- │  • proxy to cloud    │    │  auto-failover chain        │
810
- └──────────┬───────────┘    └─────────────┬───────────────┘
811
-            │                              │
812
-            ▼                              ▼
813
- ┌──────────────────────────────┐  ┌─────────────────────────────┐
814
- │  OPENROUTER / LOCAL          │  │  SUPABASE                   │
815
- │                              │  │  session ledgers            │
816
- │  Cloud: Claude Sonnet 4      │  │  knowledge graph            │
817
- │  Routing: prism-coder        │  │  handoffs & todos           │
818
- │    :32b(100%) :14b(100%)     │  │                             │
819
- │    :8b(100%) :1b7(100%)      │  │  source of truth            │
820
- │  Code: prism-ide             │  │                             │
821
- │    :14b · :32b               │  │                             │
822
- └──────────────────────────────┘  └─────────────────────────────┘
415
+ ```bash
416
+ ollama pull dcostenco/prism-coder:9b # default router
417
+ export LOCAL_LLM_URL=http://localhost:11434
823
418
  ```
824
419
 
825
- ### Service Routing
420
+ Routing is automatic: `9b → 4b → cloud fallback` on desktop/server, `2b → cloud fallback` on mobile/iPhone. For iOS or another machine on the same network, run `OLLAMA_HOST=0.0.0.0 ollama serve` and point `LOCAL_LLM_URL` at the host's IP.
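As a rough sketch of the routing order and the LAN setup described above, the two helpers below print the documented fallback chain for a device class and build an endpoint URL for a remote Ollama host. Both function names are hypothetical, the port is Ollama's default 11434, and the real router's selection logic is not shown in this README.

```bash
# Hypothetical helper: prints the documented fallback order for a device class.
# "mobile" covers mobile/iPhone; anything else is treated as desktop/server.
tier_chain() {
  case "$1" in
    mobile) echo "2b cloud" ;;
    *)      echo "9b 4b cloud" ;;
  esac
}

# Hypothetical helper: builds a LOCAL_LLM_URL value for an Ollama host on the LAN.
#   $1 = host name or IP address of the machine running `ollama serve`
llm_url_for_host() {
  echo "http://$1:11434"
}
```

On a second device you might then run `export LOCAL_LLM_URL="$(llm_url_for_host 192.168.1.20)"`, where `192.168.1.20` is a placeholder for the address of the machine started with `OLLAMA_HOST=0.0.0.0 ollama serve`.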
826
421
 
827
- **LLM Backends**
828
-
829
- | Surface | Primary | Fallback | Local |
830
- |---|---|---|---|
831
- | AI Chat (free) | Gemini 2.5 Flash (direct API) | Claude Haiku 3.5 | prism-coder:14b via Ollama |
832
- | AI Chat (paid) | Claude Sonnet 4 (OpenRouter) | Claude Haiku 3.5 | prism-coder:14b via Ollama |
833
- | Prism Coder (tool-calling) | Claude Haiku 3.5 (OpenRouter) | - | prism-coder:14b via Ollama |
834
- | Prism AAC | Local prism-coder:14b | Gemini 2.5 Flash / Claude | prism-coder:8b / :1b7 |
835
-
836
- **Web Search**
837
-
838
- | Surface | Primary | Fallback |
839
- |---|---|---|
840
- | AI Chat `@search` | Firecrawl | - |
841
- | Prism MCP agents (cloud) | Firecrawl | - |
842
- | Prism MCP server (local) | Firecrawl (via MCP tools) | - |
843
- | Clinical research | PubMed + ERIC + Semantic Scholar | DuckDuckGo |
422
+ ---
844
423
 
845
- **TTS (Text-to-Speech)**
424
+ ## Configuration reference
846
425
 
847
- | Tier | Engine | Offline |
426
+ | Variable | Purpose | Default |
848
427
  |---|---|---|
849
- | 1 | Inworld TTS-2 (cloud) | - |
850
- | 1.5 | Kokoro-82M neural (WASM) | en/es/fr/pt/ja/zh |
851
- | 2 | OS Web Speech API | all |
852
- | 3 | WASM espeak-ng | all |
853
-
854
- **Other Services**
428
+ | `PRISM_STORAGE` | `local` / `synalux` / `supabase` / `auto` | `auto` |
429
+ | `PRISM_SYNALUX_API_KEY` | Paid-tier portal key (`synalux_sk_...`) | -- (local if unset) |
430
+ | `LOCAL_LLM_URL` | Ollama endpoint | `http://localhost:11434` |
431
+ | `PRISM_FORCE_LOCAL` | Force local SQLite regardless of credentials | `false` |
855
432
 
856
- | Service | Provider | Purpose |
857
- |---|---|---|
858
- | Payments | Stripe | Subscriptions, checkout |
859
- | Email | Resend | Transactional (invites, shares) |
860
- | Video | LiveKit | Telehealth, case conferences |
861
- | SMS | Twilio | Emergency alerts, caregiver notifications |
862
- | Translation | Offline dictionary (1,261 × 20 langs) | AAC, Watch |
433
+ With no variables set, Prism runs fully local. Set `PRISM_SYNALUX_API_KEY` (and leave `PRISM_STORAGE=auto`) to use the cloud backend.
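The defaults above can be read as a small decision procedure. The sketch below is one plausible reading, not Prism's actual resolver: `resolve_storage` is a hypothetical name, and the precedence (force-local first, then an explicit `PRISM_STORAGE`, then key presence under `auto`) is an assumption drawn from the table. How `auto` treats Supabase credentials is not described here, so that case is left out.

```bash
# Hypothetical helper: prints the backend implied by the documented variables.
# Assumed precedence: PRISM_FORCE_LOCAL > explicit PRISM_STORAGE > auto-detect.
resolve_storage() {
  # PRISM_FORCE_LOCAL=true forces local SQLite regardless of credentials.
  if [ "${PRISM_FORCE_LOCAL:-false}" = "true" ]; then
    echo local
    return 0
  fi

  # An explicit, non-auto PRISM_STORAGE value is taken as-is.
  prism_storage_choice="${PRISM_STORAGE:-auto}"
  if [ "$prism_storage_choice" != "auto" ]; then
    echo "$prism_storage_choice"
    return 0
  fi

  # auto: use the portal when a key is present, otherwise stay local.
  if [ -n "${PRISM_SYNALUX_API_KEY:-}" ]; then
    echo synalux
  else
    echo local
  fi
}
```

With nothing set, `resolve_storage` prints `local`; after setting `PRISM_SYNALUX_API_KEY` it prints `synalux`.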
863
434
 
864
- ## Synalux Inference Router
435
+ ---
865
436
 
866
- All Prism AAC model inference is protected behind Synalux as a mandatory router. Models are **never accessible directly**: all traffic goes through Synalux for auth, billing, and rate limiting.
437
+ ## Testing
867
438
 
439
+ ```bash
440
+ npm test # full suite (vitest)
441
+ npm test -- --coverage # coverage report
868
442
  ```
869
- ┌──────────────────────────────────────────────────────────┐
870
- │                       CLIENT LAYER                       │
871
- │        prism-aac (iOS/web)  │  Synalux Portal            │
872
- └──────────────┬───────────────────────────────────────────┘
873
-                │ POST /api/v1/prism-aac/inference
874
-                │ Authorization: Bearer <user-JWT>
875
-                ▼
876
- ┌──────────────────────────────────────────────────────────┐
877
- │                      SYNALUX ROUTER                      │
878
- │  1. Verify JWT (no anonymous access)                     │
879
- │  2. Check subscription tier                              │
880
- │  3. Enforce rate limit (per-tier daily cap)              │
881
- │  4. Route to model tier by complexity                    │
882
- │  5. Proxy → OpenRouter / Gemini (key never exposed)      │
883
- │  6. Log → aac_inference_log (audit trail)                │
884
- └──────────┬───────────────────────────────┬───────────────┘
885
-            │                               │
886
-            ▼                               ▼
887
- ┌────────────────────┐        ┌──────────────────────┐
888
- │  LOCAL (Ollama)    │        │  CLOUD (OpenRouter)  │
889
- │  prism-coder:14b   │        │  Claude Sonnet 4     │
890
- │  prism-coder:8b    │        │  Claude Haiku 3.5    │
891
- │  prism-coder:1b7   │        │  Gemini 2.5 Flash    │
892
- │  free, offline     │        │  paid tiers          │
893
- └────────────────────┘        └──────────────────────┘
894
-
895
- On-device (free, offline):
896
-   prism-coder:1b7 GGUF Q4_K_M (1.1 GB) → any Apple device
897
-   prism-coder:8b  GGUF Q4_K_M (4.7 GB) → iPhone/iPad 8 GB+
898
-   prism-coder:14b GGUF Q4_K_M (8.4 GB) → Mac/iPad Pro 16 GB+
899
-
900
- HuggingFace: dcostenco/prism-coder-{14b,8b,32b,1.7b} (public GGUF weights)
901
- ```
902
-
903
- | Plan | Cloud model | Daily limit | On-device |
904
- |---|---|---|---|
905
- | **Free** | - | unlimited local | prism-coder:1.7b (100%) + 8b (100%) + 14b (100%) |
906
- | **Standard $19/mo** | Claude Sonnet 4 | 200 req | + cloud fallback |
907
- | **Pro $49/mo** | prism-coder:32b | 2,000 req | + reasoning tier |
908
- | **Enterprise $99/mo** | prism-coder:32b priority | unlimited | + HIPAA BAA + custom fine-tuning |
909
443
 
910
- All on-device models are **free for every tier**; no subscription is needed for local inference. Offline translation (1,261 phrases × 20 languages) included in all plans.
911
-
912
- [Subscribe →](https://synalux.ai/pricing)
913
-
914
- See [`docs/WOW_FEATURES.md`](docs/WOW_FEATURES.md) for the algorithm catalogue. Release notes in [`docs/releases/v14.0.0-prism-as-foundation.md`](docs/releases/v14.0.0-prism-as-foundation.md).
444
+ Coverage spans HRR retrieval, knowledge ingestion, the inference cascade and grounding verifier, compaction, the model picker, and storage round-trips.
915
445
 
916
446
  ---
917
447
 
918
- <details>
919
- <summary>📚 Architecture, cognitive systems, and full feature catalog</summary>
448
+ ## Migration: local to cloud
920
449
 
921
- **Detailed docs in this repo:**
922
- - [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md): system architecture, memory routing, HRR
923
- - [`docs/COMPACTION.md`](docs/COMPACTION.md): how Prism handles LLM context compaction and ledger compaction
924
- - [`docs/SETUP_GEMINI.md`](docs/SETUP_GEMINI.md): Gemini configuration
925
- - [`docs/self-improving-agent.md`](docs/self-improving-agent.md): adversarial eval / anti-sycophancy
926
- - [`docs/rfcs/`](docs/rfcs/): design RFCs
927
- - [`docs/releases/`](docs/releases/): per-version release notes
928
- - [`CHANGELOG.md`](CHANGELOG.md): version history (v12.5 Unified Billing, v11.6 Hivemind, v11.5.1 Auto-Scholar, etc.)
929
- - [`CONTRIBUTING.md`](CONTRIBUTING.md): contributor guide
450
+ To move free-tier history into the paid portal:
930
451
 
931
- **The original 1933-line README is preserved in git history.** To browse the prior version (full feature catalog, Cognitive Architecture v7.8, Autonomous Cognitive OS v9.0, HRR Zero-Search, Adversarial Evaluation walkthroughs, Universal Import patterns, competitive analysis vs LangMem/MemGPT/Letta/Zep, v12.5 Unified Billing details, v11.6 Hivemind, v11.5.1 Auto-Scholar): `git show HEAD~1:README.md`.
452
+ ```bash
453
+ node scripts/migrate-local-to-portal.mjs --dry-run # preview, no network
454
+ PRISM_SYNALUX_API_KEY=synalux_sk_... \
455
+ node scripts/migrate-local-to-portal.mjs # push ledger + handoffs
456
+ ```
932
457
 
933
- </details>
458
+ It reads `~/.prism-mcp/data.db` and POSTs entries to the portal. Ledger entries are append-only and de-duped server-side; handoffs use last-write-wins per project. Re-running on the same DB is safe. This is a one-shot migration, not a sync daemon. Once it completes, set `PRISM_STORAGE=synalux` (or leave it on `auto`).
934
459
 
935
460
  ---
936
461
 
937
462
  ## License
938
463
 
939
- [AGPL-3.0](LICENSE): open source, same license as Prism AAC. Commercial use via Synalux subscription for hosted/managed deployment.
464
+ | Product | License |
465
+ |---|---|
466
+ | **prism-mcp-server** (this repo) | [AGPL-3.0](LICENSE) |
467
+ | **VS Code extension** (synalux-ai.synalux) | BSL-1.1 |
468
+ | **Web IDE** (synalux.ai/coder) | Synalux Terms of Service |
469
+ | **Prism AAC** | AGPL-3.0 |
470
+
471
+ The AGPL-3.0 license covers the MCP server and its source code. The VS Code extension and Web IDE are separate products with their own licenses. Commercial hosted/managed deployment of the MCP server is available via the Synalux subscription.