prism-mcp-server 19.0.0 → 19.0.1
- package/README.md +106 -74
- package/dist/cli.js +2 -2
- package/dist/storage/sqlite.js +4 -2
- package/dist/tools/behavioralVerifierHandler.js +3 -4
- package/dist/tools/ledgerHandlers.js +7 -5
- package/dist/tools/prismInferHandler.js +12 -13
- package/dist/utils/entitlements.js +27 -7
- package/dist/utils/modelPicker.js +14 -15
- package/dist/verification/gatekeeper.js +2 -1
- package/dist/verification/runner.js +7 -2
- package/dist/verification/schema.js +9 -1
- package/dist/verification/severityPolicy.js +12 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,10 +1,6 @@
 # Prism Coder

-**
-
-Prism Coder is a [Model Context Protocol](https://modelcontextprotocol.io) server that gives Claude, Cursor, and other AI tools long-term memory that survives across sessions — semantic search, cognitive routing, and a visual dashboard. It ships alongside the open-weight `prism-coder` model fleet (1.7B-32B) for fast, offline tool-routing when you don't want a cloud round-trip.
-
-It runs **fully local and free** on SQLite + Ollama with no API keys. A paid subscription adds cloud sync, higher model tiers, and team features through the Synalux portal.
+**Give your AI agent memory that lasts.** Persistent sessions, knowledge graphs, and offline tool-routing — fully local and free.

 [](https://www.npmjs.com/package/prism-mcp-server)
 [](https://github.com/modelcontextprotocol/servers)
@@ -15,7 +11,10 @@ It runs **fully local and free** on SQLite + Ollama with no API keys. A paid sub
 <img src="docs/v11_hivemind_multi_agent_dashboard.jpg" alt="Prism Coder — Mind Palace Dashboard with Knowledge Graph and Multi-Agent Hivemind" width="700" />
 </p>

-
+Prism Coder is an [MCP server](https://modelcontextprotocol.io) that gives Claude, Cursor, and other AI tools long-term memory that survives across sessions. It ships with the open-weight `prism-coder` model fleet (2B–32B) for fast, offline tool-routing — no cloud required.
+
+**No account needed. No API keys. Runs on your machine.**
+A paid subscription adds cloud sync, higher model tiers, and team features through the [Synalux portal](https://synalux.ai).

 ---

@@ -39,18 +38,20 @@ Open Claude Desktop or Cursor and your agent now has memory backed by a local SQ
 **Optional — local model fleet** for offline tool-routing. Pull whichever fits your hardware:

 ```bash
-ollama pull dcostenco/prism-coder:2b # 2.3 GB ·
-ollama pull dcostenco/prism-coder:4b # 3.4 GB · verifier
-ollama pull dcostenco/prism-coder:
-ollama pull dcostenco/prism-coder:32b #
+ollama pull dcostenco/prism-coder:2b # 2.3 GB · mobile / lightweight (99.1% routing accuracy)
+ollama pull dcostenco/prism-coder:4b # 3.4 GB · verifier (100% accuracy)
+ollama pull dcostenco/prism-coder:9b # 5.8 GB · default router (100% accuracy, Qwen3.5)
+ollama pull dcostenco/prism-coder:32b # 19 GB · complex tasks (100% accuracy)
 ```

-Prism detects both the namespaced (`dcostenco/prism-coder:
+Prism detects both the namespaced (`dcostenco/prism-coder:9b`) and bare (`prism-coder:9b`) Ollama tags automatically.
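
One way to picture the tag matching described in the README line above is to strip the namespace before comparing. This is a minimal sketch under stated assumptions: `toBareTag` and `findInstalledTag` are hypothetical names, not functions from prism-mcp-server, and the package's actual picker may resolve tags differently.

```javascript
// Hypothetical sketch: these helpers are NOT part of prism-mcp-server.
const NAMESPACE = "dcostenco/";

// Strip the namespace prefix so both tag forms compare equal.
function toBareTag(tag) {
  return tag.startsWith(NAMESPACE) ? tag.slice(NAMESPACE.length) : tag;
}

// Return the first installed tag that matches `wanted` in either form,
// or null when nothing matches.
function findInstalledTag(installedTags, wanted) {
  const bare = toBareTag(wanted);
  const match = installedTags.find((tag) => toBareTag(tag) === bare);
  return match === undefined ? null : match;
}
```

Calling `findInstalledTag` with the list of tags reported by Ollama and either spelling of the wanted tag returns whichever spelling is actually installed, so the caller can pass it straight back to Ollama.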

 ---

 ## What it does

+Your AI agent forgets everything between sessions. Prism fixes that — and adds verification, drift detection, and multi-agent coordination on top.
+
 ### Mind Palace — persistent memory that survives across sessions

 Every conversation feeds a persistent store. The next session loads the right context automatically — no re-explaining.
@@ -83,27 +84,17 @@ Long agent sessions can wander from their original goal. `session_detect_drift`

 ### Behavioral Verification — catch bad edits before they happen

-AI agents
+AI agents apply patterns from checklists without understanding the real-world impact. The `verify_behavior` tool challenges the agent with a scenario it must answer **before** editing — forcing it to think through what the end user will experience.

 ```
-Agent: "I'll revert
-Prism: "⚠️
-
-Agent: "The ticket
-Prism: "Correct — your revert would
+Agent: "I'll revert this kitchen display change"
+Prism: "⚠️ Scenario: A cook sees a 3-item ticket. One item is voided.
+What should the cook see after the void?"
+Agent: "The ticket stays visible with the remaining 2 items."
+Prism: "Correct — your revert would hide the ticket entirely."
 ```

-
-
-**How it works**: The `verify_behavior` tool calls the Synalux portal API, which matches the file path against domain scenarios stored in the database. The agent must answer the scenario concretely before editing. No local hooks required — works in Claude, Cursor, or any MCP client.
-
-**Why it matters**: In a single audit session, 47 bugs were found across 7 days of AI-generated code. Every bug was introduced by an agent that applied a "correct" pattern without simulating the end-user journey. The behavioral verifier would have caught all of them.
-
-| Tier | Coverage |
-|------|----------|
-| Free | Skill-based advisory (agent prompted to think before editing) |
-| Standard+ | `verify_behavior` tool with 17 domain scenarios via API |
-| Enterprise | Custom per-workspace scenarios |
+17 built-in domains (billing, auth, ordering, clinical, HR, and more). Custom domains per workspace on Enterprise. No hooks needed — works in any MCP client.

 ### Time Travel

@@ -115,7 +106,7 @@ Roll back to any previous session state. Compare diffs between versions. Restore

 ### Cognitive Routing

-
+Three memory types, automatically sorted: **episodic** (what happened — session logs, decisions), **semantic** (what's true — facts, architecture), and **procedural** (how to do X — workflows, patterns). When you search, the router picks the right store instead of dumping everything.

 ### Multi-Agent Hivemind

@@ -144,37 +135,51 @@ The free tier runs entirely on your machine. Paid tiers add cloud sync through t
 | Memory storage | Local SQLite | Synalux portal (Supabase-backed) |
 | Inference | Local Ollama models | Local models + cloud fallback |
 | API keys required | None | Synalux subscription key |
-| Web search / scrape | Not included |
+| Web search / scrape | Not included | Via Synalux portal (provider keys server-side) |
 | What leaves your machine | Nothing | Memory text + file paths + search queries, sent to the portal over TLS (PHI-redacted before transit) |
-| Works offline |
+| Works offline | ✅ | Local features yes; sync/cloud no |

-**Handling sensitive data.**
+**Handling sensitive data.** All cloud writes pass through automatic redaction (SSNs, dates of birth, medical record numbers, phone numbers, emails, and clinical identifiers are stripped before transit). For regulated workloads, run the **local tier** for full air-gap, or use **Enterprise** which includes a HIPAA Business Associate Agreement.
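
To make the idea of pre-transit redaction concrete, here is a minimal sketch covering only two of the listed categories (US-style SSNs and email addresses). The patterns, the placeholder labels, and the `redact` function are assumptions made for illustration; the diff does not show Prism's actual redaction rules, which cover more identifier types.

```javascript
// Illustrative sketch only: these patterns are assumptions, not the
// redaction rules shipped with Prism.
const REDACTION_RULES = [
  { label: "[SSN]", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "[EMAIL]", pattern: /[^\s@]+@[^\s@]+\.[^\s@]+/g },
];

// Replace every match of every rule with its placeholder label.
function redact(text) {
  return REDACTION_RULES.reduce(
    (out, rule) => out.replace(rule.pattern, rule.label),
    text
  );
}
```

A pattern-based pass like this is deterministic and cheap, but it only catches formats it knows about, which is one reason the README points regulated workloads at the local tier or the Enterprise agreement.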

 ---

 ## Models

-The `prism-coder` fleet uses Qwen3.5 for MCP tool-routing. The
+The `prism-coder` fleet uses Qwen3.5 for MCP tool-routing. The 9B is fine-tuned with LoRA (r=128, all 64 layers including DeltaNet); the 2B and 4B use stock Qwen3.5-4B at different quantization levels. They are **not** general-purpose chat models — they route reliably and run offline; Claude and other frontier models remain better at reasoning, coding, and open-domain work. The intended pattern is local routing with an optional cloud fallback for hard cases.

-| Model | Ollama tag | Size | BFCL Accuracy | Role | Tier |
+| Model | Ollama tag | Size | [BFCL](https://gorilla.cs.berkeley.edu/blogs/12_bfcl_v3_multi_turn.html) Accuracy | Role | Tier |
 |---|---|---|---|---|---|
 | Qwen3.5-4B Q3_K_M | `prism-coder:2b` | 2.3 GB | 99.1% × 3 seeds | iPhone / mobile first gate | Free |
-| Qwen3.5-4B Q4_K_M | `prism-coder:4b` | 3.4 GB | 100% × 3 seeds | Verifier
-|
-| prism-coder:32b | `prism-coder:32b` |
+| Qwen3.5-4B Q4_K_M | `prism-coder:4b` | 3.4 GB | 100% × 3 seeds | Verifier | Free |
+| Qwen3.5-9B (LoRA) | `prism-coder:9b` | 5.8 GB | 100% × 3 seeds | Default router | Standard+ |
+| prism-coder:32b | `prism-coder:32b` | 19 GB | 100% × 3 seeds | Complex tasks | Advanced+ |

 Weights: [huggingface.co/dcostenco](https://huggingface.co/dcostenco) (public GGUF). Latency depends on model size and hardware — see [Benchmarks](#benchmarks) to measure it on your own machine rather than trusting a printed number.

 ### Cascade

 ```
-query → prism-coder:
-→
+query → prism-coder:9b (local router, default)
+→ prism-coder:4b (grounding verifier)
 → prism-coder:2b (iPhone / mobile, auto-selected by RAM)
 → prism-coder:32b (complex tasks, on demand)
 → cloud fallback (paid tiers, for max quality)
 ```

+### Multi-Layer Verification
+
+Every tool-grounded answer on paid tiers passes through deterministic L3 routing rules and an NLI grounding verifier before reaching the user. Free-tier users get the deterministic gates (L1, L3-Tool, L3-Tier0) without the model-based NLI check.
+
+| Layer | What | Model | Cost |
+|---|---|---|---|
+| **L1** | Crisis/medical safety gate | None (regex) | 0 ms |
+| **L3-Tool** | Tool name remap + false-positive rejection | None (deterministic) | 0 ms |
+| **L3-Tier0** | Integer grounding (set membership) | None (deterministic) | 0 ms |
+| **L3-Tier2** | NLI verifier (claim → ENTAILED/NEUTRAL/CONTRADICTED) | prism-coder:2b | ~200 ms |
+| **L4** | Hallucination judge (opt-out for clinical) | prism-coder:4b | ~500 ms |
+
+Fail-closed on the verified path: when the grounding verifier runs (Standard tier and up), timeout, ambiguity, or missing evidence yields a refusal, not pass-through. Free-tier users get the deterministic L1/L3-Tool gates but not the NLI verifier.
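
The fail-closed rule in the README line above can be sketched as a gate that lets a draft through only on a positive verdict. This is an illustration under assumptions: `gateAnswer`, the `"TIMEOUT"` marker, and the refusal text are hypothetical and do not come from the package; only the three verdict names appear in the README table.

```javascript
// Hypothetical sketch of a fail-closed gate. The function name, the
// "TIMEOUT" marker, and the refusal text are assumptions.
const REFUSAL = "Unable to verify this answer against the available evidence.";

// `verdict` is the verifier outcome ("ENTAILED", "NEUTRAL", "CONTRADICTED"),
// or "TIMEOUT" / undefined when no verdict arrived in time.
function gateAnswer(draft, verdict) {
  if (verdict === "ENTAILED") {
    return { refused: false, output: draft };
  }
  // NEUTRAL, CONTRADICTED, TIMEOUT, and anything unexpected all refuse.
  return { refused: true, output: REFUSAL };
}
```

The design point is that the pass condition is an allow-list of one value, so a new or malformed verdict can never slip through as a pass.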
+
 ---

 ## Benchmarks
@@ -184,15 +189,15 @@ query → prism-coder:14b (local router, Mac default)
 ```bash
 git clone https://github.com/dcostenco/prism-coder && cd prism-coder
 pip install anthropic requests
-python3 tests/benchmarks/prism-routing-100/benchmark.py --models 2b 4b
+python3 tests/benchmarks/prism-routing-100/benchmark.py --models 2b 4b 9b 32b
 ```

-**Routing eval (115 cases, 12 categories, 3-seed mean).** On this narrow tool-routing task all fleet models achieve near-perfect accuracy. Be honest with yourself about what that means: the eval is **near-saturated** for this taxonomy — it measures whether the right one of a small set of MCP tools is selected, not general capability. The useful takeaway is **offline routing reliability at zero cost**, not that a 2.3 GB model rivals a frontier model in general.
+**Routing eval (115 cases, 12 categories, 3-seed mean).** Routing accuracy includes the deterministic L3 correction layer — the same rules that run in production. On this narrow tool-routing task all fleet models achieve near-perfect accuracy. Be honest with yourself about what that means: the eval is **near-saturated** for this taxonomy — it measures whether the right one of a small set of MCP tools is selected, not general capability. The useful takeaway is **offline routing reliability at zero cost**, not that a 2.3 GB model rivals a frontier model in general.

 | Model | Routing accuracy | Notes |
 |---|---|---|
 | prism-coder:2b (Q3_K_M) | 99.1% × 3 seeds | 1 failure: regex→knowledge_search |
-| prism-coder:4b /
+| prism-coder:4b / 9b / 32b | 100% × 3 seeds | Perfect on all 115 cases |
 | Claude (frontier, same eval) | ~98% | Stronger everywhere outside this narrow task |

 **Memory uplift (LoCoMo-Plus, self-published).** A separate long-context dialogue benchmark ([dcostenco/Locomo-Plus](https://github.com/dcostenco/Locomo-Plus)) measures how much structured memory helps a base model retain multi-day context. Results show large gains when a model is paired with Prism memory versus running raw. Note this benchmark is authored, run, and LLM-judged by this project — treat it as a reproducible demonstration, not an independent third-party result, and run it yourself with the commands in that repo.
@@ -207,30 +212,30 @@ These tables are the maintainer's assessment as of June 2026. Verify claims that

 | Feature | Prism Coder | GitHub Copilot | Cursor | Windsurf | Amazon Q | Devin |
 |---|:---:|:---:|:---:|:---:|:---:|:---:|
-| Local inference (open-weight) |
-| Works fully offline |
-| Persistent cross-session memory |
-| Session drift detection |
-| L3 grounding verifier |
-| Behavioral verification (pre-edit) |
-| MCP server (tools + memory) |
-| Web IDE |
-| VS Code extension |
-| Flat-rate team pricing |
-| HIPAA BAA available |
+| Local inference (open-weight) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Works fully offline | ✅ (free tier) | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Persistent cross-session memory | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Session drift detection | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| L3 grounding verifier | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Behavioral verification (pre-edit) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| MCP server (tools + memory) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Web IDE | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
+| VS Code extension | ✅ | ✅ | — | — | ✅ | ❌ |
+| Flat-rate team pricing | ✅ | ❌ (per-seat) | ❌ (per-seat) | ❌ | ❌ | ❌ |
+| HIPAA BAA available | ✅ (Enterprise) | ❌ | ❌ | ❌ | ❌ | ❌ |

 ### vs local AI / memory tools

 | Feature | Prism Coder | Ollama | LM Studio | Mem0 | Zep |
 |---|:---:|:---:|:---:|:---:|:---:|
-| Local inference cascade |
-| Cloud fallback |
-| Persistent cross-session memory |
-| Knowledge ingestion (MCP + webhook) |
-| Cognitive routing (3-store) |
-| Session drift detection |
-| Native MCP server |
-| Web IDE + VS Code extension |
+| Local inference cascade | ✅ | ✅ | ✅ | ❌ | ❌ |
+| Cloud fallback | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Persistent cross-session memory | ✅ | ❌ | ❌ | ✅ | ✅ |
+| Knowledge ingestion (MCP + webhook) | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Cognitive routing (3-store) | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Session drift detection | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Native MCP server | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Web IDE + VS Code extension | ✅ | ❌ | ❌ | ❌ | ❌ |

 ### Pricing — flat-rate, not per-seat

@@ -249,19 +254,19 @@ All on-device models are free to run locally via Ollama on every tier. A subscri
 | | **Free** | **Standard** $19/mo | **Advanced** $49/mo | **Enterprise** $99/mo |
 |---|---|---|---|---|
 | Seats | 1 | 1 | up to 5 | up to 25 |
-| Local model ceiling | up to 4b | up to
+| Local model ceiling | up to 4b | up to 9b | up to 32b | up to 32b |
 | Daily cloud inference | -- | 200 | 2,000 | 100,000 |
 | Cloud Coder (Web IDE) | -- | 100/day | 1,000/day | 100,000/day |
 | Cloud search | -- | 50/day | 500/day | 100,000/day |
 | Max output tokens | 512 | 1,024 | 2,048 | 4,096 |
 | Cloud fallback | -- | Claude Sonnet 4 | Claude Sonnet 4 | Priority + Sonnet 4 |
-| Grounding verifier | -- |
-| Memory sync (cloud) | -- |
+| Grounding verifier (fact-check AI output) | -- | ✅ | ✅ | ✅ |
+| Memory sync (cloud) | -- | ✅ | ✅ | ✅ |
 | Knowledge / session memory | limited | unlimited | unlimited | unlimited |
-| Analytics dashboard | -- |
-| HIPAA BAA | -- | -- | -- |
+| Analytics dashboard | -- | ✅ | ✅ | ✅ |
+| HIPAA BAA | -- | -- | -- | ✅ |

-14-day free trial on paid plans.
+14-day free trial on paid plans. 25+ seats: [contact sales](https://synalux.ai/support)

 ---

@@ -324,6 +329,8 @@ prism register-models # alias dcostenco/prism-coder:* -> prism-coder:*

 ## Companions

+Prism works alongside these tools — use whichever fits your workflow.
+
 ### Web IDE — Prism Coder

 A browser-based IDE at [synalux.ai/coder](https://synalux.ai/coder). Import any GitHub repo and get:
@@ -358,13 +365,16 @@ code --install-extension synalux-ai.synalux

 [](https://marketplace.visualstudio.com/items?itemName=synalux-ai.synalux)

-
-
-**Clinical features (BCBA / healthcare):** SOAP note generator, role-based access, document signing, patient board. Voice recording with AES-256-GCM encryption (consent-gated, off by default, plaintext deleted after encryption).
+AI chat, voice input, SOAP note generator, team collaboration, and video calls — all inside VS Code. Routes through local Ollama by default; cloud on paid tiers.

-
+<details>
+<summary>Feature details</summary>

-**
+- **AI**: Chat participant (`@synalux`), multi-agent pipeline, voice input, model switching, 10 tones
+- **Clinical**: SOAP note generator, role-based access, document signing, patient board
+- **Collaboration**: Team chat, DMs, video calls, customer board, visual builder, DevContainers
+- **Privacy**: Local Ollama by default. `preferLocal=true` tries local first. Enterprise BAA available.
+</details>

 ### Prism AAC

@@ -374,6 +384,28 @@ See [github.com/dcostenco/prism-aac](https://github.com/dcostenco/prism-aac)

 ---

+## Git Hooks (Portable)
+
+Pre-commit and pre-push security hooks that work with any editor, any AI tool, and direct CLI. No Claude Code dependency.
+
+```bash
+# Install in all repos (one-time)
+bash synalux-private/scripts/install-git-hooks.sh
+
+# Or install manually in a single repo
+cp hooks/pre-commit .git/hooks/pre-commit && chmod +x .git/hooks/pre-commit
+cp hooks/pre-push .git/hooks/pre-push && chmod +x .git/hooks/pre-push
+```
+
+| Hook | What it checks | Mode |
+|------|----------------|------|
+| `pre-commit` | Dead code, orphan services, scaffold code, missing auth | `PRECOMMIT_MODE=advisory\|block\|off` |
+| `pre-push` | 19-rule security audit (SSRF, SQL injection, secrets, IDOR, etc.) | `PREPUSH_MODE=advisory\|block\|off` |
+
+Default mode is `advisory` (warn but allow). Set `*_MODE=block` for hard enforcement. Hooks look for full audit scripts in the repo first (`hooks/lib/`), then `~/.claude/hooks/` fallback, then minimal inline checks.
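
The mode semantics in the README line above can be sketched as a small decision function. The real hooks are shell scripts whose source is not part of this diff, so everything here is an assumption made for illustration: the function names are hypothetical, and falling back to `advisory` for an unrecognised value is a guess rather than documented behaviour.

```javascript
// Sketch of the documented mode semantics. Function names and the fallback
// for unrecognised values are assumptions, not taken from the hook scripts.
const MODES = ["advisory", "block", "off"];

// Normalise an environment value such as PREPUSH_MODE to a known mode.
function resolveMode(value) {
  const mode = String(value ?? "").trim().toLowerCase();
  return MODES.includes(mode) ? mode : "advisory";
}

// Exit code a hook would return: non-zero only when blocking on findings.
function hookExitCode(mode, findingCount) {
  return mode === "block" && findingCount > 0 ? 1 : 0;
}
```

Git aborts a commit or push when the hook exits non-zero, so returning `0` in `advisory` mode is what makes the hook "warn but allow".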
+
+---
+
 ## Self-hosting (Enterprise)

 Run the full model stack on your own hardware — no cloud, full data sovereignty.
@@ -381,11 +413,11 @@ Run the full model stack on your own hardware — no cloud, full data sovereignt
 **Requirements:** Mac M2 Pro+ (48 GB recommended) or Linux + NVIDIA GPU, plus [Ollama](https://ollama.com).

 ```bash
-ollama pull dcostenco/prism-coder:
+ollama pull dcostenco/prism-coder:9b # default router
 export LOCAL_LLM_URL=http://localhost:11434
 ```

-Routing is automatic: `
+Routing is automatic: `9b → 4b → cloud fallback` on desktop/server, `2b → cloud fallback` on mobile/iPhone. For iOS or another machine on the same network, run `OLLAMA_HOST=0.0.0.0 ollama serve` and point `LOCAL_LLM_URL` at the host's IP.
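
The two cascade orders quoted above can be written down as data plus a filter. This is a sketch under assumptions: `planCascade` is a hypothetical name, the platform is passed in by the caller because the diff does not show how it is detected, and skipping tiers that are not pulled is inferred from the `prism_infer` description rather than confirmed by this README line.

```javascript
// Sketch of the documented cascade order. `planCascade` and its inputs are
// hypothetical; platform detection is outside the scope of this example.
const CASCADES = {
  desktop: ["9b", "4b", "cloud"],
  mobile: ["2b", "cloud"],
};

// Return the backends to try, in order: local tiers that are not pulled are
// skipped, and the cloud step is dropped when fallback is disabled.
function planCascade(platform, pulledTiers, cloudFallback) {
  return CASCADES[platform].filter((step) =>
    step === "cloud" ? cloudFallback : pulledTiers.includes(step)
  );
}
```

An empty result means nothing can serve the request, which a caller would need to surface as an error instead of silently reaching for the cloud.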

 ---

package/dist/cli.js
CHANGED
@@ -521,10 +521,10 @@ scmCmd
 });
 // ─── prism register-models ────────────────────────────────────
 // Convenience: alias namespaced HF-style prism-coder tags
-// (`dcostenco/prism-coder:
+// (`dcostenco/prism-coder:9b`) to the bare tags (`prism-coder:9b`)
 // some external tooling expects. The MCP picker handles both forms
 // natively as of v15.5, so this command is OPTIONAL — useful only
-// when a user wants to run `ollama run prism-coder:
+// when a user wants to run `ollama run prism-coder:9b` directly,
 // or for tools that pre-date the picker's namespace fallback.
 program
 .command('register-models')
package/dist/storage/sqlite.js
CHANGED
@@ -1268,7 +1268,7 @@ export class SqliteStorage {
 FROM session_ledger
 WHERE project = ? AND user_id = ? AND role = ?
 AND event_type = 'correction'
-AND importance >=
+AND importance >= 0
 AND deleted_at IS NULL
 AND archived_at IS NULL
 ORDER BY importance DESC
@@ -2323,10 +2323,12 @@ export class SqliteStorage {
 SET importance = MAX(0, importance - 1)
 WHERE project = ? AND user_id = ?
 AND importance > 0
+AND importance < 10
 AND event_type != 'session'
 AND created_at < datetime('now', '-' || ? || ' days')
+AND (last_accessed_at IS NULL OR last_accessed_at < datetime('now', '-' || ? || ' days'))
 AND deleted_at IS NULL`,
-args: [project, userId, decayDays],
+args: [project, userId, decayDays, decayDays],
 });
 const decayed = result.rowsAffected || 0;
 if (decayed > 0) {
package/dist/tools/behavioralVerifierHandler.js
CHANGED
@@ -10,7 +10,6 @@
 */
 import { PRISM_SYNALUX_BASE_URL, SYNALUX_CONFIGURED } from "../config.js";
 import { getSynaluxJwt } from "../utils/synaluxJwt.js";
-import { debugLog } from "../utils/logger.js";
 const FALLBACK_SCENARIO = [
 "⚠️ BEHAVIORAL VERIFICATION (OFFLINE MODE)",
 "",
@@ -30,7 +29,7 @@ export async function verifyBehaviorHandler(args) {
 }
 const jwt = await getSynaluxJwt();
 if (!jwt) {
-
+console.error("[verify-behavior] ⚠️ JWT unavailable — fail-closed with generic scenario");
 return FALLBACK_SCENARIO;
 }
 try {
@@ -49,14 +48,14 @@ export async function verifyBehaviorHandler(args) {
 signal: AbortSignal.timeout(5_000),
 });
 if (!res.ok) {
-
+console.error(`[verify-behavior] ⚠️ portal returned ${res.status} — fail-closed. URL: ${url}`);
 return FALLBACK_SCENARIO;
 }
 const data = (await res.json());
 return formatResult(data);
 }
 catch (err) {
-
+console.error(`[verify-behavior] ⚠️ VERIFICATION FAILED: ${err.message} — using generic fallback`);
 return FALLBACK_SCENARIO;
 }
 }
package/dist/tools/ledgerHandlers.js
CHANGED
@@ -977,15 +977,17 @@ export async function sessionLoadContextHandler(args) {
 // Build the response object before v4.0 augmentations
 // SECURITY: Wrap output in boundary tags to prevent context confusion.
 // The LLM sees <prism_memory context="historical"> and knows this is data, not instructions.
-
-//
-//
-// formatted output so the agent sees them prominently.
+// ─── v19.1: Behavioral Warnings — BEFORE skills (protected from truncation) ───
+// Corrections must surface prominently. Placed before skillBlock so the
+// skill budget cannot push them out. Capped at 2,000 chars.
 const behavWarnings = data?.behavioral_warnings;
+let behavBlock = '';
 if (behavWarnings && behavWarnings.length > 0) {
-
+const rawBlock = `\n\n[⚠️ BEHAVIORAL WARNINGS — DO NOT IGNORE]\n` +
 behavWarnings.map(w => `- ${w.summary} (importance: ${w.importance})`).join("\n");
+behavBlock = [...rawBlock].slice(0, 2000).join('');
 }
+let responseText = `${MEMORY_BOUNDARY_PREFIX}📋 Session context for "${project}" (${level}):\n\n${formattedContext.trim()}${splitBrainWarning}${driftReport}${briefingBlock}${sdmRecallBlock}${greetingBlock}${visualMemoryBlock}${behavBlock}${skillBlock}${versionNote}`;
 // ─── v9.4.7: ABA Precision Protocol (foundational) ────────
 // Injected into EVERY session load so the agent always operates
 // under these behavioral rules. Never truncated (placed before
package/dist/tools/prismInferHandler.js
CHANGED
@@ -2,7 +2,7 @@
 * prism_infer — local-first inference tool
 * ─────────────────────────────────────────────────────────────
 * Save the caller's cloud tokens by routing to a local prism-coder
-* model via Ollama. Tiers (32B/
+* model via Ollama. Tiers (32B/9B/8B/1.7B) auto-selected by free
 * RAM, then capped by `model_ceiling` and the set of tags that are
 * actually pulled into Ollama.
 *
@@ -12,7 +12,7 @@
 * 4. On local fail, if cloud_fallback=true:
 * - exchange synalux_sk_ → JWT (cached)
 * - POST synalux portal /api/v1/prism-aac/inference
-* - portal runs its own cascade (
+* - portal runs its own cascade (9B/32B/Claude by tier)
 * 5. Return { output, backend, model_picked, ram_free_mb, latency_ms, used_cloud }
 *
 * `prism_infer` is a thin client. It never calls Anthropic / OpenRouter
@@ -24,16 +24,15 @@ import { getSynaluxJwt, invalidateSynaluxJwt } from "../utils/synaluxJwt.js";
 import { getAvailableMemoryBytes } from "../utils/availableMemory.js";
 import { PRISM_SYNALUX_BASE_URL, PRISM_LOCAL_LLM_URL, } from "../config.js";
 import { debugLog } from "../utils/logger.js";
-import { verifyGrounding } from "../utils/groundingVerifier.js";
 import { getEntitlements, clampCeiling } from "../utils/entitlements.js";
 import { ddLog } from "../utils/ddLogger.js";
 // ─── Tool Definition ────────────────────────────────────────────
 export const PRISM_INFER_TOOL = {
 name: "prism_infer",
 description: "Run an inference on a local prism-coder model (Ollama) to save cloud tokens. " +
-"Picks the largest viable tier — 32B /
+"Picks the largest viable tier — 32B / 9B / 8B / 1.7B — based on free RAM at call time, " +
 "clamped by `model_ceiling` and what is actually pulled in Ollama. " +
-"Falls through to the synalux portal cloud cascade (
+"Falls through to the synalux portal cloud cascade (9B → 32B → Claude Opus 4.7) " +
 "only when local is unviable AND `cloud_fallback=true`. " +
 "Use this for code generation, summarisation, classification, or any synth task you would " +
 "otherwise hand to the cloud model — it costs $0 when the local hit succeeds.",
@@ -60,8 +59,8 @@ export const PRISM_INFER_TOOL = {
 },
 model_ceiling: {
 type: "string",
-enum: ["32b", "
-description: "Cap the largest tier the picker may select. e.g. '
+enum: ["32b", "9b", "4b", "2b"],
+description: "Cap the largest tier the picker may select. e.g. '9b' forbids 32B even if RAM allows.",
 },
 cloud_fallback: {
 type: "boolean",
@@ -70,7 +69,7 @@ export const PRISM_INFER_TOOL = {
 },
 timeout_ms: {
 type: "number",
-description: "Override per-call timeout. Default scales with model size: 32B=120s,
+description: "Override per-call timeout. Default scales with model size: 32B=120s, 9B=60s, 4B=20s, 1.7B=15s.",
 },
 evidence: {
 type: "array",
@@ -124,7 +123,7 @@ export function isPrismInferArgs(args) {
 if (a.timeout_ms !== undefined && typeof a.timeout_ms !== "number")
 return false;
 if (a.model_ceiling !== undefined &&
-!["32b", "
+!["32b", "9b", "4b", "2b"].includes(a.model_ceiling))
 return false;
 if (a.verify !== undefined && typeof a.verify !== "boolean")
 return false;
@@ -148,8 +147,8 @@ export function isPrismInferArgs(args) {
 // ─── Ollama helpers ────────────────────────────────────────────
 const DEFAULT_TIMEOUTS = {
 "prism-coder:32b": 120_000,
-"prism-coder:
-"
+"prism-coder:9b": 60_000,
+"prism-coder:4b": 20_000,
 "prism-coder:2b": 15_000,
 };
 /** List Ollama-installed tags. Returns null if Ollama unreachable. */
@@ -407,10 +406,10 @@ export async function runInfer(args, deps) {
 */
 async function applyVerification(draft, args, deps, partial) {
 const shouldVerify = args.verify ?? (args.evidence !== undefined && args.evidence.length > 0);
-if (!shouldVerify) {
+if (!shouldVerify || !deps.callVerifier) {
 return { ...partial, output: draft };
 }
-const verifier = deps.callVerifier
+const verifier = deps.callVerifier;
 const outcome = await verifier({
 draft,
 evidence: args.evidence ?? [],
package/dist/utils/entitlements.js
CHANGED
@@ -6,7 +6,7 @@
 * to enforce model ceiling, max_tokens, and feature gates.
 *
 * Unauthenticated users (no SYNALUX_API_KEY) get free-tier defaults.
-* Authenticated users get their plan from the portal (
+* Authenticated users get their plan from the portal (5-minute cache).
 */
 import { getSynaluxJwt } from "./synaluxJwt.js";
 import { PRISM_SYNALUX_BASE_URL, SYNALUX_CONFIGURED } from "../config.js";
|
@@ -32,10 +32,10 @@ const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes
|
|
|
32
32
|
let cache = null;
|
|
33
33
|
let inFlight = null;
|
|
34
34
|
// ── Model tier ordering for ceiling enforcement ───────────────────
|
|
35
|
-
const TIER_ORDER = ["2b", "4b", "
|
|
35
|
+
const TIER_ORDER = ["2b", "4b", "9b", "32b"];
|
|
36
36
|
/**
|
|
37
37
|
* Returns true if `requested` exceeds `ceiling`.
|
|
38
|
-
* e.g. ceilingExceeded("
|
|
38
|
+
* e.g. ceilingExceeded("9b", "4b") → true (9b > 4b ceiling)
|
|
39
39
|
*/
|
|
40
40
|
export function ceilingExceeded(requested, ceiling) {
|
|
41
41
|
const reqIdx = TIER_ORDER.indexOf(requested);
|
|
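The ceiling check above relies on the position of each tier in `TIER_ORDER`. A minimal sketch of that comparison, assuming unknown tiers should be rejected (the hunk does not show how the real function handles them); the function name is hypothetical.

```javascript
// Tier order as shown in the diff, smallest to largest.
const TIER_ORDER = ["2b", "4b", "9b", "32b"];

// Sketch: returns true when `requested` is a larger tier than `ceiling`.
// Assumption: an unrecognised tier on either side counts as exceeding.
function ceilingExceededSketch(requested, ceiling) {
  const reqIdx = TIER_ORDER.indexOf(requested);
  const ceilIdx = TIER_ORDER.indexOf(ceiling);
  if (reqIdx === -1 || ceilIdx === -1) return true;
  return reqIdx > ceilIdx;
}

console.log(ceilingExceededSketch("9b", "4b")); // true
console.log(ceilingExceededSketch("4b", "9b")); // false
```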
@@ -79,12 +79,18 @@ async function fetchEntitlements() {
|
|
|
79
79
|
redirect: "error",
|
|
80
80
|
});
|
|
81
81
|
if (!res.ok) {
|
|
82
|
-
debugLog(`[entitlements] portal HTTP ${res.status}
|
|
82
|
+
debugLog(`[entitlements] portal HTTP ${res.status}`);
|
|
83
|
+
if (cache) {
|
|
84
|
+
debugLog("[entitlements] using last-known-good (safety fail-closed)");
|
|
85
|
+
return cache.entitlements;
|
|
86
|
+
}
|
|
83
87
|
return FREE_ENTITLEMENTS;
|
|
84
88
|
}
|
|
85
89
|
const data = (await res.json());
|
|
86
90
|
if (!data.plan || !data.model_ceiling) {
|
|
87
|
-
debugLog("[entitlements] malformed response
|
|
91
|
+
debugLog("[entitlements] malformed response");
|
|
92
|
+
if (cache)
|
|
93
|
+
return cache.entitlements;
|
|
88
94
|
return FREE_ENTITLEMENTS;
|
|
89
95
|
}
|
|
90
96
|
debugLog(`[entitlements] plan=${data.plan} ceiling=${data.model_ceiling} ` +
|
|
@@ -92,7 +98,14 @@ async function fetchEntitlements() {
|
|
|
92
98
|
return data;
|
|
93
99
|
}
|
|
94
100
|
catch (err) {
|
|
95
|
-
debugLog(`[entitlements] fetch error: ${err instanceof Error ? err.message : String(err)}
|
|
101
|
+
debugLog(`[entitlements] fetch error: ${err instanceof Error ? err.message : String(err)}`);
|
|
102
|
+
// F1 fix: fail-closed — keep last-known-good entitlements on fetch error.
|
|
103
|
+
// Safety controls (grounding_verifier) must not degrade on availability failures.
|
|
104
|
+
if (cache) {
|
|
105
|
+
debugLog("[entitlements] using last-known-good (safety fail-closed)");
|
|
106
|
+
return cache.entitlements;
|
|
107
|
+
}
|
|
108
|
+
debugLog("[entitlements] no cached entitlements — free tier fallback (cold start)");
|
|
96
109
|
return FREE_ENTITLEMENTS;
|
|
97
110
|
}
|
|
98
111
|
}
|
|
@@ -111,7 +124,14 @@ export async function getEntitlements() {
|
|
|
111
124
|
inFlight = (async () => {
|
|
112
125
|
try {
|
|
113
126
|
const ent = await fetchEntitlements();
|
|
114
|
-
cache
|
|
127
|
+
// Only update cache if this is a REAL fetch (not a cached fallback).
|
|
128
|
+
// fetchEntitlements returns cache.entitlements on error — detect by
|
|
129
|
+
// checking if the returned object is the exact same reference.
|
|
130
|
+
const isFallback = cache && ent === cache.entitlements;
|
|
131
|
+
if (!isFallback) {
|
|
132
|
+
cache = { entitlements: ent, expiresAt: Date.now() + CACHE_TTL_MS };
|
|
133
|
+
}
|
|
134
|
+
// On fallback: DON'T refresh expiresAt — let it expire so we retry.
|
|
115
135
|
return ent;
|
|
116
136
|
}
|
|
117
137
|
finally {
|
|
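The last-known-good behaviour in these hunks can be illustrated with a small standalone cache. This is a simplified variant, not the package's code: it handles the fallback directly in the `catch` branch instead of comparing object references, and `createEntitlementCache`, `fetchRemote`, and the `FREE` values are hypothetical.

```javascript
const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes, as in the diff

// Hypothetical free-tier defaults, used only on a cold start.
const FREE = Object.freeze({ plan: "free", model_ceiling: "4b" });

// Sketch: `fetchRemote` is any async function that resolves to entitlements
// or throws. `now` is injectable so expiry can be exercised in tests.
function createEntitlementCache(fetchRemote, now = Date.now) {
  let cache = null;
  return async function getEntitlements() {
    if (cache && cache.expiresAt > now()) return cache.entitlements;
    try {
      const ent = await fetchRemote();
      // Only a real fetch refreshes the expiry.
      cache = { entitlements: ent, expiresAt: now() + CACHE_TTL_MS };
      return ent;
    } catch {
      // Keep the last-known-good value and leave expiresAt stale,
      // so the next call retries the portal.
      return cache ? cache.entitlements : FREE;
    }
  };
}
```

Leaving the expiry untouched on failure is the design point: a paid plan is not downgraded by a transient outage, and the portal is retried on the very next call.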
@@ -1,23 +1,22 @@
|
|
|
1
1
|
/**
|
|
2
2
|
* RAM-Gated Local Model Picker
|
|
3
3
|
* ─────────────────────────────────────────────────────────────
|
|
4
|
-
* Cascade:
|
|
4
|
+
* Cascade: 9b (default) → 4b (verifier) → 2b (mobile) → 32b (complex only).
|
|
5
5
|
*
|
|
6
|
-
* The default ceiling is "
|
|
7
|
-
* -
|
|
6
|
+
* The default ceiling is "9b" — NOT "32b". This means:
|
|
7
|
+
* - 9b is the primary model for routing + general inference (Qwen3.5-9B, 100% BFCL)
|
|
8
8
|
* - 4b is used as the grounding verifier (fast, small)
|
|
9
|
-
* - 2b is the mobile/iPhone first gate (Qwen3.5-
|
|
9
|
+
* - 2b is the mobile/iPhone first gate (Qwen3.5-2B, 99.1% BFCL)
|
|
10
10
|
* - 32b is only loaded when caller explicitly passes ceiling="32b"
|
|
11
11
|
* or when the task requires maximum quality (complex code gen, etc.)
|
|
12
12
|
*
|
|
13
|
-
* This saves
|
|
14
|
-
* The 14b achieves 100% on eval_300 — same as 32b.
|
|
13
|
+
* This saves 13GB+ RAM vs 32b and keeps response times fast.
|
|
15
14
|
*
|
|
16
15
|
* tag weights need free ctx role
|
|
17
16
|
* prism-coder:32b ~19 GB ≥ 24 GB 32K complex (on-demand)
|
|
18
|
-
* prism-coder:
|
|
19
|
-
*
|
|
20
|
-
* prism-coder:2b ~ 2.3 GB ≥ 3 GB 8K mobile / iPhone (
|
|
17
|
+
* prism-coder:9b ~ 5.8 GB ≥ 8 GB 32K default router (Qwen3.5, 100% BFCL)
|
|
18
|
+
* prism-coder:4b ~ 3.4 GB ≥ 5 GB 32K verifier (Qwen3.5, 100%)
|
|
19
|
+
* prism-coder:2b ~ 2.3 GB ≥ 3 GB 8K mobile / iPhone (Qwen3.5, 99.1%)
|
|
21
20
|
*
|
|
22
21
|
* Below 3 GB free → no local pick (caller must use cloud).
|
|
23
22
|
*/
|
|
@@ -28,8 +27,8 @@ const GB = 1024 ** 3;
|
|
|
28
27
|
*/
|
|
29
28
|
export const MODEL_TIERS = [
|
|
30
29
|
{ tag: 'prism-coder:32b', weightsGb: 19, minFreeGb: 24, ctxTokens: 32_768 },
|
|
31
|
-
{ tag: 'prism-coder:
|
|
32
|
-
{ tag: '
|
|
30
|
+
{ tag: 'prism-coder:9b', weightsGb: 5.8, minFreeGb: 8, ctxTokens: 32_768 },
|
|
31
|
+
{ tag: 'prism-coder:4b', weightsGb: 3.4, minFreeGb: 5, ctxTokens: 32_768 },
|
|
33
32
|
{ tag: 'prism-coder:2b', weightsGb: 2.3, minFreeGb: 3, ctxTokens: 8_192 },
|
|
34
33
|
];
|
|
35
34
|
/**
|
|
@@ -43,14 +42,14 @@ export const MODEL_TIERS = [
|
|
|
43
42
|
function tagMatches(installed, tierTag) {
|
|
44
43
|
return installed === tierTag || installed.endsWith(`/${tierTag}`);
|
|
45
44
|
}
|
|
46
|
-
/** Default ceiling:
|
|
47
|
-
export const DEFAULT_CEILING = "
|
|
45
|
+
/** Default ceiling: 9b. Pass ceiling="32b" explicitly for max quality. */
|
|
46
|
+
export const DEFAULT_CEILING = "9b";
|
|
48
47
|
/**
|
|
49
48
|
* Pick the best viable tier for the given free RAM.
|
|
50
|
-
* Default ceiling is
|
|
49
|
+
* Default ceiling is 9b — use ceiling="32b" only for complex tasks.
|
|
51
50
|
*
|
|
52
51
|
* @param freeBytes Result of os.freemem() — binary bytes
|
|
53
|
-
* @param ceiling Cap tier. Default "
|
|
52
|
+
* @param ceiling Cap tier. Default "9b". Pass "32b" for complex tasks.
|
|
54
53
|
* @param available Optional whitelist of installed Ollama tags.
|
|
55
54
|
*/
|
|
56
55
|
export function pickLocalModel(freeBytes, ceiling, available) {
|
|
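The body of `pickLocalModel` is not part of this diff, so the following is only a sketch of how a RAM-gated pick over the `MODEL_TIERS` table could work. The `CEILING_ORDER` array, the function name, and the way the size suffix is parsed from the tag are assumptions.

```javascript
const GB = 1024 ** 3;

// Tier table as shown in the diff, largest first.
const MODEL_TIERS = [
  { tag: "prism-coder:32b", weightsGb: 19, minFreeGb: 24, ctxTokens: 32_768 },
  { tag: "prism-coder:9b", weightsGb: 5.8, minFreeGb: 8, ctxTokens: 32_768 },
  { tag: "prism-coder:4b", weightsGb: 3.4, minFreeGb: 5, ctxTokens: 32_768 },
  { tag: "prism-coder:2b", weightsGb: 2.3, minFreeGb: 3, ctxTokens: 8_192 },
];

// Assumed ordering used to apply the ceiling.
const CEILING_ORDER = ["2b", "4b", "9b", "32b"];

// Sketch: returns the largest tier that fits under the ceiling, in free RAM,
// and (when a whitelist is given) among the installed tags; null otherwise.
function pickLocalModelSketch(freeBytes, ceiling = "9b", available) {
  const maxIdx = CEILING_ORDER.indexOf(ceiling);
  for (const tier of MODEL_TIERS) {
    const size = tier.tag.split(":")[1];
    if (CEILING_ORDER.indexOf(size) > maxIdx) continue; // above ceiling
    if (freeBytes < tier.minFreeGb * GB) continue; // not enough free RAM
    if (
      available &&
      !available.some((t) => t === tier.tag || t.endsWith(`/${tier.tag}`))
    ) {
      continue; // not installed
    }
    return tier;
  }
  return null; // caller falls back to cloud
}

console.log(pickLocalModelSketch(16 * GB).tag); // prism-coder:9b
console.log(pickLocalModelSketch(2 * GB)); // null
```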
@@ -15,8 +15,9 @@ export class Gatekeeper {
|
|
|
15
15
|
console.warn(`\n⚠️ [OVERRIDDEN] Verification Gate bypassed via administrator override.`);
|
|
16
16
|
// Enforce immutability and record audit trail context via environment variables
|
|
17
17
|
validatedResult.gate_override = true;
|
|
18
|
+
// F19 fix: process.env.USER is trivially spoofable — log it but note it's unauthenticated.
|
|
18
19
|
const actor = process.env.USER || process.env.USERNAME || 'unknown_user';
|
|
19
|
-
validatedResult.override_reason = validatedResult.override_reason || `CLI --force bypass
|
|
20
|
+
validatedResult.override_reason = validatedResult.override_reason || `CLI --force bypass (unauthenticated env.USER=${actor})`;
|
|
20
21
|
return { canContinue: true, validatedResult };
|
|
21
22
|
}
|
|
22
23
|
switch (validatedResult.gate_action) {
|
|
@@ -196,7 +196,12 @@ export class VerificationRunner {
|
|
|
196
196
|
* Throws an error if the hash does not match, ensuring test integrity.
|
|
197
197
|
*/
|
|
198
198
|
static verifyRubricHash(tests, harness) {
|
|
199
|
-
|
|
199
|
+
// F11 fix: include min_pass_rate in hash verification when harness has it.
|
|
200
|
+
// Try with min_pass_rate first; fall back to without for backward compat.
|
|
201
|
+
const minRate = harness.min_pass_rate;
|
|
202
|
+
const computed = minRate !== undefined
|
|
203
|
+
? computeRubricHash(tests, minRate)
|
|
204
|
+
: computeRubricHash(tests);
|
|
200
205
|
if (computed !== harness.rubric_hash) {
|
|
201
206
|
throw new Error(`Rubric hash mismatch. Expected ${harness.rubric_hash}, but computeRubricHash returned ${computed}. The tests have been modified since the harness was created.`);
|
|
202
207
|
}
|
|
@@ -405,7 +410,7 @@ export class VerificationRunner {
|
|
|
405
410
|
if (!targetCheck.ok) {
|
|
406
411
|
return { passed: false, error: `HTTP target blocked: ${targetCheck.reason}` };
|
|
407
412
|
}
|
|
408
|
-
const res = await fetch(a.target);
|
|
413
|
+
const res = await fetch(a.target, { redirect: "error" });
|
|
409
414
|
return res.status === a.expected
|
|
410
415
|
? { passed: true }
|
|
411
416
|
: { passed: false, error: `Expected status ${a.expected}, got ${res.status} for ${a.target}` };
|
|
@@ -56,8 +56,16 @@ export const TestSuiteSchema = z.object({
|
|
|
56
56
|
* @param tests - The array of TestAssertion to hash
|
|
57
57
|
* @returns Lowercase hex SHA-256 digest
|
|
58
58
|
*/
|
|
59
|
-
export function computeRubricHash(tests) {
|
|
59
|
+
export function computeRubricHash(tests, minPassRate) {
|
|
60
60
|
const sorted = [...tests].sort((a, b) => a.id.localeCompare(b.id));
|
|
61
|
+
// F11 fix: when minPassRate is provided, include it in the hash so the
|
|
62
|
+
// threshold can't be changed without invalidating the rubric.
|
|
63
|
+
// When omitted, hash only tests (backward compatible with existing harnesses).
|
|
64
|
+
if (minPassRate !== undefined) {
|
|
65
|
+
return createHash("sha256")
|
|
66
|
+
.update(JSON.stringify({ tests: sorted, min_pass_rate: minPassRate }))
|
|
67
|
+
.digest("hex");
|
|
68
|
+
}
|
|
61
69
|
return createHash("sha256")
|
|
62
70
|
.update(JSON.stringify(sorted))
|
|
63
71
|
.digest("hex");
|
|
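The effect of folding `min_pass_rate` into the rubric hash can be checked with a short standalone script. This sketch follows the two branches shown in the hunk; the function name and the sample assertions are hypothetical, and CommonJS `require` is used here only to keep the snippet self-contained.

```javascript
const { createHash } = require("node:crypto");

// Sketch of the two hashing modes shown above.
function rubricHashSketch(tests, minPassRate) {
  // Sort by id so the digest does not depend on input order.
  const sorted = [...tests].sort((a, b) => a.id.localeCompare(b.id));
  const payload =
    minPassRate !== undefined
      ? { tests: sorted, min_pass_rate: minPassRate }
      : sorted; // legacy form: tests only
  return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
}

// Hypothetical assertions for illustration.
const tests = [
  { id: "b", expected: 200 },
  { id: "a", expected: 404 },
];

const legacy = rubricHashSketch(tests);
const strict = rubricHashSketch(tests, 0.9);
const loosened = rubricHashSketch(tests, 0.5);

console.log(legacy !== strict); // true
console.log(strict !== loosened); // true
```

Because the threshold is part of the hashed payload, lowering it from 0.9 to 0.5 produces a different digest, which is what lets `verifyRubricHash` detect the change.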
@@ -44,6 +44,18 @@ export function resolveEffectiveSeverity(assertionSeverity, defaultSeverity) {
|
|
|
44
44
|
*/
|
|
45
45
|
export function evaluateSeverityGates(results, config) {
|
|
46
46
|
const failures = results.filter(r => !r.passed && !r.skipped);
|
|
47
|
+
// F10 fix: skipped critical (gate/abort) assertions count as failures.
|
|
48
|
+
// Crafting depends_on to skip critical checks must not neutralize the gate.
|
|
49
|
+
const skippedCritical = results.filter(r => r.skipped && (r.severity === 'gate' || r.severity === 'abort'));
|
|
50
|
+
if (skippedCritical.length > 0) {
|
|
51
|
+
const ids = skippedCritical.map(r => r.id).join(", ");
|
|
52
|
+
const hasAbort = skippedCritical.some(r => r.severity === 'abort');
|
|
53
|
+
return {
|
|
54
|
+
action: hasAbort ? "abort" : "block",
|
|
55
|
+
failed_assertions: skippedCritical,
|
|
56
|
+
summary: `${hasAbort ? 'ABORT' : 'BLOCKED'}: ${skippedCritical.length} critical assertion(s) were skipped [${ids}] — treating as failures.`
|
|
57
|
+
};
|
|
58
|
+
}
|
|
47
59
|
if (failures.length === 0) {
|
|
48
60
|
return {
|
|
49
61
|
action: "continue",
|
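The skipped-critical check added to `evaluateSeverityGates` can be exercised in isolation. A minimal sketch of just that early return, assuming result objects with `id`, `severity`, `passed`, and `skipped` fields; the function name and the sample data are hypothetical, as is the `"warn"` severity used for the non-critical case.

```javascript
// Sketch: returns a blocking decision when any gate/abort assertion was
// skipped, or null when the normal failure evaluation should proceed.
function skippedCriticalGate(results) {
  const skippedCritical = results.filter(
    (r) => r.skipped && (r.severity === "gate" || r.severity === "abort")
  );
  if (skippedCritical.length === 0) return null;

  const hasAbort = skippedCritical.some((r) => r.severity === "abort");
  return {
    action: hasAbort ? "abort" : "block",
    failed_assertions: skippedCritical,
  };
}

// Hypothetical results: one critical check skipped via depends_on.
const sample = [
  { id: "lint", severity: "warn", passed: true, skipped: false },
  { id: "auth", severity: "gate", passed: false, skipped: true },
];

console.log(skippedCriticalGate(sample).action); // block
```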
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "prism-mcp-server",
|
|
3
|
-
"version": "19.0.
|
|
3
|
+
"version": "19.0.1",
|
|
4
4
|
"mcpName": "io.github.dcostenco/prism-coder",
|
|
5
5
|
"description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 114 Agent Skills, PHI Guard, Tier Enforcement, Prompt-Based Skill Routing, Zero-Search HDC/HRR retrieval, HRR Semantic Drift Detection across BCBA/Coding/AAC domains, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder 1.7B–32B open-weights LLM fleet.",
|
|
6
6
|
"module": "index.ts",
|