prism-mcp-server 15.6.0 → 15.6.1
- package/README.md +13 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -106,7 +106,7 @@ The routing cascade validates each response against the 6 known tool names and e
 | Model | Accuracy | Cost/req | Latency | Runs on | AAC | Edge cases |
 |---|---|---|---|---|---|---|
 | Claude Sonnet 4 | **99%** | ~$0.01 | 3.2s | Cloud | 100% | 83% |
-| **prism-coder:32b**
+| **prism-coder:32b** swe14 | **100.0%** | **$0** | 1.4s | Mac 24GB+ | **100%** | **100%** |
 | **prism-coder:8b** v36 | **100.0%** | **$0** | **0.8s** | iPhone/iPad 8GB | **100%** | **100%** |
 | **prism-coder:14b** v36 | **100.0%** | **$0** | **1.1s** | Mac 24GB+ / iPad Pro 16GB | **100%** | **100%** |
 | Claude Opus 4.7 | **98.3%** | ~$0.05 | 3.0s | Cloud | 100% | 83% |
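The hunk header above mentions a routing cascade that validates each response against a set of known tool names, and the table lists several model tiers. As a rough illustration only, a tiered fallback of that kind might look like the sketch below. Everything in it is an assumption: `KNOWN_TOOLS`, `routeWithCascade`, the stub routers, and the placeholder tool names are hypothetical and do not come from prism-mcp-server, and a real router would call a model asynchronously.

```typescript
// Hypothetical sketch: none of these names come from prism-mcp-server.
type Router = (prompt: string) => string;

// Placeholder tool names; the real taxonomy is not listed in this diff.
const KNOWN_TOOLS: ReadonlySet<string> = new Set(["tool_a", "tool_b", "NO_TOOL"]);

// Try each tier in order and accept the first answer that names a known tool.
function routeWithCascade(prompt: string, tiers: Router[]): string | null {
  for (const tier of tiers) {
    const answer = tier(prompt).trim();
    if (KNOWN_TOOLS.has(answer)) {
      return answer;
    }
  }
  return null; // every tier produced an unrecognized tool name
}

// Stub tiers standing in for a smaller and a larger model.
const small: Router = (p) => (p.includes("hard") ? "not_a_tool" : "tool_a");
const large: Router = () => "tool_b";

console.log(routeWithCascade("easy request", [small, large])); // tool_a
console.log(routeWithCascade("hard request", [small, large])); // tool_b
```

In this sketch the larger tier is only consulted when the smaller tier's answer fails validation, which keeps the common case on the faster model.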
@@ -115,12 +115,24 @@ The routing cascade validates each response against the 6 known tool names and e
 
 ¹ ~99% of requests served by 14B at 1.1s; 32B for the ~1% 14B misses.
 
+**Extended eval — eval_300** (300 cases, 17 tools + NO_TOOL, 9 categories, 3-seed validated, May 2026):
+
+| Model | eval_300 strict | Categories |
+|---|---|---|
+| **prism-coder:32b** swe14 | **300/300 (100%)** | abstention 20/20, adversarial 70/70, cascade 25/25, disambiguation 40/40, edge_case 25/25, multi_intent 20/20, natural_phrasing 50/50, param_extraction 25/25, verifier 25/25 |
+| **prism-coder:14b** s17 | **299/300 (99.7%)** | 1 failure in adversarial_trap |
+
+The eval_300 suite covers natural phrasing, adversarial traps (CS/meta questions that should NOT trigger tools), disambiguation between similar tools, edge cases (single-word prompts), multi-intent cascades, parameter extraction, and verifier-style prompts.
+
 **Why this matters for a life-critical AAC app**: a child in a hospital without WiFi, a nonverbal adult on an airplane, or a family on a budget gets Claude-grade routing accuracy with zero cloud dependency — and the AAC path (expressing pain, asking for help) routes correctly **100% of the time across all tiers and all seeds tested**.
 
 **What it does NOT mean**: these scores measure routing precision on a narrow 6-tool taxonomy, not general intelligence. Claude outperforms these models on everything outside this task. The value is **offline reliability at zero cost**, not replacing Claude.
 
 > **The prompt engineering breakthrough**: Q4_K_M quantized models confuse semantically similar tool names when routing rules use plain keyword lists. Two structural fixes eliminated all confusion: (1) replacing `-> plain text` with `-> respond directly (no tool)`, and (2) adding category labels (`CONVERSATION RECALL:` / `SAVED KNOWLEDGE:`) as semantic anchors stronger than keyword matching. Combined effect: 14B went from 87% → 100% on the 102-case Prism eval (v36/v7 system prompt, 3-seed mean).
 
+### 🔍 L3 Grounding Verifier
+When `prism_infer` receives an `evidence` payload, the grounding verifier automatically checks the model's response against the provided evidence before returning to the caller. Unverified or hallucinated claims are flagged. This is the third layer (L3) of the cascade — after tool routing (L1) and confidence gating (L2).
+
 ### ⚡ Zero-search retrieval
 Holographic Reduced Representations (HRR) for instant similarity lookups without an index. ~5ms over 100K memories.
 
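The eval_300 figures added in the README hunk can be cross-checked with a little arithmetic. The sketch below copies the per-category counts from the prism-coder:32b row, sums them, and recomputes the two percentages. The helper names (`totals`, `percent`) are illustrative and not part of the package.

```typescript
// Per-category [passed, total] counts copied from the prism-coder:32b row.
const categories: Record<string, [number, number]> = {
  abstention: [20, 20],
  adversarial: [70, 70],
  cascade: [25, 25],
  disambiguation: [40, 40],
  edge_case: [25, 25],
  multi_intent: [20, 20],
  natural_phrasing: [50, 50],
  param_extraction: [25, 25],
  verifier: [25, 25],
};

// Sum the passed and total counts across all categories.
function totals(rows: Record<string, [number, number]>): { passed: number; total: number } {
  let passed = 0;
  let total = 0;
  for (const [p, t] of Object.values(rows)) {
    passed += p;
    total += t;
  }
  return { passed, total };
}

// Format a pass rate with one decimal place, as the table does.
function percent(passed: number, total: number): string {
  return ((100 * passed) / total).toFixed(1) + "%";
}

const all = totals(categories);
console.log(Object.keys(categories).length); // 9 categories
console.log(all.passed, all.total);          // 300 300
console.log(percent(299, 300));              // 99.7%
```

The nine categories sum to the stated 300 cases, and 299/300 rounds to the 99.7% reported for prism-coder:14b.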
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "prism-mcp-server",
-"version": "15.6.
+"version": "15.6.1",
 "mcpName": "io.github.dcostenco/prism-coder",
 "description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder:7b / 14b open-weights LLM fleet.",
 "module": "index.ts",