npm - prism-mcp-server - Versions diffs - 17.0.1 → 17.1.1 - Mend

prism-mcp-server 17.0.1 → 17.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

package/README.md +78 -25
package/dist/tools/ledgerHandlers.js +37 -7
package/dist/tools/prismInferHandler.js +46 -6
package/dist/tools/skillRouting.js +39 -0
package/dist/utils/entitlements.js +137 -0
package/dist/utils/phiGuard.js +88 -0
package/package.json +2 -2
package/dist/agent/agentTools.js +0 -453
package/dist/agent/mcpBridge.js +0 -234
package/dist/agent/platformUtils.js +0 -470
package/dist/agent/terminalUI.js +0 -198
package/dist/auth.js +0 -218
package/dist/darkfactory/cloudDelegate.js +0 -173
package/dist/env-preload.cjs +0 -2
package/dist/plugins/pluginManager.js +0 -199
package/dist/prism-cloud.js +0 -110
package/dist/scm/ciPipeline.js +0 -220
package/dist/start-with-env.sh +0 -5
package/dist/sync/encryptedSync.js +0 -172
package/dist/sync/synaluxProxy.js +0 -177
package/dist/tools/adaptiveDefinitions.js +0 -148
package/dist/tools/projects.js +0 -214
package/dist/utils/changelogGenerator.js +0 -158
package/dist/utils/fallbackClient.js +0 -52
package/dist/utils/memoryAttestation.js +0 -163
package/dist/utils/rbac.js +0 -321
package/dist/utils/tavilyApi.js +0 -70
package/dist/vm/quotaEnforcer.js +0 -192

package/README.md CHANGED

@@ -92,7 +92,7 @@ prism-coder:4b ── verifies claims ──────────▶  grounde
 prism-coder:32b ── deep reasoning ──────────▶  serve  (~8s, 19GB, FREE)
   │
   ▼  (cloud fallback when local insufficient)
-Claude Sonnet 4 → Claude Opus 4.7 ─────────▶  serve  (cloud, ~$0.01/req)
+Claude Sonnet 4 ────────────────────────────▶  serve  (cloud, ~$0.01/req)
 ```
 | Tier | Model | Role | RAM | Latency | Cost |
@@ -100,7 +100,7 @@ Claude Sonnet 4 → Claude Opus 4.7 ─────────▶  serve  (clou
 | **Default** | prism-coder:14b | Router + general inference | 9 GB | ~3s | $0 |
 | **Verifier** | prism-coder:4b | Grounding claims check | 2.5 GB | <1s | $0 |
 | **Complex** | prism-coder:32b | Deep reasoning (on-demand) | 19 GB | ~8s | $0 |
-| **Cloud** | Sonnet → Opus | Fallback for max quality | — | ~5-10s | ~$0.01 |
+| **Cloud** | Claude Sonnet 4 | Fallback for max quality | — | ~5-10s | ~$0.01 |
 **Mobile / offline cascade** (Prism AAC iOS):
 ```
@@ -200,7 +200,7 @@ HRR acts as Tier 0 — if confidence is high, FTS5 is skipped entirely. Falls th
 Top-1 = correct word is tile #1. MRR = Mean Reciprocal Rank. Zero Top-5 regressions in any scenario. HRR encodes bigrams + trigrams from every spoken phrase; probes take ~0.2ms — safe on every keystroke. All Synalux apps (clinical, AAC, PrismCoach) share HRR via the portal `/api/v1/hrr` endpoint.
-**Competitive comparison:**
+**Memory retrieval comparison:**
 | System | Retrieval | Offline | Cost | Latency |
 |--------|-----------|---------|------|---------|
@@ -214,6 +214,74 @@ Multiple AI agents share the same Mind Palace. Each agent has a role (dev / qa /
 ---
+## Why Prism Coder
+### vs AI coding assistants
+| Feature | Prism Coder | GitHub Copilot | Cursor | Windsurf | Amazon Q | Tabnine | Devin |
+|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| Local inference (1.7B–32B) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Works offline (local-only mode) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Open-weight models (HuggingFace) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Data stays on machine (local tier) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Persistent cross-session memory | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Cognitive routing (episodic/semantic) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Session drift detection (HRR) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| L3 grounding verifier | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Multi-agent hivemind | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| MCP server (tools + memory for agents) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Cloud fallback (14b → 32b → Sonnet) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Web IDE | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ |
+| VS Code extension | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ |
+| HIPAA / air-gapped ready | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Flat-rate pricing (not per-seat) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
+### vs local AI tools
+| Feature | Prism Coder | Ollama | LM Studio | Jan.ai | Mem0 | Zep |
+|---|:---:|:---:|:---:|:---:|:---:|:---:|
+| Local inference (1.7B–32B cascade) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
+| Automatic cloud fallback | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Persistent cross-session memory | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ |
+| Knowledge ingestion (MCP + webhook + REST) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Cognitive routing (3-store) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| L3 grounding verifier | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Session drift detection | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Native MCP server | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Web IDE + VS Code extension | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
+| Analytics dashboard | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ |
+### Pricing — flat-rate, not per-seat
+| | **Prism Coder** | GitHub Copilot | Cursor | Windsurf | Amazon Q | Tabnine |
+|---|:---:|:---:|:---:|:---:|:---:|:---:|
+| **Individual** | **$19/mo** | $10/mo | $20/mo | $15–20/mo | $19/mo | $39/mo |
+| **Team (5 devs)** | **$49/mo flat** | $95/mo | $200/mo | $200/mo | $95/mo | $295/mo |
+| **Enterprise (25 devs)** | **$99/mo flat** | $195/mo | $1,000/mo | Custom | Custom | Custom |
+| **Cost per dev (team)** | **$9.80** | $19 | $40 | $40 | $19 | $59 |
+| **Annual savings (5 devs)** | — | **$552** | **$1,812** | **$1,812** | **$552** | **$2,952** |
+---
+## Plans
+| | **Free** | **Standard $19/mo** | **Advanced $49/mo** | **Enterprise $99/mo** |
+|---|---|---|---|---|
+| **Seats included** | 1 | 1 | up to 5 | up to 25 |
+| **Local model ceiling** | up to 4b | up to 14b | up to 32b | up to 32b |
+| **Daily inference limit** | 50 | 200 | 2,000 | 100,000 |
+| **Max output tokens** | 512 | 1,024 | 2,048 | 4,096 |
+| **Cloud fallback** | — | Claude Sonnet 4 | Claude Sonnet 4 | Priority + Sonnet 4 |
+| **L3 grounding verifier** | — | ✅ | ✅ | ✅ |
+| **Knowledge search** | limited | unlimited | unlimited | unlimited |
+| **Session memory** | limited | unlimited | unlimited | unlimited |
+| **Analytics dashboard** | — | ✅ | ✅ | ✅ |
+| **HIPAA BAA** | — | — | — | ✅ |
+All on-device models are open-weight and free to run locally via Ollama. The subscription gates cloud features, higher model tiers, and increased limits. Need 25+ seats? [Contact sales](https://synalux.ai/contact). 14-day free trial on all paid plans. [Subscribe →](https://synalux.ai/pricing)
+---
 ## Get started
 ```bash
@@ -502,7 +570,7 @@ ollama pull dcostenco/prism-coder:32b
 Set `LOCAL_LLM_URL=http://localhost:11434` in your portal config. Routing is automatic:
-**Desktop/server**: 14B → 32B → Claude Opus fallback · **Mobile/offline**: 14B → 8B → 1.7B
+**Desktop/server**: 14B → 32B → Claude Sonnet 4 fallback · **Mobile/offline**: 14B → 8B → 1.7B
 iOS/mobile on same WiFi: `OLLAMA_HOST=0.0.0.0 ollama serve` on the Mac, then point `LOCAL_LLM_URL` at the Mac's IP.
 Routing accuracy (May 2026, v36/v7 system prompt, 3-seed mean): 32B v7 = **100.0%** · 8B v36 = **100.0%** · 14B v36 = **100.0%** · 1.7B v42 = **100.0%**
@@ -510,21 +578,6 @@ Cascade (14B→32B): **100.0%** · Opus solo: 98.3% · Opus engaged: **0% of req
 ---
-## Plans
-| Plan | Cloud model | Daily limit | On-device |
-|---|---|---|---|
-| **Free** | — | unlimited local | prism-coder:1.7b (100%) + 8b (100%) + 14b (100%) |
-| **Standard $19/mo** | Claude Sonnet 4 | 200 req | + cloud fallback |
-| **Pro $49/mo** | prism-coder:32b | 2,000 req | + reasoning tier |
-| **Enterprise $99/mo** | prism-coder:32b priority | unlimited | + HIPAA BAA + custom fine-tuning |
-All on-device models are **free for every tier** — no subscription needed for local inference. Offline translation (1,261 phrases × 20 languages) included in all plans.
-[Subscribe →](https://synalux.ai/pricing)
----
 ## What you can build with it
 - **Persistent coding assistant** that remembers your codebase, your decisions, your team's conventions
@@ -541,17 +594,17 @@ All on-device models are **free for every tier** — no subscription needed for
 **[synalux.ai/prism-mcp](https://synalux.ai/prism-mcp)** — full documentation, dashboard, subscription plans, and model downloads.
-### 💻 Web IDE — Synalux Coder
+### 💻 Web IDE — Prism Coder
-Use Prism Coder directly in your browser — no install required. Local-first IDE with the prism-coder agent built in. Connects to GitHub repos, Synalux Mail, Drive, and Source for cross-product workflows.
+Use Prism Coder directly in your browser — no install, no desktop app required. Standalone coding IDE with the prism-coder agent built in. Works with any Prism plan (no Synalux health subscription needed).
-**[synalux.ai/coder](https://synalux.ai/coder)** · also reachable at **[synalux.ai/prism-ide](https://synalux.ai/prism-ide)**
+**[synalux.ai/coder](https://synalux.ai/coder)**
 | Feature | Detail |
 |---|---|
-| Agent | prism-coder:7b offline · Claude Sonnet 4 (Standard+) · Claude Opus 4 (Enterprise) |
-| Integrations | GitHub repos, Synalux Mail, Drive, Source — same OAuth, no separate accounts |
-| Compliance | Audit log on every turn · PHI redaction · air-gapped offline mode (HIPAA) |
+| Agent | prism-coder:8b offline · Claude Sonnet 4 (Standard+) |
+| Integrations | GitHub repos · same Prism account, no separate sign-up |
+| Plans | Free (4b) · Standard $19/mo (14b) · Advanced $49/mo (32b) · Enterprise $99/mo |
 ### 🧩 VS Code Extension — Synalux
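
The "Cost per dev (team)" and "Annual savings (5 devs)" rows added in the README hunk above appear to follow directly from the listed monthly team prices. A minimal sketch of that arithmetic, assuming a 5-seat team and taking the prices from the README table as given (they are the README's claims, not independently verified vendor prices; the helper names are hypothetical):

```javascript
// Monthly team prices as listed in the README table (5 developers).
const PRISM_TEAM_MONTHLY = 49;
const SEATS = 5;
const competitorTeamMonthly = {
    "GitHub Copilot": 95,
    "Cursor": 200,
    "Windsurf": 200,
    "Amazon Q": 95,
    "Tabnine": 295,
};

// Cost per developer is the monthly team price divided by the seat count.
const perDev = (monthlyTeamPrice) => monthlyTeamPrice / SEATS;

// Annual savings is the monthly price difference multiplied by 12 months.
const annualSavings = (monthlyTeamPrice) => (monthlyTeamPrice - PRISM_TEAM_MONTHLY) * 12;

const savings = Object.fromEntries(
    Object.entries(competitorTeamMonthly).map(([name, price]) => [name, annualSavings(price)]),
);

console.log(perDev(PRISM_TEAM_MONTHLY).toFixed(2)); // per-dev cost on the flat team plan
console.log(savings);
```

Under these inputs the computed values agree with the table rows: 9.80 per developer, and savings of 552, 1,812, and 2,952 per year against the 95, 200, and 295 price points.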

package/dist/tools/ledgerHandlers.js CHANGED

@@ -3,6 +3,7 @@ import * as nodePath from "node:path";
 import * as os from "node:os";
 import { randomUUID } from "node:crypto";
 import { redactSettings, toMarkdown } from "./commonHelpers.js";
+import { scanAndRedactPHI } from "../utils/phiGuard.js";
 import * as fflate from "fflate";
 import { buildVaultDirectory } from "../utils/vaultExporter.js";
 /**
@@ -60,9 +61,11 @@ import { notifyResourceUpdate } from "../server.js";
  * Zero-latency (pure regex, no API calls). Runs on every save.
  */
 export function sanitizeMemoryInput(text) {
-    return text
+    const stripped = text
         .replace(/<\/?(?:system|user_input|instruction|anti_pattern|desired_pattern|assistant|tool_call|prism_memory)[^>]*>/gi, '')
         .trim();
+    // HIPAA: redact PHI before storage — SSN, DOB, MRN, patient names, etc.
+    return scanAndRedactPHI(stripped).redacted;
 }
 /** Sanitize each string in an array (for decisions[], todos[], etc.) */
 function sanitizeArray(arr) {
@@ -853,17 +856,29 @@ export async function sessionLoadContextHandler(args) {
             .fetchSkillContent(missing).catch(() => ({}));
         debugLog(`[session_load_context] Synalux skill content fetched: ${Object.keys(synaluxContent).join(", ") || "none"}`);
     }
+    const SKILL_BLOCK_CAP = 30_000;
+    const skippedSkills = [];
     for (const skillName of skillsToLoad) {
         if (loadedSkills.includes(skillName))
             continue;
-        // Synalux (paid) → platform fallback skill:<name> (free/offline). Never user_skill:.
+        if (skillBlock.length >= SKILL_BLOCK_CAP) {
+            skippedSkills.push(skillName);
+            debugLog(`[session_load_context] Skill "${skillName}" skipped — block cap ${SKILL_BLOCK_CAP} reached`);
+            continue;
+        }
         const content = synaluxContent[skillName] || await getSetting(`skill:${skillName}`, "");
         if (content && content.trim()) {
+            const trimmed = content.trim();
+            if (skillBlock.length + trimmed.length > SKILL_BLOCK_CAP && loadedSkills.length > 0) {
+                skippedSkills.push(skillName);
+                debugLog(`[session_load_context] Skill "${skillName}" skipped — would exceed cap (${skillBlock.length}+${trimmed.length} > ${SKILL_BLOCK_CAP})`);
+                continue;
+            }
             const source = synaluxContent[skillName] ? "synalux" : "local-platform";
-            skillBlock += `\n\n[📜 SKILL: ${skillName}]\n${content.trim()}`;
+            skillBlock += `\n\n[📜 SKILL: ${skillName}]\n${trimmed}`;
             loadedSkills.push(skillName);
             skillLoaded = true;
-            debugLog(`[session_load_context] Skill "${skillName}" loaded (${source}) for project="${project}"`);
+            debugLog(`[session_load_context] Skill "${skillName}" loaded (${source}) for project="${project}" [${skillBlock.length}/${SKILL_BLOCK_CAP} chars]`);
         }
     }
     // ─── User-Local Skills ──────────────────────────────────────
@@ -876,10 +891,15 @@ export async function sessionLoadContextHandler(args) {
         for (const [k, v] of Object.entries(allSettings)) {
             if (!k.startsWith(prefix) || !v)
                 continue;
+            if (skillBlock.length >= SKILL_BLOCK_CAP)
+                break;
             const skillName = k.replace(prefix, "");
             if (loadedSkills.includes(skillName))
                 continue;
-            skillBlock += `\n\n[📜 USER SKILL: ${skillName}]\n${v.trim()}`;
+            const trimmed = v.trim();
+            if (skillBlock.length + trimmed.length > SKILL_BLOCK_CAP && loadedSkills.length > 0)
+                continue;
+            skillBlock += `\n\n[📜 USER SKILL: ${skillName}]\n${trimmed}`;
             loadedSkills.push(skillName);
             skillLoaded = true;
             debugLog(`[session_load_context] User-local skill "${skillName}" loaded`);
@@ -888,22 +908,32 @@ export async function sessionLoadContextHandler(args) {
     // ─── Memory-Based Skill Discovery ──────────────────────────
     // If recent handoff/ledger mentions a platform skill name, auto-load it.
     // Only scans platform skill: keys — user_skill: discovery is not automatic.
-    if (formattedContext.length > 0) {
+    if (formattedContext.length > 0 && skillBlock.length < SKILL_BLOCK_CAP) {
         const contextText = formattedContext.toLowerCase();
         const allSkillKeys = await storage.getAllSettings?.() || {};
         for (const [k, v] of Object.entries(allSkillKeys)) {
             if (!k.startsWith("skill:") || !v)
                 continue;
+            if (skillBlock.length >= SKILL_BLOCK_CAP)
+                break;
             const skillName = k.replace("skill:", "");
             if (loadedSkills.includes(skillName))
                 continue;
             if (contextText.includes(skillName.replace(/-/g, " ")) || contextText.includes(skillName)) {
-                skillBlock += `\n\n[📜 CONTEXT SKILL: ${skillName}]\n${v}`;
+                const trimmed = v.trim();
+                if (skillBlock.length + trimmed.length > SKILL_BLOCK_CAP && loadedSkills.length > 0) {
+                    skippedSkills.push(skillName);
+                    continue;
+                }
+                skillBlock += `\n\n[📜 CONTEXT SKILL: ${skillName}]\n${trimmed}`;
                 loadedSkills.push(skillName);
                 debugLog(`[session_load_context] Context-triggered skill "${skillName}"`);
             }
         }
     }
+    if (skippedSkills.length > 0) {
+        skillBlock += `\n\n[⏭️ ${skippedSkills.length} skills skipped (cap ${SKILL_BLOCK_CAP} chars): ${skippedSkills.join(", ")}]`;
+    }
     // ─── Agent Greeting Block ────────────────────────────────────
     // Shows agent identity (name + role) and skill status after briefing.
     let greetingBlock = "";
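
The hunks above add a 30,000-character cap (`SKILL_BLOCK_CAP`) to the skill block and record the names of skills that did not fit. The admission rule can be sketched in isolation as follows. This is a simplified illustration, not the package code: `appendSkillsWithCap` and its input shape are hypothetical, and the cap is passed as a parameter so the behavior is easy to see with small inputs.

```javascript
// Hypothetical helper: appends skills to a text block until a character cap
// is reached, and records the names of the skills that were skipped.
function appendSkillsWithCap(skills, cap) {
    let block = "";
    const loaded = [];
    const skipped = [];
    for (const [name, content] of skills) {
        const trimmed = content.trim();
        if (!trimmed) continue; // nothing to load for empty content
        // Once the block is at or past the cap, every remaining skill is skipped.
        if (block.length >= cap) {
            skipped.push(name);
            continue;
        }
        // A skill that would overflow the cap is skipped, unless nothing has
        // been loaded yet: the first skill is admitted even if it is oversized.
        if (block.length + trimmed.length > cap && loaded.length > 0) {
            skipped.push(name);
            continue;
        }
        block += `\n\n[SKILL: ${name}]\n${trimmed}`;
        loaded.push(name);
    }
    return { block, loaded, skipped };
}

const result = appendSkillsWithCap(
    [
        ["alpha", "a".repeat(20)],
        ["beta", "b".repeat(50)],
        ["gamma", "c".repeat(5)],
    ],
    60,
);
console.log(result.loaded, result.skipped);
```

With a cap of 60, `alpha` and `gamma` are loaded and `beta` is skipped, because a later, smaller skill can still fit after a larger one was rejected. As in the diff, the overflow check compares only the skill body against the cap, so the header text can push the final block slightly past it.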

package/dist/tools/prismInferHandler.js CHANGED

@@ -25,6 +25,8 @@ import { getAvailableMemoryBytes } from "../utils/availableMemory.js";
 import { PRISM_SYNALUX_BASE_URL, PRISM_LOCAL_LLM_URL, } from "../config.js";
 import { debugLog } from "../utils/logger.js";
 import { verifyGrounding } from "../utils/groundingVerifier.js";
+import { getEntitlements, clampCeiling } from "../utils/entitlements.js";
+import { ddLog } from "../utils/ddLogger.js";
 // ─── Tool Definition ────────────────────────────────────────────
 export const PRISM_INFER_TOOL = {
     name: "prism_infer",
@@ -273,12 +275,47 @@ async function callSynaluxInference(prompt, maxTokens, timeoutMs) {
 }
 export async function runInfer(args, deps) {
     const t0 = Date.now();
-    const maxTokens = Math.min(args.max_tokens ?? 1024, 8192);
     const temperature = args.temperature ?? 0;
-    const allowCloud = args.cloud_fallback === true;
+    // ── Entitlement enforcement ──────────────────────────────────
+    // Fetch user's plan limits (cached 1hr). Free users without auth
+    // get 4b ceiling, 50 calls/day, 512 max tokens.
+    const ent = deps.entitlements ?? await getEntitlements();
+    // Clamp model ceiling to what the plan allows
+    const effectiveCeiling = clampCeiling(args.model_ceiling, ent.model_ceiling);
+    // Clamp max_tokens to plan limit
+    const maxTokens = Math.min(args.max_tokens ?? 1024, ent.max_tokens, 8192);
+    // Cloud fallback only for paid plans
+    const allowCloud = args.cloud_fallback === true && ent.features.cloud_fallback;
+    // Verification only for paid plans (free users skip L3 grounding)
+    const canVerify = ent.features.grounding_verifier;
     const freeBytes = deps.freemem();
     const ramFreeMb = Math.round(freeBytes / (1024 * 1024));
     const attempts = [];
+    // Strip verification args if plan lacks grounding_verifier
+    const gatedArgs = canVerify ? args : { ...args, verify: false, evidence: undefined };
+    debugLog(`[prism_infer] plan=${ent.plan} ceiling=${effectiveCeiling} max_tokens=${maxTokens} cloud=${allowCloud} verify=${canVerify}`);
+    // Log tier enforcement to Datadog for monetization visibility
+    const ceilingClamped = effectiveCeiling !== (args.model_ceiling ?? ent.model_ceiling);
+    const tokensClamped = maxTokens < (args.max_tokens ?? 1024);
+    const cloudBlocked = args.cloud_fallback === true && !allowCloud;
+    const verifierBlocked = (args.verify === true || (args.evidence?.length ?? 0) > 0) && !canVerify;
+    if (ceilingClamped || tokensClamped || cloudBlocked || verifierBlocked) {
+        ddLog("info", "prism_infer.tier_enforcement", {
+            plan: ent.plan,
+            requested_ceiling: args.model_ceiling,
+            effective_ceiling: effectiveCeiling,
+            ceiling_clamped: ceilingClamped,
+            requested_tokens: args.max_tokens,
+            effective_tokens: maxTokens,
+            tokens_clamped: tokensClamped,
+            cloud_requested: args.cloud_fallback,
+            cloud_allowed: allowCloud,
+            cloud_blocked: cloudBlocked,
+            verify_requested: args.verify,
+            verify_allowed: canVerify,
+            verify_blocked: verifierBlocked,
+        });
+    }
     // Discover which tags Ollama actually has + which are already warm.
     // Already-loaded models don't need RAM headroom — they're reusing
     // memory Ollama allocated previously.
@@ -292,8 +329,8 @@ export async function runInfer(args, deps) {
     // so the caller can see exactly why each tier was bypassed.
     if (installed) {
         // Find start index from ceiling — if no ceiling, start at the top (32B).
-        const ceilStart = args.model_ceiling
-            ? Math.max(0, MODEL_TIERS.findIndex(t => t.tag.endsWith(args.model_ceiling) || t.tag === args.model_ceiling))
+        const ceilStart = effectiveCeiling
+            ? Math.max(0, MODEL_TIERS.findIndex(t => t.tag.endsWith(effectiveCeiling) || t.tag === effectiveCeiling))
             : 0;
         let anyViable = false;
         for (let i = ceilStart; i < MODEL_TIERS.length; i++) {
@@ -318,13 +355,14 @@ export async function runInfer(args, deps) {
             const timeout = args.timeout_ms ?? DEFAULT_TIMEOUTS[tier.tag] ?? 60_000;
             const result = await deps.callLocal(deps.ollamaUrl, ollamaName, args.prompt, args.system, maxTokens, temperature, timeout);
             if (result.ok) {
-                return await applyVerification(result.text, args, deps, {
+                return await applyVerification(result.text, gatedArgs, deps, {
                     backend: `ollama-${tier.tag.replace("prism-coder:", "")}`,
                     model_picked: tier.tag,
                     ram_free_mb: ramFreeMb,
                     latency_ms: Date.now() - t0,
                     used_cloud: false,
                     attempts,
+                    plan: ent.plan,
                 });
             }
             attempts.push({ tier: tier.tag, reason: result.reason });
@@ -340,13 +378,14 @@ export async function runInfer(args, deps) {
         const cloudTimeout = args.timeout_ms ?? 90_000;
         const cloud = await deps.callCloud(args.prompt, maxTokens, cloudTimeout);
         if (cloud.ok && cloud.output) {
-            return await applyVerification(cloud.output, args, deps, {
+            return await applyVerification(cloud.output, gatedArgs, deps, {
                 backend: cloud.backend ?? "synalux",
                 model_picked: null,
                 ram_free_mb: ramFreeMb,
                 latency_ms: Date.now() - t0,
                 used_cloud: true,
                 attempts,
+                plan: ent.plan,
             });
         }
         attempts.push({ tier: "synalux", reason: cloud.reason ?? "unknown" });
@@ -408,6 +447,7 @@ export async function prismInferHandler(args) {
         debugLog(`[prism_infer] backend=${result.backend} model=${result.model_picked} latency=${result.latency_ms}ms free=${result.ram_free_mb}MB`);
         const header = `[prism_infer] backend=${result.backend}` +
             ` model=${result.model_picked ?? "n/a"}` +
+            ` plan=${result.plan ?? "unknown"}` +
             ` free_ram=${result.ram_free_mb}MB` +
             ` latency=${result.latency_ms}ms` +
             ` used_cloud=${result.used_cloud}` +
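
The `runInfer` changes above clamp each request to the caller's plan before any model is tried. A self-contained sketch of that gating step is shown below, assuming an entitlements object shaped like the one in the diff. `gateInferArgs` is a hypothetical name, and the Datadog logging and the model cascade are omitted.

```javascript
// Tier labels in ascending order, mirroring TIER_ORDER in the diff.
const TIERS = ["1b7", "4b", "8b", "14b", "32b"];

// Returns the lower of the requested ceiling and the plan ceiling.
function clampCeilingSketch(requested, planCeiling) {
    if (!requested) return planCeiling;
    const reqIdx = TIERS.indexOf(requested);
    const planIdx = TIERS.indexOf(planCeiling);
    if (reqIdx === -1) return planCeiling; // unknown request: use the plan ceiling
    if (planIdx === -1) return requested;  // unknown plan ceiling: trust the request
    return TIERS[Math.min(reqIdx, planIdx)];
}

// Hypothetical gate: derives effective limits and flags what was reduced.
function gateInferArgs(args, ent) {
    const effectiveCeiling = clampCeilingSketch(args.model_ceiling, ent.model_ceiling);
    const maxTokens = Math.min(args.max_tokens ?? 1024, ent.max_tokens, 8192);
    const allowCloud = args.cloud_fallback === true && ent.features.cloud_fallback;
    const canVerify = ent.features.grounding_verifier;
    // Verification inputs are stripped when the plan lacks the verifier.
    const gatedArgs = canVerify ? args : { ...args, verify: false, evidence: undefined };
    return {
        effectiveCeiling,
        maxTokens,
        allowCloud,
        gatedArgs,
        ceilingClamped: effectiveCeiling !== (args.model_ceiling ?? ent.model_ceiling),
        tokensClamped: maxTokens < (args.max_tokens ?? 1024),
        cloudBlocked: args.cloud_fallback === true && !allowCloud,
        verifierBlocked: (args.verify === true || (args.evidence?.length ?? 0) > 0) && !canVerify,
    };
}

const freePlan = {
    plan: "free",
    model_ceiling: "4b",
    max_tokens: 512,
    features: { cloud_fallback: false, grounding_verifier: false },
};
const gated = gateInferArgs(
    { prompt: "hi", model_ceiling: "32b", max_tokens: 2048, cloud_fallback: true, verify: true },
    freePlan,
);
console.log(gated);
```

One consequence visible in the sketch: because the default of 1,024 tokens is compared against the plan limit, a free-tier request that omits `max_tokens` is still reported as clamped. Ceiling strings that are not in the tier list (for example a label spelled differently from `1b7`) fall back to the plan ceiling.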

package/dist/tools/skillRouting.js CHANGED

@@ -81,6 +81,45 @@ export async function resolveSkillsForProject(project) {
         user_local: table.user_local ?? OFFLINE_FALLBACK.user_local,
     };
 }
+/**
+ * Resolve skills based on user prompt keywords. Matches prompt text
+ * against the routing table's prompt_keywords regex patterns.
+ * Returns deduplicated skill names (excluding any already in baseSkills).
+ */
+export async function resolveSkillsForPrompt(prompt, baseSkills = []) {
+    const now = Date.now();
+    if (!cached || now - cached.fetchedAt > CACHE_TTL_MS) {
+        if (!inflight) {
+            inflight = fetchOnce().then((table) => {
+                cached = { table, fetchedAt: Date.now() };
+                return table;
+            }).finally(() => { inflight = null; });
+        }
+        await inflight;
+    }
+    const table = cached.table;
+    if (!table.prompt_keywords)
+        return [];
+    const existing = new Set(baseSkills);
+    const matched = [];
+    for (const [pattern, skills] of Object.entries(table.prompt_keywords)) {
+        try {
+            const re = new RegExp(pattern, 'i');
+            if (re.test(prompt)) {
+                for (const s of skills) {
+                    if (!existing.has(s)) {
+                        existing.add(s);
+                        matched.push(s);
+                    }
+                }
+            }
+        }
+        catch {
+            // Invalid regex in routing table — skip silently
+        }
+    }
+    return matched;
+}
 /** Force a re-fetch on the next call. Exposed for tests + admin tooling. */
 export function _invalidateRoutingCache() {
     cached = null;
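
The matching step of `resolveSkillsForPrompt` can be illustrated without the routing-table fetch and cache. The sketch below assumes `prompt_keywords` is an object mapping regex source strings to arrays of skill names, as the loop in the diff suggests. `matchPromptKeywords` and the sample table are hypothetical.

```javascript
// Hypothetical helper: returns the skills whose keyword pattern matches the
// prompt, in table order, without duplicates and without skills already in
// baseSkills. Invalid patterns are skipped rather than thrown.
function matchPromptKeywords(prompt, promptKeywords, baseSkills = []) {
    const seen = new Set(baseSkills);
    const matched = [];
    for (const [pattern, skills] of Object.entries(promptKeywords)) {
        let re;
        try {
            re = new RegExp(pattern, "i"); // case-insensitive, as in the diff
        } catch {
            continue; // invalid regex in the table
        }
        if (!re.test(prompt)) continue;
        for (const skill of skills) {
            if (!seen.has(skill)) {
                seen.add(skill);
                matched.push(skill);
            }
        }
    }
    return matched;
}

// Sample table for illustration only; the skill names are made up.
const sampleKeywords = {
    "\\b(deploy|release)\\b": ["ci-pipeline", "changelog"],
    "\\btest(s|ing)?\\b": ["qa-review", "ci-pipeline"],
    "([unclosed": ["never-loaded"],
};

const matchedSkills = matchPromptKeywords("Deploy after testing", sampleKeywords, ["changelog"]);
console.log(matchedSkills);
```

Here `changelog` is excluded because it is already a base skill, `ci-pipeline` appears once even though two patterns list it, and the malformed third pattern contributes nothing.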

package/dist/utils/entitlements.js ADDED

@@ -0,0 +1,137 @@
+/**
+ * Prism Entitlements — Plan-Based Feature & Model Gating
+ * ═══════════════════════════════════════════════════════════
+ * Fetches the user's plan entitlements from the Synalux portal
+ * and caches them locally. Used by prism_infer and other tools
+ * to enforce model ceiling, max_tokens, and feature gates.
+ *
+ * Unauthenticated users (no SYNALUX_API_KEY) get free-tier defaults.
+ * Authenticated users get their plan from the portal (1-hour cache).
+ */
+import { getSynaluxJwt } from "./synaluxJwt.js";
+import { PRISM_SYNALUX_BASE_URL, SYNALUX_CONFIGURED } from "../config.js";
+import { debugLog } from "./logger.js";
+// ── Free-tier defaults (no auth) ──────────────────────────────────
+export const FREE_ENTITLEMENTS = {
+    plan: "free",
+    model_ceiling: "4b",
+    daily_infer_limit: 50,
+    max_tokens: 512,
+    max_seats: 1,
+    features: {
+        cloud_fallback: false,
+        grounding_verifier: false,
+        knowledge_search_unlimited: false,
+        session_memory_unlimited: false,
+        analytics_dashboard: false,
+    },
+    upgrade_url: "https://synalux.ai/pricing",
+};
+// ── Cache ─────────────────────────────────────────────────────────
+const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes
+let cache = null;
+let inFlight = null;
+// ── Model tier ordering for ceiling enforcement ───────────────────
+const TIER_ORDER = ["1b7", "4b", "8b", "14b", "32b"];
+/**
+ * Returns true if `requested` exceeds `ceiling`.
+ * e.g. ceilingExceeded("14b", "4b") → true (14b > 4b ceiling)
+ */
+export function ceilingExceeded(requested, ceiling) {
+    const reqIdx = TIER_ORDER.indexOf(requested);
+    const ceilIdx = TIER_ORDER.indexOf(ceiling);
+    if (reqIdx === -1 || ceilIdx === -1)
+        return false;
+    return reqIdx > ceilIdx;
+}
+/**
+ * Clamp a model ceiling string to the plan's maximum.
+ * Returns the lower of the two ceilings.
+ */
+export function clampCeiling(requested, planCeiling) {
+    if (!requested)
+        return planCeiling;
+    const reqIdx = TIER_ORDER.indexOf(requested);
+    const planIdx = TIER_ORDER.indexOf(planCeiling);
+    if (reqIdx === -1)
+        return planCeiling;
+    if (planIdx === -1)
+        return requested;
+    return TIER_ORDER[Math.min(reqIdx, planIdx)];
+}
+// ── Fetch ─────────────────────────────────────────────────────────
+async function fetchEntitlements() {
+    if (!SYNALUX_CONFIGURED || !PRISM_SYNALUX_BASE_URL) {
+        debugLog("[entitlements] no Synalux auth configured — free tier");
+        return FREE_ENTITLEMENTS;
+    }
+    const jwt = await getSynaluxJwt();
+    if (!jwt) {
+        debugLog("[entitlements] JWT exchange failed — free tier fallback");
+        return FREE_ENTITLEMENTS;
+    }
+    try {
+        const url = `${PRISM_SYNALUX_BASE_URL}/api/v1/prism/entitlements`;
+        const res = await fetch(url, {
+            method: "GET",
+            headers: { Authorization: `Bearer ${jwt}` },
+            signal: AbortSignal.timeout(10_000),
+            redirect: "error",
+        });
+        if (!res.ok) {
+            debugLog(`[entitlements] portal HTTP ${res.status} — free tier fallback`);
+            return FREE_ENTITLEMENTS;
+        }
+        const data = (await res.json());
+        if (!data.plan || !data.model_ceiling) {
+            debugLog("[entitlements] malformed response — free tier fallback");
+            return FREE_ENTITLEMENTS;
+        }
+        debugLog(`[entitlements] plan=${data.plan} ceiling=${data.model_ceiling} ` +
+            `daily=${data.daily_infer_limit} max_tokens=${data.max_tokens}`);
+        return data;
+    }
+    catch (err) {
+        debugLog(`[entitlements] fetch error: ${err instanceof Error ? err.message : String(err)} — free tier fallback`);
+        return FREE_ENTITLEMENTS;
+    }
+}
+// ── Public API ────────────────────────────────────────────────────
+/**
+ * Get the current user's entitlements (cached for 1 hour).
+ * Concurrent callers share a single in-flight fetch.
+ */
+export async function getEntitlements() {
+    const now = Date.now();
+    if (cache && cache.expiresAt > now) {
+        return cache.entitlements;
+    }
+    if (inFlight)
+        return inFlight;
+    inFlight = (async () => {
+        try {
+            const ent = await fetchEntitlements();
+            cache = { entitlements: ent, expiresAt: Date.now() + CACHE_TTL_MS };
+            return ent;
+        }
+        finally {
+            inFlight = null;
+        }
+    })();
+    return inFlight;
+}
+/**
+ * Force cache invalidation (e.g. after plan upgrade).
+ */
+export function invalidateEntitlements() {
+    cache = null;
+}
+/** Test-only: reset all state. */
+export function _resetEntitlementsForTest() {
+    cache = null;
+    inFlight = null;
+}
+/** Test-only: inject a cached entitlement. */
+export function _setCacheForTest(ent, ttlMs = CACHE_TTL_MS) {
+    cache = { entitlements: ent, expiresAt: Date.now() + ttlMs };
+}
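
`getEntitlements` in the file above combines a time-limited cache with a shared in-flight promise, so concurrent callers trigger a single fetch. The pattern is sketched below in generic form. `createSingleFlightCache` is a hypothetical name and the fetcher is a stand-in for the portal request. Note that the doc comments in the added file describe a one-hour cache while `CACHE_TTL_MS` is set to five minutes; the sketch takes the TTL as a parameter and makes no claim about which was intended.

```javascript
// Hypothetical factory: wraps an async fetcher with a TTL cache and
// de-duplicates concurrent calls onto one in-flight promise.
function createSingleFlightCache(fetcher, ttlMs) {
    let cache = null;    // { value, expiresAt } once a fetch has completed
    let inFlight = null; // promise shared by concurrent callers
    return async function get() {
        if (cache && cache.expiresAt > Date.now()) return cache.value;
        if (inFlight) return inFlight;
        inFlight = (async () => {
            try {
                const value = await fetcher();
                cache = { value, expiresAt: Date.now() + ttlMs };
                return value;
            } finally {
                inFlight = null; // cleared on success and on failure
            }
        })();
        return inFlight;
    };
}

let fetchCount = 0;
const getPlan = createSingleFlightCache(async () => {
    fetchCount += 1;
    return { plan: "free", model_ceiling: "4b" };
}, 5 * 60 * 1000);

// Three concurrent calls followed by one later call that is served from cache.
const demo = Promise.all([getPlan(), getPlan(), getPlan()]).then(async (results) => {
    const again = await getPlan();
    return { fetchCount, same: results.every((r) => r === again) };
});
demo.then((summary) => console.log(summary));
```

Clearing `inFlight` in `finally` means a failed fetch does not poison later calls. In the diff the fetch function returns the free-tier defaults instead of throwing, and those defaults are then cached for the full TTL like any other result.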

package/dist/utils/phiGuard.js ADDED

@@ -0,0 +1,88 @@
+/**
+ * PHI Guard — detect and redact Protected Health Information before storage/logging.
+ *
+ * HIPAA §164.502: PHI must not be disclosed except as permitted.
+ * This module scans text for common PHI patterns (SSN, DOB, MRN, phone,
+ * email, patient names in clinical context) and redacts them.
+ *
+ * Detection events are logged to stderr (picked up by DD agent) with
+ * the pattern type and character position — never the actual PHI value.
+ *
+ * Usage:
+ *   import { scanAndRedactPHI, hasPHI } from './phiGuard.js';
+ *   const { redacted, detections } = scanAndRedactPHI(userText);
+ *   // `redacted` is safe to store/log; `detections` lists what was found
+ */
+// Patterns ordered by specificity (most specific first)
+const PHI_PATTERNS = [
+    // SSN: 123-45-6789 or 123456789
+    { name: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: '[SSN-REDACTED]' },
+    { name: 'SSN', regex: /\b\d{9}\b(?=\s|$|[,.])/g, replacement: '[SSN-REDACTED]' },
+    // Date of birth patterns: DOB: 01/15/1990, born 1990-01-15, birthday 01/15/90
+    { name: 'DOB', regex: /\b(?:dob|date\s*of\s*birth|born|birthday)\s*[:=]?\s*\d{1,2}[/\-]\d{1,2}[/\-]\d{2,4}\b/gi, replacement: '[DOB-REDACTED]' },
+    { name: 'DOB', regex: /\b(?:dob|date\s*of\s*birth|born|birthday)\s*[:=]?\s*\d{4}[/\-]\d{1,2}[/\-]\d{1,2}\b/gi, replacement: '[DOB-REDACTED]' },
+    // Medical Record Number: MRN: 12345678, MRN#12345
+    { name: 'MRN', regex: /\b(?:mrn|medical\s*record)\s*[#:=]?\s*\d{4,12}\b/gi, replacement: '[MRN-REDACTED]' },
+    // US Phone: (301) 433-1943, 301-433-1943, +1-301-433-1943
+    { name: 'PHONE', regex: /\b(?:\+?1[-.]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g, replacement: '[PHONE-REDACTED]' },
+    // Email in clinical context: patient email, client email
+    { name: 'EMAIL', regex: /\b(?:patient|client|parent|caregiver)\s*(?:email|e-mail)\s*[:=]?\s*[\w.+-]+@[\w.-]+\.\w{2,}\b/gi, replacement: '[EMAIL-REDACTED]' },
+    // Patient/client name patterns: "Patient: John Doe", "Client Name: Jane Smith"
+    { name: 'PATIENT_NAME', regex: /\b(?:patient|client)\s*(?:name)?\s*[:=]\s*[A-Z][a-z]+\s+[A-Z][a-z]+/gi, replacement: '[NAME-REDACTED]' },
+    // Insurance ID: Ins#, Policy#, Member ID
+    { name: 'INSURANCE_ID', regex: /\b(?:ins(?:urance)?|policy|member)\s*(?:id|#|number)\s*[:=]?\s*[A-Z0-9]{6,20}\b/gi, replacement: '[INSURANCE-REDACTED]' },
+    // Diagnosis codes in patient context: "diagnosed with F84.0", "ICD: F32.1"
+    { name: 'DIAGNOSIS', regex: /\b(?:diagnos\w*|icd|dx)\s*(?:[:=]|with)?\s*[A-Z]\d{2}(?:\.\d{1,2})?\b/gi, replacement: '[DX-REDACTED]' },
+];
+/**
+ * Scan text for PHI patterns and return redacted version + detection list.
+ * Never logs or stores the actual PHI values — only type + position.
+ */
+export function scanAndRedactPHI(text) {
+    if (typeof text !== 'string' || !text) {
+        return { redacted: text || '', detections: [], hasPHI: false };
+    }
+    const detections = [];
+    let redacted = text;
+    for (const { name, regex, replacement } of PHI_PATTERNS) {
+        // Reset regex state for global patterns
+        regex.lastIndex = 0;
+        let match;
+        while ((match = regex.exec(text)) !== null) {
+            detections.push({
+                type: name,
+                position: match.index,
+                length: match[0].length,
+            });
+        }
+        redacted = redacted.replace(regex, replacement);
+    }
+    if (detections.length > 0) {
+        // Log detection event — type + count only, NEVER the actual value
+        const summary = detections.reduce((acc, d) => {
+            acc[d.type] = (acc[d.type] || 0) + 1;
+            return acc;
+        }, {});
+        const summaryStr = Object.entries(summary).map(([k, v]) => `${k}=${v}`).join(' ');
+        console.error(`[PHI-GUARD] Detected and redacted PHI: ${summaryStr}`);
+    }
+    return {
+        redacted,
+        detections,
+        hasPHI: detections.length > 0,
+    };
+}
+/**
+ * Quick check — does the text contain PHI patterns?
+ * Faster than full redaction when you only need a boolean.
+ */
+export function hasPHI(text) {
+    if (typeof text !== 'string' || !text)
+        return false;
+    for (const { regex } of PHI_PATTERNS) {
+        regex.lastIndex = 0;
+        if (regex.test(text))
+            return true;
+    }
+    return false;
+}
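
The scan-then-redact approach in `scanAndRedactPHI` can be tried on a small input. The sketch below reuses two of the patterns from the diff (SSN with dashes, and MRN) and is not a substitute for the full list. `redactSketch` is a hypothetical name, and it uses `String.prototype.matchAll` instead of the `exec` loop, a swapped-in technique that avoids touching the shared `lastIndex` of the global regexes.

```javascript
// Two patterns copied from the diff for illustration; the real module has more.
const SKETCH_PATTERNS = [
    { name: "SSN", regex: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: "[SSN-REDACTED]" },
    { name: "MRN", regex: /\b(?:mrn|medical\s*record)\s*[#:=]?\s*\d{4,12}\b/gi, replacement: "[MRN-REDACTED]" },
];

// Hypothetical helper: returns the redacted text plus type, position, and
// length for each detection. The matched values themselves are never kept.
function redactSketch(text) {
    if (typeof text !== "string" || !text) {
        return { redacted: text || "", detections: [], hasPHI: false };
    }
    const detections = [];
    let redacted = text;
    for (const { name, regex, replacement } of SKETCH_PATTERNS) {
        // matchAll iterates over a clone of the regex, so lastIndex on the
        // shared pattern object is left unchanged.
        for (const match of text.matchAll(regex)) {
            detections.push({ type: name, position: match.index, length: match[0].length });
        }
        redacted = redacted.replace(regex, replacement);
    }
    return { redacted, detections, hasPHI: detections.length > 0 };
}

const out = redactSketch("MRN: 12345678, SSN 123-45-6789 on file");
console.log(out.redacted);
console.log(out.detections);
```

Detection positions refer to the original text, not the redacted output, because every pattern is scanned against the unmodified input while replacements accumulate on a separate copy. Detections are listed in pattern order, so the SSN entry precedes the MRN entry even though the MRN appears first in the text.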

package/package.json CHANGED

@@ -1,8 +1,8 @@
 {
   "name": "prism-mcp-server",
-  "version": "17.0.1",
+  "version": "17.1.1",
   "mcpName": "io.github.dcostenco/prism-coder",
-  "description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HRR Semantic Drift Detection across BCBA/Coding/AAC domains, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder:7b / 14b open-weights LLM fleet.",
+  "description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 114 Agent Skills, Zero-Search HDC/HRR retrieval, HRR Semantic Drift Detection across BCBA/Coding/AAC domains, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder 1.7B–32B open-weights LLM fleet.",
   "module": "index.ts",
   "type": "module",
   "main": "dist/server.js",