npm - prism-mcp-server - Versions diffs - 15.1.0 → 15.2.1 - Mend

prism-mcp-server 15.1.0 → 15.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md +68 -12
package/dist/dashboard/server.js +8 -6
package/dist/tools/ledgerHandlers.js +35 -10
package/dist/tools/skillRouting.js +8 -4
package/package.json +1 -1

package/README.md CHANGED

@@ -113,21 +113,31 @@ The LLM context window is treated as ephemeral scratch space. All durable state
 ---
-## Plans
+## Models
+All Prism Coder inference uses **only fine-tuned Prism Coder models** — no Claude, no Gemini, no OpenRouter fallbacks. Models are exclusively accessible through the Synalux router (authentication + subscription required).
+| Model | Where | Tier | Latency |
+|---|---|---|---|
+| **Qwen3-1.7B** (fine-tuned) | On-device — iOS CoreML / Android ONNX | Free | ~50ms offline |
+| **Qwen3-14B** (fine-tuned) | RunPod A100 via Synalux | Standard+ | ~200ms |
+| **QwQ-32B** (fine-tuned) | RunPod A100 80GB via Synalux | Pro/Enterprise | ~3–5s |
+| **Qwen3-30B-A3B** (fine-tuned MoE) | RunPod via Synalux | Enterprise | ~2–3s |
-| | Free (local) | Paid (Synalux portal) |
-|---|---|---|
-| Local SQLite memory | ✅ | ✅ |
-| Semantic search | ✅ (local embedding) | ✅ (cloud-backed) |
-| Cross-device sync | — | ✅ |
-| Hivemind multi-agent | ✅ local team | ✅ + cloud roster |
-| Auto-Scholar (web research → memory) | — | ✅ |
-| HRR Zero-Search retrieval | ✅ | ✅ |
-| Custom domains / SSO | — | Enterprise |
+Fine-tuned on the 3-layer corpus: AAC + BFCL tool-calling + clinical workflows. BFCL gate: ≥ 90% on all tiers before production promotion. Adapters stored at `dcostenco/prism-coder-*` (private HuggingFace).
-The thin-client architecture: when authenticated to Synalux, Prism Coder routes through the portal for paid features. When not authenticated (or `PRISM_FORCE_LOCAL=1`), runs purely local. Same binary.
+## Plans
+| | Free | Standard $19/mo | Pro $49/mo | Enterprise $99/mo |
+|---|---|---|---|---|
+| Qwen3-1.7B on-device | ✅ unlimited | ✅ | ✅ | ✅ |
+| Qwen3-14B cloud | — | ✅ 200 req/day | ✅ 2K req/day | ✅ unlimited |
+| QwQ-32B reasoning | — | — | ✅ | ✅ priority |
+| Qwen3-30B-A3B MoE | — | — | — | ✅ |
+| Custom fine-tuning | — | — | — | ✅ |
+| HIPAA BAA | — | — | — | ✅ |
-[Pricing →](https://synalux.ai/pricing)
+[Subscribe →](https://synalux.ai/pricing)
 ---
@@ -174,6 +184,52 @@ As of v14.0.0, Prism's algorithm exports are a **stable public contract** under
 | [PrismAAC](https://github.com/dcostenco/prism-aac) | Spreading-activation phrase ranking (recency × frequency × per-user history). Caregiver corrections auto-harvest into the personalization corpus via the audit-hooks postflight harvester. The on-device 7B model + this algorithm stack is what makes PrismAAC defensible. |
 | Synalux portal | Tier-aware model routing using experience bias on prior outcomes per fingerprint. HIPAA-compliant clinical scribe with on-device-first privacy guarantees. |
+## Synalux Inference Router — Architecture (v16)
+All Prism AAC model inference is protected behind Synalux as a mandatory router. Models are **never accessible directly** — all traffic goes through Synalux for auth, billing, and rate limiting.
+```
+┌─────────────────────────────────────────────────────────────┐
+│                      CLIENT LAYER                           │
+│  prism-aac (iOS/web)         │   Synalux Portal             │
+└──────────────┬──────────────────────────────────────────────┘
+               │ POST /api/v1/prism-aac/inference
+               │ Authorization: Bearer <user-JWT>
+               ▼
+┌─────────────────────────────────────────────────────────────┐
+│                   SYNALUX ROUTER                            │
+│  1. Verify JWT (no anonymous access)                        │
+│  2. Check subscription tier                                 │
+│  3. Enforce rate limit (50–2000 req/day by plan)            │
+│  4. Route to model tier by complexity                       │
+│  5. Proxy → RunPod with SECRET key (never sent to client)   │
+│  6. Log → aac_inference_log (billing audit trail)           │
+└──────────┬─────────────────────────────────────┬────────────┘
+           │ tier=fast                            │ tier=reason
+           ▼                                      ▼
+  ┌──────────────────┐               ┌───────────────────────┐
+  │  Qwen3-14B       │               │  QwQ-32B              │
+  │  RunPod A100 40G │               │  RunPod A100 80G      │
+  │  ~200ms          │               │  ~3–5s (reasoning)    │
+  │  standard/pro    │               │  pro/enterprise only  │
+  └──────────────────┘               └───────────────────────┘
+           │                                      │
+           └────────────────┬─────────────────────┘
+                            ▼
+               HuggingFace dcostenco/prism-coder-* (private)
+               RunPod pulls at pod start with server-side token
+On-device (free, zero latency, offline):
+  Qwen3-1.7B GGUF Q4_K_M → iOS CoreML / Android ONNX
+```
+| Plan | Cloud model | Daily limit | On-device |
+|---|---|---|---|
+| Free | — | unlimited local | Qwen3-1.7B |
+| Standard $5/mo | Qwen3-14B | 200 req | + cloud |
+| Pro $15/mo | QwQ-32B | 2,000 req | + reasoning |
+| Enterprise | QwQ-32B priority | unlimited | full stack |
 See [`docs/WOW_FEATURES.md`](docs/WOW_FEATURES.md) for the algorithm catalogue. Release notes in [`docs/releases/v14.0.0-prism-as-foundation.md`](docs/releases/v14.0.0-prism-as-foundation.md).
 ---
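
The router steps added to the README (verify identity, check tier, enforce the daily limit, pick a model) can be sketched as a single decision function. This is a minimal illustration, not the Synalux implementation: `routeInference`, the `PLANS` map, the `user.verified` flag (standing in for JWT verification), and the HTTP status codes are all hypothetical. Only the model names and daily limits are taken from the plan table in the router section above.

```javascript
// Hypothetical sketch of the router's auth, tier, and rate-limit checks.
// Model names and daily limits mirror the README's router table; the rest is assumed.
const PLANS = {
  free:       { cloudModel: null,        dailyLimit: 0 },        // on-device only
  standard:   { cloudModel: "Qwen3-14B", dailyLimit: 200 },
  pro:        { cloudModel: "QwQ-32B",   dailyLimit: 2000 },
  enterprise: { cloudModel: "QwQ-32B",   dailyLimit: Infinity }, // "unlimited"
};

function routeInference(user, usedToday) {
  // Step 1: no anonymous access. `verified` stands in for a real JWT check.
  if (!user || !user.verified) {
    return { status: 401, error: "authentication required" };
  }
  // Step 2: the plan must include a cloud model.
  const plan = PLANS[user.plan];
  if (!plan || !plan.cloudModel) {
    return { status: 403, error: "plan has no cloud model" };
  }
  // Step 3: enforce the per-plan daily request limit.
  if (usedToday >= plan.dailyLimit) {
    return { status: 429, error: "daily limit reached" };
  }
  // Step 4: route to the plan's model. Proxying and logging are omitted here.
  return { status: 200, model: plan.cloudModel };
}
```

For example, `routeInference({ verified: true, plan: "standard" }, 200)` is rejected because the 200-request allowance is already used, while the same call with `usedToday` of 199 is routed to `Qwen3-14B`.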

package/dist/dashboard/server.js CHANGED

@@ -567,7 +567,9 @@ return false;}
                 res.writeHead(200, { "Content-Type": "application/json" });
                 return res.end(JSON.stringify({ skills }));
             }
-            // POST /api/skills → { role, content } saves skill:<role>
+            // POST /api/skills → { role, content } saves user_skill:<role> (user-local namespace).
+            // Platform skills (skill:*) are read-only — populated only by sync-skills.sh
+            // or fetched from Synalux. Users cannot overwrite platform skills from the dashboard.
             if (url.pathname === "/api/skills" && req.method === "POST") {
                 const body = await readBody(req);
                 const { role, content } = JSON.parse(body || "{}");
@@ -575,16 +577,16 @@ return false;}
                     res.writeHead(400);
                     return res.end(JSON.stringify({ error: "role required" }));
                 }
-                await setSetting(`skill:${role}`, content || "");
+                await setSetting(`user_skill:${role}`, content || "");
                 res.writeHead(200, { "Content-Type": "application/json" });
-                return res.end(JSON.stringify({ ok: true, role }));
+                return res.end(JSON.stringify({ ok: true, role, namespace: "user_skill" }));
             }
-            // DELETE /api/skills/:role → clears skill:<role>
+            // DELETE /api/skills/:role → clears user_skill:<role> (user-local only).
             if (url.pathname.startsWith("/api/skills/") && req.method === "DELETE") {
                 const role = url.pathname.replace("/api/skills/", "");
-                await setSetting(`skill:${role}`, "");
+                await setSetting(`user_skill:${role}`, "");
                 res.writeHead(200, { "Content-Type": "application/json" });
-                return res.end(JSON.stringify({ ok: true, role }));
+                return res.end(JSON.stringify({ ok: true, role, namespace: "user_skill" }));
             }
             // ─── API: Knowledge Graph (v6.2 — extracted to graphRouter.ts) ───
             if (url.pathname.startsWith("/api/graph")) {
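
The effect of the namespace change in the two handlers above can be sketched with an in-memory store. This is an illustration under assumptions: the `settings` map stands in for the package's `setSetting` storage, and `saveUserSkill` / `deleteUserSkill` are hypothetical names for the POST and DELETE branches. The key prefixes and response bodies follow the diff.

```javascript
// In-memory stand-in for the settings store (assumption; the package calls setSetting).
const settings = new Map();

// Mirrors the POST /api/skills branch: writes only to user_skill:<role>.
function saveUserSkill(role, content) {
  if (!role) {
    return { status: 400, body: { error: "role required" } };
  }
  settings.set(`user_skill:${role}`, content || "");
  return { status: 200, body: { ok: true, role, namespace: "user_skill" } };
}

// Mirrors the DELETE /api/skills/:role branch: clears user_skill:<role> by
// storing an empty string, as the diff does, rather than removing the key.
function deleteUserSkill(role) {
  settings.set(`user_skill:${role}`, "");
  return { status: 200, body: { ok: true, role, namespace: "user_skill" } };
}

// A platform skill seeded elsewhere (for example by a sync script) is never touched.
settings.set("skill:reviewer", "platform text");
saveUserSkill("reviewer", "my local notes");
```

After these calls, `skill:reviewer` still holds the platform text and `user_skill:reviewer` holds the user's content, which is the separation the new comments describe.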

package/dist/tools/ledgerHandlers.js CHANGED

@@ -798,12 +798,19 @@ export async function sessionLoadContextHandler(args) {
         }
     }
     // ─── Project-Aware Skill Injection ──────────────────────────
-    // Routing (WHICH skills): always from Synalux /api/v1/skills/routing.
-    // Content (WHAT): paid tier → Synalux /api/v1/skills/content (batch fetch,
-    // single source of truth). Free tier / offline → local SQLite fallback.
+    // Routing (WHICH skills + user_local policy): Synalux /api/v1/skills/routing.
+    // Content (WHAT):
+    //   Platform skills  → Synalux /api/v1/skills/content (DB first, filesystem fallback)
+    //                      → local SQLite skill:<name> (free tier / offline fallback)
+    //   User-local skills → local SQLite user_skill:<name>
+    //                       ONLY when user_local.enabled=true in routing table
+    //                       OR session_load_context called with user_local=true.
+    //                       Users CANNOT write to the platform skill: namespace.
     const { resolveSkillsForProject } = await import("./skillRouting.js");
-    const skillsToLoad = await resolveSkillsForProject(project);
-    // Paid tier: batch-fetch all skill content from Synalux portal in one request.
+    const resolved = await resolveSkillsForProject(project);
+    const skillsToLoad = resolved.names;
+    const userLocalPolicy = resolved.user_local;
+    // Paid tier: batch-fetch platform skill content from Synalux in one request.
     let synaluxContent = {};
     if (SYNALUX_CONFIGURED && storage && typeof storage.fetchSkillContent === "function") {
         const missing = skillsToLoad.filter(n => !loadedSkills.includes(n));
@@ -814,19 +821,38 @@ export async function sessionLoadContextHandler(args) {
     for (const skillName of skillsToLoad) {
         if (loadedSkills.includes(skillName))
             continue;
-        // Prefer Synalux content (always up-to-date); fall back to local SQLite.
+        // Synalux (paid) → platform fallback skill:<name> (free/offline). Never user_skill:.
         const content = synaluxContent[skillName] || await getSetting(`skill:${skillName}`, "");
         if (content && content.trim()) {
-            const source = synaluxContent[skillName] ? "synalux" : "local";
+            const source = synaluxContent[skillName] ? "synalux" : "local-platform";
             skillBlock += `\n\n[📜 SKILL: ${skillName}]\n${content.trim()}`;
             loadedSkills.push(skillName);
             skillLoaded = true;
             debugLog(`[session_load_context] Skill "${skillName}" loaded (${source}) for project="${project}"`);
         }
     }
+    // ─── User-Local Skills ──────────────────────────────────────
+    // Loaded ONLY when user_local.enabled=true (set in Synalux routing table
+    // or explicitly requested). Stored under user_skill: prefix — users can
+    // write here via dashboard; they CANNOT write to the platform skill: keys.
+    if (userLocalPolicy.enabled) {
+        const prefix = userLocalPolicy.key_prefix || "user_skill:";
+        const allSettings = await storage.getAllSettings?.() || {};
+        for (const [k, v] of Object.entries(allSettings)) {
+            if (!k.startsWith(prefix) || !v)
+                continue;
+            const skillName = k.replace(prefix, "");
+            if (loadedSkills.includes(skillName))
+                continue;
+            skillBlock += `\n\n[📜 USER SKILL: ${skillName}]\n${v.trim()}`;
+            loadedSkills.push(skillName);
+            skillLoaded = true;
+            debugLog(`[session_load_context] User-local skill "${skillName}" loaded`);
+        }
+    }
     // ─── Memory-Based Skill Discovery ──────────────────────────
-    // If recent handoff/ledger mentions a skill name, auto-load it.
-    // This lets the agent's own memory drive skill activation.
+    // If recent handoff/ledger mentions a platform skill name, auto-load it.
+    // Only scans platform skill: keys — user_skill: discovery is not automatic.
     if (formattedContext.length > 0) {
         const contextText = formattedContext.toLowerCase();
         const allSkillKeys = await storage.getAllSettings?.() || {};
@@ -836,7 +862,6 @@ export async function sessionLoadContextHandler(args) {
             const skillName = k.replace("skill:", "");
             if (loadedSkills.includes(skillName))
                 continue;
-            // Only load if the skill name appears in recent context
             if (contextText.includes(skillName.replace(/-/g, " ")) || contextText.includes(skillName)) {
                 skillBlock += `\n\n[📜 CONTEXT SKILL: ${skillName}]\n${v}`;
                 loadedSkills.push(skillName);
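
The user-local loading step added in this file can be isolated as a pure function, which makes the policy gate easier to reason about. The sketch below is a variation, not the shipped code: `collectUserSkills` is a hypothetical helper, it uses `slice` instead of `replace` to strip the prefix, and it adds a string type guard and skips whitespace-only values, neither of which the diff does.

```javascript
// Hypothetical pure-function version of the "User-Local Skills" block.
// Returns the skills to append; the caller would build skillBlock from them.
function collectUserSkills(allSettings, policy, alreadyLoaded) {
  // Gate: nothing is loaded unless the routing policy enables it.
  if (!policy || !policy.enabled) return [];
  const prefix = policy.key_prefix || "user_skill:";
  const loaded = new Set(alreadyLoaded);
  const blocks = [];
  for (const [key, value] of Object.entries(allSettings)) {
    // Added guard (not in the diff): skip non-string and blank values.
    if (!key.startsWith(prefix) || typeof value !== "string" || !value.trim()) {
      continue;
    }
    const name = key.slice(prefix.length);
    // A platform skill with the same name that is already loaded wins.
    if (loaded.has(name)) continue;
    blocks.push({ name, text: value.trim() });
    loaded.add(name);
  }
  return blocks;
}
```

With `{ enabled: false }` the function returns an empty list regardless of what is stored, which matches the offline fallback policy introduced in `skillRouting.js`.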

package/dist/tools/skillRouting.js CHANGED

@@ -16,12 +16,12 @@
  * Do NOT add hardcoded skill names here outside the OFFLINE_FALLBACK block
  * — that defeats the single-source-of-truth design.
  */
-// Minimal fallback when synalux is unreachable. Only the universal BCBA
-// skill — project-specific mappings need synalux to resolve.
+// Minimal fallback when synalux is unreachable.
 const OFFLINE_FALLBACK = {
     version: 1,
     universal: ['bcba_ai_assistant'],
     projects: {},
+    user_local: { enabled: false, key_prefix: 'user_skill:' },
 };
 const SYNALUX_BASE = process.env.SYNALUX_BASE_URL || 'https://synalux.ai';
 const CACHE_TTL_MS = 5 * 60 * 1000;
@@ -53,7 +53,8 @@ async function fetchOnce() {
 /**
  * Resolve the skill list for a given project (case-insensitive substring
  * match against the routing table). Always returns at least the universal
- * skills.
+ * skills. Also returns the user_local policy so callers know whether to
+ * load user_skill:* entries from local SQLite.
  */
 export async function resolveSkillsForProject(project) {
     const now = Date.now();
@@ -75,7 +76,10 @@ export async function resolveSkillsForProject(project) {
                 out.add(s);
         }
     }
-    return Array.from(out);
+    return {
+        names: Array.from(out),
+        user_local: table.user_local ?? OFFLINE_FALLBACK.user_local,
+    };
 }
 /** Force a re-fetch on the next call. Exposed for tests + admin tooling. */
 export function _invalidateRoutingCache() {
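
The return type of `resolveSkillsForProject` changes here from an array to an object, so callers that treated the result as an array need updating (as `ledgerHandlers.js` does above). The sketch below illustrates the new shape with a synchronous, cache-free stand-in. `resolveFromTable` is a hypothetical name, and two details are assumptions not visible in the hunk: that `projects` maps a key to an array of skill names, and that a project matches when its name contains the key.

```javascript
// Copied from the diff: the fallback table used when Synalux is unreachable.
const OFFLINE_FALLBACK = {
  version: 1,
  universal: ["bcba_ai_assistant"],
  projects: {},
  user_local: { enabled: false, key_prefix: "user_skill:" },
};

// Hypothetical stand-in for resolveSkillsForProject, without fetch or caching.
function resolveFromTable(table, project) {
  const out = new Set(table.universal || []);
  const needle = String(project || "").toLowerCase();
  // Assumed shape: projects = { "<key>": ["skill-a", "skill-b"] }.
  for (const [key, skills] of Object.entries(table.projects || {})) {
    // Case-insensitive substring match (direction of the match is assumed).
    if (needle && needle.includes(key.toLowerCase())) {
      for (const s of skills) out.add(s);
    }
  }
  return {
    names: Array.from(out),
    // Tables without a user_local entry fall back to the disabled policy.
    user_local: table.user_local ?? OFFLINE_FALLBACK.user_local,
  };
}
```

A caller written against the previous version would change from `const skills = await resolveSkillsForProject(p)` to destructuring `names` and `user_local` from the result.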

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "prism-mcp-server",
-  "version": "15.1.0",
+  "version": "15.2.1",
   "mcpName": "io.github.dcostenco/prism-coder",
   "description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder:7b / 14b open-weights LLM fleet.",
   "module": "index.ts",