prism-mcp-server 15.1.0 → 15.2.0

package/README.md CHANGED
@@ -113,21 +113,31 @@ The LLM context window is treated as ephemeral scratch space. All durable state
 
  ---
 
- ## Plans
+ ## Models
+
+ All Prism Coder inference uses **only fine-tuned Prism Coder models** — no Claude, no Gemini, no OpenRouter fallbacks. Models are exclusively accessible through the Synalux router (authentication + subscription required).
+
+ | Model | Where | Tier | Latency |
+ |---|---|---|---|
+ | **Qwen3-1.7B** (fine-tuned) | On-device — iOS CoreML / Android ONNX | Free | ~50ms offline |
+ | **Qwen3-14B** (fine-tuned) | RunPod A100 via Synalux | Standard+ | ~200ms |
+ | **QwQ-32B** (fine-tuned) | RunPod A100 80GB via Synalux | Pro/Enterprise | ~3–5s |
+ | **Qwen3-30B-A3B** (fine-tuned MoE) | RunPod via Synalux | Enterprise | ~2–3s |
 
- | | Free (local) | Paid (Synalux portal) |
- |---|---|---|
- | Local SQLite memory | ✅ | ✅ |
- | Semantic search | ✅ (local embedding) | ✅ (cloud-backed) |
- | Cross-device sync | — | ✅ |
- | Hivemind multi-agent | ✅ local team | ✅ + cloud roster |
- | Auto-Scholar (web research → memory) | — | ✅ |
- | HRR Zero-Search retrieval | ✅ | ✅ |
- | Custom domains / SSO | — | Enterprise |
+ Fine-tuned on the 3-layer corpus: AAC + BFCL tool-calling + clinical workflows. BFCL gate: ≥ 90% on all tiers before production promotion. Adapters stored at `dcostenco/prism-coder-*` (private HuggingFace).
 
- The thin-client architecture: when authenticated to Synalux, Prism Coder routes through the portal for paid features. When not authenticated (or `PRISM_FORCE_LOCAL=1`), runs purely local. Same binary.
+ ## Plans
+
+ | | Free | Standard $19/mo | Pro $49/mo | Enterprise $99/mo |
+ |---|---|---|---|---|
+ | Qwen3-1.7B on-device | ✅ unlimited | ✅ | ✅ | ✅ |
+ | Qwen3-14B cloud | — | ✅ 200 req/day | ✅ 2K req/day | ✅ unlimited |
+ | QwQ-32B reasoning | — | — | ✅ | ✅ priority |
+ | Qwen3-30B-A3B MoE | — | — | — | ✅ |
+ | Custom fine-tuning | — | — | — | ✅ |
+ | HIPAA BAA | — | — | — | ✅ |
 
- [Pricing →](https://synalux.ai/pricing)
+ [Subscribe →](https://synalux.ai/pricing)
 
  ---
 
@@ -174,6 +184,52 @@ As of v14.0.0, Prism's algorithm exports are a **stable public contract** under
  | [PrismAAC](https://github.com/dcostenco/prism-aac) | Spreading-activation phrase ranking (recency × frequency × per-user history). Caregiver corrections auto-harvest into the personalization corpus via the audit-hooks postflight harvester. The on-device 7B model + this algorithm stack is what makes PrismAAC defensible. |
  | Synalux portal | Tier-aware model routing using experience bias on prior outcomes per fingerprint. HIPAA-compliant clinical scribe with on-device-first privacy guarantees. |
 
+ ## Synalux Inference Router — Architecture (v16)
+
+ All Prism AAC model inference is protected behind Synalux as a mandatory router. Models are **never accessible directly** — all traffic goes through Synalux for auth, billing, and rate limiting.
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │ CLIENT LAYER │
+ │ prism-aac (iOS/web) │ Synalux Portal │
+ └──────────────┬──────────────────────────────────────────────┘
+ │ POST /api/v1/prism-aac/inference
+ │ Authorization: Bearer <user-JWT>
+
+ ┌─────────────────────────────────────────────────────────────┐
+ │ SYNALUX ROUTER │
+ │ 1. Verify JWT (no anonymous access) │
+ │ 2. Check subscription tier │
+ │ 3. Enforce rate limit (50–2000 req/day by plan) │
+ │ 4. Route to model tier by complexity │
+ │ 5. Proxy → RunPod with SECRET key (never sent to client) │
+ │ 6. Log → aac_inference_log (billing audit trail) │
+ └──────────┬─────────────────────────────────────┬────────────┘
+ │ tier=fast │ tier=reason
+ ▼ ▼
+ ┌──────────────────┐ ┌───────────────────────┐
+ │ Qwen3-14B │ │ QwQ-32B │
+ │ RunPod A100 40G │ │ RunPod A100 80G │
+ │ ~200ms │ │ ~3–5s (reasoning) │
+ │ standard/pro │ │ pro/enterprise only │
+ └──────────────────┘ └───────────────────────┘
+ │ │
+ └────────────────┬─────────────────────┘
+
+ HuggingFace dcostenco/prism-coder-* (private)
+ RunPod pulls at pod start with server-side token
+
+ On-device (free, zero latency, offline):
+ Qwen3-1.7B GGUF Q4_K_M → iOS CoreML / Android ONNX
+ ```
+
+ | Plan | Cloud model | Daily limit | On-device |
+ |---|---|---|---|
+ | Free | — | unlimited local | Qwen3-1.7B |
+ | Standard $5/mo | Qwen3-14B | 200 req | + cloud |
+ | Pro $15/mo | QwQ-32B | 2,000 req | + reasoning |
+ | Enterprise | QwQ-32B priority | unlimited | full stack |
+
  See [`docs/WOW_FEATURES.md`](docs/WOW_FEATURES.md) for the algorithm catalogue. Release notes in [`docs/releases/v14.0.0-prism-as-foundation.md`](docs/releases/v14.0.0-prism-as-foundation.md).
 
  ---
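The router steps listed in the README diagram (authenticate, check tier, rate-limit, route) can be sketched as one function. This is a minimal illustration under stated assumptions, not the Synalux implementation: `PLAN_LIMITS`, `routeInference`, the `wantsReasoning` flag standing in for "complexity", and the in-memory counter are all hypothetical. Only the daily limits (200, 2,000, unlimited) and the fast/reason split come from the README text. JWT verification, the RunPod proxy call, and audit logging are left out.

```javascript
// Hypothetical plan table. Limits come from the README; the shape is illustrative.
const PLAN_LIMITS = {
  free:       { daily: 0,        tiers: [] },
  standard:   { daily: 200,      tiers: ["fast"] },
  pro:        { daily: 2000,     tiers: ["fast", "reason"] },
  enterprise: { daily: Infinity, tiers: ["fast", "reason"] },
};

// In-memory per-user daily counters (a real router would persist these).
const usage = new Map();

function routeInference({ user, wantsReasoning, today }) {
  // Steps 1-2: the caller is assumed to have verified the JWT and attached `user.plan`.
  if (!user) return { status: 401, error: "authentication required" };
  const plan = PLAN_LIMITS[user.plan];
  if (!plan || plan.tiers.length === 0) {
    return { status: 403, error: "no cloud access on this plan" };
  }

  // Step 3: enforce the daily request limit for this user.
  const key = `${user.id}:${today}`;
  const used = usage.get(key) || 0;
  if (used >= plan.daily) return { status: 429, error: "daily limit reached" };

  // Step 4: pick the reasoning tier only when the plan includes it.
  const tier = wantsReasoning && plan.tiers.includes("reason") ? "reason" : "fast";

  usage.set(key, used + 1);
  // Steps 5-6 (proxy to the model backend, write the audit log) are omitted here.
  return { status: 200, tier };
}
```

In this sketch a Standard user who asks for reasoning is silently downgraded to the fast tier; rejecting the request instead would be an equally plausible policy.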
@@ -567,7 +567,9 @@ return false;}
  res.writeHead(200, { "Content-Type": "application/json" });
  return res.end(JSON.stringify({ skills }));
  }
- // POST /api/skills → { role, content } saves skill:<role>
+ // POST /api/skills → { role, content } saves user_skill:<role> (user-local namespace).
+ // Platform skills (skill:*) are read-only — populated only by sync-skills.sh
+ // or fetched from Synalux. Users cannot overwrite platform skills from the dashboard.
  if (url.pathname === "/api/skills" && req.method === "POST") {
  const body = await readBody(req);
  const { role, content } = JSON.parse(body || "{}");
@@ -575,16 +577,16 @@ return false;}
  res.writeHead(400);
  return res.end(JSON.stringify({ error: "role required" }));
  }
- await setSetting(`skill:${role}`, content || "");
+ await setSetting(`user_skill:${role}`, content || "");
  res.writeHead(200, { "Content-Type": "application/json" });
- return res.end(JSON.stringify({ ok: true, role }));
+ return res.end(JSON.stringify({ ok: true, role, namespace: "user_skill" }));
  }
- // DELETE /api/skills/:role → clears skill:<role>
+ // DELETE /api/skills/:role → clears user_skill:<role> (user-local only).
  if (url.pathname.startsWith("/api/skills/") && req.method === "DELETE") {
  const role = url.pathname.replace("/api/skills/", "");
- await setSetting(`skill:${role}`, "");
+ await setSetting(`user_skill:${role}`, "");
  res.writeHead(200, { "Content-Type": "application/json" });
- return res.end(JSON.stringify({ ok: true, role }));
+ return res.end(JSON.stringify({ ok: true, role, namespace: "user_skill" }));
  }
  // ─── API: Knowledge Graph (v6.2 — extracted to graphRouter.ts) ───
  if (url.pathname.startsWith("/api/graph")) {
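The effect of the `skill:` to `user_skill:` rename in the two handlers above can be shown with a small stand-alone sketch. It assumes a synchronous in-memory `Map` in place of the real settings store, and `saveUserSkill` / `clearUserSkill` are hypothetical names for the POST and DELETE branches; the response bodies mirror the ones in the diff.

```javascript
// Hypothetical in-memory stand-in for the dashboard's settings store.
const settings = new Map([["skill:reviewer", "platform-managed content"]]);
const setSetting = (key, value) => settings.set(key, value);

// POST /api/skills equivalent: always writes under user_skill:, never skill:.
function saveUserSkill(role, content) {
  if (!role) return { status: 400, body: { error: "role required" } };
  setSetting(`user_skill:${role}`, content || "");
  return { status: 200, body: { ok: true, role, namespace: "user_skill" } };
}

// DELETE /api/skills/:role equivalent: blanks the user-local entry only.
function clearUserSkill(role) {
  setSetting(`user_skill:${role}`, "");
  return { status: 200, body: { ok: true, role, namespace: "user_skill" } };
}
```

Saving a skill named `reviewer` leaves `skill:reviewer` untouched and creates `user_skill:reviewer` beside it. As in the diff, deletion writes an empty string rather than removing the key, so any reader has to treat an empty value as "absent".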
@@ -798,12 +798,19 @@ export async function sessionLoadContextHandler(args) {
  }
  }
  // ─── Project-Aware Skill Injection ──────────────────────────
- // Routing (WHICH skills): always from Synalux /api/v1/skills/routing.
- // Content (WHAT): paid tier → Synalux /api/v1/skills/content (batch fetch,
- // single source of truth). Free tier / offline local SQLite fallback.
+ // Routing (WHICH skills + user_local policy): Synalux /api/v1/skills/routing.
+ // Content (WHAT):
+ // Platform skills → Synalux /api/v1/skills/content (DB first, filesystem fallback)
+ // → local SQLite skill:<name> (free tier / offline fallback)
+ // User-local skills → local SQLite user_skill:<name>
+ // ONLY when user_local.enabled=true in routing table
+ // OR session_load_context called with user_local=true.
+ // Users CANNOT write to the platform skill: namespace.
  const { resolveSkillsForProject } = await import("./skillRouting.js");
- const skillsToLoad = await resolveSkillsForProject(project);
- // Paid tier: batch-fetch all skill content from Synalux portal in one request.
+ const resolved = await resolveSkillsForProject(project);
+ const skillsToLoad = resolved.names;
+ const userLocalPolicy = resolved.user_local;
+ // Paid tier: batch-fetch platform skill content from Synalux in one request.
  let synaluxContent = {};
  if (SYNALUX_CONFIGURED && storage && typeof storage.fetchSkillContent === "function") {
  const missing = skillsToLoad.filter(n => !loadedSkills.includes(n));
@@ -814,19 +821,38 @@ export async function sessionLoadContextHandler(args) {
  for (const skillName of skillsToLoad) {
  if (loadedSkills.includes(skillName))
  continue;
- // Prefer Synalux content (always up-to-date); fall back to local SQLite.
+ // Synalux (paid) platform fallback skill:<name> (free/offline). Never user_skill:.
  const content = synaluxContent[skillName] || await getSetting(`skill:${skillName}`, "");
  if (content && content.trim()) {
- const source = synaluxContent[skillName] ? "synalux" : "local";
+ const source = synaluxContent[skillName] ? "synalux" : "local-platform";
  skillBlock += `\n\n[📜 SKILL: ${skillName}]\n${content.trim()}`;
  loadedSkills.push(skillName);
  skillLoaded = true;
  debugLog(`[session_load_context] Skill "${skillName}" loaded (${source}) for project="${project}"`);
  }
  }
+ // ─── User-Local Skills ──────────────────────────────────────
+ // Loaded ONLY when user_local.enabled=true (set in Synalux routing table
+ // or explicitly requested). Stored under user_skill: prefix — users can
+ // write here via dashboard; they CANNOT write to the platform skill: keys.
+ if (userLocalPolicy.enabled) {
+ const prefix = userLocalPolicy.key_prefix || "user_skill:";
+ const allSettings = await storage.getAllSettings?.() || {};
+ for (const [k, v] of Object.entries(allSettings)) {
+ if (!k.startsWith(prefix) || !v)
+ continue;
+ const skillName = k.replace(prefix, "");
+ if (loadedSkills.includes(skillName))
+ continue;
+ skillBlock += `\n\n[📜 USER SKILL: ${skillName}]\n${v.trim()}`;
+ loadedSkills.push(skillName);
+ skillLoaded = true;
+ debugLog(`[session_load_context] User-local skill "${skillName}" loaded`);
+ }
+ }
  // ─── Memory-Based Skill Discovery ──────────────────────────
- // If recent handoff/ledger mentions a skill name, auto-load it.
- // This lets the agent's own memory drive skill activation.
+ // If recent handoff/ledger mentions a platform skill name, auto-load it.
+ // Only scans platform skill: keys user_skill: discovery is not automatic.
  if (formattedContext.length > 0) {
  const contextText = formattedContext.toLowerCase();
  const allSkillKeys = await storage.getAllSettings?.() || {};
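The user-local loading loop added in this hunk has two consequences that are easy to miss: a platform skill that is already loaded shadows a user-local skill of the same name, and entries blanked by the DELETE handler are skipped because their value is empty. A minimal synchronous sketch of the same filter, assuming a plain object for the settings and a hypothetical helper name `loadUserLocalSkills`:

```javascript
// Sketch of the user-local loading loop. `allSettings` is a plain object of
// key -> string, `policy` is the user_local entry from the routing table, and
// `loadedSkills` is mutated in place, as in the handler above.
function loadUserLocalSkills(allSettings, policy, loadedSkills) {
  let block = "";
  if (!policy || !policy.enabled) return block;
  const prefix = policy.key_prefix || "user_skill:";
  for (const [key, value] of Object.entries(allSettings)) {
    // Skip other namespaces and entries blanked by DELETE.
    if (!key.startsWith(prefix) || !value) continue;
    const name = key.slice(prefix.length);
    // A platform skill with the same name was loaded first and wins.
    if (loadedSkills.includes(name)) continue;
    block += `\n\n[USER SKILL: ${name}]\n${String(value).trim()}`;
    loadedSkills.push(name);
  }
  return block;
}
```

With `enabled: false` (the offline fallback's default) the function returns an empty string and leaves `loadedSkills` unchanged, so user-written content never reaches the prompt unless the routing table opts in.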
@@ -836,7 +862,6 @@ export async function sessionLoadContextHandler(args) {
  const skillName = k.replace("skill:", "");
  if (loadedSkills.includes(skillName))
  continue;
- // Only load if the skill name appears in recent context
  if (contextText.includes(skillName.replace(/-/g, " ")) || contextText.includes(skillName)) {
  skillBlock += `\n\n[📜 CONTEXT SKILL: ${skillName}]\n${v}`;
  loadedSkills.push(skillName);
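The memory-based discovery step matches a skill name against recent context either verbatim or with hyphens read as spaces. The sketch below isolates that test. The `startsWith("skill:")` filter and the helper name `discoverSkills` are assumptions, since the lines that select keys fall outside the hunk shown.

```javascript
// Sketch of context-driven discovery over platform skill: keys only.
function discoverSkills(contextText, allSettings, loadedSkills) {
  const text = contextText.toLowerCase();
  const found = [];
  for (const [key, value] of Object.entries(allSettings)) {
    // Assumed filter: "user_skill:..." does not start with "skill:", so it is never scanned.
    if (!key.startsWith("skill:") || !value) continue;
    const name = key.slice("skill:".length);
    if (loadedSkills.includes(name)) continue;
    // Match "code-review" either verbatim or as "code review".
    if (text.includes(name.replace(/-/g, " ")) || text.includes(name)) {
      found.push(name);
    }
  }
  return found;
}
```

Because the context is lowercased but the skill name is used as stored, a name containing uppercase letters would never match in this sketch. The match is also a plain substring test, so short names can trigger on unrelated words.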
@@ -16,12 +16,12 @@
  * Do NOT add hardcoded skill names here outside the OFFLINE_FALLBACK block
  * — that defeats the single-source-of-truth design.
  */
- // Minimal fallback when synalux is unreachable. Only the universal BCBA
- // skill — project-specific mappings need synalux to resolve.
+ // Minimal fallback when synalux is unreachable.
  const OFFLINE_FALLBACK = {
  version: 1,
  universal: ['bcba_ai_assistant'],
  projects: {},
+ user_local: { enabled: false, key_prefix: 'user_skill:' },
  };
  const SYNALUX_BASE = process.env.SYNALUX_BASE_URL || 'https://synalux.ai';
  const CACHE_TTL_MS = 5 * 60 * 1000;
@@ -53,7 +53,8 @@ async function fetchOnce() {
  /**
  * Resolve the skill list for a given project (case-insensitive substring
  * match against the routing table). Always returns at least the universal
- * skills.
+ * skills. Also returns the user_local policy so callers know whether to
+ * load user_skill:* entries from local SQLite.
  */
  export async function resolveSkillsForProject(project) {
  const now = Date.now();
@@ -75,7 +76,10 @@ export async function resolveSkillsForProject(project) {
  out.add(s);
  }
  }
- return Array.from(out);
+ return {
+ names: Array.from(out),
+ user_local: table.user_local ?? OFFLINE_FALLBACK.user_local,
+ };
  }
  /** Force a re-fetch on the next call. Exposed for tests + admin tooling. */
  export function _invalidateRoutingCache() {
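The return type of `resolveSkillsForProject` changes here from an array to an object, so any caller that iterated the result directly now has to read `.names`. The sketch below shows the new shape without the network fetch or cache. It is an assumption-laden illustration: `resolveFromTable` is a hypothetical name, the `projects` map is assumed to be `pattern -> [skill names]`, and the direction of the substring match (project name contains pattern) is a guess, since the matching loop is not part of the hunk.

```javascript
// Same fallback object as in the diff.
const OFFLINE_FALLBACK = {
  version: 1,
  universal: ["bcba_ai_assistant"],
  projects: {},
  user_local: { enabled: false, key_prefix: "user_skill:" },
};

// Hypothetical pure version of the resolver: routing table in, { names, user_local } out.
function resolveFromTable(table, project) {
  const out = new Set(table.universal || []);
  const needle = String(project || "").toLowerCase();
  for (const [pattern, skills] of Object.entries(table.projects || {})) {
    // Assumed rule: case-insensitive match when the project name contains the pattern.
    if (needle && needle.includes(pattern.toLowerCase())) {
      for (const s of skills) out.add(s);
    }
  }
  return {
    names: Array.from(out),
    // A routing table without a user_local entry falls back to "disabled".
    user_local: table.user_local ?? OFFLINE_FALLBACK.user_local,
  };
}
```

A routing table fetched from an older server that lacks `user_local` therefore resolves to `enabled: false`, which keeps user-local skills off by default.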
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "prism-mcp-server",
- "version": "15.1.0",
+ "version": "15.2.0",
  "mcpName": "io.github.dcostenco/prism-coder",
  "description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder:7b / 14b open-weights LLM fleet.",
  "module": "index.ts",