prism-mcp-server 15.0.0 → 15.2.0

package/README.md CHANGED
@@ -113,21 +113,31 @@ The LLM context window is treated as ephemeral scratch space. All durable state
 
  ---
 
- ## Plans
+ ## Models
+
+ All Prism Coder inference uses **only fine-tuned Prism Coder models** — no Claude, no Gemini, no OpenRouter fallbacks. Models are exclusively accessible through the Synalux router (authentication + subscription required).
+
+ | Model | Where | Tier | Latency |
+ |---|---|---|---|
+ | **Qwen3-1.7B** (fine-tuned) | On-device — iOS CoreML / Android ONNX | Free | ~50ms offline |
+ | **Qwen3-14B** (fine-tuned) | RunPod A100 via Synalux | Standard+ | ~200ms |
+ | **QwQ-32B** (fine-tuned) | RunPod A100 80GB via Synalux | Pro/Enterprise | ~3–5s |
+ | **Qwen3-30B-A3B** (fine-tuned MoE) | RunPod via Synalux | Enterprise | ~2–3s |
 
- | | Free (local) | Paid (Synalux portal) |
- |---|---|---|
- | Local SQLite memory | ✅ | ✅ |
- | Semantic search | ✅ (local embedding) | ✅ (cloud-backed) |
- | Cross-device sync | — | ✅ |
- | Hivemind multi-agent | ✅ local team | ✅ + cloud roster |
- | Auto-Scholar (web research → memory) | — | ✅ |
- | HRR Zero-Search retrieval | ✅ | ✅ |
- | Custom domains / SSO | — | Enterprise |
+ Fine-tuned on the 3-layer corpus: AAC + BFCL tool-calling + clinical workflows. BFCL gate: ≥ 90% on all tiers before production promotion. Adapters stored at `dcostenco/prism-coder-*` (private HuggingFace).
 
- The thin-client architecture: when authenticated to Synalux, Prism Coder routes through the portal for paid features. When not authenticated (or `PRISM_FORCE_LOCAL=1`), runs purely local. Same binary.
+ ## Plans
+
+ | | Free | Standard $19/mo | Pro $49/mo | Enterprise $99/mo |
+ |---|---|---|---|---|
+ | Qwen3-1.7B on-device | ✅ unlimited | ✅ | ✅ | ✅ |
+ | Qwen3-14B cloud | — | ✅ 200 req/day | ✅ 2K req/day | ✅ unlimited |
+ | QwQ-32B reasoning | — | — | ✅ | ✅ priority |
+ | Qwen3-30B-A3B MoE | — | — | — | ✅ |
+ | Custom fine-tuning | — | — | — | ✅ |
+ | HIPAA BAA | — | — | — | ✅ |
 
- [Pricing →](https://synalux.ai/pricing)
+ [Subscribe →](https://synalux.ai/pricing)
 
  ---
 
@@ -174,6 +184,52 @@ As of v14.0.0, Prism's algorithm exports are a **stable public contract** under
  | [PrismAAC](https://github.com/dcostenco/prism-aac) | Spreading-activation phrase ranking (recency × frequency × per-user history). Caregiver corrections auto-harvest into the personalization corpus via the audit-hooks postflight harvester. The on-device 7B model + this algorithm stack is what makes PrismAAC defensible. |
  | Synalux portal | Tier-aware model routing using experience bias on prior outcomes per fingerprint. HIPAA-compliant clinical scribe with on-device-first privacy guarantees. |
 
+ ## Synalux Inference Router — Architecture (v16)
+
+ All Prism AAC model inference is protected behind Synalux as a mandatory router. Models are **never accessible directly** — all traffic goes through Synalux for auth, billing, and rate limiting.
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │                        CLIENT LAYER                         │
+ │  prism-aac (iOS/web)         │  Synalux Portal              │
+ └──────────────┬──────────────────────────────────────────────┘
+                │  POST /api/v1/prism-aac/inference
+                │  Authorization: Bearer <user-JWT>
+
+ ┌─────────────────────────────────────────────────────────────┐
+ │                       SYNALUX ROUTER                        │
+ │  1. Verify JWT (no anonymous access)                        │
+ │  2. Check subscription tier                                 │
+ │  3. Enforce rate limit (50–2000 req/day by plan)            │
+ │  4. Route to model tier by complexity                       │
+ │  5. Proxy → RunPod with SECRET key (never sent to client)   │
+ │  6. Log → aac_inference_log (billing audit trail)           │
+ └──────────┬─────────────────────────────────────┬────────────┘
+            │ tier=fast                           │ tier=reason
+            ▼                                     ▼
+   ┌──────────────────┐               ┌───────────────────────┐
+   │  Qwen3-14B       │               │  QwQ-32B              │
+   │  RunPod A100 40G │               │  RunPod A100 80G      │
+   │  ~200ms          │               │  ~3–5s (reasoning)    │
+   │  standard/pro    │               │  pro/enterprise only  │
+   └──────────────────┘               └───────────────────────┘
+            │                                      │
+            └────────────────┬─────────────────────┘
+
+       HuggingFace dcostenco/prism-coder-* (private)
+       RunPod pulls at pod start with server-side token
+
+ On-device (free, zero latency, offline):
+   Qwen3-1.7B GGUF Q4_K_M → iOS CoreML / Android ONNX
+ ```
+
+ | Plan | Cloud model | Daily limit | On-device |
+ |---|---|---|---|
+ | Free | — | unlimited local | Qwen3-1.7B |
+ | Standard $5/mo | Qwen3-14B | 200 req | + cloud |
+ | Pro $15/mo | QwQ-32B | 2,000 req | + reasoning |
+ | Enterprise | QwQ-32B priority | unlimited | full stack |
+
  See [`docs/WOW_FEATURES.md`](docs/WOW_FEATURES.md) for the algorithm catalogue. Release notes in [`docs/releases/v14.0.0-prism-as-foundation.md`](docs/releases/v14.0.0-prism-as-foundation.md).
 
  ---
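Router steps 2 to 4 in the diagram (tier check, rate limit, model choice) can be sketched as a pure decision function. This is an illustration only, not the Synalux implementation: `PLAN_POLICY`, `routeInference`, the `complexity` flag, and the HTTP status codes are hypothetical, and the limits are copied from the plan table in this hunk. JWT verification, the RunPod proxy call, and audit logging are left out.

```javascript
// Hypothetical policy table mirroring the plan table in this README hunk.
const PLAN_POLICY = {
  free:       { dailyLimit: 0,        models: [] },
  standard:   { dailyLimit: 200,      models: ["Qwen3-14B"] },
  pro:        { dailyLimit: 2000,     models: ["Qwen3-14B", "QwQ-32B"] },
  enterprise: { dailyLimit: Infinity, models: ["Qwen3-14B", "QwQ-32B"] },
};

// usedToday: requests already served today; complexity: "fast" or "reason".
function routeInference({ plan, usedToday, complexity }) {
  const policy = PLAN_POLICY[plan];
  if (!policy) {
    return { ok: false, status: 401, reason: "unknown plan" };
  }
  if (policy.models.length === 0) {
    return { ok: false, status: 403, reason: "cloud inference requires a paid plan" };
  }
  if (usedToday >= policy.dailyLimit) {
    return { ok: false, status: 429, reason: "daily limit reached" };
  }
  const wanted = complexity === "reason" ? "QwQ-32B" : "Qwen3-14B";
  // Fall back to the plan's first model when the reasoning tier is not included.
  const model = policy.models.includes(wanted) ? wanted : policy.models[0];
  return { ok: true, model };
}
```

Keeping the decision free of I/O makes the tier and limit rules easy to unit-test separately from the proxy.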
@@ -567,7 +567,9 @@ return false;}
      res.writeHead(200, { "Content-Type": "application/json" });
      return res.end(JSON.stringify({ skills }));
  }
- // POST /api/skills → { role, content } saves skill:<role>
+ // POST /api/skills → { role, content } saves user_skill:<role> (user-local namespace).
+ // Platform skills (skill:*) are read-only — populated only by sync-skills.sh
+ // or fetched from Synalux. Users cannot overwrite platform skills from the dashboard.
  if (url.pathname === "/api/skills" && req.method === "POST") {
      const body = await readBody(req);
      const { role, content } = JSON.parse(body || "{}");
@@ -575,16 +577,16 @@ return false;}
          res.writeHead(400);
          return res.end(JSON.stringify({ error: "role required" }));
      }
-     await setSetting(`skill:${role}`, content || "");
+     await setSetting(`user_skill:${role}`, content || "");
      res.writeHead(200, { "Content-Type": "application/json" });
-     return res.end(JSON.stringify({ ok: true, role }));
+     return res.end(JSON.stringify({ ok: true, role, namespace: "user_skill" }));
  }
- // DELETE /api/skills/:role → clears skill:<role>
+ // DELETE /api/skills/:role → clears user_skill:<role> (user-local only).
  if (url.pathname.startsWith("/api/skills/") && req.method === "DELETE") {
      const role = url.pathname.replace("/api/skills/", "");
-     await setSetting(`skill:${role}`, "");
+     await setSetting(`user_skill:${role}`, "");
      res.writeHead(200, { "Content-Type": "application/json" });
-     return res.end(JSON.stringify({ ok: true, role }));
+     return res.end(JSON.stringify({ ok: true, role, namespace: "user_skill" }));
  }
  // ─── API: Knowledge Graph (v6.2 — extracted to graphRouter.ts) ───
  if (url.pathname.startsWith("/api/graph")) {
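Both handlers now build their settings key with the `user_skill:` prefix, which is what keeps dashboard writes out of the platform `skill:` namespace. A small sketch of that key mapping follows. The helper names `userSkillKey` and `isPlatformKey` are hypothetical and do not appear in the package; the empty-role check mirrors the handler's "role required" response.

```javascript
const USER_PREFIX = "user_skill:";
const PLATFORM_PREFIX = "skill:";

// Hypothetical helper: map a dashboard role to its user-local settings key.
function userSkillKey(role) {
  if (typeof role !== "string" || role.trim() === "") {
    throw new Error("role required");
  }
  return `${USER_PREFIX}${role}`;
}

// True only for platform keys; "user_skill:x" does not start with "skill:".
function isPlatformKey(key) {
  return key.startsWith(PLATFORM_PREFIX);
}
```

Because the prefix is prepended unconditionally, even a role literally named `skill:dev` is stored as `user_skill:skill:dev` and never collides with the platform key.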
@@ -269,4 +269,30 @@ export class SynaluxStorage extends SupabaseStorage {
          const results = Array.isArray(result.results) ? result.results : [];
          return { count, results };
      }
+     /**
+      * Fetch skill content from Synalux portal (paid-tier single source of truth).
+      * Returns a map of { skillName → SKILL.md content } for all names that exist.
+      * Missing skills are absent from the map — caller falls back to local SQLite.
+      *
+      * Uses a public GET (no auth required) since skill content is not sensitive
+      * and the route is already in synalux-private/portal at /api/v1/skills/content.
+      */
+     async fetchSkillContent(names) {
+         if (names.length === 0)
+             return {};
+         const url = `${this.baseUrl}/api/v1/skills/content?names=${encodeURIComponent(names.join(","))}`;
+         try {
+             const res = await fetch(url, {
+                 headers: { Accept: "application/json" },
+                 signal: AbortSignal.timeout(3_000),
+             });
+             if (!res.ok)
+                 return {};
+             const body = await res.json();
+             return typeof body.skills === "object" && body.skills !== null ? body.skills : {};
+         }
+         catch {
+             return {};
+         }
+     }
  }
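To make the request and response shapes of `fetchSkillContent` concrete, the two pure pieces of it (URL construction and body normalisation) can be pulled out as below. This is a sketch, not code from the package: `skillContentUrl` and `pickSkills` are hypothetical names, and the example base URL is the `SYNALUX_BASE` default that appears elsewhere in this diff.

```javascript
// Builds the batch URL the same way the method does: names joined by commas,
// then percent-encoded as a single query value.
function skillContentUrl(baseUrl, names) {
  return `${baseUrl}/api/v1/skills/content?names=${encodeURIComponent(names.join(","))}`;
}

// Mirrors the method's response check: anything without an object-valued
// `skills` field degrades to an empty map so the caller falls back to local storage.
function pickSkills(body) {
  const skills = body?.skills;
  return typeof skills === "object" && skills !== null ? skills : {};
}
```

Since the names are joined with commas before encoding, a skill name that itself contains a comma could not be told apart from two names by the server; the sketch makes no attempt to handle that case.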
@@ -30,7 +30,7 @@ import { getCurrentGitState, getGitDrift } from "../utils/git.js";
  import { getSetting, getAllSettings } from "../storage/configStorage.js";
  import { mergeHandoff, dbToHandoffSchema, sanitizeForMerge } from "../utils/crdtMerge.js";
  import { resolveProject } from "../utils/projectResolver.js";
- import { PRISM_USER_ID, PRISM_AUTO_CAPTURE, PRISM_CAPTURE_PORTS } from "../config.js";
+ import { PRISM_USER_ID, PRISM_AUTO_CAPTURE, PRISM_CAPTURE_PORTS, SYNALUX_CONFIGURED } from "../config.js";
  import { captureLocalEnvironment } from "../utils/autoCapture.js";
  import { fireCaptionAsync } from "../utils/imageCaptioner.js";
  import { isSessionSaveLedgerArgs, isSessionSaveHandoffArgs, isSessionLoadContextArgs, isMemoryHistoryArgs, isMemoryCheckoutArgs, // v2.2.0: health check type guard
@@ -798,28 +798,61 @@ export async function sessionLoadContextHandler(args) {
          }
      }
      // ─── Project-Aware Skill Injection ──────────────────────────
-     // Skill routing (which skills load for which project) is the SINGLE
-     // SOURCE OF TRUTH at synalux: /api/v1/skills/routing. We pull the
-     // canonical table on every session and resolve locally. Skill CONTENT
-     // continues to be stored in this server's settings under skill:<name>;
-     // synalux owns the WHICH, this server owns the WHAT, no duplication
-     // of the routing config in three repos.
+     // Routing (WHICH skills + user_local policy): Synalux /api/v1/skills/routing.
+     // Content (WHAT):
+     //   Platform skills   → Synalux /api/v1/skills/content (DB first, filesystem fallback)
+     //                     → local SQLite skill:<name> (free tier / offline fallback)
+     //   User-local skills → local SQLite user_skill:<name>
+     //                       ONLY when user_local.enabled=true in routing table
+     //                       OR session_load_context called with user_local=true.
+     // Users CANNOT write to the platform skill: namespace.
      const { resolveSkillsForProject } = await import("./skillRouting.js");
-     const skillsToLoad = await resolveSkillsForProject(project);
+     const resolved = await resolveSkillsForProject(project);
+     const skillsToLoad = resolved.names;
+     const userLocalPolicy = resolved.user_local;
+     // Paid tier: batch-fetch platform skill content from Synalux in one request.
+     let synaluxContent = {};
+     if (SYNALUX_CONFIGURED && storage && typeof storage.fetchSkillContent === "function") {
+         const missing = skillsToLoad.filter(n => !loadedSkills.includes(n));
+         synaluxContent = await storage
+             .fetchSkillContent(missing).catch(() => ({}));
+         debugLog(`[session_load_context] Synalux skill content fetched: ${Object.keys(synaluxContent).join(", ") || "none"}`);
+     }
      for (const skillName of skillsToLoad) {
          if (loadedSkills.includes(skillName))
              continue;
-         const content = await getSetting(`skill:${skillName}`, "");
+         // Synalux (paid) → platform fallback skill:<name> (free/offline). Never user_skill:.
+         const content = synaluxContent[skillName] || await getSetting(`skill:${skillName}`, "");
          if (content && content.trim()) {
+             const source = synaluxContent[skillName] ? "synalux" : "local-platform";
              skillBlock += `\n\n[📜 SKILL: ${skillName}]\n${content.trim()}`;
              loadedSkills.push(skillName);
              skillLoaded = true;
-             debugLog(`[session_load_context] Skill "${skillName}" loaded for project="${project}"`);
+             debugLog(`[session_load_context] Skill "${skillName}" loaded (${source}) for project="${project}"`);
+         }
+     }
+     // ─── User-Local Skills ──────────────────────────────────────
+     // Loaded ONLY when user_local.enabled=true (set in Synalux routing table
+     // or explicitly requested). Stored under user_skill: prefix — users can
+     // write here via dashboard; they CANNOT write to the platform skill: keys.
+     if (userLocalPolicy.enabled) {
+         const prefix = userLocalPolicy.key_prefix || "user_skill:";
+         const allSettings = await storage.getAllSettings?.() || {};
+         for (const [k, v] of Object.entries(allSettings)) {
+             if (!k.startsWith(prefix) || !v)
+                 continue;
+             const skillName = k.replace(prefix, "");
+             if (loadedSkills.includes(skillName))
+                 continue;
+             skillBlock += `\n\n[📜 USER SKILL: ${skillName}]\n${v.trim()}`;
+             loadedSkills.push(skillName);
+             skillLoaded = true;
+             debugLog(`[session_load_context] User-local skill "${skillName}" loaded`);
          }
      }
      // ─── Memory-Based Skill Discovery ──────────────────────────
-     // If recent handoff/ledger mentions a skill name, auto-load it.
-     // This lets the agent's own memory drive skill activation.
+     // If recent handoff/ledger mentions a platform skill name, auto-load it.
+     // Only scans platform skill: keys; user_skill: discovery is not automatic.
      if (formattedContext.length > 0) {
          const contextText = formattedContext.toLowerCase();
          const allSkillKeys = await storage.getAllSettings?.() || {};
@@ -829,7 +862,6 @@ export async function sessionLoadContextHandler(args) {
              const skillName = k.replace("skill:", "");
              if (loadedSkills.includes(skillName))
                  continue;
-             // Only load if the skill name appears in recent context
              if (contextText.includes(skillName.replace(/-/g, " ")) || contextText.includes(skillName)) {
                  skillBlock += `\n\n[📜 CONTEXT SKILL: ${skillName}]\n${v}`;
                  loadedSkills.push(skillName);
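The content-resolution order introduced in `sessionLoadContextHandler` (Synalux content first, then the local `skill:` key, with `user_skill:` entries loaded only when the routing policy enables them) can be restated as two small pure functions. This is a sketch under assumptions: `resolvePlatformSkill` and `collectUserSkills` are hypothetical names, and `localSettings` stands in for the SQLite settings store as a plain object.

```javascript
// Platform skill: prefer the Synalux copy, else the local skill:<name> entry.
function resolvePlatformSkill(name, synaluxContent, localSettings) {
  const remote = synaluxContent[name];
  const content = remote || localSettings[`skill:${name}`] || "";
  if (!content.trim()) return null;
  return { content: content.trim(), source: remote ? "synalux" : "local-platform" };
}

// User-local skills: returned only when the routing policy enables them,
// skipping names that were already loaded as platform skills.
function collectUserSkills(localSettings, policy, alreadyLoaded) {
  if (!policy.enabled) return [];
  const prefix = policy.key_prefix || "user_skill:";
  const found = [];
  for (const [key, value] of Object.entries(localSettings)) {
    if (!key.startsWith(prefix) || !value) continue;
    const name = key.slice(prefix.length);
    if (alreadyLoaded.includes(name)) continue;
    found.push({ name, content: String(value).trim() });
  }
  return found;
}
```

One consequence of the dedupe check is that a user-local skill sharing a name with an already-loaded platform skill is skipped rather than overriding it.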
@@ -16,12 +16,12 @@
   * Do NOT add hardcoded skill names here outside the OFFLINE_FALLBACK block
   * — that defeats the single-source-of-truth design.
   */
- // Minimal fallback when synalux is unreachable. Only the universal BCBA
- // skill — project-specific mappings need synalux to resolve.
+ // Minimal fallback when synalux is unreachable.
  const OFFLINE_FALLBACK = {
      version: 1,
      universal: ['bcba_ai_assistant'],
      projects: {},
+     user_local: { enabled: false, key_prefix: 'user_skill:' },
  };
  const SYNALUX_BASE = process.env.SYNALUX_BASE_URL || 'https://synalux.ai';
  const CACHE_TTL_MS = 5 * 60 * 1000;
@@ -53,7 +53,8 @@ async function fetchOnce() {
  /**
   * Resolve the skill list for a given project (case-insensitive substring
   * match against the routing table). Always returns at least the universal
-  * skills.
+  * skills. Also returns the user_local policy so callers know whether to
+  * load user_skill:* entries from local SQLite.
   */
  export async function resolveSkillsForProject(project) {
      const now = Date.now();
@@ -75,7 +76,10 @@ export async function resolveSkillsForProject(project) {
              out.add(s);
          }
      }
-     return Array.from(out);
+     return {
+         names: Array.from(out),
+         user_local: table.user_local ?? OFFLINE_FALLBACK.user_local,
+     };
  }
  /** Force a re-fetch on the next call. Exposed for tests + admin tooling. */
  export function _invalidateRoutingCache() {
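`resolveSkillsForProject` changes its return type here from a plain array to `{ names, user_local }`, so any caller that still iterates the result directly would need updating. A tolerant adapter might look like the sketch below. `normalizeResolved` is a hypothetical helper, not part of the package, and the default policy is copied from `OFFLINE_FALLBACK.user_local` in this diff.

```javascript
const DEFAULT_USER_LOCAL = { enabled: false, key_prefix: "user_skill:" };

// Accepts either the pre-15.2.0 array or the new { names, user_local } object
// and always returns the new shape.
function normalizeResolved(result) {
  if (Array.isArray(result)) {
    return { names: result, user_local: DEFAULT_USER_LOCAL };
  }
  return {
    names: Array.isArray(result?.names) ? result.names : [],
    user_local: result?.user_local ?? DEFAULT_USER_LOCAL,
  };
}
```

An adapter like this is only useful during a transition; inside the package, the handler in this release reads `resolved.names` and `resolved.user_local` directly.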
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "prism-mcp-server",
-   "version": "15.0.0",
+   "version": "15.2.0",
    "mcpName": "io.github.dcostenco/prism-coder",
    "description": "Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder:7b / 14b open-weights LLM fleet.",
    "module": "index.ts",